
Math 5305 Notes: Logistic Regression and Discriminant Analysis

Jesse Crawford

Department of Mathematics, Tarleton State University


Outline

1 Logistic Regression

2 Discriminant Analysis


Logistic Regression Models

The output variable $Y$ is dichotomous ($Y_i = 0$ or $Y_i = 1$).

$$g_i = X_i\beta = \beta_1 X_{i1} + \cdots + \beta_p X_{ip}, \quad \text{for } i = 1, \ldots, n.$$

$$P(Y_i = 1) = \pi_i = \frac{e^{g_i}}{1 + e^{g_i}}, \quad \text{for } i = 1, \ldots, n.$$


Maximum Likelihood Estimation

Likelihood function

$$L = \prod_{i=1}^{n} \pi_i^{Y_i}(1 - \pi_i)^{1 - Y_i}$$

Likelihood equations

$$\sum_{i=1}^{n} X_{ij}(Y_i - \pi_i) = 0, \quad \text{for } j = 1, \ldots, p.$$
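These equations have no closed-form solution; in practice glm() solves them iteratively. Below is a minimal Fisher-scoring sketch for illustration only (the function name fit_logistic and the assumption that X already contains an intercept column are mine, not part of the notes).

```r
# Fisher scoring for the logistic likelihood equations (illustrative sketch).
# X: n x p design matrix (including an intercept column), Y: 0/1 response vector.
fit_logistic <- function(X, Y, tol = 1e-8, maxit = 25) {
  beta <- rep(0, ncol(X))
  for (it in 1:maxit) {
    g  <- as.vector(X %*% beta)
    pi <- exp(g) / (1 + exp(g))
    score <- t(X) %*% (Y - pi)                  # left-hand side of the likelihood equations
    info  <- t(X) %*% (X * (pi * (1 - pi)))     # Fisher information X'WX
    step  <- solve(info, score)
    beta  <- beta + as.vector(step)
    if (max(abs(step)) < tol) break             # stop when the update is negligible
  }
  beta
}
```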


Example in R

True Model

$$g_i = -3 + 0.06\,X_i, \quad \text{for } i = 1, \ldots, 100000.$$

X=runif(100000,0,100)       # simulate the input variable
g=-3+.06*X                  # true linear predictor
Pi=exp(g)/(1+exp(g))        # true success probabilities
U=runif(100000)             # one uniform draw per observation
Y=(U<Pi)*1                  # simulate the 0/1 response


True Model

$$g_i = -3 + 0.06\,X_i, \quad \text{for } i = 1, \ldots, 100000.$$

model=glm(Y~X,family=binomial)
summary(model)


Plots

Y vs. X (Not very useful).


Plots

π vs. X


Plots

g vs. X (Best plot for assessing functional form)


Plots

g vs. X (Best plot for assessing functional form)


Deviance and AIC

$$\text{Deviance} = -2\ln(L) = -2\sum_{i=1}^{n}\left[Y_i \ln(\pi_i) + (1 - Y_i)\ln(1 - \pi_i)\right]$$

$$\text{AIC} = 2p - 2\ln(L)$$

where

$$g_i = X_i\beta = \beta_1 X_{i1} + \cdots + \beta_p X_{ip}, \qquad \pi_i = \frac{e^{g_i}}{1 + e^{g_i}}, \quad \text{for } i = 1, \ldots, n.$$
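In R, both quantities can be read directly off a fitted glm object; for example, with the model fit in the simulation above (assuming the object `model` is still in the workspace):

```r
deviance(model)   # -2 ln(L) for the fitted binary logistic model
AIC(model)        # 2p - 2 ln(L); also printed by summary(model)
logLik(model)     # the maximized log-likelihood ln(L) itself
```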


Hypothesis Testing

Consider the logistic regression model

$$P(Y_i = 1) = \frac{e^{g_i}}{1 + e^{g_i}}, \quad \text{where } g_i = X_i\beta.$$

Let $V_0 \le V \le \mathbb{R}^p$, and consider the testing problem

$$H_0: \beta \in V_0 \quad \text{vs.} \quad H: \beta \in V.$$

The test statistic is $G = D_0 - D$, where $D_0$ and $D$ are the deviances under $H_0$ and $H$, respectively. Under $H_0$, the approximate distribution of $G$ is chi-square with $\dim(V) - \dim(V_0)$ degrees of freedom, so we reject $H_0$ if $G > \chi^2_\alpha(\dim(V) - \dim(V_0))$.
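In R this deviance-difference (likelihood ratio) test can be run with anova() on two nested glm fits; a sketch, using the simulated Y and X from the earlier example purely as an illustration:

```r
# model0 must be nested in model1 (same data, restricted set of terms)
model0 <- glm(Y ~ X,          family = binomial)
model1 <- glm(Y ~ X + I(X^2), family = binomial)

anova(model0, model1, test = "Chisq")   # G = D0 - D and its chi-square p-value

# Equivalently, by hand:
G  <- deviance(model0) - deviance(model1)
df <- df.residual(model0) - df.residual(model1)
pchisq(G, df, lower.tail = FALSE)
```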


Assessing the Model

Functional form:
- Group plots
- Likelihood ratio tests

Overall performance:
- Classification accuracy
- Area under ROC curve (a sketch using the pROC package follows below)
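The area under the ROC curve can be computed from the fitted probabilities; one common option is the pROC package (my choice here, not the notes' — any ROC package would do). A sketch, assuming `model` is the fitted glm and Y the observed response:

```r
library(pROC)              # install.packages("pROC") if needed

pihat   <- fitted(model)   # fitted probabilities
roc_obj <- roc(Y, pihat)   # ROC curve: response first, predictor second
auc(roc_obj)               # area under the ROC curve
plot(roc_obj)              # sensitivity vs. specificity trade-off
```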


Classification Accuracy

Choose a cutoff value, and use the classification rule:
- If $\hat\pi_i >$ cutoff, then $\hat Y_i = 1$
- If $\hat\pi_i <$ cutoff, then $\hat Y_i = 0$.

The classification accuracy is the percentage of observations that were correctly classified (the percentage of cases where $\hat Y_i = Y_i$):

$$\text{Classification Accuracy} = P(\hat Y_i = Y_i)$$

To optimize classification accuracy, a reasonable cutoff to use is 0.5.
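A sketch of this calculation in R, again assuming `model` is the fitted glm and Y the observed response:

```r
pihat <- fitted(model)       # estimated probabilities
Yhat  <- (pihat > 0.5) * 1   # classify with cutoff 0.5
mean(Yhat == Y)              # proportion correctly classified
```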


Sensitivity and Specificity

$$\text{Sensitivity} = P(\hat Y_i = 1 \mid Y_i = 1)$$

$$\text{Specificity} = P(\hat Y_i = 0 \mid Y_i = 0)$$
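Continuing the sketch above, sensitivity and specificity are just the classification accuracy computed within each observed class:

```r
sensitivity <- mean(Yhat[Y == 1] == 1)   # estimate of P(Yhat = 1 | Y = 1)
specificity <- mean(Yhat[Y == 0] == 0)   # estimate of P(Yhat = 0 | Y = 0)
```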


Variable Selection

- Manually
- Stepwise (see the sketch below)
- Best subsets
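Stepwise selection by AIC is available through R's step() function; a minimal sketch with hypothetical predictors X1, X2, X3 (best subsets would require an add-on package such as bestglm, which is my suggestion, not the notes'):

```r
# hypothetical full model with several candidate predictors
full <- glm(Y ~ X1 + X2 + X3, family = binomial)

step(full, direction = "both")       # AIC-based stepwise search
step(full, direction = "backward")   # backward elimination only
```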


Outline

1 Logistic Regression

2 Discriminant Analysis


Discriminant Analysis

- Used when the output variable $Y$ is categorical.
- Assume $Y$ is categorical with possible values $0, \ldots, k$.
- Let $X$ be a vector of input variables.
- Given an observation $X = x$, we want to predict the value of $Y$.
- Can also be viewed as a classification problem.

Example:
- $Y$ = grade in Biol 120 ($Y = 1$ or $Y = 0$)
- $X$ = student's high school rank ($0 \le X \le 1$)


$Y$ is a discrete random variable. It has a p.m.f.

$$f(y) = P(Y = y), \quad \text{for } y = 0, \ldots, k.$$

For each value of $Y$, the vector $X$ has a conditional distribution given by $f(x \mid y)$.

The conditional p.m.f. of $Y$ given $X = x$ is

$$P(Y = y \mid X = x) = f(y \mid x) = \frac{f(x, y)}{f(x)} = \frac{f(y)\,f(x \mid y)}{\sum_{y=0}^{k} f(y)\,f(x \mid y)}$$


The conditional p.m.f. of $Y$ given $X = x$ is

$$P(Y = y \mid X = x) = f(y \mid x) = \frac{f(x, y)}{f(x)} = \frac{f(y)\,f(x \mid y)}{\sum_{y=0}^{k} f(y)\,f(x \mid y)}$$

Given the observation $X = x$, we predict $Y$ will be equal to the value of $y$ maximizing $f(y)\,f(x \mid y)$.


Discriminant Analysis with Multivariate Normal Predictor

Given $Y = y$, $X \sim N(\mu_y, \Sigma_y)$, for $y = 0, \ldots, k$.

$$f(x \mid y) = (2\pi)^{-p/2}\,|\Sigma_y|^{-1/2}\exp\left\{-\tfrac{1}{2}(x - \mu_y)'\Sigma_y^{-1}(x - \mu_y)\right\}$$

If $X = x$, we predict $Y$ will be equal to the value of $y$ minimizing

$$d_y^2(x) = -2\ln[f(y)] + \ln|\Sigma_y| + (x - \mu_y)'\Sigma_y^{-1}(x - \mu_y)$$

In practice, we would use

$$d_y^2(x) = -2\ln[\hat f(y)] + \ln|S_y| + (x - \bar x_y)'S_y^{-1}(x - \bar x_y)$$
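Because the group covariance matrices $\Sigma_y$ are allowed to differ, this rule is quadratic discriminant analysis. Rather than coding $d_y^2(x)$ by hand, one option is qda() from the MASS package; a sketch, assuming a data frame `train` with a class column y and predictor columns x1, x2 (hypothetical names, not the course data):

```r
library(MASS)

fit <- qda(y ~ x1 + x2, data = train)   # priors default to the sample proportions f-hat(y)
predict(fit, newdata = valid)$class     # predicted class for new observations in `valid`
```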


In practice, we would use

$$d_y^2(x) = -2\ln[\hat f(y)] + \ln|S_y| + (x - \bar x_y)'S_y^{-1}(x - \bar x_y)$$

How can we estimate these quantities? Assume we have observations for $Y_i$ and $X_i$, for $i = 1, \ldots, n$.

$$\hat f(y) = \frac{\text{Number of times } Y_i = y}{n}$$


Sample Mean and Covariance Matrix

Let $x_1, \ldots, x_n \in \mathbb{R}^p$ be observations from $N(\mu, \Sigma)$. The estimate for the mean $\mu$ is the sample mean $\bar x$.

$$\hat\mu = \bar x = \frac{1}{n}\sum_{i=1}^{n} x_i$$

The estimate for the covariance matrix $\Sigma$ is the empirical covariance matrix $S$.

$$\hat\Sigma = S = \frac{1}{n - 1}\sum_{i=1}^{n}(x_i - \bar x)(x_i - \bar x)'$$

If $X$ is a matrix whose rows are $x_1, \ldots, x_n$, then $\bar x$ and $S$ can be obtained with the R commands colMeans(X) and cov(X).


In practice, we would use

$$d_y^2(x) = -2\ln[\hat f(y)] + \ln|S_y| + (x - \bar x_y)'S_y^{-1}(x - \bar x_y)$$

For each $y$, set aside all rows of data where $Y_i = y$; $\bar x_y$ and $S_y$ are the sample mean and covariance matrix for the vectors $x_i$ from these rows of data. Equivalently, for each $y$, let $x_{y1}, \ldots, x_{y n_y}$ be the values of $X_i$ for those subjects with $Y_i = y$ (a group-wise estimation sketch in R follows below).
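A sketch of these group-wise estimates in R, assuming the predictors are collected in a matrix X with one row per subject and Y is the response vector (generic names, not the course data):

```r
fhat  <- table(Y) / length(Y)         # f-hat(y): relative frequency of each class
Xby   <- split(as.data.frame(X), Y)   # rows of X split by the value of Y
xbars <- lapply(Xby, colMeans)        # group sample means  x-bar_y
Ss    <- lapply(Xby, cov)             # group sample covariance matrices  S_y
```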


High School Rank and Biol 120 Grade

# training set: first 2000 students
trank=rank[1:2000]
tgrade=grade[1:2000]
# validation set: remaining students
vrank=rank[2001:3146]
vgrade=grade[2001:3146]

L=groupplot(trank,tgrade,20)   # groupplot: grouped plot of g vs. x (not a base-R function)
plot(L$x,L$g)


model=glm(tgrade~trank,family=binomial)
betahat=coef(model)

x=(1:100)/100
lines(x,betahat[1]+betahat[2]*x,col='red')


model2=glm(tgrade~trank+I(trank^2),family=binomial)
betahat2=coef(model2)

lines(x,betahat2[1]+betahat2[2]*x+betahat2[3]*x^2,col='red')


model3=glm(tgrade~trank+I(trank^2)+I(trank^3),family=binomial)

betahat3=coef(model3)

lines(x,betahat3[1]+betahat3[2]*x+betahat3[3]*x^2+betahat3[4]*x^3,col='red')


Model with 4th Order Term


Likelihood Ratio Tests


Model Validation

For $i = 2001, \ldots, 3146$, we predict $\hat Y_i = 1$ if $\hat\pi_i \ge \tfrac{1}{2}$.

$\hat\pi_i \ge \tfrac{1}{2}$ iff $g(x_i) \ge 0$, where

$$g(x) = -2.89 + 11.63x - 24.19x^2 + 18.92x^3$$

$g(x_i) \ge 0$ iff $x_i \ge 0.71929$ (see the root-finding sketch below).
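The cutoff 0.71929 is just the root of the fitted cubic on $(0, 1)$, and it can be found numerically. Since $g(0) < 0$ and $g(1) > 0$, uniroot() can locate the crossing; a small sketch:

```r
g <- function(x) -2.89 + 11.63*x - 24.19*x^2 + 18.92*x^3
uniroot(g, c(0, 1))$root   # roughly 0.719, the classification cutoff
```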


Classification Accuracy

vrank=rank[2001:3146]
vgrade=grade[2001:3146]

vgradehat=(vrank>=.71929)*1
mean(vgradehat==vgrade)

Classification Accuracy $= P(\hat Y_i = Y_i) = 0.699$


HS Rank and Biol 120 Grade with Discriminant Analysis

If $X = x$, we predict $Y$ will be equal to the value of $y$ minimizing

$$d_y^2(x) = -2\ln[\hat f(y)] + \ln|S_y| + (x - \bar x_y)'S_y^{-1}(x - \bar x_y)$$

rank0=trank[tgrade==0]
rank1=trank[tgrade==1]

n0=length(rank0)
n1=length(rank1)

f0=n0/(n0+n1)
f1=n1/(n0+n1)


If $X = x$, we predict $Y$ will be equal to the value of $y$ minimizing

$$d_y^2(x) = -2\ln[\hat f(y)] + \ln|S_y| + (x - \bar x_y)'S_y^{-1}(x - \bar x_y)$$

rank0=trank[tgrade==0]
rank1=trank[tgrade==1]

xbar0=mean(rank0)
xbar1=mean(rank1)

s0=sd(rank0)
s1=sd(rank1)


If $X = x$, we predict $Y$ will be equal to the value of $y$ minimizing

$$d_y^2(x) = -2\ln[\hat f(y)] + \ln|S_y| + (x - \bar x_y)'S_y^{-1}(x - \bar x_y)$$

allranks=(1:1000)/1000

d0=-2*log(f0)+log(s0^2)+(allranks-xbar0)^2/s0^2
d1=-2*log(f1)+log(s1^2)+(allranks-xbar1)^2/s1^2


cbind(allranks,d0,d1,d0<d1)

Discr. Analysis Optimal Cutoff = 0.649
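The cutoff can also be pulled out programmatically instead of scanning the cbind output by eye; a sketch, assuming (as the fitted rule suggests) that $d_1 < d_0$ only for ranks above a single crossing point on the grid:

```r
cutoff <- allranks[min(which(d1 < d0))]   # first grid point classified as grade 1
cutoff                                    # 0.649 on this data
```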


Cross-validation for Discriminant Analysis

vgradehat=(vrank>=.649)*1
mean(vgradehat==vgrade)

Classification Accuracy $= P(\hat Y_i = Y_i) = 0.702$


“Brute Force” Approach

allranks=(1:1000)/1000

classacc=1:1000

for(i in 1:1000){
  tempgradehat=(trank>=allranks[i])*1
  classacc[i]=mean(tempgradehat==tgrade)
}

max(classacc)

allranks[classacc==max(classacc)]

Optimal Cutoffs = (.717, .719, .720, .721)

Optimal Cutoff = .7195


Additional Reading

Hosmer, D.W. and Lemeshow, S. (2000). Applied Logistic Regression, 2nd ed. Wiley-Interscience, New York, N.Y.

Khattree, R. and Naik, D.N. (1999). Applied Multivariate Statistics with SAS Software, 2nd ed. SAS Institute Inc., Cary, N.C.

Khattree, R. and Naik, D.N. (2000). Multivariate Data Reduction and Discrimination with SAS Software. SAS Institute Inc., Cary, N.C.
