44
Department of Data Analysis Ghent University The linear mixed model: modeling hierarchical and longitudinal data Analysis of Experimental Data AED The linear mixed model: modeling hierarchical and longitudinal data 1 of 44

The linear mixed model: modeling hierarchical and

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

The linear mixed model: modeling hierarchicaland longitudinal data

Analysis of Experimental Data

AED The linear mixed model: modeling hierarchical and longitudinal data 1 of 44

Page 2: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

Contents1 Modeling Hierarchical Data 3

1.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31.2 Example: the “High School and Beyond” data . . . . . . . . . . . 41.3 Further issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

2 Modeling Longitudinal Data 392.1 Example: Reaction times in a sleep deprivation study . . . . . . . 40

AED The linear mixed model: modeling hierarchical and longitudinal data 2 of 44

Page 3: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

1 Modeling Hierarchical Data

1.1 Overview• especially in education (think ‘students in schools’), applications of mixed

models to hierarchical data have become very popular

• when researchers in the social sciences talk about ‘multilevel analysis’, theytypically mean the application of mixed models to hierarchical data

• typical for this type of data is the sharp distinction between ‘levels’: the‘student level’, and the ‘school level’, where one level (student) is nested inthe other (school)

• data often contains both student characteristics and school charateristics; ina multilevel analysis, both types of variables can be included in one analysissimultaneously

AED The linear mixed model: modeling hierarchical and longitudinal data 3 of 44

Page 4: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

1.2 Example: the “High School and Beyond” data• the following example is borrowed from Raudenbusch and Bryk (2001, chap-

ter 4).

• the data are from the 1982 “High School and Beyond” survey, and pertain to7185 U.S. high-school students from 160 schools (70 catholic, 90 public)

• these are the variables in the dataset:

– school an ordered factor designating the school that the student attends.

– minrty a factor with 2 levels (not used)

– sx a factor with levels Male and Female (not used)

– ses a numeric vector of socio-economic scores

– mAch a numeric vector of Mathematics achievement scores

– meanses a numeric vector of mean ses for the school

– sector a factor with levels Public and Catholic

– cses a numeric vector of centered ses values where the centering is withrespect to the meanses for the school

AED The linear mixed model: modeling hierarchical and longitudinal data 4 of 44

Page 5: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

• the aim of the analysis is to determine how students’ math achievementscores are related to their family socioeconomic status

• but this relationship may very well vary among schools

• if there is indeed variation among schools, can we find any school charac-teristics that ‘explain’ this variation? the two school characteristics that wewill use are:

– sector: public school or Catholic school

– meanses: the average SES of students in the school

AED The linear mixed model: modeling hierarchical and longitudinal data 5 of 44

Page 6: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

Exploring the data> library(mlmRev)> summary(Hsb82)

school minrty sx ses mAch2305 : 67 No :5211 Male :3390 Min. :-3.7580000 Min. :-2.8325619 : 66 Yes:1974 Female:3795 1st Qu.:-0.5380000 1st Qu.: 7.2754292 : 65 Median : 0.0020000 Median :13.1318857 : 64 Mean : 0.0001434 Mean :12.7484042 : 64 3rd Qu.: 0.6020000 3rd Qu.:18.3173610 : 64 Max. : 2.6920000 Max. :24.993(Other):6795

meanses sector csesMin. :-1.1939459 Public :3642 Min. :-3.651e+001st Qu.:-0.3230000 Catholic:3543 1st Qu.:-4.479e-01Median : 0.0320000 Median : 1.600e-02Mean : 0.0001434 Mean :-9.144e-193rd Qu.: 0.3269123 3rd Qu.: 4.694e-01Max. : 0.8249825 Max. : 2.856e+00

> head(Hsb82)

school minrty sx ses mAch meanses sector cses1 1224 No Female -1.528 5.876 -0.434383 Public -1.093617022 1224 No Female -0.588 19.708 -0.434383 Public -0.15361702

AED The linear mixed model: modeling hierarchical and longitudinal data 6 of 44

Page 7: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

3 1224 No Male -0.528 20.349 -0.434383 Public -0.093617024 1224 No Male -0.668 8.781 -0.434383 Public -0.233617025 1224 No Male -0.158 17.898 -0.434383 Public 0.276382986 1224 No Male 0.022 4.583 -0.434383 Public 0.45638298

> tail(Hsb82)

school minrty sx ses mAch meanses sector cses7180 9586 Yes Female 1.612 20.967 0.6211525 Catholic 0.99084757181 9586 No Female 1.512 20.402 0.6211525 Catholic 0.89084757182 9586 No Female -0.038 14.794 0.6211525 Catholic -0.65915257183 9586 No Female 1.332 19.641 0.6211525 Catholic 0.71084757184 9586 No Female -0.008 16.241 0.6211525 Catholic -0.62915257185 9586 No Female 0.792 22.733 0.6211525 Catholic 0.1708475

AED The linear mixed model: modeling hierarchical and longitudinal data 7 of 44

Page 8: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

Exploring the data: math achievement versus ses for several schools> set.seed(1234)> library(lattice)> cat <- sample(unique(Hsb82$school[Hsb82$sector == "Catholic"]),+ 18)> Cat.18 <- Hsb82[is.element(Hsb82$school, cat), ]> plot1 <- xyplot(mAch ˜ ses | school, data = Cat.18, main = "Catholic",+ xlab = "student SES", ylab = "Math Achievement", layout = c(6,+ 3), panel = function(x, y) {+ panel.xyplot(x, y)+ panel.lmline(x, y)+ })

> pub <- sample(unique(Hsb82$school[Hsb82$sector == "Public"]),+ 18)> Pub.18 <- Hsb82[is.element(Hsb82$school, pub), ]> plot2 <- xyplot(mAch ˜ ses | school, data = Pub.18, main = "Public",+ xlab = "student SES", ylab = "Math Achievement", layout = c(6,+ 3), panel = function(x, y) {+ panel.xyplot(x, y)+ panel.lmline(x, y)+ })

AED The linear mixed model: modeling hierarchical and longitudinal data 8 of 44

Page 9: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

Catholic

student SES

Mat

h Ac

hieve

men

t

0

5

10

15

20

25

−2 −1 0 1

●● ●

●●

●●

●●

●●●

●● ●

8800

●●

●●

●●●●

●●

●●

●●

●●

●●

●●

●●

8009

−2 −1 0 1

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

4511

●●

●●

●●●

●●

● ●

●●

●●

9347

−2 −1 0 1

●●

●●

●●

●●

●●

●●

●●

●●

7342

●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●●●

●●●●●●

4292

●●

●●● ●

●●

●●

●●

●●●

●●

●●

●●

●●●● ●

●●

8857

●●

●● ●

●●

●●

●●

●●●

●●

●●

●●●

5667

●● ●

● ●

●●● ●●

●●

●●

● ●●

5720●●

●●

●●

● ●

●●●

●●

●●

●●●

●●

● ●

9104

●●

●●

●●

●● ●

●●

●●

●●●

8150

0

5

10

15

20

25●

●●

● ●

●●

●●●

●●

●●

● ● ●●

●●

9359

0

5

10

15

20

25●

●●

● ●

●●

●●

●●

●●

● ●●

●●

●●

●●

●●●

●●

●●

2208

−2 −1 0 1

●●● ●

●●

● ●

● ●●

1308

●●●

●● ●

●●

●●

● ●●

● ●●

●●

●● ●

● ●

2755

−2 −1 0 1

●●●

●●

●●

●●

●●

● ●

●●●

●●

●●

●●

2990

●●

●●

●●

●●●

●●

●●●

● ●●

●●

●●

●●●

3020

−2 −1 0 1

●●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

6469

AED The linear mixed model: modeling hierarchical and longitudinal data 9 of 44

Page 10: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

Public

student SES

Mat

h Ac

hieve

men

t

0

5

10

15

20

25

−2 −1 0 1

●●

●●

●● ●●

8367

●●

●●

●●

●●

●●

1358

−2 −1 0 1

●●●

●●●

●●

●●

●●

●●

●●

2818

●●

● ●

●●

●●

5783

−2 −1 0 1

● ●

● ●

●●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●

3013

●●

●●

●●

●●

4350

●●

●●

●●

●●

●●

●●

●●●

●●

● ●

9397

●●

●●

●●

● ●●

●●

●●●

● ●●

●●

6464

●●

●●

●●

●●

●●

●●

●●

2651

● ●

●●

●●

● ●

●●

●●

● ●

●●

●●

●●

2467

●●

● ●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

9158

0

5

10

15

20

25●

●●

●● ●

●●

4420

0

5

10

15

20

25

●●

●●

7734

−2 −1 0 1

●●

●●

●●

●●

●●●

9225

● ●

●●

2768

−2 −1 0 1

● ●

● ●

● ●

●●

●●

●●

●●

●●●

3332

●●●

●●

● ●

●●

●●

●●

●●

● ●

● ●

●●

●●

3351

−2 −1 0 1

●●

●●

●●

●●

7697

AED The linear mixed model: modeling hierarchical and longitudinal data 10 of 44

Page 11: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

Fitting a simple regression model for each school separately> cat.list <- lmList(mAch ˜ ses | school, subset = sector == "Catholic",+ data = Hsb82)> head(coef(cat.list))

(Intercept) ses7172 8.355037 0.99448054868 11.845609 1.28647122305 10.646595 -0.78211128800 9.157380 2.56812545192 10.511666 1.60349504523 8.479699 2.3807892

> pub.list <- lmList(mAch ˜ ses | school, subset = sector == "Public",+ data = Hsb82)> head(coef(pub.list))

(Intercept) ses8367 4.546383 0.25037488854 5.707002 1.93884464458 6.999212 1.13183725762 3.114085 -1.01409926990 6.440875 0.94769035815 9.323601 3.0180018

AED The linear mixed model: modeling hierarchical and longitudinal data 11 of 44

Page 12: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

Catholic

7172486823058800519245236816227780094530902145116578934737053533425373423499736456502658950842928857131726294223146249315667572034983688816591048150404260741906399241735761763524583610383893592208147730391308143314362526275529903020342754045619636664697011733276888193862891989586

5 10 15 20

||

||

||

||

||

||

||

||

||

|||

||

||

||

||

||

||

||

||

||||

||

||

||

||

||

||

||

||

||

||

||

||

||||

|

||

||

||

||

||

||

||

||

||

|||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

|||

||

||

||

||

||

||

||

(Intercept)

−5 0 5

||

||

||

||

||

||

||

||

|||

||

||

||||

||

||

|||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||||

||

||

||

||

||

||

||

||

|||

||

||

|||

||

||

||

||

||

||

||

||

||

||

||

||

ses

AED The linear mixed model: modeling hierarchical and longitudinal data 12 of 44

Page 13: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

Public

836788544458576269905815734113584383308887757890614464436808281893405783301371012639337712964350939726559292898381884410870714998477128862911224396764159550646459377919371619092651246713746600388129955838915889467232291761702030835785314420399943256484689777348175887492255640608927685819639714611637194219462336262627713152333233513657464272767345769782028627

0 5 10 15 20

||

||

||

||

|||

||

||

||

||

||

||

||

||

||

||

||

||

|||

||

|||

|||

||

||

||

||

||

||

||

||

||

||

|||

||

||

||

||

||

||

||

||

||

|||

||

||

||

||

||

||

|||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

|||

||

||

||

||

||

||

||

||

||

||

||

||

||

(Intercept)

−5 0 5 10

||

||

||

||

||

||

||

||

|||

||

||

||

||

||

||||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

|||

||

|||

||

||

|

||

||

||

||

||

||

||

||

||

||

||

||

|||

||

|||

|||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

||

|||

||

||

||

|||

|ses

AED The linear mixed model: modeling hierarchical and longitudinal data 13 of 44

Page 14: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

●●

Catholic Public

510

1520

Intercepts

Catholic Public

−20

24

6

Slopes

AED The linear mixed model: modeling hierarchical and longitudinal data 14 of 44

Page 15: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

Relationship intercepts and slopes and mean school SES

●●

● ●

●●

●●

●●

●●

●●

● ●

●● ●

●●

●●

●●

●●

●● ●

●●

●●

●●

●●

● ●

● ●

−1.0 0.0 0.5

510

1520

Mean SES School

Inte

rcep

t

●●

●●

●●

●●

● ●●

●●

●●

●●●

●●

●●

−1.0 0.0 0.5−2

02

46

Mean SES School

Slop

e

AED The linear mixed model: modeling hierarchical and longitudinal data 15 of 44

Page 16: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

Model 1: a random-effects one-way ANOVA

• this is often called the ‘empty’ model, since it containts no predictors, butsimply reflects the nested structure

• no level-1 predictors, no level-2 predictors

• in the ‘multilevel’ notation we specify a model for each level

• model for the first (student) level:

yij = α0i + εij

• model for the second (school) level:

α0i = γ00 + u0i

• the combined model and the Laird-Ware form:

yij = γ00 + u0i + εij

= β0 + b0i + εij

AED The linear mixed model: modeling hierarchical and longitudinal data 16 of 44

Page 17: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

• this is an example of a random-effects one-way ANOVA model with onefixed effect (the intercept, β0) representing the general population mean ofmath achievement, and two random effects:

– b0i representing the deviation of math achievement in school i from thegeneral mean

– εij representing the deviation of individual j’s math achievement inschool i from the school mean

• there are two variance components for this model:

– Var(b0i) = d2: the variance among school means

– Var(εij) = σ2: the variance among individuals in the same school

• since b0i and εij are assumed to be independent, the variation in math scoresamong individuals can be decomposed into these two variance components:

Var(yij) = d2 + σ2

AED The linear mixed model: modeling hierarchical and longitudinal data 17 of 44

Page 18: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

R code> fit.model1 <- lmer(mAch ˜ 1 + (1 | school), data = Hsb82)> summary(fit.model1)

Linear mixed model fit by REMLFormula: mAch ˜ 1 + (1 | school)

Data: Hsb82AIC BIC logLik deviance REMLdev

47123 47143 -23558 47116 47117Random effects:Groups Name Variance Std.Dev.school (Intercept) 8.614 2.9350Residual 39.148 6.2569Number of obs: 7185, groups: school, 160

Fixed effects:Estimate Std. Error t value

(Intercept) 12.6370 0.2443 51.72

AED The linear mixed model: modeling hierarchical and longitudinal data 18 of 44

Page 19: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

Intra-class correlation

• the intra-class correlation coefficient is the proportion of variation in indi-viduals’ scores due to differences among schools:

d2

Var(yij)=

d2

d2 + σ2= ρ

• ρ may also be interpreted as the correlation between the math scores of twoindividuals from the same school:

Cor(yij , yij′) =Cov(yij , yij′)√

Var(yij)× Var(yij′)=

d2

d2 + σ2= ρ

> s <- summary(fit.model1)> d2 <- as.numeric(s@REmat[, 3])[1]> s2 <- as.numeric(s@REmat[, 3])[2]> rho <- d2/(d2 + s2)> rho

[1] 0.1803526

AED The linear mixed model: modeling hierarchical and longitudinal data 19 of 44

Page 20: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

• about 18 percent of the variation in students’ match-achievement scores is“attributable” to differences among schools

• this is often called ‘the cluster effect’

AED The linear mixed model: modeling hierarchical and longitudinal data 20 of 44

Page 21: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

Model 2: a random-effects one-way ANCOVA

• 1 level-1 predictor (SES, centered within school), no level-2 predictors

• random intercept, no random slopes

• model for the first (student) level:

yij = α0i + α1i csesij + εij

• model for the second (school) level:

α0i = γ00 + u0i (the random intercept)α1i = γ10 (the constant slope)

• the combined model and the Laird-Ware form:

yij = (γ00 + u0i) + γ10 csesij + εij

= γ00 + γ10 csesij + u0i + εij

= β0 + β1x1ij + b0i + εij

AED The linear mixed model: modeling hierarchical and longitudinal data 21 of 44

Page 22: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

• the fixed-effect coefficients β0 and β1 represent the average within-schoolspopulation intercept and slope respectively

– note: because SES is centered within schools, the intercept β0 repre-sents the ‘average’ level of math achievement in the population

• the model has two variance-covariance components:

– Var(b0i) = d2: the variance among school intercepts

– Var(εij) = σ2: the error variance around the within-school regressions

AED The linear mixed model: modeling hierarchical and longitudinal data 22 of 44

Page 23: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

R code> fit.model2 <- lmer(mAch ˜ 1 + cses + (1 | school), data = Hsb82)> summary(fit.model2)

Linear mixed model fit by REMLFormula: mAch ˜ 1 + cses + (1 | school)

Data: Hsb82AIC BIC logLik deviance REMLdev

46732 46760 -23362 46720 46724Random effects:Groups Name Variance Std.Dev.school (Intercept) 8.6723 2.9449Residual 37.0104 6.0836Number of obs: 7185, groups: school, 160

Fixed effects:Estimate Std. Error t value

(Intercept) 12.6361 0.2445 51.69cses 2.1912 0.1087 20.17

Correlation of Fixed Effects:(Intr)

cses 0.000

AED The linear mixed model: modeling hierarchical and longitudinal data 23 of 44

Page 24: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

Model 3: a random-coefficients regression model

• 1 level-1 predictor (SES, centered within school), no level-2 predictors

• random intercept and random slopes

• model for the first (student) level:

yij = α0i + α1i csesij + εij

• model for the second (school) level:

α0i = γ00 + u0i (the random intercept)α1i = γ10 + u1i (the random slope)

• the combined model and the Laird-Ware form:

yij = (γ00 + u0i) + (γ10 + u1i) csesij + εij

= γ00 + γ10 csesij + u0i + u1i csesij + εij

= β0 + β1x1ij + b0i + b0iz1ij + εij

AED The linear mixed model: modeling hierarchical and longitudinal data 24 of 44

Page 25: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

• the fixed-effect coefficients β0 and β1 again represent the average within-schools population intercept and slope respectively

• the model has four variance-covariance components:

– Var(b0i) = d20: the variance among school intercepts

– Var(b1i) = d21: the variance among school slopes

– Cov(b0i, b1i) = d01: the covariance between within-school interceptsand slopes

– Var(εij) = σ2: the error variance around the within-school regressions

AED The linear mixed model: modeling hierarchical and longitudinal data 25 of 44

Page 26: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

R code> fit.model3 <- lmer(mAch ˜ 1 + cses + (1 + cses | school), data = Hsb82)> summary(fit.model3)

Linear mixed model fit by REMLFormula: mAch ˜ 1 + cses + (1 + cses | school)

Data: Hsb82AIC BIC logLik deviance REMLdev

46726 46768 -23357 46711 46714Random effects:Groups Name Variance Std.Dev. Corrschool (Intercept) 8.68102 2.94636

cses 0.69399 0.83306 0.019Residual 36.70019 6.05807Number of obs: 7185, groups: school, 160

Fixed effects:Estimate Std. Error t value

(Intercept) 12.6362 0.2445 51.69cses 2.1932 0.1283 17.10

Correlation of Fixed Effects:(Intr)

cses 0.009

AED The linear mixed model: modeling hierarchical and longitudinal data 26 of 44

Page 27: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

Is the random slope necessary?

R code> anova(fit.model2, fit.model3)

Data: Hsb82Models:fit.model2: mAch ˜ 1 + cses + (1 | school)fit.model3: mAch ˜ 1 + cses + (1 + cses | school)

Df AIC BIC logLik Chisq Chi Df Pr(>Chisq)fit.model2 4 46728 46756 -23360fit.model3 6 46723 46764 -23356 9.4299 2 0.00896 **---Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

• note: you can only compare two models using likelihood-ratio tests if thetwo models have the same fixed effects!

AED The linear mixed model: modeling hierarchical and longitudinal data 27 of 44

Page 28: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

Model 4: random intercept + level-2 predictor

• we drop the level-1 predictor (cses), but add a level-2 predictor (meanses)

• random intercept, no random slopes

• model for the first (student) level:

yij = α0i + εij

• model for the second (school) level:

α0i = γ00 + γ01 meansesi + u0i

• the combined model and the Laird-Ware form:

yij = γ00 + γ01 meansesi + u0i + εij

= β0 + β1x1ij + b0i + εij

• note that in the Laird-Ware notation, we use a double index for the fixed-effects xij , even if the variabele (meanses) does not change over students

AED The linear mixed model: modeling hierarchical and longitudinal data 28 of 44

Page 29: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

• this model has two fixed effects (β0 and β1) and two random effects (b0i andεij)

• the interpretation of b0i has changed: whereas in the previous model it hadbeen the deviation of school i’s mean from the grand mean, it now repre-sents the residual (α0i−γ00−γ10 meansesj); correspondingly, the variancecomponent d2 is now a conditional variance (conditional on the school meanSES)

• in Raudenbush and Bryk, this is called a ‘Regression with means-as-outcomes’model, because the school’s mean (α0i) is predicted by the means SES of theschool

• note in the following output that the residual variance between schools (2.64)is substantially smaller than the original (8.61)

AED The linear mixed model: modeling hierarchical and longitudinal data 29 of 44

Page 30: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

R code> fit <- lmer(mAch ˜ 1 + meanses + (1 | school), data = Hsb82)> summary(fit)

Linear mixed model fit by REMLFormula: mAch ˜ 1 + meanses + (1 | school)

Data: Hsb82AIC BIC logLik deviance REMLdev

46969 46997 -23481 46959 46961Random effects:Groups Name Variance Std.Dev.school (Intercept) 2.6389 1.6245Residual 39.1571 6.2576Number of obs: 7185, groups: school, 160

Fixed effects:Estimate Std. Error t value

(Intercept) 12.6846 0.1493 84.98meanses 5.8635 0.3614 16.22

Correlation of Fixed Effects:(Intr)

meanses 0.010

AED The linear mixed model: modeling hierarchical and longitudinal data 30 of 44

Page 31: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

Model 5: random intercept + level-1 predictor + level-2 predictor

• one level-1 predictor (cses), and a level-2 predictor (meanses)

• random intercept, nonrandomly varying slopes

• model for the first (student) level:

yij = α0i + α1i csesij + εij

• model for the second (school) level:

α0i = γ00 + γ01 meansesi + u0i

α1i = γ10 + γ11 meansesi

• the combined model and the Laird-Ware form:

yij = (γ00 + γ01 meansesi + u0i) + (γ10 + γ11 meansesi) csesij + εij

= γ00 + γ01 meansesi + γ10 csesij + γ11 meansesi csesij + u0i + εij

= β0 + β1x1ij + β2x2ij + β3(x1ijx2ij) + b0i + εij

AED The linear mixed model: modeling hierarchical and longitudinal data 31 of 44

Page 32: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

• this model illustrates that in a mixed model, we combine predictors fromboth levels to construct an interaction term

• no random slopes are used in this model; the idea is that once we control forthe level-2 predictor (meanses), little or no variance in the slopes remainsto be explained; the slopes vary across schools, but as a strict function ofmeanses

• this results in a random-intercept only model

AED The linear mixed model: modeling hierarchical and longitudinal data 32 of 44

Page 33: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

R code> fit <- lmer(mAch ˜ 1 + meanses * cses + (1 | school), data = Hsb82)> summary(fit)

Linear mixed model fit by REMLFormula: mAch ˜ 1 + meanses * cses + (1 | school)

Data: Hsb82AIC BIC logLik deviance REMLdev

46580 46621 -23284 46562 46568Random effects:Groups Name Variance Std.Dev.school (Intercept) 2.6926 1.6409Residual 37.0169 6.0842Number of obs: 7185, groups: school, 160

Fixed effects:Estimate Std. Error t value

(Intercept) 12.6833 0.1494 84.91meanses 5.8662 0.3617 16.22cses 2.2009 0.1090 20.20meanses:cses 0.3248 0.2735 1.19

Correlation of Fixed Effects:(Intr) meanss cses

meanses 0.010cses 0.000 0.000meanses:css 0.000 0.000 0.075

AED The linear mixed model: modeling hierarchical and longitudinal data 33 of 44

Page 34: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

Model 6: intercepts-and-slopes-as-outcomes model

• we expand the model by including two level-2 predictors: meanses and sec-tor; the slopes are again allowed to vary randomly

• model for the first (student) level:

yij = α0i + α1i csesij + εij

• model for the second (school) level:

α0i = γ00 + γ01 meansesi + γ02 sectori + u0i

α1i = γ10 + γ11 meansesi + γ12 sectori + u1i

AED The linear mixed model: modeling hierarchical and longitudinal data 34 of 44

Page 35: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

• the combined model and the Laird-Ware form:

yij =(γ00 + γ01 meansesi + γ02 sectori + u0i)+

(γ10 + γ11 meansesi + γ12 sectori + u1i) csesij + εij

= γ00 + γ01 meansesi + γ02 sectori + γ10 csesij+γ11 meansesi csesij + γ12 sectori csesij + u0i + u1i csesij + εij

=β0 + β1x1ij + β2x2ij + β3x3ij + β4(x1ijx3ij) + β5(x2ijx3ij)+

b0i + b1iz1ij + εij

AED The linear mixed model: modeling hierarchical and longitudinal data 35 of 44

Page 36: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

R code> fit.model6 <- lmer(mAch ˜ 1 + meanses * cses + sector * cses ++ (1 + cses | school), data = Hsb82)> summary(fit.model6)

Linear mixed model fit by REMLFormula: mAch ˜ 1 + meanses * cses + sector * cses + (1 + cses | school)

Data: Hsb82AIC BIC logLik deviance REMLdev

46524 46592 -23252 46496 46504Random effects:Groups Name Variance Std.Dev. Corrschool (Intercept) 2.37956 1.54258

cses 0.10123 0.31817 0.391Residual 36.72115 6.05980Number of obs: 7185, groups: school, 160

Fixed effects:Estimate Std. Error t value

(Intercept) 12.1279 0.1993 60.86meanses 5.3329 0.3692 14.45cses 2.9450 0.1556 18.93sectorCatholic 1.2266 0.3063 4.00meanses:cses 1.0392 0.2989 3.48cses:sectorCatholic -1.6427 0.2398 -6.85

Correlation of Fixed Effects:

AED The linear mixed model: modeling hierarchical and longitudinal data 36 of 44

Page 37: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

(Intr) meanss cses sctrCt mnss:cmeanses 0.256cses 0.075 0.019sectorCthlc -0.699 -0.356 -0.052meanses:css 0.019 0.074 0.293 -0.026css:sctrCth -0.052 -0.027 -0.696 0.077 -0.351

Do we need the random slopes?> fit.model6bis <- lmer(mAch ˜ 1 + meanses * cses + sector * cses ++ (1 | school), data = Hsb82)> anova(fit.model6bis, fit.model6)

Data: Hsb82Models:fit.model6bis: mAch ˜ 1 + meanses * cses + sector * cses + (1 | school)fit.model6: mAch ˜ 1 + meanses * cses + sector * cses + (1 + cses | school)

Df AIC BIC logLik Chisq Chi Df Pr(>Chisq)fit.model6bis 8 46513 46568 -23249fit.model6 10 46516 46585 -23248 0.9712 2 0.6153

• no: apparently, the level-2 predictors do a sufficiently good job of accountingfor differences in slopes that the variance component for slopes is no longerneeded

AED The linear mixed model: modeling hierarchical and longitudinal data 37 of 44

Page 38: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

1.3 Further issues• modern software (for example lmer in R) can also handle partially nested

data structures (for example, some students belong to several schools)

• you should always ‘center’ level-1 covariates

• the standard hierarchical model can be extended to include heterogeneouslevel-1 variance terms

– in this case: each individual/cluster has a different variance

– for example: math achievement is more variable among boys than girls

– heterogeneity of the level-1 variance may be modelled as a function ofpredictors at both levels

– in practice, the log of the variance is modelled (to ensure positivity)

– the presence of heterogeneity of variance at level-1 is often due to mis-specification of the level-1 model

AED The linear mixed model: modeling hierarchical and longitudinal data 38 of 44

Page 39: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

2 Modeling Longitudinal Data• longitudinal data is very similar to hierarchical data: you typically have two

levels: the timepoints (level-1) and the individuals/units (level-2)

• the focus of the analysis is often to find an appropriate polynomial (linear,quadratic, cubic, . . . ) that describes the ‘evolution’ or ‘growth curve’ of thedata over time

• the components of that polynomial (intercept, slope, quadratic component,. . . ) may be constant (fixed) or varying (random) across individuals/units

• a complication in longitudinal data is that it may not be longer reasonableto assume that the ‘level-1’ errors εij are independent, since observationstaken close in time on the same individual/units may well be more similarthan observations farther apart in time; this is called auto-correlation

• in the world of linear mixed models, there is a delicate trade-off betweenadding more random effects to the model (implying more complicated pat-terns in the covariance matrix) or using a more complicated form for Σi

AED The linear mixed model: modeling hierarchical and longitudinal data 39 of 44

Page 40: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

2.1 Example: Reaction times in a sleep deprivation study• Dependent variable: the average reaction time per day (Reaction) for sub-

jects in a sleep deprivation study.

• On day 0 the subjects had their normal amount of sleep. Starting that nightthey were restricted to 3 hours of sleep per night. The observations representthe average reaction time on a series of tests given each day to each subject.

• Days Number of days of sleep deprivation

• Subject Subject number on which the observation was made

• dataset is available in the lme4 package

AED The linear mixed model: modeling hierarchical and longitudinal data 40 of 44

Page 41: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

Exploring the data: the spaghetti plot

Days of sleep deprivation

Aver

age

reac

tion

time

(ms)

200

250

300

350

400

450

0 2 4 6 8

AED The linear mixed model: modeling hierarchical and longitudinal data 41 of 44

Page 42: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

Exploring the data: linear regression for each subject

Days of sleep deprivation

Aver

age

reac

tion

time

(ms)

200

250

300

350

400

450

0 2 4 6 8

●●●

308

●●●●●●●●●

309

0 2 4 6 8

●●

●●●●●

●●●

310

●●

●●●●

●●

330

0 2 4 6 8

●●●

●●●●

331

●●

●●●

●●

332

●●●

●●

●●●

333

●●

●●

●●●

●●

334

●●

●●●●

●●●

335

●●●

●●

●●

●●

337

●●●●●

●●

●●

349

200

250

300

350

400

450

●●

●●●

●●

●●

350200

250

300

350

400

450

●●●

●●

351

0 2 4 6 8

●●●●●●

●●

352

●●●

●●●

●●

369

0 2 4 6 8

●●●●

●●●

370

●●●●●●

●●

371

0 2 4 6 8

●●●

●●

●●●

●●

372

AED The linear mixed model: modeling hierarchical and longitudinal data 42 of 44

Page 43: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

A simple linear growth model> fit.linear <- lmer(Reaction ˜ Days + (Days | Subject), data = sleepstudy)> summary(fit.linear)

Linear mixed model fit by REMLFormula: Reaction ˜ Days + (Days | Subject)

Data: sleepstudyAIC BIC logLik deviance REMLdev1756 1775 -871.8 1752 1744Random effects:Groups Name Variance Std.Dev. CorrSubject (Intercept) 612.092 24.7405

Days 35.072 5.9221 0.066Residual 654.941 25.5918Number of obs: 180, groups: Subject, 18

Fixed effects:Estimate Std. Error t value

(Intercept) 251.405 6.825 36.84Days 10.467 1.546 6.77

Correlation of Fixed Effects:(Intr)

Days -0.138

> fit.quad <- lmer(Reaction ˜ Days + I(Daysˆ2) + (Days | Subject),

AED The linear mixed model: modeling hierarchical and longitudinal data 43 of 44

Page 44: The linear mixed model: modeling hierarchical and

Department of Data Analysis Ghent University

+ data = sleepstudy)> summary(fit.quad)

Linear mixed model fit by REMLFormula: Reaction ˜ Days + I(Daysˆ2) + (Days | Subject)

Data: sleepstudyAIC BIC logLik deviance REMLdev1757 1779 -871.4 1750 1743Random effects:Groups Name Variance Std.Dev. CorrSubject (Intercept) 613.108 24.7610

Days 35.108 5.9252 0.064Residual 651.973 25.5338Number of obs: 180, groups: Subject, 18

Fixed effects:Estimate Std. Error t value

(Intercept) 255.4494 7.5135 34.00Days 7.4341 2.8189 2.64I(Daysˆ2) 0.3370 0.2619 1.29

Correlation of Fixed Effects:(Intr) Days

Days -0.418I(Daysˆ2) 0.418 -0.836

AED The linear mixed model: modeling hierarchical and longitudinal data 44 of 44