Lecture 5 Linear Mixed Effects Models Olivier MISSA, [email protected]@york.ac.uk Advanced Research Skills

Lecture 5

Linear Mixed Effects Models

Olivier MISSA, [email protected]

Advanced Research Skills

mailto:[email protected]

2

Outline

Explore options available when assumptions

of classical linear models are untenable.

In this lecture:

What can we do when observations

(and thus residuals) are not strictly independent ?

3

Defined by three assumptions:

(1) the response variable is continuous.

(2) the residuals (ε) are normally distributed and ...

(3) ... independently and identically distributed.

Today, we will consider a range of options available

when we either know or suspect that our data

are not strictly independent from each other.

(Departures from the other assumptions will be dealt with later)

Classical Linear Models

4

In previous lectures:

We merely checked the independence of our residuals by inspecting the plot of residuals vs. fitted values.

Example from lecture 2:

Non-independent Residuals

A non-linear trend

which suggested that our linear model was probably misspecified

5

Data collection can often lead to non-independence among your observations.

A few examples:

Repeated (longitudinal) observations on the same "individuals"

(on different days, weeks, months, years)

Collecting data from a few locations (spatial structure)

(surveys conducted in schools/streets, fields/sites/islands)

Collecting data on related individuals

(father & sons, twins, species within the same genus/tribe/family).

If we treat all these observations as fully independent, we are likely to overestimate the number of degrees of freedom.

which may lead us to wrongly reject a null hypothesis (type I error)

Non-independent Observations

6

Two ways to cope with non-independent observations

When design is balanced ("equal sample size")

We can use factors to partition our observations in different "groups" and analyse them as an ANOVA or ANCOVA.

We already know how to do that (when factors are "crossed")

We just need to figure out how to cope with nested factors.

When design is unbalanced ("uneven sample size")

Mixed effect models are then called for.

Non-independent Observations

7

Nested AnovaExample:

A designed field experiment on crop yield with three treatments :

irrigation (control, irrigated)

sowing density (low, medium, high)

fertilizer (N, P, NP)

Block

control irrigated

high

low

medium

N P NP

Split plot design

each block has 18 different subplots

8


A designed field experiment on crop yield with three treatments

> yields <- read.table("splityield.txt", header=T)> attach(yields)> names(yields)[1] "yield" "block" "irrigation" "density" "fertilizer"> str(yields)'data.frame': 72 obs. of 5 variables: $ yield : int 90 95 107 92 89 92 81 92 93 80 ... $ block : Factor w/ 4 levels "A","B","C","D": 1 1 1 1 1... $ irrigation: Factor w/ 2 levels "control","irrigated": 1 1... $ density : Factor w/ 3 levels "high","low","medium": 2 2... $ fertilizer: Factor w/ 3 levels "N","NP","P": 1 3 2 1 3 2...

9


A designed field experiment on crop yield with three treatments

> model0 <- aov(yield ~ irrigation*density*fertilizer)

## non-nested version (incorrect !!)

> summary(model0) Df Sum Sq Mean Sq F value Pr(>F) irrigation 1 8277.6 8277.6 59.5746 2.813e-10 ***density 2 1758.4 879.2 6.3276 0.0033972 ** fertilizer 2 1977.4 988.7 7.1160 0.0018070 ** irrigation:density 2 2747.0 1373.5 9.8853 0.0002197 ***irrigation:fertilizer 2 953.4 476.7 3.4310 0.0395615 * density:fertilizer 4 304.9 76.2 0.5486 0.7008151 irrigation:density:fertilizer 4 234.7 58.7 0.4223 0.7918283 Residuals 54 7503.0 138.9

Sum 71 23756.44

10

Nested Anova> model1 <- aov(yield ~ irrigation*density*fertilizer +

Error(block/irrigation/density) ) ## Correct nested version, nesting from large to small> summary(model1)Error:block Df Sum Sq Mean Sq F value Pr(>F)Residuals 3 194.44 64.815

Error:block:irrigation Df Sum Sq Mean Sq F value Pr(>F) irrigation 1 8277.6 8277.6 17.59 0.0247 *Residuals 3 1411.8 470.6

Error:block:irrigation:density Df Sum Sq Mean Sq F value Pr(>F) density 2 1758.4 879.18 3.784 0.05318 .irrigation:density 2 2747.0 1373.51 5.912 0.01633 *Residuals 12 2787.9 232.33

Error:Within Df Sum Sq Mean Sq F value Pr(>F) fertilizer 2 1977.4 988.72 11.449 0.000142 ***irrigation:fertilizer 2 953.4 476.72 5.520 0.008108 ** density:fertilizer 4 304.9 76.22 0.883 0.484053 irrigation:density:fertilizer 4 234.7 58.68 0.679 0.610667 Residuals 36 3108.8 86.36

Res Sum 54 7503.00Gd Sum 71 23756.44

11

Nested AnovaComparison between nested and non-nested results

Non-nested Nested Df F value Pr(>F) F value Pr(>F)irrigation 1 59.5746 2.81e-10 17.5896 0.024725density 2 6.3276 0.003397 3.7842 0.053181fertilizer 2 7.1160 0.001807 11.4493 0.000142irrig:dens 2 9.8853 0.000220 5.9119 0.016331irrig:ferti 2 3.4310 0.039562 5.5204 0.008108 dens:ferti 4 0.5486 0.700815 0.8826 0.484053irrig:dens:ferti 4 0.4223 0.791828 0.6795 0.610667

Block

control irrigated

high

low

medium

N P NP

12

Recognizing Nestedness is key !

Being able to distinguish crossed factors (independent from each other) from nested factors is essential.

Nestedness occurs most often from spatial structureStudent surveys in different classes from different schools.

Samples from individual branches on sets of trees within a number of forest patches.

But can also occur from temporal structureSamples taken from the same individuals every fortnight for 2 months on two successive years.

13

When the design is not balanced

We need a different modelling framework:Mixed Effects Models.

So called because they mix together fixed effects and random effects.

Until now, we have only used fixed effects in our models,

each effect having an estimated parameter

(intercept, slope, mean, ...).

But in certain circumstances, these parameters may not

be very informative and one would be better off trying to

"estimate" the underlying distribution they come from.

An example will help clarify the difference between these 2 approaches.

14

> library(nlme) ## package dedicated to mixed effects models> data(Rail)> names(Rail)[1] "Rail" "travel"> stripchart(Rail$travel ~ Rail$Rail, pch=16, ylab="Ultrasonic Travel Time (nanosecs)", xlab="Rail number", vertical =T, col=rainbow(6) )

> abline(h=mean(Rail$travel), col="Gray85", lty=2, lwd=2)

Mixed Effects Modelling Example: railway rails tested for longitudinal stress.

6 rails chosen at random and tested three times with ultrasound.

Classically in a linear model, we would be able to tell whether the rails differ significantly from each other.

But it doesn't help us make predictions about other rails.

15

Mixed Effects Modelling

Random effects: interested in explaining the variance of the response.

Fixed effects: interested in explaining the response itself.

Fixed effects

Male & Female

Control & Treatment

Wet vs. Dry

Light vs. Shade

Random effects

Blocks

Individuals withRepeated measures

Genotypes

Sites

16

Mixed Effects Modelling Back to our example:

> Rail2 <- data.frame(travel=Rail$travel, Rail=factor(as.character(Rail$Rail)) )

> Rail.lm <- lm(travel ~ Rail, data=Rail2) ## LINEAR MODEL

> summary(Rail.lm)Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 54.000 2.321 23.262 2.37e-11 ***Rail2 -22.333 3.283 -6.803 1.90e-05 ***Rail3 30.667 3.283 9.341 7.44e-07 ***Rail4 42.000 3.283 12.793 2.36e-08 ***Rail5 -4.000 3.283 -1.218 0.246 Rail6 28.667 3.283 8.732 1.52e-06 ***---Residual standard error: 4.021 on 12 degrees of freedomMultiple R-squared: 0.9796, Adjusted R-squared: 0.9711 F-statistic: 115.2 on 5 and 12 DF, p-value: 1.033e-09

Makes Rail a simple factor

(not an ordered one)

17

Mixed Effects Modelling > anova(Rail.lm)Analysis of Variance Table

Response: travel Df Sum Sq Mean Sq F value Pr(>F) Rail 5 9310.5 1862.1 115.18 1.033e-09 ***Residuals 12 194.0 16.2

## now as a MIXED EFFECT MODEL> Rail.lme <- lme(travel ~ 1, data=Rail, random= ~1|Rail)> summary(Rail.lme)Linear mixed-effects model fit by REMLData: Rail AIC BIC logLik 128.177 130.6766 -61.0885

Random effects: Formula: ~1 | Rail (Intercept) ResidualStdDev: 24.80547 4.020779

Fixed effects: travel ~ 1 Value Std.Error DF t-value p-value(Intercept) 66.5 10.17104 12 6.538173 0

Standard Deviation associated with rails

Standard Deviation of residuals

Grand Average

18

Mixed Effects Modelling > anova(Rail.lm)Analysis of Variance Table

Response: travel Df Sum Sq Mean Sq F value Pr(>F) Rail 5 9310.5 1862.1 115.18 1.033e-09 ***Residuals 12 194.0 16.2

## now as a MIXED EFFECT MODEL> Rail.lme <- lme(travel ~ 1, data=Rail, random= ~1|Rail)> summary(Rail.lme)Linear mixed-effects model fit by REMLData: Rail AIC BIC logLik 128.177 130.6766 -61.0885

Random effects: Formula: ~1 | Rail (Intercept) ResidualStdDev: 24.80547 4.020779

Fixed effects: travel ~ 1 Value Std.Error DF t-value p-value(Intercept) 66.5 10.17104 12 6.538173 0

Standard Deviation associated with rails

Standard Deviation of residuals

Grand Average

19


> Rail.lme$call ## random effect modellme.formula(fixed = travel ~ 1, data = Rail,

random = ~1 | Rail)> AIC(Rail.lme) ## exact AIC version[1] 128.177 > Rail.lm0 <- lm(travel ~ 1, data=Rail2) ## NULL linear model> AIC(Rail.lm0)[1] 167.9265

> Rail.lm$call ## model with Rail as a fixed effect factorlm(formula = travel ~ Rail, data = Rail)> AIC(Rail.lm)[1] 107.8765

Is the Random effect significant ?

Comparing models which only differ in their random effects is easy (with AIC).

Comparing models which differ in their fixed effects is a little harder. Can only be done using "maximum likelihood" (not the default method in lme).

20


> yields <- read.table("splityield.txt", header=T)> yield.lme <- lme(yield ~ irrigation*density*fertilizer, data=yields, random= ~1|block/irrigation/density) > summary(yield.lme)Linear mixed-effects model fit by REML Data: yields AIC BIC logLik 481.6212 525.3789 -218.8106

Random effects: Formula: ~1 | block (Intercept) StdDev: 0.0006600339

Formula: ~1 | irrigation %in% block (Intercept) StdDev: 1.982461

Formula: ~1 | density %in% irrigation %in% block (Intercept) Residual StdDev: 6.975554 9.292805 ...

Applied on the split plot study of crop yield

21

Mixed Effects Modelling Fixed effects: yield ~ irrigation * density * fertilizer Value Std.Error DF t-value p-value(Intercept) 80.50 5.893741 36 13.658558 0.0000irrigirrig 31.75 8.335008 3 3.809234 0.0318dnslow 5.50 8.216282 12 0.669403 0.5159dnsmed 14.75 8.216282 12 1.795216 0.0978fertiNP 5.50 6.571005 36 0.837010 0.4081fertiP 4.50 6.571005 36 0.684827 0.4978irrigirrig:dnslow -39.00 11.619577 12 -3.356404 0.0057irrigirrig:dnsmed -22.25 11.619577 12 -1.914872 0.0796irrigirrig:fertiNP 13.00 9.292805 36 1.398932 0.1704irrigirrig:fertiP 5.50 9.292805 36 0.591856 0.5576dnslow:fertiNP 3.25 9.292805 36 0.349733 0.7286dnsmed:fertiNP -6.75 9.292805 36 -0.726368 0.4723dnslow:fertiP -5.25 9.292805 36 -0.564953 0.5756dnsmed:fertiP -5.50 9.292805 36 -0.591856 0.5576irrigirrig:dnslow:fertiNP 7.75 13.142011 36 0.589712 0.5591irrigirrig:dnsmed:fertiNP 3.75 13.142011 36 0.285344 0.7770irrigirrig:dnslow:fertiP 20.00 13.142011 36 1.521837 0.1368irrigirrig:dnsmed:fertiP 4.00 13.142011 36 0.304367 0.7626

22


> anova(yield.lme) numDF denDF F-value p-value(Intercept) 1 36 2674.6301 <.0001irrigation 1 3 30.9207 0.0115density 2 12 3.7842 0.0532fertilizer 2 36 11.4493 0.0001irrigation:density 2 12 5.9119 0.0163irrigation:fertilizer 2 36 5.5204 0.0081density:fertilizer 4 36 0.8826 0.4841irrigation:density:fertilizer 4 36 0.6795 0.6107

## We should probably remove the three-way interaction

## But if we are fiddling with the fixed effects, we ought ## to fit the model through Maximum Likelihood and base our ## decisions on its AIC values and Likelihood Ratio Tests

> yield.lme.ml <- update(yield.lme, ~. ,method="ML")

> AIC(yield.lme.ml)[1] 573.5108

23

Mixed Effects Modelling > yield.lme.ml2 <- update(yield.lme.ml, ~.

- irrigation:density:fertilizer)

> yield.lme.ml2$method[1] "ML" ## just checking that update() kept using "ML"

> AIC(yield.lme.ml2)[1] 569.0046 ## an improvement

> anova(yield.lme.ml2) numDF denDF F-value p-value(Intercept) 1 40 2872.7394 <.0001irrigation 1 3 33.2110 0.0104density 2 12 4.0645 0.0449fertilizer 2 40 11.4341 0.0001irrigation:density 2 12 6.3499 0.0132irrigation:fertilizer 2 40 5.5131 0.0077density:fertilizer 4 40 0.8815 0.4837

> yield.lme.ml3 <- update(yield.lme.ml2, ~. - density:fertilizer)

> AIC(yield.mle.lm3)[1] 565.1933

24

Mixed Effects Modelling > anova(yield.lme.ml, yield.lme.ml2) Model df AIC BIC logLik Test L.Ratio p-valueyield.lme.ml 1 22 573.5108 623.5974 -264.7554 yield.lme.ml2 2 18 569.0046 609.9845 -266.5023 1vs2 3.49379 0.4788

> anova(yield.lme.ml2, yield.lme.ml3) Model df AIC BIC logLik Test L.Ratio p-valueyield.lme.ml2 1 18 569.0046 609.9845 -266.5023 yield.lme.ml3 2 14 565.1933 597.0667 -268.5967 1vs2 4.18877 0.3811

> anova(yield.lme.ml3) numDF denDF F-value p-value(Intercept) 1 44 3070.8771 <.0001irrigation 1 3 35.5016 0.0095density 2 12 4.3448 0.0381fertilizer 2 44 11.2013 0.0001irrigation:density 2 12 6.7878 0.0107irrigation:fertilizer 2 44 5.4008 0.0080> yield.lme.ml4 <- update(yield.lme.ml3, ~. –irrigation:density)> AIC(yield.mle.ml4)[1] 572.9022

25

Mixed Effects Modelling > anova(yield.lme.m3, yield.lme.ml4) Model df AIC BIC logLik Test L.Ratio p-valueyield.lme.ml3 1 14 565.1933 597.0667 -268.5967 yield.lme.ml4 2 12 572.9022 600.2221 -274.4511 1vs2 11.7088 0.0029

> anova(yield.lme.ml4) numDF denDF F-value p-value(Intercept) 1 44 2138.9678 <.0001irrigation 1 3 24.7281 0.0156density 2 14 2.6264 0.1075fertilizer 2 44 11.5626 0.0001irrigation:fertilizer 2 44 5.5750 0.0069

## here comes Model Checking

> shapiro.test(yield.lme.ml3$residuals[,"fixed"]) # Best Model

Shapiro-Wilk normality test

data: yield.lme.ml3$residuals[, "fixed"] W = 0.9797, p-value = 0.2958

ml4 is one simplification too far

matrix column

26


> res <- yield.lme.ml3$resid[,"fixed"]> st.res <- res/sd(res)

> qqnorm(st.res, pch=16, main="")> qqline(st.res, col="red", lwd=2)

> res <- yield.lme.ml3$resid[,4] > st.res <- res/sd(res)

> qqnorm(st.res, pch=16, main="")> qqline(st.res, col="red", lwd=2)

including all random effects

excluding all random effects

27


> plot(yield.lme.ml3) ## by default Residuals vs Fitted values

> plot(yield.lme.ml3, yield ~ fitted(.) ) ## Observed vs Fitted values

28


> qqnorm(yield.lme.ml3, ~resid(.) | block) ## qqplot but broken down by block

Documents

Lecture 5 Linear Mixed Effects Models Olivier MISSA, [email protected]@york.ac.uk Advanced Research Skills