Heterogeneity and Hierarchical Data

Example from Squid

• From Zuur et al. Mixed Effects Models and Extensions in Ecology with R

• Testis weight was measured in squids of different sizes over multiple months

• Body size and month are potential explanatory variables

Squid Scatterplot

[Figure: scatterplot of testis weight (0 to 30) against dorsal mantle length, DML (100 to 500).]

Squid Residuals

[Figure: residuals of the linear model plotted against fitted values, month, and DML.]


Problem and Solution

• Variance in the residuals increases with DML

• Variance in the residuals also varies with month

• Solution: Use a model that can accommodate a changing variance as a function of explanatory variables

Generalized Least Squares

• Generalized least squares models permit heterogeneity by explicitly modeling the change in variance as a function of an explanatory variable

• It’s essentially a linear regression, but with an added term (or function) for the variance

Modeling the Variance

• Fixed Variance Structure

– Assumes that $\mathrm{var}(\varepsilon_i) = \sigma^2 \times \mathrm{DML}_i$

• Implementation

> library(nlme)
> M.lm <- gls(Testisweight ~ DML*fmonth, data=squid)  # linear regression model
> vf1Fixed <- varFixed(~DML)
> M.gls1 <- gls(Testisweight ~ DML*fmonth, weights=vf1Fixed, data=squid)  # linear model with variance proportional to the DML variable
> anova(M.lm, M.gls1)  # compare the models to see which has the lower AIC

Which Model is Better?

• The variance is allowed to vary with the value of DML

• Here the variance is proportional to DML, so the varFixed model is only appropriate when the variance changes steadily in one direction (increasing or decreasing) as DML increases

Model   df  AIC       BIC       logLik
M.lm    25  3752.084  3867.385  -1851.042
M.gls1  25  3620.898  3736.199  -1785.449
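The comparison above can be tried on simulated data in which the true variance really is proportional to the covariate. This is a minimal sketch, not the squid analysis: the data frame `sim` and its columns `x` and `y` are invented for illustration, and only `gls`, `varFixed`, and `anova` come from nlme.

```r
library(nlme)

set.seed(42)
n <- 500
x <- runif(n, 1, 5)                        # hypothetical covariate (plays the role of DML)
y <- 1 + 2 * x + rnorm(n, sd = sqrt(x))    # residual variance equals 1 * x
sim <- data.frame(x = x, y = y)

m_ols   <- gls(y ~ x, data = sim)                          # constant variance
m_fixed <- gls(y ~ x, weights = varFixed(~x), data = sim)  # variance proportional to x

# Both models have the same number of parameters, so anova() reports
# AIC, BIC, and logLik but no likelihood ratio test.
cmp <- anova(m_ols, m_fixed)
print(cmp)
```

Because varFixed adds no parameters, the two fits differ only in how they weight the observations; the model with the lower AIC is the one whose variance assumption matches the data better.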

Modeling the Change in Variance Over Months

• Month is modeled as a nominal (categorical) variable

• varIdent allows each month to have its own variance

> vf2 <- varIdent(form= ~1 | fmonth)
> M.gls2 <- gls(Testisweight ~ DML*fmonth, data=squid, weights=vf2)
> anova(M.lm,M.gls1,M.gls2)
> summary(M.gls2)

Results of anova()

        Model  df  AIC     BIC     logLik   Test    L.Ratio  p-value
M.lm    1      25  3752.1  3867.4  -1851.0
M.gls1  2      25  3620.9  3736.2  -1785.4
M.gls2  3      36  3614.4  3780.5  -1771.2  2 vs 3  28.46    0.0027

df: Degrees of freedom, which indicates the complexity of the models. Note that having a different variance for each level of month increases the complexity quite a bit.

AIC: Akaike Information Criterion; the lowest value indicates the best model. A difference of about 6 means the better model carries roughly 95% of the support when the two models are compared (the relative likelihood of the worse model is $e^{-6/2} \approx 0.05$).

L.Ratio and p-value: A formal test for a significant difference between models. This test is only valid if the models are nested (that is, you can get the simpler model by setting a parameter in the more complex model to zero). AIC is valid whether or not the models are nested (but it isn't a formal test).
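The AIC rule of thumb can be worked through by hand. The sketch below takes the three AIC values from the anova() table and converts them to relative likelihoods and Akaike weights; the variable names are illustrative, and the weights are only meaningful relative to the set of models being compared.

```r
# AIC values copied from the anova() output above
aic <- c(M.lm = 3752.1, M.gls1 = 3620.9, M.gls2 = 3614.4)

delta   <- aic - min(aic)          # difference from the best model
rel_lik <- exp(-delta / 2)         # relative likelihood of each model
weights <- rel_lik / sum(rel_lik)  # Akaike weights (sum to 1)

print(round(cbind(delta, rel_lik, weights), 4))

# Two-model case with an AIC difference of exactly 6:
# weight of the better model
w6 <- 1 / (1 + exp(-6 / 2))
print(round(w6, 3))
```

With a difference of 6, the better of two models receives a weight of about 0.95, which is where the "95%" shorthand comes from.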

Results of summary()

Generalized least squares fit by REML
  Model: Testisweight ~ DML * fmonth
  Data: squid
       AIC      BIC    logLik
  3614.436 3780.469 -1771.218

Variance function:
 Structure: Different standard deviations per stratum
 Formula: ~1 | fmonth
 Parameter estimates:
    2     9    12    11     8    10     5     7
1.000 2.991 1.273 1.509 0.982 2.216 1.639 1.378
    6     4     1     3
1.647 1.423 1.958 1.979

… plus a lot more. Shows the model, AIC, and the standard deviation estimated for each month. The estimates are multipliers relative to the reference month (month 2, fixed at 1.000).

Variance Structures

• varFixed: Variance changes proportionally to a continuous variable

• varIdent: Variance is different for each level of a nominal variable

• varPower: Variance is proportional to a continuous variable raised to a power

– $\varepsilon_i \sim N(0, \sigma^2 \times |\mathrm{DML}_i|^{2\delta})$

• varExp: Exponential variance structure

– $\mathrm{var}(\varepsilon_i) = \sigma^2 \times e^{2\delta \times \mathrm{DML}_i}$

– Better when the covariate can take the value of zero

• varConstPower: Constant plus power of the variance covariate function

– $\mathrm{var}(\varepsilon_i) = \sigma^2 \times (\delta_1 + |\mathrm{DML}_i|^{\delta_2})^2$

• varComb: Combination of variance structures

– $\mathrm{var}(\varepsilon_{ij}) = \sigma_j^2 \times e^{2\delta \times \mathrm{DML}_{ij}}$

– Allows different levels for month and a change in variance with DML
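To see how these structures differ, the variance formulas can be evaluated directly for a few covariate values. This is a sketch in base R only: the parameter values (`sigma2`, `delta`, `delta1`, `delta2`) are made up for illustration, not estimates from the squid data, and DML is rescaled to units of 100 mm to keep the numbers small.

```r
# Hypothetical covariate values: DML in mm, rescaled to units of 100 mm
dml <- c(100, 200, 300, 400, 500) / 100

# Hypothetical parameter values, chosen only for illustration
sigma2 <- 1.0
delta  <- 0.5
delta1 <- 1.0
delta2 <- 0.5

v <- data.frame(
  dml        = dml,
  fixed      = sigma2 * dml,                              # varFixed
  power      = sigma2 * abs(dml)^(2 * delta),             # varPower
  exp        = sigma2 * exp(2 * delta * dml),             # varExp
  constpower = sigma2 * (delta1 + abs(dml)^delta2)^2      # varConstPower
)
print(round(v, 3))
```

With `delta = 0.5` the varPower column equals the varFixed column, which shows that varFixed is the special case of varPower in which the power is fixed rather than estimated.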

Applying the varComb Variance Structure

#Allow variance to vary with month and DML
> vf8 <- varComb(varIdent(form = ~1 | fmonth), varExp(form = ~DML))
> M.gls8 <- gls(Testisweight ~ DML * fmonth, weights=vf8, data=squid)
> anova(M.lm, M.gls8)
> vf9 <- varComb(varIdent(form = ~1 | fmonth), varPower(form = ~DML))
> M.gls9 <- gls(Testisweight ~ DML * fmonth, weights=vf9, data=squid)
> anova(M.gls8, M.gls9)
> plot(M.gls9)

        Model  df  AIC       BIC       logLik
M.gls8  1      37  3414.817  3585.463  -1670.409
M.gls9  2      37  3406.231  3576.877  -1666.116

NOTE: These models are NOT nested, so compare them by AIC rather than by the likelihood ratio test

Residuals of M.gls9

[Figure: residual plots. Original linear model: residuals against fitted values, month, and DML. Model with varComb: standardized residuals against fitted values.]

In real life, you might want to examine this relationship for each value of the nominal variable.

How do I know which variance structure to use?

• If the variance covariate (i.e., the variable across which the variance changes) is nominal, use varIdent because it’s the only choice.

• In general, don’t use varFixed, because it’s inflexible (a strict linear relationship).

• Start with varPower, varExp, or varConstPower.

• Use varComb to combine different models for different variables.

• Use the distribution of residuals and AIC to pick the best model.

• If you have a biological reason to believe the variance should vary in a particular way, use that knowledge.

• If the values of the variance covariate are extremely large (100 or more), consider using a different scale (e.g., meters instead of mm) or standardizing them to avoid model instability.
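The last point can be illustrated with the varExp formula, where the covariate sits inside an exponential. This is a sketch with invented numbers: `dml_mm` is a hypothetical covariate and `delta` is deliberately exaggerated so that the numerical problem is visible.

```r
dml_mm <- c(100, 200, 300, 400, 500)   # hypothetical covariate in mm
delta  <- 1                            # exaggerated value, for illustration only

# exp() of very large arguments overflows to Inf in double precision
raw <- exp(2 * delta * dml_mm)
print(raw)

# Option 1: change the unit (mm to m)
dml_m <- dml_mm / 1000
print(exp(2 * delta * dml_m))

# Option 2: standardize to mean 0 and standard deviation 1
dml_std <- as.numeric(scale(dml_mm))
std <- exp(2 * delta * dml_std)
print(std)
```

Rescaling changes the numerical value of the estimated variance parameter but not the shape of the fitted variance function, so results should be reported together with the scale used.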

Mixed Effects for Nested Data

• Example (from Zuur et al., Mixed Effects Models and Extensions in Ecology with R)

– Species richness on a beach as a function of exposure and the height of the sampling station compared to mean tidal level (NAP)

– Exposure is nominal and has two classes; NAP is continuous

– Each of nine beaches was sampled at five sites, so samples are nested within beaches

– The question is whether species richness is affected by NAP and exposure (a two-factor ANOVA, except for the hierarchical nature of the data)

Nested Data

• Obviously the sites within beaches are not independent

• One old-fashioned way to analyze the data would be to estimate the slope of species richness on NAP for each beach and then use the slopes in the analysis

• This approach discards much of the information by using a summary statistic for each beach and reduces the sample size to 9 (number of beaches)

The Data

    Sample  Richness  Exposure     NAP  Beach
1        1        11        10   0.045      1
2        2        10        10  -1.036      1
3        3        13        10  -1.336      1
4        4        11        10   0.616      1
5        5        10        10  -0.684      1
6        6         8         8   1.190      2
7        7         9         8   0.820      2
8        8         8         8   0.635      2
9        9        19         8   0.061      2
10      10        17         8  -1.334      2
11      11         6        11  -0.976      3
12      12         1        11   1.494      3
13      13         4        11  -0.201      3
14      14         3        11  -0.482      3
15      15         3        11   0.167      3

It looks like beach and exposure are confounded at first, but they aren’t because in the full dataset some different beaches have the same exposure (also 8 and 10 are both “low exposure” and 11 is “high exposure”).

Getting the Linear Regression Coefficients for Each Beach

• > library(nlme)

• > by_beach <- lmList(Richness ~ NAP | Beach, data=rikz)

• This function estimates the linear regression for each value of Beach

• However, given the limitations of this approach, we probably want to use a model that makes better use of the data
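The two-stage approach can be sketched on simulated data with the same layout as the beach example (nine beaches, five sites each). Everything in `sim` is invented for illustration, including the effect sizes, and the simulated response is continuous rather than a count.

```r
library(nlme)

set.seed(7)
beach <- factor(rep(1:9, each = 5))       # 9 beaches x 5 sites
nap   <- runif(45, -1.3, 2.2)             # hypothetical NAP values
beach_shift <- rnorm(9, sd = 3)           # beach-to-beach differences
richness <- 6.5 - 2.5 * nap + beach_shift[as.integer(beach)] + rnorm(45, sd = 1.5)
sim <- data.frame(Richness = richness, NAP = nap, Beach = beach)

# Stage 1: one regression per beach
by_beach <- lmList(Richness ~ NAP | Beach, data = sim)
slopes   <- coef(by_beach)[, "NAP"]

# Stage 2: analyze the per-beach slopes; only 9 values remain
print(length(slopes))
print(t.test(slopes))
```

The second stage works with nine numbers instead of 45 observations, which is the loss of information described above.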

Linear Mixed Effects Model

• Contains a mixture of fixed effects and random effects

– $R_i = X_i \beta + Z_i b_i + \varepsilon_i$

– $R_i$ is the vector of richness observations for beach $i$. $X_i \beta$ is the fixed part (the effect of NAP shared by all beaches) and $Z_i b_i$ is the random part, which lets the relationship vary among beaches.

• Assumptions:

– $b_i \sim N(0, D)$

– $\varepsilon_i \sim N(0, \Sigma_i)$

– $b_i$ and $\varepsilon_i$ are independent
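For the simplest case, a random intercept only, the assumptions above imply a specific correlation between sites on the same beach. The derivation below is a sketch: the symbols $\alpha$, $d^2$, and $\sigma^2$ are notation introduced here (fixed intercept, random intercept variance, residual variance) and do not appear on the slides.

```latex
% Random intercept model for site j on beach i
R_{ij} = \alpha + \beta \, \mathrm{NAP}_{ij} + b_i + \varepsilon_{ij},
\qquad b_i \sim N(0, d^2), \quad \varepsilon_{ij} \sim N(0, \sigma^2)

% Because b_i and \varepsilon_{ij} are independent:
\mathrm{var}(R_{ij}) = d^2 + \sigma^2, \qquad
\mathrm{cov}(R_{ij}, R_{ik}) = d^2 \quad (j \neq k)

% Correlation between two sites on the same beach
\mathrm{cor}(R_{ij}, R_{ik}) = \frac{d^2}{d^2 + \sigma^2}

% General form, integrating out the random effects
R_i \sim N\left(X_i \beta, \; Z_i D Z_i^{\top} + \Sigma_i\right)
```

The larger the between-beach variance $d^2$ is relative to $\sigma^2$, the more alike two sites on the same beach are, which is why they cannot be treated as independent observations.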

Steps to Select the Model

(1) Start with a model where the fixed component contains all explanatory variables and as many interactions as possible: the beyond optimal model.

(2) Find the optimal structure of the random component. Use REML estimation to compare nested models with different random components.

(3) Find the optimal fixed structure, keeping the optimal random structure you already found in (2). Nested models with different fixed effects should be compared using ML.

(4) Present the final model using REML estimation.

REML vs ML

• ML: Maximum Likelihood Estimation

– Writes down the likelihood: the probability of the data given the model

– Chooses the parameter values that maximize this probability

– The ML estimates of the variance parameters are slightly biased (too small) because the fixed-effect parameters are estimated from the same data

• REML: Restricted Maximum Likelihood

– REML provides a solution that removes this bias

• When comparing two models, they need to both be fit using the same estimation procedure
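The bias is easiest to see in ordinary linear regression, where both estimators have closed forms: ML divides the residual sum of squares by n, while REML divides by n minus the number of regression coefficients. The simulation below is a sketch in base R with invented settings (n = 10, true residual variance 4).

```r
set.seed(3)
n      <- 10      # small sample so the bias is visible
n_sim  <- 2000    # number of simulated data sets
sigma2 <- 4       # true residual variance

est <- replicate(n_sim, {
  x   <- rnorm(n)
  y   <- 1 + 2 * x + rnorm(n, sd = sqrt(sigma2))
  rss <- sum(resid(lm(y ~ x))^2)
  c(ml = rss / n,            # ML: ignores the 2 estimated coefficients
    reml = rss / (n - 2))    # REML: corrects for them
})

# Average estimate over all simulations, to compare with sigma2 = 4
avg <- rowMeans(est)
print(round(avg, 2))
```

The ML average falls below the true variance and the REML average sits close to it. The gap shrinks as n grows, which is why the bias is called slight.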

Random Effects

• Random intercept model:

– If beach is the random factor, then it allows the intercept to vary among beaches (but the lines will be parallel)

> Mlme1 <- lme(Richness ~ NAP, random = ~1 | fbeach, data=rikz)  # the "1" represents the intercept
> summary(Mlme1)

• Random intercept and slope model:

– Allows the slope and intercept to vary among beaches

> Mlme2 <- lme(Richness ~ NAP, random = ~1 + NAP | fbeach, data=rikz)
> summary(Mlme2)
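Both models can be tried on simulated data in which intercepts and slopes truly differ among groups. This is a sketch under stated assumptions: the data frame `sim` is invented, and it uses 20 groups of 10 observations (more than the nine beaches of five sites in the example) so that the random slope model fits reliably.

```r
library(nlme)

set.seed(11)
n_group <- 20
n_per   <- 10
beach <- factor(rep(seq_len(n_group), each = n_per))
idx   <- as.integer(beach)
nap   <- runif(n_group * n_per, -1.3, 2.2)

b0 <- rnorm(n_group, sd = 2)     # random intercept deviations
b1 <- rnorm(n_group, sd = 1.5)   # random slope deviations
richness <- (6.5 + b0[idx]) + (-2.5 + b1[idx]) * nap +
  rnorm(n_group * n_per, sd = 1)
sim <- data.frame(richness = richness, nap = nap, beach = beach)

# Random intercept: parallel lines, one intercept per beach
m_int <- lme(richness ~ nap, random = ~1 | beach, data = sim)

# Random intercept and slope: each beach gets its own line
m_slope <- lme(richness ~ nap, random = ~1 + nap | beach, data = sim)

print(fixef(m_slope))     # population-level intercept and slope
print(VarCorr(m_slope))   # estimated variance components
print(anova(m_int, m_slope))
```

Both fits use the default REML estimation and share the same fixed effects, so the anova() comparison of the random structures follows step (2) of the selection protocol.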

Random Intercepts

[Figure: Richness (0 to 20) against NAP (-1.0 to 2.0) for the random intercept model; each observation is plotted as its beach number (1 to 9).]

How to produce this graph

#Graph of random intercepts
F0 <- fitted(Mlme1, level=0)   # population-level fitted values
F1 <- fitted(Mlme1, level=1)   # beach-level fitted values
I <- order(rikz$NAP)
NAPs <- sort(rikz$NAP)
plot(NAPs, F0[I], lwd=4, type="l", ylim=c(0,22), ylab="Richness", xlab="NAP")   # thick line: population fit
for(i in 1:9) {
  x1 <- rikz$NAP[rikz$Beach==i]
  y1 <- F1[rikz$Beach == i]
  K <- order(x1)
  lines(sort(x1), y1[K])   # thin line: fit for beach i
}
text(rikz$NAP, rikz$Richness, rikz$Beach, cex=0.9)   # label each point with its beach number

Random Intercepts and Slopes

[Figure: Richness (0 to 20) against NAP (-1.0 to 2.0) for the random intercept and slope model; each observation is plotted as its beach number (1 to 9).]

Example 2: Putting it All Together

• We are going to follow the Zuur et al. 10-step plan (similar to the 4-step plan from earlier but with more steps)

• Sample data

– Begging behavior of nestling barn owls

– Response: “sibling negotiation”

• Number of calls in the 15 minutes preceding parent arrival divided by the number of nestlings

– Explanatory variables

• Sex of the parent, food treatment (food satiated or food deprived), arrival time of parent

Step 1: Linear Regression

> M.lm <- lm(NegPerChick ~ SexParent * FoodTreatment + SexParent*ArrivalTime, data=owls)
> plot(M.lm)

Step 1

• Evidence for heterogeneity

• Plot each explanatory variable against residuals to look for a pattern

• No pattern, so we can’t simply model the heterogeneity

• Instead, log transform in this case: log10(Y+1)

• Refit the linear model on the log-transformed data and boxplot the residuals by nest

Residuals After Transformation

[Figure: boxplots of standardized residuals (-3 to 3) from the log-transformed linear model, one box per nest.]

Code for Previous Slide

#Log transform
> owls$LogNeg <- log10(owls$NegPerChick+1)
> M2.lm <- lm(LogNeg ~ SexParent * FoodTreatment + SexParent * ArrivalTime, data=owls)
> E <- rstandard(M2.lm)
> boxplot(E ~ Nest, data=owls, axes=FALSE, ylim=c(-3,3))
> abline(0,0); axis(2)
> text(1:27, -2.5, levels(owls$Nest), cex=0.75, srt=65)

Step 2: Fit the model with GLS

• > library(nlme)

• > Form <- formula(LogNeg ~ SexParent*FoodTreatment + SexParent*ArrivalTime)

• > M.gls <- gls(Form, data=owls)

This model will give the same results as lm, but the model is fit with REML so that it can be compared to the mixed models
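The equivalence between lm and a gls fit without any variance or correlation structure can be checked on simulated data. This is a sketch: the data frame `sim` and its columns are invented for illustration.

```r
library(nlme)

set.seed(5)
sim <- data.frame(x = rnorm(100))
sim$y <- 0.5 - 0.3 * sim$x + rnorm(100, sd = 0.4)

fit_lm  <- lm(y ~ x, data = sim)
fit_gls <- gls(y ~ x, data = sim)   # method = "REML" is the default

# The coefficient estimates agree
print(cbind(lm = coef(fit_lm), gls = coef(fit_gls)))

# The gls fit carries a REML log-likelihood, which is what anova()
# uses when comparing it with an lme model fitted by REML
print(logLik(fit_gls))
```

The point of refitting with gls is therefore not a different answer but a likelihood computed in the same way as for the mixed models that follow.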

Step 3: Choose a Variance Structure

• Nest looked like it introduced variation into the analysis, because the residuals for some nests were mostly above zero and for others mostly below

• Thus, we want to include nest as a random factor

Step 4: Fit the Model

• > M1.lme <- lme(Form, random = ~ 1 | Nest, method = "REML", data=owls)

• Form is the same formula as before

• We tell lme to use “REML” so it will be comparable to the gls model

Step 5: Compare New Model with Old Model

• > anova(M.gls, M1.lme)

• The models were both fit with REML and are nested, so the Likelihood Ratio Test is valid

        Model  df  AIC       BIC       logLik     Test    L.Ratio   p-value
M.gls   1      7   64.37422  95.07058  -25.18711
M1.lme  2      8   37.71547  72.79702  -10.85773  1 vs 2  28.65875  <.0001
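The L.Ratio, p-value, and AIC columns can be reproduced by hand from the log-likelihoods and degrees of freedom in the table. The sketch below uses only base R; the variable names are illustrative.

```r
# Values copied from the anova() output above
ll_gls <- -25.18711; df_gls <- 7
ll_lme <- -10.85773; df_lme <- 8

# Likelihood ratio statistic: twice the gain in log-likelihood
lr <- 2 * (ll_lme - ll_gls)

# Compared against a chi-square with df equal to the number of extra parameters
p <- pchisq(lr, df = df_lme - df_gls, lower.tail = FALSE)

# AIC = -2 * logLik + 2 * df
aic <- -2 * c(M.gls = ll_gls, M1.lme = ll_lme) + 2 * c(df_gls, df_lme)

print(round(lr, 4))
print(p)
print(round(aic, 4))
```

The single extra parameter in M1.lme is the variance of the random nest intercept, and the large statistic indicates that adding it improves the fit substantially.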

Step 6: Validate the Model so Far

[Figure: four panels of residuals (-2 to 3): against fitted values (0.0 to 0.6), by sex of parent (Female, Male), by food treatment (Deprived, Satiated), and against arrival time in hours (22 to 28).]

Code for the Previous Slide

#Step 6 Model Validation
E2 <- resid(M1.lme, type="normalized")
F2 <- fitted(M1.lme)
op <- par(mfrow = c(2,2), mar=c(4,4,3,2))
MyYlab <- "Residuals"
plot(x=F2, y=E2, xlab="Fitted values", ylab=MyYlab)
boxplot(E2 ~ SexParent, data=owls, main="Sex of parent", ylab=MyYlab)
boxplot(E2 ~ FoodTreatment, data=owls, main="Food treatment", ylab=MyYlab)
plot(x = owls$ArrivalTime, y=E2, ylab=MyYlab, main="Arrival time", xlab = "Time (hours)")
par(op)

Steps 7 and 8: The Optimal Fixed Structure

• > summary(M1.lme) #NOT anova()

Fixed effects: list(Form)
                               Value   Std.Error   DF    t-value  p-value
(Intercept)                1.1236414  0.19522087  567   5.755744   0.0000
SexParentMale              0.1082138  0.25456854  567   0.425087   0.6709
FoodTreatmentSatiated     -0.1818952  0.03062840  567  -5.938776   0.0000
ArrivalTime               -0.0290079  0.00781832  567  -3.710251   0.0002
SexParentMale:FoodTrtSat   0.0140178  0.03971071  567   0.352998   0.7242
SexParentMale:ArrivalTm   -0.0038358  0.01019764  567  -0.376144   0.7070

Consider dropping the least significant interaction term and refitting the model. Sequentially do this until only significant interactions remain

Step 7 and 8

• Best way to perform this step is probably via the likelihood ratio test

• Refit the model using “ML” but keep the same random structure

• Compare models with and without various terms until all terms are significant

Steps 7 and 8

#Use ML to refit the model and compare reduced models using LR tests
M1.Full <- lme(Form, random = ~ 1 | Nest, method = "ML", data=owls)
M1.A <- update(M1.Full, .~. -SexParent:FoodTreatment)
M1.B <- update(M1.Full, .~. -SexParent:ArrivalTime)
anova(M1.Full, M1.A)
anova(M1.Full, M1.B)

> anova(M1.Full, M1.A)
         Model  df  AIC    BIC   logLik  Test    L.Ratio   p-value
M1.Full  1      8   -0.74  34.4  8.374
M1.A     2      7   -2.62  28.1  8.312   1 vs 2  0.123736  0.725

> anova(M1.Full, M1.B)
p-value = 0.7102

M1.A had the highest p-value (of these two tests), so we drop the SexParent:FoodTreatment interaction first. Repeat until only significant effects are left – but remember that if an interaction is significant, you must keep the associated main effects
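One round of this procedure can be sketched on simulated data with a similar layout. Everything in `sim` is invented for illustration (27 nests with 20 records each, no true interactions), and the variable names `sex`, `food`, `time`, and `nest` are stand-ins for the owl variables rather than the real data.

```r
library(nlme)

set.seed(21)
n_nest <- 27
n_per  <- 20
n      <- n_nest * n_per
sim <- data.frame(
  nest = factor(rep(seq_len(n_nest), each = n_per)),
  sex  = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  food = factor(sample(c("Deprived", "Satiated"), n, replace = TRUE)),
  time = runif(n, 22, 29)
)
nest_eff <- rnorm(n_nest, sd = 0.1)   # random nest intercepts
sim$y <- 1.2 - 0.18 * (sim$food == "Satiated") - 0.03 * sim$time +
  nest_eff[as.integer(sim$nest)] + rnorm(n, sd = 0.23)

# Fit with ML because the fixed effects differ between the models compared
full  <- lme(y ~ sex * food + sex * time, random = ~1 | nest,
             method = "ML", data = sim)
no_sf <- update(full, . ~ . - sex:food)
no_st <- update(full, . ~ . - sex:time)

# Likelihood ratio test p-value for each candidate term
p_sf <- anova(full, no_sf)[["p-value"]][2]
p_st <- anova(full, no_st)[["p-value"]][2]
print(c(sex_food = p_sf, sex_time = p_st))
```

The term with the larger p-value would be dropped first, the reduced model becomes the new starting point, and the comparison is repeated.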

Step 9: Refit the reduced model with REML

• M5 <- lme(LogNeg ~ FoodTreatment + ArrivalTime, random = ~ 1 | Nest, method = "REML", data=owls)

• summary(M5)

Step 9 output

Linear mixed-effects model fit by REML
Fixed effects: LogNeg ~ FoodTreatment + ArrivalTime
                    Value   Std.Error   DF    t-value  p-value
(Intercept)     1.1821386  0.12897491  570   9.165648        0
FoodTreatment  -0.1750754  0.01996606  570  -8.768650        0
ArrivalTime    -0.0310214  0.00511232  570  -6.067954        0
 Correlation:
                       (Intr)  FdTrtS
FoodTreatmentSatiated  -0.112
ArrivalTime            -0.984   0.039

Standardized Within-Group Residuals:
        Min           Q1          Med           Q3          Max
-2.22283609  -0.78307304  -0.07461892   0.68690000   3.29183331

Step 10

• Model validation and interpretation
