
Self-Modeling Regression for Longitudinal Data with Time-Invariant Covariates

Naomi S. Altman

Penn State University

Julio Villarreal

EdVision

Experiments in Which the Response is a Curve

• growth curves

• longitudinal data

• blood concentration curves

Features

• multiple curves with similar shape

• covariates or treatments (“time invariant”)

Objectives

• flexible nonparametric model for shape

• interpretable parameters for treatment effects

• test statistics for treatment effects

• test statistics for effect of treatment on shape

Shape-Invariant Regression

$y_i(t) = \alpha_{i0} + A_{i1}\,\mu_0(\beta_{i0} + B_{i1} t) + \varepsilon_{it}$

(Lawton et al 1972)

• Common shape: $\mu_0(t)$

• Parameters: $\theta_i' = (\alpha_{i0}, \alpha_{i1}, \beta_{i0}, \beta_{i1})$

• $A_{i1} = \exp(\alpha_{i1})$

• $B_{i1} = \exp(\beta_{i1})$
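
To make the roles of the four parameters concrete, here is a minimal Python sketch that evaluates the shape-invariant model for one curve. The logistic function stands in for the common shape and is purely an assumption for illustration; the names `sim_curve` and `logistic` are hypothetical and do not come from the slides.

```python
import math

def sim_curve(t, theta, mu0):
    """Evaluate alpha0 + exp(alpha1) * mu0(beta0 + exp(beta1) * t) for one curve."""
    alpha0, alpha1, beta0, beta1 = theta
    return alpha0 + math.exp(alpha1) * mu0(beta0 + math.exp(beta1) * t)

def logistic(t):
    """Hypothetical common shape; the slides estimate mu0 nonparametrically."""
    return 1.0 / (1.0 + math.exp(-(t - 3.0)))

# theta = (0, 0, 0, 0) reproduces the common shape itself.
reference = sim_curve(2.0, (0.0, 0.0, 0.0, 0.0), logistic)

# A curve shifted up by 1, doubled in height, and running twice as fast in time.
shifted = sim_curve(2.0, (1.0, math.log(2.0), 0.0, math.log(2.0)), logistic)
```

Because the amplitude and time scale enter through exponentials, they stay positive for any real value of the parameters.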

Why Use Shape-Invariant Regression?

• If shape is not of interest:

– treatment effects can be summarized by a model for the parameters

– common summaries (e.g. height of maximum, time to maximum) depend on shape only through functionals which are constant over “i”

• Shape can be estimated nonparametrically

– this allows a test of treatment effect on shape

– a functional form for shape may be suggested


Outline

• Nestling Growth Experiments

• Fitting a SIM model (algorithm)

• Testing effects (simulation)

• Results for Tree Swallow Growth

• Does the model fit?

Nestling Growth Experiments

• Several data sets and experimental conditions and covariates. Many response variables.

• Questions of interest:

a) Are there treatment effects?

b) Does the shape of the growth curve vary with response variable?

c) Do treatment effects for the response variables differ? For example, if a treatment delays growth of tarsus length, does it delay growth of head circumference?

Experiment Design For Tree Swallow Study

• 2 to 8 measurement times per curve (mean 6)

• 297 nestlings

• A split plot design with whole plot (nest) factors:

– covariate HatchDate

– dietary supplement/none

(courtesy of Matt Wasson, Cornell)

Growth Curves

[Figure: panels for tarsus length, wing length, mass and head diameter. The "Raw Data" panels plot size against age (0 to 14); the "Fitted Growth Curve" panels plot the growth curve against transformed time.]

Back to Shape-Invariant Regression

$y_i(t) = \alpha_{i0} + A_{i1}\,\mu_0(\beta_{i0} + B_{i1} t) + \varepsilon_{it}$

(Lawton et al 1972)

• Common shape: $\mu_0(t)$

• Parameters: $\theta_i' = (\alpha_{i0}, \alpha_{i1}, \beta_{i0}, \beta_{i1})$

• $A_{i1} = \exp(\alpha_{i1})$

• $B_{i1} = \exp(\beta_{i1})$

Fitting the SIM Model

Starting with $\theta_{ij} = (0, 0, 0, 0)$:

1. Let $Y^*_{ij}(t) = (Y_{ij}(t) - a_{ij0}) / \exp(a_{ij1})$ be the “aligned response” and $t^* = b_{ij0} + \exp(b_{ij1})\,t$ the “aligned time”, where $a$ and $b$ denote the current estimates of $\alpha$ and $\beta$.

2. Use a nonparametric smoother to regress $Y^*$ on $t^*$. Call this $m(t^*)$.

3. Use nonlinear mixed models to fit the model

$y_{ij}(t) = \alpha_{ij0} + A_{ij1}\,m(\beta_{ij0} + B_{ij1} t) + \varepsilon_{ijt}$

4. Check for convergence. If not converged, go to 1.
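
The alternating structure of steps 1 to 4 can be sketched in plain Python. This is a deliberately reduced illustration and not the authors' implementation: the time parameters are held at their starting values, ordinary least squares for the intercept and amplitude of each curve replaces the NLME step, a Gaussian kernel smoother replaces the penalized spline, and centering the parameters at zero is a rough stand-in for zero-mean random effects. All function names are hypothetical.

```python
import math

def align(curves, thetas):
    """Step 1: pool the aligned (t*, Y*) pairs from every curve."""
    pooled = []
    for (times, ys), (a0, a1, b0, b1) in zip(curves, thetas):
        for t, y in zip(times, ys):
            pooled.append((b0 + math.exp(b1) * t, (y - a0) / math.exp(a1)))
    return pooled

def kernel_smoother(pooled, bandwidth):
    """Step 2 stand-in: Nadaraya-Watson smoother with a Gaussian kernel."""
    def m(t_star):
        w = [math.exp(-0.5 * ((t_star - ts) / bandwidth) ** 2) for ts, _ in pooled]
        return sum(wi * p[1] for wi, p in zip(w, pooled)) / sum(w)
    return m

def refit_amplitude(curve, m, theta):
    """Step 3 stand-in: least squares for (a0, A), time parameters held fixed."""
    times, ys = curve
    b0, b1 = theta[2], theta[3]
    x = [m(b0 + math.exp(b1) * t) for t in times]
    n = len(x)
    xbar, ybar = sum(x) / n, sum(ys) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, ys)) / sxx
    slope = max(slope, 1e-8)  # keep the amplitude positive so its log exists
    return (ybar - slope * xbar, math.log(slope), b0, b1)

def predict(t, theta, m):
    a0, a1, b0, b1 = theta
    return a0 + math.exp(a1) * m(b0 + math.exp(b1) * t)

def fit_sim(curves, bandwidth=0.3, max_iter=25, tol=1e-8):
    thetas = [(0.0, 0.0, 0.0, 0.0) for _ in curves]
    previous = None
    for _ in range(max_iter):
        m = kernel_smoother(align(curves, thetas), bandwidth)   # steps 1-2
        fitted = [refit_amplitude(c, m, th) for c, th in zip(curves, thetas)]
        predicted = [predict(t, th, m)
                     for curve, th in zip(curves, fitted) for t in curve[0]]
        # Centre a0 and a1 at zero (assumed stand-in for zero-mean random effects).
        mean_a0 = sum(th[0] for th in fitted) / len(fitted)
        mean_a1 = sum(th[1] for th in fitted) / len(fitted)
        thetas = [(a0 - mean_a0, a1 - mean_a1, b0, b1) for a0, a1, b0, b1 in fitted]
        # Step 4: stop when the predicted values no longer change.
        if previous is not None and max(
                abs(p - q) for p, q in zip(predicted, previous)) < tol:
            break
        previous = predicted
    return thetas, kernel_smoother(align(curves, thetas), bandwidth)
```

Each curve is a pair `(times, ys)`. With curves that differ only by a vertical shift, the estimated `a0` values recover the differences between the shifts; a faithful fit would also update the time parameters and use a mixed-model routine in step 3.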

Notes

• In fitting a complicated SIM model such as the one for the bird growth data, it is convenient to fit first without the linear model for the parameters. In the last step, the linear model for the parameters can be fitted.

• The convergence criterion used was the change in predicted values.

• It is convenient to smooth using a penalized cubic spline. This can be done using a linear mixed models routine.

Penalized Cubic Spline

• Pick a dense set of equally spaced time points $\tau_1, \dots, \tau_K$ – in a typical study with 4-12 time points per curve, 20 points will do.

• Fit a linear mixed model:

$\mu(t) = \sum_{i=0}^{3} \delta_i t^i + \sum_{j=1}^{K} \gamma_j (t - \tau_j)_+^3 + \eta$

subject to $\gamma'\gamma \le C$

cubic polynomial in time is the fixed effect

$(t - \tau_j)_+^3$ are the random effects

• The result is similar to a smoothing spline, but computationally simpler.

(Carroll and Ruppert, 1997; Eilers and Marx, 1997)

Fitting the Penalized Spline

It turns out that this is readily fitted by considering the $\delta$’s to be fixed effects and the $\gamma$’s to be random effects with common variance $\sigma_\gamma^2$.

Why? Computationally very simple and fast compared to other smoothing techniques.

This has two nice consequences:

• The shape is a polynomial if $\sigma_\gamma^2 = 0$.

• The treatment effects on the curves can be readily modeled by using the same linear model that we used for the parameters.

Mixed Models for the Parameters

• Suppose that we now add another level to the model:

$y_i(t) = \alpha_{i0} + A_{i1}\,\mu_0(\beta_{i0} + B_{i1} t) + \varepsilon_{it}$

$\theta_i' = (\alpha_{i0}, \alpha_{i1}, \beta_{i0}, \beta_{i1})$

$A_{i1} = \exp(\alpha_{i1})$, $B_{i1} = \exp(\beta_{i1})$

$\theta_{ij} = X_j d_j + Z_j D_j$

• where $X_j$ and $Z_j$ are observed predictors; $d_j$ and $D_j$ are fixed and random effects

Estimation of the Mixed Model Parameters

is readily incorporated into the algorithm by either:

• adding the mixed model to the NLME step during iteration (unconditional method)

• iterating the basic SIM model until convergence and then fitting the mixed model in the final NLME step (conditional method)

Testing the Mixed Model Parameters

• Conditional on the fitted shape, we have an NLME model. So, the LRT from the conditional method should be asymptotically chi-squared.

• How asymptotic is this?

• What about the unconditional method?

• Why not fit the entire model as one huge NLME?
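
For reference, the chi-squared calibration mentioned in the first bullet amounts to the calculation below. This is a minimal sketch of the standard likelihood ratio test for nested models, valid only to the extent that the asymptotic approximation holds, which is exactly what the slide questions. The function names are hypothetical and any log-likelihood values passed in are the caller's own.

```python
import math

def chi2_sf(x, df):
    """Upper tail P(X > x) for a chi-squared variable with integer df >= 1."""
    half = x / 2.0
    if df % 2 == 0:
        a, tail = 1.0, math.exp(-half)                 # closed form for df = 2
    else:
        a, tail = 0.5, math.erfc(math.sqrt(half))      # closed form for df = 1
    # Recurrence Q(a + 1, h) = Q(a, h) + h^a * exp(-h) / Gamma(a + 1)
    while a < df / 2.0 - 1e-9:
        tail += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return tail

def lrt_pvalue(loglik_null, loglik_full, extra_params):
    """Likelihood ratio statistic and its asymptotic chi-squared p-value."""
    # Clamp at zero in case numerical noise makes the nested fit look better.
    statistic = max(2.0 * (loglik_full - loglik_null), 0.0)
    return statistic, chi2_sf(statistic, extra_params)
```

Here `extra_params` is the difference in the number of parameters between the full and null models.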

Distribution of the Conditional LRT

Simulation Study

[Figure: Curve 1, Curve 2 and Curve 3, plotted as mu(t) against Time (0 to 5).]

Parameter                 Small Error   Large Error
$\sigma_{a0}$             0.10          0.30
$\sigma_{a1}$             0.05          0.10
$\sigma_{b0}$             0.10          0.30
$\sigma_{b1}$             0.05          0.10
$\sigma_{\varepsilon}$    0.05          0.10
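
A data set of the kind used in such a simulation could be generated along the following lines. The standard deviations are taken from the table above; everything else is an assumption made for illustration: zero-mean normal draws for the parameters and errors, an equally spaced time grid on 0 to 5, and whatever shape function the caller supplies. The name `simulate` is hypothetical.

```python
import math
import random

SMALL = {"a0": 0.10, "a1": 0.05, "b0": 0.10, "b1": 0.05, "eps": 0.05}
LARGE = {"a0": 0.30, "a1": 0.10, "b0": 0.30, "b1": 0.10, "eps": 0.10}

def simulate(n_curves, n_points, sd, mu0, t_max=5.0, seed=0):
    """Draw curves y = a0 + exp(a1) * mu0(b0 + exp(b1) * t) + error."""
    rng = random.Random(seed)
    # Equally spaced observation times on [0, t_max] (assumed design).
    times = [t_max * k / (n_points - 1) for k in range(n_points)]
    data = []
    for _ in range(n_curves):
        a0, a1, b0, b1 = (rng.gauss(0.0, sd[k]) for k in ("a0", "a1", "b0", "b1"))
        ys = [a0 + math.exp(a1) * mu0(b0 + math.exp(b1) * t)
              + rng.gauss(0.0, sd["eps"]) for t in times]
        data.append(((a0, a1, b0, b1), times, ys))
    return data
```

Each element of the result holds the true parameters, the observation times and the noisy responses for one curve, so estimates can later be compared with the truth.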

How Good is the Fit?

The ASE is an order of magnitude smaller than that obtained by fitting each curve individually.

[Figure: ASE*10000 against number of curves (20, 30, 50), in panels for 20 points/curve and 30 points/curve.]

How Good are the Parameter Estimates?

[Figure: correlation (0.2 to 1.0) against number of curves for a0, a1, b0 and b1, Curve 1, in Small Variance and Large Variance panels.]

Correlation Among Estimates:

[Figure: correlations r(a0,a1), r(a0,b0), r(a0,b1), r(a1,b0), r(a1,b1) and r(b0,b1), on an axis from -0.5 to 0.5, for Curve 1 with Small Variance and with Large Variance.]

Distribution of the LRT

[Figure: percentile plots in panels labelled m=n=20 and m=30, n=50, showing CLRT percentile against chi-square percentile and CLRT percentile against parametric LRT percentile.]

The top curves are the observed percentiles of the CLRT versus chi-square.

The lower curves are the observed percentiles of the CLRT versus the LRT with the correct parametric form.

Power of the CLRT (level=.05)

20 curves, 20 time points

shift     conditional   unconditional
0.5 σ     33%           36%
1.0 σ     87%           89%

50 curves, 30 time points

shift     conditional   unconditional
0.5 σ     67%           70%
1.0 σ     91%           99%

Why not fit the whole thing as one big mixed model?

SIM Model for the Nestling Growth Study

$y_{ijk}(t) = \alpha_{ij0} + \exp(\alpha_{ij1})\,\mu_0(\exp(\beta_{ij1})\,t) + \varepsilon_{ijt}$

$\theta_{ij} = \gamma_0 + \phi_i(\mathrm{HatchDate}_{ij}) + \mathrm{Treatment}_{ij}$

HatchDate and Treatment are time invariant and are applied to every bird in the nest

Note: $\alpha_{ijk}$ ↑ larger birds

$\beta_{ij1}$ ↑ faster growth

Tarsus Length versus Age

[Figure: panels labelled "Some individual growth curves", "Aggregate" and "Aligned", showing size against time, tarsus against time, transformed tarsus against transformed time, and the growth curve against transformed time. Annotations: 26 knots; random effects only.]

Parameters versus Hatch Date

[Figure: a0, a1 and b1 plotted against Hatch Date (30 to 60).]

Parameters versus Treatment

[Figure: a0, a1 (scaled by 10^16) and b1 for the Control and Calcium groups.]

cor(a0, a1) = .88

cor(a0, b1) = .60

SIM Model for the Nestling Growth Study

$y_{ijk}(t) = \alpha_{ij0} + \exp(\alpha_{ij1})\,\mu_0(\exp(\beta_{ij1})\,t) + \varepsilon_{ijt}$

$\theta_{ij} = \gamma_0 + \rho_1\,\mathrm{HatchDate}_{ij} + \rho_2\,\mathrm{HatchDate}_{ij}^2 + \mathrm{Treatment}_{ij} + \zeta_{ij}$
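
The second-level model is an ordinary linear predictor for each component of the parameter vector. The sketch below evaluates it for one component; the coefficient values are made up for illustration, and coding the treatment as a 0/1 indicator is an assumption. The name `theta_component` is hypothetical.

```python
def theta_component(hatch_date, treated, coef, zeta=0.0):
    """gamma0 + rho1 * h + rho2 * h^2 + treatment effect + random deviation."""
    effect = coef["treatment"] if treated else 0.0
    return (coef["gamma0"] + coef["rho1"] * hatch_date
            + coef["rho2"] * hatch_date ** 2 + effect + zeta)

# Made-up coefficients for a single component of theta (not estimates from the study).
coef = {"gamma0": 0.5, "rho1": -0.02, "rho2": 0.0002, "treatment": 0.05}

control = theta_component(40.0, False, coef)   # untreated nest, hatch date 40
calcium = theta_component(40.0, True, coef)    # supplemented nest, same hatch date
```

Because hatch date and treatment are nest-level covariates, every bird in a nest shares the same fixed part, and only the random deviation `zeta` differs.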

Conditional Likelihoods

model        likelihood   d.f.   p-value
null         -1843        10
hatch date   -1836        16     0.02
treatment    -1839        13     0.07
full         -1832        19     0.04

• Conclusion: Both hatch date (quadratic) and treatment have an effect on nestling growth.

• Similarly, despite the small variance component for $\alpha_1$, the fit is significantly worse without it.

Does the Model Fit?

• Does the Treatment Affect Shape?

• A simple idea: Fit a linear mixed model to the LME for shape.

Crainiceanu and Ruppert (2003) show that the LRT cannot be used to test for polynomial versus P-spline, unless the design matrices are orthogonalized.

Xu (2003) found that, when testing equality of curves, the P-spline fit of the full model under the null hypothesis is WORSE than the fit of the null model (although the models are nested) unless the design matrices are orthogonalized.

Good News: There is never a shortage of research problems.

Does the Model Fit?

∆LRT = 56

P < 0.05 (for d.f. < 40)

estimated d.f. ≈ 5

Why consider the SIM Model?

• can be used in a variety of problems:

– growth

– sera concentration (hormones, drugs)

– bio-equivalence

– materials deformation

• more flexible than polynomial or other parametric fits

• just as easy to use and interpret as parametric nonlinear mixed model

• can be used to test goodness-of-fit of parametric models (particularly easy for polynomials)

• can be used to suggest parametric shapes

• can be used to compare across curves with different shapes but similar treatment effects

Main References

• Crainiceanu et al (2005) Biometrika

• Ke and Wang (2001) JASA

• Kneip and Gasser (1998)

• Lawton, Sylvestre, and Maggio (1972) Technometrics

• Lindstrom (1995) Statistics in Medicine

• Murphy and van der Vaart (2000) JASA


And many thanks to ...

Chuck McCulloch (penalized splines)

Matt Wasson (data)

Doug Bates and Jose Pinheiro (lme)

JCSS editors, associate editors and reviewers

The awards committee