Biostatistics Case Studies

Peter D. Christenson
Biostatistician
http://gcrc.humc.edu/Biostat

Session 3: Missing Data in Longitudinal Studies

Case Study

Hall S et al: A comparative study of Carvedilol, slow release Nifedipine, and Atenolol in the management of essential hypertension.

J of Cardiovascular Pharmacology 1991;18(4)S35-38.

Case Study Outline

Subjects randomized to one of 3 drugs for controlling hypertension:

A: Carvedilol (new)
B: Nifedipine (standard)
C: Atenolol (standard)

Blood pressure and HR measured at baseline and 4 post-treatment periods.

Primary analysis is unclear, but changes over time in HR and bp are compared among the 3 groups.

Available Data: sitting dbp

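Columns: Visit # (Paper, Data); Week; Number of Subjects (A, B, C)

Screen -1
Baseline dbp1 0 100 93 95
Post 1 2 3 2 100 93 94
Post 2 3 4 4 94 91 94
Post 3 4 5 6 87 88 93
Post 4 5 6 8 83 84 91

From Baseline on, the last four entries in each row are the week and the numbers of subjects in groups A, B, and C.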

Sitting dbp from Figure 2

[Figure 2: Carvedilol. Sitting DBP (mean +/- SE; axis 88 to 106) by week (axis 0 to 10). N=100, 100, 94, 87, 83 at weeks 0, 2, 4, 6, 8.]

Group A: Baseline and Final dbp

             Week 0                  Last Value: Pre Week 8   Week 8                Final                  Δ
Graph        103.04 ± 0.52 (N=100)   -                        90.43 ± 0.96 (N=83)   90.43 ± 0.96 (N=83)    12.61 ± ?
Completers   102.99 ± 0.53 (N=83)    -                        90.43 ± 0.96 (N=83)   90.43 ± 0.96 (N=83)    12.55 ± 1.10
LOCF         103.04 ± 0.52 (N=100)   97.47 ± 3.47 (N=17)      90.43 ± 0.96 (N=83)   91.63 ± 1.02 (N=100)   11.41 ± 1.11

LOCF = Last Observation Carried Forward.
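The LOCF "Final" mean can be checked by hand: it is the N-weighted average of the last pre-week-8 value in the 17 dropouts and the week 8 value in the 83 completers. A minimal Python sketch using only the means from the Group A table (the variable names are illustrative, not from the slides):

```python
# Means and group sizes taken from the Group A table (LOCF row).
n_dropout, mean_dropout_last = 17, 97.47      # last value before week 8
n_complete, mean_complete_wk8 = 83, 90.43     # observed week 8 value
mean_baseline = 103.04                        # week 0, N=100

# LOCF "Final" mean: N-weighted average of the two subgroups.
n_total = n_dropout + n_complete
locf_final = (n_dropout * mean_dropout_last + n_complete * mean_complete_wk8) / n_total

# Change from baseline under LOCF versus using observed week 8 data only.
delta_locf = mean_baseline - locf_final
delta_observed = mean_baseline - mean_complete_wk8

print(f"LOCF final mean: {locf_final:.2f}")
print(f"LOCF change: {delta_locf:.2f}, observed-data change: {delta_observed:.2f}")
```

The carried-forward values of the dropouts (mean 97.47) pull the final mean up from 90.43, which is why the LOCF change is smaller than the change based on observed week 8 data.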

Wanted: Use N=100 w/o LOCF

Combine:
Info on true 8 week change in 83 subjects.
Info on baseline only in 17 subjects.

Use week0-week8 correlation in 83 subjects.

More generally:
Suppose 9 subjects had only week 0 and 8 subjects had only week 8.
Then, really 2 experiments, 1 paired (N=83) and 1 unpaired (N1=9 and N2=8).
Combining involves weighting Δs from the 2 experiments. Does not impute (substitute) values for the 17 unknown values.

Generalize further to >2 time periods and >1 treatment, etc.
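One simple way to picture "weighting Δs from the 2 experiments" is inverse-variance weighting of two independent estimates. The sketch below is only an illustration of that idea: the function name and the numbers are hypothetical, and a mixed model arrives at its weights through the fitted covariance structure rather than through this formula.

```python
def combine_inverse_variance(estimates):
    """Pool independent (estimate, standard_error) pairs with weights 1/SE^2."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / total_weight
    pooled_se = (1.0 / total_weight) ** 0.5
    return pooled, pooled_se

# Hypothetical numbers for illustration only (NOT from the study):
paired_delta = (12.0, 1.0)     # precise estimate from the paired subjects
unpaired_delta = (10.0, 4.0)   # noisy estimate from the unpaired subjects

pooled, pooled_se = combine_inverse_variance([paired_delta, unpaired_delta])
print(f"pooled change: {pooled:.2f} +/- {pooled_se:.2f}")
```

The pooled estimate stays close to the precise paired estimate, and its standard error is smaller than either input's, which is the gain from using all subjects without imputing anything.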

Mixed Models

Mixed models implement our need here.

“Mixed” means combination of fixed effects (e.g., drugs; want info on those particular drugs) and random effects (e.g., centers or patients; not interested in the particular ones in the study).

AKA multilevel models, hierarchical models.

Very flexible, incorporate unequal patient variability, correlation, pairing, repeated values at multiple levels (e.g., sitting and standing dbp in Fig 2, or if subjects were clustered, say from the same family and genetics was an issue, etc), and data missing at random.

More assumptions required than typical analyses.

Data Structure for Software

Need:

patient week dbp
1 0 97
1 2 101
1 4 88
1 6 89
1 8 86
2 0 109
2 2 72
etc.

Not:

patient wk0 wk2 wk4 wk6 wk8
1 97 101 88 89 86
2 109 72 . . .
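Converting from the one-row-per-patient layout to the one-row-per-measurement layout is mechanical. A minimal Python sketch, treating the three dots in patient 2's row as missing values (an assumption) and using hypothetical names throughout:

```python
MISSING = None  # stands in for the "." missing-value marker

# Wide layout: one row per patient, one column per week (values from the slide).
wide = [
    {"patient": 1, "wk0": 97, "wk2": 101, "wk4": 88, "wk6": 89, "wk8": 86},
    {"patient": 2, "wk0": 109, "wk2": 72, "wk4": MISSING, "wk6": MISSING, "wk8": MISSING},
]

def wide_to_long(rows, weeks=(0, 2, 4, 6, 8)):
    """Return one (patient, week, dbp) record per observed measurement."""
    long_rows = []
    for row in rows:
        for week in weeks:
            value = row.get(f"wk{week}", MISSING)
            if value is not MISSING:          # drop missing visits, keep the rest
                long_rows.append((row["patient"], week, value))
    return long_rows

long_data = wide_to_long(wide)
for record in long_data:
    print(*record)
```

In the long layout a patient with missing visits simply contributes fewer rows, so no placeholder values are needed.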

Software

Need to use a mixed model module. Often, options are unclear. Use:

SPSS Analyze > Mixed

SAS proc mixed.

Repeated measures modules with options for random factors do not typically handle missing data, e.g.:

SPSS Analyze > GLM > Repeated > … Random

SAS proc glm; model …; random …;

are not in general OK, but will work with certain balanced patterns of missing data.

Mixed Models in SPSS

Select Analyze > Mixed > Linear. (The slide shows a screenshot of the first menu.)

Mixed Models in SAS

Select Solutions > Analysis > Analyst > Statistics > ANOVA > Mixed models

Alternatively, typical code is:

    proc mixed;
      class week patient;
      model dbp = week / ddfm=satterthwaite;
      lsmeans week / cl;                  /* week means with CIs */
      estimate 'Week Diff' week 1 -1;     /* week 0 minus week 8 */
      repeated week / subject=patient type=un rcorr;
      title 'Mixed Model N=100+83 Unstructured';
    run;

Model 1 Results

Estimated Change:

Label       Estimate   Standard Error   DF     t Value   Pr > |t|
Week Diff   12.6058    1.0441           95.6   12.07     <.0001

So, Δ = 12.61 ± 1.04 incorporates 100 + 83 observations.

Estimated Means:

Effect   week   Estimate   Standard Error
week     0      103.04     0.7059
week     8      90.43      0.7749

Group A: Baseline and Final dbp Update

             Week 0                  Last Value: Pre Week 8   Week 8                Final                  Δ
Graph        103.04 ± 0.52 (N=100)   -                        90.43 ± 0.96 (N=83)   90.43 ± 0.96 (N=83)    12.61 ± 1.04
Completers   102.99 ± 0.53 (N=83)    -                        90.43 ± 0.96 (N=83)   90.43 ± 0.96 (N=83)    12.55 ± 1.10
LOCF         103.04 ± 0.52 (N=100)   97.47 ± 3.47 (N=17)      90.43 ± 0.96 (N=83)   91.63 ± 1.02 (N=100)   11.41 ± 1.11

LOCF = Last Observation Carried Forward.

Is model appropriate? Depends on assumed covariance pattern.

Model 1 Covariance Pattern: Compound Symmetry

Software Output

Estimated R Correlation Matrix for patient 4

Row   Col1       Col2
1     1.0000     0.008760
2     0.008760   1.0000

Covariance Parameter Estimates

Cov Parm   Subject   Estimate
CS         patient   0.4366
Residual             49.3989

Output Interpretation

Estimated Covariance Pattern:

Week   0         8
0      (7.06)²   0.44
8      0.44      (7.06)²

(7.06)² = 49.3989 + 0.4366

Note that this model assumes that variability among subjects is the same at each week, and that there is a correlation between the weeks (estimated at 0.00876).

But: Week 0 SD = 5.2, Week 8 SD = 8.8
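Written out, the compound symmetry pattern for two weeks has a single variance and a single covariance. In the notation below (the symbols are assumptions, not from the slides), \sigma_b^2 is the CS estimate and \sigma_e^2 is the Residual estimate:

```latex
R_{\text{CS}} =
\begin{pmatrix}
\sigma_{b}^{2} + \sigma_{e}^{2} & \sigma_{b}^{2} \\
\sigma_{b}^{2} & \sigma_{b}^{2} + \sigma_{e}^{2}
\end{pmatrix},
\qquad
\rho = \frac{\sigma_{b}^{2}}{\sigma_{b}^{2} + \sigma_{e}^{2}}
     = \frac{0.4366}{0.4366 + 49.3989}
     = \frac{0.4366}{49.8355} \approx 0.00876 .
```

The common SD is \sqrt{49.8355} \approx 7.06, which falls between the two observed SDs and matches neither of them.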

Model 2 Covariance Pattern: Unstructured

Software Output

Estimated R Correlation Matrix for patient 4

Row   Col1      Col2
1     1.0000    0.01129
2     0.01129   1.0000

Covariance Parameter Estimates

Cov Parm   Subject   Estimate
UN(1,1)    patient   27.1700
UN(2,1)    patient   0.5169
UN(2,2)    patient   77.2008

Output Interpretation

Estimated Covariance Pattern:

Week   0         8
0      (5.21)²   0.52
8      0.52      (8.79)²

(5.21)² = 27.17 and (8.79)² = 77.20

This model allows different variability among subjects at each week, and a correlation between the weeks (estimated at 0.011).

This better matches the observed SDs: Week 0 SD = 5.2, Week 8 SD = 8.8
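The SDs and the correlation quoted in the interpretation follow directly from the three unstructured covariance estimates. A short Python check (variable names are illustrative):

```python
import math

# Unstructured covariance parameter estimates from the SAS output.
var_week0 = 27.1700    # UN(1,1)
cov_0_8 = 0.5169       # UN(2,1)
var_week8 = 77.2008    # UN(2,2)

# SDs are the square roots of the diagonal entries.
sd_week0 = math.sqrt(var_week0)
sd_week8 = math.sqrt(var_week8)

# Correlation is the covariance scaled by the two SDs.
correlation = cov_0_8 / (sd_week0 * sd_week8)

print(f"Week 0 SD: {sd_week0:.2f}")
print(f"Week 8 SD: {sd_week8:.2f}")
print(f"Week 0-8 correlation: {correlation:.5f}")
```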

Model 3 Covariance: Heterogeneous Uncorrelated

Software Output

Estimated R Correlation Matrix for patient 4

Row   Col1     Col2
1     1.0000
2              1.0000

Covariance Parameter Estimates

Cov Parm   Subject   Estimate
UN(1,1)    patient   27.1701
UN(2,1)    patient   0
UN(2,2)    patient   77.1998

Output Interpretation

Estimated Covariance Pattern:

Week   0         8
0      (5.21)²   0
8      0         (8.79)²

(5.21)² = 27.17

This model allows different variability among subjects at each week, but no correlation between the two weeks.

Matches: Week 0 SD = 5.2, Week 8 SD = 8.8

Choice of Covariance Pattern

Model               Covariance Pattern   -2 Log Likelihood
1: Comp Sym         Corr & = SDs         1230.2
2: Unstructured     Corr & ≠ SDs         1206.0
3: Heterog Uncorr   0 Corr & ≠ SDs       1206.0

Use likelihood ratio test to test whether a more complex model significantly improves fit of the data. Models must be "nested".

Is model 2 significantly better than model 1?

χ² = 1230.2 - 1206.0 = 24.2 has a χ² distribution with d.f. = difference in # of estimated parameters (here 3 - 2 = 1) if model 2 is not an improvement. P-value = Prob(χ² > 24.2) < 0.0001, so model 2 is needed.

Is model 2 significantly better than model 3? No: both have -2 log likelihood 1206.0, so dropping the correlation costs nothing in fit. Final choice: model 3.
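The two likelihood ratio comparisons can be reproduced with the standard library alone, because the upper tail of a χ² with 1 d.f. has a closed form in terms of the complementary error function. A minimal sketch (the function and dictionary names are illustrative; the -2 log likelihoods are the rounded values from the table):

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 d.f."""
    return math.erfc(math.sqrt(x / 2.0))

# -2 log likelihoods from the table (rounded to one decimal on the slide).
neg2ll = {"comp_sym": 1230.2, "unstructured": 1206.0, "heterog_uncorr": 1206.0}

# Model 2 (3 parameters) versus model 1 (2 parameters): 1 d.f.
lr_2_vs_1 = neg2ll["comp_sym"] - neg2ll["unstructured"]
p_2_vs_1 = chi2_sf_1df(lr_2_vs_1)

# Model 2 (3 parameters) versus model 3 (2 parameters): 1 d.f.
lr_2_vs_3 = neg2ll["heterog_uncorr"] - neg2ll["unstructured"]
p_2_vs_3 = chi2_sf_1df(lr_2_vs_3)

print(f"Model 2 vs 1: LR = {lr_2_vs_1:.1f}, p = {p_2_vs_1:.2g}")
print(f"Model 2 vs 3: LR = {lr_2_vs_3:.1f}, p = {p_2_vs_3:.2g}")
```

Unequal SDs are clearly needed (model 2 beats model 1), while the week 0 to week 8 correlation adds nothing measurable (model 2 does not beat model 3), which supports the simpler model 3.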

Model 3 Results

Estimated Change:

Label       Estimate   Standard Error   DF    t Value   Pr > |t|
Week Diff   12.6063    1.0963           128   11.50     <.0001

Thus, use Δ = 12.61 ± 1.10 from 100 + 83 observations.

Estimated Means:

Effect   week   Estimate   Standard Error   DF
week     0      103.04     0.5212           99
week     8      90.43      0.9644           82

Conclusions for Group A Week 0 to Week 8 dbp Δ

Last observation carried forward overestimates dbp at week 8.

Essentially 0 correlation between residual week 0 and week 8 dbp.

Use mixed model with heterogeneous uncorrelated covariance pattern.

This mixed model is equivalent to a 2-sample t-test with unequal variances using Satterthwaite's approximation for the degrees of freedom. The equivalence would not hold if either (1) some subjects had dbp only at week 8, or (2) the correlation between weeks 0 and 8 were stronger, as it usually is.
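The equivalence can be verified from the Model 3 output: combining the two week-specific standard errors as an unequal-variance t-test would, and applying the Satterthwaite formula, gives the reported standard error, d.f., and t value. A minimal Python sketch (variable names are illustrative; inputs are the rounded values from the output, so agreement is to rounding):

```python
import math

# Standard errors of the week means and their d.f. from the Model 3 output.
se_week0, df_week0 = 0.5212, 99
se_week8, df_week8 = 0.9644, 82
mean_week0, mean_week8 = 103.04, 90.43

# Unequal-variance standard error of the difference in means.
se_diff = math.sqrt(se_week0 ** 2 + se_week8 ** 2)

# Satterthwaite approximation to the degrees of freedom.
df_satterthwaite = (se_week0 ** 2 + se_week8 ** 2) ** 2 / (
    se_week0 ** 4 / df_week0 + se_week8 ** 4 / df_week8
)

t_value = (mean_week0 - mean_week8) / se_diff

print(f"SE of difference: {se_diff:.4f}")
print(f"Satterthwaite d.f.: {df_satterthwaite:.1f}")
print(f"t value: {t_value:.2f}")
```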

Generalize: Group A with all 5 Time Periods

Covariance Pattern           Parameters   -2 Log Likelihood
Compound Symmetry            2            3193.7
Heterogeneous Uncorrelated   5            3245.4
Toeplitz                     5            3172.0
Heterogeneous Toeplitz       9            3141.4
Unstructured                 15           3111.7

Since LR = 3141.4 - 3111.7 = 29.7 is large for a χ² with 6 d.f. (15 - 9 parameters), there is substantial unstructured correlation over weeks.
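The p-value for the heterogeneous Toeplitz versus unstructured comparison can be computed without a statistics package, since the χ² upper tail has a finite-sum closed form when the d.f. is even. In this sketch the statistic and d.f. are derived from the tabulated entries rather than typed in; the function name is illustrative:

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form holds for even d.f. only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even d.f.")
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

# (number of covariance parameters, -2 log likelihood) from the table.
het_toeplitz = (9, 3141.4)
unstructured = (15, 3111.7)

lr = het_toeplitz[1] - unstructured[1]       # likelihood ratio statistic
df = unstructured[0] - het_toeplitz[0]       # difference in parameter counts
p_value = chi2_sf_even_df(lr, df)

print(f"LR = {lr:.1f} on {df} d.f., p = {p_value:.2g}")
```

With the tabulated values the p-value is well below 0.001, so the simpler heterogeneous Toeplitz pattern is not adequate for the five-period data.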

Conclusions: Repeated Measures with Mixed Models

Very useful for missing data.

Requires more than usual assumptions.

Mild deviations from assumed covariance pattern do not have a large influence.

Software can be intimidating due to specifying many model assumptions, since the method is so general and flexible.

May be difficult to apply unbiasedly in clinical trials where the primary analysis needs to be specifically detailed.