Statistics 262: Intermediate Biostatistics. May 18, 2004: Cox Regression III: residuals and diagnostics, repeated events. Jonathan Taylor and Kristin Cobb.
Statistics 262: Intermediate Biostatistics
Jonathan Taylor and Kristin Cobb
May 18, 2004: Cox Regression III: residuals and diagnostics, repeated events
Residuals
• Residuals are used to investigate the lack of fit of a model to a given subject.
• For Cox regression, there’s no easy analog to the usual “observed minus predicted” residual of linear regression.
Deviance Residuals
• Deviance residuals are based on martingale residuals: $c_i$ (1 if event, 0 if censored) minus the estimated cumulative hazard up to $t_i$ (as a function of the fitted model) for individual $i$: $c_i - \hat{H}(t_i, X_i, \hat{\beta})$
• See Hosmer and Lemeshow for more discussion…
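As a rough illustration of the step from martingale to deviance residuals, the sketch below applies the usual transformation $d_i = \mathrm{sign}(M_i)\sqrt{-2[M_i + c_i \log(c_i - M_i)]}$, where $M_i$ is the martingale residual. The formula is the standard one from the literature rather than something stated on the slide, and the function names and the cumulative hazard values are hypothetical; in practice the cumulative hazards would come from a fitted Cox model.

```python
import numpy as np

def martingale_residuals(event, cum_hazard):
    """c_i minus the estimated cumulative hazard at t_i."""
    event = np.asarray(event, dtype=float)
    cum_hazard = np.asarray(cum_hazard, dtype=float)
    return event - cum_hazard

def deviance_residuals(event, cum_hazard):
    """Deviance residuals from event indicators and cumulative hazards.

    Assumes cum_hazard > 0 for every subject who had the event.
    """
    event = np.asarray(event, dtype=float)
    cum_hazard = np.asarray(cum_hazard, dtype=float)
    m = event - cum_hazard
    # c_i * log(c_i - M_i) equals log(H_i) for events and 0 for censored subjects.
    safe_h = np.where(event == 1, cum_hazard, 1.0)
    log_term = np.where(event == 1, np.log(safe_h), 0.0)
    # Clamp at 0 to guard against tiny negative values from rounding.
    inner = np.maximum(-2.0 * (m + log_term), 0.0)
    return np.sign(m) * np.sqrt(inner)

# Hypothetical values: one event and one censored subject, both with H = 0.5.
example_event = [1, 0]
example_cum_hazard = [0.5, 0.5]
example_dev = deviance_residuals(example_event, example_cum_hazard)
```

For a censored subject the residual reduces to $-\sqrt{2\hat{H}}$, so it is always negative, while subjects who fail early relative to their estimated hazard get positive values.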
Deviance Residuals
• Behave like residuals from ordinary linear regression.
• Should be symmetrically distributed around 0 and have a standard deviation of 1.0.
• Negative for observations with longer-than-expected observed survival times.
• Plot deviance residuals against covariates to look for unusual patterns.
Deviance Residuals
• In SAS, an option on the output statement: output out=outdata resdev=
Schoenfeld residuals
• Schoenfeld (1982) proposed the first set of residuals for use with Cox regression packages.
  Schoenfeld D. Residuals for the proportional hazards regression model. Biometrika, 1982, 69(1):239-241.
• Instead of a single residual for each individual, there is a separate residual for each individual for each covariate.
• Based on the individual contributions to the derivative of the log partial likelihood (see chapter 6 in Hosmer and Lemeshow for more math details, p. 198-199).
• Note: Schoenfeld residuals are not defined for censored individuals.
Schoenfeld residuals
• Where $k$ is the covariate of interest, the Schoenfeld residual is the covariate value, $x_{ik}$, for the person ($i$) who actually died at time $t_i$ minus the expected value of the covariate for the risk set at $t_i$ (a weighted average of the covariate, weighted by each individual’s likelihood of dying at $t_i$):

$$\text{residual}_{ik} = x_{ik} - \sum_{j \in R(t_i)} x_{jk}\, p_j$$

• Plot Schoenfeld residuals against time to evaluate the PH assumption.
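The definition above can be sketched in a few lines of NumPy. This is a minimal illustration, not the SAS implementation: it assumes no tied event times, takes the coefficient vector as given instead of fitting it, and uses the Cox-model weights $p_j = \exp(x_j\beta) / \sum_{l \in R(t_i)} \exp(x_l\beta)$. The function name and the toy data are hypothetical.

```python
import numpy as np

def schoenfeld_residuals(time, event, x, beta):
    """Schoenfeld residuals for each subject who had the event.

    Returns a dict mapping subject index -> residual vector (one entry
    per covariate). Censored subjects get no entry.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]          # treat a 1-D input as a single covariate
    beta = np.atleast_1d(np.asarray(beta, dtype=float))

    risk_score = np.exp(x @ beta)
    residuals = {}
    for i in np.flatnonzero(event == 1):
        at_risk = time >= time[i]                 # risk set R(t_i)
        p = risk_score[at_risk] / risk_score[at_risk].sum()
        expected = p @ x[at_risk]                 # weighted average over the risk set
        residuals[int(i)] = x[i] - expected
    return residuals

# Hypothetical toy data: subject 1 is censored; beta = 0 gives equal weights.
toy_residuals = schoenfeld_residuals(
    time=[1.0, 2.0, 3.0], event=[1, 0, 1], x=[1.0, 0.0, 1.0], beta=[0.0]
)
```

With equal weights, the first death (covariate value 1) is compared with the risk-set mean of 2/3, giving a residual of 1/3; the last death is alone in its risk set, so its residual is 0. Plotting these values against the event times is the check for the PH assumption mentioned on the slide.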
Schoenfeld residuals
• In SAS, an option on the output statement: ressch=
Influence diagnostics
• How would the result change if a particular observation were removed from the analysis?
Influence statistics
• Likelihood displacement (ld): measures the influence of removing one individual on the model as a whole. What’s the change in the likelihood when this individual is omitted?
• DFBETA: how much each coefficient will change with the removal of a single observation.
• A negative DFBETA indicates that the coefficient increases when the observation is removed.
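To make the DFBETA idea concrete, here is a brute-force sketch that refits a one-covariate Cox model with each observation left out and reports $\hat{\beta} - \hat{\beta}_{(-i)}$. It is an illustration only: statistical packages typically report an approximation rather than an exact refit, and the fitting routine, function names, and data below are all hypothetical. The sketch assumes no tied event times and that every fit has a finite estimate.

```python
import numpy as np

def score_and_information(beta, time, event, x):
    """First derivative of the log partial likelihood and its negative second derivative."""
    score, info = 0.0, 0.0
    for i in np.flatnonzero(event == 1):
        at_risk = time >= time[i]
        w = np.exp(beta * x[at_risk])
        p = w / w.sum()
        mean = np.sum(p * x[at_risk])
        var = np.sum(p * x[at_risk] ** 2) - mean ** 2
        score += x[i] - mean
        info += var
    return score, info

def fit_cox_1d(time, event, x, n_iter=100, tol=1e-10):
    """Newton-Raphson for a single coefficient, with steps clipped to [-1, 1]."""
    beta = 0.0
    for _ in range(n_iter):
        score, info = score_and_information(beta, time, event, x)
        step = float(np.clip(score / info, -1.0, 1.0))
        beta += step
        if abs(step) < tol:
            break
    return beta

def dfbeta_exact(time, event, x):
    """Full-data estimate minus the estimate with observation i removed."""
    time, event, x = (np.asarray(a, dtype=float) for a in (time, event, x))
    full = fit_cox_1d(time, event, x)
    keep = ~np.eye(len(time), dtype=bool)   # row i drops observation i
    loo = np.array([fit_cox_1d(time[k], event[k], x[k]) for k in keep])
    return full - loo

# Hypothetical data: eight subjects, the last one censored.
time = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
event = np.array([1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0])
x = np.array([2.0, 0.0, 1.0, 3.0, 0.0, 2.0, 1.0, 3.0])
dfbeta = dfbeta_exact(time, event, x)
```

With this sign convention, a negative entry means the coefficient is larger once that observation is dropped, which matches the interpretation given on the slide.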
Influence statistics
• In SAS, options on the output statement: ld= dfbeta=
What about repeated events?
• Death (presumably) can only happen once, but many outcomes could happen twice…
• Fractures
• Heart attacks
• Pregnancy
• Etc…
Repeated events: Strategy 1
• Run a second Cox regression (among those who had a first event), starting with the first event time as the origin.
• Repeat for third, fourth, fifth events, etc.
• Problem: increasingly smaller sample sizes.
Repeated events: Strategy 2
• Treat each interval as a distinct observation, such that someone who had 3 events, for example, contributes 3 observations to the dataset.
• Major problem: dependence between observations from the same individual.
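The data layout for this strategy can be sketched as follows: each subject's event times are split into one row per interval. The function name, field names, and the choice to add a final censored row when follow-up continues past the last event are assumptions made for illustration, not something specified on the slide.

```python
def to_intervals(subject_id, event_times, followup_end):
    """Split one subject's history into one row per interval between events.

    Each row records the interval start, stop, gap length, and whether it
    ended in an event (1) or in censoring (0).
    """
    rows = []
    start = 0.0
    for t in sorted(event_times):
        rows.append({"id": subject_id, "start": start, "stop": t,
                     "gap": t - start, "event": 1})
        start = t
    # Assumption: follow-up beyond the last event adds one censored interval.
    if followup_end > start:
        rows.append({"id": subject_id, "start": start, "stop": followup_end,
                     "gap": followup_end - start, "event": 0})
    return rows

# Hypothetical subject with 3 events and follow-up ending at the third event.
rows_a = to_intervals("A", [2.0, 5.0, 9.0], followup_end=9.0)
# Same events, but follow-up continues to time 12.
rows_b = to_intervals("A", [2.0, 5.0, 9.0], followup_end=12.0)
```

Every row carries the subject's `id`, which is what makes the dependence problem visible: rows that share an `id` are not independent observations.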
Repeated events: Strategy 3
• Stratify by individual (“fixed effects partial likelihood”).
• In PROC PHREG: strata id;
• Problems:
• Does not work well with RCT data.
• Requires that most individuals have at least 2 events.
• Can only estimate coefficients for those covariates that vary across successive spells for each individual; this excludes constant personal characteristics such as age, education, gender, ethnicity, genotype.
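A short derivation shows why covariates that are constant within an individual cannot be estimated when each individual is a stratum. The notation is introduced here for illustration and is not from the slides: $z_s$ is a characteristic of individual $s$ that does not change across spells, with coefficient $\gamma$; $x_i$ is a covariate that varies across spells, with coefficient $\beta$; and $R_s(t_i)$ is the risk set within individual $s$'s stratum. Assuming no tied event times, the partial likelihood contribution of spell $i$ is

$$\frac{\exp(\gamma z_s + \beta x_i)}{\sum_{j \in R_s(t_i)} \exp(\gamma z_s + \beta x_j)} = \frac{\exp(\beta x_i)}{\sum_{j \in R_s(t_i)} \exp(\beta x_j)}$$

The factor $\exp(\gamma z_s)$ cancels from numerator and denominator, so the likelihood carries no information about $\gamma$. The same cancellation would presumably apply to a treatment assigned once per individual, which may be why the approach fits RCT data poorly.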