Assessing the Total Effect of Time-Varying Predictors in
Prevention Research
Bethany Bray
April 7, 2003
University of Michigan, Dearborn
2
OUTLINE
I. Introduction to the problem
II. Standard model
III. Problems with the standard model
IV. Suggested solution
V. Data example
VI. Future directions
3
GOAL
Assess the total effect that delaying the timing of a predictor has on the timing of a response.
4
OUR QUESTION
“Does delaying conduct disorder initiation lead to a delay in marijuana initiation?”
OBJECTIVE
To estimate the total effect of conduct disorder initiation on marijuana initiation.
5
CONFOUNDERS
Common correlates of the predictor and the response.
Alternate explanations for the observed relationship between the predictor and the response.
Must be controlled for when estimating the total effect.
6
COMPOSITIONAL DIFFERENCES
The unequal distribution of confounder levels between children who initiate the predictor behavior and those who do not.
7
WHY WORRY?
The coefficient of the predictor is a biased estimate of the total effect.
8
COEFFICIENT ESTIMATES
The estimated coefficient reflects the compositional differences between the predictor groups in addition to the causal effect.
9
WHY ONLY IN OBSERVATIONAL STUDIES?
Compositional differences are minimized by randomization.
Observational studies require statistical methods and scientific assumptions to adjust for compositional
differences.
10
WHAT DO WE NORMALLY DO?
The standard model.
11
THE STANDARD MODEL
Includes confounders as covariates in the response regression model.
12
Figure 1. Illustration of a spurious correlation between predictors and response in the sprinkler example
[Path diagram; paths are labeled a, b, and c]
Legend: Predict = Predictor; Conf = Confounder; Resp = Response; U = Unmeasured Predictor
Conf1 = Front Yard Grass (time 1); Predict1 = Front Yard Sprinkler (time 1); Resp2 = Back Yard Grass (time 2)
Conf2 = Front Yard Grass (time 2); Predict2 = Front Yard Sprinkler (time 2); Resp3 = Back Yard Grass (time 3)
U1 = Raining (time 1); U2 = Raining (time 2)
Sprinkler example follows examples often used by Pearl.
13
PROBLEM
The confounder is affected by the predictor.
If the confounder is included as a covariate, a spurious correlation is created.
14
SPRINKLER EXAMPLE
Consider a simple example involving sprinklers.
15
Figure 1. Illustration of a spurious correlation between predictors and response in the sprinkler example
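The spurious correlation in the sprinkler example can be checked with a small simulation. The sketch below is a simplified, single-time-point version of the diagram: the probabilities (rain 0.3, sprinkler 0.5), the deterministic wetness rules, and all function names are assumptions made for illustration and are not taken from the original example.

```python
import random


def simulate(n=20000, seed=0):
    """Simulate a simplified sprinkler example (all probabilities are assumed)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        rain = rng.random() < 0.3        # unmeasured predictor (U)
        sprinkler = rng.random() < 0.5   # predictor, independent of rain
        front_wet = rain or sprinkler    # confounder, affected by the predictor
        back_wet = rain                  # response: the sprinkler has NO effect
        rows.append((sprinkler, front_wet, back_wet))
    return rows


def risk_difference(rows):
    """P(back yard wet | sprinkler on) - P(back yard wet | sprinkler off)."""
    on = [back for spr, _, back in rows if spr]
    off = [back for spr, _, back in rows if not spr]
    return sum(on) / len(on) - sum(off) / len(off)


rows = simulate()
marginal = risk_difference(rows)
# "Controlling" for the confounder: keep only days when the front grass is wet.
conditional = risk_difference([r for r in rows if r[1]])

print(f"unadjusted difference:           {marginal:+.3f}")
print(f"difference given wet front yard: {conditional:+.3f}")
```

Under these assumptions the unadjusted difference is close to zero, while the difference within the wet-front-yard stratum is strongly negative, even though the sprinkler never affects the back yard.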
16
Figure 2. Some relationships among conduct disorder, peer pressure resistance, and marijuana
[Path diagram; paths are labeled a, b, and c]
Legend: Cd = Predictor; Ppress = Confounder; Mj = Response; U = Unmeasured Predictor
Ppress1 = Peer Pressure Resistance (time 1); Cd1 = Conduct Disorder Initiation (time 1); Mj2 = Marijuana Initiation (time 2)
Ppress2 = Peer Pressure Resistance (time 2); Cd2 = Conduct Disorder Initiation (time 2); Mj3 = Marijuana Initiation (time 3)
U1 = Parent-Child Relationship Quality (time 1); U2 = Parent-Child Relationship Quality (time 2)
17
RESULT
Spurious correlations are dangerous.
18
DANGER OF SPURIOUS CORRELATIONS
The degree of bias is related to the strength of the correlations.
In simulations, false conclusions were reached in up to 80% of the data sets.
19
WHAT DO WE DO NOW?
Use sample weights to statistically control for time-varying confounders*.
WEIGHTING?
Weighting attempts to make people with different predictor levels comparable in all other respects.
*Hernán, Brumback, and Robins, 2000
20
HOW DOES IT WORK?
Equalizes the compositional differences of the confounder among the predictor levels.
21
Original frequencies: conduct disorder initiation by peer pressure resistance
                                 Conduct Disorder Initiation Status
                                 Non-Initiator   Initiator   Total
High Peer Pressure Resistance    40              10          50
Low Peer Pressure Resistance     30              30          60
Total                            70              40          110

Ideal frequencies: conduct disorder initiation by peer pressure resistance
                                 Conduct Disorder Initiation Status
                                 Non-Initiator   Initiator   Total
High Peer Pressure Resistance    25              25          50
Low Peer Pressure Resistance     30              30          60
Total                            55              55          110

Weighted frequencies: conduct disorder initiation by peer pressure resistance
                                 Conduct Disorder Initiation Status
                                 Non-Initiator   Initiator   Total
High Peer Pressure Resistance    50              50          100
Low Peer Pressure Resistance     60              60          120
Total                            110             110         220
22
HOW DO WE GET THE WEIGHTS?
Inverse of the conditional probability of predictor status given confounder status.
10 initiators w/ high peer pressure resistance:
Weight of (10/50)^{-1} = 5
40 non-initiators w/ high peer pressure resistance:
Weight of (40/50)^{-1} = 5/4
60 children w/ low peer pressure resistance:
Weight of (30/60)^{-1} = 2
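The weight arithmetic can be reproduced directly from the original frequency table. The following is a minimal sketch; the dictionary layout and the function name `inverse_probability_weights` are illustrative choices, not part of the original analysis.

```python
from fractions import Fraction

# Original frequencies: counts[peer pressure resistance][initiation status]
counts = {
    "high": {"non_initiator": 40, "initiator": 10},
    "low": {"non_initiator": 30, "initiator": 30},
}


def inverse_probability_weights(counts):
    """Weight = 1 / P(predictor status | confounder level)."""
    weights = {}
    for level, row in counts.items():
        total = sum(row.values())
        # Inverse of n / total is total / n.
        weights[level] = {status: Fraction(total, n) for status, n in row.items()}
    return weights


weights = inverse_probability_weights(counts)
weighted = {
    level: {status: counts[level][status] * w for status, w in row.items()}
    for level, row in weights.items()
}

for level in counts:
    for status in counts[level]:
        print(level, status,
              "weight =", weights[level][status],
              "weighted count =", weighted[level][status])
```

Within each resistance level the weighted counts of initiators and non-initiators are equal (50 and 50, 60 and 60), which matches the weighted frequency table.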
23
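IN PRACTICE
Eliminate the inflation of the total sample size by placing the marginal probability of predictor status in the numerator.
EQUATION 1
$$W = \frac{P[\mathrm{Cd}_i]}{P[\mathrm{Cd}_i \mid \mathrm{Conf}_i]}$$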
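As a check on the sample-size claim, the sketch below applies a weight of the form P[Cd] / P[Cd | Conf] to the original frequency table (40, 10, 30, 30). The function name `stabilized_weights` and the data layout are assumptions for illustration only.

```python
from fractions import Fraction

# Original frequencies: counts[peer pressure resistance][initiation status]
counts = {
    "high": {"non_initiator": 40, "initiator": 10},
    "low": {"non_initiator": 30, "initiator": 30},
}


def stabilized_weights(counts):
    """W = P[Cd] / P[Cd | Conf], computed from a table of counts."""
    n_total = sum(sum(row.values()) for row in counts.values())
    statuses = list(next(iter(counts.values())))
    # Marginal probability of each predictor status, P[Cd].
    marginal = {
        s: Fraction(sum(row[s] for row in counts.values()), n_total)
        for s in statuses
    }
    weights = {}
    for level, row in counts.items():
        level_total = sum(row.values())
        # Conditional probability P[Cd | Conf] is n / level_total.
        weights[level] = {
            s: marginal[s] / Fraction(n, level_total) for s, n in row.items()
        }
    return weights


weights = stabilized_weights(counts)
weighted_total = sum(
    counts[level][s] * weights[level][s] for level in counts for s in counts[level]
)
print("weights:", weights)
print("weighted sample size:", weighted_total)
```

With these weights the weighted sample size stays at 110 rather than doubling to 220, and the weighted share of initiators is the same (40/110) in both resistance levels.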
24
WHY DOES THIS WORK?
•Eliminates the problematic spurious correlation.
•Controls for confounders by equalizing compositional differences.
25
Figure 3. Elimination of relationship among conduct disorder and peer pressure resistance by using sample weights
[Path diagram; paths are labeled a, b, and c]
Legend: Cd = Predictor; Ppress = Confounder; Mj = Response; U = Unmeasured Predictor
Ppress1 = Peer Pressure Resistance (time 1); Cd1 = Conduct Disorder Initiation (time 1); Mj2 = Marijuana Initiation (time 2)
Ppress2 = Peer Pressure Resistance (time 2); Cd2 = Conduct Disorder Initiation (time 2); Mj3 = Marijuana Initiation (time 3)
U1 = Parent-Child Relationship Quality (time 1); U2 = Parent-Child Relationship Quality (time 2)
26
Figure 4. Some relationships in a weighted sample when peer pressure resistance is omitted
[Path diagram]
Legend: Cd = Predictor; Mj = Response; U = Unmeasured Predictor
Cd1 = Conduct Disorder Initiation (time 1); Mj2 = Marijuana Initiation (time 2)
Cd2 = Conduct Disorder Initiation (time 2); Mj3 = Marijuana Initiation (time 3)
U1 = Parent-Child Relationship Quality (time 1); U2 = Parent-Child Relationship Quality (time 2)
27
HOW DO WE DO IT?
1. Ratio of two predicted probabilities
a. Denominator: predicted probability of observed conduct disorder initiation given confounders and baseline variables.
b. Numerator: predicted probability of observed conduct disorder initiation given baseline variables.
2. Weight at time t, W_t: product of these ratios up to time t.
28
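EQUATION 2*
$$W_t = \prod_{i=1}^{t} \frac{P[\mathrm{Cd}_i \mid \overline{\mathrm{Cd}}_{i-1}, \mathrm{Sex}, \mathrm{Race}, \overline{\mathrm{Mj}}_{i-1}]}{P[\mathrm{Cd}_i \mid \overline{\mathrm{Cd}}_{i-1}, \mathrm{Conf}_i, \mathrm{Sex}, \mathrm{Race}, \overline{\mathrm{Mj}}_{i-1}]}$$
*The “over-bars” above Cd_{i-1} and Mj_{i-1} signal that the probability is conditional on the complete past predictor and response patterns.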
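The product of ratios can be computed as a running product over time points. The sketch below assumes the numerator and denominator probabilities have already been predicted by two fitted regression models; the function name `cumulative_weights` and the example probabilities are hypothetical.

```python
def cumulative_weights(numerator_probs, denominator_probs):
    """Return [W_1, ..., W_T], where W_t is the running product of ratios.

    numerator_probs[i]   : predicted probability of the observed conduct disorder
                           status at time i, given past status and baseline variables
    denominator_probs[i] : the same probability, additionally given the confounders
    """
    if len(numerator_probs) != len(denominator_probs):
        raise ValueError("need one numerator and one denominator per time point")
    weights = []
    running = 1.0
    for num, den in zip(numerator_probs, denominator_probs):
        if not 0.0 < den <= 1.0:
            raise ValueError("denominator probabilities must lie in (0, 1]")
        running *= num / den  # multiply in the ratio for this time point
        weights.append(running)
    return weights


# Hypothetical predicted probabilities for one child over three time points.
numer = [0.90, 0.85, 0.20]
denom = [0.95, 0.80, 0.40]
example = cumulative_weights(numer, denom)
print(example)
```

Each child then carries one weight per time point, and the weight at time t summarizes the full history up to t.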
29
NOW WHAT?
Weighted logistic regression of the response on the predictor.
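A weighted logistic regression maximizes a log-likelihood in which each observation's contribution is multiplied by its weight. The sketch below fits only an intercept and a single predictor by Newton-Raphson; it omits the baseline variables and does not compute standard errors, and the function name `weighted_logistic` and the example records are hypothetical.

```python
import math


def weighted_logistic(records, iterations=25):
    """Fit logit(p) = b0 + b1 * x by Newton-Raphson on the weighted log-likelihood.

    records: list of (x, y, w) with predictor x, binary response y, and weight w.
    """
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y, w in records:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += w * (y - p)          # gradient of the weighted log-likelihood
            g1 += w * (y - p) * x
            v = w * p * (1.0 - p)      # entries of the negative Hessian
            h00 += v
            h01 += v * x
            h11 += v * x * x
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det  # Newton step solves H * step = g
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
    return b0, b1


# Hypothetical weighted cells: (predictor, response, total weight in the cell).
records = [(1, 1, 30.0), (1, 0, 20.0), (0, 1, 10.0), (0, 0, 40.0)]
b0, b1 = weighted_logistic(records)
print(f"intercept = {b0:.4f}, predictor coefficient = {b1:.4f}")
```

With one binary predictor the fitted coefficient equals the log of the weighted odds ratio, here log((30 × 40) / (20 × 10)) = log 6, which gives a quick sanity check on the fit.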
30
DATA EXAMPLE
•Naïve Model
•Standard Model
•Weighted Model
31
RESPONSE REGRESSION MODELS WITH CONDUCT DISORDER AS THE PREDICTOR+

                                 Naïve†       Standard     Weighted†
Predictor:
  Conduct Disorder               1.2544***    0.3628       0.6565**
    Odds                         3.51         1.44         2.06
    (p-value)                    (<0.0001)    (0.1203)     (0.0054)
Time-Varying Confounders:
  Cigarettes                                  0.4085
  Alcohol                                     0.8238**
  Other Drug Use                              1.2848**
  Peer Pressure Res.                          -0.0470***
Non-Time-Varying Confounders:
  Heart Rate                                  -0.0118
  Verbal IQ                                   -0.0265**
  Performance IQ                              -0.0117
  Ave. Sen. Seeking                           0.0191

NOTES:
+Coefficients for intercepts and baseline variables are omitted.
†These models do not include confounders by definition.
One-tailed tests: *p<0.05 **p<0.01 ***p<0.001
32
SUMMARY
•Worry about confounders in observational studies.
•The standard method of controlling for confounders results in biased estimates because of spurious correlations.
•The weighting method is one way to reduce bias.
33
ASSUMPTIONS
1. Sequential Ignorability
2. Past confounder patterns do not exclude particular levels of exposure
34
FUTURE DIRECTIONS
•Generalization of method to multilevel data structures
•Procedures to detect assumption violations
•Robustness to assumption violations
35
ROBUSTNESS TO ASSUMPTION VIOLATIONS
Assumption 1: Adjusting for more and more confounders leads to decreased bias using the weighted model.
Assumption 2: Biased estimators from the weighted model.
36
EXTRA INFO
37
PATH ANALYSIS
A Few Rules:
•Paths with no converging arrows and a variable not in the model do contribute to the correlation
•Paths with converging arrows and a variable not in the model do not contribute to the correlation
•Paths with no converging arrows and a variable in the model do not contribute to the correlation; the path is blocked
•Paths with converging arrows and a variable in the model do contribute to the correlation; multiply the path’s sign by -1
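The four rules amount to a small decision table. The sketch below encodes them for a single path; the function name `path_contribution` and the encoding of signs as +1, -1, and 0 are illustrative assumptions.

```python
def path_contribution(sign, has_converging_arrows, variable_in_model):
    """Apply the four rules to one path.

    sign: +1 or -1, the sign of the path before applying the rules.
    Returns the signed contribution (+1 or -1), or 0 if the path contributes nothing.
    """
    if not has_converging_arrows and not variable_in_model:
        return sign   # rule 1: contributes to the correlation
    if has_converging_arrows and not variable_in_model:
        return 0      # rule 2: does not contribute
    if not has_converging_arrows and variable_in_model:
        return 0      # rule 3: path is blocked
    return -sign      # rule 4: contributes, with the sign multiplied by -1


for converging in (False, True):
    for in_model in (False, True):
        print(converging, in_model, path_contribution(+1, converging, in_model))
```

The last rule is the one behind the spurious correlation: including the confounder in the model opens a path with converging arrows that would otherwise contribute nothing.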
38
OUR DATA
•Lexington Longitudinal Study
•121 female, 41 non-white
•Multiple confounders
•Time measured every 1/3 of a school year
39
WEIGHT CALCULATIONS
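•Numerator Regression Model:
$$\log\left(\frac{\mathrm{numpr}_{it}}{1 - \mathrm{numpr}_{it}}\right) = \alpha\,\mathrm{Schyr}_t + \beta_1 \mathrm{Sex}_i + \beta_2 \mathrm{Race}_i$$
•Denominator Regression Model:
$$\log\left(\frac{\mathrm{denpr}_{it}}{1 - \mathrm{denpr}_{it}}\right) = \alpha\,\mathrm{Schyr}_t + \beta_1 \mathrm{Sex}_i + \beta_2 \mathrm{Race}_i + \gamma\,\mathrm{Conf}_{it}$$
•Weight (conduct disorder initiation at time t):
$$W = \frac{1 - \mathrm{numpr}_{i1}}{1 - \mathrm{denpr}_{i1}} \times \cdots \times \frac{1 - \mathrm{numpr}_{i,t-1}}{1 - \mathrm{denpr}_{i,t-1}} \times \frac{\mathrm{numpr}_{it}}{\mathrm{denpr}_{it}}$$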
40
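•Naïve Model and Weighted Model:
$$\log\left(\frac{p_t}{1 - p_t}\right) = \alpha\,\mathrm{Schyr}_t + \beta_1 \mathrm{Sex} + \beta_2 \mathrm{Race} + \beta_3 \mathrm{Cd}_t$$
•Intercept Term:
$$\alpha\,\mathrm{Schyr}_t = \alpha_1 \mathrm{Schyr}_1 + \alpha_2 \mathrm{Schyr}_2 + \cdots + \alpha_t \mathrm{Schyr}_t$$
•Standard Model:
$$\log\left(\frac{p_t}{1 - p_t}\right) = \alpha\,\mathrm{Schyr}_t + \beta_1 \mathrm{Sex} + \beta_2 \mathrm{Race} + \beta_3 \mathrm{Cd}_t + \gamma\,\mathrm{Conf}_t$$
•Confounders:
$$\mathrm{Conf}_t = [\mathrm{Ppress}_t, \mathrm{Odga}_t, \mathrm{Hr}, \mathrm{Piq}, \mathrm{Viq}, \mathrm{Asss}, \mathrm{Cig}_t, \mathrm{Alc}_t]$$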