Bias Correction in Pharmaceutical
Risk-Benefit Assessment
Bob Obenchain, PhD, FASA
Risk-Benefit Statistics LLC
Yin = Dark = Evil = Risk
Yang = Light = Good = Benefit
Outline:
• Covariate Adjustment (Simplistic, Global Modeling) is Inadequate
• Local Control methods take BIG Steps in “Right Directions.”
• Emerging Credibility Crisis in Pharmaceutical Safety
Shah BR, Laupacis A, Hux JE, Austin PC. Propensity score methods gave similar results to traditional regression modeling in observational studies: a systematic review. J Clin Epidemiol 2005; 58: 550–559.
Stürmer T, Joshi M, Glynn RJ, Avorn J, Rothman KJ, Schneeweiss S. A review of the application of propensity score methods yielded increasing use, advantages in specific settings, but not substantially different estimates compared with conventional multivariable methods. [REVIEW ARTICLE] J Clin Epidemiol 2006; 59: 437–447.
With titles like these, do you really need to read the paper?
Early CA Modeling Efforts
Heckman JJ. Sample selection bias as a specification error. Econometrica 1979; 47: 153–161.
Crown WE, Obenchain RL, Engelhart L, Lair TJ, Buesching DP, Croghan TW. The application of sample selection models in evaluating treatment effects: the case for examining the effects of antidepressant medication. Stat Med 1998; 17: 1943–1958.
Obenchain RL, Melfi CA. Propensity score and Heckman adjustments for treatment selection bias in database studies. 1997 Proceedings of the Biopharmaceutical Section. Alexandria, VA: American Statistical Association. 1998; 297–306.
Highly Influential ???
D’Agostino RB Jr. Propensity score methods for bias reduction in the comparison of a treatment to a non-randomized control group. [TEACHER’S CORNER] Stat Med 1998; 17: 2265–2281.
Claimed that the 3rd form of PS Adjustment (after matching and sub-grouping) was to simply use some function of PS estimates as an additional X in Covariate Adjustment.
History of Local Control Methods for Human Studies
• Epidemiology (case-control & cohort) studies
• Post-stratification and re-weighting in surveys
• Stratified, dynamic randomization to improve balance on predictors of outcome
• Matching and Sub-grouping using Propensity Scores
• Econometric Instrumental Variables (LATEs)
• Marginal Structural Models (IPW 1/PS)
• Unsupervised Propensity Scoring: Nested Treatment-within-Cluster ANOVA model …with LATE, LTD and Error sources of variation
“Local” Terminology:
• Subgroups of Patients
• Subclasses…
• Strata…
• Clusters… (natural or forced)
Notation for Variables
y = observed outcome variable(s)
x = observed baseline covariate(s)
t = observed treatment assignment (usually non-random)
z = unobserved explanatory variable(s)
Fundamental PS Theorem
Joint distribution of x and t given p:
Pr( x, t | p ) = Pr( x | p ) Pr( t | x, p )
 = Pr( x | p ) Pr( t | x )
 = Pr( x | p ) times p or (1 − p)
 = Pr( x | p ) Pr( t | p )
...i.e., x and t are conditionally independent given the propensity for the new treatment, p = Pr( t = 1 | x ).
Conditioning (patient matching) on Propensity Scores implies both…
Balance: local X-covariate distributions must be the same for both treatments
and
Imbalance: Unequal local treatment fractions
unless Pr( t | p ) = p = 1 − p = 0.5
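The balance and imbalance claims can be checked exactly on a small discrete example. The sketch below is only an illustration: the population `pr_x` and the function `propensity` are hypothetical, chosen so that two different x-vectors share one true propensity score. It computes Pr( x | t, p ) from the joint distribution using exact fractions.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical population: Pr(x) over two binary covariates (x1, x2).
pr_x = {(0, 0): Fraction(1, 10), (0, 1): Fraction(3, 10),
        (1, 0): Fraction(2, 10), (1, 1): Fraction(4, 10)}

def propensity(x):
    """Assumed true propensity Pr(t = 1 | x); depends on x only through x1 + x2."""
    return {0: Fraction(1, 5), 1: Fraction(3, 10), 2: Fraction(4, 5)}[sum(x)]

def conditional_x_given_t_and_p():
    """Return {(p, t): {x: Pr(x | t, p)}} computed exactly from the joint."""
    joint = defaultdict(dict)  # (p, t) -> {x: Pr(x, t)}
    for x, px in pr_x.items():
        p = propensity(x)
        joint[(p, 1)][x] = px * p          # Pr(x, t = 1)
        joint[(p, 0)][x] = px * (1 - p)    # Pr(x, t = 0)
    out = {}
    for key, cell in joint.items():
        total = sum(cell.values())
        out[key] = {x: v / total for x, v in cell.items()}
    return out

dist = conditional_x_given_t_and_p()
p = Fraction(3, 10)
print(dist[(p, 1)] == dist[(p, 0)])  # same x-distribution in both treatment arms
```

Within the stratum p = 3/10 the x-distribution is identical for treated and untreated patients (balance), while the treated fraction in that stratum is 3/10 rather than 1/2 (imbalance).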
Constant PS Estimate Calipers from Discrete Choice (Logit or Probit) Model
[Figure: 3-D plot with axes x1, x2 and x3 showing an Infinite 3-D Slab on which the Linear Functional of x is constant]
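As a rough illustration of why a caliper on a logit PS estimate is a slab, the sketch below evaluates a logistic propensity at two patients whose covariates differ greatly but whose linear functional is equal. The coefficients and patients are hypothetical, and `logit_propensity` is a helper written for this example, not part of any package.

```python
import math

def logit_propensity(x, beta, intercept=0.0):
    """Logistic propensity Pr(t = 1 | x) for an assumed linear functional of x."""
    eta = intercept + sum(b * xi for b, xi in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical coefficients for three covariates (x1, x2, x3).
beta = (1.0, -2.0, 0.5)

# Two very different patients lying on the same plane: linear functional = 0.5.
patient_a = (0.5, 0.0, 0.0)
patient_b = (10.5, 6.0, 4.0)  # 10.5 - 12.0 + 2.0 = 0.5

ps_a = logit_propensity(patient_a, beta)
ps_b = logit_propensity(patient_b, beta)
print(ps_a, ps_b)
```

Both patients receive the same PS estimate, so matching on the PS alone would treat them as comparable even though they are far apart in x-space.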
Pr( x, t | p ) = Pr( x | p ) Pr( t | p )
The unknown true propensity score is the “most coarse” possible balancing score.
The known x-vector itself is the “most detailed” balancing score…
Pr( x, t ) = Pr( x ) Pr( t | x )
Conditioning upon Cluster Membership is intuitively somewhere between the two PS extremes in the limit as individual clusters become numerous, small and compact…
What is LESS “coarse” than Pr( x, t | p ) = Pr( x | p ) Pr( t | p ) ?
But LESS “detailed” than Pr( x, t ) = Pr( x ) Pr( t | x ) ?
Pr( x, t | C ) = Pr( x | C ) Pr( t | x, C )
 = Pr( x | C ) Pr( t | x ) for x ∈ C
 ≈ Pr( x | C ) times constant Pr( t | C )
Unsupervised: No PS Estimates Needed
[Figure: 3-D plot with axes x1, x2 and x3 showing 3-D Clusters (Informative or Uninformative)]
Nested ANOVA
Source | Degrees of Freedom | Interpretation
Clusters (Subgroups) | C = Number of Clusters | Local Average Treatment Effects (LATEs) are Cluster Means
Treatment within Cluster | Number of “Informative” Clusters ≤ C | Local Treatment Differences (LTDs)
Error | Number of Patients − 2C | Uncertainty
Although a NESTED model can be (technically) WRONG, it is sufficiently versatile to almost always be USEFUL as the number of “clusters” increases.
Multiplicative “Shrinkage” Model
E( observed Y_i ) = E( t_i Y_i ) = p_i E( Y_i )
t_i = 1 if “treated” Y_i is observed; 0, otherwise, and t_i is statistically independent of Y_i.
Propensity Score = Pr( t_i = 1 ) = p_i > 0 and < 1
Nested ANOVA Treatment Difference within ith Cluster:
n_1i = Number of treated patients in ith cluster > 0
n_0i = Number of untreated patients in ith cluster > 0
p̂_i = n_1i / ( n_1i + n_0i ) = n_1i / n_i
LTD_i = ( 1 / n_1i ) Σ y for treated patients − ( 1 / n_0i ) Σ y for untreated patients
where n_1i = n_i p̂_i and n_0i = n_i ( 1 − p̂_i )
Local Treatment Imbalance!
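The within-cluster quantities above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: cluster labels are taken as given (no clustering is performed), the function name `local_treatment_differences` and the toy records are hypothetical, and a cluster lacking either treated or untreated patients is reported as uninformative (`ltd` is `None`).

```python
from collections import defaultdict

def local_treatment_differences(records):
    """records: iterable of (cluster_id, t, y) with t in {0, 1}.

    Returns {cluster_id: dict} holding n1, n0, p_hat and, for informative
    clusters (n1 > 0 and n0 > 0), the local treatment difference `ltd`.
    """
    # Per cluster and treatment arm: [patient count, sum of outcomes].
    sums = defaultdict(lambda: {0: [0, 0.0], 1: [0, 0.0]})
    for cluster, t, y in records:
        sums[cluster][t][0] += 1
        sums[cluster][t][1] += y
    out = {}
    for cluster, by_t in sums.items():
        (n0, s0), (n1, s1) = by_t[0], by_t[1]
        row = {"n1": n1, "n0": n0, "p_hat": n1 / (n1 + n0), "ltd": None}
        if n1 > 0 and n0 > 0:
            row["ltd"] = s1 / n1 - s0 / n0  # treated mean minus untreated mean
        out[cluster] = row
    return out

# Hypothetical toy data: (cluster, treated?, outcome).
toy = [("A", 1, 5.0), ("A", 1, 7.0), ("A", 0, 4.0),
       ("B", 1, 3.0), ("B", 0, 2.0), ("B", 0, 4.0), ("B", 0, 3.0),
       ("C", 1, 9.0)]  # cluster C has no untreated patients: uninformative
result = local_treatment_differences(toy)
print(result)
```

Cluster B shows the local treatment imbalance directly: its estimated treated fraction is 0.25 even though its treatment difference is well defined.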
The “statistical methodology” engine ideal for making fair treatment comparisons is:
Cluster Analysis (Unsupervised Learning) plus Nested ANOVA
i.e. not Generalized Linear Models and their Nonlinear extensions.
Inverse Probability Weighting (IPW) for CA models:
E( y_i | x_i ) = β x_i
V( y_i | x_i ) = σ² p_i
β̂ = ( Σ_i y_i x_i / p_i ) / ( Σ_i x_i x_i / p_i )
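A weighted least-squares fit with inverse propensity weights can be sketched as follows. This is an assumption-laden illustration, not the slide's exact estimator: it assumes a single covariate, no intercept, and weights 1 / p_i for every patient; the function name `ipw_slope` and the data are hypothetical.

```python
def ipw_slope(x, y, p):
    """Weighted least-squares slope for y = beta * x + error with weights 1 / p_i.

    Assumes a single covariate, no intercept, and 0 < p_i < 1 for every patient.
    """
    if not all(0.0 < pi < 1.0 for pi in p):
        raise ValueError("propensity scores must lie strictly between 0 and 1")
    num = sum(xi * yi / pi for xi, yi, pi in zip(x, y, p))
    den = sum(xi * xi / pi for xi, pi in zip(x, p))
    return num / den

# Hypothetical data lying exactly on y = 2x, so any positive weights recover 2.
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
p = [0.2, 0.5, 0.8]
beta_hat = ipw_slope(x, y, p)
print(beta_hat)
```

The weights change the estimate only when the data depart from the fitted line, which is exactly when a global covariate-adjustment model is most exposed to specification error.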
The “Local Control” Philosophy:
• y = Outcome comparisons among patients with the most similar X characteristics are most relevant
• Robust, Nested Treatment-within-Cluster ANOVA
• Systematically form, compare, subdivide & recombine subgroups (clusters) …built-in sensitivity
• Non-parametric Distribution of Observed Local Treatment Differences (LTDs) …no prior distribution!
• Main Effect of Treatment is Mean of CDF formed by combining LTD estimates weighted by Cluster Size
• Only when Combined CDF suggests Differential Response: Which patient characteristics predict What?
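The main-effect bullet can be made concrete with a short sketch: given LTD estimates and cluster sizes, form the size-weighted empirical CDF and take its mean. The function name `weighted_ltd_summary` and the numbers are hypothetical, and only informative clusters are assumed to be passed in.

```python
def weighted_ltd_summary(ltds):
    """ltds: list of (ltd, cluster_size) pairs for informative clusters.

    Returns (main_effect, cdf), where cdf is a list of (ltd, cumulative weight)
    sorted by ltd, with weights proportional to cluster size.
    """
    total = sum(size for _, size in ltds)
    main_effect = sum(ltd * size for ltd, size in ltds) / total
    cdf, running = [], 0
    for ltd, size in sorted(ltds):
        running += size
        cdf.append((ltd, running / total))
    return main_effect, cdf

# Hypothetical LTD estimates with their cluster sizes.
main_effect, cdf = weighted_ltd_summary([(2.0, 3), (0.0, 4), (-1.0, 1)])
print(main_effect)
print(cdf)
```

A CDF that spreads over both negative and positive LTDs, as in this toy case, is the kind of pattern that would prompt the follow-up question of which patient characteristics predict the differences.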
Credibility…
• Conflicts of Interest between Pharmaceutical Industry, Regulators and Data Custodians / Analysts
• Why should industry pay BIG $$$ for observational studies when poor / naïve analyses of biased data can create perceived needs for even more expensive RCTs?
[Figure: pieces labeled FDA; MC Research; CRO & Academic Research; Managed Care; CMS & VA; Pharma Industry; PUBLIC]
The pieces don’t fit together very well in the USA!
Aprotinin Case Study…
• Attack in early 2006 by a US MD who got some very sloppy analyses of international patient registry data published in NEJM
• Bayer (Germany) commissioned gigantic admin claims analysis by the research arm of their major US payer in mid 2006
• MC researcher emailed a flawed, highly unfavorable analysis to Germany 8 days before 2006 US advisory board meeting
Drug warnings fall flat
Bayer hides bad news; a researcher doesn't, and takes heat.
KRIS HUNDLEY, St. Petersburg Times, August 5, 2007
Dr. Thomas Kelly, a heart surgeon for 30 years, …routinely uses Trasylol on repeat open-heart patients or people on blood thinners.
"Bleeding is a tremendous problem" Kelly said. “In certain populations, there is much less need for transfusions with Trasylol. The alternatives are not nearly as effective.“
"This drug is used on high-risk people; that's why there's a higher incidence of death," the surgeon said. "I think a terrible disservice has been done to a very helpful drug."
Though he thinks the recent studies "unfairly impugned" Trasylol, Kelly said he is using the drug more selectively and reading all the research available on the topic.
[Figure: pieces labeled FDA; MC Data; CRO & Academic Research; Managed Care; CMS & VA; Pharma Industry; PUBLIC]
Why should Pharma TRUST the other Players?
Unbiased Arbitration
What constitutes a BENEFIT ???
When a treatment is approved only for patients with high disease severity or clear vulnerability / frailty, there appear to be two possible “standards.”
Treated patients have better outcomes than untreated patients with same risk
or
Treated patients have better outcomes than untreated patients with high risk
References
Bang H, Robins JM. Doubly Robust Estimation in Missing Data and Causal Inference Models. Biometrics 2005; 61: 962-972.
Fraley C, Raftery AE. Model based clustering, discriminant analysis and density estimation. JASA 2002; 97: 611-631.
Imbens GW, Angrist JD. Identification and Estimation of Local Average Treatment Effects. Econometrica 1994; 62: 467-475.
McClellan M, McNeil BJ, Newhouse JP. Does More Intensive Treatment of Myocardial Infarction in the Elderly Reduce Mortality?: Analysis Using Instrumental Variables. JAMA 1994; 272: 859-866.
McEntegart D. The Pursuit of Balance Using Stratified and Dynamic Randomization Techniques: An Overview. Drug Information Journal 2003; 37: 293-308.
Obenchain RL. USPS package: Unsupervised and Supervised Propensity Scoring in R. Version 1.1-0. www.r-project.org August 2007.
Obenchain RL. Unsupervised Propensity Scoring: NN and IV Plots. 2004 Proceedings of the JSM.
Robins JM, Hernan MA, Brumback B. Marginal Structural Models and Causal Inference in Epidemiology. Epidemiology 2000; 11: 550-560.
Rosenbaum PR, Rubin DB. The Central Role of the Propensity Score in Observational Studies for Causal Effects. Biometrika 1983; 70: 41-55.
Rosenbaum PR. Observational Studies, Second Edition. 2002. New York: Springer-Verlag.