Aggregating Published Prediction Models with Individual Patient Data
A Comparison of Approaches
T.P.A. Debray, H. Koffijberg, Y. Vergouwe, K.G.M. Moons, E.W. Steyerberg
Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht; Center for Medical Decision Sciences, Erasmus Medical Center Rotterdam
The Netherlands
T.Debray@umcutrecht.nl
Introduction
Generalization of Prediction Models
Sensible Modeling Strategies
Large Sample Size
Penalization and Shrinkage
Incorporation of Prior Evidence
Thomas Debray (Julius Center) Aggregating Prediction Models ISCB 2011 2 / 13
Practical Example
Traumatic Brain Injury (TBI)
Mushkudiani review (J Clin Epidemiol 2008)
Although most models agree on the three most important predictors, many were developed on small sample sizes within single centers and hence lack generalizability. Modeling strategies have to be improved, and include external validation.
Illustration: Derivation of a novel Prediction Model
Favorable vs unfavorable outcome 6 months after TBI
Patient data at hand (n=603)
Predictors: age, motor score (6 cat.), pupillary reactivity (3 cat.)
Modeling Strategies
Regular Logistic Regression (LRM)
Penalized Likelihood Regression (PMLE)
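As a rough illustration of the penalized-likelihood idea (a sketch, not the authors' implementation), a ridge-type penalty can be subtracted from the ordinary logistic log-likelihood; the data layout and the quadratic penalty choice below are assumptions for illustration:

```python
import math

def penalized_loglik(beta, data, lam):
    """Ridge-penalized logistic log-likelihood (sketch of the PMLE idea):
    ordinary log-likelihood minus lam * sum of squared non-intercept
    coefficients. `data` is a list of (predictor_list, outcome) pairs."""
    ll = 0.0
    for x, y in data:
        lp = beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))  # linear predictor
        p = 1.0 / (1.0 + math.exp(-lp))                           # predicted risk
        ll += y * math.log(p) + (1 - y) * math.log(1 - p)
    return ll - lam * sum(b * b for b in beta[1:])                # shrinkage penalty

# Two toy patients with a single predictor (hypothetical values)
data = [([1.0], 1), ([-1.0], 0)]
beta = [0.0, 0.5]
```

Maximizing this penalized criterion instead of the plain likelihood shrinks the coefficients toward zero, which is what improves external validity for models fitted on small samples.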
Practical Example
Traumatic Brain Injury (TBI)
External Validation (AUC)
Cohort            LRM    PMLE
1. EBIC  (n=822)  0.79   0.80
2. SLIN  (n=409)  0.72   0.72
3. NABIS (n=385)  0.71   0.72
Can we improve the external validity of a new prediction model by aggregating newly collected IPD with prior evidence?
Prior evidence
11 studies on TBI
Similar set of predictors
Heterogeneity
Methods
Approach 1: Independent Pooling
Typical meta-analysis
Fixed (or random) effects
Multivariable regression coefficients are pooled independently
Weights are based on standard errors or sample size
Update intercept using IPD
Example: β̂₁ = ( Σ_{j=1}^{m} β₁ⱼ σ₁ⱼ⁻¹ ) / ( Σ_{j=1}^{m} σ₁ⱼ⁻¹ ), with σ̂₁ = 1 / Σ_{j=1}^{m} σ₁ⱼ⁻¹
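The pooling formula can be sketched in a few lines (symbol names follow the slide; treating σ₁ⱼ as the variance of study j's estimate is an assumption made for this illustration):

```python
def pool_independently(betas, variances):
    """Fixed-effect pooling of one regression coefficient across m
    published models: weighted mean with weights 1/sigma_j, where
    sigma_j is taken to be the variance of study j's estimate."""
    weights = [1.0 / v for v in variances]
    beta_pooled = sum(b * w for b, w in zip(betas, weights)) / sum(weights)
    var_pooled = 1.0 / sum(weights)   # variance of the pooled estimate
    return beta_pooled, var_pooled

# Example: three hypothetical studies reporting the same coefficient
beta_p, var_p = pool_independently([0.9, 1.1, 1.0], [0.04, 0.09, 0.05])
```

Each coefficient is pooled separately in this approach; the intercept is then re-estimated (updated) on the IPD at hand.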
Methods
Approach 2: Bayesian Inference
Prior: previously published prediction models
Regression coefficients: (β₀, β₁, ..., βₖ)_j
Variances: diag(Σ_j)
Multivariate random effects with penalization: μ_PRIOR, Σ_PRIOR
Likelihood: newly derived prediction model
Regression coefficients: μ_IPD = (β₀, β₁, ..., βₖ)ᵀ_IPD
Variance-covariance matrix: ΣIPD
Posterior: aggregated prediction model
μ_POST = Σ_POST (Σ_PRIOR⁻¹ μ_PRIOR + Σ_IPD⁻¹ μ_IPD)
Σ_POST = (Σ_PRIOR⁻¹ + Σ_IPD⁻¹)⁻¹
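A minimal numeric sketch of these posterior formulas, simplified (as an assumption for illustration) to diagonal covariance matrices so every operation is element-wise rather than a matrix inversion:

```python
def combine_posterior(mu_prior, var_prior, mu_ipd, var_ipd):
    """Precision-weighted combination of the literature prior and the
    IPD-derived model, per coefficient, assuming diagonal covariances:
    posterior precision = sum of precisions; posterior mean =
    posterior variance * (precision-weighted sum of means)."""
    post_var = [1.0 / (1.0 / vp + 1.0 / vi)
                for vp, vi in zip(var_prior, var_ipd)]
    post_mu = [v * (mp / vp + mi / vi)
               for v, mp, vp, mi, vi
               in zip(post_var, mu_prior, var_prior, mu_ipd, var_ipd)]
    return post_mu, post_var

# Hypothetical two-coefficient example (intercept and one predictor)
mu, var = combine_posterior([-3.2, 1.1], [0.5, 0.2], [-2.8, 0.9], [0.5, 0.2])
```

With equal variances on both sides, the posterior mean is the midpoint of the two estimates and the posterior variance is halved, which is the shrinkage behavior the approach relies on.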
Results
Practical Example: Validation Results
Discrimination
[Figure: external-validation AUC (0.5–1.0) in EBIC, SLIN, and NABIS for Logistic Regression, PMLE, Independent Pooling, and Bayesian Inference]
Results
Practical Example: Validation Results
Calibration
[Figure: calibration slope (0.6–1.3) in EBIC, SLIN, and NABIS for Logistic Regression, PMLE, Independent Pooling, and Bayesian Inference]
Results
Simulation Study
Reference model with 5 predictors for generating data
Individual Patient Data at hand (n_IPD = 200 → 1000)
  (b₀, b₁, ..., b₅)_IPD = {−3, 1, 1, 1, 1, 1}
4 heterogeneous literature studies (n_j = 500; data discarded after derivation of the literature models)
  (b₀, b₁, ..., b₅)_j ∼ N₆((−3, 1, 1, 1, 1, 1), Γ) with Γ ∼ Wish⁻¹(v, S)
Validation data (n_VAL = 5000)
  (b₀, b₁, ..., b₅)_VAL = {−3, 1, 1, 1, 1, 1}
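The data-generating step can be sketched as follows. This is a simplified stand-in: standard-normal predictors and the fixed reference coefficient vector are assumptions, and the inverse-Wishart draw of heterogeneous literature coefficients is not reproduced here:

```python
import math
import random

random.seed(1)
TRUE_B = [-3.0, 1.0, 1.0, 1.0, 1.0, 1.0]  # intercept + 5 predictor coefficients

def simulate(n, b):
    """Generate one dataset of n patients from a logistic reference
    model with 5 predictors; returns (predictor_list, outcome) pairs."""
    data = []
    for _ in range(n):
        x = [random.gauss(0.0, 1.0) for _ in range(5)]          # predictors
        lp = b[0] + sum(bk * xk for bk, xk in zip(b[1:], x))    # linear predictor
        p = 1.0 / (1.0 + math.exp(-lp))                         # outcome probability
        data.append((x, 1 if random.random() < p else 0))
    return data

ipd = simulate(200, TRUE_B)  # smallest IPD scenario from the slide
```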
Evaluation of Performance
Mean Squared Error of predictor associations:
MSEcoef = 1r
∑rρ=1
∑5k=1
(β̂k − bk,VAL
)2Area Under the ROC curve (AUC)
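The MSE_coef formula translates directly into code; the replication layout below (one list of estimated coefficients per simulation run) is an assumption for illustration:

```python
def mse_coef(estimated_runs, b_val):
    """MSE of the predictor associations across r simulation replications:
    MSE_coef = (1/r) * sum over runs rho of sum over k of
    (beta_hat_k - b_k,VAL)^2."""
    r = len(estimated_runs)
    return sum(sum((bh - bt) ** 2 for bh, bt in zip(run, b_val))
               for run in estimated_runs) / r

# Two hypothetical runs whose first coefficient misses the truth by ±0.1
runs = [[1.1, 1.0, 1.0, 1.0, 1.0],
        [0.9, 1.0, 1.0, 1.0, 1.0]]
mse = mse_coef(runs, [1.0] * 5)
```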
Results
Weak heterogeneity
[Figure: MSE_coef (0–1.0) and AUC (0.86–0.90) versus n_IPD/Σn_LIT (0.1–0.5) for Logistic Regression, Independent Pooling, and Bayesian Inference]
Results
Moderate heterogeneity
[Figure: MSE_coef (0–1.0) and AUC (0.86–0.90) versus n_IPD/Σn_LIT (0.1–0.5) for Logistic Regression, Independent Pooling, and Bayesian Inference]
Results
Strong heterogeneity
[Figure: MSE_coef (0–1.0) and AUC (0.86–0.90) versus n_IPD/Σn_LIT (0.1–0.5) for Logistic Regression, Independent Pooling, and Bayesian Inference]
Discussion
Discussion
Recommendations
Standard approach: strongly heterogeneous evidence and relatively much data at hand
Independent Pooling: homogeneous evidence and relatively few data at hand
Bayesian Inference: moderately heterogeneous evidence and relatively few data at hand
Conclusion
Bayesian Inference performs similarly to or better than the standard approach, except in the presence of strong heterogeneity. Aggregation of prior evidence may not be desirable in those scenarios.
Future Work
Optimization of shrinkage intensity in the Bayesian approach
Incorporation of prediction models with different predictors
Quantification of heterogeneity across prediction models