Aggregating Published Prediction Models with Individual Patient Data


A Comparison of Approaches

T.P.A. Debray, H. Koffijberg, Y. Vergouwe, K.G.M. Moons, E.W. Steyerberg

Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht
Center for Medical Decision Sciences, Erasmus Medical Center Rotterdam

The Netherlands

T.Debray@umcutrecht.nl

Introduction

Generalization of Prediction Models

Sensible Modeling Strategies

Large Sample Size

Penalization and Shrinkage

Incorporation of Prior Evidence

Thomas Debray (Julius Center) Aggregating Prediction Models ISCB 2011 2 / 13

Practical Example

Traumatic Brain Injury (TBI)

Mushkudiani review (J Clin Epidemiol 2008)

Although most models agree on the three most important predictors, many were developed on small sample sizes within single centers and hence lack generalizability. Modeling strategies have to be improved and should include external validation.

Illustration: Derivation of a novel Prediction Model

Outcome: favorable vs unfavorable outcome 6 months after TBI
Patient data at hand (n = 603)
Predictors: age, motor score (6 categories), pupillary reactivity (3 categories)

Modeling Strategies

Regular Logistic Regression (LRM)
Penalized Likelihood Regression (PMLE)


Practical Example

Traumatic Brain Injury (TBI)

External Validation (AUC)

Validation cohort     LRM    PMLE
1. EBIC (n = 822)     0.79   0.80
2. SLIN (n = 409)     0.72   0.72
3. NABIS (n = 385)    0.71   0.72

Can we improve the external validity of a new prediction model by aggregating newly collected IPD with prior evidence?

Prior evidence

11 studies on TBI
Similar set of predictors
Heterogeneity


Methods

Approach 1: Independent Pooling

Typical meta-analysis

Fixed (or random) effects

Multivariable regression coefficients are pooled independently
Weights are based on standard errors or sample size

Update intercept using IPD

Example (coefficient of the first predictor, pooled over m studies):

β̂₁ = (Σⱼ β₁ⱼ σ₁ⱼ⁻¹) / (Σⱼ σ₁ⱼ⁻¹),  with  σ̂₁ = 1 / Σⱼ σ₁ⱼ⁻¹  (j = 1, ..., m)
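For illustration, this pooling step can be sketched in a few lines of Python; the coefficients and standard errors below are invented numbers, not the TBI estimates.

```python
import numpy as np

# Hypothetical coefficients for one predictor from m = 4 published models,
# together with their standard errors (illustrative values only)
beta = np.array([0.9, 1.1, 1.0, 1.2])
se = np.array([0.20, 0.25, 0.15, 0.30])

# Fixed-effect pooling with weights based on the standard errors,
# following the formula above (w_j = 1 / sigma_1j)
w = 1.0 / se
beta_pooled = (w * beta).sum() / w.sum()
se_pooled = 1.0 / w.sum()

print(beta_pooled, se_pooled)
```

The intercept of the pooled model would then still be updated using the IPD, as the slide notes.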


Methods

Approach 2: Bayesian Inference

Prior: previously published prediction models

Regression coefficients: (β₀, β₁, ..., βₖ)ⱼ
Variances: diag(Σⱼ)
Multivariate random effects with penalization: µ_PRIOR, Σ_PRIOR

Likelihood: newly derived prediction model

Regression coefficients: µ_IPD = (β₀, β₁, ..., βₖ)ᵀ_IPD

Variance–covariance matrix: Σ_IPD

Posterior: aggregated prediction model

µ_POST = Σ_POST (Σ_PRIOR⁻¹ µ_PRIOR + Σ_IPD⁻¹ µ_IPD)

Σ_POST = (Σ_PRIOR⁻¹ + Σ_IPD⁻¹)⁻¹
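As a sketch, this normal–normal update can be written directly with NumPy; the prior and IPD values below are invented for illustration, not taken from the talk.

```python
import numpy as np

# Illustrative 2-coefficient case (intercept + one predictor); the actual
# models carry a full coefficient vector and covariance matrix
mu_prior = np.array([-3.0, 1.0])      # summary of the literature models
Sigma_prior = np.diag([0.5, 0.2])
mu_ipd = np.array([-2.7, 1.3])        # model fitted on the IPD at hand
Sigma_ipd = np.diag([0.3, 0.1])

# Posterior of the aggregated model:
#   Sigma_post = (Sigma_prior^-1 + Sigma_ipd^-1)^-1
#   mu_post    = Sigma_post (Sigma_prior^-1 mu_prior + Sigma_ipd^-1 mu_ipd)
P_prior = np.linalg.inv(Sigma_prior)
P_ipd = np.linalg.inv(Sigma_ipd)
Sigma_post = np.linalg.inv(P_prior + P_ipd)
mu_post = Sigma_post @ (P_prior @ mu_prior + P_ipd @ mu_ipd)

print(mu_post)
```

Note how the posterior mean is a precision-weighted compromise between the literature prior and the IPD estimate, which is exactly what makes the approach attractive when the IPD sample is small.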


Results

Practical Example: Validation Results

Discrimination

[Figure: AUC (y-axis, 0.5–1.0) in the EBIC, SLIN, and NABIS validation cohorts, comparing Logistic Regression, PMLE, Independent Pooling, and Bayesian Inference]


Results

Practical Example: Validation Results

Calibration

[Figure: calibration slope (y-axis, 0.6–1.3) in the EBIC, SLIN, and NABIS validation cohorts, comparing Logistic Regression, PMLE, Independent Pooling, and Bayesian Inference]


Results

Simulation Study

Reference model with 5 predictors for generating data

Individual patient data at hand (n_IPD = 200 to 1000):
(b₀, b₁, ..., b₅)_IPD = {−3, 1, 1, 1, 1, 1}

4 heterogeneous literature studies (nⱼ = 500; data discarded after derivation of the literature models):
(b₀, b₁, ..., b₅)ⱼ ~ N₆((−3, 1, 1, 1, 1, 1), Γ) with Γ ~ Wish⁻¹(v, S)

Validation data (n_VAL = 5000):
(b₀, b₁, ..., b₅)_VAL = {−3, 1, 1, 1, 1, 1}
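A minimal sketch of this data-generating step in Python. For simplicity the between-study covariance Γ is fixed to a diagonal matrix here rather than drawn from an inverse-Wishart, and all numbers besides the reference coefficients are illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)

# Reference coefficients of the data-generating logistic model
b_ref = np.array([-3.0, 1.0, 1.0, 1.0, 1.0, 1.0])

# Between-study covariance; the talk draws Gamma from an inverse-Wishart,
# but a fixed diagonal matrix keeps this sketch dependency-free
Gamma = 0.1 * np.eye(6)

# Coefficients of 4 heterogeneous "literature" models
b_lit = rng.multivariate_normal(b_ref, Gamma, size=4)

# Simulate one literature data set (n_j = 500) with a binary outcome
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=(n, 5))])
p = 1.0 / (1.0 + np.exp(-X @ b_lit[0]))
y = rng.binomial(1, p)
print(b_lit.shape)
```

In the actual study the literature data sets would be discarded after fitting, so that only the published coefficients and standard errors remain available.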

Evaluation of Performance

Mean Squared Error of predictor associations:

MSE_coef = (1/r) Σᵨ Σₖ (β̂ₖ − bₖ,VAL)², summing over replicates ρ = 1, ..., r and predictors k = 1, ..., 5

Area Under the ROC curve (AUC)
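The MSE criterion can be computed directly; the estimated coefficients below stand in for r = 3 hypothetical simulation replicates.

```python
import numpy as np

# True predictor effects in the validation population (b_1..b_5 = 1)
b_val = np.ones(5)

# Hypothetical estimated coefficients from r = 3 simulation replicates
beta_hat = np.array([
    [0.9, 1.1, 1.0, 0.8, 1.2],
    [1.1, 0.9, 1.2, 1.0, 0.9],
    [1.0, 1.0, 0.9, 1.1, 1.0],
])

# MSE_coef = (1/r) * sum over replicates of sum over k of (beta_hat_k - b_k,VAL)^2
mse_coef = np.mean(np.sum((beta_hat - b_val) ** 2, axis=1))
print(mse_coef)
```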



Results

Weak heterogeneity

[Figure: MSE_coef (0–1.0) and AUC (0.86–0.90) against n_IPD/Σn_LIT (0.1–0.5) under weak heterogeneity, comparing Logistic Regression, Independent Pooling, and Bayesian Inference]


Results

Moderate heterogeneity

[Figure: MSE_coef (0–1.0) and AUC (0.86–0.90) against n_IPD/Σn_LIT (0.1–0.5) under moderate heterogeneity, comparing Logistic Regression, Independent Pooling, and Bayesian Inference]


Results

Strong heterogeneity

[Figure: MSE_coef (0–1.0) and AUC (0.86–0.90) against n_IPD/Σn_LIT (0.1–0.5) under strong heterogeneity, comparing Logistic Regression, Independent Pooling, and Bayesian Inference]


Discussion


Recommendations

Standard approach: strongly heterogeneous evidence and relatively many data at hand
Independent Pooling: homogeneous evidence and relatively few data at hand
Bayesian Inference: moderately heterogeneous evidence and relatively few data at hand

Conclusion

Bayesian Inference performs similarly to or better than the standard approach, except in the presence of strong heterogeneity. Aggregation of prior evidence may not be desirable in those scenarios.

Future Work

Optimization of the shrinkage intensity in the Bayesian approach
Incorporation of prediction models with different predictors
Quantification of heterogeneity across prediction models


