Aggregating Published Prediction Models with Individual Patient Data

A Comparison of Approaches

T.P.A. Debray, H. Koffijberg, Y. Vergouwe, K.G.M. Moons, E.W. Steyerberg

Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht

Center for Medical Decision Sciences, Erasmus Medical Center Rotterdam

The Netherlands

[email protected]


Introduction

Generalization of Prediction models

Sensible Modeling Strategies

Large Sample Size

Penalization and Shrinkage

Incorporation of Prior Evidence

Thomas Debray (Julius Center) Aggregating Prediction Models ISCB 2011 2 / 13


Practical Example

Traumatic Brain Injury (TBI)

Mushkudiani review (J Clin Epidemiol 2008)

Although most models agree on the three most important predictors, many were developed on small sample sizes within single centers and hence lack generalizability. Modeling strategies have to be improved, and include external validation.

Illustration: Derivation of a novel Prediction Model

Favorable vs unfavorable outcome 6 months after TBI

Patient data at hand (n=603)

Predictors: Age, motor score (6 cat.), pupillary reactivity (3 cat.)

Modeling Strategies

Regular Logistic Regression (LRM)

Penalized Likelihood Regression (PMLE)
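The two modeling strategies can be contrasted with a minimal sketch. The slides do not state which penalty was used for PMLE, so the ridge (quadratic) penalty on the slopes, the penalty weight, and the simulated data below are all assumptions made for illustration only.

```python
import numpy as np

def fit_logistic(X, y, penalty=0.0, n_iter=50):
    """Newton-Raphson fit of a logistic model.

    `penalty` is an assumed ridge weight applied to the slopes only;
    penalty=0 gives the regular maximum likelihood fit (LRM)."""
    Xd = np.column_stack([np.ones(len(y)), X])    # add intercept column
    P = penalty * np.eye(Xd.shape[1])
    P[0, 0] = 0.0                                 # leave the intercept unpenalized
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))
        grad = Xd.T @ (y - p) - P @ beta          # gradient of penalized log-likelihood
        hess = Xd.T @ (Xd * (p * (1 - p))[:, None]) + P
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta

# Hypothetical data, not the TBI data from the talk
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
true_beta = np.array([-1.0, 1.0, 0.5, -0.5])
p_true = 1.0 / (1.0 + np.exp(-(true_beta[0] + X @ true_beta[1:])))
y = rng.binomial(1, p_true)

beta_lrm = fit_logistic(X, y)                     # regular logistic regression
beta_pmle = fit_logistic(X, y, penalty=10.0)      # penalized fit, slopes shrunk toward 0
```

With a positive penalty the slope estimates are pulled toward zero, which is the shrinkage effect the introduction refers to.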


Practical Example

Traumatic Brain Injury (TBI)

External Validation (AUC)

Validation set       LRM     PMLE
1. EBIC (n=822)      0.79    0.80
2. SLIN (n=409)      0.72    0.72
3. NABIS (n=385)     0.71    0.72
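For readers less familiar with the metric, the AUC reported for external validation can be computed as the proportion of event/non-event pairs in which the event receives the higher predicted risk. The sketch below uses this pairwise (Mann-Whitney) form; the function name and the four toy observations are illustrative and are not taken from the validation sets.

```python
import numpy as np

def auc(y_true, score):
    """Probability that a random event scores higher than a random
    non-event, counting ties as one half."""
    y_true = np.asarray(y_true)
    score = np.asarray(score, dtype=float)
    pos = score[y_true == 1]
    neg = score[y_true == 0]
    diff = pos[:, None] - neg[None, :]            # all event/non-event pairs
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (len(pos) * len(neg))

# Toy example: 3 of the 4 pairs are ranked correctly
y_demo = [0, 0, 1, 1]
s_demo = [0.1, 0.4, 0.35, 0.8]
auc_demo = auc(y_demo, s_demo)                    # 0.75
```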

Can we improve the external validity of a new prediction model by aggregating newly collected IPD with prior evidence?

Prior evidence

11 studies on TBI

Similar set of predictors

Heterogeneity


Methods

Approach 1: Independent Pooling

Typical meta-analysis

Fixed (or random) effects

Multivariable regression coefficients are pooled independently

Weights are based on standard errors or sample size

Update intercept using IPD

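Example: $\hat{\beta}_1 = \dfrac{\sum_{j=1}^{m} \beta_{1j}\,\sigma_{1j}^{-1}}{\sum_{j=1}^{m} \sigma_{1j}^{-1}}$ with $\hat{\sigma}_1 = \dfrac{1}{\sum_{j=1}^{m} \sigma_{1j}^{-1}}$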
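The pooling formula and the intercept update can be sketched as follows. The sketch assumes that $\sigma_{1j}$ denotes the variance (squared standard error) of $\beta_{1j}$, so that the formula is ordinary fixed-effect inverse-variance weighting. The intercept update is written here as a one-parameter logistic fit with the pooled linear predictor as an offset, which is one common way to do it and not necessarily the authors' exact procedure. All function names and numbers are hypothetical.

```python
import numpy as np

def pool_coefficient(betas, variances):
    """Fixed-effect pooling of one coefficient across m published models.
    `variances` are assumed to be squared standard errors."""
    betas = np.asarray(betas, dtype=float)
    w = 1.0 / np.asarray(variances, dtype=float)  # inverse-variance weights
    pooled = np.sum(w * betas) / np.sum(w)
    pooled_var = 1.0 / np.sum(w)
    return pooled, pooled_var

def update_intercept(X, y, slopes, n_iter=50):
    """Re-estimate only the intercept in the IPD, keeping the pooled
    slopes fixed (they enter the model as an offset)."""
    offset = X @ slopes
    b0 = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(b0 + offset)))
        step = np.sum(y - p) / np.sum(p * (1 - p))  # 1-D Newton step
        b0 += step
        if abs(step) < 1e-10:
            break
    return b0

# Two hypothetical literature estimates with equal precision
pooled_b1, pooled_var1 = pool_coefficient([1.0, 2.0], [1.0, 1.0])  # 1.5, 0.5

# Hypothetical IPD used only to recalibrate the intercept
rng = np.random.default_rng(3)
X_ipd = rng.normal(size=(200, 2))
slopes = np.array([1.0, 0.5])
y_ipd = rng.binomial(1, 1.0 / (1.0 + np.exp(-(-1.0 + X_ipd @ slopes))))
b0_new = update_intercept(X_ipd, y_ipd, slopes)
```

After the update, the mean predicted risk in the IPD equals the observed outcome frequency, which is what recalibrating the intercept is meant to achieve.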


Methods

Approach 2: Bayesian Inference

Prior: previously published prediction models

Regression coefficients: $(\beta_0, \beta_1, \ldots, \beta_k)_j$

Variances: $\mathrm{diag}(\Sigma_j)$

Multivariate random-effects with penalization: $\mu_{\mathrm{PRIOR}}, \Sigma_{\mathrm{PRIOR}}$

Likelihood: newly derived prediction model

Regression coefficients: $\mu_{\mathrm{IPD}} = (\beta_0, \beta_1, \ldots, \beta_k)^{T}_{\mathrm{IPD}}$

Variance-covariance matrix: $\Sigma_{\mathrm{IPD}}$

Posterior: aggregated prediction model

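$\mu_{\mathrm{POST}} = \Sigma_{\mathrm{POST}} \left( \Sigma_{\mathrm{PRIOR}}^{-1}\,\mu_{\mathrm{PRIOR}} + \Sigma_{\mathrm{IPD}}^{-1}\,\mu_{\mathrm{IPD}} \right)$

$\Sigma_{\mathrm{POST}} = \left( \Sigma_{\mathrm{PRIOR}}^{-1} + \Sigma_{\mathrm{IPD}}^{-1} \right)^{-1}$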
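The posterior update is a precision-weighted combination of the prior and the IPD estimate, and can be sketched in a few lines. The sketch takes $\mu_{\mathrm{PRIOR}}$ and $\Sigma_{\mathrm{PRIOR}}$ as given; the multivariate random-effects step that produces them from the published models is not shown. The function name and the two-coefficient example values are hypothetical.

```python
import numpy as np

def aggregate_posterior(mu_prior, cov_prior, mu_ipd, cov_ipd):
    """Normal-normal update: combine prior and IPD coefficient estimates,
    each weighted by its precision (inverse covariance)."""
    prec_prior = np.linalg.inv(cov_prior)
    prec_ipd = np.linalg.inv(cov_ipd)
    cov_post = np.linalg.inv(prec_prior + prec_ipd)
    mu_post = cov_post @ (prec_prior @ mu_prior + prec_ipd @ mu_ipd)
    return mu_post, cov_post

# Hypothetical intercept and one slope
mu_prior = np.array([-3.0, 1.0])
cov_prior = np.diag([0.2, 0.1])
mu_ipd = np.array([-2.0, 2.0])
cov_ipd = np.diag([0.2, 0.3])

mu_post, cov_post = aggregate_posterior(mu_prior, cov_prior, mu_ipd, cov_ipd)
# Intercept: equal precision, so the posterior mean is the midpoint (-2.5).
# Slope: the prior is three times as precise, so the posterior mean (1.25)
# lies closer to the prior value.
```

The more uncertain the IPD estimate is relative to the prior, the more the aggregated model leans on the published evidence.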


Results

Practical Example: Validation Results

Discrimination

[Figure: AUC (axis 0.5 to 1.0) in the EBIC, SLIN and NABIS validation sets for Logistic Regression, PMLE, Independent Pooling and Bayesian Inference]


Results

Practical Example: Validation Results

Calibration

[Figure: Calibration slope (axis 0.6 to 1.3) in the EBIC, SLIN and NABIS validation sets for Logistic Regression, PMLE, Independent Pooling and Bayesian Inference]
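As background for the calibration results, the calibration slope is commonly estimated by regressing the observed outcome on the model's linear predictor in the validation data; a slope of 1 indicates well-calibrated predictor effects and a slope below 1 indicates predictions that are too extreme. The sketch below follows that common definition, which is assumed here since the slides do not spell it out, and uses simulated data rather than the TBI validation sets.

```python
import numpy as np

def calibration_slope(y, lp, n_iter=50):
    """Slope b of the model logit P(y=1) = a + b * lp, fitted by
    Newton-Raphson, where lp is the linear predictor under validation."""
    y = np.asarray(y, dtype=float)
    Z = np.column_stack([np.ones(len(y)), np.asarray(lp, dtype=float)])
    coef = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Z @ coef))
        grad = Z.T @ (y - p)
        hess = Z.T @ (Z * (p * (1 - p))[:, None])
        step = np.linalg.solve(hess, grad)
        coef = coef + step
        if np.max(np.abs(step)) < 1e-10:
            break
    return coef[1]

# Simulated validation data (hypothetical)
rng = np.random.default_rng(1)
lp_true = rng.normal(-0.5, 1.0, size=2000)
y_sim = rng.binomial(1, 1.0 / (1.0 + np.exp(-lp_true)))

slope_ok = calibration_slope(y_sim, lp_true)             # close to 1
slope_overfit = calibration_slope(y_sim, 2.0 * lp_true)  # half of slope_ok
```

Doubling every coefficient of the model halves the estimated slope, which illustrates why overfitted models show calibration slopes below 1.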


Results

Simulation Study

Reference model with 5 predictors for generating data

Individual Patient Data at hand ($n_{\mathrm{IPD}} = 200 \rightarrow 1000$)

$(b_0, b_1, \ldots, b_5)_{\mathrm{IPD}} = \{-3, 1, 1, 1, 1, 1\}$

4 heterogeneous literature studies ($n_j = 500$) (data discarded after derivation of literature models)

$(b_0, b_1, \ldots, b_5)_j \propto N_6\left((-3, 1, 1, 1, 1, 1), \Gamma\right)$ with $\Gamma \propto \mathrm{Wish}^{-1}(v, S)$

Validation Data ($n_{\mathrm{VAL}} = 5000$)

$(b_0, b_1, \ldots, b_5)_{\mathrm{VAL}} = \{-3, 1, 1, 1, 1, 1\}$
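One replication of this data-generating setup could look like the sketch below. Several details are not given on the slide and are assumptions here: the predictors are drawn as independent standard normal variables, a single $\Gamma$ is drawn per replication and shared by the four literature studies, and the values of $v$ and $S$ are placeholders. The inverse-Wishart draw is built with NumPy only, by inverting a Wishart draw.

```python
import numpy as np

rng = np.random.default_rng(2011)
B_REF = np.array([-3.0, 1.0, 1.0, 1.0, 1.0, 1.0])  # (b0, b1, ..., b5) from the slide

def simulate_study(n, coef, rng):
    """Simulate n patients with 5 predictors and a binary outcome.
    Assumption: independent standard-normal predictors."""
    X = rng.normal(size=(n, 5))
    p = 1.0 / (1.0 + np.exp(-(coef[0] + X @ coef[1:])))
    return X, rng.binomial(1, p)

def draw_inverse_wishart(v, S, rng):
    """Draw Gamma from an inverse-Wishart(v, S) by inverting a
    Wishart(v, S^-1) draw formed from v outer products of N(0, S^-1)
    vectors. Requires integer v >= dimension of S."""
    z = rng.multivariate_normal(np.zeros(len(S)), np.linalg.inv(S), size=v)
    gamma = np.linalg.inv(z.T @ z)
    return (gamma + gamma.T) / 2.0                 # remove rounding asymmetry

V, S = 20, 0.5 * np.eye(6)                         # hypothetical heterogeneity settings
gamma = draw_inverse_wishart(V, S, rng)

# Four literature studies, each with its own coefficient vector
lit_coefs = rng.multivariate_normal(B_REF, gamma, size=4)
lit_studies = [simulate_study(500, b, rng) for b in lit_coefs]

# IPD at hand (smallest size considered) and validation data
X_ipd, y_ipd = simulate_study(200, B_REF, rng)
X_val, y_val = simulate_study(5000, B_REF, rng)
```

Larger entries in `S` (or a smaller `v`) spread the literature coefficients further from the reference values, which is how weak, moderate and strong heterogeneity could be mimicked in such a sketch.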

Evaluation of Performance

Mean Squared Error of predictor associations:

$\mathrm{MSE}_{\mathrm{coef}} = \dfrac{1}{r} \sum_{\rho=1}^{r} \sum_{k=1}^{5} \left( \hat{\beta}_k - b_{k,\mathrm{VAL}} \right)^2$

Area Under the ROC curve (AUC)
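The coefficient error measure can be computed directly from the estimates of all replications. The sketch assumes that $\hat{\beta}_k$ is re-estimated in each of the $r$ replications (the slide formula does not index it by $\rho$) and that the intercept is excluded, since the inner sum runs over the five predictors only. The two example rows are made up.

```python
import numpy as np

def mse_coef(estimates, b_val):
    """Squared error summed over the 5 slopes, averaged over replications.

    estimates: array of shape (r, 5), one row of slope estimates per replication
    b_val:     length-5 vector of slopes used to generate the validation data
    """
    estimates = np.asarray(estimates, dtype=float)
    sq_err = (estimates - np.asarray(b_val, dtype=float)) ** 2
    return sq_err.sum(axis=1).mean()

b_val = np.ones(5)                                 # (b1, ..., b5) from the slide
est_demo = np.array([[1.1, 0.9, 1.0, 1.0, 1.0],    # replication 1: summed error 0.02
                     [1.0, 1.0, 1.2, 1.0, 0.8]])   # replication 2: summed error 0.08
mse_demo = mse_coef(est_demo, b_val)               # (0.02 + 0.08) / 2 = 0.05
```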


Results

Weak heterogeneity

[Figure, two panels: MSEcoef (axis 0 to 1.0) and AUC (axis 0.86 to 0.90), each plotted against nIPD/ΣnLIT (0.1 to 0.5), for Logistic Regression, Independent Pooling and Bayesian Inference]


Results

Moderate heterogeneity

[Figure, two panels: MSEcoef (axis 0 to 1.0) and AUC (axis 0.86 to 0.90), each plotted against nIPD/ΣnLIT (0.1 to 0.5), for Logistic Regression, Independent Pooling and Bayesian Inference]


Results

Strong heterogeneity

[Figure, two panels: MSEcoef (axis 0 to 1.0) and AUC (axis 0.86 to 0.90), each plotted against nIPD/ΣnLIT (0.1 to 0.5), for Logistic Regression, Independent Pooling and Bayesian Inference]


Discussion

Discussion

Recommendations

Standard approach: Strongly heterogeneous evidence and a relatively large amount of data at hand

Independent Pooling: Homogeneous evidence and relatively little data at hand

Bayesian Inference: Moderately heterogeneous evidence and relatively little data at hand

Conclusion

Bayesian Inference performs similarly to or better than the standard approach, except in the presence of strong heterogeneity. Aggregation of prior evidence may not be desirable in those scenarios.

Future Work

Optimization of shrinkage intensity in Bayesian approach

Incorporation of prediction models with different predictors

Quantification of heterogeneity across prediction models
