View
13
Download
0
Category
Preview:
Citation preview
Full Bayesian models to handle missing values in cost-effectivenessanalysis from individual level data
Gianluca Baio
(Joint work with Andrea Gabrio and Alexina Mason)
University College LondonDepartment of Statistical Science
g.baio@ucl.ac.uk
http://www.ucl.ac.uk/statistics/research/statistics-health-economics/http://www.statistica.it/gianluca
https://github.com/giabaio
RSS 2017 International ConferenceUniversity of Strathclyde, Glasgow
Wednesday 6 September 2017
Gianluca Baio (UCL) Bayesian models for missing data in CEA RSS Conference, 6 Sept 2017 1 / 15
Health technology assessment (HTA)
Objective: Combine costs & benefits of a given intervention into a rational scheme forallocating resources, increasingly often under a Bayesian framework
Statisticalmodel
Economicmodel
Decisionanalysis
Uncertaintyanalysis
• Estimates relevant populationparameters θ
• Varies with the type ofavailable data (& statisticalapproach!)
• Combines the parameters to obtaina population average measure forcosts and clinical benefits
• Varies with the type of availabledata & statistical model used
• Summarises the economic modelby computing suitable measures of“cost-effectiveness”
• Dictates the best course ofactions, given current evidence
• Standardised process
∆e = fe(θ)
∆c = fc(θ)
. . .
ICER = E[∆c]/E[∆e]
EIB = kE[∆e] − E[∆c]
CEAC = Pr(k∆e − ∆c > 0)
• Assesses the impact of uncertainty (eg inparameters or model structure) on theeconomic results
• Mandatory in many jurisdictions (includingNICE)
• Fundamentally Bayesian!
Gianluca Baio (UCL) Bayesian models for missing data in CEA RSS Conference, 6 Sept 2017 2 / 15
Statistical model — individual level data
• The available data usually look something like this:
Demographics HRQL data Resource use dataID Trt Sex Age . . . u0 u1 . . . uJ c0 c1 . . . cJ
1 1 M 23 . . . 0.32 0.66 . . . 0.44 103 241 . . . 802 1 M 21 . . . 0.12 0.16 . . . 0.38 1 204 1 808 . . . 8773 2 F 19 . . . 0.49 0.55 . . . 0.88 16 12 . . . 22. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
and the typical analysis is based on the following steps:
1 Compute individual QALYs and total costs as
ei =
J∑j=1
(uij + uij−1)δj
2and ci =
J∑j=0
cij ,[
with: δj =Timej − Timej−1
Unit of time
]
2 (Often implicitly) assume normality and linearity and model independently individualQALYs and total costs by controlling for baseline values
ei = αe0 + αe1u0i + αe2Trti + εei [+ . . .], εei ∼ Normal(0, σe)
ci = αc0 + αc1c0i + αc2Trti + εci [+ . . .], εci ∼ Normal(0, σc)
3 Estimate population average cost and effectiveness differentials and use bootstrap toquantify uncertainty
Gianluca Baio (UCL) Bayesian models for missing data in CEA RSS Conference, 6 Sept 2017 3 / 15
What’s wrong with this?
• Potential correlation between costs & clinical benefits– Strong positive correlation — effective treatments are innovative and result from
intensive and lengthy research ⇒ are associated with higher unit costs– Negative correlation — more effective treatments may reduce total care pathway costs
e.g. by reducing hospitalisations, side effects, etc.– NB: In any case, the economic evaluation is based on both!
• Joint/marginal normality not realistic– Costs usually skewed and benefits may be bounded in [0; 1]– Can use transformation (e.g. logs) — but care is needed when back transforming to
the natural scale– Can use more suitable models (e.g. Gamma or log-Normal) — especially under the
Bayesian approach
• ... and of course Partially Observed data– Can have item and/or unit non-response– Missingness may occur in either or both benefits/costs– The missingness mechanisms may also be correlated
– Focus in decision-making — not inference!
Gianluca Baio (UCL) Bayesian models for missing data in CEA RSS Conference, 6 Sept 2017 4 / 15
To be or not to be (Bayesians)?...
p(θ | y) ∝ p(θ)p(y | θ)?
• In general, can represent a joint distribution as a conditional regression
p(e, c) = p(e)p(e)p(c | e)p(c | e) = p(c)p(e | c)
ci
φicψc
µc[. . .]
ei
φie
ψe
µe [. . .]β
Conditional model for cMarginal model for e
ei ∼ p(e | φei,ψe)
g(φei) = µe [+ . . .]
φei = locationψe = ancillary
φci = locationψc = ancillary
φei = marginal meanψe = marginal variance
φci = conditional meanψc = conditional variance
φei = marginal meanψe = marginal precision
φci = conditional meanψc = rateφciψc = shape
ci ∼ p(c | e, φci,ψc)
g(φci) = µc + β(ei − µe) [+ . . .]
• For example:
• Combining “modules” and fully characterising uncertainty about deterministicfunctions of random quantities is relatively straightforward using MCMC
• Prior information can help stabilise inference (especially with sparse data!), eg– Cancer patients are unlikely to survive as long as the general population– ORs are unlikely to be greater than ±5
Gianluca Baio (UCL) Bayesian models for missing data in CEA RSS Conference, 6 Sept 2017 5 / 15
To be or not to be (Bayesians)?...
• In general, can represent a joint distribution as a conditional regression
p(e, c) = p(e)p(e)p(c | e)p(c | e) = p(c)p(e | c)
ci
φicψc
µc[. . .]
ei
φie
ψe
µe [. . .]β
Conditional model for cMarginal model for e
ei ∼ p(e | φei,ψe)
g(φei) = µe [+ . . .]
φei = locationψe = ancillary
φci = locationψc = ancillary
φei = marginal meanψe = marginal variance
φci = conditional meanψc = conditional variance
φei = marginal meanψe = marginal precision
φci = conditional meanψc = rateφciψc = shape
ci ∼ p(c | e, φci,ψc)
g(φci) = µc + β(ei − µe) [+ . . .]
• For example:
ei ∼ Beta(φeiψe, (1− φei)ψe), logit(φei) = µe [+ . . . ]
ci | ei ∼ Gamma(ψcφci, ψc), log(φci) = µc + β(ei − µe) [+ . . .]
• Combining “modules” and fully characterising uncertainty about deterministicfunctions of random quantities is relatively straightforward using MCMC
• Prior information can help stabilise inference (especially with sparse data!), eg– Cancer patients are unlikely to survive as long as the general population– ORs are unlikely to be greater than ±5
Gianluca Baio (UCL) Bayesian models for missing data in CEA RSS Conference, 6 Sept 2017 5 / 15
Dealing with missing data — selection models
MCAR (e, c)
ci
φicψc
µcxci
ei
φie ψe
µe xei
mei
πei
γe
mci
πci
γc
Model of analysis for (c, e) Model of missingness for eModel of missingness for c
Partially observed dataUnobservable parametersDeterministic function of random quantitiesFully observed, unmodelled dataFully observed, modelled data
• mei ∼ Bernoulli(πei); logit(πei) = γe0
• mci ∼ Bernoulli(πci); logit(πci) = γc0
Gianluca Baio (UCL) Bayesian models for missing data in CEA RSS Conference, 6 Sept 2017 6 / 15
Dealing with missing data — selection models
MAR (e, c)
ci
φicψc
µcxci
ei
φie ψe
µe xei
mei
πei
γe
mci
πci
γc
Model of analysis for (c, e) Model of missingness for eModel of missingness for c
Partially observed dataUnobservable parametersDeterministic function of random quantitiesFully observed, unmodelled dataFully observed, modelled data
• mei ∼ Bernoulli(πei); logit(πei) = γe0 +∑Kk=1 γekxeik
• mci ∼ Bernoulli(πci); logit(πci) = γc0 +∑Hh=1 γchxcih
Gianluca Baio (UCL) Bayesian models for missing data in CEA RSS Conference, 6 Sept 2017 6 / 15
Dealing with missing data — selection models
MNAR (e, c)
ci
φicψc
µcxci
ei
φie ψe
µe xei
mei
πei
γe
mci
πci
γc
Model of analysis for (c, e) Model of missingness for eModel of missingness for c
Partially observed dataUnobservable parametersDeterministic function of random quantitiesFully observed, unmodelled dataFully observed, modelled data
• mei ∼ Bernoulli(πei); logit(πei) = γe0 +∑Kk=1 γekxeik + γeK+1ei
• mci ∼ Bernoulli(πci); logit(πci) = γc0 +∑Hh=1 γchxcih + γcH+1ci
Gianluca Baio (UCL) Bayesian models for missing data in CEA RSS Conference, 6 Sept 2017 6 / 15
Dealing with missing data — selection models
MNAR e; MAR c
ci
φicψc
µcxci
ei
φie ψe
µe xei
mei
πei
γe
mci
πci
γc
Model of analysis for (c, e) Model of missingness for eModel of missingness for c
Partially observed dataUnobservable parametersDeterministic function of random quantitiesFully observed, unmodelled dataFully observed, modelled data
• mei ∼ Bernoulli(πei); logit(πei) = γe0 +∑Kk=1 γekxeik + γeK+1ei
• mci ∼ Bernoulli(πci); logit(πci) = γc0 +∑Hh=1 γchxcih
Gianluca Baio (UCL) Bayesian models for missing data in CEA RSS Conference, 6 Sept 2017 6 / 15
Dealing with missing data — selection models
MAR e; MNAR c
ci
φicψc
µcxci
ei
φie ψe
µe xei
mei
πei
γe
mci
πci
γc
Model of analysis for (c, e) Model of missingness for eModel of missingness for c
Partially observed dataUnobservable parametersDeterministic function of random quantitiesFully observed, unmodelled dataFully observed, modelled data
• mei ∼ Bernoulli(πei); logit(πei) = γe0 +∑Kk=1 γekxeik
• mci ∼ Bernoulli(πci); logit(πci) = γc0 +∑Hh=1 γchxcih + γcH+1ci
Gianluca Baio (UCL) Bayesian models for missing data in CEA RSS Conference, 6 Sept 2017 6 / 15
Motivating example: MenSS trial Partially observed data
• The MenSS pilot RCT evaluates the cost-effectiveness of a new digital interventionto reduce the incidence of STI in young men with respect to the SOC
– QALYs calculated from utilities (EQ-5D 3L)– Total costs calculated from different components (no baseline)
Time Type of outcome observed (%) observed (%)Control (n1=75) Intervention (n2=84)
Baseline utilities 72 (96%) 72 (86%)3 months utilities and costs 34 (45%) 23 (27%)6 months utilities and costs 35 (47%) 23 (27%)
12 months utilities and costs 43 (57%) 36 (43%)Complete cases utilities and costs 27 (44%) 19 (23%)
QALYs
Fre
quen
cy
0.5 0.6 0.7 0.8 0.9 1.0
02
46
810 n1 = 27
mean = 0.904 median = 0.930
costs
Fre
quen
cy
0 200 400 600 800 1000
02
46
810 n1 = 27
mean = 186 median = 176
QALYs
Fre
quen
cy
0.5 0.6 0.7 0.8 0.9 1.0
02
46
810 n2 = 19
mean = 0.902 median = 0.940
costs
Fre
quen
cy
0 100 200 300 400 500
02
46
810 n2 = 19
mean = 208 median = 123
Gianluca Baio (UCL) Bayesian models for missing data in CEA RSS Conference, 6 Sept 2017 7 / 15
Motivating example: MenSS trial Skewness & “structural values”
• The MenSS pilot RCT evaluates the cost-effectiveness of a new digital interventionto reduce the incidence of STI in young men with respect to the SOC
– QALYs calculated from utilities (EQ-5D 3L)– Total costs calculated from different components (no baseline)
QALYs
Fre
quen
cy
0.5 0.6 0.7 0.8 0.9 1.0
02
46
810 n1 = 27
mean = 0.904 median = 0.930
costs
Fre
quen
cy
0 200 400 600 800 1000
02
46
810 n1 = 27
mean = 186 median = 176
QALYs
Fre
quen
cy
0.5 0.6 0.7 0.8 0.9 1.0
02
46
810 n2 = 19
mean = 0.902 median = 0.940
costs
Fre
quen
cy
0 100 200 300 400 500
02
46
810 n2 = 19
mean = 208 median = 123
Gianluca Baio (UCL) Bayesian models for missing data in CEA RSS Conference, 6 Sept 2017 7 / 15
Modelling
1 Bivariate Normal– Simpler and closer to “standard” frequentist model– Account for correlation between QALYs and costs
2 Beta-Gamma– Account for– Model the relevant ranges: QALYs ∈ (0, 1) and costs ∈ (0,∞)– But: needs to rescale observed data e∗it = (eit − ε) to avoid spikes at 1
3 Hurdle model– Model eit as a mixture to account for correlation between outcomes, model the
relevant ranges and account for structural values– May expand to account for partially observed baseline utility u0it
cit
φictψct
µct βt
eit
φiet
ψet
µet u∗0it αtMarginal model for e
eit ∼ Normal(φeit, ψet)
φeit = µet + αt(u0it − u0t)
φeit = µet + αtu∗0it
Conditional model for c | ecit | eit ∼ Normal(φcit, ψct)
φcit = µct + βt(eit − µet)
Gianluca Baio (UCL) Bayesian models for missing data in CEA RSS Conference, 6 Sept 2017 8 / 15
Modelling
1 Bivariate Normal– Simpler and closer to “standard” frequentist model– Account for correlation between QALYs and costs
2 Beta-Gamma– Account for correlation between outcomes– Model the relevant ranges: QALYs ∈ (0, 1) and costs ∈ (0,∞)– But: needs to rescale observed data e∗it = (eit − ε) to avoid spikes at 1
3 Hurdle model– Model eit as a mixture to account for correlation between outcomes, model the
relevant ranges and account for structural values– May expand to account for partially observed baseline utility u0it
cit
φictψct
µct βt
e∗it
φiet
ψet
µet u∗0it αt
Marginal model for e∗
e∗it ∼ Beta (φeitψet, (1− φeit)ψet)logit(φeit) = µet + αt(u0it − u0t)
φeit = µet + αtu∗0it
Conditional model for c | e∗
cit | e∗it ∼ Gamma(ψctφcit, ψct)
log(φcit) = µct + βt(e∗it − µet)
Gianluca Baio (UCL) Bayesian models for missing data in CEA RSS Conference, 6 Sept 2017 8 / 15
Modelling
1 Bivariate Normal– Simpler and closer to “standard” frequentist model– Account for correlation between QALYs and costs
2 Beta-Gamma– Account for– Model the relevant ranges: QALYs ∈ (0, 1) and costs ∈ (0,∞)– But: needs to rescale observed data e∗it = (eit − ε) to avoid spikes at 1
3 Hurdle model– Model eit as a mixture to account for correlation between outcomes, model the
relevant ranges and account for structural values– May expand to account for partially observed baseline utility u0it
cit
φictψct
µct βt
e<1it
φiet ψet
µ<1et
u∗0it αt
e1it
e∗itπit
ditXit ηt
µet
Model for the structural ones
dit := I(eit = 1) ∼ Bernoulli(πit)
logit(πit) = Xitηt
Mixture model for e
e1it := 1
e<1it ∼ Beta (φeitψet, (1− φeit)ψet)
logit(φeit) = µ<1et + αt(u0it − u0t)
logit(φeit) = µ<1et + αtu
∗0it
e∗it = πite1it + (1− πit)e<1
it
µet = (1− πt)µ<1et + πt
Conditional model for c | e∗
cit | e∗it ∼ Gamma(ψctφcit, ψct)
log(φcit) = µct + βt(e∗it − µet)
Gianluca Baio (UCL) Bayesian models for missing data in CEA RSS Conference, 6 Sept 2017 8 / 15
Results — estimation of the main parameters (CCA + MAR)
QALYs (Control)0.80 0.85 0.90 0.95 1.00
●
●
●
●
●
●
QALYs (Intervention)0.80 0.85 0.90 0.95 1.00
●
●
●
●
●
●
Costs (Control)100 200 300 400 500
●
●
●
●
●
●
Costs (Intervention)100 200 300 400 500
●
●
●
●
●
●
Bivariate Normal
Beta-Gamma
Hurdle model
µe1 µe2 µc1 µc2
—•— Fully observed—•— All (MAR)
Probability of structural ones0.1 0.2 0.3 0.4 0.5
●
●
●
●
Mean QALY (non−ones)0.75 0.80 0.85 0.90
●
●
●
●
Intervention (t=2)
Control (t=1)
πt µ<1et
Gianluca Baio (UCL) Bayesian models for missing data in CEA RSS Conference, 6 Sept 2017 9 / 15
Bayesian multiple imputation (under MAR)
QAL
Y va
lues
0.2
0.4
0.6
0.8
1.0
●
●
●
●
●
●
●
●●●
●
●●
●
●
●
●●
●
●●
●
●
●
● ●●
●
●
●
● ● ●●
●
●
●
●
●●
●
●
●
●●
●
●
●
● ● ●
QAL
Y va
lues
0.2
0.4
0.6
0.8
1.0
●
●
●
●
●
●● ●●
●
●
●
●●●
●
●
●●
●
●●
●
●
●
●
●●
●
●● ●● ●●
●●
●
●
●
●
●
● ●
●
●
●
●
●
●
●● ●●
●
●
●
●
●●
●
●●
●
●
● ●● ●● ● ● ● ● ● ● ● ●Q
ALY
valu
es
0.2
0.4
0.6
0.8
1.0
●
●
●
●
●
●
●
●●●
●
●●
●
●
●
●●
●
●●
●
●
●
● ●●
●
●
●
● ● ●●
●
●
●
●
●●
●
●
●
●●
●
●
●
● ● ●
QAL
Y va
lues
0.2
0.4
0.6
0.8
1.0
●
●
●
●
●●● ●●
●
●
●
●●●
●
●
●●
●
●●●
●
●
●
●●
●
●● ●● ●●
● ●
●
●
●
●
●
● ●
●
●
●
●
●
●
●● ●●
●
●
●
●
●●
●
●●
●
●
● ●● ●● ● ● ● ● ● ● ● ●
QAL
Y va
lues
0.2
0.4
0.6
0.8
1.0
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●●
●
●
●
●
●
●
●● ●●
QAL
Y va
lues
0.2
0.4
0.6
0.8
1.0
●
●
●
●
●●● ●●
●
●●
●
●
●
●
●
●
●●
● ●●
●
●
●
●
●
●●
●
●
● ●●
● ●
●
●
●
●
●
● ●
●
●
●
●
●
●
●● ●●
●
●●
●
●●
●
●
●
●
●● ●●
●
●
●●
● ● ● ●
●
●
Bivariate Normal
Individuals (n1 = 75) Individuals (n2 = 84)
Beta-Gamma
Individuals (n1 = 75) Individuals (n2 = 84)
Hurdle model
Individuals (n1 = 75) Individuals (n2 = 84)
—•— Imputed, observed baseline—•— Imputed, missing baseline× Observed
Gianluca Baio (UCL) Bayesian models for missing data in CEA RSS Conference, 6 Sept 2017 10 / 15
MNAR
• We observe n01 = 13 and n02 = 22 individuals with u0it = 1 and ujit = NA, forj > 1
• For those individuals, we cannot compute directly the structural one indicator ditand so need to make assumptions/model this
– Sensitivity analysis to alternative MNAR departures from MAR
MNAR1. Set dit = 1 for all individuals with unit observed baseline utility
MNAR2. Set dit = 0 for all individuals with unit observed baseline utility
MNAR3. Set dit = 1 for the n01 = 13 individuals with u0i1 = 1 and dit = 0 for then02 = 22 individuals with u0i2 = 1
MNAR4. Set dit = 0 for the n01 = 13 individuals with u0i1 = 1 and dit = 1 for then02 = 22 individuals with u0i2 = 1
Gianluca Baio (UCL) Bayesian models for missing data in CEA RSS Conference, 6 Sept 2017 11 / 15
Results — MNAR
Probability of structural ones0.0 0.1 0.2 0.3 0.4 0.5 0.6
MAR
MNAR1
MNAR2
MNAR3
MNAR4
●
●
●
●
●
●
●
●
●
●
Mean QALY0.75 0.80 0.85 0.90 0.95
MAR
MNAR1
MNAR2
MNAR3
MNAR4
●
●
●
●
●
●
●
●
●
●
πt µet
—•— Control (t = 1)—•— Intervention(t = 2)
Gianluca Baio (UCL) Bayesian models for missing data in CEA RSS Conference, 6 Sept 2017 12 / 15
Cost-effectiveness analysis
Gianluca Baio (UCL) Bayesian models for missing data in CEA RSS Conference, 6 Sept 2017 13 / 15
Conclusions
• A full Bayesian approach to handling missing data extends standard“imputation methods”
– Can consider MAR and MNAR with relatively little expansion to the basic model
• Particularly helpful in cost-effectiveness analysis, to account for– Asymmetrical distributions for the main outcomes– Correlation between costs & benefits– Structural values (eg spikes at 1 for utilities or spikes at 0 for costs)
• Need specialised software + coding skills– R package missingHE under development to implement a set of general models– Preliminary work available at https://github.com/giabaio/missingHE
– Eventually, will be able to combine with existing packages (eg BCEA:http://www.statistica.it/gianluca/BCEA; https://github.com/giabaio/BCEA) toperform the whole economic analysis
Gianluca Baio (UCL) Bayesian models for missing data in CEA RSS Conference, 6 Sept 2017 14 / 15
Thank you!
Gianluca Baio (UCL) Bayesian models for missing data in CEA RSS Conference, 6 Sept 2017 15 / 15
Recommended