Flexible Accelerated Failure Time Frailty Models for Multivariate Interval-Censored Data with an Application in Caries Research
Emmanuel Lesaffre
joint work with Arnošt Komárek and Dominique Declerck
Department of Biostatistics, Erasmus Medical Center, Rotterdam, the Netherlands
L-Biostat, KU Leuven, Leuven, Belgium
Hacettepe University
December 2012
Emmanuel Lesaffre (ERASMUS and KUL) AFT models for multivariate IC data Ankara, December 2012 1 / 53
Outline
1 Signal Tandmobiel® (STM) Study
2 Suggested statistical model
3 The Mixed-Effects AFT regression model
4 Mixed-Effects AFT model = linear mixed model
5 Analysis of STM data - Doubly Interval Censoring
6 Discussion
Signal Tandmobiel® (STM) Study
  Description study
  Research Questions
  Interval censoring
  Clustering & dependence
Signal Tandmobiel® Study
Longitudinal dental study
Flanders (Dutch speaking part of Belgium) 1996 – 2001
4 468 children followed from 7 till 12 years of age
Annual (pre-scheduled) dental examinations
Caries times recorded for deciduous teeth
Emergence and caries times recorded for permanent teeth
Many other covariates collected, e.g.:
  Status of primary teeth (dmft score)
  Frequency of brushing
  Amount of plaque
  Presence of sealants
Research Questions
Primary Research Question:
  Influence of caries status of deciduous second molars (teeth 55, 65, 75, 85) on caries susceptibility of the adjacent permanent first molars (teeth 16, 26, 36, 46)
Other Research Questions:
  Impact of other covariates (sealants, gender, etc.) on caries status of permanent molars
  Left-right symmetry, maxilla-mandible difference
  Identify periods of high risk for caries
taking into account time at “risk”.
Deciduous & Permanent Teeth
Transition from deciduous to permanent teeth
From 7 to 12 years of age children have mixed dentition.
Research Questions involve
Regression model for emergence times
Regression model for times to caries (from birth)
Better: modelling time to caries given period at risk ⇒ jointly modelling emergence/caries times
In a multivariate sense (4 molars jointly)
Research questions also involve:
1 Interval censoring (emergence time & time to caries)
  Doubly interval censoring (time to caries, given period at risk)
2 Clustering & dependence
Emergence Times (similar for time to caries)
Response of main interest: emergence time U
We only know uL < U ≤ uU: interval censoring
uL = last dental examination at which the tooth had not yet emerged
uU = first dental examination at which the tooth had emerged
For uL = 0 ⇒ left censoring
For uU → ∞ ⇒ right censoring
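As a rough illustration of how such an interval could be derived from examination records, here is a minimal sketch. The input format (a list of examination ages plus a list of emerged/not-emerged flags) and the function names `censoring_interval` and `censoring_type` are assumptions made for this example, not part of the STM data set.

```python
import math

def censoring_interval(exam_ages, emerged):
    """Return (u_L, u_U) such that u_L < U <= u_U.

    exam_ages: examination ages in increasing order (hypothetical input format).
    emerged:   booleans, True if the tooth was seen emerged at that examination.
    """
    u_L, u_U = 0.0, math.inf
    for age, status in zip(exam_ages, emerged):
        if status:
            u_U = age      # first examination with the tooth emerged
            break
        u_L = age          # last examination with the tooth not yet emerged
    return u_L, u_U

def censoring_type(u_L, u_U):
    """Classify the interval following the conventions on the slide."""
    if u_L == 0.0 and math.isinf(u_U):
        return "no information"
    if u_L == 0.0:
        return "left"
    if math.isinf(u_U):
        return "right"
    return "interval"
```

A tooth already emerged at the first visit gives a left-censored time, and a tooth still not emerged at the last visit gives a right-censored time.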
1 Clustering
  Teeth of a single child
    share the same environment
    share the same dietary and brushing habits
  ⇒ Marginal dependence but possibly conditional independence
2 (Conditional) Dependence (especially for time to caries)
  Spatial structure of the mouth
    adjacent teeth (not applicable here)
    vertically opposite teeth occluding temporarily(?) (ignored here)
Suggested statistical model
‘Possible’ statistical model
What characteristics should the statistical model have?
Applicable to interval-censored data
Allowing for extensions to multivariate outcomes
Allowing for correction with respect to covariates
Allowing for clustering
Computationally feasible, while minimizing parametric assumptions
Suggestion: Flexible Mixed-Effects Accelerated Failure Time Regression Model
The Mixed-Effects AFT regression model
  Notation
  Comparison with (frailty) PH model
Notation
$U_{i,l}$: event time of the $l$th unit of the $i$th cluster ($i = 1, \ldots, N$; $l = 1, \ldots, n_i$)
$x_{i,l}$: covariates to explain $U_{i,l}$
$\beta$: regression parameters (fixed effects)
$b_i$: cluster-specific random effect, $b_i \overset{\text{i.i.d.}}{\sim} g_b(b)$
Survival function: $S(u \mid x_{i,l}, \beta, b_i)$
Hazard function: $\hbar(u \mid x_{i,l}, \beta, b_i)$
PH and AFT Model
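Cox PH model
$\hbar(u \mid x_{i,l}, \beta) = \exp(\beta' x_{i,l})\, \hbar_0(u)$
$S(u \mid x_{i,l}, \beta) = S_0(u)^{\exp(\beta' x_{i,l})}$
AFT model
$\hbar(u \mid x_{i,l}, \beta) = \exp(-\beta' x_{i,l})\, \hbar_0\{u \exp(-\beta' x_{i,l})\}$
$S(u \mid x_{i,l}, \beta) = S_0\{u \exp(-\beta' x_{i,l})\}$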
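To make the contrast concrete, the sketch below evaluates both hazard formulas for a single covariate. The log-logistic baseline hazard is an illustrative assumption (the slides do not fix a baseline), and the function names are hypothetical.

```python
import math

def baseline_hazard(u, shape=2.0):
    """Log-logistic baseline hazard with unit scale (illustrative assumption)."""
    return shape * u ** (shape - 1) / (1.0 + u ** shape)

def ph_hazard(u, x, beta):
    """PH model: the covariate multiplies the baseline hazard."""
    return math.exp(beta * x) * baseline_hazard(u)

def aft_hazard(u, x, beta):
    """AFT model: the covariate rescales the time axis."""
    c = math.exp(-beta * x)          # time-scale factor exp(-beta * x)
    return c * baseline_hazard(u * c)

def hazard_ratio(hazard, u, x, beta):
    """Hazard at covariate value x relative to x = 0, evaluated at time u."""
    return hazard(u, x, beta) / hazard(u, 0.0, beta)
```

Under the PH model the ratio returned by `hazard_ratio` is the same at every time point, whereas under the AFT model with this baseline it changes with `u`, because the covariate stretches or compresses time rather than scaling the hazard.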
Graphical Representation
[Figure: hazard(t) plotted against t (0 to 5) for x = 0, 0.6 and 1.2, in two panels: AFT model with β = 0.5 and PH model with β = −0.5]
Frailty PH and Mixed-Effects AFT Model
Frailty Cox PH model
$\hbar(u \mid x_{i,l}, \beta, b_i) = \exp(\beta' x_{i,l} + b_i)\, \hbar_0(u)$
$S(u \mid x_{i,l}, \beta, b_i) = S_0(u)^{\exp(\beta' x_{i,l} + b_i)}$
Mixed-Effects AFT model
$\hbar(u \mid x_{i,l}, \beta, b_i) = \exp(-\beta' x_{i,l} - b_i)\, \hbar_0\{u \exp(-\beta' x_{i,l} - b_i)\}$
$S(u \mid x_{i,l}, \beta, b_i) = S_0\{u \exp(-\beta' x_{i,l} - b_i)\}$
Robustness properties of (Mixed-Effects) AFT Model
Robust against uncorrelated omitted covariates
Inference for the fixed effects $\beta$ does not (heavily) depend on the chosen distribution $g_b$ for the random effects
Marginal survival distribution (after integrating $b_i$ out) is of the same form as the conditional survival distribution (given $b_i$)
Above robustness properties (generally) not true for (frailty) Cox PH model
Mixed-Effects AFT model = linear mixed model
  Linear Mixed-Effects model with interval-censored response
  Distributional assumptions
  Flexible distributions
Mixed-effects AFT Model = Linear Mixed Model
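$\hbar(u \mid x_{i,l}, \beta, b_i) = \exp(-\beta' x_{i,l} - b_i)\, \hbar_0\{u \exp(-\beta' x_{i,l} - b_i)\}$
$S(u \mid x_{i,l}, \beta, b_i) = S_0\{u \exp(-\beta' x_{i,l} - b_i)\}$
Linear mixed model with interval-censored response
$\log(U_{i,l}) = Y_{i,l} = \beta' x_{i,l} + b_i + \varepsilon_{i,l}$ ($i = 1, \ldots, N$; $l = 1, \ldots, n_i$)
with
  $\beta$ = fixed effects
  $b_i$ = random effect, $b_i \sim g_b(\cdot)$
  $\varepsilon_{i,l}$ = error random variable, $\varepsilon_{i,l} \sim g_\varepsilon(\cdot)$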
Distributional Assumptions
Distribution of the error terms $\varepsilon_{i,l}$ has to be specified
  ⇒ Which parametric density $g_\varepsilon(\varepsilon)$?
Distribution of the random effects $b_i$ has to be specified
  ⇒ Which parametric density $g_b(b)$?
Parametric assumptions influence the shape of hazard and survivor curves
  Needed for prediction & detecting periods of high risk
  ⇒ flexible model for $g_\varepsilon(\varepsilon)$ and $g_b(b)$ is needed
Generic model (error & random part)
$Y = \alpha + \tau Y^*$
$Y^* \sim \sum_{j=-K}^{K} w_j\, N(\mu_j, \sigma^2)$
Fixed:
  Number of components $2K + 1$
  Mixture means (knots) $\mu_j$ on grid
  Mixture variance $\sigma^2$
To estimate:
  Weights $w = (w_{-K}, \ldots, w_K)^T$, with $\sum_{j=-K}^{K} w_j = 1$
  Reparametrisation: $w_j = \exp(a_j) / \sum_{k=-K}^{K} \exp(a_k)$
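A minimal sketch of this construction follows: the weights come from unconstrained coefficients a through the reparametrisation, and the density of Y is a weighted sum of normal densities on a fixed grid. The grid spacing, the basis standard deviation and the example coefficients are hypothetical values chosen for illustration.

```python
import math

def softmax_weights(a):
    """w_j = exp(a_j) / sum_k exp(a_k); shifts by max(a) for numerical stability."""
    m = max(a)
    e = [math.exp(aj - m) for aj in a]
    total = sum(e)
    return [ej / total for ej in e]

def normal_pdf(y, mu, sigma):
    """Density of N(mu, sigma^2) at y."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_density(y, a, knots, sigma, alpha=0.0, tau=1.0):
    """Density of Y = alpha + tau * Y*, where Y* ~ sum_j w_j N(knots[j], sigma^2)."""
    w = softmax_weights(a)
    y_star = (y - alpha) / tau
    return sum(wj * normal_pdf(y_star, mu, sigma) for wj, mu in zip(w, knots)) / tau

K = 15
knots = [0.3 * j for j in range(-K, K + 1)]   # hypothetical equidistant grid, 2K + 1 knots
a = [-0.5 * mu * mu for mu in knots]          # hypothetical coefficients (bell-shaped weights)
```

Working with the a-coefficients removes the positivity and sum-to-one constraints on the weights, so they can be estimated without constrained optimisation.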
Grid of Normal Components
[Figure: grid of normal densities; horizontal axis y* from −3 to 3, vertical axis g*(y*)]
Grid of Weighted Normal Components
[Figure: grid of weighted normal densities; horizontal axis y* from −3 to 3, vertical axis g*(y*)]
Each curve multiplied by some $w_j \in (0, 1)$
Mixture of Weighted Normal Components
[Figure: resulting mixture density; horizontal axis y* from −3 to 3, vertical axis g*(y*)]
Weighted sum of the curves using weights $w_j$
Normal Mixture - Properties
Properties:
  Only mixture weights are assumed to be unknown
  Number of mixture components K fixed and relatively high (≈ 30)
  Mixture means: fixed equidistant grid of knots
  Mixture variances: fixed and same for all mixture components
Advantages:
  Flexibility
  Standard theories apply
Disadvantages:
  Relatively high number of parameters must be estimated
  Potential overfitting and identifiability problems
Penalized Normal Mixture
To avoid overfitting & identifiability problems:
Put a penalty using the $k$th-order difference $\Delta^k a_j$:
$\Delta^1 a_j = a_j - a_{j-1}$, $\Delta^k a_j = \Delta^{k-1} a_j - \Delta^{k-1} a_{j-1}$
Maximize penalized log-likelihood w.r.t. $\theta$ (= regression and model parameters)
Penalized approach:
Penalized log-likelihood:
  Log-likelihood: $\log L(\theta; Y)$
  Penalty term: $\sum_{j=k+1}^{J} (\Delta^k a_j)^2$
  Penalized log-lik: $\log L_P(\theta; Y) = \log L(\theta; Y) - \frac{\lambda}{2} \sum_{j=k+1}^{J} (\Delta^k a_j)^2$
Maximize $\log L_P(\theta; Y)$ w.r.t. $\theta$ for a given $\lambda$
Choose optimal $\lambda$ by maximizing (minimizing) AIC
In Bayesian approach ⇔ difference prior on the a-coefficients (approach followed here)
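The penalty term can be sketched in a few lines. The function names are hypothetical, and the log-likelihood value is taken as given, since computing it requires the full model.

```python
def differences(a, order):
    """Apply the first-order difference a_j - a_{j-1} `order` times."""
    d = list(a)
    for _ in range(order):
        d = [d[j] - d[j - 1] for j in range(1, len(d))]
    return d

def penalty(a, order, lam):
    """(lambda / 2) * sum of squared order-th differences of the coefficients a."""
    return 0.5 * lam * sum(v * v for v in differences(a, order))

def penalized_loglik(loglik, a, order, lam):
    """Penalized log-likelihood for a given (precomputed) log-likelihood value."""
    return loglik - penalty(a, order, lam)
```

For example, a second-order penalty is zero when the coefficients lie on a straight line, so larger values of λ pull the coefficient sequence towards a linear pattern.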
Analysis of STM data - Doubly Interval Censoring
  STM Study: Time to Caries
  Reg model for clustered doubly IC data
  Application to STM Study
Research Questions
Of interest to dental researchers is:
Effect of covariates (gender, caries on adjacent 2nd deciduous molar, frequency of brushing, amount of plaque, presence of sealants) on the time to caries of the permanent first molars (16, 26, 36, 46)
Response of main interest: time at risk T = time to caries (V) − emergence time (U)
Both U and V are interval-censored
⇒ Involves doubly-interval censoring
Doubly Interval Censoring
[Figure: timeline of examinations; the emergence time U falls between examinations uL and uU, the caries time V between examinations vL and vU, and the time to caries T runs from U to V]
Warning
For modelling, the distributions of both U and V need to be modelled. It is not correct to model only the distribution of T as an interval-censored observation (De Gruttola & Lagakos, 1989).
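One way to see why both distributions are needed: the likelihood contribution of a doubly interval-censored observation integrates over the unobserved emergence time. The sketch below does this numerically under illustrative assumptions that are not taken from the slides: U and T are independent and lognormal, and a simple midpoint rule is used. All function and parameter names are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def lognormal_cdf(t, mu, sigma):
    """CDF of a lognormal variable; zero for non-positive t."""
    if t <= 0.0:
        return 0.0
    return normal_cdf((math.log(t) - mu) / sigma)

def lognormal_pdf(t, mu, sigma):
    """Density of a lognormal variable at t > 0."""
    z = (math.log(t) - mu) / sigma
    return math.exp(-0.5 * z * z) / (t * sigma * math.sqrt(2.0 * math.pi))

def dic_contribution(u_L, u_U, v_L, v_U, mu_u, sd_u, mu_t, sd_t, n=2000):
    """P(u_L < U <= u_U, v_L < V <= v_U) with V = U + T, U and T independent.

    Integrates over the unobserved emergence time U (midpoint rule).
    Requires 0 <= u_L < u_U < inf; v_U may be math.inf (caries right-censored).
    """
    h = (u_U - u_L) / n
    total = 0.0
    for i in range(n):
        u = u_L + (i + 0.5) * h
        upper = 1.0 if math.isinf(v_U) else lognormal_cdf(v_U - u, mu_t, sd_t)
        lower = lognormal_cdf(v_L - u, mu_t, sd_t)
        total += lognormal_pdf(u, mu_u, sd_u) * (upper - lower) * h
    return total
```

The integrand depends on the density of U inside (uL, uU], which is why replacing the observation by a single interval for T discards information that the likelihood needs.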
Notation
$U_{i,l}$: emergence time of the $l$th unit of the $i$th cluster
$x^u_{i,l}$: covariates to explain $U_{i,l}$
AND
$T_{i,l}$: time from emergence to caries of the $l$th unit of the $i$th cluster
$x^t_{i,l}$: covariates to explain $T_{i,l}$
Cluster-Specific AFT Model for D-I-C Data
Model for emergence time
$\log(U_{i,l}) = Y_{i,l} = \delta' x^u_{i,l} + d_i + \zeta_{i,l}$
($i = 1, \ldots, N$; $l = 1, \ldots, n_i$)
Model for time to caries
$\log(T_{i,l}) = \log(V_{i,l} - U_{i,l}) = \beta' x^t_{i,l} + b_i + \varepsilon_{i,l}$
($i = 1, \ldots, N$; $l = 1, \ldots, n_i$)
Distributions
Random effects distributions = $g_d(d)$ (emergence) and $g_b(b)$ (caries)
Error distributions = $g_\zeta(\zeta)$ (emergence) and $g_\varepsilon(\varepsilon)$ (caries)
Model all distributions in a FLEXIBLE manner JOINTLY
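A small simulation can show how the two AFT equations generate doubly interval-censored observations. The sketch below is only an illustration under strong assumptions: intercept-only linear predictors, plain normal distributions in place of the flexible ones, and made-up parameter values and examination ages. All names are hypothetical.

```python
import math
import random


def censor(x, exam_ages):
    """Return the interval (lower, upper] between examinations that contains x."""
    lower = 0.0
    for age in exam_ages:
        if x <= age:
            return (lower, age)
        lower = age
    return (lower, math.inf)  # event after the last examination


def simulate_cluster(rng, n_teeth, delta0=math.log(6.5), beta0=math.log(4.0),
                     sd_d=0.05, sd_zeta=0.05, sd_b=0.5, sd_eps=0.5,
                     exam_ages=(7, 8, 9, 10, 11, 12)):
    """Simulate one child (cluster) with n_teeth teeth.

    Assumption: normal random effects and errors, intercepts only.
    """
    d = rng.gauss(0.0, sd_d)  # child-level effect on emergence
    b = rng.gauss(0.0, sd_b)  # child-level effect on time to caries
    teeth = []
    for _ in range(n_teeth):
        u = math.exp(delta0 + d + rng.gauss(0.0, sd_zeta))  # emergence time U
        t = math.exp(beta0 + b + rng.gauss(0.0, sd_eps))    # time at risk T
        v = u + t                                           # caries time V
        teeth.append({"u": u, "t": t, "v": v,
                      "u_interval": censor(u, exam_ages),
                      "v_interval": censor(v, exam_ages)})
    return teeth
```

Only `u_interval` and `v_interval` would be available to the analyst; `u`, `t` and `v` are the latent quantities.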
Assumptions
Some simplifications were necessary (and reasonable). Given the covariates:
Independence of $(b_i, \varepsilon_{i,1}, \ldots, \varepsilon_{i,n_i})'$ and $(d_i, \zeta_{i,1}, \ldots, \zeta_{i,n_i})'$
⇒ Time-at-risk $T_i$ is independent of the emergence time $U_i$ (given covariates)
Independence of $b_i$ and $d_i$
⇒ Whether a child is an early emerger is independent of whether a child is more or less susceptible to caries
Independence of $\varepsilon_{i,l}$ and $\zeta_{i,l}$
⇒ Whether a specific tooth emerges early or late is independent of whether that tooth is more or less susceptible to caries
Likelihood Contribution of the i th Cluster
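$$L_i = \int_{\mathbb{R}} \int_{\mathbb{R}} \Bigg[ \prod_{l=1}^{n_i} \int_{u^L_{i,l}}^{u^U_{i,l}} \Bigg\{ \int_{v^L_{i,l} - u_{i,l}}^{v^U_{i,l} - u_{i,l}} p(t_{i,l} \mid b_i)\, dt_{i,l} \Bigg\}\, p(u_{i,l} \mid d_i)\, du_{i,l} \Bigg]\, p(b_i)\, p(d_i)\, db_i\, dd_i$$
$p(t_{i,l} \mid b_i) = t_{i,l}^{-1}\, g_\varepsilon\{\log(t_{i,l}) - \beta' x^t_{i,l} - b_i\}$
$p(u_{i,l} \mid d_i) = u_{i,l}^{-1}\, g_\zeta\{\log(u_{i,l}) - \delta' x^u_{i,l} - d_i\}$
$p(b_i) = g_b(b_i)$
$p(d_i) = g_d(d_i)$
The densities $g$ are shifted and scaled normal mixtures
⇒ Maximum-likelihood may be quite difficult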
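To see why this likelihood is awkward, it can help to evaluate the bracketed term for a single tooth with the random effects held fixed. The sketch below does this numerically under stated assumptions: single normal densities stand in for the normal mixtures, the linear predictors are passed in as plain numbers (`eta_u`, `eta_t`), and the outer integral over u uses a midpoint rule. The function and argument names are hypothetical.

```python
import math
from statistics import NormalDist


def tooth_likelihood(u_lo, u_hi, v_lo, v_hi, eta_u, eta_t, d, b,
                     g_zeta=NormalDist(0.0, 0.1), g_eps=NormalDist(0.0, 1.0),
                     n_grid=400):
    """Contribution of one tooth given the random effects d and b.

    Assumption: g_zeta and g_eps are normal; the talk uses normal mixtures.
    """
    def cdf_t(upper):
        # P(T <= upper | b) on the log scale; zero for non-positive bounds.
        if upper <= 0.0:
            return 0.0
        if math.isinf(upper):
            return 1.0
        return g_eps.cdf(math.log(upper) - eta_t - b)

    h = (u_hi - u_lo) / n_grid
    total = 0.0
    for k in range(n_grid):
        u = u_lo + (k + 0.5) * h                       # midpoint of sub-interval
        inner = cdf_t(v_hi - u) - cdf_t(v_lo - u)      # inner integral over t
        dens_u = g_zeta.pdf(math.log(u) - eta_u - d) / u  # p(u | d)
        total += inner * dens_u * h
    return total
```

The full contribution $L_i$ would still require multiplying such terms over the teeth of a child and integrating over $b_i$ and $d_i$, which is the part that makes direct maximization unattractive.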
Parameter Estimation
In a Bayesian way
Natural solution to doubly interval censoring
Data-Augmentation
MCMC methodology
Does not maximize a complex likelihood
Sample latent event times together with remaining parameters
Base the inference on a sample from the posterior distribution of the model parameters
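The data-augmentation idea can be sketched for one tooth as follows. With the other parameters held fixed, the latent pair (u, t) is updated by drawing u and then t from distributions truncated to the region allowed by the observed intervals. This is a simplified illustration, not the sampler of the talk: it assumes single normal errors on the log scale (so the truncated draws are log-normal), and all names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def sample_truncated_lognormal(rng, mean_log, sd_log, lower, upper):
    """Inverse-CDF draw of X with log(X) ~ N(mean_log, sd_log^2), lower <= X <= upper."""
    dist = NormalDist(mean_log, sd_log)
    p_lo = dist.cdf(math.log(lower)) if lower > 0.0 else 0.0
    p_hi = dist.cdf(math.log(upper)) if math.isfinite(upper) else 1.0
    p = p_lo + rng.random() * (p_hi - p_lo)
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep inv_cdf inside (0, 1)
    x = math.exp(dist.inv_cdf(p))
    return min(max(x, lower), upper)     # guard against rounding in the tails


def augment_tooth(rng, u, t, bounds, mean_log_u, sd_u, mean_log_t, sd_t):
    """One update of the latent (u, t); the current (u, t) must be feasible."""
    u_lo, u_hi, v_lo, v_hi = bounds
    # u given t: u in [u_lo, u_hi] and u + t in [v_lo, v_hi]
    u = sample_truncated_lognormal(rng, mean_log_u, sd_u,
                                   max(u_lo, v_lo - t), min(u_hi, v_hi - t))
    # t given u: u + t in [v_lo, v_hi]
    t = sample_truncated_lognormal(rng, mean_log_t, sd_t,
                                   max(v_lo - u, 0.0), v_hi - u)
    return u, t
```

In a full sampler this step would alternate with updates of the regression coefficients, the random effects and the mixture parameters, each conditional on the current latent times.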
Prior Distributions
Vague normal priors for δ and β
Vague normal prior for $\alpha$ and vague gamma for $\tau^{-2}$
Markov random field prior for transformed weights $a_j$
$$p(a \mid \lambda) \propto \exp\Big[-\frac{\lambda}{2} \sum_{j=-K+s}^{K} (\Delta^s a_j)^2\Big] = \exp\Big[-\frac{\lambda}{2}\, a' D' D a\Big]$$
$\Delta^s$ = $s$th order difference operator & $D$ associated difference operator matrix
$\lambda$ = smoothing parameter
$p(\lambda)$ = vague gamma, acts as a precision parameter for $a$
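The equality between the sum of squared differences and the quadratic form $a'D'Da$ can be checked numerically. The following sketch builds the $s$th order difference matrix by repeated first differencing of the identity matrix and evaluates the log of the prior kernel. Function names are illustrative.

```python
def difference_matrix(n, s):
    """(n - s) x n matrix D such that D @ a gives the s-th order differences of a."""
    D = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(s):
        # First differences of consecutive rows raise the order by one.
        D = [[D[i + 1][j] - D[i][j] for j in range(n)] for i in range(len(D) - 1)]
    return D


def log_prior_kernel(a, lam, s):
    """log of exp[-(lam / 2) * a' D' D a], up to the normalizing constant."""
    D = difference_matrix(len(a), s)
    Da = [sum(d_ij * a_j for d_ij, a_j in zip(row, a)) for row in D]
    return -0.5 * lam * sum(x * x for x in Da)


# Coefficients on a quadratic trend: second differences are constant,
# third differences vanish, so a third-order penalty does not penalize them.
a = [0.0, 1.0, 4.0, 9.0, 16.0]
second_order = log_prior_kernel(a, lam=1.0, s=2)
third_order = log_prior_kernel(a, lam=1.0, s=3)
```

A larger $\lambda$ makes rough sequences of $a_j$ less probable a priori, which is why it acts as a smoothing parameter.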
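Directed Acyclic Graph
[Figure: directed acyclic graph of the model, with an emergence part and a caries part. Nodes: censoring_{i,l}; interval limits u^L_{i,l}, u^U_{i,l}, v^L_{i,l}, v^U_{i,l}; v_{i,l}, u_{i,l}, t_{i,l}; δ, d_i, x^u_{i,l}, ζ_{i,l}, ε_{i,l}, x^t_{i,l}, b_i, β; r^d_i, r^ζ_{i,l}, r^ε_{i,l}, r^b_i; G_d, G_ζ, G_ε, G_b. Plates: l = 1, …, n_i and i = 1, …, N.]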
MCMC Sampling
Gibbs algorithm using full conditionals
When full conditional is not a standard distribution
Slice sampling (Neal, 2003, Ann. Stat.)
Adaptive rejection sampling (Gilks & Wild, 1992, Appl. Stat.)
R-package bayesSurv (A. Komárek)
Available from CRAN at http://www.R-project.org
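For readers unfamiliar with slice sampling, a minimal univariate version with stepping-out and shrinkage, following the scheme described by Neal (2003), might look as follows in Python. This is a generic sketch and not the code used in bayesSurv; the function name, the default width `w` and the step limit `m` are assumptions.

```python
import math
import random


def slice_sample(log_f, x0, n_samples, w=1.0, m=100, rng=None):
    """Draw n_samples from the density proportional to exp(log_f(x))."""
    rng = rng or random.Random()
    x, draws = x0, []
    for _ in range(n_samples):
        # Auxiliary height: log of a uniform draw on (0, f(x)).
        log_y = log_f(x) - rng.expovariate(1.0)
        # Stepping out: random initial interval of width w around x,
        # expanded in steps of w, at most m - 1 expansions in total.
        left = x - w * rng.random()
        right = left + w
        j = int(m * rng.random())
        k = (m - 1) - j
        while j > 0 and log_f(left) > log_y:
            left -= w
            j -= 1
        while k > 0 and log_f(right) > log_y:
            right += w
            k -= 1
        # Shrinkage: propose uniformly, shrink the interval on rejection.
        while True:
            x_new = left + (right - left) * rng.random()
            if log_f(x_new) > log_y:
                x = x_new
                break
            if x_new < x:
                left = x_new
            else:
                right = x_new
        draws.append(x)
    return draws
```

Only the log density up to a constant is needed, which suits full conditionals that are known only up to proportionality.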
Analysis of STM data - Doubly Interval Censoring Application to STM Study
Model (Komárek & Lesaffre, JASA, Applications & Case studies, 2008)
Emergence
$x^u_{i,l} = (\text{gender}_i, \text{tooth26}_{i,l}, \text{tooth36}_{i,l}, \text{tooth46}_{i,l})'$
Caries
$x^t_{i,l} = (\text{gender}_i, \text{status}_{i,l}, \text{brushing}_i, \text{sealants}_{i,l}, \text{plaque}_{i,l}, \text{tooth26}_{i,l}, \text{tooth36}_{i,l}, \text{tooth46}_{i,l})'$
status: status of the adjacent 2nd deciduous molar (0 = sound, 1 = decayed/filled/missing due to caries)
brushing: frequency of brushing (0 = less than once a day, 1 = at least once a day)
sealants: presence of sealants (0 = absent, 1 = present)
plaque: presence of plaque on occlusal surfaces (0 = absent, 1 = present)
Signal-Tandmobiel® Study: Posterior Summary
              Emergence                      Caries
Parameter     Median   95% CR                Median   95% CR
Tooth         p > 0.5                        p > 0.5
  tooth 26    −0.003   (−0.013, 0.007)       −0.006   (−0.045, 0.031)
  tooth 36     0.001   (−0.008, 0.011)       −0.009   (−0.051, 0.034)
  tooth 46     0.002   (−0.008, 0.012)       −0.016   (−0.059, 0.026)
Gender        p = 0.008                      p = 0.085
  girl        −0.023   (−0.039, −0.007)      −0.071   (−0.155, 0.009)
Status                                       p < 0.001
  dmf                                        −0.140   (−0.193, −0.091)
Brushing                                     p < 0.001
  daily                                       0.337   (0.233, 0.436)
Sealants                                     p < 0.001
  present                                     0.119   (0.060, 0.178)
Plaque                                       p < 0.001
  present                                    −0.114   (−0.171, −0.067)
Median = posterior median.
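Because the model is linear in log(T), a coefficient β corresponds to multiplying the time to caries by exp(β), conditional on the random effect and the other covariates. The short sketch below applies this transformation to the posterior medians and interval limits of the caries model reported in the table. The dictionary layout and function name are illustrative choices.

```python
import math

# (posterior median, lower limit, upper limit) from the caries column of the table.
caries_effects = {
    "girl": (-0.071, -0.155, 0.009),
    "dmf": (-0.140, -0.193, -0.091),
    "brushing daily": (0.337, 0.233, 0.436),
    "sealants present": (0.119, 0.060, 0.178),
    "plaque present": (-0.114, -0.171, -0.067),
}


def acceleration_factors(effects):
    """Map each log-scale effect to the multiplicative factor on time to caries."""
    return {name: tuple(math.exp(v) for v in values)
            for name, values in effects.items()}


factors = acceleration_factors(caries_effects)
for name, (median, lower, upper) in factors.items():
    print(f"{name}: {median:.2f} ({lower:.2f}, {upper:.2f})")
```

A factor above 1 means a longer caries-free time (for example daily brushing), a factor below 1 a shorter one (for example a decayed, filled or missing predecessor).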
Posterior Predictive Survival Function
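$$S(t \mid \text{data}, x^t_{\mathrm{new}}) = \int S(t \mid \theta, \text{data}, x^t_{\mathrm{new}})\, p(\theta \mid \text{data})\, d\theta \qquad (\theta = \text{all parameters})$$
From our model
$$S(t \mid \theta, x^t_{\mathrm{new}}) = 1 - \sum_{j=-K}^{K} w^{\varepsilon}_j\, \Phi\big\{\log(t) - \beta' x^t_{\mathrm{new}} - b \,\big|\, \alpha^{\varepsilon} + \tau^{\varepsilon} \mu^{\varepsilon}_j,\ (\sigma^{\varepsilon} \tau^{\varepsilon})^2\big\}$$
MCMC estimate of the predictive survivor function:
$$\widehat{S}(t \mid \text{data}, x^t_{\mathrm{new}}) = \frac{1}{M} \sum_{m=1}^{M} S(t \mid \theta^{(m)}, x^t_{\mathrm{new}})$$
$\theta^{(m)}$, $m = 1, \ldots, M$: MCMC sample from PPD
All components of $\theta^{(m)}$ directly available except $b^{(m)}$
⇒ sample from $G_b^{(m)}$ (normal mixture)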
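The Monte Carlo average can be sketched in a few lines of Python. Each element of `draws` stands for one MCMC iteration and already contains a value of b sampled for that iteration. The draws generated by `make_fake_draws` are made-up numbers for illustration only, and all names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def survival_given_theta(t, lin_pred, b, weights, means, alpha, tau, sigma):
    """S(t | theta, x_new) for a normal-mixture error on the log scale."""
    y = math.log(t) - lin_pred - b
    cdf = sum(w * NormalDist(alpha + tau * mu, sigma * tau).cdf(y)
              for w, mu in zip(weights, means))
    return 1.0 - cdf


def predictive_survival(t, draws):
    """Average of S(t | theta^(m), x_new) over the M draws."""
    return sum(survival_given_theta(t, **theta) for theta in draws) / len(draws)


def make_fake_draws(rng, n_draws=200):
    """Hypothetical stand-in for MCMC output; not results from the STM data."""
    return [dict(lin_pred=1.0 + rng.gauss(0.0, 0.05),  # beta' x_new
                 b=rng.gauss(0.0, 0.5),                # b sampled for this draw
                 weights=[0.2, 0.6, 0.2], means=[-1.0, 0.0, 1.0],
                 alpha=0.0, tau=1.0, sigma=0.5)
            for _ in range(n_draws)]


draws = make_fake_draws(random.Random(7))
curve = [predictive_survival(t, draws) for t in (0.5, 1.0, 2.0, 4.0, 6.0)]
```

Evaluating the average on a grid of t values gives the kind of predictive survivor curve shown for selected covariate profiles.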
Caries Free (Survivor) Curves, Tooth 16, Boys
[Figure: caries-free probability (0 to 1) versus time since emergence (0 to 6 years). Curves for two covariate profiles (daily brushing, sealed, no plaque; not daily brushing, not sealed, plaque present), each shown for a sound and for a DMF primary predecessor.]
Caries Hazard, Tooth 16, Boys
[Figure: hazard of caries (0 to 0.20) versus time since emergence (0 to 6 years). Curves for two covariate profiles (daily brushing, sealed, no plaque; not daily brushing, not sealed, plaque present), each shown for a DMF and for a sound primary predecessor.]
Predictive Distributions
[Figure: four panels of predictive densities. Emergence: error, g(ζ) versus ζ. Emergence: random, g(d) versus d. Caries: error, g(ε) versus ε. Caries: random, g(b) versus b.]
Discussion
Outline
1 Signal Tandmobiel® (STM) Study
2 Suggested statistical model
3 The Mixed-Effects AFT regression model
4 Mixed-Effects AFT model = linear mixed model
5 Analysis of STM data - Doubly Interval Censoring
6 Discussion
Can it be simpler?
Some criticism
Another criticism
Discussion Can it be simpler?
Earlier approach
Avoiding doubly interval censored outcome, by imputing emergence time and employing it as covariate
Problem: caries prior to emergence
Discussion
The advantages of survival analysis for the analysis of longitudinal caries data have been emphasized in recent reports (18, 19). For each individual tooth (or tooth surface) the time at risk can be estimated, even in the case of censoring (e.g. when a patient enters the study with a decayed tooth or leaves the study prematurely, etc.). Moreover, allowance is made for the changing number of surfaces at risk, which is a result of the natural exfoliation and emergence process in children.
To obtain statistically valid estimates and CIs adjusted for dependent censored observations, the GEE-type test for bivariate right-censored data, proposed by Huster et al. (16), was extended to the multivariate setting. It permits inferences to be made about the marginal distributions, while treating the dependence between the teeth as a nuisance. Chuang et al. (20) compared predicted dental implant survival estimates assuming the independence or dependence of clustered observations. They found that the point estimates were similar, but the variance estimates were drastically different. The 95% CIs for the naïve model were narrower, resulting in an increased risk for type I error and erroneous rejection of the null hypothesis. In the present study the CIs were up to 10% wider when dependence was taken into account (data not shown).
Usually, tooth emergence is taken as the zero-point of the analysis (6, 21–24). As it is impossible to assess the exact time of emergence, it is often assumed that emergence occurred in the middle of the interval between two examinations (19, 21, 23). Parner et al. (25) noted that this approach is only sensible for proper intervals between two examinations, and even then it may give rise to bias. Other groups rely on published mean emergence ages (23) or fail to inform the reader of how exact emergence ages for individual teeth in individual subjects were determined, in spite of annual or bi-annual examinations (6, 26). The same problem applies for the outcome: in some studies it is assumed that the event (e.g. cavity formation) occurred in the middle of the interval between two examinations (19). In the present study, to avoid invalid inferences no such assumptions were made. The date of birth was used as the zero-point of the analysis, a date that is exactly known for each child. For the sake of completeness, the results were verified with additional analyses where tooth emergence was taken as the zero-point of analysis. These analyses revealed comparable results – occlusal plaque accumulation, reported brushing frequency, and caries experience in the deciduous dentition were highly significant covariates. As expected, gender became less significant (i.e. the overall P-value increased from 0.062 in the original analysis to 0.106 in the
[Figure 1: four panels for girls, each with lines A, B and C. Panels a and b: survival probability (0 to 1) versus age (6 to 12 yr) for tooth 16 and tooth 36. Panels c and d: hazard (0 to 0.5) versus age (6 to 12 yr) for tooth 16 and tooth 36.]
Fig. 1. a and b: survival curves for girls; c and d: hazard functions for girls. Unbroken line: frequent brushing, no cavities in deciduous dentition, and no visible plaque on the occlusal surfaces of permanent first molars. Broken line: infrequent brushing, cavities in deciduous dentition and visible plaque on the occlusal surfaces of permanent first molars. Line A represents the emergence interval […-6] yr; line B represents the emergence interval ]6–7] yr; and line C represents the emergence interval ]7-…] yr. Comparable results were obtained for boys.
(Leroy et al., p. 150)
A = emergence < 6 yrs, B = 6 < emergence < 7 yrs, C = emergence > 7 yrs
Discussion Some criticism
Torturing of the data?
Perhaps, but
Shape of the distribution?
Simplifying methods (mid-point approach) are generally invalid
Ad-hoc approach gave (in retrospect) peculiar results
Simulation study indicates good behavior
Other smoothing approaches are possible
Various extensions of current model are possible
Discussion Another criticism
Is the approach general enough?
Possibly not; Bayesian non-parametric approaches offer a more flexible and general approach
THANK YOU FOR YOUR ATTENTION