Motivation Example Complete Longitudinal Data Missing outcome Missing Covariates Missing both Response and Covariates
Doubly robust estimates for longitudinal data analysis with missing response and missing covariates
Xiao-Hua Andrew Zhou, Ph.D.
Co-Investigator and Senior Biostatistician, NACC
Professor, Department of Biostatistics, University of Washington
October, 2009
Xiao-Hua Zhou. Doubly robust estimates for longitudinal data analysis with missing response and missing covariates. ADC, 2009.
1 NACC UDS
2 Analysis of Complete Longitudinal Data
3 Estimating Equations for Missing Outcome
4 Methods for Handling Missing Covariates
5 New Method: Model Formulation for Missing Response and Covariates; Estimation and Inference
6 Simulations and Applications
7 Summary
A NACC example
Using the National Alzheimer's Coordinating Center (NACC) Uniform Data Set (UDS), we are interested in assessing the association between patients' characteristics and the onset of dementia.
The response is the diagnosis of dementia (yes/no).
The covariates that may be related to dementia status include sex, congestive heart failure (CVCHF, yes/no), family history of dementia (FHDEM, yes/no), diabetes (yes/no), behavioral assessment (depression or dysphoria, yes/no), hypertension (yes/no), education (years), Mini-Mental State Exam (MMSE) score, and age.
A NACC example, continued
There are 16223 subjects from 29 Alzheimer's Disease Centers included at the entry of this study.
Follow-up visits for subjects are scheduled at approximately one-year intervals, with up to three follow-ups at present.
An example, continued
For various reasons, some data are missing for the response and for the behavioral assessment covariate.
There are 8724 subjects with complete data on scheduled visits.
About 11.9% of subjects are missing both the response and the behavioral assessment; about 31.2% are missing the response but have the behavioral assessment observed; about 3.2% are missing the behavioral assessment but have the response observed; and about 53.7% have both the response and the behavioral assessment covariate observed.
GEE Approach with Complete Longitudinal Data
The method of generalized estimating equations (GEE) is a popular method for analyzing longitudinal data.
It requires only the specification of a model for the marginal mean and variance of each measurement and of a "working" matrix for the correlation between measurements in a cluster.
Notations
Let $Y_{ij}$ denote the response of individual $i$ at time $j$ ($i = 1, \dots, N$; $j = 1, \dots, M_i$). Let $Y_i = (Y_{i1}, \dots, Y_{iM_i})^T$.
Let $x_{ij}$ denote a vector of covariates for individual $i$ at time $j$, and $x_i = (x_{i1}^T, \dots, x_{iM_i}^T)^T$.
Let $\mu_{ij} = E(Y_{ij} \mid x_{ij})$, with $g(\mu_{ij}) = \beta^T x_{ij}$; let $\mu_i = (\mu_{i1}, \dots, \mu_{iM_i})^T$.
GEE for Complete Data Analysis
The GEE for complete data are
\[ \sum_{i=1}^{N} U_i(\beta, \rho; Y_i, x_i) = 0, \]
where
\[ U_i(\beta, \rho; Y_i, x_i) = \frac{\partial \mu_i^T}{\partial \beta} V_i(\rho)^{-1} (Y_i - \mu_i), \]
and $V_i(\rho)$ is the working covariance matrix of $Y_i$.
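As an illustration only, the complete-data GEE can be solved numerically in a few lines. The sketch below assumes a logit link, a single covariate plus intercept, and an independence working correlation (so $V_i$ is diagonal); the toy data and the function name `fit_independence_gee` are hypothetical and not taken from the slides.

```python
import math

def fit_independence_gee(clusters, n_iter=25):
    """Solve sum_i D_i^T V_i^{-1} (Y_i - mu_i) = 0 by Fisher scoring.

    Assumes logit(mu_ij) = beta0 + beta1 * x_ij and an independence
    working correlation. `clusters` is a list of clusters, each a list
    of (x, y) pairs. Returns (beta0, beta1).
    """
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        u0 = u1 = 0.0            # estimating function U(beta)
        a00 = a01 = a11 = 0.0    # symmetric 2x2 expected derivative matrix
        for cluster in clusters:
            for x, y in cluster:
                mu = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
                w = mu * (1.0 - mu)   # binomial variance function
                # With the logit link and diagonal V_i, each term of
                # D^T V^{-1} (y - mu) reduces to (1, x)^T (y - mu).
                u0 += y - mu
                u1 += x * (y - mu)
                a00 += w
                a01 += w * x
                a11 += w * x * x
        det = a00 * a11 - a01 * a01
        b0 += (a11 * u0 - a01 * u1) / det
        b1 += (a00 * u1 - a01 * u0) / det
    return b0, b1

# Hypothetical toy data: 4 subjects, 3 visits each, x = visit index.
clusters = [
    [(0, 0), (1, 0), (2, 1)],
    [(0, 1), (1, 0), (2, 1)],
    [(0, 0), (1, 1), (2, 1)],
    [(0, 0), (1, 1), (2, 0)],
]
beta0, beta1 = fit_independence_gee(clusters)
print(beta0, beta1)
```

In this toy data set the observed proportions at x = 0, 1, 2 are 0.25, 0.5 and 0.75, whose logits are linear in x, so the solution is beta0 = -log 3 and beta1 = log 3. A non-diagonal working correlation would need a general linear solver and is not shown here.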
Asymptotic results
When $x_i$ contains only time-independent covariates, under some regularity conditions, the GEE yields estimators that are consistent.
If $x_i$ includes some time-dependent covariates, the GEE still yields consistent estimators under one additional assumption, that $E(Y_{ij} \mid x_i) = E(Y_{ij} \mid x_{ij})$. If this is not the case, then for consistency the independence working correlation should be used.
Time-dependent Covariates
Let $L_{ij}$ denote all the data that should be collected on individual $i$ at time $j$.
Let $\bar L_{ij}$ denote the data available on individual $i$ by time $j$.
Let $\underline{L}_{ij}$ denote the data not yet available by time $j$.
Note that $L_{ij}$ includes both $Y_{ij}$ and $x_{ij}$.
Drop-out
Let $R_{ij} = 1$ if measurement $j$ on individual $i$ is observed and $R_{ij} = 0$ otherwise.
Assume monotone drop-out: $R_{ij} = 0$ implies $R_{ik} = 0$ for all times $k > j$.
Let $C_{ij} = 1$ if subject $i$'s last observed measurement is at time $j$, and 0 otherwise.
We assume that the covariates included in $L_{ij}$ are chosen so that the data can be assumed to be Missing at Random (MAR):
\[ P(R_{ij} = 1 \mid \bar L_{iM_i}, R_{i,j-1} = 1) = P(R_{ij} = 1 \mid \bar L_{i,j-1}, R_{i,j-1} = 1), \]
where $\bar L_{ij}$ is the data available on individual $i$ by time $j$; i.e., the probability of missingness depends only on the observed data.
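The drop-out notation above can be made concrete with a small sketch. It checks the monotone drop-out condition for one subject's indicator vector and derives the last-observed indicators $C_{ij}$; the function names and example vectors are hypothetical.

```python
def is_monotone(r):
    """True if R_ij = 0 implies R_ik = 0 for all later times k > j.

    It suffices to check that no 0 is immediately followed by a 1.
    """
    return all(not (r[j] == 0 and r[j + 1] == 1) for j in range(len(r) - 1))

def last_observed_indicators(r):
    """C_ij = 1 if the subject's last observed measurement is at time j, else 0."""
    c = [0] * len(r)
    observed = [j for j, rij in enumerate(r) if rij == 1]
    if observed:
        c[observed[-1]] = 1
    return c

print(is_monotone([1, 1, 0, 0]))               # monotone drop-out after visit 2
print(is_monotone([1, 0, 1, 0]))               # intermittent missingness
print(last_observed_indicators([1, 1, 0, 0]))
```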
GEE for Complete-Data
\[ \sum_{i=1}^{N} U_i(\beta, \rho; Y_i, x_i) = 0, \]
where
\[ U_i(\beta, \rho; Y_i, x_i) = \frac{\partial \mu_i^T}{\partial \beta} V_i(\rho)^{-1} (Y_i - \mu_i), \]
and $V_i(\rho)$ is the working covariance matrix of $Y_i$.
Applied to the observed measurements only, these equations yield estimates that are consistent if the data are Missing Completely at Random (MCAR), but not necessarily if they are MAR.
Re-weighting
With missing data, we can base our estimates on the complete cases, but re-weight them according to the probability of being observed.
The estimating equations are then
\[ \sum_{i=1}^{N} \frac{\partial \mu_i^T}{\partial \beta} V_i(\rho)^{-1} \Delta_i(\alpha) (Y_i - \mu_i) = 0, \]
where $\Delta_i(\alpha) = \mathrm{diag}(R_{i1}/\pi_{i1}, \dots, R_{iM_i}/\pi_{iM_i})$ and $\pi_{ij} = \pi_{ij}(\alpha)$ is the probability, according to a specified dropout model, that measurement $j$ on subject $i$ is observed.
Under monotone drop-out,
\[ \pi_{ij}(\alpha) = (1 - \lambda_{i1}(\alpha)) \cdots (1 - \lambda_{ij}(\alpha)), \]
where $\lambda_{ij}(\alpha) = P(R_{ij} = 0 \mid \bar L_{i,j-1}, R_{i,j-1} = 1)$ is the probability of dropping out at time $j$ given the data available by time $j - 1$.
The resulting estimates are consistent if the data are MAR, as long as the probability model for the missingness is correctly specified.
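A minimal sketch of how the weights in $\Delta_i(\alpha)$ follow from the dropout hazards, for a single subject. The hazard values below are made-up numbers; in practice they would come from a fitted dropout model, which is not shown.

```python
def observation_probabilities(lam):
    """pi_j = (1 - lambda_1) * ... * (1 - lambda_j) for one subject."""
    pis, prod = [], 1.0
    for hazard in lam:
        prod *= 1.0 - hazard
        pis.append(prod)
    return pis

def ipw_weights(r, lam):
    """Diagonal of Delta_i: R_ij / pi_ij for each visit j."""
    return [rij / pij for rij, pij in zip(r, observation_probabilities(lam))]

lam = [0.0, 0.2, 0.5]   # hypothetical dropout hazards at visits 1..3
r = [1, 1, 0]           # subject observed at visits 1 and 2, then dropped out

pis = observation_probabilities(lam)
weights = ipw_weights(r, lam)
print(pis, weights)
```

Observed visits with a small probability of being observed receive large weights, which is why a poorly fitting dropout model can make the weighted estimates unstable.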
Imputation
Alternatively, we can impute, or "guess", what the missing values are based on some probability model.
Then the estimates are based on both the observed data and the imputed data.
The complete case estimating equations are used, but after imputing missing responses with their expected values:
\[ E(Y_{ij} \mid L_{ik}, R_{ik} = 1), \quad \text{for } j > k. \]
The imputations are based on specified regression models.
The resulting estimates are consistent if the data are MAR, as long as the probability model for the imputations is correct.
Doubly-robust Estimating Equations
The inverse probability weighting (IPW) estimates make no use of the available data on subjects with missing measurements.
Let $d(L_M, \beta) = U(\beta, \rho; Y, x)$ be the contribution of a fully observed subject to the estimating equations.
For drop-out missing data, the IPW estimating equations can be augmented by a term $F(C, L_C, \beta)$ satisfying $E_C\{F(C, L_C, \beta) \mid L_M\} = 0$.
The resulting augmented estimating equations are
\[ \sum_{i=1}^{N} \left\{ \frac{R_{iM_i}}{\pi_{iM_i}} d(L_{M_i}, \beta) + F(C, L_C, \beta) \right\} = 0. \]
Doubly-robust Estimating Equations (2)
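The optimal choice of augmentation term is
\[ F_{\mathrm{opt}}(C, L_C, \beta) = \sum_{j=1}^{M-1} \left( \frac{C_j - \lambda_{j+1} R_j}{\pi_{j+1}} \right) H_j(\beta), \]
where $H_j(\beta) = E_{L_j}\{d(L_M, \beta) \mid L_j, R_j = 1\}$.
We specify models for $H_j(\beta)$, $j = 1, \dots, M - 1$, which involve parameters $\gamma$.
Let $\hat\alpha$ and $\hat\gamma$ denote consistent estimators of $\alpha$ and $\gamma$.
Then, in the estimating equations, replace $\lambda_j$, $\pi_j$, and $H_j$ with $\lambda_j(\hat\alpha)$, $\pi_j(\hat\alpha)$, and $H_j(\beta, \hat\gamma)$.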
Properties of DR Estimating Equations
If:
the data are MAR,
the marginal model is correct, $g(\mu_{ij}) = \beta^T x_{ij}$, and
either the dropout model $\pi_j$, or the model for $H_j$ (or both) is correctly specified,
then the solution $\hat\beta$ to the estimating equations is consistent for $\beta$.
Furthermore, if both the dropout model and the model for $H_j$ are correct, then this solution $\hat\beta$ is optimal in the sense that it has the smallest asymptotic variance among estimates from augmented estimating equations. A consistent estimate of this variance exists in closed form.
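The double robustness property can be checked numerically in a much simpler setting than the longitudinal one above. The sketch below is a deliberate simplification, not the slides' estimating equations: it considers the augmented IPW estimator of a mean $E[Y]$ with a single binary covariate and MAR missingness given that covariate, and computes its large-sample limit exactly. All probabilities and names are hypothetical.

```python
def aipw_limit(p_x, m_true, pi_true, m_work, pi_work):
    """Large-sample limit of the augmented IPW estimator of E[Y]:

        E[ R / pi_work(X) * (Y - m_work(X)) + m_work(X) ],

    computed exactly for a discrete covariate X, with R independent
    of Y given X (MAR).
    """
    total = 0.0
    for x, px in p_x.items():
        # Under MAR, E[R (Y - m_work) | X = x] = pi_true(x) * (m_true(x) - m_work(x)).
        total += px * (pi_true[x] / pi_work[x] * (m_true[x] - m_work[x]) + m_work[x])
    return total

p_x = {0: 0.5, 1: 0.5}           # distribution of X
m_true = {0: 0.2, 1: 0.7}        # true E[Y | X]
pi_true = {0: 0.9, 1: 0.4}       # true P(R = 1 | X)
m_wrong = {0: 0.45, 1: 0.45}     # misspecified outcome model (ignores X)
pi_wrong = {0: 0.65, 1: 0.65}    # misspecified missingness model (ignores X)

truth = sum(p_x[x] * m_true[x] for x in p_x)
only_pi_right = aipw_limit(p_x, m_true, pi_true, m_wrong, pi_true)
only_m_right = aipw_limit(p_x, m_true, pi_true, m_true, pi_wrong)
both_wrong = aipw_limit(p_x, m_true, pi_true, m_wrong, pi_wrong)
print(truth, only_pi_right, only_m_right, both_wrong)
```

The limit equals the true mean when either working model is correct, and is biased only when both are wrong, which mirrors the "either ... or ... (or both)" condition stated above.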
Methods for Handling Missing Covariates
Lipsitz et al. (1999) considered the doubly robust estimate in a cross-sectional study with a missing covariate.
Notation:
$y_i$: response
$x_i$: covariate vector that is always observed
$z_i$: covariate that is subject to missingness
$r_i$: missingness indicator for $z_i$
Joint density of $(r_i, y_i, z_i \mid x_i)$:
\begin{align*}
p(r_i, y_i, z_i \mid x_i) &= p(r_i \mid y_i, z_i, x_i, \omega)\, p(y_i \mid z_i, x_i, \beta)\, p(z_i \mid x_i, \alpha) \\
&= p(r_i \mid y_i, x_i, \omega)\, p(y_i \mid z_i, x_i, \beta)\, p(z_i \mid x_i, \alpha) \quad \text{(MAR)}
\end{align*}
Score Equation for Complete Data
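The likelihood-based score equation:
\[ \sum_{i=1}^{n} \begin{pmatrix} u_{1i}(\beta) \\ u_{2i}(\alpha) \\ u_{3i}(\omega) \end{pmatrix} = 0, \]
where
$u_{1i}(\beta; y_i, x_i, z_i) = \partial \log p(y_i \mid x_i, z_i, \beta) / \partial \beta$
$u_{2i}(\alpha; x_i, z_i) = \partial \log p(z_i \mid x_i, \alpha) / \partial \alpha$
$u_{3i}(\omega; r_i, x_i, y_i) = \partial \log p(r_i \mid x_i, y_i, z_i, \omega) / \partial \omega$ (free of $z_i$ under MAR)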
Methods for Handling Missing Covariates
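With missing data, the maximum likelihood estimate $\hat\gamma = (\hat\beta', \hat\alpha', \hat\omega')'$ solves
\[ u^*(\hat\gamma) = \sum_{i=1}^{n} u_i^*(\hat\gamma) = \sum_{i=1}^{n} E\left[ \begin{pmatrix} u_{1i}(\hat\beta) \\ u_{2i}(\hat\alpha) \\ u_{3i}(\hat\omega) \end{pmatrix} \,\middle|\, \text{observed data} \right] = 0 \]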
Methods for Handling Missing Covariates
We can further show that
\[ u^*(\gamma) = \sum_{i=1}^{n} \begin{pmatrix} r_i u_{1i}(\beta; y_i, x_i, z_i) + (1 - r_i) E_{z_i \mid y_i, x_i}[u_{1i}(\beta; y_i, x_i, z_i)] \\ r_i u_{2i}(\alpha; z_i, x_i) + (1 - r_i) E_{z_i \mid y_i, x_i}[u_{2i}(\alpha; z_i, x_i)] \\ u_{3i}(\omega; y_i, x_i, r_i) \end{pmatrix} \]
Solving $u^*(\hat\gamma) = 0$ gives the MLE.
The asymptotic properties of $(\hat\beta, \hat\alpha)'$ do not depend on the missing data model.
If $p(y_i \mid x_i, z_i)$ and $p(z_i \mid x_i)$ are correctly specified, solving $u^*(\hat\gamma) = 0$ gives consistent estimates $(\hat\beta, \hat\alpha)'$.
If $p(y_i \mid x_i, z_i)$ or $p(z_i \mid x_i)$ (or both) is misspecified, then $\hat\beta$ will not be consistent.
Methods for Handling Missing Covariates
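Weighted GEE
\[ S(\gamma) = \sum_{i=1}^{n} \begin{pmatrix} \frac{r_i}{\pi_i} u_{1i}(\beta; y_i, x_i, z_i) + \left(1 - \frac{r_i}{\pi_i}\right) E_{z_i \mid y_i, x_i}[u_{1i}(\beta; y_i, x_i, z_i)] \\ \frac{r_i}{\pi_i} u_{2i}(\alpha; z_i, x_i) + \left(1 - \frac{r_i}{\pi_i}\right) E_{z_i \mid y_i, x_i}[u_{2i}(\alpha; z_i, x_i)] \\ u_{3i}(\omega; y_i, x_i, r_i) \end{pmatrix} \]
where $\pi_i = P(r_i = 1 \mid y_i, x_i)$.
Doubly robust estimate: solving $S(\hat\gamma) = 0$ gives an asymptotically unbiased estimate of $\beta$ when either $\pi_i$ or $p(z_i \mid x_i)$ is correctly specified.
An EM algorithm is used to compute the estimate.
Asymptotic variance:
\[ \mathrm{Var}(\hat\gamma) = \left\{ \sum_{i=1}^{n} E\left[ \frac{\partial S_i(\gamma)}{\partial \gamma'} \right] \right\}^{-1} \left\{ \sum_{i=1}^{n} E[S_i(\gamma) S_i'(\gamma)] \right\} \left\{ \sum_{i=1}^{n} E\left[ \frac{\partial S_i(\gamma)}{\partial \gamma'} \right]' \right\}^{-1} \]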
Model Formulation
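Notation
Response: $Y_i = (Y_{i1}, Y_{i2}, \dots, Y_{iJ_i})'$
Covariate: $X_i = (X_{i1}, X_{i2}, \dots, X_{iJ_i})'$
Missingness indicator $R_{ij}$:
$R_{ij} = 0$: $Y_{ij}$ and $X_{ij}$ are missing
$R_{ij} = 1$: $Y_{ij}$ is missing and $X_{ij}$ is observed
$R_{ij} = 2$: $Y_{ij}$ is observed and $X_{ij}$ is missing
$R_{ij} = 3$: $Y_{ij}$ and $X_{ij}$ are observed
Covariate: $Z_i$ (always observed)
Response model: $\mu_{ij} = E(Y_{ij} \mid X_i, Z_i)$, $\mathrm{var}(Y_{ij} \mid X_i, Z_i) = \kappa f(\mu_{ij})$,
\[ g(\mu_{ij}) = X_{ij} \beta_x + Z_{ij}' \beta_z \]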
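The four-level coding of $R_{ij}$ is easy to get backwards, so a small sketch may help. It maps the two observation flags to the code defined above and tabulates patterns over a few visits; the function name and the example records are hypothetical.

```python
from collections import Counter

def missingness_code(y_observed, x_observed):
    """Return R_ij: 0 = both missing, 1 = Y missing and X observed,
    2 = Y observed and X missing, 3 = both observed."""
    return 2 * int(y_observed) + int(x_observed)

# Hypothetical visits as (response, covariate); None marks a missing value.
visits = [(1, 0), (None, 1), (0, None), (None, None), (1, 1)]
codes = [missingness_code(y is not None, x is not None) for y, x in visits]
pattern_counts = Counter(codes)
print(codes, dict(pattern_counts))
```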
Model Formulation (Continued)
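Missing data models: $\lambda_{ijk} = P(R_{ij} = k \mid \bar R_{ij}, Y_i, X_i, Z_i)$, $k = 0, 1, 2, 3$,
\[ \log\left( \frac{\lambda_{ijk}}{\lambda_{ij0}} \right) = u_{ijk}' \alpha_k, \quad k = 1, 2, 3 \]
$\bar R_{ij}$: history of the missingness indicator
Covariate model: $\omega_{ij} = E(X_{ij} \mid \bar X_{ij}, Z_i)$,
\[ h(\omega_{ij}) = v_{ij}' \gamma \]
$\bar X_{ij}$: covariate history
$\theta = (\beta', \alpha', \gamma')'$, where $\beta$ is of interest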
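The baseline-category logit model above determines all four probabilities once the three linear predictors $u_{ijk}'\alpha_k$ are known. A minimal sketch of that inversion follows; the numeric values of the linear predictors are made up for illustration.

```python
import math

def missingness_probabilities(eta):
    """Invert log(lambda_k / lambda_0) = eta[k], k = 1, 2, 3.

    `eta` maps k to the linear predictor u_ijk' alpha_k.
    Returns [lambda_0, lambda_1, lambda_2, lambda_3], which sum to 1.
    """
    denom = 1.0 + sum(math.exp(eta[k]) for k in (1, 2, 3))
    return [1.0 / denom] + [math.exp(eta[k]) / denom for k in (1, 2, 3)]

eta = {1: 0.5, 2: -1.0, 3: 1.5}   # hypothetical linear predictors
lam = missingness_probabilities(eta)
print(lam)
```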
Model Formulation (Continued)
MAR assumption:
\[ P(R_{ij} = k \mid \bar R_{ij}, Y_i, X_i, Z_i) = P(R_{ij} = k \mid \bar R_{ij}, Y_i^{(o)}, X_i^{(o)}, Z_i) \]
$Y_i = (Y_i^{(o)}, Y_i^{(m)})$
$X_i = (X_i^{(o)}, X_i^{(m)})$
Model Formulation (Continued)
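Weighted GEE (WGEE) for $\beta$:
\[ S_1(\theta) = \sum_{i=1}^{n} \left[ D_i M_i (Y_i - \mu_i) + E_{Y_i^{(m)}, X_i^{(m)} \mid Y_i^{(o)}, X_i^{(o)}, Z_i}\left[ D_i N_i (Y_i - \mu_i) \right] \right] = 0 \]
$M_i = \kappa^{-1} F_i^{-1/2} [C_i^{-1} \bullet \Delta_i] F_i^{-1/2}$
$N_i = \kappa^{-1} F_i^{-1/2} [C_i^{-1} \bullet (11' - \Delta_i)] F_i^{-1/2}$
$F_i = \mathrm{diag}(\mathrm{var}(Y_{ij} \mid X_{ij}, Z_{ij}),\ j = 1, \dots, J_i)$
$C_i$: working correlation matrix
$\Delta_i = [\delta_{ijk}]$ with $\delta_{ijk} = [I(R_{ij} = 1, R_{ik} = 3) + I(R_{ij} = 3, R_{ik} = 3)] / \pi_{ijk}$ for $j \neq k$, and $\delta_{ijj} = I(R_{ij} = 3) / \pi_{ij}$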
Model Formulation (Continued)
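Weighted GEE (WGEE) for $\gamma$:
\[ S_2(\theta) = \sum_{i=1}^{n} \left[ v_i \Delta_i^* (X_i - \omega_i) + E_{X_i^{(m)} \mid X_i^{(o)}, Z_i}\left[ v_i (I - \Delta_i^*)(X_i - \omega_i) \right] \right] = 0 \]
$\Delta_i^* = \mathrm{diag}(I(R_{ij} = 1 \text{ or } 3) / \pi_{ij}^x,\ j = 1, \dots, J_i)$
$\pi_{ij}^x = P(R_{ij} = 1 \text{ or } 3 \mid Y_i, Z_i, X_i)$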
Model Formulation (Continued)
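Estimating function for the missing data parameter $\alpha$:
\[ S_3(\alpha) = \sum_{i=1}^{n} \sum_{j=1}^{J_i} \sum_{k=0}^{3} \frac{I(R_{ij} = k)}{\lambda_{ijk}} \frac{\partial \lambda_{ijk}}{\partial \alpha} = 0 \]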
Estimation and Inference
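Solve the estimating equations
\[ S(\hat\theta) = \begin{pmatrix} S_1(\hat\theta) \\ S_2(\hat\theta) \\ S_3(\hat\alpha) \end{pmatrix} = \sum_{i=1}^{n} S_i(\hat\theta) = 0 \]
An EM algorithm is used for the estimation.
Variance estimate:
\[ \mathrm{Var}(\hat\theta) = \left\{ \sum_{i=1}^{n} E\left[ \frac{\partial S_i(\theta)}{\partial \theta} \right] \right\}^{-1} \left\{ \sum_{i=1}^{n} E[S_i(\theta) S_i'(\theta)] \right\} \left\{ \sum_{i=1}^{n} E\left[ \frac{\partial S_i(\theta)}{\partial \theta} \right]' \right\}^{-1}. \]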
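To show the shape of the sandwich formula in the simplest possible case, the sketch below applies it to a scalar estimating function $S_i(\theta) = y_i - \theta$ (the sample mean), replacing the expectations with their empirical counterparts evaluated at $\hat\theta$. This toy case and the function name are assumptions for illustration only; the slides' $S_i(\theta)$ is vector-valued and far more involved.

```python
def sandwich_variance_mean(y):
    """Empirical sandwich variance A^{-1} B A^{-1} for theta_hat solving
    sum_i (y_i - theta) = 0, where
        A = sum_i dS_i/dtheta = -n,
        B = sum_i S_i(theta_hat)^2.
    Returns (theta_hat, variance_estimate).
    """
    n = len(y)
    theta_hat = sum(y) / n
    a = -float(n)
    b = sum((yi - theta_hat) ** 2 for yi in y)
    return theta_hat, b / (a * a)

theta_hat, var_hat = sandwich_variance_mean([1.0, 2.0, 3.0, 4.0, 5.0])
print(theta_hat, var_hat)
```

For these five values the estimate is 3 with sandwich variance 10 / 25 = 0.4, i.e. the usual variance of a sample mean with divisor n.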
Estimation and Inference (Continued)
Doubly robust estimate
If the missing data model is correctly specified, we get an asymptotically unbiased estimate of $\beta$ whether or not the covariate model is correctly specified.
If the covariate model is correctly specified, we get an asymptotically unbiased estimate of $\beta$ whether or not the missing data model is correctly specified.
Simulations
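Response model: $\mathrm{logit}(\mu_{ij}) = \beta_0 + \beta_1 X_{ij} + \beta_2 Z_{ij}$, $j = 1, 2, 3$, with exchangeable correlation $\rho$.
Covariate model:
\[ \mathrm{logit}(\omega_{ij}) = \gamma_0 + \gamma_1 X_{i,j-1} + \gamma_2 Z_{ij} \]
Missing data model:
\[ \log\left( \frac{\lambda_{ijk}}{\lambda_{ij0}} \right) = \alpha_{0k} + \alpha_{1k1} I(R_{i,j-1} = 1) + \alpha_{1k2} I(R_{i,j-1} = 2) + \alpha_{1k3} I(R_{i,j-1} = 3) + \alpha_{2k} y_{i,j-1}^{(o)} + \alpha_{3k} x_{i,j-1}^{(o)} \]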
Simulations (Continued)
Methods considered
1 EM(x+): EM with correct covariate model
2 WGEE(x+, r+): WGEE with correct covariate and missing data models
3 WGEE(x−, r+): WGEE with incorrect covariate and correct missing data models
4 WGEE(x+, r−): WGEE with correct covariate and incorrect missing data models
5 WGEE(x−, r−): WGEE with incorrect covariate and incorrect missing data models
6 cc: complete case MLE
Simulations (Continued)
Table: Empirical bias, standard deviation and coverage probabilities for six approaches to estimation and inference with incomplete covariate and response data (ρ = 0.6, α2 = γ2 = −2)
            β0                       β1                       β2
Method      Bias%    SD     CP%      Bias%    SD     CP%      Bias%    SD     CP%
EM(x+)       -0.3   0.102   94.9      -1.1   0.077   94.3       0.5   0.091   94.8
(x+, r+)      0.7   0.104   95.1       0.8   0.080   94.5      -0.9   0.093   94.9
(x+, r−)     -1.0   0.110   95.2      -1.6   0.088   94.9       1.6   0.102   95.0
(x−, r+)      0.4   0.105   94.4       1.0   0.084   94.8      -0.3   0.096   94.5
(x−, r−)    -20.1   0.094   91.4      12.0   0.081   92.9       3.0   0.096   93.9
cc         -302.0   0.876   53.8      49.9   1.077   96.8       0.4   1.218   94.6
Table: Empirical bias, standard deviation and coverage probabilities for six approaches to estimation and inference with incomplete covariate and response data (ρ = 0.3, α2 = γ2 = −2)
            β0                       β1                       β2
Method      Bias%    SD     CP%      Bias%    SD     CP%      Bias%    SD     CP%
EM(x+)       -1.6   0.058   94.4      -0.2   0.067   95.3       1.1   0.084   94.4
(x+, r+)      0.1   0.060   95.4       0.1   0.072   95.1       0.3   0.086   94.6
(x+, r−)      0.0   0.066   94.3       0.8   0.071   94.9       0.2   0.091   94.7
(x−, r+)      1.2   0.062   94.7       0.6   0.079   94.8      -0.9   0.087   94.5
(x−, r−)    -12.4   0.076   93.4       8.4   0.077   94.1       2.0   0.087   94.2
cc         -219.6   0.784   78.6     -27.0   1.065   97.2       0.0   0.930   94.9
Simulations (Continued)
Summary of the simulations:
The EM algorithm gives a consistent and the most efficient estimate when the covariate model is correctly specified.
The proposed method yields negligible bias when either the covariate model or the missing data model is correctly specified.
If both the covariate model and the missing data model are misspecified, the proposed method yields biased results.
The complete case analysis gives biased estimates.
Impact of model misspecification
Figure: Asymptotic percent relative bias of β2 with misspecified covariate model and missing data model (x-axis: γ1, from −4 to 4; y-axis: % relative bias for β2; one curve for each of α2 = −2, −1, 0, 1, 2)
Application to the NACC UDS
Table: Frequency (%) of the missingness patterns (X, Y) for the covariate and the response, by time (m = missing, o = observed)
Time   (m, m)   (o, m)   (m, o)   (o, o)
1        6.0     28.8      8.9     56.3
2       10.3     31.7      3.9     54.1
3       12.8     31.1      2.7     53.4
4       14.1     31.3      1.6     52.9
Application to the NACC UDS
Table: Parameter estimates for the NACC UDS: proposed method, n = 16223
Parameter      Est.     SE      p
(Intercept)   -0.136   0.106    0.198
SEX(F)        -0.203   0.025   <0.001
CVCHF         -0.031   0.063    0.618
DEPRESSION     0.679   0.029   <0.001
MMSE          -0.002   0.001   <0.001
FHDEM          0.181   0.028   <0.001
DIABETE       -0.124   0.038    0.001
HYPERT        -0.195   0.026   <0.001
EDUC          -0.002   0.001    0.040
AGE            0.006   0.001   <0.001
Application to the NACC UDS
Table: Parameter estimates for the NACC UDS: missing response only, n = 15416
Parameter      Est.     SE      p
(Intercept)   -0.272   0.110    0.013
SEX(F)        -0.113   0.026   <0.001
CVCHF          0.123   0.066    0.063
DEPRESSION     0.505   0.030   <0.001
MMSE          -0.007   0.001   <0.001
FHDEM         -0.004   0.029    0.897
DIABETE       -0.176   0.038   <0.001
HYPERT        -0.220   0.027   <0.001
EDUC           0.000   0.001    0.670
AGE            0.013   0.001   <0.001
Application to the NACC UDS
Table: Parameter estimates for the NACC UDS: missing covariate only, n = 10755
Parameter      Est.     SE      p
(Intercept)    0.198   0.142    0.163
SEX(F)        -0.040   0.032    0.215
CVCHF          0.044   0.080    0.579
DEPRESSION     0.451   0.034   <0.001
MMSE          -0.019   0.001   <0.001
FHDEM         -0.048   0.036    0.177
DIABETE       -0.177   0.047   <0.001
HYPERT        -0.212   0.034   <0.001
EDUC          -0.000   0.002    0.904
AGE            0.011   0.002   <0.001
Application to the NACC UDS
Table: Parameter estimates for the NACC UDS: complete case analysis, n = 8724
Parameter      Est.     SE      p
(Intercept)    0.283   0.162    0.081
SEX(F)        -0.022   0.037    0.551
CVCHF         -0.019   0.092    0.834
DEPRESSION     0.416   0.039   <0.001
MMSE          -0.021   0.001   <0.001
FHDEM         -0.067   0.040    0.099
DIABETE       -0.168   0.054    0.002
HYPERT        -0.212   0.039   <0.001
EDUC           0.002   0.002    0.252
AGE            0.013   0.002   <0.001
Summary
The likelihood-based method is robust to misspecification of the missing data process model.
The weighted GEE method is robust to misspecification of the covariate model.
Our proposed method is robust to misspecification of either the missing data process model or the covariate model.
Our proposed method can deal with intermittent missingness patterns in longitudinal data with both missing response and missing covariate.
Questions?
Thank You!!!