34
The Annals of Statistics 0, Vol. 0, No. 00, 1–34 DOI: 10.1214/10-AOS824 © Institute of Mathematical Statistics,0 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9 10 10 11 11 12 12 13 13 14 14 15 15 16 16 17 17 18 18 19 19 20 20 21 21 22 22 23 23 24 24 25 25 26 26 27 27 28 28 29 29 30 30 31 31 32 32 33 33 34 34 35 35 36 36 37 37 38 38 39 39 40 40 41 41 42 42 43 43 ANOVA FOR LONGITUDINAL DATA WITH MISSING VALUES 1 BY SONG XI CHEN AND PING-SHOU ZHONG Iowa State University and Peking University and Iowa State University We carry out ANOVA comparisons of multiple treatments for longitudi- nal studies with missing values. The treatment effects are modeled semipara- metrically via a partially linear regression which is flexible in quantifying the time effects of treatments. The empirical likelihood is employed to formulate model-robust nonparametric ANOVA tests for treatment effects with respect to covariates, the nonparametric time-effect functions and interactions be- tween covariates and time. The proposed tests can be readily modified for a variety of data and model combinations, that encompasses parametric, semi- parametric and nonparametric regression models; cross-sectional and longi- tudinal data, and with or without missing values. 1. Introduction. Randomized clinical trials and observational studies are of- ten used to evaluate treatment effects. While the treatment versus control stud- ies are popular, multi-treatment comparisons beyond two samples are commonly practised in clinical trails and observational studies. In addition to evaluate overall treatment effects, investigators are also interested in intra-individual changes over time by collecting repeated measurements on each individual over time. Although most longitudinal studies are desired to have all subjects measured at the same set of time points, such “balanced” data may not be available in practice due to missing values. Missing values arise when scheduled measurements are not made, which make the data “unbalanced.” There is a good body of literature on para- metric, nonparametric and semiparametric estimation for longitudinal data with or without missing values. This includes Liang and Zeger (1986), Laird and Ware (1982), Wu (?w98, ?w00), Fitzmaurice, Laird and Ware (2004) for methods de- <ref:w98?> <ref:w00?> veloped for longitudinal data without missing values; and Little and Rubin (2002), Little (1995), Laird (2004), Robins, Rotnitzky and Zhao (1995) for missing values. The aim of this paper is to develop ANOVAtests for multi-treatment compar- isons in longitudinal studies with or without missing values. Suppose that at time t , corresponding to k treatments there are k mutually independent samples: {(Y 1i (t),X τ 1i (t))} n 1 i =1 ,..., {(Y ki (t),X τ ki (t))} n k i =1 , Received February 2010; revised April 2010. 1 Supported by NSF Grants SES-0518904, DMS-06-04563 and DMS-07-14978. AMS 2000 subject classifications. Primary 62G10; secondary 62G20, 62G09. Key words and phrases. Analysis of variance, empirical likelihood, kernel smoothing, missing at random, semiparametric model, treatment effects. AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 1 1

Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

The Annals of Statistics0, Vol. 0, No. 00, 1–34DOI: 10.1214/10-AOS824© Institute of Mathematical Statistics, 0

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

ANOVA FOR LONGITUDINAL DATA WITH MISSING VALUES1

BY SONG XI CHEN AND PING-SHOU ZHONG

Iowa State University and Peking University and Iowa State University

We carry out ANOVA comparisons of multiple treatments for longitudi-nal studies with missing values. The treatment effects are modeled semipara-metrically via a partially linear regression which is flexible in quantifying thetime effects of treatments. The empirical likelihood is employed to formulatemodel-robust nonparametric ANOVA tests for treatment effects with respectto covariates, the nonparametric time-effect functions and interactions be-tween covariates and time. The proposed tests can be readily modified for avariety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression models; cross-sectional and longi-tudinal data, and with or without missing values.

1. Introduction. Randomized clinical trials and observational studies are of-ten used to evaluate treatment effects. While the treatment versus control stud-ies are popular, multi-treatment comparisons beyond two samples are commonlypractised in clinical trails and observational studies. In addition to evaluate overalltreatment effects, investigators are also interested in intra-individual changes overtime by collecting repeated measurements on each individual over time. Althoughmost longitudinal studies are desired to have all subjects measured at the sameset of time points, such “balanced” data may not be available in practice due tomissing values. Missing values arise when scheduled measurements are not made,which make the data “unbalanced.” There is a good body of literature on para-metric, nonparametric and semiparametric estimation for longitudinal data with orwithout missing values. This includes Liang and Zeger (1986), Laird and Ware(1982), Wu (?w98, ?w00), Fitzmaurice, Laird and Ware (2004) for methods de- <ref:w98?>

<ref:w00?>veloped for longitudinal data without missing values; and Little and Rubin (2002),Little (1995), Laird (2004), Robins, Rotnitzky and Zhao (1995) for missing values.

The aim of this paper is to develop ANOVA tests for multi-treatment compar-isons in longitudinal studies with or without missing values. Suppose that at time t ,corresponding to k treatments there are k mutually independent samples:

{(Y1i (t),Xτ1i(t))}n1

i=1, . . . , {(Yki(t),Xτki(t))}nk

i=1,

Received February 2010; revised April 2010.1Supported by NSF Grants SES-0518904, DMS-06-04563 and DMS-07-14978.AMS 2000 subject classifications. Primary 62G10; secondary 62G20, 62G09.Key words and phrases. Analysis of variance, empirical likelihood, kernel smoothing, missing at

random, semiparametric model, treatment effects.

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 1

1

Page 2: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

2 S. X. CHEN AND P.-S. ZHONG

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

where the response variable Yji(t) and the covariate Xji(t) are supposed to bemeasured at time points t = tj i1, . . . , tj iTj

. Here Tj is the fixed number of sched-uled observations for the j th treatment. However, {Yji(t),X

τji(t)} may not be ob-

served at some times, resulting in missing values in either the response Yji(t) orthe covariates Xji(t).

We consider a semiparametric regression model for the longitudinal data

Yji(t) = Xτji(t)βj0 + Mτ(Xji(t), t)γj0 + gj0(t) + εji(t),

(1.1)j = 1,2, . . . , k,

where M(Xji(t), t) are known functions of Xji(t) and time t representing inter-actions between the covariates and the time, βj0 and γj0 are p- and q-dimensionalparameters, respectively, gj0(t) are unknown smooth functions representing thetime effect, and {εji(t)} are residual time series. Such a semiparametric modelmay be viewed as an extended partially linear model. The partially linear modelhas been used for longitudinal data analysis; see Zeger and Diggle (1994), Zhanget al. (1998), Lin and Ying (2001), Wang, Carroll and Lin (2005). Wu, Chiang andHoover (1998) and Wu and Chiang (2000) proposed estimation and confidence re-gions for a semiparametric varying coefficient regression model. Despite a bodyof works on estimation for longitudinal data, analysis of variance for longitudinaldata have attracted much less attention. A few exceptions include Forcina (1992)who proposed an ANOVA test in a fully parametric setting; and Scheike and Zhang(1998) who considered a two sample test in a fully nonparametric setting.

In this paper, we propose ANOVA tests for differences among the βj0’s andthe baseline time functions gj0’s, respectively, in the presence of the interactions.The ANOVA statistics are formulated based on the empirical likelihood [Owen(1988, 2001)], which can be viewed as a nonparametric counterpart of the conven-tional parametric likelihood. Despite its not requiring a fully parametric model, theempirical likelihood enjoys two key properties of a conventional likelihood, theWilks’ theorem [Owen (1990), Qin and Lawless (1994), Fan and Zhang (2004)]and Bartlett correction [DiCicco, Hall and Romano (1991), Chen and Cui (2006)];see Chen and Van Keilegom (2009) for an overview on the empirical likelihoodfor regression. This resemblance to the parametric likelihood ratio motivates us toconsider using empirical likelihood to formulate ANOVA test for longitudinal datain nonparametric situations. This will introduce a much needed model-robustnessin the ANOVA testing.

Empirical likelihood has been used in studies for either missing or longitudinaldata. Wang and Rao (2002), Wang, Linton and Härdle (2004) considered an empir-ical likelihood inference with a kernel regression imputation for missing responses.Liang and Qin (2008) treated estimation for the partially linear model with miss-ing covariates. For longitudinal data, Xue and Zhu (2007a, 2007b) proposed a biascorrection method to make the empirical likelihood statistic asymptotically pivotal

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 2

Page 3: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

ANOVA FOR LONGITUDINAL DATA 3

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

in a one sample partially linear model; see also You, Chen and Zhou (?ycz07) and <ref:ycz07?>

Huang, Qin and Follmann (2008).In this paper, we propose three empirical likelihood based ANOVA tests for

the equivalence of the treatment effects with respect to (i) the covariate Xji ; (ii)the interactions M(Xji(t), t) and (iii) the time effect functions gj0(·)’s, by for-mulating empirical likelihood ratio test statistics. It is shown that for the proposedANOVA tests for the covariates effects and the interactions, the empirical likeli-hood ratio statistics are asymptotically chi-squared distributed, which resemblesthe conventional ANOVA statistics based on parametric likelihood ratios. This isachieved without parametric model assumptions for the residuals in the presenceof the nonparametric time effect functions and missing values. Hence, the empiri-cal likelihood ANOVA tests have the needed model-robustness. Another attractionof the proposed ANOVA tests is that they encompass a set of ANOVA tests for avariety of data and model combinations. Specifically, they imply specific ANOVAtests for both cross-sectional and longitudinal data; for parametric, semiparametricand nonparametric regression models; and with or without missing values.

The paper is organized as below. In Section 2, we describe the model and themissing value mechanism. Section 3 outlines the ANOVA test for comparing treat-ment effects due to the covariates; whereas the tests regarding interaction are pro-posed in Section 4. Section 5 considers ANOVA test for the nonparametric timeeffects. The bootstrap calibration to the ANOVA test on the nonparametric part isoutlined in Section 6. Section 7 reports simulation results. We applied the proposedANOVA tests in Section 8 to analyze an HIV-CD4 data set. Technical assumptionsare presented in the Appendix. All the technical proofs to the theorems are reportedin a supplement article [Chen and Zhong (2010)].

2. Models, hypotheses and missing values. For the ith individual of the j thtreatment, the measurements taken at time tj im follow a semiparametric model

Yji(tj im) = Xτji(tj im)βj0 + Mτ(Xji(tj im), tjim)γj0

(2.1)+ gj0(tjim) + εji(tj im),

for j = 1, . . . , k, i = 1, . . . , nj , m = 1, . . . , Tj . Here βj0 and γj0 are un-known p- and q-dimensional parameters and gj0(t) are unknown functionsrepresenting the time effects of the treatments. The time points {tj im}Tj

m=1 areknown design points. For the ease of notation, we write (Yjim,Xτ

jim,Mτjim)

to denote (Yji(tj im),Xτji(tj im),Mτ (Xji(tj im), tjim)). Also, we will use X

τjim =

(Xτjim,Mτ

jim) and ξτj = (βτ

j , γ τj ). For each individual, the residuals {εji(t)} sat-

isfy E{εji(t)|Xji(t)} = 0, Var{εji(t)|Xji(t)} = σ 2j (t) and

Cov{εji(t), εji(s)|Xji(t),Xji(s)} = ρj (s, t)σj (t)σj (s),

where ρj (s, t) is the conditional correlation coefficient between two residuals attwo different times. And the residual time series {εji(t)} from different subjects

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 3

Page 4: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

4 S. X. CHEN AND P.-S. ZHONG

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

and different treatments are independent. Without loss of generality, we assumet, s ∈ [0,1]. For the purpose of identifying βj0, γj0 and gj0(t), we assume

(βj0, γj0, gj0) = arg min(βj ,γj ,gj )

1

njTj

nj∑i=1

Tj∑m=1

E{Yjim −Xτjimβj −Mτ

jimγj −gj (tjim)}2.

We also require that 1njTj

∑nj

i=1∑Tj

m=1 E(XjimXτjim) > 0, where Xjim = Xjim −

E(Xjim|tj im). This condition also rules out M(Xji(t), t) being a pure functionof t , and hence it has to be genuine interaction. For the same reason, the interceptin model (2.1) is absorbed into the nonparametric part gj0(t).

As commonly exercised in the partially linear model [Speckman (1988); Linton(?l95)], there is a secondary model for the covariate Xjim: <ref:l95?>

Xjim = hj (tjim) + ujim,(2.2)

j = 1,2, . . . , k, i = 1, . . . , nj ,m = 1, . . . , Tj ,

where hj (·)’s are p-dimensional smooth functions with continuous second deriv-atives, the residual ujim = (u1

jim, . . . , upjim)τ satisfy E(ujim) = 0 and ujl and ujk

are independent for l �= k, where ujl = (ujl1, . . . , ujlTj). By the identification con-

dition given above, the covariance matrix of ujim is assumed to be finite and posi-tive definite.

We are interested in testing three ANOVA hypotheses. The first one is on thetreatment effects with respect to the covariates:

H0a :β10 = β20 = · · · = βk0 vs. H1a :βi0 �= βj0 for some i �= j.

The second one is regarding the time effect functions:

H0b :g10(·) = · · · = gk0(·) vs. H1b :gi0(·) �= gj0(·) for some i �= j.

The third one is on the existence of the interaction H0c :γj0 = 0 and H1c :γj0 �= 0.And the last one is the ANOVA test for

H0d :γ10 = γ20 = · · · = γk0 vs. H1d :γi0 �= γj0 for some i �= j.

Let Xji = {Xji0, . . . ,XjiTj} and Yji = {Yji0, . . . , YjiTj

} be the completetime series of the covariates and responses of the (j, i)th subject (the ith sub-ject in the j th treatment), and

Yjit,d = {Yji(t−d), . . . , Yji(t−1)} and↼

Xjit,d ={Xji(t−d), . . . ,Xji(t−1)} be the past d observations at time t for a positive inte-ger d ≤ minj {Tj }. For t < d , we set d = t − 1.

Define the missing value indicator δjit = 1 if (Xτjit , Yjit ) is observed and δjit =

0 if (Xτjit , Yjit ) is missing. Here, we assume Xjit and Yjit are either both observed

or both missing. This simultaneous missingness of Xjit and Yjit is for the ease ofmathematical exposition. We also assume that δji0 = 1, namely the first visit ofeach subject is always made.

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 4

Page 5: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

ANOVA FOR LONGITUDINAL DATA 5

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

Monotone missingness is a common assumption in the analysis of longitudi-nal data [Robins, Rotnitzky and Zhao (1995)]. It assumes that if δji(t−1) = 0then δjit = 0. However, in practice after missing some scheduled appointmentspeople may rejoin the study. This kind of casual drop-out appears quite often inempirical studies. To allow more data being included in the analysis, we relaxthe monotone missingness to allow segments of consecutive d visits being used.Let δjit,d = ∏d

l=1 δji(t−l). We assume the missingness of (Xτjit , Yjit ) is missing at

random (MAR) Rubin (1976) given its immediate past d complete observations,namely

P(δjit = 1|δjit,d = 1,Xji, Yji) = P(δjit = 1|δjit,d = 1,↼

Xjit,d ,↼

Yjit,d)(2.3)

= pj (↼

Xjit,d ,↼

Yjit,d; θj0).

Here the missing propensity pj is known up to a parameter θj0. To allow derivationof a binary likelihood function, we need to set δjit = 0 if δjit,d = 0 when there issome drop-outs among the past d visits, which is only temporarily if δjit = 1. Thisset-up ensures

P(δjit = 0|δjit,d = 0,↼

Xjit,d ,↼

Yjit,d) = 1.(2.4)

Now the conditional binary likelihood for {δjit }Tj

t=1 given Xji and Yji is

P(δji0, . . . , δjiTj|Xji, Yji)

=Tj∏

m=1

P(δjim|δji(m−1), . . . , δji0,Xji, Yji

)

=Tj∏

m=1

P(δjim|δjim,d = 1,↼

Xjim,d,↼

Yjim,d)

=Tj∏

m=1

[pj (

Xjim,d,↼

Yjim,d; θj )δjim{1 − pj (

Xjim,d,↼

Yjim,d; θj )}(1−δjim)]δjim,d .

In the second equation above, we use both the MAR in (2.3) and (2.4). Hence, theparameters θj0 can be estimated by maximizing the binary likelihood

LBj(θj ) =

nj∏i=1

Tj∏t=1

[pj (

Xjit,d ,↼

Yjit,d; θj )δjit

(2.5)× {1 − pj (

Xjit,d ,↼

Yjit,d; θj )}(1−δjit )]δjit,d .

Under some regular conditions, the binary maximum likelihood estimator θj is√n-consistent estimator of θj0; see Chen, Leung and Qin (2008) for results on

a related situation. Some guidelines on how to choose models for the missing

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 5

Page 6: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

6 S. X. CHEN AND P.-S. ZHONG

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

propensity are given in Section 8 in the context of the empirical study. The ro-bustness of the ANOVA tests with respect to the missing propensity model arediscussed in Sections 3 and 4.

3. ANOVA test for covariate effects. We consider testing for H0a :β10 =β20 = · · · = βk0 with respect to the covariates. Let πjim(θj ) = ∏m

l=m−d pj (↼

Xjil,d ,↼

Yjil,d; θj ) be the overall missing propensity for the (j, i)th subject up to time tj im.To remove the nonparametric part in (2.1), we first estimate the nonparametricfunction gj0(t). If βj0 and γj0 were known, gj0(t) would be estimated by

gj (t;βj0) =nj∑i=1

Tj∑m=1

wjim,h(t)(Yjim − Xτjimβj0 − Mτ

jimγj0),(3.1)

where

wjim,hj(t) = (δjim/πjim(θj ))Khj

(tjim − t)∑nj

s=1∑Tj

l=1(δjsl/πjsl(θj ))Khj(tjsl − t)

(3.2)

is a kernel weight that has been inversely weighted by the propensity πjim(θj )

to correct for selection bias due to the missing values. In (3.2), K is a univariatekernel function which is a symmetric probability density, Khj

(t) = K(t/hj )/hj

and hj is a smoothing bandwidth. The conventional kernel estimation of gj0(t)

without weighting by πjsl(θj ) may be inconsistent if the missingness depends onthe responses Yjil , which can be the case for missing covariates.

Let Ajim denote any of Xjim,Yjim and Mjim and define

Ajim = Ajim −nj∑

i1=1

Tj∑m1=1

wji1m1,hj(tj im)Aji1m1(3.3)

to be the centering of Ajim by the kernel conditional mean estimate, as is com-monly exercised in the partially linear regression [Härdle, Liang and Gao (2000)].An estimating function for the (j, i)th subject is

Zji(βj ) =Tj∑

m=1

δjim

πjim(θj )Xjim(Yjim − Xτ

jimβj − Mτjimγj ),

where γj is the solution of

nj∑i=1

Tj∑m=1

δjim

πjim(θj )Mjim(Yjim − Xτ

jimβj0 − Mτjimγj ) = 0

at the true βj0. Note that E{Zji(βj0)} = o(1). Although it is not exactly zero,Zji(βj0) can still be used as an approximate zero mean estimating function toformulate an empirical likelihood for βj as follows.

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 6

Page 7: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

ANOVA FOR LONGITUDINAL DATA 7

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

Let {pji}nj

i=1 be nonnegative weights allocated to {(Xτji, Yji)}nj

i=1. The empiricallikelihood for βj is

Lnj(βj ) = max

{ nj∏i=1

pji

},(3.4)

subject to∑nj

i=1 pji = 1 and∑nj

i=1 pjiZji(βj ) = 0.

By introducing a Lagrange multiplier λj to solve the above optimization prob-lem and following the standard derivation in empirical likelihood [Owen (1990)],it can be shown that

Lnj(βj ) =

nj∏i=1

{1

nj

1

1 + λτjZji(βj )

},(3.5)

where λj satisfies

nj∑i=1

Zji(βj )

1 + λτjZji(βj )

= 0.(3.6)

The maximum of Lnj(βj ) is

∏nj

i=11nj

, achieved at βj = βj and λj = 0, where βj

solves∑nj

i=1 Zji(βj ) = 0.Let n = ∑k

i=1 nj , nj/n → ρj for some nonzero ρj as n → ∞ such that∑ki=1 ρj = 1. As the k samples are independent, the joint empirical likelihood

for (β1, β2, . . . , βk) is

Ln(β1, β2, . . . , βk) =k∏

j=1

Lnj(βj ).

The log likelihood ratio statistic for H0a is

�n := −2 maxβ

logLn(β,β, . . . , β) +k∑

j=1

nj lognj

(3.7)

= 2 minβ

k∑j=1

nj∑i=1

log{1 + λτjZji(β)}.

Using a Taylor expansion and the Lagrange multiplier to carry out the mini-mization in (3.7), the optimal solution to β is(

k∑j=1

�xjB−1

j �xj

)−1(k∑

j=1

�xjB−1

j �xj yj

)+ op(1),(3.8)

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 7

Page 8: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

8 S. X. CHEN AND P.-S. ZHONG

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

where Bj = limnj→∞ (njTj )−1 ∑nj

i=1 E{Zji(βj0)Zji(βj0)τ },

�xj= 1√

njTj

nj∑i=1

Tj∑m=1

E

{δjim

πjim(θj )XjimXτ

jim

}and

�xjyj= 1√

njTj

nj∑i=1

Tj∑m=1

δjim

πjim(θj )Xjim(Yjim − Mτ

jimγj ).

The ANOVA test statistic (3.7) can be viewed as a nonparametric counterpart ofthe conventional parametric likelihood ratio ANOVA test statistic, for instance thatconsidered in Forcina (1992). Like its parametric counterpart, the Wilks’ theoremis maintained for �n.

THEOREM 1. If conditions A1–A4 given in the Appendix hold, then under

H0a , �nd→ χ2

(k−1)p as n → ∞.

The theorem suggests an empirical likelihood ANOVA test that rejects H0a if�n > χ2

(k−1)p,α where α is the significant level and χ2(k−1)p,α is the upper α quantile

of the χ2(k−1)p distribution.

We next evaluate the power of the empirical likelihood ANOVA test under aseries of local alternative hypotheses:

H1a :βj0 = β10 + cnn−1/2j for 2 ≤ j ≤ k,

where {cn} is a sequence of bounded constants. Define �β = (βτ10 − βτ

20, βτ10 −

βτ30, . . . , β

τ10 − βτ

k0)τ , D1j = �−1

x1�x1y1 − �−1

xj�xj yj

for 2 ≤ j ≤ k and D =(Dτ

12,Dτ13, . . . ,D

τ1k)

τ . Let �D = Var(D) and γ 2 = �τβ�−1

D �β . Theorem 2 givesthe asymptotic distribution of �n under the local alternatives.

THEOREM 2. Suppose conditions A1–A4 in the Appendix hold, then under

H1a , �nd→ χ2

(k−1)p(γ 2) as n → ∞.

It can be shown that

�D = �−1x1

B1�−1x1

1(k−1) ⊗ 1(k−1) + diag{�−1x2

B2�−1x2

, . . . ,�−1xk

Bk�−1xk

}.(3.9)

As each �−1xj

is O(n1/2), the noncentral component γ 2 is nonzero and bounded.The power of the α level empirical likelihood ANOVA test is

β(γ ) = P{χ2

(k−1)p(γ 2) > χ2(k−1)p,α

}.

This indicates that the test is able to detect local departures of size O(n−1/2)

from H0a , which is the best rate we can achieve under the local alternative set-up.

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 8

Page 9: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

ANOVA FOR LONGITUDINAL DATA 9

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

This is attained despite the fact that nonparametric kernel estimation is involved inthe formulation, which has a slower rate of convergence than

√n, as the centering

in (3.3) essentially eliminates the effects of the nonparametric estimation.

REMARK 1. When there is no missing values, namely all δjim = 1, we willassign all πjim(θj ) = 1 and there is no need to estimate each θj . In this case,Theorems 1 and 2 remain valid. It is a different matter for estimation as estimationefficiency with missing values will be less than that without missing values.

REMARK 2. The above ANOVA test is robust against misspecifying the miss-ing propensity pj (·; θj0) provided the missingness does not depend on the re-sponses

Yjit,d . This is because despite the mispecification, the mean of Zji(β)

is still approximately zero and the empirical likelihood formulation remains valid,as well as Theorems 1 and 2. However, if the missingness depends on the responsesand if the model is misspecified, Theorems 1 and 2 will be affected.

REMARK 3. The empirical likelihood test can be readily modified for ANOVAtesting on pure parametric regressions with some parametric time effects gj0(t;ηj )

with parameters ηj . When there is absence of interaction, we may formulate theempirical likelihood for (βj , ηj ) ∈ Rp+s using

Zji(βj ;ηj ) =Tj∑

m=1

δjim

πjim(θj )

(Xτ

jim,∂gτ

j (tjim;ηj )

∂ηj

× {Yjim − Xτjimβj − gj0(tjim;ηj )}

as the estimating function for the (j, i)th subject. The ANOVA test can be for-mulated following the same procedures from (3.5) to (3.7), and both Theorems 1and 2 remaining valid after updating p with p + s where s is the dimension of ηj .

In our formulation for the ANOVA test here and in the next section, we relyon the Nadaraya–Watson type kernel estimator. The local linear kernel estimatormay be employed when the boundary bias may be an issue. However, as we areinterested in ANOVA tests instead of estimation, the boundary bias does not havea leading order effect.

4. ANOVA test for time effects. In this section, we consider the ANOVA testfor the nonparametric part

H0b :g10(·) = · · · = gk0(·).We will first formulate an empirical likelihood for gj0(t) at each t , which thenlead to an overall likelihood ratio for H0b. We need an estimator of gj0(t) that is

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 9

Page 10: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

10 S. X. CHEN AND P.-S. ZHONG

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

less biased than the one in (3.1). Recall the notation defined in Section 2: Xτjim =

(Xτjim,Mτ

jim) and ξτj = (βτ

j , γ τj ). Plugging-in the estimator ξj to (3.1), we have

gj (t) =nj∑i=1

Tj∑m=1

wjim,hj(t)(Yjim − X

τjimξj ).(4.1)

It follows that, for any t ∈ [0,1],

gj (t) − gj0(t) =nj∑i=1

Tj∑m=1

wjim,hj(t){εji(tj im) + X

τjim(ξj − ξj )

(4.2)+ gj0(tjim) − gj0(t)}.

However, there is a bias of order h2j in the kernel estimation since

nj∑i=1

Tj∑m=1

wjim,hj(t){gj0(tjim) − gj0(t)} = 1

2

{∫z2K(z)dz

}g′′

j0(t)h2j + op(h2

j ).

If we formulated the empirical likelihood based on gj (t), the bias will contributeto the asymptotic distribution of the ANOVA test statistic. To avoid that, we usethe bias-correction method proposed in Xue and Zhu (2007a) so that the estimatorof gj0 is

gj (t) =nj∑i=1

Tj∑m=1

wjim,hj(t)

{Yjim − X

τjimξj − (

gj (tj im) − gj (t))}

.

Based on this modified estimator gj (t), we define the auxiliary variable

Rji{gj (t)} =Tj∑

m=1

δjim

πjim(θj )K

(tj im − t

hj

)× {

Yjim − Xτjimξj − gj (t) − (

gj (tj im) − gj (t))}

for empirical likelihood formulation. At true function gj0(t), E[Rji{gj0(t)}] =o(1).

Using a similar procedure to Lnj(βj ) as given in (3.5) and (3.6), the empirical

likelihood for gj0(t) is

Lnj{gj0(t)} = max

{ nj∏i=1

pji

}

subject to∑nj

i=1 pji = 1 and∑nj

i=1 pjiRji{gj (t)} = 0. The latter is obtained in asimilar fashion as we obtain (3.5) by introducing Lagrange multipliers so that

Lnj{gj0(t)} =

nj∏i=1

{1

nj

1

1 + ηj (t)Rji{gj0(t)}},

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 10

Page 11: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

ANOVA FOR LONGITUDINAL DATA 11

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

where ηj (t) is a Lagrange multiplier that satisfies

nj∑i=1

Rji{gj0(t)}1 + ηj (t)Rji{gj0(t)} = 0.(4.3)

The log empirical likelihood ratio for g10(t) = · · · = gk0(t) := g(t), say, is

Ln(t) = 2 ming(t)

k∑j=1

nj∑i=1

log(1 + ηj (t)Rji{g(t)}),(4.4)

which is analogues of �n in (3.7). As shown in the proof of Theorem 3 given in thesupplement article [Chen and Zhong (2010)], the leading order term of the Ln(t)

is a studentized version of the distance(g1(t) − g2(t), g1(t) − g3(t), . . . , g1(t) − gk(t)

),

namely between g1(t) and the other gj (t)(j �= 1). This motivates us to proposeusing

Tn =∫ 1

0Ln(t)�(t) dt(4.5)

to test for the equivalence of {gj0(·)}kj=1, where �(t) is a probability weight func-tion over [0,1].

To define the asymptotic distribution of Tn, we assume without loss of generalitythat for each hj and Tj , j = 1, . . . , k, there exist fixed finite positive constants αj

and bj such that αjTj = T and bjhj = h for some T and h as h → 0. Effectively,

T is the smallest common multiple of T1, . . . , Tk . Let K(2)c (t) = ∫

K(w)K(t −cw)dt and K

(4)c (0) = ∫

K(2)c (w

√c)K

(2)1/c(w/

√c) dw. For c = 1, we resort to the

standard notations of K(2)(t) and K(4)(0) for K(2)1 (t) and K

(4)1 (0), respectively.

For each treatment j , let fj be the super-population density of the design points{tj im}. Let aj = ρ−1

j αj ,

Wj(t) = fj (t)/{ajbjσ2εj }∑k

l=1 fl(t)/{alblσ2εl}

and Vj (t) = K(2)(0)σ 2εjfj (t) where σ 2

εj = 1njTj

∑nj

i=1∑Tj

m=1 E{ ε2jim

πjim(θj0)}. Further-

more, we define

�(t) =k∑

j=1

b−1j K(4)(0)

(1 − Wj(t)

)2

+k∑

j �=j1

(bjbj1)−1/2K

(4)bj /bj1

(0)Wj (t)Wj1(t)

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 11

Page 12: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

12 S. X. CHEN AND P.-S. ZHONG

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

and

μ1 =∫ 1

0

[k∑

j=1

b−1/2j V −1

j (t)f 2j (t)�2

nj (t)

−(

k∑s=1

b−1/4s V −1/2

s (t)W 1/2s (t)fs(t)�ns(t)

)2]�(t) dt.

We consider a sequence of local alternative hypotheses:

gj0(t) = g10(t) + Cjn�jn(t),(4.6)

where Cjn = (njTj )−1/2h

−1/4j for j = 2, . . . , k and {�jn(t)}n≥1 is a sequence of

uniformly bounded functions.

THEOREM 3. Assume conditions A1–A4 in the Appendix and h = O(n−1/5),then under (4.6),

h−1/2(Tn − μ0)d→ N(0, σ 2

0 ),

where μ0 = (k − 1) + h1/2μ1 and σ 20 = 2K(2)(0)−2 ∫ 1

0 �(t)� 2(t) dt .

We note that under H0b :g10(·) = · · · = gk0(·), �jn(t) = 0 which yields μ1 = 0and

h−1/2{Tn − (k − 1)} d→ N(0, σ 20 ).

This may lead to an asymptotic test at a nominal significance level α that rejectsH0b if

Tn ≥ h1/2σ0zα + (k − 1),(4.7)

where zα is the upper α quantile of N(0,1) and σ0 is a consistent estimator ofσ0. The asymptotic power of the test under the local alternatives is 1 − �(zα −μ1σ0

), where �(·) is the standard normal distribution function. This indicates thatthe test is powerful in differentiating null hypothesis and its local alternative atthe convergence rate O(n

−1/2j h

−1/4j ) for Cjn. The rate is the best when a single

bandwidth is used Härdle and Mammen (1993).If all the hj (j = 1, . . . , k) are the same, the asymptotic variance σ 2

0 = 2(k −1)K(2)(0)−2K(4)(0)

∫ 10 � 2(t) dt , which means that the test statistic under H0b is

asymptotic pivotal. However, when the bandwidths are not the same, which is mostlikely as different treatments may require different amount of smoothness in theestimation of gj0(·), the asymptotical pivotalness of Tn is no longer available, andestimation of σ 2

0 is needed for conducting the asymptotic test in (4.7). We willpropose a test based on a bootstrap calibration to the distribution of Tn in Section 6.

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 12

Page 13: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

ANOVA FOR LONGITUDINAL DATA 13

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

REMARK 4. Similar to Remarks 1 and 2 made on the ANOVA tests for thecovariate effects, the proposed ANOVA test for the nonparametric baseline func-tions (Theorem 3) remains valid in the absence of missing values or if the missingpropensity is misspecified as long as the responses do not contribute to the miss-ingness.

REMARK 5. We note that the proposed test is not affected by the within-subject dependent structure (the longitudinal aspect) due to the fact that the formu-lation of the empirical likelihood is made for each subject. This is clearly shown inthe construction of Rji{gj (t)} and by the fact that the nonparametric functions canbe separated from the covariate effects in the semiparametric model. Again thiswould be changed if we are interested in estimation as the correlation structure inthe longitudinal data will affect the estimation efficiency. However, the test will bedependent on the choice of the weight function �(·), and {αj }, {ρj } and {bj }, therelative ratios among {Tj }, {nj } and {hj }.

REMARK 6. The ANOVA test statistics for the time effects for the semipara-metric model can be readily modified to obtain ANOVA test for purely nonpara-metric regression by simply setting ξj = 0 in the formulation of the test statisticLn(t). In this case, the model (2.1) takes the form

Yji(t) = gj (Xji(t), t) + εji(t),(4.8)

where gj (·) is the unknown nonparametric function of Xji(t) and t . The proposedANOVA test can be viewed as generalization of the tests considered in Mund andDettle (1998), Pardo-Fernández, Van Keilegom and González-Manteiga (2007)and Wang, Akritas and Van Keilegom (2008) by considering both the longitudi-nal and missing aspects. See also Cao and Van Keilegom (2006) for a two sampletest for the equivalence of two probability densities.

5. Tests on interactions. Model (1.1) contains an interactive term M(Xjim, t)

that is flexible in prescribing the interact between Xjim and the time, as long asthe positive definite condition in condition A3 is satisfied. In this section, we pro-pose tests for the presence of the interaction in the j th treatment and the ANOVAhypothesis on the equivalence of the interactions among the treatments.

We firstly consider testing H0c :γj0 = 0 vs. H1c :γj0 �= 0 for a fixed j . In theformulation of the empirical likelihood for γj0, we treat Mjim = M(Xjim, t) asa covariates with the same role like Xjim in the previous section when we con-structed empirical likelihood for βj0. For this purpose, we define estimating equa-tions for γj0

φji(γj0) =Tj∑

m=1

δjim

πjim(θj )Mjim(Yjim − Xτ

jimβj − Mτjimγj0),(5.1)

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 13

Page 14: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

14 S. X. CHEN AND P.-S. ZHONG

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

where

βj ={ nj∑

i=1

Tj∑m=1

δjim

πjim(θj )XjimXτ

jim

}−1

(5.2)

×nj∑i=1

Tj∑m=1

δjim

πjim(θj )Xjim(Yjim − Mτ

jimγj0)

is the “estimator” of βj at the true γj0. Similar to establishing �nj(βj ) in Section 3,

the log-empirical likelihood for γj0 can be written as

�γnj

(γj ) = 2nj∑i=1

log{1 + �′jφji(γj )},

where the Lagrange multipliers �j satisfiesnj∑i=1

φji(γj )

1 + �′jφji(γj )

= 0.(5.3)

To test for H0d :γ10 = γ20 = · · · = γk0 vs. H1d :γi0 �= γj0 for some i �= j , weconstruct the joint empirical likelihood ratio

�γn := 2 min

γ

k∑j=1

nj∑i=1

log{1 + �τjφji(γ )},(5.4)

where �j satisfy (5.3).The asymptotic distributions of the empirical likelihood ratios �

γnj (0) and �

γn

under the null hypotheses as given in the next theorem whose proofs will not begiven as they follow the same routes in the proof of Theorem 1 by exchangingXjim and βj0 with Mjim and γj0, respectively.

THEOREM 4. Under conditions A1–A4 given in the Appendix, then (i) under

H0c, �γnj (0)

d→ χ2q as nj → ∞; (ii) under H0d , �

γn

d→ χ2(k−1)q as n → ∞.

Based on Theorem 4, an α-level empirical likelihood ratio test for the presenceof the interaction in the j th sample rejects H0c if �

γnj (0) > χ2

q,α , and the ANOVA

test for the equivalence of the interactive effects rejects H0d if �γn > χ2

(k−1)q,α . TheANOVA test for H0d has a similar local power performance as that described afterTheorem 2 for the ANOVA test regarding βj0 in Section 3. The power propertiesof the test for H0c can be established using a much easier method.

We have assumed parametric models for the interaction in model (1.1). A semi-parametric model would be employed to model the interaction given that the modelfor the time effect is nonparametric. The parametric interaction is a simplificationand avoids some of the involved technicalities associated with a semiparametricmodel.

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 14

Page 15: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

ANOVA FOR LONGITUDINAL DATA 15

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

6. Bootstrap calibration. To avoid direct estimation of σ 20 in Theorem 3 and

to speed up the convergence of Tn, we resort to the bootstrap. While the wildbootstrap [Wu (1986), Liu (1988) and Härdle and Mammen (1993)] originallyproposed for parametric regression without missing values has been modified byShao and Sitter (1996) to take into account missing values, we extend it further tosuit the longitudinal feature.

Let �toj and �tmj be the sets of the time points with full and missing observa-tions, respectively. According to model (2.2), we impute a missing Xji(t) fromXji(t), t ∈ �toj , so that for any t ∈ �tmj

Xji(t) =nj∑i=1

Tj∑m=1

wjim,hj(t)Xjim,(6.1)

where wjim,hj(t) is the kernel weight defined in (3.2).

To mimic the heteroscedastic and correlation structure in the longitudinal data,we estimate the covariance matrix for each subject in each treatment. Let

εj im = Yjim − Xτjimξj − gj (tj im).

An estimator of σ 2j (t), the variance of εji(t), is σ 2

j (t) = ∑nj

i=1∑Tj

m=1 wjim,hj(t) ×

ε2jim and an estimator of ρj (s, t), the correlation coefficient between εji(t) and

εji(s) for s �= t , is

ρj (s, t) =nj∑i=1

Tj∑m�=m′

Hjim,m′(s, t)ej imejim′,

where ej im = εj im/σj (tj im),

Hjim,m′(s, t) = δjimδjim′Kbj(s − tj im)Kbj

(t − tj im′)/πjim,m′(θj )∑nj

i=1∑

m�=m′ δjimδjim′Kbj(s − tj im)Kbj

(t − tj im′)/πjim,m′(θj )

and πjim,m′(θj ) = πjim(θj )πjim′(θj ) if |m − m′| > d; πjim,m′(θj ) = πjimb(θj )

if |m − m′| ≤ d where mb = max(m,m′). Here bj is a smoothing bandwidthwhich may be different from the bandwidth hj for calculating the test sta-tistics Tn [Fan, Huang and Li (2007)]. Then, the covariance �ji of εji =(εji1, . . . , εjiTj

)τ is estimated by �ji which has σ 2j (tj im) as its mth diagonal ele-

ment and ρj (tj ik, tj il)σj (tj ik)σj (tj il) as its (k, l)th element for k �= l.Let Yji, δji, tj i be the vector of random variables of the (j, i)th subject, Xji =

(Xji(tj i1), . . . ,Xji(tj iTj))τ and gj0(tsl) = (gj0(tsl1), . . . , gj0(tslTk

))τ , where s

may be different from j . Let Xcji = {Xo

ji, Xmji}, where Xo

ji contains observed

Xji(t) for tj ∈ �to and Xmji collects the imputed Xji(t) for t ∈ �tmj according to (6.1).

Plugging the value of Xcji , we get Mc

ji = {Moji, M

mji}, the observed and the imputed

interactions for (j, i)th subject, and then Xcji .

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 15

Page 16: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

16 S. X. CHEN AND P.-S. ZHONG

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

The proposed bootstrap procedure consists of the following steps:Step 1. Generate a bootstrap re-sample {Y ∗

ji ,Xcji, δ

∗ji, tj i} for the (j, i)th subject

by

Y ∗ji = X

cji

τξj + g1(tji) + �jie

∗ji,

where e∗ji ’s are i.i.d. random vectors simulated from a distribution satisfying

E(e∗ji) = 0 and Var(e∗

ji) = ITj, δ∗

jim ∼ Bernoulli(πjim(θj )) where θj is estimatedbased on the original sample as given in (2.5). Here, g1(tji) is used as the commonnonparametric time effect to mimic the null hypothesis H0b.

Step 2. For each treatment j , we reestimate ξj , θj and gj (t) based on the resam-ple {Y ∗

ji,Xcji, δ

∗ji, tj i} and denote them as ξ∗

j , θ∗j and g∗

j (t). The bootstrap versionof Rji{g1(t)} is

R∗ji{g1(t)} =

Tj∑m=1

δ∗jim

πjim(θ∗j )

K

(tj im − t

hj

)× {

Y ∗jim − X

τjimξ∗

j − g1(t) − {g∗j (tj im) − g∗

j (t)}}and use it to substitute Rji{gj (t)} in the formulation of Ln(t), we obtain L∗

n(t) andthen T ∗

n = ∫L∗

n(t)�(t) dt .Step 3. Repeat the above two steps B times for a large integer B and obtain B

bootstrap values {T ∗nb}Bb=1. Let tα be the 1 − α quantile of {T ∗

nb}Bb=1, which is abootstrap estimate of the 1 − α quantile of Tn. Then, we reject the null hypothesisH0b if Tn > tα .

The following theorem justifies the bootstrap procedure.

THEOREM 5. Assume conditions A1–A4 in the Appendix hold and h =O(n−1/5). Let Xn denote the original sample, h and σ 2

0 be defined as in The-orem 3. The conditional distribution of h−1/2(T ∗

n − μ0) given Xn converges toN(0, σ 2

0 ) almost surely, namely,

h−1/2{T ∗n − (k − 1)}|Xn

d→ N(0, σ 20 ) a.s.

7. Simulation results. In this section, we report results from simulation stud-ies which were designed to confirm the proposed ANOVA tests proposed in theprevious sections. We simulated data from the following three-treatment model:

Yjim = Xjimβj + Mjimγj + gj (tjim) + εjim and(7.1)

Xjim = 2 − 1.5tj im + ujim,

where Mjim = tj im × (Xjim − 1.5)2, εjim = eji + νjim, ujim ∼ N(0, σ 2aj

), eji ∼N(0, σ 2

bj) and νjim ∼ N(0, σ 2

cj) for j = {1,2,3}, i = 1, . . . , nj and m = 1, . . . , Tj .

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 16

Page 17: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

ANOVA FOR LONGITUDINAL DATA 17

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

This structure used to generate {εjim}Tj

m=1 ensures dependence among the repeatedmeasurements {Yjim} for each subject i. The correlation between Yjim and Yjil

for any m �= l is σ 2bj

/(σ 2bj

+ σ 2cj

). The time points {tj im}Tj

m=1 were obtained by firstindependently generating uniform[0,1] random variables and then sorted in the as-cending order. We set the number of repeated measures Tj to be the same, say T ,for all three treatments; and chose T = 5 and 10, respectively. The standard devia-tion parameters in (7.1) were σa1 = 0.5, σb1 = 0.5, σc1 = 0.2 for the first treatment,σa2 = 0.5, σb2 = 0.5, σc2 = 0.2 for the second and σa3 = 0.6, σb3 = 0.6, σc3 = 0.3for the third.

The parameters and the time effects for the three treatments were:

Treatment 1: β1 = 2, γ1 = 1, g1(t) = 2 sin(2πt) − �1n(t);Treatment 2: β2 = 2 + D2n, γ2 = 1 + D2n, g2(t) = 2 sin(2πt) − �2n(t);Treatment 3: β3 = 2 + D3n, γ3 = 1 + D3n, g3(t) = 2 sin(2πt) − �3n(t).

We designated different values of D1n,D2n,D3n,�1n(t),�2n(t) and �3n(t) in theevaluation of the size and the power, whose details will be reported shortly.

We considered two missing data mechanisms. In the first mechanism (I), themissing propensity was

logit{P(δjim = 1|δjim,m−1 = 1,Xji, Yji)} = θjXji(m−1) for m > 1,(7.2)

which is not dependent on the response Y , with θ1 = 3, θ2 = 2 and θ3 = 2. In thesecond mechanism (II),

logit{P(δjim = 1|δjim,m−1 = 1,Xji, Yji)}(7.3)

={

θj1Xji(m−1) + θj2{Yji(m−1) − Yji(m−2)

}, if m > 2,

θj1{Xji(m−1)

}, if m = 2;

which is influenced by both covariate and response, with θ1 = (θ11, θ12)τ =

(2,−1)τ , θ2 = (θ21, θ22)τ = (2,−1.5)τ and θ3 = (θ31, θ32)

τ = (2,−1.5)τ . In bothmechanisms, the first observation (m = 1) for each subject was always observedas we have assumed earlier.

We used the Epanechnikov kernel K(u) = 0.75(1 − u2)+ throughout the simu-lation where (·)+ stands for the positive part of a function. The bandwidths werechosen by the “leave-one-subject” out cross-validation. Specifically, we chose thebandwidth hj that minimized the cross-validation score functions

nj∑i=1

Tj∑m=1

δjim

πjim(θj )

(Yjim − Xτ

jimβ(−i)j − Mτ

jimγ(−i)j − g

(−i)j (tj im)

)2,

where β(−i)j , γ

(−i)j and g

(−i)j (tj im) were the corresponding estimates without us-

ing observations of the ith subject. The cross-validation was used to choose anoptimal bandwidth for representative data sets and fixed the chosen bandwidths in

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 17

Page 18: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

18 S. X. CHEN AND P.-S. ZHONG

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

TABLE 1Empirical size and power of the 5% ANOVA test for H0a :β10 = β20 = β30

Sample size Missingness Missingness

n1 n2 n3 D2n D3n T I II T I II

60 65 55 0.0 0.0 (size) 5 0.042 0.050 10 0.046 0.0440.2 0.0 0.192 0.254 0.408 0.4340.3 0.0 0.548 0.630 0.810 0.8640.0 0.2 0.236 0.214 0.344 0.3540.0 0.3 0.508 0.546 0.714 0.7220.2 0.2 0.208 0.262 0.446 0.4580.2 0.3 0.412 0.440 0.680 0.6980.3 0.2 0.426 0.490 0.728 0.7280.3 0.3 0.594 0.620 0.836 0.818

100 110 105 0.0 0.0 (size) 5 0.052 0.054 10 0.042 0.0380.2 0.0 0.426 0.470 0.686 0.7180.3 0.0 0.854 0.854 0.964 0.9740.0 0.2 0.406 0.444 0.612 0.5680.0 0.3 0.816 0.836 0.936 0.9100.2 0.2 0.404 0.480 0.674 0.6860.2 0.3 0.744 0.694 0.944 0.8820.3 0.2 0.712 0.768 0.922 0.9200.3 0.3 0.824 0.814 0.972 0.970

the simulations with the same sample size. We fixed the number of simulations tobe 500.

The average missing percentages based on 500 simulations for the missingmechanism I were 8%, 15% and 17% for treatments 1–3, respectively, when T = 5,and were 16%, 28% and 31% when T = 10. In the missing mechanism II, the av-erage missing percentages were 10%, 8% and 15% for T = 5, and 23%, 20% and36% for T = 10, respectively.

For the ANOVA test for H0a :β10 = β20 = β30 with respect to the covariate ef-fects, three values of D2n and D3n: 0, 0.2 and 0.3, were used, respectively, while�1n(t) = �2n(t) = �3n(t) = 0. Table 1 summarizes the empirical size and powerof the proposed EL ANOVA test with 5% nominal significant level for H0a for9 combinations of (D2n,D3n), where the sizes corresponding to D2n = 0 andD3n = 0. We observed that the size of the ANOVA tests improved as the sam-ple sizes and the observational length T increased, and the overall level of sizewere close to the nominal 5%. This is quite reassuring considering the ANOVAtest is based on the asymptotic chi-square distribution. We also observed thatthe power of the test increased as sample sizes and T were increased, and asthe distance among the three βj0 was increased. For example, when D2n = 0.0and D3n = 0.3, the L2 distance was

√0.32 + 0.32 = 0.424, which is larger than√

0.12 + 0.22 + 0.32 = 0.374 for D2n = 0.2 and D3n = 0.3. This explains why the

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 18

Page 19: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

ANOVA FOR LONGITUDINAL DATA 19

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

TABLE 2Empirical size and power of the 5% test for the existence of interaction H0c :γ20 = 0

Sample size Missingness Missingness

n1 n2 n3 γ20 T I II T I II

60 65 55 0.0 (size) 5 0.052 0.048 10 0.048 0.0520.2 0.428 0.456 0.568 0.6360.3 0.722 0.788 0.848 0.8820.4 0.928 0.952 0.948 0.968

100 110 105 0.0 (size) 5 0.054 0.046 10 0.056 0.0420.2 0.608 0.718 0.694 0.8120.3 0.940 0.938 0.940 0.9580.4 0.986 0.994 0.952 0.966

ANOVA test was more powerful for D2n = 0.0 and D3n = 0.3 than D2n = 0.2 andD3n = 0.3. At the same time, we see similar power performance between the twomissing mechanisms.

To gain information on the empirical performance of the test on the existenceof interaction, we carried out a test for H0c :γ20 = 0. In the simulation, we choseγ20 = 0,0.2,0.3,0.4, β20 = 2 + γ20 and fixed �2n(t) = 0, respectively. Table 2summarizes the sizes and the powers of the test. Table 3 reports the simulationresults of the ANOVA test on the interaction effect H0d :γ10 = γ20 = γ30 with asimilar configurations as those used as the ANOVA tests for the covarites effectsreported in Table 1. We observe satisfactory performance of these two tests interms of both the accurate of the size approximation and the empirical power. Inparticular, the performance of the ANOVA tests were very much similar to thatconveyed in Table 1.

We then evaluate the power and size of the proposed ANOVA test regarding thenonparametric components. To study the local power of the test, we set �2n(t) =Un sin(2πt) and �3n(t) = 2 sin(2πt) − 2 sin(2π(t + Vn)), and fixed D2n = 0 andD3n = 0.2. Here, Un and Vn were designed to adjust the amplitude and phase ofthe sine function. The same kernel and bandwidths chosen by the cross-validationas outlined earlier in the parametric ANOVA test were used in the test for thenonparametric time effects. We calculated the test statistic Tn with �(t) being thekernel density estimate based on all the time points in all treatments. We appliedthe wild bootstrap proposed in Section 5 with B = 100 to obtain t0.05, the bootstrapestimator of the 5% critical value. The simulation results of the nonparametricANOVA test for the time effects are given in Table 4.

The sizes of the nonparametric ANOVA test were obtained when Un = 0 andVn = 0, which were quite close to the nominal 5%. We observe that the power ofthe test increased when the distance among g1(·), g2(·) and g3(·) were becominglarger, and when the sample size or repeated measurement T were increased. We

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 19

Page 20: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

20 S. X. CHEN AND P.-S. ZHONG

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

TABLE 3Empirical size and power of the 5% ANOVA test for H0d :γ10 = γ20 = γ30

Sample size Missingness Missingness

n1 n2 n3 D2n D3n T I II T I II

60 65 55 0.0 0.0 (size) 5 0.058 0.058 10 0.068 0.0360.2 0.0 0.134 0.188 0.232 0.2540.3 0.0 0.358 0.486 0.510 0.6220.0 0.2 0.136 0.166 0.230 0.2180.0 0.3 0.356 0.414 0.466 0.4740.2 0.2 0.170 0.208 0.286 0.2760.2 0.3 0.292 0.328 0.462 0.4280.3 0.2 0.266 0.356 0.498 0.4740.3 0.3 0.392 0.476 0.578 0.588

100 110 105 0.0 0.0 (size) 5 0.068 0.040 10 0.054 0.0460.2 0.0 0.262 0.366 0.354 0.4320.3 0.0 0.654 0.744 0.744 0.8200.0 0.2 0.272 0.330 0.340 0.3340.0 0.3 0.590 0.676 0.722 0.6720.2 0.2 0.282 0.332 0.412 0.4100.2 0.3 0.528 0.582 0.716 0.6400.3 0.2 0.502 0.580 0.680 0.7280.3 0.3 0.672 0.674 0.814 0.808

noticed that the power was more sensitive to change in Vn, the initial phase of thesine function, than Un.

TABLE 4Empirical size and power of the 5% ANOVA test for H0b :g1(·) = g2(·) = g3(·) with

�2n(t) = Un sin(2πt) and �3n(t) = 2 sin(2πt) − 2 sin(2π(t + Vn))

Sample size Missingness Missingness

n1 n2 n3 Un Vn T I II T I II

60 65 55 0.00 0.00 (size) 5 0.040 0.050 10 0.054 0.0600.30 0.00 0.186 0.232 0.282 0.2560.50 0.00 0.666 0.718 0.828 0.8400.00 0.05 0.664 0.726 0.848 0.8420.00 0.10 1.000 1.000 1.000 1.000

100 110 105 0.00 0.00 (size) 5 0.032 0.062 10 0.050 0.0360.30 0.00 0.434 0.518 0.526 0.5400.50 0.00 0.938 0.980 0.992 0.9980.00 0.05 0.916 0.974 1.000 1.0000.00 0.10 1.000 1.000 1.000 1.000

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 20

Page 21: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

ANOVA FOR LONGITUDINAL DATA 21

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

We then compared the proposed tests with a test proposed by Scheike and Zhang(1998). Scheike and Zhang’s test was comparing two treatments for the nonpara-metric regression model (4.8) for longitudinal data without missing values. Theirtest was based on a cumulative statistic

T (z) =∫ z

a

(g1(t) − g2(t)

)dt,

where a, z are in a common time interval [0,1]. They showed that√

n1 + n2T (z)

converges to a Gaussian Martingale with mean 0 and variance function ρ−11 h1(z)+

ρ−12 h2(z), where hj (z) = ∫ z

a σ 2j (y)f −1

j (y) dy. Hence, the test statistic T (1 −a)/

√Var{T (1 − a)} is used for two group time-effect functions comparison.

To make the proposed test and the test of Scheike and Zhang (1998) comparable,we conducted simulation in a set-up that mimics the setting of model (7.1) but withonly the first two treatments, no missing values and only the nonparametric part inthe regression by setting βj = γj = 0. Specifically, we test for H0 :g1(·) = g2(·) vs.H1 :g1(·) = g2(·) + �2n(·) for three cases of the alternative shift function �2n(·)functions which are spelt out in Table 5 and set a = 0 in the test of Scheike andZhang. The simulation results are summarized in Table 5. We found that in the firsttwo cases (I and II) of the alternative shift function �2n, the test of Scheike andZhang had little power. It was only in the third case (III), the test started to pick upsome power although it was still not as powerful as the proposed test.

8. Analysis on HIV-CD4 data. In this section, we analyzed a longitudinaldata set from AIDS Clinical Trial Group 193A Study [Henry et al. (1998)], whichwas a randomized, double-blind study of HIV-AIDS patients with advanced im-mune suppression. The study was carried out in 1993 with 1309 patients whowere randomized to four treatments with regard to HIV-1 reverse transcriptase in-hibitors. Patients were randomly assigned to one of four daily treatment regimes:600 mg of zidovudine alternating monthly with 400 mg didanosine (treatment I);600 mg of zidovudine plus 2.25 mg of zalcitabine (treatment II); 600 mg of zi-dovudine plus 400 mg of didanosine (treatment III); or 600 mg of zidovudine plus400 mg of didanosine plus 400 mg of nevirapine (treatment VI). The four treat-ments had 325, 324, 330 and 330 patients, respectively.

The aim of our analysis was to compare the effects of age (Age), baseline CD4counts (PreCD4), and gender (Gender) on Y = log(CD4 counts +1). The semi-parametric model regression is, for j = 1,2,3 and 4,

Yji(t) = βj1 Ageji +βj2 PreCD4ji +βj3 Genderji +gj (t) + εji(t)(8.1)

with the intercepts absorbed in the nonparametric gj (·) functions, and βj =(βj1, βj2, βj3)

τ is the regression coefficients to the covariates (Age, PreCD4, Gen-der).

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 21

Page 22: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

22 S. X. CHEN AND P.-S. ZHONG

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

TABLE 5The empirical sizes and powers of the proposed test (CZ) and the test (SZ) proposed by Scheike and

Zhang (1998) for H0b :g1(·) = g2(·) vs. H1b :g1(·) = g2(·) + �2n(·)

Sample size Tests Tests

n1 n2 n3 Un T CZ SZ T CZ SZ

60 65 55 Case I: �2n(t) = Un sin(2πt)

0.00 (size) 5 0.060 0.032 10 0.056 0.0280.30 0.736 0.046 0.844 0.0280.50 1.000 0.048 1.000 0.026

Case II: �2n(t) = 2 sin(2πt) − 2 sin(2π(t + Un))

0.05 1.000 0.026 1.000 0.0420.10 1.000 0.024 1.000 0.044

Case III: �2n(t) = −Un

0.10 0.196 0.162 0.206 0.1440.20 0.562 0.514 0.616 0.532

100 110 105 Case I: �2n(t) = Un sin(2πt)

0.00 (size) 5 0.056 0.028 10 0.042 0.0180.30 0.982 0.038 0.994 0.0400.50 1.000 0.054 1.000 0.028

Case II: �2n(t) = 2 sin(2πt) − 2 sin(2π(t + Un))

0.05 1.000 0.022 1.000 0.0300.10 1.000 0.026 1.000 0.030

Case III: �2n(t) = −Un

0.10 0.290 0.260 0.294 0.2180.20 0.780 0.774 0.760 0.730

To make gj (t) more interpretable, we centralized Age and PreCD4 so that theirsample means in each treatment were 0, respectively. As a result, gj (t) can be in-terpreted as the baseline evolution of Y for a female (Gender = 0) with the averagePreCD4 counts and the average age in treatment j . This kind of normalization isused in Wu and Chiang (2000) in their analyzes for another CD4 data set. Ourobjectives were to detect any difference in the treatments with respect to (i) thecovariates; and (ii) the nonparametric baseline functions.

Measurements of CD4 counts were scheduled at the start time 1 and at a 8-week intervals during the follow-up. However, the data were unbalanced due tovariations from the planned measurement time and missing values resulted fromskipped visits and dropouts. The number of CD4 measurements for patients duringthe first 40 weeks of follow-up varied from 1 to 9, with a median of 4. Therewere 5036 complete measurements of CD4, and 2826 scheduled measurementswere missing. Hence, considering missing values is very important in this analysis.Most of the missing values follow the monotone pattern. Therefore, we model themissing mechanism under the monotone assumption.

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 22

Page 23: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

ANOVA FOR LONGITUDINAL DATA 23

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

TABLE 6Difference in the AIC and BIC scores among three models (M1)–(M3)

Treatment I Treatment II Treatment III Treatment VI

Models AIC BIC AIC BIC AIC BIC AIC BIC

(M1)-(M2) 3.85 3.85 14.90 14.90 17.91 17.91 10.35 10.35(M2)-(M3) −2.47 −11.47 0.93 −8.12 0.30 −8.75 −3.15 −12.27

We considered three logistic regression models for the missing propensi-ties and used the AIC and BIC criteria to select the one that was the mostlysupported by data. The first model (M1) was a logistic regression model forpj (

Xjit,3,↼

Yjit,3; θj0) that effectively depends on Xjit (the PreCD4) and (Yji(t−1),Yji(t−2), Yji(t−3)) if t > 3. For t < 3, it relies on all Yjit observed before t . In thesecond model (M2), we replace the Xjit in the first model with an intercept. Inthe third model (M3), we added to the second logistic model with covariates rep-resenting the square of Yji(t−1) and the interactions between Yji(t−1) and Yji(t−2).In the formulation of the AIC and BIC criteria, we used the binary conditional like-lihood given in (2.5) with the respective penalties. The difference of AIC and BICvalues among these models for four treatment groups is given in Table 6. Underthe BIC criterion, M2 was the best model for all four treatments. For treatments IIand III, M3 had smaller AIC values than M2, but the differences were very small.For treatments I and VI, M2 had smaller AIC than M3. As the AIC tends to selectmore explanatory variables, we chose M2 as the model for the parametric missingpropensity.

Model (8.1) does not have interactions. It is interesting to check if there is aninteraction between gender and time. Then the model becomes

Yji(t) = βj1Ageji + βj2 PreCD4ji + βj3 Genderji

(8.2)+ γj4 Genderji × t + gj (t) + εji(t).

We applied the proposed test in Section 5 for H0c :γj4 = 0 for j = 1,2,3 and 4,respectively. The p-values were 0.9234,0.9885,0.9862 and 0.5558, respectively,which means that the interaction was not significant. Therefore, in the followinganalyzes, we would not include the interaction term and continue to use model(8.1).

Table 7 reports the parameter estimates βj of βj based on the estimating func-tion Zji(βj ) given in Section 3. It contains the standard errors of the estimates,which were obtained from the length of the EL confidence intervals based onthe marginal empirical likelihood ratio for each βj as proposed in Chen and Hall(1994). In getting these estimates, we use the “leave-one-subject” cross-validation[Rice and Silverman (1991)] to select the smoothing bandwidths {hj }4

j=1 for the

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 23

Page 24: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

24 S. X. CHEN AND P.-S. ZHONG

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

TABLE 7Parameter estimates and their standard errors

Treatment I Treatment II Treatment III Treatment IV

Coefficients β1 β2 β3 β4

Age 0.0063 (0.0039) 0.0050 (0.0040) 0.0047 (0.0058) 0.0056 (0.0046)PreCD4 0.7308 (0.0462) 0.7724 (0.0378) 0.7587 (0.0523) 0.8431 (0.0425)Gender 0.1009 (0.0925) 0.1045 (0.0920) −0.3300 (0.1510) −0.3055 (0.1136)

four treatments, which were 12.90,7.61,8.27 and 16.20, respectively. We see thatthe estimates of the coefficients for the Age and PreCD4 were similar among allfour treatments with comparable standard errors, respectively. In particular, the es-timates of the Age coefficients endured large variations while the estimates of thePreCD4 coefficients were quite accurate. However, estimates of the Gender co-efficients had different signs among the treatments. We may also notice that theconfidence intervals from treatments I–IV for each coefficient were overlap.

We then formally tested H0a :β1 = β2 = β3 = β4. The empirical likelihood ra-tio statistic �n was 8.1348, which was smaller than χ2

9,0.95 = 16.9190, which pro-duced a p-value of 0.5206. So we do not have enough evidence to reject H0a ata significant level 5 %. The parameter estimates reported in Table 7 suggestedsimilar covariate effects between treatments I and II, and between treatments IIIand IV, respectively; but different effects between the first two treatments and thelast two treatments. To verify this suggestion, we carry out formal ANOVA testfor pair-wise equality among the βj ’s as well as for equality of any three βj ’s.The p-values of these ANOVA test are reported in Table 8. Indeed, the differencebetween the first two treatments and between the last two treatments were insignif-icant. However, the differences between the first three (I, II and III) treatments andthe last treatment were also not significant.

We then tested for the nonparametric baseline time effects. The kernel esti-mates gj (t) are displayed in Figure 1, which shows that treatments I and II and

TABLE 8p-values of ANOVA tests for βj ’s

H0a p-value H0a p-value

β1 = β2 0.9661 β1 = β2 = β3 0.7399β1 = β3 0.4488 β1 = β2 = β4 0.4011β1 = β4 0.1642 β1 = β3 = β4 0.3846β2 = β3 0.4332 β2 = β3 = β4 0.4904β2 = β4 0.2523 β1 = β2 = β3 = β4 0.5206β3 = β4 0.8450

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 24

Page 25: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

ANOVA FOR LONGITUDINAL DATA 25

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

FIG. 1. (a) The raw data excluding missing values plots with the estimates of gj (t) (j = 1,2,3,4).(b) The estimates of gj (t) in the same plot: treatment I (solid line), treatment II (short dashed line),treatment III (dashed and doted line) and treatment IV (long dashed line).

treatments III and IV had similar baselines evolution overtime, respectively. How-ever, a big difference existed between the first two treatments and the last twotreatments. Treatment IV decreased more slowly than that of the other three treat-ments, which seemed to be the most effective in slowing down the decline of CD4.We also found that during the first 16 weeks the CD4 counts decrease slowly andthen the decline became faster after 16 weeks for treatments I, II and III.

The p-value for testing H0b :g1(·) = g2(·) = g3(·) = g4(·) is shown in Table 9.The entries were based on 500 bootstrapped resamples according to the procedureintroduced in Section 6. The statistics Tn for testing H0b :g1(·) = g2(·) = g3(·) =g4(·) was 3965.00, where we take �(t) = 1 over the range of t . The p-value of thetest was 0.004. Thus, there existed significant difference in the baseline time effectsgj (·)’s among treatments I–IV. At the same time, we also calculate the test statis-tics Tn for testing g1(·) = g2(·) and g3(·) = g4(·). The statistics values were 19.26

TABLE 9p-values of ANOVA tests on gj (·)’s

H0b p-value H0b p-value

g1(·) = g2(·) 0.894 g1(·) = g2(·) = g3(·) 0.046g1(·) = g3(·) 0.018 g1(·) = g2(·) = g4(·) 0.010g1(·) = g4(·) 0.004 g1(·) = g3(·) = g4(·) 0.000g2(·) = g3(·) 0.020 g2(·) = g3(·) = g4(·) 0.014g2(·) = g4(·) 0.006 g1(·) = g2(·) = g3(·) = g4(·) 0.004g3(·) = g4(·) 0.860

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 25

Page 26: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

26 S. X. CHEN AND P.-S. ZHONG

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

and 26.22, with p-values 0.894 and 0.860, respectively. These p-values are muchbigger than 0.05. We conclude that treatment I and II has similar baseline timeeffects, but they are significantly distinct from the baseline time effects of treat-ment III and IV, respectively. p-values of testing other combinations on equalitiesof g1(·), g2(·), g3(·) and g4(·) are also reported in Table 9.

This data set has been analyzed by Fitzmaurice, Laird and Ware (2004) using arandom effects model that applied the Restricted Maximum Likelihood (REML)method. They conducted a two sample comparison test via parameters in the modelfor the difference between the dual therapy (treatment I–III) versus triple therapy(treatment VI) without considering the missing values. More specifically, they de-noted Group = 1 if subject in the triple therapy treatment and Group = 0 if subjectin the dual therapy treatment, and the linear mixed effect was

E(Y |b) = β1 + β2t + β3(t − 16)+ + β4 Group × t

+ β5 Group × (t − 16)+ + b1 + b2t + b3(t − 16)+,

where b = (b1, b2, b3) are random effects. They tested H0 :β4 = β5 = 0. Thisis equivalent to test the null hypothesis of no treatment group difference in thechanges in log CD4 counts between therapy and dual treatments. Both Wald testand likelihood ratio test rejected the null hypothesis, indicating the difference be-tween dual and triple therapy in the change of log CD4 counts. Their results areconsistent with the result we illustrated in Table 9.

APPENDIX: TECHNICAL ASSUMPTIONS

We provides the conditions used for Theorems 1–5 and some remark in thissection. The proofs for Theorems 1, 2, 3 and 5 are contained in the supplementarticle [Chen and Zhong (2010)]. The proof for Theorem 4 is largely similar tothat of Theorem 1 and is omitted.

The following assumptions are made in the paper:

A1. Let S(θj ) be the score function of the partial likelihood LBj(θj ) for a q-

dimensional parameter θj defined in (2.5), and θj0 is in the interior of com-pact �j . We assume E{S(θj )} �= 0 if θj �= θj0, Var(S(θj0)) is finite and pos-

itive definite, and E(∂S(θj0)

∂θj0) exists and is invertible. The missing propensity

πjim(θj0) > b0 > 0 for all j, i,m.A2. (i) The kernel function K is a symmetric probability density which is dif-

ferentiable of Lipschitz order 1 on its support [−1,1]. The bandwidthssatisfy njh

2j / log2 nj → ∞, n

1/2j h4

j → 0 and hj → 0 as nj → ∞.(ii) For each treatment j (j = 1, . . . , k), the design points {tj im} are thought

to be independent and identically distributed from a super-populationwith density fj (t). There exist constants bl and bu such that 0 < bl ≤supt∈S fj (t) ≤ bu < ∞.

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 26

Page 27: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

ANOVA FOR LONGITUDINAL DATA 27

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

(iii) For each hj and Tj , j = 1, . . . , k, there exist finite positive constantsαj , bj and T such that αjTj = T and bjhj = h for some h as h → 0.Let n = ∑k

i=1 nj , nj/n → ρj for some nonzero ρj as n → ∞ such that∑ki=1 ρj = 1.

A3. The residuals {εji} and {uji} are independent of each other and each of {εji}and {uji} are mutually independent among different j or i, respectively;

max1≤i≤nj‖ujim‖ = op{n(2+r)/(2(4+r))

j (lognj )−1}, max1≤i≤nj

E|εjim|4+r <

∞, for some r > 0; and assume that

limnj→∞(njTj )

−1nj∑i=1

Tj∑m=1

E{XjimXτjim} = �x > 0,

where Xjim = Xjim − E(Xjim|tj im).A4. The functions gj0(t) and hj (t) are, respectively, one-dimensional and p-

dimensional smooth functions with continuously second derivatives on S =[0,1].

REMARK. Condition A1 are the regular conditions for the consistency of thebinary MLE for the parameters in the missing propensity. Condition A2(i) are theusual conditions for the kernel and bandwidths in nonparametric curve estimation.Note that the optimal rate for the bandwidth hj = O(n

−1/5j ) satisfies A2(i). The

requirement of design points {tj im} in A2(ii) is a common assumption similar to theones in Müller (1987). Condition A2(iii) is a mild assumption on the relationshipbetween bandwidths and sample sizes among different samples. In A3, we do notrequire the residuals {εji} and {uji} being, respectively, identically distributed foreach fixed j . This allows extra heterogeneity among individuals for a treatment.The positive definite of �x in condition A3 is used to identify the “parameters”(βj0, γj0, gj0) uniquely, which is a generalization of the identification conditionused in Härdle, Liang and Gao (2000) to longitudinal data. This condition can bechecked empirically by constructing consistent estimate of �x .

Acknowledgments. The authors thank the referees, the Associate Editors andthe editors for valuable comments which lead to improvement of the presentationof the paper.

SUPPLEMENTARY MATERIAL

Supplement: Supplement to “ANOVA for Longitudinal Data with MissingValues” (DOI: 10.1214/10-AOS824SUPP; .pdf). This supplement material pro- ???.???vides technical proofs to the asymptotic distributions of the empirical likelihoodANOVA test statistics for comparing the treatment effects with respect to covari-ates given in Theorems 1 and 2, the asymptotic normality of the empirical likeli-hood ratio based ANOVA test statistic for comparing the nonparametric time effectfunctions given in Theorem 3 and justifies the usage of the proposed bootstrap pro-cedure.

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 27

Page 28: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

28 S. X. CHEN AND P.-S. ZHONG

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

REFERENCES

CAO, R. and VAN KEILEGOM, I. (2006). Empirical likelihood tests for two-sample problems vianonparametric density estimation. Canad. J. Statist. 34 61–77. MR2267710

CHEN, S.-X. and CUI, H.-J. (2006). On Bartlett correction of empirical likelihood in the presenceof nuisance parameters. Biometrika 93 215–220. MR2277752

CHEN, S.-X. and HALL, P. (1994). On the calculation of standard error for quotation in confidencestatements. Statist. Probab. Lett. 19 147–151.

CHEN, S.-X., LEUNG, D. and QIN, J. (2008). Improved semiparametric estimation using surrogatedata. J. R. Stat. Soc. Ser. B Stat. Methodol. 70 803–823. MR2523905

CHEN, S.-X. and VAN KEILEGOM, I. (2009). A review on empirical likelihood for regressions.TEST 18 415–447. MR2566404

CHEN, S.-X. and ZHONG, P.-S. (2010). A Supplement to “ANOVA for longitudinal data with miss-ing values.” DOI: 10.1214/09-AOS824SUPP.

DICICCO, T., HALL, P. and ROMANO, J. (1991). Empirical likelihood is Barterlett-correctable. Ann.Statist. 19 1053–1061. MR1105861

FAN, J., HUANG, T. and LI, R. (2007). Analysis of longitudinal data with semiparametric estimationof covariance function. J. Amer. Statist. Assoc. 102 632–641. MR2370857

FAN, J. and ZHANG, J. (2004). Sieve empirical likelihood ratio tests for nonparametric functions.Ann. Statist. 32 1858–1907. MR2102496

FITZMAURICE, G., LAIRD, N. and WARE, J. (2004). Applied Longitudinal Analysis. Wiley, Hobo-ken, NJ. MR2063401

FORCINA, A. (1992). Modelling balanced longitudinal data: Maximum likelihood estimation andanalysis of variance. Biometrics 48 743–750. MR1187599

HÄRDLE, W., LIANG, H. and GAO, J. (2000). Partially Linear Models. Heidelberg.HÄRDLE, W. and MAMMEN, E. (1993). Comparing nonparametric versus parametric regression fits.

Ann. Statist. 21 1926–1947. MR1245774HENRY, K., ERICE, A., TIERNEY, C., BALFOUR, H., FISCHL, M., KMACK, A., LIOU, S., KEN-

TON, A., HIRSCH, M., PHAIR, J., MARTINEZ, A. and KAHN, J. FOR THE AIDS CLINICAL

TRIAL GROUP 193A STUDY TEAM (1998). A randomized, controlled, double-blind study com-paring the survival benefit of four different reverse transcriptase inhibitor therapies (three-drug,two-drug, and alternating drug) for the treatment of advanced AIDS. Journal of Acquired ImmuneDeficiency Syndromes And Human Retrovirology 19 339–349.

HUANG, C. Y., QIN, J. and FOLLMANN, D. (2008). Empirical likelihood-based estimation of thetreatment effect in a pretest-posttest study. J. Amer. Statist. Assoc. 103 1270–1280. MR2462898

LAIRD, N. (2004). Analysis of Longitudinal and Cluster-Correlated Data.LAIRD, N. and WARE, J. (1982). Random-effects models for longitudinal data. Biometrics 38 963–

974.LIANG, H. and QIN, Y.-S. (2008). Empirical likelihood-based inferences for partially linear models

with missing covariates. Aust. N. Z. J. Stat. 50 347–359. MR2474196LIANG, K. and ZEGER, S. (1986). Longitudinal data analysis using generalized linear models. Bio-

metrica 73 13–22. MR0836430LIN, D. and YING, Z. (2001). Semiparametric and nonparametric regression analysis of longitudinal

data. J. Amer. Statist. Assoc. 96 103–126. MR1952726<uncited> LINTON, O. and NIELSEN J. (1995). A kernel method of estimating structured nonparametric re-

gression based on marginal integration. Biometrika 82 93–100. MR1332841LIU, R. (1988). Bootstrap procedures under some non-i.i.d. models. Ann. Statist. 16 1696–1708.

MR0964947LITTLE, R. and RUBIN, B. (2002). Statistical Analysis with Missing Data, 2nd ed. Wiley, Hoboken,

NJ. MR1925014

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 28

Page 29: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

ANOVA FOR LONGITUDINAL DATA 29

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

LITTLE, R. (1995). Modeling the drop-out mechanism in repeated-measures studies. J. Amer. Statist.Assoc. 90 1112–1121. MR1354029

MÜLLER, H. (1987). Weighted local regression and kernel methods for nonparametric curve fitting.J. Amer. Statist. Assoc. 82 231–238. MR0883351

MUND, A. and DETTLE, H. (1998). Nonparametric comparison of several regression functions:Exact and asymptotic theory. Ann. Statist. 26 2339–2368. MR1700235

OWEN, A. (1988). Empirical likelihood ratio confidence intervals for a single functional. Biometrika75 237–249. MR0946049

OWEN, A. (1990). Empirical likelihood ratio confidence regions. Ann. Statist. 18 90–120.MR1041387

OWEN, A. (2001). Empirical Likelihood. Chapman and Hall/CRC.PARDO-FERNÁNDEZ, J., VAN KEILEGOM, I. and GONZÁLEZ-MANTEIGA, W. (2007). Testing for

the equality of k regression curves. Statist. Sinica 17 1115–1137. MR2397389QIN, J. and LAWLESS, J. (1994). Empirical likelihood and general estimating equations. Ann. Statist.

22 300–325. MR1272085RICE, J. and SILVERMAN, B. (1991). Estimating the mean and covariance structure nonparametri-

cally when the data are curves. J. Roy. Statist. Soc. Ser. B 53 233–243. MR1094283ROBINS, J., ROTNITZKY, A. and ZHAO, L. (1995). Analysis of semiparametric regression mod-

els for repeated outcomes in the presence of missing data. J. Amer. Statist. Assoc. 90 106–121.MR1325118

RUBIN, B. (1976). Inference and missing data (with discussion). Biometrika 63 581–592.MR0455196

SCHEIKE, T. H. and ZHANG, M. J. (1998). Cumulative regression function tests for regressionmodels for longitudinal data. Ann. Statist. 26 1328–1355. MR1647665

SHAO, J. and SITTER, R. (1996). Bootstrap for imputed survey data. J. Amer. Statist. Assoc. 911278–1288. MR1424624

SPECKMAN, P. (1988). Kernel smoothing in partial linear models. J. Roy. Statist. Soc. Ser. B 50413–436. MR0970977

WANG, L., AKRITAS, M. and VAN KEILEGOM, I. (2008). An ANOVA-type nonparametric diag-nostic test for heteroscedastic regression models. J. Nonparametr. Stat. 20 365–382. MR2424247

WANG, N., CARROLL, R. and LIN, X. (2005). Efficient marginal estimation for longitudi-nal/clustered data. J. Amer. Statist. Assoc. 100 147–157. MR2156825

WANG, Q.-H., LINTON, O. and HÄRDLE, W. (2004). Semiparametric regression analysis withmissing response at random. J. Amer. Statist. Assoc. 99 334–345. MR2062820

WANG, Q.-H. and RAO, J. (2002). Empirical likelihood-based inference under imputation for miss-ing response data. Ann. Statist. 30 896–924. MR1922545

WU, C.-F.-J. (1986). Jackknife, bootstrap and other resampling method in regression analysis. Ann.Statist. 14 1261–1295. MR0868303

WU, C., CHIANG, C. and HOOVER, D. (1998). Asymptotic confidence regions for kernel smoothingof a varying-coefficient model with longitudinal data. J. Amer. Statist. Assoc. 93 1388–1402.MR1666635

WU, C. and CHIANG, C. (2000). Kernel smoothing on varying coefficient models with longitudinaldependent variable. Statist. Sinica 10 433–456. MR1769751

XUE, L. and ZHU, L. (2007a). Empirical likelihood semiparametric regression analysis for longitu-dinal data. Biometrika 94 921–937. MR2416799

XUE, L. and ZHU, L. (2007b). Empirical likelihood for a varying voefficient model with longitudinaldata. J. Amer. Statist. Assoc. 102 642–654. MR2370858

<uncited> YOU, J., CHEN, G. and ZHOU, Y. (2006). Block empirical likelihood for longitudinal partially linearregression models. Canad. J. Statist. 34 79–96. MR2267711

ZEGER, S. and DIGGLE, P. (1994). Semiparametric models for longitudinal data with application toCD4 cell numbers in HIV seroconverters. Biometrics 50 689–699.

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 29

Page 30: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

30 S. X. CHEN AND P.-S. ZHONG

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

ZHANG, D., LIN, X., RAZ, J. and SOWERS, J. (1998). Semiparametric stochastic mixed models forlongitudinal data. J. Amer. Statist. Assoc. 93 710–719. MR1631369

DEPARTMENT OF STATISTICS

IOWA STATE UNIVERSITY

AMES, IOWA 50011-1210USAAND

GUANGHUA SCHOOL OF MANAGEMENT

PEKING UNIVERSITY, BEIJING 100871CHINA

E-MAIL: [email protected]

DEPARTMENT OF STATISTICS

IOWA STATE UNIVERSITY

AMES, IOWA 50011-1210USAE-MAIL: [email protected]

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 30

Page 31: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

THE LIST OF SOURCE ENTRIES RETRIEVED FROM MATHSCINET

The list of entries below corresponds to the Reference section of your article and was retrievedfrom MathSciNet applying an automated procedure. Please check the list and cross out thoseentries which lead to mistaken sources. Please update your references entries with the datafrom the corresponding sources, when applicable. More information can be found in the sup-port page:http://www.e-publications.org/ims/support/mrhelp.html.

CAO, R. AND VAN KEILEGOM, I. (2006). Empirical likelihood tests for two-sample problems vianonparametric density estimation. Canad. J. Statist. 34 61–77. 2267710 (2008e:62083)

CHEN, S. X. AND CUI, H. (2006). On Bartlett correction of empirical likelihood in the presence ofnuisance parameters. Biometrika 93 215–220. 2277752

Not Found!CHEN, S. X., LEUNG, D. H. Y., AND QIN, J. (2008). Improving semiparametric estimation by

using surrogate data. J. R. Stat. Soc. Ser. B Stat. Methodol. 70 803–823. 2523905CHEN, S. X. AND VAN KEILEGOM, I. (2009). A review on empirical likelihood methods for

regression. TEST 18 415–447. 2566404Not Found!DICICCIO, T., HALL, P., AND ROMANO, J. (1991). Empirical likelihood is Bartlett-correctable.

Ann. Statist. 19 1053–1061. 1105861 (92h:62004)FAN, J., HUANG, T., AND LI, R. (2007). Analysis of longitudinal data with semiparametric estima-

tion of convariance function. J. Amer. Statist. Assoc. 102 632–641. 2370857FAN, J. AND ZHANG, J. (2004). Sieve empirical likelihood ratio tests for nonparametric functions.

Ann. Statist. 32 1858–1907. 2102496 (2005h:62131)FITZMAURICE, G. M., LAIRD, N. M., AND WARE, J. H. (2004). Applied longitudinal analysis.

Wiley Series in Probability and Statistics. Wiley-Interscience [John Wiley & Sons], Hoboken, NJ.2063401 (2005a:62002)

FORCINA, A. (1992). Modelling balanced longitudinal data: maximum likelihood estimation andanalysis of variance. Biometrics 48 743–750. 1187599 (93f:62144)

Not Found!HÄRDLE, W. AND MAMMEN, E. (1993). Comparing nonparametric versus parametric regression

fits. Ann. Statist. 21 1926–1947. 1245774 (94k:62057)Not Found!HUANG, C.-Y., QIN, J., AND FOLLMANN, D. A. (2008). Empirical likelihood-based estimation of

the treatment effect in a pretest-posttest study. J. Amer. Statist. Assoc. 103 1270–1280. 2462898(2009m:62031)

Not Found!Not Found!LIANG, H. AND QIN, Y. (2008). Empirical likelihood-based inferences for partially linear models

with missing covariates. Aust. N. Z. J. Stat. 50 347–359. 2474196 (2010b:62272)LIANG, K. Y. AND ZEGER, S. L. (1986). Longitudinal data analysis using generalized linear mod-

els. Biometrika 73 13–22. 836430 (87h:62127)LIN, D. Y. AND YING, Z. (2001). Semiparametric and nonparametric regression analysis of longi-

tudinal data. J. Amer. Statist. Assoc. 96 103–126. With comments and a rejoinder by the authors.1952726

LINTON, O. AND NIELSEN, J. P. (1995). A kernel method of estimating structured nonparametricregression based on marginal integration. Biometrika 82 93–100. 1332841 (96f:62065)

LIU, R. Y. (1988). Bootstrap procedures under some non-i.i.d. models. Ann. Statist. 16 1696–1708.964947 (90h:62095)

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 31

Page 32: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

LITTLE, R. J. A. AND RUBIN, D. B. (2002). Statistical analysis with missing data, Second ed.Wiley Series in Probability and Statistics. Wiley-Interscience [John Wiley & Sons], Hoboken,NJ. 1925014 (2003g:62007)

LITTLE, R. J. A. (1995). Modeling the drop-out mechanism in repeated-measures studies. J. Amer.Statist. Assoc. 90 1112–1121. 1354029

MÜLLER, H.-G. (1987). Weighted local regression and kernel methods for nonparametric curvefitting. J. Amer. Statist. Assoc. 82 231–238. 883351 (88d:62074)

MUNK, A. AND DETTE, H. (1998). Nonparametric comparison of several regression functions:exact and asymptotic theory. Ann. Statist. 26 2339–2368. 1700235 (2000f:62077)

OWEN, A. B. (1988). Empirical likelihood ratio confidence intervals for a single functional. Bio-metrika 75 237–249. 946049 (90b:62047)

OWEN, A. (1990). Empirical likelihood ratio confidence regions. Ann. Statist. 18 90–120. 1041387(91g:62037)

Not Found!PARDO-FERNÁNDEZ, J. C., VAN KEILEGOM, I., AND GONZÁLEZ-MANTEIGA, W. (2007). Test-

ing for the equality of k regression curves. Statist. Sinica 17 1115–1137. 2397389 (2009b:62083)QIN, J. AND LAWLESS, J. (1994). Empirical likelihood and general estimating equations. Ann.

Statist. 22 300–325. 1272085 (95g:62033)RICE, J. A. AND SILVERMAN, B. W. (1991). Estimating the mean and covariance structure non-

parametrically when the data are curves. J. Roy. Statist. Soc. Ser. B 53 233–243. 1094283ROBINS, J. M., ROTNITZKY, A., AND ZHAO, L. P. (1995). Analysis of semiparametric regression

models for repeated outcomes in the presence of missing data. J. Amer. Statist. Assoc. 90 106–121.1325118 (96d:62066)

RUBIN, D. B. (1976). Inference and missing data. Biometrika 63 581–592. With comments by R.J. A. Little and a reply by the author. 0455196 (56 #13435)

SCHEIKE, T. H. AND ZHANG, M.-J. (1998). Cumulative regression function tests for regressionmodels for longitudinal data. Ann. Statist. 26 1328–1355. 1647665 (99i:62217)

SHAO, J. AND SITTER, R. R. (1996). Bootstrap for imputed survey data. J. Amer. Statist. Assoc. 911278–1288. 1424624 (97g:62010)

SPECKMAN, P. (1988). Kernel smoothing in partial linear models. J. Roy. Statist. Soc. Ser. B 50413–436. 970977 (90g:62093)

WANG, L., AKRITAS, M. G., AND VAN KEILEGOM, I. (2008). An ANOVA-type nonparametricdiagnostic test for heteroscedastic regression models. J. Nonparametr. Stat. 20 365–382. 2424247(2009i:62079)

WANG, N., CARROLL, R. J., AND LIN, X. (2005). Efficient semiparametric marginal estimationfor longitudinal/clustered data. J. Amer. Statist. Assoc. 100 147–157. 2156825

WANG, Q., LINTON, O., AND HÄRDLE, W. (2004). Semiparametric regression analysis with miss-ing response at random. J. Amer. Statist. Assoc. 99 334–345. 2062820 (2005e:62053)

WANG, Q. AND RAO, J. N. K. (2002). Empirical likelihood-based inference under imputation formissing response data. Ann. Statist. 30 896–924. Dedicated to the memory of Lucien Le Cam.1922545 (2003f:62054)

WU, C.-F. J. (1986). Jackknife, bootstrap and other resampling methods in regression analysis. Ann.Statist. 14 1261–1350. With discussion and a rejoinder by the author. 868303 (88f:62106)

WU, C. O., CHIANG, C.-T., AND HOOVER, D. R. (1998). Asymptotic confidence regions for kernelsmoothing of a varying-coefficient model with longitudinal data. J. Amer. Statist. Assoc. 93 1388–1402. 1666635 (2000b:62073)

WU, C. O. AND CHIANG, C.-T. (2000). Kernel smoothing on varying coefficient models withlongitudinal dependent variable. Statist. Sinica 10 433–456. 1769751

XUE, L. AND ZHU, L. (2007). Empirical likelihood semiparametric regression analysis for longitu-dinal data. Biometrika 94 921–937. 2416799

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 32

Page 33: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

XUE, L. AND ZHU, L. (2007). Empirical likelihood for a varying coefficient model with longitudinaldata. J. Amer. Statist. Assoc. 102 642–654. 2370858

YOU, J., CHEN, G., AND ZHOU, Y. (2006). Block empirical likelihood for longitudinal partiallylinear regression models. Canad. J. Statist. 34 79–96. 2267711 (2009a:62262)

Not Found!ZHANG, D., LIN, X., RAZ, J., AND SOWERS, M. (1998). Semiparametric stochastic mixed models

for longitudinal data. J. Amer. Statist. Assoc. 93 710–719. 1631369

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 33

Page 34: Iowa State University and Peking University and Iowa State … · variety of data and model combinations, that encompasses parametric, semi-parametric and nonparametric regression

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 10

11 11

12 12

13 13

14 14

15 15

16 16

17 17

18 18

19 19

20 20

21 21

22 22

23 23

24 24

25 25

26 26

27 27

28 28

29 29

30 30

31 31

32 32

33 33

34 34

35 35

36 36

37 37

38 38

39 39

40 40

41 41

42 42

43 43

META DATA IN THE PDF FILE

Following information will be included as pdf file Document Properties:

Title : ANOVA for longitudinal data with missing valuesAuthor : Song Xi Chen, Ping-Shou ZhongSubject : The Annals of Statistics, 0, Vol.0, No.00, 1-34Keywords: 62G10, 62G20, 62G09, Analysis of variance, empirical like-

lihood, kernel smoothing, missing at random, semiparametric model,treatment effects

THE LIST OF URI ADRESSES

Listed below are all uri addresses found in your paper. The non-active uri addresses, if any, areindicated as ERROR. Please check and update the list where necessary. The e-mail addressesare not checked – they are listed just for your information. More information can be found inthe support page:http://www.e-publications.org/ims/support/urihelp.html.

200 http://www.imstat.org/aos/ [2:pp.1,1] OK200 http://dx.doi.org/10.1214/10-AOS824 [2:pp.1,1] OK200 http://www.imstat.org [2:pp.1,1] OK302 http://www.ams.org/msc/ [2:pp.1,1] Found200 http://dx.doi.org/10.1214/10-AOS824SUPP [2:pp.27,27] OK200 http://dx.doi.org/10.1214/09-AOS824SUPP [2:pp.28,28] OK--- mailto:[email protected] [2:pp.30,30] Check skip--- mailto:[email protected] [2:pp.30,30] Check skip

AOS imspdf v.2010/08/06 Prn:2010/08/31; 14:58 F:aos824.tex; (Laima) p. 34