
The University of Adelaide School of Economics

Research Paper No. 2009-16 October 2009

A New Diagnostic Test for Cross–Section Independence in Nonparametric Panel Data

Model

Jia Chen, Jiti Gao and Degui Li


The University of Adelaide, School of Economics

Working Paper Series No: 0075 (2009-16)

A New Diagnostic Test for Cross–Section

Independence in Nonparametric Panel Data Models

Jia Chen, Jiti Gao1 and Degui Li

The University of Adelaide, SA 5005, Australia

Abstract

In this paper, we propose a new diagnostic test for residual cross–section in-

dependence in a nonparametric panel data model. The proposed nonparametric

cross–section dependence (CD) test is a nonparametric counterpart of an existing

parametric CD test proposed in Pesaran (2004) for the parametric case. We establish

an asymptotic distribution of the proposed test statistic under the null hypothesis.

As in the parametric case, the proposed test has an asymptotically normal distribu-

tion. We then analyze the power function of the proposed test under an alternative

hypothesis that involves a nonlinear multi–factor model. We also provide several

numerical examples. The small sample studies show that the nonparametric CD

test associated with an asymptotic critical value works well numerically in each in-

dividual case. An empirical analysis of a set of CPI data in Australian capital cities

is given to examine the applicability of the proposed nonparametric CD test.

Keywords: Cross–section independence; local linear smoother; nonlinear panel data

model; nonparametric diagnostic test; size and power function

1Jiti Gao is from the School of Economics, The University of Adelaide. Adelaide SA 5005, Australia. Email:

[email protected].


1. Introduction

Panel data analysis has become increasingly popular in many fields, such as economics,

finance and biology, since it provides the researcher with a wide variety of double–index models

rather than just purely cross–section or time series data models. There exists a rich literature

on parametric linear and nonlinear panel data models. For an overview of statistical inference

and econometric analysis of the parametric panel data models, we refer to the books by Baltagi

(1995), Arellano (2003) and Hsiao (2003). As in both the cross–sectional and time series

cases, parametric models may be too restrictive in some cases. As a consequence, existing

parametric tests may not be applicable in such cases. To address such issues, nonparametric

and semiparametric methods have been used in both model estimation and specification testing.

Recent studies include Li and Hsiao (1998), Ullah and Roy (1998), Hjellvik, Chen and Tjøstheim

(2004), Li and Racine (2007), Cai and Li (2008), and Henderson, Carroll and Li (2008).

Existing studies in nonparametric and semiparametric estimation and model specification

testing mainly assume cross–section independence. Such an assumption is far from realistic,

since cross–section dependence may arise in practice due to the presence of common shocks,

unobserved components that become part of the error term ultimately, economic distance and

spatial correlations. If observations are cross–section dependent, parametric and nonparametric

estimators based on the assumption of cross–section independence may be inconsistent. As

pointed out by Hsiao (2003), meanwhile, there is no natural ordering for cross–section indices,

and appropriate modelling and estimation of cross–section dependence is difficult particularly

when the dimension of cross–section observations N is large. Hence, it is appealing to test for

cross–section independence before one attempts to make some statistical inference for a panel

data model.

There is a substantial literature on diagnostic tests for cross–section independence in para-

metric panel data models. Breusch and Pagan (1980) proposed a Lagrange multiplier (LM)

test statistic, which is based on the average of the squared pair–wise correlation coefficients of

the residuals. The LM test requires that T is much larger than N , where T and N are the time

dimension and the cross–section dimension, respectively. Note that the mean of the squared

correlation coefficients is, however, not correctly centered when T is small. Frees (1995) thus

proposed a test statistic that is based on the squared Spearman rank correlation coefficients

and allows N to be larger than T . Recently, Pesaran (2004) introduced the so–called cross–

section dependence (CD) test. The main idea of proposing the CD test is to use the simple

average of all pair–wise correlation coefficients of the residuals from the individual parametric

linear regressions in the panel. The advantage of the CD test is that it is correctly centered

when both N and T are fixed. Ng (2006) employed spacing variance ratio statistics to test the

severity of cross–section correlation in panels by partitioning the pair–wise cross–correlations

into groups from high to low. Ng (2006)’s test statistics are proposed as agnostic tools for


identifying and characterizing correlations across groups. More recently, Hsiao, Pesaran and

Pick (2007) extended the LM and CD tests from parametric linear panel data models to para-

metric nonlinear models. For other recent contributions to diagnostic tests of cross–section

independence, we refer to Huang, Kao and Urga (2008), Pesaran, Ullah and Yamagata (2008),

and Sarafidis, Yamagata and Robertson (2009).

By contrast, there are few studies on diagnostic testing of the null hypothesis that the

residuals are cross–section independent in a nonparametric nonlinear panel data model. We

therefore propose a new diagnostic test for cross–section independence in a nonparametric

nonlinear panel data model. The main contributions of this paper can be summarized as

follows.

(i) We construct a local linear estimator of an individual regression function in the case where

T → ∞ and then propose a nonparametric CD test statistic in a similar fashion to that

proposed in Pesaran (2004) for the parametric case. As a consequence of using the local linear

estimation method, the first order biases involved are all eliminated in the construction

of the proposed test. As shown in Sections 3 and 5, respectively, the proposed test has

both sound large and small sample properties.

(ii) We then establish an asymptotically normal distribution under the null hypothesis, and

also an asymptotically normal distribution under a sequence of local alternatives in Section

3 below. In the small sample studies in Section 5 below, we examine the performance of

both the size and the power functions under various cases where the conditional mean

function and the residual may take the form of either linear, nonlinear or a mixture of

both.

(iii) We conclude from the small sample studies in Section 5 that the proposed nonparametric

CD test performs well when the data satisfy a nonparametric panel data model. By

comparison, existing tests for the parametric case are not applicable. In addition, the

proposed nonparametric CD test also performs well in both the size and power even when

the conditional mean function is of a parametric form. In this case, the nonparametric

CD test is just slightly less powerful than the parametric CD test.

(iv) In summary, the construction in Section 2 and the small sample analysis in Section 5 both

show that the proposed nonparametric CD test is easily computable and implementable.

The simulation study in Section 5 shows that the proposed nonparametric CD test is

generally more applicable than the corresponding parametric CD test. As an empirical

application, we apply the proposed test for testing the cross–section independence of a

set of CPI data in Australian capital cities.


The rest of this paper is organized as follows. A nonparametric test for cross–section inde-

pendence in a nonlinear panel data model is proposed in Section 2. An asymptotic distribution

of the proposed nonparametric CD test statistic is established in Section 3. Section 3 also es-

tablishes an asymptotic normality under an alternative hypothesis. Section 4 discusses possible

extensions. Several simulated examples are given in Section 5. An empirical analysis of a set

of CPI data in Australian capital cities is given in Section 6. All the mathematical proofs of

the asymptotic results are given in Appendix A.

2. Nonparametric panel data model and CD test statistic

Consider a nonparametric nonlinear panel data model of the form

$$Y_{it} = g_i(X_{it}) + u_{it}, \quad i = 1, \cdots, N; \; t = 1, \cdots, T, \tag{2.1}$$

where $g_i(\cdot)$ is the individual regression function, $\{X_{it}\}$ is random and satisfies some mild conditions (see A2 below), and $\{u_{it}\}$ is independent of $\{X_{it}\}$ with $E[u_{it}] = 0$.

The aim of this paper is to test the null hypothesis

$$H_0: u_{it} \text{ is independent of } u_{jt} \text{ for all } i \neq j. \tag{2.2}$$

The above testing problem has been studied by many authors in the context of parametric

panel data models. The so–called CD test statistic was introduced by Pesaran (2004) for the

parametric linear panel data case. The main idea here is to use the simple average of all

pair–wise correlation coefficients of the residuals from the individual nonparametric nonlinear

regressions in the panel.

Before proposing a nonparametric CD test statistic, we need to decide which kernel method

should be used in the construction of our nonparametric CD test. Existing studies (see, for

example, Chapter 3 of Gao 2007) already show that the use of the Nadaraya–Watson kernel

estimation method in the construction of a nonparametric kernel test may have severe size

distortion due to the first order bias issue inherited from the Nadaraya–Watson kernel esti-

mation method. In this paper, we thus choose to use a local linear estimation method in the

construction of our nonparametric CD test. As shown in Section 3, the proposed nonparamet-

ric CD test has sound large sample theory under some mild conditions. Section 5 shows that

the proposed nonparametric CD test also has good small sample properties without using a

bootstrap method.

We now introduce the local linear estimator of the individual regression function $g_i(\cdot)$. Assume that $g_i(\cdot)$ has derivatives up to the second order at the point $x_0$. By Taylor's expansion, for $x$ in a neighborhood of $x_0$, we have

$$g_i(x) = g_i(x_0) + g_i'(x_0)(x - x_0) + O\big((x - x_0)^2\big). \tag{2.3}$$

Then, we find $(\alpha_0, \alpha_1)$ to minimize

$$\sum_{t=1}^{T} \big(Y_{it} - \alpha_0 - \alpha_1 (X_{it} - x_0)\big)^2 K\left(\frac{X_{it} - x_0}{h}\right), \tag{2.4}$$

where $K(\cdot)$ is some kernel function and $h := h_T$ is the bandwidth. The local linear estimator for $g_i(x_0)$ is defined as $\hat{g}_i(x_0) = \hat{\alpha}_{0i}$, where $(\hat{\alpha}_{0i}, \hat{\alpha}_{1i})$ is the unique pair that minimizes (2.4). For

more details about the local linear estimators, we refer to Fan and Gijbels (1996). In general,

one probably should use a kernel function and a bandwidth indexed by i for each cross section.

For notational simplicity, this paper uses the same kernel and bandwidth for both the large

and small sample discussion. In practice, the bandwidth can be chosen using the conventional

leave–one–out cross–validation method.

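By an elementary calculation, the local linear estimator of $g_i(x_0)$ can be expressed as

$$\hat{g}_i(x_0) = \sum_{t=1}^{T} w_{it}(x_0) Y_{it}, \quad 1 \le i \le N, \tag{2.5}$$

where $w_{it}(x_0) = K_{x_0,h}(X_{it}) \Big/ \sum_{t=1}^{T} K_{x_0,h}(X_{it})$, in which

$$K_{x_0,h}(X_{it}) = \frac{1}{h} K\left(\frac{X_{it} - x_0}{h}\right) \left( S_{i2}(x_0) - \left(\frac{X_{it} - x_0}{h}\right) S_{i1}(x_0) \right)$$

with $S_{ij}(x_0) = \frac{1}{Th} \sum_{t=1}^{T} \left(\frac{X_{it} - x_0}{h}\right)^{j} K\left(\frac{X_{it} - x_0}{h}\right)$ for $j = 0, 1, 2$.

With the help of the local linear smoother defined above, we estimate $u_{it}$ by $\hat{u}_{it} = Y_{it} - \hat{g}_i(X_{it})$.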
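The weights in (2.5) can be illustrated with a short Python sketch for a single cross-section unit. This is only a sketch under stated assumptions, not the authors' code: the function names are hypothetical, and the uniform kernel is assumed because it is the kernel used later in Section 5.

```python
import numpy as np

def uniform_kernel(u):
    """Uniform kernel K(u) = 0.5 * 1{|u| <= 1} (assumed choice)."""
    return 0.5 * (np.abs(u) <= 1.0)

def local_linear_weights(x, x0, h, kernel=uniform_kernel):
    """Return K_{x0,h}(X_t) and the weights w_t(x0) of (2.5) for one unit."""
    T = x.size
    z = (x - x0) / h
    k = kernel(z)
    s1 = np.sum(z * k) / (T * h)        # S_{i1}(x0)
    s2 = np.sum(z ** 2 * k) / (T * h)   # S_{i2}(x0)
    k_loc = k * (s2 - z * s1) / h       # K_{x0,h}(X_t)
    return k_loc, k_loc / np.sum(k_loc)

def local_linear_fit(x, y, x0, h):
    """Local linear estimate of g(x0) as the weighted sum in (2.5)."""
    _, w = local_linear_weights(x, x0, h)
    return float(np.sum(w * y))

# Small demonstration on simulated data (hypothetical values).
rng = np.random.default_rng(0)
x_demo = rng.normal(size=200)
y_demo = np.sin(x_demo) + 0.1 * rng.normal(size=200)
g_hat_demo = local_linear_fit(x_demo, y_demo, x0=0.3, h=0.5)
```

One property worth noting is that these weights sum to one and reproduce any exactly linear function of the regressor, which is the sense in which the first order bias is removed. The sketch fails with a division by zero if fewer than two distinct observations fall inside the kernel window.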

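We are now ready to propose a nonparametric CD test statistic of the form

$$NCD = \sqrt{\frac{T}{N(N-1)}} \sum_{i=1}^{N} \sum_{j \neq i}^{N} \hat{\rho}_{ij}, \tag{2.6}$$

where

$$\hat{\rho}_{ij} = \hat{\rho}_{ji} = \frac{\sum_{t=1}^{T} \tilde{u}_{it} \tilde{u}_{jt}}{\sqrt{\sum_{t=1}^{T} \tilde{u}_{it}^2} \sqrt{\sum_{t=1}^{T} \tilde{u}_{jt}^2}} = \frac{\frac{1}{T} \sum_{t=1}^{T} \tilde{u}_{it} \tilde{u}_{jt}}{\sqrt{\frac{1}{T} \sum_{t=1}^{T} \tilde{u}_{it}^2} \sqrt{\frac{1}{T} \sum_{t=1}^{T} \tilde{u}_{jt}^2}},$$

in which $\tilde{u}_{it} = \hat{u}_{it} \hat{f}_i(X_{it})$ and $\hat{f}_i(x_0) = \frac{1}{T} \sum_{t=1}^{T} K_{x_0,h}(X_{it})$. Note that the test statistic $NCD$ is invariant to $\sigma_{ui}^2 = E[u_{i1}^2]$.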

The aim of using $\tilde{u}_{it}$ instead of $\hat{u}_{it}$ in the test statistic (2.6) is to eliminate the random

denominator problem involved in the nonparametric estimator $\hat{g}_i$. The construction of the

nonparametric CD test in (2.6) is motivated by a similar form proposed in Pesaran (2004) for

the parametric case. The main step is the involvement of a nonparametric estimate $\hat{u}_{it}$, which

plays the role of the OLS residual in the parametric case. As shown in Sections 3 and 5 below,

the nonparametric CD test has both good large and small sample properties.
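The construction of (2.6) can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, a common bandwidth `h` and the uniform kernel are assumed, and the product of the residual and the density-type weight is computed directly so that no division by the random denominator is needed.

```python
import numpy as np

def weighted_residuals(x, y, h):
    """Residuals times density-type weights for one unit, as used in (2.6)."""
    T = x.size
    z = (x[None, :] - x[:, None]) / h        # z[t, s] = (X_s - X_t) / h
    k = 0.5 * (np.abs(z) <= 1.0)             # uniform kernel (assumed)
    s1 = (z * k).sum(axis=1, keepdims=True) / (T * h)
    s2 = (z ** 2 * k).sum(axis=1, keepdims=True) / (T * h)
    k_loc = k * (s2 - z * s1) / h            # K_{x0,h}(X_s) evaluated at x0 = X_t
    f_hat = k_loc.sum(axis=1) / T            # density-type weight at X_t
    # (Y_t - g_hat(X_t)) * f_hat(X_t) = Y_t * f_hat(X_t) - (1/T) sum_s K * Y_s
    return y * f_hat - (k_loc @ y) / T

def ncd_statistic(X, Y, h):
    """NCD statistic of (2.6) for arrays X, Y of shape (N, T)."""
    N, T = X.shape
    U = np.vstack([weighted_residuals(X[i], Y[i], h) for i in range(N)])
    U = U / np.sqrt((U ** 2).sum(axis=1, keepdims=True))
    R = U @ U.T                              # matrix of pairwise correlations
    off_diagonal_sum = R.sum() - np.trace(R)
    return float(np.sqrt(T / (N * (N - 1))) * off_diagonal_sum)
```

Rescaling the errors of any single unit by a positive constant leaves every pairwise correlation, and hence the statistic, unchanged, which matches the invariance property noted above.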

In Section 3 below, we show that the nonparametric CD test statistic (2.6) is asymptotically

centered when T → ∞ first and then N → ∞. Furthermore, asymptotic distributions of the

test statistic are established under either the null hypothesis or a sequence of local alternatives.

3. Large sample theory

3.1 Asymptotic theory under the null hypothesis

To study the asymptotic theory of the test statistic, we need the following conditions.

A1 (i). The probability kernel function $K(\cdot)$ is a symmetric and continuous function with some compact support.

(ii). The individual regression function $g_i(\cdot)$, $1 \le i \le N$, has derivatives up to the second order and the derivatives are continuous. Furthermore, $\max_{i \ge 1} E\big[|g_i''(X_{i1})|^2\big] < \infty$, where $g_i''(\cdot)$ is the second order derivative of $g_i(\cdot)$.

A2 (i). For each individual series (for each fixed $1 \le i \le N$), $\{X_{it}\}$ is a sequence of stationary α–mixing random regressors with $\max_{i \ge 1} E\big[|X_{i1}|^2\big] < \infty$ and the mixing coefficient $\alpha_{xi}(\cdot)$ satisfying $\alpha_{xi}(k) \le C_0 k^{-\beta}$ uniformly in $i \ge 1$ and for some $0 < C_0 < \infty$ and $\beta > 3$.

(ii). Let $f_i(\cdot)$ be the density function of $X_{it}$. Suppose that $f_i(x)$ is continuous and bounded in $x \in R$. There exists a joint density function $f_{is_1, is_2, \cdots, is_l, jt_1, jt_2, \cdots, jt_k}(\cdot, \cdots, \cdot)$ of

$$(X_{is_1}, X_{is_2}, \cdots, X_{is_l}, X_{jt_1}, X_{jt_2}, \cdots, X_{jt_k}), \quad 1 \le i, j \le N, \; 1 \le l, k \le 4,$$

such that $f_{is_1, is_2, \cdots, is_l, jt_1, jt_2, \cdots, jt_k}(\cdot, \cdots, \cdot)$ is continuous and bounded.

(iii). Let $\{u_{is}, 1 \le i \le N, 1 \le s \le T\}$ and $\{X_{jt}, 1 \le j \le N, 1 \le t \le T\}$ be independent for all $(i, j)$ and $(s, t)$. For each individual series (for each fixed $1 \le i \le N$), $\{u_{it}\}$ is a sequence of stationary α–mixing random errors with the mixing coefficient $\alpha_{ui}(\cdot)$ satisfying $\sum_{k=1}^{\infty} \alpha_{ui}^{\delta_0/(2+\delta_0)}(k) < \infty$ for some $\delta_0 > 0$. In addition, $E[u_{i1}^2] = \sigma_{ui}^2 > 0$ and $\max_{i \ge 1} E\big[|u_{i1}|^{2+\delta_0}\big] < \infty$.

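(iv). Let

$$\tau_{i,j}^2 = \mu_2^4 \mu_0^4 \left( \kappa_{i,j} \sigma_{ui}^2 \sigma_{uj}^2 + 2 \sum_{t=2}^{\infty} E[u_{i1} u_{it}] E[u_{j1} u_{jt}] \kappa_{i,j}(t) \right),$$

where $\mu_k = \int u^k K(u) du$,

$$\kappa_{i,j} = \int \int f_i^4(x) f_j^4(y) f_{i1,j1}(x, y) dx dy$$

and

$$\kappa_{i,j}(t) = \int \int \int \int f_i^2(x_1) f_i^2(x_2) f_j^2(y_1) f_j^2(y_2) f_{i1,it,j1,jt}(x_1, x_2, y_1, y_2) dx_1 dx_2 dy_1 dy_2,$$

and $\sigma_i^2 = \mu_2^2 \mu_0^2 \sigma_{ui}^2 \int f_i^5(x) dx$. Let $0 < \tau_{i,j}^2 < \infty$ and $\sigma_i^2 > 0$ for all $1 \le i, j \le N$. Suppose that there exists some $\tau_0 > 0$ such that as $N \to \infty$,

$$\frac{1}{N(N-1)} \sum_{i=1}^{N} \sum_{j \neq i}^{N} \frac{\tau_{i,j}^2}{\sigma_i^2 \sigma_j^2} \to \tau_0. \tag{3.1}$$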

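A3. The bandwidth $h$ satisfies

$$\frac{T^{\theta} h}{\log T} \to \infty \text{ as } T \to \infty \quad \text{and} \quad N^2 T h^8 \to 0 \text{ as } T \to \infty \text{ and } N \to \infty, \tag{3.2}$$

where $\theta = \frac{\beta - 3}{\beta + 2}$.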

Remark 3.1. The above assumptions are mild and can be satisfied in many cases. For

example, A1(i) is a mild condition on the kernel function and is assumed by many authors in

nonparametric inference of both stationary time series and panel data (see, for example, Fan

and Yao 2003; Gao 2007; Cai and Li 2008). A1(ii) and A2(ii) are some mild conditions on

the individual regression functions and density functions. The α–mixing condition assumed

in A2(i) and A2(iii) is a commonly used condition in the time series case (see, for example,

Auestad and Tjøstheim 1990, Chen and Tsay 1993, Fan and Yao 2003; Gao 2007; Li and Racine

2007). It is introduced in this paper for the nonparametric panel data case. Note that when

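$u_{is}$ and $u_{it}$ are mutually independent for all $s \neq t$ and each fixed $i$, and $X_{it}$ and $X_{jt}$ are independent for all $i \neq j$ and each given $t$, we have $\kappa_{i,j} = \int \int f_i^5(x) f_j^5(y) dx dy$, $\kappa_{i,j}(t) \equiv 0$ and $\tau_{i,j}^2 = \mu_2^4 \mu_0^4 \kappa_{i,j} \sigma_{ui}^2 \sigma_{uj}^2 = \sigma_i^2 \sigma_j^2$. Thus, $\tau_0 \equiv 1$.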

Condition A3 is a set of conditions on the bandwidth as well as on the restriction on T and

N . The first bandwidth condition in A3 is proposed in order to apply the uniform consistency

of the nonparametric kernel estimator in the proofs of Theorems 3.1 and 3.2 below. The second

bandwidth condition in A3 is also needed in the proofs of Theorems 3.1 and 3.2. In addition, the

second part of A3 allows for the case where the rate of T → ∞ is slower than that of N → ∞. In

practice, this means that condition A3 accommodates medium and even small values of T,

although the asymptotic theory itself requires both N → ∞ and T → ∞. The simulation

studies in Section 5 indicate that the nonparametric CD test works well even when T is as small

as T = 10, although it cannot be shown at this stage that the conclusions of Theorems 3.1 and

3.2 remain true when T is fixed.

In the following theorem, we show that the nonparametric CD test statistic, defined by

(2.6), has an asymptotically normal distribution similar to that obtained by Pesaran (2004) and Hsiao,

Pesaran and Pick (2007), who considered similar testing problems in the context of parametric

linear and nonlinear panel data models.


Theorem 3.1. Assume that (2.1) and the conditions A1–A3 are satisfied. Then under $H_0$,

$$NCD = \sqrt{\frac{T}{N(N-1)}} \sum_{i=1}^{N} \sum_{j \neq i}^{N} \hat{\rho}_{ij} \xrightarrow{d} N(0, \tau_0) \tag{3.3}$$

as $T \to \infty$ first and then $N \to \infty$.

The proof of Theorem 3.1 is given in Appendix A below.

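Remark 3.2. (i) Note that $\tau_0 = 1$ when $u_{is}$ and $u_{it}$ are mutually independent for all $s \neq t$ and each fixed $i$, and $X_{it}$ and $X_{jt}$ are independent for all $i \neq j$ and each given $t$.

(ii) In general, $\tau_0$ is an unknown parameter to be estimated. Define

$$\tilde{\rho}_{ij} = \frac{\hat{\rho}_{ij} \hat{\sigma}_i \hat{\sigma}_j}{\hat{\tau}_{i,j}},$$

where $\hat{\tau}_{i,j}$ and $\hat{\sigma}_i$ are consistent estimators of $\tau_{i,j}$ and $\sigma_i$, respectively. In this case, it can be shown that under $H_0$

$$\widetilde{NCD} =: \sqrt{\frac{T}{N(N-1)}} \sum_{i=1}^{N} \sum_{j \neq i}^{N} \tilde{\rho}_{ij} \xrightarrow{d} N(0, 1)$$

as $T \to \infty$ first and then $N \to \infty$.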

Remark 3.3. The asymptotic distribution in Theorem 3.1 is obtained by letting T →∞ first

and then N → ∞. A natural question is what will happen if either N → ∞ first and then

T →∞ or T →∞ and N →∞ simultaneously.

(i) To see this, we define $Z_{it} = \frac{u_{it} f_i^2(X_{it}) \mu_0 \mu_2}{\sigma_i}$. Following the proof of Theorem 3.1 in Appendix A, we can find that the leading term of $NCD$ is

$$\frac{1}{\sqrt{N(N-1)T}} \sum_{i=1}^{N} \sum_{j \neq i}^{N} \sum_{t=1}^{T} Z_{it} Z_{jt} = \frac{1}{\sqrt{T}} \sum_{t=1}^{T} \frac{1}{\sqrt{N(N-1)}} \sum_{i=1}^{N} \sum_{j \neq i}^{N} Z_{it} Z_{jt} =: \frac{1}{\sqrt{T}} \sum_{t=1}^{T} \omega_N(t),$$

where

$$\omega_N(t) = \frac{1}{\sqrt{N(N-1)}} \sum_{i=1}^{N} \sum_{j \neq i}^{N} Z_{it} Z_{jt} = \frac{1}{\sqrt{N(N-1)}} \left[ \left( \sum_{i=1}^{N} Z_{it} \right)^2 - \sum_{i=1}^{N} Z_{it}^2 \right].$$

It is obvious that, for each fixed $1 \le t \le T$, $\{\omega_N(t)\}$ is a sequence of U–statistics. By Theorem 5.5.2 in Serfling (1980), under $H_0$,

$$\omega_N(t) \xrightarrow{d} \chi_{t,1}^2 - 1 \text{ as } N \to \infty,$$

where $\chi_{t,1}^2$ is the chi–square distribution with one degree of freedom. If both $\{u_{it}\}$ and $\{X_{it}\}$ are i.i.d. for all $(i, t)$, then it can be seen that $\{\chi_{t,1}^2, 1 \le t \le T\}$ is a sequence of i.i.d. chi–square random variables. By the conventional central limit theorem for the i.i.d. case (see, for example, Chow and Teicher 1988), the conclusion of Corollary 3.1 remains true when $N \to \infty$ first and then $T \to \infty$.
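The algebraic step behind the second expression for ωN(t), namely that the double sum over i ≠ j equals the squared sum minus the sum of squares, can be checked numerically. The following is a small illustrative sketch with hypothetical function names; it uses arbitrary simulated values in place of Zit for a fixed t.

```python
import numpy as np

def omega_double_sum(z):
    """omega_N(t) computed directly as the double sum over i != j."""
    N = z.size
    total = sum(z[i] * z[j] for i in range(N) for j in range(N) if j != i)
    return total / np.sqrt(N * (N - 1))

def omega_closed_form(z):
    """omega_N(t) computed as ((sum z)^2 - sum z^2) / sqrt(N (N - 1))."""
    N = z.size
    return (z.sum() ** 2 - (z ** 2).sum()) / np.sqrt(N * (N - 1))

rng = np.random.default_rng(3)
z_demo = rng.normal(size=12)   # stand-in for Z_{1t}, ..., Z_{Nt} at a fixed t
gap = abs(omega_double_sum(z_demo) - omega_closed_form(z_demo))
```

The closed form is also the cheaper one to evaluate, since it needs O(N) rather than O(N²) operations for each t.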


(ii) If both $\{u_{it}\}$ and $\{X_{it}\}$ are i.i.d. for all $(i, t)$, moreover, $\{\omega_N(t)\}$ is also a sequence of i.i.d. random variables. Thus, it follows from the conventional central limit theorem for the i.i.d. case that as both $N \to \infty$ and $T \to \infty$ simultaneously

$$\frac{1}{\sqrt{T}} \sum_{t=1}^{T} \omega_N(t) \xrightarrow{d} N(0, 1),$$

which implies that the conclusion of Corollary 3.1 remains true.

In summary, Theorem 3.1 and the discussion given in Remarks 3.2(i) and 3.3 imply the

following corollary; its implementation is given through the simulation studies in Section 5 and

the empirical application in Section 6.

Corollary 3.1. Assume that (2.1) and the conditions A1 and A3 are satisfied. If, in addition, $u_{is}$ and $u_{it}$ are mutually independent for all $s \neq t$ and each fixed $i$, and $X_{it}$ and $X_{jt}$ are independent for all $i \neq j$ and each given $t$, then under $H_0$,

$$NCD = \sqrt{\frac{T}{N(N-1)}} \sum_{i=1}^{N} \sum_{j \neq i}^{N} \hat{\rho}_{ij} \xrightarrow{d} N(0, 1) \tag{3.4}$$

as either $T \to \infty$ first and then $N \to \infty$, or $N \to \infty$ first and then $T \to \infty$, or both $N \to \infty$ and $T \to \infty$ simultaneously.
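Under the conditions of Corollary 3.1, the test can be carried out by comparing the statistic with standard normal critical values. The sketch below shows one way to turn a computed value of NCD into a two-sided p-value and a rejection decision; the function names and the default level are illustrative assumptions, and only the Python standard library is used.

```python
import math

def two_sided_p_value(stat):
    """P(|Z| >= |stat|) for Z ~ N(0, 1), via the complementary error function."""
    return math.erfc(abs(stat) / math.sqrt(2.0))

def reject_null(stat, level=0.05):
    """Reject cross-section independence when the two-sided p-value is below level."""
    return two_sided_p_value(stat) < level
```

When the limiting variance differs from one, as in Theorem 3.1 with a general τ0, the statistic would first need to be standardized with a consistent variance estimate, as described in Remark 3.2(ii).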

In summary, the limiting distribution of the suitably normalized test statistic depends

on the independence or dependence assumption on Xit and uit as well as the treatment

of the two indices N and T . Phillips and Moon (1999) introduced three limit approaches:

sequential limit theory, diagonal path limit theory and joint limit theory. They also discuss some

relations between sequential and joint limits. The asymptotic distribution given in Theorem

3.1 is obtained by a sequential limit approach (T →∞ first and then N →∞). It is not clear

whether the conclusion of Theorem 3.1 remains true when either N → ∞ first and then T → ∞

or both N → ∞ and T → ∞ simultaneously. Such issues are thus left to future research.

3.2 Asymptotic theory under an alternative hypothesis

In this section, we analyze the power of the proposed test under a sequence of local alterna-

tives. Naturally, the power of the proposed test for the cross–section dependence relies on the

form of an alternative hypothesis. We now consider a sequence of cross–sectional dependence

alternatives via a nonlinear multi–factor model of the form

$$H_1: u_{it} = F_{NT}(z_t, \beta_i) + \varepsilon_{it} \quad \text{with} \quad F_{NT}(z_t, \beta_i) = \frac{1}{\left(N^{1/2} T^{1/4}\right)^{k}} G(z_t, \beta_i) \tag{3.5}$$

for $k = 0, 1$, where $\{G(z_t, \beta_i)\}$ is a sequence of known parametric linear or nonlinear functions indexed by $\beta_i$, $\{z_t, 1 \le t \le T\}$ is a sequence of stationary α–mixing random variables, $\{\beta_i, 1 \le i \le N\}$ is a sequence of common factors, $\{\varepsilon_{it}\}$ is a sequence of stationary α–mixing random variables for fixed $i$ and is independent of $\{z_t\}$, and $\{\varepsilon_{it}\}$ is independent of $\{\varepsilon_{jt}\}$ for all $t$ and $i \neq j$. Note that form (3.5) defines a global alternative when $k = 0$, while it gives a sequence of local alternatives when $k = 1$.

Before establishing an asymptotic distribution of the nonparametric CD test statistic under

the alternative hypothesis H1, we need the following set of conditions.

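A4 (i) $\{z_t\}$ is a sequence of stationary α–mixing random variables with the mixing coefficient $\alpha_z(\cdot)$ satisfying $\sum_{t=1}^{\infty} \alpha_z^{\delta_1/(2+\delta_1)}(t) < \infty$ for some $\delta_1 > 0$.

(ii) The nonlinear function $G(\cdot, \cdot)$ satisfies the following conditions,

$$E[G(z_t, \beta_i)] = 0, \tag{3.6}$$

$$E\big[|G(z_t, \beta_i)|^{2+\delta_1}\big] < \infty. \tag{3.7}$$

In addition, there exists an array of constants $\{\psi_{ij}; 1 \le i \le N, 1 \le j \le N\}$ with $\psi_{ij} = \psi_{ji}$ such that

$$E[G(z_t, \beta_i) G(z_t, \beta_j)] = \psi_{ij}, \tag{3.8}$$

$$\frac{1}{N(N-1)} \sum_{i=1}^{N} \sum_{j \neq i}^{N} \psi_{ij} \to \psi \text{ as } N \to \infty, \tag{3.9}$$

where $\psi$ is a constant.

(iii) A2(iii) and A2(iv) are both satisfied when $\{u_{it}\}$ is replaced by $\{\varepsilon_{it}\}$. Moreover, $\{\varepsilon_{it}\}$ is independent of $\{z_t\}$. Let $\tau_1$ be defined in the same way as for $\tau_0$ with $\{u_{it}\}$ being replaced by $\{\varepsilon_{it}\}$.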

Condition A4 allows for a general class of forms for G(zt, βi). It obviously covers the linear

multi–factor case: G(zt, βi) = ztβi, which was studied by Pesaran (2004).

When the alternative hypothesis H1 holds, we have the following asymptotic distribution

for the nonparametric CD test statistic NCD. The proof of Theorem 3.2 is given in

Appendix A.

Theorem 3.2. Assume that (2.1) and the conditions A1, A2 (i), A3 and A4 are satisfied.

(i) Under $H_1$ with $k = 0$, we have as $T \to \infty$ first and then $N \to \infty$

$$NCD \xrightarrow{P} \infty. \tag{3.10}$$

(ii) Under $H_1$ with $k = 1$, we have as $T \to \infty$ first and then $N \to \infty$

$$NCD \xrightarrow{d} N(\psi, \tau_1). \tag{3.11}$$

The divergence result in (3.10) is quite common in the case where we assume this kind of

global alternative. The asymptotic distribution in (3.11) is similar to the result obtained by

Pesaran (2004). FNT (zt, βi) can be viewed as the measure of the dependence between individual

time series. By an elementary calculation and (3.8), we can show that $E[u_{it} u_{jt}] = \frac{\psi_{ij}}{N \sqrt{T}}$.

Furthermore, (3.9) implies that the nonparametric CD test statistic allows the detection of the alternatives when the nonlinear multi–factor function has a decreasing rate of $O\left(T^{-1/4} N^{-1/2}\right)$, which is the same as that in Pesaran (2004).
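The elementary calculation behind the expression for E[uit ujt] can be sketched as follows for k = 1 and i ≠ j. The sketch assumes, in line with (2.1), (3.6) and A4(iii), that εit has mean zero, that εit is independent of zt and of εjt, and that the βi are treated as fixed indices:

```latex
\begin{aligned}
E[u_{it} u_{jt}]
  &= E\big[(F_{NT}(z_t,\beta_i) + \varepsilon_{it})(F_{NT}(z_t,\beta_j) + \varepsilon_{jt})\big] \\
  &= \frac{E[G(z_t,\beta_i)\,G(z_t,\beta_j)]}{\left(N^{1/2} T^{1/4}\right)^{2}}
     + E[F_{NT}(z_t,\beta_i)]\,E[\varepsilon_{jt}]
     + E[\varepsilon_{it}]\,E[F_{NT}(z_t,\beta_j)]
     + E[\varepsilon_{it}]\,E[\varepsilon_{jt}] \\
  &= \frac{\psi_{ij}}{N \sqrt{T}} .
\end{aligned}
```

The three cross terms vanish because each contains a factor with mean zero, and the first term follows from (3.8) together with the normalization in (3.5).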

The simulated examples in Section 5 show that the power of the proposed test is satisfactory

when ψ > 0 (or ψ < 0). However, when ψ = 0, the asymptotic distribution in (3.11) is the

same as that in Theorem 3.1, which implies that the test will not have a satisfactory power. In

the context of parametric panel data models, Pesaran, Ullah and Yamagata (2008) proposed a

bias adjusted LM test to avoid the problem of poor power for the case of ψ = 0. It is interesting

to consider a nonparametric type of bias adjusted LM test statistic. Such an issue is left for

our future study.

4. Some extensions

In both theory and practice, there are cases where we need to consider a nonlinear autore-

gressive panel data model of the form

$$Y_{it} = g_i(Y_{i,t-1}) + u_{it}, \quad i = 1, \cdots, N; \; t = 1, \cdots, T. \tag{4.1}$$

Pesaran (2004) and Sarafidis, Yamagata and Robertson (2009) studied the test of error

cross–section dependence when all gi(·), 1 ≤ i ≤ N , are of some linear form. As far as we

are aware, however, there is little study on diagnostic testing of cross–section independence for

model (4.1). It seems that we may apply the nonparametric CD test statistic NCD to test

whether H0 holds. In theory, establishing an asymptotic distribution for the nonparametric

CD test NCD in this case is not straightforward. Further discussion is left for our future study.

Meanwhile, nonparametric approaches are useful for exploring hidden structures. When

there are multiple regressor variables, however, the nonparametric approaches face a serious

problem of the so–called “curse of dimensionality”. To address this issue, some dimensional

reduction methods have been discussed in both the cross–section data and the time series data

cases (see, for example, Härdle, Liang and Gao 2000; Gao 2007; Li and Racine 2007).

In the panel data case, we may consider a partially linear model of the form

$$Y_{it} = X_{it}^{\tau} \alpha + m(Z_{it}) + \varepsilon_{it}, \quad i = 1, \cdots, N; \; t = 1, \cdots, T, \tag{4.2}$$

where $X_{it}$ is a vector of regressors, $\alpha$ is a vector of unknown parameters and the function $m(\cdot)$ is unknown. Recently, there have been some attempts on both theoretical

studies and empirical applications of this type of partially linear models in the panel data case

(see, for example, Li and Hsiao 1998; Henderson, Carroll and Li 2008).

To the best of our knowledge, there are few studies on the testing of cross–section independence

for a partially linear panel data model of the form (4.2). A natural next step is to extend the

proposed test statistic to the partially linear model case and to establish its asymptotic

distribution. Since different methods and more technical arguments are likely to be

involved, this issue is left for future research.

5. Small sample simulation studies

In this section, we give some simulated examples to show the finite sample performance

of the nonparametric CD test. In addition, we also compare its performance with that of a

parametric CD test. Since both the sizes and power values of the proposed nonparametric CD

test associated with an asymptotic critical value in each case are already comparable with those

of the parametric CD test based on an asymptotic critical value, our experience suggests that

there is no need to introduce a bootstrap simulation procedure to improve the finite sample

performance of the proposed nonparametric CD test.

In the following experiments, the uniform kernel $K(u) = \frac{1}{2} I\{|u| \le 1\}$ is used in the

implementation of the proposed nonparametric CD test. The bandwidth is chosen using the

conventional leave–one–out cross–validation method.
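The paper does not spell out the cross–validation criterion, so the following Python sketch assumes the standard one: for each candidate bandwidth, every observation is predicted from a local linear fit that leaves it out, and the bandwidth with the smallest average squared prediction error is selected. The function names and the candidate grid are hypothetical.

```python
import numpy as np

def loo_cv_score(x, y, h):
    """Average squared leave-one-out prediction error of a local linear fit."""
    T = x.size
    idx = np.arange(T)
    sq_err = 0.0
    for t in range(T):
        keep = idx != t
        z = (x[keep] - x[t]) / h
        k = 0.5 * (np.abs(z) <= 1.0)              # uniform kernel
        s0, s1, s2 = k.sum(), (z * k).sum(), (z ** 2 * k).sum()
        denom = s0 * s2 - s1 ** 2
        if denom <= 1e-12:
            return np.inf                         # window too sparse for this h
        w = k * (s2 - z * s1) / denom             # local linear weights, sum to 1
        sq_err += (y[t] - np.sum(w * y[keep])) ** 2
    return sq_err / T

def select_bandwidth(x, y, grid):
    """Return the bandwidth in grid with the smallest leave-one-out CV score."""
    scores = [loo_cv_score(x, y, h) for h in grid]
    return float(grid[int(np.argmin(scores))])

# Hypothetical usage on one simulated series.
rng = np.random.default_rng(1)
x_demo = rng.uniform(-1.0, 1.0, size=80)
y_demo = np.sin(3.0 * x_demo) + 0.1 * rng.normal(size=80)
h_demo = select_bandwidth(x_demo, y_demo, np.array([0.3, 0.5, 1.0]))
```

Returning an infinite score for bandwidths that leave too few points in some window is a design choice of this sketch; it simply rules out bandwidths for which the leave-one-out fit is not defined everywhere.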

We first examine the finite sample performance of the proposed nonparametric test when

the data set is simulated from a parametric linear panel data model of the form

$$Y_{it} = a_i + b_i X_{it} + u_{it}, \quad i = 1, 2, \cdots, N; \; t = 1, 2, \cdots, T, \tag{5.1}$$

where $a_i \overset{i.i.d.}{\sim} U(0, 1)$, $b_i \overset{i.i.d.}{\sim} N(1, 0.04)$, $X_{it} \overset{i.i.d.}{\sim} N(0, 1)$, $u_{it} = f(r_i, \beta_t) + e_{it}$, $\beta_t$ is the time–specific common effect and $\beta_t \overset{i.i.d.}{\sim} N(0, 1)$, $e_{it} \overset{i.i.d.}{\sim} N(0, 1)$, and $r_i$ is a sequence of non–random numbers indicating the degree of cross–section error correlations. Note that $\{u_{it}\}$ and $\{X_{it}\}$ are generated independently.

Under the null hypothesis of cross–section independence, we have $r_i = 0$, and under the

alternative hypothesis, we experiment with $r_i \overset{i.i.d.}{\sim} U(0.1, 0.3)$. The parameters $a_i$, $b_i$, and $r_i$

are drawn once for each i = 1, 2, · · · , N , and then fixed throughout the replications. Xit, βt,

and eit are newly drawn for each replication, independently of each other.

We experiment with both linear and nonlinear forms for the function f(·, ·). For the linear

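case, we set $f(r_i, \beta_t) = r_i \beta_t$, and for the nonlinear case, we set $f(r_i, \beta_t) = \frac{r_i \beta_t}{1 + r_i^2 \beta_t^2}$. It can be easily seen that the pair–wise correlation coefficients are given by

$$\mathrm{corr}(u_{it}, u_{jt}) = \frac{r_i r_j}{\sqrt{(1 + r_i^2)(1 + r_j^2)}},$$

for the parametric linear case, and

$$\mathrm{corr}(u_{it}, u_{jt}) = \frac{E\left( r_i r_j \beta_t^2 \big/ \big( (1 + r_i^2 \beta_t^2)(1 + r_j^2 \beta_t^2) \big) \right)}{\sqrt{\left( 1 + E\left( r_i^2 \beta_t^2 / (1 + r_i^2 \beta_t^2)^2 \right) \right) \left( 1 + E\left( r_j^2 \beta_t^2 / (1 + r_j^2 \beta_t^2)^2 \right) \right)}}$$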

for the parametric nonlinear case.
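The data generating process (5.1) and the correlation formula for the linear case can be illustrated with the following Python sketch. It is an illustration only: the function names are hypothetical, and N(1, 0.04) is read as a normal distribution with variance 0.04 (standard deviation 0.2), which is an assumption about the notation.

```python
import numpy as np

def simulate_panel(N, T, r, rng, nonlinear=False, a=None, b=None):
    """One replication of model (5.1); r has length N (all zeros under H0)."""
    a = rng.uniform(0.0, 1.0, size=N) if a is None else a
    b = rng.normal(1.0, 0.2, size=N) if b is None else b   # variance 0.04 assumed
    X = rng.normal(size=(N, T))
    beta = rng.normal(size=T)                # time-specific common effect
    e = rng.normal(size=(N, T))
    rb = np.outer(r, beta)                   # r_i * beta_t
    f = rb / (1.0 + rb ** 2) if nonlinear else rb
    U = f + e
    Y = a[:, None] + b[:, None] * X + U
    return X, Y, U

def linear_case_correlation(r):
    """Matrix of r_i r_j / sqrt((1 + r_i^2)(1 + r_j^2)) for f = r_i * beta_t."""
    s = r / np.sqrt(1.0 + r ** 2)
    return np.outer(s, s)

# Hypothetical usage: loadings drawn once from U(0.1, 0.3), as in the alternative.
rng = np.random.default_rng(2009)
r_demo = rng.uniform(0.1, 0.3, size=10)
X_demo, Y_demo, U_demo = simulate_panel(10, 50, r_demo, rng)
```

In a full experiment the parameters a, b and r would be drawn once and passed in for every replication, so that only X, β and e are redrawn, as described above.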

Using an asymptotic critical value, we computed the two–sided simulated sizes and power

values of the proposed nonparametric CD test and the parametric counterpart in each case.

The experiments are carried out for N, T = 10, 20, 30, 50, 100. The number of replications

is 1000, and the significance levels considered are 1%, 5% and 10%. The simulated sizes

of the parametric and the nonparametric CD tests for the linear model (5.1) are reported in

Table 5.1 below.
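The way a simulated size is obtained can be sketched in Python for the parametric test. The sketch assumes that the parametric counterpart is the CD statistic of Pesaran (2004) in its usual form, that is, the sum of pair–wise residual correlations over i < j scaled by the square root of 2T/(N(N − 1)), with residuals from unit-by-unit OLS regressions with an intercept. The function names, seed and number of replications are hypothetical, and the numbers it produces are not those reported in the tables.

```python
import numpy as np

def parametric_cd(X, Y):
    """CD statistic from unit-by-unit OLS residuals (assumed form, see lead-in)."""
    N, T = X.shape
    xc = X - X.mean(axis=1, keepdims=True)
    yc = Y - Y.mean(axis=1, keepdims=True)
    slope = (xc * yc).sum(axis=1, keepdims=True) / (xc ** 2).sum(axis=1, keepdims=True)
    E = yc - slope * xc                              # OLS residuals with intercept
    E = E / np.sqrt((E ** 2).sum(axis=1, keepdims=True))
    R = E @ E.T                                      # pairwise residual correlations
    upper_sum = (R.sum() - np.trace(R)) / 2.0        # sum over i < j
    return float(np.sqrt(2.0 * T / (N * (N - 1))) * upper_sum)

def simulated_size(N, T, reps, crit, seed=0):
    """Rejection frequency of |CD| > crit under H0 (r_i = 0) for model (5.1)."""
    rng = np.random.default_rng(seed)
    a = rng.uniform(0.0, 1.0, size=N)                # drawn once, then fixed
    b = rng.normal(1.0, 0.2, size=N)
    rejections = 0
    for _ in range(reps):
        X = rng.normal(size=(N, T))
        Y = a[:, None] + b[:, None] * X + rng.normal(size=(N, T))
        rejections += int(abs(parametric_cd(X, Y)) > crit)
    return rejections / reps
```

For example, `simulated_size(10, 20, 1000, 1.96)` would give a Monte Carlo estimate of the size at the 5% level for N = 10 and T = 20 under this assumed setup.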

Table 5.1(a) Size of the tests for linear model (5.1) at the 1% level

parametric test nonparametric test

T\N 10 20 30 50 100 10 20 30 50 100

10 0.023 0.019 0.025 0.024 0.024 0.025 0.032 0.029 0.022 0.027

20 0.021 0.011 0.016 0.013 0.010 0.020 0.009 0.016 0.024 0.012

30 0.011 0.018 0.012 0.015 0.014 0.014 0.011 0.012 0.014 0.013

50 0.013 0.011 0.013 0.013 0.019 0.006 0.014 0.010 0.010 0.005

100 0.010 0.011 0.011 0.010 0.015 0.014 0.011 0.015 0.009 0.010

Table 5.1(b) Size of the tests for linear model (5.1) at the 5% level


parametric test nonparametric test

T\N 10 20 30 50 100 10 20 30 50 100

10 0.053 0.052 0.069 0.048 0.057 0.044 0.062 0.058 0.053 0.051

20 0.064 0.049 0.054 0.052 0.041 0.058 0.042 0.055 0.050 0.044

30 0.048 0.051 0.054 0.060 0.048 0.047 0.041 0.053 0.057 0.042

50 0.063 0.054 0.043 0.049 0.052 0.046 0.058 0.044 0.048 0.046

100 0.047 0.049 0.049 0.042 0.055 0.047 0.044 0.048 0.041 0.046

Table 5.1(c) Size of the tests for linear model (5.1) at the 10% level


parametric test nonparametric test

T\N 10 20 30 50 100 10 20 30 50 100

10 0.100 0.104 0.100 0.082 0.089 0.094 0.102 0.103 0.090 0.095

20 0.111 0.096 0.094 0.102 0.088 0.110 0.105 0.100 0.093 0.098

30 0.104 0.107 0.106 0.111 0.092 0.094 0.103 0.101 0.111 0.084

50 0.103 0.108 0.084 0.096 0.106 0.096 0.107 0.088 0.096 0.101

100 0.102 0.099 0.098 0.101 0.114 0.103 0.097 0.094 0.081 0.098


Tables 5.1(a)–5.1(c) show that the simulated sizes look quite reasonable in each case,

regardless of whether the nonparametric or the parametric CD test is used. This

implies that the nonparametric CD test is still applicable even when the data follow a

parametric linear model. The results in Tables 5.1(a)–5.1(c) also show that the nonparametric

CD test associated with an asymptotic critical value works well numerically even when T and

N are as small as T = N = 20. In addition, the sizes of the parametric

CD test are slightly more stable than those of the nonparametric CD test, which is to be

expected because the true model is parametric.

The power values of the tests for model (5.1) with linear ($f(r_i, \beta_t) = r_i \beta_t$) and nonlinear

($f(r_i, \beta_t) = r_i \beta_t / (1 + r_i^2 \beta_t^2)$) forms of $f(\cdot, \cdot)$ are given in Table 5.2 below.

Table 5.2(a) Power of the tests for linear model (5.1) at the 1% level


parametric test nonparametric test

T\N 10 20 30 50 100 10 20 30 50 100

f(ri, βt) = riβt

10 0.098 0.234 0.336 0.649 0.896 0.076 0.163 0.256 0.484 0.815

20 0.116 0.509 0.657 0.911 0.995 0.080 0.366 0.503 0.797 0.967

30 0.122 0.685 0.688 0.948 0.997 0.079 0.529 0.567 0.876 0.992

50 0.343 0.676 0.978 0.997 1.000 0.247 0.528 0.941 0.987 1.000

100 0.432 0.902 1.000 1.000 1.000 0.320 0.804 0.995 1.000 1.000

$f(r_i, \beta_t) = r_i\beta_t/(1 + r_i^2\beta_t^2)$

10 0.075 0.184 0.253 0.448 0.829 0.058 0.143 0.183 0.334 0.705

20 0.091 0.182 0.433 0.822 0.990 0.059 0.126 0.304 0.678 0.965

30 0.092 0.324 0.575 0.913 0.998 0.061 0.223 0.436 0.804 0.991

50 0.266 0.564 0.839 0.963 1.000 0.189 0.411 0.725 0.892 1.000

100 0.304 0.797 0.997 1.000 1.000 0.205 0.667 0.982 1.000 1.000

Table 5.2(b) Power of the tests for linear model (5.1) at the 5% level


T\N 10 20 30 50 100 10 20 30 50 100

f(ri, βt) = riβt

10 0.170 0.348 0.450 0.742 0.942 0.147 0.255 0.364 0.616 0.884

20 0.234 0.663 0.773 0.958 0.997 0.174 0.525 0.646 0.879 0.988

30 0.260 0.795 0.807 0.976 0.999 0.190 0.665 0.691 0.936 0.996

50 0.527 0.812 0.996 0.999 1.000 0.406 0.692 0.977 0.995 1.000

100 0.624 0.955 1.000 1.000 1.000 0.510 0.894 0.997 1.000 1.000


$f(r_i, \beta_t) = r_i\beta_t/(1 + r_i^2\beta_t^2)$

10 0.146 0.282 0.394 0.575 0.883 0.139 0.239 0.297 0.453 0.780

20 0.174 0.286 0.589 0.898 0.995 0.129 0.232 0.454 0.787 0.982

30 0.207 0.477 0.708 0.953 1.000 0.148 0.372 0.595 0.898 0.994

50 0.435 0.704 0.921 0.984 1.000 0.346 0.591 0.847 0.947 1.000

100 0.488 0.899 0.999 1.000 1.000 0.385 0.811 0.995 1.000 1.000


Table 5.2(c) Power of the tests for linear model (5.1) at the 10% level


T\N 10 20 30 50 100 10 20 30 50 100

f(ri, βt) = riβt

10 0.225 0.393 0.520 0.795 0.957 0.211 0.319 0.429 0.677 0.904

20 0.322 0.723 0.838 0.970 0.998 0.246 0.603 0.717 0.923 0.995

30 0.338 0.839 0.851 0.987 0.999 0.274 0.751 0.755 0.956 0.999

50 0.622 0.859 0.996 0.999 1.000 0.506 0.772 0.984 0.995 1.000

100 0.732 0.968 1.000 1.000 1.000 0.601 0.934 0.998 1.000 1.000


$f(r_i, \beta_t) = r_i\beta_t/(1 + r_i^2\beta_t^2)$

10 0.198 0.365 0.470 0.625 0.908 0.190 0.296 0.369 0.529 0.827

20 0.242 0.369 0.669 0.927 0.998 0.205 0.303 0.546 0.852 0.992

30 0.277 0.555 0.771 0.974 1.000 0.220 0.451 0.675 0.934 0.997

50 0.525 0.783 0.950 0.991 1.000 0.437 0.679 0.888 0.966 1.000

100 0.581 0.934 0.999 1.000 1.000 0.484 0.867 0.996 1.000 1.000

Tables 5.2(a)–5.2(c) show that the simulated power values are quite satisfactory in each

of the cases concerned. Meanwhile, the simulated power values of the nonparametric CD test

associated with an asymptotic critical value are quite comparable with those of the parametric

CD test based on the use of an asymptotic critical value. This may be due to the fact that the

asymptotic normality can be used as a good approximation to the sample distribution of the

proposed nonparametric test in each of the cases considered.

In addition, Tables 5.2(a)–5.2(c) show that the parametric CD test is more powerful than

the nonparametric CD test. This is not surprising, since the true model is just parametric and

the parametric CD test is supposed to be more powerful.

In the following simulation studies, we examine the finite sample performance of the proposed nonparametric test when the data set is simulated from a parametric nonlinear panel data model of the form

$$Y_{it} = \frac{1}{1 + \theta_i^2 X_{it}^2} + u_{it}, \quad i = 1, 2, \cdots, N;\ t = 1, 2, \cdots, T, \tag{5.2}$$

where $\theta_i \overset{i.i.d.}{\sim} N(1, 0.04)$, $X_{it} \overset{i.i.d.}{\sim} U(0.1, 0.7)$, and $u_{it}$ is the same as in model (5.1). When $r_i = 0$, the simulated sizes of the parametric and nonparametric CD tests for this model are reported in Table 5.3 below at different significance levels, and when $r_i \overset{i.i.d.}{\sim} U(0.1, 0.3)$, the power values of the tests are reported in Table 5.4 below.
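A data-generating step for model (5.2) might be sketched as follows. Two points are assumptions rather than facts taken from this section: N(1, 0.04) is read as mean 1 and variance 0.04 (standard deviation 0.2), and the error is taken to be $u_{it} = r_i\beta_t + \varepsilon_{it}$ with standard normal $\beta_t$ and $\varepsilon_{it}$, which is only a guess at the linear-factor form of the error in model (5.1). The function name `simulate_model_52` is hypothetical.

```python
import random


def simulate_model_52(n_units, n_periods, dependent=False, seed=0):
    """Draw (X, Y) from Y_it = 1 / (1 + theta_i^2 X_it^2) + u_it.

    Assumed error structure (not confirmed by the text):
    u_it = r_i * beta_t + eps_it, with beta_t and eps_it i.i.d. N(0, 1).
    r_i = 0 gives cross-section independence; r_i ~ U(0.1, 0.3) gives dependence.
    """
    rng = random.Random(seed)
    theta = [rng.gauss(1.0, 0.2) for _ in range(n_units)]  # variance 0.04 assumed
    r = [rng.uniform(0.1, 0.3) if dependent else 0.0 for _ in range(n_units)]
    beta = [rng.gauss(0.0, 1.0) for _ in range(n_periods)]  # common factor
    x, y = [], []
    for i in range(n_units):
        xi = [rng.uniform(0.1, 0.7) for _ in range(n_periods)]
        yi = []
        for t in range(n_periods):
            u = r[i] * beta[t] + rng.gauss(0.0, 1.0)
            yi.append(1.0 / (1.0 + theta[i] ** 2 * xi[t] ** 2) + u)
        x.append(xi)
        y.append(yi)
    return x, y
```

With `dependent=False` the draws correspond to the size experiments, and with `dependent=True` to the power experiments under the linear form of $f$.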


Table 5.3(a) Size of the tests for nonlinear model (5.2) at the 1% level


T\N 10 20 30 50 100 10 20 30 50 100

10 0.027 0.022 0.031 0.053 0.029 0.021 0.016 0.018 0.026 0.024

20 0.019 0.018 0.015 0.023 0.024 0.020 0.013 0.014 0.016 0.009

30 0.008 0.012 0.009 0.019 0.013 0.007 0.013 0.011 0.021 0.011

50 0.010 0.019 0.009 0.008 0.012 0.011 0.018 0.011 0.009 0.012

100 0.005 0.007 0.016 0.011 0.011 0.010 0.004 0.016 0.013 0.013

Table 5.3(b) Size of the tests for nonlinear model (5.2) at the 5% level


T\N 10 20 30 50 100 10 20 30 50 100

10 0.061 0.056 0.075 0.110 0.240 0.051 0.045 0.047 0.051 0.052

20 0.055 0.060 0.062 0.068 0.059 0.060 0.059 0.055 0.057 0.039

30 0.047 0.052 0.052 0.060 0.056 0.046 0.053 0.049 0.066 0.048

50 0.046 0.048 0.052 0.052 0.051 0.049 0.054 0.057 0.056 0.045

100 0.050 0.053 0.058 0.047 0.051 0.047 0.046 0.058 0.050 0.058

Table 5.3(c) Size of the tests for nonlinear model (5.2) at the 10% level


T\N 10 20 30 50 100 10 20 30 50 100

10 0.101 0.089 0.120 0.165 0.316 0.106 0.095 0.085 0.095 0.092

20 0.096 0.115 0.104 0.120 0.099 0.101 0.117 0.101 0.109 0.079

30 0.093 0.101 0.104 0.113 0.095 0.095 0.102 0.099 0.118 0.095

50 0.100 0.091 0.106 0.108 0.102 0.095 0.094 0.112 0.113 0.094

100 0.105 0.101 0.100 0.091 0.103 0.101 0.096 0.090 0.101 0.098

Tables 5.3(a)–5.3(c) show that both the parametric CD test and the nonparametric CD

test already have reasonable simulated sizes when using an asymptotic critical value in each

case. As in Tables 5.1(a)–5.1(c), the simulated sizes of the nonparametric CD test are very

comparable with those of the parametric CD test.

Table 5.4 gives the corresponding power values for both the parametric and nonparametric

CD tests.


Table 5.4(a) Power of the tests for nonlinear model (5.2) at the 1% level


T\N 10 20 30 50 100 10 20 30 50 100

f(ri, βt) = riβt

10 0.086 0.251 0.417 0.799 0.954 0.084 0.229 0.358 0.735 0.915

20 0.128 0.322 0.583 0.884 0.996 0.130 0.304 0.567 0.852 0.995

30 0.152 0.571 0.736 0.973 1.000 0.144 0.573 0.722 0.968 0.999

50 0.265 0.883 0.958 0.993 1.000 0.261 0.879 0.952 0.992 1.000

100 0.322 0.988 0.998 1.000 1.000 0.299 0.985 0.993 1.000 1.000

$f(r_i, \beta_t) = r_i\beta_t/(1 + r_i^2\beta_t^2)$

10 0.090 0.221 0.340 0.665 0.940 0.086 0.204 0.297 0.560 0.869

20 0.084 0.376 0.510 0.807 0.984 0.090 0.369 0.504 0.789 0.978

30 0.152 0.408 0.581 0.934 0.999 0.152 0.414 0.574 0.925 0.999

50 0.167 0.621 0.911 0.981 1.000 0.164 0.603 0.908 0.978 1.000

100 0.397 0.811 0.998 1.000 1.000 0.391 0.804 0.998 1.000 1.000

Table 5.4(b) Power of the tests for nonlinear model (5.2) at the 5% level


T\N 10 20 30 50 100 10 20 30 50 100

f(ri, βt) = riβt

10 0.154 0.385 0.544 0.863 0.969 0.143 0.343 0.486 0.816 0.940

20 0.244 0.482 0.729 0.937 0.999 0.234 0.444 0.705 0.919 0.998

30 0.276 0.727 0.848 0.987 1.000 0.271 0.727 0.831 0.983 1.000

50 0.437 0.954 0.979 0.996 1.000 0.427 0.950 0.977 0.995 1.000

100 0.522 0.998 1.000 1.000 1.000 0.503 0.995 1.000 1.000 1.000


$f(r_i, \beta_t) = r_i\beta_t/(1 + r_i^2\beta_t^2)$

10 0.159 0.360 0.477 0.762 0.972 0.152 0.312 0.413 0.688 0.916

20 0.167 0.533 0.650 0.892 0.990 0.170 0.517 0.626 0.869 0.986

30 0.287 0.563 0.716 0.971 1.000 0.278 0.560 0.708 0.967 1.000

50 0.304 0.750 0.967 0.991 1.000 0.297 0.743 0.961 0.990 1.000

100 0.604 0.914 0.999 1.000 1.000 0.569 0.915 0.999 1.000 1.000


Table 5.4(c) Power of the tests for nonlinear model (5.2) at the 10% level


T\N 10 20 30 50 100 10 20 30 50 100

f(ri, βt) = riβt

10 0.205 0.461 0.620 0.893 0.977 0.200 0.419 0.547 0.853 0.956

20 0.327 0.558 0.797 0.958 1.000 0.310 0.544 0.776 0.950 0.998

30 0.357 0.802 0.883 0.990 1.000 0.350 0.799 0.876 0.990 1.000

50 0.529 0.966 0.987 0.997 1.000 0.517 0.970 0.989 0.997 1.000

100 0.608 0.999 1.000 1.000 1.000 0.607 1.000 1.000 1.000 1.000


$f(r_i, \beta_t) = r_i\beta_t/(1 + r_i^2\beta_t^2)$

10 0.216 0.429 0.550 0.820 0.984 0.215 0.381 0.479 0.742 0.932

20 0.247 0.606 0.729 0.917 0.995 0.255 0.589 0.692 0.900 0.990

30 0.364 0.664 0.768 0.983 1.000 0.362 0.648 0.764 0.979 1.000

50 0.396 0.812 0.984 0.994 1.000 0.394 0.807 0.978 0.994 1.000

100 0.691 0.943 0.999 1.000 1.000 0.668 0.941 0.999 1.000 1.000

Tables 5.4(a)–5.4(c) show that the simulated power values are quite satisfactory in each of

the cases concerned. Meanwhile, the simulated power values of the nonparametric test show

that the nonparametric CD test is only slightly less powerful than the parametric CD test.

In summary, we can conclude that in both the parametric linear and nonlinear models, the nonparametric CD test has the correct size even for small N and T. The power of the proposed nonparametric CD test increases as N or T increases, and it increases faster with N than with T. Similar findings are reported in Hsiao, Pesaran and Pick (2007) for the parametric CD test.

This shows that the proposed nonparametric CD test is a generally applicable test in this

kind of testing for cross–section independence, as the applicability does not require a model to

be parametrically specified. In other words, it still works well without necessarily pre–specifying

the conditional mean function.

In the following example, we show that the proposed nonparametric CD test is needed

when the data follow a nonparametric panel data model, since existing tests for the parametric

case are not applicable.

Consider a nonparametric panel data model of the form

$$Y_{it} = \frac{X_{it}}{1 + X_{it}^2} + u_{it}, \quad i = 1, 2, \cdots, N;\ t = 1, 2, \cdots, T, \tag{5.3}$$

where $X_{it} \overset{i.i.d.}{\sim} N(0, 1)$, and $u_{it}$ is the same as used in model (5.1). For $r_i = 0$, the sizes of the proposed nonparametric CD test are reported in Table 5.5, and for $r_i \sim U(0.1, 0.3)$, the power values are given in Table 5.6.


Table 5.5(a) Size of the nonparametric test for model (5.3) at the 1% level

T\N 10 20 30 50 100

10 0.022 0.022 0.015 0.028 0.020

20 0.013 0.014 0.014 0.020 0.011

30 0.012 0.017 0.011 0.008 0.014

50 0.012 0.012 0.009 0.010 0.015

100 0.013 0.011 0.011 0.009 0.007

Table 5.5(b) Size of the nonparametric test for model (5.3) at the 5% level

T\N 10 20 30 50 100

10 0.053 0.049 0.044 0.062 0.046

20 0.049 0.049 0.046 0.057 0.046

30 0.040 0.053 0.046 0.041 0.049

50 0.044 0.044 0.041 0.052 0.045

100 0.052 0.046 0.047 0.052 0.047

Table 5.5(c) Size of the nonparametric test for model (5.3) at the 10% level

T\N 10 20 30 50 100

10 0.088 0.098 0.079 0.096 0.084

20 0.100 0.091 0.090 0.092 0.087

30 0.105 0.098 0.088 0.098 0.103

50 0.091 0.095 0.091 0.088 0.105

100 0.106 0.099 0.091 0.096 0.095

Table 5.6(a) Power of the nonparametric test for model (5.3) at the 1% level

Left block: $f(r_i, \beta_t) = r_i\beta_t$. Right block: $f(r_i, \beta_t) = r_i\beta_t/(1 + r_i^2\beta_t^2)$.

T\N 10 20 30 50 100 10 20 30 50 100

10 0.083 0.232 0.250 0.473 0.771 0.051 0.110 0.184 0.325 0.694

20 0.061 0.255 0.500 0.795 0.966 0.080 0.185 0.348 0.648 0.942

30 0.105 0.422 0.603 0.928 0.999 0.073 0.264 0.555 0.836 0.994

50 0.189 0.381 0.865 0.984 1.000 0.104 0.427 0.678 0.917 0.998

100 0.440 0.781 0.944 1.000 1.000 0.136 0.698 0.952 0.999 1.000


Table 5.6(b) Power of the nonparametric test for model (5.3) at the 5% level


T\N 10 20 30 50 100 10 20 30 50 100

10 0.149 0.326 0.374 0.581 0.848 0.117 0.192 0.296 0.450 0.787

20 0.136 0.376 0.652 0.879 0.986 0.161 0.308 0.486 0.757 0.969

30 0.201 0.584 0.723 0.960 0.999 0.169 0.418 0.711 0.920 0.999

50 0.338 0.556 0.927 0.997 1.000 0.210 0.602 0.823 0.964 1.000

100 0.653 0.898 0.979 1.000 1.000 0.261 0.846 0.982 1.000 1.000

Table 5.6(c) Power of the nonparametric test for model (5.3) at the 10% level


T\N 10 20 30 50 100 10 20 30 50 100

10 0.215 0.391 0.438 0.646 0.876 0.180 0.266 0.355 0.516 0.826

20 0.217 0.466 0.730 0.908 0.987 0.212 0.405 0.562 0.811 0.981

30 0.274 0.655 0.786 0.970 1.000 0.231 0.503 0.776 0.944 1.000

50 0.434 0.666 0.959 0.998 1.000 0.289 0.694 0.882 0.980 1.000

100 0.743 0.931 0.988 1.000 1.000 0.351 0.893 0.991 1.000 1.000

Tables 5.5(a)–5.5(c) show that the nonparametric CD test has the correct sizes for the

simulated nonparametric panel data model (5.3). Meanwhile, Tables 5.6(a)–5.6(c) show that

the simulated power values of the nonparametric CD test are also satisfactory.

6. Empirical application: An analysis of CPI in Australian capital cities

As an application of our testing method, we test for the cross–sectional independence of CPI (consumer price index) between eight Australian capital cities during the period 1989–2008. The data set, which is obtained from the website of the Australian Bureau of Statistics, is recorded quarterly. Hence, it consists of the CPI numbers for eight cities (N = 8) at 80 different times (T = 80). We chose $Y_{it}$ as the log of the food CPI for city i at time t and $X_{it}$ as the log of the all group CPI for city i at time t. For each city i, we estimated the nonparametric regression function of $Y_{it}$ on $X_{it}$ ($t = 1, 2, \cdots, T$) using the local linear estimation method. Then, we used the estimation residuals $\hat u_{it}$ to compute the nonparametric CD test statistic. In the same way, we also regressed the log of the transportation CPI on the log of the all group CPI for each city. The results are summarized in Table 6.1.
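The per-city residual step described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: a Gaussian kernel is used for convenience, the bandwidth `h` is supplied by the caller, and the residual is multiplied by the average local linear weight, following the form (residual times estimated weight) that appears in the proof of Theorem 3.1 in Appendix A. The names `local_linear` and `weighted_residuals` are hypothetical.

```python
import math


def gaussian_kernel(z):
    """Standard normal density, used here as the kernel K (an assumption)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def local_linear(x, y, x0, h):
    """Local linear estimate of E[Y | X = x0] and the average weight at x0."""
    n = len(x)
    z = [(xt - x0) / h for xt in x]
    k = [gaussian_kernel(zt) for zt in z]
    s1 = sum(kt * zt for kt, zt in zip(k, z)) / (n * h)
    s2 = sum(kt * zt * zt for kt, zt in zip(k, z)) / (n * h)
    # Effective local linear weights: K(z) * (S2 - z * S1) / h.
    w = [kt * (s2 - zt * s1) / h for kt, zt in zip(k, z)]
    w_bar = sum(w) / n  # equals S0 * S2 - S1^2, positive for distinct x
    g_hat = sum(wt * yt for wt, yt in zip(w, y)) / (n * w_bar)
    return g_hat, w_bar


def weighted_residuals(x, y, h):
    """Residuals (Y_t - g_hat(X_t)) scaled by the average weight at X_t."""
    out = []
    for xt, yt in zip(x, y):
        g_hat, w_bar = local_linear(x, y, xt, h)
        out.append((yt - g_hat) * w_bar)
    return out
```

A useful sanity check is that a local linear fit reproduces an exactly linear relationship, so the residuals of `y = a + b * x` are zero up to rounding for any bandwidth.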


Table 6.1 Cross section dependence of CPI in Australian capital cities

food transportation

nonparametric CD test 47.2378 47.0227

bootstrap 1% critical values [−2.3130, 2.6100] [−2.4895, 2.7300]



Note that the two-sided bootstrap critical values were calculated using 1000 iterations. It follows from Table 6.1 that there is strong evidence against the null hypothesis of cross–section independence for both the food and transportation indexes. Based on the bootstrap simulated critical values in each case, cross–section independence is rejected at the 1% level, and hence also at the 5% and 10% levels.
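The decision rule behind this conclusion is a comparison of each statistic with its two-sided critical interval. The short sketch below applies it to the numbers in Table 6.1; the helper name `rejects` is hypothetical, and the asymptotic N(0, 1) critical value is shown only for comparison with the bootstrap values.

```python
from statistics import NormalDist


def rejects(stat, lower, upper):
    """Two-sided rule: reject when the statistic falls outside [lower, upper]."""
    return stat < lower or stat > upper


# (test statistic, lower and upper bootstrap 1% critical values) from Table 6.1
table_6_1 = {
    "food": (47.2378, -2.3130, 2.6100),
    "transportation": (47.0227, -2.4895, 2.7300),
}
decisions = {name: rejects(*vals) for name, vals in table_6_1.items()}

# Asymptotic two-sided 1% critical value of the standard normal distribution.
asymptotic_1pct = NormalDist().inv_cdf(0.995)
```

Both entries of `decisions` are `True`, and both statistics also exceed the asymptotic critical value by a wide margin.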

This suggests that the assumption of cross–section independence in such empirical studies

may not be appropriate. Further studies are needed to find ways of defining a suitable cross–

section dependence structure in order to deal with panel data analysis when there is some

cross–section dependence.

7. Conclusions and discussion

We have proposed a new diagnostic test for residual cross–section independence in a nonparametric panel data model. The proposed test is a nonparametric counterpart of an existing test proposed in Pesaran (2004) for the parametric case. The asymptotic distribution under either the null or a sequence of local alternatives has been established. The small sample performance of the proposed test has been examined in Section 5. Section 6 has given an example of empirical application.

Future research in this field includes discussion about how to choose a data–driven bandwidth such that both the resulting size and power functions are appropriately assessed. As pointed out in Section 4, certain extensions of the model may also be considered. Since the study of such topics is not trivial, they are left for future research.

8. Acknowledgments

This work was motivated by a keynote presentation by Professor Cheng Hsiao at an International Conference on Time Series Econometrics at WISE in Xiamen, China in May 2008, when the second author was a participant at the conference. The second author would like to thank Professor Yongmiao Hong for his invitation to participate in the conference. The authors would all like to acknowledge the Australian Research Council Discovery Grants Program for its financial support under Grant Numbers: DP0558602 and DP0879088.

Appendix A: Proofs of the main results


Before proving the main results, we need the following lemma on the uniform consistency of nonparametric estimators. Since the proposed test statistic is invariant to $\sigma_{ui}^2 = E[u_{i1}^2]$, we assume without loss of generality that $\sigma_{ui} \equiv 1$ throughout this appendix. In addition, we use a double sum of the form $\sum_{t=1}^T\sum_{s\neq t}$ to replace either $\sum_{t=1}^T\sum_{s=1,\neq t}^T$ or $\sum_{t=1}^T\sum_{s\neq t}^T$ for notational simplicity throughout this appendix.

Lemma A.1. Assume that A1(i) and A2(i) are satisfied. If, in addition, $T^{\theta}h/\log T \to \infty$ as $T \to \infty$, where $\theta = \frac{\beta-3}{\beta+2}$, then we have for k = 0, 1, 2,

$$\sup_{x\in R}\,|S_{ik}(x) - f_i(x)\mu_k| = o_P(1),$$

uniformly in $i \ge 1$, where $\mu_k = \int u^k K(u)\,du$.

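Proof. Observe that

$$\sup_{x\in R}|S_{ik}(x) - f_i(x)\mu_k| \le \sup_{|x|\le c_T}|S_{ik}(x) - f_i(x)\mu_k| + \sup_{|x|\ge c_T}|S_{ik}(x) - f_i(x)\mu_k|$$

uniformly in $i \ge 1$, where $c_T = T^{1/2}\log T$.

For any $\varepsilon > 0$, by Theorem 6 in Hansen (2008), we know that there exists an integer $T_0$ such that when $T > T_0$, we have

$$P\Big(\max_{i\ge1}\sup_{|x|\le c_T}|S_{ik}(x) - f_i(x)\mu_k| \le \varepsilon/2\Big) \to 1.$$

Since $f_i(\cdot)$ is continuous and integrable, $\max_{i\ge1}\sup_{|x|\ge c_T}|f_i(x)| \to 0$ as $T \to \infty$. Hence, when $\mu_k \neq 0$,

$$\begin{aligned}
&P\Big(\max_{i\ge1}\sup_{|x|\ge c_T}|S_{ik}(x) - f_i(x)\mu_k| > \varepsilon/2\Big)\\
&\quad\le P\Big(\max_{i\ge1}\sup_{|x|\ge c_T}|S_{ik}(x)| > \varepsilon/4\Big) + P\Big(\max_{i\ge1}\sup_{|x|\ge c_T}|f_i(x)| > \varepsilon/(4|\mu_k|)\Big)\\
&\quad= P\Big(\max_{i\ge1}\sup_{|x|\ge c_T}\frac{1}{Th}\Big|\sum_{t=1}^T\Big(\frac{X_{it}-x}{h}\Big)^k K\Big(\frac{X_{it}-x}{h}\Big)\Big| > \varepsilon/4\Big) + P\Big(\max_{i\ge1}\sup_{|x|\ge c_T}|f_i(x)| > \varepsilon/(4|\mu_k|)\Big)\\
&\quad\le \sum_{t=1}^T P\Big(\max_{i\ge1}|X_{it}| \ge c_T - Ch\Big) + P\Big(\max_{i\ge1}\sup_{|x|\ge c_T}|f_i(x)| > \varepsilon/(4|\mu_k|)\Big)\\
&\quad\le C\,T\,(c_T)^{-2}\max_{i\ge1}E|X_{i1}|^2 + P\Big(\max_{i\ge1}\sup_{|x|\ge c_T}|f_i(x)| > \varepsilon/(4|\mu_k|)\Big) \to 0,
\end{aligned}$$

where C is some positive constant. When $\mu_k = 0$, from the above argument we can see that $P\big(\max_{i\ge1}\sup_{|x|\ge c_T}|S_{ik}(x)| > \varepsilon/2\big) \to 0$. Therefore, we have

$$P\Big(\max_{i\ge1}\sup_{x\in R}|S_{ik}(x) - f_i(x)\mu_k| > \varepsilon\Big) \to 0,$$

which completes the proof of Lemma A.1.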
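As an added illustration of why the tail term in this proof vanishes, the final bound can be evaluated with $c_T = T^{1/2}\log T$. The step assumes, as the proof implicitly does, that $\max_{i\ge1}E|X_{i1}|^2$ is finite.

```latex
C\,T\,c_T^{-2}\max_{i\ge 1}E|X_{i1}|^2
  = \frac{C\,T}{T(\log T)^2}\max_{i\ge 1}E|X_{i1}|^2
  = \frac{C\max_{i\ge 1}E|X_{i1}|^2}{(\log T)^2}
  \;\longrightarrow\; 0 \quad (T\to\infty).
```

The extra $\log T$ factor in $c_T$ is what makes the bound vanish; with $c_T = T^{1/2}$ alone the right-hand side would only be bounded.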

We then give the well–known Davydov's inequality for α–mixing sequences, which follows from Corollary A2 of Hall and Heyde (1980). An updated version is given in Lemma A.1 of Gao (2007).

Lemma A.2. Suppose that $E|X|^p < \infty$ and $E|Y|^q < \infty$, where $p, q > 1$, $p^{-1} + q^{-1} < 1$. Then

$$|E(XY) - (EX)(EY)| \le 8\,(E|X|^p)^{1/p}(E|Y|^q)^{1/q}\,\alpha^{1-p^{-1}-q^{-1}},$$

where $\alpha = \sup_{A\in\sigma(X),\,B\in\sigma(Y)}|P(AB) - P(A)P(B)|$.
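A special case may make the exponent easier to read. Taking $p = q = 2 + \delta$ for some $\delta > 0$ (the symbol $\delta$ is introduced here only for illustration) gives $1 - p^{-1} - q^{-1} = \delta/(2+\delta)$, which is the form of exponent that appears when the inequality is applied to covariances of the errors later in this appendix.

```latex
p = q = 2+\delta
\;\Longrightarrow\;
1 - \frac{1}{p} - \frac{1}{q} = 1 - \frac{2}{2+\delta} = \frac{\delta}{2+\delta},
\qquad
|E(XY) - (EX)(EY)|
  \le 8\,\big(E|X|^{2+\delta}\big)^{\frac{1}{2+\delta}}
        \big(E|Y|^{2+\delta}\big)^{\frac{1}{2+\delta}}
        \alpha^{\frac{\delta}{2+\delta}}.
```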

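Proof of Theorem 3.1. Note that

$$\hat u_{it} = (Y_{it} - \hat g_i(X_{it}))\hat f_i(X_{it}) = u_{it}\hat f_i(X_{it}) + (g_i(X_{it}) - \hat g_i(X_{it}))\hat f_i(X_{it}).$$

Hence, by a standard decomposition, we have

$$\begin{aligned}
\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T \hat u_{it}\hat u_{jt}
&= \sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T u_{it}\hat f_i(X_{it})u_{jt}\hat f_j(X_{jt})
 - \sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T u_{it}\hat f_i(X_{it})\Big(\frac{1}{T}\sum_{s=1}^T u_{js}K^j_{st}\Big)\\
&\quad + \sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T u_{it}\hat f_i(X_{it})\Big(\frac{1}{T}\sum_{s=1}^T (g_j(X_{jt}) - g_j(X_{js}))K^j_{st}\Big)\\
&\quad - \sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T u_{jt}\hat f_j(X_{jt})\Big(\frac{1}{T}\sum_{s=1}^T u_{is}K^i_{st}\Big)\\
&\quad + \sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T u_{jt}\hat f_j(X_{jt})\Big(\frac{1}{T}\sum_{s=1}^T (g_i(X_{it}) - g_i(X_{is}))K^i_{st}\Big)\\
&\quad + \sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T (g_i(X_{it}) - \hat g_i(X_{it}))(g_j(X_{jt}) - \hat g_j(X_{jt}))\hat f_i(X_{it})\hat f_j(X_{jt})\\
&=: \sum_{i=1}^N\sum_{j\neq i}\sum_{k=1}^6 \rho_T(i,j,k),
\end{aligned} \tag{A.1}$$

where $K^i_{st} = K_{X_{it},h}(X_{is})$.

In the following, we complete the proof of Theorem 3.1 by using Lemmas A.3–A.7 below.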

Lemma A.3. Assume that the conditions of Theorem 3.1 are satisfied. Then under $H_0$, we have

$$\sum_{i=1}^N\sum_{j\neq i}\rho_T(i,j,2) = o_P(N\sqrt{T}), \qquad \sum_{i=1}^N\sum_{j\neq i}\rho_T(i,j,4) = o_P(N\sqrt{T}). \tag{A.2}$$


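Proof. We only give the detailed proof for the case of $\sum_{i=1}^N\sum_{j\neq i}\rho_T(i,j,2)$ since the proof for $\sum_{i=1}^N\sum_{j\neq i}\rho_T(i,j,4)$ is similar. Observe that under $H_0$,

$$\begin{aligned}
E\Big[\sum_{i=1}^N\sum_{j\neq i}\rho_T(i,j,2)\Big]^2
&= \frac{1}{T^4}E\Big[\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T u_{it}\Big(\sum_{s=1}^T K^i_{st}\Big)\Big(\sum_{l=1}^T u_{jl}K^j_{lt}\Big)\Big]^2\\
&= \frac{1}{T^4}E\sum_{i=1}^N\sum_{j\neq i}\sum_{t_1,t_2=1}^T\sum_{l_1,l_2=1}^T\sum_{s_1,s_2=1}^T u_{it_1}u_{it_2}u_{jl_1}u_{jl_2}K^i_{s_1t_1}K^i_{s_2t_2}K^j_{l_1t_1}K^j_{l_2t_2}\\
&\le \frac{C}{T^4}E\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T\sum_{l=1}^T\sum_{s=1}^T u_{it}^2u_{jl}^2(K^i_{st})^2(K^j_{lt})^2\\
&\quad+ \frac{C}{T^4}E\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T\sum_{l=1}^T\sum_{s_1\neq s_2} u_{it}^2u_{jl}^2K^i_{s_1t}K^i_{s_2t}(K^j_{lt})^2\\
&\quad+ \frac{C}{T^4}E\sum_{i=1}^N\sum_{j\neq i}\sum_{t_1\neq t_2}\sum_{l=1}^T\sum_{s=1}^T u_{it_1}u_{it_2}u_{jl}^2K^i_{st_1}K^i_{st_2}(K^j_{lt})^2\\
&\quad+ \frac{C}{T^4}E\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T\sum_{l_1\neq l_2}\sum_{s=1}^T u_{it}^2u_{jl_1}u_{jl_2}(K^i_{st})^2K^j_{l_1t}K^j_{l_2t}\\
&\quad+ \frac{C}{T^4}E\sum_{i=1}^N\sum_{j\neq i}\sum_{t_1\neq t_2}\sum_{l=1}^T\sum_{s_1\neq s_2} u_{it_1}u_{it_2}u_{jl}^2K^i_{s_1t_1}K^i_{s_2t_2}(K^j_{lt})^2\\
&\quad+ \frac{C}{T^4}E\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T\sum_{l_1\neq l_2}\sum_{s_1\neq s_2} u_{it}^2u_{jl_1}u_{jl_2}K^i_{s_1t}K^i_{s_2t}K^j_{l_1t}K^j_{l_2t}\\
&\quad+ \frac{C}{T^4}E\sum_{i=1}^N\sum_{j\neq i}\sum_{t_1\neq t_2}\sum_{l_1\neq l_2}\sum_{s=1}^T u_{it_1}u_{it_2}u_{jl_1}u_{jl_2}K^i_{st_1}K^i_{st_2}K^j_{l_1t_1}K^j_{l_2t_2}\\
&\quad+ \frac{C}{T^4}E\sum_{i=1}^N\sum_{j\neq i}\sum_{t_1\neq t_2}\sum_{l_1\neq l_2}\sum_{s_1\neq s_2} u_{it_1}u_{it_2}u_{jl_1}u_{jl_2}K^i_{s_1t_1}K^i_{s_2t_2}K^j_{l_1t_1}K^j_{l_2t_2}\\
&=: \sum_{k=1}^8 \Pi_k.
\end{aligned}$$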

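By Lemma A.1 and $\mu_1 = 0$, we have

$$\sup_{x\in R}\Big|hK_{x,h}(X_{it}) - K\Big(\frac{X_{it}-x}{h}\Big)\Big(f_i(x)\mu_2 - o_P(1)\cdot\Big(\frac{X_{it}-x}{h}\Big)\Big)\Big| = o_P(1),$$

which, by A1 (iii), implies that

$$K_{x,h}(X_{it}) \le C_1h^{-1}K\Big(\frac{X_{it}-x}{h}\Big), \tag{A.3}$$

where $C_1$ is independent of $x$ and $X_{it}$. Let $\overline{K}^i_{st} = K\big(\frac{X_{it}-X_{is}}{h}\big)$.

For $\Pi_1$, by (A.3), we have

$$\begin{aligned}
\Pi_1 &\le \frac{C}{T^4h^4}E\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T u_{it}^2u_{jt}^2
 + \frac{C}{T^4h^4}E\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T\sum_{s\neq t}u_{it}^2u_{jt}^2\big((\overline{K}^i_{st})^2 + (\overline{K}^j_{st})^2\big)\\
&\quad+ \frac{1}{T^4h^4}E\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T\sum_{s\neq t}\sum_{l\neq t}u_{it}^2u_{jl}^2(\overline{K}^i_{st})^2(\overline{K}^j_{lt})^2\\
&= O\big(N^2T^{-3}h^{-4}\big) + O\Big(\frac{1}{T^4h^4}\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T\sum_{s\neq t}E\big[(\overline{K}^i_{st})^2 + (\overline{K}^j_{st})^2\big]\Big)\\
&\quad+ O\Big(\frac{1}{T^4h^4}\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T\sum_{s\neq t}\sum_{l\neq t}E\big[(\overline{K}^i_{st})^2(\overline{K}^j_{lt})^2\big]\Big)\\
&= O\big(N^2T^{-3}h^{-4}\big) + O\Big(\frac{1}{T^4h^4}\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T\sum_{s\neq t}\int\!\!\int K^2\Big(\frac{w-v}{h}\Big)f_{is,it}(v,w)\,dv\,dw\Big)\\
&\quad+ O\Big(\frac{1}{T^4h^4}\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T\sum_{s\neq t}\int\!\!\int K^2\Big(\frac{w-v}{h}\Big)f_{js,jt}(v,w)\,dv\,dw\Big)\\
&\quad+ O\Big(\frac{1}{T^4h^4}\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T\sum_{s\neq t}\sum_{l\neq t}\int\!\!\int\!\!\int\!\!\int K^2\Big(\frac{v_1-u_1}{h}\Big)K^2\Big(\frac{u_2-v_2}{h}\Big)f_{is,it,jl,jt}(u_1,v_1,u_2,v_2)\,du_1du_2dv_1dv_2\Big)\\
&= O\Big(\frac{N^2}{T^3h^4} + \frac{N^2}{T^2h^3} + \frac{N^2}{Th^2}\Big).
\end{aligned}$$

Therefore,

$$\Pi_1 = O\Big(\frac{N^2}{Th^2}\Big). \tag{A.4}$$

On the other hand, for $\Pi_2$, we have

$$\begin{aligned}
\Pi_2 &\le \frac{C}{T^4h^4}E\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T\sum_{s\neq t}\sum_{l=1}^T u_{it}^2u_{jl}^2\,\overline{K}^i_{st}(\overline{K}^j_{lt})^2\\
&\quad+ \frac{C}{T^4h^4}E\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T\sum_{s_1\neq t}\sum_{s_2\neq t,s_1}\sum_{l=1}^T u_{it}^2u_{jl}^2\,\overline{K}^i_{s_1t}\overline{K}^i_{s_2t}(\overline{K}^j_{lt})^2\\
&= O\Big(\frac{1}{T^4h^4}E\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T\sum_{s\neq t}\overline{K}^i_{st}\Big)
 + O\Big(\frac{1}{T^4h^4}E\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T\sum_{s\neq t}\sum_{l\neq t}\overline{K}^i_{st}(\overline{K}^j_{lt})^2\Big)\\
&\quad+ O\Big(\frac{1}{T^4h^4}E\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T\sum_{s_1\neq t}\sum_{s_2\neq t,s_1}\overline{K}^i_{s_1t}\overline{K}^i_{s_2t}\Big)\\
&\quad+ O\Big(\frac{1}{T^4h^4}E\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T\sum_{s_1\neq t}\sum_{s_2\neq t,s_1}\sum_{l\neq t}\overline{K}^i_{s_1t}\overline{K}^i_{s_2t}(\overline{K}^j_{lt})^2\Big)\\
&\le O\Big(\frac{N}{T^4h^4}\sum_{i=1}^N\sum_{t=1}^T\sum_{s\neq t}\int\!\!\int K\Big(\frac{w-v}{h}\Big)f_{is,it}(v,w)\,dv\,dw\Big)\\
&\quad+ O\Big(\frac{1}{T^4h^4}\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T\sum_{s\neq t}\sum_{l\neq t}\int\!\!\int\!\!\int\!\!\int K\Big(\frac{v_1-u_1}{h}\Big)K^2\Big(\frac{v_2-u_2}{h}\Big)f_{is,it,jl,jt}(u_1,v_1,u_2,v_2)\,du_1du_2dv_1dv_2\Big)\\
&\quad+ O\Big(\frac{1}{T^4h^4}\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T\sum_{s_1\neq t}\sum_{s_2\neq t,s_1}\int\!\!\int\!\!\int K\Big(\frac{v-u}{h}\Big)K\Big(\frac{w-u}{h}\Big)f_{is_1,is_2,it}(u,v,w)\,du\,dv\,dw\Big)\\
&\quad+ O\Big(\frac{1}{T^4h^4}\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T\sum_{s_1\neq t}\sum_{s_2\neq t,s_1}\sum_{l\neq t}\int\!\!\int\!\!\int\!\!\int\!\!\int K\Big(\frac{v_1-u_1}{h}\Big)K\Big(\frac{w_1-u_1}{h}\Big)K^2\Big(\frac{v_2-u_2}{h}\Big)\\
&\qquad\qquad\times f_{is_1,is_2,it,jl,jt}(u_1,v_1,w_1,u_2,v_2)\,du_1du_2dv_1dv_2dw_1\Big)\\
&= O\Big(\frac{N^2}{T^2h^3} + \frac{N^2}{Th^2} + \frac{N^2}{h}\Big).
\end{aligned}$$

Therefore, we have

$$\Pi_2 = O\Big(\frac{N^2}{h}\Big). \tag{A.5}$$

By A2 (ii) and Lemma A.2, we have

$$|E[u_{it_1}u_{it_2}] - E[u_{it_1}]E[u_{it_2}]| \le C_0\,\alpha_u^{\frac{\delta_0}{2+\delta_0}}(|t_1 - t_2|), \tag{A.6}$$

where $C_0$ is some positive constant. Hence, by the α–mixing coefficient condition in A2 (ii), (A.6) and following the calculation of $\Pi_2$, we have

$$\Pi_k = O\Big(\frac{N^2}{h}\Big), \quad k = 3, \cdots, 8. \tag{A.7}$$

In view of (A.4), (A.5) and (A.7), we have

$$\sum_{i=1}^N\sum_{j\neq i}\rho_T(i,j,2) = O_P(Nh^{-1/2}) = o_P(N\sqrt{T}) \tag{A.8}$$

since $Th \to \infty$ by A3.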
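The rate comparison behind (A.8) can be written out explicitly. The following is an added worked step, assuming only the bounds (A.4), (A.5) and (A.7) and the condition $Th \to \infty$.

```latex
E\Big[\sum_{i=1}^N\sum_{j\neq i}\rho_T(i,j,2)\Big]^2
  = O\Big(\frac{N^2}{Th^2} + \frac{N^2}{h}\Big)
  = O\Big(\frac{N^2}{h}\Big(1 + \frac{1}{Th}\Big)\Big)
  = O\Big(\frac{N^2}{h}\Big),
\qquad
\frac{N h^{-1/2}}{N\sqrt{T}} = \frac{1}{\sqrt{Th}} \;\longrightarrow\; 0 .
```

The first display gives the $O_P(Nh^{-1/2})$ order by Markov's inequality, and the second shows that this order is negligible relative to $N\sqrt{T}$.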



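Lemma A.4. Assume that the conditions of Theorem 3.1 are satisfied. Then under $H_0$, we have

$$\sum_{i=1}^N\sum_{j\neq i}\rho_T(i,j,k) = o_P(N\sqrt{T}), \quad k = 3, 5, 6. \tag{A.9}$$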

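Proof. For any x, by A1 (ii) and the definition of the local linear estimator, we have

$$(\hat g_i(x) - g_i(x))\hat f_i(x) = \frac{1}{T}\sum_{t=1}^T K_{x,h}(X_{it})u_{it} + \frac{g_i''(x)}{2T}\sum_{t=1}^T (X_{it}-x)^2K_{x,h}(X_{it}). \tag{A.10}$$

For $\sum_{i=1}^N\sum_{j\neq i}\rho_T(i,j,6)$, note that, by (A.10),

$$\begin{aligned}
\sum_{i=1}^N\sum_{j\neq i}\rho_T(i,j,6)
&= \sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T (g_i(X_{it}) - \hat g_i(X_{it}))(g_j(X_{jt}) - \hat g_j(X_{jt}))\hat f_i(X_{it})\hat f_j(X_{jt})\\
&= \frac{1}{T^2}\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T\sum_{s_1=1}^T\sum_{s_2=1}^T K^i_{s_1t}K^j_{s_2t}u_{is_1}u_{js_2}\\
&\quad+ \frac{1}{2T^2}\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T g_j''(X_{jt})\sum_{s_1=1}^T\sum_{s_2=1}^T (X_{js_2}-X_{jt})^2K^i_{s_1t}K^j_{s_2t}u_{is_1}\\
&\quad+ \frac{1}{2T^2}\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T g_i''(X_{it})\sum_{s_1=1}^T\sum_{s_2=1}^T (X_{is_1}-X_{it})^2K^i_{s_1t}K^j_{s_2t}u_{js_2}\\
&\quad+ \frac{1}{4T^2}\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T g_i''(X_{it})g_j''(X_{jt})\sum_{s_1=1}^T\sum_{s_2=1}^T (X_{is_1}-X_{it})^2K^i_{s_1t}(X_{js_2}-X_{jt})^2K^j_{s_2t}\\
&=: \sum_{i=1}^N\sum_{j\neq i}\sum_{k=1}^4\rho_T(i,j,6,k).
\end{aligned}$$

Similarly to the calculation of $\sum_{i=1}^N\sum_{j\neq i}\rho_T(i,j,2)$, we have

$$\sum_{i=1}^N\sum_{j\neq i}\rho_T(i,j,6,1) = o_P(N\sqrt{T}). \tag{A.11}$$

By A1(i)–(iii), we have

$$\begin{aligned}
E\Big|\sum_{i=1}^N\sum_{j\neq i}\rho_T(i,j,6,4)\Big|
&\le \frac{Ch^4}{T^2h^2}E\Big|\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T g_i''(X_{it})g_j''(X_{jt})\sum_{s_1\neq t}\Big(\frac{X_{is_1}-X_{it}}{h}\Big)^2K\Big(\frac{X_{is_1}-X_{it}}{h}\Big)\\
&\qquad\times\sum_{s_2\neq t}\Big(\frac{X_{js_2}-X_{jt}}{h}\Big)^2K\Big(\frac{X_{js_2}-X_{jt}}{h}\Big)\Big|\,(1+o(1))\\
&= O(N^2Th^4).
\end{aligned}$$

Since $N^2Th^8 \to 0$ by A3, we have

$$\sum_{i=1}^N\sum_{j\neq i}\rho_T(i,j,6,4) = O_P(N^2Th^4) = o_P(N\sqrt{T}). \tag{A.12}$$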
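The link between the bandwidth condition and (A.12) is a one-line calculation, added here for readability; it uses nothing beyond $N^2Th^8 \to 0$.

```latex
\frac{N^2Th^4}{N\sqrt{T}} = N\,T^{1/2}h^4 = \big(N^2Th^8\big)^{1/2} \;\longrightarrow\; 0 .
```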

By (A.11), (A.12) and the Cauchy–Schwarz inequality, we have

$$\sum_{i=1}^N\sum_{j\neq i}\rho_T(i,j,6,2) = o_P(N\sqrt{T}) \tag{A.13}$$

and

$$\sum_{i=1}^N\sum_{j\neq i}\rho_T(i,j,6,3) = o_P(N\sqrt{T}). \tag{A.14}$$

It then follows from (A.11)–(A.14) that

$$\sum_{i=1}^N\sum_{j\neq i}\rho_T(i,j,6) = o_P(N\sqrt{T}). \tag{A.15}$$

By the same arguments and derivations as for $\sum_{i=1}^N\sum_{j\neq i}\rho_T(i,j,6)$, we have

$$\sum_{i=1}^N\sum_{j\neq i}\rho_T(i,j,3) = o_P(N\sqrt{T}) \quad\text{and}\quad \sum_{i=1}^N\sum_{j\neq i}\rho_T(i,j,5) = o_P(N\sqrt{T}). \tag{A.16}$$


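Lemma A.5. Assume that the conditions of Theorem 3.1 are satisfied. Then under $H_0$, we have

$$\sum_{i=1}^N\sum_{j\neq i}\rho_T(i,j,1) = \sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T u_{it}u_{jt}f_i^2(X_{it})f_j^2(X_{jt})\mu_2^2\mu_0^2 + o_P(N\sqrt{T}). \tag{A.17}$$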

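Proof. The proof can be done in a similar way to that of Lemma A.4 above. In the following, we adopt a simplified way to complete the proof. Note that

$$\begin{aligned}
\rho_T(i,j,1) &= \sum_{t=1}^T u_{it}\hat f_i(X_{it})u_{jt}\hat f_j(X_{jt})\\
&= \sum_{t=1}^T u_{it}u_{jt}\big[f_i^2(X_{it})\mu_2\mu_0 + \hat f_i(X_{it}) - f_i^2(X_{it})\mu_2\mu_0\big]\big[f_j^2(X_{jt})\mu_2\mu_0 + \hat f_j(X_{jt}) - f_j^2(X_{jt})\mu_2\mu_0\big]\\
&= \sum_{t=1}^T u_{it}u_{jt}f_i^2(X_{it})f_j^2(X_{jt})\mu_2^2\mu_0^2
 + \sum_{t=1}^T u_{it}u_{jt}\big(\hat f_i(X_{it}) - f_i^2(X_{it})\mu_2\mu_0\big)f_j^2(X_{jt})\mu_2\mu_0\\
&\quad+ \sum_{t=1}^T u_{it}u_{jt}\big(\hat f_j(X_{jt}) - f_j^2(X_{jt})\mu_2\mu_0\big)f_i^2(X_{it})\mu_2\mu_0\\
&\quad+ \sum_{t=1}^T u_{it}u_{jt}\big(\hat f_i(X_{it}) - f_i^2(X_{it})\mu_2\mu_0\big)\big(\hat f_j(X_{jt}) - f_j^2(X_{jt})\mu_2\mu_0\big)\\
&=: \sum_{k=1}^4\rho_T(i,j,1,k).
\end{aligned}$$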

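For any $\varepsilon > 0$,

$$\begin{aligned}
P\Big(\sum_{i=1}^N\sum_{j=1,\neq i}^N\rho_T(i,j,1,4) > \varepsilon N\sqrt{T}\Big)
&= P\Big(\sum_{i=1}^N\sum_{j=1,\neq i}^N\rho_T(i,j,1,4) > \varepsilon N\sqrt{T},\ \Omega(\eta)\Big)\\
&\quad+ P\Big(\sum_{i=1}^N\sum_{j=1,\neq i}^N\rho_T(i,j,1,4) > \varepsilon N\sqrt{T},\ \Omega^c(\eta)\Big),
\end{aligned} \tag{A.18}$$

where

$$\Omega(\eta) := \Big\{\max_{i\ge1}\sup_{x\in R}\big|\hat f_i(x) - f_i^2(x)\mu_2\mu_0\big| < \eta,\ \max_{j\ge1}\sup_{x\in R}\big|\hat f_j(x) - f_j^2(x)\mu_2\mu_0\big| < \eta\Big\}.$$

By Lemma A.1,

$$P\Big(\sum_{i=1}^N\sum_{j=1,\neq i}^N\rho_T(i,j,1,4) > \varepsilon N\sqrt{T},\ \Omega^c(\eta)\Big) \le P(\Omega^c(\eta)) \to 0. \tag{A.19}$$

Let $\delta_i(X_{it}) = \hat f_i(X_{it}) - f_i^2(X_{it})\mu_2\mu_0$. Meanwhile, for each fixed (i, j),

$$\begin{aligned}
E\big[\rho_T(i,j,1,4)I(\Omega(\eta))\big]^2
&= \sum_{t=1}^T E\big[u_{it}^2u_{jt}^2\delta_i^2(X_{it})\delta_j^2(X_{jt})I(\Omega(\eta))\big]\\
&\quad+ \sum_{t=1}^T\sum_{s=1,\neq t}^T E[u_{it}u_{is}u_{jt}u_{js}]\,E\big[\delta_i(X_{it})\delta_i(X_{is})\delta_j(X_{jt})\delta_j(X_{js})I(\Omega(\eta))\big]\\
&\le \sum_{t=1}^T E\big[u_{it}^2u_{jt}^2\delta_i^2(X_{it})\delta_j^2(X_{jt})I(\Omega(\eta))\big]\\
&\quad+ \frac{1}{2}\sum_{t=1}^T\sum_{s=1,\neq t}^T |E[u_{it}u_{is}u_{jt}u_{js}]|\,E\big[\big(\delta_i^2(X_{it})\delta_j^2(X_{jt}) + \delta_i^2(X_{is})\delta_j^2(X_{js})\big)I(\Omega(\eta))\big]\\
&\le \sum_{t=1}^T E\big[u_{it}^2u_{jt}^2\delta_i^2(X_{it})\delta_j^2(X_{jt})I(\Omega(\eta))\big]
 + \Big(\sum_{t=1}^T |E[u_{it}u_{i1}u_{jt}u_{j1}]|\Big)\Big(\sum_{s=1}^T E\big[\delta_i^2(X_{is})\delta_j^2(X_{js})I(\Omega(\eta))\big]\Big)\\
&\le C\eta^4\sum_{t=1}^T E\big[u_{it}^2u_{jt}^2\big] = O(\eta^4T),
\end{aligned} \tag{A.20}$$

where I(A) is the indicator function of a set A, and we have used that $u_{it}$ and $X_{jt}$ are mutually independent as well as the fact that $\sum_{t=1}^T |E[u_{it}u_{i1}u_{jt}u_{j1}]| < \infty$ for all (i, j), which all follow from condition A2 and Lemma A.2.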

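Let $v_{it} = u_{it}\delta_i(X_{it})I(\Omega(\eta))$. In a similar way, we can derive that for T and N large enough

$$\begin{aligned}
E\Big[\sum_{i=2}^N\sum_{j=1}^{i-1}\sum_{t=1}^T v_{it}v_{jt}\Big]^2
&= \sum_{i=2}^N\sum_{j=1}^{i-1}\sum_{t=1}^T E\big[u_{it}^2u_{jt}^2\delta_i^2(X_{it})\delta_j^2(X_{jt})I(\Omega(\eta))\big]\\
&\quad+ 2\sum_{i=2}^N\sum_{j=1}^{i-1}\sum_{t=2}^T\sum_{s=1}^{t-1}E[u_{is}u_{it}u_{js}u_{jt}]\,E\big[\delta_i(X_{it})\delta_i(X_{is})\delta_j(X_{jt})\delta_j(X_{js})I(\Omega(\eta))\big]\\
&\quad+ 2\sum_{i=3}^N\sum_{j_1=1}^{i-1}\sum_{j_2=1}^{j_1-1}\sum_{t=1}^T E\big[u_{it}^2u_{j_1t}u_{j_2t}\big]\,E\big[\delta_i^2(X_{it})\delta_{j_1}(X_{j_1t})\delta_{j_2}(X_{j_2t})I(\Omega(\eta))\big]\\
&\quad+ 4\sum_{i=3}^N\sum_{j_1=1}^{i-1}\sum_{j_2=1}^{j_1-1}\sum_{t=2}^T\sum_{s=1}^{t-1}E[u_{is}u_{it}u_{j_1s}u_{j_2t}]\,E\big[\delta_i(X_{is})\delta_i(X_{it})\delta_{j_1}(X_{j_1s})\delta_{j_2}(X_{j_2t})I(\Omega(\eta))\big]\\
&\quad+ 4\sum_{i_1=4}^N\sum_{i_2=3}^{i_1-1}\sum_{j_1=1}^{i_1-1}\sum_{j_2=1}^{i_2-1}\sum_{t=1}^T E[u_{i_1t}u_{i_2t}u_{j_1t}u_{j_2t}]\,E\big[\delta_{i_1}(X_{i_1t})\delta_{i_2}(X_{i_2t})\delta_{j_1}(X_{j_1t})\delta_{j_2}(X_{j_2t})I(\Omega(\eta))\big]\\
&\quad+ 8\sum_{i_1=4}^N\sum_{i_2=3}^{i_1-1}\sum_{j_1=1}^{i_1-1}\sum_{j_2=1}^{i_2-1}\sum_{t=2}^T\sum_{s=1}^{t-1}E[u_{i_1s}u_{i_2t}u_{j_1s}u_{j_2t}]\\
&\qquad\times E\big[\delta_{i_1}(X_{i_1s})\delta_{i_2}(X_{i_2t})\delta_{j_1}(X_{j_1s})\delta_{j_2}(X_{j_2t})I(\Omega(\eta))\big]\\
&\le C\eta^4\sum_{i=2}^N\sum_{j=1}^{i-1}\sum_{t=1}^T E\big[u_{it}^2u_{jt}^2\big] = O(\eta^4N^2T),
\end{aligned} \tag{A.21}$$

where condition A2 and Lemma A.2 have been repeatedly used under $H_0$, under which $E[u_{it}^ku_{jt}^l] = E[u_{it}^k]E[u_{jt}^l]$ holds for all $i \neq j$, all $t \ge 1$ and k, l = 1, 2.

Letting $\eta \to 0$, it follows from (A.18)–(A.21) that

$$\sum_{i=1}^N\sum_{j\neq i}\rho_T(i,j,1,4) = o_P(N\sqrt{T}). \tag{A.22}$$

Note that

$$\sum_{i=1}^N\sum_{j\neq i}\rho_T(i,j,1,1) = O_P(N\sqrt{T}).$$

By the above equation, (A.22) and the Cauchy–Schwarz inequality, we have

$$\sum_{i=1}^N\sum_{j\neq i}\rho_T(i,j,1,2) = o_P(N\sqrt{T}) \quad\text{and}\quad \sum_{i=1}^N\sum_{j\neq i}\rho_T(i,j,1,3) = o_P(N\sqrt{T}). \tag{A.23}$$

Then (A.17) follows from (A.22) and (A.23).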



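Lemma A.6. Assume that the conditions of Theorem 3.1 are satisfied. Then we have for each fixed i and as $T \to \infty$,

$$\frac{1}{T}\sum_{t=1}^T \hat u_{it}^2 = \sigma_i^2 + o_P(1), \tag{A.24}$$

where $\sigma_i^2 = \mu_2^2\mu_0^2\int f_i^5(x)\,dx$ for each fixed i.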

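Proof. Observe that

$$\begin{aligned}
\frac{1}{T}\sum_{t=1}^T \hat u_{it}^2
&= \frac{1}{T}\sum_{t=1}^T\big(u_{it}f_i^2(X_{it})\mu_2\mu_0 + \hat u_{it} - u_{it}f_i^2(X_{it})\mu_2\mu_0\big)^2\\
&= \frac{1}{T}\sum_{t=1}^T\Big(u_{it}f_i^2(X_{it})\mu_2\mu_0 + u_{it}\big(\hat f_i(X_{it}) - f_i^2(X_{it})\mu_2\mu_0\big)\\
&\qquad- \frac{1}{T}\sum_{s=1}^T K_{X_{it},h}(X_{is})u_{is} - \frac{1}{T}\sum_{s=1}^T K_{X_{it},h}(X_{is})(g_i(X_{is}) - g_i(X_{it}))\Big)^2.
\end{aligned}$$

By A1 (iii) and the law of large numbers for α–mixing sequences (cf. Lin and Lu 1996), we have

$$\frac{1}{T}\sum_{t=1}^T u_{it}^2f_i^4(X_{it})\mu_2^2\mu_0^2 = \sigma_i^2 + o_P(1). \tag{A.25}$$

Similarly to the calculation of $\sum_{i=1}^N\sum_{j\neq i}\rho_T(i,j,1,4)$, we have

$$E\Big[\sum_{t=1}^T u_{it}^2\big(\hat f_i(X_{it}) - f_i^2(X_{it})\mu_2\mu_0\big)^2\Big] = o(T). \tag{A.26}$$

Furthermore, by (A.3) and standard calculation, we have

$$E\sum_{t=1}^T\Big(\frac{1}{T}\sum_{s=1}^T K_{X_{it},h}(X_{is})u_{is}\Big)^2 \le \frac{C}{T^2h^2}\sum_{t=1}^T\sum_{s=1}^T E\Big(K^2\Big(\frac{X_{is}-X_{it}}{h}\Big)u_{is}^2\Big) = O(h^{-1}) = o(T). \tag{A.27}$$

By Taylor expansion and a calculation similar to that for $\rho_T(i,j,6,4)$,

$$E\sum_{t=1}^T\Big(\frac{1}{T}\sum_{s=1}^T K_{X_{it},h}(X_{is})(g_i(X_{is}) - g_i(X_{it}))\Big)^2 = O(Th^2) = o(T). \tag{A.28}$$

By (A.25)–(A.28), we have

$$\sum_{t=1}^T u_{it}^2\big(\hat f_i(X_{it}) - f_i^2(X_{it})\mu_2\mu_0\big)^2 = o_P(T), \qquad \sum_{t=1}^T\Big(\frac{1}{T}\sum_{s=1}^T K_{X_{it},h}(X_{is})u_{is}\Big)^2 = o_P(T),$$

$$\sum_{t=1}^T\Big(\frac{1}{T}\sum_{s=1}^T K_{X_{it},h}(X_{is})(g_i(X_{is}) - g_i(X_{it}))\Big)^2 = o_P(T),$$

which imply that (A.24) holds.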

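Lemma A.7. Define $Z_{it} = \frac{u_{it}f_i^2(X_{it})\mu_0\mu_2}{\sigma_i}$. Let the conditions of Theorem 3.1 hold. Then under $H_0$, we have

$$\frac{1}{\sqrt{N(N-1)T}}\sum_{i=1}^N\sum_{j\neq i}^N\sum_{t=1}^T Z_{it}Z_{jt} \overset{d}{\longrightarrow} N(0, \tau_0) \tag{A.29}$$

as $T \to \infty$ first and then $N \to \infty$, where $\tau_0$ is as defined in Theorem 3.1.

Proof. Let $\widetilde Z_{it} = \frac{\sigma_i}{\sqrt{\tau_{ij}}}Z_{it}$. By the central limit theorem for stationary α–mixing sequences (cf. Lin and Lu 1996), we have for each fixed (i, j) and as $T \to \infty$

$$\frac{1}{\sqrt{T}}\sum_{t=1}^T \widetilde Z_{it}\widetilde Z_{jt} \overset{d}{\longrightarrow} N(0, 1), \tag{A.30}$$

in view of the fact that under $H_0$ we have $E[\widetilde Z_{it}\widetilde Z_{jt}] = 0$ and

$$\begin{aligned}
E\Big[\frac{1}{\sqrt{T}}\sum_{t=1}^T \widetilde Z_{it}\widetilde Z_{jt}\Big]^2
&= \frac{1}{T}\sum_{t=1}^T E\big[\widetilde Z_{it}\widetilde Z_{jt}\big]^2 + \frac{1}{T}\sum_{t=1}^T\sum_{s=1,\neq t}^T E\big[\widetilde Z_{it}\widetilde Z_{jt}\widetilde Z_{is}\widetilde Z_{js}\big]\\
&= \frac{\mu_0^4\mu_2^4}{\tau_{ij}^2}\frac{1}{T}\sum_{t=1}^T E[u_{it}^2]E[u_{jt}^2]E\big[f_i^4(X_{it})f_j^4(X_{jt})\big]\\
&\quad+ \frac{\mu_0^4\mu_2^4}{\tau_{ij}^2}\frac{1}{T}\sum_{t=1}^T\sum_{s=1,\neq t}^T E[u_{it}u_{is}]E[u_{jt}u_{js}]E\big[f_i^2(X_{it})f_i^2(X_{is})f_j^2(X_{jt})f_j^2(X_{js})\big]\\
&= 1 + o(1)
\end{aligned}$$

for large enough T and all fixed (i, j).

Let $W_{ij} = \frac{1}{\sqrt{T}}\frac{\tau_{ij}}{\sigma_i\sigma_j}\sum_{t=1}^T \widetilde Z_{it}\widetilde Z_{jt}$. Note that under $H_0$, $W_{ij}$ and $W_{kl}$ are uncorrelated for all $(i, j) \neq (k, l)$. Thus, by equation (A.30) and the continuous mapping theorem (see, for example, Corollary 2 of Billingsley 1968, p. 31), we have as $N \to \infty$

$$\frac{1}{\sqrt{N(N-1)}}\sum_{i=1}^N\sum_{j=1,\neq i}^N W_{ij} \overset{d}{\longrightarrow} N(0, \tau_0), \tag{A.31}$$

which implies that (A.29) holds.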

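Proof of Theorem 3.2. We start with the proof of Theorem 3.2(ii). Following the proof of Theorem 3.1 as above, we need only to show that

$$\frac{1}{N(N-1)T}\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T G(z_t, \beta_i)G(z_t, \beta_j) \overset{P}{\longrightarrow} \psi, \tag{A.32}$$

$$\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T F(z_t, \beta_i)\Big(\frac{1}{T}\sum_{s=1}^T \varepsilon_{js}K^j_{st}\Big) = o_P(N\sqrt{T}), \tag{A.33}$$

$$\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T F(z_t, \beta_i)\Big(\frac{1}{T}\sum_{s=1}^T F(z_s, \beta_j)K^j_{st}\Big) = o_P(N\sqrt{T}), \tag{A.34}$$

$$\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T\Big(\frac{1}{T}\sum_{s_1=1}^T F(z_{s_1}, \beta_i)K^i_{s_1t}\Big)\Big(\frac{1}{T}\sum_{s_2=1}^T F(z_{s_2}, \beta_j)K^j_{s_2t}\Big) = o_P(N\sqrt{T}), \tag{A.35}$$

$$\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T\Big(\frac{1}{T}\sum_{s_1=1}^T F(z_{s_1}, \beta_i)K^i_{s_1t}\Big)\Big(\frac{1}{T}\sum_{s_2=1}^T \varepsilon_{js_2}K^j_{s_2t}\Big) = o_P(N\sqrt{T}), \tag{A.36}$$

$$\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T \varepsilon_{it}\Big(\frac{1}{T}\sum_{s=1}^T F(z_s, \beta_j)K^j_{st}\Big) = o_P(N\sqrt{T}), \tag{A.37}$$

$$\frac{1}{T}\sum_{t=1}^T F^2(z_t, \beta_i) \to 0, \tag{A.38}$$

where $F(z_t, \beta_i) := F_{NT}(z_t, \beta_i)$.

By the law of large numbers for stationary α–mixing sequences, under A4(i) and (ii) we have

$$\frac{1}{T}\sum_{t=1}^T G(z_t, \beta_i)G(z_t, \beta_j) \to \psi_{ij}, \tag{A.39}$$

which, together with (3.8) and (3.9), implies that (A.32) holds.

By (3.8) and the law of large numbers for stationary α–mixing sequences, we can show that (A.38) holds analogously.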

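We now start to prove (A.33). By A3, A4 and Lemma A.2, we have, writing $\overline{K}^j_{st} = K\big(\frac{X_{jt}-X_{js}}{h}\big)$,

$$\begin{aligned}
E\Big[\sum_{i=1}^N\sum_{j\neq i}\sum_{t=1}^T F(z_t, \beta_i)\Big(\frac{1}{T}\sum_{s=1}^T \varepsilon_{js}K^j_{st}\Big)\Big]^2
&\le \frac{C}{T^2h^2}\sum_{i_1=1}^N\sum_{i_2=1}^N\sum_{j\neq i_1,i_2}\sum_{t_1=1}^T\sum_{t_2=1}^T\sum_{s=1}^T E[F(z_{t_1}, \beta_{i_1})F(z_{t_2}, \beta_{i_2})]\,E[\varepsilon_{js}^2]\,E\big[\overline{K}^j_{st_1}\overline{K}^j_{st_2}\big]\\
&\le \frac{C}{T^2h^2}\sum_{i_1=1}^N\sum_{i_2=1}^N\sum_{j\neq i_1,i_2}\sum_{t_1=1}^T\sum_{t_2=1}^T\sum_{s=1}^T h^2E[F(z_{t_1}, \beta_{i_1})F(z_{t_2}, \beta_{i_2})]\,E[\varepsilon_{js}^2]\\
&\le \frac{C}{T^2h^2}\sum_{i_1=1}^N\sum_{i_2=1}^N\sum_{j\neq i_1,i_2}\frac{T^2h^2}{NT^{1/2}} = O\big(N^2T^{-1/2}\big),
\end{aligned}$$

which, by the Markov inequality, implies that (A.33) holds. The proofs of (A.34) and (A.37) are similar to that of (A.33).

We then prove (A.35). By A4 and Lemma A.2, we have

$$\begin{aligned}
E\Big[\Big(\frac{1}{T}\sum_{s_1=1}^T F(z_{s_1}, \beta_i)K^i_{s_1t}\Big)\Big(\frac{1}{T}\sum_{s_2=1}^T F(z_{s_2}, \beta_j)K^j_{s_2t}\Big)\Big]^2
&\le \frac{1}{T^4h^4}\sum_{s_1=1}^T\sum_{t_1=1}^T\sum_{s_2=1}^T\sum_{t_2=1}^T E[F(z_{s_1}, \beta_i)F(z_{t_1}, \beta_i)F(z_{s_2}, \beta_j)F(z_{t_2}, \beta_j)]\\
&\qquad\times E\big[\overline{K}^i_{s_1t}\overline{K}^i_{t_1t}\overline{K}^j_{s_2t}\overline{K}^j_{t_2t}\big] = O\Big(\frac{1}{N^2T^3}\Big),
\end{aligned}$$

which, by the Markov inequality, implies that (A.35) holds. By the same argument, we can show that (A.36) holds. The proof of Theorem 3.2(ii) is therefore completed.

Let $\widetilde G(z_t, \beta_i) = N^{1/2}T^{1/4}G(z_t, \beta_i)$. In view of

$$F_{NT}(z_t, \beta_i) = G(z_t, \beta_i) = \frac{1}{N^{1/2}T^{1/4}}\big(N^{1/2}T^{1/4}G(z_t, \beta_i)\big) = \frac{1}{N^{1/2}T^{1/4}}\widetilde G(z_t, \beta_i),$$

the proof of Theorem 3.2(i) follows trivially from (A.32) with $G(z_t, \beta_i)$ being replaced by $\widetilde G(z_t, \beta_i)$.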

References

Arellano, M. (2003). Panel Data Econometrics. Oxford University Press: Oxford.

Auestad, B. and Tjøstheim, D. (1990). Identification of nonlinear time series: first order characteriza-tion and order determination. Biometrika 77, 669-687.

Baltagi, B. H. (1995). Econometrics Analysis of Panel Data. John Wiley: New York.

Billingsley, P. (1968). Convergence of Probability Measures. John Wiley: New York.

Breusch, T. S. and Pagan, A. R. (1980). The Lagrange multiplier test and its application to modelspecifications in econometrics. Review of Economic Studies 47, 239-253.

Cai, Z. and Li, Q. (2008). Nonparametric estimation of varying coefficient dynamic panel data models.Econometric Theory 24, 1321-1342.

Chen, R. and Tsay, R. S. (1993). Functional–coefficient autoregressive models. Journal of the AmericanStatistical Association 88, 298-308.

Chow, Y. S. and Teicher, H. (1988). Probability Theory. Springer–Verlag: New York.

Fan, J. and Gijbels, I. (1996). Local Polynomial Modelling and Its Applications. Chapman & Hall: London.

Fan, J. and Yao, Q. (2003). Nonlinear Time Series: Nonparametric and Parametric Methods. Springer: New York.

Frees, E. W. (1995). Assessing cross sectional correlation in panel data. Journal of Econometrics 69, 393-414.

Gao, J. (2007). Nonlinear Time Series: Semiparametric and Nonparametric Methods. Chapman & Hall/CRC: London.


Hall, P. and Heyde, C. (1980). Martingale Limit Theory and Its Applications. Academic Press: New York.

Hansen, B. E. (2008). Uniform convergence rates for kernel estimation with dependent data. Econometric Theory 24, 726-748.

Härdle, W., Liang, H. and Gao, J. (2000). Partially Linear Models. Springer Series: Contributions to Statistics. Physica-Verlag: New York.

Henderson, D., Carroll, R. and Li, Q. (2008). Nonparametric estimation and testing of fixed effects panel data models. Journal of Econometrics 144, 257-275.

Hjellvik, V., Chen, R. and Tjøstheim, D. (2004). Nonparametric estimation and testing in panels of intercorrelated time series. Journal of Time Series Analysis 25, 831-872.

Hsiao, C. (2003). Analysis of Panel Data. Cambridge University Press: Cambridge.

Hsiao, C., Pesaran, M. H. and Pick, A. (2007). Diagnostic tests of cross section independence for nonlinear panel data models. IZA discussion paper No. 2756.

Huang, H., Kao, C. and Urga, G. (2008). Copula-based tests for cross-sectional independence in panel models. Economics Letters 100, 224-228.

Li, Q. and Hsiao, C. (1998). Testing serial correlation in semiparametric panel data models. Journal of Econometrics 87, 207-237.

Li, Q. and Racine, J. (2007). Nonparametric Econometrics: Theory and Practice. Princeton University Press: Princeton.

Lin, Z. and Lu, C. (1996). Limit Theorems of Mixing Dependent Random Variables. Science Press, Kluwer Academic Pub: New York, Dordrecht.

Ng, S. (2006). Testing cross section correlation in panel data using spacing. Journal of Business and Economic Statistics 24, 12-23.

Pesaran, M. H. (2004). General diagnostic tests for cross section dependence in panels. Cambridge Working Paper in Economics No. 0435.

Pesaran, M. H., Ullah, A. and Yamagata, T. (2008). A bias adjusted LM test of error cross section independence. Econometrics Journal 11, 105-127.

Phillips, P. C. B. and Moon, H. (1999). Linear regression limit theory for nonstationary panel data. Econometrica 67, 1057-1111.

Sarafidis, V., Yamagata, T. and Robertson, D. (2009). A test of cross section dependence for a linear dynamic panel model with regressors. Journal of Econometrics 148, 149-161.

Serfling, R. J. (1980). Approximation Theorems of Mathematical Statistics. Wiley: New York.

Ullah, A. and Roy, N. (1998). Nonparametric and semiparametric econometrics of panel data. Handbook of Applied Economic Statistics, Ullah, A., Giles, D.E.A. (Eds.). Marcel Dekker, New York, pp. 579-604.
