
Article No. ectj?????? Econometrics Journal (2007), volume 00, pp. 0–20.

Generic Consistency of the Break-Point Estimators

under Specification Errors in a Multiple-Break Model

Jushan Bai†, Haiqiang Chen‡, Terence Tai-Leung Chong∗¹, and Seraph Xin Wang∗

† Department of Economics, New York University
E-mail: [email protected]

‡ Department of Economics, Cornell University
E-mail: [email protected]

∗ Department of Economics, The Chinese University of Hong Kong
E-mail: [email protected]; seraph [email protected]

Received: March 2006

Summary This paper considers the estimation of multiple-structural-break models under specification errors. A common example in economics is that the true model is measured in levels, but a linear-log model is estimated. We show that, under specification errors, if there is more than one break point and a single-break model is estimated, the estimated break point is consistent for one of the true break points. This consistency result applies to models with multiple regressors where some or all of the regressors are misspecified. Another important contribution of this paper is the construction of a Sup-Wald test whose limiting distribution is not affected by model misspecification. Using this robust test, we show that the break points can be estimated sequentially, one at a time. Simulation evidence and an empirical application are provided.

Keywords: Multiple Changes, Misspecification, Measurement Error, Consistency.

1. INTRODUCTION

The end of the last century saw significant advances in research on structural-break models¹. While earlier studies focus on the single-break model, attention has shifted to the multiple-break model in recent years. Pioneering works on multiple-break models include Chong (1995), Bai (1997), Bai and Perron (1998), and Chong (2001). The aforementioned studies propose different methods to estimate the locations and the number of breaks and establish the consistency of the break-point estimators under correct model specification. In the presence of model misspecification, Chong (2003) shows that the break-point estimator is still consistent when there is only one break. It is not known, however, whether this consistency result can be extended to the case of multiple breaks. In this paper, we extend the work of Chong (2003) to study the consistency of the break-point estimator in a multiple-break model under specification errors. This problem is important because, at least philosophically, all models are incorrect. We investigate the robustness of the estimated break point when incorrect models are employed. The main finding is that the break points can still be consistently estimated despite incorrect model specifications.

¹ Corresponding author.
¹ For recent developments in the literature, see Perron (2006).

© Royal Economic Society 2007

To present the key idea of the underlying consistency argument, consider the following measurement error model of Chang and Huang (1997):

$$y_t = \beta_1 \xi_t + \varepsilon_t, \quad t \le k_1,$$

$$y_t = \beta_2 \xi_t + \varepsilon_t, \quad t > k_1,$$

for $t = 1, \ldots, T$. The variable $\xi_t$ is not observable; instead, we observe

$$x_t = \xi_t + \eta_t,$$

where $\eta_t$ is the measurement error. It is assumed that $\varepsilon_t$ is independent of $\eta_t$ and $\xi_t$. In standard regression models without a break, it is well known that if $x_t$ is used as a regressor, simple least squares cannot yield a consistent estimate of the slope parameter because the new error is correlated with the regressor $x_t$.

We can rewrite the model with new slope coefficients without the measurement error problem. First, projecting $\xi_t$ on $x_t$, we have

$$\xi_t = c\, x_t + e_t,$$

for some constant $c$, where the projection residual $e_t$ is uncorrelated with $x_t$ by the projection argument. Thus, the original model can be rewritten as

$$y_t = (\beta_1 c)\, x_t + \varepsilon_t + \beta_1 e_t, \quad t \le k_1,$$

$$y_t = (\beta_2 c)\, x_t + \varepsilon_t + \beta_2 e_t, \quad t > k_1.$$

The disturbances $\varepsilon_t + \beta_1 e_t$ and $\varepsilon_t + \beta_2 e_t$ are uncorrelated with the regressor $x_t$. Thus, this is a standard change-point problem without measurement error. Furthermore, $\beta_1 \ne \beta_2$ implies $\beta_1 c \ne \beta_2 c$.

Note that $c \ne 0$ under the measurement error setup. The case of $c = 0$ corresponds to the case of using a regressor which is unrelated to the true regressor $\xi_t$. Such a situation will be ruled out by our assumptions. Since we have a standard structural-break model, the break fraction $\tau = k_1/T$ can be consistently estimated. More interestingly, the estimated break has a convergence rate of $T$. That is, $T(\hat\tau - \tau)$ is stochastically bounded despite measurement error in the regressors.
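The projection argument can be checked numerically. The following is a minimal sketch, assuming classical measurement error with independent Gaussian $\xi_t$ and $\eta_t$ (an assumption made here for illustration, in which case $c = \mathrm{var}(\xi_t)/\mathrm{var}(x_t)$); the sample size, variances, and trimming are illustrative choices, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
T, k1 = 2000, 1000                 # illustrative sample size and break date
beta1, beta2 = 1.0, 3.0

xi = rng.normal(0.0, 1.0, T)       # latent regressor (assumed Gaussian)
eta = rng.normal(0.0, 0.5, T)      # measurement error, variance 0.25 (assumed)
eps = rng.normal(0.0, 1.0, T)
x = xi + eta                       # observed regressor
y = np.where(np.arange(T) < k1, beta1, beta2) * xi + eps

c = 1.0 / (1.0 + 0.25)             # var(xi) / var(x) under the assumptions above

def slope_and_rss(xs, ys):
    """No-intercept least squares slope and residual sum of squares."""
    b = (xs @ ys) / (xs @ xs)
    r = ys - b * xs
    return b, r @ r

# Least squares break date: minimize the total RSS over candidate split points.
ks = np.arange(100, T - 100)
rss = [slope_and_rss(x[:k], y[:k])[1] + slope_and_rss(x[k:], y[k:])[1] for k in ks]
k_hat = int(ks[np.argmin(rss)])

b_pre, _ = slope_and_rss(x[:k_hat], y[:k_hat])
b_post, _ = slope_and_rss(x[k_hat:], y[k_hat:])
print("estimated break:", k_hat, "true break:", k1)
print("slopes:", b_pre, b_post, "targets beta_j * c:", beta1 * c, beta2 * c)
```

The slope estimates settle near $\beta_j c$ rather than $\beta_j$, while the break date is still located close to $k_1$.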

The above simple example highlights the main argument for consistency. We can extend this result to multiple-break models with general specification errors. Our estimation method does not assume that the number of breaks is known. It will be demonstrated that the estimated break point is consistent for one of the true break points when the model has multiple breaks, and that it has a convergence rate of $T$ in spite of various misspecifications.

The paper is organized as follows: Section 2 presents the model and major assumptions. Section 3 shows the asymptotic behavior of the break-point estimator and the Sup-Wald statistic under misspecification. Simulations in support of our theory are given in Section 4. Section 5 provides an empirical application. The last section concludes the paper. All proofs are relegated to the appendix.

2. THE MODEL AND ASSUMPTIONS

To begin with, we present some notation used frequently in this paper. $[x]$ denotes the greatest integer $\le x$. The symbol '$\overset{p}{\to}$' represents convergence in probability, '$\overset{d}{\to}$' represents convergence in distribution, and '$\Longrightarrow$' signifies weak convergence in $D[0,1]$; see Billingsley (1968) and Pollard (1984). All limits are taken as the sample size $T \to \infty$ unless otherwise stated.

Consider a multivariate linear regression model with $p$ break points at unknown dates, where the number of break points $p$ is also unknown. We assume that the true model has the following form:

$$y_t = f_1(x_{t1})\beta_{1i} + f_2(x_{t2})\beta_{2i} + \cdots + f_L(x_{tL})\beta_{Li} + u_t, \quad k_{i-1} < t \le k_i,$$

for $t = 1, 2, \ldots, T$ and $i = 1, 2, \ldots, p+1$.

This is a model with $p+1$ regimes, and the $p$ break points are $k_1, \ldots, k_p$. We define, throughout, $k_0 = 0$ and $k_{p+1} = T$.

In matrix notation, the true model can be rewritten as

$$Y = \sum_{i=1}^{p+1} I_i F \beta_i + U, \tag{2.1}$$

where

$Y = (y_1\ y_2\ \ldots\ y_T)'$ is a $T$ by 1 vector of $y_t$;

$F$ is a $T$ by $L$ matrix with $(t,l)$th element $f_l(x_{tl})$, where $f_l(\cdot)$ is a real-valued function for $l = 1, 2, \ldots, L$;

$U = (u_1\ u_2\ \ldots\ u_T)'$ is a $T$ by 1 vector whose elements $u_t$ form a martingale difference sequence with $\frac{1}{T}\sum_{t=1}^{T} u_t^2 \overset{p}{\to} \sigma^2 < \infty$ and $\sup_t E|u_t|^{4+c} < \infty$ for some $c > 0$ (Bai and Perron, 1998);

$\beta_i = (\beta_{1i}\ \beta_{2i}\ \ldots\ \beta_{Li})'$ is an $L$ by 1 vector of true parameters, $i = 1, 2, \ldots, p+1$;

$I_i$ is a $T$ by $T$ diagonal matrix with the $(t,t)$th element being the indicator function $1\{k_{i-1} < t \le k_i\}$, $i = 1, 2, \ldots, p+1$, where $k_i$, $i = 1, 2, \ldots, p$, are the dates of changes.

Let $\tau_i = k_i/T$ be the true break fraction for $i = 1, 2, \ldots, p$. Also let $\tau_0 = 0$ and $\tau_{p+1} = 1$. Let $k = [\tau T]$, where $\tau \in [0,1]$.

Without knowing the true model, the following regression model with a single break is estimated:

$$Y = I_a G \hat\beta_a[\tau] + I_b G \hat\beta_b[\tau] + \hat V, \tag{2.2}$$

where

$I_a$ is a $T$ by $T$ diagonal matrix with the $(t,t)$th element being the indicator function $1\{t < k\}$ with $k = [T\tau]$, and $I_b = I_{T\times T} - I_a$;

$\hat\beta_a[\tau]$ and $\hat\beta_b[\tau]$ are $N$ by 1 vectors of OLS coefficient estimates for the first and second subsamples respectively;

$G$ is a $T$ by $N$ matrix with the $(t,n)$th element $g_n(x_{tn})$, where $g_n(\cdot)$ is a real-valued function, $n = 1, 2, \ldots, N$;

$\hat V$ is a $T$ by 1 vector of the residuals for the misspecified model.


In this model, only one break point is estimated despite the existence of $p$ true break points. Therefore, two kinds of misspecification occur in this model: misspecification in the regressors and misspecification in the number of breaks.

Define the following matrices:

$$\lim_{T\to\infty} \frac{1}{T}\sum_{t=1}^{T} E\left[f(x_t) f(x_t)'\right] = Q_{ff},$$

$$\lim_{T\to\infty} \frac{1}{T}\sum_{t=1}^{T} E\left[g(x_t) g(x_t)'\right] = Q_{gg},$$

and

$$\lim_{T\to\infty} \frac{1}{T}\sum_{t=1}^{T} E\left[g(x_t) f(x_t)'\right] = Q_{gf},$$

where $f(x_t) = (f_1(x_{t1}), \ldots, f_L(x_{tL}))'$ is an $L \times 1$ vector, and $g(x_t) = (g_1(x_{t1}), \ldots, g_N(x_{tN}))'$ is an $N \times 1$ vector. We define $Q_{fg} = Q_{gf}'$.

$Q_{ff}$ and $Q_{gg}$ are assumed to be of full rank. The matrix $Q_{ff}$ being of full rank is a necessary identification condition even in the absence of breaks. The full rank requirement is standard. Define

$$Q = Q_{fg} Q_{gg}^{-1} Q_{gf}.$$

It is assumed that $Q$ is of full rank². Following Chong (2003), we also assume, uniformly in $\tau \in [0,1]$, that

(A1) $\frac{1}{T} F' I_a F \overset{p}{\to} \tau Q_{ff}$;

(A2) $\frac{1}{T} G' I_a G \overset{p}{\to} \tau Q_{gg}$;

(A3) $\frac{1}{T} G' I_a F \overset{p}{\to} \tau Q_{gf}$;

(A4) $\det F'[I(j) - I(h)]F > 0$ and $\det G'[I(j) - I(h)]G > 0$ for all $j, h \in [0, T]$ with $j > h$, where $I(j)$ and $I(h)$ are $T$ by $T$ diagonal matrices with the $(t,t)$th element being the indicator functions $1\{t \le j\}$ and $1\{t \le h\}$ respectively;

(A5) There exist some real numbers $r > 2$, $C_l > 0$ and $D_n > 0$ ($1 \le l \le L$, $1 \le n \le N$) such that for all $0 \le h < j \le T$, the two inequalities

$$E\left|\sum_{t=h+1}^{j} f_l(x_{tl})\, u_t\right|^{r} \le C_l\,(j-h)^{r/2}$$

and

$$E\left|\sum_{t=h+1}^{j} g_n(x_{tn})\, u_t\right|^{r} \le D_n\,(j-h)^{r/2}$$

hold;

(A6) $(\tau_1, \tau_2, \ldots, \tau_p, \beta_1', \beta_2', \ldots, \beta_{p+1}') \in \Theta = [\underline{\tau}, \bar{\tau}]^p \times B^{p+1} \subset (0,1)^p \times \mathbb{R}^{(p+1)\times L}$, where $B$ is an $L$-dimensional parameter space;

(A7) $\inf_i |\tau_{i+1} - \tau_i| > 0$, $i = 0, 1, \ldots, p$, with $\tau_0 = 0$ and $\tau_{p+1} = 1$;

(A8) $\inf_i \|\beta_{i+1} - \beta_i\| > 0$, $i = 1, 2, \ldots, p$.

² The assumption of full rank for $Q$ is equivalent to $Q_{gf}$ having full column rank. We do not allow all the regressors to be completely unrelated to the true regressors. The requirement that $Q_{gf}$ has full column rank says that at least a subset of regressors (of the same dimension as the true regressors $f$) is correlated with the true regressors. This is a reasonable assumption since one cannot expect to consistently estimate the model with arbitrary unrelated regressors.


Assumptions (A1)-(A3) exclude trending regressors. Assumption (A4) guarantees the invertibility of the matrices defined there, so the OLS regression coefficients $\hat\beta_a$ and $\hat\beta_b$ can be estimated. The uniform convergence result is ensured by assumption (A5). Assumption (A6) states that the true break points are in a compact set in $(0,1)$, as the estimated $\hat\beta_a$ and $\hat\beta_b$ are not defined at the boundary of the time domain. Assumption (A7) requires two consecutive change points to be separated far enough from each other. Assumption (A8) states that the magnitudes of changes should not be negligible.

In model (2.2), there are two kinds of misspecification, namely, $F \ne G$ or the number of true breaks being more than one ($p > 1$). In other words, the model is misspecified if the functional form of the regressors is incorrect or the true number of changes is more than one.

3. ASYMPTOTICS

3.1. Consistency

Following Chong (2001), we define

$$\hat\tau_T = \arg\min_{\tau \in [\underline{\tau}, \bar{\tau}]} RSS(\tau, 0, 1), \tag{3.3}$$

where $RSS(\tau, \tau^v, \tau^l)$ denotes the residual sum of squares of model (2.2) regressed on data dating from $\tau^v T$ to $\tau^l T$, and

$$RSS(\tau, 0, 1) = \left\| Y - I_a G \hat\beta_a[\tau] - I_b G \hat\beta_b[\tau] \right\|^2$$

is the residual sum of squares for the whole sample. For any given value of $\tau$, the least squares estimators of the pre- and post-shift parameters of model (2.2) are

$$\hat\beta_a[\tau] = (G' I_a G)^{-1} G' I_a Y,$$

$$\hat\beta_b[\tau] = (G' I_b G)^{-1} G' I_b Y.$$
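The estimator in (3.3) amounts to a grid search over candidate break dates. The sketch below is one possible implementation, not the authors' code; the function names `segment_rss` and `estimate_break`, the trimming fraction, and the simulated data are hypothetical choices for illustration. Here `k` counts the observations placed in the first subsample.

```python
import numpy as np

def segment_rss(y, G):
    """Residual sum of squares from an OLS fit of y on G."""
    coef, *_ = np.linalg.lstsq(G, y, rcond=None)
    resid = y - G @ coef
    return float(resid @ resid)

def estimate_break(y, G, trim=0.05):
    """Grid search for the single break date minimizing RSS(tau, 0, 1)."""
    T, N = G.shape
    k_lo = max(int(np.ceil(trim * T)), N)          # keep both subsamples estimable
    k_hi = min(int(np.floor((1 - trim) * T)), T - N)
    ks = np.arange(k_lo, k_hi + 1)
    rss = np.array([segment_rss(y[:k], G[:k]) + segment_rss(y[k:], G[k:])
                    for k in ks])
    k_hat = int(ks[np.argmin(rss)])
    return k_hat, k_hat / T

# Illustration (assumed design): the truth is linear in x, a linear-log model is fitted.
rng = np.random.default_rng(1)
T, k_true = 400, 200
x = rng.uniform(1.0, 10.0, T)
slope = np.where(np.arange(T) < k_true, 1.0, 3.0)
y = 1.0 + slope * x + rng.normal(0.0, 1.0, T)
G = np.column_stack([np.ones(T), np.log(x)])       # misspecified regressors g(x)

k_hat, tau_hat = estimate_break(y, G)
print(k_hat, tau_hat)
```

Refitting both subsamples at every candidate date costs roughly $O(T^2)$ operations, which is adequate for moderate $T$; running sums of $G'G$ and $G'Y$ would be a faster alternative.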

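For $\tau \in [\tau_h, \tau_{h+1}]$, $h = 0, 1, \ldots, p$,

$$\hat\beta_a[\tau] \overset{p}{\to} Q_{gg}^{-1} Q_{gf}\, \Gamma_1(\tau), \tag{3.4}$$

$$\hat\beta_b[\tau] \overset{p}{\to} Q_{gg}^{-1} Q_{gf}\, \Gamma_2(\tau),$$

where for $\tau \in [\tau_h, \tau_{h+1})$,

$$\Gamma_1(\tau) = \begin{cases} \beta_1, & h = 0, \\[4pt] \dfrac{\sum_{i=1}^{h} (\tau_i - \tau_{i-1})\beta_i + (\tau - \tau_h)\beta_{h+1}}{\tau}, & h = 1, 2, \ldots, p, \end{cases} \tag{3.5}$$

$$\Gamma_2(\tau) = \begin{cases} \dfrac{\sum_{i=h+2}^{p+1} (\tau_i - \tau_{i-1})\beta_i + (\tau_{h+1} - \tau)\beta_{h+1}}{1 - \tau}, & h = 0, 1, \ldots, p-1, \\[4pt] \beta_{p+1}, & h = p. \end{cases} \tag{3.6}$$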

In general, for any interval $(\tau^v, \tau^l) \subseteq [0,1]$, we have

$$\frac{1}{T} RSS(\tau, \tau^v, \tau^l) \overset{p}{\to} R(\tau, \tau^v, \tau^l) \quad \text{uniformly},$$

where $R(\tau, \tau^v, \tau^l)$ is a piecewise concave function of $\tau$ defined on $(\tau^v, \tau^l)$.

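In particular, it can be shown that

$$\frac{1}{T} RSS(\tau, 0, 1) \overset{p}{\to} R(\tau, 0, 1), \quad \tau \in [0,1],$$

uniformly, where

$$R(\tau, 0, 1) = \sigma^2 + \sum_{i=1}^{p+1} (\tau_i - \tau_{i-1})\, \beta_i' Q_{ff} \beta_i - \tau\, \Gamma_1'(\tau)\, Q\, \Gamma_1(\tau) - (1 - \tau)\, \Gamma_2'(\tau)\, Q\, \Gamma_2(\tau).$$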
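The limit function $R(\tau, 0, 1)$ can be evaluated directly for a given parameter configuration. The sketch below is illustrative only: the function name `limit_rss` is hypothetical, and the scalar configuration ($L = 1$, $Q = Q_{ff} = 1$, $\sigma^2 = 1$, breaks at $1/3$ and $2/3$, coefficients 1, 3, 4) is an assumed example. $\Gamma_1$ and $\Gamma_2$ are computed as length-weighted averages of the regime coefficients before and after $\tau$, which reproduces (3.5) and (3.6).

```python
import numpy as np

def limit_rss(tau, breaks, betas, Qff, Q, sigma2):
    """Evaluate R(tau, 0, 1); betas has shape (p + 1, L)."""
    edges = np.concatenate(([0.0], breaks, [1.0]))
    lo, hi = edges[:-1], edges[1:]
    w_before = np.clip(np.minimum(tau, hi) - lo, 0.0, None)  # regime lengths before tau
    w_after = np.clip(hi - np.maximum(tau, lo), 0.0, None)   # regime lengths after tau
    gamma1 = w_before @ betas / tau
    gamma2 = w_after @ betas / (1.0 - tau)
    level = sum(w * (b @ Qff @ b) for w, b in zip(hi - lo, betas))
    return (sigma2 + level
            - tau * (gamma1 @ Q @ gamma1)
            - (1.0 - tau) * (gamma2 @ Q @ gamma2))

# Assumed scalar example: two breaks, coefficients 1, 3 and 4.
breaks = np.array([1 / 3, 2 / 3])
betas = np.array([[1.0], [3.0], [4.0]])
Qff = Q = np.array([[1.0]])
sigma2 = 1.0

grid = np.arange(15, 286) / 300                    # tau from 0.05 to 0.95
values = np.array([limit_rss(t, breaks, betas, Qff, Q, sigma2) for t in grid])
tau_min = grid[int(np.argmin(values))]
print(tau_min, values.min())
```

In this configuration the minimum over the grid falls at the first true break fraction, in line with the piecewise concavity discussed next.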

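Note that for $\tau \in [0, \tau_1)$, we have

$$\frac{\partial R(\tau, 0, 1)}{\partial \tau} = -\left(\frac{1}{1-\tau}\right)^2 L_1' Q L_1 \le 0$$

and

$$\frac{\partial^2 R(\tau, 0, 1)}{\partial \tau^2} = -\frac{2}{(1-\tau)^3}\, L_1' Q L_1 \le 0,$$

where

$$L_1 = \sum_{i=2}^{p+1} (\tau_i - \tau_{i-1})\beta_i - (1 - \tau_1)\beta_1.$$

For $\tau \in [\tau_h, \tau_{h+1})$, $h = 1, 2, \ldots, p-1$, we have

$$\frac{\partial R(\tau, 0, 1)}{\partial \tau} = \Gamma_1'(\tau) Q \Gamma_1(\tau) - \Gamma_2'(\tau) Q \Gamma_2(\tau) - 2\beta_{h+1}' Q \left(\Gamma_1(\tau) - \Gamma_2(\tau)\right),$$

$$\frac{\partial^2 R(\tau, 0, 1)}{\partial \tau^2} = -\frac{2}{\tau}\left(\beta_{h+1} - \Gamma_1(\tau)\right)' Q \left(\beta_{h+1} - \Gamma_1(\tau)\right) - \frac{2}{1-\tau}\left(\beta_{h+1} - \Gamma_2(\tau)\right)' Q \left(\beta_{h+1} - \Gamma_2(\tau)\right) \le 0.$$

For $\tau \in [\tau_p, 1]$, we have

$$\frac{\partial R(\tau, 0, 1)}{\partial \tau} = \frac{1}{\tau^2}\, L_2' Q L_2 \ge 0,$$

where

$$L_2 = \sum_{i=1}^{p} (\tau_i - \tau_{i-1})\beta_i - \tau_p \beta_{p+1},$$

and

$$\frac{\partial^2 R(\tau, 0, 1)}{\partial \tau^2} = -\frac{2}{\tau^3}\, L_2' Q L_2 \le 0.$$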

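The beauty of $R(\tau, 0, 1)$ is that it is concave within any interval $[\tau_h, \tau_{h+1}]$, $h = 0, 1, \ldots, p$. In addition, for the two boundary intervals, we have $\frac{\partial R(\tau, 0, \tau_1)}{\partial \tau} \le 0$ and $\frac{\partial R(\tau, \tau_p, 1)}{\partial \tau} \ge 0$, implying that the function is decreasing for $\tau \in [0, \tau_1]$ and increasing for $\tau \in [\tau_p, 1]$. Since a concave function on a closed interval attains its minimum at an endpoint of that interval, the function $R(\tau, 0, 1)$ achieves its minimum at one of the true break points $(\tau_1, \ldots, \tau_p)$, which in turn implies that $\hat\tau_T$ is consistent for one of the true break points.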

Proposition 3.1. Under assumptions (A1)-(A8), $R(\tau, \tau^v, \tau^l)$ defined in any subsample $[\tau^v, \tau^l] \subseteq [0,1]$ is piecewise concave.

The proof is provided in the appendix. Proposition 3.1 suggests that the estimated break point is consistent for one of the true breaks. The piecewise concavity over each subinterval guarantees the consistent estimation of all the break points. The following theorem strengthens the consistency argument by including the convergence rate for the estimated break points.

Theorem 3.1. Under assumptions (A1)-(A8), for some $k = 1, 2, \ldots, p$:

(i) $\hat\tau_T$ is consistent for one of the true break points, i.e.,

$$\operatorname{plim}\, \hat\tau_T = \tau_k.$$

(ii) Suppose that $\hat\tau_T$ is consistent for $\tau_k$. Then

$$T(\hat\tau_T - \tau_k) = O_p(1).$$

Proof of Theorem 3.1: Recall that the true model has the form

$$Y = \sum_{i=1}^{p+1} I_i F \beta_i + U. \tag{3.7}$$

Let $F_t = f(x_t) = (f_1(x_{t1}), \ldots, f_L(x_{tL}))'$ and, similarly, let $G_t = g(x_t)$. Thus

$$F = (F_1, F_2, \ldots, F_T)',$$

$$G = (G_1, G_2, \ldots, G_T)'.$$

Projecting $F_t$ on $G_t$, we have

$$F_t = C G_t + e_t, \tag{3.8}$$

where $C$ is an $L \times N$ matrix of coefficients. Note that the rank of $C$ is also $L$, which is less than or equal to $N$. This follows from

$$\frac{1}{T}\sum_{t=1}^{T} F_t G_t' = C\, \frac{1}{T}\sum_{t=1}^{T} G_t G_t' + \frac{1}{T}\sum_{t=1}^{T} e_t G_t'.$$

From the orthogonality of $G_t$ and $e_t$ due to the projection, we have, in the limit, $Q_{fg} = C Q_{gg}$. Because the rank of $Q_{fg}$ is $L$ and the rank of $Q_{gg}$ is $N$ ($N \ge L$), the rank of $C$ must be $L$ (full row rank).

Let $e = (e_1, e_2, \ldots, e_T)'$. The model can be rewritten as

$$Y = \sum_{i=1}^{p+1} I_i (G C' + e)\beta_i + U = \sum_{i=1}^{p+1} I_i G \theta_i + V, \tag{3.9}$$

where

$$\theta_i = C' \beta_i$$

and

$$V = \sum_{i=1}^{p+1} I_i e \beta_i + U.$$

Furthermore, from $\beta_{i+1} - \beta_i \ne 0$, we have $\theta_{i+1} - \theta_i = C'(\beta_{i+1} - \beta_i) \ne 0$ because $C' = Q_{gg}^{-1} Q_{gf}$ has rank $L$ and $\beta_i$ is $L \times 1$. Thus, the model reduces to a standard multiple-change model with $G$ regarded as the true regressors and $\theta_i$ as the new coefficients. From this point of view, the estimated model is misspecified only in the number of break points. Invoking the results of Bai (1997), $\hat\tau_T$ is $T$-consistent for one of the true breaks. This completes the proof of Theorem 3.1. □ ³

³ The proof uses the fact that $Q_{gf}(\beta_{i+1} - \beta_i) \ne 0$, which is a necessary condition for consistent estimation of the break point. This is because the condition implies the existence of a break point in the projected model. The condition that $Q_{gf}(\beta_{i+1} - \beta_i) \ne 0$ ($i = 1, 2, \ldots, m$) does not always require $Q_{gf}$ to be of full column rank for a given parameter configuration $\beta$. However, in order for the condition to hold for all configurations of $\beta$, the matrix $Q_{gf}$ must be of full column rank. Because Theorem 3.1 makes a general statement that covers all configurations of $\beta$, the full column rank requirement for $Q_{gf}$ becomes necessary.

Theorem 3.1 shows the consistency of the estimated change point for one of the true ones. However, which of the change points will be identified depends on a number of factors. It is not necessarily the case that the break with the biggest magnitude will be identified first. The result also depends on the duration of the break and the form of misspecification. For example, in the simplest case of Chong (1995), where $F = G$ and the true model has two breaks, but a single-break model is estimated, if

$$\frac{(\beta_3 - \beta_2)' Q (\beta_3 - \beta_2)}{(\beta_2 - \beta_1)' Q (\beta_2 - \beta_1)} < \frac{\tau_1 (1 - \tau_1)}{\tau_2 (1 - \tau_2)},$$

then $\hat\tau \overset{p}{\to} \tau_1$, and if this inequality is reversed, then $\hat\tau \overset{p}{\to} \tau_2$.

According to Theorem 3.1, we can first consistently estimate one of the break points $\hat\tau_T$ in the whole sample, then split the sample into pre- and post-shift subsamples at the break fraction $\hat\tau_T$. Within both subsamples $[0, \hat\tau_T]$ and $(\hat\tau_T, 1]$, estimate $\hat\tau_T' = \arg\min RSS(\tau, 0, \hat\tau_T)$ and $\hat\tau_T'' = \arg\min RSS(\tau, \hat\tau_T, 1)$ to obtain the next two break points, assuming each subinterval contains a break point. This process continues until all other break points are located⁴. According to Proposition 3.1, the estimated break is consistent. Alternatively, in each subsequent round of estimation, only one break point is chosen, in the subsample for which the reduction in the residual sum of squares is greatest. This method is considered by Bai and Perron (1998) and is referred to as the sequential method. The process stops when the specified number of breaks is obtained.

⁴ Note that $R(\tau, 0, 1)$ will no longer be piecewise concave and the consistency of the estimation is not guaranteed if $Q_{fg} = 0$. For example, when $f(x_t) = x_t \sim$ i.i.d. $N(0, \sigma_x^2)$ and $g(x_t) = x_t^2$, we have $Q_{fg} = 0$, thus $\frac{1}{T} RSS(\tau, 0, 1) \overset{p}{\to} R(\tau, 0, 1) = \sigma^2 + \sum_{i=1}^{p+1} (\tau_i - \tau_{i-1})\, \beta_i' Q_{ff} \beta_i$, which is no longer a function of $\tau$.

3.2. The number of break points and the Wald test

Consistent estimation of a break point does not hinge on prior knowledge of the number of breaks, as long as there exists at least one break point. However, if the objective is to estimate all break points, one clearly needs to know the number of break points. Recently, Altissimo and Corradi (2003) propose a consistent estimation method for the number of break points when the model is correctly specified. In this paper, we discuss three estimation methods. The first is based on the Sup-Wald test procedure of Chong (2003). The second is based on a Wald test with a heteroskedasticity and autocorrelation consistent (HAC) estimator of the variance. The third uses the information criterion of Yao (1988).

3.2.1. The Sup-Wald test statistic

The Sup-Wald test statistic together with the sample-splitting method can be used to determine the location and number of the breaks. Suppose model (2.1) is the true model, but model (2.2) is estimated; the Sup-Wald statistic for the null hypothesis $H_0: \beta_1 = \beta_2 = \cdots = \beta_{p+1} = \beta$ is defined as:


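$$\sup_{\tau \in S} W_T(\tau, 0, 1) = \sup_{\tau \in S} \frac{T \tau (1 - \tau)}{RSS(\tau, 0, 1)} \left(\hat\beta_a[\tau] - \hat\beta_b[\tau]\right)' G'G \left(\hat\beta_a[\tau] - \hat\beta_b[\tau]\right), \tag{3.10}$$

where $S$ is a set with closure in $[0, 1]$.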
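A direct computation of (3.10) over a trimmed set of break dates might look as follows. This is a sketch under assumptions: the function names, the 15% trimming, and the simulated data are hypothetical, and the statistic is coded with $RSS(\tau, 0, 1)$ in the denominator as written above.

```python
import numpy as np

def ols(G, y):
    """OLS coefficients and residual sum of squares."""
    coef, *_ = np.linalg.lstsq(G, y, rcond=None)
    resid = y - G @ coef
    return coef, float(resid @ resid)

def sup_wald(y, G, trim=0.15):
    """Return (sup of W_T, maximizing break date) over the trimmed range."""
    T = len(y)
    GtG = G.T @ G
    best_w, best_k = -np.inf, None
    for k in range(int(np.ceil(trim * T)), int(np.floor((1 - trim) * T)) + 1):
        beta_a, rss_a = ols(G[:k], y[:k])
        beta_b, rss_b = ols(G[k:], y[k:])
        tau, diff = k / T, beta_a - beta_b
        w = T * tau * (1 - tau) * (diff @ GtG @ diff) / (rss_a + rss_b)
        if w > best_w:
            best_w, best_k = w, k
    return best_w, best_k

# Assumed illustration: one sample without a break, one with a slope break at t = 200.
rng = np.random.default_rng(2)
T = 400
x = rng.uniform(1.0, 10.0, T)
G = np.column_stack([np.ones(T), x])
u = rng.normal(0.0, 1.0, T)
y_null = 1.0 + 2.0 * x + u
y_break = 1.0 + np.where(np.arange(T) < 200, 1.0, 3.0) * x + u

w_null, _ = sup_wald(y_null, G)
w_break, k_break = sup_wald(y_break, G)
print(w_null, w_break, k_break)
```

The statistic stays moderate in the sample without a break and becomes very large when a break is present, which is the behavior a consistent test requires.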

The second and third arguments of $W_T(\tau, 0, 1)$ indicate that the test is calculated with the full sample. As shown in Chong (2003), although the pre- and post-shift estimators will not be consistent in the presence of misspecification, they will converge to the same constant under the null hypothesis of no structural break. Meanwhile, if there are structural breaks, the probability limits of the pre- and post-shift estimators will generally be different in the presence of specification errors. Therefore, the Wald-type test based on the magnitude of the estimated break will still be a consistent test.

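To derive the distribution of $W_T(\tau, 0, 1)$, we make the following assumptions:

(B1) $u_t$ is i.i.d. $(0, \sigma^2)$ with $\sigma^2 < \infty$;

(B2) $\frac{1}{\sqrt{T}} F' I_a U \Longrightarrow B_{fu}(\tau)$, where $B_{fu}(\tau)$ is an $L$-vector zero-mean Gaussian process with variance $\tau \sigma^2 Q_{ff}$;

(B3) $\frac{1}{\sqrt{T}} G' I_a U \Longrightarrow B_{gu}(\tau)$, where $B_{gu}(\tau)$ is an $N$-vector zero-mean Gaussian process with variance $\tau \sigma^2 Q_{gg}$;

(B4) $\sqrt{T}\, \mathrm{Vec}\left(\frac{1}{T} G' I_a G - \tau Q_{gg}\right) \Longrightarrow B_{gg}(\tau)$, where $B_{gg}(\tau)$ is an $N^2 \times 1$ vector of zero-mean Gaussian processes with covariance matrix

$$\tau \sigma^2 \lim_{T\to\infty} T\, E\left[\mathrm{Vec}\left(\frac{1}{T} G'G - Q_{gg}\right) \left(\mathrm{Vec}\left(\frac{1}{T} G'G - Q_{gg}\right)\right)'\right];$$

(B5) $\sqrt{T}\, \mathrm{Vec}\left(\frac{1}{T} G' I_a F - \tau Q_{fg}\right) \Longrightarrow \mathrm{Vec}\, B_{fg}(\tau)$, where $B_{fg}(\tau)$ is an $NL \times 1$ matrix of zero-mean Gaussian processes with variance

$$\tau \sigma^2 \lim_{T\to\infty} T\, E\left[\mathrm{Vec}\left(\frac{1}{T} G'F - Q_{fg}\right) \left(\mathrm{Vec}\left(\frac{1}{T} G'F - Q_{fg}\right)\right)'\right].$$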

Chong (2003) derives the null distribution of $W_T(\tau, 0, 1)$ under assumptions (A1)-(A5) and (B1)-(B5).

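Theorem 3.2. Under assumptions (A1)-(A5) and (B1)-(B5), and $H_0: \beta_1 = \beta_2 = \cdots = \beta_{p+1} = \beta$, as $T \to \infty$, we have

$$W_T(\tau, 0, 1) \Longrightarrow \frac{\left[\tau B_A(1) - B_A(\tau)\right]' Q_{gg}^{-1} \left[\tau B_A(1) - B_A(\tau)\right]}{\tau (1 - \tau) \left[\sigma^2 + \beta' (Q_{ff} - Q) \beta\right]}$$

and

$$\sup_{\tau \in S} W_T(\tau, 0, 1) \overset{d}{\to} \sup_{\tau \in S} \frac{\left[\tau B_A(1) - B_A(\tau)\right]' Q_{gg}^{-1} \left[\tau B_A(1) - B_A(\tau)\right]}{\tau (1 - \tau) \left[\sigma^2 + \beta' (Q_{ff} - Q) \beta\right]},$$

where $S$ is a set with closure in $[\underline{\tau}, \bar{\tau}]$, $B_A(\tau) = \left[B_{fg}(\tau) - B_{gg}(\tau) Q_{gg}^{-1} Q_{fg}\right]\beta + B_{gu}(\tau)$ is an $N$-vector of Brownian motions on $[0,1]$, and $Q = Q_{fg}' Q_{gg}^{-1} Q_{fg}$.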

The proof is given in Chong (2003) and is thus omitted. If there is no specification error, we have $B_A(\tau) = B_{fu}(\tau)$, $Q = Q_{ff}$, and

$$\sup_{\tau \in S} W_T(\tau, 0, 1) \overset{d}{\to} \sup_{\tau \in S} \frac{\|\tau B(1) - B(\tau)\|^2}{\tau (1 - \tau)}$$

under $H_0$, where $B(\tau)$ is an $L$-vector of independent Brownian motions. With little modification, Theorem 3.2 can also be shown to hold in any subsample. To estimate the number of breaks, one can follow the sample-splitting method of Chong (2001). After the first break point is identified, we split the sample into two subsamples using the first break-point estimate as a cut-off point. Then the same test is performed on the two subsamples. The procedure continues until all the break points are identified.
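The splitting steps can be sketched in code. For simplicity, this sketch uses the stopping rule in which the number of breaks is specified in advance and, in each round, the break is placed in the subsample with the greatest reduction in the residual sum of squares; a test-based rule would instead apply the Sup-Wald test within each subsample. The function names, the minimum segment length, and the simulated data are hypothetical.

```python
import numpy as np

def rss(y, G):
    """Residual sum of squares from an OLS fit of y on G."""
    coef, *_ = np.linalg.lstsq(G, y, rcond=None)
    r = y - G @ coef
    return float(r @ r)

def best_split(y, G, min_seg):
    """Best single break in a segment: (k, RSS reduction), or None if too short."""
    T = len(y)
    if T < 2 * min_seg:
        return None
    ks = range(min_seg, T - min_seg + 1)
    split_rss = [rss(y[:k], G[:k]) + rss(y[k:], G[k:]) for k in ks]
    j = int(np.argmin(split_rss))
    return ks[j], rss(y, G) - split_rss[j]

def sequential_breaks(y, G, n_breaks, min_seg=30):
    """Add one break per round, in the segment where the RSS falls the most."""
    breaks = []
    for _ in range(n_breaks):
        edges = [0] + sorted(breaks) + [len(y)]
        candidates = []
        for lo, hi in zip(edges[:-1], edges[1:]):
            found = best_split(y[lo:hi], G[lo:hi], min_seg)
            if found is not None:
                candidates.append((found[1], lo + found[0]))
        if not candidates:
            break
        breaks.append(max(candidates)[1])
    return sorted(breaks)

# Assumed illustration: slopes 1, 3, 4 over three equal regimes.
rng = np.random.default_rng(3)
T = 600
t = np.arange(T)
x = rng.uniform(1.0, 10.0, T)
slope = np.where(t < 200, 1.0, np.where(t < 400, 3.0, 4.0))
y = 1.0 + slope * x + rng.normal(0.0, 1.0, T)
G = np.column_stack([np.ones(T), x])

breaks = sequential_breaks(y, G, n_breaks=2)
print(breaks)
```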

Note that the distribution of the Sup-Wald statistic is now affected by the form of misspecification in an unknown manner. However, if the misspecification is due to measurement errors, we can still perform the test in some special cases. To see this, let the true model be

$$Y = \sum_{i=1}^{p+1} I_i F \beta_i + U,$$

but we estimate

$$Y = I_a (F + \Delta) \hat\beta_a[\tau] + I_b (F + \Delta) \hat\beta_b[\tau] + \hat V,$$

where $\Delta = (\delta_1, \delta_2, \ldots, \delta_T)'$ is a $T$ by $L$ matrix and $\delta_t = (\delta_{t1}, \delta_{t2}, \ldots, \delta_{tL})'$, $t = 1, 2, \ldots, T$, is the measurement error vector. Assuming that $\{\delta_{tl}\}_{t=1}^{T}$ are zero-mean i.i.d. across $t$ and $l$, and are independent of $F$ and $U$, we have

$$E(\Delta) = 0$$

and

$$\lim_{T\to\infty} \frac{1}{T} \Delta' \Delta = \Sigma,$$

where $\Sigma$ is an $L \times L$ diagonal matrix with the $(l,l)$th element $\mathrm{var}(\delta_{tl}) = \sigma_l^2$. Note that in this case, $Q_{fg} = Q_{ff}$ and $Q_{gg} = Q_{ff} + \Sigma$. It is not clear how $\Delta$ affects the test statistic when $Q_{ff}$ is not diagonal. However, when $L = 1$ and $f(x_t) \sim$ i.i.d. $N(0, \alpha_f \sigma^2)$, $\delta_t \sim$ i.i.d. $N(0, \alpha_\delta \sigma^2)$, where $\alpha_f$ and $\alpha_\delta$ are the ratios of $\mathrm{var}[f(x_t)]$ and $\mathrm{var}(\delta_t)$ to that of $u_t$, then

$$\sup_{\tau \in S} W_T(\tau, 0, 1) \overset{d}{\to} \sup_{\tau \in S} \frac{(\tau B(1) - B(\tau))^2}{\tau (1 - \tau)},$$

which is the conventional null distribution of the Sup-Wald test statistic for testing for a structural break.
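For reference, the quantities entering Theorem 3.2 can be written out for this measurement error case. The following short derivation is a sketch that only substitutes $Q_{fg} = Q_{ff}$ and $Q_{gg} = Q_{ff} + \Sigma$ into the definitions; the scalar expressions assume $L = 1$ with $Q_{ff} = \alpha_f \sigma^2$ and $\Sigma = \alpha_\delta \sigma^2$.

$$Q = Q_{ff}\,(Q_{ff} + \Sigma)^{-1} Q_{ff}, \qquad Q_{gg}^{-1} Q_{gf} = (Q_{ff} + \Sigma)^{-1} Q_{ff}.$$

$$L = 1: \quad Q = \frac{\alpha_f^2}{\alpha_f + \alpha_\delta}\,\sigma^2, \qquad Q_{gg}^{-1} Q_{gf} = \frac{\alpha_f}{\alpha_f + \alpha_\delta}, \qquad \sigma^2 + \beta^2 (Q_{ff} - Q) = \sigma^2 \left(1 + \beta^2\, \frac{\alpha_f \alpha_\delta}{\alpha_f + \alpha_\delta}\right).$$

The middle expression is the usual attenuation factor, and the last one is the scalar form of the variance term in the denominator of the limiting distribution in Theorem 3.2.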

3.2.2. The HAC adjusted Wald test statistic

In general, the limiting distribution of the Sup-Wald statistic depends on unknown parameters, except for some special cases. The underlying reason is that misspecification and measurement error induce heteroskedasticity in the sense that $E(g_t g_t' v_t^2) \ne E(g_t g_t')\, E v_t^2$. As is well known, the usual F test (and hence the Sup-Wald test) fails under heteroskedasticity. Thus, a heteroskedasticity-robust covariance estimator should be used. In this paper, we propose a modified test whose asymptotic distribution is not affected by the functional form $F$ or the unknown parameter $\beta$. As the misspecification in the functional form can be absorbed by the error term, we can construct a HAC adjusted Wald test statistic (Bai and Perron, 1998) as follows:

$$W_T^{HAC}(\tau, 0, 1) = T \tau (1 - \tau) \left(\hat\beta_a[\tau] - \hat\beta_b[\tau]\right)' \hat\Omega^{-1} \left(\hat\beta_a[\tau] - \hat\beta_b[\tau]\right), \tag{3.11}$$

where $S$ is a set with closure in $[0, 1]$, and the estimated asymptotic variance of $\hat\beta_a[\tau] - \hat\beta_b[\tau]$ is $\frac{1}{\tau (1 - \tau)} \hat\Omega$, where

$$\Omega = \operatorname{plim} \left(\frac{G'G}{T}\right)^{-1} \left[\frac{\sum_{t=1}^{T} g_t g_t' \hat v_t^2}{T}\right] \left(\frac{G'G}{T}\right)^{-1},$$

and $\hat v_t = y_t - g_t' \hat\beta$ ($t = 1, 2, \ldots, T$) are the estimated residuals. Note that $\hat\Omega$ is robust to heteroskedasticity. One can also employ the Newey-West (1987) estimator to make it robust to serial correlation as well.
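The statistic in (3.11) can be sketched with the heteroskedasticity-robust sandwich form of $\hat\Omega$. The code below is illustrative only: it reads $\hat v_t$ as the residuals from a single full-sample OLS fit (an assumption about the intended $\hat\beta$), it omits the Newey-West correction for serial correlation, and the function names, trimming, and simulated data are hypothetical.

```python
import numpy as np

def ols(G, y):
    """OLS coefficients and residual vector."""
    coef, *_ = np.linalg.lstsq(G, y, rcond=None)
    return coef, y - G @ coef

def sup_wald_hac(y, G, trim=0.15):
    """Sup over the trimmed range of the heteroskedasticity-robust Wald statistic."""
    T = len(y)
    _, v = ols(G, y)                               # full-sample residuals v_t (assumed)
    bread = np.linalg.inv(G.T @ G / T)
    meat = (G * (v ** 2)[:, None]).T @ G / T       # (1/T) * sum of g_t g_t' v_t^2
    omega_inv = np.linalg.inv(bread @ meat @ bread)
    stats = []
    for k in range(int(np.ceil(trim * T)), int(np.floor((1 - trim) * T)) + 1):
        beta_a, _ = ols(G[:k], y[:k])
        beta_b, _ = ols(G[k:], y[k:])
        tau, diff = k / T, beta_a - beta_b
        stats.append(T * tau * (1 - tau) * (diff @ omega_inv @ diff))
    return max(stats)

# Assumed illustration: the truth is linear in x, but ln(x) is used as the regressor.
rng = np.random.default_rng(4)
T = 400
x = rng.uniform(1.0, 10.0, T)
G = np.column_stack([np.ones(T), np.log(x)])
u = rng.normal(0.0, 1.0, T)
y_null = 1.0 + 2.0 * x + u
y_break = 1.0 + np.where(np.arange(T) < 200, 1.0, 3.0) * x + u

w_null = sup_wald_hac(y_null, G)
w_break = sup_wald_hac(y_break, G)
print(w_null, w_break)
```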

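Theorem 3.3. Under assumptions (A1)-(A5) and (B1)-(B5), and $H_0: \beta_1 = \beta_2 = \cdots = \beta_{p+1} = \beta$, as $T \to \infty$, we have $W_T^{HAC}(\tau, 0, 1) \overset{d}{\to} \chi_N^2$ for each fixed $\tau$, and

$$\sup_{\tau \in S} W_T^{HAC}(\tau, 0, 1) \overset{d}{\to} \sup_{\tau \in S} \frac{\|\tau B(1) - B(\tau)\|^2}{\tau (1 - \tau)},$$

where $S$ is a set with closure in $[\underline{\tau}, \bar{\tau}]$ and $B(\tau)$ is an $N$-vector of independent Brownian motions.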

Proof of Theorem 3.3: See the Appendix.

According to Theorem 3.3, $\sup W_T^{HAC}(\tau, 0, 1)$ converges to the same asymptotic distribution as the Sup-Wald test of Andrews (1993).

3.2.3. Information criteria of Yao (1988)

Alternatively, one can use the Schwarz information criterion approach of Yao (1988) to determine the total number of breaks. Once the number of breaks is determined, the dynamic programming method of Bai and Perron (1998) can be used to estimate all breaks simultaneously. No matter which method is used, the estimated break point is $T$-consistent.

4. SIMULATIONS

Experiment 1. This experiment shows the distribution of the first break-point estimator in a two-break model under misspecifications. We perform the following simulations:

True model:

$$y_t = 1 + x_t + u_t \quad \text{for } t \in \left[1, \tfrac{T}{3}\right],$$

$$y_t = 1 + 3x_t + u_t \quad \text{for } t \in \left(\tfrac{T}{3}, \tfrac{2T}{3}\right],$$

$$y_t = 1 + 4x_t + u_t \quad \text{for } t \in \left(\tfrac{2T}{3}, T\right].$$

Estimated model:

$$y_t = \beta_{01} + \beta_1 g(x_t) + v_t \quad \text{for } t \le \tau T,$$

$$y_t = \beta_{02} + \beta_2 g(x_t) + v_t \quad \text{for } t > \tau T.$$

| Model | (1) | (2) | (3) |
|---|---|---|---|
| $g(x_t)$ | $x_t$ | $x_t^3$ | $\ln x_t$ |

where $T = 1000$; $u_t \sim$ i.i.d. $N(0, 1)$; $x_t \sim$ i.i.d. $U(1, 10)$; $\{x_t\}_{t=1}^{T}$ and $\{u_t\}_{t=1}^{T}$ are independent of each other. Table 1 shows the frequency distribution of the minimum of $RSS(\tau, 0, 1)$ around the two true break points. The number of replications is set to $N = 1000$. Note that the first break will be identified first. Table 1 shows that the first break point always has a high chance of being identified.

Table 1. Distribution of the first estimated break point

| $\hat k_1 - 334$ | Model (1) | Model (2) | Model (3) |
|---|---|---|---|
| < −6 | 0 | 10 | 0 |
| −6 | 0 | 3 | 0 |
| −5 | 0 | 14 | 0 |
| −4 | 0 | 17 | 0 |
| −3 | 2 | 29 | 0 |
| −2 | 7 | 68 | 0 |
| −1 | 81 | 126 | 2 |
| 0 | 820 | 429 | 446 |
| 1 | 66 | 112 | 189 |
| 2 | 17 | 56 | 111 |
| 3 | 3 | 47 | 72 |
| 4 | 4 | 26 | 37 |
| 5 | 0 | 34 | 34 |
| 6 | 0 | 11 | 26 |
| > 6 | 0 | 18 | 83 |
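A scaled-down version of this experiment can be sketched as follows. The design follows the description above, but the sample size and number of replications are reduced for speed, so the output is not expected to reproduce Table 1; the function names and the trimming are hypothetical. The break search uses running sums of $G'G$ and $G'Y$ so that each replication is vectorized.

```python
import numpy as np

def first_break(y, G, trim):
    """RSS-minimizing single break date; k counts observations in the first segment."""
    T = len(y)
    GG = np.cumsum(G[:, :, None] * G[:, None, :], axis=0)  # running G'G
    Gy = np.cumsum(G * y[:, None], axis=0)                 # running G'y
    ks = np.arange(trim, T - trim + 1)
    A1, b1 = GG[ks - 1], Gy[ks - 1]
    A2, b2 = GG[-1] - A1, Gy[-1] - b1
    # RSS = y'y - b'A^{-1}b on each side; y'y does not depend on k,
    # so minimizing the RSS is the same as maximizing the explained part.
    fit1 = np.einsum('ki,ki->k', b1, np.linalg.solve(A1, b1[:, :, None])[:, :, 0])
    fit2 = np.einsum('ki,ki->k', b2, np.linalg.solve(A2, b2[:, :, None])[:, :, 0])
    return int(ks[np.argmax(fit1 + fit2)])

def simulate(g, T=300, reps=200, seed=0):
    """Deviations of the estimated break date from the first true break."""
    rng = np.random.default_rng(seed)
    k1, k2 = T // 3, 2 * T // 3
    t = np.arange(T)
    slope = np.where(t < k1, 1.0, np.where(t < k2, 3.0, 4.0))
    errors = np.empty(reps, dtype=int)
    for r in range(reps):
        x = rng.uniform(1.0, 10.0, T)
        y = 1.0 + slope * x + rng.normal(0.0, 1.0, T)
        G = np.column_stack([np.ones(T), g(x)])
        errors[r] = first_break(y, G, trim=15) - k1
    return errors

for name, g in [("x", lambda x: x), ("x^3", lambda x: x ** 3), ("ln x", np.log)]:
    err = simulate(g)
    print(name, "share within 2 observations of k1:", np.mean(np.abs(err) <= 2))
```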

Experiment 2. When there is only one regressor in both the true and the estimated models, we have shown that the Sup-Wald statistic will not be affected asymptotically. This experiment studies the behavior of the Sup-Wald statistic in the presence of measurement error. We consider the following models:

True model:

\[
y_t = \beta x_t + u_t, \qquad t = 1, 2, \ldots, T.
\]

Misspecified model:

\[
y_t = \beta (x_t + \delta_t) + v_t, \qquad t = 1, 2, \ldots, T.
\]

We set $T = 1000$ (sample size) and $N = 10000$ (number of replications), with $u_t \sim$ i.i.d. $N(0, 1)$, $x_t \sim$ i.i.d. $N(0, 1)$, and $\delta_t \sim$ i.i.d. $N(0, \alpha_\delta)$; $\{x_t\}_{t=1}^{T}$, $\{u_t\}_{t=1}^{T}$ and $\{\delta_t\}_{t=1}^{T}$ are independent of each other.

Using the misspecified model, we simulate the critical values for the null hypothesis of no break point under measurement error. Table 2 reports the critical value $c$ such that

\[
\Pr\left[\sup_{\tau \in (\lambda, 1-\lambda)} W_T(\tau, 0, 1) > c\right] = \alpha.
\]

The values are compared with those of Andrews (1993).

Table 2. Critical values ($\beta = 1$)

             Andrews (1993)          α_δ = 0                α_δ = 1                α_δ = 2
λ \ α      0.1    0.05    0.01     0.1    0.05    0.01     0.1    0.05    0.01     0.1    0.05    0.01
0.1       7.63    9.31   12.69    7.60    9.51   12.55    7.55    9.19   13.06    7.51    8.21   12.80
0.2       6.80    8.45   11.69    6.74    8.41   11.79    6.74    8.42   11.58    6.69    8.21   11.62
0.3       6.05    7.51   10.91    6.15    7.44   10.80    5.87    7.46   11.08    6.13    7.46   11.19
0.4       5.10    6.57    9.82    5.18    6.50    9.77    4.90    6.35    9.81    4.94    6.34   10.11

It can be clearly observed from Table 2 that the critical values are very close to those of Andrews (1993). Additional simulations (not reported here) show that, under general misspecifications, with the HAC-adjusted Sup-Wald test, the critical values are also close to those of Andrews (1993), as predicted by Theorem 3.

Table 3. Comparison results for different models and methods

Model         SEQ with BIC                     GLM with BIC         SEQ with Sup-HAC-Wald
Mean-shift    1st break: 79; 2nd break: 47     2 breaks: 47, 79     1st break: 79; 2nd break: 47
AR(1)         1st break: 81; 2nd break: 46     2 breaks: 46, 78     1st break: 81; 2nd break: 46; 3rd break: 24
AR(2)         1st break: 80                    1 break: 80          1st break: 80; 2nd break: 46; 3rd break: 24
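The critical-value simulation of Experiment 2 can be approximated with the sketch below. Several choices are assumptions made for this example rather than details confirmed by the text: $\alpha_\delta$ is treated as the variance of $\delta_t$, the Wald statistic uses the usual homoskedastic form with the two-regime residual variance, and the default number of replications is smaller than the paper's $N = 10000$ to keep the run short.

```python
import numpy as np

def sup_wald(y, g, lam=0.1):
    """Sup-Wald statistic for a one-time change in the slope of y_t = b * g_t + v_t."""
    T = len(y)
    ks = np.arange(int(np.ceil(lam * T)), int(np.floor((1 - lam) * T)) + 1)
    s_gg, s_gy = np.cumsum(g * g), np.cumsum(g * y)
    a_gg, a_gy = s_gg[ks - 1], s_gy[ks - 1]                # sums over t <= k
    b_gg, b_gy = s_gg[-1] - a_gg, s_gy[-1] - a_gy          # sums over t > k
    beta_a, beta_b = a_gy / a_gg, b_gy / b_gg              # subsample OLS slopes
    rss = np.sum(y * y) - beta_a * a_gy - beta_b * b_gy    # two-regime RSS for each k
    sigma2 = rss / (T - 2)                                 # assumed variance estimate
    wald = (beta_a - beta_b) ** 2 / (sigma2 * (1.0 / a_gg + 1.0 / b_gg))
    return float(wald.max())

def simulate_critical_value(alpha=0.05, lam=0.1, var_delta=1.0,
                            T=1000, N=2000, beta=1.0, seed=0):
    """Monte Carlo (1 - alpha) quantile of the Sup-Wald statistic under no break."""
    rng = np.random.default_rng(seed)
    stats = np.empty(N)
    for i in range(N):
        x, u = rng.standard_normal(T), rng.standard_normal(T)
        delta = np.sqrt(var_delta) * rng.standard_normal(T)
        y = beta * x + u                                   # true model, no break
        stats[i] = sup_wald(y, x + delta, lam)             # regressor observed with error
    return float(np.quantile(stats, 1.0 - alpha))

if __name__ == "__main__":
    print(simulate_critical_value(alpha=0.05, lam=0.1, var_delta=1.0))
```

Cumulative sums are used so that all candidate break dates are evaluated in one vectorised pass, which matters when the statistic is recomputed thousands of times.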

5. EMPIRICAL APPLICATION

In this section, we provide an application to the U.S. ex-post real interest rate series considered by Garcia and Perron (1996) and Bai and Perron (2003). The real interest rate series is constructed from the three-month Treasury bill rate deflated by the CPI, taken from the Citibase database. Quarterly data from 1961Q1 to 1986Q3 are used. Bai and Perron (2003) apply a simple mean-shift model to the data and find two breaks. In this paper, we check the robustness of the estimation when different models are used. We would also like to see whether the sequential estimation method (SEQ) gives the same estimation result as the global minimization method (GLM), which estimates all the break points simultaneously. Two methods are used to determine the number of breaks. The first is the BIC of Yao (1988), while the second is the Sup-HAC weighted Wald test.

Table 3 summarizes the results. Note that the sequential estimation method produces results similar to those of the global minimization method in all cases. For the mean-shift model, they give identical results. Using the BIC, two breaks are detected for the mean-shift model and for the AR(1) model, and only one break is found for the AR(2) model. However, if we use the Sup-HAC-Wald test, we obtain two breaks for the mean-shift model and three for the AR(1) and the AR(2) models. The number of breaks detected by the BIC and the Sup-HAC-Wald test can vary across different models, which is not unexpected in a finite sample. According to the Sup-HAC weighted Wald test, we conclude that there are three breaks in the AR models. Note that the estimated break dates are very close regardless of the method used.


6. CONCLUSION

In this paper, we combine the works of Bai (1997) and Chong (2003) to examine the consistency of the break-point estimator under misspecification of the independent variables in a multiple-break model with an unknown number of break points. The misspecification can occur in different dimensions: (i) the number of breaks is incorrectly specified, (ii) the functional forms of regressors are incorrectly specified, (iii) variables are observed with errors. It is shown that the residual sum of squares divided by the sample size converges uniformly to a piecewise concave function, which in turn implies that, if a one-break model is estimated in the presence of multiple breaks, the break-point estimator will converge to one of the true break points. Thus, it is possible to locate other break points by using the sample-splitting method. The rate of convergence for the estimated break points is also provided. If the number of break points is known, all break points can be consistently estimated one by one. The issue of how to determine the number of breaks is also discussed. Experimental and empirical evidence confirms our results.

ACKNOWLEDGEMENTS

The authors would like to thank Stephane Gregoir, Win lin Chou, Andy Kwan, Henry Lin and two anonymous referees for helpful comments. All errors are ours.

REFERENCES

Altissimo, F. and V. Corradi (2003). Strong rules for detecting the number of breaks in a time series. Journal of Econometrics 117, 207-244.

Andrews, D.W.K. (1993). Tests for parameter instability and structural change with unknown change point. Econometrica 61, 821-856.

Bai, J. (1997). Estimating multiple breaks one at a time. Econometric Theory 13, 315-352.

Bai, J. and P. Perron (1998). Testing for and estimating of multiple structural changes. Econometrica 66, 47-79.

Bai, J. and P. Perron (2003). Computation and analysis of multiple structural-change models. Journal of Applied Econometrics 18, 1-22.

Billingsley, P. (1968). Convergence of Probability Measures. New York: John Wiley and Sons.

Busetti, F. and A. Harvey (1998). Testing for the presence of a random walk in series with structural breaks. Discussion Paper No. Em/98/365, London School of Economics and Political Science.

Chang, Y.P. and W.T. Huang (1997). Inference for the linear errors-in-variables with change point models. Journal of the American Statistical Association 92, 171-178.

Chesher, A. (1991). The effect of measurement error. Biometrika 78, 451-462.

Chong, T.T.L. (1995). Partial parameter consistency in a misspecified structural change model. Economics Letters 49, 351-357.

Chong, T.T.L. (2001). Estimating the locations and number of change points by the sample-splitting method. Statistical Papers 42, 53-79.

Chong, T.T.L. (2003). Generic consistency of the break-point estimator under specification errors. Econometrics Journal 6, 167-192.


Forchini, G. (2002). Optimal similar tests for structural change for the linear regression model. Econometric Theory 18, 853-867.

Garcia, R. and P. Perron (1996). An analysis of the real interest rate under regime shifts. Review of Economics and Statistics 78, 111-125.

Lee, J., C.J. Huang and Y.C. Shin (1997). On stationary tests in the presence of structural breaks. Economics Letters 55, 165-172.

Nelson, D.B. (1995). Vector attenuation bias in the classical errors-in-variables model. Economics Letters 49, 345-349.

Newey, W.K. and K.D. West (1987). A simple positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix. Econometrica 55, 703-708.

Perron, P. (2006). Dealing with structural breaks, in Palgrave Handbook of Econometrics, Vol. 1: Econometric Theory, K. Patterson and T.C. Mills (eds.), Palgrave Macmillan, 278-352.

Pollard, D. (1984). Convergence of Stochastic Processes. New York: Springer Inc.

White, H. (1980). A heteroskedasticity-consistent covariance matrix and a direct test for heteroskedasticity. Econometrica 48, 817-838.

Yao, Y.C. (1988). Estimating the number of change-points via Schwarz criterion. Statistics & Probability Letters 6, 181-189.

APPENDIX

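Proof of Proposition 1: We shall establish the piecewise concavity of $R(\tau, \tau^v, \tau^l)$.

Suppose there are $q$ ($q \le p$) breaks within the segment $t = v+1, \ldots, l-1, l$. Let:

(i) $\tau^v = v/T$, $\tau^l = l/T$, $[\tau^v, \tau^l] \subset [\underline{\tau}, \overline{\tau}]$;

(ii) $\tau^v < \tau_1 < \tau_2 < \cdots < \tau_{q-1} < \tau_q < \tau^l$, where $\tau_i$, $i = 1, 2, \ldots, q$, are $q$ consecutive true change points, with $\tau_1$ being the $j$-th true change point of the full sample for some $j \in \{1, 2, \ldots, p-q+1\}$;

(iii) $I^v$, $I^l$, $I_a$, $I_b$, $I_i$ be, respectively, $(l-v) \times (l-v)$ diagonal matrices whose $(t-v, t-v)$th element is the indicator function $1\{v < t \le \tau_1 T\}$, $1\{\tau_q T < t \le l\}$, $1\{v < t \le \tau T\}$, $1\{\tau T < t \le l\}$, $1\{\tau_{i-1} T < t \le \tau_i T\}$, $i = 2, \ldots, q$.

For convenience in notation, let $\tau_0 = \tau^v$, $\tau_{q+1} = \tau^l$, $I_1 = I^v$, and $I_{q+1} = I^l$.

The true model in this segment is

\[
Y = \sum_{i=1}^{q+1} I_i F \beta_{r-1+i} + U, \tag{A.1}
\]

where $r$ is an integer such that $\tau_{r-1} \le \tau^v < \tau_r$ in the full-sample labeling of the true change points.

Let $\bar{\beta}_i = \beta_{r-1+i}$, $i = 1, 2, \ldots, q+1$. We then rewrite (A.1) as:

\[
Y = \sum_{i=1}^{q+1} I_i F \bar{\beta}_i + U.
\]

Due to misspecification, model (2) is estimated. For $\tau_0 \le \tau < \tau_1$, we have

\[
\begin{aligned}
I_a I_1 &= I_a; \\
I_a I_i &= 0, \quad i = 2, 3, \ldots, q+1; \\
I_b I_1 &= (I - I_a) I_1 = I_1 - I_a; \\
I_b I_i &= I_i, \quad i = 2, 3, \ldots, q+1.
\end{aligned}
\]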


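Thus,

\[
\begin{aligned}
\hat{\beta}_{a[\tau]} &= (G' I_a G)^{-1} G' I_a Y
= \left(\frac{G' I_a G}{T'}\right)^{-1} \frac{1}{T'} G' I_a \left(\sum_{i=1}^{q+1} I_i F \bar{\beta}_i + U\right) \\
&\xrightarrow{p} \left[(\tau - \tau_0) Q_{gg}\right]^{-1} (\tau - \tau_0) Q_{gf} \bar{\beta}_1
= Q_{gg}^{-1} Q_{gf} \bar{\beta}_1,
\end{aligned}
\]

\[
\begin{aligned}
\hat{\beta}_{b[\tau]} &= (G' I_b G)^{-1} G' I_b Y
= \left(\frac{G' I_b G}{T'}\right)^{-1} \frac{1}{T'} G' I_b \left(\sum_{i=1}^{q+1} I_i F \bar{\beta}_i + U\right) \\
&\xrightarrow{p} \left[(\tau_{q+1} - \tau) Q_{gg}\right]^{-1} Q_{gf} \left[(\tau_1 - \tau) \bar{\beta}_1 + \sum_{i=1}^{q} (\tau_{i+1} - \tau_i) \bar{\beta}_{i+1}\right]
= Q_{gg}^{-1} Q_{gf} \Gamma_2(\tau),
\end{aligned}
\]

uniformly, where

\[
\Gamma_2(\tau) = \frac{1}{\tau_{q+1} - \tau} \left[(\tau_1 - \tau) \bar{\beta}_1 + \sum_{i=1}^{q} (\tau_{i+1} - \tau_i) \bar{\beta}_{i+1}\right], \qquad \tau \in [\tau_0, \tau_1).
\]

Define

\[
R(\tau, \tau^v, \tau^l) = \sigma^2 + \sum_{i=1}^{q+1} (\tau_i - \tau_{i-1}) \bar{\beta}_i' Q_{ff} \bar{\beta}_i - (\tau - \tau_0) \bar{\beta}_1' Q \bar{\beta}_1 - (\tau_{q+1} - \tau) \Gamma_2'(\tau) Q \Gamma_2(\tau)
\]

on $[\tau_0, \tau_1)$, where $Q = Q_{gf}' Q_{gg}^{-1} Q_{gf}$ is a symmetric positive definite matrix. Let $T' = (\tau_{q+1} - \tau_0) T$. By assumptions (A1) to (A8) and the triangle inequality, we have

\[
\begin{aligned}
&\sup_{\tau \in [\tau_0, \tau_1)} \left| \frac{1}{T'} RSS(\tau, \tau^v, \tau^l) - R(\tau, \tau^v, \tau^l) \right| \\
&\quad \le \left| \frac{1}{T'} U'U - \sigma^2 \right|
+ \left| \frac{1}{T'} \sum_{i=1}^{q+1} \bar{\beta}_i' F' I_i F \bar{\beta}_i - \sum_{i=1}^{q+1} (\tau_i - \tau_{i-1}) \bar{\beta}_i' Q_{ff} \bar{\beta}_i \right| \\
&\qquad + \sup_{\tau \in [\tau_0, \tau_1)} \left| \frac{1}{T'} \left( \hat{\beta}_{a[\tau]}' G' I_a G \hat{\beta}_{a[\tau]} - 2 \hat{\beta}_{a[\tau]}' G' I_a F \bar{\beta}_1 \right) + (\tau - \tau_0) \bar{\beta}_1' Q \bar{\beta}_1 \right| \\
&\qquad + \sup_{\tau \in [\tau_0, \tau_1)} \left| \frac{1}{T'} \left( \hat{\beta}_{b[\tau]}' G' I_b G \hat{\beta}_{b[\tau]} - 2 \hat{\beta}_{b[\tau]}' G' (I_1 - I_a) F \bar{\beta}_1 - 2 \sum_{i=2}^{q+1} \hat{\beta}_{b[\tau]}' G' I_i F \bar{\beta}_i \right) + (\tau_{q+1} - \tau) \Gamma_2'(\tau) Q \Gamma_2(\tau) \right| \\
&\qquad + \frac{2}{T'} \left| \sum_{i=1}^{q+1} U' I_i F \bar{\beta}_i \right|
+ \sup_{\tau \in [\tau_0, \tau_1)} \frac{2}{T'} \left| U' I_a G \hat{\beta}_{a[\tau]} \right|
+ \sup_{\tau \in [\tau_0, \tau_1)} \frac{2}{T'} \left| U' I_b G \hat{\beta}_{b[\tau]} \right| \\
&\quad = o_p(1),
\end{aligned}
\]

since each individual term above is $o_p(1)$.

Thus,

\[
\frac{1}{T'} RSS(\tau, \tau^v, \tau^l) \xrightarrow{p} R(\tau, \tau^v, \tau^l)
\]

uniformly, where $R(\tau, \tau^v, \tau^l)$ is a piecewise concave function of $\tau$ defined on $(\tau^v, \tau^l)$.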


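Meanwhile,

\[
\begin{aligned}
\frac{\partial R(\tau, \tau^v, \tau^l)}{\partial \tau}
&= -\bar{\beta}_1' Q \bar{\beta}_1 + \Gamma_2'(\tau) Q \Gamma_2(\tau) - 2 (\tau_{q+1} - \tau) \Gamma_2'(\tau) Q \frac{\partial \Gamma_2(\tau)}{\partial \tau} \\
&= -\bar{\beta}_1' Q \bar{\beta}_1 + \Gamma_2'(\tau) Q \Gamma_2(\tau) - 2 \Gamma_2'(\tau) Q \left(\Gamma_2(\tau) - \bar{\beta}_1\right) \\
&= -\left(\Gamma_2(\tau) - \bar{\beta}_1\right)' Q \left(\Gamma_2(\tau) - \bar{\beta}_1\right) \\
&= -\left(\frac{1}{\tau_{q+1} - \tau}\right)^2 L_1' Q L_1 \le 0,
\end{aligned}
\]

where $L_1 = \sum_{i=1}^{q} (\tau_{i+1} - \tau_i) \bar{\beta}_{i+1} - (\tau_{q+1} - \tau_1) \bar{\beta}_1$. Moreover,

\[
\frac{\partial^2 R(\tau, \tau^v, \tau^l)}{\partial \tau^2} = -\frac{2}{(\tau_{q+1} - \tau)^3} L_1' Q L_1 \le 0.
\]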
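The second equality in the first-derivative expansion above relies on the derivative of $\Gamma_2(\tau)$, which is not written out. A short worked step follows, assuming the definition of $\Gamma_2(\tau)$ on $[\tau_0, \tau_1)$ given earlier, with $\bar{\beta}_i$ denoting the coefficient of the $i$-th regime within the segment; $N(\tau)$ is a label introduced here only for convenience.

```latex
\[
N(\tau) = (\tau_1 - \tau)\bar{\beta}_1 + \sum_{i=1}^{q} (\tau_{i+1} - \tau_i)\bar{\beta}_{i+1},
\qquad
\Gamma_2(\tau) = \frac{N(\tau)}{\tau_{q+1} - \tau},
\]
\[
\frac{\partial \Gamma_2(\tau)}{\partial \tau}
= \frac{-\bar{\beta}_1 (\tau_{q+1} - \tau) + N(\tau)}{(\tau_{q+1} - \tau)^2}
= \frac{\Gamma_2(\tau) - \bar{\beta}_1}{\tau_{q+1} - \tau},
\]
\[
\Gamma_2(\tau) - \bar{\beta}_1
= \frac{N(\tau) - (\tau_{q+1} - \tau)\bar{\beta}_1}{\tau_{q+1} - \tau}
= \frac{\sum_{i=1}^{q} (\tau_{i+1} - \tau_i)\bar{\beta}_{i+1} - (\tau_{q+1} - \tau_1)\bar{\beta}_1}{\tau_{q+1} - \tau}
= \frac{L_1}{\tau_{q+1} - \tau}.
\]
```

Since $L_1$ does not depend on $\tau$, differentiating $-(\tau_{q+1} - \tau)^{-2} L_1' Q L_1$ once more gives the second derivative directly.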

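Therefore, $\frac{1}{T'} RSS(\tau, \tau^v, \tau^l)$ converges uniformly to a concave function of $\tau$ for $\tau \in [\tau_0, \tau_1)$.

For $\tau \in [\tau_h, \tau_{h+1})$, $h = 1, 2, \ldots, q-1$, we have:

\[
\begin{aligned}
I_a I_i &= I_i, \quad 1 \le i \le h; \\
I_a I_{h+1} &= I_a - \sum_{i=1}^{h} I_i; \\
I_a I_i &= 0, \quad h+1 < i \le q+1; \\
I_b I_i &= 0, \quad i \le h; \\
I_b I_{h+1} &= \sum_{i=1}^{h+1} I_i - I_a; \\
I_b I_i &= I_i, \quad h+1 < i \le q+1.
\end{aligned}
\]

Thus,

\[
\begin{aligned}
\hat{\beta}_{a[\tau]} &= (G' I_a G)^{-1} G' I_a Y
= \left(\frac{G' I_a G}{T'}\right)^{-1} \frac{1}{T'} G' I_a \left(\sum_{i=1}^{q+1} I_i F \bar{\beta}_i + U\right) \\
&= \left(\frac{G' I_a G}{T'}\right)^{-1} \frac{1}{T'} G' \left(\sum_{i=1}^{h} I_i F \bar{\beta}_i + I_a F \bar{\beta}_{h+1} - \sum_{i=1}^{h} I_i F \bar{\beta}_{h+1} + I_a U\right) \\
&\xrightarrow{p} \left[(\tau - \tau_0) Q_{gg}\right]^{-1} Q_{gf} \left(\sum_{i=1}^{h} (\tau_i - \tau_{i-1}) \bar{\beta}_i + (\tau - \tau_h) \bar{\beta}_{h+1}\right)
= Q_{gg}^{-1} Q_{gf} \Gamma_1(\tau)
\end{aligned}
\]

uniformly, where

\[
\Gamma_1(\tau) = \frac{1}{\tau - \tau_0} \left(\sum_{i=1}^{h} (\tau_i - \tau_{i-1}) \bar{\beta}_i + (\tau - \tau_h) \bar{\beta}_{h+1}\right), \qquad \tau \in [\tau_h, \tau_{h+1}).
\]

Similarly,

\[
\begin{aligned}
\hat{\beta}_{b[\tau]} &= (G' I_b G)^{-1} G' I_b Y
= (G' I_b G)^{-1} G' I_b \left(\sum_{i=1}^{q+1} I_i F \bar{\beta}_i + U\right) \\
&= \left(\frac{G' I_b G}{T'}\right)^{-1} \frac{1}{T'} G' \left[\left(\sum_{i=1}^{h+1} I_i - I_a\right) F \bar{\beta}_{h+1} + \sum_{i=h+2}^{q+1} I_i F \bar{\beta}_i + I_b U\right] \\
&\xrightarrow{p} \left[(\tau_{q+1} - \tau) Q_{gg}\right]^{-1} Q_{gf} \left(\sum_{i=h+2}^{q+1} (\tau_i - \tau_{i-1}) \bar{\beta}_i + (\tau_{h+1} - \tau) \bar{\beta}_{h+1}\right)
= Q_{gg}^{-1} Q_{gf} \Gamma_2(\tau)
\end{aligned}
\]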


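uniformly, where

\[
\Gamma_2(\tau) = \frac{1}{\tau_{q+1} - \tau} \left(\sum_{i=h+2}^{q+1} (\tau_i - \tau_{i-1}) \bar{\beta}_i + (\tau_{h+1} - \tau) \bar{\beta}_{h+1}\right), \qquad \tau \in [\tau_h, \tau_{h+1}).
\]

Define

\[
R(\tau, \tau^v, \tau^l) = \sigma^2 + \sum_{i=1}^{q+1} (\tau_i - \tau_{i-1}) \bar{\beta}_i' Q_{ff} \bar{\beta}_i - (\tau - \tau_0) \Gamma_1'(\tau) Q \Gamma_1(\tau) - (\tau_{q+1} - \tau) \Gamma_2'(\tau) Q \Gamma_2(\tau)
\]

on $[\tau_h, \tau_{h+1})$. It follows that

\[
\begin{aligned}
&\sup_{\tau \in [\tau_h, \tau_{h+1})} \left| \frac{1}{T'} RSS(\tau, \tau^v, \tau^l) - R(\tau, \tau^v, \tau^l) \right| \\
&\quad \le \left| \frac{1}{T'} U'U - \sigma^2 \right|
+ \left| \frac{1}{T'} \sum_{i=1}^{q+1} \bar{\beta}_i' F' I_i F \bar{\beta}_i - \sum_{i=1}^{q+1} (\tau_i - \tau_{i-1}) \bar{\beta}_i' Q_{ff} \bar{\beta}_i \right| \\
&\qquad + \sup_{\tau \in [\tau_h, \tau_{h+1})} \left| \frac{1}{T'} \left[ \hat{\beta}_{a[\tau]}' G' I_a G \hat{\beta}_{a[\tau]} - 2 \hat{\beta}_{a[\tau]}' G' \sum_{i=1}^{h} I_i F \bar{\beta}_i - 2 \hat{\beta}_{a[\tau]}' G' \left(I_a - \sum_{i=1}^{h} I_i\right) F \bar{\beta}_{h+1} \right] + (\tau - \tau_0) \Gamma_1'(\tau) Q \Gamma_1(\tau) \right| \\
&\qquad + \sup_{\tau \in [\tau_h, \tau_{h+1})} \left| \frac{1}{T'} \left[ \hat{\beta}_{b[\tau]}' G' I_b G \hat{\beta}_{b[\tau]} - 2 \hat{\beta}_{b[\tau]}' G' \left(\sum_{i=1}^{h+1} I_i - I_a\right) F \bar{\beta}_{h+1} - 2 \hat{\beta}_{b[\tau]}' G' \sum_{i=h+2}^{q+1} I_i F \bar{\beta}_i \right] + (\tau_{q+1} - \tau) \Gamma_2'(\tau) Q \Gamma_2(\tau) \right| \\
&\qquad + \frac{2}{T'} \left| \sum_{i=1}^{q+1} U' I_i F \bar{\beta}_i \right|
+ \sup_{\tau \in [\tau_h, \tau_{h+1})} \frac{2}{T'} \left| U' I_a G \hat{\beta}_{a[\tau]} \right|
+ \sup_{\tau \in [\tau_h, \tau_{h+1})} \frac{2}{T'} \left| U' I_b G \hat{\beta}_{b[\tau]} \right| \\
&\quad = o_p(1).
\end{aligned}
\]

Meanwhile,

\[
\begin{aligned}
\frac{\partial R(\tau, \tau^v, \tau^l)}{\partial \tau}
&= -\left[ \Gamma_1'(\tau) Q \Gamma_1(\tau) - \Gamma_2'(\tau) Q \Gamma_2(\tau) + 2 (\tau - \tau_0) \Gamma_1'(\tau) Q \frac{\partial \Gamma_1(\tau)}{\partial \tau} + 2 (\tau_{q+1} - \tau) \Gamma_2'(\tau) Q \frac{\partial \Gamma_2(\tau)}{\partial \tau} \right] \\
&= -\left[ \Gamma_1'(\tau) Q \Gamma_1(\tau) - \Gamma_2'(\tau) Q \Gamma_2(\tau) + 2 \Gamma_1'(\tau) Q \left(\bar{\beta}_{h+1} - \Gamma_1(\tau)\right) + 2 \Gamma_2'(\tau) Q \left(\Gamma_2(\tau) - \bar{\beta}_{h+1}\right) \right] \\
&= \Gamma_1'(\tau) Q \Gamma_1(\tau) - \Gamma_2'(\tau) Q \Gamma_2(\tau) - 2 \bar{\beta}_{h+1}' Q \left(\Gamma_1(\tau) - \Gamma_2(\tau)\right) \\
&= \left(\bar{\beta}_{h+1} - \Gamma_1(\tau)\right)' Q \left(\bar{\beta}_{h+1} - \Gamma_1(\tau)\right) - \left(\bar{\beta}_{h+1} - \Gamma_2(\tau)\right)' Q \left(\bar{\beta}_{h+1} - \Gamma_2(\tau)\right).
\end{aligned}
\]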


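\[
\begin{aligned}
\frac{\partial^2 R(\tau, \tau^v, \tau^l)}{\partial \tau^2}
&= 2 \Gamma_1'(\tau) Q \frac{\partial \Gamma_1(\tau)}{\partial \tau} - 2 \Gamma_2'(\tau) Q \frac{\partial \Gamma_2(\tau)}{\partial \tau}
- 2 \bar{\beta}_{h+1}' Q \left[ \frac{1}{\tau - \tau_0} \left(\bar{\beta}_{h+1} - \Gamma_1(\tau)\right) - \frac{1}{\tau_{q+1} - \tau} \left(\Gamma_2(\tau) - \bar{\beta}_{h+1}\right) \right] \\
&= \frac{2}{\tau - \tau_0} \Gamma_1'(\tau) Q \left(\bar{\beta}_{h+1} - \Gamma_1(\tau)\right) - \frac{2}{\tau_{q+1} - \tau} \Gamma_2'(\tau) Q \left(\Gamma_2(\tau) - \bar{\beta}_{h+1}\right) \\
&\quad - 2 \bar{\beta}_{h+1}' Q \bar{\beta}_{h+1} \left( \frac{1}{\tau - \tau_0} + \frac{1}{\tau_{q+1} - \tau} \right) + 2 \bar{\beta}_{h+1}' Q \left( \frac{\Gamma_1(\tau)}{\tau - \tau_0} + \frac{\Gamma_2(\tau)}{\tau_{q+1} - \tau} \right) \\
&= -\frac{2}{\tau - \tau_0} \left( \bar{\beta}_{h+1}' Q \bar{\beta}_{h+1} - 2 \bar{\beta}_{h+1}' Q \Gamma_1(\tau) + \Gamma_1'(\tau) Q \Gamma_1(\tau) \right) \\
&\quad - \frac{2}{\tau_{q+1} - \tau} \left( \bar{\beta}_{h+1}' Q \bar{\beta}_{h+1} - 2 \bar{\beta}_{h+1}' Q \Gamma_2(\tau) + \Gamma_2'(\tau) Q \Gamma_2(\tau) \right) \\
&= -\frac{2}{\tau - \tau_0} \left(\bar{\beta}_{h+1} - \Gamma_1(\tau)\right)' Q \left(\bar{\beta}_{h+1} - \Gamma_1(\tau)\right)
- \frac{2}{\tau_{q+1} - \tau} \left(\bar{\beta}_{h+1} - \Gamma_2(\tau)\right)' Q \left(\bar{\beta}_{h+1} - \Gamma_2(\tau)\right) \\
&\le 0.
\end{aligned}
\]

Therefore, $\frac{1}{T'} RSS(\tau, \tau^v, \tau^l)$ converges uniformly to a concave function of $\tau$ for $\tau \in [\tau_h, \tau_{h+1})$, $h = 1, 2, \ldots, q-1$.

For $\tau \in [\tau_q, \tau_{q+1}]$, we have

\[
\begin{aligned}
I_a I_i &= I_i, \quad i = 1, 2, \ldots, q; \\
I_a I_{q+1} &= I_a - \sum_{i=1}^{q} I_i; \\
I_b I_i &= 0, \quad i = 1, 2, \ldots, q; \\
I_b I_{q+1} &= I_b.
\end{aligned}
\]

Using a similar argument, we can prove that

\[
\hat{\beta}_{a[\tau]} \xrightarrow{p} Q_{gg}^{-1} Q_{gf} \Gamma_1(\tau),
\]

where

\[
\Gamma_1(\tau) = \frac{1}{\tau - \tau_0} \left[ \sum_{i=1}^{q} (\tau_i - \tau_{i-1}) \bar{\beta}_i + (\tau - \tau_q) \bar{\beta}_{q+1} \right], \qquad \tau \in [\tau_q, \tau_{q+1}],
\]

and

\[
\hat{\beta}_{b[\tau]} \xrightarrow{p} Q_{gg}^{-1} Q_{gf} \bar{\beta}_{q+1}.
\]

Define

\[
R(\tau, \tau^v, \tau^l) = \sigma^2 + \sum_{i=1}^{q+1} (\tau_i - \tau_{i-1}) \bar{\beta}_i' Q_{ff} \bar{\beta}_i - (\tau - \tau_0) \Gamma_1'(\tau) Q \Gamma_1(\tau) - (\tau_{q+1} - \tau) \bar{\beta}_{q+1}' Q \bar{\beta}_{q+1}
\]

on $[\tau_q, \tau_{q+1}]$. It follows that

\[
\sup_{\tau \in [\tau_q, \tau_{q+1}]} \left| \frac{1}{T'} RSS(\tau, \tau^v, \tau^l) - R(\tau, \tau^v, \tau^l) \right| = o_p(1).
\]

Also,

\[
\frac{\partial R(\tau, \tau^v, \tau^l)}{\partial \tau} = \frac{1}{(\tau - \tau_0)^2} L_2' Q L_2 \ge 0,
\]

where $L_2 = \sum_{i=1}^{q} (\tau_i - \tau_{i-1}) \bar{\beta}_i - (\tau_q - \tau_0) \bar{\beta}_{q+1}$, and

\[
\frac{\partial^2 R(\tau, \tau^v, \tau^l)}{\partial \tau^2} = -\frac{2}{(\tau - \tau_0)^3} L_2' Q L_2 \le 0.
\]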


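Combining the results above, we prove that $\frac{1}{T'} RSS(\tau, \tau^v, \tau^l)$ converges to a piecewise concave function $R(\tau, \tau^v, \tau^l)$ on any segment $(\tau^v T, \tau^l T)$ of $[0, T]$. With little modification, one can show that

\[
\frac{1}{T} RSS(\tau, 0, 1) \xrightarrow{p} R(\tau, 0, 1), \qquad \tau \in [\underline{\tau}, \overline{\tau}],
\]

uniformly, where $R(\tau, 0, 1)$ is a piecewise concave function of $\tau$ on $(0, 1)$ and is defined as

\[
R(\tau, 0, 1) = \sigma^2 + \sum_{i=1}^{p+1} (\tau_i - \tau_{i-1}) \beta_i' Q_{ff} \beta_i - \tau \Gamma_1'(\tau) Q \Gamma_1(\tau) - (1 - \tau) \Gamma_2'(\tau) Q \Gamma_2(\tau)
\]

with $\Gamma_1(\tau)$ and $\Gamma_2(\tau)$ as defined in Section 3. $\square$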
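The shape of the limit function can be checked numerically in the scalar case. The sketch below evaluates $R(\tau, 0, 1)$ for a single regressor, taking $\Gamma_1(\tau)$ and $\Gamma_2(\tau)$ to be the length-weighted averages of the true coefficients to the left and right of $\tau$ (the full-sample analogue of the expressions in this proof). The function name and the values `q_ff = 1.0`, `q = 0.8`, and `sigma2 = 1.0` are hypothetical choices for illustration only.

```python
import numpy as np

def limit_rss(tau, breaks, betas, q_ff=1.0, q=0.8, sigma2=1.0):
    """Scalar-regressor version of R(tau, 0, 1) for a piecewise-constant coefficient."""
    breaks = np.asarray(breaks, dtype=float)
    betas = np.asarray(betas, dtype=float)
    edges = np.concatenate(([0.0], breaks, [1.0]))
    widths = np.diff(edges)                            # lengths of the true regimes
    # Length of each true regime that lies to the left of tau.
    left = np.clip(tau - edges[:-1], 0.0, widths)
    gamma1 = left @ betas / tau                        # average coefficient on [0, tau]
    gamma2 = (widths - left) @ betas / (1.0 - tau)     # average coefficient on [tau, 1]
    level = sigma2 + q_ff * (widths @ betas**2)
    return level - q * (tau * gamma1**2 + (1.0 - tau) * gamma2**2)

if __name__ == "__main__":
    grid = np.linspace(0.05, 0.95, 901)
    values = np.array([limit_rss(t, [1/3, 2/3], [1.0, 3.0, 4.0]) for t in grid])
    print("grid minimiser:", grid[np.argmin(values)])
```

With slopes 1, 3, and 4 and breaks at 1/3 and 2/3, as in Experiment 1, the grid minimiser sits next to the first break, and second differences of `values` within each regime are non-positive, in line with piecewise concavity.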

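Proof of Theorem 3.3: Rewrite

\[
f(x_t) = Q_{fg} Q_{gg}^{-1} g(x_t) + e_t,
\]

where $Q_{gg} = E(g_t g_t')$. By definition, $E(e_t g_t') = 0$, where $g_t = g(x_t)$. In matrix notation,

\[
F = G Q_{gg}^{-1} Q_{gf} + e,
\]

where

\[
e = F - G Q_{gg}^{-1} Q_{gf}.
\]

Thus,

\[
Y = G\delta + e\beta + \varepsilon = G\delta + v,
\]

where

\[
\delta = Q_{gg}^{-1} Q_{gf} \beta, \qquad v = e\beta + \varepsilon.
\]

Under the null, we have

\[
\hat{\beta} = (G'G)^{-1} G'Y = \delta + (G'G)^{-1} G'v.
\]

Thus,

\[
\hat{\beta} - \delta = (G'G)^{-1} G' (e\beta + \varepsilon).
\]

While $e_t$ is uncorrelated with $g_t$, $(\beta' e_t)^2$ is not necessarily uncorrelated with $g_t g_t'$ under measurement error or misspecification. This means that the equality $E[(\beta' e_t)^2 g_t g_t'] = E(\beta' e_t)^2 E(g_t g_t')$ does not hold in general. Thus, we need to use the HAC robust Wald test. The whole-sample estimator $\hat{\beta}$ has the limiting distribution

\[
\sqrt{T} (\hat{\beta} - \delta) \xrightarrow{d} N(0, \Omega),
\]

where $\Omega = Q_{gg}^{-1} V_L Q_{gg}^{-1}$ and $V_L$ is the long-run covariance matrix for $g_t v_t = g_t e_t' \beta + g_t \varepsilon_t$.

For the first subsample, we have

\[
\sqrt{T} (\hat{\beta}_{a[\tau]} - \delta) \xrightarrow{d} \frac{1}{\tau} \Omega^{1/2} B(\tau),
\]

where $B(\tau)$ is a vector of independent Brownian motions. Similarly, for the second subsample estimator, we have

\[
\sqrt{T} (\hat{\beta}_{b[\tau]} - \delta) \xrightarrow{d} \frac{1}{1 - \tau} \Omega^{1/2} [B(1) - B(\tau)].
\]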


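Combining the two results, we have

\[
\sqrt{T} (\hat{\beta}_{a[\tau]} - \hat{\beta}_{b[\tau]}) \xrightarrow{d} \frac{1}{\tau (1 - \tau)} \Omega^{1/2} [B(\tau) - \tau B(1)].
\]

Assuming no serial correlation for simplicity, the White (1980) estimator for the covariance matrix $V_L$ is

\[
\hat{V}_L = \frac{1}{T} \sum_{t=1}^{T} g_t g_t' \hat{v}_t^2.
\]

The Wald test statistic satisfies

\[
T \tau (1 - \tau) (\hat{\beta}_{a[\tau]} - \hat{\beta}_{b[\tau]})' \hat{\Omega}^{-1} (\hat{\beta}_{a[\tau]} - \hat{\beta}_{b[\tau]}) \xrightarrow{d} \frac{\| B(\tau) - \tau B(1) \|^2}{\tau (1 - \tau)},
\]

where $\hat{\Omega} = \hat{Q}_{gg}^{-1} \hat{V}_L \hat{Q}_{gg}^{-1}$ and $\hat{Q}_{gg} = G'G / T$.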
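The robust Wald statistic in this proof can be computed as in the sketch below, which follows the no-serial-correlation case with the White (1980) covariance estimator. Taking the residuals $\hat{v}_t$ from the full-sample (no-break) fit, the trimming fraction `lam`, and the function name are assumptions made for this example; a Newey and West (1987) estimator would replace `V_L` when serial correlation is allowed.

```python
import numpy as np

def robust_wald_path(y, G, lam=0.1):
    """White-robust Wald statistic for a one-time break at each candidate date k."""
    T = G.shape[0]
    delta_hat = np.linalg.solve(G.T @ G, G.T @ y)      # full-sample estimate under the null
    v_hat = y - G @ delta_hat                          # assumed choice of residuals
    Q_gg = G.T @ G / T
    V_L = (G * (v_hat**2)[:, None]).T @ G / T          # (1/T) * sum of g_t g_t' v_t^2
    Q_inv = np.linalg.inv(Q_gg)
    Omega_inv = np.linalg.inv(Q_inv @ V_L @ Q_inv)
    ks = np.arange(int(np.ceil(lam * T)), int(np.floor((1 - lam) * T)) + 1)
    stats = np.empty(len(ks))
    for j, k in enumerate(ks):
        beta_a = np.linalg.lstsq(G[:k], y[:k], rcond=None)[0]   # first subsample
        beta_b = np.linalg.lstsq(G[k:], y[k:], rcond=None)[0]   # second subsample
        diff = beta_a - beta_b
        tau = k / T
        stats[j] = T * tau * (1 - tau) * (diff @ Omega_inv @ diff)
    return ks, stats
```

The Sup-Wald statistic is then `stats.max()`, to be compared with the critical values of Andrews (1993) for the relevant number of regressors and trimming.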
