
  • 8/6/2019 3528_Bootstrap Methods for Markov Processes

    1/44

    BOOTSTRAP METHODS FOR MARKOV PROCESSES

    by

    Joel L. Horowitz

    Department of Economics

    Northwestern University

    Evanston, IL 60208-2600

    October 2002

    ABSTRACT

    The block bootstrap is the best known method for implementing the bootstrap with time-

    series data when the analyst does not have a parametric model that reduces the data generation

    process to simple random sampling. However, the errors made by the block bootstrap converge

    to zero only slightly faster than those made by first-order asymptotic approximations. This paper

    describes a bootstrap procedure for data that are generated by a (possibly higher-order) Markov

    process or by a process that can be approximated by a Markov process with sufficient accuracy.

    The procedure is based on estimating the Markov transition density nonparametrically. Bootstrap

    samples are obtained by sampling the process implied by the estimated transition density.

    Conditions are given under which the errors made by the Markov bootstrap converge to zero

    more rapidly than those made by the block bootstrap.

    ______________________________________________________________________________

I thank Kung-Sik Chan, Wolfgang Härdle, Bruce Hansen, Oliver Linton, Daniel McFadden,

    Whitney Newey, Efstathios Paparoditis, Gene Savin, and two anonymous referees for many

    helpful comments and suggestions. Research supported in part by NSF Grant SES-9910925.


    BOOTSTRAP METHODS FOR MARKOV PROCESSES

    1. INTRODUCTION

    This paper describes a bootstrap procedure for data that are generated by a (possibly

    higher-order) Markov process. The procedure is also applicable to non-Markov processes, such

    as finite-order MA processes, that can be approximated with sufficient accuracy by Markov

    processes. Under suitable conditions, the procedure is more accurate than the block bootstrap,

    which is the leading nonparametric method for implementing the bootstrap with time-series data.

    The bootstrap is a method for estimating the distribution of an estimator or test statistic

by resampling one's data or a model estimated from the data. Under conditions that hold in a

    wide variety of econometric applications, the bootstrap provides approximations to distributions

    of statistics, coverage probabilities of confidence intervals, and rejection probabilities of tests that

    are more accurate than the approximations of first-order asymptotic distribution theory. Monte

    Carlo experiments have shown that the bootstrap can spectacularly reduce the difference between

    the true and nominal probabilities that a test rejects a correct null hypothesis (hereinafter the error

    in the rejection probability or ERP). See Horowitz (1994, 1997, 1999) for examples. Similarly,

    the bootstrap can greatly reduce the difference between the true and nominal coverage

    probabilities of a confidence interval (the error in the coverage probability or ECP).

    The methods that are available for implementing the bootstrap and the improvements in

    accuracy that it achieves relative to first-order asymptotic approximations depend on whether the

    data are a random sample from a distribution or a time series. If the data are a random sample,

    then the bootstrap can be implemented by sampling the data randomly with replacement or by

    sampling a parametric model of the distribution of the data. The distribution of a statistic is

    estimated by its empirical distribution under sampling from the data or parametric model

    (bootstrap sampling). To summarize important properties of the bootstrap when the data are a

    random sample, let n be the sample size and T be a statistic that is asymptotically distributed as

    N(0,1) (e.g., a t statistic for testing a hypothesis about a slope parameter in a linear regression

    model). Then the following results hold under regularity conditions that are satisfied by a wide

    variety of econometric models. See Hall (1992) for details.

1. The error in the bootstrap estimate of the one-sided probability P(T_n ≤ z) is O(n^{−1}),

whereas the error made by first-order asymptotic approximations is O(n^{−1/2}).

2. The error in the bootstrap estimate of the symmetrical probability P(|T_n| ≤ z) is

O(n^{−3/2}), whereas the error made by first-order approximations is O(n^{−1}).


3. When the critical value of a one-sided hypothesis test is obtained by using the

bootstrap, the ERP of the test is O(n^{−1}), whereas it is O(n^{−1/2}) when the critical value is obtained

from first-order approximations. The same result applies to the ECP of a one-sided confidence

interval. In some cases, the bootstrap can reduce the ERP of a one-sided test to O(n^{−3/2}) (Hall

1992, p. 178; Davidson and MacKinnon 1999).

4. When the critical value of a symmetrical hypothesis test is obtained by using the

bootstrap, the ERP of the test is O(n^{−2}), whereas it is O(n^{−1}) when the critical value is obtained

from first-order approximations. The same result applies to the ECP of a symmetrical confidence

interval.

    The practical consequence of these results is that the ERPs of tests and ECPs of

    confidence intervals based on the bootstrap are often substantially smaller than ERPs and ECPs

    based on first-order asymptotic approximations. These benefits are available with samples of the

    sizes encountered in applications (Horowitz 1994, 1997, 1999).
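For concreteness, the symmetrical percentile-t procedure summarized in results 2 and 4 can be sketched as follows for a random sample (a minimal Python illustration, not code from the paper; the function name and the choice of 999 bootstrap draws are the sketch's own):

```python
import numpy as np

def percentile_t_bootstrap(x, theta0, n_boot=999, seed=0):
    """Symmetrical percentile-t bootstrap test of H0: E(X) = theta0.

    Returns the t statistic and the bootstrap critical value for a
    nominal 5%-level symmetrical test.
    """
    rng = np.random.default_rng(seed)
    n = len(x)
    t_stat = np.sqrt(n) * (x.mean() - theta0) / x.std(ddof=1)
    t_boot = np.empty(n_boot)
    for b in range(n_boot):
        # Resample the data randomly with replacement, and center the
        # bootstrap t statistic at the sample mean rather than theta0.
        xb = rng.choice(x, size=n, replace=True)
        t_boot[b] = np.sqrt(n) * (xb.mean() - x.mean()) / xb.std(ddof=1)
    z_crit = np.quantile(np.abs(t_boot), 0.95)  # 1 - alpha quantile of |T*|
    return t_stat, z_crit
```

Rejecting H0 when the returned t statistic exceeds the bootstrap critical value in absolute value gives the symmetrical test whose ERP is O(n^{−2}).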

    The situation is more complicated when the data are a time series. To obtain asymptotic

    refinements, bootstrap sampling must be carried out in a way that suitably captures the

    dependence structure of the data generation process (DGP). If a parametric model is available

    that reduces the DGP to independent random sampling (e.g., an ARMA model), then the results

    summarized above continue to hold under appropriate regularity conditions. See, for example,

    Andrews (1999), Bose (1988), and Bose (1990). If a parametric model is not available, then the

    best known method for generating bootstrap samples consists of dividing the data into blocks and

    sampling the blocks randomly with replacement. This is called the block bootstrap. The blocks,

    whose lengths increase with increasing size of the estimation data set, may be non-overlapping

(Carlstein 1986, Hall 1985) or overlapping (Hall 1985, Künsch 1989). Regardless of the method

    that is used, blocking distorts the dependence structure of the data and, thereby, increases the

    error made by the bootstrap. The main results are that under regularity conditions and when the

    block length is chosen optimally:

1. The errors in the bootstrap estimates of one-sided and symmetrical probabilities are

almost surely O_p(n^{−3/4}) and O_p(n^{−6/5}), respectively (Hall et al., 1995).

2. The ECPs (ERPs) of one-sided and symmetrical confidence intervals (tests) are

O(n^{−3/4}) and O(n^{−5/4}), respectively (Zvingelis 2000).

    Thus, the errors made by the block bootstrap converge to zero at rates that are slower

    than those of the bootstrap based on data that are a random sample. Monte Carlo results have

    confirmed this disappointing performance of the block bootstrap (Hall and Horowitz 1996).
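The non-overlapping blocking scheme of Carlstein (1986) can be sketched as follows (illustrative Python, not code from the paper; the function name is the sketch's own):

```python
import numpy as np

def block_bootstrap_sample(x, block_len, rng):
    """Generate one block-bootstrap series by sampling non-overlapping
    blocks of consecutive observations with replacement."""
    n = len(x)
    n_blocks = n // block_len
    # Partition the (truncated) series into consecutive blocks.
    blocks = x[: n_blocks * block_len].reshape(n_blocks, block_len)
    # Sample blocks randomly with replacement and concatenate them.
    idx = rng.integers(0, n_blocks, size=n_blocks)
    return blocks[idx].reshape(-1)
```

Because the joins between resampled blocks break the dependence between observations in different blocks, this scheme distorts the dependence structure of the DGP, which is the source of the slow error rates cited above.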


    The relatively poor performance of the block bootstrap has led to a search for other ways

to implement the bootstrap with dependent data. Bühlmann (1997, 1998), Choi and Hall (2000),

Kreiss (1992), and Paparoditis (1996) have proposed a sieve bootstrap for linear processes (that

is, AR, vector AR, or invertible MA processes of possibly infinite order). In the sieve bootstrap,

the DGP is approximated by an AR(p) model in which p increases with increasing sample size.

Bootstrap samples are generated by the estimated AR(p) model. Choi and Hall (2000) have

shown that the ECP of a one-sided confidence interval based on the sieve bootstrap is O(n^{−1+ε})

for any ε > 0, which is only slightly larger than the ECP of O(n^{−1}) that is available when the data

are a random sample. This result is encouraging, but its practical utility is limited. If a process

has a finite-order ARMA representation, then the ARMA model can be used to reduce the DGP

to random sampling from some distribution. Standard methods can be used to implement the

bootstrap, and the sieve bootstrap is not needed. Sieve methods have not been developed for nonlinear processes such as nonlinear autoregressive, ARCH, and GARCH processes.

    The bootstrap procedure described in this paper applies to a linear or nonlinear DGP that

    is a (possibly higher-order) Markov process or can be approximated by one with sufficient

    accuracy. The procedure is based on estimating the Markov transition density nonparametrically.

    Bootstrap samples are obtained by sampling the process implied by the estimated transition

    density. This procedure will be called the Markov conditional bootstrap (MCB). Conditions are

    given under which:

1. The errors in the MCB estimates of one-sided and symmetrical probabilities are almost

surely O(n^{−1+ε}) and O(n^{−3/2+ε}), respectively, for any ε > 0.

2. The ERPs (ECPs) of one-sided and symmetrical tests (confidence intervals) based on

the MCB are O(n^{−1+ε}) and O(n^{−3/2+ε}), respectively, for any ε > 0.

    Thus, under the conditions that are given here, the errors made by the MCB converge to

zero more rapidly than those made by the block bootstrap. Moreover, for one-sided probabilities,

    symmetrical probabilities, and one-sided confidence intervals and tests, the errors made by the

    MCB converge only slightly less rapidly than those made by the bootstrap for data that are

    sampled randomly from a distribution.

    The conditions required to obtain these results are stronger than those required to obtain

    asymptotic refinements with the block bootstrap. If the required conditions are not satisfied, then

    the errors made by the MCB may converge more slowly than those made by the block bootstrap.

    Moreover, as will be explained in Section 3.2, the MCB suffers from a form of the curse of

    dimensionality of nonparametric estimation. A large data set (e.g., high-frequency financial data)


    is likely to be needed to obtain good performance if the DGP is a high-dimension vector process

    or a high-order Markov process. Thus, the MCB is not a replacement for the block bootstrap.

    The MCB is, however, an attractive alternative to the block bootstrap when the conditions needed

    for good performance of the MCB are satisfied.

    There have been several previous investigations of the MCB. Rajarshi (1990) gave

    conditions under which the MCB consistently estimates the asymptotic distribution of a statistic.

    Datta and McCormick (1995) gave conditions under which the error in the MCB estimator of the

distribution function of a normalized sample average is almost surely o(n^{−1/2}). Hansen (1999)

    proposed using an empirical likelihood estimator of the Markov transition probability but did not

    prove that the resulting version of the MCB is consistent or provides asymptotic refinements.

    Chan and Tong (1998) proposed using the MCB in a test for multimodality in the distribution of

    dependent data. Paparoditis and Politis (2001a, 2001b) proposed estimating the Markov

    transition probability by resampling the data in a suitable way. No previous authors have

    evaluated the ERP or ECP of the MCB or compared its accuracy to that of the block bootstrap.

    Thus, the results presented here go well beyond those of previous investigators.


    The MCB is described informally in Section 2 of this paper. Section 3 presents regularity

    conditions and formal results for data that are generated by a Markov process. Section 4 extends

    the MCB to generalized method of moments (GMM) estimators and approximate Markov

    processes. Section 5 presents the results of a Monte Carlo investigation of the numerical

    performance of the MCB. Section 6 presents concluding comments. The proofs of theorems are

    in the Appendix.

    2. INFORMAL DESCRIPTION OF THE METHOD

    This section describes the MCB procedure for data that are generated by a Markov

process and provides an informal summary of the main results of the paper. For any integer j, let

X_j ∈ R^d (d ≥ 1) be a continuously distributed random variable. Let {X_j : j = 1, 2, ..., n} be a

realization of a strictly stationary, qth-order Markov process. Thus,

P(X_j ≤ x_j | X_{j−1} = x_{j−1}, X_{j−2} = x_{j−2}, ...) = P(X_j ≤ x_j | X_{j−1} = x_{j−1}, ..., X_{j−q} = x_{j−q})

almost surely for d-vectors x_{j−1}, x_{j−2}, ... and some finite integer q ≥ 1. It is assumed that q is

known. Cheng and Tong (1992) show how to estimate q. In addition, for technical reasons that

are discussed further in Section 3.1, it is assumed that X_j has bounded support and that

cov(X_j, X_{j+k}) = 0 if k > M for some M < ∞. Define μ = E(X_1) and m = n^{−1} Σ_{j=1}^{n} X_j.


    2.1. Statement of the Problem

    The problem addressed in the remainder of this section and in Section 3 is to carry out

inference based on a Studentized statistic, T_n, whose form is

(2.1)  T_n = n^{1/2}[H(m) − H(μ)]/s_n,

where H is a sufficiently smooth, scalar-valued function, s_n^2 is a consistent estimator of the

variance of the asymptotic distribution of n^{1/2}[H(m) − H(μ)], and T_n →d N(0, 1) as n → ∞.

The objects of interest are (1) the probabilities P(T_n ≤ t) and P(|T_n| ≤ t) for any finite, scalar t,

(2) the probability that a test based on T_n rejects the correct hypothesis H_0: H[E(X)] = H(μ),

and (3) the coverage probabilities of confidence intervals for H(μ) that are based on T_n. To

avoid repetitive arguments, only probabilities and symmetrical hypothesis tests are treated

explicitly. An α-level symmetrical test based on T_n rejects H_0 if |T_n| > z_{nα}, where z_{nα} is the

α-level critical value. Arguments similar to those made in this section and Section 4 can be used

to obtain the results stated in the introduction for one-sided tests and for confidence intervals

based on the MCB.

    The focus on statistics of the form (2.1) with a continuously distributed X may appear to

    be restrictive, but this appearance is misleading. A wide variety of statistics that are important in

    applications can be approximated with negligible error by statistics of the form (2.1). In

particular, as will be explained in Section 4.1, t statistics for testing hypotheses about parameters

    estimated by GMM can be approximated this way.1

    2.2. The MCB Procedure

Consider the problem of estimating P(T_n ≤ z), P(|T_n| ≤ z), or z_{nα}. For any integer

j > q, define Y_j = (X_{j−1}, ..., X_{j−q}). Let p_y denote the probability density function of

Y_{q+1} = (X_q, ..., X_1). Let f denote the probability density function of X_j conditional on Y_j. If f

and p_y were known, then P(T_n ≤ z) and P(|T_n| ≤ z) could be estimated as follows:

1. Draw Y_{q+1} = (X_q, ..., X_1) from the distribution whose density is p_y. Draw X_{q+1}

from the distribution whose density is f(·|Y_{q+1}). Set Y_{q+2} = (X_{q+1}, ..., X_2).

2. Having obtained Y_j = (X_{j−1}, ..., X_{j−q}) for any j ≥ q + 2, draw X_j from the

distribution whose density is f(·|Y_j). Set Y_{j+1} = (X_j, ..., X_{j−q+1}).

3. Repeat step 2 until a simulated data series {X̃_j : j = 1, ..., n} has been obtained.

Compute μ as (say) μ = ∫ x_1 p_y(x_q, ..., x_1) dx_1 ... dx_q. Then compute a simulated test statistic T̃_n by substituting the simulated data into (2.1).

4. Estimate P(T_n ≤ z) (P(|T_n| ≤ z)) from the empirical distribution of T̃_n (|T̃_n|) that is

obtained by repeating steps 1-3 many times. Estimate z_{nα} by the 1 − α quantile of the empirical

distribution of |T̃_n|.

This procedure cannot be implemented in an application because f and p_y are unknown.

The MCB replaces f and p_y with kernel nonparametric estimators. To obtain the estimators, let

K_f be a kernel function (in the sense of nonparametric density estimation) of a d(q + 1)-

dimensional argument. Let K_p be a kernel function of a dq-dimensional argument. Let

{h_n : n = 1, 2, ...} be a sequence of positive constants (bandwidths) such that h_n → 0 as n → ∞.

Conditions that K_f, K_p, and {h_n} must satisfy are given in Section 3. For x ∈ R^d, y ∈ R^{dq},

and z = (x, y), define

p_{nz}(x, y) = {1/[(n − q) h_n^{d(q+1)}]} Σ_{j=q+1}^{n} K_f( (x − X_j)/h_n, (y − Y_j)/h_n )

and

p_{ny}(y) = {1/[(n − q) h_n^{dq}]} Σ_{j=q+1}^{n} K_p( (y − Y_j)/h_n ).

The estimators of p_y and f, respectively, are p_{ny} and

(2.2)  f_n(x|y) = p_{nz}(x, y)/p_{ny}(y).
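A minimal sketch of the estimators p_{nz}, p_{ny}, and f_n for the scalar first-order case d = q = 1, using a second-order Epanechnikov kernel with the support and symmetry properties required in Section 3 (illustrative Python; the paper specifies only the kernels' properties, not code):

```python
import numpy as np

def K(v):
    """Second-order Epanechnikov kernel: support [-1, 1], symmetric."""
    return np.where(np.abs(v) <= 1.0, 0.75 * (1.0 - v ** 2), 0.0)

def transition_density_estimate(x, h):
    """Kernel estimators p_nz, p_ny and f_n(x|y) = p_nz(x, y)/p_ny(y)
    for a first-order scalar process (d = 1, q = 1)."""
    xj, yj = x[1:], x[:-1]          # pairs (X_j, Y_j) with Y_j = X_{j-1}
    m = len(xj)

    def p_nz(u, v):   # joint density estimate at (x, y) = (u, v)
        return np.sum(K((u - xj) / h) * K((v - yj) / h)) / (m * h ** 2)

    def p_ny(v):      # marginal density estimate at y = v
        return np.sum(K((v - yj) / h)) / (m * h)

    def f_n(u, v):    # estimated transition density f_n(x|y)
        return p_nz(u, v) / p_ny(v)

    return f_n, p_ny
```

For each fixed y with p_{ny}(y) > 0, the estimated conditional density integrates to one over x, which is what makes sampling from it in the MCB well defined.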

The MCB estimates P(T_n ≤ z), P(|T_n| ≤ z), and z_{nα} by repeatedly sampling the Markov

process generated by the transition density f_n(x|y). However, f_n(x|y) is an inaccurate

estimator of f(x|y) in regions where p_y(y) is close to zero. To obtain the asymptotic

refinements described in Section 1, it is necessary to avoid such regions. Here, this is done by

truncating the MCB sample. To carry out the truncation, let C_n = {y : p_{ny}(y) ≥ δ_n}, where

δ_n > 0 for each n = 1, 2, ... and δ_n → 0 as n → ∞ at a rate that is specified in Section 3.1.

Having obtained realizations X*_1, ..., X*_{j−1} (j ≥ q + 1) from the Markov process induced by


f_n(x|y), the MCB retains a realization of X*_j only if (X*_j, ..., X*_{j−q+1}) ∈ C_n. Thus, the MCB

proceeds as follows:


MCB 1. Draw Y*_{q+1} = (X*_q, ..., X*_1) from the distribution whose density is p_{ny}. Retain

Y*_{q+1} if Y*_{q+1} ∈ C_n. Otherwise, discard the current Y*_{q+1} and draw a new one. Continue this

process until a Y*_{q+1} ∈ C_n is obtained.

MCB 2. Having obtained Y*_j = (X*_{j−1}, ..., X*_{j−q}) for any j ≥ q + 1, draw X*_j from the

distribution whose density is f_n(·|Y*_j). Retain X*_j and set Y*_{j+1} = (X*_j, ..., X*_{j−q+1}) if

(X*_j, ..., X*_{j−q+1}) ∈ C_n. Otherwise, discard the current X*_j and draw a new one. Continue this

process until an X*_j is obtained for which (X*_j, ..., X*_{j−q+1}) ∈ C_n.

MCB 3. Repeat step 2 until a bootstrap data series {X*_j : j = 1, ..., n} has been obtained.

Compute the bootstrap test statistic T*_n = n^{1/2}[H(m*) − H(μ*)]/s*_n, where m* = n^{−1} Σ_{j=1}^{n} X*_j, μ* is

the mean of X* relative to the distribution induced by the sampling procedure of steps MCB 1

and MCB 2 (bootstrap sampling), and s*_n^2 is an estimator of the variance of the asymptotic

distribution of n^{1/2}[H(m*) − H(μ*)] under bootstrap sampling.

MCB 4. Estimate P(T_n ≤ z) (P(|T_n| ≤ z)) from the empirical distribution of T*_n (|T*_n|)

that is obtained by repeating steps 1-3 many times. Estimate z_{nα} by the 1 − α quantile of the

empirical distribution of |T*_n|. Denote this estimator by z*_{nα}.

A symmetrical test of H_0 based on T_n and the bootstrap critical value z*_{nα} rejects at the

nominal α level if |T_n| > z*_{nα}.
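Steps MCB 1-MCB 2, including the truncation to C_n, can be sketched for d = q = 1 as follows (illustrative Python, not code from the paper; a Gaussian kernel is used here purely because it makes exact sampling from the kernel estimate easy, whereas Assumption 5 of Section 3 uses compact-support kernels):

```python
import numpy as np

def mcb_sample(x, h, delta, n_out, rng):
    """One Markov-conditional-bootstrap series (d = 1, q = 1), sampling the
    process implied by the kernel transition-density estimate and retaining
    a draw only if its conditioning point stays in C_n = {y: p_ny(y) >= delta}."""
    xj, yj = x[1:], x[:-1]

    def p_ny(v):  # Gaussian-kernel estimate of the density of Y_j
        return (np.mean(np.exp(-0.5 * ((v - yj) / h) ** 2))
                / (h * np.sqrt(2.0 * np.pi)))

    # MCB 1: draw the initial point from p_ny, retaining it only if it
    # lies in C_n.
    while True:
        y = rng.choice(yj) + h * rng.normal()
        if p_ny(y) >= delta:
            break

    out = [y]
    while len(out) < n_out:
        # MCB 2: draw X* from f_n(.|y).  With product kernels this is
        # exact: pick a data pair j with weight proportional to
        # K((y - Y_j)/h), then add kernel noise to X_j.
        w = np.exp(-0.5 * ((y - yj) / h) ** 2)
        j = rng.choice(len(xj), p=w / w.sum())
        x_new = xj[j] + h * rng.normal()
        if p_ny(x_new) >= delta:   # retain only if the new state is in C_n
            out.append(x_new)
            y = x_new
    return np.asarray(out)
```

Repeating this sampler many times, computing T*_n on each series, and taking the 1 − α quantile of |T*_n| implements steps MCB 3-MCB 4.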

    2.3. Properties of the MCB

    This section presents an informal summary of the main results of the paper and of the

arguments that lead to them. The results are stated formally in Section 3. Let ‖·‖ denote the

Euclidean norm. Let P* denote the probability measure induced by the MCB sampling procedure

(steps MCB 1 and MCB 2) conditional on the data {X_j : j = 1, ..., n}. Let any ε > 0 be given.

The main results are that under regularity conditions stated in Section 3.1:

(2.3)  sup_z |P*(T*_n ≤ z) − P(T_n ≤ z)| = O(n^{−1+ε})


    almost surely,

(2.4)  sup_z |P*(|T*_n| ≤ z) − P(|T_n| ≤ z)| = O(n^{−3/2+ε})

almost surely, and

(2.5)  P(|T_n| > z*_{nα}) = α + O(n^{−3/2+ε}).

These results may be contrasted with the analogous ones for the block bootstrap. The

block bootstrap with optimal block lengths yields O_p(n^{−3/4}), O_p(n^{−6/5}), and O(n^{−5/4}) for the

right-hand sides of (2.3)-(2.5), respectively (Hall et al. 1995, Zvingelis 2000). Therefore, the

MCB is more accurate than the block bootstrap under the regularity conditions of Section 3.1.

These results are obtained by carrying out Edgeworth expansions of P(T_n ≤ z) and

P*(T*_n ≤ z). Additional notation is needed to describe the expansions. Let χ_n = {X_j : j = 1, ..., n}

denote the data. Let Φ and φ, respectively, denote the standard normal distribution function and

density. The jth cumulant of T_n (j ≤ 4) has the form n^{−1/2} κ_j + o(n^{−1/2}) if j is odd and

I(j = 2) + n^{−1} κ_j + o(n^{−1}) if j is even, where κ_j is a constant (Hall 1992, p. 46). Define

κ = (κ_1, ..., κ_4). Conditional on χ_n, the jth cumulant of T*_n almost surely has the form

n^{−1/2} κ_{nj} + o(n^{−1/2}) if j is odd and I(j = 2) + n^{−1} κ_{nj} + o(n^{−1}) if j is even. The quantities κ_{nj}

depend on χ_n. They are nonstochastic relative to bootstrap sampling but are random variables

relative to the stochastic process that generates χ_n. Define κ_n = (κ_{n1}, ..., κ_{n4}).

Under the regularity conditions of Section 3.1, P(T_n ≤ z) has the Edgeworth expansion

(2.6)  P(T_n ≤ z) = Φ(z) + Σ_{j=1}^{2} n^{−j/2} q_j(z, κ) φ(z) + O(n^{−3/2})

uniformly over z, where q_j(z, κ) is a polynomial function of z for each κ, a continuously

differentiable function of the components of κ for each z, an even function of z if j = 1, and an

odd function of z if j = 2. Moreover, P(|T_n| ≤ z) has the expansion

(2.7)  P(|T_n| ≤ z) = 2Φ(z) − 1 + 2n^{−1} q_2(z, κ) φ(z) + O(n^{−3/2})

uniformly over z. Conditional on χ_n, the bootstrap probabilities P*(T*_n ≤ z) and P*(|T*_n| ≤ z) have

the expansions

(2.8)  P*(T*_n ≤ z) = Φ(z) + Σ_{j=1}^{2} n^{−j/2} q_j(z, κ_n) φ(z) + O(n^{−3/2})


    and

(2.9)  P*(|T*_n| ≤ z) = 2Φ(z) − 1 + 2n^{−1} q_2(z, κ_n) φ(z) + O(n^{−3/2})

uniformly over z almost surely. Therefore,

(2.10)  |P*(T*_n ≤ z) − P(T_n ≤ z)| = O(n^{−1/2} ‖κ_n − κ‖) + O(n^{−1})

and

(2.11)  |P*(|T*_n| ≤ z) − P(|T_n| ≤ z)| = O(n^{−1} ‖κ_n − κ‖) + O(n^{−3/2})

almost surely uniformly over z. Under the regularity conditions of Section 3.1,

(2.12)  ‖κ_n − κ‖ = O(n^{−1/2+ε})

almost surely for any ε > 0. Results (2.3)-(2.4) follow by substituting (2.12) into (2.10)-(2.11).

To obtain (2.5), observe that P*(|T*_n| ≤ z*_{nα}) = P(|T_n| ≤ z_{nα}) = 1 − α. It follows from (2.7)

and (2.9) that

(2.13)  2Φ(z_{nα}) − 1 + 2n^{−1} q_2(z_{nα}, κ) φ(z_{nα}) = 1 − α + O(n^{−3/2})

and

(2.14)  2Φ(z*_{nα}) − 1 + 2n^{−1} q_2(z*_{nα}, κ_n) φ(z*_{nα}) = 1 − α + O(n^{−3/2})

almost surely. Let v_α denote the 1 − α/2 quantile of the N(0, 1) distribution. Then Cornish-

Fisher inversions of (2.13) and (2.14) (e.g., Hall 1992, pp. 88-89) give

(2.15)  z_{nα} = v_α − n^{−1} q_2(v_α, κ) + O(n^{−3/2})

and

(2.16)  z*_{nα} = v_α − n^{−1} q_2(v_α, κ_n) + O(n^{−3/2})

almost surely. Therefore,

(2.17)  P(|T_n| > z*_{nα}) = P{|T_n| > z_{nα} + n^{−1}[q_2(v_α, κ) − q_2(v_α, κ_n)] + O(n^{−3/2})}

(2.18)                    = P[|T_n| > z_{nα} + O(n^{−1} ‖κ_n − κ‖) + O(n^{−3/2})].

Result (2.5) follows by applying (2.12) to the right-hand side of (2.18).

    3. MAIN RESULTS

    This section presents theorems that formalize results (2.3)-(2.5).


3.1 Assumptions

    Results (2.3)-(2.5) are established under assumptions that are stated in this section. The

proof of the validity of the Edgeworth expansions (2.6)-(2.9) relies on a theorem of Götze and

    Hipp (1983) and requires certain restrictive assumptions. See Assumption 4 below. It is likely

    that the expansions are valid under weaker assumptions, but proving this conjecture is beyond the

    scope of this paper. The results of this section hold under weaker assumptions if the Edgeworth

    expansions remain valid.

The following additional notation is used. Let p_z denote the probability density function

of Z_{q+1} = (X_{q+1}, Y_{q+1}). Let E* denote the expectation with respect to the distribution induced by

bootstrap sampling (steps MCB 1 and MCB 2 of Section 2.2). Define Ω_n = nE*[(m* − μ*)(m* − μ*)′]

and s̃_n^2 = ∂H(m*)′ Ω_n ∂H(m*). For reasons that are explained later in this section, it is assumed that

E[(X_1 − μ)(X_{1+j} − μ)′] = 0 if j ≥ M for some integer M < ∞. Define

Σ_n = (n − M)^{−1} Σ_{i=1}^{n−M} { (X_i − m)(X_i − m)′ + Σ_{j=1}^{M} [(X_i − m)(X_{i+j} − m)′ + (X_{i+j} − m)(X_i − m)′] },

Σ*_n = (n − M)^{−1} Σ_{i=1}^{n−M} { (X*_i − m*)(X*_i − m*)′ + Σ_{j=1}^{M} [(X*_i − m*)(X*_{i+j} − m*)′ + (X*_{i+j} − m*)(X*_i − m*)′] },

Σ̃_n = E*{ (X*_1 − μ*)(X*_1 − μ*)′ + Σ_{j=1}^{M} [(X*_1 − μ*)(X*_{1+j} − μ*)′ + (X*_{1+j} − μ*)(X*_1 − μ*)′] },

and τ_n^2 = ∂H(m*)′ Σ̃_n ∂H(m*)/s̃_n^2. Note that τ_n^2 can be evaluated in an application because the

bootstrap DGP is known. For any δ > 0, define C_δ = {x ∈ R^d : p_y(x, x_1, ..., x_{q−1}) ≥ δ

for some x_1, ..., x_{q−1}}. Let B(C_δ) denote the measurable subsets of C_δ. Let P^{(k)}(x, A) denote the

k-step transition probability from a point x ∈ C_δ to a set A ∈ B(C_δ).

Assumption 1: {X_j : j = 1, 2, ..., n; X_j ∈ R^d} is a realization of a strictly stationary, qth-

order Markov process that is geometrically strongly mixing (GSM).2

Assumption 2: (i) The distribution of Z_{q+1} is absolutely continuous with respect to

Lebesgue measure. (ii) For t ∈ R^d and each k such that 0 ≤ k ≤ q,

(3.1)  lim sup_{‖t‖→∞} |E exp[it′(X_j + ⋯ + X_{j+k})]| < 1.


Assumption 3: (i) H is three times continuously differentiable in a neighborhood of μ.

(ii) The gradient of H is non-zero in a neighborhood of μ.

Assumption 4: (i) X_j has bounded support. (ii) For all sufficiently small δ > 0, some

ρ > 0, and some integer k > 0,

sup { |P^{(k)}(x, A) − P^{(k)}(x′, A)| : x, x′ ∈ C_δ, A ∈ B(C_δ) } ≤ 1 − ρ.

(iii) s_n^2 = ∂H(m)′ Σ_n ∂H(m). (iv) s*_n^2 = τ_n^2 ∂H(m*)′ Σ*_n ∂H(m*). (v) For δ_n as

in Assumption 6, P[p_{ny}(Y_{q+1}) < δ_n | Y_q = y] = o[(log n)^{−1}] as n → ∞ uniformly over y such that

p_y(y) ≥ δ_n.

Let K be a bounded, continuous function whose support is [−1, 1] and that is symmetrical about 0. For some integer r ≥ 2 and each integer j = 0, 1, ..., r, let K satisfy

∫_{−1}^{1} v^j K(v) dv = 1 if j = 0;  0 if 1 ≤ j ≤ r − 1;  B_K (nonzero) if j = r.

For any integer J > 0, let W^{(j)} (j = 1, ..., J) denote the jth component of the vector W ∈ R^J.

Assumption 5: Let v_f ∈ R^{d(q+1)} and v_p ∈ R^{dq}. K_f and K_p have the forms

K_f(v_f) = Π_{j=1}^{d(q+1)} K(v_f^{(j)});   K_p(v_p) = Π_{j=1}^{dq} K(v_p^{(j)}).

Assumption 6: (i) Let γ = 1/[2r + d(q + 1)]. Then h_n = c_h n^{−γ} for some finite constant

c_h > 0. (ii) δ_n → 0 slowly enough that n h_n^{dq} δ_n^2 (log n)^{−2} → ∞.

Assumptions 2(i), 2(ii), and 4 are used to insure the validity of the Edgeworth expansions

(2.6)-(2.9). Condition (3.1) is a dependent-data version of the Cramér condition. See Götze and

Hipp (1983). Assumptions 2(iii) and 2(iv) insure sufficiently rapid convergence of ‖κ_n − κ‖.

Assumptions 4(i)-4(ii) are used to show that the bootstrap DGP is GSM. The GSM property is used to prove the validity of the Edgeworth expansions (2.8)-(2.9). The results of this

paper hold when X_j has unbounded support if the expansions (2.8)-(2.9) are valid and p_y(y)

decreases at an exponentially fast rate as ‖y‖ → ∞.3 Assumption 4(iii) is used to insure the

validity of the Edgeworth expansions (2.6)-(2.9). It is needed because the known conditions for

    the validity of these expansions apply to statistics that are functions of sample moments. Under


4(iii)-4(iv), s_n and T_n are functions of sample moments of X_j. This is not the case if T_n is

Studentized with a kernel-type variance estimator (e.g., Andrews 1991; Andrews and Monahan

1992; Newey and West 1987, 1994). However, under Assumption 1, the smoothing parameter of

a kernel variance estimator can be chosen so that the estimator is n^{1/2−ε}-consistent for any ε > 0.

Therefore, if the required Edgeworth expansions are valid, the results established here hold when

T_n is Studentized with a kernel variance estimator.4
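For reference, the best-known kernel-type variance estimator of the kind mentioned here is the Bartlett-weighted estimator of Newey and West (1987). A minimal sketch for a scalar series (illustrative Python, not code from the paper; the truncation lag M is a user choice):

```python
import numpy as np

def newey_west_lrv(u, M):
    """Bartlett-kernel (Newey-West 1987) long-run variance estimator for a
    scalar series u, truncated at lag M."""
    u = u - u.mean()
    n = len(u)
    lrv = np.dot(u, u) / n                # lag-0 autocovariance
    for j in range(1, M + 1):
        w = 1.0 - j / (M + 1.0)           # Bartlett weights
        gamma_j = np.dot(u[j:], u[:-j]) / n
        lrv += 2.0 * w * gamma_j
    return lrv
```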


The quantity τ_n^2 in s*_n^2 is a correction factor analogous to that used by Hall and Horowitz

(1996) and Andrews (1999, 2002). It is needed because ∂H(m*)′ Σ*_n ∂H(m*) is a biased estimator

of the variance of the asymptotic distribution of n^{1/2}[H(m*) − H(μ*)]. The bias converges to zero

too slowly to enable the MCB to achieve the asymptotic refinements described in Section 2.3.

The correction factor removes the bias without distorting higher-order terms of the Edgeworth

expansion of the CDF of T*_n.

Assumption 4(v) restricts the tail thickness of the transition density function.5

Assumption 5 specifies convenient forms for K_f and K_p, which may be higher-order kernels.

Higher-order kernels are used to insure sufficiently rapid convergence of ‖κ_n − κ‖.

    3.2 Theorems

    This section gives theorems that establish conditions under which results (2.3)-(2.5) hold.

The bootstrap is implemented by carrying out steps MCB 1-MCB 4. Let β = r/[2r + d(q + 1)].

Theorem 3.1: Let Assumptions 1-6 hold. Then for every ε > 0,

(3.2)  sup_z |P*(T*_n ≤ z) − P(T_n ≤ z)| = O(n^{−1/2−β+ε})

and

(3.3)  sup_z |P*(|T*_n| ≤ z) − P(|T_n| ≤ z)| = O(n^{−1−β+ε})

almost surely.

Theorem 3.2: Let Assumptions 1-6 hold. For any α ∈ (0, 1), let z*_{nα} satisfy

P*(|T*_n| > z*_{nα}) = α. Then for every ε > 0,

(3.4)  P(|T_n| > z*_{nα}) = α + O(n^{−1−β+ε}).

Theorems 3.1 and 3.2 imply that results (2.3)-(2.5) hold if r > d(q + 1)/(4ε). With

the block bootstrap, the right-hand sides of (3.2)-(3.4) are O_p(n^{−3/4}), O_p(n^{−6/5}), and O(n^{−5/4}),


respectively. The errors made by the MCB converge to zero more rapidly than do those of the block bootstrap if s is sufficiently large. With the MCB, the right-hand side of (3.2) is o(n^(−3/4)) if s > d(q + 1)/2, the right-hand side of (3.3) is o(n^(−6/5)) if s > d(q + 1)/3, and the right-hand side of (3.4) is o(n^(−5/4)) if s > d(q + 1)/2. However, the errors of the MCB converge more slowly than do those of the block bootstrap if the distribution of Z is not sufficiently smooth (s is too small). Moreover, the MCB suffers from a form of the curse of dimensionality in nonparametric estimation. That is, with a fixed value of s, the accuracy of the MCB decreases as d and q increase. Thus, the MCB, like all nonparametric estimators, is likely to be most attractive in applications where d and q are not large. It is possible that this problem can be mitigated, though at the cost of imposing additional structure on the DGP, through the use of dimension reduction methods. For example, many familiar time series DGPs can be represented as single-index models or nonparametric additive models with a possibly unknown link function. However, investigation of dimension reduction methods for the MCB is beyond the scope of this paper.
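To make the rate comparison concrete, here is a small numerical sketch (ours, not the paper's) of the exponent ζ = s/[2s + d(q + 1)] and of the condition s > d(q + 1)/2 under which the one-sided MCB error o(n^(−1/2−ζ)) improves on the block bootstrap's O(n^(−3/4)); the function names are illustrative only.

```python
def zeta(s, d, q):
    """Rate-exponent gain of the MCB for smoothness s, dimension d, Markov order q."""
    return s / (2.0 * s + d * (q + 1))

def mcb_beats_block_one_sided(s, d, q):
    """True when 1/2 + zeta > 3/4, which is equivalent to s > d*(q + 1)/2."""
    return 0.5 + zeta(s, d, q) > 0.75

# Smooth scalar first-order Markov process: s = 4, d = 1, q = 1.
print(zeta(4, 1, 1))                        # -> 0.4
print(mcb_beats_block_one_sided(4, 1, 1))   # -> True
# Curse of dimensionality: same smoothness but d = 3, q = 2.
print(mcb_beats_block_one_sided(4, 3, 2))   # -> False
```

Note that ζ → 1/2 as s → ∞, which is the source of the faster AMCB rates in Section 4.2.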


    4. EXTENSIONS

    Section 4.1 extends the results of Section 3 to tests based on GMM estimators. Section

    4.2 presents the extension to approximate Markov processes.

    4.1 Tests Based on GMM Estimators

This section gives conditions under which (3.2)-(3.4) hold for the t statistic for testing a hypothesis about a parameter that is estimated by GMM. The main task is to show that the

probability distribution of the GMM t statistic can be approximated with sufficient accuracy by

    the distribution of a statistic of the form (2.1). Hall and Horowitz (1996) and Andrews (1999,

    2002) use similar approximations to show that the block bootstrap provides asymptotic

refinements for t tests based on GMM estimators.

Denote the sample by {X_j: j = 1, ..., n}. In this section, some components of X_j may be discrete, but there must be at least one continuous component. See Assumption 9 below. Suppose that the GMM moment conditions depend on up to L lags of X_j. Define X̃_j = (X_j, ..., X_{j+L}) for some fixed integer L ≥ 0 and j = 1, ..., n − L. Let X̃ denote a random vector that is distributed as X̃_1. Estimation of the parameter θ is based on the moment condition E G(X̃, θ) = 0, where G is a known L_G × 1 function and L_G ≥ L_θ, the dimension of θ. Let θ_0 denote the


true but unknown value of θ. Assume that E[G(X̃_i, θ_0)G(X̃_j, θ_0)′] = 0 if |i − j| > M_G for some M_G < ∞.6 As in Hall and Horowitz (1996) and Andrews (1999, 2002), two forms of the GMM

estimator are considered. One uses a fixed weight matrix, and the other uses an estimator of the asymptotically optimal weight matrix. The results stated here can be extended to other forms such as the continuous updating estimator. In the first form, the estimator of θ, θ_n, solves

(4.1) min_{θ∈Θ} J_n(θ) = [n^(−1/2) Σ_{i=1}^{n−L} G(X̃_i, θ)]′ Ω [n^(−1/2) Σ_{i=1}^{n−L} G(X̃_i, θ)],

where Θ is the parameter set and Ω is an L_G × L_G, positive-semidefinite, symmetrical matrix of constants. In the second form, θ_n solves

(4.2) min_{θ∈Θ} J_n(θ) = [n^(−1/2) Σ_{i=1}^{n−L} G(X̃_i, θ)]′ Ω_n(θ̃_n)^(−1) [n^(−1/2) Σ_{i=1}^{n−L} G(X̃_i, θ)],

where θ̃_n solves (4.1),

Ω_n(θ) = n^(−1) Σ_{i=1}^{n−L} [G(X̃_i, θ)G(X̃_i, θ)′ + Σ_{j=1}^{M_G} H(X̃_i, X̃_{i+j}, θ)],

and H(X̃_i, X̃_{i+j}, θ) = G(X̃_i, θ)G(X̃_{i+j}, θ)′ + G(X̃_{i+j}, θ)G(X̃_i, θ)′.7 Let Ω_0 denote the asymptotic covariance matrix of n^(−1/2) Σ_{i=1}^{n−L} G(X̃_i, θ_0).

To obtain the t statistic, let D = E[∂G(X̃, θ_0)/∂θ′] and define the matrix

D_n = n^(−1) Σ_{i=1}^{n−L} ∂G(X̃_i, θ_n)/∂θ′.

Define

(4.3) Σ = (D′ΩD)^(−1) D′Ω Ω_0 Ω D (D′ΩD)^(−1)

if θ_n solves (4.1) and

(4.4) Σ = (D′Ω_0^(−1)D)^(−1)

if θ_n solves (4.2). Let Σ_n be the consistent estimator of Σ that is obtained by replacing D and Ω_0 in (4.3) and (4.4) by D_n and Ω_n(θ_n). In addition, let (Σ_n)_rr be the (r, r) component of Σ_n, and let θ_{0r} and θ_{nr} be the rth components of θ_0 and θ_n, respectively. The t statistic for testing H_0: θ_r = θ_{0r} is t_{nr} = n^(1/2)(θ_{nr} − θ_{0r})/(Σ_n)_rr^(1/2).
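As an illustration of this construction, the following minimal sketch computes a GMM t statistic in the simplest exactly identified case, with the hypothetical moment function G(X, θ) = X − θ and a truncated long-run variance estimator that mimics Ω_n(θ). This is a toy under those assumptions, not the paper's implementation.

```python
import numpy as np

def gmm_t_mean(x, theta0, m_g=2):
    """t statistic for H0: E[X] = theta0 with moment G(X, theta) = X - theta.
    The variance estimate mirrors Omega_n(theta): the lag-0 term plus the
    symmetrized lag terms H(X_i, X_{i+j}, theta) through lag M_G."""
    n = len(x)
    theta_n = x.mean()            # GMM estimator in the exactly identified case
    g = x - theta_n               # moment function evaluated at theta_n
    omega = np.mean(g * g)        # lag-0 term
    for j in range(1, m_g + 1):
        omega += 2.0 * np.mean(g[:-j] * g[j:])   # H(X_i, X_{i+j}) terms
    return np.sqrt(n) * (theta_n - theta0) / np.sqrt(omega)

rng = np.random.default_rng(0)
x = rng.normal(size=500)
print(gmm_t_mean(x, 0.0))
```

With weakly dependent data this truncated estimator plays the role that Ω_n(θ_n) plays in Σ_n above.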


To obtain the MCB version of t_{nr}, let {X_j*: j = 1, ..., n} be a bootstrap sample that is obtained by carrying out steps MCB 1 and MCB 2 but with the modified transition density estimator that is described in equations (4.5)-(4.7) below. The modified estimator allows some components of X_j to be discrete. Define X̃_j* = (X_j*, ..., X*_{j+L}). Let X̃* denote a random vector that is distributed as X̃_1*. Let E* denote the expectation with respect to the distribution induced by bootstrap sampling. Define G*(·, θ) = G(·, θ) − E*G(X̃*, θ_n). The bootstrap version of the moment condition E G(X̃, θ) = 0 is E*G*(X̃*, θ) = 0. As in Hall and Horowitz (1996) and Andrews (1999, 2002), the bootstrap version is recentered relative to the population version because, except in special cases, there is no θ such that E*G(X̃*, θ) = 0 when L_G > L_θ. Brown, et al. (2000) and Hansen (1999) discuss an empirical likelihood approach to recentering. Recentering is unnecessary if L_G = L_θ, but it simplifies the technical analysis and, therefore, is done here.8
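The recentering step can be sketched as follows: G*(·, θ) subtracts an estimate of E*G(X̃*, θ_n), here approximated by an average over a long simulated realization from the fitted process. The function name and the toy overidentified moment condition are hypothetical illustrations, not the paper's code.

```python
import numpy as np

def recentred_moment(g, theta_n, sim_draws):
    """Return G*(x, th) = G(x, th) - E*[G(X*, theta_n)], with E* approximated
    by averaging over a long simulated realization `sim_draws` from the
    fitted (bootstrap) process."""
    centre = np.mean([g(x, theta_n) for x in sim_draws], axis=0)
    return lambda x, th: g(x, th) - centre

# Toy moment function with L_G = 2 > L_theta = 1, so recentring matters:
g = lambda x, th: np.array([x - th, x**2 - th**2 - 1.0])
sim = np.random.default_rng(1).normal(size=10_000)
g_star = recentred_moment(g, 0.1, sim)
# By construction the recentred moments average to ~0 over the simulated draws:
print(np.mean([g_star(x, 0.1) for x in sim], axis=0))
```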

To form the bootstrap version of t_{nr}, let θ_n* denote the bootstrap estimator of θ. Let D_n* be the quantity that is obtained by replacing X̃_i with X̃_i* and θ_n with θ_n* in the expression for D_n. Define D̃_n = E*[∂G(X̃*, θ_n)/∂θ′],

Ω_n*(θ) = n^(−1) Σ_{i=1}^{n−L} [G*(X̃_i*, θ)G*(X̃_i*, θ)′ + Σ_{j=1}^{M_G} H*(X̃_i*, X̃*_{i+j}, θ)],

where H* is obtained from H by replacing G with G*,

Ω̃_n = E*[G*(X̃_1*, θ_n)G*(X̃_1*, θ_n)′ + Σ_{j=1}^{M_G} H*(X̃_1*, X̃*_{1+j}, θ_n)],

and

Ω̄_n = E*[G*(X̃_1*, θ_n)G*(X̃_1*, θ_n)′ + Σ_{j≥1} H*(X̃_1*, X̃*_{1+j}, θ_n)].

Define Σ_n* by replacing G with G*, X̃_i with X̃_i*, θ_n with θ_n*, D_n with D_n*, and Ω_n(θ_n) with Ω_n*(θ_n*) in the formula for Σ_n. Let Ω_n† = Ω if θ_n solves (4.1) and Ω_n† = Ω_n(θ_n)^(−1) if θ_n solves (4.2). Define

Σ̃_n = (D̃_n′Ω_n†D̃_n)^(−1) D̃_n′Ω_n† Ω̃_n Ω_n†D̃_n (D̃_n′Ω_n†D̃_n)^(−1),

Σ̄_n = (D̃_n′Ω_n†D̃_n)^(−1) D̃_n′Ω_n† Ω̄_n Ω_n†D̃_n (D̃_n′Ω_n†D̃_n)^(−1),

and τ̃²_{nr} = (Σ̃_n)_rr/(Σ̄_n)_rr, where (Σ̃_n)_rr and (Σ̄_n)_rr are the (r, r) components of Σ̃_n and Σ̄_n, respectively. Let θ*_{nr} denote the rth component of θ_n*. Then the MCB version of the t statistic is t*_{nr} = n^(1/2)τ̃_{nr}(θ*_{nr} − θ_{nr})/(Σ_n*)_rr^(1/2). The quantity τ̃_{nr} is a correction factor analogous to that used by Hall and Horowitz (1996) and Andrews (1999, 2002).

Now let V(X̃_j, θ) (j = 1, ..., n − L) be the vector containing the unique components of G(X̃_j, θ), G(X̃_j, θ)G(X̃_{j+i}, θ)′ (0 ≤ i ≤ M_G), and the derivatives through order 6 of G(X̃_j, θ) and G(X̃_j, θ)G(X̃_{j+i}, θ)′. Let S_X denote the support of (X_1, ..., X_{L+1}). Define p_y, p_z, and f as in Sections 2-3 but with counting measure as the dominating measure for discrete components of X_j. The following new assumptions are used to derive the results of this section. Assumptions 7-9 are similar to ones made by Hall and Horowitz (1996) and Andrews (1999, 2002).

Assumption 7: θ_0 is an interior point of the compact parameter set Θ and is the unique solution in Θ to the equation E G(X̃, θ) = 0.

Assumption 8: (i) There are finite constants C_G and C_V such that ‖G(X̃_1, θ)‖ ≤ C_G and ‖V(X̃_1, θ)‖ ≤ C_V for all X̃_1 ∈ S_X and θ ∈ Θ. (ii) E[G(X̃_1, θ_0)G(X̃_{1+j}, θ_0)′] = 0 if j > M_G for some M_G < ∞. (iii) Ω(θ) ≡ E{G(X̃_1, θ)G(X̃_1, θ)′ + Σ_{j=1}^{M_G} [G(X̃_1, θ)G(X̃_{1+j}, θ)′ + G(X̃_{1+j}, θ)G(X̃_1, θ)′]} exists for all θ ∈ Θ. Its smallest eigenvalue is bounded away from 0 uniformly over θ in an open sphere, N_0, centered on θ_0. (iv) There is a bounded function C_G(·) such that ‖G(X̃_1, θ_1) − G(X̃_1, θ_2)‖ ≤ C_G(X̃_1)‖θ_1 − θ_2‖ for all X̃_1 ∈ S_X and θ_1, θ_2 ∈ Θ. (v) G is 6-times continuously differentiable with respect to the components of θ everywhere in N_0. (vi) There is a bounded function C_V(·) such that ‖V(X̃_1, θ_1) − V(X̃_1, θ_2)‖ ≤ C_V(X̃_1)‖θ_1 − θ_2‖ for all X̃_1 ∈ S_X and θ_1, θ_2 ∈ Θ.

Assumption 9: (i) X_j (j = 1, ..., n) can be partitioned X_j = (X_j^(c), X_j^(d)), where X_j^(c) ∈ R^(d^(c)) for some d^(c) ≥ 1, the distribution of X_j^(c) is absolutely continuous with respect to Lebesgue measure, and the distribution of X_j^(d) is discrete with finitely many mass points. There need not be any discrete components of X_j, but there must be at least one continuous component. (ii) The functions p_y, p_z, and f are bounded. (iii) For some integer s > 2, p_y and p_z are everywhere at least s times continuously differentiable with respect to any mixture of their continuous arguments.

Assumption 10: Assumptions 2 and 4 hold with V(X̃_j, θ_0) in place of X_j.

As in Sections 2-3, {X_j*: j = 1, ..., n} in the MCB for GMM is a realization of the stochastic process induced by a nonparametric estimator of the Markov transition density. If X_j has no discrete components, then the density estimator is (2.2) and MCB samples are generated by carrying out steps MCB 1 and MCB 2 of Section 2.2. A modified transition density estimator is needed if X_j has one or more discrete components. Let (Y_j^(c), Y_j^(d)) be the partition of Y_j into continuous and discrete components. The modified density estimator is

(4.5) f_n(x|y) = g_n(x, y)/p_n(y),

where

(4.6) g_n(x, y) = [n_q h_n^(d^(c)(q+1))]^(−1) Σ_j K_f[(x^(c) − X_j^(c))/h_n, (y^(c) − Y_j^(c))/h_n] I(x^(d) = X_j^(d)) I(y^(d) = Y_j^(d)),

d^(c) = dim(X_j^(c)), and

(4.7) p_n(y) = [n_q h_n^(d^(c)q)]^(−1) Σ_j K_p[(y^(c) − Y_j^(c))/h_n] I(y^(d) = Y_j^(d)).

The result of this section is given by the following theorem.

Theorem 4.1: Let Assumptions 1, 4(i), 4(v), and 5-10 hold. For any α ∈ (0,1) let z*_nα satisfy P*(|t*_{nr}| > z*_nα) = α. Then (3.2)-(3.4) hold with t_{nr} and t*_{nr} in place of T_n and T_n*.

    4.2 Approximate Markov Processes

This section extends the results of Section 3.2 to approximate Markov processes. As in Sections 2-3, the objective is to carry out inference based on the statistic T_n defined in (2.1). For an arbitrary random vector V, let p_j(x|v) denote the conditional probability density of X_j at x conditional on V = v. An approximate Markov process is defined to be a stochastic process that satisfies the following assumption.

Assumption AMP: (i) {X_j: j = 0, ±1, ±2, ...; X_j ∈ R^d} is strictly stationary and GSM. (ii) For some finite b > 0 and c > 0, some integer q_0 > 0, all finite q > q_0, and all j,

sup_{x_j, x_{j−1}, ...} |p_j(x_j | x_{j−1}, x_{j−2}, ...) − p_j(x_j | x_{j−1}, ..., x_{j−q})| ≤ ce^(−bq).

The AMCB treats the DGP as a Markov process of order q_n, where {q_n} is a sequence that satisfies d(q_n + 1)/n → 0 as n → ∞.

Assumption 2: For some finite integer n_0 and each n ≥ n_0: (i) The distribution of Z_{q_n+1} is absolutely continuous with respect to Lebesgue measure. (ii) For t ∈ R^d and each k such that 0 < k ≤ q_n,

(3.1) lim sup_{|t|→∞} |E exp[it′(X_j − X_{j+k})]| < 1.

  • 8/6/2019 3528_Bootstrap Methods for Markov Processes

    20/44

For each positive integer n, let K_n be a bounded, continuous function whose support is [−1,1], that is symmetrical about 0, and that satisfies

∫_{−1}^{1} v^j K_n(v) dv = 1 if j = 0; = 0 if 1 ≤ j < s_n; = B_{jn} (nonzero) if j = s_n,

where s_n → ∞ as n → ∞, and let the bandwidth satisfy (log n)²/(n h_n^(d(q_n+1))) → 0.

The main difference between these assumptions and Assumptions 2, 5, and 6 of the MCB is the strengthened smoothness Assumption 2(iv). The MCB for an order q Markov process provides asymptotic refinements whenever p_y and p_z have derivatives of fixed order s > d(q + 1), whereas the AMCB requires the existence of derivatives of all orders as n → ∞.

The ability of the AMCB to provide asymptotic refinements is established by the following theorem.

Theorem 4.2: Let Assumptions AMCB, 2, 3, 4, 5, and 6 hold with q_n ≥ c(log n)² for some finite c > 0. For any α ∈ (0,1) let z*_nα satisfy P*(|T_n*| > z*_nα) = α. Then for every ε > 0

sup_z |P*(T_n* ≤ z) − P(T_n ≤ z)| = O(n^(−1+ε)),

sup_z |P*(|T_n*| ≤ z) − P(|T_n| ≤ z)| = O(n^(−3/2+ε)),

almost surely, and

P(|T_n| > z*_nα) = α + O(n^(−3/2+ε)).

    5. MONTE CARLO EXPERIMENTS

    This section describes four Monte Carlo experiments that illustrate the numerical

    performance of the MCB. The number of experiments is small because the computations are very

    lengthy.

Each experiment consists of testing the hypothesis H_0 that the slope coefficient is zero in the regression of X_j on X_{j−1}. The coefficient is estimated by ordinary least squares (OLS), and


acceptance or rejection of H_0 is based on the OLS t statistic. The experiments evaluate the empirical rejection probabilities of one-sided and symmetrical tests at the nominal 0.05 level. Results are reported using critical values obtained from the MCB, the block bootstrap, and first-order asymptotic distribution theory. Four DGPs are used in the experiments. Two are the ARCH(1) processes

(5.1) X_j = U_j(1 + 0.3X²_{j−1})^(1/2),

where {U_j} is an iid sequence that has either the N(0,1) distribution or the distribution with

(5.2) P(U_j ≤ u) = 0.5[sin⁷(πu/2) + 1] for |u| ≤ 1.

The other two DGPs are the GARCH(1,1) processes

(5.3) X_j = U_j h_j^(1/2),

where

(5.4) h_j = 1 + 0.4(h_{j−1} + X²_{j−1})

and {U_j} is an iid sequence with either the N(0,1) distribution or the distribution of (5.2). DGP (5.1) is a first-order Markov process. DGP (5.3)-(5.4) is an approximate Markov process. In the experiments reported here, this DGP is approximated by a Markov process of order q = 2. When U_j has the admittedly somewhat artificial distribution (5.2), X_j has bounded support as required by Assumption 4. When U_j ~ N(0,1), X_j has unbounded support and X_j has moments only through orders 8 and 4 for models (5.1) and (5.3)-(5.4), respectively (He and Teräsvirta 1999). Therefore, the experiments with U_j ~ N(0,1) illustrate the performance of the MCB under conditions that are considerably weaker than those of the formal theory.
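The two DGPs with standard normal innovations can be simulated as follows. This is an illustrative sketch (the burn-in length is our choice, not stated in the paper):

```python
import numpy as np

def simulate_arch1(n, rng, burn=200):
    """ARCH(1) process (5.1): X_j = U_j * (1 + 0.3 * X_{j-1}**2) ** 0.5."""
    x = np.empty(n + burn)
    x[0] = rng.standard_normal()
    for j in range(1, n + burn):
        x[j] = rng.standard_normal() * np.sqrt(1.0 + 0.3 * x[j - 1] ** 2)
    return x[burn:]

def simulate_garch11(n, rng, burn=200):
    """GARCH(1,1) process (5.3)-(5.4): X_j = U_j * h_j**0.5,
    h_j = 1 + 0.4 * (h_{j-1} + X_{j-1}**2)."""
    x, h, xprev = np.empty(n + burn), 1.0, 0.0
    for j in range(n + burn):
        h = 1.0 + 0.4 * (h + xprev ** 2)
        xprev = x[j] = rng.standard_normal() * np.sqrt(h)
    return x[burn:]

rng = np.random.default_rng(42)
print(simulate_arch1(50, rng).shape, simulate_garch11(50, rng).shape)
```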

The MCB was carried out using the 4th-order kernel

(5.5) K(v) = (105/64)(1 − 5v² + 7v⁴ − 3v⁶)I(|v| ≤ 1).

Implementation of the MCB requires choosing the bandwidth parameter h_n. Preliminary experiments showed that the Monte Carlo results are not highly sensitive to the choice of h_n, so a simple method motivated by Silverman's (1986) rule of thumb is used. This consists of setting h_n equal to the asymptotically optimal bandwidth for estimating the (q + 1)-variate normal density N(0, σ²_n I_{q+1}), where I_{q+1} is the (q + 1) × (q + 1) identity matrix and σ²_n is the estimated variance of X_1. Of course, there is no reason to believe that this h_n is optimal in any sense in the MCB setting. The preliminary experiments also indicated that the Monte Carlo results are
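A version of this rule of thumb can be sketched as follows, using the standard AMISE-optimal constant for a multivariate normal reference density with a Gaussian kernel; the exact constant used in the experiments is not stated here, so this specific formula is an assumption.

```python
import numpy as np

def rule_of_thumb_bandwidth(x, q):
    """Silverman-style rule of thumb: the asymptotically optimal bandwidth for
    estimating a (q+1)-variate N(0, s2*I) density, with s2 estimated by the
    sample variance of the series x."""
    n, dim = len(x), q + 1
    sigma = np.std(x, ddof=1)
    # AMISE-optimal constant for a Gaussian reference density:
    c = (4.0 / (dim + 2)) ** (1.0 / (dim + 4))
    return c * sigma * n ** (-1.0 / (dim + 4))

print(rule_of_thumb_bandwidth(np.arange(50.0), 1))
```

The bandwidth shrinks more slowly as q grows, which mirrors the curse of dimensionality discussed in Section 3.2.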


    insensitive to the choice of trimming parameter, so trimming was not carried out in the

    experiments reported here.

    Implementation of the block bootstrap requires selecting the block length. Data-based

    methods for selecting block lengths in hypothesis testing are not available, so results are reported

here for three different block lengths, (2, 5, 10). The experiments were carried out in GAUSS using GAUSS random number generators. The sample size is n = 50. There are 5000 Monte Carlo replications in each experiment. MCB and block bootstrap critical values are based on 99 bootstrap samples.10
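For comparison, a moving-block bootstrap resample with a given block length can be generated as follows (a generic sketch, not the experiments' GAUSS code):

```python
import numpy as np

def block_bootstrap_sample(x, block_len, rng):
    """Moving-block resample of the series x: draw block start points with
    replacement, concatenate blocks of length `block_len`, truncate to n."""
    n = len(x)
    n_blocks = int(np.ceil(n / block_len))
    starts = rng.integers(0, n - block_len + 1, size=n_blocks)
    out = np.concatenate([x[s:s + block_len] for s in starts])
    return out[:n]

rng = np.random.default_rng(7)
x = rng.normal(size=50)
for L in (2, 5, 10):
    xb = block_bootstrap_sample(x, L, rng)
    print(L, xb.shape)
```

Unlike the MCB, the resample joins blocks at arbitrary points, which is one source of the block bootstrap's slower convergence.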

    The results of the experiments are shown in Table 1. The differences between the

    empirical and nominal rejection probabilities (ERPs) with first-order asymptotic critical values

    tend to be large. The symmetrical and lower-tail tests reject the null hypothesis too often. The

    upper-tail test does not reject the null hypothesis often enough when the innovations have the

    distribution (5.2). The ERPs with block bootstrap critical values are sensitive to the block

    length. With some block lengths, the ERPs are small, but with others they are comparable to or

    larger than the ERPs with asymptotic critical values. With the MCB, the ERPs are smaller than

    they are with asymptotic critical values in 10 of the 12 experiments. The MCB has relatively

    large ERPs with the GARCH(1,1) model and normal innovations because this DGP lacks the

higher-order moments needed to obtain good accuracy with the bootstrap even with iid data.

    6. CONCLUSIONS

    The block bootstrap is the best known method for implementing the bootstrap with time

    series data when one does not have a parametric model that reduces the DGP to simple random

    sampling. However, the errors made by the block bootstrap converge to zero only slightly faster

    than those made by first-order asymptotic approximations. This paper has shown that the errors

    made by the MCB converge to zero more rapidly than those made by the block bootstrap if the

    DGP is a Markov or approximate Markov process and certain other conditions are satisfied.

    These conditions are stronger than those required by the block bootstrap. Therefore, the MCB is

not a substitute for the block bootstrap, but the MCB is an attractive alternative to the block bootstrap when the MCB's stronger regularity conditions are satisfied. Further research could

    usefully investigate the possibility of developing bootstrap methods that are more accurate than

    the block bootstrap but impose less a priori structure on the DGP than do the MCB or the sieve

    bootstrap for linear processes.


    FOOTNOTES

    1 Statistics with asymptotic chi-square distributions are not treated explicitly in this paper.

    However, the arguments made here can be extended to show that asymptotic chi-square statistics

    based on GMM estimators behave like symmetrical statistics. See, for example, the discussion of

    the GMM test of overidentifying restrictions in Hall and Horowitz (1996) and Andrews (1999).

    Under regularity conditions similar to those of Sections 3 and 4, the results stated here for

    symmetrical probabilities and tests apply to asymptotic chi-square statistics based on GMM

    estimators. Andrews (1999) defines a class of minimum estimators that is closely related to

GMM estimators. The results here also apply to t tests based on minimum estimators.

    2 GSM means that the process is strongly mixing with a mixing parameter that is an

    exponentially decreasing function of the lag length.

    3 If Y has finitely many moments and the required Edgeworth expansions are valid, then the

    errors made by the MCB increase as the number of moments of Y decreases. If Y has too few

    moments, then the errors made by the MCB decrease more slowly than the errors made by the

    block bootstrap.

4 Götze and Künsch (1996) have given conditions under which T_n with a kernel-type variance estimator has an Edgeworth expansion up to O(n^(−1/2)). Analogous conditions are not yet known for expansions through O(n^(−1)). A recent paper by Inoue and Shintani (2000) gives expansions for statistics based on the block bootstrap for linear models with kernel-type variance estimators.

5 As an example of a sufficient condition for 4(v), let X be scalar with support [0,1]. Then 4(v) holds if there are finite constants c_1 > 0, c_2 > 0, α > 0, and β > 0 such that c_1 x_q^α ≤ p_y(x_1, ..., x_q) ≤ c_2 x_q^α when x_q < 1/2 and c_1(1 − x_q)^β ≤ p_y(x_1, ..., x_q) ≤ c_2(1 − x_q)^β when x_q ≥ 1/2. The generalization to multivariate X is analytically straightforward though notationally cumbersome.

APPENDIX

Lemma 1: Δ_{nz} = O[(log n)^c/(n h_n^(d(q+1)))^(1/2)] and Δ_{ny} = o(Δ_{nz}) a.s.

Proof: This is a slightly modified version of Theorem 2.2 of Bosq (1996) and is proved the same way as that theorem. Q.E.D.

Lemma 2: P_{ny}(C_n) − P_y(C_n) = O(Δ_{ny}) a.s.

Proof: C_n is bounded uniformly over n because X has bounded support. Therefore,

P_{ny}(C_n) − P_y(C_n) = ∫_{C_n} [p_{ny}(y) − p_y(y)] dy = O(Δ_{ny}).

Q.E.D.

Lemma 3: P_{ny}(C_n) = 1 − O(λ_n) a.s.

Proof: Let V(2λ_n) denote the volume of {y: p_y(y) < 2λ_n}. By Assumption 6(ii) and Lemma 1, Δ_{ny}/λ_n = o[(log n)^(−c)] a.s. for any c > 1. Therefore, Δ_{ny} = o(λ_n) a.s., and

1 − P_{ny}(C_n) = ∫ I[p_{ny}(y) < λ_n] p_y(y) dy ≤ 2λ_n ∫ I[p_y(y) < 2λ_n] dy = 2λ_n V(2λ_n) a.s.

Q.E.D.

    ny C (

    (A1) ( ) (

    (

    yB

    ny

    n n

    B p

    [ ( )

    ) ( )

    yB B

    y B n

    y dy

    p y dy p y

    V B c V

    = +

    n

    =

    P

    1( | )

    y jf y

    1j 1jy

    =

    BM

    ( ) ( | ) (

    ( | ) (

    ( )n

    y y

    y y

    B yC

    B dxf x y p y

    dxf x y p y

    p y dy V =

    P

    Lemma 6: Conditional on the data, the support of Y j (j = in the truncated

    Markov process is C a.s..

    Proof: The support ofY is by construction, and the support ofY j can be no

    larger than . For , suppose that the support of Y

    2)

    1j is but the support of Y is a

    proper subset of . Then there is a measurable set

    nC j

    nB C such that andP ( 0| )B y =P

    for all . Let V B denote the volume of) B . Then by Lemma 1 and Assumption 6(ii),

    )

    ( ) ( )]

    ( ) a.s.

    nyp y dy

    B

    .

    for all sufficiently large and some cB < . Let nC denote the complement of . Let

    denote the probability density of Y conditional on Y

    nC

    j . Then for some

    constant , Lemma 3 yields<

    ) ( | ) ( )

    )

    ( ) ( ) ( ).

    n n

    n

    y yC B C B

    C B

    n

    dy dy dxf x y p y

    dy

    M V B B o

    = +

    =

    This is a contradiction of (A1). Q.E.D.

Lemma 7: Define δ_{1n} as the difference between E*[U(ψ)] and the same expectation computed with f in place of f_n, with all integrations restricted to A_n. Then δ_{1n} = O(r_nΔ_n) a.s. uniformly as n → ∞.

Proof: Write δ_{1n} = B_n. A multinomial expansion of the product of transition densities gives

|B_n| ≤ Σ_{|w|≥2} {r_n!/[|w|!(r_n − |w|)!]} Δ_n^|w| ∫ ⋯ ∫ p(x_1, ..., x_{r_n}) dx_1 ⋯ dx_{r_n} + O(r_nΔ_n) ≤ (1 + Δ_n)^{r_n} − 1 − r_nΔ_n + O(r_nΔ_n) ≤ 0.5r_n²Δ_n²(1 + Δ_n)^{r_n} + O(r_nΔ_n) a.s.

uniformly. It follows that

|B_n| ≤ (1 + Δ_n)^{r_n}[O(r_nΔ_n) + O(r_n²Δ_n²)] = O(r_nΔ_n) a.s.

uniformly. By Assumption 4(v), Δ_n = o[(log n)^(−1)], and the result follows. Q.E.D.

Lemma 8: As n → ∞ and conditional on the data,

(A2) ∫_{C_n} |g_y(y) − p_y(y)| dy = O(Δ_n log n) a.s.

and

(A3) ∫_{C_n} |g_{ny}(y) − p_{ny}(y)| dy = O(Δ_n log n) a.s.

Proof: Only (A2) is proved here. The proof of (A3) is similar. Let B denote the measurable subsets of C_n. Let G_y and P_y denote the probability measures associated with g_y and p_y. Then

∫_{C_n} |g_y(y) − p_y(y)| dy = sup_{A∈B} [G_y(A) − P_y(A)] − inf_{A∈B} [G_y(A) − P_y(A)] ≤ 2 sup_{A∈B} |G_y(A) − P_y(A)|.

Thus, it suffices to investigate the convergence of

sup_{A∈B} |G_y(A) − P_y(A)|.

This is done here for the case q = 1. The argument for the case of q > 1 is similar but more complex notationally. Let s be the integer part of b log n for some b > 0. The s-step transition probability from a point sampled randomly from P_y to a measurable set A is

P^(s)(A) = ∫_A dx_s ∫ dx_{s−1} ⋯ ∫ dx_1 ∏_{i=2}^{s} f(x_i|x_{i−1}) p_y(x_1)

= ∫_{A_n} dx_s ⋯ dx_1 ∏_{i=2}^{s} f(x_i|x_{i−1}) p_y(x_1) + ∫_{Ā_n} dx_s ⋯ dx_1 ∏_{i=2}^{s} f(x_i|x_{i−1}) p_y(x_1),

where A_n = {x_s ∈ A; x_i ∈ C_n, i = 1, ..., s} and Ā_n is the corresponding set on which x_i ∉ C_n for at least one i. The s-step transition probability of the truncated process starting from the same initial density but restricted to C_n is

G^(s)(A) = C_n(s)^(−1) ∫_{A_n} dx_s ⋯ dx_1 ∏_{i=2}^{s} f_n(x_i|x_{i−1}) p_y(x_1),

where C_n(s) is the normalizing constant that makes the truncated transition densities integrate to one. Therefore, G^(s)(A) − P^(s)(A) = S_{1n} − S_{2n}, where

S_{1n} = ∫_{A_n} dx_s ⋯ dx_1 [C_n(s)^(−1) ∏_{i=2}^{s} f_n(x_i|x_{i−1}) − ∏_{i=2}^{s} f(x_i|x_{i−1})] p_y(x_1)

and

S_{2n} = ∫_{Ā_n} dx_s ⋯ dx_1 ∏_{i=2}^{s} f(x_i|x_{i−1}) p_y(x_1).

Arguments like those made in the proof of Lemma 7 show that S_{1n} = O(sΔ_n) a.s. uniformly over A ∈ B. In addition, S_{2n} ≤ cP(Ā_n) for some constant c < ∞. But

Ā_n ⊂ ∪_{i=1}^{s} {x_i ∉ C_n},

so

P(Ā_n) ≤ s[1 − P_y(C_n)] = O(sΔ_n) a.s.

It follows that

(A4) G^(s)(A) − P^(s)(A) = O(sΔ_n) a.s.

uniformly over A ∈ B. Similar arguments show that the s-step transition probabilities from a point x ∈ C_n to A ∈ B satisfy

(A5) G^(s)(A, x) − P^(s)(A, x) = O(sΔ_n) a.s.

uniformly over A ∈ B and x ∈ C_n. In addition,

(A6) sup_{A∈B} |G_y(A) − P_y(A)| ≤ sup_{A∈B} |G^(s)(A) − P^(s)(A)| + sup_{A∈B} |G_y(A) − G^(s)(A)| + sup_{A∈B} |P^(s)(A) − P_y(A)|.

The first term on the right-hand side of (A6) is O(sΔ_n) a.s. By a result of Nagaev (1961), it follows from (A5) and Assumption 4(ii) that for some β < 1, the second term is a.s. O(β^(s/k)), where k is as in Assumption 4(ii) (Nagaev 1961). The third term is O(ρ^s) for some ρ < 1 because {X_j} is GSM and, therefore, uniformly ergodic (Doukhan 1995, p. 21). For b sufficiently large, the second two terms are o(sΔ_n). Q.E.D.

Lemma 9: Define

δ_{2n} = ∫_{A_n} u(·) ∏_{i=j+q}^{k} f(x_i|y_i) [g_y(y_{j+q}) − p_y(y_{j+q})] dx_{j+q} ⋯ dx_k.

Then conditional on the data, δ_{2n} = O(Δ_n log n) a.s. uniformly as n → ∞.

Proof:

|δ_{2n}| ≤ B ∫_{C_n} |g_y(y) − p_y(y)| dy

for some constant B < ∞. The result now follows from Lemma 8. Q.E.D.

Lemma 10: Define

δ_{3n} = ∫_{Ā_n} u(·) ∏_{i=j+q}^{k} f(x_i|y_i) p_y(y_{j+q}) dx_{j+q} ⋯ dx_k.

Then conditional on the data, δ_{3n} = O(r_nΔ_n) a.s. uniformly as n → ∞.

Proof:

|δ_{3n}| ≤ B P(Ā_n)

for some constant B < ∞. The result now follows from arguments like those used to prove Lemma 8. Q.E.D.

Lemma 11: Conditional on the data, E*[U(ψ)] − E[U(ψ)] = O(Δ_n log n) a.s. uniformly over ψ.

Proof: Some algebra shows that E*[U(ψ)] − E[U(ψ)] = δ_{1n} + δ_{2n} + δ_{3n}. Now combine Lemmas 7, 9, and 10. Q.E.D.

Lemma 12: The bootstrap process {X_j*} is geometrically strongly mixing a.s.

Proof: It follows from Lemma 5 and Assumption 4(ii) that for all sufficiently large m,

sup_{x∈C_n; A⊂C_n} |G^(m)(A, x) − P^(m)(A, x)| < 1.

The lemma follows from a result of Nagaev (1961). Q.E.D.


and

δ_{4n} = ∫_{A_n} S_n g_{ny}(y_{j+q}) dx_{j+q} ⋯ dx_k.

It follows from Lemmas 5 and 8 that δ_{1n} = O(Δ_n log n) and δ_{3n} = O(Δ_n log n) a.s. uniformly. Now consider δ_{2n}. A Taylor series approximation shows that

δ_{2n} = Σ_{l=j+q}^{k} [δ^(1)_{2nl} + δ^(2)_{2nl} + δ^(3)_{2nl}],

where

δ^(1)_{2nl} = ∫_{A_n} u(·) {[p_{nz}(x_l, y_l) − p_z(x_l, y_l)]/p_y(y_l)} ∏_{i≠l} f(x_i|y_i) I(x_l ∈ C_n) g_y(y_{j+q}) dx_{j+q} ⋯ dx_k,

δ^(2)_{2nl} = ∫_{A_n} u(·) {f(x_l|y_l)[p_{ny}(y_l) − p_y(y_l)]/p_y(y_l)} ∏_{i≠l} f(x_i|y_i) I(x_l ∈ C_n) g_y(y_{j+q}) dx_{j+q} ⋯ dx_k,

and

δ^(3)_{2nl} = ∫_{A_n} u(·) {f(x_l|y_l)/p_y(y_l)} [O(Δ_{nz}) + O(Δ_{ny})] ∏_{i≠l} f(x_i|y_i) g_y(y_{j+q}) dx_{j+q} ⋯ dx_k.

For some constant B < ∞,

|δ^(1)_{2nl}| ≤ BΔ_{nz} ∫ p_y(y_l)^(−1) ∏_{i≠l} f(x_i|y_i) g_y(y_{j+q}) dx_{j+q} ⋯ dx_k

≤ BΔ_{nz} ∫ p_y(y_l)^(−1) ∏_{i≠l} f(x_i|y_i) p_y(y_{j+q}) dx_{j+q} ⋯ dx_k + BΔ_{nz}·O(Δ_n log n)

= O(Δ_{nz} log n)

a.s. uniformly, where the last line follows from Lemma 8. Some algebra shows that p_y(y_{j+q}) ∏_{i≠l} f(x_i|y_i)/p_y(y_l) is bounded, up to a factor 1 + o(1), by a product of the transition densities f(x_{l+q}|y_{l+q}) ⋯ f(x_j|y_j). Therefore, δ^(1)_{2nl} = O(Δ_{nz} log n) a.s. uniformly. Similar arguments can be used to show that δ^(2)_{2nl} = O(Δ_{ny} log n) and δ^(3)_{2nl} = o(Δ_n log n) a.s. uniformly. Therefore,

δ_{2n} = O[Δ_n(log n)²]

a.s. uniformly. Now consider δ_{4n}. Given w, let i(w) be the smallest value of i such that w_i = 1. Define

Δ̄_n = sup{|f_n(x|y) − f(x|y)|: x, y such that p_{ny}(y) ≥ λ_n}.

Then

|S_n| ≤ B_2 Σ_{|w|≥2} ∏_i f(x_i|y_i)^(1−w_i) |f_n(x_i|y_i) − f(x_i|y_i)|^(w_i)

for some constant B_2 < ∞. Therefore,

|δ_{4n}| ≤ B_2 Σ_{|w|≥2} ∫_{A_n} ∏_i f(x_i|y_i)^(1−w_i) |f_n(x_i|y_i) − f(x_i|y_i)|^(w_i) g_{ny}(y_{j+q}) dx_{j+q} ⋯ dx_k.

Arguments like those made for δ_{2n} show that the integral is O[Δ̄_n²(log n)²] a.s. Arguments like those made in the proof of Lemma 7 show that the sum is O(r_n²Δ̄_n²). Therefore, δ_{4n} = o[Δ_{nz}(log n)⁴] a.s. uniformly. The lemma follows by combining the rates of convergence of δ_{1n}, δ_{2n}, δ_{3n}, and δ_{4n}. Q.E.D.

    Lemma 14: Define /[2 ( 1)]d q = + + . For any 0 > , ( )O n + = a.s.

    Proof: Under Assumptions 3-4, is a smooth function of sample moments. Therefore,

    thejth cumulant ofT has the expansion

    nT

    n

    ( 2) / 2 1 21 2[ (

    jj j jK n k n k O n

    = + + )] .

    with and . The vector11 0k = 21 1k = of Section 2.3 can be written 12 22 31 41( , , , )k k k k = .

    Then the functions 1 and 2 in (2.6) are

    21 12 31( , ) [ (1/ 6) ( 1)]z k k z = +

    and

    2 2 2 42 22 12 41 12 31 31( , ) [(1/ 2)( ) (1/ 24)( 4 )( 3) (1/ 72) ( 10 15)]z z k k k k k z k z z = + + + + +

    2.

    34

  • 8/6/2019 3528_Bootstrap Methods for Markov Processes

    36/44

See, e.g., Hall (1992). The parameters k_{ij} in these expressions are independent of n. Define ρ_j = E(X_1 − μ)(X_{1+j} − μ) and ρ_{nj} = n^{−1} Σ_{i=1}^{n−j} (X_i − m)(X_{i+j} − m). A Taylor series expansion of T_n about m = μ and ρ_{nj} = ρ_j (j = 0, …, M) shows that κ is a continuous function of terms of the form D(μ, ρ)E[n^{1/2}(m − μ)]^2, D(μ, ρ)E[n^{1/2}(m − μ)]^3, D(μ, ρ)E[n^{1/2}(m − μ)]^4, and D(μ, ρ) times expectations of products of powers of n^{1/2}(m − μ) and n^{1/2}(ρ_{nj} − ρ_j), where D represents a differentiable function that may be different in different terms, and ρ = (ρ_0, …, ρ_M)′. Similar expansions applied to T_n* show that κ* is a continuous function of the same terms but with E*, μ*, m*, and ρ_j* = E*(X_1* − μ*)(X_{1+j}* − μ*) in place of E, μ, m, and ρ_j. Thus, it is necessary to show that the differences between the bootstrap and population terms are o(n^{−λ+ε}) a.s. This is done here for D(μ*, ρ*)E*[n^{1/2}(m* − μ*)]^2 − D(μ, ρ)E[n^{1/2}(m − μ)]^2 and D(μ*, ρ*)E*[n^{1/2}(m* − μ*)]^3 − D(μ, ρ)E[n^{1/2}(m − μ)]^3. The calculations for the remaining moments are similar but much lengthier. Lemma 13 implies that D(μ*, ρ*) − D(μ, ρ) = O(n^{−λ+ε}) a.s., so it suffices to show that E*[n^{1/2}(m* − μ*)]^2 − E[n^{1/2}(m − μ)]^2 and E*[n^{1/2}(m* − μ*)]^3 − E[n^{1/2}(m − μ)]^3 are O(n^{−λ+ε}).

First consider E*[n^{1/2}(m* − μ*)]^2 − E[n^{1/2}(m − μ)]^2. Let {J_n} be an increasing sequence of positive integers such that J_n/log n → b for some finite constant b > 1/2. Then it follows from Lemma 12 that

    E*[n^{1/2}(m* − μ*)]^2 = ρ_0* + 2 Σ_{j=1}^{J_n} (1 − j/n)ρ_j* + o(n^{−1/2}) a.s.

It further follows from Lemma 13 that E*[n^{1/2}(m* − μ*)]^2 − E[n^{1/2}(m − μ)]^2 = o(n^{−λ+ε}) a.s. for any ε > 0.
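The truncation step above rests on the exact identity E[n^{1/2}(m − μ)]^2 = ρ_0 + 2Σ_{j=1}^{n−1}(1 − j/n)ρ_j for a stationary series, with the tail of the sum negligible when the autocovariances decay geometrically. A small numerical check (Python; the AR(1)-type autocovariances ρ_j = 0.6^j and the truncation point J_n = 15 are illustrative assumptions, not values from the paper):

```python
# exact variance identity for the normalized mean of a stationary series
n = 50
rho = [0.6 ** j for j in range(n)]  # geometrically decaying autocovariances, rho_0 = 1

# Var(n^{1/2} m) computed directly from the n x n covariance matrix: (1/n) 1' Sigma 1
direct = sum(rho[abs(i - j)] for i in range(n) for j in range(n)) / n

# the weighted-autocovariance form used in the text
weighted = rho[0] + 2 * sum((1 - j / n) * rho[j] for j in range(1, n))

# truncating the sum at J_n changes the result only by a geometrically small amount
Jn = 15
truncated = rho[0] + 2 * sum((1 - j / n) * rho[j] for j in range(1, Jn + 1))
```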

Now consider E*[n^{1/2}(m* − μ*)]^3 − E[n^{1/2}(m − μ)]^3. Let V_i = X_i − μ and V_i* = X_i* − μ*. Then

    [n^{1/2}(m − μ)]^3 = n^{−3/2} Σ_{i,j,k=1}^{n} V_i V_j V_k
        = n^{−3/2} Σ_{i=1}^{n} V_i^3 + 3n^{−3/2} Σ_{i=1}^{n} Σ_{j≠i} V_i^2 V_j
          + 6n^{−3/2} Σ_{i=1}^{n−2} Σ_{j=i+1}^{n−1} Σ_{k=j+1}^{n} V_i V_j V_k.

Since the bootstrap DGP is stationary, some algebra shows that

    E*[n^{1/2}(m* − μ*)]^3 = n^{−1/2} { E*V_1*^3 + 3 Σ_{i=1}^{n−1} (1 − i/n)[E*V_1*^2 V_{1+i}* + E*V_1* V_{1+i}*^2]
        + 6 Σ_{i=1}^{n−2} Σ_{j=1}^{n−1−i} [1 − (i + j)/n] E*V_1* V_{1+i}* V_{1+i+j}* }.

Define J_n as before. By Lemma 12 and Billingsley's inequality (Bosq, 1996, inequality (1.11)),

    E*[n^{1/2}(m* − μ*)]^3 = n^{−1/2} { E*V_1*^3 + 3 Σ_{i=1}^{J_n} (1 − i/n)[E*V_1*^2 V_{1+i}* + E*V_1* V_{1+i}*^2]
        + 6 Σ_{i=1}^{J_n} Σ_{j=1}^{min(n−1−i, J_n)} [1 − (i + j)/n] E*V_1* V_{1+i}* V_{1+i+j}* } + o(n^{−1/2}) a.s.

A similar argument shows that

    E[n^{1/2}(m − μ)]^3 = n^{−1/2} { EV_1^3 + 3 Σ_{i=1}^{J_n} (1 − i/n)[EV_1^2 V_{1+i} + EV_1 V_{1+i}^2]
        + 6 Σ_{i=1}^{J_n} Σ_{j=1}^{min(n−1−i, J_n)} [1 − (i + j)/n] EV_1 V_{1+i} V_{1+i+j} } + o(n^{−1/2}).

Now apply Lemma 13. Q.E.D.
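The three-way decomposition of the triple sum used in this third-moment calculation is a finite combinatorial identity and can be verified directly (Python sketch; the sample values are arbitrary):

```python
import itertools
import random

random.seed(0)
n = 12
V = [random.gauss(0, 1) for _ in range(n)]

# full triple sum over all index triples (i, j, k)
full = sum(V[i] * V[j] * V[k] for i in range(n) for j in range(n) for k in range(n))

# decomposition into cubes, terms with exactly one repeated index, and distinct triples
cubes = sum(v ** 3 for v in V)
mixed = 3 * sum(V[i] ** 2 * V[j] for i in range(n) for j in range(n) if j != i)
triples = 6 * sum(V[i] * V[j] * V[k]
                  for i, j, k in itertools.combinations(range(n), 3))
```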

A.2 Proofs of Theorems 3.1 and 3.2

Proof of Theorem 3.1: Under Assumptions 3-4, T_n is a smooth function of sample moments. Therefore, Theorem 2.8 of Götze and Hipp (1983) and arguments like those used to prove Theorem 2 of Bhattacharya and Ghosh (1978) establish the expansions (2.6) and (2.7). Repetition of Götze and Hipp's proof shows that the expansions (2.8) and (2.9) are also valid. The theorem follows from Lemma 14 and the fact that g_1 and g_2 in (2.6)-(2.9) are polynomial functions of the components of κ and κ*. Q.E.D.

Proof of Theorem 3.2: Equations (2.13) and (2.14) follow from Theorem 3.1. Cornish-Fisher inversions of (2.13) and (2.14) yield (2.15)-(2.18). The theorem follows by applying Lemma 14 and the delta method to (2.18). Q.E.D.
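A Cornish-Fisher inversion replaces an expansion of the distribution function with an expansion of its quantiles: if P(T_n ≤ z) = Φ(z) + n^{−1/2}g_1(z, κ)φ(z) + O(n^{−1}), then the α-quantile of T_n is z_α − n^{−1/2}g_1(z_α, κ) + O(n^{−1}), where z_α is the standard normal α-quantile. The sketch below checks the one-term inversion numerically (Python; the cumulant values and sample size are illustrative assumptions):

```python
import math

def Phi(z):  # standard normal CDF
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def phi(z):  # standard normal density
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def g1(z, k12, k31):  # first Edgeworth polynomial, as in Lemma 14
    return -(k12 + k31 * (z ** 2 - 1) / 6)

def solve(f, lo=-10.0, hi=10.0):
    # bisection for a root of f, assuming f(lo) < 0 < f(hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

n, k12, k31, alpha = 400, 0.0, 1.0, 0.95
h = 1 / math.sqrt(n)

def F(z):  # one-term Edgeworth approximation to P(T_n <= z)
    return Phi(z) + h * g1(z, k12, k31) * phi(z)

z_alpha = solve(lambda z: Phi(z) - alpha)        # standard normal quantile
exact = solve(lambda z: F(z) - alpha)            # exact alpha-quantile of F
cf = z_alpha - h * g1(z_alpha, k12, k31)         # Cornish-Fisher approximation
```

The Cornish-Fisher quantile differs from the exact quantile of F by O(n^{−1}), an order of magnitude closer than the uncorrected normal quantile.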

A.3 Proofs of Theorems 4.1-4.2

Assumptions 1 and 5-10 hold throughout this section.

Lemma 15: For any ε > 0,

    lim_{n→∞} n^{1/2−ε} ‖θ_n − θ_0‖ = 0 a.s.

Proof: This follows from Lemmas 3 and 4 of Andrews (1999) and the Borel-Cantelli lemma. Q.E.D.

Lemma 16: Define S_n = n^{−1} Σ_{i=1}^{n} V(X_i, θ_0), S_n* = n^{−1} Σ_{i=1}^{n} V(X_i*, θ_n), S̄ = E(S_n), and S̄* = E*(S_n*). There is an infinitely differentiable function h such that h(S̄) = h(S̄*) = 0,

    lim_{n→∞} n^{3/2} sup_z |P(t_{nr} ≤ z) − P[n^{1/2} h(S_n) ≤ z]| = 0,

and

    lim_{n→∞} n^{3/2} sup_z |P*(t_{nr}* ≤ z) − P*[n^{1/2} h(S_n*) ≤ z]| = 0 a.s.

Proof: This is a slightly modified version of Lemma 13 of Andrews (1999) and is proved using the methods of proof of that lemma. Q.E.D.

Proof of Theorem 4.1: P[n^{1/2} h(S_n) ≤ z] has an Edgeworth expansion. By Lemma 16, P(t_{nr} ≤ z) has the same expansion through O(n^{−3/2}), and

    P(t_{nr} ≤ z) = Φ(z) + Σ_{j=1}^{2} n^{−j/2} g_j(z, κ)φ(z) + O(n^{−3/2})

uniformly over z, where κ is an infinitely differentiable function of the following moments: E[n^{1/2}(S_n − S̄)]^2, E[n^{1/2}(S_n − S̄)]^3, and E[n^{1/2}(S_n − S̄)]^4. Similarly,

    P*(t_{nr}* ≤ z) = Φ(z) + Σ_{j=1}^{2} n^{−j/2} g_j(z, κ*)φ(z) + O(n^{−3/2})

uniformly over z a.s., where κ* is an infinitely differentiable function of E*[n^{1/2}(S_n* − S̄*)]^2, E*[n^{1/2}(S_n* − S̄*)]^3, and E*[n^{1/2}(S_n* − S̄*)]^4. It is easy to see that the conclusion of Lemma 13 holds for moments of V(X_i, θ_0) and V(X_i*, θ_0). Lemma 15 implies that V(X_i*, θ_n) can be replaced with V(X_i*, θ_0) in the moments comprising κ*. Therefore, the conclusion of Lemma 14 holds for κ* − κ, and the theorem is proved. Q.E.D.

Proof of Theorem 4.2: Let T̃_n be the version of T_n* that is obtained by sampling the process {X̃_j^{(q)}}. Let P̃ denote the probability measure corresponding to T̃_n. Repetition of the steps leading to the conclusions of Theorems 3.1 and 3.2 shows that for every ε > 0,

    sup_z |P̃(T̃_n ≤ z) − P*(T_n* ≤ z)| = o(n^{−1+ε}),

    sup_z |P̃(|T̃_n| ≤ z) − P*(|T_n*| ≤ z)| = o(n^{−3/2+ε}) a.s.,

and

    P̃(|T̃_n| > z_n) = α + o(n^{−3/2+ε}).

Therefore, it suffices to show that

    sup_z |P̃(T̃_n ≤ z) − P(T_n ≤ z)| = o(n^{−3/2+ε}).

P̃(T̃_n ≤ z) and P(T_n ≤ z) have Edgeworth expansions, so it is enough to show that κ̃ − κ = O(n^{−1/2+ε}), where κ = (k_{12}, k_{22}, k_{31}, k_{41})′ and κ̃ is defined as in Lemma 14 but relative to the distribution of T̃_n instead of the distribution of T_n*. As in the proof of Lemma 14, κ̃ − κ = O(n^{−1/2+ε}) follows if |Ẽu(χ) − Eu(χ)| = O(n^{−1/2+ε}) for any ε > 0, where Ẽ denotes the expectation relative to the measure induced by {X̃_j^{(q)}}. Let χ_j = {X_j, X_{j−1}, …}. Define u(·), r_n, the multi-index w, and |w| as in the proof of Lemma 7. Then for a suitable constant C_1 < ∞,

    |Ẽu(χ) − Eu(χ)| ≤ ∫ |u(χ)| |Π_{i=q+1}^{r_n} f^{(q)}(x_i|y_i) − Π_{i=q+1}^{r_n} f(x_i|y_i)| dx_1 … dx_{r_n} dP(χ_j)
        ≤ C_1 ∫ |Π_{i=q+1}^{r_n} f^{(q)}(x_i|y_i) − Π_{i=q+1}^{r_n} f(x_i|y_i)| dx_1 … dx_{r_n} dP(χ_j)
        = C_1 S_n,

where

    S_n = Σ_{|w| ≥ 1} ∫ Π_{i=q+1}^{r_n} f(x_i|y_i)^{1−w_i} |f^{(q)}(x_i|y_i) − f(x_i|y_i)|^{w_i} dx_1 … dx_{r_n} dP(χ_j).

Each value of |w| generates r_n!/[|w|!(r_n − |w|)!] terms in the sum on the right-hand side of S_n. Each term is bounded by (C_2 e^{−bq})^{|w|} for some constant C_2 < ∞. Therefore,

    S_n ≤ Σ_{|w| ≥ 1} {r_n!/[|w|!(r_n − |w|)!]} (C_2 e^{−bq})^{|w|} = (1 + C_2 e^{−bq})^{r_n} − 1.

S_n = O(n^{−1/2+ε}) for any ε > 0 follows from r_n = O(log n) and q = c(log n)^2. Q.E.D.
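The final bound on S_n is the binomial theorem with the k = 0 term removed, and it vanishes faster than n^{−1/2} because e^{−bq} is negligible when q grows like (log n)^2. A numerical illustration (Python; the constants b, c, and C_2 are illustrative assumptions, not values from the paper):

```python
import math

# binomial identity behind the bound:
# sum over 1 <= k <= r of C(r, k) x^k equals (1 + x)^r - 1
r, x = 8, 0.3
lhs = sum(math.comb(r, k) * x ** k for k in range(1, r + 1))
rhs = (1 + x) ** r - 1

# magnitude of the bound with r_n = O(log n) and q = c (log n)^2
n = 10 ** 6
r_n = int(math.log(n))                       # about 13 for n = 10^6
q = math.log(n) ** 2                         # c = 1 (illustrative)
bound = (1 + math.exp(-0.1 * q)) ** r_n - 1  # C_2 = 1, b = 0.1 (illustrative)
```

With these values the bound is several orders of magnitude smaller than n^{−1/2}.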


REFERENCES

Andrews, D.W.K. (1991): "Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimation," Econometrica, 59, 817-858.

Andrews, D.W.K. (1999): "Higher-Order Improvements of a Computationally Attractive k-Step Bootstrap for Extremum Estimators," Cowles Foundation Discussion Paper No. 1230, Cowles Foundation for Research in Economics, Yale University, New Haven, CT.

Andrews, D.W.K. (2002): "Higher-Order Improvements of a Computationally Attractive k-Step Bootstrap for Extremum Estimators," Econometrica, 70, 119-162.

Andrews, D.W.K. and J.C. Monahan (1992): "An Improved Heteroskedasticity and Autocorrelation Consistent Covariance Matrix," Econometrica, 60, 953-966.

Beran, R. (1988): "Prepivoting Test Statistics: A Bootstrap View of Asymptotic Refinements," Journal of the American Statistical Association, 83, 687-697.

Bhattacharya, R.N. and J.K. Ghosh (1978): "On the Validity of the Formal Edgeworth Expansion," Annals of Statistics, 6, 434-451.

Bose, A. (1988): "Edgeworth Correction by Bootstrap in Autoregressions," Annals of Statistics, 16, 1709-1722.

Bose, A. (1990): "Bootstrap in Moving Average Models," Annals of the Institute of Statistical Mathematics, 42, 753-768.

Bosq, D. (1996): Nonparametric Statistics for Stochastic Processes: Estimation and Prediction. New York: Springer-Verlag.

Brown, B.W., W.K. Newey, and S.M. May (2000): "Efficient Bootstrapping for GMM," unpublished manuscript, Department of Economics, Massachusetts Institute of Technology, Cambridge, MA.

Bühlmann, P. (1997): "Sieve Bootstrap for Time Series," Bernoulli, 3, 123-148.

Bühlmann, P. (1998): "Sieve Bootstrap for Smoothing in Nonstationary Time Series," Annals of Statistics, 26, 48-83.

Carlstein, E. (1986): "The Use of Subseries Methods for Estimating the Variance of a General Statistic from a Stationary Time Series," Annals of Statistics, 14, 1171-1179.

Chan, K.S. and H. Tong (1998): "A Note on Testing for Multi-Modality with Dependent Data," working paper, Department of Statistics and Actuarial Science, University of Iowa, Iowa City, IA.

Cheng, B. and H. Tong (1992): "Consistent Non-Parametric Order Determination and Chaos," Journal of the Royal Statistical Society, Series B, 54, 427-449.

Choi, E. and P. Hall (2000): "Bootstrap Confidence Regions Computed from Autoregressions of Arbitrary Order," Journal of the Royal Statistical Society, Series B, 62, 461-477.

Datta, S. and W.P. McCormick (1995): "Some Continuous Edgeworth Expansions for Markov Chains with Applications to Bootstrap," Journal of Multivariate Analysis, 52, 83-106.

Davidson, R. and J.G. MacKinnon (1999): "The Size Distortion of Bootstrap Tests," Econometric Theory, 15, 361-376.

Doukhan, P. (1995): Mixing: Properties and Examples. New York: Springer-Verlag.

Götze, F. and C. Hipp (1983): "Asymptotic Expansions for Sums of Weakly Dependent Random Vectors," Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 64, 211-239.

Götze, F. and H.R. Künsch (1996): "Second-Order Correctness of the Blockwise Bootstrap for Stationary Observations," Annals of Statistics, 24, 1914-1933.

Hall, P. (1985): "Resampling a Coverage Process," Stochastic Processes and their Applications, 19, 259-269.

Hall, P. (1992): The Bootstrap and Edgeworth Expansion. New York: Springer-Verlag.

Hall, P. and J.L. Horowitz (1996): "Bootstrap Critical Values for Tests Based on Generalized-Method-of-Moments Estimators," Econometrica, 64, 891-916.

Hall, P., J.L. Horowitz, and B.-Y. Jing (1995): "On Blocking Rules for the Bootstrap with Dependent Data," Biometrika, 82, 561-574.

Hansen, B. (1999): "Non-Parametric Dependent Data Bootstrap for Conditional Moment Models," working paper, Department of Economics, University of Wisconsin, Madison, WI.

He, C. and T. Teräsvirta (1999): "Properties of Moments of a Family of GARCH Processes," Journal of Econometrics, 92, 173-192.

Horowitz, J.L. (1994): "Bootstrap-Based Critical Values for the Information-Matrix Test," Journal of Econometrics, 61, 395-411.

Horowitz, J.L. (1997): "Bootstrap Methods in Econometrics," in Advances in Economics and Econometrics: Theory and Applications, Seventh World Congress, ed. by D.M. Kreps and K.F. Wallis. Cambridge: Cambridge University Press, pp. 188-222.

Horowitz, J.L. (1999): "The Bootstrap in Econometrics," in Handbook of Econometrics, Vol. 5, ed. by J.J. Heckman and E.E. Leamer. Amsterdam: Elsevier, forthcoming.

Inoue, A. and M. Shintani (2000): "Bootstrapping GMM Estimators for Time Series," unpublished manuscript, Department of Agricultural and Resource Economics, North Carolina State University.

Kreiss, J.-P. (1992): "Bootstrap Procedures for AR(∞) Processes," in Bootstrapping and Related Techniques, Lecture Notes in Economics and Mathematical Systems, 376, ed. by K.H. Jöckel, G. Rothe, and W. Sendler. Heidelberg: Springer-Verlag, pp. 107-113.

Künsch, H.R. (1989): "The Jackknife and the Bootstrap for General Stationary Observations," Annals of Statistics, 17, 1217-1241.

Meyn, S.P. and R.L. Tweedie (1993): Markov Chains and Stochastic Stability. Berlin: Springer-Verlag.

Nagaev, S.V. (1961): "More Exact Statements of Limit Theorems for Homogeneous Markov Chains," Theory of Probability and Its Applications, 6, 62-81.

Newey, W.K. and K.D. West (1987): "A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix," Econometrica, 55, 703-708.

Newey, W.K. and K.D. West (1994): "Automatic Lag Selection in Covariance Matrix Estimation," Review of Economic Studies, 61, 631-653.

Paparoditis, E. (1996): "Bootstrapping Autoregressive and Moving Average Parameter Estimates of Infinite Order Vector Autoregressive Processes," Journal of Multivariate Analysis, 57, 277-296.

Paparoditis, E. and D.N. Politis (2001a): "A Markovian Local Resampling Scheme for Nonparametric Estimators in Time Series Analysis," Econometric Theory, 17, 540-566.

Paparoditis, E. and D.N. Politis (2001b): "The Local Bootstrap for Markov Processes," Journal of Statistical Planning and Inference, forthcoming.

Rajarshi, M.B. (1990): "Bootstrap in Markov-Sequences Based on Estimates of Transition Density," Annals of the Institute of Statistical Mathematics, 42, 253-268.

Silverman, B.W. (1986): Density Estimation for Statistics and Data Analysis. London: Chapman & Hall.

Zvingelis, J.J. (2000): "On Bootstrap Coverage Probability with Dependent Data," working paper, Department of Economics, University of Iowa, Iowa City, IA.

TABLE 1: RESULTS OF MONTE CARLO EXPERIMENTS

                                 Empirical Rejection Probability^a
                              U as in (5.2)              U ~ N(0,1)
Critical        Block
Value           Length    ARCH(1)   GARCH(1,1)     ARCH(1)   GARCH(1,1)

Symmetrical Tests
Asymptotic                 0.089      0.063         0.092      0.099
Block Boot.       2        0.069      0.054         0.076      0.081
                  5        0.075      0.068         0.073      0.073
                 10        0.072      0.074         0.073      0.058
MCB                        0.044      0.048         0.054      0.067

One-Sided Upper-Tail Tests
Asymptotic                 0.032      0.035         0.058      0.064
Block Boot.       2        0.049      0.042         0.067      0.110
                  5        0.085      0.052         0.085      0.091
                 10        0.087      0.074         0.092      0.056
MCB                        0.038      0.050         0.064      0.073

One-Sided Lower-Tail Tests
Asymptotic                 0.092      0.075         0.091      0.093
Block Boot.       2        0.067      0.055         0.069      0.092
                  5        0.078      0.058         0.085      0.084
                 10        0.082      0.059         0.088      0.062
MCB                        0.046      0.040         0.055      0.068

^a Standard errors of the empirical rejection probabilities are in the range 0.0025 to 0.0042, depending on the probability.