
  • 8/6/2019 3528_Bootstrap Methods for Markov Processes

    1/44

    BOOTSTRAP METHODS FOR MARKOV PROCESSES

    by

    Joel L. Horowitz

    Department of Economics

    Northwestern University

    Evanston, IL 60208-2600

    October 2002

    ABSTRACT

    The block bootstrap is the best known method for implementing the bootstrap with time-

    series data when the analyst does not have a parametric model that reduces the data generation

    process to simple random sampling. However, the errors made by the block bootstrap converge

    to zero only slightly faster than those made by first-order asymptotic approximations. This paper

    describes a bootstrap procedure for data that are generated by a (possibly higher-order) Markov

    process or by a process that can be approximated by a Markov process with sufficient accuracy.

    The procedure is based on estimating the Markov transition density nonparametrically. Bootstrap

    samples are obtained by sampling the process implied by the estimated transition density.

    Conditions are given under which the errors made by the Markov bootstrap converge to zero

    more rapidly than those made by the block bootstrap.

    ______________________________________________________________________________

I thank Kung-Sik Chan, Wolfgang Härdle, Bruce Hansen, Oliver Linton, Daniel McFadden,

    Whitney Newey, Efstathios Paparoditis, Gene Savin, and two anonymous referees for many

    helpful comments and suggestions. Research supported in part by NSF Grant SES-9910925.


    BOOTSTRAP METHODS FOR MARKOV PROCESSES

    1. INTRODUCTION

    This paper describes a bootstrap procedure for data that are generated by a (possibly

    higher-order) Markov process. The procedure is also applicable to non-Markov processes, such

    as finite-order MA processes, that can be approximated with sufficient accuracy by Markov

    processes. Under suitable conditions, the procedure is more accurate than the block bootstrap,

    which is the leading nonparametric method for implementing the bootstrap with time-series data.

    The bootstrap is a method for estimating the distribution of an estimator or test statistic

by resampling one's data or a model estimated from the data. Under conditions that hold in a

    wide variety of econometric applications, the bootstrap provides approximations to distributions

    of statistics, coverage probabilities of confidence intervals, and rejection probabilities of tests that

    are more accurate than the approximations of first-order asymptotic distribution theory. Monte

    Carlo experiments have shown that the bootstrap can spectacularly reduce the difference between

    the true and nominal probabilities that a test rejects a correct null hypothesis (hereinafter the error

    in the rejection probability or ERP). See Horowitz (1994, 1997, 1999) for examples. Similarly,

    the bootstrap can greatly reduce the difference between the true and nominal coverage

    probabilities of a confidence interval (the error in the coverage probability or ECP).

    The methods that are available for implementing the bootstrap and the improvements in

    accuracy that it achieves relative to first-order asymptotic approximations depend on whether the

    data are a random sample from a distribution or a time series. If the data are a random sample,

    then the bootstrap can be implemented by sampling the data randomly with replacement or by

    sampling a parametric model of the distribution of the data. The distribution of a statistic is

    estimated by its empirical distribution under sampling from the data or parametric model

    (bootstrap sampling). To summarize important properties of the bootstrap when the data are a

    random sample, let n be the sample size and T be a statistic that is asymptotically distributed as

    N(0,1) (e.g., a t statistic for testing a hypothesis about a slope parameter in a linear regression

    model). Then the following results hold under regularity conditions that are satisfied by a wide

    variety of econometric models. See Hall (1992) for details.

1. The error in the bootstrap estimate of the one-sided probability P(T_n ≤ z) is O(n^{−1}),

whereas the error made by first-order asymptotic approximations is O(n^{−1/2}).

2. The error in the bootstrap estimate of the symmetrical probability P(|T_n| ≤ z) is

O(n^{−3/2}), whereas the error made by first-order approximations is O(n^{−1}).


3. When the critical value of a one-sided hypothesis test is obtained by using the

bootstrap, the ERP of the test is O(n^{−1}), whereas it is O(n^{−1/2}) when the critical value is obtained

from first-order approximations. The same result applies to the ECP of a one-sided confidence

interval. In some cases, the bootstrap can reduce the ERP of a one-sided test to O(n^{−3/2}) (Hall

1992, p. 178; Davidson and MacKinnon 1999).

4. When the critical value of a symmetrical hypothesis test is obtained by using the

bootstrap, the ERP of the test is O(n^{−2}), whereas it is O(n^{−1}) when the critical value is obtained

from first-order approximations. The same result applies to the ECP of a symmetrical confidence

interval.

    The practical consequence of these results is that the ERPs of tests and ECPs of

    confidence intervals based on the bootstrap are often substantially smaller than ERPs and ECPs

    based on first-order asymptotic approximations. These benefits are available with samples of the

    sizes encountered in applications (Horowitz 1994, 1997, 1999).
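For concreteness, the symmetrical percentile-t procedure summarized in results 2 and 4 can be sketched as follows for a random sample (a minimal Python illustration, not code from the paper; the function name and the choice of 999 bootstrap draws are the sketch's own):

```python
import numpy as np

def percentile_t_bootstrap(x, theta0, n_boot=999, seed=0):
    """Symmetrical percentile-t bootstrap test of H0: E(X) = theta0.

    Returns the t statistic and the bootstrap critical value for a
    nominal 5%-level symmetrical test.
    """
    rng = np.random.default_rng(seed)
    n = len(x)
    t_stat = np.sqrt(n) * (x.mean() - theta0) / x.std(ddof=1)
    t_boot = np.empty(n_boot)
    for b in range(n_boot):
        # Resample the data randomly with replacement, and center the
        # bootstrap t statistic at the sample mean rather than theta0.
        xb = rng.choice(x, size=n, replace=True)
        t_boot[b] = np.sqrt(n) * (xb.mean() - x.mean()) / xb.std(ddof=1)
    z_crit = np.quantile(np.abs(t_boot), 0.95)  # 1 - alpha quantile of |T*|
    return t_stat, z_crit
```

Rejecting H0 when the returned t statistic exceeds the bootstrap critical value in absolute value gives the symmetrical test whose ERP is O(n^{−2}).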

    The situation is more complicated when the data are a time series. To obtain asymptotic

    refinements, bootstrap sampling must be carried out in a way that suitably captures the

    dependence structure of the data generation process (DGP). If a parametric model is available

    that reduces the DGP to independent random sampling (e.g., an ARMA model), then the results

    summarized above continue to hold under appropriate regularity conditions. See, for example,

    Andrews (1999), Bose (1988), and Bose (1990). If a parametric model is not available, then the

    best known method for generating bootstrap samples consists of dividing the data into blocks and

    sampling the blocks randomly with replacement. This is called the block bootstrap. The blocks,

    whose lengths increase with increasing size of the estimation data set, may be non-overlapping

(Carlstein 1986, Hall 1985) or overlapping (Hall 1985, Künsch 1989). Regardless of the method

    that is used, blocking distorts the dependence structure of the data and, thereby, increases the

    error made by the bootstrap. The main results are that under regularity conditions and when the

    block length is chosen optimally:

1. The errors in the bootstrap estimates of one-sided and symmetrical probabilities are

almost surely O_p(n^{−3/4}) and O_p(n^{−6/5}), respectively (Hall et al., 1995).

2. The ECPs (ERPs) of one-sided and symmetrical confidence intervals (tests) are

O(n^{−3/4}) and O(n^{−5/4}), respectively (Zvingelis 2000).

    Thus, the errors made by the block bootstrap converge to zero at rates that are slower

    than those of the bootstrap based on data that are a random sample. Monte Carlo results have

    confirmed this disappointing performance of the block bootstrap (Hall and Horowitz 1996).
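The non-overlapping blocking scheme of Carlstein (1986) can be sketched as follows (illustrative Python, not code from the paper; the function name is the sketch's own):

```python
import numpy as np

def block_bootstrap_sample(x, block_len, rng):
    """Generate one block-bootstrap series by sampling non-overlapping
    blocks of consecutive observations with replacement."""
    n = len(x)
    n_blocks = n // block_len
    # Partition the (truncated) series into consecutive blocks.
    blocks = x[: n_blocks * block_len].reshape(n_blocks, block_len)
    # Sample blocks randomly with replacement and concatenate them.
    idx = rng.integers(0, n_blocks, size=n_blocks)
    return blocks[idx].reshape(-1)
```

Because the joins between resampled blocks break the dependence between observations in different blocks, this scheme distorts the dependence structure of the DGP, which is the source of the slow error rates cited above.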


    The relatively poor performance of the block bootstrap has led to a search for other ways

to implement the bootstrap with dependent data. Bühlmann (1997, 1998), Choi and Hall (2000),

Kreiss (1992), and Paparoditis (1996) have proposed a sieve bootstrap for linear processes (that

is, AR, vector AR, or invertible MA processes of possibly infinite order). In the sieve bootstrap,

the DGP is approximated by an AR(p) model in which p increases with increasing sample size.

Bootstrap samples are generated by the estimated AR(p) model. Choi and Hall (2000) have

shown that the ECP of a one-sided confidence interval based on the sieve bootstrap is O(n^{−1+ε})

for any ε > 0, which is only slightly larger than the ECP of O(n^{−1}) that is available when the data

are a random sample. This result is encouraging, but its practical utility is limited. If a process

has a finite-order ARMA representation, then the ARMA model can be used to reduce the DGP

to random sampling from some distribution. Standard methods can be used to implement the

bootstrap, and the sieve bootstrap is not needed. Sieve methods have not been developed for nonlinear processes such as nonlinear autoregressive, ARCH, and GARCH processes.

    The bootstrap procedure described in this paper applies to a linear or nonlinear DGP that

    is a (possibly higher-order) Markov process or can be approximated by one with sufficient

    accuracy. The procedure is based on estimating the Markov transition density nonparametrically.

    Bootstrap samples are obtained by sampling the process implied by the estimated transition

    density. This procedure will be called the Markov conditional bootstrap (MCB). Conditions are

    given under which:

1. The errors in the MCB estimates of one-sided and symmetrical probabilities are almost

surely O(n^{−1+ε}) and O(n^{−3/2+ε}), respectively, for any ε > 0.

2. The ERPs (ECPs) of one-sided and symmetrical tests (confidence intervals) based on

the MCB are O(n^{−1+ε}) and O(n^{−3/2+ε}), respectively, for any ε > 0.

    Thus, under the conditions that are given here, the errors made by the MCB converge to

zero more rapidly than those made by the block bootstrap. Moreover, for one-sided probabilities,

    symmetrical probabilities, and one-sided confidence intervals and tests, the errors made by the

    MCB converge only slightly less rapidly than those made by the bootstrap for data that are

    sampled randomly from a distribution.

    The conditions required to obtain these results are stronger than those required to obtain

    asymptotic refinements with the block bootstrap. If the required conditions are not satisfied, then

    the errors made by the MCB may converge more slowly than those made by the block bootstrap.

    Moreover, as will be explained in Section 3.2, the MCB suffers from a form of the curse of

    dimensionality of nonparametric estimation. A large data set (e.g., high-frequency financial data)


    is likely to be needed to obtain good performance if the DGP is a high-dimension vector process

    or a high-order Markov process. Thus, the MCB is not a replacement for the block bootstrap.

    The MCB is, however, an attractive alternative to the block bootstrap when the conditions needed

    for good performance of the MCB are satisfied.

    There have been several previous investigations of the MCB. Rajarshi (1990) gave

    conditions under which the MCB consistently estimates the asymptotic distribution of a statistic.

    Datta and McCormick (1995) gave conditions under which the error in the MCB estimator of the

distribution function of a normalized sample average is almost surely o(n^{−1/2}). Hansen (1999)

    proposed using an empirical likelihood estimator of the Markov transition probability but did not

    prove that the resulting version of the MCB is consistent or provides asymptotic refinements.

    Chan and Tong (1998) proposed using the MCB in a test for multimodality in the distribution of

    dependent data. Paparoditis and Politis (2001a, 2001b) proposed estimating the Markov

    transition probability by resampling the data in a suitable way. No previous authors have

    evaluated the ERP or ECP of the MCB or compared its accuracy to that of the block bootstrap.

    Thus, the results presented here go well beyond those of previous investigators.


    The MCB is described informally in Section 2 of this paper. Section 3 presents regularity

    conditions and formal results for data that are generated by a Markov process. Section 4 extends

    the MCB to generalized method of moments (GMM) estimators and approximate Markov

    processes. Section 5 presents the results of a Monte Carlo investigation of the numerical

    performance of the MCB. Section 6 presents concluding comments. The proofs of theorems are

    in the Appendix.

    2. INFORMAL DESCRIPTION OF THE METHOD

    This section describes the MCB procedure for data that are generated by a Markov

process and provides an informal summary of the main results of the paper. For any integer j, let

X_j ∈ R^d (d ≥ 1) be a continuously distributed random variable. Let {X_j : j = 1, 2, ..., n} be a

realization of a strictly stationary, qth-order Markov process. Thus,

P(X_j ≤ x_j | X_{j−1} = x_{j−1}, X_{j−2} = x_{j−2}, ...) = P(X_j ≤ x_j | X_{j−1} = x_{j−1}, ..., X_{j−q} = x_{j−q})

almost surely for d-vectors x_{j−1}, x_{j−2}, ... and some finite integer q ≥ 1. It is assumed that q is

known. Cheng and Tong (1992) show how to estimate q. In addition, for technical reasons that

are discussed further in Section 3.1, it is assumed that X_j has bounded support and that

cov(X_j, X_{j+k}) = 0 if k > M for some M < ∞. Define μ = E(X_1) and m = n^{−1} Σ_{j=1}^{n} X_j.


    2.1. Statement of the Problem

    The problem addressed in the remainder of this section and in Section 3 is to carry out

inference based on a Studentized statistic, T_n, whose form is

(2.1)  T_n = n^{1/2}[H(m) − H(μ)]/s_n,

where H is a sufficiently smooth, scalar-valued function, s_n^2 is a consistent estimator of the

variance of the asymptotic distribution of n^{1/2}[H(m) − H(μ)], and T_n →d N(0, 1) as n → ∞.

The objects of interest are (1) the probabilities P(T_n ≤ t) and P(|T_n| ≤ t) for any finite, scalar t,

(2) the probability that a test based on T_n rejects the correct hypothesis H_0: H[E(X)] = H(μ),

and (3) the coverage probabilities of confidence intervals for H(μ) that are based on T_n. To

avoid repetitive arguments, only probabilities and symmetrical hypothesis tests are treated

explicitly. An α-level symmetrical test based on T_n rejects H_0 if |T_n| > z_{nα}, where z_{nα} is the

α-level critical value. Arguments similar to those made in this section and Section 4 can be used

to obtain the results stated in the introduction for one-sided tests and for confidence intervals

based on the MCB.

    The focus on statistics of the form (2.1) with a continuously distributed X may appear to

    be restrictive, but this appearance is misleading. A wide variety of statistics that are important in

    applications can be approximated with negligible error by statistics of the form (2.1). In

particular, as will be explained in Section 4.1, t statistics for testing hypotheses about parameters

    estimated by GMM can be approximated this way.1

    2.2. The MCB Procedure

Consider the problem of estimating P(T_n ≤ z), P(|T_n| ≤ z), or z_{nα}. For any integer

j > q, define Y_j = (X_{j−1}, ..., X_{j−q}). Let p_y denote the probability density function of

Y_{q+1} = (X_q, ..., X_1). Let f denote the probability density function of X_j conditional on Y_j. If f

and p_y were known, then P(T_n ≤ z) and P(|T_n| ≤ z) could be estimated as follows:

1. Draw Y_{q+1} = (X_q, ..., X_1) from the distribution whose density is p_y. Draw X_{q+1}

from the distribution whose density is f(·|Y_{q+1}). Set Y_{q+2} = (X_{q+1}, ..., X_2).

2. Having obtained Y_j = (X_{j−1}, ..., X_{j−q}) for any j ≥ q + 2, draw X_j from the

distribution whose density is f(·|Y_j). Set Y_{j+1} = (X_j, ..., X_{j−q+1}).

3. Repeat step 2 until a simulated data series {X̃_j : j = 1, ..., n} has been obtained.

Compute μ as (say) μ = ∫ x_1 p_y(x_q, ..., x_1) dx_1 ... dx_q. Then compute a simulated test statistic T̃_n by substituting the simulated data into (2.1).

4. Estimate P(T_n ≤ z) (P(|T_n| ≤ z)) from the empirical distribution of T̃_n (|T̃_n|) that is

obtained by repeating steps 1-3 many times. Estimate z_{nα} by the 1 − α quantile of the empirical

distribution of |T̃_n|.

This procedure cannot be implemented in an application because f and p_y are unknown.

The MCB replaces f and p_y with kernel nonparametric estimators. To obtain the estimators, let

K_f be a kernel function (in the sense of nonparametric density estimation) of a d(q + 1)-

dimensional argument. Let K_p be a kernel function of a dq-dimensional argument. Let

{h_n : n = 1, 2, ...} be a sequence of positive constants (bandwidths) such that h_n → 0 as n → ∞.

Conditions that K_f, K_p, and {h_n} must satisfy are given in Section 3. For x ∈ R^d, y ∈ R^{dq},

and z = (x, y), define

p_{nz}(x, y) = {1/[(n − q) h_n^{d(q+1)}]} Σ_{j=q+1}^{n} K_f( (x − X_j)/h_n, (y − Y_j)/h_n )

and

p_{ny}(y) = {1/[(n − q) h_n^{dq}]} Σ_{j=q+1}^{n} K_p( (y − Y_j)/h_n ).

The estimators of p_y and f, respectively, are p_{ny} and

(2.2)  f_n(x|y) = p_{nz}(x, y)/p_{ny}(y).
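A minimal sketch of the estimators p_{nz}, p_{ny}, and f_n for the scalar first-order case d = q = 1, using a second-order Epanechnikov kernel with the support and symmetry properties required in Section 3 (illustrative Python; the paper specifies only the kernels' properties, not code):

```python
import numpy as np

def K(v):
    """Second-order Epanechnikov kernel: support [-1, 1], symmetric."""
    return np.where(np.abs(v) <= 1.0, 0.75 * (1.0 - v ** 2), 0.0)

def transition_density_estimate(x, h):
    """Kernel estimators p_nz, p_ny and f_n(x|y) = p_nz(x, y)/p_ny(y)
    for a first-order scalar process (d = 1, q = 1)."""
    xj, yj = x[1:], x[:-1]          # pairs (X_j, Y_j) with Y_j = X_{j-1}
    m = len(xj)

    def p_nz(u, v):   # joint density estimate at (x, y) = (u, v)
        return np.sum(K((u - xj) / h) * K((v - yj) / h)) / (m * h ** 2)

    def p_ny(v):      # marginal density estimate at y = v
        return np.sum(K((v - yj) / h)) / (m * h)

    def f_n(u, v):    # estimated transition density f_n(x|y)
        return p_nz(u, v) / p_ny(v)

    return f_n, p_ny
```

For each fixed y with p_{ny}(y) > 0, the estimated conditional density integrates to one over x, which is what makes sampling from it in the MCB well defined.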

The MCB estimates P(T_n ≤ z), P(|T_n| ≤ z), and z_{nα} by repeatedly sampling the Markov

process generated by the transition density f_n(x|y). However, f_n(x|y) is an inaccurate

estimator of f(x|y) in regions where p_y(y) is close to zero. To obtain the asymptotic

refinements described in Section 1, it is necessary to avoid such regions. Here, this is done by

truncating the MCB sample. To carry out the truncation, let C_n = {y : p_{ny}(y) ≥ δ_n}, where

δ_n > 0 for each n = 1, 2, ... and δ_n → 0 as n → ∞ at a rate that is specified in Section 3.1.

Having obtained realizations X*_1, ..., X*_{j−1} (j ≥ q + 1) from the Markov process induced by


f_n(x|y), the MCB retains a realization of X*_j only if (X*_j, ..., X*_{j−q+1}) ∈ C_n. Thus, the MCB

proceeds as follows:


MCB 1. Draw Y*_{q+1} = (X*_q, ..., X*_1) from the distribution whose density is p_{ny}. Retain

Y*_{q+1} if Y*_{q+1} ∈ C_n. Otherwise, discard the current Y*_{q+1} and draw a new one. Continue this

process until a Y*_{q+1} ∈ C_n is obtained.

MCB 2. Having obtained Y*_j = (X*_{j−1}, ..., X*_{j−q}) for any j ≥ q + 1, draw X*_j from the

distribution whose density is f_n(·|Y*_j). Retain X*_j and set Y*_{j+1} = (X*_j, ..., X*_{j−q+1}) if

(X*_j, ..., X*_{j−q+1}) ∈ C_n. Otherwise, discard the current X*_j and draw a new one. Continue this

process until an X*_j is obtained for which (X*_j, ..., X*_{j−q+1}) ∈ C_n.

MCB 3. Repeat step 2 until a bootstrap data series {X*_j : j = 1, ..., n} has been obtained.

Compute the bootstrap test statistic T*_n = n^{1/2}[H(m*) − H(μ*)]/s*_n, where m* = n^{−1} Σ_{j=1}^{n} X*_j, μ* is

the mean of X* relative to the distribution induced by the sampling procedure of steps MCB 1

and MCB 2 (bootstrap sampling), and s*_n^2 is an estimator of the variance of the asymptotic

distribution of n^{1/2}[H(m*) − H(μ*)] under bootstrap sampling.

MCB 4. Estimate P(T_n ≤ z) (P(|T_n| ≤ z)) from the empirical distribution of T*_n (|T*_n|)

that is obtained by repeating steps 1-3 many times. Estimate z_{nα} by the 1 − α quantile of the

empirical distribution of |T*_n|. Denote this estimator by z*_{nα}.

A symmetrical test of H_0 based on T_n and the bootstrap critical value z*_{nα} rejects at the

nominal α level if |T_n| > z*_{nα}.
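Steps MCB 1-MCB 2, including the truncation to C_n, can be sketched for d = q = 1 as follows (illustrative Python, not code from the paper; a Gaussian kernel is used here purely because it makes exact sampling from the kernel estimate easy, whereas Assumption 5 of Section 3 uses compact-support kernels):

```python
import numpy as np

def mcb_sample(x, h, delta, n_out, rng):
    """One Markov-conditional-bootstrap series (d = 1, q = 1), sampling the
    process implied by the kernel transition-density estimate and retaining
    a draw only if its conditioning point stays in C_n = {y: p_ny(y) >= delta}."""
    xj, yj = x[1:], x[:-1]

    def p_ny(v):  # Gaussian-kernel estimate of the density of Y_j
        return (np.mean(np.exp(-0.5 * ((v - yj) / h) ** 2))
                / (h * np.sqrt(2.0 * np.pi)))

    # MCB 1: draw the initial point from p_ny, retaining it only if it
    # lies in C_n.
    while True:
        y = rng.choice(yj) + h * rng.normal()
        if p_ny(y) >= delta:
            break

    out = [y]
    while len(out) < n_out:
        # MCB 2: draw X* from f_n(.|y).  With product kernels this is
        # exact: pick a data pair j with weight proportional to
        # K((y - Y_j)/h), then add kernel noise to X_j.
        w = np.exp(-0.5 * ((y - yj) / h) ** 2)
        j = rng.choice(len(xj), p=w / w.sum())
        x_new = xj[j] + h * rng.normal()
        if p_ny(x_new) >= delta:   # retain only if the new state is in C_n
            out.append(x_new)
            y = x_new
    return np.asarray(out)
```

Repeating this sampler many times, computing T*_n on each series, and taking the 1 − α quantile of |T*_n| implements steps MCB 3-MCB 4.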

    2.3. Properties of the MCB

    This section presents an informal summary of the main results of the paper and of the

arguments that lead to them. The results are stated formally in Section 3. Let ‖·‖ denote the

Euclidean norm. Let P* denote the probability measure induced by the MCB sampling procedure

(steps MCB 1 and MCB 2) conditional on the data {X_j : j = 1, ..., n}. Let any ε > 0 be given.

The main results are that under regularity conditions stated in Section 3.1:

(2.3)  sup_z |P*(T*_n ≤ z) − P(T_n ≤ z)| = O(n^{−1+ε})


    almost surely,

(2.4)  sup_z |P*(|T*_n| ≤ z) − P(|T_n| ≤ z)| = O(n^{−3/2+ε})

almost surely, and

(2.5)  P(|T_n| > z*_{nα}) = α + O(n^{−3/2+ε}).

These results may be contrasted with the analogous ones for the block bootstrap. The

block bootstrap with optimal block lengths yields O_p(n^{−3/4}), O_p(n^{−6/5}), and O(n^{−5/4}) for the

right-hand sides of (2.3)-(2.5), respectively (Hall et al. 1995, Zvingelis 2000). Therefore, the

MCB is more accurate than the block bootstrap under the regularity conditions of Section 3.1.

These results are obtained by carrying out Edgeworth expansions of P(T_n ≤ z) and

P*(T*_n ≤ z). Additional notation is needed to describe the expansions. Let χ_n = {X_j : j = 1, ..., n}

denote the data. Let Φ and φ, respectively, denote the standard normal distribution function and

density. The jth cumulant of T_n (j ≤ 4) has the form n^{−1/2} κ_j + o(n^{−1/2}) if j is odd and

I(j = 2) + n^{−1} κ_j + o(n^{−1}) if j is even, where κ_j is a constant (Hall 1992, p. 46). Define

κ = (κ_1, ..., κ_4). Conditional on χ_n, the jth cumulant of T*_n almost surely has the form

n^{−1/2} κ_{nj} + o(n^{−1/2}) if j is odd and I(j = 2) + n^{−1} κ_{nj} + o(n^{−1}) if j is even. The quantities κ_{nj}

depend on χ_n. They are nonstochastic relative to bootstrap sampling but are random variables

relative to the stochastic process that generates χ_n. Define κ_n = (κ_{n1}, ..., κ_{n4}).

Under the regularity conditions of Section 3.1, P(T_n ≤ z) has the Edgeworth expansion

(2.6)  P(T_n ≤ z) = Φ(z) + Σ_{j=1}^{2} n^{−j/2} q_j(z, κ) φ(z) + O(n^{−3/2})

uniformly over z, where q_j(z, κ) is a polynomial function of z for each κ, a continuously

differentiable function of the components of κ for each z, an even function of z if j = 1, and an

odd function of z if j = 2. Moreover, P(|T_n| ≤ z) has the expansion

(2.7)  P(|T_n| ≤ z) = 2Φ(z) − 1 + 2n^{−1} q_2(z, κ) φ(z) + O(n^{−3/2})

uniformly over z. Conditional on χ_n, the bootstrap probabilities P*(T*_n ≤ z) and P*(|T*_n| ≤ z) have

the expansions

(2.8)  P*(T*_n ≤ z) = Φ(z) + Σ_{j=1}^{2} n^{−j/2} q_j(z, κ_n) φ(z) + O(n^{−3/2})


    and

(2.9)  P*(|T*_n| ≤ z) = 2Φ(z) − 1 + 2n^{−1} q_2(z, κ_n) φ(z) + O(n^{−3/2})

uniformly over z almost surely. Therefore,

(2.10)  |P*(T*_n ≤ z) − P(T_n ≤ z)| = O(n^{−1/2} ‖κ_n − κ‖) + O(n^{−1})

and

(2.11)  |P*(|T*_n| ≤ z) − P(|T_n| ≤ z)| = O(n^{−1} ‖κ_n − κ‖) + O(n^{−3/2})

almost surely uniformly over z. Under the regularity conditions of Section 3.1,

(2.12)  ‖κ_n − κ‖ = O(n^{−1/2+ε})

almost surely for any ε > 0. Results (2.3)-(2.4) follow by substituting (2.12) into (2.10)-(2.11).

To obtain (2.5), observe that P*(|T*_n| ≤ z*_{nα}) = P(|T_n| ≤ z_{nα}) = 1 − α. It follows from (2.7)

and (2.9) that

(2.13)  2Φ(z_{nα}) − 1 + 2n^{−1} q_2(z_{nα}, κ) φ(z_{nα}) = 1 − α + O(n^{−3/2})

and

(2.14)  2Φ(z*_{nα}) − 1 + 2n^{−1} q_2(z*_{nα}, κ_n) φ(z*_{nα}) = 1 − α + O(n^{−3/2})

almost surely. Let v_α denote the 1 − α/2 quantile of the N(0, 1) distribution. Then Cornish-

Fisher inversions of (2.13) and (2.14) (e.g., Hall 1992, pp. 88-89) give

(2.15)  z_{nα} = v_α − n^{−1} q_2(v_α, κ) + O(n^{−3/2})

and

(2.16)  z*_{nα} = v_α − n^{−1} q_2(v_α, κ_n) + O(n^{−3/2})

almost surely. Therefore,

(2.17)  P(|T_n| > z*_{nα}) = P{|T_n| > z_{nα} + n^{−1}[q_2(v_α, κ) − q_2(v_α, κ_n)] + O(n^{−3/2})}

(2.18)                    = P[|T_n| > z_{nα} + O(n^{−1} ‖κ_n − κ‖) + O(n^{−3/2})].

Result (2.5) follows by applying (2.12) to the right-hand side of (2.18).

    3. MAIN RESULTS

    This section presents theorems that formalize results (2.3)-(2.5).


3.1 Assumptions

    Results (2.3)-(2.5) are established under assumptions that are stated in this section. The

proof of the validity of the Edgeworth expansions (2.6)-(2.9) relies on a theorem of Götze and

    Hipp (1983) and requires certain restrictive assumptions. See Assumption 4 below. It is likely

    that the expansions are valid under weaker assumptions, but proving this conjecture is beyond the

    scope of this paper. The results of this section hold under weaker assumptions if the Edgeworth

    expansions remain valid.

The following additional notation is used. Let p_z denote the probability density function

of Z_{q+1} = (X_{q+1}, Y_{q+1}). Let E* denote the expectation with respect to the distribution induced by

bootstrap sampling (steps MCB 1 and MCB 2 of Section 2.2). Define Ω_n = nE*[(m* − μ*)(m* − μ*)′]

and s̃_n^2 = ∂H(m*)′ Ω_n ∂H(m*). For reasons that are explained later in this section, it is assumed that

E[(X_1 − μ)(X_{1+j} − μ)′] = 0 if j ≥ M for some integer M < ∞. Define

Σ_n = (n − M)^{−1} Σ_{i=1}^{n−M} { (X_i − m)(X_i − m)′ + Σ_{j=1}^{M} [(X_i − m)(X_{i+j} − m)′ + (X_{i+j} − m)(X_i − m)′] },

Σ*_n = (n − M)^{−1} Σ_{i=1}^{n−M} { (X*_i − m*)(X*_i − m*)′ + Σ_{j=1}^{M} [(X*_i − m*)(X*_{i+j} − m*)′ + (X*_{i+j} − m*)(X*_i − m*)′] },

Σ̃_n = E*{ (X*_1 − μ*)(X*_1 − μ*)′ + Σ_{j=1}^{M} [(X*_1 − μ*)(X*_{1+j} − μ*)′ + (X*_{1+j} − μ*)(X*_1 − μ*)′] },

and τ_n^2 = ∂H(m*)′ Σ̃_n ∂H(m*)/s̃_n^2. Note that τ_n^2 can be evaluated in an application because the

bootstrap DGP is known. For any δ > 0, define C_δ = {x ∈ R^d : p_y(x, x_1, ..., x_{q−1}) ≥ δ

for some x_1, ..., x_{q−1}}. Let B(C_δ) denote the measurable subsets of C_δ. Let P^{(k)}(x, A) denote the

k-step transition probability from a point x ∈ C_δ to a set A ∈ B(C_δ).

Assumption 1: {X_j : j = 1, 2, ..., n; X_j ∈ R^d} is a realization of a strictly stationary, qth-

order Markov process that is geometrically strongly mixing (GSM).2

Assumption 2: (i) The distribution of Z_{q+1} is absolutely continuous with respect to

Lebesgue measure. (ii) For t ∈ R^d and each k such that 0 ≤ k ≤ q,

(3.1)  lim sup_{‖t‖→∞} |E exp[it′(X_j + ⋯ + X_{j+k})]| < 1.


Assumption 3: (i) H is three times continuously differentiable in a neighborhood of μ.

(ii) The gradient of H is non-zero in a neighborhood of μ.

Assumption 4: (i) X_j has bounded support. (ii) For all sufficiently small δ > 0, some

ρ > 0, and some integer k > 0,

sup { |P^{(k)}(x, A) − P^{(k)}(x′, A)| : x, x′ ∈ C_δ, A ∈ B(C_δ) } ≤ 1 − ρ.

(iii) s_n^2 = ∂H(m)′ Σ_n ∂H(m). (iv) s*_n^2 = τ_n^2 ∂H(m*)′ Σ*_n ∂H(m*). (v) For δ_n as

in Assumption 6, P[p_{ny}(Y_{q+1}) < δ_n | Y_q = y] = o[(log n)^{−1}] as n → ∞ uniformly over y such that

p_y(y) ≥ δ_n.

Let K be a bounded, continuous function whose support is [−1, 1] and that is symmetrical about 0. For some integer r ≥ 2 and each integer j = 0, 1, ..., r, let K satisfy

∫_{−1}^{1} v^j K(v) dv = 1 if j = 0;  0 if 1 ≤ j ≤ r − 1;  B_K (nonzero) if j = r.

For any integer J > 0, let W^{(j)} (j = 1, ..., J) denote the jth component of the vector W ∈ R^J.

Assumption 5: Let v_f ∈ R^{d(q+1)} and v_p ∈ R^{dq}. K_f and K_p have the forms

K_f(v_f) = Π_{j=1}^{d(q+1)} K(v_f^{(j)});   K_p(v_p) = Π_{j=1}^{dq} K(v_p^{(j)}).

Assumption 6: (i) Let γ = 1/[2r + d(q + 1)]. Then h_n = c_h n^{−γ} for some finite constant

c_h > 0. (ii) δ_n → 0 slowly enough that n h_n^{dq} δ_n^2 (log n)^{−2} → ∞.

Assumptions 2(i), 2(ii), and 4 are used to insure the validity of the Edgeworth expansions

(2.6)-(2.9). Condition (3.1) is a dependent-data version of the Cramér condition. See Götze and

Hipp (1983). Assumptions 2(iii) and 2(iv) insure sufficiently rapid convergence of ‖κ_n − κ‖.

Assumptions 4(i)-4(ii) are used to show that the bootstrap DGP is GSM. The GSM property is used to prove the validity of the Edgeworth expansions (2.8)-(2.9). The results of this

paper hold when X_j has unbounded support if the expansions (2.8)-(2.9) are valid and p_y(y)

decreases at an exponentially fast rate as ‖y‖ → ∞.3 Assumption 4(iii) is used to insure the

validity of the Edgeworth expansions (2.6)-(2.9). It is needed because the known conditions for

    the validity of these expansions apply to statistics that are functions of sample moments. Under


4(iii)-4(iv), s_n and T_n are functions of sample moments of X_j. This is not the case if T_n is

Studentized with a kernel-type variance estimator (e.g., Andrews 1991; Andrews and Monahan

1992; Newey and West 1987, 1994). However, under Assumption 1, the smoothing parameter of

a kernel variance estimator can be chosen so that the estimator is n^{1/2−ε}-consistent for any ε > 0.

Therefore, if the required Edgeworth expansions are valid, the results established here hold when

T_n is Studentized with a kernel variance estimator.4
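For reference, the best-known kernel-type variance estimator of the kind mentioned here is the Bartlett-weighted estimator of Newey and West (1987). A minimal sketch for a scalar series (illustrative Python, not code from the paper; the truncation lag M is a user choice):

```python
import numpy as np

def newey_west_lrv(u, M):
    """Bartlett-kernel (Newey-West 1987) long-run variance estimator for a
    scalar series u, truncated at lag M."""
    u = u - u.mean()
    n = len(u)
    lrv = np.dot(u, u) / n                # lag-0 autocovariance
    for j in range(1, M + 1):
        w = 1.0 - j / (M + 1.0)           # Bartlett weights
        gamma_j = np.dot(u[j:], u[:-j]) / n
        lrv += 2.0 * w * gamma_j
    return lrv
```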


The quantity τ_n^2 in s*_n^2 is a correction factor analogous to that used by Hall and Horowitz

(1996) and Andrews (1999, 2002). It is needed because ∂H(m*)′ Σ*_n ∂H(m*) is a biased estimator

of the variance of the asymptotic distribution of n^{1/2}[H(m*) − H(μ*)]. The bias converges to zero

too slowly to enable the MCB to achieve the asymptotic refinements described in Section 2.3.

The correction factor removes the bias without distorting higher-order terms of the Edgeworth

expansion of the CDF of T*_n.

Assumption 4(v) restricts the tail thickness of the transition density function.5

Assumption 5 specifies convenient forms for K_f and K_p, which may be higher-order kernels.

Higher-order kernels are used to insure sufficiently rapid convergence of ‖κ_n − κ‖.

    3.2 Theorems

    This section gives theorems that establish conditions under which results (2.3)-(2.5) hold.

The bootstrap is implemented by carrying out steps MCB 1-MCB 4. Let β = r/[2r + d(q + 1)].

Theorem 3.1: Let Assumptions 1-6 hold. Then for every ε > 0,

(3.2)  sup_z |P*(T*_n ≤ z) − P(T_n ≤ z)| = O(n^{−1/2−β+ε})

and

(3.3)  sup_z |P*(|T*_n| ≤ z) − P(|T_n| ≤ z)| = O(n^{−1−β+ε})

almost surely.

Theorem 3.2: Let Assumptions 1-6 hold. For any α ∈ (0, 1), let z*_{nα} satisfy

P*(|T*_n| > z*_{nα}) = α. Then for every ε > 0,

(3.4)  P(|T_n| > z*_{nα}) = α + O(n^{−1−β+ε}).

Theorems 3.1 and 3.2 imply that results (2.3)-(2.5) hold if r > d(q + 1)/(4ε). With

the block bootstrap, the right-hand sides of (3.2)-(3.4) are O_p(n^{−3/4}), O_p(n^{−6/5}), and O(n^{−5/4}),


respectively. The errors made by the MCB converge to zero more rapidly than do those of the block bootstrap if s is sufficiently large. With the MCB, the right-hand side of (3.2) is o(n^(−3/4)) if s > d(q + 1)/2, the right-hand side of (3.3) is o(n^(−6/5)) if s > d(q + 1)/3, and the right-hand side of (3.4) is o(n^(−5/4)) if s > d(q + 1)/2. However, the errors of the MCB converge more slowly than do those of the block bootstrap if the distribution of Z is not sufficiently smooth (s is too small). Moreover, the MCB suffers from a form of the curse of dimensionality in nonparametric estimation. That is, with a fixed value of s, the accuracy of the MCB decreases as d and q increase. Thus, the MCB, like all nonparametric estimators, is likely to be most attractive in applications where d and q are not large. It is possible that this problem can be mitigated, though at the cost of imposing additional structure on the DGP, through the use of dimension reduction methods. For example, many familiar time series DGPs can be represented as single-index models or nonparametric additive models with a possibly unknown link function. However, investigation of dimension reduction methods for the MCB is beyond the scope of this paper.
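To make the rate comparison concrete, here is a small numerical sketch (ours, not the paper's) of the exponent ζ = s/[2s + d(q + 1)] and of the condition s > d(q + 1)/2 under which the one-sided MCB error o(n^(−1/2−ζ)) improves on the block bootstrap's O(n^(−3/4)); the function names are illustrative only.

```python
def zeta(s, d, q):
    """Rate-exponent gain of the MCB for smoothness s, dimension d, Markov order q."""
    return s / (2.0 * s + d * (q + 1))

def mcb_beats_block_one_sided(s, d, q):
    """True when 1/2 + zeta > 3/4, which is equivalent to s > d*(q + 1)/2."""
    return 0.5 + zeta(s, d, q) > 0.75

# Smooth scalar first-order Markov process: s = 4, d = 1, q = 1.
print(zeta(4, 1, 1))                        # -> 0.4
print(mcb_beats_block_one_sided(4, 1, 1))   # -> True
# Curse of dimensionality: same smoothness but d = 3, q = 2.
print(mcb_beats_block_one_sided(4, 3, 2))   # -> False
```

Note that ζ → 1/2 as s → ∞, which is the source of the faster AMCB rates in Section 4.2.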


    4. EXTENSIONS

    Section 4.1 extends the results of Section 3 to tests based on GMM estimators. Section

    4.2 presents the extension to approximate Markov processes.

    4.1 Tests Based on GMM Estimators

This section gives conditions under which (3.2)-(3.4) hold for the t statistic for testing a hypothesis about a parameter that is estimated by GMM. The main task is to show that the

probability distribution of the GMM t statistic can be approximated with sufficient accuracy by

    the distribution of a statistic of the form (2.1). Hall and Horowitz (1996) and Andrews (1999,

    2002) use similar approximations to show that the block bootstrap provides asymptotic

refinements for t tests based on GMM estimators.

Denote the sample by {X_j: j = 1, ..., n}. In this section, some components of X_j may be discrete, but there must be at least one continuous component. See Assumption 9 below. Suppose that the GMM moment conditions depend on up to L lags of X_j. Define X̃_j = (X_j, ..., X_{j+L}) for some fixed integer L ≥ 0 and j = 1, ..., n − L. Let X̃ denote a random vector that is distributed as X̃_1. Estimation of the parameter θ is based on the moment condition E G(X̃, θ) = 0, where G is a known L_G × 1 function and L_G ≥ L_θ, the dimension of θ. Let θ_0 denote the


true but unknown value of θ. Assume that E[G(X̃_i, θ_0)G(X̃_j, θ_0)′] = 0 if |i − j| > M_G for some M_G < ∞.6 As in Hall and Horowitz (1996) and Andrews (1999, 2002), two forms of the GMM

estimator are considered. One uses a fixed weight matrix, and the other uses an estimator of the asymptotically optimal weight matrix. The results stated here can be extended to other forms such as the continuous updating estimator. In the first form, the estimator of θ, θ_n, solves

(4.1) min_{θ∈Θ} J_n(θ) = [n^(−1/2) Σ_{i=1}^{n−L} G(X̃_i, θ)]′ Ω [n^(−1/2) Σ_{i=1}^{n−L} G(X̃_i, θ)],

where Θ is the parameter set and Ω is an L_G × L_G, positive-semidefinite, symmetrical matrix of constants. In the second form, θ_n solves

(4.2) min_{θ∈Θ} J_n(θ) = [n^(−1/2) Σ_{i=1}^{n−L} G(X̃_i, θ)]′ Ω_n(θ̃_n)^(−1) [n^(−1/2) Σ_{i=1}^{n−L} G(X̃_i, θ)],

where θ̃_n solves (4.1),

Ω_n(θ) = n^(−1) Σ_{i=1}^{n−L} [G(X̃_i, θ)G(X̃_i, θ)′ + Σ_{j=1}^{M_G} H(X̃_i, X̃_{i+j}, θ)],

and H(X̃_i, X̃_{i+j}, θ) = G(X̃_i, θ)G(X̃_{i+j}, θ)′ + G(X̃_{i+j}, θ)G(X̃_i, θ)′.7 Let Ω_0 denote the asymptotic covariance matrix of n^(−1/2) Σ_{i=1}^{n−L} G(X̃_i, θ_0).

To obtain the t statistic, let D = E[∂G(X̃, θ_0)/∂θ′] and define the matrix

D_n = n^(−1) Σ_{i=1}^{n−L} ∂G(X̃_i, θ_n)/∂θ′.

Define

(4.3) Σ = (D′ΩD)^(−1) D′Ω Ω_0 Ω D (D′ΩD)^(−1)

if θ_n solves (4.1) and

(4.4) Σ = (D′Ω_0^(−1)D)^(−1)

if θ_n solves (4.2). Let Σ_n be the consistent estimator of Σ that is obtained by replacing D and Ω_0 in (4.3) and (4.4) by D_n and Ω_n(θ_n). In addition, let (Σ_n)_rr be the (r, r) component of Σ_n, and let θ_{0r} and θ_{nr} be the rth components of θ_0 and θ_n, respectively. The t statistic for testing H_0: θ_r = θ_{0r} is t_{nr} = n^(1/2)(θ_{nr} − θ_{0r})/(Σ_n)_rr^(1/2).
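As an illustration of this construction, the following minimal sketch computes a GMM t statistic in the simplest exactly identified case, with the hypothetical moment function G(X, θ) = X − θ and a truncated long-run variance estimator that mimics Ω_n(θ). This is a toy under those assumptions, not the paper's implementation.

```python
import numpy as np

def gmm_t_mean(x, theta0, m_g=2):
    """t statistic for H0: E[X] = theta0 with moment G(X, theta) = X - theta.
    The variance estimate mirrors Omega_n(theta): the lag-0 term plus the
    symmetrized lag terms H(X_i, X_{i+j}, theta) through lag M_G."""
    n = len(x)
    theta_n = x.mean()            # GMM estimator in the exactly identified case
    g = x - theta_n               # moment function evaluated at theta_n
    omega = np.mean(g * g)        # lag-0 term
    for j in range(1, m_g + 1):
        omega += 2.0 * np.mean(g[:-j] * g[j:])   # H(X_i, X_{i+j}) terms
    return np.sqrt(n) * (theta_n - theta0) / np.sqrt(omega)

rng = np.random.default_rng(0)
x = rng.normal(size=500)
print(gmm_t_mean(x, 0.0))
```

With weakly dependent data this truncated estimator plays the role that Ω_n(θ_n) plays in Σ_n above.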


To obtain the MCB version of t_{nr}, let {X_j*: j = 1, ..., n} be a bootstrap sample that is obtained by carrying out steps MCB 1 and MCB 2 but with the modified transition density estimator that is described in equations (4.5)-(4.7) below. The modified estimator allows some components of X_j to be discrete. Define X̃_j* = (X_j*, ..., X*_{j+L}). Let X̃* denote a random vector that is distributed as X̃_1*. Let E* denote the expectation with respect to the distribution induced by bootstrap sampling. Define G*(·, θ) = G(·, θ) − E*G(X̃*, θ_n). The bootstrap version of the moment condition E G(X̃, θ) = 0 is E*G*(X̃*, θ) = 0. As in Hall and Horowitz (1996) and Andrews (1999, 2002), the bootstrap version is recentered relative to the population version because, except in special cases, there is no θ such that E*G(X̃*, θ) = 0 when L_G > L_θ. Brown, et al. (2000) and Hansen (1999) discuss an empirical likelihood approach to recentering. Recentering is unnecessary if L_G = L_θ, but it simplifies the technical analysis and, therefore, is done here.8
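The recentering step can be sketched as follows: G*(·, θ) subtracts an estimate of E*G(X̃*, θ_n), here approximated by an average over a long simulated realization from the fitted process. The function name and the toy overidentified moment condition are hypothetical illustrations, not the paper's code.

```python
import numpy as np

def recentred_moment(g, theta_n, sim_draws):
    """Return G*(x, th) = G(x, th) - E*[G(X*, theta_n)], with E* approximated
    by averaging over a long simulated realization `sim_draws` from the
    fitted (bootstrap) process."""
    centre = np.mean([g(x, theta_n) for x in sim_draws], axis=0)
    return lambda x, th: g(x, th) - centre

# Toy moment function with L_G = 2 > L_theta = 1, so recentring matters:
g = lambda x, th: np.array([x - th, x**2 - th**2 - 1.0])
sim = np.random.default_rng(1).normal(size=10_000)
g_star = recentred_moment(g, 0.1, sim)
# By construction the recentred moments average to ~0 over the simulated draws:
print(np.mean([g_star(x, 0.1) for x in sim], axis=0))
```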

To form the bootstrap version of t_{nr}, let θ_n* denote the bootstrap estimator of θ. Let D_n* be the quantity that is obtained by replacing X̃_i with X̃_i* and θ_n with θ_n* in the expression for D_n. Define D̃_n = E*[∂G(X̃*, θ_n)/∂θ′],

Ω_n*(θ) = n^(−1) Σ_{i=1}^{n−L} [G*(X̃_i*, θ)G*(X̃_i*, θ)′ + Σ_{j=1}^{M_G} H*(X̃_i*, X̃*_{i+j}, θ)],

where H* is obtained from H by replacing G with G*,

Ω̃_n = E*[G*(X̃_1*, θ_n)G*(X̃_1*, θ_n)′ + Σ_{j=1}^{M_G} H*(X̃_1*, X̃*_{1+j}, θ_n)],

and

Ω̄_n = E*[G*(X̃_1*, θ_n)G*(X̃_1*, θ_n)′ + Σ_{j≥1} H*(X̃_1*, X̃*_{1+j}, θ_n)].

Define Σ_n* by replacing G with G*, X̃_i with X̃_i*, θ_n with θ_n*, D_n with D_n*, and Ω_n(θ_n) with Ω_n*(θ_n*) in the formula for Σ_n. Let Ω_n† = Ω if θ_n solves (4.1) and Ω_n† = Ω_n(θ_n)^(−1) if θ_n solves (4.2). Define

Σ̃_n = (D̃_n′Ω_n†D̃_n)^(−1) D̃_n′Ω_n† Ω̃_n Ω_n†D̃_n (D̃_n′Ω_n†D̃_n)^(−1),

Σ̄_n = (D̃_n′Ω_n†D̃_n)^(−1) D̃_n′Ω_n† Ω̄_n Ω_n†D̃_n (D̃_n′Ω_n†D̃_n)^(−1),

and τ̃²_{nr} = (Σ̃_n)_rr/(Σ̄_n)_rr, where (Σ̃_n)_rr and (Σ̄_n)_rr are the (r, r) components of Σ̃_n and Σ̄_n, respectively. Let θ*_{nr} denote the rth component of θ_n*. Then the MCB version of the t statistic is t*_{nr} = n^(1/2)τ̃_{nr}(θ*_{nr} − θ_{nr})/(Σ_n*)_rr^(1/2). The quantity τ̃_{nr} is a correction factor analogous to that used by Hall and Horowitz (1996) and Andrews (1999, 2002).

Now let V(X̃_j, θ) (j = 1, ..., n − L) be the vector containing the unique components of G(X̃_j, θ), G(X̃_j, θ)G(X̃_{j+i}, θ)′ (0 ≤ i ≤ M_G), and the derivatives through order 6 of G(X̃_j, θ) and G(X̃_j, θ)G(X̃_{j+i}, θ)′. Let S_X denote the support of (X_1, ..., X_{L+1}). Define p_y, p_z, and f as in Sections 2-3 but with counting measure as the dominating measure for discrete components of X_j. The following new assumptions are used to derive the results of this section. Assumptions 7-9 are similar to ones made by Hall and Horowitz (1996) and Andrews (1999, 2002).

Assumption 7: θ_0 is an interior point of the compact parameter set Θ and is the unique solution in Θ to the equation E G(X̃, θ) = 0.

Assumption 8: (i) There are finite constants C_G and C_V such that ‖G(X̃_1, θ)‖ ≤ C_G and ‖V(X̃_1, θ)‖ ≤ C_V for all X̃_1 ∈ S_X and θ ∈ Θ. (ii) E[G(X̃_1, θ_0)G(X̃_{1+j}, θ_0)′] = 0 if j > M_G for some M_G < ∞. (iii) Ω(θ) ≡ E{G(X̃_1, θ)G(X̃_1, θ)′ + Σ_{j=1}^{M_G} [G(X̃_1, θ)G(X̃_{1+j}, θ)′ + G(X̃_{1+j}, θ)G(X̃_1, θ)′]} exists for all θ ∈ Θ. Its smallest eigenvalue is bounded away from 0 uniformly over θ in an open sphere, N_0, centered on θ_0. (iv) There is a bounded function C_G(·) such that ‖G(X̃_1, θ_1) − G(X̃_1, θ_2)‖ ≤ C_G(X̃_1)‖θ_1 − θ_2‖ for all X̃_1 ∈ S_X and θ_1, θ_2 ∈ Θ. (v) G is 6-times continuously differentiable with respect to the components of θ everywhere in N_0. (vi) There is a bounded function C_V(·) such that ‖V(X̃_1, θ_1) − V(X̃_1, θ_2)‖ ≤ C_V(X̃_1)‖θ_1 − θ_2‖ for all X̃_1 ∈ S_X and θ_1, θ_2 ∈ Θ.

Assumption 9: (i) X_j (j = 1, ..., n) can be partitioned X_j = (X_j^(c), X_j^(d)), where X_j^(c) ∈ R^(d^(c)) for some d^(c) ≥ 1, the distribution of X_j^(c) is absolutely continuous with respect to Lebesgue measure, and the distribution of X_j^(d) is discrete with finitely many mass points. There need not be any discrete components of X_j, but there must be at least one continuous component. (ii) The functions p_y, p_z, and f are bounded. (iii) For some integer s > 2, p_y and p_z are everywhere at least s times continuously differentiable with respect to any mixture of their continuous arguments.

Assumption 10: Assumptions 2 and 4 hold with V(X̃_j, θ_0) in place of X_j.

As in Sections 2-3, {X_j*: j = 1, ..., n} in the MCB for GMM is a realization of the stochastic process induced by a nonparametric estimator of the Markov transition density. If X_j has no discrete components, then the density estimator is (2.2) and MCB samples are generated by carrying out steps MCB 1 and MCB 2 of Section 2.2. A modified transition density estimator is needed if X_j has one or more discrete components. Let (Y_j^(c), Y_j^(d)) be the partition of Y_j into continuous and discrete components. The modified density estimator is

(4.5) f_n(x|y) = g_n(x, y)/p_n(y),

where

(4.6) g_n(x, y) = [n_q h_n^(d^(c)(q+1))]^(−1) Σ_j K_f[(x^(c) − X_j^(c))/h_n, (y^(c) − Y_j^(c))/h_n] I(x^(d) = X_j^(d)) I(y^(d) = Y_j^(d)),

d^(c) = dim(X_j^(c)), and

(4.7) p_n(y) = [n_q h_n^(d^(c)q)]^(−1) Σ_j K_p[(y^(c) − Y_j^(c))/h_n] I(y^(d) = Y_j^(d)).

The result of this section is given by the following theorem.

Theorem 4.1: Let Assumptions 1, 4(i), 4(v), and 5-10 hold. For any α ∈ (0,1) let z*_nα satisfy P*(|t*_{nr}| > z*_nα) = α. Then (3.2)-(3.4) hold with t_{nr} and t*_{nr} in place of T_n and T_n*.

    4.2 Approximate Markov Processes

This section extends the results of Section 3.2 to approximate Markov processes. As in Sections 2-3, the objective is to carry out inference based on the statistic T_n defined in (2.1). For an arbitrary random vector V, let p_j(x|v) denote the conditional probability density of X_j at x conditional on V = v. An approximate Markov process is defined to be a stochastic process that satisfies the following assumption.

Assumption AMP: (i) {X_j: j = 0, ±1, ±2, ...; X_j ∈ R^d} is strictly stationary and GSM. (ii) For some finite b > 0 and c > 0, some integer q_0 > 0, all finite q > q_0, and all j,

sup_{x_j, x_{j−1}, ...} |p_j(x_j | x_{j−1}, x_{j−2}, ...) − p_j(x_j | x_{j−1}, ..., x_{j−q})| ≤ ce^(−bq).

The AMCB treats the DGP as a Markov process of order q_n, where {q_n} is a sequence that satisfies d(q_n + 1)/n → 0 as n → ∞.

Assumption 2: For some finite integer n_0 and each n ≥ n_0: (i) The distribution of Z_{q_n+1} is absolutely continuous with respect to Lebesgue measure. (ii) For t ∈ R^d and each k such that 0 < k ≤ q_n,

(3.1) lim sup_{|t|→∞} |E exp[it′(X_j − X_{j+k})]| < 1.

  • 8/6/2019 3528_Bootstrap Methods for Markov Processes

    20/44

For each positive integer n, let K_n be a bounded, continuous function whose support is [−1,1], that is symmetrical about 0, and that satisfies

∫_{−1}^{1} v^j K_n(v) dv = 1 if j = 0; = 0 if 1 ≤ j < s_n; = B_{jn} (nonzero) if j = s_n,

where s_n → ∞ as n → ∞, and let the bandwidth satisfy (log n)²/(n h_n^(d(q_n+1))) → 0.

The main difference between these assumptions and Assumptions 2, 5, and 6 of the MCB is the strengthened smoothness Assumption 2(iv). The MCB for an order q Markov process provides asymptotic refinements whenever p_y and p_z have derivatives of fixed order s > d(q + 1), whereas the AMCB requires the existence of derivatives of all orders as n → ∞.

The ability of the AMCB to provide asymptotic refinements is established by the following theorem.

Theorem 4.2: Let Assumptions AMCB, 2, 3, 4, 5, and 6 hold with q_n ≥ c(log n)² for some finite c > 0. For any α ∈ (0,1) let z*_nα satisfy P*(|T_n*| > z*_nα) = α. Then for every ε > 0

sup_z |P*(T_n* ≤ z) − P(T_n ≤ z)| = O(n^(−1+ε)),

sup_z |P*(|T_n*| ≤ z) − P(|T_n| ≤ z)| = O(n^(−3/2+ε)),

almost surely, and

P(|T_n| > z*_nα) = α + O(n^(−3/2+ε)).

    5. MONTE CARLO EXPERIMENTS

    This section describes four Monte Carlo experiments that illustrate the numerical

    performance of the MCB. The number of experiments is small because the computations are very

    lengthy.

Each experiment consists of testing the hypothesis H_0 that the slope coefficient is zero in the regression of X_j on X_{j−1}. The coefficient is estimated by ordinary least squares (OLS), and


acceptance or rejection of H_0 is based on the OLS t statistic. The experiments evaluate the empirical rejection probabilities of one-sided and symmetrical tests at the nominal 0.05 level. Results are reported using critical values obtained from the MCB, the block bootstrap, and first-order asymptotic distribution theory. Four DGPs are used in the experiments. Two are the ARCH(1) processes

(5.1) X_j = U_j(1 + 0.3X²_{j−1})^(1/2),

where {U_j} is an iid sequence that has either the N(0,1) distribution or the distribution with

(5.2) P(U_j ≤ u) = 0.5[sin⁷(πu/2) + 1] for |u| ≤ 1.

The other two DGPs are the GARCH(1,1) processes

(5.3) X_j = U_j h_j^(1/2),

where

(5.4) h_j = 1 + 0.4(h_{j−1} + X²_{j−1})

and {U_j} is an iid sequence with either the N(0,1) distribution or the distribution of (5.2). DGP (5.1) is a first-order Markov process. DGP (5.3)-(5.4) is an approximate Markov process. In the experiments reported here, this DGP is approximated by a Markov process of order q = 2. When U_j has the admittedly somewhat artificial distribution (5.2), X_j has bounded support as required by Assumption 4. When U_j ~ N(0,1), X_j has unbounded support and X_j has moments only through orders 8 and 4 for models (5.1) and (5.3)-(5.4), respectively (He and Teräsvirta 1999). Therefore, the experiments with U_j ~ N(0,1) illustrate the performance of the MCB under conditions that are considerably weaker than those of the formal theory.
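The two DGPs with standard normal innovations can be simulated as follows. This is an illustrative sketch (the burn-in length is our choice, not stated in the paper):

```python
import numpy as np

def simulate_arch1(n, rng, burn=200):
    """ARCH(1) process (5.1): X_j = U_j * (1 + 0.3 * X_{j-1}**2) ** 0.5."""
    x = np.empty(n + burn)
    x[0] = rng.standard_normal()
    for j in range(1, n + burn):
        x[j] = rng.standard_normal() * np.sqrt(1.0 + 0.3 * x[j - 1] ** 2)
    return x[burn:]

def simulate_garch11(n, rng, burn=200):
    """GARCH(1,1) process (5.3)-(5.4): X_j = U_j * h_j**0.5,
    h_j = 1 + 0.4 * (h_{j-1} + X_{j-1}**2)."""
    x, h, xprev = np.empty(n + burn), 1.0, 0.0
    for j in range(n + burn):
        h = 1.0 + 0.4 * (h + xprev ** 2)
        xprev = x[j] = rng.standard_normal() * np.sqrt(h)
    return x[burn:]

rng = np.random.default_rng(42)
print(simulate_arch1(50, rng).shape, simulate_garch11(50, rng).shape)
```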

The MCB was carried out using the 4th-order kernel

(5.5) K(v) = (105/64)(1 − 5v² + 7v⁴ − 3v⁶)I(|v| ≤ 1).

Implementation of the MCB requires choosing the bandwidth parameter h_n. Preliminary experiments showed that the Monte Carlo results are not highly sensitive to the choice of h_n, so a simple method motivated by Silverman's (1986) rule of thumb is used. This consists of setting h_n equal to the asymptotically optimal bandwidth for estimating the (q + 1)-variate normal density N(0, σ²_n I_{q+1}), where I_{q+1} is the (q + 1) × (q + 1) identity matrix and σ²_n is the estimated variance of X_1. Of course, there is no reason to believe that this h_n is optimal in any sense in the MCB setting. The preliminary experiments also indicated that the Monte Carlo results are
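A version of this rule of thumb can be sketched as follows, using the standard AMISE-optimal constant for a multivariate normal reference density with a Gaussian kernel; the exact constant used in the experiments is not stated here, so this specific formula is an assumption.

```python
import numpy as np

def rule_of_thumb_bandwidth(x, q):
    """Silverman-style rule of thumb: the asymptotically optimal bandwidth for
    estimating a (q+1)-variate N(0, s2*I) density, with s2 estimated by the
    sample variance of the series x."""
    n, dim = len(x), q + 1
    sigma = np.std(x, ddof=1)
    # AMISE-optimal constant for a Gaussian reference density:
    c = (4.0 / (dim + 2)) ** (1.0 / (dim + 4))
    return c * sigma * n ** (-1.0 / (dim + 4))

print(rule_of_thumb_bandwidth(np.arange(50.0), 1))
```

The bandwidth shrinks more slowly as q grows, which mirrors the curse of dimensionality discussed in Section 3.2.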


    insensitive to the choice of trimming parameter, so trimming was not carried out in the

    experiments reported here.

    Implementation of the block bootstrap requires selecting the block length. Data-based

    methods for selecting block lengths in hypothesis testing are not available, so results are reported

here for three different block lengths, (2, 5, 10). The experiments were carried out in GAUSS using GAUSS random number generators. The sample size is n = 50. There are 5000 Monte Carlo replications in each experiment. MCB and block bootstrap critical values are based on 99 bootstrap samples.10
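For comparison, a moving-block bootstrap resample with a given block length can be generated as follows (a generic sketch, not the experiments' GAUSS code):

```python
import numpy as np

def block_bootstrap_sample(x, block_len, rng):
    """Moving-block resample of the series x: draw block start points with
    replacement, concatenate blocks of length `block_len`, truncate to n."""
    n = len(x)
    n_blocks = int(np.ceil(n / block_len))
    starts = rng.integers(0, n - block_len + 1, size=n_blocks)
    out = np.concatenate([x[s:s + block_len] for s in starts])
    return out[:n]

rng = np.random.default_rng(7)
x = rng.normal(size=50)
for L in (2, 5, 10):
    xb = block_bootstrap_sample(x, L, rng)
    print(L, xb.shape)
```

Unlike the MCB, the resample joins blocks at arbitrary points, which is one source of the block bootstrap's slower convergence.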

    The results of the experiments are shown in Table 1. The differences between the

    empirical and nominal rejection probabilities (ERPs) with first-order asymptotic critical values

    tend to be large. The symmetrical and lower-tail tests reject the null hypothesis too often. The

    upper-tail test does not reject the null hypothesis often enough when the innovations have the

    distribution (5.2). The ERPs with block bootstrap critical values are sensitive to the block

    length. With some block lengths, the ERPs are small, but with others they are comparable to or

    larger than the ERPs with asymptotic critical values. With the MCB, the ERPs are smaller than

    they are with asymptotic critical values in 10 of the 12 experiments. The MCB has relatively

    large ERPs with the GARCH(1,1) model and normal innovations because this DGP lacks the

higher-order moments needed to obtain good accuracy with the bootstrap even with iid data.

    6. CONCLUSIONS

    The block bootstrap is the best known method for implementing the bootstrap with time

    series data when one does not have a parametric model that reduces the DGP to simple random

    sampling. However, the errors made by the block bootstrap converge to zero only slightly faster

    than those made by first-order asymptotic approximations. This paper has shown that the errors

    made by the MCB converge to zero more rapidly than those made by the block bootstrap if the

    DGP is a Markov or approximate Markov process and certain other conditions are satisfied.

    These conditions are stronger than those required by the block bootstrap. Therefore, the MCB is

not a substitute for the block bootstrap, but the MCB is an attractive alternative to the block bootstrap when the MCB's stronger regularity conditions are satisfied. Further research could

    usefully investigate the possibility of developing bootstrap methods that are more accurate than

    the block bootstrap but impose less a priori structure on the DGP than do the MCB or the sieve

    bootstrap for linear processes.


    FOOTNOTES

    1 Statistics with asymptotic chi-square distributions are not treated explicitly in this paper.

    However, the arguments made here can be extended to show that asymptotic chi-square statistics

    based on GMM estimators behave like symmetrical statistics. See, for example, the discussion of

    the GMM test of overidentifying restrictions in Hall and Horowitz (1996) and Andrews (1999).

    Under regularity conditions similar to those of Sections 3 and 4, the results stated here for

    symmetrical probabilities and tests apply to asymptotic chi-square statistics based on GMM

    estimators. Andrews (1999) defines a class of minimum estimators that is closely related to

GMM estimators. The results here also apply to t tests based on minimum estimators.

    2 GSM means that the process is strongly mixing with a mixing parameter that is an

    exponentially decreasing function of the lag length.

    3 If Y has finitely many moments and the required Edgeworth expansions are valid, then the

    errors made by the MCB increase as the number of moments of Y decreases. If Y has too few

    moments, then the errors made by the MCB decrease more slowly than the errors made by the

    block bootstrap.

4 Götze and Künsch (1996) have given conditions under which T_n with a kernel-type variance estimator has an Edgeworth expansion up to O(n^(−1/2)). Analogous conditions are not yet known for expansions through O(n^(−1)). A recent paper by Inoue and Shintani (2000) gives expansions for statistics based on the block bootstrap for linear models with kernel-type variance estimators.

5 As an example of a sufficient condition for 4(v), let X be scalar with support [0,1]. Then 4(v) holds if there are finite constants c_1 > 0, c_2 > 0, α > 0, and β > 0 such that c_1 x_q^α ≤ p_y(x_1, ..., x_q) ≤ c_2 x_q^α when x_q < 1/2 and c_1(1 − x_q)^β ≤ p_y(x_1, ..., x_q) ≤ c_2(1 − x_q)^β when x_q ≥ 1/2. The generalization to multivariate X is analytically straightforward though notationally cumbersome.

APPENDIX

Lemma 1: Δ_{nz} = O[(log n)^c/(n h_n^(d(q+1)))^(1/2)] and Δ_{ny} = o(Δ_{nz}) a.s.

Proof: This is a slightly modified version of Theorem 2.2 of Bosq (1996) and is proved the same way as that theorem. Q.E.D.

Lemma 2: P_{ny}(C_n) − P_y(C_n) = O(Δ_{ny}) a.s.

Proof: C_n is bounded uniformly over n because X has bounded support. Therefore,

P_{ny}(C_n) − P_y(C_n) = ∫_{C_n} [p_{ny}(y) − p_y(y)] dy = O(Δ_{ny}).

Q.E.D.

Lemma 3: P_{ny}(C_n) = 1 − O(λ_n) a.s.

Proof: Let V(2λ_n) denote the volume of {y: p_y(y) < 2λ_n}. By Assumption 6(ii) and Lemma 1, Δ_{ny}/λ_n = o[(log n)^(−c)] a.s. for any c > 1. Therefore, Δ_{ny} = o(λ_n) a.s., and

1 − P_{ny}(C_n) = ∫ I[p_{ny}(y) < λ_n] p_y(y) dy ≤ 2λ_n ∫ I[p_y(y) < 2λ_n] dy = 2λ_n V(2λ_n) a.s.

Q.E.D.

    ny C (

    (A1) ( ) (

    (

    yB

    ny

    n n

    B p

    [ ( )

    ) ( )

    yB B

    y B n

    y dy

    p y dy p y

    V B c V

    = +

    n

    =

    P

    1( | )

    y jf y

    1j 1jy

    =

    BM

    ( ) ( | ) (

    ( | ) (

    ( )n

    y y

    y y

    B yC

    B dxf x y p y

    dxf x y p y

    p y dy V =

    P

    Lemma 6: Conditional on the data, the support of Y j (j = in the truncated

    Markov process is C a.s..

    Proof: The support ofY is by construction, and the support ofY j can be no

    larger than . For , suppose that the support of Y

    2)

    1j is but the support of Y is a

    proper subset of . Then there is a measurable set

    nC j

    nB C such that andP ( 0| )B y =P

    for all . Let V B denote the volume of) B . Then by Lemma 1 and Assumption 6(ii),

    )

    ( ) ( )]

    ( ) a.s.

    nyp y dy

    B

    .

    for all sufficiently large and some cB < . Let nC denote the complement of . Let

    denote the probability density of Y conditional on Y

    nC

    j . Then for some

    constant , Lemma 3 yields<

    ) ( | ) ( )

    )

    ( ) ( ) ( ).

    n n

    n

    y yC B C B

    C B

    n

    dy dy dxf x y p y

    dy

    M V B B o

    = +

    =

    This is a contradiction of (A1). Q.E.D.

Lemma 7: Define δ_{1n} as the difference between E*[U(ψ)] and the same expectation computed with f in place of f_n, with all integrations restricted to A_n. Then δ_{1n} = O(r_nΔ_n) a.s. uniformly as n → ∞.

Proof: Write δ_{1n} = B_n. A multinomial expansion of the product of transition densities gives

|B_n| ≤ Σ_{|w|≥2} {r_n!/[|w|!(r_n − |w|)!]} Δ_n^|w| ∫ ⋯ ∫ p(x_1, ..., x_{r_n}) dx_1 ⋯ dx_{r_n} + O(r_nΔ_n) ≤ (1 + Δ_n)^{r_n} − 1 − r_nΔ_n + O(r_nΔ_n) ≤ 0.5r_n²Δ_n²(1 + Δ_n)^{r_n} + O(r_nΔ_n) a.s.

uniformly. It follows that

|B_n| ≤ (1 + Δ_n)^{r_n}[O(r_nΔ_n) + O(r_n²Δ_n²)] = O(r_nΔ_n) a.s.

uniformly. By Assumption 4(v), Δ_n = o[(log n)^(−1)], and the result follows. Q.E.D.

Lemma 8: As n → ∞ and conditional on the data,

(A2) ∫_{C_n} |g_y(y) − p_y(y)| dy = O(Δ_n log n) a.s.

and

(A3) ∫_{C_n} |g_{ny}(y) − p_{ny}(y)| dy = O(Δ_n log n) a.s.

Proof: Only (A2) is proved here. The proof of (A3) is similar. Let B denote the measurable subsets of C_n. Let G_y and P_y denote the probability measures associated with g_y and p_y. Then

∫_{C_n} |g_y(y) − p_y(y)| dy = sup_{A∈B} [G_y(A) − P_y(A)] − inf_{A∈B} [G_y(A) − P_y(A)] ≤ 2 sup_{A∈B} |G_y(A) − P_y(A)|.

Thus, it suffices to investigate the convergence of

sup_{A∈B} |G_y(A) − P_y(A)|.

This is done here for the case q = 1. The argument for the case of q > 1 is similar but more complex notationally. Let s be the integer part of b log n for some b > 0. The s-step transition probability from a point sampled randomly from P_y to a measurable set A is

P^(s)(A) = ∫_A dx_s ∫ dx_{s−1} ⋯ ∫ dx_1 ∏_{i=2}^{s} f(x_i|x_{i−1}) p_y(x_1)

= ∫_{A_n} dx_s ⋯ dx_1 ∏_{i=2}^{s} f(x_i|x_{i−1}) p_y(x_1) + ∫_{Ā_n} dx_s ⋯ dx_1 ∏_{i=2}^{s} f(x_i|x_{i−1}) p_y(x_1),

where A_n = {x_s ∈ A; x_i ∈ C_n, i = 1, ..., s} and Ā_n is the corresponding set on which x_i ∉ C_n for at least one i. The s-step transition probability of the truncated process starting from the same initial density but restricted to C_n is

G^(s)(A) = C_n(s)^(−1) ∫_{A_n} dx_s ⋯ dx_1 ∏_{i=2}^{s} f_n(x_i|x_{i−1}) p_y(x_1),

where C_n(s) is the normalizing constant that makes the truncated transition densities integrate to one. Therefore, G^(s)(A) − P^(s)(A) = S_{1n} − S_{2n}, where

S_{1n} = ∫_{A_n} dx_s ⋯ dx_1 [C_n(s)^(−1) ∏_{i=2}^{s} f_n(x_i|x_{i−1}) − ∏_{i=2}^{s} f(x_i|x_{i−1})] p_y(x_1)

and

S_{2n} = ∫_{Ā_n} dx_s ⋯ dx_1 ∏_{i=2}^{s} f(x_i|x_{i−1}) p_y(x_1).

Arguments like those made in the proof of Lemma 7 show that S_{1n} = O(sΔ_n) a.s. uniformly over A ∈ B. In addition, S_{2n} ≤ cP(Ā_n) for some constant c < ∞. But

Ā_n ⊂ ∪_{i=1}^{s} {x_i ∉ C_n},

so

P(Ā_n) ≤ s[1 − P_y(C_n)] = O(sΔ_n) a.s.

It follows that

(A4) G^(s)(A) − P^(s)(A) = O(sΔ_n) a.s.

uniformly over A ∈ B. Similar arguments show that the s-step transition probabilities from a point x ∈ C_n to A ∈ B satisfy

(A5) G^(s)(A, x) − P^(s)(A, x) = O(sΔ_n) a.s.

uniformly over A ∈ B and x ∈ C_n. In addition,

(A6) sup_{A∈B} |G_y(A) − P_y(A)| ≤ sup_{A∈B} |G^(s)(A) − P^(s)(A)| + sup_{A∈B} |G_y(A) − G^(s)(A)| + sup_{A∈B} |P^(s)(A) − P_y(A)|.

The first term on the right-hand side of (A6) is O(sΔ_n) a.s. By a result of Nagaev (1961), it follows from (A5) and Assumption 4(ii) that for some β < 1, the second term is a.s. O(β^(s/k)), where k is as in Assumption 4(ii) (Nagaev 1961). The third term is O(ρ^s) for some ρ < 1 because {X_j} is GSM and, therefore, uniformly ergodic (Doukhan 1995, p. 21). For b sufficiently large, the second two terms are o(sΔ_n). Q.E.D.

Lemma 9: Define

δ_{2n} = ∫_{A_n} u(·) ∏_{i=j+q}^{k} f(x_i|y_i) [g_y(y_{j+q}) − p_y(y_{j+q})] dx_{j+q} ⋯ dx_k.

Then conditional on the data, δ_{2n} = O(Δ_n log n) a.s. uniformly as n → ∞.

Proof:

|δ_{2n}| ≤ B ∫_{C_n} |g_y(y) − p_y(y)| dy

for some constant B < ∞. The result now follows from Lemma 8. Q.E.D.

Lemma 10: Define

δ_{3n} = ∫_{Ā_n} u(·) ∏_{i=j+q}^{k} f(x_i|y_i) p_y(y_{j+q}) dx_{j+q} ⋯ dx_k.

Then conditional on the data, δ_{3n} = O(r_nΔ_n) a.s. uniformly as n → ∞.

Proof:

|δ_{3n}| ≤ B P(Ā_n)

for some constant B < ∞. The result now follows from arguments like those used to prove Lemma 8. Q.E.D.

Lemma 11: Conditional on the data, E*[U(ψ)] − E[U(ψ)] = O(Δ_n log n) a.s. uniformly over ψ.

Proof: Some algebra shows that E*[U(ψ)] − E[U(ψ)] = δ_{1n} + δ_{2n} + δ_{3n}. Now combine Lemmas 7, 9, and 10. Q.E.D.

Lemma 12: The bootstrap process {X_j*} is geometrically strongly mixing a.s.

Proof: It follows from Lemma 5 and Assumption 4(ii) that for all sufficiently large m,

sup_{x∈C_n; A⊂C_n} |G^(m)(A, x) − P^(m)(A, x)| < 1.

The lemma follows from a result of Nagaev (1961). Q.E.D.


and

δ_{4n} = ∫_{A_n} S_n g_{ny}(y_{j+q}) dx_{j+q} ⋯ dx_k.

It follows from Lemmas 5 and 8 that δ_{1n} = O(Δ_n log n) and δ_{3n} = O(Δ_n log n) a.s. uniformly. Now consider δ_{2n}. A Taylor series approximation shows that

δ_{2n} = Σ_{l=j+q}^{k} [δ^(1)_{2nl} + δ^(2)_{2nl} + δ^(3)_{2nl}],

where

δ^(1)_{2nl} = ∫_{A_n} u(·) {[p_{nz}(x_l, y_l) − p_z(x_l, y_l)]/p_y(y_l)} ∏_{i≠l} f(x_i|y_i) I(x_l ∈ C_n) g_y(y_{j+q}) dx_{j+q} ⋯ dx_k,

δ^(2)_{2nl} = ∫_{A_n} u(·) {f(x_l|y_l)[p_{ny}(y_l) − p_y(y_l)]/p_y(y_l)} ∏_{i≠l} f(x_i|y_i) I(x_l ∈ C_n) g_y(y_{j+q}) dx_{j+q} ⋯ dx_k,

and

δ^(3)_{2nl} = ∫_{A_n} u(·) {f(x_l|y_l)/p_y(y_l)} [O(Δ_{nz}) + O(Δ_{ny})] ∏_{i≠l} f(x_i|y_i) g_y(y_{j+q}) dx_{j+q} ⋯ dx_k.

For some constant B < ∞,

|δ^(1)_{2nl}| ≤ BΔ_{nz} ∫ p_y(y_l)^(−1) ∏_{i≠l} f(x_i|y_i) g_y(y_{j+q}) dx_{j+q} ⋯ dx_k

≤ BΔ_{nz} ∫ p_y(y_l)^(−1) ∏_{i≠l} f(x_i|y_i) p_y(y_{j+q}) dx_{j+q} ⋯ dx_k + BΔ_{nz}·O(Δ_n log n)

= O(Δ_{nz} log n)

a.s. uniformly, where the last line follows from Lemma 8. Some algebra shows that p_y(y_{j+q}) ∏_{i≠l} f(x_i|y_i)/p_y(y_l) is bounded, up to a factor 1 + o(1), by a product of the transition densities f(x_{l+q}|y_{l+q}) ⋯ f(x_j|y_j). Therefore, δ^(1)_{2nl} = O(Δ_{nz} log n) a.s. uniformly. Similar arguments can be used to show that δ^(2)_{2nl} = O(Δ_{ny} log n) and δ^(3)_{2nl} = o(Δ_n log n) a.s. uniformly. Therefore,

δ_{2n} = O[Δ_n(log n)²]

a.s. uniformly. Now consider δ_{4n}. Given w, let i(w) be the smallest value of i such that w_i = 1. Define

Δ̄_n = sup{|f_n(x|y) − f(x|y)|: x, y such that p_{ny}(y) ≥ λ_n}.

Then

|S_n| ≤ B_2 Σ_{|w|≥2} ∏_i f(x_i|y_i)^(1−w_i) |f_n(x_i|y_i) − f(x_i|y_i)|^(w_i)

for some constant B_2 < ∞. Therefore,

|δ_{4n}| ≤ B_2 Σ_{|w|≥2} ∫_{A_n} ∏_i f(x_i|y_i)^(1−w_i) |f_n(x_i|y_i) − f(x_i|y_i)|^(w_i) g_{ny}(y_{j+q}) dx_{j+q} ⋯ dx_k.

Arguments like those made for δ_{2n} show that the integral is O[Δ̄_n²(log n)²] a.s. Arguments like those made in the proof of Lemma 7 show that the sum is O(r_n²Δ̄_n²). Therefore, δ_{4n} = o[Δ_{nz}(log n)⁴] a.s. uniformly. The lemma follows by combining the rates of convergence of δ_{1n}, δ_{2n}, δ_{3n}, and δ_{4n}. Q.E.D.

    Lemma 14: Define /[2 ( 1)]d q = + + . For any 0 > , ( )O n + = a.s.

    Proof: Under Assumptions 3-4, is a smooth function of sample moments. Therefore,

    thejth cumulant ofT has the expansion

    nT

    n

    ( 2) / 2 1 21 2[ (

    jj j jK n k n k O n

    = + + )] .

    with and . The vector11 0k = 21 1k = of Section 2.3 can be written 12 22 31 41( , , , )k k k k = .

    Then the functions 1 and 2 in (2.6) are

    21 12 31( , ) [ (1/ 6) ( 1)]z k k z = +

    and

    2 2 2 42 22 12 41 12 31 31( , ) [(1/ 2)( ) (1/ 24)( 4 )( 3) (1/ 72) ( 10 15)]z z k k k k k z k z z = + + + + +

    2.

    34

  • 8/6/2019 3528_Bootstrap Methods for Markov Processes

    36/44

See, e.g., Hall (1992). The parameters k_{ij} in these expressions are independent of n. Define ρ_j = E(X_1 − μ)(X_{1+j} − μ) and ρ_{nj} = n^{−1} Σ_{i=1}^{n−j} (X_i − m)(X_{i+j} − m). A Taylor series expansion of T_n about m = μ and ρ_{nj} = ρ_j (j = 0, …, M) shows that κ is a continuous function of terms of the form D(μ, ρ)E[n^{1/2}(m − μ)]^2, D(μ, ρ)E[n^{1/2}(m − μ)]^3, D(μ, ρ)E[n^{1/2}(m − μ)]^4, and D(μ, ρ) times expectations of products of powers of n^{1/2}(m − μ) and n^{1/2}(ρ_{nj} − ρ_j), where D represents a differentiable function that may be different in different terms, and ρ = (ρ_0, …, ρ_M)′. Similar expansions applied to T_n* show that κ* is a continuous function of the same terms but with E*, μ*, m*, and ρ_j* = E*(X_1* − μ*)(X_{1+j}* − μ*) in place of E, μ, m, and ρ_j. Thus, it is necessary to show that the differences between the bootstrap and population terms are o(n^{−λ+ε}) a.s. This is done here for D(μ*, ρ*)E*[n^{1/2}(m* − μ*)]^2 − D(μ, ρ)E[n^{1/2}(m − μ)]^2 and D(μ*, ρ*)E*[n^{1/2}(m* − μ*)]^3 − D(μ, ρ)E[n^{1/2}(m − μ)]^3. The calculations for the remaining moments are similar but much lengthier. Lemma 13 implies that D(μ*, ρ*) − D(μ, ρ) = O(n^{−λ+ε}) a.s., so it suffices to show that E*[n^{1/2}(m* − μ*)]^2 − E[n^{1/2}(m − μ)]^2 and E*[n^{1/2}(m* − μ*)]^3 − E[n^{1/2}(m − μ)]^3 are O(n^{−λ+ε}).

First consider E*[n^{1/2}(m* − μ*)]^2 − E[n^{1/2}(m − μ)]^2. Let {J_n} be an increasing sequence of positive integers such that J_n/log n → b for some finite constant b > 1/2. Then it follows from Lemma 12 that

    E*[n^{1/2}(m* − μ*)]^2 = ρ_0* + 2 Σ_{j=1}^{J_n} (1 − j/n)ρ_j* + o(n^{−1/2}) a.s.

It further follows from Lemma 13 that E*[n^{1/2}(m* − μ*)]^2 − E[n^{1/2}(m − μ)]^2 = o(n^{−λ+ε}) a.s. for any ε > 0.
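The truncation step above rests on the exact identity E[n^{1/2}(m − μ)]^2 = ρ_0 + 2Σ_{j=1}^{n−1}(1 − j/n)ρ_j for a stationary series, with the tail of the sum negligible when the autocovariances decay geometrically. A small numerical check (Python; the AR(1)-type autocovariances ρ_j = 0.6^j and the truncation point J_n = 15 are illustrative assumptions, not values from the paper):

```python
# exact variance identity for the normalized mean of a stationary series
n = 50
rho = [0.6 ** j for j in range(n)]  # geometrically decaying autocovariances, rho_0 = 1

# Var(n^{1/2} m) computed directly from the n x n covariance matrix: (1/n) 1' Sigma 1
direct = sum(rho[abs(i - j)] for i in range(n) for j in range(n)) / n

# the weighted-autocovariance form used in the text
weighted = rho[0] + 2 * sum((1 - j / n) * rho[j] for j in range(1, n))

# truncating the sum at J_n changes the result only by a geometrically small amount
Jn = 15
truncated = rho[0] + 2 * sum((1 - j / n) * rho[j] for j in range(1, Jn + 1))
```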

Now consider E*[n^{1/2}(m* − μ*)]^3 − E[n^{1/2}(m − μ)]^3. Let V_i = X_i − μ and V_i* = X_i* − μ*. Then

    [n^{1/2}(m − μ)]^3 = n^{−3/2} Σ_{i,j,k=1}^{n} V_i V_j V_k
        = n^{−3/2} Σ_{i=1}^{n} V_i^3 + 3n^{−3/2} Σ_{i=1}^{n} Σ_{j≠i} V_i^2 V_j
          + 6n^{−3/2} Σ_{i=1}^{n−2} Σ_{j=i+1}^{n−1} Σ_{k=j+1}^{n} V_i V_j V_k.

Since the bootstrap DGP is stationary, some algebra shows that

    E*[n^{1/2}(m* − μ*)]^3 = n^{−1/2} { E*V_1*^3 + 3 Σ_{i=1}^{n−1} (1 − i/n)[E*V_1*^2 V_{1+i}* + E*V_1* V_{1+i}*^2]
        + 6 Σ_{i=1}^{n−2} Σ_{j=1}^{n−1−i} [1 − (i + j)/n] E*V_1* V_{1+i}* V_{1+i+j}* }.

Define J_n as before. By Lemma 12 and Billingsley's inequality (Bosq, 1996, inequality (1.11)),

    E*[n^{1/2}(m* − μ*)]^3 = n^{−1/2} { E*V_1*^3 + 3 Σ_{i=1}^{J_n} (1 − i/n)[E*V_1*^2 V_{1+i}* + E*V_1* V_{1+i}*^2]
        + 6 Σ_{i=1}^{J_n} Σ_{j=1}^{min(n−1−i, J_n)} [1 − (i + j)/n] E*V_1* V_{1+i}* V_{1+i+j}* } + o(n^{−1/2}) a.s.

A similar argument shows that

    E[n^{1/2}(m − μ)]^3 = n^{−1/2} { EV_1^3 + 3 Σ_{i=1}^{J_n} (1 − i/n)[EV_1^2 V_{1+i} + EV_1 V_{1+i}^2]
        + 6 Σ_{i=1}^{J_n} Σ_{j=1}^{min(n−1−i, J_n)} [1 − (i + j)/n] EV_1 V_{1+i} V_{1+i+j} } + o(n^{−1/2}).

Now apply Lemma 13. Q.E.D.
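The three-way decomposition of the triple sum used in this third-moment calculation is a finite combinatorial identity and can be verified directly (Python sketch; the sample values are arbitrary):

```python
import itertools
import random

random.seed(0)
n = 12
V = [random.gauss(0, 1) for _ in range(n)]

# full triple sum over all index triples (i, j, k)
full = sum(V[i] * V[j] * V[k] for i in range(n) for j in range(n) for k in range(n))

# decomposition into cubes, terms with exactly one repeated index, and distinct triples
cubes = sum(v ** 3 for v in V)
mixed = 3 * sum(V[i] ** 2 * V[j] for i in range(n) for j in range(n) if j != i)
triples = 6 * sum(V[i] * V[j] * V[k]
                  for i, j, k in itertools.combinations(range(n), 3))
```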

A.2 Proofs of Theorems 3.1 and 3.2

Proof of Theorem 3.1: Under Assumptions 3-4, T_n is a smooth function of sample moments. Therefore, Theorem 2.8 of Götze and Hipp (1983) and arguments like those used to prove Theorem 2 of Bhattacharya and Ghosh (1978) establish the expansions (2.6) and (2.7). Repetition of Götze and Hipp's proof shows that the expansions (2.8) and (2.9) are also valid. The theorem follows from Lemma 14 and the fact that g_1 and g_2 in (2.6)-(2.9) are polynomial functions of the components of κ and κ*. Q.E.D.

Proof of Theorem 3.2: Equations (2.13) and (2.14) follow from Theorem 3.1. Cornish-Fisher inversions of (2.13) and (2.14) yield (2.15)-(2.18). The theorem follows by applying Lemma 14 and the delta method to (2.18). Q.E.D.
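A Cornish-Fisher inversion replaces an expansion of the distribution function with an expansion of its quantiles: if P(T_n ≤ z) = Φ(z) + n^{−1/2}g_1(z, κ)φ(z) + O(n^{−1}), then the α-quantile of T_n is z_α − n^{−1/2}g_1(z_α, κ) + O(n^{−1}), where z_α is the standard normal α-quantile. The sketch below checks the one-term inversion numerically (Python; the cumulant values and sample size are illustrative assumptions):

```python
import math

def Phi(z):  # standard normal CDF
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def phi(z):  # standard normal density
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def g1(z, k12, k31):  # first Edgeworth polynomial, as in Lemma 14
    return -(k12 + k31 * (z ** 2 - 1) / 6)

def solve(f, lo=-10.0, hi=10.0):
    # bisection for a root of f, assuming f(lo) < 0 < f(hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

n, k12, k31, alpha = 400, 0.0, 1.0, 0.95
h = 1 / math.sqrt(n)

def F(z):  # one-term Edgeworth approximation to P(T_n <= z)
    return Phi(z) + h * g1(z, k12, k31) * phi(z)

z_alpha = solve(lambda z: Phi(z) - alpha)        # standard normal quantile
exact = solve(lambda z: F(z) - alpha)            # exact alpha-quantile of F
cf = z_alpha - h * g1(z_alpha, k12, k31)         # Cornish-Fisher approximation
```

The Cornish-Fisher quantile differs from the exact quantile of F by O(n^{−1}), an order of magnitude closer than the uncorrected normal quantile.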

A.3 Proofs of Theorems 4.1-4.2

Assumptions 1 and 5-10 hold throughout this section.

Lemma 15: For any ε > 0,

    lim_{n→∞} n^{1/2−ε} ‖θ_n − θ_0‖ = 0 a.s.

Proof: This follows from Lemmas 3 and 4 of Andrews (1999) and the Borel-Cantelli lemma. Q.E.D.

Lemma 16: Define S_n = n^{−1} Σ_{i=1}^{n} V(X_i, θ_0), S_n* = n^{−1} Σ_{i=1}^{n} V(X_i*, θ_n), S̄ = E(S_n), and S̄* = E*(S_n*). There is an infinitely differentiable function h such that h(S̄) = h(S̄*) = 0,

    lim_{n→∞} n^{3/2} sup_z |P(t_{nr} ≤ z) − P[n^{1/2} h(S_n) ≤ z]| = 0,

and

    lim_{n→∞} n^{3/2} sup_z |P*(t_{nr}* ≤ z) − P*[n^{1/2} h(S_n*) ≤ z]| = 0 a.s.

Proof: This is a slightly modified version of Lemma 13 of Andrews (1999) and is proved using the methods of proof of that lemma. Q.E.D.

Proof of Theorem 4.1: P[n^{1/2} h(S_n) ≤ z] has an Edgeworth expansion. By Lemma 16, P(t_{nr} ≤ z) has the same expansion through O(n^{−3/2}), and

    P(t_{nr} ≤ z) = Φ(z) + Σ_{j=1}^{2} n^{−j/2} g_j(z, κ)φ(z) + O(n^{−3/2})

uniformly over z, where κ is an infinitely differentiable function of the following moments: E[n^{1/2}(S_n − S̄)]^2, E[n^{1/2}(S_n − S̄)]^3, and E[n^{1/2}(S_n − S̄)]^4. Similarly,

    P*(t_{nr}* ≤ z) = Φ(z) + Σ_{j=1}^{2} n^{−j/2} g_j(z, κ*)φ(z) + O(n^{−3/2})

uniformly over z a.s., where κ* is an infinitely differentiable function of E*[n^{1/2}(S_n* − S̄*)]^2, E*[n^{1/2}(S_n* − S̄*)]^3, and E*[n^{1/2}(S_n* − S̄*)]^4. It is easy to see that the conclusion of Lemma 13 holds for moments of V(X_i, θ_0) and V(X_i*, θ_0). Lemma 15 implies that V(X_i*, θ_n) can be replaced with V(X_i*, θ_0) in the moments comprising κ*. Therefore, the conclusion of Lemma 14 holds for κ* − κ, and the theorem is proved. Q.E.D.

Proof of Theorem 4.2: Let T̃_n be the version of T_n* that is obtained by sampling the process {X̃_j^{(q)}}. Let P̃ denote the probability measure corresponding to T̃_n. Repetition of the steps leading to the conclusions of Theorems 3.1 and 3.2 shows that for every ε > 0,

    sup_z |P̃(T̃_n ≤ z) − P*(T_n* ≤ z)| = o(n^{−1+ε}),

    sup_z |P̃(|T̃_n| ≤ z) − P*(|T_n*| ≤ z)| = o(n^{−3/2+ε}) a.s.,

and

    P̃(|T̃_n| > z_n) = α + o(n^{−3/2+ε}).

Therefore, it suffices to show that

    sup_z |P̃(T̃_n ≤ z) − P(T_n ≤ z)| = o(n^{−3/2+ε}).

P̃(T̃_n ≤ z) and P(T_n ≤ z) have Edgeworth expansions, so it is enough to show that κ̃ − κ = O(n^{−1/2+ε}), where κ = (k_{12}, k_{22}, k_{31}, k_{41})′ and κ̃ is defined as in Lemma 14 but relative to the distribution of T̃_n instead of the distribution of T_n*. As in the proof of Lemma 14, κ̃ − κ = O(n^{−1/2+ε}) follows if |Ẽu(χ) − Eu(χ)| = O(n^{−1/2+ε}) for any ε > 0, where Ẽ denotes the expectation relative to the measure induced by {X̃_j^{(q)}}. Let χ_j = {X_j, X_{j−1}, …}. Define u(·), r_n, the multi-index w, and |w| as in the proof of Lemma 7. Then for a suitable constant C_1 < ∞,

    |Ẽu(χ) − Eu(χ)| ≤ ∫ |u(χ)| |Π_{i=q+1}^{r_n} f^{(q)}(x_i|y_i) − Π_{i=q+1}^{r_n} f(x_i|y_i)| dx_1 … dx_{r_n} dP(χ_j)
        ≤ C_1 ∫ |Π_{i=q+1}^{r_n} f^{(q)}(x_i|y_i) − Π_{i=q+1}^{r_n} f(x_i|y_i)| dx_1 … dx_{r_n} dP(χ_j)
        = C_1 S_n,

where

    S_n = Σ_{|w| ≥ 1} ∫ Π_{i=q+1}^{r_n} f(x_i|y_i)^{1−w_i} |f^{(q)}(x_i|y_i) − f(x_i|y_i)|^{w_i} dx_1 … dx_{r_n} dP(χ_j).

Each value of |w| generates r_n!/[|w|!(r_n − |w|)!] terms in the sum on the right-hand side of S_n. Each term is bounded by (C_2 e^{−bq})^{|w|} for some constant C_2 < ∞. Therefore,

    S_n ≤ Σ_{|w| ≥ 1} {r_n!/[|w|!(r_n − |w|)!]} (C_2 e^{−bq})^{|w|} = (1 + C_2 e^{−bq})^{r_n} − 1.

S_n = O(n^{−1/2+ε}) for any ε > 0 follows from r_n = O(log n) and q = c(log n)^2. Q.E.D.
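The final bound on S_n is the binomial theorem with the k = 0 term removed, and it vanishes faster than n^{−1/2} because e^{−bq} is negligible when q grows like (log n)^2. A numerical illustration (Python; the constants b, c, and C_2 are illustrative assumptions, not values from the paper):

```python
import math

# binomial identity behind the bound:
# sum over 1 <= k <= r of C(r, k) x^k equals (1 + x)^r - 1
r, x = 8, 0.3
lhs = sum(math.comb(r, k) * x ** k for k in range(1, r + 1))
rhs = (1 + x) ** r - 1

# magnitude of the bound with r_n = O(log n) and q = c (log n)^2
n = 10 ** 6
r_n = int(math.log(n))                       # about 13 for n = 10^6
q = math.log(n) ** 2                         # c = 1 (illustrative)
bound = (1 + math.exp(-0.1 * q)) ** r_n - 1  # C_2 = 1, b = 0.1 (illustrative)
```

With these values the bound is several orders of magnitude smaller than n^{−1/2}.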


REFERENCES

Andrews, D.W.K. (1991): "Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimation," Econometrica, 59, 817-858.

Andrews, D.W.K. (1999): "Higher-Order Improvements of a Computationally Attractive k-Step Bootstrap for Extremum Estimators," Cowles Foundation Discussion Paper No. 1230, Cowles Foundation for Research in Economics, Yale University, New Haven, CT.

Andrews, D.W.K. (2002): "Higher-Order Improvements of a Computationally Attractive k-Step Bootstrap for Extremum Estimators," Econometrica, 70, 119-162.

Andrews, D.W.K. and J.C. Monahan (1992): "An Improved Heteroskedasticity and Autocorrelation Consistent Covariance Matrix," Econometrica, 60, 953-966.

Beran, R. (1988): "Prepivoting Test Statistics: A Bootstrap View of Asymptotic Refinements," Journal of the American Statistical Association, 83, 687-697.

Bhattacharya, R.N. and J.K. Ghosh (1978): "On the Validity of the Formal Edgeworth Expansion," Annals of Statistics, 6, 434-451.

Bose, A. (1988): "Edgeworth Correction by Bootstrap in Autoregressions," Annals of Statistics, 16, 1709-1722.

Bose, A. (1990): "Bootstrap in Moving Average Models," Annals of the Institute of Statistical Mathematics, 42, 753-768.

Bosq, D. (1996): Nonparametric Statistics for Stochastic Processes: Estimation and Prediction. New York: Springer-Verlag.

Brown, B.W., W.K. Newey, and S.M. May (2000): "Efficient Bootstrapping for GMM," unpublished manuscript, Department of Economics, Massachusetts Institute of Technology, Cambridge, MA.

Bühlmann, P. (1997): "Sieve Bootstrap for Time Series," Bernoulli, 3, 123-148.

Bühlmann, P. (1998): "Sieve Bootstrap for Smoothing in Nonstationary Time Series," Annals of Statistics, 26, 48-83.

Carlstein, E. (1986): "The Use of Subseries Methods for Estimating the Variance of a General Statistic from a Stationary Time Series," Annals of Statistics, 14, 1171-1179.

Chan, K.S. and H. Tong (1998): "A Note on Testing for Multi-Modality with Dependent Data," working paper, Department of Statistics and Actuarial Science, University of Iowa, Iowa City, IA.

Cheng, B. and H. Tong (1992): "Consistent Non-Parametric Order Determination and Chaos," Journal of the Royal Statistical Society, Series B, 54, 427-449.

Choi, E. and P. Hall (2000): "Bootstrap Confidence Regions Computed from Autoregressions of Arbitrary Order," Journal of the Royal Statistical Society, Series B, 62, 461-477.

Datta, S. and W.P. McCormick (1995): "Some Continuous Edgeworth Expansions for Markov Chains with Applications to Bootstrap," Journal of Multivariate Analysis, 52, 83-106.

Davidson, R. and J.G. MacKinnon (1999): "The Size Distortion of Bootstrap Tests," Econometric Theory, 15, 361-376.

Doukhan, P. (1995): Mixing: Properties and Examples. New York: Springer-Verlag.

Götze, F. and C. Hipp (1983): "Asymptotic Expansions for Sums of Weakly Dependent Random Vectors," Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 64, 211-239.

Götze, F. and H.R. Künsch (1996): "Second-Order Correctness of the Blockwise Bootstrap for Stationary Observations," Annals of Statistics, 24, 1914-1933.

Hall, P. (1985): "Resampling a Coverage Process," Stochastic Processes and their Applications, 19, 259-269.

Hall, P. (1992): The Bootstrap and Edgeworth Expansion. New York: Springer-Verlag.

Hall, P. and J.L. Horowitz (1996): "Bootstrap Critical Values for Tests Based on Generalized-Method-of-Moments Estimators," Econometrica, 64, 891-916.

Hall, P., J.L. Horowitz, and B.-Y. Jing (1995): "On Blocking Rules for the Bootstrap with Dependent Data," Biometrika, 82, 561-574.

Hansen, B. (1999): "Non-Parametric Dependent Data Bootstrap for Conditional Moment Models," working paper, Department of Economics, University of Wisconsin, Madison, WI.

He, C. and T. Teräsvirta (1999): "Properties of Moments of a Family of GARCH Processes," Journal of Econometrics, 92, 173-192.

Horowitz, J.L. (1994): "Bootstrap-Based Critical Values for the Information-Matrix Test," Journal of Econometrics, 61, 395-411.

Horowitz, J.L. (1997): "Bootstrap Methods in Econometrics," in Advances in Economics and Econometrics: Theory and Applications, Seventh World Congress, ed. by D.M. Kreps and K.F. Wallis. Cambridge: Cambridge University Press, pp. 188-222.

Horowitz, J.L. (1999): "The Bootstrap in Econometrics," in Handbook of Econometrics, Vol. 5, ed. by J.J. Heckman and E.E. Leamer. Amsterdam: Elsevier, forthcoming.

Inoue, A. and M. Shintani (2000): "Bootstrapping GMM Estimators for Time Series," unpublished manuscript, Department of Agricultural and Resource Economics, North Carolina State University.

Kreiss, J.-P. (1992): "Bootstrap Procedures for AR(∞) Processes," in Bootstrapping and Related Techniques, Lecture Notes in Economics and Mathematical Systems, 376, ed. by K.H. Jöckel, G. Rothe, and W. Sendler. Heidelberg: Springer-Verlag, pp. 107-113.

Künsch, H.R. (1989): "The Jackknife and the Bootstrap for General Stationary Observations," Annals of Statistics, 17, 1217-1241.

Meyn, S.P. and R.L. Tweedie (1993): Markov Chains and Stochastic Stability. Berlin: Springer-Verlag.

Nagaev, S.V. (1961): "More Exact Statements of Limit Theorems for Homogeneous Markov Chains," Theory of Probability and Its Applications, 6, 62-81.

Newey, W.K. and K.D. West (1987): "A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix," Econometrica, 55, 703-708.

Newey, W.K. and K.D. West (1994): "Automatic Lag Selection in Covariance Matrix Estimation," Review of Economic Studies, 61, 631-653.

Paparoditis, E. (1996): "Bootstrapping Autoregressive and Moving Average Parameter Estimates of Infinite Order Vector Autoregressive Processes," Journal of Multivariate Analysis, 57, 277-296.

Paparoditis, E. and D.N. Politis (2001a): "A Markovian Local Resampling Scheme for Nonparametric Estimators in Time Series Analysis," Econometric Theory, 17, 540-566.

Paparoditis, E. and D.N. Politis (2001b): "The Local Bootstrap for Markov Processes," Journal of Statistical Planning and Inference, forthcoming.

Rajarshi, M.B. (1990): "Bootstrap in Markov-Sequences Based on Estimates of Transition Density," Annals of the Institute of Statistical Mathematics, 42, 253-268.

Silverman, B.W. (1986): Density Estimation for Statistics and Data Analysis. London: Chapman & Hall.

Zvingelis, J.J. (2000): "On Bootstrap Coverage Probability with Dependent Data," working paper, Department of Economics, University of Iowa, Iowa City, IA.

TABLE 1: RESULTS OF MONTE CARLO EXPERIMENTS

                                 Empirical Rejection Probability^a
                              U as in (5.2)              U ~ N(0,1)
Critical        Block
Value           Length    ARCH(1)   GARCH(1,1)     ARCH(1)   GARCH(1,1)

Symmetrical Tests
Asymptotic                 0.089      0.063         0.092      0.099
Block Boot.       2        0.069      0.054         0.076      0.081
                  5        0.075      0.068         0.073      0.073
                 10        0.072      0.074         0.073      0.058
MCB                        0.044      0.048         0.054      0.067

One-Sided Upper-Tail Tests
Asymptotic                 0.032      0.035         0.058      0.064
Block Boot.       2        0.049      0.042         0.067      0.110
                  5        0.085      0.052         0.085      0.091
                 10        0.087      0.074         0.092      0.056
MCB                        0.038      0.050         0.064      0.073

One-Sided Lower-Tail Tests
Asymptotic                 0.092      0.075         0.091      0.093
Block Boot.       2        0.067      0.055         0.069      0.092
                  5        0.078      0.058         0.085      0.084
                 10        0.082      0.059         0.088      0.062
MCB                        0.046      0.040         0.055      0.068

^a Standard errors of the empirical rejection probabilities are in the range 0.0025 to 0.0042, depending on the probability.