Upload
others
View
0
Download
0
Embed Size (px)
Citation preview
ARTICLE IN PRESS
Journal of Econometrics 134 (2006) 1–68
0304-4076/$ -
doi:10.1016/j
�CorrespoE-mail ad
www.elsevier.com/locate/jeconom
Asymptotic properties of Monte Carlo estimatorsof diffusion processes
Jerome Detemplea,b, Rene Garciac,b,d,�, Marcel Rindisbacherb,e
aBoston University School of Management, USAbCIRANO, Canada
cEconomics Department, Universite de Montreal, CanadadCIREQ, Canada
eRotman School of Management, University of Toronto, Canada
Available online 31 August 2005
Abstract
This paper studies the limit distributions of Monte Carlo estimators of diffusion processes.
We examine two types of estimators based on the Euler scheme, one applied to the original
processes, the other to a Doss transformation of the processes. We show that the
transformation increases the speed of convergence of the Euler scheme. We also study
estimators of conditional expectations of diffusions. After characterizing expected approx-
imation errors, we construct second-order bias-corrected estimators. We also derive new
convergence results for the Mihlstein scheme. Illustrations of the results are provided in the
context of simulation-based estimation of diffusion processes.
r 2005 Elsevier B.V. All rights reserved.
JEL classification: C13; C15
Keywords: Monte Carlo estimators; Diffusion processes; Simulation-based estimation; Doss
transformation
see front matter r 2005 Elsevier B.V. All rights reserved.
.jeconom.2005.06.028
nding author. Tel.: +1514 343 5960; fax: +1 514 343 5831.
dress: [email protected] (R. Garcia).
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–682
1. Introduction
Standard applications in finance, such as portfolio allocation, asset pricing andrisk management, rely on models of prices and state variables described by diffusionprocesses. For practical implementations of these financial models estimates of thedrift and volatility coefficients of the underlying processes are needed. Precisestatistical methods are essential for that purpose. The importance of accuratemethods is further enhanced by the potentially large impact of parameter uncertaintyand estimation errors on economic decisions.1
While maximum likelihood is the estimation method of choice, it is, in general,infeasible as explicit formulas for transition densities are often unknown. Inferencecan, therefore, only proceed with the use of an auxiliary numerical proceduredesigned to approximate the transition density. An example is the simulatedmaximum likelihood estimation (SMLE) method, suggested by Pedersen (1995a,b).2
This method approximates the transition density by applying a Euler discretizationto the diffusion process. It splits the time between observations into subintervals andevaluates the transition of the process between any two observations by integratingout convolutions of subdensities (generally Gaussian) using a Monte Carloprocedure. Consistency of the estimator is established when the numbers ofsubintervals and Monte Carlo replications go to infinity, while a particular ratio ofthe two converges to zero.3 Any feasible estimator, however, involves finite numbersof subintervals and replications and will therefore be biased and inefficient.Moreover, there are trade-offs between the number of subintervals and that ofreplications. Increasing the former will decrease the bias but will boost the varianceof the estimator. Increasing the latter will reduce the variance, but will also, if theestimator is biased, diminish the probability that a confidence interval of a given sizecovers the true value of the parameter. The main contribution of our paper is tocharacterize the asymptotic properties of the errors associated with efficientapproximation procedures for the estimation of diffusion processes. Our resultsenable us to construct (second-order) bias-corrected estimators that are asympto-tically equivalent to estimators obtained by sampling from the true distribution ofthe processes.
In a recent paper, Durham and Gallant (2002) stress that SMLE can becomecomputationally expensive, if the goal is to achieve a reasonable degree of accuracy,and propose various ways to reduce this cost. Their approach uses bias- andvariance-reduction schemes to limit both the number of subintervals and that ofreplications. They investigate, by experimentation, the trade-offs between these twocontrol parameters for various discretization and simulation schemes. Our goal is tosupplement experimentation with an explicit characterization of the asymptotic errorproperties of Monte Carlo estimators for general diffusion processes.
1For optimal portfolio choice, see, for example, Bawa et al. (1979) and Barberis (2000).2See also Santa-Clara (1995) and Brandt and Santa-Clara (2002).3In Brandt and Santa-Clara (2002), the ratio S1=2=M ! 0, with S the number of replications and M the
number of subintervals. A discussion of this ratio is provided in Section 5.1.
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 3
As an illustration, we apply our convergence results to simulated methods ofmoments.4 These include the simulated method of moments (Duffie and Singleton,1993), indirect inference procedures (Smith, 1990; Gourieroux et al., 1993) andefficient method of moment procedures (EMM) (Gallant and Tauchen, 1996).5
Recently, Broze et al. (1998) studied the asymptotic bias associated with indirectinference. This procedure relies on the simulation of diffusion processes at a step thatis finer than the observation step to correct the discretization bias. They emphasizethat the use of any fixed time interval to perform the simulation implies anasymptotic bias.6 Although their theoretical asymptotic results support the choice ofan arbitrarily small discretization step, their numerical experiments show a trade-offbetween the length of the simulated data MT (where T is the length of the observedsample and M the number of trajectories) and the size h ¼ T=N of the discretizationstep. Our results provide an analytical characterization of this trade-off. We show, inparticular, that the second-order bias is a function of both M and N and provide aprocedure to correct simulation-based method-of-moment estimators for it.
To state the problem in general terms, note that auxiliary numerical proceduresfor SMM estimators involve the calculation of conditional expectations of functionsof the solutions of stochastic differential equations. The use of a Monte Carloprocedure for that purpose induces two types of approximation errors. The firsterror is due to the numerical solution of a stochastic differential equation based on atime discretization scheme. The second error is the Monte Carlo error due to theapproximation of the expectation in the moment condition, by an average overindependent replications. Suppose that one seeks to compute the conditionalexpectation f ðt;xÞ ¼ Et½gðX T Þ�; where X T is the terminal value of the solution of thestochastic differential equation (SDE)
dX v ¼ AðX vÞdvþ BðX vÞdW v; X t ¼ x. (1)
To approximate the terminal value X T , of the solution of (1), several discretizationschemes can be used.7 The most popular, perhaps because of its ease ofimplementation, is the Euler scheme. This iterative procedure evaluates the driftand volatility functions at the value X ðtnÞ at time tn in order to infer the value X ðtnþ1Þ
at tnþ1, and proceeds in this manner until tN ¼ T . The second approximation is inthe computation of the conditional expectation that is performed by averaging over
4The methodology developed here could be applied to other simulation-based estimation methods such
as SMLE. In those applications the steps needed to characterize the bias will typically be more involved
than for the simulated method of moments.5Other numerical approaches have been proposed to estimate a diffusion process when the transition
density is not known in closed form. Let us mention the numerical resolution of the Kolmogorov forward
equation (Lo, 1988), the estimation of the infinitesimal generator based on a truncated set of
eigenfunctions (Kessler and Sørensen, 1999; Hansen and Scheinkman, 1995), orthogonal series expansions
(Ait-Sahalia, 2002), and Markov chain Monte Carlo (MCMC) Bayesian techniques (Eraker, 2001; Chib et
al., 2001). Except for Ait-Sahalia (2002), all other methods require that the sampling interval go to zero to
obtain convergence and involve the two types of approximation errors mentioned above.6Because the simulation is based on a fixed time step, and therefore not performed using the true
probability density function, they rename the method quasi-indirect inference.7A detailed analysis of discretization schemes available can be found in Kloeden and Platen (1997).
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–684
a finite sample of approximated terminal values X ðTÞ. Justification for this averagingrests on the law of large numbers. The combination of these two operations, labelledMCE (Monte Carlo with Euler discretization), produces an estimate of f ðt;xÞ thatinvolves the two types of errors mentioned above. Understanding the trade-offbetween these errors requires the asymptotic error distribution.
In an insightful paper, Duffie and Glynn (1995) highlighted the trade-off betweenthe discretization error and the Monte Carlo averaging error, and showed theexistence of an efficient choice of discretization steps and Monte Carlo replications.For this efficient MCE scheme, they also characterized the asymptotic distribution ofthe approximation error and found it to be non-centered. A consequence is that theefficient procedure has a second-order bias. In this paper we characterize the second-order bias as the expected value of a known random variable. This random variablecan be simulated along with the diffusion and used to design a new approximationthat corrects for second-order bias. The bias corrected estimate is asymptoticallyequivalent to a Monte Carlo procedure that samples directly from the truedistribution of the terminal point of the diffusion.8
In the first part of this paper (Sections 2–4), we study the asymptotic distributionsof errors associated with discretization schemes for general diffusion processes andof Monte Carlo estimators of conditional expectations of diffusions. Errorproperties for approximations of solutions of SDEs (the first type of error) havebeen studied before. For the Euler discretization scheme the asymptotic errordistribution was found by Kurtz and Protter (1991a) and Jacod and Protter (1998).We extend their results by proposing a change of variables, commonly referred to asa Doss transformation (see Doss, 1977; Detemple et al., 2005) that reduces thediffusion coefficient of the SDE to unity. This transformation has enjoyed recentpopularity in financial econometrics (see, for instance, Ait-Sahalia, 2002; Durhamand Gallant, 2002) and has been used for the computation of optimal portfolios indynamic asset allocation models (Detemple et al., 2003). We show that a Dosstransformation of the SDE can improve the speed of convergence of thediscretization scheme as the martingale part of the transformed SDE can beapproximated without error. The asymptotic law of the estimate of the Dosstransformed state variable is derived and found to be non-centered. This can becontrasted with the simple Euler scheme applied to the original (non-transformed)SDE that produces an error whose asymptotic law is centered.
One of the numerical schemes used to reduce bias is the Milshtein second-orderscheme (Milshtein, 1995). We characterize its asymptotic error distribution and showthat it does not dominate the Euler scheme with transformation, in terms ofconvergence behavior. This highlights the benefits of the transformation. Our resultsfor the weak limit of the solution of SDEs based on the Milshtein scheme and thesecond-order bias of estimators of conditional expectations are new and therefore of
8Our method is simpler than the one in Talay and Tubaro (1990) as it does not require solving a PDE in
addition to calculating an expectation. In addition, it leads to the construction of computationally feasible
second-order bias corrected approximation schemes.
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 5
independent interest. Moreover, they enlighten some of the experimentation resultsof Durham and Gallant (2002).
In the second part of the paper (Section 5), we apply these results to the estimationof the parameters of diffusions by simulated methods of moments. We first provide ageneral setup for parameter estimation and then develop an explicit simulatedextended quasi-maximum likelihood estimator based on the estimator presented byWefelmeyer (1996). We illustrate the method with L-CEV and CIR processes thatare widely used in the literature to model the short rate of interest.
As mentioned above, our theoretical convergence results enable us to design bias-corrected estimators that have the same asymptotic distributions as the infeasibleestimators based on the unknown true transition densities. We illustrate theimproved performance of these estimators in simulation experiments similar to thoseperformed by Durham and Gallant (2002) for the CIR model. Their simulationsshow that the Doss transformation reduces the bias of the parameter estimates. Ourasymptotic convergence results establish that this transformation indeed increasesthe speed of convergence, hence provides a theoretical explanation for their finding.
The paper is organized as follows. In Section 2 we study the asymptotic errordistributions of approximations of solutions of SDEs and provide numericalillustrations for standard processes in finance. Section 3 describes the asymptoticlaws of estimators of conditional expectations and characterizes expectedapproximation errors, second-order discretization biases and bias-corrected estima-tors. New asymptotic convergence results for the Milshtein scheme are presented inSection 4. Section 5 provides an application to a simulation-based method ofmoments. Conclusions are formulated in Section 6. All proofs are collected inAppendix A. Appendix B contains expressions needed to characterize the second-order bias-corrected estimators.
2. Asymptotic laws of estimators of solutions of SDEs
Continuous-time financial models are often based on multivariate diffusions withgeneral drift and diffusion functions. The solution of the financial application, be itasset pricing, portfolio allocation or risk management, relies on the simulation ofdiscretized versions of these stochastic differential equations (SDEs). The Eulerscheme is most often used for this purpose. This discretization involves anapproximation error. In this section we study the asymptotic error distribution ofEuler approximations of solutions of SDEs. We also study the error distributionassociated with a Doss transformation of the state variables. This change ofvariables is useful for numerical efficiency, as shown by the dynamic portfolioapplication in Detemple et al. (2003). It has also been used by Ait-Sahalia (2002) andDurham and Gallant (2002) as a variance reduction device. The importance ofobtaining an explicit expression for the asymptotic distribution of the approximationerror cannot be underestimated. As we will see, it is not possible to approximate thedistribution of this error in a finite simulation experiment using a simulatedbenchmark for the true value X T , no matter how finely the process is sampled.
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–686
Convergence results for the Euler scheme are reviewed in Section 2.1. Theirapplication to Doss-transformed processes is studied in Section 2.2. Numericalillustrations are provided in Section 2.3.9
2.1. Euler approximation without transformation
To set the stage for the convergence results with the transformation and theMilshtein scheme, we recall known results for the standard Euler scheme. Theseresults are also essential for finding an explicit expression for the second-order bias.
Consider the d � 1 random vector X T given by the terminal value of the solutionof the SDE
dX v ¼ AðX vÞdvþXd
j¼1
BjðX vÞdW jv, (2)
where A and Bj are d � 1 vectors such that A 2 C3ðRd Þ, Bj 2 C3ðRd Þ and A;Bj are atmost of linear growth.10 The Euler approximation of (2) is
X NT ¼ X 0 þ
XN�1n¼0
AðX NnhÞhþ
XN�1n¼0
Xd
j¼1
BjðXNnhÞDW
jnh, (3)
where h ¼ T=N and DWjnh ¼W
jðnþ1Þh �W
jnh.
11
Kurtz and Protter (1991a) and Jacod and Protter (1998) deduce the asymptoticbehavior of the error associated with the Euler approximation of X T (see Jacod andProtter, 1998, Theorem 3.2, p. 276).
Theorem 1. The approximation error X NT � X T converges weakly at the rate 1=
ffiffiffiffiffiNp
(i.e.ffiffiffiffiffiNpðX N
T � X T Þ ) UXT ).
12 The asymptotic error is
UXT ¼ �
1ffiffiffi2p OT
Z T
0
O�1v
Xd
l;j¼1
½qBjBl �ðX vÞdZl;jv (4)
with ½Zl;j �l;j2f1;...;dg a d2� 1 standard Brownian motion independent of W, qBj a d � d
matrix of derivatives of Bj with respect to X and
Ov ¼ ER
Z �0
qAðX sÞdsþXd
j¼1
Z �0
qBjðX sÞdW js
!v
. (5)
9To simplify the presentation we assume homogeneous dynamics of the process to be simulated.10The space CkðRd Þ denotes the space of k times continuously differentiable, Rd -valued functions.11To simplify the notation we restrict the error analysis to equidistant discretization schemes.12Let S be a metric space and S its Borel sets. A sequence of random variables X N is said to converge
weakly to a random variable X whenever, with PXN � P � ðX N Þ�1 and PX � P � X�1, we haveR
Sf ðsÞdPXN ðsÞ !
RS
f ðsÞdPX ðsÞ for all continuous and bounded functions f on S (see Billingsley, 1968).
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 7
In this last expression qA is the d � d matrix of derivatives of the vector A with respect
to the elements of X and ERð�Þ is the right stochastic exponential.13
Theorem 1 says that the error converges in law at the rate 1=ffiffiffiffiffiNp
to the randomvariable UX
T , as the number of discretization points N becomes large. The asymptoticerror UX
T depends on the coefficients of the SDE and their derivatives. Surprisingly,it also depends on new Brownian motions (½Zl;j�l;j2f1;...;dg), which are orthogonal tothe original ones (W). These appear because the stochastic integral of a time-dependent function with respect to a Brownian motion is imperfectly correlated withthe terminal value of the Brownian motion. A second Brownian motion is thenneeded to describe the law of the integral. The result in Theorem 1 follows becausethe limit law of X N
T depends on stochastic integrals of this sort.To illustrate the result consider the simple case of a geometric Brownian motion
dX v ¼ aX v dvþ bX v dW v. (6)
The asymptotic error is UXT ¼ �ðb=
ffiffiffi2pÞX T ZT ¼ Z
ðb2=2ÞX2T, where X T is log-normally
distributed and ZT is normal. The error distribution is a mixture of normals wherethe mixing distribution is the square of the geometric Brownian motion X T .
The scaling property of the Brownian motions Zl;j shows that the asymptoticdistribution, in the univariate case, is always a mixture of normals. But the law of themixing distribution is not always explicit. For instance, in the case of a CIR process
dX v ¼ kðX � X vÞdvþ sffiffiffiffiffiffiX v
pdW v, (7)
one finds that
UXT ¼ �
s2
2ffiffiffi2p OT
Z T
0
O�1v dZv ¼ Zs48O2
T
R T
0O�2v dv
.
The law of the mixing random variable ðs4=8ÞO2T
R T
0 O�2v dv is unknown as O satisfiesthe equation dOv ¼ ð�kdvþ ððs=2Þ=
ffiffiffiffiffiffiX v
pÞdW vÞOv where O0 ¼ 1, whose solution
depends on the path of X. One must then resort to numerical procedures to examinethe asymptotic error distribution. Illustrations of this are provided in Figs.1 and 2for the CIR process and the constant elasticity of variance process with linear drift(L-CEV).
The dependence of the asymptotic distribution on an independent Brownianmotion, that does not exist on the original probability space, shows that it is notpossible to approximate the distribution of the approximation error in a finitesimulation experiment using a simulated benchmark for the true value X T . Thisunderscores the importance of an explicit formula for the asymptotic error. In theabsence of an exact expression for UX
T , error analysis using a simulated benchmarkX N%
T with N% large (i.e. analysis offfiffiffiffiffiNpðX N
T � X N%
T Þ) will always depend just on theoriginal Brownian motions W j and not on the independent Brownian motions Zl;j
that appear in the random limit.
13For a d � d semimartingale M, the right stochastic exponential Zv ¼ ERðMÞv is the unique solution of
the d � d matrix SDE dZv ¼ dMv Zv with Z0 ¼ Id .
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–688
Theorem 1 shows that the standard Euler procedure converges at the rate of1=
ffiffiffiffiffiNp
. Next, we introduce a change of variables that simplifies the volatility of theunderlying SDE, thereby leading to an approximation of the true value X T with animproved rate of convergence, equal to 1=N. Durham and Gallant (2002) havealready remarked in their experiments that this transformation improves theperformance of their simulation and importance sampling schemes. They suggestthat the improvement might be related to the fact that the transformed process is‘‘closer’’ to a Gaussian process. We formalize this intuition and illustrate it withexamples in the next sub-sections.
2.2. Euler approximation with Doss transformation
Let us first introduce the Doss transformation.14 Consider the transformedvolatility coefficient, BBðxÞ � BðxÞB
�1, where B is an arbitrary matrix of constants.
Suppose that the rotated volatility coefficient BB satisfies the rank condition,
rankðBBÞ ¼ d; a.e., (8)
and the commutativity condition,
qBBj BB
i ¼ qBBi BB
j for all i; j ¼ 1; . . . ; d. (9)
Then, there exists a function GB : Rd ! Rd solution of the total ODE
qzGBðzÞ ¼ BBðGBðzÞÞ; GBð0Þ ¼ 0, (10)
such that X t ¼ GBðX tÞ, where
dX v ¼ AðX vÞdvþXd
j¼1
Bj dW jv with GBðX 0Þ ¼ X 0, (11)
and
AðxÞ � BBðxÞ�1AtGBðxÞ, (12)
where the operator A is defined by
AGB � AðGBÞ �1
2
Xd
j¼1
qBBj ðG
BÞ. (13)
Note that in the case where B is commutative, one can choose B ¼ Id . In thisinstance the transformed state variables X j have unit volatility coefficients.
The Euler approximation of the transformed state variables satisfying (11) is
XN
T ¼ X 0 þXN�1n¼0
AðXN
nhÞhþXN�1n¼0
Xd
j¼1
BjDWjnh.
The error distribution of this approximation of the d-vector X T is given next.
14See Doss (1977) and Detemple et al. (2005) for details.
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 9
Theorem 2. Suppose that the rank and commutativity conditions (8) and (9) are
satisfied. The approximation error XN
T � X T converges weakly at the rate 1=N (i.e.
NðXN
T � X T Þ ) UXT ). The asymptotic error is
UXT ¼ � OT
Z T
0
O�1
v qAðX vÞ
�1
2dX v þ
1ffiffiffiffiffi12p
Xd
j¼1
Bj dZjv þ
1
2
Xd
j;k;l¼1
ql;kAðX vÞBk;j Bl;j dv
!ð14Þ
with ½Zj �j2f1;...;dg a d � 1 standard Brownian motion independent of W and Zh;j ,
qAðX vÞ ¼ ½q1AðX vÞ; . . . ; qdAðX vÞ� the d � d matrix with columns given by the
derivatives of the vector AðX vÞ, and ql;kAðX sÞ the d � 1 vector of cross derivatives of
AðX vÞ with respect to arguments l; k. The d � d matrix Ov is
Ov ¼ ER
Z �0
qAðX sÞds
� �v
. (15)
Theorem 2 shows that the speed of convergence increases after application of thetransformation. It also shows that the limiting random variable is different and, inparticular, involves exponentials of a bounded total variation process instead of astochastic integral. But, in contrast to the limit without transformation UX
T , theasymptotic error UX
T does not have a zero mean.The simplest example is the Ornstein–Uhlenbeck process
dX v ¼ kðX � X vÞdvþ sdW v. (16)
Its asymptotic error UXT is the sum of two normal random variables
UXT ¼
k2e�kT
Z T
0
ekv dX v þksffiffiffiffiffi12p e�kT
Z T
0
ekv dZv
�k2
2ðX � X 0Þe
�kT T þW aðTÞ þ ZbðTÞ, ð17Þ
where
aðTÞ �s2kðe2kT � 1þ 2kTð1� kTÞÞ
16e2kTand bðTÞ �
ks2
24ð1� e�2kT Þ.
If X 0aX the asymptotic law is clearly non-centered.The result in Theorem 2 can now be used to construct an approximation of
X T ¼ GðX T Þ with an improved speed of convergence.
Corollary 1. Under the conditions of Theorem 2, NðGðXN
T Þ � X T Þ ) BðX T ÞUXT .
The convergence rate 1=N attained by GðXN
T Þ is the same as the convergence rateof the Euler scheme applied to an ordinary differential equation. This is the best ratethat can be attained with an Euler scheme.
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6810
2.3. A numerical example
For concreteness we illustrate Theorems 1 and 2 with a mean reverting, constantelasticity of variance process
dX v ¼ kðX � X vÞdvþ sX gv dW v. (18)
This specification is often used to model the short rate in term structure models, as inChan et al. (1992). For g ¼ 0:5 we obtain a CIR process; otherwise it is a standardL-CEV process.
For a precise characterization of the asymptotic error we need to identify theexpressions in OT and UX
T . Straightforward computations give qBðxÞ ¼ sgxg�1 andqAðxÞ ¼ �k. The SDE for the transformed process is15
dX v ¼qA
B�
1
2qB
� �� G
� �ðX vÞ þ dW v, (19)
where GðxÞ ¼ ðsð1� gÞxÞ1=ð1�gÞ. Similarly, expressions for qA and q2A that appear inOT and UX
T , are
qAðxÞ ¼ qA�AqB
B�
1
2q2BB
� �� G
� �ðxÞ
and
q2AðxÞ ¼ q2AB� qA qB� Aq2 BþAðqBÞ2
B�
1
2ðq3B Bþ q2BqBÞB
� �� G
� �ðxÞ
(20)
with q2BðxÞ ¼ sgðg� 1Þxg�2, q3BðxÞ ¼ sgðg� 1Þðg� 2Þxg�3 and q2AðxÞ ¼ 0.For the L-CEV process we adopt the parameter values in Chan et al. (1992):
k ¼ 0:0171, X ¼ 0:1138, s ¼ �0:0655, g ¼ 0:9997. Parameter values for the CIRprocess are taken from Broze et al. (1998): k ¼ 0:0305, X ¼ 0:0791, s ¼ �0:0219.
Figs. 1 and 2 graph the asymptotic error distributions for CIR and L-CEV. Thegraphs plot the empirical distribution functions based on M ¼ 50 000 replicationsand use T ¼ 1 and N ¼ 365. Observe that the empirical distributions using thetransformation are non-centered and that the errors, with the transformation, areconsiderably smaller.
Comparing Figs. 3 and 4 for the CIR process and Figs. 5 and 6 for the L-CEVprocess reveals the increased speed of convergence with the transformation. In thoseexperiments the benchmark ‘‘true’’ value is computed without transformation andtaking N ¼ 214.16 Approximation errors, relative to this benchmark, are thencomputed using N ¼ 2x with x ¼ 2; . . . ; 9. Distribution functions are again based onM ¼ 50 000 replications.
The simulation of the discretized version of the process of interest is usually thefirst step of a Monte Carlo procedure for an econometric technique such as the
15Given two functions f and g, the notation � reads ½f � g�ðxÞ � f ðgðxÞÞ.16We take this shortcut to illustrate the relative speed of convergence.
ARTICLE IN PRESS
1
0.9
0.8
0.7
0.6
0.5
0.4
0.3
0.2
0.1
-5e-05
-2e-07 -1e-07 1e-07 2e-07 3e-07 4e-07 5e-070
-4e-05 -3e-05 -2e-05 -1e-05 1e-05 2e-05 3e-05 4e-05 5e-0500
1
0.9
0.8
0.7
0.6
0.5
0.4
0.3
0.2
0.1
0
Fig. 1. Asymptotic error distribution function of CIR process, approximated with a Euler scheme without
Doss transformation (PðUX1 pxÞ upper graph), and with Doss transformation (PðU
GðX Þ1 pxÞ lower graph).
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 11
simulated method of moments. The next step is to generate a large number ofdiscretized trajectories to compute an average designed to approximate the momentcondition. Common wisdom suggests that the precision of this estimator can beimproved by discretizing the process as finely as possible and taking a very largenumber of independent replications. In practice, however, one faces a limited budgetof computation time. Duffie and Glynn (1995) propose a computationally efficientscheme which optimizes the gains achieved by reducing the length of thediscretization step and by increasing the number of simulations of the sample pathof the discretized process. For this efficient MCE scheme, they also characterized theasymptotic distribution of the approximation error and found it to be non-centered.The efficient procedure has therefore a second-order bias. In the next section, weextend their results in several directions. We start with a characterization of thesecond-order bias as the expected value of a known random variable. As this randomvariable can be simulated, along with the diffusion, a new approximation thatcorrects for second-order bias can be designed. We also provide equivalent resultsfor the Doss-transformed process introduced in the previous section.
ARTICLE IN PRESS
1
0.9
0.8
0.7
0.6
0.5
0.4
0.3
0.2
0.1
0
1
0.9
0.8
0.7
0.6
0.5
0.4
0.3
0.2
0.1
0
-1.5e-05
-1e-07 -5e-08 5e-08 1e-07 1.5e-07 2e-07 2.5e-07 3e-07 3.5e-070
-1e-05 -5e-06 5e-06 1e-05 1.5e-050
Fig. 2. Asymptotic error distribution function of a CEV process, approximated with a Euler scheme
without Doss transformation (PðUX1 pxÞ upper graph), and with Doss transformation (PðUGðX Þ
1 pxÞ lower
graph).
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6812
3. Asymptotic laws of estimators of conditional expectations
We now derive the asymptotic error of the estimate of the conditional expectationof a function of the terminal value of an SDE, X T . When the distribution of X T isunknown an estimator of the expected value is obtained by sampling independentreplications of the numerical solution of the SDE and averaging over the sampledvalues. The approximation error of this scheme has two components (Duffie andGlynn, 1995). The first is the error due to the discretization of the SDE. The secondis the error in the approximation of the conditional expectation by a sample average.
Section 3.1 presents our central result, namely the asymptotic error distributionsassociated with estimators of conditional expectations. Auxiliary results concerningthe error component associated with the discretization scheme are described inSection 3.2. The second-order biases of these estimators are discussed in Section 3.3,and bias correction is performed in Section 3.4.
ARTICLE IN PRESS
0
0.1
−7 − 6 − 5 − 4 − 3 − 2 −1 0 1 2
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Convergence of Cumulative Distribution Function: CIR (no transformation)
Pro
babi
lity
Error x 10− 4
Fig. 3. Speed of convergence of distribution function of the approximation error X N1 � X 1 for a CIR
process approximated with a Euler scheme for N ¼ 2x and x ¼ 2; . . . ; 9.
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 13
3.1. Asymptotic error distributions
Suppose that we wish to calculate E½gðX T ÞjF0� ¼ E0½gðX T Þ� ¼ E0½gðX T Þ� where X
solves (2) or (11). The estimators without and with transformation are
gN ;M �1
M
XMi¼1
gðX i;NT Þ, (21)
and
gN ;M�
1
M
XMi¼1
gðXi;N
T Þ. (22)
These estimators of the conditional expectation rely on the law of large num-
bers and draw independent replications X i;NT (resp. X
i;N
T ) of the terminal points
ARTICLE IN PRESS
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Convergence of Cumulative Distribution Function: CIR (transformation)
Pro
babi
lity
− 1 − 0.5 0.5 1.51 0
Error x 10− 4
Fig. 4. Speed of convergence of the distribution function of the approximation error GðXN
1 Þ � X 1 for a
Doss-transformed CIR process approximated with a Euler scheme for N ¼ 2x and x ¼ 2; . . . ; 9.
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6814
X NT (resp. X
N
T ) of the Euler discretized diffusion without (resp. with) Dosstransformation. Our next theorem describes their asymptotic laws.
Theorem 3. Let g 2 C1ðRdÞ, g 2 C3ðRd Þ, and suppose that gðX T Þ 2 D1;2.17 Also
suppose that the assumptions of Theorems 4 and 5 hold. For the schemes without and
with transformation, we have, as M !1,ffiffiffiffiffiffiMpðgNM ;M � E0½gðX T Þ�Þ ) �1
2KT ðX 0Þ þ LT ðX 0Þ, (23)
ffiffiffiffiffiffiMpðgNM ;M
� E0½gðX T Þ�Þ ) �12
KT ðX 0Þ þ LT ðX 0Þ, (24)
17The spaceD1;2 is defined as the domain of the Malliavin derivative operator (see Nualart (1995) for an
exact definition and more on Malliavin calculus). A brief introduction to Malliavin calculus also appears
in Appendix D of Detemple et al. (2003).
ARTICLE IN PRESS
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Convergence of Cumulative Distribution Function: CKLS (no transformation)
Pro
babi
lity
− 8 − 6 − 4 − 2 0 2
x 10− 4Error
Fig. 5. Speed of convergence of the distribution function of the approximation error X N1 � X 1 for a CEV
process approximated with a Euler scheme for N ¼ 2x and x ¼ 2; . . . ; 9.
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 15
where limM!1NM ¼ þ1 and � ¼ limM!1
ffiffiffiffiffiffiMp
=NM , and LT ðX 0Þ is the terminal
value of a centered Gaussian martingale with (deterministic) quadratic variation and
conditional variance given by
½L;L�T ¼
Z T
0
E0½NvðNvÞ0�dv � VAR½gðX T ÞjF0�, (25)
Nv ¼ Ev½qgðX T ÞDvX T �. (26)
In these expressions DsX T is the Malliavin derivative of X T . The deterministic
functions K and K are defined in Theorems 4 and 5 below.
The random variable DsX T captures the impact of an innovation in the Brownianmotion W at time s on the state variable X at time T. In essence this derivativemeasures the persistence of a shock in the state variable. It is similar to an impulse
ARTICLE IN PRESS
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Convergence of Cumulative Distribution Function: CKLS (transformation)
Pro
babi
lity
− 6 − 4 − 2 0 2 4 6 8 10
Error x 10− 5
Fig. 6. Speed of convergence of the distribution function of the approximation error GðXN
T Þ �X T for a
Doss-transformed CEV process approximated with a Euler scheme for N ¼ 2x and x ¼ 2; . . . ; 9:
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6816
response function that quantifies the sensitivity of the variable X T to an uncertaintyshock at the prior time s.
The theorem shows that the asymptotic laws of the estimators have two parts. Thefirst, K, corresponds to the discretization bias; the second, L, results from the MonteCarlo estimation of the expectation. Note that L would not vanish, even if sampleswhere taken from the law of X T . This is because the conditional expectation cannotbe calculated in closed form.
The theorem also shows that the estimators converge at the same rate. Thisfollows from the fact that the convergence rate of the expected approxi-mation error, described in Theorems 4 and 5 in the next section, is the same.As the rate 1=
ffiffiffiffiffiffiMp
is obtained from a central limit theorem, an additional con-clusion is that higher-order schemes would fail to improve the convergencespeed. They would just reduce the factor � in the limits and therefore the second-order bias.
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 17
3.2. Expected approximation errors
We now provide auxiliary results concerning the error component associated withthe discretization scheme. Let eN
T (eNT ) be the expected approximation error for the
scheme without (with) transformation. By definition
eNT � E0½gðX
NT Þ� � E0½gðX T Þ�, (27)
eNT � E0½gðX
N
T Þ� � E0½gðX T Þ�, (28)
where g � g � F with F the inverse of G, gðX NT Þ is an approximation of gðX T Þ based
on the Euler discretization of X, and gðXN
T Þ is an approximation of gðX T Þ based onthe Euler discretization of the transformed state variables X .
Next we study the convergence properties of these errors. These results arerelevant to the extent that the limits of NeN
T and NeNT determine the means of the
asymptotic error distributions of the corresponding efficient estimators of condi-tional expectations. As we will show, these limits characterize the second-orderapproximation bias of efficient Monte Carlo estimators of diffusions.
3.2.1. Euler scheme on the original state variables
Our first result describes the convergence of the expected approximation error eNT ,
in (27). Define the random variables
V1;T � � OT
Z T
0
O�1s qAðX sÞdX s þXd
j¼1
½qBjA�ðX sÞdW js
�Xd
i;j¼1
½qBj qBj Bi�ðX sÞdW is
!
þ OT
Z T
0
O�1s
Xd
j¼1
½qBj qBj A� þXd
j;k;l¼1
½qkðqlABk;jÞ�
" #ðX sÞds
þ OT
Z T
0
O�1s
Xd
i;j¼1
ð½q½qBj qBj Bi�Bi � qBi qBj qBj Bi�ðX sÞÞds, ð29Þ
V2;T � �
Z T
0
Xd
i;j¼1
ni;jðs;TÞds, (30)
where qA, qBj are d � d matrices of Jacobians, O is defined in Theorem 1 andni;jðs;TÞ is defined in (137)–(141). With this notation we have:
Theorem 4. Suppose that A;Bj 2 C2ðRdÞ. Let g 2 C3ðRd Þ be such that
limr!1
lim supN
E0½1fjNðgðX NTÞ�gðX T ÞÞj4rgNjgðX
NT Þ � gðX T Þj� ¼ 0 (31)
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6818
(P-a.s.). Then,
NeNT !
12
KT ðX 0Þ �12E0½qgðX T ÞV 1;T þ V 2;T �, (32)
where V1;T , V 2;T are given by (29) and (30), and eNT is defined in (27).
Theorem 4 provides a probabilistic characterization of the asymptotic expectederror. The expressions in (32) depend on random variables V1 and V 2 that aredetermined in closed form by the derivatives of the drift and volatility coefficients ofthe SDE. They can therefore be simulated along with the state variables to derivebias-corrected estimators (see Section 3.4). This characterization of the asymptoticexpected error is easier to evaluate than the expressions presented in Talay andTubaro (1990) and Bally and Talay (1996a,b). These authors show that theasymptotic expected error for the Euler scheme can be written as the expectation of afunction of a random variable, where the function solves a PDE. In contrast, ourcharacterization is fully probabilistic as it does not require the resolution of a PDE,and therefore, does not suffer from the curse of dimensionality affecting numericalsolutions of PDEs. As a consequence, it remains applicable for multivariatediffusions. Even in the univariate case, using Monte Carlo simulation incombination with the solution of a PDE is computationally costly. This mayexplain why the theoretical results of Talay and Tubaro (1990) and Bally and Talay(1996a,b) have seen limited use in applications.
Implementation, in numerous applications, requires the computation of condi-tional expectations of path-dependent functionals of diffusion processes, such as theRiemann integral
R T
0gðX sÞds (see the example in Section 5.2). A convergent
estimator for this integral, based on the Euler scheme, isPN�1
n¼0 gðX nhÞh, whereh ¼ T=N, and X N is the solution of the Euler-discretized SDE starting at X 0.Theorem 4 can be used to deduce the asymptotic expected approximation error inthese cases.
Corollary 2. Under the assumptions of Theorem 4, we have, as N !1,
NE0
XN�1n¼0
gðX NnhÞh�
Z T
0
gðX sÞds
" #!
1
2K1T ðX 0Þ
P–a.s., with
K1T ðX 0Þ �
Z T
0
KsðX 0Þdsþ K2T ðX 0Þ, (33)
where K is defined in Theorem 4, and
K2T ðX 0Þ � � E0
Z T
0
qgðX sÞ AðX sÞ þXd
j¼1
½ðqBjÞBj�ðX sÞ
! "
þXd
j¼1
½B0j q2g Bj�ðX sÞ
!ds
#. ð34Þ
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 19
The expected approximation error has two terms. The first term,R T
0 KsðX 0Þds,captures the cumulated expected approximation error X N � X in the Riemannintegral over the interval ½0;T �. The second term, K2T , emerges because thecontinuous Riemann integral,
R T
0 gðX sÞds, is approximated by a discrete sum,PN�1n¼0 gðX N
nhÞh.
3.2.2. Euler scheme on the transformed state variables
To derive the expected approximation error for the estimator with transformationdefine the random variable
VT ¼ �OT
Z T
0
O�1
v qAðX vÞ dX v þXd
j;k;l¼1
ql;kAðX vÞBk;j Bl;j dv
!(35)
with OT ¼ ERðR v
0 qAðX sÞdsÞ. We obtain
Theorem 5. Suppose that A 2 C1ðRdÞ, and that the conditions of Theorem 2 hold. For
g 2 C1ðRdÞ such that
limr!1
lim supN
E0½1fjNðgðX
NT Þ�gðX T ÞÞj4rg
NjgðXN
T Þ � gðX T Þj� ¼ 0 (36)
P-a.s. we have, P-a.s., as N !1,
NeNT !
12
KT ðX 0Þ �12E0½qgðX T ÞVT �, (37)
where VT is defined in (35), and eNT is defined in (28).
A comparison of (32) with (37) suggests that it will be difficult, in general, toestablish the dominance of one method over the other on the basis of the asymptoticexpected error. Indeed, the formulas reveal that both methods converge at the samespeed 1=N and that the second-order biases, while different (KT ðX 0ÞaKT ðX 0Þ), donot appear to be ordered in a systematic manner. To compare the two methods onemay want to use additional criteria, such as the computational cost.
The difference in the convergence rates to the limit errors (Theorems 1 and 2) andthose to the limit expected errors (Theorems 4 and 5) can be explained as follows.The speed of convergence for the limit error is determined by the martingale part ofthe error expansion that converges more slowly (1=
ffiffiffiffiffiNp
) than the bounded variationpart (1=N). The transformation eliminates the martingale part of the error. Takingthe expectation also eliminates the martingale part of the error. It followsimmediately that the expected errors will converge at the same rate. To see thatthe expectation eliminates the martingale part of the error in the absence of atransformation it suffices to note that this term converges weakly to a stochasticintegral whose expectation is null.18
Let us now consider estimators of Riemann integrals. The counterpart ofCorollary 2, for the scheme based on the transformed state variables, is
18The martingale part of the error converges to the product of a random variable and a stochastic
integral with independent integrator.
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6820
Corollary 3. Under the assumptions of Theorem 5, we have, as N !1,
NE0
XN�1n¼0
gðXN
nhÞh�
Z T
0
gðX sÞds
" #!
1
2K1T ðX 0Þ
P–a.s., with
K1T ðX 0Þ �
Z T
0
KsðX 0Þdsþ K2T ðX 0Þ, (38)
where K is defined in Theorem 5, and
K2T ðX 0Þ � �E0
Z T
0
qgðX sÞAðX sÞ þXd
j¼1
B0
j q2gðX sÞBj
!ds
" #. (39)
The two terms in this decomposition have the same interpretation as those ofCorollary 2. Note, in particular, that the expression for K2 follows immediately fromK2 and the deterministic nature of the volatility coefficient B (i.e. qBj ¼ 0).
3.3. Second-order discretization biases
Theorem 3 shows that the two procedures have asymptotic second-orderdiscretization biases that are, respectively, given by �
2K and �
2K . It follows that
any confidence interval, based solely on the Gaussian process L, will suffer from asize distortion. More precisely, when M !1 we have
P E0½gðX tÞ� 2 CNM ;Mg ðaÞ
� �! CðdÞ ð40Þ
with
CNM ;Mg ðaÞ �
ffiffiffiffiffiffiMp
gNM ;M þ F�1ða=2ÞsNM ;Mffiffiffiffiffiffi
Mp ;
ffiffiffiffiffiffiMp
gNM ;M þ F�1ða=2ÞsNM ;Mffiffiffiffiffiffi
Mp
� �
and
CðdÞ � FðF�1ð1� a=2Þ � dÞ � FðF�1ða=2Þ � dÞ, (41)
where F denotes the cumulative Gaussian distribution, d ¼ �KT ðX 0Þ=2ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiVAR½LT jF0�
pand where ðsN;M Þ
2¼ VARN ;M
½LT jF0� is a convergent estimatorof the variance. A confidence interval of nominal size a, based on L, will cover thetrue value E0½gðX T Þ� only with probability CðdÞ and not 1� a. For the method withtransformation the coverage probability of the true value is CðdÞ with d ¼ �KT ðX 0Þ=2ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiVAR½LT jF0�
p.
The degree of size distortion can be measured by sðzÞ � 1� a�CðzÞ, wherez 2 fd; dg. As CðzÞ is strictly monotone with Cð0Þ ¼ 1� a we conclude that theseconfidence intervals have the requested nominal size if and only if there is no second-order bias, i.e. d ¼ 0. Clearly, an increase in the second-order bias reduces the realcoverage probability. Likewise, a decrease in the asymptotic variance or an increasein � will increase the size distortion.
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 21
A benefit of formulas (32), (37) for the second-order biases KT ðX 0Þ and KT ðX 0Þ, isthat they can be computed by simulation. Theorems 4 and 5 can then be used todevelop approximation schemes that correct for second-order bias. Likewise,asymptotically valid confidence intervals can easily be implemented. Furthermore,bias correction is feasible even when the number of state variables is large. Incontrast bias correction based on the solutions of PDEs quickly becomes infeasiblewhen the number of state variables increases.19
3.4. Bias-corrected estimators
Bias-corrected estimators are asymptotically equivalent to Monte Carlo estimatorsobtained by sampling from the true distribution of X T . As a result they do not sufferfrom the size distortion problem described above. Bias-corrected estimators ofconditional expectations can be constructed as
gN ;Mc ¼
1
M
XMi¼1
½gðX i;NNh Þ þ
12qgðX i;N
Nh ÞCi;N1;Nh þ
12
Ci;N2;Nh�, (42)
gN ;Mc ¼
1
M
XMi¼1
gðXi;N
Nh Þ þ1
2qgðX
i;N
Nh ÞCi;N
Nh
� �, (43)
where Ci;N1;Nh;C
i;N2;Nh; C
i;N
Nh , for n ¼ 0; . . . ;N � 1, are defined in Appendix B and g isimplicitly defined at the beginning of Section 3.2.2. Our next result shows thatgN ;Mc and gN;M
c are bias-corrected estimators.
Theorem 6. Suppose that the conditions of Theorems 4 and 5 hold. Then,ffiffiffiffiffiffiMpðgNM ;M
c � E0½gðX T Þ�Þ ) LT ðX 0Þ (44)
and ffiffiffiffiffiffiMpðgNM ;M
c � gNM ;Mc Þ ) 0 (45)
as M !1, with limM!1NM ¼ þ1.
Theorem 6 shows that gN;Mc and gN ;M
c correct for the second-order bias andare asymptotically equivalent.20 This means that a bias correction eliminates allthe potential benefits of the transformation. However, one should bear in mind thatthe equivalence is asymptotic and that bias-corrected estimators may performdifferently in finite samples.
19Duffie and Glynn (1995) propose the Richardson–Romberg type of estimator ð2=4MÞP4M
i¼1 gðX i;2NT Þ �
ð1=MÞPM
i¼1 gðX i;NT Þ to eliminate the second-order bias asymptotically. To calculate this estimator one must
quadruple the number of replications and double the number of discretization points. In our method the
second-order bias can be simulated along with the state variables, which is computationally cheaper than a
Richardson–Romberg approximation scheme.20Two estimators are called equivalent if they share the same asymptotic error distribution.
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6822
Corresponding bias-corrected estimators for conditional expectations of Riemannintegrals
f ðX 0Þ ¼ E0
Z T
0
gðX sÞds
� �(46)
are
f N ;Mc ¼
1
M
XMi¼1
XN�1n¼0
gðX i;Nnh Þ þ
1
2qgðX i;N
nh ÞCi;N1;nh þ
1
2Ci;N
2;nh
� �hþ
1
2Ci;N
3;Nh
!, (47)
fN ;M
c ¼1
M
XMi¼1
XN�1n¼0
gðXi;N
nh Þ þ1
2qgðX
i;N
nh ÞCi;N
nh
� �hþ
1
2C
i;N
1;Nh
!. (48)
These estimators are sums over (42) and (43). Each involves an additional term(C3 and C1), that corrects for the bias due to the approximation of a continuousintegral by a discrete sum. Appendix B provides exact expressions for theseadditional bias correction terms. The next result parallels Theorem 6.
Corollary 4. Suppose that the conditions of Theorem 4 and 5 hold. Then,ffiffiffiffiffiffiMpðf NM ;M
c �
f ðX 0ÞÞ ) LfT ðX 0Þ, where L
fT ðX 0Þ is a centered Gaussian martingale with (deter-
ministic) quadratic variation and conditional variance given by
½Lf ;Lf �T ¼
Z T
0
E0½Nfv ðN
fvÞ0�dv � VAR
Z T
0
gðX sÞds
����F0
� �, (49)
Nfv ¼ Ev
Z T
v
qgðX sÞDvX s ds
� �. (50)
Furthermore,ffiffiffiffiffiffiMpðf NM ;M
c � fNM ;M
c Þ ) 0 (51)
as M !1 with limM!1NM ¼ 1.
4. A comparison with Milshtein’s second-order approximation
While Euler schemes for SDEs are appealing from a computational point of view,they might be judged insufficiently accurate. Second-order schemes such asMilshtein’s scheme (see Milshtein, 1984, 1995; Talay, 1984, 1986, 1996) have infact been proposed to provide improved approximations. In this section, we extendour analysis to the Milshtein second-order approximation.
The Milshtein approximation of X T in (2) is
~XN
T ¼ X 0 þXN�1n¼0
Að ~XN
nhÞhþXd
j¼1
Bjð ~XN
nhÞDWjnh þ
Xd
j;l¼1
½qBl Bj�ð ~XN
nhÞDF l;j
!,
(52)
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 23
where
DF l;j �
Z ðnþ1Þhnh
Z s
nh
dW lv dW j
s, (53)
with h ¼ T=N and DWjnh ¼W
jðnþ1Þh �W
jnh. This scheme is obtained using a
stochastic Taylor expansion for the diffusion coefficient.21
The Milshtein scheme is often difficult to implement for multivariate diffusions.The increments DF l;j, defined in (53) cannot, in general, be written in a form that iseasy to simulate. In fact, for multivariate diffusions without commutative noise(qBl BkaqBkBl for some kal), the increment DFl;j can only be simulated by using afurther discretization of the intervals ½nh; ðnþ 1Þh� (see Gaines and Lyons, 1997).This entails a substantial increase in computational cost. The comparison to a Eulerscheme with a number of discretization points equal to the total number of pointsrequired to implement the Milshtein scheme is unclear.22
For commutative noise (i.e. qBl Bk ¼ qBk Bl), the last term in (52) simplifies to23
Xd
j;l¼1
½qBl Bj�ð ~XN
nhÞDFl;j ¼1
2
Xd
j¼1
½qBj Bj�ð ~XN
nhÞððDWjnhÞ
2� hÞ
!
þ1
2
Xd
j;l¼1jal
½qBl Bj �ð ~XN
nhÞDWjnh DW l
nh
0BB@
1CCA. ð54Þ
The representation of ~XN
T as a functional of the Brownian increments DW knh for
k ¼ l; j follows. Note, in particular, that every univariate diffusion has commutativenoise and can be implemented on the basis of (54).
Our next result describes the asymptotic distribution of the approximation errorassociated with (52). It will enable us to find an explicit expression for the MonteCarlo estimator of a conditional expectation based on the Milshtein scheme.
Theorem 7. The approximation error ~XN
T � X T converges weakly at the rate 1=N (i.e.
Nð ~XN
T � X T Þ ) ~UX
T ). The asymptotic error is
~UX
T ¼ �1
2OT
Z T
0
O�1s qAðX sÞdX s �Xd
j¼1
½ðqBjÞðqAÞBj�ðX sÞds
!
21For details on the stochastic Taylor expansion and the derivation of this result see Kloeden and Platen
(1997) or Milshtein (1995).22The commutative noise condition necessary for this representation is the same condition that is needed
for the Doss transformation. When commutativity fails the transformation cannot be used (see Remarks
3.1–3.4 in Detemple et al., 2005 for further details) and the implementation of the Milshtein scheme
requires the simulation of the iterated Wiener integrals DF l;j . In those cases implementation based on a
standard Euler scheme without transformation is considerably less demanding.23See Milshtein (1995) or Gaines and Lyons (1997).
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6824
�1
2OT
Z T
0
O�1s
Xd
j¼1
½ðq½ðqAÞBj�ÞBj � ðqBjÞðqBjÞA�ðX sÞds
�1
2OT
Z T
0
O�1s
Xd
j¼1
½ðqBjÞA�ðX sÞdW js
�1ffiffiffiffiffi12p OT
Z T
0
O�1s
Xd
j¼1
½ðqAÞBj � ðqBjÞA�ðX sÞdZjs
�1ffiffiffi6p OT
Z T
0
O�1s
Xd
i;l;j¼1
½qBi qBl Bj�ðX sÞd ~Zl;j;is ,
where the process ððZjÞj2f1;...;dg; ð ~Zl;j;iÞi;l;j¼1;...;dÞ is a d þ d3
� 1 standard Brownian
motion independent of W. The process OT is given in Theorem 1.
Theorem 7 shows that the speed of convergence increases when one uses thestochastic Taylor expansion of the diffusion term. This expansion eliminates theerror component of order 1=
ffiffiffiffiffiNp
in the martingale part of the Euler approximation.It follows that the error for the Milshtein scheme converges at the same speed as theerror for the Euler scheme with transformation: both schemes improve on thestandard Euler approach. Note also that, in the special case of a constant volatilitycoefficient, estimators with and without transformation are identical, and asymptoticerror distributions for Milshtein and Euler with transformation are the same (i.e.~U
X
T ¼ UXT ). In this instance, all the schemes have the same asymptotic distribution.
The asymptotic error distribution of estimators of conditional expectations isdescribed next.
Theorem 8. Suppose that the assumptions of Theorem 9 below hold. Let g 2 C1ðRd Þ
and suppose that gðX T Þ 2 D1;2. For the Milshtein scheme, we have, as M !1,
ffiffiffiffiffiffiMp 1
M
XMi¼1
gð ~Xi;NM
T Þ � E0½gðX T Þ�
!) �
1
2~KT ðX 0Þ þ LT ðX 0Þ, (55)
where limM!1NM ¼ þ1 and � ¼ limM!1
ffiffiffiffiffiffiMp
=NM . The random variable LT ðX 0Þ
is the terminal value of a centered Gaussian martingale with (deterministic) quadratic
variation and conditional variance defined in Theorem 3. The deterministic function ~KT
is defined in Theorem 9.
For estimates of conditional expectations the rate of convergence of the Milshteinscheme is identical to the rates of the Euler schemes with and withouttransformation. The three schemes differ only in their second-order biases. Thesecond-order bias ~K can be found explicitly, as shown in Theorem 9 below. Its sizerelative to the second-order biases of Euler schemes depends on the coefficients ofthe underlying processes. A global ordering of the three schemes in terms of second-order asymptotic properties is not readily apparent.
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 25
As for the estimator with transformation, the expected approximation error
~eNt � E0½gð ~X
N
T Þ � gðX T Þ�, (56)
can be directly deduced from Theorem 7 under an additional uniform integrabilitycondition. In order to do this, define the random variable
~VT ¼ � OT
Z T
0
O�1s qAðX sÞdX s �Xd
j¼1
½ðqBjÞðqAÞBj�ðX sÞds
!
� OT
Z T
0
O�1s
Xd
j¼1
½ðq½ðqAÞBj�ÞBj � ðqBjÞðqBjÞA�ðX sÞds
� OT
Z T
0
O�1s
Xd
j¼1
½ðqBjÞA�ðX sÞdW js. ð57Þ
The equivalent of Theorems 4 and 5, is
Theorem 9. For g 2 C1ðRdÞ such that
limr!1
lim supN
E0½1fjNðgð ~X NT Þ�gðX T ÞÞj4rg
Njgð ~XN
T Þ � gðX T Þj� ¼ 0 (58)
we have, P-a.s., as N !1,
N ~eNT !
12~KT ðX 0Þ �
12E0½qgðX T Þ ~VT �, (59)
where ~VT is defined in (57) and ~eNT is defined in (56).
Our next result extends Theorem 9 to Riemann integrals. The result parallelsCorollary 2.
Corollary 5. Under the assumptions of Theorem 9, we have, as N !1,
NE0
XN�1n¼0
gð ~XN
nhÞh�
Z T
0
gðX sÞds
" #!
1
2~K1T ðX 0Þ
P–a.s., with
~K1T ðX 0Þ �
Z T
0
~KsðX 0Þdsþ ~K2T ðX 0Þ, (60)
where ~K is given in Theorem 9, and where ~K2T ¼ K2T is given in Corollary 2.
The asymptotic expected errors, and hence the second-order bias, for estimators ofRiemann integrals based on the Milshtein approximation, have the same structuresas the corresponding quantities for Euler schemes in Corollaries 2 and 3. The firstcomponent captures the cumulative expected error of the Milshtein approximationof the state variables in the Riemann integral. The second component is the expectederror due to the approximation of the continuous integral by a discrete sum.
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6826
Bias-corrected estimators based on the Milshtein scheme are asymptoticallyequivalent to Monte Carlo estimators obtained by sampling from the truedistribution of X T . As a result they do not suffer from the size distortion problemdescribed earlier. Bias-corrected estimators of conditional expectations can beconstructed as
~gcN ;M ¼
1
M
XMi¼1
gð ~Xi;NNh Þ þ
1
2qgð ~X
i;NNh Þ
~Ci;N
Nh
� �, (61)
where ~Ci;N
Nh is a convergent approximation of � ~VT , defined in Appendix B. Our nextresult shows that ~gN;M
c is a bias-corrected estimator of the conditional expectationbased on the Milshtein scheme.
Theorem 10. Suppose that the conditions of Theorems 4, 5 and 9 hold. Then,ffiffiffiffiffiffiMpð ~gNM ;M
c � E0½gðX T Þ�Þ ) LT ðX 0Þ, (62)
ffiffiffiffiffiffiMpðgNM ;M
c � gcNM ;M
Þ ) 0 (63)
and ffiffiffiffiffiffiMpðgNM ;M
c � ~gcNM ;M Þ ) 0 (64)
as M !1, with limM!1NM ¼ 1.
It is important to note that bias-corrected Milshtein does not improve over bias-corrected Euler, even without transformation. Indeed, the theorem shows that allestimators (Milshtein and Euler) are asymptotically equivalent after second-orderbias corrections. For non-commutative noise, our explicit expressions for the second-order biases can be used to develop estimators based on the Euler scheme that arecomputationally superior to the Milshtein scheme because they do not requirefurther subdivisions of discretization intervals to approximate the increments DF l;j .
Second-order bias-corrected estimators for conditional expectations of Riemannintegrals based on the Milshtein scheme (as in (47)–(48)) are
~fN ;M
c ¼1
M
XMi¼1
XN�1n¼0
gð ~Xi;Nnh Þ þ
1
2qgð ~X
i;Nnh Þ
~Ci;N
nh
� �hþ
1
2~C
i;N
1;Nh
!. (65)
As for Euler-based estimators, they involve a new term ~C1 that corrects the second-order bias due to the approximation of a continuous integral by a discrete sum. Anexplicit expression can be found in Appendix B. In parallel with Corollary 4, wehave:
Corollary 6. Suppose that the conditions of Theorems 4 and 9 hold. ThenffiffiffiffiffiffiMpðf NM ;M
c � ~fNM ;M
c Þ ) 0 (66)
as M !1, with limM!1NM ¼ 1.
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 27
Corollary 6 combined with Corollary 4 enable us to conclude that bias-corrected
estimators based on the Euler scheme f N ;Mc , the Euler scheme applied to the
transformed variables fN;M
c , and the Milshtein scheme ~fN ;M
c , are asymptotically
equivalent.
5. Simulation-based inference for diffusions
Standard dynamic models in finance involve diffusion processes with unknowncoefficients. Estimation and statistical inference are therefore essential forimplementation purposes. As mentioned in the introduction, the absence of ex-plicit formulas for transition densities of general diffusions implies that maxi-mum likelihood inference can only proceed with the use of an auxiliary numericalprocedure designed to approximate the transition density.24 Our characterizationsof asymptotic errors apply directly to these procedures and can be used tofind efficient experimental designs for simulation-based inference methods fordiffusions.
Section 5.1 sets up a general framework for the estimation of the parameters of adiffusion process using simulation-based methods of moments. A simulation-basedversion of extended quasi maximum likelihood estimation is presented in Section 5.2.Illustrations of the results are provided in Section 5.3.
5.1. A simulated method-of-moment approach for diffusions
We briefly sketch a general setup for parameter estimation of diffusion processes.Suppose that we observe an equally spaced discrete-time sample fY 0; . . . ;Y Lg, whereY l � Y tl
and D � tlþ1 � tl , of the following diffusion process, supposed to beunivariate for notational convenience,
dY t ¼ AðY t; yÞdtþ BðY t; yÞdW t and Y t0 given. (67)
Assume that coefficients depend on an unknown parameter vector y 2 C � Rp. Thetransition law of the Markov chain fY 0; . . . ;Y Lg is assumed to be absolutelycontinuous with respect to the Lebesgue measure l on Rd ,
Py0ðY tlþ12 dzjFtl
Þ � py0D ðY l ; zÞlðdzÞ, (68)
24Numerical approximations can be avoided if we let the observation interval decrease to zero. For this
type of fill-in asymptotics, efficient estimators have been obtained by Dacunha-Castelle and Florens-
Zmirou (1986), Florens-Zmirou (1989, 1993), Genon-Catalot (1990), Genon-Catalot and Jacod (1993) and
Bibby and Sørensen, 1995. This type of asymptotics seems less appropriate for non-experimental setups
like those encountered in financial econometrics.
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6828
where py0D ðY l ; zÞ is the transition density. Also suppose that the Markov chain
fY 0; . . . ;Y Lg is geometrically ergodic.25 This last assumption guarantees theexistence of a stationary density, denoted py0ðyÞ.26
We seek to estimate the parameter y using the conditional constraints:ZRd
hj;DðY l ; z; yÞpyDðY l ; zÞlðdzÞ ¼ 0 for all j ¼ 1; . . . ; q. (70)
The structure of (70) implies that hDðY l ; z; y0Þ � ðhj;DðY l ; z; y0ÞÞj¼1;...;q is a vector of
martingale increments. This suggests estimating y0 by solutions yL
D of martingale esti-mating equations
1
L
XL�1l¼0
gDðY l ;Y lþ1; yÞ ¼ 0, (71)
where gDðY l ;Y lþ1; yÞ � JDðY lÞ0hDðY l ;Y lþ1; yÞ and JDðY lÞ is a q� p matrix of
adapted weights.In contrast to the estimator above, GMM estimators can be written as M
estimators,
~yL
D ¼ argminy
S1=2L
1
L
XL�1l¼0
gDðY l ;Y lþ1; yÞ
!2
, (72)
where the weight JDðY lÞ is now interpreted as the optimal instrument. The matrix SL
is a weighting matrix that controls the efficiency of the GMM estimator.It is known from Hansen (1985); Hansen et al. (1988); Chamberlain (1987, 1992);
Newey (1990) and Wefelmeyer (1996) that the efficient estimator ~yL
D is obtained forinstruments given by
JDðyÞ ¼ CDðyÞ�1GDðyÞ,
CDðyÞ �
ZR
hDðy; z; y0ÞðhDðy; z; y0ÞÞ0p
y0D ðy; zÞlðdzÞ,
25This assumption can be avoided by introducing a stochastic normalization for limit theorems, i.e. a
random sequence cL such that cLðyL
D � y0Þ ) Z where Z is a standard normal variate and cL !1.
Random normalizations arise naturally in the definitions of locally asymptotically mixed normal (LAMN)
and locally asymptotically Brownian functional (LABF) estimators. See Basawa and Scott (1982) for more
on this. As shown by Heyde (1992) confidence intervals based on a stochastic normalization of the
estimator are, in general, shorter than those based on a deterministic normalization.26A Markov chain is geometrically ergodic if there exists a constant r41 such that
X1k¼1
rk
ZB
py0D ðY l ; zÞlðdzÞ �
ZB
py0 ðzÞlðdzÞ
V
� �k
o1 for all B 2 R, (69)
where k � kV is the variational norm (see Meyn and Tweedie, 1993). See Duffie and Singleton (1993) for an
application of this in a discrete time model. General convergence results for Markov chains appear in
Meyn and Tweedie (1993).
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 29
GDðyÞ �
ZR
ðqyhDðy; z; y0ÞÞ0p
y0D ðy; zÞlðdzÞ.
Our next result shows that the optimal weight SL does not play any role if optimal
instruments are used. This follows from the first-order asymptotic equivalence of ~yL
D
and yL
D.
Theorem 11. In an ergodic model, the efficient estimator yL
D has the asymptotic error
distributionffiffiffiffiLpðy
L
D � y0Þ ) S�1=2D Z (73)
as L!1, where
SD �
ZR
GDðyÞ0CDðyÞ
�1GDðyÞpy0ðyÞlðdyÞ
� �, (74)
and Z�Nð0; IpÞ is a standard normal random variable vector. Furthermore,ffiffiffiffiLpðy
L
D �~y
L
DÞ ) 0, (75)
as L!1.
A similar result for d-order Markov chains can be found in Muller andWefelmeyer (2002) (see also references therein). They show that the martingaleestimating function framework covers GMM, generalized quasi-likelihood, extendedquasi-likelihood, and quasi-likelihood estimations. Our result simply expresses thefact that the GMM estimator with optimal instruments is just identified andconsequently can be obtained as the solution of (71).
If the transition density pyDðy; zÞ is known, the estimation of the unknown
parameters based on (70) is equivalent to the estimation of unknown parameters in aMarkovian ergodic time series model.27 It is of particular interest to note that thediscrete nature of observations has no impact on the inference procedure.
Unfortunately, the transition density pyDðy; zÞ is usually unknown or the optimal
weights JðyÞ cannot be calculated in closed form. The corresponding optimalestimating function cannot therefore be used as it stands for estimating y, i.e. theefficient estimator is infeasible. Similarly, for moment-based constraints, indirectinference or EMM estimators, the functions hj;DðY l ;Y lþ1; yÞ that identify theparameters through (70), are not known in explicit form and are obtained by MonteCarlo simulation.28
27The particular choice hj;Dðy; z; yÞ ¼ qyjpyDðy; zÞ leads to a maximum likelihood estimator which attains
the Cramer–Rao lower bound ðRRd CDðyÞp
y0 ðyÞlðdyÞÞ�1.28See Gallant and Tauchen (2002) for a survey. The moment-based estimating functions hj are often of
the form hj;DðY l ;Y lþ1; yÞ ¼ f jðY l ;Y lþ1Þ �RRd f jðY l ; zÞpyDðY l ; zÞlðdzÞ. The functions hj cannot be obtained
in closed form, if the transition density is unknown or the Riemann integral cannot be solved explicitly, as
it is often the case in applications. Similarly, for indirect inference or EMM the functions hj are associated
with the score function of an auxiliary model, where the parameters of the auxiliary model are replaced by
their estimates based on the sample of data and where the structural processes are simulated for given
values of the structural parameters. Estimates of the latter are the values that set the moment conditions as
close to zero as possible based on a suitable GMM metric.
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6830
In a diffusion setting, simulations and numerical solution techniques for SDEs canbe used to approximate the optimal weights J and/or the functions hj in theconstraints that identify the parameters.29 For simulation-based estimatorscomputations can be performed by first discretizing time and using the Euler orthe Milshtein schemes to approximate the solution of the stochastic differentialequation, and then replicating this approximation M times to generate M
independent realizations of the terminal point fY i;Nlþ1ðY lÞ : i ¼ 1; . . . ;Mg. The integral
in the expression for hj and/or J can then be replaced by an empirical mean overindependent replications given the initial value.
If the transition density is unknown, the estimator yL;M;N
is obtained from asimulation-based martingale estimating function, i.e. as the unique root of
1
L
XL�1l¼0
gM ;ND ðY l ;Y lþ1; y
L;M ;NÞ �
1
L
XL�1l¼0
JM ;ND ðY lÞh
M;ND ðY l ;Y lþ1; y
L;M ;NÞ ¼ 0.
(76)
The results of this paper can be used to study the asymptotic properties of thenormalized error
ffiffiffiffiffiffiMpðgM ;N
D � gDÞ resulting from this procedure. In particular, wenow show that an approximation of the true estimating function based on M
independent Monte Carlo replications with N discretization points introduces asecond-order bias. As a result simulation-based estimators have higher-orderasymptotic properties that differ from those of their infeasible optimal counterparts.The second-order bias has important implications for the construction of asymptoticconfidence intervals and for parametric hypothesis testing.
To describe the second-order bias of the simulation-based estimator, introduce theexpression,
kj;Dðy0Þ �ZR�R
Kj;Dðy; y; z; y0ÞpDðy; z; y0ÞlðdzÞpy0ðyÞlðdyÞ for j ¼ 1; . . . ; p,
(77)
where Kj;Dðy; y; z; y0Þ is defined in Theorem 4 (the notation emphasizes that thefunction being averaged over depends, for simulation-based estimating functions, onthe observations Y l and Y lþ1. As in Theorem 4, the first argument corresponds tothe starting point of the simulations Y l , i.e., the state for which the conditionalexpectation is calculated).30
Our next result shows that the simulation procedure introduces an additionalsecond-order bias.
Theorem 12. Assume that the conditions of Theorem 4, for the Euler scheme without
transformation, are satisfied. Then, for geometrically ergodic diffusions observed over
29Alternative kernel-based estimation methods for conditional expectations converge at a slower rate
(L�2=ðdþ4Þ) that depends on the dimension of the diffusion. Therefore, for multivariate diffusions, a large
sample is necessary to get reasonably accurate estimates of the weights and/or the functions hj . Simulation-
based estimators are particularly interesting in those cases.30For MCE estimators these observations do not affect the convergence properties.
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 31
time intervals of equal fixed length D, we have, as L!1,ffiffiffiffiLpðy
L;ML;NL
D � yL
DÞ ) �e1e2SDðy0Þ�1kDðy0Þ, (78)
where e1 ¼ limL!1
ffiffiffiffiLp
=ffiffiffiffiffiffiffiffiML
pand e2 ¼ limL!1
ffiffiffiffiffiffiffiffiML
p=NL, with limL!1ML ¼
limL!1NL ¼ 1.Corresponding estimators for the Euler scheme with Doss transformation (under the
assumptions of Theorem 5) and the Milshtein scheme (under the assumptions of
Theorem 9), are obtained if Kj;Dð�; y; z; y0Þ in (77) is replaced by Kj;Dð�; y; z; y0Þ of
Theorem 5 and ~Kj;Dð�; y; z; y0Þ of Theorem 9, both adjusted for the dependence of hj and
J on the observations ðY l ;Y lþ1Þ ¼ ðy; zÞ.
Theorem 12 shows that the efficient simulated estimator has a second-order biasthat depends on the discretization scheme chosen. The numerical scheme used tosolve the SDE becomes irrelevant only if e1e2kDðy0Þ ¼ 0. The conditions stating thate1 and e2 are finite, which are conditions linking the sample size L, the number ofreplications ML and the number of discretization points NL, are therefore necessary.If these conditions fail, the growth of L has to be restricted, and the resultingestimators become inefficient. Ignoring the second-order bias leads to size-distortedstatistical tests.
The size of the second-order bias depends on the choice of the discretizationscheme only through the functions K, K and ~K that characterize the expectedapproximation errors of the different methods. This follows because the speed ofconvergence of the expected error is the same for all discretization schemes. We alsosee that the second-order bias is inversely related to the asymptotic variance of theestimator. Variance reduction, without second-order bias correction, will thereforeincrease the size distortion of asymptotic confidence intervals and hypothesis tests.
It is also important to note that the results of Theorem 12 do not directly coversimulated maximum likelihood estimators obtained from the numerical integrationof the Chapman–Kolmogorov equation (see Pedersen, 1995a,b; Brandt and Santa-Clara, 2002). For an estimator of this type, the Monte Carlo integration of the truetransition density kernel is based on a Gaussian kernel approximation withbandwidth 1=N determined by the number of discretization points N. The resultingscore function depends on the time discretization parameter N not just through thesimulated discretized trajectories but also through its functional form. As describedby Milshtein et al. (2004), this estimator of the transition density can be interpretedas a kernel estimator based on simulated i.i.d. data. It is well known that such anestimator will converge at a rate slower than M�1=2. In fact the rate for the score isM�2=ðdþ4Þ, a quantity that depends on the dimension of the diffusion. Optimality fora simulated maximum likelihood estimator requires that
ffiffiffiffiLp
=M2=ðdþ4ÞL ! e1 and
M2=ðdþ4ÞL =NL ! e2. It follows that efficient simulated maximum likelihood estima-
tors based on the numerical integration of the Chapman–Kolmogorov equation areless efficient than maximum likelihood estimators, even in the univariate case.
Theorem 12 shows that estimators with e1e2 ¼ 0 are inefficient. If L;ML;NL arechosen such that e1e2 ¼ 0, L has to grow at a slower rate than if 0oe1e2o1. Itfollows that the asymptotic variance and therefore the lengths of confidence intervals
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6832
are larger than the variance and length of the asymptotic confidence interval of theefficient estimator, as L is too small compared to the efficient estimator. The efficientestimator requires quadrupling the sample length, quadrupling the number ofsimulations and doubling the number of discretization points, in order to cut theasymptotic variance and therefore the length of the confidence interval in half.
Corresponding optimal designs for the simulated maximum likelihood estimatorsof Pedersen (1995a,b) and Brandt and Santa-Clara (2002), depend in addition on thedimension of the diffusion. These authors assume e2 ¼ 0. This assumption,unfortunately, is not sufficient. To preclude an exploding second-order bias it mustalso be the case that
ffiffiffiffiLp
=ffiffiffiffiffiffiffiffiML
pdoes not diverge faster than
ffiffiffiffiffiffiffiffiML
p=NL vanishes. The
rate of divergence of ML can therefore not be chosen independent of L.For estimators with a fixed number of discretization points and a fixed number of
Monte Carlo simulations the second-order bias always explodes when the samplesize becomes large. Asymptotic confidence intervals will then cover the true valueswith null probability. Hypothesis tests become invalid as their effective size canbecome considerably smaller than the nominal significance level of the test.
The results in Theorems 4, 5 and 9 describing the second-order biases for the Eulerschemes with and without Doss transformation and for the Milshtein scheme, enableus to construct second-order bias-corrected estimators that are both efficient andhave asymptotic confidence intervals that do not suffer from size distortion. It isimportant to note that asymptotic efficiency (i.e. estimators with the shortestasymptotic confidence intervals) can only be achieved by estimators that correct forsecond-order bias. Non-corrected estimators require a refinement of the timediscretization in the simulation of SDEs as the sample size increases. As aconsequence, the computational cost of efficient estimators without second-orderbias corrections explodes. Second-order bias-corrected estimators, alone, have thesame asymptotic distributions as estimators obtained from the infeasible optimalestimating function.31
5.2. Simulated extended quasi-maximum likelihood estimator
We now present a simulation-based version of the extended quasi-maximumlikelihood estimator (EQMLE) introduced for Markovian semi-martingales byWefelmeyer (1996). This estimator is a special case of a moment-based estimator. Itis based on parameter restrictions
aðY l ; yÞ � E
Z tlþ1
tl
AðY s; yÞds
�����Y l
" #, (79)
31This does not imply that simulation-based inference methods without second-order bias correction are
inferior to estimators based on analytic orthogonal series expansions of the likelihood, such as those
proposed by Ait-Sahalia (2002). Explosions of the second-order biases associated with the efficient
estimators of this type can only be avoided by increasing the dimensionality of the orthogonal basis as the
sample increases. Second-order biases for such estimators are unknown.
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 33
bðY l ; yÞ � E
Z tlþ1
tl
BðY s; yÞBðY s; yÞ0 ds
�����Y l
" #(80)
implied by the drift and volatility functions of the diffusion process. The momentconditions associated with
h1;DðY l ;Y lþ1; yÞ � Y lþ1 � Y l � aðY l ; yÞ, (81)
h2;DðY l ;Y lþ1; yÞ � ðY lþ1 � Y l � aðY l ; yÞÞ2� bðY l ; yÞ, (82)
together with the optimal instruments JDðY lÞ are then used to obtain an estimatingfunction, as in the previous section, based on
giðY l ;Y lþ1; yÞ ¼ JDðY lÞhiðY l ;Y lþ1; yÞ for i ¼ 1; 2. (83)
Such an estimator is labelled extended quasi-maximum likelihood estimator
(EQMLE). This estimator is the most efficient in the class of estimating functionsbased on hi and is asymptotically equivalent to the GMM estimator based on theoptimal instruments (see Theorem 11).
The EQMLE is infeasible if both the drift and volatility functions ða; bÞ are notavailable in closed form. This is the case for general diffusions for which thetransition density is unknown or the expectation cannot be obtained in closed form.In this case calculation of the drift and volatility functions can proceed bysimulation. The resulting estimator, which uses the same optimal weights as theEQMLE, is called the simulated extended quasi-maximum likelihood estimator
(SEQMLE). The asymptotic error distribution of this estimator can be studiedwithin our theoretical framework. We also design estimators that correct for second-order bias. They are called simulated, bias-corrected, extended quasi-maximum
likelihood estimator (SBCEQMLE). The next section presents results for bothestimators for CIR and L-CEV processes.
5.3. An application of the simulated quasi-maximum likelihood estimator to CIR and
L-CEV processes
We perform a Monte Carlo study to analyze the properties of the simulation-based estimators associated with various discretization and transformation schemesintroduced in the previous sections. The candidate methods are the Eulerdiscretization scheme, with and without the Doss transformation, and the Milshteindiscretization scheme. The Monte Carlo setup is inspired by the study of Chan et al.(1992), also used by Ait-Sahalia (1999). We simulate 500 series of 306 monthlyobservations and compute our estimators, with and without bias correction(SEQMLE and SBCEQMLE) for the three methods just mentioned. For comparisonpurposes, we also report the results obtained with maximum likelihood estimatorsbased on the Euler approximation (ML-Euler) and on the approximations of thetransition densities proposed by Ait-Sahalia (1999) (ML-YAS) with Hermitepolynomials of order one (J ¼ 1) and two (J ¼ 2). The series in our setup areshort and the models considered are non-linear. The estimation task is therefore
ARTICLE IN PRESS
Table 1
RMSE and Average Bias of SEQMLE and SBCEQMLE estimators for CIR process
CIR: dY t ¼ kðY � Y tÞdtþ sffiffiffiffiffiffiY t
pdW t
True values: k ¼ 0:5, Y ¼ 0:06, s ¼ 0:072Method Y k s
RMSE (500 trials)
MLE-Euler 0.0073 0.2778 0.0032
MLE-YAS (J ¼ 1) 0.0073 0.2945 0.0029
MLE-YAS (J ¼ 2) 0.0073 0.3011 0.0029
SEQMLE-Euler 0.0108 0.2740 0.0047
SEQMLE-Doss 0.0108 0.2740 0.0047
SEQMLE-Milshtein 0.0108 0.2740 0.0047
SBCEQMLE-Euler 0.0109 0.2867 0.0048
SBCEQMLE-Doss 0.0109 0.2866 0.0048
SBCEQMLE-Milshtein 0.0109 0.2867 0.0048
Average bias (500 trials)
ML-Euler �0.0002 0.1342 �0.0016
MLE-YAS (J ¼ 1) �0.0002 0.1502 0.0002
MLE-YAS (J ¼ 2) �0.0002 0.1544 0.0002
SEQMLE-Euler 0.0043 0.0962 �0.0039
SEQMLE-Doss 0.0043 0.0962 �0.0039
SEQMLE-Milshtein 0.0043 0.0962 �0.0039
SBCEQMLE-Euler 0.0044 0.1085 �0.0040
SBCEQMLE-Doss 0.0044 0.1084 �0.0040
SBCEQMLE-Milshtein 0.0044 0.1085 �0.0040
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6834
difficult and standard errors of estimators will generally be high, especially for theL-CEV process.
In the simulation-based methods, the discretization parameter N ¼ 24 and thenumber of replications M ¼ 500 per observation. For the optimization procedure,we used quasi-Newton methods for computing the SEQMLE and SBCEQMLE, buta simplex method for the maximum likelihood estimators. The choice of the simplexmethod was motivated by the difficulties we experienced by using gradient-basedoptimization methods for the estimators based on the Hermite polynomialsapproximations (ML-YAS). The calculation of numerical derivatives using finitedifferences appeared numerically unstable and optimization in a Monte Carlocontext with gradient-based methods proved simply infeasible. We made sure thatthe simplex method provided reasonable results by comparing the estimation resultswith the actual series of short-term interest rates used in Ait-Sahalia (1996). Wereproduced the results reported in Table VI of Aıt-Sahalia, and verified that thesimplex method and a gradient-based method were giving very similar results interms of likelihood value and parameter estimates.
Table 1 displays the results for the CIR process. Algorithms for the variousestimators were started with the same initial values. These are obtained based on the
ARTICLE IN PRESS
Table 2
RMSE and Average Bias of SEQMLE and SBCEQMLE estimators for L-CEV process
L-CEV: dY t ¼ kðY � Y tÞdtþ sminfY t;Y%gg dW t
True values: k ¼ 0:5, Y ¼ 0:06, s ¼ 1:2, g ¼ 1:5 Y% ¼ 1010
(Y% guarantees thatR �0 minfY s;Y
%gg dW s remains a martingale for g41)
Method Y k s g
RMSE (500 trials)
ML-Euler 0.0131 0.3124 0.5828 0.1905
MLE-YAS (J ¼ 1) 0.0133 0.3232 0.6815 0.1807
SEQMLE-Euler 0.0433 0.3130 0.6173 0.1823
SEQMLE-Doss 0.0438 0.2965 0.6292 0.1822
SEQMLE-Milshtein 0.0432 0.3133 0.6169 0.1822
SBCEQMLE-Euler 0.0418 0.3221 0.5905 0.1699
SBCEQMLE-Doss 0.0460 0.3067 0.5958 0.1765
SBCEQMLE-Milshtein 0.0417 0.3222 0.5925 0.1744
Average bias (500 trials)
ML-Euler 0.0013 0.1336 �0.0679 �0.0573
MLE-YAS (J ¼ 1) 0.0015 0.1448 0.1014 �0.0149
SEQMLE-Euler 0.0243 �0.0357 0.0264 �0.0071
SEQMLE-Doss 0.0249 �0.0495 0.0320 �0.0071
SEQMLE-Milshtein 0.0241 �0.0346 0.0263 �0.0071
SBCEQMLE-Euler 0.0243 �0.0279 0.0151 0.0046
SBCEQMLE-Doss 0.0260 �0.0411 0.0284 0.0077
SBCEQMLE-Milshtein 0.0242 �0.0274 0.0165 0.0047
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 35
first-order autocorrelation coefficient for the speed of mean reversion, theunconditional sample mean for the long-term mean and the square-root of theunconditional sample variance divided by the sample mean for the volatilityparameter. The results differ across parameters. For the long-term mean, the resultsare better for the MLE methods both in terms of RMSE and average bias. For themean reversion parameter, the results of all methods are of the same order ofmagnitude for the RMSE, but both simulation-based methods, with and withoutbias correction, reduce the bias with respect to the MLE methods. Finally, for thevolatility parameter, the results appear to be slightly better for the MLE methods,both in terms of RMSE and average bias. For all parameters, the RMSE and theaverage bias obtained with the bias correction methods (SBCEQMLE) are, ifanything slightly larger than without bias correction. As discussed below, the effectsof bias corrections are more evident for the L-CEV process.
Table 2 provides the estimation results for the L-CEV process.32 Initial values forall estimators have been obtained by regressing observed increments on a constant
32The second-order approximation is not included because of severe numerical difficulties in maximizing
this function. Note that Ait-Sahalia (1999) does not report results with the second-order approximation
for estimating the parameters of the CEV model with interest rate data in Table VI.
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6836
and the level of the process for the parameters of the drift function, and by regressingthe logarithm of the observed increments of the quadratic variation on a con-stant and the level of the process for the diffusion function. In terms of RMSE,the order of magnitude is similar for MLE and simulation-based methods exceptfor the long-term mean and the volatility parameter g. For the long-term mean,the MLE methods perform better, while for g the RMSE is smallest for thesimulation-based methods with bias correction (SBCEQMLE). However, forthe average bias, simulation-based methods improve results by a significant marginwith respect to the MLE methods, with the exception again of the long-termmean. The bias correction also appears sizable for k, s and g with respect tothe simulation-based methods without correction. Contrary to the results for theCIR process, the second-order bias-corrected estimates dominate in generalestimates without bias correction in terms of average bias. If we compare theresults between the various simulation-based methods, there are no differencesfor the CIR process, while the results appear to be better for Euler and Mihlsteinthan for Doss. In Durham and Gallant (2002), the simulation results wereshowing that the second-order bias was smallest if the Doss transformation wasused.
6. Conclusion
Error analysis for Monte Carlo estimators of numerical solutions of SDEs is adifficult task. The present paper introduced key elements to perform this analysis.Our investigation produced explicit formulas for error distributions that can be usedto construct asymptotically valid confidence intervals and to assess the asymptoticerrors of Euler schemes, with and without Doss transformation, and of the Milshteinscheme. Numerical experiments showed that the error distributions differsignificantly depending on whether the transformation is applied or not. Biascorrections were found to be important in the case of non-linear state variableprocesses, such as the L-CEV process. This underscores the importance of the errorstatistics obtained, as processes of relevance in financial economics often have a non-linear structure.
The characterizations obtained here permit a rigorous analysis of the asymptoticerror properties of a given estimator. They are therefore crucial for the design ofefficient Monte Carlo estimators of diffusion processes. Feasible efficient simulatedestimating function estimators were shown to require second-order bias corrections.Our explicit expressions for the second-order biases can be used to derive suchestimators.
In principle, error distributions, second-order biases and bias-corrected condi-tional estimators can also be derived when the conditional expectations to beestimated depend on unknown state variables. This situation is often encountered infinancial applications such as stochastic volatility models. The convergence results ofsimulation schemes presented could serve as a starting point for the study ofsimulation-based estimators for such latent variable models.
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 37
Acknowledgements
The paper was presented at the CIRANO conference on Mathematical Financeand Econometrics (June 2001), the CIRANO conference on Monte Carlo Methodsin Finance (March 2002), the Bachelier Society Meetings in Crete (June 2002), andthe Econometric Society North American Winter Meeting in San Diego (January2004). Funding from MITACS is gratefully acknowledged. Rene Garcia also thanksHydro-Quebec and the Bank of Canada for financial support.
Appendix A. Proofs
We start with a series of auxiliary lemmas that are used to prove the main resultsof the paper. These lemmas give the weak limits of the components in the errorexpansions of the estimators (102) and (103).
Let ½Nt� be the integer part of Nt and define the time change ZNt ¼ ½Nt�=N if NteN
and ZNt ¼ t� 1=N otherwise. Note that limN!1 ZN
t ¼ t. We have
Lemma 1. The following weak convergence results hold:
V1;Nt � N
Z t
0
ðs� ZNs Þds)
1
2t � V1
t , (84)
V2;i;Nt � N
Z t
0
ðW is �W i � ZN
s Þds)1
2W i
t þ1ffiffiffiffiffi12p Zi
t � V2;it , (85)
V3;i;Nt � N
Z t
0
ðs� ZNs ÞdW i
s )1
2W i
t �1ffiffiffiffiffi12p Zi
t � V 3;it , (86)
V4;i;j;Nt �
ffiffiffiffiffiNp
Z t
0
ðW is �W i � ZN
s ÞdW js )
1ffiffiffi2p Z
i;jt � V
4;i;jt (87)
as N !1, where ððW iÞi2f1;...;dg; ðZiÞi2f1;...;dg; ðZ
i;jÞi;j2f1;...;dgÞ is a ð2d þ d2Þ-dimensional
standard Brownian motion.
Proof. As ZNt ¼ ½Nt�=N for NteN we have
R t
0ðs� ZN
s Þds ¼ ð1=N2ÞRNt
0ðs� ½s�Þds.
Using NR t
0ðs� ZNs Þds ¼ ð1=NÞ
P½Nt�k¼1
R½k�1;k½ðs� ½s�Þdsþ ð1=NÞ
RNt
½Nt�ðs� ½s�Þds we
then obtain
N
Z t
0
ðs� ZNs Þds ¼
1
2
½Nt�
Nþ
1
2
1
NðNt� ½Nt�Þ2 !
1
2t (88)
as N !1. The limit on the right-hand side uses ½s� ¼ k � 1 for s 2 ½k � 1; k½ and thebound Nt� ½Nt�p1.
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6838
Similarly, it can be shown, using the scaling property of Brownian motion that
N
Z t
0
ðW is �W i � ZN
s Þds ¼1ffiffiffiffiffiNp
X½Nt�
k¼1
Z½k�1;k½
ðW is �W i
½s�Þds
þ1ffiffiffiffiffiNp
Z Nt
½Nt�
ðW is �W i
½s�Þds. ð89Þ
Ito’s lemma then impliesR½k�1;k½ðW
is �W i
½s�Þds ¼R k
k�1ðk � sÞdW i
s so that NR t
0ðW i
s�
W i � ZNs Þds ¼ ð1=
ffiffiffiffiffiNpÞP½Nt�
k¼1
R½k�1;k½ðk � sÞdW i
s þ ð1=ffiffiffiffiffiNpÞRNt
½Nt�ðW i
s �W i½s�Þds. Note
that the sequence of i.i.d. random variablesR½k�1;k½ðk � sÞdW i
s has variance 13 and
covariance 12 1fi¼jg with the Brownian motion W j. Donsker’s functional central limit
theorem (see Kallenberg, 1997, Theorem 12.9, p. 225) then gives
1ffiffiffiffiffiNp
X½Nt�
k¼1
Z½k�1;k½
ðk � sÞdW is )
1
2W i
t þ1ffiffiffiffiffi12p Zi
t, (90)
where Zi is a standard Brownian motion independent of W j for all j 2 f1; . . . ; dg.This establishes (85) because of the continuity of the pathwise integral with respect tothe Lebesgue measure,
P� limN!1
1ffiffiffiffiffiNp
Z Nt
½Nt�
ðW is �W i
½s�Þds ¼ 0. (91)
The same type of argument shows that
N
Z t
0
ðs� ZNs ÞdW i
s ¼1ffiffiffiffiffiNp
X½Nt�
k¼1
Z½k�1;k½
ðs� ½s�ÞdW is þ
1ffiffiffiffiffiNp
Z Nt
½Nt�
ðs� ½s�ÞdW is.
(92)
Donsker’s functional central limit theorem implies that the first part converges to aBrownian motion. The second part converges to zero, in probability, by thecontinuity of the Wiener integral,
P� limN!1
1ffiffiffiffiffiNp
Z Nt
½Nt�
ðs� ½s�ÞdW is ¼ 0. (93)
As the sequence of i.i.d. random variablesR½k�1;k½ðs� ½s�ÞdW i
s has variance 13 and
covariance 121fi¼jg with W j as well as covariance 1
61fi¼jg with
R½k�1;k½ðk � sÞdW j
s wehave
1ffiffiffiffiffiNp
X½Nt�
k¼1
Z½k�1;k½
ðs� ½s�ÞdW is )
1
2W i
t �1ffiffiffiffiffi12p Zi
t, (94)
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 39
which establishes (86). It remains to show (87). Again, by the scaling property ofBrownian motion
ffiffiffiffiffiNp
Z t
0
ðW js �W j � ZN
s ÞdW is ¼
1ffiffiffiffiffiNp
X½Nt�
k¼1
Z½k�1;k½
ðW js �W
j½s�ÞdW i
s
þ1ffiffiffiffiffiNp
Z Nt
½Nt�
ðW js �W
j½s�ÞdW i
s. ð95Þ
As the sequence of i.i.d random variablesR½k�1;k½ðW
js �W
j½s�ÞdW i
s has variance12and
is independent of W j,R½k�1;k½ðW
is �W i
½s�Þds as well as ofR½k�1;k½ðs� ½s�ÞdW i
s we can
appeal to Donsker’s invariance principle to conclude that
1ffiffiffiffiffiNp
X½Nt�
k¼1
Z½k�1;k½
ðW js �W
j½s�ÞdW i
s )1ffiffiffi2p Z
i;jt . (96)
The continuity of the Ito integral implies that P� limN!1ð1=ffiffiffiffiffiNpÞRNt
½Nt�ðW j
s �
Wj½s�ÞdW i
s ¼ 0. This establishes (87). &
In the sequel, we need to assess the convergence of stochastic integrals with respectto the processes VN � ½V1;N ;V 2;N ;V 3;N ;V 4;N � defined in Lemma 1. Results of Duffieand Protter (1992), that provide sufficient conditions for the weak convergence ofstochastic integrals, i.e. goodness, prove useful in that regard.
Definition 1. A process Y N is good if ðX N ;Y N Þ ) ðX ;Y Þ implies thatðX N ;Y N ;
R �0 X N
s dY Ns Þ ) ðX ;Y ;
R �0 X s dY sÞ.
Lemma 2. The semimartingales ðV 1;N ;V 3;N ;V 4;N Þ are good.
Proof. From condition A in Duffie and Protter (1992) it follows that V 1;N is good ifsupNN
R t
0ðs� ZNs Þdso1 for all t 2 ½0;T �. Similarly, V 3;N and V4;N are good if
supN VAR½V 3;i;Nt �o1 for all i ¼ 1; . . . ; d and supNVAR½V
4;i;j;Nt �o1 for all i; j ¼
1; . . . ; d and t 2 ½0;T �. But,
VAR½V 3;i;Nt � ¼ N2
Z t
0
ðs� ZNs Þ
2 ds, (97)
VAR½V4;i;j;Nt � ¼ NE
Z t
0
ðW is �W i
ZNsÞ2 ds1i¼j
� �¼ N
Z t
0
ðs� ZNs Þds1i¼j. (98)
It is therefore sufficient to show that N�R t
0ðs� ZNs Þ
� dso1 for � ¼ 1; 2 and t 2 ½0;T �.The bound 14ðt� ½t�Þ�X0 implies
N�
Z t
0
ðs� ZNs Þ
� ds ¼1
N
Z Nt
0
ðs� ½s�Þ� dso1
N
Z Nt
0
ds ¼ t. (99)
We conclude that V 1;N ;V 3;N ;V4;N are good. &
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6840
The proof of our next lemma shows that the total variation of the semi-martingaleV 2;N is OPð
ffiffiffiffiffiNpÞ.33 As a result V 2;N cannot be good.
Lemma 3. The semi-martingale V2;N is not good.
Proof. We will show that N�1=2NR t
0 jW s �W ZNsjds ¼ OPð1Þ so that the total
variation of V 2;N satisfiesR t
0 jdV 2;Ns j ¼ OPð
ffiffiffiffiffiNpÞ. Note that
N1=2
Z t
0
jW s �W ZNsjds ¼
1
N
Z Nt
0
jW s �W ½s�jds
¼1
N
X½Nt��1
k¼0
Z kþ1
k
jW s �W kjdsþ1
N
Z Nt
½Nt�
jW s �W ½Nt�jds.
ð100Þ
By the continuity of the integral the second term ð1=NÞRNt
½Nt�jW s �W ½Nt�jds ¼ oPð1Þ.
For the first term, note that, by Jensen’s inequality,
1
N2
X½Nt��1
k¼0
E
Z kþ1
k
jW s �W kjds
� �2" #
p1
N2
X½Nt��1
k¼0
Z kþ1
k
ðs� kÞds
¼1
N
1
2
½Nt�
N! 0, ð101Þ
as ½Nt�=N ! t as N !1. By (100), (101) and invoking Chebyshev’s weak law oflarge numbers, we conclude34
P� limN!1
N�1=2N
Z t
0
jW s �W ZNsjds
¼ limN!1
1
N
X½Nt��1
k¼0
E
Z kþ1
k
jW s �W kjds
� �
¼ limN!1
1
N
X½Nt��1
k¼0
E½jW 1j�
Z 1
0
ffiffisp
ds ¼ t
ffiffiffi2
p
r2
3
� �,
where the last equality uses E½jW 1j� ¼ffiffiffiffiffiffiffiffi2=p
pand
R 10
ffiffisp
ds ¼ 23as well as ð½Nt�=NÞ !
t as N !1. This establishesR t
0jdV 2;N
s j ¼ OPðffiffiffiffiffiNpÞ. It follows that
supN
R t
0 jdV 2;Ns j ¼ supNN
R t
0 jW s �W ZNsjds ¼ 1. Theorem 3.2 of Jacod and Protter
(1998) then shows that V 2;Nt cannot be good. &
Define the errors UXN
t ¼ ½UXN1
t ; . . . ;UXN
dt � where U
XNl
t ¼ X Nl;t � X l;t, and U
X N
t ¼
½UXN
lt ; . . . ;U
X Nl
t � where UXN
lt ¼ X N
l;t � X Nl;ZN
t. To prove Theorem 1 we use the mean
33The random variable X N is OPðNÞ if P� limN!1ð1=NÞX N ¼ Ka0 for some random variable K. The
random variable X N is oPðNÞ if P� limN!1ð1=NÞX N ¼ 0. If P� limN!1X N ¼ Ka0 (resp.
P� limN!1X N ¼ 0) we say that X N is OPð1Þ (resp. oPð1Þ).34Chebyshev’s weak law of large numbers states that if ð1=NÞ
PNi¼1 Zi is such that
ð1=N2ÞPN
i¼1 E½ðZiÞ2� ! 0 then P� limN!1ðð1=NÞ
PNi¼1ðZ
i � E½Zi �ÞÞ ¼ 0.
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 41
value theorem to write
AðX NZN
tÞ � AðX tÞ ¼ AðX N
t Þ � AðX tÞ � ðAðXNt Þ � AðX N
ZNtÞÞ
¼Xd
l¼1
ðqlAðX t þ l1;lUXN
lt elÞU
X Nl
t � qlAðXNt þ l3;lU
XNl
t elÞUX N
lt Þ,
where l�;l 2�0; 1½ for all l ¼ 1; . . . ; d, and e0l ¼ ½0; . . . ; 0; 1; 0; . . . ; 0� is the lth unit vectorof dimension d. A similar expression holds for BðX N
ZNtÞ � BðX tÞ. The expansion of the
error UX N
t of the Euler continuous approximation can be decomposed as
UXN
T ¼
Z T
0
Xd
l¼1
qlAðX s þ l1;lelUXN
ls ÞU
XNl
s ds
þ
Z T
0
Xd
l¼1
Xd
j¼1
qlBjðX s þ l2;lelUX N
ls ÞU
X Nl
s dW js
�
Z T
0
Xd
l¼1
qlAðXNs þ l3;lelU
XNl
s ÞUXN
ls ds
�
Z T
0
Xd
l¼1
Xd
j¼1
qlBjðXNs þ l4;lelU
XNl
s ÞUXN
ls dW j
s.
As UX N
ls ¼ AlðX
NZN
sÞðs� ZN
s Þ þPd
i¼1 Bl;iðXNZN
sÞðW i
s �W iZN
sÞ we have
UXN
T ¼
Z T
0
Xd
l¼1
qlAðX s þ l1;lelUXN
ls ÞU
XNl
s ds
þ
Z T
0
Xd
l¼1
Xd
j¼1
qlBjðX s þ l2;lelUX N
ls ÞU
X Nl
s dW js
�1
N
Z T
0
Xd
l¼1
qlAðXNs þ l3;lelU
XNl
s ÞAlðXNZN
sÞdV1;N
s
�1
N
Z T
0
Xd
l¼1
qlAðXNs þ l3;lelU
XNl
s ÞXd
j¼1
Bl;jðXNZN
sÞdV 2;j;N
s
�1
N
Z T
0
Xd
l¼1
Xd
j¼1
qlBjðXNs þ l4;lelU
XNl
s ÞAlðXNZN
sÞdV3;j;N
s
�1ffiffiffiffiffiNp
Z T
0
Xd
l¼1
Xd
j¼1
qlBjðXNs þ l4;lelU
X Nl
s ÞXd
i¼1
Bl;iðXNZN
sÞdV 4;i;j;N
s . ð102Þ
Lemmas 2 and 3 imply that the integral with respect to V 2;N is the only term whoselimit is difficult to find. Our next result shows that its limit must account for a‘‘Wong-Zakai correction term’’ (Wong and Zakai, 1964).
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6842
Lemma 4. For a sequence of good semimartingales aN such that ðaN ;V2;NÞ ) ða;V2Þ
we haveR T
0 aNs dV 2;j;N
s )R T
0 as dV2;js þ ½a;V
2;j�T , for j ¼ 1; . . . ; d.
Proof. The integration by parts formula combined with the fact that V2;j;N is ofbounded total variation for fixed N, and that aN is good givesZ t
0
aNs dV 2;j;N
s ¼ aNt V
2;j;Nt �
Z t
0
V2;j;Ns daN
s � ½V2;j;N ; aN �t
¼ aNt V
2;j;Nt �
Z t
0
V2;j;Ns daN
s
) atV2;jt �
Z t
0
V 2;js das
¼
Z t
0
as dV 2;js þ ½a;V
2;j �t.
The second equality follows from ½V 2;j;N ; aN � ¼ 0, a consequence of the fact thatV 2;j;N is of bounded total variation for fixed N. The last equality is an application ofthe integration by parts formula. &
The asymptotic error expansion for X N in (102) involves an integral with respectto the bounded variation process V 2;j;N that is not good (by Lemma 3). The limit ofthis integral is therefore not the integral of the weak limit of the integrand withrespect to the weak limit of the integrator. Lemma 4 shows how the limit can beconstructed if the integrand is good. To apply this result to the integral with respectto V2;N in (102) it remains to show that the integrand is good. For this we need thefollowing lemma.
Lemma 5. The semimartingale X N is good.
Proof. By Theorem 2.5 of Kurtz and Protter (1991b) goodness is equivalent touniform tightness, defined as follows,
Definition 2 (Jakubowski et al., 1989). A sequence of semimartingales X N isuniformly tight if and only if for each t and any simple predictable HN such thatjHN jp1 (P-a.s.), the set f
R t
0 HNs� dX N
s ; NX1g, is stochastically bounded (uniformlyin N).35,36
To prove Lemma 5 it suffices to show that X N is uniformly tight. The proof is bycontradiction and uses arguments in Kurtz and Protter (1996).
To simplify notation set Y 0t ¼ ½t;W0t� and f ¼ ½A;B1; . . . ;Bd �, and write X N
T ¼
X 0 þPd
j¼0
R T
0 f jðXNZN
sÞdY j
s. As f 2 C2 (by assumption), P� limN!1X N ¼ X (byTheorem 3.1 of Jacod and Protter (1998)) and ZN
t ! t as N !1, we can apply thecontinuous mapping Theorem to conclude ðX N ; f ðX N
ZN Þ;Y Þ ) ðX ; f ðX Þ;Y Þ.
35H is simple predictable if there exists a sequence of stopping times 0 ¼ t1p �ptnþ1 ¼ T , and Fti
measurable finite random variables ðHiÞ0pipn such that Ht ¼ H01f0gðtÞ þPn
i¼1 Hi1�ti ;tiþ1 �ðtÞ.
36A set of random variables fX N ; NX1g is uniformly stochastically bounded if for any �40 there exists
d40 such that supN PðjX N j4dÞo�.
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 43
Suppose now that X N is not uniformly tight. By Definition 2 it follows that thereexist a time t and simple predictable processes HN with jHN jp1 (P-a.s.), for whichthe set f
R t
0 HNs� dX N
s ; NX1g is not stochastically bounded, i.e. there exists an � forwhich there is no d40 such that supN Pðj
R t
0 HNs� dX sj4dÞo�. For this sequence
jHN jp1 (P-a.s.), and this �, we have supN0 infN4N0 PðjR t
0HN
s� dX Ns j4dÞX� for any
d40. Thus, there exists a sequence of positive constants cN !1, such that
�p lim infN!1
P
Z t
0
HNs� dX N
s
��������4cN
� �
p lim infN!1
P
Z t
0
HNs� dX N
s 4cN
� �þ P �
Z t
0
HNs� dX N
s 4cN
� �� �
¼ lim infN!1
P
Z t
0
HNs�
cNdX N
s 41
� �þ P �
Z t
0
HNs�
cNdX N
s 41
� �� �
¼ lim infN!1
PXd
j¼0
Z t
0
HNj;s�
cNf jðX
NZN
sÞdY j
s41
!
þP �Xd
j¼0
Z t
0
HNj;s�
cNf jðX
NZN
sÞdY j
s41
!!
¼ 0.
The null limit on the last line uses goodness of Y and ðHNj;�f jðX
NZN�Þ=cN ;Y jÞj¼0;...;d )
ð0;Y Þ. The limit for the integrand follows from jHNj jp1, f jðX
NZN�Þ ) f jðX Þ, and
cN !1 (so that HNj f jðX
NZN� Þ=cN ) 0 uniformly in N). Goodness of Y then impliesPd
j¼0
R t
0ðHNj;s=cNÞf jðX
NZN
sÞdY j
s ) 0.
From this contradiction we conclude that X N is uniformly tight and hence
good. &
The weak limit results in Lemmas 1–5 are sufficient to prove Theorem 1.
Proof of Theorem 1 (Kurtz and Protter, 1991a). It follows from the weak limits inLemmas 1–5 that lines 2, 3 and 4 of the approximation error UX N
T in (102) convergeweakly to zero. The remaining terms (lines 1 and 5) converge weakly to a linear SDEwhose solution corresponds to the expression for UX
T in the theorem. &
If we apply the Doss transformation and sample Gaussian increments Wjtþðnþ1Þh �
Wjtþnh then the error UX
N
¼ XN� X of the Euler approximation with transforma-
tion can be expanded as
UXN
T ¼
Z T
0
Xd
l¼1
ql AðX s þ l1;lelUX
Nl
s ÞUX
Nl
s ds
�
Z T
0
Xd
l¼1
ql AðXN
s þ l3;lelUX
Nl
s ÞUX
Nl
s ds,
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6844
where l�;l 2�0; 1½ for all l ¼ 1; . . . ; d, el ¼ ½0; . . . ; 1; . . . ; 0�0 is the lth unit vector and
UX
Nl
s ¼ XN
l;s � XN
l;ZNs. As U
XNl
s ¼ AlðX ZNsÞðs� ZN
s Þ þPd
j¼1 Bl;jðWjs �W
j
ZNsÞ we obtain
UXN
T ¼
Z T
0
Xd
l¼1
ql AðX s þ l1;lelUX
Nl
s ÞUX
Nl
s ds
�1
N
Z T
0
Xd
l¼1
ql AðXN
s þ l3;lelUX
Nl
s ÞAlðX ZNsÞdV1;N
s
�1
N
Z T
0
Xd
l¼1
ql AðXN
s þ l3;lelUX
Nl
s ÞXd
j¼1
Bl;j dV2;j;Ns . ð103Þ
As for the error expansion without transformation (102) we have integrals with
respect to XN. To find the limit using integration by parts we need to establish
goodness of XN. Our next lemma verifies this property.
Lemma 6. The semimartingale XN
is good.
Proof. The proof parallels the proof of Lemma 5. &
Proof of Theorem 2. As in the proof of Theorem 1 integrals with respect to V 2;j;N
need special treatment. Let al;j;Ns � ql AðX
N
s þ l3;lelUX
Nl
s ÞBl;j and note that al;j;N isgood by the continuity of ql A. An application of Lemma 4 then shows thatZ T
0
al;j;Ns dV 2;j;N
s )
Z T
0
ql AðX sÞBl;j dV2;js þ
1
2
Z T
0
Xd
k¼1
ql;kAðX sÞBk;j Bl;j ds.
(104)
As limN!1 ZNs ¼ s and P� limN!1U
XNl ¼ P� limN!1UX
Nl ¼ 0 and as V1;N is
good by Lemma 2, it follows that NUXN
) UX where
UXT ¼
Z T
0
Xd
l¼1
ql AðX sÞUX ls ds�
1
2
Z T
0
Xd
l¼1
½ql AðX sÞAlðX sÞ
þXd
j;k¼1
ql;kAðX sÞBk;j Bl;j �ds
�
Z T
0
Xd
l¼1
ql AðX sÞXd
j¼1
Bl;j1
2dW j
s þ1ffiffiffiffiffi12p dZj
s
� �.
This SDE is linear and its solution corresponds to the result announced. &
Proof of Corollary 1. The result follows from NðGðXN
T Þ � X T Þ ¼ qGðX T ÞUX
N
T þ
oPð1Þ, where oPð1Þ denotes terms that vanish in probability (see footnote 33), andqGðzÞ ¼ BðGðzÞÞ. &
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 45
Proof of Theorem 3. The proofs are the same for the cases with and withouttransformation. The approximation error can be written as
1ffiffiffiffiffiffiMp
XMi¼1
gðX i;NT Þ � E0½gðX T Þ�
!
¼1ffiffiffiffiffiffiMp
XMi¼1
ðgðX i;NT Þ � gðX i
T ÞÞ þ1ffiffiffiffiffiffiMp
XMi¼1
ðgðX iT Þ � E0½gðX T Þ�Þ. ð105Þ
The Lindeberg central limit theorem for i.i.d. random variables gives
1ffiffiffiffiffiffiMp
XMi¼1
ðgðX iT Þ � E0½gðX T Þ�Þ )
ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiVAR½gðX T ÞjF0�
pZ, (106)
where Z�Nð0; 1Þ. The Clark–Ocone formula gðX T Þ � E0½gðX T Þ� ¼R T
0 Es½DsgðX T Þ�dW s combined with the chain rule of Malliavin calculus (seeNualart, 1995) enables us to write VAR½gðX T ÞjF0� ¼
R T
0 E0½kEs½qgðX T ÞDsX T �k2�ds.
This establishes thatffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiVAR½gðX T ÞjF0�
pZ ¼ LT ðX 0Þ in distribution. It remains to
find the weak limit of ð1=ffiffiffiffiffiffiMpÞPM
i¼1 ðgðXi;NT Þ � gðX i
T ÞÞ.If we introduce two sequences of numbers, NM and �M ¼
ffiffiffiffiffiffiMp
=NM such thatlimM!1 �M ¼ �o1 and limM!1NM ¼ 1, we can invoke Kolmogorov’s stronglaw of large numbers, to conclude that
limM!1
�MNM
1
M
XMi¼1
ðgðXi;NMT Þ � gðX i
T ÞÞ
!
¼ � limM!1
NME0½gðXNMT Þ � gðX T Þ�; P-a.s. ð107Þ
Theorem 4 on the expected approximation error (see below) then gives
1ffiffiffiffiffiffiMp
XMi¼1
ðgðXi;NMT Þ � gðX i
T ÞÞ !�
2KT ðX 0Þ (108)
as M !1. The corresponding result for the estimator with transformation isobtained by invoking Theorem 5 instead Theorem 4. This establishes the claim forboth estimators. &
We now turn to the proof of the expected approximation error. The errorexpansion (102) shows that UX N
satisfies a linear SDE. The solution is
NðONT Þ�1UX N
T ¼ �
Z T
0
ðONs Þ�1 dIN
1;s �
Z T
0
ðONs Þ�1 dIN
2;s
�
Z T
0
ðONs Þ�1 dðIN
3;s � ½RN ; IN
3 �sÞ
�ffiffiffiffiffiNp
Z T
0
ðONs Þ�1 dðIN
4;s � ½RN ; IN
4 �sÞ, ð109Þ
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6846
where ONT ¼ ERðRNÞT with RN
T ¼ ½RN1;T ; . . . ;R
Nd;T � such that
RNi;T ¼
Z T
0
qiAðX s þ l1;ieiUX N
is Þdsþ
Z T
0
Xd
j¼1
qiBjðX s þ l2;ieiUXN
is ÞdW j
s (110)
for i ¼ 1; . . . ; d, and
IN1;T �
Z T
0
Xd
l¼1
qlAðXNs þ l3;lelU
X Nl
s ÞAlðXNZN
sÞdV 1;N
s ,
IN2;T �
Z T
0
Xd
l¼1
qlAðXNs þ l3;lelU
X Nl
s ÞXd
j¼1
Bl;jðXNZN
sÞdV2;j;N
s ,
IN3;T �
Z T
0
Xd
l¼1
Xd
j¼1
qlBjðXNs þ l4;lelU
X Nl
s ÞAlðXNZN
sÞdV 3;j;N
s ,
IN4;T �
Z T
0
Xd
i;j¼1
Xd
l¼1
qlBjðXNs þ l4;lelU
XNl
s ÞBl;iðXNZN
sÞ
!dV4;i;j;N
s .
In order to find the limit of the expectation of NUXN
T we need the following lemma.
Lemma 7. Define JN4;T �
R T
0 ðONs Þ�1 dðIN
4;s � ½RN ; IN
4 �sÞ. For any FT -measurable
square integrable d � 1 random vector HT , and FT -measurable sequence of d � 1random vectors such that HN
T ) HT and
limr!1
lim supN
E0½jffiffiffiffiffiNpðHN
T Þ0JN
4;T j1fjffiffiffiNp
H 0T
JN4;Tj4rg� ¼ 0 (111)
(i.e. ðHNT Þ0JN
4;T is asymptotically uniformly integrable) the cross momentffiffiffiffiffiNp
E0½ðHNT Þ0JN
4;T � !12E0½U2;T �, (112)
where
U2;T ¼Xd
i;j;k¼1
½hk;jbk;i;j ;W i�T �HkT
Z T
0
dk;i;js dW i
s þ ½dk;i;j ;W i�T
� �� �(113)
with bk;i;jt ; dk;i;j
t the kth elements of the d � 1 vectors bi;jt ; d
i;jt given by
di;jt ¼ O�1t qBj
Xd
l¼1
qlBj Bl;i
" #ðX tÞ for i; j ¼ 1; . . . ; d, (114)
bi;jt ¼ O�1t
Xd
l¼1
qlBjBl;i
" #ðX tÞ for i; j ¼ 1; . . . ; d (115)
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 47
and with fhk;jt : j ¼ 1; . . . ; dg the integrands in the martingale representation of Hk
T ,
HkT � E0½H
kT � ¼
Xd
j¼1
Z T
0
hk;js dW j
s for k ¼ 1; . . . ; d. (116)
Proof. To establish this limit, recall that X N is good and that the coefficients arecontinuous by assumption. It follows that RN is good as well. Consider now the limit
offfiffiffiffiffiNp
E0½ðHNT Þ0R T
0 ðONs Þ�1d½RN ; IN
4 �s�. Given that
ffiffiffiffiffiNp
Z T
0
ðONs Þ�1 d½RN ; IN
4 �s ¼Xd
i;j¼1
Z T
0
di;js dV 2;i;N
s þ oPð1Þ, (117)
where di;j is the fixed random variable (114) we can apply Lemma 4 to obtain theweak limit
ffiffiffiffiffiNp
Z T
0
ðONs Þ�1 d½RN ; IN
4 �s )Xd
i;j¼1
Z T
0
di;js dV 2;i
s þXd
i;j¼1
½di;j ;V2;i�T . (118)
Weak convergence, along with the uniform integrability offfiffiffiffiffiNpðHN
T Þ0R T
0ðON
s Þ�1 d½RN ; IN
4 �s in assumption (111), implies convergence in means. We obtain,
limN!1
ffiffiffiffiffiNp
E0 ðHNT Þ0
Z T
0
ðONs Þ�1 d½RN ; IN
4 �s
� �
¼ E0 H 0T
Xd
i;j¼1
Z T
0
di;js dV 2;i
s þXd
i;j¼1
½di;j ;V 2;i�T
!" #
¼1
2
Xd
i;j¼1
E0 H 0T
Z T
0
di;js dW i
s þ ½di;j ;W i�T
� �� �,
where the last equality uses the independence of ðdi;j� ;HT Þ from Zi
� in V 2;i.Next, consider the limit of
ffiffiffiffiffiNp
E0½ðHNT Þ0R T
0 ðONs Þ�1 dIN
4;s� where HT is an arbitrarysquare integrable vector of FT -measurable random variables. By the Martingale
Representation Theorem we can, for each k ¼ 1; . . . ; d, write Hk;NT � E0½H
k;NT � ¼Pd
l¼1
R T
0hk;l;N
s dW ls. As
ffiffiffiffiffiNp
d½V4;i;j;N ;W l �t ¼ 1fj¼lg dV2;i;Nt we have
Xd
l¼1
Z �0
hk;l;Ns dW l
s;ffiffiffiffiffiNp
Z �0
ðONs Þ�1 dIN
4;s
� �T
¼Xd
i;j¼1
Z T
0
hk;j;Ns bi;j
s dV 2;i;Ns þ oPð1Þ
(119)
)Xd
i;j¼1
Z T
0
hk;js bi;j
s dV 2;is þ ½h
k;jbi;j ;V 2;i�T , (120)
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6848
where the limit in the second line follows from Lemma 4 and goodness of thesequence hk;j;N .
Combining the results above gives, with hk;N0 �
ffiffiffiffiffiNp
E0½Hk;NT
R T
0 ðONs Þ�1 dIN
4;s�,
limN!1
hk;N0 ¼ lim
N!1
ffiffiffiffiffiNp
E0 E0½Hk;NT � þ
Xd
l¼1
Z T
0
hk;l;Ns dW l
s
! Z T
0
ðONs Þ�1 dIN
4;s
� �" #
¼ limN!1
ffiffiffiffiffiNp Xd
l¼1
E0
Z �0
hk;l;Ns dW l
s;
Z �0
ðONs Þ�1 dIN
4;s
� �T
� �
¼ limN!1
Xd
i;j¼1
E0
Z T
0
hk;j;Ns bi;j
s dV2;i;Ns
� �
¼Xd
i;j¼1
E0
Z T
0
hk;js bi;j
s dV 2;is þ ½h
k;jbi;j ;V2;i�T
� �
¼1
2
Xd
i;j¼1
E0½½hk;jbi;j ;W i�T �.
The second line in this derivation uses E0½R T
0 ðONs Þ�1 dIN
4;s� ¼ 0. The third and fourthlines follow from (119), (120) and the uniform integrability assumption (111). The
last line uses E0½R T
0 hk;js bi;j
s dV2;is � ¼ 0 for all i; j; k ¼ 1; . . . ; d. &
We now proceed with the proof of Theorem 4.
Proof of Theorem 4. By Lemma 7 we know that
ffiffiffiffiffiNp
E0 ðHNT Þ0
Z T
0
ðONs Þ�1dðIN
4;s � ½RN ; IN
4 �sÞ
� �!
1
2E0½U2;T �, (121)
with U2;T as in (113), for any FT -measurable sequence of random vectors HNT ) HT .
Given that other terms in (109) are of the same order and that each term isasymptotically uniformly integrable by assumption, we can find the expectedapproximation error from the weak limits of these terms.
As P� limN!1UX N¼ 0, and
d½RN ; IN3 �s ¼
Xd
j;k¼1
qBkðX sÞgj;N3;s d½W
k;V3;j;N �s
¼Xd
j;k¼1
qBkðX sÞgj;N3;s dV 1;N
s 1fk¼jg
¼Xd
j¼1
qBjðX sÞgj;N3;s dV 1;N
s
with
gj;N3;s �
Xd
l¼1
qlBjðX s þ l4;lelUX N
ls ÞAlðX
NZN
sÞ, (122)
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 49
we get
½RN ; IN3 �T )
1
2
Z T
0
Xd
j¼1
½qBj qBj A�ðX sÞds. (123)
For the remaining terms we have ðIN1 ; I
N2 ; I
N3 Þ ) ðI1; I2; I3Þ where
I1;T ¼1
2
Z T
0
qAðX sÞAðX sÞds,
I2;T ¼
Z T
0
qAðX sÞXd
j¼1
BjðX sÞ1
2dW j
s þ1ffiffiffiffiffi12p dZj
s
� �
þ1
2
Z T
0
Xd
j;k;l¼1
½qkðqlABl;jÞBk;j�ðX sÞds,
I3;T ¼
Z T
0
Xd
j¼1
qBjðX sÞAðX sÞ1
2dW j
s �1ffiffiffiffiffi12p dZj
s
� �.
For completeness we establish the limit of IN2;T . The other two limits are obtained
along the same lines. Applying Lemma 4 with aNs ¼
Pdl;j¼1 qlAðX
Ns þ
l3;lelUX N
ls ÞBl;jðX
NZN
sÞ, where the sequence aN is good, givesZ T
0
aNs dV 2;j;N
s )
Z T
0
as dV 2;js þ ½a;V
2;j�T , (124)
where as ¼Pd
l;j¼1 ½qlABl;j�ðX sÞ. Given that V2;jt ¼
12
Wjt þ
1ffiffiffiffi12p Z
jt Ito’s lemma implies
½a;V2;j�T ¼1
2
Z T
0
Xd
l;j¼1
½q½qlABl;j��ðX sÞd½X ;Wj�s
¼1
2
Z T
0
Xd
l;j;k¼1
½qk½qlABl;j�Bk;j�ðX sÞds, ð125Þ
where the second equality follows from d½X ;W j�s ¼ BjðX sÞds. This explains the limitof IN
2 .As g 2 C3ðRdÞ and as X N converges, the mean value theorem shows that
NðgðX NT Þ � gðX T ÞÞ ¼ qgðX T þ diag½l�UX N
T ÞNUXN
T
¼ qgðX T þ diag½l�UX N
T ÞONT ðNðO
NT Þ�1UX N
T Þ
for some vector l ¼ ½l1; . . . ; ld � with li 2�0; 1½; i ¼ 1; . . . d, where diag½l� is the
diagonal matrix with l on the diagonal and zeros elsewhere, and where NðONT Þ�1UXN
T
is defined in (109).Now define Hk
T � qgðX T ÞOkT for k ¼ 1; . . . ; d where Ok is the kth column of the
matrix process O. As NðgðX NT Þ � gðX T ÞÞ is uniformly integrable by assumption (31) we
can apply Lemma 7 to conclude that the expected approximation error converges to
NE0½gðXNT Þ � gðX T Þ� ! E0½qgðX T ÞU1;T �U2;T �, (126)
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6850
where ðOT Þ�1U1;T is the limit of the first 3 terms in (109), i.e.
ðOT Þ�1U1;T ¼ �
Z T
0
ðOsÞ�1 dI1;s �
Z T
0
ðOsÞ�1 dI2;s �
Z T
0
ðOsÞ�1 dI3;s
þ1
2
Z T
0
Xd
j¼1
½qBj qBj A�ðX sÞds� ð127Þ
and U2;T is defined in Lemma 7 with HkT ¼ qgðX T ÞOk
T .
Let us now simplify U2;T . Note that (113) can be written as U2;T �Pdi;j;k¼1 ðU
k;i;j2;1;T�Hk
T Uk;i;j2;2;T Þ where U
k;i;j2;1;T � ½h
k;jbk;i;j ;W i�T and Uk;i;j2;2;T �
R T
0 dk;i;js dW i
s
þ½dk;i;j ;W i�T .
To simplify the second term, Uk;i;j2;2;T , recall the d � 1 process di;j
t ¼
O�1t ½qBj
Pdl¼1 qlBj Bl;i�ðX tÞ. Ito’s lemma gives
ddi;jt ¼ O�1t d qBj
Xd
l¼1
qlBj Bl;i
!þ ðdðO�1t ÞÞ qBj
Xd
l¼1
qlBj Bl;i
!
þ d O�1; qBj
Xd
l¼1
qlBj Bl;i
" #. ð128Þ
From dOt ¼ ½qAðX tÞdtþPd
j¼1 qBjðX tÞdWjt�Ot we deduce
dO�1t ¼ � O�1t ðdOtÞO�1t � O�1t d½O;O�1�t
¼ � O�1t qAðX tÞdtþXd
j¼1
qBjðX tÞdWjt
" #þ O�1t
Xd
j¼1
½qBj qBj�ðX tÞdt
ð129Þ
and
d½di;j ;W i�t ¼ O�1t d qBj
Xd
l¼1
qlBj Bl;i;Wi
" #t
þ d½O�1;W i�t qBj
Xd
l¼1
qlBj Bl;i
!
¼ O�1t q qBj
Xd
l¼1
qlBj Bl;i
" #d½X ;W i�t
� O�1X
j
qBj d½Wj ;W i�t qBj
Xd
l¼1
qlBj Bl;i
!
¼ O�1t q qBj
Xd
l¼1
qlBj Bl;i
" #Bi dt� O�1qBi qBj
Xd
l¼1
qlBj Bl;i
!dt.
Given thatPd
l¼1 qlBj Bl;i ¼ qBj Bi we can write
Xd
i;j¼1
½di;j ;W i�T ¼
Z T
0
Xd
i;j¼1
d½di;j ;W i�t ¼
Z T
0
O�1t ZðX tÞdt, (130)
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 51
with
Z �Xd
i;j¼1
ðq½qBj qBj Bi�Bi � qBi qBj qBj BiÞ. (131)
Let us now simplify the first term, Uk;i;j2;1;T � ½h
k;jbk;i;j ;W i�T . The Clark–Oconeformula (see Nualart, 1995) and the chain rule of Malliavin calculus give
Uk;i;j2;1;T � ½h
k;jbk;i;j ;W i�T ¼
Z T
0
Es½Di;sðhk;js bk;i;j
s Þ�ds
¼
Z T
0
Es½ðDi;shk;js Þb
k;i;js þ hk;j
s ðDi;sbk;i;js Þ�ds.
By definition, hk;j is the process in the representation of HkT � qgðX T ÞOk
T . TheClark–Ocone formula gives hk;j
t ¼ Et½Dj;tðPd
l¼1 qlgðX T ÞOl;kT Þ� and the rules of
Malliavin calculus establish that
Dj;t
Xd
l¼1
qlgðX T ÞOl;kT
!¼Xd
l;n¼1
q2l;ngðX T ÞðDj;tX n;T ÞOl;kT þ
Xd
l¼1
qlgðX T ÞDj;tOl;kT ,
(132)
D2i;j;t
Xd
l¼1
qlgðX T ÞOl;kT
!¼
Xd
l;m;n¼1
q3l;m;ngðX T ÞðDi;tX m;T ÞðDj;tX n;T ÞOl;kT
þXd
l;n¼1
q2l;ngðX T ÞðD2i;j;tX n;T ÞO
l;kT
þXd
l;n¼1
q2l;ngðX T ÞðDj;tX n;T ÞDi;tOl;kT
þXd
l;n¼1
q2l;ngðX T ÞðDi;tX n;T ÞDj;tOl;kT
þXd
l¼1
qlgðX T ÞD2i;j;tO
l;kT ,
where
Di;tX m;T ¼Xd
h¼1
Om;ht;T Bh;iðX tÞ, (133)
D2i;j;tX m;T ¼
Xd
h¼1
ðDi;tOm;ht;T ÞBh;jðX tÞ þ
Xd
h¼1
Om;ht;T ½ðqBh;jÞBi�ðX tÞ (134)
and where, from
dOl;kt;v ¼
Xd
h¼1
qhAlðX vÞdvþXd
j¼1
qhBl;jðX vÞdW jv
" #Oh;k
t;v ; Ol;kt;t ¼ 1fl¼kg, (135)
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6852
we deduce that Gl;k1;i ðt;TÞ � Di;tO
l;kt;T and Gl;k
2;i;jðt;TÞ ¼ D2i;j;tO
l;kt;T solve the linear
SDEs,
dGl;k1;i ðt; vÞ ¼
Xd
h¼1
qhAlðX vÞdvþXd
j¼1
qhBl;jðX vÞdW jv
!Gh;k1;i ðt; vÞ
þXd
h;m¼1
q2h;mAlðX vÞdvþXd
j¼1
q2h;mBl;jðX vÞdW jv
!Oh;k
v ðDi;tX m;vÞ
with Gl;k1;i ðt; tÞ ¼
Pdh¼1 qhBl;iðX tÞOh;k
t for i; l; k ¼ 1; . . . ; d, and
dGl;k2;i;jðt; vÞ ¼
Xd
h¼1
qhAlðX vÞdvþXd
p¼1
qhBl;pðX vÞdW pv
!Gh;k2;i;jðt; vÞ
þXd
h;m¼1
q2h;mAlðX vÞdvþXd
p¼1
q2h;mBl;pðX vÞdW pv
!ðDi;tX m;vÞG
h;k1;j ðt; vÞ
þXd
h;m¼1
q2h;mAlðX vÞdvþXd
p¼1
q2h;mBl;pðX vÞdW pv
!Oh;k
v D2i;j;tX m;v
þXd
h;m;n¼1
q3h;m;nAlðX vÞdvþXd
p¼1
q3h;m;nBl;pðX vÞdW pv
!
�Oh;kv ðDj;tX m;vÞDi;tX n;v,
with Gl;k2;i;jðt; tÞ¼
Pdh;n¼1 q
2h;nBl;jðX tÞBn;iðX tÞOh;k
t þPd
h¼1 qhBl;jðX tÞGh;ki ðt; tÞ for i; j; l; k ¼
1; . . . ; d.From the definition bi;j
t ¼ O�1t ½ðqBjÞBi�ðX tÞ we also deduce that
Di;tbi;jt ¼ Di;tðO�1t ½ðqBjÞBi�ðX tÞÞ ¼ O�1t ½ðq½ðqBjÞBi� � qBiðqBjÞÞBi�ðX tÞ. (136)
Setting ni;jðt;TÞ � Et½ðDi;thk;jt Þb
k;i;jt þ hk;j
t ðDi;tbk;i;jt Þ� and using the law of iterated
expectations gives
ni;jðt;TÞ ¼Xd
k¼1
ðCk;i;jðt;TÞ½ðqBjÞBi�ðX tÞ þ Fk;i;jðt;TÞO�1t
�½ðq½ðqBjÞBi� � qBiðqBjÞÞBi�ðX tÞÞ, ð137Þ
where
Fk;i;jðt;TÞ ¼Xd
l;n¼1
q2l;ngðX T ÞXd
h¼1
On;ht;T Bh;jðX tÞ
!Ol;k
T þXd
l¼1
qlgðX T ÞGl;k1;j ðt;TÞ,
(138)
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 53
Ck;i;jðt;TÞ ¼Xd
l;m;n¼1
q3l;m;ngðX T ÞXd
h¼1
Om;ht;T Bh;iðX tÞ
! Xd
h¼1
On;ht;T Bh;jðX tÞ
!Ol;k
T
þXd
l;m¼1
q2l;mgðX T ÞXd
h¼1
Gm;h1;i ðt;TÞBh;jðX tÞ þ
Xd
h¼1
Om;hT ½ðqBh;jÞBi�ðX tÞ
!Ol;k
T
þXd
l;m¼1
q2l;mgðX T ÞXd
h¼1
Om;ht;T Bh;jðX tÞ
!Gl;k1;i ðt;TÞ ð139Þ
and where Ol;kt;v solves (135) while Gl;k
1;i ðt; vÞ and Gl;k2;i;jðt; vÞ solve the linear SDEs
dGl;k1;i ðt; vÞ ¼
Xd
h¼1
qhAlðX vÞdvþXd
j¼1
qhBl;jðX vÞdW jv
!Gh;k1;i ðt; vÞ
þXd
h;m¼1
q2h;mAlðX vÞdvþXd
j¼1
q2h;mBl;jðX vÞdW jv
!
�Oh;kv
Xd
n¼1
Om;nt;v Bn;iðX tÞ
!ð140Þ
with Gl;k1;i ðt; tÞ ¼
Pdh¼1 qhBl;iðX tÞOh;k
t , and
dGl;k2;i;jðt; vÞ ¼
Xd
h¼1
qhAlðX vÞdvþXd
p¼1
qhBl;pðX vÞdW pv
!Gh;k2;i;jðt; vÞ
þXd
h;m¼1
q2h;mAlðX vÞdvþXd
p¼1
q2h;mBl;pðX vÞdW pv
!
�Xd
n¼1
Om;nt;v Bn;iðX tÞ
!Gh;k1;j ðt; vÞ
þXd
h;m¼1
q2h;mAlðX vÞdvþXd
p¼1
q2h;mBl;pðX vÞdW pv
!Oh;k
v
�Xd
h¼1
Gm;h1;i ðt; vÞBh;jðX tÞ þ Om;h
t;v ½ðqBh;jÞBi�ðX tÞ
� �
þXd
h;m;n¼1
q3h;m;nAlðX vÞdvþXd
p¼1
q3h;m;nBl;pðX vÞdW pv
!Oh;k
v
�Xd
p¼1
Om;pt;v Bp;jðX tÞ
! Xd
q¼1
On;qt;v Bq;iðX tÞ
!ð141Þ
with Gl;k2;i;jðt; tÞ ¼
Pdh;n¼1 q
2h;nBl;jðX tÞBn;iðX tÞOh;k
t þPd
h¼1 qhBl;jðX tÞGh;k1;i ðt; tÞ.
Substituting the definitions U2;T �Pd
i;j;k¼1 ðUk;i;j2;1;T �Hk
T Uk;i;j2;2;T Þ and Hk
T �
qgðX T ÞOkT in (126), using the results above and eliminating expectations of
stochastic integrals with respect to Zj (which is independent from FT ), enables us
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6854
to rewrite the limit (126) as
NE0½gðXNT Þ � gðX T Þ� ! E0 qgðX T ÞU1;T �
Xd
i;j;k¼1
ðUk;i;j2;1;T �Hk
T Uk;i;j2;2;T Þ
" #
¼ E0 qgðX T Þ U1;T þXd
i;j;k¼1
OkT U
k;i;j2;2;T
!�Xd
i;j;k¼1
Uk;i;j2;1;T
" #
¼1
2E0½qgðX T ÞV 1;T þ V 2;T � ð142Þ
with V1;T ;V 2;T as defined in (29) and (30). This establishes the result. &
Proof of Corollary 2. The estimatorPN�1
n¼0 gðX NnhÞh based on the Euler discrete
approximation X Nnh with h ¼ T=N, is asymptotically equivalent to the estimatorR T
0gðX N
ZNsÞds based on the Euler continuous approximation X N
ZNs(Jacod and Protter,
1998, Lemma 1). To find the asymptotic limit of the expected approximation error ofthe Riemann integral, we can then study the approximation error of the continuousEuler approximation. This error has two parts:Z T
0
ðgðX NZN
sÞds� gðX sÞÞds ¼ HN
1T ðX 0Þ þHN2T ðX 0Þ, (143)
where
HN1T �
Z T
0
ðgðX Ns Þ � gðX sÞÞds,
HN2T � �
Z T
0
ðgðX Ns Þ � gðX N
ZNsÞÞds.
By Theorem 4 NE0½gðXNs Þ � gðX sÞ� !
12
KsðX 0Þ, P0-a.s., so that
NE0½HN1T � !
1
2
Z T
0
KsðX 0Þds, (144)
P0-a.s. It remains to study the second component, HN2T . As shown in the proof of
Theorem 1, we get, by the mean value theorem,
�NHN2T ¼
Z T
0
qgðX sÞ AðX NZN
sÞdV 1;N
s þXd
j¼1
BjðXNZN
sÞdV2;j;N
s
!þ oPð1Þ. (145)
The result now follows using the same arguments as those invoked in the proof ofTheorem 4 to find the expected value of limits involving the processes V1;N and V 2;N .In particular, with aN
1s � qgðX sÞAðXNZN
sÞ and aj;N
2s � qgðX sÞBjðXNZN
sÞ, we use
E0
Z T
0
aN1s dV 1;N
s
� �!
1
2E0
Z T
0
a1s ds
� �, (146)
E0
Z T
0
aj;N2s dV 2;j;N
s
� �! E0
Z T
0
ðaj2s dV 2;j
s þ d½aj2;V
2;j�sÞ
� �, (147)
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 55
P0-a.s., where a1s ¼ ½ðqgÞA�ðX sÞ and aj2s ¼ ½ðqgÞBj�ðX sÞ. The expression for K2T ðX 0Þ
follows because the expectation of the integral relative to the martingale V2;j is nulland because d½aj
2;V2;j�s ¼
12d½aj
2;Wj�s. Calculating d½aj
2;Wj �s yields the expression for
K2T ðX 0Þ. &
We now turn to the expected error associated with the Doss transformation.
Proof of Theorem 5. Note that all the terms of the error expansion (103) are of order1=N and that the limit distribution is non-centered. If the terms in the errorexpansion are uniformly integrable, the result follows by taking the expectation ofthe solution of the linear SDE for the approximation error
NUX
Nt
T ¼ �ON
T
Z T
0
ðON
s Þ�1 dI
N
1;s þ
Z T
0
ðON
s Þ�1 dI
N
2;s
� �, (148)
where ON
T ¼ EðRNÞT with R
N
T ¼ ½RN
1;T ; . . . ; RN
d ;T � and RN
i;T ¼ qiAðX s þ l1;ieiUX
N
s Þ fori ¼ 1; . . . ; d and where
IN
1;T �
Z T
0
Xd
l¼1
ql AðXN
s þ l3;lelUX
Nl
s ÞAlðXN
ZNsÞdV 1;N
s ,
IN
2;T �
Z T
0
Xd
l¼1
ql AðXN
s þ l3;lelUX
Nl
s ÞXd
j¼1
Bl;jðXN
ZNsÞdV2;j;N
s .
As ðON
t ; IN
1;t; IN
2;t; XN
t Þ ) ðOt; I1;t; I2;t; X tÞ with Ot; X t as defined in Theorem 2 and
I1;T �1
2
Z T
0
½ðqAÞA�ðX sÞds
I2;T �
Z T
0
qAXd
j¼1
Bj
" #ðX sÞ
1
2dW j
s þ1ffiffiffiffiffi12p dZj
s
� �
þ1
2
Z T
0
Xd
j;k;l¼1
ql;kAðX sÞBk;j Bl;j ds
(the expression for I2;T follows from (104)), the result follows using the samearguments as in the proof of Theorem 2. &
Proof of Corollary 3. We proceed as in the proof of Corollary 2. From Theorem 5 itfollows that
NE0
Z T
0
ðgðXN
s Þ � gðX sÞÞds
� �!
1
2E0
Z T
0
qgðX sÞV s ds
� �. (149)
The result for the second part of the expected approximation error, NE0½R T
0 ðgðXN
s Þ�
gðXN
ZNsÞÞds�, follows using the same arguments as in the proof of Corollary 2 for
K2T ðX 0Þ. Simply replace the drift coefficient A by A and the volatility coefficient B
by B. With Bj deterministic we get qBj ¼ 0 and the expression announcedfollows. &
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6856
Proof of Theorem 6. Consider the Euler discrete approximations (172)–(173), inAppendix B, of the processes NCN
j;� for j ¼ 1; 2. The Euler continuous approxima-tions are
dðNCN1;sÞ ¼ qAðX N
ZNsÞhþ
Xd
j¼1
qBjðXNZN
sÞdW j
s
" #ðNCN
1;sÞ
�Xd
i;j¼1
½q½qBj qBj Bi�Bi�ðXNZN
sÞ
!ds
þ ðqAÞAþXd
j¼1
qBj qA Bj þXd
j;k;l¼1
qkðqlABl;jÞBk;j
" #ðX N
ZNsÞ
!ds
þXd
j¼1
½ðqAÞBj þ ðqBjÞA�ðXNZN
sÞdW j
s �Xd
i;j¼1
ðqBjÞðqBjÞBi�ðXNZN
sÞdW i
s,
dðNCN2;sÞ ¼
Xd
i;j¼1
nNi;jðZ
Ns ;TÞds
with CN1;0 ¼ CN
2;0 ¼ 0. Thus, NE0½qgðX NNhÞC
N1;Nh þ CN
2;Nh� ! �KT ðX 0Þ.We can use the arguments in the proofs of Theorems 4 and 5 to show that, for NM
and �M ¼ffiffiffiffiffiffiMp
=NM such that limM!1 �M ¼ �o1 and limM!1NM ¼ 1, we have
limM!1
1
2
1ffiffiffiffiffiffiMp
XMi¼1
qgðXi;NMNM hÞC
i;NM1;NM h þ C
i;NM2;NM h
� �
¼1
2� lim
NM!1NME0 qgðX
NMT ÞC
NM1;T þ C
NM2;T
h i
¼ �1
2�KT ðX 0Þ.
Defining
gN ;Mc �
1
M
XMi¼1
gðX i;NNh Þ þ
1
2ðqgðX i;N
Nh ÞCi;N1;Nh þ Ci;N
2;NhÞ
� �, (150)
it then follows that
1
2
1ffiffiffiffiffiffiMp
XMi¼1
ðqgðXi;NMNM hÞC
i;NM1;NM h þ C
i;NM2;NM hÞ
!(151)
corrects the asymptotic second-order bias for the estimator without transformationwhen M !1.
The proof for the estimator with transformation follows the same steps. In this
case the average over independent replications of the random variables 12 qgðX
N
T ÞCN
nh
approximates the negative of the second-order bias with transformation.
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 57
The asymptotic equivalence of the bias-corrected estimators with and withouttransformation is a consequence of the fact that they have the same asymptoticdistribution. &
Proof of Corollary 4. The proof follows from the arguments used to establishTheorem 3 and from Corollaries 2 and 3. &
To prove convergence results for the Milshtein scheme, we need the followinglemmas.
Lemma 8. The following weak convergence results hold
V5;l;j;Nt � N3=2
Z t
0
Z v
ZNv
Z u
ZNv
dW lr dW j
u dv)
ffiffiffi2p
6Z
l;jt þ
1
6~Z
l;j� V
5;l;jt , (152)
V6;l;j;i;Nt � N
Z t
0
Z v
ZNv
Z u
ZNv
dW lr dW j
u dW iv )
1ffiffiffi6p ~Z
l;j;it � V
6;l;j;it (153)
as N !1, where ððZl;jÞl;j2f1;...;dg; ð ~Zl;jÞl;j2f1;...;dg; ð ~Z
l;j;iÞl;j;i2f1;...;dgÞ is a ð2d2
þ d3Þ-
dimensional standard Brownian motion and where ðZl;jÞl;j2f1;...;dg is the Brownian
motion in the limit V 4;l;j in Lemma 1.
Proof. Using the scaling property of Brownian motion we obtain
V5;l;j;Nt ¼
1ffiffiffiffiffiNp
Z Nt
0
Z v
½v�
Z u
½v�
dW lr dW j
u dv
¼1ffiffiffiffiffiNp
X½Nt�
k¼1
Z½k�1;k½
Z v
k�1
Z u
k�1
dW lr dW j
u dv
þ1ffiffiffiffiffiNp
Z Nt
½Nt�
Z v
½Nt�
Z u
½Nt�
dW lr dW j
u dv.
Integration by parts givesR½k�1;k½
R v
k�1
R u
k�1 dW lr dW j
u dv ¼R½k�1;k½ðk � vÞR v
k�1 dW lr dW j
v. The sequence of i.i.d. random variablesR½k�1;k½ðk � vÞR v
k�1dW l
r dW jv is independent of W j, has mean zero, variance 1=12 and covariance
withR½k�1;k½
R v
k�1 dW mr dW n
v of ð1=6Þ1fl¼m;j¼ng. Donsker’s functional central limit
Theorem (see Kallenberg, 1997, Theorem 2.9, p. 225) then gives
1ffiffiffiffiffiNp
X½Nt�
k¼1
Z½k�1;k½
ðk � vÞ
Z v
k�1
dW lr dW j
v )
ffiffiffi2p
6Z
l;jt þ
1
6~Z
l;jt . (154)
This result and the continuity of the Riemann integral P� limN!1ð1=ffiffiffiffiffiNpÞRNt
½Nt�
R v
½Nt�
R u
½Nt�dW l
r dW ju dv ¼ 0, establishes (152).
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6858
Similarly, using the scaling property of Brownian motion we obtain
V6;l;j;i;Nt ¼
1ffiffiffiffiffiNp
Z Nt
0
Z v
½v�
Z u
½v�
dW lr dW j
u dW iv
¼1ffiffiffiffiffiNp
X½Nt�
k¼1
Z½k�1;k½
Z v
k�1
Z u
k�1
dW lr dW j
u dW iv
þ1ffiffiffiffiffiNp
Z Nt
½Nt�
Z v
½Nt�
Z u
½Nt�
dW lr dW j
u dW iv.
As the sequence of i.i.d random variablesR½k�1;k½
R v
k�1
R u
k�1 dW lr dW j
u dW iv is
independent of W j andR½k�1;k½
R v
k�1 dW mr dW n
v for all m; n ¼ 1; . . . ; d, and hasvariance 1=6, Donsker’s invariance principle establishes
1ffiffiffiffiffiNp
X½Nt�
k¼1
Z½k�1;k½
Z v
k�1
Z u
k�1
dW lr dW j
u dW iv )
1ffiffiffi6p ~Z
l;j;it . (155)
As by the continuity of the stochastic integral P� limN!1ð1=ffiffiffiffiffiNpÞRNt
½Nt�
R v
½Nt�
R u
½Nt�dW l
r dW ju dW i
v ¼ 0, this proves (153). &
Lemma 9. The semimartingale V6;l;j;i;N is good.
Proof. As for Lemma 2 it is sufficient to show that supN VAR½V6;l;j;i;Nt �o1.
Using
VAR½V6;l;j;i;Nt � ¼ N�1E
Z Nt
0
Z v
½v�
Z u
½v�
dW lr dW j
u dW iv
� �2" #
¼1
N
Z Nt
0
Z v
½v�
ðu� ½v�Þdudv ¼1
2
1
N
Z Nt
0
ðv� ½v�Þ2 dv
¼1
2
1
N
X½Nt�
k¼1
Z½k�1;k½
ðv� ðk � 1ÞÞ2 dvþ1
2
1
N
Z Nt
½Nt�
ðv� ½Nt�Þ2 dv
¼1
6
½Nt�
NþðNt� ½Nt�Þ3
N
� �o
1
6t,
we conclude that V6;l;j;i;N is good. &
Lemma 10. The semimartingale V 5;l;j;N is not good. For any sequence of good semi-
martingales aN such that ðaN ;V5;l;j;N Þ ) ða;V 5;l;jÞ we haveR T
0 aNs dV 5;l;j;N
s )R T
0as dV 5;l;j
s þ ½a;V5;l;j�T , for l; j ¼ 1; . . . ; d.
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 59
Proof. Let aN be an arbitrary sequence of good semimartingales such thatðaN ;V5;l;j;N Þ ) ða;V 5;l;jÞ. Integration by parts givesZ T
0
aNs dV 5;l;j;N
s ¼ aNT V
5;l;j;NT �
Z T
0
V 5;l;j;Ns daN
s
) aT V5;l;jT �
Z T
0
V 5;l;js das
¼
Z T
0
as dV5;l;js þ ½a;V
5;l;j �T . ð156Þ
The semimartingale V 5;l;j;N is good if and only if ½a;V5;l;j�T ¼ 0. To show thatgoodness fails it suffices to find an aN such that ½a;V5;l;j �Ta0. Taking aN ¼ V 4;l;j;N
we obtain ½V4;l;j ;V5;l;j�T ¼16 T . The conclusion follows. The limit is given by
(156). &
The expansion of ~U~X
N
T � ~XN
T � X T for the approximation error of the Milshteinscheme is obtained along the same lines as (102) for the Euler scheme. We get
~U~X
N
T ¼
Z T
0
Xd
l¼1
qlAðX s þ l1;lel~U~X
Nl
s Þ~U~X
Nl
s ds
þ
Z T
0
Xd
l¼1
Xd
j¼1
qlBjðX s þ l2;lel~U~X
Nl
s Þ~U~X
Nl
s dW js
�1
N
Z T
0
Xd
l¼1
qlAð ~XN
s þ l3;lelU~X
Nl
s ÞAlð ~XN
ZNsÞdV1;N
s
�1
N
Z T
0
Xd
l¼1
qlAð ~XN
s þ l3;lelU~X
Nl
s ÞXd
j¼1
Bl;jð ~XN
ZNsÞdV 2;j;N
s
�1
N
Z T
0
Xd
l;j¼1
qlBjð ~XN
s þ l4;lelU~X
Nl
s ÞAlð ~XN
ZNsÞdV 3;j;N
s
�1
N
Z T
0
Xd
i;k;l;j¼1
qkBið ~XN
s þ l4;kekU~X
Nk
s Þ½qBlBj�kð~X
N
ZNsÞdV 6;l;j;i;N
s
�1
N3=2
Z T
0
Xd
k;l;j¼1
qkAð ~XN
s þ l3;kekU~X
Nk
s Þ½qBl Bj�kð~X
N
ZNsÞdV5;l;j;N
s , ð157Þ
where ~U~X
Nl
s ¼ ~XN
l;s � X l;s and U~X
Nl
s � ~XN
l;s �~X l;ZN
s. Comparing (157) with (102),
shows that the additional term in the Milshtein scheme eliminates integrals withrespect to V 4;i;j;N in the error expansion of the Euler scheme. At the same time itinduces two additional integrals, with respect to V 5;l;j;N and V 6;l;j;i;N (lines 5 and 6 of
(157)), in the approximation errors Að ~XN
ZNsÞ � Að ~X
N
s Þ and Bjð ~XN
ZNsÞ � Bjð ~X
N
s Þ.
Proof of Theorem 7. Lemmas 9 and 10 imply that the term in line 6 of (157) is ofhigher order than those in lines 2–5. Lemmas 1–5 and 8–10 imply that lines 1–5 of
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6860
(157) converge weakly to a linear SDE whose solution is the expression for U~X
T inTheorem 7. &
Proof of Theorem 8. Note that the last two terms of ~UX
T are both products of twoindependent random variables one of which has null expectation. Thus, using
E OT
Z T
0
O�1s ½ðqAÞBj � ðqBjÞA�ðX sÞdZjs
� �
¼ E OT
Z T
0
O�1s ½ðqBiÞðqBlÞBj�ðX sÞd ~Zl;j;is
� �¼ 0, ð158Þ
for l; j; i ¼ 1; . . . ; d leads to the result announced. &
Proof of Theorem 9. The proof is the same as for the Euler scheme withtransformation. &
Proof of Corollary 5. We proceed as in the proof of Corollary 2. Decompose the
approximation error in two parts, ~HN
1T and ~HN
2T . These are the equivalents of HN1T
and HN2T in Corollary 2, obtained by replacing the Euler scheme by the Milshtein
scheme. The limit of NE0½ ~HN
1T �, namely 12
R T
0~KsðX 0Þds, is found using the same
arguments as for the Euler scheme.
The second component ~HN
2T can be written, using the mean value theorem, as
�N ~HN
2T ¼
Z T
0
qgðX sÞ Að ~XN
ZNsÞdV 1;N
s þXd
j¼1
Bjð ~XN
ZNsÞdV 2;N
s
þ1ffiffiffiffiffiNp
Xd
l;j¼1
½ðqBlÞBj �ðX sÞdV5;l;j;Ns
!þ oPð1Þ. ð159Þ
The limits for the integrals with respect to V1;N and V2;j;N follow from the argumentsin the proof of Corollary 2. For the integral with respect to V5;l;j;N
s note that Lemmas8 and 10 imply
E1ffiffiffiffiffiNp
Xd
l;j¼1
½ðqBlÞBj�ðX sÞdV5;l;j;Ns
" #! 0, (160)
as N !1. This establishes the result announced. &
Proof of Theorem 10. The proof parallels the proof for the Euler scheme with
transformation. The average over independent replications of 12qgð ~X
N
T Þ~C
N
nh approx-
imates the negative of the second-order bias with transformation. The bias-correctedestimators with and without transformation are asymptotically equivalent becausethey share the same asymptotic distribution. &
Proof of Corollary 6. The proof is identical to the proof of Corollary 4. &
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 61
Proof of Theorem 11. From (70) we get
�1
L
XL�1l¼0
gDðY l ;Y lþ1; y0Þ ¼1
L
XL�1l¼0
JDðY lÞ0qyhDðY l ;Y lþ1; y0Þðy
L
D � y0Þ þ oPð1Þ.
(161)
By assumption, Y is an ergodic Markov process. We then get, from the central limittheorem for ergodic processes,ffiffiffiffi
Lpðy
L
D � y0Þ ) S�1=2D Z (162)
where Z�Nð0; IpÞ, and SDðy0Þ � ðADðy0Þ�1Þ0BDðy0ÞADðy0Þ
�1, with
ADðy0Þ0�
ZRd
JDðyÞ0GDðyÞp
y0 ðyÞlðdyÞ,
BDðy0Þ �ZRd
JDðyÞ0CDðyÞJDðyÞp
y0 ðyÞlðdyÞ,
GDðyÞ �
ZRd
qyhðy; z; y0Þpy0D ðy; zÞlðdzÞ,
CDðyÞ �
ZRd
hðy; z; y0Þðhðy; z; y0ÞÞ0p
y0D ðy; zÞlðdzÞ.
Setting JDðyÞ0¼ GDðyÞ
0CDðyÞ�1 and using the Cauchy–Schwartz inequality gives the
variance lower bound
SDðy0Þ ¼ZRd
GDðyÞ0CDðyÞ
�1GDðyÞpy0 ðyÞlðdyÞ. (163)
Similarly, if P� limL!1 SL ¼ S, we can use (72) to show that
�1
L
XL
l¼1
ðS1=2gDðY l ;Y lþ1; y0ÞÞ
!01
L
XL
l¼1
qyðS1=2gDðY l ;Y lþ1; ~yL
DÞÞ
!
¼1
L
XL
l¼1
qyðS1=2gDðY l ;Y lþ1; y0ÞÞ
!01
L
XL
l¼1
qyðS1=2gDðY l ;Y lþ1; ~yL
DÞÞ
!
�ð~yL
D � y0Þ þ oPð1Þ.
It now follows thatffiffiffiffiLpð~y
L
D � y0Þ ) XDðy0Þ�1=2Z (164)
where XDðy0Þ ¼ ðCDðy0Þ�1Þ0DDðy0Þ
�1CDðy0Þ, with
CDðy0Þ0¼
ZRdðS1=2JDðyÞÞ
0GDðyÞpy0 ðyÞlðdyÞ, (165)
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6862
DDðy0Þ ¼ZRdðS1=2JDðyÞÞ
0CDðyÞðS1=2JDðyÞÞp
y0ðyÞlðdyÞ. (166)
This is of the same form as the variance SD offfiffiffiffiLpðy
L
D � y0Þ above for which JDðyÞ isthe optimal weight (by the Cauchy–Schwartz inequality). We conclude that theefficient GMM must have SL ¼ Ip. &
Proof of Theorem 12. The mean value theorem and P� limL!1 yL;ML;NL
D � yL
D
� �¼
0 imply the existence of yL%
such that P� limL!1 yL%¼ y0 and
� GL;ML;NLD ðY l ;Y lþ1; y
L%ÞffiffiffiffiLpðy
L;ML;NL
D � yL
DÞ
¼
ffiffiffiffiLpffiffiffiffiffiffiffiffiML
p1
L
XL�1l¼0
UgML ;NLðY l ;Y lþ1; y
L;ML;NL
D ; yL
DÞ
!, ð167Þ
where
UgML ;NLY l ;Y lþ1; y
L;ML;NL
D ; yL
D
� ��
ffiffiffiffiffiffiffiffiML
pg
ML;NLD ðY l ;Y lþ1; y
L;ML;NLÞ � gDðY l ;Y lþ1; y
LÞ
� �, ð168Þ
and
GL;ML;NLD ðY l ;Y lþ1; y
L%Þ �
1
L
XL�1l¼0
qygML;NLD ðY l ;Y lþ1; y
L%Þ. (169)
By the law of large numbers for ergodic Markov chains
P� limL!1
GL;ML;NLD ðY l ;Y lþ1; y
L%Þ ¼ SDðy0Þ. (170)
Similarly, using Theorems 4, 5, or 9, adjusted for the dependence on the observations
Y l , Y lþ1, and the fact that P� limL!1 ðyL;ML;NL
D � yL
DÞ ¼ 0, we conclude that
P� limL!1
1
L
XL�1l¼0
UgML ;NLðY l ;Y lþ1; y
L;ML;NL
D ; yL
DÞ ¼ e2kDðy0Þ (171)
if limL!1
ffiffiffiffiffiffiffiffiML
p=NL ¼ e2, where kDðy0Þ is defined in (77). The result announced
follows if limL!1
ffiffiffiffiffiffiffiffiffiffiffiffiffiffiL=ML
p¼ e1o1. &
Appendix B. Correcting for second-order bias
This appendix provides formulas for the bias-corrected estimators of condi-tional expectations. In the expressions that follow we use the notation h ¼ Dt
for the time increment. The terms in the bias-corrected estimator without
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 63
transformation, (42), are
CN1;ðnþ1Þh ¼ CN
1;nh þ qAðX NnhÞhþ
Xd
j¼1
qBjðXNnhÞDW
jnh
" #CN
1;nh
þ ðqAÞAþXd
j¼1
qBj qABj þXd
j;k;l¼1
qkðqlABl;jÞBk;j
" #ðX N
nhÞ
!h
N
�Xd
j;k¼1
½q½qBj qBj Bk�Bk�ðXNnhÞ
!h
N
þXd
j¼1
½ðqAÞBj þ ðqBjÞA�ðXNnhÞ
DWjnh
N
�Xd
j;k¼1
½ðqBjÞðqBjÞBk�ðXNnhÞ
DW knh
N, ð172Þ
Ci;N2;ðnþ1Þh ¼ CN
2;nh þXd
l;j¼1
nNl;jðnh;NhÞ
!h
N, (173)
where
ni;jðnh;NhÞ ¼Xd
k¼1
CNk;i;jðnh;NhÞ½ðqBjÞBi�ðX
NnhÞ
þXd
k¼1
FNk;i;jðnh;NhÞðON
nhÞ�1½ðq½ðqBjÞBi� � qBiðqBjÞÞBi�ðX
NnhÞ
with
FNk;i;jðnh;NhÞ ¼
Xd
l;n¼1
q2l;hgðX nN ÞXd
h¼1
On;h;Nnh;NhBh;jðX
NnhÞ
!Ol;k;N
Nh
þXd
l¼1
qlgðXNNhÞG
l;k1;j ðnh;NhÞ, ð174Þ
CNk;i;jðnh;NhÞ ¼
Xd
l;m;n¼1
q3l;m;ngðX NNhÞ
Xd
h¼1
Om;h;Nnh;Nh Bh;iðX
NnhÞ
!
�Xd
h¼1
On;h;Nnh;NhBh;jðX
NnhÞ
!Ol;k;N
Nh þXd
l;m¼1
q2l;mgðX NNhÞ
�Xd
h¼1
Gm;h;N1;i ðnh;NhÞBh;jðX
NnhÞ þ
Xd
h¼1
Om;h;NNh ½ðqBh;jÞBi�ðX
NnhÞ
!Ol;k;N
Nh
þXd
l;m¼1
q2l;hgðX NNhÞ
Xd
h¼1
Om;h;Nnh;Nh Bh;jðX
NnhÞ
!Gl;k;N1;i ðnh;NhÞ, ð175Þ
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6864
and Gl;k;N1;i ðnh;NhÞ ¼
Pdp¼1 qpBl;iðX
NnhÞO
p;k;Nnh þ
PNhs¼nh DhG
l;k;N1;i ðnh; shÞ where
DhGl;k;N1;i ðnh; shÞ ¼
Xd
p¼1
qpAlðXNshÞhþ
Xd
j¼1
qpBl;jðX shÞDhWjsh
!Gp;k;N1;i ðnh; shÞ
þXd
p;m¼1
q2p;mAlðXNshÞhþ
Xd
j¼1
q2p;mBl;jðXNshÞDhW
jsh
!Op;k;N
sh
�Xd
q¼1
Om;q;Nnh;sh Bq;iðX
NshÞ
!, ð176Þ
and with Gl;k;N2;i;j ðnh;NhÞ ¼
Pdp;r¼1 q
2p;rBl;jðX
NnhÞBr;iðX
NnhÞO
p;k;Nnh þ
Pdp¼1 qpBl;jðX
NnhÞG
p;k;N1;i
ðnh; nhÞ þPNh
s¼nh DhGl;k;N2;i;j ðnh; shÞ where
DhGl;k;N2;i;j ðnh; shÞ ¼
Xd
p¼1
qpAlðXNshÞhþ
Xd
q¼1
qpBl;qðXNshÞDhW
qsh
!Gp;k;N2;i;j ðnh; shÞ
þXd
p;m¼1
q2p;mAlðXNshÞhþ
Xd
q¼1
q2p;mBl;qðXNshÞDhW
qsh
!
�Xd
r¼1
Om;r;Nnh;sh Br;iðX nhÞ
!Gp;k;N1;j ðnh; shÞ
þXd
p;m¼1
q2p;mAlðXNshÞhþ
Xd
q¼1
q2p;mBl;qðXNshÞDhW
qsh
!Op;k;N
sh
�Xd
p¼1
Gm;p;N1;i ðnh; shÞBp;jðX nhÞ þ Om;p;N
nh;sh ½ðqBp;jÞBi�ðXNnhÞ
� �
þXd
p;m;r¼1
q3p;m;rAlðXNshÞhþ
Xd
q¼1
q3p;m;rBl;qðX shÞDhWqsh
!Op;k;N
sh
�Xd
q¼1
Om;q;Nnh;sh Bq;jðX
NnhÞ
! Xd
q¼1
Or;q;Nnh;sh Bq;iðX
NnhÞ
!, ð177Þ
ONðnþ1Þh ¼ ON
nh þ qAðX NnhÞhþ
XN
j¼1
qBjðXNnhÞðDW
jnhÞ
!ON
nh,
X Nðnþ1Þh ¼ X N
nh þ AðX NnhÞhþ
Xd
j¼1
BjðXNnhÞðDW
jnhÞ
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 65
with X N0 ¼ X 0, ON
0 ¼ Id and CN1;0 ¼ CN
2;0 ¼ 0. For the bias-corrected estimator withthe transformation, (43), we have
CN
ðnþ1Þh ¼ CN
nh þ ½qAðXN
nhÞh�CN
nh þ ðqAÞAþXd
j;k;l¼1
ql;kABk;j Bl;j
" #ðX
N
nhÞ
!h
N
þXd
j¼1
qAðXN
nhÞBj
DWjnh
N,
XN
ðnþ1Þh ¼ XN
nh þ AðXN
nhÞhþXd
j¼1
BjðDWjnhÞ (178)
with XN
0 ¼ X 0 and C0 ¼ 0, when the transformation is applied.Additional correction terms for the estimators of conditional expectations of
Riemann integrals are, for the Euler scheme,
CN3;ðnþ1Þh ¼ CN
3;nh þ ðqgÞ AþXd
j¼1
ðqBjÞBj
!" #ðX N
nhÞ þXd
j¼1
½B0jðq2gÞBj�ðX
NnhÞ
!h
N,
(179)
and for the Euler scheme with transformation,
CN
1;ðnþ1Þh ¼ CN
1;nh þ ½ðqgÞðAÞ�ðXN
nhÞ þXd
j¼1
B0
jðq2gðX
N
nhÞÞBj
!h
N. (180)
For the Milshtein scheme the process ~Cðnþ1Þh in the bias-corrected estimator (61) is
~CN
ðnþ1Þh ¼~C
N
nh þ qAð ~XN
nhÞhþXd
j¼1
qBjð ~XN
nhÞDWjnh
" #~C
N
nh
þ qAð ~XN
nhÞDh
~XN
nh
NþXd
j¼1
½ðq½ðqAÞBj�ÞBj�ð ~XN
nhÞh
N
þXd
j¼1
½ðqBjÞA�ð ~XN
nhÞDhW
jnh
N, ð181Þ
~XN
ðnþ1Þh ¼~X
N
nh þ Að ~XN
nhÞhþXd
j¼1
Bjð ~XN
nhÞðDWjnhÞ þ
Xd
l;j¼1
½ðqBlÞBj�ð ~XN
nhÞDFl;j,
(182)
where DFl;j is defined in (53). The additional correction term for second-order bias-corrected estimators of Riemann integrals based on the Milshtein scheme is
~CN
1;ðnþ1Þh ¼~C
N
1;nh þ ðqgÞ AþXd
j¼1
ðqBjÞBj
!" #ð ~X
N
nhÞ þXd
j¼1
½B0jðq2gÞBj�ð ~X
N
nhÞ
!h
N.
(183)
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6866
An alternative is to use the Euler scheme to approximate X in the recursion for ~C.As both ~X
N
� and X N� converge to X � the resulting approximation is first-order
asymptotically equivalent.
References
Ait-Sahalia, Y., 1996. Testing continuous-time models of the spot interest rate. Review of Financial
Studies 9, 385–426.
Ait-Sahalia, Y., 1999. Transition densities for interest rate and other nonlinear diffusions. Journal of
Finance 54, 1361–1395.
Ait-Sahalia, Y., 2002. Maximum likelihood estimation of discretely observed diffusions: a closed form
approximation approach. Econometrica 70, 223–262.
Bally, V., Talay, D., 1996a. The law of the Euler scheme for stochastic differential equations (I):
convergence rate of the distribution function. Probability Theory and Related Fields 104, 43–60.
Bally, V., Talay, D., 1996b. The law of the Euler scheme for stochastic differential equations (II):
convergence rate of the density. Monte Carlo Methods and its Applications 2, 93–128.
Barberis, N., 2000. Investing for the long run when returns are predictable. Journal of Finance 55,
225–264.
Basawa, I.V., Scott, D.J., 1982. Asymptotic optimal inference for non-ergodic models. Lecture Notes in
Statistics, vol. 17. Springer, New York.
Bawa, V.S., Brown, S.J., Klein, R.W., 1979. Estimation Risk and Optimal Portfolio Choice. North-
Holland, New York.
Bibby, B.M., Sørensen, M., 1995. Martingale estimating functions for discretely observed diffusion
processes. Bernoulli 1, 17–39.
Billingsley, P., 1968. Convergence of Probability Measures. Wiley, New York.
Brandt, M., Santa-Clara, P., 2002. Simulated likelihood estimation of diffusions with an application to
exchange rate dynamics in incomplete markets. Journal of Financial Economics 63, 161–210.
Broze, L., Scaillet, O., Zakoian, J.-M., 1998. Quasi-indirect inference for diffusion processes. Econometric
Theory 14, 161–186.
Chamberlain, G., 1987. Asymptotic efficiency in estimation with conditional moment restrictions. Journal
of Econometrics 34, 305–334.
Chamberlain, G., 1992. Efficiency bounds for semiparametric regression. Econometrica 60, 567–596.
Chan, K.C., Karolyi, A., Longstaff, F.A., Saunders, A., 1992. A comparison of alternative models of the
short-term interest rates. Journal of Finance 47, 1209–1227.
Chib, S., Elerian, O., Shephard, N., 2001. Likelihood inference for discretely observed non-linear
diffusions. Econometrica 69, 959–993.
Dacunha-Castelle, D., Florens-Zmirou, D., 1986. Estimation of the coefficients of a diffusion from
discrete observations. Stochastics 19, 263–284.
Detemple, J.B., Garcia, R., Rindisbacher, M., 2003. A Monte Carlo method for optimal portfolios.
Journal of Finance 58, 401–446.
Detemple, J.B., Garcia, R., Rindisbacher, M., 2005. Representation formulas for Malliavin derivatives of
diffusion processes. Finance and Stochastics 9, 349–369.
Doss, H., 1977. Liens entre equations differentielles stochastiques et ordinaires. Annales de l’Institut H.
Poincare 13, 99–125.
Duffie, D., Glynn, P., 1995. Efficient Monte Carlo simulation of security prices. Annals of Applied
Probability 5, 897–905.
Duffie, D., Protter, P., 1992. From discrete to continuous finance: weak convergence of the financial gain
process. Mathematical Finance 2, 1–15.
Duffie, D., Singleton, K., 1993. Simulated moments estimation of Markov models of asset prices.
Econometrica 61, 929–952.
Durham, G.B., Gallant, A.R., 2002. Numerical techniques for maximum likelihood estimation of
continuous-time diffusion processes. Journal of Business and Economic Statistics 20, 297–316.
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–68 67
Eraker, B., 2001. MCMC analysis of diffusion models with application to finance. Journal of Business and
Economic Statistics 19, 177–191.
Florens-Zmirou, D., 1989. Approximate discrete-time schemes for statistics of diffusion processes.
Statistics 20, 547–557.
Florens-Zmirou, D., 1993. On estimating the diffusion coefficient from discrete observations. Journal of
Applied Probability 30, 790–804.
Gaines, J.G., Lyons, T.J., 1997. Variable step size control in the numerical solution of stochastic
differential equations. SIAM Journal of Applied Mathematics 57, 1455–1484.
Gallant, A.R., Tauchen, G., 1996. Which moments to match? Econometric Theory 12, 657–681.
Gallant, A.R., Tauchen, G., 2002. Simulated score methods and indirect inference for continuous-time
models. Handbook of Financial Econometrics, forthcoming.
Genon-Catalot, V., 1990. Maximum contrast estimation for diffusion processes from discrete
observations. Statistics 21, 99–116.
Genon-Catalot, V., Jacod, J., 1993. On the estimation of the diffusion coefficient for multidimensional
diffusion processes. Annales de l’I.H.P., Probabilite et Statistiques 29, 119–151.
Gourieroux, C., Montfort, A., Renault, E., 1993. Indirect inference. Journal of Applied Econometrics 8,
85–118.
Hansen, L., 1985. A method for calculating bounds on the asymptotic covariance matrices of generalized
method of moments estimators. Journal of Econometrics 30, 203–238.
Hansen, L.P., Scheinkman, J.A., 1995. Back to the future: generating moment implications for
continuous-time Markov processes. Econometrica 63, 767–804.
Hansen, L.P., Heaton, J.C., Ogaki, M., 1988. Efficiency bounds implied by multi-period conditional
moment restrictions. Journal of the American Statistical Association 83, 863–871.
Heyde, C.C., 1992. On best asymptotic confidence intervals for parameters of stochastic processes. Annals
of Statistics 20, 603–607.
Jacod, J., Protter, P., 1998. Asymptotic error distributions for the Euler method for stochastic differential
equations. Annals of Probability 26, 267–307.
Jakubowski, A., Memin, J., Pages, G., 1989. Convergence en loi des suites d’integrals stochastiques sur
l’espace D1 de Skorohod. Probability Theory and Related Fields 81, 111–137.
Kallenberg, O., 1997. Foundations of Modern Probability Theory. Springer, New York.
Kessler, M., Sørensen, M., 1999. Estimating equations based on eigenfunctions for a discretely observed
diffusion process. Bernoulli 5 (2), 299–314.
Kloeden, P., Platen, E., 1997. Numerical Solution of Stochastic Differential Equations. Springer, Berlin.
Kurtz, T.G., Protter, P., 1991a. Wong–Zakai corrections, random evolutions and numerical schemes for
SDE’s. In: Stochastic Analysis. Academic Press, New York, pp. 331–346.
Kurtz, T.G., Protter, P., 1991b. Weak limit theorems of stochastic integrals and stochastic differential
equations. Annals of Probability 19, 1035–1070.
Kurtz, T.G., Protter, P., 1996. Weak convergence and stochastic integrals and differential equations, 1995
CIME School in Probability. Lecture Notes in Mathematics, vol. 1627. Springer, Berlin, pp. 1–41.
Lo, A., 1988. Maximum likelihood estimation of generalized Ito processes with discretely-sampled data.
Econometric Theory 4, 231–247.
Meyn, S.P., Tweedie, R.L., 1993. Markov Chains and Stochastic Stability. Communications and Control
Engineering Series. Springer, Berlin, New York.
Milshtein, G.N., 1984. Weak approximation of solutions of systems of stochastic differential equation.
Theory of Probability and its Application 4, 750–768.
Milshtein, G.N., 1995. Numerical Integration of Stochastic Differential Equations. Kluwer Academic
Publishers, New York.
Milshtein, G.N., Schoenmakers, J., Spokoiny, V., 2004. Transition density estimation for stochastic
differential equations via forward-reverse representations. Bernoulli 10, 281–312.
Muller, U.U., Wefelmeyer, W., 2002. Autoregression, estimating functions, and optimality criteria. In
Advances in Statistics, Combinatorics and Related Areas, Selected Papers from the SCRA2001-FIM
VIII Proceedings of the Wollongong Conference. World Scientific, Singapore.
Newey, W.K., 1990. Semiparametric efficiency bounds. Journal of Applied Econometrics 5, 90–135.
ARTICLE IN PRESS
J. Detemple et al. / Journal of Econometrics 134 (2006) 1–6868
Nualart, D., 1995. The Malliavin Calculus and Related Topics. Springer, New York.
Pedersen, A.R., 1995a. A new approach to maximum likelihood estimation for stochastic differential
equations based on discrete observations. Scandinavian Journal of Statistics 22, 55–71.
Pedersen, A.R., 1995b. Consistency and asymptotic normality of an approximate maximum likelihood
estimator for discretely observed diffusion processes. Bernoulli 1, 257–279.
Santa-Clara, P., 1995. Simulated likelihood estimation of diffusions with and application to the short term
interest rate. Ph.D. Dissertation, INSEAD.
Smith, A.A., Jr. 1990. Three Essays on the Solution and estimation of dynamic macroeconomic models.
Ph.D. Thesis, Duke University.
Talay, D., 1984. Efficient numerical schemes for the approximation of expectations of functionals of
S.D.E. In: Korezioglu, H., Maziotto, G., Szpirglas, J. (Eds.), Filtering and Control of Random
Processes, Lecture Notes in Control and Information Sciences, vol. 61, Springer, New York, pp.
294–313.
Talay, D., 1986. Discretisation d’une equation differentielle stochastique et calcul approche d’esperance de
fonctionelles de la solution. Mathematical Modelling and Numerical Analysis 0, 141–179.
Talay, D., 1996. Probabilistic numerical methods for partial differential equations: elements of analysis.
CIME School in Probability, Lecture Notes in Mathematics, vol. 1627. Springer, Berlin, pp. 149–196.
Talay, D., Tubaro, L., 1990. Expansion of the global error for numerical schemes solving stochastic
differential equations. Stochastic Analysis and its Application 8, 483–509.
Wefelmeyer, W., 1996. Quasi-likelihood models and optimal inference. Annals of Statistics 24, 405–422.
Wong, E., Zakai, M., 1964. On the convergence of ordinary integrals to stochastic integrals. Annals of
Mathematical Statistics 36, 1560–1564.