
Convergence Rates for Adaptive Weak Approximation of Stochastic Differential Equations



Stochastic Analysis and Applications, 23: 511–558, 2005. Copyright © Taylor & Francis, Inc. ISSN 0736-2994 print / 1532-9356 online. DOI: 10.1081/SAP-200056678


Kyoung-Sook Moon
Department of Mathematics, University of Maryland, College Park, Maryland, USA

Anders Szepessy
Matematiska Institutionen, Kungl. Tekniska Högskolan, Stockholm, Sweden

Raúl Tempone
ICES, The University of Texas at Austin, Austin, Texas, USA and IMERL, Facultad de Ingeniería, Montevideo, Uruguay

Georgios E. Zouraris
Department of Mathematics, University of Crete, Heraklion, Greece and IACM, FORTH, Heraklion, Greece

Abstract: Convergence rates of adaptive algorithms for weak approximations of Itô stochastic differential equations are proved for the Monte Carlo Euler method. Two algorithms, based either on optimal stochastic time steps or optimal deterministic time steps, are studied. The analysis of their computational complexity combines the error expansions with a posteriori leading order term introduced in Szepessy et al. [Szepessy, A., R. Tempone, and G. Zouraris. 2001. Comm. Pure Appl. Math. 54:1169–1214] and an extension of the convergence results for adaptive algorithms approximating deterministic ordinary differential equations, derived in Moon et al. [Moon, K.-S., A. Szepessy, R. Tempone, and G. Zouraris. 2003. Numer. Math. 93:99–129]. The main step in the extension is the proof of the almost sure convergence of the error density. Both adaptive algorithms are proven to stop with an asymptotically optimal number of steps, up to a problem independent factor defined in the algorithm. Numerical examples illustrate the behavior of the adaptive algorithms, motivating when stochastic and deterministic adaptive time steps are more efficient than constant time steps and when adaptive stochastic steps are more efficient than adaptive deterministic steps.

Received September 30, 2003; Accepted May 4, 2004

Address correspondence to Anders Szepessy, Department of Mathematics, Royal Institute of Technology, Stockholm 10044, Sweden; E-mail: [email protected]

Keywords: Adaptive mesh refinement algorithm; Almost sure convergence; Computational complexity; Monte Carlo method; Stochastic differential equations.

Mathematics Subject Classification: 65C30; 65Y20; 65L50; 60H35.

1. INTRODUCTION TO ADAPTIVE ALGORITHMS FOR SDEs

This work derives convergence rates of adaptive algorithms for weak approximation of Itô stochastic differential equations (SDEs)

$$dX_k(t) = a_k(t, X(t))\,dt + \sum_{\ell=1}^{\ell_0} b^\ell_k(t, X(t))\,dW^\ell(t), \quad t > 0, \tag{1}$$

where $k = 1, 2, \ldots, d$ and $X(t; \omega)$ is a stochastic process in $\mathbb{R}^d$, with randomness generated by the independent one-dimensional Wiener processes $W^\ell(t; \omega)$, $\ell = 1, 2, \ldots, \ell_0$, on the probability space $(\Omega, \mathcal{F}, P)$; cf. Karatzas and Shreve [24] and Øksendal [36]. The functions $a(t, x) \in \mathbb{R}^d$ and $b^\ell(t, x) \in \mathbb{R}^d$, $\ell = 1, \ldots, \ell_0$, are given drift and diffusion fluxes.

The goal is to construct an approximation of an expected value $E[g(X(T))]$ by the Monte Carlo method, for a given function $g: \mathbb{R}^d \to \mathbb{R}$. A topical example of such an expected value is the computation of option prices in mathematical finance; cf. Jouini et al. [23] and Glasserman [17]. Other related models based on stochastic dynamics are used, e.g., for stochastic climate prediction and for wave propagation in random media; cf. Majda et al. [26] and Abdullaev et al. [1]. The Monte Carlo Euler method approximates the unknown process $X$ by the Euler approximation $\overline{X}(t_n)$ (cf. Kloeden and Platen [25] and Milstein [27]), which is a time discretization based on the nodes $0 = t_1 < t_2 < \cdots < t_{N+1} = T$, where

$$\overline{X}(t_{n+1}) - \overline{X}(t_n) = \Delta t_n\, a(t_n, \overline{X}(t_n)) + \sum_{\ell=1}^{\ell_0} \Delta W^\ell_n\, b^\ell(t_n, \overline{X}(t_n)), \tag{2}$$

with time increments $\Delta t_n \equiv t_{n+1} - t_n$ and Wiener increments $\Delta W^\ell_n \equiv W^\ell(t_{n+1}) - W^\ell(t_n)$ for $n = 1, 2, \ldots, N$ and $\ell = 1, 2, \ldots, \ell_0$. The aim of the adaptive algorithm is to choose the size of the time steps, $\Delta t_n$,


and the number of independent identically distributed samples $\overline{X}(\cdot; \omega_j)$, $j = 1, 2, \ldots, M$, such that the computational work, $N \cdot M$, is minimal while the approximation error is asymptotically bounded by a given error tolerance, TOL, i.e., the event

$$\left| E[g(X(T))] - \frac{1}{M} \sum_{j=1}^{M} g\big(\overline{X}(T; \omega_j)\big) \right| \le \mathrm{TOL} \tag{3}$$

has a probability close to one. Stopped diffusion is a good example where adaptive time steps improve the convergence rate; see Buchmann and Petersen [9] and Moon et al. [14]. A priori error estimates of the time discretization error in (3) were first derived by Talay and Tubaro [41]. The work of Szepessy et al. [39] modified Talay's and Tubaro's error expansion to an expansion with computable leading order term in a posteriori form, based on computable stochastic flows and discrete dual backward problems.
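For orientation, the following is a minimal sketch of the non-adaptive baseline that the adaptive algorithms of this paper improve upon: the Monte Carlo Euler method (2)–(3) with constant time steps, for a scalar SDE. The drift a, diffusion b, payoff g, and all parameter values below are illustrative assumptions, not data from the paper.

import numpy as np

def euler_mc(a, b, g, X0, T, N, M, rng):
    # Monte Carlo Euler (2)-(3) with constant steps: returns the sample
    # average of g(X(T)) over M independent Euler paths with N steps each.
    dt = T / N
    X = np.full(M, X0, dtype=float)
    t = 0.0
    for n in range(N):
        dW = rng.normal(0.0, np.sqrt(dt), size=M)   # Wiener increments
        X = X + a(t, X) * dt + b(t, X) * dW         # forward Euler step (2)
        t += dt
    return np.mean(g(X))

# Illustrative data: geometric Brownian motion, where E[X(T)] is known exactly.
rng = np.random.default_rng(0)
approx = euler_mc(lambda t, x: 0.05 * x, lambda t, x: 0.2 * x,
                  lambda x: x, X0=1.0, T=1.0, N=100, M=10000, rng=rng)
print(approx, "vs exact", np.exp(0.05))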

Here we derive convergence rates of two algorithms with either stochastic or deterministic time steps. The difference between the two algorithms is that the stochastic time steps may use different meshes for each realization, based on successive Brownian bridge sampling of a Brownian motion realization, while the deterministic time steps use the same mesh for all realizations of the Brownian motion $W = (W^1, \ldots, W^{\ell_0})$. The construction and the analysis of the adaptive algorithms are inspired by the related work of Moon et al. [32], on adaptive algorithms for deterministic ordinary differential equations, and the error estimates from Szepessy et al. [39].

There are numerous adaptive algorithms for ordinary and partial differential equations, cf. Ainsworth and Oden [2], Babuska and Rheinboldt [3], Babuska and Vogelius [4], and Hairer et al. [21], and some are, as here, based on dual problems, cf. Becker and Rannacher [5, 6], Eriksson et al. [15], Johnson and Szepessy [22], and Moon et al. [30], but the theoretical understanding of convergence rates of adaptive algorithms is not as well developed; there are, however, recent important contributions. The works of Hofmann et al. [19, 20] and Müller-Gronbach [34] prove optimal convergence rates for strong approximation of stochastic differential equations. DeVore studies in [12] the efficiency of adaptive approximation of functions, including wavelet expansions, based on smoothness conditions in Besov spaces. Inspired by these approximation results, Cohen et al. prove in [10] that a wavelet-based adaptive $N$-term approximation algorithm produces a solution with optimal error $\mathcal{O}(N^{-s})$ in the energy norm for linear coercive elliptic problems; see also Dahmen [11]. Based on Morin et al. [33], the works of Binev et al. [7] and Stevenson [38] extend the analysis of Cohen et al. [10] to include finite element approximation. The work of Moon et al. [32] connects DeVore's smoothness conditions to error densities for adaptive


approximation of ordinary differential equations. In particular, Moon et al. [32] construct an algorithm and prove that it stops with the optimal number of time steps, up to a problem independent factor defined in the algorithm; for any $p$th order accurate method, the optimal number of adaptive steps is proportional to the $p$th root of the $L^{1/(p+1)}$ quasi-norm of the error density, while the number of constant steps, with the same error, is proportional to the $p$th root of the larger $L^1$-norm of the error density. This work generalizes Moon et al. [32] to weak approximation of stochastic differential equations.

There are two main results on the efficiency and accuracy of the adaptive algorithms described in section 3. In view of accuracy, with probability close to one, the approximation errors in (3) are asymptotically bounded by the specified error tolerance times a problem independent factor as the tolerance parameter tends to zero. In view of efficiency, the algorithms with stochastic steps and deterministic steps stop with an asymptotically optimal expected number of final time steps and an optimal number of final time steps, respectively, up to a problem independent factor. The number of final time steps is related to the numerical effort needed to compute the approximation. To be more precise, the total work for deterministic steps is roughly $M \cdot N$, where $M$ is the final number of realizations and $N$ is the final number of time steps, since the work to determine the mesh turns out to be negligible. On the other hand, the total work with stochastic steps is on average bounded by $M \cdot E[N_{\mathrm{tot}}]$, where the total number, $N_{\mathrm{tot}}$, of steps including all refinement levels is bounded by $\mathcal{O}(N \log N)$ with $N$ steps in the final refinement; for each realization, it is necessary to determine the mesh, which may be different for each realization.

The computation of the stochastic flows up to third order requires the solution of linear dual stochastic differential equations in dimension $d^3$ and the storage of a realization of $\overline{X}$ and $W$ for all time levels. This additional work is clearly a drawback, especially for high-dimensional problems with large $d$. On the other hand, it is sometimes possible to use a sparse matrix structure so that the additional work to compute the stochastic flows is low compared to the work to compute $\overline{X}$; see Björk et al. [8] for an application to high-dimensional HJM term structure models. The additional storage of one realization of $\overline{X}$ and $W$ is clearly also a drawback, but many computer programs for differential equations store the solution at all time levels for other reasons, e.g., post-processing. At the expense of losing optimal mesh adaptivity, the additional work and storage (of $\overline{X}$) can be removed by setting the dual weights to one, which corresponds to using only the local error as the indicator for the adaptive refinements.

The accuracy and efficiency results are based on the fact that the error density, $\rho$, which measures the approximation error for each interval following (4), converges almost surely (a.s.) as the error tolerance tends to zero. This convergence can be understood by the


a.s. convergence of the approximate solution, $\overline{X}$, as the maximal step size tends to zero. Although the time steps are not adapted to the standard filtration generated only by $W$ for the stochastic time stepping algorithm, the work of Szepessy et al. [39] proved that the corresponding approximate solution converges to the correct adapted solution $X$. This result makes it possible to prove the martingale property of the approximate error term with respect to a specific filtration including $W$ and the time levels, see Lemma 4.2. Therefore, Theorems 4.1 and 4.4 use Doob's inequality to prove the a.s. convergence of $\overline{X}$. Similar results on pointwise convergence with constant step sizes, adapted to the standard filtration, are surveyed by Talay [40].

The outline of the paper is as follows: section 2 states the a posteriori error expansion, proved in Szepessy et al. [39] and used in the adaptive algorithms; section 3 describes and analyzes the adaptive algorithms with stochastic time steps and deterministic time steps; section 4 proves a.s. convergence of the error density; and finally section 5 presents numerical experiments based on the adaptive algorithms.

For simplicity, we introduce the following notation

$$d_{ij} \equiv \frac{1}{2}\, b^\ell_i b^\ell_j, \qquad \partial_k \equiv \frac{\partial}{\partial x_k}, \qquad \partial_{ki} \equiv \frac{\partial^2}{\partial x_k \partial x_i}, \quad \ldots$$

with the summation convention, i.e., if the same subscript appears twice in a term, the term denotes the sum over the range of this subscript, e.g.,

$$c_{ik}\,\partial_k b_j \equiv \sum_{k=1}^{d} c_{ik}\,\partial_k b_j.$$

For a derivative $\partial_\alpha$, the notation $|\alpha|$ is its order.

2. A POSTERIORI ERROR EXPANSION

The main result of Szepessy et al. [39] is a new expansion of the computational error with computable leading order term in a posteriori form. The result was inspired by a corresponding a priori analysis derived in Talay and Tubaro [41], with the main difference that the weight for the local error contribution to the global error can be computed efficiently by stochastic flows and discrete dual backward problems, extending Moon et al. [31] to SDEs. These a posteriori error expansions can be used in adaptive algorithms in order to control the approximation error. Although Szepessy et al. [39] proposed adaptive algorithms, the main focus in that work was on error estimates. Properties regarding the stopping, efficiency, and accuracy of the adaptive algorithms, following the ideas in Moon et al. [32], are first studied here. Assume that the process $X$ satisfies (1) and its approximation, $\overline{X}$, is given by (2); then the error expansions in


Theorems 1.2 and 2.2 of Szepessy et al. [39] have the form

$$E[g(X(T)) - g(\overline{X}(T))] = E\left[\sum_{n=1}^{N} \varrho_n\, \Delta t_n^2\right] = E\left[\sum_{n=1}^{N} \bar\rho_n\, \Delta t_n^2\right] + \text{higher order terms}, \tag{4}$$

where $\bar\rho_n\, \Delta t_n^2$ are computable error indicators, i.e., they provide information for further improvement of the time mesh, and $\bar\rho_n$ measures the density of the global error in (4). A typical adaptive algorithm does two things iteratively:

(I) If the error indicators satisfy an accuracy condition then it stops;otherwise

(II) The algorithm chooses where to refine the mesh and then makes an iterative step to (I).

In addition to estimating the global error $E[g(X(T)) - g(\overline{X}(T))]$ in the sense of (4), the indicators $\bar\rho_n\, \Delta t_n^2$ also give simple information on where to refine to reach an optimal mesh, based on the almost sure convergence of the density $\bar\rho_n$ as we refine the discretization; see section 4.

In the remaining part of this section, we recall one error expansion from Szepessy et al. [39], which can be used with either stochastic or deterministic time steps. The work of Szepessy et al. [39] also proves another error expansion, which requires less computational work per realization, in particular for large $d$. However, this expansion is only valid with deterministic time steps and it has a larger statistical error. Although adaptive algorithms based on this second expansion work well in practice, the larger statistical error makes it difficult to analyze them precisely. Therefore, this work focuses on the first error expansion.

The following Lemma 2.1 and Theorem 2.2, derived in Szepessy et al. [39], describe the error expansion that is used in the adaptive algorithms in section 3. Assume that for all times $t \in [t_n, t_{n+1})$ and all outcomes $\omega$, the time steps $\Delta t(t) = \Delta t_n$ are constructed by the refinement criterion

$$\Delta t(t) = T\, 2^{-m} \text{ for some positive integer } m = m(t, \omega), \qquad |\rho(t, \omega)|\, (\Delta t(t))^2 < \text{constant}, \tag{5}$$

with an approximate error density function, $\rho(t, \omega) = \rho_n(\omega)$, satisfying, for $t \in [t_n, t_{n+1})$ and $s \in [0, T]$ and all outcomes $\omega$, the uniform upper and lower bounds

$$c(\mathrm{TOL}) \le |\rho(s, \omega)| \le C(\mathrm{TOL}), \qquad |\partial_{W(t)}\rho(s, \omega)| \le C(\mathrm{TOL}), \tag{6}$$


for some positive functions $c$ and $C$, with $\mathrm{TOL}/c(\mathrm{TOL}) \to 0$ as $\mathrm{TOL} \to 0$. Here $\partial_{W(t)} Y$ denotes the Malliavin derivative, which is the first variation of a process $Y$ with respect to a perturbation $dW(t)$ at time $t$ of the Wiener process; cf. Nualart [35] and Szepessy et al. [39].

Lemma 2.1. Suppose there are positive constants $k$ and $C$ and an integer $m_0$ with the bounds

$$g \in C^{m_0}_{\mathrm{loc}}(\mathbb{R}^d), \qquad |\partial_\alpha g(x)| \le C\,(1 + |x|^k) \text{ for all } |\alpha| \le m_0,$$

$$E\big[|X(0)|^{2k + d + 1}\big] + E\big[|\overline{X}(0)|^{2k + d + 1}\big] \le C,$$

and

$$a \text{ and } b^\ell \text{ are bounded in } C^{m_0}([0, T] \times \mathbb{R}^d).$$

Assume that $\overline{X}$ is constructed by the forward Euler method with step sizes $\Delta t_n$ satisfying (5) and (6), and that the corresponding $\Delta W_n \equiv W(t_{n+1}) - W(t_n)$ are generated by Brownian bridges, based on the stochastic time step algorithm in section 3. Assume also that $\overline{X}(0) = X(0)$. Then there exists a sufficiently large integer $m_0$ such that

$$\sup_{t \in [0, T]} \sqrt{E\big[|X(t) - \overline{X}(t)|^2\big]} = \mathcal{O}\Big(\sqrt{\Delta t_{\sup}}\Big) = \mathcal{O}\left(\sqrt{\frac{\mathrm{TOL}}{c(\mathrm{TOL})}}\right) \to 0, \tag{7}$$

as $\mathrm{TOL} \to 0$, where $\Delta t_{\sup} \equiv \sup_{n, \omega} \Delta t_n(\omega)$.

Theorem 2.2. Suppose that $a$, $b$, $g$, and $\overline{X}$ satisfy the assumptions in Lemma 2.1 and that $E[|X(0)|^{k_0}] \le C$ for some $k_0 \ge 16$. Then the time discretization error has the expansion

$$E[g(X(T)) - g(\overline{X}(T))] = E\left[\sum_{n=1}^{N} \bar\rho(t_n, \overline{X})\,(\Delta t_n)^2\right] + \mathcal{O}\left(\sqrt{\frac{\mathrm{TOL}}{c(\mathrm{TOL})}}\left(\frac{C(\mathrm{TOL})}{c(\mathrm{TOL})}\right)^{8/k_0}\right) E\left[\sum_{n=1}^{N} (\Delta t_n)^2\right] \tag{8}$$

with computable leading order terms, where

$$\begin{aligned} \bar\rho(t_n, \overline{X}) \equiv{}& \frac{1}{2}\Big(\partial_t a_k + \partial_j a_k\, a_j + \partial_{ij} a_k\, d_{ij}\Big)\,\varphi_k(t_{n+1}) \\ &+ \frac{1}{2}\Big(\partial_t d_{km} + \partial_j d_{km}\, a_j + \partial_{ij} d_{km}\, d_{ij} + 2\,\partial_j a_k\, d_{jm}\Big)\,\varphi'_{km}(t_{n+1}) \\ &+ \Big(\partial_j d_{km}\, d_{jr}\Big)\,\varphi''_{kmr}(t_{n+1}), \end{aligned} \tag{9}$$


and the terms in the sum of (9) are evaluated at the a posteriori known points $(t_n, \overline{X}(t_n))$, i.e.,

$$\partial_\alpha a \equiv \partial_\alpha a(t_n, \overline{X}(t_n)), \qquad \partial_\alpha b \equiv \partial_\alpha b(t_n, \overline{X}(t_n)), \qquad \partial_\alpha d \equiv \partial_\alpha d(t_n, \overline{X}(t_n)).$$

Here, $\varphi \in \mathbb{R}^d$ is the solution of the discrete dual backward problem

$$\varphi_i(t_n) = \partial_i c_j(t_n, \overline{X}(t_n))\,\varphi_j(t_{n+1}), \quad t_n < T, \qquad \varphi_i(T) = \partial_i g(\overline{X}(T)), \tag{10}$$

with

$$c_i(t_n, x) \equiv x_i + \Delta t_n\, a_i(t_n, x) + \Delta W^\ell_n\, b^\ell_i(t_n, x) \tag{11}$$

and its first and second variations

$$\varphi'_{ij}(t_n) \equiv \partial_{x_j(t_n)} \varphi_i(t_n) \equiv \frac{\partial \varphi_i\big(t_n;\, \overline{X}(t_n) = x\big)}{\partial x_j}, \tag{12}$$

$$\varphi''_{ikm}(t_n) \equiv \partial_{x_m(t_n)} \varphi'_{ik}(t_n) \equiv \frac{\partial \varphi'_{ik}\big(t_n;\, \overline{X}(t_n) = x\big)}{\partial x_m}, \tag{13}$$

which satisfy

$$\begin{aligned} \varphi'_{ik}(t_n) &= \partial_i c_j(t_n, \overline{X}(t_n))\,\partial_k c_p(t_n, \overline{X}(t_n))\,\varphi'_{jp}(t_{n+1}) + \partial_{ik} c_j(t_n, \overline{X}(t_n))\,\varphi_j(t_{n+1}), \quad t_n < T, \\ \varphi'_{ik}(T) &= \partial_{ik} g(\overline{X}(T)), \end{aligned} \tag{14}$$

and

$$\begin{aligned} \varphi''_{ikm}(t_n) ={}& \partial_i c_j\,\partial_k c_p\,\partial_m c_r\,\varphi''_{jpr}(t_{n+1}) + \partial_{im} c_j\,\partial_k c_p\,\varphi'_{jp}(t_{n+1}) + \partial_i c_j\,\partial_{km} c_p\,\varphi'_{jp}(t_{n+1}) \\ &+ \partial_{ik} c_j\,\partial_m c_p\,\varphi'_{jp}(t_{n+1}) + \partial_{ikm} c_j\,\varphi_j(t_{n+1}), \quad t_n < T, \\ \varphi''_{ikm}(T) ={}& \partial_{ikm} g(\overline{X}(T)), \end{aligned} \tag{15}$$

respectively, where in (15) all derivatives of $c$ are evaluated at $(t_n, \overline{X}(t_n))$.
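To make the backward recursions (10)–(15) concrete, the following is a minimal sketch, under the simplifying assumption d = 1 and ℓ₀ = 1, of the first dual φ in (10), computed by a single backward sweep along one stored Euler path. The derivative functions da, db, and dg are illustrative names for ∂ₓa, ∂ₓb, and g′, supplied by the user; the second variations φ′ and φ″ of (14)–(15) follow by analogous backward sweeps.

import numpy as np

def dual_phi(t, X, dW, da, db, dg):
    # Discrete dual backward problem (10) for d = 1:
    # phi[n] = (1 + dt_n*da + dW_n*db) * phi[n+1],  phi[N] = g'(X(T)),
    # where 1 + dt_n*da + dW_n*db is the derivative of c in (11).
    N = len(t) - 1                  # number of steps; len(X) == N + 1
    phi = np.empty(N + 1)
    phi[N] = dg(X[N])               # terminal condition phi(T) = g'(X(T))
    for n in range(N - 1, -1, -1):
        dt = t[n + 1] - t[n]
        dc = 1.0 + dt * da(t[n], X[n]) + dW[n] * db(t[n], X[n])
        phi[n] = dc * phi[n + 1]
    return phi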

The previous result can also be applied directly to the particular case of deterministic time steps. The deterministic time stepping algorithm uses the sample average of $\bar\rho(t_n, \overline{X})$ in (9) to approximate the expected


value of the error density in (8), $E\big[\sum_n \bar\rho_n\,(\Delta t_n)^2\big] = \sum_n E[\bar\rho_n]\,(\Delta t_n)^2$; the error density becomes

$$\rho(t_n, \overline{X}) \equiv \frac{1}{M_T} \sum_{j=1}^{M_T} \bar\rho(t_n, \overline{X}; \omega_j) \approx E[\bar\rho(t_n, \overline{X})], \tag{16}$$

where the variance $\mathrm{Var}[\rho(t_n, \overline{X})] = \mathcal{O}(M_T^{-1})$ tends to zero as $\mathrm{TOL} \to 0+$.

3. ADAPTIVE ALGORITHMS FOR SDEs

This section presents two adaptive time stepping algorithms and analyzes their basic properties for the weak approximation of Itô SDEs (1). These adaptive algorithms choose, adaptively, the number of realizations and the time steps, to asymptotically bound the approximation error by a given error tolerance.

The computational error in (3) naturally separates into the time discretization error and the statistical error

$$E[g(X(T))] - \frac{1}{M}\sum_{j=1}^{M} g\big(\overline{X}(T; \omega_j)\big) = \Big(E\big[g(X(T)) - g(\overline{X}(T))\big]\Big) + \left(E[g(\overline{X}(T))] - \frac{1}{M}\sum_{j=1}^{M} g\big(\overline{X}(T; \omega_j)\big)\right) \equiv \mathcal{E}_T + \mathcal{E}_S. \tag{17}$$

The time steps $\Delta t$ are determined from the time discretization error $\mathcal{E}_T$, and the number $M$ of realizations is determined from the statistical error $\mathcal{E}_S$. The statistical error and the time discretization error are combined in order to bound the computational error (17). Therefore, we split a given error tolerance TOL into a statistical tolerance $\mathrm{TOL}_S$ and a time discretization tolerance $\mathrm{TOL}_T$. The computational work is roughly $\mathcal{O}(N \cdot M) = \mathcal{O}(\mathrm{TOL}_T^{-1}\, \mathrm{TOL}_S^{-2})$; therefore, we use

$$\mathrm{TOL}_T = \frac{1}{3}\,\mathrm{TOL} \quad \text{and} \quad \mathrm{TOL}_S = \frac{2}{3}\,\mathrm{TOL}, \tag{18}$$

by minimizing $\mathrm{TOL}_T^{-1}\, \mathrm{TOL}_S^{-2}$ under the constraint $\mathrm{TOL}_T + \mathrm{TOL}_S = \mathrm{TOL}$.
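The split (18) follows from a one-line minimization; a sketch of the computation, using only the constraint stated above:

$$\frac{d}{d\,\mathrm{TOL}_T}\Big[\mathrm{TOL}_T^{-1}\,(\mathrm{TOL} - \mathrm{TOL}_T)^{-2}\Big] = -\mathrm{TOL}_T^{-2}(\mathrm{TOL} - \mathrm{TOL}_T)^{-2} + 2\,\mathrm{TOL}_T^{-1}(\mathrm{TOL} - \mathrm{TOL}_T)^{-3} = 0 \iff \mathrm{TOL} - \mathrm{TOL}_T = 2\,\mathrm{TOL}_T,$$

i.e., $\mathrm{TOL}_T = \mathrm{TOL}/3$ and $\mathrm{TOL}_S = \mathrm{TOL} - \mathrm{TOL}_T = 2\,\mathrm{TOL}/3$.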

3.1. Control of the Statistical Error

For $M$ independent samples $\{Y(\omega_j)\}_{j=1}^{M}$ of a random variable $Y$, with $E[|Y|^6] < \infty$, define the sample average $\mathcal{A}(Y; M)$ and the sample standard deviation $\mathcal{S}(Y; M)$ of $Y$ by

$$\mathcal{A}(Y; M) \equiv \frac{1}{M}\sum_{j=1}^{M} Y(\omega_j) \quad \text{and} \quad \mathcal{S}(Y; M) \equiv \Big(\mathcal{A}(Y^2; M) - \big(\mathcal{A}(Y; M)\big)^2\Big)^{1/2}. \tag{19}$$

Let $\sigma_Y \equiv \big(E[|Y - E[Y]|^2]\big)^{1/2}$ and consider the random variable

$$Z_M \equiv \frac{\sqrt{M}}{\sigma_Y}\big(\mathcal{A}(Y; M) - E[Y]\big)$$

with cumulative distribution function $F_{Z_M}(x) \equiv P(Z_M \le x)$, $x \in \mathbb{R}$. Let $\beta \equiv \big(E[|Y - E[Y]|^3]\big)^{1/3}/\sigma_Y < \infty$; then the Berry-Esseen theorem, cf. Durrett [13], gives the following estimate in the central limit theorem,

$$\sup_{x \in \mathbb{R}} |F_{Z_M}(x) - \Phi(x)| \le \frac{3\beta^3}{\sqrt{M}},$$

for the rate of convergence of $F_{Z_M}$ to the distribution function $\Phi$ of a normal random variable with mean zero and variance one, i.e.,

$$\Phi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} \exp\left(-\frac{s^2}{2}\right) ds. \tag{20}$$

Since in the examples presented below $M$ is sufficiently large, i.e., $M \gg 36\beta^6$, the statistical error

$$\mathcal{E}_S(Y; M) \equiv E[Y] - \mathcal{A}(Y; M)$$

satisfies, by the Berry-Esseen theorem, the probability approximation

$$P\left(|\mathcal{E}_S(Y; M)| \le c_0\, \frac{\sigma_Y}{\sqrt{M}}\right) \approx 2\Phi(c_0) - 1$$

for all $c_0 > 0$. In practice, choose some constant $c_0 \ge 1.65$, so that the normal distribution satisfies $1 > 2\Phi(c_0) - 1 \ge 0.901$ and the event

$$|\mathcal{E}_S(Y; M)| \le E_S(Y; M) \equiv c_0\, \frac{\mathcal{S}(Y; M)}{\sqrt{M}} \tag{21}$$

has probability close to one, which involves the additional step of approximating $\sigma_Y$ by $\mathcal{S}(Y; M)$, cf. Fishman [16]. Thus, in the computations, $E_S(Y; M)$ is a good approximation of the statistical error $\mathcal{E}_S(Y; M)$. For a given $\mathrm{TOL}_S > 0$, the goal is to find $M$ such that $E_S(Y; M) \le \mathrm{TOL}_S$.
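A minimal sketch of the estimates (19) and (21) in code; the sample array Y and its size are illustrative assumptions:

import numpy as np

def stat_error_bound(Y, c0=1.65):
    # Sample average and a posteriori bound E_S(Y; M) = c0*S(Y; M)/sqrt(M),
    # following (19) and (21).
    M = len(Y)
    A = np.mean(Y)                                # sample average (19)
    S = np.sqrt(max(np.mean(Y**2) - A**2, 0.0))   # sample standard deviation (19)
    return A, c0 * S / np.sqrt(M)

Y = np.random.default_rng(1).normal(1.0, 0.3, size=4096)
A, ES = stat_error_bound(Y)
print(A, "+-", ES)  # |E[Y] - A| <= ES with probability ~ 2*Phi(c0) - 1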


3.2. Control of the Time Discretization Error

For given time nodes $0 = t_1 < \cdots < t_{N+1} = T$, let the piecewise constant mesh function $\Delta t$ be determined by

$$\Delta t(s) \equiv \Delta t_n \quad \text{for } s \in [t_n, t_{n+1}) \text{ and } n = 1, \ldots, N.$$

Then the number of time steps that corresponds to a mesh $\Delta t$, for the interval $[0, T]$, is given by

$$N(\Delta t) \equiv \int_0^T \frac{1}{\Delta t(s)}\, ds. \tag{22}$$

Consider, for $s \in [t_n, t_{n+1})$ and $n = 1, \ldots, N$, the piecewise constant function

$$\rho(s) \equiv \mathrm{sign}(\bar\rho_n)\, \min\big(\max(|\bar\rho_n|, \delta),\, \Delta\big), \tag{23}$$

where

$$\delta = \mathrm{TOL}^{\bar\alpha}, \quad 0 < \bar\alpha < \frac{\gamma}{\gamma + 2}, \quad 0 < \gamma < \frac{1}{2}, \qquad \Delta = \mathrm{TOL}^{-r}, \quad r > 0. \tag{24}$$

The function $\rho$ measures the density of the time discretization error, where $\bar\rho_n = \bar\rho(t_n, \overline{X})$ is defined by (9) for the stochastic time stepping algorithm, or $\bar\rho_n = \rho(t_n, \overline{X})$ from (16) for the deterministic time stepping algorithm. Here, the function sign denotes $\mathrm{sign}(x) = 1$ for $x \ge 0$ and $\mathrm{sign}(x) = -1$ for $x < 0$. We use the positive parameters $\bar\alpha$, $\gamma$, and $r$ in order to guarantee that $\Delta t_{\sup} \to 0$ as $\mathrm{TOL} \to 0$ and to have the bound for the error density in (36); see Lemma 3.3. From now on, with a slight abuse of notation, $\rho(t_n) = \rho_n$ denotes the modified density (23).

Following the error expansion in Theorem 2.2, the time discretization error is approximated by

$$|\mathcal{E}_T| = \big|E[g(X(T)) - g(\overline{X}(T))]\big| \lesssim E\left[\sum_{n=1}^{N} r_n\right], \tag{25}$$

using the error indicator, $r_n$, defined by

$$r_n \equiv |\rho(t_n)|\,\Delta t_n^2, \tag{26}$$

with the modified error density defined by (23). To motivate the adaptivity procedure for the time partition, let us now formulate


an optimal choice of the time steps by minimizing the expected computational work subject to the accuracy constraint

$$E\left[\sum_{n=1}^{N} r_n\right] \le \mathrm{TOL}_T. \tag{27}$$

More precisely, solve

$$\min E[N(\Delta t)] \quad \text{such that } \Delta t \in \mathcal{K} \text{ and } E\left[\sum_{n=1}^{N(\Delta t)} r_n\right] \le \mathrm{TOL}_T, \tag{28}$$

where $\mathcal{K}$ is the feasible set for the mesh function $\Delta t$ and $N(\Delta t)$ is the corresponding number of time steps. The optimal choice of time steps in $\mathcal{K}$ is based on the given density $\rho_n(\omega)$, which is piecewise constant on the mesh $\Delta t[k]$, $k = 1, 2, \ldots$. The choice of $\mathcal{K}$ determines either deterministic time steps or stochastic time steps. For example, if we let

$$\mathcal{K} \equiv \Big\{\Delta t \in L^2_{dt}[0, T]:\ \Delta t \text{ is deterministic, positive, and piecewise constant on } \Delta t[k]\Big\},$$

then the objective function in (28) becomes deterministic, and a standard application of a Lagrange multiplier shows that the minimizer of the problem (28) satisfies

$$E[r_n] = \text{constant, for all time steps } n, \tag{29}$$

which sets the basis for the refinement procedure with deterministic time steps. On the other hand, letting

$$\mathcal{K} \equiv \Big\{\Delta t \in L^2_{dt \times P}([0, T] \times \Omega):\ \Delta t \text{ is stochastic, positive, and piecewise constant on } \Delta t[k](\omega)\Big\}$$

leads to

$$r_n(\omega) = \text{constant, for all time steps } n \text{ and for all realizations } \omega, \tag{30}$$

which sets the basis for the refinement procedure with stochastic time steps.
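The Lagrange multiplier computation behind (29) and (30) can be sketched as follows. Writing the work as $\int_0^T \Delta t^{-1}\, ds$ and the accuracy constraint as $\int_0^T |\rho|\,\Delta t\, ds = \mathrm{TOL}_T$ (since $\sum_n r_n = \sum_n |\rho_n|\,\Delta t_n^2 = \int_0^T |\rho|\,\Delta t\, ds$), the pointwise stationarity condition for the stochastic class is

$$\frac{\partial}{\partial \Delta t}\left[\frac{1}{\Delta t} + \lambda\, |\rho|\,\Delta t\right] = -\frac{1}{\Delta t^2} + \lambda\, |\rho| = 0 \;\Longrightarrow\; |\rho|\,\Delta t^2 = \frac{1}{\lambda} = \text{constant},$$

which is (30); replacing $|\rho|$ by $E[|\rho|]$ for the deterministic class gives $E[r_n] = $ constant, which is (29).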

Thus, the adaptive algorithm with stochastic time steps uses the optimal conditions (27) and (30) to construct the mesh, which may be different for each realization. On the other hand, the adaptive algorithm with deterministic time steps uses the optimal conditions (27) and (29) to construct the mesh, which is the same for all realizations. Note that (29)–(30) do not take the sign of the error density into account and, in this sense, our time steps are optimal only for error densities of one sign. This work does not consider the more difficult question of using cancellation of the error in an optimal way.


3.3. Convergence Rates for Stochastic Time Steps

The optimal conditions (27) and (30) and the restriction (5) motivate the goal of the adaptive algorithm with stochastic time steps: for each realization, construct a time partition $\Delta t$ of $[0, T]$ such that

$$|\rho_n|\,\Delta t_n^2 \le \frac{s_1\, \mathrm{TOL}_T}{E[N]}, \quad n = 1, \ldots, N, \tag{31}$$

where $s_1$ is a given positive constant; see Remark 3.1. Note that, in practice, the quantity $E[N]$ is not known and we can only estimate it by a sample average $\mathcal{A}(N; M)$ from the previous batch of realizations. The statistical error $|E[N] - \mathcal{A}(N; M)|$ is then bounded by $E_S(N; M)$, with probability close to one, by the same argument as in (21). The remainder of this section analyzes an adaptive algorithm based on (31) with respect to stopping, accuracy, and efficiency.

Let $\bar{N}[j] \equiv \mathcal{A}(N; M[j])$ be the sample average of the final number of time steps in the $j$th batch of $M[j]$ realizations. To achieve (31) for each realization, start with an initial partition $\Delta t[1]$ and then specify iteratively a new partition $\Delta t[k+1]$, from $\Delta t[k]$, using the following refinement strategy (as sketched in code below). For each realization in the $m$th batch:

for each time step n = 1, 2, ..., N[k]
    if r_n[k] > s_1 TOL_T / N̄[m-1] then
        divide Δt_n[k] into H = 2 uniform substeps
    else
        let the new step be the same as the old
    endif
endfor.    (32)
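A minimal sketch of one sweep of (32) on arrays; the node array t and indicator array r are illustrative inputs, and the H = 2 substeps are produced by inserting midpoints:

import numpy as np

def refine(t, r, tol_T, N_prev, s1=2.0):
    # One sweep of (32): divide step n when r_n > s1*TOL_T/N_prev.
    # t: nodes t_1..t_{N+1}; r: indicators r_1..r_N. Returns refined nodes.
    new_t = [t[0]]
    for n in range(len(r)):
        if r[n] > s1 * tol_T / N_prev:             # indicator too large: divide
            new_t.append(0.5 * (t[n] + t[n + 1]))  # midpoint gives H = 2 substeps
        new_t.append(t[n + 1])
    return np.array(new_t)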

The refinement strategy (32) motivates the following stopping criterion: for each realization of the $m$th batch,

if ( max_{1≤n≤N[k]} r_n[k] < S_1 TOL_T / N̄[m-1] ) then stop.    (33)

Here $S_1$ is a given constant, with $S_1 > s_1 > 0$, determined more precisely as follows: we want the maximal error indicator to decay quickly to the stopping level $S_1\,\mathrm{TOL}_T/\bar{N}$, but when almost all $r_n$ satisfy $r_n \le s_1\,\mathrm{TOL}_T/\bar{N}$, the reduction of the error may be slow. Theorem 3.2 shows that a slow reduction is avoided if $S_1$ satisfies (37).

Remark 3.1. In practice, the numerical tests show that

$$\frac{\big|E[g(X(T))] - \mathcal{A}(g(\overline{X}(T)); M)\big|}{\mathrm{TOL}} \approx \frac{s_1}{2},$$

so we choose $s_1 = 2$.


3.3.1. The Adaptive Algorithm

The adaptive stochastic time stepping algorithm has a structure similar to a basic Monte Carlo algorithm, cf. (66), with an additional inner loop for individual mesh refinement for each realization of a Brownian motion. First, we split the specified error tolerance by (18); the outer loop computes the batches of realizations of $\overline{X}$ until an estimate (21) of the statistical error, for $Y = g(\overline{X}(T))$, is below the tolerance $\mathrm{TOL}_S$; then, for each realization, the inner loop applies the refinement strategy (32) iteratively until the time mesh is sufficiently resolved, in other words, until the approximate error density and the time steps satisfy the stopping criterion (33) with a given time discretization tolerance $\mathrm{TOL}_T$. This refinement procedure, in the inner loop, needs to sample the Wiener process, $W$, on finer partitions, given its values on coarser ones, which is accomplished by Brownian bridge refinements (34). The computation of the stochastic flows requires storing the current realization of $\overline{X}$ and $W$. Note that the inner loop computes a single realization at a time.

Now we are ready for the detailed definition of the adaptive algorithm with stochastic steps:

Algorithm S

Initialization. Choose:

(1) An error tolerance, TOL ≡ TOL_S + TOL_T.
(2) A number N[1] of initial uniform steps Δt[1] for [0, T], with TOL·N[1] bounded from above and below by positive constants, and set N̄[0] = N[1].
(3) A number M[1] of initial realizations, with TOL²·M[1] bounded from above and below by positive constants.
(4) An integer H = 2 for two subdivisions of a refined time step, a number s_1 = 2 in (32), and a rough estimate of ĉ in (36) to compute S_1 using (37).
(5) A constant c_0 ≥ 1.65 and an integer MCH ≥ 2 to determine the number of realizations in (35).

Set the iteration counter for realization batches m = 1 and the statistical error to E_S[m] = +∞.

Do while (E_S[m] > TOL_S)
    For realizations j = 1, ..., M[m]
        Set the number of time levels for realization j to k = 1
        and set the error indicator to r[k] = +∞.
        Start with the initial partition Δt[k] and generate ΔW[k].
        Compute g(X̄(T)) and N[J] for realization j by calling
        routine Control-Time-Error, where k = J is the number
        of final time levels for an accurate mesh of this realization.
    end-for
    Compute the sample average E_g ≡ A(g(X̄(T)); M[m]), the sample
    standard deviation S[m] ≡ S(g(X̄(T)); M[m]), and the a posteriori
    bound for the statistical error E_S[m] ≡ E_S(g(X̄(T)); M[m]) in (21).
    if (E_S[m] > TOL_S)
        Discard all old M[m] realizations and determine a larger
        M[m+1] by change_M(M[m], S[m], TOL_S; M[m+1]) in (35), and
        update N̄ = A(N[J]; M[m]), where the random variable N[J]
        is the final number of time steps of each realization.
    end-if
    Increase m by 1.
end-do

Accept E_g as an approximation of E[g(X(T))], since the estimate of the computational error is bounded by TOL.

routine Control-Time-Error(Δt[k], ΔW[k], r[k], N̄[m-1]; g(X̄(T)), N[J])
    Do while (r[k] violates the stopping criterion (33))
        Compute the Euler approximation X̄[k] in (2) and the error
        indicators r[k] in (26), using the error density (9), on Δt[k]
        with the known Wiener increments ΔW[k].
        If (r[k] violates the stopping criterion (33))
            Do the refinement process (32) to compute Δt[k+1] from Δt[k],
            and compute ΔW[k+1] from ΔW[k] using Brownian bridges (34).
        end-if
        Increase k by 1.
    end-do
    Set the number of the final level J = k − 1.
end of Control-Time-Error

At the new time steps $t'_i \equiv (t_i[k] + t_{i+1}[k])/2$ on level $k+1$, the new sample points of $W$ are constructed by the Brownian bridge, cf. Karatzas and Shreve [24] and Glasserman [17],

$$W^\ell(t'_i) = \frac{1}{2}\Big(W^\ell(t_i[k]) + W^\ell(t_{i+1}[k])\Big) + z^\ell_i, \tag{34}$$

where the $z^\ell_i$ are independent random variables, also independent of $W^\ell(t_j[k])$ for all $i$, $j$, and $\ell$, and each component $z^\ell_i$ is normally distributed with mean zero and variance $(t_{i+1}[k] - t_i[k])/4$.
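A minimal sketch of the midpoint sampling (34); the endpoint values and times are illustrative arguments:

import numpy as np

def bridge_midpoint(t_left, t_right, W_left, W_right, rng):
    # Brownian bridge (34): sample W at the midpoint of [t_left, t_right],
    # conditionally on the endpoint values; Var(z) = (t_right - t_left)/4.
    z = rng.normal(0.0, np.sqrt((t_right - t_left) / 4.0))
    return 0.5 * (W_left + W_right) + z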


routine change_M(M_in, σ_in, TOL_S; M_out)
    M* = min{ integer part( (c_0 σ_in / TOL_S)² ), MCH × M_in }
    n = integer part( log₂ M* ) + 1                                   (35)
    M_out = 2ⁿ.
end of change_M
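A minimal sketch of routine change_M (35) in code; the parameter values in the usage line are illustrative:

import math

def change_M(M_in, sigma_in, tol_S, c0=1.65, MCH=2):
    # (35): proposed batch size, capped at MCH*M_in, rounded up to a power of 2.
    M_star = min(int((c0 * sigma_in / tol_S) ** 2), MCH * M_in)
    n = int(math.log2(M_star)) + 1
    return 2 ** n

print(change_M(M_in=1024, sigma_in=0.1, tol_S=0.01))  # M* = 272, returns 512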

The remainder of this section analyzes Algorithm S in three theorems with respect to stopping, accuracy, and efficiency. An important ingredient of the analysis is the proof in section 4 that the error density converges a.s. as the maximal time step tends to zero. Therefore, for each realization and sufficiently refined meshes, there exists a constant $\hat{c} = \hat{c}(t_i, \omega)$, close to 1 a.s., such that for all time steps $[t_i, t_{i+1})[k]$ and all refinement levels $k$ the error density satisfies

$$\hat{c} \le \left|\frac{\rho(t_i)[\mathrm{parent}(i, k)]}{\rho(t_i)[k]}\right| \le \hat{c}^{-1}, \qquad \hat{c} \le \left|\frac{\rho(t_i)[k-1]}{\rho(t_i)[k]}\right| \le \hat{c}^{-1}, \tag{36}$$

provided $\sup_{n, k, \omega} \Delta t_n[k]$ is sufficiently small. Here $\rho(t_i)[\mathrm{parent}(i, k)]$ denotes the corresponding error density on a previous refinement level, $\mathrm{parent}(i, k)$, of the time step $[t_i, t_{i+1})[k]$. Note that the previous refinement level, $\mathrm{parent}(i, k)$, of $k$ is not necessarily $k - 1$.

3.3.2. Stopping of the Adaptive Algorithm

The right choice of the parameters $0 < s_1 < S_1$ is explained by

Theorem 3.2 (Stopping). Suppose the adaptive algorithm uses the strategy (32) and (33). Assume that $\hat{c}$ satisfies (36) for the time steps corresponding to the maximal error indicator on each refinement level, and that

$$S_1 > \frac{s_1}{\hat{c}}, \qquad 1 > \frac{\hat{c}^{-1}}{H^2}. \tag{37}$$

Then, for each realization of a Brownian motion, the adaptive refinement process decreases the maximal error indicator by the factor

$$\max_{1 \le i \le N[k+1]} r_i[k+1] \le \frac{\hat{c}^{-1}}{H^2} \max_{1 \le i \le N[k]} r_i[k], \tag{38}$$

or stops the Control-Time-Error routine for this realization.


Proof. Consider a fixed realization and let $N[k]$ denote the number of time steps on the $k$th refinement level. There is a $t^* \in [0, T]$ such that

$$r(t^*)[k+1] = \max_{1 \le i \le N[k+1]} r_i[k+1]$$

on refinement level $k+1$. The corresponding indicator $r(t^*)[k]$, on the previous level, satisfies precisely one of the following three statements:

$$r(t^*)[k] \le \frac{s_1\,\mathrm{TOL}_T}{\bar{N}}, \tag{39}$$

$$\frac{s_1\,\mathrm{TOL}_T}{\bar{N}} < r(t^*)[k] \le H^2\, \frac{s_1\,\mathrm{TOL}_T}{\bar{N}}, \tag{40}$$

$$r(t^*)[k] > H^2\, \frac{s_1\,\mathrm{TOL}_T}{\bar{N}}. \tag{41}$$

If (39) holds, the time step containing $t^*$ is not divided on level $k+1$, and by (36)

$$r(t^*)[k+1] \le \hat{c}^{-1}\, \frac{s_1\,\mathrm{TOL}_T}{\bar{N}}. \tag{42}$$

The condition $S_1 > \hat{c}^{-1} s_1$ in (37) shows that the algorithm stops at level $k+1$ if (39) holds.

Similarly, if (40) holds, the time step containing $t^*$ is divided on level $k+1$, so that $r(t^*)[k+1] \le \hat{c}^{-1} s_1\,\mathrm{TOL}_T/\bar{N}$ again, and consequently the algorithm stops at level $k+1$.

Finally, if (41) holds, the time step containing $t^*$ is divided, and by (36)

$$r(t^*)[k+1] \le \frac{\hat{c}^{-1}}{H^2}\, r(t^*)[k] \le \frac{\hat{c}^{-1}}{H^2} \max_{1 \le i \le N[k]} r_i[k],$$

which proves the theorem. □

Note that the error density may be very large for some realizations. However, in that case, the algorithm is forced to refine the mesh, and asymptotically the ratio in (36) tends to one, due to the a.s. convergence.

Now let us verify that the choice of $\delta$, i.e., $\delta = \mathrm{TOL}^{\bar\alpha}$, where $0 < \bar\alpha < \gamma/(\gamma + 2)$ and $0 < \gamma < 1/2$, implies that $\Delta t_{\sup} \to 0$ as $\mathrm{TOL} \to 0$ and that $\hat{c}$ is close to 1 in (36) for sufficiently refined meshes.

Lemma 3.3. Suppose (23) and (103) hold; then

$$\sup_t \Delta t(t)[J] \le \sqrt{\frac{S_1}{3}}\; \mathrm{TOL}^{\frac{1 - \bar\alpha}{2}} \quad a.s. \tag{43}$$

for the final mesh $J$, and

$$\limsup_{\mathrm{TOL} \to 0+} \left|\frac{|\rho(t_i)[\mathrm{parent}(i, k)]|}{|\rho(t_i)[k]|} - 1\right| = 0 \quad a.s., \qquad \limsup_{\mathrm{TOL} \to 0+} \left|\frac{|\rho(t_i)[k-1]|}{|\rho(t_i)[k]|} - 1\right| = 0 \quad a.s.$$

Proof. For each realization, when the routine Control-Time-Error in Algorithm S has finished, the error indicators satisfy the bound

$$|\rho_i|\,\Delta t_i^2 \le \frac{S_1\,\mathrm{TOL}_T}{\bar{N}} \quad \text{for all } i, \tag{44}$$

where $\bar{N}$ is the sample average of the previous batch. Consequently, we have by (23) and the choice $\delta = \mathrm{TOL}^{\bar\alpha}$ in (24)

$$(\Delta t_{\sup})^2 \le \frac{S_1\,\mathrm{TOL}_T}{\delta\,\bar{N}} \le \frac{S_1}{3}\,\mathrm{TOL}^{1 - \bar\alpha},$$

which proves (43).

The definitions (23) and (103) in Corollary 4.3 imply

$$|\rho| = \max\big(|\hat\rho| + o(\Delta t_{\sup}^{\gamma}),\, \delta\big)$$

for $0 < \gamma < \frac{1}{2}$, where $\hat\rho$ is the limit of $\rho$. Therefore, we have

$$\left|\frac{|\rho(t_i)[k-1]|}{|\rho(t_i)[k]|} - 1\right| \le \frac{o\big(\Delta t_{\sup}^{\gamma}[k]\big)}{\delta} = o\Big(\mathrm{TOL}^{\gamma\frac{1-\bar\alpha}{2} - \bar\alpha}\Big).$$

The same estimate for $|\rho(t_i)[\mathrm{parent}(i, k)]|/|\rho(t_i)[k]|$ finishes the proof. □

3.3.3. Accuracy of the Adaptive Algorithm

The adaptive algorithm guarantees that the estimated error is bounded by $S_1\,\mathrm{TOL}_T + \mathrm{TOL}_S = \big(\frac{S_1}{3} + \frac{2}{3}\big)\mathrm{TOL}$. The next question is whether the true error is bounded by $\big(\frac{S_1}{3} + \frac{2}{3}\big)\mathrm{TOL}$ asymptotically. Using the upper bound (33) of the error indicators and the a.s. convergence of $\rho$, proved in section 4, the approximate error has the following estimate.

Theorem 3.4 (Accuracy). Suppose that the assumptions of Lemma 3.3 hold. Then the adaptive algorithm (32) and (33) satisfies, for any constant $c_0 > 0$ defined in (35),

$$\liminf_{\mathrm{TOL} \to 0+} P\left(\frac{\big|E[g(X(T))] - \mathcal{A}(g(\overline{X}(T)); M)\big|}{\mathrm{TOL}} \le \frac{S_1}{3} + \frac{2}{3}\right) \ge \int_{-c_0}^{c_0} \frac{e^{-x^2/2}}{\sqrt{2\pi}}\, dx. \tag{45}$$


Proof. First, we split the fraction on the left-hand side of (45) into a statistical error and a time discretization error: for all $s \in \mathbb{R}_+$,

$$\begin{aligned} &\liminf_{\mathrm{TOL} \to 0+} P\left(\frac{\big|E[g(X(T))] - \mathcal{A}(g(\overline{X}(T)); M)\big|}{\mathrm{TOL}} \le \frac{S_1}{3} + \frac{2}{3}\right) \\ &\quad\ge \liminf_{\mathrm{TOL} \to 0+} P\left(\frac{\big|E[g(X(T)) - g(\overline{X}(T))]\big|}{\mathrm{TOL}} + \frac{\big|E[g(\overline{X}(T))] - \mathcal{A}(g(\overline{X}(T)); M)\big|}{\mathrm{TOL}} \le \frac{S_1}{3} + \frac{2}{3}\right) \\ &\quad\ge \liminf_{\mathrm{TOL} \to 0+} P\left(\frac{\big|E[g(X(T)) - g(\overline{X}(T))]\big|}{\mathrm{TOL}} \le s \ \text{ and } \ \frac{\big|E[g(\overline{X}(T))] - \mathcal{A}(g(\overline{X}(T)); M)\big|}{\mathrm{TOL}} \le \frac{S_1}{3} + \frac{2}{3} - s\right) \\ &\quad= \liminf_{\mathrm{TOL} \to 0+}\left(P\left(\frac{\big|E[g(X(T)) - g(\overline{X}(T))]\big|}{\mathrm{TOL}} \le s\right) \times P\left(\frac{\big|E[g(\overline{X}(T))] - \mathcal{A}(g(\overline{X}(T)); M)\big|}{\mathrm{TOL}} \le \frac{S_1}{3} + \frac{2}{3} - s\right)\right). \end{aligned} \tag{46}$$

The Time Discretization Error. When the adaptive algorithm stops at the $m$th batch, the error estimates (4) (defining $\varrho$) and (8) and the stopping bound (33) for $\Delta t$ imply, by Jensen's inequality and the independence of $\bar{N}[m-1]$ and $\overline{X}[m]$,

$$\big|E[g(X(T)) - g(\overline{X}(T))]\big| \le E\left[\sum_{i=1}^{N} \Delta t_i \int_{t_i}^{t_{i+1}} |\varrho(s)|\, ds\right] < \sqrt{S_1\,\mathrm{TOL}_T}\; E\left[\frac{1}{\sqrt{\bar{N}[m-1]}} \int_0^T \frac{|\varrho(s)|}{\sqrt{|\rho(s)|}}\, ds\right] \le \sqrt{S_1\,\mathrm{TOL}_T}\, \sqrt{E\left[\frac{1}{\bar{N}[m-1]}\right]}\; E\left[\int_0^T \frac{|\varrho(s)|}{\sqrt{|\rho(s)|}}\, ds\right],$$

and, consequently,

$$\limsup_{\mathrm{TOL} \to 0+} \frac{\big|E[g(X(T)) - g(\overline{X}(T))]\big|}{\mathrm{TOL}_T} \le \sqrt{S_1}\, \limsup_{\mathrm{TOL} \to 0+} \sqrt{E\left[\frac{1}{\mathrm{TOL}_T\, \bar{N}[m-1]}\right]}\; \limsup_{\mathrm{TOL} \to 0+} E\left[\int_0^T \frac{|\varrho(s)|}{\sqrt{|\rho(s)|}}\, ds\right]. \tag{47}$$


Rewrite the stopping condition (33) as

$$\sqrt{|\rho|} < \sqrt{\frac{S_1\,\mathrm{TOL}_T}{\bar{N}[m-1]}}\; \frac{1}{\Delta t_i},$$

integrate both sides, use the definition (22), and take the sample average over the $M[m]$ independent samples to obtain

$$\bar{N}[m] \ge \frac{\sqrt{\bar{N}[m-1]}}{\sqrt{S_1\,\mathrm{TOL}_T}}\; \mathcal{A}\left(\int_0^T \sqrt{|\rho(s)|}\, ds;\ M[m]\right), \tag{48}$$

which, by Jensen's inequality, implies

$$\left(E\left[\frac{1}{\mathrm{TOL}_T\,\bar{N}[m]}\right]\right)^2 \le E\left[\frac{1}{(\mathrm{TOL}_T\,\bar{N}[m])^2}\right] \le S_1\, E\left[\frac{1}{\mathrm{TOL}_T\,\bar{N}[m-1]}\right] E\left[\frac{1}{\Big(\mathcal{A}\big(\int_0^T \sqrt{|\rho(s)|}\, ds;\ M[m]\big)\Big)^2}\right].$$

This recursion, with the initial conditions on $N[1]$ and $M[1]$, shows, using $\rho \to \hat\rho$ a.s. by Corollary 4.3,

$$\limsup_{\mathrm{TOL} \to 0+} \sqrt{E\left[\frac{1}{\mathrm{TOL}_T\,\bar{N}}\right]} \le \frac{\sqrt{S_1}}{E\big[\int_0^T \sqrt{|\hat\rho(s)|}\, ds\big]}. \tag{49}$$

Since, by (23) and (24), $|\rho| \ge \mathrm{TOL}^{\bar\alpha}$, the $L^2$ bound (7) implies

$$\limsup_{\mathrm{TOL} \to 0+} E\left[\int_0^T \frac{|\varrho(s)|}{\sqrt{|\rho(s)|}}\, ds\right] = E\left[\int_0^T \sqrt{|\hat\rho(s)|}\, ds\right],$$

and, consequently, we conclude by (47) and (49) that

$$\limsup_{\mathrm{TOL} \to 0+} \frac{\big|E[g(X(T)) - g(\overline{X}(T))]\big|}{\mathrm{TOL}_T} \le S_1.$$

Since $\mathrm{TOL}_T = \frac{1}{3}\mathrm{TOL}$, this deterministic limit implies that for all $s > S_1/3$

$$\liminf_{\mathrm{TOL} \to 0+} P\left(\frac{\big|E[g(X(T)) - g(\overline{X}(T))]\big|}{\mathrm{TOL}} \le s\right) = 1. \tag{50}$$

The Statistical Error. Use that in (35) the number of realizations satisfies

$$M \ge \left(\frac{c_0\, \mathcal{S}\big(g(\overline{X}(T)); M[m-1]\big)}{\mathrm{TOL}_S}\right)^2$$


to rewrite $1/\mathrm{TOL} \le \frac{2}{3 c_0}\, \sqrt{M}\,/\,\mathcal{S}\big(g(\overline{X}(T)); M[m-1]\big)$ by (18), so that

$$P\left(\frac{\big|E[g(\overline{X}(T))] - \mathcal{A}(g(\overline{X}(T)); M[m])\big|}{\mathrm{TOL}} \le \frac{S_1}{3} + \frac{2}{3} - s\right) \ge P\left(\frac{\big|E[g(\overline{X}(T))] - \mathcal{A}(g(\overline{X}(T)); M[m])\big|}{\mathcal{S}\big(g(\overline{X}(T)); M[m-1]\big)}\, \sqrt{M} \le \frac{3 c_0}{2}\Big(\frac{S_1}{3} + \frac{2}{3} - s\Big)\right).$$

Let

$$\mathcal{Z} \equiv \frac{E[g(\overline{X}(T))] - \mathcal{A}\big(g(\overline{X}(T)); M[m]\big)}{\mathcal{S}\big(g(\overline{X}(T)); M[m-1]\big)} \quad \text{and} \quad \bar{\mathcal{Z}} \equiv \frac{E[g(X(T))] - \mathcal{A}\big(g(X(T)); M[m]\big)}{\mathcal{S}\big(g(\overline{X}(T)); M[m-1]\big)}.$$

Then, write

$$\mathcal{Z} = (\mathcal{Z} - \bar{\mathcal{Z}}) + \bar{\mathcal{Z}}.$$

Use the strong convergence (7), the fact that $\mathcal{S}(g(\overline{X}(T)); M[m-1])$ is independent of $X[m]$ and $\overline{X}[m]$, and Chebyshev's inequality to conclude

$$P\big(|\mathcal{Z} - \bar{\mathcal{Z}}|\,\sqrt{M} > \varepsilon\big) \le \frac{\mathcal{O}(\Delta t_{\sup})}{\varepsilon^2}. \tag{51}$$

We have $\mathcal{S}(g(\overline{X}(T)); M) \to \sigma_{g(X(T))}$ a.s., as $M \to \infty$, i.e., as $\mathrm{TOL} \to 0+$, by the strong law of large numbers. Therefore, the central limit theorem yields the weak convergence

$$\bar{\mathcal{Z}}\,\sqrt{M} = \frac{\sum_{j}\big(E[g(X(T))] - g(X(T; \omega_j))\big)}{\mathcal{S}\big(g(\overline{X}(T)); M\big)\,\sqrt{M}} \rightharpoonup \nu \quad \text{as } \mathrm{TOL} \to 0+, \tag{52}$$

where $\nu$ is normally distributed with mean zero and variance one. Consequently, (51) and (52) imply

$$\sup_{s > \frac{S_1}{3}} \lim_{\mathrm{TOL} \to 0+} P\left(\frac{\big|E[g(\overline{X}(T))] - \mathcal{A}(g(\overline{X}(T)); M)\big|}{\mathcal{S}\big(g(\overline{X}(T)); M\big)}\, \sqrt{M} \le \frac{3 c_0}{2}\Big(\frac{S_1}{3} + \frac{2}{3} - s\Big)\right) = 2 \sup_{s > \frac{S_1}{3}} \int_0^{\frac{3 c_0}{2}\left(\frac{S_1}{3} + \frac{2}{3} - s\right)} \frac{e^{-x^2/2}}{\sqrt{2\pi}}\, dx = \int_{-c_0}^{c_0} \frac{e^{-x^2/2}}{\sqrt{2\pi}}\, dx.$$

Finally, the combination of (46), (50), and the last limit proves the theorem. □

3.3.4. Efficiency of the Adaptive Algorithm

The minimal expected number of time steps in the class of stochastic time steps $\Delta t(t, \omega)$ depends on the stochastic data and the individual realizations $X$ through the constraint $E[g(X(T)) - g(\overline{X}(T))] = \mathrm{TOL}_T$.


The conditions (27) and (30) imply that the optimal expected number of time steps, $E[N_S]$, satisfies (cf. Szepessy et al. [39])

$$E[N_S] = \frac{1}{\mathrm{TOL}_T}\left(E\left[\int_0^T \sqrt{|\rho(s)|}\, ds\right]\right)^2. \tag{53}$$

On the other hand, for constant time steps $\Delta t = \text{constant}$, the number of steps, $N_C$, needed to achieve $\sum_{i=1}^{N} |\rho_i|\,\Delta t_i^2 = \mathrm{TOL}_T$ becomes

$$N_C = \frac{T}{\mathrm{TOL}_T}\left(\int_0^T E[|\rho(s)|]\, ds\right). \tag{54}$$

Therefore, Jensen's inequality shows that the adaptive method with stochastic time steps uses fewer time steps than the method with a constant time step, i.e.,

$$E[N_S] \le N_C. \tag{55}$$
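The inequality (55) is a consequence of the Cauchy–Schwarz and Jensen inequalities; a sketch: pathwise,

$$\left(\int_0^T \sqrt{|\rho(s)|}\, ds\right)^2 \le T \int_0^T |\rho(s)|\, ds,$$

so that

$$E[N_S] = \frac{1}{\mathrm{TOL}_T}\left(E\left[\int_0^T \sqrt{|\rho|}\, ds\right]\right)^2 \le \frac{1}{\mathrm{TOL}_T}\, E\left[\left(\int_0^T \sqrt{|\rho|}\, ds\right)^2\right] \le \frac{T}{\mathrm{TOL}_T} \int_0^T E[|\rho(s)|]\, ds = N_C,$$

where the first inequality is Jensen's inequality $(E[Z])^2 \le E[Z^2]$.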

The following theorem uses a bound on the error indicators, obtained from the stopping condition (33) and the error density ratio (36), to show that the algorithm (32)–(33) generates a mesh that is optimal up to a multiplicative constant.

Theorem 3.5 (Efficiency). Assume for each realization that $\hat{c} = \hat{c}(t, \omega)$ satisfies (36) for all time steps at the final refinement level, that all initial time steps have been divided when the algorithm stops, and that the assumptions of Lemma 3.3 hold. Then the final sample average $\bar{N}[m] = \mathcal{A}(N; M[m])$ of the number of adaptive steps of the algorithm (32) and (33) satisfies

$$\frac{\mathrm{TOL}_T\, \bar{N}[m]^2}{\bar{N}[m-1]} < \frac{H^2}{s_1}\left(\mathcal{A}\left(\int_0^T \sqrt{\left|\frac{\rho}{\hat{c}}\right|}\, ds;\ M[m]\right)\right)^2, \tag{56}$$

and asymptotically

$$\limsup_{\mathrm{TOL}_T \to 0+} \mathrm{TOL}_T\, E[N] \le \frac{H^2}{s_1}\left(E\left[\int_0^T \sqrt{|\hat\rho(s)|}\, ds\right]\right)^2. \tag{57}$$

Proof. When the adaptive algorithm stops at the $m$th batch, on the final refinement level $J$ of a fixed realization, the error indicators satisfy the upper bound

$$r_i[J] = \big(|\rho(t_i)|\,\Delta t_i^2\big)[J] \le \frac{S_1\,\mathrm{TOL}_T}{\bar{N}[m-1]} \quad \text{for all } i.$$

By assumption, each time step $[t_i, t_{i+1})[J]$ has a parent on a previous level, $\mathrm{parent}(i, J)$, not necessarily the previous level $J-1$, which was divided. Therefore, the indicators of the parent time steps satisfy the lower bound

$$|\rho(t_i)[\mathrm{parent}(i, J)]|\, H^2\, \Delta t(t_i)^2[J] = \big(|\rho(t_i)|\,\Delta t(t_i)^2\big)[\mathrm{parent}(i, J)] > \frac{s_1\,\mathrm{TOL}_T}{\bar{N}[m-1]}.$$

The estimate on the number of steps now follows by relating the error indicators to the lower bounds of their parents:

$$\Delta t(t_i)^2[J] > \frac{s_1\,\mathrm{TOL}_T}{\bar{N}[m-1]}\, \frac{1}{H^2}\, \frac{1}{|\rho(t_i)[\mathrm{parent}(i, J)]|} \ge \frac{s_1\,\mathrm{TOL}_T}{\bar{N}[m-1]\, H^2}\, \frac{\hat{c}}{|\rho(t_i)[J]|}.$$

The above inequality and (22) imply

$$N = \int_0^T \frac{dt}{\Delta t(t)} < H\, \sqrt{\frac{\bar{N}[m-1]}{s_1\,\mathrm{TOL}_T}} \int_0^T \sqrt{\left|\frac{\rho}{\hat{c}}\right|}\, dt. \tag{58}$$

Note that $N$ is the number of time steps of a realization in the $m$th batch; taking its sample average proves (56):

$$\frac{\bar{N}[m]}{\sqrt{\bar{N}[m-1]}} \le \frac{H}{\sqrt{s_1\,\mathrm{TOL}_T}}\, \mathcal{A}\left(\int_0^T \sqrt{\left|\frac{\rho}{\hat{c}}\right|}\, dt;\ M[m]\right).$$

Take the expectation of (58) and use independence and Jensen's inequality to obtain

$$E\big[\mathrm{TOL}_T\, N[m]\big] \le \sqrt{E\big[\mathrm{TOL}_T\, \bar{N}[m-1]\big]}\; \frac{H}{\sqrt{s_1}}\; E\left[\int_0^T \sqrt{\left|\frac{\rho}{\hat{c}}\right|}\, dt\right].$$

This recursion with the initial conditions on $N[1]$ and $M[1]$ proves (57). □

Remark 3.6. The error density condition also implies constraints on the optimal mesh: $H = 2$ and the assumption $\frac{1}{2}\big(\rho_i[k] + \rho_{i+1}[k]\big) = \rho(t_i)[k-1]$ show that

$$2\hat{c} - 1 \le \left|\frac{\rho_{i+1}[k]}{\rho_i[k]}\right| \le 2\hat{c}^{-1} - 1. \tag{59}$$

Remark 3.7. If the number of elements in each refinement iteration increases only very slowly, the total work including all refinement levels becomes proportional to the product of the number of steps in the finest mesh and the number of refinement levels, $J$, which satisfies $\min \Delta t = H^{-J}\, T/N[1] = \mathcal{O}(\mathrm{TOL})$, so that

$$J = \mathcal{O}\big(\log(\mathrm{TOL}^{-1})\big) = \mathcal{O}(\log N).$$

Therefore, the average of the total number of time steps is essentially bounded by $E[N]\, \log(E[N])$.

Remark 3.8. The adaptive algorithm based only on dividing (32) can also include merging of steps. These two adaptive algorithms perform similarly, and analogous theoretical results are proved in Moon [29]. An advantage of the version without merging is that stopping requires (36) only at the maximal error indicator on each level and that fewer parameters are used. The dividing-merging adaptive algorithm takes the form (sketched in code below): for $i = 1, 2, \ldots, N[k]$ let

$$r_i[k] = |\rho_i[k]|\,\big(\Delta t_i[k]\big)^2$$

and

if ( r_i[k] > s_1 TOL_T / N ) then
    divide Δt_i[k] into H uniform substeps,
elseif ( max(r_i[k], r_{i+1}[k]) < s_2 TOL_T / N ) then
    merge Δt_i[k] and Δt_{i+1}[k] into one step and increase i by 1,
else
    let the new step be the same as the old
endif.

With this dividing and merging strategy it is natural to use the following stopping criterion:

if ( r_i[k] ≤ S_1 TOL_T / N, ∀ i = 1, ..., N ) and
   ( max(r_i[k], r_{i+1}[k]) ≥ S_2 TOL_T / N, ∀ i = 1, ..., N − 1 )
then stop.

Here, $0 < S_2 < s_2 < s_1 < S_1$ are given constants determined more precisely in Moon [29].
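A minimal sketch of one divide–merge sweep of Remark 3.8 in code, with H = 2; the arrays and the parameter values s1 and s2 are illustrative assumptions:

import numpy as np

def divide_merge(t, r, tol_T, N, s1=2.0, s2=0.5):
    # One sweep of Remark 3.8 (H = 2): split steps with large indicators,
    # merge adjacent pairs whose indicators are both below the s2 level.
    new_t, i = [t[0]], 0
    while i < len(r):
        if r[i] > s1 * tol_T / N:                        # divide
            new_t += [0.5 * (t[i] + t[i + 1]), t[i + 1]]
            i += 1
        elif i + 1 < len(r) and max(r[i], r[i + 1]) < s2 * tol_T / N:
            new_t.append(t[i + 2])                       # merge steps i, i+1
            i += 2
        else:
            new_t.append(t[i + 1])                       # keep
            i += 1
    return np.array(new_t)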

3.4. Convergence Rates for Deterministic Time Steps

The main difference between the stochastic and the deterministic time step algorithm is that the additional work to find the optimal deterministic steps requires a much smaller number, $M_T$, of realizations than the total number of realizations $M$. The approximations of the time


discretization error in the right-hand side of (25) can be separated into two parts,

$$E\left[\sum_{n=1}^{N} r_n\right] \le \mathcal{A}\left(\sum_{n=1}^{N} r_n;\ M_T\right) + \left|E\left[\sum_{n=1}^{N} r_n\right] - \mathcal{A}\left(\sum_{n=1}^{N} r_n;\ M_T\right)\right|, \tag{60}$$

where the second error term on the right-hand side of (60) is, with probability close to one, asymptotically bounded by

$$\left|E\left[\sum_{n=1}^{N} r_n\right] - \mathcal{A}\left(\sum_{n=1}^{N} r_n;\ M_T\right)\right| \lesssim E_{TS} \equiv c_0\, \frac{\mathcal{S}\big(\sum_{n=1}^{N} r_n;\ M_T\big)}{\sqrt{M_T}}, \tag{61}$$

and the first term defines $E_{TT} \equiv \mathcal{A}\big(\sum_{n=1}^{N} r_n;\ M_T\big)$. Then, for a given $\mathrm{TOL}_T > 0$, the goal is to construct a partition $\Delta t$ of $[0, T]$, with as few time steps and realizations $M_T$ as possible, such that

$$E_{TT} + E_{TS} \le \mathrm{TOL}_T.$$

To this end, first split the time discretization tolerance $\mathrm{TOL}_T$ into two positive parts, $\mathrm{TOL}_{TT}$ and $\mathrm{TOL}_{TS}$, for $E_{TT}$ and $E_{TS}$, respectively. The statistical error of the time discretization using the density (16) is $\mathcal{O}(\Delta t_{\sup}/\sqrt{M_T})$. Therefore, the percentage of the tolerance, TOL, devoted to the control of the statistical time discretization error can be made arbitrarily small as $\Delta t_{\sup} \to 0$. In practice, we choose

$$\mathrm{TOL}_{TT} = \frac{2}{3}\,\mathrm{TOL}_T = \frac{2}{9}\,\mathrm{TOL} \quad \text{and} \quad \mathrm{TOL}_{TS} = \frac{1}{3}\,\mathrm{TOL}_T = \frac{1}{9}\,\mathrm{TOL}. \tag{62}$$

The control of the statistical time discretization error determines the number of realizations $M_T$ needed to ensure a reliable choice of the time discretization in the deterministic time stepping algorithm.

Take (23) into account and define the error density, $\rho$, by

$$E_{TT} \equiv \mathcal{A}\left(\sum_{n=1}^{N} r_n;\ M_T\right) = \sum_{n=1}^{N} |\rho_n|\,\Delta t_n^2.$$

Following the optimal conditions (27) and (29), the goal of the adaptive algorithm described here is to construct a time partition $\Delta t$ of $[0, T]$ such that

$$|\rho_n|\,\Delta t_n^2 \le \frac{d_1\,\mathrm{TOL}_{TT}}{N}, \quad \forall n = 1, \ldots, N, \tag{63}$$

where $d_1$ is a given positive constant; see Remark 3.9.


To achieve (63), start as in section 3.3 with an initial partition $\Delta t[1]$ and then specify iteratively a new partition $\Delta t[k+1]$, from $\Delta t[k]$, using the following refinement strategy,

for each time step n = 1, 2, ..., N[k], let r_n ≡ |ρ_n| Δt_n²
    if r_n[k] ≥ d_1 TOL_TT / N[k] then
        divide Δt_n[k] into H uniform substeps
    else
        let the new step be the same as the old              (64)
    endif
endfor,

until the following stopping criterion is satisfied:

if ( max_{1≤n≤N[k]} r_n[k] < D_1 TOL_TT / N[k] ) then stop.   (65)

Here, $D_1$ is a given constant satisfying $0 < d_1 < D_1$. The combination of (60) and (65) asymptotically guarantees a given level of accuracy, $E_{TT} < D_1\,\mathrm{TOL}_{TT}$. The positive number $D_1$ is motivated by the need to avoid slow convergence in case almost all $r_n$ satisfy (65), as in section 3.3. Before determining the sufficient conditions for the constants $0 < d_1 < D_1$, we describe the algorithm with deterministic time steps in detail.

Remark 3.9. In practice, the numerical tests show that

$$\frac{\big|E[g(X(T))] - \mathcal{A}(g(\overline{X}(T)); M)\big|}{\mathrm{TOL}} \approx \frac{d_1}{2} < \frac{2}{9}\, D_1,$$

so we choose $d_1 = 2$.

3.4.1. The Adaptive Algorithm

First, we split the specified error tolerance into three parts, $\mathrm{TOL}_S$, $\mathrm{TOL}_{TT}$, and $\mathrm{TOL}_{TS}$, by (18) and (62). The first loop below determines the mesh with $M_T$ realizations by changing, iteratively, the time steps using our refinement strategy (64) until $|\rho|$ and $\Delta t$ satisfy the stopping criterion (65) and the statistical error estimate satisfies $E_{TS} \le \mathrm{TOL}_{TS}$. Then, the second loop, with fixed mesh, chooses the number $M$ of realizations, using (67), until $E_S \le \mathrm{TOL}_S$ holds.

Now we are ready for the detailed definition of the adaptive algorithm with deterministic steps:

Algorithm D

Initialization. Choose:

(1) An error tolerance, TOL ≡ TOL_S + TOL_TT + TOL_TS.
(2) A number, N[1], of initial uniform steps Δt[1] for [0, T].


Convergence Rates for Adaptive Approximation 537

(3) A number, M�1, of initial realizations and set MT�1 = M�1.(4) An integer H ≥ 2 for the number of subdivisions of a refined time

step, a number, d1 = 2 in (64) and a rough estimate of c in (36) tocompute D1 using (69).

(5) A constant c0 ≥ 1�65 and an integer MCH ≥ 2 to determine thenumber of realizations in (67).

Set the iteration counter, k, for time refinement levels, to 1 and set thestatistical error, ETS = + and r�k = +.

Do while (r�k violates the stopping (65) or ETS > TOLTS)Compute the sample averages and the error estimates on �t�kby calling Euler. Set MT�k+ 1 = MT�k and �t�k+ 1 = �t�k.If (r�k violates the stopping (65))

For all time steps i = 1� � � � � N�k, do the refinement process(64) to update �t�k+ 1 from �t�k.

elseif (ETS > TOLTS)Update MT�k+ 1 by change_M �MT �k��TS�k�TOLTS�MT �k+ 1� in (67), where �TS�k is defined in routine Euler.

end-ifIncrease k by 1.

end-doCompute an approximation, Eg, for E�g�X�T�� with fixed time mesh�t = �t�k by calling Monte-Carlo(TOLS , MT�k; Eg) in (66).Accept Eg as an approximation of E�g�X�T��, since the estimate of thecomputational error is bounded by TOL.

routine EulerCompute MT�k new realizations of the Euler method with the samepartition �t�k and update the approximations of the timediscretization error indicators r�k and the statistical timediscretization error ETS�k and compute the sample standarddeviation �TS�k ≡ � �g�X�T���MT �k�.

end-of-Euler

routine Monte-Carlo(TOLS , M0; Eg)Set the batch counter m = 1, M�m = M0 and ES�m = +.Do while (ES�m > TOLS)

Discard all old M�m− 1 realizations andcompute M�m new samples of Y ≡ g�X�T��, along with thesample average EY ≡ ��Y�M�m�, the sample standarddeviation � �m ≡ � �Y�M�m� and the statistical error (66)estimation ES�m+ 1 ≡ ES�Y�M�m�. Compute M�m+ 1 bychange_M �M�m�� �m�TOLS�M�m+ 1�.


        Increase $m$ by 1.
    end-do
end of Monte-Carlo

routine change_M($M_{in}$, $\sigma_{in}$, $\mathrm{TOL}_S$; $M_{out}$)

    $M^* = \min\Big\{\text{integer part}\Big(\Big(\frac{c_0\,\sigma_{in}}{\mathrm{TOL}_S}\Big)^2\Big),\ M_{CH}\times M_{in}\Big\},$
    $n = \text{integer part}\,(\log_2 M^*) + 1,$        (67)
    $M_{out} = 2^n.$

end of change_M

Here, $M_0$ is a given initial value for $M$, and $M_{CH} > 1$ is a positive integer parameter introduced to avoid an excessively large number of realizations in the next batch due to a possibly inaccurate sample standard deviation $\sigma[m]$. Indeed, $M[m+1]$ cannot be greater than $M_{CH}\times M[m]$.
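In Python, routine change_M in (67) might be transcribed as follows; the guard against $M^* = 0$ is an added safety check for very small sample standard deviations, not part of the routine above.

```python
import math

def change_M(M_in, sigma_in, TOL_S, c0=1.65, M_CH=16):
    """A sketch of (67): aim at roughly (c0*sigma/TOL_S)**2 realizations,
    cap the growth by M_CH*M_in, and round up to a power of two."""
    M_star = max(1, min(int((c0 * sigma_in / TOL_S) ** 2), M_CH * M_in))
    n = int(math.log2(M_star)) + 1
    return 2 ** n
```

With $c_0 = 1.65$ the target $(c_0\sigma_{in}/\mathrm{TOL}_S)^2$ corresponds to a confidence level of roughly 90% for the statistical error, and the output is always a power of two, consistent with (35).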

The remainder of this section analyzes Algorithm D in three theorems with respect to stopping, accuracy, and efficiency. The analysis is based on the a.s. convergence of the error density in Theorem 4.5. The proofs are similar to those of the theorems in Section 3.1, thus we only state the results here.

3.4.2. Stopping of the Adaptive Algorithm

The right choice of the parameters $0 < d_1 < D_1$ is explained by

Theorem 3.10 (Stopping). Suppose the adaptive algorithm uses the strategy (64) and (65). Assume that there exists a positive constant $c$ such that

    $c \le \left|\frac{\bar\rho(t_i;\,\mathrm{parent}(i,k))}{\bar\rho(t_i;\,k)}\right| \le c^{-1}, \qquad c \le \left|\frac{\bar\rho(t_i;\,k-1)}{\bar\rho(t_i;\,k)}\right| \le c^{-1},$        (68)

for the time steps corresponding to the maximal error indicator on each refinement level, and that

    $D_1 > \frac{H}{c}\,d_1, \qquad 1 > \frac{c^{-1}}{H^2}.$        (69)

Then, for each realization, the adaptive refinement process either decreases the maximal error indicator by the factor

    $\max_{1\le i\le N[k+1]} r_i[k+1] \le \frac{c^{-1}}{H^2}\,\max_{1\le i\le N[k]} r_i[k],$        (70)

or stops the mesh refinement (do while) loop in the algorithm.


3.4.3. Accuracy of the Adaptive Algorithm

The adaptive algorithm guarantees that the estimated error is bounded by a given error tolerance, $D_1\mathrm{TOL}_{TT} + \mathrm{TOL}_{TS} + \mathrm{TOL}_S = \big(\frac{2}{9}D_1 + \frac{7}{9}\big)\mathrm{TOL}$. By applying the same arguments as in the proof of Theorem 3.4, the true error for the deterministic time stepping is also asymptotically bounded by $\big(\frac{2}{9}D_1 + \frac{7}{9}\big)\mathrm{TOL}$, using the upper bound (65) on the error indicators and the a.s. convergence of $\bar\rho$ in Section 4.

Theorem 3.11 (Accuracy). Suppose that the assumptions of Lemma 3.3 hold. Then the adaptive algorithm (44) and (65) satisfies, for any constant $c_0 > 0$ defined in (67),

    $\liminf_{\mathrm{TOL}\to 0^+} P\left(\frac{\big|E[g(X(T))] - \mathcal{A}(g(X(T));M)\big|}{\mathrm{TOL}} \le \frac{2}{9}D_1 + \frac{7}{9}\right) \ge \int_{-c_0}^{c_0}\frac{e^{-x^2/2}}{\sqrt{2\pi}}\,dx.$        (71)

3.4.4. Efficiency of the Adaptive Algorithm

Within the class of deterministic time steps, the conditions (27) and (28) give the minimal number of steps

    $N_D = \frac{1}{\mathrm{TOL}_T}\left(\int_0^T\sqrt{E[|\rho(\tau)|]}\,d\tau\right)^2.$        (72)

Using the definitions (53) and (54) of the minimal expected number, $E[N_S]$, of stochastic time steps and the number, $N_C$, of constant time steps, respectively, we obtain

    $E[N_S] \le N_D \le N_C$        (73)

by applying Jensen's inequality. Similarly to Theorem 3.5, we prove the following theorem, using the bound on the error indicators to show that the algorithm (64) and (65) generates a mesh that is optimal up to a multiplicative constant.

Theorem 3.12 (Efficiency). Assume that $c$ satisfies (68) for all time steps at the final refinement level, that all initial time steps have been divided when the algorithm stops, and that the assumptions of Lemma 3.3 hold. Then the final number of adaptive steps, $N$, of the algorithm (64)-(65) satisfies

    $\mathrm{TOL}_T\,N \le \frac{H^2}{d_1}\,\mathcal{A}\left(\int_0^T\sqrt{\left|\frac{\bar\rho}{c}(\tau)\right|}\,d\tau;\ M[m]\right)^2.$        (74)


4. ALMOST SURE CONVERGENCE OF THE ERROR DENSITY

This section proves pathwise a.s. convergence of the error density, using a.s. convergence of the approximate solution of the Monte Carlo Euler method based on the adaptive algorithms presented in Section 3. Before presenting the main result, we extend the Euler method, for theoretical purposes only, to $t\in[0,T]$ by

    $\bar X(t) - \bar X(0) = \int_0^t \bar a(s,\bar X)\,ds + \sum_{\ell=1}^{\ell_0}\int_0^t \bar b^\ell(s,\bar X)\,dW^\ell(s),$        (75)

where $\bar a$ and $\bar b^\ell$ are the piecewise constant approximations (see Figure 1)

    $\bar a(s,\bar X) \equiv a(t_n,\bar X(t_n))$ and $\bar b^\ell(s,\bar X) \equiv b^\ell(t_n,\bar X(t_n))$, for $s\in[t_n,t_{n+1})$.

4.1. Stochastic Time Steps

Let us now, for $H = 2$, define the maximum step size for a fixed realization by

    $\Delta t_{\sup}(\omega) \equiv \sup_{1\le n\le N}\Delta t_n(\omega) = T\,2^{-\nu(\omega)},$        (76)

for a positive integer $\nu(\omega)$. Theorem 4.1 and Lemma 4.2 below show that the approximate solution converges a.s. to the correct limit.

Theorem 4.1. Suppose that $a$, $b$, $g$, and $X$ satisfy the assumptions in Lemma 2.1, with the maximum time step size $\Delta t_{\sup}(\omega)$ in (76). Then, for any $\gamma < 1/2$,

    $\lim_{\nu\to\infty}\Delta t_{\sup}^{-\gamma}(\omega)\,\sup_{t\in[0,T]}|X(t) - \bar X(t)| = 0$ a.s.,        (77)

where $\Delta t_{\sup}^{\gamma}(\omega) \equiv (\Delta t_{\sup}(\omega))^{\gamma}$, for $\omega\in\Omega$.

Figure 1. The piecewise constant approximations $a$ and $\bar a$.


Proof. To simplify the proof, let us introduce the forward Euler approximation $\hat X$ of $X$ with uniform time steps, $\hat{\Delta t}$, on a much finer grid than $\Delta t$, so that $\{i\hat{\Delta t}:\ i = 0,\ldots,\hat N\}$ includes all time steps of $\bar X$. We can extend $\hat X$ as in (75) to $t\in[0,T]$ by

    $\hat X(t) - \hat X(0) = \int_0^t \hat a(s,\hat X)\,ds + \sum_{\ell=1}^{\ell_0}\int_0^t \hat b^\ell(s,\hat X)\,dW^\ell(s),$        (78)

where $\hat a$ and $\hat b^\ell$ are the piecewise constant approximations (see Figure 1)

    $\hat a(s,\hat X) = a(i\hat{\Delta t},\hat X(i\hat{\Delta t}))$ and $\hat b^\ell(s,\hat X) = b^\ell(i\hat{\Delta t},\hat X(i\hat{\Delta t}))$, for $s\in[i\hat{\Delta t},(i+1)\hat{\Delta t})$.

Then the error splits into

    $\Delta t_{\sup}^{-\gamma}(\omega)\sup_t|X(t) - \bar X(t)| \le \Delta t_{\sup}^{-\gamma}(\omega)\sup_t|X(t) - \hat X(t)| + \Delta t_{\sup}^{-\gamma}(\omega)\sup_t|\hat X(t) - \bar X(t)|.$        (79)

Let us first study the a.s. convergence of the second term in (79); the convergence of the standard Euler method with uniform time steps in the first term then follows by a similar derivation, or by Talay [40]. To show a.s. convergence of the second term, we observe that

    $\hat X(t) - \bar X(t) = \int_0^t \delta a(s)\,ds + \int_0^t \delta b^\ell(s)\,dW^\ell(s),$        (80)

where, for $t\in[t_n,t_{n+1})$,

    $\delta a(t) = \hat a(t,\hat X) - \bar a(t,\bar X), \qquad \delta b^\ell(t) = \hat b^\ell(t,\hat X) - \bar b^\ell(t,\bar X).$        (81)

Now let us define the second term on the right-hand side of (80) by

    $Y(t) \equiv \int_0^t \delta b^\ell(s)\,dW^\ell(s).$        (82)

We have $\sup_t\sqrt{E[|Y(t)|^2]} = \mathcal{O}\big(\Delta t_{\sup}^{1/2}\big)$ by the mean square strong convergence of the Euler approximation in Lemma 2.1. Lemma 4.2 verifies that $Y$ is a continuous martingale with respect to a filtration generated by $\{W(s), \Delta t(s):\ s\le t\}$, so that $|Y|$ is a submartingale.


Therefore, Doob's inequality and Jensen's inequality give, for any $\gamma' > 0$,

    $P\left(\sup_{0\le t\le T}|Y(t)| \ge \Delta t_{\sup}^{\gamma'}(\omega)\right) \le \frac{E[|Y(T)|]}{\Delta t_{\sup}^{\gamma'}(\omega)} \le \frac{\sqrt{E[|Y(T)|^2]}}{\Delta t_{\sup}^{\gamma'}(\omega)} = \mathcal{O}\big(\Delta t_{\sup}^{1/2-\gamma'}\big).$

The definition (76) then implies, for a positive constant $C$,

    $\sum_{\nu=1}^{\infty} P\left(\sup_{0\le t\le T}|Y(t)| \ge \Delta t_{\sup}^{\gamma'}(\omega)\right) \le \sum_{\nu=1}^{\infty}\mathcal{O}\big(\Delta t_{\sup}^{1/2-\gamma'}\big) \le C\sum_{\nu=1}^{\infty} 2^{-\nu(1/2-\gamma')} < \infty,$        (83)

provided $\frac12 - \gamma' > 0$. Therefore, the Borel–Cantelli lemma implies

    $P\left(\sup_{0\le t\le T}|Y(t)| \ge \Delta t_{\sup}^{\gamma'}(\omega)\ \text{infinitely often}\right) = 0,$

i.e., with $\gamma < 1/2$,

    $\Delta t_{\sup}^{-\gamma}(\omega)\sup_{0\le t\le T}|Y(t)| \to 0$ a.s., as $\nu\to\infty$.        (84)

The first term on the right-hand side of (80) satisfies

    $\sup_t\Big|\int_0^t\delta a(s)\,ds\Big| \le \int_0^T|\delta a(s)|\,ds.$

Therefore, Chebyshev's inequality yields

    $P\left(\sup_t\Big|\int_0^t\delta a(s)\,ds\Big| \ge \Delta t_{\sup}^{\gamma'}(\omega)\right) \le P\left(\int_0^T|\delta a(s)|\,ds \ge \Delta t_{\sup}^{\gamma'}(\omega)\right) \le \frac{1}{\Delta t_{\sup}^{2\gamma'}(\omega)}\,E\left[\Big(\int_0^T|\delta a(s)|\,ds\Big)^2\right] \le \frac{T}{\Delta t_{\sup}^{2\gamma'}(\omega)}\,E\left[\int_0^T|\delta a(s)|^2\,ds\right] = \mathcal{O}\big(\Delta t_{\sup}^{1-2\gamma'}(\omega)\big),$

where the last equality follows from Lemma 2.1. The argument in (83) and the Borel–Cantelli lemma similarly give


$\Delta t_{\sup}^{-\gamma}(\omega)\sup_t\big|\int_0^t\delta a(s)\,ds\big| \to 0$ a.s. for $\gamma < \frac12$, as $\nu\to\infty$, and, consequently, for $\gamma < \frac12$,

    $\Delta t_{\sup}^{-\gamma}(\omega)\sup_{t\in[0,T]}|\hat X(t) - \bar X(t)| \to 0$ a.s. as $\nu\to\infty$.        (85)

The same arguments applied to the first term on the right-hand side of (79) yield, for $\gamma < \frac12$,

    $\Delta t_{\sup}^{-\gamma}(\omega)\sup_{t\in[0,T]}|X(t) - \hat X(t)| \to 0$ a.s. as $\nu\to\infty$.        (86)

Here Lemma 4.2 holds directly for $\hat X$, since $\hat X$ is adapted to the standard $\sigma$-algebra generated by $W(s)$. The combination of (86) and (85) proves (77). □

The key to our proof of Theorem 4.1 is that $Y_t \equiv Y(t)$ in (82) is a martingale with respect to a filtration $\mathcal{F}_t$:

Lemma 4.2. Suppose that $a$, $b$, $g$, and $X$ satisfy the assumptions in Lemma 2.1. Let the process $Y_t$ be defined by (82), and let the filtration $\mathcal{F}_t$ be the $\sigma$-algebra generated by $\{W(s), \Delta t(s):\ s\le t\}$. Then $Y_t$ satisfies

(i) $E[|Y_t|] < \infty$, for $0\le t\le T$,
(ii) $Y_t$ is adapted to $\mathcal{F}_t$,
(iii) $E[Y_t|\mathcal{F}_s] = Y_s$ for $t\ge s$,

i.e., $Y_t$ is a martingale with respect to $\mathcal{F}_t$.

Proof. To prove the lemma, we observe by Lemma 2.1 that $E[|Y_t|] \le \sqrt{E[|Y_t|^2]} < \infty$, and the construction of $\bar X$ implies $Y_t\in\mathcal{F}_t$ for all $t$, so that (i) and (ii) hold. Note that $Y_t$ is not adapted to the standard filtration generated by $W$ only. Using conditional expectations and the notation of Theorem 4.1,

    $E[Y_t|\mathcal{F}_s] = E[Y_s|\mathcal{F}_s] + E\Big[\int_s^t\delta b\,dW\,\Big|\,\mathcal{F}_s\Big] = Y_s + E\Big[\int_s^t\delta b\,dW\,\Big|\,\mathcal{F}_s\Big].$        (87)

It remains to verify $E[\int_s^t\delta b\,dW|\mathcal{F}_s] = 0$. Since $\delta b$ is piecewise constant, rewrite $\int_s^t\delta b\,dW = \sum_{i=n_1}^{n_2}\delta b(i\hat{\Delta t})\,\Delta W_i^\ell$, where $\Delta W_i^\ell = W^\ell((i+1)\hat{\Delta t}) - W^\ell(i\hat{\Delta t})$ with the uniform time steps, $\hat{\Delta t}$, which are finer than $\Delta t$; at the end point $t$ of the interval $[s,t]$ we let $\Delta W_{n_2}^\ell = W^\ell(t) - W^\ell(n_2\hat{\Delta t})$ and, similarly, for the end point $s$.


Let us now study one term, $E[\delta b(n\hat{\Delta t})\,\Delta W_n^\ell]$. Divide the interval $[n\hat{\Delta t},(n+1)\hat{\Delta t})$ into the union of disjoint intervals $\big[n\hat{\Delta t}+\sum_{j=1}^{m-1}\hat{\delta t}_j,\ n\hat{\Delta t}+\sum_{j=1}^{m}\hat{\delta t}_j\big)$, so that $\hat{\Delta t} = \sum_{m=1}^{\hat N}\hat{\delta t}_m$, with the corresponding Wiener increments

    $\hat{\delta W}_m^\ell = W^\ell\Big(n\hat{\Delta t}+\sum_{j=1}^{m}\hat{\delta t}_j\Big) - W^\ell\Big(n\hat{\Delta t}+\sum_{j=1}^{m-1}\hat{\delta t}_j\Big),$
    $\delta W_m^\ell = W^\ell\big((n+1)\hat{\Delta t}\big) - W^\ell\Big(n\hat{\Delta t}+\sum_{j=1}^{m}\hat{\delta t}_j\Big).$

The end intervals $[n_2\hat{\Delta t}, t]$ and $[s, n_1\hat{\Delta t}]$ are treated similarly. We claim that $\bar X(t_n)$ is essentially independent of the increments $W(t+\hat{\delta t}) - W(t)$, provided $\hat{\delta t}(t)$ is sufficiently small and $t_n < t$. The stopping criterion (33) for each final acceptable time discretization implies that $\min_n\big(S_1\frac{\mathrm{TOL}_T}{N} - r_n\big)$ is positive for all realizations. The approximate solution $\bar X(\tau)$ depends on $dW(t)$, for $\tau < t \equiv n\hat{\Delta t}$, only through changes in the mesh. We shall show that, provided $\hat{\delta t}$ is sufficiently small and conditioned on the $\sigma$-algebra $\mathcal{G}(t,\hat{\delta t})$ generated by $\{dW(\tau):\ \tau < t$ or $\tau > t+\hat{\delta t}\}$, the probability of changing the mesh by a change in $\hat{\delta W}(t)$ only is arbitrarily small; thus $\bar X(\tau)$ will be essentially independent of $\hat{\delta W}(t)$ for $\tau < t$. Conditioned on $\mathcal{G}(t,\hat{\delta t})$, let $r_n(\hat{\delta W}) \equiv E[r_n|\mathcal{G}(t,\hat{\delta t})]$ denote the dependence of the error indicator $r_n$ on the noise $\hat{\delta W}$. The Malliavin derivative, $\partial_{W(t)}r_n$, and Taylor's formula imply

    $r_n(\hat{\delta W}) = r_n(0) + \hat{\delta W}\int_0^1\partial_{W(t)}r_n(\sigma\hat{\delta W})\,d\sigma.$

The mesh generated by $r_\cdot(\hat{\delta W})$ and $r_\cdot(0)$ is the same provided

    $0 < S_1\frac{\mathrm{TOL}_T}{N} - r_n(\hat{\delta W}) = S_1\frac{\mathrm{TOL}_T}{N} - \Big(r_n(0) + \hat{\delta W}\int_0^1\partial_{W(t)}r_n(\sigma\hat{\delta W})\,d\sigma\Big) \equiv S_1\frac{\mathrm{TOL}_T}{N} - \big(r_n(0) + \hat{\delta W}\cdot r_n'\big)$, for all $n$,        (88)

and

    $\min_n\Big(S_1\frac{\mathrm{TOL}_T}{N} - r_n(0)\Big) \equiv \beta(\omega) > 0.$        (89)

Therefore, (88) and (89) hold if $\hat{\delta W}\cdot r_n' < \beta(\omega)$ for all $n$. Let

    $\bar\beta \equiv \beta\Big/\Big(\big(\sup_n\Delta t_n\big)^2\,C(\mathrm{TOL})\Big);$


then $|\hat{\delta W}| < \bar\beta$ implies $\hat{\delta W}\cdot r_n' < \beta$, since

    $\hat{\delta W}\cdot r_n' \le |\hat{\delta W}|\,|r_n'| < \bar\beta\,|r_n'| = \frac{\beta\,(\Delta t_n)^2\,\big|\int_0^1\partial_{W(t)}\bar\rho(t_n)(\sigma\hat{\delta W})\,d\sigma\big|}{\big(\sup_n\Delta t_n\big)^2\,C(\mathrm{TOL})} \le \beta,$

provided $\bar\rho(t_n)(\hat{\delta W}) \equiv E[\bar\rho(t_n)|\mathcal{G}(t,\hat{\delta t})]$ and $\bar\rho$ is an approximate error density function satisfying (6). Consequently, the following independence claim holds:

    $\bar X(t_n)$ is independent of $\hat{\delta W}(t)$, conditioned on $\{|\hat{\delta W}(t)| < \bar\beta\}$ and $\mathcal{G}(t,\hat{\delta t})$, for $t_n < t$,        (90)

and the conditional probability of obtaining different meshes with $r_\cdot(0)$ or $r_\cdot(\hat{\delta W}(t))$ is, for sufficiently small $\hat{\delta t}$, bounded by

    $P\big(|\hat{\delta W}| \ge \bar\beta\ \big|\ \mathcal{G}(t,\hat{\delta t})\vee\mathcal{F}_s\big) = 1 - P\big(|\hat{\delta W}| < \bar\beta\ \big|\ \mathcal{G}(t,\hat{\delta t})\vee\mathcal{F}_s\big) \le (\hat{\delta t})^3,$        (91)

since

    $P\big(|\hat{\delta W}| < \bar\beta\ \big|\ \mathcal{G}(t,\hat{\delta t})\vee\mathcal{F}_s\big) = P\big(|\hat{\delta W}| < \bar\beta\ \big|\ \mathcal{G}(t,\hat{\delta t})\big) \ge 1 - C\ell_0\exp\Big(-\frac{\bar\beta^2}{4\hat{\delta t}}\Big) > 1 - (\hat{\delta t})^3.$

Let $\mathcal{G}$ denote the $\sigma$-algebra generated by $\{dW(\tau):\ \tau < n\hat{\Delta t}$ or $\tau > n\hat{\Delta t}+\hat{\delta t}\}$ and $\{\Delta t(\tau):\ \tau < n\hat{\Delta t}\}$. Then, using the results (91) and (90), we get

    $E[\delta b\,\Delta W_n^\ell|\mathcal{F}_s] = E\big[E[\delta b\,\Delta W_n^\ell|\mathcal{G}]\,\big|\,\mathcal{F}_s\big]$
    $= E\Big[E\big[\delta b\big(\delta W_1^\ell + (1_{\{|\hat{\delta W}_1|<\bar\beta\}} + 1_{\{|\hat{\delta W}_1|\ge\bar\beta\}})\hat{\delta W}_1^\ell\big)\,\big|\,\mathcal{G}\big]\,\Big|\,\mathcal{F}_s\Big]$
    $= E\big[E[\delta b\,\delta W_1^\ell|\mathcal{G}]\,\big|\,\mathcal{F}_s\big] + E\big[E[\delta b\,1_{\{|\hat{\delta W}_1|<\bar\beta\}}\hat{\delta W}_1^\ell|\mathcal{G}]\,\big|\,\mathcal{F}_s\big] + E\big[E[\delta b\,1_{\{|\hat{\delta W}_1|\ge\bar\beta\}}\hat{\delta W}_1^\ell|\mathcal{G}]\,\big|\,\mathcal{F}_s\big].$        (92)

By (91), the last term of (92) becomes

    $E\big[E[\delta b\,1_{\{|\hat{\delta W}_1|\ge\bar\beta\}}\hat{\delta W}_1^\ell|\mathcal{G}]\,\big|\,\mathcal{F}_s\big] = E\big[\mathcal{O}\big((\hat{\delta t}_1)^3\big)\big],$        (93)


and from (90) the second term of (92) becomes

    $E\big[E[\delta b\,1_{\{|\hat{\delta W}_1|<\bar\beta\}}\hat{\delta W}_1^\ell|\mathcal{G}]\,\big|\,\mathcal{F}_s\big] = E\Big[E\big[\delta b\ \big|\ |\hat{\delta W}_1|<\bar\beta,\ \mathcal{G}\vee\mathcal{F}_s\big]\ \underbrace{E\big[1_{\{|\hat{\delta W}_1|<\bar\beta\}}\hat{\delta W}_1^\ell\ \big|\ |\hat{\delta W}_1|<\bar\beta,\ \mathcal{G}\vee\mathcal{F}_s\big]}_{=0}\ P\big(|\hat{\delta W}_1|<\bar\beta\ \big|\ \mathcal{G}\vee\mathcal{F}_s\big)\Big] = 0.$        (94)

Therefore, we obtain from (93) and (94)

    $E\big[\delta b\,\Delta W_n^\ell\,\big|\,\mathcal{F}_s\big] = E\big[E[\delta b\,\delta W_1^\ell|\mathcal{G}]\,\big|\,\mathcal{F}_s\big] + E\big[\mathcal{O}\big((\hat{\delta t}_1)^3\big)\big].$        (95)

Apply this $\hat{\delta W},\delta W$ argument recursively to $E[\delta b\,\delta W_m^\ell]$, conditioned on $\mathcal{G}\big(i\hat{\Delta t}+\sum_{j=1}^{m}\hat{\delta t}_j,\ \hat{\delta t}_{m+1}\big)$, to get

    $E\big[\delta b\,\Delta W_n^\ell\,\big|\,\mathcal{F}_s\big] = E\Big[\sum_m\mathcal{O}\big((\hat{\delta t}_m)^3\big)\Big] = E\Big[\int_{(n-1)\hat{\Delta t}}^{n\hat{\Delta t}}\mathcal{O}\big((\hat{\delta t})^2\big)\,dt\Big].$        (96)

Letting $\hat{\delta t}\to 0^+$ in (96) proves $E[\delta b\,\Delta W_n^\ell|\mathcal{F}_s] = 0$; applying the same argument to all intervals $[n\hat{\Delta t},(n+1)\hat{\Delta t})$ shows, by (87), that $E[Y_t|\mathcal{F}_s] = Y_s$ for $t\ge s$. □

To verify a.s. convergence of the error density, let us recall the definition of the variation of a process $Y$: the first variation of a function $F(Y(T))$ with respect to a perturbation in the initial location of the path $Y$, at time $s$, is denoted by

    $F'(T;s) = \partial_{y(s)}F(Y(T)) \equiv \Big(\frac{\partial}{\partial y_1}F\big(Y(T);\ Y(s)=y\big),\ \ldots,\ \frac{\partial}{\partial y_d}F\big(Y(T);\ Y(s)=y\big)\Big).$        (97)

The definition (97) implies that the first variation, $X'$, of the solution $X$ with respect to a perturbation in the initial location at time $t$ satisfies

    $dX_{ik}'(s) = \partial_j a_i(s,X(s))\,X_{jk}'(s)\,ds + \partial_j b_i^\ell(s,X(s))\,X_{jk}'(s)\,dW^\ell(s), \quad s > t,$
    $X_{ik}'(t) = \delta_{ik} = \begin{cases}0, & i\ne k,\\ 1, & i = k,\end{cases}$        (98)

and, similarly, one can derive the equations for the second and third variations of $X$; cf. Szepessy et al. [39].


The definition of $c_i$ in (11) shows that the forward Euler approximation, $\bar X'$, in (98) can be written

    $\bar X_{ik}'(t_{n+1};t_m) = \partial_j c_i(t_n,\bar X(t_n))\,\bar X_{jk}'(t_n;t_m),$
    $\bar X_{ik}'(t_m;t_m) = \delta_{ik}.$        (99)

Then the equations for $\bar\varphi$ in (10) and for $\bar X'$ in (99) imply

    $0 = \sum_{n=m}^{N-1}\big(\bar\varphi_i(t_n) - \partial_i c_j(t_n,\bar X(t_n))\,\bar\varphi_j(t_{n+1})\big)\,\bar X_{ik}'(t_n;t_m)$
    $= \sum_{n=m}^{N-1}\bar\varphi_i(t_{n+1})\big(\bar X_{ik}'(t_{n+1};t_m) - \partial_j c_i(t_n,\bar X(t_n))\,\bar X_{jk}'(t_n;t_m)\big) + \bar\varphi_i(t_m)\,\bar X_{ik}'(t_m;t_m) - \bar\varphi_i(T)\,\bar X_{ik}'(T;t_m)$
    $= \bar\varphi_i(t_m)\,\bar X_{ik}'(t_m;t_m) - \bar\varphi_i(T)\,\bar X_{ik}'(T;t_m),$

i.e.,

    $\bar\varphi_k(t_m) = \partial_i g(\bar X(T))\,\bar X_{ik}'(T;t_m),$        (100)

by using the initial conditions of $\bar\varphi$ and $\bar X'$. The definitions of the first and second variations of $\bar\varphi$ in (12) and (13) also yield

    $\bar\varphi_{kn}'(t_m) = \partial_i g(\bar X(T))\,\bar X_{ikn}''(T;t_m) + \partial_{ir}g(\bar X(T))\,\bar X_{rn}'(T;t_m)\,\bar X_{ik}'(T;t_m)$        (101)

and

    $\bar\varphi_{knm}''(t_m) = \partial_i g(\bar X(T))\,\bar X_{iknm}'''(T;t_m) + \partial_{ir}g(\bar X(T))\,\bar X_{rm}'(T;t_m)\,\bar X_{ikn}''(T;t_m) + \partial_{ir}g(\bar X(T))\,\bar X_{ik}'(T;t_m)\,\bar X_{rnm}''(T;t_m) + \partial_{ir}g(\bar X(T))\,\bar X_{ikm}''(T;t_m)\,\bar X_{rn}'(T;t_m) + \partial_{irv}g(\bar X(T))\,\bar X_{vm}'(T;t_m)\,\bar X_{ik}'(T;t_m)\,\bar X_{rn}'(T;t_m).$        (102)

Define the function $\hat\rho$ following the error density $\bar\rho$ in (9) by substituting $\bar X$ by the limit $X$ and replacing $\bar\varphi$, $\bar\varphi'$, $\bar\varphi''$ by the corresponding limits in (100)-(102), where $\bar X$ is substituted by $X$.

Now we are ready to prove the a.s. convergence of the error density:

Corollary 4.3. Suppose that the assumptions of Theorem 4.1 hold. Then, for any $\gamma < \frac12$,

    $\lim_{\mathrm{TOL}\to 0}(\Delta t_{\sup})^{-\gamma}\,|\bar\rho - \hat\rho| = 0$ a.s.        (103)

Proof. To prove the corollary, it is necessary to understand the convergence of the discrete dual solutions, $\bar\varphi$, in (10) and their first and second variations, $\bar\varphi'$, $\bar\varphi''$. Using the definitions of the variations of $X$, let us consider an augmented system for $Z = (X, X', X'', X''')^T$ and let $I$ denote the $d\times d$ identity matrix. Then (98) and the equations for $X''$ and $X'''$ can be written

    $dZ = A(t,Z)\,dt + B^\ell(t,Z)\,dW^\ell(t), \quad t > t_0,$
    $Z(t_0) = (x, I, 0, 0)^T.$        (104)

The Euler approximation $\bar Z = (\bar X, \bar X', \bar X'', \bar X''')^T$ of $Z$ in (104), with piecewise constant drift and diffusion fluxes, satisfies

    $d\bar Z = \bar A(t,\bar Z)\,dt + \bar B^\ell(t,\bar Z)\,dW^\ell,$

defined as in (75). Applying Theorem 4.1 to $\bar Z$ and the augmented system (104) shows that

    $\lim_{\mathrm{TOL}\to 0}(\Delta t_{\sup})^{-\gamma}\,|Z - \bar Z| = 0$ a.s.

Therefore, the representations (100)-(102) and the regularity assumptions on the functions $g$, $a$, and $b$ imply that $\bar\varphi$, $\bar\varphi'$, and $\bar\varphi''$ converge a.s. with the same rate. Consequently, the error density $\bar\rho$ converges a.s. with the rate $(\Delta t_{\sup})^{\gamma}$ to the error density $\hat\rho$ as the error tolerance, TOL, tends to zero, i.e., as the maximum step size tends to zero. □

4.2. Deterministic Time Steps

Lemma 4.4. Suppose that $a$, $b$, $g$, and $X$ satisfy the assumptions in Lemma 2.1. Then, for any $\alpha\in(0,1/2)$,

    $\lim_{M\to\infty} M^{\alpha}\sup_{t\in[0,T]}\big|E[\bar X(t)] - \mathcal{A}(\bar X(t);M)\big| = 0$ a.s.        (105)

Proof. The idea is to split the difference in (105) into a sum of a martingale and a bounded integral, and to use Doob's inequality for the martingale. Thus, the representation (75) shows

    $E[\bar X(t)] - \mathcal{A}(\bar X(t);M) = \underbrace{E\Big[\int_0^t\bar b^\ell(s,\bar X(s))\,dW^\ell\Big]}_{=0} - \mathcal{A}\Big(\int_0^t\bar b^\ell(s,\bar X(s))\,dW^\ell;\ M\Big) + \int_0^t\Big(E[\bar a(s,\bar X(s))] - \mathcal{A}(\bar a(s,\bar X(s));M)\Big)\,ds.$        (106)

Denote by $\{\mathcal{F}_M(t)\}_{t\ge0}$ the filtration generated by the $M$ independent Wiener processes that define the sample average $\mathcal{A}(\bar X(t);M)$, and let the $\sigma$-algebra generated by the step sizes be $\mathcal{S}(\Delta t)$. The construction of


Algorithm D implies that $\mathcal{S}(\Delta t)$ is independent of $\mathcal{F}_M(t)$. The conditional expectation process

    $Y_M(t) \equiv E\Big[\mathcal{A}\Big(\int_0^t\bar b^\ell(s,\bar X(s))\,dW^\ell;\ M\Big)\,\Big|\,\mathcal{S}(\Delta t)\Big]$

is then an $\mathcal{F}_M(t)$ martingale and Doob's inequality yields

    $P\Big(M^{\alpha}\sup_{t\in[0,T]}|Y_M(t)| \ge \varepsilon\Big) \le \frac{M^{2\alpha}\,\mathrm{Var}[Y_M(T)]}{\varepsilon^2} = \frac{M^{2\alpha}}{\varepsilon^2}\,\mathrm{Var}\Big[E\Big[\mathcal{A}\Big(\int_0^T\bar b^\ell(s,\bar X(s))\,dW^\ell(s);\ M\Big)\,\Big|\,\mathcal{S}(\Delta t)\Big]\Big] = \frac{\mathrm{Var}\big[E\big[\int_0^T\bar b^\ell(s,\bar X(s))\,dW^\ell(s)\,\big|\,\mathcal{S}(\Delta t)\big]\big]}{M^{1-2\alpha}\,\varepsilon^2} \le \frac{C_T}{M^{1-2\alpha}\,\varepsilon^2}.$

Here, $C_T$ is a constant that, due to the smoothness assumptions on the drift and diffusion coefficients, uniformly bounds the variance of $\int_0^T\bar b^\ell(s,\bar X(s))\,dW^\ell(s)$. The construction (35) shows that all $M$ belong to the set $\{2^k:\ k\in\mathbb{N}\}$. Let $M_k\equiv 2^k$; then

    $\sum_{k=1}^{\infty} P\Big(M_k^{\alpha}\sup_{t\in[0,T]}|Y_{M_k}(t)| \ge \varepsilon\Big) \le \frac{C_T}{\varepsilon^2}\sum_{k=1}^{\infty}\frac{1}{M_k^{1-2\alpha}} < \infty,$        (107)

and the Borel–Cantelli lemma implies

    $P\Big(M_k^{\alpha}\sup_{t\in[0,T]}|Y_{M_k}(t)| \ge \varepsilon\ \text{infinitely often}\Big) = 0,$        (108)

i.e., $M_k^{\alpha}\sup_{t\in[0,T]}|Y_{M_k}(t)| \to 0$ a.s., as $k\to\infty$, for $0 < \alpha < 1/2$.

Now consider the second term in (106), namely $\int_0^t\big(E[\bar a(s,\bar X(s))] - \mathcal{A}(\bar a(s,\bar X(s));M_k)\big)\,ds$. First, note that we have

    $\sup_{t\in[0,T]}\Big|\int_0^t\big(E[\bar a(s,\bar X(s))] - \mathcal{A}(\bar a(s,\bar X(s));M_k)\big)\,ds\Big| \le \int_0^T\big|E[\bar a(s,\bar X(s))] - \mathcal{A}(\bar a(s,\bar X(s));M_k)\big|\,ds,$        (109)

so that a combination of Chebyshev's inequality and the Cauchy–Schwarz inequality yields

    $P\Big(M_k^{\alpha}\sup_{t\in[0,T]}\Big|\int_0^t\big(E[\bar a(s,\bar X(s))] - \mathcal{A}(\bar a(s,\bar X(s));M_k)\big)\,ds\Big| \ge \varepsilon\Big) \le \frac{M_k^{2\alpha}}{\varepsilon^2}\,E\Big[\Big(\int_0^T\big|E[\bar a(s,\bar X(s))] - \mathcal{A}(\bar a(s,\bar X(s));M_k)\big|\,ds\Big)^2\Big]$


    $\le \frac{M_k^{2\alpha}}{\varepsilon^2}\,T\int_0^T E\big[\big|E[\bar a(s,\bar X(s))] - \mathcal{A}(\bar a(s,\bar X(s));M_k)\big|^2\big]\,ds = \frac{T}{\varepsilon^2\,M_k^{1-2\alpha}}\int_0^T\mathrm{Var}[\bar a(s,\bar X(s))]\,ds \le \frac{C_T}{\varepsilon^2\,M_k^{1-2\alpha}}.$

The analogous Borel–Cantelli arguments (107)-(108) therefore imply, for $\alpha < 1/2$,

    $M_k^{\alpha}\sup_{t\in[0,T]}\Big|\int_0^t\big(E[\bar a(s,\bar X(s))] - \mathcal{A}(\bar a(s,\bar X(s));M_k)\big)\,ds\Big| \to 0$ a.s.,        (110)

which, combined with (108), proves the lemma. □
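As a numerical illustration of the rate in Lemma 4.4, the following hypothetical Python experiment tracks $M^{\alpha}|E[\bar X(T)] - \mathcal{A}(\bar X(T);M)|$ for the martingale case $dX = X\,dW$, $X(0) = 1$ (so that $E[X(T)] = 1$ exactly), along the doubling sequence $M_k = 2^k$ used by the algorithm; by (105) the printed values should tend to zero a.s. for any $\alpha < 1/2$. This is only a sanity check under these simplifying assumptions, not one of the paper's experiments.

```python
import numpy as np

# Hypothetical check of Lemma 4.4 for dX = X dW, X(0) = 1, E[X(T)] = 1:
# the weighted error M^alpha * |E - A| should decay along M_k = 2^k.
rng = np.random.default_rng(0)
T, N, alpha = 1.0, 64, 0.4
dt = T / N
for k in range(10, 19, 2):
    M = 2 ** k
    X = np.ones(M)
    for _ in range(N):                       # uniform forward Euler steps
        X += X * rng.normal(0.0, np.sqrt(dt), size=M)
    print(k, M ** alpha * abs(1.0 - X.mean()))
```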

Theorem 4.5 (Convergence). Suppose the assumptions of Lemma 4.4 hold. Then the error density $\bar\rho$ in (16) converges a.s. to $E[\hat\rho]$, defined in Corollary 4.3, as the specified error tolerance tends to zero.

Proof. First split $E[\hat\rho] - \mathcal{A}(\bar\rho;M) = E[\hat\rho - \bar\rho] + \big(E[\bar\rho] - \mathcal{A}(\bar\rho;M)\big)$. The first term is bounded by $\mathcal{O}(\Delta t_{\sup})$, using, for the augmented system in (104), the weak convergence of order one proved for both stochastic and deterministic time steps in Theorem 2.2. The a.s. convergence of the second term follows from applying Lemma 4.4 to the augmented system in (104). □

5. NUMERICAL EXPERIMENTS

This section presents numerical results from the implementation of the adaptive algorithms described in Section 3, namely, the adaptive algorithms with deterministic time steps (Algorithm D) and with stochastic time steps (Algorithm S).

Algorithm D uses a Matlab version 6 implementation and simulates the $\ell_0$ independent Wiener processes with a pseudo-random number generator based either on the linear congruential recursion

    $u_{k+1} = (a\,u_k + c) \bmod m$        (111)

with $a = 16807$, $c = 0$ and $m = 2^{31} - 1$, or on a more advanced random number generator that uses several seeds, proposed by Marsaglia; see Moler [28].

Algorithm S uses a double precision FORTRAN 77 implementation with the same linear congruential pseudo-random number generator (111). In particular, the program applies a double precision modification of the functions ran1 and gasdev proposed in Press [37], provided with an initial seed.
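For reference, a Python sketch of the recursion (111), together with a polar Box–Muller transformation in the spirit of gasdev (the exact Numerical Recipes routine differs in its details), reads:

```python
import math

class LCG:
    """The linear congruential recursion (111) with a = 16807, c = 0
    and m = 2**31 - 1, returning uniforms on (0, 1)."""
    def __init__(self, seed=1):
        self.a, self.c, self.m = 16807, 0, 2 ** 31 - 1
        self.u = seed

    def uniform(self):
        self.u = (self.a * self.u + self.c) % self.m
        return self.u / self.m

    def normal(self):
        # Polar Box-Muller: accept points in the unit disc and map them
        # to a standard normal sample (similar in spirit to gasdev of
        # Press [37]; this is not the original routine).
        while True:
            v1 = 2.0 * self.uniform() - 1.0
            v2 = 2.0 * self.uniform() - 1.0
            s = v1 * v1 + v2 * v2
            if 0.0 < s < 1.0:
                return v1 * math.sqrt(-2.0 * math.log(s) / s)

gen = LCG(seed=1)
dW = math.sqrt(0.05) * gen.normal()   # a Wiener increment over dt = 0.05
```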

In all computations, the number of subdivisions of a refined step is $H = 2$, and the constants determining the number of realizations in (35) are $c_0 = 1.65$ and $M_{CH} = 16$.


Table 1. (Example 5.1) Numerical results using the Matlab 6 pseudo-random number generator; see Moler [28]

    Algorithm D
    Anti   M      N           1st MN        2nd MN       N*
    1      2^15   4.6 × 10^2   0.6 × 10^6    46 × 10^6    4.4 × 10^2
    0      2^17   5.0 × 10^2   0.3 × 10^6    74 × 10^6    4.4 × 10^2

We prescribe the error tolerance $\mathrm{TOL} = 0.025$ with the splitting $\mathrm{TOL}_S = \frac{2}{3}\mathrm{TOL}$ and $\mathrm{TOL}_T = \frac{1}{3}\mathrm{TOL}$, and Algorithm D uses $\mathrm{TOL}_T = \mathrm{TOL}_{TS} + \mathrm{TOL}_{TT}$, where $\mathrm{TOL}_{TS} = \frac{1}{9}\mathrm{TOL}$ and $\mathrm{TOL}_{TT} = \frac{2}{9}\mathrm{TOL}$.

Unless otherwise stated, the time adaptivity procedures in Algorithm D and Algorithm S are applied with initial uniform partitions of $[0,T]$ having $N[1] = 20$ subintervals, and the initial number of realizations is $M[1] = 64 = 2^6$. The constants for the refinement procedure are taken in accordance with Remarks 3.1 and 3.9 and the corresponding stopping results in Theorems 3.2 and 3.10. Indeed, we use $d_1 = s_1 = 2$, while the corresponding constants for the stopping criteria are $D_1 = S_1 = 8$.

In order to reduce the computational effort, we use antithetic variates (see Hammersley and Morton [18]) to reduce the variance in some computations. The results with antithetic variance reduction correspond to Anti = 1 in Tables 1–4. Due to the construction of our algorithms, the number of realizations, $M$, is a power of 2; see (35). In fact, the final number of realizations is either $M = 2^{15} = 32768$ or $M = 2^{17} = 131072$, depending on the use of variance reduction and the accuracy level, $\mathrm{TOL} = 0.025$.
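The antithetic construction can be sketched in Python as follows for a uniform-step Euler path of the diffusion part of (112); the function and parameter names are illustrative, not the paper's Matlab or Fortran implementation.

```python
import numpy as np

def antithetic_samples(M, N, dt, rng):
    """Antithetic variates for dX = X dW, X(0) = 1: each Wiener path
    (driven by +dW) is paired with its mirror (driven by -dW), and the
    pairwise averages typically have reduced variance."""
    dW = np.sqrt(dt) * rng.normal(size=(M // 2, N))
    Xp = np.ones(M // 2)
    Xm = np.ones(M // 2)
    for n in range(N):
        Xp += Xp * dW[:, n]   # path driven by +dW
        Xm -= Xm * dW[:, n]   # antithetic path driven by -dW
    return 0.5 * (Xp + Xm)    # pairwise antithetic averages

rng = np.random.default_rng(0)
Y = antithetic_samples(M=2 ** 15, N=64, dt=1.0 / 64, rng=rng)
print(Y.mean(), Y.std())      # sample average and spread
```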

In Algorithm S, the total amount of computational work is proportional to the total number of time steps performed over all realizations; the latter is shown for each computation under the table entry MN. Similarly, for Algorithm D we show for each computation the table entry 1st MN, which corresponds to the total number of time steps over all realizations of the first loop that determines the mesh, and 2nd MN, which corresponds to the total number of time steps over all realizations of the second loop with fixed mesh. We make a distinction between the amount of work in the first and the second loop, since only the first entails the computation of time discretization error approximations and, therefore, the first loop uses much more computational effort per time step.

Table 2. (Example 5.1) Similar to Table 1, but using the linear congruential pseudo-random number generator from (111) to simulate the Wiener increments

    Algorithm D
    Anti   M      N           1st MN         2nd MN       N*
    1      2^15   5.0 × 10^2   0.6 × 10^6     50 × 10^6    4.9 × 10^2
    0      2^17   5.7 × 10^2   0.35 × 10^6    85 × 10^6    4.5 × 10^2


Table 3. (Example 5.1) For each realization, the time adaptivity procedure is started with the same uniform mesh of 20 time steps. This computation uses antithetic variates, and the estimate for the expected value of the optimal number of time steps is in this case 2.1 × 10^2

    Algorithm S
    Anti   M      A(N;M)       S(N;M)    MN
    1      2^15   2.3 × 10^2   4 × 10    175 × 10^6

For each computation with Algorithm D, we show the final number of time steps, $N$, as well as an estimate, $N^*$, of the optimal number of time steps that would be required to achieve the desired accuracy. The optimal number of time steps is based on the approximate error density defined in (16) and (23), and it is given by $N^* = \big(\int_0^T\sqrt{|\bar\rho(\tau)|}\,d\tau\big)^2/\mathrm{TOL}_T$.

Similarly, for each computation with Algorithm S, we show the sample average, $\mathcal{A}(N;M)$, of the final number of time steps and its sample standard deviation, $\mathcal{S}(N;M)$. We use the estimate $\big(\mathcal{A}\big(\int_0^T\sqrt{|\bar\rho(\tau)|}\,d\tau;\ M\big)\big)^2/\mathrm{TOL}_T$ for the optimal expected number of time steps that would be required to achieve the desired accuracy, based on the approximate error density defined in (9) and (23).

Example 5.1. Let us consider a real constant $\alpha\in(0,T)$ and the linear stochastic differential equation

    $dX(t) = X(t)\,dW(t)$, for $t\in[0,\alpha]$,
    $dX(t) = \dfrac{X(t)\,dt}{2\sqrt{t-\alpha}} + X(t)\,dW(t)$, for $t\in(\alpha,T]$,
    $X(0) = 1,$        (112)

Table 4. (Example 5.1) For each realization, the time adaptivity procedure is started with the optimal mesh from the previous realization. This computation uses both antithetic variates and the merging procedure (see Remark 3.8). The estimate for the expected value of the optimal number of time steps is in this case 2.1 × 10^2

    Algorithm S
    Anti   M      A(N;M)       S(N;M)       MN
    1      2^15   5.8 × 10^2   1.5 × 10^2   56 × 10^6


with the unique solution

    $X(t) = \exp\big(W(t) - \tfrac12 t\big)$, for $t\in[0,\alpha]$,
    $X(t) = \exp\big(W(t) - \tfrac12 t\big)\exp\big(\sqrt{t-\alpha}\big)$, for $t\in(\alpha,T]$.

Here we choose $T = 1$ and $\alpha = T/3$. Our goal is to approximate $E[X(T)] = \exp(\sqrt{T-\alpha})$. To avoid evaluating arbitrarily large values of the drift in (112), we modify the drift to

    $a(t,x) = 0$, for $t\in[0,\alpha]$,
    $a(t,x) = \dfrac{x}{2\sqrt{t-\alpha} + \mathrm{TOL}^4}$, for $t\in(\alpha,T]$,

yielding a higher order perturbation, $\mathcal{O}(\mathrm{TOL}^2)$, in the computed result and in the size of the optimal time steps. Due to the time discontinuity of the drift function, and to ensure optimal convergence of the adaptive algorithms, we modify the Euler method to

    $\bar X_{n+1} - \bar X_n = a(\tilde t,\bar X_n)\,\Delta t_n + \bar X_n\,\Delta W_n, \quad n = 0,\ldots,$

where we choose the stochastic evaluation time $\tilde t\in\{t_n, t_{n+1}\}$ by

    $|a(\tilde t,\bar X_n)| = \max\big(|a(t_n,\bar X_n)|,\ |a(t_{n+1},\bar X_n)|\big).$

Observe that the use of $\tilde t$ does not change the adapted nature of the Euler method.
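One step of this modified Euler method might be sketched in Python as follows; the names are illustrative, and the regularized drift matches the formula above.

```python
import math

def drift(t, x, alpha, TOL):
    """Modified drift of Example 5.1: zero before the singular time
    alpha and regularized by TOL**4 afterwards."""
    if t <= alpha:
        return 0.0
    return x / (2.0 * math.sqrt(t - alpha) + TOL ** 4)

def euler_step(X, t_n, t_next, dW, alpha, TOL):
    """Evaluate the drift at whichever endpoint of [t_n, t_next]
    maximizes |a|, as in the modified Euler method above."""
    a_left = drift(t_n, X, alpha, TOL)
    a_right = drift(t_next, X, alpha, TOL)
    a = a_left if abs(a_left) >= abs(a_right) else a_right
    return X + a * (t_next - t_n) + X * dW
```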

We compare the results of Algorithm D and Algorithm S in Tables 1, 2, 3, and 4. First, Tables 1 and 2 show results from Algorithm D, the only difference being the choice of the pseudo-random number generator: the results in Table 1 use random numbers based on Moler [28], while those in Table 2 are based on the linear congruential random number generator (111). As expected, the difference between these results is comparable to the effect of taking different initial seeds in the computations.

Next, Tables 3 and 4 show results from Algorithm S. Table 3 shows results when the time adaptivity procedure is started with the same mesh for each realization, namely a uniform mesh with 20 time steps, whereas for the results of Table 4 the time adaptivity procedure is started with the optimal mesh from the previous realization, yielding less computational work since the drift singularity occurs at a deterministic time. Table 4 shows a situation where merging of the time steps is useful (see Remark 3.8), namely, when using the stochastic time stepping algorithm and the optimal mesh from the previous realization as a


Figure 2. (Example 5.1) Computation with variance reduction and the merging procedure. Final mesh function for Algorithm D and TOL = 0.025.

starting guess for the next realization. In this case the merging procedure avoids an excessive increment of the average number of time steps.

In this example, Algorithm D and Algorithm S select the final number of time steps in a similar way, since there is no remarkable influence from the stochastic term $X(t)\,dW(t)$ on the dynamics of (112). Figure 2 shows the final mesh function obtained by Algorithm D, including merging, which essentially detects the location of the singular point $\alpha$.

Figure 3. (Example 5.1) Computations with uniform time steps and variance reduction. The computational order of convergence for the time discretization is 0.49, and the estimated number of time steps to achieve the accuracy $\mathrm{TOL}_{TT}\approx 5.6\times10^{-3}$ becomes $1.7\times10^5$. The computational time discretization error is given by $|\mathcal{A}(g(\bar X(T)) - g(X(T));\,2^{15})|$.


Table 5. (Example 5.2) The computation uses antithetic variates, and the estimate for the expected value of the optimal number of time steps is in this case 1.7 × 10^2

    Algorithm S
    M      A(N;M)       S(N;M)   MN
    2^15   1.9 × 10^2   5 × 10   146 × 10^6

Besides this, Figure 3 shows that both Algorithm D and Algorithm S have a remarkable advantage over a computation with uniform time steps, which needs $1.7\times10^5$ time steps to achieve the given level of accuracy $\mathrm{TOL}_{TT}\approx5.6\times10^{-3}$, compared to on average fewer than 600 final adaptive steps. However, there is a noticeable difference in the total number of operations over all levels, and the deterministic time stepping algorithm has a clear advantage. Table 1 shows that variance reduction is useful in the second loop of the deterministic time stepping algorithm, where the statistical error to be controlled has an associated large variance.

Figure 4. (Example 5.2) Computations with uniform time steps and variance reduction. The order of convergence for the time discretization is 0.49, and the estimated number of time steps to achieve the accuracy $\mathrm{TOL}_{TT}\approx5.6\times10^{-3}$ becomes $1.3\times10^5$. The computational time discretization error is given by $|\mathcal{A}(g(\bar X(T)) - g(X(T));\,2^{15})|$.


Example 5.2. Now we change (112), taking $\alpha$ to be no longer a constant but a uniformly distributed random variable, independent of the Wiener process; i.e., $\alpha\sim U(\underline\alpha,\overline\alpha)$ with $\underline\alpha = 1/2^2$ and $\overline\alpha = 1 - 1/2^2$, for $T = 1$. The conditional expectation $E[X(T)\,|\,\alpha] = \exp(\sqrt{T-\alpha})$ can be used to compute

    $E[X(T)] = \frac{1}{\overline\alpha - \underline\alpha}\int_{\underline\alpha}^{\overline\alpha}\exp\big(\sqrt{T-\alpha}\big)\,d\alpha,$

which is now the functional to approximate. Since the position of the drift singularity is stochastic, the deterministic time stepping Algorithm D gives the approximation error $\mathcal{O}(\sqrt{\Delta t_{\sup}})$, so that $N_D\sim\mathrm{TOL}_T^{-2}$, and Algorithm D behaves like a uniform time stepping algorithm, which needs $1.3\times10^5$ time steps to achieve the given level of accuracy $\mathrm{TOL}_{TT}\approx5.6\times10^{-3}$. Thus, Algorithm S has a clear advantage in this example, with asymptotic accuracy $\mathcal{O}(\Delta t_{\sup})$, so that $E[N_S]\sim\mathrm{TOL}_T^{-1}$, using on average fewer than 200 final time steps, as Table 5 and Figure 4 show.
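The reference value $E[X(T)]$ above can be checked by a simple quadrature; a sketch, assuming $T = 1$ and $\alpha\sim U(1/4,\,3/4)$ as reconstructed above:

```python
import numpy as np

# Midpoint-rule average of the conditional expectation exp(sqrt(T - a))
# over a ~ U(1/4, 3/4), i.e. the functional approximated in Example 5.2.
T, a_lo, a_hi, n = 1.0, 0.25, 0.75, 10 ** 6
a = a_lo + (np.arange(n) + 0.5) * (a_hi - a_lo) / n
print(np.exp(np.sqrt(T - a)).mean())   # approximates E[X(T)]
```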

ACKNOWLEDGMENT

This work is supported by the Swedish Research Council grants 2002-6285 and 2002-4961, UdelaR and UdeM in Uruguay, and the European network HYKE, funded by the EC as contract HPRN-CT-2002-00282.

REFERENCES

1. Abdullaev, F. Kh., J. C. Bronski, and G. Papanicolaou. 2000. Soliton perturbations and the random Kepler problem. Phys. D 135:369–386.
2. Ainsworth, M., and J. T. Oden. 1997. A posteriori error estimation in finite element analysis. Comput. Methods Appl. Mech. Engrg. 1–42.
3. Babuska, I., and W. C. Rheinboldt. 1978. Error estimates for adaptive finite element computations. SIAM J. Numer. Anal. 15:736–754.
4. Babuska, I., and M. Vogelius. 1984. Feedback and adaptive finite element solution of one-dimensional boundary value problems. Numer. Math. 44(1):75–102.
5. Becker, R., and R. Rannacher. 1996. A feed-back approach to error control in finite element methods: basic analysis and examples. East-West J. Numer. Math. 4:237–264.
6. Becker, R., and R. Rannacher. 2001. An optimal control approach to a posteriori error estimation in finite element methods. Acta Numerica 1–102.
7. Binev, P., W. Dahmen, and R. DeVore. 2004. Adaptive finite element methods with convergence rates. Numer. Math. 97(2):219–268.
8. Björk, T., A. Szepessy, R. Tempone, and G. Zouraris. 2002. Monte Carlo Euler approximation of HJM term structure financial models, preprint.
9. Buchmann, F., and W. Petersen. 2002. Solving Dirichlet problems numerically using the Feynman-Kac representation. Research report no. 2002-01, ETH.


10. Cohen, A., W. Dahmen, and R. DeVore. 2000. Adaptive wavelet methods for elliptic operator equations: convergence rates. Math. Comp. 70(233):27–75.
11. Dahmen, W. 1997. Wavelet and multiscale methods for operator equations. Acta Numerica 6:55–228.
12. DeVore, R. A. 1998. Nonlinear approximation. Acta Numerica 51–150.
13. Durrett, R. 1996. Probability: Theory and Examples. Belmont, CA: Duxbury Press.
14. Dzougotov, A., K.-S. Moon, E. von Schwerin, A. Szepessy, and R. Tempone. 2005. Adaptive Monte Carlo algorithms for stopped diffusion. Lecture Notes in Computational Science and Engineering. Heidelberg: Springer Verlag.

15. Eriksson, K., D. Estep, P. Hansbo, and C. Johnson. 1995. Introduction to adaptive methods for differential equations. Acta Numerica 105–158.
16. Fishman, G. S. 1996. Monte Carlo: Concepts, Algorithms, Applications. New York, NY: Springer-Verlag.
17. Glasserman, P. 2003. Monte Carlo Methods in Financial Engineering. New York, NY: Springer-Verlag.
18. Hammersley, J. M., and K. W. Morton. 1956. A new Monte Carlo technique: Antithetic variates. Proc. Cambridge Philos. Soc. 52:449–475.
19. Hofmann, N., P. Müller-Gronbach, and K. Ritter. 2000. Optimal approximation of stochastic differential equations by adaptive step-size control. Math. Comp. 69:1017–1034.
20. Hofmann, N., P. Müller-Gronbach, and K. Ritter. 2001. The optimal discretization of stochastic differential equations. J. Complexity 17:117–153.
21. Hairer, E., S. P. Nørsett, and G. Wanner. 1993. Solving Ordinary Differential Equations. New York, NY: Springer-Verlag.
22. Johnson, C., and A. Szepessy. 1995. Adaptive finite element methods for conservation laws based on a posteriori error estimates. Comm. Pure Appl. Math. 48:199–234.
23. Jouini, C., J. Cvitanic, and M. Musiela. 2001. Option Pricing, Interest Rates and Risk Management. New York, NY: Cambridge University Press.
24. Karatzas, I., and S. E. Shreve. 1998. Brownian Motion and Stochastic Calculus, Graduate Texts in Mathematics 113. New York, NY: Springer-Verlag.
25. Kloeden, P. E., and E. Platen. 1992. Numerical Solution of Stochastic Differential Equations. Applications of Mathematics 23.
26. Majda, A., I. Timofeyev, and E. Vanden Eijnden. 2001. A mathematical framework for stochastic climate models. Comm. Pure Appl. Math. 54:891–974.
27. Milstein, G. N. 1995. Numerical Integration of Stochastic Differential Equations. Mathematics and its Applications 313.
28. Moler, C. 1995. Random thoughts (http://www.mathworks.com/company/newsletter/pdf/Cleve.pdf).

29. Moon, K.-S. 2001. Convergence Rates of Adaptive Algorithms for Deterministic and Stochastic Differential Equations. Licentiate thesis, ISBN 91-7283-196-0, Royal Institute of Technology. (http://www.nada.kth.se/∼moon/paper.html).
30. Moon, K.-S., A. Szepessy, R. Tempone, and G. E. Zouraris. 2001. Hyperbolic differential equations and adaptive numerics. In Theory and


Numerics of Differential Equations, Durham 2000, eds. J. F. Blowey, J. P. Coleman, and A. W. Craig. Universitext, Springer: Berlin, 231–280.

31. Moon, K.-S., A. Szepessy, R. Tempone, and G. E. Zouraris. 2003. A variational principle for adaptive approximation of ordinary differential equations. Numer. Math. 93:131–152.
32. Moon, K.-S., A. Szepessy, R. Tempone, and G. E. Zouraris. 2003. Convergence rates for adaptive approximation of ordinary differential equations. Numer. Math. 93:99–129.
33. Morin, P., R. H. Nochetto, and K. G. Siebert. 2000. Data oscillation and convergence of adaptive FEM. SIAM J. Numer. Anal. 38(2):466–488.
34. Müller-Gronbach, T. 2002. The optimal uniform approximation of systems of stochastic differential equations. Ann. Appl. Probab. 12(2):664–690.
35. Nualart, D. 1995. The Malliavin Calculus and Related Topics. Probability and its Applications. New York: Springer-Verlag.
36. Øksendal, B. 1998. Stochastic Differential Equations, Fifth edition. Berlin: Springer-Verlag.
37. Press, W. H. 1992. Numerical Recipes in FORTRAN: The Art of Scientific Computing, 2nd edition. Cambridge, UK: Cambridge University Press.
38. Stevenson, R. 2003. An Optimal Adaptive Finite Element Method. Preprint no. 1271, Univ. of Utrecht.
39. Szepessy, A., R. Tempone, and G. E. Zouraris. 2001. Adaptive weak approximation of stochastic differential equations. Comm. Pure Appl. Math. 54(10):1169–1214.
40. Talay, D. 1995. Simulation of stochastic differential systems. In Probabilistic Methods in Applied Physics, eds. P. Krée and W. Wedig, 54–96. Lecture Notes in Physics 451. Berlin: Springer.
41. Talay, D., and L. Tubaro. 1990. Expansion of the global error for numerical schemes solving stochastic differential equations. Stochastic Anal. Appl. 8:483–509.
