Part III — Stochastic Calculus and Applications

Based on lectures by R. Bauerschmidt
Notes taken by Dexter Chua

Lent 2018

These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures. They are nowhere near accurate representations of what was actually lectured, and in particular, all errors are almost surely mine.

– Brownian motion. Existence and sample path properties.

– Stochastic calculus for continuous processes. Martingales, local martingales, semi-martingales, quadratic variation and cross-variation, Itô's isometry, definition of the stochastic integral, Kunita–Watanabe theorem, and Itô's formula.

– Applications to Brownian motion and martingales. Lévy characterization of Brownian motion, Dubins–Schwartz theorem, martingale representation, Girsanov theorem, conformal invariance of planar Brownian motion, and Dirichlet problems.

– Stochastic differential equations. Strong and weak solutions, notions of existence and uniqueness, Yamada–Watanabe theorem, strong Markov property, and relation to second order partial differential equations.

Pre-requisites

Knowledge of measure theoretic probability as taught in Part III Advanced Probability will be assumed, in particular familiarity with discrete-time martingales and Brownian motion.


Contents

0 Introduction

1 The Lebesgue–Stieltjes integral

2 Semi-martingales
  2.1 Finite variation processes
  2.2 Local martingale
  2.3 Square integrable martingales
  2.4 Quadratic variation
  2.5 Covariation
  2.6 Semi-martingale

3 The stochastic integral
  3.1 Simple processes
  3.2 Itô isometry
  3.3 Extension to local martingales
  3.4 Extension to semi-martingales
  3.5 Itô formula
  3.6 The Lévy characterization
  3.7 Girsanov's theorem

4 Stochastic differential equations
  4.1 Existence and uniqueness of solutions
  4.2 Examples of stochastic differential equations
  4.3 Representations of solutions to PDEs

Index


0 Introduction

Ordinary differential equations are central in analysis. The simplest class of equations tends to look like
$$\dot x(t) = F(x(t)).$$
Stochastic differential equations are differential equations where we make the function F "random". There are many ways of doing so, and the simplest way is to write it as
$$\dot x(t) = F(x(t)) + \eta(t),$$
where η is a random function. For example, when modeling noisy physical systems, our physical bodies will be subject to random noise. What should we expect the function η to be like? We might expect that for |t − s| > 0, the variables η(t) and η(s) are "essentially" independent. If we are interested in physical systems, then this is a rather reasonable assumption, since random noise is random!

In practice, we work with the idealization, where we claim that η(t) and η(s) are independent for t ≠ s. Such an η exists, and is known as white noise. However, it is not a function, but just a Schwartz distribution.

To understand the simplest case, we set F = 0. We then have the equation
$$\dot x = \eta.$$
We can write this in integral form as
$$x(t) = x(0) + \int_0^t \eta(s)\, ds.$$
To make sense of this integral, the function η should at least be a signed measure. Unfortunately, white noise isn't. This is bad news.

We ignore this issue for a little bit, and proceed as if it made sense. If the equation held, then for any $0 = t_0 < t_1 < \cdots$, the increments
$$x(t_i) - x(t_{i-1}) = \int_{t_{i-1}}^{t_i} \eta(s)\, ds$$
should be independent, and moreover their variance should scale linearly with $|t_i - t_{i-1}|$. So maybe this x should be a Brownian motion!

Formalizing these ideas will take up a large portion of the course, and the work isn't always pleasant. Then why should we be interested in this continuous problem, as opposed to what we obtain when we discretize time? It turns out in some sense the continuous problem is easier. When we learn measure theory, there is a lot of work put into constructing the Lebesgue measure, as opposed to the sum, which we can just define. However, what we end up with is much easier: it's easier to integrate $\frac{1}{x^3}$ than to sum $\sum_{n=1}^\infty \frac{1}{n^3}$. Similarly, once we have set up the machinery of stochastic calculus, we have a powerful tool to do explicit computations, which is usually harder in the discrete world.

Another reason to study stochastic calculus is that a lot of continuous time processes can be described as solutions to stochastic differential equations. Compare this with the fact that functions such as trigonometric and Bessel functions are described as solutions to ordinary differential equations!


There are two ways to approach stochastic calculus, namely via the Itô integral and the Stratonovich integral. We will mostly focus on the Itô integral, which is more useful for our purposes. In particular, the Itô integral tends to give us martingales, which is useful.

To give a flavour of the construction of the Itô integral, we consider a simpler scenario of the Wiener integral.

Definition (Gaussian space). Let (Ω, F, P) be a probability space. Then a subspace S ⊆ L2(Ω, F, P) is called a Gaussian space if it is a closed linear subspace and every X ∈ S is a centered Gaussian random variable.

An important construction is

Proposition. Let H be any separable Hilbert space. Then there is a probability space (Ω, F, P) with a Gaussian subspace S ⊆ L2(Ω, F, P) and an isometry I : H → S. In other words, for any f ∈ H, there is a corresponding random variable $I(f) \sim N(0, (f, f)_H)$. Moreover, I(αf + βg) = αI(f) + βI(g) and $(f, g)_H = E[I(f)I(g)]$.

Proof. By separability, we can pick a Hilbert space basis $(e_i)_{i=1}^\infty$ of H. Let (Ω, F, P) be any probability space that carries an infinite independent sequence of standard Gaussian random variables $X_i \sim N(0, 1)$. Then send $e_i$ to $X_i$, extend by linearity and continuity, and take S to be the image.

In particular, we can take H = L2(R+).

Definition (Gaussian white noise). A Gaussian white noise on R+ is an isometry WN from L2(R+) into some Gaussian space. For A ⊆ R+, we write WN(A) = WN(1_A).

Proposition.

– For A ⊆ R+ with |A| < ∞, WN(A) ∼ N(0, |A|).

– For disjoint A, B ⊆ R+, the variables WN(A) and WN(B) are independent.

– If $A = \bigcup_{i=1}^\infty A_i$ for disjoint sets $A_i \subseteq \mathbb{R}_+$, with |A| < ∞, $|A_i| < \infty$, then
$$WN(A) = \sum_{i=1}^\infty WN(A_i) \quad \text{in } L^2 \text{ and a.s.}$$

Proof. Only the last point requires proof. Observe that the partial sum
$$M_n = \sum_{i=1}^n WN(A_i)$$
is a martingale, and is bounded in L2 as well, since
$$E M_n^2 = \sum_{i=1}^n E\, WN(A_i)^2 = \sum_{i=1}^n |A_i| \le |A|.$$
So we are done by the martingale convergence theorem. The limit is indeed WN(A) because $1_A = \sum_{i=1}^\infty 1_{A_i}$.


The point of the proposition is that WN really looks like a random measure on R+, except it is not. We only have convergence almost surely above, which means we have convergence on a set of measure 1. However, the set depends on which A and $A_i$ we pick. For things to actually work out well, we must have a fixed set of measure 1 for which convergence holds for all A and $A_i$.

But perhaps we can ignore this problem, and try to proceed. We define
$$B_t = WN([0, t])$$
for t ≥ 0.

Exercise. This $B_t$ is a standard Brownian motion, except for the continuity requirement. In other words, for any $t_1, t_2, \ldots, t_n$, the vector $(B_{t_i})_{i=1}^n$ is jointly Gaussian with
$$E[B_s B_t] = s \wedge t \quad \text{for } s, t \ge 0.$$
Moreover, $B_0 = 0$ a.s. and $B_t - B_s$ is independent of $\sigma(B_r : r \le s)$. Moreover, $B_t - B_s \sim N(0, t - s)$ for t ≥ s.

In fact, by picking a good basis of L2(R+), we can make $B_t$ continuous.

We can now try to define some stochastic integral. If f ∈ L2(R+) is a step function,
$$f = \sum_{i=1}^n f_i 1_{[s_i, t_i]}$$
with $s_i < t_i$, then
$$WN(f) = \sum_{i=1}^n f_i (B_{t_i} - B_{s_i}).$$
This motivates the notation
$$WN(f) = \int f(s)\, dB_s.$$
However, extending this to a function that is not a step function would be problematic.
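To see the last two displays in action, here is a minimal numerical sketch (not part of the lectures; the step function and grid are my own choices): it builds a discretized Brownian path, evaluates $WN(f) = \sum_i f_i(B_{t_i} - B_{s_i})$ for a step function f, and checks by Monte Carlo that the variance is close to $\|f\|_{L^2(\mathbb{R}_+)}^2$, as the isometry property predicts.

```python
import numpy as np

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 2.0, 2001)            # time grid on [0, 2]
dt = grid[1] - grid[0]

def brownian_path():
    """Brownian motion sampled on `grid`, built from independent N(0, dt) increments."""
    increments = rng.normal(0.0, np.sqrt(dt), size=len(grid) - 1)
    return np.concatenate([[0.0], np.cumsum(increments)])

# Step function f = sum_i f_i 1_{[s_i, t_i]} with disjoint steps (f_i, s_i, t_i).
steps = [(1.5, 0.0, 0.5), (-2.0, 0.5, 1.2), (0.7, 1.2, 2.0)]

def wiener_integral_step(B):
    """WN(f) = sum_i f_i (B_{t_i} - B_{s_i}) for the step function above."""
    idx = lambda t: int(round(t / dt))
    return sum(f_i * (B[idx(t_i)] - B[idx(s_i)]) for f_i, s_i, t_i in steps)

samples = np.array([wiener_integral_step(brownian_path()) for _ in range(5000)])
l2_norm_sq = sum(f_i ** 2 * (t_i - s_i) for f_i, s_i, t_i in steps)
print("sample mean    :", samples.mean())     # approximately 0
print("sample variance:", samples.var())      # approximately ||f||^2
print("||f||_{L^2}^2  :", l2_norm_sq)
```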


1 The Lebesgue–Stieltjes integral

In calculus, we are able to perform integrals more exciting than simply $\int_0^1 h(x)\, dx$. In particular, if $h, a : [0, 1] \to \mathbb{R}$ are $C^1$ functions, we can perform integrals of the form
$$\int_0^1 h(x)\, da(x).$$
For them, it is easy to make sense of what this means: it's simply
$$\int_0^1 h(x)\, da = \int_0^1 h(x) a'(x)\, dx.$$
In our world, we wouldn't expect our functions to be differentiable, so this is not a great definition. One reasonable strategy to make sense of this is to come up with a measure that should equal "da".

An immediate difficulty we encounter is that a'(x) need not be positive all the time. So for example, $\int_0^1 1\, da$ could be a negative number, which one wouldn't expect for a usual measure! Thus, we are naturally led to think about signed measures.

From now on, we always use the Borel σ-algebra on [0, T] unless otherwise specified.

Definition (Signed measure). A signed measure on [0, T] is a difference µ = µ+ − µ− of two positive measures on [0, T] of disjoint support. The decomposition µ = µ+ − µ− is called the Hahn decomposition.

In general, given two measures µ1 and µ2 with not necessarily disjoint supports, we may still want to talk about µ1 − µ2.

Theorem. For any two finite measures µ1, µ2, there is a signed measure µ with µ(A) = µ1(A) − µ2(A).

If µ1 and µ2 are given by densities f1, f2, then we can simply decompose µ as $(f_1 - f_2)^+\, dt - (f_1 - f_2)^-\, dt$, where + and − denote the positive and negative parts respectively. In general, they need not be given by densities with respect to dx, but they are always given by densities with respect to some other measure.

Proof. Let ν = µ1 + µ2. By Radon–Nikodym, there are positive functions f1, f2 such that $\mu_i(dt) = f_i(t)\nu(dt)$. Then
$$(\mu_1 - \mu_2)(dt) = (f_1 - f_2)^+(t)\, \nu(dt) - (f_1 - f_2)^-(t)\, \nu(dt).$$

Definition (Total variation). The total variation of a signed measure µ = µ+ − µ− is |µ| = µ+ + µ−.

We now want to figure out how we can go from a function to a signed measure. Let's think about how one would attempt to define $\int_0^1 f(x)\, dg$ as a Riemann sum. A natural option would be to write something like
$$\int_0^t h(s)\, da(s) = \lim_{m\to\infty} \sum_{i=1}^{n_m} h(t^{(m)}_{i-1}) \big( a(t^{(m)}_i) - a(t^{(m)}_{i-1}) \big)$$


for any sequence of subdivisions $0 = t^{(m)}_0 < \cdots < t^{(m)}_{n_m} = t$ of [0, t] with $\max_i |t^{(m)}_i - t^{(m)}_{i-1}| \to 0$.

In particular, since we want the integral of h = 1 to be well-behaved, the sum $\sum_i \big( a(t^{(m)}_i) - a(t^{(m)}_{i-1}) \big)$ must be well-behaved. This leads to the notion of

Definition (Total variation). The total variation of a function a : [0, T] → R is
$$V_a(t) = |a(0)| + \sup\left\{ \sum_{i=1}^n |a(t_i) - a(t_{i-1})| : 0 = t_0 < t_1 < \cdots < t_n = t \right\}.$$
We say a has bounded variation if $V_a(T) < \infty$. In this case, we write a ∈ BV.

We include the |a(0)| term because we want to pretend a is defined on all of R with a(t) = 0 for t < 0.

We also define

Definition (Cadlag). A function a : [0, T] → R is cadlag if it is right-continuous and has left-limits.

The following theorem is then clear:

Theorem. There is a bijection
$$\{\text{signed measures on } [0, T]\} \longleftrightarrow \{\text{cadlag functions of bounded variation } a : [0, T] \to \mathbb{R}\}$$
that sends a signed measure µ to a(t) = µ([0, t]). To construct the inverse, given a, we define
$$a_\pm = \tfrac{1}{2}(V_a \pm a).$$
Then $a_\pm$ are both positive, and $a = a_+ - a_-$. We can then define $\mu_\pm$ by
$$\mu_\pm[0, t] = a_\pm(t) - a_\pm(0), \qquad \mu = \mu_+ - \mu_-.$$
Moreover, $V_a(t) = |\mu|([0, t])$.

Example. Let a : [0, 1] → R be given by
$$a(t) = \begin{cases} 1 & t < \frac{1}{2} \\ 0 & t \ge \frac{1}{2} \end{cases}.$$
This is cadlag, and its total variation is $V_a(1) = 2$. The associated signed measure is
$$\mu = \delta_0 - \delta_{1/2},$$
and the total variation measure is
$$|\mu| = \delta_0 + \delta_{1/2}.$$
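As a quick sanity check on this example (a numerical sketch, not from the lectures), the code below evaluates $|a(0)| + \sum_i |a(t_i) - a(t_{i-1})|$ over uniform partitions, recovering $V_a(1) = 2$, and integrates a test function h against the atomic measure $\mu = \delta_0 - \delta_{1/2}$ over [0, 1] directly.

```python
import numpy as np

def a(t):
    """The cadlag function a(t) = 1 for t < 1/2 and 0 for t >= 1/2."""
    return np.where(np.asarray(t) < 0.5, 1.0, 0.0)

def total_variation(n):
    """|a(0)| + sum of |increments| of a over a uniform partition of [0, 1] with n steps."""
    ts = np.linspace(0.0, 1.0, n + 1)
    vals = a(ts)
    return abs(vals[0]) + float(np.sum(np.abs(np.diff(vals))))

for n in (4, 16, 256):
    print(f"partition with {n} steps: V_a(1) = {total_variation(n)}")   # equals 2

# Integrating a (hypothetical) test function h against mu = delta_0 - delta_{1/2} over [0, 1]:
h = lambda t: t ** 2 + 1.0
print("int_[0,1] h dmu =", h(0.0) - h(0.5))
```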

We are now ready to define the Lebesgue–Stieltjes integral.


Definition (Lebesgue–Stieltjes integral). Let a : [0, T] → R be cadlag of bounded variation and let µ be the associated signed measure. Then for $h \in L^1([0, T], |\mu|)$, the Lebesgue–Stieltjes integral is defined by
$$\int_s^t h(r)\, da(r) = \int_{(s,t]} h(r)\, \mu(dr),$$
where 0 ≤ s ≤ t ≤ T, and
$$\int_s^t h(r)\, |da(r)| = \int_{(s,t]} h(r)\, |\mu|(dr).$$
We also write
$$h \cdot a(t) = \int_0^t h(r)\, da(r).$$

To let T = ∞, we need the following notation:

Definition (Finite variation). A cadlag function a : [0, ∞) → R is of finite variation if $a|_{[0,T]} \in BV[0, T]$ for all T > 0.

Fact. Let a : [0, T] → R be cadlag and BV, and h ∈ L1([0, T], |da|). Then
$$\left| \int_0^T h(s)\, da(s) \right| \le \int_0^T |h(s)|\, |da(s)|,$$
and the function h · a : [0, T] → R is cadlag and BV with associated signed measure h(s) da(s). Moreover, |h(s) da(s)| = |h(s)| |da(s)|.

We can, unsurprisingly, characterize the Lebesgue–Stieltjes integral by a Riemann sum:

Proposition. Let a be cadlag and BV on [0, t], and h bounded and left-continuous. Then
$$\int_0^t h(s)\, da(s) = \lim_{m\to\infty} \sum_{i=1}^{n_m} h(t^{(m)}_{i-1}) \big( a(t^{(m)}_i) - a(t^{(m)}_{i-1}) \big),$$
$$\int_0^t h(s)\, |da(s)| = \lim_{m\to\infty} \sum_{i=1}^{n_m} h(t^{(m)}_{i-1}) \big| a(t^{(m)}_i) - a(t^{(m)}_{i-1}) \big|$$
for any sequence of subdivisions $0 = t^{(m)}_0 < \cdots < t^{(m)}_{n_m} = t$ of [0, t] with $\max_i |t^{(m)}_i - t^{(m)}_{i-1}| \to 0$.

Proof. We approximate h by $h_m$ defined by
$$h_m(0) = 0, \qquad h_m(s) = h(t^{(m)}_{i-1}) \text{ for } s \in (t^{(m)}_{i-1}, t^{(m)}_i].$$
Then by left-continuity, we have $h(s) = \lim_{m\to\infty} h_m(s)$, and moreover
$$\lim_{m\to\infty} \sum_{i=1}^{n_m} h(t^{(m)}_{i-1}) \big( a(t^{(m)}_i) - a(t^{(m)}_{i-1}) \big) = \lim_{m\to\infty} \int_{(0,t]} h_m(s)\, \mu(ds) = \int_{(0,t]} h(s)\, \mu(ds)$$
by the dominated convergence theorem. The statement about |da(s)| is left as an exercise.
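The Riemann-sum characterization is easy to watch numerically. The sketch below (my own choice of h and a, with a given in closed form so the exact answer is known) approximates $\int_0^1 h(s)\, da(s)$ by left-point sums for $a(t) = t^2 + 1_{t \ge 1/2}$, i.e. $da = 2t\, dt + \delta_{1/2}$.

```python
import numpy as np

# Integrand h and a cadlag BV integrator a with an absolutely continuous part and a jump.
h = np.cos                                     # bounded and continuous (hence left-continuous)
a = lambda t: t ** 2 + (t >= 0.5)              # da = 2t dt + delta_{1/2}

def riemann_stieltjes(h, a, t, n):
    """Left-point approximation sum_i h(t_{i-1}) (a(t_i) - a(t_{i-1})) on [0, t]."""
    ts = np.linspace(0.0, t, n + 1)
    return float(np.sum(h(ts[:-1]) * np.diff(a(ts))))

# Exact value: int_0^1 2s cos(s) ds + cos(1/2) = 2(cos 1 + sin 1 - 1) + cos(1/2).
exact = 2 * (np.cos(1.0) + np.sin(1.0) - 1.0) + np.cos(0.5)
for n in (10, 100, 1000, 10000):
    print(f"n = {n:6d}: {riemann_stieltjes(h, a, 1.0, n):.6f}   (exact {exact:.6f})")
```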


2 Semi-martingales

The title of the chapter is "semi-martingales", but we are not even going to meet the definition of a semi-martingale till the end of the chapter. The reason is that a semi-martingale is essentially defined to be the sum of a (local) martingale and a finite variation process, and understanding semi-martingales mostly involves understanding the two parts separately. Thus, for most of the chapter, we will be studying local martingales (finite variation processes are rather more boring), and at the end we will put them together to say a word or two about semi-martingales.

From now on, $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\ge 0}, \mathbb{P})$ will be a filtered probability space. Recall the following definition:

Definition (Cadlag adapted process). A cadlag adapted process is a map X : Ω × [0, ∞) → R such that

(i) X is cadlag, i.e. X(ω, ·) : [0, ∞) → R is cadlag for all ω ∈ Ω.

(ii) X is adapted, i.e. $X_t = X(\cdot, t)$ is $\mathcal{F}_t$-measurable for every t ≥ 0.

Notation. We will write X ∈ G to denote that a random variable X is measurable with respect to a σ-algebra G.

2.1 Finite variation processes

The definition of a finite variation function extends immediately to a finite variation process.

Definition (Finite variation process). A finite variation process is a cadlag adapted process A such that A(ω, ·) : [0, ∞) → R has finite variation for all ω ∈ Ω. The total variation process V of a finite variation process A is
$$V_t = \int_0^t |dA_s|.$$

Proposition. The total variation process V of a cadlag adapted process A is also cadlag, finite variation and adapted, and it is also increasing.

Proof. We only have to check that it is adapted. But that follows directly from our previous expression of the integral as the limit of a sum. Indeed, let $0 = t^{(m)}_0 < t^{(m)}_1 < \cdots < t^{(m)}_{n_m} = t$ be a (nested) sequence of subdivisions of [0, t] with $\max_i |t^{(m)}_i - t^{(m)}_{i-1}| \to 0$. We have seen
$$V_t = \lim_{m\to\infty} \sum_{i=1}^{n_m} |A_{t^{(m)}_i} - A_{t^{(m)}_{i-1}}| + |A(0)| \in \mathcal{F}_t.$$

Definition ($(H \cdot A)_t$). Let A be a finite variation process and H a process such that for all ω ∈ Ω and t ≥ 0,
$$\int_0^t |H_s(\omega)|\, |dA_s(\omega)| < \infty.$$
Then define a process $((H \cdot A)_t)_{t \ge 0}$ by
$$(H \cdot A)_t = \int_0^t H_s\, dA_s.$$


For the process H ·A to be adapted, we need a condition.

Definition (Previsible process). A process H : Ω × [0, ∞) → R is previsible if it is measurable with respect to the previsible σ-algebra P generated by the sets E × (s, t], where $E \in \mathcal{F}_s$ and s < t. We call the generating set Π.

Very roughly, the idea is that a previsible event is one where whenever it happens, you know it a finite (though possibly arbitrarily small) time before.

Definition (Simple process). A process H : Ω × [0, ∞) → R is simple, written H ∈ E, if
$$H(\omega, t) = \sum_{i=1}^n H_{i-1}(\omega) 1_{(t_{i-1}, t_i]}(t)$$
for random variables $H_{i-1} \in \mathcal{F}_{t_{i-1}}$ and $0 = t_0 < \cdots < t_n$.

Fact. Simple processes and their limits are previsible.

Fact. Let X be a cadlag adapted process. Then $H_t = X_{t-}$ defines a left-continuous process and is previsible.

In particular, continuous processes are previsible.

Proof. Since X is cadlag adapted, it is clear that H is left-continuous and adapted. Since H is left-continuous, it is approximated by simple processes. Indeed, let
$$H^n_t = \sum_{i=1}^{2^n} \big( H_{(i-1)2^{-n}} \wedge n \big)\, 1_{((i-1)2^{-n},\, i2^{-n}]}(t) \in \mathcal{E}.$$
Then $H^n_t \to H_t$ for all t by left-continuity, and previsibility follows.

Exercise. Let H be previsible. Then
$$H_t \in \mathcal{F}_{t-} = \sigma(\mathcal{F}_s : s < t).$$

Example. Brownian motion is previsible (since it is continuous).

Example. A Poisson process $(N_t)$ is not previsible since $N_t \notin \mathcal{F}_{t-}$.

Proposition. Let A be a finite variation process, and H previsible such that
$$\int_0^t |H(\omega, s)|\, |dA(\omega, s)| < \infty \quad \text{for all } (\omega, t) \in \Omega \times [0, \infty).$$
Then H · A is a finite variation process.

Proof. The finite variation and cadlag parts follow directly from the deterministic versions. We only have to check that H · A is adapted, i.e. $(H \cdot A)(\cdot, t) \in \mathcal{F}_t$ for all t ≥ 0.

First, H · A is adapted if $H(\omega, s) = 1_{(u,v]}(s) 1_E(\omega)$ for some u < v and $E \in \mathcal{F}_u$, since
$$(H \cdot A)(\omega, t) = 1_E(\omega)\big( A(\omega, t \wedge v) - A(\omega, t \wedge u) \big) \in \mathcal{F}_t.$$
Thus, H · A is adapted for $H = 1_F$ when F ∈ Π. Clearly, Π is a π-system, i.e. it is closed under intersections and non-empty, and by definition it generates the


previsible σ-algebra P. So to extend the adaptedness of H · A to all previsible H, we use the monotone class theorem.

We let
$$\mathcal{V} = \{ H : \Omega \times [0, \infty) \to \mathbb{R} \text{ such that } H \cdot A \text{ is adapted} \}.$$
Then

(i) 1 ∈ V,

(ii) $1_F \in \mathcal{V}$ for all F ∈ Π,

(iii) V is closed under monotone limits.

So V contains all bounded P-measurable functions.

So the conclusion is that if A is a finite variation process, then as long as reasonable finiteness conditions are satisfied, we can integrate functions against dA. Moreover, this integral was easy to define, and it obeys all expected properties such as dominated convergence, since ultimately, it is just an integral in the usual measure-theoretic sense. This crucially depends on the fact that A is a finite variation process.

However, in our motivating example, we wanted to take A to be Brownian motion, which is not of finite variation. The work we will do in this chapter and the next is to come up with a stochastic integral where we let A be a martingale instead. The heuristic idea is that while martingales can vary wildly, the martingale property implies there will be some large cancellation between the up and down movements, which leads to the possibility of a well-defined stochastic integral.

2.2 Local martingale

From now on, we assume that $(\Omega, \mathcal{F}, (\mathcal{F}_t)_t, \mathbb{P})$ satisfies the usual conditions, namely that

(i) $\mathcal{F}_0$ contains all P-null sets,

(ii) $(\mathcal{F}_t)_t$ is right-continuous, i.e. $\mathcal{F}_t = \mathcal{F}_{t+} = \bigcap_{s > t} \mathcal{F}_s$ for all t ≥ 0.

We recall some of the properties of continuous martingales.

Theorem (Optional stopping theorem). Let X be a cadlag adapted integrable process. Then the following are equivalent:

(i) X is a martingale, i.e. $X_t \in L^1$ for every t, and
$$E(X_t \mid \mathcal{F}_s) = X_s \quad \text{for all } t > s.$$

(ii) The stopped process $X^T = (X^T_t) = (X_{T \wedge t})$ is a martingale for all stopping times T.

(iii) For all stopping times T, S with T bounded, $X_T \in L^1$ and $E(X_T \mid \mathcal{F}_S) = X_{T \wedge S}$ almost surely.

(iv) For all bounded stopping times T, $X_T \in L^1$ and $E(X_T) = E(X_0)$.


For X uniformly integrable, (iii) and (iv) hold for all stopping times.

In practice, most of our results will be first proven for bounded martingales, or perhaps square integrable ones. The point is that the square-integrable martingales form a Hilbert space, and Hilbert space techniques can help us say something useful about these martingales. To get something about a general martingale M, we can apply a cutoff $T_n = \inf\{ t > 0 : M_t \ge n \}$, and then $M^{T_n}$ will be a martingale for all n. We can then take the limit n → ∞ to recover something about the martingale itself.

But if we are doing this, we might as well weaken the martingale condition a bit: we only need the $M^{T_n}$ to be martingales. Of course, we aren't doing this just for fun. In general, martingales will not always be closed under the operations we are interested in, but local (or maybe semi-) martingales will be. In general, we define

Definition (Local martingale). A cadlag adapted process X is a local martingale if there exists a sequence of stopping times $T_n$ such that $T_n \to \infty$ almost surely, and $X^{T_n}$ is a martingale for every n. We say the sequence $T_n$ reduces X.

Example.

(i) Every martingale is a local martingale, since by the optional stopping theorem, we can take $T_n = n$.

(ii) Let $(B_t)$ be a standard 3d Brownian motion on $\mathbb{R}^3$. Then
$$(X_t)_{t \ge 1} = \left( \frac{1}{|B_t|} \right)_{t \ge 1}$$
is a local martingale but not a martingale.

To see this, first note that
$$\sup_{t \ge 1} E X_t^2 < \infty, \qquad E X_t \to 0.$$
Since $E X_t \to 0$ and $X_t \ge 0$, we know X cannot be a martingale. However, we can check that it is a local martingale. Recall that for any $f \in C^2_b$,
$$M^f_t = f(B_t) - f(B_1) - \frac{1}{2} \int_1^t \Delta f(B_s)\, ds$$
is a martingale. Moreover, $\Delta \frac{1}{|x|} = 0$ for all x ≠ 0. Thus, if $\frac{1}{|x|}$ didn't have a singularity at 0, this would have told us $X_t$ is a martingale. Thus, we are safe if we try to bound $|B_s|$ away from zero.

Let
$$T_n = \inf\left\{ t \ge 1 : |B_t| < \frac{1}{n} \right\},$$
and pick $f_n \in C^2_b$ such that $f_n(x) = \frac{1}{|x|}$ for $|x| \ge \frac{1}{n}$. Then $X^{T_n}_t - X^{T_n}_1 = M^{f_n}_{t \wedge T_n}$. So $X^{T_n}$ is a martingale.

It remains to show that $T_n \to \infty$, and this follows from the fact that $E X_t \to 0$.
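The failure of the martingale property is easy to see numerically. The following Monte Carlo sketch (my own set-up, taking $B_0 = 0$ so that $B_t \ne 0$ a.s. for t ≥ 1) shows $E[1/|B_t|]$ decaying, whereas a true martingale would have constant expectation.

```python
import numpy as np

# For a standard 3d Brownian motion started at the origin, X_t = 1/|B_t| has
# E[X_t] -> 0, so X cannot be a martingale even though it is a local martingale.
rng = np.random.default_rng(1)
n_samples = 400_000

for t in (1.0, 2.0, 4.0, 8.0, 16.0):
    B_t = rng.normal(0.0, np.sqrt(t), size=(n_samples, 3))   # B_t ~ N(0, t I_3)
    X_t = 1.0 / np.linalg.norm(B_t, axis=1)
    exact = np.sqrt(2.0 / (np.pi * t))                        # E[1/|B_t|] when B_0 = 0
    print(f"t = {t:5.1f}:  E[X_t] ~ {X_t.mean():.4f}   (exact {exact:.4f})")
```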


Proposition. Let X be a local martingale and $X_t \ge 0$ for all t. Then X is a supermartingale.

Proof. Let $(T_n)$ be a reducing sequence for X. Then, using Fatou's lemma for conditional expectations,
$$E(X_t \mid \mathcal{F}_s) = E\Big( \liminf_{n\to\infty} X_{t \wedge T_n} \,\Big|\, \mathcal{F}_s \Big) \le \liminf_{n\to\infty} E(X_{t \wedge T_n} \mid \mathcal{F}_s) = \liminf_{n\to\infty} X_{s \wedge T_n} = X_s.$$

Recall the following result from Advanced Probability:

Proposition. Let $X \in L^1(\Omega, \mathcal{F}, \mathbb{P})$. Then the set
$$\chi = \{ E(X \mid \mathcal{G}) : \mathcal{G} \subseteq \mathcal{F} \text{ a sub-}\sigma\text{-algebra} \}$$
is uniformly integrable, i.e.
$$\sup_{Y \in \chi} E\big( |Y| 1_{|Y| > \lambda} \big) \to 0 \quad \text{as } \lambda \to \infty.$$

Recall also the following important result about uniformly integrable random variables:

Theorem (Vitali theorem). $X_n \to X$ in $L^1$ iff $(X_n)$ is uniformly integrable and $X_n \to X$ in probability.

With these, we can state the following characterization of martingales in terms of local martingales:

Proposition. The following are equivalent:

(i) X is a martingale.

(ii) X is a local martingale, and for all t ≥ 0, the set
$$\chi_t = \{ X_T : T \text{ is a stopping time with } T \le t \}$$
is uniformly integrable.

Proof.

– (i) ⇒ (ii): Let X be a martingale. Then by the optional stopping theorem, $X_T = E(X_t \mid \mathcal{F}_T)$ for any bounded stopping time T ≤ t. So $\chi_t$ is uniformly integrable.

– (ii) ⇒ (i): Let X be a local martingale with reducing sequence $(T_n)$, and assume that the sets $\chi_t$ are uniformly integrable for all t ≥ 0. By the optional stopping theorem, it suffices to show that $E(X_T) = E(X_0)$ for any bounded stopping time T.

So let T be a bounded stopping time, say T ≤ t. Then
$$E(X_0) = E(X^{T_n}_0) = E(X^{T_n}_T) = E(X_{T \wedge T_n})$$


for all n. Now $T \wedge T_n$ is a stopping time ≤ t, so $X_{T \wedge T_n}$ is uniformly integrable by assumption. Moreover, $T_n \wedge T \to T$ almost surely as n → ∞, hence $X_{T \wedge T_n} \to X_T$ in probability. Hence by Vitali, this converges in $L^1$. So
$$E(X_T) = E(X_0).$$

Corollary. If X is a local martingale and there is some $Z \in L^1$ such that $|X_t| \le Z$ for all t, then X is a martingale. In particular, every bounded local martingale is a martingale.

The definition of a local martingale does not give us control over what the reducing sequence $T_n$ is. In particular, it is not necessarily true that $X^{T_n}$ will be bounded, which is a helpful property to have. Fortunately, we have the following proposition:

Proposition. Let X be a continuous local martingale with $X_0 = 0$. Define
$$S_n = \inf\{ t \ge 0 : |X_t| = n \}.$$
Then $S_n$ is a stopping time, $S_n \to \infty$ and $X^{S_n}$ is a bounded martingale. In particular, $(S_n)$ reduces X.

Proof. It is clear that $S_n$ is a stopping time, since (if it is not clear)
$$\{ S_n \le t \} = \bigcap_{k \in \mathbb{N}} \left\{ \sup_{s \le t} |X_s| > n - \frac{1}{k} \right\} = \bigcap_{k \in \mathbb{N}} \bigcup_{s < t,\, s \in \mathbb{Q}} \left\{ |X_s| > n - \frac{1}{k} \right\} \in \mathcal{F}_t.$$
It is also clear that $S_n \to \infty$, since
$$\sup_{s \le t} |X_s| \le n \iff S_n \ge t,$$
and by continuity and compactness, $\sup_{s \le t} |X_s|$ is finite for every (ω, t).

Finally, we show that $X^{S_n}$ is a martingale. By the optional stopping theorem, $X^{T_n \wedge S_n}$ is a martingale, so $X^{S_n}$ is a local martingale. But it is also bounded by n. So it is a martingale.

An important and useful theorem is the following:

Theorem. Let X be a continuous local martingale with $X_0 = 0$. If X is also a finite variation process, then $X_t = 0$ for all t.

This would rule out interpreting $\int H_s\, dX_s$ as a Lebesgue–Stieltjes integral for X a non-zero continuous local martingale. In particular, we cannot take X to be Brownian motion. Instead, we have to develop a new theory of integration for continuous local martingales, namely the Itô integral.

On the other hand, this theorem is very useful. We will later want to define the stochastic integral with respect to the sum of a continuous local martingale and a finite variation process, which is the appropriate generality for our theorems to make good sense. This theorem tells us there is a unique way to decompose a process as a sum of a finite variation process and a continuous local martingale (if it can be done). So we can simply define this stochastic integral by using the Lebesgue–Stieltjes integral on the finite variation part and the Itô integral on the continuous local martingale part.


Proof. Let X be a finite-variation continuous local martingale with $X_0 = 0$. Since X is finite variation, we can define the total variation process $(V_t)$ corresponding to X, and let
$$S_n = \inf\{ t \ge 0 : V_t \ge n \} = \inf\left\{ t \ge 0 : \int_0^t |dX_s| \ge n \right\}.$$
Then $S_n$ is a stopping time, and $S_n \to \infty$ since X is assumed to be finite variation. Moreover, by optional stopping, $X^{S_n}$ is a local martingale, and is also bounded, since
$$X^{S_n}_t \le \int_0^{t \wedge S_n} |dX_s| \le n.$$
So $X^{S_n}$ is in fact a martingale.

We claim its $L^2$-norm vanishes. Let $0 = t_0 < t_1 < \cdots < t_k = t$ be a subdivision of [0, t]. Using the fact that $X^{S_n}$ is a martingale and has orthogonal increments, we can write
$$E\big( (X^{S_n}_t)^2 \big) = \sum_{i=1}^k E\big( (X^{S_n}_{t_i} - X^{S_n}_{t_{i-1}})^2 \big).$$
Observe that $X^{S_n}$ is finite variation, but the right-hand side is summing the square of the variation, which ought to vanish when we take the limit $\max |t_i - t_{i-1}| \to 0$. Indeed, we can compute
$$\begin{aligned}
E\big( (X^{S_n}_t)^2 \big) &= \sum_{i=1}^k E\big( (X^{S_n}_{t_i} - X^{S_n}_{t_{i-1}})^2 \big) \\
&\le E\left( \max_{1 \le i \le k} |X^{S_n}_{t_i} - X^{S_n}_{t_{i-1}}| \sum_{i=1}^k |X^{S_n}_{t_i} - X^{S_n}_{t_{i-1}}| \right) \\
&\le E\left( \max_{1 \le i \le k} |X^{S_n}_{t_i} - X^{S_n}_{t_{i-1}}|\, V_{t \wedge S_n} \right) \\
&\le E\left( \max_{1 \le i \le k} |X^{S_n}_{t_i} - X^{S_n}_{t_{i-1}}|\, n \right).
\end{aligned}$$
Of course, the first term is also bounded by the total variation. Moreover, we can make further subdivisions so that the mesh size tends to zero, and then the first term vanishes in the limit by continuity. So by dominated convergence, we must have $E\big( (X^{S_n}_t)^2 \big) = 0$. So $X^{S_n}_t = 0$ almost surely for all n. So $X_t = 0$ for all t almost surely.

2.3 Square integrable martingales

As previously discussed, we will want to use Hilbert space machinery to construct the Itô integral. The rough idea is to define the Itô integral with respect to a fixed martingale on simple processes via a (finite) Riemann sum, and then by calculating appropriate bounds on how this affects the norm, we can extend this to all processes by continuity, and this requires our space to be Hilbert. The interesting spaces are defined as follows:


Definition ($\mathcal{M}^2$). Let
$$\mathcal{M}^2 = \left\{ X : \Omega \times [0, \infty) \to \mathbb{R} : X \text{ is a cadlag martingale with } \sup_{t \ge 0} E(X_t^2) < \infty \right\},$$
$$\mathcal{M}^2_c = \left\{ X \in \mathcal{M}^2 : X(\omega, \cdot) \text{ is continuous for every } \omega \in \Omega \right\}.$$
We define an inner product on $\mathcal{M}^2$ by
$$(X, Y)_{\mathcal{M}^2} = E(X_\infty Y_\infty),$$
which in particular induces a norm
$$\|X\|_{\mathcal{M}^2} = \big( E(X_\infty^2) \big)^{1/2}.$$
We will prove this is indeed an inner product soon. Here recall that for $X \in \mathcal{M}^2$, the martingale convergence theorem implies $X_t \to X_\infty$ almost surely and in $L^2$.

Our goal will be to prove that these spaces are indeed Hilbert spaces. First observe that if $X \in \mathcal{M}^2$, then $(X_t^2)_{t \ge 0}$ is a submartingale by Jensen, so $t \mapsto E X_t^2$ is increasing, and
$$E X_\infty^2 = \sup_{t \ge 0} E X_t^2.$$
All the magic that lets us prove they are Hilbert spaces is Doob's inequality.

Theorem (Doob's inequality). Let $X \in \mathcal{M}^2$. Then
$$E\left( \sup_{t \ge 0} X_t^2 \right) \le 4\, E(X_\infty^2).$$
So once we control the limit $X_\infty$, we control the whole path. This is why the definition of the norm makes sense, and in particular we know $\|X\|_{\mathcal{M}^2} = 0$ implies that X = 0.
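A quick numerical illustration of Doob's $L^2$ inequality (a sketch under my own choice of martingale, not from the notes): take $X_t = B_{t \wedge 1}$ for a standard Brownian motion B, so $X \in \mathcal{M}^2$ with $X_\infty = B_1$, and compare $E(\sup_t X_t^2)$ with $4E(X_\infty^2) = 4$ by Monte Carlo on a fine grid.

```python
import numpy as np

rng = np.random.default_rng(2)
n_paths, n_steps = 10_000, 500
dt = 1.0 / n_steps

# X_t = B_{t ^ 1}: a martingale bounded in L^2, with X_infinity = B_1.
increments = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
paths = np.cumsum(increments, axis=1)          # B_t on the grid (B_0 = 0 omitted)

sup_sq = np.max(paths ** 2, axis=1)            # sup_t X_t^2 (slight underestimate: grid maximum)
x_inf_sq = paths[:, -1] ** 2                   # X_infinity^2 = B_1^2
print("E[sup_t X_t^2] ~", sup_sq.mean())       # noticeably smaller than the bound below
print("4 E[X_inf^2]   ~", 4 * x_inf_sq.mean()) # about 4, as Doob's inequality guarantees
```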

Theorem. $\mathcal{M}^2$ is a Hilbert space and $\mathcal{M}^2_c$ is a closed subspace.

Proof. We need to check that $\mathcal{M}^2$ is complete. Thus let $(X^n) \subseteq \mathcal{M}^2$ be a Cauchy sequence, i.e.
$$E\big( (X^n_\infty - X^m_\infty)^2 \big) \to 0 \quad \text{as } n, m \to \infty.$$
By passing to a subsequence, we may assume that
$$E\big( (X^n_\infty - X^{n-1}_\infty)^2 \big) \le 2^{-n}.$$
First note that
$$\begin{aligned}
E\left( \sum_n \sup_{t \ge 0} |X^n_t - X^{n-1}_t| \right) &\le \sum_n E\left( \sup_{t \ge 0} |X^n_t - X^{n-1}_t|^2 \right)^{1/2} && \text{(Cauchy–Schwarz)} \\
&\le \sum_n 2\, E\big( |X^n_\infty - X^{n-1}_\infty|^2 \big)^{1/2} && \text{(Doob's inequality)} \\
&\le 2 \sum_n 2^{-n/2} < \infty.
\end{aligned}$$


So
$$\sum_{n=1}^\infty \sup_{t \ge 0} |X^n_t - X^{n-1}_t| < \infty \quad \text{a.s.} \tag{$*$}$$
So on this event, $(X^n)$ is a Cauchy sequence in the space $(D[0, \infty), \|\cdot\|_\infty)$ of cadlag functions. So there is some $X(\omega, \cdot) \in D[0, \infty)$ such that
$$\| X^n(\omega, \cdot) - X(\omega, \cdot) \|_\infty \to 0 \quad \text{for almost all } \omega,$$
and we set X = 0 outside this almost sure event $(*)$. We now claim that
$$E\left( \sup_{t \ge 0} |X^n_t - X_t|^2 \right) \to 0 \quad \text{as } n \to \infty.$$
We can just compute
$$\begin{aligned}
E\left( \sup_t |X^n_t - X_t|^2 \right) &= E\left( \lim_{m\to\infty} \sup_t |X^n_t - X^m_t|^2 \right) \\
&\le \liminf_{m\to\infty} E\left( \sup_t |X^n_t - X^m_t|^2 \right) && \text{(Fatou)} \\
&\le \liminf_{m\to\infty} 4\, E\big( (X^n_\infty - X^m_\infty)^2 \big) && \text{(Doob's inequality)}
\end{aligned}$$
and this goes to 0 in the limit n → ∞ as well.

We finally have to check that X is indeed a martingale. We use the triangle inequality to write
$$\begin{aligned}
\| E(X_t \mid \mathcal{F}_s) - X_s \|_{L^2} &\le \| E(X_t - X^n_t \mid \mathcal{F}_s) \|_{L^2} + \| X^n_s - X_s \|_{L^2} \\
&\le E\big( E((X_t - X^n_t)^2 \mid \mathcal{F}_s) \big)^{1/2} + \| X^n_s - X_s \|_{L^2} \\
&= \| X_t - X^n_t \|_{L^2} + \| X^n_s - X_s \|_{L^2} \\
&\le 2\, E\left( \sup_t |X_t - X^n_t|^2 \right)^{1/2} \to 0
\end{aligned}$$
as n → ∞. But the left-hand side does not depend on n. So it must vanish. So $X \in \mathcal{M}^2$.

We could have done exactly the same with continuous martingales, so the second part follows.

2.4 Quadratic variation

Physicists are used to dropping all terms above first order. It turns out that Brownian motion, and continuous local martingales in general, oscillate so wildly that second order terms become important. We first make the following definition:

Definition (Uniformly on compact sets in probability). For a sequence of processes $(X^n)$ and a process X, we say that $X^n \to X$ u.c.p. iff
$$P\left( \sup_{s \in [0,t]} |X^n_s - X_s| > \varepsilon \right) \to 0 \quad \text{as } n \to \infty \text{ for all } t > 0,\ \varepsilon > 0.$$


Theorem. Let M be a continuous local martingale with $M_0 = 0$. Then there exists a unique (up to indistinguishability) continuous adapted increasing process $(\langle M \rangle_t)_{t \ge 0}$ such that $\langle M \rangle_0 = 0$ and $M_t^2 - \langle M \rangle_t$ is a continuous local martingale. Moreover,
$$\langle M \rangle_t = \lim_{n\to\infty} \langle M \rangle^{(n)}_t, \qquad \langle M \rangle^{(n)}_t = \sum_{i=1}^{\lceil 2^n t \rceil} (M_{i2^{-n}} - M_{(i-1)2^{-n}})^2,$$
where the limit is u.c.p.

Definition (Quadratic variation). $\langle M \rangle$ is called the quadratic variation of M.

It is probably more useful to understand $\langle M \rangle_t$ in terms of the explicit formula, and the fact that $M_t^2 - \langle M \rangle_t$ is a continuous local martingale is a convenient property.

Example. Let B be a standard Brownian motion. Then $B_t^2 - t$ is a martingale. Thus, $\langle B \rangle_t = t$.
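The dyadic sums in the theorem can be watched converging for Brownian motion (a simulation sketch; the grid and horizon are my choices): for a single sampled path, $\langle B \rangle^{(n)}_1 = \sum_{i \le 2^n} (B_{i2^{-n}} - B_{(i-1)2^{-n}})^2$ approaches $t = 1$ as n grows.

```python
import numpy as np

rng = np.random.default_rng(3)
T, n_max = 1.0, 16                                 # horizon and finest dyadic level

# Sample one Brownian path on the finest dyadic grid 2^{-n_max} Z intersected with [0, T].
n_fine = int(2 ** n_max * T)
fine_increments = rng.normal(0.0, np.sqrt(2.0 ** -n_max), size=n_fine)
B = np.concatenate([[0.0], np.cumsum(fine_increments)])

def dyadic_qv(n, t=T):
    """<B>^{(n)}_t: sum of squared increments of B over the level-n dyadic grid."""
    step = 2 ** (n_max - n)                        # spacing of level-n points inside the fine grid
    pts = B[::step][: int(2 ** n * t) + 1]
    return float(np.sum(np.diff(pts) ** 2))

for n in (2, 4, 8, 12, 16):
    print(f"n = {n:2d}:  <B>^(n)_1 = {dyadic_qv(n):.4f}   (limit t = {T})")
```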

The proof is long and mechanical, but not hard. All the magic happened when we used the magical Doob's inequality to show that $\mathcal{M}^2_c$ and $\mathcal{M}^2$ are Hilbert spaces.

Proof. To show uniqueness, we use that finite variation and local martingale are incompatible. Suppose $(A_t)$ and $(\tilde A_t)$ both obey the conditions for $\langle M \rangle$. Then $A_t - \tilde A_t = (M_t^2 - \tilde A_t) - (M_t^2 - A_t)$ is a continuous adapted local martingale starting at 0. Moreover, both $A_t$ and $\tilde A_t$ are increasing, hence have finite variation. So $A - \tilde A = 0$ almost surely.

To show existence, we need to show that the limit exists and has the right property. We do this in steps.

Claim. The result holds if M is in fact bounded.

Suppose $|M(\omega, t)| \le C$ for all (ω, t). Then $M \in \mathcal{M}^2_c$. Fix T > 0 deterministic. Let
$$X^n_t = \sum_{i=1}^{\lceil 2^n T \rceil} M_{(i-1)2^{-n}} (M_{i2^{-n} \wedge t} - M_{(i-1)2^{-n} \wedge t}).$$
This is defined so that
$$\langle M \rangle^{(n)}_{k2^{-n}} = M^2_{k2^{-n}} - 2 X^n_{k2^{-n}}.$$
This reduces the study of $\langle M \rangle^{(n)}$ to that of $X^n_{k2^{-n}}$.

We check that $(X^n_t)$ is a Cauchy sequence in $\mathcal{M}^2_c$. The fact that it is a martingale is an immediate computation. To show it is Cauchy, for n ≥ m, we calculate
$$X^n_\infty - X^m_\infty = \sum_{i=1}^{\lceil 2^n T \rceil} \big( M_{(i-1)2^{-n}} - M_{\lfloor (i-1)2^{m-n} \rfloor 2^{-m}} \big) \big( M_{i2^{-n}} - M_{(i-1)2^{-n}} \big).$$


We now take the expectation of the square to get
$$\begin{aligned}
E(X^n_\infty - X^m_\infty)^2 &= E\left( \sum_{i=1}^{\lceil 2^nT \rceil} \big( M_{(i-1)2^{-n}} - M_{\lfloor (i-1)2^{m-n}\rfloor 2^{-m}} \big)^2 \big( M_{i2^{-n}} - M_{(i-1)2^{-n}} \big)^2 \right) \\
&\le E\left( \sup_{|s-t| \le 2^{-m}} |M_t - M_s|^2 \sum_{i=1}^{\lceil 2^nT\rceil} (M_{i2^{-n}} - M_{(i-1)2^{-n}})^2 \right) \\
&= E\left( \sup_{|s-t| \le 2^{-m}} |M_t - M_s|^2\, \langle M \rangle^{(n)}_T \right) \\
&\le E\left( \sup_{|s-t| \le 2^{-m}} |M_t - M_s|^4 \right)^{1/2} E\left( \big( \langle M \rangle^{(n)}_T \big)^2 \right)^{1/2} && \text{(Cauchy–Schwarz)}
\end{aligned}$$
We shall show that the second factor is bounded, while the first factor tends to zero as m → ∞. These are both not surprising: the first term vanishing in the limit corresponds to M being continuous, and the second term is bounded since M itself is bounded.

To show that the first term tends to zero, we note that we have
$$|M_t - M_s|^4 \le 16 C^4,$$
and moreover
$$\sup_{|s-t| \le 2^{-m}} |M_t - M_s| \to 0 \quad \text{as } m \to \infty \text{ by uniform continuity}.$$
So we are done by the dominated convergence theorem.

To show the second term is bounded, we do (writing $N = \lceil 2^n T \rceil$)
$$\begin{aligned}
E\left( \big( \langle M \rangle^{(n)}_T \big)^2 \right) &= E\left( \left( \sum_{i=1}^N (M_{i2^{-n}} - M_{(i-1)2^{-n}})^2 \right)^2 \right) \\
&= \sum_{i=1}^N E\big( (M_{i2^{-n}} - M_{(i-1)2^{-n}})^4 \big) + 2 \sum_{i=1}^N E\left( (M_{i2^{-n}} - M_{(i-1)2^{-n}})^2 \sum_{k=i+1}^N (M_{k2^{-n}} - M_{(k-1)2^{-n}})^2 \right)
\end{aligned}$$
We use the martingale property and orthogonal increments to rearrange the off-diagonal term as
$$E\big( (M_{i2^{-n}} - M_{(i-1)2^{-n}})^2 (M_{N2^{-n}} - M_{i2^{-n}})^2 \big).$$
Taking some sups, we get
$$E\left( \big( \langle M \rangle^{(n)}_T \big)^2 \right) \le 12 C^2\, E\left( \sum_{i=1}^N (M_{i2^{-n}} - M_{(i-1)2^{-n}})^2 \right) = 12 C^2\, E\big( (M_{N2^{-n}} - M_0)^2 \big) \le 12 C^2 \cdot 4 C^2.$$


So done.

So we now have $X^n \to X$ in $\mathcal{M}^2_c$ for some $X \in \mathcal{M}^2_c$. In particular, we have
$$\left\| \sup_t |X^n_t - X_t| \right\|_{L^2} \to 0.$$
So we know that
$$\sup_t |X^n_t - X_t| \to 0$$
almost surely along a subsequence Λ.

Let $N \subseteq \Omega$ be the event on which this convergence fails. We define
$$A^{(T)}_t = \begin{cases} M_t^2 - 2X_t & \omega \in \Omega \setminus N \\ 0 & \omega \in N \end{cases}.$$
Then $A^{(T)}$ is continuous, adapted since M and X are, and $(M^2_{t \wedge T} - A^{(T)}_{t \wedge T})_t$ is a martingale since X is. Finally, $A^{(T)}$ is increasing since $M^2_t - 2X^n_t$ is increasing on $2^{-n}\mathbb{Z} \cap [0, T]$ and the limit is uniform. So this $A^{(T)}$ basically satisfies all the properties we want $\langle M \rangle_t$ to satisfy, except we have the stopping time T.

We next observe that for any T ≥ 1, $A^{(T)}_{t \wedge T} = A^{(T+1)}_{t \wedge T}$ for all t almost surely. This essentially follows from the same uniqueness argument as we had at the beginning of the proof. Thus, there is a process $(\langle M \rangle_t)_{t \ge 0}$ such that
$$\langle M \rangle_t = A^{(T)}_t$$
for all t ∈ [0, T] and T ∈ N, almost surely. Then this is the desired process. So we have constructed $\langle M \rangle$ in the case where M is bounded.

Claim. $\langle M \rangle^{(n)} \to \langle M \rangle$ u.c.p.

Recall that
$$\langle M \rangle^{(n)}_t = M^2_{2^{-n}\lfloor 2^n t \rfloor} - 2 X^n_{2^{-n}\lfloor 2^n t \rfloor}.$$
We also know that
$$\sup_{t \le T} |X^n_t - X_t| \to 0$$
in $L^2$, hence also in probability. So we have
$$|\langle M \rangle_t - \langle M \rangle^{(n)}_t| \le \sup_{t \le T} |M^2_{2^{-n}\lfloor 2^nt\rfloor} - M^2_t| + 2 \sup_{t \le T} |X^n_{2^{-n}\lfloor 2^nt\rfloor} - X_{2^{-n}\lfloor 2^nt\rfloor}| + 2 \sup_{t \le T} |X_{2^{-n}\lfloor 2^nt\rfloor} - X_t|.$$
The first and last terms tend to 0 in probability since M and X are uniformly continuous on [0, T]. The second term converges to zero by our previous assertion. So we are done.

Claim. The theorem holds for M any continuous local martingale.

We let $T_n = \inf\{ t \ge 0 : |M_t| \ge n \}$. Then $(T_n)$ reduces M and $M^{T_n}$ is a bounded continuous martingale. We set
$$A^n = \langle M^{T_n} \rangle.$$


Then $(A^n_t)$ and $(A^{n+1}_{t \wedge T_n})$ are indistinguishable for $t < T_n$ by the uniqueness argument. Thus there is a process $\langle M \rangle$ such that $\langle M \rangle_{t \wedge T_n}$ and $A^n_t$ are indistinguishable for all n. Clearly, $\langle M \rangle$ is increasing since the $A^n$ are, and $M^2_{t \wedge T_n} - \langle M \rangle_{t \wedge T_n}$ is a martingale for every n, so $M^2_t - \langle M \rangle_t$ is a continuous local martingale.

Claim. $\langle M \rangle^{(n)} \to \langle M \rangle$ u.c.p.

We have seen
$$\langle M^{T_k} \rangle^{(n)} \to \langle M^{T_k} \rangle \quad \text{u.c.p.}$$
for every k. So
$$P\left( \sup_{t \le T} |\langle M \rangle^{(n)}_t - \langle M \rangle_t| > \varepsilon \right) \le P(T_k < T) + P\left( \sup_{t \le T} |\langle M^{T_k} \rangle^{(n)}_t - \langle M^{T_k} \rangle_t| > \varepsilon \right).$$
So we can first pick k large enough such that the first term is small, then pick n large enough so that the second is small.

There are a few easy consequences of this theorem.

Fact. Let M be a continuous local martingale, and let T be a stopping time. Then almost surely for all t ≥ 0,
$$\langle M^T \rangle_t = \langle M \rangle_{t \wedge T}.$$

Proof. Since $M^2_t - \langle M \rangle_t$ is a continuous local martingale, so is $M^2_{t \wedge T} - \langle M \rangle_{t \wedge T} = (M^T)^2_t - \langle M \rangle_{t \wedge T}$. So we are done by uniqueness.

Fact. Let M be a continuous local martingale with $M_0 = 0$. Then M = 0 iff $\langle M \rangle = 0$.

Proof. If M = 0, then $\langle M \rangle = 0$. Conversely, if $\langle M \rangle = 0$, then $M^2$ is a continuous local martingale and positive. Thus $E M_t^2 \le E M_0^2 = 0$.

Proposition. Let $M \in \mathcal{M}^2_c$. Then $M^2 - \langle M \rangle$ is a uniformly integrable martingale, and
$$\|M - M_0\|_{\mathcal{M}^2} = \big( E\langle M \rangle_\infty \big)^{1/2}.$$

Proof. We will show that $\langle M \rangle_\infty \in L^1$. This then implies
$$|M^2_t - \langle M \rangle_t| \le \sup_{t \ge 0} M_t^2 + \langle M \rangle_\infty.$$
Then the right-hand side is in $L^1$. Since $M^2 - \langle M \rangle$ is a local martingale, this implies that it is in fact a uniformly integrable martingale.

To show $\langle M \rangle_\infty \in L^1$, we let
$$S_n = \inf\{ t \ge 0 : \langle M \rangle_t \ge n \}.$$
Then $S_n \to \infty$, $S_n$ is a stopping time and moreover $\langle M \rangle_{t \wedge S_n} \le n$. So we have
$$M^2_{t \wedge S_n} - \langle M \rangle_{t \wedge S_n} \le n + \sup_{t \ge 0} M_t^2,$$
and the second term is in $L^1$. So $M^2_{t \wedge S_n} - \langle M \rangle_{t \wedge S_n}$ is a true martingale.


So
$$E M^2_{t \wedge S_n} - E M_0^2 = E\langle M \rangle_{t \wedge S_n}.$$
Taking the limit t → ∞, we know $E M^2_{t \wedge S_n} \to E M^2_{S_n}$ by dominated convergence. Since $\langle M \rangle_{t \wedge S_n}$ is increasing, we also have $E\langle M \rangle_{t \wedge S_n} \to E\langle M \rangle_{S_n}$ by monotone convergence. We can take n → ∞, and by the same justification, we have
$$E\langle M \rangle_\infty \le E M_\infty^2 - E M_0^2 = E(M_\infty - M_0)^2 < \infty.$$

2.5 Covariation

We know $\mathcal{M}^2_c$ not only has a norm, but also an inner product. This can also be reflected in the bracket by the polarization identity, and it is natural to define

Definition (Covariation). Let M, N be two continuous local martingales. Define the covariation (or simply the bracket) between M and N to be the process
$$\langle M, N \rangle_t = \frac{1}{4}\big( \langle M + N \rangle_t - \langle M - N \rangle_t \big).$$
Then if in fact $M, N \in \mathcal{M}^2_c$, then putting t = ∞ gives the inner product.

Proposition.

(i) $\langle M, N \rangle$ is the unique (up to indistinguishability) finite variation process such that $M_t N_t - \langle M, N \rangle_t$ is a continuous local martingale.

(ii) The mapping $(M, N) \mapsto \langle M, N \rangle$ is bilinear and symmetric.

(iii)
$$\langle M, N \rangle_t = \lim_{n\to\infty} \langle M, N \rangle^{(n)}_t \quad \text{u.c.p.}, \qquad \langle M, N \rangle^{(n)}_t = \sum_{i=1}^{\lceil 2^n t \rceil} (M_{i2^{-n}} - M_{(i-1)2^{-n}})(N_{i2^{-n}} - N_{(i-1)2^{-n}}).$$

(iv) For every stopping time T,
$$\langle M^T, N^T \rangle_t = \langle M^T, N \rangle_t = \langle M, N \rangle_{t \wedge T}.$$

(v) If $M, N \in \mathcal{M}^2_c$, then $M_t N_t - \langle M, N \rangle_t$ is a uniformly integrable martingale, and
$$(M - M_0, N - N_0)_{\mathcal{M}^2} = E\langle M, N \rangle_\infty.$$

Example. Let B, B′ be two independent Brownian motions (with respect to the same filtration). Then $\langle B, B' \rangle = 0$.

Proof. Assume $B_0 = B'_0 = 0$. Then $X^\pm = \frac{1}{\sqrt{2}}(B \pm B')$ are Brownian motions, and so $\langle X^\pm \rangle_t = t$. So their difference vanishes.
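Numerically (a sketch with grids of my choosing), the discretized bracket $\langle B, B' \rangle^{(n)}_1 = \sum_i (B_{i2^{-n}} - B_{(i-1)2^{-n}})(B'_{i2^{-n}} - B'_{(i-1)2^{-n}})$ is close to 0 for independent Brownian motions, while the same sum with B' = B recovers $\langle B \rangle_1 = 1$.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 16                                                    # dyadic level; spacing 2^{-n} on [0, 1]
dB = rng.normal(0.0, np.sqrt(2.0 ** -n), size=2 ** n)     # increments of B
dBp = rng.normal(0.0, np.sqrt(2.0 ** -n), size=2 ** n)    # increments of an independent B'

print("<B, B'>^(n)_1 ~", float(np.sum(dB * dBp)))         # near 0
print("<B, B >^(n)_1 ~", float(np.sum(dB * dB)))          # near 1
```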

An important result about the covariation is the following Cauchy–Schwarz-like inequality:


Proposition (Kunita–Watanabe). Let M, N be continuous local martingales and let H, K be two (previsible) processes. Then almost surely
$$\int_0^\infty |H_s| |K_s|\, |d\langle M, N \rangle_s| \le \left( \int_0^\infty H_s^2\, d\langle M \rangle_s \right)^{1/2} \left( \int_0^\infty K_s^2\, d\langle N \rangle_s \right)^{1/2}.$$

In fact, this is Cauchy–Schwarz. All we have to do is to take approximations and take limits and make sure everything works out well.

Proof. For convenience, we write
$$\langle M, N \rangle_s^t = \langle M, N \rangle_t - \langle M, N \rangle_s.$$

Claim. For all 0 ≤ s ≤ t, we have
$$|\langle M, N \rangle_s^t| \le \sqrt{\langle M, M \rangle_s^t} \sqrt{\langle N, N \rangle_s^t}.$$

By continuity, we can assume that s, t are dyadic rationals. Then
$$\begin{aligned}
|\langle M, N \rangle_s^t| &= \lim_{n\to\infty} \left| \sum_{i=2^n s+1}^{2^n t} (M_{i2^{-n}} - M_{(i-1)2^{-n}})(N_{i2^{-n}} - N_{(i-1)2^{-n}}) \right| \\
&\le \lim_{n\to\infty} \left| \sum_{i=2^ns+1}^{2^nt} (M_{i2^{-n}} - M_{(i-1)2^{-n}})^2 \right|^{1/2} \left| \sum_{i=2^ns+1}^{2^nt} (N_{i2^{-n}} - N_{(i-1)2^{-n}})^2 \right|^{1/2} && \text{(Cauchy–Schwarz)} \\
&= \big( \langle M, M \rangle_s^t \big)^{1/2} \big( \langle N, N \rangle_s^t \big)^{1/2},
\end{aligned}$$
where all equalities are u.c.p.

Claim. For all 0 ≤ s < t, we have
$$\int_s^t |d\langle M, N \rangle_u| \le \sqrt{\langle M, M \rangle_s^t} \sqrt{\langle N, N \rangle_s^t}.$$

Indeed, for any subdivision $s = t_0 < t_1 < \cdots < t_n = t$, we have
$$\sum_{i=1}^n |\langle M, N \rangle_{t_{i-1}}^{t_i}| \le \sum_{i=1}^n \sqrt{\langle M, M \rangle_{t_{i-1}}^{t_i}} \sqrt{\langle N, N \rangle_{t_{i-1}}^{t_i}} \le \left( \sum_{i=1}^n \langle M, M \rangle_{t_{i-1}}^{t_i} \right)^{1/2} \left( \sum_{i=1}^n \langle N, N \rangle_{t_{i-1}}^{t_i} \right)^{1/2}. \quad \text{(Cauchy–Schwarz)}$$
Taking the supremum over all subdivisions, the claim follows.

Claim. For all bounded Borel sets B ⊆ [0, ∞), we have
$$\int_B |d\langle M, N \rangle_u| \le \sqrt{\int_B d\langle M \rangle_u} \sqrt{\int_B d\langle N \rangle_u}.$$


We already know this is true if B is an interval. If B is a finite union of intervals, then we apply Cauchy–Schwarz. By a monotone class argument, we can extend to all Borel sets.

Claim. The theorem holds for
$$H = \sum_{\ell=1}^n h_\ell 1_{B_\ell}, \qquad K = \sum_{\ell=1}^n k_\ell 1_{B_\ell}$$
for $B_\ell \subseteq [0, \infty)$ bounded Borel sets with disjoint support.

We have
$$\begin{aligned}
\int |H_s K_s|\, |d\langle M, N \rangle_s| &\le \sum_{\ell=1}^n |h_\ell k_\ell| \int_{B_\ell} |d\langle M, N \rangle_s| \\
&\le \sum_{\ell=1}^n |h_\ell k_\ell| \left( \int_{B_\ell} d\langle M \rangle_s \right)^{1/2} \left( \int_{B_\ell} d\langle N \rangle_s \right)^{1/2} \\
&\le \left( \sum_{\ell=1}^n h_\ell^2 \int_{B_\ell} d\langle M \rangle_s \right)^{1/2} \left( \sum_{\ell=1}^n k_\ell^2 \int_{B_\ell} d\langle N \rangle_s \right)^{1/2}.
\end{aligned}$$
To finish the proof, approximate general H and K by step functions and take the limit.

2.6 Semi-martingale

Definition (Semi-martingale). A (continuous) adapted process X is a (continuous) semi-martingale if
$$X = X_0 + M + A,$$
where $X_0 \in \mathcal{F}_0$, M is a continuous local martingale with $M_0 = 0$, and A is a continuous finite variation process with $A_0 = 0$.

This decomposition is unique up to indistinguishability.

Definition (Quadratic variation). Let $X = X_0 + M + A$ and $X' = X'_0 + M' + A'$ be (continuous) semi-martingales. Set
$$\langle X \rangle = \langle M \rangle, \qquad \langle X, X' \rangle = \langle M, M' \rangle.$$
This definition makes sense, because continuous finite variation processes do not have quadratic variation.

Exercise. We have
$$\langle X, Y \rangle^{(n)}_t = \sum_{i=1}^{\lceil 2^n t \rceil} (X_{i2^{-n}} - X_{(i-1)2^{-n}})(Y_{i2^{-n}} - Y_{(i-1)2^{-n}}) \to \langle X, Y \rangle \quad \text{u.c.p.}$$


3 The stochastic integral

3.1 Simple processes

We now have all the background required to define the stochastic integral, and we can start constructing it. As in the case of the Lebesgue integral, we first define it for simple processes, and then extend to general processes by taking a limit. Recall that we have

Definition (Simple process). The space of simple processes E consists of functions H : Ω × [0, ∞) → R that can be written as
$$H_t(\omega) = \sum_{i=1}^n H_{i-1}(\omega) 1_{(t_{i-1}, t_i]}(t)$$
for some $0 \le t_0 \le t_1 \le \cdots \le t_n$ and bounded random variables $H_i \in \mathcal{F}_{t_i}$.

Definition (H · M). For $M \in \mathcal{M}^2$ and $H \in \mathcal{E}$, we set
$$\int_0^t H\, dM = (H \cdot M)_t = \sum_{i=1}^n H_{i-1}(M_{t_i \wedge t} - M_{t_{i-1} \wedge t}).$$

If M were of finite variation, then this is the same as what we have previously seen.

Recall that for the Lebesgue integral, extending this definition to general functions required results like monotone convergence. Here we need some similar results that put bounds on how large the integral can be. In fact, we get something better than a bound.

Proposition. If $M \in \mathcal{M}^2_c$ and $H \in \mathcal{E}$, then $H \cdot M \in \mathcal{M}^2_c$ and
$$\|H \cdot M\|^2_{\mathcal{M}^2} = E\left( \int_0^\infty H_s^2\, d\langle M \rangle_s \right). \tag{$*$}$$

Proof. We first show that $H \cdot M \in \mathcal{M}^2_c$. By linearity, we only have to check it for
$$X^i_t = H_{i-1}(M_{t_i \wedge t} - M_{t_{i-1} \wedge t}).$$
We have to check that $E(X^i_t \mid \mathcal{F}_s) = X^i_s$ for all t > s, and the only non-trivial case is when $t > t_{i-1}$. If $s \le t_{i-1}$, then
$$E(X^i_t \mid \mathcal{F}_s) = E\big( H_{i-1}\, E(M_{t_i \wedge t} - M_{t_{i-1} \wedge t} \mid \mathcal{F}_{t_{i-1}}) \mid \mathcal{F}_s \big) = 0 = X^i_s,$$
and if $s > t_{i-1}$, then $H_{i-1}$ is $\mathcal{F}_s$-measurable, so $E(X^i_t \mid \mathcal{F}_s) = H_{i-1}(M_{t_i \wedge s} - M_{t_{i-1} \wedge s}) = X^i_s$.

We also check that
$$\|X^i\|_{\mathcal{M}^2} \le 2 \|H\|_\infty \|M\|_{\mathcal{M}^2}.$$
So it is bounded. So $H \cdot M \in \mathcal{M}^2_c$.

To prove $(*)$, we note that the $X^i$ are orthogonal and that
$$\langle X^i \rangle_t = H_{i-1}^2 \big( \langle M \rangle_{t_i \wedge t} - \langle M \rangle_{t_{i-1} \wedge t} \big).$$
So we have
$$\langle H \cdot M, H \cdot M \rangle_t = \sum_i \langle X^i, X^i \rangle_t = \sum_i H_{i-1}^2 \big( \langle M \rangle_{t_i \wedge t} - \langle M \rangle_{t_{i-1} \wedge t} \big) = \int_0^t H_s^2\, d\langle M \rangle_s.$$


In particular,
$$\|H \cdot M\|^2_{\mathcal{M}^2} = E\langle H \cdot M \rangle_\infty = E\left( \int_0^\infty H_s^2\, d\langle M \rangle_s \right).$$
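As a numerical check of $(*)$ (a sketch with my own choice of M and H, not from the notes): take M = B a Brownian motion, so $\langle M \rangle_s = s$, and a simple previsible $H_t = \sum_i H_{i-1} 1_{(t_{i-1}, t_i]}(t)$ with $H_{i-1} = \cos(B_{t_{i-1}})$, which is bounded and $\mathcal{F}_{t_{i-1}}$-measurable. Then $E[(H \cdot B)_1^2]$ should match $E\int_0^1 H_s^2\, ds$.

```python
import numpy as np

rng = np.random.default_rng(5)
n_paths, n_grid = 200_000, 16
t_knots = np.linspace(0.0, 1.0, n_grid + 1)          # 0 = t_0 < ... < t_n = 1
dt = np.diff(t_knots)

# Brownian increments over (t_{i-1}, t_i] for each path.
dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_grid))
B_knots = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dB, axis=1)], axis=1)

H = np.cos(B_knots[:, :-1])                          # H_{i-1} = cos(B_{t_{i-1}}): bounded, previsible
stoch_int = np.sum(H * dB, axis=1)                   # (H . B)_1 = sum_i H_{i-1}(B_{t_i} - B_{t_{i-1}})
lhs = np.mean(stoch_int ** 2)                        # E[(H . B)_1^2]
rhs = np.mean(np.sum(H ** 2 * dt, axis=1))           # E[ int_0^1 H_s^2 d<B>_s ], with <B>_s = s
print("E[(H.B)_1^2]      ~", lhs)
print("E[int_0^1 H^2 ds] ~", rhs)
```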

Proposition. Let $M \in \mathcal{M}^2_c$ and $H \in \mathcal{E}$. Then
$$\langle H \cdot M, N \rangle = H \cdot \langle M, N \rangle$$
for all $N \in \mathcal{M}^2$.

In other words, the stochastic integral commutes with the bracket.

Proof. Write $H \cdot M = \sum X^i = \sum H_{i-1}(M_{t_i \wedge t} - M_{t_{i-1} \wedge t})$ as before. Then
$$\langle X^i, N \rangle_t = H_{i-1} \langle M_{t_i \wedge t} - M_{t_{i-1} \wedge t}, N \rangle = H_{i-1}\big( \langle M, N \rangle_{t_i \wedge t} - \langle M, N \rangle_{t_{i-1} \wedge t} \big).$$

3.2 Itô isometry

We now try to extend the above definition to something more general than simple processes.

Definition ($L^2(M)$). Let $M \in \mathcal{M}^2_c$. Define $L^2(M)$ to be the space of (equivalence classes of) previsible H : Ω × [0, ∞) → R such that
$$\|H\|_{L^2(M)} = \|H\|_M = E\left( \int_0^\infty H_s^2\, d\langle M \rangle_s \right)^{1/2} < \infty.$$
For $H, K \in L^2(M)$, we set
$$(H, K)_{L^2(M)} = E\left( \int_0^\infty H_s K_s\, d\langle M \rangle_s \right).$$
In fact, $L^2(M)$ is equal to $L^2(\Omega \times [0, \infty), \mathcal{P}, dP\, d\langle M \rangle)$, where $\mathcal{P}$ is the previsible σ-algebra, and in particular is a Hilbert space.

Proposition. Let $M \in \mathcal{M}^2_c$. Then $\mathcal{E}$ is dense in $L^2(M)$.

Proof. Since $L^2(M)$ is a Hilbert space, it suffices to show that if (K, H) = 0 for all $H \in \mathcal{E}$, then K = 0.

So assume that (K, H) = 0 for all $H \in \mathcal{E}$ and set
$$X_t = \int_0^t K_s\, d\langle M \rangle_s.$$
Then X is a well-defined finite variation process, and $X_t \in L^1$ for all t. It suffices to show that $X_t = 0$ for all t, and we shall show that $X_t$ is a continuous martingale.

Let 0 ≤ s < t and $F \in \mathcal{F}_s$ bounded. We let $H = F 1_{(s,t]} \in \mathcal{E}$. By assumption, we know
$$0 = (K, H) = E\left( F \int_s^t K_u\, d\langle M \rangle_u \right) = E\big( F(X_t - X_s) \big).$$
Since this holds for all $\mathcal{F}_s$-measurable F, we have shown that
$$E(X_t \mid \mathcal{F}_s) = X_s.$$
So X is a (continuous) martingale, and we are done.


Theorem. Let $M \in \mathcal{M}^2_c$. Then

(i) The map $H \in \mathcal{E} \mapsto H \cdot M \in \mathcal{M}^2_c$ extends uniquely to an isometry $L^2(M) \to \mathcal{M}^2_c$, called the Itô isometry.

(ii) For $H \in L^2(M)$, $H \cdot M$ is the unique martingale in $\mathcal{M}^2_c$ such that
$$\langle H \cdot M, N \rangle = H \cdot \langle M, N \rangle$$
for all $N \in \mathcal{M}^2_c$, where the integral on the LHS is the stochastic integral (as above) and the RHS is the finite variation integral.

(iii) If T is a stopping time, then $(1_{[0,T]} H) \cdot M = (H \cdot M)^T = H \cdot M^T$.

Definition (Stochastic integral). $H \cdot M$ is the stochastic integral of H with respect to M and we also write
$$(H \cdot M)_t = \int_0^t H_s\, dM_s.$$

It is important that the integral of a martingale is still a martingale. After proving Itô's formula, we will use this fact to show that a lot of things are in fact martingales in a rather systematic manner. For example, it will be rather effortless to show that $B_t^2 - t$ is a martingale when $B_t$ is a standard Brownian motion.

t − t is a martingale when Bt is a standard Brownianmotion.

Proof.

(i) We have already shown that this map is an isometry when restricted to $\mathcal{E}$. So extend by completeness of $\mathcal{M}^2_c$ and denseness of $\mathcal{E}$.

(ii) Again the equation to show is known for simple H, and we want to show it is preserved under taking limits. Suppose $H^n \to H$ in $L^2(M)$ with $H^n \in \mathcal{E}$. Then $H^n \cdot M \to H \cdot M$ in $\mathcal{M}^2_c$. We want to show that
$$\langle H \cdot M, N \rangle_\infty = \lim_{n\to\infty} \langle H^n \cdot M, N \rangle_\infty \quad \text{in } L^1,$$
$$H \cdot \langle M, N \rangle = \lim_{n\to\infty} H^n \cdot \langle M, N \rangle \quad \text{in } L^1$$
for all $N \in \mathcal{M}^2_c$.

To show the first holds, we use the Kunita–Watanabe inequality to get
$$E|\langle H \cdot M - H^n \cdot M, N \rangle_\infty| \le \big( E\langle H \cdot M - H^n \cdot M \rangle_\infty \big)^{1/2} \big( E\langle N \rangle_\infty \big)^{1/2},$$
and the first factor is $\|H \cdot M - H^n \cdot M\|_{\mathcal{M}^2} \to 0$, while the second is finite since $N \in \mathcal{M}^2_c$. The second follows from
$$E\big| \big( (H - H^n) \cdot \langle M, N \rangle \big)_\infty \big| \le \|H - H^n\|_{L^2(M)} \|N\|_{\mathcal{M}^2} \to 0.$$
So we know that $\langle H \cdot M, N \rangle_\infty = (H \cdot \langle M, N \rangle)_\infty$. We can then replace N by the stopped process $N^t$ to get $\langle H \cdot M, N \rangle_t = (H \cdot \langle M, N \rangle)_t$.

To see uniqueness, suppose $X \in \mathcal{M}^2_c$ is another such martingale. Then we have $\langle X - H \cdot M, N \rangle = 0$ for all N. Take $N = X - H \cdot M$, and then we are done.


(iii) For $N \in \mathcal{M}^2_c$, we have
$$\langle (H \cdot M)^T, N \rangle_t = \langle H \cdot M, N \rangle_{t \wedge T} = (H \cdot \langle M, N \rangle)_{t \wedge T} = \big( (H 1_{[0,T]}) \cdot \langle M, N \rangle \big)_t$$
for every N. So we have shown that
$$(H \cdot M)^T = (1_{[0,T]} H) \cdot M$$
by (ii). To prove the second equality, we have
$$\langle H \cdot M^T, N \rangle_t = (H \cdot \langle M^T, N \rangle)_t = (H \cdot \langle M, N \rangle)_{t \wedge T} = \big( (H 1_{[0,T]}) \cdot \langle M, N \rangle \big)_t.$$

Note that (ii) can be written as
$$\left\langle \int_0^{(-)} H_s\, dM_s, N \right\rangle_t = \int_0^t H_s\, d\langle M, N \rangle_s.$$

Corollary.
$$\langle H \cdot M, K \cdot N \rangle = H \cdot (K \cdot \langle M, N \rangle) = (HK) \cdot \langle M, N \rangle.$$
In other words,
$$\left\langle \int_0^{(-)} H_s\, dM_s, \int_0^{(-)} K_s\, dN_s \right\rangle_t = \int_0^t H_s K_s\, d\langle M, N \rangle_s.$$

Corollary. Since $H \cdot M$ and $(H \cdot M)(K \cdot N) - \langle H \cdot M, K \cdot N \rangle$ are martingales starting at 0, we have
$$E\left( \int_0^t H_s\, dM_s \right) = 0,$$
$$E\left( \left( \int_0^t H_s\, dM_s \right)\left( \int_0^t K_s\, dN_s \right) \right) = E\left( \int_0^t H_s K_s\, d\langle M, N \rangle_s \right).$$

Corollary. Let $H \in L^2(M)$. Then $HK \in L^2(M)$ iff $K \in L^2(H \cdot M)$, in which case
$$(KH) \cdot M = K \cdot (H \cdot M).$$

Proof. We have
$$E\left( \int_0^\infty K_s^2 H_s^2\, d\langle M \rangle_s \right) = E\left( \int_0^\infty K_s^2\, d\langle H \cdot M \rangle_s \right),$$
so $\|K\|_{L^2(H \cdot M)} = \|HK\|_{L^2(M)}$. For $N \in \mathcal{M}^2_c$, we have
$$\langle (KH) \cdot M, N \rangle_t = \big( KH \cdot \langle M, N \rangle \big)_t = \big( K \cdot (H \cdot \langle M, N \rangle) \big)_t = \big( K \cdot \langle H \cdot M, N \rangle \big)_t.$$

3.3 Extension to local martingales

We have now defined the stochastic integral for continuous martingales. We next go through some formalities to extend this to local martingales, and ultimately to semi-martingales. We are not doing this just for fun. Rather, when we later prove results like Itô's formula, even when we put in continuous (local) martingales, we usually end up with some semi-martingales. So it is useful to be able to deal with semi-martingales in general.


Definition ($L^2_{\mathrm{bc}}(M)$). Let $L^2_{\mathrm{bc}}(M)$ be the space of previsible H such that
$$\int_0^t H_s^2\, d\langle M \rangle_s < \infty \quad \text{a.s.}$$
for all finite t > 0.

Theorem. Let M be a continuous local martingale.

(i) For every $H \in L^2_{\mathrm{bc}}(M)$, there is a unique continuous local martingale $H \cdot M$ with $(H \cdot M)_0 = 0$ and
$$\langle H \cdot M, N \rangle = H \cdot \langle M, N \rangle$$
for all continuous local martingales N.

(ii) If T is a stopping time, then
$$(1_{[0,T]} H) \cdot M = (H \cdot M)^T = H \cdot M^T.$$

(iii) If $H \in L^2_{\mathrm{bc}}(M)$ and K is previsible, then $K \in L^2_{\mathrm{bc}}(H \cdot M)$ iff $HK \in L^2_{\mathrm{bc}}(M)$, and then
$$K \cdot (H \cdot M) = (KH) \cdot M.$$

(iv) Finally, if $M \in \mathcal{M}^2_c$ and $H \in L^2(M)$, then the definition is the same as the previous one.

Proof. Assume $M_0 = 0$, and that $\int_0^t H_s^2\, d\langle M \rangle_s < \infty$ for all ω ∈ Ω (by setting H = 0 when this fails). Set
$$S_n = \inf\left\{ t \ge 0 : \int_0^t (1 + H_s^2)\, d\langle M \rangle_s \ge n \right\}.$$
These $S_n$ are stopping times that tend to infinity. Then
$$\langle M^{S_n}, M^{S_n} \rangle_t = \langle M, M \rangle_{t \wedge S_n} \le n.$$
So $M^{S_n} \in \mathcal{M}^2_c$. Also,
$$\int_0^\infty H_s^2\, d\langle M^{S_n} \rangle_s = \int_0^{S_n} H_s^2\, d\langle M \rangle_s \le n.$$
So $H \in L^2(M^{S_n})$, and we have already defined what $H \cdot M^{S_n}$ is. Now notice that
$$H \cdot M^{S_n} = (H \cdot M^{S_m})^{S_n} \quad \text{for } m \ge n.$$
So it makes sense to define
$$H \cdot M = \lim_{n\to\infty} H \cdot M^{S_n}.$$
This is the unique process such that $(H \cdot M)^{S_n} = H \cdot M^{S_n}$. We see that $H \cdot M$ is a continuous adapted local martingale with reducing sequence $S_n$.

Claim. $\langle H \cdot M, N \rangle = H \cdot \langle M, N \rangle$.


Indeed, assume that $N_0 = 0$. Set $S'_n = \inf\{t \geq 0 : |N_t| \geq n\}$ and $T_n = S_n \wedge S'_n$. Observe that $N^{S'_n} \in \mathcal{M}^2_c$. Then
\[
  \langle H \cdot M, N\rangle^{T_n} = \langle H \cdot M^{S_n}, N^{S'_n}\rangle^{T_n} = (H \cdot \langle M^{S_n}, N^{S'_n}\rangle)^{T_n} = (H \cdot \langle M, N\rangle)^{T_n}.
\]
Taking the limit $n \to \infty$ gives the desired result.

The proofs of the other claims are the same as before, since they only use the characterizing property $\langle H \cdot M, N\rangle = H \cdot \langle M, N\rangle$.

3.4 Extension to semi-martingales

Definition (Locally bounded previsible process). A previsible process $H$ is locally bounded if for all $t \geq 0$, we have
\[
  \sup_{s \leq t} |H_s| < \infty \text{ a.s.}
\]

Fact.

(i) Any adapted continuous process is locally bounded.

(ii) If $H$ is locally bounded and $A$ is a finite variation process, then for all $t \geq 0$, we have
\[
  \int_0^t |H_s| \,|dA_s| < \infty \text{ a.s.}
\]

Now if $X = X_0 + M + A$ is a semi-martingale, where $X_0 \in \mathcal{F}_0$, $M$ is a continuous local martingale and $A$ is a finite variation process, we want to define $\int H_s \,dX_s$. We already know what it means to integrate with respect to $dM_s$ and $dA_s$, using the Ito integral and the finite variation integral respectively, and $X_0$ doesn't change, so we can ignore it.

Definition (Stochastic integral). Let $X = X_0 + M + A$ be a continuous semi-martingale, and $H$ a locally bounded previsible process. Then the stochastic integral $H \cdot X$ is the continuous semi-martingale defined by
\[
  H \cdot X = H \cdot M + H \cdot A,
\]
and we write
\[
  (H \cdot X)_t = \int_0^t H_s \,dX_s.
\]

Proposition.

(i) $(H, X) \mapsto H \cdot X$ is bilinear.

(ii) $H \cdot (K \cdot X) = (HK) \cdot X$ if $H$ and $K$ are locally bounded.

(iii) $(H \cdot X)^T = (H 1_{[0,T]}) \cdot X = H \cdot X^T$ for every stopping time $T$.

(iv) If $X$ is a continuous local martingale (resp. a finite variation process), then so is $H \cdot X$.


(v) If $H = \sum_{i=1}^n H_{i-1} 1_{(t_{i-1}, t_i]}$ with $H_{i-1} \in \mathcal{F}_{t_{i-1}}$ (not necessarily bounded), then
\[
  (H \cdot X)_t = \sum_{i=1}^n H_{i-1} (X_{t_i \wedge t} - X_{t_{i-1} \wedge t}).
\]

Proof. (i) to (iv) follow from the analogous properties for $H \cdot M$ and $H \cdot A$. The last part is also true by definition if the $H_i$ are uniformly bounded. If the $H_i$ are not bounded, then the finite variation part is still fine, since for each fixed $\omega \in \Omega$, $H_i(\omega)$ is a fixed number. For the martingale part, set
\[
  T_n = \inf\{t \geq 0 : |H_t| \geq n\}.
\]
Then the $T_n$ are stopping times with $T_n \to \infty$, and $H 1_{[0,T_n]} \in \mathcal{E}$. Thus
\[
  (H \cdot M)_{t \wedge T_n} = \sum_{i=1}^n H_{i-1} 1_{[0,T_n]}(t_{i-1}) \,(X_{t_i \wedge t} - X_{t_{i-1} \wedge t}).
\]
Then take the limit $n \to \infty$.

Before we get to Ito’s formula, we need a few more useful properties:

Proposition (Stochastic dominated convergence theorem). Let $X$ be a continuous semi-martingale. Let $H$ and $H^n$ be previsible and locally bounded, and let $K$ be previsible and non-negative. Let $t > 0$. Suppose

(i) $H^n_s \to H_s$ as $n \to \infty$ for all $s \in [0, t]$.

(ii) $|H^n_s| \leq K_s$ for all $s \in [0, t]$ and $n \in \mathbb{N}$.

(iii) $\int_0^t K_s^2 \,d\langle M\rangle_s < \infty$ and $\int_0^t K_s \,|dA_s| < \infty$ (note that both conditions hold if $K$ is locally bounded).

Then
\[
  \int_0^t H^n_s \,dX_s \to \int_0^t H_s \,dX_s \quad\text{in probability}.
\]

Proof. For the finite variation part, the convergence follows from the usual dominated convergence theorem. For the martingale part, we set
\[
  T_m = \inf\Bigl\{t \geq 0 : \int_0^t K_s^2 \,d\langle M\rangle_s \geq m\Bigr\}.
\]
Then we have
\[
  E\Bigl(\Bigl(\int_0^{T_m \wedge t} H^n_s \,dM_s - \int_0^{T_m \wedge t} H_s \,dM_s\Bigr)^2\Bigr) \leq E\Bigl(\int_0^{T_m \wedge t} (H^n_s - H_s)^2 \,d\langle M\rangle_s\Bigr) \to 0
\]
as $n \to \infty$, using the usual dominated convergence theorem, since $\int_0^{T_m \wedge t} K_s^2 \,d\langle M\rangle_s \leq m$. Since $T_m \wedge t = t$ eventually as $m \to \infty$ almost surely, hence in probability, we are done.


Proposition. Let $X$ be a continuous semi-martingale, and let $H$ be an adapted bounded left-continuous process. Then for every subdivision $0 = t_0^{(m)} < t_1^{(m)} < \cdots < t_{n_m}^{(m)} = t$ of $[0, t]$ with $\max_i |t_i^{(m)} - t_{i-1}^{(m)}| \to 0$, we have
\[
  \int_0^t H_s \,dX_s = \lim_{m \to \infty} \sum_{i=1}^{n_m} H_{t_{i-1}^{(m)}} \bigl(X_{t_i^{(m)}} - X_{t_{i-1}^{(m)}}\bigr)
\]
in probability.

Proof. We have already proved this for the Lebesgue–Stieltjes integral, and all we used was dominated convergence. So the same proof works, using the stochastic dominated convergence theorem.

3.5 Ito formula

We now prove the analogues of integration by parts and the chain rule, i.e. Ito's formula. Compared to the world of usual integrals, the difference is that the quadratic variation, i.e. the "second order terms", will crop up quite a lot, since they are no longer negligible.

Theorem (Integration by parts). Let $X, Y$ be continuous semi-martingales. Then almost surely,
\[
  X_t Y_t - X_0 Y_0 = \int_0^t X_s \,dY_s + \int_0^t Y_s \,dX_s + \langle X, Y\rangle_t.
\]
The last term is called the Ito correction.

Note that if $X, Y$ are martingales, then the first two terms on the right are martingales, but the last is not. So we are forced to think about semi-martingales.

Observe that in the case of finite variation integrals, we don't have the correction.

Proof. We have
\[
  X_t Y_t - X_s Y_s = X_s(Y_t - Y_s) + (X_t - X_s)Y_s + (X_t - X_s)(Y_t - Y_s).
\]
When doing usual calculus, we can drop the last term, because it is second order. However, the quadratic variation of martingales is in general non-zero, and so we must keep track of this. We have
\begin{align*}
  X_{k 2^{-n}} Y_{k 2^{-n}} - X_0 Y_0 &= \sum_{i=1}^k \bigl(X_{i 2^{-n}} Y_{i 2^{-n}} - X_{(i-1) 2^{-n}} Y_{(i-1) 2^{-n}}\bigr)\\
  &= \sum_{i=1}^k \bigl(X_{(i-1) 2^{-n}} (Y_{i 2^{-n}} - Y_{(i-1) 2^{-n}}) + Y_{(i-1) 2^{-n}} (X_{i 2^{-n}} - X_{(i-1) 2^{-n}})\\
  &\qquad\qquad + (X_{i 2^{-n}} - X_{(i-1) 2^{-n}})(Y_{i 2^{-n}} - Y_{(i-1) 2^{-n}})\bigr).
\end{align*}
Taking the limit $n \to \infty$ with $k 2^{-n}$ fixed, we see that the formula holds for $t$ a dyadic rational. Then by continuity, it holds for all $t$.


The really useful formula is the following:

Theorem (Ito’s formula). Let X1, . . . , Xp be continuous semi-martingales, andlet f : Rp → R be C2. Then, writing X = (X1, . . . , Xp), we have, almost surely,

f(Xt) = f(X0) +

p∑i=1

∫ t

0

∂f

∂xi(Xs) dXi

s +1

2

p∑i,j=1

∫ t

0

∂2f

∂xi∂xj(Xs) d〈Xi, Xj〉s.

In particular, f(X) is a semi-martingale.

The proof is long but not hard. We first do it for polynomials by explicitcomputation, and then use Weierstrass approximation to extend it to moregeneral functions.

Proof.

Claim. Ito’s formula holds when f is a polynomial.It clearly does when f is a constant! We then proceed by induction. Suppose

Ito’s formula holds for some f . Then we apply integration by parts to

g(x) = xkf(x).

where xk denotes the kth component of x. Then we have

g(Xt) = g(X0) +

∫ t

0

Xks df(Xs) +

∫ t

0

f(Xs) dXks + 〈Xk, f(X)〉t

We now apply Ito’s formula for f to write∫ t

0

Xks df(Xs) =

p∑i=1

∫ t

0

Xks

∂f

∂xi(Xs) dXi

s

+1

2

p∑i,j=1

∫ t

0

Xks

∂2f

∂xi∂xj(Xs) d〈Xi, Xj〉s.

We also have

〈Xk, f(X)〉t =

p∑i=1

∫ t

0

∂f

∂xi(Xs) d〈Xk, Xi〉s.

So we have

g(Xt) = g(X0) +

p∑i=1

∫ t

0

∂g

∂xi(Xs) dXi

s +1

2

p∑i,j=1

∫ t

0

∂2g

∂xi∂xj(Xs) d〈Xi, Xj〉s.

So by induction, Ito’s formula holds for all polynomials.

Claim. Ito’s formula holds for all f ∈ C2 if |Xt(ω)| ≤ n and∫ t0|dAs| ≤ n for

all (t, ω).

By the Weierstrass approximation theorem, there are polynomials pk suchthat

sup|x|≤k

(|f(x)− pk(x)|+ max

i

∣∣∣∣ ∂f∂xi − ∂p

∂xi

∣∣∣∣+ maxi,j

∣∣∣∣ ∂2f

∂xi∂xj− ∂pk∂xi∂xj

∣∣∣∣) ≤ 1

k.


By taking limits in probability, we have
\[
  f(X_t) - f(X_0) = \lim_{k \to \infty} \bigl(p_k(X_t) - p_k(X_0)\bigr),\qquad
  \int_0^t \frac{\partial f}{\partial x_i}(X_s) \,dX^i_s = \lim_{k \to \infty} \int_0^t \frac{\partial p_k}{\partial x_i}(X_s) \,dX^i_s
\]
by the stochastic dominated convergence theorem, and by the regular dominated convergence theorem, we have
\[
  \int_0^t \frac{\partial^2 f}{\partial x_i \partial x_j}(X_s) \,d\langle X^i, X^j\rangle_s = \lim_{k \to \infty} \int_0^t \frac{\partial^2 p_k}{\partial x_i \partial x_j}(X_s) \,d\langle X^i, X^j\rangle_s.
\]

Claim. Ito’s formula holds for all X.

Let
\[
  T_n = \inf\Bigl\{t \geq 0 : |X_t| \geq n \text{ or } \int_0^t |dA_s| \geq n\Bigr\}.
\]
Then by the previous claim, we have
\begin{align*}
  f(X^{T_n}_t) &= f(X_0) + \sum_{i=1}^p \int_0^t \frac{\partial f}{\partial x_i}(X^{T_n}_s) \,d(X^i)^{T_n}_s + \frac{1}{2} \sum_{i,j} \int_0^t \frac{\partial^2 f}{\partial x_i \partial x_j}(X^{T_n}_s) \,d\langle (X^i)^{T_n}, (X^j)^{T_n}\rangle_s\\
  &= f(X_0) + \sum_{i=1}^p \int_0^{t \wedge T_n} \frac{\partial f}{\partial x_i}(X_s) \,dX^i_s + \frac{1}{2} \sum_{i,j} \int_0^{t \wedge T_n} \frac{\partial^2 f}{\partial x_i \partial x_j}(X_s) \,d\langle X^i, X^j\rangle_s.
\end{align*}
Then take $T_n \to \infty$.

Example. Let $B$ be a standard Brownian motion with $B_0 = 0$, and take $f(x) = x^2$. Then
\[
  B_t^2 = 2 \int_0^t B_s \,dB_s + t.
\]
In other words,
\[
  B_t^2 - t = 2 \int_0^t B_s \,dB_s.
\]
In particular, this is a continuous local martingale.
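This identity is easy to check numerically. The sketch below is not part of the notes (the grid size and number of paths are arbitrary choices); it approximates the Ito integral by left-endpoint Riemann sums, as justified by the proposition at the end of the previous section, and compares the two sides.

```python
import numpy as np

# Sanity check of B_t^2 - t = 2 * int_0^t B_s dB_s, with the Ito integral
# approximated by left-endpoint Riemann sums.
rng = np.random.default_rng(0)
t, n_steps, n_paths = 1.0, 10_000, 2_000
dt = t / n_steps

dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))     # increments
B = np.cumsum(dB, axis=1)                                      # B at grid points
B_left = np.hstack([np.zeros((n_paths, 1)), B[:, :-1]])        # left endpoints

ito_integral = np.sum(B_left * dB, axis=1)     # approx int_0^t B_s dB_s
lhs = B[:, -1] ** 2 - t                        # B_t^2 - t
rhs = 2 * ito_integral

print("mean |lhs - rhs|   :", np.mean(np.abs(lhs - rhs)))   # -> 0 as the grid refines
print("sample mean of rhs :", rhs.mean())                   # approx 0 (martingale at 0)
```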

Example. Let $B = (B^1, \ldots, B^d)$ be a $d$-dimensional Brownian motion. We apply Ito's formula to the semi-martingale $X = (t, B^1, \ldots, B^d)$, and find that
\[
  f(t, B_t) - f(0, B_0) - \int_0^t \Bigl(\frac{\partial}{\partial s} + \frac{1}{2}\Delta\Bigr) f(s, B_s) \,ds = \sum_{i=1}^d \int_0^t \partial_{x_i} f(s, B_s) \,dB^i_s
\]
is a continuous local martingale.


There are some syntactic tricks that make stochastic integrals easier to manipulate, namely working in differential form. We can state Ito's formula in differential form as
\[
  df(X_t) = \sum_{i=1}^p \frac{\partial f}{\partial x_i} \,dX^i + \frac{1}{2} \sum_{i,j=1}^p \frac{\partial^2 f}{\partial x_i \partial x_j} \,d\langle X^i, X^j\rangle,
\]
which we can think of as the chain rule. For example, in the case of Brownian motion, we have
\[
  df(B_t) = f'(B_t) \,dB_t + \frac{1}{2} f''(B_t) \,dt.
\]
Formally, one expands $f$ using that "$(dt)^2 = 0$" but "$(dB)^2 = dt$". The following formal rules hold:
\begin{align*}
  Z_t - Z_0 = \int_0^t H_s \,dX_s &\iff dZ_t = H_t \,dX_t,\\
  Z_t = \langle X, Y\rangle_t = \int_0^t d\langle X, Y\rangle_s &\iff dZ_t = dX_t \,dY_t.
\end{align*}
Then we have rules such as
\begin{align*}
  H_t(K_t \,dX_t) &= (H_t K_t) \,dX_t,\\
  H_t(dX_t \,dY_t) &= (H_t \,dX_t) \,dY_t,\\
  d(X_t Y_t) &= X_t \,dY_t + Y_t \,dX_t + dX_t \,dY_t,\\
  df(X_t) &= f'(X_t) \,dX_t + \frac{1}{2} f''(X_t) \,dX_t \,dX_t.
\end{align*}

3.6 The Levy characterization

A major application of the stochastic integral is the following convenient characterization of Brownian motion:

Theorem (Levy's characterization of Brownian motion). Let $(X^1, \ldots, X^d)$ be continuous local martingales. Suppose that $X_0 = 0$ and that $\langle X^i, X^j\rangle_t = \delta_{ij} t$ for all $i, j = 1, \ldots, d$ and $t \geq 0$. Then $(X^1, \ldots, X^d)$ is a standard $d$-dimensional Brownian motion.

This might seem like a rather artificial condition, but it turns out to be quite useful in practice (though less so in this course). The point is that we know that $\langle H \cdot M\rangle_t = (H^2 \cdot \langle M\rangle)_t$, and in particular, if we are integrating things with respect to Brownian motions of some sort, we know $\langle B\rangle_t = t$, and so we are left with some explicit, familiar integral to do.

Proof. Let $0 \leq s < t$. It suffices to check that $X_t - X_s$ is independent of $\mathcal{F}_s$ and $X_t - X_s \sim N(0, (t-s)I)$.

Claim. $E(e^{i\theta \cdot (X_t - X_s)} \mid \mathcal{F}_s) = e^{-\frac{1}{2}|\theta|^2 (t-s)}$ for all $\theta \in \mathbb{R}^d$ and $s < t$.

This is sufficient, since the right-hand side is independent of $\mathcal{F}_s$, hence so is the left-hand side, and the Fourier transform characterizes the distribution.


To check this, for $\theta \in \mathbb{R}^d$, we define
\[
  Y_t = \theta \cdot X_t = \sum_{i=1}^d \theta_i X^i_t.
\]
Then $Y$ is a continuous local martingale, and we have
\[
  \langle Y\rangle_t = \langle Y, Y\rangle_t = \sum_{j,k=1}^d \theta_j \theta_k \langle X^j, X^k\rangle_t = |\theta|^2 t
\]
by assumption. Let
\[
  Z_t = e^{i Y_t + \frac{1}{2}\langle Y\rangle_t} = e^{i \theta \cdot X_t + \frac{1}{2}|\theta|^2 t}.
\]
By Ito's formula, with $X = iY + \frac{1}{2}\langle Y\rangle$ and $f(x) = e^x$, we get
\[
  dZ_t = Z_t \Bigl(i \,dY_t - \frac{1}{2} d\langle Y\rangle_t + \frac{1}{2} d\langle Y\rangle_t\Bigr) = i Z_t \,dY_t.
\]
So $Z$ is a continuous local martingale. Moreover, since $Z$ is bounded on bounded intervals of $t$, we know $Z$ is in fact a martingale, and $Z_0 = 1$. Then by the definition of a martingale, we have
\[
  E(Z_t \mid \mathcal{F}_s) = Z_s,
\]
and unwrapping the definition of $Z_t$ shows that the result follows.

In general, the quadratic variation of a process doesn't have to be linear in $t$. It turns out that if the quadratic variation increases to infinity, then the martingale is still a Brownian motion up to reparametrization.

Theorem (Dubins–Schwarz). Let $M$ be a continuous local martingale with $M_0 = 0$ and $\langle M\rangle_\infty = \infty$. Let
\[
  T_s = \inf\{t \geq 0 : \langle M\rangle_t > s\},
\]
the right-continuous inverse of $\langle M\rangle$. Let $B_s = M_{T_s}$ and $\mathcal{G}_s = \mathcal{F}_{T_s}$. Then $T_s$ is an $(\mathcal{F}_t)$-stopping time, $\langle M\rangle_{T_s} = s$ for all $s \geq 0$, $B$ is a $(\mathcal{G}_s)$-Brownian motion, and
\[
  M_t = B_{\langle M\rangle_t}.
\]

Proof. Since $\langle M\rangle$ is continuous and adapted, and $\langle M\rangle_\infty = \infty$, we know $T_s$ is a stopping time and $T_s < \infty$ for all $s \geq 0$.

Claim. $(\mathcal{G}_s)$ is a filtration obeying the usual conditions, and $\mathcal{G}_\infty = \mathcal{F}_\infty$.

Indeed, if $A \in \mathcal{G}_s$ and $s < t$, then
\[
  A \cap \{T_t \leq u\} = A \cap \{T_s \leq u\} \cap \{T_t \leq u\} \in \mathcal{F}_u,
\]
using that $A \cap \{T_s \leq u\} \in \mathcal{F}_u$ since $A \in \mathcal{G}_s$. Right-continuity then follows from that of $(\mathcal{F}_t)$ and the right-continuity of $s \mapsto T_s$.

Claim. B is adapted to (Gs).


In general, if $X$ is cadlag and $T$ is a stopping time, then $X_T 1_{\{T < \infty\}} \in \mathcal{F}_T$. Apply this with $X = M$, $T = T_s$ and $\mathcal{F}_T = \mathcal{G}_s$. Thus $B_s \in \mathcal{G}_s$.

Claim. $B$ is continuous.

Here there is actually something to verify, because $s \mapsto T_s$ is only right-continuous, not necessarily continuous. Thus, we only know $B$ is right-continuous, and we have to check it is left-continuous.

Now $B$ is left-continuous at $s$ iff $B_s = B_{s^-}$, iff $M_{T_s} = M_{T_{s^-}}$. Now we have
\[
  T_{s^-} = \inf\{t \geq 0 : \langle M\rangle_t \geq s\}.
\]
If $T_s = T_{s^-}$, then there is nothing to show. Thus, we may assume $T_s > T_{s^-}$. Then we have $\langle M\rangle_{T_s} = \langle M\rangle_{T_{s^-}}$. Since $\langle M\rangle$ is increasing, it means $\langle M\rangle$ is constant on $[T_{s^-}, T_s]$. We will later prove the following:

Lemma. $M$ is constant on $[a, b]$ iff $\langle M\rangle$ is constant on $[a, b]$.

So we know that if $T_s > T_{s^-}$, then $M_{T_s} = M_{T_{s^-}}$. So $B$ is left-continuous. We then have to show that $B$ is a martingale.

Claim. $(M^2 - \langle M\rangle)^{T_s}$ is a uniformly integrable martingale.

To see this, observe that $\langle M^{T_s}\rangle_\infty = \langle M\rangle_{T_s} = s$, and so $M^{T_s}$ is bounded in $L^2$. So $(M^2 - \langle M\rangle)^{T_s}$ is a uniformly integrable martingale.

We now apply the optional stopping theorem, which tells us that for $r \leq s$,
\[
  E(B_s \mid \mathcal{G}_r) = E(M^{T_s}_\infty \mid \mathcal{F}_{T_r}) = M_{T_r} = B_r.
\]
So $B$ is a martingale. Moreover,
\[
  E(B_s^2 - s \mid \mathcal{G}_r) = E\bigl((M^2 - \langle M\rangle)^{T_s}_\infty \mid \mathcal{F}_{T_r}\bigr) = M^2_{T_r} - \langle M\rangle_{T_r} = B_r^2 - r.
\]
So $B_s^2 - s$ is a martingale, so by the characterizing property of the quadratic variation, $\langle B\rangle_s = s$. So by Levy's criterion, $B$ is a Brownian motion in one dimension.

The theorem is only true for martingales in one dimension. In two dimensions, this need not be true, because the time changes needed for the horizontal and vertical components may not agree. However, on the example sheet, we see that the holomorphic image of a Brownian motion is still a Brownian motion up to a time change.

Lemma. $M$ is constant on $[a, b]$ iff $\langle M\rangle$ is constant on $[a, b]$.

Proof. It is clear that if $M$ is constant on $[a, b]$, then so is $\langle M\rangle$. To prove the converse, by continuity, it suffices to prove that for any fixed $a < b$,
\[
  \{M_t = M_a \text{ for all } t \in [a, b]\} \supseteq \{\langle M\rangle_b = \langle M\rangle_a\} \quad\text{almost surely}.
\]
We set $N_t = M_t - M_{t \wedge a}$. Then $\langle N\rangle_t = \langle M\rangle_t - \langle M\rangle_{t \wedge a}$. Define
\[
  T_\varepsilon = \inf\{t \geq 0 : \langle N\rangle_t \geq \varepsilon\}.
\]
Then since $N^2 - \langle N\rangle$ is a local martingale, we know that
\[
  E(N^2_{t \wedge T_\varepsilon}) = E(\langle N\rangle_{t \wedge T_\varepsilon}) \leq \varepsilon.
\]
Now observe that on the event $\{\langle M\rangle_b = \langle M\rangle_a\}$, we have $\langle N\rangle_b = 0$, and hence $T_\varepsilon \geq b$. So for $t \in [a, b]$, we have
\[
  E\bigl(1_{\{\langle M\rangle_b = \langle M\rangle_a\}} N_t^2\bigr) = E\bigl(1_{\{\langle M\rangle_b = \langle M\rangle_a\}} N^2_{t \wedge T_\varepsilon}\bigr) \leq E(\langle N\rangle_{t \wedge T_\varepsilon}) \leq \varepsilon.
\]
Since $\varepsilon > 0$ was arbitrary, $N_t = 0$ almost surely on this event, as required.
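The Dubins–Schwarz time change is easy to see numerically. The following sketch is not from the notes (the integrand $\sigma$ and the grid sizes are arbitrary choices for illustration): it builds $M_t = \int_0^t \sigma(s)\,dB_s$, inverts $\langle M\rangle_t = \int_0^t \sigma(s)^2\,ds$ on the grid, and checks that the time-changed process $M_{T_s}$ has variance $s$, as a Brownian motion should.

```python
import numpy as np

# Dubins-Schwarz illustration: M_t = int_0^t sigma(s) dB_s, time-changed by the
# right-continuous inverse T_s of <M>_t, should look like a Brownian motion.
rng = np.random.default_rng(1)
n_paths, n_steps, t_max = 5_000, 4_000, 4.0
dt = t_max / n_steps
t = np.linspace(dt, t_max, n_steps)
sigma = 1.0 + np.sin(t) ** 2                     # arbitrary deterministic integrand

dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
M = np.cumsum(sigma * dB, axis=1)                # M_t = int_0^t sigma dB
qv = np.cumsum(sigma ** 2 * dt)                  # <M>_t, deterministic here

s_grid = np.arange(0.5, 4.0, 0.5)
idx = np.searchsorted(qv, s_grid)                # grid version of T_s
B_tilde = M[:, idx]                              # B_s = M_{T_s}

for s, v in zip(s_grid, B_tilde.var(axis=0)):
    print(f"s = {s:3.1f}   var(M_Ts) = {v:6.3f}")  # should be close to s
```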


3.7 Girsanov’s theorem

Girsanov’s theorem tells us what happens to our (semi)-martingales when wechange the measure of our space. We first look at a simple example when weperform a shift.

Example. Let $X \sim N(0, C)$ be an $n$-dimensional centered Gaussian with positive definite covariance $C = (C_{ij})_{i,j=1}^n$. Put $M = C^{-1}$. Then for any function $f$, we have
\[
  E f(X) = \Bigl(\det \frac{M}{2\pi}\Bigr)^{1/2} \int_{\mathbb{R}^n} f(x)\, e^{-\frac{1}{2}(x, Mx)} \,dx.
\]
Now fix an $a \in \mathbb{R}^n$. The distribution of $X + a$ then satisfies
\[
  E f(X + a) = \Bigl(\det \frac{M}{2\pi}\Bigr)^{1/2} \int_{\mathbb{R}^n} f(x)\, e^{-\frac{1}{2}(x - a, M(x - a))} \,dx = E[Z f(X)],
\]
where
\[
  Z = Z(x) = e^{-\frac{1}{2}(a, Ma) + (x, Ma)}.
\]
Thus, if $P$ denotes the distribution of $X$, then the measure $Q$ with
\[
  \frac{dQ}{dP} = Z
\]
is that of an $N(a, C)$ vector.
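This change of measure is easy to verify by Monte Carlo. The following sketch is only an illustration; the covariance $C$, shift $a$ and test function $f$ below are arbitrary choices, not anything from the notes.

```python
import numpy as np

# Check E f(X + a) = E[Z(X) f(X)] for X ~ N(0, C), where
# Z(x) = exp(-(a, Ma)/2 + (x, Ma)) and M = C^{-1}.
rng = np.random.default_rng(2)
C = np.array([[2.0, 0.5], [0.5, 1.0]])           # example covariance
M = np.linalg.inv(C)
a = np.array([0.3, -0.7])                        # example shift
f = lambda x: np.exp(-np.sum(x ** 2, axis=1))    # example test function

X = rng.multivariate_normal(np.zeros(2), C, size=1_000_000)
Z = np.exp(-0.5 * a @ M @ a + X @ (M @ a))

print("E f(X + a):", f(X + a).mean())
print("E[Z f(X)] :", (Z * f(X)).mean())          # the two should agree
```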

Example. We can extend the above example to Brownian motion. Let $B$ be a Brownian motion with $B_0 = 0$, and $h : [0, \infty) \to \mathbb{R}$ a deterministic function. We then want to understand the distribution of $B + h$.

Fix a finite sequence of times $0 = t_0 < t_1 < \cdots < t_n$. Then we know that $(B_{t_i})_{i=1}^n$ is a centered Gaussian random vector. Thus, if $f(B) = f(B_{t_1}, \ldots, B_{t_n})$ is a function of finitely many coordinates, then
\[
  E(f(B)) = c \int_{\mathbb{R}^n} f(x) \exp\Bigl(-\frac{1}{2} \sum_{i=1}^n \frac{(x_i - x_{i-1})^2}{t_i - t_{i-1}}\Bigr) \,dx_1 \cdots dx_n,
\]
where $c$ is a normalizing constant and $x_0 = 0$. Thus, after a shift, we get
\[
  E(f(B + h)) = E(Z f(B)),
\]
where
\[
  Z = \exp\Bigl(-\frac{1}{2} \sum_{i=1}^n \frac{(h_{t_i} - h_{t_{i-1}})^2}{t_i - t_{i-1}} + \sum_{i=1}^n \frac{(h_{t_i} - h_{t_{i-1}})(B_{t_i} - B_{t_{i-1}})}{t_i - t_{i-1}}\Bigr).
\]

In general, we are interested in what happens when we change the measure by an exponential:

Definition (Stochastic exponential). Let $M$ be a continuous local martingale. Then the stochastic exponential (or Doleans–Dade exponential) of $M$ is
\[
  \mathcal{E}(M)_t = e^{M_t - \frac{1}{2}\langle M\rangle_t}.
\]
The point of introducing the quadratic variation term is the following:


Proposition. Let $M$ be a continuous local martingale with $M_0 = 0$. Then $Z = \mathcal{E}(M)$ satisfies
\[
  dZ_t = Z_t \,dM_t,
\]
i.e.
\[
  Z_t = 1 + \int_0^t Z_s \,dM_s.
\]
In particular, $\mathcal{E}(M)$ is a continuous local martingale. Moreover, if $\langle M\rangle$ is uniformly bounded, then $\mathcal{E}(M)$ is a uniformly integrable martingale.

There is a more general condition for the final property, namely Novikov's condition, but we will not go into that.

Proof. By Ito’s formula with X = M − 12 〈M〉, we have

dZt = Ztd

(Mt −

1

2d〈M〉t

)+

1

2Ztd〈M〉t = Zt dMt.

Since M is a continuous local martingale, so is∫Zs dMs. So Z is a continuous

local martingale.Now suppose 〈M〉∞ ≤ b <∞. Then

P(

supt≥0

Mt ≥ a)

= P(

supt≥0

Mt ≥ a, 〈M〉∞ ≤ b)≤ e−a

2/2b,

where the final equality is an exercise on the third example sheet, which is truefor general continuous local martingales. So we get

E(

exp

(suptMt

))=

∫ ∞0

P(exp(supMt) ≥ λ) dλ

=

∫ ∞0

P(supMt ≥ log λ) dλ

≤ 1 +

∫ ∞1

e−(log λ)2/2b dλ <∞.

Since 〈M〉 ≥ 0, we know that

supt≥0E(M)t ≤ exp (supMt) ,

So E(M) is a uniformly integrable martingale.
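As a quick illustration (not part of the notes, with arbitrary parameters), one can check numerically that for $M = \theta B$ the stochastic exponential $\mathcal{E}(\theta B)_t = \exp(\theta B_t - \frac{1}{2}\theta^2 t)$ has mean $1$ at every time, as a martingale started at $1$ should.

```python
import numpy as np

# E(theta*B)_t = exp(theta*B_t - theta^2 t / 2) should have E[Z_t] = 1 for all t.
rng = np.random.default_rng(3)
theta, t_max, n_steps, n_paths = 0.8, 2.0, 2_000, 200_000
dt = t_max / n_steps
t = np.linspace(dt, t_max, n_steps)

B = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps)), axis=1)
Z = np.exp(theta * B - 0.5 * theta ** 2 * t)      # stochastic exponential of theta*B

for i in (n_steps // 4, n_steps // 2, n_steps - 1):
    print(f"t = {t[i]:.2f}   E[Z_t] = {Z[:, i].mean():.4f}")   # all close to 1
```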

Theorem (Girsanov’s theorem). Let M be a continuous local martingale withM0 = 0. Suppose that E(M) is a uniformly integrable martingale. Define a newprobability measure

dQdP

= E(M)∞

Let X be a continuous local martingale with respect to P. Then X − 〈X,M〉 isa continuous local martingale with respect to Q.


Proof. Define the stopping time
\[
  T_n = \inf\{t \geq 0 : |X_t - \langle X, M\rangle_t| \geq n\}.
\]
By continuity, $P(T_n \to \infty) = 1$. Since $Q$ is absolutely continuous with respect to $P$, we also have $Q(T_n \to \infty) = 1$. Thus it suffices to show that $X^{T_n} - \langle X^{T_n}, M\rangle$ is a continuous martingale under $Q$ for any $n$. Let
\[
  Y = X^{T_n} - \langle X^{T_n}, M\rangle,\qquad Z = \mathcal{E}(M).
\]

Claim. $ZY$ is a continuous local martingale with respect to $P$.

We use the product rule to compute
\begin{align*}
  d(ZY)_t &= Y_t \,dZ_t + Z_t \,dY_t + d\langle Y, Z\rangle_t\\
  &= Y_t Z_t \,dM_t + Z_t\bigl(dX^{T_n}_t - d\langle X^{T_n}, M\rangle_t\bigr) + Z_t \,d\langle M, X^{T_n}\rangle_t\\
  &= Y_t Z_t \,dM_t + Z_t \,dX^{T_n}_t.
\end{align*}
So $ZY$ is a stochastic integral with respect to a continuous local martingale. Thus $ZY$ is a continuous local martingale.

Claim. $ZY$ is uniformly integrable.

Since $Z$ is a uniformly integrable martingale, $\{Z_T : T \text{ a stopping time}\}$ is uniformly integrable. Since $Y$ is bounded, $\{Z_T Y_T : T \text{ a stopping time}\}$ is also uniformly integrable. So $ZY$ is a true martingale (with respect to $P$).

Claim. $Y$ is a martingale with respect to $Q$.

We have
\[
  E_Q(Y_t - Y_s \mid \mathcal{F}_s) = E_P(Z_\infty Y_t - Z_\infty Y_s \mid \mathcal{F}_s) = E_P(Z_t Y_t - Z_s Y_s \mid \mathcal{F}_s) = 0.
\]

Note that the quadratic variation does not change when we change the measure, since
\[
  \langle X - \langle X, M\rangle\rangle_t = \langle X\rangle_t = \lim_{n \to \infty} \sum_{i=1}^{\lfloor 2^n t\rfloor} (X_{i 2^{-n}} - X_{(i-1) 2^{-n}})^2 \text{ a.s.}
\]
along a subsequence.


4 Stochastic differential equations

4.1 Existence and uniqueness of solutions

After all this work, we can return to the problem we described in the introduction. We wanted to make sense of equations of the form
\[
  \dot{x}(t) = F(x(t)) + \eta(t),
\]
where $\eta(t)$ is Gaussian white noise. We can now interpret this equation as saying
\[
  dX_t = F(X_t) \,dt + dB_t,
\]
or equivalently, in integral form,
\[
  X_t - X_0 = \int_0^t F(X_s) \,ds + B_t.
\]

In general, we can make the following definition:

Definition (Stochastic differential equation). Let $d, m \in \mathbb{N}$, and let $b : \mathbb{R}_+ \times \mathbb{R}^d \to \mathbb{R}^d$ and $\sigma : \mathbb{R}_+ \times \mathbb{R}^d \to \mathbb{R}^{d \times m}$ be locally bounded (and measurable). A solution to the stochastic differential equation $E(\sigma, b)$ given by
\[
  dX_t = b(t, X_t) \,dt + \sigma(t, X_t) \,dB_t
\]
consists of

(i) a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t), P)$ obeying the usual conditions;

(ii) an $m$-dimensional Brownian motion $B$ with $B_0 = 0$; and

(iii) an $(\mathcal{F}_t)$-adapted continuous process $X$ with values in $\mathbb{R}^d$ such that
\[
  X_t = X_0 + \int_0^t \sigma(s, X_s) \,dB_s + \int_0^t b(s, X_s) \,ds.
\]

If $X_0 = x \in \mathbb{R}^d$, then we say $X$ is a (weak) solution to $E_x(\sigma, b)$. It is a strong solution if it is adapted with respect to the canonical filtration of $B$.

Our goal is to prove existence and uniqueness of solutions to a general class of SDEs. We have just said what it means for a solution to exist, but there is more than one reasonable notion of uniqueness:

Definition (Uniqueness of solutions). For the stochastic differential equation $E(\sigma, b)$, we say there is

– uniqueness in law if for every $x \in \mathbb{R}^d$, all solutions to $E_x(\sigma, b)$ have the same distribution;

– pathwise uniqueness if, when $(\Omega, \mathcal{F}, (\mathcal{F}_t), P)$ and $B$ are fixed, any two solutions $X, X'$ with $X_0 = X'_0$ are indistinguishable.

These two notions are not equivalent, as the following example shows:


Example (Tanaka). Consider the stochastic differential equation
\[
  dX_t = \operatorname{sgn}(X_t) \,dB_t,\qquad X_0 = x,
\]
where
\[
  \operatorname{sgn}(x) = \begin{cases} +1 & x > 0,\\ -1 & x \leq 0.\end{cases}
\]
This has a weak solution which is unique in law, but pathwise uniqueness fails.

To see the existence of solutions, let $X$ be a one-dimensional Brownian motion with $X_0 = x$, and set
\[
  B_t = \int_0^t \operatorname{sgn}(X_s) \,dX_s,
\]
which is well-defined because $\operatorname{sgn}(X_s)$ is previsible and bounded. Then we have
\[
  x + \int_0^t \operatorname{sgn}(X_s) \,dB_s = x + \int_0^t \operatorname{sgn}(X_s)^2 \,dX_s = x + X_t - X_0 = X_t.
\]
So it remains to show that $B$ is a Brownian motion. We already know that $B$ is a continuous local martingale, so by Levy's characterization, it suffices to show its quadratic variation is $t$. We simply compute
\[
  \langle B, B\rangle_t = \int_0^t \operatorname{sgn}(X_s)^2 \,d\langle X, X\rangle_s = t.
\]
So there is weak existence. The same argument shows that any solution is a Brownian motion, so we have uniqueness in law.

Finally, observe that if $x = 0$ and $X$ is a solution, then $-X$ is also a solution with the same Brownian motion. Indeed,
\[
  -X_t = -\int_0^t \operatorname{sgn}(X_s) \,dB_s = \int_0^t \operatorname{sgn}(-X_s) \,dB_s + 2\int_0^t 1_{\{X_s = 0\}} \,dB_s,
\]
where the last term vanishes, since it is a continuous local martingale with quadratic variation $\int_0^t 1_{\{X_s = 0\}} \,ds = 0$. So pathwise uniqueness does not hold.
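A crude discretized illustration of this example (a sketch only, not from the notes; the grid is arbitrary and the sums only approximate the stochastic integrals): build $B$ from a Brownian motion $X$ started at $0$, and check that both $X$ and $-X$ solve the equation against this same $B$.

```python
import numpy as np

# Tanaka's SDE dX = sgn(X) dB: construct B from a Brownian motion X, then verify
# (on the grid) that both X and -X satisfy the equation driven by the same B.
rng = np.random.default_rng(4)
n_steps, t_max = 200_000, 1.0
dt = t_max / n_steps
sgn = lambda x: np.where(x > 0, 1.0, -1.0)        # sgn(0) = -1, as in the notes

dX = rng.normal(0.0, np.sqrt(dt), size=n_steps)
X = np.concatenate([[0.0], np.cumsum(dX)])        # Brownian motion X with X_0 = 0

dB = sgn(X[:-1]) * dX                             # dB = sgn(X) dX
int_X = np.cumsum(sgn(X[:-1]) * dB)               # int sgn(X) dB, should equal X
int_negX = np.cumsum(sgn(-X[:-1]) * dB)           # int sgn(-X) dB, should equal -X

# The only discrepancy in the second line comes from the single grid point where
# X = 0 exactly, the discrete shadow of the term 2 * int 1_{X=0} dB above.
print("max |X - int sgn(X) dB|   :", np.max(np.abs(X[1:] - int_X)))
print("max |-X - int sgn(-X) dB| :", np.max(np.abs(-X[1:] - int_negX)))
```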

In the other direction, however, it turns out that pathwise uniqueness implies uniqueness in law.

Theorem (Yamada–Watanabe). Assume weak existence and pathwise uniqueness hold. Then

(i) uniqueness in law holds;

(ii) for every $(\Omega, \mathcal{F}, (\mathcal{F}_t), P)$ and $B$, and any $x \in \mathbb{R}^d$, there is a unique strong solution to $E_x(\sigma, b)$.

We will not prove this, since we will not actually need it.

The key theorem we are now heading for is the existence and uniqueness of solutions to SDEs under reasonable conditions. As in the case of ODEs, we need the following Lipschitz conditions:


Definition (Lipschitz coefficients). The coefficients $b : \mathbb{R}_+ \times \mathbb{R}^d \to \mathbb{R}^d$, $\sigma : \mathbb{R}_+ \times \mathbb{R}^d \to \mathbb{R}^{d \times m}$ are Lipschitz in $x$ if there exists a constant $K > 0$ such that for all $t \geq 0$ and $x, y \in \mathbb{R}^d$, we have
\begin{align*}
  |b(t, x) - b(t, y)| &\leq K|x - y|,\\
  |\sigma(t, x) - \sigma(t, y)| &\leq K|x - y|.
\end{align*}

Theorem. Assume $b, \sigma$ are Lipschitz in $x$. Then there is pathwise uniqueness for $E(\sigma, b)$, and for every $(\Omega, \mathcal{F}, (\mathcal{F}_t), P)$ satisfying the usual conditions, every $(\mathcal{F}_t)$-Brownian motion $B$ and every $x \in \mathbb{R}^d$, there exists a unique strong solution to $E_x(\sigma, b)$.

Proof. To simplify notation, we assume $m = d = 1$.

We first prove pathwise uniqueness. Suppose $X, X'$ are two solutions with $X_0 = X'_0$. We will show that $E[(X_t - X'_t)^2] = 0$. We will actually put in some bounds to control our variables. Define the stopping time
\[
  S = \inf\{t \geq 0 : |X_t| \geq n \text{ or } |X'_t| \geq n\}.
\]
By continuity, $S \to \infty$ as $n \to \infty$. We also fix a deterministic time $T > 0$. Then whenever $t \in [0, T]$, we can bound, using the identity $(a + b)^2 \leq 2a^2 + 2b^2$,
\[
  E\bigl((X_{t \wedge S} - X'_{t \wedge S})^2\bigr) \leq 2 E\Bigl(\Bigl(\int_0^{t \wedge S} (\sigma(s, X_s) - \sigma(s, X'_s)) \,dB_s\Bigr)^2\Bigr) + 2 E\Bigl(\Bigl(\int_0^{t \wedge S} (b(s, X_s) - b(s, X'_s)) \,ds\Bigr)^2\Bigr).
\]
We can apply the Lipschitz bound to the second term immediately (together with the Cauchy–Schwarz inequality), while we can simplify the first term using the (corollary of the) Ito isometry:
\[
  E\Bigl(\Bigl(\int_0^{t \wedge S} (\sigma(s, X_s) - \sigma(s, X'_s)) \,dB_s\Bigr)^2\Bigr) = E\Bigl(\int_0^{t \wedge S} (\sigma(s, X_s) - \sigma(s, X'_s))^2 \,ds\Bigr).
\]
So using the Lipschitz bound, we have
\[
  E\bigl((X_{t \wedge S} - X'_{t \wedge S})^2\bigr) \leq 2K^2(1 + T)\, E\Bigl(\int_0^{t \wedge S} |X_s - X'_s|^2 \,ds\Bigr) \leq 2K^2(1 + T) \int_0^t E\bigl(|X_{s \wedge S} - X'_{s \wedge S}|^2\bigr) \,ds.
\]
We now use Gronwall's lemma:

Lemma (Gronwall). Let $h$ be a non-negative bounded function such that
\[
  h(t) \leq c \int_0^t h(s) \,ds
\]
for some constant $c > 0$. Then
\[
  h(t) \leq h(0) e^{ct}.
\]


Applying this to
\[
  h(t) = E\bigl((X_{t \wedge S} - X'_{t \wedge S})^2\bigr),
\]
we deduce that $h(t) \leq h(0) e^{ct} = 0$. So we know that
\[
  E\bigl(|X_{t \wedge S} - X'_{t \wedge S}|^2\bigr) = 0
\]
for every $t \in [0, T]$. Taking $n \to \infty$ and $T \to \infty$ gives pathwise uniqueness.

We next prove existence of solutions. We fix $(\Omega, \mathcal{F}, (\mathcal{F}_t), P)$ and $B$, and define
\[
  F(X)_t = X_0 + \int_0^t \sigma(s, X_s) \,dB_s + \int_0^t b(s, X_s) \,ds.
\]
Then $X$ is a solution to $E_x(\sigma, b)$ iff $F(X) = X$ and $X_0 = x$. To find a fixed point, we use Picard iteration. We fix $T > 0$, and define the $T$-norm of a continuous adapted process $X$ as
\[
  \|X\|_T = E\Bigl(\sup_{t \leq T} |X_t|^2\Bigr)^{1/2}.
\]
In particular, if $X$ is a martingale, then this is equivalent to the norm on the space of $L^2$-bounded martingales by Doob's inequality. Then
\[
  \mathcal{B} = \{X : \Omega \times [0, T] \to \mathbb{R} : \|X\|_T < \infty\}
\]
is a Banach space.

Claim. $\|F(0)\|_T < \infty$, and
\[
  \|F(X) - F(Y)\|_T^2 \leq (2T + 8) K^2 \int_0^T \|X - Y\|_t^2 \,dt.
\]
We first see how this claim implies the theorem. First observe that the claim implies $F$ indeed maps $\mathcal{B}$ into itself. We can then define a sequence of processes $X^i$ by
\[
  X^0_t = x,\qquad X^{i+1} = F(X^i).
\]
Then, writing $C_T = (2T + 8)K^2$, we have
\[
  \|X^{i+1} - X^i\|_T^2 \leq C_T \int_0^T \|X^i - X^{i-1}\|_t^2 \,dt \leq \cdots \leq \|X^1 - X^0\|_T^2 \,\frac{(C_T T)^i}{i!}.
\]
So we find that
\[
  \sum_{i=1}^\infty \|X^i - X^{i-1}\|_T^2 < \infty
\]
for all $T$. So $X^i$ converges to some $X$ almost surely and uniformly on $[0, T]$, and $F(X) = X$. We then take $T \to \infty$ and we are done.

To prove the claim, we write
\[
  \|F(0)\|_T \leq |X_0| + \Bigl\|\int_0^t b(s, 0) \,ds\Bigr\|_T + \Bigl\|\int_0^t \sigma(s, 0) \,dB_s\Bigr\|_T.
\]


The first two terms are constant, and we can bound the last using Doob's inequality and the Ito isometry:
\[
  \Bigl\|\int_0^t \sigma(s, 0) \,dB_s\Bigr\|_T^2 \leq 4\, E\Bigl(\Bigl|\int_0^T \sigma(s, 0) \,dB_s\Bigr|^2\Bigr) = 4 \int_0^T \sigma(s, 0)^2 \,ds.
\]
To prove the second part, we use
\[
  \|F(X) - F(Y)\|_T^2 \leq 2 E\Bigl(\sup_{t \leq T} \Bigl|\int_0^t (b(s, X_s) - b(s, Y_s)) \,ds\Bigr|^2\Bigr) + 2 E\Bigl(\sup_{t \leq T} \Bigl|\int_0^t (\sigma(s, X_s) - \sigma(s, Y_s)) \,dB_s\Bigr|^2\Bigr).
\]
We can bound the first term with Cauchy–Schwarz by
\[
  T\, E\Bigl(\int_0^T |b(s, X_s) - b(s, Y_s)|^2 \,ds\Bigr) \leq T K^2 \int_0^T \|X - Y\|_t^2 \,dt,
\]
and the second term, with Doob's inequality and the Ito isometry, by
\[
  4\, E\Bigl(\int_0^T |\sigma(s, X_s) - \sigma(s, Y_s)|^2 \,ds\Bigr) \leq 4 K^2 \int_0^T \|X - Y\|_t^2 \,dt.
\]
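The proof constructs the solution by Picard iteration; in practice one usually simulates an SDE by discretizing time instead. The following Euler–Maruyama sketch is an illustration only (the scheme is not discussed in the notes, and the coefficients below are arbitrary Lipschitz examples).

```python
import numpy as np

# Minimal Euler-Maruyama discretization of dX = b(t, X) dt + sigma(t, X) dB.
def euler_maruyama(b, sigma, x0, t_max, n_steps, rng):
    dt = t_max / n_steps
    X = np.empty(n_steps + 1)
    X[0] = x0
    for i in range(n_steps):
        t = i * dt
        dB = rng.normal(0.0, np.sqrt(dt))
        X[i + 1] = X[i] + b(t, X[i]) * dt + sigma(t, X[i]) * dB
    return X

rng = np.random.default_rng(5)
# Example coefficients, Lipschitz in x: a mean-reverting drift and constant noise.
path = euler_maruyama(b=lambda t, x: -x, sigma=lambda t, x: 1.0,
                      x0=1.0, t_max=5.0, n_steps=5_000, rng=rng)
print(path[::1000])    # a few points along one sample path
```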

4.2 Examples of stochastic differential equations

Example (The Ornstein–Uhlenbeck process). Let $\lambda > 0$. Then the Ornstein–Uhlenbeck process is the solution to
\[
  dX_t = -\lambda X_t \,dt + dB_t.
\]
The solution exists by the previous theorem, but we can also find one explicitly. By Ito's formula applied to $e^{\lambda t} X_t$, we get
\[
  d(e^{\lambda t} X_t) = e^{\lambda t} \,dX_t + \lambda e^{\lambda t} X_t \,dt = e^{\lambda t} \,dB_t.
\]
So we find that
\[
  X_t = e^{-\lambda t} X_0 + \int_0^t e^{-\lambda(t - s)} \,dB_s.
\]
Observe that the integrand is deterministic, so we can in fact interpret this as a Wiener integral.

Fact. If $X_0 = x \in \mathbb{R}$ is fixed, then $(X_t)$ is a Gaussian process, i.e. $(X_{t_i})_{i=1}^n$ is jointly Gaussian for all $t_1 < \cdots < t_n$. Any Gaussian process is determined by its mean and covariance, and in this case we have
\[
  E X_t = e^{-\lambda t} x,\qquad \operatorname{cov}(X_t, X_s) = \frac{1}{2\lambda}\bigl(e^{-\lambda|t - s|} - e^{-\lambda(t + s)}\bigr).
\]


Proof. We only have to compute the covariance. By the Ito isometry, we have
\[
  E\bigl((X_t - E X_t)(X_s - E X_s)\bigr) = E\Bigl(\int_0^t e^{-\lambda(t - u)} \,dB_u \int_0^s e^{-\lambda(s - u)} \,dB_u\Bigr) = e^{-\lambda(t + s)} \int_0^{t \wedge s} e^{2\lambda u} \,du.
\]
In particular,
\[
  X_t \sim N\Bigl(e^{-\lambda t} x,\ \frac{1 - e^{-2\lambda t}}{2\lambda}\Bigr) \to N\Bigl(0, \frac{1}{2\lambda}\Bigr) \quad\text{as } t \to \infty.
\]

Fact. If $X_0 \sim N(0, \frac{1}{2\lambda})$ is independent of $B$, then $(X_t)$ is a centered Gaussian process with stationary covariance, i.e. the covariance depends only on the time difference:
\[
  \operatorname{cov}(X_t, X_s) = \frac{1}{2\lambda} e^{-\lambda|t - s|}.
\]
The difference from the deterministic case is that there $E X_t = e^{-\lambda t} x$, so subtracting the mean removes the $e^{-\lambda t} X_0$ term; here $E X_t = 0$, so the $e^{-\lambda t} X_0$ term contributes $e^{-\lambda(t+s)} E(X_0^2) = \frac{1}{2\lambda} e^{-\lambda(t+s)}$ to the covariance, which exactly cancels the $-\frac{1}{2\lambda} e^{-\lambda(t+s)}$ found above.

This is a very nice example where we can explicitly understand the long-time behaviour of the SDE. In general, this is non-trivial.
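Because the transition law is an explicit Gaussian, the process can be sampled exactly, with no discretization error; the sketch below (parameters are arbitrary choices) checks the convergence of the variance to $1/(2\lambda)$.

```python
import numpy as np

# Exact simulation of the Ornstein-Uhlenbeck process via its Gaussian transition:
# X_{t+h} = e^{-lam*h} X_t + N(0, (1 - e^{-2*lam*h}) / (2*lam)).
rng = np.random.default_rng(6)
lam, h, n_steps, n_paths, x0 = 1.5, 0.01, 5_000, 100_000, 2.0

decay = np.exp(-lam * h)
step_sd = np.sqrt((1 - np.exp(-2 * lam * h)) / (2 * lam))

X = np.full(n_paths, x0)
for _ in range(n_steps):
    X = decay * X + step_sd * rng.normal(size=n_paths)

print("empirical mean    :", X.mean())        # ~ e^{-lam * T} * x0, essentially 0
print("empirical variance:", X.var())         # ~ 1 / (2 * lam)
print("stationary value  :", 1 / (2 * lam))
```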

Dyson Brownian motion

Let $H_N$ be the space of real symmetric $N \times N$ matrices, equipped with the inner product $N \operatorname{Tr}(HK)$ for $H, K \in H_N$. Let $H_1, \ldots, H_{\dim(H_N)}$ be an orthonormal basis of $H_N$.

Definition (Gaussian orthogonal ensemble). The Gaussian Orthogonal Ensemble $\mathrm{GOE}_N$ is the standard Gaussian measure on $H_N$, i.e. $H \sim \mathrm{GOE}_N$ if
\[
  H = \sum_{i=1}^{\dim H_N} H_i X_i,
\]
where the $X_i$ are i.i.d. standard normals.

We now replace each $X_i$ by an Ornstein–Uhlenbeck process with $\lambda = \frac{1}{2}$. Then $\mathrm{GOE}_N$ is invariant under this dynamics.

Theorem. The eigenvalues $\lambda_1(t) \leq \cdots \leq \lambda_N(t)$ satisfy
\[
  d\lambda^i_t = \Bigl(-\frac{\lambda^i_t}{2} + \frac{1}{N} \sum_{j \neq i} \frac{1}{\lambda^i_t - \lambda^j_t}\Bigr) dt + \sqrt{\frac{2}{N\beta}}\, dB^i_t.
\]
Here $\beta = 1$, but if we replace symmetric matrices by Hermitian ones, we get $\beta = 2$, and if we replace symmetric matrices by symplectic ones, we get $\beta = 4$.

This follows from Ito’s formula and formulas for derivatives of eigenvalues.


Example (Geometric Brownian motion). Fix $\sigma > 0$ and $r \in \mathbb{R}$. Then geometric Brownian motion is given by
\[
  dX_t = \sigma X_t \,dB_t + r X_t \,dt.
\]
We apply Ito's formula to $\log X_t$ to find that
\[
  X_t = X_0 \exp\Bigl(\sigma B_t + \Bigl(r - \frac{\sigma^2}{2}\Bigr) t\Bigr).
\]
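Since the solution is an explicit function of $B_t$, it can be sampled directly; a quick Monte Carlo check (with arbitrary parameters) is that $E X_t = X_0 e^{rt}$, as the drift term suggests.

```python
import numpy as np

# Sample geometric Brownian motion at time T via the closed form
# X_T = X_0 * exp(sigma*B_T + (r - sigma^2/2)*T), and check E[X_T] = X_0 * e^{rT}.
rng = np.random.default_rng(8)
x0, sigma, r, T, n_paths = 1.0, 0.4, 0.1, 2.0, 1_000_000

B_T = rng.normal(0.0, np.sqrt(T), size=n_paths)
X_T = x0 * np.exp(sigma * B_T + (r - 0.5 * sigma ** 2) * T)

print("Monte Carlo E[X_T]:", X_T.mean())
print("exact       E[X_T]:", x0 * np.exp(r * T))
```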

Example (Bessel process). Let $B = (B^1, \ldots, B^d)$ be a $d$-dimensional Brownian motion. Then
\[
  X_t = |B_t|
\]
satisfies the stochastic differential equation
\[
  dX_t = \frac{d - 1}{2 X_t} \,dt + dB_t
\]
for $t < \inf\{t \geq 0 : X_t = 0\}$ (where the driving $B$ here is a one-dimensional Brownian motion).

4.3 Representations of solutions to PDEs

Recall that in Advanced Probability, we learnt that we can represent the solution to Laplace's equation via Brownian motion: if $D$ is a suitably nice domain and $g : \partial D \to \mathbb{R}$ is a function, then the solution to Laplace's equation on $D$ with boundary conditions $g$ is given by
\[
  u(x) = E_x[g(B_T)],
\]
where $T$ is the first hitting time of the boundary $\partial D$.

A similar statement holds for the heat equation: to solve
\[
  \frac{\partial u}{\partial t} = \nabla^2 u
\]
with initial condition $u(x, 0) = u_0(x)$, we can write the solution as
\[
  u(x, t) = E_x[u_0(\sqrt{2}\, B_t)].
\]
This is just a fancy way to say that the Green's function for the heat equation is a Gaussian, but it is a good way to think about it nevertheless.

In general, we would like to associate PDEs to certain stochastic processes. Recall that a stochastic differential equation is generally of the form
\[
  dX_t = b(X_t) \,dt + \sigma(X_t) \,dB_t
\]
for some $b : \mathbb{R}^d \to \mathbb{R}^d$ and $\sigma : \mathbb{R}^d \to \mathbb{R}^{d \times m}$ which are measurable and locally bounded. Here we assume these functions have no time dependence. We can then associate to this a differential operator $L$ defined by
\[
  L = \frac{1}{2} \sum_{i,j} a_{ij} \partial_i \partial_j + \sum_i b_i \partial_i,
\]
where $a = \sigma \sigma^T$.


Example. If $b = 0$ and $\sigma = \sqrt{2}\, I$, then $L = \Delta$ is the standard Laplacian.

The basic computation is the following result, which is a standard application of Ito's formula:

Proposition. Let $x \in \mathbb{R}^d$, and let $X$ be a solution to $E_x(\sigma, b)$. Then for every $f : \mathbb{R}_+ \times \mathbb{R}^d \to \mathbb{R}$ that is $C^1$ in $\mathbb{R}_+$ and $C^2$ in $\mathbb{R}^d$, the process
\[
  M^f_t = f(t, X_t) - f(0, X_0) - \int_0^t \Bigl(\frac{\partial}{\partial s} + L\Bigr) f(s, X_s) \,ds
\]
is a continuous local martingale.

We first apply this to the Dirichlet–Poisson problem, which is essentially to solve $-Lu = f$. To be precise, let $U \subseteq \mathbb{R}^d$ be non-empty, bounded and open, $f \in C_b(U)$ and $g \in C_b(\partial U)$. We then want to find a $u \in C^2(U) \cap C(\bar U)$ such that
\begin{align*}
  -L u(x) &= f(x) \quad\text{for } x \in U,\\
  u(x) &= g(x) \quad\text{for } x \in \partial U.
\end{align*}
If $f = 0$, this is called the Dirichlet problem; if $g = 0$, this is called the Poisson problem.

We will have to impose the following technical condition on $a$:

Definition (Uniformly elliptic). We say $a : U \to \mathbb{R}^{d \times d}$ is uniformly elliptic if there is a constant $c > 0$ such that for all $\xi \in \mathbb{R}^d$ and $x \in U$, we have
\[
  \xi^T a(x) \xi \geq c |\xi|^2.
\]
If $a$ is symmetric (which it is in our case), this is the same as asking for the smallest eigenvalue of $a$ to be bounded away from $0$.

It would be very nice if we could write down a solution to the Dirichlet–Poisson problem using a solution to $E_x(\sigma, b)$, and then simply check that it works. We can indeed do that, but it takes a bit more time than we have. Instead, we shall prove a slightly weaker result: if we happen to have a solution, it must be given by our formula involving the SDE. So we first state the following theorem without proof:

Theorem. Assume $U$ has a smooth boundary (or satisfies the exterior cone condition), $a, b$ are Holder continuous and $a$ is uniformly elliptic. Then for every Holder continuous $f : U \to \mathbb{R}$ and any continuous $g : \partial U \to \mathbb{R}$, the Dirichlet–Poisson problem has a solution.

The main theorem is the following:

Theorem. Let $\sigma$ and $b$ be bounded measurable with $\sigma\sigma^T$ uniformly elliptic, and let $U \subseteq \mathbb{R}^d$ be as above. Let $u$ be a solution to the Dirichlet–Poisson problem and $X$ a solution to $E_x(\sigma, b)$ for some $x \in \mathbb{R}^d$. Define the stopping time
\[
  T_U = \inf\{t \geq 0 : X_t \not\in U\}.
\]
Then $E\, T_U < \infty$ and
\[
  u(x) = E_x\Bigl(g(X_{T_U}) + \int_0^{T_U} f(X_s) \,ds\Bigr).
\]
In particular, the solution to the PDE is unique.


Proof. Our previous proposition applies to functions defined on all of $\mathbb{R}^d$, while $u$ is only defined on $\bar U$. So we set
\[
  U_n = \Bigl\{x \in U : \operatorname{dist}(x, \partial U) > \frac{1}{n}\Bigr\},\qquad T_n = \inf\{t \geq 0 : X_t \not\in U_n\},
\]
and pick $u_n \in C^2_b(\mathbb{R}^d)$ such that $u|_{U_n} = u_n|_{U_n}$. Recalling our previous notation, let
\[
  M^n_t = (M^{u_n})^{T_n}_t = u_n(X_{t \wedge T_n}) - u_n(X_0) - \int_0^{t \wedge T_n} L u_n(X_s) \,ds.
\]
This is a continuous local martingale by the proposition, and it is bounded, hence a true martingale. Thus for $x \in U$ and $n$ large enough, the martingale property implies
\[
  u(x) = u_n(x) = E\Bigl(u(X_{t \wedge T_n}) - \int_0^{t \wedge T_n} L u(X_s) \,ds\Bigr) = E\Bigl(u(X_{t \wedge T_n}) + \int_0^{t \wedge T_n} f(X_s) \,ds\Bigr).
\]
We would be done if we could take $n \to \infty$. To do so, we first show that $E[T_U] < \infty$.

Note that this does not depend on $f$ and $g$. So we can take $f = 1$ and $g = 0$, and let $v$ be the corresponding solution. Then we have
\[
  E(t \wedge T_n) = E\Bigl(-\int_0^{t \wedge T_n} L v(X_s) \,ds\Bigr) = v(x) - E(v(X_{t \wedge T_n})).
\]
Since $v$ is bounded, by dominated/monotone convergence, we can take the limit $t, n \to \infty$ to get
\[
  E(T_U) < \infty.
\]
Thus, we know that $t \wedge T_n \to T_U$ as $t \to \infty$ and $n \to \infty$. Since
\[
  E\Bigl(\int_0^{T_U} |f(X_s)| \,ds\Bigr) \leq \|f\|_\infty E[T_U] < \infty,
\]
the dominated convergence theorem tells us
\[
  E\Bigl(\int_0^{t \wedge T_n} f(X_s) \,ds\Bigr) \to E\Bigl(\int_0^{T_U} f(X_s) \,ds\Bigr).
\]
Since $u$ is continuous on $\bar U$, we also have
\[
  E(u(X_{t \wedge T_n})) \to E(u(X_{T_U})) = E(g(X_{T_U})).
\]
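For Brownian motion (i.e. $\sigma = \sqrt{2} I$, $b = 0$, so $L = \Delta$ as in the example above), this representation gives a simple Monte Carlo method for the Dirichlet problem. The sketch below is an illustration with arbitrary choices: it treats the unit disc, uses the harmonic boundary data $g(x, y) = x^2 - y^2$ (so $u = g$ inside the disc and the answer can be checked), simulates a standard Brownian path (the exit position does not depend on the time-scaling), and discretizes time, which introduces a small bias at the boundary.

```python
import numpy as np

# Monte Carlo for the Dirichlet problem on the unit disc with L = Laplacian:
# u(x) = E_x[ g(exit point of the Brownian path) ].
rng = np.random.default_rng(9)
g = lambda p: p[:, 0] ** 2 - p[:, 1] ** 2     # harmonic boundary data, so u = g

def dirichlet_mc(x, n_paths=20_000, dt=1e-3):
    pos = np.tile(np.asarray(x, dtype=float), (n_paths, 1))
    alive = np.ones(n_paths, dtype=bool)
    exit_pts = np.empty_like(pos)
    while alive.any():
        pos[alive] += rng.normal(0.0, np.sqrt(dt), size=(alive.sum(), 2))
        left = alive & (np.sum(pos ** 2, axis=1) >= 1.0)
        # project the overshoot back onto the boundary circle
        exit_pts[left] = pos[left] / np.linalg.norm(pos[left], axis=1, keepdims=True)
        alive &= ~left
    return g(exit_pts).mean()

x = (0.3, 0.4)
print("Monte Carlo u(x):", dirichlet_mc(x))
print("exact       u(x):", x[0] ** 2 - x[1] ** 2)
```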

We can use SDE’s to solve the Cauchy problem for parabolic equations aswell, just like the heat equation. The problem is as follows: for f ∈ C2

b (Rd), wewant to find u : R+ × Rd → R that is C1 in R+ and C2 in Rd such that

∂u

∂t= Lu on R+ × Rd

u(0, · ) = f on Rd

Again we will need the following theorem:


Theorem. For every $f \in C^2_b(\mathbb{R}^d)$, there exists a solution to the Cauchy problem.

Theorem. Let $u$ be a solution to the Cauchy problem. Let $X$ be a solution to $E_x(\sigma, b)$ for $x \in \mathbb{R}^d$ and $0 \leq s \leq t$. Then
\[
  E_x(f(X_t) \mid \mathcal{F}_s) = u(t - s, X_s).
\]
In particular,
\[
  u(t, x) = E_x(f(X_t)).
\]
In particular, this implies $X$ is a continuous Markov process.

Proof. The martingale of the proposition involves $\frac{\partial}{\partial t} + L$, but our equation has $\frac{\partial}{\partial t} - L$. So we set $g(s, x) = u(t - s, x)$. Then
\[
  \Bigl(\frac{\partial}{\partial s} + L\Bigr) g(s, x) = -\frac{\partial u}{\partial t}(t - s, x) + L u(t - s, x) = 0.
\]
So $g(s, X_s) - g(0, X_0)$ is a martingale (boundedness is an exercise), and hence
\[
  u(t - s, X_s) = g(s, X_s) = E(g(t, X_t) \mid \mathcal{F}_s) = E(u(0, X_t) \mid \mathcal{F}_s) = E(f(X_t) \mid \mathcal{F}_s).
\]

There is a generalization, the Feynman–Kac formula.

Theorem (Feynman–Kac formula). Let $f \in C^2_b(\mathbb{R}^d)$ and $V \in C_b(\mathbb{R}^d)$, and suppose that $u : \mathbb{R}_+ \times \mathbb{R}^d \to \mathbb{R}$ satisfies
\begin{align*}
  \frac{\partial u}{\partial t} &= L u + V u \quad\text{on } \mathbb{R}_+ \times \mathbb{R}^d,\\
  u(0, \cdot\,) &= f \quad\text{on } \mathbb{R}^d,
\end{align*}
where $Vu = V(x) u(x)$ is given by multiplication. Let $X$ be a solution to $E_x(\sigma, b)$. Then for all $t > 0$ and $x \in \mathbb{R}^d$,
\[
  u(t, x) = E_x\Bigl(f(X_t) \exp\Bigl(\int_0^t V(X_s) \,ds\Bigr)\Bigr).
\]

If $L$ is the Laplacian, then this is the Schrodinger equation, which is why Feynman was thinking about it.
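As a final hedged illustration (not from the notes), the formula can be checked by Monte Carlo in a case where everything is explicit: take $L = \frac{1}{2}\Delta$ in one dimension (so $X$ is a Brownian motion), $V \equiv c$ a constant and $f(x) = x^2$, for which $u(t, x) = e^{ct}(x^2 + t)$.

```python
import numpy as np

# Monte Carlo check of Feynman-Kac with L = (1/2) Laplacian, V = c constant,
# f(x) = x^2:  u(t, x) = E_x[ f(X_t) exp(int_0^t V(X_s) ds) ] = e^{c t} (x^2 + t).
rng = np.random.default_rng(10)
c, t, x, n_paths = 0.3, 1.0, 0.5, 1_000_000

X_t = x + rng.normal(0.0, np.sqrt(t), size=n_paths)   # Brownian motion at time t
weight = np.exp(c * t)                                # exp(int_0^t c ds) is deterministic
estimate = (X_t ** 2 * weight).mean()

print("Monte Carlo u(t, x):", estimate)
print("exact       u(t, x):", np.exp(c * t) * (x ** 2 + t))
```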
