Lecture 2: Importance sampling – the basics

Henrik Hult
Department of Mathematics, KTH Royal Institute of Technology, Sweden

Summer School on Monte Carlo Methods and Rare Events
Brown University, June 13–17, 2016

H. Hult Lecture 2



Outline

1 A random walk example with poor performance

2 A Markov random walk model

3 Markov chains in continuous time

4 Small-noise diffusions



A random walk example, with poor performance
The model

Let Z_1, Z_2, . . . be iid N(0, 1) random variables and put

H(α) = log E[exp{αZ_1}] = α²/2, α ∈ R.

Let X^n_0 = 0 and X^n_k = (1/n)(Z_1 + · · · + Z_k), for k ≥ 1, be the normalized random walk.
Consider computing the probability P{X^n_n ∈ (−∞, a] ∪ [b, ∞)} by importance sampling, where a < 0 < b and b < |a|.
Clearly, there is no need for importance sampling, because

P{X^n_n ∈ (−∞, a] ∪ [b, ∞)} = Φ(a√n) + 1 − Φ(b√n),

where Φ is the standard normal cdf.
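As a quick numerical check, the closed-form probability can be evaluated directly. A minimal Python sketch; the lecture does not fix n, so n = 100 is an assumption here, chosen because it reproduces the true value 0.029 quoted in the numerics table:

```python
import math

def norm_cdf(x: float) -> float:
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def two_sided_prob(a: float, b: float, n: int) -> float:
    """P{X^n_n in (-inf, a] U [b, inf)} = Phi(a*sqrt(n)) + 1 - Phi(b*sqrt(n))."""
    s = math.sqrt(n)
    return norm_cdf(a * s) + 1.0 - norm_cdf(b * s)

print(two_sided_prob(-0.25, 0.2, 100))  # roughly 0.029
```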


A random walk example, with poor performance
Importance sampling

When applying importance sampling we may suggest a sampling distribution with density

ϕ_α(z) = e^{αz − H(α)} ϕ(z) = e^{αz − α²/2} ϕ(z),

where ϕ is the standard normal density.
Note that ϕ_α is simply the density of a N(α, 1) distribution.
For large n, because b < |a|,

P{X^n_n ≥ b} ≫ P{X^n_n ≤ a},

so it may seem reasonable to focus on changing the measure associated with the probability P{X^n_n ≥ b}.
Guided by, say, Cramér’s theorem, we suggest taking α = b.
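The tilted sampler is easy to sketch: draw each increment from N(α, 1) with α = b, and weight by the likelihood ratio dP/dP^α = ∏ exp{−αZ_i + α²/2}. A minimal Python sketch; function names and the Monte Carlo parameters are illustrative, not from the lecture:

```python
import math
import random

def is_estimator(a: float, b: float, n: int, N: int, seed: int = 1) -> float:
    """Estimate P{X^n_n <= a or X^n_n >= b}, sampling each Z_i from N(b, 1)."""
    rng = random.Random(seed)
    alpha = b  # exponential tilt suggested by Cramer's theorem
    total = 0.0
    for _ in range(N):
        z = [rng.gauss(alpha, 1.0) for _ in range(n)]
        x = sum(z) / n
        # log of dP/dP_alpha = sum_i (-alpha * z_i + alpha^2 / 2)
        log_w = -alpha * sum(z) + n * alpha * alpha / 2.0
        if x <= a or x >= b:
            total += math.exp(log_w)
    return total / N
```

Samples that happen to land in the left tail (−∞, a] carry weights of order e^{−nαz̄ + nα²/2}, which is enormous when z̄ ≤ a < 0; this is exactly the variance blow-up seen in the numerics table.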


A random walk example, with poor performance
Numerics

Table: Simulation results for the random walk, a = −0.25, b = 0.2.

          True    Estimate  Std·√N  Relative Error
N = 10³   0.029   0.022     0.034   1.55
N = 10⁴   0.029   0.022     0.034   1.55
N = 10⁵   0.029   0.034     3.63    106


A random walk example, with poor performance
Weights and trajectories

Figure: Left: histogram of the log likelihood-ratio weights, log(W). Right: trajectories of the random walk.


A Markov random walk model

Let {v_i(x), x ∈ R^d, i ≥ 0} be independent and identically distributed random vector fields with distribution

P{v_i(x) ∈ ·} = θ(· | x),

where θ is a regular conditional probability distribution.
Let

X^n_{i+1} = X^n_i + (1/n) v_i(X^n_i), X^n_0 = x_0.

Denote the log moment generating function of θ(· | x) by

H(x, α) = log E[exp{⟨α, v_1(x)⟩}]

and suppose H(x, α) < ∞ for all x and α in R^d.
The Fenchel–Legendre transform (convex conjugate) of H(x, ·) is denoted by

L(x, β) = sup_{α ∈ R^d} [⟨α, β⟩ − H(x, α)].
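For intuition, the Fenchel–Legendre transform can be approximated by maximizing over a grid. A minimal one-dimensional Python sketch, suppressing the x-dependence of H; the grid bounds and names are illustrative assumptions:

```python
def legendre_transform(H, beta: float, grid) -> float:
    """Approximate L(beta) = sup_alpha [alpha * beta - H(alpha)] over a grid."""
    return max(alpha * beta - H(alpha) for alpha in grid)

# Gaussian increments: H(alpha) = alpha^2 / 2, so L(beta) = beta^2 / 2.
grid = [i / 1000.0 for i in range(-5000, 5001)]
print(legendre_transform(lambda a: a * a / 2.0, 1.0, grid))  # 0.5
```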


Examples

Queueing models
Credit risk models
Epidemic models
Chemical reactions


A Markov random walk model
Probabilities and expectations

In the generic setup we will be interested in computing an expectation of the form

E[exp{−nF(X^n_n)}],

where F : R^d → R is a bounded continuous function.
Rare-event probabilities such as P{X^n_n ∈ A} can be incorporated by formally writing

F(x) = 0 for x ∈ A, F(x) = ∞ for x ∈ A^c.
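The formal convention just says that exp{−nF(x)} collapses to the indicator of A, so the expectation reduces to the probability. A tiny Python sketch of that identity; the predicate A used here is an illustrative assumption:

```python
import math

def F(x: float, A) -> float:
    """Formal cost: 0 on A, infinity on the complement."""
    return 0.0 if A(x) else math.inf

def weight(x: float, n: int, A) -> float:
    """exp{-n F(x)}: equals 1 if x is in A and 0 otherwise (exp(-inf) = 0)."""
    return math.exp(-n * F(x, A))

A = lambda x: x >= 0.2  # hypothetical rare set A = [0.2, inf)
print(weight(0.3, 100, A), weight(0.1, 100, A))  # 1.0 0.0
```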


A Markov random walk model
The backward equation

Let A^n denote the backward evolution operator associated with X^n, that is,

A^n f(i, x) = E_{i,x}[f(i + 1, X^n_{i+1}) − f(i, x)]
            = ∫ [f(i + 1, x + (1/n)z) − f(i, x)] θ(dz | x).

The (Kolmogorov) backward equation implies that V^n(i, x) = E_{i,x}[exp{−nF(X^n_n)}] satisfies

A^n V^n(i, x) = 0,
V^n(n, x) = exp{−nF(x)},

where V^n(0, x_0) = E[exp{−nF(X^n_n)}] is the quantity we are interested in computing.
To see this, simply use the iterated expectation.
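The backward equation is just the dynamic-programming recursion V^n(i, x) = ∫ V^n(i + 1, x + z/n) θ(dz | x), solved backwards from the terminal condition. A minimal Python sketch for a kernel with finite support; for simplicity the kernel here is taken independent of x, an assumption not made in the lecture:

```python
import math
from functools import lru_cache

def value_function(n: int, x0: float, F, support, probs) -> float:
    """Backward recursion: V(n, x) = exp(-n F(x)),
    V(i, x) = sum_z theta(z) * V(i + 1, x + z / n)."""
    @lru_cache(maxsize=None)
    def V(i: int, x: float) -> float:
        if i == n:
            return math.exp(-n * F(x))
        return sum(p * V(i + 1, round(x + z / n, 12))
                   for z, p in zip(support, probs))
    return V(0, x0)
```

With a deterministic step z = 1 and F(x) = x, the chain ends at x_0 + 1 and the recursion returns exp{−n(x_0 + 1)}, which is a handy sanity check.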


A Markov random walk model
Importance sampling

Let the sampling distribution be given by

θ_α(dz | x) = exp{⟨α, z⟩ − H(x, α)} θ(dz | x).

Consider the controlled process X^n, where

X^n_{i+1} = X^n_i + (1/n) Z_i, X^n_0 = x_0.

The likelihood ratio is given by

dP^α/dP = ∏_{i=0}^{n−1} exp{⟨α^n_i, Z_i⟩ − H(X^n_i, α^n_i)}.

The importance sampling estimator is

(dP/dP^α) exp{−nF(X^n_n)}.
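For numerical stability the product is best accumulated in log space: log(dP/dP^α) = Σ_i [−⟨α^n_i, Z_i⟩ + H(X^n_i, α^n_i)]. A one-dimensional Python sketch; the names are illustrative:

```python
def log_likelihood_ratio(alphas, zs, xs, H) -> float:
    """log(dP/dP^alpha) = sum_i (-alpha_i * z_i + H(x_i, alpha_i)), in one dimension."""
    return sum(-a * z + H(x, a) for a, z, x in zip(alphas, zs, xs))

# Gaussian increments, constant tilt alpha = 0.2 (cf. the random walk example):
H = lambda x, a: a * a / 2.0
print(log_likelihood_ratio([0.2] * 3, [0.1, 0.2, 0.3], [0.0] * 3, H))
```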


A Markov random walk model
Analysis of the second moment

With S^α_j = Σ_{i=0}^{j−1} [−⟨α^n_i, v_i(X^n_i)⟩ + H(X^n_i, α^n_i)] we can write the second moment as

W^n(0, x_0) := E^α[(e^{Σ_{i=0}^{n−1} [−⟨α^n_i, Z_i⟩ + H(X^n_i, α^n_i)]} e^{−nF(X^n_n)})²]
             = E[e^{Σ_{i=0}^{n−1} [−⟨α^n_i, v_i⟩ + H(X^n_i, α^n_i)]} e^{−2nF(X^n_n)}]
             = E[e^{S^α_n − 2nF(X^n_n)}].

Let W^n(j, x) be the second moment of the importance sampling estimator, starting from x at time j. Then,

W^n(j, x) = E^α_{j,x}[(e^{Σ_{i=j}^{n−1} [−⟨α^n_i, Z_i⟩ + H(X^n_i, α^n_i)]} e^{−nF(X^n_n)})²]
          = E_{j,x}[e^{S^α_n − S^α_j − 2nF(X^n_n)}].

W^n satisfies a backward equation, similar to V^n.


A Markov random walk model
A backward equation for the second moment

Theorem. The second moment W^n satisfies the backward equation

∫ [W^n(i + 1, x + (1/n)z) e^{−⟨α^n_i(x), z⟩ + H(x, α^n_i(x))} − W^n(i, x)] θ(dz | x) = 0,
W^n(n, x) = e^{−2nF(x)}.


A Markov random walk model
Proof of the backward equation

Let

W^n(j, x, s) = E_{j,x,s}[exp{S^α_n − S^α_j − 2nF(X^n_n)}]

and let A^n denote the backward operator for the Markov chain (X^n, S^α), i.e.,

A^n f(i, x, s) = E_{i,x,s}[f(i + 1, X^n_{i+1}, S^α_{i+1}) − f(i, x, s)]
              = ∫ [f(i + 1, x + (1/n)z, s − ⟨α^n_i(x), z⟩ + H(x, α^n_i(x))) − f(i, x, s)] θ(dz | x).

By the backward equation for W^n:

A^n W^n(i, x, s) = 0,
W^n(n, x, s) = exp{−2nF(x)},

and, since W^n(j, x, s) = e^s W^n(j, x), the equation for W^n follows from a short calculation.


Markov chains in continuous time
The model

Consider a continuous time Markov chain X^n(t).
At X^n(t) = x the process can jump to new states x + (1/n)v with v ∈ V.
V denotes the set of possible jumps.
The intensity of jumping from x to x + (1/n)v is nλ_v(x) ≥ 0.
The stochastic kernel associated with X^n is

Θ^n(dt, v | x) = P{T_{k+1} − T_k ∈ dt, X^n(T_{k+1}) = x + (1/n)v | X^n(T_k) = x}
               = nλ_v(x) e^{−nΛ(x)t} dt,

where 0 = T_0 < T_1 < · · · denotes the jump times of X^n and Λ(x) = Σ_{v∈V} λ_v(x).
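Such a chain is straightforward to simulate with the standard Gillespie (stochastic simulation) algorithm: wait an Exp(nΛ(x)) holding time, then pick a jump v with probability proportional to nλ_v(x). A minimal Python sketch; function names and the example rates below are illustrative:

```python
import random

def simulate_ctmc(n: int, x0: float, T: float, rates: dict, seed: int = 0) -> float:
    """Simulate X^n on [0, T]; `rates` maps each jump v to the function lambda_v.
    From state x, the chain jumps to x + v/n with intensity n * lambda_v(x)."""
    rng = random.Random(seed)
    t, x = 0.0, x0
    while True:
        intensities = {v: n * lam(x) for v, lam in rates.items()}
        total = sum(intensities.values())
        if total <= 0.0:
            return x  # no jumps possible: the state is absorbing
        t += rng.expovariate(total)  # holding time ~ Exp(n * Lambda(x))
        if t > T:
            return x
        u = rng.random() * total
        for v, intensity in intensities.items():
            u -= intensity
            if u <= 0.0:
                x += v / n
                break
```

For instance, a pure-death chain with V = {−1} and λ_{−1}(x) = x steps down by 1/n until it is absorbed at 0.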


Markov chains in continuous time
Importance sampling

The importance sampling algorithm is implemented using sampling intensities λ^α_v(x) of the form

λ^α_v(x) = e^{⟨α,v⟩} λ_v(x).

The corresponding likelihood ratio is

dP^α/dP = ∏_{k=1}^{N_T} Θ^α(dτ_k, v_k | X^n(T_{k−1})) / Θ^n(dτ_k, v_k | X^n(T_{k−1})),

where N_T = inf{k ≥ 1 : T_k > T}, τ_k = T_k − T_{k−1}, v_k = n(X^n(T_k) − X^n(T_{k−1})), and Λ^α(x) = Σ_{v∈V} λ^α_v(x).

For a given λ^α the corresponding importance sampling estimator is given as the sample mean of independent copies of

(dP/dP^α) exp{−nF(X^n(T))}.


A simple epidemic model

Consider a population of n individuals, each of whom is either susceptible (S) to a virus or infected (I).
The Markov chain X^n(t) is the fraction of infected individuals at time t.
This is an example of a continuous time Markov chain with V = {−1, 1}, and we take λ_{−1}(x) = x, λ_1(x) = ρx(1 − x), ρ > 1.
As n → ∞ the process X^n(t) converges (by the law of large numbers) to a deterministic limit x satisfying the ODE

ẋ(t) = −λ_{−1}(x(t)) + λ_1(x(t)) = −x(t) + ρx(t)(1 − x(t)),
x(0) = x_0.

This dynamical system has an absorbing state at x = 0 and a stable equilibrium at x = 1 − ρ^{−1}.
We may be interested in the probability that an infection, starting from x_0, reaches a high level x_1 > x_0 > 1 − ρ^{−1} before coming back to the equilibrium at 1 − ρ^{−1}.
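The law-of-large-numbers limit is easy to check numerically: integrating the ODE drives x(t) to the stable equilibrium 1 − ρ^{−1}. A minimal Euler-scheme sketch in Python; the step size, horizon, and ρ = 2 below are illustrative choices:

```python
def epidemic_limit(x0: float, rho: float, T: float = 50.0, dt: float = 1e-3) -> float:
    """Euler integration of x' = -x + rho * x * (1 - x) up to time T."""
    x = x0
    for _ in range(int(T / dt)):
        x += dt * (-x + rho * x * (1.0 - x))
    return x

print(epidemic_limit(0.1, 2.0))  # approaches the equilibrium 1 - 1/2 = 0.5
```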


Small-noise diffusions
The model

Compute an expectation of the form E[exp{−(1/ε)F(X^ε(T))}], where F is a bounded continuous function and X^ε is the unique strong solution to the stochastic differential equation

dX^ε(t) = b(X^ε(t))dt + √ε σ(X^ε(t))dB(t),
X^ε(0) = x_0.

Change the measure by a Girsanov transformation:

dP^α/dP = exp{(1/√ε) ∫_0^T ⟨α(s, X^ε(s)), dB(s)⟩ − (1/(2ε)) ∫_0^T |α(s, X^ε(s))|² ds}.

Under P^α the underlying process has the dynamics

dX^ε(t) = [b(X^ε(t)) + σ(X^ε(t))α(t, X^ε(t))]dt + √ε σ(X^ε(t))dB(t),
X^ε(0) = x_0,

where B is a P^α-Brownian motion.
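Either measure can be simulated with a standard Euler–Maruyama discretization; under P^α one simply adds the drift σα. A minimal Python sketch of the base dynamics; function names and parameters are illustrative:

```python
import math
import random

def euler_maruyama(b, sigma, eps: float, x0: float, T: float, steps: int,
                   seed: int = 0) -> float:
    """One path of dX = b(X)dt + sqrt(eps)*sigma(X)dB; returns X(T)."""
    rng = random.Random(seed)
    dt = T / steps
    x = x0
    for _ in range(steps):
        dB = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over dt
        x += b(x) * dt + math.sqrt(eps) * sigma(x) * dB
    return x
```

Simulating under P^α amounts to replacing b by b + σα in the same loop and weighting each path by dP/dP^α.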



Small-noise diffusions
The model

For a given α the corresponding importance sampling estimator is the sample mean of independent copies of

(dP/dP^α) exp{−(1/ε) F(X^ε(T))},

where X^ε is simulated under P^α.

Theorem

The second moment W^ε(t, x) = E_{t,x}[ (dP/dP^α) e^{−(2/ε)F(X^ε(T))} ] satisfies the backward equation

(α²/ε) W^ε − (α/√ε) σ DW^ε + W^ε_t + b DW^ε + (εσ²/2) D²W^ε = 0,

W^ε(T, x) = e^{−(2/ε)F(x)}.

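The quantity W^ε in the theorem can also be estimated empirically, which is a practical way to gauge how well a given α performs. A minimal sketch for the toy case b = 0, σ = 1, F(x) = x²/2 with a constant tilt (all parameter values are assumptions):

```python
import math
import random

def is_moments(alpha, eps=0.1, T=1.0, steps=200, reps=4000, seed=0):
    """Empirical first and second moments of the IS estimator
    (dP/dP^alpha) exp(-F(X(T))/eps) for dX = sqrt(eps) dB, F(x) = x^2/2,
    simulated under the constant tilt alpha."""
    rng = random.Random(seed)
    dt = T / steps
    s1 = s2 = 0.0
    for _ in range(reps):
        x, log_lr = 0.0, 0.0
        for _ in range(steps):
            db = rng.gauss(0.0, math.sqrt(dt))
            # log dP/dP^alpha accumulated along the tilted path
            log_lr += -(alpha / math.sqrt(eps)) * db - (alpha * alpha) / (2.0 * eps) * dt
            x += alpha * dt + math.sqrt(eps) * db   # tilted drift b + sigma*alpha
        z = math.exp(log_lr - x * x / (2.0 * eps))  # one copy of the estimator
        s1 += z
        s2 += z * z
    return s1 / reps, s2 / reps

m1, m2 = is_moments(alpha=0.0)
print(m1, m2)
```

Since the estimator is unbiased for every admissible α, m1 should be stable across tilts while m2 varies; comparing m2 for different choices of α is the empirical counterpart of analyzing W^ε.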



Small-noise diffusions
Example: Loss probabilities

Consider a multidimensional Black–Scholes model for the evolution of n financial asset prices X^ε_i(t), i = 1, . . . , n, where

dX^ε_i(t) = μ_i X^ε_i(t) dt + √ε ∑_{j=1}^n L_{ij} dB_j(t),  X^ε_i(0) = x_{0,i}.

The price of the k-th derivative at time T is π_k(T, X^ε(T)), and if the portfolio contains h_k contracts of type k, the value of the portfolio at time T is given by

∑_k h_k π_k(T, X^ε(T)).

We may be interested in the probability P{∑_k h_k π_k(T, X^ε(T)) < b}, that is, the probability that the value of the portfolio at a future time T is below some small number b.

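As a crude baseline before introducing any tilting, this loss probability can be estimated by plain Monte Carlo with an Euler scheme. A sketch with a hypothetical two-asset configuration and the simplest payoffs π_k(T, x) = x_k (i.e., the portfolio just holds the assets); all numbers are illustrative assumptions:

```python
import math
import random

def loss_probability(mu, L, h, b, x0, eps=0.05, T=1.0,
                     steps=100, reps=5000, seed=0):
    """Plain Monte Carlo estimate of P{ sum_k h_k X_k(T) < b } for
    dX_i = mu_i X_i dt + sqrt(eps) sum_j L_ij dB_j  (Euler scheme).
    Here the 'derivatives' are the assets themselves, pi_k(T, x) = x_k."""
    rng = random.Random(seed)
    n = len(mu)
    dt = T / steps
    losses = 0
    for _ in range(reps):
        x = list(x0)
        for _ in range(steps):
            db = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n)]
            x = [x[i] + mu[i] * x[i] * dt
                 + math.sqrt(eps) * sum(L[i][j] * db[j] for j in range(n))
                 for i in range(n)]
        if sum(h[k] * x[k] for k in range(n)) < b:
            losses += 1
    return losses / reps

p = loss_probability(mu=[0.05, 0.03], L=[[0.2, 0.0], [0.1, 0.2]],
                     h=[1.0, 1.0], b=2.0, x0=[1.0, 1.0])
print(p)
```

As the loss level b moves further below the portfolio's expected value, the hit frequency collapses and the relative error of this plain estimator blows up, which is what motivates the Girsanov tilting of the previous slides.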
