Page 1

Advanced Computational Methods in Statistics: Lecture 5

Sequential Monte Carlo/Particle Filtering

Axel Gandy

Department of Mathematics, Imperial College London

http://www2.imperial.ac.uk/~agandy

London Taught Course Centre for PhD Students in the Mathematical Sciences

Autumn 2015

Page 2

Outline

Introduction
  • Setup
  • Examples
  • Finitely Many States
  • Kalman Filter

Particle Filtering

Improving the Algorithm

Further Topics

Summary

Page 3

Sequential Monte Carlo/Particle Filtering

• Particle filtering was introduced in Gordon et al. (1993).

• Most of the material on particle filtering is based on Doucet et al. (2001) and on the tutorial Doucet & Johansen (2008).

• Examples:
  • Tracking of objects
  • Robot localisation
  • Financial applications

Page 4

Setup - Hidden Markov Model

• x0, x1, x2, . . . : unobserved Markov chain - hidden states
• y1, y2, . . . : observations

  x0 → x1 → x2 → x3 → x4 → · · ·
        ↓     ↓     ↓     ↓
        y1    y2    y3    y4

• y1, y2, . . . are conditionally independent given x0, x1, . . .
• Model given by
  • π(x0) - the initial distribution
  • f(xt|xt−1) for t ≥ 1 - the transition kernel of the Markov chain
  • g(yt|xt) for t ≥ 1 - the distribution of the observations
• Notation:
  • x0:t = (x0, . . . , xt) - hidden states up to time t
  • y1:t = (y1, . . . , yt) - observations up to time t
• Interested in the posterior distribution p(x0:t|y1:t) or in p(xt|y1:t)

Page 5

Some Remarks

• No explicit solution in the general case - only in special cases.
• Will focus on the time-homogeneous case, i.e. the transition density f and the density of the observations g are the same for each step. Extensions to the inhomogeneous case are straightforward.
• It is important to be able to update quickly as new data become available, i.e. if yt+1 is observed, one wants to compute p(xt+1|y1:t+1) quickly from p(xt|y1:t) and yt+1.

Page 6

Example - Bearings-Only Tracking

• Gordon et al. (1993); ship moving in the two-dimensional plane.
• A stationary observer sees only the angle to the ship.
• Hidden states: position xt,1, xt,2; speed xt,3, xt,4.
• Speed changes randomly:

  xt,3 ∼ N(xt−1,3, σ²),  xt,4 ∼ N(xt−1,4, σ²)

• Position changes accordingly:

  xt,1 = xt−1,1 + xt−1,3,  xt,2 = xt−1,2 + xt−1,4

• Observations: yt ∼ N(tan−1(xt,1/xt,2), η²)

[Figure: simulated ship trajectory in the plane; x-axis from about −1.0 to 0.0, y-axis from 0.0 to 1.0]
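As an illustration, a minimal simulation of this tracking model might look as follows. The values of σ, η, the initial state and the time horizon are not given on the slide; the ones below are assumptions chosen purely for illustration.

```python
import numpy as np

def simulate_bearings(T=50, sigma=0.01, eta=0.005, seed=0):
    """Simulate the bearings-only tracking model (illustrative parameter values)."""
    rng = np.random.default_rng(seed)
    x = np.zeros((T + 1, 4))                       # columns: position (x1, x2), speed (x3, x4)
    x[0] = np.array([-0.05, 0.2, -0.03, 0.02])     # assumed initial state
    y = np.zeros(T)
    for t in range(1, T + 1):
        x[t, 2:] = x[t - 1, 2:] + sigma * rng.standard_normal(2)   # speed changes randomly
        x[t, :2] = x[t - 1, :2] + x[t - 1, 2:]                      # position changes accordingly
        y[t - 1] = np.arctan(x[t, 0] / x[t, 1]) + eta * rng.standard_normal()  # observed angle
    return x, y
```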

Page 7

Example - Stochastic Volatility

• Returns: yt ∼ N(0, β² exp(xt)) (observable from price data)
• Volatility: xt ∼ N(αxt−1, σ²/(1−α)²), x1 ∼ N(0, σ²/(1−α)²)
• σ = 1, β = 0.5, α = 0.95

[Figure: simulated volatility xt and returns yt for t = 0, . . . , 350]

Page 8

Hidden Markov Chain with finitely many states

• If the hidden process xt takes only finitely many values, then p(xt+1|y1:t+1) can be computed recursively via

  p(xt+1|y1:t+1) ∝ g(yt+1|xt+1) Σ_{xt} f(xt+1|xt) p(xt|y1:t)

• May not be practicable if there are too many states!
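For a finite state space this recursion can be written down directly; the sketch below assumes the initial distribution, transition matrix and emission densities are supplied as arrays/functions (all names are illustrative).

```python
import numpy as np

def discrete_filter(pi0, F, G, y):
    """Exact filtering recursion for a finite-state hidden Markov model.

    pi0: initial distribution over the K states, shape (K,)
    F:   transition matrix, F[i, j] = f(x_{t+1}=j | x_t=i), shape (K, K)
    G:   function G(y_t) returning the emission likelihoods g(y_t | x_t=k) for all states k
    y:   sequence of observations
    Returns the filtering distributions p(x_t | y_{1:t}) for t = 1, ..., len(y).
    """
    p = pi0
    out = []
    for yt in y:
        p = G(yt) * (p @ F)       # g(y_{t+1}|x_{t+1}) * sum_{x_t} f(x_{t+1}|x_t) p(x_t|y_{1:t})
        p = p / p.sum()           # normalise
        out.append(p)
    return np.array(out)
```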

Page 9

Kalman Filter

• Kalman (1960)
• Linear Gaussian state space model:
  • x0 ∼ N(x̂0, P0)
  • xt = A xt−1 + wt
  • yt = H xt + vt
  • wt ∼ N(0, Q) iid, vt ∼ N(0, R) iid
  • A, H deterministic matrices; x̂0, P0, A, H, Q, R known
• Explicit computation and updating of the posterior is possible:

  Posterior: xt | y1, . . . , yt ∼ N(x̂t, Pt)

  Time update (starting from x̂0, P0):
    x̂_t^- = A x̂_{t-1}
    P_t^- = A P_{t-1} A′ + Q

  Measurement update:
    K_t = P_t^- H′ (H P_t^- H′ + R)^{-1}
    x̂_t = x̂_t^- + K_t (y_t − H x̂_t^-)
    P_t = (I − K_t H) P_t^-
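A compact implementation of one predict/update step might look as follows; this is a minimal numpy sketch whose variable names mirror the slide, not a reference implementation.

```python
import numpy as np

def kalman_step(x_hat, P, y, A, H, Q, R):
    """One Kalman filter step: time update followed by measurement update."""
    # Time update
    x_pred = A @ x_hat
    P_pred = A @ P @ A.T + Q
    # Measurement update
    S = H @ P_pred @ H.T + R                      # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)           # Kalman gain
    x_new = x_pred + K @ (y - H @ x_pred)
    P_new = (np.eye(len(x_hat)) - K @ H) @ P_pred
    return x_new, P_new
```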

Page 10

Kalman Filter - Simple Example

• x0 ∼ N(0, 1)
• xt = 0.9 xt−1 + wt
• yt = 2 xt + vt
• wt ∼ N(0, 1) iid, vt ∼ N(0, 1) iid
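Using the kalman_step function sketched on the previous slide, this example can be simulated and filtered as follows (a sketch; the horizon T and seed are arbitrary choices).

```python
import numpy as np

rng = np.random.default_rng(1)
A, H = np.array([[0.9]]), np.array([[2.0]])
Q, R = np.array([[1.0]]), np.array([[1.0]])

# simulate data from the model above
T = 100
x = np.zeros(T + 1)
x[0] = rng.standard_normal()                    # x0 ~ N(0, 1)
y = np.zeros(T)
for t in range(1, T + 1):
    x[t] = 0.9 * x[t - 1] + rng.standard_normal()
    y[t - 1] = 2.0 * x[t] + rng.standard_normal()

# run the filter
x_hat, P = np.zeros(1), np.array([[1.0]])       # prior x0 ~ N(0, 1)
means = []
for yt in y:
    x_hat, P = kalman_step(x_hat, P, np.atleast_1d(yt), A, H, Q, R)
    means.append(float(x_hat[0]))               # posterior mean E[xt | y1, ..., yt]
```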

[Figure: observations yt for t = 0, . . . , 100, and the filtered estimates: true xt, posterior mean of xt|y1, . . . , yt, and 95% credible interval]

Page 11

Kalman Filter - Remarks I

• Updating is very easy - it only involves linear algebra.
• Very widely used.
• A, H, Q and R can change with time.
• A linear control input can be incorporated, i.e. the hidden state can evolve according to

  xt = A xt−1 + B ut−1 + wt−1

  where ut can be controlled.
• The normal prior/updating/observations can be replaced by appropriate conjugate distributions.
• Continuous time version: the Kalman-Bucy filter; see Øksendal (2003) for a nice introduction.

Page 12

Kalman Filter - Remarks II

• Extended Kalman Filter: extension to nonlinear dynamics

  xt = f(xt−1, ut−1, wt−1)
  yt = g(xt, vt),

  where f and g are nonlinear functions. The extended Kalman filter linearises the nonlinear dynamics around the current mean and covariance. To do so it uses the Jacobian matrices, i.e. the matrices of partial derivatives of f and g with respect to their components. The extended Kalman filter no longer computes exact posterior distributions.

Page 13

Outline

Introduction

Particle Filtering
  • Introduction
  • Bootstrap Filter
  • Example - Tracking
  • Example - Stochastic Volatility
  • Example - Football
  • Theoretical Results

Improving the Algorithm

Further Topics

Summary

Page 14

What to do if there were no observations...

  x0 → x1 → x2 → x3 → x4 → · · ·
        ↓     ↓     ↓     ↓
        y1    y2    y3    y4

Without observations yi the following simple approach would work:

• Sample N particles following the initial distribution:

  x_0^(1), . . . , x_0^(N) ∼ π(x0)

• For every step propagate each particle according to the transition kernel of the Markov chain:

  x_{i+1}^(j) ∼ f(· | x_i^(j)),  j = 1, . . . , N

• After each step there are N particles that approximate the distribution of xi.
• Note: very easy to update to the next step (see the sketch below).
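A minimal sketch of this prior-only propagation; the model functions sample_prior and sample_transition are assumed to be supplied by the user.

```python
import numpy as np

def propagate_without_observations(sample_prior, sample_transition, N, T, rng=None):
    """Propagate N particles through the hidden chain, ignoring all observations."""
    rng = rng or np.random.default_rng()
    particles = np.array([sample_prior(rng) for _ in range(N)])            # x_0^(1..N) ~ pi
    history = [particles]
    for _ in range(T):
        particles = np.array([sample_transition(x, rng) for x in particles])  # x_{i+1} ~ f(.|x_i)
        history.append(particles)
    return history
```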

Page 15

Importance Sampling

• Cannot sample from x0:t | y1:t directly.
• Main idea: change the density we are sampling from.
• Interested in E(φ(x0:t)|y1:t) = ∫ φ(x0:t) p(x0:t|y1:t) dx0:t
• For any density h,

  E(φ(x0:t)|y1:t) = ∫ φ(x0:t) [p(x0:t|y1:t) / h(x0:t)] h(x0:t) dx0:t ,

• Thus an unbiased estimator of E(φ(x0:t)|y1:t) is

  Î = (1/N) Σ_{i=1}^N φ(x_{0:t}^i) w_t^i ,

  where w_t^i = p(x_{0:t}^i | y1:t) / h(x_{0:t}^i) and x_{0:t}^1, . . . , x_{0:t}^N ∼ h iid.

• How to evaluate p(x_{0:t}^i | y1:t)? How to choose h? Can importance sampling be done recursively?
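As a reminder of plain importance sampling before making it sequential, a tiny self-contained example: estimating E[φ(X)] for X ∼ N(0, 1) using a wider normal proposal h. This is a generic illustration, not tied to the filtering problem.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
N = 10_000
phi = lambda x: x ** 2                               # test function, E[phi(X)] = 1 for X ~ N(0,1)

x = rng.normal(0.0, 2.0, size=N)                     # sample from the proposal h = N(0, 2^2)
w = norm.pdf(x, 0.0, 1.0) / norm.pdf(x, 0.0, 2.0)    # importance weights w = p/h
estimate = np.mean(phi(x) * w)                       # unbiased estimator of E[phi(X)]
```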

Page 16

Sequential Importance Sampling I

• Recursive definition and sampling of the importance sampling distribution:

  h(x0:t) = h(xt | x0:t−1) h(x0:t−1)

• Can the weights be computed recursively? By Bayes' Theorem:

  wt = p(x0:t | y1:t) / h(x0:t) = p(y1:t | x0:t) p(x0:t) / [h(x0:t) p(y1:t)]

  Hence,

  wt = g(yt | xt) p(y1:t−1 | x0:t−1) f(xt | xt−1) p(x0:t−1) / [h(xt | x0:t−1) h(x0:t−1) p(y1:t)]

  Thus,

  wt = wt−1 · [g(yt | xt) f(xt | xt−1) / h(xt | x0:t−1)] · [p(y1:t−1) / p(y1:t)]

Page 19

Sequential Importance Sampling II

• Can work with normalised weights w̃_t^i = w_t^i / Σ_j w_t^j ; then one gets the recursion

  w̃_t^i ∝ w̃_{t−1}^i · g(yt | x_t^i) f(x_t^i | x_{t−1}^i) / h(x_t^i | x_{0:t−1}^i)

• If one uses the prior distribution h(x0) = π(x0) and h(xt | x0:t−1) = f(xt | xt−1) as importance sampling distribution, then the recursion is simply

  w̃_t^i ∝ w̃_{t−1}^i g(yt | x_t^i)

Page 20

Failure of Sequential Importance Sampling

• Weights degenerate as t increases.
• Example: x0 ∼ N(0, 1), xt+1 ∼ N(xt, 1), yt ∼ N(xt, 1).
• N = 100 particles
• Plot of the empirical cdfs of the normalised weights w_t^1, . . . , w_t^N.

[Figure: empirical cdfs Fn(x) of the normalised weights for t = 1, 10, 20 and 40; note the log scale on the x-axis, ranging from about 1e−52 to 1e−08]

• Most weights get very small.
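The degeneracy is easy to reproduce; below is a minimal sketch of sequential importance sampling with the prior as proposal for the example above (no resampling, illustrative only). The maximal normalised weight quickly approaches one.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
N, T = 100, 40
x_true = np.cumsum(rng.standard_normal(T + 1))        # x0 ~ N(0,1), x_{t+1} ~ N(x_t, 1)
y = x_true[1:] + rng.standard_normal(T)                # y_t ~ N(x_t, 1)

particles = rng.standard_normal(N)                     # x_0^i ~ N(0, 1)
logw = np.zeros(N)                                     # log weights, for numerical stability
for t in range(T):
    particles = particles + rng.standard_normal(N)     # propagate with the prior f
    logw += norm.logpdf(y[t], particles, 1.0)          # w_t^i ∝ w_{t-1}^i g(y_t | x_t^i)
    w = np.exp(logw - logw.max())
    print(t + 1, "max normalised weight:", (w / w.sum()).max())
```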

Page 21

Resampling

• Goal: eliminate particles with very low weights.
• Suppose

  Q = Σ_{i=1}^N w_t^i δ_{x_t^i}

  is the current approximation to the distribution of xt.
• Then one can obtain a new approximation as follows:
  • Sample N iid particles x̃_t^i from Q.
  • The new approximation is

    Q̃ = Σ_{i=1}^N (1/N) δ_{x̃_t^i}

Page 22

The Bootstrap Filter

1. Sample x_0^(i) ∼ π(x0) for i = 1, . . . , N and set t = 1.
2. Importance Sampling Step. For i = 1, . . . , N:
   • Sample x_t^(i) ∼ f(xt | x_{t−1}^(i)) and set x_{0:t}^(i) = (x_{0:t−1}^(i), x_t^(i)).
   • Evaluate the importance weights w_t^(i) = g(yt | x_t^(i)).
3. Selection Step:
   • Resample with replacement N particles (x_{0:t}^(i); i = 1, . . . , N) from the set {x_{0:t}^(1), . . . , x_{0:t}^(N)} according to the normalised importance weights w_t^(i) / Σ_{j=1}^N w_t^(j).
   • t := t + 1; go to step 2.
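A minimal, model-agnostic sketch of the bootstrap filter; the functions sample_prior, sample_transition and obs_density stand for π, f and g and are assumed to be supplied by the user.

```python
import numpy as np

def bootstrap_filter(ys, N, sample_prior, sample_transition, obs_density, rng=None):
    """Bootstrap particle filter; returns particle approximations of p(x_t | y_1:t)."""
    rng = rng or np.random.default_rng()
    particles = np.array([sample_prior(rng) for _ in range(N)])
    filtered = []
    for y in ys:
        # Importance sampling step: propagate with the prior and weight by the likelihood
        particles = np.array([sample_transition(x, rng) for x in particles])
        w = np.array([obs_density(y, x) for x in particles])
        w = w / w.sum()
        # Selection step: resample with replacement according to the normalised weights
        idx = rng.choice(N, size=N, replace=True, p=w)
        particles = particles[idx]
        filtered.append(particles.copy())
    return filtered
```

For the stochastic volatility example, sample_transition and obs_density would encode the normal transition of xt and the N(0, β² exp(xt)) observation density, respectively.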

Page 23

Illustration of the Bootstrap Filter

N = 10 particles

[Figure: particle positions on the real line over three steps, propagated and then resampled according to the observation densities g(y1|x), g(y2|x), g(y3|x)]

Page 24

Example - Bearings-Only Tracking

• N = 10000 particles

[Figure: true position, estimated position and observer in the plane (x from about −1.2 to 0.0, y from −0.5 to 1.0)]

Page 25

Example - Stochastic Volatility

• Bootstrap particle filter with N = 1000

[Figure: true volatility, posterior mean volatility and 95% credible interval for t = 0, . . . , 350]

Page 26

Example: Football

• Data: Premier League 2007/08
• xt,j : "strength" of the jth team at time t, j = 1, . . . , 20
• yt : result of the games on date t
• Note: not time-homogeneous (different teams playing one another - different time intervals between games).
• Model:
  • Initial distribution of the strength: N(0, 1) for each team
  • Evolution of strength: xt,j ∼ N((1 − ∆/β)^{1/2} xt−∆,j , ∆/β); will use β = 50
  • Result of games conditional on strength: in a match between team H of strength xH (at home) and team A of strength xA,
    goals scored by the home team ∼ Poisson(λH exp(xH − xA)),
    goals scored by the away team ∼ Poisson(λA exp(xA − xH)).
    λH and λA are constants chosen based on the average number of goals scored at home/away.

Page 27

Mean Team Strength

[Figure: posterior mean team strength over the season for all 20 teams, shown in two panels (Arsenal, Aston Villa, Birmingham, Blackburn, Bolton, Chelsea, Derby, Everton, Fulham, Liverpool; Man City, Man United, Middlesbrough, Newcastle, Portsmouth, Reading, Sunderland, Tottenham, West Ham, Wigan)]

(based on N = 100000 particles)

Page 28

League Table at the end of 2007/08

 1  Man Utd           87
 2  Chelsea           85
 3  Arsenal           83
 4  Liverpool         76
 5  Everton           65
 6  Aston Villa       60
 7  Blackburn         58
 8  Portsmouth        57
 9  Manchester City   55
10  West Ham Utd      49
11  Tottenham         46
12  Newcastle         43
13  Middlesbrough     42
14  Wigan Athletic    40
15  Sunderland        39
16  Bolton            37
17  Fulham            36
18  Reading           36
19  Birmingham        35
20  Derby County      11

Page 29

Influence of the Number N of Particles

[Figure: posterior mean team strength trajectories (Arsenal, Aston Villa, Birmingham, Blackburn, Bolton, Chelsea, Derby, Everton, Fulham, Liverpool) computed with N = 100, N = 1000, N = 10000 and N = 100000 particles]

Page 30

Theoretical Results

• Convergence results are as N → ∞
• Laws of large numbers
• Central limit theorems, see e.g. Chopin (2004)
• The central limit theorems yield an asymptotic variance. This asymptotic variance can be used for theoretical comparisons of algorithms.

Page 31

Outline

Introduction

Particle Filtering

Improving the Algorithm
  • General Proposal Distribution
  • Improving Resampling

Further Topics

Summary

Page 32

General Proposal Distribution

Algorithm with a general proposal distribution h:

1. Sample x_0^(i) ∼ π(x0) for i = 1, . . . , N and set t = 1.
2. Importance Sampling Step. For i = 1, . . . , N:
   • Sample x_t^(i) ∼ ht(xt | x_{t−1}^(i)) and set x_{0:t}^(i) = (x_{0:t−1}^(i), x_t^(i)).
   • Evaluate the importance weights

     w_t^(i) = g(yt | x_t^(i)) f(x_t^(i) | x_{t−1}^(i)) / ht(x_t^(i) | x_{t−1}^(i)).

3. Selection Step:
   • Resample with replacement N particles (x_{0:t}^(i); i = 1, . . . , N) from the set {x_{0:t}^(1), . . . , x_{0:t}^(N)} according to the normalised importance weights w_t^(i) / Σ_{j=1}^N w_t^(j).
   • t := t + 1; go to step 2.

The optimal proposal distribution depends on the quantity to be estimated. A generic choice: choose the proposal to minimise the variance of the normalised weights.
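Compared with the bootstrap filter sketch earlier, only the propagation and weighting step changes. A hedged sketch of that step, where sample_proposal, proposal_density, trans_density and obs_density are assumed user-supplied functions for ht, f and g:

```python
import numpy as np

def proposal_step(particles, y, sample_proposal, proposal_density, trans_density, obs_density, rng):
    """One importance-sampling step with a general proposal h_t."""
    new = np.array([sample_proposal(x, y, rng) for x in particles])       # x_t^(i) ~ h_t(.|x_{t-1}^(i))
    w = np.array([obs_density(y, xn) * trans_density(xn, xo) / proposal_density(xn, xo, y)
                  for xn, xo in zip(new, particles)])                     # w = g * f / h_t
    return new, w / w.sum()
```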

Page 33

Improving Resampling

• Resampling was introduced to remove particles with low weights.
• Downside: it adds variance.

Page 34

Other Types of Resampling

• Goal: reduce the additional variance in the resampling step.
• Standardised weights W^1, . . . , W^N.
• N^i - number of 'offspring' of the ith element.
• Need E N^i = W^i N for all i.
• Want to minimise the resulting variance of the weights.

Multinomial Resampling: resampling with replacement.

Systematic Resampling:
• Sample U1 ∼ U(0, 1/N). Let Ui = U1 + (i − 1)/N, i = 2, . . . , N.
• N^i = |{j : Σ_{k=1}^{i−1} W^k ≤ Uj ≤ Σ_{k=1}^{i} W^k}|

Residual Resampling:
• Idea: guarantee at least Ñ^i = ⌊W^i N⌋ offspring of the ith element.
• N̂^1, . . . , N̂^N : multinomial sample of N − Σ_i Ñ^i items with weights W̃^i ∝ W^i − Ñ^i / N.
• Set N^i = Ñ^i + N̂^i.
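A minimal sketch of systematic resampling following the description above; it returns the indices of the selected particles.

```python
import numpy as np

def systematic_resample(weights, rng=None):
    """Systematic resampling: one uniform draw, N evenly spaced pointers."""
    rng = rng or np.random.default_rng()
    N = len(weights)
    u = rng.uniform(0.0, 1.0 / N) + np.arange(N) / N    # U_1, ..., U_N
    cum = np.cumsum(weights)                             # cumulative standardised weights
    cum[-1] = 1.0                                        # guard against floating-point round-off
    return np.searchsorted(cum, u)                       # particle whose weight interval contains U_j
```

In the bootstrap filter sketch earlier, the multinomial resampling line idx = rng.choice(...) could be replaced by idx = systematic_resample(w, rng).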

Page 35

Adaptive Resampling

• Resampling was introduced to remove particles with low weights.
• However, resampling introduces additional randomness to the algorithm.
• Idea: only resample when the weights are "too uneven".
• Can be assessed by computing the variance of the weights and comparing it to a threshold.
• Equivalently, one can compute the "effective sample size" (ESS):

  ESS = ( Σ_{i=1}^n (w_t^i)² )^{−1}

  (w_t^1, . . . , w_t^n are the normalised weights)

• Intuitively, the effective sample size describes how many samples from the target distribution would be roughly equivalent to importance sampling with the weights w_t^i.
• Thus one could decide to resample only if

  ESS < k,

  where k can be chosen e.g. as k = N/2.
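The corresponding check is a one-liner; a sketch of how it could be wired into the filtering loop, with the threshold k = N/2 suggested above (the resampling call refers to the systematic_resample sketch on the previous slide).

```python
import numpy as np

def effective_sample_size(w):
    """ESS of normalised weights w: 1 / sum_i (w_i)^2."""
    return 1.0 / np.sum(np.asarray(w) ** 2)

# inside the filtering loop (sketch):
# if effective_sample_size(w) < N / 2:
#     idx = systematic_resample(w, rng)
#     particles, w = particles[idx], np.full(N, 1.0 / N)
```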

Page 36

Outline

Introduction

Particle Filtering

Improving the Algorithm

Further Topics
  • Path Degeneracy
  • Smoothing
  • Parameter Estimates

Summary

Page 37

Path Degeneracy

• Let s ∈ ℕ. Then

  #{x_{0:s}^i : i = 1, . . . , N} → 1  as the number of steps t → ∞,

  i.e. the number of distinct particle paths up to time s collapses to one.

Example

x0 ∼ N(0, 1), xt+1 ∼ N(xt, 1), yt ∼ N(xt, 1). N = 100 particles.

[Figure: surviving particle paths at t = 20, t = 40 and t = 80]

Page 38

Particle Smoothing

• Estimate the distribution of the state xt given all the observations y1, . . . , yτ up to some later point τ > t.
• Intuitively, a better estimate should be possible than with filtering (where only information up to τ = t is available).
• Trajectory estimates tend to be smoother than those obtained by filtering.
• More sophisticated algorithms are needed.

Page 39

Filtering and Parameter Estimates

• Recall that the model is given by
  • π(x0) - the initial distribution
  • f(xt|xt−1) for t ≥ 1 - the transition kernel of the Markov chain
  • g(yt|xt) for t ≥ 1 - the distribution of the observations
• In practical applications these distributions will not be known explicitly - they will depend on unknown parameters themselves.
• Two different starting points:
  • Bayesian point of view: parameters have some prior distribution
  • Frequentist point of view: parameters are unknown constants
• Examples:
  • Stoch. volatility: xt ∼ N(αxt−1, σ²/(1−α)²), x1 ∼ N(0, σ²/(1−α)²), yt ∼ N(0, β² exp(xt)). Unknown parameters: σ, β, α.
  • Football: unknown parameters β, λH, λA.
• How to estimate these unknown parameters?

Page 40

Maximum Likelihood Approach

• Let θ ∈ Θ contain all unknown parameters in the model.
• Would need the marginal density pθ(y1:t).
• This can be estimated by running the particle filter for each θ of interest and multiplying the unnormalised weights; see the slides Sequential Importance Sampling I/II earlier in the lecture for some intuition.

Page 41

Artificial(?) Random Walk Dynamics

• Allow the parameters to change with time - give them some dynamics.
• More precisely:
  • Suppose we have a parameter vector θ.
  • Allow it to depend on time (θt).
  • Assign a dynamic to it, i.e. a prior distribution and some transition probability from θt−1 to θt.
  • Incorporate θt in the state vector xt.
• May be reasonable in the stochastic volatility and football examples.

Page 42

Bayesian Parameters

• Prior on θ.
• Want: posterior p(θ|y1, . . . , yt).
• Naive approach:
  • Incorporate θ into xt; the transition for these components is just the identity.
  • Resampling will lead to θ degenerating - after a moderate number of steps only few (or even one) distinct values of θ will be left.
• New approaches:
  • Particle filter within an MCMC algorithm, Andrieu et al. (2010) - computationally very expensive.
  • SMC² by Chopin, Jacob, Papaspiliopoulos, arXiv:1101.1528.

Page 43

Outline

Introduction

Particle Filtering

Improving the Algorithm

Further Topics

Summary
  • Concluding Remarks

Page 44

Concluding Remarks

• Active research area.
• A collection of references/resources regarding SMC: http://www.stats.ox.ac.uk/~doucet/smc_resources.html
• Collection of application-oriented articles: Doucet et al. (2001)
• Brief introduction to SMC: Robert & Casella (2004, Chapter 14)
• R package implementing several methods: pomp, http://pomp.r-forge.r-project.org/

Page 45

Part I

Appendix

Page 46

References I

Andrieu, C., Doucet, A. & Holenstein, R. (2010). Particle Markov chain Monte Carlo methods. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 72, 269–342.

Chopin, N. (2004). Central limit theorem for sequential Monte Carlo methods and its application to Bayesian inference. Ann. Statist. 32, 2385–2411.

Doucet, A., Freitas, N. D. & Gordon, N. (eds.) (2001). Sequential Monte Carlo Methods in Practice. Springer.

Doucet, A. & Johansen, A. M. (2008). A tutorial on particle filtering and smoothing: Fifteen years later. Available at http://www.cs.ubc.ca/~arnaud/doucet_johansen_tutorialPF.pdf.

Gordon, N., Salmond, D. & Smith, A. (1993). Novel approach to nonlinear/non-Gaussian Bayesian state estimation. Radar and Signal Processing, IEE Proceedings F 140, 107–113.

Kalman, R. (1960). A new approach to linear filtering and prediction problems. Journal of Basic Engineering 82, 35–45.

Page 47

References II

Øksendal, B. (2003). Stochastic Differential Equations: An Introduction with Applications. Springer.

Robert, C. & Casella, G. (2004). Monte Carlo Statistical Methods. Second ed., Springer.
