
Sampling Methods for Estimation: An Introduction

Particle Filters, Sequential Monte-Carlo, and Related Topics

Nahum Shimkin Faculty of Electrical Engineering Technion


Outline

The nonlinear estimation problem

Monte-Carlo Sampling:

- Crude Monte-Carlo

- Importance Sampling

Sequential Monte-Carlo


The Estimation Problem:

Consider a discrete-time stochastic system with state $x_t \in X$ (usually $X \subseteq \mathbb{R}^n$), and observation $y_t \in Y$. The system is specified by the following prior data:

$p(x_0)$: initial state distribution

$p(x_{t+1} \mid x_t)$: state transition law (t = 0, 1, 2, ...)

$p(y_t \mid x_t)$: measurement distribution (t = 1, 2, ...)

Denote $x_{0:t} = (x_0, x_1, \dots, x_t)$, $y_{1:t} = (y_1, \dots, y_t)$.


The Estimation Problem (cont’d)

Note:

As usual, we assume the Markovian state property:

$p(x_{t+1} \mid x_{0:t}, y_{1:t}) = p(x_{t+1} \mid x_t), \qquad p(y_t \mid x_{0:t}, y_{1:t-1}) = p(y_t \mid x_t).$

The above laws are often specified through state equations:

$x_{t+1} = f(x_t, w_t); \qquad y_t = h(x_t, v_t).$

The system may be time-varying; for simplicity we omit the time dependence from the notation.
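To make the state-equation form concrete, here is a minimal Python sketch of one such system. The particular f, h, and noise levels below are illustrative choices (a standard scalar benchmark-style nonlinear model), not part of the original slides.

```python
import numpy as np

# Illustrative scalar nonlinear state-space model (assumed, not from the slides):
#   x_{t+1} = f(x_t, w_t) = 0.5*x_t + 25*x_t/(1 + x_t**2) + w_t,  w_t ~ N(0, Q)
#   y_t     = h(x_t, v_t) = x_t**2/20 + v_t,                      v_t ~ N(0, R)
Q, R = 1.0, 0.1

def f(x, w):
    return 0.5 * x + 25.0 * x / (1.0 + x**2) + w

def h(x, v):
    return x**2 / 20.0 + v

def simulate(T, seed=0):
    """Draw a state trajectory x_0..x_T and observations y_1..y_T."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 1.0)                          # x_0 ~ p(x_0)
    xs, ys = [x], []
    for _ in range(T):
        x = f(x, rng.normal(0.0, np.sqrt(Q)))         # state transition p(x_{t+1} | x_t)
        ys.append(h(x, rng.normal(0.0, np.sqrt(R))))  # measurement p(y_t | x_t)
        xs.append(x)
    return np.array(xs), np.array(ys)
```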


The Estimation Problem (cont’d)

We wish to estimate the state xt, based on the prior

data and the observations y1:t.

The MMSE-optimal estimator, for example, is given by

$\hat{x}_t = E(x_t \mid y_{1:t})$, with covariance $P_t = \mathrm{cov}(x_t \mid y_{1:t})$.

More generally, compute the posterior state distribution:

$\hat{p}_t(x) = p(x_t = x \mid y_{1:t}), \quad x \in X.$

We can then compute

$\hat{x}_t = \int_X x\, \hat{p}_t(x)\, dx$ (and its covariance),

$\hat{x}_t^{\mathrm{MAP}} = \arg\max_x \hat{p}_t(x)$, (etc.)


Posterior Distribution Formulas:

Using the Bayes rule, we easily obtain the following two-step recursive formulas that relate $p(x_t \mid y_{1:t})$ to $p(x_{t-1} \mid y_{1:t-1})$ and $y_t$:

Time Update (prediction step):

$p(x_t \mid y_{1:t-1}) = \int_{x_{t-1}} p(x_t \mid x_{t-1})\, p(x_{t-1} \mid y_{1:t-1})\, dx_{t-1}$

Measurement Update:

$p(x_t \mid y_{1:t}) = \dfrac{p(y_t \mid x_t)\, p(x_t \mid y_{1:t-1})}{\int_{x_t} p(y_t \mid x_t)\, p(x_t \mid y_{1:t-1})\, dx_t}$
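These two updates can be checked numerically on a low-dimensional problem by brute-force integration over a grid. Below is a rough Python sketch (my own illustration, not from the slides); `trans_pdf` and `lik_pdf` stand for the model densities $p(x_t \mid x_{t-1})$ and $p(y_t \mid x_t)$.

```python
import numpy as np

def bayes_filter_step(grid, prior, y, trans_pdf, lik_pdf):
    """One exact-recursion cycle, approximated by numerical integration on a 1-D grid.

    grid      : (M,) equally spaced grid points for x
    prior     : (M,) values of p(x_{t-1} | y_{1:t-1}) on the grid
    trans_pdf : vectorized p(x_t | x_{t-1}),  called as trans_pdf(x_next, x_prev)
    lik_pdf   : vectorized p(y_t | x_t),      called as lik_pdf(y, x)
    """
    dx = grid[1] - grid[0]
    # Time update: p(x_t | y_{1:t-1}) = int p(x_t | x_{t-1}) p(x_{t-1} | y_{1:t-1}) dx_{t-1}
    T = trans_pdf(grid[:, None], grid[None, :])        # T[i, j] = p(grid[i] | grid[j])
    predicted = T @ prior * dx
    # Measurement update: multiply by the likelihood p(y_t | x_t) and renormalize
    posterior = lik_pdf(y, grid) * predicted
    posterior /= posterior.sum() * dx
    return predicted, posterior
```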


Posterior Distribution Formulas: (2)

Unfortunately, these are hard to compute directly –

especially when x is high-dimensional!

Some well-known approximations:

The Extended Kalman Filter

Gaussian Sum Filters

General/Specific Numerical Integration Methods

Sampling-based methods – Particle Filters.


Monte-Carlo Sampling:

Crude Monte Carlo

The basic idea:

Estimate the expectation integral

$I(g) = E_{x \sim p(\cdot)}\big(g(x)\big) = \int g(x)\, p(x)\, dx$

by the following sample mean:

$I_N(g) = \frac{1}{N} \sum_{i=1}^{N} g\big(x^{(i)}\big)$

where $x^{(i)} \sim p(x)$. That is, $\{x^{(i)}\}_{i=1}^{N}$ are independent samples from $p(x)$.
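A minimal numerical illustration of the crude estimator; the target p and test function g below are my own choices, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

def crude_mc(g, sampler, N):
    """Estimate I(g) = E_{x~p}[g(x)] by the sample mean over N i.i.d. draws from p."""
    x = sampler(N)              # x^(i) ~ p(x), i = 1, ..., N
    return np.mean(g(x))

# Example: E[x^2] for x ~ N(0, 1); the true value is 1.
I_N = crude_mc(lambda x: x**2, lambda N: rng.standard_normal(N), N=100_000)
print(I_N)                      # close to 1, with O(1/sqrt(N)) fluctuation
```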


Crude Monte-Carlo – Interpretation

We can interpret Monte-Carlo sampling as trying to approximate p(x) by the discrete distribution

$\hat{p}_N(x) = \frac{1}{N} \sum_{i=1}^{N} \delta\big(x - x^{(i)}\big).$

It is easily seen that

$I_N(g) = \int g(x)\, \hat{p}_N(x)\, dx.$

The samples $x^{(i)}$ are sometimes called “particles”, with weights $w^{(i)} = N^{-1}$.

[Figure: the discrete approximation $\hat{p}_N(x)$ – equal-weight point masses of height $1/N$ at the sample points $x^{(i)}$.]


Crude Monte Carlo: Properties

The following basic properties follow from standard results for i.i.d. sequences.

1. Bias: $I_N(g)$ is an unbiased estimate of $I(g)$, namely $E\big(I_N(g)\big) = I(g)$.

2. SLLN: $I_N(g) \to I(g)$ a.s., as $N \to \infty$.

3. CLT: $\sqrt{N}\,\big(I_N(g) - I(g)\big) \to N(0, \sigma_g^2)$ in distribution, whenever $\sigma_g^2 = \mathrm{cov}\big(g(x)\big) < \infty$.

Advantages (over numerical integration):

Straightforward scheme.

Focuses on “important” areas of p(x).

Rate of convergence does not depend (directly) on dim(x).


The Monte-Carlo Method: Origins

Conceiver: Stanislaw Ulam (1946). Early developments (1940s-50s):

Ulam & von Neumann: Importance Sampling, Rejection Sampling. Metropolis & von Neumann: Markov-Chain Monte-Carlo (MCMC).

A surprisingly large array of applications, including: global optimization, statistical and quantum physics, computational biology, signal and image processing, rare-event simulation, state estimation, …


Importance Sampling (IS)

Idea:

Sample x from another distribution, q(x). Use weights to obtain the correct distribution.

Advantages:

May be (much) easier to sample from q than from p.

The covariance $\sigma_g^2$ may be (much) smaller:

Faster convergence / better accuracy.


Importance Sampling (2)

Let q(x) be a selected distribution over x – the proposal distribution. Assume that

$p(x) > 0 \;\Rightarrow\; q(x) > 0.$

Observe that:

$I(g) = \int g(x)\, p(x)\, dx = \int g(x)\, \frac{p(x)}{q(x)}\, q(x)\, dx = \int g(x)\, w(x)\, q(x)\, dx$

where

$w(x) = \frac{p(x)}{q(x)}$

is the likelihood ratio, or simply the weight function.


Importance Sampling (3)

This immediately leads to the IS algorithm:

(1) Sample $x^{(i)} \sim q(x)$, $i = 1, \dots, N$.

(2) Compute the weights $w^{(i)} = \frac{p(x^{(i)})}{q(x^{(i)})}$, and $\hat{I}_N(g) = \frac{1}{N} \sum_{i=1}^{N} w^{(i)}\, g\big(x^{(i)}\big)$.

Basic Properties:

The above-mentioned convergence properties are maintained.

The CLT covariance is now $\hat{\sigma}_g^2 = \mathrm{cov}_{x \sim q}\big(g(x)\, w(x)\big)$.

This covariance is minimized by $q(x) \propto p(x)\, |g(x)|$.

Unfortunately, this choice is often too complicated, and we settle for approximations.
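A short Python sketch of the two-step IS algorithm. The target p, proposal q, and test function g in the example are my own illustrative choices (a Gaussian tail probability, where IS clearly beats crude Monte-Carlo); they are not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

def gauss_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def importance_sampling(g, p_pdf, q_pdf, q_sampler, N):
    x = q_sampler(N)                # (1) sample x^(i) ~ q(x)
    w = p_pdf(x) / q_pdf(x)         # (2) weights w^(i) = p(x^(i)) / q(x^(i))
    return np.mean(w * g(x))        #     I_N(g) = (1/N) sum_i w^(i) g(x^(i))

# Example: tail probability P(x > 3) for x ~ N(0, 1), with proposal q = N(3, 1).
I_N = importance_sampling(
    g=lambda x: (x > 3.0).astype(float),
    p_pdf=lambda x: gauss_pdf(x, 0.0, 1.0),
    q_pdf=lambda x: gauss_pdf(x, 3.0, 1.0),
    q_sampler=lambda N: rng.normal(3.0, 1.0, size=N),
    N=100_000,
)
print(I_N)    # close to the true tail probability (about 1.35e-3)
```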


Importance Sampling (4)

Additional Comments:

It is often convenient to use the normalized weights $\tilde{w}^{(i)}$, with $\sum_{i=1}^{N} \tilde{w}^{(i)} = 1$, so that

$\hat{I}_N(g) = \sum_{i=1}^{N} \tilde{w}^{(i)}\, g\big(x^{(i)}\big).$

The weights $w^{(i)}$ are sometimes not properly normalized, for example when p(x) is known only up to a (multiplicative) constant. We can still use the last formula, with

$\tilde{w}^{(i)} = \frac{w^{(i)}}{\sum_{j=1}^{N} w^{(j)}}.$

We may interpret $\big(x^{(i)}, \tilde{w}^{(i)}\big)$ as a “weighted-particles” approximation to p(x):

$\hat{p}_N(x) = \sum_{i=1}^{N} \tilde{w}^{(i)}\, \delta\big(x - x^{(i)}\big) \approx p(x).$
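When p(x) is known only up to a constant, this normalized-weight (self-normalized) form of the estimator is the one actually used; a minimal sketch, with an arbitrary illustrative target of my own choosing:

```python
import numpy as np

rng = np.random.default_rng(0)

def gauss_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def self_normalized_is(g, p_unnorm, q_pdf, q_sampler, N):
    """Self-normalized IS: p_unnorm may omit the normalizing constant of p."""
    x = q_sampler(N)
    w = p_unnorm(x) / q_pdf(x)          # unnormalized weights w^(i)
    w_tilde = w / w.sum()               # normalized weights, summing to 1
    return np.sum(w_tilde * g(x))       # I_N(g) = sum_i w~^(i) g(x^(i))

# Example: mean of a symmetric target known only up to a constant, exp(-|x|^3),
# using a wide Gaussian proposal; the true mean is 0.
est = self_normalized_is(
    g=lambda x: x,
    p_unnorm=lambda x: np.exp(-np.abs(x) ** 3),
    q_pdf=lambda x: gauss_pdf(x, 0.0, 2.0),
    q_sampler=lambda N: rng.normal(0.0, 2.0, size=N),
    N=50_000,
)
print(est)    # close to 0
```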


Two Discrete Approximations

[Figure: two discrete approximations of p(x) – equal-weight particles with weights $1/N$ (crude Monte-Carlo), and weighted particles (importance sampling).]


The Particle Filter Algorithm

We can now apply these ideas to our filtering problem. The basic algorithm is as follows:

0. Initialization: sample from $p_0(x)$ to obtain $\big\{x_0^{(i)}, w_0^{(i)}\big\}$, with $w_0^{(i)} = N^{-1}$. For $t = 1, 2, \dots$, given $\big\{x_{t-1}^{(i)}, w_{t-1}^{(i)}\big\}$:

1. Time update: For $i = 1, \dots, N$, sample $\hat{x}_t^{(i)} \sim p\big(x_t \mid x_{t-1}^{(i)}\big)$. Set $\hat{w}_t^{(i)} = w_{t-1}^{(i)}$.

2. Measurement update: For $i = 1, \dots, N$, evaluate the (unnormalized) importance weights: $\tilde{w}_t^{(i)} = \hat{w}_t^{(i)}\, p\big(y_t \mid \hat{x}_t^{(i)}\big)$.

3. Resampling: Sample N points $\big\{x_t^{(i)}\big\}_{i=1}^{N}$ from the discrete distribution corresponding to $\big\{\hat{x}_t^{(i)}, \tilde{w}_t^{(i)}\big\}$. Continue with $\big\{x_t^{(i)}, w_t^{(i)} = N^{-1}\big\}$.
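Putting steps 0-3 together gives a compact bootstrap-filter sketch in Python; the scalar model at the bottom is the same illustrative one used earlier and is not prescribed by the slides.

```python
import numpy as np

def bootstrap_pf(ys, N, propagate, lik, p0_sampler, seed=0):
    """Bootstrap particle filter (steps 0-3 above), returning posterior-mean estimates.

    propagate(x, rng)  : samples p(x_t | x_{t-1}) for an array of particles
    lik(y, x)          : evaluates p(y_t | x_t) per particle
    p0_sampler(N, rng) : draws the initial particles x_0^(i)
    """
    rng = np.random.default_rng(seed)
    x = p0_sampler(N, rng)                      # 0. initialization, weights = 1/N
    means = []
    for y in ys:
        x = propagate(x, rng)                   # 1. time update
        w = lik(y, x)                           # 2. measurement update (unnormalized)
        w = w / w.sum()
        means.append(np.sum(w * x))             #    posterior-mean estimate at time t
        x = x[rng.choice(N, size=N, p=w)]       # 3. resampling; weights reset to 1/N
    return np.array(means)

# Example, using the illustrative scalar model from before:
Q, R = 1.0, 0.1
propagate = lambda x, rng: 0.5*x + 25.0*x/(1.0 + x**2) + rng.normal(0.0, np.sqrt(Q), x.shape)
lik = lambda y, x: np.exp(-0.5 * (y - x**2/20.0)**2 / R) / np.sqrt(2.0*np.pi*R)
p0 = lambda N, rng: rng.normal(0.0, 1.0, N)
# x_hat = bootstrap_pf(ys, N=1000, propagate=propagate, lik=lik, p0_sampler=p0)
```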


The Particle Filter Algorithm (2)

Output: Based on $\big\{x_t^{(i)}, w_t^{(i)}\big\}$ we can estimate

$p(x_t \mid y_{1:t}) \approx \sum_{i=1}^{N} w_t^{(i)}\, \delta\big(x_t - x_t^{(i)}\big),$

$E\big(g(x_t) \mid y_{1:t}\big) \approx \sum_{i=1}^{N} w_t^{(i)}\, g\big(x_t^{(i)}\big).$

We may also use $\big\{\hat{x}_t^{(i)}, \tilde{w}_t^{(i)}\big\}$ for that purpose.

This algorithm is the original “bootstrap filter” introduced in Gordon et al. (1993).

CONDENSATION is a similar algorithm developed independently in the computer vision context (Isard and Blake, 1998).

Many improvements and variations now exist.


Particle Flow

[Figure: flow of the particle set through one filter cycle – from $p(x_{t-1} \mid y_{1:t-1})$, through the time update to $p(x_t \mid y_{1:t-1})$, reweighting by $p(y_t \mid x_t^{(i)})$ and resampling, to $p(x_t \mid y_{1:t})$ (from Doucet et al., 2001).]


The Particle Filter: Time Update

Recall that $\big\{x_{t-1}^{(i)}, w_{t-1}^{(i)}\big\}$ approximates $p(x_{t-1} \mid y_{1:t-1})$, and $p(x_t \mid x_{t-1})$ is given.

An approximation $\big\{\hat{x}_t^{(i)}, \hat{w}_t^{(i)}\big\}$ is therefore obtained by sampling $\hat{x}_t^{(i)} \sim p\big(x_t \mid x_{t-1}^{(i)}\big)$ and setting $\hat{w}_t^{(i)} = w_{t-1}^{(i)}$.

For example, for the familiar state model

$x_t = f(x_{t-1}, v_{t-1}), \quad v_{t-1} \sim N(0, Q),$

we get

$\hat{x}_t^{(i)} = f\big(x_{t-1}^{(i)}, v_{t-1}^{(i)}\big), \quad v_{t-1}^{(i)} \sim N(0, Q).$


The Particle Filter: Time Update (2)

It is possible (and often useful) to sample $\hat{x}_t^{(i)}$ from another (proposal) distribution $q(x_t \mid x_{t-1})$.

In that case we need to modify the weights:

$\hat{w}_t^{(i)} = w_{t-1}^{(i)}\, \dfrac{p\big(\hat{x}_t^{(i)} \mid x_{t-1}^{(i)}\big)}{q\big(\hat{x}_t^{(i)} \mid x_{t-1}^{(i)}\big)}.$

The ideal choice for $q(x_t \mid x_{t-1})$ would be

$q^*(x_t \mid x_{t-1}) = \dfrac{p(x_t \mid y_{1:t})}{p(x_{t-1} \mid y_{1:t-1})}.$

Some schemes exist which try to approximate sampling from $q^*$ – e.g., using MCMC.
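In code, switching to a general proposal changes only the propagation call and the weight line; a minimal sketch, where `prop_sampler`, `trans_pdf`, and `prop_pdf` are placeholders for whatever model and proposal are in use (not names from the slides):

```python
import numpy as np

def time_update_with_proposal(x_prev, w_prev, prop_sampler, trans_pdf, prop_pdf, rng):
    """Propagate particles with a proposal q(x_t | x_{t-1}) and correct the weights.

    prop_sampler(x_prev, rng) : samples x^_t^(i) ~ q(. | x_{t-1}^(i))
    trans_pdf(x_new, x_prev)  : evaluates p(x_t | x_{t-1}) per particle
    prop_pdf(x_new, x_prev)   : evaluates q(x_t | x_{t-1}) per particle
    """
    x_new = prop_sampler(x_prev, rng)
    w_new = w_prev * trans_pdf(x_new, x_prev) / prop_pdf(x_new, x_prev)
    return x_new, w_new
```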


The Particle Filter: Measurement Update

Recall that

$p(x_t \mid y_{1:t}) = \frac{1}{c}\, p(y_t \mid x_t)\, p(x_t \mid y_{1:t-1}),$

where c is a normalizing constant, and $\big\{\hat{x}_t^{(i)}, \hat{w}_t^{(i)}\big\}$ approximate $p(x_t \mid y_{1:t-1})$.

To obtain an approximation $\big\{x_t^{(i)}, w_t^{(i)}\big\}_{i=1}^{N}$ to $p(x_t \mid y_{1:t})$, we only need to modify the weights accordingly:

$w_t^{(i)} = \hat{w}_t^{(i)}\, p\big(y_t \mid \hat{x}_t^{(i)}\big)$, with $x_t^{(i)} = \hat{x}_t^{(i)}$.


The Particle Filter: Resampling

Resampling is not strictly required for asymptotic convergence (as $N \to \infty$).

However, without it, the distributions tend to degenerate, with few “large” particles, and many small ones.

The idea is therefore to split large particles into several smaller ones, and discard very small particles.

Random sampling from $\big\{x_t^{(i)}, w_t^{(i)}\big\}$ is one option. Another is a deterministic splitting of particles according to $w_t^{(i)}$, namely: $N_i \approx N\, w_t^{(i)}$ particles at $\tilde{x}_t^{(i)} = x_t^{(i)}$, each with weight $\tilde{w}_t^{(i)} = N_i^{-1}\, w_t^{(i)}$.

In fact, the last option tends to reduce the variance.
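Both options can be sketched in a few lines; multinomial resampling implements the random option, while systematic resampling is one standard low-variance scheme in the spirit of the deterministic splitting described above (my own illustration, not code from the slides):

```python
import numpy as np

def multinomial_resample(x, w, rng):
    """Random option: draw N i.i.d. samples from the weighted set {x^(i), w^(i)}."""
    N = len(x)
    return x[rng.choice(N, size=N, p=w)]

def systematic_resample(x, w, rng):
    """Low-variance option: each particle is copied roughly N * w^(i) times."""
    N = len(x)
    positions = (rng.uniform() + np.arange(N)) / N    # N evenly spaced points in [0, 1)
    cum = np.cumsum(w)
    cum[-1] = 1.0                                     # guard against round-off
    return x[np.searchsorted(cum, positions)]
```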


Additional Issues and Comments

Data Association: This is an important problem for applications such as multiple-target tracking and visual tracking.

Within the Kalman Filter framework, such problems are often treated by creating multiple explicit hypotheses regarding the data origin, which may lead to a combinatorial explosion.

Isard and Blake (1998) observed that particle filters open new possibilities for data association for tracking. Specifically, different particles can be associated to different measurements, based on some “closeness” (or likelihood) measure.

Encouraging results have been reported.


Additional Issues and Comments (2)

Joint Parameter and State Estimation:

As is well known, this “adaptive estimation” problem can be reduced to a standard estimation problem by embedding the parameters in the state vector.

This, however, leads to a bilinear state equation, for which standard KF algorithms did not prove successful. The problem is currently treated using non-sequential algorithms (e.g., EM + smoothing).

Particle filters have been applied to this problem with encouraging results.

26

Additional Issues and Comments (3)

Low-noise State Components:

State components which are not excited by external noise (such as fixed parameters) pose a problem since the respective particle components may be fixed in their place, and cannot correct initially bad placements.

The simplest solution is to add a little noise. Additional suggestions involve intermediate MCMC steps, and related ideas.
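A minimal sketch of the “add a little noise” fix (often called roughening or jittering); the jitter scale is an assumed tuning parameter, not a value given in the slides:

```python
import numpy as np

def roughen(particles, jitter_std, rng):
    """Add small artificial noise to otherwise noise-free state components
    (e.g. static parameters embedded in the state), so that resampled copies
    can still move away from initially bad placements."""
    return particles + rng.normal(0.0, jitter_std, size=particles.shape)

# Example: jitter only the static parameter component of the particle set.
# theta_particles = roughen(theta_particles, jitter_std=0.01, rng=np.random.default_rng(0))
```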


Conclusion:

Particle Filters and Sequential Monte Carlo Methods are an evolving and active topic, with good potential to handle “hard” estimation problems, involving non-linearity and multi-modal distributions.

In general, these schemes are computationally expensive as the number of “particles” N needs to be large for precise results.

Additional work is required on the computational issue – in particular, optimizing the choice of N, and related error bounds.

Another promising direction is the merging of sampling methods with more disciplined approaches (such as Gaussian filters, and the Rao-Blackwellization scheme …).


To Probe Further:

Monte-Carlo Methods (at large):

R. Rubinstein, Simulation and the Monte-Carlo Method, Wiley, 1981 (and 1996).

J.S. Liu, Monte-Carlo Strategies in Scientific Computing, Springer, 2001.

M. Evans and T. Swartz, “Methods for approximating integrals in statistics with special emphasis on Bayesian integration problems”, Statistical Science 10(3), 1995, pp. 254-272.

Particle Filters:

N.J. Gordon, D.J. Salmond and A. Smith, “Novel approach to nonlinear/non-Gaussian Bayesian state estimation”, IEE Proceedings-F 140(2), 1993, pp. 107-113.

M. Isard and A. Blake, “CONDENSATION – conditional density propagation for visual tracking”, International Journal of Computer Vision 28(1), 1998, pp. 5-28.

A. Doucet, N. de Freitas and N. Gordon (eds.), Sequential Monte Carlo Methods in Practice, Springer, 2001.

S. Arulampalam et al., “A tutorial on particle filters for on-line nonlinear/non-Gaussian Bayesian tracking”, IEEE Trans. on Signal Processing 50(2), 2002. (Recommended)