Introduction to Sequential Monte Carlo and Particle MCMC Methods
Axel Finke
Department of Statistics, University of Warwick, UK
Oberseminar, Münster: July 10, 2013
Piecewise Deterministic Processes (PDPs)
Ingredients
• Time-dependent parameters: marked point process $(\tau_j, \phi_j)_{j \in \mathbb{N} \cup \{0\}}$ with
– jump times $0 = \tau_0 < \tau_1 < \tau_2 < \dots$
– jump sizes $\phi_0, \phi_1, \phi_2, \dots$
• Static parameters: $\theta$.
• Deterministic function: $F^\theta$.
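Sketch
[Figure: sketch of a PDP, value against time, labelled with jump times $\tau_0, \tau_1, \tau_2$ and jump sizes $\phi_0, \phi_1, \phi_2$; between jumps the path follows $F^\theta(\cdot, \tau_0, \phi_0)$, $F^\theta(\cdot, \tau_1, \phi_1)$ and $F^\theta(\cdot, \tau_2, \phi_2)$.]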
Observations
[Figure: the PDP sketch (value against time, jump times $\tau_0, \tau_1, \tau_2$, jump sizes $\phi_0, \phi_1, \phi_2$) with observations overlaid.]
Example I: Object Tracking
[Figure: observations and true trajectory, Dimension 2 against Dimension 1 ($\times 10^4$).]
Example I: Object Tracking (continued)
[Figure: three panels against time, with jump times $\tau_j$ and $\tau_{j+1}$ marked: acceleration [m/s²] (not observed), speed [m/s] (not observed), and position [m] (observations).]
Example II: Shot-Noise Cox Process
[Figure: two panels against time: frequency of insurance claims, and the true intensity with "catastrophic" events marked.]
Inference
• Sequential Monte Carlo filter for PDPs introduced by Whiteley, Johansen & Godsill (2011).
• Efficient methods for estimating $\theta$ still missing (though Reversible-Jump MCMC works for simple models).
Piecewise Deterministic Processes
Static Monte Carlo Methods
– Motivation
– Vanilla Monte Carlo
– Importance Sampling
– Markov Chain Monte Carlo Methods
– State-Space-Extension Tricks
Sequential Monte Carlo Methods
– Motivation
– Generic SMC Algorithm
– Sample Degeneracy
– SMC Samplers
Particle MCMC Methods
– Motivation and Setup
– Extended Target Distribution
– Particle Marginal Metropolis–Hastings Algorithm
– Particle Gibbs Sampler
– SMC$^2$ Algorithm
Motivation
• $X \sim \pi$ with support $E$.
• $f\colon E \to \mathbb{R}$ some ($\pi$-integrable) function.
• Want to calculate
$$\pi(f) := \int_E f(x)\,\pi(dx) \;\Bigl(= \int_E f(x)\,\pi(x)\,dx\Bigr) = \mathbb{E}[f(X)].$$
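Motivation, continued
Example
Often, $E \subseteq \mathbb{R}$ and $\pi(x) = p(x \mid y)$ for some data $y$, so that
$$\pi(f) = \begin{cases} \mathbb{P}(X \in A \mid Y = y), & \text{if } f = \mathbb{1}_A \text{ for } A \subseteq E,\\ \mathbb{E}[X^k \mid Y = y], & \text{if } f = \mathrm{id}^k,\\ \operatorname{var}[X \mid Y = y], & \text{if } f = [\mathrm{id} - \pi(\mathrm{id})]^2,\\ \quad\vdots \end{cases}$$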
Monte Carlo Methods
• Problem: analytical evaluation of $\pi(f)$ costly/impossible.
• Idea:
1. construct approximation $\hat\pi$ of $\pi$.
2. estimate $\pi(f)$ by
$$\hat\pi(f) = \int_E f(x)\,\hat\pi(dx).$$
• Monte Carlo methods: construction of $\hat\pi$ based on (computer-generated) (pseudo-)random numbers.
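Vanilla Monte Carlo
• Sample $X^1, \dots, X^N \overset{\text{iid}}{\sim} \pi$.
• Approximate $\pi(dx)$ by the empirical measure:
$$\hat\pi^{mc}(dx) := \frac{1}{N} \sum_{i=1}^N \delta_{X^i}(dx).$$
[Figure: CDF of $\pi$ and empirical CDF of $\hat\pi^{mc}$ (probability against $x$), shown for $N = 10, 30, 70, 200, 1000, 5000$.]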
Vanilla Monte Carlo, continued
• Estimate $\pi(f) = \mathbb{E}[f(X)]$ by
$$\hat\pi^{mc}(f) = \int_E f(x)\,\hat\pi^{mc}(dx) = \frac{1}{N} \sum_{i=1}^N f(X^i).$$
• Unbiased and consistent.
• Monte Carlo methods are best viewed as simulation techniques for approximating measures.
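
As a minimal sketch of the vanilla Monte Carlo estimator above, the following Python snippet averages a test function over iid draws from the target. The target (a standard normal), the test function (the indicator of $(1, \infty)$) and the names `vanilla_mc`, `sampler` and `rng` are illustrative assumptions, not part of the talk.

```python
import math
import random


def vanilla_mc(f, sampler, n, rng):
    """Estimate E[f(X)] by averaging f over n iid draws from the target."""
    return sum(f(sampler(rng)) for _ in range(n)) / n


rng = random.Random(0)
# Assumed example: target N(0, 1), f = indicator of the set A = (1, inf).
estimate = vanilla_mc(
    f=lambda x: 1.0 if x > 1.0 else 0.0,
    sampler=lambda r: r.gauss(0.0, 1.0),
    n=100_000,
    rng=rng,
)
# Closed-form value of P(X > 1) for a standard normal, for comparison.
exact = 0.5 * math.erfc(1.0 / math.sqrt(2.0))
print(estimate, exact)
```

With an indicator as the test function, the estimate is simply the fraction of samples that fall in the set, which is the empirical measure evaluated on that set.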
Motivation
Setup
Assume that
$$\pi(x) = \frac{\gamma(x)}{Z}$$
with normalising constant $Z = \int_E \gamma(x)\,dx$, but
• we cannot sample from $\pi$.
• $Z$ is unknown (i.e. we can evaluate $\gamma$ but not $\pi$).
Example (Bayesian inference)
Let $\pi(x) := p(x \mid y)$ for some data $y$, then often,
• we can evaluate $\gamma(x) = p(x, y)$,
• but $Z = p(y) = \int_E p(x, y)\,dx$ is typically intractable.
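Importance Sampling
Assume that $\rho$ is another distribution s.t. $\pi \ll \rho$.
1. Sample $X^1, \dots, X^N \overset{\text{iid}}{\sim} \rho$.
2. Approximate $\pi$ by the weighted empirical measure:
$$\hat\pi^{is}(dx) := \sum_{i=1}^N W^i \delta_{X^i}(dx).$$
[Figure: CDFs (probability against $x$) of $\pi$ and $\rho$, followed by the empirical CDF of the unweighted sample and the weighted approximation $\hat\pi^{is}$, each compared with the CDF of $\pi$.]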
Constructing the Importance Weights
[Figure: densities (PDFs) of $\pi$ and $\rho$ against $x$.]
Constructing the Importance Weights, continued
Want to set $W^i \propto \frac{d\pi}{d\rho}(X^i) = \frac{\pi(X^i)}{\rho(X^i)}$.
Problem: can only evaluate $G(X^i) := \frac{d\gamma}{d\rho}(X^i) = \frac{\gamma(X^i)}{\rho(X^i)}$.
Solution: approximate $\gamma$ and $Z$ separately, i.e.
1. approximate $\gamma(dx)$ by
$$\hat\gamma^{is,u}(dx) := \frac{1}{N} \sum_{i=1}^N G(X^i)\,\delta_{X^i}(dx)$$
2. approximate $Z = \int_E G(x)\,\rho(dx)$ by the 'vanilla' Monte Carlo estimate of $\rho(G)$,
$$\hat Z := \hat\rho^{mc}(G) = \frac{1}{N} \sum_{i=1}^N G(X^i).$$
Constructing the Importance Weights, continued
3. approximate $\pi(dx)$ by
$$\hat\pi^{is}(dx) := \hat\gamma^{is,u}(dx) / \hat Z = \sum_{i=1}^N W^i \delta_{X^i}(dx),$$
where
$$W^i := \frac{G(X^i)}{\sum_{j=1}^N G(X^j)}.$$
Importance sampling yields unbiased (and consistent) estimates of normalising constants!
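
The three steps above can be sketched in a few lines of Python. This is only an illustration under assumed choices: the unnormalised target is $\exp(-x^2/2)$ (so the true normalising constant is $\sqrt{2\pi}$), the proposal is a zero-mean normal with standard deviation 2, and the function and variable names are hypothetical.

```python
import math
import random


def self_normalised_is(gamma, proposal_pdf, proposal_sample, f, n, rng):
    """Return (estimate of pi(f), estimate of Z) from n proposal draws."""
    xs = [proposal_sample(rng) for _ in range(n)]
    # Unnormalised weights G(x) = gamma(x) / proposal density at x.
    g = [gamma(x) / proposal_pdf(x) for x in xs]
    total = sum(g)
    z_hat = total / n                    # plain average of G estimates Z
    weights = [gi / total for gi in g]   # normalised weights, summing to one
    pi_hat_f = sum(w * f(x) for w, x in zip(weights, xs))
    return pi_hat_f, z_hat


def normal_pdf(x, sd):
    """Density of a zero-mean normal with standard deviation sd."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


rng = random.Random(1)
second_moment, z_hat = self_normalised_is(
    gamma=lambda x: math.exp(-0.5 * x * x),  # assumed unnormalised N(0, 1)
    proposal_pdf=lambda x: normal_pdf(x, 2.0),
    proposal_sample=lambda r: r.gauss(0.0, 2.0),
    f=lambda x: x * x,
    n=100_000,
    rng=rng,
)
print(second_moment, z_hat, math.sqrt(2.0 * math.pi))
```

Note that only the estimate of the normalising constant is unbiased; the self-normalised estimate of the expectation is a ratio of two averages and is consistent but, in general, biased for finite $N$.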
Markov Chain Monte Carlo Methods
Let $K$ be a $\pi$-invariant ergodic Markov kernel.
1. simulate a Markov chain with transitions $K$, i.e. sample
$$X^1 \sim \mu(\,\cdot\,), \quad X^2 \sim K(\,\cdot \mid X^1), \quad X^3 \sim K(\,\cdot \mid X^2), \quad \dots$$
2. approximate $\pi(dx)$ by
$$\hat\pi^{mcmc}(dx) := \frac{1}{N} \sum_{i=R+1}^{R+N} \delta_{X^i}(dx)$$
after a suitable burn-in time $R$.
Constructing the Markov Kernel K
Example (Gibbs sampler)
Let $E$ be $d$-dimensional. The standard Gibbs sampler cycles through all full conditional distributions (under $\pi$), i.e.
$$K(dx^i \mid x^{i-1}) := \prod_{j=1}^d \pi(dx^i_j \mid x^i_{1:j-1}, x^{i-1}_{j+1:d})$$
Partially-collapsed Gibbs sampler (Van Dyk & Park, 2008):
• often no need to sample from full conditionals.
• instead, sample from conditionals under a marginal of $\pi$.
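
To make the standard Gibbs sweep concrete, here is a small sketch for an assumed two-dimensional target: a bivariate normal with unit variances and correlation `r`, whose full conditionals are normal with mean `r` times the other coordinate and variance `1 - r**2`. The target and all names are illustrative assumptions.

```python
import math
import random


def gibbs_bivariate_normal(r, n, burn_in, rng):
    """Gibbs sampler for a standard bivariate normal with correlation r."""
    x1, x2 = 0.0, 0.0
    sd = math.sqrt(1.0 - r * r)  # standard deviation of each full conditional
    chain = []
    for i in range(burn_in + n):
        # Update coordinate 1 given the previous value of coordinate 2,
        # then coordinate 2 given the freshly updated coordinate 1.
        x1 = rng.gauss(r * x2, sd)
        x2 = rng.gauss(r * x1, sd)
        if i >= burn_in:
            chain.append((x1, x2))
    return chain


chain = gibbs_bivariate_normal(r=0.5, n=50_000, burn_in=1_000,
                               rng=random.Random(2))
# For unit variances, the mean of the product estimates the correlation.
corr_estimate = sum(a * b for a, b in chain) / len(chain)
print(corr_estimate)
```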
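Constructing the Markov Kernel K
Example (Metropolis–Hastings algorithm)
1. sample $X^\star \sim Q(\,\cdot \mid X^{i-1})$ (where $Q$ need not be $\pi$-invariant)
2. accept $X^i := X^\star$ with probability
$$\alpha(X^\star \mid X^{i-1}) := 1 \wedge \frac{\gamma(X^\star)\,Q(X^{i-1} \mid X^\star)}{\gamma(X^{i-1})\,Q(X^\star \mid X^{i-1})},$$
otherwise, set $X^i := X^{i-1}$.
Thus, $K$ has the form
$$K(dx^i \mid x^{i-1}) := \alpha(x^i \mid x^{i-1})\,Q(dx^i \mid x^{i-1}) + r(x^{i-1})\,\delta_{x^{i-1}}(dx^i),$$
where $r(x) := 1 - \int_E \alpha(z \mid x)\,Q(dz \mid x)$.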
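
The accept/reject step can be sketched as follows. This is a hedged illustration, not the talk's implementation: it assumes a Gaussian random-walk proposal, which is symmetric so the proposal densities cancel in the acceptance ratio, and an unnormalised standard normal target; the names `random_walk_mh`, `log_gamma` and `step` are hypothetical.

```python
import math
import random


def random_walk_mh(log_gamma, x0, step, n, burn_in, rng):
    """Random-walk Metropolis-Hastings targeting exp(log_gamma) / Z."""
    x = x0
    chain = []
    for i in range(burn_in + n):
        proposal = x + rng.gauss(0.0, step)
        # Symmetric proposal: the acceptance ratio reduces to a ratio of
        # unnormalised target densities, so Z is never needed.
        log_alpha = min(0.0, log_gamma(proposal) - log_gamma(x))
        # 1 - random() lies in (0, 1], which keeps the logarithm finite.
        if math.log(1.0 - rng.random()) < log_alpha:
            x = proposal  # accept; otherwise the chain stays where it is
        if i >= burn_in:
            chain.append(x)
    return chain


samples = random_walk_mh(
    log_gamma=lambda x: -0.5 * x * x,  # assumed unnormalised N(0, 1) target
    x0=0.0, step=2.4, n=50_000, burn_in=1_000, rng=random.Random(3),
)
mh_mean = sum(samples) / len(samples)
mh_second_moment = sum(s * s for s in samples) / len(samples)
print(mh_mean, mh_second_moment)
```

Working with the logarithm of the unnormalised density avoids numerical underflow when the target is a product of many terms.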
State-Space Extension Tricks
• Want to target $\tilde\pi(\theta) = \tilde\gamma(\theta)/Z$, for $Z > 0$.
• What if we cannot evaluate $\tilde\gamma(\theta)$? (needed for IS/MCMC).
• Idea:
1. instead, target $\pi(\theta, x) = \gamma(\theta, x)/Z$, s.t.
i. $\pi(\theta, x)$ admits $\tilde\pi(\theta)$ as a marginal,
ii. $\gamma(\theta, x)$ can be evaluated.
2. construct IS/MCMC approximation $\hat\pi$ of $\pi = \gamma/Z$.
3. approximate $\tilde\pi$ by the $\theta$-marginal of $\hat\pi$.
• Many names for this: state-space extension, auxiliary-variable construction, data augmentation, ...
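State-Space Extension Tricks, continued
Example (Hidden Markov model)
$X_0 \sim \mu^\theta$, and for $n \in \mathbb{N}$,
$$X_n \sim f^\theta(\,\cdot \mid X_{n-1}), \qquad Y_n \sim g^\theta(\,\cdot \mid X_n),$$
where $\theta$ are some 'static' parameters.
Assume we are interested in $\tilde\pi(\theta) := p(\theta \mid y_{1:n})$.
• $\tilde\gamma(\theta) = p(\theta, y_{1:n})$ and $Z = p(y_{1:n})$ are intractable.
• but we can evaluate
$$\gamma(\theta, x_{0:n}, y_{1:n}) := p(\theta)\,\mu^\theta(x_0) \prod_{p=1}^n f^\theta(x_p \mid x_{p-1})\,g^\theta(y_p \mid x_p).$$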
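
The point of the extension is that the joint density is a plain product of terms that can each be evaluated. A sketch for an assumed linear-Gaussian hidden Markov model (standard normal prior on the parameter, standard normal initial state, unit-variance transition with mean equal to the parameter times the previous state, unit-variance observation noise) could look like this; the model and the names are illustrative assumptions only.

```python
import math


def log_normal_pdf(x, mean, sd):
    """Log-density of a normal distribution with the given mean and sd."""
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - 0.5 * ((x - mean) / sd) ** 2


def log_joint(theta, xs, ys):
    """Log of the extended unnormalised target for the assumed model.

    xs holds the states x_0, ..., x_n and ys the observations y_1, ..., y_n.
    """
    if len(xs) != len(ys) + 1:
        raise ValueError("need one more state than observations")
    total = log_normal_pdf(theta, 0.0, 1.0)    # log prior of the parameter
    total += log_normal_pdf(xs[0], 0.0, 1.0)   # log initial density of x_0
    for p in range(1, len(xs)):
        total += log_normal_pdf(xs[p], theta * xs[p - 1], 1.0)  # transition
        total += log_normal_pdf(ys[p - 1], xs[p], 1.0)          # observation
    return total


value = log_joint(theta=0.0, xs=[0.0, 0.0], ys=[0.0])
print(value)
```

Evaluating this costs a number of operations linear in $n$, whereas the marginal over the states would require an $(n+1)$-dimensional integral.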
Target distributions
Setup
Want to target a sequence of related distributions $(\pi^\theta_n(x_{1:n}))_{n \in \mathbb{N}}$ which
• are defined on spaces $(E_n)_{n \in \mathbb{N}}$ of increasing dimension [will be relaxed later],
• have unknown normalising constants $(Z^\theta_n)_{n \in \mathbb{N}}$.
$\theta$ known [will be relaxed later].
Problem
IS/MCMC require a new algorithm for each $n \in \mathbb{N}$.
Sequential Monte Carlo Methods
Sequential Monte Carlo (SMC): propagate weighted samples ('particles') $(X^i_{1:n}, W^{\theta,i}_n)_{i \in \{1,\dots,N\}}$ to construct
$$\hat\pi^\theta_n(dx_{1:n}) := \sum_{i=1}^N W^{\theta,i}_n\,\delta_{X^i_{1:n}}(dx_{1:n}).$$
SMC Algorithm
At time $n$, given $(X^i_{1:n-1}, W^{\theta,i}_{n-1})_{i \in \{1,\dots,N\}}$,
1. sample $X^i_n \sim K^\theta_n(\,\cdot \mid X^i_{1:n-1})$
– this approximates $\pi^\theta_{n-1}(dx_{1:n-1})\,K^\theta_n(dx_n \mid x_{1:n-1})$,
2. re-weight particle paths $(X^i_{1:n})_{i \in \{1,\dots,N\}}$
– to approximate $\pi^\theta_n(dx_{1:n})$,
3. resample: reset weights, after
– "multiplying" particles with large weights,
– "killing off" paths with small weights.
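
The sample, re-weight and resample steps can be sketched with the simplest special case, a bootstrap particle filter. Everything specific here is an assumption made for illustration: a linear-Gaussian hidden Markov model with known parameter `theta`, proposals drawn from the model's own transition density, incremental weights equal to the observation density, and multinomial resampling at every step (chosen for brevity only).

```python
import math
import random


def bootstrap_particle_filter(ys, theta, n_particles, rng):
    """Return filtering means and a log normalising-constant estimate.

    Assumed model: x_0 ~ N(0, 1), x_n ~ N(theta * x_{n-1}, 1), y_n ~ N(x_n, 1).
    """
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    log_z = 0.0
    means = []
    for y in ys:
        # 1. sample: propagate every particle through the transition density.
        particles = [rng.gauss(theta * x, 1.0) for x in particles]
        # 2. re-weight: incremental weight is the observation density.
        log_g = [-0.5 * math.log(2.0 * math.pi) - 0.5 * (y - x) ** 2
                 for x in particles]
        top = max(log_g)  # subtract the maximum for numerical stability
        unnorm = [math.exp(v - top) for v in log_g]
        total = sum(unnorm)
        log_z += top + math.log(total / n_particles)
        weights = [u / total for u in unnorm]
        means.append(sum(w * x for w, x in zip(weights, particles)))
        # 3. resample: duplicate heavy particles, drop light ones; the new
        #    particles implicitly carry equal weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return means, log_z


means, log_z = bootstrap_particle_filter(
    ys=[1.0], theta=0.5, n_particles=20_000, rng=random.Random(4))
print(means, log_z)
```

For a single observation this model is conjugate, so the output can be checked against the closed-form posterior mean and marginal likelihood.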
Sequential Monte Carlo Methods, continued
Particle filters: SMC methods applied to the filtering problem.
Almost all SMC methods, e.g.
• auxiliary particle filters (Pitt & Shephard, 1999),
• block sampling (Doucet, Briers & Sénécal, 2006),
• resample–move algorithms (Gilks & Berzuini, 2001),
• SMC samplers (Del Moral, Doucet & Jasra, 2006),
• discrete state-space particle filters (Fearnhead, 1998),
• ...
are special cases of this algorithm!
SMC: Sketch
[Figure: particle locations against time as the algorithm successively targets $\tilde\pi_1(x_1)$, $\tilde\pi_2(x_1, x_2)$, $\tilde\pi_3(x_{1:2}, x_3)$, $\tilde\pi_4(x_{1:3}, x_4)$ and $\tilde\pi_5(x_{1:4}, x_5)$.]
Sample Degeneracy
• Also known as sample impoverishment, or path degeneracy.
• Eventually, all particles share a common ancestor.
• Thus, SMC methods rely on an ergodicity ('forgetting') property of $(\pi^\theta_n)_{n \in \mathbb{N}}$.
Sample Degeneracy, continued
Hidden Markov model, continued
Let $\pi^\theta_n(x_{1:n}) := p(x_{1:n} \mid \theta, y_{1:n})$, then we are often interested in
• filtering distributions: $\pi^\theta_n(x_n) = p(x_n \mid \theta, y_{1:n})$,
→ approximated by a diverse set of particles.
• smoothing distributions: $\pi^\theta_n(x_k) = p(x_k \mid \theta, y_{1:n})$,
→ often approximated by a single particle.
[Figure: particle locations against time, in two sequences. First, the filtering distributions $p(x_1 \mid y_1)$, $p(x_2 \mid y_{1:2})$, $p(x_3 \mid y_{1:3})$, $p(x_4 \mid y_{1:4})$, $p(x_5 \mid y_{1:5})$. Second, the smoothing distributions of the first state, $p(x_1 \mid y_1)$, $p(x_1 \mid y_{1:2})$, $p(x_1 \mid y_{1:3})$, $p(x_1 \mid y_{1:4})$, $p(x_1 \mid y_{1:5})$.]
Alleviating the Degeneracy
• Use residual/stratified/systematic resampling
  → avoid multinomial resampling!
• Only resample when necessary.
• Devise better proposal kernels $K^\theta_n$.
  – e.g. avoid $K^\theta_n(\,\cdot \mid x_{n-1}) := f^\theta(\,\cdot \mid x_{n-1})$ in HMMs ('bootstrap' filter),
  – better: make use of $y_n$ when sampling $X^i_n$.
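The first two recommendations can be sketched as follows. This is a minimal illustration, not the speaker's implementation: the function names and the threshold of half the particle number are assumptions, and the effective sample size is used as one common criterion for deciding when resampling is "necessary".

```python
import random

def effective_sample_size(weights):
    """ESS = 1 / sum_i W_i^2, where W_i are the normalised weights."""
    total = sum(weights)
    return 1.0 / sum((w / total) ** 2 for w in weights)

def systematic_resample(weights, rng=random):
    """Return N ancestor indices from a single uniform draw and a regular grid."""
    n = len(weights)
    total = sum(weights)
    u = rng.random() / n                 # one uniform draw in [0, 1/N)
    indices, i = [], 0
    cumulative = weights[0] / total
    for k in range(n):
        point = u + k / n                # k-th grid point
        while point >= cumulative and i < n - 1:
            i += 1
            cumulative += weights[i] / total
        indices.append(i)
    return indices

def maybe_resample(weights, threshold=0.5, rng=random):
    """Resample only if ESS < threshold * N; otherwise keep every particle."""
    n = len(weights)
    if effective_sample_size(weights) < threshold * n:
        return systematic_resample(weights, rng), True
    return list(range(n)), False
```

With equal weights the ESS equals the number of particles and no resampling takes place; with one dominant weight the ESS is close to one and all ancestor indices point at the dominant particle.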
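Particle Lineages
$$\psi^\theta_n(\underbrace{\mathbf{x}_{1:n}, \mathbf{a}_{1:n-1}}_{\text{all random variables generated by the SMC algorithm}}) := \Biggl[\prod_{i=1}^{N} K^\theta_1(x^i_1)\Biggr] \times \Biggl[\prod_{p=2}^{n} r^\theta(\mathbf{a}_{p-1} \mid \mathbf{x}_{1:p-1}, \mathbf{a}_{1:p-2}) \prod_{i=1}^{N} K^\theta_p\bigl(x^i_p \mid x^{a^i_{p-1}}_{1:p-1}\bigr)\Biggr]$$
[Figure: particle number against time, built up frame by frame from the particles $x^i_n$ and the ancestor indices $a^i_n$, $i, n \le 5$.]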
Interpretation as Importance Sampling
• SMC methods are just (standard!) importance sampling on a suitably extended space.
• Hence, SMC methods yield an unbiased estimate of the normalising constant $Z^\theta_n$, which is given by
$$\hat{Z}^\theta_n(\mathbf{X}_{1:n}, \mathbf{A}_{1:n-1}) = \prod_{p=1}^{n} \Biggl[\frac{1}{N} \sum_{i=1}^{N} \underbrace{G^\theta_p(X^i_{1:p})}_{\text{$i$th unnormalised incremental weight at time $p$}}\Biggr].$$
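In practice the product of averaged incremental weights is accumulated on the log scale to avoid underflow. The sketch below assumes that resampling is performed at every step, as in the displayed formula, and that the log incremental weights have already been computed; the function names are hypothetical.

```python
import math

def log_mean_exp(log_values):
    """Numerically stable log of the average of exp(log_values)."""
    m = max(log_values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in log_values) / len(log_values))

def log_normalising_constant(log_incremental_weights):
    """log Z-hat = sum_p log( (1/N) sum_i G_p(X^i_{1:p}) ).

    log_incremental_weights[p][i] holds log G for particle i at step p.
    """
    return sum(log_mean_exp(row) for row in log_incremental_weights)
```

Note that unbiasedness holds for the estimate itself, not for its logarithm, so averages over repeated runs should be taken on the natural scale.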
Piecewise Deterministic Processes
Static Monte Carlo Methods
– Motivation
– Vanilla Monte Carlo
– Importance Sampling
– Markov Chain Monte Carlo Methods
– State-Space-Extension Tricks
Sequential Monte Carlo Methods
– Motivation
– Generic SMC Algorithm
– Sample Degeneracy
– SMC Samplers
Particle MCMC Methods
– Motivation and Setup
– Extended Target Distribution
– Particle Marginal Metropolis–Hastings Algorithm
– Particle Gibbs Sampler
– SMC² Algorithm
Motivation
What if target distributions $(\tilde{\pi}_n)_{n \in \mathbb{N}}$ are defined on spaces $(\tilde{E}_n)_{n \in \mathbb{N}}$ of non-increasing dimension?
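Motivation, continued
Example (Annealing):
• Want to target complicated distribution $\pi$ on a space $E$.
• Idea: use SMC methods to target bridging distributions
$$\tilde{\pi}_n(x) \propto \tilde{\gamma}_n(x) := [\pi(x)]^{\beta_n} [\pi_1(x)]^{1-\beta_n},$$
for $n = 1, \dots, P$, where
  – we can easily sample from $\pi_1$,
  – $0 = \beta_1 < \beta_2 < \dots < \beta_{P-1} < \beta_P = 1$.
[Figure: density against $x$, showing the target and, on successive frames, the bridging densities $\tilde{\pi}_1, \dots, \tilde{\pi}_7$.]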
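The bridging densities are easy to evaluate on the log scale. The following sketch uses a hypothetical bimodal target, a standard normal initial distribution and a linear temperature schedule; all three choices are assumptions made for illustration only.

```python
import math

def log_pi1(x):
    """Easy-to-sample initial density: standard normal (unnormalised)."""
    return -0.5 * x * x

def log_pi(x):
    """Hypothetical 'complicated' target: two well-separated normal modes."""
    a = -0.5 * ((x - 3.0) / 0.5) ** 2
    b = -0.5 * ((x + 3.0) / 0.5) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def log_gamma(x, beta):
    """Log of the unnormalised bridging density [pi(x)]^beta [pi_1(x)]^(1-beta)."""
    return beta * log_pi(x) + (1.0 - beta) * log_pi1(x)

def temperatures(n_levels):
    """Linear schedule 0 = beta_1 < ... < beta_P = 1."""
    return [k / (n_levels - 1) for k in range(n_levels)]
```

At `beta = 0` the bridging density coincides with the initial distribution and at `beta = 1` with the target, so the sequence interpolates between the two.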
SMC Samplers
• Problem: evaluating the weights is difficult or impossible.
• Solution: target a sequence of extended distributions $(\pi_n)_{n \in \mathbb{N}}$ s.t.
  1. $\pi_n$ admits $\tilde{\pi}_n$ as a marginal,
  2. $(\pi_n)_{n \in \mathbb{N}}$ are defined on spaces of increasing dimension.
SMC Samplers (Del Moral, Doucet & Jasra, 2006)
Use 'backward' Markov kernels $L_{n-1}(x_{n-1} \mid x_n)$, so that
$$\pi_n(x_{1:n}) := \tilde{\pi}_n(x_n) \prod_{p=2}^{n} L_{p-1}(x_{p-1} \mid x_p).$$
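The marginal property in item 1 can be checked directly. The derivation below uses only the fact that each backward kernel $L_{p-1}(\,\cdot \mid x_p)$ is a normalised density in its first argument, and integrates out $x_1, x_2, \dots, x_{n-1}$ in that order:

```latex
\begin{align*}
\int \pi_n(x_{1:n}) \,\mathrm{d}x_{1:n-1}
  &= \tilde{\pi}_n(x_n) \int \prod_{p=3}^{n} L_{p-1}(x_{p-1} \mid x_p)
     \underbrace{\int L_1(x_1 \mid x_2) \,\mathrm{d}x_1}_{=\,1} \,\mathrm{d}x_{2:n-1} \\
  &= \tilde{\pi}_n(x_n) \int \prod_{p=4}^{n} L_{p-1}(x_{p-1} \mid x_p)
     \underbrace{\int L_2(x_2 \mid x_3) \,\mathrm{d}x_2}_{=\,1} \,\mathrm{d}x_{3:n-1} \\
  &= \dots = \tilde{\pi}_n(x_n).
\end{align*}
```

The choice of backward kernels therefore does not affect the marginal being targeted; it only affects the variance of the importance weights.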
Aim
• Now: want to approximate
$$\pi_P(\theta, x_{1:P}) = \frac{\gamma_P(\theta, x_{1:P})}{Z}$$
or its marginal $\pi_P(\theta)$.
Example
If $\pi_P(\theta, x_{1:P}) = p(\theta, x_{1:P} \mid y_{1:P})$ for observations $y_{1:P}$, then
• $\gamma_P(\theta, x_{1:P}) = p(\theta, x_{1:P}, y_{1:P})$,
• $Z = p(y_{1:P})$.
Ideal MCMC Algorithms
Ideal Metropolis–Hastings Algorithm
Given $(\theta, X_{1:P})$,
1. propose new values $(\theta^\star, X^\star_{1:P})$,
2. accept with some probability.
BUT: difficult to design good proposals for $X^\star_{1:P}$.
• Idea: use SMC approximation $\hat{\pi}^{\theta^\star}$ as proposal for $X^\star_{1:P}$.
• Problem: proposal density is intractable.
  ⟹ cannot evaluate acceptance probability.
Ideal MCMC Algorithms, continued
Ideal Gibbs sampler:
Given $(\theta, X_{1:P})$, sample from
1. $\pi_P(\theta \mid x_{1:P})$,
2. $\pi_P(x_{1:P} \mid \theta) = \pi^\theta_P(x_{1:P})$.
BUT: cannot sample from $\pi^\theta_P$.
• Idea: sample from SMC approximation $\hat{\pi}^\theta_P$ of $\pi^\theta_P$.
• Problem: $\hat{\pi}^\theta_P \neq \pi^\theta_P$.
Extended Target Distribution
Particle MCMC Methods
• Andrieu, Doucet & Holenstein (2010).
• exact MCMC methods, i.e.
  – Metropolis–Hastings algorithm,
  – Gibbs sampler
  targeting an extended distribution.
• this distribution includes all random variables generated by an SMC algorithm, i.e. $(\mathbf{X}_{1:P}, \mathbf{A}_{1:P-1})$ (and more).
• basis: Pseudo-Marginal approach,
  – Andrieu & Roberts (2009),
  – permits IS within MCMC,
  – and SMC can be interpreted as standard IS (on a suitably extended space).
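Extended Target Distribution, continued
Parametrisation I:
$$\bar{\pi}_P(\theta, \mathbf{x}_{1:P}, \mathbf{a}_{1:P-1}, b^*_P) \propto p(\theta)\, \underbrace{w^{\theta, b^*_P}_P}_{\text{weight of $b^*_P$th path}}\, \underbrace{\hat{Z}^\theta_P(\mathbf{x}_{1:P}, \mathbf{a}_{1:P-1})}_{\text{SMC estimate of normalising constant}}\, \underbrace{\psi^\theta_P(\mathbf{x}_{1:P}, \mathbf{a}_{1:P-1})}_{\text{SMC algorithm}}.$$
[Figure: particle number against time, showing the ancestor indices $a^i_n$ and the final index $b^*_5$.]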
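Reparametrisation
Using $b^*_n = a^{b^*_{n+1}}_n$ for $n \in \{1, \dots, P-1\}$,
$$\underbrace{(\mathbf{x}_{1:P}, \mathbf{a}_{1:P-1}, b^*_P)}_{\text{Parametrisation I}} \;\to\; \underbrace{(\mathbf{x}^{-*}_{1:P}, \mathbf{a}^{-*}_{1:P-1}, x^*_{1:P}, b^*_{1:P})}_{\text{Parametrisation II}}.$$
[Figure: particle number against time, showing the ancestor indices $a^i_n$ and the indices $b^*_1, \dots, b^*_5$ of the selected lineage.]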
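The recursion $b^*_n = a^{b^*_{n+1}}_n$ amounts to following parent pointers backwards from the final index. A minimal sketch, assuming 0-based indices and particles and ancestor indices stored as nested lists (both storage conventions and the function names are assumptions):

```python
def trace_lineage(ancestors, final_index):
    """Recover b_1, ..., b_P from b_P via b_n = a_n^{b_{n+1}}.

    ancestors[n][i] is the index, among the particles at step n, of the
    parent of particle i at step n + 1 (0-based steps), so it has P - 1 rows.
    """
    lineage = [final_index]
    for row in reversed(ancestors):
        lineage.append(row[lineage[-1]])
    lineage.reverse()
    return lineage

def extract_path(particles, ancestors, final_index):
    """Return the selected path (x_1^{b_1}, ..., x_P^{b_P})."""
    lineage = trace_lineage(ancestors, final_index)
    return [particles[n][b] for n, b in enumerate(lineage)]
```

The pair of selected path and lineage corresponds to $(x^*_{1:P}, b^*_{1:P})$ in Parametrisation II; everything else in the particle system makes up $(\mathbf{x}^{-*}_{1:P}, \mathbf{a}^{-*}_{1:P-1})$.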
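Extended Target Distribution
Parametrisation II:
$$\bar{\pi}_P(\theta, \mathbf{x}^{-*}_{1:P}, \mathbf{a}^{-*}_{1:P-1}, x^*_{1:P}, b^*_{1:P}) = \underbrace{\pi_P(\theta, x^*_{1:P})}_{\text{'actual' target}}\, \underbrace{p(b^*_{1:P})}_{\text{marginal of $b^*_{1:P}$}}\, \underbrace{\psi^\theta_P(\mathbf{x}^{-*}_{1:P}, \mathbf{a}^{-*}_{1:P-1} \,\|\, x^*_{1:P}, b^*_{1:P})}_{\text{'conditional' SMC algorithm}}.$$
[Figure: particle number against time, showing the ancestor indices $a^i_n$ and the indices $b^*_1, \dots, b^*_5$ of the selected lineage.]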
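PMMH Algorithm
A Metropolis–Hastings Algorithm Targeting $\bar{\pi}_P$
• Notation: $b := b^*_P$ and $\xi := (\theta, \mathbf{x}_{1:P}, \mathbf{a}_{1:P-1}, b)$.
• Proposal kernel:
$$Q(\xi^\star \mid \xi) := T(\theta^\star \mid \theta)\, \psi^{\theta^\star}_P(\mathbf{x}^\star_{1:P}, \mathbf{a}^\star_{1:P-1})\, w^{\theta^\star, b^\star}_P.$$
• Acceptance probability (using Parametrisation I):
$$\alpha(\xi^\star \mid \xi) := 1 \wedge \frac{\bar{\pi}_P(\xi^\star)\, Q(\xi \mid \xi^\star)}{\bar{\pi}_P(\xi)\, Q(\xi^\star \mid \xi)} = 1 \wedge \frac{p(\theta^\star)}{p(\theta)}\, \frac{\hat{Z}^{\theta^\star}_P(\mathbf{x}^\star_{1:P}, \mathbf{a}^\star_{1:P-1})}{\hat{Z}^\theta_P(\mathbf{x}_{1:P}, \mathbf{a}_{1:P-1})}\, \frac{T(\theta \mid \theta^\star)}{T(\theta^\star \mid \theta)}.$$
• Special case of the GIMH algorithm (Andrieu & Roberts, 2009).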
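The acceptance ratio only requires the SMC estimate of the normalising constant at the current and proposed parameter values. The sketch below illustrates this on a hypothetical toy model (a linear Gaussian state-space model with autoregressive coefficient `theta`, a uniform prior on (-1, 1) and a symmetric random-walk proposal, so that the ratio of `T` terms cancels); none of these choices come from the slides. Multinomial resampling is used only for brevity.

```python
import math
import random

def log_normal_pdf(y, mean, sd):
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - 0.5 * ((y - mean) / sd) ** 2

def bootstrap_log_z(theta, ys, n_particles, rng):
    """Bootstrap particle filter for the toy model
    x_1 ~ N(0,1), x_n = theta*x_{n-1} + N(0,1), y_n = x_n + N(0,1).
    Returns the log of the SMC estimate of p(y_{1:P} | theta)."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    weights, log_z = None, 0.0
    for p, y in enumerate(ys):
        if p > 0:
            # multinomial resampling, then propagation through the transition
            parents = rng.choices(xs, weights=weights, k=n_particles)
            xs = [theta * x + rng.gauss(0.0, 1.0) for x in parents]
        log_w = [log_normal_pdf(y, x, 1.0) for x in xs]
        m = max(log_w)
        weights = [math.exp(lw - m) for lw in log_w]
        log_z += m + math.log(sum(weights) / n_particles)
    return log_z

def log_prior(theta):
    """Uniform prior on (-1, 1) (an assumption of this sketch)."""
    return -math.log(2.0) if -1.0 < theta < 1.0 else -math.inf

def pmmh(ys, n_iters, n_particles, step, rng, theta0=0.0):
    """Return the chain of theta values and the empirical acceptance rate."""
    theta = theta0
    log_z = bootstrap_log_z(theta, ys, n_particles, rng)
    chain, n_accepted = [theta], 0
    for _ in range(n_iters):
        theta_new = theta + step * rng.gauss(0.0, 1.0)   # symmetric proposal T
        if log_prior(theta_new) > -math.inf:
            log_z_new = bootstrap_log_z(theta_new, ys, n_particles, rng)
            log_ratio = (log_prior(theta_new) + log_z_new) - (log_prior(theta) + log_z)
            if rng.random() < math.exp(min(0.0, log_ratio)):
                # keep the accepted estimate; it is never recomputed
                theta, log_z = theta_new, log_z_new
                n_accepted += 1
        chain.append(theta)
    return chain, n_accepted / n_iters
```

The estimate attached to the current state is stored and reused rather than re-estimated at each iteration; this is what makes the chain target the extended distribution exactly.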
PMMH Algorithm, continued
• efficiency crucially depends on the SMC estimate of $Z^\theta_P$.
• usually, the relative variance of $\hat{Z}^\theta_P(\mathbf{X}_{1:P}, \mathbf{A}_{1:P-1})$ grows linearly in $P$.
• need to increase $N$ at least linearly with $P$.
  → otherwise: low acceptance rate.
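Alternative "justification" of PMMH/GIMH
• only want to approximate
$$\int \pi_P(\theta, x_{1:P}) \,\mathrm{d}x_{1:P} = \pi_P(\theta) \propto \gamma_P(\theta).$$
• $p(\theta) Z^\theta_P = \gamma_P(\theta)$, so that
$$\mathbb{E}\bigl[p(\theta)\, \hat{Z}^\theta_P(\mathbf{X}_{1:P}, \mathbf{A}_{1:P-1})\bigr] = \gamma_P(\theta).$$
• View as MH algorithm with approximation
$$\frac{\gamma_P(\theta^\star)\, T(\theta \mid \theta^\star)}{\gamma_P(\theta)\, T(\theta^\star \mid \theta)} \approx \frac{p(\theta^\star)}{p(\theta)}\, \frac{\hat{Z}^{\theta^\star}_P(\mathbf{x}^\star_{1:P}, \mathbf{a}^\star_{1:P-1})}{\hat{Z}^\theta_P(\mathbf{x}_{1:P}, \mathbf{a}_{1:P-1})}\, \frac{T(\theta \mid \theta^\star)}{T(\theta^\star \mid \theta)}.$$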
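Particle Gibbs Sampler
A Gibbs sampler targeting $\bar{\pi}_P$
Recall that
$$\bar{\pi}_P(\theta, \mathbf{x}^{-*}_{1:P}, \mathbf{a}^{-*}_{1:P-1}, x^*_{1:P}, b^*_{1:P}) = \underbrace{\pi_P(\theta, x^*_{1:P})}_{\text{'actual' target}}\, \underbrace{p(b^*_{1:P})}_{\text{distribution of indices of $x^*_{1:P}$}}\, \underbrace{\psi^\theta_P(\mathbf{x}^{-*}_{1:P}, \mathbf{a}^{-*}_{1:P-1} \,\|\, x^*_{1:P}, b^*_{1:P})}_{\text{'conditional' SMC algorithm}}.$$
[Figure: particle number against time, with the indices $b^*_1, \dots, b^*_5$ of the reference path marked.]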
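Particle Gibbs Sweep
Sample from:
1. $\pi_P(\theta \mid x^*_{1:P})$ [e.g. via a Metropolis–Hastings step],
2. $\psi^\theta_P(\mathbf{x}^{-*}_{1:P}, \mathbf{a}^{-*}_{1:P-1} \,\|\, x^*_{1:P}, b^*_{1:P})$ [via 'conditional' SMC],
3. $\bar{\pi}_P(b^*_P \mid \theta, \mathbf{x}_{1:P}, \mathbf{a}_{1:P-1})$ [via Parametrisation I].
[Figure: particle number against time, animated over the steps of the sweep.]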
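Steps 2 and 3 of the sweep can be sketched for a fixed parameter value as below. This is an illustration on a hypothetical toy model (linear Gaussian state-space model, bootstrap proposals, multinomial resampling), with the reference path stored in the last particle slot; none of these conventions come from the slides, and step 1 (updating the parameter) is omitted.

```python
import math
import random

def conditional_smc(theta, ys, ref_path, n_particles, rng):
    """One conditional bootstrap filter sweep for the toy model
    x_1 ~ N(0,1), x_n = theta*x_{n-1} + N(0,1), y_n = x_n + N(0,1).
    The reference path occupies the last particle slot at every time.
    Returns a new path drawn from the resulting particle system."""
    last = n_particles - 1
    particles, ancestors, weights = [], [], None
    xs = [rng.gauss(0.0, 1.0) for _ in range(last)] + [ref_path[0]]
    for p, y in enumerate(ys):
        if p > 0:
            # free particles pick parents by multinomial resampling;
            # the parent of the reference particle is the reference particle
            parents = rng.choices(range(n_particles), weights=weights, k=last) + [last]
            xs = [theta * particles[-1][a] + rng.gauss(0.0, 1.0) for a in parents[:last]]
            xs.append(ref_path[p])
            ancestors.append(parents)
        particles.append(xs)
        log_w = [-0.5 * (y - x) ** 2 for x in xs]   # log g(y | x) up to a constant
        m = max(log_w)
        weights = [math.exp(lw - m) for lw in log_w]
    # step 3: draw the final index with probability proportional to the weights
    b = rng.choices(range(n_particles), weights=weights, k=1)[0]
    path = [particles[-1][b]]
    for n in range(len(ys) - 2, -1, -1):
        b = ancestors[n][b]                          # b_n = a_n^{b_{n+1}}
        path.append(particles[n][b])
    path.reverse()
    return path
```

With a single particle the function necessarily returns the reference path unchanged, which is a convenient sanity check; with more particles the early part of the returned path often still coincides with the reference path because of path degeneracy.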
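Particle Gibbs with Backward Sampling
Whiteley (2010)
Sample from:
1. $\pi_P(\theta \mid x^*_{1:P})$ [e.g. via a Metropolis–Hastings step],
2. $\psi^\theta_P(\mathbf{x}^{-*}_{1:P}, \mathbf{a}^{-*}_{1:P-1} \,\|\, x^*_{1:P}, b^*_{1:P})$ [via 'conditional' SMC],
3. $\bar{\pi}_P(b^*_n \mid \theta, \mathbf{x}_{1:n}, \mathbf{a}_{1:n-1}, x^*_{n+1:P}, b^*_{n+1:P})$ for $n = P, \dots, 1$.
[Figure: particle number against time, animated over the backward sampling steps.]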
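For a first-order hidden Markov model, the conditional distribution in step 3 puts mass on index $i$ proportional to the filter weight of particle $i$ at time $n$ multiplied by the transition density from that particle to the already selected state at time $n+1$. A minimal sketch of one such draw, assuming the hypothetical transition $x_{n+1} \mid x_n \sim N(\theta x_n, 1)$ (the function name and model are assumptions):

```python
import math
import random

def backward_sample_index(weights_n, particles_n, x_next, theta, rng):
    """Draw b_n with probability proportional to
    w_n^i * f_theta(x_next | x_n^i), with f_theta the N(theta * x, 1) density.
    Returns the sampled index and the unnormalised probabilities."""
    log_probs = [
        math.log(w) - 0.5 * (x_next - theta * x) ** 2 if w > 0.0 else -math.inf
        for w, x in zip(weights_n, particles_n)
    ]
    m = max(log_probs)
    probs = [math.exp(lp - m) for lp in log_probs]
    index = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return index, probs
```

Applying this draw for $n = P-1, \dots, 1$ after sampling the final index allows the new path to switch lineage at every time step, rather than being tied to the ancestry produced by the conditional SMC run.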
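Particle Gibbs with Ancestor Sampling
Lindsten, Jordan & Schön (2012)
1. sample from $\pi_P(\theta \mid x^*_{1:P})$,
2. for $n = 1, \dots, P$, sample from
   i. $\psi^\theta_P(\mathbf{x}^{-b^*_n}_n, \mathbf{a}^{-b^*_n}_{n-1} \,\|\, \mathbf{x}_{1:n-1}, \mathbf{a}_{1:n-2}, x^*_{n:P}, b^*_{n:P})$,
   ii. $\bar{\pi}_P(b^*_n \mid \theta, \mathbf{x}_{1:n}, \mathbf{a}_{1:n-1}, x^*_{n+1:P}, b^*_{n+1:P})$.
[Figure: particle number against time, animated over the ancestor sampling steps.]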
SMC²
Chopin, Jacob & Papaspiliopoulos (2013)
• Particle MCMC methods can only target a single distribution $\pi_P(\theta, x_{1:P})$.
• How to approximate $(\pi_n(\theta, x_{1:n}))_{n \in \mathbb{N}}$ (e.g. if observations arrive sequentially in time)?
• Idea: instead of MCMC, use SMC to target (a marginal of) the extended distribution $\bar{\pi}_n(\theta, \mathbf{x}_{1:n}, \mathbf{a}_{1:n-1}, b^*_n)$.
• Can be interpreted as nested SMC algorithms (SMC within SMC), i.e. each particle has its own SMC algorithm.
Literature
• C Andrieu, GO Roberts (2009). The pseudo-marginal approach for efficient Monte Carlo computations. ANN STAT, 37(2), 697–725.
• C Andrieu, A Doucet, R Holenstein (2010). Particle Markov chain Monte Carlo methods. J ROY STAT SOC B, 72(3), 269–342.
• N Chopin, PE Jacob, O Papaspiliopoulos (2013). SMC²: an efficient algorithm for sequential analysis of state space models. J ROY STAT SOC B, 75(3), 397–426.
• P Del Moral, A Doucet, A Jasra (2006). Sequential Monte Carlo samplers. J ROY STAT SOC B, 68(3), 411–436.
• F Lindsten, MI Jordan, T Schön (2012). Ancestor Sampling for Particle Gibbs. NIPS.
• DA Van Dyk, T Park (2006). Partially collapsed Gibbs samplers. JASA, 103(482), 790–796.
• N Whiteley (2010). Contribution to the discussion on 'Particle Markov chain Monte Carlo methods' by C Andrieu, A Doucet, R Holenstein. J ROY STAT SOC B, 72(3), 306–307.
• N Whiteley, AM Johansen, S Godsill (2011). Monte Carlo filtering of piecewise deterministic processes. J COMPUT GRAPH STAT, 20(1), 119–139.