Metropolis Hastings Gibbs Sampling Hamiltonian MCMC
PMR: Sampling II
Probabilistic Modelling and Reasoning
Amos Storkey
School of Informatics, University of Edinburgh
Amos Storkey — PMR: Sampling II 1/20
Outline
1 Metropolis Hastings
2 Gibbs Sampling
3 Hamiltonian MCMC
MCMC - Metropolis-Hastings Sampler
Markov chain: propose θ′ ∼ Q(θ′|θt). Accept with probability

    P(accept) = min( 1, [P(θ′) Q(θt|θ′)] / [P(θt) Q(θ′|θt)] )

If accepted, set θt+1 = θ′; otherwise set θt+1 = θt.
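A minimal sketch of the Metropolis-Hastings sampler in Python (the standard-normal target and the step size are assumptions for illustration). With a symmetric random-walk proposal the Q terms cancel, so the acceptance probability reduces to min(1, P(θ′)/P(θt)):

```python
import math
import random

def metropolis_hastings(log_p, theta0, n_samples, step=1.0, rng=random):
    """Random-walk Metropolis: the proposal is a symmetric Gaussian, so
    Q(theta_t|theta') / Q(theta'|theta_t) cancels in the acceptance ratio."""
    theta = theta0
    log_p_theta = log_p(theta)
    samples = []
    accepted = 0
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)   # propose from Q(theta'|theta_t)
        log_p_prop = log_p(proposal)
        # accept with probability min(1, P(theta') / P(theta_t))
        if math.log(rng.random()) < log_p_prop - log_p_theta:
            theta, log_p_theta = proposal, log_p_prop
            accepted += 1
        samples.append(theta)                     # a rejected step repeats theta_t
    return samples, accepted / n_samples

# Assumed illustrative target: standard normal, known only up to a
# constant -- MCMC never needs the normaliser.
log_p = lambda th: -0.5 * th * th

random.seed(0)
samples, acc_rate = metropolis_hastings(log_p, theta0=0.0, n_samples=20000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

Note how rejected proposals still contribute a (repeated) sample: dropping them would bias the chain.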
Metropolis-Hastings Transition
Write out the full transition probability. If P(θ)Q(φ|θ) > P(φ)Q(θ|φ):

    P(φ|θ) = [P(φ) Q(θ|φ) / (P(θ) Q(φ|θ))] Q(φ|θ) + δ(φ − θ) ∫dφ′ (1 − min(1, [P(φ′) Q(θ|φ′)] / [P(θ) Q(φ′|θ)])) Q(φ′|θ)

and otherwise (the proposal is always accepted):

    P(φ|θ) = Q(φ|θ) + δ(φ − θ) ∫dφ′ (1 − min(1, [P(φ′) Q(θ|φ′)] / [P(θ) Q(φ′|θ)])) Q(φ′|θ)

The δ(φ − θ) term collects the probability of rejecting a proposal and staying at θ. Exercise: prove this satisfies detailed balance.
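As a sanity check on the exercise, a small discrete example lets one verify detailed balance P(θ) P(φ|θ) = P(φ) P(θ|φ) numerically for the full Metropolis-Hastings kernel (the 3-state target P and proposal Q below are made up for illustration):

```python
# Discrete sanity check of detailed balance for the full MH kernel.
# Target P and proposal Q are arbitrary (assumed for illustration).
P = [0.2, 0.5, 0.3]
Q = [[0.1, 0.6, 0.3],
     [0.4, 0.2, 0.4],
     [0.5, 0.3, 0.2]]   # Q[i][j] = probability of proposing j from state i
n = len(P)

def accept(i, j):
    """Metropolis-Hastings acceptance probability for the move i -> j."""
    if i == j:
        return 1.0
    return min(1.0, P[j] * Q[j][i] / (P[i] * Q[i][j]))

# Full transition kernel: accepted moves, plus the rejection mass
# (the delta term in the continuous case) on the diagonal.
T = [[Q[i][j] * accept(i, j) for j in range(n)] for i in range(n)]
for i in range(n):
    T[i][i] += sum(Q[i][j] * (1.0 - accept(i, j)) for j in range(n))

# Detailed balance: P(i) T(i -> j) == P(j) T(j -> i) for all pairs.
for i in range(n):
    for j in range(n):
        assert abs(P[i] * T[i][j] - P[j] * T[j][i]) < 1e-12
```

The off-diagonal balance works because P(i) Q(i→j) min(1, ratio) = min(P(i)Q(i→j), P(j)Q(j→i)), which is symmetric in i and j.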
Metropolis-Hastings Demo
Matlab Demo
Metropolis-Hastings Problems
Setting the right proposal to provide good mixing, but a reasonable acceptance probability.
Try to get the acceptance rate to be 0.234 (various arguments provide conditions for this to be optimal). Can vary the proposal width.
Random-walk behaviour in high dimensions: the proposal is a diffusion process. Can take a long time to get anywhere.
MCMC - Gibbs Sampler
Markov chain: adapt θi keeping all θj, j ≠ i, fixed. That is:
Choose i uniformly from i = 1, 2, . . . , D. Set θt+1 = θt. Then sample θt+1,i from the conditional probability P(θt+1,i | θt+1,¬i), where θt+1,¬i denotes the set {θt+1,j | j ≠ i}.
Repeat.
Can instead cycle through i deterministically (this is not reversible, but can be shown to have a unique equilibrium distribution).
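A minimal Gibbs sketch in Python for a zero-mean bivariate Gaussian with correlation ρ (an assumed target for illustration), where each conditional P(θi | θ¬i) is a 1-D Gaussian that can be sampled exactly:

```python
import random

def gibbs_bivariate_gaussian(rho, n_samples, rng=random):
    """Gibbs sampling for a zero-mean bivariate Gaussian with unit
    variances and correlation rho. Each full conditional is
    theta_i | theta_j ~ N(rho * theta_j, 1 - rho**2)."""
    sd = (1.0 - rho * rho) ** 0.5
    th1, th2 = 0.0, 0.0
    samples = []
    for _ in range(n_samples):
        th1 = rng.gauss(rho * th2, sd)   # sample theta_1 | theta_2
        th2 = rng.gauss(rho * th1, sd)   # sample theta_2 | theta_1
        samples.append((th1, th2))
    return samples

random.seed(0)
samples = gibbs_bivariate_gaussian(rho=0.8, n_samples=50000)
m1 = sum(s[0] for s in samples) / len(samples)
corr_hat = sum(s[0] * s[1] for s in samples) / len(samples)
```

The closer ρ gets to 1, the smaller each conditional step relative to the shape of the joint, and the slower the chain mixes: the same self-reinforcement problem that motivates block sampling.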
Gibbs Demo
Matlab Demo: Gaussian
Matlab Demo: Lattice
MCMC - Block Sampler
The Gibbs sampler suffers from a self-reinforcement problem: frustrated systems.
Instead of updating one variable at a time, it may be possible to update a whole block of variables in one go.
Can help a bit, but need to have the joint distribution for the block.
MCMC - Mixed Sampler
Can mix ergodic sampling steps from different samplers: still satisfies detailed balance.
Can help to overcome the disadvantages of one method by incorporating another.
Often helpful to add in specific steps to help with mixing: if a sampler gets stuck in one potential well (a region surrounded by low-probability regions), we need a means of getting it to transition to another.
Augmentation Methods
Suppose we have a big sampling problem that is hard. Solution?
Turn it into an even bigger problem by adding additional variables. Solve that.
Can get samples from the original problem by just throwing the unneeded variables away.
If samples (ψi, θi) are from the joint P(ψ, θ) then they are also samples from P(ψ|θ)P(θ), as this is the same distribution. Hence the samples θi must be from P(θ), as ∫dψ P(ψ|θ) = 1 whatever θ is.
Examples: Hamiltonian Monte-Carlo, Swendsen-Wang.
MCMC - Hamiltonian (or Hybrid) Monte-Carlo
The problem with Metropolis-Hastings is random-walk behaviour: slow diffusion to cover the space.
Hamiltonian Monte-Carlo reduces this by augmenting each variable in the original space with another random variable.
Now can do contour walks for each of these variables, in addition to Gibbs sampling steps in the joint distribution of the augmented variables.
Related to Hamiltonian systems in physics: maintain constant energy by swapping kinetic energy for potential energy. The augmented variables are momentum variables.
See MacKay, Chapter 30.
Hamiltonian Monte-Carlo Procedure
Original problem P(θ). Add an augmented Gaussian P(v).
Step 1: Sample from the Gaussian P(v).
Step 2: Choose a direction b = ±1 (to maintain reversibility). Walk along the Hamiltonian path:

    θ̇i = −b ∂/∂vi log P(v)
    v̇i = b ∂/∂θi log P(θ)

Repeat.
Actually have to put in a few fixes to deal with the fact that we can only run the differential system using finite steps: need to use leapfrog steps, run bidirectionally, and use a proposal/acceptance approach to ensure detailed balance.
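A minimal leapfrog-based sketch in Python (the 1-D standard-normal target and the step-size/trajectory-length settings are assumptions for illustration). With a standard Gaussian P(v), −∂/∂v log P(v) = v, so the path above becomes θ̇ = v, v̇ = ∂/∂θ log P(θ); the direction b is dropped here because the Gaussian momentum resample makes b = ±1 equally likely anyway:

```python
import math
import random

def hmc_step(log_p, grad_log_p, theta, eps=0.1, n_leapfrog=20, rng=random):
    """One HMC transition: resample a Gaussian momentum, simulate the
    dynamics with leapfrog steps, then Metropolis-accept to correct
    for discretisation error in the finite-step integrator."""
    v = rng.gauss(0.0, 1.0)                      # Step 1: sample momentum
    theta_new, v_new = theta, v
    v_new += 0.5 * eps * grad_log_p(theta_new)   # half step for momentum
    for i in range(n_leapfrog):
        theta_new += eps * v_new                 # full step for position
        if i < n_leapfrog - 1:
            v_new += eps * grad_log_p(theta_new)
    v_new += 0.5 * eps * grad_log_p(theta_new)   # final half step
    # Accept with probability min(1, exp(-H(theta', v') + H(theta, v)))
    # where -H = log P(theta) + log P(v) up to constants.
    log_accept = (log_p(theta_new) - 0.5 * v_new ** 2) \
               - (log_p(theta) - 0.5 * v ** 2)
    if math.log(rng.random()) < log_accept:
        return theta_new
    return theta

# Assumed illustrative target: standard normal (unnormalised).
log_p = lambda th: -0.5 * th * th
grad_log_p = lambda th: -th

random.seed(0)
theta, samples = 0.0, []
for _ in range(5000):
    theta = hmc_step(log_p, grad_log_p, theta)
    samples.append(theta)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

Because the trajectory covers many step sizes of distance per transition, consecutive samples are far less correlated than under a random-walk proposal of comparable acceptance rate.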
Hamiltonian Monte-Carlo Comments
Most commonly used in large continuous systems.
Hamiltonian Monte-Carlo is recommended for many typical unsupervised settings.
Swendsen Wang
Problem: Gibbs sampling mixes poorly due to self-reinforcement.
Add in bond variables between highly aligned variables. Bonds can be in ‘connected’ or ‘disconnected’ states. Ensure the marginal distribution is the original problem.
Conditioned on the states, bonds are independent. Can randomly cut strong bonds.
Better mixing comes from the fact that there are now fewer reinforcing influences.
See http://www.inference.phy.cam.ac.uk/mackay/itila/swendsen.pdf
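A compact sketch of Swendsen-Wang for a small ferromagnetic Ising model (the lattice size and the coupling βJ are assumptions for illustration): a bond between equal neighbouring spins is connected with probability 1 − exp(−2βJ), clusters of connected sites are found, and each cluster is flipped as a whole to a uniform random spin:

```python
import math
import random

def neighbours(i, n):
    """Right and down neighbours of site i on an n x n grid (free boundaries)."""
    r, c = divmod(i, n)
    if c + 1 < n:
        yield i + 1
    if r + 1 < n:
        yield i + n

def swendsen_wang_step(spins, n, beta_j, rng=random):
    """One Swendsen-Wang sweep on a flat list of +/-1 spins: connect equal
    neighbours with probability 1 - exp(-2*beta_j), then resample each
    resulting cluster's spin uniformly. This leaves the Ising
    distribution invariant once the bond variables are marginalised out."""
    p_bond = 1.0 - math.exp(-2.0 * beta_j)
    # Union-find over bonded sites to identify clusters.
    parent = list(range(n * n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    for i in range(n * n):
        for j in neighbours(i, n):
            if spins[i] == spins[j] and rng.random() < p_bond:
                parent[find(i)] = find(j)
    # Assign each cluster a fresh uniform spin: whole regions flip at once.
    cluster_spin = {}
    for i in range(n * n):
        root = find(i)
        if root not in cluster_spin:
            cluster_spin[root] = rng.choice((-1, 1))
        spins[i] = cluster_spin[root]
    return spins

random.seed(0)
n, beta_j = 8, 0.3   # assumed small lattice and temperature for illustration
spins = [random.choice((-1, 1)) for _ in range(n * n)]
for _ in range(200):
    swendsen_wang_step(spins, n, beta_j)
```

Flipping whole clusters is what removes the reinforcing influences: a single-site Gibbs update would have to fight the aligned neighbourhood one spin at a time.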
Convergence?
No guaranteed way to test convergence in general.
The main tests involve tests of coalescence: converged when the chain has forgotten its past.
Multiple chains: do they end up in the same place?
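The multiple-chains idea is often quantified with the Gelman-Rubin statistic R̂, which compares between-chain to within-chain variance; a minimal sketch (the synthetic "chains" below are independent stand-ins, not real MCMC output):

```python
import random

def gelman_rubin(chains):
    """Gelman-Rubin potential scale reduction R-hat for a list of
    equal-length scalar chains. Values near 1 suggest the chains have
    mixed into the same distribution; large values flag disagreement."""
    m = len(chains)
    n = len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    b = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)   # between-chain
    w = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
            for c, mu in zip(chains, means)) / m               # within-chain
    var_hat = (n - 1) / n * w + b / n   # pooled variance estimate
    return (var_hat / w) ** 0.5

random.seed(0)
# Four "chains" drawing from the same N(0, 1) target...
mixed = [[random.gauss(0, 1) for _ in range(2000)] for _ in range(4)]
# ...versus four chains stuck around different modes.
stuck = [[random.gauss(mu, 1) for _ in range(2000)] for mu in (-3, -1, 1, 3)]
```

For the mixed chains R̂ is close to 1; for the stuck ones it is well above 1, which is exactly the "did they end up in the same place?" question made numerical.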
Real Machine Learning?
But how are these used in a real probabilistic modelling context?
For sampling from the posterior distribution of machine learning methods.
To Do
Examinable Reading: MacKay, Chapters 29 and 30
Preparatory Reading: MacKay, Chapter 45
Extra Reading: any papers of Radford Neal that take your fancy; Iain Murray's tutorial slides.