
Metropolis Hastings Gibbs Sampling Hamiltonian MCMC

PMR: Sampling II
Probabilistic Modelling and Reasoning

Amos Storkey

School of Informatics, University of Edinburgh

Amos Storkey — PMR: Sampling II 1/20

Metropolis Hastings Gibbs Sampling Hamiltonian MCMC

Outline

1 Metropolis Hastings

2 Gibbs Sampling

3 Hamiltonian MCMC

MCMC - Metropolis-Hastings Sampler

Markov chain: propose θ′ ∼ Q(θ′|θt). Accept with probability

P(Accept) = min(1, P(θ′)Q(θt|θ′) / [P(θt)Q(θ′|θt)])

If accepted, set θt+1 = θ′; otherwise set θt+1 = θt.
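The update above can be sketched in Python. This is a minimal random-walk sampler, assuming a symmetric Gaussian proposal so that the Q terms cancel in the acceptance ratio; the target `log_p` (a standard Gaussian) and the step size are illustrative choices, not part of the slides:

```python
import numpy as np

def metropolis_hastings(log_p, theta0, n_samples, step=1.0, rng=None):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal.

    With Q(theta'|theta) = Q(theta|theta'), the acceptance ratio
    reduces to P(theta') / P(theta_t).
    """
    rng = np.random.default_rng(rng)
    theta = np.asarray(theta0, dtype=float)
    samples, n_accept = [], 0
    for _ in range(n_samples):
        proposal = theta + step * rng.standard_normal(theta.shape)
        # log acceptance probability: min(0, log P(theta') - log P(theta_t))
        if np.log(rng.uniform()) < log_p(proposal) - log_p(theta):
            theta = proposal
            n_accept += 1
        samples.append(theta.copy())
    return np.array(samples), n_accept / n_samples

# Target: standard 1-D Gaussian (an unnormalised log density is enough).
log_p = lambda th: -0.5 * np.sum(th ** 2)
samples, acc_rate = metropolis_hastings(log_p, np.zeros(1), 20000, step=2.4, rng=0)
```

Tracking the acceptance rate, as here, is what allows the proposal width to be tuned.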

Metropolis-Hastings Transition

Write out the full transition probability:

P(φ|θ) = [P(φ)Q(θ|φ) / (P(θ)Q(φ|θ))] Q(φ|θ)
         + δ(φ − θ) ∫ dφ′ (1 − min(1, P(φ′)Q(θ|φ′) / (P(θ)Q(φ′|θ)))) Q(φ′|θ)

if P(θ)Q(φ|θ) > P(φ)Q(θ|φ), and

P(φ|θ) = Q(φ|θ)

otherwise.

Exercise: prove that this satisfies detailed balance.
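The detailed-balance claim can be checked numerically on a small discrete chain, where the full transition matrix (proposal times acceptance, plus the rejection mass on the diagonal) can be written out explicitly. The target P and proposal Q below are arbitrary illustrative choices:

```python
import numpy as np

P = np.array([0.2, 0.3, 0.5])
Q = np.array([[0.1, 0.6, 0.3],
              [0.4, 0.2, 0.4],
              [0.5, 0.3, 0.2]])   # Q[i, j] = Q(j | i); rows sum to 1

# Acceptance probabilities A[i, j] = min(1, P(j)Q(i|j) / (P(i)Q(j|i)))
A = np.minimum(1.0, (P[None, :] * Q.T) / (P[:, None] * Q))

# Full Metropolis-Hastings transition matrix T[i, j] = P(j | i):
# off-diagonal mass is proposal * acceptance; rejected mass stays at i.
T = Q * A
T[np.diag_indices(3)] += 1.0 - T.sum(axis=1)

# Detailed balance: P(i) T(j|i) == P(j) T(i|j) for all i, j.
flow = P[:, None] * T
assert np.allclose(flow, flow.T)
# P is therefore a stationary distribution of T.
assert np.allclose(P @ T, P)
```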

Metropolis-Hastings Demo

Matlab Demo

Metropolis-Hastings Problems

Setting the right proposal to provide good mixing, but a reasonable acceptance probability.
Try to get the acceptance rate to be about 0.234 (various arguments provide conditions for this to be optimal). The proposal width can be varied.
Random-walk behaviour in high dimensions: the proposal is a diffusion process, so it can take a long time to get anywhere.

MCMC - Gibbs Sampler

Markov chain: update θi keeping all θj, j ≠ i, fixed. That is:
Choose i uniformly from i = 1, 2, . . . , D. Set θt+1 = θt. Then sample θt+1,i from the conditional probability P(θt+1,i | θt+1,−i), where θt+1,−i denotes the set {θt+1,j | j ≠ i}.
Repeat.
Can instead cycle through i in a fixed order (this is not reversible, but can be shown to have a unique equilibrium distribution).
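A minimal sketch of the scheme, assuming a bivariate Gaussian target whose conditionals are known in closed form (the classic textbook case; the correlation value is an illustrative choice):

```python
import numpy as np

# Gibbs sampling for a zero-mean bivariate Gaussian with correlation rho:
# each conditional P(theta_i | theta_-i) is itself Gaussian,
#   theta_1 | theta_2 ~ N(rho * theta_2, 1 - rho^2)   (and symmetrically).
rho = 0.8
rng = np.random.default_rng(0)
theta = np.zeros(2)
samples = np.empty((50000, 2))
for t in range(len(samples)):
    # update each coordinate in turn from its exact conditional
    theta[0] = rho * theta[1] + np.sqrt(1 - rho**2) * rng.standard_normal()
    theta[1] = rho * theta[0] + np.sqrt(1 - rho**2) * rng.standard_normal()
    samples[t] = theta

# the empirical correlation should approach rho
emp_rho = np.corrcoef(samples.T)[0, 1]
```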

Gibbs Demo

Matlab Demo: Gaussian
Matlab Demo: Lattice

MCMC - Block Sampler

The Gibbs sampler suffers from a self-reinforcement problem in frustrated systems.
Instead of updating one variable at a time, it may be possible to update a whole block of variables in one go.
This can help a bit, but requires the joint distribution for the block.

MCMC - Mixed Sampler

Ergodic sampling steps from different samplers can be mixed: the result still satisfies detailed balance.
This can help to overcome the disadvantages of one method by incorporating another.
It is often helpful to add specific steps to help with mixing: if a sampler gets stuck in one potential well (a region surrounded by low-probability regions), it needs a means of transitioning to another.

Augmentation Methods

Suppose we have a big sampling problem that is hard. Solution?
Turn it into an even bigger problem by adding additional variables. Solve that.
We can get samples from the original problem by just throwing the unneeded variables away.
If samples (ψi, θi) are from the joint P(ψ, θ), then they are also samples of P(ψ|θ)P(θ), as this is the same distribution. Hence the samples θi must be from P(θ), since ∫ dψ P(ψ|θ) = 1 whatever θ is.
Examples: Hamiltonian Monte-Carlo, Swendsen-Wang.
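A tiny illustration of the marginalisation argument, using the standard Gaussian scale-mixture representation of the Student-t (not on the slide): the augmented variable ψ is a precision, and sampling the joint (ψ, θ) then discarding ψ leaves exact samples of the marginal P(θ).

```python
import numpy as np

# psi ~ Gamma(nu/2, rate=nu/2) and theta | psi ~ N(0, 1/psi) gives a
# Student-t marginal for theta with nu degrees of freedom.
nu = 5.0
rng = np.random.default_rng(2)
psi = rng.gamma(shape=nu / 2, scale=2 / nu, size=200000)  # numpy uses scale = 1/rate
theta = rng.standard_normal(200000) / np.sqrt(psi)

# Discarding psi, theta should match the known t moments:
# mean 0 and variance nu / (nu - 2).
```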

MCMC - Hamiltonian (or Hybrid) Monte-Carlo

The problem with Metropolis-Hastings is random-walk behaviour: slow diffusion to cover the space.
Hamiltonian Monte-Carlo reduces this by augmenting each variable in the original space with another random variable.
We can then take contour walks for each of these variables, in addition to Gibbs sampling steps in the joint distribution of the augmented variables.
Related to Hamiltonian systems in physics: constant total energy is maintained by swapping kinetic energy for potential energy. The augmented variables are momentum variables.
See MacKay, Chapter 30.

Hamiltonian Monte-Carlo Procedure

Original problem: P(θ). Add an augmented Gaussian P(v).
Step 1: Sample from the Gaussian P(v).
Step 2: Choose a direction b = ±1 (to maintain reversibility).
Walk along the Hamiltonian path:

θ̇i = −b ∂/∂vi log P(v)

v̇i = b ∂/∂θi log P(θ)

Repeat.
In practice a few fixes are needed because the differential system can only be run using finite steps: use leapfrog steps, run bidirectionally, and use a proposal/acceptance approach to ensure detailed balance.
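The procedure can be sketched as follows. This is a minimal leapfrog implementation for a toy Gaussian target; the direction variable b is dropped because the Gaussian momentum distribution is symmetric, and the step size and trajectory length are illustrative choices:

```python
import numpy as np

def hmc_step(log_p, grad_log_p, theta, step=0.1, n_leapfrog=20, rng=None):
    """One Hamiltonian Monte-Carlo transition (sketch).

    Momenta v are the augmented Gaussian variables; the leapfrog integrator
    approximates the Hamiltonian dynamics, and a Metropolis accept/reject
    corrects for the discretisation error.
    """
    rng = np.random.default_rng(rng)
    v = rng.standard_normal(theta.shape)          # Step 1: sample momentum
    theta_new, v_new = theta.copy(), v.copy()
    v_new += 0.5 * step * grad_log_p(theta_new)   # initial half-step in momentum
    for _ in range(n_leapfrog - 1):
        theta_new += step * v_new
        v_new += step * grad_log_p(theta_new)
    theta_new += step * v_new
    v_new += 0.5 * step * grad_log_p(theta_new)   # final half-step
    # Hamiltonian H = -log P(theta) + |v|^2 / 2; accept with prob min(1, e^{-dH})
    h_old = -log_p(theta) + 0.5 * v @ v
    h_new = -log_p(theta_new) + 0.5 * v_new @ v_new
    if np.log(rng.uniform()) < h_old - h_new:
        return theta_new
    return theta

# Target: standard 2-D Gaussian.
log_p = lambda th: -0.5 * th @ th
grad_log_p = lambda th: -th
rng = np.random.default_rng(3)
theta = np.zeros(2)
samples = np.empty((5000, 2))
for t in range(len(samples)):
    theta = hmc_step(log_p, grad_log_p, theta, step=0.2, n_leapfrog=10, rng=rng)
    samples[t] = theta
```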

Hamiltonian Monte-Carlo Comments

Most commonly used in large continuous systems.
Hamiltonian Monte-Carlo is recommended for many typical unsupervised settings.

Swendsen Wang

Problem: Gibbs sampling mixes poorly due to self-reinforcement.
Add bond variables between highly aligned variables. Bonds can be in ‘connected’ or ‘disconnected’ states. Ensure the marginal distribution is the original problem.
Conditioned on the states, the bonds are independent. Strong bonds can be randomly cut.
Better mixing comes from the fact that there are now fewer reinforcing influences.
See http://www.inference.phy.cam.ac.uk/mackay/itila/swendsen.pdf
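A minimal sketch on a 1-D Ising ring, P(s) ∝ exp(J Σi si si+1): bonds are placed between aligned neighbours with probability 1 − e^(−2J), and the resulting clusters are flipped independently with probability 1/2. The ring size, coupling, and iteration count are illustrative choices:

```python
import numpy as np

def swendsen_wang_step(s, J, rng):
    """One Swendsen-Wang sweep on a ring of +/-1 spins."""
    N = len(s)
    aligned = s == np.roll(s, -1)                      # edge (i, i+1 mod N)
    bond = aligned & (rng.uniform(size=N) < 1 - np.exp(-2 * J))
    # label clusters: walk the ring, starting a new label wherever no bond
    label = np.zeros(N, dtype=int)
    for i in range(1, N):
        label[i] = label[i - 1] + (0 if bond[i - 1] else 1)
    if bond[N - 1]:                                    # wrap-around bond
        label[label == label[N - 1]] = label[0]
    # flip each cluster independently with probability 1/2
    for c in np.unique(label):
        if rng.uniform() < 0.5:
            s[label == c] *= -1
    return s

rng = np.random.default_rng(5)
J, N = 0.5, 20
s = rng.choice([-1, 1], size=N)
corr, n_iter = 0.0, 20000
for _ in range(n_iter):
    s = swendsen_wang_step(s, J, rng)
    corr += np.mean(s * np.roll(s, -1))
corr /= n_iter
# for a long ring the nearest-neighbour correlation approaches tanh(J)
```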

Convergence?

There is no guaranteed way to test convergence in general.
The main tests involve coalescence: the chain has converged when it has forgotten its past.
Multiple chains: do they end up in the same place?
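The multiple-chains idea is usually quantified with the Gelman-Rubin potential scale reduction statistic (not named on the slide); a minimal sketch for scalar chains, with synthetic "mixed" and "stuck" chains as illustrative inputs:

```python
import numpy as np

def gelman_rubin(chains):
    """Potential scale reduction R-hat for several chains of one scalar.

    chains: array of shape (m, n) -- m chains, n samples each.
    R-hat near 1 suggests the chains have mixed into the same distribution.
    """
    m, n = chains.shape
    B = n * chains.mean(axis=1).var(ddof=1)   # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()     # within-chain variance
    var_plus = (n - 1) / n * W + B / n        # pooled variance estimate
    return np.sqrt(var_plus / W)

rng = np.random.default_rng(4)
mixed = rng.standard_normal((4, 1000))               # 4 chains, same target
stuck = mixed + np.array([[0.], [0.], [5.], [5.]])   # two chains in another mode
```

Applied to `mixed` the statistic sits near 1; applied to `stuck` it is far above 1, flagging that the chains have not ended up in the same place.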

Real Machine Learning?

But how are these used in a real probabilistic modelling context?
For sampling from the posterior distribution of machine learning methods.

To Do

Examinable Reading: MacKay, Chapters 29 and 30

Preparatory Reading: MacKay, Chapter 45

Extra Reading: Any papers of Radford Neal that take your fancy. Iain Murray’s tutorial slides.
