Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
A Family of MCMC Methods on Implicitly Defined Manifolds!Marcus A. Brubaker,+, Mathieu Salzmann and Raquel Urtasun!
Toyota Technological Institute at Chicago!+ University of Toronto, Canada!
Introduc)on: • Tradi&onal MCMC methods (e.g., Gauss-‐Metropolis, HMC) assume the
target distribu&on is over a Euclidean space • However, many problems exist which are most naturally characterized over
a non-‐linear manifold • Sampling from posteriors that arise in such problems has typically required
the deriva&on of posterior-‐specific sampling schemes
Contribu)ons: • Here we derive an MCMC scheme based on Hamiltonian dynamics on an
implicitly defined manifold • We prove that, subject to suitable condi&ons, the Markov Chain converges
to the target posterior • We present constrained variants of several MCMC methods including:
Gauss-‐Metropolis, Hamiltonian (and Langevin) Monte Carlo and Riemann Manifold HMC [6]
• These algorithms are demonstrated on a range of problems including: o Sampling from a linearly constrained Gaussian distribu&on o Sampling from the Bingham-‐von Mises-‐Fisher distribu&on over o Bayesian matrix factoriza&on for collabora&ve filtering o Human pose es&ma&on
• Matlab code available from: hSp://www.cs.toronto.edu/~mbrubake/
Previous Work: • Similar methods are commonly used in molecular dynamics to compute the
free energy of a constrained system (eg, [1-‐3]) • Gibbs samplers have been derived for some distribu&ons (eg, [4]) but even
those specialized methods are outperformed by methods presented here
M = {q ∈ Rn|c(q) = 0}
π(q)
Sn
Experimental Results: • Gaussian distribu&on in a linear subspace
• Bingham-‐von Mises-‐Fisher
• Collabora&ve filtering
• Human pose es&ma&on o Pose is a set of 3D joint posi&ons o Manifold is induced by the limb length
constraints of the skeleton o Posterior combines noisy 2D joint projec&ons
with a PCA based prior model of pose o Compared with direct op&miza&on for
different levels of noise
References: 1. G. Cicco^ and J. P. Ryckaert. Molecular dynamics simula&on of rigid molecules. Computer Physics Report, 4(6):346–392, 1986
2. C. Hartmann. An ergodic sampling scheme for constrained Hamiltonian systems with applica&ons to molecular dynamics. Journal of Sta&s&cal Physics, 130:687–711, 2008
3. T. Lelièvre, M. Rousset, and G. Stoltz. Free energy computa&ons: A Mathema&cal Perspec&ve. Imperial College Press, 2010
4. P. D. Hoff. Simula&on of the matrix Bingham-‐von Mises-‐FIsher distribu&on, with applica&ons to mul&variate and rela&onal data. Journal of Computa&onal and Graphical Sta&s&cs, 18:438–456, 2009
5. E. Hairer, C. Lubich, and G. Wanner. Geometric Numerical Integra&on. Springer, 2nd edi&on, 2006 6. M. Girolami and B. Calderhead. Riemann manifold Langevin and Hamiltonian Monte Carlo methods. Journal of the Royal Sta&s&cal Society: Series B, 73:123–214, 2011
0 0.01 0.02 0.03 0.04 0.05
0
0.2
0.4
0.6
0.8
1
CHMC (L = 4)CHMC (L = 3)CHMC (L = 2)CLangevinCMetropolisGibbs
20 40 60 80 100
100
200
300
400
Frame #
Mea
n jo
int e
rror [
mm
]
Constr optOurs MAPOurs mean
0 2 4 6 8 10
50
100
150
200
250
Noise std
Mea
n jo
int e
rror [
mm
]
Constr optOurs MAPOurs mean
M = {q ∈ Rn|c(q) = 0}Theore)cal Result: • Assume that is connected, smooth and
differen&able with full-‐rank everywhere and the target posterior is strictly posi&ve on
• Given: a mass matrix which is posi&ve definite on a simula&on poten&al energy func&on which is con&nuous a numerical integra&on method which is
symmetric, locally accessible, consistent with the Simula&on Hamiltonian , and symplec0c on the co-‐tangent bundle
• Theorem: For all
where denotes steps of the Markov transi&on kernel of the Constrained Hamiltonian Monte Carlo algorithm
C(q) = ∂c∂q
M(q) M
Mπ(q)
U(q)ΦH
h : T ∗M → T ∗M
T ∗M =�(p, q)|c(q) = 0 and C(q)∂H∂p (p, q) = 0
�
C2
H
q0 ∈ M
limn→∞
�Tn(q0 → ·)− π(·)� = 0
Tn(q0 → ·) n
Simula)on of constrained Hamiltonian systems • Need a symplec&c, consistent and symmetric integra&on method on • Generalized RATTLE Algorithm (see [5] for details and other op&ons)
• If and the mass matrix is constant, RATTLE reduces to Leapfrog
M
p1/2 = p0 −h
2
�∂H(p1/2, q0)
∂q+ C(q0)
Tλ
�
q1 = q0 +h
2
�∂H(p1/2, q0)
∂p+
∂H(p1/2, q1)
∂p
�
0 = c(q1)
p1 = p1/2 −h
2
�∂H(p1/2, q1)
∂q+ C(q1)
Tµ
�
0 = C(q1)∂H(p1, q1)
∂p
M = Rn
Instances of Constrained HMC: • Gauss-‐Metropolis with covariance can expressed as HMC with
and . Constrained Gauss-‐Metropolis is thus similarly defined. • Constrained Langevin Monte Carlo arises with • Constrained Riemann Manifold HMC [6] arises for suitable choices of
Σ U(q) = 0M(q) = Σ−1
L = 1M(q)
10 0 1015
10
5
0
5CHMC
10 0 1015
10
5
0
5CLangevin
10 0 1015
10
5
0
5CMetropolis
M = Sn π(q) ∝ exp(dT q + qTAq)
Method E[− log π(q)] ESS % ESS/second
CHMC (L = 4) -999.021 27.3 183.756
CHMC (L = 3) -998.759 25.4 217.427
CHMC (L = 2) -999.121 37.9 440.898
CLangevin -998.757 33.0 619.339
CMetropolis -998.82 3.8 90.1513
Gibbs [4] -998.742 50.8 160.722
M = Vr(RN )× Vr(RM )× Rr π(U,S,V) ∝�
(i,j)∈E
exp
�− (f(UiSVj)−Yi,j)2
2σ2p
�
1M Movie Lens (RMSE) EachMovie (RMSE)r 5 10 15 5 10 15
HMC 1.577 ± 0.39 2.001 ± 0.66 2.306 ± 0.25 1.153 ± 0.002 1.161 ± 0.002 1.204 ± 0.018
HMC-l 0.909 ± 0.008 0.949 ± 0.01 0.99 ± 0.01 1.155 ± 0.007 1.164 ± 0.001 1.184 ± 0.004
CHMC 0.893 ± 0.01 0.888 ± 0.01 0.889 ± 0.01 1.144 ± 0.002 1.121 ± 0.001 1.116 ± 0.001
CHMC-l 0.888 ± 0.01 0.881 ± 0.01 0.881 ± 0.01 1.137 ± 0.003 1.115 ± 0.002 1.11 ± 0.002
Constrained Hamiltonian Monte Carlo: • Input: • Define: o Co-‐tangent Projec0on:
o Acceptance Hamiltonian:
o Simula0on Hamiltonian:
1. , 2. For , 3. With probability o Return
4. Else o Return
q0, M(q), h, L, π(q), U(q)
i = 1, . . . , L (pi, qi) ← ΦH
h (pi−1, qi−1)
P(q) = I −M(q)−TC(q)T�C(q)M(q)−1M(q)−TC(q)T
�−1C(q)M(q)−1
H(p, q) = 12p
TM(q)−1p+ U(q)
H(p, q) = 12p
TM(q)−1p+ 12 log |2πP(q)TM(q)P(q)|− log π(q)
qL
q0
p�0 ∼ N (0,M(q0)) p0 ← P(q0)p�0
min {1, exp(H(p0, q0)−H(pL, qL))}