Lecture: Markov Chain Monte Carlo
Scribes: Jay DeYoung, Iris Seaman
Last Lecture: Importance Sampling

Idea: Generate samples $x^s \sim q(x)$ from a proposal distribution that is similar to $\pi(x)$.

[Figure: target density $\pi(x)$ overlaid on proposal $q(x)$; regions where $q(x)$ underrepresents $\pi(x)$ receive high weight.]

$$E_\pi[f(x)] = \int dx\, \pi(x)\, f(x) = \int dx\, q(x)\, \frac{\pi(x)}{q(x)}\, f(x) \approx \frac{1}{S} \sum_{s=1}^S w^s f(x^s), \qquad w^s = \frac{\pi(x^s)}{q(x^s)}, \quad x^s \sim q(x)$$

Two levels of approximation (the self-normalized form):

$$E_\pi[f(x)] \approx \sum_{s=1}^S \frac{w^s}{\sum_{s'} w^{s'}}\, f(x^s)$$
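A minimal NumPy sketch of both estimators, using an illustrative standard-normal target and a wider Gaussian proposal (all specific choices here are assumptions for illustration, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative target pi(x) = Norm(x; 0, 1) and proposal q(x) = Norm(x; 0, 2^2).
log_pi = lambda x: -0.5 * x**2 - 0.5 * np.log(2 * np.pi)
log_q  = lambda x: -0.5 * (x / 2)**2 - np.log(2) - 0.5 * np.log(2 * np.pi)

S = 10_000
xs = 2 * rng.standard_normal(S)          # x^s ~ q(x)
ws = np.exp(log_pi(xs) - log_q(xs))      # w^s = pi(x^s) / q(x^s)

f = lambda x: x**2                       # test function f(x)

est_unnorm = np.mean(ws * f(xs))                 # (1/S) sum_s w^s f(x^s)
est_selfnorm = np.sum(ws * f(xs)) / np.sum(ws)   # sum_s (w^s / sum_s' w^s') f(x^s)
print(est_unnorm, est_selfnorm)          # both should be near E_pi[x^2] = 1
```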
Default Choice for Proposal: Likelihood Weighting

Assume a Bayes net:

$$\gamma(x) = p(y, x) = p(y \mid x)\, p(x), \qquad Z = \int dx\, p(y, x) = p(y)$$

Set the proposal to the prior, $q(x) = p(x)$. The importance weights are then the likelihood:

$$w^s = \frac{\gamma(x^s)}{q(x^s)} = \frac{p(y, x^s)}{p(x^s)} = p(y \mid x^s)$$
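A sketch of likelihood weighting on a toy Bayes net (the model $x \sim \mathrm{Norm}(0,1)$, $y \mid x \sim \mathrm{Norm}(x, 0.5^2)$ is an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Bayes net (illustrative): x ~ Norm(0, 1), y | x ~ Norm(x, 0.5^2).
y_obs = 1.0
S = 10_000

xs = rng.standard_normal(S)                     # x^s ~ q(x) = p(x), the prior
ws = np.exp(-0.5 * ((y_obs - xs) / 0.5)**2)     # w^s proportional to p(y | x^s)

post_mean = np.sum(ws * xs) / np.sum(ws)        # self-normalized estimate of E[x | y]
print(post_mean)   # exact posterior mean is y / (1 + 0.5^2) = 0.8
```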
Motivating Problem: Hidden Markov Models

[Figure: graphical model $\theta \to z_1 \to z_2 \to \dots \to z_T$, with emissions $y_1, \dots, y_T$.]

Goal: posterior on the parameters $\theta$,

$$p(\theta \mid y) = \int dz\, p(\theta, z \mid y)$$

Problem: this is hard to get. Will likelihood weighting work? It can, because the weight can be computed with the forward-backward algorithm:

$$\theta^s \sim p(\theta), \qquad w^s = p(y \mid \theta^s) = \int dz\, p(y, z \mid \theta^s)$$
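A sketch of the forward recursion that computes $w^s = p(y \mid \theta^s)$ for a discrete HMM (the function name and the parameter values are assumptions for illustration):

```python
import numpy as np

def log_marginal_likelihood(A, E, pi0, y):
    """Forward algorithm: log p(y | theta) for a discrete HMM with
    transition matrix A, emission matrix E, and initial distribution pi0."""
    log_Z = 0.0
    alpha = pi0 * E[:, y[0]]                       # p(z_1, y_1)
    for t in range(1, len(y)):
        c = alpha.sum()                            # rescale to avoid underflow
        log_Z += np.log(c)
        alpha = ((alpha / c) @ A) * E[:, y[t]]     # predict, then weight by emission
    return log_Z + np.log(alpha.sum())

# Usage: one likelihood-weighting weight for a sampled theta^s = (A, E, pi0).
A = np.array([[0.9, 0.1], [0.2, 0.8]])             # illustrative parameter values
E = np.array([[0.8, 0.2], [0.3, 0.7]])
pi0 = np.array([0.5, 0.5])
y = [0, 0, 1, 1, 0]
print(np.exp(log_marginal_likelihood(A, E, pi0, y)))   # w^s = p(y | theta^s)
```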
Markov Chain Monte Carlo

Idea: Use the previous sample $x^{s-1}$ to propose the next sample $x^s$.

Markov chain: A sequence of random variables $X^1, \dots, X^S$ is a (discrete-time) Markov chain when it satisfies the Markov property

$$p(x^s \mid x^{1:s-1}) = p(x^s \mid x^{s-1})$$

A Markov chain is homogeneous when the transition density is the same at every step:

$$p(X^s = x^s \mid X^{s-1} = x^{s-1}) = p(X^2 = x^s \mid X^1 = x^{s-1}) \quad \text{for each } s$$
Markov Chain Monte Carlo

Convergence: A Markov chain converges to a target density $\pi(x)$ when

$$\lim_{s \to \infty} p(X^s = x) = \pi(X = x)$$

[Figure: trajectory of a chain over states; each state $X = x$ is visited with frequency $\pi(X = x)$.]
Markov Chain Monte Carlo

Detailed Balance: A homogeneous Markov chain satisfies detailed balance when

$$\pi(x)\, p(x' \mid x) = \pi(x')\, p(x \mid x')$$

Implication: $p(x' \mid x)$ leaves $\pi(x)$ invariant,

$$\pi(x) = \pi(x) \int dx'\, p(x' \mid x) = \int dx'\, \pi(x)\, p(x' \mid x) = \int dx'\, \pi(x')\, p(x \mid x')$$

If you start with a sample $x' \sim \pi(x)$ and then sample $x \mid x' \sim p(x \mid x')$, then $x \sim \pi(x)$.
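A numeric check of both properties on an illustrative two-state chain (the values of $\pi$ and the transition matrix are assumptions chosen to satisfy detailed balance):

```python
import numpy as np

# Illustrative 2-state chain: pi = [0.75, 0.25] with transitions chosen
# so that pi(x) p(x'|x) = pi(x') p(x|x').
pi = np.array([0.75, 0.25])
P = np.array([[0.9, 0.1],    # p(x'|x): rows sum to 1
              [0.3, 0.7]])

# Detailed balance: pi[i] * P[i, j] == pi[j] * P[j, i] for all i, j.
flux = pi[:, None] * P
print(np.allclose(flux, flux.T))   # True

# Invariance: pi(x) = sum_x' pi(x') p(x|x').
print(np.allclose(pi @ P, pi))     # True
```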
Metropolis-Hastings

Idea: Starting from the current sample $x^s$, generate a proposal $x' \sim q(x' \mid x^s)$ and accept $x^{s+1} = x'$ with probability

$$\alpha = \min\left(1,\; \frac{\pi(x')\, q(x^s \mid x')}{\pi(x^s)\, q(x' \mid x^s)}\right)$$

This choice of $\alpha$ is what gives detailed balance; whenever the ratio exceeds 1 we always accept, hence the $\min(1, \cdot)$. With probability $(1 - \alpha)$, reject the proposal and retain the previous sample, $x^{s+1} = x^s$.
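A generic sketch of the accept/reject loop in log space (the function name and the standard-normal example target are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def metropolis_hastings(log_pi, propose, log_q, x0, S):
    """MH sketch. log_pi: log target density (normalized or not),
    propose(x) draws x' ~ q(.|x), log_q(xp, x) evaluates log q(xp | x)."""
    xs = [x0]
    for _ in range(S):
        x = xs[-1]
        xp = propose(x)
        log_alpha = (log_pi(xp) + log_q(x, xp)) - (log_pi(x) + log_q(xp, x))
        if np.log(rng.uniform()) < log_alpha:   # accept with prob min(1, alpha)
            xs.append(xp)                       # x^{s+1} = x'
        else:
            xs.append(x)                        # reject: x^{s+1} = x^s
    return np.array(xs)

# Usage: standard-normal target with a Gaussian random-walk proposal.
samples = metropolis_hastings(
    log_pi=lambda x: -0.5 * x**2,
    propose=lambda x: x + rng.normal(scale=1.0),
    log_q=lambda xp, x: 0.0,   # symmetric proposal: the q terms cancel
    x0=0.0, S=5000)
print(samples.mean(), samples.var())   # near 0 and 1
```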
Exercise: Show that the Markov chain $x^1, \dots, x^S$ satisfies detailed balance.
Metropolis-Hastings: Detailed Balance

Detailed balance: $\pi(x)\, p(x' \mid x) = \pi(x')\, p(x \mid x')$.

Metropolis-Hastings defines the transition

$$p(x' \mid x) = (1 - \bar{\alpha})\, \delta_x(x') + \alpha\, q(x' \mid x), \qquad \alpha = \min\left(1,\; \frac{\pi(x')\, q(x \mid x')}{\pi(x)\, q(x' \mid x)}\right)$$

Then

$$p(x' \mid x)\, \pi(x) = (1 - \bar{\alpha})\, \delta_x(x')\, \pi(x) + \min\big(\pi(x)\, q(x' \mid x),\; \pi(x')\, q(x \mid x')\big)$$
$$= (1 - \bar{\alpha})\, \delta_{x'}(x)\, \pi(x') + \min\big(\pi(x')\, q(x \mid x'),\; \pi(x)\, q(x' \mid x)\big) = p(x \mid x')\, \pi(x')$$

(The $\delta$ term only contributes when $x' = x$, where it is symmetric in $x$ and $x'$.)
Metropolis-Hastings: Unnormalized Densities

Nice property: the acceptance probability can be calculated from the unnormalized densities $\gamma(x')$ and $\gamma(x)$, since $\pi(x) = \gamma(x)/Z$ and $\pi(x') = \gamma(x')/Z$, so $Z$ cancels:

$$\alpha = \min\left(1,\; \frac{\pi(x')\, q(x \mid x')}{\pi(x)\, q(x' \mid x)}\right) = \min\left(1,\; \frac{\gamma(x')\, q(x \mid x')}{\gamma(x)\, q(x' \mid x)}\right) = \min\left(1,\; \frac{p(y, x')\, q(x \mid x')}{p(y, x)\, q(x' \mid x)}\right)$$

(for a Bayes net, where $\gamma(x) = p(y, x)$)
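Concretely, the unnormalized log joint of the earlier toy Bayes net can be plugged straight into the `metropolis_hastings` sketch above; $Z = p(y)$ never needs to be computed. (This snippet reuses `metropolis_hastings` and `rng` from that earlier illustrative sketch, which are assumptions of these notes:)

```python
# Unnormalized log joint for the toy Bayes net: x ~ Norm(0,1), y|x ~ Norm(x, 0.5^2).
y_obs = 1.0
log_gamma = lambda x: -0.5 * x**2 - 0.5 * ((y_obs - x) / 0.5)**2

samples = metropolis_hastings(
    log_pi=log_gamma,                           # gamma(x) = p(y, x); Z = p(y) cancels
    propose=lambda x: x + rng.normal(scale=0.5),
    log_q=lambda xp, x: 0.0, x0=0.0, S=5000)
print(samples[1000:].mean())                    # near the exact posterior mean 0.8
```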
Metropolis-Hastings: Choosing Proposals

Independence MH: Sample proposals from the prior, independent of the previous sample:

$$q(x' \mid x) = p(x')$$

$$\alpha = \min\left(1,\; \frac{p(y, x')\, q(x \mid x')}{p(y, x)\, q(x' \mid x)}\right) = \min\left(1,\; \frac{p(y \mid x')\, p(x')\, p(x)}{p(y \mid x)\, p(x)\, p(x')}\right) = \min\left(1,\; \frac{p(y \mid x')}{p(y \mid x)}\right)$$

Straightforward, but low acceptance probability (same issue as with likelihood weighting).
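A sketch of independence MH on the same toy Bayes net, where the acceptance ratio reduces to the likelihood ratio $p(y \mid x')/p(y \mid x)$ (model and tuning values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Bayes net: x ~ Norm(0, 1), y | x ~ Norm(x, 0.5^2). Proposals come
# from the prior, so only the likelihood ratio enters the acceptance prob.
y_obs = 1.0
log_lik = lambda x: -0.5 * ((y_obs - x) / 0.5)**2

x, samples, accepts = 0.0, [], 0
for _ in range(5000):
    xp = rng.standard_normal()                  # x' ~ q(x'|x) = p(x')
    if np.log(rng.uniform()) < log_lik(xp) - log_lik(x):
        x, accepts = xp, accepts + 1
    samples.append(x)
# Acceptance rate drops as the likelihood becomes more peaked.
print(accepts / 5000, np.mean(samples))
```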
Metropolis-Hastings: Choosing Proposals

Continuous variables: Gaussian random walk,

$$q(x' \mid x) = \mathrm{Norm}(x';\, x, \sigma^2)$$

[Figure: target density with random-walk proposals around the current sample, comparing large $\sigma^2$ against small $\sigma^2$.]

The proposal is symmetric, $q(x' \mid x) = q(x \mid x')$, so the proposal densities cancel:

$$\alpha = \min\left(1,\; \frac{\gamma(x')}{\gamma(x)}\right)$$

Trade-off for the proposal variance:
- $\sigma^2$ too small: good acceptance probability $\alpha$, but high correlation between samples.
- $\sigma^2$ too large: less correlation, but lower acceptance probability $\alpha$.

Rule of thumb: tune $\sigma^2$ to make $\bar{\alpha} \approx 0.234$.
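A crude sketch of tuning $\sigma$ toward that acceptance rate with a stochastic-approximation update (the update rule and target are assumptions for illustration; the lecture only states the 0.234 rule of thumb):

```python
import numpy as np

rng = np.random.default_rng(0)

log_gamma = lambda x: -0.5 * x**2        # illustrative target

x, sigma = 0.0, 10.0
for s in range(1, 5001):
    xp = x + sigma * rng.standard_normal()       # q(x'|x) = Norm(x'; x, sigma^2)
    accept = np.log(rng.uniform()) < log_gamma(xp) - log_gamma(x)
    if accept:
        x = xp
    # Scale sigma up when accepting more than 23.4% of the time, down otherwise.
    sigma *= np.exp((float(accept) - 0.234) / np.sqrt(s))
print(sigma)   # settles near a scale giving roughly 23% acceptance
```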
Gibbs Sampling (Next Lecture)

Idea: Propose one variable at a time, holding the other variables constant:

$$\gamma(x) = p(y \mid x_1, x_2)\, p(x_1, x_2)$$

$$x_1' \sim p(x_1 \mid y, x_2) = p(y, x_1, x_2)\, /\, p(y, x_2), \qquad x_2' \sim p(x_2 \mid y, x_1')$$

Acceptance ratio: can accept with probability 1,

$$\alpha = \min\left(1,\; \frac{p(y, x_1', x_2)\; p(y, x_1, x_2)\, /\, p(y, x_2)}{p(y, x_1, x_2)\; p(y, x_1', x_2)\, /\, p(y, x_2)}\right) = 1$$
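A sketch of the scheme on an illustrative bivariate Gaussian target, where each full conditional is a one-dimensional Gaussian (the correlation value is an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Bivariate Gaussian with correlation rho: both full conditionals are
# Norm(rho * other, 1 - rho^2), so each update is an exact conditional draw.
rho, S = 0.9, 5000
x1, x2 = 0.0, 0.0
samples = np.empty((S, 2))
for s in range(S):
    # x1' ~ p(x1 | x2)
    x1 = rho * x2 + np.sqrt(1 - rho**2) * rng.standard_normal()
    # x2' ~ p(x2 | x1'): note it conditions on the *updated* x1
    x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.standard_normal()
    samples[s] = x1, x2
print(np.corrcoef(samples.T)[0, 1])   # close to rho = 0.9
```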
MCMC vs. Importance Sampling

$$\pi(\theta) = \gamma(\theta)/Z, \qquad \gamma(\theta) = p(y, \theta), \qquad Z = p(y)$$

Importance sampling:

$$w^s = \frac{p(y, \theta^s)}{q(\theta^s)}, \qquad \theta^s \sim q(\theta), \qquad E[w^s] = p(y)$$

"Guess and check"; gives an estimate of the marginal likelihood.

Metropolis-Hastings:

$$\theta^s = \begin{cases} \theta' & u < \alpha \\ \theta^{s-1} & u \geq \alpha \end{cases}, \qquad u \sim \mathrm{Unif}(0, 1), \qquad \alpha = \min\left(1,\; \frac{\gamma(\theta')\, q(\theta^{s-1} \mid \theta')}{\gamma(\theta^{s-1})\, q(\theta' \mid \theta^{s-1})}\right)$$

Can do "hill climbing", but gives no estimate of the marginal likelihood.
Computing Marginal Likelihoods

Motivation: model comparison. Question: how many clusters $K$?

[Figure: the same data clustered under different $K$; panel captions compare low vs. high $p(y \mid K)$ and fewer vs. lots of bad $\theta$.]

Bayesian approach: Compare marginal likelihoods,

$$K^* = \operatorname*{argmax}_{k \in \{1, \dots, K^{\max}\}} p(y \mid k) = \operatorname*{argmax}_k \int d\theta\, p(y \mid \theta)\, p(\theta \mid k)$$

"Best average fit."
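A simple Monte Carlo sketch of the comparison, $p(y \mid k) \approx \frac{1}{S} \sum_s p(y \mid \theta^s)$ with $\theta^s \sim p(\theta \mid k)$; the two "models" here differ only in prior scale and stand in for different cluster counts (an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting: theta ~ Norm(0, prior_scale^2), y | theta ~ Norm(theta, 0.5^2).
y_obs = 1.0
S = 100_000

def marginal_likelihood(prior_scale):
    thetas = prior_scale * rng.standard_normal(S)      # theta^s ~ p(theta | k)
    liks = np.exp(-0.5 * ((y_obs - thetas) / 0.5)**2) / (0.5 * np.sqrt(2 * np.pi))
    return liks.mean()                                 # estimate of p(y | k)

for k, scale in enumerate([1.0, 10.0], start=1):
    print(k, marginal_likelihood(scale))   # pick the k with the largest p(y | k)
```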
Annealed Importance Sampling

Idea 1: Sample from the target $\gamma(\theta)$ by way of intermediate distributions, $0 \leq \beta_n \leq 1$:

$$\gamma_0(\theta) = p(\theta), \qquad \gamma_n(\theta) = p(y \mid \theta)^{\beta_n}\, p(\theta), \qquad \gamma_N(\theta) = \gamma(\theta)$$

Near $\gamma_0$ it is easy to generate good proposals; near $\gamma_N$ it is hard.

Idea 2: Use MCMC to generate the proposals; each sample is used as the proposal for the next step.

[Figure: initialize from the prior, use MCMC proposals to move between the intermediate distributions, ending with high-quality samples.]

Initialization:

$$\theta_1^s \sim \gamma_0(\cdot), \qquad w_1^s = \frac{\gamma_1(\theta_1^s)}{\gamma_0(\theta_1^s)}$$

Transition (move with an MCMC kernel $\kappa_{n-1}$ that leaves $\gamma_{n-1}$ invariant, then weight):

$$\theta_n^s \sim \kappa_{n-1}(\theta_n \mid \theta_{n-1}^s), \qquad w_n^s = w_{n-1}^s\, \frac{\gamma_n(\theta_n^s)}{\gamma_{n-1}(\theta_n^s)}$$
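A sketch of the full scheme on the toy Bayes net, using a linear temperature schedule and one random-walk MH step per temperature as the kernel (schedule, step size, and model are assumptions for illustration); $E[w]$ estimates $Z = p(y)$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Bayes net: theta ~ Norm(0, 1), y | theta ~ Norm(theta, 0.5^2),
# with gamma_n(theta) = p(y | theta)^{beta_n} p(theta).
y_obs, S, N = 1.0, 2000, 50
log_prior = lambda x: -0.5 * x**2
log_lik = lambda x: -0.5 * ((y_obs - x) / 0.5)**2 - np.log(0.5 * np.sqrt(2 * np.pi))
betas = np.linspace(0, 1, N + 1)

x = rng.standard_normal(S)                   # theta_1^s ~ gamma_0 = p(theta)
log_w = (betas[1] - betas[0]) * log_lik(x)   # w_1^s = gamma_1 / gamma_0 at theta_1^s
for n in range(2, N + 1):
    # Move: one MH step leaving gamma_{n-1} invariant (vectorized over chains).
    log_g = lambda z: betas[n - 1] * log_lik(z) + log_prior(z)
    xp = x + 0.5 * rng.standard_normal(S)
    accept = np.log(rng.uniform(size=S)) < log_g(xp) - log_g(x)
    x = np.where(accept, xp, x)
    # Then weight: w_n^s = w_{n-1}^s * gamma_n(theta_n^s) / gamma_{n-1}(theta_n^s).
    log_w += (betas[n] - betas[n - 1]) * log_lik(x)

# E[w] estimates Z = p(y); the exact value here is Norm(y; 0, 1 + 0.5^2).
print(np.exp(log_w).mean())
print(np.exp(-0.5 * y_obs**2 / 1.25) / np.sqrt(2 * np.pi * 1.25))
```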