Lecture: Markov Chain Monte Carlo
Scribes: Jay DeYoung, Iris Seaman
Last Lecture: Importance Sampling

Idea: Generate samples $x^s \sim q(x)$ from a proposal distribution that is similar to $\pi(x)$.

[Figure: target density $\pi(x)$ overlaid on proposal $q(x)$; regions where $q(x)$ underrepresents $\pi(x)$ receive high weight.]

$$E_\pi[f(x)] = \int dx\, \pi(x)\, f(x) = \int dx\, q(x)\, \frac{\pi(x)}{q(x)}\, f(x) \approx \frac{1}{S} \sum_{s=1}^S w^s f(x^s), \qquad w^s = \frac{\pi(x^s)}{q(x^s)}, \quad x^s \sim q(x)$$

Two levels of approximation (the self-normalized form):

$$E_\pi[f(x)] \approx \sum_{s=1}^S \frac{w^s}{\sum_{s'} w^{s'}}\, f(x^s)$$
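A minimal NumPy sketch of both estimators, using an illustrative standard-normal target and a wider Gaussian proposal (all specific choices here are assumptions for illustration, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative target pi(x) = Norm(x; 0, 1) and proposal q(x) = Norm(x; 0, 2^2).
log_pi = lambda x: -0.5 * x**2 - 0.5 * np.log(2 * np.pi)
log_q  = lambda x: -0.5 * (x / 2)**2 - np.log(2) - 0.5 * np.log(2 * np.pi)

S = 10_000
xs = 2 * rng.standard_normal(S)          # x^s ~ q(x)
ws = np.exp(log_pi(xs) - log_q(xs))      # w^s = pi(x^s) / q(x^s)

f = lambda x: x**2                       # test function f(x)

est_unnorm = np.mean(ws * f(xs))                 # (1/S) sum_s w^s f(x^s)
est_selfnorm = np.sum(ws * f(xs)) / np.sum(ws)   # sum_s (w^s / sum_s' w^s') f(x^s)
print(est_unnorm, est_selfnorm)          # both should be near E_pi[x^2] = 1
```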
Default Choice for Proposal: Likelihood Weighting

Assume a Bayes net:

$$\gamma(x) = p(y, x) = p(y \mid x)\, p(x), \qquad Z = \int dx\, p(y, x) = p(y)$$

Set the proposal to the prior, $q(x) = p(x)$. The importance weights are then the likelihood:

$$w^s = \frac{\gamma(x^s)}{q(x^s)} = \frac{p(y, x^s)}{p(x^s)} = p(y \mid x^s)$$
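A sketch of likelihood weighting on a toy Bayes net (the model $x \sim \mathrm{Norm}(0,1)$, $y \mid x \sim \mathrm{Norm}(x, 0.5^2)$ is an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Bayes net (illustrative): x ~ Norm(0, 1), y | x ~ Norm(x, 0.5^2).
y_obs = 1.0
S = 10_000

xs = rng.standard_normal(S)                     # x^s ~ q(x) = p(x), the prior
ws = np.exp(-0.5 * ((y_obs - xs) / 0.5)**2)     # w^s proportional to p(y | x^s)

post_mean = np.sum(ws * xs) / np.sum(ws)        # self-normalized estimate of E[x | y]
print(post_mean)   # exact posterior mean is y / (1 + 0.5^2) = 0.8
```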
Motivating Problem: Hidden Markov Models

[Figure: graphical model $\theta \to z_1 \to z_2 \to \dots \to z_T$, with emissions $y_1, \dots, y_T$.]

Goal: posterior on the parameters $\theta$,

$$p(\theta \mid y) = \int dz\, p(\theta, z \mid y)$$

Problem: this is hard to get. Will likelihood weighting work? It can, because the weight can be computed with the forward-backward algorithm:

$$\theta^s \sim p(\theta), \qquad w^s = p(y \mid \theta^s) = \int dz\, p(y, z \mid \theta^s)$$
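A sketch of the forward recursion that computes $w^s = p(y \mid \theta^s)$ for a discrete HMM (the function name and the parameter values are assumptions for illustration):

```python
import numpy as np

def log_marginal_likelihood(A, E, pi0, y):
    """Forward algorithm: log p(y | theta) for a discrete HMM with
    transition matrix A, emission matrix E, and initial distribution pi0."""
    log_Z = 0.0
    alpha = pi0 * E[:, y[0]]                       # p(z_1, y_1)
    for t in range(1, len(y)):
        c = alpha.sum()                            # rescale to avoid underflow
        log_Z += np.log(c)
        alpha = ((alpha / c) @ A) * E[:, y[t]]     # predict, then weight by emission
    return log_Z + np.log(alpha.sum())

# Usage: one likelihood-weighting weight for a sampled theta^s = (A, E, pi0).
A = np.array([[0.9, 0.1], [0.2, 0.8]])             # illustrative parameter values
E = np.array([[0.8, 0.2], [0.3, 0.7]])
pi0 = np.array([0.5, 0.5])
y = [0, 0, 1, 1, 0]
print(np.exp(log_marginal_likelihood(A, E, pi0, y)))   # w^s = p(y | theta^s)
```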
Markov Chain Monte Carlo

Idea: Use the previous sample $x^{s-1}$ to propose the next sample $x^s$.

Markov chain: A sequence of random variables $X^1, \dots, X^S$ is a (discrete-time) Markov chain when it satisfies the Markov property

$$p(x^s \mid x^{1:s-1}) = p(x^s \mid x^{s-1})$$

A Markov chain is homogeneous when the transition density is the same at every step:

$$p(X^s = x^s \mid X^{s-1} = x^{s-1}) = p(X^2 = x^s \mid X^1 = x^{s-1}) \quad \text{for each } s$$
Markov Chain Monte Carlo

Convergence: A Markov chain converges to a target density $\pi(x)$ when

$$\lim_{s \to \infty} p(X^s = x) = \pi(X = x)$$

[Figure: trajectory of a chain over states; each state $X = x$ is visited with frequency $\pi(X = x)$.]
Markov Chain Monte Carlo

Detailed Balance: A homogeneous Markov chain satisfies detailed balance when

$$\pi(x)\, p(x' \mid x) = \pi(x')\, p(x \mid x')$$

Implication: $p(x' \mid x)$ leaves $\pi(x)$ invariant,

$$\pi(x) = \pi(x) \int dx'\, p(x' \mid x) = \int dx'\, \pi(x)\, p(x' \mid x) = \int dx'\, \pi(x')\, p(x \mid x')$$

If you start with a sample $x' \sim \pi(x)$ and then sample $x \mid x' \sim p(x \mid x')$, then $x \sim \pi(x)$.
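A numeric check of both properties on an illustrative two-state chain (the values of $\pi$ and the transition matrix are assumptions chosen to satisfy detailed balance):

```python
import numpy as np

# Illustrative 2-state chain: pi = [0.75, 0.25] with transitions chosen
# so that pi(x) p(x'|x) = pi(x') p(x|x').
pi = np.array([0.75, 0.25])
P = np.array([[0.9, 0.1],    # p(x'|x): rows sum to 1
              [0.3, 0.7]])

# Detailed balance: pi[i] * P[i, j] == pi[j] * P[j, i] for all i, j.
flux = pi[:, None] * P
print(np.allclose(flux, flux.T))   # True

# Invariance: pi(x) = sum_x' pi(x') p(x|x').
print(np.allclose(pi @ P, pi))     # True
```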
Metropolis-Hastings

Idea: Starting from the current sample $x^s$, generate a proposal $x' \sim q(x' \mid x^s)$ and accept $x^{s+1} = x'$ with probability

$$\alpha = \min\left(1,\; \frac{\pi(x')\, q(x^s \mid x')}{\pi(x^s)\, q(x' \mid x^s)}\right)$$

This choice of $\alpha$ is what gives detailed balance; whenever the ratio exceeds 1 we always accept, hence the $\min(1, \cdot)$. With probability $(1 - \alpha)$, reject the proposal and retain the previous sample, $x^{s+1} = x^s$.
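A generic sketch of the accept/reject loop in log space (the function name and the standard-normal example target are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def metropolis_hastings(log_pi, propose, log_q, x0, S):
    """MH sketch. log_pi: log target density (normalized or not),
    propose(x) draws x' ~ q(.|x), log_q(xp, x) evaluates log q(xp | x)."""
    xs = [x0]
    for _ in range(S):
        x = xs[-1]
        xp = propose(x)
        log_alpha = (log_pi(xp) + log_q(x, xp)) - (log_pi(x) + log_q(xp, x))
        if np.log(rng.uniform()) < log_alpha:   # accept with prob min(1, alpha)
            xs.append(xp)                       # x^{s+1} = x'
        else:
            xs.append(x)                        # reject: x^{s+1} = x^s
    return np.array(xs)

# Usage: standard-normal target with a Gaussian random-walk proposal.
samples = metropolis_hastings(
    log_pi=lambda x: -0.5 * x**2,
    propose=lambda x: x + rng.normal(scale=1.0),
    log_q=lambda xp, x: 0.0,   # symmetric proposal: the q terms cancel
    x0=0.0, S=5000)
print(samples.mean(), samples.var())   # near 0 and 1
```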
Exercise: Show that the Markov chain $x^1, \dots, x^S$ satisfies detailed balance.
Metropolis-Hastings: Detailed Balance

Detailed balance: $\pi(x)\, p(x' \mid x) = \pi(x')\, p(x \mid x')$.

Metropolis-Hastings defines the transition

$$p(x' \mid x) = (1 - \bar{\alpha})\, \delta_x(x') + \alpha\, q(x' \mid x), \qquad \alpha = \min\left(1,\; \frac{\pi(x')\, q(x \mid x')}{\pi(x)\, q(x' \mid x)}\right)$$

Then

$$p(x' \mid x)\, \pi(x) = (1 - \bar{\alpha})\, \delta_x(x')\, \pi(x) + \min\big(\pi(x)\, q(x' \mid x),\; \pi(x')\, q(x \mid x')\big)$$
$$= (1 - \bar{\alpha})\, \delta_{x'}(x)\, \pi(x') + \min\big(\pi(x')\, q(x \mid x'),\; \pi(x)\, q(x' \mid x)\big) = p(x \mid x')\, \pi(x')$$

(The $\delta$ term only contributes when $x' = x$, where it is symmetric in $x$ and $x'$.)
Metropolis-Hastings: Unnormalized Densities

Nice property: the acceptance probability can be calculated from the unnormalized densities $\gamma(x')$ and $\gamma(x)$, since $\pi(x) = \gamma(x)/Z$ and $\pi(x') = \gamma(x')/Z$, so $Z$ cancels:

$$\alpha = \min\left(1,\; \frac{\pi(x')\, q(x \mid x')}{\pi(x)\, q(x' \mid x)}\right) = \min\left(1,\; \frac{\gamma(x')\, q(x \mid x')}{\gamma(x)\, q(x' \mid x)}\right) = \min\left(1,\; \frac{p(y, x')\, q(x \mid x')}{p(y, x)\, q(x' \mid x)}\right)$$

(for a Bayes net, where $\gamma(x) = p(y, x)$)
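Concretely, the unnormalized log joint of the earlier toy Bayes net can be plugged straight into the `metropolis_hastings` sketch above; $Z = p(y)$ never needs to be computed. (This snippet reuses `metropolis_hastings` and `rng` from that earlier illustrative sketch, which are assumptions of these notes:)

```python
# Unnormalized log joint for the toy Bayes net: x ~ Norm(0,1), y|x ~ Norm(x, 0.5^2).
y_obs = 1.0
log_gamma = lambda x: -0.5 * x**2 - 0.5 * ((y_obs - x) / 0.5)**2

samples = metropolis_hastings(
    log_pi=log_gamma,                           # gamma(x) = p(y, x); Z = p(y) cancels
    propose=lambda x: x + rng.normal(scale=0.5),
    log_q=lambda xp, x: 0.0, x0=0.0, S=5000)
print(samples[1000:].mean())                    # near the exact posterior mean 0.8
```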
Metropolis-Hastings: Choosing Proposals

Independence MH: Sample proposals from the prior, independent of the previous sample:

$$q(x' \mid x) = p(x')$$

$$\alpha = \min\left(1,\; \frac{p(y, x')\, q(x \mid x')}{p(y, x)\, q(x' \mid x)}\right) = \min\left(1,\; \frac{p(y \mid x')\, p(x')\, p(x)}{p(y \mid x)\, p(x)\, p(x')}\right) = \min\left(1,\; \frac{p(y \mid x')}{p(y \mid x)}\right)$$

Straightforward, but low acceptance probability (same issue as with likelihood weighting).
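A sketch of independence MH on the same toy Bayes net, where the acceptance ratio reduces to the likelihood ratio $p(y \mid x')/p(y \mid x)$ (model and tuning values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Bayes net: x ~ Norm(0, 1), y | x ~ Norm(x, 0.5^2). Proposals come
# from the prior, so only the likelihood ratio enters the acceptance prob.
y_obs = 1.0
log_lik = lambda x: -0.5 * ((y_obs - x) / 0.5)**2

x, samples, accepts = 0.0, [], 0
for _ in range(5000):
    xp = rng.standard_normal()                  # x' ~ q(x'|x) = p(x')
    if np.log(rng.uniform()) < log_lik(xp) - log_lik(x):
        x, accepts = xp, accepts + 1
    samples.append(x)
# Acceptance rate drops as the likelihood becomes more peaked.
print(accepts / 5000, np.mean(samples))
```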
Metropolis-Hastings: Choosing Proposals

Continuous variables: Gaussian random walk,

$$q(x' \mid x) = \mathrm{Norm}(x';\, x, \sigma^2)$$

[Figure: target density with random-walk proposals around the current sample, comparing large $\sigma^2$ against small $\sigma^2$.]

The proposal is symmetric, $q(x' \mid x) = q(x \mid x')$, so the proposal densities cancel:

$$\alpha = \min\left(1,\; \frac{\gamma(x')}{\gamma(x)}\right)$$

Trade-off for the proposal variance:
- $\sigma^2$ too small: good acceptance probability $\alpha$, but high correlation between samples.
- $\sigma^2$ too large: less correlation, but lower acceptance probability $\alpha$.

Rule of thumb: tune $\sigma^2$ to make $\bar{\alpha} \approx 0.234$.
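A crude sketch of tuning $\sigma$ toward that acceptance rate with a stochastic-approximation update (the update rule and target are assumptions for illustration; the lecture only states the 0.234 rule of thumb):

```python
import numpy as np

rng = np.random.default_rng(0)

log_gamma = lambda x: -0.5 * x**2        # illustrative target

x, sigma = 0.0, 10.0
for s in range(1, 5001):
    xp = x + sigma * rng.standard_normal()       # q(x'|x) = Norm(x'; x, sigma^2)
    accept = np.log(rng.uniform()) < log_gamma(xp) - log_gamma(x)
    if accept:
        x = xp
    # Scale sigma up when accepting more than 23.4% of the time, down otherwise.
    sigma *= np.exp((float(accept) - 0.234) / np.sqrt(s))
print(sigma)   # settles near a scale giving roughly 23% acceptance
```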
Gibbs Sampling (Next Lecture)

Idea: Propose one variable at a time, holding the other variables constant:

$$\gamma(x) = p(y \mid x_1, x_2)\, p(x_1, x_2)$$

$$x_1' \sim p(x_1 \mid y, x_2) = p(y, x_1, x_2)\, /\, p(y, x_2), \qquad x_2' \sim p(x_2 \mid y, x_1')$$

Acceptance ratio: can accept with probability 1,

$$\alpha = \min\left(1,\; \frac{p(y, x_1', x_2)\; p(y, x_1, x_2)\, /\, p(y, x_2)}{p(y, x_1, x_2)\; p(y, x_1', x_2)\, /\, p(y, x_2)}\right) = 1$$
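A sketch of the scheme on an illustrative bivariate Gaussian target, where each full conditional is a one-dimensional Gaussian (the correlation value is an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Bivariate Gaussian with correlation rho: both full conditionals are
# Norm(rho * other, 1 - rho^2), so each update is an exact conditional draw.
rho, S = 0.9, 5000
x1, x2 = 0.0, 0.0
samples = np.empty((S, 2))
for s in range(S):
    # x1' ~ p(x1 | x2)
    x1 = rho * x2 + np.sqrt(1 - rho**2) * rng.standard_normal()
    # x2' ~ p(x2 | x1'): note it conditions on the *updated* x1
    x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.standard_normal()
    samples[s] = x1, x2
print(np.corrcoef(samples.T)[0, 1])   # close to rho = 0.9
```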
MCMC vs. Importance Sampling

$$\pi(\theta) = \gamma(\theta)/Z, \qquad \gamma(\theta) = p(y, \theta), \qquad Z = p(y)$$

Importance sampling:

$$w^s = \frac{p(y, \theta^s)}{q(\theta^s)}, \qquad \theta^s \sim q(\theta), \qquad E[w^s] = p(y)$$

"Guess and check"; gives an estimate of the marginal likelihood.

Metropolis-Hastings:

$$\theta^s = \begin{cases} \theta' & u < \alpha \\ \theta^{s-1} & u \geq \alpha \end{cases}, \qquad u \sim \mathrm{Unif}(0, 1), \qquad \alpha = \min\left(1,\; \frac{\gamma(\theta')\, q(\theta^{s-1} \mid \theta')}{\gamma(\theta^{s-1})\, q(\theta' \mid \theta^{s-1})}\right)$$

Can do "hill climbing", but gives no estimate of the marginal likelihood.
Computing Marginal Likelihoods

Motivation: model comparison. Question: how many clusters $K$?

[Figure: the same data clustered under different $K$; panel captions compare low vs. high $p(y \mid K)$ and fewer vs. lots of bad $\theta$.]

Bayesian approach: Compare marginal likelihoods,

$$K^* = \operatorname*{argmax}_{k \in \{1, \dots, K^{\max}\}} p(y \mid k) = \operatorname*{argmax}_k \int d\theta\, p(y \mid \theta)\, p(\theta \mid k)$$

"Best average fit."
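A simple Monte Carlo sketch of the comparison, $p(y \mid k) \approx \frac{1}{S} \sum_s p(y \mid \theta^s)$ with $\theta^s \sim p(\theta \mid k)$; the two "models" here differ only in prior scale and stand in for different cluster counts (an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting: theta ~ Norm(0, prior_scale^2), y | theta ~ Norm(theta, 0.5^2).
y_obs = 1.0
S = 100_000

def marginal_likelihood(prior_scale):
    thetas = prior_scale * rng.standard_normal(S)      # theta^s ~ p(theta | k)
    liks = np.exp(-0.5 * ((y_obs - thetas) / 0.5)**2) / (0.5 * np.sqrt(2 * np.pi))
    return liks.mean()                                 # estimate of p(y | k)

for k, scale in enumerate([1.0, 10.0], start=1):
    print(k, marginal_likelihood(scale))   # pick the k with the largest p(y | k)
```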
Annealed Importance Sampling

Idea 1: Sample from the target $\gamma(\theta)$ by way of intermediate distributions, $0 \leq \beta_n \leq 1$:

$$\gamma_0(\theta) = p(\theta), \qquad \gamma_n(\theta) = p(y \mid \theta)^{\beta_n}\, p(\theta), \qquad \gamma_N(\theta) = \gamma(\theta)$$

Near $\gamma_0$ it is easy to generate good proposals; near $\gamma_N$ it is hard.

Idea 2: Use MCMC to generate the proposals; each sample is used as the proposal for the next step.

[Figure: initialize from the prior, use MCMC proposals to move between the intermediate distributions, ending with high-quality samples.]

Initialization:

$$\theta_1^s \sim \gamma_0(\cdot), \qquad w_1^s = \frac{\gamma_1(\theta_1^s)}{\gamma_0(\theta_1^s)}$$

Transition (move with an MCMC kernel $\kappa_{n-1}$ that leaves $\gamma_{n-1}$ invariant, then weight):

$$\theta_n^s \sim \kappa_{n-1}(\theta_n \mid \theta_{n-1}^s), \qquad w_n^s = w_{n-1}^s\, \frac{\gamma_n(\theta_n^s)}{\gamma_{n-1}(\theta_n^s)}$$
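A sketch of the full scheme on the toy Bayes net, using a linear temperature schedule and one random-walk MH step per temperature as the kernel (schedule, step size, and model are assumptions for illustration); $E[w]$ estimates $Z = p(y)$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Bayes net: theta ~ Norm(0, 1), y | theta ~ Norm(theta, 0.5^2),
# with gamma_n(theta) = p(y | theta)^{beta_n} p(theta).
y_obs, S, N = 1.0, 2000, 50
log_prior = lambda x: -0.5 * x**2
log_lik = lambda x: -0.5 * ((y_obs - x) / 0.5)**2 - np.log(0.5 * np.sqrt(2 * np.pi))
betas = np.linspace(0, 1, N + 1)

x = rng.standard_normal(S)                   # theta_1^s ~ gamma_0 = p(theta)
log_w = (betas[1] - betas[0]) * log_lik(x)   # w_1^s = gamma_1 / gamma_0 at theta_1^s
for n in range(2, N + 1):
    # Move: one MH step leaving gamma_{n-1} invariant (vectorized over chains).
    log_g = lambda z: betas[n - 1] * log_lik(z) + log_prior(z)
    xp = x + 0.5 * rng.standard_normal(S)
    accept = np.log(rng.uniform(size=S)) < log_g(xp) - log_g(x)
    x = np.where(accept, xp, x)
    # Then weight: w_n^s = w_{n-1}^s * gamma_n(theta_n^s) / gamma_{n-1}(theta_n^s).
    log_w += (betas[n] - betas[n - 1]) * log_lik(x)

# E[w] estimates Z = p(y); the exact value here is Norm(y; 0, 1 + 0.5^2).
print(np.exp(log_w).mean())
print(np.exp(-0.5 * y_obs**2 / 1.25) / np.sqrt(2 * np.pi * 1.25))
```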