CS774. Markov Random Field : Theory and Application Lecture 08 Kyomin Jung KAIST Sep 29 2009


Page 1:

CS774. Markov Random Field : Theory and Application

Lecture 08

Kyomin Jung, KAIST

Sep 29 2009

Page 2:

Review: Exponential Family

Parametrization of positive MRFs, i.e. P[x] > 0 for all x.

Let $\phi = \{\phi_\alpha \mid \alpha \in I\}$ denote a collection of potential functions defined on the cliques of G.

Let $\theta = \{\theta_\alpha \mid \alpha \in I\}$ be a vector of weights on these potential functions.

An MRF with weight $\theta$ is defined by

$$P[x; \theta] = \exp\Big\{ \sum_{\alpha \in I} \theta_\alpha \phi_\alpha(x) - Z(\theta) \Big\},$$

where the log partition function is

$$Z(\theta) = \log \Big[ \sum_{x \in \chi^n} \exp\Big\{ \sum_{\alpha \in I} \theta_\alpha \phi_\alpha(x) \Big\} \Big].$$
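
As a concrete illustration (not from the slides), here is a minimal brute-force sketch of this parametrization on a hypothetical binary MRF over a 4-cycle, with vertex potentials $\phi_s(x) = x_s$ and edge potentials $\phi_{st}(x) = x_s x_t$; all variable names are ours.

```python
import itertools
import math

# Hypothetical binary MRF on a 4-cycle: x_s in {0, 1}.
vertices = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
theta_v = {s: 0.0 for s in vertices}   # weights on vertex potentials phi_s(x) = x_s
theta_e = {e: 1.0 for e in edges}      # weights on edge potentials phi_st(x) = x_s * x_t

def score(x):
    """sum_{alpha in I} theta_alpha * phi_alpha(x)."""
    return (sum(theta_v[s] * x[s] for s in vertices)
            + sum(theta_e[(s, t)] * x[s] * x[t] for (s, t) in edges))

# Log partition function: Z(theta) = log sum_x exp{ sum_alpha theta_alpha phi_alpha(x) }.
Z = math.log(sum(math.exp(score(x))
                 for x in itertools.product([0, 1], repeat=len(vertices))))

def prob(x):
    """P[x; theta] = exp{ sum_alpha theta_alpha phi_alpha(x) - Z(theta) }."""
    return math.exp(score(x) - Z)

# The probabilities are positive and sum to 1.
print(Z, sum(prob(x) for x in itertools.product([0, 1], repeat=4)))
```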

Page 3:

Lemmas

1. $\displaystyle \nabla Z(\theta) = E_\theta[\phi(x)] = \sum_{x \in \chi^n} p(x; \theta)\, \phi(x)$

2. $\displaystyle \nabla^2 Z(\theta) = E_\theta\big[\phi(x)\phi(x)^\top\big] - E_\theta[\phi(x)]\, E_\theta[\phi(x)]^\top$

So the Hessian of the log partition function Z is equal to a covariance matrix, which is always positive semidefinite. Hence Z is convex as a function of $\theta$.
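
A quick numerical check of Lemma 1, reusing the toy 4-cycle MRF from the previous sketch: the finite-difference derivative of $Z$ with respect to the weight on edge (0, 1) should agree with $E_\theta[x_0 x_1]$.

```python
import itertools
import math

vertices = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]

def log_partition(theta_e):
    """Z(theta) for the toy MRF (vertex weights fixed at 0)."""
    return math.log(sum(
        math.exp(sum(theta_e[(s, t)] * x[s] * x[t] for (s, t) in edges))
        for x in itertools.product([0, 1], repeat=len(vertices))))

theta_e = {e: 1.0 for e in edges}
Z = log_partition(theta_e)

# E_theta[phi_{01}(x)] = E_theta[x_0 * x_1] under p(x; theta).
expectation = sum(
    x[0] * x[1] * math.exp(
        sum(theta_e[(s, t)] * x[s] * x[t] for (s, t) in edges) - Z)
    for x in itertools.product([0, 1], repeat=4))

# Central finite-difference estimate of dZ / dtheta_{01}.
eps = 1e-5
plus, minus = dict(theta_e), dict(theta_e)
plus[(0, 1)] += eps
minus[(0, 1)] -= eps
gradient = (log_partition(plus) - log_partition(minus)) / (2 * eps)

print(expectation, gradient)   # the two values agree (Lemma 1)
```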

Page 4:

Convex combinations

Let $\mathcal{T}(G)$ denote the set of all spanning trees of G.

Let $\theta(T)$ be an exponential parameter vector that represents a tree T, i.e. $\theta_\alpha(T) \neq 0$ only for vertices and edges of T.

Let $\rho$ be a probability distribution over $\mathcal{T}(G)$:

$$\rho(T) \ge 0 \quad \text{for all } T \in \mathcal{T}(G), \qquad \sum_{T \in \mathcal{T}(G)} \rho(T) = 1.$$
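
A sketch (toy 4-cycle again; all helper names are ours) that enumerates $\mathcal{T}(G)$ and forms the uniform distribution $\rho$ over it, together with the edge appearance probabilities $\rho_e$ that reappear later in the lecture.

```python
import itertools

vertices = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]

def is_spanning_tree(subset):
    """|V|-1 edges form a spanning tree iff they contain no cycle (union-find check)."""
    parent = {v: v for v in vertices}
    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v
    for (s, t) in subset:
        rs, rt = find(s), find(t)
        if rs == rt:
            return False
        parent[rs] = rt
    return True

trees = [T for T in itertools.combinations(edges, len(vertices) - 1)
         if is_spanning_tree(T)]
rho = {T: 1.0 / len(trees) for T in trees}   # a valid distribution: rho >= 0, sums to 1

# Edge appearance probabilities rho_e = sum_{T containing e} rho(T).
rho_e = {e: sum(p for T, p in rho.items() if e in T) for e in edges}
print(len(trees), rho_e)   # 4 spanning trees; each edge appears with probability 3/4
```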

Page 5:

Example

Take G to be the single cycle on four vertices with

$$p(x; \theta) = \exp\{ x_1 x_2 + x_2 x_3 + x_3 x_4 + x_1 x_4 - Z(\theta) \},$$

i.e. $\theta = [0, 0, 0, 0, 1, 1, 1, 1]$ (vertex weights 0, edge weights 1).

The four spanning trees $T_1, \dots, T_4$ are obtained by deleting one edge of the cycle each. Take $\rho(T_1) = \rho(T_2) = \rho(T_3) = \rho(T_4) = 1/4$ and give every surviving edge weight 4/3 in $\theta(T_i)$. Since each edge of G appears in 3 of the 4 trees, $\sum_i \rho(T_i)\, \theta(T_i) = \theta$.

[Figure: the cycle G with its edge weights, and the four spanning trees $T_1$ to $T_4$, each with weight 4/3 on its three edges and 0 on the deleted edge.]

Page 6:

Upper bound on the log partition ftn

By Jensen's inequality we obtain that

$$Z(\theta) = Z\big(E_\rho[\theta(T)]\big) \le E_\rho\big[Z(\theta(T))\big] = \sum_{T \in \mathcal{T}(G)} \rho(T)\, Z(\theta(T))$$

for all $\rho$ and $\theta(T)$ such that $E_\rho[\theta(T)] = \theta$.

Then how do we choose $\rho$ and $\theta(T)$ to minimize $E_\rho[Z(\theta(T))]$?
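
A numerical sketch of this bound on the 4-cycle example of the previous slide: $\rho$ uniform over the four spanning trees and each $\theta(T)$ putting weight 4/3 on its surviving edges, so that $E_\rho[\theta(T)] = \theta$; by Jensen's inequality, $Z(\theta) \le E_\rho[Z(\theta(T))]$.

```python
import itertools
import math

vertices = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]

def log_partition(theta_e):
    """Z for given edge weights (vertex weights 0); missing edges get weight 0."""
    return math.log(sum(
        math.exp(sum(theta_e.get(e, 0.0) * x[e[0]] * x[e[1]] for e in edges))
        for x in itertools.product([0, 1], repeat=len(vertices))))

theta = {e: 1.0 for e in edges}                                    # original parameters
trees = [tuple(e for e in edges if e != skip) for skip in edges]   # the 4 spanning trees
rho = 1.0 / len(trees)                                             # uniform rho(T) = 1/4
theta_T = [{e: 4.0 / 3.0 for e in T} for T in trees]               # weight 4/3 on surviving edges

# Check the convex-combination constraint: each edge gets (3/4) * (4/3) = 1.
average = {e: sum(rho * t.get(e, 0.0) for t in theta_T) for e in edges}

Z = log_partition(theta)
upper_bound = sum(rho * log_partition(t) for t in theta_T)
print(average)            # every edge weight averages back to 1
print(Z, upper_bound)     # Z(theta) <= sum_T rho(T) Z(theta(T))
```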

Page 7:

Upper bound on the log partition ftn

Optimizing over $\theta(T)$ with $\rho$ fixed:

$$\min_{\{\theta(T)\}} E_\rho\big[Z(\theta(T))\big] \quad \text{s.t.} \quad E_\rho[\theta(T)] = \theta.$$

Since Z is convex and the constraint is linear, the problem has a global minimum and can be solved exactly by nonlinear programming.

Note: the number of spanning trees is large. For example, Cayley's formula says that the number of spanning trees of the complete graph $K_n$ is $n^{n-2}$. Hence we will solve the dual problem, which has a smaller number of variables.
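
To see how quickly the number of spanning trees grows, here is a short sketch (not part of the lecture) that counts them via Kirchhoff's matrix-tree theorem; for the complete graph it reproduces Cayley's formula $n^{n-2}$.

```python
import numpy as np

def count_spanning_trees(n, edges):
    """Kirchhoff's matrix-tree theorem: the number of spanning trees equals
    any cofactor of the graph Laplacian L = D - A."""
    L = np.zeros((n, n))
    for (s, t) in edges:
        L[s, s] += 1
        L[t, t] += 1
        L[s, t] -= 1
        L[t, s] -= 1
    return round(np.linalg.det(L[1:, 1:]))   # delete row/column 0, take determinant

n = 6
complete = [(i, j) for i in range(n) for j in range(i + 1, n)]
print(count_spanning_trees(n, complete), n ** (n - 2))   # both are 1296
```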

Page 8:

Pseudo-marginals

Consider a set of pseudo-marginals $\tau = \{\tau_s,\ s \in V;\ \tau_{st},\ (s,t) \in E\}$.

We require the following constraints:

$$\mathrm{LOCAL}(G) = \Big\{ \tau \ge 0 \ \Big|\ \sum_j \tau_{s;j} = 1,\ \ \sum_k \tau_{st;jk} = \tau_{s;j} \Big\}.$$

If G is a tree, LOCAL(G) is a complete description of the set of valid marginals.
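
A sketch of the LOCAL(G) constraints for a hypothetical set of pseudo-marginals on the toy 4-cycle (the particular numbers are made up): nonnegativity, normalization of each $\tau_s$, and marginalization consistency of each $\tau_{st}$ with both of its endpoints.

```python
vertices = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
K = 2   # binary variables, j, k in {0, 1}

# Hypothetical pseudo-marginals tau_s[j] and tau_st[(j, k)].
tau_s = {s: [0.5, 0.5] for s in vertices}
tau_st = {e: {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.2, (1, 1): 0.3} for e in edges}

def in_local_polytope(tau_s, tau_st, tol=1e-9):
    """Check tau >= 0, sum_j tau_{s;j} = 1, and sum_k tau_{st;jk} = tau_{s;j}
    (and symmetrically for the other endpoint)."""
    for s in vertices:
        if min(tau_s[s]) < -tol or abs(sum(tau_s[s]) - 1.0) > tol:
            return False
    for (s, t) in edges:
        if min(tau_st[(s, t)].values()) < -tol:
            return False
        for j in range(K):
            if abs(sum(tau_st[(s, t)][(j, k)] for k in range(K)) - tau_s[s][j]) > tol:
                return False
            if abs(sum(tau_st[(s, t)][(k, j)] for k in range(K)) - tau_s[t][j]) > tol:
                return False
    return True

print(in_local_polytope(tau_s, tau_st))   # True: these tau lie in LOCAL(G)
```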

Page 9:

Pseudo-marginals

Let $\tau(T)$ denote the projection of $\tau$ onto the spanning tree T:

$$\tau(T) = \{\tau_s,\ s \in V;\ \tau_{st},\ (s,t) \in E(T)\}.$$

Then we can define an MRF

$$P[X = x; \tau(T)] = \prod_{s \in V} \tau_s(x_s) \prod_{(s,t) \in E(T)} \frac{\tau_{st}(x_s, x_t)}{\tau_s(x_s)\, \tau_t(x_t)}.$$
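
A sketch using the same hypothetical pseudo-marginals, with the chain 0-1-2-3 playing the role of a spanning tree T of the 4-cycle: because T is a tree and $\tau$ is locally consistent, $P[X = x; \tau(T)]$ is a valid distribution whose marginals are exactly the projected $\tau$.

```python
import itertools

K = 2
vertices = [0, 1, 2, 3]
tree_edges = [(0, 1), (1, 2), (2, 3)]     # a spanning tree T of the 4-cycle

# Projection tau(T): all node marginals, edge marginals only on E(T).
tau_s = {s: [0.5, 0.5] for s in vertices}
tau_st = {e: {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.2, (1, 1): 0.3} for e in tree_edges}

def p_tree(x):
    """P[X=x; tau(T)] = prod_s tau_s(x_s) * prod_{(s,t) in E(T)} tau_st / (tau_s * tau_t)."""
    p = 1.0
    for s in vertices:
        p *= tau_s[s][x[s]]
    for (s, t) in tree_edges:
        p *= tau_st[(s, t)][(x[s], x[t])] / (tau_s[s][x[s]] * tau_s[t][x[t]])
    return p

configs = list(itertools.product(range(K), repeat=len(vertices)))
total = sum(p_tree(x) for x in configs)
marginal_01 = {jk: sum(p_tree(x) for x in configs if (x[0], x[1]) == jk)
               for jk in tau_st[(0, 1)]}
print(total)        # 1.0: a valid distribution
print(marginal_01)  # recovers tau_01 exactly
```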

Page 10:

Lagrangian dual

Let $\{\theta^*(T)\}$ be the optimal primal solution, and let $\tau^*$ be the optimal dual solution.

Then we have that, for any tree T,

$$P[X = x;\ \theta^*(T)] = P[X = x;\ \tau^*(T)].$$

Hence $\tau^*$ fully expresses $\theta^*(T)$ for every tree T. Note that $\tau^*$ has dimension $O(|V||\chi| + |E||\chi|^2)$, which is small.

Page 11:

Optimal Upper Bound (for fixed $\rho_e$)

$$Z(\theta) \le \max_{\tau \in \mathrm{LOCAL}(G)} Q(\tau; \theta, \rho_e), \qquad Q(\tau; \theta, \rho_e) = \langle \theta, \tau \rangle + \sum_{s \in V} H_s(\tau_s) - \sum_{(s,t) \in E} \rho_{st}\, I_{st}(\tau_{st}) \qquad \cdots (1)$$

where

$$H_s(\tau_s) = -\sum_j \tau_{s;j} \log \tau_{s;j}$$

is the single-node entropy,

$$I_{st}(\tau_{st}) = \sum_{j,k} \tau_{st;jk} \log \frac{\tau_{st;jk}}{\big(\sum_{k'} \tau_{st;jk'}\big)\big(\sum_{j'} \tau_{st;j'k}\big)}$$

is the mutual information between $x_s$ and $x_t$, and $\rho_e$ is the edge appearance probability of the edge e, i.e. $\rho_e = \sum_{T \ni e} \rho(T)$.
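
A sketch of the objective $Q$ on the toy problem, reusing the hypothetical pseudo-marginals, the edge appearance probabilities $\rho_e = 3/4$ of the uniform spanning-tree distribution on the 4-cycle, and the $\theta$ of the earlier example (vertex weights 0, edge weights 1).

```python
import math

vertices = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
K = 2

tau_s = {s: [0.5, 0.5] for s in vertices}
tau_st = {e: {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.2, (1, 1): 0.3} for e in edges}
theta_s = {s: [0.0, 0.0] for s in vertices}
theta_st = {e: {(j, k): float(j * k) for j in range(K) for k in range(K)} for e in edges}
rho_e = {e: 0.75 for e in edges}   # edge appearance probabilities on the 4-cycle

def entropy(s):
    """Single-node entropy H_s(tau_s)."""
    return -sum(p * math.log(p) for p in tau_s[s] if p > 0)

def mutual_information(e):
    """I_st(tau_st) computed from the edge marginal and its row/column sums."""
    row = {j: sum(tau_st[e][(j, k)] for k in range(K)) for j in range(K)}
    col = {k: sum(tau_st[e][(j, k)] for j in range(K)) for k in range(K)}
    return sum(p * math.log(p / (row[j] * col[k]))
               for (j, k), p in tau_st[e].items() if p > 0)

def Q():
    """Q(tau; theta, rho_e) = <theta, tau> + sum_s H_s - sum_{(s,t)} rho_st I_st."""
    inner = (sum(theta_s[s][j] * tau_s[s][j] for s in vertices for j in range(K))
             + sum(theta_st[e][jk] * p for e in edges for jk, p in tau_st[e].items()))
    return (inner + sum(entropy(s) for s in vertices)
            - sum(rho_e[e] * mutual_information(e) for e in edges))

print(Q())   # maximizing this over LOCAL(G) yields the upper bound (1) on Z(theta)
```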

Page 12:

Optimal Upper Bound (for changing $\rho_e$)

Note that for a fixed $\tau$, only the edge appearance probabilities $\rho_e$ matter (not the full distribution $\rho$ over spanning trees).

$\rho$ has large dimension (the number of spanning trees of G); $\rho_e$ has small dimension (the number of edges of G).

$$Z(\theta) \le \min_{\rho_e} \max_{\tau \in \mathrm{LOCAL}(G)} Q(\tau; \theta, \rho_e) =: \min_{\rho_e} R(\theta, \rho_e),$$

where $\rho_e$ ranges over the edge appearance probabilities realizable by some distribution over $\mathcal{T}(G)$ (the spanning tree polytope).

$R(\theta, \rho_e)$ is a convex function of $\rho_e$. Use the conditional gradient method to compute the optimal $\rho_e$.

Page 13:

Tree reweighted sum-product (for fixed $\rho_e$)

Message passing implementation of the dual problem (1). Messages from vertex t to s are defined as follows:

$$M_{ts}(x_s) = \kappa \sum_{x_t'} \exp\Big( \tfrac{1}{\rho_{st}}\, \theta_{st}(x_s, x_t') + \theta_t(x_t') \Big) \cdot \frac{\prod_{v \in \Gamma(t) \setminus \{s\}} \big[M_{vt}(x_t')\big]^{\rho_{vt}}}{\big[M_{st}(x_t')\big]^{1 - \rho_{ts}}},$$

where $\kappa$ is a normalization constant and $\Gamma(t)$ denotes the neighbors of t.
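
A sketch of this update on the toy 4-cycle (binary states, $\rho_{st} = 3/4$ on every edge); the helper names are ours, and each message is normalized to sum to 1 for numerical stability, which only changes the constant $\kappa$.

```python
import math

vertices = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
K = 2
rho = {e: 0.75 for e in edges}   # edge appearance probabilities

theta_s = {s: [0.0, 0.0] for s in vertices}
theta_st = {e: {(j, k): float(j * k) for j in range(K) for k in range(K)} for e in edges}

def neighbors(u):
    return [b if a == u else a for (a, b) in edges if u in (a, b)]

def rho_uv(u, v):
    return rho[(u, v)] if (u, v) in rho else rho[(v, u)]

def theta_uv(u, v, xu, xv):
    return theta_st[(u, v)][(xu, xv)] if (u, v) in theta_st else theta_st[(v, u)][(xv, xu)]

# Messages M[(t, s)][x_s]: message from t to s, initialized uniformly.
M = {}
for (s, t) in edges:
    M[(s, t)] = [1.0] * K
    M[(t, s)] = [1.0] * K

def update_message(t, s):
    """M_ts(x_s) propto sum_{x_t} exp(theta_st(x_s,x_t)/rho_st + theta_t(x_t))
       * prod_{v in N(t), v != s} M_vt(x_t)^rho_vt / M_st(x_t)^(1 - rho_ts)."""
    new = []
    for xs in range(K):
        total = 0.0
        for xt in range(K):
            val = math.exp(theta_uv(s, t, xs, xt) / rho_uv(s, t) + theta_s[t][xt])
            for v in neighbors(t):
                if v != s:
                    val *= M[(v, t)][xt] ** rho_uv(v, t)
            val /= M[(s, t)][xt] ** (1.0 - rho_uv(s, t))
            total += val
        new.append(total)
    z = sum(new)
    return [m / z for m in new]

# A few synchronous rounds of updates.
for _ in range(20):
    M = {(t, s): update_message(t, s) for (t, s) in M}
print(M[(1, 0)])   # the fixed point defines the tree-reweighted pseudo-marginals
```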

Page 14:

Tree reweighted sum-product

The pseudo-marginals are computed by

$$\tau_s(x_s) \propto \exp\big(\theta_s(x_s)\big) \prod_{v \in \Gamma(s)} \big[M_{vs}(x_s)\big]^{\rho_{vs}},$$

$$\tau_{st}(x_s, x_t) \propto \exp\Big( \tfrac{1}{\rho_{st}}\, \theta_{st}(x_s, x_t) + \theta_s(x_s) + \theta_t(x_t) \Big) \cdot \frac{\prod_{v \in \Gamma(s) \setminus \{t\}} \big[M_{vs}(x_s)\big]^{\rho_{vs}}}{\big[M_{ts}(x_s)\big]^{1 - \rho_{ts}}} \cdot \frac{\prod_{v \in \Gamma(t) \setminus \{s\}} \big[M_{vt}(x_t)\big]^{\rho_{vt}}}{\big[M_{st}(x_t)\big]^{1 - \rho_{st}}},$$

which maximize $Q(\tau; \theta, \rho_e)$.
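
Continuing the sketch after the previous slide (it assumes the graph, weights, helpers, and messages M defined there), the pseudo-marginals up to normalization:

```python
def tau_node(s):
    """tau_s(x_s) propto exp(theta_s(x_s)) * prod_{v in N(s)} M_vs(x_s)^rho_vs."""
    vals = []
    for xs in range(K):
        v0 = math.exp(theta_s[s][xs])
        for v in neighbors(s):
            v0 *= M[(v, s)][xs] ** rho_uv(v, s)
        vals.append(v0)
    z = sum(vals)
    return [v / z for v in vals]

def tau_edge(s, t):
    """tau_st(x_s, x_t) propto exp(theta_st/rho_st + theta_s + theta_t)
       * prod_{v in N(s), v != t} M_vs(x_s)^rho_vs / M_ts(x_s)^(1 - rho_ts)
       * prod_{v in N(t), v != s} M_vt(x_t)^rho_vt / M_st(x_t)^(1 - rho_st)."""
    vals = {}
    for xs in range(K):
        for xt in range(K):
            v0 = math.exp(theta_uv(s, t, xs, xt) / rho_uv(s, t)
                          + theta_s[s][xs] + theta_s[t][xt])
            for v in neighbors(s):
                if v != t:
                    v0 *= M[(v, s)][xs] ** rho_uv(v, s)
            v0 /= M[(t, s)][xs] ** (1.0 - rho_uv(s, t))
            for v in neighbors(t):
                if v != s:
                    v0 *= M[(v, t)][xt] ** rho_uv(v, t)
            v0 /= M[(s, t)][xt] ** (1.0 - rho_uv(s, t))
            vals[(xs, xt)] = v0
    z = sum(vals.values())
    return {jk: v / z for jk, v in vals.items()}

print(tau_node(0))
print(tau_edge(0, 1))   # at a fixed point these tau maximize Q(tau; theta, rho_e)
```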

Page 15:

How the messages are defined

The Lagrangian associated with (1) is

$$L(\tau, \lambda; \theta, \rho_e) = Q(\tau; \theta, \rho_e) + \sum_{(s,t) \in E} \Big\{ \sum_{x_s} \lambda_{ts}(x_s)\, C_{ts}(x_s) + \sum_{x_t} \lambda_{st}(x_t)\, C_{st}(x_t) \Big\},$$

where

$$C_{ts}(x_s) := \sum_{x_t} \tau_{st}(x_s, x_t) - \tau_s(x_s).$$

Take derivatives w.r.t. $\tau_s(x_s)$ and $\tau_{st}(x_s, x_t)$ to obtain relations (that are used in the message update).

Then define the messages via

$$\log M_{ts}(x_s) := \rho_{ts}^{-1}\, \lambda_{ts}(x_s).$$

Page 16:

Self Avoiding Walk tree

Comparison with computation tree

Page 17:

Self Avoiding Walk tree

Theorem. Consider any binary pairwise MRF on a graph G = (V, E). For any vertex v, the marginal probability computed at the root node of $T_{\mathrm{saw}}(v)$ is equal to the marginal probability of v in the original MRF.

The same theorem holds for MAP, i.e. for

$$q_v(a) := \max_{x \in \{0,1\}^n,\ x_v = a} P[X = x].$$

Hence $T_{\mathrm{saw}}$ can be used to compute exact marginals and MAP assignments for graphs with a small number of cycles.
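
A structural sketch, not the full construction used in the lecture: it builds $T_{\mathrm{saw}}(v)$ for the toy 4-cycle by enumerating the self-avoiding walks from v, marking as leaves the walks that revisit a vertex (close a cycle). The exactness theorem additionally requires clamping those cycle-closing leaves to fixed values according to an edge ordering, which this sketch omits.

```python
vertices = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]

def neighbors(u):
    return [b if a == u else a for (a, b) in edges if u in (a, b)]

def saw_tree(root):
    """Return Tsaw(root) as a list of (walk, parent_index, closes_cycle).
    Nodes are self-avoiding walks from root; a walk whose last vertex was
    already visited is kept as a leaf (it closes a cycle) and not extended."""
    tree = []
    stack = [((root,), None)]
    while stack:
        walk, parent = stack.pop()
        closes_cycle = walk[-1] in walk[:-1]
        index = len(tree)
        tree.append((walk, parent, closes_cycle))
        if closes_cycle:
            continue
        for u in neighbors(walk[-1]):
            if len(walk) == 1 or u != walk[-2]:   # do not walk straight back (assumption of this sketch)
                stack.append((walk + (u,), index))
    return tree

for walk, parent, leaf in saw_tree(0):
    print(" -> ".join(map(str, walk)), "(cycle-closing leaf)" if leaf else "")
```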