Stat 375: Inference in Graphical Models
Lectures 7-8-9

Andrea Montanari

Stanford University

May 6, 2012

Andrea Montanari (Stanford) Stat375: Lecture 7, 8, 9 May 6, 2012 1 / 83

Variational methods

Idea

I know a lot about (convex) optimization...

Outline

1 Gibbs Variational Principle

2 Naive Mean Field

3 Bethe Free Energy

4 Region-Based Approximation

5 Tree-based Convexifications

Undirected Pairwise Graphical Model

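[Figure: an undirected graph on the variables $x_1, \dots, x_{12}$.]

$G = (V, E)$, $V = [n]$, $x = (x_1, \dots, x_n)$, $x_i \in \mathcal{X}$, $|\mathcal{X}| < \infty$

\[ \mu(x) = \frac{1}{Z} \prod_{(i,j) \in E} \psi_{ij}(x_i, x_j) . \]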

Notation

'Actual' probability $\longrightarrow \mu$

'Trial' probability ('belief') $\longrightarrow b$

Gibbs Variational Principle

I want to compute

\[ \Phi \equiv \log Z \]

Gibbs Free Energy

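\[ \mathbb{G} : \mathcal{M}(\mathcal{X}^V) \to \mathbb{R} , \qquad b \mapsto \mathbb{G}(b) , \]

where $\mathcal{M}(\mathcal{X}^V)$ is the set of probability distributions on $\mathcal{X}^V$.

Proposition

$\mathbb{G}$ is strictly concave, and achieves its unique maximum at $b = \mu$.

Further

\[ \mathbb{G}(b = \mu) = \Phi . \]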

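Gibbs Free Energy

\[ \mu(x) = \frac{1}{Z} \prod_{(i,j) \in E} \psi_{ij}(x_i, x_j) \equiv \frac{1}{Z}\, \psi_{\rm tot}(x) . \]

Definition

\[ \mathbb{G}(b) = \mathbb{E}_b \log \psi_{\rm tot}(x) + H(b) \]

\[ = \sum_{(i,j) \in E} \sum_{x_i, x_j \in \mathcal{X}} b(x_i, x_j) \log \psi_{ij}(x_i, x_j) - \sum_{x \in \mathcal{X}^V} b(x) \log b(x) \]

Here $b(x_i, x_j)$ denotes the marginal of $b$ on the pair $(x_i, x_j)$.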
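The definition can be checked numerically on a model small enough to enumerate. The following is a minimal sketch, not part of the lecture: the 4-cycle, the random potentials and all function names are illustrative assumptions. It evaluates the Gibbs free energy at the true distribution and at a random trial distribution, and compares both to log Z.

```python
import itertools
import numpy as np

def psi_tot(x, edges, psi):
    """Product of the edge potentials psi[(i, j)][x_i, x_j] over all edges."""
    return np.prod([psi[(i, j)][x[i], x[j]] for (i, j) in edges])

def gibbs_free_energy(b, weights):
    """E_b log psi_tot + H(b), with b a vector over all configurations."""
    mask = b > 0  # convention 0 log 0 = 0
    return float(np.sum(b[mask] * (np.log(weights[mask]) - np.log(b[mask]))))

# Hypothetical toy model: binary variables on a 4-cycle, random positive potentials.
rng = np.random.default_rng(0)
n, q = 4, 2
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
psi = {e: rng.uniform(0.5, 2.0, size=(q, q)) for e in edges}

configs = list(itertools.product(range(q), repeat=n))
weights = np.array([psi_tot(x, edges, psi) for x in configs])
Z = weights.sum()
mu = weights / Z

log_Z = float(np.log(Z))
G_mu = gibbs_free_energy(mu, weights)        # value at the true distribution
b_rand = rng.dirichlet(np.ones(len(configs)))
G_rand = gibbs_free_energy(b_rand, weights)  # value at an arbitrary trial b
```

Under these assumptions `G_mu` should agree with `log_Z` up to rounding, while `G_rand` should fall below it, since the gap is the Kullback-Leibler divergence between the trial distribution and the true one.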

Proof

1. Define the Lagrangian

\[ \mathcal{L}(b, \lambda) = \mathbb{G}(b) - \lambda \Big\{ \sum_{x \in \mathcal{X}^V} b(x) - 1 \Big\} \]

and differentiate.

2. Observe that $z \mapsto z \log z$ is convex on $\mathbb{R}_+$.
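The two steps can be spelled out as follows. This is a sketch of the computation the slide leaves to the reader, assuming $\psi_{\rm tot}(x) > 0$ for all $x$ so that the stationary point lies in the interior of the simplex.

\[
\frac{\partial \mathcal{L}}{\partial b(x)} = \log \psi_{\rm tot}(x) - \log b(x) - 1 - \lambda = 0
\;\Longrightarrow\;
b(x) = e^{-1-\lambda}\, \psi_{\rm tot}(x)
\;\Longrightarrow\;
b(x) = \frac{\psi_{\rm tot}(x)}{Z} = \mu(x) ,
\]

where the last step uses the normalization $\sum_x b(x) = 1$. Substituting back,

\[
\mathbb{G}(\mu) = \sum_x \mu(x) \log \psi_{\rm tot}(x) - \sum_x \mu(x) \log \frac{\psi_{\rm tot}(x)}{Z} = \log Z = \Phi .
\]

The energy term is linear in $b$ and $-\sum_x b(x) \log b(x)$ is strictly concave because $z \mapsto z \log z$ is strictly convex, so the stationary point is the unique maximum.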

Naive Mean Field

Good news/Bad news

Counting = Convex Optimization

$\mathcal{M}(\mathcal{X}^V)$ is $(|\mathcal{X}|^{|V|} - 1)$-dimensional.

Idea

[Figure: a low-dimensional subset $S$ of $\mathcal{M}(\mathcal{X}^V)$.]

Maximize Gibbs free energy on a low-dim subset

\[ \Phi \ge \sup_{b \in S} \mathbb{G}(b) . \]

Naive Mean Field Idea

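\[ S = \big\{ b \in \mathcal{M}(\mathcal{X}^n) : \; b = b_1 \times b_2 \times \cdots \times b_n \big\} , \]

Abuse of notation: $b \equiv \{b_i\}_{i \in V}$

\[ \mathbb{F}_{\rm MF}(b = \{b_i\}_{i \in V}) = \mathbb{G}(b_1 \times \cdots \times b_n) \]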

Explicitly

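\[ \mathbb{F}_{\rm MF}(b) = \sum_{(i,j) \in E} \mathbb{E}_{b_i \times b_j} \log \psi_{ij}(x_i, x_j) + H(b_1 \times \cdots \times b_n) \]

\[ = \sum_{(i,j) \in E} \sum_{x_i, x_j} b_i(x_i)\, b_j(x_j) \log \psi_{ij}(x_i, x_j) - \sum_{i \in V} \sum_{x_i} b_i(x_i) \log b_i(x_i) \]

Problem: not a convex optimization problem ($\mathbb{F}_{\rm MF}$ is not jointly concave in $\{b_i\}_{i \in V}$ in general).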

Stationarity condition

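Lagrangian

\[ \mathcal{L}(b, \lambda) = \mathbb{F}_{\rm MF}(b) + \sum_{i \in V} \lambda_i \Big\{ \sum_{x_i \in \mathcal{X}} b_i(x_i) - 1 \Big\} , \]

\[ \frac{\partial \mathcal{L}}{\partial b_i(x_i)}(b, \lambda) = \sum_{j \in \partial i} \sum_{x_j \in \mathcal{X}} b_j(x_j) \log \psi_{ij}(x_i, x_j) - 1 - \log b_i(x_i) + \lambda_i = 0 , \]

\[ b_i(x_i) \cong \exp\Big\{ \sum_{j \in \partial i} \sum_{x_j} \log \psi_{ij}(x_i, x_j)\, b_j(x_j) \Big\} \equiv \mathrm{F}_{\rm MF}(b)_i(x_i) , \]

[Naive Mean Field Equations]

Here $\cong$ denotes equality up to a normalization constant, and $\mathrm{F}_{\rm MF}$ is the update map (not the free energy $\mathbb{F}_{\rm MF}$).

Typically a fixed point is searched by iteration: $b^{(t+1)} = \mathrm{F}_{\rm MF}(b^{(t)})$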
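The iteration can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the lecture's code: the 4-cycle, the weak random couplings (chosen so that the parallel update is a contraction) and the names `mean_field_update` and `mean_field_free_energy` are all hypothetical.

```python
import itertools
import numpy as np

def mean_field_update(b, log_psi, neighbors):
    """One parallel sweep of
    b_i(x_i) ∝ exp{ sum_{j in di} sum_{x_j} log psi_ij(x_i, x_j) b_j(x_j) }."""
    new_b = np.empty_like(b)
    for i in range(b.shape[0]):
        field = np.zeros(b.shape[1])
        for j in neighbors[i]:
            field += log_psi[(i, j)] @ b[j]  # log_psi[(i, j)][x_i, x_j]
        field -= field.max()                 # numerical stability only
        new_b[i] = np.exp(field)
        new_b[i] /= new_b[i].sum()
    return new_b

# Hypothetical toy model: 4-cycle, binary variables, weak random couplings.
rng = np.random.default_rng(1)
n, q = 4, 2
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
log_psi, neighbors = {}, {i: [] for i in range(n)}
for (i, j) in edges:
    L = rng.uniform(-0.3, 0.3, size=(q, q))
    log_psi[(i, j)], log_psi[(j, i)] = L, L.T
    neighbors[i].append(j)
    neighbors[j].append(i)

def mean_field_free_energy(b):
    """Energy under the product measure plus the sum of single-site entropies."""
    energy = sum(b[i] @ log_psi[(i, j)] @ b[j] for (i, j) in edges)
    entropy = -np.sum(b * np.log(b))
    return float(energy + entropy)

b = rng.dirichlet(np.ones(q), size=n)  # random initial beliefs, one row per vertex
for t in range(200):
    b = mean_field_update(b, log_psi, neighbors)

residual = float(np.max(np.abs(mean_field_update(b, log_psi, neighbors) - b)))
F_mf = mean_field_free_energy(b)
log_Z = float(np.log(sum(
    np.exp(sum(log_psi[(i, j)][x[i], x[j]] for (i, j) in edges))
    for x in itertools.product(range(q), repeat=n))))
```

With couplings this weak the residual should be negligible after 200 sweeps, and `F_mf` should not exceed the brute-force `log_Z`, in line with the bound obtained by maximizing over a subset. For stronger couplings the parallel iteration may oscillate, and damping or sequential updates are common remedies.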

Another point of view

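Exercise: For $\psi_{ij}(x_i, x_j) = e^{\theta_{ij}(x_i, x_j)}$

\[ \mu_i(x_i) \cong \mathbb{E}_\mu \left\{ \frac{ e^{\sum_{j \in \partial i} \theta_{ij}(x_i, X_j)} }{ \sum_{x_i'} e^{\sum_{j \in \partial i} \theta_{ij}(x_i', X_j)} } \right\} \]

If we could move the expectation to exponents

\[ \mu_i(x_i) \cong e^{\mathbb{E}_\mu \sum_{j \in \partial i} \theta_{ij}(x_i, X_j)} \]

[= Naive Mean Field]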
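To see why the last expression is the naive mean field equation, one can expand the expectation in the exponent. The following short computation is a sketch added for the reader, using only the one-variable marginals $\mu_j$ of $\mu$:

\[
\mathbb{E}_\mu \sum_{j \in \partial i} \theta_{ij}(x_i, X_j)
= \sum_{j \in \partial i} \sum_{x_j} \mu_j(x_j)\, \theta_{ij}(x_i, x_j)
= \sum_{j \in \partial i} \sum_{x_j} \mu_j(x_j) \log \psi_{ij}(x_i, x_j) .
\]

This is the exponent of the naive mean field equations with the beliefs $b_j$ replaced by the marginals $\mu_j$.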

Bethe Free Energy

Problem

One-dimensional marginals give a very poor approximation.

Example: $x_1, x_2 \in \{0, 1\}$

\[ \mu(x) = \frac{1}{2}\, \mathbb{I}(x_1 \oplus x_2 = 0) \]

Would like to account exactly for the correlations induced by edges.
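A quick numerical look at this example makes the point concrete. The sketch below takes the product of the two one-variable marginals as the product-form approximation (one natural choice, assumed here for illustration) and measures how much mass it puts on configurations that the true distribution forbids.

```python
import numpy as np

# mu(x1, x2) = 1/2 * I(x1 XOR x2 = 0): the two bits are always equal.
mu = np.array([[0.5, 0.0],
               [0.0, 0.5]])   # rows: x1, columns: x2

mu1 = mu.sum(axis=1)            # marginal of x1
mu2 = mu.sum(axis=0)            # marginal of x2
product = np.outer(mu1, mu2)    # product of the one-dimensional marginals

# Mass the product places on x1 != x2, which has probability zero under mu.
forbidden_mass = float(product[0, 1] + product[1, 0])

# Mutual information between x1 and x2: the correlation a product form discards.
mask = mu > 0
mutual_information = float(np.sum(mu[mask] * np.log(mu[mask] / product[mask])))
```

Both marginals are uniform, so the product is uniform on all four configurations: half of its mass sits on forbidden configurations, and the discarded mutual information is log 2, the largest possible value for two bits.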

Would like

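\[ \mathbb{F} : \mathcal{M}(\mathcal{X} \times \mathcal{X})^E \times \mathcal{M}(\mathcal{X})^V \to \mathbb{R} , \qquad b = \{b_{ij}, b_i\}_{(i,j) \in E,\, i \in V} \mapsto \mathbb{F}(b) , \]

$b_{ij} = b_{ij}(x_i, x_j)$, $b_i = b_i(x_i)$,

such that

\[ \arg\max_b \mathbb{F}(b) \approx \mu , \]

\[ \max_b \mathbb{F}(b) \approx \Phi . \]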

What is really the domain?

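$x_1, x_2 \in \{0, 1\}$

Example:

\[ b_1 = \begin{bmatrix} 0.1 \\ 0.9 \end{bmatrix} , \quad b_2 = \begin{bmatrix} 0.9 \\ 0.1 \end{bmatrix} , \quad b_{12} = \begin{bmatrix} 0.4 & 0.1 \\ 0.1 & 0.4 \end{bmatrix} . \]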
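The example shows that not every collection of one- and two-variable distributions is sensible. A minimal check, reading the 2x2 array as the pairwise belief on $(x_1, x_2)$ with rows indexed by $x_1$ (an assumption about the slide's layout):

```python
import numpy as np

b1 = np.array([0.1, 0.9])
b2 = np.array([0.9, 0.1])
b12 = np.array([[0.4, 0.1],
                [0.1, 0.4]])     # rows: x1, columns: x2

row_marginal = b12.sum(axis=1)   # sum over x2, to be compared with b1
col_marginal = b12.sum(axis=0)   # sum over x1, to be compared with b2

consistent = bool(np.allclose(row_marginal, b1) and np.allclose(col_marginal, b2))
```

The pairwise table marginalizes to the uniform distribution on both variables, which matches neither `b1` nor `b2`, so `consistent` is false: the three objects cannot be marginals of a single joint distribution.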

Natural guess

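\[ b \in \mathrm{MARG}(G) \]

\[ \mathrm{MARG}(G) = \big\{ b = \{b_i, b_{ij}\} : \text{marginals of a distribution on } \mathcal{X}^V \big\} , \]

\[ b_i(x_i) = \sum_{x_{V \setminus i}} p(x) , \]

\[ b_{ij}(x_i, x_j) = \sum_{x_{V \setminus \{i,j\}}} p(x) , \]

\[ p \in \mathcal{M}(\mathcal{X}^V) . \]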

Bad news

In general, checking $b \in \mathrm{MARG}(G)$ is NP-hard.

Second attempt

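\[ b \in \mathrm{LOC}(G) \]

\[ \mathrm{LOC}(G) = \big\{ b = \{b_i, b_{ij}\} : \text{locally consistent marginals} \big\} \]

\[ = \Big\{ \{b_i, b_{ij}\} : \; \sum_{x_j} b_{ij}(x_i, x_j) = b_i(x_i) , \;\; \sum_{x_i} b_i(x_i) = 1 \Big\} \]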

Geometric picture

[Figure: the polytopes $\mathrm{LOC}(G)$ and $\mathrm{MARG}(G)$, with $\mathrm{MARG}(G) \subseteq \mathrm{LOC}(G)$.]
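The inclusion can be strict on graphs with cycles. A standard illustration, sketched here under the assumption of binary variables on a triangle (the beliefs and the helper `locally_consistent` are hypothetical), takes uniform node beliefs and edge beliefs that force neighbours to disagree.

```python
import itertools
import numpy as np

edges = [(0, 1), (1, 2), (0, 2)]   # a triangle
b_node = {i: np.array([0.5, 0.5]) for i in range(3)}
b_edge = {e: np.array([[0.0, 0.5],
                       [0.5, 0.0]]) for e in edges}   # x_i != x_j with probability 1

def locally_consistent(b_node, b_edge):
    """Check non-negativity, edge-to-node marginalization and node normalization."""
    for (i, j), bij in b_edge.items():
        if np.any(bij < 0):
            return False
        if not np.allclose(bij.sum(axis=1), b_node[i]):
            return False
        if not np.allclose(bij.sum(axis=0), b_node[j]):
            return False
    return all(np.isclose(bi.sum(), 1.0) for bi in b_node.values())

in_loc = locally_consistent(b_node, b_edge)

# A joint p with these pairwise marginals must vanish on every configuration x
# with b_ij(x_i, x_j) = 0 for some edge. Collect the configurations that survive.
allowed = [x for x in itertools.product(range(2), repeat=3)
           if all(b_edge[(i, j)][x[i], x[j]] > 0 for (i, j) in edges)]
```

The beliefs pass the local consistency check, yet no configuration of three bits has all three pairs unequal, so `allowed` is empty: any joint distribution with these marginals would have to vanish everywhere, and the point lies in LOC(G) but not in MARG(G).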

Bethe Free Energy

\[ \mathbb{F} : \mathrm{LOC}(G) \to \mathbb{R} \]

Intuition

\[ \mathbb{F}(b) \approx \mathbb{E}_b \log \psi_{\rm tot}(x) + H(b) \approx \text{Energy} + \text{Entropy} . \]

Energy

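\[ \mathbb{E}_b \log \psi_{\rm tot}(x) = \sum_{(i,j) \in E} \mathbb{E}_b \log \psi_{ij}(x_i, x_j) \]

\[ = \sum_{(i,j) \in E} \mathbb{E}_{b_{ij}} \log \psi_{ij}(x_i, x_j) \]

\[ = \sum_{(i,j) \in E} \sum_{x_i, x_j} b_{ij}(x_i, x_j) \log \psi_{ij}(x_i, x_j) \]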

Entropy?

Idea: Consider $G$ a tree.

Proposition

If $G$ is a tree, then

\[ \mu(x) = \prod_{(i,j) \in E} \frac{\mu_{ij}(x_i, x_j)}{\mu_i(x_i)\, \mu_j(x_j)} \prod_{i \in V} \mu_i(x_i) . \]

Corollary

If $G$ is a tree, then

\[ H(\mu) = \sum_{i \in V} H(\mu_i) - \sum_{(i,j) \in E} I(\mu_{ij}) . \]
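Both statements are easy to verify by brute force on a small tree. The sketch below uses a hypothetical 3-node chain with random positive potentials; the helper names are illustrative, not from the lecture.

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
q = 2
edges = [(0, 1), (1, 2)]   # hypothetical 3-node chain, a tree
psi = {e: rng.uniform(0.5, 2.0, size=(q, q)) for e in edges}

# Joint distribution by enumeration, stored as a q x q x q array.
mu = np.zeros((q, q, q))
for x in itertools.product(range(q), repeat=3):
    mu[x] = np.prod([psi[(i, j)][x[i], x[j]] for (i, j) in edges])
mu /= mu.sum()

def marginal(p, keep):
    """Sum out every axis of p that is not listed in keep."""
    drop = tuple(k for k in range(p.ndim) if k not in keep)
    return p.sum(axis=drop)

def entropy(p):
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def mutual_information(pij):
    return entropy(pij.sum(axis=1)) + entropy(pij.sum(axis=0)) - entropy(pij)

mu_i = {i: marginal(mu, (i,)) for i in range(3)}
mu_ij = {e: marginal(mu, e) for e in edges}

# Rebuild the joint from its one- and two-variable marginals.
recon = np.zeros_like(mu)
for x in itertools.product(range(q), repeat=3):
    val = np.prod([mu_i[i][x[i]] for i in range(3)])
    for (i, j) in edges:
        val *= mu_ij[(i, j)][x[i], x[j]] / (mu_i[i][x[i]] * mu_i[j][x[j]])
    recon[x] = val

H_exact = entropy(mu)
H_tree = (sum(entropy(mu_i[i]) for i in range(3))
          - sum(mutual_information(mu_ij[e]) for e in edges))
```

On this chain `recon` should reproduce `mu` entry by entry and `H_tree` should match `H_exact`. Adding the edge (0, 2) to close the cycle would, in general, break both identities.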

Mutual Information

\[ I(\mu_{1,2}) = \sum_{x_1, x_2} \mu_{1,2}(x_1, x_2) \log \frac{\mu_{1,2}(x_1, x_2)}{\mu_1(x_1)\, \mu_2(x_2)} \]

\[ = H(\mu_1) + H(\mu_2) - H(\mu_{1,2}) . \]

Proof of the proposition

By induction over $n$.

$n = 1$: Trivial

Proof of the proposition

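True for $n$, $V = [n]$, new vertex $i = n+1$, connected to $j = n$

\[ \mu(x_V, x_{n+1}) = \mu(x_V)\, \mu(x_{n+1} \mid x_V) \]

\[ = \mu(x_V)\, \mu(x_{n+1} \mid x_n) \qquad \text{[Markov]} \]

\[ = \mu(x_V)\, \frac{\mu(x_n, x_{n+1})}{\mu(x_n)\, \mu(x_{n+1})}\, \mu(x_{n+1}) \qquad \text{[Bayes]} \]

\[ = \prod_{(i,j) \neq (n,n+1)} \frac{\mu_{ij}(x_i, x_j)}{\mu_i(x_i)\, \mu_j(x_j)} \prod_{i \in V} \mu_i(x_i) \; \frac{\mu(x_n, x_{n+1})}{\mu(x_n)\, \mu(x_{n+1})}\, \mu(x_{n+1}) \]

QED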

Let's just export it

\[ H_{\rm Bethe} : \mathrm{LOC}(G) \to \mathbb{R} : \]

\[ H_{\rm Bethe}(b) = \sum_{i \in V} H(b_i) - \sum_{(i,j) \in E} I(b_{ij}) . \]

Putting everything together

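\[ \mathbb{F}(b) = \mathbb{E}_b \log \psi_{\rm tot}(x) + H_{\rm Bethe}(b) \]

\[ = \sum_{(i,j) \in E} \mathbb{E}_{b_{ij}} \log \psi_{ij}(x_i, x_j) - \sum_{(i,j) \in E} I(b_{ij}) + \sum_{i \in V} H(b_i) \]

\[ = \sum_{(i,j) \in E} \sum_{x_i, x_j} b_{ij}(x_i, x_j) \log \psi_{ij}(x_i, x_j) - \sum_{(i,j) \in E} \sum_{x_i, x_j} b_{ij}(x_i, x_j) \log \frac{b_{ij}(x_i, x_j)}{b_i(x_i)\, b_j(x_j)} - \sum_{i \in V} \sum_{x_i} b_i(x_i) \log b_i(x_i) \]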

Putting everything together

\[ \mathbb{F}(b) = \sum_{(i,j) \in E} \sum_{x_i, x_j} b_{ij}(x_i, x_j) \log \psi_{ij}(x_i, x_j) - \sum_{(i,j) \in E} \sum_{x_i, x_j} b_{ij}(x_i, x_j) \log b_{ij}(x_i, x_j) - \sum_{i \in V} (1 - \deg(i)) \sum_{x_i} b_i(x_i) \log b_i(x_i) \]
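The final expression translates directly into code. The following sketch assumes strictly positive beliefs and potentials, and checks the formula on a hypothetical star-shaped tree, where the Bethe free energy evaluated at the true marginals should equal log Z; the function and variable names are illustrative.

```python
import itertools
import numpy as np

def bethe_free_energy(b_node, b_edge, psi, edges):
    """sum_ij sum b_ij log psi_ij - sum_ij sum b_ij log b_ij
       - sum_i (1 - deg(i)) sum b_i log b_i   (all entries assumed > 0)."""
    deg = {i: 0 for i in b_node}
    total = 0.0
    for (i, j) in edges:
        deg[i] += 1
        deg[j] += 1
        bij = b_edge[(i, j)]
        total += np.sum(bij * np.log(psi[(i, j)])) - np.sum(bij * np.log(bij))
    for i, bi in b_node.items():
        total -= (1 - deg[i]) * np.sum(bi * np.log(bi))
    return float(total)

# Hypothetical toy model: a star with centre 0 and three leaves.
rng = np.random.default_rng(3)
n, q = 4, 2
edges = [(0, 1), (0, 2), (0, 3)]
psi = {e: rng.uniform(0.5, 2.0, size=(q, q)) for e in edges}

weights = np.zeros((q,) * n)
for x in itertools.product(range(q), repeat=n):
    weights[x] = np.prod([psi[(i, j)][x[i], x[j]] for (i, j) in edges])
log_Z = float(np.log(weights.sum()))
mu = weights / weights.sum()

def marginal(p, keep):
    """Sum out every axis of p that is not listed in keep."""
    drop = tuple(k for k in range(p.ndim) if k not in keep)
    return p.sum(axis=drop)

b_node = {i: marginal(mu, (i,)) for i in range(n)}
b_edge = {e: marginal(mu, e) for e in edges}
F_tree = bethe_free_energy(b_node, b_edge, psi, edges)
```

On a tree the Bethe entropy of the true marginals is the exact entropy, so `F_tree` should coincide with `log_Z` up to rounding. On a graph with cycles the same function still evaluates, but the value at the true marginals is then only an approximation.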

Problem

Want to maximize $\mathbb{F}(b)$.

$\mathbb{F}(b)$ is not concave.

Remark

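Assume $\nu$ a BP fixed point

\[ \nu_{i \to j}(x_i) \cong \prod_{k \in \partial i \setminus j} \Big\{ \sum_{x_k \in \mathcal{X}} \psi_{ik}(x_i, x_k)\, \nu_{k \to i}(x_k) \Big\} . \]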

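Define (as you would do on a tree)

\[ b_i(x_i) \cong \prod_{k \in \partial i} \Big\{ \sum_{x_k \in \mathcal{X}} \psi_{ik}(x_i, x_k)\, \nu_{k \to i}(x_k) \Big\} , \]

\[ b_{ij}(x_i, x_j) \cong \nu_{i \to j}(x_i)\, \psi_{ij}(x_i, x_j)\, \nu_{j \to i}(x_j) . \]

Lemma

With these definitions, $b \in \mathrm{LOC}(G)$.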
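The lemma can be observed numerically even on a graph with a cycle. The sketch below runs belief propagation on a hypothetical 4-cycle with random positive potentials, writes the messages as `nu`, takes the node belief as the product over all neighbours of the incoming terms, and measures the worst violation of local consistency; all names and the model are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
n, q = 4, 2
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]   # hypothetical 4-cycle
psi, neighbors = {}, {i: [] for i in range(n)}
for (i, j) in edges:
    P = rng.uniform(0.5, 2.0, size=(q, q))
    psi[(i, j)], psi[(j, i)] = P, P.T      # psi[(i, j)][x_i, x_j]
    neighbors[i].append(j)
    neighbors[j].append(i)

def bp_update(nu):
    """nu[(i, j)] is the message from i to j, a function of x_i.
    Returns the updated messages, each normalized to sum to one."""
    new = {}
    for (i, j) in nu:
        msg = np.ones(q)
        for k in neighbors[i]:
            if k != j:
                msg *= psi[(i, k)] @ nu[(k, i)]   # sum over x_k
        new[(i, j)] = msg / msg.sum()
    return new

nu = {(i, j): np.full(q, 1.0 / q) for i in range(n) for j in neighbors[i]}
for t in range(200):
    nu = bp_update(nu)

# Beliefs built from the (approximately) fixed-point messages.
b_node, b_edge = {}, {}
for i in range(n):
    bi = np.ones(q)
    for k in neighbors[i]:
        bi *= psi[(i, k)] @ nu[(k, i)]
    b_node[i] = bi / bi.sum()
for (i, j) in edges:
    bij = nu[(i, j)][:, None] * psi[(i, j)] * nu[(j, i)][None, :]
    b_edge[(i, j)] = bij / bij.sum()

max_violation = max(
    max(np.max(np.abs(b_edge[(i, j)].sum(axis=1) - b_node[i])),
        np.max(np.abs(b_edge[(i, j)].sum(axis=0) - b_node[j])))
    for (i, j) in edges)
```

On a single cycle with positive potentials the messages settle quickly, and `max_violation` should be at the level of rounding error. On general loopy graphs the iteration is not guaranteed to converge, and the lemma applies only once a fixed point has been reached.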

Stationarity condition

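Lagrangian

\[ \mathcal{L}(b, \lambda) = \mathbb{F}(b) - \sum_{i \in V} \lambda_i \Big\{ \sum_{x_i} b_i(x_i) - 1 \Big\} - \sum_{(i,j) \in \vec{E}} \sum_{x_i} \lambda_{i \to j}(x_i) \Big\{ \sum_{x_j} b_{ij}(x_i, x_j) - b_i(x_i) \Big\} \]

where $\vec{E}$ is the set of directed edges.

\[ \nabla_{b_{ij}} \mathcal{L}(b, \lambda) = -1 - \log b_{ij}(x_i, x_j) + \log \psi_{ij}(x_i, x_j) - \lambda_{i \to j}(x_i) - \lambda_{j \to i}(x_j) , \]

\[ \nabla_{b_i} \mathcal{L}(b, \lambda) = -(1 - \deg(i)) \log[\, b_i(x_i)\, e \,] - \lambda_i + \sum_{j \in \partial i} \lambda_{i \to j}(x_i) \]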

Andrea Montanari (Stanford) Stat375: Lecture 7, 8, 9 May 6, 2012 39 / 83

Stationarity condition

$$b_{ij}(x_i,x_j) = \psi_{ij}(x_i,x_j) \exp\big\{ -1 - \lambda_{i\to j}(x_i) - \lambda_{j\to i}(x_j) \big\},$$

$$b_i(x_i) \cong \exp\Big\{ -\frac{1}{\deg(i)-1} \sum_{j\in\partial i} \lambda_{i\to j}(x_i) \Big\},$$

$$\sum_{x_j} b_{ij}(x_i,x_j) = b_i(x_i).$$

Did you recognize this?

Defining

$$\nu_{i\to j}(x_i) \cong e^{-\lambda_{i\to j}(x_i)},$$

we get

$$b_i(x_i) \cong \prod_{k\in\partial i} \Big\{ \sum_{x_k\in\mathcal{X}} \psi_{ik}(x_i,x_k)\, \nu_{k\to i}(x_k) \Big\},$$

$$b_{ij}(x_i,x_j) \cong \nu_{i\to j}(x_i)\, \psi_{ij}(x_i,x_j)\, \nu_{j\to i}(x_j),$$

+ local consistency.
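One way to fill in the step from the stationarity conditions to the BP equations is sketched below. The shorthand $m_{j\to i}(x_i) \equiv \sum_{x_j} \psi_{ij}(x_i,x_j)\,\nu_{j\to i}(x_j)$ is introduced here for convenience and is not notation from the slides; the argument assumes strictly positive potentials and $\deg(i) \ge 2$.

```latex
\begin{align*}
b_i(x_i) &= \sum_{x_j} b_{ij}(x_i,x_j) \cong \nu_{i\to j}(x_i)\, m_{j\to i}(x_i)
  && \text{(local consistency, for each } j\in\partial i) \\
b_i(x_i)^{\deg(i)} &\cong \prod_{j\in\partial i} \nu_{i\to j}(x_i)\, \prod_{j\in\partial i} m_{j\to i}(x_i)
  && \text{(product over } j\in\partial i) \\
b_i(x_i)^{\deg(i)-1} &\cong \prod_{j\in\partial i} \nu_{i\to j}(x_i)
  && \text{(stationarity in } b_i) \\
b_i(x_i) &\cong \prod_{k\in\partial i} m_{k\to i}(x_i),
  \qquad
  \nu_{i\to j}(x_i) \cong \frac{b_i(x_i)}{m_{j\to i}(x_i)}
  = \prod_{k\in\partial i\setminus j} m_{k\to i}(x_i).
\end{align*}
```

The last identity is the BP fixed-point equation, so the dual parameters at a stationary point define BP messages.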

We proved the following

Theorem (Yedidia, Freeman, Weiss, 2003)

Fixed points of BP are in one-to-one correspondence with stationary points of the Bethe free energy.

Fixed-point messages are (exponentials of) the dual parameters at the fixed point.

Uses

- Alternative algorithms to find fixed points (e.g. gradient ascent). [e.g. Heskes 2002]
- Include higher-order marginals. [Yedidia, Freeman, Weiss, 2003]
- Convexify the Bethe free energy. [Wainwright, Jaakkola, Willsky, 2005]
- Asymptotically tight estimates on $\log Z$ for graph sequences. [e.g. Dembo, Montanari 2010]

Region-Based Approximation

Idea

Vertices → Edges → Regions

Naive Mean Field → Bethe Free Energy → Region-Based Free Energy

MF Equations → Belief Propagation → Generalized BP

[Cluster variational method, Kikuchi 1951]

Region

$R = (V_R, E_R)$, such that

- If $(i,j) \in E_R$ then $i, j \in V_R$.

Region

Free energy of region $R$: $\mathbb{F}_R : \mathcal{M}(\mathcal{X}^{V_R}) \to \mathbb{R}$

$$\mathbb{F}_R(b_R) = \mathbb{E}_{b_R} \log \psi_{\mathrm{tot},R}(x_R) + H(b_R)
= \sum_{x_R} \sum_{(i,j)\in E_R} b_R(x_R) \log \psi_{ij}(x_i,x_j) - \sum_{x_R} b_R(x_R) \log b_R(x_R).$$

The free energy of a region can be evaluated for small regions (complexity $|\mathcal{X}|^{|R|}$).
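The exponential cost comes from enumerating every joint configuration of the region. A minimal brute-force sketch, assuming the alphabet is `range(q)` and beliefs are stored in a dictionary keyed by state tuples (the function name and data layout are illustrative assumptions):

```python
import itertools
import math


def region_free_energy(vertices, region_edges, psi, b_R, q):
    """F_R(b_R) = E_{b_R} log psi_{tot,R} + H(b_R), by brute-force enumeration.

    vertices: list of vertex labels in the region; b_R maps a tuple of states
    (ordered like `vertices`) to a probability. Cost is q ** len(vertices).
    """
    pos = {v: t for t, v in enumerate(vertices)}
    total = 0.0
    for x in itertools.product(range(q), repeat=len(vertices)):
        p = b_R[x]
        if p == 0.0:
            continue  # convention: 0 log 0 = 0
        energy = sum(math.log(psi[(i, j)][x[pos[i]]][x[pos[j]]])
                     for (i, j) in region_edges)
        total += p * (energy - math.log(p))
    return total


# Triangle region with psi = 1 and uniform beliefs: only the entropy survives.
verts = [0, 1, 2]
tri_edges = [(0, 1), (1, 2), (0, 2)]
ones = {e: [[1.0, 1.0], [1.0, 1.0]] for e in tri_edges}
uniform = {x: 1.0 / 8.0 for x in itertools.product(range(2), repeat=3)}
value = region_free_energy(verts, tri_edges, ones, uniform, q=2)
```

With unit potentials and a uniform belief the result reduces to the entropy $|V_R| \log |\mathcal{X}|$, here $3 \log 2$.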

Region-based Approximation

Collection of regions

$$\mathcal{R} = \{R_1, R_2, \dots, R_m\}.$$

Coefficients

$$c_{\mathcal{R}} = \{c_{R_1}, c_{R_2}, \dots, c_{R_m}\}, \qquad c_{R_i} \in \mathbb{R}.$$

Free energy approximation:

$$\mathbb{F}_{\mathcal{R}} : \mathcal{M}(\mathcal{X}^{V(R_1)}) \times \cdots \times \mathcal{M}(\mathcal{X}^{V(R_m)}) \to \mathbb{R},$$

$$b_{\mathcal{R}} = (b_{R_1}, \dots, b_{R_m}) \mapsto \mathbb{F}_{\mathcal{R}}(b_{\mathcal{R}}) = \sum_{R\in\mathcal{R}} c_R\, \mathbb{F}_R(b_R).$$

Example: Bethe Free Energy

Regions

$$\mathcal{R} = \{R_i : i \in V\} \cup \{R_{ij} : (i,j) \in E\},$$

$$R_i = (\{i\}, \emptyset), \qquad R_{ij} = (\{i,j\}, \{(i,j)\}).$$

Coefficients

$$c_i = 1 - \deg(i), \qquad c_{ij} = 1.$$

Free energy

$$\mathbb{F}_{\mathcal{R}}(b) = \sum_{i\in V} \{1 - \deg(i)\}\, H(b_i) + \sum_{(i,j)\in E} \Big\{ H(b_{ij}) + \mathbb{E}_{b_{ij}} \log \psi_{ij}(x_i,x_j) \Big\}.$$
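As a sanity check of this formula, the sketch below evaluates the Bethe free energy at the exact marginals of a small tree and compares it with $\log Z$ computed by enumeration. The 3-vertex chain, the potentials, and the helper names are illustrative assumptions; the check relies on the standard fact that the Bethe expression is exact on trees.

```python
import itertools
import math

q = 2
edges = [(0, 1), (1, 2)]  # a 3-vertex chain, i.e. a tree
psi = {(0, 1): [[2.0, 1.0], [1.0, 3.0]],
       (1, 2): [[1.0, 2.0], [3.0, 1.0]]}

# Exact joint distribution and partition function by enumeration.
weights = {x: math.prod(psi[e][x[e[0]]][x[e[1]]] for e in edges)
           for x in itertools.product(range(q), repeat=3)}
Z = sum(weights.values())
mu = {x: w / Z for x, w in weights.items()}


def marginal(dist, idx):
    """Marginal of `dist` on the coordinates listed in `idx`."""
    out = {}
    for x, p in dist.items():
        key = tuple(x[i] for i in idx)
        out[key] = out.get(key, 0.0) + p
    return out


def entropy(dist):
    return -sum(p * math.log(p) for p in dist.values() if p > 0)


def bethe_free_energy(b_i, b_ij, edges, psi):
    deg = {i: 0 for i in b_i}
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    # Vertex terms with coefficient 1 - deg(i).
    value = sum((1 - deg[i]) * entropy(b_i[i]) for i in b_i)
    # Edge terms: entropy plus expected log-potential.
    for (i, j) in edges:
        energy = sum(p * math.log(psi[(i, j)][xi][xj])
                     for (xi, xj), p in b_ij[(i, j)].items())
        value += entropy(b_ij[(i, j)]) + energy
    return value


b_i = {i: marginal(mu, (i,)) for i in range(3)}
b_ij = {e: marginal(mu, e) for e in edges}
bethe = bethe_free_energy(b_i, b_ij, edges, psi)
```

On this chain `bethe` agrees with `math.log(Z)` up to floating-point error; on a graph with cycles the two generally differ.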

Questions?

1. What about domain/consistency?

2. How to choose coefficients?

3. How to choose regions?

Valid Region-Based Approximations [Yedidia, Freeman, Weiss, 2003]

Condition 1: Consistency

$$R \in \mathcal{R},\ R' \subseteq R \ \Rightarrow\ R' \in \mathcal{R}.$$

Condition 2: Vertex counting

$$\sum_{R\in\mathcal{R}} c_R\, \mathbb{I}(i \in R) = 1 \quad \text{for all } i \in V.$$

Condition 3: Edge counting

$$\sum_{R\in\mathcal{R}} c_R\, \mathbb{I}((i,j) \in R) = 1 \quad \text{for all } (i,j) \in E.$$

Example

[Figures not recovered.]

Add intersections!

Why? #1

$$R \in \mathcal{R},\ R' \subseteq R \ \Rightarrow\ R' \in \mathcal{R}.$$

Clean local consistency conditions

$$\sum_{x_{R\setminus R'}} b_R(x_R) = b_{R'}(x_{R'}) \quad \text{for all } R' \subseteq R.$$

These define $\mathrm{LOC}(G;\mathcal{R})$.

Geometric picture

[Figure not recovered: the polytopes $\mathrm{LOC}(G)$, $\mathrm{LOC}(G;\mathcal{R})$, and $\mathrm{MARG}(G)$.]

Why? #2

$$\sum_{R\in\mathcal{R}} c_R\, \mathbb{I}(i \in R) = 1 \quad \text{for all } i \in V.$$

Consider $\psi_{ij}(x_i,x_j) = 1$ and $b_R(x_R)$ uniform:

$$\begin{aligned}
\mathbb{F}_{\mathcal{R}}(b_{\mathcal{R}}) = \sum_{R\in\mathcal{R}} c_R\, \mathbb{F}_R(b_R) &= \sum_{R\in\mathcal{R}} c_R\, H(b_R) \\
&= \sum_{R\in\mathcal{R}} c_R\, |V(R)| \log |\mathcal{X}| \\
&= \sum_{i\in V} \Big\{ \sum_{R\in\mathcal{R}} c_R\, \mathbb{I}(i \in R) \Big\} \log |\mathcal{X}| = |V| \log |\mathcal{X}|.
\end{aligned}$$

Why? #3

$$\sum_{R\in\mathcal{R}} c_R\, \mathbb{I}((i,j) \in R) = 1 \quad \text{for all } (i,j) \in E.$$

Neglect entropy (e.g. $\psi_{ij}(x_i,x_j) = e^{\beta\,\theta_{ij}(x_i,x_j)}$, $\beta \to \infty$):

$$\begin{aligned}
\mathbb{F}_{\mathcal{R}}(b_{\mathcal{R}}) &= \beta \sum_{R\in\mathcal{R}} c_R \sum_{x_R} b_R(x_R) \sum_{(ij)\in E(R)} \theta_{ij}(x_i,x_j) + O_\beta(1) \\
&= \beta \sum_{R\in\mathcal{R}} c_R \sum_{(ij)\in E(R)} \mathbb{E}_{b_{ij}} \theta_{ij}(x_i,x_j) + O_\beta(1) \\
&= \beta \sum_{(ij)\in E} \Big\{ \sum_{R\in\mathcal{R}} c_R\, \mathbb{I}((i,j) \in R) \Big\}\, \mathbb{E}_{b_{ij}} \theta_{ij}(x_i,x_j) + O_\beta(1) \\
&= \beta \sum_{(ij)\in E} \mathbb{E}_{b_{ij}} \theta_{ij}(x_i,x_j) + O_\beta(1).
\end{aligned}$$

How do you compute the coefficients?

The Region Graph

[Figure not recovered: region graph.]

$c_{\mathrm{loop}} = +1$

$c_{\mathrm{edge}} = +1$

$c_{\mathrm{vert}} = +1$

$$c_R = 1 - \sum_{R' \in \mathrm{ANCESTORS}(R)} c_{R'}$$
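The recursion can be run top-down once the regions are ordered from largest to smallest. The sketch below assumes that the ancestors of $R$ are the regions of the collection that strictly contain it (in both vertices and edges); the example collection, two squares sharing one edge plus their intersection, and all function names are hypothetical.

```python
def counting_numbers(regions):
    """c_R = 1 - sum of c_{R'} over ancestors R' (regions strictly containing R).

    Each region is a pair (frozenset of vertices, frozenset of edges).
    """
    def contains(big, small):
        return big != small and small[0] <= big[0] and small[1] <= big[1]

    c = {}
    # Process larger regions first so every ancestor is already assigned.
    ordered = sorted(regions, key=lambda r: len(r[0]) + len(r[1]), reverse=True)
    for R in ordered:
        c[R] = 1 - sum(c[A] for A in c if contains(A, R))
    return c


def region(vertices, edges):
    return (frozenset(vertices), frozenset(edges))


# Two squares sharing the edge (1, 4), plus their intersection.
square_a = region({0, 1, 3, 4}, {(0, 1), (1, 4), (3, 4), (0, 3)})
square_b = region({1, 2, 4, 5}, {(1, 2), (2, 5), (4, 5), (1, 4)})
shared = region({1, 4}, {(1, 4)})
c = counting_numbers([square_a, square_b, shared])

# Vertex- and edge-counting sums for this collection.
vertex_count = {i: sum(c[R] for R in c if i in R[0]) for i in range(6)}
edge_count = {e: sum(c[R] for R in c if e in R[1])
              for e in square_a[1] | square_b[1]}
```

Here the two squares get coefficient $+1$ and the shared edge gets $-1$, so every vertex and every edge is counted exactly once, which is what the vertex- and edge-counting conditions ask for.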

Was it worth it?

$10 \times 10$ Ising model with random potentials [Yedidia et al. 2003]

Tree-Based Convexifications

Intermezzo: Exponential Families

$$T : \mathcal{X}^V \to \mathbb{R}^m, \qquad x \mapsto T(x) = (T_1(x), \dots, T_m(x)).$$

Exponential family $\{\mu_\theta : \theta \in \mathbb{R}^m\}$

$$\mu_\theta(x) = \frac{1}{Z(\theta)} \exp\big\{ \langle \theta, T(x) \rangle \big\}, \qquad F(\theta) = \log Z(\theta).$$

Exponential Families: Basic Properties

Proposition

(1) $\theta \mapsto F(\theta)$ is convex;

(2) $\nabla_\theta F(\theta) = \mathbb{E}_\theta\{T(x)\} \equiv \tau(\theta)$;

(3) $\nabla^2_\theta F(\theta) = \mathrm{Cov}_\theta\{T(x); T(x)\}$;

(4) $\mathrm{Image}(\tau) = \mathrm{MARG}(T)$.

$$\mathrm{MARG}(T) \equiv \mathrm{conv}\big(\{T(x) : x \in \mathcal{X}^V\}\big) = \big\{ \mathbb{E}_\mu T(x) : \mu \in \mathcal{M}(\mathcal{X}^V) \big\}$$
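Properties (1) and (2) are easy to check numerically on a tiny family. The sketch below assumes two binary variables with statistics $T(x) = (x_1, x_2, x_1 x_2)$ and an arbitrary parameter vector; these choices and the function names are illustrative only.

```python
import itertools
import math


def T(x):
    # Sufficient statistics for two binary variables (illustrative choice).
    return (x[0], x[1], x[0] * x[1])


STATES = list(itertools.product((0, 1), repeat=2))


def log_partition(theta):
    """F(theta) = log sum_x exp <theta, T(x)>."""
    return math.log(sum(math.exp(sum(t * s for t, s in zip(theta, T(x))))
                        for x in STATES))


def mean_parameters(theta):
    """tau(theta) = E_theta[T(x)], computed by enumeration."""
    F = log_partition(theta)
    tau = [0.0] * len(theta)
    for x in STATES:
        p = math.exp(sum(t * s for t, s in zip(theta, T(x))) - F)
        for a, s in enumerate(T(x)):
            tau[a] += p * s
    return tau


def numerical_gradient(theta, h=1e-5):
    """Central finite differences of F, one coordinate at a time."""
    grad = []
    for a in range(len(theta)):
        up = list(theta)
        dn = list(theta)
        up[a] += h
        dn[a] -= h
        grad.append((log_partition(up) - log_partition(dn)) / (2 * h))
    return grad


theta = [0.3, -0.7, 1.1]
other = [-1.0, 0.4, 0.2]
tau = mean_parameters(theta)
grad = numerical_gradient(theta)
midpoint = [(a + b) / 2 for a, b in zip(theta, other)]
```

The finite-difference gradient matches the mean parameters, and $F$ at the midpoint of two parameter vectors does not exceed the average of the two values, as convexity requires.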

Proofs

(1), (2), (3): Exercises

(4): A bit more difficult

Claim 1: A closed convex set is the closure of its relative interior.
[Hint: Assume the set has full dimension. Each point has a cone of full dimension around it.]

Claim 2: Let $\tau_* \in \mathrm{relint}(\mathrm{MARG}(T))$. Then $\tau_* = \mathbb{E}_{\mu_*}\{T(x)\}$ for some $\mu_*$ such that $\mu_*(x) > 0$ for all $x \in \mathcal{X}^V$.
[Hint: Consider the set of signed weights $\mu$ such that $\sum_x \mu(x) T(x) = \tau_*$. If the claim were false, it would be tangent to the simplex.]

Claim 3: There exists $\theta_* \in \mathbb{R}^m$ such that $\mathbb{E}_{\theta_*}\{T(x)\} = \mathbb{E}_{\mu_*}\{T(x)\}$.

Proof of Claim 3

Wlog $\{1, T_1, \dots, T_m\}$ are linearly independent. Consider

$$F(\theta; \tau_*) \equiv F(\theta) - \langle \tau_*, \theta \rangle
= \log\Big\{ \sum_{x\in\mathcal{X}^V} \exp\big( \langle \theta, T(x) \rangle \big) \Big\} - \mathbb{E}_{\mu_*}\{ \langle \theta, T(x) \rangle \}.$$

- $F(\,\cdot\,; \tau_*) : \mathbb{R}^m \to \mathbb{R}$ is differentiable and convex.
- If $\theta_*$ is a stationary point, then $\mathbb{E}_{\theta_*}\{T(x)\} = \mathbb{E}_{\mu_*}\{T(x)\}$.
- As $\theta \to \infty$, $F(\theta; \tau_*) \to \infty$.

This implies the thesis.

As $\theta \to \infty$, $F(\theta; \tau_*) \to \infty$

Let $\theta = \beta v$, $\beta \in \mathbb{R}_+$. Then

$$F(\theta; \tau_*) = \log\Big\{ \sum_{x\in\mathcal{X}^V} \exp\big( \langle \theta, T(x) \rangle \big) \Big\} - \mathbb{E}_{\mu_*}\{ \langle \theta, T(x) \rangle \}
\ \ge\ \beta \Big[ \max_x \langle v, T(x) \rangle - \mathbb{E}_{\mu_*}\{ \langle v, T(x) \rangle \} \Big],$$

and $[\,\dots\,] > 0$ strictly because $\mu_*(x) > 0$ for all $x$.

Duality structure

$$F_*(\tau) \equiv \inf_{\theta\in\mathbb{R}^m} \big\{ F(\theta) - \langle \tau, \theta \rangle \big\}, \qquad F_* : \mathrm{MARG}(T) \to \mathbb{R}, \ \text{concave.}$$

$$F(\theta) \equiv \sup_{\tau\in\mathrm{MARG}(T)} \big\{ F_*(\tau) + \langle \tau, \theta \rangle \big\}, \qquad F : \mathbb{R}^m \to \mathbb{R}, \ \text{convex.}$$
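A worked one-variable case may help fix ideas. Assume a single binary variable $x \in \{0,1\}$ with $T(x) = x$ (an illustrative choice, not an example from the slides), so that $\mathrm{MARG}(T) = [0,1]$:

```latex
\begin{align*}
F(\theta) &= \log\big(1 + e^{\theta}\big), \qquad
\tau(\theta) = \frac{e^{\theta}}{1 + e^{\theta}}, \\
F_*(\tau) &= \inf_{\theta\in\mathbb{R}} \big\{ \log(1 + e^{\theta}) - \tau\theta \big\},
  \quad \text{attained at } \theta = \log\frac{\tau}{1-\tau} \text{ for } \tau \in (0,1), \\
&= -\tau \log\tau - (1-\tau)\log(1-\tau).
\end{align*}
```

In this case $F_*(\tau)$ is the entropy of a Bernoulli($\tau$) variable, which is concave in $\tau$ as the duality structure predicts.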

Let's apply all this

[Figure not recovered: undirected pairwise graphical model on vertices $x_1, \dots, x_{12}$.]

$G = (V, E)$, $V = [n]$, $x = (x_1, \dots, x_n)$, $x_i \in \mathcal{X}$,

$$T_{i,\xi}(x) = \mathbb{I}(x_i = \xi), \qquad i \in V,\ \xi \in \mathcal{X},$$

$$T_{ij,\xi_1,\xi_2}(x) = \mathbb{I}(x_i = \xi_1)\, \mathbb{I}(x_j = \xi_2), \qquad (i,j) \in E,\ \xi_1, \xi_2 \in \mathcal{X}.$$

This representation is overcomplete!

The exponential family

$$\begin{aligned}
\mu_\theta(x) &= \frac{1}{Z(\theta)} \exp\Big\{ \sum_{(i,j)\in E;\ \xi_1,\xi_2\in\mathcal{X}} \theta_{ij}(\xi_1,\xi_2)\, T_{ij,\xi_1,\xi_2}(x) + \sum_{i\in V;\ \xi\in\mathcal{X}} \theta_i(\xi)\, T_{i,\xi}(x) \Big\} \\
&= \frac{1}{Z(\theta)} \exp\Big\{ \sum_{(i,j)\in E} \theta_{ij}(x_i,x_j) + \sum_{i\in V} \theta_i(x_i) \Big\}
\end{aligned}$$

(General pairwise model)

The $\tau$ parameters

$$b_i(\xi) = \mathbb{E}_\theta\{T_{i,\xi}(x)\} = \mu_\theta(x_i = \xi), \qquad \text{for } i \in V,$$

$$b_{ij}(\xi_1,\xi_2) = \mathbb{E}_\theta\{T_{ij,\xi_1,\xi_2}(x)\} = \mu_\theta(x_i = \xi_1, x_j = \xi_2), \qquad \text{for } (i,j) \in E.$$

The duality structure

$$F(\theta) \leftrightarrow F_*(b), \qquad F_* : \mathrm{MARG}(G) \to \mathbb{R}.$$

We want to evaluate $\Phi = F(\theta_* = \log\psi)$:

$$\Phi = \sup_{b\in\mathrm{MARG}(G)} \big\{ F_*(b) + \langle \theta_*, b \rangle \big\} = \text{Entropy} + \text{Energy}$$

New interpretation

Bethe entropy is an approximate expression for $F_*(b)$.

Interpretation works fine on trees

Proposition

If $G$ is a tree, then $\mathrm{MARG}(G) = \mathrm{LOC}(G)$ and

$$F_*(b) = \sum_{i\in V} H(b_i) - \sum_{(i,j)\in E} I(b_{ij}) = \mathbb{F}_{\psi=1}(b).$$

As a consequence, $\mathbb{F} : \mathrm{LOC}(G) \to \mathbb{R}$ is concave.

Proof: Exercise.

What about general graphs?

Write $G$ as a convex combination of trees.

Abuse of notation: I will use $T$ to denote trees, not functions.

Convex combinations

$$\mathcal{T}(G) = \{\text{spanning trees in } G\},$$

$$\rho : \mathcal{T}(G) \to [0,1], \qquad T \mapsto \rho_T \quad \text{(weights)},$$

$$\sum_{T\in\mathcal{T}(G)} \rho_T = 1, \qquad \sum_{T\in\mathcal{T}(G)} \rho_T\, \theta^T = \theta.$$

Convex combinations

$$\Phi = F(\theta) = F\Big( \sum_{T\in\mathcal{T}(G)} \rho_T\, \theta^T \Big) \le \sum_{T\in\mathcal{T}(G)} \rho_T\, F(\theta^T)$$

- Fix weights $\rho_T$.
- Minimize over $\theta^T$ (convex!)

Problem: Exponentially many spanning trees.
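The upper bound is just Jensen's inequality for the convex function $F$, and it can be verified by brute force on the smallest loopy graph. The sketch below assumes an Ising model on a triangle with spins in $\{-1,+1\}$, edge parameters only, uniform weights over the three spanning trees, and one particular feasible (not optimized) choice of tree parameters; all numerical values are illustrative.

```python
import itertools
import math

EDGES = [(0, 1), (1, 2), (0, 2)]  # triangle
STATES = list(itertools.product((-1, 1), repeat=3))


def log_partition(theta_edge):
    """F(theta) for the Ising weights exp(sum_e theta_e * x_i * x_j)."""
    return math.log(sum(
        math.exp(sum(theta_edge[e] * x[e[0]] * x[e[1]] for e in EDGES))
        for x in STATES))


theta = {(0, 1): 0.8, (1, 2): -0.5, (0, 2): 1.2}

# Each spanning tree of the triangle drops one edge; use uniform weights.
trees = [[e for e in EDGES if e != dropped] for dropped in EDGES]
rho = [1.0 / 3.0] * 3

# Every edge lies in 2 of the 3 trees, so scaling by 3/2 on the tree's edges
# (and 0 elsewhere) gives sum_T rho_T * theta^T = theta.
theta_T = [{e: (1.5 * theta[e] if e in tree else 0.0) for e in EDGES}
           for tree in trees]

exact = log_partition(theta)
upper = sum(r * log_partition(t) for r, t in zip(rho, theta_T))
```

Here `upper` bounds `exact` from above. Each $F(\theta^T)$ is a tree computation, so in larger graphs it could be obtained by BP on the tree rather than by enumeration; minimizing over the feasible $\theta^T$ would tighten the bound.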

Page 144: Stat 375: Inference in Graphical Models Lectures 7-8-9 375: Inference in Graphical Models Lectures 7-8-9 Andrea Montanari Stanford University May 6, 2012 Andrea Montanari (Stanford)

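Minimization over $(\theta^T)_{T\in\mathcal{T}(G)}$

\[
\begin{aligned}
\text{minimize} \quad & \sum_{T\in\mathcal{T}(G)} \rho_T\,F(\theta^T), \\
\text{subject to} \quad & \sum_{T\in\mathcal{T}(G)} \rho_T\,\theta^T_{ij}(x_i,x_j) = \theta_{ij}(x_i,x_j), \\
& \sum_{T\in\mathcal{T}(G)} \rho_T\,\theta^T_{i}(x_i) = \theta_{i}(x_i).
\end{aligned}
\]

Convex Problem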

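Lagrangian

\[
\begin{aligned}
L((\theta^T), b) &= \sum_{T} \rho_T\,F(\theta^T)
 - \sum_{(ij)\in E}\sum_{x_i,x_j} b_{ij}(x_i,x_j)\Big\{\sum_{T}\rho_T\,\theta^T_{ij}(x_i,x_j) - \theta_{ij}(x_i,x_j)\Big\} \\
&\quad - \sum_{i\in V}\sum_{x_i} b_{i}(x_i)\Big\{\sum_{T}\rho_T\,\theta^T_{i}(x_i) - \theta_{i}(x_i)\Big\} \\
&= \sum_{T}\rho_T\Big\{F(\theta^T) - \langle b,\theta^T\rangle\Big\} + \langle b,\theta\rangle
\end{aligned}
\]

Separable in $\theta^T$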
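Separability means each tree can be minimized on its own. As a worked step, assume $F(\theta^T)$ is the log-partition function of the model restricted to tree $T$, so that its gradient returns marginals; the symbol $\mu^T$ for those marginals is notation introduced here, not the lecture's. Setting the derivative of the $T$-th term to zero gives

```latex
\[
\frac{\partial}{\partial \theta^T_{ij}(x_i,x_j)}
\Big[\rho_T\big\{F(\theta^T) - \langle b,\theta^T\rangle\big\}\Big]
= \rho_T\big\{\mu^T_{ij}(x_i,x_j) - b_{ij}(x_i,x_j)\big\} = 0 ,
\qquad (i,j)\in E(T),
\]
\[
\frac{\partial}{\partial \theta^T_{i}(x_i)}
\Big[\rho_T\big\{F(\theta^T) - \langle b,\theta^T\rangle\big\}\Big]
= \rho_T\big\{\mu^T_{i}(x_i) - b_{i}(x_i)\big\} = 0 ,
\qquad i\in V .
\]
```

Under this assumption, every tree with $\rho_T > 0$ matches the same multipliers $b$ on its own vertices and edges, which is why $b$ can be read as a set of beliefs.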

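Lagrangian

\[
\begin{aligned}
\min_{(\theta^T)} L((\theta^T), b)
&= \sum_{T}\rho_T \min_{\theta^T}\Big\{F(\theta^T) - \langle b,\theta^T\rangle\Big\} + \langle b,\theta\rangle \\
&= \sum_{T}\rho_T\Big\{\sum_{i\in V} H(b_i) - \sum_{(ij)\in E(T)} I(b_{ij})\Big\} + \langle b,\theta\rangle \\
&= \sum_{i\in V} H(b_i)\Big\{\sum_{T:\, i\in V}\rho_T\Big\}
 - \sum_{(i,j)\in E} I(b_{ij})\Big\{\sum_{T:\,(i,j)\in E(T)}\rho_T\Big\} + \langle b,\theta\rangle \\
&= \sum_{i\in V} H(b_i) - \sum_{(i,j)\in E}\rho(ij)\,I(b_{ij}) + \langle b,\theta\rangle
\end{aligned}
\]

Here $\sum_{T:\, i\in V}\rho_T = \sum_{T}\rho_T = 1$, since every spanning tree contains every vertex, and $\rho(ij) \equiv \sum_{T:\,(i,j)\in E(T)}\rho_T$.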

Tree-reweighted free energy

\[
F_{\rm TRW}(b) = \sum_{i\in V} H(b_i) - \sum_{(i,j)\in E}\rho(ij)\,I(b_{ij}) + \langle b,\theta\rangle
\]

Compare with Bethe free energy

\[
F(b) = \sum_{i\in V} H(b_i) - \sum_{(i,j)\in E} I(b_{ij}) + \langle b,\theta\rangle
\]

- $\rho(i,j) = 0$: Obviously concave upper bound.
- $\rho(i,j) = 1$: Bethe free energy.
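The two expressions differ only in the factor $\rho(ij)$ multiplying each mutual information, which a few lines of code make concrete. The following is a minimal sketch, assuming beliefs and parameters are stored as plain Python lists indexed by state; the function names and the toy beliefs on a triangle are hypothetical choices made for illustration.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a distribution given as a flat list."""
    return -sum(q * math.log(q) for q in p if q > 0)

def mutual_information(joint):
    """I(b_ij) for a pairwise belief stored as joint[x_i][x_j]."""
    row = [sum(r) for r in joint]
    col = [sum(c) for c in zip(*joint)]
    flat = [q for r in joint for q in r]
    return entropy(row) + entropy(col) - entropy(flat)

def trw_free_energy(node_b, edge_b, node_theta, edge_theta, rho):
    """sum_i H(b_i) - sum_(ij) rho(ij) I(b_ij) + <b, theta>."""
    value = sum(entropy(b) for b in node_b.values())
    value -= sum(rho[e] * mutual_information(edge_b[e]) for e in edge_b)
    value += sum(b * t for i in node_b
                 for b, t in zip(node_b[i], node_theta[i]))
    value += sum(edge_b[e][a][c] * edge_theta[e][a][c]
                 for e in edge_b
                 for a in range(len(edge_b[e]))
                 for c in range(len(edge_b[e][a])))
    return value

# Toy beliefs on a triangle with theta = 0 (assumed values).
edges = [(0, 1), (1, 2), (0, 2)]
node_b = {i: [0.5, 0.5] for i in range(3)}
edge_b = {e: [[0.4, 0.1], [0.1, 0.4]] for e in edges}
node_theta = {i: [0.0, 0.0] for i in range(3)}
edge_theta = {e: [[0.0, 0.0], [0.0, 0.0]] for e in edges}

f_zero = trw_free_energy(node_b, edge_b, node_theta, edge_theta,
                         {e: 0.0 for e in edges})
f_trw = trw_free_energy(node_b, edge_b, node_theta, edge_theta,
                        {e: 2.0 / 3.0 for e in edges})
f_bethe = trw_free_energy(node_b, edge_b, node_theta, edge_theta,
                          {e: 1.0 for e in edges})
print(f_zero, f_trw, f_bethe)
```

Because mutual information is nonnegative, lowering $\rho(ij)$ from 1 toward 0 can only increase the value at fixed $b$, which is the ordering the three printed numbers illustrate.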

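Edge weights

\[
\rho = (\rho(e) : e\in E)
\]

Interpretation

\[
\rho(e) = \mathbb{P}_\rho\{e\in E(T)\}, \qquad \mathbb{P}_\rho(T) = \rho_T .
\]

Spanning-tree polytope

\[
\sum_{(i,j)\in E}\rho(i,j) = |V| - 1, \qquad
\sum_{(i,j)\in E(U)}\rho(i,j) \le |U| - 1 \quad \text{for all } U\subseteq V .
\]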
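The probabilistic interpretation can be made concrete by enumerating the spanning trees of a small graph. The sketch below assumes uniform weights $\rho_T$ and a hypothetical 4-cycle with one chord; the helper names are illustrative, and the enumeration is only feasible for tiny graphs.

```python
import itertools

def is_spanning_tree(n, edge_subset):
    """True if the n-1 given edges contain no cycle (union-find check)."""
    parent = list(range(n))

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    for u, v in edge_subset:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # this edge would close a cycle
        parent[ru] = rv
    return True  # n-1 acyclic edges on n vertices form a spanning tree

def edge_appearance(n, edges):
    """Spanning trees and rho(e) = P{e in E(T)} under uniform rho_T."""
    trees = [t for t in itertools.combinations(edges, n - 1)
             if is_spanning_tree(n, t)]
    weight = 1.0 / len(trees)
    rho = {e: sum(weight for t in trees if e in t) for e in edges}
    return trees, rho

# Assumed example: 4-cycle 0-1-2-3-0 plus the chord (0, 2).
n = 4
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
trees, rho = edge_appearance(n, edges)

print("number of spanning trees:", len(trees))
print("rho:", rho)
print("sum of rho(e):", sum(rho.values()), "vs |V| - 1 =", n - 1)
```

In this example the chord appears in fewer trees than the cycle edges, so uniform tree weights do not give uniform edge weights.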

Example

$k$-regular graph

\[
|V| = n, \qquad |E| = \frac{nk}{2}.
\]

Take all the weights equal (not necessarily ok, but...)

\[
\rho(i,j) = \frac{2(n-1)}{nk} \approx \frac{2}{k}
\]

For (some) models on locally tree-like graphs, $\rho(i,j) = 1$ is approximately correct $\Rightarrow$ $\Theta(n)$ error.
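The equal-weight value follows from the first spanning-tree polytope constraint: with a common weight on all $nk/2$ edges, the weights must sum to $n - 1$. A small sketch of that arithmetic, using exact fractions; the function name and the sample $(n, k)$ pairs are hypothetical.

```python
from fractions import Fraction

def uniform_edge_weight(n, k):
    """Common weight rho on a k-regular graph with n vertices.

    Solves rho * |E| = |V| - 1 with |E| = n * k / 2.
    """
    num_edges = Fraction(n * k, 2)
    return Fraction(n - 1) / num_edges

for n, k in [(10, 3), (100, 3), (1000, 4)]:
    rho = uniform_edge_weight(n, k)
    print(f"n={n}, k={k}: rho = {rho} ~ {float(rho):.4f}, 2/k = {2 / k:.4f}")
```

For $k \ge 3$ the common weight stays bounded away from 1 as $n$ grows, which is the gap from the Bethe choice $\rho(i,j) = 1$ that the slide refers to.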
