
Page 1

Expectation Propagation for Graphical Models

Yuan (Alan) Qi

Joint work with Tom Minka

Page 2

Motivation

• Graphical models are widely used in real-world applications, such as wireless communications and bioinformatics.

• Inference techniques on graphical models often sacrifice efficiency for accuracy, or accuracy for efficiency.

• We need a method that better balances this trade-off.

Page 3

Motivation

[Figure: accuracy vs. efficiency plane. Current techniques trace a trade-off curve; "what we want" sits in the high-accuracy, high-efficiency corner.]

Page 4

Outline

• Background

• Expectation Propagation (EP) on dynamic systems
  – Poisson tracking
  – Signal detection for wireless communications

• Tree-structured EP on loopy graphs

• Conclusions and future work

Page 5

Outline

• Background

• Expectation Propagation (EP) on dynamic systems
  – Poisson tracking
  – Signal detection for wireless communications

• Tree-structured EP on loopy graphs

• Conclusions

Page 6

Graphical Models

                              Directed                         Undirected
Generative                    Bayesian networks                Boltzmann machines
Conditional (Discriminative)  Maximum entropy Markov models    Conditional random fields

[Each cell is illustrated with a small graph over nodes x1, x2, y1, y2.]

Page 7

Inference on Graphical Models

• Bayesian inference techniques:
  – Belief propagation (BP): Kalman filtering/smoothing, the forward-backward algorithm
  – Monte Carlo: particle filters/smoothers, MCMC

• Loopy BP: typically efficient, but not accurate

• Monte Carlo: accurate, but often not efficient

Page 8

Efficiency vs. Accuracy

[Figure: accuracy vs. efficiency plane. BP is efficient but less accurate; MC is accurate but inefficient; "EP ?" is placed in the high-accuracy, high-efficiency corner.]

Page 9

Expectation Propagation in a Nutshell

• Approximate a probability distribution by a product of simpler parametric terms:

    p(\mathbf{x}) = \prod_a f_a(\mathbf{x})   ≈   q(\mathbf{x}) = \prod_a \tilde f_a(\mathbf{x})

• Each approximation term \tilde f_a(\mathbf{x}) lives in an exponential family (e.g. Gaussian)

Page 10

Update Term Approximation

• Iterate the fixed-point equation by moment matching:

    \tilde f_a(\mathbf{x})^{new} = \arg\min_{\tilde f_a(\mathbf{x})} D\left( f_a(\mathbf{x})\, q^{\setminus a}(\mathbf{x}) \;\|\; \tilde f_a(\mathbf{x})\, q^{\setminus a}(\mathbf{x}) \right)

  where the leave-one-out approximation is

    q^{\setminus a}(\mathbf{x}) = \prod_{b \neq a} \tilde f_b(\mathbf{x})
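As a concrete illustration of this update (my own sketch, not from the talk): given a Gaussian cavity q^{\setminus a}, moment-match the tilted distribution f_a q^{\setminus a} numerically, then divide the cavity back out. The particular term f_a below, a Gaussian signal mixed with broad clutter, is purely illustrative.

import numpy as np

# One EP update for a single term: moment-match f_a(x) * q^{\a}(x), where
# q^{\a}(x) = N(m_cav, v_cav), then divide the cavity out of the matched
# Gaussian to get the new site term f~_a in natural parameters.
def ep_term_update(m_cav, v_cav, f_a, lo=-50.0, hi=50.0, npts=20001):
    x = np.linspace(lo, hi, npts)
    dx = x[1] - x[0]
    tilted = f_a(x) * np.exp(-0.5 * (x - m_cav) ** 2 / v_cav)
    Z = tilted.sum() * dx
    mean = (x * tilted).sum() * dx / Z                # matched first moment
    var = ((x - mean) ** 2 * tilted).sum() * dx / Z   # matched second moment
    lam = 1.0 / var - 1.0 / v_cav                     # site precision
    eta = mean / var - m_cav / v_cav                  # site precision * mean
    return mean, var, lam, eta

# Illustrative non-Gaussian term: a unit-variance signal plus broad clutter
f_a = lambda x: 0.8 * np.exp(-0.5 * (x - 1.0) ** 2) \
    + 0.2 * np.exp(-0.5 * x ** 2 / 100.0)
print(ep_term_update(m_cav=0.0, v_cav=4.0, f_a=f_a))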

Page 11

Outline

• Background

• Expectation Propagation (EP) on dynamic systems
  – Poisson tracking
  – Signal detection for wireless communications

• Tree-structured EP on loopy graphs

• Conclusions

Page 12

EP on Dynamic Systems

                              Directed                         Undirected
Generative                    Bayesian networks                Boltzmann machines
Conditional (Discriminative)  Maximum entropy Markov models    Conditional random fields

[The taxonomy table again; dynamic systems fall in the Bayesian-network cell.]

Page 13

Object Tracking

Guess the position of an object given noisy observations

[Figure: object positions x1, x2, x3, x4 with noisy observations y1, y2, y3, y4.]

Page 14

Bayesian Network

e.g. a random walk:

    x_t = x_{t-1} + \nu_t
    y_t = x_t + \text{noise}

We want the distribution of the x's given the y's.

[Graph: chain x_1 → x_2 → … → x_T with observations y_1, y_2, …, y_T.]

Page 15

Approximation

    p(\mathbf{x}, \mathbf{y}) = p(x_1)\, p(y_1 | x_1) \prod_{t>1} p(x_t | x_{t-1})\, p(y_t | x_t)

    q(\mathbf{x}) \propto \tilde p(x_1)\, \tilde o(x_1) \prod_{t>1} \tilde p(x_t | x_{t-1})\, \tilde o(x_t)

Factorized and Gaussian in \mathbf{x} (hence the proportionality).

Page 16

Message Interpretation

    q(x_t) = \tilde p_{t-1}(x_t)\; \tilde o_t(x_t)\; \tilde p_{t+1}(x_t)
           = (forward msg)(observation msg)(backward msg)

[Diagram: node x_t with its observation y_t; the forward message arrives from x_{t-1}, the backward message from x_{t+1}, and the observation message from y_t.]

Page 17

EP on Dynamic Systems

• Filtering: t = 1, …, T
  – Incorporate the forward message
  – Initialize the observation message

• Smoothing: t = T, …, 1
  – Incorporate the backward message
  – Compute the leave-one-out approximation by dividing out the old observation message
  – Re-approximate the new observation message

• Re-filtering: t = 1, …, T
  – Incorporate the forward and observation messages (see the sketch below)
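A minimal runnable sketch of this schedule (my own illustration, not the authors' code) for a scalar random-walk model with an arbitrary likelihood. Messages are Gaussians in natural parameters (precision, precision × mean); observation messages are moment-matched against their cavity by Gauss-Hermite quadrature. On a linear-Gaussian model this reduces to the Kalman smoother.

import numpy as np

# Gauss-Hermite rule for expectations under N(0, 1)
gh_x, gh_w = np.polynomial.hermite_e.hermegauss(30)
gh_w = gh_w / gh_w.sum()

def match(lam, eta, log_lik):
    """Moment-match N(eta/lam, 1/lam) * exp(log_lik(x)); return natural params."""
    m, s = eta / lam, 1.0 / np.sqrt(lam)
    x = m + s * gh_x
    ll = log_lik(x)
    w = gh_w * np.exp(ll - ll.max())
    w /= w.sum()
    mean = (w * x).sum()
    var = (w * (x - mean) ** 2).sum()
    return np.array([1.0 / var, mean / var])

def ep_smoother(y, log_lik, q=0.01, prior=(0.01, 0.0), n_iter=5):
    """fwd/obs/bwd messages for x_t = x_{t-1} + N(0, q), stored as (lam, eta)."""
    T = len(y)
    fwd = np.zeros((T, 2)); obs = np.zeros((T, 2)); bwd = np.zeros((T, 2))
    for _ in range(n_iter):
        fwd[0] = prior
        for t in range(T):                      # filtering / re-filtering pass
            if t > 0:                           # propagate fwd*obs through dynamics
                lam, eta = fwd[t-1] + obs[t-1]
                m, v = eta / lam, 1.0 / lam
                fwd[t] = (1.0 / (v + q), m / (v + q))
            cav = fwd[t] + bwd[t]               # leave-one-out: divide out obs msg
            obs[t] = match(cav[0], cav[1], lambda x: log_lik(y[t], x)) - cav
        for t in range(T - 2, -1, -1):          # smoothing pass
            lam, eta = obs[t+1] + bwd[t+1]      # propagate obs*bwd backwards
            m, v = eta / lam, 1.0 / lam
            bwd[t] = (1.0 / (v + q), m / (v + q))
            cav = fwd[t] + bwd[t]               # re-approximate the obs msg
            obs[t] = match(cav[0], cav[1], lambda x: log_lik(y[t], x)) - cav
    return fwd, obs, bwd

# Demo on a linear-Gaussian model (EP is exact here)
rng = np.random.default_rng(0)
x_true = np.cumsum(rng.normal(0.0, 0.1, 50))
y = x_true + rng.normal(0.0, 0.5, 50)
fwd, obs, bwd = ep_smoother(y, lambda yt, xt: -0.5 * (yt - xt) ** 2 / 0.25)
post = fwd + obs + bwd
print("posterior means:", (post[:, 1] / post[:, 0])[:5])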

Page 18

Extension of EP

• Instead of matching moments, use any method for approximate filtering.
  – Examples: extended Kalman filter, statistical linearization, unscented filter

• All of these methods can be interpreted as finding linear/Gaussian approximations to the original terms.

Page 19

Example: Poisson Tracking

• y_t is an integer-valued Poisson variate with mean \exp(x_t)

Page 20

Poisson Tracking Model

    p(x_t | x_{t-1}) \sim N(x_{t-1},\, 0.01)

    p(x_1) \sim N(0,\, 100)

    p(y_t | x_t) = \exp(x_t y_t)\, e^{-e^{x_t}} / y_t!
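A quick simulation of this model (a sketch; the seed and length are arbitrary):

import numpy as np

# Simulate the Poisson tracking model above:
#   x_1 ~ N(0, 100),  x_t | x_{t-1} ~ N(x_{t-1}, 0.01),  y_t ~ Poisson(exp(x_t))
rng = np.random.default_rng(1)
T = 100
x = np.empty(T)
x[0] = rng.normal(0.0, 10.0)          # std dev 10 = sqrt(100)
for t in range(1, T):
    x[t] = rng.normal(x[t - 1], 0.1)  # std dev 0.1 = sqrt(0.01)
y = rng.poisson(np.exp(x))
print(y[:10])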

Page 21

Approximate Observation Message

• The observation term p(y_t | x_t), and hence the posterior p(x_t | y_t), is not Gaussian

• The moments of x are not analytic

• Two approaches:
  – Gauss-Hermite quadrature for the moments
  – Statistical linearization instead of moment matching

• Both work well
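For the quadrature route, a small sketch (mine, not the talk's) of projecting one Poisson observation term onto a Gaussian:

import numpy as np

# Moments of p(y_t | x_t) * N(x_t; m, v) by Gauss-Hermite quadrature,
# i.e. the Gaussian projection of one observation term against its cavity.
def poisson_obs_moments(y, m, v, deg=20):
    z, w = np.polynomial.hermite_e.hermegauss(deg)  # nodes/weights for N(0, 1)
    x = m + np.sqrt(v) * z
    ll = y * x - np.exp(x)                # log Poisson(y; exp(x)), up to -log y!
    w = w * np.exp(ll - ll.max())
    w /= w.sum()
    mean = (w * x).sum()
    var = (w * (x - mean) ** 2).sum()
    return mean, var

print(poisson_obs_moments(y=3, m=0.0, v=1.0))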

Page 22

EP Accuracy Improves Significantly in only a few Iterations

Page 23

Approximate vs. Exact Posterior

Page 24

EP vs. Monte Carlo: Accuracy

[Two plots: posterior mean and posterior variance estimates, EP vs. Monte Carlo.]

Page 25

Accuracy/Efficiency Tradeoff

Page 26

EP for Digital Wireless Communication

• Signal detection problem

• Transmitted signal: s_t = a \sin(\omega t + \phi)

• (a, \phi) vary to encode each symbol

• Complex representation: a e^{i\phi}

[Diagram: the point a e^{i\phi} in the complex plane, with Re and Im axes.]

Page 27

Binary Symbols, Gaussian Noise

• Symbols are 1 and −1 (in the complex plane)

• Received signal: y_t = s_t + \text{noise}

• Optimal detection is easy

[Diagram: received point y_t near the two symbol points s_0 and s_1.]

Page 28

Fading Channel

• The channel systematically changes the amplitude and phase:

    y_t = x_t s_t + \text{noise}

• x_t changes over time

[Diagram: in the complex plane, the symbol points x_t s_0 and x_t s_1 drift and rotate over time (also shown at t−1), along with the observation y_{t-1}.]

Page 29

Benchmark: Differential Detection

• Classical technique

• Use previous observation to estimate state

• Binary symbols only
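To make the setting tangible, a toy simulation (all constants illustrative) of binary symbols through a slowly drifting fading channel, with the differential-detection baseline:

import numpy as np

# y_t = x_t * s_t + noise, with a slowly drifting complex channel x_t
rng = np.random.default_rng(2)
T = 200
s = rng.choice([-1.0, 1.0], size=T)                 # BPSK symbols
phase = np.cumsum(rng.normal(0.0, 0.05, T))         # slow phase drift
amp = 1.0 + np.cumsum(rng.normal(0.0, 0.01, T))     # slow amplitude drift
x = amp * np.exp(1j * phase)                        # fading coefficient
y = x * s + 0.1 * (rng.normal(size=T) + 1j * rng.normal(size=T))

# Differential detection: successive observations share (approximately) the
# same channel, so sign(Re(y_t * conj(y_{t-1}))) estimates s_t * s_{t-1}.
d_hat = np.sign(np.real(y[1:] * np.conj(y[:-1])))
err = np.mean(d_hat != s[1:] * s[:-1])
print("differential symbol error rate:", err)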

Page 30

Bayesian network for Signal Detection

[Graph: channel states x_1, x_2, …, x_T; transmitted symbols s_1, s_2, …, s_T; observations y_1, y_2, …, y_T.]

Page 31

On-line EP: Joint Signal Detection and Channel Estimation

• Iterate over the last L observations (the smoothing window)

• Observations before the window act as a prior for the current estimation

Page 32

Computational Complexity

• Expectation propagation: O(nLd²)

• Stochastic mixture of Kalman filters: O(LMd²)

• Rao-Blackwellised particle smoothers: O(LMNd²)

n: number of EP iterations (typically 4 or 5)
d: dimension of the parameter vector
L: length of the smoothing window
M: number of samples in filtering
N: number of samples in smoothing
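With illustrative values (my numbers, not the talk's), the gap is easy to see:

# Illustrative operation counts (values made up for scale)
n, d, L, M, N = 5, 4, 10, 100, 50
print("EP:                        ", n * L * d**2)      # O(nLd^2)
print("Mixture of Kalman filters: ", L * M * d**2)      # O(LMd^2)
print("RB particle smoothers:     ", L * M * N * d**2)  # O(LMNd^2)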

Page 33

Experimental Results

EP outperforms particle smoothers in efficiency with comparable accuracy.

(Chen, Wang, Liu 2000)

Page 34

Bayesian Networks for Adaptive Decoding

[Graph: channel states x_1, x_2, …, x_T; information bits e_1, e_2, …, e_T; observations y_1, y_2, …, y_T.]

The information bits e_t are coded by a convolutional error-correcting encoder.

Page 35

EP Outperforms Viterbi Decoding

Page 36

Outline

• Background

• Expectation Propagation (EP) on dynamic systems
  – Poisson tracking
  – Signal detection for wireless communications

• Tree-structured EP on loopy graphs

• Conclusions

Page 37

EP on Boltzmann Machines

                              Directed                         Undirected
Generative                    Bayesian networks                Boltzmann machines
Conditional (Discriminative)  Maximum entropy Markov models    Conditional random fields

[The taxonomy table again; this part of the talk concerns the Boltzmann-machine cell.]

Page 38

Inference on Grids

Problem: estimate the marginal distributions of the variables indexed by the nodes of a loopy graph, e.g. p(x_i), i = 1, …, 16.

[Figure: a 4 × 4 grid of nodes x_1, …, x_16.]

Page 39

Boltzmann Machines

Joint distribution is a product of pair potentials:

    p(\mathbf{x}) = \prod_a f_a(\mathbf{x})

Want to approximate p by a simpler distribution q.

[Figure: a loopy four-node graph over x_1, x_2, x_3, x_4.]
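For graphs this small the target can be computed by brute force, which is handy for checking any approximation; a sketch with made-up weights:

import itertools
import numpy as np

# Exact marginals of a small Boltzmann machine p(x) = prod_a f_a(x), with
# pair potentials f_ij(x) = exp(w_ij * x_i * x_j) and x_i in {-1, +1}.
rng = np.random.default_rng(3)
n = 4
edges = [(i, j) for i in range(n) for j in range(i + 1, n)]  # fully connected
w = {e: rng.normal(0.0, 0.5) for e in edges}

states = list(itertools.product([-1, 1], repeat=n))
p = np.array([np.exp(sum(w[e] * s[e[0]] * s[e[1]] for e in edges))
              for s in states])
p /= p.sum()
for i in range(n):
    print(f"p(x{i + 1} = +1) =", sum(pk for pk, s in zip(p, states) if s[i] == 1))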

Page 40

BP vs. EP

[Figure: the loopy four-node graph over x_1, …, x_4; BP approximates it with a fully disconnected graph, EP with a tree.]

Page 41

Junction Tree Representation

[Figure: the loopy graph p(x), its tree approximation q(x), and the corresponding junction tree of pairwise cliques.]

Page 42

Approximating an Edge by a Tree

    f_a(x_1, x_2) \;\approx\; \tilde f_a(\mathbf{x}) = \frac{\tilde f_a(x_1, x_4)\; \tilde f_a(x_2, x_4)\; \tilde f_a(x_3, x_4)}{\tilde f_a(x_4)^2}

Each potential f_a in p is projected onto the tree structure of q.

Correlations are not lost, but projected onto the tree.

Page 43

Moment Matching

• Match the single and pairwise marginals of the original graph and of its tree approximation

• Reduces to exact inference on single loops (see the sketch below)
  – Use cutset conditioning

[Figure: the loopy four-node graph and its tree approximation.]
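A minimal sketch of cutset conditioning on a single loop (my illustration; the potentials are random): fixing the cut node turns the loop into a chain, which is solved exactly for each value, and the results are combined.

import numpy as np

# Exact marginal of x1 on a loop x1 - x2 - ... - xn - x1 by cutset
# conditioning: fix x1 = v, run exact chain inference, combine over v.
rng = np.random.default_rng(4)
n = 4
# psi[k] is the (2 x 2) potential coupling x_{k+1} and x_{k+2} (indices mod n)
psi = [np.exp(rng.normal(0.0, 0.5, (2, 2))) for _ in range(n)]

p1 = np.zeros(2)
for v in range(2):                      # condition on the cut node x1 = v
    msg = psi[0][v, :]                  # message from x1 into x2
    for k in range(1, n - 1):
        msg = msg @ psi[k]              # exact chain pass x2 -> ... -> xn
    p1[v] = msg @ psi[n - 1][:, v]      # close the loop back at x1 = v
p1 /= p1.sum()
print("p(x1):", p1)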

Page 44

Local Propagation

• Original EP: globally propagate evidence to the whole tree
  – Problem: computationally expensive

• Exploit the junction tree representation: only locally propagate evidence within the minimal subtree that is directly connected to the off-tree edge.
  – Reduces computational complexity
  – Saves memory

Page 45

[Figure: a junction tree with cliques (x1,x2), (x1,x3), (x1,x4), (x3,x4), (x3,x5), (x3,x6), (x5,x7). Global propagation updates all cliques; local propagation touches only the minimal subtree connected to the off-tree edge.]

Page 46

4-node Graph

TreeEP = the proposed method, BP = loopy belief propagation, GBP = generalized belief propagation on triangles, MF = mean-field, TreeVB = variational tree.

Page 47

Fully-connected graphs

Results are averaged over 10 graphs with randomly generated potentials

• TreeEP performs as well as or better than all other methods in both accuracy and efficiency!

Page 48

8×8 grids, 10 trials

Method            FLOPS         Error
Exact             30,000        0
TreeEP            300,000       0.149
BP/double-loop    15,500,000    0.358
GBP               17,500,000    0.003

Page 49

TreeEP versus BP and GBP

• TreeEP is always more accurate than BP and is often faster

• TreeEP is much more efficient than GBP and more accurate on some problems

• TreeEP converges more often than BP and GBP

Page 50

Outline

• Background

• Expectation Propagation (EP) on dynamic systems
  – Poisson tracking
  – Signal detection for wireless communications

• Tree-structured EP on loopy graphs

• Conclusions

Page 51

Conclusions

• EP algorithms outperform state-of-the-art inference methods on graphical models in the trade-off between accuracy and efficiency

[Figure: accuracy vs. efficiency plane, with EP in the high-accuracy, high-efficiency corner.]

Page 52

Future Work

• EP is applicable to a wide range of applications

• EP is sensitive to the choice of approximation
  – How to choose an approximation family (e.g. the tree structure)?
  – More flexible approximations: mixtures of EP?
  – Error bounds?

Page 53

Future Work

                              Directed                         Undirected
Generative                    Bayesian networks                Boltzmann machines
Conditional (Discriminative)  Maximum entropy Markov models    Conditional random fields

[The taxonomy table again, surveying where EP could be applied next.]

Page 54

End

Page 55

EP versus BP

• EP approximation is in a restricted family, e.g. Gaussian

• EP approximation does not have to be factorized

• EP applies to many more problems
  – e.g. mixtures of discrete and continuous variables

Page 56

EP versus Monte Carlo

• Monte Carlo is general but expensive

• EP exploits underlying simplicity of the problem if it exists

• Monte Carlo is still needed for complex problems (e.g. large isolated peaks)

• Trick is to know what problem you have

Page 57

(Loopy) Belief propagation

• Specialize to factorized approximations:

    \tilde f_a(\mathbf{x}) = \prod_i \tilde f_{ai}(x_i)

  The factors \tilde f_{ai}(x_i) are the "messages".

• Minimize KL-divergence = match the marginals of f_a(\mathbf{x})\, q^{\setminus a}(\mathbf{x}) (partially factorized) and \tilde f_a(\mathbf{x})\, q^{\setminus a}(\mathbf{x}) (fully factorized)
  – i.e. "send messages" (see the sketch below)
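A minimal loopy BP implementation for a pairwise MRF, matching this factorized special case (my sketch; the potentials and sizes are arbitrary):

import numpy as np

# Loopy BP on a pairwise MRF. psi maps each undirected edge (i, j), i < j,
# to a (k x k) table over (x_i, x_j); m[(i, j)] is the message i -> j.
def loopy_bp(psi, n_nodes, iters=50):
    k = next(iter(psi.values())).shape[0]
    nbrs = {i: [] for i in range(n_nodes)}
    for (i, j) in psi:
        nbrs[i].append(j)
        nbrs[j].append(i)
    m = {}
    for (i, j) in psi:
        m[(i, j)] = np.ones(k) / k
        m[(j, i)] = np.ones(k) / k
    pot = lambda i, j: psi[(i, j)] if (i, j) in psi else psi[(j, i)].T
    for _ in range(iters):
        for (i, j) in list(m):
            prod = np.ones(k)            # messages into i, except from j
            for l in nbrs[i]:
                if l != j:
                    prod = prod * m[(l, i)]
            new = pot(i, j).T @ prod     # sum out x_i
            m[(i, j)] = new / new.sum()
    beliefs = []
    for i in range(n_nodes):             # node marginals from incoming messages
        b = np.ones(k)
        for l in nbrs[i]:
            b = b * m[(l, i)]
        beliefs.append(b / b.sum())
    return beliefs

# Demo: random pairwise potentials on a fully connected four-node graph
rng = np.random.default_rng(6)
psi = {e: np.exp(rng.normal(0.0, 0.5, (2, 2)))
       for e in [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]}
print(loopy_bp(psi, 4))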

Page 58

Limitation of BP

• If the dynamics or measurements are not linear and Gaussian*, the complexity of the posterior increases with the number of measurements

• I.e. the BP equations are not "closed"
  – Beliefs need not stay within a given family

  * or any other exponential family

Page 59

Approximate filtering

• Compute a Gaussian belief which approximates the true posterior:

    q(x_t) \approx p(x_t | y_{1:t})

• E.g. extended Kalman filter, statistical linearization, unscented filter, assumed-density filter

Page 60

EP perspective

• Approximate filtering is equivalent to replacing the true measurement/dynamics equations with linear/Gaussian equations:

    p(x_t | y_{1:t-1}, y_t) \propto p(y_t | x_t)\, p(x_t | y_{1:t-1})

  Requiring the left-hand side to be Gaussian implies that the terms on the right-hand side are effectively replaced by Gaussian ones.

Page 61

EP perspective

• EKF, UKF, ADF are all algorithms for:

    p(y_t | x_t),\; p(x_t | x_{t-1})              (nonlinear, non-Gaussian)
        ↓
    \tilde p(y_t | x_t),\; \tilde p(x_t | x_{t-1})    (linear, Gaussian)

Page 62

Terminology

• Filtering: p(x_t | y_{1:t})

• Smoothing: p(x_t | y_{1:t+L}) where L > 0

• On-line: old data is discarded (fixed memory)

• Off-line: old data is re-used (unbounded memory)

Page 63

Kalman filtering / Belief propagation

• Prediction:

    p(x_t | y_{1:t-1}) = \int p(x_t | x_{t-1})\, p(x_{t-1} | y_{1:t-1})\, dx_{t-1}

• Measurement:

    p(x_t | y_{1:t}) \propto p(y_t | x_t)\, p(x_t | y_{1:t-1})

• Smoothing:

    p(x_t | y_{1:T}) = p(x_t | y_{1:t}) \int \frac{p(x_{t+1} | x_t)\, p(x_{t+1} | y_{1:T})}{p(x_{t+1} | y_{1:t})}\, dx_{t+1}
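These three updates, specialized to the scalar random-walk model used earlier (a sketch with illustrative noise levels):

import numpy as np

# Kalman filter + RTS smoother for x_t = x_{t-1} + N(0, q), y_t = x_t + N(0, r)
def kalman_smoother(y, q=0.01, r=0.25, m0=0.0, v0=100.0):
    T = len(y)
    mf = np.zeros(T); vf = np.zeros(T)             # filtered p(x_t | y_{1:t})
    m_pred, v_pred = m0, v0
    for t in range(T):
        if t > 0:
            m_pred, v_pred = mf[t-1], vf[t-1] + q  # prediction step
        gain = v_pred / (v_pred + r)               # measurement step
        mf[t] = m_pred + gain * (y[t] - m_pred)
        vf[t] = (1.0 - gain) * v_pred
    ms = mf.copy(); vs = vf.copy()                 # smoothed p(x_t | y_{1:T})
    for t in range(T - 2, -1, -1):
        g = vf[t] / (vf[t] + q)                    # smoothing (RTS) gain
        ms[t] = mf[t] + g * (ms[t+1] - mf[t])
        vs[t] = vf[t] + g**2 * (vs[t+1] - (vf[t] + q))
    return ms, vs

rng = np.random.default_rng(5)
x_true = np.cumsum(rng.normal(0.0, 0.1, 100))
y = x_true + rng.normal(0.0, 0.5, 100)
ms, vs = kalman_smoother(y)
print("smoothed means:", ms[:5])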