Kristin Branson September 29, 2003 -...

Loopy Belief Propagation

Research Exam

Kristin Branson

September 29, 2003

Loopy Belief Propagation – p.1/73

Problem Formalization

Reasoning about any real-world problem requires assumptions about the structure of the problem: the relevant variables and the interrelationships of these variables.

A graphical model is a formal representation of these assumptions.

Probabilistic Model

These assumptions are a simplification of the problem's true structure.

The world appears stochastic in terms of the model.

Graphical models are interpreted as describing the probability distribution of random variables.

Probabilistic Inference

Reasoning about real-world problems can be modeled as probabilistic inference on a distribution described by a graph.

Probabilistic inference involves computing desired properties of a distribution:

What is the most probable state of the variables, given some evidence?

What is the marginal distribution of a subset of the variables, given some evidence?

Inference Example

Estimate the intensity value of each pixel of an image given a corrupted version of the image.

Assume the relationship between uncorrupted pixel variables can be described by a local smoothness constraint.

Inference is Intractable

Assuming a sparse graphical model greatly simplifies the problem.

Still, probabilistic inference is in general intractable.

Exact inference algorithms take time exponential in the graph size in the worst case.

Pearl's Belief Propagation (BP) algorithm performs approximate inference on an arbitrary graphical model.

Loopy BP

The assumptions made by BP only hold for acyclic graphs.

For graphs containing cycles, loopy BP is not guaranteed to converge or be correct.

However, it has been applied with experimental success.

Experimental Results

Impressive experimental results were first observed in graphical code schemes.

The Turbo code error-correcting code scheme was described as "the most exciting and potentially important development in coding theory in many years" (McEliece et al., 1995).

Murphy et al. (1999) experimented with loopy BP on graphical models that appear in machine learning.

They concluded that loopy BP did converge to good approximations in many cases.

Since then, loopy BP has been applied successfully in many machine learning and computer vision applications.

Theoretical Analysis

When considering applying loopy BP, one would like to know whether it will converge to good approximations.

In this exam, I present recent analyses of loopy BP.

Outline

Background.
    Undirected graphical models.
    Belief Propagation algorithm.

Three techniques for analyzing loopy BP.
    Algebraic analysis.
    Unwrapped tree.
    Reparameterization.

Future work.

Markov Random Fields

An undirected graphical model represents a distribution by an undirected graph.

[Figure: example undirected graph on nodes 1, 2, 3.]

Nodes represent random variables.

Edges represent dependencies.

Each clique $c$ is associated with a potential function, $\psi_c(x_c)$.

Markov Properties

Paths in the graph represent dependencies.

If two nodes are not connected by any path, the variables are independent.

If nodes $B$ separate nodes $A$ from nodes $C$, then $x_A$ and $x_C$ are conditionally independent given $x_B$.

The Hammersley-Clifford theorem: for a strictly positive distribution, the conditional independence assumptions hold if and only if the distribution factorizes as the product of potential functions over cliques:

$$p(x) = \frac{1}{Z} \prod_{c} \psi_c(x_c)$$

Pairwise MRFs

To simplify notation, we focus on pairwise MRFs.

The largest clique size in a pairwise MRF is two.

The distribution can therefore be represented as

$$P(x \mid y) = \frac{1}{Z} \prod_{(i,j) \in E} \psi_{ij}(x_i, x_j) \prod_{i} \psi_i(x_i, y_i)$$

where $y$ denotes the observed evidence.

An MRF with larger cliques can be converted into a pairwise MRF.
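As a rough illustration of the pairwise factorization, the following Python sketch enumerates every joint state of a small pairwise MRF and normalizes the product of potentials. The three-node chain, the potential values, and the names `joint_table`, `node_pot`, and `edge_pot` are assumptions made for this example; the evidence is assumed to be already folded into the single-node potentials.

```python
from itertools import product

def joint_table(node_pot, edge_pot):
    """Brute-force joint distribution of a pairwise MRF.

    node_pot maps node -> list of single-node potential values psi_i(x_i).
    edge_pot maps (i, j) -> table with table[x_i][x_j] = psi_ij(x_i, x_j).
    Returns a dict mapping each joint state (a tuple) to its probability.
    """
    nodes = sorted(node_pot)
    unnormalized = {}
    for states in product(*(range(len(node_pot[i])) for i in nodes)):
        x = dict(zip(nodes, states))
        weight = 1.0
        for i in nodes:
            weight *= node_pot[i][x[i]]
        for (i, j), table in edge_pot.items():
            weight *= table[x[i]][x[j]]
        unnormalized[states] = weight
    Z = sum(unnormalized.values())  # partition function
    return {s: w / Z for s, w in unnormalized.items()}

# Hypothetical three-node binary chain 1 - 2 - 3 with a smoothness potential.
node_pot = {1: [0.7, 0.3], 2: [0.5, 0.5], 3: [0.2, 0.8]}
smooth = [[2.0, 1.0], [1.0, 2.0]]
edge_pot = {(1, 2): smooth, (2, 3): smooth}
p = joint_table(node_pot, edge_pot)
```

The table has one entry per joint state, so its size grows exponentially with the number of nodes; it is only useful as a reference for checking cheaper algorithms on tiny graphs.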

Probabilistic Inference

We discuss two inference problems:

Marginalization: For each node $i$, compute

$$P(x_i \mid y) = \sum_{x \setminus x_i} P(x \mid y)$$

MAP assignment: Find the maximum probability assignment given the evidence:

$$x^* = \arg\max_{x} P(x \mid y)$$

Max-Marginals

To find the MAP assignment, we will compute the max-marginals:

$$\mu_i(x_i) = \max_{x \setminus x_i} P(x \mid y)$$

Each component $x_i^*$ of the MAP assignment maximizes $\mu_i(x_i)$.
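The two inference problems can be checked by exhaustive search on a toy distribution. The sketch below is only an illustration: the weights are hypothetical, variables are indexed by tuple position, and the helper names `marginal` and `max_marginal` are not from the talk. It assumes the MAP assignment is unique, which holds for these weights.

```python
# Hypothetical unnormalized weights for three binary variables (x1, x2, x3).
weights = {
    (0, 0, 0): 2.0, (0, 0, 1): 2.0, (0, 1, 0): 1.0, (0, 1, 1): 3.0,
    (1, 0, 0): 2.0, (1, 0, 1): 2.0, (1, 1, 0): 3.0, (1, 1, 1): 5.0,
}
Z = sum(weights.values())
p = {x: w / Z for x, w in weights.items()}

def marginal(p, i):
    """Sum out every variable except the one at position i."""
    out = {}
    for x, prob in p.items():
        out[x[i]] = out.get(x[i], 0.0) + prob
    return out

def max_marginal(p, i):
    """Maximize over every variable except the one at position i."""
    out = {}
    for x, prob in p.items():
        out[x[i]] = max(out.get(x[i], 0.0), prob)
    return out

# MAP assignment by exhaustive search over joint states.
x_map = max(p, key=p.get)

# The same assignment read off node by node from the max-marginals.
from_max_marginals = tuple(
    max(max_marginal(p, i), key=max_marginal(p, i).get) for i in range(3)
)
```

Reading the MAP assignment off the max-marginals one node at a time is what makes them useful: each node can be decided locally once its max-marginal is known.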

Notation

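To simplify notation, we assume that the effect of the observed data $y$ is encapsulated in the single-node potentials $\psi_i$:

$$\psi_i(x_i) = \psi_i(x_i, y_i)$$

$$p(x) = \frac{1}{Z} \prod_{(i,j) \in E} \psi_{ij}(x_i, x_j) \prod_{i} \psi_i(x_i)$$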

Variable Elimination

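Exact inference can be performed by repeatedly eliminating variables. For example, for a three-node chain $x_1 - x_2 - x_3$:

$$p(x_1) = \frac{1}{Z}\,\psi_1(x_1) \sum_{x_2} \psi_{12}(x_1, x_2)\,\psi_2(x_2) \sum_{x_3} \psi_{23}(x_2, x_3)\,\psi_3(x_3)$$

$$\max_{x_2, x_3} p(x) = \frac{1}{Z}\,\psi_1(x_1) \max_{x_2} \psi_{12}(x_1, x_2)\,\psi_2(x_2) \max_{x_3} \psi_{23}(x_2, x_3)\,\psi_3(x_3)$$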
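A minimal sketch of variable elimination on a chain, assuming list-based potentials and the hypothetical helper names `chain_marginal` and `brute_force_marginal`. It sums out variables from the far end of the chain toward the first node and compares the result with full enumeration.

```python
from itertools import product

def chain_marginal(node_pot, edge_pot):
    """Marginal of the first node of a chain, eliminating from the far end.

    node_pot[k][a] is psi_k(a); edge_pot[k][a][b] is the potential between
    node k (state a) and node k + 1 (state b).
    """
    # carry[b] holds the result of summing out everything to the right.
    carry = [1.0] * len(node_pot[-1])
    for k in range(len(node_pot) - 1, 0, -1):
        local = [node_pot[k][b] * carry[b] for b in range(len(node_pot[k]))]
        carry = [
            sum(edge_pot[k - 1][a][b] * local[b] for b in range(len(local)))
            for a in range(len(node_pot[k - 1]))
        ]
    unnormalized = [node_pot[0][a] * carry[a] for a in range(len(node_pot[0]))]
    Z = sum(unnormalized)
    return [u / Z for u in unnormalized]

def brute_force_marginal(node_pot, edge_pot):
    """Same marginal by enumerating every joint state (exponential cost)."""
    totals = [0.0] * len(node_pot[0])
    for x in product(*(range(len(pot)) for pot in node_pot)):
        weight = 1.0
        for k, pot in enumerate(node_pot):
            weight *= pot[x[k]]
        for k, table in enumerate(edge_pot):
            weight *= table[x[k]][x[k + 1]]
        totals[x[0]] += weight
    Z = sum(totals)
    return [t / Z for t in totals]

# Hypothetical three-node binary chain.
node_pot = [[0.7, 0.3], [0.5, 0.5], [0.2, 0.8]]
edge_pot = [[[2.0, 1.0], [1.0, 2.0]], [[1.0, 3.0], [2.0, 1.0]]]
eliminated = chain_marginal(node_pot, edge_pot)
enumerated = brute_force_marginal(node_pot, edge_pot)
```

On a chain the elimination cost is linear in the number of nodes, while enumeration is exponential; replacing each `sum` with `max` would give the max-marginal instead.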

BP for Trees

BP breaks the [max-]marginalization for a node $i$ into independent subproblems corresponding to the subtrees rooted at the neighbors of $i$.

In each subproblem, BP eliminates all the variables except $x_i$.

The result is a message $m_{ji}(x_i)$ from each neighbor $j$.

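BP for Trees

BP is a dynamic programming form of variable elimination.

The creation of a message is equivalent to repeatedly removing leaf nodes of the subtree:

$$m_{ji}(x_i) = \sum_{x_j} \psi_{ij}(x_i, x_j)\,\psi_j(x_j) \prod_{k \in N(j) \setminus i} m_{kj}(x_j) \quad \text{(sum-product)}$$

$$m_{ji}(x_i) = \max_{x_j} \psi_{ij}(x_i, x_j)\,\psi_j(x_j) \prod_{k \in N(j) \setminus i} m_{kj}(x_j) \quad \text{(max-product)}$$

where $N(j)$ denotes the neighbors of $j$.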

BP for Trees

In terms of these messages, the [max-]marginals are

$$p(x_i) \propto \psi_i(x_i) \prod_{j \in N(i)} m_{ji}(x_i)$$

The same expression with max-product messages gives the max-marginal.
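The recursion can be sketched for the sum-product case as follows. This is an illustrative implementation under assumed data structures (dictionaries `nbrs`, `node_pot`, `edge_pot` and the helper names are hypothetical), checked against brute-force enumeration on a small star-shaped tree.

```python
from itertools import product
from math import prod

def psi(edge_pot, i, j, xi, xj):
    """Look up psi_ij(x_i, x_j) regardless of how the edge key is ordered."""
    if (i, j) in edge_pot:
        return edge_pot[(i, j)][xi][xj]
    return edge_pot[(j, i)][xj][xi]

def message(j, i, nbrs, node_pot, edge_pot):
    """m_ji(x_i): sum out the subtree hanging off neighbor j, as seen from i."""
    incoming = [message(k, j, nbrs, node_pot, edge_pot) for k in nbrs[j] if k != i]
    return [
        sum(
            psi(edge_pot, i, j, xi, xj) * node_pot[j][xj] * prod(m[xj] for m in incoming)
            for xj in range(len(node_pot[j]))
        )
        for xi in range(len(node_pot[i]))
    ]

def tree_marginal(i, nbrs, node_pot, edge_pot):
    """Marginal of node i: local potential times all incoming messages."""
    msgs = [message(j, i, nbrs, node_pot, edge_pot) for j in nbrs[i]]
    unnormalized = [
        node_pot[i][xi] * prod(m[xi] for m in msgs) for xi in range(len(node_pot[i]))
    ]
    Z = sum(unnormalized)
    return [u / Z for u in unnormalized]

def brute_force_marginal(i, node_pot, edge_pot):
    """Reference marginal by enumerating every joint state."""
    nodes = sorted(node_pot)
    totals = [0.0] * len(node_pot[i])
    for states in product(*(range(len(node_pot[n])) for n in nodes)):
        x = dict(zip(nodes, states))
        weight = prod(node_pot[n][x[n]] for n in nodes)
        weight *= prod(t[x[a]][x[b]] for (a, b), t in edge_pot.items())
        totals[x[i]] += weight
    Z = sum(totals)
    return [t / Z for t in totals]

# Hypothetical star-shaped tree: node 2 is connected to nodes 1, 3, and 4.
nbrs = {1: [2], 2: [1, 3, 4], 3: [2], 4: [2]}
node_pot = {1: [0.7, 0.3], 2: [0.5, 0.5], 3: [0.2, 0.8], 4: [0.6, 0.4]}
edge_pot = {(1, 2): [[2.0, 1.0], [1.0, 2.0]],
            (2, 3): [[1.0, 3.0], [2.0, 1.0]],
            (2, 4): [[1.5, 1.0], [1.0, 1.5]]}
bp = {i: tree_marginal(i, nbrs, node_pot, edge_pot) for i in nbrs}
exact = {i: brute_force_marginal(i, node_pot, edge_pot) for i in nbrs}
```

The recursion terminates only because the graph is a tree; on a graph with a cycle, `message` would call itself indefinitely, which is one way to see why the acyclic assumption matters.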

Parallel Message Passing

Instead of waiting for smaller problems to be solved before solving larger problems, we can iteratively pass messages in parallel.

Initialize the messages $m^{0}_{ji}$ for all $(j, i) \in E$.

For $t = 1, 2, \ldots$ until convergence,

Update $m^{t}_{ji}$ using $m^{t-1}_{kj}$, $k \in N(j) \setminus i$, for all $(j, i) \in E$.
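The parallel schedule can be sketched as below for sum-product messages. Several details are assumptions of this example rather than statements from the talk: messages are initialized uniformly, normalized after every update, and convergence is declared when the largest change in any message entry falls below `tol`. The function name `loopy_bp` and the example potentials are hypothetical.

```python
from math import prod

def loopy_bp(node_pot, edge_pot, max_iters=200, tol=1e-10):
    """Synchronous sum-product updates; returns (beliefs, iterations, converged)."""
    nbrs = {i: [] for i in node_pot}
    for i, j in edge_pot:
        nbrs[i].append(j)
        nbrs[j].append(i)

    def psi(i, j, xi, xj):
        if (i, j) in edge_pot:
            return edge_pot[(i, j)][xi][xj]
        return edge_pot[(j, i)][xj][xi]

    # msgs[(j, i)][x_i] is the message from j to i, initialized uniformly.
    msgs = {(j, i): [1.0 / len(node_pot[i])] * len(node_pot[i])
            for i in nbrs for j in nbrs[i]}
    iterations, converged = 0, False
    while iterations < max_iters and not converged:
        new = {}
        for (j, i) in msgs:
            vec = [
                sum(
                    psi(i, j, xi, xj) * node_pot[j][xj]
                    * prod(msgs[(k, j)][xj] for k in nbrs[j] if k != i)
                    for xj in range(len(node_pot[j]))
                )
                for xi in range(len(node_pot[i]))
            ]
            total = sum(vec)
            new[(j, i)] = [v / total for v in vec]  # normalize for stability
        change = max(abs(a - b) for key in msgs for a, b in zip(new[key], msgs[key]))
        msgs = new
        iterations += 1
        converged = change < tol
    beliefs = {}
    for i in nbrs:
        b = [node_pot[i][xi] * prod(msgs[(j, i)][xi] for j in nbrs[i])
             for xi in range(len(node_pot[i]))]
        total = sum(b)
        beliefs[i] = [v / total for v in b]
    return beliefs, iterations, converged

# Hypothetical single-cycle graph (a triangle) with positive potentials.
node_pot = {1: [0.7, 0.3], 2: [0.5, 0.5], 3: [0.2, 0.8]}
smooth = [[2.0, 1.0], [1.0, 2.0]]
cycle_beliefs, cycle_iters, cycle_ok = loopy_bp(
    node_pot, {(1, 2): smooth, (2, 3): smooth, (1, 3): smooth})
# On a tree (the chain 1 - 2 - 3) the same updates reach the exact marginals.
chain_beliefs, chain_iters, chain_ok = loopy_bp(
    node_pot, {(1, 2): smooth, (2, 3): smooth})
```

Nothing in the update rule depends on the graph being a tree, which is why the same code runs unchanged on the triangle; only the guarantees differ.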

Loopy BP

The parallel BP algorithm can be applied to arbitrary graphs.

However, the assumptions made by BP do not hold for a loopy graph.

Loopy BP is not guaranteed to converge.

If it does converge, it is not guaranteed to converge to the correct [max-]marginals.

We will call the approximate [max-]marginals beliefs, $b_i(x_i)$.

Theoretical Analysis

When will loopy BP converge?

How good an approximation are the [max-]marginals and max-product assignment?

I present three techniques for analyzing BP.

The first two analyze the message-passing dynamics, while the third analyzes the steady-state beliefs directly.

Algebraic Analysis Overview

We first discuss an algebraic analysis of the sum-product algorithm for a single-cycle graph.

We represent each message update of the sum-product algorithm as a matrix multiplication.

We use linear algebra results to show the relationship between the steady-state beliefs and the true marginals, as well as convergence properties.

The sum-product algorithm converges for a single-cycle graph.

The convergence rate and the accuracy of the beliefs are related.

Matrix Representation

We represent the message and belief functions as vectors $\mathbf{m}_{ji}$ and $\mathbf{b}_i$.

Similarly, we represent the single- and pair-node potentials as matrices $\Psi_i$ and $\Psi_{ij}$, where $\Psi_i$ is diagonal with entries $\psi_i(x_i)$ and $[\Psi_{ij}]_{ab} = \psi_{ij}(x_i = a, x_j = b)$.

Matrix Message Updates

For a graph consisting of a single cycle, a message update is a matrix multiplication

$$\mathbf{m}_{i,i+1} = \alpha\,\Psi_{i,i+1}^{\top}\,\Psi_i\,\mathbf{m}_{i-1,i}$$

where $\alpha$ is a normalization constant.

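Matrix Belief Updates

For a graph consisting of a single cycle, a belief update is

$$\mathbf{b}_i = \alpha\,\mathrm{diag}\!\left(\Psi_i\,\mathbf{m}_{i+1,i}\,\mathbf{m}_{i-1,i}^{\top}\right)$$

A series of message updates is a series of matrix multiplications. For a cycle over nodes $1, \ldots, n$, define

$$C = \Psi_{n,1}^{\top}\,\Psi_n\,\Psi_{n-1,n}^{\top}\,\Psi_{n-1} \cdots \Psi_{1,2}^{\top}\,\Psi_1$$

so that one pass around the cycle gives

$$\mathbf{m}_{n,1}^{(t+n)} = \alpha\,C\,\mathbf{m}_{n,1}^{(t)}$$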

Power Method Lemma

$C^t \mathbf{m}^{(0)}$ converges, up to normalization, to the principal eigenvector of $C$, $\mathbf{v}_1$.

The convergence rate is the ratio of the first two eigenvalues, $|\lambda_2| / |\lambda_1|$.

This applies if

The eigenvalues follow $|\lambda_1| > |\lambda_2| \geq \cdots$ (e.g. if the distribution is positive).

The initial vector $\mathbf{m}^{(0)}$ is not orthogonal to $\mathbf{v}_1$.
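The lemma can be illustrated with a few lines of Python. The matrix below is a hypothetical positive 2x2 example, not one derived from the talk, and `power_method` is a name chosen for this sketch. Each step applies the matrix and renormalizes, mirroring one normalized pass of messages around the cycle.

```python
def power_method(C, v0, iters=100):
    """Repeatedly apply C and renormalize; returns (eigenvector, eigenvalue)."""
    v = list(v0)
    eigenvalue = 0.0
    for _ in range(iters):
        w = [sum(C[r][c] * v[c] for c in range(len(v))) for r in range(len(C))]
        eigenvalue = sum(w) / sum(v)  # growth factor of the current iterate
        total = sum(w)
        v = [x / total for x in w]    # keep the entries summing to one
    return v, eigenvalue

# Hypothetical positive 2x2 matrix with eigenvalues (5 +/- sqrt(5)) / 2.
C = [[2.0, 1.0], [1.0, 3.0]]
v1, lam1 = power_method(C, [0.5, 0.5])
```

For this matrix the eigenvalue ratio is roughly 0.38, so the error shrinks by about that factor on every iteration; a ratio closer to one would need proportionally more iterations.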

True Marginals

The sums and multiplications performed when computing the marginals are a distributed version of the sums and multiplications performed when computing the diagonal elements of $C$:

$$\mathbf{p}_1 = \alpha\,\mathrm{diag}(C)$$

where $\mathbf{p}_1$ is the vector of true marginals $p(x_1)$.

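Beliefs

$\Psi_1 \mathbf{m}_{2,1}$ is the left principal eigenvector of the around-the-cycle matrix $C$, since at steady state

$$(\Psi_1 \mathbf{m}_{2,1})^{\top} C = \lambda_1\,(\Psi_1 \mathbf{m}_{2,1})^{\top}$$

The steady-state beliefs are therefore the diagonal elements of the outer product of the right and left principal eigenvectors of $C$.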

Beliefs vs True Marginals

The diagonal elements of the outer product of the right and left principal eigenvectors are an approximation of the diagonal elements of $C$.

The goodness of the approximation depends on the ratio $\lambda_1 / \sum_j \lambda_j$.

Recall that the convergence rate depends on a similar ratio, $|\lambda_2| / |\lambda_1|$.

The faster the convergence, the better the approximation.
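The link between accuracy and the eigenvalues can be made explicit with a short derivation. It is a sketch under the assumption that $C$ is diagonalizable with right eigenvectors $\mathbf{v}_j$ and left eigenvectors $\mathbf{u}_j$ scaled so that $\mathbf{u}_j^{\top}\mathbf{v}_k = \delta_{jk}$; the symbols $\mathbf{p}_1$ and $\mathbf{b}_1$ for the true marginal and the steady-state belief at node 1 are notation chosen here.

```latex
\begin{align*}
C &= \sum_{j} \lambda_j\, \mathbf{v}_j \mathbf{u}_j^{\top},
\qquad
\operatorname{tr}(C) = \sum_{j} \lambda_j, \\
\mathbf{b}_1 &= \operatorname{diag}\!\left(\mathbf{v}_1 \mathbf{u}_1^{\top}\right), \\
\mathbf{p}_1 &= \frac{\operatorname{diag}(C)}{\operatorname{tr}(C)}
 = \frac{\lambda_1\, \mathbf{b}_1
   + \sum_{j \ge 2} \lambda_j \operatorname{diag}\!\left(\mathbf{v}_j \mathbf{u}_j^{\top}\right)}
        {\sum_{j} \lambda_j}.
\end{align*}
```

When $\lambda_1$ dominates the remaining eigenvalues, the correction terms are small and $\mathbf{p}_1 \approx \mathbf{b}_1$; the same dominance makes the power iteration converge quickly.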

Algebraic Analysis Recap

By representing the sum-product algorithm on a single-cycle graph as a series of matrix multiplications, we showed the following results:

The sum-product algorithm converges for positive distributions.

Both the convergence rate and the accuracy of the steady-state beliefs depend on the relative size of the first eigenvalue of the same matrix $C$.

Unwrapped Tree

To analyze the BP algorithm, we construct the unwrapped tree, $T$, an acyclic graph that is locally equivalent to the original graph, $G$.

Unwrapped Tree Overview

The unwrapped tree was used to prove:

The max-product assignment is exact for a graph containing a single cycle.

The max-product assignment has a higher probability than any other assignment in a large neighborhood.

Unwrapped Tree Construction

The unwrapped tree, $T$, is constructed from $G$ as follows:

Choose an arbitrary root node $r$. Initialize $T = \{r\}$.

Repeat: For each leaf $i$ of $T$, find the neighbors of the corresponding node in $G$, other than the parent of $i$ in $T$. Add these nodes to the tree.

Copy the potentials from the corresponding nodes in $G$.

Modify the leaf single-node potentials to simulate the steady-state messages in $G$.

Because the graphs are locally the same, the messages and beliefs in $T$ will be replicas of the steady-state messages and beliefs in $G$.

Graphs Containing a Single Cycle

For a graph containing a single cycle, the unwrapped tree is an infinite chain.

We can construct $T$ so that each node is replicated $k$ times in the interior.

Let $L(x)$ be the log-likelihood of assignment $x$ for $G$.

Since the interior of $T$ consists of $k$ replicas of $G$, the log-likelihood for $T$ of the assignment $\tilde{x}$ that replicates $x$ is

$$\tilde{L}(\tilde{x}) = k\,L(x) + \tilde{L}_{\text{leaf}}(\tilde{x})$$

where $\tilde{L}_{\text{leaf}}$ is the log-likelihood for the two leaf nodes.

As $\tilde{L}_{\text{leaf}}$ does not depend on $k$, in the limit as $k \to \infty$,

$$\arg\max_{x} \tilde{L}(\tilde{x}) = \arg\max_{x} L(x)$$

Optimality for Arbitrary Graphs

Let $S$ be a set of nodes whose induced subgraph contains at most one cycle per connected component.

We can show that $x^*$ has a higher probability than any $x \neq x^*$ that differs from $x^*$ only on nodes in $S$.

Reparameterization Analysis

The past two analysis techniques analyzed the message-passing dynamics of BP.

The reparameterization technique analyzes the steady-state beliefs.

Reparameterization Overview

We show that the beliefs define another parameterization of the distribution $p(x)$.

In this parameterization,

We show that the steady-state beliefs are consistent w.r.t. every subtree.

We show that the max-product assignment satisfies an optimality condition w.r.t. every subgraph with at most one cycle per connected component.

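Steady-State Beliefs

We analyze the steady-state single- and pair-node beliefs:

$$b_i(x_i) = \alpha\,\psi_i(x_i) \prod_{k \in N(i)} m_{ki}(x_i)$$

$$b_{ij}(x_i, x_j) = \alpha\,\psi_{ij}(x_i, x_j)\,\psi_i(x_i)\,\psi_j(x_j) \prod_{k \in N(i) \setminus j} m_{ki}(x_i) \prod_{l \in N(j) \setminus i} m_{lj}(x_j)$$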

Belief Parameterization

The beliefs $b$ define another parameterization of the distribution:

$$p(x) = \frac{1}{Z} \prod_{i} \psi_i(x_i) \prod_{(i,j) \in E} \psi_{ij}(x_i, x_j) \propto \prod_{i} b_i(x_i) \prod_{(i,j) \in E} \frac{b_{ij}(x_i, x_j)}{b_i(x_i)\,b_j(x_j)}$$

as can be shown by substituting in the definitions of $b_i$ and $b_{ij}$.
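The substitution can also be checked numerically. In the sketch below the beliefs are built from arbitrary positive messages on a hypothetical triangle, since the cancellation of messages between single-node and pair-node beliefs does not require a fixed point. Beliefs are left unnormalized (the constant $\alpha$ is taken to be one), so the ratio to the original product of potentials should be exactly one for every assignment. All names and values are assumptions of this example.

```python
from itertools import product
from math import prod

node_pot = {1: [0.7, 0.3], 2: [0.5, 0.5], 3: [0.2, 0.8]}
edge_pot = {(1, 2): [[2.0, 1.0], [1.0, 2.0]],
            (2, 3): [[1.0, 3.0], [2.0, 1.0]],
            (1, 3): [[1.5, 1.0], [1.0, 1.5]]}
nbrs = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
# Arbitrary positive messages: msgs[(j, i)][x_i] is the message from j to i.
msgs = {(1, 2): [0.4, 0.6], (2, 1): [0.9, 0.1], (2, 3): [0.3, 0.7],
        (3, 2): [0.5, 0.5], (1, 3): [0.8, 0.2], (3, 1): [0.25, 0.75]}

def b_node(i, xi):
    """Unnormalized single-node belief b_i(x_i)."""
    return node_pot[i][xi] * prod(msgs[(k, i)][xi] for k in nbrs[i])

def b_edge(i, j, xi, xj):
    """Unnormalized pair-node belief b_ij(x_i, x_j)."""
    left = node_pot[i][xi] * prod(msgs[(k, i)][xi] for k in nbrs[i] if k != j)
    right = node_pot[j][xj] * prod(msgs[(l, j)][xj] for l in nbrs[j] if l != i)
    return edge_pot[(i, j)][xi][xj] * left * right

def original(x):
    """Product of the original potentials for assignment x (a dict)."""
    w = prod(node_pot[i][x[i]] for i in node_pot)
    return w * prod(t[x[i]][x[j]] for (i, j), t in edge_pot.items())

def reparameterized(x):
    """Product of beliefs in the reparameterized form."""
    w = prod(b_node(i, x[i]) for i in node_pot)
    for (i, j) in edge_pot:
        w *= b_edge(i, j, x[i], x[j]) / (b_node(i, x[i]) * b_node(j, x[j]))
    return w

ratios = []
for states in product((0, 1), repeat=3):
    x = dict(zip((1, 2, 3), states))
    ratios.append(reparameterized(x) / original(x))
```

Every message appears once in a single-node belief and once in the denominator of an edge term, which is why the ratio is constant regardless of the message values.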

Consistency

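Definition: Let $H = (V_H, E_H)$ be a subgraph with distribution

$$p_H(x_H) \propto \prod_{i \in V_H} b_i(x_i) \prod_{(i,j) \in E_H} \frac{b_{ij}(x_i, x_j)}{b_i(x_i)\,b_j(x_j)}$$

The beliefs $b$ are consistent w.r.t. $H$ if the corresponding beliefs $b_i$, $i \in V_H$, and $b_{ij}$, $(i,j) \in E_H$, are the true max-marginals of $p_H$.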

Edge Consistency

The edge beliefs are consistent:

$$\max_{x_j} b_{ij}(x_i, x_j) = \kappa\,b_i(x_i)$$

for a constant $\kappa$, as can be seen by substituting in the message definitions of $b_i$ and $b_{ij}$.

Tree Consistency

The steady-state beliefs $b$ are consistent w.r.t. every subtree of $G$.

This is shown by exploiting the edge consistency described to remove leaf nodes one at a time.

In the end, we are left with only two nodes, a trivial base case.

Tree Plus Cycle Optimality

Let $H = (V_H, E_H)$ be a subgraph of $G$ with at most one cycle per connected component and distribution

$$p_H(x_H) \propto \prod_{i \in V_H} b_i(x_i) \prod_{(i,j) \in E_H} \frac{b_{ij}(x_i, x_j)}{b_i(x_i)\,b_j(x_j)}$$

The max-product assignment $x^*_H$ maximizes $p_H$.

Using the edge consistency described, we can show that

$$p_H(x^*_H) \geq p_H(x_H)$$

for any other assignment $x_H$.

If $H$ is a connected subgraph containing one cycle, then the edges of $H$ can be directed so that each node has exactly one parent:

$$p_H(x_H) \propto \prod_{i \in V_H} \frac{b_{i\pi(i)}(x_i, x_{\pi(i)})}{b_{\pi(i)}(x_{\pi(i)})}$$

where $\pi(i)$ is the parent of $i$.

Corollaries of Tree-Plus-Cycle Optimality

The two results proved using the unwrapped tree are corollaries of the Tree-Plus-Cycle optimality.

The Tree-Plus-Cycle optimality can also be used to show an error bound on the max-product assignment for an arbitrary graph.

Future Work

I have presented three techniques for analyzing loopy BP.

Experimental results are better than the results proved.

Future work includes extending each technique to be more general and to prove stronger results.

Prove convergence properties of the max-product algorithm on a single-cycle graph using algebraic analysis.

Prove the optimality of the max-product algorithm for specific multiple-loop graphs using the unwrapped tree technique.

Show more powerful optimality results for arbitrary graph structures with specific potential properties.

References

Aji, S., Horn, G., and McEliece, R. (1998). On the convergence of iterative decoding on graphs with a single cycle. In IEEE International Symposium on Information Theory.

McEliece, R., Rodemich, E., and Cheng, J. (1995). The Turbo decision algorithm. In 33rd Allerton Conference on Communications, Control and Computing, Monticello, IL.

Murphy, K., Weiss, Y., and Jordan, M. (1999). Loopy belief propagation for approximate inference: An empirical study. In Uncertainty in Artificial Intelligence, pages 467–475.

Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann Publishers, Inc., San Mateo, CA.

Wainwright, M. (January, 2002). Stochastic Processes on Graphs with Cycles: Geometric and Variational Approaches. PhD thesis, MIT, Cambridge, MA.

Wainwright, M., Jaakkola, T., and Willsky, A. (2003). Tree-based reparameterization framework for analysis of sum-product and related algorithms. IEEE Transactions on Information Theory, 49(5).

Wainwright, M., Jaakkola, T., and Willsky, A. (October 28, 2002). Tree consistency and bounds on the performance of the max-product algorithm and its generalizations. Technical Report P–2554, Laboratory for Information and Decision Systems, MIT.

Weiss, Y. (November, 1997). Belief propagation and revision in networks with loops. Technical Report AI Memo No. 1616, C.B.C.L. Paper No. 155, AI Lab, MIT.

Weiss, Y. and Freeman, W. (2001a). Correctness of belief propagation in Gaussian graphical models of arbitrary topology. Neural Computation, 13:2173–2200.

Weiss, Y. and Freeman, W. (2001b). On the optimality of solutions of the max-product belief propagation algorithm in arbitrary graphs. IEEE Transactions on Information Theory, 47(2):723–735.
