CS839: Probabilistic Graphical Models Lecture 5: Message Passing/Belief Propagation Theo Rekatsinas 1


Page 1

CS839: Probabilistic Graphical Models

Lecture 5: Message Passing/Belief Propagation

Theo Rekatsinas

Page 2

Junction Tree

• A clique tree for a triangulated graph is referred to as a junction tree.

• In junction trees, local consistency implies global consistency. Thus the local message-passing algorithm is (provably) correct.

• Only triangulated graphs have the property that their clique trees are junction trees. Thus, if we want local algorithms, we must triangulate.

Page 3

How to triangulate?

• Intermediate terms correspond to the cliques that result from elimination.

• VE (variable elimination) vs. MP (message passing) over a junction tree?
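One way to make the link between elimination and triangulation concrete is the sketch below. It assumes an undirected graph given as an edge list and a fixed elimination order; the function name `eliminate` and the 4-cycle example are illustrative and not taken from the slides. Eliminating a node connects its remaining neighbours, the added edges are the fill-in edges that triangulate the graph, and each eliminated node together with its neighbours forms an elimination clique.

```python
from itertools import combinations

def eliminate(edges, order):
    """Eliminate nodes in `order`; return (fill_edges, maximal_cliques)."""
    # Build an undirected adjacency map.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    fill_edges, cliques = [], []
    for node in order:
        nbrs = adj.pop(node)
        # The eliminated node and its remaining neighbours form a clique.
        cliques.append(frozenset(nbrs | {node}))
        # Connect the neighbours pairwise; any new edge is a fill-in edge.
        for a, b in combinations(sorted(nbrs), 2):
            if b not in adj[a]:
                adj[a].add(b)
                adj[b].add(a)
                fill_edges.append((a, b))
        for n in nbrs:
            adj[n].discard(node)
    # Keep only cliques that are not strictly contained in another clique.
    maximal = [c for c in cliques if not any(c < d for d in cliques)]
    return fill_edges, maximal

# A 4-cycle A-B-C-D-A is not triangulated; eliminating A adds the chord B-D.
cycle = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
fill, cliques = eliminate(cycle, ["A", "B", "C", "D"])
```

For this order the triangulated graph has two maximal cliques of size three, which would become the nodes of the junction tree. A different order can give different fill-in edges and larger cliques.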

Page 4

Sketch of the Junction Tree Algorithm

• Results in marginal probabilities of all cliques: solves all queries in a single run.

• A generic exact inference algorithm for any GM.

• Complexity: exponential in the size of the maximal clique. A good elimination order often leads to a small maximal clique, and hence a good (i.e., thin) JT.

Page 5

Inference in HMMs

• Summing with elimination

• Message passing corresponds to a forward and a backward pass

Page 6

Inference in HMMs

• A junction tree for the HMM

• Forward pass

Page 7

Inference in HMMs

• A junction tree for the HMM

• Backward pass
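The forward and backward passes can be sketched as follows for a discrete HMM. This is a minimal illustration under assumed conventions, not the slides' own notation: `pi` is the initial state distribution, `A[i, j]` is the transition probability from state i to state j, `B[k, m]` is the probability of emitting symbol m in state k, and the toy numbers are made up. The messages are left unscaled, so for long sequences a scaled or log-space variant would be needed to avoid underflow.

```python
import numpy as np

def forward_backward(pi, A, B, obs):
    """Posterior marginals p(z_t | x_1..x_T) and the likelihood p(x_1..x_T)."""
    T, K = len(obs), len(pi)
    alpha = np.zeros((T, K))
    beta = np.ones((T, K))
    # Forward pass: alpha[t, k] = p(x_1..x_t, z_t = k).
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    # Backward pass: beta[t, k] = p(x_{t+1}..x_T | z_t = k).
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    likelihood = alpha[-1].sum()
    return alpha * beta / likelihood, likelihood

# Toy two-state, two-symbol HMM (hypothetical parameters).
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.2, 0.8]])
B = np.array([[0.9, 0.1],
              [0.3, 0.7]])
obs = [0, 1, 0]
gamma, likelihood = forward_backward(pi, A, B, obs)
```

Each row of `gamma` is the posterior over the hidden state at one time step, obtained by combining the message arriving from the left with the message arriving from the right.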

Page 8

CS839: Probabilistic Graphical Models

Lecture 6: Generalized Linear Models (MLE)

Theo Rekatsinas

Page 9

Parameters in Graphical Models

• Bayesian network

How do we find these parameters?

Page 10

Linear Regression as a Bayes Net

• Linear regression: $D = ((x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n))$, with $x_i \in \mathbb{R}^d$, $y_i \in \mathbb{R}$, and

$y_i = \theta^T x_i + \epsilon_i$

• Assume that $\epsilon$ (the error term of unmodeled effects or random noise) is a Gaussian random variable $N(0, \sigma^2)$, so that

$p(y_i \mid x_i; \theta) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y_i - \theta^T x_i)^2}{2\sigma^2}\right)$

• Use the least-mean-squares (LMS) algorithm to estimate the parameters.
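A minimal sketch of the LMS update for this model, assuming a constant learning rate and synthetic data (the function name `lms`, the step size, and the data are illustrative choices, not from the slides). Each step moves $\theta$ along the gradient of the squared error of a single example, which is also the gradient of that example's Gaussian log-likelihood up to a factor of $1/\sigma^2$.

```python
import numpy as np

def lms(X, y, lr=0.01, epochs=200, seed=0):
    """Least-mean-squares: stochastic gradient steps on the squared error."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            # Residual of one example times its input vector.
            theta += lr * (y[i] - X[i] @ theta) * X[i]
    return theta

# Synthetic data from y = theta^T x + Gaussian noise (assumed parameters).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
true_theta = np.array([1.5, -2.0, 0.5])
y = X @ true_theta + 0.1 * rng.normal(size=200)
theta_hat = lms(X, y)
```

With a constant step size the estimate hovers near the least-squares solution rather than converging to it exactly; a decaying step size would tighten this.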

Page 11

Logistic Regression (sigmoid classifier) as a GM

• Conditional distribution is a Bernoulli:

$p(y \mid x) = \mu(x)^y (1 - \mu(x))^{1-y}$

where

$\mu(x) = \frac{1}{1 + \exp(-\theta^T x)}$

• We can use a tailored gradient method again, as in linear regression.

• But note that $p(y \mid x)$ belongs to the exponential family, and the model is a generalized linear model.

Page 12

Markov random fields

Page 13

Restricted Boltzmann Machines

Page 14

Conditional Random Fields

• Discriminative

• Doesn't assume that features are independent

• When labeling, feature observations are taken into account

$P_\theta(Y \mid X) = \frac{1}{Z(\theta, X)} \exp\left(\sum_c \theta_c f_c(X, Y_c)\right)$

Page 15

Exponential family: a basic building block

• For a numeric random variable $X$,

$p(x \mid \eta) = h(x) \exp\left\{\eta^T T(x) - A(\eta)\right\} = \frac{1}{Z(\eta)} h(x) \exp\left\{\eta^T T(x)\right\}$

is an exponential family distribution with natural (canonical) parameter $\eta$.

• Function $T(x)$ is a sufficient statistic.

• Function $A(\eta) = \log Z(\eta)$ is the log normalizer.

• Examples: Bernoulli, multinomial, Gaussian, Poisson, Gamma, Categorical
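As a small worked instance of this form, the sketch below writes the Bernoulli distribution as $h(x) \exp\{\eta T(x) - A(\eta)\}$ and compares it with the usual $\pi^x (1 - \pi)^{1-x}$. The standard identification is $\eta = \log(\pi / (1 - \pi))$, $T(x) = x$, $A(\eta) = \log(1 + e^\eta)$ and $h(x) = 1$; the function names are illustrative.

```python
import math

def bernoulli_expfam(x, pi):
    """Bernoulli pmf written as h(x) * exp(eta * T(x) - A(eta))."""
    eta = math.log(pi / (1.0 - pi))      # natural parameter (log-odds)
    T = x                                # sufficient statistic
    A = math.log(1.0 + math.exp(eta))    # log normalizer
    h = 1.0                              # base measure
    return h * math.exp(eta * T - A)

def bernoulli_standard(x, pi):
    """Bernoulli pmf in its usual mean parametrization."""
    return pi ** x * (1.0 - pi) ** (1 - x)

# The two parametrizations agree for both outcomes.
pairs = [(bernoulli_expfam(x, 0.3), bernoulli_standard(x, 0.3)) for x in (0, 1)]
```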

Page 16

Example: Multivariate Gaussian Distribution

• For a continuous vector random variable $X \in \mathbb{R}^k$

• Exponential family representation

Page 17

Example: Multinomial Distribution

• For a binary vector random variable $x \sim \text{multi}(x \mid \pi)$

Page 18

Why exponential family?

• Moment generating property

We can easily compute moments of any exponential family distribution by taking the derivatives of the log normalizer $A(\eta)$.
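The moment generating property can be checked numerically. The sketch below uses the Bernoulli log normalizer $A(\eta) = \log(1 + e^\eta)$ and central finite differences (an illustrative choice; the helper names are assumptions). The first derivative should recover the mean $\pi$ and the second derivative the variance $\pi(1 - \pi)$.

```python
import math

def A(eta):
    """Log normalizer of the Bernoulli distribution."""
    return math.log1p(math.exp(eta))

def derivative(f, x, h=1e-4):
    """Central finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def second_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h ** 2

pi = 0.3
eta = math.log(pi / (1 - pi))           # natural parameter for mean 0.3
mean_from_A = derivative(A, eta)        # approximates E[T(x)] = pi
var_from_A = second_derivative(A, eta)  # approximates Var[T(x)] = pi * (1 - pi)
```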

Page 19

Moments vs canonical parameters

• The moment parameters (e.g., $\mu$) can be derived from the natural parameters by differentiating $A(\eta)$:
  • First derivative = mean
  • Second derivative = variance
  • Etc.

• $A(\eta)$ is convex

• Hence, we can invert the relationship and infer the canonical parameters from the moment parameters (1-to-1).
  • A distribution in the exponential family can be parametrized not only by $\eta$ but also by $\mu$.

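Page 20

MLE for Exponential Family

• For iid data $D = \{x_1, \ldots, x_N\}$, the log-likelihood is

$\ell(\eta; D) = \sum_{n=1}^{N} \log h(x_n) + \eta^T \sum_{n=1}^{N} T(x_n) - N A(\eta)$

• We take the derivatives and set them to zero:

$\nabla_\eta \ell = \sum_{n=1}^{N} T(x_n) - N \nabla_\eta A(\eta) = 0$

• We perform moment matching:

$\hat{\mu}_{MLE} = \nabla_\eta A(\hat{\eta}_{MLE}) = \frac{1}{N} \sum_{n=1}^{N} T(x_n)$

• We can infer the canonical parameters using the inverse of the map from $\eta$ to $\mu$:

$\hat{\eta}_{MLE} = (\nabla_\eta A)^{-1}(\hat{\mu}_{MLE})$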
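A short numerical illustration of moment matching, assuming Poisson data (chosen here only as a convenient example; the rate and sample size are made up). For the Poisson, $T(x) = x$ and $A(\eta) = e^\eta$, so the mean parameter is $\mu = e^\eta$ and the MLE of $\eta$ is the log of the sample mean.

```python
import numpy as np

rng = np.random.default_rng(0)
true_rate = 4.0
x = rng.poisson(true_rate, size=5000)

# Moment matching: the MLE of the mean parameter is the empirical mean of T(x) = x.
mu_mle = x.mean()
# Invert mu = dA/deta = exp(eta) to recover the natural parameter.
eta_mle = np.log(mu_mle)

def avg_log_lik(eta):
    """Average log-likelihood, dropping the log h(x) term that does not depend on eta."""
    return eta * x.mean() - np.exp(eta)
```

Evaluating `avg_log_lik` slightly to either side of `eta_mle` gives a lower value, as expected at the maximum of a concave function.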

Page 21

Sufficiency

• For $p(x \mid \theta)$, $T(x)$ is sufficient for $\theta$ if there is no information in $X$ regarding $\theta$ beyond that in $T(x)$.
  • We can throw away $X$ for the purpose of inference w.r.t. $\theta$.

• Bayesian view: $p(\theta \mid T(x), x) = p(\theta \mid T(x))$

• Frequentist view: $p(x \mid T(x), \theta) = p(x \mid T(x))$

• Neyman factorization theorem
  • $T(x)$ is sufficient for $\theta$ if $p(x \mid \theta) = g(T(x), \theta)\, h(x)$

Page 22

Examples

• Gaussian:

• Multinomial:

Page 23

Generalized Linear Models

• The graphical model:
  • Linear regression
  • Discriminative linear classification

• Generalized Linear Model
  • The observed input $x$ is assumed to enter into the model via a linear combination of its elements, $\xi = \theta^T x$.
  • The conditional mean $\mu$ is represented as a function $f(\xi)$ of $\xi$, where $f$ is known as the response function: $E_p(T) = \mu = f(\theta^T x)$.
  • The observed output $y$ is assumed to be characterized by an exponential family distribution with conditional mean $\mu$.

Page 24

Generalized Linear Models

Page 25

MLE for GLIMs with natural response

• Log-likelihood

• Derivative of log-likelihood

• Learning for canonical GLIMs
  • Stochastic gradient ascent = least mean squares (LMS)
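For a canonical GLIM with $T(y) = y$ and $\eta_n = \theta^T x_n$, the standard result is that the gradient of the log-likelihood is $\sum_n (y_n - \mu_n) x_n = X^T (y - \mu)$. The sketch below checks this numerically for the Bernoulli case against finite differences. The data, the parameter values, and the function names are assumptions made for the illustration.

```python
import numpy as np

def log_lik(theta, X, y):
    """Bernoulli GLIM log-likelihood with eta_n = theta^T x_n, A(eta) = log(1 + e^eta)."""
    eta = X @ theta
    return np.sum(y * eta - np.logaddexp(0.0, eta))

def grad_log_lik(theta, X, y):
    """Closed-form gradient X^T (y - mu) with mu = sigmoid(X theta)."""
    mu = 1.0 / (1.0 + np.exp(-(X @ theta)))
    return X.T @ (y - mu)

def numerical_grad(f, theta, h=1e-6):
    """Central finite-difference gradient of a scalar function f."""
    g = np.zeros_like(theta)
    for j in range(len(theta)):
        e = np.zeros_like(theta)
        e[j] = h
        g[j] = (f(theta + e) - f(theta - e)) / (2 * h)
    return g

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 3))
y = rng.integers(0, 2, size=50).astype(float)
theta = np.array([0.5, -1.0, 0.25])
analytic = grad_log_lik(theta, X, y)
numeric = numerical_grad(lambda t: log_lik(t, X, y), theta)
```

Using the single-example term $(y_n - \mu_n) x_n$ as a stochastic gradient step gives an update of the same form as LMS, with $\mu_n$ replaced by the model's response.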

Page 26

Second-order methods

• The Hessian matrix

• $X$ is the design matrix and $W$ is computed by calculating the second derivative of $A(\eta_n)$

Page 27

Back to Least Squares

• Objective function in matrix form

• To minimize this objective we take the derivative and set it to zero
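A brief sketch of this step, assuming the usual objective $J(\theta) = \frac{1}{2} \lVert X\theta - y \rVert^2$ with $X$ the design matrix (the data below is synthetic and the variable names are illustrative). Setting the gradient $X^T (X\theta - y)$ to zero gives the normal equations $X^T X \theta = X^T y$.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))   # design matrix, one row per example
y = X @ np.array([1.0, 2.0, -1.0]) + 0.05 * rng.normal(size=100)

# Solve the normal equations X^T X theta = X^T y directly.
theta_normal = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check with NumPy's least-squares solver.
theta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# The gradient of J vanishes at the solution.
gradient = X.T @ (X @ theta_normal - y)
```

Solving the linear system is preferred to forming an explicit inverse of $X^T X$; for ill-conditioned problems `np.linalg.lstsq` is the more robust of the two.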

Page 28

Iteratively Reweighted Least Squares

• Newton-Raphson methods with objective $J$

• We have

• Update

Page 29

Iteratively Reweighted Least Squares

• Newton-Raphson methods with objective $J$

• We have

• Update

Generic update for any exponential family distribution

Page 30

Example 1: Logistic Regression

• Conditional distribution is a Bernoulli:

$p(y \mid x) = \mu(x)^y (1 - \mu(x))^{1-y}$

where

$\mu(x) = \frac{1}{1 + \exp(-\eta(x))}$

• IRLS

$\eta = \xi = \theta^T x$

$\frac{\partial \mu}{\partial \eta} = \mu (1 - \mu)$

$W = \begin{bmatrix} \mu_1 (1 - \mu_1) & & \\ & \ddots & \\ & & \mu_N (1 - \mu_N) \end{bmatrix}$
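IRLS for logistic regression can be sketched as a Newton-Raphson iteration with the diagonal weight matrix $W$ above. The code assumes the standard forms of the gradient $X^T (y - \mu)$ and the curvature $X^T W X$, which are not spelled out in the extracted slide text; the function name, synthetic data, and stopping rule are illustrative.

```python
import numpy as np

def irls_logistic(X, y, iters=25, tol=1e-10):
    """Newton-Raphson / IRLS for logistic regression."""
    theta = np.zeros(X.shape[1])
    for _ in range(iters):
        mu = 1.0 / (1.0 + np.exp(-(X @ theta)))
        w = mu * (1.0 - mu)              # diagonal entries of W
        grad = X.T @ (y - mu)            # gradient of the log-likelihood
        hess = X.T @ (w[:, None] * X)    # X^T W X
        # Newton step; equivalent to a weighted least-squares solve
        # with working response z = X theta + W^{-1} (y - mu).
        step = np.linalg.solve(hess, grad)
        theta = theta + step
        if np.max(np.abs(step)) < tol:
            break
    return theta

# Synthetic, non-separable data with an intercept column (assumed parameters).
rng = np.random.default_rng(3)
X = np.column_stack([np.ones(400), rng.normal(size=(400, 2))])
true_theta = np.array([-0.5, 1.0, -2.0])
p = 1.0 / (1.0 + np.exp(-(X @ true_theta)))
y = (rng.random(400) < p).astype(float)
theta_hat = irls_logistic(X, y)
```

If the classes are linearly separable the MLE does not exist and the iterates grow without bound, so a practical version would add regularization or a step-size safeguard.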

Page 31

Example 2: Linear Regression

• Conditional distribution is a Gaussian:

• IRLS

Page 32

Simple GLIMs are the building blocks of complex BNs

CPDs correspond to GLIMs

Page 33

MLE for general BNs

• If we assume the parameters for each CPD are globally independent, and all nodes are fully observed, then the log-likelihood function decomposes into a sum of local terms, one per node.

• MLE-based parameter estimation of a GM reduces to local estimation of each GLIM.
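The decomposition can be illustrated with the simplest kind of CPD, a table over discrete values, where each local MLE reduces to normalized counts. The two-node network Rain -> WetGrass, its data, and the helper names below are hypothetical and serve only to show that each CPD is estimated from its own node and parents.

```python
from collections import Counter

# Fully observed samples (rain, wet_grass) from a hypothetical network Rain -> WetGrass.
data = [(1, 1), (1, 1), (1, 0), (0, 0), (0, 0), (0, 1), (0, 0), (1, 1)]

def mle_root(values):
    """MLE of p(node) for a node without parents: relative frequencies."""
    counts = Counter(values)
    return {v: c / len(values) for v, c in counts.items()}

def mle_conditional(pairs):
    """MLE of p(child | parent) from (parent, child) pairs."""
    joint = Counter(pairs)
    parent = Counter(p for p, _ in pairs)
    return {(p, c): n / parent[p] for (p, c), n in joint.items()}

# Each CPD is fitted separately, using only the columns it involves.
p_rain = mle_root([r for r, _ in data])
p_wet_given_rain = mle_conditional(data)
```

With continuous or high-dimensional parents, each table would be replaced by a GLIM fitted with the same local data.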

Page 34

Summary

• For exponential family distributions, MLE amounts to moment matching

• GLIM:
  • Natural response
  • Iteratively Reweighted Least Squares as a general algorithm

• GLIMs are building blocks of most practical GMs