
Page 1: Statistical Methods in AI/ML

Statistical Methods in AI/ML

Bucket Elimination
Vibhav Gogate

Page 2: Statistical Methods in AI/ML

Bucket Elimination: Initialization

[Figure: interaction graph over variables A, B, C, D, E, F with functions (A,B), (A,C), (B,D), (C,D), (C,E), (D,F), (E,F); bucket order A, E, D, F, B, C. Bucket A holds (A,B) and (A,C); bucket E holds (C,E) and (E,F); bucket D holds (B,D), (C,D), and (D,F).]

• You put each function in exactly one bucket
• How? Along the order, find the first bucket such that one of the variables in the function's scope is the bucket variable

Page 3: Statistical Methods in AI/ML

Bucket elimination: Processing Buckets

• Process in order
• Multiply all the functions in the bucket
• Sum out the bucket variable
• Put the new function in one of the buckets, obeying the initialization constraint

[Figure: the buckets processed in order A, E, D, F, B, C. Bucket A produces ψ(B,C), placed in bucket B; bucket E produces ψ(C,F), placed in bucket F; bucket D produces ψ(B,C,F), placed in bucket F; bucket F produces ψ2(B,C), placed in bucket B; bucket B produces ψ(C), placed in bucket C; bucket C produces the constant Z.]
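The two steps above (initialization, then processing buckets in order) can be sketched in Python. This is a minimal sketch under assumed conventions, not the slides' implementation: all variables are binary, factors are (scope, table) pairs with tables as dicts from assignment tuples to numbers, and the name bucket_elimination is hypothetical.

```python
from itertools import product

def bucket_elimination(factors, order, domain=(0, 1)):
    """Sum out all variables along `order`, returning the constant Z."""
    # Initialization: put each function in the first bucket (along the
    # order) whose bucket variable appears in the function's scope.
    buckets = {v: [] for v in order}
    for scope, table in factors:
        first = min(order.index(v) for v in scope)
        buckets[order[first]].append((scope, table))

    z = 1.0
    for v in order:                      # process buckets in order
        funcs = buckets.pop(v)
        if not funcs:
            continue
        # Scope of the generated function: union of scopes, minus v.
        new_scope = tuple(sorted({u for s, _ in funcs for u in s} - {v}))
        new_table = {}
        for assign in product(domain, repeat=len(new_scope)):
            env = dict(zip(new_scope, assign))
            total = 0.0
            for val in domain:           # sum out the bucket variable
                env[v] = val
                p = 1.0
                for s, t in funcs:       # multiply all bucket functions
                    p *= t[tuple(env[u] for u in s)]
                total += p
            new_table[assign] = total
        if new_scope:                    # place it, obeying the same rule
            first = min(order.index(u) for u in new_scope)
            buckets[order[first]].append((new_scope, new_table))
        else:                            # all variables summed out
            z *= new_table[()]
    return z
```

Run on the slides' seven factors with order A, E, D, F, B, C, this generates exactly the functions ψ(B,C), ψ(C,F), ψ(B,C,F), ψ2(B,C), ψ(C) shown above.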

Page 4: Statistical Methods in AI/ML

Bucket elimination: Why it works?

[Figure: the buckets and the generated functions ψ, as on the previous slide.]

Z = ∑_c ∑_b ∑_f ∑_d ∑_e ∑_a φ(a,b) φ(a,c) φ(b,d) φ(c,d) φ(c,e) φ(d,f) φ(e,f)
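Each bucket realizes one summation pushed as far inward as it will go. The identity can be checked numerically on a smaller, hypothetical two-factor chain (not the slides' network): summing out a first produces an intermediate function ψ(b), exactly as bucket A would.

```python
from itertools import product

# Two arbitrary positive tables over binary variables (hypothetical example)
phi1 = {(a, b): a + 2*b + 1 for a, b in product((0, 1), repeat=2)}  # phi1(a,b)
phi2 = {(b, c): 3*b + c + 1 for b, c in product((0, 1), repeat=2)}  # phi2(b,c)

# Naive: Z = sum_c sum_b sum_a phi1(a,b) * phi2(b,c)
naive = sum(phi1[a, b] * phi2[b, c]
            for a, b, c in product((0, 1), repeat=3))

# Pushed in: Z = sum_c sum_b phi2(b,c) * (sum_a phi1(a,b))
psi = {b: sum(phi1[a, b] for a in (0, 1)) for b in (0, 1)}  # psi(b)
pushed = sum(phi2[b, c] * psi[b] for b, c in product((0, 1), repeat=2))

assert naive == pushed
```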

Pages 5-8: Statistical Methods in AI/ML

Bucket elimination: Why it works?

[Slides 5-8 repeat the figure, pushing each summation inward one bucket at a time: ∑_a φ(a,b)φ(a,c) = ψ(b,c), then ∑_e φ(c,e)φ(e,f) = ψ(c,f), then ∑_d φ(b,d)φ(c,d)φ(d,f) = ψ(b,c,f), then ∑_f ψ(c,f)ψ(b,c,f) = ψ2(b,c), and so on.]

Page 9: Statistical Methods in AI/ML

Bucket elimination: Complexity

[Figure: the buckets with the cost of processing each. Bucket A: exp(3), E: exp(3), D: exp(4), F: exp(3), B: exp(2), C: exp(1); total ≈ 6 exp(3).]

Complexity: O(n exp(w))
w: size of the scope of the largest function generated
n: number of variables

Page 10: Statistical Methods in AI/ML

Bucket elimination: Determining complexity graphically

• Schematic operation on a graph
  – Process nodes in order
  – Connect all children of a node to each other

[Figure: the interaction graph over A, B, C, D, E, F and the order A, E, D, F, B, C.]
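The schematic operation can be sketched in Python. "Children" here are the neighbors a node still has when it is processed, and the largest such set gives the induced width. A minimal sketch; the name induced_width is hypothetical.

```python
def induced_width(edges, order):
    """Schematic bucket elimination on the interaction graph.

    Process nodes along the order; when a node is eliminated, connect
    all of its remaining neighbors to each other.  Returns the largest
    number of remaining neighbors any node has when eliminated.
    """
    adj = {v: set() for v in order}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    width = 0
    for v in order:
        nbrs = adj[v]
        width = max(width, len(nbrs))
        for u in nbrs:              # connect the children to each other
            adj[u] |= nbrs - {u}
            adj[u].discard(v)       # v is now eliminated
        del adj[v]
    return width
```

On the slides' example graph with order A, E, D, F, B, C this yields width 3, matching the largest generated function ψ(B,C,F).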

Page 11: Statistical Methods in AI/ML

Bucket elimination: Complexity

• Complexity of processing a bucket i: exp(|children(i)|)
• Complexity of bucket elimination: n exp(max_i |children(i)|)

[Figure: the ordered graph from the previous slide.]

Page 12: Statistical Methods in AI/ML

Treewidth and Tree Decompositions

• Running schematic bucket elimination yields a chordal graph
  – Each cycle of length > 3 has a chord (an edge connecting two nodes that are not adjacent in the cycle)
• Every chordal graph can be represented using a tree decomposition

Page 13: Statistical Methods in AI/ML

Tree Decomposition of Chordal graphs

[Figure: the chordal graph over A, B, C, D, E, F and a tree decomposition with clusters ABC, FBC, DBCF, and EFC; each tree edge is labeled with the separator its endpoints share (e.g. BC, FC).]

Page 14: Statistical Methods in AI/ML

Tree Decomposition and Treewidth: Definition

• Given a network and its interaction graph
• A tree decomposition is a set of subsets of variables connected by a tree such that:
  – Each variable is present in at least one subset
  – Each edge is present in at least one subset
  – The set of subsets containing a variable X forms a connected sub-tree (the running intersection property)
• Width of a tree decomposition: cardinality of the largest subset minus 1
• Treewidth: minimum width over all possible tree decompositions
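The three conditions can be checked mechanically. Below is a minimal sketch (all names hypothetical) that verifies variable coverage, edge coverage, and the running intersection property:

```python
def is_tree_decomposition(clusters, tree_edges, graph_edges):
    """clusters: name -> set of variables; tree_edges: pairs of cluster
    names forming a tree; graph_edges: edges of the interaction graph."""
    variables = {v for e in graph_edges for v in e}
    # 1. Every variable appears in at least one cluster.
    if not all(any(v in c for c in clusters.values()) for v in variables):
        return False
    # 2. Every edge is contained in at least one cluster.
    if not all(any(u in c and v in c for c in clusters.values())
               for u, v in graph_edges):
        return False
    # 3. Running intersection: the clusters containing a variable v
    #    must form a connected sub-tree.
    for v in variables:
        holding = {name for name, c in clusters.items() if v in c}
        start = next(iter(holding))
        seen, frontier = {start}, [start]
        while frontier:                 # BFS restricted to `holding`
            x = frontier.pop()
            for a, b in tree_edges:
                if x not in (a, b):
                    continue
                y = b if x == a else a
                if y in holding and y not in seen:
                    seen.add(y)
                    frontier.append(y)
        if seen != holding:
            return False
    return True
```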

Page 15: Statistical Methods in AI/ML

Bucket elimination: Complexity

• Best possible complexity: O(n exp(w+1)), where w is the treewidth of the graph
• Thus, we have a graph-based algorithm for determining the complexity of bucket elimination.
• If w is small, we can solve the problem efficiently!

Page 16: Statistical Methods in AI/ML

Generating Tree Decompositions

• Computing treewidth is NP-hard
• Branch and bound algorithm (Gogate & Dechter, 2004)
• Best-first search algorithm (Dow and Korf, 2009)
• Heuristics in practice
  – min-fill heuristic
  – min-degree heuristic

Page 17: Statistical Methods in AI/ML

Min-degree and min-fill

• min-degree
  – At each point, select a variable with minimum degree (ties broken arbitrarily)
  – Connect the children of the variable to each other
• min-fill
  – At each point, select a variable that adds the minimum number of edges to the current graph
  – Connect the children of the selected variable to each other
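The min-fill heuristic can be sketched with the same schematic graph operation (a minimal sketch; the name min_fill_order is hypothetical, and ties are broken arbitrarily by min):

```python
def min_fill_order(edges):
    """Greedy min-fill elimination ordering."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    order = []
    while adj:
        def fill(v):
            # Fill edges eliminating v would add: pairs of neighbors
            # of v that are not already connected.
            nbrs = list(adj[v])
            return sum(1 for i in range(len(nbrs))
                       for j in range(i + 1, len(nbrs))
                       if nbrs[j] not in adj[nbrs[i]])
        v = min(adj, key=fill)
        order.append(v)
        nbrs = adj.pop(v)
        for u in nbrs:                  # connect children, drop v
            adj[u] |= nbrs - {u}
            adj[u].discard(v)
    return order
```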

Page 18: Statistical Methods in AI/ML

Bucket Elimination: Implementation

• Two basic operations: sum-out and product
  – A naïve implementation of these two operations will make your algorithm very slow
• Factors: use arrays instead of hashmaps!
  – Fast member functions for the following:
    • Variable assignment to entry in the array
    • Entry in the array to assignment
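The two member functions the slide asks for amount to mixed-radix arithmetic over the variables' domain sizes. A minimal sketch (hypothetical names; card lists each variable's cardinality, with the last variable varying fastest):

```python
def assignment_to_index(assign, card):
    """Map an assignment tuple to a flat array index (row-major)."""
    idx = 0
    for a, c in zip(assign, card):
        idx = idx * c + a
    return idx

def index_to_assignment(idx, card):
    """Inverse of assignment_to_index."""
    assign = []
    for c in reversed(card):            # peel off digits, fastest first
        assign.append(idx % c)
        idx //= c
    return tuple(reversed(assign))
```

With these two maps, a factor table is just a flat array of length card[0] * card[1] * ..., and sum-out and product never touch a hashmap.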

Page 19: Statistical Methods in AI/ML

Computing all Marginals

• Bucket elimination computes
  – P(e) or Z
  – P(Xi | e), where Xi is the last variable eliminated
• To compute all marginals P(Xi | e) for all variables Xi
  – Run bucket elimination n times
• Efficient algorithm
  – Junction tree algorithm or bucket tree propagation
  – Requires only two passes to compute all marginals

Page 20: Statistical Methods in AI/ML

Junction tree algorithm: an exact message passing algorithm

• Construct a tree decomposition T
• Initialize the tree decomposition as in bucket elimination
• Select an arbitrary node of T as root
• Pass messages from leaves to root (upward pass)
• Pass messages from root to leaves (downward pass)

Page 21: Statistical Methods in AI/ML

Message passing equations

• Multiply all received messages except the one from R
• Multiply all functions
• Sum out all variables except the separator

m(S → R) = ∑_{Vars(S) − Sep(S,R)} ∏_{f ∈ functions(S)} f · ∏_{G ∈ Neighbors(S) − {R}} m(G → S)
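The equation can be sketched directly for binary variables (a minimal sketch; all names hypothetical). Here tables holds the functions in cluster S and incoming holds the messages m(G → S) already received from neighbors other than R:

```python
from itertools import product

def message(cluster_vars, separator, tables, incoming, domain=(0, 1)):
    """m(S -> R): multiply all functions in S with all incoming
    messages, then sum out Vars(S) - Sep(S, R)."""
    funcs = list(tables) + list(incoming)   # each: (scope, table-dict)
    hidden = [v for v in cluster_vars if v not in separator]
    out = {}
    for sep_assign in product(domain, repeat=len(separator)):
        env = dict(zip(separator, sep_assign))
        total = 0.0
        for h in product(domain, repeat=len(hidden)):
            env.update(zip(hidden, h))      # sum over non-separator vars
            p = 1.0
            for scope, table in funcs:
                p *= table[tuple(env[u] for u in scope)]
            total += p
        out[sep_assign] = total
    return out
```

For a cluster S = {B, C} holding a single function φ(B,C) and separator {C}, the message is simply ∑_b φ(b, c), the same operation a bucket performs.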

Page 22: Statistical Methods in AI/ML

Computing all marginals

[Figure: a cluster S with its marginal P(S), computed from the functions in S and the messages S has received.]

Page 23: Statistical Methods in AI/ML

Message passing Equations

• Select EFC as root
• Pass messages from leaves to root
• Pass messages from root to leaves

[Figure: the tree decomposition with clusters ABC, FBC, DBCF, and EFC, initialized with (A,B) and (A,C) in ABC; (B,D), (C,D), and (D,F) in DBCF; and (C,E) and (E,F) in EFC.]

Page 24: Statistical Methods in AI/ML

Architectures

• Shenoy-Shafer architecture
• Hugin architecture
  – Associate one function with each cluster
  – Requires division
  – Smaller time complexity
  – Higher space complexity