
Page 1: Chapter 13

Chapter 13

Constraint Optimization, Counting, and Enumeration

ICS-275 class

Page 2: Chapter 13

From satisfaction to counting and enumeration and to general graphical models

ICS-275, Spring 2007

Page 3: Chapter 13

Outline

  Introduction
  Optimization tasks for graphical models
  Solving optimization problems with inference and search
  Inference: bucket elimination (dynamic programming), mini-bucket elimination
  Search: branch and bound and best-first, lower-bounding heuristics, AND/OR search spaces
  Hybrids of search and inference: cutset decomposition, super-bucket scheme

Page 4: Chapter 13

Example: map coloring
  Variables - countries (A, B, C, etc.)
  Values - colors (e.g., red, green, yellow)
  Constraints: A ≠ B, A ≠ D, D ≠ E, etc.

Allowed pairs for the constraint between A and B:
  A       B
  red     green
  red     yellow
  green   red
  green   yellow
  yellow  green
  yellow  red

(Constraint graph over the countries A, B, C, D, E, F, G.)

Task: consistency? Find a solution, all solutions, counting.

Constraint Satisfaction

Page 5: Chapter 13

φ = {(¬C), (A ∨ B ∨ C), (¬A ∨ B ∨ E), (¬B ∨ C ∨ D)}

Propositional Satisfiability
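For a theory this small, satisfaction, counting, and enumeration can be illustrated by brute force. The sketch below is illustrative Python (the clause encoding and variable list are assumptions made here, not part of the course material); it enumerates all assignments, checks each clause, and counts the models of the CNF above.

```python
from itertools import product

# The theory above: phi = (~C) & (A v B v C) & (~A v B v E) & (~B v C v D).
# A clause is a list of (variable, sign) literals; sign True means the positive literal.
clauses = [
    [("C", False)],
    [("A", True), ("B", True), ("C", True)],
    [("A", False), ("B", True), ("E", True)],
    [("B", False), ("C", True), ("D", True)],
]
variables = ["A", "B", "C", "D", "E"]

def satisfies(assignment, clauses):
    """True iff every clause contains at least one satisfied literal."""
    return all(any(assignment[v] == sign for v, sign in clause) for clause in clauses)

# Satisfaction, counting, and enumeration by exhausting all 2^n assignments.
models = [dict(zip(variables, values))
          for values in product([False, True], repeat=len(variables))
          if satisfies(dict(zip(variables, values)), clauses)]
print("satisfiable:", bool(models), " #models:", len(models))
```

Bucket elimination and AND/OR search, introduced later in the chapter, answer the same queries without enumerating all 2^n assignments.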

Page 6: Chapter 13

Constraint Optimization Problems for Graphical Models

A finite COP is a triple ⟨X, D, F⟩, where:
  X = {X1, ..., Xn} - variables
  D = {D1, ..., Dn} - domains
  F = {f1, ..., fm} - cost functions

Example cost function f(A,B,D):
  A B D | cost
  1 2 3 |  3
  1 3 2 |  2
  2 1 3 |  0
  2 3 1 |  0
  3 1 2 |  5
  3 2 1 |  0

Global cost function: F(X) = Σ_{i=1}^{m} f_i(X_i)

f(A,B,D) has scope {A,B,D}

Page 7: Chapter 13

Constraint Optimization Problems for Graphical Models (continued)

A finite COP is a triple ⟨X, D, F⟩ with variables X = {X1, ..., Xn}, domains D = {D1, ..., Dn}, and cost functions F = {f1, ..., fm}; the global cost function is F(X) = Σ_{i=1}^{m} f_i(X_i). The cost table for f(A,B,D) is as on the previous slide.

Primal graph: variables --> nodes; functions/constraints --> arcs.

f1(A,B,D), f2(D,F,G), f3(B,C,F)

f(A,B,D) has scope {A,B,D}

F(a,b,c,d,f,g)= f1(a,b,d)+f2(d,f,g)+f3(b,c,f)
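As a concrete reading of the definition, the sketch below (illustrative Python; the representation and most of the numbers are assumptions, only the f(A,B,D) rows reused from the table above come from the slides) stores each cost function as a scope plus a table and evaluates the global cost F as the sum of the functions on a full assignment.

```python
# A cost function is a (scope, table) pair: table maps a tuple of values over the
# scope to a cost.  f1 reuses three rows of the f(A,B,D) table above; f2 and f3
# are made-up placeholders for illustration.
functions = [
    (("A", "B", "D"), {(1, 2, 3): 3, (1, 3, 2): 2, (2, 1, 3): 0}),  # f1(A,B,D)
    (("D", "F", "G"), {(3, 1, 1): 1, (2, 2, 1): 0}),                 # f2(D,F,G)
    (("B", "C", "F"), {(2, 2, 1): 4, (3, 1, 2): 0}),                 # f3(B,C,F)
]

def global_cost(assignment, functions):
    """F(X) = sum_i f_i(X_i): project the assignment on each scope and add the entries."""
    return sum(table[tuple(assignment[v] for v in scope)]
               for scope, table in functions)

# F(a,b,c,d,f,g) = f1(a,b,d) + f2(d,f,g) + f3(b,c,f)
print(global_cost({"A": 1, "B": 2, "C": 2, "D": 3, "F": 1, "G": 1}, functions))  # 3 + 1 + 4 = 8
```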

Page 8: Chapter 13

Constrained Optimization

Example: power plant scheduling

Variables: X1, ..., Xn with domain {ON, OFF}
Constraints: min-up and min-down times; power demand: Σ_i Power(X_i) ≥ Demand
Objective: minimize TotalFuelCost(X1, ..., XN)

Page 9: Chapter 13

Probabilistic Networks

(Belief network over Smoking, Bronchitis, Cancer, X-Ray, Dyspnoea, with CPTs P(S), P(B|S), P(C|S), P(X|C,S), P(D|C,B).)

P(S,C,B,X,D) = P(S)· P(C|S)· P(B|S)· P(X|C,S)· P(D|C,B)

P(D|C,B):
  C B | D=0  D=1
  0 0 | 0.1  0.9
  0 1 | 0.7  0.3
  1 0 | 0.8  0.2
  1 1 | 0.9  0.1

Page 10: Chapter 13

Graphical models

A graphical model ⟨X, D, C⟩:
  X = {X1, ..., Xn} - variables
  D = {D1, ..., Dn} - domains
  C = {F1, ..., Ft} - functions (constraints, CPTs, CNFs)
Primal graph (here over A, B, C, D, E, F).

Example functions: a constraint F_i: F = A ∨ C, or a CPT F_i: P(F | A, C).

Page 11: Chapter 13

Graphical models (continued)

(Same graphical model ⟨X, D, C⟩ and primal graph as on the previous slide.)

Tasks, each defined by a marginalization operator applied over the combination of the functions:
  MPE:          max_X ∏_j P_j
  CSP:          is there an X satisfying all constraints C_j?
  Max-CSP:      min_X Σ_j F_j
  Optimization: min_X Σ_j F_j

A COP is thus defined by the operators sum/product (combination) and min/max (marginalization) over consistent solutions.

Page 12: Chapter 13

Outline

  Introduction
  Optimization tasks for graphical models
  Solving by inference and search
  Inference: bucket elimination, dynamic programming, tree-clustering; mini-bucket elimination, belief propagation
  Search: branch and bound and best-first, lower-bounding heuristics, AND/OR search spaces
  Hybrids of search and inference: cutset decomposition, super-bucket scheme

Page 13: Chapter 13

Computing MPE

“Moral” graph over A, B, C, D, E.

Variable Elimination:
MPE = max_{a,e=0,d,c,b} P(a)·P(b|a)·P(c|a)·P(d|b,a)·P(e|b,c)
    = max_a P(a) · max_{e=0} max_d max_c P(c|a) · h^B(a,d,c,e),
where h^B(a,d,c,e) = max_b P(b|a)·P(d|b,a)·P(e|b,c).

Page 14: Chapter 13

Finding MPE = max_x P(x):

MPE = max_{a,e=0,d,c,b} P(a)·P(c|a)·P(b|a)·P(d|b,a)·P(e|b,c)

Bucket elimination (elimination operator max_b in the first bucket):
  bucket B: P(b|a), P(d|b,a), P(e|b,c)
  bucket C: P(c|a), h^B(a,d,c,e)
  bucket D: h^C(a,d,e)
  bucket E: e=0, h^D(a,e)
  bucket A: P(a), h^E(a)
  MPE = max_a P(a)·h^E(a)

Algorithm elim-mpe (Dechter 1996); Non-serial Dynamic Programming (Bertele and Brioschi, 1973)
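A minimal sketch of the bucket-elimination pattern behind elim-mpe is given below (illustrative Python, assuming factors are (scope, table) pairs over finite domains; it is not the course's reference implementation). Buckets are processed from the last variable in the ordering to the first, each bucket maximizing its product over the bucket's variable and placing the resulting function h in a lower bucket.

```python
from itertools import product

def eval_product(factors, env):
    """Product of the factors' entries under the assignment env."""
    p = 1.0
    for scope, table in factors:
        p *= table[tuple(env[v] for v in scope)]
    return p

def elim_mpe(factors, domains, ordering):
    """Bucket elimination for MPE (max-product) along `ordering` (first to last);
    buckets are processed in reverse, as in the trace above."""
    buckets = {x: [] for x in ordering}
    for scope, table in factors:                    # a factor goes to the bucket of its
        buckets[max(scope, key=ordering.index)].append((scope, table))  # latest variable

    mpe = 1.0
    for x in reversed(ordering):
        bucket = buckets[x]
        if not bucket:
            continue
        scope = sorted({v for s, _ in bucket for v in s if v != x}, key=ordering.index)
        h = {}                                      # new function h^x over `scope`
        for vals in product(*(domains[v] for v in scope)):
            env = dict(zip(scope, vals))
            h[vals] = max(eval_product(bucket, {**env, x: xv}) for xv in domains[x])
        if scope:
            buckets[max(scope, key=ordering.index)].append((scope, h))
        else:
            mpe *= h[()]                            # constant produced by a root bucket
    return mpe
```

Evidence such as e = 0 can be imposed by restricting that variable's domain to the observed value before calling the routine.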

Page 15: Chapter 13

Generating the MPE-tuple (assign variables in reverse order, consulting the recorded bucket functions):

  1. a' = argmax_a P(a)·h^E(a)
  2. e' = 0
  3. d' = argmax_d h^C(a',d,e')
  4. c' = argmax_c P(c|a')·h^B(a',d',c,e')
  5. b' = argmax_b P(b|a')·P(d'|b,a')·P(e'|b,c')

Return (a', b', c', d', e').

Page 16: Chapter 13

Complexity: exp(w* = 4), where w* is the “induced width” (max clique size of the induced graph).

(Same bucket-elimination trace as two slides back.)

Algorithm elim-mpe (Dechter 1996); Non-serial Dynamic Programming (Bertele and Brioschi, 1973)

Page 17: Chapter 13

Complexity of bucket elimination

Bucket elimination is time and space O(r · exp(w*(d))), where r is the number of functions and w*(d) is the induced width of the primal graph along the ordering d.

The effect of the ordering: for the example constraint graph, one ordering gives w*(d1) = 4 while another gives w*(d2) = 2.

Finding the ordering of smallest induced width is hard.

Page 18: Chapter 13

Directional i-consistency

Process the variables along the ordering A, B, C, D, E from last to first; in each bucket, enforce a bounded level of consistency among the earlier variables:
  bucket E: R(D,E), R(C,E), R(B,E)
  bucket D: R(C,D), R(A,D)
  bucket C: R(B,C)
  bucket B: R(A,B)
  bucket A:

Adaptive consistency records a constraint over all of a bucket's earlier neighbors (e.g., R_DCB); directional path-consistency (d-path) records binary constraints (e.g., R_DC, R_DB, R_CB); directional arc-consistency (d-arc) records unary constraints (e.g., R_D, R_C, R_B).

Page 19: Chapter 13

Mini-bucket approximation: MPE task

Split a bucket into mini-buckets =>bound complexity

Exponential complexity decrease: O(e^n) becomes O(e^r) + O(e^{n−r}); the exact bucket function is replaced by the combination of two smaller functions h^X and g^X computed on the mini-buckets.

Page 20: Chapter 13

Mini-Bucket Elimination

(Belief network over A, B, C, D, E with CPTs P(A), P(B|A), P(C|A), P(D|A,B), P(E|B,C).)

  bucket B: { P(E|B,C) }  { P(B|A), P(D|A,B) }   (split into two mini-buckets, each processed by max_B ∏)
  bucket C: P(C|A), h^B(C,E)
  bucket D: h^B(A,D)
  bucket E: E = 0, h^C(A,E)
  bucket A: P(A), h^E(A), h^D(A)

MPE* is an upper bound on MPE (U); generating a solution from the recorded functions yields a lower bound (L).

Page 21: Chapter 13

MBE-MPE(i) Algorithm Approx-MPE (Dechter&Rish 1997)

Input: i - max number of variables allowed in a mini-bucket. Output: [lower bound (the probability of a sub-optimal solution), upper bound].

Example: approx-mpe(3) versus elim-mpe: the mini-buckets keep the effective width at 2, versus w* = 4 for full bucket elimination.

Page 22: Chapter 13

Properties of MBE(i)

Complexity: O(r exp(i)) time and O(exp(i)) space. Yields an upper-bound and a lower-bound.

Accuracy: determined by upper/lower (U/L) bound.

As i increases, both accuracy and complexity increase.

Possible use of mini-bucket approximations: As anytime algorithms As heuristics in search

Other tasks: similar mini-bucket approximations for: belief updating, MAP and MEU (Dechter and Rish, 1997)
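The step that distinguishes MBE(i) from full bucket elimination is the partitioning of a bucket into mini-buckets of at most i variables. A possible greedy partitioning is sketched below (illustrative Python; the first-fit strategy is an assumption made here, not the algorithm's specification).

```python
def partition_bucket(bucket, i):
    """Split a bucket (a list of (scope, table) factors) into mini-buckets whose
    combined scopes have at most i variables; first-fit over factors sorted by
    decreasing scope size."""
    mini_buckets = []                               # entries: [set_of_vars, list_of_factors]
    for scope, table in sorted(bucket, key=lambda f: -len(f[0])):
        for entry in mini_buckets:
            if len(entry[0] | set(scope)) <= i:     # fits without exceeding i variables
                entry[0] |= set(scope)
                entry[1].append((scope, table))
                break
        else:
            mini_buckets.append([set(scope), [(scope, table)]])
    return mini_buckets
```

Each mini-bucket is then eliminated separately (by max for MPE, by sum for belief updating), and the combination of the resulting functions bounds the exact bucket function, which is the source of the upper bound U.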

Page 23: Chapter 13

Outline

  Introduction
  Optimization tasks for graphical models
  Solving by inference and search
  Inference: bucket elimination, dynamic programming; mini-bucket elimination
  Search: branch and bound and best-first, lower-bounding heuristics, AND/OR search spaces
  Hybrids of search and inference: cutset decomposition, super-bucket scheme

Page 24: Chapter 13

The Search Space

Objective function: F(X) = Σ_{i=1}^{9} f_i, to be minimized over the variables A, B, C, D, E, F.

Cost tables (one per function, over value pairs 00, 01, 10, 11):
  f1(A,B): 2 0 1 4
  f2(A,C): 3 0 0 1
  f3(A,E): 0 3 2 0
  f4(A,F): 2 0 0 2
  f5(B,C): 0 1 2 4
  f6(B,D): 4 2 1 0
  f7(B,E): 3 2 1 0
  f8(C,D): 1 4 0 0
  f9(E,F): 1 0 0 2

(The OR search tree enumerates the variables along a fixed ordering with A at the root; every level branches on the values 0 and 1.)

Page 25: Chapter 13

The Search Space

Arc-cost is calculated based on cost components.

(Same search tree and cost tables f1-f9 as on the previous slide; each arc is labeled with the sum of the cost components that become fully assigned at that point.)

Page 26: Chapter 13

The Value Function

Value of node = minimal cost solution below it

(Same search tree; every node is now annotated with its value, computed bottom-up as the minimum over its children of the child's value plus the connecting arc cost.)

Page 27: Chapter 13

An Optimal Solution

Value of node = minimal cost solution below it

(Same annotated search tree; following a minimal-value child at every node from the root traces an optimal solution.)

Page 28: Chapter 13

Basic Heuristic Search Schemes

A heuristic function f(x) computes a lower bound on the best extension of x and can be used to guide a heuristic search algorithm. We focus on:

1. Branch and Bound: use the heuristic function f(x^p) to prune the depth-first search tree. Linear space.
2. Best-First Search: always expand the node with the best heuristic value f(x^p). Needs lots of memory.

Page 29: Chapter 13

Classic Branch-and-Bound

At a node n of the OR search tree: g(n) is the cost of the path to n and h(n) is a heuristic estimate of completing it.

Lower bound: LB(n) = g(n) + h(n)
Upper bound: UB = cost of the best solution found so far

Prune below n if LB(n) ≥ UB.
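The pruning rule translates directly into a depth-first procedure. The sketch below is illustrative Python with assumed interfaces `arc_cost` and `h` (not the course's code); it returns a minimum-cost full assignment while pruning any node whose lower bound g(n) + h(n) reaches the current upper bound.

```python
import math

def branch_and_bound(variables, domains, arc_cost, h):
    """Depth-first Branch and Bound over the OR search tree.

    arc_cost(assignment, var, val): cost added by extending the assignment
        (which already includes var=val) with this choice.
    h(assignment): lower bound on the cost of the best extension (admissible).
    """
    best = {"ub": math.inf, "solution": None}

    def dfs(depth, assignment, g):
        if g + h(assignment) >= best["ub"]:         # prune: LB(n) = g(n) + h(n) >= UB
            return
        if depth == len(variables):                 # full assignment: new incumbent
            best["ub"], best["solution"] = g, dict(assignment)
            return
        var = variables[depth]
        for val in domains[var]:
            assignment[var] = val
            dfs(depth + 1, assignment, g + arc_cost(assignment, var, val))
            del assignment[var]

    dfs(0, {}, 0.0)
    return best["ub"], best["solution"]
```

With h = 0 this degenerates to exhaustive search; the stronger the heuristic (e.g., a mini-bucket bound), the earlier the cutoffs.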

Page 30: Chapter 13

How to Generate Heuristics

The principle of relaxed models

Linear optimization for integer programs

Mini-bucket elimination

Bounded directional consistency ideas

Page 31: Chapter 13

Generating Heuristic for graphical models(Kask and Dechter, 1999)

Given a cost function

C(a,b,c,d,e) = f(a) · f(b,a) · f(c,a) · f(e,b,c) · f(d,a,b)

Define an evaluation function over a partial assignment as the cost of its best extension:

f*(a,e,d) = min_{b,c} C(a,b,c,d,e) = f(a) · min_{b,c} [ f(b,a) · f(c,a) · f(e,b,c) · f(d,a,b) ]

= g(a,e,d) · H*(a,e,d)

(Partial search tree over A, E, D.)

Page 32: Chapter 13

Generating Heuristics (cont.)

H*(a,e,d) = min_{b,c} f(b,a) · f(c,a) · f(e,b,c) · f(d,a,b)

= min_c [ f(c,a) · min_b [ f(e,b,c) · f(b,a) · f(d,a,b) ] ]

≥ min_c [ f(c,a) · min_b f(e,b,c) · min_b [ f(b,a) · f(d,a,b) ] ]

= min_b [ f(b,a) · f(d,a,b) ] · min_c [ f(c,a) · min_b f(e,b,c) ]

= h^B(d,a) · h^C(e,a)

= H(a,e,d)

f(a,e,d) = g(a,e,d) · H(a,e,d) ≤ f*(a,e,d)

The heuristic function H is what is compiled during the preprocessing stage of the

Mini-Bucket algorithm.


Page 34: Chapter 13

Static MBE Heuristics

Given a partial assignment x^p, estimate the cost of the best extension to a full solution. The evaluation function f(x^p) can be computed using the functions recorded by the Mini-Bucket scheme.

Buckets (belief network over A, B, C, D, E):
  B: P(E|B,C), P(D|A,B), P(B|A)
  C: P(C|A), h^B(E,C)
  D: h^B(D,A)
  E: h^C(E,A)
  A: P(A), h^E(A), h^D(A)

f(a,e,D) = P(a) · h^B(D,a) · h^C(e,a)

f = g · h is admissible.

f(a,e,D) = g(a,e) · H(a,e,D)

Page 35: Chapter 13

Heuristics Properties

MB Heuristic is monotone, admissible

Retrieved in linear time. IMPORTANT: heuristic strength can vary with MB(i): a higher i-bound means more pre-processing, a stronger heuristic, and less search.

Allows controlled trade-off between preprocessing and search

Page 36: Chapter 13

Experimental Methodology

Algorithms: BBMB(i) - Branch and Bound with MB(i); BFMB(i) - Best-First with MB(i); MBE(i)

Test networks: Random Coding (Bayesian) CPCS (Bayesian) Random (CSP)

Measures of performance: compare accuracy given a fixed amount of time - how close is the cost found to the optimal solution; compare trade-off performance as a function of time.

Page 37: Chapter 13

Empirical Evaluation of mini-bucket heuristics,Bayesian networks, coding

(Plots: percentage of problems solved exactly versus time [sec], for BBMB(i) and BFMB(i) with i = 2, 6, 10, 14, on random coding networks with K = 100 and noise 0.28 and 0.32.)

Page 38: Chapter 13


Max-CSP experiments(Kask and Dechter, 2000)

Page 39: Chapter 13

Dynamic MB Heuristics

Rather than pre-compiling, the mini-bucket heuristics can be generated during search

Dynamic mini-bucket heuristics use the Mini-Bucket algorithm to produce a bound for any node in the search space

(a partial assignment, along the given variable ordering)

Page 40: Chapter 13

Dynamic MB and MBTE Heuristics(Kask, Marinescu and Dechter, 2003)

Rather than precompiling, compute the heuristics during search.

Dynamic MB: Dynamic mini-bucket heuristics use the Mini-Bucket algorithm to produce a bound for any node during search

Dynamic MBTE: We can compute heuristics simultaneously for all un-instantiated variables using mini-bucket-tree elimination .

MBTE is an approximation scheme defined over cluster-trees. It outputs multiple bounds for each variable and value extension at once.

Page 41: Chapter 13

Cluster Tree Elimination - example

Cluster tree with clusters 1 = {A,B,C}, 2 = {B,C,D,F}, 3 = {B,E,F}, 4 = {E,F,G} and separators BC, BF, EF (primal graph over A, B, C, D, E, F, G).

Messages:
  h_(1,2)(b,c) = Σ_a p(a) · p(b|a) · p(c|a,b)
  h_(2,1)(b,c) = Σ_{d,f} p(d|b) · p(f|c,d) · h_(3,2)(b,f)
  h_(2,3)(b,f) = Σ_{c,d} p(d|b) · p(f|c,d) · h_(1,2)(b,c)
  h_(3,2)(b,f) = Σ_e p(e|b,f) · h_(4,3)(e,f)
  h_(3,4)(e,f) = Σ_b p(e|b,f) · h_(2,3)(b,f)
  h_(4,3)(e,f) = p(G=g_e | e,f)

Page 42: Chapter 13

Mini-Clustering

Motivation: the time and space complexity of Cluster Tree Elimination depend on the induced width w* of the problem; when w* is large, CTE becomes infeasible.

The basic idea: try to reduce the size of each cluster (the exponent) by partitioning it into mini-clusters with fewer variables.

Accuracy parameter i = maximum number of variables in a mini-cluster

The idea was explored for variable elimination (Mini-Bucket)

Page 43: Chapter 13

Idea of Mini-Clustering

Split a cluster into mini-clusters => bound complexity.
Exponential complexity decrease: O(e^n) becomes O(e^r) + O(e^{n−r}).

cluster(u) = {h_1, ..., h_r, h_{r+1}, ..., h_n}
Exact message:  h = Σ_elim ∏_{i=1}^{n} h_i
After partitioning into {h_1, ..., h_r} and {h_{r+1}, ..., h_n}:
                g = Σ_elim ∏_{i=1}^{r} h_i   and   h = Σ_elim ∏_{i=r+1}^{n} h_i,
and the pair (g, h) replaces the exact message (their combination bounds it from above).

Page 44: Chapter 13

Mini-Clustering - example

(Same cluster tree: 1 = {A,B,C}, 2 = {B,C,D,F}, 3 = {B,E,F}, 4 = {E,F,G}, separators BC, BF, EF.) Each message is now a set of smaller functions:

  H_(1,2) = { h¹_(1,2)(b,c) = Σ_a p(a) · p(b|a) · p(c|a,b) }
  H_(2,1) = { h¹_(2,1)(b) = Σ_{d,f} p(d|b) · h¹_(3,2)(b,f),   h²_(2,1)(c) = Σ_{d,f} p(f|c,d) }
  H_(2,3) = { h¹_(2,3)(b) = Σ_{c,d} p(d|b) · h¹_(1,2)(b,c),   h²_(2,3)(f) = Σ_{c,d} p(f|c,d) }
  H_(3,2) = { h¹_(3,2)(b,f) = Σ_e p(e|b,f) · h¹_(4,3)(e,f) }
  H_(3,4) = { h¹_(3,4)(e,f) = Σ_b p(e|b,f) · h¹_(2,3)(b) · h²_(2,3)(f) }
  H_(4,3) = { h¹_(4,3)(e,f) = p(G=g_e | e,f) }

Page 45: Chapter 13

Mini Bucket Tree Elimination

(Same cluster tree; each edge now carries sets of mini-cluster messages in both directions: H_(1,2) = {h¹_(1,2)(b,c)}, H_(2,1) = {h¹_(2,1)(b), h²_(2,1)(c)}, H_(2,3) = {h¹_(2,3)(b), h²_(2,3)(f)}, H_(3,2) = {h¹_(3,2)(b,f)}, H_(3,4) = {h¹_(3,4)(e,f)}, H_(4,3) = {h¹_(4,3)(e,f)}.)

Page 46: Chapter 13

Mini-Clustering

Correctness and completeness: Algorithm MC(i) computes a bound (or an approximation) for each variable and each of its values.

MBTE: when the clusters are buckets in BTE.

Page 47: Chapter 13

Branch and Bound w/ Mini-Buckets

BB with static Mini-Bucket Heuristics (s-BBMB) Heuristic information is pre-compiled before search.

Static variable ordering, prunes current variable

BB with dynamic Mini-Bucket Heuristics (d-BBMB)

Heuristic information is assembled during search. Static variable ordering, prunes current variable

BB with dynamic Mini-Bucket-Tree Heuristics (BBBT)

Heuristic information is assembled during search. Dynamic variable ordering, prunes all future variables

Page 48: Chapter 13

Empirical Evaluation Algorithms:

Complete BBBT BBMB

Incomplete DLM GLS SLS IJGP IBP (coding)

Measures: Time Accuracy (% exact) #Backtracks Bit Error Rate (coding)

Benchmarks: Coding networks Bayesian Network

Repository Grid networks (N-by-N) Random noisy-OR networks Random networks

Page 49: Chapter 13

Real World Benchmarks

Average Accuracy and Time. 30 samples, 10 observations, 30 seconds

Page 50: Chapter 13

Empirical Results: Max-CSP Random Binary Problems: <N, K, C, T>

N: number of variables K: domain size C: number of constraints T: Tightness

Task: Max-CSP

Page 51: Chapter 13

BBBT(i) vs. BBMB(i).

(Plots: BBBT(i) vs. BBMB(i) on Max-CSP instances with N = 100, for i = 2 through 7.)

Page 52: Chapter 13

Searching the Graph; caching goods

C context(C) = [ABC]

D context(D) = [ABD]

F context(F) = [F]

E context(E) = [AE]

B context(B) = [AB]

A context(A) = [A]

(Primal graph and cost tables f1-f9 as in the earlier search-space slides; the search graph merges nodes that agree on their context, and arcs carry the corresponding cost components.)

Page 53: Chapter 13

Searching the Graph; caching goods

C context(C) = [ABC]

D context(D) = [ABD]

F context(F) = [F]

E context(E) = [AE]

B context(B) = [AB]

A context(A) = [A]

(Same context-minimal search graph, now annotated with node values computed bottom-up; merged nodes are computed once and their values cached as "goods".)

Page 54: Chapter 13

Outline

  Introduction
  Optimization tasks for graphical models
  Solving by inference and search
  Inference: bucket elimination, dynamic programming; mini-bucket elimination, belief propagation
  Search: branch and bound and best-first, lower-bounding heuristics, AND/OR search spaces
  Hybrids of search and inference: cutset decomposition, super-bucket scheme

Page 55: Chapter 13

Classic OR Search Space

(Full OR search tree over the primal graph, along the ordering A, B, E, C, D, F; every level branches on the values 0 and 1.)

Page 56: Chapter 13

AND/OR Search Space

(AND/OR search tree guided by a DFS (pseudo) tree of the primal graph: A is the root with child B; B has children C and E; C has child D; E has child F. OR levels branch on variables, AND levels on their values 0/1, and the subproblems below each assignment of B are searched independently.)

Page 57: Chapter 13

AND/OR vs. OR

(Side by side: the OR search tree has depth 6, one level per variable along the ordering, while the AND/OR search tree guided by the pseudo-tree has depth 4; independent subproblems are duplicated in the OR tree but branch off separately in the AND/OR tree.)

AND/OR size: exp(4), OR size exp(6)

Page 58: Chapter 13

OR space vs. AND/OR space

width | height | OR space: time (sec.), nodes, backtracks | AND/OR space: time (sec.), AND nodes, OR nodes
  5   |   10   | 3.154, 2,097,150, 1,048,575              | 0.03, 10,494,  5,247
  4   |    9   | 3.135, 2,097,150, 1,048,575              | 0.01,  5,102,  2,551
  5   |   10   | 3.124, 2,097,150, 1,048,575              | 0.03,  8,926,  4,463
  4   |   10   | 3.125, 2,097,150, 1,048,575              | 0.02,  7,806,  3,903
  5   |   13   | 3.104, 2,097,150, 1,048,575              | 0.1,  36,510, 18,255

Random graphs with 20 nodes, 20 edges and 2 values per node.

Page 59: Chapter 13

AND/OR vs. OR

(Same comparison when some assignments are inconsistent, e.g. (A=1,B=1) and (B=0,C=0): the corresponding subtrees are pruned in both the OR and the AND/OR space.)

Page 60: Chapter 13

AND/OR vs. OR

(Same trees as above.)

AND/OR: linear space; time O(exp(m)), which is O(exp(w* log n)).
OR: linear space; time O(exp(n)).

Page 61: Chapter 13

#CSP – AND/OR Search Tree

(Constraint graph over A, B, C, D, E, F and the pseudo-tree used before: A at the root, B below A, with E (child F) and C (child D) below B.)

Relations (1 = allowed tuple):
  R_ABC: 000→1, 001→1, 010→0, 011→1, 100→1, 101→1, 110→1, 111→0
  R_ABE: 000→1, 001→0, 010→1, 011→1, 100→0, 101→1, 110→1, 111→0
  R_AEF: 000→0, 001→1, 010→1, 011→1, 100→1, 101→1, 110→1, 111→0
  R_BCD: 000→1, 001→1, 010→1, 011→0, 100→1, 101→0, 110→1, 111→1

(The AND/OR search tree enumerates the consistent assignments along this pseudo-tree.)


Page 63: Chapter 13

#CSP – AND/OR Tree DFS

(Same constraint graph, pseudo-tree, and relations as on the previous slide. The DFS traversal assigns value 1 to a consistent leaf and 0 to an inconsistent one, multiplies values at AND nodes, and sums them at OR nodes; the root receives the total solution count, 14.)

OR node: Marginalization operator (summation)

AND node: Combination operator (product)

Page 64: Chapter 13

AND/OR Tree Search for COP

Goal: min_X Σ_{i=1}^{9} f_i (cost tables f1-f9 as in the earlier slides).

(The AND/OR search tree is traversed depth-first: the AND node combines values by summation, the OR node marginalizes by minimization; the value at the root is the optimal cost.)

Page 65: Chapter 13

Summary of AND/OR Search Trees

Based on a backbone pseudo-tree

A solution is a subtree

Each node has a value – cost of the optimal solution to the subproblem (computed recursively based on the values of the descendants)

Solving a task = finding the value of the root node

AND/OR search trees and algorithms go back to [Freuder & Quinn, 1985], [Collin, Dechter & Katz, 1991], and [Bayardo & Miranker, 1995].

Space: O(n). Time: O(exp(m)), where m is the depth of the pseudo-tree, which is O(exp(w* log n)). BFS is time and space O(exp(w* log n)).
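The recursive value computation over an AND/OR tree can be written directly from the pseudo-tree. The sketch below is illustrative Python for a min-sum COP, with assumed interfaces `pseudo_children` and `cost` (not the course's code); it computes the value of an OR node as the minimum over the variable's values of the arc cost plus the sum of the values of the independent child subproblems.

```python
def and_or_value(var, assignment, pseudo_children, domains, cost):
    """Value of the OR node for `var` given the current partial assignment.

    cost(var, val, assignment): sum of the cost functions that become fully
        assigned when var=val is added (the AND arc label).
    pseudo_children[var]: children of `var` in the pseudo-tree.
    """
    best = float("inf")
    for val in domains[var]:
        assignment[var] = val
        value = cost(var, val, assignment) + sum(   # AND node: combine child OR values
            and_or_value(child, assignment, pseudo_children, domains, cost)
            for child in pseudo_children.get(var, []))
        best = min(best, value)                     # OR node: marginalize by minimization
        del assignment[var]
    return best

# Solving the task = value of the root:  and_or_value(root, {}, children, domains, cost)
```

The recursion uses O(n) space and O(exp(m)) time, matching the bounds stated above.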

Page 66: Chapter 13

An AND/OR Graph: Caching Goods

(AND/OR search graph for a larger problem: a pseudo-tree over A, B, C, D, E, F extended with G, H, J, K. Nodes that root identical subproblems are merged, so each distinct subproblem is expanded once and its value is cached as a "good".)

Page 67: Chapter 13

Context-based Caching

Caching is possible when context is the same

context = parent-separator set in induced pseudo-graph

= current variable + parents connected to subtree below

(Pseudo-tree over A, B, C, D, E, F with G, H, J, K, as on the previous slide.)

  context(B) = {A, B}
  context(C) = {A, B, C}
  context(D) = {D}
  context(F) = {F}
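Context-based caching plugs into the same recursion: the value of an OR node depends only on the assignment to the ancestors in its context, so it can be memoized under that key. A minimal sketch follows (illustrative Python, extending the tree-search sketch given earlier; `contexts` is assumed to map each variable to its context).

```python
def and_or_graph_value(var, assignment, pseudo_children, domains, cost, contexts, cache):
    """AND/OR search with caching: two OR nodes for `var` whose ancestors agree on
    context(var) root identical subproblems, so the value is computed once."""
    key = (var, tuple(assignment[v] for v in contexts[var] if v in assignment))
    if key in cache:
        return cache[key]
    best = float("inf")
    for val in domains[var]:
        assignment[var] = val
        value = cost(var, val, assignment) + sum(
            and_or_graph_value(child, assignment, pseudo_children, domains,
                               cost, contexts, cache)
            for child in pseudo_children.get(var, []))
        best = min(best, value)
        del assignment[var]
    cache[key] = best
    return best
```

Each variable's table has at most exp(|context|) entries, which is where the O(exp(w*)) time and space of searching the context-minimal AND/OR graph comes from.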

Page 68: Chapter 13

Complexity of AND/OR Graph

Theorem: Traversing the AND/OR search graph is time and space exponential in the induced width (tree-width).

If applied to the OR graph complexity is time and space exponential in the path-width.

Page 69: Chapter 13

#CSP - AND/OR Search Tree

(Recap: the same constraint graph, relations, pseudo-tree, and AND/OR search tree as in the earlier #CSP slide.)

Page 70: Chapter 13

#CSP - AND/OR Tree DFS

(Recap: the same DFS value computation as in the earlier #CSP slide, with a total count of 14 solutions at the root.)

Page 71: Chapter 13

#CSP – AND/OR Search Graph(Caching Goods)

(The AND/OR search graph for the same #CSP: subproblems with identical contexts are merged, so the unifiable subtrees below the deeper variables appear only once and their counts are cached.)

Page 72: Chapter 13

#CSP – AND/OR Search Graph(Caching Goods)

(Same AND/OR search graph as on the previous slide.)

Time and space: O(exp(w*)).

Page 73: Chapter 13

All Four Search Spaces

Full OR search tree

126 nodes

Full AND/OR search tree

54 AND nodes

Context minimal OR search graph

28 nodes

Context minimal AND/OR search graph

18 AND nodes

(The four spaces for the running example, from largest to smallest: the full OR search tree, the full AND/OR search tree, the context-minimal OR search graph, and the context-minimal AND/OR search graph.)

Page 74: Chapter 13

AND/OR vs. OR DFS algorithms

AND/OR tree:  space O(n);         time O(n k^m), which is O(n k^{w* log n})  (Freuder 1985; Bayardo 1995; Darwiche 2001)
AND/OR graph: space O(n k^{w*});  time O(n k^{w*})
OR tree:      space O(n);         time O(k^n)
OR graph:     space O(n k^{pw*}); time O(n k^{pw*})

k = domain size, m = pseudo-tree depth, n = number of variables, w* = induced width, pw* = path width

Page 75: Chapter 13

Searching AND/OR Graphs

AO(i): searches depth-first and caches on contexts of size at most i, where i is the maximum size of a cache table (i.e., the number of variables in a context).

  i = 0:   space O(n),        time O(exp(w* log n))
  i = w*:  space O(exp(w*)),  time O(exp(w*))

AO(i) time complexity for intermediate i?

Page 76: Chapter 13

End of class

We did not cover the rest of the slides

Page 77: Chapter 13

AND/OR Branch-and-Bound (AOBB)

Associate each node n with a static heuristic estimate h(n) of v(n); h(n) is a lower bound on the value v(n).

For every node n in the search tree:
  ub(n) - cost of the current best solution rooted at n
  lb(n) - lower bound on the minimal cost at n

Page 78: Chapter 13

Lower/Upper Bounds

(OR node X with value branches 0 and 1; below X=1 are OR children A, B, C, with B expanded at value 0.)

  LB(B) = LB(B,0) = h(B,0)
  LB(X,1) = l(X,1) + v(A) + h(C) + LB(B)
  LB(X) = LB(X,1)
  UB(X) = best cost found below X so far (i.e., v(X,0))

Prune below AND node (B,0) if LB(X) ≥ UB(X).

Page 79: Chapter 13

Shallow/Deep Cutoffs

Shallow cutoff: at a node X the bound LB(X) = h(X,1) already reaches UB(X), so the branch X=1 is pruned immediately.

Deep cutoff: the violation LB(X) ≥ UB(X) is detected only after part of the subtree below X=1 has been expanded and its node values have tightened the bound.

Prune if LB(X) ≥ UB(X). Reminiscent of minimax shallow/deep cutoffs.

Page 80: Chapter 13

Summary of AOBB

Traverses the AND/OR search tree in a depth-first manner

Lower bounds computed based on heuristic estimates of nodes at the frontier of search, as well as the values of nodes already explored

Prunes the search space as soon as an upper-lower bound violation occurs

Page 81: Chapter 13

Heuristics for AND/OR

In the AND/OR search space h(n) can be computed using any heuristic. We used:

Static Mini-Bucket heuristics

Dynamic Mini-Bucket heuristics

Maintaining FDAC [Larrosa & Schiex03]

(full directional soft arc-consistency)

Page 82: Chapter 13

Empirical Evaluation Tasks

Tasks: solving WCSPs; finding the MPE in belief networks.

Benchmarks (WCSP): random binary WCSPs, RLFAP networks (CELAR6), Bayesian Networks Repository.

Algorithms: s-AOMB(i), d-AOMB(i), AOMFDAC, s-BBMB(i), d-BBMB(i), BBMFDAC; static variable ordering (DFS traversal of the pseudo-tree).

Page 83: Chapter 13

Random Binary WCSPs(Marinescu and Dechter, 2005)

Random networks with n=20 (number of variables), d=5 (domain size), c=100 (number of constraints), t=70% (tightness). Time limit 180 seconds. AO search is superior to OR search.

(Plots: s-AOMB vs. s-BBMB and d-AOMB vs. d-BBMB.)

Page 84: Chapter 13

Random Binary WCSPs (contd.)

n=20 (variables), d=5 (domain size), c=100 (constraints), t=70% (tightness)

dense sparse

n=50 (variables), d=5 (domain size), c=80 (constraints), t=70% (tightness)

AOMB for large i is competitive with AOMFDAC

Page 85: Chapter 13

Resource Allocation

Instance     | BBMFDAC: time (sec), nodes | AOMFDAC: time (sec), nodes
CELAR6-SUB0  |      2.78,      1,871      |      1.98,       435
CELAR6-SUB1  |  2,420.93,    364,986      |    981.98,   180,784
CELAR6-SUB2  |  8,801.12, 19,544,182      |  1,138.87,   175,377
CELAR6-SUB3  | 38,889.20, 91,168,896      |  4,028.59,   846,986
CELAR6-SUB4  | 84,478.40,  6,955,039      | 47,115.40, 4,643,229

CELAR6 sub-instances

Radio Link Frequency Assignment Problem (RLFAP)

AOMFDAC is superior to ORMFDAC

Page 86: Chapter 13

Bayesian Networks Repository

Network (n,d,w*,h) | Algorithm | for each i = 2, 3, 4, 5: time, nodes ("-" means the 600-second time limit was exceeded)

s-AOMB(i) - 8.5M - 7.6M 46.22 807K 0.563 9.6K

Barley s-BBMB(i) - 16M - 18M - 17M - 14M

(48,67,7,17) d-AOMB(i) - 79K 136.0 23K 12.55 667 45.95 567

  d-BBMB(i) - 2.2M - 1M 346.1 76K - 86K

  s-AOMB(i) 57.36 1.2M 12.08 260K 7.203 172K 1.657 43K

Munin1 s-BBMB(i) - 8.5M - 9M - 10M - 8M

(189,21,11,24) d-AOMB(i) 66.56 185K 12.47 8.1K 10.30 1.6K 11.99 523

  d-BBMB(i) - 405K - 430K - 235K 14.63 917

  s-AOMB(i) - 5.9M - 4.9M 1.313 17K 0.453 6K

Munin3 s-BBMB(i) - 1.4M - 1.2M - 316K - 1.5M

(1044,21,7,25) d-AOMB(i) - 2.3M 68.64 58K 3.594 5.9K 2.844 3.8K

  d-BBMB(i) - 33K - 125K - 52K - 31K

Time limit 600 seconds

available at http://www.cs.huji.ac.il/labs/compbio/Repository

Static AO is better with accurate heuristic (large i)

Page 87: Chapter 13

Outline

  Introduction
  Optimization tasks for graphical models
  Solving by inference and search
  Inference: bucket elimination, dynamic programming; mini-bucket elimination, belief propagation
  Search: branch and bound and best-first, lower-bounding heuristics, AND/OR search spaces; searching trees, searching graphs
  Hybrids of search and inference: cutset decomposition, super-bucket scheme

Page 88: Chapter 13

From Searching Trees to Searching Graphs

Any two nodes that root identical subtrees/subgraphs can be merged

Minimal AND/OR search graph: closure under merge of the AND/OR search tree

Inconsistent sub-trees can be pruned too. Some portions can be collapsed or reduced.

Page 89: Chapter 13

AND/OR Search Graph

(Primal graph and pseudo-tree over A, B, C, D, E, F as before.)

  context(A) = {A}
  context(B) = {B,A}
  context(C) = {C,B}
  context(D) = {D}
  context(E) = {E,A}
  context(F) = {F}

(The full AND/OR search tree, before merging.)

Page 90: Chapter 13

AND/OR Search Graph

(After merging nodes with identical contexts, the tree collapses into the context-minimal AND/OR search graph; primal graph, pseudo-tree, and contexts as on the previous slide.)

Page 91: Chapter 13

Context-based caching

(Pseudo-tree and contexts as before; since context(C) = {C,B}, the value computed below C is cached in a table indexed by the assignment to B and C.)

Cache table for C:
  B C | value
  0 0 |   5
  0 1 |   2
  1 0 |   2
  1 1 |   0

Space: O(exp(2)).

Page 92: Chapter 13

Searching AND/OR Graphs

AO(j): searches depth-first and caches on contexts of size at most j, where j is the maximum size of a cache table (i.e., the number of variables in a context).

  j = 0:   space O(n),        time O(exp(w* log n))
  j = w*:  space O(exp(w*)),  time O(exp(w*))

AO(j) time complexity for intermediate j?

Page 93: Chapter 13

Graph AND/OR Branch-and-Bound - AOBB(j)

Like BB on tree except, during search, merge nodes based on context (caching); maintain cache tables of size O(exp(j)), where j is a bound on the size of the context.

Heuristics Static mini-bucket Dynamic mini-bucket Soft directional arc-consistency

[Marinescu & Dechter, CP2005]

Page 94: Chapter 13

Pseudo-trees (I)

AND/OR graph/tree search algorithms are influenced by the quality of the pseudo-tree.

Finding the minimal context/depth pseudo-tree is a hard problem

Heuristics Min-fill (min context) Hypergraph separation (min depth)

Page 95: Chapter 13

Pseudo-trees (II)

MIN-FILL [Kjærulff, 1990], [Bayardo & Miranker, 1995]: depth-first traversal of the induced graph constructed along some elimination order; the elimination order prefers variables with the smallest "fill set".

HYPERGRAPH [Darwiche, 2001]: constraints are vertices of the hypergraph and variables are hyperedges; recursive decomposition of the hypergraph while minimizing the separator size (i.e., number of variables) at each step, using the state-of-the-art software package hMeTiS.

Page 96: Chapter 13

Quality of pseudo-trees

Network | hypergraph: width, depth | min-fill: width, depth

barley 7 13 7 23

diabetes 7 16 4 77

link 21 40 15 53

mildew 5 9 4 13

munin1 12 17 12 29

munin2 9 16 9 32

munin3 9 15 9 30

munin4 9 18 9 30

water 11 16 10 15

pigs 11 20 11 26

Network | hypergraph: width, depth | min-fill: width, depth

spot5 47 152 39 204

spot28 108 138 79 199

spot29 16 23 14 42

spot42 36 48 33 87

spot54 12 16 11 33

spot404 19 26 19 42

spot408 47 52 35 97

spot503 11 20 9 39

spot505 29 42 23 74

spot507 70 122 59 160

(First table: Bayesian Networks Repository; second table: SPOT5 benchmarks.)

Page 97: Chapter 13

Empirical Evaluation Tasks

Solving binary WCSPs Finding the MPE in belief networks

Benchmarks Random networks Resource allocation (SPOT5) Genetic linkage analysis

Algorithms s-AOMB(i,j), d-AOMB(i,j), AOMFDAC(j), VE+C j is the cache bound Static variable ordering

Page 98: Chapter 13

Random Networks

Random networks with n=100 (number of variables), d=3 (domain size), c=90 (number of CPTs), p=2 (number of parents per CPT). Time limit 180 seconds.

(Plots: static heuristics vs. dynamic heuristics.)

Caching helps for static heuristics with small i-bounds

Page 99: Chapter 13

Resource Allocation – SPOT5

Columns: Network | Algorithm | hypergraph pseudo-tree: (w*,h), no cache (time, nodes), cache (time, nodes) | min-fill pseudo-tree: (w*,h), no cache (time, nodes), cache (time, nodes)

29b AOMFDAC (16,22) 5.938 170,823 1.492 40,428 (14,42) 5.036 79,866 3.237 34,123

(83,394) sAOMB(12)   1.002 8,458 1.012 1,033   0.381 997 0.411 940

42b AOMFDAC (31,43) 1,043 6,071,390 884.1 3,942,942 (18,62) - 22,102,050 - 17,911,719

(191, 1151) sAOMB(16)   132.0 2,871,804 127.4 2,815,503   3.254 11,636 3.164 9,030

54b AOMFDAC (12,14) 0.401 6,581 0.29 3,377 (9,19) 1.793 28,492 0.121 2,087

(68, 197) sAOMB(10)   0.03 74 0.03 74   0.02 567 0.02 381

404b AOMFDAC (19,23) 0.02 148 0.01 138 (19,57) 2.043 21,406 0.08 1,222

(101, 595) sAOMB(12)   0.01 101 0.01 101   0.02 357 0.01 208

503b AOMFDAC (9,14) 0.02 405 0.01 307 (8,46) 1077.1 19,041,552 0.05 701

(144, 414) sAOMB(8)   0.01 150 0.01 150   0.03 1,918 0.01 172

505b AOMFDAC (19,32) 17.8 368,247 5.20 69,045 (16,98) - 9,872,078 15.43 135,641

(241, 1481) sAOMB(14)   5.618 683 6.208 683   4.997 1912 5.096 831

Time limit 1800 seconds

Cache means j = i; we picked the best-performing i.

The MB(i) heuristic is strong enough to prune, so caching is less relevant.

Page 100: Chapter 13

(Genetic linkage analysis network for 6 people and 3 markers: for each person i and marker j there are locus variables L_ijm, L_ijf, a phenotype variable X_ij, and selector variables S_ijm.)

Page 101: Chapter 13

Empirical evaluation

pedigree (n, d) | VE+C time | SUPERLINK 1.5 time | s-AOMB(i,c): (i,c), time, nodes
23 (309, 5)     |   1,144   |   6,809            | (14,18),  4.29, 122,811
30 (1016, 5)    |  26,719   |  28,740            | (18,22), 27.71, 535,763

*averages over 5 runs

Page 102: Chapter 13

Pedigree 23

(Detailed s-AOMB(i,c) results for pedigree 23 (n=309, d=5) with a hypergraph pseudo-tree, varying i = 12-18 and cache bound c = 0-18: times range from roughly 4 to 25 seconds and node counts from roughly 25K to 1.2M, improving as i and c grow. For comparison, VE+C takes 1,144 seconds and SUPERLINK 1.5 takes 6,809 seconds. Averages over 5 runs.)

Page 103: Chapter 13

Pedigree 30

(Detailed s-AOMB(i,c) results for pedigree 30 (n=1016, d=5), varying i = 16-22 and cache bound c = 0-22: times range from roughly 28 to 362 seconds and node counts from roughly 10K to 11.5M. For comparison, VE+C takes 26,719 seconds and SUPERLINK 1.5 takes 28,740 seconds. Averages over 5 runs.)

Page 104: Chapter 13

Outline

  Introduction
  Optimization tasks for graphical models
  Solving by inference and search
  Inference: bucket elimination, dynamic programming; mini-bucket elimination, belief propagation
  Search: branch and bound and best-first, lower-bounding heuristics, AND/OR search spaces
  Hybrids of search and inference: cutset decomposition, super-bucket scheme