
Page 1: Chapter 13

Chapter 13

Constraint Optimization, Counting, and Enumeration

ICS-275 class

Page 2: Chapter 13

From satisfaction to counting and enumeration and to general graphical models

ICS-275, Spring 2007

Page 3: Chapter 13

Outline

  Introduction
  Optimization tasks for graphical models
  Solving optimization problems with inference and search
  Inference: bucket elimination (dynamic programming), mini-bucket elimination
  Search: branch and bound and best-first, lower-bounding heuristics, AND/OR search spaces
  Hybrids of search and inference: cutset decomposition, super-bucket scheme

Page 4: Chapter 13

Example: map coloring
  Variables - countries (A, B, C, etc.)
  Values - colors (e.g., red, green, yellow)
  Constraints: A ≠ B, A ≠ D, D ≠ E, etc.

Allowed pairs for the constraint between A and B:
  A       B
  red     green
  red     yellow
  green   red
  green   yellow
  yellow  green
  yellow  red

(Constraint graph over the countries A, B, C, D, E, F, G.)

Task: consistency? Find a solution, all solutions, counting.

Constraint Satisfaction

Page 5: Chapter 13

φ = {(¬C), (A ∨ B ∨ C), (¬A ∨ B ∨ E), (¬B ∨ C ∨ D)}

Propositional Satisfiability
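For a theory this small, satisfaction, counting, and enumeration can be illustrated by brute force. The sketch below is illustrative Python (the clause encoding and variable list are assumptions made here, not part of the course material); it enumerates all assignments, checks each clause, and counts the models of the CNF above.

```python
from itertools import product

# The theory above: phi = (~C) & (A v B v C) & (~A v B v E) & (~B v C v D).
# A clause is a list of (variable, sign) literals; sign True means the positive literal.
clauses = [
    [("C", False)],
    [("A", True), ("B", True), ("C", True)],
    [("A", False), ("B", True), ("E", True)],
    [("B", False), ("C", True), ("D", True)],
]
variables = ["A", "B", "C", "D", "E"]

def satisfies(assignment, clauses):
    """True iff every clause contains at least one satisfied literal."""
    return all(any(assignment[v] == sign for v, sign in clause) for clause in clauses)

# Satisfaction, counting, and enumeration by exhausting all 2^n assignments.
models = [dict(zip(variables, values))
          for values in product([False, True], repeat=len(variables))
          if satisfies(dict(zip(variables, values)), clauses)]
print("satisfiable:", bool(models), " #models:", len(models))
```

Bucket elimination and AND/OR search, introduced later in the chapter, answer the same queries without enumerating all 2^n assignments.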

Page 6: Chapter 13

Constraint Optimization Problems for Graphical Models

A finite COP is a triple ⟨X, D, F⟩, where:
  X = {X1, ..., Xn} - variables
  D = {D1, ..., Dn} - domains
  F = {f1, ..., fm} - cost functions

Example cost function f(A,B,D):
  A B D | cost
  1 2 3 |  3
  1 3 2 |  2
  2 1 3 |  0
  2 3 1 |  0
  3 1 2 |  5
  3 2 1 |  0

Global cost function: F(X) = Σ_{i=1}^{m} f_i(X_i)

f(A,B,D) has scope {A,B,D}

Page 7: Chapter 13

Constraint Optimization Problems for Graphical Models (continued)

A finite COP is a triple ⟨X, D, F⟩ with variables X = {X1, ..., Xn}, domains D = {D1, ..., Dn}, and cost functions F = {f1, ..., fm}; the global cost function is F(X) = Σ_{i=1}^{m} f_i(X_i). The cost table for f(A,B,D) is as on the previous slide.

Primal graph: variables --> nodes; functions/constraints --> arcs.

f1(A,B,D), f2(D,F,G), f3(B,C,F)

f(A,B,D) has scope {A,B,D}

F(a,b,c,d,f,g)= f1(a,b,d)+f2(d,f,g)+f3(b,c,f)
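As a concrete reading of the definition, the sketch below (illustrative Python; the representation and most of the numbers are assumptions, only the f(A,B,D) rows reused from the table above come from the slides) stores each cost function as a scope plus a table and evaluates the global cost F as the sum of the functions on a full assignment.

```python
# A cost function is a (scope, table) pair: table maps a tuple of values over the
# scope to a cost.  f1 reuses three rows of the f(A,B,D) table above; f2 and f3
# are made-up placeholders for illustration.
functions = [
    (("A", "B", "D"), {(1, 2, 3): 3, (1, 3, 2): 2, (2, 1, 3): 0}),  # f1(A,B,D)
    (("D", "F", "G"), {(3, 1, 1): 1, (2, 2, 1): 0}),                 # f2(D,F,G)
    (("B", "C", "F"), {(2, 2, 1): 4, (3, 1, 2): 0}),                 # f3(B,C,F)
]

def global_cost(assignment, functions):
    """F(X) = sum_i f_i(X_i): project the assignment on each scope and add the entries."""
    return sum(table[tuple(assignment[v] for v in scope)]
               for scope, table in functions)

# F(a,b,c,d,f,g) = f1(a,b,d) + f2(d,f,g) + f3(b,c,f)
print(global_cost({"A": 1, "B": 2, "C": 2, "D": 3, "F": 1, "G": 1}, functions))  # 3 + 1 + 4 = 8
```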

Page 8: Chapter 13

Constrained Optimization

Example: power plant scheduling

Variables: X1, ..., Xn with domain {ON, OFF}
Constraints: min-up and min-down times; power demand: Σ_i Power(X_i) ≥ Demand
Objective: minimize TotalFuelCost(X1, ..., XN)

Page 9: Chapter 13

Probabilistic Networks

(Belief network over Smoking, Bronchitis, Cancer, X-Ray, Dyspnoea, with CPTs P(S), P(B|S), P(C|S), P(X|C,S), P(D|C,B).)

P(S,C,B,X,D) = P(S)· P(C|S)· P(B|S)· P(X|C,S)· P(D|C,B)

P(D|C,B):
  C B | D=0  D=1
  0 0 | 0.1  0.9
  0 1 | 0.7  0.3
  1 0 | 0.8  0.2
  1 1 | 0.9  0.1

Page 10: Chapter 13

Graphical models

A graphical model ⟨X, D, C⟩:
  X = {X1, ..., Xn} - variables
  D = {D1, ..., Dn} - domains
  C = {F1, ..., Ft} - functions (constraints, CPTs, CNFs)
Primal graph (here over A, B, C, D, E, F).

Example functions: a constraint F_i: F = A ∨ C, or a CPT F_i: P(F | A, C).

Page 11: Chapter 13

Graphical models (continued)

(Same graphical model ⟨X, D, C⟩ and primal graph as on the previous slide.)

Tasks, each defined by a marginalization operator applied over the combination of the functions:
  MPE:          max_X ∏_j P_j
  CSP:          is there an X satisfying all constraints C_j?
  Max-CSP:      min_X Σ_j F_j
  Optimization: min_X Σ_j F_j

A COP is thus defined by the operators sum/product (combination) and min/max (marginalization) over consistent solutions.

Page 12: Chapter 13

Outline

  Introduction
  Optimization tasks for graphical models
  Solving by inference and search
  Inference: bucket elimination, dynamic programming, tree-clustering; mini-bucket elimination, belief propagation
  Search: branch and bound and best-first, lower-bounding heuristics, AND/OR search spaces
  Hybrids of search and inference: cutset decomposition, super-bucket scheme

Page 13: Chapter 13

Computing MPE

“Moral” graph over A, B, C, D, E.

Variable Elimination:
MPE = max_{a,e=0,d,c,b} P(a)·P(b|a)·P(c|a)·P(d|b,a)·P(e|b,c)
    = max_a P(a) · max_{e=0} max_d max_c P(c|a) · h^B(a,d,c,e),
where h^B(a,d,c,e) = max_b P(b|a)·P(d|b,a)·P(e|b,c).

Page 14: Chapter 13

Finding MPE = max_x P(x):

MPE = max_{a,e=0,d,c,b} P(a)·P(c|a)·P(b|a)·P(d|b,a)·P(e|b,c)

Bucket elimination (elimination operator max_b in the first bucket):
  bucket B: P(b|a), P(d|b,a), P(e|b,c)
  bucket C: P(c|a), h^B(a,d,c,e)
  bucket D: h^C(a,d,e)
  bucket E: e=0, h^D(a,e)
  bucket A: P(a), h^E(a)
  MPE = max_a P(a)·h^E(a)

Algorithm elim-mpe (Dechter 1996); Non-serial Dynamic Programming (Bertele and Brioschi, 1973)
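A minimal sketch of the bucket-elimination pattern behind elim-mpe is given below (illustrative Python, assuming factors are (scope, table) pairs over finite domains; it is not the course's reference implementation). Buckets are processed from the last variable in the ordering to the first, each bucket maximizing its product over the bucket's variable and placing the resulting function h in a lower bucket.

```python
from itertools import product

def eval_product(factors, env):
    """Product of the factors' entries under the assignment env."""
    p = 1.0
    for scope, table in factors:
        p *= table[tuple(env[v] for v in scope)]
    return p

def elim_mpe(factors, domains, ordering):
    """Bucket elimination for MPE (max-product) along `ordering` (first to last);
    buckets are processed in reverse, as in the trace above."""
    buckets = {x: [] for x in ordering}
    for scope, table in factors:                    # a factor goes to the bucket of its
        buckets[max(scope, key=ordering.index)].append((scope, table))  # latest variable

    mpe = 1.0
    for x in reversed(ordering):
        bucket = buckets[x]
        if not bucket:
            continue
        scope = sorted({v for s, _ in bucket for v in s if v != x}, key=ordering.index)
        h = {}                                      # new function h^x over `scope`
        for vals in product(*(domains[v] for v in scope)):
            env = dict(zip(scope, vals))
            h[vals] = max(eval_product(bucket, {**env, x: xv}) for xv in domains[x])
        if scope:
            buckets[max(scope, key=ordering.index)].append((scope, h))
        else:
            mpe *= h[()]                            # constant produced by a root bucket
    return mpe
```

Evidence such as e = 0 can be imposed by restricting that variable's domain to the observed value before calling the routine.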

Page 15: Chapter 13

Generating the MPE-tuple (assign variables in reverse order, consulting the recorded bucket functions):

  1. a' = argmax_a P(a)·h^E(a)
  2. e' = 0
  3. d' = argmax_d h^C(a',d,e')
  4. c' = argmax_c P(c|a')·h^B(a',d',c,e')
  5. b' = argmax_b P(b|a')·P(d'|b,a')·P(e'|b,c')

Return (a', b', c', d', e').

Page 16: Chapter 13

Complexity: exp(w* = 4), where w* is the “induced width” (max clique size of the induced graph).

(Same bucket-elimination trace as two slides back.)

Algorithm elim-mpe (Dechter 1996); Non-serial Dynamic Programming (Bertele and Brioschi, 1973)

Page 17: Chapter 13

Complexity of bucket elimination

Bucket elimination is time and space O(r · exp(w*(d))), where r is the number of functions and w*(d) is the induced width of the primal graph along the ordering d.

The effect of the ordering: for the example constraint graph, one ordering gives w*(d1) = 4 while another gives w*(d2) = 2.

Finding the ordering of smallest induced width is hard.

Page 18: Chapter 13

Directional i-consistency

Process the variables along the ordering A, B, C, D, E from last to first; in each bucket, enforce a bounded level of consistency among the earlier variables:
  bucket E: R(D,E), R(C,E), R(B,E)
  bucket D: R(C,D), R(A,D)
  bucket C: R(B,C)
  bucket B: R(A,B)
  bucket A:

Adaptive consistency records a constraint over all of a bucket's earlier neighbors (e.g., R_DCB); directional path-consistency (d-path) records binary constraints (e.g., R_DC, R_DB, R_CB); directional arc-consistency (d-arc) records unary constraints (e.g., R_D, R_C, R_B).

Page 19: Chapter 13

Mini-bucket approximation: MPE task

Split a bucket into mini-buckets =>bound complexity

Exponential complexity decrease: O(e^n) becomes O(e^r) + O(e^{n−r}); the exact bucket function is replaced by the combination of two smaller functions h^X and g^X computed on the mini-buckets.

Page 20: Chapter 13

Mini-Bucket Elimination

(Belief network over A, B, C, D, E with CPTs P(A), P(B|A), P(C|A), P(D|A,B), P(E|B,C).)

  bucket B: { P(E|B,C) }  { P(B|A), P(D|A,B) }   (split into two mini-buckets, each processed by max_B ∏)
  bucket C: P(C|A), h^B(C,E)
  bucket D: h^B(A,D)
  bucket E: E = 0, h^C(A,E)
  bucket A: P(A), h^E(A), h^D(A)

MPE* is an upper bound on MPE (U); generating a solution from the recorded functions yields a lower bound (L).

Page 21: Chapter 13

MBE-MPE(i) Algorithm Approx-MPE (Dechter&Rish 1997)

Input: i - max number of variables allowed in a mini-bucket. Output: [lower bound (the probability of a sub-optimal solution), upper bound].

Example: approx-mpe(3) versus elim-mpe: the mini-buckets keep the effective width at 2, versus w* = 4 for full bucket elimination.

Page 22: Chapter 13

Properties of MBE(i)

Complexity: O(r exp(i)) time and O(exp(i)) space. Yields an upper-bound and a lower-bound.

Accuracy: determined by upper/lower (U/L) bound.

As i increases, both accuracy and complexity increase.

Possible use of mini-bucket approximations: As anytime algorithms As heuristics in search

Other tasks: similar mini-bucket approximations for: belief updating, MAP and MEU (Dechter and Rish, 1997)
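The step that distinguishes MBE(i) from full bucket elimination is the partitioning of a bucket into mini-buckets of at most i variables. A possible greedy partitioning is sketched below (illustrative Python; the first-fit strategy is an assumption made here, not the algorithm's specification).

```python
def partition_bucket(bucket, i):
    """Split a bucket (a list of (scope, table) factors) into mini-buckets whose
    combined scopes have at most i variables; first-fit over factors sorted by
    decreasing scope size."""
    mini_buckets = []                               # entries: [set_of_vars, list_of_factors]
    for scope, table in sorted(bucket, key=lambda f: -len(f[0])):
        for entry in mini_buckets:
            if len(entry[0] | set(scope)) <= i:     # fits without exceeding i variables
                entry[0] |= set(scope)
                entry[1].append((scope, table))
                break
        else:
            mini_buckets.append([set(scope), [(scope, table)]])
    return mini_buckets
```

Each mini-bucket is then eliminated separately (by max for MPE, by sum for belief updating), and the combination of the resulting functions bounds the exact bucket function, which is the source of the upper bound U.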

Page 23: Chapter 13

Outline

  Introduction
  Optimization tasks for graphical models
  Solving by inference and search
  Inference: bucket elimination, dynamic programming; mini-bucket elimination
  Search: branch and bound and best-first, lower-bounding heuristics, AND/OR search spaces
  Hybrids of search and inference: cutset decomposition, super-bucket scheme

Page 24: Chapter 13

The Search Space

Objective function: F(X) = Σ_{i=1}^{9} f_i, to be minimized over the variables A, B, C, D, E, F.

Cost tables (one per function, over value pairs 00, 01, 10, 11):
  f1(A,B): 2 0 1 4
  f2(A,C): 3 0 0 1
  f3(A,E): 0 3 2 0
  f4(A,F): 2 0 0 2
  f5(B,C): 0 1 2 4
  f6(B,D): 4 2 1 0
  f7(B,E): 3 2 1 0
  f8(C,D): 1 4 0 0
  f9(E,F): 1 0 0 2

(The OR search tree enumerates the variables along a fixed ordering with A at the root; every level branches on the values 0 and 1.)

Page 25: Chapter 13

The Search Space

Arc-cost is calculated based on cost components.

(Same search tree and cost tables f1-f9 as on the previous slide; each arc is labeled with the sum of the cost components that become fully assigned at that point.)

Page 26: Chapter 13

The Value Function

Value of node = minimal cost solution below it

(Same search tree; every node is now annotated with its value, computed bottom-up as the minimum over its children of the child's value plus the connecting arc cost.)

Page 27: Chapter 13

An Optimal Solution

Value of node = minimal cost solution below it

(Same annotated search tree; following a minimal-value child at every node from the root traces an optimal solution.)

Page 28: Chapter 13

Basic Heuristic Search Schemes

A heuristic function f(x) computes a lower bound on the best extension of x and can be used to guide a heuristic search algorithm. We focus on:

1. Branch and Bound: use the heuristic function f(x^p) to prune the depth-first search tree. Linear space.
2. Best-First Search: always expand the node with the best heuristic value f(x^p). Needs lots of memory.

Page 29: Chapter 13

Classic Branch-and-Bound

At a node n of the OR search tree: g(n) is the cost of the path to n and h(n) is a heuristic estimate of completing it.

Lower bound: LB(n) = g(n) + h(n)
Upper bound: UB = cost of the best solution found so far

Prune below n if LB(n) ≥ UB.
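The pruning rule translates directly into a depth-first procedure. The sketch below is illustrative Python with assumed interfaces `arc_cost` and `h` (not the course's code); it returns a minimum-cost full assignment while pruning any node whose lower bound g(n) + h(n) reaches the current upper bound.

```python
import math

def branch_and_bound(variables, domains, arc_cost, h):
    """Depth-first Branch and Bound over the OR search tree.

    arc_cost(assignment, var, val): cost added by extending the assignment
        (which already includes var=val) with this choice.
    h(assignment): lower bound on the cost of the best extension (admissible).
    """
    best = {"ub": math.inf, "solution": None}

    def dfs(depth, assignment, g):
        if g + h(assignment) >= best["ub"]:         # prune: LB(n) = g(n) + h(n) >= UB
            return
        if depth == len(variables):                 # full assignment: new incumbent
            best["ub"], best["solution"] = g, dict(assignment)
            return
        var = variables[depth]
        for val in domains[var]:
            assignment[var] = val
            dfs(depth + 1, assignment, g + arc_cost(assignment, var, val))
            del assignment[var]

    dfs(0, {}, 0.0)
    return best["ub"], best["solution"]
```

With h = 0 this degenerates to exhaustive search; the stronger the heuristic (e.g., a mini-bucket bound), the earlier the cutoffs.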

Page 30: Chapter 13

How to Generate Heuristics

The principle of relaxed models

Linear optimization for integer programs

Mini-bucket elimination

Bounded directional consistency ideas

Page 31: Chapter 13

Generating Heuristic for graphical models(Kask and Dechter, 1999)

Given a cost function

C(a,b,c,d,e) = f(a) · f(b,a) · f(c,a) · f(e,b,c) · f(d,a,b)

Define an evaluation function over a partial assignment as the cost of its best extension:

f*(a,e,d) = min_{b,c} C(a,b,c,d,e) = f(a) · min_{b,c} [ f(b,a) · f(c,a) · f(e,b,c) · f(d,a,b) ]

= g(a,e,d) · H*(a,e,d)

(Partial search tree over A, E, D.)

Page 32: Chapter 13

Generating Heuristics (cont.)

H*(a,e,d) = min_{b,c} f(b,a) · f(c,a) · f(e,b,c) · f(d,a,b)

= min_c [ f(c,a) · min_b [ f(e,b,c) · f(b,a) · f(d,a,b) ] ]

≥ min_c [ f(c,a) · min_b f(e,b,c) · min_b [ f(b,a) · f(d,a,b) ] ]

= min_b [ f(b,a) · f(d,a,b) ] · min_c [ f(c,a) · min_b f(e,b,c) ]

= h^B(d,a) · h^C(e,a)

= H(a,e,d)

f(a,e,d) = g(a,e,d) · H(a,e,d) ≤ f*(a,e,d)

The heuristic function H is what is compiled during the preprocessing stage of the

Mini-Bucket algorithm.


Page 34: Chapter 13

Static MBE Heuristics

Given a partial assignment x^p, estimate the cost of the best extension to a full solution. The evaluation function f(x^p) can be computed using the functions recorded by the Mini-Bucket scheme.

Buckets (belief network over A, B, C, D, E):
  B: P(E|B,C), P(D|A,B), P(B|A)
  C: P(C|A), h^B(E,C)
  D: h^B(D,A)
  E: h^C(E,A)
  A: P(A), h^E(A), h^D(A)

f(a,e,D) = P(a) · h^B(D,a) · h^C(e,a)

f = g · h is admissible.

f(a,e,D) = g(a,e) · H(a,e,D)

Page 35: Chapter 13

Heuristics Properties

MB Heuristic is monotone, admissible

Retrieved in linear time. IMPORTANT: heuristic strength can vary with MB(i): a higher i-bound means more pre-processing, a stronger heuristic, and less search.

Allows controlled trade-off between preprocessing and search

Page 36: Chapter 13

Experimental Methodology

Algorithms: BBMB(i) - Branch and Bound with MB(i); BFMB(i) - Best-First with MB(i); MBE(i)

Test networks: Random Coding (Bayesian) CPCS (Bayesian) Random (CSP)

Measures of performance: compare accuracy given a fixed amount of time - how close is the cost found to the optimal solution; compare trade-off performance as a function of time.

Page 37: Chapter 13

Empirical Evaluation of mini-bucket heuristics,Bayesian networks, coding

(Plots: percentage of problems solved exactly versus time [sec], for BBMB(i) and BFMB(i) with i = 2, 6, 10, 14, on random coding networks with K = 100 and noise 0.28 and 0.32.)

Page 38: Chapter 13


Max-CSP experiments(Kask and Dechter, 2000)

Page 39: Chapter 13

Dynamic MB Heuristics

Rather than pre-compiling, the mini-bucket heuristics can be generated during search

Dynamic mini-bucket heuristics use the Mini-Bucket algorithm to produce a bound for any node in the search space

(a partial assignment, along the given variable ordering)

Page 40: Chapter 13

Dynamic MB and MBTE Heuristics(Kask, Marinescu and Dechter, 2003)

Rather than precompiling, compute the heuristics during search.

Dynamic MB: Dynamic mini-bucket heuristics use the Mini-Bucket algorithm to produce a bound for any node during search

Dynamic MBTE: We can compute heuristics simultaneously for all un-instantiated variables using mini-bucket-tree elimination .

MBTE is an approximation scheme defined over cluster-trees. It outputs multiple bounds for each variable and value extension at once.

Page 41: Chapter 13

Cluster Tree Elimination - example

Cluster tree with clusters 1 = {A,B,C}, 2 = {B,C,D,F}, 3 = {B,E,F}, 4 = {E,F,G} and separators BC, BF, EF (primal graph over A, B, C, D, E, F, G).

Messages:
  h_(1,2)(b,c) = Σ_a p(a) · p(b|a) · p(c|a,b)
  h_(2,1)(b,c) = Σ_{d,f} p(d|b) · p(f|c,d) · h_(3,2)(b,f)
  h_(2,3)(b,f) = Σ_{c,d} p(d|b) · p(f|c,d) · h_(1,2)(b,c)
  h_(3,2)(b,f) = Σ_e p(e|b,f) · h_(4,3)(e,f)
  h_(3,4)(e,f) = Σ_b p(e|b,f) · h_(2,3)(b,f)
  h_(4,3)(e,f) = p(G=g_e | e,f)

Page 42: Chapter 13

Mini-Clustering

Motivation: the time and space complexity of Cluster Tree Elimination depend on the induced width w* of the problem; when w* is large, CTE becomes infeasible.

The basic idea: try to reduce the size of each cluster (the exponent) by partitioning it into mini-clusters with fewer variables.

Accuracy parameter i = maximum number of variables in a mini-cluster

The idea was explored for variable elimination (Mini-Bucket)

Page 43: Chapter 13

Idea of Mini-Clustering

Split a cluster into mini-clusters => bound complexity.
Exponential complexity decrease: O(e^n) becomes O(e^r) + O(e^{n−r}).

cluster(u) = {h_1, ..., h_r, h_{r+1}, ..., h_n}
Exact message:  h = Σ_elim ∏_{i=1}^{n} h_i
After partitioning into {h_1, ..., h_r} and {h_{r+1}, ..., h_n}:
                g = Σ_elim ∏_{i=1}^{r} h_i   and   h = Σ_elim ∏_{i=r+1}^{n} h_i,
and the pair (g, h) replaces the exact message (their combination bounds it from above).

Page 44: Chapter 13

Mini-Clustering - example

(Same cluster tree: 1 = {A,B,C}, 2 = {B,C,D,F}, 3 = {B,E,F}, 4 = {E,F,G}, separators BC, BF, EF.) Each message is now a set of smaller functions:

  H_(1,2) = { h¹_(1,2)(b,c) = Σ_a p(a) · p(b|a) · p(c|a,b) }
  H_(2,1) = { h¹_(2,1)(b) = Σ_{d,f} p(d|b) · h¹_(3,2)(b,f),   h²_(2,1)(c) = Σ_{d,f} p(f|c,d) }
  H_(2,3) = { h¹_(2,3)(b) = Σ_{c,d} p(d|b) · h¹_(1,2)(b,c),   h²_(2,3)(f) = Σ_{c,d} p(f|c,d) }
  H_(3,2) = { h¹_(3,2)(b,f) = Σ_e p(e|b,f) · h¹_(4,3)(e,f) }
  H_(3,4) = { h¹_(3,4)(e,f) = Σ_b p(e|b,f) · h¹_(2,3)(b) · h²_(2,3)(f) }
  H_(4,3) = { h¹_(4,3)(e,f) = p(G=g_e | e,f) }

Page 45: Chapter 13

Mini Bucket Tree Elimination

(Same cluster tree; each edge now carries sets of mini-cluster messages in both directions: H_(1,2) = {h¹_(1,2)(b,c)}, H_(2,1) = {h¹_(2,1)(b), h²_(2,1)(c)}, H_(2,3) = {h¹_(2,3)(b), h²_(2,3)(f)}, H_(3,2) = {h¹_(3,2)(b,f)}, H_(3,4) = {h¹_(3,4)(e,f)}, H_(4,3) = {h¹_(4,3)(e,f)}.)

Page 46: Chapter 13

Mini-Clustering

Correctness and completeness: Algorithm MC(i) computes a bound (or an approximation) for each variable and each of its values.

MBTE: when the clusters are buckets in BTE.

Page 47: Chapter 13

Branch and Bound w/ Mini-Buckets

BB with static Mini-Bucket Heuristics (s-BBMB) Heuristic information is pre-compiled before search.

Static variable ordering, prunes current variable

BB with dynamic Mini-Bucket Heuristics (d-BBMB)

Heuristic information is assembled during search. Static variable ordering, prunes current variable

BB with dynamic Mini-Bucket-Tree Heuristics (BBBT)

Heuristic information is assembled during search. Dynamic variable ordering, prunes all future variables

Page 48: Chapter 13

Empirical Evaluation Algorithms:

Complete BBBT BBMB

Incomplete DLM GLS SLS IJGP IBP (coding)

Measures: Time Accuracy (% exact) #Backtracks Bit Error Rate (coding)

Benchmarks: Coding networks Bayesian Network

Repository Grid networks (N-by-N) Random noisy-OR networks Random networks

Page 49: Chapter 13

Real World Benchmarks

Average Accuracy and Time. 30 samples, 10 observations, 30 seconds

Page 50: Chapter 13

Empirical Results: Max-CSP Random Binary Problems: <N, K, C, T>

N: number of variables K: domain size C: number of constraints T: Tightness

Task: Max-CSP

Page 51: Chapter 13

BBBT(i) vs. BBMB(i).

(Plots: BBBT(i) vs. BBMB(i) on Max-CSP instances with N = 100, for i = 2 through 7.)

Page 52: Chapter 13

Searching the Graph; caching goods

C context(C) = [ABC]

D context(D) = [ABD]

F context(F) = [F]

E context(E) = [AE]

B context(B) = [AB]

A context(A) = [A]

(Primal graph and cost tables f1-f9 as in the earlier search-space slides; the search graph merges nodes that agree on their context, and arcs carry the corresponding cost components.)

Page 53: Chapter 13

Searching the Graph; caching goods

C context(C) = [ABC]

D context(D) = [ABD]

F context(F) = [F]

E context(E) = [AE]

B context(B) = [AB]

A context(A) = [A]

(Same context-minimal search graph, now annotated with node values computed bottom-up; merged nodes are computed once and their values cached as "goods".)

Page 54: Chapter 13

Outline

  Introduction
  Optimization tasks for graphical models
  Solving by inference and search
  Inference: bucket elimination, dynamic programming; mini-bucket elimination, belief propagation
  Search: branch and bound and best-first, lower-bounding heuristics, AND/OR search spaces
  Hybrids of search and inference: cutset decomposition, super-bucket scheme

Page 55: Chapter 13

Classic OR Search Space

(Full OR search tree over the primal graph, along the ordering A, B, E, C, D, F; every level branches on the values 0 and 1.)

Page 56: Chapter 13

AND/OR Search Space

(AND/OR search tree guided by a DFS (pseudo) tree of the primal graph: A is the root with child B; B has children C and E; C has child D; E has child F. OR levels branch on variables, AND levels on their values 0/1, and the subproblems below each assignment of B are searched independently.)

Page 57: Chapter 13

AND/OR vs. OR

(Side by side: the OR search tree has depth 6, one level per variable along the ordering, while the AND/OR search tree guided by the pseudo-tree has depth 4; independent subproblems are duplicated in the OR tree but branch off separately in the AND/OR tree.)

AND/OR size: exp(4), OR size exp(6)

Page 58: Chapter 13

OR space vs. AND/OR space

width | height | OR space: time (sec.), nodes, backtracks | AND/OR space: time (sec.), AND nodes, OR nodes
  5   |   10   | 3.154, 2,097,150, 1,048,575              | 0.03, 10,494,  5,247
  4   |    9   | 3.135, 2,097,150, 1,048,575              | 0.01,  5,102,  2,551
  5   |   10   | 3.124, 2,097,150, 1,048,575              | 0.03,  8,926,  4,463
  4   |   10   | 3.125, 2,097,150, 1,048,575              | 0.02,  7,806,  3,903
  5   |   13   | 3.104, 2,097,150, 1,048,575              | 0.1,  36,510, 18,255

Random graphs with 20 nodes, 20 edges and 2 values per node.

Page 59: Chapter 13

AND/OR vs. OR

(Same comparison when some assignments are inconsistent, e.g. (A=1,B=1) and (B=0,C=0): the corresponding subtrees are pruned in both the OR and the AND/OR space.)

Page 60: Chapter 13

AND/OR vs. OR

(Same trees as above.)

AND/OR: linear space; time O(exp(m)), which is O(exp(w* log n)).
OR: linear space; time O(exp(n)).

Page 61: Chapter 13

#CSP – AND/OR Search Tree

(Constraint graph over A, B, C, D, E, F and the pseudo-tree used before: A at the root, B below A, with E (child F) and C (child D) below B.)

Relations (1 = allowed tuple):
  R_ABC: 000→1, 001→1, 010→0, 011→1, 100→1, 101→1, 110→1, 111→0
  R_ABE: 000→1, 001→0, 010→1, 011→1, 100→0, 101→1, 110→1, 111→0
  R_AEF: 000→0, 001→1, 010→1, 011→1, 100→1, 101→1, 110→1, 111→0
  R_BCD: 000→1, 001→1, 010→1, 011→0, 100→1, 101→0, 110→1, 111→1

(The AND/OR search tree enumerates the consistent assignments along this pseudo-tree.)


Page 63: Chapter 13

#CSP – AND/OR Tree DFS

(Same constraint graph, pseudo-tree, and relations as on the previous slide. The DFS traversal assigns value 1 to a consistent leaf and 0 to an inconsistent one, multiplies values at AND nodes, and sums them at OR nodes; the root receives the total solution count, 14.)

OR node: Marginalization operator (summation)

AND node: Combination operator (product)

Page 64: Chapter 13

AND/OR Tree Search for COP

Goal: min_X Σ_{i=1}^{9} f_i (cost tables f1-f9 as in the earlier slides).

(The AND/OR search tree is traversed depth-first: the AND node combines values by summation, the OR node marginalizes by minimization; the value at the root is the optimal cost.)

Page 65: Chapter 13

Summary of AND/OR Search Trees

Based on a backbone pseudo-tree

A solution is a subtree

Each node has a value – cost of the optimal solution to the subproblem (computed recursively based on the values of the descendants)

Solving a task = finding the value of the root node

AND/OR search trees and algorithms go back to [Freuder & Quinn, 1985], [Collin, Dechter & Katz, 1991], and [Bayardo & Miranker, 1995].

Space: O(n). Time: O(exp(m)), where m is the depth of the pseudo-tree, which is O(exp(w* log n)). BFS is time and space O(exp(w* log n)).
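The recursive value computation over an AND/OR tree can be written directly from the pseudo-tree. The sketch below is illustrative Python for a min-sum COP, with assumed interfaces `pseudo_children` and `cost` (not the course's code); it computes the value of an OR node as the minimum over the variable's values of the arc cost plus the sum of the values of the independent child subproblems.

```python
def and_or_value(var, assignment, pseudo_children, domains, cost):
    """Value of the OR node for `var` given the current partial assignment.

    cost(var, val, assignment): sum of the cost functions that become fully
        assigned when var=val is added (the AND arc label).
    pseudo_children[var]: children of `var` in the pseudo-tree.
    """
    best = float("inf")
    for val in domains[var]:
        assignment[var] = val
        value = cost(var, val, assignment) + sum(   # AND node: combine child OR values
            and_or_value(child, assignment, pseudo_children, domains, cost)
            for child in pseudo_children.get(var, []))
        best = min(best, value)                     # OR node: marginalize by minimization
        del assignment[var]
    return best

# Solving the task = value of the root:  and_or_value(root, {}, children, domains, cost)
```

The recursion uses O(n) space and O(exp(m)) time, matching the bounds stated above.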

Page 66: Chapter 13

An AND/OR Graph: Caching Goods

(AND/OR search graph for a larger problem: a pseudo-tree over A, B, C, D, E, F extended with G, H, J, K. Nodes that root identical subproblems are merged, so each distinct subproblem is expanded once and its value is cached as a "good".)

Page 67: Chapter 13

Context-based Caching

Caching is possible when context is the same

context = parent-separator set in induced pseudo-graph

= current variable + parents connected to subtree below

(Pseudo-tree over A, B, C, D, E, F with G, H, J, K, as on the previous slide.)

  context(B) = {A, B}
  context(C) = {A, B, C}
  context(D) = {D}
  context(F) = {F}
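Context-based caching plugs into the same recursion: the value of an OR node depends only on the assignment to the ancestors in its context, so it can be memoized under that key. A minimal sketch follows (illustrative Python, extending the tree-search sketch given earlier; `contexts` is assumed to map each variable to its context).

```python
def and_or_graph_value(var, assignment, pseudo_children, domains, cost, contexts, cache):
    """AND/OR search with caching: two OR nodes for `var` whose ancestors agree on
    context(var) root identical subproblems, so the value is computed once."""
    key = (var, tuple(assignment[v] for v in contexts[var] if v in assignment))
    if key in cache:
        return cache[key]
    best = float("inf")
    for val in domains[var]:
        assignment[var] = val
        value = cost(var, val, assignment) + sum(
            and_or_graph_value(child, assignment, pseudo_children, domains,
                               cost, contexts, cache)
            for child in pseudo_children.get(var, []))
        best = min(best, value)
        del assignment[var]
    cache[key] = best
    return best
```

Each variable's table has at most exp(|context|) entries, which is where the O(exp(w*)) time and space of searching the context-minimal AND/OR graph comes from.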

Page 68: Chapter 13

Complexity of AND/OR Graph

Theorem: Traversing the AND/OR search graph is time and space exponential in the induced width (tree-width).

If applied to the OR graph complexity is time and space exponential in the path-width.

Page 69: Chapter 13

#CSP - AND/OR Search Tree

(Recap: the same constraint graph, relations, pseudo-tree, and AND/OR search tree as in the earlier #CSP slide.)

Page 70: Chapter 13

#CSP - AND/OR Tree DFS

(Recap: the same DFS value computation as in the earlier #CSP slide, with a total count of 14 solutions at the root.)

Page 71: Chapter 13

#CSP – AND/OR Search Graph(Caching Goods)

(The AND/OR search graph for the same #CSP: subproblems with identical contexts are merged, so the unifiable subtrees below the deeper variables appear only once and their counts are cached.)

Page 72: Chapter 13

#CSP – AND/OR Search Graph(Caching Goods)

(Same AND/OR search graph as on the previous slide.)

Time and space: O(exp(w*)).

Page 73: Chapter 13

All Four Search Spaces

Full OR search tree

126 nodes

Full AND/OR search tree

54 AND nodes

Context minimal OR search graph

28 nodes

Context minimal AND/OR search graph

18 AND nodes

(The four spaces for the running example, from largest to smallest: the full OR search tree, the full AND/OR search tree, the context-minimal OR search graph, and the context-minimal AND/OR search graph.)

Page 74: Chapter 13

AND/OR vs. OR DFS algorithms

AND/OR tree:  space O(n);         time O(n k^m), which is O(n k^{w* log n})  (Freuder 1985; Bayardo 1995; Darwiche 2001)
AND/OR graph: space O(n k^{w*});  time O(n k^{w*})
OR tree:      space O(n);         time O(k^n)
OR graph:     space O(n k^{pw*}); time O(n k^{pw*})

k = domain size, m = pseudo-tree depth, n = number of variables, w* = induced width, pw* = path width

Page 75: Chapter 13

Searching AND/OR Graphs

AO(i): searches depth-first and caches on contexts of size at most i, where i is the maximum size of a cache table (i.e., the number of variables in a context).

  i = 0:   space O(n),        time O(exp(w* log n))
  i = w*:  space O(exp(w*)),  time O(exp(w*))

AO(i) time complexity for intermediate i?

Page 76: Chapter 13

End of class

We did not cover the rest of the slides

Page 77: Chapter 13

AND/OR Branch-and-Bound (AOBB)

Associate each node n with a static heuristic estimate h(n) of v(n); h(n) is a lower bound on the value v(n).

For every node n in the search tree:
  ub(n) - cost of the current best solution rooted at n
  lb(n) - lower bound on the minimal cost at n

Page 78: Chapter 13

Lower/Upper Bounds

(OR node X with value branches 0 and 1; below X=1 are OR children A, B, C, with B expanded at value 0.)

  LB(B) = LB(B,0) = h(B,0)
  LB(X,1) = l(X,1) + v(A) + h(C) + LB(B)
  LB(X) = LB(X,1)
  UB(X) = best cost found below X so far (i.e., v(X,0))

Prune below AND node (B,0) if LB(X) ≥ UB(X).

Page 79: Chapter 13

Shallow/Deep Cutoffs

Shallow cutoff: at a node X the bound LB(X) = h(X,1) already reaches UB(X), so the branch X=1 is pruned immediately.

Deep cutoff: the violation LB(X) ≥ UB(X) is detected only after part of the subtree below X=1 has been expanded and its node values have tightened the bound.

Prune if LB(X) ≥ UB(X). Reminiscent of minimax shallow/deep cutoffs.

Page 80: Chapter 13

Summary of AOBB

Traverses the AND/OR search tree in a depth-first manner

Lower bounds computed based on heuristic estimates of nodes at the frontier of search, as well as the values of nodes already explored

Prunes the search space as soon as an upper-lower bound violation occurs

Page 81: Chapter 13

Heuristics for AND/OR

In the AND/OR search space h(n) can be computed using any heuristic. We used:

Static Mini-Bucket heuristics

Dynamic Mini-Bucket heuristics

Maintaining FDAC [Larrosa & Schiex03]

(full directional soft arc-consistency)

Page 82: Chapter 13

Empirical Evaluation Tasks

Tasks: solving WCSPs; finding the MPE in belief networks.

Benchmarks (WCSP): random binary WCSPs, RLFAP networks (CELAR6), Bayesian Networks Repository.

Algorithms: s-AOMB(i), d-AOMB(i), AOMFDAC, s-BBMB(i), d-BBMB(i), BBMFDAC; static variable ordering (DFS traversal of the pseudo-tree).

Page 83: Chapter 13

Random Binary WCSPs(Marinescu and Dechter, 2005)

Random networks with n=20 (number of variables), d=5 (domain size), c=100 (number of constraints), t=70% (tightness). Time limit 180 seconds. AO search is superior to OR search.

(Plots: s-AOMB vs. s-BBMB and d-AOMB vs. d-BBMB.)

Page 84: Chapter 13

Random Binary WCSPs (contd.)

n=20 (variables), d=5 (domain size), c=100 (constraints), t=70% (tightness)

dense sparse

n=50 (variables), d=5 (domain size), c=80 (constraints), t=70% (tightness)

AOMB for large i is competitive with AOMFDAC

Page 85: Chapter 13

Resource Allocation

Instance     | BBMFDAC: time (sec), nodes | AOMFDAC: time (sec), nodes
CELAR6-SUB0  |      2.78,      1,871      |      1.98,       435
CELAR6-SUB1  |  2,420.93,    364,986      |    981.98,   180,784
CELAR6-SUB2  |  8,801.12, 19,544,182      |  1,138.87,   175,377
CELAR6-SUB3  | 38,889.20, 91,168,896      |  4,028.59,   846,986
CELAR6-SUB4  | 84,478.40,  6,955,039      | 47,115.40, 4,643,229

CELAR6 sub-instances

Radio Link Frequency Assignment Problem (RLFAP)

AOMFDAC is superior to ORMFDAC

Page 86: Chapter 13

Bayesian Networks Repository

Network (n,d,w*,h) | Algorithm | for each i = 2, 3, 4, 5: time, nodes ("-" means the 600-second time limit was exceeded)

s-AOMB(i) - 8.5M - 7.6M 46.22 807K 0.563 9.6K

Barley s-BBMB(i) - 16M - 18M - 17M - 14M

(48,67,7,17) d-AOMB(i) - 79K 136.0 23K 12.55 667 45.95 567

  d-BBMB(i) - 2.2M - 1M 346.1 76K - 86K

  s-AOMB(i) 57.36 1.2M 12.08 260K 7.203 172K 1.657 43K

Munin1 s-BBMB(i) - 8.5M - 9M - 10M - 8M

(189,21,11,24) d-AOMB(i) 66.56 185K 12.47 8.1K 10.30 1.6K 11.99 523

  d-BBMB(i) - 405K - 430K - 235K 14.63 917

  s-AOMB(i) - 5.9M - 4.9M 1.313 17K 0.453 6K

Munin3 s-BBMB(i) - 1.4M - 1.2M - 316K - 1.5M

(1044,21,7,25) d-AOMB(i) - 2.3M 68.64 58K 3.594 5.9K 2.844 3.8K

  d-BBMB(i) - 33K - 125K - 52K - 31K

Time limit 600 seconds

available at http://www.cs.huji.ac.il/labs/compbio/Repository

Static AO is better with accurate heuristic (large i)

Page 87: Chapter 13

Outline

  Introduction
  Optimization tasks for graphical models
  Solving by inference and search
  Inference: bucket elimination, dynamic programming; mini-bucket elimination, belief propagation
  Search: branch and bound and best-first, lower-bounding heuristics, AND/OR search spaces; searching trees, searching graphs
  Hybrids of search and inference: cutset decomposition, super-bucket scheme

Page 88: Chapter 13

From Searching Trees to Searching Graphs

Any two nodes that root identical subtrees/subgraphs can be merged

Minimal AND/OR search graph: closure under merge of the AND/OR search tree

Inconsistent sub-trees can be pruned too. Some portions can be collapsed or reduced.

Page 89: Chapter 13

AND/OR Search Graph

(Primal graph and pseudo-tree over A, B, C, D, E, F as before.)

  context(A) = {A}
  context(B) = {B,A}
  context(C) = {C,B}
  context(D) = {D}
  context(E) = {E,A}
  context(F) = {F}

(The full AND/OR search tree, before merging.)

Page 90: Chapter 13

AND/OR Search Graph

(After merging nodes with identical contexts, the tree collapses into the context-minimal AND/OR search graph; primal graph, pseudo-tree, and contexts as on the previous slide.)

Page 91: Chapter 13

Context-based caching

(Pseudo-tree and contexts as before; since context(C) = {C,B}, the value computed below C is cached in a table indexed by the assignment to B and C.)

Cache table for C:
  B C | value
  0 0 |   5
  0 1 |   2
  1 0 |   2
  1 1 |   0

Space: O(exp(2)).

Page 92: Chapter 13

Searching AND/OR Graphs

AO(j): searches depth-first and caches on contexts of size at most j, where j is the maximum size of a cache table (i.e., the number of variables in a context).

  j = 0:   space O(n),        time O(exp(w* log n))
  j = w*:  space O(exp(w*)),  time O(exp(w*))

AO(j) time complexity for intermediate j?

Page 93: Chapter 13

Graph AND/OR Branch-and-Bound - AOBB(j)

Like BB on tree except, during search, merge nodes based on context (caching); maintain cache tables of size O(exp(j)), where j is a bound on the size of the context.

Heuristics Static mini-bucket Dynamic mini-bucket Soft directional arc-consistency

[Marinescu & Dechter, CP2005]

Page 94: Chapter 13

Pseudo-trees (I)

AND/OR graph/tree search algorithms are influenced by the quality of the pseudo-tree.

Finding the minimal context/depth pseudo-tree is a hard problem

Heuristics Min-fill (min context) Hypergraph separation (min depth)

Page 95: Chapter 13

Pseudo-trees (II)

MIN-FILL [Kjærulff, 1990], [Bayardo & Miranker, 1995]: depth-first traversal of the induced graph constructed along some elimination order; the elimination order prefers variables with the smallest "fill set".

HYPERGRAPH [Darwiche, 2001]: constraints are vertices of the hypergraph and variables are hyperedges; recursive decomposition of the hypergraph while minimizing the separator size (i.e., number of variables) at each step, using the state-of-the-art software package hMeTiS.

Page 96: Chapter 13

Quality of pseudo-trees

Network | hypergraph: width, depth | min-fill: width, depth

barley 7 13 7 23

diabetes 7 16 4 77

link 21 40 15 53

mildew 5 9 4 13

munin1 12 17 12 29

munin2 9 16 9 32

munin3 9 15 9 30

munin4 9 18 9 30

water 11 16 10 15

pigs 11 20 11 26

Network | hypergraph: width, depth | min-fill: width, depth

spot5 47 152 39 204

spot28 108 138 79 199

spot29 16 23 14 42

spot42 36 48 33 87

spot54 12 16 11 33

spot404 19 26 19 42

spot408 47 52 35 97

spot503 11 20 9 39

spot505 29 42 23 74

spot507 70 122 59 160

(First table: Bayesian Networks Repository; second table: SPOT5 benchmarks.)

Page 97: Chapter 13

Empirical Evaluation Tasks

Solving binary WCSPs Finding the MPE in belief networks

Benchmarks Random networks Resource allocation (SPOT5) Genetic linkage analysis

Algorithms s-AOMB(i,j), d-AOMB(i,j), AOMFDAC(j), VE+C j is the cache bound Static variable ordering

Page 98: Chapter 13

Random Networks

Random networks with n=100 (number of variables), d=3 (domain size), c=90 (number of CPTs), p=2 (number of parents per CPT). Time limit 180 seconds.

(Plots: static heuristics vs. dynamic heuristics.)

Caching helps for static heuristics with small i-bounds

Page 99: Chapter 13

Resource Allocation – SPOT5

Columns: Network | Algorithm | hypergraph pseudo-tree: (w*,h), no cache (time, nodes), cache (time, nodes) | min-fill pseudo-tree: (w*,h), no cache (time, nodes), cache (time, nodes)

29b AOMFDAC (16,22) 5.938 170,823 1.492 40,428 (14,42) 5.036 79,866 3.237 34,123

(83,394) sAOMB(12)   1.002 8,458 1.012 1,033   0.381 997 0.411 940

42b AOMFDAC (31,43) 1,043 6,071,390 884.1 3,942,942 (18,62) - 22,102,050 - 17,911,719

(191, 1151) sAOMB(16)   132.0 2,871,804 127.4 2,815,503   3.254 11,636 3.164 9,030

54b AOMFDAC (12,14) 0.401 6,581 0.29 3,377 (9,19) 1.793 28,492 0.121 2,087

(68, 197) sAOMB(10)   0.03 74 0.03 74   0.02 567 0.02 381

404b AOMFDAC (19,23) 0.02 148 0.01 138 (19,57) 2.043 21,406 0.08 1,222

(101, 595) sAOMB(12)   0.01 101 0.01 101   0.02 357 0.01 208

503b AOMFDAC (9,14) 0.02 405 0.01 307 (8,46) 1077.1 19,041,552 0.05 701

(144, 414) sAOMB(8)   0.01 150 0.01 150   0.03 1,918 0.01 172

505b AOMFDAC (19,32) 17.8 368,247 5.20 69,045 (16,98) - 9,872,078 15.43 135,641

(241, 1481) sAOMB(14)   5.618 683 6.208 683   4.997 1912 5.096 831

Time limit 1800 seconds

Cache means j = i; we picked the best-performing i.

The MB(i) heuristic is strong enough to prune, so caching is less relevant.

Page 100: Chapter 13

(Genetic linkage analysis network for 6 people and 3 markers: for each person i and marker j there are locus variables L_ijm, L_ijf, a phenotype variable X_ij, and selector variables S_ijm.)

Page 101: Chapter 13

Empirical evaluation

pedigree (n, d) | VE+C time | SUPERLINK 1.5 time | s-AOMB(i,c): (i,c), time, nodes
23 (309, 5)     |   1,144   |   6,809            | (14,18),  4.29, 122,811
30 (1016, 5)    |  26,719   |  28,740            | (18,22), 27.71, 535,763

*averages over 5 runs

Page 102: Chapter 13

Pedigree 23

(Detailed s-AOMB(i,c) results for pedigree 23 (n=309, d=5) with a hypergraph pseudo-tree, varying i = 12-18 and cache bound c = 0-18: times range from roughly 4 to 25 seconds and node counts from roughly 25K to 1.2M, improving as i and c grow. For comparison, VE+C takes 1,144 seconds and SUPERLINK 1.5 takes 6,809 seconds. Averages over 5 runs.)

Page 103: Chapter 13

Pedigree 30

(Detailed s-AOMB(i,c) results for pedigree 30 (n=1016, d=5), varying i = 16-22 and cache bound c = 0-22: times range from roughly 28 to 362 seconds and node counts from roughly 10K to 11.5M. For comparison, VE+C takes 26,719 seconds and SUPERLINK 1.5 takes 28,740 seconds. Averages over 5 runs.)

Page 104: Chapter 13

Outline

  Introduction
  Optimization tasks for graphical models
  Solving by inference and search
  Inference: bucket elimination, dynamic programming; mini-bucket elimination, belief propagation
  Search: branch and bound and best-first, lower-bounding heuristics, AND/OR search spaces
  Hybrids of search and inference: cutset decomposition, super-bucket scheme