Chapter 13
Constraint Optimization, Counting, and Enumeration
From satisfaction to counting and enumeration and to general graphical models
ICS-275, Spring 2007
Outline
- Introduction
- Optimization tasks for graphical models
- Solving optimization problems with inference and search
- Inference: bucket elimination (dynamic programming), mini-bucket elimination
- Search: branch and bound and best-first, lower-bounding heuristics, AND/OR search spaces
- Hybrids of search and inference: cutset decomposition, super-bucket scheme
Example: map coloring
Variables - countries (A, B, C, etc.)
Values - colors (e.g., red, green, yellow)
Constraints: A ≠ B, A ≠ D, D ≠ E, etc.
Allowed pairs for two adjacent countries A and B: (red, green), (red, yellow), (green, red), (green, yellow), (yellow, green), (yellow, red)
[Constraint graph over countries A, B, C, D, E, F, G]
Task: consistency? Find a solution, all solutions, counting
Constraint Satisfaction
Propositional Satisfiability: φ = {(¬C), (A ∨ B ∨ C), (¬A ∨ B ∨ E), (¬B ∨ C ∨ D)}
Constraint Optimization Problems for Graphical Models
A finite COP is a triple ⟨X, D, F⟩ where:
X = {X1,…,Xn} - variables
D = {D1,…,Dn} - domains
F = {f1,…,fm} - cost functions
Global cost function: F(X) = Σ_{i=1}^{m} f_i(X_i)
Example: f(A,B,D) has scope {A,B,D}:
A B D | Cost
1 2 3 | 3
1 3 2 | 2
2 1 3 | 0
2 3 1 | 0
3 1 2 | 5
3 2 1 | 0
Constraint Optimization Problems for Graphical Models (cont.)
Primal graph: variables → nodes; functions/constraints → arcs.
Example with f1(A,B,D), f2(D,F,G), f3(B,C,F) over variables A, B, C, D, F, G:
F(a,b,c,d,f,g) = f1(a,b,d) + f2(d,f,g) + f3(b,c,f)
Constrained Optimization
Example: power plant scheduling
Variables: {X1,…,Xn}, domain {ON, OFF}
Constraints: min-up and min-down time, power demand: Σ_i Power(X_i) ≥ Demand
Objective: minimize TotalFuelCost(X1,…,XN)
Probabilistic Networks
Nodes: Smoking, Cancer, Bronchitis, X-Ray, Dyspnoea, with CPTs P(S), P(C|S), P(B|S), P(X|C,S), P(D|C,B)
P(S,C,B,X,D) = P(S) · P(C|S) · P(B|S) · P(X|C,S) · P(D|C,B)
P(D|C,B):
C B | D=0 D=1
0 0 | 0.1 0.9
0 1 | 0.7 0.3
1 0 | 0.8 0.2
1 1 | 0.9 0.1
Graphical models
A graphical model ⟨X,D,C⟩: X = {X1,…,Xn} variables, D = {D1,…,Dn} domains, C = {F1,…,Ft} functions (constraints, CPTs, CNFs); primal graph over A, B, C, D, E, F.
A function can be a constraint, e.g., F_i: A ∧ F → C, or a CPT, e.g., F_i = P(F | A, C).
Tasks:
MPE: max_X Π_j P_j
CSP: Π_j C_j (consistency)
Max-CSP: min_X Σ_j F_j
Optimization: min_X Σ_j F_j
A COP problem is defined by the operators sum, product and min, max over consistent solutions.
Outline
- Introduction
- Optimization tasks for graphical models; solving by inference and search
- Inference: bucket elimination, dynamic programming, tree-clustering; mini-bucket elimination, belief propagation
- Search: branch and bound and best-first, lower-bounding heuristics, AND/OR search spaces
- Hybrids of search and inference: cutset decomposition, super-bucket scheme
Computing MPE
Finding MPE = max_x P(x) over the "moral" graph of A, B, C, D, E:
MPE = max_{a,c,d,e,b} P(a) P(c|a) P(b|a) P(d|b,a) P(e|b,c)
    = max_a P(a) max_{e=0} max_d max_c P(c|a) max_b P(b|a) P(d|b,a) P(e|b,c)
Variable elimination processes one bucket at a time (elimination operator max_b, etc.):
bucket B: P(b|a), P(d|b,a), P(e|b,c)  →  h^B(a,d,c,e)
bucket C: P(c|a), h^B(a,d,c,e)        →  h^C(a,d,e)
bucket D: h^C(a,d,e)                  →  h^D(a,e)
bucket E: e=0, h^D(a,e)               →  h^E(a)
bucket A: P(a), h^E(a)                →  MPE
Algorithm elim-mpe (Dechter 1996); Non-serial Dynamic Programming (Bertele and Brioschi, 1973)
Generating the MPE-tuple (assign variables in reverse order, consulting the recorded functions):
1. a' = argmax_a P(a) · h^E(a)
2. e' = 0
3. d' = argmax_d h^C(a', d, e')
4. c' = argmax_c P(c|a') · h^B(a', d', c, e')
5. b' = argmax_b P(b|a') · P(d'|b, a') · P(e'|b, c')
Return (a', b', c', d', e')
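The bucket-elimination schedule above can be sketched in code. This is a minimal illustrative implementation of max-product variable elimination in the spirit of elim-mpe; the table representation and the CPT-like numbers below are assumptions made for the example, not values from the slides.

```python
import itertools

def elim_mpe(order, domains, factors):
    """Bucket elimination for MPE (max-product): place each function in
    the bucket of its latest variable, then process buckets last-to-first,
    maximizing that variable out and passing the message h^v down."""
    pos = {v: i for i, v in enumerate(order)}
    buckets = {v: [] for v in order}
    for scope, table in factors:
        buckets[max(scope, key=pos.get)].append((scope, table))

    def evaluate(funcs, assignment):
        p = 1.0
        for scope, table in funcs:
            p *= table[tuple(assignment[v] for v in scope)]
        return p

    result = 1.0
    for v in reversed(order):                 # eliminate deepest bucket first
        funcs = buckets[v]
        if not funcs:
            continue
        new_scope = tuple(sorted({u for s, _ in funcs for u in s if u != v},
                                 key=pos.get))
        h = {}                                # the message h^v over new_scope
        for vals in itertools.product(*(domains[u] for u in new_scope)):
            partial = dict(zip(new_scope, vals))
            h[vals] = max(evaluate(funcs, {**partial, v: x})
                          for x in domains[v])
        if new_scope:                         # place h^v in its bucket
            buckets[max(new_scope, key=pos.get)].append((new_scope, h))
        else:                                 # constant message: fold in
            result *= h[()]
    return result

# hypothetical CPT-like tables for the network P(a)P(b|a)P(c|a)P(d|b,a)P(e|b,c)
domains = {v: [0, 1] for v in 'ABCDE'}
factors = [
    (('A',), {(0,): 0.6, (1,): 0.4}),
    (('A', 'B'), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.8}),
    (('A', 'C'), {(0, 0): 0.3, (0, 1): 0.7, (1, 0): 0.5, (1, 1): 0.5}),
    (('A', 'B', 'D'), {(0, 0, 0): 0.1, (0, 0, 1): 0.9, (0, 1, 0): 0.4,
                       (0, 1, 1): 0.6, (1, 0, 0): 0.7, (1, 0, 1): 0.3,
                       (1, 1, 0): 0.2, (1, 1, 1): 0.8}),
    (('B', 'C', 'E'), {(0, 0, 0): 0.2, (0, 0, 1): 0.8, (0, 1, 0): 0.6,
                       (0, 1, 1): 0.4, (1, 0, 0): 0.1, (1, 0, 1): 0.9,
                       (1, 1, 0): 0.3, (1, 1, 1): 0.7}),
]
mpe = elim_mpe(list('ABCDE'), domains, factors)
```

The schedule mirrors the slide: bucket E produces h^E(B,C) placed in bucket C, and so on, until bucket A yields a constant, the MPE value.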
Complexity
The same bucket schedule, annotated: processing bucket B is exp(w* = 4), where w* is the "induced width" (max clique size of the induced graph):
MPE = max_{a,c,d,e,b} P(a) P(c|a) P(b|a) P(d|b,a) P(e|b,c)
Algorithm elim-mpe (Dechter 1996); Non-serial Dynamic Programming (Bertele and Brioschi, 1973)
Complexity of bucket elimination
Bucket elimination is time and space O(r · exp(w*(d))), where r is the number of functions and w*(d) is the induced width of the primal graph along ordering d.
The effect of the ordering: for the example constraint graph, one ordering gives w*(d1) = 4 while another gives w*(d2) = 2.
Finding the smallest induced width is hard.
Directional i-consistency
Bucket schedule along the ordering A, B, C, D, E, with each constraint placed in the bucket of its latest variable:
E: D≠E, C≠E, B≠E
D: C≠D, A≠D
C: B≠C
B: A≠B
A:
Adaptive consistency records full-arity constraints such as R_DCB; d-path records binary constraints such as R_DC, R_DB, R_CB; d-arc records unary constraints such as R_D, R_C.
Mini-bucket approximation: MPE task
Split a bucket into mini-buckets ⇒ bound complexity:
max_X (h · g) ≤ (max_X h) · (max_X g)
Exponential complexity decrease: O(e^n) → O(e^r) + O(e^{n−r})
Mini-Bucket Elimination
Network: P(A), P(B|A), P(C|A), P(D|A,B), P(E|B,C). Bucket B is split into two mini-buckets, each processed by max_B separately:
Bucket B: { P(E|B,C) }  { P(B|A), P(D|A,B) }
Bucket C: P(C|A), h^B(C,E)
Bucket D: h^B(A,D)
Bucket E: E = 0, h^C(A,E)
Bucket A: P(A), h^E(A), h^D(A)
MPE* is an upper bound on MPE (U); generating a solution from it yields a lower bound (L).
MBE-MPE(i), algorithm approx-mpe (Dechter & Rish 1997)
Input: i - max number of variables allowed in a mini-bucket. Output: [lower bound (P of a sub-optimal solution), upper bound].
Example: approx-mpe(3) versus elim-mpe: the approximation works with w* = 2 instead of w* = 4.
Properties of MBE(i)
Complexity: O(r exp(i)) time and O(exp(i)) space. Yields an upper bound and a lower bound.
Accuracy: determined by the upper/lower (U/L) bound gap.
As i increases, both accuracy and complexity increase.
Possible uses of mini-bucket approximations: as anytime algorithms; as heuristics in search.
Other tasks: similar mini-bucket approximations exist for belief updating, MAP and MEU (Dechter and Rish, 1997)
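The bucket-splitting step of MBE(i) can be sketched as a greedy partition: each function joins the first mini-bucket whose combined scope stays within the i-bound, otherwise it opens a new mini-bucket. Real implementations choose partitions more carefully; the data layout here is an assumption.

```python
def mini_buckets(bucket_funcs, bound_i):
    """Greedily partition a bucket's functions into mini-buckets whose
    combined scope has at most bound_i variables (a sketch of the
    MBE(i) partitioning step)."""
    minis = []
    for scope, table in bucket_funcs:
        for mb in minis:
            union = set().union(*(set(s) for s, _ in mb)) | set(scope)
            if len(union) <= bound_i:       # fits within the i-bound
                mb.append((scope, table))
                break
        else:                               # no mini-bucket fits: open one
            minis.append([(scope, table)])
    return minis

# bucket B of the slide's example: scopes {B,C,E}, {A,B}, {A,B,D}
bucket = [(('B', 'C', 'E'), None), (('A', 'B'), None), (('A', 'B', 'D'), None)]
parts = mini_buckets(bucket, 3)
```

With i = 3 this splits the bucket into two mini-buckets, { {B,C,E} } and { {A,B}, {A,B,D} }, matching the split shown on the Mini-Bucket Elimination slide.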
Outline
- Introduction
- Optimization tasks for graphical models; solving by inference and search
- Inference: bucket elimination, dynamic programming; mini-bucket elimination
- Search: branch and bound and best-first, lower-bounding heuristics, AND/OR search spaces
- Hybrids of search and inference: cutset decomposition, super-bucket scheme
The Search Space
Objective function: min_X Σ_{i=1}^{9} f_i
Variables A, B, C, D, E, F with domains {0,1}; OR search tree along the ordering A, B, C, D, E, F.
[Full OR search tree of depth 6]
Cost tables:
A B f1: (0,0)=2 (0,1)=0 (1,0)=1 (1,1)=4
A C f2: (0,0)=3 (0,1)=0 (1,0)=0 (1,1)=1
A E f3: (0,0)=0 (0,1)=3 (1,0)=2 (1,1)=0
A F f4: (0,0)=2 (0,1)=0 (1,0)=0 (1,1)=2
B C f5: (0,0)=0 (0,1)=1 (1,0)=2 (1,1)=4
B D f6: (0,0)=4 (0,1)=2 (1,0)=1 (1,1)=0
B E f7: (0,0)=3 (0,1)=2 (1,0)=1 (1,1)=0
C D f8: (0,0)=1 (0,1)=4 (1,0)=0 (1,1)=0
E F f9: (0,0)=1 (0,1)=0 (1,0)=0 (1,1)=2
The Search Space (arc costs)
Arc cost is calculated from the cost components that become fully instantiated at that arc.
[The same OR search tree with arc costs derived from f1–f9]
The Value Function
Value of a node = minimal cost solution below it, computed bottom-up from the arc costs.
[The OR search tree annotated with node values; the value at the root is 5]
An Optimal Solution
Value of a node = minimal cost solution below it; an optimal solution follows the best-valued child from the root down.
[The annotated OR search tree with the optimal path highlighted]
Basic Heuristic Search Schemes
A heuristic function f(x^p) computes a bound on the best extension of a partial assignment x^p and can be used to guide a heuristic search algorithm. We focus on:
1. Branch and Bound: uses f(x^p) to prune the depth-first search tree; linear space.
2. Best-First Search: always expands the node with the best heuristic value f(x^p); needs lots of memory.
L
Classic Branch-and-Bound
nn
g(n)g(n)
h(n)h(n)
LB(n) = g(n) + h(n)LB(n) = g(n) + h(n)
Lower Bound Lower Bound LBLB
OR Search Tree
Prune if LB(n) ≥ UBPrune if LB(n) ≥ UB
Upper Bound Upper Bound UBUB
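The pruning rule LB(n) = g(n) + h(n) ≥ UB can be sketched for the min-sum task on the running f1–f9 example. This is a minimal depth-first branch and bound, assuming nonnegative cost tables (so the cost g of the already-instantiated functions is itself a valid lower bound) and a trivial default heuristic h = 0.

```python
def branch_and_bound(order, domains, factors, h=None):
    """Depth-first branch and bound for min-sum over a graphical model
    (a sketch; an optional heuristic h(assign) may tighten the bound)."""
    h = h or (lambda assign: 0.0)
    best = [float('inf')]                 # current upper bound UB

    def g(assign):                        # cost of fully instantiated functions
        return sum(t[tuple(assign[v] for v in scope)]
                   for scope, t in factors
                   if all(v in assign for v in scope))

    def dfs(i, assign):
        if i == len(order):               # full assignment: update UB
            best[0] = min(best[0], g(assign))
            return
        var = order[i]
        for val in domains[var]:
            assign[var] = val
            if g(assign) + h(assign) < best[0]:   # expand only if LB < UB
                dfs(i + 1, assign)
            del assign[var]

    dfs(0, {})
    return best[0]

# the nine pairwise cost tables f1..f9 from the example
domains = {v: [0, 1] for v in 'ABCDEF'}
def tbl(c00, c01, c10, c11):
    return {(0, 0): c00, (0, 1): c01, (1, 0): c10, (1, 1): c11}
factors = [(('A', 'B'), tbl(2, 0, 1, 4)), (('A', 'C'), tbl(3, 0, 0, 1)),
           (('A', 'E'), tbl(0, 3, 2, 0)), (('A', 'F'), tbl(2, 0, 0, 2)),
           (('B', 'C'), tbl(0, 1, 2, 4)), (('B', 'D'), tbl(4, 2, 1, 0)),
           (('B', 'E'), tbl(3, 2, 1, 0)), (('C', 'D'), tbl(1, 4, 0, 0)),
           (('E', 'F'), tbl(1, 0, 0, 2))]
optimum = branch_and_bound(list('ABCDEF'), domains, factors)  # minimal cost 5
```

With h = 0 this prunes only on accumulated cost; plugging in a mini-bucket heuristic for h is what the following slides develop.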
How to Generate Heuristics
The principle of relaxed models:
- Linear optimization for integer programs
- Mini-bucket elimination
- Bounded directional consistency ideas
Generating Heuristics for graphical models (Kask and Dechter, 1999)
Given a cost function
C(a,b,c,d,e) = f(a) · f(b,a) · f(c,a) · f(e,b,c) · f(d,b,a),
define an evaluation function over a partial assignment as the cost of its best extension:
f*(a,e,d) = min_{b,c} C(a,b,c,d,e) = f(a) · min_{b,c} [ f(b,a) · f(c,a) · f(e,b,c) · f(d,b,a) ]
          = g(a,e,d) · H*(a,e,d)
[Partial search tree along the ordering a, e, d]
Generating Heuristics (cont.)
H*(a,e,d) = min_{b,c} f(b,a) · f(c,a) · f(e,b,c) · f(d,a,b)
          = min_c [ f(c,a) · min_b [ f(e,b,c) · f(b,a) · f(d,a,b) ] ]
          ≥ min_c [ f(c,a) · min_b f(e,b,c) ] · min_b [ f(b,a) · f(d,a,b) ]
          = h^B(d,a) · h^C(e,a)
          = H(a,e,d)
f(a,e,d) = g(a,e,d) · H(a,e,d) ≤ f*(a,e,d)
The heuristic function H is what is compiled during the preprocessing stage of the Mini-Bucket algorithm.
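The key inequality behind the derivation — for non-negative functions, minimizing each mini-bucket separately can only decrease (lower-bound) the exact minimum — can be checked on hypothetical one-variable tables:

```python
# hypothetical one-variable tables f and g sharing variable b in one bucket
f = {0: 2.0, 1: 3.0}
g = {0: 4.0, 1: 1.0}

exact = min(f[b] * g[b] for b in (0, 1))     # min_b f(b)g(b): exact message
bound = min(f.values()) * min(g.values())    # mini-bucket: (min f) * (min g)
# for nonnegative functions the split minimizes each factor at its own
# argmin, so bound <= exact -- an admissible bound for minimization
```

Here `exact` is 3 (at b=1) while `bound` is 2, illustrating why mini-bucket heuristics are admissible for minimization.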
Static MBE Heuristics
Given a partial assignment x^p, estimate the cost of the best extension to a full solution. The evaluation function f(x^p) can be computed using the functions recorded by the Mini-Bucket scheme.
Belief network over A, B, C, D, E; buckets (B split into mini-buckets):
B: { P(E|B,C) }  { P(D|A,B), P(B|A) }
C: P(C|A), h^B(E,C)
D: h^B(D,A)
E: h^C(E,A)
A: P(A), h^E(A), h^D(A)
f(a,e,D) = g(a,e) · H(a,e,D) = P(a) · h^B(D,a) · h^C(e,a)
f = g · h is admissible.
Heuristics Properties
The MB heuristic is monotone and admissible, and is retrieved in linear time.
IMPORTANT: heuristic strength can vary with MB(i); a higher i-bound means more pre-processing, a stronger heuristic, and less search.
This allows a controlled trade-off between preprocessing and search.
Experimental Methodology
Algorithms: BBMB(i) - Branch and Bound with MB(i); BFMB(i) - Best-First with MB(i); MBE(i)
Test networks: random coding (Bayesian), CPCS (Bayesian), random (CSP)
Measures of performance: compare accuracy given a fixed amount of time (how close is the cost found to the optimal solution); compare trade-off performance as a function of time.
Empirical evaluation of mini-bucket heuristics (Bayesian networks, coding):
[Plots of % solved exactly (0.0–1.0) vs. time (0–30 sec) for BBMB and BFMB with i = 2, 6, 10, 14. Left: Random Coding, K=100, noise=0.28; right: Random Coding, K=100, noise=0.32]
Max-CSP experiments (Kask and Dechter, 2000)
Dynamic MB Heuristics
Rather than pre-compiling, the mini-bucket heuristics can be generated during search
Dynamic mini-bucket heuristics use the Mini-Bucket algorithm to produce a bound for any node in the search space
(a partial assignment, along the given variable ordering)
Dynamic MB and MBTE Heuristics (Kask, Marinescu and Dechter, 2003)
Rather than precompiling, compute the heuristics during search:
Dynamic MB: use the Mini-Bucket algorithm to produce a bound for any node during search.
Dynamic MBTE: compute heuristics simultaneously for all uninstantiated variables using mini-bucket-tree elimination (MBTE), an approximation scheme defined over cluster-trees that outputs multiple bounds for each variable and value extension at once.
Cluster Tree Elimination - example
Belief network over A, B, C, D, E, F, G; cluster tree with clusters 1:ABC, 2:BCDF, 3:BEF, 4:EFG and separators BC, BF, EF.
Messages:
h_(1,2)(b,c) = Σ_a p(a) p(b|a) p(c|a,b)
h_(2,3)(b,f) = Σ_{c,d} p(d|b) p(f|c,d) h_(1,2)(b,c)
h_(2,1)(b,c) = Σ_{d,f} p(d|b) p(f|c,d) h_(3,2)(b,f)
h_(3,4)(e,f) = Σ_b p(e|b,f) h_(2,3)(b,f)
h_(3,2)(b,f) = Σ_e p(e|b,f) h_(4,3)(e,f)
h_(4,3)(e,f) = p(G|e,f)
Mini-Clustering
Motivation: the time and space complexity of Cluster Tree Elimination depends on the induced width w* of the problem; when w* is big, CTE becomes infeasible.
The basic idea: reduce the size of the cluster (the exponent) by partitioning each cluster into mini-clusters with fewer variables; accuracy parameter i = maximum number of variables in a mini-cluster. The idea was explored for variable elimination (Mini-Bucket).
Split a cluster into mini-clusters ⇒ bound complexity.
Exponential complexity decrease: O(e^n) → O(e^r) + O(e^{n−r})
Idea of Mini-Clustering
A cluster contains functions {h_1,…,h_r, h_{r+1},…,h_n}. Instead of the exact message
h = Σ_elim Π_{i=1}^{n} h_i,
partition the cluster into mini-clusters {h_1,…,h_r} and {h_{r+1},…,h_n} and compute
g = (Σ_elim Π_{i=1}^{r} h_i) · (Σ_elim Π_{i=r+1}^{n} h_i),
so that g ≥ h (an upper bound, for non-negative functions).
Mini-Clustering - example
Cluster 2 (BCDF) is partitioned into mini-clusters, so each message becomes a set of smaller functions:
H_(1,2) = { h^1_(1,2)(b,c) = Σ_a p(a) p(b|a) p(c|a,b) }
H_(2,3) = { h^1_(2,3)(b) = Σ_{c,d} p(d|b) h^1_(1,2)(b,c),  h^2_(2,3)(f) = max_{c,d} p(f|c,d) }
H_(2,1) = { h^1_(2,1)(b) = Σ_{d,f} p(d|b) h^1_(3,2)(b,f),  h^2_(2,1)(c) = max_{d,f} p(f|c,d) }
H_(3,2) = { h^1_(3,2)(b,f) = Σ_e p(e|b,f) h^1_(4,3)(e,f) }
H_(3,4) = { h^1_(3,4)(e,f) = Σ_b p(e|b,f) h^1_(2,3)(b) h^2_(2,3)(f) }
H_(4,3) = { h^1_(4,3)(e,f) = p(G|e,f) }
Mini-Clustering - example
ABC
2
4
),()2,1( cbh1
3 BEF
EFG
),()1,2( cbh
),()3,2( fbh
),()2,3( fbh
),()4,3( feh
),()3,4( fehEF
BF
BC
BCDF
),(1)2,1( cbh
)(
)(2
)1,2(
1)1,2(
ch
bh
)(
)(2
)3,2(
1)3,2(
fh
bh
),(1)2,3( fbh
),(1)4,3( feh
),(1)3,4( feh
)2,1(H
)1,2(H
)3,2(H
)2,3(H
)4,3(H
)3,4(H
ABC
2
4
1
3 BEF
EFG
EF
BF
BC
BCDF
Mini Bucket Tree Elimination
Mini-Clustering
Correctness and completeness: Algorithm MC(i) computes a bound (or an approximation) for each variable and each of its values.
MBTE: when the clusters are buckets in BTE.
Branch and Bound w/ Mini-Buckets
BB with static Mini-Bucket Heuristics (s-BBMB) Heuristic information is pre-compiled before search.
Static variable ordering, prunes current variable
BB with dynamic Mini-Bucket Heuristics (d-BBMB)
Heuristic information is assembled during search. Static variable ordering, prunes current variable
BB with dynamic Mini-Bucket-Tree Heuristics (BBBT)
Heuristic information is assembled during search. Dynamic variable ordering, prunes all future variables
Empirical Evaluation
Algorithms: complete - BBBT, BBMB; incomplete - DLM, GLS, SLS, IJGP, IBP (coding)
Measures: time, accuracy (% exact), #backtracks, bit error rate (coding)
Benchmarks: coding networks, Bayesian Network Repository, grid networks (N-by-N), random noisy-OR networks, random networks, real-world benchmarks
Average accuracy and time: 30 samples, 10 observations, 30 seconds
Empirical Results: Max-CSP
Random binary problems ⟨N, K, C, T⟩: N = number of variables, K = domain size, C = number of constraints, T = tightness. Task: Max-CSP.
[Plots: BBBT(i) vs. BBMB(i), N=100, for i = 2,…,7]
Searching the Graph; caching goods
Contexts: context(A) = [A], context(B) = [AB], context(C) = [ABC], context(D) = [ABD], context(E) = [AE], context(F) = [F]
[OR search graph for the f1–f9 example: nodes with identical contexts are merged; arc costs as before]
Searching the Graph; caching goods (cont.)
[The same OR search graph annotated with node values computed bottom-up]
Outline
- Introduction
- Optimization tasks for graphical models; solving by inference and search
- Inference: bucket elimination, dynamic programming; mini-bucket elimination, belief propagation
- Search: branch and bound and best-first, lower-bounding heuristics, AND/OR search spaces
- Hybrids of search and inference: cutset decomposition, super-bucket scheme
Classic OR Search Space
Ordering: A B E C D F
[Full OR search tree over the primal graph of A, B, C, D, E, F]
AND/OR Search Space
Primal graph and a DFS (pseudo) tree: A → B → { C → D, E → F }
[AND/OR search tree: OR nodes for variables alternate with AND nodes for their value assignments; levels A, B, {E, C}, {D, F}]
AND/OR vs. OR
[Side by side: the AND/OR search tree guided by the pseudo-tree vs. the OR search tree along A, B, E, C, D, F]
AND/OR size: exp(4); OR size: exp(6)
OR space vs. AND/OR space
Random graphs with 20 nodes, 20 edges and 2 values per node:
width height | OR: time (sec.), nodes, backtracks | AND/OR: time (sec.), AND nodes, OR nodes
5 10 | 3.154, 2,097,150, 1,048,575 | 0.03, 10,494, 5,247
4 9  | 3.135, 2,097,150, 1,048,575 | 0.01, 5,102, 2,551
5 10 | 3.124, 2,097,150, 1,048,575 | 0.03, 8,926, 4,463
4 10 | 3.125, 2,097,150, 1,048,575 | 0.02, 7,806, 3,903
5 13 | 3.104, 2,097,150, 1,048,575 | 0.1, 36,510, 18,255
AND/OR vs. OR (with constraint propagation)
[The same comparison with inconsistent subtrees pruned; the no-goods (A=1,B=1) and (B=0,C=0) cut both spaces]
AND/OR vs. OR (contd.)
No-goods: (A=1,B=1), (B=0,C=0)
AND/OR search tree: linear space; time O(exp(m)), also O(exp(w* log n))
OR search tree: linear space; time O(exp(n))
#CSP – AND/OR Search Tree
Pseudo-tree A → B → { E → F, C → D }; relations (1 = allowed):
A B C R_ABC: 000→1 001→1 010→0 011→1 100→1 101→1 110→1 111→0
A B E R_ABE: 000→1 001→0 010→1 011→1 100→0 101→1 110→1 111→0
A E F R_AEF: 000→0 001→1 010→1 011→1 100→1 101→1 110→1 111→0
B C D R_BCD: 000→1 001→1 010→1 011→0 100→1 101→0 110→1 111→1
[AND/OR search tree over this pseudo-tree]
#CSP – AND/OR Tree DFS
[The AND/OR tree from the previous slide, with solution counts computed bottom-up: each OR node sums over its value branches, each AND node multiplies its independent child counts; the count at the root is 14]
OR node: marginalization operator (summation)
AND node: combination operator (product)
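The sum/product counting scheme can be sketched directly: OR nodes sum over the variable's values, AND nodes multiply the counts of the independent pseudo-tree children. The sketch below re-creates the slide's four relations (the dict representation is an assumption) and reproduces the root count of 14.

```python
def andor_count(var, assign, children, domains, relations):
    """Count CSP solutions by AND/OR tree search over a pseudo-tree:
    OR nodes sum over values, AND nodes multiply independent child counts."""
    total = 0                                        # OR node: summation
    for val in domains[var]:
        assign[var] = val
        # the assignment must satisfy every relation it fully instantiates
        if all(rel[tuple(assign[v] for v in scope)]
               for scope, rel in relations
               if all(v in assign for v in scope)):
            prod = 1                                 # AND node: product
            for child in children.get(var, []):
                prod *= andor_count(child, assign, children,
                                    domains, relations)
            total += prod
        del assign[var]
    return total

# the four relations R_ABC, R_ABE, R_AEF, R_BCD from the example
def rel(*allowed):
    return {(a, b, c): int((a, b, c) in allowed)
            for a in (0, 1) for b in (0, 1) for c in (0, 1)}
relations = [
    (('A', 'B', 'C'), rel((0,0,0), (0,0,1), (0,1,1), (1,0,0), (1,0,1), (1,1,0))),
    (('A', 'B', 'E'), rel((0,0,0), (0,1,0), (0,1,1), (1,0,1), (1,1,0))),
    (('A', 'E', 'F'), rel((0,0,1), (0,1,0), (0,1,1), (1,0,0), (1,0,1), (1,1,0))),
    (('B', 'C', 'D'), rel((0,0,0), (0,0,1), (0,1,0), (1,0,0), (1,1,0), (1,1,1))),
]
children = {'A': ['B'], 'B': ['E', 'C'], 'E': ['F'], 'C': ['D']}
count = andor_count('A', {}, children, {v: [0, 1] for v in 'ABCDEF'}, relations)
# count == 14, matching the total at the root of the slide's trace
```

The E and C subtrees are counted independently under each (A,B) assignment and multiplied, which is exactly what makes the AND/OR tree exponential in the pseudo-tree depth rather than in n.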
AND/OR Tree Search for COP
Goal: min_X Σ_{i=1}^{9} f_i (the f1–f9 example)
[AND/OR search tree with node values computed bottom-up from the arc costs]
OR node = marginalization operator (minimization)
AND node = combination operator (summation)
Summary of AND/OR Search Trees
Based on a backbone pseudo-tree; a solution is a subtree.
Each node has a value - the cost of the optimal solution to its subproblem, computed recursively from the values of its descendants. Solving a task = finding the value of the root node.
AND/OR search trees and algorithms go back to ([Freuder & Quinn 85], [Collin, Dechter & Katz 91], [Bayardo & Miranker 95]).
Space: O(n). Time: O(exp(m)), where m is the depth of the pseudo-tree; also O(exp(w* log n)). BFS is time and space O(exp(w* log n)).
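The recursive value computation can be sketched for the min-sum task on the running f1–f9 example, using the pseudo-tree A → B → {C → D, E → F}. Arc costs are taken as the functions completed by each assignment (a sketch of the idea, not the slides' exact bookkeeping).

```python
def andor_value(var, assign, children, domains, factors):
    """Min-sum value of an OR node: minimize over the variable's values;
    each AND node adds the cost of the functions its assignment completes
    plus the values of the independent pseudo-tree children."""
    best = float('inf')
    for val in domains[var]:
        assign[var] = val
        arc = sum(t[tuple(assign[u] for u in scope)]  # functions completed now
                  for scope, t in factors
                  if var in scope and all(u in assign for u in scope))
        total = arc + sum(andor_value(c, assign, children, domains, factors)
                          for c in children.get(var, []))
        best = min(best, total)
        del assign[var]
    return best

# the nine pairwise cost tables f1..f9 from the example
domains = {v: [0, 1] for v in 'ABCDEF'}
def tbl(c00, c01, c10, c11):
    return {(0, 0): c00, (0, 1): c01, (1, 0): c10, (1, 1): c11}
factors = [(('A', 'B'), tbl(2, 0, 1, 4)), (('A', 'C'), tbl(3, 0, 0, 1)),
           (('A', 'E'), tbl(0, 3, 2, 0)), (('A', 'F'), tbl(2, 0, 0, 2)),
           (('B', 'C'), tbl(0, 1, 2, 4)), (('B', 'D'), tbl(4, 2, 1, 0)),
           (('B', 'E'), tbl(3, 2, 1, 0)), (('C', 'D'), tbl(1, 4, 0, 0)),
           (('E', 'F'), tbl(1, 0, 0, 2))]
children = {'A': ['B'], 'B': ['C', 'E'], 'C': ['D'], 'E': ['F']}
root_value = andor_value('A', {}, children, domains, factors)  # 5
```

The pseudo-tree guarantees every function's scope lies on one root-to-leaf path, so each function is charged exactly once, at the arc of its deepest variable.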
An AND/OR Graph: Caching Goods
[Primal graph and pseudo-tree extended with variables G, H, J, K; nodes of the AND/OR search tree that root identical subproblems are merged into a graph]
Context-based Caching
Caching is possible when the context is the same:
context = parent-separator set in the induced pseudo-graph
        = current variable + parents connected to the subtree below
Examples: context(B) = {A,B}, context(C) = {A,B,C}, context(D) = {D}, context(F) = {F}
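The parent-separator rule can be sketched as a small graph computation. Published definitions vary in whether the variable itself is included, so the exact output format here is an assumption; the pseudo-tree and primal-graph encoding are likewise illustrative.

```python
def compute_contexts(parent, edges):
    """Context of v = {v} plus those ancestors of v (in the pseudo-tree)
    that are adjacent, in the primal graph, to some node of v's subtree
    -- a sketch of the parent-separator rule."""
    children = {}
    for v, p in parent.items():
        children.setdefault(p, []).append(v)

    def subtree(v):                         # all pseudo-tree descendants of v
        nodes = [v]
        for c in children.get(v, []):
            nodes += subtree(c)
        return nodes

    contexts = {}
    for v in parent:
        ancestors, p = [], parent[v]
        while p is not None:                # walk up to the root
            ancestors.append(p)
            p = parent[p]
        sub = set(subtree(v))
        contexts[v] = {v} | {a for a in ancestors
                             if any(frozenset((a, s)) in edges for s in sub)}
    return contexts

# a small hypothetical chain: pseudo-tree A -> B -> C, primal edges AB, BC, AC
ctx = compute_contexts({'A': None, 'B': 'A', 'C': 'B'},
                       {frozenset('AB'), frozenset('BC'), frozenset('AC')})
```

The cache table for a variable is then indexed by the assignments of its context, which is what bounds the AND/OR graph by exp(w*).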
Complexity of AND/OR Graph
Theorem: traversing the AND/OR search graph is time and space exponential in the induced width (treewidth).
If applied to the OR graph, complexity is time and space exponential in the path-width.
#CSP – AND/OR Search Graph (Caching Goods)
[The AND/OR search tree for the #CSP example, with nodes that root identical subproblems merged via context-based caching]
Time and space: O(exp(w*))
All Four Search Spaces
Full OR search tree: 126 nodes
Full AND/OR search tree: 54 AND nodes
Context-minimal OR search graph: 28 nodes
Context-minimal AND/OR search graph: 18 AND nodes
[The four spaces drawn for the f1–f9 example]
AND/OR vs. OR DFS algorithms
(k = domain size, m = pseudo-tree depth, n = number of variables, w* = induced width, pw* = path width)
AND/OR tree: space O(n), time O(n k^m), also O(n k^{w* log n}) (Freuder 85; Bayardo 95; Darwiche 01)
AND/OR graph: space O(n k^{w*}), time O(n k^{w*})
OR tree: space O(n), time O(k^n)
OR graph: space O(n k^{pw*}), time O(n k^{pw*})
Searching AND/OR Graphs
AO(i): searches depth-first, caching i-contexts; i = the max size of a cache table (i.e., the number of variables in a context).
i = 0:  space O(n), time O(exp(w* log n))
i = w*: space O(exp w*), time O(exp w*)
AO(i) time complexity?
End of class
We did not cover the rest of the slides
AND/OR Branch-and-Bound (AOBB)
Associate each node n with a static heuristic estimate h(n) of v(n); h(n) is a lower bound on the value v(n).
For every node n in the search tree: ub(n) = current best solution cost rooted at n; lb(n) = lower bound on the minimal cost at n.
Lower/Upper Bounds
[Example: OR node X with children (X,0) and (X,1); below (X,1) an explored child A with value v(A), an unexplored child C with estimate h(C), and an OR node B whose child (B,0) has estimate h(B,0)]
UB(X) = best cost found so far below X (i.e., v(X,0))
LB(B,0) = h(B,0); LB(B) = LB(B,0)
LB(X,1) = l(X,1) + v(A) + h(C) + LB(B); LB(X) = LB(X,1)
Prune below AND node (B,0) if LB(X) ≥ UB(X)
Shallow/Deep Cutoffs
Shallow cutoff: prune right below the root's other branch using LB(X) = h(X,1).
Deep cutoff: LB(X) is tightened deeper in the tree by combining values of explored nodes with heuristic estimates.
Prune if LB(X) ≥ UB(X). Reminiscent of minimax shallow/deep cutoffs.
Summary of AOBB
Traverses the AND/OR search tree in a depth-first manner
Lower bounds computed based on heuristic estimates of nodes at the frontier of search, as well as the values of nodes already explored
Prunes the search space as soon as an upper-lower bound violation occurs
Heuristics for AND/OR
In the AND/OR search space h(n) can be computed using any heuristic. We used:
Static Mini-Bucket heuristics
Dynamic Mini-Bucket heuristics
Maintaining FDAC [Larrosa & Schiex03]
(full directional soft arc-consistency)
Empirical Evaluation
Tasks: solving WCSPs; finding the MPE in belief networks
Benchmarks (WCSP): random binary WCSPs, RLFAP networks (CELAR6), Bayesian Networks Repository
Algorithms: s-AOMB(i), d-AOMB(i), AOMFDAC; s-BBMB(i), d-BBMB(i), BBMFDAC; static variable ordering (DFS traversal of the pseudo-tree)
Random Binary WCSPs (Marinescu and Dechter, 2005)
Random networks with n=20 (number of variables), d=5 (domain size), c=100 (number of constraints), t=70% (tightness); time limit 180 seconds. AO search is superior to OR search.
[Plots: s-AOMB vs. s-BBMB and d-AOMB vs. d-BBMB]
Random Binary WCSPs (contd.)
Dense: n=20 (variables), d=5 (domain size), c=100 (constraints), t=70% (tightness)
Sparse: n=50 (variables), d=5 (domain size), c=80 (constraints), t=70% (tightness)
AOMB for large i is competitive with AOMFDAC
Resource Allocation
Radio Link Frequency Assignment Problem (RLFAP), CELAR6 sub-instances:
Instance | BBMFDAC time (sec) / nodes | AOMFDAC time (sec) / nodes
CELAR6-SUB0 | 2.78 / 1,871 | 1.98 / 435
CELAR6-SUB1 | 2,420.93 / 364,986 | 981.98 / 180,784
CELAR6-SUB2 | 8,801.12 / 19,544,182 | 1,138.87 / 175,377
CELAR6-SUB3 | 38,889.20 / 91,168,896 | 4,028.59 / 846,986
CELAR6-SUB4 | 84,478.40 / 6,955,039 | 47,115.40 / 4,643,229
AOMFDAC is superior to ORMFDAC
Bayesian Networks Repository
(available at http://www.cs.huji.ac.il/labs/compbio/Repository); time limit 600 seconds. Static AO is better with an accurate heuristic (large i). Entries are time / nodes ("-" = timeout); networks listed as name (n, d, w*, h).
Barley (48,67,7,17): s-AOMB(i): - / 8.5M, - / 7.6M, 46.22 / 807K, 0.563 / 9.6K; s-BBMB(i): - / 16M, - / 18M, - / 17M, - / 14M; d-AOMB(i): - / 79K, 136.0 / 23K, 12.55 / 667, 45.95 / 567; d-BBMB(i): - / 2.2M, - / 1M, 346.1 / 76K, - / 86K (for i = 2, 3, 4, 5)
Munin1 (189,21,11,24): s-AOMB(i): 57.36 / 1.2M, 12.08 / 260K, 7.203 / 172K, 1.657 / 43K; s-BBMB(i): - / 8.5M, - / 9M, - / 10M, - / 8M; d-AOMB(i): 66.56 / 185K, 12.47 / 8.1K, 10.30 / 1.6K, 11.99 / 523; d-BBMB(i): - / 405K, - / 430K, - / 235K, 14.63 / 917
Munin3 (1044,21,7,25): s-AOMB(i): - / 5.9M, - / 4.9M, 1.313 / 17K, 0.453 / 6K; s-BBMB(i): - / 1.4M, - / 1.2M, - / 316K, - / 1.5M; d-AOMB(i): - / 2.3M, 68.64 / 58K, 3.594 / 5.9K, 2.844 / 3.8K; d-BBMB(i): - / 33K, - / 125K, - / 52K, - / 31K
Outline Introduction
Optimization tasks for graphical models Solving by inference and search
Inference Bucket elimination, dynamic programming Mini-bucket elimination, belief propagation
Search Branch and bound and best-first Lower-bounding heuristics AND/OR search spaces
Searching trees Searching graphs
Hybrids of search and inference Cutset decomposition Super-bucket scheme
From Searching Trees to Searching Graphs
Any two nodes that root identical subtrees/subgraphs can be merged
Minimal AND/OR search graph: closure under merge of the AND/OR search tree
Inconsistent sub-trees can be pruned too. Some portions can be collapsed or reduced.
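The merge rule can be made concrete through contexts: two nodes of the same variable whose partial assignments agree on that variable's context root identical subproblems, so a single key serves both. A minimal illustrative sketch (names are ours, not the slides'; we use only the ancestor part of the context as the key):

```python
# Context-based merge key for AND/OR search nodes. `context` maps each
# variable to the ancestor variables that determine its subproblem.

def merge_key(var, assignment, context):
    """Nodes of `var` with equal keys root identical subgraphs."""
    return (var, tuple(assignment[a] for a in context[var]))

# As on the slide, the subproblem below C depends only on B:
context = {'C': ['B']}
k1 = merge_key('C', {'A': 0, 'B': 1}, context)
k2 = merge_key('C', {'A': 1, 'B': 1}, context)
assert k1 == k2   # A differs, but B agrees: the two C-subgraphs merge
```

Merging all such nodes yields the context-minimal AND/OR search graph referred to above.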
AND/OR Search Graph
[Figure: primal graph and pseudo-tree over variables A-F, with the AND/OR search tree unfolded below; nodes rooting identical subtrees are candidates for merging.]
context(A) = {A}
context(B) = {B,A}
context(C) = {C,B}
context(D) = {D}
context(E) = {E,A}
context(F) = {F}
AND/OR Search Graph
[Figure: the merged AND/OR search graph for the same problem - duplicate subgraphs under identical contexts are collapsed into single nodes.]
context(A) = {A}
context(B) = {B,A}
context(C) = {C,B}
context(D) = {D}
context(E) = {E,A}
context(F) = {F}
Context-based Caching
context(A) = {A}
context(B) = {B,A}
context(C) = {C,B}
context(D) = {D}
context(E) = {E,A}
context(F) = {F}
Cache Table (C)
B  C  Value
0  0  5
0  1  2
1  0  2
1  1  0
[Figure: AND/OR search graph annotated with node values; repeated C-subproblems are looked up in the cache table by their (B,C) context assignment instead of being re-solved.]
Space: O(exp(2))
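Context-based caching as a whole can be sketched as a memoized depth-first evaluation, where the cache plays the role of the table above, keyed by the context assignment. This is an illustrative minimization over a chain of variables with our own names, not the lecture's algorithm:

```python
# Depth-first minimization with context-based caching: the exact value
# of each subproblem is stored under (variable, context assignment),
# so identical subproblems are solved only once.

def solve(idx, assignment, order, contexts, cost, domains, cache):
    """Minimize the remaining cost over variables order[idx:]."""
    if idx == len(order):
        return 0
    var = order[idx]
    key = (var, tuple(assignment[a] for a in contexts[var]))
    if key in cache:                      # context seen before: reuse
        return cache[key]
    best = min(
        cost(var, val, assignment)
        + solve(idx + 1, {**assignment, var: val},
                order, contexts, cost, domains, cache)
        for val in domains[var]
    )
    cache[key] = best
    return best

# Tiny example: cost f(A) = A plus g(A,B) = |A - B|; B's context is {A}.
order, contexts = ['A', 'B'], {'A': [], 'B': ['A']}
domains = {'A': (0, 1), 'B': (0, 1)}
cost = lambda v, val, asg: val if v == 'A' else abs(asg['A'] - val)
assert solve(0, {}, order, contexts, cost, domains, {}) == 0
```

With full contexts the cache holds at most O(exp(w*)) entries, which is where the space bound on the slide comes from.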
Searching AND/OR Graphs
AO(j): searches depth-first, caching contexts of size at most j (j = the max number of variables in a cache table key)
j = 0:   Space O(n),          Time O(exp(w* log n))
j = w*:  Space O(exp(w*)),    Time O(exp(w*))
AO(j) time complexity?
Graph AND/OR Branch-and-Bound - AOBB(j)
Like BB on the tree, except that during search nodes are merged based on context (caching); maintains cache tables of size O(exp(j)), where j is a bound on the size of the context.
Heuristics Static mini-bucket Dynamic mini-bucket Soft directional arc-consistency
[Marinescu & Dechter, CP2005]
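The j-bounded caching rule of AOBB(j) amounts to caching only those variables whose context fits in a table of size O(exp(j)). A minimal sketch (names are ours):

```python
# AOBB(j) caching rule sketch: a variable's subproblem values are
# cached only if its context has at most j variables, so each table
# holds at most d**j entries.

def cacheable(var, context, j):
    """True if `var`'s cache table fits within the O(exp(j)) bound."""
    return len(context[var]) <= j

context = {'B': ['A'], 'F': ['A', 'B', 'C']}
assert cacheable('B', context, 2)        # |{A}| = 1 <= 2: cache it
assert not cacheable('F', context, 2)    # |{A,B,C}| = 3 > 2: recompute
```

Setting j = 0 recovers tree search, j = w* recovers full graph search, matching the two complexity extremes on the preceding slide.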
Pseudo-trees (I)
AND/OR graph/tree search algorithms are influenced by the quality of the pseudo-tree
Finding the minimal context/depth pseudo-tree is a hard problem
Heuristics Min-fill (min context) Hypergraph separation (min depth)
Pseudo-trees (II)
MIN-FILL [Kjærulff, 1990], [Bayardo & Miranker, 1995]
Depth-first traversal of the induced graph constructed along some elimination order; the elimination order prefers variables with the smallest "fill set".
HYPERGRAPH [Darwiche, 2001]
Constraints are vertices in the hypergraph and variables are hyperedges. Recursive decomposition of the hypergraph while minimizing the separator size (i.e., the number of variables) at each step, using the state-of-the-art software package hMeTiS.
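The min-fill heuristic described above can be sketched as follows (an illustrative implementation with our own names, not the course code):

```python
# Min-fill elimination ordering: repeatedly eliminate the variable
# whose neighbours need the fewest "fill" edges to become a clique,
# adding those fill edges before removing the variable.

def min_fill_order(adj):
    adj = {v: set(ns) for v, ns in adj.items()}   # work on a copy
    order = []
    while adj:
        def fill(v):                              # fill edges needed for v
            ns = sorted(adj[v])
            return sum(1 for i in range(len(ns))
                       for k in range(i + 1, len(ns))
                       if ns[k] not in adj[ns[i]])
        v = min(adj, key=fill)
        ns = adj.pop(v)
        for a in ns:                              # connect v's neighbours
            adj[a].discard(v)
            adj[a] |= ns - {a}
        order.append(v)
    return order

# Chain A - B - C: eliminating B first would need a fill edge (A,C),
# so the heuristic eliminates an endpoint first.
order = min_fill_order({'A': {'B'}, 'B': {'A', 'C'}, 'C': {'B'}})
assert order[0] != 'B'
```

A depth-first traversal of the induced graph along the resulting order then yields the pseudo-tree, as stated above.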
Quality of pseudo-trees
Bayesian Networks Repository:
Network    hypergraph        min-fill
           width   depth     width   depth
barley        7      13         7      23
diabetes      7      16         4      77
link         21      40        15      53
mildew        5       9         4      13
munin1       12      17        12      29
munin2        9      16         9      32
munin3        9      15         9      30
munin4        9      18         9      30
water        11      16        10      15
pigs         11      20        11      26

SPOT5 benchmarks:
Network    hypergraph        min-fill
           width   depth     width   depth
spot5        47     152        39     204
spot28      108     138        79     199
spot29       16      23        14      42
spot42       36      48        33      87
spot54       12      16        11      33
spot404      19      26        19      42
spot408      47      52        35      97
spot503      11      20         9      39
spot505      29      42        23      74
spot507      70     122        59     160
Empirical Evaluation Tasks
Solving binary WCSPs Finding the MPE in belief networks
Benchmarks Random networks Resource allocation (SPOT5) Genetic linkage analysis
Algorithms s-AOMB(i,j), d-AOMB(i,j), AOMFDAC(j), VE+C j is the cache bound Static variable ordering
Random Networks
Random networks with n=100 (number of variables), d=3 (domain size), c=90 (number of CPTs), p=2 (number of parents per CPT). Time limit 180 seconds.
Static heuristics Dynamic Heuristics
Caching helps for static heuristics with small i-bounds
Resource Allocation – SPOT5
                          hypergraph                                      min-fill
Network      Algorithm    (w*,h)   no cache             cache             (w*,h)   no cache               cache
                                   time     nodes       time    nodes             time     nodes          time    nodes
29b          AOMFDAC      (16,22)  5.938    170,823     1.492   40,428    (14,42)  5.036    79,866         3.237   34,123
(83,394)     sAOMB(12)             1.002    8,458       1.012   1,033              0.381    997            0.411   940
42b          AOMFDAC      (31,43)  1,043    6,071,390   884.1   3,942,942 (18,62)  -        22,102,050     -       17,911,719
(191,1151)   sAOMB(16)             132.0    2,871,804   127.4   2,815,503          3.254    11,636         3.164   9,030
54b          AOMFDAC      (12,14)  0.401    6,581       0.29    3,377     (9,19)   1.793    28,492         0.121   2,087
(68,197)     sAOMB(10)             0.03     74          0.03    74                 0.02     567            0.02    381
404b         AOMFDAC      (19,23)  0.02     148         0.01    138       (19,57)  2.043    21,406         0.08    1,222
(101,595)    sAOMB(12)             0.01     101         0.01    101                0.02     357            0.01    208
503b         AOMFDAC      (9,14)   0.02     405         0.01    307       (8,46)   1077.1   19,041,552     0.05    701
(144,414)    sAOMB(8)              0.01     150         0.01    150                0.03     1,918          0.01    172
505b         AOMFDAC      (19,32)  17.8     368,247     5.20    69,045    (16,98)  -        9,872,078      15.43   135,641
(241,1481)   sAOMB(14)             5.618    683         6.208   683                4.997    1,912          5.096   831
Time limit 1800 seconds
Cache means j = i; we picked the best-performing i.
The MB(i) heuristic is strong enough to prune, so caching is less relevant.
[Figure: genetic linkage analysis pedigree network - locus variables (Lijm, Lijf), phenotype variables (Xij), and selector variables (Sijm) for 6 people and 3 markers.]
Empirical evaluation
pedigree      VE+C     SUPERLINK 1.5    s-AOMB(i,c)
(n,d)         time     time             (i,c)      time     nodes
23 (309,5)    1,144    6,809            (14,18)    4.29     122,811
30 (1016,5)   26,719   28,740           (18,22)    27.71    535,763

*averages over 5 runs
Pedigree 23
pedigree 23 (n=309, d=5); hypergraph pseudo-tree (w*,h) = (24,38)

s-AOMB(i,c) time (sec):
c \ i     12       14       16       18
0         24.20     9.49    10.54     8.99
8         18.61     7.00     8.78     8.69
10        17.59     6.74     8.98     8.36
12        14.02     5.26     7.16     8.25
14        13.87     5.24     7.37     8.25
16        13.11     4.94     6.99     8.20
18        10.11     4.29     5.93     8.13

s-AOMB(i,c) nodes:
c \ i     12           14         16         18
0         1,216,320    432,762    328,129    50,337
8           554,818    192,971    159,472    31,331
10          529,349    183,120    155,303    31,010
12          437,812    146,435    116,996    27,676
14          432,895    145,169    116,718    27,551
16          405,748    134,472    115,132    27,087
18          350,048    122,811     96,893    25,176

VE+C time: 1,144      SUPERLINK 1.5 time: 6,809
*averages over 5 runs
Pedigree 30
pedigree 30 (n=1016, d=5); hypergraph pseudo-tree (w*,h) = (24,38)

s-AOMB(i,c) time (sec):
c \ i     16        18       20       22
0         362.39    72.32    43.58    68.01
8         289.88    49.63    36.50    66.95
16        164.79    30.13    31.70    66.02
18        156.46    27.96    30.77    66.36
20        156.23    27.83    30.72    65.64
22        154.62    27.71    30.81    66.46

s-AOMB(i,c) nodes:
c \ i     16            18           20         22
0         11,476,500    1,812,940    433,073    70,911
8          8,061,550      939,683    202,137    24,991
16         5,194,540      611,623    119,994    11,446
18         4,705,450      541,026    101,193    10,250
20         4,691,760      538,943    100,623    10,127
22         4,681,200      535,763    100,396    10,095

VE+C time: 26,719     SUPERLINK 1.5 time: 28,740
*averages over 5 runs
Outline Introduction
Optimization tasks for graphical models Solving by inference and search
Inference Bucket elimination, dynamic programming Mini-bucket elimination, belief propagation
Search Branch and bound and best-first Lower-bounding heuristics AND/OR search spaces
Hybrids of search and inference Cutset decomposition Super-bucket scheme