Caching in Backtracking Search
Fahiem Bacchus, University of Toronto

Page 1: Caching in Backtracking Search

Fahiem Bacchus, University of Toronto

Page 2: Introduction

Backtracking search needs only space linear in the number of variables (modulo the size of the problem representation). However, its efficiency can greatly benefit from using more space to cache information computed during search.

Caching can provably yield exponential improvements in the efficiency of backtracking search.

Caching is an any-space feature: we can use as much or as little space for caching as we want without affecting soundness or completeness.

Unfortunately, caching can also be time consuming. How do we exploit the theoretical potential of caching in practice?

Page 3: Introduction

We will examine this question for:
- The problem of finding a single solution.
- Problems that require considering all solutions: counting the number of solutions / computing probabilities, and finding optimal solutions.

We will look at:
- The theoretical advantages offered by caching.
- Some of the practical issues involved in realizing these theoretical advantages.
- Some of the practical benefits obtained so far.

Page 4: Outline

1. Caching when searching for a single solution.
   - Clause learning in SAT: theoretical results, practical application and impact.
   - Clause learning in CSPs.
2. Caching when considering all solutions.
   - Formula caching for sum-of-products problems: theoretical results, practical application.

Page 5: 1. Caching when searching for a single solution

Page 6: 1.1 Clause Learning in SAT

Page 7: Clause Learning in SAT (DPLL)

Clause learning is the most successful form of caching when searching for a single solution [Marques-Silva and Sakallah, 1996; Zhang et al., 2001]. It has revolutionized DPLL (i.e., backtracking) SAT solvers.

Page 8: Clause Learning in SAT

Clauses: (¬X, A), (¬X, B), (¬X, C), (¬A, ¬B, ¬C, D)

1. Branch on a variable: X ← Assumption.
2. Perform propagation (unit propagation):
   (¬X, A) → A
   (¬X, B) → B
   (¬X, C) → C
   (¬A, ¬B, ¬C, D) → D

Page 9: Clause Learning in SAT

X ← Assumption
(¬X, A) → A
(¬X, B) → B
(¬X, C) → C
(¬A, ¬B, ¬C, D) → D

Every inferred literal is labeled with a clausal reason. The clausal reason for a literal is a subset of the previous literals on the path whose settings imply the literal, e.g., (¬A, ¬B, ¬C, D): A ∧ B ∧ C ⇒ D.
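To make this bookkeeping concrete, here is a minimal Python sketch (an illustration added here, not the talk's code; the integer-literal representation, where a clause is a tuple of ints and -v negates v, is an assumption) of unit propagation that labels every forced literal with its clausal reason:

    def unit_propagate(clauses, trail, reason):
        """Extend trail (a list of true literals) by unit propagation.
        Returns a falsified clause (a conflict) or None."""
        assigned = set(trail)
        changed = True
        while changed:
            changed = False
            for clause in clauses:
                if any(lit in assigned for lit in clause):
                    continue                      # clause already satisfied
                open_lits = [l for l in clause if -l not in assigned]
                if not open_lits:
                    return clause                 # all literals false: conflict clause
                if len(open_lits) == 1:           # unit clause: its literal is forced
                    forced = open_lits[0]
                    assigned.add(forced)
                    trail.append(forced)
                    reason[forced] = clause       # label the literal with its reason
                    changed = True
        return None

    # The slide's example with X,A,B,C,D numbered 1..5: the decision X forces A,B,C,D.
    clauses = [(-1, 2), (-1, 3), (-1, 4), (-2, -3, -4, 5)]
    trail, reason = [1], {}
    print(unit_propagate(clauses, trail, reason), trail)   # None [1, 2, 3, 4, 5]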

Page 10: Clause Learning in SAT

X ← Assumption
(¬X, A) → A, (¬X, B) → B, (¬X, C) → C, (¬A, ¬B, ¬C, D) → D
Y ← Assumption
(¬Y, P) → P
(¬Y, Q) → Q
(¬Q, ¬P, ¬D) → ¬D

Contradiction:
1. D is forced to be both True and False.
2. The clause (¬Q, ¬P, ¬D) has been falsified. Falsified clauses are called conflict clauses.

Page 11: Clause Learning in SAT

(Same trail as the previous slide.)

Clause learning occurs when a contradiction is reached. It involves a sequence of resolution steps: any implied literal in a clausal reason can be resolved away by resolving the clause with the clausal reason for that implied literal. For example:

(¬Q, ¬P, ¬D) resolved with (¬Y, P) yields (¬Q, ¬D, ¬Y)
(¬A, ¬B, ¬C, D) resolved with (¬X, C) yields (¬A, ¬B, D, ¬X)
(¬Q, ¬P, ¬D) resolved with (¬Y, Q) yields (¬P, ¬D, ¬Y)
Page 12: Clause Learning in SAT

(Same trail as before.)

SAT solvers utilize a particular sequence of resolutions against the conflict clause: 1-UIP learning [Zhang et al., 2001]. Iteratively resolve away the deepest implied literal in the clause until the clause contains only one literal from the level at which the contradiction was generated:

(¬Q, ¬P, ¬D) resolved with (¬Y, Q) yields (¬P, ¬D, ¬Y)
(¬P, ¬D, ¬Y) resolved with (¬Y, P) yields (¬D, ¬Y)
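A compact sketch of the 1-UIP loop, under the same assumed integer-literal representation (level maps each variable to its decision level; reason maps each implied literal to its clausal reason; decisions have no reason). This is an illustration, not the talk's implementation:

    def resolve(c1, c2, lit):
        # Resolve c1 (containing -lit) with c2 (containing lit) on lit.
        return tuple(set(l for l in c1 if l != -lit) | set(l for l in c2 if l != lit))

    def learn_1uip(conflict, trail, level, reason, conflict_level):
        clause = conflict
        for lit in reversed(trail):               # walk the trail deepest-first
            at_level = [l for l in clause if level[abs(l)] == conflict_level]
            if len(at_level) <= 1:
                return clause                     # one conflict-level literal left: 1-UIP
            if -lit in clause and lit in reason:  # resolve away deepest implied literal
                clause = resolve(clause, reason[lit], lit)
        return clause

    # Slide example: X,A,B,C,D (1..5) at level 1; Y,P,Q (6..8) at level 2.
    level = {1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 2, 7: 2, 8: 2}
    trail = [1, 2, 3, 4, 5, 6, 7, 8]
    reason = {2: (-1, 2), 3: (-1, 3), 4: (-1, 4), 5: (-2, -3, -4, 5),
              7: (-6, 7), 8: (-6, 8)}
    print(sorted(learn_1uip((-8, -7, -5), trail, level, reason, 2)))
    # [-6, -5], i.e., (¬Y, ¬D): the 1-UIP clause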

Page 13: Far Backtracking in SAT

(Same trail as before.)

Once the 1-UIP clause (¬D, ¬Y) is learnt, the SAT solver backtracks to the level at which this clause became unit. It then uses the clause to force a new literal, (¬D, ¬Y) → ¬Y, performs unit propagation, and continues its search.
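The backjump level can be read directly off the learnt clause. A small sketch under the same assumed representation (the helper name is mine):

    def assertion_level(uip_clause, level, conflict_level):
        # Deepest level among the clause's other literals: where the clause becomes unit.
        return max((level[abs(l)] for l in uip_clause
                    if level[abs(l)] != conflict_level), default=0)

    # Continuing the example: learnt (¬D, ¬Y), D set at level 1, Y at level 2.
    print(assertion_level((-5, -6), {5: 1, 6: 2}, 2))   # 1: backtrack to level 1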

Page 14: Theoretical Power of Clause Learning

The power of clause learning has been examined from the point of view of the theory of proof complexity [Cook & Reckhow 1977]. This area looks at how large proofs can become, and at their relative sizes in different propositional proof systems.

DPLL with clause learning performs resolution (a particular restricted form of resolution). Various restricted versions of resolution have been well studied; [Buresh-Oppenheim & Pitassi 2003] contains a nice review of previous results and a number of new results in this area.

Page 15: Theoretical Power of Clause Learning

Every DPLL search tree refuting an UNSAT instance contains a TREE-resolution proof. TREE-resolution proofs can be exponentially larger than REGULAR-resolution proofs, and REGULAR-resolution proofs can be exponentially larger than general (unrestricted) resolution proofs. For UNSAT formulas:

min_size(DPLL search tree)
  ≥ min_size(TREE-resolution)
  >> min_size(REGULAR-resolution)
  >> min_size(general resolution)

Page 16: Theoretical Power of Clause Learning

Furthermore, every TREE-resolution proof is a REGULAR-resolution proof, and every REGULAR-resolution proof is a general resolution proof. For UNSAT formulas:

min_size(DPLL search tree)
  ≥ min_size(TREE-resolution)
  ≥ min_size(REGULAR-resolution)
  ≥ min_size(general resolution)

Page 17: Theoretical Power of Clause Learning

[Beame, Kautz, and Sabharwal 2003] showed that clause learning can SOMETIMES yield exponentially smaller proofs than REGULAR-resolution. It is unknown whether general resolution proofs are sometimes smaller still. For UNSAT formulas:

min_size(DPLL search tree)
  ≥ min_size(TREE-resolution)
  >> min_size(REGULAR-resolution)
  >> min_size(clause learning DPLL search tree)
  ≥ min_size(general resolution)

Page 18: Theoretical Power of Clause Learning

It is still unknown whether REGULAR- or even TREE-resolution proofs can sometimes be smaller than the smallest clause learning DPLL search tree.

Page 19: Theoretical Power of Clause Learning

It is also easily observed [Beame, Kautz, and Sabharwal 2003] that with restarts, clause learning can make the DPLL search tree as small as the smallest general resolution proof, on any formula. For UNSAT formulas:

min_size(clause learning + restarts DPLL search tree) = min_size(general resolution)

Page 20: Theoretical Power of Clause Learning

In sum: clause learning, especially with restarts, has the potential to yield exponential reductions in the size of the DPLL search tree. With clause learning, DPLL can potentially solve problems exponentially faster. That this can happen in practice has been irrefutably demonstrated by modern SAT solvers, which have been able to exploit the theoretical potential of clause learning.

Page 21: Theoretical Power of Clause Learning

The theoretical advantages of clause learning also hold for CSP backtracking search. So the question arises: can the theoretical potential of clause learning also be exploited in CSP solvers?

Page 22: 1.2 Clause Learning in CSPs

Page 23: Clause Learning in CSPs

This is joint work with George Katsirelos, who just completed his PhD with me: "NoGood Processing in CSPs".

Learning has been used in CSPs [Dechter 1990; Schiex & Verfaillie 1993; Frost & Dechter 1994; Jussien & Barichard 2000], but it has not had the kind of impact clause learning has had in SAT. This work has investigated NoGood learning.

A NoGood is a set of variable assignments that cannot be extended to a solution.

Page 24: NoGood Learning

NoGood learning is NOT clause learning; it is strictly less powerful. To illustrate this, let us consider encoding a CSP as a SAT problem, and compare what clause learning will do on the SAT encoding to what NoGood learning would do.

Page 25: Propositional Encoding of a CSP: the propositions

A CSP consists of a set of variables Vi and constraints Cj. Each variable has a domain of values Dom[Vi] = {d1, …, dm}. Consider the set of propositions Vi=dj, one for each value of each variable:
- Vi=dj means that Vi has been assigned the value dj; it is true when the assignment has been made.
- ¬(Vi=dj) means that Vi has not been assigned the value dj; it is true when dj has been pruned from Vi's domain. (If Vi has been assigned a different value, all other values, including dj, are pruned from its domain.)
- We usually write Vi≠dj instead of ¬(Vi=dj).

We encode the CSP using clauses over these assignment propositions.

Page 26: Propositional Encoding of a CSP: the clauses

For each variable V with Dom[V] = {d1, …, dk} we have the following clauses:
- (V=d1, V=d2, …, V=dk)  (must have a value)
- For every pair of values (di, dj), the clause (V≠di, V≠dj)  (has a unique value)

For each constraint C(X1, …, Xk) over some set of variables we have the following clauses:
- For each assignment to its variables that falsifies the constraint, a clause blocking that assignment: if C(a, b, …, k) = FALSE, then we have the clause (X1≠a, X2≠b, …, Xk≠k).

This is the direct encoding of [Walsh 2000]; a sketch of a generator for it follows.
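A small Python sketch of the direct encoding (my own illustration; a proposition V=d is represented as the pair (V, d), and a clause as a list of (sign, V, d) triples where sign False means V≠d):

    from itertools import combinations, product

    def direct_encoding(domains, constraints):
        # domains: {var: list of values}; constraints: {scope tuple: predicate}
        clauses = []
        for v, dom in domains.items():
            clauses.append([(True, v, d) for d in dom])          # must have a value
            clauses += [[(False, v, a), (False, v, b)]           # has a unique value
                        for a, b in combinations(dom, 2)]
        for scope, pred in constraints.items():
            for vals in product(*(domains[v] for v in scope)):
                if not pred(*vals):                              # block every falsifying
                    clauses.append([(False, v, d)                # assignment
                                    for v, d in zip(scope, vals)])
        return clauses

    # The running example used on the next slides.
    domains = {'Q': [0, 1], 'X': [1, 2, 3], 'Y': [1, 2, 3], 'Z': [1, 2, 3]}
    constraints = {('Q', 'X', 'Y'): lambda q, x, y: q + x + y >= 3,
                   ('Q', 'X', 'Z'): lambda q, x, z: q + x + z >= 3,
                   ('Q', 'Y', 'Z'): lambda q, y, z: q + y + z <= 3}
    print(len(direct_encoding(domains, constraints)))            # number of clauses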

Page 27: DPLL on the Encoded CSP

Unit propagation on this encoding is essentially equivalent to forward checking on the original CSP.

Page 28: DPLL on the Encoded CSP

Variables: Q, X, Y, Z, … with Dom[Q] = {0,1} and Dom[X] = Dom[Y] = Dom[Z] = {1,2,3}.
Constraints: Q + X + Y ≥ 3, Q + X + Z ≥ 3, Q + Y + Z ≤ 3.

Q=0 ← Assumption
(Q≠0, Q≠1) → Q≠1
X=1 ← Assumption
(X≠1, X≠2) → X≠2
(X≠1, X≠3) → X≠3
(Q≠0, X≠1, Y≠1) → Y≠1
(Q≠0, X≠1, Z≠1) → Z≠1
Y=2 ← Assumption
(Y≠2, Y≠3) → Y≠3
(Q≠0, Y≠2, Z≠2) → Z≠2
(Q≠0, Y≠2, Z≠3) → Z≠3
(Z=1, Z=2, Z=3) is falsified: a domain wipe-out on Z.

Page 29: DPLL on the Encoded CSP

(Same trail as the previous slide.)

Clause learning: resolve the falsified clause with the clausal reason for Z≠3:

(Z=1, Z=2, Z=3) resolved with (Q≠0, Y≠2, Z≠3) yields (Q≠0, Y≠2, Z=1, Z=2)

Page 30: DPLL on the Encoded CSP

(Same trail as before.)

Clause learning continues, resolving away Z≠2:

(Q≠0, Y≠2, Z=1, Z=2) resolved with (Q≠0, Y≠2, Z≠2) yields (Q≠0, Y≠2, Z=1), a 1-UIP clause.

Page 31: DPLL on the Encoded CSP

The clause (Q≠0, Y≠2, Z=1) is not a NoGood! It asserts that we cannot have Q=0, Y=2, and Z≠1 simultaneously. This is a set of assignments and domain prunings that cannot lead to a solution, whereas a NoGood is only a set of assignments. To obtain a NoGood we have to further resolve Z=1 away from the clause.

Page 32: DPLL on the Encoded CSP

NoGood learning:

(Q≠0, Y≠2, Z=1) resolved with (Q≠0, X≠1, Z≠1) yields (Q≠0, X≠1, Y≠2)

This clause is a NoGood: it says that we cannot have the set of assignments Q=0, X=1, Y=2. NoGood learning requires resolving the conflict back to the decision literals.

Page 33: NoGoods vs. Clauses (Generalized NoGoods)

1. Unit propagation over a collection of learnt NoGoods is ineffective. NoGoods are clauses containing negated literals only, e.g., (Z≠1, Y≠0, X≠3). If one of these clauses becomes unit, e.g., forcing X≠3, the forced literal can only satisfy other NoGood clauses; it can never reduce the length of those clauses.

2. A single clause can represent an exponential number of NoGoods. Given Dom = {1,2,3}, the clause (Q≠1, Z=1, Y=1) is equivalent to the four NoGoods (Q≠1, Z≠2, Y≠2), (Q≠1, Z≠3, Y≠2), (Q≠1, Z≠2, Y≠3), and (Q≠1, Z≠3, Y≠3), as the sketch below enumerates.
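A tiny sketch of point 2 (my illustration, reusing the pair representation from the earlier encoding sketch), expanding the single generalized clause into the NoGoods it represents:

    from itertools import product

    domains = {'Q': [0, 1], 'Z': [1, 2, 3], 'Y': [1, 2, 3]}
    clause = [(False, 'Q', 1), (True, 'Z', 1), (True, 'Y', 1)]   # (Q≠1, Z=1, Y=1)

    negs = [(v, d) for sign, v, d in clause if not sign]   # already of the form V≠d
    poss = [(v, d) for sign, v, d in clause if sign]       # V=d: expand over Dom\{d}
    for others in product(*[[x for x in domains[v] if x != d] for v, d in poss]):
        nogood = negs + list(zip((v for v, _ in poss), others))
        print(', '.join('%s≠%s' % vd for vd in nogood))
    # Q≠1, Z≠2, Y≠2 / Q≠1, Z≠2, Y≠3 / Q≠1, Z≠3, Y≠2 / Q≠1, Z≠3, Y≠3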

Page 34: NoGoods vs. Clauses (Generalized NoGoods)

3. The 1-UIP clause can prune more branches during the future search than the NoGood clause [Katsirelos 2007].

4. Clause learning can yield super-polynomially smaller search trees than NoGood learning [Katsirelos 2007].

Page 35: Encoding to SAT

With all of these benefits of clause learning over NoGood learning, the natural question is: why not encode CSPs to SAT and immediately obtain the benefits of the clause learning already implemented in modern SAT solvers?

Page 36: Encoding to SAT

1. The SAT theory produced by the direct encoding is not very effective: unit propagation on this encoding only achieves forward checking, a weak form of propagation.
2. Under the direct encoding, constraints of arity k yield 2^O(k) clauses; the resultant SAT theory is too large.
3. There is no direct way of exploiting propagators, i.e., the specialized polynomial-time algorithms for doing propagation on constraints of large arity.

Page 37: Encoding to SAT

Some of these issues can be addressed by better encodings, e.g., [Bacchus 2007; Katsirelos & Walsh 2007; Quimper & Walsh 2007]. But overall, complete conversion to SAT is currently impractical.

Page 38: Clause Learning in CSPs without Encoding

We can perform clause learning in a CSP solver by the following steps:

1. The CSP solver must keep track of the chronological sequence of variable assignments and value prunings made as we descend each path in the search tree. For example, the decisions Q=0 and X=1 give the trail Q=0, Q≠1, X=1, X≠2, X≠3, Y≠1.

Page 39: Clause Learning in CSPs without Encoding

2. Each item must be labeled with a clausal reason consisting of items previously falsified along the path:

Q=0 ← Assumption
(Q≠0, Q≠1) → Q≠1
X=1 ← Assumption
(X≠1, X≠2) → X≠2
(X≠1, X≠3) → X≠3
(Q≠0, X≠1, Y≠1) → Y≠1

Page 40: Clause Learning in CSPs without Encoding

3. Contradictions are labeled by falsified clauses; e.g., a domain wipe-out on Z can be labeled by the must-have-a-value clause (Z=1, Z=2, Z=3).

From this information clause learning can be performed whenever a contradiction is reached. The learnt clauses can be stored in a clausal database, and unit propagation can be run on this database as new value assignments or value prunings are performed. The inferences of unit propagation augment the other constraint propagation done by the CSP solver.

Page 41: Higher Levels of Local Consistency

Note that this technique works irrespective of the kinds of inference performed during search. That is, we can use any kind of inference we want to infer a new value pruning or new variable assignment, as long as we can label the inference with a clausal reason.

This raises the question of how we generate clausal reasons for other forms of inference. [Katsirelos 2007] answers this question for the most commonly used form of inference, Generalized Arc Consistency (GAC), including ways of obtaining clausal reasons from various types of GAC propagators, e.g., ALL-DIFF and GCC.

Page 42: Some Empirical Data [Katsirelos 2007]

GAC with NoGood learning helps a bit. GAC with clause learning, but where GAC labels its inferences with NoGoods, offers only minor improvements. To get significant improvements one must do clause learning and also obtain proper clausal reasons from GAC.

Page 43: Observations

Caching techniques have great potential, but making them effective in practice can require resolving a number of different issues. This work goes a long way towards achieving the goal of exploiting the theoretical potential of clause learning.

Prediction: clause learning will play a fundamental role in the next generation of CSP solvers, and these solvers will often be orders of magnitude more effective than current solvers.

Page 44: Open Issues

Many issues remain open; here we mention only one: restarts. As previously pointed out, clause learning gains a great deal more power with restarts; with restarts it can be as powerful as unrestricted resolution. Restarts, both full and partial, play an essential role in the performance of SAT solvers.

Page 45: Search vs. Inference

With restarts and clause learning, the distinction between search and inference is turned on its head: now search is performing inference. Instead, the distinction becomes systematic vs. opportunistic inference. Enforcing a high level of consistency during search is systematic inference; searching until we learn a good clause is opportunistic inference.

SAT solvers perform very little systematic inference (only unit propagation), but they perform lots of opportunistic inference. CSP solvers essentially do the opposite.

Page 46: One Open Question

In SAT solvers opportunistic inference is feasible: if a learnt clause turns out not to be useful, it does not matter much, as the search to learn that clause did not take much time. Search (the nodes/second rate) is very fast.

In CSP solvers the enforcement of higher levels of local consistency makes restarts and opportunistic inference very expensive. Search (the nodes/second rate) is very slow.

Are high levels of consistency really the most effective approach for solving CSPs once clause learning is available?

Page 47: 2. Formula Caching when Considering All Solutions

Page 48: Considering All Solutions?

One class of such problems are those that can be expressed as sum-of-products problems [Dechter 1999]:
1. A finite set of variables V1, V2, …, Vn.
2. A finite domain of values Dom[Vi] for each variable.
3. A finite set of real-valued local functions f1, f2, …, fm.

Each function is local in the sense that it depends only on a subset of the variables, e.g., f1(V1,V2), f2(V2,V4,V6), …. The locality of the functions can be exploited algorithmically.

Page 49: Sum of Products

The sum-of-products problem is to compute, from this representation,

Σ_{V1} Σ_{V2} … Σ_{Vn} ( f1 × f2 × … × fm )

The local functions assign a value to every complete instantiation of the variables (the product), and we want to compute some amalgamation of these values. A number of different problems can be cast as instances of sum-of-products [Dechter 1999].
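As a concrete (brute-force, exponential-time) reference point, here is a sketch that evaluates this quantity directly; the (scope, function) representation is my own assumption. Used with 0/1 constraint functions it counts CSP solutions:

    from itertools import product
    from math import prod

    def sum_of_products(domains, functions):
        # domains: {var: values}; functions: list of (scope, f) local functions
        variables = list(domains)
        total = 0
        for vals in product(*(domains[v] for v in variables)):
            assign = dict(zip(variables, vals))
            total += prod(f(*(assign[v] for v in scope)) for scope, f in functions)
        return total

    # #CSP for the earlier CSP example: each constraint is a 0/1 local function.
    domains = {'Q': [0, 1], 'X': [1, 2, 3], 'Y': [1, 2, 3], 'Z': [1, 2, 3]}
    functions = [(('Q', 'X', 'Y'), lambda q, x, y: q + x + y >= 3),
                 (('Q', 'X', 'Z'), lambda q, x, z: q + x + z >= 3),
                 (('Q', 'Y', 'Z'), lambda q, y, z: q + y + z <= 3)]
    print(sum_of_products(domains, functions))   # 9 solutions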

Page 50: Sum of Products: Examples

- #CSP: count the number of solutions.
- Inference in Bayes nets.
- Optimization: the functions are sub-objective functions returning real values, and the global objective is to maximize the sum of the sub-objectives (cf. soft constraints, generalized additive utility).

Page 51: Algorithms: A Brief History [Arnborg et al. 1988]

It had long been noted that various NP-complete problems on graphs were easy on trees. With the characterization of NP-completeness, the systematic study of how to extend these techniques beyond trees started in the 1970s. A number of dynamic programming algorithms were developed for partial k-trees; these could solve many hard problems in time linear in the size of the graph (but exponential in k).

Page 52: Algorithms: A Brief History [Arnborg et al. 1988]

These ideas were made systematic by Robertson & Seymour, who wrote a series of 20 articles to prove Wagner's conjecture [1983]. Along the way they defined the concepts of tree and branch decompositions and the graph parameters tree-width and branch-width.

It was subsequently noted that partial k-trees are equivalent to the class of graphs with tree-width ≤ k, so all of the dynamic programming algorithms developed for partial k-trees work for graphs of tree-width k. The notion of tree-width has been exploited in many areas of computer science and in combinatorics & optimization.

Page 53: Three Types of Algorithms

These algorithms all take one of three basic forms, all of which achieve the same kinds of tree-width complexity guarantees. To understand these forms we first introduce the notion of a branch decomposition, which is somewhat easier to utilize than a tree decomposition when dealing with local functions of arity greater than 2.

Page 54: Branch Decomposition

Start with m leaf nodes, one for each of the local functions, and map each local function to some leaf node. (Figure: leaves f3, f6, f1, f4, f2, f7, f5.)

Page 55: Branch Decomposition

Label each leaf node with the variables in the scope of the associated local function. (Figure: the leaves f3, f6, f1, f4, f2, f7, f5 are labeled {V4,V5}, {V5,V6,V7}, {V3,V7}, {V1,V3}, {V4,V6}, {V8,V6}, {V5,V2}.)

Page 56: Branch Decomposition

Build a binary tree on top of these nodes. (Figure: the labeled leaves joined under a binary tree.)

Page 57: Branch Decomposition

Then label the rest of the nodes of the tree. (Figure: internal nodes receive labels such as {V4,V5}, {V5,V6,V3}, {V3,V4,V6}, and {V4,V6,V3}.)

Page 58: Internal Labels

For an internal node, let A be the variables appearing in the subtree below the node, and let B be the variables appearing in the rest of the tree (not in the subtree under the node). The node is labeled with the intersection A ∩ B.

Page 59: Internal Labels

(Figure: a variable v appearing both in A and in B contributes to the node's label.)

Page 60: Branch Width

The width of a particular decomposition is the size of its largest label. Branch-width is the minimal width over all possible branch decompositions. Branch-width is no more than tree-width plus one.

Page 61: Algorithms: Dynamic Programming

Bottom-up dynamic programming, e.g., the join tree algorithms used in Bayesian inference. (Figure: the branch decomposition from before, processed from the leaves upward.)

Page 62: Algorithms: Variable Elimination

Linearize the bottom-up process: variable elimination. (Figure: the same branch decomposition.)

Page 63: Algorithms: Instantiation and Decomposition

Instantiate variables starting at the top (V4, V6 and V3) and decompose the problem. (Figure: after instantiating V4, V6, V3 the problem splits into independent parts, e.g., over {V8,V1} and {V5,V2,V7}.)

Page 64: Instantiation and Decomposition

A number of works have used this approach:
- Pseudo-Tree Search [Freuder & Quinn 1985]
- Counting Solutions [Bayardo & Pehoushek 2000]
- Recursive Conditioning [Darwiche 2001]
- Tour Merging [Cook & Seymour 2003]
- AND/OR Search [Dechter & Mateescu 2004]
- …

Page 65: Instantiation and Decomposition

This is solved by AND/OR search: as we instantiate variables we examine the residual sub-problem. If the sub-problem consists of disjoint parts that share no variables (components), we solve each component in a separate recursion.

Page 66: Theoretical Results

With the right ordering, this approach can solve the problem in time 2^O(w log n) and linear space, where w is the branch- (tree-) width of the instance. If the solved components are cached, so that they do not have to be solved again, the approach can solve the problem in time n^O(1) 2^O(w), but now we need n^O(1) 2^O(w) space.

Page 67: Solving Sum-of-Products with Backtracking

In joint work with Toniann Pitassi and Shannon Dalmao we showed that caching is in fact sufficient to achieve these bounds with standard backtracking search [Bacchus et al. 2003]. AND/OR decomposition of the search tree is not necessary (and may be harmful); instead, an ordinary decision tree can be searched. Once again, caching provides a significant increase in the theoretical power of backtracking.

Page 68: Simple Formula Caching

As assumptions are made during search, the problem is reduced. In simple formula caching we cache every solved residual formula, and if we encounter the same residual formula again we use its cached value instead of solving the same sub-problem again.

Two residual formulas are the same if:
- They contain the same (unassigned) variables.
- All instantiated variables in the remaining constraints (constraints with at least one unassigned variable) are instantiated to the same values.

Page 69: Simple Formula Caching

C1(X,Y), C2(Y,Z), C3(Y,Q) under [X=a, Y=b] reduces to C2(Y=b, Z), C3(Y=b, Q).
C1(X,Y), C2(Y,Z), C3(Y,Q) under [X=b, Y=b] also reduces to C2(Y=b, Z), C3(Y=b, Q).

These residual formulas are the same even though we obtained them from different instantiations.
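A sketch of a cache key implementing this identity test (my illustration: constraints are (scope, function) pairs, and the key records the set of unassigned variables together with, per still-remaining constraint, the values of its instantiated variables):

    def cache_key(domains, constraints, assignment):
        unassigned = tuple(v for v in domains if v not in assignment)
        parts = []
        for idx, (scope, _) in enumerate(constraints):
            fixed = tuple(sorted((v, assignment[v]) for v in scope if v in assignment))
            if len(fixed) < len(scope):            # constraint still "remaining"
                parts.append((idx, fixed))
        return (unassigned, tuple(parts))

    # The slide's example: the two instantiations yield the same key.
    domains = dict.fromkeys(['X', 'Y', 'Z', 'Q'])  # only the variable names matter here
    constraints = [(('X', 'Y'), None), (('Y', 'Z'), None), (('Y', 'Q'), None)]
    print(cache_key(domains, constraints, {'X': 'a', 'Y': 'b'}) ==
          cache_key(domains, constraints, {'X': 'b', 'Y': 'b'}))   # True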

Page 70: Simple Formula Caching

BTSimpleCache(Φ):
    if Φ is fully instantiated: return Value(Φ)
    if InCache(Φ): return CachedValue(Φ)
    pick a variable V in Φ
    value = 0
    for d in Dom[V]:
        value = value + BTSimpleCache(Φ|V=d)
    AddToCache(Φ, value)
    return value

Runs in time and space 2^O(w log n).
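A runnable counting version of this procedure (my sketch, not the talk's code), specialized to #CSP with the (scope, function) constraint representation and the cache key from the previous sketch inlined; it backtracks on any violated fully-instantiated constraint, so cached values are sound:

    def bt_simple_cache(domains, constraints, assignment=None, cache=None):
        assignment = assignment or {}
        cache = {} if cache is None else cache
        for scope, f in constraints:                 # backtrack on a violated,
            if all(v in assignment for v in scope):  # fully instantiated constraint
                if not f(*(assignment[v] for v in scope)):
                    return 0
        unassigned = [v for v in domains if v not in assignment]
        if not unassigned:
            return 1                                 # a solution
        key = (tuple(unassigned),                    # identify the residual formula
               tuple((i, tuple(sorted((v, assignment[v])
                                      for v in scope if v in assignment)))
                     for i, (scope, _) in enumerate(constraints)
                     if any(v not in assignment for v in scope)))
        if key in cache:
            return cache[key]
        v, value = unassigned[0], 0
        for d in domains[v]:
            value += bt_simple_cache(domains, constraints, {**assignment, v: d}, cache)
        cache[key] = value
        return value

    domains = {'Q': [0, 1], 'X': [1, 2, 3], 'Y': [1, 2, 3], 'Z': [1, 2, 3]}
    constraints = [(('Q', 'X', 'Y'), lambda q, x, y: q + x + y >= 3),
                   (('Q', 'X', 'Z'), lambda q, x, z: q + x + z >= 3),
                   (('Q', 'Y', 'Z'), lambda q, y, z: q + y + z <= 3)]
    print(bt_simple_cache(domains, constraints))     # 9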

Page 71: Component Caching

We can achieve the same performance as AND/OR decomposition, i.e., 2^O(w log n) time with linear space, or n^O(1) 2^O(w) time with n^O(1) 2^O(w) space, by examining the residual formula for disjoint components. We cache these disjoint components as they are solved, and we remove any solved component from the residual formula.
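A sketch of the component check (my illustration): constraints that share an unassigned variable are connected, and each connected group can be solved independently, with the results multiplied:

    def components(constraints, assignment):
        free = lambda scope: {v for v in scope if v not in assignment}
        comps, seen = [], set()
        for i, (scope, _) in enumerate(constraints):
            if i in seen or not free(scope):
                continue
            comp, frontier = [], [i]
            while frontier:                          # grow one connected component
                j = frontier.pop()
                if j in seen:
                    continue
                seen.add(j)
                comp.append(j)
                frontier += [k for k, (sc, _) in enumerate(constraints)
                             if k not in seen and free(sc) & free(constraints[j][0])]
            comps.append(comp)
        return comps

    constraints = [(('X', 'Y'), None), (('Y', 'Z'), None), (('Y', 'Q'), None)]
    print(components(constraints, {'Y': 'b'}))   # [[0], [1], [2]]: three components
    print(components(constraints, {}))           # [[0, 2, 1]]: all joined through Y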

Page 72: Component Caching

Since components are no longer solved in a separate recursion, we have to be a bit cleverer about identifying the value of these components from the search computation. This can be accomplished by using the cache in a clever way, or by dependency-tracking techniques.

Page 73: Component Caching

There are some potential advantages to searching a single tree rather than an AND/OR tree. With an AND/OR tree one has to commit to which component to solve first, and the wrong decision when doing Bayesian inference, or optimization with branch and bound, can be expensive. In the single tree the components are solved in an interleaved manner, which also provides more flexibility with respect to variable ordering.

Page 74: Bayesian Inference via Backtracking Search

These ideas were used to build a fairly successful Bayes net reasoner [Bacchus et al. 2003]. Better performance, however, would require exploiting more of the structure internal to the local functions.

Page 75: Exploiting Micro Structure

Suppose C1(A,Y,Z) = TRUE iff (A=0 and Y=1) or (A=1 and Y=0 and Z=1). Then C1(A=0,Y,Z) is in fact not a function of Z; that is, C1(A=0,Y,Z) ≡ C1(A=0,Y).

Suppose C2(X,Y,Z) = TRUE iff X + Y + Z ≥ 3. Then C2(X=3,Y,Z) is already satisfied.

Page 76: Exploiting Micro Structure

In both cases, if we could detect this during search we could potentially:
- Generate more components: e.g., if we could reduce C1(A=0,Y,Z) to C1(A=0,Y), perhaps Y and Z would be in different components.
- Generate more cache hits: e.g., if the residual formula differs from a cached formula only because it contains C2(X=3,Y,Z), recognizing that this constraint is already satisfied would allow us to ignore it and generate the cache hit.

Page 77: Exploiting Micro Structure

It is interesting to note that if we encode to CNF we do get to exploit more of the micro structure (the structure internal to the constraints): clauses with a true literal are satisfied and can be removed from the residual formula. Bayes net reasoners using CNF encodings have displayed very good performance [Chavira & Darwiche 2005].

Page 78: Exploiting Micro Structure

Unfortunately, as pointed out before, encoding into CNF can result in an impractical blowup in the size of the problem representation. Practical techniques for exploiting the micro structure remain a promising area for further research. There are some promising results by Kitching on detecting when a symmetric version of the current component has already been solved [Kitching & Bacchus 2007], but more work remains to be done.

Page 79: Observations

Component caching solvers are the most effective way of exactly computing the number of solutions of a SAT formula. They allow the solution of certain types of Bayesian inference problems not solvable by other methods, and they have shown promise in solving decomposable optimization problems [Dechter & Marinescu 2005; de Givry et al. 2006; Kitching & Bacchus 2007].

To date all of these works have used AND/OR search, so exploiting the advantages of plain backtracking search remains work to be done [Kitching, in progress]. Better exploiting the micro structure also remains work to be done.

Page 80: Conclusions

Caching is a technique that has great potential for making a material difference to the effectiveness of backtracking search. The range of practical mechanisms for exploiting caching remains a very fertile area for future research. Research in this direction might well change present-day "accepted practice" in constraint solving.

Page 81: References

[Marques-Silva and Sakallah, 1996] J. P. Marques-Silva and K. A. Sakallah. GRASP: a new search algorithm for satisfiability. In ICCAD, pages 220-227, 1996.

[Zhang et al., 2001] L. Zhang, C. F. Madigan, M. H. Moskewicz, and S. Malik. Efficient conflict driven learning in a Boolean satisfiability solver. In ICCAD, pages 279-285, 2001.

[Cook & Reckhow 1977] S. A. Cook and R. A. Reckhow. The relative efficiency of propositional proof systems. J. Symbolic Logic, 44:36-50.

[Buresh-Oppenheim & Pitassi 2003] J. Buresh-Oppenheim and T. Pitassi. The complexity of resolution refinements. In Proceedings of the 18th IEEE Symposium on Logic in Computer Science (LICS), pages 138-147, June 2003.

[Beame, Kautz, and Sabharwal 2003] P. Beame, H. Kautz, and A. Sabharwal. Towards understanding and harnessing the potential of clause learning. J. Artif. Intell. Res. (JAIR), 22:319-351, 2004.

[Dechter 1990] R. Dechter. Enhancement schemes for constraint processing: backjumping, learning, and cutset decomposition. Artif. Intell., 41(3):273-312, 1990.

[Schiex & Verfaillie 1993] T. Schiex and G. Verfaillie. Nogood recording for static and dynamic CSP. In Proceedings of the 5th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'93), pages 48-55, Boston, MA, November 1993.

Page 82: References

[Frost & Dechter 1994] D. Frost and R. Dechter. Dead-end driven learning. In AAAI 1994, pages 294-300.

[Jussien & Barichard 2000] N. Jussien and V. Barichard. The PaLM system: explanation-based constraint programming. In Proceedings of TRICS: Techniques foR Implementing Constraint programming Systems, a post-conference workshop of CP 2000, pages 118-133, 2000.

[Walsh 2000] T. Walsh. SAT v CSP. In Proceedings of CP-2000, pages 441-456, Springer-Verlag LNCS 1894, 2000.

[Katsirelos 2007] G. Katsirelos. NoGood Processing in CSPs. PhD thesis, Department of Computer Science, University of Toronto.

[Bacchus 2007] F. Bacchus. GAC via unit propagation. In International Conference on Principles and Practice of Constraint Programming (CP 2007), pages 133-147.

[Katsirelos & Walsh 2007] G. Katsirelos and T. Walsh. A compression algorithm for large arity extensional constraints. In Proceedings of CP-2007, LNCS 4741, 2007.

[Quimper & Walsh 2007] C. Quimper and T. Walsh. Decomposing global grammar constraints. In Proceedings of CP-2007, LNCS 4741, pages 590-604, 2007.

[Dechter 1999] R. Dechter. Bucket elimination: a unifying framework for reasoning. Artificial Intelligence, October 1999.

Page 83: References

[de Givry et al. 2006] S. de Givry, T. Schiex, and G. Verfaillie. Exploiting tree decomposition and soft local consistency in weighted CSP. In Proceedings of AAAI 2006, Boston, MA, USA.