Using Problem Structure for Efficient Clause Learning
Ashish Sabharwal, Paul Beame, Henry Kautz
University of Washington, Seattle
April 23, 2003
The SAT Approach
Input p ∈ D → CNF encoding f → SAT solver → is f ∈ SAT? → p bad or p good
• p : Instance
• D : Domain (graph problem, AI planning, model checking)
Key Facts
• Problem instances typically have structure
  – Graphs, precedence relations, causes and effects
  – Translation to CNF flattens this structure
• Best complete SAT solvers are
  – DPLL-based clause learners; branch and backtrack
  – Critical: Variable order used for branching
Natural Questions
• Can we extract structure efficiently?
  – In translation to CNF formula itself
  – From CNF formula
  – From higher level description
• How can we exploit this auxiliary information?
  – Tweak SAT solver for each domain
  – Tweak SAT solver to use general “guidance”
Our Approach
Input p ∈ D → CNF encoding f + branching sequence → SAT solver → is f ∈ SAT? → p bad or p good
Encode “structure” as branching sequence
Related Work
• Exploiting structure in CNF formula
  – [GMT’02] Dependent variables
  – [OGMS’02] LSAT (blocked/redundant clauses)
  – [B’01] Binary clauses
  – [AM’00] Partition-based reasoning
• Exploiting domain knowledge
  – [S’00] Model checking
  – [KS’96] Planning (cause vars / effect vars)
Our Result, Informally
– Structure can be efficiently retrieved from high-level description (pebbling graph)
– Branching sequence as auxiliary information can be easily exploited
Given a pebbling graph G, we can efficiently generate a branching sequence BG that dramatically improves the performance of current best SAT solvers on fG.
Preliminaries: CNF Formula
f = (x1 OR x2 OR ¬x9) AND (¬x3 OR x9) AND (¬x1 OR ¬x4 OR ¬x5 OR ¬x6)
Conjunction of clauses
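As a minimal sketch of how such a formula can be held in code, the example below stores f as a list of clauses, each clause a list of signed integers (k for xk, -k for ¬xk). This signed-integer convention and the helper name `evaluate` are assumptions made for illustration; they are not part of the slides.

```python
# The example formula f, one inner list per clause.
# Assumed convention: integer k means x_k, -k means NOT x_k.
f = [[1, 2, -9], [-3, 9], [-1, -4, -5, -6]]


def evaluate(cnf, assignment):
    """Return True if the assignment (dict: variable -> bool) satisfies every clause."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )


# Setting every variable to False satisfies f, because each clause
# contains at least one negated literal.
all_false = {v: False for v in range(1, 10)}
print(evaluate(f, all_false))
```

A conjunction of clauses is true only when each clause has at least one true literal, which is what the nested `all`/`any` expresses.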
Preliminaries: DPLL
DPLL(CNF formula f) {
    Simplify(f);
    If (conflict) return UNSAT;
    If (all-vars-assigned) {return SAT assignment; exit}
    Pick unassigned variable x;
    Try DPLL(f |x=0), DPLL(f |x=1)
}
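The pseudocode above could be written roughly as follows. This is a sketch under stated assumptions, not the implementation used in the talk: clauses are lists of signed integers, Simplify is taken to mean unit propagation, the branching variable is simply the first variable of the first remaining clause, and the search stops as soon as no clauses remain (so the returned assignment may be partial).

```python
def simplify(clauses, lit):
    """Set lit to true: drop satisfied clauses and delete the falsified literal."""
    return [[l for l in c if l != -lit] for c in clauses if lit not in c]


def dpll(clauses, assignment=()):
    """Return a list of true literals satisfying the clauses, or None if UNSAT."""
    clauses = [list(c) for c in clauses]
    assignment = list(assignment)
    # Simplify(f): repeatedly propagate unit clauses.
    while True:
        if any(len(c) == 0 for c in clauses):
            return None  # conflict: some clause has no literals left
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            break
        assignment.append(unit)
        clauses = simplify(clauses, unit)
    if not clauses:
        return assignment  # every clause is satisfied
    # Pick an unassigned variable x (naive choice, an assumption of this sketch).
    x = abs(clauses[0][0])
    # Try f|x=0 first, then f|x=1.
    for lit in (-x, x):
        result = dpll(simplify(clauses, lit), assignment + [lit])
        if result is not None:
            return result
    return None


f = [[1, 2, -9], [-3, 9], [-1, -4, -5, -6]]
print(dpll(f))
```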
Prelim: Clause Learning
• DPLL: Change “if (conflict) return UNSAT” to “if (conflict) {learn conflict clause; return UNSAT}”
x2 = 1, x3 = 0, x6 = 0 ⇒ conflict
“Learn” (¬x2 OR x3 OR x6)
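The learned clause in this example is just the negation of the assignment that caused the conflict. The sketch below shows that single step; the function name `conflict_clause` and the signed-integer literal convention are assumptions for illustration. Practical solvers such as zChaff derive smaller clauses from a conflict graph (for example with the 1UIP scheme), which this sketch does not attempt.

```python
def conflict_clause(conflicting_assignment):
    """Negate each assigned literal of a conflicting partial assignment.

    conflicting_assignment maps variable index -> bool.
    Returns the learned clause as signed integers, ordered by variable.
    """
    return sorted(
        (-var if value else var for var, value in conflicting_assignment.items()),
        key=abs,
    )


# x2 = 1, x3 = 0, x6 = 0 leads to a conflict, so learn (NOT x2 OR x3 OR x6).
print(conflict_clause({2: True, 3: False, 6: False}))  # [-2, 3, 6]
```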
Prelim: Branching Sequence
• B = (x1, x4, ¬x3, x1, ¬x8, ¬x2, ¬x4, x7, ¬x1, x2)
• DPLL: Change “Pick unassigned var x” to “Pick next literal x from B; delete it from B; if x already assigned, repeat”
• How “good” is B?
  – Depends on backtracking process, learning scheme
Different from “branching order”
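The modified “pick” step can be sketched as below. The helper name `next_branch_literal` and the signed-integer encoding are assumptions; returning None when B runs out is also an assumption, since the slide does not say what the solver does in that case (a caller might fall back to its default heuristic).

```python
from collections import deque


def next_branch_literal(seq, assigned_vars):
    """Pop literals off the front of seq until one has an unassigned variable.

    seq is a deque of signed integers; assigned_vars is a set of variable indices.
    Every literal that is looked at is deleted from seq, as on the slide.
    """
    while seq:
        lit = seq.popleft()
        if abs(lit) not in assigned_vars:
            return lit
    return None  # sequence exhausted (assumed behaviour)


# B from the slide, with x1 and x4 already assigned by earlier decisions.
B = deque([1, 4, -3, 1, -8, -2, -4, 7, -1, 2])
print(next_branch_literal(B, {1, 4}))  # -3
```

Because literals are consumed as they are read and a variable may appear more than once, the same sequence behaves differently depending on how the solver backtracks, which is why B is not simply a variable ordering.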
Prelim: Pebbling Formulas
[Figure: pebbling graph G with sources A, B, C, further nodes E and F, and target T. Node labels: A = (a1 OR a2), B = (b1 OR b2), C = (c1 OR c2 OR c3), E = (e1 OR e2), F = (f1), T = (t1 OR t2).]
fG = Pebbling(G)
Node E is pebbled if (e1 OR e2) = 1
• Source axioms: A, B, C are pebbled
• Pebbling axioms: A and B are pebbled ⇒ E is pebbled, …
• Target axioms: T is not pebbled
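The three kinds of axioms can be turned into clauses mechanically. The sketch below is an illustration under assumptions: the function name, the dictionary-based graph representation, and the "~" prefix for negation are made up for this example, and the graph used is only the fragment the slide states explicitly (A and B are predecessors of E), with E treated as the target to keep it small.

```python
from itertools import product


def pebbling_clauses(labels, preds, targets):
    """Build the clauses of Pebbling(G).

    labels:  node -> list of variable names in that node's clause label
    preds:   node -> list of predecessor nodes (missing or empty for sources)
    targets: list of target nodes
    """
    clauses = []
    for node, variables in labels.items():
        node_preds = preds.get(node, [])
        if not node_preds:
            # Source axiom: the source is pebbled, i.e. its label clause holds.
            clauses.append(list(variables))
        else:
            # Pebbling axiom: pick one variable from each predecessor label;
            # if all of them are true, the node's own label must be true.
            for choice in product(*(labels[p] for p in node_preds)):
                clauses.append(["~" + v for v in choice] + list(variables))
    for t in targets:
        # Target axiom: the target is not pebbled, so each label variable is false.
        clauses.extend(["~" + v] for v in labels[t])
    return clauses


labels = {"A": ["a1", "a2"], "B": ["b1", "b2"], "E": ["e1", "e2"]}
for clause in pebbling_clauses(labels, {"E": ["A", "B"]}, ["E"]):
    print(" OR ".join(clause))
```

For this fragment the output has 2 source clauses, 4 pebbling clauses (one per pair of variables from A and B), and 2 unit target clauses.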
Prelim: Pebbling Formulas
• Can have
  – Multiple targets
  – Unbounded fanin
  – Large clause labels
• Pebbling(G) is unsatisfiable
• Removing any clause from the subgraph of each target makes it satisfiable
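Both properties can be checked by brute force on a very small instance. The sketch below assumes a hypothetical single-target graph with sources A, B and target E (labels of two variables each), hard-codes its eight clauses as signed integers, and tries all assignments; it is a sanity check for this one instance, not a proof of the general statement.

```python
from itertools import product


def satisfiable(clauses, num_vars):
    """Exhaustively test all 2**num_vars assignments (only viable for tiny formulas)."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False


# Assumed variable numbering: a1=1, a2=2, b1=3, b2=4, e1=5, e2=6.
clauses = [
    [1, 2], [3, 4],                      # source axioms for A and B
    [-1, -3, 5, 6], [-1, -4, 5, 6],      # pebbling axioms: A and B pebbled => E pebbled
    [-2, -3, 5, 6], [-2, -4, 5, 6],
    [-5], [-6],                          # target axioms: E is not pebbled
]

print(satisfiable(clauses, 6))
print(all(satisfiable(clauses[:i] + clauses[i + 1:], 6) for i in range(len(clauses))))
```

The first check reports whether the full formula has a satisfying assignment; the second reports whether dropping each single clause in turn always yields a satisfiable formula.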
Grid vs. Randomized Pebbling
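[Figure: two example pebbling graphs. Randomized pebbling: node labels of varying size, e.g. (a1 a2), b1, (c1 c2 c3), (d1 d2 d3), e1, f1, (g1 g2), (h1 h2), (i1 i2 i3 i4), l1, m1, (n1 n2). Grid pebbling: every node carries a two-variable label, with nodes (a1 a2), (b1 b2), (c1 c2), (d1 d2); (e1 e2), (f1 f2), (g1 g2); (h1 h2), (i1 i2); and (t1 t2).]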
Why Pebbling?
• Practically useful
  – Precedence relations in tasks, fault propagation in circuits, restricted planning problems
• Theoretically interesting
  – Used earlier for separating proof complexity classes
  – “Easy” to analyze
• Hard for current best SAT solvers like zChaff
  – Shown by our experiments
Our Result, Again
– Efficient: Θ(|fG|)
– zChaff: One of the current best SAT solvers
Given a pebbling graph G, we can efficiently generate a branching sequence BG such that zChaff(fG, BG) is empirically exponentially faster than zChaff(fG).
The Algorithm
• Input:
  – Pebbling graph G
• Output:
  – Branching sequence BG, |BG| = Θ(|fG|), that works well for 1UIP learning scheme and fast backtracking
[fG : CNF encoding of pebbling(G)]
The Algorithm: GenSeq(G)
1. Compute node heights
2. Foreach u ∈ {unit clause labeled nodes} bottom up
   • Add u to G.sources
   • GenSubseq(u)
3. Foreach t ∈ {targets} bottom up
   • GenSubseq(t)
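Step 1 and the “bottom up” traversal order can be sketched as follows. The slides do not define height, so this example assumes that sources have height 0 and every other node has height one more than its highest predecessor. The function name and the example edges (E from A and B, F from B and C, T from E and F) are hypothetical.

```python
def node_heights(preds):
    """Compute heights for a DAG given as node -> list of predecessor nodes.

    Assumed definition: sources have height 0, any other node has
    height 1 + max(height of its predecessors).
    """
    heights = {}

    def height(v):
        if v not in heights:
            ps = preds.get(v, [])
            heights[v] = 0 if not ps else 1 + max(height(p) for p in ps)
        return heights[v]

    for v in preds:
        height(v)
    return heights


# Hypothetical edges on the node names used earlier in the talk.
preds = {"A": [], "B": [], "C": [], "E": ["A", "B"], "F": ["B", "C"], "T": ["E", "F"]}
h = node_heights(preds)
bottom_up = sorted(preds, key=lambda v: h[v])  # lower nodes first
print(h)
print(bottom_up)
```

The same heights can be used to sort each node's predecessor list by increasing height, which is the order GenSubseq(v, i) expects.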
The Algorithm: GenSubseq(v)
// trivial wrapper
1. If (|v.preds| > 0)
   – GenSubseq(v, |v.preds|)
The Algorithm: GenSubseq(v, i)
1. u = v.preds[i] // by increasing height
2. if i=1 // lowest pred
a. GenSubseq(u) if unvisited non-source
b. return
3. Output u.labels // higher pred
4. GenSubseq(u) if unvisitedHigh non-source
5. GenSubseq(v, i-1) // recurse on i-1
6. GenPattern(u, v, i-1) // repetitive pattern
Results: Grid Pebbling
– Pure DPLL: up to 60 variables
– DPLL + branching seq: up to 60 variables
– Clause learning (original zChaff): up to 4,000 variables
– Clause learning + branching seq: up to 2,000,000 variables
Results: Randomized Pebbling
– Pure DPLL: up to 35 variables
– DPLL + branching seq: up to 50 variables
– Clause learning (original zChaff): up to 350 variables
– Clause learning + branching seq: up to 1,000,000 variables
Summary
• High level problem description is useful
  – Domain knowledge can help SAT solvers
• Branching sequence
  – One good way to encode structure
• Pebbling problems: Proof of concept
  – Can efficiently generate good branching sequence
  – Structure use improves performance dramatically
Open Problems
• Other domains?
  – STRIPS planning problems (layered structure)
  – Bounded model checking
• Variable ordering strategies from BDDs?
• Other ways of exploiting structure?
  – Branching “order”
  – Something to guide learning?
  – Domain-based tweaking of SAT algorithms