Propositional Satisfiability and Constraint Programming Lucas Bordeaux Microsoft Research, Cambridge

Acknowledgements

My views on SAT have been shaped by discussions over the past weeks or years with:

Joao Marques-Silva – Lakhdar Sais – David Mitchell – Marco Benedetti – Sathiamoorthy Subbarayan – Fabien Corblin – Horst Samulowitz – Claude-Guy Quimper – Lintao Zhang – Youssef Hamadi – Ines Lynce – Madan Musuvathi – Daniel Le Berre – Marco Cadoli – Robert Nieuwenhuis – Cedric Piette – Byron Cook – Laurent Simon – Toni Mancini...

(as they say, the responsibility for the mistakes in these slides is nonetheless mine)

Goals of the Course

• See some tricks used by modern SAT solvers
• Discuss what lessons CP can learn from them
• Suggest why research in SAT was, is still, and will remain exciting and diverse
• Give you some working knowledge of many interesting results that are part of folklore, but rarely presented together

• Connect to various areas of CP and AI

A first reference

Propositional Satisfiability and Constraint Programming: a Comparative Survey
L. Bordeaux, Y. Hamadi and L. Zhang
ACM Computing Surveys, 2006

By no means the only reference on SAT:
– forthcoming handbooks
– survey by David Mitchell
– ...

Outline

• A few things to know about propositional logic
– Basic definitions
– Encoding problems in SAT
– Deduction in propositional logic
– Easy and hard classes of propositional formulae

• SAT solvers

PART I – A FEW THINGS TO KNOW ABOUT PROPOSITIONAL LOGIC

– Basic definitions
– Encoding problems in SAT
– Easy and hard classes of propositional formulae
– Deduction in propositional logic

A few things to know about SAT

• SAT means “propositional satisfiability”
– Given a Boolean circuit, find an input for which the circuit evaluates to true.

[Figure: a Boolean circuit built from AND, NOT and OR gates, with 0/1 values on its wires]


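• An interesting restriction of SAT is Conjunctive Normal Form (CNF)

[Formula: a CNF formula over the variables x, y, z, t, annotated “a literal” and “a clause”]

• Note that a clause is a no-good: it forbids a partial assignment

[Formula: a clause over x, y, z, t, rewritten as the negation of a conjunction of 0/1 assignments to x, y, z, t]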
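As an illustration of the no-good reading, here is one worked instance. The signs of the literals are an assumption chosen for the example and are not necessarily those of the slide. The clause on the left is false for exactly one assignment of x, y, z, t, which is the partial assignment it forbids:

$$
(x \vee \neg y \vee z \vee \neg t) \;\equiv\; \neg\,\bigl[(x = 0) \wedge (y = 1) \wedge (z = 0) \wedge (t = 1)\bigr]
$$

Each positive literal contributes a forbidden value 0 and each negative literal a forbidden value 1.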


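• SAT instances can be put in CNF

[Figure: circuit with inputs x, y, z, built from AND, NOT and OR gates]

$$\varphi(x,y,z) \;\equiv\; \exists a, b, c.\; (a \leftrightarrow x \wedge y) \wedge (b \leftrightarrow \neg z) \wedge (c \leftrightarrow a \vee b) \wedge c$$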

• We obtain a formula that is equi-satisfiable:
– Add a variable for each intermediate result: a = (x . y), b = (!z), c = (a + b)
– Express the relations between these variables by clauses: (!a + x), (!a + y), (!x + !y + a), ...
– Constrain the output to be true: (c)

[Figure: the circuit with its intermediate wires labelled a, b and c]
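The three steps above can be sketched in a few lines of Python. This is a minimal illustration for the circuit c = (x AND y) OR (NOT z) only; the helper names (`and_gate`, `not_gate`, `or_gate`, `cnf_satisfiable_with`) and the representation of literals as (variable, polarity) pairs are assumptions made for the example.

```python
from itertools import product

# A clause is a list of (variable, polarity) pairs; polarity True means a positive literal.
def and_gate(out, a, b):
    # out <-> (a AND b)
    return [[(out, False), (a, True)],
            [(out, False), (b, True)],
            [(a, False), (b, False), (out, True)]]

def not_gate(out, a):
    # out <-> NOT a
    return [[(out, True), (a, True)],
            [(out, False), (a, False)]]

def or_gate(out, a, b):
    # out <-> (a OR b)
    return [[(out, True), (a, False)],
            [(out, True), (b, False)],
            [(out, False), (a, True), (b, True)]]

def satisfies(assignment, clauses):
    return all(any(assignment[v] == pol for v, pol in clause) for clause in clauses)

# a = x AND y, b = NOT z, c = a OR b, and the output c is constrained to be true.
cnf = and_gate("a", "x", "y") + not_gate("b", "z") + or_gate("c", "a", "b") + [[("c", True)]]

def circuit(x, y, z):
    return (x and y) or (not z)

def cnf_satisfiable_with(x, y, z):
    # Can the inputs be extended by values of a, b, c to a model of the CNF?
    return any(
        satisfies({"x": x, "y": y, "z": z, "a": a, "b": b, "c": c}, cnf)
        for a, b, c in product([False, True], repeat=3)
    )
```

The CNF has more variables than the circuit, so the two are not logically equivalent; what is preserved is that an input makes the circuit true exactly when it extends to a model of the clauses.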


• Incidentally Conjunctive Normal Form provides a convenient format in which SAT instances can be represented

DEMO! Have a look at the instance encoding used in SAT competitions
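For readers without the demo at hand: SAT competitions use the DIMACS CNF text format, in which a header line `p cnf <variables> <clauses>` is followed by one clause per line, written as signed integers and terminated by 0. The sketch below writes the clauses of the circuit example in that format; the function name `to_dimacs` and the variable numbering are assumptions made for the illustration.

```python
def to_dimacs(num_vars, clauses):
    """Render clauses (lists of non-zero ints, negative = negated) as DIMACS CNF text."""
    lines = [f"p cnf {num_vars} {len(clauses)}"]
    for clause in clauses:
        lines.append(" ".join(str(lit) for lit in clause) + " 0")
    return "\n".join(lines) + "\n"

# Hypothetical numbering for the circuit example: x=1, y=2, z=3, a=4, b=5, c=6.
clauses = [
    [-4, 1], [-4, 2], [-1, -2, 4],   # a <-> (x AND y)
    [5, 3], [-5, -3],                # b <-> NOT z
    [6, -4], [6, -5], [-6, 4, 5],    # c <-> (a OR b)
    [6],                             # the output must be true
]
text = to_dimacs(6, clauses)
```

The resulting text can be saved to a `.cnf` file and passed to most SAT solvers.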

A few things to know about SAT

• SAT is NP-complete

QUIZ: There is a famous statement that 3SAT (i.e. clauses of size at most 3) is NP-complete; why is it so?

[Figure: grid of 0/1 cells. State as input at every step. Non-deterministic case: “guessed” bits as extra input.]

Cultural Interlude I

QUIZ: Who is the gentleman on the left? Who is the gentleman on the right?



Encoding Problems in Prop. Logic

• Clearly, variables with arbitrary finitely-representable domains can be represented by sequences of Boolean variables

• Relations over these variables can be represented as relations over $\{0,1\}^n$, a.k.a. Boolean functions $\{0,1\}^n \to \{0,1\}$.

• Among these relations, which ones can be concisely encoded as Boolean circuits?


Answer 1:
• The overwhelming majority of Boolean functions cannot be represented concisely in SAT (i.e. every circuit computing them has super-polynomial size)
• This is Shannon’s theorem (for Boolean circuits)

QUIZ: Do you see why Shannon’s theorem holds?
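A sketch of the usual counting argument, under assumptions that are not stated on the slide: gates have fan-in 2 and are taken from a fixed basis of $b$ gate types, and $s$ denotes the number of gates.

$$
\#\{\text{Boolean functions on } n \text{ inputs}\} = 2^{2^n},
\qquad
\#\{\text{circuits with } s \text{ gates}\} \le \bigl(b\,(n+s)^2\bigr)^{s}
$$

Each gate is described by its type and its two inputs, chosen among the $n$ inputs and $s$ gates, which gives the bound on the right. If $s$ is polynomial in $n$, the logarithm of that bound, $s \log_2\bigl(b\,(n+s)^2\bigr)$, is also polynomial in $n$ and therefore negligible compared with $2^n$. Polynomial-size circuits can thus compute only a vanishing fraction of all Boolean functions.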


Answer 2:
• Think of Cook’s theorem
– If membership of a 0/1 vector in the relation can be checked efficiently, then we can encode the relation into SAT
• “The assembly language of Constraints”
– Even though it is a minimalistic language, everything that can be represented concisely at all can be represented concisely in propositional logic


• There is an area of AI that studies knowledge representation formalisms
– Circuits are an “optimally concise” representation formalism for Boolean functions
– This comes at a price: we cannot efficiently find a solution, count the solutions, make projections, etc.
– Other formalisms exist that offer different tradeoffs between conciseness and functionality (e.g. Binary Decision Diagrams, d-DNNF)

(detour: Binary Decision Diagrams)

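Decision tree

[Figure: decision tree over the variables x, y, z with 0/1 leaves]

(we can join terminal nodes)

Merge isomorphic branches

[Figure: successive reductions of the tree, ending in a diagram with two terminals, 0 and 1]

Different variants exist (0-suppressed, etc.)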
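The reductions of this detour (joining terminals, merging isomorphic branches, and skipping tests whose two branches coincide) can be sketched as follows. This is a minimal illustration assuming a fixed variable order and a function given as a Python callable; `build_bdd` and `evaluate` are hypothetical names, and the construction walks the whole decision tree, so it is only meant for tiny examples.

```python
from itertools import product

def build_bdd(f, num_vars):
    """Return (root, nodes) for the reduced ordered BDD of f over num_vars Boolean inputs.

    Terminals are the ints 0 and 1; every other node id maps to (level, low, high).
    """
    unique = {}   # (level, low, high) -> node id, so isomorphic subgraphs are shared
    nodes = {}

    def make(level, prefix):
        if level == num_vars:
            return int(bool(f(*prefix)))
        low = make(level + 1, prefix + (0,))
        high = make(level + 1, prefix + (1,))
        if low == high:            # both branches agree: the test is redundant
            return low
        key = (level, low, high)
        if key not in unique:      # merge isomorphic branches
            unique[key] = len(unique) + 2   # ids 0 and 1 are reserved for terminals
            nodes[unique[key]] = key
        return unique[key]

    return make(0, ()), nodes

def evaluate(root, nodes, values):
    """Follow the diagram from the root according to the given input values."""
    node = root
    while node not in (0, 1):
        level, low, high = nodes[node]
        node = high if values[level] else low
    return node

f = lambda x, y, z: (x and y) or (not z)
root, nodes = build_bdd(f, 3)
```

For this function the full decision tree has 7 internal nodes, while the reduced diagram keeps one node per variable.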

Problems typically encoded in SAT

• Circuit equivalence

[Figure: circuits A and B with their outputs combined by an XOR gate]

A solution gives a counter-example to equivalence
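To make the XOR construction concrete, here is a small sketch in which exhaustive enumeration stands in for the SAT solver; the function name `counterexample` and the two example circuits are assumptions made for the illustration.

```python
from itertools import product

def counterexample(circuit_a, circuit_b, num_inputs):
    """Return an input on which the two circuits differ (a model of A XOR B), or None."""
    for bits in product([False, True], repeat=num_inputs):
        if circuit_a(*bits) != circuit_b(*bits):   # the XOR of the outputs is true
            return bits
    return None

# De Morgan: the two circuits are equivalent, so the XOR is unsatisfiable.
nand = lambda x, y: not (x and y)
or_of_nots = lambda x, y: (not x) or (not y)

# AND versus OR: not equivalent, so a counter-example exists.
conj = lambda x, y: x and y
disj = lambda x, y: x or y
```

In practice the two circuits and the XOR gate would be translated to CNF and handed to a solver; an unsatisfiable answer then proves equivalence.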

Problems typically encoded in SAT

• Finite-State Transition Systems (toy example)

We have a system whose state is encoded by two bits X and Y

The following transitions are possible:
• Swap the values of X and Y – possible only when X=1
• Flip the value of X

Finite-State Transition Systems

Example of problem:

Starting from state (0,0), can the system reach state (1,1)?

• Swap the values of X and Y – possible only when X=1
• Flip the value of X

[Figure: the run built step by step, (X,Y) = (0,0), (1,0), (0,1), (1,1). Done!]

Finite-State Transition Systems

SAT encoding:
• We bound the path to a given length
• The value of a component A of the system at each step i is denoted by a different variable Ai
• We put constraints between each pair of consecutive states to guarantee correct transitions

[Figure: state variables (X1,Y1), (X2,Y2), (X3,Y3), (X4,Y4), with constraints C1, C2, C3 between consecutive states]

A solution encodes a path between initial and terminal states

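• Swap the values of X and Y – possible only when X=1
• Flip the value of X

[Figure: the chain of state variables (X1,Y1), ..., (X4,Y4); the constraint on the first transition reads:]

$$(X_1 = 1,\; X_2 = Y_1,\; Y_2 = X_1) \;\vee\; (X_2 = \neg X_1)$$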
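The bounded encoding can be sketched as follows, with exhaustive enumeration of the variables X1..Xk, Y1..Yk standing in for a SAT solver. The names `step` and `find_path` are illustrative, and the sketch assumes that a flip leaves Y unchanged, which the transition constraint does not state explicitly.

```python
from itertools import product

def step(x1, y1, x2, y2):
    """Constraint between two consecutive states (x1, y1) and (x2, y2)."""
    swap = x1 == 1 and x2 == y1 and y2 == x1   # swap, possible only when X=1
    flip = x2 == 1 - x1 and y2 == y1           # assumption: Y is unchanged by a flip
    return swap or flip

def find_path(start, goal, num_steps):
    """Search for values of the state variables that satisfy every constraint."""
    for bits in product([0, 1], repeat=2 * (num_steps + 1)):
        states = [(bits[2 * i], bits[2 * i + 1]) for i in range(num_steps + 1)]
        if states[0] != start or states[-1] != goal:
            continue
        if all(step(*states[i], *states[i + 1]) for i in range(num_steps)):
            return states
    return None
```

With three transitions, `find_path((0, 0), (1, 1), 3)` recovers the run of the toy example; with a smaller bound no path exists, which is why the bound has to be increased until a solution is found or some other argument rules one out.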


• Many applications of “bounded analysis of finite-state transition systems”, of which the two most famous are:

– “Bounded Model Checking”, e.g. the query: system ⊨ AF(end)?

– “Planning as satisfiability”, e.g. the domain:

    (define (domain-summer-school)
      (:action swap
        :parameters (...)
        :precondition ()
        :effect ())
      (...))


• One of the topics typically studied at the border between SAT and CP is encodings CSP ↔ SAT.

• Let’s see two simple cases:

X  Y
0  1
0  2
1  1
2  0

X0, X1, X2, each Xi meaning “X=i”

Exactly one of the Xi’s is true:

$$\sum_i X_i = 1$$

(alternative: logarithmic encoding)
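In clausal form, “exactly one of the Xi’s is true” is commonly written as one at-least-one clause plus pairwise at-most-one clauses. The sketch below builds these clauses and counts their models by enumeration; the function names and the (name, polarity) representation of literals are assumptions made for the example.

```python
from itertools import combinations, product

def exactly_one(variables):
    """Clauses (lists of (name, polarity) literals) forcing exactly one variable to be true."""
    at_least_one = [[(v, True) for v in variables]]
    at_most_one = [[(u, False), (v, False)] for u, v in combinations(variables, 2)]
    return at_least_one + at_most_one

def count_models(variables, clauses):
    """Count the assignments of the variables that satisfy every clause."""
    count = 0
    for bits in product([False, True], repeat=len(variables)):
        value = dict(zip(variables, bits))
        if all(any(value[v] == pol for v, pol in clause) for clause in clauses):
            count += 1
    return count

names = ["X0", "X1", "X2"]
clauses = exactly_one(names)
```

The pairwise part grows quadratically with the domain size, which is one reason more compact at-most-one encodings are sometimes preferred.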




Naive (line by line) encoding:

(X0 and Y1) or
(X0 and Y2) or
(X1 and Y1) or
(X2 and Y0)




Support encoding:

Y0 → X2
Y1 → (X0 or X1)
Y2 → X0
X0 → (Y1 or Y2)
X1 → Y1
X2 → Y0




Support encoding (as clauses):

not Y0 or X2
not Y1 or X0 or X1
not Y2 or X0
not X0 or Y1 or Y2
not X1 or Y1
not X2 or Y0

QUIZ: Not all encodings are equivalent... Do you see any advantage of the support encoding over the naive one?
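As a sanity check on the support encoding, the sketch below generates one clause per value of each variable directly from the table and verifies that the pairs (X=a, Y=b) satisfying all clauses are exactly the rows of the table. The names `support_clauses` and `models` are illustrative; the exactly-one constraints on the Xi and Yi are taken for granted by enumerating pairs of values rather than arbitrary Boolean assignments.

```python
from itertools import product

TABLE = {(0, 1), (0, 2), (1, 1), (2, 0)}
DOMAIN = [0, 1, 2]

def support_clauses(table, domain):
    """One clause per value: 'X=a' implies some supporting 'Y=b', and symmetrically for Y.

    A literal is (variable, value, polarity), read as 'variable = value' or its negation.
    """
    clauses = []
    for a in domain:
        clauses.append([("X", a, False)] + [("Y", b, True) for b in domain if (a, b) in table])
    for b in domain:
        clauses.append([("Y", b, False)] + [("X", a, True) for a in domain if (a, b) in table])
    return clauses

def models(clauses, domain):
    """Pairs (a, b) such that setting exactly X=a and Y=b true satisfies every clause."""
    result = set()
    for a, b in product(domain, repeat=2):
        value = lambda var, val: (val == a) if var == "X" else (val == b)
        if all(any(value(var, val) == pol for var, val, pol in clause) for clause in clauses):
            result.add((a, b))
    return result
```

Unlike the naive disjunction of conjunctions, this encoding is already in clausal form, and each clause lets a solver discard a value as soon as all of its supports have been removed.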



Classes of SAT

• An exponential worst-case complexity result does not say much about “non-worst-case” instances
– What about instances picked at random?
– What about instances appearing in real life?
– What about instances that obey some restrictions?

Random instances

• What do we mean by a random instance?
– Typically pure “homogeneous” versions of SAT instances, e.g. all clauses with the same size (k-SAT)
– Typically (to the best of my knowledge) generated from trivial distributions, e.g. uniform

• There is every reason to believe that random instances from a properly chosen distribution are absolutely intractable (and in a sense not so different from worst case)

Random instances

• One well-known result emerging from the study of random instances is the phase transition phenomenon, first observed empirically and then proved/explained analytically

[Figure: percentage of satisfiable instances (in %) and resolution time (typically in log scale)]
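A small experiment in the spirit of the plot can be sketched as follows. It assumes the usual setting for such studies, which the slide does not spell out: uniform random 3-SAT, with the ratio of clauses to variables as the control parameter. All function names are illustrative, and the brute-force satisfiability check restricts the sketch to very small instances.

```python
import random
from itertools import product

def random_ksat(num_vars, num_clauses, k, rng):
    """Uniform random k-SAT: each clause has k distinct variables with random signs."""
    clauses = []
    for _ in range(num_clauses):
        variables = rng.sample(range(1, num_vars + 1), k)
        clauses.append([v if rng.random() < 0.5 else -v for v in variables])
    return clauses

def is_satisfiable(num_vars, clauses):
    """Brute-force check, only usable for very small instances."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in clause) for clause in clauses):
            return True
    return False

def fraction_satisfiable(num_vars, ratio, samples, rng):
    """Estimate the proportion of satisfiable 3-SAT instances at a clause/variable ratio."""
    num_clauses = round(ratio * num_vars)
    hits = sum(is_satisfiable(num_vars, random_ksat(num_vars, num_clauses, 3, rng))
               for _ in range(samples))
    return hits / samples
```

Calling `fraction_satisfiable` for a range of ratios, with a seeded `random.Random` for reproducibility, gives the raw data for the satisfiability curve; the transition only becomes sharp as the number of variables grows, which requires a real solver rather than enumeration.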

“Real-Life” Instances

• Warning: although “picked at random” may sound much closer to your typical instance than “worst-case”, it probably isn’t
– Instances appearing in applications have regularities and highly non-random patterns (“structure”)
– (A possible definition of “real-life” or “structured” may indeed very well be “non-random”)

Research Question

What is a real-life instance (e.g. what is structure)?

Why are some solvers so good at exploiting structure?

Special classes of SAT

• Restrictions on the type of constraints allowed
– Basic result: Schaefer’s classification of Boolean constraint languages

• Trivial 0
• Trivial 1
• Horn, e.g. $\neg x \vee \neg y \vee z$
• Anti-Horn, e.g. $x \vee y \vee \neg z$
• 2-SAT, e.g. $x \vee \neg z$
• Affine, e.g. $x \oplus y \oplus z = 1$

These 6 languages are polynomial; any constraint language not falling within the 6 categories is NP-complete


• Restrictions on the structure of the constraint graph
– Typical results: the difficulty of solving an instance is exponential not in the number of variables but in the tree-width


• We just mentioned a few special classes
• Do these tractable classes arise in practice?
– SATLIB instances typically have fairly low tree-width
– Real problems are never purely 2SAT or Horn, but they typically have substantial parts that are
– Notion of backdoor: a small number of variables that need to be instantiated before the instance breaks down to a tractable fragment
– Idea: algorithms are exponential in something that is not constant but is typically much smaller than the actual number of variables (tree-width, backdoor size...)

Very special classes of Instance

• Some instances are constructed to exhibit special features (e.g. to be hard for a given proof system)
– Pigeon-Hole Principle (PHP)
– Parity

• Interesting problems related to the generation of particular instances have received solutions
– generating satisfiable instances
– generating instances with a unique solution
– ...

QUIZ: How do you write the PHP formulae?
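One common way to write the PHP formulae, given here as a sketch rather than as the only possible answer: a variable (p, h) means “pigeon p sits in hole h”, every pigeon must sit somewhere, and no hole holds two pigeons. The function names are illustrative, and the satisfiability check is plain enumeration.

```python
from itertools import combinations, product

def php(num_pigeons, num_holes):
    """Clauses over variables (p, h), read 'pigeon p sits in hole h'."""
    clauses = []
    for p in range(num_pigeons):                       # every pigeon gets a hole
        clauses.append([((p, h), True) for h in range(num_holes)])
    for h in range(num_holes):                         # no two pigeons share a hole
        for p, q in combinations(range(num_pigeons), 2):
            clauses.append([((p, h), False), ((q, h), False)])
    return clauses

def is_satisfiable(clauses):
    """Brute-force check over all assignments of the variables that occur in the clauses."""
    variables = sorted({var for clause in clauses for var, _ in clause})
    for bits in product([False, True], repeat=len(variables)):
        value = dict(zip(variables, bits))
        if all(any(value[var] == pol for var, pol in clause) for clause in clauses):
            return True
    return False
```

With one more pigeon than holes, e.g. `php(3, 2)`, the formula is unsatisfiable; these are the instances whose resolution proofs are known to be large.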

Classes of SAT

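QUIZ: Are these instances Horn?

[Formula: three clauses, over (x, y, z), (u, z) and (x, y, u)]

[Formula: the same three clauses over (a, y, z), (u, z) and (a, y, u), where a = not x]

YES / NOPE

Horn-renamable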

QUIZ: Can you think of a poly-time algorithm for Horn-SAT?

QUIZ: Can you think of a poly-time algorithm for 2-SAT?

QUIZ: Can you think of a poly-time algorithm for XOR-SAT?
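One possible answer to the Horn-SAT quiz is forward chaining, i.e. unit propagation restricted to positive literals. The sketch below is a simple polynomial (not linear-time) version; the representation of a Horn clause as a (body, head) pair and the name `horn_sat` are assumptions made for the example.

```python
def horn_sat(clauses):
    """Decide a Horn formula by forward chaining.

    Each clause is (body, head): body is the set of variables that appear negated,
    head is the single positive variable, or None if the clause has no positive literal.
    Returns the set of variables forced to true, or None if the formula is unsatisfiable.
    """
    true_vars = set()
    changed = True
    while changed:
        changed = False
        for body, head in clauses:
            if body <= true_vars:                 # all premises already derived
                if head is None:
                    return None                   # a purely negative clause is violated
                if head not in true_vars:
                    true_vars.add(head)
                    changed = True
    return true_vars
```

When the formula is satisfiable, setting the returned variables to true and all others to false gives a model, in fact the smallest one.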

Research Question

What is an easy/structured instance?
What is a hard/unstructured instance?
Is it feasible to produce hard instances?



Deductions in SAT

• If an instance is satisfiable, how do we know?
– Well, by exhibiting a solution

• If an instance is unsatisfiable, how do we know?
– That’s a non-trivial, fundamental question…

Deductions in SAT

• Enumeration is but one type of proof, many proof systems exist.

• Textbook example (Sequent calculus, fragment):

Deductions in SAT

• Let’s be open-minded: many proof systems exist, e.g. Cutting Plane proof systems

• The notion of proof system is not specific to SAT (or Tautologies): it applies to any coNP-complete language, e.g. non-3-colourability

Propositional Resolution

• The proof system that has received the most interest in practice (by far) is resolution:

$$\frac{A \vee x \qquad \neg x \vee B}{A \vee B}$$

(What do we mean by “this rule is a proof system”? Essentially, that it is complete: any unsat instance has a proof or, seen the other way, it generates all unsat formulae)

Propositional Resolution

• Example of resolution proof

[Figure: resolution proof tree over clauses on the variables x, y and z]
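A resolution step is easy to mechanise. The sketch below implements the rule on clauses represented as sets of signed integers and runs it on a small unsatisfiable instance; the instance is a hypothetical example chosen for the illustration, not necessarily the one on the slide.

```python
def resolve(clause_a, clause_b, var):
    """Resolvent of clause_a (containing var) and clause_b (containing -var).

    Clauses are frozensets of non-zero ints; a negative int is a negated variable.
    """
    if var not in clause_a or -var not in clause_b:
        raise ValueError("clauses do not clash on this variable")
    return frozenset((clause_a - {var}) | (clause_b - {-var}))

# Hypothetical unsatisfiable instance with x=1, y=2, z=3.
c1 = frozenset({1})            # x
c2 = frozenset({-1, 2, 3})     # not x or y or z
c3 = frozenset({-2})           # not y
c4 = frozenset({-3})           # not z

step1 = resolve(c1, c2, 1)     # y or z
step2 = resolve(step1, c3, 2)  # z
empty = resolve(step2, c4, 3)  # empty clause: contradiction
```

Deriving the empty clause is what certifies unsatisfiability; the sequence of resolvents is the proof.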

Proof Search

• Proof systems capture the types of deductions that can be performed mechanically by one type of solver or another

• They are not algorithms, in the sense that they leave open how (in which order) these deductions should be made

• Put otherwise, a proof system is a non-deterministic algorithm.

• The next question (not an easy one either) is how to search for proofs deterministically

Proof Complexity in 1 slide

• In proof complexity
– A proof is a sequence of bits
– A proof system is a syntax for these bits, e.g. a poly-time predicate check(instance, proof)

• Comparison between proof systems
– There is a notion of polynomial reduction (“simulation”) between proof systems
– E.g. Sequent calculus simulates resolution. The converse is not true

• Notion of automatizability of proofs

Proof Complexity

• Proof complexity results are interesting because they help understand the limits of some solvers, in particular resolution-based solvers

• Examples of results:
– Resolution proofs for PHP grow exponentially
– Many proof systems are strictly more powerful than resolution (e.g. Sequent calculus)
– Resolution is not automatizable (“unless W[P] is tractable”)
– Computing the resolution width of a formula is EXPTIME-complete

Proof Systems

QUIZ: Can you think of a proof system for CSP?

Proof Complexity

Research Question

• Deep questions:

Is there an optimal proof system?

B.T.W. Is there an optimal algorithm?

Or an optimal semi-decision algorithm?

Or a practical, optimal, semi-decision algorithm?

The puzzling role of resolution

• Resolution has well-understood limitations
• Yet no attempt to design a non-resolution-based SAT prover has succeeded

Research Question

Can a non-resolution-based deductive SAT approach be made competitive?

Is there an unsuspected fundamental reason why resolution would play a central role?

Why is SAT interesting?

• Although “minimal”, SAT can encode many cool things
• SAT has a rich theory
• Modern SAT solvers are very good

DEMO! Plus 2 anecdotes!
