arxiv.org · Journal of Artiﬂcial Intelligence Research 35 (2009) 343-389 Submitted 08/08; published 06/09 Automated Reasoning in Modal and Description Logics via SAT Encoding:

Journal of Artificial Intelligence Research 35 (2009) 343-389 Submitted 08/08; published 06/09

Automated Reasoning in Modal and Description Logics viaSAT Encoding: the Case Study of Km/ALC-Satisfiability

Roberto Sebastiani [email protected]

Michele Vescovi [email protected]

DISI, Universita di TrentoVia Sommarive 14, I-38123, Povo, Trento, Italy

Abstract

In the last two decades, modal and description logics have been applied to numerousareas of computer science, including knowledge representation, formal verification, databasetheory, distributed computing and, more recently, semantic web and ontologies. For thisreason, the problem of automated reasoning in modal and description logics has beenthoroughly investigated. In particular, many approaches have been proposed for efficientlyhandling the satisfiability of the core normal modal logic Km, and of its notational variant,the description logic ALC. Although simple in structure, Km/ALC is computationally veryhard to reason on, its satisfiability being PSpace-complete.

In this paper we start exploring the idea of performing automated reasoning tasks inmodal and description logics by encoding them into SAT, so that to be handled by state-of-the-art SAT tools; as with most previous approaches, we begin our investigation fromthe satisfiability in Km. We propose an efficient encoding, and we test it on an extensiveset of benchmarks, comparing the approach with the main state-of-the-art tools available.Although the encoding is necessarily worst-case exponential, from our experiments wenotice that, in practice, this approach can handle most or all the problems which are atthe reach of the other approaches, with performances which are comparable with, or evenbetter than, those of the current state-of-the-art tools.

1. Motivations and Goals

In the last two decades, modal and description logics have provided an essential frameworkfor many applications in numerous areas of computer science, including artificial intelli-gence, formal verification, database theory, distributed computing and, more recently, se-mantic web and ontologies. For this reason, the problem of automated reasoning in modaland description logics has been thoroughly investigated (e.g., Fitting, 1983; Ladner, 1977;Baader & Hollunder, 1991; Halpern & Moses, 1992; Baader, Franconi, Hollunder, Nebel, &Profitlich, 1994; Massacci, 2000). In particular, the research in modal and description logicshas followed two parallel routes until the seminal work of Schild (1991), which proved thatthe core modal logic Km and the core description logic ALC are one a notational variant ofthe other. Since then, analogous results have been produced for a bunch of other logics, sothat, nowadays the two research lines have mostly merged into one research flow.

Many approaches have been proposed for efficiently reasoning in modal and descriptionlogics, starting from the problem of checking the satisfiability in the core normal modallogic Km and in its notational variant, the description logic ALC (hereafter simply “Km”).We classify them as follows.

c©2009 AI Access Foundation. All rights reserved.

Sebastiani & Vescovi

• The “classic” tableau-based approach (Fitting, 1983; Baader & Hollunder, 1991; Mas-sacci, 2000) is based on the construction of propositional tableau branches, which arerecursively expanded on demand by generating successor nodes in a candidate Kripkemodel. Kris (Baader & Hollunder, 1991; Baader et al., 1994), Crack (Franconi,1998), LWB (Balsiger, Heuerding, & Schwendimann, 1998) were among the mainrepresentative tools of this approach.

• The DPLL-based approach (Giunchiglia & Sebastiani, 1996, 2000) differs from theprevious one mostly in the fact that a Davis-Putnam-Logemann-Loveland (DPLL)procedure, which treats the modal subformulas as propositions, is used instead ofthe classic propositional tableaux procedure at each nesting level of the modal op-erators. KSAT (Giunchiglia & Sebastiani, 1996), ESAT (Giunchiglia, Giunchiglia,& Tacchella, 2002) and *SAT (Tacchella, 1999), are the representative tools of thisapproach.

These two approaches merged into the “modern” tableaux-based approach, which has beenextended to work with more expressive description logics and to provide more sophisticatereasoning functions. Among the tools employing this approach, we recall FaCT/FaCT++and DLP (Horrocks & Patel-Schneider, 1999), and Racer (Haarslev & Moeller, 2001). 1

• In the translational approach (Hustadt & Schmidt, 1999; Areces, Gennari, Heguiabehere,& de Rijke, 2000) the modal formula is encoded into first-order logic (FOL), and theencoded formula can be decided efficiently by a FOL theorem prover (Areces et al.,2000). Mspass (Hustadt, Schmidt, & Weidenbach, 1999) is the most representativetool of this approach.

• The CSP-based approach (Brand, Gennari, & de Rijke, 2003) differs from the tableaux-based and DPLL-based ones mostly in the fact that a CSP (Constraint SatisfactionProblem) engine is used instead of tableaux/DPLL. KCSP is the only representativetool of this approach.

• In the Inverse-method approach (Voronkov, 1999, 2001), a search procedure is basedon the “inverted” version of a sequent calculus (which can be seen as a modalizedversion of propositional resolution). K K(Voronkov, 1999) is the only representativetool of this approach.

• In the Automata-theoretic approach, (a symbolic representation based on BDDs –Binary Decision Diagrams – of) a tree automaton accepting all the tree models of theinput formula is implicitly built and checked for emptiness (Pan, Sattler, & Vardi,2002; Pan & Vardi, 2003). KBDD (Pan & Vardi, 2003) is the only representative toolof this approach.

1. Notice that there is not an universal agreement on the terminology “tableaux-based” and “DPLL-based”.E.g., tools like FaCT, DLP, and Racer are most often called “tableau-based”, although they usea DPLL-like algorithm instead of propositional tableaux for handling the propositional component ofreasoning (Horrocks, 1998; Patel-Schneider, 1998; Horrocks & Patel-Schneider, 1999; Haarslev & Moeller,2001).

344

Automated Reasoning in Modal and Description Logics via SAT Encoding

• Pan and Vardi (2003) presented also an encoding of K-satisfiability into QBF-satisfiability(which is PSpace-complete too), combined with the use of a state-of-the-art QBF(Quantified Boolean Formula) solver. We call this approach QBF-encoding approach.

To the best of our knowledge, the last four approaches so far are restricted to the satisfiabilityin Km only, whilst the translational approach has been applied to numerous modal anddescription logics (e.g. traditional modal logics like Tm and S4m, and dynamic modallogics) and to the relational calculus.

A significant amount of benchmarks formulas have been produced for testing the effec-tiveness of the different techniques (Halpern & Moses, 1992; Giunchiglia, Roveri, & Sebas-tiani, 1996; Heuerding & Schwendimann, 1996; Horrocks, Patel-Schneider, & Sebastiani,2000; Massacci, 1999; Patel-Schneider & Sebastiani, 2001, 2003).

In the last two decades we have also witnessed an impressive advance in the efficiencyof propositional satisfiability techniques (SAT), which has brought large and previously-intractable problems at the reach of state-of-the-art SAT solvers. Most of the success of SATtechnologies is motivated by the impressive efficiency reached by current implementationsof the DPLL procedure, (Davis & Putnam, 1960; Davis, Longemann, & Loveland, 1962),in its most-modern variants (Silva & Sakallah, 1996; Moskewicz, Madigan, Zhao, Zhang, &Malik, 2001; Een & Sorensson, 2004). Current implementations can handle formulas in theorder of 107 variables and clauses.

As a consequence, many hard real-world problems have been successfully solved byencoding into SAT (including, e.g., circuit verification and synthesis, scheduling, planning,model checking, automatic test pattern generation , cryptanalysis, gene mapping). Effectiveencodings into SAT have been proposed also for the satisfiability problems in quantifier-freeFOL theories which are of interest for formal verification (Strichman, Seshia, & Bryant,2002; Seshia, Lahiri, & Bryant, 2003; Strichman, 2002). Notably, successful SAT encodingsinclude also PSpace-complete problems, like planning (Kautz, McAllester, & Selman, 1996)and model checking (Biere, Cimatti, Clarke, & Zhu, 1999).

In this paper we start exploring the idea of performing automated reasoning tasks inmodal and description logics by encoding them into SAT, so that to be handled by state-of-the-art SAT tools; as with most previous approaches, we begin our investigation from thesatisfiability in Km.

In theory, the task may look hopeless because of worst-case complexity issues: in fact,with few exceptions, the satisfiability problem in most modal and description logics is not inNP, typically being PSpace-complete or even harder —PSpace-complete for Km (Ladner,1977; Halpern & Moses, 1992)— so that the encoding is in worst-case non polynomial. 2

In practice, however, a few considerations allow for not discarding that this approachmay be competitive with the state-of-the-art approaches. First, the non-polynomial boundsabove are worst-case bounds, and formulas may have different behaviors from that of thepathological formulas which can be found in textbooks. (E.g., notice that the exponentialityis based on the hypothesis of unboundedness of some parameter like the modal depth;Halpern & Moses, 1992; Halpern, 1995.) Second, some tricks in the encoding may allowfor reducing the size of the encoded formula significantly. Third, as the amount of RAM

2. We implicitly make the assumption NP 6= PSpace.

345


memory in current computers is in the order of the GBytes and current SAT solvers cansuccessfully handle huge formulas, the encoding of many modal formulas (at least of thosewhich are not too hard to solve also for the competitors) may be at the reach of a SAT solver.Finally, even for PSpace-complete logics like Km, also other state-of-the-art approaches arenot guaranteed to use polynomial memory.

In this paper we show that, at least for the satisfiability Km, by exploiting some smartoptimizations in the encoding the SAT-encoding approach becomes competitive in practicewith previous approaches. To this extent, the contributions of this paper are manyfold.

• We propose a basic encoding of Km formulas into purely-propositional ones, and provethat the encoding is satisfiability-preserving.

• We describe some optimizations of the encoding, both in form of preprocessing andof on-the-fly simplification. These techniques allow for significant (and in some casesdramatic) reductions in the size of the resulting Boolean formulas, and in performancesof the SAT solver thereafter.

• We perform a very extensive empirical comparison against the main state-of-the-arttools available. We show that, despite the NP-vs.-PSpace issue, this approach canhandle most or all the problems which are at the reach of the other approaches, withperformances which are comparable with, and sometimes even better than, thoseof the current state-of-the-art tools. In our perspective, this is the most surprisingcontribution of the paper.

• As a byproduct of our work, we obtain an empirical evaluation of current tools for Km-satisfiability available, which is very extensive in terms of both amount and variety ofbenchmarks and of number and representativeness of the tools evaluated. We are notaware of any other such evaluation in the recent literature.

We also stress the fact that with our approach the encoder can be interfaced with everySAT solver in a plug-and-play manner, so that to benefit for free of every improvement inthe technology of SAT solvers which has been or will be made available.

Content. The paper is structured as follows. In Section 2 we provide the necessarybackground notions on modal logics and SAT. In Section 3 we describe the basic encodingfrom Km to SAT. In Section 4 we describe and discuss the main optimizations, and providemany examples. In Section 5 we present the empirical evaluation, and discuss the results.In Section 6 we present some related work and current research trends. In Section 7 weconclude, and describe some possible future evolutions.

A six-page preliminary version of this paper, containing some of the basic ideas presentedhere, was presented at SAT’06 conference (Sebastiani & Vescovi, 2006). For the readers’convenience, an online appendix is provided, containing all plots of Section 5 in full size.Moreover, in order to make the results reproducible, the encoder, the benchmarks and therandom generators with the seeds used are also available in the online appendix.

346


2. Background

In this section we provide the necessary background on the modal logic Km (Section 2.1)and on SAT and the DPLL procedure (Section 2.2).

2.1 The Modal Logic Km

We recall some basic definitions and properties of Km. Given a non-empty set of primitivepropositions A = {A1, A2, . . .}, a set of m modal operators B = {21, . . . , 2m}, and the con-stants “True” and “False” (that we denote respectively with “>” and “⊥”) the language ofKm is the least set of formulas containing A, closed under the set of propositional connec-tives {¬,∧,∨,→,↔} and the set of modal operators in B∪{31, . . . , 3m}. Notationally, weuse the Greek letters α, β, ϕ, ψ, ν, π to denote formulas in the language of Km (Km-formulashereafter). Notice that we can consider {¬,∧} together with B as the group of the prim-itive connectives/operators, defining the remaining in the standard way, that is: “3rϕ”for “¬2r¬ϕ”, “ϕ1 ∨ ϕ2” for “¬(¬ϕ1 ∧ ¬ϕ2)”, “ϕ1 → ϕ2” for “¬(ϕ1 ∧ ¬ϕ2)”, “ϕ1 ↔ ϕ2”for “¬(ϕ1 ∧ ¬ϕ2) ∧ ¬(ϕ2 ∧ ¬ϕ1)”. (Hereafter formulas like ¬¬ψ are implicitly assumed tobe simplified into ψ, so that, if ψ is ¬φ, then by “¬ψ” we mean “φ”.) Notationally, weoften write “(

∧i li) →

∨j lj” for the clause “

∨j ¬li ∨

∨j lj”, and “(

∧i li) → (

∧j lj)” for the

conjunction of clauses “∧

j(∨

i ¬li ∨ lj)”. Further, we often write 2r or 3r meaning onespecific/generic modal operator, where it is assumed that r = 1, . . . ,m; and we denote by2i

r the nested application of the 2r operator i times: 20rψ := ψ and 2i+1

r ψ := 2r2irψ. We

call depth of ϕ, written depth(ϕ), the maximum number of nested modal operators in ϕ.We call a propositional atom every primitive proposition in A, and a propositional literalevery propositional atom (positive literal) or its negation (negative literal). We call a modalatom every formula which is either in the form 2rϕ or in the form 3rϕ.

In order to make our presentation more uniform, and to avoid considering the polarityof subformulas, we adopt the traditional representation of Km-formulas (introduced, as faras we know, by Fitting, 1983 and widely used in literature, e.g. Fitting, 1983; Massacci,2000; Donini & Massacci, 2000) from the following table:

α α1 α2 β β1 β2 πr πr0 νr νr

0

(ϕ1 ∧ ϕ2) ϕ1 ϕ2 (ϕ1 ∨ ϕ2) ϕ1 ϕ2 3rϕ1 ϕ1 2rϕ1 ϕ1

¬(ϕ1 ∨ ϕ2) ¬ϕ1 ¬ϕ2 ¬(ϕ1 ∧ ϕ2) ¬ϕ1 ¬ϕ2 ¬2rϕ1 ¬ϕ1 ¬3rϕ1 ¬ϕ1

¬(ϕ1 → ϕ2) ϕ1 ¬ϕ2 (ϕ1 → ϕ2) ¬ϕ1 ϕ2

in which non-literal Km-formulas are grouped into four categories: α’s (conjunctive), β’s(disjunctive), π’s (existential), ν’s (universal). Importantly, all such formulas occur in themain formula with positive polarity only. This allows for disregarding the issue of polarityof subformulas.

The semantic of modal logics is given by means of Kripke structures. A Kripke structurefor Km is a tuple M = 〈U ,L,R1, . . . ,Rm〉, where U is a set of states, L is a functionL : A×U 7−→ {True, False}, and each Rr is a binary relation on the states of U . With anabuse of notation we write “u ∈M” instead of “u ∈ U”. We call a situation any pair M, u,M being a Kripke structure and u ∈ M. The binary relation |= between a modal formula

347


ϕ and a situation M, u is defined as follows:

M, u |= >;M, u 6|= ⊥;M, u |= Ai, Ai ∈ A ⇐⇒ L(Ai, u) = True;M, u |= ¬Ai, Ai ∈ A ⇐⇒ L(Ai, u) = False;M, u |= α ⇐⇒ M, u |= α1 and M, u |= α2;M, u |= β ⇐⇒ M, u |= β1 or M, u |= β2;M, u |= πr ⇐⇒ M, w |= πr

0 for some w ∈ U s.t. Rr(u,w) holds in M;M, u |= νr ⇐⇒ M, w |= νr

0 for every w ∈ U s.t. Rr(u,w) holds in M.

“M, u |= ϕ” should be read as “M, u satisfy ϕ in Km” (alternatively, “M, u Km-satisfiesϕ”). We say that a Km-formula ϕ is satisfiable in Km (Km-satisfiable henceforth) if and onlyif there exist M and u ∈M s.t. M, u |= ϕ. (When this causes no ambiguity, we sometimesdrop the prefix “Km-”.) We say that w is a successor of u through Rr iff Rr(u,w) holds inM.

The problem of determining the Km-satisfiability of a Km-formula ϕ is decidable andPSPACE-complete (Ladner, 1977; Halpern & Moses, 1992), even restricting the language toa single Boolean atom (i.e., A = {A1}; Halpern, 1995); if we impose a bound on the modaldepth of the Km-formulas, the problem reduces to NP-complete (Halpern, 1995). For amore detailed description on Km— including, e.g., axiomatic characterization, decidabilityand complexity results — we refer the reader to the works of Halpern and Moses (1992),and Halpern (1995).

A Km-formula is said to be in Negative Normal Form (NNF) if it is written in terms ofthe symbols 2r, 3r, ∧, ∨ and propositional literals Ai, ¬Ai (i.e., if all negations occur onlybefore propositional atoms in A). Every Km-formula ϕ can be converted into an equivalentone NNF (ϕ) by recursively applying the rewriting rules: ¬2rϕ=⇒3r¬ϕ, ¬3rϕ=⇒2r¬ϕ,¬(ϕ1 ∧ ϕ2)=⇒(¬ϕ1 ∨ ¬ϕ2), ¬(ϕ1 ∨ ϕ2)=⇒(¬ϕ1 ∧ ¬ϕ2), ¬¬ϕ=⇒ϕ.

A Km-formula is said to be in Box Normal Form (BNF) (Pan et al., 2002; Pan & Vardi,2003) if it is written in terms of the symbols 2r, ¬2r, ∧, ∨, and propositional literals Ai,¬Ai (i.e., if no diamonds are there, and all negations occur only before boxes or beforepropositional atoms in A). Every Km-formula ϕ can be converted into an equivalent oneBNF (ϕ) by recursively applying the rewriting rules: 3rϕ=⇒¬2r¬ϕ, ¬(ϕ1 ∧ϕ2)=⇒(¬ϕ1 ∨¬ϕ2), ¬(ϕ1 ∨ ϕ2)=⇒(¬ϕ1 ∧ ¬ϕ2), ¬¬ϕ=⇒ϕ.

2.2 Propositional Satisfiability with the DPLL Algorithm

Most state-of-the-art SAT procedures are evolutions of the DPLL procedure (Davis &Putnam, 1960; Davis et al., 1962). A high-level schema of a modern DPLL engine, adaptedfrom the one presented by Zhang and Malik (2002), is reported in Figure 1. The Booleanformula ϕ is in CNF (Conjunctive Normal Form); the assignment µ is initially empty, andit is updated in a stack-based manner.

In the main loop, decide next branch(ϕ, µ) chooses an unassigned literal l from ϕaccording to some heuristic criterion, and adds it to µ. (This operation is called decision,l is called decision literal and the number of decision literals in µ after this operation iscalled the decision level of l.) In the inner loop, deduce(ϕ, µ) iteratively deduces literals l

348


1. SatValue DPLL (formula ϕ, assignment µ) {2. while (1) {3. decide next branch(ϕ, µ);4. while (1) {5. status = deduce(ϕ, µ);6. if (status == sat)7. return sat;8. else if (status == conflict) {9. blevel = analyze conflict(ϕ, µ);10. if (blevel == 0) return unsat;11. else backtrack(blevel,ϕ, µ);12. }13. else break;14. }}}

Figure 1: Schema of a modern SAT solver engine based on DPLL.

deriving from the current assignment and updates ϕ and µ accordingly; this step is repeateduntil either µ satisfies ϕ, or µ falsifies ϕ, or no more literals can be deduced, returning sat,conflict and unknown respectively. (The iterative application of Boolean deduction steps indeduce is also called Boolean Constraint Propagation, BCP.) In the first case, DPLL returnssat. If the second case, analyze conflict(ϕ, µ) detects the subset η of µ which causedthe conflict (conflict set) and the decision level blevel to backtrack. If blevel == 0,then a conflict exists even without branching, so that DPLL returns unsat. Otherwise,backtrack(blevel, ϕ, µ) adds the clause ¬η to ϕ (learning) and backtracks up to blevel(backjumping), updating ϕ and µ accordingly. In the third case, DPLL exits the inner loop,looking for the next decision.

Notably, modern DPLL implementations implement techniques, like the two-watched-literal scheme, which allow for extremely efficient handling of BCP (Moskewicz et al., 2001;Zhang & Malik, 2002). Old versions of DPLL used to implement also the Pure-Literal Rule(PLR) (Davis et al., 1962): when one proposition occurs only positively (resp. negatively) inthe formula, it can be safely assigned to true (resp. false). Modern DPLL implementations,however, often do not implement it anymore due to its computational cost. For a muchdeeper description of modern DPLL-based SAT solvers, we refer the reader to the literature(e.g., Zhang & Malik, 2002).

3. The Basic Encoding

We borrow some notation from the Single Step Tableau (SST) framework (Massacci, 2000;Donini & Massacci, 2000). We represent uniquely states in M as labels σ, represented asnon empty sequences of integers 1.nr1

1 .nr22 . ... .nrk

k , s.t. the label 1 represents the root state,and σ.nr represents the n-th Rr-successor of σ (where r ∈ {1, . . . , m}). With a little abuseof notation, hereafter we may say “a state σ” meaning “a state labeled by σ”. We call alabeled formula a pair 〈σ, ψ〉, such that σ is a state label and ψ is a Km-formula, and we

349


call labeled subformulas of a labeled formula 〈σ, ψ〉 all the labeled formulas 〈σ, φ〉 such thatφ is a subformula of ψ.

Let A〈 , 〉 be an injective function which maps a labeled formula 〈σ, ψ〉, s.t. ψ is notin the form ¬φ, into a Boolean variable A〈σ, ψ〉. We conventionally assume that A〈σ, >〉 is> and A〈σ, ⊥〉 is ⊥. Let L〈σ, ψ〉 denote ¬A〈σ, φ〉 if ψ is in the form ¬φ, A〈σ, ψ〉 otherwise.Given a Km-formula ϕ, the encoder Km2SAT builds a Boolean CNF formula as follows: 3

Km2SAT (ϕ) def= A〈1, ϕ〉 ∧Def(1, ϕ) (1)

Def(σ, >) def= > (2)

Def(σ, ⊥) def= > (3)

Def(σ, Ai)def= > (4)

Def(σ, ¬Ai)def= > (5)

Def(σ, α) def= (L〈σ, α〉 → (L〈σ, α1〉 ∧ L〈σ, α2〉)) ∧Def(σ, α1) ∧Def(σ, α2) (6)

Def(σ, β) def= (L〈σ, β〉 → (L〈σ, β1〉 ∨ L〈σ, β2〉)) ∧Def(σ, β1) ∧Def(σ, β2) (7)

Def(σ, πr,j) def= (L〈σ, πr,j〉 → L〈σ.j, πr,j0 〉) ∧Def(σ.j, πr,j

0 ) (8)

Def(σ, νr) def=∧

for every

〈σ,πr,i〉

(((L〈σ, νr〉 ∧ L〈σ, πr,i〉) → L〈σ.i, νr

0 〉) ∧ Def(σ.i, νr0)

). (9)

Here by “πr,j” we mean that πr,j is the j-th distinct πr formula labeled by σ. Notice that(6) and (7) generalize to the case of n-ary ∧ and ∨ in the obvious way: if φ is

⊗ni=1 φi

s.t.⊗ ∈ {∧,∨}, then Def(σ, φ) def= (L〈σ, φ〉 →

⊗ni=1 L〈σ, φi〉) ∧

∧ni=1 Def(σ, φi). Although

conceptually trivial, this fact has an important practical consequence: in order to encode⊗ni=1 φi one needs adding only one Boolean variable rather than up to n−1, see Section 4.2.

Notice also that in rule (9) the literals of the type L〈σ, πr,i〉 are strictly necessary; in fact, theSAT problem must consider and encode all the possibly occuring states, but it can be thecase, e.g., that a πr,i formula occurring in a disjunction is assigned to false for a particularstate label σ (which, in SAT, corresponds to assign L〈σ, πr,i〉 to false). In this situation allthe labeled formulas regarding the state label σ.i are useless, in particular those generatedby the expansion of the ν formulas interacting with πr,i. 4

We assume that the Km-formulas are represented as DAGs (Direct Acyclic Graphs),so that to avoid the expansion of the same Def(σ, ψ) more than once. Then the variousDef(σ, ψ) are expanded in a breadth-first manner wrt. the tree of labels, that is, allthe possible expansions for the same (newly introduced) σ are completed before startingthe expansions for a different state label σ′, and different state label are expanded in theorder they are introduced (thus all the expansions for a given state are always handledbefore those of any deeper state). Moreover, following what done by Massacci (2000), weassume that, for each σ, the Def(σ, ψ)’s are expanded in the order: α/β, π, ν. Thus, eachDef(σ, νr) is expanded after the expansion of all Def(σ, πr,i)’s, so that Def(σ, νr) will

3. We say that the formula is in CNF because we represent clauses as implications, according to the notationdescribed at the beginning of Section 2.

4. Indeed, (9) is a finite conjunction. In fact the number of π-subformulas is obviously finite and Km

benefits of the finite-tree-model property (see, e.g., Pan et al., 2002; Pan & Vardi, 2003).

350


generate one clause ((L〈σ, νr〉 ∧L〈σ, πr,i〉) → L〈σ.i, νr0 〉) and one novel definition Def(σ.i, νr

0)for each Def(σ, πr,i) expanded. 5

Intuitively, it is easy to see that Km2SAT (ϕ) mimics the construction of an SST tableauexpansion (Massacci, 2000; Donini & Massacci, 2000). We have the following fact.

Theorem 1. A Km-formula ϕ is Km-satisfiable if and only if the corresponding Booleanformula Km2SAT (ϕ) is satisfiable.

The complete proof of Theorem 1 can be found in Appendix A.Notice that, due to (9), the number of variables and clauses in Km2SAT (ϕ) may grow

exponentially with depth(ϕ). This is in accordance to what was stated by Halpern andMoses (1992).

Example 3.1 (NNF). Let ϕnnf be (3A1 ∨3(A2 ∨A3)) ∧ 2¬A1 ∧ 2¬A2 ∧ 2¬A3. 6 Itis easy to see that ϕnnf is K1-unsatisfiable: the 3-atoms impose that at least one atom Ai

is true in at least one successor of the root state, whilst the 2-atoms impose that all atomsAi are false in all successor states of the root state. Km2SAT (ϕnnf ) is: 7

1. A〈1, ϕnnf 〉 (1)

2. ∧ ( A〈1, ϕnnf 〉 → (A〈1, 3A1∨3(A2∨A3)〉 ∧A〈1, 2¬A1〉 ∧A〈1, 2¬A2〉 ∧A〈1, 2¬A3〉) ) (6)

3. ∧ ( A〈1, 3A1∨3(A2∨A3)〉 → (A〈1, 3A1〉 ∨A〈1, 3(A2∨A3)〉) ) (7)

4. ∧ ( A〈1, 3A1〉 → A〈1.1, A1〉 ) (8)

5. ∧ ( A〈1, 3(A2∨A3)〉 → A〈1.2, A2∨A3〉 ) (8)

6. ∧ ( (A〈1, 2¬A1〉 ∧A〈1, 3A1〉) → ¬A〈1.1, A1〉 ) (9)

7. ∧ ( (A〈1, 2¬A2〉 ∧A〈1, 3A1〉) → ¬A〈1.1, A2〉 ) (9)

8. ∧ ( (A〈1, 2¬A3〉 ∧A〈1, 3A1〉) → ¬A〈1.1, A3〉 ) (9)

9. ∧ ( (A〈1, 2¬A1〉 ∧A〈1, 3(A2∨A3)〉) → ¬A〈1.2, A1〉 ) (9)

10. ∧ ( (A〈1, 2¬A2〉 ∧A〈1, 3(A2∨A3)〉) → ¬A〈1.2, A2〉 ) (9)

11. ∧ ( (A〈1, 2¬A3〉 ∧A〈1, 3(A2∨A3)〉) → ¬A〈1.2, A3〉 ) (9)

12. ∧ ( A〈1.2, A2∨A3〉 → (A〈1.2, A2〉 ∨A〈1.2, A3〉) ) (7)

After a run of Boolean constraint propagation (BCP), 3. reduces to the implicate disjunc-tion. If the first element A〈1, 3A1〉 is assigned to true, then by BCP we have a conflict on 4.and 6. If it is set to false, then the second element A〈1, 3(A2∨A3)〉 is assigned to true, andby BCP we have a conflict on 12. Thus Km2SAT (ϕnnf ) is unsatisfiable. 3

4. Optimizations

The basic encoding of Section 3 is rather naive, and can be much improved to many extents,in order to reduce the size of the output propositional formula, or to make it easier to solveby DPLL, or both. We distinguish two main kinds of optimizations:

5. In practice, even if the definition of Km2SAT is recursive, the Def expansions are performed grouped bystates. More precisely, all the Def(σ.n, ψ) expansions, for any formula ψ and every defined n, are donetogether (in the α/β, π, ν order above exposed) and necessarily after that all the Def(σ, ϕ) expansionshave been completed.

6. For K1-formulas we omit the box and diamond indexes, i.e., we write 2, 3 for 21, 31.7. In all examples we report at the very end of each line, i.e. after each clause, the number of the Km2SAT

encoding rule applied to generate that clause. We also drop the application of the rules (2), (3), (4)and (5).

351


Preprocessing steps, which are applied on the input modal formula before the encoding.Among them, we have Pre-conversion into BNF (Section 4.1), Atom Normalization(Section 4.2), Box Lifting (Section 4.3), and Controlled Box Lifting (Section 4.4).

On-the-fly simplification steps, which are applied to the Boolean formula under con-struction. Among them, we have On-the-fly Boolean Simplification and Truth Prop-agation Through Boolean Operators (Section 4.5) and Truth Propagation ThroughModal Operators (Section 4.6), On-the-fly Pure-Literal Reduction (Section 4.7), andOn-the-fly Boolean Constraint Propagation (Section 4.8).

We analyze these techniques in detail.

4.1 Pre-conversion into BNF

Many systems use to pre-convert the input Km-formulas into NNF (e.g., Baader et al.,1994; Massacci, 2000). In our approach, instead, we pre-convert them into BNF (like, e.g.,Giunchiglia & Sebastiani, 1996; Pan et al., 2002). For our approach, the advantage of thelatter representation is that, when one 2rψ occurs both positively and negatively (like, e.g.,in (2rψ ∨ ...) ∧ (¬2rψ ∨ ...) ∧ ...), then both occurrences of 2rψ are labeled by the sameBoolean atom A〈σ, 2rψ〉, and hence they are always assigned the same truth value by DPLL.With NNF, instead, the negative occurrence ¬2rψ is rewritten into 3r(nnf(¬ψ)), so thattwo distinct Boolean atoms A〈σ, 2r(nnf(ψ))〉 and A〈σ, 3r(nnf(¬ψ))〉 are generated; DPLL canassign them the same truth value, creating a hidden conflict which may require some extraBoolean search to reveal. 8

Example 4.1 (BNF). We consider the BNF variant of the ϕnnf formula of Example 3.1,ϕbnf = (¬2¬A1 ∨ ¬2(¬A2 ∧ ¬A3)) ∧ 2¬A1 ∧ 2¬A2 ∧ 2¬A3. As before, it is easy tosee that ϕbnf is K1-unsatisfiable. Km2SAT (ϕbnf ) is: 9

1. A〈1, ϕbnf 〉 (1)

2. ∧ ( A〈1, ϕbnf 〉 → (A〈1, (¬2¬A1∨¬2(¬A2∧¬A3))〉 ∧A〈1, 2¬A1〉 ∧A〈1, 2¬A2〉 ∧A〈1, 2¬A3〉)) (6)

3. ∧ ( A〈1, (¬2¬A1∨¬2(¬A2∧¬A3))〉 → (¬A〈1, 2¬A1〉 ∨ ¬A〈1, 2(¬A2∧¬A3)〉) ) (7)

4. ∧ ( ¬A〈1, 2¬A1〉 → A〈1.1, A1〉 ) (8)

5. ∧ ( ¬A〈1, 2(¬A2∧¬A3)〉 → ¬A〈1.2, (¬A2∧¬A3)〉 ) (8)

6. ∧ ( (A〈1, 2¬A1〉 ∧ ¬A〈1, 2¬A1〉) → ¬A〈1.1, A1〉 ) (9)

7. ∧ ( (A〈1, 2¬A2〉 ∧ ¬A〈1, 2¬A1〉) → ¬A〈1.1, A2〉 ) (9)

8. ∧ ( (A〈1, 2¬A3〉 ∧ ¬A〈1, 2¬A1〉) → ¬A〈1.1, A3〉 ) (9)

9. ∧ ( (A〈1, 2¬A1〉 ∧ ¬A〈1, 2(¬A2∧¬A3)〉) → ¬A〈1.2, A1〉 ) (9)

10. ∧ ( (A〈1, 2¬A2〉 ∧ ¬A〈1, 2(¬A2∧¬A3)〉) → ¬A〈1.2, A2〉 ) (9)

11. ∧ ( (A〈1, 2¬A3〉 ∧ ¬A〈1, 2(¬A2∧¬A3)〉) → ¬A〈1.2, A3〉 ) (9)

12. ∧ ( ¬A〈1.2, (¬A2∧¬A3)〉 → (A〈1.2, A2〉 ∨A〈1.2, A3〉) ) (7)

Unlike with the NNF formula ϕnnf in Example 3.1, Km2SAT (ϕbnf ) is found unsatisfiabledirectly by BCP. In fact, the unit-propagation of A〈1, 2¬A1〉 from 2. causes ¬A〈1, 2¬A1〉 in

8. Notice that this consideration holds for every representation involving both boxes and diamonds; werefer to NNF simply because it is the most popular of these representations.

9. Notice that the valid clause 6. can be dropped. See the explanation in Section 4.5.

352


3. to be false, so that one of the two (unsatisfiable) branches induced by the disjunctionis cut a priori. With ϕnnf , Km2SAT does not recognize 2¬A1 and 3A1 to be one thenegation of the other, so that two distinct atoms A〈1, 2¬A1〉 and A〈1, 3A1〉 are generated.Hence A〈1, 2¬A1〉 and A〈1, 3A1〉 cannot be recognized by DPLL to be one the negation ofthe other, s.t. DPLL may need exploring one Boolean branch more. 3

In the following we will assume the formulas are in BNF (although most of the opti-mizations which follow work also for other representations).

4.2 Normalization of Modal Atoms

One potential source of inefficiency for DPLL-based procedures is the occurrence in theinput formula of semantically-equivalent though syntactically-different modal atoms ψ′ andψ′′ (e.g., 21(A1 ∨ A2) and 21(A2 ∨ A1)), which are not recognized as such by Km2SAT .This causes the introduction of duplicated Boolean atoms A〈σ, ψ′〉 and A〈σ, ψ′′〉 and —muchworse— of duplicated subformulas Def(σ, ψ′) and Def(σ, ψ′′). This fact can have verynegative consequences, in particular when ψ′ and ψ′′ occur with negative polarity, becausethis causes the creation of distinct versions of the same successor states, and the duplicationof whole parts of the output formula.

Example 4.2. Consider the Km-formula (φ1 ∨¬21(A2 ∨A1))∧ (φ2 ∨¬21(A1 ∨A2))∧ φ3,s.t. φ1, φ2, φ3 are possibly-big Km-formulas. Then Km2SAT creates two distinct atomsA〈1, ¬21(A2∨A1)〉 and A〈1, ¬21(A1∨A2)〉 and two distinct formulas Def(1, ¬21(A2 ∨A1)) andDef(1, ¬21(A1 ∨A2)). The latter will cause the creation of two distinct states 1.1 and 1.2.Thus, the recursive expansion of all 21-formulas occurring positively in φ1, φ2, φ3 will beduplicated for these two states. 3

In order to cope with this problem, as done by Giunchiglia and Sebastiani (1996), weapply some normalization steps to modal atoms with the intent of rewriting as many aspossible syntactically-different but semantically-equivalent modal atoms into syntactically-identical ones. This can be achieved by a recursive application of some simple validity-preserving rewriting rules.

Sorting: modal atoms are internally sorted according to some criterion, so that atomswhich are identical modulo reordering are rewritten into the same atom (e.g., 2i(ϕ2∨ϕ1) and 2i(ϕ1 ∨ ϕ2) are both rewritten into 2i(ϕ1 ∨ ϕ2)).

Flattening: the associativity of ∧ and ∨ is exploited and combinations of ∧’s or ∨’s are“flattened” into n-ary ∧’s or ∨’s respectively (e.g., 2i(ϕ1 ∨ (ϕ2 ∨ ϕ3)) and 2i((ϕ1 ∨ϕ2) ∨ ϕ3) are both rewritten into 2i(ϕ1 ∨ ϕ2 ∨ ϕ3)).

Flattening has also the advantage of reducing the number of novel atoms introduced in theencoding, as a consequence of the fact noticed in Section 3. One possible drawback of thistechnique is that it can reduce the sharing of subformulas (e.g., with 2i((ϕ1 ∨ ϕ2) ∨ ϕ3)and 2i((ϕ1∨ϕ2)∨ϕ4), the common part is no more shared). However, we have empiricallyexperienced that this drawback is negligible wrt. the advantages of flattening.

353


4.3 Box Lifting

As second preprocessing the Km-formula can also be rewritten by recursively applying theKm-validity-preserving “box lifting rules”:

(2rϕ1 ∧2rϕ2) =⇒ 2r(ϕ1 ∧ ϕ2), (¬2rϕ1 ∨ ¬2rϕ2) =⇒ ¬2r(ϕ1 ∧ ϕ2). (10)

This has the potential benefit of reducing the number of πr formulas, and hence the numberof labels σ.i to take into account in the expansion of the Def(σ, νr)’s (9). We call liftingthis preprocessing.

Example 4.3 (Box lifting). If we apply the rules (10) to the formula of Example 4.1,then we have ϕbnflift = ¬2(¬A1 ∧ ¬A2 ∧ ¬A3) ∧ 2(¬A1 ∧ ¬A2 ∧ ¬A3). Consequently,Km2SAT (ϕbnflift) is:

1. A〈1, ϕbnflift〉 (1)

2. ∧ ( A〈1, ϕbnflift〉 → (¬A〈1, 2(¬A1∧¬A2∧¬A3)〉 ∧A〈1, 2(¬A1∧¬A2∧¬A3)〉) ) (6)

3. ∧ ( ¬A〈1, 2(¬A1∧¬A2∧¬A3)〉 → ¬A〈1.1, (¬A1∧¬A2∧¬A3)〉 ) (8)

4. ∧ (( A〈1, 2(¬A1∧¬A2∧¬A3)〉 ∧ ¬A〈1, 2(¬A1∧¬A2∧¬A3)〉) → A〈1.1, (¬A1∧¬A2∧¬A3)〉 ) (9)

5. ∧ ( ¬A〈1.1, (¬A1∧¬A2∧¬A3)〉 → (A〈1.1, A1〉 ∨A〈1.1, A2〉 ∨A〈1.1, A3〉) ) (7)

6. ∧ ( A〈1.1, (¬A1∧¬A2∧¬A3)〉 → (¬A〈1.1, A1〉 ∧ ¬A〈1.1, A2〉 ∧ ¬A〈1.1, A3〉) ). (6)

Km2SAT (ϕbnflift) is found unsatisfiable directly by BCP on clauses 1. and 2.. Only onesuccessor state (1.1) is considered. Notice that 3., 4., 5. and 6. are redundant, because 1.and 2. alone are unsatisfiable. 10 3

4.4 Controlled Box Lifting

One potential drawback of applying the lifting rules is that, by collapsing the formula(2rϕ1 ∧2rϕ2) into 2r(ϕ1 ∧ϕ2) and (¬2rϕ1 ∨¬2rϕ2) into ¬2r(ϕ1 ∧ϕ2), the possibility ofsharing box subformulas in the DAG representation of the input Km-formula is reduced.

In order to cope with this problem we provide an alternative policy for applying boxlifting, that is, to apply the rules (10) only when neither box subformula occurring in theimplicant in (10) has multiple occurrences. We call this policy controlled box lifting.

Example 4.4 (Controlled Box Lifting). We apply Controlled Box Lifting to the formula ofExample 4.1, then we have ϕbnfclift = (¬2¬A1∨¬2(¬A2∧¬A3)) ∧ 2¬A1∧2(¬A2∧¬A3)since the rules (10) are applied among all the box subformulas except for 2¬A1, which is

10. In our actual implementation, trivial cases like ϕbnflift are found to be unsatisfiable directly during theconstruction of the DAG representations, so their encoding is never generated.

354


shared. It follows that Km2SAT (ϕbnfclift) is:

1. A〈1, ϕbnfclift〉 (1)

2. ∧ ( A〈1, ϕbnfclift〉 → (A〈1, (¬2¬A1∨¬2(¬A2∧¬A3))〉 ∧A〈1, 2¬A1〉 ∧A〈1, 2(¬A2∧¬A3)〉 ) (6)

3. ∧ ( A〈1, (¬2¬A1∨¬2(¬A2∧¬A3))〉 → (¬A〈1, 2¬A1〉 ∨ ¬A〈1, 2(¬A2∧¬A3)〉) ) (7)

4. ∧ ( ¬A〈1, 2¬A1〉 → A〈1.1, A1〉 ) (8)

5. ∧ ( ¬A〈1, 2(¬A2∧¬A3)〉 → ¬A〈1.2, (¬A2∧¬A3)〉 ) (8)

6. ∧ ( (A〈1, 2¬A1〉 ∧ ¬A〈1, 2¬A1〉) → ¬A〈1.1, A1〉 ) (9)

7. ∧ ( (A〈1, 2(¬A2∧¬A3)〉 ∧ ¬A〈1, 2¬A1〉) → A〈1.1, (¬A2∧¬A3)〉 ) (9)

8. ∧ ( (A〈1, 2¬A1〉 ∧ ¬A〈1, 2(¬A2∧¬A3)〉) → ¬A〈1.2, A1〉 ) (9)

9. ∧ ( (A〈1, 2(¬A2∧¬A3)〉 ∧ ¬A〈1, 2(¬A2∧¬A3)〉) → A〈1.2, (¬A2∧¬A3)〉 ) (9)

10. ∧ ( A〈1.1, (¬A2∧¬A3)〉 → (¬A〈1.1, A2〉 ∧ ¬A〈1.1, A3〉) ) (6)

11. ∧ ( ¬A〈1.2, (¬A2∧¬A3)〉 → (A〈1.2, A2〉 ∨A〈1.2, A3〉) ) (7)

12. ∧ ( A〈1.2, (¬A2∧¬A3)〉 → (¬A〈1.2, A2〉 ∧ ¬A〈1.2, A3〉) ) (6)

Km2SAT (ϕbnfclift) is found unsatisfiable directly by BCP on clauses 1., 2. and 3.. Noticethat the unit propagation of A〈1, 2¬A1〉 and A〈1, 2(¬A2∧¬A3)〉 from 2. causes the implicatedisjunction in 3. to be false. 3

4.5 On-the-fly Boolean Simplification and Truth Propagation

A first straightforward on-the-fly optimization is that of applying recursively the standardrewriting rules for the Boolean simplification of the formula like, e.g.,

〈σ, ϕ〉 ∧ 〈σ, ϕ〉 =⇒ 〈σ, ϕ〉, 〈σ, ϕ〉 ∨ 〈σ, ϕ〉 =⇒ 〈σ, ϕ〉,〈σ, ϕ1〉 ∧ 〈σ, (ϕ1 ∨ ϕ2)〉 =⇒ 〈σ, ϕ1〉, 〈σ, ϕ1〉 ∨ 〈σ, (ϕ1 ∧ ϕ2)〉 =⇒ 〈σ, ϕ1〉,〈σ, ϕ〉 ∧ ¬〈σ, ϕ〉 =⇒ 〈σ,⊥〉, 〈σ, ϕ〉 ∨ ¬〈σ, ϕ〉 =⇒ 〈σ,>〉,...,

and for the propagation of truth/falsehood through Boolean operators like, e.g.,

¬〈σ,⊥〉 =⇒ 〈σ,>〉, ¬〈σ,>〉 =⇒ 〈σ,⊥〉,〈σ, ϕ〉 ∧ 〈σ,>〉 =⇒ 〈σ, ϕ〉, 〈σ, ϕ〉 ∧ 〈σ,⊥〉 =⇒ 〈σ,⊥〉,〈σ, ϕ〉 ∨ 〈σ,>〉 =⇒ 〈σ,>〉, 〈σ, ϕ〉 ∨ 〈σ,⊥〉 =⇒ 〈σ, ϕ〉,....

Example 4.5. If we consider the Km-formula ϕbnflift = ¬2(¬A1 ∧ ¬A2 ∧ ¬A3) ∧2(¬A1 ∧ ¬A2 ∧ ¬A3) of Example 4.3 and we apply the Boolean simplification rule 〈σ, ϕ〉 ∧¬〈σ, ϕ〉 =⇒ 〈σ,⊥〉, then 〈σ, ϕbnflift〉 is simplified into 〈σ,⊥〉. 3

One important subcase of on-the-fly Boolean simplification avoids the useless encodingof incompatible πr and νr formulas. In BNF, in fact, the same subformula 2rψ may occurin the same state σ both positively and negatively (like πr = ¬2rψ and νr = 2rψ). Ifso, Km2SAT labels both those occurrences of 2rψ with the same Boolean atom A〈σ, 2rψ〉,and produces recursively two distinct subsets of clauses in the encoding, by applying (8)to ¬2rψ and (9) to 2rψ respectively. However, the latter step (9) generates a valid clause(A〈σ, 2rψ〉 ∧ ¬A〈σ, 2rψ〉) → A〈σ.i, ψ〉, so that we can avoid generating it. Consequently, if

355


A〈σ.i, ψ〉 no more occurs in the formula, then Def(σ.i, ψ) should not be generated, as thereis no more need of defining 〈σ.i, ψ〉. 11

Example 4.6. If we apply this observation in the construction of the formulas of Examples4.1 and 4.4, we have the following facts:

• In the formula Km2SAT (ϕbnf ) of Example 4.1, clause 6. is valid and thus it is dropped.

• In the formula Km2SAT (ϕbnfclift) of Example 4.4, both valid clauses 6. and 9. aredropped, so that 12. is not generated. 3

Hereafter we assume that on-the-fly Boolean simplification is applied also in combinationwith the techniques described in the next sections.

4.6 On-the-fly Truth Propagation Through Modal Operators

Truth and falsehood —which can derive by the application of the techniques in Section 4.5,Section 4.7 and Section 4.8— may be propagated on-the-fly also though modal operators.First, for every σ, both positive and negative instances of 〈σ,2r>〉 can be safely simplifiedby applying the rewriting rule 〈σ,2r>〉 =⇒ 〈σ,>〉.

Second, we notice the following fact. When we have a positive occurrence of 〈σ,¬2r⊥〉for some σ (we suppose wlog. that we have only that πr-formula for σ), 12 by definition of(8) and (9) we have

Def(σ, ¬2r⊥) = (L〈σ, ¬2r⊥〉 → A〈σ.j, >〉) ∧Def(σ.j, >), (11)Def(σ, 2rψ) = ((L〈σ, 2rψ〉 ∧ L〈σ, ¬2r⊥〉) → L〈σ.j, ψ〉) ∧Def(σ.j, ψ) (12)

for some new label σ.j and for every 2rψ occurring positively in σ. Def(σ, ¬2r⊥) reducesto > because both A〈σ.j, >〉 and Def(σ.j, >) reduce to >. If at least another distinct π-formula ¬2rϕ occurs positively in σ, however, there is no need for the σ.j label in (11) and(12) to be a new label, and we can re-use instead the label σ.i introduced in the expansionof Def(σ, ¬2rϕ), as follows:

Def(σ, ¬2rϕ) = (L〈σ, ¬2rϕ〉 → L〈σ.i, ¬ϕ〉) ∧Def(σ.i, ¬ϕ). (13)

Thus (11) is dropped and, for every 〈σ,2rψ〉 occurring positively, we write:

Def(σ, 2rψ) = ((L〈σ, 2rψ〉 ∧ L〈σ, ¬2r⊥〉) → L〈σ.i, ψ〉) ∧Def(σ.i, ψ) (14)

instead of (12). (Notice the label σ.i introduced in (13) rather than the label σ.j of (11).)This is motivated by the fact that Def(σ, ¬2r⊥) forces the existence of at least one

successor of σ but imposes no constraints on which formulas should hold there, so thatwe can use some other already-defined successor state, if any. This fact has the importantbenefit of eliminating useless successor states from the encoding.

11. Here the “if” is due to the fact that it may be the case that A〈σ.i, ψ〉 is generated anyway from theexpansion of some other subformula, like, e.g., 2r(ψ ∨ φ). If this is the case, Def(σ.i, ψ) must begenerated anyway.

12. E.g., ¬2r⊥ may result from applying the steps of Section 4.1 and of Section 4.5 to ¬2r(2rA1∧3r¬A1).

356


Example 4.7. Let ϕ be the BNF K-formula:

(¬A1 ∨ ¬2A2) ∧ (A1 ∨ ¬2⊥) ∧ (¬A1 ∨A3) ∧ (¬A1 ∨ ¬A3) ∧ (A1 ∨2¬A4) ∧2A4.

ϕ is K-inconsistent, because the only possible assignment is {¬A1,¬2⊥, 2¬A4,2A4}, whichis K-inconsistent. Km2SAT (ϕ) is encoded as follows:

1. A〈1, ϕ〉 (1)

2. ∧ (A〈1, ϕ〉 → (A〈1, (¬A1∨¬2A2)〉 ∧A〈1, (A1∨¬2⊥)〉 ∧A〈1, (¬A1∨A3)〉∧A〈1, (A1∨2¬A4)〉 ∧A〈1, 2A4〉)) (6)

3. ∧ (A〈1, (¬A1∨¬2A2)〉 → (¬A〈1, A1〉 ∨ ¬A〈1, 2A2〉)) (7)

4. ∧ (A〈1, (A1∨¬2⊥)〉 → (A〈1, A1〉 ∨ ¬A〈1, 2⊥〉)) (7)

5. ∧ (A〈1, (¬A1∨A3)〉 → (¬A〈1, A1〉 ∨A〈1, A3〉)) (7)

6. ∧ (A〈1, (¬A1∨¬A3)〉 → (¬A〈1, A1〉 ∨ ¬A〈1, A3〉)) (7)

7. ∧ (A〈1, (A1∨2¬A4)〉 → (A〈1, A1〉 ∨A〈1, 2¬A4〉)) (7)

8. ∧ (¬A〈1, 2A2〉 → ¬A〈1.1, A2〉) (8)

9. ∧ ((A〈1, 2¬A4〉 ∧ ¬A〈1, 2A2〉) → ¬A〈1.1, A4〉) (9)

10. ∧ ((A〈1, 2A4〉 ∧ ¬A〈1, 2A2〉) → A〈1.1, A4〉) (9)

11. ∧ (¬A〈1, 2⊥〉 → ¬A〈1.1, ⊥〉) (8)

12. ∧ ((A〈1, 2¬A4〉 ∧ ¬A〈1, 2⊥〉) → ¬A〈1.1, A4〉) (9)

13. ∧ ((A〈1, 2A4〉 ∧ ¬A〈1, 2⊥〉) → A〈1.1, A4〉) (9)

Clause 11. is then simplified into>. (In a practical implementation it is not even generated.)Notice that in clauses 11., 12. and 13. it is used the label 1.1 of clauses 8., 9. and 10. ratherthan a new label 1.2. Thus, only one successor label is generated.

When DPLL is run on Km2SAT (ϕ), by BCP 1. and 2. are immediately satisfied and theimplicants are removed from 3., 4., 5., 6.. Thanks to 5. and 6., A〈1, A1〉 can be assigned onlyto false, which causes 3. to be satisfied and forces the assignment of the literals ¬A〈1, 2⊥〉,A〈1, 2¬A4〉 by BCP on 3. and 7. and hence of ¬A〈1.1, ⊥〉, ¬A〈1.1, A4〉 and A〈1.1, A4〉 by BCPon 12. and 13., causing a contradiction. 3

It is worth noticing that (14) is strictly necessary for the correctness of the encodingeven when another π-formula occurs in σ. (E.g., in Example 4.7, without 12. and 13. theformula Km2SAT (ϕ) would become satisfiable because A〈1, 2A2〉 could be safely be assignedto true by DPLL, which would satisfy 8., 9. and 10..)

Hereafter we assume that this technique is applied also in combination with the tech-niques described in Section 4.5 and in the next sections.

4.7 On-the-fly Pure-Literal Reduction

Another technique, evolved from that proposed by Pan and Vardi (2003), applies Pure-Literal Reduction (PLR) on-the-fly during the construction of Km2SAT (ϕ). When for alabel σ all the clauses containing atoms in the form A〈σ, ψ〉 have been generated, if some ofthem occurs only positively [resp. negatively], then it can be safely assigned to true [resp.to false], and hence the clauses containing A〈σ, ψ〉 can be dropped. 13 As a consequence,

13. In our actual implementation this reduction is performed directly within an intermediate data structure,so that these clauses are never generated.

357


some other atom A〈σ, ψ′〉 can become pure, so that the process is repeated until a fixpointis reached.

Example 4.8. Consider the formula ϕbnf of Example 4.1. During the construction ofKm2SAT (ϕbnf ), after 1.-8. are generated, no more clause containing atoms in the formA〈1.1, ψ〉 is to be generated. Then we notice that A〈1.1, A2〉 and A〈1.1, A3〉 occur only neg-atively, so that they can be safely assigned to false. Therefore, 7. and 8. can be safelydropped. Same discourse applies lately to A〈1.2, A1〉 and 9.. The resulting formula is foundinconsistent by BCP. (In fact, notice from Example 4.1 that the atoms A〈1.1, A2〉, A〈1.1, A3〉,and A〈1.2, A1〉 play no role in the unsatisfiability of Km2SAT (ϕbnf ).) 3

We remark the differences between PLR and the Pure-Literal Reduction technique pro-posed by Pan and Vardi (2003). In KBDD (Pan et al., 2002; Pan & Vardi, 2003), thePure-Literal Reduction is a preprocessing step which is applied to the input modal formula,either at global level (i.e. looking for pure-polarity primitive propositions for the whole for-mula) or, more effectively, at different modal depths (i.e. looking for pure-polarity primitivepropositions for the subformulas at the same nesting level of modal operators).

Our technique is much more fine-grained, as PLR is applied on-the-fly with a single-stategranularity, obtaining a much stronger reduction effect.

Example 4.9. Consider again the BNF Km-formula ϕbnf discussed in Examples 4.1 and4.8: ϕbnf = (¬2¬A1∨¬2(¬A2∧¬A3)) ∧ 2¬A1 ∧ 2¬A2 ∧ 2¬A3. It is immediate to seethat all primitive propositions A1, A2, A3 occur at every modal depth with both polarities,so that the technique of Pan and Vardi (2003) produces no effect on this formula. 3

4.8 On-the-fly Boolean Constraint Propagation

One major problem of the basic encoding of Section 3 is that it is “purely-syntactic”, thatis, it does not consider the possible truth values of the subformulas, and the effect of theirpropagation through the Boolean and modal connectives. In particular, Km2SAT applies(8) [resp. (9)] to every π-subformula [resp. ν-subformula], regardless the fact that the truthvalues which can be deterministically assigned to the labeled subformulas of 〈1, ϕ〉 mayallow for dropping some labeled π-/ν-subformulas, and thus prevent the need of encodingthem.

One solution to this problem is that of applying Boolean Constraint Propagation (BCP)on-the-fly during the construction of Km2SAT (ϕ), starting from the fact that A〈1, ϕ〉 mustbe true. If a contradiction is found, then Km2SAT (ϕ) is unsatisfiable, so that the formula isnot expanded any further, and the encoder returns the formula “⊥”. 14 When BCP allowsfor dropping one implication in (6)-(9) without assigning some of its implicate literals,namely L〈σ, ψi〉, then 〈σ, ψi〉 needs not to be defined, so that Def(σ, ψi) must not beexpanded. 15 Importantly, dropping Def(σ, πr,j) for some π-formula 〈σ, πr,j〉 preventsgenerating the label σ.j (8) and all its successor labels σ.j.σ′ (corresponding to the subtreeof states rooted in σ.j), so that all the corresponding labeled subformulas are not encoded.

14. For the sake of compatibility with standard SAT solvers, our actual implementation returns the formulaA1 ∧ ¬A1.

15. Here we make the same consideration as in Footnote 11: if L〈σ.j, ψ〉 is generated also from the expansionof some other subformula, (e.g., 2r(ψ ∨ φ)), then (another instance of) Def(σ.i, ψ) must be generatedanyway.

358


Example 4.10. Consider Example 4.1, and suppose we apply on-the-fly BCP. During theconstruction of 1., 2. and 3. in Km2SAT (ϕbnf ), the atoms A〈1, ϕbnf 〉, A〈1, (¬2¬A1∨¬2(¬A2∧¬A3))〉,A〈1, 2¬A1〉, A〈1, 2¬A2〉 and A〈1, 2¬A3〉 are deterministically assigned to true by BCP. Thiscauses the removal from 3. of the first-implied disjunct ¬A〈1, 2¬A1〉, so that there is no needto generate Def(1, ¬2¬A1), and hence label 1.1. is not defined and 4. is not generated.While building 5., A〈1.2, (¬A2∧¬A3)〉, is unit-propagated. As label 1.1. is not defined, 6.,7. and 8. are not generated. Then during the construction of 5., 9., 10., 11. and 12., byapplying BCP a contradiction is found, so that Km2SAT (ϕ) is ⊥.

An analogous situation happens with ϕbnflift in Example 4.3: while building 1. and 2.a contradiction is found by BCP, s.t. Km2SAT returns ⊥ without expanding the formulaany further. Same discourse holds for ϕbnfclift in Example 4.4: while building 1., 2. and 3.a contradiction is found by BCP, s.t. Km2SAT returns ⊥ without expanding the formulaany further. 3

4.9 A Paradigmatic Example: Halpern & Moses Branching Formulas.

Among all optimizations described in this Section 4, on-the-fly BCP is by far the mosteffective. In order to better understand this fact, we consider as a paradigmatic examplethe branching formulas ϕK

h by Halpern and Moses (1992, 1995) (also called “k branch n”in the set of benchmark formulas proposed by Heuerding and Schwendimann, 1996) andtheir unsatisfiable version (called “k branch p” in the above-mentioned benchmark suite).

Given a single modality 2, an integer parameter h, and the primitive propositionsD0, . . . , Dh+1, P1, . . . , Ph, the formulas ϕK

h are defined as follows: 16

ϕKh

def= D0 ∧ ¬D1 ∧h∧

i=0

2i(depth ∧ determined ∧ branching), (15)

depthdef=

h+1∧

i=1

(Di → Di−1), (16)

determineddef=

h∧

i=1

(Di →

(( Pi → 2(Di → Pi)) ∧(¬Pi → 2(Di → ¬Pi))

) ), (17)

branchingdef=

h−1∧

i=0

((Di ∧ ¬Di+1) →

(3(Di+1 ∧ ¬Di+2 ∧ Pi+1) ∧3(Di+1 ∧ ¬Di+2 ∧ ¬Pi+1)

) ). (18)

A conjunction of the formulas depth, determined and branching is repeated at everynesting level of modal operators (i.e. at every depth): depth captures the relation betweenthe Di’s at every level; determined states that, if Pi is true [false] in a state at depth ≥ i,then it is true [false] in all the successor states of depth ≥ i; branching states that, for everynode at depth i, it is possible to find two successor states at depth i + 1 such that Pi+1 istrue in one and false in the other. For each value of the parameter h, ϕK

h is K-satisfiable,and every Kripke model M that satisfies it has at least 2h+1− 1 states. In fact, ϕK

h is buildin such a way to force the construction of a binary-tree Kripke model of depth h+1, each of

16. For the sake of better readability, here we adopt the description given by Halpern and Moses (1992)without converting the formulas into BNF. This fact does not affect the discussion.

359


whose leaves encodes a distinct truth assignment to the primitive propositions P1, . . . , Ph,whilst each Di is true in all and only the states occurring at a depth ≥ i in the tree (andthus denotes the level of nesting).

The unsatisfiable counterpart formulas proposed by Heuerding and Schwendimann (1996)(whose negations are the valid formulas called k branch p in the previously-mentionedbenchmark suite, which are exposed in more details in Section 5.1.1) are obtained by con-joining to (15) the formula:

2hPbh3c+1 (19)

(where bxc is the integer part of x) which forces the atom Pbh3c+1 to be true in all depth-h

states of the candidate Kripke model, which is incompatible with the fact that the remainingspecifications say that it has to be false in half depth-h states. 17

These formulas are very pathological for many approaches (Giunchiglia & Sebastiani,2000; Giunchiglia, Giunchiglia, Sebastiani, & Tacchella, 2000; Horrocks et al., 2000). In par-ticular, before introducing on-the-fly BCP, they used to be the pet hate of our Km2SAT ap-proach, as they caused the generation of huge Boolean formulas. In fact, due to branching(18), ϕK

h contains 2h 3-formulas (i.e., π-formulas) at every depth. Therefore, the Km2SATencoder of Section 3 has to consider 1 + 2h + (2h)2 + ... + (2h)h+1 = ((2h)h+2− 1)/(2h− 1)distinct labels, which is about hh+1 times the number of those labeling the states whichare actually needed. (None of the optimizations of Sections 4.1-4.7 is of any help withthese formulas, because neither BNF encoding nor atom normalization causes any sharingof subformulas, the formulas are already in lifted form, and no literal occurs pure. 18)

This pathological behavior can be mostly overcome by applying on-the-fly-BCP, becausesome truth values can be deterministically assigned to some subformulas of ϕK

h by on-the-fly-BCP, which prevent encoding some or even most 2/3-subformulas.

In fact, consider the branching and determined formulas occurring in ϕKh at a generic

depth d ∈ {0...h}, which determine the states at level d in the tree. As in these statesD0, ..., Dd are forced to be true and Dd+1, ..., Dh+1 are forced to be false, then all but thed-th conjunct in branching (all conjuncts if d = h) are forced to be true and thus theycould be dropped. Therefore, only 2 3-formulas per non-leaf level could be consideredinstead, causing the generation of 2h+1 − 1 labels overall. Similarly, in all states at level dthe last h−d conjuncts in determined are forced to be true and could be dropped, reducingsignificantly the number of 2-formulas to be considered.

It is easy to see that this is exactly what happens by applying on-the-fly-BCP. In fact,suppose that the construction of Km2SAT (ϕK

h ) has reached depth d (that is, the pointwhere for every state σ at level d, the Def(σ, α)’s and Def(σ, β)’s are expanded but noDef(σ, π) and Def(σ, ν) is expanded yet). Then, BCP deterministically assigns true to theliterals L〈σ, D0〉, ..., L〈σ, Dd〉 and false to L〈σ, Dd+1〉, ..., L〈σ, Dh+1〉, which removes all but oneconjuncts in branching, so that only two Def(σ, π)’s out of 2h ones are actually expanded;similarly, the last h − d conjuncts in determined are removed, so that the correspondingDef(σ, ν)’s are not expanded.

17. Heuerding and Schwendimann do not explain the choice of the index “bh3c + 1”. We understand that

also other choices would have done the job.18. More precisely, only one literal, ¬Dh+1, occurs pure in branching, but assigning it plays no role in

simplifying the formula.

360


1

10

100

1000

10000

100000

1e+06

1e+07

1e+08

5 10 15 20

BNF-liftBNF-nolift

BNF-lift-plrBNF-nolift-plr

BNF-lift-bcpBNF-nolift-bcp

BNF-lift-plr-bcpBNF-nolift-plr-bcp

(a) k branch n, var#

1

10

100

1000

10000

100000

1e+06

1e+07

1e+08

5 10 15 20

BNF-liftBNF-nolift




(b) k branch n, clause#

0.01

0.1

1

10

100

1000

5 10 15 20

BNF-liftBNF-nolift




(c) k branch n, cpu time

1

10

100

1000

10000

100000

1e+06

1e+07

1e+08

5 10 15 20

BNF-liftBNF-nolift




(d) k branch p, var#

1

10

100

1000

10000

100000

1e+06

1e+07

1e+08

5 10 15 20

BNF-liftBNF-nolift




(e) k branch p, clause#

0.01

0.1

1

10

100

1000

5 10 15 20

BNF-liftBNF-nolift




(f) k branch p, cpu time

Figure 2: Empirical analysis of Km2SAT on Halpern & Moses formulas wrt. the depthparameter h, for different options of the encoder. 1st row: k branch n, corre-sponding to Km2SAT (ϕK

h ), formulas (satisfiable); 2nd row: k branch p, corre-sponding to Km2SAT (ϕK

h ∧ 2hPbh3c+1), formulas (unsatisfiable). Left: number

of Boolean variables; center: number of clauses; right: total CPU time requestedto encoding+solving (where the solving step has been performed through Rsat).See Section 5 for more technical details.

361


As far as the unsatisfiable version Km2SAT (ϕKh ∧ 2hPbh

3c+1) is concerned, when the

expansion reaches depth h, thanks to (19), L〈σ, Pbh3 c+1

〉 is generated and deterministically

assigned to true by BCP for every depth-h label σ; thanks to determined and branching,BCP assigns all literals L〈σ, P1〉, ..., L〈σ, Ph〉 deterministically, so that L〈σ, Pbh

3 c+1〉 is assigned

to false for 50% of the depth-h labels σ. This causes a contradiction, so that the encoderstops the expansion and returns ⊥.

Figure 2 shows the growth in size and the CPU time required to encode and solveKm2SAT (ϕK

h ) (1st row) and Km2SAT (ϕKh ∧ 2hPbh

3c+1) (2nd row) wrt. the parameter h,

for eight combinations of the following options of the encoder: with and without box-lifting,with and without on-the-fly PLR, with and without on-the-fly BCP. (Notice the log scaleof the y axis.) In Figure 2(d) the plots of the four versions “-xxx-bcp” (with on-the-flyBCP) coincide with the line of value 1 (i.e, one variable) and in Figure 2(e) they coincidewith an horizontal line of value 2 (i.e, two clauses), corresponding to the fact that the1-variable/2-clause formula A1 ∧ ¬A1 is returned (see Footnote 14).

We notice a few facts. First, for both formulas, the eight plots always collapse into twogroups of overlapping plots, representing the four variants with and without on-the-fly BCPrespectively. This shows that box-lifting and on-the-fly PLR are almost irrelevant in theencoding of these formulas, causing just little variations in the time required by the encoder(Figures 2(c) and 2(f)); notice that enabling on-the-fly PLR alone permits to encode (butnot to solve) only one problem more wrt. the versions without both on-the-fly PLR andBCP. Second, the four versions with on-the-fly-BCP always outperform of several ordersmagnitude these without this option, in terms of both size of encoded formulas and of CPUtime required to encode and solve them. In particular, in the case of the unsatisfiablevariant (Figure 2, second row) the encoder returns the ⊥ formula, so that no actual workis required to the SAT solver (the plot of Figure 2(f) refers only to encoding time).

5. Empirical Evaluation

In order to verify empirically the effectiveness of this approach, we have performed a very-extensive empirical test session on about 14,000 Km/ALC formulas. We have implementedthe encoder Km2SAT in C++, with some flags corresponding to the optimizations ex-posed in the previous section: (i) NNF/BNF, performing a pre-conversion into NNF/BNFbefore the encoding; (ii) lift/ctrl.lift/nolift, performing respectively Box Lifting,Controlled Box Lifting or no Box Lifting before the encoding; (iii) plr if on-the-fly PureLiteral Reduction is performed and (iv) bcp if on-the-fly Boolean Constraint Propagationis performed. The techniques introduced in Section 4.2, Section 4.5 and Section 4.6 arehardwired in the encoder. Moreover, as pre-conversion into BNF almost always producessmaller formulas than NNF, we have set the BNF flag as a default.

In combination with Km2SAT we have tried several SAT solvers on our encoded for-mulas (including Zchaff 2004.11.15, Siege v4, BerkMin 5.6.1, MiniSat v1.13, SAT-Elite v1.0, SAT-Elite GTI 2005 submission 19, MiniSat 2.0 061208 and Rsat 1.03).

19. In the preliminary evaluation of the available SAT solvers we have also tried SAT-Elite as a preprocessorto reduce the size of the SAT formula generated by Km2SAT without the bcp option before to solve it.However, even if the preprocessing can signinificantly reduce the size of the formula, it has turned out

362


After a preliminary evaluation and further intensive experiments we have selected Rsat 1.03(Pipatsrisawat & Darwiche, 2006), because it produced the best overall performances onour benchmark suites (although the performance gaps wrt. other SAT tools, e.g. MiniSat2.0, were not dramatic).

We have downloaded the available versions of the state-of-the-art tools for Km-satisfiability.After an empirical evaluation 20 we have selected Racer 1-7-24 (Haarslev & Moeller, 2001)and *SAT 1.3 (Tacchella, 1999) as the best representatives of the tableaux/DPLL-basedtools, Mspass v 1.0.0t.1.3 (Hustadt & Schmidt, 1999; Hustadt et al., 1999) 21 as thebest representative of the FOL-encoding approach, KBDD (unique version) (Pan et al.,2002; Pan & Vardi, 2003) 22 as the representative of the automata-theoretic approach. Norepresentative of the CSP-based and of the inverse method approaches could be used. 23

Notice that all these tools but Racer are experimental tools, as far as Km2SAT which isa prototype, and many of them (e.g. *SAT and KBDD) are no longer maintained.

Finally, as representative of the QBF-encoding approach, we have selected the K-QBFtranslator (Pan & Vardi, 2003) combined with the sKizzo version 0.8.2 QBF solver(Benedetti, 2005), which turned out to be by far 24 the best QBF solver on our bench-marks among the freely-available QBF solvers from the QBF2006 competition (Narizzano,Pulina, & Tacchella, 2006). (In our evaluation we have considered the tools : 2clsQ, SQBF,preQuantor—i.e. preQuel +Quantor— Quantor 2.11, and Semprop 010604.)

All tests presented in this section have been performed on a two-processor Intel Xeon3.0GHz computer, with 1 MByte Cache each processor, 4 GByte RAM, with Red HatLinux 3.0 Enterprise Server, where four processes can run in parallel. When reporting theresults for one Km2SAT +Rsat version, the CPU times reported are the sums of both

that this preprocessing is too time-expensive and that the overall time spent for preprocessing and thensolving the reduced problem is higher than that solving directly the original encoded SAT formula.

20. As we did for the selection of the SAT solver, in order to select the tools to be used in the empiricalevaluation, we have performed a preliminary evaluation on the smaller benchmark suites (i.e. the LWBand, sometimes, the TANCS 2000 ones; see later). Importantly, from this preliminary evaluation Racerturned out to be definitely more efficient than FaCT++, being able to solve more problems in less time.Also, in order to meet the reviewers’ suggestions, we repeated this preliminary evaluation with the latestversions of FaCT++ (v1.2.3, March 5th, 2009) and the same version of Racer used in this paper.In this evaluation Racer solves ten more problems than FaCT++ on the LWB benchmark, and overthan one hundred of problems more than FaCT++ on the whole TANCS 2000 suite. Also on 2m-CNFrandom problems Racer outperforms FaCT++. (We include in the online appendix the plots of thiscomparison between Racer and FaCT++.)

21. We have run Mspass with the options -EMLTranslation=2 -EMLFuncNary=1 -Sorts=0

-CNFOptSkolem=0 -CNFStrSkolem=0 -Select=2 -Split=-1 -DocProof=0 -PProblem=0 -PKept=0

-PGiven=0, which are suggested for Km-formulas in the Mspass README file. We have also triedother options, but the former gave the best performances.

22. KBDD has been recompiled to be run with an increased internal memory bound of 1 GB.23. At the moment K Kis not freely available, and we failed in the attempt of obtaining it from the authors.

KCSP is a prolog piece of software, which is difficult to compare in performances wrt. other optimizedtools on a common platform; moreover, KCSP is no more maintained since 2005, and it is not com-petitive wrt. state-of-the-art tools (Brand, 2008). Other tools like leanK, 2KE, LWB, Kris are notcompetitive with the ones listed above (Horrocks et al., 2000). KSAT (Giunchiglia & Sebastiani, 1996,2000; Giunchiglia et al., 2000) has been reimplemented into *SAT.

24. Unlike with the choice of SAT solver, the performance gaps from the best choice and the others werevery significant: e.g., in the LWB benchmark (see later), sKizzo was able to solve nearly 90 problemsmore than its best QBF competitor.

363


the encoding and Rsat solving times. When reporting the results for K-QBF +sKizzo,the CPU times reported are only due to sKizzo because the time spent by the K-QBFconverter is negligible.

We anticipate that, for all formulas of all benchmark suites, all tools under test —i.e.all the variants of Km2SAT +Rsat and all the state-of-the-art Km-satisfiability solvers—agreed on the satisfiability/unsatisfiability result when terminating within the timeout.

Remark 1. Due to the big number of empirical tests performed and to the huge amountof data plotted, and due to limitations in size, and in order to to make the plots clearlydistinguishable in the figures, we have limited the number of plots included in the followingpart of the paper, considering only the most meaningful ones and those regarding the mostchallenging benchmark problems faced. For the sake of the reader’s convenience, however,full-size versions of all plots and many other plots regarding the not-exposed results (alsoon the easier problems), are available in the online appendix, together with the files withall data. When discussing the empirical evaluation we may include in our considerationsalso these results.

5.1 Test Description

We have performed our empirical evaluation on three different well-known benchmarkssuites of Km/ALC problems: the LWB (Heuerding & Schwendimann, 1996), the ran-dom 2m-CNF (Horrocks et al., 2000; Patel-Schneider & Sebastiani, 2003) and the TANCS2000 (Massacci & Donini, 2000) benchmark suites. We are not aware of any other publicly-available benchmark suite on Km/ALC-satisfiability from the literature. These three groupsof benchmark formulas allow us to test the effectiveness of our approach on a large numberof problems of various sizes, depths, hardness and characteristics, for a total amount ofabout 14,000 formulas.

In particular, these benchmark formulas allow us to fairly evaluate the different toolsboth on the modal component and on the Boolean component of reasoning which are in-trinsic in the Km-satisfiability problem, as we discuss later in Section 5.4.

In the following we describe these three benchmark suites.

5.1.1 The LWB Benchmark Suite

As a first group of benchmark formulas we used the LWB benchmark suite used in acomparison at Tableaux’98 (Heuerding & Schwendimann, 1996). It consists of 9 classes ofparametrized formulas (each in two versions, provable “ p” or not-provable “ n” 25), for atotal amount of 378 formulas. The parameter allows for creating formulas of increasing sizeand difficulty.

The benchmark methodology is to test formulas from each class, in increasing difficulty,until one formula cannot be solved within a given timeout, 1000 seconds in our tests. 26

The result from this class is the parameter’s value of the largest (and hardest) formula thatcan be solved within the time limit. The parameter ranges only from 1 to 21 so that, if a

25. Since all tools check Km-(un)satisfiability, all formulas are negated, so that the negations of the provableformulas are checked to be unsatisfiable, whilst the negation of the other formulas are checked to besatisfiable.

26. We also set a 1 GB file-size limit for the encoding produced by Km2SAT .

364


system can solve all 21 instances of a class, the result is given as 21. For a discussion on thisbenchmark suite, we refer the reader to the work of Heuerding and Schwendimann (1996)and of Horrocks et al. (2000).

5.1.2 The Random 2m-CNF Benchmark Suite

As a second group of benchmark formulas, we have selected the random 2m-CNF testbeddescribed by Horrocks et al. (2000), and Patel-Schneider and Sebastiani (2003). This isa generalization of the well-known random k-SAT test methods, and is the final result ofa long discussion in the communities of modal and description logics on how to to obtainsignificant and flawless random benchmarks for modal/description logics (Giunchiglia &Sebastiani, 1996; Hustadt & Schmidt, 1999; Giunchiglia et al., 2000; Horrocks et al., 2000;Patel-Schneider & Sebastiani, 2003).

In the 2m-CNF test methodology, a 2m-CNF formula is randomly generated accordingto the following parameters:

• the (maximum) modal depth d;

• the number of top-level clauses L;

• the number of literal per clause clauses k;

• the number of distinct propositional variables N ;

• the number of distinct box symbols m;

• the percentage p of purely-propositional literals in clauses occurring at depth < d, s.t.each clause of length k contains on average p · k randomly-picked Boolean literals andk − p · k randomly-generated modal literals 2rψ, ¬2rψ. 27

(We refer the reader to the works of Horrocks et al., 2000, and Patel-Schneider & Sebastiani,2003 for a more detailed description.)

A typical problem set is characterized by fixed values of d, k, N , m, and p: L isvaried in such a way as to empirically cover the “100% satisfiable / 100% unsatisfiable”transition. In other words, many problems with the same values of d, k, N, m, and p but anincreasing number of clauses L are generated, starting from really small, typically satisfiableproblems (i.e. with a probability of generating a satisfiable problem near to one) to hugeproblems, where the increasing interactions among the numerous clauses typically leadsto unsatisfiable problems (i.e. it makes the probability of generating satisfiable problemsconverging to zero). Then, for each tuple of the five values in a problem set, a certainnumber of 2m-CNF formulas are randomly generated, and the resulting formulas are givenin the input to the procedure under test, with a maximum time bound. The fraction offormulas which were solved within a given timeout, and the median/percentile values ofCPU times are plotted against the ratio L/N . Also, the fraction of satisfiable/unsatisfiableformulas is plotted for a better understanding.

27. More precisely, the number of Boolean literals in a clause is bp · kc (resp. dp · ke) with probabilitydp · ke − p · k (resp. 1 − (dp · ke − p · k)). Notice that typically the smaller is p, the harder is theproblem (Horrocks et al., 2000; Patel-Schneider & Sebastiani, 2003).

365


Following the methodology proposed by Horrocks et al. (2000), and by Patel-Schneiderand Sebastiani (2003), we have fixed m = 1, k = 3 and 100 samples per point in all tests,and we have selected two groups: an “easier” one, with d = 1, p = 0.5, N = 6, 7, 8, 9,L/N = 10..60, and a “harder” one, with d = 2, p = 0.6, 0.5, N = 3, 4, L/N = 30..150 withp = 0.6 and L/N = 50..140 with p = 0.5, varying the L/N ratio in steps of 5, for a totalamount of 13,200 formulas.

In each test, we imposed a timeout of 500 seconds per sample 28 and we calculated thenumber of samples which were solved within the timeout, and the 50%th and 90%th per-centiles of CPU time. 29 In order to correlate the performances with the (un)satisfiability ofthe sample formulas, in the background of each plot we also plot the satisfiable/unsatisfiableratio.

5.1.3 The TANCS 2000 Benchmark Suite

Finally, as a third group of benchmark formulas, we used the MODAL PSPACE divisionbenchmark suite used in the comparison at TANCS 2000 (Massacci & Donini, 2000). Itcontains both satisfiable and unsatisfiable formulas, with scalable hardness. In this bench-mark suite, which we call TANCS 2000, the formulas are constructed by translating QBFformulas into K using three translation schemas, namely the Schmidt-Schauss-Smolka trans-lation (240 problems with many different depths, from 19 to 112), the Ladner translation(240 problems, again with depths in the same range 19 – 112), and the Halpern translation(56 problems of depth among: 20, 28, 40, 56, 80 or 112) (Massacci & Donini, 2000). Asdone by Massacci and Donini, we call these classes easy, medium and hard respectively.

All formulas from each class are tested within a timeout of 1000 seconds. 30 For eachclass, we report the number of solved formulas (X axis) and the total (cumulative) CPUtime spent for solving these formulas (Y axes). For each class the results are plotted sortingthe solved problems from the easiest one to the hardest one.

5.2 An Empirical Comparison of the Different Variants of Km2SAT

We have first evaluated the various variants of the encoding in combination with Rsat. Inorder to avoid considering too many combinations of the flags, we have considered the BNFformat, and we have grouped plr and bcp into one parameter plr-bcp, restricting thusour investigation to 6 combinations: BNF, lift/ctrl.lift/nolift, and plr-bcp on/off.(We recall that the techniques introduced in Section 4.2, Section 4.5 and Section 4.6 arehardwired in the encoder.) Here we expose and analyze the results wrt. the three differentsuites of benchmark problems.

28. With also a 512 MB file-size limit for the encoding produced by Km2SAT .29. Due to the lack of space and for the sake of clarity we won’t include in the paper the 90%th percentiles

plots. Further, for the same reasons, we’ll skip to report the plots regarding some of the easiest class ofthe benchmark suite (e.g. those with d = 1 and lower values of N). All of these plots, however, can befound in the online appendix.

30. We also set a 1 GB file-size limit for the encoding produced by Km2SAT .

366


5.2.1 Results on the LWB Benchmark Suite

The results on the LWB benchmark suite are summarized in Table 1 and Figure 3.Table 1(a) reports in the left block the indexes of the hardest formulas encoded within

the file-size limit and, in the right block, those of the hardest formulas solved within thetimeout by Rsat; Table 1(b) reports the numbers of variables and clauses of Km2SAT (ϕ),referring to the hardest formulas solved within the timeout by Rsat (i.e., those reportedin the right block of Table 1(a)). For instance, the BNF-ctrl.lift-plr-bcp encoding ofk dum n(21) contains 11·106 variables and 14·106 clauses; it is the hardest k dum n problemsolved by Rsat with BNF-ctrl.lift-plr-bcp and it is the first which is not solved withBNF-ctrl.lift.

Looking at the numbers of cases solved in Table 1(a), we notice that the introduction ofthe on-the-fly Pure Literal Reduction and Boolean Constraint Propagation optimizationsis really effective and produces a consistent performance enhancement (the effect of theseoptimizations is eye-catching in the branching formulas k branch * – see Section 4.9 – andin the k path * formulas). We also notice that lift sometimes introduces some slightfurther improvement.

The view of Tables 1(a) and 1(b) hides the actual CPU times required to encode andsolve the problems. Small gaps in the numbers of Table 1(a) may correspond to big gaps inCPU time. In order to analyze also this aspect, in Figure 3 we plotted the total cumulativeamount of CPU time spent by all the variants of Km2SAT +Rsat to solve all the problemsof the LWB benchmark, sorted by hardness. For this plot, we also considered three moreoptions —BNF, lift/ctrl.lift/nolift, with plr on and bcp off— so that to evaluatealso the effect of plr and bcp separately. We notice that the plots are clearly clusteredinto three groups of increasing performance: BNF-*, BNF-*-plr, and BNF-*-plr-bcp., “*”representing the three options lift/ctrl.lift/nolift. This highlights the fact that onthis suite on-the-fly Pure Literal Reduction significantly improves the performances, thaton-the-fly Boolean Constraint Propagation introduces drastic improvements, and that thevariations due to Box Lifting are minor wrt. the other two optimizations.

Overall, the configuration BNF-lift-plr-bcp turns out to be the best performer on thissuite, with a tiny advantage wrt. BNF-ctrl.lift-plr-bcp.

5.2.2 Results on the Random 2m-CNF Benchmark Suite

The results on the random 2m-CNF benchmark suite are reported in Figures 4 and 5.In Figure 4 we report the 50%-percentile CPU times required to encode and solve the

formulas by the different Km2SAT +Rsat variants for the hardest benchmarks problems.We don’t report the percentage of solved problems since it is always 100%, i.e. Km2SAT+Rsat terminates within the timeout for every problem in the benchmark suite.

The tests with depth d = 1 (see the results on the hardest problems of the class in thefirst row of Figure 4) are simply too easy for Km2SAT +Rsat (but not for its competitors,see Section 5.3) which solved every sample formula in less than 1 second. Although thetests exposed in the second and third row of Figure 4 are more challenging, they are allsolved within the timeout as well. We have noticed also that the results are rather regular,since there are no big gaps between 50%- and 90%-percentile values.

367


Km2SAT , encoded Km2SAT + Rsat, solvedplr-bcp plr-bcp

lifting no yes ctrl no yes ctrl no yes ctrl no yes ctrl

k branch n 4 4 4 18 18 18 4 4 4 17 17 17k branch p 4 4 4 18 18 18 4 4 4 18 18 18k d4 n 8 8 8 8 9 8 8 8 8 8 8 8k d4 p 14 14 14 14 14 14 14 14 14 14 14 14k dum n 20 20 20 21 21 21 20 20 20 21 21 21k dum p 19 19 19 21 21 21 18 18 18 21 21 21k grz n 21 21 21 21 21 21 21 21 21 21 21 21k grz p 21 21 21 21 21 21 21 21 21 21 21 21k lin n 21 21 21 21 21 21 21 21 21 21 21 21k lin p 21 21 21 21 21 21 21 21 21 21 21 21k path n 7 7 7 14 15 14 7 7 7 13 14 13k path p 8 8 8 15 16 15 8 8 8 15 16 15k ph n 21 21 21 21 21 21 21 21 21 21 21 21k ph p 21 21 21 21 21 21 10 11 10 10 10 11k poly n 21 21 21 21 21 21 21 21 21 21 21 21k poly p 21 21 21 21 21 21 21 21 21 21 21 21k t4p n 6 6 6 6 6 6 5 6 5 6 6 6k t4p p 11 11 11 11 11 11 10 10 10 11 11 11

(a) Indexes of the hardest problems encoded (left)and of the hardest problems solved (right).

number of variables (·103) number of clauses (·103)plr-bcp plr-bcp

lifting no yes ctrl no yes ctrl no yes ctrl no yes ctrl

k branch n 1000 1000 1000 20000 20000 20000 1000 1000 1000 23000 23000 23000k branch p 1000 1000 1000 0 0 0 1000 1000 1000 0 0 0k d4 n 12000 6000 12000 10000 26000 10000 17000 9000 17000 16000 43000 16000k d4 p 19000 18000 19000 0 0 0 28000 25000 28000 0 0 0k dum n 19000 19000 19000 11000 11000 11000 23000 23000 23000 14000 14000 14000k dum p 11000 11000 11000 20000 19000 20000 14000 13000 14000 26000 25000 26000k grz n 10 10 10 5 5 5 10 10 10 6 6 6k grz p 8 8 8 0.2 0.1 0.2 8 8 8 0.3 0.2 0.2k lin n 30 30 20 20 10 20 50 50 20 30 30 30k lin p 0 0 0 0 0 0 0 0 0 0 0 0k path n 11000 12000 11000 10000 7000 10000 13000 14000 13000 14000 9000 13000k path p 11000 12000 11000 26000 16000 26000 13000 14000 13000 35000 20000 35000k ph n 50 300 50 50 300 50 50 300 50 50 600 50k ph p 3 13 3 3 8 4 3 14 3 3 14 5k poly n 200 20 20 200 20 20 200 20 20 200 20 20k poly p 200 20 20 200 20 20 200 20 20 200 20 20k t4p n 4000 21000 4000 17000 14000 17000 4000 22000 4000 20000 17000 20000k t4p p 12000 10000 12000 20000 18000 20000 12000 11000 12000 24000 21000 24000

(b) # of variables and # of clauses of the hardest problems solved.Note: Here “0” means that the formula is simplified into ⊥ by Km2SAT .

Table 1: Comparison of the variants of Km2SAT +Rsat on the LWB benchmarks.

368


0.1

1

10

100

1000

10000

50 100 150 200 250 300

BNF-lift (Rsat)BNF-nolift (Rsat)

BNF-ctrl.lift (Rsat)BNF-lift-plr (Rsat)

BNF-nolift-plr (Rsat)BNF-ctrl.lift-plr (Rsat)

BNF-lift-plr-bcp (Rsat)BNF-nolift-plr-bcp (Rsat)

BNF-ctrl.lift-plr-bcp (Rsat)

Figure 3: Comparison of different variants of Km2SAT +Rsat on the LWB problems.X axis: number of solved problems; Y axis: total CPU time spent (sortingproblems from the easiest to the hardest).

In general, we do not have relevant performance gaps between the various configurationson this benchmark suite; it seems that in the majority of cases ctrl.lift slightly beatsnolift and nolift slightly beats lift. These gaps are even more relevant if we considerthe size of the formulas generated (Figure 5). We believe that this may be due to the factthat random 2m-CNF formulas may contain lots of shared subformulas 2rψ, so that liftingmay cause a reduction of such sharing (see Section 3). Further, plr-bcp does not seem tointroduce relevant improvements here. We believe that this is due to the fact that theserandom formulas produce pure and unit literals with very low or even zero probability.

Overall, the configuration BNF-nolift turns out to be the best performer on this suite,with a slight advantage wrt. BNF-ctrl.lift-plr-bcp.

Finally, from some plots of Figure 4 we notice that for Km2SAT +Rsat the problemstend to be harder within the satisfiability/unsatisfiability transition area. (This fact holdsespecially for Racer and *SAT, see Section 5.3.) This seems to confirm the fact that theeasy-hard-easy pattern of random k-SAT extends also to 2m-CNF, as already observed inliterature (Giunchiglia & Sebastiani, 1996, 2000; Giunchiglia et al., 2000; Horrocks et al.,2000; Patel-Schneider & Sebastiani, 2003).

369


1000

500

100

50

10

5

1

0.5

0.1

0.05

0.01 10 20 30 40 50 60

0

20

40

60

80

100


BNF-ctrl.lift (Rsat)BNF-lift-plr-bcp (Rsat)

BNF-nolift-plr-bcp (Rsat)BNF-ctrl.lift-plr-bcp (Rsat)

1000

500

100

50

10

5

1

0.5

0.1

0.05

0.01 10 20 30 40 50 60

0

20

40

60

80

100




1000

500

100

50

10

5

1

0.5

0.1

0.05

0.01 40 60 80 100 120 140

0

20

40

60

80

100




1000

500

100

50

10

5

1

0.5

0.1

0.05

0.01 40 60 80 100 120 140

0

20

40

60

80

100




1000

500

100

50

10

5

1

0.5

0.1

0.05

0.01 60 80 100 120 140

0

20

40

60

80

100




1000

500

100

50

10

5

1

0.5

0.1

0.05

0.01 60 80 100 120 140

0

20

40

60

80

100




Figure 4: Comparison among different variants of Km2SAT +Rsat on random problems.X axis: #clauses/N . Y axis: median (50th percentile) CPU time, 100 sam-ples/point. 1st row: d = 1, p = 0.5, N = 8, 9; 2nd row: d = 2, p = 0.6, N = 3, 4;3rd row: d = 2, p = 0.5, N = 3, 4. Background: % of satisfiable instances.

370


0

20000

40000

60000

80000

100000

120000

140000

160000

180000

10 20 30 40 50 60

BNF-liftBNF-nolift

BNF-ctrl.liftBNF-lift-plr-bcp

BNF-nolift-plr-bcpBNF-ctrl.lift-plr-bcp

0

20000

40000

60000

80000

100000

120000

140000

160000

180000

10 20 30 40 50 60

BNF-liftBNF-nolift



0

200000

400000

600000

800000

1e+06

1.2e+06

40 60 80 100 120 140

BNF-liftBNF-nolift



0

200000

400000

600000

800000

1e+06

1.2e+06

40 60 80 100 120 140

BNF-liftBNF-nolift



0

500000

1e+06

1.5e+06

2e+06

60 80 100 120 140

BNF-liftBNF-nolift



0

500000

1e+06

1.5e+06

2e+06

60 80 100 120 140

BNF-liftBNF-nolift



Figure 5: Comparison among different variants of Km2SAT on random problems. X axis:#clauses/N . Y axis: 1st column: #variables in the SAT encoding (90th per-centiles), 100 samples/point; 2nd column: #clauses in the SAT encoding (90thpercentiles), 100 samples/point. 1st row: d = 1, p = 0.5, N = 9; 2nd row: d = 2,p = 0.6, N = 4; 3rd row: d = 2, p = 0.5, N = 4.

371


5.2.3 Results on the TANCS 2000 Benchmark Suite

The comparison among the Km2SAT variants on the TANCS 2000 benchmark is presentedin Figures 6 and 7, where different BNF variants of Km2SAT are compared both enablingor disabling lift/ctrl.lif and plr-bcp.

In Figure 6, from top-left to bottom, we present the cumulative CPU times spent byKm2SAT +Rsat on the easy, medium and hard categories respectively (the correspondingplots reporting the non-cumulative CPU times are included in the online appendix). InFigure 7 we present the plots of the number of variables and clauses of the formulas solvedfor the more challenging cases of the medium and hard problems. 31 We notice that thereare only slight differences among the different variants of Km2SAT ; BNF with lift is thebest option which allows for solving more problems in the hard class and requiring less CPUtime.

We remark that, despite the expected exponential growth of the encoded formulas wrt.the modal depth, in this benchmark Km2SAT +Rsat allows for encoding and solvingproblems of modal depth greater than 100 for the easy class and greater than 50 for themedium and hard classes, producing and solving SAT-encoded formulas with more than 107

variables and 1.4 · 107 clauses.

5.3 An Empirical Comparison wrt. the Other Approaches

We proceed with the comparison of our approach wrt. the current state-of-the-art evalu-ating Km2SAT +Rsat against the other Km-satisfiability solvers listed above. In moredetails, we chose to compare the performance of the other solvers against both the best“local” Km2SAT +Rsat variant for the single benchmark suite and the best “global”Km2SAT +Rsat variant among all the benchmark suites, which we have identified inBNF-ctrl.lift-plr-bcp.

5.3.1 Comparison on the LWB Benchmark Suite

The results on the LWB benchmark suite are summarized numerically and graphically inTable 2. From Table 2(a) we notice a few facts: Racer and *SAT are the best performers(confirming the analysis done by Horrocks et al., 2000) with a significant gap wrt. the others;then, K-QBF +sKizzo solves a few more problems than Km2SAT +Rsat; then followsKBDD which outperforms Mspass, which is the worst performer. In detail, Km2SAT+Rsat is (one of) the worst performer(s) on k d4 * and k t4 *, the fourth best performeron k path n, the third best performer on k path p and k branch p, and it is (one of) thebest performer(s) on k branch n, k dum *, k grz *, k lin *, k ph * and k poly *; it is theabsolute best performer on k branch n and k ph p.

In Table 2(b) we give a graphical representation of this comparison, plotting the numberof solved problems by each approach against the total cumulative amount of CPU timespent. We notice that, even if Km2SAT +Rsat solves a few problems less than K-QBF+sKizzo, Km2SAT +Rsat is mostly faster than K-QBF +sKizzo.

31. The same plots for the easy problems are included in the online appendix.

372


0.1

1

10

100

1000

2000

20 40 60 80 100 120 140




0.1

1

10

100

1000

2000

10 20 30 40 50 60 70 80 90 100




0.1

1

10

100

1000

5 10 15 20 25 30




Figure 6: Comparison among different variants of Km2SAT +Rsat on TANCS 2000 prob-lems. X axis: number of solved problems. Y axis: total cumulative CPU timespent. 1st (top-left) to 3th (bottom) plot: easy, medium, hard problems. (Prob-lems are sorted from the easiest to the hardest).

373


10000

100000

1e+06

1e+07

1e+08

10 20 30 40 50 60 70 80 90 100

BNF-liftBNF-nolift



10000

100000

1e+06

1e+07

1e+08

10 20 30 40 50 60 70 80 90 100

BNF-liftBNF-nolift



10000

100000

1e+06

1e+07

1e+08

5 10 15 20 25 30

BNF-liftBNF-nolift



10000

100000

1e+06

1e+07

1e+08

5 10 15 20 25 30

BNF-liftBNF-nolift



Figure 7: Comparison among different variants of Km2SAT on TANCS 2000 problems. Xaxis: number of the harder solved problem. Y axis: 1st column: #variables inthe SAT encoding of the problem; 2nd column: #clauses in the SAT encodingof the problem. 1st to 2th row: medium, hard problems.

374


other tools Km2SAT + RsatK-QBF BNF-plr-bcp

test + sKizzo KBDD Mspass Racer *SAT -ctrl.lift -lift

k branch n 4 8 10 15 14 17 17k branch p 16 8 10 21 21 18 18k d4 n 8 21 21 21 21 8 8k d4 p 21 21 21 21 21 14 14k dum n 21 21 21 21 21 21 21k dum p 21 21 21 21 21 21 21k grz n 19 21 21 21 21 21 21k grz p 21 21 21 21 21 21 21k lin n 20 21 21 21 21 21 21k lin p 21 21 3 21 21 21 21k path n 9 21 4 21 21 13 14k path p 13 17 5 21 21 15 16k ph n 21 4 12 21 13 21 21k ph p 10 4 8 9 9 11 10k poly n 21 8 7 21 21 21 21k poly p 21 8 7 21 21 21 21k t4p n 21 21 12 21 21 6 6k t4p p 21 21 21 21 21 11 11

(a) Indexes of the hardest problems solved.

0.1

1

10

100

1000

10000

50 100 150 200 250 300 350

kQBF+sKizzo*SAT

RacerkBDD

MSpassBNF-lift-plr-bcp (Rsat)


(b) X axis: # of problems solved; Y axis: total (cumulative) CPU time spent.

Table 2: Comparison of Km2SAT +Rsat against the other tools on the LWB benchmarks.

375


0

10

20

30

40

50

60

70

80

90

100

10 20 30 40 50 60

kQBF+sKizzo*SAT

RacerkBDD

MSpassBNF-nolift (Rsat)


0

10

20

30

40

50

60

70

80

90

100

10 20 30 40 50 60

kQBF+sKizzo*SAT

RacerkBDD



0

10

20

30

40

50

60

70

80

90

100

40 60 80 100 120 140

kQBF+sKizzo*SAT

RacerkBDD



0

10

20

30

40

50

60

70

80

90

100

40 60 80 100 120 140

kQBF+sKizzo*SAT

RacerkBDD



0

10

20

30

40

50

60

70

80

90

100

60 80 100 120 140

kQBF+sKizzo*SAT

RacerkBDD



0

10

20

30

40

50

60

70

80

90

100

60 80 100 120 140

kQBF+sKizzo*SAT

RacerkBDD



Figure 8: Comparison against other approaches on random problems. X axis: #clauses/N .Y axis: % of problems solved within the timeout, 100 samples/point. 1st row:d = 1, p = 0.5, N = 8, 9; 2nd row: d = 2, p = 0.6, N = 3, 4; 3rd row: d = 2,p = 0.5, N = 3, 4.

376


1000

500

100

50

10

5

1

0.5

0.1

0.05

0.01 10 20 30 40 50 60

0

20

40

60

80

100

kQBF+sKizzo*SAT

RacerkBDD



1000

500

100

50

10

5

1

0.5

0.1

0.05

0.01 10 20 30 40 50 60

0

20

40

60

80

100

kQBF+sKizzo*SAT

RacerkBDD



0.01

0.05

0.1

0.5

1

5

10

50

100

500

1000

40 60 80 100 120 140 0

20

40

60

80

100

kQBF+sKizzo*SAT

RacerkBDD



0.01

0.05

0.1

0.5

1

5

10

50

100

500

1000

40 60 80 100 120 140 0

20

40

60

80

100

kQBF+sKizzo*SAT

RacerkBDD



0.01

0.05

0.1

0.5

1

5

10

50

100

500

1000

60 80 100 120 140 0

20

40

60

80

100

kQBF+sKizzo*SAT

RacerkBDD


BNF-ctrl.lift-plr-bcp (Rsat) 0.01

0.05

0.1

0.5

1

5

10

50

100

500

1000

60 80 100 120 140 0

20

40

60

80

100

kQBF+sKizzo*SAT

RacerkBDD



Figure 9: Comparison against other approaches on random problems. X axis: #clauses/N .Y axis: median (50th percentile) CPU time, 100 samples/point. 1st row: d = 1,p = 0.5, N = 8, 9; 2nd row: d = 2, p = 0.6, N = 3, 4; 3rd row: d = 2, p = 0.5,N = 3, 4. Background: % of satisfiable instances.

377


5.3.2 Comparison on the Random 2m-CNF Benchmark Suite

In the random 2m-CNF benchmark suite the results are dominated by Km2SAT +Rsat.For the hardest categories among the three groups of problems used in the evaluation, wereport in Figure 8 the number of problems solved by each tool within the timeout, and inFigure 9 the median CPU time (i.e. the 50%th percentile).

Looking at Figure 8 we notice that Km2SAT +Rsat (in both versions) is the only toolwhich always terminates within the timeout, whilst *SAT and Racer sometimes do notterminate in the hardest problems, K-QBF +sKizzo very often does not terminate, andMspass and KBDD do not terminate for most values.

In Figure 9 we notice that Km2SAT +Rsat is most often the best performer (in partic-ular with the hardest problems), followed by *SAT and Racer. (This is even much moreevident if we consider the 90%th percentile of CPU time, whose plots are included in theonline appendix.) In all these tests, K-QBF +sKizzo, Mspass and KBDD are drasticallyoutperformed by the others.

5.3.3 Comparison on the TANCS 2000 Benchmark Suite

The results of the TANCS 2000 benchmark are summarized in Figure 10, from the easycategory (top-left) to the hard category (bottom).

From the plots we notice that the relative performances of the tools under test varysignificantly with the category: Racer and *SAT are among the best performers in allcategories; K-QBF +sKizzo behaves well on the easy and medium categories but solvesvery few problems on the hard one; KBDD behaves very well on the easy category, but solvesvery few problems in the medium and hard ones. Mspass is among the worst performersin all categories: in particular, it does not solve any problem in the hard category; finally,Km2SAT +Rsat is the worst performer on the easy category, it outperforms all competitorsbut *SAT and Racer on the medium category, and competes head-to-head with bothRacer and *SAT for the first position on the hard category: the “local-best” configurationBNF-lift beats both competitors; the “global-best” configuration BNF-ctrl.lift-prl-bcpsolves as many problems as Racer, with one-order-magnitude CPU-time performance gap,and two problems less than *SAT.

Notice that the classification of the benchmark problems in “easy”, “medium” and“hard” given by Massacci and Donini (2000) is based on the translation schema used toproduce every modal problem and refers to its “reasoning component”, but it is not neces-sarily related to other components (like, e.g., the modal depth) which affect the size of ourencoding and, hence, the efficiency of our approach. This may explain the fact, e.g., thatthe “easy” problems are not so easy for our approach, and viceversa.

5.4 Discussion

As highlighted by Giunchiglia et al. (2000), and Horrocks et al. (2000), the satisfiabilityproblem in modal logics like Km is characterized by the alternation of two orthogonalcomponents of reasoning: a Boolean component, performing Boolean reasoning within eachstate, and a modal component, generating the successor states of each state. The lattermust cope with the fact that the candidate models may be up to exponentially big wrt.depth(ϕ), whilst the former must cope with the fact that there may be up to exponentially

378


0.1

1

10

100

1000

10000

50 100 150 200

kQBF+sKizzo*SAT

RacerkBDD


BNF-ctrl.lift-plr-bcp (Rsat) 0.1

1

10

100

1000

10000

1e+05

20 40 60 80 100 120

kQBF+sKizzo*SAT

RacerkBDD



0.1

1

10

100

1000

10000

5 10 15 20 25 30

kQBF+sKizzo*SAT

RacerkBDD



Figure 10: Comparison against other approaches on TANCS 2000 problems. X axis: num-ber of solved problems. Y axis: total cumulative CPU time spent. 1st (top-left)to 3th (bottom) plot: easy, medium, hard problems. (Problems are sorted fromthe easiest to the hardest).

379


many candidate (sub)models to explore. In the Km2SAT +DPLL approach the encoderhas to handle the whole modal component (rules (8) and (9)), whilst the handling of thewhole Boolean component is delegated to an external SAT solver.

From the results displayed in Section 5.3 we notice that the relative performances ofthe Km2SAT +DPLL approach wrt. other state-of-the-art tools range from cases whereKm2SAT +Rsat is much less efficient than other state-of-the-art approaches (e.g., the k d4and k t4p formulas) up to formulas where it is much more efficient (e.g., the k ph p or the2m-CNF formulas with d = 1). In the middle stands a large majority of formulas in whichthe approach competes well against the other state-of-the art tools; in particular, Km2SAT+Rsat competes very well or even outperforms the other approaches based on translationsinto different formalisms (the translational approach, the automata-theoretic approach andthe QBF-encoding approach).

A simple explanation of the former fact could be that the Km2SAT +DPLL approachloses on problems with high modal depth, or where the modal component of reasoningdominates (like, e.g., the k d4 and k t4p formulas), and wins on problems where the Booleancomponent of reasoning dominates (like, e.g., the k ph n or the 2m-CNF formulas withd = 1), and it is competitive for formulas in which both components are relevant.

We notice, however, that Km2SAT +Rsat wins in the hard category of TANCS 2000benchmarks, with modal depths greater than 50, and on k branch n, where the searchis dominated by the modal component. 32 In fact, we recall that reducing the Booleancomponent of reasoning may produce a reduction also of the modal reasoning effort, becauseit may reduce the number of successor states to analyze (e.g. Sebastiani, 2007, 2007). Thus,e.g., techniques like on-the-fly BCP, although exploiting only purely-Boolean properties,may produce not only a drastic pruning of the Boolean search, but also a drastic reductionin the size of the model investigated, because they cut a priori the amount of successorstates to expand.

6. Related Work and Research Trends

In the last fifteen years one main research line in description logic has focused on investigat-ing increasingly expressive logics, with the goal of establishing the theoretical boundariesof decidability and of allowing for more expressive power in the languages defined (Baader,Calvanese, McGuinness, Nardi, & Patel-Schneider, 2003). Consequently, very expressive —though very hard— description logics have today notable application in the field of SemanticWeb. For example, the SHOIN (D) logic (which has NExpTime complexity) captures thesub-language OWL DL of the full OWL (Web Ontology Language) language (Bechhofer,van Harmelen, Hendler, Horrocks, McGuinness, Patel-Schneider, & Stein, 2004), that is therecommended standard language for the semantic web proposed by the W3C consortium.

In contrast, the recent quest for tractable description logic-based languages arising fromthe field of bio-medical ontologies (e.g., Spackman, Campbell, & Cote, 1997; Sioutos,de Coronado, Haber, Hartel, Shaiu, & Wright, 2007; The Gene Ontology Consortium, 2000;

32. The k branch n formulas are very hard from the perspective of modal reasoning, because they requirefinding one modelM with 2d+1−1 states (Halpern & Moses, 1992), but no Boolean reasoning within eachstate is really required (Giunchiglia et al., 2000; Horrocks et al., 2000): e.g., *SAT solves k branch n(d)

with 2d+1−1 calls to its embedded DPLL engine, one for each state of M, each call solved by BCP only.

380


Rector & Horrocks, 1997) has stimulated the opening of another research line on tractabledescription logics (also called lightweight description logics), which are suitable for reasoningon these very big bio-medical ontologies. In particular, Baader et al. (2005, 2006, 2007, 2008)have spent a considerable effort in the attempt of defining a small but maximal subset oflogical constructors, expressive enough to cover the needs of these practical applications,but whose inference problems must remain tractable. For example, simple and tractabledescription logics like EL, EL+ and EL++ (Baader et al., 2005) are expressive enough todescribe several important bio-medical ontologies such as SNoMed (Spackman et al., 1997),NCI (Sioutos et al., 2007), the Gene Ontology (The Gene Ontology Consortium, 2000) andthe majority of Galen (Rector & Horrocks, 1997).

Reasoning on these ontologies represents not only an important application of lightweightdescription logics, but also a challenge due to the required efficiency and the huge dimen-sions of the ontologies. In this perspective, it is important to face efficiently not only thebasic reasoning services (e.g., satisfiability, subsumption, queries) on logics like EL, EL+ andEL++, but also more sophisticated reasoning problems like e.g., axiom pinpointing (Baaderet al., 2007; Baader & Penaloza, 2008) and logical difference between terminologies (Konev,Walther, & Wolter, 2008).

7. Conclusions and Future Work

In this paper we have explored the idea of encoding Km/ALC-satisfiability into SAT, sothat to be handled by state-of-the-art SAT tools. We have showed that, despite the intrinsicrisk of blowup in the size of the encoded formulas, the performances of this approach arecomparable with those of current state-of-the-art tools on a rather extensive variety ofempirical tests. Furthermore, we remark that our approach allows for directly benefitting“for free” from current and future enhancements in SAT solver technology.

We see many possible directions to explore in order to enhance and extend our approach.An important open research line is to explore the feasibility of SAT encodings for other andmore expressive modal and description logics (e.g., whilst for logics like Tm the extensionshould be straightforward, logics like S4m, or more elaborated description logics than ALC,should be challenging) and for more complex form of reasoning (e.g., including TBoxes andABoxes).

Our current investigation, however, focuses on the lightweight logics of Baader et al.(2005). We have investigated (and we are currently enhancing) an encoding of the mainreasoning services in EL and EL+ into Horn-SAT, which allows for reasoning efficientlyon the (often huge) bio-medical ontologies mentioned in Section 6, and for handling themore sophisticated inference problems mentioned there (e.g., we currently handle axiompinpointing), by exploiting some of the advanced functionalities which can be implementedon top of a modern SAT solver (Sebastiani & Vescovi, 2009).

8. Acknowledgments

The authors are partly supported by SRC/GRC under Custom Research Project 2009-TJ-1880 WOLFLING, and by MIUR under PRIN project 20079E5KM8 002.

381


Appendix A. The Proof of Correctness & Completeness

A.1 Some Further Notation

Let ψ be a Km-formula. We denote by ψ the representation of ¬ψ in the current formalism:in NNF, 3rψ

def= 2rψ, 2rψdef= 3rψ, ψ1 ∧ ψ2

def= ψ1 ∨ ψ2, ψ1 ∨ ψ2def= ψ1 ∧ ψ2, Ai

def= ¬Ai,¬Ai

def= Ai; in BNF, ¬2rψdef= 2rψ, 2rψ

def= ¬2rψ, ψ1 ∧ ψ2def= ψ1 ∨ ψ2, ψ1 ∨ ψ2

def= ψ1 ∧ ψ2,Ai

def= ¬Ai, ¬Aidef= Ai.

We represent a truth assignment µ as a set of literals, with the intended meaning that apositive literal Ai (resp. negative literal ¬Ai) in µ means that Ai is assigned to true (resp.false). We say that µ assigns a literal l if either l ∈ µ or ¬l ∈ µ. We say that a literal loccurs in a Boolean formula φ iff the atom of l occurs in φ.

Let M denote a Kripke model, and let σ be the label of a generic state uσ in M. Welabel (and denote) by “1” the root state of M. By “〈σ : ψ〉 ∈ M” we mean that uσ ∈ Mand M, uσ |= ψ. Thus, for every σ s.t. uσ ∈M, either 〈σ : ψ〉 ∈ M or 〈σ : ψ〉 ∈ M.

For convenience, instead of (9) sometimes we use the equivalent definition:

Def(σ, νr) def= (L〈σ, νr〉 →∧

for every

〈σ,πr,i〉

(L〈σ, πr,i〉 → L〈σ.i, νr0 〉)) ∧

∧

for every

〈σ,πr,i〉

Def(σ.i, νr0). (20)

Notice that each Def(σ, ψ) in (6), (7), (8), (20) is written in the general form

(L〈σ, ψ〉 → Φ〈σ,ψ〉) ∧∧

〈σ′,ψ′〉Def(σ′, ψ′). (21)

We call definition implication for Def(σ, ψ) the expressions “(L〈σ, ψ〉 → Φ〈σ,ψ〉)”.

A.2 Soundness and Completeness of Km2SAT

Let ϕ be a Km-formula. We prove the following theorem, which states the soundness andcompleteness of Km2SAT .

Theorem 1. A Km-formula ϕ is Km-satisfiable if and only if the corresponding Km2SAT (ϕ)is satisfiable.

Proof. It is a direct consequence of the following Lemmas 2 and 3.

Lemma 2. Given a Km-formula ϕ, if Km2SAT (ϕ) is satisfiable, then there exists a Kripkemodel M s.t. M, 1 |= ϕ.

Proof. Let µ be a total truth assignment satisfying Km2SAT (ϕ). We build from µ a Kripkemodel M = 〈U ,L,R1, . . . ,Rm〉 as follows:

U def= {σ : A〈σ, ψ〉 occurs in Km2SAT (ϕ) for some ψ} (22)

L(σ,Ai)def=

{True if L〈σ, Ai〉 ∈ µ

False if ¬L〈σ, Ai〉 ∈ µ(23)

Rrdef= {〈σ, σ.i〉 : L〈σ, πr,i〉 ∈ µ}. (24)

382


We show by induction on the structure of ϕ that, for every 〈σ, ψ〉 s.t. L〈σ, ψ〉 occurs onKm2SAT (ϕ),

〈σ : ψ〉 ∈ M if L〈σ, ψ〉 ∈ µ. (25)

Base

ψ = Ai or ψ = ¬Ai. Then (25) follows trivially from (23).

Step

ψ = α. Let L〈σ, α〉 ∈ µ.

As µ satisfies (6), L〈σ, αi〉 ∈ µ for every i ∈ {1, 2}.By inductive hypothesis, 〈σ : αi〉 ∈ M for every i ∈ {1, 2}.Then, by definition, 〈σ : α〉 ∈ M.

Thus, 〈σ : α〉 ∈ M if L〈σ, α〉 ∈ µ.

ψ = β. Let L〈σ, β〉 ∈ µ.

As µ satisfies (7), L〈σ, βi〉 ∈ µ for some i ∈ {1, 2}.By inductive hypothesis, 〈σ : βi〉 ∈ M for some i ∈ {1, 2}.Then, by definition, 〈σ : β〉 ∈ M.

Thus, 〈σ : β〉 ∈ M if L〈σ, β〉 ∈ µ.

ψ = πr,j . Let L〈σ, πr,j〉 ∈ µ.

As µ satisfies (8), L〈σ.j, πr,j0 〉 ∈ µ.

By inductive hypothesis, 〈σ.j : πr,j0 〉 ∈ M.

Then, by definition and by (24), 〈σ : πr,j〉 ∈ M.

Thus, 〈σ : πr,j〉 ∈ M if L〈σ, πr,j〉 ∈ µ.

ψ = νr. Let L〈σ, νr〉 ∈ µ.

As µ satisfies (9), for every 〈σ, πr,i〉 s.t. L〈σ, πr,i〉 ∈ µ, we have that L〈σ.i, νr0 〉 ∈ µ.

By inductive hypothesis, we have that 〈σ : πr,i〉 ∈ M and 〈σ.i : νr0〉 ∈ M.

Then, by definition and by (24), 〈σ : νr〉 ∈ M.

Thus, 〈σ : νr〉 ∈ M if L〈σ, νr〉 ∈ µ.

If µ |= Km2SAT (ϕ), then A〈1, ϕ〉 ∈ µ. Thus, by (25), 〈1 : ϕ〉 ∈ M, i.e., M, 1 |= ϕ.

383


Lemma 3. Given a Km-formula ϕ, if there exists a Kripke model M s.t. M, 1 |= ϕ, thenKm2SAT (ϕ) is satisfiable.

Proof. Let M be a Kripke model s.t. M, 1 |= ϕ. We build from M a truth assignment µfor Km2SAT (ϕ) recursively as follows: 33

µdef= µM ∪ µM (26)

µMdef= {L〈σ, ψ〉 ∈ Km2SAT (ϕ) : 〈σ, ψ〉 ∈ M} (27)

∪ {¬L〈σ, ψ〉 ∈ Km2SAT (ϕ) : 〈σ, ψ〉 ∈ M}µM

def= µπν ∪ µαβ ∪ µA (28)

µπνdef= {¬L〈σ, πr,i〉 ∈ Km2SAT (ϕ) : σ 6∈ M} (29)∪ {L〈σ, νr〉 ∈ Km2SAT (ϕ) : σ 6∈ M}

µαβdef= {¬L〈σ, α〉∈Km2SAT (ϕ) : σ 6∈M and ¬L〈σ, αi〉∈µM for some i∈{1, 2}} (30)∪ {¬L〈σ, β〉∈Km2SAT (ϕ) : σ 6∈M and ¬L〈σ, βi〉∈µM for every i∈{1, 2}}.

where µA is a consistent truth assignment for the literals L〈σ, Ai〉 s.t. Ai ∈ A and σ 6∈ M.

By construction, for every L〈σ, ψ〉 in Km2SAT (ϕ), µ assigns L〈σ, ψ〉 to true iff it assignsL〈σ, ψ〉 to false and vice versa, so that µ is a consistent truth assignment.

First, we show that µM satisfies the definition implications of all Def(σ, ψ)’s andDef(σ, ψ)’ s.t. σ ∈M. Let σ ∈M. We distinguish four cases.

ψ = α. Thus ψ = β s.t. β1 = α1 and β2 = α2.

– If 〈σ : α〉 ∈ M (and hence 〈σ : β〉 6∈ M), then for both i’s 〈σ : αi〉 ∈ M and〈σ : βi〉 6∈ M. Thus, by (27), {L〈σ, α1〉, L〈σ, α2〉,¬L〈σ, β〉} ⊆ µM, so that µMsatisfies the definition implications of both Def(σ, α) and Def(σ, β).

– If 〈σ : α〉 6∈ M (and hence 〈σ, β〉 ∈ M), then for some i 〈σ : αi〉 6∈ M and〈σ : βi〉 ∈ M. Thus, by (27), {¬L〈σ, α〉, L〈σ, βi〉} ⊆ µM, so that µM satisfies thedefinition implications of both Def(σ, α) and Def(σ, β).

ψ = β. Like in the previous case, inverting ψ and ψ.

ψ = πr,j . Thus ψ = νr s.t. νr0 = πr,j

0 .

– If 〈σ : πr,j〉 ∈ M (and hence 〈σ : νr〉 6∈ M), then 〈σ.j : πr,j0 〉 ∈ M. Thus, by (27),

{L〈σ.j, πr,j0 〉,¬L〈σ, νr〉} ⊆ µM, so that µM satisfies the definition implications of

both Def(σ, πr,j) and Def(σ, νr).

33. We assume that µM, µπν and µαβ are generated in order, so that µαβ is generated recursively startingfrom µπν . Intuitively, µM assigns the literals L〈σ, ψ〉 s.t. σ ∈M in such a way to mimic M; µM assignsthe other literals in such a way to mimic the fact that no state outside those in M is generated (i.e., allL〈σ, π〉’s are assigned false and the L〈σ, ν〉’s, L〈σ, α〉’s, L〈σ, β〉’s are assigned consequently).

384


– If 〈σ : πr,j〉 6∈ M (and hence 〈σ : νr〉 ∈ M), then by (27) ¬L〈σ, πr,j〉 ∈ µM, sothat µM satisfies the definition implications of Def(σ, πr,j).As far as Def(σ, νr) is concerned, we partition the clauses in (9):

((L〈σ, νr〉 ∧ L〈σ, πr,i〉) → L〈σ.i, νr0 〉) (31)

into two subsets. The first is the set of clauses (31) for which 〈σ : πr,i〉 ∈ M. As〈σ : νr〉 ∈ M, we have that 〈σ.i : νr

0〉 ∈ M. Thus, by (27), L〈σ.i, νr0 〉 ∈ µM, so that

µM satisfies (31). The second is the set of clauses (31) for which 〈σ : πr,i〉 6∈ M.By (27) we have that ¬L〈σ, πr,i〉 ∈ µM, so that µM satisfies (31). Thus, µMsatisfies the definition implications also of Def(σ, νr).

ψ = νr. Like in the previous case, inverting ψ and ψ.

Notice that, if σ 6∈ M, then σ.i 6∈ M for every i. Thus, for every Def(σ, ψ) s.t. σ 6∈ M, allatoms in the implication definition of Def(σ, ψ) are not assigned by µM.

Second, we show by induction on the recursive structure of µM that µM satisfies thedefinition implications of all Def(σ, ψ)’s and Def(σ, ψ)’s s.t. σ 6∈ M. Let σ 6∈ M.

As a base step, by (29), µπν satisfies the definition implications of all Def(σ, πr,i)’s andDef(σ, νr)’s because it assigns false to all L〈σ, πr,i〉’s. Indeed, µA assigns every literal ofthe type L〈σ, Ai〉 s.t Ai ∈ A and σ 6∈ M (notice that all the Def(σ, Ai)’s definitions aretrivially satisfied and don’t contain any definition implications).

As inductive step, we show on the inductive structure of µαβ that µαβ satisfies thedefinition implications of all Def(σ, α)’s and Def(σ, β)’s

Let ψdef= α and ψ = β s.t. βi = αi (or vice versa). Then we have that:

• if both L〈σ, αi〉’s (respectively at least one L〈σ, βi〉) are assigned true by µM, thenthe definition implications of Def(σ, α) (respectively Def(σ, β)) is already triviallysatisfied;

• if at least one L〈σ, αi〉 (respectively both L〈σ, βi〉’s) is assigned false by µM, then by(30) L〈σ, α〉 (respectively L〈σ, β〉) is assigned false by µαβ , which satisfies the definitionimplication of Def(σ, α) (respectively Def(σ, β)).

Thus µM satisfies the definition implications of all the Def(σ, ψ)’s and Def(σ, ψ)’s s.t.σ 6∈ M.

On the whole, µ |= Def(σ, ψ) for every Def(σ, ψ). By construction, µM |= A〈1, ϕ〉since 〈1 : ϕ〉 ∈ M. Therefore µ |= Km2SAT (ϕ).

385


References

Areces, C., Gennari, R., Heguiabehere, J., & de Rijke, M. (2000). Tree-based heuristics inmodal theorem proving. In Proc. of ECAI’00, pp. 199–203. IOS Press.

Baader, F., Brandt, S., & Lutz, C. (2005). Pushing the EL envelope. In Proc. of IJCAI’05,pp. 364–369. Morgan-Kaufmann Publishers.

Baader, F., Franconi, E., Hollunder, B., Nebel, B., & Profitlich, H. (1994). An EmpiricalAnalysis of Optimization Techniques for Terminological Representation Systems or:Making KRIS get a move on. Applied Artificial Intelligence. Special Issue on Knowl-edge Base Management, 4, 109–132.

Baader, F., & Hollunder, B. (1991). A Terminological Knowledge Representation Systemwith Complete Inference Algorithms. In Proc. of First International Workshop onProcessing Declarative Knowledge, Vol. 572 of LNCS, pp. 67–85. Springer–Verlag.

Baader, F., Lutz, C., & Suntisrivaraporn, B. (2006). CEL—a polynomial-time reasonerfor life science ontologies. In Proc. of IJCAR’06, Vol. 4130 of LNAI, pp. 287–291.Springer–Verlag.

Baader, F., Brandt, S., & Lutz, C. (2008). Pushing the EL envelope further. In Proc. ofthe OWLED 2008 DC Workshop.

Baader, F., Calvanese, D., McGuinness, D., Nardi, D., & Patel-Schneider, P. F. (Eds.).(2003). The Description Logic Handbook: Theory, Implementation, and Applications.Cambridge University Press.

Baader, F., & Penaloza, R. (2008). Automata-Based Axiom Pinpointing. In Proc. ofIJCAR’08, Vol. 5195 of LNAI, pp. 226–241. Springer.

Baader, F., Penaloza, R., & Suntisrivaraporn, B. (2007). Pinpointing in the descriptionlogic EL+. In Proc. of KI 2007, Vol. 4667 of LNCS, pp. 52–67. Springer.

Balsiger, P., Heuerding, A., & Schwendimann, S. (1998). Logics workbench 1.0. In Proc. ofTableaux’98, Vol. 1397 of LNCS, pp. 35–37. Springer.

Bechhofer, S., van Harmelen, F., Hendler, J., Horrocks, I., McGuinness, D. L., Patel-Schneider, P. F., & Stein, L. A. (2004). OWL Web Ontology Language reference.W3C Recommendation. Available at http://www.w3.org/TR/owl-ref/.

Benedetti, M. (2005). sKizzo: A Suite to Evaluate and Certify QBFs. In Proc. of CADE-20,Vol. 3632 of LNCS, pp. 369–376. Springer.

Biere, A., Cimatti, A., Clarke, E. M., & Zhu, Y. (1999). Symbolic Model Checking withoutBDDs. In Proc. of TACAS’99, Vol. 1579 of LNCS, pp. 193–207. Springer.

Brand, S. (2008). Personal communication.

Brand, S., Gennari, R., & de Rijke, M. (2003). Constraint Programming for Modelling andSolving Modal Satisfability. In Proc. of CP 2003, Vol. 2833 of LNCS, pp. 795–800.Springer.

Davis, M., Longemann, G., & Loveland, D. (1962). A machine program for theorem-proving.Journal of the ACM, 5 (7), 394–397.

386


Davis, M., & Putnam, H. (1960). A computing procedure for quantification theory. Journalof the ACM, 7, 201–215.

Donini, F., & Massacci, F. (2000). EXPTIME tableaux for ALC. Artificial Intelligence,124 (1), 87–138.

Een, N., & Sorensson, N. (2004). An Extensible SAT-solver. In Proc. of SAT’03, Vol. 2919of LNCS, pp. 502–518. Springer.

Fitting, M. (1983). Proof Methods for Modal and Intuitionistic Logics. D. Reidel Publishing.

Franconi, E. (1998). CRACK. In Proc. of Description Logics ’98, Vol. 11 of CEUR WorkshopProceedings. CEUR-WS.org.

Giunchiglia, E., Giunchiglia, F., Sebastiani, R., & Tacchella, A. (2000). SAT vs. Translationbased decision procedures for modal logics: a comparative evaluation. Journal ofApplied Non-Classical Logics, 10 (2), 145–172.

Giunchiglia, E., Giunchiglia, F., & Tacchella, A. (2002). SAT-Based Decision Proceduresfor Classical Modal Logics. Journal of Automated Reasoning, 28 (2), 143–171.

Giunchiglia, F., & Sebastiani, R. (1996). Building decision procedures for modal logics frompropositional decision procedures - the case study of modal K. In Proc. of CADE-13,Vol. 1104 of LNAI, pp. 583–597. Springer.

Giunchiglia, F., & Sebastiani, R. (2000). Building decision procedures for modal logics frompropositional decision procedures - the case study of modal K(m). Information andComputation, 162 (1/2), 158–178.

Giunchiglia, F., Roveri, M., & Sebastiani, R. (1996). A new method for testing decisionprocedures in modal and terminological logics. In Proc. of Description Logics ’96,Vol. WS-96-05 of AAAI Technical Reports, pp. 119–123. AAAI Press.

Haarslev, V., & Moeller, R. (2001). RACER System Description. In Proc. of IJCAR’01,Vol. 2083 of LNAI, pp. 701–706. Springer.

Halpern, J. Y. (1995). The effect of bounding the number of primitive propositions andthe depth of nesting on the complexity of modal logic. Artificial Intelligence, 75 (3),361–372.

Halpern, J., & Moses, Y. (1992). A guide to the completeness and complexity for modallogics of knowledge and belief. Artificial Intelligence, 54 (3), 319–379.

Heuerding, A., & Schwendimann, S. (1996). A benchmark method for the propositionalmodal logics K, KT, S4. Tech. rep. IAM-96-015, University of Bern, Switzerland.

Horrocks, I. (1998). Using an expressive description logic: FaCT or fiction?. In Proc. ofKR’98, pp. 636–647. Morgan Kaufmann.

Horrocks, I., & Patel-Schneider, P. F. (1999). Optimizing Description Logic Subsumption.Journal of Logic and Computation, 9 (3), 267–293.

Horrocks, I., Patel-Schneider, P. F., & Sebastiani, R. (2000). An Analysis of EmpiricalTesting for Modal Decision Procedures. Logic Journal of the IGPL, 8 (3), 293–323.

387


Hustadt, U., Schmidt, R. A., & Weidenbach, C. (1999). MSPASS: Subsumption Testing withSPASS. In Proc. of Description Logics ’99, Vol. 22 of CEUR Workshop Proceedings,pp. 136–137. CEUR-WS.org.

Hustadt, U., & Schmidt, R. (1999). An empirical analysis of modal theorem provers. Journalof Applied Non-Classical Logics, 9 (4), 479–522.

Kautz, H., McAllester, D., & Selman, B. (1996). Encoding Plans in Propositional Logic. InProc. of KR’96, pp. 374–384. AAAI Press.

Konev, B., Walther, D., & Wolter, F. (2008). The Logical Difference Problem for DescriptionLogic Terminologies. In Proc. of IJCAR’08, Vol. 5195 of LNAI, pp. 259–274. Springer.

Ladner, R. (1977). The computational complexity of provability in systems of modal propo-sitional logic. SIAM Journal on Computing, 6 (3), 467–480.

Massacci, F. (1999). Design and Results of Tableaux-99 Non-Classical (Modal) System Com-petition. In Proc. of Tableaux’99, Vol. 1617 of LNCS, pp. 14–18. Springer–Verlang.

Massacci, F. (2000). Single Step Tableaux for modal logics: methodology, computations,algorithms. Journal of Automated Reasoning, 24 (3), 319–364.

Massacci, F., & Donini, F. (2000). Design and results of TANCS-2000, Automated Reason-ing with Analytic Tableaux and Related Methods. In Proc. of Tableaux 2000, Vol.1847 of LNCS, pp. 52–56. Springer.

Moskewicz, M. W., Madigan, C. F., Zhao, Y., Zhang, L., & Malik, S. (2001). Chaff: Engi-neering an efficient SAT solver. In Proc. of DAC’01, pp. 530–535. ACM.

Narizzano, M., Pulina, L., & Tacchella, A. (2006). The QBFEVAL Web Por-tal. In Proc. of JELIA’06, Vol. 4160 of LNCS, pp. 494–497. Springer. See:http://www.qbflib.org/index eval.php.

Pan, G., Sattler, U., & Vardi, M. Y. (2002). BDD-Based Decision Procedures for K. InProc. of CADE-18, Vol. 2392 of LNCS, pp. 16–30. Springer.

Pan, G., & Vardi, M. Y. (2003). Optimizing a BDD-based modal solver. In Proc. ofCADE-19, Vol. 2741 of LNAI, pp. 75–89. Springer.

Patel-Schneider, P. F. (1998). DLP system description. In Proc. of Tableaux’98, pp. 87–89.

Patel-Schneider, P. F., & Sebastiani, R. (2001). A new system and methodology for gener-ating random modal formulae. In Proc. of IJCAR’01, Vol. 2083 of LNAI, pp. 464–468.Springer-Verlag.

Patel-Schneider, P. F., & Sebastiani, R. (2003). A New General Method to Generate Ran-dom Modal Formulae for Testing Decision Procedures. Journal of Artificial Intelli-gence Research, (JAIR), 18, 351–389.

Pipatsrisawat, T., & Darwiche, A. (2006). SAT Solver Description: Rsat. Available at:http://fmv.jku.at/sat-race-2006/descriptions/9-rsat.pdf.

Rector, A., & Horrocks, I. (1997). Experience building a large, re-usable medical ontologyusing a description logic with transitivity and concept inclusions. In Proc. of Workshopon Ontological Engineering, AAAI Spring Symposium (AAAI’97). AAAI Press.

388


Schild, K. D. (1991). A correspondence theory for terminological logics: preliminary report.In Proc. of IJCAI’91, pp. 466–471.

Sebastiani, R. (2007). Lazy Satisfiability Modulo Theories. Journal on Satisfiability, BooleanModeling and Computation (JSAT), 3, 141–224.

Sebastiani, R., & Vescovi, M. (2006). Encoding the Satisfiability of Modal and DescriptionLogics into SAT: The Case Study of K(m)/ALC. In Proc. of SAT’06, Vol. 4121 ofLNCS, pp. 130–135. Springer.

Sebastiani, R., & Vescovi, M. (2009). Axiom Pinpointing in Lightweight Description Logicsvia Horn-SAT Encoding and Conflict Analysis. In Proc. of CADE-22, Vol. 5663 ofLNAI. Springer. In print.

Sebastiani, R. (2007). From KSAT to Delayed Theory Combination: Exploiting DPLLOutside the SAT Domain. In Proc. of FroCoS’07, Vol. 4720 of LNCS, pp. 28–46.Springer. Invited talk.

Seshia, S. A., Lahiri, S. K., & Bryant, R. E. (2003). A Hybrid SAT-Based Decision Procedurefor Separation Logic with Uninterpreted Functions. In Proc. of DAC’03, pp. 425–430.ACM.

Silva, J. P. M., & Sakallah, K. A. (1996). GRASP - A new Search Algorithm for Satisfiability.In Proc. of ICCAD’96, pp. 220–227. IEEE Computer Society.

Sioutos, N., de Coronado, S., Haber, M. W., Hartel, F. W., Shaiu, W., & Wright, L. W.(2007). NCI Thesaurus: A semantic model integrating cancer-related clinical andmolecular information. Journal of Biomedical Informatics, 40 (1), 30–43.

Spackman, K. A., Campbell, K., & Cote, R. (1997). SNOMED RT: A reference terminologyfor healt care. Journal of American Medical Informatics Association (Fall SymposiumSupplement), 640–644.

Strichman, O. (2002). On Solving Presburger and Linear Arithmetic with SAT. In Proc. ofFMCAD’02, Vol. 2517 of LNCS, pp. 160–170. Springer–Verlag.

Strichman, O., Seshia, S., & Bryant, R. (2002). Deciding separation formulas with SAT. InProc. of CAV’02, Vol. 2404 of LNCS, pp. 209–222. Springer.

Tacchella, A. (1999). *SAT system description. In Proc. of Description Logics ’99, Vol. 22of CEUR Workshop Proceedings, pp. 142–144. CEUR-WS.org.

The Gene Ontology Consortium (2000). Gene ontology: Tool for the unification of biology.Nature Genetics, 25, 25–29.

Voronkov, A. (1999). K K: a theorem prover for K. In Proc. of CADE-16, Vol. 1632 ofLNAI, pp. 383–387. Springer.

Voronkov, A. (2001). How to optimize proof-search in modal logics: new methods of provingredundancy criteria for sequent calculi. ACM Transactions on Computational Logic,2 (2), 182–215.

Zhang, L., & Malik, S. (2002). The quest for efficient boolean satisfiability solvers. In Proc.of CAV’02, Vol. 2404 of LNCS, pp. 17–36. Springer.

389

Documents

arxiv.org · Journal of Artiﬂcial Intelligence Research 35 (2009) 343-389 Submitted 08/08; published 06/09 Automated Reasoning in Modal and Description Logics via SAT Encoding: