Semantics-Based Analysis
and Transformation of Logic Programs
(Revised Report)
Harald Søndergaard
April 1990
Abstract
Dataflow analysis is an essential component of many programming tools. One use of dataflow
information is to identify errors in a program, as done by program “debuggers” and type checkers.
Another is in compilers and other program transformers, where the analysis may guide various
optimisations.
The correctness of a programming tool’s dataflow analysis component is usually of paramount
importance. The theory of abstract interpretation, originally developed by P. and R. Cousot, aims at
providing a framework for the development of correct analysis tools. In this theory, dataflow analysis
is viewed as “non-standard” semantics, and abstract interpretation prescribes certain relations
between standard and non-standard semantics, in order to guarantee the correctness of the non-
standard semantics with respect to the standard semantics.
The increasing acceptance of Prolog as a practical programming language has motivated wide-
spread interest in dataflow analysis of logic programs and especially in abstract interpretation. Logic
programming languages are attractive from a semantical point of view, but dataflow analysis
of logic programs is more complex than that of more traditional programming languages, since
dataflow is bi-directional (owing to unification) and control flow is in terms of backtracking.
The present thesis is concerned with semantics-based dataflow analysis of logic programs. We
first set up a theory for dataflow analysis which is basically that of abstract interpretation as
introduced by P. and R. Cousot. We do, however, relax the classical theory of abstract interpretation
somewhat, by giving up the demand for (unique) best approximations.
Two different kinds of analysis of logic programs are then identified: a bottom-up analysis yields
an approximation to the success set (and possibly the failure set) of a program, whereas a top-down
analysis yields an approximation to the call patterns that would appear during an execution based
on SLD resolution. We investigate some of the uses of the two kinds of analysis. In the bottom-up
case, we pay special attention to (bottom-up) type inference and its use in program specialisation.
In the top-down case, we present a generic “dataflow semantics” that encompasses many useful
dataflow analyses. As an instance of the generic semantics, a groundness analysis is detailed, and
a series of other applications are mentioned.
We finally present a transformation technique, based on top-down analysis, for introducing
difference-lists in a list-manipulating program, without changing the program’s semantics. This
may lead to substantial improvement in a program’s execution time.
Contents
1 Introduction
2 Preliminaries
2.1 Lattices
2.2 Functions
2.3 Logic programs
3 Semantics-Based Analysis of Logic Programs
3.1 Approximate computation
3.2 Abstract interpretation
3.3 Denotational abstract interpretation
4 Dataflow Analysis of Logic Programs
5 Bottom-Up Analysis of Normal Logic Programs
5.1 Bottom-up semantics for logic programs
5.2 Approximation of success and failure sets
5.3 Applications and related work
6 Top-Down Analysis of Definite Logic Programs
6.1 SLD semantics
6.2 A semantics based on parametric substitutions
6.3 A dataflow semantics for definite logic programs
6.4 Approximating call patterns: Groundness analysis
6.5 Other applications and related work
7 Difference-Lists Made Safe
7.1 The difference-list problem
7.2 Manipulating difference-lists
7.3 Automatic difference-list transformation
7.4 Discussion
8 Conclusion
A Correctness of the base semantics
Bibliography
Index
Preface
The present thesis is submitted in fulfilment of the requirements for the degree of licentiatus
scientiarum (Ph. D.) in computer science at the University of Copenhagen, Denmark. It contains eight
chapters of which the first is an introduction and the eighth is a summary. The theory of dataflow
analysis presented in Section 3.2 was developed together with Kim Marriott [60]. Chapter 5 is
based on joint work with Kim Marriott [59, 60], Chapter 6 on joint work with Kim Marriott and
Neil Jones [58, 67], and Chapter 7 on joint work with Kim Marriott [63].
This account should justify my use of the (nowadays somewhat discredited) academic “we”
throughout the thesis: an “I” would in most cases border on fraudulence. The thesis is of course
based on the work of several other people, but it would be impossible to list them all here—hopefully
I have given appropriate references and credit in the report.
Most of the thesis was written during my two-and-a-half-year visit to the Department of Computer
Science at Melbourne University. I would like to thank that department for its hospitality.
The visit was initially made possible by the Australian Department of Education’s Australian-
European Award Program, and I have subsequently received support from the Danish Research
Academy and the Danish Research Council for Natural Sciences. During my stay I held a
scholarship from the University of Copenhagen. I am grateful to all these institutions.
The thesis was finished while I visited Kim Marriott at IBM T. J. Watson Research Center and
Saumya Debray at the University of Arizona. Neil Jones supervised my work from Copenhagen,
and Rodney Topor was an invaluable mentor in Melbourne. My academic indebtedness to Neil
Jones and Kim Marriott will be apparent from the summary above but weighs little compared
to what they mean to me as friends. The same goes for Graeme Port, an always constructive
proof-reader, and for Peter Sestoft, who has been the highly reliable link to my home department,
and with whom I have collaborated (via electronic mail) on topics not covered in this thesis. I
am also indebted to Philip Dart, Saumya Debray, Dean Jacobs, Lee Naish, and Uday Reddy for
much stimulating input to my work. Finally I would like to thank Kamakshi Padmanabhan for her
support. A good “geographic fortune” has brought me together with these people.
Tucson, Arizona, November 1989
Preface to the revised version
This report is a revised version of my thesis of the same title, which was accepted for the degree of
licentiatus scientiarum (Ph. D.) in computer science at the University of Copenhagen, Denmark,
in December 1989.
The examiners made many useful remarks; in particular, thanks are due to Alan Mycroft for
his very valuable suggestions for improvements. These made me want to revise the paper slightly,
but a revision was really made necessary when Kim Marriott pointed out an error in the proof
of the theorem that stated the equivalence between the SLD semantics and the “base semantics”
in 6.2. The equivalence does hold, and a proof of full abstraction of the base semantics would
have to include a proof of the equivalence. Here, however, we make use of the fact that, for our
purposes, it suffices to prove “half” of the equivalence, that is, we prove that the base semantics
safely approximates SLD resolution in a sense that is made precise in the thesis. Even this part is
complicated, so we have chosen to move the proof into an appendix in order not to interrupt the
flow of presentation. In general, Chapter 6 has been brought closer to the presentation in Marriott,
Søndergaard and Jones [67].
The remainder of the changes are mainly stylistic. Most effort has been put into expanding
discussions where the examiners found them too glib.
Melbourne, Easter 1990
Chapter 1
Introduction
Some of the most important programming tools studied in computer science are programs that
manipulate programs. Examples are interpreters, parsers, compilers, type-checkers and various
other error-finding devices, as well as partial evaluators and other kinds of program transformers.
The study and development of these tools will remain in the forefront of computer science research,
not only because of the paramount importance and complexity of the tools, but also because new
programming languages keep emerging, posing new challenges to tool construction.
In program transformation (with which we class compilation) the central problem is, broadly
formulated, given a program P , to generate a program P ′ which is in some sense equivalent to
P but which behaves better with respect to certain performance criteria. There are many well-
known solutions to this problem, especially in compilation: an object program produced by a
straightforward compilation algorithm is usually very inefficient, so most compilers include phases
that improve the generated code. Standard textbooks on compilation explain such transformations
as code motion, constant folding, induction variable elimination, and strength reduction.
Such automatic transformations may be construed as being made up of an analysis phase and
a subsequent synthesis phase where the information obtained by the analysis is utilised. Classical
analyses used by compilers include expression availability analysis, live variable analysis, and many
more.
These techniques are usually applied to (target) programs written in lower-level languages, but
similar techniques can be thought of for high-level languages. Stated generally, the purpose of
program analysis is to decide whether some invariant holds at some program point. It may thereby
be determined whether some transformation scheme is applicable, and possibly what the exact form
of the synthesis should be. The process of investigating such invariance is called static analysis, or,
as we prefer, dataflow analysis. Error-finding in programs, type inference, etc., may also be seen
as dataflow analysis.
Dataflow analyses are usually very complex and difficult to develop. Apart from trial and error
we only have very limited ways to convince ourselves of their correctness (or lack of correctness).
Fortunately this does not prevent us from building successful analysis tools. However, the scientific
answer to a complex practice that is hard to understand or manage is to build a theory. In computer
science, as in other natural sciences, a successful theory provides new points of view on a problem—
it may even be felt to have some explanatory element—and will hopefully, as a consequence, have
a positive influence on practice in turn.
In the case of dataflow analysis, an interesting view was suggested more than 25 years ago,
according to which dataflow analysis is “pseudo-evaluation,” that is, a process that somehow mimics
the normal execution of a program. Naur put this point of view to good use in explaining the type
checking component of the Gier Algol compiler, and Sintzoff later provided further examples of
its usefulness (we give references later). P. and R. Cousot formalised the idea by developing their
influential theory of abstract interpretation.
We later discuss abstract interpretation in great detail, but the following example may readily
convey the basic idea. Rather than using integers as data objects, a dataflow analysis may use
neg, zero, and plus to describe negative integers, 0, and positive integers, respectively. Then
by reinterpreting operations such as multiplication according to the “rules of signs,” the dataflow
analysis may establish certain properties of a program, such as “whenever control reaches this loop,
x is assigned a negative value.”
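The rule-of-signs reinterpretation can be made concrete. The following fragment is an illustrative sketch of our own (not from the thesis): neg, zero, and plus are the abstract values, and multiplication is reinterpreted over them; the closing loop checks, on a few samples, that the abstraction is faithful to concrete multiplication.

```python
def abstract(n):
    """Map a concrete integer to its sign description."""
    if n < 0:
        return "neg"
    if n == 0:
        return "zero"
    return "plus"

# Reinterpreted multiplication: "plus times minus yields minus", etc.
SIGN_TIMES = {
    ("neg", "neg"): "plus", ("neg", "zero"): "zero", ("neg", "plus"): "neg",
    ("zero", "neg"): "zero", ("zero", "zero"): "zero", ("zero", "plus"): "zero",
    ("plus", "neg"): "neg", ("plus", "zero"): "zero", ("plus", "plus"): "plus",
}

def times(a, b):
    return SIGN_TIMES[(a, b)]

# Faithfulness check: abstracting then multiplying agrees with
# multiplying then abstracting.
for x in (-3, 0, 7):
    for y in (-2, 0, 5):
        assert times(abstract(x), abstract(y)) == abstract(x * y)
```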
Abstract interpretation prescribes certain relations that should hold between a dataflow analysis
and the semantics of the programming language in question. If these relations hold, the dataflow
analysis is guaranteed to be correct. To formalise them, however, precise formal definitions of both
semantics and dataflow analysis are required. The analysis-as-pseudo-evaluation view is usually
reified as a strong similarity between the two definitions, a certain degree of congruence. This
naturally leads to viewing dataflow analysis as mere non-standard semantics.
Abstract interpretation illustrates the usefulness of formal semantics as a theoretical tool. A
formal definition of a full, complex programming language may well not be possible in practice, and
even if it is, it may itself be far too complex to be of any use. However, the possibility remains that
we can choose what we consider the essential part of a programming language and formalise its
semantics. By taking a formal definition as point of departure, we are led to solve a given dataflow
problem in a manner that is likely to be very different to a more ad hoc solution. We hope to
lend some credibility to that statement with the present thesis. The point is that, in striving for
congruence between a standard and a non-standard semantics, we are led to factor out issues of
execution order (flow of control) which is similar for the two, allowing us to focus on the relation
between the domains involved in their definitions, that is, between data. The solutions arrived at
this way are not necessarily better than ad hoc solutions, but there is a good chance that they
are, and in any case the process of formulating a particular dataflow analysis as a non-standard
semantics is bound to teach us more about the dataflow analysis problem.
Abstract interpretation of logic programs has gained considerable currency during the last few
years. The major impetus has been the quest for dataflow analyses that can improve code generation
in Prolog compilers. Logic programming languages are based on a principle of “separation of logic
and control,” which is desirable from a semantical point of view, but which also causes severe
problems for implementation. The lack of “control information” in programs provides a wide scope
for dataflow analysis of these languages and explains the currency that abstract interpretation has
gained in logic programming.
The theory has almost entirely been restricted to the case of definite logic programs executed
using SLD resolution with a standard (left-to-right) computation rule. Such a theory allows for
dataflow information to be propagated in a manner that resembles an SLD refutation of a query.
Analyses therefore yield information about call patterns that occur during the query evaluation
process. This information is exactly what a compiler needs to improve code generation, so the
major applications are in code improvement. We refer to this type of dataflow analysis as top-
down.
But other semantic models exist for logic programming languages. One of the best known is the
characterisation using an “immediate consequence function” TP , a semantics sometimes referred to
as “forward chaining” or “bottom-up.” A dataflow analysis developed from such a semantics gives
no information about call patterns, rather it approximates a program’s “success set.” We refer to
this type of dataflow analysis as bottom-up analysis.
The present thesis is concerned with both kinds of dataflow analysis, and their applications.
Readers are expected to be familiar with the theory of logic programming, domain theory, and
denotational semantics at the level of the textbooks by Lloyd [49] and Schmidt [88], for example.
We have tried to comply with the terminology used in these two books.
Chapter 2 recapitulates some mathematical notions and notation used throughout the thesis.
Readers may find this chapter useful for reference, but it cannot serve as an introduction to the
notions.
Chapter 3 is concerned with semantics-based dataflow analysis. We present the idea underlying
pseudo-evaluation, or, as we prefer, approximate computation. The theory of dataflow analysis
which we introduce relaxes P. and R. Cousot’s theory of abstract interpretation somewhat, so we
take some care to argue its adequacy. We also present Nielson’s extension of P. and R. Cousot’s
theory, that is, his denotational abstract interpretation. Nielson’s work is a strong case in favour
of denotational semantics as semantic formalism.
Chapter 4 discusses dataflow analysis of logic programs more specifically. The aim is to link
the general ideas to the specific case of logic programming before going into the details, and in
particular to explain the distinction between bottom-up and top-down analysis informally.
Chapter 5 covers bottom-up analysis. We give a semantic definition for normal logic programs
(in which clause bodies may contain negation). This definition is based on Kleene (three-valued)
logic and similar to a definition proposed by Fitting. We detail two dataflow analyses based on our
semantics, both essentially type inferences, one based on “depth k” abstractions (due to Sato and
Tamaki), another on “singleton” abstractions (due to Marriott, Naish and Lassez). These analyses
are shown to be sound with respect to our semantics and useful for error-finding in programs and
for program specialisation. We also show that every dataflow analysis that is correct with respect to
our semantics is automatically correct with respect to a semantics proposed by Kunen, to SLDNF
resolution, and to (sound) Prolog semantics.
Chapter 6 covers the case of top-down analysis. We give a denotational definition of an SLD
resolution-based semantics for definite logic programs (in which clause bodies may not contain
negation) and prove it correct. Negation can well be handled in a denotational definition, but to
do this in a sound way would complicate definitions unreasonably, so as to obscure the issues under
study. This is the reason for restricting attention to definite programs. Step by step we transform
the semantics into a “dataflow” semantics that forms a basis for a class of dataflow analyses. We
detail one member of this class, a so-called groundness analysis and show that it is correct with
respect to our semantics. Finally we discuss other dataflow analyses and related work.
Chapter 7 contains a discussion of difference-lists and their use in transformation of list-
processing Prolog programs, as described by Sterling and Shapiro. The transformation is rather
complex and may be unsafe in certain circumstances. We sketch how dataflow analyses, similar to
those already discussed, may establish the absence of the unfortunate circumstances, thus making
automatic difference-list transformation possible for most interesting list-processing programs.
Chapter 8 contains a conclusion. We summarise the thesis and suggest a series of related issues
that would be worthy of more study.
There is a bibliography and an index at the end of the thesis. A note about numbering in the
thesis may be useful: within each chapter, we number examples, lemmas, theorems etc. using the
same counter. We find that this speeds up search for a particular example, lemma, etc.
Chapter 2
Preliminaries
In this chapter we recapitulate some basic notions and facts from domain theory and explain some
notation that will be used throughout the report. The chapter is not meant as an introduction
to the notions, merely as a reference for readers who may occasionally feel the need for a precise
definition. For a detailed introduction, readers are referred to a textbook, for example Schmidt’s [88]
or Birkhoff’s book on lattice theory [5].
2.1 Lattices
Let I denote the identity relation on a set X. A preordering on X is a binary relation R that
is reflexive (I ⊆ R) and transitive (R · R ⊆ R). A partial ordering is a preordering that is
antisymmetric (R ∩ R−1 ⊆ I). A set equipped with a partial ordering is a poset. Let (X,≤) be a
poset. A (possibly empty) subset Y of X is a chain iff for all y, y′ ∈ Y, y ≤ y′ ∨ y′ ≤ y.
Let (X,≤) be a poset. An element y ∈ Y ⊆ X is maximal in Y iff {y′ ∈ Y | y ≤ y′} = {y}.
Dually we may define minimality in Y . An element x ∈ X is an upper bound for Y iff y ≤ x for all
y ∈ Y . Dually we may define a lower bound for Y . An upper bound x for Y is the least upper
bound for Y iff, for every upper bound x′ for Y , x ≤ x′; when it exists, we denote it by ⊔Y .
Dually we may define the greatest lower bound ⊓Y for Y .
A poset for which every subset possesses a least upper bound is a complete join-lattice. A poset
for which every subset possesses a greatest lower bound is a complete meet-lattice. A poset that has
both properties is a complete lattice. In particular, equipped with the subset ordering, the powerset
of X, denoted by P X, is a complete lattice. For a complete lattice X, we denote ⊔∅ = ⊓X by
⊥X and ⊓∅ = ⊔X by ⊤X . A poset for which every finite subset possesses a least upper bound
and a greatest lower bound is a lattice. A sublattice of a (complete) lattice X is a subset of X
which preserves the least upper bound and greatest lower bound operations of X.
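As a concrete illustration of the powerset case (an example of ours, not the text's): for a finite X, the join of a collection of subsets is its union and the meet its intersection, so ⊥ = ⊔∅ = ∅ and ⊤ = ⊓∅ = X.

```python
from itertools import combinations

X = {1, 2, 3}
# Enumerate P X as frozensets, from the empty set up to X itself.
powerset = [frozenset(c) for r in range(len(X) + 1)
            for c in combinations(sorted(X), r)]

def join(subsets):
    """Least upper bound under subset ordering: union."""
    out = frozenset()
    for s in subsets:
        out |= s
    return out

def meet(subsets):
    """Greatest lower bound under subset ordering: intersection."""
    out = frozenset(X)
    for s in subsets:
        out &= s
    return out

# bottom: join of the empty collection, also the meet of the whole lattice
assert join([]) == frozenset() == meet(powerset)
# top: meet of the empty collection, also the join of the whole lattice
assert meet([]) == frozenset(X) == join(powerset)
```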
From a dataflow analysis point of view there are a number of interesting special types of complete
lattices (we explain why shortly). Let card Y denote the cardinality of the set Y . The complete
lattice X
• is of finite height iff max {card Y | Y ⊆ X is a chain} is finite.
• has the finite chain property iff every chain Y ⊆ X is finite.
• is ascending chain finite (or Noetherian) iff every non-empty subset Y ⊆ X has an element
that is maximal in Y . Dually X may be descending chain finite.
(We will not be concerned with lattices in general, only complete lattices, but it may be of some
interest that if a lattice has any of the above properties then it is complete.) It is clear that a
complete lattice of finite height has the finite chain property, and one that has the finite chain
property is both ascending and descending chain finite. The following two examples show that the
relations between the classes are proper inclusions.
Example 2.1 Let X = ⋃n∈N {(n, j) | 1 ≤ j ≤ n} ∪ {⊥, ⊤} be ordered by
• ⊥ ⊑ x for all x ∈ X.
• x ⊑ ⊤ for all x ∈ X.
• (n, j) ⊑ (n, j′) if j ≤ j′.
• For no other pair (x, x′) does x ⊑ x′ hold.
Then X has the finite chain property, but X is not of finite height.
Example 2.2 Let X = N ∪ {⊥} be ordered by
• ⊥ ⊑ n for all n ∈ N .
• n ⊑ n′ if n′ ≤ n.
• For no other pair (n, n′) does n ⊑ n′ hold.
Then X is ascending chain finite, but X does not have the finite chain property. A similar
example shows that there are lattices that are descending chain finite without having the finite
chain property.
2.2 Functions
Functions are generally used in their Curried form. Our notation for function application uses
parentheses sparingly. Only when it would seem to help the eye, shall we make use of redundant
parentheses. As usual, function space formation X → Y associates to the right, and function
application to the left. We occasionally use the lambda notation (due to Church) for functions and
we use “◦” for function composition.
Let F : X → Y be a function. Then F is injective iff F x = F x′ ⇒ x = x′ for all x, x′ ∈ X,
and F is bijective iff there is a function F ′ : Y → X such that F ◦ F ′ and F ′ ◦ F are identity
functions. We define F ’s distributed version to be the function (∆F ) : P X → P Y , defined by
∆F Z = {F z | z ∈ Z}. A function F : X → X is idempotent iff F (F x) = F x for all x ∈ X.
For any set X let X∗ denote the set of finite sequences of elements of X. The empty sequence
is denoted by nil and we use the operator “:” for sequence construction. The folded version of a
function F : X → Y → Y is given by application of Σ : (X → Y → Y ) → X∗ → Y → Y . The
functional Σ is defined by
Σ F nil z = z
Σ F (x : y) z = Σ F y (F x z).
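The functional Σ is an ordinary left fold over a sequence. A direct transcription of the two defining equations (an illustrative sketch of ours, with F uncurried to Python's two-argument form):

```python
def fold(F, seq, z):
    """Sigma F seq z, following the defining equations."""
    if not seq:                # Σ F nil z = z
        return z
    x, y = seq[0], seq[1:]     # Σ F (x : y) z = Σ F y (F x z)
    return fold(F, y, F(x, z))

# Folding addition over [1, 2, 3] from the initial value 0 gives 6.
assert fold(lambda x, z: z + x, [1, 2, 3], 0) == 6
```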
Let (X,≤) and (Z,⊑) be posets. A function F : X → Z is monotonic iff x ≤ x′ ⇒ F x ⊑
F x′ for all x, x′ ∈ X. In what follows, monotonicity of functions is essential, so much so that it
is understood throughout the report that X → Z denotes a space of monotonic functions. Let X
and Z be complete lattices. A function F : X → Z is strict iff F ⊥X = ⊥Z and continuous iff
for every non-empty chain Y ⊆ X, ⊔(∆F Y ) = F (⊔Y ). Dually we may define co-strictness and
co-continuity.
A fixpoint for a function F : X → X is an element x ∈ X such that x = F x. If X is a complete
lattice, then the set of fixpoints for (the monotonic) F : X → X is itself a complete lattice (though
not in general a sublattice of X). The least element of this lattice is the least fixpoint for F , and we
denote it by lfp F . Dually there is a greatest fixpoint for F , which we denote by gfp F . Furthermore,
defining
F ↑ α = ⊔ {F ↑ α′ | α′ < α} if α is a limit ordinal,
F ↑ α = F (F ↑ (α − 1)) if α is a successor ordinal,
there is some ordinal α such that F ↑α = lfp F . Dually we may define F ↓ α for any ordinal α, and
again there is some ordinal α such that F ↓ α = gfp F . The sequence (F ↑ 0), (F ↑ 1), . . . , (lfp F ) is
the ascending Kleene sequence for F . Dually we may define the descending Kleene sequence.
The reason for our interest in the three classes of lattices mentioned above is that if a monotonic
function is defined on a lattice that is ascending chain finite then it has a finite ascending Kleene
sequence. Dually, if the lattice is descending chain finite then the function has a finite descending
Kleene sequence. This means that fixpoints can be computed in finite time.
Let X be a complete lattice. A predicate Q is inclusive on X iff for all (possibly empty) chains
Y ⊆ X, Q (⊔Y ) holds whenever (Q y) holds for every y ∈ Y . Dually Q is co-inclusive on X
iff for all chains Y ⊆ X, Q (⊓Y ) holds whenever (Q y) holds for every y ∈ Y . Inclusive and
co-inclusive predicates are admissible in fixpoint induction. Assume that F : X → X is monotonic
and (Q x)⇒ Q (F x) for all x ∈ X. If Q is inclusive then Q (lfp F ) holds. If Q is co-inclusive then
Q (gfp F ) holds. We shall also use a slightly stronger version of the first case: clearly it suffices for
(Q x) to imply Q (F x) for all x ≤ lfp F . All cases of this induction principle are easily proved by
transfinite induction.
2.3 Logic programs
When speaking about logic programs, we seek to comply with the terminology used by Lloyd [49].
In matters of naming and reference we use italic capital letters as meta-variables (although we do
not restrict them to this use), and we employ the pair of brackets “[[” and “]]” for quasi-quotation
(as is common in denotational semantics).
Let Var , Fun, and Pred denote the disjoint syntactic categories of variables, functors, and
predicate symbols, respectively. (We call a collection Fun ∪ Pred a lexicon). The set Var is
assumed to be countably infinite. The sets Fun and Pred are assumed to be non-empty, and each
of their elements has an associated natural number which is its arity. From these sets, programs
and queries can be constructed as follows.
• The set Term of terms is defined recursively: every term is either a variable V ∈ Var or a
construction [[F (T1, . . . , Tn)]], where F ∈ Fun has arity n ≥ 0 and T1, . . . , Tn are terms. In
particular, a functor with arity 0 is a term.
• An atom is a construction [[Q(T1, . . . , Tn)]], where Q ∈ Pred has arity n ≥ 0 and T1, . . . , Tn
are terms. We let Atom denote the set of atoms.
• A literal is an atom A or a negation of an atom, that is, a construction [[¬ A]] where A ∈ Atom.
• A body is a conjunction of literals, that is, a construction [[L1, . . . , Ln]] where L1, . . . , Ln are
literals. Note that a body is finite and possibly empty.
• A clause consists of an atom A (its head) and a body B and is written [[A← B]]. We let
Clause denote the set of clauses.
• A normal program is a finite collection of clauses.
• A definite program is a normal program that is constructed without negation.
• A query is a conjunction of literals, written [[←L1, . . . , Ln]] where L1, . . . , Ln are literals. If
L1, . . . , Ln are all atoms, the query is definite.
We let Prog denote the set of programs, normal or definite, depending on the context. We
assume that we are given a function vars : (Prog ∪ Atom ∪ Term) → P Var , such that (vars S) is
the set of variables that occur in the syntactic object S. A syntactic object S is ground iff it is
constructed without variables, that is, vars S = ∅. We let Her denote the Herbrand base, that is,
the set of ground atoms (for some fixed lexicon of functors and predicate symbols).
Logic programs can be given a clean model-theoretic semantics. Readers are referred to
Lloyd [49] for details; let us merely recall that a (classical) model which is a subset of the
Herbrand base (for some fixed lexicon) is called a Herbrand model.
A substitution is an almost-identity mapping θ ∈ Sub ⊆ Var → Term . Substitutions are
not distinguished from their natural extensions to other syntactic categories. Our notation for
substitutions is standard. For instance {x ↦ a} denotes the substitution θ such that (θ x) = a and
(θ V ) = V for all V ≠ x. We let ι denote the identity substitution.
An instance of a syntactic object S is an application θ S. For a syntactic object S, (ground S)
denotes the set of ground instances of S (for some fixed lexicon).
Given a definite program P , the immediate consequence function TP : P Her → P Her is defined
by
TP U = {A ∈ Her | ∃C ∈ P . [[A←B]] ∈ ground C ∧ B ⊆ U}.
Here we have used the common convention of viewing a program P as a set of clauses C and a
body B as a set of atoms. Reading clauses as implications, TP gives, via modus ponens, those
consequences of a set of assumptions that can be deduced using each clause once. The function
TP is monotonic (in fact continuous), and so has a least fixpoint which happens to be the smallest
Herbrand model for the program¹ [98]. This model is also the program’s success set.
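For ground programs, TP and the computation of its least fixpoint (and hence of the success set) can be sketched in a few lines. The encoding below is our own and purely illustrative: a clause is a (head, body) pair of ground atoms written as strings.

```python
def T_P(program, U):
    """Immediate consequences: heads of clauses whose whole body is in U."""
    return {head for (head, body) in program if set(body) <= U}

def success_set(program):
    """lfp(T_P), computed by the finite ascending Kleene sequence."""
    U = set()
    while True:
        V = T_P(program, U)
        if V == U:
            return U
        U = V

# Ground instances of:  nat(0).  nat(s(0)) :- nat(0).  nat(s(s(0))) :- nat(s(0)).
program = [("nat(0)", ()),
           ("nat(s(0))", ("nat(0)",)),
           ("nat(s(s(0)))", ("nat(s(0))",))]
assert success_set(program) == {"nat(0)", "nat(s(0))", "nat(s(s(0)))"}
```

Each iteration applies every clause once, via modus ponens, to the consequences derived so far; the loop stops at the first fixpoint, which is the smallest Herbrand model.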
A unifier of A,H ∈ Atom is a substitution θ such that (θ A) = (θ H). If such a unifier exists,
then A and H are unifiable. A unifier θ of A and H is an (idempotent) most general unifier of A
and H iff θ′ = θ′ ◦ θ for every unifier θ′ of A and H. Two atoms A and H have a most general
unifier whenever they are unifiable. The auxiliary function mgu : Atom → Atom → P Sub is
defined as follows. If A and H are unifiable, then (mgu A H) yields a singleton set consisting of a
most general unifier of A and H. Otherwise (mgu A H) = ∅.
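A unification algorithm in the style of mgu can be sketched as follows. The term representation is a simplified one of our own (not the thesis's): a variable is ("var", name) and a compound term or atom is a tuple whose first element is its functor or predicate symbol; the algorithm returns a most general unifier as a binding map, or None when none exists.

```python
def walk(t, subst):
    """Follow variable bindings to a representative term."""
    while t[0] == "var" and t in subst:
        t = subst[t]
    return t

def occurs(v, t, subst):
    """Occurs check: does variable v occur in term t under subst?"""
    t = walk(t, subst)
    if t == v:
        return True
    return t[0] != "var" and any(occurs(v, a, subst) for a in t[1:])

def unify(a, b, subst=None):
    """Return an (idempotent) most general unifier as a dict, or None."""
    subst = dict(subst or {})
    stack = [(a, b)]
    while stack:
        s, t = (walk(x, subst) for x in stack.pop())
        if s == t:
            continue
        if s[0] == "var":
            if occurs(s, t, subst):
                return None
            subst[s] = t
        elif t[0] == "var":
            stack.append((t, s))          # bind the variable side
        elif s[0] == t[0] and len(s) == len(t):
            stack.extend(zip(s[1:], t[1:]))  # same functor: unify arguments
        else:
            return None                   # functor clash: not unifiable
    return subst

# unify p(X, f(Y)) with p(a, f(b)) gives {X -> a, Y -> b}
A = ("p", ("var", "X"), ("f", ("var", "Y")))
H = ("p", ("a",), ("f", ("b",)))
assert unify(A, H) == {("var", "X"): ("a",), ("var", "Y"): ("b",)}
```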
Let G = ←A1, . . . , An be a query with selected atom Ai and let C = H ← A′1, . . . , A′k be a
clause. If Ai and H are unifiable, then

θ [[A1, . . . , Ai−1, A′1, . . . , A′k, Ai+1, . . . , An]],

where {θ} = mgu Ai H, is a resolvent of G and C with unifier θ.
Let P be a definite program and let G be a query. An SLD derivation of P ∪ {G} consists of
• a maximal sequence G0, G1, . . . of negative clauses with G0 = G,
• a sequence C0, C1, . . . of fresh variants of clauses from P (that is, the variables in Ci are
consistently replaced by variables not in C0, . . . , Ci−1, or G),
• a sequence θ0, θ1, . . . of substitutions
1The immediate consequence function is called T by van Emden and Kowalski [98] the name TP is due to Apt
and van Emden [2].
10 CHAPTER 2. PRELIMINARIES
such that for all i, Gi+1 is a resolvent of Gi and Ci with unifier θi. An SLD derivation may be
finite or infinite. Assume it is finite, with final elements Gn+1, Cn, and θn. Then if Gn+1 is empty,
the derivation is successful, otherwise it is finitely failed. If it is successful, the computed answer is
θn ◦ . . . ◦ θ0, restricted to the set of variables occurring in G.
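The notions of SLD derivation and computed answer can be observed on a toy interpreter (an illustrative sketch under assumed encodings, not the thesis's machinery): leftmost selection, clauses tried in textual order, fresh variants made by renaming, and, as in Prolog, no occurs check.

```python
# A toy SLD-resolution procedure. Variables are strings starting with '?';
# atoms and terms are tuples (functor, args...).
import itertools

counter = itertools.count()

def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def walk(th, t):
    while is_var(t) and t in th:
        t = th[t]
    return t

def unify(s, t, th):
    s, t = walk(th, s), walk(th, t)
    if s == t:
        return th
    if is_var(s):
        return {**th, s: t}
    if is_var(t):
        return {**th, t: s}
    if isinstance(s, tuple) and isinstance(t, tuple) \
            and s[0] == t[0] and len(s) == len(t):
        for a, b in zip(s[1:], t[1:]):
            th = unify(a, b, th)
            if th is None:
                return None
        return th
    return None

def rename(t, i):                  # fresh variant: ?X becomes ?X_i
    if is_var(t):
        return t + "_" + str(i)
    return (t[0],) + tuple(rename(a, i) for a in t[1:])

def resolve_term(t, th):           # fully apply the accumulated substitution
    t = walk(th, t)
    if isinstance(t, tuple):
        return (t[0],) + tuple(resolve_term(a, th) for a in t[1:])
    return t

def solve(goals, program, th):
    """Yield one substitution per successful SLD derivation of the goals."""
    if not goals:
        yield th
        return
    selected, rest = goals[0], goals[1:]
    for head, body in program:
        i = next(counter)
        th2 = unify(selected, rename(head, i), th)
        if th2 is not None:
            yield from solve([rename(b, i) for b in body] + rest, program, th2)

# append([], Ys, Ys).  append([H|T], Ys, [H|Zs]) :- append(T, Ys, Zs).
program = [
    (("app", ("nil",), "?Y", "?Y"), []),
    (("app", ("cons", "?H", "?T"), "?Y", ("cons", "?H", "?Z")),
     [("app", "?T", "?Y", "?Z")]),
]

# Query ← app(X, Y, [a, b]): each computed answer is one way to split [a, b].
goal = ("app", "?X", "?Y", ("cons", ("a",), ("cons", ("b",), ("nil",))))
answers = [(resolve_term("?X", th), resolve_term("?Y", th))
           for th in solve([goal], program, {})]
print(len(answers))
```

Each yielded substitution, restricted to the query's variables, is one computed answer; here there are three, one per way of splitting the two-element list.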
Chapter 3
Semantics-Based Analysis of Logic Programs
Our aim is to provide a theory for semantics-based dataflow analysis of logic programs. Fortunately
the area of dataflow analysis is well researched, so we can draw on several useful sources. In this
chapter we recapitulate as much of previous work as needed for the rest of the report. In Section 3.1
we present the idea of approximate computation on which P. and R. Cousot based their theory of
abstract interpretation. This theory is the most important influence on our work. We present the
theory in Section 3.2, together with a relaxation of it, based on a notion of “insertion.” In Section 3.3
we present (a variant of) Nielson’s powerful elaboration of P. and R. Cousot’s work, namely his
theory of denotational abstract interpretation. This provides us with the basic theoretical tools
needed in the remainder of the report. These tools are independent of any particular programming
language.
3.1 Approximate computation
Abstract interpretation captures an idea of performing approximate computations. Every program-
ming language L has some notion of “values,” such that an interpreter for L executes L-programs
by manipulating values. A pseudo-evaluator for L, on the other hand, performs approximate com-
putation: it manipulates (imprecise) descriptions of values in a way that is faithful to how an
interpreter would act.
Approximate computation is well-known from everyday use. One example is the casting out
of nines to check numerical computations, another is the application of the rules of signs, such as
“plus times minus yields minus.” The disadvantage of an approximate computation is, of course, that
the results it yields are in general less precise than those of a proper computation. But this is
compensated for by the fact that an approximate computation usually is much faster than a proper
computation.
mult | neg   zero  pos
-----+-----------------
neg  | pos   zero  neg
zero | zero  zero  zero
pos  | neg   zero  pos

Table 3.1: Multiplying signs
Our concern is automatic approximate computation of programs. In this case, the difference
in speed between proper and approximate computation may be exorbitant: a proper computation
may well fail to terminate where an approximate computation can finish in finite time.
We may say that an approximate computation is an evaluation of a formula or a program,
not over its standard domain U, but over a set D of descriptions of collections of objects in U.
The domain U may consist of integers, states, terms, sets of atomic formulas, substitutions, or
whatever, depending on how the semantics is modelled, and D is determined by what sort of
program properties we want to expose.
Of course, when performing an approximate computation, one must reinterpret all operators so
as to apply to descriptions rather than to proper values. As an example, let Z denote the set of
integers and consider the set of descriptions
D1 = {neg, zero, pos}
which may be used to approximate integers in the obvious way: neg is a proxy for all negative
integers, zero for 0, and pos for positive integers. Multiplication is reinterpreted as the function
mult : D1 × D1 → D1, defined by Table 3.1. This is an adequate interpretation, since whenever x ∗ y = z
and x, y ∈ Z are described by φ, φ′ ∈ D1 respectively, then z ∈ Z is described by mult (φ, φ′).
The imprecision inherent in descriptions may however cause problems. Consider the “addition”
of the descriptions neg and pos. We cannot express the sum as an element of D1. If we want to
mimic addition of integers, we have to include yet another, more imprecise description, ⊤, which
applies to all integers:
D2 = {neg, zero, pos,⊤}.
Now we may reinterpret addition as in Table 3.2.
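Tables 3.1 and 3.2 can be written out as executable operations on descriptions, together with an exhaustive soundness check over a finite sample of Z (the sample range and all names here are illustrative assumptions):

```python
NEG, ZERO, POS, TOP = "neg", "zero", "pos", "⊤"

def describe(z):
    return NEG if z < 0 else POS if z > 0 else ZERO

def mult(p, q):                     # Table 3.1 (extended to ⊤, using 0·x = 0)
    if ZERO in (p, q):
        return ZERO
    if TOP in (p, q):
        return TOP
    return POS if p == q else NEG

def add(p, q):                      # Table 3.2
    if p == q:
        return p
    if p == ZERO:
        return q
    if q == ZERO:
        return p
    return TOP

# Adequacy, as stated in the text: if x is described by φ and y by φ′, then
# x·y is described by mult(φ, φ′) and x+y by add(φ, φ′).
for x in range(-20, 21):
    for y in range(-20, 21):
        p, q = describe(x), describe(y)
        assert describe(x * y) == mult(p, q)
        assert add(p, q) in (TOP, describe(x + y))
print("tables are adequate on the sample")
```

Note that mult never needs ⊤ on precise arguments, while add(neg, pos) must answer ⊤: exactly the imprecision that forced the move from D1 to D2.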
Some points may be noted at this stage. First, in spite of their imprecision, approximate
computations, if properly designed, may well yield much useful information. Casting out nines or
using the rules of signs exemplify this, and more examples will appear later.
Second, we are not primarily interested in the results of computations. For program analysis
purposes, our main concern is to extract information about invariance at particular program points,
such as “at this point, variable x is always assigned positive values” or “this term becomes ground
add  | neg  zero  pos  ⊤
-----+-------------------
neg  | neg  neg   ⊤    ⊤
zero | neg  zero  pos  ⊤
pos  | ⊤    pos   pos  ⊤
⊤    | ⊤    ⊤     ⊤    ⊤

Table 3.2: Adding signs
during execution.” There are examples of analyses that yield useful invariance information, even
though what they return as a description of the final result may be poor.
Third, there is a degree of freedom in choosing the objects that are to serve as descriptions.
Exactly what they should be depends on the purpose of the analysis: we design them according to
the program properties that we want to expose and the precision that we want to obtain.
Fourth, we are usually interested in properties that are undecidable. Since we want our ap-
proximate computations to terminate (we want some information), it follows that the information
we get is necessarily imprecise. This is acceptable: a dataflow analysis need not tell the whole
truth, although it should of course not contradict truth. In this way, approximate computation
is to standard computation what numerical analysis is to mathematical analysis: using numeri-
cal techniques, problems with no known analytic solution can be “solved” numerically, that is, a
solution can be estimated within an interval of error.
Finally, the notion of approximate computation should not be confused with that of “profil-
ing” computation (or program “instrumentation” in general). A profiling computation also yields
information about a standard computation, but it does so by extending it rather than approximat-
ing it. There are two characteristics that distinguish profiling from approximate computation. A
profiling computation yields precise information about runtime behaviour, but it is not guaranteed
to terminate, whereas an approximate computation yields approximate information in finite time.
Furthermore, profiling may extract information about implementation-dependent behaviour, such
as the execution time spent in a particular procedure. Approximate computation, in our sense of
the term, is concerned only with purely semantic properties, that is, with runtime behaviour that
can be attributed to the programming language in question, rather than to a particular machine
on which it is implemented.
The idea of computing by means of descriptions for analysis purposes is not new. Naur very
early identified the idea and applied it in work on the Gier Algol compiler [75]. Naur coined the
term pseudo-evaluation for what would later be described as
“a process which combines the operators and operands of the source text in the manner
in which an actual evaluation would have to do it, but which operates on descriptions
of the operands, not on their values” [38].
The same basic idea is found in work by Reynolds [83] and by Sintzoff [91]. Sintzoff used it
for proving a number of well-formedness aspects of programs in an imperative language, and for
verifying termination properties.
By the mid seventies, efficient dataflow analysis had been studied rather extensively by re-
searchers such as Kam, Kildall, Tarjan, Ullman, and others (for references, see Hecht’s book [31]).
In an attempt to unify much of that work, a precise framework for discussing approximate computa-
tion (of imperative programs) was developed by Patrick and Radhia Cousot [13, 14]. The advantage
of such a unifying framework is that it serves as a basis for understanding various dataflow analyses
better, including their interrelation, and for discussing their correctness.
The overall idea of P. and R. Cousot was to define an “extended” semantics which associates
with each program point the set of possible storage states that may obtain at run-time whenever
execution reaches that point (P. and R. Cousot called this semantics a “static” semantics). A
dataflow analysis can then be construed as a finitely computable approximation to the extended
semantics. We detail this general idea in Chapters 5 and 6.
Of the example applications given by P. and R. Cousot we mention: program verification
(descriptions being predicates over program variables), performance analysis (descriptions being
positive real numbers representing the mean number of times a program point is reached, given
that probabilities have been attached to test nodes), and the finding of bounds for integer variables
(descriptions being intervals) [13].
The work of P. and R. Cousot has later been extended to declarative languages. Such extensions
are not straightforward. For example, there is no clear-cut notion of “program point” in functional
or logic programs. Also, dataflow analysis of programs written in a language such as Prolog
differs somewhat from the analysis of programs written in more conventional languages because
the dataflow is bi-directional, owing to unification, and the control flow is more complex, owing to
backtracking. We return to the case of logic programming in Chapter 4.
Of the applications of abstract interpretation in functional programming we mention work by
Jones [39] and by Jones and Muchnick [40] on termination analysis for lambda expressions and,
in a Lisp setting, improved storage allocation schemes through reduced reference counting. The
main application, though, has been strictness analysis, which is concerned with the problem of
determining cases where applicative order may safely be used instead of normal order execution.
The study of strictness analysis was initiated by Mycroft [73] and the literature on the subject is
now quite extensive, see for example Nielson [77].
We discuss the history and applications of abstract interpretation of logic programming lan-
guages later (Sections 5.3 and 6.5). For a general introduction to abstract interpretation and an
extensive list of references, readers are referred to the book edited by Abramsky and Hankin [1].
mult | odd   even  ⊤
-----+----------------
odd  | odd   even  ⊤
even | even  even  even
⊤    | ⊤     even  ⊤

Table 3.3: Multiplication of parities
3.2 Abstract interpretation
Abstract interpretation formalises the idea of approximate computation. Assume we have a data
domain U and operators on U. In our (continued) example, U is Z, the set of integers. We also
have a set D of descriptions, in our case {neg, zero, pos,⊤}. To be precise about the relation
between values and descriptions, a concretization function γ : D → P U is defined. For every
description φ ∈ D, (γ φ) is the set of objects which φ describes. Thus γ is the semantic function
for descriptions.
Example 3.1 (Signs) We define γ : D2 → P Z by
γ neg = {z ∈ Z | z < 0}
γ zero = {0}
γ pos = {z ∈ Z | z > 0}
γ ⊤ = Z.
Example 3.2 (Parity) We have D = {even, odd,⊤} and define γ : D → P Z by
γ even = {z ∈ Z | z is even}
γ odd = {z ∈ Z | z is odd}
γ ⊤ = Z.
Note that descriptions are ordered according to how large a set of objects they apply to: the more
imprecise, the “higher” they sit in the structure. Example 3.2 allows us to illustrate the point that
application of an operator to a description of “everything,” ⊤, need not involve loss of information
(operators need not be co-strict). To see this, consider “multiplication” of parities as defined in
Table 3.3. Also note that the ordering on descriptions in a sense is opposite to the ordering used
in domain theory: for descriptions, the top element corresponds to total lack of information.
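Table 3.3 can be coded directly; the point about non-co-strict operators is then an executable observation (a small illustrative sketch; the names are assumptions):

```python
EVEN, ODD, TOP = "even", "odd", "⊤"

def mult_parity(p, q):              # Table 3.3
    if EVEN in (p, q):
        return EVEN                 # any multiple of an even number is even
    if p == ODD and q == ODD:
        return ODD
    return TOP

print(mult_parity(TOP, EVEN))       # "even": no information is lost
```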
To see how a computation using the descriptions in Example 3.2 can proceed, consider the
program1 in Figure 3.1. Program points are numbered edges. Descriptions can be propagated in
the graph by appropriately interpreting the commands. In the parity example, the fact that n is
odd at program point 5 is used to conclude that n is even at point 6.

[Figure 3.1: A flow diagram — a flowchart whose numbered edges are the program points 1-7, built from the statements read n, n := n/2, n := 3n + 1, and stop, and the tests n = 1 ? and even n ?, each with a yes and a no branch.]

1Collatz’s problem in number theory amounts to determining whether this program terminates for all n ∈ N.

Assuming n is ⊤ at point 1, we get the following descriptions of n at other points: at 2 it is ⊤, at 3 it is even, at 4 it is ⊤,
at 5 it is odd, at 6 it is even, and at 7 it is odd. This information justifies a transformation of the
program so as to avoid a number of parity tests: the statement n := 3n + 1 can be replaced by
n := (3n + 1)/2. However, the information cannot, as we shall see, be used to conclude that the
program terminates. In the signs example, assuming n is pos at program point 1, n will be pos at
every point.
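The parity propagation just described can be reproduced as a small fixpoint computation over the flow graph. The edge structure below is our reading of Figure 3.1, and is an assumption, as are all names:

```python
# Parity descriptions: 'bot' = unreachable, 'top' = unknown parity.
BOT, EVEN, ODD, TOP = "bot", "even", "odd", "top"

def join(a, b):
    if a == BOT: return b
    if b == BOT: return a
    return a if a == b else TOP

def meet(a, b):
    if a == TOP: return b
    if b == TOP: return a
    return a if a == b else BOT

def halve(p):                         # n := n/2: an even n may get either parity
    return BOT if p == BOT else TOP

def three_n_plus_1(p):                # n := 3n + 1 flips parity
    return {EVEN: ODD, ODD: EVEN, TOP: TOP, BOT: BOT}[p]

# Edges (source point, target point, transfer), as assumed from Figure 3.1.
edges = [
    (1, 2, lambda p: p),              # "n = 1?" no-branch
    (1, 7, lambda p: meet(p, ODD)),   # "n = 1?" yes-branch: 1 is odd
    (2, 3, lambda p: meet(p, EVEN)),  # "even n?" yes-branch
    (2, 5, lambda p: meet(p, ODD)),   # "even n?" no-branch
    (3, 4, halve),                    # n := n/2
    (5, 6, three_n_plus_1),           # n := 3n + 1
    (4, 1, lambda p: p),              # back edges to the "n = 1?" test
    (6, 1, lambda p: p),
]

desc = {i: BOT for i in range(1, 8)}
desc[1] = TOP                         # n read from input: no information
changed = True
while changed:                        # Kleene iteration to a fixpoint
    changed = False
    for s, t, f in edges:
        new = join(desc[t], f(desc[s]))
        if new != desc[t]:
            desc[t], changed = new, True
print(desc)
```

The fixpoint agrees with the descriptions listed in the text: ⊤ at points 1, 2 and 4, even at 3 and 6, odd at 5 and 7.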
We follow P. and R. Cousot in demanding that the set of descriptions, D, forms a complete
lattice (that was not the case in our previous examples, but this will soon be rectified). The use
of complete lattices facilitates simple fixpoint characterisations of the non-standard semantics, as
we shall later see. In the Cousot framework, the semantics of a description is given by a function
γ : D → P U , such that (γ φ) is the set of objects described by φ ∈ D. The function γ should be
injective, as we do not want redundant descriptions: once we have a name for a particular set of
objects, that suffices. We also want γ to be co-strict since there should be a description for every
collection of objects, in particular we want a name for “everything.” This is a modest demand: in
the signs example we saw how it was inevitable, because of the inherent imprecision in descriptions.
The intuition behind organising the set of descriptions as a complete lattice is as follows. The
concretization function γ should be monotonic. In fact we let the partial ordering ⊑ on D be
defined by φ ⊑ φ′ iff (γ φ) ⊆ (γ φ′). As already mentioned we can read φ ⊑ φ′ as: whatever φ
describes is (also, though maybe not as precisely) described by φ′. Since D is a complete lattice, it
holds that for every subset D′ ⊆ D there exists a unique element ⊔D′ ∈ D such that

∀φ ∈ D′ . φ ⊑ ⊔D′,   (3.1)
(∀φ ∈ D′ . φ ⊑ φ′) ⇒ ⊔D′ ⊑ φ′.   (3.2)
During the execution of a program, the same program point may be reached many times, but with
different descriptions of the “current state.” Now if D′ is the set of descriptions that might occur
at some program point, then ⊔D′ must be the best overall description for the point. Namely, (3.1)
states that ⊔D′ describes everything described by members of D′, and (3.2) ensures that ⊔D′ is
as precise as can be.
Let us give some examples of the use of lattices as description domains.
Example 3.3 (Void) We take as descriptions D = {⊤} and define γ ⊤ = U . That is, there is
only one description, ⊤, and ⊤ is very imprecise since it applies to every value in U . This gives
rise to a non-standard semantics which gives no useful dataflow information.
Example 3.4 (Reachability) Let D = {⊥,⊤}, γ ⊥ = ∅, and γ ⊤ = U . This simple domain can
be used in a so-called reachability analysis: unreachable program points are described by ⊥, and
possibly reachable points by ⊤.
Example 3.5 (Signs revisited) Let D = {⊥, neg, zero, pos,⊤}, and let the ordering ⊑ be de-
fined by φ ⊑ φ′ iff φ = ⊥ ∨ φ = φ′ ∨ φ′ = ⊤. The concretization function is γ from Example 3.1,
extended so that (γ ⊥) = ∅.
Example 3.6 (Overlapping signs) Let D = {⊥, nonpos, nonneg,⊤}, and let the ordering ⊑ be
defined as in Example 3.5. We define γ : D → P Z by
γ ⊥ = ∅
γ nonpos = {z ∈ Z | z ≤ 0}
γ nonneg = {z ∈ Z | z ≥ 0}
γ ⊤ = Z.
Examples 3.5 and 3.6 illustrate an important point. To form a complete lattice, that is, for ⊔∅ to
exist, we had to add a least element, ⊥. There is nothing in our previous discussion that tells us
what (γ ⊥) should be, but a natural choice is to let γ be strict, that is, (γ ⊥) = ∅. For dataflow
analysis this is useful, because it means that the presence of ⊥ at some program point conveys the
(precise!) information that execution never reaches the point.
Example 3.7 (Accumulating semantics) Here D = P U , the powerset of U . This is the most
fine-grained set of descriptions possible. The ordering ⊑ is the subset ordering, and ⊔ is distributed
union. The concretization function γ is the identity function. This gives rise to a non-standard
semantics which is called the accumulating semantics and which simply gathers the values occurring
at each program point.
Clearly, the finer-grained the descriptions we use, the better the dataflow analysis that is possible. On the
other hand, considerations of finite computability and efficiency put a limit on granularity. It is common
to use a lattice of descriptions that is ascending chain finite. The reason is that if the analysis
process can be expressed as repeated application of a monotonic function on such a lattice then its
termination is guaranteed.
P. and R. Cousot make one final demand about γ, which we have not yet mentioned and which
we shall sometimes leave unsatisfied, namely that there should always be a best approximation.
Consider a set S ⊆ U . Let W = {S′ ∈ ∆ γ D | S ⊆ S′}, that is, W corresponds (via γ) to the
set of valid descriptions of S. Since γ is co-strict, W is non-empty. Now assume that (∆ γ D) is
closed under intersection. Then there is a best description of S, namely the one corresponding
to ⋂W. If we renounce the assumption about closure under intersection, we will in general have a
set of equally good (optimal) descriptions. Whether this is acceptable is perhaps a matter of taste.
P. and R. Cousot do not find it satisfactory that, given a set of equally good descriptions, one can
simply choose an arbitrary element. This is because, in the context of some particular program
to be analysed, the choices are not equally good, and the best choice turns out to vary from one
program to another. P. and R. Cousot give examples of this and consider the assumption about
closure under intersection “a very reasonable assumption” avoiding “program-dependent” analysis
methods [14].
Taken together, the assumptions about co-strictness and closure under intersection imply that
(∆ γ D) is a Moore family [5], which in turn implies that (∆ γ D) is a sublattice of P U . This
is the case in all our previous examples except Example 3.6. In that example there is no best
approximation to {0}: nonpos and nonneg are equally good. We shall later see another example of
this phenomenon in a context of dataflow analysis of logic programs.
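The failure of best approximations in Example 3.6 can be checked mechanically (over a finite stand-in for Z, an assumption made so the sets are computable):

```python
U = set(range(-5, 6))                      # finite stand-in for Z
gamma = {
    "bot":    set(),
    "nonpos": {z for z in U if z <= 0},
    "nonneg": {z for z in U if z >= 0},
    "top":    U,
}

S = {0}
valid = [d for d, c in gamma.items() if S <= c]       # all descriptions of S
minimal = [d for d in valid
           if not any(gamma[e] < gamma[d] for e in valid)]
print(sorted(minimal))                     # two incomparable minimal ones

# The image of gamma is not intersection-closed: nonpos ∩ nonneg = {0},
# which is not the concretization of any description.
inter = gamma["nonpos"] & gamma["nonneg"]
print(inter in gamma.values())
```

Both nonpos and nonneg come out minimal, so {0} has no best description, exactly as claimed.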
Let the function α : P U → D be defined by
α ψ = ⊓{φ ∈ D | ψ ⊆ γ φ}.
Clearly α is well-defined and monotonic. It is not hard to see that given the assumption about
closure under intersection, (α ψ) is the best (that is, least) description that applies to all elements
of ψ. We thus arrive at the classical formulation of abstract interpretation in terms of Galois
insertions [71]: in addition to the concretization function, an abstraction function α should exist,
such that
∀φ ∈ D . φ = α (γ φ),
∀ψ ⊆ U . ψ ⊆ γ (α ψ).
Note that, when restricted to (∆ γ D), α is the inverse of γ and that {{ψ ⊆ U | α ψ = φ} | φ ∈ D}
partitions P U .
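For the signs lattice of Example 3.5 the abstraction function α can be built from γ exactly as above, and the two Galois-insertion laws checked exhaustively (over a finite universe, an assumption; all names are illustrative):

```python
import itertools

U = list(range(-3, 4))
gamma = {
    "bot":  set(),
    "neg":  {z for z in U if z < 0},
    "zero": {0},
    "pos":  {z for z in U if z > 0},
    "top":  set(U),
}
# The flat ordering φ ⊑ φ′ of Example 3.5, as a set of pairs.
order = {("bot", d) for d in gamma} | {(d, d) for d in gamma} \
      | {(d, "top") for d in gamma}

def leq(d, e):
    return (d, e) in order

def meet_all(ds):
    """⊓ of a collection of descriptions: the greatest lower bound, which in
    this lattice is the lower bound with the largest concretization."""
    lower = [d for d in gamma if all(leq(d, e) for e in ds)]
    return max(lower, key=lambda d: len(gamma[d]))

def alpha(psi):
    """α ψ = ⊓ {φ | ψ ⊆ γ φ} — the least description covering ψ."""
    return meet_all([d for d, c in gamma.items() if psi <= c])

# Galois-insertion laws: φ = α (γ φ), and ψ ⊆ γ (α ψ) for every ψ ⊆ U.
assert all(alpha(gamma[d]) == d for d in gamma)
for r in range(len(U) + 1):
    for psi in map(set, itertools.combinations(U, r)):
        assert psi <= gamma[alpha(psi)]
print("Galois insertion laws hold")
```

The construction only succeeds because the image of γ is intersection-closed here; rerunning it on the overlapping-signs domain of Example 3.6 would make α ill-defined on {0}.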
Following P. and R. Cousot [13] we now generalise the above discussion somewhat, letting
the codomain of γ be a complete lattice rather than a powerset. The reason is that a semantic
definition usually includes many different domains, not necessarily powersets. In this way we obtain
a category having complete lattices as objects and “insertions” (see below) as arrows: (idD,D,D)
is the identity arrow on D and (γ ◦ γ′,D,E) is the composite arrow of (γ,D′, E) and (γ′,D,D′).
Definition. An insertion is a triple (γ,D,E) where D and E are complete lattices and (the
monotonic) γ : D → E is injective and co-strict.
Definition. Let D and E be complete lattices. Let γ : D → E and α : E → D be (monotonic)
functions. Then α is γ’s insertion adjoint iff
∀ d ∈ D . d = α (γ d) (3.3)
∀ e ∈ E . e ≤ γ (α e), (3.4)
where ≤ is the ordering on E. We call the quadruple (γ,D,E, α) a Galois insertion [71].
Our definition of an “insertion adjoint” is narrower than that of an adjunction in category theory,
where D and E may be arbitrary preordered sets and “=” in (3.3) is replaced by the ordering on D.
However, (3.3) and (3.4) correspond to what P. and R. Cousot use in their original paper on abstract
interpretation [13]. As already mentioned, we shall not in general assume that γ has an insertion
adjoint α, but when it has, α and γ uniquely determine each other.
The following definition allows us to characterise when an abstraction function exists.
Definition. Let X be a complete lattice and Y ⊆ X. Then Y is Moore-closed (in X) iff
∀Y ′ ⊆ Y . ⊓Y ′ ∈ Y.
Lemma 3.8 (P. and R. Cousot) Let (γ,X, Y ) be an insertion. Then γ has an adjoint iff (∆ γ X)
is Moore-closed.
Proposition 3.9 Let (γ,D,E, α) be a Galois insertion. Then
1. γ is co-continuous,
2. α is continuous.
Proof: Let ⊑ be the ordering on D and ≤ that on E.
(1) Let X ⊆ D be a chain. Then Y = ∆ γ X is a chain in E. We must show that γ (⊓X) = ⊓Y .
We have:
∀x ∈ X .⊓Y ≤ γ x, thus, by monotonicity of α and (3.3),
∀x ∈ X . α (⊓Y ) ⊑ x, thus
α (⊓Y ) ⊑ ⊓X, thus, by monotonicity of γ and (3.4),
⊓Y ≤ γ (⊓X).
Furthermore, by monotonicity of γ, ∀x ∈ X . γ (⊓X) ≤ (γ x), so γ (⊓X) ≤ ⊓Y . Therefore
γ (⊓X) = ⊓Y , that is, γ is co-continuous.
(2) Let X ⊆ E be a chain. Then Y = ∆ α X is a chain in D. We must show that α (⊔X) = ⊔Y .
We have:

∀x ∈ X . α x ⊑ ⊔Y, thus, by monotonicity of γ and (3.4),
∀x ∈ X . x ≤ γ (⊔Y), thus
⊔X ≤ γ (⊔Y), thus, by monotonicity of α and (3.3),
α (⊔X) ⊑ ⊔Y.

Furthermore,

∀ y ∈ Y . ∃x ∈ X . y ⊑ α x, thus, by monotonicity of α,
∀ y ∈ Y . y ⊑ α (⊔X), thus
⊔Y ⊑ α (⊔X).

So α (⊔X) = ⊔Y , that is, α is continuous.
In accordance with the observation that “larger” descriptions naturally correspond to decreased
precision, we now define what it means for d ∈ D to safely approximate e ∈ E.
Definition. Let (γ,D,E) be an insertion. We define apprγ : D × E → Bool by
apprγ (d, e) iff e ≤ γ d,
where ≤ is the ordering on E.
Thus apprγ (d, e) reads “d approximates e under γ.” Since γ will always be clear from the context,
we shall omit the subscript and simply denote the predicate by appr. For semantic functions we
shall normally use the symbol ∝ to denote the approximation relation:
Definition. Let F : Prog → D and F′ : Prog → D′ be semantic functions, and let (γ,D,D′) be an
insertion. Then F ∝ F′ iff appr (F [[P ]],F′ [[P ]]) holds for all P ∈ Prog .
Lemma 3.10 Let (γ,D,E) be an insertion. Then
1. appr is inclusive on D × E, ordered componentwise,
2. if γ is co-continuous then appr is co-inclusive on D × E.
Proof: Let Y ⊆ D × E be a chain, and let ≤ be the ordering on E. Assume that appr (d, e) holds
for all (d, e) ∈ Y , that is, e ≤ γ d.
(1) Let d0 = ⊔{d | (d, e) ∈ Y } and e0 = ⊔{e | (d, e) ∈ Y }. Clearly (d0, e0) = ⊔Y . By
monotonicity of γ then, e ≤ γ d ≤ γ d0 for all (d, e) ∈ Y . So e0 ≤ γ d0, that is, appr (d0, e0) holds.
Thus appr is inclusive.
(2) Let d0 = ⊓{d | (d, e) ∈ Y } and e0 = ⊓{e | (d, e) ∈ Y }. Then (d0, e0) = ⊓Y . Clearly
e0 ≤ γ d for all (d, e) ∈ Y , and so e0 ≤ ⊓{γ d | (d, e) ∈ Y }. Since γ is co-continuous, e0 ≤ γ d0,
that is, appr (d0, e0) holds. Thus appr is co-inclusive.
Definition. We extend appr from the domain D × E to (D → D) × (E → E) by defining:
appr ((λ x . F ′ x), (λ x . F x)) iff ∀ (d, e) ∈ D ×E . appr (d, e)⇒ appr ((F ′ d), (F e)).
We thus in fact have a series of relations “appr,” but in what follows, the “type” of appr should
always be clear from the context. This treatment of appr is similar to Reynolds’s use of relational
functors [84]. In Section 3.3 we shall extend appr to other kinds of domains.
Proposition 3.11 Let (γ,D,E) be an insertion and let F : E → E and F ′ : D → D be such that
appr (F ′, F ) holds. Then
1. appr (lfp F ′, lfp F ) holds,
2. appr (F ′ ↓ n, gfp F ) holds for all n ∈ ω,
3. if γ is co-continuous then appr (gfp F ′, gfp F ) holds.
Proof: (1) Let G : (D ×E)→ (D × E) be defined by G (d, e) = (F ′ d, F e). Then G is monotonic
and lfp G = (lfp F ′, lfp F ). By Lemma 3.10, we can reason about appr by using fixpoint induction
on G. Assume appr (d, e) holds. Since appr (F ′, F ) holds, so does appr (F ′ d, F e). So appr (d, e)
implies appr (G (d, e)). Therefore appr (lfp G) holds, that is, appr (lfp F ′, lfp F ) holds.
(2) The proof is by finite induction. Clearly appr (⊤D, gfp F ) holds. Assume appr (F ′ ↓ (n − 1), gfp F )
holds. Since appr (F ′, F ) holds, so does appr (F ′ (F ′ ↓ (n − 1)), F (gfp F )), that is,
appr (F ′ ↓ n, gfp F ). Therefore appr (F ′ ↓ n, gfp F ) holds for all n ∈ ω.
(3) The proof is dual to (1).
Corollary 3.12 If γ has an insertion adjoint then appr (gfp F ′, gfp F ) holds.
Proof: The assertion follows directly from Proposition 3.9 item 1 and Proposition 3.11 item 3.
The reason for our interest in fixpoints is that the semantics of logic programs is very adequately
given by fixpoint characterisations. This applies both to the well-known TP characterisation
[2, 98] and to denotational definitions. More precisely, the idea is to have the “standard” semantics
of program P given as lfp F for some function F , and to have dataflow analyses defined in terms of
“non-standard” functions F ′, approximating F . We can then use Proposition 3.11 to conclude that
all elements of lfp F have some property Q, provided all elements of γ (lfp F ′) have the property
Q. In other words, lfp F ′ provides us with approximate information about the standard semantics
lfp F . As we shall see in Section 5.2, knowledge about gfp F may also be useful, and (F ′ ↓ n), and
sometimes gfp F ′, can provide this. In this way dataflow analyses are nothing but approximations
to the standard semantics. Since it is preferable that approximations are finitely computable, the
approximating function F ′ and the description domain D are usually chosen in such a way that
the Kleene sequences for F ′ are finite.
Obviously, we usually want a dataflow analysis to be as precise as possible, in the sense of
making the best possible use of available information. Letting F ′ map every element of D to ⊤D
clearly leads to a dataflow analysis that is correct, but useless. We note that if an abstraction
function α : E → D exists, then a best safe approximating function F ′ exists, namely the function
defined by F ′ = α ◦F ◦ γ. However, this property is not preserved under composition of functions.
Most of the above discussion has been too simplistic in one respect. For dataflow analysis
purposes we are normally interested in a semantics that is somewhat more complex than the “base”
semantics which simply specifies a program’s “input-output” relation. The reason is that we are
looking for invariants (such as “x is always positive”) that hold at program points, not just results
of computations. (It is not obvious what constitutes a “program point” in a logic program—we
return to that in Chapter 4.) An extended semantics is obtained by extending the base semantics so
that for each program, all run-time states associated with each of its program points are recorded.
In this connection, the base semantics may be viewed as a degenerate extended semantics that has
only one program point, namely the “end” of the program. We present an extended semantics in
Section 6.1.
3.3 Denotational abstract interpretation
Abstract interpretation is powerful because it is semantics-based and thus concerned with a pro-
gramming language as a whole rather than the analysis of particular programs. But the generality
of abstract interpretation can be taken further. Assume we are given a language in which one
can express the semantics of a wide variety of programming languages. The theory of abstract
interpretation may then be developed in the framework of this meta-language once and for all. In
this way, we need not “reinvent” abstract interpretation for all different kinds of languages—each
case is just a special instance of the general theory. Figure 3.2 illustrates the point.
A general, language-independent framework has been developed by Nielson [76, 77]. Following
Nielson, we use the formalism of denotational semantics. This gives us a meta-language which is
simple but powerful (including for instance a least fixpoint operator), well-understood, and widely
used. Its versatility is mainly due to the ease with which it accommodates semantic definitions at
any level of abstraction one may want, allowing for good intellectual economy. The usefulness of
denotational definitions as bases for analysis and transformation of logic programs has been argued
by Debray and Mishra [20]. Denotational definitions use semantic functions that are total mappings
from phrases to their denotations, which in particular allows for easy modelling of non-termination.
Equally important, the semantic functions are homomorphic, or compositional, which facilitates
structural induction in proofs of program properties.
The usefulness of denotational semantics should be apparent from Chapters 5 and 6: standard
and non-standard semantics are easily presented in the meta-language. In fact, by choosing the
right level of abstraction, they can be made highly congruent: if the standard semantics employs
a certain operator on the standard domain, the non-standard semantics should use a very similar
operator on the corresponding non-standard domain.

[Figure 3.2: The role of the meta-language (after Nielson) — a diagram in which semantic equations map the programming language into the meta-language, and the standard and non-standard interpretations map the meta-language to a standard denotation (the standard semantics) and a non-standard denotation (the dataflow semantics), respectively.]
Expressing standard and non-standard semantics in the same meta-language also supports a
derivational approach to the development of dataflow analyses. The definition of a dataflow analysis
should be easily derivable from that of the standard semantics. Only in a second stage should the
analysis be implemented from its definition. Such a stepwise approach may be preferable to the
task of proving some baroque dataflow procedure correct with respect to a semantic definition.
Finally, using the same meta-language for standard and non-standard semantics means that most
correctness discussions can be conducted once and for all in the setting of the meta-language.
The idea behind the following is exactly as in Section 3 of Nielson’s treatment [77], but for
the sake of completeness, and since our meta-language differs in several respects from Nielson’s
“TMLb,” we redo some of his work in our setting. Another reason for giving a detailed treatment
is that, usually, in denotational semantics the emphasis is on chain-complete posets and continuous
functions, notions that we do not need here.
Correctness proofs for dataflow analyses given later hinge on the following results. On first
reading, readers not concerned with proof details may choose to skip the rest of this section, or
merely skim it.
The meta-language is one of typed lambda expressions, and the types are given by
E ∈ Exp ::= S | L
L ∈ Lat ::= D | E → L
where S ∈ Stat, D ∈ Dyn, Stat is a collection of static types, and Dyn is a collection of dynamic
types. The difference between the two kinds is that (the interpretation of) a static type remains the
same throughout all (standard and non-standard) semantics, whereas a dynamic type may change.
We call Stat ∪ Dyn the collection of base types, and Lat is the collection of lattice types. The
syntax of the meta-language is given by
e ::= ci (base functions)
| xi (variables)
| λ x : E . e (function abstraction)
| e e′ (function application)
| if e then e′ else e′′ (conditional)
| lfp e (least fixed point operation)
| ⊔_{x∈e′} e (least upper bound operation)
In addition we use some (standard) extensions, such as “where clauses.” There is a notion of
well-typing for this language, but as it is straightforward and merely a simple modification of
Nielson’s [77], we omit the definition. Static types are interpreted as posets (ordered by identity)
and dynamic types as complete lattices. A type interpretation I thus assigns a structure I E to
each base type E. The semantics of types is defined by natural extension as follows:
I [[S]] = I S (a (fixed) poset, ordered by identity)
I [[D]] = I D (some complete lattice)
I [[E → L]] = I [[E]]→ I [[L]] (ordered pointwise).
As usual, the “→” on the right-hand side denotes monotonic function space. By this, I [[L]] is a
complete lattice for every L ∈ Lat.
By an abuse of notation, an interpretation I denotes a type interpretation (also called I) together
with an assignment of an element of I [[E]] to each base function c of type E. By natural extension
this gives the semantics of the meta-language. The denotation of an expression e is relative to a type
environment tenv and a type E such that tenv ⊢ e : E (for details, see [77]). Let the domain of tenv
be {x1, . . . , xk} and let tenv xi = Ei for i ∈ {1, . . . , k}. Then I [[e]] : I [[E1]]× . . .× I [[Ek]]→ I [[E]]
is defined by
I [[ci]] (v1, . . . , vk) = I ci
I [[xi]] (v1, . . . , vk) = vi
I [[λ x : E . e]] (v1, . . . , vk) = λ v . I [[e]] (v1, . . . , vk, v)
I [[e e′]] (v1, . . . , vk) = I [[e]] (v1, . . . , vk) (I [[e′]] (v1, . . . , vk))
I [[if e then e′ else e′′]] (v1, . . . , vk) = if I [[e]] (v1, . . . , vk) then I [[e′]] (v1, . . . , vk)
else I [[e′′]] (v1, . . . , vk)
I [[lfp e]] (v1, . . . , vk) = lfp (I [[e]] (v1, . . . , vk))
I [[⊔_{x∈e′} e]] (v1, . . . , vk) = ⊔ {I [[e]] (v1, . . . , vk, v) | v ∈ I [[e′]] (v1, . . . , vk)}.
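As a small illustration of the lfp construct's intended meaning, the following sketch (our own, with invented names, not part of the meta-language proper) computes the least fixed point of a monotonic function on a finite lattice by Kleene iteration from the bottom element; on the finite powerset lattice used here the iteration is guaranteed to stabilise.

```python
# Kleene iteration: iterate f from bottom until the iterates stabilise.
# On a finite lattice this reaches lfp f for any monotonic f.
def lfp(f, bottom):
    x = bottom
    while True:
        y = f(x)
        if y == x:
            return x
        x = y

# Example: f(S) = S ∪ {0} ∪ {n+1 | n ∈ S, n < 3} on the powerset of {0,1,2,3},
# ordered by inclusion; the iterates are ∅, {0}, {0,1}, {0,1,2}, {0,1,2,3}.
f = lambda s: s | {0} | {n + 1 for n in s if n < 3}
print(lfp(f, frozenset()))
```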
In accordance with subsequent use we shall assume that D is the only element of Dyn, but the
following proposition is easily generalised. Let I and I′ be (type) interpretations and let Q be an
inclusive predicate on (I D)× (I′ D). The intention with Q is that Q (ψ, φ) holds iff the description
φ applies to ψ. Using Reynolds’s notion of relational functors [84] we can extend the relationship
to all types E to get a predicate simE [Q] on (I E)× (I′ E). This relation is defined by
simS [Q] (ψ, φ) iff ψ = φ
simD [Q] (ψ, φ) iff Q (ψ, φ)
simE→L [Q] (ψ, φ) iff ∀ψ′, φ′ . (simE [Q] (ψ′, φ′)⇒ simL [Q] (ψ ψ′, φ φ′))
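To make the lifting concrete, here is a finite-domain sketch (entirely our own construction): concrete values are subsets of {0, 1}, descriptions are 'empty' and 'any', Q holds when the description applies to the set, and the relation is lifted to a function type exactly as in the clause for simE→L above — related arguments must be mapped to related results.

```python
from itertools import combinations

UNIV = frozenset({0, 1})
subsets = [frozenset(c) for r in range(3) for c in combinations(UNIV, r)]

def Q(s, d):
    # 'empty' describes only the empty set; 'any' describes every set
    return s == frozenset() if d == 'empty' else True

def sim_fun(f, g):
    # sim at type D -> D: for all Q-related (s, d), results are Q-related
    return all(Q(f(s), g(d)) for s in subsets
                             for d in ('empty', 'any') if Q(s, d))

# f adds the element 0; a safe counterpart on descriptions must answer 'any'
f = lambda s: s | {0}
g = lambda d: 'any'
assert sim_fun(f, g)
assert sim_fun(lambda s: s, lambda d: d)   # identity tracks identity
```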
We can now generalise results from Section 3.2.
Proposition 3.13 For all types E, simE [Q] is inclusive iff Q is.
Proof: Only the “if” part is non-trivial: its proof is by structural induction. The cases E = S
and E = D are trivial. So assume that E = E′ → L and that simL [Q] is inclusive. Let
Z ⊆ (I E)× (I′ E) be a chain such that simE [Q] holds for all (ψ, φ) ∈ Z. Clearly ⊔Z = (⊔X,⊔Y ),
where X = {ψ | (ψ, φ) ∈ Z} and Y = {φ | (ψ, φ) ∈ Z}. Let (ψ′, φ′) ∈ (I E′) × (I′ E′) be given.
Then W = {(ψ ψ′, φ φ′) | (ψ, φ) ∈ Z} is a chain and ⊔W = ((⊔X) ψ′, (⊔Y ) φ′). Assume
simE′ [Q] (ψ′, φ′) holds. Then simL [Q] (ψ ψ′, φ φ′) holds for all (ψ, φ) ∈ Z. Since simL [Q] is
inclusive, simL [Q] (⊔W ) holds, that is, simL [Q] (ψ ψ′, φ φ′) holds for (ψ, φ) = ⊔Z. Since ψ′
and φ′ were arbitrary, simE [Q] (⊔Z) holds, so simE [Q] is inclusive.
This leads to the following important result.
Proposition 3.14 (Nielson) If simE′ [Q] (I c, I′ c) holds for every base function c : E′, then, for
all types E, if e : E is a closed expression then simE [Q] (I [[e]], I′ [[e]]) holds.
Proof: The proof is by structural induction. In the case of lfp e, fixpoint induction is used.
We shall make good use of this result later. Namely, the relation “safely approximates,” or appr,
is inclusive (Lemma 3.10) and will be used in the role of Q. To simplify notation in the sequel,
rather than using the complicated expression simE [Q] (ψ, φ), we shall simply write “ψ apprE φ,”
or, in fact, when E is clear from the context, merely “ψ appr φ.”
Note the generality of the above result: it immediately allows us to argue inductively the
correctness of a whole dataflow analysis (that is, a non-standard semantics) once certain primitive
base functions have been shown to be in the relation “safely approximates.” This applies not only
to logic programming languages as discussed in this thesis, but to any language whose semantics
can be expressed in the meta-language.
Chapter 4
Dataflow Analysis of Logic Programs
So far we have discussed dataflow analysis without reference to any particular programming lan-
guage. The very point of Section 3.3 was that a theory for dataflow analysis should be as language-
independent as possible. Section 3.2 outlined classical abstract interpretation but did not explain
how the theory applies to logic programs. In this chapter we discuss dataflow analysis of logic
programs.
It is far from clear what should be understood by “standard semantics,” “state,” “program
point,” etc. The point of this chapter is that several different answers are possible, and that
their relative merits depend on the particular dataflow analysis problem to be solved. In particular
there are many different semantic models—at different levels of abstraction—for logic programming
languages, and we argue that each may be adequate for some purpose and inadequate for another.
So let us consider some examples of dataflow analysis problems. The example programs we use
are all very simple, so analysing them becomes trivial. Our choice of examples is motivated by
a desire for clear exposition and should not lead readers to consider our approach simplistic: in
general dataflow problems are much more complicated and programs contain recursion, etc., but
our approach easily scales up to handle this, as will be seen in subsequent chapters. Before we give
examples, a note about notation is appropriate: to distinguish constants from variables, we use a
and b as constants, and u, v, w, x, y, and z for variables.
Consider the program P :
p(x)← q(x), r(x).
q(a).
r(y).
Assuming our lexicon contains a and b as the only functors, the success set of P is
S = {p(a), q(a), r(a), r(b)} = (TP ↑ 2) [2].
In the case of P , this computation is straightforward, but for more complicated programs the
success set may be infinite and we may want to approximate it instead, by a set S′, say, of atoms.
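For the program P the fixpoint computation can be replayed mechanically; the sketch below (our own encoding, with invented names) grounds P over the constants a and b and iterates an immediate-consequence operator to its least fixed point.

```python
# Ground clauses of P over the lexicon {a, b}, as (head, body) pairs.
consts = ['a', 'b']
clauses = [(f'p({c})', [f'q({c})', f'r({c})']) for c in consts] \
        + [('q(a)', [])] \
        + [(f'r({c})', []) for c in consts]

def tp(interp):
    # immediate consequences: heads whose body atoms all hold in interp
    return {h for (h, body) in clauses if all(b in interp for b in body)}

s, nxt = set(), tp(set())
while nxt != s:
    s, nxt = nxt, tp(nxt)
print(sorted(s))   # the success set {p(a), q(a), r(a), r(b)}
```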
A set can be approximated from within or from without, and our choice depends on what we want
to use the approximation for. If S′ ⊆ S (approximation from within), we can interpret “A ∈ S′”
as “A is in the success set.” If S ⊆ S′ (approximation from without), we can interpret “A ∉ S′” as
“A is not in the success set.” But in both cases, the reverse interpretation is invalid.¹
Approximations to the success set may be thought of as (derived) type information about
predicates. They are most naturally computed in a manner similar to how the TP operator works.
A “state” becomes a set of ground atoms. Since a predicate’s definition can be spread over many
clauses, it may be useful to record for each clause those instances of its head that could be derived
using the clause. So in this case, a useful notion of “program point” would simply be “clause.”
In the case of the program P above, assume that we have been approximating S from without and
have arrived at S itself as the approximation (a reasonable assumption given the simplicity of P ).
In particular we find the approximation {p(a)} attached to the first clause (or “program point”).
This allows for the replacement of that clause by p(a) ← q(a) without changing the program’s
success set (and it immediately suggests a further simplification, namely the unfolding of q(a)).
In Chapter 5 we discuss this case in detail. In fact we are concerned with a more complex
semantics than the TP characterisation, because we want to handle negation in programs as well.
This calls for the simultaneous approximation of success and failure sets.
For some purposes a characterisation of the success set may be inadequate, in particular since
success sets are usually infinite. It may sometimes be useful to know which atoms succeed, rather
than just which ground atoms succeed. This makes TP inadequate: we would need an operator
similar to TP , but one which handles arbitrary atoms. In the case of P , the denotation would be
{p(a), q(a), r(y)}, allowing us to conclude that the query ←p(x) has {x← a} as its only answer
substitution, while ←r(x) has the identity substitution as its only answer substitution. There are
semantic definitions that achieve this and it is perfectly possible to base dataflow analyses on such
definitions. We give some relevant references in Section 5.3.
What we have discussed so far are examples of what we call bottom-up dataflow analysis. Such
analyses propagate information in a way similar to how the TP operator works. The analyses
are therefore independent of any particular query. Often, however, we are only interested in a
program’s behaviour given some query. An interpreter for a logic programming language, for
example, is usually based on SLD resolution [49], in which control flows “top-down,” that is, flows
from a query to the clauses that are selected in an attempt to refute the query. This is normally
the case for Prolog, which uses a standard (left-to-right) computation rule when selecting a body’s
atoms for processing and a depth-first search rule for handling pending atoms. A Prolog compiler
that attempts to improve code generation will correspondingly need dataflow information that is
propagated in some top-down manner.
An example of such dataflow information is mode information. A mode of a predicate argument
¹The following rendering of this principle is due to Alan Mycroft: “Safe behind a fire door” depends on which side of the door the fire is on!
expresses its degree of instantiation at the time the predicate is called. It is well-known how mode
information can enhance generated code. Instead of generating code for the general unification of
terms, a compiler can capitalise on mode information to generate faster versions of the unification
procedure, specialised to the context set up by a given Prolog program [19, 21]. For example, code
to unify a variable with a constant is much simpler than code to unify two arbitrary terms. Tradi-
tionally mode information has been provided by the programmer in the form of mode declarations.
This approach has the disadvantage of putting an extra burden on the programmer, but worse yet:
if wrong mode declarations are given to the compiler, its output can no longer be trusted, and yet
there is no indication of the error. It is therefore worthwhile to try to make the generation of (safe!)
mode declarations automatic.
In general, whenever some kind of annotation of Prolog programs is called for, it is worth
considering the possibility of doing so automatically, not only because it is faster, but also because
it is safer. Partial evaluation, for instance, usually calls for some kind of annotation, and in
Prolog variants with a delay mechanism such as for example NU-Prolog [97], programs are usually
extensively annotated. One can also think of annotations that will guide a parallel execution of a
logic program.
In all these cases (and others, as we return to in Section 6.5), what is needed are descriptions
of the run-time call patterns that a program gives rise to. That is, for each clause in the program
we want information about how unification will bind the variables in the clause’s head.
For example, consider the program P qua Prolog program. A top-down dataflow analysis that
utilises the knowledge of the standard (left-to-right) computation rule can easily deduce that during
refutation of any query of the form ←p( ), every call of the third clause will bind y to a. Note that this
is very different from the information that a bottom-up analysis provides about the third clause.
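The left-to-right reasoning can be caricatured in a few lines (a deliberately hard-wired sketch, entirely our own; a real analysis computes the success patterns of q and r rather than assuming them): abstractly executing the body of p(x) ← q(x), r(x) records the call pattern of each body atom.

```python
def call_patterns(body, binding):
    # binding: abstract instantiation state of the clause variable x
    patterns = []
    for pred in body:
        patterns.append((pred, binding))
        if pred == 'q':          # q's only clause is the ground fact q(a),
            binding = 'ground'   # so x is ground after q(x) succeeds
    return patterns

print(call_patterns(['q', 'r'], 'free'))
# [('q', 'free'), ('r', 'ground')]
```

So, under the standard computation rule, r is always called with a ground argument, as claimed above.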
As an even simpler example of how the two kinds of analysis attach different information to
clauses, consider the second clause in the program
←p(a)
p(x)← q(x).
q(b).
A bottom-up analysis can tell us that to succeed, p(x) must be instantiated to p(b) if instantiated
at all. A top-down analysis can tell us that the only actual call instance at run-time will be p(a).
The fact that this call ultimately fails is of no concern to the top-down analysis.
In the top-down example analysis of P , the analyser made good use of its knowledge of the
computation rule. The conclusion that r is always called with a ground argument would not be
valid if arbitrary computation rules were considered. If the dataflow analysis problem that we face
is one of finding the best order in which to process atoms in a body (perhaps including parallel
processing), then we have to use a semantics in accordance with this, that is, one that does not
fix the computation rule. In Chapter 6 and its sequel, on the other hand, we choose a particular
(the standard) computation rule, since our primary interest there is the improvement of compilers
and other program transformers for a Prolog-like language. In so doing, we increase the precision
of analyses.

Figure 4.1: Two-step development of a dataflow analysis. [Diagram: standard semantics (abstract) → “dataflow” semantics (abstract) → dataflow analysis (implementation).]
The carriers of binding information in an SLD refutation are substitutions: in the SLD model a
substitution carries all the necessary information about a program’s current bindings of variables
to “values,” and thus closely corresponds to the notion of a “state” from Section 3.2.
However, the semantics that we use in Chapter 6 is somewhat different from the SLD model.
The reason for the difference is that we want a definition that is rather abstract. A semantics
defined in terms of SLD trees contains information that we do not need for some applications,
such as exact derivation history, bindings of variables that will not affect the final answers, and
a distinction between finite failure and non-termination. Since such information is not needed in
our dataflow analyses, it should be abstracted away so as not to obscure more important issues.
This is exactly where denotational semantics is useful: it allows for such abstraction and provides
a fixpoint characterisation as required by our theory.
On the other hand, a less abstract definition such as the SLD model may be useful as a basis for
dataflow analyses that our semantics cannot support. Keeping a record of the derivation history
may be useful for an analysis of programs’ storage requirements or (lack of) determinacy. The
relation is the same as that between our bottom-up and our top-down semantics: the former is
more abstract than the latter since it disregards computation rules, hence it is easier to formalise,
but it supports a more limited class of dataflow analyses.
The point that we hope to have made here is that, unfortunately, no one semantic model of
logic programs can be said to provide the best basis for dataflow analysis. Other things being
equal, we choose the simplest definition that will support a solution to the problem at hand. Under
the assumption that “simple” here means “abstract,” this leads to a two-step paradigm for the
development of a dataflow analysis, as indicated in Figure 4.1. The point of this approach is that,
rather than implementing an ad hoc dataflow analysis directly from knowledge of the programming
language, it may be advantageous to factor out two independent issues: approximation (the left
arrow) and implementation (the right arrow).
Chapter 5
Bottom-Up Analysis of Normal Logic
Programs
In this chapter a Kleene logic-based semantics for normal logic programs is defined, similar to
Fitting’s ΦP semantics. This provides a semantic basis for bottom-up dataflow analyses. Such
analyses give information about the success and failure sets of a program. As we discussed in
Chapter 4, a major application of bottom-up analysis is therefore type inference. We detail a
dataflow analysis using descriptions similar to Sato and Tamaki’s “depth k” abstractions and
another using Marriott, Naish and Lassez’s “singleton” abstractions. We show that both are sound
with respect to our semantics and outline various uses of the analyses. We justify our choice of
semantics by showing that it is the most abstract of a number of possible semantics. This means that
every analysis based on our semantics is correct with respect to these other semantics, including
Kunen’s semantics, SLDNF resolution, and the common (sound) Prolog semantics. Finally we
discuss related work.
5.1 Bottom-up semantics for logic programs
In this section we give a bottom-up semantics for normal logic programs. Recall that normal
programs allow for negation in clause bodies. Throughout the chapter, by “program” we mean
normal logic program. Also recall the use of italic capital letters for meta-variables, a convention
we stick to throughout the thesis (while also using italic capitals for other things). Finally recall
that Her denotes the set of ground atoms (for some fixed lexicon), and that (ground S) denotes
the set of ground instances of the syntactic object S.
Let Interp = (P Her) × (P Her). Equipped with the component-wise subset ordering, Interp
forms a complete lattice. We denote the ordering on Interp by ≤. The idea is that Interp consists
of all three-valued (partial) “interpretations.” An interpretation u = (us, uf ) is read as follows: the
atoms in us are true, those in uf are false, those not in us∪uf are undefined (not assigned a classical
truth value), and those in us ∩ uf are overdefined (assigned both true and false). Alternatively,
Interp may be thought of as mapping ground atoms to the four-valued bilattice, as discussed by
Fitting [25]. Clearly, the set Interp is partitioned into consistent and inconsistent elements, where
(us, uf ) is consistent iff us ∩ uf = ∅ and inconsistent iff us ∩ uf ≠ ∅. The greatest element (Her ,Her)
of Interp is inconsistent, for example. An interpretation (us, uf ) is complete iff us ∪ uf = Her .
Definition. The predicate consistent : Interp → Bool is defined by
consistent (X,Y ) iff X ∩ Y = ∅.
Lemma 5.1 The predicate consistent is inclusive on Interp.
Proof: Let Z ⊆ Interp be a chain. Let X ′ = {X | (X,Y ) ∈ Z} and Y ′ = {Y | (X,Y ) ∈ Z}.
Let X0 = ⊔X ′ and Y0 = ⊔Y ′. Clearly (X0, Y0) = ⊔Z. Assume consistent (X,Y ) holds for all
(X,Y ) ∈ Z, that is, X ∩ Y = ∅. We show by contradiction that X0 ∩ Y0 = ∅. Assume ∃x . x ∈ X0 ∩ Y0.
Then x ∈ X1 for some (X1, Y1) ∈ Z and x ∈ Y2 for some (X2, Y2) ∈ Z. For reasons of symmetry,
and since Z is a chain, we can assume that (X1, Y1) ≤ (X2, Y2). It follows that x ∈ X2 ∩ Y2,
contradicting the assumption that X ∩ Y = ∅ for all (X,Y ) ∈ Z. Thus X0 ∩ Y0 = ∅, so consistent
is inclusive on Interp.
The idea behind using three-valued logic to describe computational behaviour goes back to Kleene.¹
Suppose we want to use a machine to determine the truth or falsehood of some statement. In
addition to the two possibilities that the machine returns true or false, it may happen that it fails
to terminate. It is therefore natural to use a logic which admits yet a third value which stands
for “undefined.” In Kleene’s logic [45], the connectives are the “most generous” extensions of the
classical connectives, so for example the tables for “∧” and “¬” are:
   ∧    | true   false  undef
  true  | true   false  undef
  false | false  false  false
  undef | undef  false  undef

   ¬
  true  | false
  false | true
  undef | undef
In logic programming terms this version of “∧” corresponds to a fair computation rule. Note
that Prolog’s “∧” is not commutative as a three-valued connective: the standard computation
rule of Prolog rather corresponds to the connectives of McCarthy logic [51]. For example, in
McCarthy logic, false ∧ undef yields false, but undef ∧ false yields undef , since this corresponds
to the behaviour of a machine that attempts to evaluate expressions from left to right, given the
understanding that undef designates non-termination.
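The two conjunctions can be transcribed directly from the tables (a sketch using None for undef; the function names are ours):

```python
# Kleene's conjunction is symmetric: false on either side wins.
def kleene_and(p, q):
    if p is False or q is False:
        return False
    if p is None or q is None:
        return None
    return True

# McCarthy's conjunction evaluates left to right: an undefined left
# operand makes the whole conjunction undefined.
def mccarthy_and(p, q):
    if p is False:
        return False      # right operand never evaluated
    if p is None:
        return None       # left operand does not terminate
    return q

assert kleene_and(None, False) is False      # fair rule: false wins
assert mccarthy_and(False, None) is False
assert mccarthy_and(None, False) is None     # left-to-right rule
```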
¹Three-valued logics had previously been investigated by Post and Łukasiewicz independently, and by Bočvar, but
always from the point of view of the semantic paradoxes. Kleene wanted an intuitionistically sound logic for partial
recursive functions.
The use of a fair computation rule can only increase the success and finite failure sets of a
program. So since we approximate these sets from without, any analysis based on Kleene logic will
be sound with respect to McCarthy logic. In the following we give a semantic definition for normal
programs based on Kleene logic, but we return to alternative semantics and their relation to ours
in Section 5.3.
It is possible to give a three-valued model-based semantics for normal programs, based on Fit-
ting’s notion of satisfaction by (in our terminology) consistent interpretations [26]. Fitting, however,
also gives a fixpoint characterisation, based on an operator ΦP , and this is the type of semantic
definition we aim at. Fitting’s operator works on a semilattice of consistent interpretations. The
reason why we include inconsistent elements is that the inherent imprecision in “descriptions” may
sometimes force the generation of (description) values that are “inconsistent,” as Example 5.15 will
show. Technically it is simpler to include all elements, and unlike Fitting we are only interested in
our operator’s least fixpoint which is the same as that of ΦP .
Definition. Let u = (us, uf ) be an interpretation. Let B = L1, . . . , Ln be a ground body. Then
u makes B true iff
∀ i ∈ {1, . . . , n} . ∀A ∈ Her . (Li = A⇒ A ∈ us) ∧ (Li = ¬ A⇒ A ∈ uf )
u makes B false iff
∃ i ∈ {1, . . . , n} . ∃A ∈ Her . (Li = A ∧ A ∈ uf ) ∨ (Li = ¬ A ∧ A ∈ us).
The following lemma is easily verified.
Lemma 5.2 Let u be an interpretation and B a ground body. If u is consistent then u cannot
make B both true and false.
We now define the base semantics. Let P be a program and let I be an index set for the clauses in
P . Using this we may denote the i’th clause in P by P [i], where i ∈ I.
Definition. The immediate consequence function UP : Interp → Interp is defined by
UP u = (us, uf ) where
us = {A ∈ Her | ∃ i ∈ I . ∃A←B ∈ ground (P [i]) . u makes B true}
uf = {A ∈ Her | ∀ i ∈ I . ∀A← B ∈ ground (P [i]) . u makes B false}.
The base semantics of program P is B [[P ]] = lfp UP .
Proposition 5.3 The base semantics of a program P is well-defined and consistent.
Proof: The function UP is easily seen to be monotonic, so B [[P ]] is well-defined. By Lemma 5.1
we can reason about consistent by using fixpoint induction on UP . If (UP u) is inconsistent then
there is a ground instance A← B of a clause in P such that u makes B both true and false, so u
is inconsistent by Lemma 5.2. It follows that the set of consistent interpretations is closed under
UP . Therefore B [[P ]] is consistent.
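The construction can be replayed on a small ground program; the sketch below (our own example program, not from the text) computes lfp UP by iteration from (∅, ∅). In it, r depends on itself and so remains undefined, p inherits that undefinedness through ¬r, and s is false.

```python
# Ground normal program:  q.   p <- q, not r.   r <- r.   s <- not q.
# Literals are (atom, positive?) pairs; an interpretation is (us, uf).
HER = {'p', 'q', 'r', 's'}
prog = {'q': [[]],
        'p': [[('q', True), ('r', False)]],
        'r': [[('r', True)]],
        's': [[('q', False)]]}

def lit_true(lit, u):
    a, pos = lit
    return a in u[0] if pos else a in u[1]

def lit_false(lit, u):
    a, pos = lit
    return a in u[1] if pos else a in u[0]

def up(u):
    # us: some clause body made true; uf: every clause body made false
    us = {a for a in HER
          if any(all(lit_true(l, u) for l in b) for b in prog[a])}
    uf = {a for a in HER
          if all(any(lit_false(l, u) for l in b) for b in prog[a])}
    return (us, uf)

u = (set(), set())
while up(u) != u:
    u = up(u)
print(u)   # ({'q'}, {'s'}): r loops (undefined), hence p is undefined too
```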
Note that we consider lfp UP to be the denotation of the program P . This differs from Fitting’s
more fine-grained semantics [26] in which the set of all (consistent) fixpoints, or “partial models,”
is used.
Example 5.4 Consider the program P :
p(x)← q(x, y),¬ r(y).
q(x, x).
r(f(a)).
r(f(f(x)))← r(f(f(x))).
Assuming that the alphabet is that of P , the semantics is (us, uf ) where
us = {p(a), q(a, a), q(f(a), f(a)), . . . , r(f(a))}
uf = {p(f(a)), q(a, f(a)), q(f(a), a), . . . , r(a)}.
The rest of this section establishes some results that will be used in Section 5.2 where the issue is
non-standard semantics. We first define a predicate compl, read “complementary.”
Definition. For x ⊆ Her , let x̄ denote Her \ x, the complement of x. The predicate compl :
Interp × Interp → Bool is defined by compl ((x, y), (x′, y′)) iff x̄ = y′ ∧ ȳ = x′.
Our interest in compl stems from Proposition 5.7 and its corollary below. The proposition relates
lfp UP and gfp UP by showing them to be complementary. To establish the proposition, two lemmas
are needed.
Lemma 5.5 Let B be a ground body and let u and u′ be interpretations. If compl (u, u′) holds,
then
1. u makes B true iff ¬ (u′ makes B false),
2. u makes B false iff ¬ (u′ makes B true).
Proof: Let B = L1, . . . , Ln, let (us, uf ) = u, and assume that compl (u, u′) holds.
(1) The assertion then is

∀ i ∈ {1, . . . , n} . ∀A ∈ Her . (Li = A⇒ A ∈ us) ∧ (Li = ¬ A⇒ A ∈ uf ) iff
¬ (∃ i ∈ {1, . . . , n} . ∃A ∈ Her . (Li = A ∧ A ∈ Her \ us) ∨ (Li = ¬ A ∧ A ∈ Her \ uf )),

which clearly holds, since u′f = Her \ us and u′s = Her \ uf by the assumption compl (u, u′).
(2) The proof is similar to (1).
Let Interp × Interp be equipped with the ordering ⊑ defined by (u1, u′1) ⊑ (u2, u′2) iff u1 ≤ u2 ∧
u′2 ≤ u′1. Clearly Interp × Interp is a complete lattice.
Lemma 5.6 The predicate compl is inclusive on Interp × Interp.
Proof: Let Z ⊆ Interp × Interp be a chain. Assume that compl (u, u′) holds for all (u, u′) ∈ Z,
that is, x̄ = y′ ∧ ȳ = x′ where (x, y) = u and (x′, y′) = u′. Let X = {x | ((x, y), (x′, y′)) ∈ Z} and
let Y , X ′, and Y ′ be defined similarly. Then ((⋃X,⋃Y ), (⋂X ′,⋂Y ′)) = ⊔Z. Consider A ∈ Her .
We have that

A ∈ Her \⋃X iff ∀ ((x, y), (x′, y′)) ∈ Z . A ∉ x
iff ∀ ((x, y), (x′, y′)) ∈ Z . A ∈ x̄
iff ∀ ((x, y), (x′, y′)) ∈ Z . A ∈ y′
iff A ∈ ⋂Y ′.

So Her \⋃X = ⋂Y ′. Similarly Her \⋃Y = ⋂X ′. So compl (⊔Z) holds.
Proposition 5.7 For every program P , compl (lfp UP , gfp UP ) holds.
Proof: Let F : (Interp × Interp) → (Interp × Interp) be defined by F (u, u′) = (UP u,UP u′). By
this, lfp F = (lfp UP , gfp UP ). By Lemma 5.6 we can reason about compl by using fixpoint induction
on F . Assume compl (u, u′) holds and let (us, uf ) = UP u and (u′s, u′f ) = UP u′. By Lemma 5.5 and
the definition of UP , u′s = Her \ uf and u′f = Her \ us, that is, compl (F (u, u′)) holds. Therefore
compl (lfp F ), that is, compl (lfp UP , gfp UP ), holds.
In general gfp UP will be inconsistent, and even though there may be many consistent fixpoints,
there is usually no greatest such, cf. Fitting’s use of “intrinsic” fixpoints [26]. We have the following
consequence of Proposition 5.7.
Corollary 5.8 gfp UP is consistent iff lfp UP = gfp UP .
Proof: If lfp UP = gfp UP then gfp UP is consistent, by Proposition 5.3. Let (us, uf ) = lfp UP . Then
gfp UP = (Her \ uf , Her \ us), by Proposition 5.7. Assume that gfp UP is consistent, that is,
(Her \ uf ) ∩ (Her \ us) = ∅. Then uf ∪ us = Her , that is, lfp UP is complete. Thus Her \ uf = us
and Her \ us = uf , and so lfp UP = gfp UP .
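Proposition 5.7 can be checked mechanically on a small ground program (our own example, not from the text). Here the downward iterates of UP from the top element (Her ,Her) stabilise after finitely many steps, so they reach gfp UP , and the least and greatest fixpoints come out complementary — with gfp UP inconsistent, as anticipated.

```python
# Ground normal program:  q.   p <- q, not r.   r <- r.   s <- not q.
HER = frozenset({'p', 'q', 'r', 's'})
prog = {'q': [[]],
        'p': [[('q', True), ('r', False)]],
        'r': [[('r', True)]],
        's': [[('q', False)]]}

def up(u):
    us, uf = u
    t = lambda a, pos: a in us if pos else a in uf   # literal made true
    f = lambda a, pos: a in uf if pos else a in us   # literal made false
    return ({a for a in HER if any(all(t(*l) for l in b) for b in prog[a])},
            {a for a in HER if all(any(f(*l) for l in b) for b in prog[a])})

def iterate(u):
    while up(u) != u:
        u = up(u)
    return u

lfp_u = iterate((set(), set()))
gfp_u = iterate((set(HER), set(HER)))
# compl: the complement of each lfp component is the opposite gfp component
assert HER - lfp_u[0] == gfp_u[1] and HER - lfp_u[1] == gfp_u[0]
print(lfp_u, gfp_u)   # gfp is inconsistent: p and r are both true and false
```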
5.2 Approximation of success and failure sets
The obvious application of dataflow analyses based on the semantics given in the previous section is
type inference. We use the term “type” loosely to mean descriptions that are based on the structure
of the terms or atoms. For instance, types include the (restricted) rational tree descriptions used
by Bruynooghe [7] and the atom abstractions used by Sato and Tamaki [87]. Type inference is the
process of finding a type which describes the success set and/or failure set of a program. A special
case of type inference is termination analysis: from the success and failure information present in
an inferred type one may conclude that certain queries will neither succeed nor fail and so will not
terminate.
In this section we amalgamate ideas from several type inference methods found in the literature
[56, 59, 87]. Type inference is expressed as a non-standard version of the base semantics. We
generalise previous work along two dimensions: first, normal programs are considered rather than
definite programs; second, the type descriptions considered are generalised to “atom abstractions,”
which intuitively are sets of atoms whose denotation is their set of ground instances. We also
generalise Marriott, Naish and Lassez’s method of program specialisation [56] and prove the
correctness of the generalisation.
We now introduce “atom abstractions” and “abstraction schemes”—the latter are sets of atom
abstractions. We are primarily interested in abstraction schemes in which each atom abstraction
has a distinct denotation, since atom abstractions representing the same set of ground atoms are
equivalent for our purposes. One step towards ensuring this is to consider the atoms in atom
abstractions to be taken modulo variable renaming.
Definition. The instantiation preordering ⊳ on Atom is defined by A ⊳ A′ iff A is an instance of
A′. We let Atom⊳ denote the poset of atoms (modulo variable renaming) with the partial ordering
induced by the instantiation preordering.
Definition. Let the function den : P Atom⊳ → P Her be defined by den 𝒜 = ⋃ {ground A | A ∈ 𝒜}. An
atom abstraction 𝒜 is a subset of Atom⊳ and its denotation is den 𝒜. An abstraction scheme S is
a set of atom abstractions such that Her = den 𝒜 for some 𝒜 ∈ S.
Definition. The function depth : Term → N gives the depth of a term as follows: if T is a variable
or a constant then depth T = 1, otherwise depth T = 1 + max {depth T ′ | T ′ is a proper subterm of T}.
We define the depth of an atom A, depth A, as the maximal depth of any of its terms (0 if A is anadic).
The next two definitions exemplify abstraction schemes. Let (pred A) denote the predicate symbol
of atom A.
Definition. Let Atomk = {A ∈ Atom⊳| depth A ≤ k}. The depth k abstraction scheme is
P Atomk.
Example 5.9 The set {[p(x, f(x))], [p(f(a), x)]} is a depth 2 abstraction that denotes
{p(a, f(a)), p(f(a), f(f(a))), . . . , p(f(a), a), p(f(a), f(a)), . . .}
(assuming a lexicon {p, f, a}).
Definition. An atom abstraction A is singleton iff ∀A,A′ ∈ A . (pred A) = (pred A′)⇒ A = A′.
The singleton abstraction scheme is the set of all singleton atom abstractions.
Example 5.10 The set {[p(x, f(f(x)))], [q(f(a), x)]} is a singleton abstraction that denotes
{p(a, f(f(a))), p(f(a), f(f(f(a)))), . . . , q(f(a), a), q(f(a), f(a)), q(f(a), f(f(a))), . . .}
(assuming a lexicon {p, q, a, f}). Note that this abstraction is not a depth 2 abstraction since
p(x, f(f(x))) is of depth 3, and conversely, the previous example’s abstraction is not a singleton
abstraction since it contains two atoms having the same predicate symbol p.
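The depth measure and the two scheme conditions are easy to operationalise. In the sketch below (our own encoding, with invented helper names) a term is a nested tuple whose first component is the symbol, and an atom abstraction is a set of such atoms.

```python
def depth(t):
    if len(t) == 1:                      # constant or variable
        return 1
    return 1 + max(depth(s) for s in t[1:])

def atom_depth(atom):                    # atom = (pred, term, ...)
    return max((depth(t) for t in atom[1:]), default=0)

def is_depth_k(abstraction, k):
    return all(atom_depth(a) <= k for a in abstraction)

def is_singleton(abstraction):
    preds = [a[0] for a in abstraction]
    return len(preds) == len(set(preds)) # at most one atom per predicate

p1 = ('p', ('x',), ('f', ('x',)))        # p(x, f(x)),    depth 2
p2 = ('p', ('f', ('a',)), ('x',))        # p(f(a), x),    depth 2
p3 = ('p', ('x',), ('f', ('f', ('x',)))) # p(x, f(f(x))), depth 3
assert is_depth_k({p1, p2}, 2) and not is_singleton({p1, p2})
assert is_singleton({p3}) and not is_depth_k({p3}, 2)
```

The two asserts mirror Examples 5.9 and 5.10: the first abstraction is depth 2 but not singleton, the second is singleton but not depth 2.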
The depth k atom abstractions were introduced by Sato and Tamaki [87] and the singleton
atom abstractions have been used by Marriott [54] and by Marriott, Naish and Lassez [56]. Related
abstraction schemes have been studied by Marriott and Søndergaard [61].
We equip abstraction schemes with the ordering ≤ defined by A ≤ A′ iff den A ⊆ den A′. As it
stands, ≤ is a preordering: different atom abstractions may have the same denotation. For instance,
if there are only two constants a and b, then {p(x)} has the same denotation as {p(a), p(b)}. We are
ultimately interested in schemes that are complete lattices, so as a first step we introduce schemes
that are posets. These we call “canonical.”
Definition. An abstraction scheme S is canonical iff
∀A,A′ ∈ S . (den A) = (den A′)⇒ A = A′.
It is straightforward to show that if there are two or more distinct ground terms then the singleton
abstraction scheme is canonical. However, depth k abstraction schemes are not in general canonical.
For any abstraction scheme one can always find an “equivalent” scheme which is canonical by
choosing a maximal representative for each class of abstractions with the same denotation. Let
(X,≤) be a poset and let Y ⊆ X. We define maximal : P X → P X by
maximal Y = {y ∈ Y | ∀ y′ ∈ Y . y ≤ y′ ⇒ y = y′},
and we define minimal Y in the dual manner.
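For finite posets the two operators can be realised directly from the definition; in this Python sketch the ordering is passed in as a predicate leq (an assumption of the illustration, since the thesis fixes a poset (X, ≤)).

```python
def maximal(ys, leq):
    # elements of ys with nothing strictly above them under leq
    return {y for y in ys if all(not leq(y, z) or y == z for z in ys)}

def minimal(ys, leq):
    # dual: elements of ys with nothing strictly below them
    return {y for y in ys if all(not leq(z, y) or y == z for z in ys)}
```

For instance, ordering {1, 2, 3, 4, 6} by divisibility gives maximal elements {4, 6} and minimal element {1}.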
Definition. Let S be an abstraction scheme. We define the canonicised S to be
S∗ = {maximal {B ∈ Atom⊳ | ground B ⊆ den A} | A ∈ S}.
Definition. Abstraction schemes S and S ′ are equivalent iff
⋃ {den A | A ∈ S} = ⋃ {den A | A ∈ S′}.
The next proposition follows immediately from the definition of S∗.
Proposition 5.11 S∗ is canonical and equivalent to the abstraction scheme S.
38 CHAPTER 5. BOTTOM-UP ANALYSIS OF NORMAL LOGIC PROGRAMS
Definition. An abstraction scheme is a lattice abstraction scheme iff it is a complete lattice under
subset ordering.
Proposition 5.12 Canonicised depth k abstraction schemes and the canonicised singleton abstrac-
tion scheme are lattice abstraction schemes.
Our interest in lattice abstraction schemes stems from the following proposition.
Proposition 5.13 Let S be a lattice abstraction scheme. Let IntS = S × S and let γS : IntS →
Interp be defined by
γS (ws, wf ) = (den ws, den wf ).
Then (γS , IntS , Interp) is an insertion.
Abstraction schemes can be used to induce non-standard semantic functions which approximate the
standard semantic functions. Such non-standard semantics are useful because they may highlight
errors hidden in the program, by giving an approximation to the success set or failure set smaller
than the programmer would expect. This will be exemplified later (Example 5.16).
Definition. Let S be a canonical abstraction scheme. Define αS : (P Her) → S to be some fixed
function with the property that
∀G . αS G ∈ minimal {A ∈ S | G ⊆ den A}.
Definition. Let P be a program and let S be a lattice abstraction scheme. The function WP :
IntS → IntS is defined by
WP w = (ws, wf ) where
ws = αS {A ∈ Her | ∃ i ∈ I . ∃A←B ∈ ground (P [i]) . (γS w) makes B true}
wf = αS {A ∈ Her | ∀ i ∈ I . ∀A←B ∈ ground (P [i]) . (γS w) makes B false}.
The non-standard base semantics of P using S is NS [[P ]] = lfpWP .
Note the non-constructiveness of the definition of WP. This makes it easy to argue that NS
approximates the standard base semantics. For an implementation, however, a more constructive
definition using unifiers, rather than ground instances, would be needed. For examples of
this, see Chapter 7 of Marriott [54].
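The flavour of the iteration can be conveyed by a Python sketch for the simplest case: a finite ground normal program, with the trivial abstraction in which αS and γS are identities, so that the iteration is really that of the underlying operator UP (the genuine WP additionally abstracts each iterate with αS). The encoding of clauses as (head, body) pairs with signed literals is an assumption of this sketch.

```python
def step(program, base, u):
    # program: list of (head, body); body is a list of (sign, atom),
    # with sign True for a positive literal, False for a negated one
    us, uf = u
    def lit_true(sign, a):
        return (a in us) if sign else (a in uf)
    def lit_false(sign, a):
        return (a in uf) if sign else (a in us)
    # heads of ground clauses whose bodies are made true
    new_s = {h for h, body in program
             if all(lit_true(s, a) for s, a in body)}
    # atoms all of whose ground clauses have bodies made false
    # (vacuously including atoms with no clauses at all)
    new_f = {a for a in base
             if all(any(lit_false(s, b) for s, b in body)
                    for h, body in program if h == a)}
    return (new_s, new_f)

def base_iteration(program, base):
    # ascending Kleene sequence from (∅, ∅); finite for finite programs
    u = (set(), set())
    while True:
        v = step(program, base, u)
        if v == u:
            return u
        u = v
```

On the small program p ← q, ¬r; q (with r undefined), the iteration stabilises with p and q in the success component and r in the failure component.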
Theorem 5.14 For every lattice abstraction scheme S, NS ∝ B.
Proof: It follows from the definition of WP that appr (WP , UP ) holds for all programs P . Thus by
Proposition 3.11, appr (lfpWP , lfp UP ) holds for all programs P . The assertion follows from the
definitions of NS and B.
Example 5.15 If S is the depth 2 abstraction scheme and P is the program from Example 5.4 then the
non-standard base semantics of P using S∗ is w = (ws, wf ) where
ws = {p(a), q(x, x), r(f(a))}
wf = {p(f(a)), q(a, f(a)), q(f(a), a), q(f(x), f(y)), r(a)}.
Example 5.15 illustrates the inherent imprecision of depth k abstraction schemes, since there is
no way to tell from the non-standard semantics w whether q(f(a), f(a)) is in P ’s success set or
failure set. Of course the larger the depth chosen for the depth k abstraction scheme, the more
precise the analysis. The example also demonstrates that the interpretation corresponding to the
non-standard base semantics may be inconsistent.
The following example illustrates how the non-standard base semantics may be used to indicate
errors in a program. Recall the list notation used in this thesis: we use the infix operator : for list
construction and let nil denote the empty list.
Example 5.16 Let S be the singleton abstraction scheme and let P be the following, somewhat
defective, program:
member(u, u : v).
member(u, x : v)←member(v, v).
The program is intended to compute list membership. The non-standard base semantics of P using
S∗ approximates the success set of P by {member(u, u : v)}. This set is obviously too small, which
indicates that the program contains an error.
Another use of abstraction schemes is to specialise programs by replacing clauses in the program
by instances of those clauses [56, 87]. This has the advantage that bindings are made earlier in
a derivation, allowing failure to be detected more quickly and useless derivations to be pruned.
In particular, specialisation may turn non-deterministic programs that use deep backtracking into
programs that use shallow backtracking only and hence are deterministic in the sense that only
one clause head matches each call.
Definition. Let S be a lattice abstraction scheme. Define gclause : Clause → IntS → Gcla,
sclause : Clause→ IntS → Clause, and spec : Prog → IntS → Prog by
gclause C w = {A← B ∈ ground C | (γS w) makes B true}
sclause C w = maximal {C ′ ≤ C | ground C ′ ⊆ gclause C w}
spec P w = {sclause (P [i]) w | i ∈ I}.
Example 5.17 Let P be the program from Example 5.4 and let w be its non-standard semantics
from Example 5.15. Then (spec P w) is
p(a)← q(a, a),¬ r(a).
q(x, x).
r(f(a)).
Note how the last clause of P has been “specialised” away.
It is reasonable to hope that if a program P is specialised using NS [[P ]], the resulting program
will be equivalent to P . More generally, we might hope that whenever appr (w,B [[P ]]) holds, P and
(spec P w) are equivalent in the sense of being codenotative:
Definition. Programs P and P ′ are equivalent iff B [[P ]] = B [[P ′]].
Unfortunately, Example 5.17 shows that this does not hold in general. We therefore aim at finding
a sufficient condition on w that ensures that P and (spec P w) are equivalent. We first establish some lemmas.
Lemma 5.18 Let S be a lattice abstraction scheme, P be a program and P ′ = (spec P w) for some
w ∈ S. For all u ∈ Interp such that appr (w, u) holds, if UP u = (us, uf ) and UP ′ u = (u′s, u′f )
then us = u′s and uf ⊆ u′f .
Proof: Since each clause in P ′ is an instance of a clause in P , it follows that u′s ⊆ us and uf ⊆ u′f .
By the definition of spec, all of the ground clause instances used when deriving UP u are instances
of a clause in P ′ and so u′s = us.
Lemma 5.19 Let S be a lattice abstraction scheme, P be a program and P ′ = (spec P w) for some
w ∈ S. If appr (w,B [[P ]]) holds, then B [[P ]] ≤ B [[P ′]].
Proof: Define the predicate leq : (Interp × Interp) → Bool by leq (u, u′) iff u ≤ u′. Clearly leq is
inclusive on Interp × Interp. Let G : (Interp × Interp) → (Interp × Interp) be defined by G (u, u′) =
(UP u, UP ′ u′). Then G is monotonic and lfp G = (lfp UP , lfp UP ′). Assume appr (w,B [[P ]]) holds,
that is, lfp UP ≤ γS w. By Lemma 5.18 then, UP u ≤ UP ′ u for all u ≤ lfp UP . Assume u ≤ u′.
Since UP ′ is monotonic, UP u ≤ UP ′ u′ for all u ≤ lfp UP . So leq (u, u′) implies leq (G (u, u′)) for all
u ≤ lfp UP . By fixpoint induction, leq (lfp UP , lfp UP ′) holds, that is, B [[P ]] ≤ B [[P ′]].
Definition. The function co : Interp → Interp is defined by co (us, uf ) = (uf , us).
Note that comp (u, co u) holds for all u ∈ Interp. The following lemma is an immediate consequence
of the definition of co.
Lemma 5.20 If u is consistent then u ≤ co u.
Lemma 5.21 Let S be a lattice abstraction scheme, P be a program and P ′ = (spec P w) for some
w ∈ S. If appr (w, co (B [[P ]])) holds, then B [[P ]] is a fixpoint for UP ′.
Proof: Let (us, uf ) = B [[P ]] and let (u′s, u′f ) = UP ′ (B [[P ]]). By Lemma 5.20, B [[P ]] ≤
co (B [[P ]]). Assume that appr (w, co (B [[P ]])) holds. It follows from Lemma 5.18 that us = u′s. We
now prove that uf = u′f . Let I ′ be an index set for the clauses in P ′. Since (us, uf ) = UP (B [[P ]]),
we have:
A ∉ uf iff ∃ i ∈ I . ∃A←B ∈ ground (P [i]) . ¬ ((B [[P ]]) makes B false)
iff ∃ i ∈ I . ∃A←B ∈ ground (P [i]) . co (B [[P ]]) makes B true
iff ∃ i ∈ I ′ . ∃A←B ∈ ground (P ′[i]) . co (B [[P ]]) makes B true
iff ∃ i ∈ I ′ . ∃A←B ∈ ground (P ′[i]) . ¬ ((B [[P ]]) makes B false)
iff A ∉ u′f .
The second and fourth bi-implications are by Lemma 5.5, the third is by Lemma 5.18. It follows
that (us, uf ) = (u′s, u′f ), that is, B [[P ]] is a fixpoint for UP ′ .
Theorem 5.22 Let S be a lattice abstraction scheme, let P be a program, and let w ∈ S. If
appr (w, co (B [[P ]])) holds, then P and (spec P w) are equivalent.
Proof: It follows from Lemmas 5.19 and 5.21 that B [[P ]] is the least fixpoint for UP ′ where P ′ =
(spec P w). Thus B [[P ′]] = B [[P ]].
For this reason we are interested in finding approximations to co (B [[P ]]). We have the following
proposition.
Proposition 5.23 Let S be a lattice abstraction scheme and let P be a program. Then
1. appr (WP ↓ n, co (B [[P ]])) holds for all n ∈ ω,
2. if γS is co-continuous then appr (gfpWP , co (B [[P ]])) holds.
Proof: Both statements follow directly from Propositions 5.13, 5.7 and 3.11.
Thus when specialising programs we are interested in computing (WP ↓ n) or, in cases where γS is
co-continuous, gfpWP . We note that for the two canonicised abstraction schemes discussed here,
γS is co-continuous.
Example 5.24 Let P be the following (correct) program to compute list membership:
member(u, u : v).
member(u, x : v)←member(u, v).
Using the singleton abstraction scheme we have gfpWP = (ws, wf ) where
ws = {member(u, y : z)}
wf = {member(u, v)}.
Figure 5.1: Two non-deterministic automata
It follows that spec P (gfpWP ) =
member(u, u : v).
member(u, x : y : z)←member(u, y : z).
is equivalent to P .
The following example indicates the usefulness of program specialisation.
Example 5.25 Consider the following non-deterministic program P .
inter(x)← accept(1, x), accept(3, x).
accept(1, a : x)← accept(1, x).
accept(1, a : x)← accept(2, x).
accept(2, b : nil).
accept(3, a : x)← accept(4, x).
accept(4, b : x)← accept(4, x).
accept(4, b : nil).
The predicate accept defines two non-deterministic finite automata as shown in Figure 5.1. The
predicate inter defines a language as the intersection of the regular languages accepted by the two
automata. Using the canonicised depth 3 abstraction scheme, gfp WP = (ws, wf ) where
ws = { inter(a : b : nil),
accept(1, a : a : x), accept(1, a : b : nil), accept(2, b : nil),
accept(3, a : b : x), accept(4, b : b : x), accept(4, b : nil) }.
So spec P (gfp WP ) =
inter(a : b : nil)← accept(1, a : b : nil), accept(3, a : b : nil).
accept(1, a : a : a : x)← accept(1, a : a : x).
accept(1, a : a : b : nil)← accept(1, a : b : nil).
accept(1, a : b : nil)← accept(2, b : nil).
accept(2, b : nil).
accept(3, a : b : b : x)← accept(4, b : b : x).
accept(3, a : b : nil)← accept(4, b : nil).
accept(4, b : b : b : x)← accept(4, b : b : x).
accept(4, b : b : nil)← accept(4, b : nil).
accept(4, b : nil).
This is a specialised, deterministic version of P , equivalent to P . The program can be simplified
further by unfolding away all ground bodies. The query ←inter(x) no longer requires backtracking;
in fact the only answer can be produced in one derivation step.
Let us finally discuss termination properties of the dataflow analyses presented in this section.
Lemma 5.26 Assume our lexicon is finite.
1. The canonicised depth k abstraction schemes are both ascending and descending chain finite.
2. The canonicised singleton abstraction scheme is ascending chain finite.
Proof: (1) This follows from the fact that every canonicised depth k abstraction scheme is finite.
(2) It is well known that Atom⊳ is Noetherian, given a finite lexicon (see Reynolds [85],
Theorem 5). Since, by definition, each atom abstraction in the canonicised singleton scheme has
at most n elements, where n is the number of predicate symbols, the assertion follows.
Proposition 5.27 Let P be a program.
1. If S is a canonicised depth k abstraction scheme, then the ascending and descending Kleene
sequences for WP are finite.
2. If S is the canonicised singleton abstraction scheme, then the ascending Kleene sequence for
WP is finite.
Proof: The assertions follow from the definition of WP and Lemma 5.26.
5.3 Applications and related work
We have presented two dataflow analyses and argued their correctness with respect to the semantic
function B. The question remains, however, what this means if one assumes another underlying
semantics, such as SLDNF resolution [10, 49]. In this section we justify our choice of semantics
by showing that it is, in a precise sense, the most abstract of a number of possible semantics for
normal logic programs. In particular, the semantics we consider are: the set of logical consequences
(in three-valued logic) of a program’s completion, SLDNF resolution, and the standard Prolog
semantics. This means that an analysis based on B is automatically correct with respect to all
these semantics.
Fitting’s proposal is not the only suggestion to base a semantic definition for logic programs on
many-valued logic. Mycroft [74] was the first to discuss this possibility and its advantages. Other
proposals have followed. It is beyond the scope of this chapter to provide detailed motivation and
definition for the various semantics. Readers are referred to the original sources for details.
Kunen [47] has advocated a declarative semantics for normal programs, in which the meaning of
a program is the set of logical consequences in three-valued logic of the program’s Clark completion.
An alternative, operational definition of this semantics is also given in terms of UP . We now recall
this definition.
As noted in Section 5.1, any consistent interpretation u may be viewed as a mapping from
ground atoms to the truth values true, false, or ⊥. Let Form denote the set of closed formulas
over the given alphabet. Then the mapping can be extended in the natural way [47] to a mapping
Form → {true, false, ⊥}. Let consequences u be the set of closed formulas
mapped to true by this extension. Let the function lcons : P Interp → P Form be defined by
lcons Z = ⋃ (∆ consequences Z) and let Lcon = P (lcons {u ∈ Interp | cons u}). Ordered by
subset ordering, Lcon is a complete lattice.
Definition. The (three-valued logic) semantic function L : Prog → Lcon is defined by
L [[P ]] = ∆ lcons {UP ↑ n | n ∈ ω}.
Another natural semantics for normal logic programs is given in terms of SLDNF derivations
[10, 49]. Writing the application of a substitution θ to a syntactic object S as (θ S) and using ∀
and ∃ for universal and existential closure respectively, we can define the SLDNF semantics
as follows:
Definition. The (SLDNF) semantic function S : Prog → Lcon is defined by
S [[P ]] = {¬ ∃G | P ∪ {G} has a finitely failed SLDNF tree} ∪
{∀(θ G) | θ is a computed (SLDNF) answer for P ∪ {G}}.
5.3. APPLICATIONS AND RELATED WORK 45
Shepherdson [90] has shown that SLDNF resolution is sound with respect to the Clark completion
of a program in three-valued logic. Therefore, using the identity function on Lcon as
concretization function, we have the following result.
Proposition 5.28 L ∝ S.
Standard Prolog may be considered as a restricted form of SLDNF resolution in which the compu-
tation rule is left-to-right and a depth-first search rule is used. We assume that standard Prolog is
“sound” in that unification with the occur check is performed and that when a non-ground negative
literal is selected, the derivation halts with an undefined value.
Definition. The (Prolog) semantic function P : Prog → Lcon is defined by
P [[P ]] = {¬ ∃G | P ∪ {G} fails finitely} ∪
{∀(θ G) | θ is a computed answer for P ∪ {G}}.
Since every derivation constructed by standard Prolog is an SLDNF derivation, it is apparent that
the following holds (again using the identity function on Lcon as concretization function).
Proposition 5.29 S ∝ P.
We finally look at the relationship between the base semantics and L. Let γ : Interp → Lcon be
defined by γ u = lcons {u′ | u′ ≤ u ∧ cons u′}.
Proposition 5.30 (γ, Interp, Lcon) is an insertion.
Proof: Clearly γ is monotonic and co-strict. We now show that γ is injective. Let u = (us, uf ) and
u′ = (u′s, u′f ) be distinct elements of Interp. Since u ≠ u′, either us ≠ u′s or uf ≠ u′f . If us ≠ u′s
then, by symmetry, we can assume that there is some A ∈ us \ u′s. Then A ∈ γ u, while A ∈ γ u′
cannot hold. Thus γ u ≠ γ u′. Similarly, if uf ≠ u′f then by symmetry we can assume
that there is some A ∈ uf \ u′f . Then ¬A ∈ γ u but ¬A ∉ γ u′, and again γ u ≠ γ u′. Thus γ is
injective.
Proposition 5.31 B ∝ L.
Proof: Let P be a program and assume u ∈ {UP ↑ n | n ∈ ω}. Then u = (UP ↑ n) for some n ∈ ω,
and u is consistent. Since u ≤ lfp UP , it follows that u ∈ {u′ | u′ ≤ lfp UP ∧ cons u′}. We therefore
have that ∆ lcons {UP ↑ n | n ∈ ω} ⊆ ∆ lcons {u′ | u′ ≤ lfp UP ∧ cons u′}. So L [[P ]] ⊆ γ (B [[P ]]).
Since P was arbitrary, B ∝ L.
Since ∝ is transitive we have the following theorem showing that B is the most abstract of the
standard semantics we have considered.
Theorem 5.32 B ∝ X holds for all X ∈ {L,S,P}.
Corollary 5.33 A dataflow analysis based on B is correct with respect to L, S, and P.
The above result is very satisfactory, but there is a price to be paid for such an abstract semantics.
As discussed in Chapter 4, more abstraction implies less precision in dataflow analyses. Analysing
a Prolog program, for example, one can make more precise statements about runtime behaviour
by using the knowledge that the computation rule is left-to-right. The abstract semantics B does
not make this assumption. It is, however, possible to capture the left-to-right computation rule
in a fixpoint characterisation such as B’s. This can be done simply by changing the definitions of
the “makes true” and “makes false” relations so that they correspond to conjunction in McCarthy logic
[51]. For example, u should make L ∧ L′ false iff u makes L false, or u makes L true and L′ false.
By slightly more complicated means, one can capture Prolog’s search rule using the asymmetric
“∨” of McCarthy logic. It is outside the scope of the present thesis to detail this, but we note that
such a semantics would be safely approximated by B, that is, any dataflow analysis based on B
would be correct with respect to it.
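The difference between the symmetric (Kleene) conjunction implicit in B and McCarthy's sequential conjunction can be sketched in a few lines of Python, with the undefined value ⊥ represented by None (an encoding assumed for this illustration):

```python
BOT = None  # the undefined truth value ⊥

def kleene_and(l, r):
    # symmetric three-valued conjunction: false dominates from either side
    if l is False or r is False:
        return False
    if l is BOT or r is BOT:
        return BOT
    return True

def mccarthy_and(l, r):
    # sequential (left-to-right) conjunction: an undefined left
    # conjunct hides the right one, mirroring Prolog's computation rule
    if l is False:
        return False
    if l is BOT:
        return BOT
    return r
```

The asymmetry is exactly the point made above: ⊥ ∧ false is false in Kleene logic but ⊥ in McCarthy logic, since a non-terminating left conjunct prevents the right one from ever being reached.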
We have shown how the information provided by type inference may be used for program
specialisation or to highlight errors in a program. Our use of the analyses has pointed to the
possibility of automated program specialisation that is independent of particular queries. In a rather
different context, one can imagine bottom-up dataflow analysis being used for query optimisation
in deductive databases. A bottom-up analysis is very natural for this since it corresponds to the
operational semantics of deductive databases.
However, for many applications it would be useful to describe arbitrary atoms, not just ground
atoms as was done in this chapter. For example, information about the interdependencies of term
groundness is useful for query handling in deductive databases, see for example Dart [15]. Such
applications presuppose a semantic definition based on the handling of arbitrary terms. That
is, the definitions should give all logical consequences of a program, rather than just the ground
consequences (over some alphabet). Definitions of this kind have been suggested [24, 64], though
only for the class of definite programs, and their use as a basis for abstract interpretation has been
investigated by Barbuti, Giacobazzi and Levi [4] and by Marriott and Søndergaard [64]. Program
specialisation has also been studied by Gallagher et al. [27] who use OLDT resolution [94] as
semantic basis.
Chapter 6
Top-Down Analysis of Definite Logic
Programs
In this chapter we consider top-down dataflow analysis of logic programs. We show how abstract
interpretation can be used to develop useful dataflow analyses that for example may help a Prolog
compiler improve its code generation.
The analyses sketched in Section 5.2 are based on a “bottom-up,” or “forward chaining” seman-
tics. We have seen how they could be used for a kind of type inference, and how this in turn might
be useful for program specialisation and detection of programming errors. However, a limitation
of the semantics of Chapter 5 is that it does not give any information about how variables would
actually become bound during execution of a program. Most compiler “optimisations” depend on
knowledge about variable bindings at various stages of execution, so to support such transforma-
tions, we have to develop semantic definitions that manipulate substitutions. Furthermore, a typical
Prolog compiler, say, is based on an execution mechanism given by SLD resolution—the semantics
we gave in Chapter 5 lacks the SLD model’s notion of “calls” to clauses.
We therefore need to base our definitions on a formalisation of SLD resolution. Such a formal-
isation is given in Section 6.1. However, as we shall explain, the SLD semantics itself is not very
useful as a basis for dataflow analysis. We therefore develop denotational definitions that are better
suited. In Section 6.2 we introduce “parametric substitutions” as the natural semantic domain for
such definitions and give the definition of a “base” semantics. In Section 6.3 we transform the
semantic definition into a generic definition of a wide range of dataflow analyses. The definition
is generic in the sense that it remains uncommitted as to what constitute “substitution approx-
imations” and how to “compose” and “unify” them. As an example of a dataflow analysis that
can be cast as an instance of the generic definition, we show in Section 6.4 how to extract runtime
groundness information from a program at compile time. Finally, in Section 6.5, related work is
discussed and other applications of the presented (or a very similar) framework are listed.
48 CHAPTER 6. TOP-DOWN ANALYSIS OF DEFINITE LOGIC PROGRAMS
6.1 SLD semantics
Unlike the case in Chapter 5, the semantic definitions given in this chapter apply to definite
programs only. We assume the standard left-to-right computation rule—this is what we mean
by “pure Prolog.” While it is perfectly possible to give a mathematical definition of the (top-
down) semantics of normal logic programs, there are diverging suggestions as to what exactly the
semantics should be. Furthermore, since the handling of negation would call for much overhead and
thus obscure the central ideas (about dataflow analysis as non-standard semantics), we restrict
our attention to definite programs.
The definition in Section 5.1 is useful as a basis for dataflow analyses that yield approximations
to a program’s success set (and finite failure set). It is, however, not very useful as a basis for a
number of the dataflow analyses that compiler designers are often interested in. Interpreters
and compilers for logic programming languages are usually based on SLD resolution as execution
mechanism. The previous definition does not capture SLD resolution, because it has no notion of
“calls” to clauses, as has the SLD model. In the SLD model, the first thing that takes place when
a clause is called is a unification of the calling atom and the clause’s head. This unification is an
important target for a compiler’s attempts to generate more efficient code, because the general uni-
fication algorithm is expensive, and most calls only need very specialised versions of the algorithm.
(There are many transformations other than unification specialisation that we are interested in,
and which our framework will support, but this serves as sufficient motivation for incorporating
information about calls in the semantic definition.)
We assume the same syntax as in Chapter 5, except that all literals are now positive. We
let Q ∈ Pred , F ∈ Fun, V ∈ Var , where Pred , Fun , and Var denote the syntactic categories of
predicate symbols, functors, and variables, respectively. The set Var is assumed to be countably
infinite. The syntactic categories Prog , Clause, Atom, and Term are as usual. We assume that we
are given a function vars : (Prog ∪Atom∪Term)→ P Var , such that (vars S) is the set of variables
that occur in the syntactic object S. As usual, we use italic capital letters for meta-variables.
A program denotes a computation in a domain of substitutions. A substitution is an almost-
identity mapping θ ∈ Sub ⊆ Var → Term from the set of variables Var to the set of terms over
Var . Substitutions are not distinguished from their natural extensions to Atom → Atom. Our
notation for substitutions is standard. For instance {x 7→ a} denotes the substitution θ such that
(θ x) = a and (θ V ) = V for all V 6= x. We let ι denote the identity substitution. The functions
dom , rng , vars : Sub → P Var are defined by
dom θ = {V | θ V ≠ V }
rng θ = ⋃ {vars (θ V ) | V ∈ dom θ}
vars θ = (dom θ) ∪ (rng θ).
A unifier of A,H ∈ Atom is a substitution θ such that (θ A) = (θ H). A unifier θ of A and H is an
(idempotent) most general unifier of A and H iff θ′ = θ′ ◦ θ for every unifier θ′ of A and H. The
6.1. SLD SEMANTICS 49
auxiliary function mgu : Atom → Atom → P Sub is defined as follows. If A and H are unifiable,
then (mgu A H) yields a singleton set consisting of a most general unifier of A and H. Otherwise
(mgu A H) = ∅. The function restrict : P Var → Sub → Sub is defined by
restrict U θ V = if V ∈ U then θ V else V.
We shall also be interested in renamings. A renaming is a bijection ρ ∈ Ren ⊂ Var → Var which
satisfies ρ = ρ−1. We do not distinguish a renaming from its natural extension to atoms and clauses.
We assume that we are given a function ren : P Var → P Var → Ren for generating renamings:
(ren U W ) is some renaming such that (ren U W ) V ∉ U for all V in W . Note that this renames
the variables in W “away from” U . We shall primarily use this function for the renaming of clauses
(by natural extension): let the function rename : P Var → Clause → Clause be defined by
rename U C = ren U (vars C) C.
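A minimal Python rendition of these operations may help fix intuitions. The dictionary encoding of substitutions (variable name to term, terms as tuples with variables written ('var', name)) and the use of fresh numbered names in place of a true bijective renaming are simplifying assumptions of this sketch:

```python
def apply_sub(theta, t):
    # apply a (classical, simultaneous) substitution to a term
    if t[0] == 'var':
        return theta.get(t[1], t)
    return (t[0],) + tuple(apply_sub(theta, a) for a in t[1:])

def compose(mu, theta):
    # (mu ∘ theta) x = mu (theta x); drop identity bindings afterwards
    out = {v: apply_sub(mu, t) for v, t in theta.items()}
    for v, t in mu.items():
        out.setdefault(v, t)
    return {v: t for v, t in out.items() if t != ('var', v)}

def restrict(u, theta):
    # restrict U θ: keep only the bindings for variables in u
    return {v: t for v, t in theta.items() if v in u}

def rename_away(avoid, vs):
    # rename the variables vs "away from" avoid, via fresh names;
    # unlike the thesis's bijective ρ, this is a one-way map
    ren, i = {}, 0
    for v in sorted(vs):
        while f"v{i}" in avoid or f"v{i}" in vs:
            i += 1
        ren[v] = ('var', f"v{i}")
        i += 1
    return ren
```

For instance, composing {y ↦ a} with {x ↦ f(y)} yields {x ↦ f(a), y ↦ a}, and restricting that to {x} forgets the binding for y.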
Our definition assumes the standard (left-to-right) computation rule. It is given in a form that
should make it clear that it is equivalent to the usual SLD model [2, 49]. The main difference from
the standard definition is that we express the SLD semantics using a fixpoint characterisation. We
let :: denote concatenation of sequences. Clauses are implicitly universally quantified, so each call
of a clause should produce new incarnations of its variables. This is where the function rename
is needed: the generated clause instance will contain no variables previously met during execution
(those in θ) or to be met later (those in A : G). As usual, we think of a program as a set of clauses,
and of goals and bodies as sequences of atoms. The domain P Sub is ordered by subset ordering,
the other domains are ordered pointwise.
Definition. The SLD semantics has semantic domains
Env = Atom∗ → P Sub
Sem = Sub → Env
and semantic functions
O : Prog → Env
O′ : Prog → Sem → Sem
C′ : Clause → Sem → Sem.
It is defined by
O [[P ]] G = restrict (vars G) (lfp (O′ [[P ]]) ι G)
O′ [[P ]] = ⊔ {C′ [[C]] | C ∈ P}
C′ [[C]] s θ nil = {θ}
C′ [[C]] s θ (A : G) = let U = (vars θ) ∪ (vars (A : G)) in
let [[H ←B]] = rename U C in
let M = mgu A H in
⋃ {s (µ ◦ θ) (µ (B :: G)) | µ ∈ M}.
By this definition, finite failure is not distinguished from non-termination. For example, the empty
program and the program consisting only of the clause p ← p have the same denotation. This
reflects the fact that the statements we are interested in generating from a dataflow analysis are of
the form “whenever execution reaches this point, so and so holds.” In saying so, we do not actually
say that execution does reach the point. In particular, “whenever the computation terminates, so
and so holds” concludes nothing about termination. As non-termination is not distinguished from
finite failure, we can assume a parallel search rule, rather than the customary depth-first rule—this
simplifies our task. Owing to the use of a parallel search rule, the execution of a program naturally
yields a set of answer substitutions, as opposed to a sequence.
Example 6.1 Consider the following list concatenation program P :
append(nil, y, y).
append(u : x, y, u : z)← append(x, y, z).
and let A = append(x, y, a : nil). Execution of P yields two instantiations of the variables in A.
We have that O [[P ]] (A : nil) = {{x 7→ nil, y 7→ a : nil}, {x 7→ a : nil, y 7→ nil}}.
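Example 6.1 can be executed directly by a small Python interpreter that follows the shape of the definition of O: clause variables are renamed at every call, unification (here with occur check) yields at most one most general unifier, and the left-to-right computation rule is realised by treating the goal as a list. The tuple term encoding and the fresh-name renaming scheme are assumptions of this sketch, and being depth-first it only enumerates answers of finite SLD trees.

```python
import itertools

_fresh = itertools.count()

def deref(theta, t):
    # fully apply the accumulated bindings to a term
    if t[0] == 'var':
        bound = theta.get(t[1])
        return t if bound is None else deref(theta, bound)
    return (t[0],) + tuple(deref(theta, a) for a in t[1:])

def occurs(name, t):
    if t[0] == 'var':
        return t[1] == name
    return any(occurs(name, a) for a in t[1:])

def unify(a, b, theta):
    # returns an extended substitution, or None when (mgu A H) = ∅
    a, b = deref(theta, a), deref(theta, b)
    if a == b:
        return theta
    if a[0] == 'var':
        return None if occurs(a[1], b) else {**theta, a[1]: b}
    if b[0] == 'var':
        return unify(b, a, theta)
    if a[0] != b[0] or len(a) != len(b):
        return None
    for x, y in zip(a[1:], b[1:]):
        theta = unify(x, y, theta)
        if theta is None:
            return None
    return theta

def rename_clause(clause):
    # a fresh incarnation of the clause's variables for every call
    head, body = clause
    mapping = {}
    def ren(t):
        if t[0] == 'var':
            if t[1] not in mapping:
                mapping[t[1]] = ('var', f"_{t[1]}{next(_fresh)}")
            return mapping[t[1]]
        return (t[0],) + tuple(ren(a) for a in t[1:])
    return ren(head), [ren(a) for a in body]

def solve(program, goal, theta):
    # left-to-right computation rule, clauses tried in textual order
    if not goal:
        yield theta
        return
    atom, rest = goal[0], goal[1:]
    for clause in program:
        head, body = rename_clause(clause)
        mu = unify(atom, head, theta)
        if mu is not None:
            yield from solve(program, body + rest, mu)

def answers(program, goal, goal_vars):
    # O[[P]] G: computed answers restricted to the variables of G
    return [{v: deref(theta, ('var', v)) for v in goal_vars}
            for theta in solve(program, goal, {})]
```

Running the append program on the goal append(x, y, a : nil) produces exactly the two answer substitutions listed in Example 6.1.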
A note about semantical modeling and dataflow analysis may be appropriate at this point. We
have defined O such that O, given a program P and a goal G, returns the set of computed answer
substitutions Θ. That is, O selects what is considered the relevant information from the SLD tree
for P ∪ {G}, namely for each success leaf, the composition of the substitutions that decorate the
path from the success leaf to the root. Other information is “forgotten.” For example, (O [[P ]] G)
contains no information about the length of derivations, the substitutions that decorate paths
leading to failure nodes, or how variables outside G become bound during execution.
This is quite in accordance with the common understanding of SLD semantics qua the result
of applying SLD resolution. However, as we earlier discussed, dataflow analysis aims at extracting
information about a program’s execution, that is, about the very details that our “SLD semantics”
forgets. For example, a dataflow analysis that aims at determining which calls may appear at
runtime cannot afford to disregard those paths in the SLD tree that lead to failure. This is why
we need a notion of “extended semantics,” as discussed at the end of Section 3.2.
One might argue that (O [[P ]] G) should really return the whole SLD tree for P ∪{G}, so as to
constitute the ultimate extended semantics with respect to which all dataflow analyses were to be
justified. However, since we do not want to commit ourselves to an operational semantics at that
level of detail, it seems best to take O as point of departure and extend it to “collect” whatever
extra information we may want on a case by case basis.
In this context we are specifically interested in the calls that may occur during the execution
of P . In the extended semantics given below, a “repository” is therefore maintained, consisting of
all the calls that occur during execution1. Thus the domain of repositories is P Atom, ordered by
subset ordering. Other domains are ordered pointwise.
1A repository is similar to what is sometimes called a “context vector” [13], a “log” [43, 92], or a “record” [59].
Definition. The SLD call semantics has semantic domains
Repos = P Atom
Env = Atom∗ → Repos
Sem = Sub → Env
and semantic functions
Ocall : Prog → Env
O′call : Prog → Sem → Sem
C′call : Clause → Sem → Sem.
It is defined by
Ocall [[P ]] G = lfp (O′call [[P ]]) ι G
O′call [[P ]] = ⊔ {C′call [[C]] | C ∈ P}
C′call [[C]] s θ nil = ∅
C′call [[C]] s θ (A : G) = let U = (vars θ) ∪ (vars (A : G)) in
let [[H ←B]] = rename U C in
let M = mgu A H in
{A} ∪ ⋃ {s (µ ◦ θ) (µ (B :: G)) | µ ∈ M}.
This semantics is still not particularly suitable as a basis for dataflow analysis, since termination of
the analyses cannot easily be guaranteed. To see this, assume that we are interested in determining
which variables are bound to ground terms in calls to clauses. As descriptions we choose sets U of
variables, with the intention that V ∈ U means that the current substitution definitely binds V to
ground terms. Now consider the following program:
ancestor(x, z)← parent(x, z).
ancestor(x, z)← ancestor(x, y), ancestor(y, z).
parent(a, b).
Assume we are given the goal ←ancestor(a, z). An analysis based on the SLD semantics must
compute an infinite number of goal/description pairs, including
(ancestor(x, z), {x})
(ancestor(x, y), ancestor(y, z), {x})
(ancestor(x, w), ancestor(w, y), ancestor(y, z), {x})
...
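The growth is easy to see mechanically. The toy sketch below (Python; the goal representation is a hypothetical choice of ours, not the thesis's semantics) resolves the leftmost ancestor call against the recursive clause and shows the goal growing by one atom at every step, so the enumeration of goal/description pairs never closes:

```python
import itertools

fresh = itertools.count()

def resolve_recursive(goal):
    """Replace the leftmost atom ancestor(s, t) by
    ancestor(s, w), ancestor(w, t), with w a fresh variable."""
    (s, t), rest = goal[0], goal[1:]
    w = f"w{next(fresh)}"
    return ((s, w), (w, t)) + rest

goal = (("x", "z"),)        # the goal ancestor(x, z), description {x}
lengths = []
for _ in range(5):
    lengths.append(len(goal))
    goal = resolve_recursive(goal)

print(lengths)              # the goal length grows at every step
```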
Clearly such a dataflow analysis will not terminate. What we need is a semantic definition that
somehow “merges” information about all incarnations of a variable that appears in a program and
is compositional in the sense that the denotation of a goal is given in terms of the denotation of its
subgoals. In Section 6.2 we develop such a definition.
6.2 A semantics based on parametric substitutions
In this section we develop a theory of “substitutions” which differs somewhat from classical sub-
stitution theory. There are a number of reasons for doing this. First, classical substitutions have
some drawbacks when used in a denotational definition. Most general unifiers are not unique, even
when idempotent. For example, unifying p(x) and p(y), it is not clear whether the result should
be {x ↦ y} or {y ↦ x}, and it is hard to guarantee that a semantic definition is invariant under
the choice of unification function. Second, renaming traditionally causes problems. Most existing
approaches either ignore the problem by postulating some magical (invisible or nondeterministic)
renaming operator, or they commit themselves to a particular renaming technique (as do Jones
and Søndergaard [43]).
As an example of the problem, consider the query ←p(x) and the program consisting of the
single clause p(x). By the definition of Jones and Søndergaard the answer substitution is {{x ↦ x1}},
but one could argue that {{x ↦ x17}}, {{x ↦ z}}, or even {ι}, would be just as appropriate. It
would be better if substitutions would somehow “automatically” perform renaming, preferably in
a way that made no assumptions about renaming technique. Furthermore, most dataflow analyses
merge information about the various incarnations of variables anyway, so a specific “renaming
function” should be avoided, if possible.
Our denotational definition is based on a notion of “parametric substitutions.” These have
previously been studied by Maher [52], although for a different purpose and under the name “fully
parameterized substitutions.” We follow Maher in distinguishing between variables, whose names
are significant, and parameters, whose names are insignificant.
Recall that Var is a countably infinite set of variables. We think of Var , Term , and Atom
as syntactic categories: the variables that appear in a program are assumed to be taken from
Var . In contradistinction to Var , Par is a countably infinite set of parameters. Both variables
and parameters serve as “placeholders,” but it proves useful to maintain the distinction: Var and
Par are disjoint. A one-to-one correspondence between the two is given by the bijective function
ǫ : Var → Par . In examples we distinguish variables from parameters by putting primes on the
latter.
The set of terms over Par is denoted by Term [Par ], and the set of atoms over Par is denoted
by Atom [Par ]. A substitution into Par is a total mapping σ : Var → Term[Par ] which is “finitely
based” in the sense that all but a finite number of variables are mapped into distinct parameters.
We do not distinguish substitutions into Par from their natural extension to Atom → Atom[Par ].
The set of substitutions into Par is denoted Sub[Par ].
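Since parameter names are insignificant, taking the ⊳-equivalence class of a term over Par amounts to renaming its parameters canonically, say in order of first occurrence. A small sketch (Python; the nested-tuple representation of terms is a hypothetical choice of ours):

```python
def canon(term, seen=None):
    """Rename parameters to p0, p1, ... in order of first occurrence,
    so equivalent terms over Par get identical representations."""
    if seen is None:
        seen = {}
    tag = term[0]
    if tag == 'par':
        name = term[1]
        if name not in seen:
            seen[name] = f"p{len(seen)}"
        return ('par', seen[name])
    if tag == 'fun':
        return ('fun', term[1], tuple(canon(a, seen) for a in term[2]))
    return term  # constants, e.g. ('con', c)

t1 = ('fun', 'f', (('par', "y'"),))
t2 = ('fun', 'f', (('par', "z'"),))
assert canon(t1) == canon(t2)   # parameter names are insignificant

t3 = ('fun', 'f', (('par', "y'"), ('par', "y'")))
t4 = ('fun', 'f', (('par', "y'"), ('par', "z'")))
assert canon(t3) != canon(t4)   # but sharing of parameters is significant
```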
Let us for the time being denote Atom by Atom[Var ] to stress its distinction from Atom[Par ],
and let X be either Var or Par . We define the standard preordering ⊳ on Atom [X] by A ⊳ A′ iff
∃ τ : X → Term[X] . A = (τ A′). The standard preordering clearly induces an equivalence relation
on Atom[Var ] (or Atom[Par ]) which is “consistent variable (or parameter) renaming.” We denote
the resulting quotient by Atom[X]⊳ and use ⊳ to denote the induced partial ordering on Atom[X]⊳
as well. Note that Atom [X]⊳ has neither a least, nor a greatest, element.
For A ∈ A, where A ∈ Atom[X]⊳, we may denote the equivalence class A by [A]. In fact we think
of [·] as the function from Atom [X] to Atom[X]⊳ that, given A, yields the equivalence class of A.
A parametric substitution is a mapping from Atom to Atom[Par ]⊳. We will not allow all such
mappings, though: the mappings we use are “essentially” substitutions. More precisely, the set
Psub ⊆ Atom → Atom[Par ]⊳ of parametric substitutions is defined by
Psub = {[·] ◦ σ | σ ∈ Sub[Par ]}.
That is, application of a parametric substitution can be seen as application of a substitution that
has only parameters in its range, followed by taking the equivalence class of the result. We equip
Psub with a partial ordering ≤, namely pointwise ordering on {[·] ◦ σ | σ ∈ Sub[Par ]}. Whenever
convenient we will think of Psub and Atom[Par ]⊳ as having artificial least elements ⊥Psub and
⊥Atom added (such that ⊥Psub A = ⊥Atom). We then have the following result [52].
Proposition 6.2 Psub is a complete lattice.
We use a notation for parametric substitutions similar to that used for substitutions; however,
square rather than curly brackets are used, to distinguish the two. For example, the parametric
substitution π = [x ↦ x′, y ↦ f(x′)] will map p(x, y, z) to [p(x′, f(x′), z′)]. So π corresponds to the
substitution {y ↦ f(x)}. An alternative way to think of π is as an existentially quantified (over
parameters) conjunction: ∃x′ . x = x′ ∧ y = f(x′) [52].
Definition. The function β : Sub → Psub maps a substitution to the corresponding parametric
substitution. It is defined by β θ A = [ǫ (θ A)].
So every substitution θ has a corresponding parametric substitution (β θ). However, there are
parametric substitutions that are not in the range of β. For example, there is no substitution θ
such that (β θ) = [x ↦ f(y′, z′)].
We now make use of parametric substitutions to give semantic definition which is suitable as a
basis for abstract interpretation and which, in a sense we make precise, captures SLD resolution.
The definition makes use of the auxiliary functions, meet , which corresponds to composition, and
unify, which corresponds to computing the most general unifier.²
² “Corresponds” should be taken in a loose sense: meet, unlike composition, is commutative, and unify is non-commutative.
Definition. The function meet : P Psub → P Psub → P Psub is defined by
meet Π Π′ = {π ⊓ π′ | π ∈ Π ∧ π′ ∈ Π′} \ {⊥Psub}.
The function unify : Atom → Atom → P Psub → P Psub is defined by
unify H A Π = {⊔{π′ | π′ H ⊳ π A} | π ∈ Π} \ {⊥Psub}.
The auxiliary functions are defined to operate on sets of parametric substitutions, rather than
single elements, because in the next section it proves useful to have the broader definitions.
Example 6.3 Let π = [y ↦ y′, z ↦ f(y′)] and π′ = [y ↦ a]. Then
meet {π} {π′} = {[y ↦ a, z ↦ f(a)]}.
Example 6.4 Let A1 = p(a, x), A2 = p(y, z), and let π = [y ↦ y′, z ↦ f(y′)]. Then we have that
(unify A1 A2 {π}) = {[x ↦ f(a)]}. Notice that this parametric substitution does not constrain y or
z; only variables in A1 may be constrained. On the other hand we have that (unify A2 A1 {π}) =
{[y ↦ a]}, in which both x and z are unconstrained. Notice also that (unify A1 A1 {π}) = {ǫ},
while (unify A2 A2 {π}) = {π}.
Let A3 = p(f(y), z). Then (unify A3 A2 {ǫ}) = {ǫ}. There is no “occur check problem,” because
the names of placeholders in (π A2) have no significance. One can think of this as automatic
renaming being performed by unify .
Finally let A4 = p(x, x). Then (unify A4 A2 {π}) = ∅, corresponding to failure of unification.
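The meet of two parametric substitutions can be computed by unifying their bindings variable by variable, solving for the parameters. The sketch below (Python; the representation and helper names are ours, and we assume the two arguments use disjoint parameter names, which is harmless since parameter names are insignificant and can be renamed apart first) reproduces Example 6.3:

```python
def walk(t, s):
    """Follow parameter bindings in the accumulated solution s."""
    while t[0] == 'par' and t[1] in s:
        t = s[t[1]]
    return t

def unify(t1, t2, s):
    """Unify two terms over parameters; returns the extended solution,
    or None on a clash (the meet is then bottom)."""
    t1, t2 = walk(t1, s), walk(t2, s)
    if t1 == t2:
        return s
    if t1[0] == 'par':
        return {**s, t1[1]: t2}
    if t2[0] == 'par':
        return {**s, t2[1]: t1}
    if t1[0] == 'fun' and t2[0] == 'fun' and t1[1] == t2[1] \
            and len(t1[2]) == len(t2[2]):
        for a, b in zip(t1[2], t2[2]):
            s = unify(a, b, s)
            if s is None:
                return None
        return s
    return None

def resolve(t, s):
    """Apply the parameter solution s throughout a term."""
    t = walk(t, s)
    if t[0] == 'fun':
        return ('fun', t[1], tuple(resolve(a, s) for a in t[2]))
    return t

def meet(p1, p2):
    """Greatest lower bound of two parametric substitutions (dicts
    from variable names to terms over parameters)."""
    s = {}
    for v in set(p1) & set(p2):
        s = unify(p1[v], p2[v], s)
        if s is None:
            return None        # bottom: dropped from the result set
    merged = {**p1, **p2}
    return {v: resolve(t, s) for v, t in merged.items()}

# Example 6.3: pi = [y -> y', z -> f(y')],  pi2 = [y -> a]
pi = {'y': ('par', "y'"), 'z': ('fun', 'f', (('par', "y'"),))}
pi2 = {'y': ('fun', 'a', ())}
assert meet(pi, pi2) == {'y': ('fun', 'a', ()),
                         'z': ('fun', 'f', (('fun', 'a', ()),))}
```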
The idea behind unify should be clear from these examples: not only does the function unify
perform renaming “automatically,” it also restricts interest to certain variables, namely those of its
first argument.
Lemma 6.5 The functions meet and unify are continuous.
The following lemma captures the relationship between unify and mgu. Let (dom π) denote the
domain of the parametric substitution π, that is,
dom π = {V ∈ Var | π V ∉ Par ∨ ∃V ′ ∈ Var . (V ≠ V ′ ∧ π V = π V ′)}.
Let (restr U π) be the parametric substitution that acts like π when applied to variables in U ,
but maps variables outside U to distinct parameters. Let restrict U = ∆ (restr U). Recall that
renamings were defined in Section 6.1.
Lemma 6.6 Let A,H ∈ Atom and Π ⊆ Psub. Let ρ = ren ((vars H)∪(∆ dom Π)) (vars A). Then
unify H A Π = restrict(vars H) (meet{π ◦ ρ | π ∈ Π} (∆ β (mgu H (ρ A)))).
The following definition³ captures the essence of an SLD refutation-based interpreter using a standard
computation rule and a parallel search rule. Recall that (Σ d) is the folded version of d. The
domain Atom is ordered by identity, Den is ordered pointwise.
Definition. The base semantics has semantic domain
Den = Atom → P Psub → P Psub
and semantic functions
Pbas : Prog → Den
P′bas : Prog → Den → Den
Cbas : Clause → Den → Den.
It is defined as follows.
Pbas [[P ]] = lfp (P′bas [[P ]])
P′bas [[P ]] = ⊔C∈P (Cbas [[C]])
Cbas [[H ←B]] d A Π = ⋃π∈Π meet {π} (unify A H (Σ d B (unify H A {π}))).
The idea behind the relationship between SLD and the base semantics is that whenever θ belongs
to (O [[P ]] (A : nil)) then restrict(vars A) (β θ) belongs to (Pbas [[P ]] A {ǫ}). However we are
interested in performing an analysis for many goals at once and so require the existence of a
somewhat stronger relationship.
Definition. Define cong : Env ×Den → Bool by cong (e, d) iff
∀A ∈ Atom . ∀Θ ⊆ Sub . ⋃θ∈Θ {θ′ (θ A) | θ′ ∈ e (θ (A : nil))} ⊆ {π A | π ∈ d A (∆ β Θ)}.
Theorem 6.7 For all programs P , cong ((O [[P ]]), (Pbas [[P ]])).
A proof of this theorem is given in Appendix A.
It is straightforward to rewrite the above definition so that it gives information about calls, just
as we rewrote O to Ocall. A proof very similar to that of Theorem 6.7 can be given to show that
a similar relationship holds between the “call” versions of O and Pbas.
³ Strictly speaking, the singleton set constructor {·} as used in the definition is not part of our meta-language, and
it is commonly avoided in denotational definitions as it is not monotonic. Its use here causes no problems: the semantic
functions are well-defined. Subsequent definitions will avoid using {·} so as to be able to utilise Proposition 3.14.
6.3 A dataflow semantics for definite logic programs
The semantic definition given in the previous section was designed to capture the essence of SLD
resolution. In this section we develop a generic dataflow analysis scheme from these definitions by
factoring out certain operations.
First we introduce some imprecision in the semantics. So far, for each substitution generated by
a clause, we kept track of the particular substitution the clause was called with, so that the
“meet” of generated substitutions and call substitutions could be computed. We now abandon
this approach in order to get closer to a dataflow semantics. The point is that, in a dataflow
semantics, we want a “description” to replace Π, the set of parametric substitutions, and since we
think of descriptions as being atomic (in the sense of non-decomposable), we need to avoid reference
to elements of Π. We thus replace “⋃π∈Π . . . π . . .” in the definition of the base semantics by
“. . . Π . . .”.
Definition. The lax semantics has semantic functions
Plax : Prog → Den
Clax : Clause → Den → Den.
It is defined as follows.
Plax [[P ]] = lfp (⊔C∈P (Clax [[C]]))
Clax [[H ←B]] d A Π = meet Π (unify A H (Σ d B (unify H A Π))).
Proposition 6.8 Plax ∝ Pbas.
Proof: Let Π be a set of parametric substitutions and let F : P Psub → P Psub be monotonic. If Π = ∅ then
trivially
⋃π∈Π ⋃π′∈F {π} (π′ ⊓ π) ⊆ ⋃π∈Π ⋃π′∈F Π (π′ ⊓ π). (6.1)
Otherwise consider some π ∈ Π. By monotonicity, F {π} ⊆ F Π, so π′⊓π ∈ F {π} ⇒ π′⊓π ∈ F Π.
It follows that (6.1) holds for arbitrary Π and thus
meet{π} (F {π}) ⊆ meet Π (F Π).
Letting the concretization function be the identity mapping, and setting
F = λΠ . unify A H (Σ d B (unify H A Π)),
we have that Clax ∝ Cbas. By Proposition 3.14 and the continuity of meet and unify , the assertion
follows.
It should be clear that we can “laxify” the “call version” of the semantics just as we have done
for the base semantics. To limit the number of semantic definitions, however, we shall not do
this and will ignore the call semantics for the remainder of this chapter. Readers should keep in
mind, however, that ultimately the touchstone for the correctness of a dataflow analysis returning
information about call patterns is its soundness with respect to the call semantics.
To extract runtime properties of pure Prolog programs one can develop a variety of non-standard
interpretations of the preceding semantics. To clarify the presentation, we extract from the lax
semantics a dataflow semantics which contains exactly those features that are common to all the
non-standard interpretations that we want. It leaves the interpretation of one domain X and two
base functions, m and u, unspecified. These missing details of the dataflow semantics are to be
filled in by interpretations. In the standard interpretation of this semantics, Ilax, X is P Psub, m is
meet and u is unify . In a non-standard interpretation, X is assigned whatever set of “descriptions”
we choose to approximate sets of parametric substitutions. X should thus be a complete lattice
which corresponds to P Psub in the standard semantics in a way laid down by some insertion
(γ,X,P Psub). The interpretation of m and u should approximate meet and unify respectively.
Prog , Clause, and Atom are static and have the obvious fixed interpretation. They are ordered by
identity. As usual, Den is ordered pointwise.
Definition. The dataflow semantics has domain
Den = Atom → X → X,
semantic functions
P : Prog → Den
C : Clause → Den → Den,
and base functions
m : X → X → X
u : Atom → Atom → X → X.
It is defined as follows.
P [[P ]] = lfp (⊔C∈P (C [[C]]))
C [[H ←B]] d A x = m x (u A H (Σ d B (u H A x))).
An interpretation Ix of the dataflow semantics is determined by the triple (Ix X, Ix m, Ix u). We
use the convention that the semantic function P as determined by an interpretation Ix is denoted
by Px. For example the standard interpretation Ilax is given by (P Psub,meet , unify) and the
corresponding semantics is denoted by Plax.
Since C [[C]] is monotonic for every interpretation, we have the following proposition.
Proposition 6.9 For every interpretation Ix, Px is well-defined.
Definition. Let I = (X,m, u) and I′ = (X ′,m′, u′) be interpretations. Then I′ is sound with
respect to I iff for some insertion (γ,X ′,X), m′ appr m and u′ appr u.
The next proposition follows immediately from Proposition 3.14.
Proposition 6.10 If interpretation Ix is sound with respect to Iy, then Px ∝ Py.
By transitivity of ∝ and Proposition 6.8 we therefore have the following result.
Theorem 6.11 If interpretation Ix is sound with respect to Ilax, then Px ∝ Pbas.
Developing a dataflow analysis in this framework is therefore a matter of choosing the description
domain so that it captures the information required from the analysis and then defining suitable
functions to approximate meet and unify. Before giving example dataflow analyses we identify two
classes of description domain and indicate how meet can be approximated for these classes. This
is useful because the descriptions used in most dataflow analyses belong to one of the two classes
or are an orthogonal mixture of such descriptions. Thus finding generic ways to approximate meet
for these classes will help to give some insight into the design of practical dataflow analyses. We
note that once a suitable approximation for meet has been found, then, by Lemma 6.6, it may be
used as the basis for developing an approximation to unify .
Notice that Pbas cannot be defined as an interpretation. By introducing the lax semantics
we have achieved simplicity but also imprecision, to the extent that the base semantics cannot
be recovered. Jones and Søndergaard have shown how, by making the dataflow semantics more
complex (by not basing it on a lax semantics), a framework can be obtained in which the base
semantics itself can be given as an interpretation [43]. Such a framework clearly allows for more
precise dataflow analyses than the framework presented in this chapter.
Definition. An insertion (γ,X,P Psub) is downwards closed iff
∀x ∈ X . ∀π, π′ ∈ Psub . π ∈ γ x ∧ π′ ⊳ π ⇒ π′ ∈ γ x.
Dually we can define upwards closure.
Examples of downwards closed insertions are those typically used in groundness analysis, type
analysis, definite aliasing and definite sharing analysis (for references see Section 6.5). Examples of
upwards closed insertions are those typically used in possible sharing (independence) analysis, pos-
sible aliasing analysis, and freeness analyses. Many complex analyses, such as for the determination
of mode statements or the detection of and-parallelism, can often be expressed as a combination
of simpler analyses based on insertions that are downwards or upwards closed. A notion of “sub-
stitution closure,” which is similar to, but somewhat stricter than downwards closure, is identified
by Debray [16] as being the basis for an important class of dataflow analyses.
Proposition 6.12 Let (γ,X,P Psub) be a downwards closed insertion and let meet ′ : X → X → X
be given. Then meet ′ appr meet iff
∀x, x′ ∈ X . (γ x) ∩ (γ x′) ⊆ γ (meet ′ x x′).
Proof: We have that
meet ′ appr meet
⇔ ∀x, x′ ∈ X . ∀Π,Π′ ⊆ Psub . Π ⊆ (γ x) ∧ (Π′ ⊆ γ x′)⇒ meet Π Π′ ⊆ γ (meet ′ x x′)
⇔ ∀x, x′ ∈ X . ∀π ∈ (γ x) . ∀π′ ∈ (γ x′) . (π ⊓ π′) ∈ γ (meet ′ x x′) ∪ {⊥Psub}
⇔ ∀x, x′ ∈ X . (γ x) ∩ (γ x′) ⊆ γ (meet ′ x x′).
Definition. An insertion (γ,X,P Psub) is Moore-closed iff (∆ γ X) is Moore-closed in P Psub.
Corollary 6.13 Let (γ,X,P Psub) be a downwards closed, Moore-closed insertion. Let ⊓X be the
greatest lower bound operator for X. Then ⊓X appr meet (and in fact is the best approximation).
Proposition 6.14 Let (γ,X,P Psub) be an upwards closed insertion and let meet ′ : X → X → X
be given. Then meet ′ appr meet iff
∀x, x′ ∈ X . {π ⊓ π′ | π, π′ ∈ (γ x) ∪ (γ x′)} ⊆ γ (meet ′ x x′) ∪ {⊥Psub}.
Proof: We have that
meet ′ appr meet
⇔ ∀x, x′ ∈ X . ∀Π,Π′ ⊆ Psub . Π ⊆ (γ x) ∧ Π′ ⊆ (γ x′)⇒ (meet Π Π′) ⊆ γ (meet ′ x x′)
⇔ ∀x, x′ ∈ X . ∀π ∈ (γ x) . ∀π′ ∈ (γ x′) . (π ⊓ π′) ∈ γ (meet ′ x x′) ∪ {⊥Psub}
⇔ ∀x, x′ ∈ X . {π ⊓ π′ | π, π′ ∈ (γ x) ∪ (γ x′)} ⊆ γ (meet ′ x x′) ∪ {⊥Psub}.
Definition. An insertion (γ,X,P Psub) is meet-closed iff
∀x ∈ X . ∀π, π′ ∈ Psub . π, π′ ∈ γ x⇒ (π ⊓ π′) ∈ γ x ∪ {⊥Psub}.
Corollary 6.15 Let (γ,X,P Psub) be an upwards closed, meet-closed insertion. Let ⊔X be the
least upper bound operator for X. Then ⊔X appr meet.
6.4 Approximating call patterns: Groundness analysis
In this section we give an example dataflow analysis for groundness propagation. This analysis
is based on the scheme given in the previous section. Propositional formulas, or, more precisely,
classes of equivalent propositional formulas, are used as descriptions. We use monotonic formulas
built from the connectives ↔, ∧, and ∨ only. These descriptions are closely related to those used
by Dart [15]. A parametric substitution π is described by a formula F if, for any instance π′ of π,
the truth assignment given by “V is true iff π′ grounds V ” satisfies F . For example, the formula
x ↔ y describes the parametric substitutions [x ↦ a, y ↦ b] and [x ↦ u′, y ↦ u′, u ↦ u′] but not
[x ↦ a]. However, x ↔ y is not the best description for [x ↦ a, y ↦ b]; that would be x ∧ y, which
in turn “implies” x ↔ y, that is, x ↔ y is a less precise approximation than x ∧ y.
Let Prop be the poset of classes of equivalent propositional formulas over some suitably large
finite variable set, ordered by implication. We can represent an equivalence class in Prop by a canonical
representative for the class, perhaps a formula which is in disjunctive or conjunctive normal form.
By an abuse of notation we will apply logical connectives to both propositional formulas and to
classes of equivalent formulas.
Lemma 6.16 The poset Prop is a finite (complete) lattice with conjunction as the greatest lower
bound and disjunction as the least upper bound.
Definition. Let γ : Prop → P Psub be defined by
γ F = {π ∈ Psub | ∀π′ ⊳ π . (assign π′) satisfies F}
where assign : Psub → Var → Bool is given by (assign π V ) iff π grounds V .
Example 6.17 Let F = [x ∧ (y ↔ z)]. Then [x ↦ a] ∈ γ F and [y ↦ b] ∉ γ F .
Let us use the notation ∧{φ1, . . . , φn} for the formula φ1 ∧ . . . ∧ φn, and similarly for ∨.
Lemma 6.18 The triple (γ,Prop,P Psub) is a downwards closed, Moore-closed insertion.
Proof: Clearly γ is monotonic and co-strict, and (γ,Prop,P Psub) is downwards closed. Let G ⊆
Prop. Then ∧G ∈ Prop and γ (∧G) = ⋂F∈G (γ F ), so (γ,Prop,P Psub) is Moore-closed.
It follows from Corollary 6.13 that meet is best approximated by conjunction.
Definition. The function meetgro : Prop → Prop → Prop is defined by meetgro F F ′ = F ∧ F ′.
Lemma 6.19 meetgro appr meet.
The function unify is somewhat more complex to approximate. Its definition makes use of project ,
a projection function on propositional formulas and mgugro , the analogue of mgu for propositional
formulas. The motivation for approximating unify in this way comes from Lemma 6.6.
Definition. Let tass U be the set of truth assignments to the variables in U ⊆ Var . The function
project : P Var → Prop → Prop is defined by
project U F = ∨{ψ F | ψ ∈ tass ((vars F ) \ U)}.
Example 6.20 project{x, y, z} [x ∧ (y ↔ v) ∧ (z ↔ v)] = [x ∧ (y ↔ z)].
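These definitions can be prototyped by identifying a class of equivalent formulas with its set of models over an explicit variable set: implication is then model inclusion, conjunction is intersection of model sets, and project is existential quantification (drop the eliminated variables from every model). The sketch below (Python; the representation is a choice of ours, not the thesis's) checks the ordering remark made for Example 6.17's domain and reproduces Example 6.20:

```python
from itertools import combinations

def subsets(vs):
    vs = list(vs)
    return [frozenset(c) for r in range(len(vs) + 1)
            for c in combinations(vs, r)]

def formula(vs, pred):
    """The class of a formula, as its set of models over the variables vs;
    a model is the set of variables assigned true."""
    vs = frozenset(vs)
    return (vs, frozenset(m for m in subsets(vs) if pred(m)))

def implies(f, g):          # the implication ordering (same variable set)
    return f[1] <= g[1]

def project(keep, f):       # disjoin over assignments to dropped variables
    keep = f[0] & frozenset(keep)
    return (keep, frozenset(m & keep for m in f[1]))

# x ∧ y is a strictly more precise description than x ↔ y:
conj_xy = formula('xy', lambda m: 'x' in m and 'y' in m)
iff_xy = formula('xy', lambda m: ('x' in m) == ('y' in m))
assert implies(conj_xy, iff_xy) and not implies(iff_xy, conj_xy)

# Example 6.20: project {x,y,z} [x ∧ (y ↔ v) ∧ (z ↔ v)] = [x ∧ (y ↔ z)]
F = formula('xyvz', lambda m: 'x' in m
            and ('y' in m) == ('v' in m) and ('z' in m) == ('v' in m))
assert project('xyz', F) == formula(
    'xyz', lambda m: 'x' in m and ('y' in m) == ('z' in m))
```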
Lemma 6.21 project appr restrict.
Proof: Let π′ = restrict U π. We must show that if π ∈ γ F then π′ ∈ γ (project U F ). Let
ψ ∈ tass ((vars F ) \ U) be a truth assignment such that ψ V ⇔ assign π V . If π ∈ γ F then
(assign π) satisfies F . Thus the assignment ψ′ ∈ tass U such that ψ′ V ⇔ assign π V will satisfy
ψ F . Since vars(ψ F ) ⊆ U and (assign π′ V ) ⇔ (assign π V ) ⇔ (ψ V ) for all V ∈ U , it follows
that (assign π′) satisfies ψ F . Thus π′ ∈ γ (project U F ).
Definition. The function mgugro : Atom → Atom → Prop is defined by
mgugro A H = if (mgu A H) = ∅ then false
             else let {µ} = (mgu A H) in ∧{V ↔ (∧{V ′ | V ′ ∈ vars (µ V )}) | V ∈ dom µ}.
Example 6.22 Let A = p(x, y) and H = p(a, f(u, v)). Then
mgugro A H = [x ∧ (y ↔ (u ∧ v))].
Lemma 6.23 (mgugro A H) appr (∆ β (mgu A H)).
Proof: This follows from the definition of mgugro and γ.
Definition. The function unifygro : Atom → Atom → Prop → Prop is defined by
unifygro H A F = let ρ = ren(vars H) ((vars A) ∪ (vars F )) in
project(vars H) ((ρ F ) ∧ (mgugro H (ρ A))).
Lemma 6.24 unifygro appr unify.
Proof: This follows from Proposition 3.14 and Lemmas 6.6, 6.19, 6.21, and 6.23.
Example 6.25 Let A = append(x, y, z), H = append(nil, y, y), and H ′ = append(u : x, y, u : z).
Then
unifygro H A [true] = [true]
unifygro A H [true] = [x ∧ (y ↔ z)]
unifygro H ′ A [true] = [true]
unifygro A H ′ [false] = [false]
unifygro A H ′ [x ∧ (y ↔ z)] = [(x ∧ y)↔ z].
Definition. The groundness analysis Pgro is given by instantiating the dataflow semantics with
the interpretation (Prop,meetgro, unifygro).
Theorem 6.26 Pgro ∝ Pbas.
Proof: This follows immediately from Lemmas 6.18, 6.19, and 6.24.
Example 6.27 Let P be the append program
append(nil, y, y).
append(u : x, y, u : z)← append(x, y, z).
Consider the goal A = ←append(x, y, z). To compute Pgro [[P ]] A [true], the analysis proceeds as
follows. Let DP = ⊔C∈P (Cgro [[C]]). We then have
(DP ↑ 0) A [true] = [false]
(DP ↑ 1) A [true] = [x ∧ (y ↔ z)] ∨ [false] = [x ∧ (y ↔ z)]
(DP ↑ 2) A [true] = [x ∧ (y ↔ z)] ∨ [(x ∧ y)↔ z] = [(x ∧ y)↔ z]
and (DP ↑ 3) = (DP ↑ 2) = lfp DP . Thus Pgro [[P ]] A [true] = [(x ∧ y)↔ z]. Similarly one can show
that Pgro [[P ]] A [z] = [x ∧ y ∧ z]. In the “call” version of Pgro, we would find that, provided this
holds in the initial call, all calls to append will have a third argument which is definitely ground.
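The whole groundness analysis can be assembled into an executable prototype. The sketch below (Python; all representation choices are ours, not the thesis's: formulas as model sets, terms as nested tuples, and a naive depth-indexed re-evaluation of DP ↑ k in place of an efficient fixpoint algorithm) reproduces the iterates of Example 6.27 for the append program:

```python
from itertools import combinations, count

# --- formulas as model sets over an explicit variable set (cf. Prop) ---
def subsets(vs):
    vs = list(vs)
    return [frozenset(c) for r in range(len(vs) + 1)
            for c in combinations(vs, r)]

def formula(vs, pred):
    vs = frozenset(vs)
    return (vs, frozenset(m for m in subsets(vs) if pred(m)))

FALSE = (frozenset(), frozenset())

def extend(fm, vs):
    extra = frozenset(vs) - fm[0]
    return (fm[0] | extra,
            frozenset(m | e for m in fm[1] for e in subsets(extra)))

def conj(f, g):
    vs = f[0] | g[0]
    return (vs, extend(f, vs)[1] & extend(g, vs)[1])

def disj(f, g):
    vs = f[0] | g[0]
    return (vs, extend(f, vs)[1] | extend(g, vs)[1])

def project(keep, fm):      # existential quantification of dropped variables
    keep = fm[0] & frozenset(keep)
    return (keep, frozenset(m & keep for m in fm[1]))

# --- terms ('var', v) / ('fun', name, args), unification, mgu_gro ---
def vars_of(t):
    return frozenset([t[1]]) if t[0] == 'var' else \
        frozenset(v for a in t[2] for v in vars_of(a))

def walk(t, s):
    while t[0] == 'var' and t[1] in s:
        t = s[t[1]]
    return t

def unify(t1, t2, s):
    t1, t2 = walk(t1, s), walk(t2, s)
    if t1 == t2:
        return s
    if t1[0] == 'var':
        return {**s, t1[1]: t2}
    if t2[0] == 'var':
        return {**s, t2[1]: t1}
    if t1[1] == t2[1] and len(t1[2]) == len(t2[2]):
        for a, b in zip(t1[2], t2[2]):
            s = unify(a, b, s)
            if s is None:
                return None
        return s
    return None

def resolve(t, s):
    t = walk(t, s)
    return t if t[0] == 'var' else \
        ('fun', t[1], tuple(resolve(a, s) for a in t[2]))

def mgu_gro(t1, t2):
    """V <-> conjunction of the variables of (mu V), for V in dom mu."""
    s, allv = unify(t1, t2, {}), vars_of(t1) | vars_of(t2)
    if s is None:
        return (allv, frozenset())
    return formula(allv, lambda m: all(
        (v in m) == (vars_of(resolve(('var', v), s)) <= m) for v in s))

ctr = count()

def ren(t, r):
    return ('var', r[t[1]]) if t[0] == 'var' else \
        ('fun', t[1], tuple(ren(a, r) for a in t[2]))

def unify_gro(H, A, fm):
    # rho: rename the variables of A and fm away from those of H
    r = {v: f"{v}_{next(ctr)}" for v in vars_of(A) | fm[0]}
    rf = (frozenset(r[v] for v in fm[0]),
          frozenset(frozenset(r[v] for v in m) for m in fm[1]))
    return project(vars_of(H), conj(rf, mgu_gro(H, ren(A, r))))

# --- the append program and the Kleene iterates of Example 6.27 ---
vx, vy, vz, vu = (('var', n) for n in 'xyzu')
nil = ('fun', 'nil', ())
cons = lambda h, t: ('fun', 'cons', (h, t))
app = lambda a, b, c: ('fun', 'append', (a, b, c))
PROGRAM = [(app(nil, vy, vy), []),
           (app(cons(vu, vx), vy, cons(vu, vz)), [app(vx, vy, vz)])]

def clause_sem(cl, d, A, fm):     # the groundness instance of C[[H <- B]]
    H, body = cl
    w = unify_gro(H, A, fm)
    for B in body:
        w = d(B, w)
    return conj(fm, unify_gro(A, H, w))

def denot(k):                     # DP ^ k, naively re-evaluated per call
    if k == 0:
        return lambda A, fm: FALSE
    d = denot(k - 1)
    return lambda A, fm: disj(clause_sem(PROGRAM[0], d, A, fm),
                              clause_sem(PROGRAM[1], d, A, fm))

goal, TRUE = app(vx, vy, vz), formula('xyz', lambda m: True)
f1, f2, f3 = (denot(k)(goal, TRUE) for k in (1, 2, 3))
assert f1 == formula('xyz', lambda m: 'x' in m and (('y' in m) == ('z' in m)))
assert f2 == f3 == formula('xyz',
                           lambda m: ('x' in m and 'y' in m) == ('z' in m))
```

The two assertions are exactly the iterates of the example: DP ↑ 1 describes the goal by x ∧ (y ↔ z), and from DP ↑ 2 on the description stabilises at (x ∧ y) ↔ z.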
Note that we have here been concerned with the correct definition of the groundness analy-
sis. An implementation should make use of obvious properties of the operators. For example,
mgugro H (ρ A) should not be computed repeatedly during an analysis, since, by choosing ρ so as
to “rename away from program variables,” every possible “unifier” can be computed once and for
all as a first step of the analysis.
The standard approach to least fixpoint computation is to compute the associated Kleene
sequence. If the descriptions form a Noetherian lattice then the Kleene sequence will be finite.
However, the question of how to compute the last element of the Kleene sequence efficiently remains.
A number of techniques are available and should be implemented. First of all, efficiency may be
obtained by first computing minimal cliques of mutually recursive clauses and performing the
analysis on these (relatively) independently. This would seem preferable for programs as they
appear in practice. Furthermore, finite differencing techniques [79] are usually applicable to the
fixpoint computation, since our operators are typically extensive, that is, x ⊆ F x holds for all x.
Some ideas apply to our case of computing fixpoints for functionals in particular. This is the case
with the standard technique of using memoing to avoid redundant computations, and also with
the idea behind the “minimal function graph” semantics introduced by Jones and Mycroft [42].
A minimal function graph semantics is only defined for those calls that are reachable from some
initial call (hence “minimal”) and employs memoing to avoid redundant work (hence “function
graph”). This provides an interesting parallel to “magic set” implementation of query processing
in deductive databases [3]. Again, the idea with magic set transformation is to make use of the
fact that interest is restricted to the set U of instances of the given query. See also Marriott and
Søndergaard [64].
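A minimal-function-graph style solver can be sketched generically: tabulate only the entries actually demanded from an initial call, and re-evaluate an entry when one it depends on grows. The code below (Python; the API and the toy graph-reachability instance are illustrative choices of ours, not taken from [42]) illustrates the idea:

```python
def mfg_solve(step, bottom, leq, root):
    """step(k, get) computes a new value for key k, calling get(k2) for the
    entries it depends on; those dependencies are recorded so that k is
    re-evaluated whenever one of them later grows."""
    table, deps, work = {}, {}, [root]
    while work:
        k = work.pop()
        tracked = set()
        def get(k2):
            tracked.add(k2)
            if k2 not in table:
                table[k2] = bottom   # demanded for the first time
                work.append(k2)
            return table[k2]
        v = step(k, get)
        for k2 in tracked:
            deps.setdefault(k2, set()).add(k)
        if not leq(v, table.get(k, bottom)):
            table[k] = v
            work.extend(deps.get(k, ()))
    return table

# Toy instance: sets of reachable nodes in a graph, join = union.
EDGES = {'a': ['b', 'c'], 'b': ['c'], 'c': [], 'd': ['a']}
step = lambda n, get: frozenset(EDGES[n]).union(*[get(m) for m in EDGES[n]])
table = mfg_solve(step, frozenset(), lambda p, q: p <= q, 'a')
assert table['a'] == frozenset({'b', 'c'})
assert 'd' not in table   # never demanded from 'a', hence never tabulated
```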
6.5 Other applications and related work
An indication of the versatility of abstract interpretation for logic programming is the amount of
work published on the topic in recent years. In particular the idea of approximating call patterns
(as studied in this chapter) seems useful. Many kinds of code improvement that can be performed
automatically by a compiler depend on information about calls that may take place at run time,
and other kinds of program transformation make use of that information as well. At the end of
this section we list some of the applications.
However, a number of papers in the area have been concerned with what has come to be called
“frameworks” for abstract interpretation of logic programs. By this is usually meant a general
setting that allows one to express a number of dataflow analyses in a uniform way, just as we
have done in the present chapter through our “dataflow semantics.” The various frameworks differ
considerably in: (1) the degree of semantic justification provided, (2) assumptions about underlying
semantics, (3) the number of operations that need to be given in an “interpretation,” (4) the notion
of “safe approximation” as a relation between operations, and (5) general complexity. We shall
make no attempt at a taxonomy here (but see Marriott [55]), nor shall we discuss every approach
known to us. But it does seem relevant to compare two aspects of our work
with related work.
One aspect is the denotational definition of the “base” semantics, which may be of some interest
in its own right because of its novelty. The other aspect is the degree of “factorisation”
achieved in our “dataflow semantics,” that is, the degree to which we have captured the essence of
the dataflow analysis problem. We discuss these two issues in turn before addressing applications.
Early work on SLD-based semantic definitions for logic programs was done by Jones and My-
croft [41] who addressed both operational and denotational semantics. Debray and Mishra [20] gave
a thorough exposition of a denotational definition, including a proof of its correctness with respect
to SLD. Both Jones and Mycroft and Debray and Mishra assume a left-to-right computation rule
and a depth-first search of SLD trees (as in Prolog), and both definitions capture non-termination
(unlike ours). Both use sequences of substitutions as denotations, rather than sets, which give the
definitions a rather different flavour. The definition used by Jones and Søndergaard [43] achieves
certain simplifications by assuming a parallel search rule and consequently manipulates sets of
substitutions. The use of substitutions forces it to employ an elaborate renaming technique which
complicates semantic definitions somewhat. Winsborough [99, 100] and Jacobs and Langen [35]
have suggested denotational definitions along similar lines.
Marriott and Søndergaard have considered a number of different denotational definitions for
the purpose of dataflow analysis. The definition given in Section 5.1 falls outside the present
discussion, since it is “bottom-up,” but it is relevant to mention the work which attempts to give
a uniform presentation of both bottom-up and top-down definitions by expressing both in terms of
operations on lattices of substitutions, as far as this is possible [64]. In particular it has been shown
how operations such as unification and composition of substitutions can be adequately dissolved into
lattice operations to simplify definitions [64], an idea which has formed the point of departure for
the definitions presented here.
Some of the above-mentioned semantic definitions apply to more than “pure” Prolog, for example
by being able to interpret the “cut” operator of Prolog. We have not been concerned with
extra-logical or non-logical aspects of logic programming languages, partly because we believe that
many of those features will disappear from logic programming languages in due time. (Most of the
operations that necessarily remain have straightforward approximations in dataflow analysis. For
example, a “write” can be ignored by an analysis, and in case of “read,” a dataflow analysis will
always be forced to assume a worst case behaviour.) However, one operation should be of concern
to us, namely negation. Negation is useful in a logic programming language, but, as is well-known,
most Prolog systems use an unsound version of negation [49]. Future implementations of logic
programming languages should hopefully rectify this. It is perfectly possible to give a denotational
definition that incorporates negation. Marriott and Søndergaard [65] have shown this for both
traditional (unsound) Prolog and SLDNF resolution (the methods employed to handle SLDNF
resolution can be extrapolated in a straightforward way to cover languages with delay mechanisms,
such as NU-Prolog [97]). Such definitions are important for abstract interpretation, because they
allow for more precision: a dataflow analysis that “knows” about the semantics of negation can
yield better results than one whose policy simply is to ignore negation (which is a safe behaviour
for many purposes).
Abstract interpretation of logic programs was first mentioned by Mellish [69] who suggested it as
a way to formalise mode analysis.⁴ An occur check analysis which was formalised as a non-standard
semantics was given by Søndergaard [92]. Some of the techniques used in the present chapter can
be traced back to that work. This is the case with the principle of performing unification both on
call and return so as to ensure that only local variables need be manipulated at any stage (this
was referred to as a principle of “locality”). The work also established the principle of binding
information to variables in a program throughout computations, rather than to argument positions
as seen elsewhere [7, 53, 68]. Other things being equal, this improves precision. For example,
consider a mode analysis of the program
←p(f(x)).
p(f(u)).
using the two modes “free” (unbound or bound to a variable) and “any.” The “argument position”
methods will funnel mode information about the variables x and u through the argument position of
4 The origin of this suggestion can be attributed to Alan Mycroft.
6.5. OTHER APPLICATIONS AND RELATED WORK 65
p and assign x the mode “any,” while clearly x could be more precisely deemed “free.” To counteract
such behaviour, an “argument position” method must use a more fine-grained description domain
and pay the price of more expensive “abstract operations.”
A framework for the abstract interpretation of logic programs was first given by Mellish [68].
Mellish’s semantics is an operational parallel to our “lax” semantics with the imprecision that
this implies: success patterns are not associated with their corresponding call patterns, so success
information is propagated back, not only to the atom that actually called the clause, but to all atoms
that unify with the clause’s head. The application of particular interest to Mellish was
mode analysis. Debray [17] subsequently investigated this application in more detail and pointed
to a problem in Mellish’s application (the so-called aliasing problem, which may manifest itself as
either a soundness or a completeness problem, depending on the particular dataflow analysis).
A framework for the abstract interpretation of logic programs based on a denotational definition
of SLD was given by Jones and Søndergaard [43]. This was the first denotational approach to
abstract interpretation of logic programs. The framework allowed even the base (or standard)
semantics to be expressed as an instance of the dataflow semantics. This has the advantage of
providing a very clean cut between a semantic definition which is precise (unlike our lax and
dataflow semantics) and interpretations in which all introduced imprecision resides. In this chapter
we have abandoned this approach only to simplify our presentation. Jones and Søndergaard used
operations “call” and “return” which in the present approach have been replaced by “unify” and
“meet.” We find this conceptually cleaner.
Kanamori and Kawamura [44] suggested a framework based on OLDT resolution [94], which
essentially is SLD resolution extended with memoing, so as to avoid redundant computation.
Bruynooghe et al. [7, 9] suggested an AND/OR tree-based framework in which an interpretation
contains some seven operations. This has later been simplified to some extent [8]. Neither approach
makes use of fixpoint characterisations for standard or non-standard semantics. We have earlier
discussed the relative merits of operational and denotational definitions as a basis for the design of
dataflow analyses. Readers should compare the two last-mentioned approaches with that of this
chapter, both as regards conceptual complexity and the complexities of the various semantic definitions.
Both Kanamori and Kawamura and Bruynooghe et al. give algorithms (or rather procedures) for
performing abstract interpretation. So does Nilsson in his thesis [78].
The framework used by Winsborough [99, 100] is rather close to ours. In particular, one semantic
definition (Winsborough’s “total function graph semantics” [100]) is almost identical to our base
semantics, the difference being that it works with (classical) substitutions that are “canonized” to
bar variants of a substitution from introducing redundancy (Mellish [68] used the same idea).
Debray [16] has studied a framework for dataflow analysis with the point of departure that
analyses must be efficient. He identifies a property of description domains (“substitution closure”)
and gives a complexity analysis to support the claim that the corresponding class of dataflow
analyses can be implemented efficiently. Our groundness analysis falls outside Debray’s class, as
does any analysis that attempts to maintain information about aliasing (see also below).
66 CHAPTER 6. TOP-DOWN ANALYSIS OF DEFINITE LOGIC PROGRAMS
Marriott and Søndergaard have considered a number of different denotational definitions for the
purpose of dataflow analysis. The definitions given in Section 5.1 fall outside the present discussion,
since they are “bottom-up,” but it is relevant to mention work that attempts to give a uniform
presentation of both bottom-up and top-down definitions by expressing both algebraically [64].
Without claiming completeness of the list or the attached references, we finally list some appli-
cations that fit into our framework, or a similar framework.
Aliasing analysis [33]. In many dataflow analyses, such as mode analysis or the groundness
analysis we presented in Section 6.4, it is useful to know whether two variables are aliases, that
is, bound to the same term. Sometimes such knowledge is even necessary to guarantee soundness.
The purpose of aliasing analysis is to predict which aliases occur at runtime.
Compile time garbage collection [9]. The idea behind this analysis is to approximate reference
counting at compile time. This can allow a compiler to generate code for reclaiming storage without
runtime overhead. Clearly this is an example of an analysis that relies on aliasing information.
Determinacy analysis [22]. Many calls in programs are deterministic in the sense that they return
with at most one answer. The idea behind determinacy analysis (or the more refined “functionality
analysis” of Debray and Warren) is to detect such cases at compile time. This allows for the
generation of code with fewer backtrack points.
Floundering analysis for normal logic programs [66]. SLDNF resolution provides an oper-
ational semantics which (unlike many Prolog implementations) is both computationally tractable
and sound with respect to a declarative semantics. The idea is to process negated literals only when
they are ground. However, SLDNF is not complete, and it achieves soundness only by discarding
certain queries to programs because they “flounder,” that is, lead to a situation where only negated
literals with free variables may be selected for processing. The purpose of floundering analysis is
to determine whether floundering may occur at runtime.
Independence analysis for AND-parallelism [18, 34, 35, 99, 101]. Two atoms in a body can
be executed (non-speculatively) in parallel if they are independent, that is, the variable bindings
created by one do not affect the other’s variable bindings. This is the case, for example, if all
the variables that they share are known to be bound to ground terms at the time the atoms are
invoked. Less restrictive conditions can also be determined, under which atoms are independent.
This is the purpose of independence analysis.
Mode analysis [7, 19, 21, 44, 53, 68, 69, 70, 100]. We already discussed mode analysis in Chapter 4.
There is no clear-cut distinction between mode analysis and many of the other analyses listed here—
groundness analysis and determinacy analysis can be seen as special cases. Essentially the purpose
of mode analysis is, for each variable in a head of a clause, to predict the degree of its instantiation
whenever the clause is called. A simple instance of this is the annotation of argument positions as
“input,” “output,” or “both.”
Occur check analysis [23, 81, 92]. Most Prolog systems omit the occur check in unification, since
this speeds up execution considerably. However, by doing so, they sacrifice the soundness of Prolog
as a theorem prover. The purpose of occur check analysis is to detect, at compile time, cases where
it is safe to omit occur checks.
Program transformation (we discuss an example in Chapter 7).
Type inference [9, 37, 44]. In the abstract interpretation literature (for logic programming) type
inference usually means approximation of calling patterns. This contrasts with our use of the term in
Chapter 5. Typically, tree-like structures such as (variants of) rational trees are used as “types,”
and a term T is then of type t iff T can be folded onto t. Type inference can be used, for example,
in the elimination of dead or unreachable code.
Chapter 7
Difference-Lists Made Safe
Difference-lists are terms that represent lists. They belong to the Prolog folklore: most Prolog
programmers know how the use of difference-lists can speed up list processing, but there have
been few in-depth discussions of the phenomenon. Even though it is common knowledge that
difference-lists must be treated with some care, the exact limitations of their use have remained
obscure.
In this chapter we investigate the concept of a difference-list. The content is based on work
by Marriott and Søndergaard [63]. We study the transformation of list-processing programs into
programs using difference-lists, as it has been sketched by Sterling and Shapiro [93]. In particular
we are concerned with finding circumstances under which the transformation is safe. We show
how dataflow analysis can be used to determine whether the transformation is applicable to a
given program and how this allows for automatic transformation. The chapter therefore serves to
illustrate the use of semantics-based dataflow analysis in program transformation.
In Section 7.1 we explain the idea behind difference-list transformation as propounded by Sterling
and Shapiro and give some examples that indicate the method’s limitations. In Section 7.2
difference-lists are defined formally. We give an algorithm to find list occurrences that should be
changed to difference-lists and find conditions under which this data structure transformation is
safe. We also address the problem: under which circumstances is rapid concatenation of difference-
lists possible? The insight gained from this analysis is used to construct a transformation algorithm
in Section 7.3. An integral part of this algorithm is a set of dataflow analyses needed to guarantee
the safeness of the transformation. Finally, in Section 7.4, we discuss related work and possible
extensions to our treatment.
7.1 The difference-list problem
The use of difference-lists is a standard Prolog programmer’s trick. Changing data structures from
lists to difference-lists allows for much faster list concatenation. To understand why, one may
69
simplistically think of a difference-list as a “list plus a pointer to its end.” Owing to this “pointer”
one can avoid traversing lists that are represented as difference-lists, and so list concatenation has
a cost that is constant rather than linear in the length of one of the lists. The transformation can
therefore improve the efficiency of programs considerably. Since it furthermore applies to a large
class of important list-processing programs, it would be desirable to automate it. We suggest that
this is possible.
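The “list plus end pointer” intuition can be made concrete in an executable sketch. The following Python model is ours, not the thesis’s notation: an open list is a chain of cons cells ending in an unbound logical variable (the “hole”), and concatenation simply binds that hole, mirroring app⋆⋆.

```python
class Var:
    """An unbound logical variable: the 'hole' at the end of an open list."""
    def __init__(self):
        self.binding = None           # None means unbound

def deref(t):
    while isinstance(t, Var) and t.binding is not None:
        t = t.binding
    return t

def open_list(xs):
    """Represent the list xs as a difference-list (front, hole)."""
    hole = Var()
    front = hole
    for x in reversed(xs):
        front = ("cons", x, front)
    return front, hole

def app_dd(d1, d2):
    """app**(<x,y>, <y,z>, <x,z>): bind d1's hole to d2's front.
    Constant time: the first list is never traversed."""
    front1, hole1 = d1
    front2, hole2 = d2
    hole1.binding = front2            # the single unification step
    return front1, hole2

def close(d):
    """Read off the ordinary list denoted by a difference-list."""
    out, t = [], deref(d[0])
    while isinstance(t, tuple) and t[0] == "cons":
        out.append(t[1])
        t = deref(t[2])
    return out
```

Here Var, open_list, app_dd and close are hypothetical illustrative names; app_dd plays the role of app⋆⋆, concatenating without traversing its first argument.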
The reader may wonder why a difference-list transformer is not a standard logic program devel-
opment tool. One reason is that the naive “folk” transformation is treacherous and often results in
a program having a behaviour that is radically different to that of the original program. We know
of no rigorous analysis intended to find conditions which guarantee that the transformation is safe.
Conventional wisdom has it that problems with the transformation are intimately connected to the
absence of occur checks in most Prolog systems. But, as we show, even when occur checks are
present, the transformation may be unsafe. Throughout this chapter we will assume the presence
of occur checks.
In this section we present the idea behind the folk transformation. The hope is to familiarise
the reader with this basic idea before trying to give it a more formal treatment. We also exemplify
programs for which the transformation is unsafe. As usual in discussions about program trans-
formation we have to be careful with our use of identifiers. We use the convention from previous
chapters that meta-variables are (possibly subscripted) italic capital letters: A is used for atoms, B
for bodies, D for difference-lists, G for queries, L for lists, P for programs, Q for predicate symbols,
R, S, T for terms, and V, X and Y for variables.
Difference-lists were implicit already in Colmerauer’s treatment of metamorphosis grammars
[12], but the term itself was introduced by Clark and Tärnlund [11]. Manual transformation to
programs using difference-lists has been discussed by Tärnlund [96] and by Hansson and Tärnlund
[30]. These treatments, however, are mainly concerned with axiomatising a theory of difference-lists
and no general transformation method is discussed1.
The practice of transforming programs using lists into programs using difference-lists has been
most extensively dealt with by Sterling and Shapiro2 [93]. They make the useful observation that
“a program that independently builds different sections of a list to be later combined
together is a good candidate for using difference-lists.”
The idea behind the transformation is to replace certain lists by terms called difference-lists. A
difference-list is a pair of lists whose second component is a suffix of the first. The difference-list
denotes the list that results when the suffix is removed from the first component. For example,
1 Hansson and Tärnlund incidentally use a list-to-difference-list mapping which is only partial. Their definition
([30], page 119) assigns no difference-list to (in our notation) a : nil or a : x.
2 However, Sterling and Shapiro’s reference to Bloch [6] for an “automatic transformation of simple programs
without difference-lists to programs with difference-lists” ([93], page 255) is misleading. A reading of Bloch’s work
reveals no such transformation.
the list a : b : nil may be represented by the difference-list 〈a : b : x, x〉. Intuitively, x here is a
“pointer” to the end of the list (although “write-once,” as are all logical variables).
The concatenation of difference-lists is done by the predicate app⋆⋆ defined as follows:
app⋆⋆(〈x, y〉, 〈y, z〉, 〈x, z〉).
Clearly evaluation of this is much faster than using the well-known append program, which traverses
its first argument:
append(nil, y, y).
append(u : x, y, u : z)← append(x, y, z).
While Sterling and Shapiro do not present a general transformation, they give sufficient and well-
chosen examples for the reader to understand the idea. We exemplify the transformation by con-
sidering a program rev to reverse a list:
rev(nil, nil).
rev(u : x, y)← rev(x, z), append(z, u : nil, y).
Since the list z is to have another list appended to it, we want to change z into a difference-list.
This means introducing a predicate rev⋆ whose second argument is a difference-list:
rev⋆(nil, 〈y, y〉).
rev⋆(u : x, 〈y, y′〉)← rev⋆(x, 〈z, z′〉), app⋆⋆(〈z, z′〉, 〈u : v, v〉, 〈y, y′〉).
The relation between rev and rev⋆ is given by the clause
rev(x, y)← rev⋆(x, 〈y, nil〉).
Unfolding the call to app⋆⋆ above, we obtain the program
rev(x, y)← rev⋆(x, 〈y, nil〉).
rev⋆(nil, 〈y, y〉).
rev⋆(u : x, 〈y, y′〉)← rev⋆(x, 〈y, u : y′〉).
Whereas the original rev program, qua Prolog program, had O(n²) predicate calls, this version has
only O(n) calls, where n is the length of the list to be reversed—certainly a worthwhile improvement.
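Read functionally, the delimiter y′ of rev⋆ behaves as an accumulator. The speedup can be mirrored in a Python sketch; the names rev_naive and rev_dl are ours, and Python list operations are not constant-time, but the call pattern matches the two logic programs.

```python
def rev_naive(xs):
    """Mirrors rev(u:x, y) <- rev(x, z), append(z, u:nil, y):
    a linear append at each step, O(n^2) work overall."""
    if not xs:
        return []
    return rev_naive(xs[1:]) + [xs[0]]

def rev_dl(xs, acc=None):
    """Mirrors rev*(u:x, <y, y'>) <- rev*(x, <y, u:y'>): the delimiter
    accumulates the reversed prefix, O(n) recursive calls."""
    if acc is None:
        acc = []
    if not xs:
        return acc
    return rev_dl(xs[1:], [xs[0]] + acc)
```

The design point is the same as in the transformed rev⋆: work that rev_naive does after the recursive call is moved into an extra argument threaded through the recursion.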
Though Sterling and Shapiro do not precisely define the transformation, the following pattern
emerges from their examples. For each call to append:
1. Assume that the arguments to append are changed from lists to difference-lists and propagate
the changes caused by this to other “types” of predicate arguments.
2. For each predicate Q/n thus reached, introduce a predicate Q⋆/n that is defined exactly
as Q, except it uses difference-lists instead of lists and calls the corresponding ⋆-versions of
predicates, including app⋆⋆ instead of append.
3. Unfold all calls to app⋆⋆.
4. For each predicate Q/n that has been changed, replace its definition by the clause
Q(X1, . . . ,Xn)←Q⋆(Y1, . . . , Yn),
where Yi = 〈Xi, nil〉 if the i’th argument has been changed to a difference-list, and Yi = Xi
otherwise.
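Step 4 is entirely mechanical. As an illustration only, the bridging clause can be generated textually; the helper name bridge_clause is ours, and the pair brackets 〈·, ·〉 and the arrow ← are rendered in ASCII as <..., nil> and <-.

```python
def bridge_clause(pred, arity, changed):
    """Emit Q(X1,...,Xn) <- Q*(Y1,...,Yn), where Yi = <Xi, nil> for each
    argument position i (0-based) changed to a difference-list, and
    Yi = Xi otherwise."""
    xs = [f"X{i + 1}" for i in range(arity)]
    ys = [f"<{x}, nil>" if i in changed else x for i, x in enumerate(xs)]
    return f"{pred}({', '.join(xs)}) <- {pred}*({', '.join(ys)})."
```

For rev, whose second argument is changed, bridge_clause("rev", 2, {1}) yields rev(X1, X2) <- rev*(X1, <X2, nil>)., matching the bridging clause shown earlier.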
Unfortunately, though this transformation seems to be well-known, its limitations are not. In the
remainder of this section we give examples indicating the problems that may occur. We first note
that, though the transformation seemingly works well in the case of rev above, the two versions of
rev are not logically equivalent. They behave in the same way only if the first argument is a fixed
length list (such as a : x : nil), which is (hopefully) the intended usage. We now exemplify how a
resultant program may behave very differently to the original program for almost any usage.
Example 7.1 Consider the program
p← append(x, y : nil, b : nil), y = a.
Clearly the query ←p fails. By the sketched method, the program is transformed into
p← app⋆⋆(〈x, x′〉, 〈y : u, u〉, 〈b : v, v〉), y = a.
After unfolding of app⋆⋆ we get
p← y = a.
Now the query ←p succeeds, so the transformation has increased the success set.
Example 7.2 Consider the program
double(x, y)← append(x, x, y).
Here the query ←double(L, y) succeeds for all lists L. The transformation yields
double(x, y)← double⋆(〈x, nil〉, 〈y, nil〉).
double⋆(〈x, x′〉, 〈y, y′〉)← app⋆⋆(〈x, x′〉, 〈x, x′〉, 〈y, y′〉).
Unfolding the call to app⋆⋆, we get
double(x, y)← double⋆(〈x, nil〉, 〈y, nil〉).
double⋆(〈x, x〉, 〈x, x〉).
Now the query ←double(L, y) fails whenever L is a list of the form T1 : . . . : Tn : nil where n ≥ 1.
So the transformation has decreased the success set.
The following example shows that introduction of difference-lists may go wrong even when there
are no calls to append.
Example 7.3 Consider the program
p← q(nil).
q(a : y).
Clearly the query ←p fails. Replacing nil and a : y with difference-list representations yields the
program
p← q(〈x, x〉).
q(〈a : y, y′〉).
Now the query ←p succeeds.
In Section 7.2 we explain what goes wrong and how safeness can be guaranteed by means of a
dataflow analysis.
It is not uncommon to think that problems with difference-lists are due to occur check problems.
This is not surprising since absence of occur checks makes very simple program transformations,
such as unfolding, unsafe, even in the case of pure Prolog [62]. However, the above discussion
assumed the presence of occur checks, and yet it revealed a number of problems. The absence of
occur checks might create even more problems, but they are of a different sort and orthogonal to
the problems that concern us here. For example, assume we are given the fact
empty⋆(〈x, x〉).
The query ←empty⋆(〈a : y, y〉) would succeed in the absence of occur checks, and the same would
be true for a query such as ←app⋆⋆(〈x, x〉, 〈y, y〉, 〈a : z, z〉), even though such queries of course
should fail.
It is useful to view the transformation as consisting of two stages: the first stage performs
a change of data structure from lists to difference-lists, the second stage modifies the resulting
program so as to perform efficient difference-list concatenation. This point of view is justified by
the fact that distinct problems arise at the two stages. At the first stage, problems arise because
unification of difference-lists is not faithful to unification of the lists they represent. For example,
the difference-lists 〈nil, nil〉 and 〈a : nil, a : nil〉 do not unify even though they both denote nil.
On the other hand 〈x, x〉 and 〈a : y, y′〉 do unify, while their denotations nil and a : nil do not.
This type of problem is exemplified in Example 7.3. At the second stage of the transformation,
problems arise because, in general, app⋆⋆ is not equivalent to append. This problem is exemplified
in Examples 7.1 and 7.2.
7.2 Manipulating difference-lists
In this section we discuss the problems with difference-list manipulation. We have seen that two
kinds of problems arise: those related to changing the data structure representing sequences of
terms from lists to difference-lists (first stage) and those related to concatenating difference-lists
(second stage). We discuss each kind in turn.
Definition. A list is a term of the form T1 : . . . : Tn : nil or of the form T1 : . . . : Tn : X, where
n ≥ 0. If it is of the first form then it has fixed length. The set of lists is denoted by List.
Definition. A difference-list is a term of the form 〈L1, L2〉, where L1 and L2 are lists. We call L2
the delimiter of the difference-list. A difference-list of the form 〈T1 : . . . : Tn, Tn〉 has fixed length.
The set of difference-lists is denoted by Dlist.
Definition. A difference-list D is said to represent the fixed length list T1 : . . . : Tn : nil iff D =
〈T1 : . . . : Tn : T, T 〉, where T does not share variables with any Ti. A difference-list D represents
the non-fixed length list T1 : . . . : Tn : X iff D = 〈T1 : . . . : Tn : X,T 〉, where T does not share
variables with any Ti or X.
For example, a : nil is represented by 〈a : x, x〉 or 〈a : nil, nil〉, and a : x is represented by 〈a : x, x′〉
or 〈a : x, nil〉. Clearly many difference-lists such as 〈a : nil, b : nil〉 do not represent any list. An
interesting subclass of these consists of the difference-lists that represent the so-called negative lists
[93].
Given a list L, there are two natural ways to create a difference-list that represents L. We now
define two functions that perform these mappings.
Definition. Let ρ be a bijection from the set of variables appearing in any program or query to a
disjoint set of variables. The function σ : List → Dlist is defined by
σ (T1 : . . . : Tn : nil) = 〈T1 : . . . : Tn : Xnew, Xnew〉
σ (T1 : . . . : Tn : X) = 〈T1 : . . . : Tn : X, (ρ X)〉,
where each Xnew is a distinct new variable not appearing in any program or in the range of ρ. The
function σ′ : List → Dlist is defined by (σ′ L) = 〈L, nil〉.
The class of difference-lists generated by σ and σ′ turns out to have some nice unification properties,
as we show in Propositions 7.4 and 7.5.
Definition. A difference-list D is a simple representation of a list L iff D = (σ L) or D = (σ′ L).
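The two mappings can be sketched executably as follows. The representation is ours: a list is given by its element terms plus an optional tail variable (None standing for nil), a difference-list by a pair (front, delimiter), and the renaming ρ is modelled, purely for illustration, by priming the variable name.

```python
import itertools

_fresh = (f"_X{i}" for i in itertools.count())

def sigma(elems, tail=None):
    """sigma(T1 : ... : Tn : nil) = <T1 : ... : Tn : Xnew, Xnew>, Xnew fresh;
    sigma(T1 : ... : Tn : X)     = <T1 : ... : Tn : X, (rho X)>."""
    if tail is None:                       # a fixed length list
        x = next(_fresh)
        return (elems, x), x
    return (elems, tail), tail + "'"       # rho modelled by priming the name

def sigma_prime(elems, tail=None):
    """sigma'(L) = <L, nil>."""
    return (elems, tail), "nil"
```

For instance, sigma(["a"]) returns a pair whose delimiter is the fresh tail of the front, matching 〈a : x, x〉; sigma(["a"], "x") matches 〈a : x, x′〉; and sigma_prime(["a"]) matches 〈a : nil, nil〉.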
Ideally, unification of difference-lists should correspond to unification of the lists they represent.
In particular, if two lists unify, so should their difference-list representatives. This does not hold
in general, but it holds for simple representations, as the following proposition shows. Unlike the
case in the previous chapters, the function mgu is here assumed to have the type Term → Term →
Sub ∪ {fail} and is defined in the obvious way.
Proposition 7.4 Let D and D′ be simple representations of L and L′, respectively. Let θ =
(mgu D D′) and φ = (mgu L L′). If φ ≠ fail then θ ≠ fail and (θ D) is a simple representation of
(φ L).
Proof: Assume φ ≠ fail. We consider three cases depending on whether L and L′ have fixed length
or not (this suffices for reasons of symmetry).
1. Assume that L = T1 : . . . : Tn : nil and L′ = T′1 : . . . : T′n : nil. Then (φ L) = T″1 : . . . : T″n : nil
for some T″1, . . . , T″n. Since L and L′ have fixed length, D is either 〈L, nil〉 or
〈T1 : . . . : Tn : Xnew, Xnew〉, while D′ is 〈L′, nil〉 or 〈T′1 : . . . : T′n : X′new, X′new〉. In case D and
D′ are of the last forms, (θ D) = 〈T″1 : . . . : T″n : X′new, X′new〉, because Xnew and X′new do not
occur in T1, T′1, . . . , Tn, T′n. Otherwise (θ D) = 〈T″1 : . . . : T″n : nil, nil〉. In both cases (θ D) is
a simple representation of (φ L).
2. Assume that L = T1 : . . . : Tn : nil and L′ = T′1 : . . . : T′m : X, where m ≤ n. Then
(φ L) = T″1 : . . . : T″n : nil for some T″1, . . . , T″n, and (φ X) = T″m+1 : . . . : T″n : nil. Since
L has fixed length, D is either 〈L, nil〉 or 〈T1 : . . . : Tn : Xnew, Xnew〉, while D′ is either
〈T′1 : . . . : T′m : X, nil〉 or 〈T′1 : . . . : T′m : X, Y〉, where Y does not occur in L or L′.
Assume D = 〈L, nil〉. If D′ = 〈T′1 : . . . : T′m : X, nil〉 then θ = φ. If D′ = 〈T′1 : . . . : T′m : X, Y〉
then θ = {Y ↦ nil} ◦ φ. In both cases (θ D) = 〈T″1 : . . . : T″n : nil, nil〉.
Assume D = 〈T1 : . . . : Tn : Xnew, Xnew〉. If D′ = 〈T′1 : . . . : T′m : X, nil〉 then
θ = {Xnew ↦ nil} ◦ φ. If D′ = 〈T′1 : . . . : T′m : X, Y〉 then θ = {Xnew ↦ nil, Y ↦ nil} ◦ φ. In
both cases, (θ D) = 〈T″1 : . . . : T″n : nil, nil〉. So again (θ D) is a simple representation of (φ L).
3. Assume that L = T1 : . . . : Tn : X and L′ = T′1 : . . . : T′m : X′, where, for reasons of symmetry,
we can assume that m ≤ n. Then (φ L) = T″1 : . . . : T″n : X for some T″1, . . . , T″n, and
(φ X′) = T″m+1 : . . . : T″n : X. Now D is either 〈T1 : . . . : Tn : X, nil〉 or 〈T1 : . . . : Tn : X, Y〉,
where Y does not occur in L or L′, and D′ is either 〈T′1 : . . . : T′m : X′, nil〉 or
〈T′1 : . . . : T′m : X′, Y′〉, where Y′ does not occur in L or L′. If D = 〈T1 : . . . : Tn : X, Y〉 and
D′ = 〈T′1 : . . . : T′m : X′, Y′〉 then θ = {Y ↦ Y′} ◦ φ, so (θ D) = 〈T″1 : . . . : T″n : X, Y′〉.
Otherwise (θ D) = 〈T″1 : . . . : T″n : X, nil〉. In any case, (θ D) is a simple representation of
(φ L).
In other words, using only simple representations removes one of the problems mentioned at the
end of Section 7.1. Namely, simple representations unify whenever the lists they represent do.
We would hope for the converse to hold as well, but unfortunately, simple representations may
still unify in cases where the lists they represent do not. For example, the term 〈x, x〉 unifies with
〈a : y, y′〉, yielding the most general common instance 〈a : y, a : y〉. However, 〈a : y, a : y〉 represents
nil, and nil is not an instance of the list represented by 〈a : y, y′〉, that is, of the list a : y. In
other words, σ is not homomorphic with respect to the operation “most general common instance”
(that is, the meet operation on the lattice of terms, which we usually compute using unification).
However, we have the following proposition.
Proposition 7.5 Let D and D′ be simple representations of L and L′, respectively. Let θ =
(mgu D D′) and φ = (mgu L L′). If θ ≠ fail and the delimiter of (θ D) is a variable or nil, then
φ ≠ fail, and (θ D) is a simple representation of (φ L).
Proof: By the definition of σ and σ′, if θ = fail then φ = fail. The remainder of the proof is by
cases as before, assuming φ ≠ fail. There is a myriad of cases, depending on the one hand on
whether the delimiter of (θ D) is a variable or nil, and on the other hand on the form of L and L′.
We show only one case—the others are similar.
Assume (θ D) = 〈L″, nil〉, L = T1 : . . . : Tn : nil, and L′ = T′1 : . . . : T′n : nil. Then
(φ L) = T″1 : . . . : T″n : nil for some T″1, . . . , T″n. There are three cases to consider, regarding the
form of D and D′. We either have D = 〈L, nil〉 and D′ = 〈L′, nil〉, D = 〈T1 : . . . : Tn : Xnew, Xnew〉
and D′ = 〈L′, nil〉, or D = 〈L, nil〉 and D′ = 〈T′1 : . . . : T′n : X′new, X′new〉. In all cases
(θ D) = 〈T″1 : . . . : T″n : nil, nil〉, that is, (θ D) is a simple representation of (φ L).
According to Proposition 7.4, we should only ever introduce simple representations of lists. Then,
by Proposition 7.5, to guarantee that difference-list unification is faithful, it suffices to check that
the delimiter of the resulting difference-list is a variable or nil. To make our transformation as
general as possible, we will allow for this check to be performed at run-time. This prompts the
following definition of the predicate var-nil :
var-nil(x)← if var(x) then true else x = nil.
Note that the predicate is non-logical since it makes use of the meta-predicate var of Prolog. For
most unifications, however, the check will not be needed, and we later show how unnecessary calls
to var-nil can be removed by a simple dataflow analysis. We can now introduce a predicate ⋆= which
unifies two simple difference-lists, taking the unification problem for difference-lists into account:
〈x, y〉 ⋆= 〈u, v〉 ← 〈x, y〉 = 〈u, v〉, var-nil(v).
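Operationally, the check amounts to inspecting the dereferenced delimiter. The following Python sketch reuses the usual variable-cell representation; var_nil is our rendering of var-nil.

```python
class Var:
    """A logical variable cell; binding is None while unbound."""
    def __init__(self):
        self.binding = None

NIL = ("nil",)

def deref(t):
    while isinstance(t, Var) and t.binding is not None:
        t = t.binding
    return t

def var_nil(t):
    """var-nil(x) <- if var(x) then true else x = nil."""
    t = deref(t)
    return isinstance(t, Var) or t == NIL
```

var_nil succeeds on an unbound variable or on nil and fails on any other delimiter, which by Proposition 7.5 is exactly the situation in which difference-list unification may be unfaithful.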
Recall that we are interested in changing programs so that some lists are represented by difference-
lists. The following algorithm is designed to mark those predicate arguments that must be changed
to difference-lists if the initially marked arguments are changed to difference-lists.
Program Marking Algorithm
Input: A program in which some initial predicate arguments (which should be lists) are marked.
Algorithm: Repeatedly select one of the following actions until no more arguments can be marked:
1. If, in a clause body, the i’th argument of a predicate Q is marked then, for each clause defining
Q, mark the i’th argument of its head.
2. If, in a clause head defining a predicate Q, the i’th argument is marked then, in each atom
calling Q, mark the i’th argument.
3. If, in a clause, a variable X is the tail of a marked list then mark each predicate argument
containing X.
4. If one of the arguments to the predicate = is marked then mark the other.
Finally check that each marked term is a list and that in each clause, if a variable appears as a tail
of some marked list then it only ever appears as the tail of a marked list.
This process clearly terminates. We later (after Proposition 7.10) discuss why the individual steps
of the algorithm are necessary.
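The marking process is a fixpoint computation, and its core can be sketched in Python over a simplified term language. The representation and helper names below are ours: a variable is a string, a list skeleton is ("lst", elements, tail), marks identify argument occurrences by position, and the algorithm's final well-formedness check is not modelled.

```python
def vars_of(term):
    """Variables of a term: a variable is a string; a list skeleton is
    ("lst", [element terms], tail variable or None)."""
    if isinstance(term, str):
        return {term}
    _, elems, tail = term
    vs = set()
    for e in elems:
        vs |= vars_of(e)
    if tail is not None:
        vs.add(tail)
    return vs

def mark_program(clauses, initial):
    """Fixpoint of the four marking rules. A clause is a list of atoms, head
    first; an atom is (pred, [term, ...]); marks are argument occurrences
    (clause index, atom index, argument index)."""
    marked = set(initial)

    def occurrences():
        for c, clause in enumerate(clauses):
            for a, (pred, args) in enumerate(clause):
                for i in range(len(args)):
                    yield (c, a, i), pred, args[i], clause

    changed = True
    while changed:
        changed = False
        for pos, pred, term, clause in list(occurrences()):
            if pos not in marked:
                continue
            c, a, i = pos
            new = set()
            # Rules 1 and 2: marks flow between body atoms and clause heads,
            # so the i'th argument of a defined predicate is marked everywhere.
            if pred != "=":
                new |= {p for p, q, _t, _cl in occurrences()
                        if q == pred and p[2] == i}
            # Rule 4: '=' propagates the mark to its other argument.
            if pred == "=":
                new.add((c, a, 1 - i))
            # Rule 3: a variable tail of a marked list marks, within the same
            # clause, every argument containing that variable.
            tail = term[2] if not isinstance(term, str) and term[0] == "lst" else None
            if tail is not None:
                for a2, (_p2, args2) in enumerate(clause):
                    for i2, t2 in enumerate(args2):
                        if tail in vars_of(t2):
                            new.add((c, a2, i2))
            if not new <= marked:
                marked |= new
                changed = True
    return marked
```

On the append clauses of Example 7.6 with the first argument of the first clause initially marked, the fixpoint marks the first argument of append throughout the program.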
Example 7.6 Consider the following program which “flattens” binary trees into lists. We let
underlining indicate the terms that have been marked.
flatten(x : y, zf)← flatten(x, xf), flatten(y, yf), append(xf, yf, zf).
flatten(nil, nil).
flatten(z, z : nil)← constant(z).
append(nil, y, y).
append(u : x, y, u : z)← append(x, y, z).
Given this program, the Program Marking Algorithm will produce the following:
flatten(x : y, zf)← flatten(x, xf), flatten(y, yf), append(xf, yf, zf).
flatten(nil, nil).
flatten(z, z : nil)← constant(z).
append(nil, y, y).
append(u : x, y, u : z)← append(x, y, z).
As well as marking the program we must also mark any query to the program. The following algorithm
achieves this.
Query Marking Algorithm
Input: A query and a successfully marked program.
Algorithm: For each atom A in the query:
if the i’th argument of the clause head defining A is marked
then mark the i’th argument in A
check that each term marked in the query is a list.
This process clearly terminates.
Example 7.7 If the query ←flatten(z, zf) and the marked version of the program flatten given
in Example 7.6 are input to the Query Marking Algorithm, ←flatten(z, zf) is returned.
One might hope that, for any successfully marked program P and query G, replacement of the
marked lists L in P by (σ L) and of those in G by (σ′ L) would always yield a program and query
that are, in a strong sense, equivalent to the original. Unfortunately, as Example 7.3 shows, this is
not the case.
For the transformed program to correctly mimic the original program, we require that all
unification of difference-lists be replaced by calls to ⋆=. This includes both explicit calls to = and
the implicit unification in the clause heads. To make this implicit unification explicit, marked
clauses of the form
Q(S1, . . . , Sm, T1, . . . , Tn)←B
are transformed into
Q(S1, . . . , Sm,X1, . . . ,Xn)←X1 = T1, . . . ,Xn = Tn, B
where each Xi is a fresh variable. (We may unfold atoms Xi = Ti as long as the clause head
continues to have distinct variables in argument positions m+1, . . . , n; in examples we do this.)
Example 7.8 The program flatten becomes
flatten(x : y, zf)← flatten(x, xf), flatten(y, yf), append(xf, yf, zf).
flatten(nil, zf)← zf = nil.
flatten(z, zf)← zf = z : nil, constant(z).
append(x, y, z)← x = nil, y = z.
append(x1, y, z1)← x1 = u : x, z1 = u : z, append(x, y, z).
We henceforth assume that this additional transformation is performed in the marking process.
We now give a transformation from a marked program and query to an equivalent program in
which the marked lists are replaced by difference-lists.
7.2. MANIPULATING DIFFERENCE-LISTS 79
Algorithm. Let P and G be a successfully marked program and query. The program (τ P ) is
obtained as follows:
1. Replace the marked lists L in P by (σ L).
2. Replace = by ⋆= whenever its arguments are marked.
3. Unfold all calls to ⋆=.
4. If ⋆= was introduced, add the definition of var-nil to P .
The query (τ ′ G) is obtained by replacing the marked lists L in G by (σ′ L).
Example 7.9 The transformed version of flatten given in Example 7.8 and the query given in
Example 7.7 is:
←flatten(z, 〈zf, nil〉).
flatten(x : y, 〈zf, zf ′〉)←
flatten(x, 〈xf, xf ′〉),
flatten(y, 〈yf, yf ′〉),
append(〈xf, xf ′〉, 〈yf, yf ′〉, 〈zf, zf ′〉).
flatten(nil, 〈v, v〉)← var-nil(v).
flatten(z, 〈z : v, v〉)← var-nil(v), constant(z).
append(〈v, v〉, 〈y, y′〉, 〈y, y′〉)← var-nil(v), var-nil(y′).
append(〈u : x, x′〉, 〈y, y′〉, 〈u : z, z′〉)←
var-nil(x′),
var-nil(z′),
append(〈x, x′〉, 〈y, y′〉, 〈z, z′〉).
var-nil(x)← if var(x) then true else x = nil.
The correctness of the transformation so far is captured by Proposition 7.10 below. The following
notion will prove useful.
Definition. A difference-list is free iff it has the form 〈T1 : . . . : Tn : X,Y 〉 and Y does not occur
in any Ti (note that Y may be X).
For the formulation of the proof of Proposition 7.10, two auxiliary notions will be helpful, but they
will play no role in the remainder of the chapter.
Definition. A term of the form 〈T, nil〉 is nil-delimited (note that T is not necessarily a list). A
free difference-list 〈T1 : . . . : Tn : X,Y 〉 is separate iff X does not appear in any Ti. We use “≡”
between equations to indicate that they are identical.
Proposition 7.10 Let program P and query G be marked. Then
• P ∪{G} returns the answers θ1, θ2, . . . iff (τ P )∪{τ ′ G} returns the answers φ1, φ2, . . . , where
for all i ≥ 1 and all variables X in G, (θi X) = (φi X).
• P ∪ {G} has an infinite derivation iff (τ P ) ∪ {τ ′ G} has an infinite derivation.
Proof: The idea in this proof is to consider a derivation of P ∪ {G} as a process of collecting an
increasing set of term equations to be solved, and to prove inductively on the number of equations
that certain invariants hold. These invariants in turn entail the proposition.
Consider a set of equations E = {e1, . . . , en} associated with some finite derivation of P ∪ {G}.
By virtue of the marking algorithms, both sides of each equation in E are either instances of marked
terms or else neither side is. Construct the query G′ = g1, . . . , gn from E by replacing each marked
term T from G by (σ′ T ), by replacing each marked term T from P by (σ T ), and by inserting
a call to var-nil after every = occurring between marked terms. We now prove by induction on
the cardinality of E that E is solvable iff G′ succeeds, and that if G′ succeeds with (the single)
answer substitution α, then α is a most general solution of E for the variables in G.
Let Ei = {e1, . . . , ei} and G′i = g1, . . . , gi. The induction hypothesis is that Ei is solvable iff
G′i succeeds, and if G′i succeeds with the answer substitution αi, then
1. for any gj of the form D ⋆= D′, it holds that (αi D) and (αi D′) are either separate free
difference-lists or nil-delimited.
2. if 〈T1 : . . . : Tn : X,Y 〉 is a separate free difference-list appearing in some (αi gj) then X and
Y only appear elsewhere in (αi G′) in separate free difference-lists of the form 〈S1 : . . . : Sm :
X,Y 〉, and
3. there is a most general unifier, θi, of Ei such that for every variable X in E it holds that
(θi X) = ((δi ◦ αi) X), where (δi V ) =
if V is a delimiter of a separate free difference-list in (αi G′) then nil else V .
Since G and P have been successfully marked, it follows from the definition of τ and τ ′ that the
above holds when i = 0. Now assume that it holds for i, and we shall show that it holds for i+ 1.
If Ei is not solvable then Ei+1 is also not solvable. It follows from the induction hypothesis that
G′i does not succeed and so G′i+1 cannot succeed. We now assume that Ei is solvable.
First consider the case when ei+1 is not marked. It follows from the construction of gi+1 and (2)
and (3) that (θi ei+1) ≡ (αi gi+1). Thus Ei+1 is solvable iff G′i+1 succeeds. Furthermore, if (αi gi+1)
has answer substitution β then αi+1 = β ◦ αi and θi+1 = β ◦ θi is a most general solution to Ei+1.
Since β is idempotent, it only relates variables in (αi gi+1). Thus the induction hypothesis holds
for i+ 1.
Now consider the case where ei+1 ≡ (S = T ) is marked. It follows from part (1) of the induction
hypothesis that (αi gi+1) has one of the four forms:
(a) 〈(αi S), nil〉⋆= 〈(αi T ), nil〉
(b) 〈(αi S), nil〉⋆= 〈(αi T ),X〉
(c) 〈(αi S),X〉⋆= 〈(αi T ), nil〉
(d) 〈(αi S),X〉⋆= 〈(αi T ), Y 〉
If it is of form (a), then from (2) and (3) of the induction hypothesis, (θi ei+1) ≡ ((αi S) = (αi T )).
Thus this case is analogous to when ei+1 is not marked. If (αi gi+1) has form (b) then from parts
(2) and (3) of the induction hypothesis, (θi ei+1) ≡ {X ← nil}((αi S) = (αi T )). Thus (αi gi+1)
succeeds iff (θi ei+1) is solvable. It is routine to verify that (1), (2), and (3) hold for i+1 if (αi gi+1)
succeeds. The case when (αi gi+1) is of form (c) is symmetric to case (b). Now consider the case
where it is of form (d). From (2) and (3) of the induction hypothesis,
(θi ei+1) ≡ {X ← nil, Y ← nil}((αi S) = (αi T )).
By the definition of ⋆=, (αi gi+1) succeeds iff (θi ei+1) is solvable. It is routine to check that (1),
(2), and (3) hold for i+ 1 if (αi gi+1) succeeds.
It follows by finite induction that E (= En) is solvable iff G′ (= G′n) succeeds. Furthermore, by
the definition of τ ′, if G′ succeeds with answer α then there is a most general solution θ of E such
that for any variable X in G, (α X) = (θ X). Since G′ is an unfolded version of the derivation of
(τ P )∪ {(τ ′ G)} which corresponds to the derivation associated with E, the assertion follows.
The proof of Proposition 7.10 made extensive use of properties of the marking algorithms. Actions
(1), (2) and (4) in the Program Marking Algorithm are necessary because otherwise the transformed
program might try to unify a difference-list with an untransformed list. The same is true for the
Query Marking Algorithm. The check in the Program Marking Algorithm that only lists have been
marked is necessary because otherwise τ is not well-defined. It may not be apparent that action
(3) in the Program Marking Algorithm is necessary. The following example shows that it is.
Example 7.11 Consider the marked program
p← r(x), x = a.
r(nil).
After applying the transformation, we obtain the program
p← r(〈x, x′〉), x = a.
r(〈y, y〉)← var-nil(y).
This succeeds with query ←p, whereas the original program fails.
The following example shows that it is necessary to check that in each clause, if a variable appears
as the suffix of a marked list, then it only ever appears as such.
Example 7.12 Consider the marked program
p← r(x : x).
r(z : nil)← z = a : nil.
After applying the transformation, we obtain the program
p← r(〈x : x, x′〉).
r(〈z : y, y〉)← var-nil(y), z = a : nil.
This succeeds with query ←p, whereas the original program fails.
The reason why we have introduced difference-lists is that some calls to the difference-list version
of append can be replaced by calls to a predicate that has constant cost and behaviour equivalent
to append’s. We now investigate when such replacement is feasible.
The transformation of append from lists to difference-lists can result in one of three different
programs depending on which arguments of append are marked. If the first argument is marked
then the following version, called append⋆, results:
append⋆(〈x, x〉, y, y)← var-nil(x).
append⋆(〈u : x, x′〉, y, u : z)← var-nil(x′), append⋆(〈x, x′〉, y, z).
A second version of append, called append⋆⋆, results when all three arguments of the original
program are marked. The definition of append⋆⋆ was given in Example 7.9 where it was just called
append. A third version of append results when the second and third arguments are marked. This
version cannot be optimised using techniques discussed here and so will not be considered further.
A call to append⋆ can often be replaced by a call to the predicate app⋆, defined by
app⋆(〈x, y〉, y, x).
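Before stating the formal relationship, it may help to see why app⋆ has constant cost. A difference-list is essentially a list with a "hole" for its tail, and concatenation merely fills the hole. The classic functional encoding makes the same point outside Prolog; the following Python sketch is our own illustrative analogue, not part of the thesis:

```python
# Functional analogue of a difference-list (illustrative only): the list
# [a, b] is represented by the function lambda tail: ['a', 'b'] + tail,
# i.e. a list with a "hole" for its tail.

def dlist(*items):
    return lambda tail: list(items) + tail

def dappend(d1, d2):
    # Concatenation is mere composition: constant work at append time,
    # mirroring the single head unification performed by app*.
    return lambda tail: d1(d2(tail))

def to_list(d):
    return d([])          # plug the hole with nil

ab, cd = dlist('a', 'b'), dlist('c', 'd')
result = to_list(dappend(ab, cd))   # ['a', 'b', 'c', 'd']
```

The composed function is built in constant time; the full list is only materialised when the hole is finally plugged, just as a difference-list is only "closed" when its delimiter is bound.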
The following proposition captures the relationship between these two predicates. In essence the
difference is that app⋆ binds the delimiter of its first argument while append⋆ does not.
Proposition 7.13 Let D be a free, fixed length difference-list whose delimiter does not appear in
either S or T . Then
1. A call app⋆(D,S, T ) finitely fails iff append⋆(D,S, T ) finitely fails.
2. A call app⋆(D,S, T ) succeeds with answer θ iff append⋆(D,S, T ) succeeds with answer φ, and
for every variable V , except the delimiter of D, (θ V ) = (φ V ).
Proof: Let D = 〈R1 : . . . : Rn : X,X〉. The call append⋆(D,S, T ) gives rise to the equation
T = R1 : . . . : Rn : S and the call app⋆(D,S, T ) gives rise to the two equations T = R1 : . . . : Rn : S
and X = S. Part (1) holds because X does not appear in the first equation, thus the second
equation set is solvable iff the first is. Part (2) follows because the only difference between the two
equation sets is that X is constrained in the second but not in the first.
The conditions given in the proposition are all necessary for this equivalence to hold. If the first
argument is not of fixed length, then app⋆ may succeed when append⋆ fails. The following example
shows this.
Example 7.14 The query
←app⋆(〈u : x, x′〉, a : nil, b : nil)
succeeds, whereas the query
←append⋆(〈u : x, x′〉, a : nil, b : nil)
fails.
The following example shows that if the first argument is not free, then, conversely, app⋆ may fail
when append⋆ succeeds.
Example 7.15 The query
←app⋆(〈nil, nil〉, a : nil, a : nil)
fails, whereas the query
←append⋆(〈nil, nil〉, a : nil, a : nil)
succeeds.
In the terminology of Sterling and Shapiro [93] this is a “compatibility” problem: nil does not
unify with a : nil.
A call to append⋆⋆ can often be replaced by a call to the predicate app⋆⋆, defined by
app⋆⋆(〈x, y〉, 〈y, z〉, 〈x, z〉)← var-nil(z).
The following proposition captures the relationship between these two predicates. Again, in essence
the difference is that app⋆⋆ binds the delimiter of its first argument while append⋆⋆ does not.
Proposition 7.16 Let D1, D2, and D3 be difference-lists such that D1 is free and of fixed length
and the delimiter of D1 does not appear in either D2 or D3. Then
1. app⋆⋆(D1,D2,D3) finitely fails iff append⋆⋆(D1,D2,D3) finitely fails.
2. app⋆⋆(D1,D2,D3) succeeds with answer θ iff append⋆⋆(D1,D2,D3) succeeds with answer φ,
and for every variable V , except the delimiter of D1, (θ V ) = (φ V ).
Proof: Let D1 = 〈R1 : . . . : Rn : X,X〉, D2 = 〈S1, S2〉, and D3 = 〈T1, T2〉. The call
append⋆⋆(D1,D2,D3)
is equivalent to the call
T1 = R1 : . . . : Rn : S1, T2 = S2, var-nil (T2).
The call
app⋆⋆(D1,D2,D3)
is equivalent to the call
T1 = R1 : . . . : Rn : S1, T2 = S2, var-nil (T2),X = S1.
Part (1) holds because the variable X does not appear in the first two equations, hence the second
call succeeds if and only if the first does. Part (2) follows because the only difference between the
two calls is that X is constrained in the second but not in the first.
The conditions given in the proposition are all necessary for the equivalence to hold. Examples
similar to Example 7.14 and Example 7.15 can be constructed to show that it is necessary that
the first argument is free and of fixed length. Furthermore, if the delimiter of the first argument
appears in either of the other arguments, app⋆⋆ may fail when append⋆⋆ succeeds. The following
example shows this.
Example 7.17 The query
←app⋆⋆(〈a : x, x〉, 〈a : x, x〉, 〈a : a : y, y〉)
fails, whereas the query
←append⋆⋆(〈a : x, x〉, 〈a : x, x〉, 〈a : a : y, y〉)
succeeds.
7.3 Automatic difference-list transformation
We have stressed how the difference-list transformation should be regarded as a stepwise process:
a change in data structure (introduction of difference-lists) followed by the introduction of rapid
concatenation by means of app⋆ and app⋆⋆. For each stage, we have demonstrated that it is only
applicable under certain circumstances. Section 7.2 discussed the safe replacement of lists by
difference-lists, and that of append⋆ and append⋆⋆ by app⋆ and app⋆⋆. In this section we sketch how
the transformation can be performed automatically for a large class of programs, including all the
examples given by Sterling and Shapiro [93].
The transformation takes a program, a predicate to be optimised, and a description of the
intended use of that predicate. The description allows better analysis to be done and is referred
to as a “call template.” The transformation is only required to be safe with respect to such a call
template.
Definition. A call template for an n-ary predicate is an n-tuple of descriptors. The i’th descriptor
specifies whether the i’th argument is a fixed length list, a list (possibly of non-fixed length) or any
term.
Example 7.18 For the predicate flatten/2, the call template might specify that the first argument
is a fixed length list and the second is a list.
Definition. A call template C for predicate Q/n is consistent with the marked program P iff for
each argument of the query Q(X1, . . . ,Xn) which is marked for P , the corresponding descriptor in
C specifies either a fixed length list or a list.
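A call template and its consistency check can be modelled directly. In the following Python sketch the descriptor names and the representation of a marking are our own inventions; the thesis fixes no concrete syntax:

```python
# Descriptors for call-template arguments (names are our own):
FIXED_LIST, LIST, ANY = 'fixed_list', 'list', 'any'

def consistent(template, marked_positions):
    """A template is consistent with a marking iff every marked
    argument position is described as a (possibly fixed length) list."""
    return all(template[i] in (FIXED_LIST, LIST) for i in marked_positions)

# flatten/2: first argument a fixed length list, second a list
# (Example 7.18); only the second argument is marked.
ok = consistent((FIXED_LIST, LIST), {1})      # True
bad = consistent((FIXED_LIST, ANY), {1})      # False
```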
Figure 7.1 gives an overview of the whole transformation. The transformation is viewed as a
three-stage process in which each stage is an analysis-synthesis sequence. The first stage is the
data structure transformation that introduces difference-lists and the corresponding ⋆-versions of
predicates, including append⋆ and append⋆⋆. The second stage transforms append⋆ and append⋆⋆
into their more efficient versions app⋆ and app⋆⋆ wherever possible, and subsequently unfolds these.
The first and second stage may have introduced calls to the non-logical var-nil . The purpose of
the third stage is to remove such calls whenever possible.
In more detail, Stage 1 consists of the following:
Analysis:
1. Mark the first argument of each call to append in P .
2. Apply the Program Marking Algorithm to P giving P ′ (if it fails then return P ).
3. Check that the call template C for Q is consistent with P ′.
Input: A predicate Q/n, a call template C for Q, and a program P defining Q.
⇓
Stage 1:
Analysis: Determine which arguments must be changed to difference-
lists in P and check that this is consistent with C.
Synthesis: Replace lists by difference-lists as dictated by the Analysis
and introduce ⋆-versions of the predicates in P .
⇓
Stage 2:
Analysis: Determine where fast concatenation is safe.
Synthesis: Replace occurrences of append⋆ and append⋆⋆ by app⋆ and
app⋆⋆ as dictated by the Analysis.
⇓
Stage 3:
Analysis: Determine which difference-lists are simple at runtime.
Synthesis: Remove calls to var-nil as dictated by the Analysis.
Figure 7.1: Overview of the transformation
Synthesis:
1. Apply the function τ to P ′ giving P ′′;
2. Rename each predicate Q′ in P ′′ to Q′⋆, provided any of its arguments are marked (this
includes changing append to append⋆ or append⋆⋆ as appropriate);
3. If Q is marked, add Q(X1, . . . ,Xn) ← (τ ′ Q(X1, . . . ,Xn)), where X1, . . . ,Xn are distinct
variables.
Example 7.19 If the program flatten from Example 7.6 is input to the transformation, then the
following program is returned from Stage 1:
flatten(z, zf)← flatten⋆(z, 〈zf, nil〉).
flatten⋆(x : y, 〈zf, zf ′〉)←
flatten⋆(x, 〈xf, xf ′〉),
flatten⋆(y, 〈yf, yf ′〉),
append⋆⋆(〈xf, xf ′〉, 〈yf, yf ′〉, 〈zf, zf ′〉).
flatten⋆(nil, 〈v, v〉)← var-nil(v).
flatten⋆(z, 〈z : v, v〉)← var-nil(v), constant(z).
append⋆⋆(〈v, v〉, 〈y, y′〉, 〈y, y′〉)← var-nil(v), var-nil(y′).
append⋆⋆(〈u : x, x′〉, 〈y, y′〉, 〈u : z, z′〉)←
var-nil(x′),
var-nil(z′),
append⋆⋆(〈x, x′〉, 〈y, y′〉, 〈z, z′〉).
var-nil(x)← if var(x) then true else x = nil.
We now look at Stage 2. We only consider the replacement of append⋆⋆ by app⋆⋆: the case of
append⋆ is simpler because Property 2 in the following definition always holds.
Definition. A call append⋆⋆(D1,D2,D3) is secure if the following three properties can be estab-
lished to hold at runtime:
1. D1 = 〈L1, L2〉 is of fixed length and most general.
2. L2 does not occur in D2 or D3.
3. L2 is not used in any subsequent unification.
By Proposition 7.16, a secure call append⋆⋆(D1,D2,D3) can be replaced by app⋆⋆(D1,D2,D3).
Security can be guaranteed by dataflow analysis. Properties 1 and 2 can be established by an
analysis similar to that used for mode and occur check analysis [19, 21, 92]. Property 3 can be
established by a live variable analysis of the kind used in compile-time garbage collection [9]. Thus
Stage 2 of the transformation consists of the following:
Analysis:
For each call to append⋆⋆ check whether the call is secure.
Synthesis:
1. Replace secure calls to append⋆⋆ by calls to app⋆⋆ and unfold these.
2. Remove the definition of append⋆⋆, if no longer needed.
Example 7.20 If the program from Example 7.19 is input to Stage 2, the following program is
returned:
flatten(z, zf)← flatten⋆(z, 〈zf, nil〉).
flatten⋆(x : y, 〈xf, yf ′〉)←
flatten⋆(x, 〈xf, yf〉),
flatten⋆(y, 〈yf, yf ′〉),
var-nil(yf ′).
flatten⋆(nil, 〈v, v〉)← var-nil(v).
flatten⋆(z, 〈z : v, v〉)← var-nil(v), constant(z).
var-nil(x)← if var(x) then true else x = nil.
The programs that result from Stage 2 will in general contain calls to var-nil . Referring to rather
contrived example programs, we saw how these checks were necessary, but for many common
programs they are not needed. A simple dataflow analysis performed in Stage 3 can determine when
calls to var-nil can be omitted. The reader may recall the definition of var-nil from Example 7.20.
From this definition it is clear that we are interested in determining the following runtime properties:
whether variables are definitely bound to nil, and whether variables are definitely free, that is,
unbound or bound to other variables. This analysis is similar to that used for mode analysis, for
example by abstract interpretation.
Example 7.21 If the program from Example 7.20 is input to Stage 3, the following program is
returned:
flatten(z, zf)← flatten⋆(z, 〈zf, nil〉).
flatten⋆(x : y, 〈xf, yf ′〉)← flatten⋆(x, 〈xf, yf〉), flatten⋆(y, 〈yf, yf ′〉).
flatten⋆(nil, 〈v, v〉).
flatten⋆(z, 〈z : v, v〉)← constant(z).
The worst-case time complexity for this program is O(n), where n is the number of constants in z,
whereas for the original flatten (Example 7.6), it was O(n2).
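The complexity claim can be checked concretely. The following Python sketch (our own model, not the Prolog programs themselves) counts cons steps for an append-based flatten against an accumulator version, which behaves like the difference-list program above:

```python
# Count "cons" (list-cell construction) steps for two flatten variants;
# an illustrative Python model of the Prolog programs, not Prolog itself.

def flatten_naive(t, cost):
    """Append-based flatten: a tree is either a pair [l, r] or a constant."""
    if isinstance(t, list):
        xs = flatten_naive(t[0], cost)
        ys = flatten_naive(t[1], cost)
        cost[0] += len(xs)          # append copies its first argument
        return xs + ys
    cost[0] += 1
    return [t]

def flatten_acc(t, acc, cost):
    """Accumulator version, corresponding to the difference-list program."""
    if isinstance(t, list):
        return flatten_acc(t[0], flatten_acc(t[1], acc, cost), cost)
    cost[0] += 1
    return [t] + acc

# A left-leaning tree of 20 constants: (((a1 : a2) : a3) : ...) : a20.
tree = 'a1'
for i in range(2, 21):
    tree = [tree, 'a%d' % i]

naive_cost, acc_cost = [0], [0]
flatten_naive(tree, naive_cost)
flatten_acc(tree, [], acc_cost)
```

On this 20-constant left-leaning tree the naive version performs 210 cons steps against 20 for the accumulator version, in line with the O(n²) versus O(n) bounds.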
The correctness of the whole transformation should be clear, since at each stage, the analysis simply
aims at establishing that the subsequent synthesis yields an equivalent program by the propositions
of Section 7.2. The termination of the process is evident assuming that the analyses in Stage 2 and
Stage 3 terminate.
Note that resulting programs may contain calls to var-nil and thus may be non-logical. This
is the price to be paid for having a very general transformation, able to handle a large class of
programs. However, the dataflow analysis in Stage 3 is intended to discover needless calls to
var-nil , and so for all the standard examples [93], the resulting program should be as logical as the
original.
7.4 Discussion
We have shown that the introduction of difference-lists in a Prolog program followed by optimisation
of concatenation should be handled with great care: the program’s semantics is easily changed by
accident. This transformation, however, is useful for a large class of list-processing programs,
because it considerably improves efficiency.
Fortunately it is possible to design a dataflow analysis to recognise cases where the transfor-
mation is safe. This dataflow analysis can be made sufficiently precise to allow for the automatic
transformation of all the standard example programs known to benefit from the use of difference-
lists. We have sketched this analysis and how it forms part of the total transformation process.
Our aim has been to analyse the “semantics” of difference-lists and to point to the possibility of
an automatic transformation. We have not been concerned with the transformation’s efficient
implementation. Some of the devices we have introduced, such as the predicate ⋆=, or the division into
three stages, serve merely explanatory purposes and would not be called for in an implementation.
The resulting programs need not contain any difference-lists at all: it is clear that the desired
improvements of programs can be achieved by replacing a list argument not by a difference-list but
by two new arguments to the predicate. Thus, if one prefers, the generated predicates can avoid
difference-lists by having more arguments (though of course the difference-lists are still implicitly
present). The programs that result from this approach are very similar to those resulting from the
well-known unfold/fold transformations [95]. Indeed such transformations are used in the method
of Zhang and Grant [102]. There is no doubt that the two transformation processes are closely
connected. The present approach, however, seems to have the advantage of lending itself more
readily to automation. This is because the unfold/fold paradigm calls for “eurekas” in the form
of new predicate definitions. Another problem with an unfold/fold-based method (such as Zhang
and Grant’s) is that folding in general does not preserve finite failure sets, and therefore such an
approach is correct in a somewhat weaker sense than ours. (In fact, the exact conditions under
which folding is correct have only been laid down very recently [29].)
It is not easy to give a detailed analysis of the decrease in time complexity obtained by the
transformation we have sketched. The reduction in number of derivation steps, however, is obvious,
and experience indicates that the transformation is usually worthwhile. The greatest expense of
the transformation lies in the necessary dataflow analyses. In this connection it is worth pointing
out the possibility of reusing dataflow information for many different purposes. As this report
shows, many useful properties of logic programs (determinacy, mode or type correctness, absence
of occur check problems, etc.) can be determined by dataflow analyses that propagate the same
kinds of information (freeness, sharing, etc.) as are required by the present analysis. This suggests
the usefulness of fusing such analyses.
The sketched transformation is very general. It may be applied successfully to almost any
useful list-processing program. For the standard examples, the transformation behaves optimally.
For more intractable programs it does not give up, but leaves certain (non-logical) calls in the
program to perform runtime checks.
The basic idea, of course, of pasting data structures together efficiently by introducing extra
variables applies not only to lists. It appears that a study of the more general case would be very
useful, but much harder: here we have relied heavily on the heuristic that it is useful to give append
a difference-list as a first argument.
Chapter 8
Conclusion
We have presented a theory of dataflow analysis of logic programs and we have sketched some
applications in areas such as error finding, program transformation, compilation, and parallelisation.
We have called our thesis “Semantics-Based Analysis and Transformation of Logic Programs,”
because we have seen our work as an attempt not only to justify certain dataflow analyses and
program transformations with respect to an underlying semantics, but also to understand them
better by clarifying their exact relation to semantics. We have striven to obtain simple definitions
of the semantics of logic programs and dataflow analyses, with the hope that an understanding
of their close relationship could spring from a high degree of congruence of the definitions. The
definitions reflect a choice of what is essential in the programming language under study. We
feel that the contribution of the present thesis lies as much in the suggested definitions as in the
theorems established.
A particular aim has been to try to capture the essence of what seems to us to be a major class
of analyses called for in many different logic programming tools. Like so much related work, our
theory is based on P. and R. Cousot’s notion of abstract interpretation. One assumption made by
P. and R. Cousot has been relaxed, so as to obtain a more widely applicable theory, but we do not
hesitate to call our approach abstract interpretation. In particular, the whole development of the
groundness analysis presented in Section 6.4 fits into the Cousot framework.
A more precise label for our work would be “denotational abstract interpretation,” though.
In this respect it differs significantly from much of the other work that has been published about
abstract interpretation of logic programs. We hope that the present thesis will be seen as a point
in favour of the denotational approach developed by F. Nielson, or at least for an approach based
on a powerful meta-language such as that of denotational semantics. Nielson’s approach allows for
generality at different levels. First, the language of denotational semantics allows for comfortable
reasoning at exactly the level of abstraction called for by any particular class of applications, or
dataflow analyses. Second, the proof of correctness of a particular dataflow analysis becomes almost
trivial, since most of it can be conducted at the level of the meta-language once and for all. Finally,
most of the theory is independent of any particular programming language, since it is expressed
in terms of the meta-language only. This last point indicates that our work could be redone for
related programming language paradigms without too much effort.
We have found it useful to distinguish between bottom-up and top-down analysis. This distinc-
tion is not clear-cut, but we think of a “top-down semantics” as one that allows for extraction of
information about the SLD tree that corresponds to the execution of a program given some query.
Bottom-up analysis is not based on such a semantics to begin with, and therefore it cannot provide
information about, for example, calls that will take place at runtime. Bottom-up analysis suffices for
several applications, though. Not only is it the conceptually simpler of the two, it also allows for
efficient derivation of query-independent information about a program. We have given examples
of its use for program specialisation. As examples of the use of top-down analysis we looked at
groundness analysis, and as a more complex case, we studied how top-down dataflow analysis could
be used to guarantee the correctness of the (automatic) transformation of list-processing programs
into more efficient programs using difference-lists.
In this thesis we have paid far too little attention to implementation issues. The aim of a
theory of dataflow analysis is of course not just to improve our understanding of the phenomenon
but to use that understanding to improve the practice of building programming tools. The reader
may feel that, in our quest for abstract semantic definitions (so as to simplify definitions), we have
only widened the theory/practice gap, since “more abstract” can easily be taken to mean “further
away from a concrete implementation.” This is not correct, though. The “dataflow semantics”
of Section 6.3 was obtained (and its correctness proved) through a series of abstract semantics,
but instances of it are no more difficult to implement efficiently than some of the “algorithms”
or “procedures” previously suggested [7, 44], assuming that the implementor knows some of the
standard tricks for efficient fixpoint computation. (An interesting project would be to build an
“analyzer generator” from the dataflow semantics. Such a device would, given an “interpretation,”
expressed in a suitable definition language, automatically produce a program that performs the
corresponding dataflow analysis.)
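The fixpoint computations mentioned above can be pictured as plain Kleene iteration: start from the bottom element and apply the semantic function until the result stabilises, which is guaranteed for a monotone function on a finite (or ascending chain finite) domain. The following generic Python sketch, including the toy call-graph "analysis", is our own illustration:

```python
# Naive Kleene iteration: least fixpoint of a monotone function f on a
# finite lattice, starting from bottom (illustrative sketch only).

def lfp(f, bottom):
    x = bottom
    while True:
        y = f(x)
        if y == x:          # stable: x is the least fixpoint above bottom
            return x
        x = y

# Toy "analysis": the smallest set of predicates reachable from 'flatten'
# in a hypothetical call graph.
calls = {'flatten': {'flatten', 'append'}, 'append': {'append'}}

def step(reached):
    return reached | {q for p in reached for q in calls.get(p, set())}

reachable = lfp(step, {'flatten'})
```

Real analysers refine this loop with worklists and memoisation, which are among the standard tricks for efficient fixpoint computation referred to above.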
Fixpoint computation apart, efficient implementation of a particular dataflow analysis comes
down to the implementation of the operations in the corresponding interpretation, including possible
normalisation of descriptions. Clearly the complexity of these operations depends on the granularity
of the domain of descriptions chosen for the analysis, as does the precision of the information we
can extract. While we have well-established means for discussing complexity, there is no good
metric for “precision” of a dataflow analysis, and this makes it very hard to discuss the obvious
trade-off between higher time/space complexity and less precise dataflow information. So far, it
seems that what constitutes a “good” dataflow analysis is a purely empirical question, but it may
still be possible to develop some useful measure for “precision” that would be helpful in discussions
of the trade-off.
It would be useful to study to what extent ideas developed here could be used for related
programming language paradigms. Owing to the denotational framework, one would hope that it
would be relatively straightforward to redo some of the work for related programming languages,
such as logic programming languages with delay mechanisms (“freeze,” “wait,” “when,” etc.), de-
ductive databases [49, 72], constraint logic programming languages [36, 57], and perhaps even
concurrent constraint programming languages [86]. Whether this is the case remains to be seen,
though. Presumably the kinds of dataflow information relevant for implementing these program-
ming languages differ from what has been discussed in the present thesis, but it would seem that
abstract interpretation could be as versatile for these languages as it has proved to be for logic
programming.
The basic hypothesis in the present work has been that of P. and R. Cousot: many important
dataflow analyses in a programming language L can be understood as approximation of extreme
fixpoints in L’s semantic domains. It seems certain, however, that this does not cover all interesting
analyses, and it would be very useful to understand the limitations of the framework of abstract
interpretation in a logic programming context. One problem is that there are many useful properties
that are not inclusive, such as “finiteness” in deductive databases. In such cases our present theory
allows for no better solution than approximating the relevant least fixpoint by approximating a
greatest fixpoint from above.
Another problem is that some of the domains of descriptions that have been suggested for useful
dataflow analyses are neither ascending nor descending chain finite. This is the case, for example,
for the so-called regular trees used by Pyo and Reddy [82] for type inference. It would therefore
seem that methods such as those of Heintze and Jaffar [32] and Pyo and Reddy cannot easily
be captured, if at all, unless one considers incorporating some kind of “widening,” as discussed
by P. and R. Cousot [13]. Finally it is worth mentioning that Plaisted’s [80] idea about abstract
theorem proving bears a close resemblance to abstract interpretation of logic programs. Plaisted
points out that analogy is often a useful guide to proofs and discusses how one can use abstraction
in theorem proving to simplify reasoning. The idea is to first prove a related but simpler theorem
and then use its proof to guide a proof of the general theorem. Our bottom-up analyses can be
seen as particular examples (which use modus ponens as inference rule) of this principle. Similarly
our top-down analyses are examples that use resolution as deductive machinery.
94 CHAPTER 8. CONCLUSION
Appendix A
Correctness of the base semantics
To prove that the base semantics approximates the SLD semantics we first state a number of
elementary results concerning parametric substitutions.
Lemma A.1 Let Π,Π′ ⊆ Psub be given. Then
∆ dom (meet Π Π′) ⊆ (∆ dom Π) ∪ (∆ dom Π′).
Lemma A.2 Let Π ⊆ Psub and V,W ⊆ Var be given. Then
restrict V (restrict W Π) = restrict(W ∩ V ) Π.
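Lemma A.2 is easy to check on a toy model. The sketch below is only illustrative: it models a substitution as a Python dict from variables to terms and a set of substitutions as a list of such dicts; the name `restrict` mirrors the operator above but is not the thesis's formal definition.

```python
# Toy model of restriction on sets of substitutions: restrict keeps,
# in every substitution, only the bindings for variables in vs.
# Representation and names are illustrative, not the formal Psub model.

def restrict(vs, subs):
    return [{v: t for v, t in s.items() if v in vs} for s in subs]

pi = [{"X": "a", "Y": "b", "Z": "c"}]
V, W = {"X", "Y"}, {"Y", "Z"}

# Lemma A.2:  restrict V (restrict W Π) = restrict (W ∩ V) Π
assert restrict(V, restrict(W, pi)) == restrict(W & V, pi)
```

Restricting twice in sequence discards exactly the bindings outside the intersection, which is why the two sides agree.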
Lemma A.3 Let Π ⊆ Psub and V,W ⊆ Var be given such that W ∩ (∆ dom Π) ⊆ V ⊆W . Then
restrictW Π = restrict V Π.
Lemma A.4 Let Π,Π′ ⊆ Psub be given and let W ⊆ Var be such that ∆ dom Π ⊆W . Then
restrictW (meet Π Π′) = meet Π (restrictW Π′).
Lemma A.5 Let θ, θ′ ∈ Sub be given such that (dom θ′) ∩ (vars θ) = ∅. Then
β (θ ◦ θ′) = β θ ⊓ β θ′.
Lemma A.6 Let θ ∈ Sub and π ∈ Psub be given such that π ≤ β θ. Then π = π ◦ θ.
The following result is a direct consequence of Lemma A.3.
Lemma A.7 Let A,A′ ∈ Atom and π ∈ Psub be given. If (vars A) ∩ (vars A′) = ∅ and (dom π) ∩
(vars A) = ∅ then unify A A′ {π} = restrict(vars A) (π ⊓ β (mgu A A′)).
We also need to establish some invariants of the base semantics.
Definition. The predicate additive : Den → Bool is defined by
additive d iff ∀A ∈ Atom . ∀Π ⊆ Psub . d A Π = ⊔π∈Π (d A {π}).
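Informally, additivity says a denotation is determined by its behaviour on singleton sets of substitutions. The following toy sketch (substitutions modeled as opaque strings, join read as set union; all names are illustrative and not the thesis's semantic domains) checks the defining equation on one input and shows it can fail:

```python
# A "denotation" here is any function from an atom and a set of
# substitutions to a set of substitutions. additive_on tests the equation
#   d A Π = ⊔_{π ∈ Π} (d A {π})
# on one input, with join read as set union in this toy model.

def additive_on(d, atom, subs):
    pointwise = set().union(*(d(atom, {s}) for s in subs))
    return d(atom, subs) == pointwise

def den(atom, subs):
    # Defined pointwise on substitutions, hence additive by construction.
    return {s + "'" for s in subs}

def non_additive(atom, subs):
    # The answers depend on the input set as a whole, so additivity fails.
    return set(subs) if len(subs) >= 2 else set()

assert additive_on(den, "p(X)", {"a", "b"})
assert not additive_on(non_additive, "p(X)", {"a", "b"})
```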
Lemma A.8 The predicate additive is inclusive.
Proof: Let D ⊆ Den be a chain such that additive d holds for all d ∈ D. Then
π′ ∈ (⊔D) A Π
⇔ ∃ d ∈ D . π′ ∈ d A Π
⇔ ∃ d ∈ D . ∃π ∈ Π . π′ ∈ d A {π}
⇔ ∃π ∈ Π . π′ ∈ (⊔D) A {π}
⇔ π′ ∈ ⊔π∈Π ((⊔D) A {π}).
Lemma A.9 For all P ∈ Prog and d ∈ Den, additive (P′bas [[P ]] d) holds.
Proof: We have that
π′ ∈ P′bas [[P ]] d A Π
⇔ ∃C ∈ P . π′ ∈ Cbas [[C]] d A Π
⇔ ∃C ∈ P . ∃π ∈ Π . π′ ∈ Cbas [[C]] d A {π}
⇔ ∃π ∈ Π . π′ ∈ P′bas [[P ]] d A {π}
⇔ π′ ∈ ⊔π∈Π (P′bas [[P ]] d A {π}).
The following lemma characterizes additivity.
Lemma A.10 For all d ∈ Den,
additive d ⇔ ∀G ∈ Atom∗ . ∀Π ⊆ Psub . Σ d G Π = ⋃π∈Π (Σ d G {π}).
Proof: The ⇐ direction holds trivially. The ⇒ direction can be proved by structural induction on
G.
Lemma A.11 Let D ⊆ Den be a non-empty chain such that additive d holds for all d ∈ D. Then
π ∈ Σ (⊔D) G Π ⇔ ∃ d ∈ D . π ∈ Σ d G Π.
Proof: The proof is by structural induction on G. If G = nil, the assertion holds since D ≠ ∅.
Otherwise G = (A : G′) for some A and G′. Assume as hypothesis for induction that
π ∈ Σ (⊔D) G′ Π ⇔ ∃ d ∈ D . π ∈ Σ d G′ Π.
We have that
π′′ ∈ Σ (⊔D) G Π
⇔ π′′ ∈ Σ (⊔D) G′ ((⊔D) A Π) (by the definition of Σ)
⇔ ∃π ∈ Π . ∃π′ ∈ (⊔D) A {π} . π′′ ∈ Σ (⊔D) G′ {π′}
by Lemmas A.8 and A.10. It follows by the induction hypothesis that
π′′ ∈ Σ (⊔D) G Π
⇔ ∃π ∈ Π . ∃π′ ∈ (⊔D) A {π} . ∃ d ∈ D . π′′ ∈ Σ d G′ {π′}
⇔ ∃ d ∈ D . π′′ ∈ Σ d G Π
by Lemma A.10 and the definition of Σ.
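The unfolding of Σ used throughout these proofs — Σ d nil Π = Π and Σ d (A : G′) Π = Σ d G′ (d A Π) — is simply a left fold of the denotation over the goal. The sketch below is a toy model only: goals are Python lists of atoms, substitutions are opaque values, and the names `sigma` and `den` are illustrative.

```python
# Toy sketch of the goal-evaluation function Σ: thread a set of
# substitutions through the atoms of a goal, left to right:
#   Σ d nil Π = Π,   Σ d (A : G') Π = Σ d G' (d A Π).
from functools import reduce

def sigma(d, goal, subs):
    return reduce(lambda acc, atom: d(atom, acc), goal, subs)

# A denotation that marks each substitution once per resolved atom:
def den(atom, subs):
    return {s + "'" for s in subs}

assert sigma(den, [], {"a"}) == {"a"}
assert sigma(den, ["p", "q"], {"a"}) == {"a''"}
```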
Definition. The predicate comp : Den → Bool is defined by
comp d iff ∀G ∈ Atom∗ . ∀π, π′ ∈ Psub . ∀ θ ∈ Sub . ∀ ρ ∈ Ren.
∆ dom (Σ d G {π}) ⊆ (dom π) ∪ (vars G)
∧ (π G = π′ G ∧ π′ ≤ π)⇒ Σ d G {π′} = meet{π′} (Σ d G {π})
∧ π ≤ β θ ⇒ Σ d (θ G) {π} = Σ d G {π}
∧ Σ d G {π} = {π′ ◦ ρ | π′ ∈ Σ d (ρ G) {π ◦ ρ}}.
Lemma A.12 The predicate λd . (comp d) ∧ (additive d) is inclusive.
Proof: Let D ⊆ Den be a chain such that (additive d) and (comp d) both hold for every d ∈ D. By
Lemma A.8, additive (⊔D) holds. If D = ∅ then it is straightforward to verify that comp (⊔D) holds. For D ≠ ∅, Lemma A.11 makes it a mechanical task to verify that comp (⊔D) holds.
Lemma A.13 For all P ∈ Prog and d ∈ Den, comp d⇒ comp (P′bas [[P ]] d).
Proof: Let d′ = P′bas [[P ]] d. We must prove the following:
(1) π2 ∈ Σ d′ G {π} ⇒ dom π2 ⊆ (dom π) ∪ (vars G)
(2) (π G = π′ G ∧ π′ ≤ π)⇒ (π2 ∈ Σ d′ G {π} ⇔ (π2 ⊓ π′) ∈ (Σ d′ G {π′}) ∪ {⊥Psub})
(3) π ≤ (β θ)⇒ (π2 ∈ Σ d′ (θ G) {π} ⇔ π2 ∈ Σ d′ G {π})
(4) (π2 ◦ ρ) ∈ Σ d′ (ρ G) {π ◦ ρ} ⇔ π2 ∈ Σ d′ G {π}.
The proof is by structural induction on G. If G = nil then it is straightforward to verify (1)–(4).
Otherwise G = (A : G′) for some A and G′. Assume as hypothesis for induction that
(1′) π2 ∈ Σ d′ G′ {π} ⇒ dom π2 ⊆ (dom π) ∪ (vars G′)
(2′) (π G′ = π′ G′ ∧ π′ ≤ π)⇒ (π2 ∈ Σ d′ G′ {π} ⇔ (π2 ⊓ π′) ∈ (Σ d′ G′ {π′}) ∪ {⊥Psub})
(3′) π ≤ (β θ)⇒ (π2 ∈ Σ d′ (θ G′) {π} ⇔ π2 ∈ Σ d′ G′ {π})
(4′) (π2 ◦ ρ) ∈ Σ d′ (ρ G′) {π ◦ ρ} ⇔ π2 ∈ Σ d′ G′ {π}
all hold. By Lemma A.9, additive d′ holds. Thus by Lemma A.10 and the definition of d′,
π2 ∈ Σ d′ G {π}
⇔ ∃π1 ∈ d′ A {π} . π2 ∈ Σ d′ G′ {π1}
⇔ ∃ [[H ←B]] ∈ P . ∃π3 ∈ unify A H (Σ d B (unify H A {π})) . π2 ∈ Σ d′ G′ {π ⊓ π3}.
By the definition of unify , (dom π3) ⊆ (vars A). Thus by Lemma A.1, dom(π ⊓ π3) ⊆ (dom π) ∪
(vars A). By (1′), dom π2 ⊆ (dom π) ∪ (vars A) ∪ (vars G′) = (dom π) ∪ (vars G), so (1) holds.
Assume π G = π′ G ∧ π′ ≤ π. Since π A = π′ A, we have that
unify H A {π} = unify H A {π′}.
Thus,
unify A H (Σ d B (unify H A {π})) = unify A H (Σ d B (unify H A {π′})).
Let π3 ∈ unify A H (Σ d B (unify H A {π})). Then (dom π3) ⊆ (vars A) ⊆ (vars G). Repeated
use of Lemma A.4 therefore gives
restrict (vars G) {π3 ⊓ π′}
= meet {π3} (restrict(vars G) {π′})
= meet {π3} (restrict(vars G) {π})
= restrict (vars G) {π3 ⊓ π}.
Thus (π3 ⊓ π) G′ = (π3 ⊓ π′) G′. Since (π3 ⊓ π′) ≤ (π3 ⊓ π), it follows from (2′) that
π2 ∈ Σ d′ G′ {π3 ⊓ π} ⇔ (π2 ⊓ π3 ⊓ π′) ∈ (Σ d′ G′ {π3 ⊓ π′}) ∪ {⊥Psub}.
Also, by (2′), π2 ≤ (π3 ⊓ π) ≤ π3. Thus,
π2 ∈ Σ d′ G′ {π3 ⊓ π} ⇔ (π2 ⊓ π′) ∈ (Σ d′ G′ {π3 ⊓ π′}) ∪ {⊥Psub},
and (2) follows.
Assume π ≤ β θ. By the definition of unify , unify H A {π} = unify H (θ A) {π}. By (2′),
π′′′ ∈ Σ d B π′′ ⇒ π′′′ ≤ π′′. Thus
∀π′′′ ∈ Σ d B (unify H A {π}) . (π′′′ H) ≤ π (θ A).
Hence, by the definition of unify ,
π3 ∈ unify A H (Σ d B (unify H A {π}))
⇔ ∃π′3 ∈ unify (θ A) H (Σ d B (unify H (θ A) {π})) . π3 ⊓ (β θ) = π′3 ⊓ (β θ)
⇒ ∃π′3 ∈ unify (θ A) H (Σ d B (unify H (θ A) {π})) . π3 ⊓ π = π′3 ⊓ π.
Clearly (π3 ⊓ π) ≤ β θ, so (3) follows from (3′).
Finally it follows from the definition of unify that
π3 ∈ unify A H (Σ d B (unify H A {π}))
⇔ (π3 ◦ ρ) ∈ unify (ρ A) H (Σ d B (unify H (ρ A) {π ◦ ρ})).
By the definition of ⊓, (π3 ⊓ π) ◦ ρ = (π3 ◦ ρ) ⊓ (π ◦ ρ), so (4) follows from (4′).
Definition. The predicate cor : (Sem ×Den)→ Bool is defined by
cor (s, d) iff ∀G ∈ Atom∗ . ∀ θ ∈ Sub . restrict U (∆ β (s θ (θ G))) ⊆ Σ d G {β θ},
where U = (vars θ) ∪ (vars G).
Lemma A.14 The predicate λ (s, d) . (cor (s, d) ∧ additive d) is inclusive.
Proof: Let X ⊆ Sem × Den be a chain such that for all (s, d) ∈ X, cor (s, d) ∧ additive d
holds. Let D = {d | (s, d) ∈ X} and S = {s | (s, d) ∈ X}. Clearly D and S are chains and
⊔X = (⊔S, ⊔D). By Lemma A.8, additive (⊔D) holds. We must show that cor (⊔S, ⊔D) holds.
Letting U = (vars θ) ∪ (vars G) we have
π ∈ restrict U (∆ β ((⊔S) θ (θ G)))
⇔ ∃ s ∈ S . π ∈ restrict U (∆ β (s θ (θ G)))
⇔ ∃ d ∈ D . π ∈ Σ d G {β θ}
⇔ π ∈ Σ (⊔D) G {β θ}.
The assertion follows.
Lemma A.15 Let P ∈ Prog, d ∈ Den and s ∈ Sem be given. Then
(comp d ∧ d ≤ lfp(P′bas [[P ]]) ∧ cor (s, d)) ⇒ cor ((O′ [[P ]] s), (P′bas [[P ]] d)).
Proof: Let G ∈ Atom∗, θ ∈ Sub, and P ∈ Prog be given and let U = (vars θ) ∪ (vars G). Assume
that comp d, d ≤ lfp(P′bas [[P ]]), and cor (s, d) all hold. By the definition of cor and that of O′, it
suffices to show that
(5) restrict U (∆ β (O′ [[P ]] s θ (θ G))) ⊆ Σ (P′bas [[P ]] d) G {β θ}.
If G = nil then both sides of (5) are equal to {β θ}, so (5) holds. Otherwise θ G = (A : G′) for
some A and G′. Consider C ∈ P and let [[H←B]] = rename U C. If mgu A H = ∅ then (5) holds.
Otherwise mgu A H = {µ} for some substitution µ. Let θ′ = µ ◦ θ.
Note that vars µ = (vars A) ∪ (vars H) and that ((vars H) ∪ (vars B)) ∩ U = ∅. Since θ is
idempotent, θ (A : G′) = (A : G′). Thus (dom θ) ∩ (vars A) = ∅ and so (dom θ) ∩ (vars µ) = ∅.
By the definition of C′,
restrict U (∆ β (C′ [[C]] s θ (θ G)))
= restrict U (∆ β (s θ′ (µ (B :: G′))))
= restrict U (∆ β (s θ′ (θ′ (B :: G′))))
since θ′ (B :: G′) = µ (B :: G′). Let U ′ = (vars θ′) ∪ (vars(B :: G′)). Then U ⊆ U ′ and thus
restrict U (∆ β (C′ [[C]] s θ (θ G)))
= restrict U (restrict U ′ (∆ β (s θ′ (θ′ (B :: G′))))) (by Lemma A.2)
⊆ restrict U (Σ d (B :: G′) {β θ′}) (since cor (s, d) holds)
= restrict U (Σ d G′ (Σ d B {β θ′})) (by the definition of Σ)
= restrict U (meet Π (Σ d G′ (restrict U Π))),
where Π = Σ d B {β θ′}, since vars G ⊆ U and comp d holds. Furthermore, since comp d holds,
we have by Lemma A.1 that dom(Σ d G′ (restrict U Π)) ⊆ vars U . Thus
restrict U (∆ β (C′ [[C]] s θ (θ G)))
⊆ meet(restrict U Π) (Σ d G′ (restrict U Π)) (by Lemma A.4)
= Σ d G′ (restrict U Π) (since comp d holds)
⊆ Σ (P′bas [[P ]] d) G′ (restrict U Π)
since d ≤ lfp(P′bas [[P ]]) and P′bas is monotonic. Thus
(6) restrict U (∆ β (C′ [[C]] s θ (θ G))) ⊆ Σ (P′bas [[P ]] d) G′ (restrict U (Σ d B {β θ′})).
Now restrict U (Σ d B {β θ′}) = restrict U (meet{β θ′} (Σ d B (restrict(vars B) {β θ′}))), since
comp d holds. Since (vars B) ∩ (dom{β θ′}) ⊆ vars H,
restrict U (Σ d B {β θ′})
= restrict U (meet{β θ′} (Σ d B (restrict (vars H) {β θ′}))) (by Lemma A.3)
= restrict U (meet{β θ′} (Σ d B (restrict (vars H) {(β µ) ⊓ (β θ)}))) (by Lemma A.5)
= restrict U (meet{β θ′} (Σ d B (unify H A {β θ}))) (by Lemma A.7)
= restrict U (meet{β θ} (meet{β µ} (Σ d B (unify H A {β θ})))) (by Lemma A.5)
= meet{β θ} (restrict U (meet{β µ} (Σ d B (unify H A {β θ})))) (by Lemma A.4)
since dom{β θ} = (vars θ) ⊆ U . Now by comp d and Lemma A.1,
dom(meet{β µ} (Σ d B (unify H A {β θ}))) ⊆ (vars A) ∪ (vars H) ∪ (vars B).
Since U ∩ ((vars A) ∪ (vars H) ∪ (vars B)) ⊆ vars A, it follows by Lemma A.3 that
restrict U (Σ d B {β θ′})
= meet{β θ} (restrict(vars A) (meet{β µ} (Σ d B (unify H A {β θ}))))
= meet{β θ} (unify A H (Σ d B (unify H A {β θ})))
by Lemma A.7. Now, since [[H ← B]] is a renaming of C ∈ P , there is some ρ ∈ Ren such that
C = [[(ρ H)← (ρ B)]]. By the definition of unify and by comp d, therefore
restrict U (Σ d B {β θ′})
= meet{β θ} (unify A (ρ H) (Σ d (ρ B) (unify (ρ H) A {β θ})))
= Cbas [[C]] d A {β θ}.
Insertion in (6) therefore gives
restrict U (∆ β (C′ [[C]] s θ (θ G))) ⊆ Σ (P′bas [[P ]] d) G′ (Cbas [[C]] d A {β θ}).
Since C ∈ P was chosen arbitrarily, it follows from the definitions of O′ and Pbas that
restrict U (∆ β (O′ [[P ]] s θ (θ G)))
⊆ Σ (P′bas [[P ]] d) G′ (P′bas [[P ]] d A {β θ})
= Σ (P′bas [[P ]] d) (A : G′) {β θ} (by the definition of Σ)
= Σ (P′bas [[P ]] d) G {β θ}.
By comp d and Lemma A.12, since θ G = (A : G′), (5) holds.
Lemma A.16 For all programs P ,
cor (lfp (O′ [[P ]]), lfp (P′bas [[P ]])) ∧ comp (lfp (P′bas [[P ]])) ∧ additive (lfp (P′bas [[P ]])).
Proof: Let cor1 be defined by cor1 (s, d) iff comp d ∧ additive d ∧ cor (s, d). By Lemmas A.12
and A.14, cor1 is inclusive. By Lemmas A.13 and A.15,
(cor1 (s, d) ∧ (s, d) ≤ (lfp (O′ [[P ]]), lfp (P′bas [[P ]]))) ⇒ cor1 ((O′ [[P ]] s), (P′bas [[P ]] d)).
Thus by fixpoint induction, cor1 (lfp (O′ [[P ]]), lfp (P′bas [[P ]])) holds and the assertion follows.
We can now prove Theorem 6.7, that is, show that the base semantics and the SLD semantics are
congruent.
Theorem 6.7 For all programs P , cong ((O [[P ]]), (Pbas [[P ]])).
Proof: We must show that cong ((O [[P ]]), (Pbas [[P ]])) holds. Let d = Pbas [[P ]] and e = O [[P ]].
Now d = lfp (P′bas [[P ]]) and so by Lemmas A.8 and A.16, additive d holds. Thus it suffices to
show for every A ∈ Atom and θ ∈ Sub that
{θ′ (θ A) | θ′ ∈ e (θ (A : nil))} ⊆ {π A | π ∈ d A {β θ}}.
By the definition of O,
{θ′ (θ A) | θ′ ∈ e (θ (A : nil))}
= {θ′ (θ A) | θ′ ∈ lfp (O′ [[P ]]) ι (θ (A : nil))}
⊆ {π (θ A) | π ∈ d (θ A) {ǫ}}
since by Lemma A.16, cor (lfp (O′ [[P ]]), d). If (dom π) ⊆ vars (θ A) then by Lemmas A.4 and A.6,
(π ⊓ (β θ)) A = (π ⊓ (β θ)) (θ A) = π (θ A). Therefore
{θ′ (θ A) | θ′ ∈ e (θ (A : nil))}
= {(π ⊓ (β θ)) A | π ∈ d (θ A) {ǫ}}
= {π A | π ∈ meet{β θ} (d (θ A) {ǫ})}
= {π A | π ∈ d (θ A) {β θ}} (since by Lemma A.16, comp d holds)
= {π A | π ∈ d A {β θ}}.
The assertion follows.
Bibliography
[1] S. Abramsky and C. Hankin, editors. Abstract Interpretation of Declarative Languages. Ellis
Horwood, 1987.
[2] K. Apt and M. van Emden. Contributions to the theory of logic programming. Journal of the
ACM 29: 841–862, 1982.
[3] F. Bancilhon et al. Magic sets and other strange ways to implement logic programs. In Proc.
Fifth ACM Symp. Principles of Database Systems, pages 1–15. Cambridge, Massachusetts,
1986.
[4] R. Barbuti, R. Giacobazzi and G. Levi. A declarative approach to abstract interpretation of
logic programs. Technical report 20/89, Dept. of Informatics, University of Pisa, Italy, 1989.
[5] G. Birkhoff. Lattice Theory (AMS Coll. Publ. XXV). American Mathematical Society, third
edition 1973.
[6] C. Bloch. Source-to-source transformations of logic programs. M. Sc. Dissertation, Weizmann
Institute of Science, Rehovot, Israel, 1984.
[7] M. Bruynooghe. A framework for the abstract interpretation of logic programs. Report
CW 62, Dept. of Computer Science, University of Leuven, Belgium, 1987.
[8] M. Bruynooghe. A practical framework for the abstract interpretation of logic programs. To
appear in Journal of Logic Programming.
[9] M. Bruynooghe et al. Abstract interpretation: towards the global optimization of Prolog
programs. In Proc. Fourth Int. Symp. Logic Programming, pages 192–204. San Francisco,
California, 1987.
[10] K. Clark. Negation as failure. In Gallaire and Minker [28], pages 293–322.
[11] K. Clark and S.-Å. Tärnlund. A first order theory of data and programs. In B. Gilchrist,
editor, Information Processing, pages 939–944. North-Holland, 1977.
[12] A. Colmerauer. Metamorphosis grammars. In L. Bolc, editor, Natural Language Communica-
tion with Computers (Lecture Notes in Computer Science 63), pages 133–189. Springer-Verlag,
1978.
[13] P. Cousot and R. Cousot. Abstract interpretation: a unified lattice model for static analy-
sis of programs by construction or approximation of fixpoints. In Proc. Fourth Ann. ACM
Symp. Principles of Programming Languages, pages 238–252. Los Angeles, California, 1977.
[14] P. Cousot and R. Cousot. Systematic design of program analysis frameworks. In Proc. Sixth
Ann. ACM Symp. Principles of Programming Languages, pages 269–282. San Antonio, Texas,
1979.
[15] P. Dart. On derived dependencies and connected databases. To appear in Journal of Logic
Programming.
[16] S. K. Debray. Efficient dataflow analysis of logic programs. Draft manuscript, 37 pages.
Dept. of Computer Science, University of Arizona, 1989. Preliminary version in Proc. Fif-
teenth Ann. ACM Symp. Principles of Programming Languages, pages 260–273. San Diego,
California, 1988.
[17] S. K. Debray. Global Optimization of Logic Programs. Ph. D. Thesis, State University of New
York at Stony Brook, New York, 1986.
[18] S. K. Debray. Static analysis of parallel logic programs. In Kowalski and Bowen [46], pages
711–732.
[19] S. K. Debray. Static inference of modes and data dependencies in logic programs. ACM
Transactions on Programming Languages and Systems 11 (3): 418–450, 1989.
[20] S. K. Debray and P. Mishra. Denotational and operational semantics for Prolog. Journal of
Logic Programming 5 (1): 61–91, 1988.
[21] S. K. Debray and D. S. Warren. Automatic mode inference of logic programs. Journal of
Logic Programming 5 (3): 207–229, 1988.
[22] S. K. Debray and D. S. Warren. Functional computations in logic programs. ACM Transac-
tions on Programming Languages and Systems 11 (3): 451–481, 1989.
[23] P. Deransart and J. Małuszyński. Relating logic programs and attribute grammars. Journal
of Logic Programming 2 (2): 119–156, 1985.
[24] M. Falaschi et al. A new declarative semantics for logic languages. In Kowalski and Bowen
[46], pages 993–1005.
[25] M. Fitting. Bilattices and the semantics of logic programming. To appear in Journal of Logic
Programming.
[26] M. Fitting. A Kripke-Kleene semantics for logic programs. Journal of Logic Programming 2
(4): 295–312, 1985.
[27] J. Gallagher, M. Codish and E. Shapiro. Specialisation of Prolog and FCP programs using
abstract interpretation. New Generation Computing 6 (2,3): 159–186, 1988.
[28] H. Gallaire and J. Minker, editors. Logic and Databases. Plenum Press, 1978.
[29] P. Gardner and J. Shepherdson. Unfold/fold transformations of logic programs. Draft report,
30 pages. Dept. of Computer Science, University of Edinburgh, Scotland, 1989.
[30] A. Hansson and S.-Å. Tärnlund. Program transformation by data structure mapping. In
K. Clark and S.-Å. Tärnlund, editors, Logic Programming, pages 117–122. Academic Press,
1982.
[31] M. Hecht. Flow Analysis of Computer Programs. North-Holland, 1977.
[32] N. Heintze and J. Jaffar. A finite presentation theorem for approximating logic programs. To
appear in Proc. Seventeenth Ann. ACM Symp. Principles of Programming Languages, San
Francisco, California, 1990.
[33] D. Jacobs and A. Langen. Accurate and efficient approximation of variable aliasing in logic
programs. In Lusk and Overbeek [50], pages 154–165.
[34] D. Jacobs and A. Langen. Compilation of logic programs for restricted and-parallelism. In
H. Ganzinger, editor, Proc. ESOP 88 (Lecture Notes in Computer Science 300), pages 284–
297. Springer-Verlag, 1988.
[35] D. Jacobs and A. Langen. Static analysis of logic programs for independent and-parallelism.
Technical report 89-03, Computer Science Dept., University of Southern California, Los An-
geles, California, 1989.
[36] J. Jaffar and J.-L. Lassez. Constraint logic programming. In Proc. Fourteenth Ann. ACM
Symp. Principles of Programming Languages, pages 111–119. Munich, Fed. Rep. Germany,
1987.
[37] G. Janssens and M. Bruynooghe. An application of abstract interpretation: Integrated type
and mode inferencing. Report CW 86, Dept. of Computer Science, University of Leuven,
Belgium, 1989.
[38] J. Jensen. Generation of machine code in Algol compilers. BIT 5: 235–245, 1965.
[39] N. D. Jones. Flow analysis of lambda expressions. In S. Even and O. Kariv, editors, Proc.
Eighth Int. Coll. Automata, Languages and Programming (Lecture Notes in Computer Science
115), pages 114–128. Springer-Verlag, 1981.
[40] N. D. Jones and S. S. Muchnick. Flow analysis and optimization of Lisp-like structures. In
S. S. Muchnick and N. D. Jones, editors, Program Flow Analysis, pages 102–131. Prentice-
Hall, 1981.
[41] N. D. Jones and A. Mycroft. A stepwise development of operational and denotational seman-
tics for Prolog. In Proc. 1984 Int. Symp. Logic Programming, pages 289–298. Atlantic City,
New Jersey, 1984.
[42] N. D. Jones and A. Mycroft. Dataflow analysis of applicative programs using minimal function
graphs. In Proc. Thirteenth Ann. ACM Symp. Principles of Programming Languages, pages
296–306. St. Petersburg, Florida, 1986.
[43] N. D. Jones and H. Søndergaard. A semantics-based framework for the abstract interpretation
of Prolog. In Abramsky and Hankin [1], pages 123–142.
[44] T. Kanamori and T. Kawamura. Analyzing success patterns of logic programs by abstract
hybrid interpretation. ICOT TR-279, ICOT, Tokyo, Japan, 1987.
[45] S. C. Kleene. Introduction to Metamathematics. North-Holland, 1971. Originally published
by Van Nostrand, 1952.
[46] R. Kowalski and K. Bowen, editors. Logic Programming: Proc. Fifth Int. Conf. Symp. MIT
Press, 1988.
[47] K. Kunen. Negation in logic programming. Journal of Logic Programming 4 (4): 289–308,
1987.
[48] J.-L. Lassez, M. J. Maher and K. Marriott. Unification revisited. In Minker [72], pages 587–
625.
[49] J. W. Lloyd. Foundations of Logic Programming. Springer-Verlag, second edition 1987.
[50] E. L. Lusk and R. A. Overbeek, editors. Logic Programming: Proc. North American Conf.
1989. MIT Press, 1989.
[51] J. McCarthy. A basis for a mathematical theory of computation. In P. Braffort and
D. Hirschberg, editors, Computer Programming and Formal Systems, pages 33–70. North-
Holland, 1963.
[52] M. Maher. On parameterized substitutions. Unpublished note, 26 pages. IBM T. J. Watson
Research Center, Yorktown Heights, New York, 1986.
[53] H. Mannila and E. Ukkonen. Flow analysis of Prolog programs. In Proc. Fourth Symp. Logic
Programming, pages 205–214. San Francisco, California, 1987.
[54] K. Marriott. Finding Explicit Representations for Subsets of the Herbrand Universe.
Ph. D. Thesis, University of Melbourne, Australia, 1988.
[55] K. Marriott. Frameworks for abstract interpretation. In preparation.
[56] K. Marriott, L. Naish and J.-L. Lassez. Most specific logic programs. In Kowalski and Bowen
[46], pages 909–923.
[57] K. Marriott and H. Søndergaard. Analysis of constraint logic programs. To appear in S. K. De-
bray and M. Hermenegildo, editors, Logic Programming: Proc. North American Conf. 1990.
MIT Press, 1990.
[58] K. Marriott and H. Søndergaard. A tutorial on abstract interpretation of logic programs.
Unpublished note, 26 pages. IBM T. J. Watson Research Center, Yorktown Heights, New
York, 1989.
[59] K. Marriott and H. Søndergaard. Bottom-up abstract interpretation of logic programs.
In Kowalski and Bowen [46], pages 733–748.
[60] K. Marriott and H. Søndergaard. Bottom-up dataflow analysis of normal logic programs. To
appear in Journal of Logic Programming.
[61] K. Marriott and H. Søndergaard. On describing success patterns of logic programs. Technical
report 88/12, Dept. of Computer Science, University of Melbourne, Australia, 1988.
[62] K. Marriott and H. Søndergaard. On Prolog and the occur check problem. SIGPLAN Notices
24 (5): 76–82.
[63] K. Marriott and H. Søndergaard. Prolog program transformation by introduction of
difference-lists. In Proc. Int. Computer Science Conf. 88, pages 206–213. IEEE Computer
Society, Hong Kong, 1988.
[64] K. Marriott and H. Søndergaard. Semantics-based dataflow analysis of logic programs. In
G. X. Ritter, editor, Information Processing 89, pages 601–606. North-Holland, 1989.
[65] K. Marriott and H. Søndergaard. Top-down abstract interpretation of normal logic programs.
In preparation.
[66] K. Marriott, H. Søndergaard and P. Dart. A characterization of non-floundering logic pro-
grams. To appear in S. K. Debray and M. Hermenegildo, editors, Logic Programming:
Proc. North American Conf. 1990. MIT Press, 1990.
[67] K. Marriott, H. Søndergaard and N. D. Jones. Denotational abstract interpretation of logic
programs. In preparation.
[68] C. S. Mellish. Abstract interpretation of Prolog programs. In Shapiro [89], pages 463–474.
Revised version in Abramsky and Hankin [1], pages 181–198.
[69] C. S. Mellish. The automatic generation of mode declarations for Prolog programs. DAI
Research Paper No. 163, University of Edinburgh, Scotland, 1981.
[70] C. S. Mellish. Some global optimizations for a Prolog compiler. Journal of Logic Programming
2 (1): 43–66, 1985.
[71] A. Melton, D. Schmidt and G. Strecker. Galois connections and computer science applications.
In D. Pitt et al., editors, Category Theory and Computer Programming (Lecture Notes in
Computer Science 240), pages 299–312. Springer-Verlag, 1986.
[72] J. Minker, editor. Foundations of Deductive Databases and Logic Programming. Morgan Kauf-
mann, 1988.
[73] A. Mycroft. Abstract Interpretation and Optimising Transformations for Applicative Pro-
grams. Ph. D. Thesis, University of Edinburgh, Scotland, 1981.
[74] A. Mycroft. Logic programs and many-valued logic. In M. Fontet and K. Mehlhorn, editors,
Proc. STACS 84 (Lecture Notes in Computer Science 166), pages 274–286. Springer-Verlag,
1984.
[75] P. Naur. The design of the Gier Algol compiler, part II. BIT 3: 145–166, 1963.
[76] F. Nielson. A denotational framework for data flow analysis. Acta Informatica 18: 265–287,
1982.
[77] F. Nielson. Strictness analysis and denotational abstract interpretation. Information and
Computation 76 (1): 29–92, 1988.
[78] U. Nilsson. A Systematic Approach to Abstract Interpretation of Logic Programs. Ph. D. The-
sis, University of Linköping, Sweden, 1989.
[79] R. Paige and S. Koenig. Finite differencing of computable expressions. ACM Transactions on
Programming Languages and Systems 4 (3): 402–454, 1982.
[80] D. Plaisted. Abstract theorem proving. Artificial Intelligence 16: 47–108, 1981.
[81] D. Plaisted. The occur-check problem in Prolog. New Generation Computing 2 (4): 309–322,
1984.
[82] C. Pyo and U. S. Reddy. Inference of polymorphic types for logic programs. In Lusk and
Overbeek [50], pages 1115–1132.
[83] J. C. Reynolds. Automatic computation of data set definitions. In A. Morrell, editor, Infor-
mation Processing 68, pages 456–461. North-Holland, 1969.
[84] J. C. Reynolds. On the relation between direct and continuation semantics. In J. Loeckx,
editor, Proc. Second Int. Coll. Automata, Languages and Programming (Lecture Notes in
Computer Science 14), pages 141–156. Springer-Verlag, 1974.
[85] J. C. Reynolds. Transformational systems and the algebraic structure of atomic formulas.
In B. Meltzer and D. Michie, editors, Machine Intelligence 5, pages 135–151. Edinburgh
University Press, 1969.
[86] V. A. Saraswat. Concurrent Constraint Programming Languages. Ph. D. Thesis, Carnegie-
Mellon University, Pennsylvania, 1989.
[87] T. Sato and H. Tamaki. Enumeration of success patterns in logic programs. Theoretical Com-
puter Science 34: 227–240, 1984.
[88] D. Schmidt. Denotational Semantics: A Methodology for Language Development. Allyn and
Bacon, 1986.
[89] E. Shapiro, editor. Proc. Third Int. Conf. Logic Programming (Lecture Notes in Computer
Science 240). Springer-Verlag, 1986.
[90] J. Shepherdson. Negation in logic programming. In Minker [72], pages 19–88.
[91] M. Sintzoff. Calculating properties of programs by valuation on specific models. SIGPLAN
Notices 7 (1): 203–207, 1972. Proc. ACM Conf. Proving Assertions about Programs.
[92] H. Søndergaard. An application of abstract interpretation of logic programs: Occur check
reduction. In B. Robinet and R. Wilhelm, editors, Proc. ESOP 86 (Lecture Notes in Computer
Science 213), pages 327–338. Springer-Verlag, 1986.
[93] L. Sterling and E. Shapiro. The Art of Prolog: Advanced Programming Techniques. MIT
Press, 1986.
[94] H. Tamaki and T. Sato. OLD resolution with tabulation. In Shapiro [89], pages 84–98.
[95] H. Tamaki and T. Sato. Unfold/fold transformation of logic programs. In S.-Å. Tärnlund,
editor, Proc. Second Int. Conf. Logic Programming, pages 127–138. Uppsala, Sweden, 1984.
[96] S.-Å. Tärnlund. An axiomatic data base theory. In Gallaire and Minker [28], pages 259–289.
[97] J. Thom and J. Zobel, editors. NU-Prolog reference manual. Technical Report 86/10, Dept. of
Computer Science, University of Melbourne, Australia, revised edition 1987.
[98] M. van Emden and R. Kowalski. The semantics of logic as a programming language. Journal
of the ACM 23: 733–742, 1976.
[99] W. Winsborough. Automatic, transparent parallelization of logic programs at compile time.
Ph. D. Thesis, University of Wisconsin-Madison, Wisconsin, 1988.
[100] W. Winsborough. Source-level transforms for multiple specialization of Horn clauses (ex-
tended abstract). Technical report 88-15, Dept. of Computer Science, University of Chicago,
Illinois, 1988.
[101] W. Winsborough and A. Wærn. Transparent and-parallelism in the presence of shared free
variables. In Kowalski and Bowen [46], pages 749–764.
[102] J. Zhang and P. Grant. An automatic difference-list transformation algorithm for Prolog.
In Y. Kodratoff, editor, Proc. 1988 European Conf. Artificial Intelligence, pages 320–325.
Pitman, 1988.
Index
⊥X 5
⊤X 5
⊓Y 5
⊔Y 5
PX 5
∆F 7
Σ;F 7
F ↓ α 7
F ↑ α 7
X∗ 7
∝ 20
⊳ 36, 52
ι 9, 48
ǫ 52
β 53
σ 74
σ′ 74
⋆= 76
τ 79
τ ′ 79
app⋆ 82
app⋆⋆ 83
append⋆ 82
append⋆⋆ 82
appr 20, 21, 26
assign 60
Atom 8
Atomk 36
card 5
co 40
compl 34
cong 55
consequences 44
consistent 32
den 36
depth 36
dom 48
dom (for Psub) 54
Form 44
Fun 8
gfp 7
ground 9
Interp 31
Lcon 44
lcons 44
lfp 7
maximal 37
meet 54
meetgro 61
mgu 9, 48
mgugro 61
minimal 37
Par 52
pred 36
Pred 8
project 61
Prop 60
Psub 53
rename 49
restrict 49
restrict (for Psub) 54
rng 48
simE; [Q] 25
TP 9, 21
tass 61
Term 8
unify 54
unifygro 61
UP 33
varnil 76
vars 48
Abramsky, S. 14
abstract interpretation 2, 15
abstraction function 18
abstraction scheme 36
aliasing analysis 66
approximate computation 11
Apt, K. 9
ascending chain finite lattice 6
atom 8
atom abstraction 36
Barbuti, R. 46
base semantics 22
base semantics B 33
base semantics Pbas 55
bijective function 7
Birkhoff, G. 5
Bloch, C. 70
Bocvar, D. 32
body 8
bottom-up analysis 3, 28
Bruynooghe, M. 35, 65
call pattern 3, 29
call template 85
canonical abstraction scheme 37
canonicised abstraction scheme S∗ 37
chain 5
Church, A. 7
Clark, K. 70
clause 8
co-continuous function 7
co-inclusive predicate 7
Colmerauer, A. 70
compile time garbage collection 66
complete join-lattice 5
complete lattice 5
complete meet-lattice 5
computed answer 10
concretization function 15
consistent call template 85
continuous function 7
co-strict function 7
Cousot, P. and R. 2, 3, 11, 14, 16, 18, 19, 91, 93
Dart, P. 46, 60
dataflow semantics P 57
Debray, S. 22, 58, 63, 65, 66
definite program 8
delimiter 74
depth k abstraction 36
descending chain finite lattice 6
determinacy analysis 66
difference-list 69, 74
downwards closed insertion 58
extended semantics 14, 22
finite chain property 6
finite height lattice 6
Fitting, M. 3, 31–35, 44
fixpoint 7
fixpoint induction 7
floundering analysis 66
folded version of function 7
free difference-list 79
function
abstraction - 18
bijective - 7
co-continuous - 7
co-strict - 7
concretization - 15
continuous - 7
idempotent - 7
injective - 7
monotonic - 7
semantic - see “semantic functions”
strict - 7
Gallagher, J. 46
Galois insertion 18, 19
Giacobazzi, R. 46
Grant, P. 89
greatest fixpoint 7
greatest lower bound 5
ground syntactic object 8
groundness analysis 59
Hankin, C. 14
Hansson, A. 70
Hecht, M. 14
Heintze, N. 93
Herbrand model 9
idempotent function 7
identity substitution 9
immediate consequence function 9
inclusive predicate 7
independence analysis 66
injective function 7
insertion 19
downwards closed - 58
meet-closed - 59
Moore-closed - 59
upwards closed - 58
insertion adjoint 19
instance 9
instantiation ordering 36
interpretation
abstract - 2, 15
in three-valued logic 31
of dataflow semantics 58
of meta-language 24
type - 24
Jacobs, D. 63
Jaffar, J. 93
Jones, N. 14, 52, 58, 62, 63, 65
Kam, J. 14
Kanamori, T. 65
Kawamura, T. 65
Kildall, G. 14
Kleene, S. 3, 32
Kleene logic 32
Kleene sequence 7
Kowalski, R. 9
Kunen, K. 4, 31, 44
Langen, A. 63
Lassez, J.-L. 3, 31, 36, 37
lattice
ascending chain finite - 6
descending chain finite - 6
finite chain property 6
finite height - 6
join- 5
meet- 5
Noetherian - 6
lattice abstraction scheme 38
lax semantics Plax 56
least fixpoint 7
least upper bound 5
Levi, G. 46
lexicon 8
list 74
literal 8
Lloyd, J. 3, 8, 9
lower bound 5
Lukasiewicz, J. 32
magic set 63
Maher, M. 52
Marriott, K. 3, 31, 36–38, 46, 63, 64, 69
maximal element 5
McCarthy, J. 32, 46
McCarthy logic 32, 46
Mellish, C. 64, 65
meet-closed insertion 59
minimal element 5
minimal function graph 62
Mishra, P. 22, 63
mode 28
mode analysis 66
monotonic function 7
Moore-closed insertion 59
Moore family 18
Moore closure 19
Muchnick, S. 14
Mycroft, A. 14, 28, 44, 62–64
Naish, L. 3, 31, 36, 37
Naur, P. 2, 13
Nielson, F. 3, 11, 14, 22, 24, 91
Noetherian lattice 6
non-standard semantics NS 38
normal program 8
occur check 70, 73
occur check analysis 66
parameter 52
parametric substitution 52, 53
partial ordering 5
Plaisted, D. 93
poset 5
Post, E. 32
preordering 5
program specialisation 36, 42
program transformation 67
Prolog semantics P 45
pseudo-evaluation 13
Pyo, C. 93
query 8
Reddy, U. 93
renaming 49
representation by difference-list 74
resolvent 9
Reynolds, J. 14, 21, 25, 43
Sato, T. 3, 31, 35, 37
Schmidt, D. 3, 5
secure call 87
semantic functions
B 33
L 44
NS 38
O 49
Ocall 51
P (Prolog semantics) 45
P (dataflow semantics) 57
Pbas 55
Plax 56
S 44
Shapiro, E. 4, 63, 70, 71, 83, 85
Shepherdson, J. 45
simple representation 74
singleton abstraction 36
Sintzoff, M. 2, 14
SLD call semantics Ocall 51
SLD resolution 28
SLD semantics 48
SLD semantics O 49
SLDNF semantics S 44
Søndergaard, H. 37, 46, 52, 58, 63–65, 69
Sterling, L. 4, 69–71, 83, 85
strict function 7
strictness analysis 14
sublattice 5
substitution 9, 48
into Par 52
parametric 52, 53
success set 9
Tamaki, H. 3, 31, 35, 37
Tarjan, R. 14
Tärnlund, S.-Å. 70
term 8
three-valued logic semantics L 44
top-down analysis 3
type inference 67
type interpretation 24
Ullman, J. 14
unfold/fold transformation 89
upper bound 5
upwards closed insertion 58
van Emden, M. 9
variable 8, 52
Warren, D. S. 66
Winsborough, W. 63, 65
Zhang, J. 89