
Semantics-Based Analysis

and Transformation of Logic Programs

(Revised Report)

Harald Søndergaard

April 1990


Abstract

Dataflow analysis is an essential component of many programming tools. One use of dataflow

information is to identify errors in a program, as done by program “debuggers” and type checkers.

Another is in compilers and other program transformers, where the analysis may guide various

optimisations.

The correctness of a programming tool’s dataflow analysis component is usually of paramount

importance. The theory of abstract interpretation, originally developed by P. and R. Cousot, aims at

providing a framework for the development of correct analysis tools. In this theory, dataflow analysis

is viewed as “non-standard” semantics, and abstract interpretation prescribes certain relations

between standard and non-standard semantics, in order to guarantee the correctness of the non-

standard semantics with respect to the standard semantics.

The increasing acceptance of Prolog as a practical programming language has motivated wide-

spread interest in dataflow analysis of logic programs and especially in abstract interpretation. Logic

programming languages are attractive from a semantical point of view, but dataflow analysis

of logic programs is more complex than that of more traditional programming languages, since

dataflow is bi-directional (owing to unification) and control flow is in terms of backtracking.

The present thesis is concerned with semantics-based dataflow analysis of logic programs. We

first set up a theory for dataflow analysis which is basically that of abstract interpretation as intro-

duced by P. and R. Cousot. We do, however, relax the classical theory of abstract interpretation

somewhat, by giving up the demand for (unique) best approximations.

Two different kinds of analysis of logic programs are then identified: a bottom-up analysis yields

an approximation to the success set (and possibly the failure set) of a program, whereas a top-down

analysis yields an approximation to the call patterns that would appear during an execution based

on SLD resolution. We investigate some of the uses of the two kinds of analysis. In the bottom-up

case, we pay special attention to (bottom-up) type inference and its use in program specialisation.

In the top-down case, we present a generic “dataflow semantics” that encompasses many useful

dataflow analyses. As an instance of the generic semantics, a groundness analysis is detailed, and

a series of other applications are mentioned.

We finally present a transformation technique, based on top-down analysis, for introducing

difference-lists in a list-manipulating program, without changing the program’s semantics. This

may lead to substantial improvement in a program’s execution time.


Contents

1 Introduction

2 Preliminaries
  2.1 Lattices
  2.2 Functions
  2.3 Logic programs

3 Semantics-Based Analysis of Logic Programs
  3.1 Approximate computation
  3.2 Abstract interpretation
  3.3 Denotational abstract interpretation

4 Dataflow Analysis of Logic Programs

5 Bottom-Up Analysis of Normal Logic Programs
  5.1 Bottom-up semantics for logic programs
  5.2 Approximation of success and failure sets
  5.3 Applications and related work

6 Top-Down Analysis of Definite Logic Programs
  6.1 SLD semantics
  6.2 A semantics based on parametric substitutions
  6.3 A dataflow semantics for definite logic programs
  6.4 Approximating call patterns: Groundness analysis
  6.5 Other applications and related work

7 Difference-Lists Made Safe
  7.1 The difference-list problem
  7.2 Manipulating difference-lists
  7.3 Automatic difference-list transformation
  7.4 Discussion

8 Conclusion

A Correctness of the base semantics

Bibliography

Index


Preface

The present thesis is submitted in fulfilment of the requirements for the degree of licentiatus scien-

tiarum (Ph. D.) in computer science at the University of Copenhagen, Denmark. It contains eight

chapters of which the first is an introduction and the eighth is a summary. The theory of dataflow

analysis presented in Section 3.2 was developed together with Kim Marriott [60]. Chapter 5 is

based on joint work with Kim Marriott [59, 60], Chapter 6 on joint work with Kim Marriott and

Neil Jones [58, 67], and Chapter 7 on joint work with Kim Marriott [63].

This account should justify my use of the (nowadays somewhat discredited) academic “we”

throughout the thesis: an “I” would in most cases border on fraudulence. The thesis is of course

based on the work of several other people, but it would be impossible to list them all here—hopefully

I have given appropriate references and credit in the report.

Most of the thesis was written during my two-and-a-half-year visit to the Department of Com-

puter Science at Melbourne University. I would like to thank that department for its hospitality.

The visit was initially made possible by the Australian Department of Education’s Australian-

European Award Program, and I have subsequently received support from the Danish Research

Academy and the Danish Research Council for Natural Sciences. During my stay I held a scholar-

ship from the University of Copenhagen. I am grateful to all these institutions.

The thesis was finished while I visited Kim Marriott at IBM T. J. Watson Research Center and

Saumya Debray at the University of Arizona. Neil Jones supervised my work from Copenhagen,

and Rodney Topor was an invaluable mentor in Melbourne. My academic indebtedness to Neil

Jones and Kim Marriott will be apparent from the summary above but weighs little compared

to what they mean to me as friends. The same goes for Graeme Port, an always constructive

proof-reader, and for Peter Sestoft, who has been the highly reliable link to my home department,

and with whom I have collaborated (via electronic mail) on topics not covered in this thesis. I

am also indebted to Philip Dart, Saumya Debray, Dean Jacobs, Lee Naish, and Uday Reddy for

much stimulating input to my work. Finally I would like to thank Kamakshi Padmanabhan for her

support. A good “geographic fortune” has brought me together with these people.

Tucson, Arizona, November 1989


Preface to the revised version

This report is a revised version of my thesis of the same title, which was accepted for the degree of

licentiatus scientiarum (Ph. D.) in computer science at the University of Copenhagen, Denmark,

in December 1989.

The examiners made many useful remarks; in particular, thanks are due to Alan Mycroft for

his very valuable suggestions for improvements. These made me want to revise the paper slightly,

but a revision was really made necessary when Kim Marriott pointed out an error in the proof

of the theorem that stated the equivalence between the SLD semantics and the “base semantics”

in Section 6.2. The equivalence does hold, and a proof of full abstraction of the base semantics would

have to include a proof of the equivalence. Here, however, we make use of the fact that, for our

purposes, it suffices to prove “half” of the equivalence, that is, we prove that the base semantics

safely approximates SLD resolution in a sense that is made precise in the thesis. Even this part is

complicated, so we have chosen to move the proof into an appendix in order not to interrupt the

flow of presentation. In general, Chapter 6 has been brought closer to the presentation in Marriott, Søndergaard and Jones [67].

The remaining changes are mainly stylistic. Most effort has been put into expanding discussions where examiners found them too glib.

Melbourne, Easter 1990


Chapter 1

Introduction

Some of the most important programming tools studied in computer science are programs that

manipulate programs. Examples are interpreters, parsers, compilers, type-checkers and various

other error-finding devices, as well as partial evaluators and other kinds of program transformers.

The study and development of these tools will remain in the forefront of computer science research,

not only because of the paramount importance and complexity of the tools, but also because new

programming languages keep emerging, posing new challenges to tool construction.

In program transformation (with which we class compilation) the central problem is, broadly

formulated, given a program P , to generate a program P ′ which is in some sense equivalent to

P but which behaves better with respect to certain performance criteria. There are many well-

known solutions to this problem, especially in compilation: an object program produced by a

straightforward compilation algorithm is usually very inefficient, so most compilers include phases

that improve the generated code. Standard textbooks on compilation explain such transformations

as code motion, constant folding, induction variable elimination, and strength reduction.

Such automatic transformations may be construed as being made up of an analysis phase and

a subsequent synthesis phase where the information obtained by the analysis is utilised. Classical

analyses used by compilers include expression availability analysis, live variable analysis, and many

more.

These techniques are usually applied to (target) programs written in lower-level languages, but

similar techniques can be thought of for high-level languages. Stated generally, the purpose of

program analysis is to decide whether some invariant holds at some program point. It may thereby

be determined whether some transformation scheme is applicable, and possibly what the exact form

of the synthesis should be. The process of investigating such invariance is called static analysis, or,

as we prefer, dataflow analysis. Error-finding in programs, type inference, etc., may also be seen

as dataflow analysis.

Dataflow analyses are usually very complex and difficult to develop. Apart from trial and error

we only have very limited ways to convince ourselves of their correctness (or lack of correctness).


Fortunately this does not prevent us from building successful analysis tools. However, the scientific

answer to a complex practice that is hard to understand or manage is to build a theory. In computer

science, as in other natural sciences, a successful theory provides new points of view on a problem—

it may even be felt to have some explanatory element—and will hopefully, as a consequence, have

a positive influence on practice in turn.

In the case of dataflow analysis, an interesting view was suggested more than 25 years ago,

according to which dataflow analysis is “pseudo-evaluation,” that is, a process that somehow mimics

the normal execution of a program. Naur put this point of view to good use in explaining the type

checking component of the Gier Algol compiler, and Sintzoff later provided further examples of

its usefulness (we give references later). P. and R. Cousot formalised the idea by developing their

influential theory of abstract interpretation.

We later discuss abstract interpretation in great detail, but the following example may readily

convey the basic idea. Rather than using integers as data objects, a dataflow analysis may use

neg, zero, and pos to describe negative integers, 0, and positive integers, respectively. Then

by reinterpreting operations such as multiplication according to the “rules of signs,” the dataflow

analysis may establish certain properties of a program, such as “whenever control reaches this loop,

x is assigned a negative value.”

Abstract interpretation prescribes certain relations that should hold between a dataflow analysis

and the semantics of the programming language in question. If these relations hold, the dataflow

analysis is guaranteed to be correct. To formalise them, however, precise formal definitions of both

semantics and dataflow analysis are required. The analysis-as-pseudo-evaluation view is usually

reified as a strong similarity between the two definitions, a certain degree of congruence. This

naturally leads to viewing dataflow analysis as mere non-standard semantics.

Abstract interpretation illustrates the usefulness of formal semantics as a theoretical tool. A

formal definition of a full, complex programming language may well not be possible in practice, and

even if it is, it may itself be far too complex to be of any use. However, the possibility remains that

we can choose what we consider the essential part of a programming language and formalise its

semantics. By taking a formal definition as point of departure, we are led to solve a given dataflow

problem in a manner that is likely to be very different to a more ad hoc solution. We hope to

lend some credibility to that statement with the present thesis. The point is that, in striving for

congruence between a standard and a non-standard semantics, we are led to factor out issues of

execution order (flow of control) which is similar for the two, allowing us to focus on the relation

between the domains involved in their definitions, that is, between data. The solutions arrived at

this way are not necessarily better than ad hoc solutions, but there is a good chance that they

are, and in any case the process of formulating a particular dataflow analysis as a non-standard

semantics is bound to teach us more about the dataflow analysis problem.

Abstract interpretation of logic programs has gained considerable currency during the last few

years. The major impetus has been the quest for dataflow analyses that can improve code generation

in Prolog compilers. Logic programming languages are based on a principle of “separation of logic


and control,” which is desirable from a semantical point of view, but which also causes severe

problems for implementation. The lack of “control information” in programs provides a wide scope

for dataflow analysis of these languages and explains the currency that abstract interpretation has

gained in logic programming.

The theory has almost entirely been restricted to the case of definite logic programs executed

using SLD resolution with a standard (left-to-right) computation rule. Such a theory allows for

dataflow information to be propagated in a manner that resembles an SLD refutation of a query.

Analyses therefore yield information about call patterns that occur during the query evaluation

process. This information is exactly what a compiler needs to improve code generation, so the

major applications are in code improvement. We refer to this type of dataflow analysis as top-

down.

But other semantic models exist for logic programming languages. One of the best known is the

characterisation using an “immediate consequence function” TP , a semantics sometimes referred to

as “forward chaining” or “bottom-up.” A dataflow analysis developed from such a semantics gives

no information about call patterns; rather, it approximates a program’s “success set.” We refer to

this type of dataflow analysis as bottom-up analysis.

The present thesis is concerned with both kinds of dataflow analysis, and their applications.

Readers are expected to be familiar with the theory of logic programming, domain theory, and

denotational semantics at the level of the textbooks by Lloyd [49] and Schmidt [88], for example.

We have tried to comply with the terminology used in these two books.

Chapter 2 recapitulates some mathematical notions and notation used throughout the thesis.

Readers may find this chapter useful for reference, but it cannot serve as an introduction to the

notions.

Chapter 3 is concerned with semantics-based dataflow analysis. We present the idea underlying

pseudo-evaluation, or, as we prefer, approximate computation. The theory of dataflow analysis

which we introduce relaxes P. and R. Cousot’s theory of abstract interpretation somewhat, so we

take some care to argue its adequacy. We also present Nielson’s extension of P. and R. Cousot’s

theory, that is, his denotational abstract interpretation. Nielson’s work is a strong case in favour

of denotational semantics as semantic formalism.

Chapter 4 discusses dataflow analysis of logic programs more specifically. The aim is to link

the general ideas to the specific case of logic programming before going into the details, and in

particular to explain the distinction between bottom-up and top-down analysis informally.

Chapter 5 covers bottom-up analysis. We give a semantic definition for normal logic programs

(in which clause bodies may contain negation). This definition is based on Kleene (three-valued)

logic and similar to a definition proposed by Fitting. We detail two dataflow analyses based on our

semantics, both essentially type inferences, one based on “depth k” abstractions (due to Sato and

Tamaki), another on “singleton” abstractions (due to Marriott, Naish and Lassez). These analyses

are shown to be sound with respect to our semantics and useful for error-finding in programs and


for program specialisation. We also show that every dataflow analysis that is correct with respect to

our semantics is automatically correct with respect to a semantics proposed by Kunen, to SLDNF

resolution, and to (sound) Prolog semantics.

Chapter 6 covers the case of top-down analysis. We give a denotational definition of an SLD

resolution-based semantics for definite logic programs (in which clause bodies may not contain

negation) and prove it correct. Negation can well be handled in a denotational definition, but to

do this in a sound way would complicate definitions unreasonably, so as to obscure the issues under

study. This is the reason for restricting attention to definite programs. Step by step we transform

the semantics into a “dataflow” semantics that forms a basis for a class of dataflow analyses. We

detail one member of this class, a so-called groundness analysis and show that it is correct with

respect to our semantics. Finally we discuss other dataflow analyses and related work.

Chapter 7 contains a discussion of difference-lists and their use in transformation of list-

processing Prolog programs, as described by Sterling and Shapiro. The transformation is rather

complex and may be unsafe in certain circumstances. We sketch how dataflow analyses, similar to

those already discussed, may establish the absence of the unfortunate circumstances, thus making

automatic difference-list transformation possible for most interesting list-processing programs.

Chapter 8 contains a conclusion. We summarise the thesis and suggest a series of related issues

that would be worthy of more study.

There is a bibliography and an index at the end of the thesis. A note about numbering in the

thesis may be useful: within each chapter, we number examples, lemmas, theorems etc. using the

same counter. We find that this speeds up search for a particular example, lemma, etc.


Chapter 2

Preliminaries

In this chapter we recapitulate some basic notions and facts from domain theory and explain some

notation that will be used throughout the report. The chapter is not meant as an introduction

to the notions, merely as a reference for readers who may occasionally feel the need for a precise

definition. For a detailed introduction, readers are referred to a textbook, for example Schmidt’s [88]

or Birkhoff’s book on lattice theory [5].

2.1 Lattices

Let I denote the identity relation on a set X. A preordering on X is a binary relation R that

is reflexive (I ⊆ R) and transitive (R · R ⊆ R). A partial ordering is a preordering that is

antisymmetric (R ∩ R−1 ⊆ I). A set equipped with a partial ordering is a poset. Let (X,≤) be a

poset. A (possibly empty) subset Y of X is a chain iff for all y, y′ ∈ Y, y ≤ y′ ∨ y′ ≤ y.

Let (X,≤) be a poset. An element y ∈ Y ⊆ X is maximal in Y iff {y′ ∈ Y | y ≤ y′} = {y}.

Dually we may define minimality in Y . An element x ∈ X is an upper bound for Y iff y ≤ x for all

y ∈ Y . Dually we may define a lower bound for Y . An upper bound x for Y is the least upper

bound for Y iff, for every upper bound x′ for Y , x ≤ x′; when it exists, we denote it by ⊔Y . Dually we may define the greatest lower bound ⊓Y for Y .

A poset for which every subset possesses a least upper bound is a complete join-lattice. A poset

for which every subset possesses a greatest lower bound is a complete meet-lattice. A poset that has

both properties is a complete lattice. In particular, equipped with the subset ordering, the powerset

of X, denoted by P X, is a complete lattice. For a complete lattice X, we denote ⊔∅ = ⊓X by ⊥X and ⊓∅ = ⊔X by ⊤X . A poset for which every finite subset possesses a least upper bound and a greatest lower bound is a lattice. A sublattice of a (complete) lattice X is a subset of X which preserves the least upper bound and greatest lower bound operations of X.

From a dataflow analysis point of view there are a number of interesting special types of complete

lattices (we explain why shortly). Let card Y denote the cardinality of the set Y . The complete


lattice X

• is of finite height iff max {card Y | Y ⊆ X is a chain} is finite.

• has the finite chain property iff every chain Y ⊆ X is finite.

• is ascending chain finite (or Noetherian) iff every non-empty subset Y ⊆ X has an element

that is maximal in Y . Dually X may be descending chain finite.

(We will not be concerned with lattices in general, only complete lattices, but it may be of some

interest that if a lattice has any of the above properties then it is complete.) It is clear that a

complete lattice of finite height has the finite chain property, and one that has the finite chain

property is both ascending and descending chain finite. The following two examples show that the

relations between the classes are proper inclusions.

Example 2.1 Let X = ⋃n∈N {(n, j) | 1 ≤ j ≤ n} ∪ {⊥,⊤} be ordered by

• ⊥ ⊑ x for all x ∈ X.

• x ⊑ ⊤ for all x ∈ X.

• (n, j) ⊑ (n, j′) if j ≤ j′.

• For no other pair (x, x′) does x ⊑ x′ hold.

Then X has the finite chain property, but X is not of finite height.

Example 2.2 Let X = N ∪ {⊥} be ordered by

• ⊥ ⊑ n for all n ∈ N .

• n ⊑ n′ if n′ ≤ n.

• For no other pair (n, n′) does n ⊑ n′ hold.

Then X is ascending chain finite, but X does not have the finite chain property. A similar exam-

ple shows that there are lattices that are descending chain finite without having the finite chain

property.

2.2 Functions

Functions are generally used in their Curried form. Our notation for function application uses

parentheses sparingly. Only when it would seem to help the eye, shall we make use of redundant

parentheses. As usual, function space formation X → Y associates to the right, and function


application to the left. We occasionally use the lambda notation (due to Church) for functions and

we use “◦” for function composition.

Let F : X → Y be a function. Then F is injective iff F x = F x′ ⇒ x = x′ for all x, x′ ∈ X, and F is bijective iff there is a function F ′ : Y → X such that F ◦ F ′ and F ′ ◦ F are identity

functions. We define F ’s distributed version to be the function (∆F ) : P X → P Y , defined by

∆F Z = {F z | z ∈ Z}. A function F : X → X is idempotent iff F (F x) = F x for all x ∈ X.

For any set X let X∗ denote the set of finite sequences of elements of X. The empty sequence

is denoted by nil and we use the operator “:” for sequence construction. The folded version of a

function F : X → Y → Y is given by application of Σ : (X → Y → Y ) → X∗ → Y → Y . The

functional Σ is defined by

Σ F nil z = z

Σ F (x : y) z = Σ F y (F x z).
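Read operationally, Σ is simply a left fold that threads an accumulator through the sequence. The following Haskell fragment is a minimal sketch of ours, using lists for X∗:

    -- sigma threads an accumulator z through a sequence, applying f to
    -- each element in turn; it is foldl with its function argument flipped.
    sigma :: (x -> y -> y) -> [x] -> y -> y
    sigma _ []       z = z
    sigma f (x : xs) z = sigma f xs (f x z)

For instance, sigma (+) [1, 2, 3] 0 evaluates to 6.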

Let (X,≤) and (Z,⊑) be posets. A function F : X → Z is monotonic iff x ≤ x′ ⇒ F x ⊑ F x′ for all x, x′ ∈ X. In what follows, monotonicity of functions is essential, so much so that it

is understood throughout the report that X → Z denotes a space of monotonic functions. Let X

and Z be complete lattices. A function F : X → Z is strict iff F ⊥X = ⊥Z and continuous iff

for every non-empty chain Y ⊆ X, ⊔(∆F Y ) = F (⊔Y ). Dually we may define co-strictness and co-continuity.

co-continuity.

A fixpoint for a function F : X → X is an element x ∈ X such that x = F x. If X is a complete

lattice, then the set of fixpoints for (the monotonic) F : X → X is itself a complete lattice (though

not in general a sublattice of X). The least element of this lattice is the least fixpoint for F , and we

denote it by lfp F . Dually there is a greatest fixpoint for F , which we denote by gfp F . Furthermore,

defining

F ↑ α = ⊔{F ↑ α′ | α′ < α} if α is a limit ordinal,
F ↑ α = F (F ↑ (α − 1)) if α is a successor ordinal,

there is some ordinal α such that F ↑α = lfp F . Dually we may define F ↓ α for any ordinal α, and

again there is some ordinal α such that F ↓ α = gfp F . The sequence (F ↑ 0), (F ↑ 1), . . . , (lfp F ) is

the ascending Kleene sequence for F . Dually we may define the descending Kleene sequence.

The reason for our interest in the three classes of lattices mentioned above is that if a monotonic

function is defined on a lattice that is ascending chain finite then it has a finite ascending Kleene

sequence. Dually, if the lattice is descending chain finite then the function has a finite descending

Kleene sequence. This means that fixpoints can be computed in finite time.
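As an illustration, a least fixpoint on such a lattice can be computed by iterating the function from the least element until the Kleene sequence becomes stationary. A minimal Haskell sketch, under the stated assumptions (the names lfp and step are ours):

    import Data.List (nub, sort)

    -- Iterate f from the least element bot; termination is guaranteed
    -- when f is monotonic and the lattice is ascending chain finite.
    lfp :: Eq a => (a -> a) -> a -> a
    lfp f bot = go bot
      where go x = let x' = f x in if x' == x then x else go x'

    -- Example: finite sets of integers, represented as sorted,
    -- duplicate-free lists and ordered by inclusion; step closes a set
    -- under 0 and under successors below 3.
    step :: [Int] -> [Int]
    step s = nub (sort ([0] ++ s ++ [x + 1 | x <- s, x < 3]))

Here lfp step [] yields [0,1,2,3].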

Let X be a complete lattice. A predicate Q is inclusive on X iff for all (possibly empty) chains

Y ⊆ X, Q (⊔Y ) holds whenever (Q y) holds for every y ∈ Y . Dually Q is co-inclusive on X iff for all chains Y ⊆ X, Q (⊓Y ) holds whenever (Q y) holds for every y ∈ Y . Inclusive and

co-inclusive predicates are admissible in fixpoint induction. Assume that F : X → X is monotonic

and (Q x)⇒ Q (F x) for all x ∈ X. If Q is inclusive then Q (lfp F ) holds. If Q is co-inclusive then


Q (gfp F ) holds. We shall also use a slightly stronger version of the first case: clearly it suffices for

(Q x) to imply Q (F x) for all x ≤ lfp F . All cases of this induction principle are easily proved by

transfinite induction.

2.3 Logic programs

When speaking about logic programs, we seek to comply with the terminology used by Lloyd [49].

In matters of naming and reference we use italic capital letters as meta-variables (although we do

not restrict them to this use), and we employ the pair of brackets “[[” and “]]” for quasi-quotation

(as is common in denotational semantics).

Let Var , Fun, and Pred denote the disjoint syntactic categories of variables, functors, and

predicate symbols, respectively. (We call a collection Fun ∪ Pred a lexicon). The set Var is

assumed to be countably infinite. The sets Fun and Pred are assumed to be non-empty, and each

of their elements has an associated natural number which is its arity. From these sets, programs

and queries can be constructed as follows.

• The set Term of terms is defined recursively: every term is either a variable V ∈ Var or a

construction [[F (T1, . . . , Tn)]], where F ∈ Fun has arity n ≥ 0 and T1, . . . , Tn are terms. In

particular, a functor with arity 0 is a term.

• An atom is a construction [[Q(T1, . . . , Tn)]], where Q ∈ Pred has arity n ≥ 0 and T1, . . . , Tn

are terms. We let Atom denote the set of atoms.

• A literal is an atom A or a negation of an atom, that is, a construction [[¬ A]] where A ∈ Atom.

• A body is a conjunction of literals, that is, a construction [[L1, . . . , Ln]] where L1, . . . , Ln are

literals. Note that a body is finite and possibly empty.

• A clause consists of an atom A (its head) and a body B and is written [[A← B]]. We let

Clause denote the set of clauses.

• A normal program is a finite collection of clauses.

• A definite program is a normal program that is constructed without negation.

• A query is a conjunction of literals, written [[←L1, . . . , Ln]] where L1, . . . , Ln are literals. If L1, . . . , Ln are all atoms, the query is definite.
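These syntactic categories admit a direct rendering as datatypes. A minimal Haskell sketch of ours (the constructor names are ours, and arities are left implicit in the argument lists):

    type Var  = String
    type Fun  = String
    type Pred = String

    data Term    = V Var | F Fun [Term]   deriving (Eq, Show)
    data Atom    = A Pred [Term]          deriving (Eq, Show)
    data Literal = Pos Atom | Neg Atom    deriving (Eq, Show)

    type Body    = [Literal]              -- finite, possibly empty conjunction
    data Clause  = Atom :- Body           deriving (Eq, Show)
    type Program = [Clause]               -- normal program
    type Query   = [Literal]

    -- A program is definite iff no negated literal occurs in any body.
    definite :: Program -> Bool
    definite p = null [() | _ :- b <- p, Neg _ <- b]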

We let Prog denote the set of programs, normal or definite, depending on the context. We

assume that we are given a function vars : (Prog ∪ Atom ∪ Term) → P Var , such that (vars S) is

the set of variables that occur in the syntactic object S. A syntactic object S is ground iff it is

constructed without variables, that is, vars S = ∅. We let Her denote the Herbrand base, that is,

the set of ground atoms (for some fixed lexicon of functors and predicate symbols).


Logic programs can be given a clean model-theoretic semantics. Readers are referred to Lloyd [49] for details; let us merely recall that a (classical) model which is a subset of the Herbrand base (for some fixed lexicon) is called a Herbrand model.

A substitution is an almost-identity mapping θ ∈ Sub ⊆ Var → Term, that is, θ V = V for all but finitely many variables V . Substitutions are not distinguished from their natural extensions to other syntactic categories. Our notation for substitutions is standard. For instance {x ↦ a} denotes the substitution θ such that (θ x) = a and (θ V ) = V for all V ≠ x. We let ι denote the identity substitution.

An instance of a syntactic object S is an application θ S. For a syntactic object S, (ground S)

denotes the set of ground instances of S (for some fixed lexicon).

Given a definite program P , the immediate consequence function TP : P Her → P Her is defined by

TP U = {A ∈ Her | ∃C ∈ P . [[A←B]] ∈ ground C ∧ B ⊆ U}.

Here we have used the common convention of viewing a program P as a set of clauses C and a

body B as a set of atoms. Reading clauses as implications, TP gives, via modus ponens, those

consequences of a set of assumptions that can be deduced using each clause once. The function

TP is monotonic (in fact continuous), and so has a least fixpoint which happens to be the smallest Herbrand model for the program [98]. (The immediate consequence function is called T by van Emden and Kowalski [98]; the name TP is due to Apt and van Emden [2].) This model is also the program’s success set.
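When the Herbrand base is finite, the ascending Kleene sequence for TP can be computed directly. A minimal Haskell sketch for ground programs, with atoms as strings and clauses as head–body pairs (the example program and all names are ours):

    import Data.List (nub, sort)

    type GAtom   = String
    type GClause = (GAtom, [GAtom])   -- (head, body)

    -- One application of T_P: the heads of all (ground) clauses whose
    -- bodies are contained in the current set u of assumptions.
    tp :: [GClause] -> [GAtom] -> [GAtom]
    tp prog u = nub (sort [a | (a, body) <- prog, all (`elem` u) body])

    -- The least fixpoint, by Kleene iteration from the empty set.
    successSet :: [GClause] -> [GAtom]
    successSet prog = go []
      where go u = let u' = tp prog u in if u' == u then u else go u'

For the program consisting of p ← q, r and q and r ← q, that is, successSet [("p",["q","r"]), ("q",[]), ("r",["q"])], the iteration passes through ["q"] and ["q","r"] before reaching the success set ["p","q","r"].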

A unifier of A,H ∈ Atom is a substitution θ such that (θ A) = (θ H). If such a unifier exists,

then A and H are unifiable. A unifier θ of A and H is an (idempotent) most general unifier of A

and H iff θ′ = θ′ ◦ θ for every unifier θ′ of A and H. Two atoms A and H have a most general

unifier whenever they are unifiable. The auxiliary function mgu : Atom → Atom → P Sub is

defined as follows. If A and H are unifiable, then (mgu A H) yields a singleton set consisting of a

most general unifier of A and H. Otherwise (mgu A H) = ∅.
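Unification itself can be sketched concretely; two atoms unify iff they have the same predicate symbol and their argument lists unify pairwise, so it suffices to unify terms. The following minimal Haskell sketch of ours returns Nothing where (mgu A H) = ∅; it is naive (bindings are applied eagerly) but includes the occur check:

    import qualified Data.Map as M

    data Term = V String | F String [Term] deriving (Eq, Show)
    type Subst = M.Map String Term

    applyS :: Subst -> Term -> Term
    applyS s (V v)    = M.findWithDefault (V v) v s
    applyS s (F f ts) = F f (map (applyS s) ts)

    unify :: Term -> Term -> Maybe Subst
    unify t0 u0 = go [(t0, u0)] M.empty
      where
        go [] s = Just s
        go ((a, b) : rest) s = case (applyS s a, applyS s b) of
          (V v, t) | t == V v   -> go rest s                -- same variable
          (V v, t) | occurs v t -> Nothing                  -- occur check fails
                   | otherwise  -> go rest (bind v t s)
          (t, V v)              -> go ((V v, t) : rest) s   -- swap and retry
          (F f ts, F g us)
            | f == g && length ts == length us -> go (zip ts us ++ rest) s
            | otherwise         -> Nothing                  -- functor clash
        bind v t s = M.insert v t (M.map (applyS (M.singleton v t)) s)
        occurs v (V w)    = v == w
        occurs v (F _ ts) = any (occurs v) ts

For instance, unify (F "f" [V "x", F "a" []]) (F "f" [F "a" [], V "y"]) binds both x and y to a.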

Let G = ←A1, . . . , An be a query with selected atom Ai and let C = H ← A′1, . . . , A′k be a clause. If Ai and H are unifiable, then

θ[[A1, . . . , Ai−1, A′1, . . . , A′k, Ai+1, . . . , An]]

where {θ} = mgu Ai H, is a resolvent of G and C with unifier θ.

Let P be a definite program and let G be a query. An SLD derivation of P ∪ {G} consists of

• a maximal sequence G0, G1, . . . of negative clauses with G0 = G,

• a sequence C0, C1, . . . of fresh variants of clauses from P (that is, the variables in Ci are consistently replaced by variables not in C0, . . . , Ci−1, or G),

• a sequence θ0, θ1, . . . of substitutions


such that for all i, Gi+1 is a resolvent of Gi and Ci with unifier θi. An SLD derivation may be

finite or infinite. Assume it is finite, with final elements Gn+1, Cn, and θn. Then if Gn+1 is empty,

the derivation is successful, otherwise it is finitely failed. If it is successful, the computed answer is

θn ◦ . . . ◦ θ0, restricted to the set of variables occurring in G.
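The composition and restriction used here can be sketched in the same style (a minimal sketch of ours, with Term and Subst redeclared so that the fragment stands alone):

    import qualified Data.Map as M

    data Term = V String | F String [Term] deriving (Eq, Show)
    type Subst = M.Map String Term

    applyS :: Subst -> Term -> Term
    applyS s (V v)    = M.findWithDefault (V v) v s
    applyS s (F f ts) = F f (map (applyS s) ts)

    -- compose s' s denotes s' ◦ s: apply s first, then s'.
    compose :: Subst -> Subst -> Subst
    compose s' s = M.map (applyS s') s `M.union` s'

    -- Restrict a substitution to a given set of variables.
    restrict :: [String] -> Subst -> Subst
    restrict vs = M.filterWithKey (\v _ -> v `elem` vs)

The computed answer of a successful derivation is then obtained by folding compose over θn, . . . , θ0 and restricting the result to the variables of G, matching the definition above.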


Chapter 3

Semantics-Based Analysis of Logic

Programs

Our aim is to provide a theory for semantics-based dataflow analysis of logic programs. Fortunately

the area of dataflow analysis is well researched, so we can draw on several useful sources. In this

chapter we recapitulate as much of previous work as needed for the rest of the report. In Section 3.1

we present the idea of approximate computation on which P. and R. Cousot based their theory of

abstract interpretation. This theory is the most important influence on our work. We present the

theory in Section 3.2, together with a relaxation of it, based on a notion of “insertion.” In Section 3.3

we present (a variant of) Nielson’s powerful elaboration of P. and R. Cousot’s work, namely his

theory of denotational abstract interpretation. This provides us with the basic theoretical tools

needed in the remainder of the report. These tools are independent of any particular programming

language.

3.1 Approximate computation

Abstract interpretation captures an idea of performing approximate computations. Every program-

ming language L has some notion of “values,” such that an interpreter for L executes L-programs

by manipulating values. A pseudo-evaluator for L, on the other hand, performs approximate com-

putation: it manipulates (imprecise) descriptions of values in a way that is faithful to how an

interpreter would act.

Approximate computation is well-known from everyday use. One example is the casting out

of nines to check numerical computations, another is the application of the rules of signs, such as

“plus times minus yields minus.” The disadvantage of an approximate computation is, as a matter of course, that the results it yields are not in general as precise as those of a proper computation. But this is compensated for by the fact that an approximate computation is usually much faster than a proper computation.


mult  | neg   zero  pos
------+------------------
neg   | pos   zero  neg
zero  | zero  zero  zero
pos   | neg   zero  pos

Table 3.1: Multiplying signs

Our concern is automatic approximate computation of programs. In this case, the difference

in speed between proper and approximate computation may be exorbitant: a proper computation

may well fail to terminate where an approximate computation can finish in finite time.

We may say that an approximate computation is an evaluation of a formula or a program,

not over its standard domain U, but over a set D of descriptions of collections of objects in U.

The domain U may consist of integers, states, terms, sets of atomic formulas, substitutions, or

whatever, depending on how the semantics is modelled, and D is determined by what sort of

program properties we want to expose.

Of course, when performing an approximate computation, one must reinterpret all operators so

as to apply to descriptions rather than to proper values. As an example, let Z denote the set of

integers and consider the set of descriptions

D1 = {neg, zero, pos}

which may be used to approximate integers in the obvious way: neg is a proxy for all negative

integers, zero for 0, and pos for positive integers. Multiplication is reinterpreted as the function

mult : D1 × D1 → D1, defined by Table 3.1. This is an adequate interpretation, since whenever x ∗ y = z

and x, y ∈ Z are described by φ, φ′ ∈ D1 respectively, then z ∈ Z is described by mult (φ, φ′).

The imprecision inherent in descriptions may however cause problems. Consider the “addition”

of the descriptions neg and pos. We cannot express the sum as an element of D1. If we want to

mimic addition of integers, we have to include yet another, more imprecise description, ⊤, which

applies to all integers:

D2 = {neg, zero, pos,⊤}.

Now we may reinterpret addition as in Table 3.2.

add   | neg   zero  pos   ⊤
------+---------------------
neg   | neg   neg   ⊤     ⊤
zero  | neg   zero  pos   ⊤
pos   | ⊤     pos   pos   ⊤
⊤     | ⊤     ⊤     ⊤     ⊤

Table 3.2: Adding signs
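The two tables translate directly into a reinterpreted arithmetic on descriptions. A minimal Haskell sketch of ours of D2 and the operations of Tables 3.1 and 3.2 (mult is extended to ⊤ in the obvious way):

    data Sign = Neg | Zero | Pos | Top deriving (Eq, Show)

    -- Table 3.1, extended to Top; note that multiplication by Zero
    -- loses no information, even against Top.
    mult :: Sign -> Sign -> Sign
    mult Zero _    = Zero
    mult _    Zero = Zero
    mult Top  _    = Top
    mult _    Top  = Top
    mult a    b    = if a == b then Pos else Neg

    -- Table 3.2: the sum of Neg and Pos can only be described by Top.
    add :: Sign -> Sign -> Sign
    add Zero b    = b
    add a    Zero = a
    add a    b    = if a == b then a else Top

Adequacy is as stated for mult above: whenever x and y are described by φ and φ′, the product x ∗ y is described by mult φ φ′, and likewise the sum x + y by add φ φ′.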

Some points may be noted at this stage. First, in spite of their imprecision, approximate

computations, if properly designed, may well yield much useful information. Casting out nines or

using the rules of signs exemplify this, and more examples will appear later.

Second, we are not primarily interested in the results of computations. For program analysis

purposes, our main concern is to extract information about invariance at particular program points,

such as “at this point, variable x is always assigned positive values” or “this term becomes ground



during execution.” There are examples of analyses that yield useful invariance information, even

though what they return as a description of the final result may be poor.

Third, there is a degree of freedom in choosing the objects that are to serve as descriptions.

Exactly what they should be depends on the purpose of the analysis: we design them according to

the program properties that we want to expose and the precision that we want to obtain.

Fourth, we are usually interested in properties that are undecidable. Since we want our ap-

proximate computations to terminate (we want some information), it follows that the information

we get is necessarily imprecise. This is acceptable: a dataflow analysis need not tell the whole

truth, although it should of course not contradict truth. In this way, approximate computation

is to standard computation what numerical analysis is to mathematical analysis: using numeri-

cal techniques, problems with no known analytic solution can be “solved” numerically, that is, a

solution can be estimated within an interval of error.

Finally, the notion of approximate computation should not be confused with that of “profil-

ing” computation (or program “instrumentation” in general). A profiling computation also yields

information about a standard computation, but it does so by extending it rather than approximat-

ing it. There are two characteristics that distinguish profiling from approximate computation. A

profiling computation yields precise information about runtime behaviour, but it is not guaranteed

to terminate, whereas an approximate computation yields approximate information in finite time.

Furthermore, profiling may extract information about implementation-dependent behaviour, such

as the execution time spent in a particular procedure. Approximate computation, in our sense of

the term, is concerned only with purely semantic properties, that is, with runtime behaviour that

can be attributed to the programming language in question, rather than to a particular machine

on which it is implemented.

The idea of computing by means of descriptions for analysis purposes is not new. Naur very

early identified the idea and applied it in work on the Gier Algol compiler [75]. Naur coined the

term pseudo-evaluation for what would later be described as

“a process which combines the operators and operands of the source text in the manner

in which an actual evaluation would have to do it, but which operates on descriptions

of the operands, not on their values” [38].


The same basic idea is found in work by Reynolds [83] and by Sintzoff [91]. Sintzoff used it

for proving a number of well-formedness aspects of programs in an imperative language, and for

verifying termination properties.

By the mid seventies, efficient dataflow analysis had been studied rather extensively by re-

searchers such as Kam, Kildall, Tarjan, Ullman, and others (for references, see Hecht’s book [31]).

In an attempt to unify much of that work, a precise framework for discussing approximate computa-

tion (of imperative programs) was developed by Patrick and Radhia Cousot [13, 14]. The advantage

of such a unifying framework is that it serves as a basis for understanding various dataflow analyses

better, including their interrelation, and for discussing their correctness.

The overall idea of P. and R. Cousot was to define an “extended” semantics which associates

with each program point the set of possible storage states that may obtain at run-time whenever

execution reaches that point (P. and R. Cousot called this semantics a “static” semantics). A

dataflow analysis can then be construed as a finitely computable approximation to the extended

semantics. We detail this general idea in Chapters 5 and 6.

Of the example applications given by P. and R. Cousot we mention: program verification

(descriptions being predicates over program variables), performance analysis (descriptions being

positive real numbers representing the mean number of times a program point is reached, given

that probabilities have been attached to test nodes), and the finding of bounds for integer variables

(descriptions being intervals) [13].

The work of P. and R. Cousot has later been extended to declarative languages. Such extensions

are not straightforward. For example, there is no clear-cut notion of “program point” in functional

or logic programs. Also, dataflow analysis of programs written in a language such as Prolog

differs somewhat from the analysis of programs written in more conventional languages because

the dataflow is bi-directional, owing to unification, and the control flow is more complex, owing to

backtracking. We return to the case of logic programming in Chapter 4.

Of the applications of abstract interpretation in functional programming we mention work by

Jones [39] and by Jones and Muchnick [40] on termination analysis for lambda expressions and,

in a Lisp setting, improved storage allocation schemes through reduced reference counting. The

main application, though, has been strictness analysis, which is concerned with the problem of

determining cases where applicative order may safely be used instead of normal order execution.

The study of strictness analysis was initiated by Mycroft [73] and the literature on the subject is

now quite extensive, see for example Nielson [77].

We discuss the history and applications of abstract interpretation of logic programming lan-

guages later (Sections 5.3 and 6.5). For a general introduction to abstract interpretation and an

extensive list of references, readers are referred to the book edited by Abramsky and Hankin [1].



3.2 Abstract interpretation

Abstract interpretation formalises the idea of approximate computation. Assume we have a data

domain U and operators on U. In our (continued) example, U is Z, the set of integers. We also

have a set D of descriptions, in our case {neg, zero, pos,⊤}. To be precise about the relation

between values and descriptions, a concretization function γ : D → P U is defined. For every

description φ ∈ D, (γ φ) is the set of objects which φ describes. Thus γ is the semantic function

for descriptions.

Example 3.1 (Signs) We define γ : D2 → P Z by

γ neg = {z ∈ Z | z < 0}

γ zero = {0}

γ pos = {z ∈ Z | z > 0}

γ ⊤ = Z.

Example 3.2 (Parity) We have D = {even, odd,⊤} and define γ : D → P Z by

γ even = {z ∈ Z | z is even}

γ odd = {z ∈ Z | z is odd}

γ ⊤ = Z.

Note that descriptions are ordered according to how large a set of objects they apply to: the more

imprecise, the “higher” they sit in the structure. Example 3.2 allows us to illustrate the point that

application of an operator to a description of “everything,” ⊤, need not involve loss of information

(operators need not be co-strict). To see this, consider “multiplication” of parities as defined in

Table 3.3. Also note that the ordering on descriptions in a sense is opposite to the ordering used

in domain theory: for descriptions, the top element corresponds to total lack of information.

mult  | odd   even  ⊤
------+------------------
odd   | odd   even  ⊤
even  | even  even  even
⊤     | ⊤     even  ⊤

Table 3.3: Multiplication of parities

To see how a computation using the descriptions in Example 3.2 can proceed, consider the program in Figure 3.1. (Collatz’s problem in number theory amounts to determining whether this program terminates for all n ∈ N .)

[Figure 3.1: A flow diagram. The program reads n and loops: the test n = 1 leads to stop on “yes”; otherwise the test even n selects between n := n/2 and n := 3n + 1; program points 1–7 label the edges.]

Program points are numbered edges. Descriptions can be propagated in the graph by appropriately interpreting the commands. In the parity example, the fact that n is odd at program point 5 is used to conclude that n is even at point 6. Assuming n is ⊤ at point

1, we get the following descriptions of n at other points: at 2 it is ⊤, at 3 it is even, at 4 it is ⊤,

at 5 it is odd, at 6 it is even, and at 7 it is odd. This information justifies a transformation of the

program so as to avoid a number of parity tests: the statement n := 3n + 1 can be replaced by

n := (3n + 1)/2. However, the information cannot, as we shall see, be used to conclude that the

program terminates. In the signs example, assuming n is pos at program point 1, n will be pos at

every point.

We follow P. and R. Cousot in demanding that the set of descriptions, D, forms a complete

lattice (that was not the case in our previous examples, but this will soon be rectified). The use

of complete lattices facilitates simple fixpoint characterisations of the non-standard semantics, as

we shall later see. In the Cousot framework, the semantics of a description is given by a function

γ : D → P U , such that (γ φ) is the set of objects described by φ ∈ D. The function γ should be

injective, as we do not want redundant descriptions: once we have a name for a particular set of

objects, that suffices. We also want γ to be co-strict since there should be a description for every

collection of objects, in particular we want a name for “everything.” This is a modest demand: in

the signs example we saw how it was inevitable, because of the inherent imprecision in descriptions.

The intuition behind organising the set of descriptions as a complete lattice is as follows. The

concretization function γ should be monotonic. In fact we let the partial ordering ⊑ on D be

defined by φ ⊑ φ′ iff (γ φ) ⊆ (γ φ′). As already mentioned we can read φ ⊑ φ′ as: whatever φ

describes is (also, though maybe not as precisely) described by φ′. Since D is a complete lattice, it holds that for every subset D′ ⊆ D there exists a unique element ⊔D′ ∈ D such that

∀φ ∈ D′ . φ ⊑ ⊔D′, (3.1)

(∀φ ∈ D′ . φ ⊑ φ′) ⇒ ⊔D′ ⊑ φ′. (3.2)

During the execution of a program, the same program point may be reached many times, but with

different descriptions of the “current state.” Now if D′ is the set of descriptions that might occur


at some program point, then ⊔D′ must be the best overall description for the point. Namely, (3.1) states that ⊔D′ describes everything described by members of D′, and (3.2) ensures that ⊔D′ is as precise as can be.

Let us give some examples of the use of lattices as description domains.

Example 3.3 (Void) We take as descriptions D = {⊤} and define γ ⊤ = U . That is, there is

only one description, ⊤, and ⊤ is very imprecise since it applies to every value in U . This gives

rise to a non-standard semantics which gives no useful dataflow information.

Example 3.4 (Reachability) Let D = {⊥,⊤}, γ ⊥ = ∅, and γ ⊤ = U . This simple domain can

be used in a so-called reachability analysis: unreachable program points are described by ⊥, and

possibly reachable points by ⊤.

Example 3.5 (Signs revisited) Let D = {⊥, neg, zero, pos,⊤}, and let the ordering ⊑ be de-

fined by φ ⊑ φ′ iff φ = ⊥ ∨ φ = φ′ ∨ φ′ = ⊤. The concretization function is γ from Example 3.1,

extended so that (γ ⊥) = ∅.

Example 3.6 (Overlapping signs) Let D = {⊥, nonpos, nonneg,⊤}, and let the ordering ⊑ be

defined as in Example 3.5. We define γ : D → P Z by

γ ⊥ = ∅

γ nonpos = {z ∈ Z | z ≤ 0}

γ nonneg = {z ∈ Z | z ≥ 0}

γ ⊤ = Z.

Examples 3.5 and 3.6 illustrate an important point. To form a complete lattice, that is, for ⊔∅ to exist, we had to add a least element, ⊥. There is nothing in our previous discussion that tells us

what (γ ⊥) should be, but a natural choice is to let γ be strict, that is, (γ ⊥) = ∅. For dataflow

analysis this is useful, because it means that the presence of ⊥ at some program point conveys the

(precise!) information that execution never reaches the point.

Example 3.7 (Accumulating semantics) Here D = P U , the powerset of U . This is the most

fine-grained set of descriptions possible. The ordering ⊑ is the subset ordering, and ⊔ is distributed union. The concretization function γ is the identity function. This gives rise to a non-standard

union. The concretization function γ is the identity function. This gives rise to a non-standard

semantics which is called the accumulating semantics and which simply gathers the values occurring

at each program point.

Clearly, the finer-grained the descriptions we use, the better the dataflow analysis possible. On the other

hand, considerations of finite computability and efficiency put a limit on granularity. It is common

to use a lattice that is ascending chain finite for descriptions. The reason is that if the analysis


process can be expressed as repeated application of a monotonic function on such a lattice then its

termination is guaranteed.

P. and R. Cousot make one final demand about γ, which we have not yet mentioned and which

we shall sometimes leave unsatisfied, namely that there should always be a best approximation.

Consider a set S ⊆ U . Let W = {S′ ∈ ∆ γ D | S ⊆ S′}, that is, W corresponds (via γ) to the

set of valid descriptions of S. Since γ is co-strict, W is non-empty. Now assume that (∆ γ D) is

closed under intersection. Then there is a best description of S, namely the one corresponding to ⋂W . If we renounce the assumption about closure under intersection, we will in general have a

set of equally good (optimal) descriptions. Whether this is acceptable is perhaps a matter of taste.

P. and R. Cousot do not find it satisfactory that, given a set of equally good descriptions, one can

simply choose an arbitrary element. This is because, in the context of some particular program

to be analysed, the choices are not equally good, and the best choice turns out to vary from one

program to another. P. and R. Cousot give examples of this and consider the assumption about

closure under intersection “a very reasonable assumption” avoiding “program-dependent” analysis

methods [14].

Taken together, the assumptions about co-strictness and closure under intersection imply that (∆ γ D) is a Moore family [5], and hence itself a complete lattice (though not in general a sublattice of P U ). This

is the case in all our previous examples except Example 3.6. In that example there is no best

approximation to {0}: nonpos and nonneg are equally good. We shall later see another example of

this phenomenon in a context of dataflow analysis of logic programs.

Let the function α : P U → D be defined by

α ψ = ⊓{φ ∈ D | ψ ⊆ γ φ}.

Clearly α is well-defined and monotonic. It is not hard to see that given the assumption about

closure under intersection, (α ψ) is the best (that is, least) description that applies to all elements

of ψ. We thus arrive at the classical formulation of abstract interpretation in terms of Galois

insertions [71]: in addition to the concretization function, an abstraction function α should exist,

such that

∀φ ∈ D . φ = α (γ φ),

∀ψ ⊆ U . ψ ⊆ γ (α ψ).

Note that, when restricted to (∆ γ D), α is the inverse of γ and that {{ψ ⊆ U | α ψ = φ} | φ ∈ D}

partitions P U .
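On finite sets of values, α can be computed by abstracting each element and joining the results. A minimal Haskell sketch of ours for the signs lattice of Example 3.5 (with the bottom element included):

    data Sign = Bot | Neg | Zero | Pos | Top deriving (Eq, Show)

    -- Least upper bound in the flat lattice of Example 3.5.
    lub :: Sign -> Sign -> Sign
    lub Bot b   = b
    lub a   Bot = a
    lub a   b   = if a == b then a else Top

    -- Abstraction of one integer, and of a finite set of integers.
    absInt :: Integer -> Sign
    absInt n = case compare n 0 of LT -> Neg; EQ -> Zero; GT -> Pos

    alpha :: [Integer] -> Sign
    alpha = foldr (lub . absInt) Bot

Thus alpha [] is Bot, alpha [1,2,3] is Pos, and alpha [-3,5] is Top; for finite sets this agrees with the definition α ψ = ⊓{φ ∈ D | ψ ⊆ γ φ}, and the insertion laws (3.3) and (3.4) can be spot-checked against it.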

Following P. and R. Cousot [13] we now generalise the above discussion somewhat, letting

the codomain of γ be a complete lattice rather than a powerset. The reason is that a semantic

definition usually includes many different domains, not necessarily powersets. In this way we obtain

a category having complete lattices as objects and “insertions” (see below) as arrows: (idD,D,D)

is the identity arrow on D and (γ ◦ γ′,D,E) is the composite arrow of (γ,D′, E) and (γ′,D,D′).


Definition. An insertion is a triple (γ,D,E) where D and E are complete lattices and (the

monotonic) γ : D → E is injective and co-strict.

Definition. Let D and E be complete lattices. Let γ : D → E and α : E → D be (monotonic)

functions. Then α is γ’s insertion adjoint iff

∀ d ∈ D . d = α (γ d) (3.3)

∀ e ∈ E . e ≤ γ (α e), (3.4)

where ≤ is the ordering on E. We call the quadruple (γ,D,E, α) a Galois insertion [71].

Our definition of an “insertion adjoint” is narrower than that of an adjunction in category theory,

where D and E may be arbitrary preordered sets and “=” in (3.3) is replaced by the ordering on D. However, (3.3) and (3.4) correspond to what P. and R. Cousot use in their original paper on abstract

interpretation [13]. As already mentioned, we shall not in general assume that γ has an insertion

adjoint α, but when it has, α and γ uniquely determine each other.

The following definition allows us to characterise when an abstraction function exists.

Definition. Let X be a complete lattice and Y ⊆ X. Then Y is Moore-closed (in X) iff

∀Y ′ ⊆ Y . ⊓Y ′ ∈ Y .

Lemma 3.8 (P. and R. Cousot) Let (γ,X, Y ) be an insertion. Then γ has an adjoint iff (∆ γ X)

is Moore-closed.

Proposition 3.9 Let (γ,D,E, α) be a Galois insertion. Then

1. γ is co-continuous,

2. α is continuous.

Proof: Let ⊑ be the ordering on D and ≤ that on E.

(1) Let X ⊆ D be a chain. Then Y = ∆ γ X is a chain in E. We must show that γ (⊓X) = ⊓Y .

We have:

∀x ∈ X .⊓Y ≤ γ x, thus, by monotonicity of α and (3.3),

∀x ∈ X . α (⊓Y ) ⊑ x, thus

α (⊓Y ) ⊑ ⊓X, thus, by monotonicity of γ and (3.4),

⊓Y ≤ γ (⊓X).

Furthermore, by monotonicity of γ, ∀x ∈ X . γ (⊓X) ≤ (γ x), so γ (⊓X) ≤ ⊓Y . Therefore

γ (⊓X) = ⊓Y , that is, γ is co-continuous.


(2) Let X ⊆ E be a chain. Then Y = ∆ α X is a chain in D. We must show that α (⊔X) = ⊔Y .

We have:

∀x ∈ X . α x ⊑ ⊔Y , thus, by monotonicity of γ and (3.4),

∀x ∈ X . x ≤ γ (⊔Y ), thus

⊔X ≤ γ (⊔Y ), thus, by monotonicity of α and (3.3),

α (⊔X) ⊑ ⊔Y .

Furthermore,

∀ y ∈ Y . ∃x ∈ X . y ⊑ α x, thus, by monotonicity of α,

∀ y ∈ Y . y ⊑ α (⊔X), thus

⊔Y ⊑ α (⊔X).

So α (⊔X) = ⊔Y , that is, α is continuous.

In accordance with the observation that “larger” descriptions naturally correspond to decreased

precision, we now define what it means for d ∈ D to safely approximate e ∈ E.

Definition. Let (γ,D,E) be an insertion. We define apprγ : D × E → Bool by

apprγ (d, e) iff e ≤ γ d,

where ≤ is the ordering on E.

Thus apprγ (d, e) reads “d approximates e under γ.” Since γ will always be clear from the context,

we shall omit the subscript and simply denote the predicate by appr. For semantic functions we

shall normally use the symbol ∝ to denote the approximation relation:

Definition. Let F : Prog → D and F′ : Prog → D′ be semantic functions, and let (γ,D,D′) be an

insertion. Then F ∝ F′ iff appr (F [[P ]],F′ [[P ]]) holds for all P ∈ Prog .

Lemma 3.10 Let (γ,D,E) be an insertion. Then

1. appr is inclusive on D × E, ordered componentwise,

2. if γ is co-continuous then appr is co-inclusive on D × E.

Proof: Let Y ⊆ D × E be a chain, and let ≤ be the ordering on E. Assume that appr (d, e) holds

for all (d, e) ∈ Y , that is, e ≤ γ d.

(1) Let d0 = ⊔{d | (d, e) ∈ Y } and e0 = ⊔{e | (d, e) ∈ Y }. Clearly (d0, e0) = ⊔Y . By monotonicity of γ then, e ≤ γ d ≤ γ d0 for all (d, e) ∈ Y . So e0 ≤ γ d0, that is, appr (d0, e0) holds.

Thus appr is inclusive.

(2) Let d0 = ⊓{d | (d, e) ∈ Y } and e0 = ⊓{e | (d, e) ∈ Y }. Then (d0, e0) = ⊓Y . Clearly

e0 ≤ γ d for all (d, e) ∈ Y , and so e0 ≤ ⊓{γ d | (d, e) ∈ Y }. Since γ is co-continuous, e0 ≤ γ d0,

that is, appr (d0, e0) holds. Thus appr is co-inclusive.


Definition. We extend appr from the domain D × E to (D → D) × (E → E) by defining:

appr ((λ x . F ′ x), (λ x . F x)) iff ∀ (d, e) ∈ D ×E . appr (d, e)⇒ appr ((F ′ d), (F e)).

We thus in fact have a series of relations “appr,” but in what follows, the “type” of appr should

always be clear from the context. This treatment of appr is similar to Reynolds’s use of relational

functors [84]. In Section 3.3 we shall extend appr to other kinds of domains.

Proposition 3.11 Let (γ,D,E) be an insertion and let F : E → E and F ′ : D → D be such that

appr (F ′, F ) holds. Then

1. appr (lfp F ′, lfp F ) holds,

2. appr (F ′ ↓ n, gfp F ) holds for all n ∈ ω,

3. if γ is co-continuous then appr (gfp F ′, gfp F ) holds.

Proof: (1) Let G : (D ×E)→ (D × E) be defined by G (d, e) = (F ′ d, F e). Then G is monotonic

and lfp G = (lfp F ′, lfp F ). By Lemma 3.10, we can reason about appr by using fixpoint induction

on G. Assume appr (d, e) holds. Since appr (F ′, F ) holds, so does appr (F ′ d, F e). So appr (d, e)

implies appr (G (d, e)). Therefore appr (lfp G) holds, that is, appr (lfp F ′, lfp F ) holds.

(2) The proof is by finite induction. Clearly appr (⊤D, gfp F ) holds. Assume appr (F ′ ↓ (n − 1), gfp F ) holds. Since appr (F ′, F ) holds, so does appr (F ′ (F ′ ↓ (n − 1)), F (gfp F )), that is, appr (F ′ ↓ n, gfp F ). Therefore appr (F ′ ↓ n, gfp F ) holds for all n ∈ ω.

(3) The proof is dual to (1).

Corollary 3.12 If γ has an insertion adjoint then appr (gfp F ′, gfp F ) holds.

Proof: The assertion follows directly from Proposition 3.9 item 1 and Proposition 3.11 item 3.

The reason for our interest in fixpoints is the fact that semantics for logic programs is very ade-

quately given as fixpoint characterisations. This applies both to the well-known TP characterisation

[2, 98] and to denotational definitions. More precisely, the idea is to have the “standard” semantics

of program P given as lfp F for some function F , and to have dataflow analyses defined in terms of

“non-standard” functions F ′, approximating F . We can then use Proposition 3.11 to conclude that

all elements of lfp F have some property Q, provided all elements of γ (lfp F ′) have the property

Q. In other words, lfp F ′ provides us with approximate information about the standard semantics

lfp F . As we shall see in Section 5.2, knowledge about gfp F may also be useful, and (F ′ ↓ n), and

sometimes gfp F ′, can provide this. In this way dataflow analyses are nothing but approximations

to the standard semantics. Since it is preferable that approximations are finitely computable, the

approximating function F ′ and the description domain D are usually chosen in such a way that

the Kleene sequences for F ′ are finite.
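When the description domain is finite, for example, lfp F ′ is computable by straightforward iteration from the bottom element. A minimal sketch (the function names and the toy powerset domain are ours, chosen for illustration):

    def lfp(f, bottom):
        """Least fixpoint of a monotonic f by ascending Kleene iteration;
        terminates whenever the Kleene sequence for f is finite."""
        x = bottom
        while f(x) != x:
            x = f(x)
        return x

    U = frozenset(range(5))
    def step(s):                         # a monotonic function on P(U)
        return s | frozenset({0}) | frozenset(x + 1 for x in s if x + 1 in U)
    print(lfp(step, frozenset()))        # -> frozenset({0, 1, 2, 3, 4})

The sequence (F ′ ↓ n), and hence gfp F ′ when the descending sequence stabilises, is obtained dually by iterating from the top element.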


Obviously, we usually want a dataflow analysis to be as precise as possible, in the sense of

making the best possible use of available information. Letting F ′ map every element of D to ⊤D

clearly leads to a dataflow analysis that is correct, but useless. We note that if an abstraction

function α : E → D exists, then a best safe approximating function F ′ exists, namely the function

defined by F ′ = α ◦ F ◦ γ. However, this property is not preserved under composition of functions.
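Continuing the hypothetical sign-domain sketch above, the induced best approximation α ◦ F ◦ γ can be written down directly (again a sketch only; the concrete "successor" function F is an arbitrary choice for illustration):

    U = frozenset(range(-3, 4))
    gamma = {'bot': frozenset(), 'zero': frozenset({0}),
             'nonpos': frozenset(x for x in U if x <= 0),
             'nonneg': frozenset(x for x in U if x >= 0), 'top': U}
    def alpha(psi):                      # least description of psi, as before
        cands = [d for d in gamma if psi <= gamma[d]]
        return min(cands, key=lambda d: len(gamma[d]))
    def F(psi):                          # concrete: pointwise successor on U
        return frozenset(x + 1 for x in psi if x + 1 in U)
    def F_best(d):                       # F' = alpha o F o gamma
        return alpha(F(gamma[d]))
    print(F_best('zero'))                # -> 'nonneg'
    print(F_best('nonpos'))              # -> 'top': the sign of x+1 is unknown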

Most of the above discussion has been too simplistic in one respect. For dataflow analysis

purposes we are normally interested in a semantics that is somewhat more complex than the “base”

semantics which simply specifies a program’s “input-output” relation. The reason is that we are

looking for invariants (such as “x is always positive”) that hold at program points, not just results

of computations. (It is not obvious what constitutes a “program point” in a logic program—we

return to that in Chapter 4.) An extended semantics is obtained by extending the base semantics so

that for each program, all run-time states associated with each of its program points are recorded.

In this connection, the base semantics may be viewed as a degenerate extended semantics that has

only one program point, namely the “end” of the program. We present an extended semantics in

Section 6.1.

3.3 Denotational abstract interpretation

Abstract interpretation is powerful because it is semantics-based and thus concerned with a pro-

gramming language as a whole rather than the analysis of particular programs. But the generality

of abstract interpretation can be taken further. Assume we are given a language in which one

can express the semantics of a wide variety of programming languages. The theory of abstract

interpretation may then be developed in the framework of this meta-language once and for all. In

this way, we need not “reinvent” abstract interpretation for all different kinds of languages—each

case is just a special instance of the general theory. Figure 3.2 illustrates the point.

A general, language-independent framework has been developed by Nielson [76, 77]. Following

Nielson, we use the formalism of denotational semantics. This gives us a meta-language which is

simple but powerful (including for instance a least fixpoint operator), well-understood, and widely

used. Its versatility is mainly due to the ease with which it accommodates semantic definitions at

any level of abstraction one may want, allowing for good intellectual economy. The usefulness of

denotational definitions as bases for analysis and transformation of logic programs has been argued

by Debray and Mishra [20]. Denotational definitions use semantic functions that are total mappings

from phrases to their denotations, which in particular allows for easy modelling of non-termination.

Equally important, the semantic functions are homomorphic, or compositional, which facilitates

structural induction in proofs of program properties.

The usefulness of denotational semantics should be apparent from Chapters 5 and 6: standard

and non-standard semantics are easily presented in the meta-language. In fact, by choosing the

right level of abstraction, they can be made highly congruent: if the standard semantics employs

a certain operator on the standard domain, the non-standard semantics should use a very similar

operator on the corresponding non-standard domain.

[Figure 3.2: The role of the meta-language (after Nielson). Semantic equations map the programming language into the meta-language; a standard interpretation then gives the standard denotation (the standard semantics), while a non-standard interpretation gives the non-standard denotation (the dataflow semantics).]

Expressing standard and non-standard semantics in the same meta-language also supports a

derivational approach to the development of dataflow analyses. The definition of a dataflow analysis

should be easily derivable from that of the standard semantics. Only in a second stage should the

analysis be implemented from its definition. Such a stepwise approach may be preferable to the

task of proving some baroque dataflow procedure correct with respect to a semantic definition.

Finally, using the same meta-language for standard and non-standard semantics means that most

correctness discussions can be conducted once and for all in the setting of the meta-language.

The idea behind the following is exactly as in Section 3 of Nielson’s treatment [77], but for

the sake of completeness, and since our meta-language differs in several respects from Nielson’s

“TMLb,” we redo some of his work in our setting. Another reason for giving a detailed treatment

is that, usually, in denotational semantics the emphasis is on chain-complete posets and continuous

functions, notions that we do not need here.

Correctness proofs for dataflow analyses given later hinge on the following results. On first

reading, readers not concerned with proof details may choose to skip the rest of this section, or

merely skim it.


The meta-language is one of typed lambda expressions, and the types are given by

E ∈ Exp ::= S | L

L ∈ Lat ::= D | E → L

where S ∈ Stat, D ∈ Dyn, Stat is a collection of static types, and Dyn is a collection of dynamic

types. The difference between the two kinds is that (the interpretation of) a static type remains the

same throughout all (standard and non-standard) semantics, whereas a dynamic type may change.

We call Stat ∪ Dyn the collection of base types, and Lat is the collection of lattice types. The

syntax of the meta-language is given by

e ::= ci (base functions)

| xi (variables)

| λ x : E . e (function abstraction)

| e e′ (function application)

| if e then e′ else e′′ (conditional)

| lfp e (least fixed point operation)

| ⊔x∈e′ e (least upper bound operation)

In addition we use some (standard) extensions, such as “where clauses.” There is a notion of

well-typing for this language, but as it is straightforward and merely a simple modification of

Nielson’s [77], we omit the definition. Static types are interpreted as posets (ordered by identity)

and dynamic types as complete lattices. A type interpretation I thus assigns a structure I E to

each base type E. The semantics of types is defined by natural extension as follows:

I [[S]] = I S (a (fixed) poset, ordered by identity)

I [[D]] = I D (some complete lattice)

I [[E → L]] = I [[E]]→ I [[L]] (ordered pointwise).

As usual, the “→” on the right-hand side denotes monotonic function space. By this, I [[L]] is a

complete lattice for every L ∈ Lat.

By an abuse of notation, an interpretation I denotes a type interpretation (also called I) together

with an assignment of an element of I [[E]] to each base function c of type E. By natural extension

this gives the semantics of the meta-language. The denotation of an expression e is relative to a type

environment tenv and a type E such that tenv ⊢ e : E (for details, see [77]). Let the domain of tenv

be {x1, . . . , xk} and let tenv xi = Ei for i ∈ {1, . . . , k}. Then I [[e]] : I [[E1]]× . . .× I [[Ek]]→ I [[E]]


is defined by

I [[ci]] (v1, . . . , vk) = I ci

I [[xi]] (v1, . . . , vk) = vi

I [[λ x : E . e]] (v1, . . . , vk) = λ v . I [[e]] (v1, . . . , vk, v)

I [[e e′]] (v1, . . . , vk) = I [[e]] (v1, . . . , vk) (I [[e′]] (v1, . . . , vk))

I [[if e then e′ else e′′]] (v1, . . . , vk) = if I [[e]] (v1, . . . , vk) then I [[e′]] (v1, . . . , vk)

else I [[e′′]] (v1, . . . , vk)

I [[lfp e]] (v1, . . . , vk) = lfp (I [[e]] (v1, . . . , vk))

I [[⊔x∈e′ e]] (v1, . . . , vk) = ⊔{I [[e]] (v1, . . . , vk, v) | v ∈ I [[e′]] (v1, . . . , vk)}.
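Read operationally, these equations amount to a small environment-passing evaluator. The following Python sketch is purely illustrative (expressions are tagged tuples; an interpretation supplies the base functions together with a bottom element and a least-upper-bound operation; lfp is computed by Kleene iteration and so only terminates when the Kleene sequence is finite):

    def ev(e, env, interp):
        tag = e[0]
        if tag == 'const':               # base function c_i
            return interp[e[1]]
        if tag == 'var':                 # variable x_i
            return env[e[1]]
        if tag == 'lam':                 # ('lam', x, body)
            return lambda v: ev(e[2], {**env, e[1]: v}, interp)
        if tag == 'app':                 # ('app', f, arg)
            return ev(e[1], env, interp)(ev(e[2], env, interp))
        if tag == 'if':                  # ('if', cond, then, else)
            return ev(e[2] if ev(e[1], env, interp) else e[3], env, interp)
        if tag == 'lfp':                 # ('lfp', f): iterate from bottom
            f, x = ev(e[1], env, interp), interp['bottom']
            while f(x) != x:
                x = f(x)
            return x
        if tag == 'lub':                 # ('lub', x, e_set, e_body)
            vs = ev(e[2], env, interp)
            return interp['lub'](ev(e[3], {**env, e[1]: v}, interp) for v in vs)
        raise ValueError(tag)

    # e.g. the least fixpoint of a base function 'step' on P({0,...,3}):
    interp = {'bottom': frozenset(),
              'lub': lambda xs: frozenset().union(*xs),
              'step': lambda s: s | frozenset({0}) | frozenset(x + 1 for x in s if x < 3)}
    print(ev(('lfp', ('const', 'step')), {}, interp))   # -> frozenset({0, 1, 2, 3})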

In accordance with subsequent use we shall assume that D is the only element of Dyn, but the

following proposition is easily generalised. Let I and I′ be (type) interpretations and let Q be an

inclusive predicate on (I D)× (I′ D). The intention with Q is that Q (ψ, φ) holds iff the description

φ applies to ψ. Using Reynolds’s notion of relational functors [84] we can extend the relationship

to all types E to get a predicate simE [Q] on (I E)× (I′ E). This relation is defined by

simS [Q] (ψ, φ) iff ψ = φ

simD [Q] (ψ, φ) iff Q (ψ, φ)

simE→L [Q] (ψ, φ) iff ∀ψ′, φ′ . (simE [Q] (ψ′, φ′)⇒ simL [Q] (ψ ψ′, φ φ′))
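For finite denotations this relation can be checked mechanically. A sketch (purely ours: types are encoded as tagged tuples and finite functions as Python dictionaries):

    def sim(E, Q, psi, phi):
        if E == ('S',):                  # static type: identity
            return psi == phi
        if E == ('D',):                  # dynamic type: the base predicate Q
            return Q(psi, phi)
        _, E1, L = E                     # function type E = ('->', E1, L)
        return all(sim(L, Q, psi[a], phi[b])
                   for a in psi for b in phi if sim(E1, Q, a, b))

    # e.g. Q relates integers to the descriptions 'even' and 'any':
    Q = lambda n, d: d == 'any' or (d == 'even' and n % 2 == 0)
    psi = {0: 1, 1: 2, 2: 3}             # a finite fragment of the successor function
    phi = {'even': 'any', 'any': 'any'}  # its description-level counterpart
    print(sim(('->', ('D',), ('D',)), Q, psi, phi))   # -> True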

We can now generalise results from Section 3.2.

Proposition 3.13 For all types E, simE [Q] is inclusive iff Q is.

Proof: Only the "if" part is non-trivial: its proof is by structural induction. The cases E = S and E = D are trivial. So assume that E = E′ → L and that simL [Q] is inclusive. Let Z ⊆ (I E) × (I′ E) be a chain such that simE [Q] holds for all (ψ, φ) ∈ Z. Clearly ⊔Z = (⊔X, ⊔Y ), where X = {ψ | (ψ, φ) ∈ Z} and Y = {φ | (ψ, φ) ∈ Z}. Let (ψ′, φ′) ∈ (I E′) × (I′ E′) be given. Then W = {(ψ ψ′, φ φ′) | (ψ, φ) ∈ Z} is a chain and ⊔W = ((⊔X) ψ′, (⊔Y ) φ′). Assume simE′ [Q] (ψ′, φ′) holds. Then simL [Q] (ψ ψ′, φ φ′) holds for all (ψ, φ) ∈ Z. Since simL [Q] is inclusive, simL [Q] (⊔W ) holds, that is, simL [Q] (ψ ψ′, φ φ′) holds for (ψ, φ) = ⊔Z. Since ψ′ and φ′ were arbitrary, simE [Q] (⊔Z) holds, so simE [Q] is inclusive.

This leads to the following important result.

Proposition 3.14 (Nielson) If simE′ [Q] (I c, I′ c) holds for every base function c : E′, then, for

all types E, if e : E is a closed expression then simE [Q] (I [[e]], I′ [[e]]) holds.

Proof: The proof is by structural induction. In the case of lfp e, fixpoint induction is used.

We shall make good use of this result later. Namely, the relation “safely approximates,” or appr,

is inclusive (Lemma 3.10) and will be used in the role of Q. To simplify notation in the sequel,


rather than using the complicated expression simE [Q] (ψ, φ), we shall simply write “ψ apprE φ,”

or, in fact, when E is clear from the context, merely “ψ appr φ.”

Note the generality of the above result: it immediately allows us to argue inductively the

correctness of a whole dataflow analysis (that is, a non-standard semantics) once certain primitive

base functions have been shown to be in the relation “safely approximates.” This applies not only

to logic programming languages as discussed in this thesis, but to any language whose semantics

can be expressed in the meta-language.


Chapter 4

Dataflow Analysis of Logic Programs

So far we have discussed dataflow analysis without reference to any particular programming lan-

guage. The very point of Section 3.3 was that a theory for dataflow analysis should be as language-

independent as possible. Section 3.2 outlined classical abstract interpretation but did not explain

how the theory applies to logic programs. In this chapter we discuss dataflow analysis of logic

programs.

It is far from clear what should be understood by “standard semantics,” “state,” “program

point,” etc. The point of this chapter is that several different answers are possible, and that

their relative merits depend on the particular dataflow analysis problem to be solved. In particular

there are many different semantic models—at different levels of abstraction—for logic programming

languages, and we argue that each may be adequate for some purpose and inadequate for another.

So let us consider some examples of dataflow analysis problems. The example programs we use

are all very simple, so analysing them becomes trivial. Our choice of examples is motivated by

a desire for clear exposition and should not lead readers to consider our approach simplistic: in

general dataflow problems are much more complicated and programs contain recursion, etc., but

our approach easily scales up to handle this, as will be seen in subsequent chapters. Before we give

examples, a note about notation is appropriate: to distinguish constants from variables, we use a

and b as constants, and u, v, w, x, y, and z for variables.

Consider the program P :

p(x)← q(x), r(x).

q(a).

r(y).

Assuming our lexicon contains a and b as the only functors, the success set of P is

S = {p(a), q(a), r(a), r(b)} = (TP ↑ 2) [2].
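For a lexicon and program this small, the iteration can be replayed mechanically; a minimal Python sketch (the tuple encoding of ground atoms is ours, not the thesis's):

    CONSTS = ['a', 'b']
    def ground_clauses():                # all ground instances of P's clauses
        yield ('q', 'a'), []             # q(a).
        for c in CONSTS:
            yield ('r', c), []           # r(y).
        for c in CONSTS:
            yield ('p', c), [('q', c), ('r', c)]   # p(x) <- q(x), r(x).
    def T_P(s):                          # ground immediate-consequence operator
        return frozenset(h for h, body in ground_clauses()
                         if all(b in s for b in body))
    s = frozenset()
    while T_P(s) != s:                   # ascending iteration; stable from T_P up 2
        s = T_P(s)
    print(sorted(s))   # [('p', 'a'), ('q', 'a'), ('r', 'a'), ('r', 'b')]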

In the case of P , this computation is straightforward, but for more complicated programs the

success set may be infinite and we may want to approximate it instead, by a set S′, say, of atoms.


A set can be approximated from within or from without, and our choice depends on what we want

to use the approximation for. If S′ ⊆ S (approximation from within), we can interpret “A ∈ S′”

as “A is in the success set.” If S ⊆ S′ (approximation from without), we can interpret “A 6∈ S′” as

“A is not in the success set.” But in both cases, the reverse interpretation is invalid1.

Approximations to the success set may be thought of as (derived) type information about

predicates. They are most naturally computed in a manner similar to how the TP operator works.

A “state” becomes a set of ground atoms. Since a predicate’s definition can be spread over many

clauses, it may be useful to record for each clause those instances of its head that could be derived

using the clause. So in this case, a useful notion of “program point” would simply be “clause.”

In the case of the program P above, assume that we have been approximating S from without and have arrived at S itself as the approximation (a reasonable assumption given the simplicity of P ).

In particular we find the approximation {p(a)} attached to the first clause (or “program point”).

This allows for the replacement of that clause by p(a) ← q(a) without changing the program’s

success set (and it immediately suggests a further simplification, namely the unfolding of q(a)).

In Chapter 5 we discuss this case in detail. In fact we are concerned with a more complex

semantics than the TP characterisation, because we want to handle negation in programs as well.

This calls for the simultaneous approximation of success and failure sets.

For some purposes a characterisation of the success set may be inadequate, in particular since

success sets are usually infinite. It may sometimes be useful to know which atoms succeed, rather

than just which ground atoms succeed. This makes TP inadequate: we would need an operator

similar to TP , but one which handles arbitrary atoms. In the case of P , the denotation would be

{p(a), q(a), r(y)}, allowing us to conclude that the query ←p(x) has {x← a} as its only answer

substitution, while ←r(x) has the identity substitution as its only answer substitution. There are

semantic definitions that achieve this and it is perfectly possible to base dataflow analyses on such

definitions. We give some relevant references in Section 5.3.

What we have discussed so far are examples of what we call bottom-up dataflow analysis. Such

analyses propagate information in a way similar to how the TP operator works. The analyses

are therefore independent of any particular query. Often, however, we are only interested in a

program’s behaviour given some query. An interpreter for a logic programming language, for

example, is usually based on SLD resolution [49], in which control flows “top-down,” that is, flows

from a query to the clauses that are selected in an attempt to refute the query. This is normally

the case for Prolog, which uses a standard (left-to-right) computation rule when selecting a body’s

atoms for processing and a depth-first search rule for handling pending atoms. A Prolog compiler

that attempts to improve code generation will correspondingly need dataflow information that is

propagated in some top-down manner.

An example of such dataflow information is mode information. A mode of a predicate argument

1The following rendering of this principle is due to Alan Mycroft: “Safe behind a fire door” depends on which

side of the door the fire is on!


expresses its degree of instantiation at the time the predicate is called. It is well-known how mode

information can enhance generated code. Instead of generating code for the general unification of

terms, a compiler can capitalise on mode information to generate faster versions of the unification

procedure, specialised to the context set up by a given Prolog program [19, 21]. For example, code

to unify a variable with a constant is much simpler than code to unify two arbitrary terms. Tradi-

tionally mode information has been provided by the programmer in the form of mode declarations.

This approach has the disadvantage of putting an extra burden on the programmer, but worse yet:

if wrong mode declarations are given to the compiler, its output can no longer be trusted, and yet

there is no indication of the error. It is therefore worthwhile to try to make the generation of (safe!)

mode declarations automatic.

In general, whenever some kind of annotation of Prolog programs is called for, it is worth

considering the possibility of doing so automatically, not only because it is faster, but also because

it is safer. Partial evaluation, for instance, usually calls for some kind of annotation, and in

Prolog variants with a delay mechanism such as for example NU-Prolog [97], programs are usually

extensively annotated. One can also think of annotations that will guide a parallel execution of a

logic program.

In all these cases (and others, as we return to in Section 6.5), what is needed are descriptions

of the run-time call patterns that a program gives rise to. That is, for each clause in the program

we want information about how unification will bind the variables in the clause’s head.

For example, consider the program P qua Prolog program. A top-down dataflow analysis that

utilises the knowledge of the standard (left-to-right) computation rule can easily deduce that during

refutation of any query of form←p( ), every call of the third clause will bind y to a. Note that this

is very different from the information that a bottom-up analysis provides about the third clause.

As an even simpler example of how the two kinds of analysis attach different information to

clauses, consider the second clause in the program

←p(a)

p(x)← q(x).

q(b).

A bottom-up analysis can tell us that to succeed, p(x) must be instantiated to p(b) if instantiated

at all. A top-down analysis can tell us that the only actual call instance at run-time will be p(a).

The fact that this call ultimately fails is of no concern to the top-down analysis.

In the top-down example analysis of P , the analyser made good use of its knowledge of the

computation rule. The conclusion that r is always called with a ground argument would not be

valid if arbitrary computation rules were considered. If the dataflow analysis problem that we face

is one of finding the best order in which to process atoms in a body (perhaps including parallel

processing), then we have to use a semantics in accordance with this, that is, one that does not

fix the computation rule. In Chapter 6 and its sequel, on the other hand, we choose a particular

(the standard) computation rule, since our primary interest there is the improvement of compilers

and other program transformers for a Prolog-like language. In so doing, we increase the precision of analyses.

[Figure 4.1: Two-step development of a dataflow analysis. The standard semantics (abstract) is approximated by a "dataflow" semantics (abstract), which is in turn implemented as a dataflow analysis.]

The carriers of binding information in an SLD refutation are substitutions: in the SLD model a

substitution carries all the necessary information about a program’s current bindings of variables

to “values,” and thus closely corresponds to the notion of a “state” from Section 3.2.

However, the semantics that we use in Chapter 6 is somewhat different from the SLD model.

The reason for the difference is that we want a definition that is rather abstract. A semantics

defined in terms of SLD trees contains information that we do not need for some applications,

such as exact derivation history, bindings of variables that will not affect the final answers, and

a distinction between finite failure and non-termination. Since such information is not needed in

our dataflow analyses, it should be abstracted away so as not to obscure more important issues.

This is exactly where denotational semantics is useful: it allows for such abstraction and provides

a fixpoint characterisation as required by our theory.

On the other hand, a less abstract definition such as the SLD model may be useful as a basis for

dataflow analyses that our semantics cannot support. Keeping a record of the derivation history

may be useful for an analysis of programs’ storage requirements or (lack of) determinacy. The

relation is the same as that between our bottom-up and our top-down semantics: the former is

more abstract than the latter since it disregards computation rules, hence it is easier to formalise,

but it supports a more limited class of dataflow analyses.

The point that we hope to have made here is that, unfortunately, no one semantic model of

logic programs can be said to provide the best basis for dataflow analysis. Other things being

equal, we choose the simplest definition that will support a solution to the problem at hand. Under

the assumption that “simple” here means “abstract,” this leads to a two-step paradigm for the

development of a dataflow analysis, as indicated in Figure 4.1. The point in this approach is that,

rather than to implement an ad hoc dataflow analysis directly from a knowledge of the programming

language, it may be advantageous to factor out two independent issues: approximation (the left

arrow) and implementation (the right arrow).


Chapter 5

Bottom-Up Analysis of Normal Logic

Programs

In this chapter a Kleene logic-based semantics for normal logic programs is defined, similar to

Fitting’s ΦP semantics. This provides a semantic basis for bottom-up dataflow analyses. Such

analyses give information about the success and failure sets of a program. As we discussed in

Chapter 4, a major application of bottom-up analysis is therefore type inference. We detail a

dataflow analysis using descriptions similar to Sato and Tamaki’s “depth k” abstractions and

another using Marriott, Naish and Lassez’s “singleton” abstractions. We show that both are sound

with respect to our semantics and outline various uses of the analyses. We justify our choice of

semantics by showing that it is the most abstract of a number of possible semantics. This means that

every analysis based on our semantics is correct with respect to these other semantics, including

Kunen’s semantics, SLDNF resolution, and the common (sound) Prolog semantics. Finally we

discuss related work.

5.1 Bottom-up semantics for logic programs

In this section we give a bottom-up semantics for normal logic programs. Recall that normal

programs allow for negation in clause bodies. Throughout the chapter, by “program” we mean

normal logic program. Also recall the use of italic capital letters for meta-variables, a convention

we stick to throughout the thesis (while also using italic capitals for other things). Finally recall

that Her denotes the set of ground atoms (for some fixed lexicon), and that (ground S) denotes

the set of ground instances of the syntactic object S.

Let Interp = (P Her) × (P Her). Equipped with the component-wise subset ordering, Interp

forms a complete lattice. We denote the ordering on Interp by ≤. The idea is that Interp consists

of all three-valued (partial) “interpretations.” An interpretation u = (us, uf ) is read as follows: the

atoms in us are true, those in uf are false, those not in us∪uf are undefined (not assigned a classical

truth value), and those in us ∩ uf are overdefined (assigned both true and false). Alternatively,


Interp may be thought of as mapping ground atoms to the four-valued bilattice, as discussed by

Fitting [25]. Clearly, the set Interp is partitioned into consistent and inconsistent elements, where

(us, uf ) is consistent iff us∩uf = ∅ and inconsistent iff us∩uf 6= ∅. The greatest element (Her ,Her)

of Interp is inconsistent, for example. An interpretation (us, uf ) is complete iff us ∪ uf = Her .

Definition. The predicate consistent : Interp → Bool is defined by

consistent (X,Y ) iff X ∩ Y = ∅.

Lemma 5.1 The predicate consistent is inclusive on Interp.

Proof: Let Z ⊆ Interp be a chain. Let X ′ = {X | (X,Y ) ∈ Z} and Y ′ = {Y | (X,Y ) ∈ Z}.

Let X0 = ⊔X ′ and Y0 = ⊔Y ′. Clearly (X0, Y0) = ⊔Z. Assume consistent (X,Y ) holds for all

(X,Y ) ∈ Z, that is, X∩Y = ∅. We show by contradiction that X0∩Y0 = ∅. Assume ∃x.x ∈ X0∩Y0.

Then x ∈ X1 for some (X1, Y1) ∈ Z and x ∈ Y2 for some (X2, Y2) ∈ Z. For reasons of symmetry,

and since Z is a chain, we can assume that (X1, Y1) ≤ (X2, Y2). It follows that x ∈ X2 ∩ Y2,

contradicting the assumption that X ∩ Y = ∅ for all (X,Y ) ∈ Z. Thus X0 ∩ Y0 = ∅, so consistent

is inclusive on Interp.

The idea behind using three-valued logic to describe computational behaviour goes back to Kleene.1

Suppose we want to use a machine to determine the truth or falsehood of some statement. In

addition to the two possibilities that the machine returns true or false, it may happen that it fails

to terminate. It is therefore natural to use a logic which admits yet a third value which stands

for “undefined.” In Kleene’s logic [45], the connectives are the “most generous” extensions of the

classical connectives, so for example the tables for “∧” and “¬” are:

∧ true false undef

true true false undef

false false false false

undef undef false undef

¬

true false

false true

undef undef

In logic programming terms this version of “∧” corresponds to a fair computation rule. Note

that Prolog’s “∧” is not commutative as a three-valued connective: the standard computation

rule of Prolog rather corresponds to the connectives of McCarthy logic [51]. For example, in

McCarthy logic, false ∧ undef yields false, but undef ∧ false yields undef , since this corresponds

to the behaviour of a machine that attempts to evaluate expressions from left to right, given the

understanding that undef designates non-termination.
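The contrast between the two logics is easily rendered executable. A minimal sketch (the function names are ours, and Python's None plays the role of undef):

    def kleene_and(p, q):                # Kleene's "most generous" conjunction
        if p is False or q is False:
            return False
        if p is None or q is None:
            return None
        return True

    def kleene_not(p):
        return None if p is None else (not p)

    def mccarthy_and(p, q):              # left-to-right: left operand first
        if p is None:
            return None
        if p is False:
            return False
        return q

    assert kleene_and(None, False) is False      # undef and false = false
    assert mccarthy_and(False, None) is False    # false and undef = false
    assert mccarthy_and(None, False) is None     # undef and false = undef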

1Three-valued logics had previously been investigated by Post and Lukasiewicz independently, and by Bocvar, but

always from a point of view of the semantic paradoxes. Kleene wanted an intuitionistically sound logic for partial

recursive functions.


The use of a fair computation rule can only increase the success and finite failure sets of a

program. So since we approximate these sets from without, any analysis based on Kleene logic will

be sound with respect to McCarthy logic. In the following we give a semantic definition for normal

programs based on Kleene logic, but we return to alternative semantics and their relation to ours

in Section 5.3.

It is possible to give a three-valued model-based semantics for normal programs, based on Fit-

ting’s notion of satisfaction by (in our terminology) consistent interpretations [26]. Fitting, however,

also gives a fixpoint characterisation, based on an operator ΦP , and this is the type of semantic

definition we aim at. Fitting’s operator works on a semilattice of consistent interpretations. The

reason why we include inconsistent elements is that the inherent imprecision in “descriptions” may

sometimes force the generation of (description) values that are “inconsistent,” as Example 5.15 will

show. Technically it is simpler to include all elements, and unlike Fitting we are only interested in

our operator’s least fixpoint which is the same as that of ΦP .

Definition. Let u = (us, uf ) be an interpretation. Let B = L1, . . . , Ln be a ground body. Then

u makes B true iff

∀ i ∈ {1, . . . , n} . ∀A ∈ Her . (Li = A⇒ A ∈ us) ∧ (Li = ¬ A⇒ A ∈ uf )

u makes B false iff

∃ i ∈ {1, . . . , n} . ∃A ∈ Her . (Li = A ∧ A ∈ uf ) ∨ (Li = ¬ A ∧ A ∈ us).

The following lemma is easily verified.

Lemma 5.2 Let u be an interpretation and B a ground body. If u is consistent then u cannot

make B both true and false.

We now define the base semantics. Let P be a program and let I be an index set for the clauses in

P . Using this we may denote the i’th clause in P by P [i], where i ∈ I.

Definition. The immediate consequence function UP : Interp → Interp is defined by

UP u = (us, uf ) where

us = {A ∈ Her | ∃ i ∈ I . ∃A←B ∈ ground (P [i]) . u makes B true}

uf = {A ∈ Her | ∀ i ∈ I . ∀A← B ∈ ground (P [i]) . u makes B false}.

The base semantics of program P is B [[P ]] = lfp UP .

Proposition 5.3 The base semantics of a program P is well-defined and consistent.

Proof: The function UP is easily seen to be monotonic, so B [[P ]] is well-defined. By Lemma 5.1

we can reason about consistent by using fixpoint induction on UP . If (UP u) is inconsistent then

there is a ground instance A← B of a clause in P such that u makes B both true and false, so u

is inconsistent by Lemma 5.2. It follows that the set of consistent interpretations is closed under

UP . Therefore B [[P ]] is consistent.
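To fix the definitions, here is a small executable sketch (illustrative only: a hypothetical propositional program with Her = {p, q, r}, bodies encoded as lists of signed atoms, and lfp and gfp computed by ascending and descending Kleene iteration, which terminates here because the lattice is finite):

    HER = frozenset({'p', 'q', 'r'})
    CLAUSES = {'p': [[('q', False)]],    # p <- not q.
               'q': [[('q', True)]],     # q <- q.
               'r': [[]]}                # r.
    def makes_true(u, body):
        us, uf = u
        return all(a in (us if pos else uf) for a, pos in body)
    def makes_false(u, body):
        us, uf = u
        return any(a in (uf if pos else us) for a, pos in body)
    def U_P(u):
        us = frozenset(a for a, bodies in CLAUSES.items()
                       if any(makes_true(u, b) for b in bodies))
        uf = frozenset(a for a in HER
                       if all(makes_false(u, b) for b in CLAUSES.get(a, [])))
        return (us, uf)
    def iterate(f, u):                   # Kleene iteration to a fixpoint
        while f(u) != u:
            u = f(u)
        return u
    print(iterate(U_P, (frozenset(), frozenset())))
    # lfp: ({'r'}, {})  -- p and q stay undefined
    print(iterate(U_P, (HER, HER)))
    # gfp: ({'p','q','r'}, {'p','q'})

The two printed interpretations are complementary, anticipating Proposition 5.7 below.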


Note that we consider lfp UP to be the denotation of the program P . This differs from Fitting’s

more fine-grained semantics [26] in which the set of all (consistent) fixpoints, or “partial models,”

is used.

Example 5.4 Consider the program P :

p(x)← q(x, y),¬ r(y).

q(x, x).

r(f(a)).

r(f(f(x)))← r(f(f(x))).

Assuming that the alphabet is that of P , the semantics is (us, uf ) where

us = {p(a), q(a, a), q(f(a), f(a)), . . . , r(f(a))}

uf = {p(f(a)), q(a, f(a)), q(f(a), a), . . . , r(a)}.

The rest of this section establishes some results that will be used in Section 5.2 where the issue is

non-standard semantics. We first define a predicate compl, read “complementary.”

Definition. For x ⊆ Her , let x̄ denote (Her \ x), the complement of x. The predicate compl : Interp × Interp → Bool is defined by compl ((x, y), (x′, y′)) iff x = ȳ′ ∧ y = x̄′.

Our interest in compl stems from Proposition 5.7 and its corollary below. The proposition relates

lfp UP and gfp UP by showing them to be complementary. To establish the proposition, two lemmas

are needed.

Lemma 5.5 Let B be a ground body and let u and u′ be interpretations. If compl (u, u′) holds,

then

1. u makes B true iff ¬ (u′ makes B false),

2. u makes B false iff ¬ (u′ makes B true).

Proof: Let B = L1, . . . , Ln, let (us, uf ) = u, and assume that compl (u, u′) holds.

(1) The assertion then is

∀ i ∈ {1, . . . , n} . ∀A ∈ Her . (Li = A⇒ A ∈ us) ∧ (Li = ¬ A⇒ A ∈ uf ) iff

¬ (∃ i ∈ {1, . . . , n} . ∃A ∈ Her . (Li = A ∧ A ∈ ūs) ∨ (Li = ¬ A ∧ A ∈ ūf )),

which clearly holds.

(2) The proof is similar to (1).

Let Interp × Interp be equipped with the ordering ⊑ defined by (u1, u′1) ⊑ (u2, u′2) iff u1 ≤ u2 ∧ u′2 ≤ u′1. Clearly Interp × Interp is a complete lattice.


Lemma 5.6 The predicate compl is inclusive on Interp × Interp.

Proof: Let Z ⊆ Interp × Interp be a chain. Assume that compl (u, u′) holds for all (u, u′) ∈ Z, that is, x = ȳ′ ∧ y = x̄′ where (x, y) = u and (x′, y′) = u′. Let X = {x | ((x, y), (x′, y′)) ∈ Z} and let Y , X ′, and Y ′ be defined similarly. Then ((⋃X, ⋃Y ), (⋂X ′, ⋂Y ′)) = ⊔Z. Consider A ∈ Her . We have that

A ∉ ⋃X iff ∀ ((x, y), (x′, y′)) ∈ Z . A ∉ x

iff ∀ ((x, y), (x′, y′)) ∈ Z . A ∈ x̄

iff ∀ ((x, y), (x′, y′)) ∈ Z . A ∈ y′

iff A ∈ ⋂Y ′.

So ⋃X is the complement of ⋂Y ′. Similarly ⋃Y is the complement of ⋂X ′. So compl (⊔Z) holds.

Proposition 5.7 For every program P , compl (lfp UP , gfp UP ) holds.

Proof: Let F : (Interp × Interp) → (Interp × Interp) be defined by F (u, u′) = (UP u,UP u′). By

this, lfp F = (lfp UP , gfp UP ). By Lemma 5.6 we can reason about compl by using fixpoint induction

on F . Assume compl (u, u′) holds and let (us, uf ) = UP u and (u′s, u′f ) = UP u′. By Lemma 5.5 and

the definition of UP , u′s = ūf and u′f = ūs, that is, compl (F (u, u′)) holds. Therefore compl (lfp F ),

that is, compl (lfp UP , gfp UP ), holds.

In general gfp UP will be inconsistent, and even though there may be many consistent fixpoints,

there is usually no greatest such, cf. Fitting’s use of “intrinsic” fixpoints [26]. We have the following

consequence of Proposition 5.7.

Corollary 5.8 gfp UP is consistent iff lfp UP = gfp UP .

Proof: If lfp UP = gfp UP then gfp UP is consistent, by Proposition 5.3. Let (us, uf ) = lfp UP . Then gfp UP = (ūf , ūs), by Proposition 5.7. Assume that gfp UP is consistent, that is, ūf ∩ ūs = ∅. Then us ∪ uf = Her , that is, lfp UP is complete. Thus ūf = us and ūs = uf , and so lfp UP = gfp UP .

5.2 Approximation of success and failure sets

The obvious application of dataflow analyses based on the semantics given in the previous section is

type inference. We use the term “type” loosely to mean descriptions that are based on the structure

of the terms or atoms. For instance, types include the (restricted) rational tree descriptions used

by Bruynooghe [7] and the atom abstractions used by Sato and Tamaki [87]. Type inference is the

process of finding a type which describes the success set and/or failure set of a program. A special

case of type inference is termination analysis: from the success and failure information present in


an inferred type one may conclude that certain queries will neither succeed nor fail and so will not

terminate.

In this section we amalgamate ideas from several type inference methods found in the literature

[56, 59, 87]. Type inference is expressed as a non-standard version of the base semantics. We

generalise previous work along two dimensions; the first is that normal programs are considered

rather than definite programs, the second is that the style of type descriptions considered are

generalised to “atom abstractions” which intuitively are sets of atoms whose denotation is their

set of ground instances. We also generalise Marriott, Naish and Lassez’s method of program

specialisation [56] and prove the correctness of the generalisation.

We now introduce “atom abstractions” and “abstraction schemes”—the latter are sets of atom

abstractions. We are primarily interested in abstraction schemes in which each atom abstraction

has a distinct denotation, since atom abstractions representing the same set of ground atoms are

equivalent for our purposes. One step towards ensuring this is to consider the atoms in atom

abstractions to be taken modulo variable renaming.

Definition. The instantiation preordering ⊳ on Atom is defined by A ⊳ A′ iff A is an instance of

A′. We let Atom⊳

denote the poset of atoms (modulo variable renaming) with the partial ordering

induced by the instantiation preordering.
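The instantiation preordering is decidable by one-way matching. A minimal sketch (with a hypothetical term representation: variables are strings beginning with '?', compound terms are tuples of the form (functor, arg, ...), and constants are plain strings):

    def instance_of(a, pat, sub=None):
        """True iff a is an instance of pat."""
        sub = {} if sub is None else sub
        if isinstance(pat, str) and pat.startswith('?'):   # variable in pat
            if pat in sub:
                return sub[pat] == a                       # consistent binding
            sub[pat] = a
            return True
        if isinstance(a, tuple) and isinstance(pat, tuple) \
                and len(a) == len(pat) and a[0] == pat[0]:
            return all(instance_of(x, p, sub) for x, p in zip(a[1:], pat[1:]))
        return a == pat

    assert instance_of(('p', ('f', 'a')), ('p', '?x'))          # p(f(a)) below p(x)
    assert not instance_of(('p', 'a', 'b'), ('p', '?x', '?x'))  # needs x = a and x = b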

Definition. Let the function den : P Atom⊳ → P Her be defined by den A = ⋃(∆ ground A). An atom abstraction A is a subset of Atom⊳ and its denotation is den A. An abstraction scheme S is a set of atom abstractions such that for some A ∈ S, Her = den A.

Definition. The function depth : Term → N gives the depth of a term as follows: if T is a constant

then depth T = 1, otherwise depth T = 1 + max {depth T ′ | T ′ is a proper subterm of T}. We

define the depth of an atom A, depth A, as the maximal depth of any of its terms (0 if A is anadic).
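Using the same hypothetical tuple representation as above, the depth measure is a short recursion:

    def depth(t):                        # constants and variables have depth 1
        if isinstance(t, str):
            return 1
        return 1 + max(depth(s) for s in t[1:])

    def atom_depth(a):                   # max over argument terms, 0 if anadic
        return max((depth(t) for t in a[1:]), default=0)

    assert atom_depth(('p', '?x', ('f', ('f', '?x')))) == 3   # as in Example 5.10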

The next two definitions exemplify abstraction schemes. Let (pred A) denote the predicate symbol

of atom A.

Definition. Let Atomk = {A ∈ Atom⊳| depth A ≤ k}. The depth k abstraction scheme is

P Atomk.

Example 5.9 The set {[p(x, f(x))], [p(f(a), x)]} is a depth 2 abstraction that denotes

{p(a, f(a)), p(f(a), f(f(a))), . . . , p(f(a), a), p(f(a), f(a)), . . .}

(assuming a lexicon {p, f, a}).

Definition. An atom abstraction A is singleton iff ∀A,A′ ∈ A . (pred A) = (pred A′)⇒ A = A′.

The singleton abstraction scheme is the set of all singleton atom abstractions.


Example 5.10 The set {[p(x, f(f(x)))], [q(f(a), x)]} is a singleton abstraction that denotes

{p(a, f(f(a))), p(f(a), f(f(f(a)))), . . . , q(f(a), a), q(f(a), f(a)), q(f(a), f(f(a))), . . .}

(assuming a lexicon {p, q, a, f}). Note that this abstraction is not a depth 2 abstraction since

p(x, f(f(x))) is of depth 3, and conversely, the previous example’s abstraction is not a singleton

abstraction since it contains two atoms having the same predicate symbol p.

The depth k atom abstractions were introduced by Sato and Tamaki [87] and the singleton

atom abstractions have been used by Marriott [54] and by Marriott, Naish and Lassez [56]. Related

abstraction schemes have been studied by Marriott and Søndergaard [61].

We equip abstraction schemes with the ordering ≤ defined by A ≤ A′ iff den A ⊆ den A′. As it

stands, ≤ is a preordering: different atom abstractions may have the same denotation. For instance,

if there are only two constants a and b, then {p(x)} has the same denotation as {p(a), p(b)}. We are

ultimately interested in schemes that are complete lattices, so as a first step we introduce schemes

that are posets. These we call “canonical.”

Definition. An abstraction scheme S is canonical iff

∀A,A′ ∈ S . (den A) = (den A′)⇒ A = A′.

It is straightforward to show that if there are two or more distinct ground terms then the singleton

abstraction scheme is canonical. However, depth k abstraction schemes are not in general canonical.

For any abstraction scheme one can always find an “equivalent” scheme which is canonical by

choosing a maximal representative for each class of abstractions with the same denotation. Let

(X,≤) be a poset and let Y ⊆ X. We define maximal : P X → P X by

maximal Y = {y ∈ Y | ∀ y′ ∈ Y . y ≤ y′ ⇒ y = y′},

and we define minimal Y in the dual manner.

Definition. Let S be an abstraction scheme. We define the canonicised S to be

S∗ = {maximal {A ∈ Atom⊳| ground A ⊆ den A} | A ∈ S}.

Definition. Abstraction schemes S and S ′ are equivalent iff

{den A | A ∈ S} = {den A | A ∈ S ′}.

The next proposition follows immediately from the definition of S∗.

Proposition 5.11 S∗ is canonical and equivalent to the abstraction scheme S.


Definition. An abstraction scheme is a lattice abstraction scheme iff it is a complete lattice under

subset ordering.

Proposition 5.12 Canonicised depth k abstraction schemes and the canonicised singleton abstrac-

tion scheme are lattice abstraction schemes.

Our interest in lattice abstraction schemes stems from the following proposition.

Proposition 5.13 Let S be a lattice abstraction scheme. Let IntS = S × S and let γS : IntS →

Interp be defined by

γS (ws, wf ) = (den ws, den wf ).

Then (γS , IntS , Interp) is an insertion.

Abstraction schemes can be used to induce non-standard semantic functions which approximate the

standard semantic functions. Such non-standard semantics are useful because they may highlight

errors hidden in the program, by giving an approximation to the success set or failure set smaller

than the programmer would expect. This will be exemplified later (Example 5.16).

Definition. Let S be a canonical abstraction scheme. Define αS : (P Her) → S to be some fixed

function with the property that

∀G . αS G ∈ minimal {A ∈ S | G ⊆ den A}.

Definition. Let P be a program and let S be a lattice abstraction scheme. The function WP :

IntS → IntS is defined by

WP w = (ws, wf ) where

ws = αS {A ∈ Her | ∃ i ∈ I . ∃A←B ∈ ground (P [i]) . (γS w) makes B true}

wf = αS {A ∈ Her | ∀ i ∈ I . ∀A←B ∈ ground (P [i]) . (γS w) makes B false}.

The non-standard base semantics of P using S is NS [[P ]] = lfpWP .

Note the non-constructiveness of the definition of WP . This allows us easily to argue that NS

approximates the standard base semantics. However, for an implementation a more complex def-

inition using unifiers rather than dealing with ground instances would be useful. For examples of

this, see Marriott’s [54] Chapter 7.
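Such an implementation would rest on a unification procedure. A sketch (ours, using the same hypothetical tuple representation of terms as above, and omitting the occur check for brevity):

    def walk(t, sub):                    # dereference a bound variable
        while isinstance(t, str) and t.startswith('?') and t in sub:
            t = sub[t]
        return t

    def unify(t1, t2, sub):
        """Return an extension of sub unifying t1 and t2, or None."""
        t1, t2 = walk(t1, sub), walk(t2, sub)
        if t1 == t2:
            return sub
        if isinstance(t1, str) and t1.startswith('?'):
            return {**sub, t1: t2}       # no occur check, for brevity
        if isinstance(t2, str) and t2.startswith('?'):
            return {**sub, t2: t1}
        if isinstance(t1, tuple) and isinstance(t2, tuple) \
                and len(t1) == len(t2) and t1[0] == t2[0]:
            for s1, s2 in zip(t1[1:], t2[1:]):
                sub = unify(s1, s2, sub)
                if sub is None:
                    return None
            return sub
        return None

    print(unify(('q', '?x', ('f', 'a')), ('q', 'b', '?y'), {}))
    # -> {'?x': 'b', '?y': ('f', 'a')}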

Theorem 5.14 For every lattice abstraction scheme S, NS ∝ B.

Proof: It follows from the definition of WP that appr (WP , UP ) holds for all programs P . Thus by

Proposition 3.11, appr (lfpWP , lfp UP ) holds for all programs P . The assertion follows from the

definitions of NS and B.


Example 5.15 If S is the depth 2 abstraction scheme and P is the program from Example 5.4 then the

non-standard base semantics of P using S∗ is w = (ws, wf ) where

ws = {p(a), q(x, x), r(f(a))}

wf = {p(f(a)), q(a, f(a)), q(f(a), a), q(f(x), f(y)), r(a)}.

Example 5.15 illustrates the inherent imprecision of depth k abstraction schemes, since there is

no way to tell from the non-standard semantics w whether q(f(a), f(a)) is in P ’s success set or

failure set. Of course the larger the depth chosen for the depth k abstraction scheme, the more

precise the analysis. The example also demonstrates that the interpretation corresponding to the

non-standard base semantics may be inconsistent.

The following example illustrates how the non-standard base semantics may be used to indicate

errors in a program. Recall the list notation used in this thesis: we use the infix operator : for list

construction and let nil denote the empty list.

Example 5.16 Let S be the singleton abstraction scheme and let P be the following, somewhat

defective, program:

member(u, u : v).

member(u, x : v)←member(v, v).

The program is intended to compute list membership. The non-standard base semantics of P using

S∗ approximates the success set of P by {member(u, u : v)}. This set is obviously too small, which

indicates that the program contains an error.

Another use of abstraction schemes is to specialise programs by replacing clauses in the program

by instances of the clauses [56, 87]. This has the advantage that bindings are made earlier in

a derivation, allowing failure to be detected more quickly, thus pruning useless derivations. In

particular, specialisation may turn non-deterministic programs that use deep backtracking into

programs using shallow backtracking only and so are deterministic in the sense that only one

clause head matches each call.

Definition. Let S be a lattice abstraction scheme. Define gclause : Clause → IntS → Gcla,

sclause : Clause→ IntS → Clause, and spec : Prog → IntS → Prog by

gclause C w = {A← B ∈ ground C | (γS w) makes B true}

sclause C w = maximal {C ′ ≤ C | ground C ′ ⊆ gclause C w}

spec P w = {sclause (P [i]) w | i ∈ I}.

Example 5.17 Let P be the program from Example 5.4 and let w be its non-standard semantics

from Example 5.15. Then (spec P w) is

p(a)← q(a, a),¬ r(a).

q(x, x).

r(f(a)).


Note how the last clause of P has been “specialised” away.

It is reasonable to hope that if a program P is specialised using NS [[P ]], the resulting program

will be equivalent to P . More generally, we might hope that if appr (w,B [[P ]]) holds, then P and

(spec P w) will be equivalent in the sense of being codenotative:

Definition. Programs P and P ′ are equivalent iff B [[P ]] = B [[P ′]].

Unfortunately Example 5.17 shows that this does not hold. We therefore aim at finding a sufficient

condition on w that ensures that P and (spec P w) are equivalent. We first establish some lemmas.

Lemma 5.18 Let S be a lattice abstraction scheme, P be a program and P ′ = (spec P w) for some

w ∈ S. For all u ∈ Interp such that appr (w, u) holds, if UP u = (us, uf ) and UP ′ u = (u′s, u′f )

then us = u′s and uf ⊆ u′f .

Proof: Since each clause in P ′ is an instance of a clause in P , it follows that u′s ⊆ us and uf ⊆ u′f .

By the definition of spec, all of the ground clause instances used when deriving UP u are instances

of a clause in P ′ and so u′s = us.

Lemma 5.19 Let S be a lattice abstraction scheme, P be a program and P ′ = (spec P w) for some

w ∈ S. If appr (w,B [[P ]]) holds, then B [[P ]] ≤ B [[P ′]].

Proof: Define the predicate leq : (Interp × Interp) → Bool by leq (u, u′) iff u ≤ u′. Clearly leq is

inclusive on Interp×Interp. Let G : (Interp×Interp)→ (Interp×Interp) be defined by G (u, u′) =

(UPu,UP ′u′). Then G is monotonic and lfp G = (lfp UP , lfp UP ′). Assume appr (w,B [[P ]]) holds,

that is, lfp UP ≤ γS w. By Lemma 5.18 then, UP u ≤ UP ′ u for all u ≤ lfp UP . Assume u ≤ u′.

Since UP ′ is monotonic, UP u ≤ UP ′ u′ for all u ≤ lfp UP . So leq (u, u′) implies leq (G (u, u′)) for all

u ≤ lfp UP . By fixpoint induction, leq (lfp UP , lfp UP ′) holds, that is, B [[P ]] ≤ B [[P ′]].

Definition. The function co : Interp → Interp is defined by co (us, uf ) = (ūf , ūs).

Note that compl (u, co u) holds for all u ∈ Interp. The following lemma is an immediate consequence

of the definition of co.

Lemma 5.20 If u is consistent then u ≤ co u.

Lemma 5.21 Let S be a lattice abstraction scheme, P be a program and P ′ = (spec P w) for some

w ∈ S. If appr (w, co (B [[P ]])) holds, then B [[P ]] is a fixpoint for UP ′.

Proof: Let (us, uf ) = B [[P ]] and let (u′s, u′f ) = UP ′ (B [[P ]]). By Lemma 5.20, B [[P ]] ≤ co (B [[P ]]). Assume that appr (w, co (B [[P ]])) holds. It follows from Lemma 5.18 that us = u′s. We


now prove that uf = u′f . Let I ′ be an index set for the clauses in P ′. Since (us, uf ) = UP (B [[P ]]), we have:

A ∈ ūf iff ∃ i ∈ I . ∃A←B ∈ ground (P [i]) . ¬ ((B [[P ]]) makes B false)

iff ∃ i ∈ I . ∃A←B ∈ ground (P [i]) . co (B [[P ]]) makes B true

iff ∃ i ∈ I ′ . ∃A←B ∈ ground (P ′[i]) . co (B [[P ]]) makes B true

iff ∃ i ∈ I ′ . ∃A←B ∈ ground (P ′[i]) . ¬ ((B [[P ]]) makes B false)

iff A ∈ ū′f .

The second and fourth bi-implications are by Lemma 5.5, the third is by Lemma 5.18. It follows

that (us, uf ) = (u′s, u′f ), that is, B [[P ]] is a fixpoint for UP ′ .

Theorem 5.22 Let S be a lattice abstraction scheme, let P be a program, and let w ∈ S. If

appr (w, co (B [[P ]])) holds, then P and (spec P w) are equivalent.

Proof: It follows from Lemmas 5.19 and 5.21 that B [[P ]] is the least fixpoint for UP ′ where P ′ =

(spec P w). Thus B [[P ′]] = B [[P ]].

For this reason we are interested in finding approximations to co (B [[P ]]). We have the following

proposition.

Proposition 5.23 Let S be a lattice abstraction scheme and let P be a program. Then

1. appr (WP ↓ n, co (B [[P ]])) holds for all n ∈ ω,

2. if γS is co-continuous then appr (gfpWP , co (B [[P ]])) holds.

Proof: Both statements follow directly from Propositions 5.13, 5.7 and 3.11.

Thus when specialising programs we are interested in computing (WP ↓ n) or, in cases where γS is

co-continuous, gfpWP . We note that for the two canonicised abstraction schemes discussed here,

γS is co-continuous.

Example 5.24 Let P be the following (correct) program to compute list membership:

member(u, u : v).

member(u, x : v)←member(u, v).

Using the singleton abstraction scheme we have gfpWP = (ws, wf ) where

ws = {member(u, y : z)}

wf = {member(u, v)}.


[Figure 5.1: Two non-deterministic automata, with states 1 and 2 and states 3 and 4 and transitions labelled a and b, as defined by the accept predicate in Example 5.25.]

It follows that spec P (gfpWP ) =

member(u, u : v).

member(u, x : y : z)←member(u, y : z).

is equivalent to P .

The following example indicates the usefulness of program specialisation.

Example 5.25 Consider the following non-deterministic program P .

inter(x)← accept(1, x), accept(3, x).

accept(1, a : x)← accept(1, x).

accept(1, a : x)← accept(2, x).

accept(2, b : nil).

accept(3, a : x)← accept(4, x).

accept(4, b : x)← accept(4, x).

accept(4, b : nil).

The predicate accept defines two nondeterministic finite automata as shown in Figure 5.1. The

predicate inter defines a language as the intersection of the regular languages accepted by the two

automata. Using the canonicised depth 3 abstraction scheme, gfp WP = (ws, wf ) where

ws = { inter(a : b : nil),

accept(1, a : a : x), accept(1, a : b : nil), accept(2, b : nil),

accept(3, a : b : x), accept(4, b : b : x), accept(4, b : nil) }.


So spec P (gfp WP ) =

inter(a : b : nil)← accept(1, a : b : nil), accept(3, a : b : nil).

accept(1, a : a : a : x)← accept(1, a : a : x).

accept(1, a : a : b : nil)← accept(1, a : b : nil).

accept(1, a : b : nil)← accept(2, b : nil).

accept(2, b : nil).

accept(3, a : b : b : x)← accept(4, b : b : x).

accept(3, a : b : nil)← accept(4, b : nil).

accept(4, b : b : b : x)← accept(4, b : b : x).

accept(4, b : b : nil)← accept(4, b : nil).

accept(4, b : nil).

This is a specialised, deterministic version of P , equivalent to P . The program can be simplified

by straightforward means, by removing all ground bodies by unfolding. The query ←inter(x) no

longer requires backtracking; in fact the only answer can be produced in one derivation step.

Let us finally discuss termination properties of the dataflow analyses presented in this section.

Lemma 5.26 Assume our lexicon is finite.

1. The canonicised depth k abstraction schemes are both ascending and descending chain finite.

2. The canonicised singleton abstraction scheme is ascending chain finite.

Proof: (1) This follows from the fact that every canonicised depth k abstraction scheme is finite.

(2) It is well-known that Atom⊳ is Noetherian, given a finite lexicon (see Reynolds’s [85]

Theorem 5). Since, by definition, each atom abstraction in the canonicised singleton scheme has

at most n elements, where n is the number of predicate symbols, the assertion follows.

Proposition 5.27 Let P be a program.

1. If S is a canonicised depth k abstraction scheme, then the ascending and descending Kleene

sequences for WP are finite.

2. If S is the canonicised singleton abstraction scheme, then the ascending Kleene sequence for

WP is finite.

Proof: The assertions follow from the definition of WP and Lemma 5.26.


5.3 Applications and related work

We have presented two dataflow analyses and argued their correctness with respect to the semantic

function B. The question remains, however, what this means if one assumes another underlying

semantics, such as SLDNF resolution [10, 49]. In this section we justify our choice of semantics

by showing that it is, in a precise sense, the most abstract of a number of possible semantics for

normal logic programs. In particular, the semantics we consider are: the set of logical consequences

(in three-valued logic) of a program’s completion, SLDNF resolution, and the standard Prolog

semantics. This means that an analysis based on B is automatically correct with respect to all

these semantics.

Fitting’s proposal is not the only suggestion to base a semantic definition for logic programs on

many-valued logic. Mycroft [74] was the first to discuss this possibility and its advantages. Other

proposals have followed. It is beyond the scope of this chapter to provide detailed motivation and

definition for the various semantics. Readers are referred to the original sources for details.

Kunen [47] has advocated a declarative semantics for normal programs, in which the meaning of

a program is the set of logical consequences in three-valued logic of the program’s Clark completion.

An alternative, operational definition of this semantics is also given in terms of UP . We now recall

this definition.

As noted in Section 5.1, any consistent interpretation u may be viewed as a mapping from

ground atoms to the truth values true, false, or ⊥. Let Form denote the set of closed for-

mulas over the given alphabet. Then the mapping can be extended in the natural way [47]

to a mapping Form → {true, false, ⊥}. Let consequences u be the set of closed formulas

mapped to true by this extension. Let the function lcons : P Interp → P Form be defined by

lcons Z = ⋃ (∆ consequences Z) and let Lcon = P (lcons {u ∈ Interp | cons u}). Ordered by

subset ordering, Lcon is a complete lattice.

Definition. The (three-valued logic) semantic function L : Prog → Lcon is defined by

L [[P ]] = ∆ lcons {UP ↑ n | n ∈ ω}.

Another natural semantics for normal logic programs is given in terms of SLDNF derivations

[10, 49]. Writing the application of a substitution θ to a syntactic object S as (θ S) and using ∀ and ∃ for universal and existential closure respectively, we can define the SLDNF semantics

as follows:

Definition. The (SLDNF) semantic function S : Prog → Lcon is defined by

S [[P ]] = {¬ ∃G | P ∪ {G} has a finitely failed SLDNF tree} ∪

{∀(θ G) | θ is a computed (SLDNF) answer for P ∪ {G}}.


Shepherdson [90] has shown that SLDNF resolution is sound with respect to the Clark completion

of a program in three-valued logic. Therefore, using the identity function on Lcon as the concretization function, we have the following result.

Proposition 5.28 L ∝ S.

Standard Prolog may be considered as a restricted form of SLDNF resolution in which the compu-

tation rule is left-to-right and a depth-first search rule is used. We assume that standard Prolog is

“sound” in that unification with the occur check is performed and that when a non-ground negative

literal is selected, the derivation halts with an undefined value.

Definition. The (Prolog) semantic function P : Prog → Lcon is defined by

P [[P ]] = {¬ ∃G | P ∪ {G} fails finitely} ∪

{∀(θ G) | θ is a computed answer for P ∪ {G}}.

Since every derivation constructed by standard Prolog is an SLDNF derivation, it is apparent that

the following holds (again using the identity function on Lcon as the concretization function).

Proposition 5.29 S ∝ P.

We finally look at the relationship between the base semantics and L. Let γ : Interp → Lcon be

defined by γ u = lcons {u′ | u′ ≤ u ∧ cons u′}.

Proposition 5.30 (γ, Interp, Lcon) is an insertion.

Proof: Clearly γ is monotonic and co-strict. We now show that γ is injective. Let u = (us, uf ) and u′ = (u′s, u′f ) be distinct elements of Interp. Since u ≠ u′, either us ≠ u′s or uf ≠ u′f . If us ≠ u′s then, for reasons of symmetry, we can assume that there is some A ∈ us \ u′s. Then A ∈ γ u, while A ∈ γ u′ cannot hold. Thus γ u ≠ γ u′. Similarly if uf ≠ u′f then by symmetry we can assume that there is some A ∈ uf \ u′f . Then ¬A ∈ γ u, but ¬A ∉ γ u′, and again γ u ≠ γ u′. Thus γ is injective.

Proposition 5.31 B ∝ L.

Proof: Let P be a program and assume u ∈ {UP ↑ n | n ∈ ω}. Then u = (UP ↑ n) for some n ∈ ω,

and u is consistent. Since u ≤ lfp UP , it follows that u ∈ {u′ | u′ ≤ lfp UP ∧ cons u′}. We therefore

have that ∆ lcons {UP ↑ n | n ∈ ω} ⊆ ∆ lcons {u′ | u′ ≤ lfp UP ∧ cons u′}. So L [[P ]] ⊆ γ (B [[P ]]).

Since P was arbitrary, B ∝ L.

Since ∝ is transitive we have the following theorem showing that B is the most abstract of the

standard semantics we have considered.


Theorem 5.32 B ∝ X holds for all X ∈ {L,S,P}.

Corollary 5.33 A dataflow analysis based on B is correct with respect to L, S, and P.

The above result is very satisfactory, but there is a price to be paid for the very abstract semantics.

As discussed in Chapter 4, more abstraction implies less precision in dataflow analyses. Analysing

a Prolog program, for example, one can make more precise statements about runtime behaviour

by using the knowledge that the computation rule is left-to-right. The abstract semantics B does

not make this assumption. It is, however, possible to capture the left-to-right computation rule

in a fixpoint characterisation such as B’s. This can be done simply by changing the definitions of

the “makes true” and “makes false” relations so they correspond to conjunction in McCarthy logic

[51]. For example, u should make L ∧ L′ false iff u makes L false, or u makes L true and L′ false (the full truth tables are displayed below).

By slightly more complicated means, one can capture Prolog’s search rule using the asymmetric

“∨” of McCarthy logic. It is outside the scope of the present thesis to detail this, but we note that

such a semantics would be safely approximated by B, that is, any dataflow analysis based on B

would be correct with respect to it.
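For reference, the McCarthy connectives evaluate their left operand first, so that

true ∧ L′ = L′        false ∧ L′ = false        ⊥ ∧ L′ = ⊥
true ∨ L′ = true      false ∨ L′ = L′           ⊥ ∨ L′ = ⊥

for all L′. In particular false ∧ ⊥ = false whereas ⊥ ∧ false = ⊥; it is exactly this asymmetry that lets a fixpoint characterisation mirror a left-to-right computation rule.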

We have shown how the information provided by type inference may be used for program

specialisation or to highlight errors in a program. Our use of the analyses has pointed to the

possibility of automated program specialisation that is independent of particular queries. In a rather

different context, one can imagine bottom-up dataflow analysis being used for query optimisation

in deductive databases. A bottom-up analysis is very natural for this since it corresponds to the

operational semantics of deductive databases.

However, for many applications it would be useful to describe arbitrary atoms, not just ground

atoms as was done in this chapter. For example, information about the interdependencies of term

groundness is useful for query handling in deductive databases, see for example Dart [15]. Such

applications presuppose a semantic definition based on the handling of arbitrary terms. That

is, the definitions should give all logical consequences of a program, rather than just the ground

consequences (over some alphabet). Definitions of this kind have been suggested [24, 64], though

only for the class of definite programs, and their use as a basis for abstract interpretation has been

investigated by Barbuti, Giacobazzi and Levi [4] and by Marriott and Søndergaard [64]. Program

specialisation has also been studied by Gallagher et al. [27] who use OLDT resolution [94] as

a semantic basis.


Chapter 6

Top-Down Analysis of Definite Logic

Programs

In this chapter we consider top-down dataflow analysis of logic programs. We show how abstract

interpretation can be used to develop useful dataflow analyses that for example may help a Prolog

compiler improve its code generation.

The analyses sketched in Section 5.2 are based on a “bottom-up,” or “forward chaining” seman-

tics. We have seen how they could be used for a kind of type inference, and how this in turn might

be useful for program specialisation and detection of programming errors. However, a limitation

of the semantics of Chapter 5 is that it does not give any information about how variables would

actually become bound during execution of a program. Most compiler “optimisations” depend on

knowledge about variable bindings at various stages of execution, so to support such transforma-

tions, we have to develop semantic definitions that manipulate substitutions. Furthermore, a typical

Prolog compiler, say, is based on an execution mechanism given by SLD resolution—the semantics

we gave in Chapter 5 lacks the SLD model’s notion of “calls” to clauses.

We therefore need to base our definitions on a formalisation of SLD resolution. Such a formal-

isation is given in Section 6.1. However, as we shall explain, the SLD semantics itself is not very

useful as a basis for dataflow analysis. We therefore develop denotational definitions that are better

suited. In Section 6.2 we introduce “parametric substitutions” as the natural semantic domain for

such definitions and give the definition of a “base” semantics. In Section 6.3 we transform the

semantic definition into a generic definition of a wide range of dataflow analyses. The definition

is generic in the sense that it remains uncommitted as to what constitute “substitution approx-

imations” and how to “compose” and “unify” them. As an example of a dataflow analysis that

can be cast as an instance of the generic definition, we show in Section 6.4 how to extract runtime

groundness information from a program at compile time. Finally, in Section 6.5, related work is

discussed and other applications of the presented (or a very similar) framework are listed.


6.1 SLD semantics

Unlike the case in Chapter 5, the semantic definitions given in this chapter apply to definite

programs only. We assume the standard left-to-right computation rule—this is what we mean

by “pure Prolog.” While it is perfectly possible to give a mathematical definition of the (top-

down) semantics of normal logic programs, there are diverging suggestions as to what exactly the

semantics should be. Since furthermore the handling of negation would call for much overhead and

thus obscure the central ideas (about dataflow analysis as non-standard semantics), we will restrict

attention to definite programs.

The definition in Section 5.1 is useful as a basis for dataflow analyses that yield approximations

to a program’s success set (and finite failure set). It is, however, not very useful as a basis for a

number of the dataflow analyses that designers of compilers often are interested in. Interpreters

and compilers for logic programming languages are usually based on SLD resolution as execution

mechanism. The previous definition does not capture SLD resolution, because it has no notion of

“calls” to clauses, as has the SLD model. In the SLD model, the first thing that takes place when

a clause is called is a unification of the calling atom and the clause’s head. This unification is an

important target for a compiler’s attempts to generate more efficient code, because the general uni-

fication algorithm is expensive, and most calls only need very specialised versions of the algorithm.

(There are many transformations other than unification specialisation that we are interested in,

and which our framework will support, but this serves as sufficient motivation for incorporating

information about calls in the semantic definition).

We assume the same syntax as in Chapter 5, except that all literals are now positive. We

let Q ∈ Pred , F ∈ Fun, V ∈ Var , where Pred , Fun , and Var denote the syntactic categories of

predicate symbols, functors, and variables, respectively. The set Var is assumed to be countably

infinite. The syntactic categories Prog , Clause, Atom, and Term are as usual. We assume that we

are given a function vars : (Prog ∪Atom∪Term)→ P Var , such that (vars S) is the set of variables

that occur in the syntactic object S. As usual, we use italic capital letters for meta-variables.

A program denotes a computation in a domain of substitutions. A substitution is an almost-

identity mapping θ ∈ Sub ⊆ Var → Term from the set of variables Var to the set of terms over

Var . Substitutions are not distinguished from their natural extensions to Atom → Atom. Our

notation for substitutions is standard. For instance {x ↦ a} denotes the substitution θ such that (θ x) = a and (θ V ) = V for all V ≠ x. We let ι denote the identity substitution. The functions

dom, rng, vars : Sub → P Var are defined by

dom θ = {V | θ V ≠ V }
rng θ = ⋃_{V ∈ dom θ} vars (θ V )
vars θ = (dom θ) ∪ (rng θ).
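For instance, if θ = {x ↦ f(y), z ↦ y}, then dom θ = {x, z}, rng θ = {y}, and vars θ = {x, y, z}.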

A unifier of A,H ∈ Atom is a substitution θ such that (θ A) = (θ H). A unifier θ of A and H is an

(idempotent) most general unifier of A and H iff θ′ = θ′ ◦ θ for every unifier θ′ of A and H. The


auxiliary function mgu : Atom → Atom → P Sub is defined as follows. If A and H are unifiable,

then (mgu A H) yields a singleton set consisting of a most general unifier of A and H. Otherwise

(mgu A H) = ∅. The function restrict : P Var → Sub → Sub is defined by

restrict U θ V = if V ∈ U then θ V else V.
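To make these operations concrete, the following small Python sketch implements the term language and unification with occur check. The representation (a variable as a string, any other term as a pair of a functor and a tuple of arguments) and all function names are ours, not part of the formal development; mgu returns an empty or singleton list, mirroring the set-valued mgu above.

# A variable is a string; any other term is (functor, (arg1, ..., argn)).
# A substitution is a dict from variables to terms, kept in triangular form.

def is_var(t):
    return isinstance(t, str)

def walk(t, theta):
    # Follow bindings until an unbound variable or a non-variable term is reached.
    while is_var(t) and t in theta:
        t = theta[t]
    return t

def occurs(v, t, theta):
    t = walk(t, theta)
    return t == v if is_var(t) else any(occurs(v, a, theta) for a in t[1])

def unify(t1, t2, theta):
    # Extend theta to a unifier of t1 and t2; None signals failure.
    t1, t2 = walk(t1, theta), walk(t2, theta)
    if is_var(t1) and t1 == t2:
        return theta
    if is_var(t1):
        return None if occurs(t1, t2, theta) else {**theta, t1: t2}
    if is_var(t2):
        return unify(t2, t1, theta)
    (f, fargs), (g, gargs) = t1, t2
    if f != g or len(fargs) != len(gargs):
        return None
    for a, b in zip(fargs, gargs):
        theta = unify(a, b, theta)
        if theta is None:
            return None
    return theta

def resolve(t, theta):
    # Apply a triangular substitution exhaustively.
    t = walk(t, theta)
    return t if is_var(t) else (t[0], tuple(resolve(a, theta) for a in t[1]))

def mgu(a, h):
    # Yields [] or a singleton list holding an idempotent most general unifier.
    theta = unify(a, h, {})
    return [] if theta is None else [{v: resolve(theta[v], theta) for v in theta}]

def restrict(U, theta):
    # Corresponds to restrict above: forget bindings of variables outside U.
    return {v: t for v, t in theta.items() if v in U}

# For example, unifying p(x, f(x)) with p(a, y):
# mgu(('p', ('x', ('f', ('x',)))), ('p', (('a', ()), 'y')))
#   == [{'x': ('a', ()), 'y': ('f', (('a', ()),))}]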

We shall also be interested in renamings. A renaming is a bijection ρ ∈ Ren ⊂ Var → Var which satisfies ρ = ρ⁻¹. We do not distinguish a renaming from its natural extension to atoms and clauses.

We assume that we are given a function ren : P Var → P Var → Ren for generating renamings:

(ren U W ) is some renaming, such that for all V in W , ren U W V ∉ U . Note that this is renaming

of variables in W “away from” U . We shall primarily use this function for the renaming of clauses

(by natural extension): let the function rename : P Var → Clause → Clause be defined by

rename U C = ren U (vars C) C.
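For instance, with U = {x, y} and C = p(x)← q(x, z), (rename U C) may be p(x1)← q(x1, z1) for fresh variables x1 and z1 not in U; exactly which variant is produced is immaterial, as long as its variables avoid U.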

Our definition assumes the standard (left-to-right) computation rule. It is given in a form that

should make it clear that it is equivalent to the usual SLD model [2, 49]. The main difference from

the standard definition is that we express the SLD semantics using a fixpoint characterisation. We

let :: denote concatenation of sequences. Clauses are implicitly universally quantified, so each call

of a clause should produce new incarnations of its variables. This is where the function rename

is needed: the generated clause instance will contain no variables previously met during execution

(those in θ) or to be met later (those in A : G). As usual, we think of a program as a set of clauses,

and of goals and bodies as sequences of atoms. The domain P Sub is ordered by subset ordering,

the other domains are ordered pointwise.

Definition. The SLD semantics has semantic domains

Env = Atom∗ → P Sub

Sem = Sub → Env

and semantic functions

O : Prog → Env

O′ : Prog → Sem → Sem

C′ : Clause → Sem → Sem.

It is defined by

O [[P ]] G = restrict (vars G) (lfp (O′ [[P ]]) ι G)
O′ [[P ]] = ⊔_{C∈P} (C′ [[C]])
C′ [[C]] s θ nil = {θ}
C′ [[C]] s θ (A : G) = let U = (vars θ) ∪ (vars (A : G)) in
                       let [[H ← B]] = rename U C in
                       let M = mgu A H in
                       ⋃_{µ∈M} s (µ ◦ θ) (µ (B :: G)).


By this definition, finite failure is not distinguished from non-termination. For example, the empty

program and the program consisting only of the clause p ← p have the same denotation. This

reflects the fact that the statements we are interested in generating from a dataflow analysis are of

the form “whenever execution reaches this point, so and so holds.” In saying so, we do not actually

say that execution does reach the point. In particular, “whenever the computation terminates, so

and so holds” concludes nothing about termination. As non-termination is not distinguished from

finite failure, we can assume a parallel search rule, rather than the customary depth-first rule—this

simplifies our task. Owing to the use of a parallel search rule, the execution of a program naturally

yields a set of answer substitutions, as opposed to a sequence.

Example 6.1 Consider the following list concatenation program P :

append(nil, y, y).

append(u : x, y, u : z)← append(x, y, z).

and let A = append(x, y, a : nil). Execution of P yields two instantiations of the variables in A.

We have that O [[P ]] (A : nil) = {{x ↦ nil, y ↦ a : nil}, {x ↦ a : nil, y ↦ nil}}.

A note about semantic modelling and dataflow analysis may be appropriate at this point. We

have defined O such that O, given a program P and a goal G, returns the set of computed answer

substitutions Θ. That is, O selects what is considered the relevant information from the SLD tree

for P ∪ {G}, namely for each success leaf, the composition of the substitutions that decorate the

path from the success leaf to the root. Other information is “forgotten.” For example, (O [[P ]] G)

contains no information about the length of derivations, the substitutions that decorate paths

leading to failure nodes, or how variables outside G become bound during execution.

This is quite in accordance with the common understanding of SLD semantics qua the result

of applying SLD resolution. However, as we earlier discussed, dataflow analysis aims at extracting

information about a program’s execution, that is, about the very details that our “SLD semantics”

forgets. For example, a dataflow analysis that aims at determining which calls may appear at

runtime cannot afford to disregard those paths in the SLD tree that lead to failure. This is why

we need a notion of “extended semantics,” as discussed at the end of Section 3.2.

One might argue that (O [[P ]] G) should really return the whole SLD tree for P ∪{G}, so as to

constitute the ultimate extended semantics with respect to which all dataflow analyses were to be

justified. However, since we do not want to commit ourselves to an operational semantics at that

level of detail, it seems best to take O as point of departure and extend it to “collect” whatever

extra information we may want on a case by case basis.

In this context we are specifically interested in the calls that may occur during the execution

of P . In the extended semantics given below, a “repository” is therefore maintained, consisting of

all the calls that occur during execution.¹ Thus the domain of repositories is P Atom, ordered by

subset ordering. Other domains are ordered pointwise.

¹A repository is similar to what is sometimes called a “context vector” [13], a “log” [43, 92], or a “record” [59].


Definition. The SLD call semantics has semantic domains

Repos = P Atom

Env = Atom∗ → Repos

Sem = Sub → Env

and semantic functions

Ocall : Prog → Env

O′call : Prog → Sem → Sem

C′call : Clause → Sem → Sem.

It is defined by

Ocall [[P ]] G = lfp (O′call [[P ]]) ι G
O′call [[P ]] = ⊔_{C∈P} (C′call [[C]])
C′call [[C]] s θ nil = ∅
C′call [[C]] s θ (A : G) = let U = (vars θ) ∪ (vars (A : G)) in
                           let [[H ← B]] = rename U C in
                           let M = mgu A H in
                           {A} ∪ ⋃_{µ∈M} (s (µ ◦ θ) (µ (B :: G))).

This semantics is still not particularly suitable as a basis for dataflow analysis, since termination of

the analyses cannot easily be guaranteed. To see this, assume that we are interested in determining

which variables are bound to ground terms in calls to clauses. As descriptions we choose sets U of

variables, with the intention that V ∈ U means that the current substitution definitely binds V to

ground terms. Now consider the following program:

ancestor(x, z)← parent(x, z).

ancestor(x, z)← ancestor(x, y), ancestor(y, z).

parent(a, b).

Assume we are given the goal ←ancestor(a, z). An analysis based on the SLD semantics must

compute an infinite number of goal/description pairs, including

(ancestor(x, z), {x})

(ancestor(x, y), ancestor(y, z), {x})

(ancestor(x,w), ancestor(w, y), ancestor(y, z), {x})...

Clearly such a dataflow analysis will not terminate. What we need is a semantic definition that

somehow “merges” information about all incarnations of a variable that appears in a program and

is compositional in the sense that the denotation of a goal is given in terms of the denotation of its

subgoals. In Section 6.2 we develop such a definition.


6.2 A semantics based on parametric substitutions

In this section we develop a theory of “substitutions” which differs somewhat from classical sub-

stitution theory. There are a number of reasons for doing this. First, classical substitutions have

some drawbacks when used in a denotational definition. Most general unifiers are not unique, even

when idempotent. For example, unifying p(x) and p(y), it is not clear whether the result should

be {x ↦ y} or {y ↦ x}, and it is hard to guarantee that a semantic definition is invariant un-

der choice of unification function. Second, renaming traditionally causes problems. Most existing

approaches either ignore the problem by postulating some magical (invisible or nondeterministic)

renaming operator, or they commit themselves to a particular renaming technique (as do Jones

and Søndergaard [43]).

As an example of the problem, consider the query ←p(x) and the program consisting of the

single clause p(x). By the definition of Jones and Søndergaard the answer substitution is {{x ↦ x1}}, but one could argue that {{x ↦ x17}}, {{x ↦ z}}, or even {ι}, would be just as appropriate. It

would be better if substitutions would somehow “automatically” perform renaming, preferably in

a way that made no assumptions about renaming technique. Furthermore, most dataflow analyses

merge information about the various incarnations of variables anyway, so a specific “renaming

function” should be avoided, if possible.

Our denotational definition is based on a notion of “parametric substitutions.” These have

previously been studied by Maher [52], although for a different purpose and under the name “fully

parameterized substitutions.” We follow Maher in distinguishing between variables, whose names

are significant, and parameters, whose names are insignificant.

Recall that Var is a countably infinite set of variables. We think of Var , Term , and Atom

as syntactic categories: the variables that appear in a program are assumed to be taken from

Var . In contradistinction to Var , Par is a countably infinite set of parameters. Both variables

and parameters serve as “placeholders,” but it proves useful to maintain the distinction: Var and

Par are disjoint. A one-to-one correspondence between the two is given by the bijective function

ǫ : Var → Par . In examples we distinguish variables from parameters by putting primes on the

latter.

The set of terms over Par is denoted by Term [Par ], and the set of atoms over Par is denoted

by Atom [Par ]. A substitution into Par is a total mapping σ : Var → Term[Par ] which is “finitely

based” in the sense that all but a finite number of variables are mapped into distinct parameters.

We do not distinguish substitutions into Par from their natural extension to Atom → Atom[Par ].

The set of substitutions into Par is denoted Sub[Par ].

Let us for the time being denote Atom by Atom[Var ] to stress its distinction from Atom[Par ],

and let X be either Var or Par . We define the standard preordering ⊳ on Atom[X] by A ⊳ A′ iff

∃ τ : X → Term[X] . A = (τ A′). The standard preordering clearly induces an equivalence relation

on Atom[Var ] (or Atom[Par ]) which is “consistent variable (or parameter) renaming.” We denote

the resulting quotient by Atom[X]⊳ and use ⊳ to denote the induced partial ordering on Atom[X]⊳


as well. Note that Atom [X]⊳ has neither a least, nor a greatest, element.

For an atom A in an equivalence class 𝒜 ∈ Atom[X]⊳, we may denote 𝒜 by [A]. In fact we think of [·] as the function from Atom[X] to Atom[X]⊳ that, given A, yields the equivalence class of A.

A parametric substitution is a mapping from Atom to Atom[Par ]⊳. We will not allow all such

mappings, though: the mappings we use are “essentially” substitutions. More precisely, the set

Psub ⊆ Atom → Atom[Par ]⊳ of parametric substitutions is defined by

Psub = {[·] ◦ σ | σ ∈ Sub[Par ]}.

That is, application of a parametric substitution can be seen as application of a substitution that

has only parameters in its range, followed by taking the equivalence class of the result. We equip

Psub with a partial ordering ≤, namely pointwise ordering on {[·] ◦ σ | σ ∈ Sub[Par ]}. Whenever

convenient we will think of Psub and Atom[Par ]⊳ as having artificial least elements ⊥Psub and

⊥Atom added (such that ⊥Psub A = ⊥Atom). We then have the following result [52].

Proposition 6.2 Psub is a complete lattice.

We use a notation for parametric substitutions similar to that used for substitutions; however, square rather than curly brackets are used, to distinguish the two. For example, the parametric substitution π = [x ↦ x′, y ↦ f(x′)] will map p(x, y, z) to [p(x′, f(x′), z′)]. So π corresponds to the substitution {y ↦ f(x)}. An alternative way to think of π is as an existentially quantified (over parameters) conjunction: ∃x′ . x = x′ ∧ y = f(x′) [52].

Definition. The function β : Sub → Psub maps a substitution to the corresponding parametric

substitution. It is defined by β θ A = [ǫ (θ A)].

So every substitution θ has a corresponding parametric substitution (β θ). However, there are

parametric substitutions that are not in the range of β. For example, there is no substitution θ

such that (β θ) = [x ↦ f(y′, z′)].

We now make use of parametric substitutions to give a semantic definition which is suitable as a basis for abstract interpretation and which, in a sense we make precise, captures SLD resolution.

The definition makes use of the auxiliary functions meet, which corresponds to composition, and unify, which corresponds to computing the most general unifier.²

²“Corresponds” should be taken in a loose sense: meet, unlike composition, is commutative, and unify is non-commutative.


Definition. The function meet : P Psub → P Psub → P Psub is defined by

meet Π Π′ = {π ⊓ π′ | π ∈ Π ∧ π′ ∈ Π′} \ {⊥Psub}.

The function unify : Atom → Atom → P Psub → P Psub is defined by

unify H A Π = { ⊔ {π′ | π′ H ⊳ π A} | π ∈ Π } \ {⊥Psub}.

The auxiliary functions are defined to operate on sets of parametric substitutions, rather than

single elements, because in the next section it proves useful to have the broader definitions.

Example 6.3 Let π = [y ↦ y′, z ↦ f(y′)] and π′ = [y ↦ a]. Then

meet {π} {π′} = {[y ↦ a, z ↦ f(a)]}.

Example 6.4 Let A1 = p(a, x), A2 = p(y, z), and let π = [y ↦ y′, z ↦ f(y′)]. Then we have that (unify A1 A2 {π}) = {[x ↦ f(a)]}. Notice that this parametric substitution does not constrain y or z: only variables in A1 may be constrained. On the other hand we have that (unify A2 A1 {π}) = {[y ↦ a]}, in which both x and z are unconstrained. Notice also that (unify A1 A1 {π}) = {ǫ}, while (unify A2 A2 {π}) = {π}.

Let A3 = p(f(y), z). Then (unify A3 A2 {ǫ}) = {ǫ}. There is no “occur check problem,” because the names of placeholders in (π A2) have no significance. One can think of this as automatic renaming being performed by unify.

Finally let A4 = p(x, x). Then (unify A4 A2 {π}) = ∅, corresponding to failure of unification.

Lemma 6.5 The functions meet and unify are continuous.

The idea behind unify should be clear from this example: not only does the function unify perform

renaming “automatically,” it also restricts interest to certain variables, namely those of its first

argument.

The following lemma captures the relationship between unify and mgu. Let (dom π) denote the

domain of the parametric substitution π, that is,

dom π = {V ∈ Var | π V ∉ Par ∨ ∃V ′ ∈ Var . (V ≠ V ′ ∧ π V = π V ′)}.

Let (restr U π) be the parametric substitution that acts like π when applied to variables in U ,

but maps variables outside U to distinct parameters. Let restrict U = ∆ (restr U). Recall that

renamings were defined in Section 6.1.

Lemma 6.6 Let A,H ∈ Atom and Π ⊆ Psub. Let ρ = ren ((vars H)∪(∆ dom Π)) (vars A). Then

unify H A Π = restrict(vars H) (meet{π ◦ ρ | π ∈ Π} (∆ β (mgu H (ρ A)))).


The following definition³ captures the essence of an SLD refutation-based interpreter using a stan-

dard computation rule and a parallel search rule. Recall that (Σ d) is the folded version of d. The

domain Atom is ordered by identity, Den is ordered pointwise.

Definition. The base semantics has semantic domain

Den = Atom → P Psub → P Psub

and semantic functions

Pbas : Prog → Den

P′bas : Prog → Den → Den

Cbas : Clause → Den → Den.

It is defined as follows.

Pbas [[P ]] = lfp (P′bas [[P ]])
P′bas [[P ]] = ⊔_{C∈P} (Cbas [[C]])
Cbas [[H ← B]] d A Π = ⋃_{π∈Π} meet {π} (unify A H (Σ d B (unify H A {π}))).

The idea behind the relationship between SLD and the base semantics is that whenever θ belongs

to (O [[P ]] (A : nil)) then restrict(vars A) (β θ) belongs to (Pbas [[P ]] A {ǫ}). However we are

interested in performing an analysis for many goals at once and so require the existence of a

somewhat stronger relationship.

Definition. Define cong : Env ×Den → Bool by cong (e, d) iff

∀A ∈ Atom . ∀Θ ⊆ Sub . ⋃_{θ∈Θ} {θ′ (θ A) | θ′ ∈ e (θ (A : nil))} ⊆ {π A | π ∈ d A (∆ β Θ)}.

Theorem 6.7 For all programs P , cong ((O [[P ]]), (Pbas [[P ]])).

A proof of this theorem is given in Appendix A.

It is straightforward to rewrite the above definition so that it gives information about calls, just

as we rewrote O to Ocall. A proof very similar to that of Theorem 6.7 can be given to show that

a similar relationship holds between the “call” versions of O and Pbas.

³Strictly speaking, the singleton set constructor {·} as used in the definition is not part of our meta-language, and it is commonly avoided in denotational definitions as it is not monotonic. Its use here causes no problems: the semantic functions are well-defined. Subsequent definitions will avoid using {·} so as to be able to utilize Proposition 3.14.


6.3 A dataflow semantics for definite logic programs

The semantic definition given in the previous section was designed to capture the essence of SLD

resolution. In this section we develop a generic dataflow analysis scheme from these definitions by

factoring out certain operations.

First we introduce some imprecision in the semantics. So far, for all substitutions generated

by a clause, track was kept of the particular substitution the clause was called with, so that the

“meet” of generated substitutions and call substitutions could be computed. We now abandon

this approach in order to get closer to a dataflow semantics. The point is that, in a dataflow

semantics, we want a “description” to replace Π, the set of parametric substitutions, and since we

think of descriptions as being atomic (in the sense of non-decomposable), we need to avoid reference

to elements of Π. We thus replace “⋃_{π∈Π} . . . π . . .” from the definition of the base semantics by “. . . Π . . .”.

Definition. The lax semantics has semantic functions

Plax : Prog → Den

Clax : Clause → Den → Den.

It is defined as follows.

Plax [[P ]] = lfp (⊔_{C∈P} (Clax [[C]]))
Clax [[H ← B]] d A Π = meet Π (unify A H (Σ d B (unify H A Π))).

Proposition 6.8 Plax ∝ Pbas.

Proof: Let Π be a set of parametric substitutions and let F : P Psub → P Psub be monotonic. If Π = ∅ then trivially

⋃_{π∈Π} ⋃_{π′∈F {π}} (π′ ⊓ π) ⊆ ⋃_{π∈Π} ⋃_{π′∈F Π} (π′ ⊓ π). (6.1)

Otherwise consider some π ∈ Π. By monotonicity, F {π} ⊆ F Π, so π′ ∈ F {π} implies π′ ∈ F Π, and every term π′ ⊓ π on the left-hand side of (6.1) also occurs on the right-hand side. It follows that (6.1) holds for arbitrary Π and thus

⋃_{π∈Π} meet {π} (F {π}) ⊆ meet Π (F Π).

Letting the concretization function be the identity mapping, and setting

F = λΠ . unify A H (Σ d B (unify H A Π)),

we have that Clax ∝ Cbas. By Proposition 3.14 and the continuity of meet and unify , the assertion

follows.


It should be clear that we can “laxify” the “call version” of the semantics just as we have done

for the base semantics. To limit the number of semantic definitions, however, we shall not do

this and will ignore the call semantics for the remainder of this chapter. Readers should keep in

mind, however, that ultimately the touchstone for the correctness of a dataflow analysis returning

information about call patterns is its soundness with respect to the call semantics.

To extract runtime properties of pure Prolog programs one can develop a variety of non-standard

interpretations of the preceding semantics. To clarify the presentation, we extract from the lax

semantics a dataflow semantics which contains exactly those features that are common to all the

non-standard interpretations that we want. It leaves the interpretation of one domain X and two

base functions, m and u unspecified. These missing details of the dataflow semantics are to be

filled in by interpretations. In the standard interpretation of this semantics, Ilax, X is P Psub, m is

meet and u is unify . In a non-standard interpretation, X is assigned whatever set of “descriptions”

we choose to approximate sets of parametric substitutions. X should thus be a complete lattice

which corresponds to P Psub in the standard semantics in a way laid down by some insertion

(X, γ,P Psub). The interpretation of m and u should approximate meet and unify respectively.

Prog , Clause, and Atom are static and have the obvious fixed interpretation. They are ordered by

identity. As usual, Den is ordered pointwise.

Definition. The dataflow semantics has domain

Den = Atom → X → X,

semantic functions

P : Prog → Den

C : Clause → Den → Den,

and base functions

m : X → X → X

u : Atom → Atom → X → X.

It is defined as follows.

P [[P ]] = lfp (⊔_{C∈P} (C [[C]]))
C [[H ← B]] d A x = m x (u A H (Σ d B (u H A x))).

An interpretation Ix of the dataflow semantics is determined by the triple (Ix X, Ix m, Ix u). We

use the convention that the semantic function P as determined by an interpretation Ix is denoted

by Px. For example the standard interpretation Ilax is given by (P Psub,meet , unify) and the

corresponding semantics is denoted by Plax.

Since C [[C]] is monotonic for every interpretation, we have the following proposition.


Proposition 6.9 For every interpretation Ix, Px is well-defined.

Definition. Let I = (X,m, u) and I′ = (X ′,m′, u′) be interpretations. Then I′ is sound with

respect to I iff for some insertion (γ,X ′,X), m′ appr m and u′ appr u.

The next proposition follows immediately from Proposition 3.14.

Proposition 6.10 If interpretation Ix is sound with respect to Iy, then Px ∝ Py.

By transitivity of ∝ and Proposition 6.8 we therefore have the following result.

Theorem 6.11 If interpretation Ix is sound with respect to Ilax, then Px ∝ Pbas.

Developing a dataflow analysis in this framework is therefore a matter of choosing the description

domain so that it captures the information required from the analysis and then defining suitable

functions to approximate meet and unify. Before giving example dataflow analyses we identify two

classes of description domain and indicate how meet can be approximated for these classes. This

is useful because the descriptions used in most dataflow analyses belong to one of the two classes

or are an orthogonal mixture of such descriptions. Thus finding generic ways to approximate meet

for these classes will help to give some insight into the design of practical dataflow analyses. We

note that once a suitable approximation for meet has been found, then, by Lemma 6.6, it may be

used as the basis for developing an approximation to unify .

Notice that Pbas cannot be defined as an interpretation. By introducing the lax semantics

we have achieved simplicity but also imprecision, to the extent that the base semantics cannot

be recovered. Jones and Søndergaard have shown how, by making the dataflow semantics more

complex (by not basing it on a lax semantics), a framework can be obtained in which the base

semantics itself can be given as an interpretation [43]. Such a framework clearly allows for more

precise dataflow analyses than the framework presented in this chapter.

Definition. An insertion (γ,X,P Psub) is downwards closed iff

∀x ∈ X . ∀π, π′ ∈ Psub . π ∈ γ x ∧ π′ ⊳ π ⇒ π′ ∈ γ x.

Dually we can define upwards closure.

Examples of downwards closed insertions are those typically used in groundness analysis, type

analysis, definite aliasing and definite sharing analysis (for references see Section 6.5). Examples of

upwards closed insertions are those typically used in possible sharing (independence) analysis, pos-

sible aliasing analysis, and freeness analyses. Many complex analyses, such as for the determination

of mode statements or the detection of and-parallelism, can often be expressed as a combination

of simpler analyses based on insertions that are downwards or upwards closed. A notion of “sub-

stitution closure,” which is similar to, but somewhat stricter than downwards closure, is identified

by Debray [16] as being the basis for an important class of dataflow analyses.


Proposition 6.12 Let (γ,X,P Psub) be a downwards closed insertion and let meet ′ : X → X → X

be given. Then meet ′ appr meet iff

∀x, x′ ∈ X . (γ x) ∩ (γ x′) ⊆ γ (meet ′ x x′).

Proof: We have that

meet ′ appr meet

⇔ ∀x, x′ ∈ X . ∀Π,Π′ ⊆ Psub . Π ⊆ (γ x) ∧ (Π′ ⊆ γ x′)⇒ meet Π Π′ ⊆ γ (meet ′ x x′)

⇔ ∀x, x′ ∈ X . ∀π ∈ (γ x) . ∀π′ ∈ (γ x′) . (π ⊓ π′) ∈ γ (meet ′ x x′) ∪ {⊥Psub}

⇔ ∀x, x′ ∈ X . (γ x) ∩ (γ x′) ⊆ γ (meet ′ x x′).

Definition. An insertion (γ,X,P Psub) is Moore-closed iff (∆ γ X) is Moore-closed in P Psub.

Corollary 6.13 Let (γ,X,P Psub) be a downwards closed, Moore-closed insertion. Let ⊓X be the

greatest lower bound operator for X. Then ⊓X appr meet (and in fact is the best approximation).

Proposition 6.14 Let (γ,X,P Psub) be an upwards closed insertion and let meet ′ : X → X → X

be given. Then meet ′ appr meet iff

∀x, x′ ∈ X . {π ⊓ π′ | π, π′ ∈ (γ x) ∪ (γ x′)} ⊆ γ (meet ′ x x′) ∪ {⊥Psub}.

Proof: We have that

meet ′ appr meet

⇔ ∀x, x′ ∈ X . ∀Π,Π′ ⊆ Psub . Π ⊆ (γ x) ∧ Π′ ⊆ (γ x′)⇒ (meet Π Π′) ⊆ γ (meet ′ x x′)

⇔ ∀x, x′ ∈ X . ∀π ∈ (γ x) . ∀π′ ∈ (γ x′) . (π ⊓ π′) ∈ γ (meet ′ x x′) ∪ {⊥Psub}

⇔ ∀x, x′ ∈ X . {π ⊓ π′ | π, π′ ∈ (γ x) ∪ (γ x′)} ⊆ γ (meet ′ x x′) ∪ {⊥Psub}.

Definition. An insertion (γ,X,P Psub) is meet-closed iff

∀x ∈ X . ∀π, π′ ∈ Psub . π, π′ ∈ γ x⇒ (π ⊓ π′) ∈ γ x ∪ {⊥Psub}.

Corollary 6.15 Let (γ,X,P Psub) be an upwards closed, meet-closed insertion. Let ⊔_X be the least upper bound operator for X. Then ⊔_X appr meet.

6.4 Approximating call patterns: Groundness analysis

In this section we give an example dataflow analysis for groundness propagation. This analysis

is based on the scheme given in the previous section. Propositional formulas, or, more precisely,

classes of equivalent propositional formulas, are used as descriptions. We use monotonic formulas


built from the connectives ↔, ∧, and ∨ only. These descriptions are closely related to those used

by Dart [15]. A parametric substitution π is described by a formula F if, for any instance π′ of π,

the truth assignment given by “V is true iff π′ grounds V ” satisfies F . For example, the formula

x ↔ y describes the parametric substitutions [x ↦ a, y ↦ b] and [x ↦ u′, y ↦ u′, u ↦ u′] but not [x ↦ a]. However, x ↔ y is not the best description for [x ↦ a, y ↦ b]; that would be x ∧ y, which in turn “implies” x ↔ y, that is, x ↔ y is a less precise approximation than x ∧ y.

Let Prop be the poset of classes of equivalent propositional formulas, ordered by implication, over some

suitably large finite variable set. We can represent an equivalence class in Prop by a canonical

representative for the class, perhaps a formula which is in disjunctive or conjunctive normal form.

By an abuse of notation we will apply logical connectives to both propositional formulas and to

classes of equivalent formulas.

Lemma 6.16 The poset Prop is a finite (complete) lattice with conjunction as the greatest lower

bound and disjunction as the least upper bound.

Definition. Let γ : Prop → P Psub be defined by

γ F = {π ∈ Psub | ∀π′ ⊳ π . (assign π′) satisfies F}

where assign : Psub → Var → Bool is given by (assign π V ) iff π grounds V .

Example 6.17 Let F = [x ∧ (y ↔ z)]. Then [x ↦ a] ∈ γ F and [y ↦ b] ∉ γ F .

Let us use the notation ∧{φ1, . . . , φn} for the formula φ1 ∧ . . . ∧ φn, and similarly for ∨.

Lemma 6.18 The triple (γ,Prop,P Psub) is a downwards closed, Moore-closed insertion.

Proof: Clearly γ is monotonic and co-strict, and (γ,Prop,P Psub) is downwards closed. Let G ⊆ Prop. Then ∧G ∈ Prop and γ (∧G) = ⋂_{F∈G} (γ F ), so (γ,Prop,P Psub) is Moore-closed.

It follows from Corollary 6.13 that meet is best approximated by conjunction.

Definition. The function meetgro : Prop → Prop → Prop is defined by meetgro F F ′ = F ∧ F ′.

Lemma 6.19 meetgro appr meet.

The function unify is somewhat more complex to approximate. Its approximation makes use of project, a projection function on propositional formulas, and mgugro, the analogue of mgu for propositional formulas. The motivation for approximating unify in this way comes from Lemma 6.6.


Definition. Let tass U be the set of truth assignments to the variables in U ⊆ Var . The function

project : P Var → Prop → Prop is defined by

project U F = ∨ {ψ F | ψ ∈ tass ((vars F ) \ U)}.

Example 6.20 project{x, y, z} [x ∧ (y ↔ v) ∧ (z ↔ v)] = [x ∧ (y ↔ z)].
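The effect of project is easy to compute if a class of formulas is represented by its set of models. The following Python sketch is one such brute-force encoding (the representation and names such as sat and all_models are ours); it confirms Example 6.20, and in this representation the conjunction of Lemma 6.16 is simply set intersection and disjunction set union.

from itertools import combinations

VS = ['x', 'y', 'z', 'v']

def all_models(vs):
    # Every truth assignment over vs, represented by its set of true variables.
    return [frozenset(c) for r in range(len(vs) + 1) for c in combinations(vs, r)]

MODELS = all_models(VS)

def sat(pred):
    # The class of formulas whose models are exactly those satisfying pred.
    return frozenset(m for m in MODELS if pred(m))

def project(U, d):
    # Keep a model iff it agrees on U with some model of d; this existentially
    # quantifies away the variables outside U, as in the definition above.
    U = frozenset(U)
    keep = {m & U for m in d}
    return frozenset(m for m in MODELS if m & U in keep)

F = sat(lambda m: 'x' in m and ('y' in m) == ('v' in m) and ('z' in m) == ('v' in m))
G = sat(lambda m: 'x' in m and ('y' in m) == ('z' in m))
assert project(['x', 'y', 'z'], F) == G    # Example 6.20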

Lemma 6.21 project appr restrict.

Proof: Let π′ = restrict U π. We must show that if π ∈ γ F then π′ ∈ γ (project U F ). Let

ψ ∈ tass ((vars F ) \ U) be a truth assignment such that ψ V ⇔ assign π V . If π ∈ γ F then

(assign π) satisfies F . Thus the assignment ψ′ ∈ tass U such that ψ′ V ⇔ assign π V will satisfy

ψ F . Since vars(ψ F ) ⊆ U and (assign π′ V ) ⇔ (assign π V ) ⇔ (ψ V ) for all V ∈ U , it follows

that (assign π′) satisfies ψ F . Thus π′ ∈ γ (project U F ).

Definition. The function mgugro : Atom → Atom → Prop is defined by

mgugro A H = if (mgu A H) = ∅ then false
             else let {µ} = (mgu A H) in ∧ {V ↔ ∧{V ′ | V ′ ∈ vars (µ V )} | V ∈ dom µ}.

Example 6.22 Let A = p(x, y) and H = p(a, f(u, v)). Then

mgugro A H = [x ∧ (y ↔ (u ∧ v))].

Lemma 6.23 (mgugro A H) appr (∆ β (mgu A H)).

Proof: This follows from the definition of mgugro and γ.

Definition. The function unifygro : Atom → Atom → Prop → Prop is defined by

unifygro H A F = let ρ = ren(vars H) ((vars A) ∪ (vars F )) in

project(vars H) ((ρ F ) ∧ (mgugro H (ρ A))).

Lemma 6.24 unifygro appr unify.

Proof: This follows from Proposition 3.14 and Lemmas 6.6, 6.19, 6.21, and 6.23.

Example 6.25 Let A = append(x, y, z), H = append(nil, y, y), and H ′ = append(u : x, y, u : z).

Then

unifygro H A [true] = [true]

unifygro A H [true] = [x ∧ (y ↔ z)]

unifygro H′ A [true] = [true]

unifygro A H ′ [false ] = [false]

unifygro A H ′ [x ∧ (y ↔ z)] = [(x ∧ y)↔ z].


Definition. The groundness analysis Pgro is given by instantiating the dataflow semantics with

the interpretation (Prop,meetgro, unifygro).

Theorem 6.26 Pgro ∝ Pbas.

Proof: This follows immediately from Lemmas 6.18, 6.19, and 6.24.

Example 6.27 Let P be the append program

append(nil, y, y).

append(u : x, y, u : z)← append(x, y, z).

Consider the goal A = ←append(x, y, z). To compute Pgro [[P ]] A [true], the analysis proceeds as

follows. Let DP = ⊔_{C∈P} (Cgro [[C]]). We then have

(DP ↑ 0) A [true] = [false ]

(DP ↑ 1) A [true] = [x ∧ (y ↔ z)] ∨ [false ] = [x ∧ (y ↔ z)]

(DP ↑ 2) A [true] = [x ∧ (y ↔ z)] ∨ [(x ∧ y)↔ z] = [(x ∧ y)↔ z]

and (DP ↑3) = (DP ↑2) = lfpDP . Thus Pgro [[P ]] A [true ] = [(x ∧ y)↔ z]. Similarly one can show

that Pgro [[P ]] A [z] = [x ∧ y ∧ z]. In the “call” version of Pgro, we would find that, provided this

holds in the initial call, all calls to append will have a third argument which is definitely ground.
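The iteration above can be reproduced mechanically with the same kind of model-set encoding of Prop. In the following Python sketch everything is our own encoding: u1, x1, y1, z1 stand for the renamed clause variables, and link1 and link2 are the mgugro formulas for the two clause heads against append(x, y, z), written out by hand. The loop computes the Kleene sequence for DP and stops at the documented fixpoint [(x ∧ y)↔ z].

from itertools import combinations

# x, y, z: the goal variables; u1, x1, y1, z1: renamed clause variables.
VS = ['x', 'y', 'z', 'u1', 'x1', 'y1', 'z1']
MODELS = [frozenset(c) for r in range(len(VS) + 1) for c in combinations(VS, r)]
GOAL = frozenset({'x', 'y', 'z'})

def sat(pred):
    return frozenset(m for m in MODELS if pred(m))

def project(U, d):
    keep = {m & U for m in d}
    return frozenset(m for m in MODELS if m & U in keep)

def to_primed(d):
    # Rename the constraint on x, y, z into one on x1, y1, z1 (the rho of unify_gro).
    core = {tuple(v in m for v in ('x', 'y', 'z')) for m in d}
    return sat(lambda m: tuple(v in m for v in ('x1', 'y1', 'z1')) in core)

# mgu_gro for clause 1, append(nil, y1, y1):  x <-> true, y <-> y1, z <-> y1.
link1 = sat(lambda m: 'x' in m
                  and ('y' in m) == ('y1' in m)
                  and ('z' in m) == ('y1' in m))

# mgu_gro for clause 2, append(u1:x1, y1, u1:z1):
#   x <-> (u1 and x1), y <-> y1, z <-> (u1 and z1).
link2 = sat(lambda m: ('x' in m) == ('u1' in m and 'x1' in m)
                  and ('y' in m) == ('y1' in m)
                  and ('z' in m) == ('u1' in m and 'z1' in m))

def step(F):
    # One application of D_P to the current description F of append(x, y, z),
    # for the entry description [true].
    exit1 = project(GOAL, link1)                  # unit clause, empty body
    exit2 = project(GOAL, to_primed(F) & link2)   # body call described by F
    return exit1 | exit2                          # least upper bound in Prop

F = frozenset()                                   # [false], the bottom element
while step(F) != F:
    F = step(F)

assert F == sat(lambda m: (('x' in m) and ('y' in m)) == ('z' in m))  # [(x & y) <-> z]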

Note that we have here been concerned with the correct definition of the groundness analy-

sis. An implementation should make use of obvious properties of the operators. For example,

mgugro H (ρ A) should not be computed repeatedly during an analysis, since, by choosing ρ so as to “rename away from program variables,” every possible “unifier” can be computed once and for

all as a first step of the analysis.

The standard approach to least fixpoint computation is to compute the associated Kleene

sequence. If the descriptions form a Noetherian lattice then the Kleene sequence will be finite.

However, the question of how to compute the last element of the Kleene sequence efficiently remains.

A number of techniques are available and should be implemented. First of all, efficiency may be

obtained by first computing minimal cliques of mutually recursive clauses and performing the

analysis on these (relatively) independently. This would seem preferable for programs as they

appear in practice. Furthermore, finite differencing techniques [79] are usually applicable to the

fixpoint computation, since our operators are typically extensive, that is, x ⊆ F x holds for all x.

Some ideas apply to our case of computing fixpoints for functionals in particular. This is the case

with the standard technique of using memoing to avoid redundant computations, and also with

the idea behind the “minimal function graph” semantics introduced by Jones and Mycroft [42].

A minimal function graph semantics is only defined for those calls that are reachable from some

initial call (hence “minimal”) and employs memoing to avoid redundant work (hence “function


graph”). This provides an interesting parallel to “magic set” implementation of query processing

in deductive databases [3]. Again, the idea with magic set transformation is to make use of the

fact that interest is restricted to the set U of instances of the given query. See also Marriott and

Søndergaard [64].
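In its simplest form, the strategy just described is the following loop (a Python sketch; the name kleene_lfp is ours); memoing and the minimal function graph construction are refinements that cache, and restrict, the evaluations of f.

def kleene_lfp(f, bottom):
    # Ascending Kleene sequence: bottom, f(bottom), f(f(bottom)), ...
    # Termination is guaranteed when the domain is ascending chain finite
    # and f is monotonic.
    x = bottom
    while True:
        y = f(x)
        if y == x:
            return x
        x = y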

6.5 Other applications and related work

An indication of the versatility of abstract interpretation for logic programming is the amount of

work published on the topic in recent years. In particular the idea of approximating call patterns

(as studied in this chapter) seems useful. Many kinds of code improvement that can be performed

automatically by a compiler depend on information about calls that may take place at run time,

and other kinds of program transformation make use of that information as well. At the end of

this section we list some of the applications.

However, a number of papers in the area have been concerned with what has come to be called

“frameworks” for abstract interpretation of logic programs. By this is usually meant a general

setting that allows one to express a number of dataflow analyses in a uniform way, just as we

have done in the present chapter through our “dataflow semantics.” The various frameworks differ

considerably in: (1) the degree of semantic justification provided, (2) assumptions about underlying

semantics, (3) the number of operations that need to be given in an “interpretation,” (4) the notion

of “safe approximation” as a relation between operations, and (5) general complexity. We shall

make no attempt at a taxonomy here (but see Marriott [55]), nor shall we have something to say

about every approach known to us. But it does seem relevant to compare two aspects of our work

with related work.

One aspect is the denotational definition of the “base” semantics, which may be of some interest in its own right because of its novelty. The other aspect is the degree of “factorisation”

achieved in our “dataflow semantics,” that is, the degree to which we have captured the essence of

the dataflow analysis problem. We discuss these two issues in turn before addressing applications.

Early work on SLD-based semantic definitions for logic programs was done by Jones and My-

croft [41] who addressed both operational and denotational semantics. Debray and Mishra [20] gave

a thorough exposition of a denotational definition, including a proof of its correctness with respect

to SLD. Both Jones and Mycroft and Debray and Mishra assume a left-to-right computation rule

and a depth-first search of SLD trees (as in Prolog), and both definitions capture non-termination

(unlike ours). Both use sequences of substitutions as denotations, rather than sets, which gives the

definitions a rather different flavour. The definition used by Jones and Søndergaard [43] achieves

certain simplifications by assuming a parallel search rule and consequently manipulates sets of

substitutions. The use of substitutions forces it to employ an elaborate renaming technique which

complicates semantic definitions somewhat. Winsborough [99, 100] and Jacobs and Langen [35]

have suggested denotational definitions along similar lines.

Marriott and Søndergaard have considered a number of different denotational definitions for


the purpose of dataflow analysis. The definitions given in Section 5.1 fall outside the present discussion, since they are “bottom-up,” but it is relevant to mention the work which attempts to give

a uniform presentation of both bottom-up and top-down definitions by expressing both in terms of

operations on lattices of substitutions, as far as this is possible [64]. In particular it has been shown

how operations such as unification and composition of substitutions can be adequately dissolved into

lattice operations to simplify definitions [64], an idea which has formed the point of departure for

the definitions presented here.

Some of the above mentioned semantic definitions apply to more than “pure” Prolog, for ex-

ample by being able to interpret the “cut” operator of Prolog. We have not been concerned with

extra-logical or non-logical aspects of logic programming languages, partly because we believe that

many of those features will disappear from logic programming languages in due time. (Most of the

operations that necessarily remain have straightforward approximations in dataflow analysis. For

example, a “write” can be ignored by an analysis, and in case of “read,” a dataflow analysis will

always be forced to assume a worst case behaviour.) However, one operation should be of concern

to us, namely negation. Negation is useful in a logic programming language, but, as is well-known,

most Prolog systems use an unsound version of negation [49]. Future implementations of logic

programming languages should hopefully rectify this. It is perfectly possible to give a denotational

definition that incorporates negation. Marriott and Søndergaard [65] have shown this for both

traditional (unsound) Prolog and SLDNF resolution (the methods employed to handle SLDNF

resolution can be extrapolated in a straightforward way to cover languages with delay mechanisms,

such as NU-Prolog [97]). Such definitions are important for abstract interpretation, because they

allow for more precision: a dataflow analysis that “knows” about the semantics of negation can

yield better results than one whose policy simply is to ignore negation (which is a safe behaviour

for many purposes).

Abstract interpretation of logic programs was first mentioned by Mellish [69] who suggested it as

a way to formalise mode analysis.⁴ An occur check analysis which was formalised as a non-standard

semantics was given by Søndergaard [92]. Some of the techniques used in the present chapter can

be traced back to that work. This is the case with the principle of performing unification both on

call and return so as to ensure that only local variables need be manipulated at any stage (this

was referred to as a principle of “locality”). The work also established the principle of binding

information to variables in a program throughout computations, rather than to argument positions

as seen elsewhere [7, 53, 68]. Other things being equal, this improves precision. For example,

consider a mode analysis of the program

←p(f(x)).

p(f(u)).

using the two modes “free” (unbound or bound to a variable) and “any.” The “argument position”

methods will funnel mode information about the variables x and u through the argument position of

⁴The origin of this suggestion can be attributed to Alan Mycroft.


p and assign x the mode “any,” while clearly x could be more precisely deemed “free.” To counteract

such behaviour, an “argument position” method must use a more fine-grained description domain

and pay the price of more expensive “abstract operations.”

A framework for the abstract interpretation of logic programs was first given by Mellish [68].

Mellish’s semantics is an operational parallel to our “lax” semantics with the imprecision that

this implies: success patterns are not associated with their corresponding call patterns, so success

information is propagated back, not only to the atom that actually called the clause, but to all atoms

that unify with the clause’s head. The application that had Mellish’s interest in particular was

mode analysis. Debray [17] subsequently investigated this application in more detail and pointed

to a problem in Mellish’s application (the so-called aliasing problem, which may manifest itself as

either a soundness or a completeness problem, depending on the particular dataflow analysis).

A framework for the abstract interpretation of logic programs based on a denotational definition

of SLD was given by Jones and Søndergaard [43]. This was the first denotational approach to

abstract interpretation of logic programs. The framework allowed even the base (or standard)

semantics to be expressed as an instance of the dataflow semantics. This has the advantage of

providing a very clean cut between a semantic definition which is precise (unlike our lax and

dataflow semantics) and interpretations in which all introduced imprecision resides. In this chapter

we have abandoned this approach only to simplify our presentation. Jones and Søndergaard used

operations “call” and “return” which in the present approach have been replaced by “unify” and

“meet.” We find this conceptually cleaner.

Kanamori and Kawamura [44] suggested a framework based on OLDT resolution [94], which

essentially is SLD resolution extended with memoing, so as to avoid redundant computation.

Bruynooghe et al. [7, 9] suggested an AND/OR tree-based framework in which an interpretation

contains some seven operations. This has later been simplified to some extent [8]. Neither approach

makes use of fixpoint characterisations for standard or non-standard semantics. We have earlier

discussed the relative merits of operational and denotational definitions as basis for the design of

dataflow analyses. Readers should compare the two last-mentioned approaches with that of this chapter, both as regards conceptual complexity and the complexity of the various semantic definitions.

Both Kanamori and Kawamura and Bruynooghe et al. give algorithms (or rather procedures) for

performing abstract interpretation. So does Nilsson in his thesis [78].

The framework used by Winsborough [99, 100] is rather close to ours. In particular, one semantic

definition (Winsborough’s “total function graph semantics” [100]) is almost identical to our base

semantics, the difference being that it works with (classical) substitutions that are “canonized” to

bar variants of a substitution from introducing redundancy (Mellish [68] used the same idea).

Debray [16] has studied a framework for dataflow analysis with the point of departure that

analyses must be efficient. He identifies a property of description domains (“substitution closure”)

and gives a complexity analysis to support the claim that the corresponding class of dataflow

analyses can be implemented efficiently. Our groundness analysis falls outside Debray’s class, as

does any analysis that attempts to maintain information about aliasing (see also below).


Marriott and Søndergaard have considered a number of different denotational definitions for the

purpose of dataflow analysis. The definitions given in Section 5.1 fall outside the present discussion,

since they are “bottom-up,” but it is relevant to mention work that attempts to give a uniform

presentation of both bottom-up and top-down definitions by expressing both algebraically [64].

Without claiming completeness of the list or the attached references, we finally list some appli-

cations that fit into our framework, or a similar framework.

Aliasing analysis [33]. In many dataflow analyses, such as mode analysis or the groundness

analysis we presented in Section 6.4, it is useful to know whether two variables are aliases, that

is, bound to the same term. Sometimes such knowledge is even necessary to guarantee soundness.

The purpose of aliasing analysis is to predict which aliases occur at runtime.
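For instance, after the equation x = y has been solved, the two variables are bound to the same (variable) term, so a later binding of y to a ground term also grounds x; an analysis unaware of such aliasing may lose precision or, if it tracks freeness, even soundness.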

Compile time garbage collection [9]. The idea behind this analysis is to approximate reference

counting at compile time. This can allow a compiler to generate code for reclaiming storage without

runtime overhead. Clearly this is an example of an analysis that relies on aliasing information.

Determinacy analysis [22]. Many calls in programs are deterministic in the sense that they return

with at most one answer. The idea behind determinacy analysis (or the more refined “functionality

analysis” of Debray and Warren) is to detect such cases at compile time. This allows for the

generation of code with fewer backtrack points.

Floundering analysis for normal logic programs [66]. SLDNF resolution provides an oper-

ational semantics which (unlike many Prolog implementations) is both computationally tractable

and sound with respect to a declarative semantics. The idea is to process negated literals only when

they are ground. However, SLDNF is not complete, and it achieves soundness only by discarding

certain queries to programs because they “flounder,” that is, lead to a situation where only negated

literals with free variables may be selected for processing. The purpose of floundering analysis is

to determine whether floundering may occur at runtime.

Independence analysis for AND-parallelism [18, 34, 35, 99, 101]. Two atoms in a body can

be executed (non-speculatively) in parallel if they are independent, that is, the variable bindings created by one do not affect the other's variable bindings. This is the case, for example, if all

the variables that they share are known to be bound to ground terms at the time the atoms are

invoked. Less restrictive conditions can also be determined, under which atoms are independent.

This is the purpose of independence analysis.

Mode analysis [7, 19, 21, 44, 53, 68, 69, 70, 100]. We already discussed mode analysis in Chapter 4.

There is no clear-cut distinction between mode analysis and many of the other analyses listed here—

groundness analysis and determinacy analysis can be seen as special cases. Essentially the purpose

of mode analysis is, for each variable in a head of a clause, to predict the degree of its instantiation

whenever the clause is called. A simple instance of this is the annotation of argument positions as

“input,” “output,” or “both.”

Occur check analysis [23, 81, 92]. Most Prolog systems omit the occur check in unification, since

this speeds up execution considerably. However, by doing so, they sacrifice the soundness of Prolog


as a theorem prover. The purpose of occur check analysis is to detect, at compile time, cases where

it is safe to omit occur checks.

Program transformation (we discuss an example in Chapter 7).

Type inference [9, 37, 44]. In the abstract interpretation literature (for logic programming) type

inference usually means approximation of calling patterns. This contrasts with our use of the term in

Chapter 5. Typically, tree-like structures such as (variants of) rational trees are used as “types,”

and a term T then is of type t iff T can be folded onto t. Type inference can be used for example

in the elimination of dead or unreachable code.


Chapter 7

Difference-Lists Made Safe

Difference-lists are terms that represent lists. They belong to the Prolog folklore: most Prolog

programmers know how the use of difference-lists can speed up list processing, but there have

been few in-depth discussions of the phenomenon. Even though it is common knowledge that

difference-lists must be treated with some care, the exact limitations of their use have remained

obscure.

In this chapter we investigate the concept of a difference-list. The chapter is based on work

by Marriott and Søndergaard [63]. We study the transformation of list-processing programs into

programs using difference-lists, as it has been sketched by Sterling and Shapiro [93]. In particular

we are concerned with finding circumstances under which the transformation is safe. We show

how dataflow analysis can be used to determine whether the transformation is applicable to a

given program and how this allows for automatic transformation. The chapter therefore serves to

illustrate the use of semantics-based dataflow analysis in program transformation.

In Section 7.1 we explain the idea behind difference-list transformation as propounded by Ster-

ling and Shapiro and give some examples that indicate the method’s limitation. In Section 7.2

difference-lists are defined formally. We give an algorithm to find list occurrences that should be

changed to difference-lists and find conditions under which this data structure transformation is

safe. We also address the problem: under which circumstances is rapid concatenation of difference-

lists possible? The insight gained from this analysis is used to construct a transformation algorithm

in Section 7.3. An integral part of this algorithm is a set of dataflow analyses needed to guarantee

the safeness of the transformation. Finally, in Section 7.4, we discuss related work and possible

extensions to our treatment.

7.1 The difference-list problem

The use of difference-lists is a standard Prolog programmer’s trick. Changing data structures from

lists to difference-lists allows for much faster list concatenation. To understand why, one may


simplistically think of a difference-list as a “list plus a pointer to its end.” Owing to this “pointer”

one can avoid traversing lists that are represented as difference-lists, and so list concatenation has

a cost that is constant rather than linear in the length of one of the lists. The transformation can

therefore improve the efficiency of programs considerably. Since it furthermore applies to a large

class of important list-processing programs, it would be desirable to automate it. We suggest that

this is possible.

The reader may wonder why a difference-list transformer is not a standard logic program devel-

opment tool. One reason is that the naive “folk” transformation is treacherous and often results in

a program having a behaviour that is radically different to that of the original program. We know

of no rigorous analysis intended to find conditions which guarantee that the transformation is safe.

Conventional wisdom has it that problems with the transformation are intimately connected to the

absence of occur checks in most Prolog systems. But, as we show, even when occur checks are

present, the transformation may be unsafe. Throughout this chapter we will assume the presence

of occur checks.

In this section we present the idea behind the folk transformation. The hope is to familiarise

the reader with this basic idea before trying to give it a more formal treatment. We also exemplify

programs for which the transformation is unsafe. As usual in discussions about program trans-

formation we have to be careful with our use of identifiers. We use the convention from previous

chapters that meta-variables are (possibly subscripted) italic capital letters: A is used for atoms, B

for bodies, D for difference-lists, G for queries, L for lists, P for programs, Q for predicate symbols,

R, S, T for terms, and V, X and Y for variables.

Difference-lists were already implicit in Colmerauer's treatment of metamorphosis grammars

[12], but the term itself was introduced by Clark and Tarnlund [11]. Manual transformation to

programs using difference-lists has been discussed by Tarnlund [96] and by Hansson and Tarnlund

[30]. These treatments, however, are mainly concerned with axiomatising a theory of difference-lists

and no general transformation method is discussed1.

The practice of transforming programs using lists into programs using difference-lists has been

most extensively dealt with by Sterling and Shapiro [93].2 They make the useful observation that

“a program that independently builds different sections of a list to be later combined

together is a good candidate for using difference-lists.”

The idea behind the transformation is to replace certain lists by terms called difference-lists. A

difference-list is a pair of lists whose second component is a suffix of the first. The difference-list

denotes the list that results when the suffix is removed from the first component. For example,

the list a : b : nil may be represented by the difference-list 〈a : b : x, x〉. Intuitively, x here is a “pointer” to the end of the list (although “write-once,” as all logical variables are).

1 Hansson and Tarnlund incidentally use a list–to–difference-list mapping which is only partial. Their definition ([30] page 119) assigns no difference-list to (in our notation) a : nil or a : x.

2 However, Sterling and Shapiro's reference to Bloch [6] for an “automatic transformation of simple programs without difference-lists to programs with difference-lists” ([93] page 255) is misleading. A reading of Bloch's work reveals no such transformation.

The concatenation of difference-lists is done by the predicate app⋆⋆ defined as follows:

app⋆⋆(〈x, y〉, 〈y, z〉, 〈x, z〉).

Clearly evaluation of this is much faster than using the well-known append program, which traverses

its first argument:

append(nil, y, y).

append(u : x, y, u : z)← append(x, y, z).
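To make the difference concrete, consider a small worked comparison (the particular terms are ours). The query

←app⋆⋆(〈a : b : x, x〉, 〈c : y, y〉, z)

succeeds in a single resolution step, binding the “pointer” x to c : y and z to 〈a : b : c : y, y〉, whereas the corresponding query

←append(a : b : nil, c : nil, z)

takes three resolution steps (one per element of the first list, plus one for the base clause) to produce z = a : b : c : nil.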

While Sterling and Shapiro do not present a general transformation, they give sufficient and well-

chosen examples for the reader to understand the idea. We exemplify the transformation by con-

sidering a program rev to reverse a list:

rev(nil, nil).

rev(u : x, y)← rev(x, z), append(z, u : nil, y).

Since the list z is to have another list appended to it, we want to change z into a difference-list.

This means introducing a predicate rev⋆ whose second argument is a difference-list:

rev⋆(nil, 〈y, y〉).

rev⋆(u : x, 〈y, y′〉)← rev⋆(x, 〈z, z′〉), app⋆⋆(〈z, z′〉, 〈u : v, v〉, 〈y, y′〉).

The relation between rev and rev⋆ is given by the clause

rev(x, y)← rev⋆(x, 〈y, nil〉).

Unfolding the call to app⋆⋆ above, we obtain the program

rev(x, y)← rev⋆(x, 〈y, nil〉).

rev⋆(nil, 〈y, y〉).

rev⋆(u : x, 〈y, y′〉)← rev⋆(x, 〈y, u : y′〉).

Whereas the original rev program qua Prolog program had O(n²) predicate calls, this version has only O(n) calls, where n is the length of the list to be reversed—certainly a worthwhile improvement.
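As a small sanity check (the trace is ours), the query ←rev(a : b : nil, y) proceeds through the calls

rev⋆(a : b : nil, 〈y, nil〉), rev⋆(b : nil, 〈y, a : nil〉), rev⋆(nil, 〈y, b : a : nil〉),

at which point the unit clause for rev⋆ binds y = b : a : nil: one call per list element, as claimed.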

Though Sterling and Shapiro do not precisely define the transformation, the following pattern

emerges from their examples. For each call to append:

1. Assume that the arguments to append are changed from lists to difference-lists and propagate

the changes caused by this to other “types” of predicate arguments.

2. For each predicate Q/n thus reached, introduce a predicate Q⋆/n that is defined exactly

as Q, except it uses difference-lists instead of lists and calls the corresponding ⋆-versions of

predicates, including app⋆⋆ instead of append.


3. Unfold all calls to app⋆⋆.

4. For each predicate Q/n that has been changed, replace its definition by the clause

Q(X1, . . . ,Xn)←Q⋆(Y1, . . . , Yn),

where Yi = 〈Xi, nil〉 if the i’th argument has been changed to a difference-list, and Yi = Xi

otherwise.
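Applied to the rev program above, this pattern reproduces the transformation just shown: step 1 marks the list arguments to be changed, step 2 introduces rev⋆ with app⋆⋆ in place of append, step 3 yields the clause rev⋆(u : x, 〈y, y′〉)← rev⋆(x, 〈y, u : y′〉), and step 4 adds the connecting clause rev(x, y)← rev⋆(x, 〈y, nil〉).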

Unfortunately, though this transformation seems to be well-known, its limitations are not. In the

remainder of this section we give examples indicating the problems that may occur. We first note

that, though the transformation seemingly works well in the case of rev above, the two versions of

rev are not logically equivalent. They behave in the same way only if the first argument is a fixed

length list (such as a : x : nil), which is (hopefully) the intended usage. We now exemplify how a

resultant program may behave very differently to the original program for almost any usage.

Example 7.1 Consider the program

p← append(x, y : nil, b : nil), y = a.

Clearly the query ←p fails. By the sketched method, the program is transformed into

p← app⋆⋆(〈x, x′〉, 〈y : u, u〉, 〈b : v, v〉), y = a.

After unfolding of app⋆⋆ we get

p← y = a.

Now the query ←p succeeds, so the transformation has increased the success set.

Example 7.2 Consider the program

double(x, y)← append(x, x, y).

Here the query ←double(L, y) succeeds for all lists L. The transformation yields

double(x, y)← double⋆(〈x, nil〉, 〈y, nil〉).

double⋆(〈x, x′〉, 〈y, y′〉)← app⋆⋆(〈x, x′〉, 〈x, x′〉, 〈y, y′〉).

Unfolding the call to app⋆⋆, we get

double(x, y)← double⋆(〈x, nil〉, 〈y, nil〉).

double⋆(〈x, x〉, 〈x, x〉).

Now the query ←double(L, y) fails whenever L is a list of the form T1 : . . . : Tn : nil where n ≥ 1.

So the transformation has decreased the success set.


The following example shows that introduction of difference-lists may go wrong even when there

are no calls to append.

Example 7.3 Consider the program

p← q(nil).

q(a : y).

Clearly the query ←p fails. Replacing nil and a : y with difference-list representations yields the

program

p← q(〈x, x〉).

q(〈a : y, y′〉).

Now the query ←p succeeds.

In Section 7.2 we explain what goes wrong and how safeness can be guaranteed by means of a

dataflow analysis.

It is not uncommon to think that problems with difference-lists are due to occur check problems.

This is not surprising since absence of occur checks makes very simple program transformations,

such as unfolding, unsafe, even in the case of pure Prolog [62]. However, the above discussion

assumed the presence of occur checks, and yet it revealed a number of problems. The absence of

occur checks might create even more problems, but they are of a different sort and orthogonal to

the problems that concern us here. For example, assume we are given the fact

empty⋆(〈x, x〉).

The query ←empty⋆(〈a : y, y〉) would succeed in the absence of occur checks, and the same would

be true for a query such as ←app⋆⋆(〈x, x〉, 〈y, y〉, 〈a : z, z〉), even though such queries of course

should fail.

It is useful to view the transformation as consisting of two stages: the first stage performs

a change of data structure from lists to difference-lists, the second stage modifies the resulting

program so as to perform efficient difference-list concatenation. This point of view is justified by

the fact that distinct problems arise at the two stages. At the first stage, problems arise because

unification of difference-lists is not faithful to unification of the lists they represent. For example,

the difference-lists 〈nil, nil〉 and 〈a : nil, a : nil〉 do not unify even though they both denote nil.

On the other hand 〈x, x〉 and 〈a : y, y′〉 do unify, while their denotations nil and a : nil do not.

This type of problem is exemplified in Example 7.3. At the second stage of the transformation,

problems arise because, in general, app⋆⋆ is not equivalent to append. This problem is exemplified

in Examples 7.1 and 7.2.


7.2 Manipulating difference-lists

In this section we discuss the problems with difference-list manipulation. We have seen that two

kinds of problems arise: those related to changing the data structure representing sequences of

terms from lists to difference-lists (first stage) and those related to concatenating difference-lists

(second stage). We discuss each kind in turn.

Definition. A list is a term of the form T1 : . . . : Tn : nil or of the form T1 : . . . : Tn : X, where

n ≥ 0. If it is of the first form then it has fixed length. The set of lists is denoted by List.

Definition. A difference-list is a term of the form 〈L1, L2〉, where L1 and L2 are lists. We call L2

the delimiter of the difference-list. A difference-list of the form 〈T1 : . . . : Tn, Tn〉 has fixed length.

The set of difference-lists is denoted by Dlist.

Definition. A difference-list D is said to represent the fixed length list T1 : . . . : Tn : nil iff D =

〈T1 : . . . : Tn : T, T 〉, where T does not share variables with any Ti. A difference-list D represents

the non-fixed length list T1 : . . . : Tn : X iff D = 〈T1 : . . . : Tn : X,T 〉, where T does not share

variables with any Ti or X.

For example, a : nil is represented by 〈a : x, x〉 or 〈a : nil, nil〉, and a : x is represented by 〈a : x, x′〉

or 〈a : x, nil〉. Clearly many difference-lists such as 〈a : nil, b : nil〉 do not represent any list. An

interesting subclass of these consists of the difference-lists that represent the so-called negative lists

[93].

Given a list L, there are two natural ways to create a difference-list that represents L. We now

define two functions that perform these mappings.

Definition. Let ρ be a bijection from the set of variables appearing in any program or query to a disjoint set of variables. The function σ : List → Dlist is defined by

σ (T1 : . . . : Tn : nil) = 〈T1 : . . . : Tn : Xnew, Xnew〉
σ (T1 : . . . : Tn : X) = 〈T1 : . . . : Tn : X, (ρ X)〉,

where each Xnew is a distinct new variable not appearing in any program or in the range of ρ. The function σ′ : List → Dlist is defined by (σ′ L) = 〈L, nil〉.
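For instance, (σ (a : nil)) = 〈a : x′, x′〉 for a new variable x′, while (σ′ (a : nil)) = 〈a : nil, nil〉; similarly (σ (a : x)) = 〈a : x, (ρ x)〉 and (σ′ (a : x)) = 〈a : x, nil〉. These are exactly the representations of a : nil and a : x listed in the example above.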

The class of difference-lists generated by σ and σ′ turns out to have some nice unification properties,

as we show in Propositions 7.4 and 7.5.

Definition. A difference-list D is a simple representation of a list L iff D = (σ L) or D = (σ′ L).


Ideally, unification of difference-lists should correspond to unification of the lists they represent.

In particular, if two lists unify, so should their difference-list representatives. This does not hold

in general, but it holds for simple representations, as the following proposition shows. Unlike the

case in the previous chapters, the function mgu is here assumed to have the type Term → Term →

Sub ∪ {fail} and is defined in the obvious way.

Proposition 7.4 Let D and D′ be simple representations of L and L′, respectively. Let θ = (mgu D D′) and φ = (mgu L L′). If φ ≠ fail then θ ≠ fail and (θ D) is a simple representation of (φ L).

Proof: Assume φ ≠ fail. We consider three cases depending on whether L and L′ have fixed length or not (this suffices for reasons of symmetry).

1. Assume that L = T1 : . . . : Tn : nil and L′ = T′1 : . . . : T′n : nil. Then (φ L) = T″1 : . . . : T″n : nil for some T″1, . . . , T″n. Since L and L′ have fixed length, D is either 〈L, nil〉 or 〈T1 : . . . : Tn : Xnew, Xnew〉, while D′ is 〈L′, nil〉 or 〈T′1 : . . . : T′n : X′new, X′new〉. In case D and D′ are of the last forms, (θ D) = 〈T″1 : . . . : T″n : X′new, X′new〉, because Xnew and X′new do not occur in T1, T′1, . . . , Tn, T′n. Otherwise (θ D) = 〈T″1 : . . . : T″n : nil, nil〉. In both cases (θ D) is a simple representation of (φ L).

2. Assume that L = T1 : . . . : Tn : nil and L′ = T′1 : . . . : T′m : X, where m ≤ n. Then (φ L) = T″1 : . . . : T″n : nil for some T″1, . . . , T″n, and (φ X) = T″m+1 : . . . : T″n : nil. Since L has fixed length, D is either 〈L, nil〉 or 〈T1 : . . . : Tn : Xnew, Xnew〉, while D′ is either 〈T′1 : . . . : T′m : X, nil〉 or 〈T′1 : . . . : T′m : X, Y〉, where Y does not occur in L or L′.

Assume D = 〈L, nil〉. If D′ = 〈T′1 : . . . : T′m : X, nil〉 then θ = φ. If D′ = 〈T′1 : . . . : T′m : X, Y〉 then θ = {Y ↦ nil} ◦ φ. In both cases (θ D) = 〈T″1 : . . . : T″n : nil, nil〉.

Assume D = 〈T1 : . . . : Tn : Xnew, Xnew〉. If D′ = 〈T′1 : . . . : T′m : X, nil〉 then θ = {Xnew ↦ nil} ◦ φ. If D′ = 〈T′1 : . . . : T′m : X, Y〉 then θ = {Xnew ↦ nil, Y ↦ nil} ◦ φ. In both cases, (θ D) = 〈T″1 : . . . : T″n : nil, nil〉. So again (θ D) is a simple representation of (φ L).

3. Assume that L = T1 : . . . : Tn : X and L′ = T′1 : . . . : T′m : X′, where, for reasons of symmetry, we can assume that m ≤ n. Then (φ L) = T″1 : . . . : T″n : X for some T″1, . . . , T″n, and (φ X′) = T″m+1 : . . . : T″n : X. Now D is either 〈T1 : . . . : Tn : X, nil〉 or 〈T1 : . . . : Tn : X, Y〉, where Y does not occur in L or L′, and D′ is either 〈T′1 : . . . : T′m : X′, nil〉 or 〈T′1 : . . . : T′m : X′, Y′〉, where Y′ does not occur in L or L′. If D = 〈T1 : . . . : Tn : X, Y〉 and D′ = 〈T′1 : . . . : T′m : X′, Y′〉 then θ = {Y ↦ Y′} ◦ φ, so (θ D) = 〈T″1 : . . . : T″n : X, Y′〉. Otherwise (θ D) = 〈T″1 : . . . : T″n : X, nil〉. In any case, (θ D) is a simple representation of (φ L).

In other words, using only simple representations removes one of the problems mentioned at the

end of Section 7.1. Namely, simple representations unify whenever the lists they represent do.


We would hope for the converse to hold as well, but unfortunately, simple representations may

still unify in cases where the lists they represent do not. For example, the term 〈x, x〉 unifies with

〈a : y, y′〉, yielding the most general common instance 〈a : y, a : y〉. However, 〈a : y, a : y〉 represents

nil, and nil is not an instance of the list represented by 〈a : y, y′〉, that is, of the list a : y. In

other words, σ is not homomorphic with respect to the operation “most general common instance”

(that is, the meet operation on the lattice of terms, which we usually compute using unification).

However, we have the following proposition.

Proposition 7.5 Let D and D′ be simple representations of L and L′, respectively. Let θ = (mgu D D′) and φ = (mgu L L′). If θ ≠ fail and the delimiter of (θ D) is a variable or nil, then φ ≠ fail, and (θ D) is a simple representation of (φ L).

Proof: By the definition of σ and σ′, if θ = fail then φ = fail. The remainder of the proof is by cases as before, assuming φ ≠ fail. There is a myriad of cases, depending on the one hand on whether the delimiter of (θ D) is a variable or nil, and on the other hand on the form of L and L′. We show only one case—the others are similar.

Assume (θ D) = 〈L″, nil〉, L = T1 : . . . : Tn : nil, and L′ = T′1 : . . . : T′n : nil. Then (φ L) = T″1 : . . . : T″n : nil for some T″1, . . . , T″n. There are three cases to consider, regarding the form of D and D′. We either have D = 〈L, nil〉 and D′ = 〈L′, nil〉, D = 〈T1 : . . . : Tn : Xnew, Xnew〉 and D′ = 〈L′, nil〉, or D = 〈L, nil〉 and D′ = 〈T′1 : . . . : T′n : X′new, X′new〉. In all cases (θ D) = 〈T″1 : . . . : T″n : nil, nil〉, that is, (θ D) is a simple representation of (φ L).

According to Proposition 7.4, we should only ever introduce simple representations of lists. Then,

by Proposition 7.5, to guarantee that difference-list unification is faithful, it suffices to check that

the delimiter of the resulting difference-list is a variable or nil. To make our transformation as

general as possible, we will allow for this check to be performed at run-time. This prompts the

following definition of the predicate var-nil:

var-nil(x)← if var(x) then true else x = nil.

Note that the predicate is non-logical since it makes use of the meta-predicate var of Prolog. For

most unifications, however, the check will not be needed, and we later show how unnecessary calls

to var-nil can be removed by a simple dataflow analysis. We can now introduce a predicate ⋆= which unifies two simple difference-lists, taking the unification problem for difference-lists into account:

〈x, y〉 ⋆= 〈u, v〉 ← 〈x, y〉 = 〈u, v〉, var-nil(v).
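To see the delimiter check at work on the problematic pair discussed above (the queries are ours): the query ←〈x, x〉 ⋆= 〈a : y, y′〉 fails, because the underlying unification binds the delimiter y′ to a : y, whereupon var-nil(a : y) fails; by contrast, ←〈a : x, x〉 ⋆= 〈a : nil, nil〉 succeeds with x = nil, the delimiter nil passing the check.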

Recall that we are interested in changing programs so that some lists are represented by difference-lists. The following algorithm is designed to mark those predicate arguments that must also be changed to difference-lists once the initially marked arguments are changed to difference-lists.


Program Marking Algorithm

Input: A program in which some initial predicate arguments (which should be lists) are marked.

Algorithm: Repeatedly select one of the following actions until no more arguments can be marked:

1. If, in a clause body, the i’th argument of a predicate Q is marked then, for each clause defining

Q, mark the i’th argument of its head.

2. If, in a clause head defining a predicate Q, the i’th argument is marked then, in each atom

calling Q, mark the i’th argument.

3. If, in a clause, a variable X is the tail of a marked list then mark each predicate argument

containing X.

4. If one of the arguments to the predicate = is marked then mark the other.

Finally check that each marked term is a list and that in each clause, if a variable appears as a tail

of some marked list then it only ever appears as the tail of a marked list.

This process clearly terminates. We later (after Proposition 7.10) discuss why the individual steps

of the algorithm are necessary.

Example 7.6 Consider the following program which “flattens” binary trees into lists. We let

underlining indicate the terms that have been marked.

flatten(x : y, zf)← flatten(x, xf), flatten(y, yf), append(xf, yf, zf).
flatten(nil, nil).
flatten(z, z : nil)← constant(z).
append(nil, y, y).
append(u : x, y, u : z)← append(x, y, z).

Given this program, the Program Marking Algorithm will produce the following:

flatten(x : y, zf)← flatten(x, xf), flatten(y, yf), append(xf, yf, zf).
flatten(nil, nil).
flatten(z, z : nil)← constant(z).
append(nil, y, y).
append(u : x, y, u : z)← append(x, y, z).

As well as marking the program we must also mark any query to the program. The following algorithm

achieves this.


Algorithm (Query Marking)

Input: A query and a successfully marked program.

Algorithm: For each atom A in the query:

if the i'th argument of the clause head defining A is marked
then mark the i'th argument in A.

Finally check that each term marked in the query is a list.

This process clearly terminates.

Example 7.7 If the query ←flatten(z, zf) and the marked version of the program flatten given in Example 7.6 are input to the Query Marking Algorithm, ←flatten(z, zf) is returned.

One might hope that, for any successfully marked program P and query G, replacement of the

marked lists L in P by (σ L) and of those in G by (σ′ L) would always yield a program and query

that are, in a strong sense, equivalent to the original. Unfortunately, as Example 7.3 shows, this is

not the case.

For the transformed program to correctly mimic the original program, we require that all unification of difference-lists be replaced by calls to ⋆=. This includes both explicit calls to = and

the implicit unification in the clause heads. To make this implicit unification explicit, marked

clauses of the form

Q(S1, . . . , Sm, T1, . . . , Tn)←B

are transformed into

Q(S1, . . . , Sm,X1, . . . ,Xn)←X1 = T1, . . . ,Xn = Tn, B

where each Xi is a fresh variable. (We may unfold atoms Xi = Ti as long as the clause head

continues to have distinct variables in argument positions m+ 1, . . . , n in examples we do this.)

Example 7.8 The program flatten becomes

flatten(x : y, zf)← flatten(x, xf), flatten(y, yf), append(xf, yf, zf).
flatten(nil, zf)← zf = nil.
flatten(z, zf)← zf = z : nil, constant(z).
append(x, y, z)← x = nil, y = z.
append(x1, y, z1)← x1 = u : x, z1 = u : z, append(x, y, z).

We henceforth assume that this additional transformation is performed in the marking process.

We now give a transformation from a marked program and query to an equivalent program in

which the marked lists are replaced by difference-lists.


Algorithm. Let P and G be a successfully marked program and query. The program (τ P ) is

obtained as follows:

1. Replace the marked lists L in P by (σ L).

2. Replace = by ⋆= whenever its arguments are marked.

3. Unfold all calls to ⋆=.

4. If ⋆= was introduced, add the definition of var-nil to P.

The query (τ ′ G) is obtained by replacing the marked lists L in G by (σ′ L).

Example 7.9 The transformed versions of the flatten program given in Example 7.8 and of the query given in Example 7.7 are:

←flatten(z, 〈zf, nil〉).
flatten(x : y, 〈zf, zf′〉)←
    flatten(x, 〈xf, xf′〉),
    flatten(y, 〈yf, yf′〉),
    append(〈xf, xf′〉, 〈yf, yf′〉, 〈zf, zf′〉).
flatten(nil, 〈v, v〉)← var-nil(v).
flatten(z, 〈z : v, v〉)← var-nil(v), constant(z).
append(〈v, v〉, 〈y, y′〉, 〈y, y′〉)← var-nil(v), var-nil(y′).
append(〈u : x, x′〉, 〈y, y′〉, 〈u : z, z′〉)←
    var-nil(x′),
    var-nil(z′),
    append(〈x, x′〉, 〈y, y′〉, 〈z, z′〉).
var-nil(x)← if var(x) then true else x = nil.

The correctness of the transformation so far is captured by Proposition 7.10 below. The following

notion will prove useful.

Definition. A difference-list is free iff it has the form 〈T1 : . . . : Tn : X,Y 〉 and Y does not occur

in any Ti (note that Y may be X).

For the formulation of the proof of Proposition 7.10, two auxiliary notions will be helpful, but they

will play no role in the remainder of the chapter.

Definition. A term of the form 〈T, nil〉 is nil-delimited (note that T is not necessarily a list). A

free difference-list 〈T1 : . . . : Tn : X,Y 〉 is separate iff X does not appear in any Ti. We use “≡”

between equations to indicate that they are identical.


Proposition 7.10 Let program P and query G be marked. Then

• P ∪{G} returns the answers θ1, θ2, . . . iff (τ P )∪{τ ′ G} returns the answers φ1, φ2, . . . , where

for all i ≥ 1 and all variables X in G, (θi X) = (φi X).

• P ∪ {G} has an infinite derivation iff (τ P ) ∪ {τ ′ G} has an infinite derivation.

Proof: The idea in this proof is to consider a derivation of P ∪ {G} as a process of collecting an increasing set of term equations to be solved, and to prove inductively on the number of equations that certain invariants hold. These invariants in turn entail the proposition.

Consider a set of equations E = {e1, . . . , en} associated with some finite derivation of P ∪ {G}. By virtue of the marking algorithms, both sides of each equation in E are either instances of marked terms or else neither side is. Construct the query G′ = g1, . . . , gn from E by replacing each marked term T from G by (σ′ T), by replacing each marked term T from P by (σ T), and by inserting a call to var-nil after every = occurring between marked terms. We now prove by induction on its cardinality that E is solvable iff G′ succeeds, and that if G′ succeeds with (the single) answer substitution α, then α is a most general solution of E for the variables in G.

Let Ei = {e1, . . . , ei} and G′i = {g1, . . . , gi}. The induction hypothesis is that Ei is solvable iff G′i succeeds, and if G′i succeeds with the answer substitution αi, then

1. for any gj of the form D ⋆= D′, it holds that (αi D) and (αi D′) are either separate free difference-lists or nil-delimited,

2. if 〈T1 : . . . : Tn : X, Y〉 is a separate free difference-list appearing in some (αi gj) then X and Y only appear elsewhere in (αi G′) in separate free difference-lists of the form 〈S1 : . . . : Sm : X, Y〉, and

3. there is a most general unifier, θi, of Ei such that for every variable X in E it holds that (θi X) = ((δi ◦ αi) X), where (δi V) = if V is a delimiter of a separate free difference-list in (αi G′) then nil else V.

Since G and P have been successfully marked, it follows from the definition of τ and τ′ that the above holds when i = 0. Now assume that it holds for i, and we shall show that it holds for i + 1. If Ei is not solvable then Ei+1 is also not solvable. It follows from the induction hypothesis that G′i does not succeed and so G′i+1 cannot succeed. We now assume that Ei is solvable.

First consider the case when ei+1 is not marked. It follows from the construction of gi+1 and (2) and (3) that (θi ei+1) ≡ (αi gi+1). Thus Ei+1 is solvable iff G′i+1 succeeds. Furthermore, if (αi gi+1) has answer substitution β then αi+1 = β ◦ αi and θi+1 = β ◦ θi is a most general solution to Ei+1. Since β is idempotent, it only relates variables in (αi gi+1). Thus the induction hypothesis holds for i + 1.

Now consider the case where ei+1 ≡ (S = T) is marked. It follows from part (1) of the induction hypothesis that (αi gi+1) has one of the four forms:

(a) 〈(αi S), nil〉 ⋆= 〈(αi T), nil〉
(b) 〈(αi S), nil〉 ⋆= 〈(αi T), X〉
(c) 〈(αi S), X〉 ⋆= 〈(αi T), nil〉
(d) 〈(αi S), X〉 ⋆= 〈(αi T), Y〉

If it is of form (a), then from (2) and (3) of the induction hypothesis, (θi ei+1) ≡ ((αi S) = (αi T)). Thus this case is analogous to the case when ei+1 is not marked. If (αi gi+1) has form (b) then from parts (2) and (3) of the induction hypothesis, (θi ei+1) ≡ {X ↦ nil}((αi S) = (αi T)). Thus (αi gi+1) succeeds iff (θi ei+1) is solvable. It is routine to verify that (1), (2), and (3) hold for i + 1 if (αi gi+1) succeeds. The case when (αi gi+1) is of form (c) is symmetric to case (b). Now consider it to be of form (d). From (2) and (3) of the induction hypothesis,

(θi ei+1) ≡ {X ↦ nil, Y ↦ nil}((αi S) = (αi T)).

By the definition of ⋆=, (αi gi+1) succeeds iff (θi ei+1) is solvable. It is routine to check that (1), (2), and (3) hold for i + 1 if (αi gi+1) succeeds.

It follows by finite induction that E (= En) is solvable iff G′ (= G′n) succeeds. Furthermore, by the definition of τ′, if G′ succeeds with answer α then there is a most general solution θ of E such that for any variable X in G, (α X) = (θ X). Since G′ is an unfolded version of the derivation of (τ P) ∪ {(τ′ G)} which corresponds to the derivation associated with E, the assertion follows.

The proof of Proposition 7.10 made extensive use of properties of the marking algorithms. Actions

(1), (2) and (4) in the Program Marking Algorithm are necessary because otherwise the transformed

program might try to unify a difference-list with an untransformed list. The same is true for the

Query Marking Algorithm. The check in the Program Marking Algorithm that only lists have been

marked is necessary because otherwise τ is not well-defined. It may not be apparent that action

(3) in the Program Marking Algorithm is necessary. The following example shows that it is.

Example 7.11 Consider the marked program

p← r(x), x = a.

r(nil).

After applying the transformation, we obtain the program

p← r(〈x, x′〉), x = a.

r(〈y, y〉)← var-nil(y).

This succeeds with query ←p, whereas the original program fails.


The following example shows that it is necessary to check that in each clause, if a variable appears

as the suffix of a marked list, then it only ever appears as such.

Example 7.12 Consider the marked program

p← r(x : x).

r(z : nil)← z = a : nil.

After applying the transformation, we obtain the program

p← r(〈x : x, x′〉).

r(〈z : y, y〉)← var-nil(y), z = a : nil.

This succeeds with query ←p, whereas the original program fails.

The reason why we have introduced difference-lists is that some calls to the difference-list version

of append can be replaced by calls to a predicate which has constant cost and equivalent behaviour

to append. We now investigate when such replacement is feasible.

The transformation of append from lists to difference-lists can result in one of three different

programs depending on which arguments of append are marked. If the first argument is marked

then the following version, called append⋆, results:

append⋆(〈x, x〉, y, y)← var-nil(x).

append⋆(〈u : x, x′〉, y, u : z)← var-nil(x′), append⋆(〈x, x′〉, y, z).

A second version of append, called append⋆⋆, results when all three arguments of the original

program are marked. The definition of append⋆⋆ was given in Example 7.9 where it was just called

append. A third version of append results when the second and third arguments are marked. This

version cannot be optimised using techniques discussed here and so will not be considered further.

A call to append⋆ can often be replaced by a call to the predicate app⋆, defined by

app⋆(〈x, y〉, y, x).

The following proposition captures the relationship between these two predicates. In essence the

difference is that app⋆ binds the delimiter of its first argument while append⋆ does not.

Proposition 7.13 Let D be a free, fixed length difference-list whose delimiter does not appear in

either S or T . Then

1. A call app⋆(D,S, T ) finitely fails iff append⋆(D,S, T ) finitely fails.

2. A call app⋆(D,S, T ) succeeds with answer θ iff append⋆(D,S, T ) succeeds with answer φ, and

for every variable V , except the delimiter of D, (θ V ) = (φ V ).


Proof: Let D = 〈R1 : . . . : Rn : X,X〉. The call append⋆(D,S, T ) gives rise to the equation

T = R1 : . . . : Rn : S and the call app⋆(D,S, T ) gives rise to the two equations T = R1 : . . . : Rn : S

and X = S. Part (1) holds because X does not appear in the first equation, thus the second

equation set is solvable iff the first is. Part (2) follows because the only difference between the two

equation sets is that X is constrained in the second but not in the first.
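For instance (the queries are ours), both ←app⋆(〈a : x, x〉, b : nil, t) and ←append⋆(〈a : x, x〉, b : nil, t) succeed with t = a : b : nil; the only difference, as the proposition predicts, is that app⋆ also binds the delimiter x, namely to b : nil.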

The conditions given in the proposition are all necessary for this equivalence to hold. If the first

argument is not of fixed length, then app⋆ may succeed when append⋆ fails. The following example

shows this.

Example 7.14 The query

←app⋆(〈u : x, x′〉, a : nil, b : nil)

succeeds, whereas the query

←append⋆(〈u : x, x′〉, a : nil, b : nil)

fails.

The following example shows that if the first argument is not free, then, conversely, app⋆ may fail

when append⋆ succeeds.

Example 7.15 The query

←app⋆(〈nil, nil〉, a : nil, a : nil)

fails, whereas the query

←append⋆(〈nil, nil〉, a : nil, a : nil)

succeeds.

In the terminology of Sterling and Shapiro [93] this is a “compatibility” problem: nil does not

unify with a : nil.

A call to append⋆⋆ can often be replaced by a call to the predicate app⋆⋆, defined by

app⋆⋆(〈x, y〉, 〈y, z〉, 〈x, z〉)← var-nil(z).

The following proposition captures the relationship between these two predicates. Again, in essence

the difference is that app⋆⋆ binds the delimiter of its first argument while append⋆⋆ does not.

Proposition 7.16 Let D1, D2, and D3 be difference-lists such that D1 is free and of fixed length

and the delimiter of D1 does not appear in either D2 or D3. Then


1. app⋆⋆(D1,D2,D3) finitely fails iff append⋆⋆(D1,D2,D3) finitely fails.

2. app⋆⋆(D1,D2,D3) succeeds with answer θ iff append⋆⋆(D1,D2,D3) succeeds with answer φ,

and for every variable V , except the delimiter of D1, (θ V ) = (φ V ).

Proof: Let D1 = 〈R1 : . . . : Rn : X, X〉, D2 = 〈S1, S2〉, and D3 = 〈T1, T2〉. The call append⋆⋆(D1, D2, D3) is equivalent to the call

T1 = R1 : . . . : Rn : S1, T2 = S2, var-nil(T2).

The call app⋆⋆(D1, D2, D3) is equivalent to the call

T1 = R1 : . . . : Rn : S1, T2 = S2, var-nil(T2), X = S1.

Part (1) holds because the variable X does not appear in the first two equations, hence the second call succeeds if and only if the first does. Part (2) follows because the only difference between the two calls is that X is constrained in the second but not in the first.
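For instance (the queries are ours), ←app⋆⋆(〈a : x, x〉, 〈b : y, y〉, t) and ←append⋆⋆(〈a : x, x〉, 〈b : y, y〉, t) both succeed with t = 〈a : b : y, y〉; the only difference is that app⋆⋆ also binds the delimiter x of the first argument, namely to b : y.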

The conditions given in the proposition are all necessary for the equivalence to hold. Examples

similar to Example 7.14 and Example 7.15 can be constructed to show that it is necessary that

the first argument is free and of fixed length. Furthermore, if the delimiter of the first argument

appears in either of the other arguments, app⋆⋆ may fail when append⋆⋆ succeeds. The following

example shows this.

Example 7.17 The query

←app⋆⋆(〈a : x, x〉, 〈a : x, x〉, 〈a : a : y, y〉)

fails, whereas the query

←append⋆⋆(〈a : x, x〉, 〈a : x, x〉, 〈a : a : y, y〉)

succeeds.


7.3 Automatic difference-list transformation

We have stressed how the difference-list transformation should be regarded as a stepwise process:

a change in data structure (introduction of difference-lists) followed by the introduction of rapid

concatenation by means of app⋆ and app⋆⋆. For each stage, we have demonstrated that it is only applicable under certain circumstances. Section 7.2 discussed the safe replacement of lists by difference-lists, and that of append⋆ and append⋆⋆ by app⋆ and app⋆⋆. In this section we sketch how

the transformation can be performed automatically for a large class of programs, including all the

examples given by Sterling and Shapiro [93].

The transformation takes a program, a predicate to be optimised, and a description of the

intended use of that predicate. The description allows better analysis to be done and is referred

to as a “call template.” The transformation is only required to be safe with respect to such a call

template.

Definition. A call template for an n-ary predicate is an n-tuple of descriptors. The i’th descriptor

specifies whether the i’th argument is a fixed length list, a list (possibly of non-fixed length) or any

term.

Example 7.18 For the predicate flatten/2, the call template might specify that the first argument

is a fixed length list and the second is a list.

Definition. A call template C for predicate Q/n is consistent with the marked program P iff for

each argument of the query Q(X1, . . . ,Xn) which is marked for P , the corresponding descriptor in

C specifies either a fixed length list or a list.
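For instance, the call template of Example 7.18 is consistent with the marked flatten program of Example 7.6: only the second argument of the query flatten(X1, X2) is marked, and the corresponding descriptor specifies a list.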

Figure 7.1 gives an overview of the whole transformation. The transformation is viewed as a

three-stage process in which each stage is an analysis-synthesis sequence. The first stage is the

data structure transformation that introduces difference-lists and the corresponding ⋆-versions of

predicates, including append⋆ and append⋆⋆. The second stage transforms append⋆ and append⋆⋆

into their more efficient versions app⋆ and app⋆⋆ wherever possible, and subsequently unfolds these.

The first and second stages may have introduced calls to the non-logical var-nil. The purpose of

the third stage is to remove such calls whenever possible.

In more detail, Stage 1 is made up by the following:

Analysis:

1. Mark the first argument of each call to append in P .

2. Apply the Program Marking Algorithm to P giving P ′ (if it fails then return P ).

3. Check that the call template C for Q is consistent with P ′.


Input: A predicate Q/n, a call template C for Q, and a program P defining Q.

Stage 1:

Analysis: Determine which arguments must be changed to difference-

lists in P and check that this is consistent with C.

Synthesis: Replace lists by difference-lists as dictated by the Analysis

and introduce ⋆-versions of the predicates in P .

Stage 2:

Analysis: Determine where fast concatenation is safe.

Synthesis: Replace occurrences of append⋆ and append⋆⋆ by app⋆ and

app⋆⋆ as dictated by the Analysis.

Stage 3:

Analysis: Determine which difference-lists are simple at runtime.

Synthesis: Remove calls to var-nil as dictated by the analysis.

Figure 7.1: Overview of the transformation

Synthesis:

1. Apply the function τ to P ′ giving P ′′;

2. Rename each predicate Q′ in P ′′ to Q′⋆, provided any of its arguments are marked (this

includes changing append to append⋆ or append⋆⋆ as appropriate);

3. If Q is marked, add Q(X1, . . . ,Xn) ← (τ ′ Q(X1, . . . ,Xn)), where X1, . . . ,Xn are distinct

variables.

Example 7.19 If the program flatten from Example 7.6 is input to the transformation, then the


following program is returned from Stage 1:

flatten(z, zf)← flatten⋆(z, 〈zf, nil〉).
flatten⋆(x : y, 〈zf, zf′〉)←
    flatten⋆(x, 〈xf, xf′〉),
    flatten⋆(y, 〈yf, yf′〉),
    append⋆⋆(〈xf, xf′〉, 〈yf, yf′〉, 〈zf, zf′〉).
flatten⋆(nil, 〈v, v〉)← var-nil(v).
flatten⋆(z, 〈z : v, v〉)← var-nil(v), constant(z).
append⋆⋆(〈v, v〉, 〈y, y′〉, 〈y, y′〉)← var-nil(v), var-nil(y′).
append⋆⋆(〈u : x, x′〉, 〈y, y′〉, 〈u : z, z′〉)←
    var-nil(x′),
    var-nil(z′),
    append⋆⋆(〈x, x′〉, 〈y, y′〉, 〈z, z′〉).
var-nil(x)← if var(x) then true else x = nil.

We now look at Stage 2. We only consider the replacement of append⋆⋆ by app⋆⋆: the case of

append⋆ is simpler because Property 2 in the following definition always holds.

Definition. A call append⋆⋆(D1,D2,D3) is secure if the following three properties can be estab-

lished to hold at runtime:

1. D1 = 〈L1, L2〉 is of fixed length and most general.

2. L2 does not occur in D2 or D3.

3. L2 is not used in any subsequent unification.
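For instance, in the program of Example 7.19, used with the call template of Example 7.18, the call append⋆⋆(〈xf, xf′〉, 〈yf, yf′〉, 〈zf, zf′〉) is the one the analysis must certify secure: at runtime 〈xf, xf′〉 is a most general difference-list of fixed length, its delimiter xf′ occurs in neither of the other two arguments, and xf′ takes part in no later unification. This certification is what licenses the replacement shown in Example 7.20.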

By Proposition 7.16, a secure call append⋆⋆(D1,D2,D3) can be replaced by app⋆⋆(D1,D2,D3).

Security can be guaranteed by dataflow analysis. Properties 1 and 2 can be established by an

analysis similar to that used for mode and occur check analysis [19, 21, 92]. Property 3 can be

established by a live variable analysis of the kind used in compile-time garbage collection [9]. Thus

Stage 2 of the transformation is made up by the following:

Analysis:

For each call to append⋆⋆ check whether the call is secure.

Synthesis:

1. Replace secure calls to append⋆⋆ by calls to app⋆⋆ and unfold these.

2. Remove the definition of append⋆⋆, if no longer needed.


Example 7.20 If the program from Example 7.19 is input to Stage 2, the following program is

returned:

flatten(z, zf)← flatten⋆(z, 〈zf, nil〉).
flatten⋆(x : y, 〈xf, yf′〉)←
    flatten⋆(x, 〈xf, yf〉),
    flatten⋆(y, 〈yf, yf′〉),
    var-nil(yf′).
flatten⋆(nil, 〈v, v〉)← var-nil(v).
flatten⋆(z, 〈z : v, v〉)← var-nil(v), constant(z).
var-nil(x)← if var(x) then true else x = nil.

The programs that result from Stage 2 will in general contain calls to var-nil. Referring to rather

contrived example programs, we saw how these checks were necessary, but for many common

programs they are not needed. A simple dataflow analysis performed in Stage 3 can determine when

calls to var-nil can be omitted. The reader may recall the definition of var-nil from Example 7.20.

From this definition it is clear that we are interested in determining the following runtime properties:

whether variables are definitely bound to nil, and whether variables are definitely free, that is,

unbound or bound to other variables. This analysis is similar to that used for mode analysis, for

example by abstract interpretation.

Example 7.21 If the program from Example 7.20 is input to Stage 3, the following program is

returned:

flatten(z, zf)← flatten⋆(z, 〈zf, nil〉).
flatten⋆(x : y, 〈xf, yf′〉)← flatten⋆(x, 〈xf, yf〉), flatten⋆(y, 〈yf, yf′〉).
flatten⋆(nil, 〈v, v〉).
flatten⋆(z, 〈z : v, v〉)← constant(z).

The worst-case time complexity for this program is O(n), where n is the number of constants in z, whereas for the original flatten (Example 7.6), it was O(n²).

The correctness of the whole transformation should be clear, since at each stage, the analysis simply

aims at establishing that the subsequent synthesis yields an equivalent program by the propositions

of Section 7.2. The termination of the process is evident assuming that the analyses in Stage 2 and

Stage 3 terminate.

Note that resulting programs may contain calls to var-nil and thus may be non-logical. This

is the price to be paid for having a very general transformation, able to handle a large class of

programs. However, the dataflow analysis in Stage 3 is intended to discover needless calls to

var-nil, and so for all the standard examples [93], the resulting program should be as logical as the

original.


7.4 Discussion

We have shown that the introduction of difference-lists in a Prolog program followed by optimisation

of concatenation should be handled with great care: the program’s semantics is easily changed by

accident. This transformation, however, is useful for a large class of list-processing programs,

because it considerably improves efficiency.

Fortunately it is possible to design a dataflow analysis to recognise cases where the transfor-

mation is safe. This dataflow analysis can be made sufficiently precise to allow for the automatic

transformation of all the standard example programs known to benefit from the use of difference-

lists. We have sketched this analysis and how it forms part of the total transformation process.

Our aim has been to analyse the “semantics” of difference-lists and to point to the possibility of

an automatic transformation. We have not been concerned with the transformation’s efficient im-

plementation. Some of the devices we have introduced, such as the predicate ⋆=, or the division into

three stages, serve merely explanatory purposes and would not be called for in an implementation.

The resulting programs need not contain any difference-lists at all: it is clear that the desired

improvements of programs can be achieved by replacing a list argument not by a difference-list but

by two new arguments to the predicate. Thus, if one prefers, the generated predicates can avoid

difference-lists by having more arguments (though of course the difference-lists are still implicitly

present). The programs that result from this approach are very similar to those resulting from the

well-known unfold/fold transformations [95]. Indeed such transformations are used in the method

of Zhang and Grant [102]. There is no doubt that the two transformation processes are closely

connected. The present approach, however, seems to have the advantage of lending itself more

readily to automation. This is because the unfold/fold paradigm calls for “eurekas” in the form

of new predicate definitions. Another problem with an unfold/fold-based method (such as Zhang

and Grant’s) is that folding in general does not preserve finite failure sets, and therefore such an

approach is correct in a somewhat weaker sense than ours. (In fact, the exact conditions under

which folding is correct have only been laid down very recently [29].)

It is not easy to give a detailed analysis of the decrease in time complexity obtained by the

transformation we have sketched. The reduction in number of derivation steps, however, is obvious,

and experience indicates that the transformation is usually worthwhile. The greatest expense of

the transformation lies in the necessary dataflow analyses. In this connection it is worth pointing

out the possibility of reusing dataflow information for many different purposes. As this report

shows, many useful properties of logic programs (determinacy, mode or type correctness, absence

of occur check problems, etc.) can be determined by dataflow analyses that propagate the same

kinds of information (freeness, sharing, etc.) as are required by the present analysis. This suggests

the usefulness of fusing such analyses.

The sketched transformation is very general. It may be applied successfully to almost any

useful list-processing program. For the standard examples, the transformation behaves optimally.

For more intractable programs it does not give up, but leaves certain (non-logical) calls in the


program to perform runtime checks.

Of course, the basic idea of pasting data structures together efficiently by introducing extra

variables applies not only to lists. It appears that a study of the more general case would be very

useful, but much harder: here we have relied heavily on the heuristic that it is useful to give append

a difference-list as a first argument.
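For lists, the pay-off of the extra variable is the familiar one-clause concatenation of difference-lists (a textbook definition, cf. [93], in our notation):

append_dl(Xs-Ys, Ys-Zs, Xs-Zs).

A call such as append_dl(A-B, B-C, A-C) succeeds in constant time: "concatenation" is a single unification of the open tail of the first pair with the front of the second.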


Chapter 8

Conclusion

We have presented a theory of dataflow analysis of logic programs and we have sketched some

applications in areas such as error finding, program transformation, compilation, and parallelisation.

We have called our thesis “Semantics-Based Analysis and Transformation of Logic Programs,”

because we have seen our work as an attempt not only to justify certain dataflow analyses and

program transformations with respect to an underlying semantics, but also to understand them

better by clarifying their exact relation to semantics. We have striven to obtain simple definitions

of the semantics of logic programs and dataflow analyses, with the hope that an understanding

of their close relationship could spring from a high degree of congruence of the definitions. The

definitions reflect a choice of what is essential in the programming language under study. We

feel that the contribution of the present thesis lies as much in the suggested definitions as in the

theorems established.

A particular aim has been to try to capture the essence of what seems to us to be a major class

of analyses called for in many different logic programming tools. Like so much related work, our

theory is based on P. and R. Cousot’s notion of abstract interpretation. One assumption made by

P. and R. Cousot has been relaxed, so as to obtain a more widely applicable theory, but we do not

hesitate to call our approach abstract interpretation. In particular, the whole development of the

groundness analysis presented in Section 6.4 fits into the Cousot framework.

A more precise label for our work would be “denotational abstract interpretation,” though.

In this respect it differs significantly from much of the other work that has been published about

abstract interpretation of logic programs. We hope that the present thesis will be seen as a point

in favour of the denotational approach developed by F. Nielson, or at least of an approach based

on a powerful meta-language such as that of denotational semantics. Nielson’s approach allows for

generality at different levels. First, the language of denotational semantics allows for comfortable

reasoning at exactly the level of abstraction called for by any particular class of applications, or

dataflow analyses. Second, the proof of correctness of a particular dataflow analysis becomes almost

trivial, since most of it can be conducted at the level of the meta-language once and for all. Finally,

most of the theory is independent of any particular programming language, since it is expressed


in terms of the meta-language only. This last point indicates that our work could be redone for

related programming language paradigms without too much effort.

We have found it useful to distinguish between bottom-up and top-down analysis. This distinc-

tion is not clear-cut, but we think of a “top-down semantics” as one that allows for extraction of

information about the SLD tree that corresponds to the execution of a program given some query.

Bottom-up analysis is not based on such a semantics to begin with, and therefore it cannot provide

information about, for example, calls that will take place at runtime. Bottom-up analysis suffices for

several applications, though. It is not only the conceptually simpler of the two; it also allows for

efficient derivation of query-independent information about a program. We have given examples

of its use for program specialisation. As examples of the use of top-down analysis we looked at

groundness analysis, and as a more complex case, we studied how top-down dataflow analysis could

be used to guarantee the correctness of the (automatic) transformation of list-processing programs

into more efficient programs using difference-lists.
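To give the flavour of the descriptions that such a top-down analysis propagates, here is a deliberately crude groundness domain, sketched in Prolog (the sketch is ours; it tracks definite groundness only and ignores the sharing and freeness information that a precise analysis, such as that of Section 6.4, must also propagate). A description is a list of the variables known to be bound to ground terms; abstract interpretation of a binding X = T adds X when every variable of T is already described as ground:

% abs_bind(+Binding, +Description0, -Description)
abs_bind(X = T, G0, G) :-
    term_variables(T, Vs),
    ( all_in(Vs, G0) -> add_var(X, G0, G) ; G = G0 ).

all_in([], _).
all_in([V|Vs], G) :- member_eq(V, G), all_in(Vs, G).

member_eq(X, [Y|_]) :- X == Y, !.
member_eq(X, [_|Ys]) :- member_eq(X, Ys).

add_var(X, G, G) :- member_eq(X, G), !.
add_var(X, G, [X|G]).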

In this thesis we have paid far too little attention to implementation issues. The aim of a

theory of dataflow analysis is of course not just to improve our understanding of the phenomenon

but to use that understanding to improve the practice of building programming tools. The reader

may feel that, in our quest for abstract semantic definitions (so as to simplify definitions), we have

only widened the theory/practice gap, since “more abstract” can easily be taken to mean “further

away from a concrete implementation.” This is not correct, though. The “dataflow semantics”

of Section 6.3 was obtained (and its correctness proved) through a series of abstract semantics,

but instances of it are no more difficult to implement efficiently than some of the “algorithms”

or “procedures” previously suggested [7, 44], assuming that the implementor knows some of the

standard tricks for efficient fixpoint computation. (An interesting project would be to build an

“analyzer generator” from the dataflow semantics. Such a device would, given an “interpretation,”

expressed in a suitable definition language, automatically produce a program that performs the

corresponding dataflow analysis.)
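As a hint of how small the core of such an analyser is, the following is a minimal sketch (ours, and ignoring all of the standard tricks just alluded to): naive Kleene iteration of a monotone operator Op from a least description Bot, which terminates with the least fixpoint whenever the domain of descriptions has the finite chain property. It assumes descriptions are kept in a canonical form, so that syntactic identity (==) decides equality:

% lfp(+Op, +Bot, -Fix): iterate Op until it stabilises.
lfp(Op, Bot, Fix) :-
    call(Op, Bot, Next),
    ( Next == Bot -> Fix = Bot ; lfp(Op, Next, Fix) ).

For instance, lfp(step_p, [], Fix) would compute the analysis result, assuming step_p/2 (a hypothetical name) implements one application of the abstract operator induced by the interpretation.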

Fixpoint computation apart, efficient implementation of a particular dataflow analysis comes

down to the implementation of the operations in the corresponding interpretation, including possible

normalisation of descriptions. Clearly the complexity of these operations depends on the granularity

of the domain of descriptions chosen for the analysis, as does the precision of the information we

can extract. While we have well-established means for discussing complexity, there is no good

metric for “precision” of a dataflow analysis, and this makes it very hard to discuss the obvious

trade-off between higher time/space complexity and less precise dataflow information. So far, it

seems that what constitutes a “good” dataflow analysis is a purely empirical question, but it may

still be possible to develop some useful measure for “precision” that would be helpful in discussions

of the trade-off.

It would be useful to study to what extent ideas developed here could be used for related

programming language paradigms. Owing to the denotational framework, one would hope that it

would be relatively straightforward to redo some of the work for related programming languages,


such as logic programming languages with delay mechanisms (“freeze,” “wait,” “when,” etc.), de-

ductive databases [49, 72], constraint logic programming languages [36, 57], and perhaps even

concurrent constraint programming languages [86]. Whether this is the case remains to be seen,

though. Presumably the kinds of dataflow information relevant for implementing these program-

ming languages differ from what has been discussed in the present thesis, but it would seem that

abstract interpretation could be as versatile for these languages as it has proved to be for logic

programming.

The basic hypothesis in the present work has been that of P. and R. Cousot: many important

dataflow analyses in a programming language L can be understood as approximation of extreme

fixpoints in L’s semantic domains. It seems certain, however, that this does not cover all interesting

analyses, and it would be very useful to understand the limitations of the framework of abstract

interpretation in a logic programming context. One problem is that there are many useful properties

that are not inclusive, such as “finiteness” in deductive databases. In such cases our present theory

allows for no better solution than approximating the relevant least fixpoint by approximating a

greatest fixpoint from above.

Another problem is that some of the domains of descriptions that have been suggested for useful

dataflow analyses are neither ascending nor descending chain finite. This is the case, for example,

for the so-called regular trees used by Pyo and Reddy [82] for type inference. It would therefore

seem that methods such as those of Heintze and Jaffar [32] and Pyo and Reddy cannot easily

be captured, if at all, unless one considers incorporating some kind of “widening,” as discussed

by P. and R. Cousot [13]. Finally it is worth mentioning that Plaisted’s [80] idea about abstract

theorem proving bears a close resemblance to abstract interpretation of logic programs. Plaisted

points out that analogy is often a useful guide to proofs and discusses how one can use abstraction

in theorem proving to simplify reasoning. The idea is to first prove a related but simpler theorem

and then use its proof to guide a proof of the general theorem. Our bottom-up analyses can be

seen as particular examples (which use modus ponens as inference rule) of this principle. Similarly

our top-down analyses are examples that use resolution as deductive machinery.


Appendix A

Correctness of the base semantics

To prove that the base semantics approximates the SLD semantics we first state a number of

elementary results concerning parametric substitutions.

Lemma A.1 Let Π,Π′ ⊆ Psub be given. Then

∆ dom (meet Π Π′) ⊆ (∆ dom Π) ∪ (∆ dom Π′).

Lemma A.2 Let Π ⊆ Psub and V,W ⊆ Var be given. Then

restrict V (restrict W Π) = restrict(W ∩ V ) Π.

Lemma A.3 Let Π ⊆ Psub and V,W ⊆ Var be given such that W ∩ (∆ dom Π) ⊆ V ⊆ W . Then

restrict W Π = restrict V Π.

Lemma A.4 Let Π,Π′ ⊆ Psub be given and let W ⊆ Var be such that ∆ dom Π ⊆ W . Then

restrict W (meet Π Π′) = meet Π (restrict W Π′).

Lemma A.5 Let θ, θ′ ∈ Sub be given such that (dom θ′) ∩ (vars θ) = ∅. Then

β (θ ◦ θ′) = β θ ⊓ β θ′.

Lemma A.6 Let θ ∈ Sub and π ∈ Psub be given such that π ≤ β θ. Then π = π ◦ θ.

The following result is a direct consequence of Lemma A.3.

Lemma A.7 Let A,A′ ∈ Atom and π ∈ Psub be given. If (vars A) ∩ (vars A′) = ∅ and (dom π) ∩

(vars A) = ∅ then unify A A′ {π} = restrict(vars A) (π ⊓ β (mgu A A′)).


We also need to establish some invariants of the base semantics.

Definition. The predicate additive : Den → Bool is defined by

additive d iff ∀A ∈ Atom . ∀Π ⊆ Psub . d A Π = ⊔π∈Π (d A {π}).

Lemma A.8 The predicate additive is inclusive.

Proof: Let D ⊆ Den be a chain such that additive d holds for all d ∈ D. Then

π′ ∈ (⊔D) A Π

⇔ ∃ d ∈ D . π′ ∈ d A Π

⇔ ∃ d ∈ D . ∃π ∈ Π . π′ ∈ d A {π}

⇔ ∃π ∈ Π . π′ ∈ (⊔D) A {π}

⇔ π′ ∈ ⊔π∈Π ((⊔D) A {π}).

Lemma A.9 For all P ∈ Prog and d ∈ Den, additive (P′bas [[P ]] d) holds.

Proof: We have that

π′ ∈ P′bas [[P ]] d A Π

⇔ ∃C ∈ P . π′ ∈ Cbas [[C]] d A Π

⇔ ∃C ∈ P . ∃π ∈ Π . π′ ∈ Cbas [[C]] d A {π}

⇔ ∃π ∈ Π . π′ ∈ P′bas [[P ]] d A {π}

⇔ π′ ∈ ⊔π∈Π (P′bas [[P ]] d A {π}).

The following lemma characterizes additivity.

Lemma A.10 For all d ∈ Den,

additive d ⇔ ∀G ∈ Atom∗ . ∀Π ⊆ Psub . Σ d G Π = ⋃π∈Π (Σ d G {π}).

Proof: The ⇐ direction holds trivially. The ⇒ direction can be proved by structural induction on

G.

Lemma A.11 Let D ⊆ Den be a non-empty chain such that additive d holds for all d ∈ D. Then

π ∈ Σ (⊔D) G Π ⇔ ∃ d ∈ D . π ∈ Σ d G Π.

Proof: The proof is by structural induction on G. If G = nil, the assertion holds since D ≠ ∅.

Otherwise G = (A : G′) for some A and G′. Assume as hypothesis for induction that

π ∈ Σ (⊔D) G′ Π ⇔ ∃ d ∈ D . π ∈ Σ d G′ Π.


We have that

π′′ ∈ Σ (⊔D) G Π

⇔ π′′ ∈ Σ (⊔D) G′ ((⊔D) A Π) (by the definition of Σ)

⇔ ∃π ∈ Π . ∃π′ ∈ (⊔D) A {π} . π′′ ∈ Σ (⊔D) G′ {π′}

by Lemmas A.8 and A.10. It follows by the induction hypothesis that

π′′ ∈ Σ (⊔D) G Π

⇔ ∃π ∈ Π . ∃π′ ∈ (⊔D) A {π} . ∃ d ∈ D . π′′ ∈ Σ d G′ {π′}

⇔ ∃ d ∈ D . π′′ ∈ Σ d G Π

by Lemma A.10 and the definition of Σ.

Definition. The predicate comp : Den → Bool is defined by

comp d iff ∀G ∈ Atom∗ . ∀π, π′ ∈ Psub . ∀ θ ∈ Sub . ∀ ρ ∈ Ren.

∆ dom (Σ d G {π}) ⊆ (dom π) ∪ (vars G)

∧ (π G = π′ G ∧ π′ ≤ π) ⇒ Σ d G {π′} = meet{π′} (Σ d G {π})

∧ π ≤ β θ ⇒ Σ d (θ G) {π} = Σ d G {π}

∧ Σ d G {π} = {π′ ◦ ρ | π′ ∈ Σ d (ρ G) {π ◦ ρ}}.

Lemma A.12 The predicate λd . (comp d) ∧ (additive d) is inclusive.

Proof: Let D ⊆ Den be a chain such that (additive d) and (comp d) both hold for every d ∈ D. By

Lemma A.8, additive (⊔D) holds. If D = ∅ then it is straightforward to verify that comp (⊔D) holds. For D ≠ ∅, Lemma A.11 makes it a mechanical task to verify that comp (⊔D) holds.

Lemma A.13 For all P ∈ Prog and d ∈ Den, comp d⇒ comp (P′bas [[P ]] d).

Proof: Let d′ = P′bas [[P ]] d. We must prove the following:

(1) π2 ∈ Σ d′ G {π} ⇒ dom π2 ⊆ (dom π) ∪ (vars G)

(2) (π G = π′ G ∧ π′ ≤ π) ⇒ (π2 ∈ Σ d′ G {π} ⇔ (π2 ⊓ π′) ∈ (Σ d′ G {π′}) ∪ {⊥Psub})

(3) π ≤ (β θ) ⇒ (π2 ∈ Σ d′ (θ G) {π} ⇔ π2 ∈ Σ d′ G {π})

(4) (π2 ◦ ρ) ∈ Σ d′ (ρ G) {π ◦ ρ} ⇔ π2 ∈ Σ d′ G {π}.

The proof is by structural induction on G. If G = nil then it is straightforward to verify (1)–(4).

Otherwise G = (A : G′) for some A and G′. Assume as hypothesis for induction that

(1′) π2 ∈ Σ d′ G′ {π} ⇒ dom π2 ⊆ (dom π) ∪ (vars G′)

(2′) (π G′ = π′ G′ ∧ π′ ≤ π) ⇒ (π2 ∈ Σ d′ G′ {π} ⇔ (π2 ⊓ π′) ∈ (Σ d′ G′ {π′}) ∪ {⊥Psub})

(3′) π ≤ (β θ) ⇒ (π2 ∈ Σ d′ (θ G′) {π} ⇔ π2 ∈ Σ d′ G′ {π})

(4′) (π2 ◦ ρ) ∈ Σ d′ (ρ G′) {π ◦ ρ} ⇔ π2 ∈ Σ d′ G′ {π}


all hold. By Lemma A.9, additive d′ holds. Thus by Lemma A.10 and the definition of d′,

π2 ∈ Σ d′ G {π}

⇔ ∃π1 ∈ d′ A {π} . π2 ∈ Σ d′ G′ {π1}

⇔ ∃ [[H ← B]] ∈ P . ∃π3 ∈ unify A H (Σ d B (unify H A {π})) . π2 ∈ Σ d′ G′ {π ⊓ π3}.

By the definition of unify, (dom π3) ⊆ (vars A). Thus by Lemma A.1, dom(π ⊓ π3) ⊆ (dom π) ∪

(vars A). By (1′), dom π2 ⊆ (dom π) ∪ (vars A) ∪ (vars G′) = (dom π) ∪ (vars G), so (1) holds.

Assume π G = π′ G ∧ π′ ≤ π. Since π A = π′ A, we have that

unify H A {π} = unify H A {π′}.

Thus,

unify A H (Σ d B (unify H A {π})) = unify A H (Σ d B (unify H A {π′})).

Let π3 ∈ unify A H (Σ d B (unify H A {π})). Then (dom π3) ⊆ (vars A) ⊆ (vars G). Repeated

use of Lemma A.4 therefore gives

restrict (vars G) {π3 ⊓ π′}

= meet {π3} (restrict(vars G) {π′})

= meet {π3} (restrict(vars G) {π})

= restrict (vars G) {π3 ⊓ π}.

Thus (π3 ⊓ π) G′ = (π3 ⊓ π′) G′. Since (π3 ⊓ π′) ≤ (π3 ⊓ π), it follows from (2′) that

π2 ∈ Σ d′ G′ {π3 ⊓ π} ⇔ (π2 ⊓ π3 ⊓ π′) ∈ (Σ d′ G′ {π3 ⊓ π′}) ∪ {⊥Psub}.

Also, by (2′), π2 ≤ (π3 ⊓ π) ≤ π3. Thus,

π2 ∈ Σ d′ G′ {π3 ⊓ π} ⇔ (π2 ⊓ π′) ∈ (Σ d′ G′ {π3 ⊓ π′}) ∪ {⊥Psub},

and (2) follows.

Assume π ≤ β θ. By the definition of unify, unify H A {π} = unify H (θ A) {π}. By (2′),

π′′′ ∈ Σ d B π′′ ⇒ π′′′ ≤ π′′. Thus

∀π′′′ ∈ Σ d B (unify H A {π}) . (π′′′ H) ≤ π (θ A).

Hence, by the definition of unify,

π3 ∈ unify A H (Σ d B (unify H A {π}))

⇔ ∃π′3 ∈ unify (θ A) H (Σ d B (unify H (θ A) {π})) . π3 ⊓ (β θ) = π′3 ⊓ (β θ)

⇒ ∃π′3 ∈ unify (θ A) H (Σ d B (unify H (θ A) {π})) . π3 ⊓ π = π′3 ⊓ π.

Clearly (π3 ⊓ π) ≤ β θ, so (3) follows from (3′).


Finally it follows from the definition of unify that

π3 ∈ unify A H (Σ d B (unify H A {π}))

⇔ (π3 ◦ ρ) ∈ unify (ρ A) H (Σ d B (unify H (ρ A) {π ◦ ρ})).

By the definition of ⊓, (π3 ⊓ π) ◦ ρ = (π3 ◦ ρ) ⊓ (π ◦ ρ), so (4) follows from (4′).

Definition. The predicate cor : (Sem ×Den)→ Bool is defined by

cor (s, d) iff ∀G ∈ Atom∗ . ∀ θ ∈ Sub . restrict U (∆ β (s θ (θ G))) ⊆ Σ d G {β θ},

where U = (vars θ) ∪ (vars G).

Lemma A.14 The predicate λ (s, d) . (cor (s, d) ∧ additive d) is inclusive.

Proof: Let X ⊆ Sem × Den be a chain such that for all (s, d) ∈ X, cor (s, d) ∧ additive d

holds. Let D = {d | (s, d) ∈ X} and S = {s | (s, d) ∈ X}. Clearly D and S are chains and ⊔X = (⊔S, ⊔D). By Lemma A.8, additive (⊔D) holds. We must show that cor (⊔S, ⊔D) holds. Letting U = (vars θ) ∪ (vars G) we have

π ∈ restrict U (∆ β ((⊔S) θ (θ G)))

⇔ ∃ s ∈ S . π ∈ restrict U (∆ β (s θ (θ G)))

⇔ ∃ d ∈ D . π ∈ Σ d G {β θ}

⇔ π ∈ Σ (⊔D) G {β θ}.

The assertion follows.

Lemma A.15 Let P ∈ Prog, d ∈ Den and s ∈ Sem be given. Then

(comp d ∧ d ≤ lfp(P′bas [[P ]]) ∧ cor (s, d)) ⇒ cor ((O′ [[P ]] s), (P′bas [[P ]] d)).

Proof: Let G ∈ Atom∗, θ ∈ Sub, and P ∈ Prog be given and let U = (vars θ) ∪ (vars G). Assume

that comp d, d ≤ lfp(P′bas [[P ]]), and cor (s, d) all hold. By the definition of cor and that of O′, it

suffices to show that

(5) restrict U (∆ β (O′ [[P ]] s θ (θ G))) ⊆ Σ (P′bas [[P ]] d) G {β θ}.

If G = nil then both sides of (5) are equal to {β θ}, so (5) holds. Otherwise θ G = (A : G′) for

some A and G′. Consider C ∈ P and let [[H←B]] = rename U C. If mgu A H = ∅ then (5) holds.

Otherwise mgu A H = {µ} for some substitution µ. Let θ′ = µ ◦ θ.

Note that vars µ = (vars A) ∪ (vars H) and that ((vars H) ∪ (vars B)) ∩ U = ∅. Since θ is

idempotent, θ (A : G′) = (A : G′). Thus (dom θ) ∩ (vars A) = ∅ and so (dom θ) ∩ (vars µ) = ∅.


By the definition of C′,

restrict U (∆ β (C′ [[C]] s θ (θ G)))

= restrict U (∆ β (s θ′ (µ (B :: G′))))

= restrict U (∆ β (s θ′ (θ′ (B :: G′))))

since θ′ (B :: G′) = µ (B :: G′). Let U ′ = (vars θ′) ∪ (vars(B :: G′)). Then U ⊆ U ′ and thus

restrict U (∆ β (C′ [[C]] s θ (θ G)))

= restrict U (restrict U ′ (∆ β (s θ′ (θ′ (B :: G′))))) (by Lemma A.2)

⊆ restrict U (Σ d (B :: G′) {β θ′}) (since cor (s, d) holds)

= restrict U (Σ d G′ (Σ d B {β θ′})) (by the definition of Σ)

= restrict U (meet Π (Σ d G′ (restrict U Π))),

where Π = Σ d B {β θ′}, since vars G ⊆ U and comp d holds. Furthermore, since comp d holds,

we have by Lemma A.1 that dom(Σ d G′ (restrict U Π)) ⊆ vars U . Thus

restrict U (∆ β (C′ [[C]] s θ (θ G)))

⊆ meet(restrict U Π) (Σ d G′ (restrict U Π)) (by Lemma A.4)

= Σ d G′ (restrict U Π) (since comp d holds)

⊆ Σ (P′bas [[P ]] d) G′ (restrict U Π)

since d ≤ lfp(P′bas [[P ]]) and P′bas is monotonic. Thus

(6) restrict U (∆ β (C′ [[C]] s θ (θ G))) ⊆ Σ (P′bas [[P ]] d) G′ (restrict U (Σ d B {β θ′})).

Now restrict U (Σ d B {β θ′}) = restrict U (meet{β θ′} (Σ d B (restrict(vars B) {β θ′}))), since

comp d holds. Since (vars B) ∩ (dom{β θ′}) ⊆ vars H,

restrict U (Σ d B {β θ′})

= restrict U (meet{β θ′} (Σ d B (restrict (vars H) {β θ′}))) (by Lemma A.3)

= restrict U (meet{β θ′} (Σ d B (restrict (vars H) {(β µ) ⊓ (β θ)}))) (by Lemma A.5)

= restrict U (meet{β θ′} (Σ d B (unify H A {β θ}))) (by Lemma A.7)

= restrict U (meet{β θ} (meet{β µ} (Σ d B (unify H A {β θ})))) (by Lemma A.5)

= meet{β θ} (restrict U (meet{β µ} (Σ d B (unify H A {β θ})))) (by Lemma A.4)

since dom{β θ} = (vars θ) ⊆ U . Now by comp d and Lemma A.1,

dom(meet{β µ} (Σ d B (unify H A {β θ}))) ⊆ (vars A) ∪ (vars H) ∪ (vars B).

Since U ∩ ((vars A) ∪ (vars H) ∪ (vars B)) ⊆ vars A, it follows by Lemma A.3 that

restrict U (Σ d B {β θ′})

= meet{β θ} (restrict(vars A) (meet{β µ} (Σ d B (unify H A {β θ}))))

= meet{β θ} (unify A H (Σ d B (unify H A {β θ})))


by Lemma A.7. Now, since [[H ← B]] is a renaming of C ∈ P , there is some ρ ∈ Ren such that

C = [[(ρ H)← (ρ B)]]. By the definition of unify and by comp d, therefore

restrict U (Σ d B {β θ′})

= meet{β θ} (unify A (ρ H) (Σ d (ρ B) (unify (ρ H) A {β θ})))

= Cbas [[C]] d A {β θ}.

Insertion in (6) therefore gives

restrict U (∆ β (C′ [[C]] s θ (θ G))) ⊆ Σ (P′bas [[P ]] d) G′ (Cbas [[C]] d A {β θ}).

Since C ∈ P was chosen arbitrarily, it follows from the definitions of O′ and Pbas that

restrict U (∆ β (O′ [[P ]] s θ (θ G)))

⊆ Σ (P′bas [[P ]] d) G′ (P′bas [[P ]] d A {β θ})

= Σ (P′bas [[P ]] d) (A : G′) {β θ} (by the definition of Σ)

= Σ (P′bas [[P ]] d) G {β θ}.

By comp d and Lemma A.12, since θ G = (A : G′), (5) holds.

Lemma A.16 For all programs P ,

cor (lfp (O′ [[P ]]), lfp (P′bas [[P ]])) ∧ comp (lfp (P′bas [[P ]])) ∧ additive (lfp (P′bas [[P ]])).

Proof: Let cor1 be defined by cor1 (s, d) iff comp d ∧ additive d ∧ cor (s, d). By Lemmas A.12

and A.14, cor1 is inclusive. By Lemmas A.13 and A.15,

(cor1 (s, d) ∧ (s, d) ≤ (lfp (O′ [[P ]]), lfp (P′bas [[P ]]))) ⇒ cor1 ((O′ [[P ]] s), (P′bas [[P ]] d)).

Thus by fixpoint induction, cor1 (lfp (O′ [[P ]]), lfp (P′bas [[P ]])) holds and the assertion follows.

We can now prove Theorem 6.7, that is, show that the base semantics and the SLD semantics are

congruent.

Theorem 6.7 For all programs P , cong ((O [[P ]]), (Pbas [[P ]])).

Proof: We must show that cong ((O [[P ]]), (Pbas [[P ]])) holds. Let d = Pbas [[P ]] and e = O [[P ]].

Now d = lfp (P′bas [[P ]]) and so by Lemmas A.8 and A.16, additive d holds. Thus it suffices to

show for every A ∈ Atom and θ ∈ Sub that

{θ′ (θ A) | θ′ ∈ e (θ (A : nil))} ⊆ {π A | π ∈ d A {β θ}}.

By the definition of O,

{θ′ (θ A) | θ′ ∈ e (θ (A : nil))}

= {θ′ (θ A) | θ′ ∈ lfp (O′ [[P ]]) ι (θ (A : nil))}

⊆ {π (θ A) | π ∈ d (θ A) {ǫ}}


since by Lemma A.16, cor (lfp (O′ [[P ]]), d). If (dom π) ⊆ vars (θ A) then by Lemmas A.4 and A.6,

(π ⊓ (β θ)) A = (π ⊓ (β θ)) (θ A) = π (θ A). Therefore

{θ′ (θ A) | θ′ ∈ e (θ (A : nil))}

= {(π ⊓ (β θ)) A | π ∈ d (θ A) {ǫ}}

= {π A | π ∈ meet{β θ} (d (θ A) {ǫ})}

= {π A | π ∈ d (θ A) {β θ}} (since by Lemma A.16, comp d holds)

= {π A | π ∈ d A {β θ}}.

The assertion follows.


Bibliography

[1] S. Abramsky and C. Hankin, editors. Abstract Interpretation of Declarative Languages. Ellis

Horwood, 1987.

[2] K. Apt and M. van Emden. Contributions to the theory of logic programming. Journal of the

ACM 29: 841–862, 1982.

[3] F. Bancilhon et al. Magic sets and other strange ways to implement logic programs. In Proc.

Fifth ACM Symp. Principles of Database Systems, pages 1–15. Cambridge, Massachusetts,

1986.

[4] R. Barbuti, R. Giacobazzi and G. Levi. A declarative approach to abstract interpretation of

logic programs. Technical report 20/89, Dept. of Informatics, University of Pisa, Italy, 1989.

[5] G. Birkhoff. Lattice Theory (AMS Coll. Publ. XXV). American Mathematical Society, third

edition 1973.

[6] C. Bloch. Source-to-source transformations of logic programs. M. Sc. Dissertation, Weizmann

Institute of Science, Rehovot, Israel, 1984.

[7] M. Bruynooghe. A framework for the abstract interpretation of logic programs. Report

CW 62, Dept. of Computer Science, University of Leuven, Belgium, 1987.

[8] M. Bruynooghe. A practical framework for the abstract interpretation of logic programs. To

appear in Journal of Logic Programming.

[9] M. Bruynooghe et al. Abstract interpretation: towards the global optimization of Prolog

programs. In Proc. Fourth Int. Symp. Logic Programming, pages 192–204. San Francisco,

California, 1987.

[10] K. Clark. Negation as failure. In Gallaire and Minker [28], pages 293–322.

[11] K. Clark and S.-A. Tarnlund. A first order theory of data and programs. In B. Gilchrist,

editor, Information Processing, pages 939–944. North-Holland, 1977.

[12] A. Colmerauer. Metamorphosis grammars. In L. Bolc, editor, Natural Language Communica-

tion with Computers (Lecture Notes in Computer Science 63), pages 133–189. Springer-Verlag,

1978.


[13] P. Cousot and R. Cousot. Abstract interpretation: a unified lattice model for static analy-

sis of programs by construction or approximation of fixpoints. In Proc. Fourth Ann. ACM

Symp. Principles of Programming Languages, pages 238–252. Los Angeles, California, 1977.

[14] P. Cousot and R. Cousot. Systematic design of program analysis frameworks. In Proc. Sixth

Ann. ACM Symp. Principles of Programming Languages, pages 269–282. San Antonio, Texas,

1979.

[15] P. Dart. On derived dependencies and connected databases. To appear in Journal of Logic

Programming.

[16] S. K. Debray. Efficient dataflow analysis of logic programs. Draft manuscript, 37 pages.

Dept. of Computer Science, University of Arizona, 1989. Preliminary version in Proc. Fif-

teenth Ann. ACM Symp. Principles of Programming Languages, pages 260–273. San Diego,

California, 1988.

[17] S. K. Debray. Global Optimization of Logic Programs. Ph. D. Thesis, State University of New

York at Stony Brook, New York, 1986.

[18] S. K. Debray. Static analysis of parallel logic programs. In Kowalski and Bowen [46], pages

711–732.

[19] S. K. Debray. Static inference of modes and data dependencies in logic programs. ACM

Transactions on Programming Languages and Systems 11 (3): 418–450, 1989.

[20] S. K. Debray and P. Mishra. Denotational and operational semantics for Prolog. Journal of

Logic Programming 5 (1): 61–91, 1988.

[21] S. K. Debray and D. S. Warren. Automatic mode inference of logic programs. Journal of

Logic Programming 5 (3): 207–229, 1988.

[22] S. K. Debray and D. S. Warren. Functional computations in logic programs. ACM Transac-

tions on Programming Languages and Systems 11 (3): 451–481, 1989.

[23] P. Deransart and J. Małuszyński. Relating logic programs and attribute grammars. Journal

of Logic Programming 2 (2): 119–156, 1985.

[24] M. Falaschi et al. A new declarative semantics for logic languages. In Kowalski and Bowen

[46], pages 993–1005.

[25] M. Fitting. Bilattices and the semantics of logic programming. To appear in Journal of Logic

Programming.

[26] M. Fitting. A Kripke-Kleene semantics for logic programs. Journal of Logic Programming 2

(4): 295–312, 1985.


[27] J. Gallagher, M. Codish and E. Shapiro. Specialisation of Prolog and FCP programs using

abstract interpretation. New Generation Computing 6 (2,3): 159–186, 1988.

[28] H. Gallaire and J. Minker, editors. Logic and Databases. Plenum Press, 1978.

[29] P. Gardner and J. Shepherdson. Unfold/fold transformations of logic programs. Draft report,

30 pages. Dept. of Computer Science, University of Edinburgh, Scotland, 1989.

[30] A. Hansson and S.-A. Tarnlund. Program transformation by data structure mapping. In

K. Clark and S.-A. Tarnlund, editors, Logic Programming, pages 117–122. Academic Press,

1982.

[31] M. Hecht. Flow Analysis of Computer Programs. North-Holland, 1977.

[32] N. Heintze and J. Jaffar. A finite presentation theorem for approximating logic programs. To

appear in Proc. Seventeenth Ann. ACM Symp. Principles of Programming Languages, San

Francisco, California, 1990.

[33] D. Jacobs and A. Langen. Accurate and efficient approximation of variable aliasing in logic

programs. In Lusk and Overbeek [50], pages 154–165.

[34] D. Jacobs and A. Langen. Compilation of logic programs for restricted and-parallelism. In

H. Ganzinger, editor, Proc. ESOP 88 (Lecture Notes in Computer Science 300), pages 284–

297. Springer-Verlag, 1988.

[35] D. Jacobs and A. Langen. Static analysis of logic programs for independent and-parallelism.

Technical report 89-03, Computer Science Dept., University of Southern California, Los An-

geles, California, 1989.

[36] J. Jaffar and J.-L. Lassez. Constraint logic programming. In Proc. Fourteenth Ann. ACM

Symp. Principles of Programming Languages, pages 111–119. Munich, Fed. Rep. Germany,

1987.

[37] G. Janssens and M. Bruynooghe. An application of abstract interpretation: Integrated type

and mode inferencing. Report CW 86, Dept. of Computer Science, University of Leuven,

Belgium, 1989.

[38] J. Jensen. Generation of machine code in Algol compilers. BIT 5: 235–245, 1965.

[39] N. D. Jones. Flow analysis of lambda expressions. In S. Even and O. Kariv, editors, Proc.

Eighth Int. Coll. Automata, Languages and Programming (Lecture Notes in Computer Science

115), pages 114–128. Springer-Verlag, 1981.

[40] N. D. Jones and S. S. Muchnick. Flow analysis and optimization of Lisp-like structures. In

S. S. Muchnick and N. D. Jones, editors, Program Flow Analysis, pages 102–131. Prentice-

Hall, 1981.


[41] N. D. Jones and A. Mycroft. A stepwise development of operational and denotational seman-

tics for Prolog. In Proc. 1984 Int. Symp. Logic Programming, pages 289–298. Atlantic City,

New Jersey, 1984.

[42] N. D. Jones and A. Mycroft. Dataflow analysis of applicative programs using minimal function

graphs. In Proc. Thirteenth Ann. ACM Symp. Principles of Programming Languages, pages

296–306. St. Petersburg, Florida, 1986.

[43] N. D. Jones and H. Søndergaard. A semantics-based framework for the abstract interpretation

of Prolog. In Abramsky and Hankin [1], pages 123–142.

[44] T. Kanamori and T. Kawamura. Analyzing success patterns of logic programs by abstract

hybrid interpretation. ICOT TR-279, ICOT, Tokyo, Japan, 1987.

[45] S. C. Kleene. Introduction to Metamathematics. North-Holland, 1971. Originally published

by Van Nostrand, 1952.

[46] R. Kowalski and K. Bowen, editors. Logic Programming: Proc. Fifth Int. Conf. Symp. MIT

Press, 1988.

[47] K. Kunen. Negation in logic programming. Journal of Logic Programming 4 (4): 289–308,

1987.

[48] J.-L. Lassez, M. J. Maher and K. Marriott. Unification revisited. In Minker [72], pages 587–

625.

[49] J. W. Lloyd. Foundations of Logic Programming. Springer-Verlag, second edition 1987.

[50] E. L. Lusk and R. A. Overbeek, editors. Logic Programming: Proc. North American Conf.

1989. MIT Press, 1989.

[51] J. McCarthy. A basis for a mathematical theory of computation. In P. Braffort and

D. Hirschberg, editors, Computer Programming and Formal Systems, pages 33–70. North-

Holland, 1963.

[52] M. Maher. On parameterized substitutions. Unpublished note, 26 pages. IBM T. J. Watson

Research Center, Yorktown Heights, New York, 1986.

[53] H. Mannila and E. Ukkonen. Flow analysis of Prolog programs. In Proc. Fourth Symp. Logic

Programming, pages 205–214. San Francisco, California, 1987.

[54] K. Marriott. Finding Explicit Representations for Subsets of the Herbrand Universe.

Ph. D. Thesis, University of Melbourne, Australia, 1988.

[55] K. Marriott. Frameworks for abstract interpretation. In preparation.

[56] K. Marriott, L. Naish and J.-L. Lassez. Most specific logic programs. In Kowalski and Bowen

[46], pages 909–923.


[57] K. Marriott and H. Søndergaard. Analysis of constraint logic programs. To appear in S. K. De-

bray and M. Hermenegildo, editors, Logic Programming: Proc. North American Conf. 1990.

MIT Press, 1990.

[58] K. Marriott and H. Søndergaard. A tutorial on abstract interpretation of logic programs.

Unpublished note, 26 pages. IBM T. J. Watson Research Center, Yorktown Heights, New

York, 1989.

[59] K. Marriott and H. Søndergaard. Bottom-up abstract interpretation of logic programs.

In Kowalski and Bowen [46], pages 733–748.

[60] K. Marriott and H. Søndergaard. Bottom-up dataflow analysis of normal logic programs. To

appear in Journal of Logic Programming.

[61] K. Marriott and H. Søndergaard. On describing success patterns of logic programs. Technical

report 88/12, Dept. of Computer Science, University of Melbourne, Australia, 1988.

[62] K. Marriott and H. Søndergaard. On Prolog and the occur check problem. SIGPLAN Notices

24 (5): 76–82, 1989.

[63] K. Marriott and H. Søndergaard. Prolog program transformation by introduction of

difference-lists. In Proc. Int. Computer Science Conf. 88, pages 206–213. IEEE Computer

Society, Hong Kong, 1988.

[64] K. Marriott and H. Søndergaard. Semantics-based dataflow analysis of logic programs. In

G. X. Ritter, editor, Information Processing 89, pages 601–606. North-Holland, 1989.

[65] K. Marriott and H. Søndergaard. Top-down abstract interpretation of normal logic programs.

In preparation.

[66] K. Marriott, H. Søndergaard and P. Dart. A characterization of non-floundering logic pro-

grams. To appear in S. K. Debray and M. Hermenegildo, editors, Logic Programming:

Proc. North American Conf. 1990. MIT Press, 1990.

[67] K. Marriott, H. Søndergaard and N. D. Jones. Denotational abstract interpretation of logic

programs. In preparation.

[68] C. S. Mellish. Abstract interpretation of Prolog programs. In Shapiro [89], pages 463–474.

Revised version in Abramsky and Hankin [1], pages 181–198.

[69] C. S. Mellish. The automatic generation of mode declarations for Prolog programs. DAI

Research Paper No. 163, University of Edinburgh, Scotland, 1981.

[70] C. S. Mellish. Some global optimizations for a Prolog compiler. Journal of Logic Programming

2 (1): 43–66, 1985.


[71] A. Melton, D. Schmidt and G. Strecker. Galois connections and computer science applications.

In D. Pitt et al., editors, Category Theory and Computer Programming (Lecture Notes in

Computer Science 240), pages 299–312. Springer-Verlag, 1986.

[72] J. Minker, editor. Foundations of Deductive Databases and Logic Programming. Morgan Kauf-

mann, 1988.

[73] A. Mycroft. Abstract Interpretation and Optimising Transformations for Applicative Pro-

grams. Ph. D. Thesis, University of Edinburgh, Scotland, 1981.

[74] A. Mycroft. Logic programs and many-valued logic. In M. Fontet and K. Mehlhorn, editors,

Proc. STACS 84 (Lecture Notes in Computer Science 166), pages 274–286. Springer-Verlag,

1984.

[75] P. Naur. The design of the Gier Algol compiler, part II. BIT 3: 145–166, 1963.

[76] F. Nielson. A denotational framework for data flow analysis. Acta Informatica 18: 265–287,

1982.

[77] F. Nielson. Strictness analysis and denotational abstract interpretation. Information and

Computation 76 (1): 29–92, 1988.

[78] U. Nilsson. A Systematic Approach to Abstract Interpretation of Logic Programs. Ph. D. The-

sis, University of Linkoping, Sweden, 1989.

[79] R. Paige and S. Koenig. Finite differencing of computable expressions. ACM Transactions on

Programming Languages and Systems 4 (3): 402–454, 1982.

[80] D. Plaisted. Abstract theorem proving. Artificial Intelligence 16: 47–108, 1981.

[81] D. Plaisted. The occur-check problem in Prolog. New Generation Computing 2 (4): 309–322,

1984.

[82] C. Pyo and U. S. Reddy. Inference of polymorphic types for logic programs. In Lusk and

Overbeek [50], pages 1115–1132.

[83] J. C. Reynolds. Automatic computation of data set definitions. In A. Morrell, editor, Infor-

mation Processing 68, pages 456–461. North-Holland, 1969.

[84] J. C. Reynolds. On the relation between direct and continuation semantics. In J. Loeckx,

editor, Proc. Second Int. Coll. Automata, Languages and Programming (Lecture Notes in

Computer Science 14), pages 141–156. Springer-Verlag, 1974.

[85] J. C. Reynolds. Transformational systems and the algebraic structure of atomic formulas.

In B. Meltzer and D. Michie, editors, Machine Intelligence 5, pages 135–151. Edinburgh

University Press, 1969.


[86] V. A. Saraswat. Concurrent Constraint Programming Languages. Ph. D. Thesis, Carnegie-

Mellon University, Pennsylvania, 1989.

[87] T. Sato and H. Tamaki. Enumeration of success patterns in logic programs. Theoretical Com-

puter Science 34: 227–240, 1984.

[88] D. Schmidt. Denotational Semantics: A Methodology for Language Development. Allyn and

Bacon, 1986.

[89] E. Shapiro, editor. Proc. Third Int. Conf. Logic Programming (Lecture Notes in Computer

Science 225). Springer-Verlag, 1986.

[90] J. Shepherdson. Negation in logic programming. In Minker [72], pages 19–88.

[91] M. Sintzoff. Calculating properties of programs by valuation on specific models. SIGPLAN

Notices 7 (1): 203–207, 1972. Proc. ACM Conf. Proving Assertions about Programs.

[92] H. Søndergaard. An application of abstract interpretation of logic programs: Occur check

reduction. In B. Robinet and R. Wilhelm, editors, Proc. ESOP 86 (Lecture Notes in Computer

Science 213), pages 327–338. Springer-Verlag, 1986.

[93] L. Sterling and E. Shapiro. The Art of Prolog: Advanced Programming Techniques. MIT

Press, 1986.

[94] H. Tamaki and T. Sato. OLD resolution with tabulation. In Shapiro [89], pages 84–98.

[95] H. Tamaki and T. Sato. Unfold/fold transformation of logic programs. In S.-A. Tarnlund,

editor, Proc. Second Int. Conf. Logic Programming, pages 127–138. Uppsala, Sweden, 1984.

[96] S.-A. Tarnlund. An axiomatic data base theory. In Gallaire and Minker [28], pages 259–289.

[97] J. Thom and J. Zobel, editors. NU-Prolog reference manual. Technical Report 86/10, Dept. of

Computer Science, University of Melbourne, Australia, revised edition 1987.

[98] M. van Emden and R. Kowalski. The semantics of logic as a programming language. Journal

of the ACM 23: 733–742, 1976.

[99] W. Winsborough. Automatic, transparent parallelization of logic programs at compile time.

Ph. D. Thesis, University of Wisconsin-Madison, Wisconsin, 1988.

[100] W. Winsborough. Source-level transforms for multiple specialization of Horn clauses (ex-

tended abstract). Technical report 88-15, Dept. of Computer Science, University of Chicago,

Illinois, 1988.

[101] W. Winsborough and A. Wærn. Transparent and-parallelism in the presence of shared free

variables. In Kowalski and Bowen [46], pages 749–764.


[102] J. Zhang and P. Grant. An automatic difference-list transformation algorithm for Prolog.

In Y. Kodratoff, editor, Proc. 1988 European Conf. Artificial Intelligence, pages 320–325.

Pitman, 1988.


Index

⊥X 5

⊤X 5

⊓Y 5

⊔Y 5

PX 5

∆F 7

ΣF 7

F ↓ α 7

F ↑ α 7

X∗ 7

∝ 20

⊳ 36, 52

ι 9, 48

ǫ 52

β 53

σ 74

σ′ 74

⋆= 76

τ 79

τ ′ 79

app⋆ 82

app⋆⋆ 83

append⋆ 82

append⋆⋆ 82

appr 20, 21, 26

assign 60

Atom 8

Atomk 36

card 5

co 40

compl 34

cong 55

consequences 44

consistent 32

den 36

depth 36

dom 48

dom (for Psub) 54

Form 44

Fun 8

gfp 7

ground 9

Interp 31

Lcon 44

lcons 44

lfp 7

maximal 37

meet 54

meetgro 61

mgu 9, 48

mgugro 61

minimal 37

Par 52

pred 36

Pred 8

project 61

Prop 60

Psub 53

rename 49

restrict 49

restrict (for Psub) 54

rng 48

simE [Q] 25

TP 9, 21

tass 61

Term 8


unify 54

unifygro 61

UP 33

varnil 76

vars 48

Abramsky, S. 14

abstract interpretation 2, 15

abstraction function 18

abstraction scheme 36

aliasing analysis 66

approximate computation 11

Apt, K. 9

ascending chain finite lattice 6

atom 8

atom abstraction 36

Barbuti, R. 46

base semantics 22

base semantics B 33

base semantics Pbas 55

bijective function 7

Birkhoff, G. 5

Bloch, C. 70

Bocvar, D. 32

body 8

bottom-up analysis 3, 28

Bruynooghe, M. 35, 65

call pattern 3, 29

call template 85

canonical abstraction scheme 37

canonicised abstraction scheme S∗ 37

chain 5

Church, A. 7

Clark, K. 70

clause 8

co-continuous function 7

co-inclusive predicate 7

Colmerauer, A. 70

compile time garbage collection 66

complete join-lattice 5

complete lattice 5

complete meet-lattice 5

computed answer 10

concretization function 15

consistent call template 85

continuous function 7

co-strict function 7

Cousot, P. and R. 2, 3, 11, 14, 16, 18, 19, 91, 93

Dart, P. 46, 60

dataflow semantics P 57

Debray, S. 22, 58, 63, 65, 66

definite program 8

delimiter 74

depth k abstraction 36

descending chain finite lattice 6

determinacy analysis 66

difference-list 69, 74

downwards closed insertion 58

extended semantics 14, 22

finite chain property 6

finite height lattice 6

Fitting, M. 3, 31–35, 44

fixpoint 7

fixpoint induction 7

floundering analysis 66

folded version of function 7

free difference-list 79

function

abstraction - 18

bijective - 7

co-continuous - 7

co-strict - 7

concretization - 15

continuous - 7

idempotent - 7

injective - 7

monotonic - 7


semantic - see “semantic functions”

strict - 7

Gallagher, J. 46

Galois insertion 18, 19

Giacobazzi, R. 46

Grant, P. 89

greatest fixpoint 7

greatest lower bound 5

ground syntactic object 8

groundness analysis 59

Hankin, C. 14

Hansson, A. 70

Hecht, M. 14

Heintze, N. 93

Herbrand model 9

idempotent function 7

identity substitution 9

immediate consequence function 9

inclusive predicate 7

independence analysis 66

injective function 7

insertion 19

downwards closed - 58

meet-closed - 59

Moore-closed - 59

upwards closed - 58

insertion adjoint 19

instance 9

instantiation ordering 36

interpretation

abstract - 2, 15

in three-valued logic 31

of dataflow semantics 58

of meta-language 24

type - 24

Jacobs, D. 63

Jaffar, J. 93

Jones, N. 14, 52, 58, 62, 63, 65

Kam, J. 14

Kanamori, T. 65

Kawamura, T. 65

Kildall, G. 14

Kleene, S. 3, 32

Kleene logic 32

Kleene sequence 7

Kowalski, R. 9

Kunen, K. 4, 31, 44

Langen, A. 63

Lassez, J.-L. 3, 31, 36, 37

lattice

ascending chain finite - 6

descending chain finite - 6

finite chain property 6

finite height - 6

join- 5

meet- 5

Noetherian - 6

lattice abstraction scheme 38

lax semantics Plax 56

least fixpoint 7

least upper bound 5

Levi, G. 46

lexicon 8

list 74

literal 8

Lloyd, J. 3, 8, 9

lower bound 5

Lukasiewicz, J. 32

magic set 63

Maher, M. 52

Marriott, K. 3, 31, 36–38, 46, 63, 64, 69

maximal element 5

McCarthy, J. 32, 46

McCarthy logic 32, 46

Mellish, C. 64, 65

meet-closed insertion 59

minimal element 5


minimal function graph 62

Mishra, P. 22, 63

mode 28

mode analysis 66

monotonic function 7

Moore-closed insertion 59

Moore family 18

Moore closure 19

Muchnick, S. 14

Mycroft, A. 14, 28, 44, 62–64

Naish, L. 3, 31, 36, 37

Naur, P. 2, 13

Nielson, F. 3, 11, 14, 22, 24, 91

Noetherian lattice 6

non-standard semantics NS 38

normal program 8

occur check 70, 73

occur check analysis 66

parameter 52

parametric substitution 52, 53

partial ordering 5

Plaisted, D. 93

poset 5

Post, E. 32

preordering 5

program specialisation 36, 42

program transformation 67

Prolog semantics P 45

pseudo-evaluation 13

Pyo, C. 93

query 8

Reddy, U. 93

renaming 49

representation by difference-list 74

resolvent 9

Reynolds, J. 14, 21, 25, 43

Sato, T. 3, 31, 35, 37

Schmidt, D. 3, 5

secure call 87

semantic functions

B 33

L 44

NS 38

O 49

Ocall 51

P (Prolog semantics) 45

P (dataflow semantics) 57

Pbas 55

Plax 56

S 44

Shapiro, E. 4, 63, 70, 71, 83, 85

Shepherdson, J. 45

simple representation 74

singleton abstraction 36

Sintzoff, M. 2, 14

SLD call semantics Ocall 51

SLD resolution 28

SLD semantics 48

SLD semantics O 49

SLDNF semantics S 44

Søndergaard, H. 37, 46, 52, 58, 63–65, 69

Sterling, L. 4, 69–71, 83, 85

strict function 7

strictness analysis 14

sublattice 5

substitution 9, 48

into Par 52

parametric 52, 53

success set 9

Tamaki, H. 3, 31, 35, 37

Tarjan, R. 14

Tarnlund, S.-A. 70

term 8

three-valued logic semantics L 44

top-down analysis 3

type inference 67


type interpretation 24

Ullman, J. 14

unfold/fold transformation 89

upper bound 5

upwards closed insertion 58

van Emden, M. 9

variable 8, 52

Warren, D. S. 66

Winsborough, W. 63, 65

Zhang, J. 89