
Inductive Synthesis of Functional Programs

Learning Domain-Specific Control Rules and Abstract Schemes

Ute Schmid


Summary. In this book a novel approach to inductive synthesis of recursive functions is proposed, combining universal planning, folding of finite programs, and schema abstraction by analogical reasoning. In a first step, an example domain of small complexity is explored by universal planning. For example, for all possible lists over four fixed natural numbers, their optimal transformation sequences into the sorted list are calculated and represented as a DAG. In a second step, the plan is transformed into a finite program term. Plan transformation mainly relies on inferring the data type underlying a given plan. In a third step, the finite program is folded into (a set of) recursive functions. Folding is performed by syntactical pattern-matching and corresponds to inducing a special class of context-free tree grammars. It is shown that the approach can be successfully applied to learn domain-specific control rules. Control rule learning is important to gain the efficiency of domain-specific planners without the need to hand-code the domain-specific knowledge. Furthermore, an extension of planning based on a purely relational domain description language to function application is presented. This extension makes planning applicable to a larger class of problems which are of interest for program synthesis.

As a last step, a hierarchy of program schemes (patterns) is generated by generalizing over already synthesized recursive functions. Generalization can be considered as the last step of problem solving by analogy or programming by analogy. Some psychological experiments were performed to investigate which kinds of structural relations between problems can be exploited by human problem solvers. Anti-unification is presented as an approach to mapping and generalizing program structures. It is proposed that the integration of planning, program synthesis, and analogical reasoning contributes to cognitive science research on skill acquisition by addressing the problem of extracting generalized rules from some initial experience. Such (control) rules represent domain-specific problem solving strategies.

All parts of the approach are implemented in Common Lisp.
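Both the folding step and the schema abstraction step described above rest on anti-unification, i.e., computing a least general generalization (lgg) of two terms. The system itself is implemented in Common Lisp; purely as an illustrative sketch (the function name and the tuple-based term representation are assumptions of this example, not taken from the book), first-order anti-unification can be written as:

```python
def anti_unify(s, t):
    """Least general generalization (lgg) of two first-order terms.

    Terms are atoms (strings) or nested tuples whose first element is the
    function symbol. Wherever the two terms disagree, a variable is
    introduced; the same pair of disagreeing subterms is always mapped to
    the same variable, which is what makes the result *least* general.
    """
    mapping = {}  # (subterm of s, subterm of t) -> variable name

    def au(a, b):
        if a == b:                      # identical subterms are kept as-is
            return a
        if (isinstance(a, tuple) and isinstance(b, tuple)
                and len(a) == len(b) and a[0] == b[0]):
            # same function symbol and arity: descend into the arguments
            return (a[0],) + tuple(au(x, y) for x, y in zip(a[1:], b[1:]))
        if (a, b) not in mapping:       # disagreement: introduce a variable
            mapping[(a, b)] = f"X{len(mapping)}"
        return mapping[(a, b)]

    return au(s, t)

# Generalizing f(g(a), a) and f(g(b), b) yields f(g(X0), X0):
# both occurrences of the a/b mismatch share one variable.
print(anti_unify(("f", ("g", "a"), "a"), ("f", ("g", "b"), "b")))
```

The restricted second-order anti-unification used for generalizing recursive program schemes additionally allows function symbols themselves to be generalized; the first-order case above only generalizes argument positions.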

Keywords: Inductive Program Synthesis, Recursive Functions, Universal Planning, Function Application in Planning, Plan Transformation, Data Type Inference, Folding of Program Terms, Context-free Tree Grammars, Control Rule Learning, Strategy Learning, Analogical Reasoning, Anti-Unification, Scheme Abstraction

For My Parents and

In Memory of My Grandparents


Acknowledgement. Research is a kind of work one can never do completely alone. Over the years I profited from the guidance of several professors who supervised or supported my work, namely Fritz Wysotzki, Klaus Eyferth, Bernd Mahr, Jaime Carbonell, Arnold Upmeyer, Peter Pepper, and Gerhard Strube. I learned a lot from discussions with them, with colleagues, and students, such as Jochen Burghardt, Bruce Burns, the group of Hartmut Ehrig, Pierre Flener, Héctor Geffner, Peter Geibel, Peter Gerjets, Jürgen Giesl, Wolfgang Grieskamp, Maritta Heisel, Ralf Herbrich, Laurie Hiyakumoto, Petra Hofstedt, Rune Jensen, Emanuel Kitzelmann, Jana Koehler, Steffen Lange, Martin Mühlpfordt, Brigitte Pientka, Heike Pisch, Manuela Veloso, Ulrich Wagner, Bernhard Wolf, Thomas Zeugmann (sorry to everyone I forgot). I am very grateful for the time I could spend at Carnegie Mellon University. Thanks to Klaus Eyferth and Bernd Mahr who motivated me to go, to Gerhard Strube for his support, to Fritz Wysotzki who accepted my absence from teaching, and, of course, to Jaime Carbonell who was my very helpful host. My work profited much from the inspiration I got from talks, classes, and discussions, and from the very special atmosphere suggesting that everything is all right as long as "the heart is in the work". I thank all my diploma students who supported the work reported in this book: Dirk Matzke, René Mercy, Martin Mühlpfordt, Marina Müller, Mark Müller, Heike Pisch, Knut Polkehn, Uwe Sinha, Imre Szabo, Janin Toussaint, Ulrich Wagner, Joachim Wirth, and Bernhard Wolf. Additional thanks to some of them and Peter Pollmanns for proof-reading parts of the draft of this book. I owe a lot to Fritz Wysotzki for giving me the chance to move from cognitive psychology to artificial intelligence, for many interesting discussions and for critically reading and commenting on the draft of this book. Finally, thanks to my colleagues and friends Berry Claus, Robin Hörnig, Barbara Kaup, and Martin Kindsmüller, to my family, and my husband Uwe Konerding for support and high-quality leisure time, and to all authors of good crime novels.

Table of Contents

Acknowledgments . . . . . IV

1. Introduction . . . . . 1

Part I. Planning

2. State-Based Planning . . . . . 13
   2.1 Standard Strips . . . . . 13
       2.1.1 A Blocks-World Example . . . . . 14
       2.1.2 Basic Definitions . . . . . 14
       2.1.3 Backward Operator Application . . . . . 18
   2.2 Extensions and Alternatives to Strips . . . . . 20
       2.2.1 The Planning Domain Definition Language . . . . . 20
           2.2.1.1 Typing and Equality Constraints . . . . . 22
           2.2.1.2 Conditional Effects . . . . . 23
       2.2.2 Situation Calculus . . . . . 24
   2.3 Basic Planning Algorithms . . . . . 27
       2.3.1 Informal Introduction of Basic Concepts . . . . . 28
       2.3.2 Forward Planning . . . . . 29
       2.3.3 Formal Properties of Planning . . . . . 31
           2.3.3.1 Complexity of Planning . . . . . 31
           2.3.3.2 Termination, Soundness, Completeness . . . . . 34
           2.3.3.3 Optimality . . . . . 35
       2.3.4 Backward Planning . . . . . 35
           2.3.4.1 Goal Regression . . . . . 37
           2.3.4.2 Incompleteness of Linear Planning . . . . . 39
   2.4 Planning Systems . . . . . 41
       2.4.1 Classical Approaches . . . . . 41
           2.4.1.1 Strips Planning . . . . . 41
           2.4.1.2 Deductive Planning . . . . . 41
           2.4.1.3 Partial Order Planning . . . . . 41
           2.4.1.4 Total Order Non-Linear Planning . . . . . 42
       2.4.2 Current Approaches . . . . . 42
           2.4.2.1 Graphplan and Derivates . . . . . 42
           2.4.2.2 Forward Planning Revisited . . . . . 44
       2.4.3 Complex Domains and Uncertain Environments . . . . . 44
           2.4.3.1 Including Domain Knowledge . . . . . 44
           2.4.3.2 Planning for Non-Deterministic Domains . . . . . 45
       2.4.4 Universal Planning . . . . . 46
       2.4.5 Planning and Related Fields . . . . . 48
           2.4.5.1 Planning and Problem Solving . . . . . 48
           2.4.5.2 Planning and Scheduling . . . . . 49
           2.4.5.3 Proof Planning . . . . . 50
       2.4.6 Planning Literature . . . . . 51
   2.5 Automatic Knowledge Acquisition for Planning . . . . . 51
       2.5.1 Pre-Planning Analysis . . . . . 51
       2.5.2 Planning and Learning . . . . . 52
           2.5.2.1 Linear Macro-Operators . . . . . 52
           2.5.2.2 Learning Control Rules . . . . . 53
           2.5.2.3 Learning Control Programs . . . . . 53

3. Constructing Complete Sets of Optimal Plans . . . . . 55
   3.1 Introduction to DPlan . . . . . 55
       3.1.1 DPlan Planning Language . . . . . 56
       3.1.2 DPlan Algorithm . . . . . 57
       3.1.3 Efficiency Concerns . . . . . 58
       3.1.4 Example Problems . . . . . 59
           3.1.4.1 The Clearblock Problem . . . . . 59
           3.1.4.2 The Rocket Problem . . . . . 61
           3.1.4.3 The Sorting Problem . . . . . 61
           3.1.4.4 The Hanoi Problem . . . . . 63
           3.1.4.5 The Tower Problem . . . . . 63
   3.2 Optimal Full Universal Plans . . . . . 65
   3.3 Termination, Soundness, Completeness . . . . . 67
       3.3.1 Termination of DPlan . . . . . 67
       3.3.2 Operator Restrictions . . . . . 68
       3.3.3 Soundness and Completeness of DPlan . . . . . 70

4. Integrating Function Application in Planning . . . . . 73
   4.1 Motivation . . . . . 73
   4.2 Extending Strips to Function Applications . . . . . 76
   4.3 Extensions of FPlan . . . . . 81
       4.3.1 Backward Operator Application . . . . . 81
       4.3.2 Introducing User-Defined Functions . . . . . 84
   4.4 Examples . . . . . 84
       4.4.1 Planning with Resource Variables . . . . . 85
       4.4.2 Planning for Numerical Problems . . . . . 87
       4.4.3 Functional Planning for Standard Problems . . . . . 89
       4.4.4 Mixing ADD/DEL Effects and Updates . . . . . 90
       4.4.5 Planning for Programming Problems . . . . . 91
       4.4.6 Constraint Satisfaction and Planning . . . . . 93

5. Conclusions and Further Research . . . . . 95
   5.1 Comparing DPlan with the State of the Art . . . . . 95
   5.2 Extensions of DPlan . . . . . 96
   5.3 Universal Planning versus Incremental Exploration . . . . . 97

Part II. Inductive Program Synthesis

6. Automatic Programming . . . . . 101
   6.1 Overview of Automatic Programming Research . . . . . 102
       6.1.1 AI and Software Engineering . . . . . 102
           6.1.1.1 Knowledge-Based Software Engineering . . . . . 102
           6.1.1.2 Programming Tutors . . . . . 103
       6.1.2 Approaches to Program Synthesis . . . . . 104
           6.1.2.1 Methods of Program Specification . . . . . 104
           6.1.2.2 Synthesis Methods . . . . . 107
       6.1.3 Pointers to Literature . . . . . 111
   6.2 Deductive Approaches . . . . . 112
       6.2.1 Constructive Theorem Proving . . . . . 112
           6.2.1.1 Program Synthesis by Resolution . . . . . 113
           6.2.1.2 Planning and Program Synthesis with Deductive Tableaus . . . . . 115
           6.2.1.3 Further Approaches . . . . . 117
       6.2.2 Program Transformation . . . . . 117
           6.2.2.1 Transformational Implementation: The Fold/Unfold Mechanism . . . . . 118
           6.2.2.2 Meta-Programming: The CIP Approach . . . . . 122
           6.2.2.3 Program Synthesis by Stepwise Refinement: KIDS . . . . . 124
           6.2.2.4 Concluding Remarks . . . . . 125
   6.3 Inductive Approaches . . . . . 126
       6.3.1 Foundations of Induction . . . . . 126
           6.3.1.1 Basic Concepts of Inductive Learning . . . . . 127
           6.3.1.2 Grammar Inference . . . . . 132
       6.3.2 Genetic Programming . . . . . 136
           6.3.2.1 Basic Concepts . . . . . 136
           6.3.2.2 Iteration and Recursion . . . . . 139
           6.3.2.3 Evaluation of Genetic Programming . . . . . 142
       6.3.3 Inductive Logic Programming . . . . . 142
           6.3.3.1 Basic Concepts . . . . . 143
           6.3.3.2 Basic Learning Techniques . . . . . 145
           6.3.3.3 Declarative Biases . . . . . 148
           6.3.3.4 Learning Recursive Prolog Programs . . . . . 151
       6.3.4 Inductive Functional Programming . . . . . 153
           6.3.4.1 Summers' Recurrence Relation Detection Approach . . . . . 153
           6.3.4.2 Extensions of Summers' Approach . . . . . 161
           6.3.4.3 Biermann's Function Merging Approach . . . . . 163
           6.3.4.4 Generalization to N and Programming by Demonstration . . . . . 165
   6.4 Final Comments . . . . . 168
       6.4.1 Inductive versus Deductive Synthesis . . . . . 168
       6.4.2 Inductive Functional versus Logic Programming . . . . . 169

7. Folding of Finite Program Terms . . . . . 171
   7.1 Terminology and Basic Concepts . . . . . 172
       7.1.1 Terms and Term Rewriting . . . . . 172
       7.1.2 Patterns and Anti-Unification . . . . . 175
       7.1.3 Recursive Program Schemes . . . . . 176
           7.1.3.1 Basic Definitions . . . . . 176
           7.1.3.2 RPSs as Term Rewrite Systems . . . . . 178
           7.1.3.3 RPSs as Context-Free Tree Grammars . . . . . 180
           7.1.3.4 Initial Programs and Unfoldings . . . . . 181
   7.2 Synthesis of RPSs from Initial Programs . . . . . 186
       7.2.1 Folding and Fixpoint Semantics . . . . . 186
       7.2.2 Characteristics of RPSs . . . . . 186
       7.2.3 The Synthesis Problem . . . . . 189
   7.3 Solving the Synthesis Problem . . . . . 189
       7.3.1 Constructing Segmentations . . . . . 190
           7.3.1.1 Segmentation for `Mod' . . . . . 190
           7.3.1.2 Segmentation and Skeleton . . . . . 192
           7.3.1.3 Valid Hypotheses . . . . . 195
           7.3.1.4 Algorithm . . . . . 197
       7.3.2 Constructing a Program Body . . . . . 199
           7.3.2.1 Separating Program Body and Instantiated Parameters . . . . . 200
           7.3.2.2 Construction of a Maximal Pattern . . . . . 201
           7.3.2.3 Algorithm . . . . . 202
       7.3.3 Dealing with Further Subprograms . . . . . 202
           7.3.3.1 Inducing Two Subprograms for `ModList' . . . . . 203
           7.3.3.2 Decomposition of Initial Trees . . . . . 206
           7.3.3.3 Equivalence of Sub-Schemata . . . . . 207
           7.3.3.4 Reduction of Initial Trees . . . . . 209
       7.3.4 Finding Parameter Substitutions . . . . . 210
           7.3.4.1 Finding Substitutions for `Mod' . . . . . 211
           7.3.4.2 Basic Considerations . . . . . 212
           7.3.4.3 Detecting Hidden Variables . . . . . 216
           7.3.4.4 Algorithm . . . . . 218
       7.3.5 Constructing an RPS . . . . . 219
           7.3.5.1 Parameter Instantiations for the Main Program . . . . . 219
           7.3.5.2 Putting all Parts Together . . . . . 219
           7.3.5.3 Existence of an RPS . . . . . 221
           7.3.5.4 Extension: Equality of Subprograms . . . . . 223
           7.3.5.5 Fault Tolerance . . . . . 224
   7.4 Example Problems . . . . . 224
       7.4.1 Time Effort of Folding . . . . . 224
       7.4.2 Recursive Control Rules . . . . . 226

8. Transforming Plans into Finite Programs . . . . . 231
   8.1 Overview of Plan Transformation . . . . . 232
       8.1.1 Universal Plans . . . . . 232
       8.1.2 Introducing Data Types and Situation Variables . . . . . 232
       8.1.3 Components of Plan Transformation . . . . . 233
       8.1.4 Plans as Programs . . . . . 233
       8.1.5 Completeness and Correctness . . . . . 235
   8.2 Transformation and Type Inference . . . . . 235
       8.2.1 Plan Decomposition . . . . . 235
       8.2.2 Data Type Inference . . . . . 237
       8.2.3 Introducing Situation Variables . . . . . 238
   8.3 Plans over Sequences of Objects . . . . . 239
   8.4 Plans over Sets of Objects . . . . . 243
   8.5 Plans over Lists of Objects . . . . . 251
       8.5.1 Structural and Semantic List Problems . . . . . 251
       8.5.2 Synthesizing `Selection-Sort' . . . . . 253
           8.5.2.1 A Plan for Sorting Lists . . . . . 253
           8.5.2.2 Different Realizations of `Selection Sort' . . . . . 254
           8.5.2.3 Inferring the Function-Skeleton . . . . . 256
           8.5.2.4 Inferring the Selector Function . . . . . 259
       8.5.3 Concluding Remarks on List Problems . . . . . 262
   8.6 Plans over Complex Data Types . . . . . 265
       8.6.1 Variants of Complex Finite Programs . . . . . 265
       8.6.2 The `Tower' Domain . . . . . 266
           8.6.2.1 A Plan for Three Blocks . . . . . 266
           8.6.2.2 Elemental to Composite Learning . . . . . 267
           8.6.2.3 Simultaneous Composite Learning . . . . . 267
           8.6.2.4 Set of Lists . . . . . 271
           8.6.2.5 Concluding Remarks on `Tower' . . . . . 272
       8.6.3 Tower of Hanoi . . . . . 273

9. Conclusions and Further Research . . . . . 277
   9.1 Combining Planning and Program Synthesis . . . . . 277
   9.2 Acquisition of Problem Solving Strategies . . . . . 278
       9.2.1 Learning in Problem Solving and Planning . . . . . 278
       9.2.2 Three Levels of Learning . . . . . 279

Part III. Schema Abstraction

10. Analogical Reasoning and Generalization . . . . . 285
   10.1 Analogical and Case-Based Reasoning . . . . . 285
       10.1.1 Characteristics of Analogy . . . . . 285
       10.1.2 Sub-Processes of Analogical Reasoning . . . . . 287
       10.1.3 Transformational versus Derivational Analogy . . . . . 288
       10.1.4 Quantitative and Qualitative Similarity . . . . . 289
   10.2 Mapping Simple Relations or Complex Structures . . . . . 290
       10.2.1 Proportional Analogies . . . . . 290
       10.2.2 Causal Analogies . . . . . 292
       10.2.3 Problem Solving and Planning by Analogy . . . . . 292
   10.3 Programming by Analogy . . . . . 294
   10.4 Pointers to Literature . . . . . 296

11. Structural Similarity in Analogical Transfer . . . . . 297
   11.1 Analogical Problem Solving . . . . . 297
       11.1.1 Mapping and Transfer . . . . . 298
       11.1.2 Transfer of Non-Isomorphic Source Problems . . . . . 299
       11.1.3 Structural Representation of Problems . . . . . 300
       11.1.4 Non-Isomorphic Variants in a Water Redistribution Domain . . . . . 303
       11.1.5 Measurement of Structural Overlap . . . . . 306
   11.2 Experiment 1 . . . . . 307
       11.2.1 Method . . . . . 308
       11.2.2 Results and Discussion . . . . . 310
           11.2.2.1 Isomorph Structure/Change in Surface . . . . . 311
           11.2.2.2 Change in Structure/Stable Surface . . . . . 311
           11.2.2.3 Change in Structure/Change in Surface . . . . . 311
   11.3 Experiment 2 . . . . . 312
       11.3.1 Method . . . . . 313
       11.3.2 Results and Discussion . . . . . 314
           11.3.2.1 Type of Structural Relation . . . . . 314
           11.3.2.2 Degree of Structural Overlap . . . . . 315
   11.4 General Discussion . . . . . 315

12. Programming By Analogy . . . . . 319
   12.1 Program Reuse and Program Schemes . . . . . 319
   12.2 Restricted 2nd-Order Anti-Unification . . . . . 320
       12.2.1 Recursive Program Schemes Revisited . . . . . 320
       12.2.2 Anti-Unification of Program Terms . . . . . 322
   12.3 Retrieval Using Term Subsumption . . . . . 324
       12.3.1 Term Subsumption . . . . . 324
       12.3.2 Empirical Evaluation . . . . . 325
       12.3.3 Retrieval from Hierarchical Memory . . . . . 326
   12.4 Generalizing Program Schemes . . . . . 327
   12.5 Adaptation of Program Schemes . . . . . 328

13. Conclusions and Further Research . . . . . 331
   13.1 Learning and Applying Abstract Schemes . . . . . 331
   13.2 A Framework for Learning from Problem Solving . . . . . 332
   13.3 Application Perspective . . . . . 333

Bibliography . . . . . 335

Appendices

A. Implementation Details . . . . . 355
   A.1 Short History of DPlan . . . . . 355
   A.2 Modules of DPlan . . . . . 357
   A.3 DPlan Specifications . . . . . 357
   A.4 Development of Folding Algorithms . . . . . 359
   A.5 Modules of TFold . . . . . 360
   A.6 Time Effort of Folding . . . . . 361
   A.7 Main Components of Plan-Transformation . . . . . 361
   A.8 Plan Decomposition . . . . . 362
   A.9 Introduction of Situation Variables . . . . . 363
   A.10 Number of MSTs in a DAG . . . . . 363
   A.11 Extracting Minimal Spanning Trees from a DAG . . . . . 364
   A.12 Regularizing a Tree . . . . . 365
   A.13 Programming by Analogy Algorithms . . . . . 367

B. Concepts and Proofs . . . . . 371
   B.1 Fixpoint Semantics . . . . . 371
   B.2 Proof: Maximal Subprogram Body . . . . . 374
   B.3 Proof: Uniqueness of Substitutions . . . . . 379

C. Sample Programs and Problems . . . . . 383
   C.1 Fibonacci with Sequence Referencing Function . . . . . 383
   C.2 Inducing `Reverse' with Golem . . . . . 384
   C.3 Finite Program for `Unstack' . . . . . 387
   C.4 Recursive Control Rules for the `Rocket' Domain . . . . . 388
   C.5 The `Selection Sort' Domain . . . . . 390
   C.6 Recursive Control Rules for the `Tower' Domain . . . . . 390
   C.7 Water Jug Problems . . . . . 398
   C.8 Example RPSs . . . . . 403

List of Figures

1.1 Analogi al Problem Solving and Learning . . . . . . . . . . . . . . . . . . . . . 5

1.2 Main Components of the Synthesis System . . . . . . . . . . . . . . . . . . . . 6

2.1 A Simple Blo ks-World . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

2.2 A Strips Planning Problem in the Blo ks-World . . . . . . . . . . . . . . . . 18

2.3 An Alternative Representation of the Blo ks-World . . . . . . . . . . . . . 19

2.4 Representation of a Blo ks-World Problem in PDDL-Strips . . . . . . 21

2.5 Blo ks-World Domain with Equality Constraints and Conditioned

Ee ts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

2.6 Representation of a Blo ks-World Problem in Situation Cal ulus . 26

2.7 A Forward Sear h Tree for Blo ks-World . . . . . . . . . . . . . . . . . . . . . . 32

2.8 A Ba kward Sear h Tree for Blo ks-World . . . . . . . . . . . . . . . . . . . . . 36

2.9 Goal-Regression for Blo ks-World . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

2.10 The Sussman Anomaly . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

2.11 Part of a Planning Graph as Constru ted by Graphplan . . . . . . . . . 43

2.12 Representation of the Boolean Formula f(x

1

; x

2

) = x

1

^x

2

as OBDD 48

3.1 The Clearblo k DPlan Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

3.2 DPlan Plan for Clearblo k . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

3.3 Clearblo k with a Set of Goal States . . . . . . . . . . . . . . . . . . . . . . . . . . 60

3.4 The DPlan Ro ket Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

3.5 Universal Plan for Ro ket . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

3.6 The DPlan Sorting Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

3.7 Universal Plan for Sorting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

3.8 The DPlan Hanoi Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

3.9 Universal Plan for Hanoi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

3.10 Universal Plan for Tower . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

3.11 Minimal Spanning Tree for Ro ket . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

4.1 Tower of Hanoi (a) Without and (b) With Fun tion Appli ation . 75

4.2 Tower of Hanoi in Fun tional Strips (Gener, 1999) . . . . . . . . . . . . 76

4.3 A Plan for Tower of Hanoi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82

4.4 Tower of Hanoi with User-Dened Fun tions . . . . . . . . . . . . . . . . . . . 85

4.5 A Problem Spe i ation for the Airplane Domain . . . . . . . . . . . . . . . 85

4.6 Spe i ation of the Inverse Operator y

1

for the Airplane Domain 87

XIV List of Figures

4.7 A Problem Spe i ation for the Water Jug Domain . . . . . . . . . . . . . 88

4.8 A Plan for the Water Jug Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . 89

4.9 Blo ks-World Operators with Indire t Referen e and Update . . . . . 91

4.10 Spe i ation of Sele tion Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

4.11 Lightmeal in Constraint Prolog . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93

4.12 Problem Spe i ation for Lightmeal . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

6.1 Programs Represent Con epts and Skills . . . . . . . . . . . . . . . . . . . . . . 128

6.2 Constru tion of a Simple Arithmeti Fun tion (a) and an Even-2-

Parity Fun tion (b) Represented as a Labeled Tree with Ordered

Bran hes (Koza, 1992, gs. 6.1, 6.2) . . . . . . . . . . . . . . . . . . . . . . . . . . 137

6.3 A Possible Initial State, an Intermediate State, and the Goal State for Block Stacking (Koza, 1992, figs. 18.1, 18.2) . . . . . . . . . . . . . . . . 140

6.4 Resulting Programs for the Block Stacking Problem (Koza, 1992, chap. 18.1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141

6.5 θ-Subsumption Equivalence and Reduced Clauses . . . . . . . . . . . . . . . 144

6.6 θ-Subsumption Lattice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145

6.7 An Inverse Linear Derivation Tree (Lavrač and Džeroski, 1994, pp. 46) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147

6.8 Part of a Refinement Graph (Lavrač and Džeroski, 1994, p. 56) . . 148

6.9 Specifying Modes and Types for Predicates . . . . . . . . . . . . . . . . . . . . 150

6.10 Learning Function unpack from Examples . . . . . . . . . . . . . . . . . . . . . 154

6.11 Traces for the unpack Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156

6.12 Result of the First Synthesis Step for unpack . . . . . . . . . . . . . . . . . . . 157

6.13 Recurrence Relation for unpack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158

6.14 Traces for the reverse Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161

6.15 Synthesis of a Regular Lisp Program . . . . . . . . . . . . . . . . . . . . . . . . . . 166

6.16 Recursion Formation with Tinker . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167

7.1 Example Term with Exemplary Positions of Sub-terms . . . . . . . . . . 174

7.2 Example First Order Pattern . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176

7.3 Anti-Unification of Two Terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177

7.4 Examples for Terms Belonging to the Language of an RPS and of a Subprogram of an RPS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180

7.5 Unfolding Positions in the Third Unfolding of Fibonacci . . . . . . . . . 183

7.6 Valid Recurrent Segmentation of Mod . . . . . . . . . . . . . . . . . . . . . . . . . 191

7.7 Initial Program for ModList . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194

7.8 Identifying Two Recursive Subprograms in the Initial Program for ModList . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204

7.9 Inferring a Sub-Program Scheme for ModList . . . . . . . . . . . . . . . . . . . 205

7.10 The Reduced Initial Tree of ModList . . . . . . . . . . . . . . . . . . . . . . . . . . 206

7.11 Substitutions for Mod . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211

7.12 Steps for Calculating a Subprogram . . . . . . . . . . . . . . . . . . . . . . . . . . . 220

7.13 Overview of Inducing an RPS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222

7.14 Time Effort for Unfolding/Folding Factorial . . . . . . . . . . . . . . . . . . . 225


7.15 Time Effort for Unfolding/Folding Fibonacci . . . . . . . . . . . . . . . . . . . 225

7.16 Time Effort Calculating Valid Recurrent Segmentations and Substitutions for Factorial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225

7.17 Initial Tree for Clearblock . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227

7.18 Initial Tree for Tower of Hanoi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229

8.1 Induction of Recursive Functions from Plans . . . . . . . . . . . . . . . . . . . 231

8.2 Examples of Uniform Sub-Plans . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236

8.3 Uniform Plans as Subgraphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237

8.4 Generating the Successor-Function for a Sequence . . . . . . . . . . . . . . 241

8.5 The Unstack Domain and Plan . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241

8.6 Protocol for Unstack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242

8.7 Introduction of Data Type Sequence in Unstack . . . . . . . . . . . . . . . . 243

8.8 LISP-Program for Unstack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244

8.9 Partial Order of Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245

8.10 Functions Inferred/Provided for Set . . . . . . . . . . . . . . . . . . . . . . . . . . . 247

8.11 Sub-Plans of Rocket . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248

8.12 Introduction of the Data Type Set (a) and Resulting Finite Program (b) for the Unload-All Sub-Plan of Rocket (⊥ denotes "undefined") . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249

8.13 Protocol of Transforming the Rocket Plan . . . . . . . . . . . . . . . . . . . . . 250

8.14 Partial Order (a) and Total Order (b) of Flat Lists over Numbers . 253

8.15 A Minimal Spanning Tree Extracted from the SelSort Plan . . . . . . 258

8.16 The Regularized Tree for SelSort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259

8.17 Introduction of a "Semantic" Selector Function in the Regularized Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262

8.18 LISP-Program for SelSort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263

8.19 Abstract Form of the Universal Plan for the Four-Block Tower . . . 269

9.1 Three Levels of Generalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280

10.1 Mapping of Base and Target Domain . . . . . . . . . . . . . . . . . . . . . . . . . 287

10.2 Example for a Geometric-Analogy Problem (Evans, 1968, p. 333) . 291

10.3 Context-Dependent Descriptions in Proportional Analogy (O'Hara, 1992) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291

10.4 The Rutherford Analogy (Gentner, 1983) . . . . . . . . . . . . . . . . . . . . . . 292

10.5 Base and Target Specification (Dershowitz, 1986) . . . . . . . . . . . . . . . 294

11.1 Types and degrees of structural overlap between source and target problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301

11.2 A water redistribution problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304

11.3 Graphs for the equations 2·x + 5 = 9 (a) and 3·x + (6 − 2) = 16 (b) 307

12.1 Adaptation of Sub to Add . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329


C.1 Universal Plan for Sorting Lists with Three Elements . . . . . . . . . . . 391

C.2 Minimal Spanning Trees for Sorting Lists with Three Elements . . . 391

C.3 Minimal Spanning Trees for Sorting Lists with Three Elements . . . 391

C.4 Minimal Spanning Trees for Sorting Lists with Three Elements . . . 392

List of Tables

2.1 Informal Description of Forward Planning . . . . . . . . . . . . . . . . . . . . . 29

2.2 A Simple Forward Planner . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

2.3 Number of States in the Blocks-World Domain . . . . . . . . . . . . . . . . . 34

2.4 Planning as Model Checking Algorithm (Giunchiglia, 1999, fig. 4) 47

3.1 Abstract DPlan Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

4.1 Database with Distances between Airports . . . . . . . . . . . . . . . . . . . . . 86

4.2 Performance of FPlan: Tower of Hanoi . . . . . . . . . . . . . . . . . . . . . . . . 90

4.3 Performance of FPlan: Selection Sort . . . . . . . . . . . . . . . . . . . . . . . . . 92

6.1 Different Specifications for Last . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

6.2 Training Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129

6.3 Background Knowledge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130

6.4 Fundamental Results of Language Learnability (Gold, 1967, tab. 1) 135

6.5 Genetic Programming Algorithm (Koza, 1992, p. 77) . . . . . . . . . . . . 138

6.6 Calculating the Fibonacci Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . 142

6.7 Learning the daughter Relation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143

6.8 Calculating an rlgg . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146

6.9 Simplified MIS-Algorithm (Lavrač and Džeroski, 1994, pp. 54) . . . 147

6.10 Background Knowledge and Examples for Learning reverse(X,Y) . 151

6.11 Constructing Traces from I/O Examples . . . . . . . . . . . . . . . . . . . . . . . 155

6.12 Calculating the Form of an S-Expression . . . . . . . . . . . . . . . . . . . . . . 156

6.13 Constructing a Regular Lisp Program by Function Merging . . . . . . 164

7.1 A Sample of Function Symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173

7.2 Example for an RPS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178

7.3 Recursion Points and Substitution Terms for the Fibonacci Function 182

7.4 Unfolding Positions and Unfolding Indices for Fibonacci . . . . . . . . . 184

7.5 Example of Extrapolating an RPS from an Initial Program . . . . . . 187

7.6 Calculation of the Next Position on the Right . . . . . . . . . . . . . . . . . . 199

7.7 Factorial and Its Third Unfolding with Instantiation succ(succ(0)) (a) and pred(3) (b) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200

7.8 Segments of the Third Unfolding of Factorial for Instantiation succ(succ(0)) (a) and pred(3) (b) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200


7.9 Anti-Unification for Incomplete Segments . . . . . . . . . . . . . . . . . . . . . 202

7.10 Variants of Substitutions in Recursive Calls . . . . . . . . . . . . . . . . . . . . 211

7.11 Testing whether Substitutions are Uniquely Determined . . . . . . . . . 213

7.12 Testing whether a Substitution is Recurrent . . . . . . . . . . . . . . . . . . . 214

7.13 Testing the Existence of Sufficiently Many Instances for a Variable 215

7.14 Determining Hidden Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217

7.15 Calculating Substitution Terms of a Variable in a Recursive Call . 218

7.16 Equality of Subprograms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224

7.17 RPS for Factorial with Constant Expression in Main . . . . . . . . . . . . 226

8.1 Introducing Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240

8.2 Linear Recursive Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243

8.3 Introducing Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246

8.4 Structural Functions over Lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252

8.5 Introducing List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252

8.6 Dealing with Semantic Information in Lists . . . . . . . . . . . . . . . . . . . . 254

8.7 Functional Variants for Selection-Sort . . . . . . . . . . . . . . . . . . . . . . . . . 255

8.8 Extract an MST from a DAG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257

8.9 Regularization of a Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259

8.10 Structural Complex Recursive Functions . . . . . . . . . . . . . . . . . . . . . . . 265

8.11 Transformation Sequences for Leaf-Nodes of the Tower Plan for Four Blocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270

8.12 Power-Set of a List, Set of Lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272

8.13 Control Rules for Tower Inferred by Decision List Learning . . . . . . 273

8.14 A Tower of Hanoi Program . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275

8.15 A Tower of Hanoi Program for Arbitrary Starting Constellations . 276

10.1 Kinds of Predicates Mapped in Different Types of Domain Comparison (Gentner, 1983, Tab. 1, extended) . . . . . . . . . . . . . . . . . . . . . 286

10.2 Word Algebra Problems (Reed et al., 1990) . . . . . . . . . . . . . . . . . . . . 293

11.1 Relevant information for solving the source problem . . . . . . . . . . . . 305

11.2 Results of Experiment 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 310

11.3 Results of Experiment 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314

12.1 A Simple Anti-Unification Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . 323

12.2 An Algorithm for Retrieval of RPSs . . . . . . . . . . . . . . . . . . . . . . . . . . 325

12.3 Results of the similarity rating study . . . . . . . . . . . . . . . . . . . . . . . . . . 326

12.4 Example Generalizations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328

1. Introduction

She had written what she felt herself called upon to write; and, though she was beginning to feel that she might perhaps do this thing better, she had no doubt that the thing itself was the right thing for her.

—Harriet Vane in: Dorothy L. Sayers, Gaudy Night, 1935

Automatic program synthesis has been an active area of research since the early seventies. The application goal of program synthesis is to support human programmers in developing correct and efficient program code and in reasoning about programs. Program synthesis is the core of knowledge-based software engineering. The second goal of program synthesis research is to gain more insight into the knowledge and strategies underlying the process of code generation. Developing algorithms for automatic program construction is therefore an area of artificial intelligence (AI) research, and human programmers are studied in cognitive science research.

There are two main approaches to program synthesis: deduction and induction. Deductive program synthesis addresses the problem of deriving executable programs from high-level specifications. Typically, the employed transformation or inference rules guarantee that the resulting program is correct with respect to the given specification, but of course there is no guarantee that the specification is valid with respect to the informal idea a programmer has about what the program should do. The challenge for deductive program synthesis is to provide formalized knowledge which makes it possible to synthesize as large a class of programs with as little user guidance as possible. That is, a synthesis system can be seen as an expert system incorporating general knowledge about algorithms, data structures, and optimization techniques, as well as knowledge about the specific programming domain.

Inductive program synthesis investigates program construction from incomplete information, namely from examples of the desired input/output behavior of the program. Program behavior can be specified at different levels of detail: as pairs of input and output values, as a selection of computational traces, or as an ordered set of generic traces abstracting from specific input values. For induction from examples (not to be confused with mathematical proofs by induction) it is not possible to give a notion of correctness. The resulting program has to cover all given examples correctly, but the user has


to judge whether the generalized program corresponds to his/her intention.¹

Because correctness of code is crucial for software development, knowledge-based software engineering relies on deductive approaches to program synthesis. The challenge for inductive program synthesis is to provide learning algorithms which can generalize as large a class of programs with as little background knowledge as possible. That is, inductive synthesis models the ability to extract structure, in the form of possibly recursive rules, from some initial experience.

This book focuses on inductive program synthesis, and more specifically on the induction of recursive functions. There are different approaches to learning recursive programs from examples. The oldest approach is to synthesize functional (Lisp) programs. Functional synthesis is realized by a two-step process: In a first step, input/output examples are rewritten into finite terms which are integrated into a finite program. In a second step, the finite program is folded into a recursive function, that is, a program generalizing over the finite program is induced. The second step is also called "generalization-to-n" and corresponds to program synthesis from traces or programming by demonstration. It will be shown later that folding of finite terms can be described as a grammar inference problem, namely as induction of a special class of context-free tree grammars. Since the late eighties, there have been two additional approaches to inductive synthesis: inductive logic programming and genetic programming. While the classical functional approach depends on exploiting structural information given in the examples, these approaches mainly depend on search in the hypothesis space, that is, the space of syntactically correct programs of a given programming language. The work presented here is in the context of functional program synthesis.
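The two steps can be illustrated with a toy example (in Python rather than Lisp, and not the book's actual rewrite system): the finite program below could result from rewriting three input/output examples of a hypothetical last function, and folding its repeated pattern yields the recursive generalization.

```python
def last_finite(xs):
    """Step 1: input/output examples rewritten into a finite
    (non-recursive) program; it covers only lists of length 1 to 3."""
    if len(xs) == 1:
        return xs[0]
    elif len(xs[1:]) == 1:
        return xs[1:][0]
    elif len(xs[2:]) == 1:
        return xs[2:][0]
    raise ValueError("undefined beyond the given examples")

def last_rec(xs):
    """Step 2 ("generalization-to-n"): folding the repeated pattern
    yields a recursive function for lists of arbitrary length."""
    return xs[0] if len(xs) == 1 else last_rec(xs[1:])
```

Both functions agree on the examples, but only the folded version generalizes beyond them.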

While folding of finite programs into recursive functions can be performed (nearly) by purely syntactical pattern matching, generation of finite terms from input/output examples is knowledge-dependent. The result of rewriting examples, that is, the form and complexity of the finite program, is completely dependent on the set of predefined primitive functions and data structures provided for the rewrite system. Additionally, the outcome depends on the rewrite strategy used: even for a constant set of background knowledge, rewriting can result in different programs. Theoretically, there are infinitely many possible ways to represent a finite program which describes how input examples can be transformed into the desired output. Because generalizability depends on the form of the finite program, this first step is the bottleneck of program synthesis. Here program synthesis is confronted with the crucial problem of AI and cognitive science: problem solving success is determined by the constructed representation.

¹ Please note that throughout the book I will mostly omit giving both masculine and feminine forms and use only the masculine for better readability. Also, I will refer to my work in the first person plural instead of the singular because part of the work was done in collaboration with other people and I do not want to switch between first and third person.


Overview. In the following, we give a short overview of the different aspects of inductive synthesis of recursive functions which are covered in this book.

Universal Planning. We propose to use domain-independent planning to generate finite programs. Inputs correspond to problem states, outputs to states fulfilling the desired goals, and the transformation from input to output is realized by calculating optimal action sequences (shortest sequences of function applications). We use universal planning, that is, we consider all possible states of a given finite problem in a single plan. While a "standard" plan represents an (optimal) sequence of actions for transforming one specific initial state into a state fulfilling the given goals, a universal plan represents the set of all optimal plans as a DAG (directed acyclic graph). For example, a plan for sorting lists (or more precisely arrays) of the four numbers 1, 2, 3, and 4 contains all sequences of swap operations to transform each of the 4! possible input lists into the list [1 2 3 4]. Plan construction is realized by a non-linear, state-based, backward algorithm. To make planning applicable to a larger class of problems which are of interest for program synthesis, we present an extension of planning from purely relational domain descriptions to function application.
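For illustration, such a plan can be computed by breadth-first search backward from the goal. The Python sketch below (illustrative only, not the book's implementation) handles lists of three numbers with arbitrary swap operations and keeps one optimal action per state, i.e. a minimal spanning tree of the DAG of all optimal plans:

```python
def universal_plan(n=3):
    """Backward BFS from the goal state: every state is assigned an
    optimal swap leading one step closer to the sorted list."""
    goal = tuple(range(1, n + 1))
    swaps = [(i, j) for i in range(n) for j in range(i + 1, n)]
    plan, frontier = {goal: None}, [goal]
    while frontier:
        new = []
        for state in frontier:
            for i, j in swaps:
                pred = list(state)
                pred[i], pred[j] = pred[j], pred[i]
                pred = tuple(pred)
                if pred not in plan:     # first visit = shortest distance
                    plan[pred] = (('swap', i, j), state)
                    new.append(pred)
        frontier = new
    return plan

plan = universal_plan()
assert len(plan) == 6                    # all 3! input lists are covered

state, actions = (3, 1, 2), []           # execute the plan for one input
while plan[state] is not None:
    action, state = plan[state]
    actions.append(action)
```

Execution from (3, 1, 2) reaches the sorted list (1, 2, 3) in two swaps, an optimal transformation sequence for that input.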

Plan Transformation. The universal plan already represents the structure of the searched-for program, because it gives an ordering of operator applications. Nevertheless, the plan cannot be generalized directly; some transformation steps are needed to generate a finite program which can be input to a generalization-to-n algorithm. For a program to be generalizable into a (terminating) recursive function, the input states have to be classified with respect to their "size", that is, the objects involved in the planning problem must have an underlying order. We propose a method by which the data type underlying a given plan can be inferred. This information can be exploited in plan transformation to introduce a "constructive" representation of input states, together with an "empty"-test and selector functions.
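As a minimal sketch of such a re-representation (Python; the predicate names on, ontable, and clear are the usual blocks-world vocabulary, and this is not the book's inference algorithm), a relational state whose literals induce a linear order can be re-coded as a list, for which an "empty"-test and selector functions are immediate:

```python
state = {('on', 'a', 'b'), ('on', 'b', 'c'), ('ontable', 'c'), ('clear', 'a')}

def to_list(state):
    """Infer the order underlying the 'on' literals and return the
    tower as a constructive list, topmost block first."""
    on = {lit[1]: lit[2] for lit in state if lit[0] == 'on'}
    below = set(on.values())
    top = next(x for x in on if x not in below)  # block with a free top
    tower = [top]
    while tower[-1] in on:                       # follow the 'on' chain
        tower.append(on[tower[-1]])
    return tower

tower = to_list(state)              # ['a', 'b', 'c']
empty = (tower == [])               # the structural "empty"-test
head, tail = tower[0], tower[1:]    # selector functions
```

On this representation, a recursive program can now descend from the tower to its tail until the "empty"-test succeeds.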

Folding of Finite Programs. Folding of a finite program into a (set of) recursive programs is done by pattern matching. For inductive generalization, the notion of fixpoint semantics can be "inverted": A given finite program is considered as the n-th unfolding of an unknown recursive program. The term can be folded into a recursion if it can be decomposed such that each segment of the term matches the same sub-term (called the skeleton) and if the instantiations of the skeleton with respect to each segment can be described by a unique substitution (a set of replacements of variables in the skeleton by terms). Identifying the skeleton corresponds to learning a regular tree grammar; identifying the substitution corresponds to an extension of the regular to a context-free tree grammar.
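The inverted fixpoint view can be made concrete with a small Python sketch (terms as nested tuples; the symbols plus, pred, eq0, and the skeleton are illustrative): a finite term folds into a recursion exactly if it equals the n-th unfolding of some skeleton under a recurrent substitution, here x → pred(x).

```python
def subst(term, env):
    """Apply a substitution to a term represented as nested tuples."""
    if isinstance(term, tuple):
        return tuple(subst(arg, env) for arg in term)
    return env.get(term, term)

def unfold(skeleton, sub, n, bottom='Omega'):
    """n-th unfolding: plug successively instantiated copies of the
    skeleton into the recursion position REC; Omega marks 'undefined'."""
    if n == 0:
        return bottom
    inner = subst(unfold(skeleton, sub, n - 1, bottom), sub)
    return subst(skeleton, {'REC': inner})

# Skeleton and substitution hypothesized by the folder:
skeleton = ('if', ('eq0', 'x'), '0', ('plus', 'x', 'REC'))
sub = {'x': ('pred', 'x')}

# The observed finite program (e.g. produced by plan transformation):
observed = ('if', ('eq0', 'x'), '0',
            ('plus', 'x',
             ('if', ('eq0', ('pred', 'x')), '0',
              ('plus', ('pred', 'x'),
               ('if', ('eq0', ('pred', ('pred', 'x'))), '0',
                ('plus', ('pred', ('pred', 'x')), 'Omega'))))))

# Folding succeeds: the term is the third unfolding of the skeleton,
# so it generalizes to G(x) = if eq0(x) then 0 else plus(x, G(pred(x))).
assert unfold(skeleton, sub, 3) == observed
```

The actual folder has to search for the skeleton and substitution; the sketch only shows the check that a hypothesis must pass.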

Our approach is independent of a specific programming language: terms are supposed to be elements of a term algebra and the inferred recursive


algebra. An RPS is defined as a "main program" (ground term) together with a system of possibly recursive equations over a term algebra consisting of finite sets of variables, function symbols, and function variables (representing names of user-defined functions). Folding results in the inference of a program scheme because the elements of the term algebra (namely the function symbols) can be arbitrarily interpreted. Currently, we assume a fixed interpreter function which maps RPSs into Lisp functions with known denotational and operational semantics. We restrict Lisp to its functional core, disregarding global variables, variable assignments, loops, and other non-functional elements of the language. As a consequence, an RPS corresponds to a concrete functional program.
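For example, a hypothetical RPS for factorial, written in the notation just described, consists of a main program and one recursive equation:

```
main:  G(n)
G(x) = if(eq0(x), succ(0), *(x, G(pred(x))))
```

Under the fixed interpretation of if, eq0, succ, *, and pred over the natural numbers, this scheme denotes the factorial function; read merely as an element of the term algebra, it stands for the whole class of functions obtained by reinterpreting its function symbols.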

For an arbitrarily instantiated finite term which is input to the folder, a term algebra together with the valuation of the identified parameters of the main program is inferred as part of the folding process. The current system can deal with a variety of (syntactic) recursive structures (tail recursion, linear recursion, tree recursion, and combinations thereof), and it can deal with recursion over interdependent parameters. Furthermore, a constant initial segment of a term can be identified and used to construct the main program, and it is possible to identify sub-patterns distributed over the term which can be folded separately, resulting in an RPS consisting of two or more recursive equations. Mutual recursion and non-primitive recursion are out of the scope of the current system.

Scheme Abstraction and Analogical Problem Solving. A set of (synthesized) RPSs can be organized into an abstraction hierarchy of schemes by generalizing over their common structure. For example, an RPS for calculating the factorial of a natural number and an RPS for calculating the sum over a natural number can be generalized to an RPS representing a simple linear recursion over natural numbers. This scheme can be generalized further when regarding simple linear recursive functions over lists. Abstraction is realized by introducing function variables which can be instantiated by a (restricted) set of primitive function symbols. Abstraction is based on "mapping" two terms such that their common structure is preserved. We present two approaches to mapping: tree transformation and anti-unification. Tree transformation is based on transforming one term (tree) into another by substituting, inserting, and deleting nodes. The performed transformations define the mapping relation. Anti-unification is based on constructing an abstract term containing first- and second-order variables (i.e., object and function variables) together with a set of substitutions such that instantiating the abstract term results in the original terms.
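A simplified anti-unification with function variables can be sketched in a few lines of Python (term representation and symbol names are illustrative, and this is not the book's algorithm); applied to factorial- and sum-like bodies it yields exactly the simple linear recursion scheme mentioned above:

```python
def anti_unify(s, t, store):
    """Return the least general pattern of s and t; differing subterms
    (or head symbols) become variables, kept consistent via store."""
    if s == t:
        return s
    if isinstance(s, tuple) and isinstance(t, tuple) and len(s) == len(t):
        return tuple(anti_unify(a, b, store) for a, b in zip(s, t))
    if (s, t) not in store:              # same mismatch -> same variable
        store[(s, t)] = 'X%d' % len(store)
    return store[(s, t)]

fac = ('if', ('eq0', 'x'), ('succ', '0'),
       ('times', 'x', ('G', ('pred', 'x'))))
summ = ('if', ('eq0', 'x'), '0',
        ('plus', 'x', ('G', ('pred', 'x'))))

store = {}
pattern = anti_unify(fac, summ, store)
# X0 abstracts the two base cases; X1 sits in head position and is thus
# a function variable abstracting times/plus: the linear recursion scheme.
assert pattern == ('if', ('eq0', 'x'), 'X0',
                   ('X1', 'x', ('G', ('pred', 'x'))))
```

The store records the substitutions, so instantiating the pattern with either side of each stored pair recovers the original terms.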

Scheme abstraction can be integrated into a general model of analogical problem solving (see fig. 1.1): For a given finite program representing some initial experience with a problem, that RPS is retrieved from memory whose n-th unfolding has "maximal similarity" to the current (target) problem. Instead of inducing a general solution strategy by


generalization-to-n, the retrieved RPS is modified with respect to the mapping obtained between its unfolding and the target. Modification can involve a simple re-instantiation of primitive symbols or more complex adaptations. If an already abstracted scheme is adapted to the current problem, we speak of refinement. To obtain some insight into the pragmatics of analogical problem solving, we empirically investigated what type and degree of structural mapping between two problem structures must exist for successful analogical transfer in human problem solvers.

[Figure: schematic of analogical problem solving and learning, relating a source finite program term and its recursive program scheme (RPS) to a target term through retrieval, unfolding, mapping, adaptation, abstraction into an abstraction hierarchy, and storage.]

Fig. 1.1. Analogical Problem Solving and Learning

Integrating Planning, Program Synthesis, and Analogical Reasoning. The three components of our approach (planning, folding of finite terms, and analogical problem solving) can be used in isolation or integrated into a complex system. Input to the universal planner is a problem specification represented in an extended version of the standard Strips language; output is a universal plan, represented as a DAG. Input to the synthesis system is a finite program term; output is a recursive program scheme. Finite terms can be obtained in arbitrary ways, for example by hand-coding or by recording program traces. We propose an approach of plan transformation (from a DAG to a finite program term) to combine planning with program synthesis. Generating finite programs by planning is especially suitable for domains where inputs can be represented as relational objects (sets of literals) and where operations can be described as manipulations of such objects (changing literals). This is true for blocks-world problems and puzzles (such as Tower of Hanoi) and for a variety of list-manipulating problems (such as sorting). For numerical


problems such as factorial or Fibonacci we omit planning and start program synthesis from finite programs generated from hand-coded traces. Input to the analogy module is a finite program term; outputs are the RPS which is most similar to the term, its re-instantiation or adaptation to cover the current term, and an abstracted scheme generalizing over the current and the re-used RPS. Figure 1.2 gives an overview of the described components and their interactions. All components are implemented in Common Lisp.

[Figure: the main components of the synthesis system: a problem specification is turned by universal planning into a universal plan, by plan transformation into a finite program term, and by generalization-to-n into a recursive program scheme; analogical problem solving and storage connect finite program terms and schemes to an abstraction hierarchy.]

Fig. 1.2. Main Components of the Synthesis System

Contributions to Research. In the following, we discuss our work in relation to different areas of research. Our work directly contributes to research in program synthesis and planning. To other research areas, such as software engineering and discovery learning, our work does not directly contribute, but it might offer some new perspectives.

A Novel Approach to Inductive Program Synthesis. Going back to the roots of early work in functional program synthesis and extending it, exploiting the experience of research available today, results in a novel approach to inductive program synthesis with several advantageous aspects: Adopting the original two-step process of first generating finite programs from examples and then generalizing over them results in a clear separation of the knowledge-dependent and the syntactic aspects of induction. Dividing the complex program synthesis problem into two parts allows us to address the sub-problems separately. The knowledge-dependent first part, generating finite programs from


examples, can be realized by different approaches, including the rewrite approach proposed in the seventies. We propose to use state-based planning. While there is a long tradition of combining (deductive) planning and deductive program synthesis, up to now there has been no interaction between research on state-based planning and research on inductive program synthesis. State-based planning is a powerful approach to calculating (optimal) transformation sequences from input states to a state fulfilling a set of goal relations, providing a powerful domain specification language together with a domain-independent search algorithm for plan construction. The second part, folding of finite programs, can be solved by a pattern-matching approach. The correspondence between folding program terms and inferring context-free tree grammars makes it possible to give an exact characterization of the class of recursive programs which can be induced. Defining pattern matching for terms which are elements of an arbitrary term algebra makes the approach independent of a specific programming language. Synthesizing program schemes, in contrast to programs, allows for a natural combination of induction with analogical reasoning and learning.

Learning Domain-Specific Control Rules for Plans. Cross-fertilization between state-based planning and inductive program synthesis results in a powerful approach to learning domain-specific control rules for plans. Control rule learning is currently becoming a major interest in planning research: Although a variety of efficient domain-independent planners have been developed in the nineties, for demanding real-world applications it is necessary to guide the search for (optimal) plans by exploiting knowledge about the structure of the planning domain. Learning control rules makes it possible to gain the efficiency of domain-dependent planning without the need to hand-code such knowledge, which is a time-consuming and error-prone knowledge engineering task. Our functional approach to program synthesis is very well suited for control rule learning because, in contrast to logic programs, the control flow is represented explicitly. Furthermore, a lot of inductive logic programming systems do not provide an ordering for the induced clauses and literals. That is, evaluation of such clauses by a Prolog interpreter is not guaranteed to terminate with the desired result. In proof planning, a special domain of planning in the area of mathematical theorem proving and program verification, the need for learning proof methods and control strategies is also recognized. Up to now, we have not applied our approach to this domain, but we see this as an interesting area of further research.
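The contrast with unordered clause sets can be illustrated by a minimal decision-list evaluator (Python sketch; the predicates and actions are hypothetical, loosely modeled on the Tower domain):

```python
def make_decision_list(rules):
    """An ordered list of (condition, action) rules: the first matching
    rule fires, so the control flow and evaluation order are explicit,
    unlike an unordered set of logic clauses."""
    def decide(state):
        for condition, action in rules:
            if condition(state):
                return action
        return None
    return decide

# Hypothetical control rules for clearing a block:
decide = make_decision_list([
    (lambda s: s['clear'],   'noop'),      # nothing to do
    (lambda s: s['holding'], 'putdown'),   # free the hand first
    (lambda s: True,         'unstack'),   # default: remove the top block
])
```

Because the rules are ordered, applying them to any state deterministically selects exactly one action.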

Knowledge Acquisition for Software Engineering Tools. Knowledge-based software engineering systems are based on formalized knowledge of various general and domain-specific aspects of program development. For that reason, such systems are necessarily incomplete. It depends on the "genius" of the authors of such systems (their analytical insights about program structures and their abilities to explicate these insights) how large the set of domains and the class of programs is that can be supported. Similarly to control rules

8 1. Introdu tion

in planning, ta ti s are used to guide sear h. Furthermore, some proof sys-

tems provide a variety of proof methods. While exe ution of transformation

or proof steps is performed autonomously, the sele tion of an appropriate ta -

ti or method is performed intera tively. We propose that these omponents

of software engineering tools are andidates for learning. The rst reason is,

that the guarantee of orre tness whi h is ne essary for the fully automated

parts of su h systems, remains inta t. Only the \higher level" strategi as-

pe ts of the system whi h depend on user intera tion are subje t to learning

and the a quired ta ti s and methods an be a epted or reje ted by the user.

Our approa h to program synthesis might omplement these spe ial kind of

expert systems by providing an approa h to model how some aspe ts of su h

expertise develop with experien e.

Programming by Demonstration. With the growing number of computer users, most of them without programming skills, program synthesis from examples becomes relevant for practical applications. For example, watching a user's input behavior in a text processing system provides traces which can be generalized to macros (such as "write a letter-head"), watching a user's browsing behavior in the world-wide web can be used to generate preference classes, or watching a user's inputs into a graphical editor might provide suggestions for the next actions to be performed. In the simplest case, the resulting programs are just sequences of parameterized operations. By applying inductive program synthesis, more sophisticated programs, involving loops, could be generated. A further application is to support beginning programmers. A student might interact with an interpreter by first giving examples of the desired program behavior and then watch how a recursion is formed to generalize the program to the general case.

Discovery Learning. Folding of finite programs and some aspects of plan transformation can be characterized as discovery learning. Folding of finite programs models the (human) ability to extract generalized rules by identifying relevant structural aspects in perceptive inputs. This ability can be seen as the core of the flexibility of (human) cognition underlying the acquisition of perceptual categories and linguistic concepts as well as the extraction of general strategies from problem solving experience. Furthermore, we will demonstrate for plan transformation how problem dependent selector functions can be extracted from a universal plan by a purely syntactic analysis of its structure. Because our approach to program synthesis is driven by the underlying structure of some initial experience (represented as a universal plan), it is also more cognitively plausible than the search driven synthesis of inductive logic and genetic programming.

Integrating Learning by Doing and by Analogy. Cognitive science approaches to skill acquisition from problem solving and early work on learning macro operators from planning focus on combining sequences of primitive operators into more complex ones by merging their preconditions and effects. In contrast, our work addresses the acquisition of problem solving strategies. Because domain dependent strategies are represented as recursive program schemes, they capture the operational aspect of problem solving as well as the structure of a domain. Thereby we can deal with learning by induction and analogy in a unified way, showing how schemes can be acquired from problem solving and how such schemes can be used and generalized in analogical problem solving. Furthermore, the identification of data types from plans captures the evolution of perceptive chunks by identifying the relevant aspects of a problem description and defining an order over such partial descriptions.

Organization of the Book. The book is organized into three main parts: Planning, Inductive Program Synthesis, and Analogical Problem Solving and Learning. Each part starts with an overview of research, along with an introduction of the basic concepts and formalisms. Afterwards our own work is presented, including relations to other work. We finish with summarizing the contributions of our approach to the field and giving an outlook on further research. The overview chapters can be read independently of the research specific chapters. The research specific chapters presuppose that the reader is familiar with the concepts introduced in the overview chapters. The focus of our work is on inductive program synthesis and its application to control rule learning for planning. Additionally, we discuss relations to work on problem solving and learning in cognitive psychology.

Part I: Planning. In chapter 2, first, the standard Strips language for specifying planning domains and problems and a semantics for operator application are introduced. Afterwards, extensions of the Strips language (ADL/PDDL) are discussed and contrasted with situation calculus. Basic algorithms for forward and backward planning are introduced. Complexity of planning and formal properties of planning algorithms are discussed. A short overview of the development of different approaches to planning is given, and pre-planning analysis and learning are introduced as methods for obtaining domain-specific knowledge for making plan construction more efficient. In chapter 3, the non-linear, state-based, universal planner DPlan is introduced, and in chapter 4 an extension of the Strips language to function application is presented. In chapter 5, we evaluate our approach and discuss further work to be done.

Part II: Inductive Program Synthesis. In chapter 6 we give a survey of automatic programming research, focussing on automatic program construction, that is, program synthesis. We give a short overview of constructive theorem proving and program transformation as approaches to deductive program synthesis. Afterwards, we present inductive program synthesis as a special case of machine learning. We introduce grammar inference as theoretical background for inductive program synthesis. Genetic programming, inductive logic programming, and inductive functional programming are presented as three approaches to generalizing recursive programs from incomplete specifications. In chapter 7 we introduce the concept of recursive program schemes and present our approach to folding finite program terms. In chapter 8 we present our approach to bridging the gap between planning and program synthesis by transforming plans into program terms. In chapter 9, we evaluate our approach and discuss relations to human strategy learning.

Part III: Analogical Problem Solving and Learning. In chapter 10 we give an overview of approaches to analogical and case-based reasoning, discussing similarity measures and qualitative concepts for structural similarity. In chapter 11 some psychological experiments concerning problem solving with non-isomorphic example problems are reported. In chapter 12 we present anti-unification as an approach to structure mapping and generalization, along with preliminary ideas for adaptation of non-isomorphic structures. In chapter 13 we evaluate our approach and discuss further work to be done.

Part I

Planning: Making Some Initial Experience with a Problem Domain

2. State-Based Planning

Plan it first and then take it.
- Travis McGee in: John D. MacDonald, The Long Lavender Look, 1970

Planning is a major sub-discipline of AI. A plan is defined as a sequence of actions for transforming a given state into a state which fulfills a predefined set of goals. Planning research deals with the formalization, implementation, and evaluation of algorithms for constructing plans. In the following, we first (sect. 2.1.1) introduce a language for representing problems called standard Strips. Along with defining the language, the basic concepts and notations of planning are introduced. Afterwards (sect. 2.2) some extensions to Strips are introduced and situation calculus is discussed as an alternative language formalism. In section 2.3 basic algorithms for plan construction based on forward and backward search are introduced. We discuss complexity results for planning and give results for termination, soundness, and completeness of planning algorithms. In section 2.4 we give a short survey of well-known planning systems and an overview of concepts often used in the planning literature, along with pointers to literature. Finally (sect. 2.5), pre-planning analysis and learning are introduced as approaches for the acquisition of domain-specific knowledge which can be used to make plan construction more efficient. The focus is on state-based algorithms, including universal planning. Throughout the chapter we give illustrations using blocks-world examples.

2.1 Standard Strips

To introduce the basic concepts and notations of (state-based) planning, we first review Strips. The original Strips was proposed by Fikes and Nilsson (1971), and up to now it is used, with slight modifications or some extensions, by the majority of planning systems. The main advantage of Strips is that it has a strong expressive power (mainly due to the so called closed world assumption, see below) and at the same time allows for efficient planning algorithms. Informal introductions to Strips are given in all AI text books, for example in Nilsson (1980) and Russell and Norvig (1995).


2.1.1 A Blocks-World Example

Let us look at a very restricted world, consisting of four distinct blocks. The blocks, called A, B, C, and D, are the objects of this blocks-world domain. Each state of the world can be described by the following relations: a block can be on the table, that is, the proposition ontable(A) is true in a state where block A is lying on the table; a block can be clear, that is, the proposition clear(A) is true in a state where no other block is lying on top of block A; and a block can be lying on another block, that is, the proposition on(B, C) is true in a state where block B is lying immediately on top of block C. State changes can be achieved by applying one of the following two operators: putting a block on the table and putting a block on another block. Both operators have application conditions: A block can only be moved if it is clear, and a block can only be put on another block if this block is clear, too. Figure 2.1 illustrates this simple blocks-world example. If the goal for which a plan is sought is that A is lying on B and B is lying on C, the right-hand state is a goal state, because both propositions on(A, B) and on(B, C) are true. Note that other states, for example a tower with D on top of A, B, C, also fulfill the goal, because it was not demanded that A has to be clear or that C has to be on the table.

[Figure: a left-hand state (B on C; A, C, D involving A and D on the table) is transformed by the action "put A on B" into a right-hand state (the tower A on B on C; D on the table).]

Fig. 2.1. A Simple Blocks-World

To formalize plan construction, a language (or different languages) for describing states, operators, and goals must be defined. Furthermore, it has to be defined how a state change can be (syntactically) calculated.

2.1.2 Basic Definitions

The Strips language is defined over literals with constant symbols and variables as arguments. That is, no general terms (including function symbols) are considered. We use the notion of term in the following definition in this specific way. In chapter 8 we will introduce a language allowing for general terms.

Definition 2.1.1 (Strips Language). The Strips language L_S(X, C, R) is defined over sets of variables X, constant symbols C, and relational symbols R in the following way:

- Variables x ∈ X are terms.
- Constant symbols c ∈ C are terms.
- If p ∈ R is a relational symbol with arity i and t_1, ..., t_i are terms, then p(t_1, ..., t_i) is a formula.
- Formulas consisting of a single relational symbol are called (positive) literals. For short, we write p(t).
- If p_1(t_1), ..., p_n(t_n) are formulas, then {p_1(t_1), ..., p_n(t_n)} is a formula, representing the conjunction of the literals.
- There are no other Strips formulas.

Formulas over L_S(C, R), i.e., formulas not containing variables, are called ground formulas. A literal without variables is called an atom.

We write L_S as abbreviation for L_S(X, C, R). With X(F) we denote the variables occurring in formula F.

For the blocks-world domain given in figure 2.1, L_S is defined over the following sets of symbols: C = {A, B, C, D} and R = {on_2, clear_1, ontable_1}, where p_i denotes a relational symbol of arity i. We will see below that for defining operators additionally a set of variables X = {?x, ?y, ?z} is needed.

The Strips language can be used to represent states[1] and to define syntactic rules for transforming a state representation by applying Strips operators. A state representation can be interpreted logically by providing a domain, i.e., a set of objects of the world, and a denotation for all constant and relational symbols. A state representation denotes all states of the world where the interpreted Strips relations are true. That is, a state representation denotes a family of states in the world. When interpreting a formula, it is assumed that all relations not given explicitly are false (the closed world assumption).

Definition 2.1.2 (State Representation). A problem state s is a conjunction of atoms. That is, s ∈ L_S(C, R).

That is, states are propositions (relations over constants).

Examples of problem states are

s_1 = {on(B, C), clear(A), clear(B), clear(D), ontable(A), ontable(C), ontable(D)}
s_2 = {on(A, B), on(B, C), clear(A), clear(D), ontable(C), ontable(D)}.

An interpretation of s_1 results in a state as depicted on the left-hand side of figure 2.1, an interpretation of s_2 results in a state as depicted on the right-hand side.
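Under the closed world assumption, such a state representation can be implemented directly as a set of ground atoms: an atom is true exactly when it is an element of the set, and a conjunction of atoms holds exactly when it is a subset of the state. A minimal sketch in Python (the tuple encoding of atoms is our own illustrative choice, not part of the book's formalism):

```python
# A ground atom is encoded as a tuple: ("on", "B", "C") stands for on(B, C).
# A state is a frozenset of such atoms; everything not listed is false
# (closed world assumption).
s1 = frozenset({
    ("on", "B", "C"),
    ("clear", "A"), ("clear", "B"), ("clear", "D"),
    ("ontable", "A"), ("ontable", "C"), ("ontable", "D"),
})

def holds(atom, state):
    """An atom is true in a state iff it is an element of the state set."""
    return atom in state

def satisfies(state, goal):
    """A conjunction of atoms holds iff it is a subset of the state."""
    return goal <= state

goal = frozenset({("on", "B", "C")})
print(satisfies(s1, goal))           # True: on(B, C) is listed in s1
print(holds(("on", "A", "B"), s1))   # False: not listed, hence false
```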

The relational symbols on(x, y), clear(x), and ontable(x) can change their truth values over different states. There might be further relational symbols denoting relations which are constant over all states, as for example the color of blocks (if there is no operator paint-block). Relational symbols which can change their truth values are called fluents, constant relational symbols are called statics. We will see below that fluents can appear in operator preconditions and effects, while statics can only appear in preconditions.

[1] Note that we do not introduce different representations to discern between syntactical expressions and their semantic interpretation. We will speak of constant or relation symbols when referring to the syntax and of constants or relations when referring to the semantics of a planning domain.

Before introducing goals and operators defined over L_S(X, C, R), we introduce matching of formulas containing variables (also called patterns) with ground formulas.

Definition 2.1.3 (Substitution and Matching). A substitution is a set of mappings σ = {x_1 ← t_1, ..., x_n ← t_n} defining replacements of variables x_i by terms t_i. For the language L_S, the terms t_i are restricted to variables and constants. By applying a substitution σ to a formula F, denoted Fσ, all variables x_i ∈ X(F) with x_i ← t_i ∈ σ are replaced by the associated t_i. Note that this replacement is unique, i.e., identical variables are replaced by identical terms. We call Fσ an instantiated formula if all X(F) are replaced by constant symbols.

For a formula F ∈ L_S and a set of atoms A ∈ L_S(C, R), match(F, A) = Σ gives all substitutions σ_i ∈ Σ with Fσ_i ⊆ A.

For state s_1 given above, the formula F = {ontable(x), clear(x), clear(y)} can be instantiated to Σ = {σ_1, σ_2, σ_3, σ_4} with

σ_1 = {x ← A, y ← B},
σ_2 = {x ← A, y ← D},
σ_3 = {x ← D, y ← A},
σ_4 = {x ← D, y ← B}.

For the quantifier-free language L_S(X, C, R), all variables occurring in formulas are assumed to be bound by existential quantifiers. That is, all formulas correspond to conjunctions of propositions.
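The match function of definition 2.1.3 can be sketched as a small backtracking enumeration over the atoms of a state. The sketch below also enforces the usual restriction, discussed later in this section, that different variables are instantiated with different constants, so that it reproduces exactly the four substitutions listed above (the encoding and helper names are our own):

```python
def match(pattern, atoms):
    """Enumerate all substitutions sigma with pattern*sigma being a subset
    of atoms.  Variables are strings starting with '?'; identical variables
    are bound to identical constants, and (as a usual restriction) different
    variables are bound to different constants."""
    def extend(literals, sigma):
        if not literals:
            yield dict(sigma)
            return
        first, rest = literals[0], literals[1:]
        for atom in atoms:                       # try every atom of the state
            if atom[0] != first[0] or len(atom) != len(first):
                continue
            new, ok = dict(sigma), True
            for t, c in zip(first[1:], atom[1:]):
                if t.startswith("?"):            # variable position
                    if t in new:
                        ok = new[t] == c         # must stay consistent
                    elif c in new.values():
                        ok = False               # different vars, different constants
                    else:
                        new[t] = c
                else:
                    ok = t == c                  # constant must match literally
                if not ok:
                    break
            if ok:
                yield from extend(rest, new)
    yield from extend(list(pattern), {})

s1 = {("on", "B", "C"), ("clear", "A"), ("clear", "B"), ("clear", "D"),
      ("ontable", "A"), ("ontable", "C"), ("ontable", "D")}
F = [("ontable", "?x"), ("clear", "?x"), ("clear", "?y")]
subs = list(match(F, s1))
# yields the four substitutions sigma_1 ... sigma_4 from the text
```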

Definition 2.1.4 (Goal Representation). A goal G is a conjunction of literals. That is, G ∈ L_S(C, R).

An example of a planning goal is G = {on(A, B), on(B, C)}. All states s with G ⊆ s are goal states.

Definition 2.1.5 (Strips Operator). A Strips operator op is described by preconditions PRE and ADD- and DEL-lists[2], with PRE, ADD, DEL ∈ L_S. ADD and DEL describe the operator effect. An instantiated operator o = opσ ∈ L_S(C, R) is called an action. We write PRE(o), ADD(o), DEL(o) to refer to the precondition, ADD-, or DEL-list of an (instantiated) operator. Operators with variables are also called operator schemes.

An example for a Strips operator is:

Operator: put(?x, ?y)
PRE: {ontable(?x), clear(?x), clear(?y)}
ADD: {on(?x, ?y)}
DEL: {ontable(?x), clear(?y)}

[2] More exactly, the literals given in ADD and DEL are sets.

This operator can, for example, be instantiated to

Operator: put(A, B)
PRE: {ontable(A), clear(A), clear(B)}
ADD: {on(A, B)}
DEL: {ontable(A), clear(B)}

We will see below (fig. 2.2) that a second variant of the put operator is needed for the case that block x is lying on another block z. In a blocks-world with additional relations green(A), green(B), red(C), red(D), we could restrict the application conditions for put further, for example such that only green blocks are allowed to be moved. Assuming that a put action does not affect the color of a block, red(x) and green(x) are static symbols.

A usual restriction of operator instantiation is that different variables have to be instantiated with different constant symbols.[3]

Definition 2.1.6 ((Forward) Operator Application). For a state s and an instantiated operator o, operator application is defined as Res(o, s) = (s \ DEL(o)) ∪ ADD(o) if PRE(o) ⊆ s.

Note that subtracting DEL(o) and adding ADD(o) are commutative (resulting in the same successor state) only if DEL(o) ∩ ADD(o) = ∅, DEL(o) ⊆ s, and ADD(o) ∩ s = ∅. The ADD-list of an operator might contain free variables, i.e., variables which do not occur as arguments of the relational symbols in the precondition. This means that matching might result only in partial instantiations. The remaining variables have to be instantiated from the set of constant symbols C given for the current planning problem.

Operator application gives us a syntactic rule for changing one state representation into another one by adding and subtracting atoms. On the semantic side, operator application describes state transitions in a state space (Newell & Simon, 1972; Nilsson, 1980), where a relation s' = Res(o, s) denotes that a state s can be transformed into a state s'. (Syntactic) operator application is admissible if s' = Res(o, s) holds in the state space underlying the given planning problem, i.e., if (1) s' denotes a state in the world and if (2) this state can be reached by applying the action characterized by o in state s.

For the left-hand state in figure 2.1 (s_1) and the instantiated put operator given above, PRE(o) ⊆ s holds and Res(o, s) results in the right-hand state in figure 2.1 (s_2).
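With states and instantiated operators encoded as sets of ground atoms, forward operator application becomes a one-line set computation. A sketch (our own encoding) that reproduces the step from s_1 to s_2 of figure 2.1:

```python
def res(o, s):
    """Res(o, s) = (s minus DEL(o)) union ADD(o) if PRE(o) is a subset of s,
    else undefined (the action is not applicable)."""
    pre, add, delete = o
    if not pre <= s:
        return None
    return (s - delete) | add

# the instantiated action put(A, B) from the text, as a (PRE, ADD, DEL) triple
put_A_B = ({("ontable", "A"), ("clear", "A"), ("clear", "B")},
           {("on", "A", "B")},
           {("ontable", "A"), ("clear", "B")})

s1 = frozenset({("on", "B", "C"), ("clear", "A"), ("clear", "B"),
                ("clear", "D"), ("ontable", "A"), ("ontable", "C"),
                ("ontable", "D")})
s2 = res(put_A_B, s1)
# s2 = {on(A, B), on(B, C), clear(A), clear(D), ontable(C), ontable(D)}
```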

A Strips domain is given as a set of operators. Extensions of Strips such as PDDL allow inclusion of additional information, such as types (see sect. 2.2). A Strips planning problem is given as P(O, I, G) with O as set of operators, I as set of initial states, and G as set of top-level goals. An example for a Strips planning problem is given in figure 2.2.

[3] The planning language PDDL (see sect. 2.2) allows explicit use of equality and inequality constraints, such that different variables can be instantiated with the same constant symbol if no inequality constraint is given.

Operators:
put(?x, ?y)
PRE: {ontable(?x), clear(?x), clear(?y)}
ADD: {on(?x, ?y)}
DEL: {ontable(?x), clear(?y)}
put(?x, ?y)
PRE: {on(?x, ?z), clear(?x), clear(?y)}
ADD: {on(?x, ?y), clear(?z)}
DEL: {on(?x, ?z), clear(?y)}
puttable(?x)
PRE: {clear(?x), on(?x, ?y)}
ADD: {ontable(?x), clear(?y)}
DEL: {on(?x, ?y)}
Goal: {on(A, B), on(B, C)}
Initial State: {on(D, C), on(C, A), clear(D), clear(B), ontable(A), ontable(B)}

Fig. 2.2. A Strips Planning Problem in the Blocks-World

While the language in which domains and problems can be represented is clearly defined, there is no unique way of modeling domains and problems. In figure 2.2, for example, we decided to describe states with the relations on, ontable, and clear. There are two different put operators; one is applied if block x is lying on the table, and the other if block x is lying on another block z.

An alternative representation of the blocks-world domain is given in figure 2.3. Here, not only the blocks but also the table is considered as an object of the domain. Unary static relations are used to represent that the constant symbols A, B, C, D are of type "block". Now all operators can be represented as put(x, y). The first variant describes what happens if a block is moved from the table and put on another block, the second variant describes how a block is moved from one block onto another, and the third variant describes how a block is put on the table. Another alternative would be to represent put as a ternary operator put(block, from, to).

Another decision which influences the representation of domains and problems concerns the level of detail. We have completely abstracted from the agent who executes the actions. If a plan is intended for execution by a robot, it becomes necessary to represent additional operators, such as picking up an object and holding an object, and states have to be described with additional literals, for example what block the agent is currently holding (see fig. 2.4).

2.1.3 Backward Operator Application

The task of a planning algorithm is to calculate sequences of transformations from states s_0 ∈ I to states s_G with G ⊆ s_G. The planning problem is solved if such a transformation sequence, called a plan, is found. Often, it is also

Operators:
put(?x, ?y)
PRE: {on(?x, Table), block(?x), block(?y), clear(?x), clear(?y)}
ADD: {on(?x, ?y)}
DEL: {on(?x, Table), clear(?y)}
put(?x, ?y)
PRE: {on(?x, ?z), block(?x), block(?y), block(?z), clear(?x), clear(?y)}
ADD: {on(?x, ?y), clear(?z)}
DEL: {on(?x, ?z), clear(?y)}
put(?x, Table)
PRE: {clear(?x), on(?x, ?y), block(?x), block(?y)}
ADD: {on(?x, Table), clear(?y)}
DEL: {on(?x, ?y)}
Goal: {on(A, B), on(B, C)}
Initial State: {on(D, C), on(C, A), on(A, Table), on(B, Table), clear(D), clear(B), block(A), block(B), block(C), block(D)}

Fig. 2.3. An Alternative Representation of the Blocks-World

required that the transformations are optimal. Optimality can be defined as the minimal number of actions or, for operators associated with different costs, as an action sequence with a minimal sum of costs.

Common to all state-based planning algorithms is that plan construction can be characterized as search in the state space. Each planning step involves the selection of an action. Finding a plan in general involves backtracking over such selections. To guarantee termination of the planning algorithm, it is necessary to keep track of the states already constructed, to avoid cycles. Search in the state space can be performed forward, from an initial state to a goal state, or backward, from a goal state to an initial state. Backward planning is based on backward operator application:

Definition 2.1.7 (Backward Operator Application). For a state s and an instantiated operator o, backward operator application is defined as Res⁻¹(o, s) = (s \ ADD(o)) ∪ (DEL(o) ∪ PRE(o)) if ADD(o) ⊆ s.

For state

s_2 = {on(A, B), on(B, C), clear(A), clear(D), ontable(C), ontable(D)}

backward application of the instantiated operator put(A, B) given above results in

s_2 \ {on(A, B)} ∪ {ontable(A), clear(A), clear(B)} =
{on(B, C), clear(A), clear(B), clear(D), ontable(A), ontable(C), ontable(D)} = s_1.

Backward operator application is sound if for Res⁻¹(o, s) = s', Res(o, s') = s holds for all states of the domain. Backward operator application is complete if for all Res(o, s') = s, Res⁻¹(o, s) = s' holds (see sect. 3.3).
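Backward application, and the soundness property on this particular instance, can be checked mechanically with the same set encoding as before (again our own sketch, not the book's implementation):

```python
def res(o, s):
    """Forward application: (s minus DEL(o)) union ADD(o) if PRE(o) holds."""
    pre, add, delete = o
    return (s - delete) | add if pre <= s else None

def res_inv(o, s):
    """Backward application: (s minus ADD(o)) union DEL(o) union PRE(o)
    if ADD(o) is a subset of s."""
    pre, add, delete = o
    return (s - add) | delete | pre if add <= s else None

put_A_B = (frozenset({("ontable", "A"), ("clear", "A"), ("clear", "B")}),
           frozenset({("on", "A", "B")}),
           frozenset({("ontable", "A"), ("clear", "B")}))

s2 = frozenset({("on", "A", "B"), ("on", "B", "C"), ("clear", "A"),
                ("clear", "D"), ("ontable", "C"), ("ontable", "D")})

s1 = res_inv(put_A_B, s2)      # regress s2 through put(A, B)
# on this instance backward application is sound: going forward again
# from s1 restores s2
assert res(put_A_B, s1) == s2
```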

Originally, backward operator application was not defined for "complete" state descriptions (i.e., an enumeration of all atoms over R which hold in a current state) but for conjunctions of (sub-)goals. We will discuss so-called goal regression in section 2.3.4.

Note that while plan construction can be performed by forward and backward operator applications, plan execution is always performed by forward operator application, transforming an initial state into a goal state.
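The forward state-space search with cycle avoidance described above can be sketched as a breadth-first search over ground actions; with breadth-first expansion, the first plan found is also optimal in the number of actions. The hand-instantiated two-block domain below is our own toy example, not one from the text:

```python
from collections import deque

def forward_search(init, goal, actions):
    """Breadth-first forward search in the state space.  Visited states
    are recorded to avoid cycles, which guarantees termination on a
    finite state space; the returned plan is shortest in action count."""
    frontier = deque([(frozenset(init), [])])
    visited = {frozenset(init)}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:              # all top-level goals hold
            return plan
        for name, pre, add, delete in actions:
            if pre <= state:           # action applicable: expand successor
                nxt = (state - delete) | add
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, plan + [name]))
    return None                        # no plan exists

# hand-instantiated actions for a two-block world: (name, PRE, ADD, DEL)
actions = [
    ("put(A,B)", {("ontable", "A"), ("clear", "A"), ("clear", "B")},
                 {("on", "A", "B")}, {("ontable", "A"), ("clear", "B")}),
    ("put(B,A)", {("ontable", "B"), ("clear", "B"), ("clear", "A")},
                 {("on", "B", "A")}, {("ontable", "B"), ("clear", "A")}),
    ("puttable(A)", {("clear", "A"), ("on", "A", "B")},
                    {("ontable", "A"), ("clear", "B")}, {("on", "A", "B")}),
    ("puttable(B)", {("clear", "B"), ("on", "B", "A")},
                    {("ontable", "B"), ("clear", "A")}, {("on", "B", "A")}),
]
init = {("ontable", "A"), ("ontable", "B"), ("clear", "A"), ("clear", "B")}
plan = forward_search(init, {("on", "A", "B")}, actions)
# plan == ["put(A,B)"]
```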

2.2 Extensions and Alternatives to Strips

Since the introduction of the Strips language in the seventies, different extensions have been introduced by different planning groups, both to make planning more efficient and to enlarge the scope to a larger set of domains. The extensions were mainly influenced by work from Pednault (1987, 1994), who proposed ADL (action description language) as a more expressive but still efficient alternative to Strips. The language PDDL (Planning Domain Definition Language) can be seen as a synthesis of all language features which were introduced in the different planning systems available today (McDermott, 1998b).

Strips and PDDL are based on the closed world assumption, allowing that state transformations can be calculated by adding and deleting literals from state descriptions. Alternatively, planning can be seen as a logical inference problem. In that sense, a Prolog interpreter is a planning algorithm. Situation calculus as a variant of first order logic was introduced by McCarthy (1963) and McCarthy and Hayes (1969). Although most of today's planning systems are based on the Strips approach, situation calculus is still influential in planning research. Basic concepts from situation calculus are used to reason about semantic properties of (Strips) planning. Furthermore, deductive planning is based on this representation language.

2.2.1 The Planning Domain Definition Language

The language PDDL was developed in 1998. Most current planning systems are based on PDDL specifications as input, and planning problems used at the AIPS planning competitions are presented in PDDL (McDermott, 1998a; Bacchus et al., 2000). The development of PDDL was a joint project involving most of the active planning research groups of the nineties. Thus, it can be seen as a compromise between the different syntactic representations and language features available in the major planning systems of today.

The core of PDDL is Strips. An example for the blocks-world representation in PDDL is given in figure 2.4. For this example, we included aspects of the behavior of an agent in the domain specification (operators pickup and putdown, relation symbols arm-empty and holding). Note that ADD- and DEL-lists are given together in an effect-slot. All positive literals are added, all negated literals are deleted from the current state.

(define (domain blocksworld)
  (:requirements :strips)
  (:predicates (clear ?x)
               (on-table ?x)
               (arm-empty)
               (holding ?x)
               (on ?x ?y))
  (:action pickup
    :parameters (?ob)
    :precondition (and (clear ?ob) (on-table ?ob) (arm-empty))
    :effect (and (holding ?ob) (not (clear ?ob)) (not (on-table ?ob))
                 (not (arm-empty))))
  (:action putdown
    :parameters (?ob)
    :precondition (holding ?ob)
    :effect (and (clear ?ob) (arm-empty) (on-table ?ob)
                 (not (holding ?ob))))
  (:action stack
    :parameters (?ob ?underob)
    :precondition (and (clear ?underob) (holding ?ob))
    :effect (and (arm-empty) (clear ?ob) (on ?ob ?underob)
                 (not (clear ?underob)) (not (holding ?ob))))
  (:action unstack
    :parameters (?ob ?underob)
    :precondition (and (on ?ob ?underob) (clear ?ob) (arm-empty))
    :effect (and (holding ?ob) (clear ?underob)
                 (not (on ?ob ?underob)) (not (clear ?ob)) (not (arm-empty))))
)

(define (problem tower3)
  (:domain blocksworld)
  (:objects a b c)
  (:init (on-table a) (on-table b) (on-table c)
         (clear a) (clear b) (clear c) (arm-empty))
  (:goal (and (on a b) (on b c)))
)

Fig. 2.4. Representation of a Blocks-World Problem in PDDL-Strips

Extensions of Strips included in PDDL domain specifications are

- Typing,
- Equality constraints,
- Conditional effects,
- Disjunctive preconditions,
- Universal quantification,
- Updating of state variables.

Most modern planners are based on Strips plus the first three extensions. While the first four extensions mainly result in a higher effectiveness for plan construction, the last two extensions enlarge the class of domains for which plans can be constructed.

Unfortunately, PDDL (McDermott, 1998b) mainly provides a syntactic framework for these features but gives no or only an informal description of their semantics. A semantics for conditional effects and effects with all-quantification is given in Koehler, Nebel, and Hoffmann (1997). In the following, we introduce typing, equality constraints, and conditional effects. Updating of state variables is discussed in detail in chapter 4. An example for a blocks-world domain specification using an operator with equality constraints and conditional effects is given in figure 2.5.

(define (domain blocksworld-adl)
  (:requirements :strips :equality :conditional-effects)
  (:predicates (on ?x ?y)
               (clear ?x))          ; clear(Table) is static
  (:action puton
    :parameters (?x ?y ?z)
    :precondition (and (on ?x ?z) (clear ?x) (clear ?y)
                       (not (= ?y ?z)) (not (= ?x ?z))
                       (not (= ?x ?y)) (not (= ?x Table)))
    :effect
      (and (on ?x ?y) (not (on ?x ?z))
           (when (not (eq ?z Table)) (clear ?z))
           (when (not (eq ?y Table)) (not (clear ?y)))))
)

Fig. 2.5. Blocks-World Domain with Equality Constraints and Conditioned Effects

2.2.1.1 Typing and Equality Constraints. The precondition in figure 2.5 contains equality constraints expressing that all three objects involved in the puton operator have to be different.

Equality constraints restrict which substitutions are legal in matching a current state and a precondition. That is, definition 2.1.3 is modified such that for all pairs (x ← t), (x' ← t') in a substitution, t ≠ t' has to hold if (not (= x x')) is specified in the precondition of the operator.

Instead of using explicit equality constraints, matching can be defined such that variables with different names must generally be instantiated with different constants. But explicit equality constraints give more expressive power: giving no equality constraint for two variables x and y allows these variables to be instantiated by different constants or the same constant. Using "implicit" equality constraints makes it necessary to specify two different operators, one with only one variable name (x = y), and one with both variable names (x ≠ y).

Besides equality constraints, types can be used to restrict matching. Let us assume a blocks-world where movable objects consist of blocks and pyramids and where no objects can be put on top of a pyramid. We can introduce the following hierarchy of types:

table
block
pyramid
movable-object: block, pyramid
flattop-object: table, block.

Operator puton(x, y, z) can now be defined over typed variables x: movable-object, y: flattop-object, z: flattop-object. When defining a problem for that extended blocks-world domain, each constant must be declared together with a type.

Typing can be simulated in standard Strips using static relations. A simple example for typing is given in figure 2.3. An example covering the type hierarchy from above is: {table(T), flattop-object(T), block(B), movable-object(B), flattop-object(B), pyramid(A), movable-object(A)}. The operator precondition can then be extended by {movable-object(x), flattop-object(y), flattop-object(z)}.

2.2.1.2 Conditional Effects. Conditional effects allow representing context-dependent effects of actions. In the blocks-world specification given in figure 2.3, three different operators were used to describe the possible effects of putting a block somewhere else (on another block or on the table). In contrast, in figure 2.5, only one operator is needed. All variants of puton(x y z) have some general application restrictions, specified in the precondition: block x is lying on something (z is another block or the table), and both x and y are clear, where clear(Table) is static, i.e., holds in all possible situations. Additionally, the equality constraints in the precondition specify that all objects involved have to be different. Regardless of the context in which puton is applied, the result is that x is no longer lying on z, but on y. This context-independent effect is specified in the first line of the effect. The next two lines specify additional consequences: If x was not lying on the table but on another block z, this block z is clear after operator application; and if block x was not put on the table but on a block y, this block y is no longer clear after operator application. Preconditions for conditioned effects are also called secondary preconditions; conditioned effects are also called secondary effects (Penberthy & Weld, 1992; Fink & Yang, 1997).

The main advantage of conditional effects is that plan construction gets more efficient: In every planning step, all operators must be matched with the current state representation, and all successfully matched operators are possible candidates for application. If the general precondition of an operator with conditioned effects does not match the current state, it can be rejected immediately, while for the unconditioned variants given in figure 2.3 all three preconditions have to be checked.

24 2. State-Based Planning

The semantics of applying an operator with conditioned effects is:

Definition 2.2.1 (Operator Application with Conditional Effects). Let PRE be the general precondition of an operator, and ADD and DEL the unconditioned effects. Let PREᵢ, ADDᵢ, DELᵢ, with i = 0 … n, be context conditions with their associated context effects. For a state s and an instantiated operator o, operator application is defined as

Res(o, s) = s \ (DEL(o) ∪ ⋃_{i∈M} DELᵢ(o)) ∪ (ADD(o) ∪ ⋃_{i∈M} ADDᵢ(o))

if PRE(o) ⊆ s and M = {i | PREᵢ(o) ⊆ s}.
Backward application is defined as

Res⁻¹(o, s) = s \ (ADD(o) ∪ ⋃_{i∈M} ADDᵢ(o)) ∪ (DEL(o) ∪ ⋃_{i∈M} DELᵢ(o)) ∪ (PRE(o) ∪ ⋃_{i∈M} PREᵢ(o))

if ADD(o) ⊆ s and M = {i | ADDᵢ(o) ⊆ s}.
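Definition 2.2.1 can be read operationally as set manipulation over ground literals. The following sketch is our own illustration under that reading (states and effect sets as Python sets of literal strings; the puton encoding is an assumption for the example, not the book's exact specification):

```python
def apply_operator(state, pre, add, dele, contexts):
    """Forward application in the style of definition 2.2.1.

    state: set of ground literals; pre/add/dele: unconditional parts;
    contexts: list of (pre_i, add_i, del_i) triples for conditioned effects.
    Returns the successor state, or None if PRE is not satisfied.
    """
    if not pre <= state:
        return None
    matched = [(a, d) for (p, a, d) in contexts if p <= state]
    all_del = dele.union(*(d for _, d in matched))
    all_add = add.union(*(a for a, _ in matched))
    return (state - all_del) | all_add

# puton(A, B, C): move A from C onto B; conditioned effects make
# C clear again and B no longer clear (both being blocks, not the table).
s = {"on(A,C)", "clear(A)", "clear(B)", "ontable(B)", "ontable(C)", "clear(Table)"}
s2 = apply_operator(
    s,
    pre={"on(A,C)", "clear(A)", "clear(B)"},
    add={"on(A,B)"}, dele={"on(A,C)"},
    contexts=[(frozenset({"on(A,C)"}), {"clear(C)"}, set()),    # C was a block under A
              (frozenset({"clear(B)"}), set(), {"clear(B)"})])  # B is a block, now covered
print(sorted(s2))
```

Only the contexts whose secondary preconditions hold in s contribute their secondary effects, which is exactly the role of the index set M in the definition.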

2.2.2 Situation Calculus

Situation calculus was introduced by McCarthy (McCarthy, 1963; McCarthy & Hayes, 1969) to describe state transitions in first-order logic. The world is conceived as a sequence of situations, and situations are generated from previous situations by actions. Situations are, like Strips state representations, necessarily incomplete representations of the states in the world.

Relations which can change over time are called fluents. Each fluent has an extra argument for representing a situation. For example, clear(a, s₁)⁴ denotes that block a is clear in a situation referred to as s₁. Changes in the world are represented by a function Res(action, situation) = situation. For example, we can write s₂ = Res(put(a, b), s₁) to describe that applying put(a, b) in situation s₁ results in a situation s₂. Note that the concept of fluents is also used in Strips. Although fluents are not specially marked there, it can be inferred from the operator effects which relational symbols correspond to fluent relations. We already used the Res-function to describe the semantics of operator applications in Strips (sect. 2.1.1). In situation calculus, the result of an operator application is not calculated by adding and deleting literals from a state but by logical inference.

The first implemented system using situation calculus for automated plan construction was proposed by Green (1969) with the QA3 system. Alternatively to the explicit use of a Res-function, not only fluents but also actions can be provided with an additional argument for situations. For example, we can write s₂ = put(a, b, s₁) to describe that block a is put on block b in situation s₁, resulting in a new situation s₂. Green's inference system is based on resolution.⁵ For example, the following two axioms might be given (the first axiom is a specific fact):

⁴ We represent constants with lowercase letters and variables with uppercase letters.
⁵ We do not introduce resolution in a formal way but give an illustrative example. We assume that the reader is familiar with the basic concepts of theorem proving as they are introduced in logic or AI textbooks.


A1 on(a, table, s₁)
A2 ∀S [on(a, table, S) → on(a, b, put(a, b, S))]
   ¬on(a, table, S) ∨ on(a, b, put(a, b, S)) (clausal form).

Green's theorem prover provides two results: first, it infers whether some formula representing a planning goal follows from the axioms, and second, it provides the action sequence (the plan) if the goal statement can be derived. Not only a yes/no answer but also a plan for how the goal can be achieved can be returned because a so-called answer literal is introduced. The answer literal is initially given as answer(S), and at each resolution step the variable S contained in the answer literal is instantiated in accordance with the involved formulas.

For example, we can ask the theorem prover whether there exists a situation S_F in which on(a, b, S_F) holds, given the pre-defined axioms. If such a situation S_F exists, answer(S_F) will be instantiated throughout the resolution proof with the plan. Resolution proofs work by contradiction. That is, we start with the negated goal:

1. ¬on(a, b, S_F) (Negation of the theorem)
2. ¬on(a, table, S) ∨ on(a, b, put(a, b, S)) (A2)
3. ¬on(a, table, S) (Resolve 1, 2)
   answer(put(a, b, S))
4. on(a, table, s₁) (A1)
5. contradiction (Resolve 3, 4)
   answer(put(a, b, s₁))

The resolution proof shows that a situation s₂ = put(a, b, s₁) with on(a, b, s₂) exists and that s₂ can be reached by putting a on b in situation s₁.

Situation calculus has the full expressive power of first-order predicate logic. Because logical inference does not rely on the closed-world assumption, specifying a planning domain involves much more effort than in Strips: In addition to axioms describing the effects of operator applications, frame axioms, describing which predicates remain unaffected by operator applications, have to be specified.

Frame axioms become necessary whenever the goal is not only a single literal but a conjunction of literals. For illustration, we extend the example given above:

A3 on(a, table, s₁)
A4 on(b, table, s₁)
A5 on(c, table, s₁)
A6 ¬on(X, table, S) ∨ on(X, Y, put(X, Y, S))
A7 ∀S [on(Y, Z, S) → on(Y, Z, put(X, Y, S))]
   ¬on(Y, Z, S) ∨ on(Y, Z, put(X, Y, S))
A8 ∀S [on(X, table, S) → on(X, table, put(Y, Z, S))]
   ¬on(X, table, S) ∨ on(X, table, put(Y, Z, S)).


Axiom A6 corresponds to axiom A2 given above, now stated in a more general form, abstracting from blocks a and b. Axiom A7 is a frame axiom, stating that a block Y is still lying on a block Z after a block X was put on block Y. Axiom A8 is a frame axiom, stating that a block X is still lying on the table if a block Y is put on a block Z. Note that the given axioms are only a subset of the information needed for modeling the blocks-world domain. A complete axiomatization is given in figure 2.6 (where we represent the axioms in a Prolog-like notation, writing the conclusion side of an implication on the left-hand side).

Effect Axioms:
on(X, Y, put(X, Y, S)) ← clear(X, S) ∧ clear(Y, S)
clear(Z, put(X, Y, S)) ← on(X, Z, S) ∧ clear(X, S) ∧ clear(Y, S)
clear(Y, puttable(X, S)) ← on(X, Y, S) ∧ clear(X, S)
ontable(X, puttable(X, S)) ← clear(X, S)

Frame Axioms:
clear(X, put(X, Y, S)) ← clear(X, S) ∧ clear(Y, S)
clear(Z, put(X, Y, S)) ← clear(X, S) ∧ clear(Y, S) ∧ clear(Z, S)
ontable(Y, put(X, Y, S)) ← clear(X, S) ∧ clear(Y, S) ∧ ontable(Y, S)
ontable(Z, put(X, Y, S)) ← clear(X, S) ∧ clear(Y, S) ∧ ontable(Z, S)
on(Y, Z, put(X, Y, S)) ← clear(X, S) ∧ clear(Y, S) ∧ on(Y, Z, S)
on(W, Z, put(X, Y, S)) ← clear(X, S) ∧ clear(Y, S) ∧ on(W, Z, S)
clear(Z, puttable(X, S)) ← clear(X, S) ∧ clear(Z, S)
ontable(Z, puttable(X, S)) ← clear(X, S) ∧ ontable(Z, S)
on(Y, Z, puttable(X, S)) ← clear(X, S) ∧ on(Y, Z, S)
clear(Z, puttable(X, S)) ← on(Y, X, S) ∧ clear(Y, S) ∧ clear(Z, S)
ontable(Z, puttable(X, S)) ← on(Y, X, S) ∧ clear(Y, S) ∧ ontable(Z, S)
on(W, Z, puttable(X, S)) ← on(Y, X, S) ∧ clear(Y, S) ∧ on(W, Z, S)

Facts (Initial State):
on(d, c, s₁)
on(c, a, s₁)
clear(d, s₁)
clear(b, s₁)
ontable(a, s₁)
ontable(b, s₁)

Theorem (Goal):
on(a, b, S) ∧ on(b, c, S)

Fig. 2.6. Representation of a Blocks-World Problem in Situation Calculus

For the goal ∃S_F [on(a, b, S_F) ∧ on(b, c, S_F)], the resolution proof is⁶:

1. ¬on(a, b, S_F) ∨ ¬on(b, c, S_F) (Negation of the theorem)
2. ¬on(X, table, S) ∨ on(X, Y, put(X, Y, S)) (A6)
3. ¬on(b, c, put(a, b, S)) ∨ ¬on(a, table, S) (Resolve 1, 2)
4. ¬on(Y, Z, S′) ∨ on(Y, Z, put(X, Y, S′)) (A7)
5. ¬on(a, table, S′) ∨ ¬on(b, c, S′) (Resolve 3, 4)
6. ¬on(X, table, S) ∨ on(X, Y, put(X, Y, S)) (A6)
7. ¬on(a, table, put(b, c, S)) ∨ ¬on(b, table, S) (Resolve 5, 6)
8. ¬on(X, table, S′) ∨ on(X, table, put(Y, Z, S′)) (A8)
9. ¬on(b, table, S) ∨ ¬on(a, table, S) (Resolve 7, 8)
10. on(b, table, s₁) (A4)
11. ¬on(a, table, s₁) (Resolve 9, 10)
12. on(a, table, s₁) (A3)
13. contradiction.

⁶ When introducing a new clause, we rename variables such that there can be no confusion, as usual for resolution.

Resolution gives us an inference rule for proving that a goal theorem follows from a set of axioms and facts. For automated plan construction, additionally a strategy which guides the search for the proof (i.e., the sequence in which axioms and facts are introduced into the resolution steps) is needed. An example of such a strategy is SLD-resolution as used in Prolog (Sterling & Shapiro, 1986). In general, finding a proof involves backtracking over the resolution steps and over the ordering of goals.

The main reason why Strips and not situation calculus became the standard for domain representations in planning is certainly that it is much more time-consuming and also much more error-prone to represent a domain using effect and frame axioms, in contrast to only modeling operator preconditions and effects. Additionally, special-purpose planning algorithms are naturally more efficient than general theorem provers. Furthermore, for a long time, the restricted expressiveness of Strips was considered sufficient for representing domains which are of interest in plan construction. When more interesting domains, for example domains involving resource constraints (see chap. 4), were considered in planning research, the expressiveness of Strips was extended to PDDL. But still, PDDL is a more restricted language than situation calculus. Due to the progress in automatic theorem proving over the last decade, the efficiency concerns which caused the prominence of state-based planning might no longer hold (Bibel, 1986). Therefore, it might be of interest again to compare current state-based and current deductive (situation calculus) approaches.

2.3 Basic Planning Algorithms

In general, a planning algorithm is a special-purpose search algorithm. State-based planners search in the state space, as briefly described in section 2.1.1. Another variant of planners, called partial-order planners, searches in the so-called plan space. Deductive planners search in the "proof space", i.e., the possible orderings of resolution steps. Basic search algorithms are introduced in all introductory algorithm textbooks, for example in Cormen, Leiserson, and Rivest (1990).

In the following, we will first introduce basic concepts for plan construction. Then forward planning is described, followed by a discussion of complexity results for planning and of formal properties of plans. Finally, we introduce backward planning.

2.3.1 Informal Introduction of Basic Concepts

The definitions for forward and backward operator application given above (def. 2.1.6 and def. 2.1.7) are the crucial component for plan construction: The transformation of a given state into a next state by operator application constitutes one planning step. In general, each planning step consists of matching all operators with the current state description, selecting one instantiated operator which is applicable in the current state, and applying this operator. Plan construction involves a series of such match-select-apply cycles. In each planning step one operator is selected for application; that is, plan construction is based on depth-first search, and operator selection is a backtrack point. During plan construction, a search tree is generated. State descriptions are nodes, action applications are arcs in the search tree. Each planning step expands the current leaf node s of the search tree by introducing a new action o and the state description s′ resulting from applying o to s. For forward planning, search starts with an initial state as root; for backward planning, search starts with the top-level goals (or a goal state) as root.

Input to a planning algorithm is a planning problem P(O, I, G), as defined in section 2.1.1. Output of a planning algorithm is a plan. For basic Strips planning, a plan is a sequence of actions transforming an initial state into a state fulfilling the top-level goals. Such an executable plan is also called a solution. More generally, a plan is a set of operators together with a set of binding constraints for the variables occurring in the operators and a set of ordering constraints defining the sequence of operator applications. If the plan is not executable, it is also called a partial plan. A plan is not executable if it does not contain all operators necessary to transform an initial state into a goal state, if not all variables are instantiated, or if there is no complete ordering of the operators.

When implementing a planner, at least the following functionalities must be provided:

- A pattern-matcher and possibly a mechanism for dealing with free variables (i.e., variables which are not bound by matching an operator with a set of atoms).
- A strategy for selecting an applicable action and for handling backtracking (taking back an earlier commitment).
- A mechanism for calculating the effect of an operator application.
- A data structure for storing the partially constructed plan.


- A data structure for holding the information necessary to detect cycles (i.e., states which were already generated) to guarantee termination.

Some planning systems calculate a set of actions before plan construction by instantiating the operators with the given constant symbols. As a consequence, during plan construction, matching is replaced by a simple subset test, where it is checked whether the set of instantiated preconditions of an operator is completely contained in the set of atoms representing the current state. Instantiation of operators is realized with respect to the (possibly typed) constants defined for a problem, by extracting them from a given initial state or by using the set of explicitly declared constants. In general, such a "freely generated" set of actions is a super-set of the set of "legal" actions.

Often planning algorithms do not directly construct a plan as a fully instantiated, totally ordered sequence of actions. Some planners construct partially ordered plans where some actions, for which no order constraints were determined during planning, are "parallel". Some planners store plans as part of a more general data structure. In such cases, additional functionalities for plan linearization and/or plan extraction must be provided.

2.3.2 Forward Planning

Plan construction based on forward operator application starts with an initial state as root of the search tree. Plan construction terminates successfully if a state is found which satisfies all top-level planning goals. Forward planning is also called progression planning. An informal forward planning algorithm is given in table 2.1.

Table 2.1. Informal Description of Forward Planning

Until the top-level goals are satisfied in a state or until all possible states are explored DO
For the current state s:
  MATCH: For all operators op ∈ O calculate all substitutions such that the operator preconditions are contained in s. Generate a set of action candidates A = {o | PRE(o) ⊆ s}.
  SELECT: Select one element o from A.
  APPLY: Calculate the successor state Res(o, s) = s′.
  BACKTRACK: If there is no successor state (A is empty), go back to the predecessor state of s. If the generated successor state is already contained in the plan, select another element from A.
  PROCEED: Otherwise, insert the selected action into the plan and proceed with s′ as the current state.

In this algorithm, we abstract from the data structure for saving a plan and from the data structures necessary for handling backtracking and for detecting cycles. One possibility would be to put in each planning step an action/successor-state pair on a stack. When the algorithm terminates successfully, the plan corresponds to the sequence of actions on the stack. For backtracking and cycle detection a second data structure is needed in which all generated and rejected states are saved. Both pieces of information can be represented together if the complete search history is saved explicitly in a search tree (see Winston, 1992, chap. 4). A search tree can be represented as a list of lists:

Definition 2.3.1 (The Data Structure "Search Tree"). A search tree ST can be represented as a list of lists, where each list represents a path: ST = nil | cons(path, ST) with

empty(ST) = true if ST = nil, false otherwise,

and selector functions first(cons(path, ST)) = path and rest(cons(path, ST)) = ST.
A (partially expanded) path is defined as a list of action-state pairs (o s). The root (initial state) is represented as (nil s). The selector function getaction((o s)) returns the first, and the selector function getstate((o s)) the second argument of an action-state pair.
A path is defined as: path = nil | rcons(path, pair), with selector function last(rcons(path, pair)) = pair.
With getstate(last(path)) a leaf of the search tree is retrieved. For a current state s = getstate(last(path)) and a list of action candidates A, a path is expanded by expand(path, A) = rcons(path, cons(o, s′)) for all o ∈ A with Res(o, s) = s′.
A plan can be extracted from a path with getplan(path) = map(λ(x). getaction(x), path), where the higher-order function map describes that the function getaction is applied to the sequence of all elements in path.
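Definition 2.3.1 translates almost directly into nested lists. The following is our own minimal rendering, with actions and states as plain strings and the root's nil action as None, not the book's implementation:

```python
# A path is a list of (action, state) pairs; the root pair has action None.
# A search tree is a list of paths; the "left-most" path is expanded first.

def getaction(pair): return pair[0]
def getstate(pair): return pair[1]

def expand(path, candidates):
    """Expand a path by every applicable (action, successor-state) pair."""
    return [path + [(o, s2)] for (o, s2) in candidates]

def getplan(path):
    """Extract the action sequence, skipping the root's nil action."""
    return [getaction(p) for p in path if getaction(p) is not None]

root = [(None, "s0")]
tree = expand(root, [("put(B,C)", "s1"), ("puttable(C)", "s2")])
print(getplan(tree[0]))  # ['put(B,C)']
```

Here expand returns one path per candidate, so inserting all of them into the tree records every backtrack point, as required by the fleshed-out planner below.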

A "fleshed-out" version of the algorithm in table 2.1 is given in table 2.2. For saving the possible backtrack points, now all actions which are applicable in a state are inserted into the search tree, together with their corresponding successor states. That is, selection of an action is delayed one step. The selection strategy is to always expand the first ("left-most") path in the search tree; that is, the algorithm is still depth-first. To guarantee termination, it is sufficient for the cycle test to check whether an action results in a state which is already contained in the current path. Alternatively, it could be checked whether the new state is already contained anywhere else in the search tree. This extended cycle check makes sure that a state is always reached with the shortest possible action sequence, but it possibly involves more backtracking. Note that the extended cycle test does not result in optimal plans: although every state already included in the search tree is reached with the shortest possible action sequence, the optimal path might be found in a not yet expanded part of the search tree which does not include this state.

An example for plan construction by forward search is given in figure 2.7. We use the operators as defined in figure 2.2. The top-level goals are on(A, B), on(B, C), and the initial state is on(A, C), ontable(B), ontable(C), clear(A), clear(B). For better readability, the states are presented graphically and not as sets of literals. All generated and detected cycles are presented in the figure, but backtracking is explicitly depicted only for the first two. The generation of action candidates is based on an ordering of put before puttable, and instantiation is based on alphabetical order (A before B before C).

Table 2.2. A Simple Forward Planner

Main Function fwplan(O, G, ST)
Initial Function Call: fwplan(O, G, [[(nil s)]]) with a set of operators O, an initial state s ∈ I, and a set of top-level goals G
Return: A plan as a sequence of actions which transforms s into a state s_G with G ⊆ s_G, extracted from search tree ST.

1. IF empty(ST) THEN "no plan found"
2. ELSE LET the current state be S = getstate(last(first(ST))).
   (corresponds to SELECT action getaction(last(first(ST))))
   a) IF G ⊆ S THEN getplan(first(ST))
   b) ELSE
      i. MATCH: For all op ∈ O calculate match(op, S) as in def. 2.1.3. LET A = {o₁, …, oₙ} be the list of all instantiated operators with PRE(oᵢ) ⊆ S.
      ii. APPLY: For all oᵢ ∈ A calculate S′ᵢ = Res(oᵢ, S) as in def. 2.1.6. LET AR = {(o₁ S′₁), …, (oₙ S′ₙ)}.
      iii. CYCLE-TEST: Remove all pairs (oᵢ S′ᵢ) from AR where S′ᵢ is contained in first(ST).
      iv. Recursive call:
         IF empty(AR) THEN fwplan(O, G, rest(ST)) (BACKTRACK)
         ELSE fwplan(O, G, append(expand(first(ST), AR), rest(ST))).
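The control structure of fwplan in table 2.2 can be sketched compactly. This is our own illustration, not the book's implementation: matching and operator application are abstracted into a successors function that maps a state to its (action, successor-state) pairs, and the cycle test is restricted to the current path:

```python
def fwplan(successors, goal_test, tree):
    """Depth-first forward planning over a search tree (list of paths).

    successors(state) -> list of (action, next_state) pairs;
    goal_test(state) -> bool. Returns a plan (list of actions) or None.
    """
    if not tree:
        return None                      # "no plan found"
    path, rest = tree[0], tree[1:]
    state = path[-1][1]                  # getstate(last(first(ST)))
    if goal_test(state):
        return [a for a, _ in path if a is not None]
    visited = {s for _, s in path}       # cycle test restricted to the path
    ar = [(o, s2) for o, s2 in successors(state) if s2 not in visited]
    if not ar:
        return fwplan(successors, goal_test, rest)   # BACKTRACK
    return fwplan(successors, goal_test,
                  [path + [(o, s2)] for o, s2 in ar] + rest)

# A toy three-state domain: s0 -a-> s1 -b-> s2 (goal), s1 -c-> s0 (cycle).
succ = {"s0": [("a", "s1")], "s1": [("c", "s0"), ("b", "s2")], "s2": []}
plan = fwplan(lambda s: succ[s], lambda s: s == "s2", [[(None, "s0")]])
print(plan)  # ['a', 'b']
```

The cycle back to s0 is filtered out by the path test, so the only surviving candidate from s1 is b, which reaches the goal.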

For the given sequence in which action candidates are expanded, search is very inefficient. In fact, all possible states of the blocks-world are generated to find the solution. In general, uninformed forward search has a high branching factor. Forward search can be made more efficient if some information about the probable distance of a state to a goal state can be used. The best-known heuristic forward search algorithm is A* (Nilsson, 1971), where a lower-bound estimate for the distance from the current state to the goal is used to guide the search. For domain-independent planning, in contrast to specialized problem solving, such domain-specific information is not available. An approach to generating such estimates in a pre-processing step to planning was presented by Bonet and Geffner (1999).

Fig. 2.7. A Forward Search Tree for Blocks-World (figure: the forward search tree over states s0 to s14, annotated with detected cycles, unused backtrack points, and the goal state; graphics not reproduced)

2.3.3 Formal Properties of Planning

2.3.3.1 Complexity of Planning. Even if a search tree can be kept smaller than in the example given above, in the worst case planning is NP-complete. That is, a (deterministic) algorithm needs exponential time for finding a solution (or for deciding that a problem is not solvable). For a search tree with a maximal depth of n and a maximal branching factor of m, in the worst case the planning effort is mⁿ, i.e., the complete search tree has to be generated.⁷ Even worse, for problems without a restriction on the maximal length of a possible solution, planning is PSPACE-complete. That is, even for non-deterministic polynomial algorithms an exponential amount of memory is needed (Garey & Johnson, 1979).

Because in the worst case all possible states have to be generated for plan construction, the number of nodes in the search tree is approximately equivalent to the number of problem states. For problems where the number of states grows systematically with the number of objects involved, the maximal size of the search tree can be calculated exactly. For example, for the Tower of Hanoi domain⁸ with three pegs and n discs, the number of states is 3ⁿ (and the minimal depth of a solution path is 2ⁿ − 1).

The blocks-world domain is similar to the Tower of Hanoi domain but has fewer constraints. First, each block can be put on each other block (instead of a disc only being allowed on another disc if it is smaller), and second, the table is assumed to always have additional free space for a block (instead of only three pegs on which discs can be put). Enumeration of all states of the blocks-world domain with n blocks corresponds to the abstract problem of generating the set of all possible sets of lists which can be constructed from n elements. For example, for three blocks A, B, C:

{ {(A B C)}, {(A C B)}, {(B A C)}, {(B C A)}, {(C A B)}, {(C B A)},
  {(B C), (A)}, {(C B), (A)}, {(A C), (B)}, {(C A), (B)}, {(A B), (C)}, {(B A), (C)},
  {(A), (B), (C)} }.

A single list with three elements represents a single tower, for example a tower with block A on B and B on C for the first element given above. A set of two lists represents two towers on the table, for example block B lying on block C and A as a one-block tower. The number of sets of lists corresponds to the so-called Lah number (Knuth, 1992).⁹ It can be calculated by the following formula:

a(0) = 0
a(1) = 1
a(n) = (2n − 1) · a(n − 1) − (n − 1)(n − 2) · a(n − 2).

The growth of the number of states for the blocks-world domain is given for up to sixteen blocks in table 2.3.

⁷ Note that this only holds for finite planning domains where all states are enumerable. For more complex domains involving relational symbols over arbitrary numerical arguments, the search tree becomes infinite and therefore other criteria have to be introduced for termination (making planning incomplete, see below).
⁸ The Tower of Hanoi domain is introduced in chapter 4.
⁹ More background information can be found at http://www.research.att.com/cgi-bin/access.cgi/as/njas/sequences/eisA.cgi?Anum=000262.
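The recurrence can be checked directly against the first entries of table 2.3; a short sketch (our own, using the base cases a(0) = 0, a(1) = 1 as given above):

```python
def blocksworld_states(n):
    """Number of sets of lists over n elements (recurrence from the text)."""
    a = [0, 1]
    for k in range(2, n + 1):
        a.append((2 * k - 1) * a[k - 1] - (k - 1) * (k - 2) * a[k - 2])
    return a[n]

print([blocksworld_states(n) for n in range(1, 6)])  # [1, 3, 13, 73, 501]
```

These values match the first column block of table 2.3.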

Table 2.3. Number of States in the Blocks-World Domain

# blocks   1         2         3         4         5
# states   1         3         13        73        501
approx.    1.0·10⁰   3.0·10⁰   1.3·10¹   7.3·10¹   5.0·10²

# blocks   6         7         8         9         10
# states   4051      37633     394353    4596553   58941091
approx.    4.1·10³   3.8·10⁴   3.9·10⁵   4.6·10⁶   5.9·10⁷

# blocks   11          12           13            14             15
# states   824073141   12470162233  202976401213  3535017524403  65573803186921
approx.    8.2·10⁸     1.2·10¹⁰     2.0·10¹¹      3.5·10¹²       6.6·10¹³

# blocks   16                 …
# states   1290434218669921   …
approx.    1.3·10¹⁵

The search tree might be even larger than the number of legal states of a problem, i.e., the number of nodes in the state space. First, some states can be constructed more than once (if cycle detection is restricted to paths), and second, for some planning algorithms, "illegal" states which do not belong to the state space might be constructed. Backward planning by goal regression can lead to such illegal states (see sect. 2.3.4).

An analysis of the complexity of Strips planning is given by Bylander (1994). Another approach to planning is that the state space (or that part of it which might contain the searched-for solution) is already given. The task of the planner is then to extract a solution from the search space (this strategy is used by Graphplan, see sect. 2.4.2.1). This problem is still NP-complete!

Because planning is an inherently hard problem, no planning approach can outperform alternative approaches in every domain. Instead, different planning approaches are better suited for different kinds of domains.

2.3.3.2 Termination, Soundness, Completeness. A planning algorithm should, like every search algorithm, be correct and complete:

Definition 2.3.2 (Soundness). A planning algorithm is sound if it only generates legal solutions for a planning problem. That is, the generated sequence of actions transforms the given initial state into a state satisfying the given top-level goals. Soundness implies that the generated plans are consistent: A state generated by applying an action to a previous state is consistent if it does not contain contradictory literals, i.e., if it belongs to the domain. A solution is consistent if it does not contain contradictions with regard to variable bindings and to the ordering of states.

Proof of soundness relies on the operational semantics given for operator application (such as our definitions 2.1.6 and 2.1.7). The proof can be performed by induction over the length of the sequences of actions.

Correctness follows from soundness and termination:

Definition 2.3.3 (Correctness). A search algorithm is correct if it is sound and termination is guaranteed.


To prove termination, it has to be shown that with each plan step the search space is reduced such that the termination conditions given for the planning algorithm are eventually reached. For finite domains and a cycle test covering the search tree, the planner always terminates when the complete search tree has been constructed.

Besides making sure that a planner always terminates and that, if it returns a plan, that plan is a solution to the input problem, it is desirable that the planner returns a solution for all problems where a solution exists:

Definition 2.3.4 (Completeness). A search algorithm is complete if it finds a solution whenever such a solution exists. That is, if the algorithm terminates without a solution, then the planning problem has no solution.

To prove completeness, it has to be shown that the planner terminates without a solution only if no solution exists for the problem.

We will give proofs of correctness and completeness for our planner DPlan in chapter 3.

Backward planning algorithms based on a linear strategy are incomplete. The Sussman anomaly is based on that incompleteness. Incompleteness of linear backward planning is discussed in section 2.3.4.2.

2.3.3.3 Optimality. When abstracting from different costs (such as time or energy use) of operator applications, optimality of a plan is defined with respect to its length. Operator applications are assumed to have uniform costs (for example, one unit per application), and the cost of a plan is equivalent to the number of actions it contains.

Definition 2.3.5 (Optimality of a Plan). A plan is optimal if each other plan which is a solution for the given problem has equal or greater length.

For operators involving different (positive) costs, a plan is optimal if for each other plan which is a solution the sum of the costs of the actions in the plan is equal or higher.

For uninformed planning based on depth-first search, as described in section 2.3.2, optimality of plans cannot be guaranteed. In fact, there is a trade-off between efficiency and optimality: to generate optimal plans, the state space has to be searched more exhaustively. The obvious search strategy for obtaining optimal plans is breadth-first search.¹⁰ Universal planning (see sect. 2.4.4) for deterministic domains results in optimal plans.
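The breadth-first guarantee can be illustrated with a minimal sketch (our own toy example; the successor function and state names are assumptions for illustration, not a domain from the text):

```python
from collections import deque

def bfs_plan(successors, goal_test, init):
    """Breadth-first forward search: the first plan found is a shortest one."""
    queue = deque([(init, [])])
    visited = {init}
    while queue:
        state, plan = queue.popleft()
        if goal_test(state):
            return plan
        for action, nxt in successors(state):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, plan + [action]))
    return None

# s0 has a long route (a, b, c) and a short route (d) to the goal s3;
# breadth-first search returns the one-step plan, depth-first might not.
succ = {"s0": [("a", "s1"), ("d", "s3")], "s1": [("b", "s2")],
        "s2": [("c", "s3")], "s3": []}
print(bfs_plan(lambda s: succ[s], lambda s: s == "s3", "s0"))  # ['d']
```

Because states are expanded in order of increasing plan length, the first goal state dequeued is reached by a plan of minimal length, at the price of keeping the whole frontier in memory.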

¹⁰ Alternatively, depth-first search can be extended such that all plans are generated, always keeping the current shortest plan.

36 2. State-Based Planning

2.3.4 Backward Planning

Backward planning often results in smaller search trees than forward planning because the top-level goals can be used to guide the search. For a long time, planning was used as a synonym for backward planning, while forward planning was often associated with problem solving (see sect. 2.4.5.1). For backward planning, plan construction starts with the top-level goals, and in each planning step a predecessor state is generated by backward operator application. Backward planning is also called regression planning.

Before we go into the details of backward planning, we present the backward search tree for the example we already presented for forward search (see fig. 2.8). The backward search tree is slightly smaller than the forward search tree. Ignoring the states which were generated as backtrack-points, in forward search thirteen states have to be visited, in backward search nine. In general, the savings can be much higher.

Fig. 2.8. A Backward Search Tree for Blocks-World

There is a price to pay for obtaining smaller search trees. The problem is that a backward operator application can produce an inconsistent state description (see def. 2.3.4). We will discuss this problem in section 3.3 in chapter 3, in the context of our state-based non-linear backward planner DPlan. In the following, we will discuss the classic (Strips) approach to backward planning using goal regression, together with the problem of detecting and eliminating inconsistent state descriptions. Furthermore, we will address the problem of incompleteness of classical linear backward planning and present non-linear planning, which is complete.

2.3.4.1 Goal Regression. Regression planning starts with the top-level goals of a planning problem as input (root of the search tree). For a planning problem involving three blocks A, B, and C and the top-level goals G = {on(A, B), on(B, C)}, the state depicted in the root node in figure 2.8 is the only legal goal state, with G ⊆ s = {on(A, B), on(B, C), ontable(C), clear(A)}. In contrast to forward planning, planning does not start with a complete state description s but with a partial state description G. A planning step is realized by goal regression:

Definition 2.3.6 (Goal Regression). Let G be a set of (goal) literals and o an instantiated operator.

If for an atom p ∈ G holds p ∈ ADD(o), then the goal corresponding to p can be replaced by true, which is equivalent to removing p from G.

If for an atom p ∈ G holds p ∈ DEL(o), then the goal corresponding to p is destroyed, that is, it must be replaced by false, which is equivalent to replacing formula G by false.

If p ∈ G is neither contained in ADD(o) nor in DEL(o), nothing is changed.
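The cases of definition 2.3.6 can be sketched as a small function (a sketch, not an implementation from the book; operators are assumed to be given as sets of ground atoms, and the preconditions that regression introduces as new subgoals are left to the surrounding planner):

```python
def regress(goals, add, delete):
    """One goal-regression step (cf. def. 2.3.6).

    Returns None (i.e. 'false') if the operator deletes a goal atom;
    otherwise the goal set with the atoms achieved by ADD removed.
    The operator's preconditions would be added as new subgoals by
    the surrounding planner.
    """
    if goals & delete:        # some p in G is in DEL(o): G becomes false
        return None
    return goals - add        # every p in ADD(o) is replaced by true

# Regressing G = {on(A, B), on(B, C)} through put(A, B), assuming
# ADD = {on(A, B)} and DEL = {clear(B), ontable(A)}:
G = {"on(A, B)", "on(B, C)"}
remaining = regress(G, add={"on(A, B)"}, delete={"clear(B)", "ontable(A)"})
# remaining is {"on(B, C)"}; regressing instead through puttable(B),
# which deletes on(B, C), would yield None (an inconsistent node).
```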

If a formula is reduced to false, an inconsistent state description is detected. Another possibility to deal with inconsistency is to introduce domain axioms and deduce contradictions by theorem proving. We give the beginning of the search tree using goal regression in figure 2.9. A complete regression tree using the domain specification given in figure 2.4 is given in (Nilsson, 1980, pp. 293-295). Using goal regression, free variables can occur when new subgoals are introduced. Typically, these variables can be instantiated when a subgoal expression is reached which matches the initial state. Depending on the domain, a lot of effort might go into dealing with inconsistent states. If more complex operator specifications such as conditional effects and all-quantified expressions are allowed, the number of impossible states which are constructed and must be detected during plan construction can explode.

To summarize, there are the following differences between forward and backward planning:

Complete vs. Partial State Descriptions. Forward planning starts with a complete state representation (the initial state), while backward planning starts with a partial state representation (the top-level goals, which might contain variables).


Fig. 2.9. Goal-Regression for Blocks-World

Consistency of State Descriptions. A planning step in forward planning always generates a consistent successor state. Soundness of forward planning follows easily from the soundness of the planning steps. A planning step in backward planning can result in an inconsistent state description. In general, a planner might not detect all inconsistencies. To prove soundness, it must be shown that, if a plan is returned, all intermediate state representations on the path from the top-level goals to the initial state are consistent.

Variable Bindings. In forward planning, all constructed state representations are fully instantiated. This is due to the way in which planning operators are defined: usually, all variables occurring in an operator are bound by the precondition. In backward planning, newly introduced subgoals (i. e., preconditions of an operator) might contain variables which are not bound by matching the current subgoals with an operator.

In chapter 3 we will introduce a backward planning strategy based on complete state descriptions: planning starts with a goal state instead of with the top-level goals. An operator is applicable if all elements of its ADD-list match the current state. Operator application is performed by removing all elements of the ADD-list from the current description, i. e., all literals are reduced to true in one step, and by adding the union of preconditions and DEL-list (see def. 2.1.7). Since usually the elements of the DEL-list are a subset of the elements of the preconditions, this state-based backward operator application can be seen as a special kind of goal regression.
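This state-based backward application can be sketched directly (a sketch under the assumptions of the text; the concrete precondition, ADD- and DEL-lists for put(A, B) below are illustrative guesses, not necessarily the book's exact operator specification):

```python
def backward_apply(state, pre, add, delete):
    """State-based backward operator application: applicable iff all
    ADD-effects hold in the current state; the predecessor state is
    obtained by removing the ADD-list and restoring pre and DEL."""
    if not add <= state:        # not all ADD-effects hold: not applicable
        return None
    return (state - add) | pre | delete

goal_state = {"on(A, B)", "on(B, C)", "ontable(C)", "clear(A)"}
pred = backward_apply(goal_state,
                      pre={"clear(A)", "clear(B)", "ontable(A)"},
                      add={"on(A, B)"},
                      delete={"clear(B)", "ontable(A)"})
# pred is the consistent predecessor state with A still on the table:
# {on(B, C), ontable(C), clear(A), clear(B), ontable(A)}
```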

A further difference between forward and backward planning is:

Goal Ordering. In forward planning, it must be decided which of the action candidates whose preconditions are satisfied in the current state is applied. In backward planning, it must be decided which (sub-)goal of a list of goals is considered next. That is, backward planning involves goal ordering.

In the following, we will show that the original linear strategy for backward planning is incomplete.

2.3.4.2 Incompleteness of Linear Planning. A planning strategy is called linear if it does not allow interleaving of sub-goals. That means that plan construction is based on the assumption that a problem can be solved by solving each goal separately (Sacerdoti, 1975). This assumption only holds for independent goals (Nilsson (1980) uses the term "commutativity"; see also Georgeff, 1987).

A famous demonstration of the incompleteness of linear backward planning is the Sussman anomaly (see Waldinger, 1977, for a discussion), illustrated in figure 2.10. Here, the linear strategy does not work, regardless of the order in which the sub-goals are approached. If B is put on C, A is covered by these two blocks and can only be moved if the goal on(B, C) is destroyed. For reaching the goal on(A, B), first A has to be cleared by putting C on the table; if A is subsequently put on B, C cannot be moved under B without destroying the goal on(A, B).

Fig. 2.10. The Sussman Anomaly

Linear planning corresponds to dealing with goals organized in a stack:

[on(A, B), on(B, C)]
try to satisfy goal on(A, B)
solve sub-goals [clear(A), clear(B)]¹¹
all sub-goals hold after puttable(C)
apply put(A, B)
goal on(A, B) is reached
try to satisfy goal on(B, C).

Interleaving of goals, also called non-linear planning, allows that a sequence of planning steps dealing with one goal is interrupted to deal with another goal. For the Sussman anomaly, that means that after block C is put on the table pursuing goal on(A, B), the planner switches to the goal on(B, C). Non-linear planning corresponds to dealing with goals organized in a set:

{on(A, B), on(B, C)}
try to satisfy goal on(A, B)
{clear(A), clear(B), on(A, B), on(B, C)}
clear(A) and clear(B) hold after puttable(C)
try to satisfy goal on(B, C)
apply put(B, C)
try to satisfy goal on(A, B)
apply put(A, B).

The correct sequence of goals might not be found immediately but may involve backtracking.
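The interleaved goal order can be checked with a minimal STRIPS-style simulation (a sketch only; the operator specifications below are plausible blocks-world definitions and not necessarily those used elsewhere in the book):

```python
def apply_op(state, pre, add, delete):
    """Forward application of a ground operator given as literal sets."""
    assert pre <= state, "precondition violated"
    return (state - delete) | add

def puttable(x, below):   # move x from block 'below' onto the table
    return ({f"clear({x})", f"on({x}, {below})"},           # pre
            {f"ontable({x})", f"clear({below})"},           # ADD
            {f"on({x}, {below})"})                          # DEL

def put(x, y):            # move x from the table onto block y
    return ({f"clear({x})", f"clear({y})", f"ontable({x})"},
            {f"on({x}, {y})"},
            {f"clear({y})", f"ontable({x})"})

# Initial state of the Sussman anomaly: C is on A; A and B on the table.
state = {"on(C, A)", "ontable(A)", "ontable(B)", "clear(C)", "clear(B)"}
for op in (puttable("C", "A"), put("B", "C"), put("A", "B")):
    state = apply_op(state, *op)
# Both top-level goals, on(A, B) and on(B, C), hold in the final state.
```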

Another example illustrating the incompleteness of linear planning was presented by Veloso and Carbonell (1993): given are a rocket and several packages, together with operators for loading and unloading packages into and from the rocket and an operator for shooting the rocket to the moon, but no operator for driving the rocket back from the moon. The planning goal is to transport some packages, for example at(PackA, Moon), at(PackB, Moon), at(PackC, Moon). If the goals are addressed in a linear way, one package would be loaded into the rocket, the rocket would go to the moon, and the package would be unloaded. The rest of the packages could never be delivered! The correct plan for this problem is to load all packages into the rocket before the rocket moves to its destination. We will give a plan for the rocket domain in chapter 3, and we will show in chapter 8 how this strategy for solving problems of the rocket domain can be learned from some initial planning experience.

A second source of incompleteness arises if a planner instantiates variables in an eager way, also called strong commitment planning. An illustration with a register-swapping problem is given in (Nilsson, 1980, pp. 305-307). A similar problem, sorting of arrays, and its solution is discussed in (Waldinger, 1977). We will introduce sorting problems in chapter 3. Modern planners are based on a non-linear, least commitment strategy.

¹¹ We ignore the additional subgoal ontable(A) resp. on(A, z) here.


2.4 Planning Systems

In this section we give a short overview of the history of and recent trends in planning research. We will give more detailed descriptions only for those research areas which are relevant for the later parts of the book. Otherwise, we will only give short characterizations together with pointers to the literature.

2.4.1 Classical Approaches

2.4.1.1 Strips Planning. The first well-known planning system was Strips (Fikes & Nilsson, 1971). It integrated concepts developed in the area of problem solving by state-space search and means-end analysis as realized in the General Problem Solver (GPS) of Newell and Simon (1961) (see sect. 2.4.5.1), and concepts from theorem proving and situation calculus, namely the QA3 system of Green (1969). As discussed above, the basic notions of the Strips language are still the core of modern planning languages, such as PDDL. The Strips planning algorithm, based on goal regression and linear planning as described above, was on the other hand replaced at the end of the eighties by non-linear, least-commitment approaches. In the seventies and eighties, a lot of work addressed problems with the original Strips approach, for example (Waldinger, 1977; Lifschitz, 1987).

2.4.1.2 Deductive Planning. The deductive approach to planning as theorem proving introduced by Green was pursued through the seventies and eighties mainly by Manna and Waldinger (Manna & Waldinger, 1987). Manna and Waldinger combined deductive planning and deductive program synthesis, and we will discuss their work in chapter 6. Current deductive approaches are based on so-called action languages, where actions are represented as temporal logic formulas. Here, plan construction is a process of reasoning about change (Gelfond & Lifschitz, 1993). At the end of the nineties, symbolic model checking was introduced as an approach to planning as verification of temporal formulas in a semantic model (Giunchiglia & Traverso, 1999). Symbolic model checking is mainly applied in universal planning for deterministic and non-deterministic domains (see sect. 2.4.4).

2.4.1.3 Partial Order Planning. Also in the seventies, the first partial order planner (NOAH) was presented by Sacerdoti (1975).¹² Partial order planning is based on a search in the space of (incomplete) plans. Search starts with a plan containing only the initial state and the top-level goals. In each planning step, the plan is refined by either introducing an action fulfilling a goal or a precondition of another action, or by introducing an ordering that puts one action in front of another action, or by instantiating a previously unbound variable. Partial order planners are based on a non-linear strategy: in each planning step an arbitrary goal or precondition can be focussed. Furthermore, the least commitment strategy, i. e., refraining from committing to a specific ordering of planning steps or to a specific instantiation of a variable as long as possible, was introduced in the context of partial order planning (Penberthy & Weld, 1992).

¹² Sacerdoti called NOAH a hierarchical planner. Today, hierarchical planning means that a plan is constructed on different levels of abstraction (see sect. 2.4.3.1).

The resulting plan is usually not a totally but only a partially ordered set of actions. Actions for which no ordering constraints occurred during plan construction remain unordered. A totally ordered plan can be extracted from the partially ordered plan by putting parallel (i. e., independent) steps in an arbitrary order. An overview of partial order planning, together with a survey of the most important contributions to this research, is given in (Russell & Norvig, 1995, chap. 11). Partial order planning was the dominant approach to planning from the end of the eighties to the mid-nineties. The main contribution of partial order planning is the introduction of non-linear planning and least commitment.

2.4.1.4 Total Order Non-Linear Planning. Also at the end of the eighties, the Prodigy system was introduced (Veloso, Carbonell, Perez, Borrajo, Fink, & Blythe, 1995). Prodigy is a state-based, total-order planner based on a non-linear strategy. Prodigy is more a framework for planning than a single planning system. It allows the selection of a variety of planning strategies and, more importantly, it allows search to be guided by domain-specific control knowledge, turning a domain-independent planner into a more efficient domain-specific one. Prodigy includes different strategies for learning such control knowledge from experience (see sect. 2.5.2) and offers techniques for reusing already constructed plans in the context of new problems, based on analogical reasoning.

Two other total-order, non-linear backward planners are HSPr (Haslum & Geffner, 2000) and DPlan (Schmid & Wysotzki, 2000b), which we will present in detail in chapter 3.

2.4.2 Current Approaches

2.4.2.1 Graphplan and Derivatives. Since the mid-nineties, a new, more efficient generation of planning algorithms dominates the field. The Graphplan approach presented by Blum and Furst (1997) can be seen as the starting point of this development. Graphplan deals with planning as a network flow problem (Cormen et al., 1990). The process of plan construction is divided into two parts: first, a so-called planning graph is constructed by forward search; second, a partial order plan is extracted by backward search. Plan extraction from a planning graph of fixed size corresponds to a bounded-length plan construction problem.

The planning graph can be seen as a partial representation of the state-space of a problem, representing a sub-graph of the state-space which contains paths from the given initial state to the given planning goals. An example of a planning graph is given in figure 2.11. It contains fully instantiated literals and actions. But, in contrast to a state-space representation, nodes in the graph are not states, i. e., conjunctions of literals, but single literals (propositions). Because the number of different literals in a domain is considerably smaller than the number of different sub-sets of those literals (i. e., state representations), the size of a planning graph does not grow exponentially (see sect. 2.3.3.1).

Fig. 2.11. Part of a Planning Graph as Constructed by Graphplan

The planning graph is organized level-wise. The first level is the set of propositions contained in the initial state; the next level is a set of actions. A proposition from the first level is connected with an action in the second level if it is a precondition for this action. The next level contains propositions again: for each action of the preceding level, all propositions contained in its ADD-list are introduced (and connected with the action). The planning graph contains so-called noop actions at each level which just pass a literal from level k to level k + 2. Furthermore, so-called mutex relations are introduced at each level, representing (an incomplete set of) propositions or actions which are mutually exclusive on a given level. Two actions are mutex if they interfere (one action deletes a precondition or ADD-effect of the other) or have competing needs (mutually exclusive preconditions). Two propositions p and q are mutex if each action having an add-edge to proposition p is marked as mutex of each action having an add-edge to proposition q. Construction of a planning graph terminates when for the first time a level contains all literals occurring in the planning goal and these literals are not mutex. If backward plan extraction fails, the planning graph is extended by one level.
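One expansion step of the planning graph can be sketched as follows (a sketch that ignores the mutex bookkeeping; encoding operators as (name, pre, ADD, DEL) tuples is an assumption matching the chapter's Strips notation):

```python
def expand_level(props, operators):
    """Compute one action level and the following proposition level.

    An operator becomes an action node if all its preconditions are
    present at the current level; noop actions pass every literal on."""
    actions = [op for op in operators if op[1] <= props]
    actions += [(f"noop({p})", {p}, {p}, set()) for p in props]
    next_props = set().union(*(add for _, _, add, _ in actions))
    return actions, next_props

# The two actions of figure 2.11, with assumed Strips specifications:
ops = [("puttable(B)", {"clear(B)", "on(B, A)"},
        {"ontable(B)", "clear(A)"}, {"on(B, A)"}),
       ("put(C, B)", {"clear(C)", "clear(B)", "ontable(C)"},
        {"on(C, B)"}, {"clear(B)", "ontable(C)"})]
level1 = {"on(B, A)", "ontable(A)", "ontable(C)", "clear(B)", "clear(C)"}
acts, level3 = expand_level(level1, ops)
# level3 additionally contains ontable(B), clear(A), and on(C, B),
# while all level-1 propositions are passed on by the noop actions.
```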


Originally, Graphplan was developed for the Strips planning language. Koehler et al. (1997) presented an extension to conditional and universally quantified operator effects (system IPP). Another successful Graphplan-based system is STAN (Long & Fox, 1999). After Graphplan, a variety of so-called compilation approaches have become popular. Starting with a planning graph, the bounded-length plan construction problem is addressed by different approaches to solving canonical combinatorial problems, such as satisfiability solvers (Kautz & Selman, 1996, Blackbox), integer programming, or constraint satisfaction algorithms (see Kambhampati, 2000, for a survey). IPP, STAN, and Blackbox solve blocks-world problems with up to 10 objects in under a second and up to thirteen objects in under 1000 seconds (Bacchus et al., 2000).

Traditional planning algorithms worked for a (small) sub-set of first-order logic, where in each planning step variables occurring in the operators must be instantiated. In the context of partial order planning, the expressive power of the original Strips approach was extended by concepts included in the PDDL language, as discussed in section 2.2.1 (Penberthy & Weld, 1992). In contrast, compilation approaches are based on propositional logic. The reduction in the expressiveness of the underlying language gives rise to a gain in efficiency of plan construction. Another approach based on propositional representations is symbolic model checking, which we will discuss in the context of universal planning (see sect. 2.4.4).

2.4.2.2 Forward Planning Revisited. Another efficient planning system of the late nineties is the system HSP of Bonet and Geffner (1999). HSP is a forward planner based on heuristic search. The power of HSP is based on an efficient procedure for calculating a lower bound estimate h'(s) for the distance (remaining number of actions) from the current state s to a goal state. The heuristic function h'(s) is calculated for a "relaxed" problem P', which is obtained from the given planning problem P by ignoring the DEL-lists of the operators. Thus, h'(s) is set to the number of actions which transform s into a state where the goal literals appear for the first time, ignoring that preconditions of these actions might be deleted on the way.

The new generation of forward planners dominated the 2000 AIPS competition (Bacchus et al., 2000). For example, HSP can solve blocks-world problems with up to 35 blocks in under 1000 seconds.
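The delete-relaxation idea can be illustrated with a naive estimate (a sketch only: it greedily counts operator applications until the goals hold under ignored DEL-lists; HSP's actual computation of h'(s) is more refined than this):

```python
def relaxed_estimate(state, goals, operators):
    """Count operator applications until all goal literals hold,
    ignoring DEL-lists (so facts are never destroyed). Greedy and
    order-dependent - an illustration, not HSP's real procedure."""
    facts, count = set(state), 0
    while not goals <= facts:
        progressed = False
        for name, pre, add in operators:       # DEL-lists are ignored
            if pre <= facts and not add <= facts:
                facts |= add
                count += 1
                progressed = True
        if not progressed:
            return float("inf")                # unreachable even when relaxed
    return count

# Assumed blocks-world operators, given only as (name, pre, ADD):
ops = [("puttable(C)", {"clear(C)", "on(C, A)"}, {"ontable(C)", "clear(A)"}),
       ("put(A, B)", {"clear(A)", "clear(B)", "ontable(A)"}, {"on(A, B)"})]
s = {"on(C, A)", "ontable(A)", "ontable(B)", "clear(C)", "clear(B)"}
h = relaxed_estimate(s, {"on(A, B)"}, ops)   # h == 2
```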

2.4.3 Complex Domains and Uncertain Environments

2.4.3.1 Including Domain Knowledge. Most realistic domains are by far more complex than the blocks-world domain discussed so far. Domain specifications for more realistic domains, such as assembling machines or logistics problems, might involve large sets of (primitive) operators and/or huge state-spaces. The obvious approach to generating plans for complex domains is to restrict search by providing domain-specific knowledge. Planners relying on domain-specific knowledge can solve blocks-world problems with up to 95 blocks in under one second (Bacchus et al., 2000). In contrast to the general-purpose planners discussed so far, such knowledge-based systems are called special-purpose planners. Note that all domain-specific approaches require that more effort and time is invested in the development of a formalized domain model. Often, such knowledge is not easy to provide.

One way to deal with a complex domain is to specify plans at different levels of detail. For example (see Russell & Norvig, 1995, p. 368), to launch a rocket, a top-level plan might be: prepare booster rocket, prepare capsule, load cargo, launch. This plan can be differentiated into several intermediate-level plans until a plan containing executable actions is reached (at the level of detail of insert nut A into hole B, etc.). This approach is called hierarchical planning. For hierarchical planning, in addition to primitive operators which can be instantiated to executable actions, abstract operators must be specified. An abstract operator represents a decomposition of a problem into smaller problems. To specify an abstract operator, knowledge about the structure of a domain is necessary. An introduction to hierarchical planning by problem decomposition is given in (Russell & Norvig, 1995, chap. 12).

Other techniques to make planning feasible for complex domains are concerned with reducing the effort of search. One source of complexity is that meaningless instantiations and inconsistent states might be generated which must be recognized and removed (see discussion in sect. 2.3.4). This problem can be reduced or eliminated if domain knowledge is provided in the form of axioms and types. A second source of complexity is that uninformed search might lead into areas of the state-space which are far away from a possible solution. This problem can be reduced by providing domain-specific control strategies which guide search. Planners that rely on such domain-specific control knowledge can easily outperform any domain-independent planner (Bacchus & Kabanza, 1996).

As an alternative to explicitly providing a planner with this kind of domain-specific information, the information can be obtained from domain specifications by pre-planning analysis (see sect. 2.5.1) or from some example plans using machine learning techniques (see sect. 2.5.2).

2.4.3.2 Planning for Non-Deterministic Domains. Constructing plans for real-world domains must take into account that information might be incomplete or incorrect. For example, it might be unknown at planning time whether the weather conditions will be sunny or rainy when the rocket is to be launched (see above); or during planning time it is assumed that a certain tool is stored in a certain shelf, but at plan execution time the tool might have been moved to another location. The first example addresses incompleteness of information due to environmental changes. The second example addresses incorrect information, which might be due to environmental changes (e.g., an agent other than the agent executing the plan moved the tool) or to non-deterministic actions (e.g., depending on where the planning agent moves after using the tool, it might place the tool back in the shelf or not).

A classical approach to dealing with incomplete information is conditional planning. A conditional plan is a disjunction of sub-plans for different contexts (such as good or bad weather conditions). We will see in chapter 8 that introducing conditions into a plan is a necessary step for combining planning and program synthesis. A classical approach to dealing with incorrect information is execution monitoring and re-planning. Plan execution is monitored, and if a violation of the preconditions of the action which should be executed next is detected, re-planning is invoked. An introduction to both techniques is given in (Russell & Norvig, 1995, chap. 13).

Another approach to dealing with non-deterministic domains is reactive planning. Here, a set or table of state-action rules is generated. Instead of executing a complete plan, the current state of the environment triggers which action is performed next, depending on which conditions are fulfilled in the current state. This approach is also called policy learning and is extensively researched in the domain of reinforcement learning (Dean, Basye, & Shewchuk, 1993; Sutton & Barto, 1998). A similar approach, combining problem solving and decision tree learning, is proposed by Müller and Wysotzki (1995). In the context of symbolic planning, universal planning was proposed as an approach to policy learning (see sect. 2.4.4).

2.4.4 Universal Planning

Universal planning was originally proposed by Schoppers (1987) as an approach to learning state-action rules for non-deterministic domains. A universal plan represents solution paths for all possible states of a planning problem, instead of a solution for one single initial state. State-action rules are extracted from a universal plan covering all possible states of a given planning problem. Generating universal plans instead of plans for a single initial state was also proposed by Wysotzki (1987) in the context of inductive program synthesis (see part II).

A universal plan corresponds to a breadth-first search tree. Search is performed backward, starting with the top-level goals, and for each node at the current level of the plan all (new and consistent) predecessor nodes are generated. The set of predecessor nodes of a set of nodes S is also called the pre-image of S. An abstract algorithm for universal plan construction is given in table 2.4. Universal planning was criticized as impracticable (Ginsberg, 1989), because such search trees can grow exponentially (see discussion of PSPACE-completeness, sect. 2.3.3.1). Currently, universal planning is experiencing a renaissance, due to the introduction of OBDDs (ordered binary decision diagrams) as a method for compact representation of universal plans. OBDD representations were originally developed in the context of hardware design (Bryant, 1986) and later adopted in symbolic model checking for efficient exploration of large state-spaces (Burch, Clarke, McMillan, & Hwang, 1992).


The new, memory-efficient approaches to universal planning are based on the idea of viewing planning as model checking, instead of viewing planning as a state-space search problem (like Strips planners) or as a theorem proving problem (like deductive planners). Planning as model checking is not only successfully applied to non-deterministic domains, with the planners MBP (Cimatti, Roveri, & Traverso, 1998) and UMOP (Jensen & Veloso, 2000), but also to benchmark problems for planning in deterministic domains (Edelkamp, 2000, planner MIPS). Because plan construction is based on breadth-first search, universal plans for deterministic domains represent optimal solutions.

Table 2.4. Planning as Model Checking Algorithm (Giunchiglia, 1999, fig. 4)

function Plan(P) where P = (D, I, G) is a planning problem with
I = {s0} as initial state,
G as goals, and
D = ⟨F, S, A, R⟩ as planning domain with F as set of fluents, S ⊆ 2^F as finite set of states, A as finite set of actions, and R : S × A → S as transition function.
An action a ∈ A is executable in s ∈ S if R(s, a) ≠ ∅.

CurrentStates := ∅; NextStates := G; Plan := ∅;
while (NextStates ≠ CurrentStates) do (*)
  if I ⊆ NextStates then return Plan; (**)
  OneStepPlan := OneStepPlan(NextStates, D);
    (calculate pre-image of NextStates)
  Plan := Plan ∪ PruneStates(OneStepPlan, NextStates);
    (eliminate states which have already been visited)
  CurrentStates := NextStates;
  NextStates := NextStates ∪ ProjectActions(OneStepPlan);
    (ProjectActions, given a set of state-action pairs, returns the corresponding set of states)
return Fail.
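For a deterministic domain given as an explicit transition table, the pre-image iteration of table 2.4 can be sketched as a runnable procedure (a sketch under simplifying assumptions: real systems represent the state sets symbolically, e.g. as OBDDs, and the explicit dictionary encoding below is purely illustrative):

```python
def universal_plan(transitions, goal_states):
    """transitions: dict mapping (state, action) -> successor state.
    Backward breadth-first search from the goals; returns a policy
    mapping each reachable state to an optimal next action."""
    policy = {}
    visited = set(goal_states)
    frontier = set(goal_states)
    while frontier:
        new_frontier = set()
        for (s, a), t in transitions.items():
            if t in frontier and s not in visited:   # pruned pre-image
                policy[s] = a
                visited.add(s)
                new_frontier.add(s)
        frontier = new_frontier
    return policy

# Tiny deterministic domain: s0 --a--> s1 --b--> s2, with goal s2.
plan = universal_plan({("s0", "a"): "s1", ("s1", "b"): "s2"}, {"s2"})
# plan == {"s1": "b", "s0": "a"}: every state is mapped to an action
# on a shortest path to the goal.
```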

The general idea of planning as model checking is that a planning domain is described as a semantic model of the domain. A semantic model can be given, for example, as a finite state machine (Cimatti et al., 1998) or as a Kripke structure (Giunchiglia & Traverso, 1999). A domain model D represents the states and actions of the domain and the state transitions caused by the execution of actions. States are represented as conjunctions of atoms. State transitions (state, action, state') can be represented as formulas state ∧ action → state'. As usual, a planning problem is the problem of finding plans of actions given a planning domain, initial and goal states. Plan generation is done by exploring the state space of the semantic model. At each step, plans are generated by checking the truth of some formulas in the model. Plans are also represented as formulas, and planning is modeled as search through sets of states (instead of single states) by evaluating the assignments verifying the corresponding formulas. Giunchiglia and Traverso (1999) give an overview of planning as model checking.


As mentioned above, plans (as semantic models) can be represented compactly as OBDDs. An OBDD is a canonical representation of a boolean function as a DAG (directed acyclic graph). An example of an OBDD representation is given in figure 2.12. Solid lines represent that the preceding variable is true; broken lines represent that it is false. Note that the size of an OBDD is highly dependent on the ordering of the variables (Bryant, 1986).

Fig. 2.12. Representation of the Boolean Formula f(x1, x2) = x1 ∧ x2 as OBDD
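The OBDD of figure 2.12 can be sketched with shared decision nodes (a minimal sketch; real OBDD packages add node hashing and reduction rules on top of this):

```python
# A decision node is a tuple (variable, low_child, high_child); the
# leaves are the booleans themselves. Following figure 2.12, only the
# high ('solid') edge of x1 leads on to the x2 node.
FALSE, TRUE = False, True
x2_node = ("x2", FALSE, TRUE)
obdd = ("x1", FALSE, x2_node)       # represents f(x1, x2) = x1 AND x2

def evaluate(node, assignment):
    """Follow one path from the root to a leaf."""
    while isinstance(node, tuple):
        var, low, high = node
        node = high if assignment[var] else low
    return node

results = [evaluate(obdd, {"x1": a, "x2": b})
           for a in (False, True) for b in (False, True)]
# results == [False, False, False, True], the truth table of AND
```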

Our planning system DPlan is a universal planner for deterministic domains (see chapt. 3). In contrast to the algorithm given in table 2.4, planning problems in DPlan only give planning goals, but not an initial state. DPlan terminates when all states which are reachable from the top-level goals (by calculating the pre-images) are enumerated. That is, DPlan terminates successfully when condition (*) given in table 2.4 is no longer fulfilled, and DPlan does not include condition (**). Furthermore, in DPlan the universal plan is represented as a DAG over states and not as an OBDD. Therefore, DPlan is memory-inefficient. But, as we will discuss in later chapters, DPlan is typically applied only to planning problems of small complexity (involving not more than three or four objects). The universal plan is used as a starting point for inducing a domain-specific control program for generating (optimal) action sequences for problems of arbitrary complexity (see sect. 2.5.2).

2.4.5 Planning and Related Fields

2.4.5.1 Planning and Problem Solving. The distinction between planning and problem solving is not very clear-cut. In the following, we give some discriminations found in the literature (e.g., Russell & Norvig, 1995): Often, forward search based on complete state representations is classified as problem solving, while backward search based on incomplete state representations is classified as planning. While planning is based on a logical representation of states, problem solving can rely on different, special-purpose representations, for example feature vectors. While planning is mostly associated with a domain-independent approach, problem solving typically relies on pre-defined domain-specific knowledge. Examples are heuristic functions as used to guide A* search (Nilsson, 1971, 1980) or the difference table used in GPS (Newell & Simon, 1961; Nilsson, 1980).

2.4 Planning Systems 49

Cognitive psychology is typically concerned with human problem solving and not with human planning. Computer models of human problem solving are mostly realized as production systems (Mayer, 1983; Anderson, 1995). A production system consists of an interpreter and a set of production rules. The interpreter realizes the match-select-apply cycle. A production rule is an IF-THEN rule. A rule fires if the condition specified in its IF-part is satisfied in the current state in working memory. Selection might be influenced by the number of times a rule was already applied successfully, coded as a strength value. Application of a production rule results in a state change. Production rules are similar to operators. But while an operator is specified by preconditions and effects represented as sets of literals, production rules can encode conditions and effects differently. For example, a condition might represent a sequence of symbols which must occur in the current state, and the THEN-part of the rule might give a sequence of symbols which are appended to the current state representation. In cognitive science, production systems are usually goal-directed: the IF-part of a rule represents a currently open goal and the THEN-part either introduces new sub-goals or specifies an action. Goal-driven production systems are similar to hierarchical planners.
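The match-select-apply cycle can be sketched as follows (a minimal illustration, not any particular cognitive architecture; the rule format and the strength-based conflict resolution are simplifications of what the text describes):

```python
from dataclasses import dataclass

@dataclass
class Rule:
    name: str
    condition: frozenset      # IF-part: facts that must hold in working memory
    adds: frozenset           # THEN-part: facts asserted when the rule fires
    dels: frozenset = frozenset()
    strength: float = 1.0     # reflects past successful applications

def run(rules, memory, goal, max_cycles=20):
    """Match-select-apply cycles until the goal facts are in working memory."""
    for _ in range(max_cycles):
        if goal <= memory:
            break
        conflict_set = [r for r in rules if r.condition <= memory]  # match
        if not conflict_set:
            break
        rule = max(conflict_set, key=lambda r: r.strength)          # select
        memory = (memory - rule.dels) | rule.adds                   # apply
        rule.strength += 1.0                                        # reinforce
    return memory

rules = [Rule("clear-B", frozenset({"on(A,B)"}),
              frozenset({"ontable(A)", "clear(B)"}), frozenset({"on(A,B)"}))]
final = run(rules, frozenset({"on(A,B)", "clear(A)"}), frozenset({"clear(B)"}))
print(sorted(final))   # ['clear(A)', 'clear(B)', 'ontable(A)']
```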

The earliest and most influential approach to problem solving in AI and cognitive psychology is the General Problem Solver (GPS) of Newell and Simon (1961). GPS is very similar to Strips planning. The main difference is that selection of a rule (operator) is guided by a difference table. The difference table has to be provided explicitly for each application domain. For each rule, it is represented which difference between a current state and a goal state it can reduce. For example, a rule for put(A, B) fulfills the goal on(A, B). The process of identifying differences and selecting the appropriate rule is called means-end analysis. The system starts with the top-level goals and a current state. It selects one of the top-level goals; if this goal is already satisfied in the current state, the system proceeds with the next goal. Otherwise, a rule is selected from the difference table. The top-level goal is removed from the list (stack) of goals and replaced by the preconditions of the rule, and so on. Like Strips, GPS is based on a linear strategy and is therefore incomplete. While incompleteness is not a desirable property for an AI program, it might be appropriate to characterize human problem solving. A human problem solver usually wants to generate a solution within reasonable time-bounds and can furthermore only hold a restricted amount of information in his/her short-term memory. Therefore, using an incomplete but simple strategy is rational because it still can work for a large class of problems occurring in everyday life (Simon, 1958).
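The means-end cycle can be sketched for a propositional toy domain (an illustration of the control flow only; GPS proper works with parameterized rules and a richer difference table, and the operator names below are hypothetical):

```python
def means_end(state, goals, table, ops, plan):
    """Goal-stack means-end analysis: an unsatisfied goal is reduced by the
    rule the difference table names; its preconditions are achieved first."""
    for goal in goals:
        if goal in state:                 # goal already satisfied: next goal
            continue
        op = table[goal]                  # select rule by the open difference
        pre, add, dele = ops[op]
        state = means_end(state, pre, table, ops, plan)  # sub-goals first
        state = (state - dele) | add      # apply the rule
        plan.append(op)
    return state

# Toy domain: achieve on(A,B); A must be cleared first by putting C on the table.
ops = {
    "puttable(C)": (("on(C,A)",),
                    frozenset({"ontable(C)", "clear(A)"}), frozenset({"on(C,A)"})),
    "put(A,B)":    (("clear(A)", "clear(B)"),
                    frozenset({"on(A,B)"}), frozenset({"ontable(A)"})),
}
table = {"on(A,B)": "put(A,B)", "clear(A)": "puttable(C)"}
plan = []
means_end(frozenset({"on(C,A)", "clear(C)", "clear(B)", "ontable(A)"}),
          ("on(A,B)",), table, ops, plan)
print(plan)   # ['puttable(C)', 'put(A,B)']
```

Because the strategy is linear (one goal is pursued to completion before the next), orderings in which an earlier step destroys a later goal are never found, which is exactly the incompleteness discussed above.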

2.4.5.2 Planning and Scheduling. Planning deals with finding such activities which must be performed to satisfy a given goal. Scheduling deals with finding an (optimal) allocation of activities (jobs) to time segments given limited resources (Zweben & Fox, 1994). For a long time, planning and scheduling research were completely separated. Only in recent years have the interests of both communities converged. While scheduling systems are successfully applied in many areas (such as factories and transportation companies), planning is mostly done "by hand". Currently, researchers and companies become more interested in automating the planning part. This goal seems now far more realistic than five years ago, due to the emergence of new, efficient planning algorithms. On the other hand, researchers in planning aim at applying their approaches to realistic domains such as the logistics domain and the elevator domain used as benchmark problems in the AIPS-00 planning competition (Bacchus et al., 2000).[13] Planning in realistic domains often involves dealing with resource constraints. One approach, for example proposed by Do and Kambhampati (2000), is to model planning and scheduling as two succeeding phases, both solved with a constraint satisfaction approach. Other work aims at integrating resource constraints in plan construction (Koehler, 1998; Geffner, 2000). In chapter 4 we present an approach to integrating function application in planning and example applications to planning with resource constraints.

2.4.5.3 Proof Planning. In the domain of automatic theorem proving, planning is applied to guide proof construction. Proof planning was originally proposed by Bundy (1988). Planning operators represent proof methods with preconditions, postconditions, and tactics. A tactic represents a number of inference steps. It can be applied if the preconditions hold, and the postconditions must be guaranteed to hold after the inference steps are executed. A tactic guides the search for a proof by prescribing that certain inference steps should be executed in a certain sequence given some current step in a proof.

An example of a high-level proof method is "proof by mathematical induction". Bundy (1988) introduces a heuristic called "rippling" to describe the heuristics underlying the Boyer-Moore theorem prover and represents this heuristic as a method. Such a method specifies tactics on different levels of detail, similar to abstract and primitive operators in hierarchical planning. Thus, a "super-method" for guiding the construction of a complete proof invokes several sub-methods which satisfy the post-condition of the super-method after their execution.

Proof planning, like hierarchical planning, requires insight into the structure of the planning domain. Identifying and formalizing such knowledge can be very time-consuming, error-prone, or even impossible, as discussed in section 2.4.3.1. Again, learning might be a good alternative to explicitly specifying such domain-dependent knowledge: First, proofs are generated using only primitive tactics. Such proofs can then be input to a machine learning algorithm and generalized to more general methods (Jamnik, Kerber, & Benzmuller, 2000). In principle, all strategies available for learning control

[13] Because of the converging interests in planning and scheduling research, the conference Artificial Intelligence Planning Systems was renamed to Artificial Intelligence Planning and Scheduling.


rules or control programs for planning as discussed in section 2.5.2 can be applied to proof planning. Of course, such a learned method might be incomplete or non-optimal because it is induced from some specific examples. But the same is true for methods which are generated "by hand". In both cases, such strategic decisions should not be executed automatically. Instead, they can be offered to a system user and accepted or rejected by user interaction.

2.4.6 Planning Literature

Classical Strips planning is described in all introductory AI textbooks such as Nilsson (1980) and Winston (1992). The newest textbook, by Russell and Norvig (1995), focuses on partial-order planning, which is also described in Winston (1992). Short introductions to situation calculus can also be found in all textbooks. A collection of influential papers in planning research from the beginning until the end of the eighties is presented by Allen, Hendler, and Tate (1990).

The area of planning research has undergone significant changes since the mid-nineties. The newer Graphplan-based and compilation approaches as well as methods of domain-knowledge learning (see sect. 2.5) are not described in textbooks. Good sources to obtain an overview of current research are the proceedings of the AIPS conference (Artificial Intelligence Planning and Scheduling; http://www-aig.jpl.nasa.gov/public/aips00/) and the Journal of Artificial Intelligence Research (JAIR; http://www.cs.washington.edu/research/jair/home.html). Overview papers of special areas of planning can be found in the AI Magazine, for example, a recent overview of current planning approaches by Weld (1999). Furthermore, research in planning is presented in all major AI conferences (IJCAI, AAAI) and AI journals (e.g., Artificial Intelligence). A collection of current planning systems, together with the possibility to execute planners on a variety of problems, can be found at http://rukbat.fokus.gmd.de:8080/.[14]

2.5 Automatic Knowledge Acquisition for Planning

2.5.1 Pre-Planning Analysis

Pre-planning analysis was proposed, for example, by Fox and Long (1998) to eliminate meaningless instantiations when constructing a propositional representation of a planning problem, such as a planning graph. The basic idea is to automatically infer types from domain specifications. These types are then used to extract state invariants. In addition to eliminating

[14] This website was realized by Jurgen Muller as part of his diploma thesis at TU Berlin, supervised by Ute Schmid and Fritz Wysotzki.


meaningless instantiations, state invariants can be used to detect and eliminate unsound plans. The domain analysis of the system TIM proposed by Fox and Long (1998) extracts information by analyzing the literals occurring in the preconditions, ADD- and DEL-lists of operators (transforming them into a set of finite state machines).

For example, for the rocket domain (described briefly in sect. 2.3.4.2), it can be inferred that each package is always either in the rocket or at a fixed place, but never at two locations simultaneously.

Extracting knowledge from domain specifications makes planning more efficient because it reduces the search space by eliminating impossible states. Another possibility to make plan construction more efficient is to guide search by introducing domain-specific control knowledge.
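Once extracted, such an invariant can serve directly as a state filter. A minimal sketch (this only illustrates using the result of the analysis; it is not the TIM algorithm, and the literal syntax is an assumption for illustration):

```python
def at_one_place(state, package):
    """Invariant from pre-planning analysis: a package is either in the
    rocket or at exactly one place, never at two locations at once."""
    locations = [f for f in state
                 if f.startswith(f"at({package},")
                 or f.startswith(f"in({package},")]
    return len(locations) == 1

s_ok = {"at(p1,paris)", "in(p2,rocket)"}       # consistent rocket-domain state
s_bad = {"at(p1,paris)", "at(p1,london)"}      # violates the invariant
print(at_one_place(s_ok, "p1"), at_one_place(s_bad, "p1"))   # True False
```

Any state (or backward-generated predecessor) failing such a check can be discarded before it ever enters the search space.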

2.5.2 Planning and Learning

Enriching domains with control knowledge which guides a planner to reduce search for a solution makes planning more efficient and therefore makes a larger class of problems solvable under given time and memory restrictions. Because it is not always easy to provide such knowledge, and because domain modeling is an "art" which is time-consuming and error-prone, one area of planning research deals with the development of approaches to learn such knowledge automatically from some sample experience with a domain.

2.5.2.1 Linear Macro-Operators. In the eighties, approaches to learning (linear) macro-operators were investigated (Minton, 1985; Korf, 1985): After a plan is constructed for a problem of a given domain, operators which appear directly after each other in the plan can be composed into a single macro operator by merging their preconditions and effects. This process can be applied to pairs or larger sequences of primitive operators. If the planner is confronted with a new problem of the given domain, plan construction might involve a smaller number of match-select-apply cycles because macro operators can be applied which generate larger segments of the searched-for plan in one step. This approach to learning in planning, however, did not succeed, mainly because of the so-called utility problem. If macros are extracted indiscriminately from a plan, the system might become "swamped" with macro-operators. Possible efficiency gains from reduction of the number of match-select-apply cycles are counterbalanced or even overridden by the number of operators which must be matched.
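The merging step can be sketched as follows (a simplified illustration assuming ADD-is-applied-after-DEL semantics; the block-stacking operators used as input are hypothetical examples):

```python
def compose(op1, op2):
    """Macro for applying op1 directly followed by op2: merged preconditions
    and net effects. Each operator is a triple (PRE, ADD, DEL) of fact sets."""
    pre1, add1, del1 = op1
    pre2, add2, del2 = op2
    # op2 must not require a fact that op1 deletes without re-adding
    assert not pre2 & (del1 - add1), "operators cannot be chained"
    pre = pre1 | (pre2 - add1)      # what op1 does not provide must hold before
    add = (add1 - del2) | add2      # net ADD of the pair
    dele = (del1 | del2) - add2     # net DEL of the pair
    return pre, add, dele

unstack = (frozenset({"on(A,B)", "clear(A)"}),
           frozenset({"holding(A)", "clear(B)"}),
           frozenset({"on(A,B)", "clear(A)"}))
putdown = (frozenset({"holding(A)"}),
           frozenset({"ontable(A)", "clear(A)"}),
           frozenset({"holding(A)"}))
pre, add, dele = compose(unstack, putdown)
print(sorted(pre))    # ['clear(A)', 'on(A,B)']
```

The two-step pair collapses into one table-putting macro; the utility problem arises when every such pair (and triple, and so on) from every plan is stored and must be matched on each cycle.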

An approach similar to linear macro learning is investigated in the context of cognitive models of human problem solving (see sect. 2.4.5.1) and learning. The ACT system (Anderson, 1983) and its descendants (Anderson & Lebiere, 1998) are realized as production systems with a declarative component representing factual knowledge and a procedural component representing skills. The procedural knowledge is represented by production rules (if-condition-then-action pairs). Skill acquisition is modeled as "compilation", which includes concatenating primitive rules. This mechanism is used to describe


speed-up learning from problem solving experience. Another production system approach to human cognitive skills is the SOAR system (Newell, 1990), a descendant of GPS. Here, "chunking" of rules is invoked if an impasse during generating a problem solution has occurred and was successfully resolved.

2.5.2.2 Learning Control Rules. From the late eighties to the mid-nineties, control rule learning was mainly investigated in the context of the Prodigy system (Veloso et al., 1995). A variety of approaches, mainly based on explanation-based learning (EBL) or generalization (Mitchell, Keller, & Kedar-Cabelli, 1986), were investigated to learn control rules to improve the efficiency of plan construction and the quality (i.e., optimality) of plans. An overview of all investigated methods is given in Veloso et al. (1995). A control rule is represented as a production rule. For a current state achieved during plan construction, such a control rule provides the planner with information about which choice (next action to select, next sub-goal to focus on) it should make. The EBL approach proposed by Minton (1988) extracts such control rules from an analysis of search trees by explaining why certain branching decisions were made during search for a solution.

Another approach to control rule learning, based on learning decision trees (Quinlan, 1986) or decision lists (Rivest, 1987), is closely related to policy learning in reinforcement learning (Sutton & Barto, 1998). For example, Briesemeister, Scheffer, and Wysotzki (1996) combined problem solving with decision tree learning. For a given problem solution, each state is represented as a feature vector and associated with the action which was executed in this state. A decision tree is induced, representing relevant features of the state description together with the action which has to be performed given specific values of these features. Each path in the decision tree leads to an action (or a "don't know what to do") in its leaf and can be seen as a condition-action rule. After a decision tree is learned, new problem solving episodes can be guided by the information contained in the decision tree. Learning can be performed incrementally, leading either to instantiations of up to now unknown condition-action pairs or to a restructuring of the tree. Similar approaches are proposed by Martin and Geffner (2000) and Huang, Selman, and Kautz (2000). Martin and Geffner (2000) learn policies represented in a concept language (i.e., as logical formulas) with a decision list approach. They can show that a larger percentage of complex problems (such as blocks-world problems involving 20 blocks) can be successfully solved using the learned policies and that the generated plans are, while not optimal, reasonably short.
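A coverage-greedy sketch in the spirit of decision-list learning (a toy illustration over Boolean state features, not the cited systems; the rule form "if feature holds then action" is the simplest case):

```python
def learn_decision_list(examples):
    """Greedily learn rules 'if feature in state then action' from
    (state-features, executed-action) pairs of a solved episode."""
    rules, remaining = [], list(examples)
    while remaining:
        best = None
        feats = {f for s, _ in remaining for f in s}
        for f in feats:
            covered = [(s, a) for s, a in remaining if f in s]
            actions = {a for _, a in covered}
            # keep the pure rule (one action) with the largest coverage
            if len(actions) == 1 and (best is None or len(covered) > len(best[2])):
                best = (f, next(iter(actions)), covered)
        if best is None:                      # no pure rule left: default action
            rules.append((None, remaining[0][1]))
            break
        f, a, _ = best
        rules.append((f, a))
        remaining = [(s, act) for s, act in remaining if f not in s]
    return rules

def decide(rules, state):
    for f, a in rules:
        if f is None or f in state:
            return a

# States from a solved clearblock episode, paired with the executed action.
examples = [
    (frozenset({"on(A,B)", "on(B,C)"}), "puttable(A)"),
    (frozenset({"on(B,C)", "clear(A)"}), "puttable(B)"),
]
rules = learn_decision_list(examples)
print(decide(rules, frozenset({"on(A,B)", "on(B,C)"})))   # puttable(A)
```

Unseen states that share the learned features are then handled without search, which is exactly how such policies scale to larger problems than the training episodes.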

2.5.2.3 Learning Control Programs. An alternative approach to learning "molecular" rules is learning of control programs. A control program generates a possibly cyclic (recursive, iterative) sequence of actions. We will also speak of (recursive) control rules when referring to control programs. Shell and Carbonell (1989) contrast iterative with linear macros and give a theoretical analysis and some empirical demonstrations of the efficiency gains which can be expected using iterative macros. Iterative macros can be seen as programs because they provide a control structure for repeatedly executing a sequence of actions until the condition for looping no longer holds. The authors point out that a control program represents strategic knowledge for a domain. Efficient human problem solving should not only be explained by speed-up effects due to operator-merging but also by acquiring problem solving strategies, i.e., knowledge on a higher level of abstraction.

Learning control programs from some initial planning experience brings together inductive program synthesis and planning research, which is the main topic of the work presented in this book. We will present our approach to learning control programs by synthesizing functional programs in detail in part II. Other approaches were presented by Shavlik (1990), who applies inductive logic programming to control program learning, and by Koza (1992) in the context of genetic programming. Recently, learning recursive ("open loop") macros is also investigated in reinforcement learning (Kalmar & Szepesvari, 1999; Sun & Sessions, 1999).

While control rules guide search for a plan, control programs eliminate search completely. In our approach, we learn recursive functions for a domain of arbitrary complexity but with a fixed goal. For example, a program for constructing a tower of sorted blocks can be synthesized from a (universal) plan for a three-block problem with goal {on(A, B), on(B, C)}. The synthesized function can then generate correct and optimal action sequences for tower problems with an arbitrary number of blocks. Instead of searching for a plan, needing probably exponential effort, the learned program is executed. Execution time corresponds to the complexity class of the program, for example, linear time for a linear recursion. Furthermore, the learned control programs provide optimal action sequences.
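Such a control program can be illustrated by hand for the clearing subtask (the function below is a hand-written example of the kind of recursive program the synthesis step is meant to produce, not its actual output):

```python
def clear(x, on):
    """Recursive control program for clearing block x, for any number of
    blocks, without any search. `on` maps each block to the block directly
    below it (or 'table')."""
    above = [b for b, below in on.items() if below == x]
    if not above:
        return []                          # x is already clear: no action
    y = above[0]                           # at most one block lies on x
    actions = clear(y, on)                 # first clear the block on top of x
    on[y] = "table"
    return actions + [f"puttable({y})"]

on = {"A": "B", "B": "C", "C": "table"}    # tower: A on B on C
print(clear("C", on))    # ['puttable(A)', 'puttable(B)']
```

The recursion is linear in the height of the stack on x, so execution time is linear in the number of blocks, in contrast to the exponential search the universal planner performs.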

While learning a control program for achieving a certain kind of top-level goals in a domain results in a highly efficient generation of an optimal action sequence, this might not be true if control knowledge is only learned for a sub-domain, as for example, clearing a block as a subproblem of building a tower of blocks. In this case, the problem of intelligent indexing and retrieval of control programs has to be dealt with, similar to the reuse of already generated plans in the context of a new planning problem (Veloso, 1994; Nebel & Koehler, 1995). Furthermore, program execution might lead to a state which is not on the optimal path of the global problem solution (Kalmar & Szepesvari, 1999).

3. Constructing Complete Sets of Optimal Plans

The planning system DPlan is designed as a tool to support the first step of inductive program synthesis: generating finite programs for transforming input examples into the desired output. Because our work is in the context of program synthesis, planning is for small, deterministic domains, and completeness and optimality are of more concern than efficiency considerations. In the remaining chapters of part I, we present DPlan as a planning system. In part II we will describe how recursive functions can be induced from plans constructed with DPlan and show how such recursive functions can be used as control programs for plan construction. In this chapter, we will introduce DPlan (sect. 3.1), introduce universal plans as sets of optimal plans (sect. 3.2), and give proofs for termination, soundness, and completeness of DPlan (sect. 3.3).[1] An extension of DPlan to function application, allowing planning for infinite domains, is described in the next chapter (chap. 4).

3.1 Introduction to DPlan

DPlan[2] is a state-based, non-linear, total-order backward planner. DPlan is a universal planner (see sect. 2.4.4 in chap. 2): Instead of a plan representing a sequence of actions transforming a single initial state into a state fulfilling the top-level goals, DPlan constructs a plan representing optimal action sequences for all states belonging to the planning problem. A short history of DPlan is given in appendix A.1.

DPlan differs from standard universal planning in the following aspects:

- In contrast to universal planning for non-deterministic domains, we do not extract state-action rules from the universal plan, but use it as starting point for program synthesis.
- A planning problem specifies a set of top-level goals and it restricts the number of objects of the domain, but no initial state is given. We are

[1] The chapter is based on the following previous publications: Schmid (1999), Schmid and Wysotzki (2000b, 2000a).

[2] Our algorithm is named DPlan in reference to the Dijkstra algorithm (Cormen et al., 1990), because it is a single-source shortest-paths algorithm with the state(s) fulfilling the top-level goals as source.


interested in optimal transformation sequences for all possible states from which the goal is reachable.

- A universal plan represents the optimal solutions for all states which can be transformed into a state fulfilling the top-level goal. That is, plan construction does not terminate if a given initial state is included in the plan, but only if expansion of the current leaf nodes does not result in new states (which are not already contained in the plan).
- A universal plan is represented as "minimal spanning DAG" (directed acyclic graph) (see sect. 3.2) with nodes as states and arcs as actions.[3]
- Instead of a memory-efficient representation using OBDDs, an explicit state-based representation is used as starting point for program synthesis. Typically, plans involving only three or four objects are generated, and a control program for dealing with n objects is generalized by program synthesis.
- Backward operator application is not realized by goal regression over partial state descriptions (see sect. 2.3.4.1) but over complete state descriptions. The first step of plan construction expands the top-level goals to a set of goal states.

3.1.1 DPlan Planning Language

The current system is based on domain and problem specifications in PDDL syntax (see sect. 2.2.1 in chap. 2), allowing Strips operators (see def. 2.1.1) and operators with conditional effects. As usual, states are defined as sets of atoms (def. 2.1.2) and goals as sets of literals (def. 2.1.4). Operators are defined with preconditions, ADD- and DEL-lists (def. 2.1.5). Secondary preconditions can be introduced to model context-dependent effects (see sect. 2.2.1.2 in chap. 2).

Forward operator application, for executing a plan by transforming an initial state into a goal state, is defined as usual (see def. 2.1.6). Backward operator application is defined for complete state descriptions (see def. 2.1.7). An extension for conditional effects is given in definition 2.2.1. For universal planning, backward operator application must be extended to calculate the set of all predecessor states for a given state:

Definition 3.1.1 (Pre-Image). With Res^-1(o, s) we denote backward application of an instantiated operator o to a state description s (see def. 2.1.7). We call s' = Res^-1(o, s) a predecessor of s. Backward operator application can be extended in the following way:

Res_p^-1({o_1, ..., o_n}, s) represents the "parallel" application of the set of all actions which satisfy the application condition for state s, resulting in a set of predecessor states.

[3] As we will see for some examples below, for some planning domains the universal plan is a specialized DAG: a minimal spanning tree or simply a sequence.


The pre-image of a set of states S = {s_1, ..., s_l} is defined as ∪_{i=1}^{l} Res_p^-1({o_1, ..., o_n}, s_i).
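Definition 3.1.1 can be sketched directly for Strips operators over complete state descriptions (a minimal illustration; the backward-applicability condition ADD(o) ⊆ s, DEL(o) ∩ s = ∅ is one common formulation, and the consistency check Res(o, s') = s mirrors the one used below in section 3.1.2):

```python
def res(op, s):
    """Forward application Res(o, s'): delete DEL, then add ADD, if PRE holds."""
    pre, add, dele = op
    return (s - dele) | add if pre <= s else None

def res_inv(op, s):
    """Backward application Res^-1(o, s) on a complete state description."""
    pre, add, dele = op
    if add <= s and not (dele & s):
        return (s - add) | dele | pre
    return None

def pre_image(ops, states):
    """Parallel backward application over a set of states (def. 3.1.1):
    all pairs <o, s'> whose predecessor is consistent, i.e. Res(o, s') = s."""
    return {(name, s1)
            for s in states
            for name, op in ops.items()
            for s1 in [res_inv(op, s)]
            if s1 is not None and res(op, s1) == s}

puttable_A = (frozenset({"clear(A)", "on(A,B)"}),      # PRE
              frozenset({"ontable(A)", "clear(B)"}),   # ADD
              frozenset({"on(A,B)"}))                  # DEL
goal = frozenset({"clear(A)", "clear(B)", "ontable(A)", "ontable(B)"})
print(pre_image({"puttable(A)": puttable_A}, [goal]))
# one pair: puttable(A) applied backward yields {clear(A), on(A,B), ontable(B)}
```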

To make sure that plan construction based on calculating pre-images is sound and complete, it must be shown that the pre-image of a set of states S only contains consistent state descriptions and that the pre-image contains all state descriptions which can be transformed into a state in S (see sect. 3.3).

A DPlan planning problem is defined as follows.

Definition 3.1.2 (DPlan Planning Problem). A planning problem P(O, G, D) consists of a set of operators O, a set of top-level goals G, and a domain restriction D which can be specified in two ways:

- as a set of state descriptions,
- as a set of goal states.

In standard planning and universal planning, an initial state is given as part of a planning problem. A planning problem is solved if an action sequence transforming the initial state into a state satisfying the top-level goals is found. In DPlan planning, a planning problem is solved if all states given in D are included in the universal plan, or if a set of goal states is expanded to all possible states from which a goal state is reachable.

The initial state has the following additional functions in backward plan construction (see sect. 2.3.4 in chap. 2): Plan construction terminates if the initial state is included in the search tree. All partial state descriptions on the path from the initial state to the top-level goals can be completed by forward operator application. Consistency of state descriptions can be checked for the states included in the solution path, and inconsistencies on other paths have no influence on the soundness of the plan. To make backward planning sound and complete, in general at least one complete state description, denoting a set of consistent relations between all objects of interest, is necessary. For DPlan, this must be either a (set of) complete goal state(s) or the set of all states to be included in the universal plan. If for a DPlan planning problem D is given as a set of goal states, consistency of a predecessor state can be checked by Res^-1(o, s) = s' → Res(o, s') = s. If D is given as a set of (consistent) state descriptions, plan construction is reduced to stepwise including states from D in the plan.

3.1.2 DPlan Algorithm

The DPlan algorithm is a variant of the universal planning algorithm given in table 2.4. In table 3.1, the DPlan algorithm is described abstractly. Input is a DPlan planning problem P(O, G, D) with a set of operators O, a set of top-level goals G, and a domain restriction D as specified in definition 3.1.2. Output is a universal plan representing the union of all optimal plans for the restricted domain and a fixed goal, which we will describe in detail in section 3.2. In the following, the functions used for plan construction are described.


Table 3.1. Abstract DPlan Algorithm

function DPlan(P) where P is a DPlan planning problem.
  CurrentStates := {};
  NextStates := GoalStates(P);
  Plan := {};
  while (NextStates ≠ CurrentStates) do
    OneStepPlan := OneStepPlan(NextStates, P);
    Plan := Plan ∪ PruneStates(OneStepPlan, NextStates);
    CurrentStates := NextStates;
    NextStates := NextStates ∪ ProjectActions(OneStepPlan);
  return Plan.

GoalStates(P):
  If D is defined as a set of goal states, GoalStates(P) := D.
  If D is defined as a set of states S, GoalStates(P) := {s_G | G ⊆ s_G and s_G ∈ S}.

OneStepPlan(States, P):
  Calculates the pre-image of States as described in definition 3.1.1. Each state s ∈ States is expanded to all states s' with Res(o, s') = s. States s' are calculated by backward operator application Res^-1(o, s) = s', where all states for which Res(o, s') = s does not hold are removed. If D is given as a set of states, additionally all states not occurring in D are removed. OneStepPlan returns all pairs <o, s'>:
  OneStepPlan(States, P) = {<o, s'> | Res^-1(o, s) = s' ∧ s ∈ States}.

PruneStates(Pairs, States):
  Eliminates all pairs <o, s'> from Pairs whose states have already been visited:
  PruneStates(Pairs, States) := {<o, s'> ∈ Pairs | s' ∉ States}.

ProjectActions(Pairs):
  Returns the set of all states occurring in Pairs:
  ProjectActions(Pairs) := {s' | <o, s'> ∈ Pairs}.

Plan construction corresponds to building a search tree by breadth-first search, with filtering of states which already occur on higher levels of the tree. Since it is possible that a OneStepPlan contains pairs <o, s'> and <o', s'>, i.e., different actions resulting in the same predecessor state, the plan corresponds not to a tree but to a DAG (Christofides, 1975).
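The abstract algorithm of table 3.1 can be sketched and run on the clearblock domain of section 3.1.4.1 (a compact illustration, not the Lisp implementation: operators are ground puttable instances, and arcs are stored as triples (action, predecessor, successor) rather than pairs, for readability):

```python
def one_step_plan(states, ops):
    """Pre-image of a set of states (def. 3.1.1), with the consistency
    check Res(o, s') = s filtering unsound predecessors."""
    arcs = set()
    for s in states:
        for name, (pre, add, dele) in ops.items():
            if add <= s and not (dele & s):
                s1 = (s - add) | dele | pre        # Res^-1(o, s)
                if ((s1 - dele) | add) == s:       # Res(o, s') = s
                    arcs.add((name, s1, s))
    return arcs

def dplan(ops, goal_states):
    """Abstract DPlan (table 3.1): breadth-first backward search from the
    goal states until no new predecessor states appear."""
    current, nxt, plan = set(), set(goal_states), set()
    while nxt != current:
        one_step = one_step_plan(nxt, ops)
        plan |= {a for a in one_step if a[1] not in nxt}   # PruneStates
        current = set(nxt)
        nxt |= {a[1] for a in one_step}                    # ProjectActions
    return plan

blocks = ["A", "B", "C"]
ops = {f"puttable({x},{y})":                 # ground instance: x taken from y
       (frozenset({f"clear({x})", f"on({x},{y})"}),
        frozenset({f"ontable({x})", f"clear({y})"}),
        frozenset({f"on({x},{y})"}))
       for x in blocks for y in blocks if x != y}
goal = frozenset({f"clear({b})" for b in blocks} |
                 {f"ontable({b})" for b in blocks})
plan = dplan(ops, [goal])
print(len(plan))   # 12 arcs: one optimal arc for each of the 12 non-goal states
```

Note how the Res(o, s') = s check silently discards inconsistent predecessors such as states in which a covered block would also be clear.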

3.1.3 Efficiency Concerns

Since planning is an NP-complete or even PSPACE-complete problem (see sect. 2.3.3.1 in chap. 2), the worst-case performance of every possible algorithm is exponential. Nevertheless, current Graphplan-based and compilation approaches (see sect. 2.4.2.1 in chap. 2) show good performance for many benchmark problems. These algorithms are based on depth-first search and therefore in general cannot guarantee optimality, i.e., plans with a minimal-length sequence of actions (Koehler et al., 1997). In contrast, universal planning is based on breadth-first search and can guarantee optimality (for deterministic domains). The main disadvantage of universal planning is that plan size grows exponentially for many domains (see sect. 2.3.3.1 in chap. 2). Encoding plans as OBDDs can often result in compact representations. But the size of an OBDD depends on the variable ordering (see sect. 2.4.4 in chap. 2), and calculating an optimal variable ordering is itself an NP-hard problem (Bryant, 1986).

Planning with DPlan is neither time nor memory efficient. Plan construction is based on breadth-first search and therefore works without backtracking. Because paths to nodes which are already covered (by shorter paths) in the plan are not expanded,[4] the effort of plan construction depends on the number of states. The number of states can grow exponentially, as shown for example for the blocks-world domain in section 2.3.3.1 in chapter 2. Plans are represented as DAGs with state descriptions as nodes; that is, the size of the plan depends on the number of states, too.[5]

DPlan incorporates none of the state-of-the-art techniques for efficient plan construction: we use no pre-planning analysis (see sect. 2.5.1) and we do not encode plans as OBDDs. The reason is that we are mainly interested in an explicit representation of the structure of a domain with a fixed goal and a small number of domain objects. A universal plan constructed with DPlan represents the structural relations between the optimal transformation sequences for all possible states. This information is necessary for inducing a recursive control program, generalizing over the number of domain objects, as we will describe in chapter 8.

3.1.4 Example Problems

In the following, we will give some example domains together with a DPlan problem, that is, a fixed goal and a fixed number of objects. For each DPlan problem, we present the universal plans constructed by DPlan. For better readability, we abstract somewhat from the Lisp-based PDDL encoding which is input to the DPlan system (see appendix A.3).

3.1.4.1 The Clearblock Problem. First we look at a very simple sub-domain of blocks-world (see fig. 3.1). There is only one operator, puttable, which puts a block x from another block onto the table. We give the restricted domain Dom as a set of three states: a tower of A, B, C; a tower of B, C with block A lying on the table; and a state where all three blocks are lying on the table.

[4] Similar to dynamic programming in A* (Nilsson, 1980).

[5] For example, calculating a universal plan with the uncompiled Lisp implementation of DPlan needs about a second for blocks-world with three objects, about 150 seconds for four objects, and already more than an hour for five objects.

60 3. Constructing Complete Sets of Optimal Plans

Operator: puttable(x)
PRE: {clear(x), on(x, y)}
ADD: {ontable(x), clear(y)}
DEL: {on(x, y)}

Goal: {clear(C)}

Dom-S: {{on(A, B), on(B, C), clear(A), ontable(C)},
        {on(B, C), clear(A), clear(B), ontable(A), ontable(C)},
        {clear(A), clear(B), clear(C), ontable(A), ontable(B), ontable(C)}}

Fig. 3.1. The Clearblock DPlan Problem
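The ADD/DEL semantics of such an operator can be illustrated with a small sketch (our own Python set encoding of literals as tuples, not part of DPlan's Lisp implementation):

```python
def res(state, pre, add, dele):
    """Forward application: Res(o, s) = s \\ DEL u ADD, if PRE is a subset of s."""
    assert pre <= state, "operator not applicable"
    return (state - dele) | add

def puttable(x, y):
    """Instantiated puttable(x) for a block x lying on y."""
    return ({("clear", x), ("on", x, y)},       # PRE
            {("ontable", x), ("clear", y)},     # ADD
            {("on", x, y)})                     # DEL

# start from the tower of A, B, C given in Dom-S
s = {("on", "A", "B"), ("on", "B", "C"), ("clear", "A"), ("ontable", "C")}
s = res(s, *puttable("A", "B"))   # puttable(A) clears B
s = res(s, *puttable("B", "C"))   # puttable(B) clears C: goal reached
```

Applying the two instantiated actions yields exactly the third state of Dom-S, in which clear(C) holds.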

The planning goal is to clear block C. Dom includes one state fulfilling the goal, and this state becomes the root of the plan. Plan construction results in ordering the three states given in Dom with respect to the length of the optimal transformation sequence (see fig. 3.2). For this simple problem, the plan is a simple sequence, i.e., the states are totally ordered with respect to the given goal and the given operator. Note that we present plans with states and actions in Lisp notation because we use the original DPlan output (see appendix A.2).

((on A B) (on B C) (clear A) (ontable C))
    | (PUTTABLE A)
((on B C) (clear A) (clear B) (ontable A) (ontable C))
    | (PUTTABLE B)
((clear A) (clear B) (clear C) (ontable A) (ontable B) (ontable C))

Fig. 3.2. DPlan Plan for Clearblock

Alternatively, the set of all goal states satisfying clear(C) can be presented. Then the resulting plan (see fig. 3.3) is a forest. When planning for multiple goal states, we introduce the top-level goals as root.

[Figure: universal plan for the goal clear(C) with the set of all states satisfying clear(C) as roots, joined under the top-level goal node ((clear C)); edges are (PUTTABLE A) and (PUTTABLE B) actions. (ontable omitted for better readability)]

Fig. 3.3. Clearblock with a Set of Goal States

3.1 Introduction to DPlan 61

3.1.4.2 The Rocket Problem. An example for a simple transportation (logistics) domain is the rocket domain proposed by Veloso and Carbonell (1993) (see fig. 3.4). The planning goal is to transport three objects O1, O2, and O3 from a place A to a destination B. The transport vehicle (Rocket) can only be moved in one direction (A to B), for example from the earth to the moon. Therefore, it is important to load all objects before the rocket moves to its destination. The rocket domain was introduced as a demonstration of the incompleteness of linear planning (see sect. 2.3.4.2 in chap. 2).

Operators:
load(?o, ?l)
PRE: {at(?o, ?l), at(Rocket, ?l)}
ADD: {inside(?o, Rocket)}
DEL: {at(?o, ?l)}

move-rocket()
PRE: {at(Rocket, A)}
ADD: {at(Rocket, B)}
DEL: {at(Rocket, A)}

unload(?o, ?l)
PRE: {inside(?o, Rocket), at(Rocket, ?l)}
ADD: {at(?o, ?l)}
DEL: {inside(?o, Rocket)}

Goal: {at(O1, B), at(O2, B), at(O3, B)}

Dom-G: {{at(O1, B), at(O2, B), at(O3, B), at(Rocket, B)}}

Fig. 3.4. The DPlan Rocket Problem

The resulting universal plan is given in figure 3.5. For rocket, the plan corresponds to a DAG: loading and unloading of the three objects can be performed in an arbitrary sequence, but finally each sequence results in all objects being inside the rocket or at the destination.

3.1.4.3 The Sorting Problem. A specification of a sorting problem is given in figure 3.6. The relational symbol isc(p, k) represents that an element k is at position p of a list (or, more exactly, an array). The relational symbol gt(x, y) represents that an element x has a greater value than an element y; that is, the ordering on natural numbers is represented extensionally. The relational symbol gt is static, i.e., its truth value is never changed by operator application. Because swapping of two elements is only restricted such that two elements are swapped if the first is greater than the second, the resulting plan corresponds to a set of traces of a selection sort program. We will discuss synthesis of selection sort in detail in chapter 8.

The universal plan for sorting is given in figure 3.7. For better readability, the lists themselves instead of their logical descriptions are given. In chapter 4 we will describe how such lists can be manipulated directly by extending planning to function application.


[Figure: universal plan for rocket, a DAG with root ((AT O1 B) (AT O2 B) (AT O3 B) (AT R B)); (UNLOAD ?o) edges lead through all states with subsets of the objects inside the rocket down to ((IN O3 R) (IN O1 R) (IN O2 R) (AT R B)), a single (MOVE-ROCKET) edge leads to ((AT R A) (IN O1 R) (IN O2 R) (IN O3 R)), and (LOAD ?o) edges lead down to the leaf ((AT O3 A) (AT O1 A) (AT O2 A) (AT R A)). (IN = inside, R = Rocket)]

Fig. 3.5. Universal Plan for Rocket

Operator: swap(?p, ?q)
PRE: {isc(?p, ?n1), isc(?q, ?n2), gt(?n1, ?n2)}
ADD: {isc(?p, ?n2), isc(?q, ?n1)}
DEL: {isc(?p, ?n1), isc(?q, ?n2)}

Goal: {isc(p1, 1), isc(p2, 2), isc(p3, 3), isc(p4, 4)}

Dom-G: {isc(p1, 1), isc(p2, 2), isc(p3, 3), isc(p4, 4),
        gt(4, 3), gt(4, 2), gt(4, 1), gt(3, 2), gt(3, 1), gt(2, 1)}

Fig. 3.6. The DPlan Sorting Problem


[Figure: universal plan for sorting, a DAG with the sorted list [1234] = ((isc p1 1) (isc p2 2) (isc p3 3) (isc p4 4)) as root and all 24 permutations of [1 2 3 4] as nodes, connected by swap actions, e.g. (S P1 P4) = (swap p1 p4). (States are represented abbreviated)]

Fig. 3.7. Universal Plan for Sorting

3.1.4.4 The Hanoi Problem. A specification of a Tower of Hanoi problem is given in figure 3.8. The Tower of Hanoi problem (e.g., Winston & Horn, 1989, chap. 5) is a well-researched puzzle. Given are n discs (in our case three) with monotonically increasing size and three pegs. The goal is to have a tower of discs on a fixed peg (in our case p3) where the discs are ordered by their size, with the largest disc as base and the smallest as top element. Moving a disc is restricted in the following way: only a disc which has no other disc on top of it can be moved, and a disc can only be moved onto a larger disc or an empty peg. Coming up with efficient algorithms for (restricted variants of) the Tower of Hanoi problem is still ongoing research (Atkinson, 1981; Pettorossi, 1984; Walsh, 1983; Allouche, 1994; Hinz, 1996). We will come back to this problem in chapter 8.

The plan for hanoi is given in figure 3.9. Again, states are represented abbreviated.

3.1.4.5 The Tower Problem. The blocks-world domain with operators put and puttable was introduced in chapter 2 (see figs. 2.2, 2.3, 2.4, 2.5). We use a variant of the operator definitions given in figure 2.2 where both variants of put are specified as a single operator with conditioned effect. A plan for the goal {on(A, B), on(B, C)} is given in figure 3.10. We omitted the ontable relations for better readability.


Operator: move(?d, ?from, ?to)
PRE: {on(?d, ?from), clear(?d), clear(?to), smaller(?to, ?d)}
ADD: {on(?d, ?to), clear(?from)}
DEL: {on(?d, ?from), clear(?to)}

Goal: {on(d3, p3), on(d2, d3), on(d1, d2)}

Dom-G: {on(d3, p3), on(d2, d3), on(d1, d2), clear(d1), clear(p1), clear(p2),
        smaller(p1, d1), smaller(p1, d2), smaller(p1, d3),
        smaller(p2, d1), smaller(p2, d2), smaller(p2, d3),
        smaller(p3, d1), smaller(p3, d2), smaller(p3, d3),
        smaller(d1, d2), smaller(d1, d3), smaller(d2, d3)}

Fig. 3.8. The DPlan Hanoi Problem

[Figure: universal plan for hanoi, a DAG over all 27 states of the three-disc problem; states are abbreviated as triples of peg contents, e.g. [()(3)(12)], and edges are (M ?d ?from ?to) move actions. The goal state [()()(123)] is the root.]

Fig. 3.9. Universal Plan for Hanoi


[Figure: universal plan for the goal {on(A, B), on(B, C)}, a DAG with root ((ON A B) (ON B C) (CT A)) over the thirteen blocks-world states of three blocks; edges are (PUT ?x ?y) and (PUTTABLE ?x) actions. (CT = clear; ontable omitted for better readability)]

Fig. 3.10. Universal Plan for Tower

3.2 Optimal Full Universal Plans

A DPlan plan, as constructed with the algorithm reported in table 3.1, is a set of action-state pairs ⟨o, s⟩. For the goal state(s) it holds that o = nil. For every other state contained in a DPlan plan it holds that Res(o, s) = s' (see sect. 3.1.1); that is, for each state s except the goal state(s), there exists at least one state s' ≠ s into which s can be directly transformed by application of an action o. Consequently, a DPlan plan constitutes a directed graph with states as nodes and actions as edges. Because a state which is already contained in the plan is never introduced again on a later level (PruneStates in tab. 3.1), the resulting graph is also acyclic.

Definition 3.2.1 (Universal Plan). A plan constructed with the DPlan algorithm given in table 3.1 constitutes a directed acyclic graph (DAG) Π = (S, R) with S = ProjectActions(Plan) as set of states and R as set of edges. The set of edges is given by the pre-images (see tab. 3.1): R ⊆ S × S = {⟨o, s'⟩ | Res(o, s) = s'}. (6) We call Π a universal plan.

A DPlan plan is constructed levelwise, starting with the states fulfilling the top-level goals as root nodes, and the plan contains no cyclic paths. Therefore, all paths leading from a fixed state to the root have equal length, and each state can be assigned a uniquely determined natural number representing the number of action applications needed to transform it into a goal state.

Definition 3.2.2 (Universal Plan as Order over States). Each node s ∈ S for which no predecessor Res(o, s) exists has path-length g(s) = 0 (s is a root node and s ⊇ G). Each node s ∈ S with Res(o, s) = s' and g(s') = i has path-length g(s) = i + 1 (s is a node with s ⊉ G).

(6) In contrast to the standard definition of labelled graphs (Christofides, 1975), we do not discriminate between nodes/edges and labels of nodes/edges.


The path-length g(s) ∈ ℕ constitutes an order over the states in Π with s < s' iff g(s) < g(s').
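The levelwise backward construction and the resulting order g can be illustrated in Python (a minimal re-rendering of the idea behind the algorithm in table 3.1, with a hand-coded successor relation for the clearblock states; all names are ours, not DPlan's):

```python
from collections import deque

def universal_plan(states, goal_states, successors):
    """Levelwise backward construction of a universal plan (sketch).

    successors(s) yields (action, s2) pairs with Res(action, s) = s2;
    `states` must be closed under this relation.  Returns (g, edges)
    where g maps each covered state to its optimal distance-to-goal and
    edges contains the labelled plan edges (action, s, s2)."""
    # invert the successor relation to obtain the pre-images
    pred = {s: [] for s in states}
    for s in states:
        for action, s2 in successors(s):
            pred[s2].append((action, s))
    g = {s: 0 for s in goal_states}           # goal states are the roots
    edges = set()
    frontier = deque(goal_states)
    while frontier:                           # breadth-first = levelwise
        s2 = frontier.popleft()
        for action, s in pred[s2]:
            if s not in g:                    # PruneStates: keep shortest paths
                g[s] = g[s2] + 1
                edges.add((action, s, s2))
                frontier.append(s)
    return g, edges

# the three clearblock states of fig. 3.1, with puttable as only operator
s_tower, s_bc, s_flat = "ABC-tower", "BC-tower", "all-on-table"
succ = {s_tower: [("PUTTABLE A", s_bc)],
        s_bc:    [("PUTTABLE B", s_flat)],
        s_flat:  []}
g, edges = universal_plan([s_tower, s_bc, s_flat], [s_flat], succ.__getitem__)
```

For clearblock this reproduces the total order of figure 3.2: g assigns 0, 1, 2 to the three states, and the two plan edges carry the puttable actions.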

In contrast to standard universal planning, DPlan constructs a plan which contains all states which can be transformed into a state satisfying the given top-level goals:

Definition 3.2.3 (Full Universal Plan). A universal plan Π = (S, R) is called full with respect to given top-level goals G if s ∈ S holds for all states s for which there exists a transformation sequence Res(o, s) = s_i, ..., Res(o_i, s_i) = s_(i-1), ..., Res(o_1, s_1) = s_0 with s_0 ⊇ G.

For example, the universal plan for rocket (see fig. 3.5), defined for three objects and a single rocket, contains transformation sequences for all legal states which can be defined over C = {O1, O2, O3, Rocket}.

The notion of a full universal plan presupposes that planning with DPlan is complete. Additionally, we require that the universal plan does not contain states which cannot be transformed into a goal state, that is, DPlan must be sound. Completeness and soundness of DPlan are discussed in section 3.3. For proving optimality of DPlan we assume completeness:

Lemma 3.2.1 (Universal Plans are Optimal). For a full universal plan, each path from a state s to a root node contains the minimal number of edges, that is, the minimal number of actions necessary to transform s into a goal state.

Proof (Universal Plans are Optimal). We give an informal proof by induction over the construction of Π = (S, R) with the DPlan algorithm given in table 3.1:

Plan construction starts with a set of goal states S_G = {s | G ⊆ s ∧ s ∈ S}. Since the goal states constitute root nodes in Π, it holds that g(s) = 0 for all s ∈ S_G.

For all nodes s with g(s) = i > 0, i gives the minimal number of actions to transform s into a goal state. Either there exists a state s' with Res(o, s') = s, or no such state exists. In the second case, s is a leaf in Π. In the first case, one of the following conditions holds:

1. s' was already introduced at a higher level with g(s') ≤ g(s): ⟨o, s'⟩ will not be included in the plan (see PruneStates in tab. 3.1).
2. s' is not contained at a higher level: ⟨o, s'⟩ will be included in the plan, and g(s') = i + 1, which is the minimal path-length for s'.

The length of minimal paths in a graph constitutes a metric. That is, for all paths it holds that if the transformation sequence from the start node to the goal node of the path is optimal, then the transformation sequences for all intermediate nodes on the path are optimal, too.

As a consequence of the optimality of DPlan, each path in Π from some state to a goal state represents an optimal solution for this state. In general, there might be multiple optimal paths from a state to the goal. For example, for rocket (see sect. 3.1.4.2) there are 3! different possible sequences for unloading and loading three objects. For the leaf node in figure 3.5 there are 36 different optimal solution paths contained in the universal plan.

A DPlan plan can be reduced to a minimal spanning tree (Christofides, 1975) by deleting edges such that each node which is not the root node has exactly one predecessor. The minimal spanning tree contains exactly one optimal solution path for each state. A minimal spanning tree for rocket is given in figure 3.11.
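A minimal sketch of this reduction (our own Python rendering; the state names s0, s1a, s1b, s2 are hypothetical rocket-like states, not DPlan output):

```python
def spanning_tree(edges):
    """Reduce a universal plan to a spanning tree (sketch): keep exactly
    one outgoing (action, successor) edge per non-root state, dropping
    alternative optimal paths.  edges holds triples (action, s, s_next);
    sorting makes the kept edge deterministic."""
    tree = {}
    for action, s, s_next in sorted(edges):
        if s not in tree:
            tree[s] = (action, s_next)
    return tree

# hypothetical rocket-like diamond: two optimal load orders from s2
edges = {("LOAD O1", "s2", "s1a"), ("LOAD O2", "s2", "s1b"),
         ("LOAD O2", "s1a", "s0"), ("LOAD O1", "s1b", "s0")}
tree = spanning_tree(edges)
```

After the reduction each non-root state keeps exactly one predecessor edge, so exactly one optimal path to the root remains per state.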

[Figure: minimal spanning tree for rocket, containing the same sixteen states as the universal plan in figure 3.5, but with exactly one (LOAD ?o), (UNLOAD ?o), or (MOVE-ROCKET) predecessor edge per state.]

Fig. 3.11. Minimal Spanning Tree for Rocket

3.3 Termination, Soundness, Completeness

3.3.1 Termination of DPlan

For finite domains, termination of the DPlan algorithm given in table 3.1 is guaranteed:

Lemma 3.3.1 (Termination of DPlan). For finite domains, a situation (NextStates = CurrentStates) will be reached.

This holds because the growth of the number of states during plan construction is monotonic:

Proof (Termination of DPlan). We denote the plan after the t-th iteration of pre-image calculation with Plan_t. From the definition of the DPlan algorithm it follows that NextStates = ProjectActions(Plan).


If calculating a pre-image only returns state descriptions which are already contained in plan Plan_t, then (NextStates = CurrentStates) holds and DPlan terminates after t iterations. If calculating a pre-image returns at least one state description which is not already contained in the plan, then these state descriptions are included in the plan and |Plan_(t+1)| > |Plan_t|. Because planning domains are assumed to be finite, there is only a fixed number of different state descriptions, and a fixed point |Plan_(t+1)| = |Plan_t|, i.e., a situation (NextStates = CurrentStates), will be reached after a finite number of steps t.

3.3.2 Operator Restrictions

Soundness and completeness of plan construction are determined by the soundness and completeness of operator application. State-based backward planning has some inherent restrictions which we describe in the following. We repeat the definitions of forward and backward operator application given in definitions 2.1.6 and 2.1.7:

Res(o, s) = s \ DEL ∪ ADD              if PRE ⊆ s
Res⁻¹(o, s) = s \ ADD ∪ (DEL ∪ PRE)    if ADD ⊆ s.

For forward application, \DEL and ∪ADD are only commutative for ADD ∩ DEL = ∅. (7) When literals from the DEL-list are deleted before literals from the ADD-list are added, as defined above, we guarantee that everything given in the ADD-list really holds in the new state.

Completeness and soundness of backward planning mean that the following diagram must commute:

    s            ==  Res(o, s')
    ↓                    ↑
    Res⁻¹(o, s)  ==  s'

If Res⁻¹(o, s) = s' results in s = Res(o, s') for all s, backward planning is sound. If for all s' with Res(o, s') = s it holds that Res⁻¹(o, s) = s', backward planning is complete. In the following we prove that these propositions hold with some restrictions.

Soundness:

(7) In PDDL, ADD ∩ DEL = ∅ must always hold because the effects are given in a single list with DEL-effects as negated literals. Therefore, an expression (and p (not p)) would represent a contradiction and is not allowed as specification of an operator effect.


Res(o, Res⁻¹(o, s)) ?= s

Res(o, Res⁻¹(o, s))
  = Res⁻¹(o, s) \ DEL ∪ ADD
        with PRE ⊆ Res⁻¹(o, s)
  = {s \ ADD ∪ {DEL ∪ PRE}} \ DEL ∪ ADD
        with PRE ⊆ {s \ ADD ∪ {DEL ∪ PRE}} and ADD ⊆ s
  = {s ∪ {DEL ∪ PRE}} \ DEL
        with PRE ⊆ {s \ ADD ∪ {DEL ∪ PRE}}
  = {s ∪ DEL} \ DEL     if PRE ⊆ s ∪ DEL
  = s                   if s ∩ DEL = ∅.

The restriction PRE ⊆ s ∪ DEL means, for forward operator application transforming s' into s, that PRE holds after operator application if it is not explicitly deleted. In backward application, PRE is introduced as a list of sub-goals and constraints which have to hold in s'. Therefore, this restriction is unproblematic. The restriction s ∩ DEL = ∅ means for forward application that s contains no literal from the DEL-list. For backward planning, all literals from the DEL-list are added, and s contained none of these literals. This restriction is in accordance with the definition of legal operators. Thus, soundness of backward application of a Strips operator is given. In general, soundness of backward planning can be guaranteed by introducing a consistency check for each constructed predecessor: for a newly constructed state s', s = Res(o, s') must hold. If forward operator application does not result in s, the constructed predecessor is considered not admissible and is not introduced into the plan.
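The forward consistency check for constructed predecessors can be sketched as follows (our own Python set encoding of literals; instantiated puttable(A) from the clearblock domain):

```python
def res(state, pre, add, dele):
    """Forward application: Res(o, s) = s \\ DEL u ADD, if PRE is a subset of s."""
    return (state - dele) | add if pre <= state else None

def res_inv(state, pre, add, dele):
    """Backward application: Res-1(o, s) = s \\ ADD u (DEL u PRE), if ADD is a subset of s."""
    return (state - add) | dele | pre if add <= state else None

def sound_predecessor(state, pre, add, dele):
    """Admit a constructed predecessor s2 only if Res(o, s2) = state."""
    s2 = res_inv(state, pre, add, dele)
    if s2 is not None and res(s2, pre, add, dele) == state:
        return s2
    return None

# instantiated puttable(A) from the clearblock domain
pre  = {("clear", "A"), ("on", "A", "B")}
add  = {("ontable", "A"), ("clear", "B")}
dele = {("on", "A", "B")}
s = {("on", "B", "C"), ("clear", "A"), ("clear", "B"),
     ("ontable", "A"), ("ontable", "C")}
s_pred = sound_predecessor(s, pre, add, dele)
```

Here the constructed predecessor is the tower state, and forward application of puttable(A) maps it back to s, so the predecessor is admitted.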

Completeness:

Res⁻¹(o, Res(o, s')) ?= s'

Res⁻¹(o, Res(o, s'))
  = Res(o, s') \ ADD ∪ {DEL ∪ PRE}
        with ADD ⊆ Res(o, s')
  = (s' \ DEL ∪ ADD) \ ADD ∪ {DEL ∪ PRE}
        with ADD ⊆ (s' \ DEL ∪ ADD) and with PRE ⊆ s'
  = s' \ DEL ∪ {DEL ∪ PRE}
        with PRE ⊆ s'     if s' ∩ ADD = ∅
  = s' ∪ PRE              with PRE ⊆ s'   if DEL ⊆ s'
  = s'.


The restriction s' ∩ ADD = ∅ means that only such states can be constructed which do not already contain a literal added by an applicable operator. The restriction DEL ⊆ s' means that only such states can be constructed which contain all literals which are deleted by an applicable operator. While this is a real source of incompleteness, we are still looking for a meaningful domain where these cases occur. In general, incompleteness can be overcome, at a loss of efficiency, by constructing predecessors in the following way:

Res⁻¹(o, s) = s \ ADD* ∪ (DEL* ∪ PRE)

for all combinations of subsets ADD* of ADD and DEL* of DEL with ADD* ⊆ s. Thus, all possible states containing subsets of DEL and ADD, with the special case of inserting all literals from DEL and deleting all literals from ADD, would be constructed. Of course, most of these states might not be sound!
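This generalized predecessor construction over subsets of ADD and DEL can be sketched with itertools (our own encoding; the one-element literal lists p, a, d are made up for illustration):

```python
from itertools import chain, combinations

def powerset(xs):
    """All subsets of xs as tuples."""
    xs = list(xs)
    return chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))

def res_inv_general(state, pre, add, dele):
    """All candidate predecessors s \\ ADD* u (DEL* u PRE) over subsets
    ADD* of ADD with ADD* contained in s, and DEL* of DEL.  Most
    candidates are not sound and would have to be filtered afterwards
    by a forward consistency check."""
    candidates = set()
    for add_sub in map(set, powerset(add)):
        if not add_sub <= state:
            continue
        for del_sub in map(set, powerset(dele)):
            candidates.add(frozenset((state - add_sub) | del_sub | pre))
    return candidates

# made-up one-literal lists: PRE = {p}, ADD = {a}, DEL = {d}
cands = res_inv_general({("a",), ("p",)}, {("p",)}, {("a",)}, {("d",)})
```

Even this tiny example already yields four candidate predecessors, which illustrates the efficiency loss the text mentions.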

For more general operator definitions, as allowed by PDDL, further restrictions arise. For example, it is necessary that secondary preconditions are mutually exclusive.

Another definition of sound and complete backward operator application is given by Haslum and Geffner (2000): for s ∩ ADD ≠ ∅ and s ∩ DEL = ∅, Res⁻¹(o, s) = s \ ADD ∪ PRE. For DEL ⊆ PRE, this definition is equivalent to our definition.

3.3.3 Soundness and Completeness of DPlan

Definition 3.3.1 (Soundness of DPlan). A universal plan Π is sound if each path p = {⟨nil, s_0⟩, ⟨o_1, s_1⟩, ..., ⟨o_n, s_n⟩} defines a solution. That is, for an initial state corresponding to s_n, the sequence of actions o_n ∘ ... ∘ o_1 transforms s_n into a goal state.

Soundness of DPlan follows from the soundness of backward operator application as defined in section 3.3.2.
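Checking that a path defines a solution amounts to forward-executing its action sequence (a sketch in Python with the instantiated clearblock operators; the encoding is ours, not DPlan's):

```python
def res(state, op):
    """Forward Strips application; op = (PRE, ADD, DEL) as literal sets."""
    pre, add, dele = op
    return (state - dele) | add if pre <= state else None

def is_solution(start, ops, goal):
    """A path defines a solution iff executing its action sequence from
    the deepest state reaches a state satisfying the top-level goals."""
    state = start
    for op in ops:
        state = res(state, op)
        if state is None:          # some action was not applicable
            return False
    return goal <= state

def puttable(x, y):                # instantiated clearblock operator
    return ({("clear", x), ("on", x, y)},
            {("ontable", x), ("clear", y)},
            {("on", x, y)})

start = {("on", "A", "B"), ("on", "B", "C"), ("clear", "A"), ("ontable", "C")}
ok = is_solution(start, [puttable("A", "B"), puttable("B", "C")],
                 {("clear", "C")})
```

The clearblock path of figure 3.2 passes this check, while skipping the first action fails because puttable(B) is not applicable in the start state.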

Definition 3.3.2 (Completeness of DPlan). A universal plan Π is complete if the plan contains all states s for which an action sequence transforming s into a goal state exists.

Completeness of DPlan follows from the completeness of backward operator application as defined in section 3.3.2.

If DPlan works on a domain restriction D given as a set of states, only such states are included in the plan which also occur in D. From soundness and completeness of DPlan it also follows that, for a given set of states D, DPlan terminates with D = ProjectActions(Plan) if the state space underlying D is (strongly) connected. (8) If DPlan terminates with ProjectActions(Plan) ⊂ D, then D \ ProjectActions(Plan) only contains states from which no goal state in ProjectActions(Plan) is reachable. One reason for non-reachability can be that the state space is not connected, that is, it contains states on which no operator is applicable. For example, in the restricted blocks-world domain where only a puttable but no put operator is given (see sect. 3.1.4.1), no operator is applicable in a state where all blocks are lying on the table. As a consequence, a goal on(x, y) is not reachable from such a state. A second reason for non-reachability can be that the state space is weakly connected (i.e., is a directed graph) such that it contains states s, s' where Res(o, s) = s' is defined but not Res(o', s') = s. For example, in the rocket domain (see sect. 3.1.4.2), it is not possible to reach a state where the rocket is on the earth from a state where the rocket is on the moon. As a consequence, a goal at(Rocket, Earth) is not reachable from any state where the rocket is on the moon.

(8) For definitions of connectedness of graphs, see for example (Christofides, 1975).


4. Integrating Function Application in State-Based Planning

Standard planning is defined for logical formulas over variables and constants. Allowing general terms, that is, application of arbitrary symbol-manipulating and numerical functions, enlarges the scope of problems for which plans can be constructed. In the context of using planning as a starting point for the synthesis of functional programs, we are interested in applying planning to standard programming problems, such as list sorting. In the following, we will first give a motivation for extending planning to function application (sect. 4.1); then an extension of the Strips language presented in section 2.1 is introduced (sect. 4.2), together with some extensions (sect. 4.3); finally, examples for planning with function application are given (sect. 4.4). (1)

4.1 Motivation

The development of efficient planning algorithms in the nineties (see sect. 2.4.2 in chap. 2), such as Graphplan, SAT planning, and heuristic planning (Blum & Furst, 1997; Kautz & Selman, 1996; Bonet & Geffner, 1999), has made it possible to apply planning to more demanding real-world domains. Examples are the logistics and the elevator domain (Bacchus et al., 2000). Many realistic domains involve manipulation of numerical objects. For example, when planning (optimal) routes for delivering objects as in the logistics domain, it might be necessary to take into account time and other resource constraints such as fuel; plan construction for landing a space vehicle involves calculations for the correct adjustment of thrusts (Pednault, 1987). One obvious way to deal with numerical objects is to assign and update their values by means of function applications.

There is some initial work on extending planning formalisms to deal with resource constraints (Koehler, 1998; Laborie & Ghallab, 1995). Here, function application is restricted to the manipulation of specially marked resource variables in the operator effect. For example, the amount of fuel might be decreased in relation to the distance an airplane flies: $gas -= distance(?x ?y)/3 (Koehler, 1998). A more general approach for including function applications into planning was proposed by Pednault (1987) within the action description language (ADL). In addition to ADD/DEL operator effects, ADL allows variable updates by assigning new values which can be calculated by arbitrary functions. For example, the put(?x, ?y) operator in a blocks-world domain might be extended to update the state variable ?LastBlockMoved by ?LastBlockMoved := ?x; the amount of water in a jug ?j2 might be changed by a pour(?j1, ?j2) action to ?j2 := min[?c2, ?j1 + ?j2], where ?j1, ?j2 are the current quantities of water in the jugs and ?c2 is the capacity of jug ?j2 (Pednault, 1994).

(1) The chapter is based on the paper Schmid, Muller, and Wysotzki (2000).
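The jug update can be sketched as a plain numeric state update (a minimal Python illustration; the helper pour and its signature are our own, not Pednault's notation):

```python
def pour(j1, j2, c2):
    """ADL-style update for pour(?j1, ?j2): the receiving jug gets
    j2 := min(c2, j1 + j2); the poured amount leaves jug 1.
    Amounts j1, j2 and the capacity c2 are plain numbers."""
    poured = min(c2 - j2, j1)
    return j1 - poured, j2 + poured
```

Pouring 5 units into a jug holding 2 of 4 fills it to capacity and leaves 3 behind; pouring 1 unit simply transfers it.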

Introducing functions into planning not only makes it possible to deal with numerical values in a more general way than allowed for by a purely relational language, but has several additional advantages (see also Geffner, 2000), which we will illustrate with the Tower of Hanoi domain (see fig. 4.1). (2)

Allowing not only numerical but also symbol-manipulating functions makes it possible to model operators in a more compact and sometimes also more natural way. For example, representing which disks are lying on which peg in the Tower of Hanoi domain could be realized by a predicate on(?disks, ?peg), where ?disks represents the ordered sequence of disks on peg ?peg, with a list containing only the dummy bottom disk representing an empty peg. Instead of modeling a state change by adding and deleting literals from the current state, arguments of the predicates can be changed by applying standard list-manipulation functions (e.g., built-in Lisp functions like car(l) for returning the first element of a list, cdr(l) for returning a list without its first element, or cons(x, l) for inserting a symbol x in front of a list l).
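The functional move operator sketched here can be rendered in Python, with pegs as lists and math.inf standing in for the dummy bottom disk (the dictionary encoding of states and the names are our own assumptions, not the book's Lisp code):

```python
import math

INF = math.inf   # dummy bottom disk: larger than every real disk

def car(l):  return l[0]
def cdr(l):  return l[1:]
def cons(x, l):  return [x] + l

def move(state, i, j):
    """Move the top disk from peg i to peg j if the precondition
    car(l_i) != dummy and car(l_i) < car(l_j) holds; states map peg
    names to disk lists, smallest disk first."""
    li, lj = state[i], state[j]
    if car(li) == INF or not car(li) < car(lj):
        return None                      # operator not applicable
    new = dict(state)
    new[j] = cons(car(li), lj)           # push moved disk onto target peg
    new[i] = cdr(li)                     # pop it from the source peg
    return new

s = {"p1": [1, 2, 3, INF], "p2": [INF], "p3": [INF]}
s = move(s, "p1", "p3")                  # disk 1 moves to p3
```

Note that the update only rewrites the two list arguments; no clear or smaller literals are needed.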

A further advantage is that objects can be referred to in an indirect way. In the hanoi example, car(l) refers to the object which is currently the uppermost disk on a peg. There is no need for any additional fluent predicate besides on(?disks, ?peg); the clear(?disk) predicate given in the standard definition becomes superfluous. Geffner (2000) points out that indirect reference substantially reduces the number of possible ground atoms and, in consequence, the number of possible actions; thus plan construction becomes more efficient. Indirect object reference additionally allows for modeling infinite domains while the state representations remain small and compact. For example, car(l) gives us the top disk of a peg regardless of how many disks are involved in the planning problem.

Finally, introducing functions in modeling planning domains often makes it possible to get rid of static predicates. For example, the smaller(?x, ?y) predicate in the hanoi domain can be eliminated. By representing disks as numbers, the built-in predicate "<" can be used to check whether the application constraint for the hanoi move operator is satisfied in the current

(2) Note that we use prefix notation p(x_1, ..., x_n) for relations and functions throughout the paper.


(a) Standard representation

Operator: move(?d, ?from, ?to)
PRE: {on(?d, ?from), clear(?d), clear(?to), smaller(?d, ?to)}
ADD: {on(?d, ?to), clear(?from)}
DEL: {on(?d, ?from), clear(?to)}

Goal: {on(d3, p3), on(d2, d3), on(d1, d2)}

Initial State: {on(d3, p1), on(d2, d3), on(d1, d2), clear(d1), clear(p2), clear(p3),
                smaller(d1, p1), smaller(d1, p2), smaller(d1, p3),
                smaller(d2, p1), smaller(d2, p2), smaller(d2, p3),
                smaller(d3, p1), smaller(d3, p2), smaller(d3, p3),
                smaller(d1, d2), smaller(d1, d3), smaller(d2, d3)}

(b) Functional representation

Operator: move(?pi, ?pj)
PRE: {on(?li, ?pi), on(?lj, ?pj), (car(?li) ≠ ∞), car(?li) < car(?lj)}
UPDATE: change ?lj in on(?lj, ?pj) to cons(car(?li), ?lj)
        change ?li in on(?li, ?pi) to cdr(?li)

Goal: {on([∞], p1), on([∞], p2), on([1 2 3 ∞], p3)}            ; ∞ represents a dummy
Initial State: {on([1 2 3 ∞], p1), on([∞], p2), on([∞], p3)}   ; bottom disk

Fig. 4.1. Tower of Hanoi (a) Without and (b) With Function Application

state. Allowing arbitrary boolean operators to express preconditions generalizes matching of literals to constraint satisfaction.

The need to extend standard relational specification languages to more expressive languages, resulting in planning algorithms applicable to a wider range of domains, is recognized in the planning community. The PDDL planning language, for instance (McDermott, 1998b), the current standard for planners, incorporates all features which extend ADL over classical Strips. But while there is a clear operational semantics for conditional effects and quantification (Koehler et al., 1997), function application is dealt with in a largely ad-hoc manner (see also the remark in Geffner, 2000).

The only published approach allowing functions in domain and problem specifications is Geffner (2000). He proposed Functional Strips, extending standard Strips to support first-order function symbols. He gives a denotational state-based semantics for actions and an operational semantics, describing action effects as updates of state representations. In contrast to the classical relational languages, states are not represented as sets of (ground) literals but as state variables (cf. situation calculus). For example, the initial position of disk d1 is represented as loc(d1) = d0 (see fig. 4.2).

While in Functional Strips relations are completely replaced by functional expressions, our approach allows a combination of relational and functional expressions. We extended the Strips planning language to a subset of first-order logic, where relational symbols are defined over terms, and terms can be variables, constant symbols, or functions with arbitrary terms as arguments. Our state representations are still sets of literals, but we introduce a

76 4. Integrating Function Application in Planning

Domains: Peg: p1, p2, p3 ; the pegs
Disk: d1, d2, d3 ; the disks
Disk*: Disk, d0 ; the disks and a dummy bottom disk d0
Fluents: top: Peg → Disk* ; denotes the top disk on a peg
loc: Disk → Disk* ; denotes the disk below a given disk
size: Disk* → Integer ; represents disk size
Action: move(?pi, ?pj : Peg) ; moves between pegs
Prec: top(?pi) ≠ d0, size(top(?pi)) < size(top(?pj))
Post: top(?pi) := loc(top(?pi)); loc(top(?pi)) := top(?pj); top(?pj) := top(?pi)
Init: loc(d1) = d0, loc(d2) = d1, loc(d3) = d2, loc(d4) = d3,
top(p1) = d4, top(p2) = d0, top(p3) = d0,
size(d0) = 4, size(d1) = 3, size(d2) = 2, size(d3) = 1, size(d4) = 0
Goal: loc(d1) = d0, loc(d2) = d1, loc(d3) = d2, loc(d4) = d3, top(p3) = d4

Fig. 4.2. Tower of Hanoi in Functional Strips (Geffner, 1999)

second class of relational symbols (called constraints) which are defined over arbitrary functional expressions, as shown in figure 4.1. Standard representations containing only literals defined over constant symbols are included as a special case, and we can combine ADD/DEL effects and updates in operator definitions.

4.2 Extending Strips to Function Applications

In the following we introduce an extension of the Strips language (see def. 2.1.1) from propositional representations to a larger subset of first-order logic, allowing general terms as arguments of relational symbols. Operators are defined over these more general formulas, resulting in a more complex state transition function.

Definition 4.2.1 (FPlan Language). The language L(X, C, F, R_P ∪ R_C) is defined over sets of variables X, constant symbols C, function symbols F, and relational symbols R = R_P ∪ R_C in the following way:
- Variables x ∈ X are terms.
- Constant symbols c ∈ C are terms.
- If f ∈ F is a function symbol with arity α(f) = i and t1, t2, ..., ti are terms, then f(t1, t2, ..., ti) is a term.
- If p ∈ R_P is a relational symbol with arity α(p) = j and t1, t2, ..., tj are terms, then p(t1, t2, ..., tj) is a formula.
- If r ∈ R_C is a relational symbol with arity α(r) = j and t1, t2, ..., tj are terms, then r(t1, t2, ..., tj) is a formula. We call formulas with relational symbols from R_C constraints.
For short, we write p(t).
- If p1(t1), p2(t2), ..., pk(tk) are formulas with p1, p2, ..., pk ∈ R, then {p1(t1), p2(t2), ..., pk(tk)} is a formula, representing the conjunction of literals.
- There are no other FPlan formulas.

Remarks:
- Formulas consisting of a single relational symbol are called (positive) literals.
- Terms over C ∪ F, i.e., terms without variables, are called ground terms. Formulas over ground terms are called ground formulas.
- With X(F) we denote the variables occurring in formula F.

An example of a state representation, goal definition, and operator definition using the FPlan language is given in figure 4.1.b for the hanoi domain. FPlan is instantiated in the following way: X = {?pi, ?pj, ?li, ?lj}, C = {p1, p2, p3, 1, 2, 3, ∞}, F = {[]_i, car_1, cdr_1, cons_2}, R_P = {on_2}, R_C = {≠_2, <_2}.

A state in the hanoi domain, such as {on([2 3 ∞], p1), on([∞], p2), on([1 ∞], p3)}, is given as a set of atoms where the arguments can be arbitrary ground terms. We can use constructor functions to define complex data structures. For example, [2 3 ∞] is a list of constant symbols, where [] is a list constructor function defined over an arbitrary number of elements. The atom on([2 3 ∞], p1) corresponds to the evaluated expression on(cdr([1 2 3 ∞]), p1), where cdr(?l) returns a list without its first element. State transformation by updating is defined by such evaluations.

Relational symbols are divided into two categories: symbols in R_P and symbols in R_C. Symbols in R_P denote relations which characterize a problem state. For example, on([1 2 3 ∞], p1) with on ∈ R_P is true in a state where disks 1, 2, and 3 are lying on peg p1. Symbols in R_C denote additional characteristics of a state, which can be inferred from a formula over R_P. For example, car(L1) ≠ ∞ with ≠ ∈ R_C is true for L1 = [1 2 3 ∞]. Relations over R_C are often useful for representing constraints, especially constraints which must hold for all states, that is, static predicates.

Preconditions are defined as arbitrary formulas over R_P ∪ R_C. Standard preconditions, such as p = on(?l1, ?x), can be transformed into constraints: If match(p, s) results in θ1 = {?l1 ← [1 2 3 ∞], ?x ← p1}, then the constraints eq(?x, p1) and eq(?l1, [1 2 3 ∞]), with eq as equality test for symbols in R_C, must hold.

We allow constraints to be defined over free variables, i.e., variables that cannot be instantiated by matching an operator precondition with a current state. For example, variables ?i and ?j in the constraint nth(?i, ?l) < nth(?j, ?l) might be free. To restrict the possible instantiations of such variables, they must be declared together with a range in the problem specification (see sect. 4.4.5 for an example).

Definition 4.2.2 (Free Variables in R_C). For a formula F ∈ L(X, C, F, R_P ∪ R_C), variables occurring only in literals over R_C are called free. The set of such variables is denoted as X_C ⊆ X, and with X_C(F) we refer to the free variables in F.

For all variables x in X_C instantiations must be restricted by a range R(x). For variables belonging to an ordered data type (e.g., natural numbers), a range is declared as R(x) = [min, max] with min giving the smallest and max the largest value x is allowed to assume. Alternatively, for categorial data types, a range is defined by enumerating all possible values the variable can assume: R(x) = [v1, ..., vn].

Definition 4.2.3 (State Representation). A problem state s is a conjunction of atoms over relational symbols in R_P. That is, s ∈ L(C, F, R_P).

We presuppose that terms are always evaluated if they are grounded. Evaluation of terms is defined in the usual way (Field & Harrison, 1988; Ehrig & Mahr, 1985):

Definition 4.2.4 (Evaluation of Ground Terms). A term t over L(C, F) is evaluated in the following way:
- eval(c) = c, for c ∈ C
- eval(f(t1, ..., tn)) = apply(f(eval(t1), ..., eval(tn))), for f ∈ F and t1, ..., tn as terms over C ∪ F.
Function application apply returns a unique value for f(c1, ..., cn). For constructor functions (such as the list constructor []) we define apply(fc(c1, ..., cn)) = fc(c1, ..., cn).

For example, eval(cdr([1 2 3 ∞])) returns [2 3 ∞]. Evaluation of eval(plus(3, minus(5, 1))) returns 7. Evaluated states are sets of relational symbols over constant symbols and complex structures, represented by constructor functions over constant symbols. For example, the relational symbol on in the state description {on([2 3 ∞], p1), on([∞], p2), on([1 ∞], p3)} has constant arguments p1, p2, and p3, and complex arguments [2 3 ∞], [∞], and [1 ∞].

Relational symbols in R_C can be considered as special terms, namely terms that evaluate to a truth value, also called boolean terms. In contrast, relations in R_P are true if they match with the current state and false otherwise. In the following, we will speak of evaluation of expressions when referring to terms including such boolean terms.

Definition 4.2.5 (Evaluation of Expressions over Ground Terms). An expression over L(C, F, R_C) is evaluated in the following way:
- eval(c), eval(f(t1, ..., tn)) as in def. 4.2.4.
- eval(r(t1, ..., tn)) = apply(r(eval(t1), ..., eval(tn))), for r ∈ R_C and t1, ..., tn as terms over C ∪ F ∪ R_C, with eval(r(t1, ..., tn)) ∈ {TRUE, FALSE}.
For a set of atoms over L(C, F, R_C), i.e., a conjunction of boolean expressions, evaluation is extended to:
eval({r1(t1), ..., rn(tn)}) = TRUE if r1(t1) = TRUE and ... and rn(tn) = TRUE, and FALSE otherwise.

For example, {car([1 2 3 ∞]) ≠ ∞, car([1 2 3 ∞]) < car([∞])} evaluates to TRUE. In our planner, evaluation of expressions is realized by calling the meta-function eval provided by Common Lisp.
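The two evaluation definitions can be sketched operationally. The following Python model is an illustration only (the planner itself simply delegates to Common Lisp's eval); the tuple encoding of terms, the function table, and the use of float('inf') for the dummy bottom disk ∞ are assumptions:

```python
# Sketch of defs. 4.2.4/4.2.5: evaluation of ground terms and of
# conjunctions of constraints. Terms are nested tuples ('f', t1, ..., tn);
# anything that is not a tuple is a constant symbol.
FUNCS = {  # assumed function table; 'lst' plays the role of the [] constructor
    'car': lambda l: l[0],
    'cdr': lambda l: l[1:],
    'cons': lambda x, l: (x,) + l,
    'plus': lambda a, b: a + b,
    'minus': lambda a, b: a - b,
}
CONSTRAINTS = {  # relational symbols in R_C evaluate to a truth value
    'neq': lambda a, b: a != b,
    'lt': lambda a, b: a < b,
}

def eval_term(t):
    if not isinstance(t, tuple):          # constant symbol: eval(c) = c
        return t
    f, *args = t
    args = [eval_term(a) for a in args]   # evaluate arguments first
    if f == 'lst':                        # constructor: apply(fc(c1..cn)) = fc(c1..cn)
        return tuple(args)
    return FUNCS[f](*args)

def eval_formula(atoms):                  # conjunction of constraints over R_C
    return all(CONSTRAINTS[r](*[eval_term(a) for a in args])
               for (r, *args) in atoms)

INF = float('inf')                        # stands for the dummy bottom disk
l1 = ('lst', 1, 2, 3, INF)
print(eval_term(('cdr', l1)))                      # (2, 3, inf)
print(eval_term(('plus', 3, ('minus', 5, 1))))     # 7
print(eval_formula([('neq', ('car', l1), INF),
                    ('lt', ('car', l1), ('car', ('lst', INF)))]))  # True
```

The two print examples replay eval(cdr([1 2 3 ∞])) and eval(plus(3, minus(5, 1))) from the text; the last call evaluates the constraint conjunction of the move precondition.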

Before introducing goals and operators defined over L(C, F, R), we extend the definitions for substitution and matching (def. 2.1.3):

Definition 4.2.6 (Product of Substitutions). The composition of substitutions θ ∘ θ' is the set of all compatible pairs (x ← t) ∈ θ and (x' ← t') ∈ θ' with x, x' ∈ X and t, t' ∈ L(X, C, F). Substitutions are compatible iff for each (x ← t) ∈ θ there does not exist any (x' ← t') ∈ θ' with x = x' and t ≠ t'.
The product of sets of substitutions Θ = {θ1, ..., θn} and Θ' = {θ'1, ..., θ'm} is defined as the pairwise composition Θ ⊗ Θ' = θi ∘ θ'j for i = 1 ... n, j = 1 ... m.

Definition 4.2.7 (Matching and Assignment). Let F = F_P ∪ F_C be a formula with
F_P = {p1(t1), ..., pi(ti)} ∈ L(X, C, F, R_P) and
F_C = {r1(t'1), ..., rj(t'j)} ∈ L(X, C, F, R_C), and let
A ∈ L(C, F, R_P) be a set of evaluated atoms.
The set of substitutions Θ for F with respect to A is defined as Θ = Θ_P ⊗ Θ_C with F_P θi ⊆ A and eval(F_C θi) = TRUE for all θi ∈ Θ.
Θ_P is calculated by match(F_P, A) as specified in definition 2.1.3.
Θ_C is calculated by ass(X_C) for all free variables x1, ..., xn = X_C(F_C) such that for all θi ∈ Θ_C holds θi = {x1 ← c1, ..., xn ← cn} where ci ∈ R(xi) (constants ci in the range of variable xi, see def. 4.2.2).

Definition 4.2.7 extends matching of formulas with sets of atoms in such a way that only those matches are allowed where the substitutions additionally fulfill the constraints.

Consider the state
A = {on([2 3 ∞], p1), on([∞], p2), on([1 ∞], p3)}.
The formula PRE of the move operator given in figure 4.1.b,
F = {on(?li, ?pi), on(?lj, ?pj), car(?li) ≠ ∞, car(?li) < car(?lj)},
can be instantiated to Θ = {θ1, θ2, θ3} with
θ1 = {?pi ← p1, ?pj ← p2, ?li ← [2 3 ∞], ?lj ← [∞]},
θ2 = {?pi ← p3, ?pj ← p2, ?li ← [1 ∞], ?lj ← [∞]},
θ3 = {?pi ← p3, ?pj ← p1, ?li ← [1 ∞], ?lj ← [2 3 ∞]}
but not to
θ4 = {?pi ← p1, ?pj ← p3, ?li ← [2 3 ∞], ?lj ← [1 ∞]},
θ5 = {?pi ← p2, ?pj ← p1, ?li ← [∞], ?lj ← [2 3 ∞]},
θ6 = {?pi ← p2, ?pj ← p3, ?li ← [∞], ?lj ← [1 ∞]}.

A procedural realization for calculating all instantiations of a formula with respect to a set of atoms is to first calculate all matches and then modify Θ by stepwise introducing the constraints. For the example above, we first calculate Θ = {θ1, θ2, θ3, θ4, θ5, θ6} considering only the sub-formula F_P = {on(?li, ?pi), on(?lj, ?pj)}. Next, the constraint car(?li) ≠ ∞ is introduced, reducing Θ to {θ1, θ2, θ3, θ4}. Finally, we obtain Θ = {θ1, θ2, θ3} by applying the constraint car(?li) < car(?lj).

If the constraints contain free variables, each substitution in Θ is combined with instantiations of these variables as specified in definition 4.2.6. An example for instantiating formulas with free variables is given in section 4.4.5.
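The match-then-filter procedure can be illustrated in a few lines. This is a Python sketch (the tuple encoding of atoms and the helper names are assumptions, not the planner's actual data structures):

```python
# Sketch of the two-stage instantiation: first all matches of
# F_P = {on(?li,?pi), on(?lj,?pj)} against the state, then filtering
# by the constraints of the move operator.
from itertools import permutations

INF = float('inf')   # stands for the dummy bottom disk

def matches(state):
    """All substitutions for the two on-literals against distinct atoms."""
    return [{'?li': li, '?pi': pi, '?lj': lj, '?pj': pj}
            for (_, li, pi), (_, lj, pj) in permutations(state, 2)]

def satisfies(s):
    """Constraints: car(?li) != inf and car(?li) < car(?lj)."""
    return s['?li'][0] != INF and s['?li'][0] < s['?lj'][0]

state = [('on', (2, 3, INF), 'p1'),   # the state A from the text
         ('on', (INF,), 'p2'),
         ('on', (1, INF), 'p3')]
theta = [s for s in matches(state) if satisfies(s)]
print([(s['?pi'], s['?pj']) for s in theta])
# [('p1', 'p2'), ('p3', 'p1'), ('p3', 'p2')] -- the three legal moves
```

Exactly the three substitutions θ1-θ3 of the example survive the constraint filter; θ4-θ6 are removed.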

Definition 4.2.8 (Goal Representation). A goal G is a formula in L(X, C, F, R).

The goal given for hanoi in figure 4.1.b can alternatively be represented by:
{on([1 2 3 ∞], p3), on(?li, ?pi), on(?lj, ?pj), ∞ = car(?li), ∞ = car(?lj)}.

Definition 4.2.9 (FPlan Operator). An FPlan operator op is described by preconditions (PRE), ADD and DEL lists, and updates (UP) with PRE = PRE_P ∪ PRE_C ∈ L(X, C, F, R_P ∪ R_C) and ADD, DEL ∈ L(X, C, F, R_P).
An update UP is a list of function applications, specifying that a function f(t) ∈ L(X, C, F) is applied to a term t'i which is an argument of a literal p(t') with p ∈ R_P.

Updates overwrite the value of an argument of a literal. Realizing an update using ADD/DEL effects would mean to first calculate the new value of an argument, then delete the literals in which this argument occurs, and finally add the literals with the new argument values. That is, modeling updates as a specific process is more efficient. An example for an update effect is given in figure 4.1.b:
change ?li in on(?li, ?pi) to cdr(?li)
defines that variable ?li in literal on(?li, ?pi) is updated to cdr(?li). If more than one update effect is given in an operator, the updates are performed sequentially from the uppermost to the last update. An update effect cannot contain free variables, that is, X(UP) ⊆ X(PRE).

Definition 4.2.10 (Updating a State). For a state s = {p1(t1), ..., pn(tn)} and an instantiated operator o with update effects u ∈ UP, updating is defined as the replacement t'i := eval(f(t)) for a fixed argument t'i in a fixed literal pj(t') ∈ s with X(t), X(t') ⊆ X(PRE). For short we write update(u, s).
Applying a list of updates UP = (u1, ..., un) to a state s is defined as update(UP, s) = update(un, update(u(n-1), ..., update(u1, s))).

For the initial state of hanoi in figure 4.1, s = {on([1 2 3 ∞], p1), on([∞], p2), on([∞], p3)}, the update effects of the move operator can be instantiated to
u1 = change [∞] in on([∞], p3) to cons(car([1 2 3 ∞]), [∞])
u2 = change [1 2 3 ∞] in on([1 2 3 ∞], p1) to cdr([1 2 3 ∞]).
Applying these updates to s results in:
update((u1, u2), {on([1 2 3 ∞], p1), on([∞], p2), on([∞], p3)}) =
update(u2, {on([1 2 3 ∞], p1), on([∞], p2), on(eval(cons(car([1 2 3 ∞]), [∞])), p3)}) =
update(u2, {on([1 2 3 ∞], p1), on([∞], p2), on([1 ∞], p3)}) =
{on(eval(cdr([1 2 3 ∞])), p1), on([∞], p2), on([1 ∞], p3)} =
{on([2 3 ∞], p1), on([∞], p2), on([1 ∞], p3)}.

Note that the operator is fully instantiated before the update effects are calculated. The current substitution θ ∈ Θ is not affected by updating. That is, if the value of an instantiated variable is changed, as for example when L2 = [∞] is changed to L2 = [1 ∞], this change remains local to the literal specified in the update.
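A minimal sketch of sequential update application, replaying the hanoi example (the encoding of literals and updates as tuples is an assumption for illustration):

```python
# Sketch of update(UP, s) (def. 4.2.10): updates are applied sequentially,
# each overwriting one argument of one literal.
INF = float('inf')

def update(u, state):
    pred, new = u                        # "change OLD in <literal> to NEW"
    return [(r, new if (r, a, p) == pred else a, p) for (r, a, p) in state]

def car(l): return l[0]
def cdr(l): return l[1:]
def cons(x, l): return (x,) + l

s = [('on', (1, 2, 3, INF), 'p1'), ('on', (INF,), 'p2'), ('on', (INF,), 'p3')]
# u1: change [inf] in on([inf], p3) to cons(car([1 2 3 inf]), [inf])
u1 = (('on', (INF,), 'p3'), cons(car((1, 2, 3, INF)), (INF,)))
# u2: change [1 2 3 inf] in on([1 2 3 inf], p1) to cdr([1 2 3 inf])
u2 = (('on', (1, 2, 3, INF), 'p1'), cdr((1, 2, 3, INF)))
for u in (u1, u2):                       # uppermost update first
    s = update(u, s)
print(s)
# [('on', (2, 3, inf), 'p1'), ('on', (inf,), 'p2'), ('on', (1, inf), 'p3')]
```

Because the full literal (including the peg) identifies the update target, the change stays local: the identical list [∞] on p2 is untouched.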

Definition 4.2.11 ((Forward) Operator Application). For a state s and an instantiated operator o, operator application is defined as Res(o, s) = update(UP, (s \ DEL(o)) ∪ ADD(o)) if PRE_P(o) ⊆ s and eval(PRE_C(o)) = TRUE.

ADD/DEL effects are always calculated before updating. Examples for operators with combined ADD/DEL and update effects are given in section 4.4.
An FPlan planning problem is given as P(O, I, G), where the operators O, initial states I, and goals G are defined over the FPlan language, which extends Strips by allowing function applications. A plan for the hanoi problem specified in figure 4.1.b is given in figure 4.3.
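Forward application of the functional move operator, specialized from definition 4.2.11, might look as follows (Python sketch; the peg-to-list dictionary encoding is an assumption, and there are no ADD/DEL effects in this operator):

```python
# Sketch of Res(o, s) for the move operator of fig. 4.1.b:
# check PRE_C, then apply the two updates.
INF = float('inf')   # dummy bottom disk

def move(state, pi, pj):
    li, lj = state[pi], state[pj]
    if li[0] == INF or not li[0] < lj[0]:    # car(?li) != inf, car(?li) < car(?lj)
        return None                          # operator not applicable
    new = dict(state)
    new[pj] = (li[0],) + lj                  # cons(car(?li), ?lj)
    new[pi] = li[1:]                         # cdr(?li)
    return new

s = {'p1': (1, 2, 3, INF), 'p2': (INF,), 'p3': (INF,)}
s = move(s, 'p1', 'p3')
print(s)                       # {'p1': (2, 3, inf), 'p2': (inf,), 'p3': (1, inf)}
print(move(s, 'p1', 'p3'))     # None: disk 2 may not go on top of disk 1
```

The second call fails the constraint car(?li) < car(?lj), mirroring the substitution θ4 rejected in the matching example.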

4.3 Extensions of FPlan

4.3.1 Backward Operator Application

As described in section 2.1 in chapter 2, plan construction can be performed using either forward or backward search in the state space. To make FPlan applicable for backward planners, we introduce backward operator application.
Backward operator application for Strips (see def. 2.1.7) was defined so that an operator is backward applicable in a state s if its ADD list can be

((ON [1 2 3 ∞] P1) (ON [∞] P2) (ON [∞] P3))
(MOVE P1 P3)
((ON [2 3 ∞] P1) (ON [∞] P2) (ON [1 ∞] P3))
(MOVE P1 P2)
((ON [3 ∞] P1) (ON [2 ∞] P2) (ON [1 ∞] P3))
(MOVE P3 P2)
((ON [3 ∞] P1) (ON [1 2 ∞] P2) (ON [∞] P3))
(MOVE P1 P3)
((ON [∞] P1) (ON [1 2 ∞] P2) (ON [3 ∞] P3))
(MOVE P2 P1)
((ON [1 ∞] P1) (ON [2 ∞] P2) (ON [3 ∞] P3))
(MOVE P2 P3)
((ON [1 ∞] P1) (ON [∞] P2) (ON [2 3 ∞] P3))
(MOVE P1 P3)
((ON [∞] P1) (ON [∞] P2) (ON [1 2 3 ∞] P3))

Fig. 4.3. A Plan for Tower of Hanoi

matched with the current state, resulting in a predecessor state s' where all elements of the ADD list are removed and which contains the literals of PRE and DEL. For operators containing update effects and constraints, the definition of backward operator application has to be extended with respect to applicability conditions and calculation of effects.

An operator is backward applicable if its ADD list can be matched with the current state s and if those constraints that hold after forward operator application hold in s. We call such constraints the postcondition (POST) of an operator. In many domains, the constraints holding in a state after forward operator application are the inverse of the constraints specified in the operator precondition. For example, for the hanoi domain, the constraints for forward application are {car(?li) ≠ ∞, car(?li) < car(?lj)} where ?li represents the disks lying on peg ?pi and ?lj represents the disks lying on peg ?pj (see fig. 4.1.b). After executing move(?pi, ?pj), the "top" disk of ?li is inserted as head of list ?lj, and the constraints {car(?lj) ≠ ∞, car(?lj) < car(?li)} hold. The first constraint must hold because a disk is moved to peg ?pj and therefore ?lj cannot be the empty peg, represented as list [∞]. The second constraint must hold because all legal states in the hanoi domain contain stacks of disks with increasing size. Because x < y holds for list ?li = [x y ..], and because x is the new head element of ?lj after executing move, the new head element of list ?li (y) is larger than the new head element of list ?lj (x).

To calculate a backward update effect, the inverse function f⁻¹ of the update function f must be known. In general, updating can involve a sequence of function applications, that is, f = f1 ∘ ... ∘ fn. To make backward plan construction compatible with forward plan execution, update effects must be restricted to bijective functions: f(x) = y ↔ f⁻¹(y) = x. For the move operator in the hanoi domain, the inverse of removing the top element of a tower, i.e., the head element of the list ?li, and inserting it as the head of ?lj is to remove the first element of ?lj and insert it as the head of ?li.

We do not propose an algorithm for the automatic construction of correct constraints for backward application and inverse functions. Instead, we require that for each operator op its inverse operator op⁻¹ has to be defined in the domain specification. It is the responsibility of the author of a domain to guarantee that op⁻¹ is the inverse operator of op; that is, that Res(o, s') = s holds for Res⁻¹(o⁻¹, s) = s' for all operators.

When defining an inverse operator, the (inverse) precondition represents the conditions which must hold after forward application of the original operator, i.e., the inverse precondition corresponds to the original postcondition; the updates represent the inverse function as discussed above. The inverse operator move⁻¹ for hanoi is:

Operator: move⁻¹(?pi, ?pj)
PRE: {on(?li, ?pi), on(?lj, ?pj), car(?lj) ≠ ∞, car(?li) > car(?lj)}
UPDATE: change ?li in on(?li, ?pi) to cons(car(?lj), ?li)
change ?lj in on(?lj, ?pj) to cdr(?lj)

Because update effects are explicitly defined for inverse operator application, updates are performed as specified in definition 4.2.10.
For the state s = {on([2 3 ∞], p1), on([∞], p2), on([1 ∞], p3)}, backward application of the instantiated operator move⁻¹(p1, p3) results in s' = {on([1 2 3 ∞], p1), on([∞], p2), on([∞], p3)}.

For operators defined with mixed ADD/DEL and update effects the inverse operator is also defined explicitly, with inverted precondition, ADD and DEL lists. Application of an inverted operator can then be performed as specified in definition 4.2.11; that is, backward operator application is replaced by forward application of the inverted operator. Examples for backward application of operators with combined ADD/DEL and update effects are given in section 4.4.
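The consistency requirement Res(o, s') = s for s' = Res⁻¹(o⁻¹, s) can be checked mechanically for a concrete operator pair. A Python sketch for the hanoi move/move⁻¹ pair (encoding assumed, as in the earlier sketches):

```python
# Sketch: forward move and its explicitly defined inverse move^-1,
# plus a round-trip check on the backward-application example from the text.
INF = float('inf')

def move(state, pi, pj):                 # forward: PRE over car(?li)
    li, lj = state[pi], state[pj]
    if li[0] == INF or not li[0] < lj[0]:
        return None
    return {**state, pj: (li[0],) + lj, pi: li[1:]}

def move_inv(state, pi, pj):             # inverse: postcondition + inverted updates
    li, lj = state[pi], state[pj]
    if lj[0] == INF or not li[0] > lj[0]:
        return None
    return {**state, pi: (lj[0],) + li, pj: lj[1:]}

s = {'p1': (2, 3, INF), 'p2': (INF,), 'p3': (1, INF)}
s_pred = move_inv(s, 'p1', 'p3')         # backward application of move(p1, p3)
print(s_pred)                            # {'p1': (1, 2, 3, inf), 'p2': (inf,), 'p3': (inf,)}
assert move(s_pred, 'p1', 'p3') == s     # forward replay reproduces s
```

The final assertion is exactly the author's obligation Res(o, s') = s, verified here for one state.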

4.3.2 Introducing User-Defined Functions

As defined in section 4.2, the set of function symbols F and the set of relational symbols R_C of FPlan can contain arbitrary built-in and user-defined functions. When parsing a problem specification, each symbol which is part of Common Lisp and each symbol corresponding to the name of a user-defined function is treated as an expression to be evaluated. Symbols in R_C have to evaluate to a truth value.

An example specification of hanoi using built-in and user-defined functions is given in figure 4.4.³ For the Tower of Hanoi domain the FPlan language is instantiated to X = {?pi, ?pj, ?li, ?lj}, C = {p1, p2, p3, 1, 2, 3, nil}, F = {[]_i, car_1, cdr_1, cons_2}, R_P = {on_2}, R_C = {not-empty_1, legal_2}. The empty list [] corresponds to nil. The user-defined functions can be defined over additional built-in and user-defined functions which are not included in L. For a Lisp-implemented planner like our system DPlan, after reading in a problem specification all declared user-defined functions are appended to the set of built-in Lisp functions and interpreted in the usual manner.

4.4 Examples

In the previous sections, we presented FPlan with a functional version of the hanoi domain, demonstrating how operators with ADD/DEL effects can alternatively be modeled with update effects. In the following, we will give a variety of examples for problem specifications with FPlan. First we will show how problems involving resource constraints can be modeled. Afterwards, we will give examples for a numerical domain and domain specifications involving

³ For our planning system, functions are defined in Lisp syntax, for example:
(defun legal (l1 l2) (if (null l2) T (< (car l1) (car l2)))).

Operator: move(?pi, ?pj)
PRE: {on(?li, ?pi), on(?lj, ?pj), not-empty(?li), legal(?li, ?lj)}
UPDATE: change ?lj in on(?lj, ?pj) to cons(car(?li), ?lj)
change ?li in on(?li, ?pi) to cdr(?li)
Goal: {on([], p1), on([], p2), on([1 2 3], p3)}
Initial State: {on([1 2 3], p1), on([], p2), on([], p3)}
Functions: not-empty(l) = not(null(l))
legal(l1, l2) = if null(l2) then TRUE else <(car(l1), car(l2))

Fig. 4.4. Tower of Hanoi with User-Defined Functions
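A Python rendering of figure 4.4's user-defined functions and the operator they guard (the system itself takes them as Lisp defuns, as the footnote shows; tuples stand in for Lisp lists, with () as the empty peg):

```python
# Sketch of fig. 4.4: with user-defined predicates, the dummy bottom
# disk is no longer needed and the empty peg is just the empty list.
def not_empty(l):
    return len(l) > 0                    # not(null(l))

def legal(l1, l2):
    return True if len(l2) == 0 else l1[0] < l2[0]

def move(state, pi, pj):
    li, lj = state[pi], state[pj]
    if not (not_empty(li) and legal(li, lj)):
        return None                      # precondition fails
    return {**state, pj: (li[0],) + lj, pi: li[1:]}

s = {'p1': (1, 2, 3), 'p2': (), 'p3': ()}
print(move(s, 'p1', 'p3'))   # {'p1': (2, 3), 'p2': (), 'p3': (1,)}
print(move(s, 'p2', 'p3'))   # None: p2 is empty
```

Note how legal subsumes both constraints of figure 4.1.b: the empty-peg case replaces the comparison with the dummy disk ∞.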

operators with mixed ADD/DEL effects and updates. We will show how standard programming problems can be modeled in planning, and finally we will present preliminary ideas for combining planning with constraint logic programming.

For selected problems we include empirical comparisons, demonstrating that plan construction with function updates is significantly more efficient than plan construction using ADD/DEL effects. We tested all examples with our backward planner DPlan (Schmid & Wysotzki, 2000b). Therefore, we present standard forward operator definitions together with the inverse operators as specified in section 4.3.1.

4.4.1 Planning with Resource Variables

Planning with resource constraints can be modeled as a special case of function updates with FPlan. As an example we use an airplane domain (Koehler, 1998). A problem specification in FPlan is given in figure 4.5.

Operator: fly(?plane, ?x, ?y)
PRE: {at(?plane, ?x), fuelresource(?plane, ?fuel), timeresource(?plane, ?time),
?fuel ≥ distance(?x, ?y)/3}
ADD: {at(?plane, ?y)}
DEL: {at(?plane, ?x)}
UPDATE: change ?fuel in fuelresource(?plane, ?fuel) to calc-fuel(?fuel, ?x, ?y)
change ?time in timeresource(?plane, ?time) to calc-time(?time, ?x, ?y)
Goal: {at(p1, berlin)}
Initial State: {at(p1, london), fuelresource(p1, 750), timeresource(p1, 0)}
Functions: calc-fuel(?fuel, ?x, ?y) = ?fuel − distance(?x, ?y)/3
calc-time(?time, ?x, ?y) = ?time + 3/20 · distance(?x, ?y)

Fig. 4.5. A Problem Specification for the Airplane Domain

The operator fly specifies what happens when an airplane ?plane flies from airport ?x to airport ?y. We consider two resources: the amount of fuel and the amount of time the plane needs to fly from one airport to another. In the initial state the tank is filled to capacity (fuelresource(750)) and no time has yet been spent (timeresource(0)). A resource constraint that must hold before the action of flying the plane from one airport ?x to another airport ?y can be carried out would be: at(?plane, ?x), ?fuel ≥ distance(?x, ?y)/3. That is, the plane has to be at airport ?x and there has to be enough fuel in the tank to travel the distance from ?x to ?y.

The distance between two airports is obtained by calling the function distance(?x, ?y) which returns the distance value by consulting an underlying database (see table 4.1). The distances can alternatively be modeled as static predicates (for example distance(berlin, paris, 540), etc.). Using a database query function is more efficient than modeling the distances with static predicates, because static predicates have to be encoded in the current state, and finding a certain distance (instantiating the literal) requires matching, which takes considerably longer than a database query.

Table 4.1. Database with Distances between Airports

         | Berlin | Paris  | London | New York
Berlin   | -      | 540 m  | 570 m  | 3960 m
Paris    | 540 m  | -      | 210 m  | 3620 m
London   | 570 m  | 210 m  | -      | 3460 m
New York | 3960 m | 3620 m | 3460 m | -
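The database query and the two resource updates can be sketched together (Python sketch; the distances come from table 4.1, and calc_fuel/calc_time render the figure's function names):

```python
# Sketch of the fly operator's resource handling: a symmetric database
# lookup for distance, plus the two update functions of fig. 4.5.
DIST = {('berlin', 'paris'): 540, ('berlin', 'london'): 570,
        ('berlin', 'new-york'): 3960, ('paris', 'london'): 210,
        ('paris', 'new-york'): 3620, ('london', 'new-york'): 3460}

def distance(x, y):
    return DIST.get((x, y)) or DIST[(y, x)]   # table is symmetric

def calc_fuel(fuel, x, y):
    return fuel - distance(x, y) / 3          # fuel consumed per distance

def calc_time(time, x, y):
    return time + 3 / 20 * distance(x, y)     # time spent per distance

# precondition of fly: enough fuel for the leg london -> berlin
fuel, time = 750, 0
assert fuel >= distance('london', 'berlin') / 3
fuel = calc_fuel(fuel, 'london', 'berlin')
time = calc_time(time, 'london', 'berlin')
print(fuel, time)   # 560.0 85.5
```

One application of fly from london to berlin thus consumes 190 units of fuel and 85.5 time units, exactly as the update functions prescribe.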

The position of the plane ?plane is modeled with the literal at. When flying from ?x to ?y, the literal at(?plane, ?x) is deleted from and the literal at(?plane, ?y) is added to the set of literals describing the current state. The resource variable ?fuel described by the relational symbol fuelresource is updated with the result of the user-defined function calc-fuel, which calculates the consumption of fuel according to the traveled distance. The resource variable ?time described by the relational symbol timeresource is updated in a similar way, assuming that it takes 3/20 · distance(?x, ?y) to fly the distance from airport ?x to ?y.

When ADD/DEL and update effects occur together in one operator, the update effect is carried out on the state obtained after calculating the ADD/DEL effects (definition 4.2.11). For backward planning we have to specify a corresponding inverse operator fly⁻¹ (see fig. 4.6). The precondition of fly⁻¹ requests that the plane be at airport ?y and that there is more time left than it takes to fly the distance between airport ?x and airport ?y. The resource variables ?fuel and ?time are updated with the results of the inverse functions (calc-fuel⁻¹ and calc-time⁻¹).

Updating of state variables as proposed in FPlan is more flexible than handling resource variables separately (as in Koehler (1998)). While in Koehler (1998) fuel and time are global variables, in FPlan the current value of fuel

Operator: fly⁻¹(?plane, ?x, ?y)
PRE: {at(?plane, ?y), timeresource(?plane, ?time), fuelresource(?plane, ?fuel),
?time ≥ distance(?x, ?y) · 3/20}
ADD: {at(?plane, ?x)}
DEL: {at(?plane, ?y)}
UPDATE: change ?fuel in fuelresource(?plane, ?fuel) to calc-fuel⁻¹(?fuel, ?x, ?y)
change ?time in timeresource(?plane, ?time) to calc-time⁻¹(?time, ?x, ?y)
Functions: calc-fuel⁻¹(?fuel, ?x, ?y) = min(?fuel + distance(?x, ?y)/3, 750)
; 750 is the maximum capacity of the tank
calc-time⁻¹(?time, ?x, ?y) = ?time − 3/20 · distance(?x, ?y)

Fig. 4.6. Specification of the Inverse Operator fly⁻¹ for the Airplane Domain

and time are arguments of the relational symbols fuelresource(?plane, ?fuel) and timeresource(?plane, ?time). Therefore, while modeling a problem involving more than one plane can easily be done in FPlan, this is not possible with Koehler's approach. For example, we can specify the following goal and initial state:

Goal: {at(p1, berlin), at(p2, paris)}
Initial State: {at(p1, berlin), fuelresource(p1, 750), timeresource(p1, 0),
at(p2, paris), fuelresource(p2, 750), timeresource(p2, 0)}

Modeling domains with time or cost resources is simple when function applications are allowed. Typical examples are job scheduling problems, for example the machine shop domain presented in (Veloso et al., 1995). To model time steps, a relational symbol time(?t) can be introduced. Time ?t is initially zero and each operator application results in ?t being incremented by one step. To model a machine that is occupied during a certain time interval, a relational symbol occupied(?m, ?o) can be used, where ?m represents the name of a machine and ?o the last time slot where it is occupied, with ?o = 0 representing that the machine is free to be used. For each operator involving the usage of a machine, a precondition requesting that the machine is free for the current time slot can be introduced. If a machine is free to be used, the occupied relation is updated by adding the current time and the amount of time steps the executed action requires. It can also be modeled that occupation time does not only depend on the kind of action performed but also on the kind of object involved (e.g., polishing a large object could need three time steps, while polishing a small object needs only one time step).
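The occupation bookkeeping described above might be sketched as follows (a hedged illustration: the action durations, the dictionary encoding, and the helper name try_action are all assumptions, not part of the machine shop domain's actual specification):

```python
# Sketch: time(?t) and occupied(?m, ?o) as state components.
# Each action increments time by one step and books the machine
# until the slot current-time + duration.
DURATION = {('polish', 'large'): 3, ('polish', 'small'): 1}  # assumed durations

def try_action(state, machine, action, obj):
    t = state['time']
    if state['occupied'][machine] > t:        # machine not free in this slot
        return None
    new = {'time': t + 1, 'occupied': dict(state['occupied'])}
    new['occupied'][machine] = t + DURATION[(action, obj)]
    return new

s = {'time': 0, 'occupied': {'m1': 0}}
s = try_action(s, 'm1', 'polish', 'large')
print(s)                                       # {'time': 1, 'occupied': {'m1': 3}}
print(try_action(s, 'm1', 'polish', 'small'))  # None: m1 is booked until t = 3
```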

4.4.2 Planning for Numerical Problems: The Water Jug Domain

Numerical domains, such as the water jug domain presented by Pednault (1994), cannot be modeled with a Strips-like representation. Figure 4.7 shows an FPlan specification of the water jug domain: We have three jugs of different volumes and different capacities. The operator pour models the action of pouring water from one jug ?j1 into another jug ?j2 until either ?j1 is empty or ?j2 is filled to capacity. The capacities of the jugs are statics while the actual volumes are fluents. The resulting volume ?v1 for jug ?j1 is either zero or what remains in the jug when pouring as much water as possible into jug ?j2: max[0, ?v1 − ?c2 + ?v2]. The resulting volume ?v2 for jug ?j2 is either its capacity or its previous volume plus the volume of the first jug ?j1: min[?c2, ?v1 + ?v2].

Operator: pour(?j1, ?j2)
PRE: {volume(?j1, ?v1), volume(?j2, ?v2), capacity(?j2, ?c2),
(?v2 < ?c2), (?v1 > 0)}
UPDATE: change ?v1 in volume(?j1, ?v1) to max(0, ?v1 − ?c2 + ?v2)
change ?v2 in volume(?j2, ?v2) to min(?c2, ?v1 + ?v2)

Operator: pour⁻¹(?j1, ?j2)
PRE: {volume(?j1, ?v1), volume(?j2, ?v2), capacity(?j1, ?c1), capacity(?j2, ?c2),
((?v2 = ?c2) or (?v1 = 0))}
UPDATE: change ?v1 in volume(?j1, ?v1) to min(?c1, ?v1 + ?v2)
change ?v2 in volume(?j2, ?v2) to max(0, ?v2 − ?c1 + ?v1)

Statics: {capacity(a, 36), capacity(b, 45), capacity(c, 54)}
Goal: {volume(a, 25), volume(b, 0), volume(c, 52)}
Initial State: {volume(a, 16), volume(b, 7), volume(c, 34)}

Fig. 4.7. A Problem Specification for the Water Jug Domain

After pouring water from one jug into another, the postcondition (?v2 = ?c2) or (?v1 = 0) must hold. That is, the volume ?v1 of the first jug ?j1 must be zero or the volume ?v2 of the second jug ?j2 must equal its capacity ?c2. When specifying the inverse operator pour^-1 for backward planning, this postcondition becomes the precondition of pour^-1. The update functions are inverted: The volume ?v1 of the first jug ?j1 is updated with the result of min(?c1, ?v1 + ?v2) and the volume ?v2 of the second jug ?j2 is updated with the result of max(0, ?v2 - ?c1 + ?v1). The resulting plan for the given problem specification is shown in figure 4.8.

For this numerical problem there exists no backward plan covering all initial states which can be transformed into the goal by forward operator application. For instance, for the initial state volume(a, 16), volume(b, 27), volume(c, 34), the following action sequence results in a goal state:

volume(a, 16), volume(b, 27), volume(c, 34)   pour(c, b)
volume(a, 16), volume(b, 45), volume(c, 16)   pour(b, a)
volume(a, 36), volume(b, 25), volume(c, 16)   pour(a, c)
volume(a, 0),  volume(b, 25), volume(c, 52)   pour(b, a)
volume(a, 25), volume(b, 0),  volume(c, 52)
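The action sequence can be replayed mechanically with the pour update terms of fig. 4.7. A self-contained sketch (the state representation is our own simplification, not DPlan's):

```python
# Capacities are the statics of fig. 4.7; the pour semantics below follows
# the update terms max(0, v1 - c2 + v2) and min(c2, v1 + v2).
CAP = {'a': 36, 'b': 45, 'c': 54}

def pour(state, src, dst):
    v1, v2, c2 = state[src], state[dst], CAP[dst]
    moved = min(v1, c2 - v2)              # water that actually flows
    new = dict(state)
    new[src], new[dst] = v1 - moved, v2 + moved
    return new

state = {'a': 16, 'b': 27, 'c': 34}
for src, dst in [('c', 'b'), ('b', 'a'), ('a', 'c'), ('b', 'a')]:
    state = pour(state, src, dst)
print(state)  # {'a': 25, 'b': 0, 'c': 52} -- the goal state
```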


((VOLUME A 25) (VOLUME B 0) (VOLUME C 52))
((VOLUME A 0) (VOLUME B 25) (VOLUME C 52))
((VOLUME A 36) (VOLUME B 25) (VOLUME C 16))
((VOLUME A 16) (VOLUME B 45) (VOLUME C 16))
((VOLUME A 16) (VOLUME B 7) (VOLUME C 54))
((VOLUME A 36) (VOLUME B 7) (VOLUME C 34))
(POUR-1 B A)
(POUR-1 A C)
(POUR-1 B A)
(POUR-1 C B)
(POUR-1 C A)

Fig. 4.8. A Plan for the Water Jug Problem

but the initial state is not a legal predecessor of volume(a, 16), volume(b, 45), volume(c, 16) when applying the pour^-1 operator. That is, backward planning is incomplete! The reason for this incompleteness is that the initial state in forward planning can be any arbitrary amount of water in the jugs, while for backward planning it has to be a state obtainable by operator application. A remedy for this problem is to introduce a second operator filljugs(?j1, ?j2, ?j3) which has no precondition. This operator models the initial filling up of the jugs from a water source.

4.4.3 Functional Planning for Standard Problems: Tower of Hanoi

A main motivation for modeling operator effects with updates of state variables by function application, given by Geffner (2000), is that plan construction can be performed more efficiently. We give an empirical demonstration of this claim by comparing the running times when planning for the standard specification with those for the functional specification of Tower of Hanoi (see fig. 4.1). The inverse operator move^-1 was specified in section 4.3.1.

The running times were obtained with our system DPlan on an Ultra Sparc 5 system. DPlan is a universal planner implemented in Lisp. Because DPlan is based on a breadth-first strategy and because we did not focus on efficiency (for example, we did not use OBDD representations; Cimatti et al., 1998), it has longer running times than state-of-the-art planners. Since we are interested in comparing the performance when planning for the different specifications, we do not give the absolute times but the performance gain in percent, calculated as (st - ft) * 100 / st, where st is the running time for the standard specification and ft the running time for the functional specification (see table 4.2).^4

Because in the standard representation the relations between the sizes of the different disks must be specified explicitly by static relations, the number of relational symbols is quite large (18 literals for three disks, 25 for four disks, etc.). In the functional representation, built-in functions can be called to determine whether a disk of size 2 is smaller than a disk of size 3. The number of literals is very small (3 literals for an arbitrary number of disks). Consequently, planning for the standard representation requires more matching than planning for the functional one, where the constraint of the precondition has to be solved instead.

Table 4.2. Performance of FPlan: Tower of Hanoi

                     Tower of Hanoi
Number of disks        3       4       5
States                27      81     243
Performance gain   96.3%   98.7%   99.8%

4.4.4 Mixing ADD/DEL Effects and Updates: Extended Blocksworld

In this section we present an extended version of the standard blocks-world (see fig. 4.9). First we introduce an operator clearblock which clears a block ?x by putting the block lying on top of ?x on the table if block topof(?x) is clear. That is, a block is addressed in an indirect manner as the result of a function evaluation (Geffner, 2000). As terms can be nested, this corresponds to an infinite type. For a state where on(A, B) holds, the function topof(B) returns A, while for a state where on(C, B) holds, topof(B) returns C. The standard blocks-world operators put and puttable are also formulated using the topof function. Furthermore, all operators are extended by an update for the value of LastBlockMoved (Pednault, 1994). Note that we used a conditional effect for specifying the put operator. Our current implementation allows conditional effects for ADD/DEL effects but not for updates.

The update effect changing the value of LastBlockMoved cannot be handled in backward planning because we cannot define inverse operators including an inverse update of the argument LastBlockMoved. The inverse function would have to return the block that will be moved in the next step. At the current time step we do not know which block this will be.

^4 The absolute values are (standard/functional): 3 disks: 34.5 sec/1.26 sec; 4 disks: 335.89 sec/4.47 sec; 5 disks: 10015.59 sec/19.3 sec; 6 disks (729 states): >15 h/62.28 sec.
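The table entries can be recomputed from these absolute running times with the gain formula (st - ft) * 100 / st; a quick check:

```python
# Absolute running times (standard st, functional ft) in seconds,
# taken from the footnote; keys are the numbers of disks.
times = {3: (34.5, 1.26), 4: (335.89, 4.47), 5: (10015.59, 19.3)}

gains = {disks: round((st - ft) * 100 / st, 1)
         for disks, (st, ft) in times.items()}
print(gains)  # {3: 96.3, 4: 98.7, 5: 99.8} -- matching table 4.2
```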


(a) Operator: clearblock(?x)
    PRE: {?y = topof(?x), clear(?y)}
    ADD: {clear(?x)}
    DEL: {on(?y, ?x)}
    UPDATE: change ?block in LastBlockMoved(?block) to ?y

(b) Operator: puttable(?x)
    PRE: {?x = topof(?y), clear(?x)}
    ADD: {clear(?y)}
    DEL: {on(?x, ?y)}
    UPDATE: change ?block in LastBlockMoved(?block) to ?x

(c) Operator: put(?x, ?y)
    PRE: {clear(?x), clear(?y)}
    ADD: {on(?x, ?y)}
    DEL: {clear(?y)}
    WHEN on(?x, ?z) ADD {on(?x, ?y), clear(?z)}
                    DEL {clear(?y), on(?x, ?z)}
    UPDATE: change ?block in LastBlockMoved(?block) to ?x

Functions: topof(?x) = if on(?y, ?x) then ?y else nil

Fig. 4.9. Blocks-World Operators with Indirect Reference and Update
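The indirect reference via topof can be illustrated by evaluating it against a concrete state. A small sketch (the state representation as a set of on pairs is ours, not DPlan's):

```python
# topof as in fig. 4.9: if on(?y, ?x) holds, return ?y, else nil.
# A state is modeled as a set of (y, x) pairs meaning "y lies on x".
def topof(block, state):
    for (y, x) in state:
        if x == block:
            return y
    return None        # corresponds to nil: nothing lies on block

state = {('A', 'B'), ('B', 'C')}   # A on B, B on C
print(topof('B', state))            # A -- as in the on(A, B) example
```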

Other extended blocks-world domains can be modeled with FPlan. For example, one can include a robot agent who stacks and unstacks blocks, with an associated energy level which decreases with each action it performs. The robot's energy can be described with the relational symbol energy(?robot, ?energy) and is consumed according to the weight of the actual block being carried, by updating the variable ?energy in energy(?robot, ?energy).

4.4.5 Planning for Programming Problems: Sorting of Lists

With the extension to functional representation, a number of programming problems, for example list sorting algorithms such as bubble, merge, or selection sort, can be planned more efficiently. We have specified selection sort in the standard and the functional way (see fig. 4.10).

In the functional representation we can use the built-in function ">" instead of the predicate greater, the constructor for lists slist, and the function nth(n, L) to reference the nth element of the list L, instead of specifying the position of each element in the list by using the literal isat(?x, ?i). In the functional representation the indices into the list, ?i and ?j, are free variables, which for lists with three elements can range from zero to two. When calculating the possible instantiations for the operator swap, the variables ?i, ?j can be assigned any value within their range for which the precondition holds (definition 4.2.2). The function swap is defined as an external function. The inverse function of swap is the function swap itself.


(a) Standard representation

Operator: swap(?i, ?j, ?x)
  PRE: {isat(?x, ?i), isat(?y, ?j), greater(?x, ?y)}
  ADD: {isat(?x, ?j), isat(?y, ?i)}
  DEL: {isat(?x, ?i), isat(?y, ?j)}

Initial State: {isat(3, p1), isat(2, p2), isat(1, p3)}
Goal: {isat(1, p1), isat(2, p2), isat(3, p3)}

(b) Functional representation

Operator: swap(?i, ?j, ?x)
  PRE: {slist(?x), nth(?i, ?x) > nth(?j, ?x)}
  UPDATE: change ?x in slist(?x) to swap(?i, ?j, ?x)

Variables: {(?i :range (0 2)) (?j :range (0 2))}
Initial State: {slist([3 2 1])}
Goal: {slist([1 2 3])}
Functions: swap(?i, ?j, ?x) = (let ((temp (nth j x)))
                                (setf (nth j x) (nth i x)
                                      (nth i x) temp)
                                x)

Fig. 4.10. Specification of Selection Sort
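The external swap function transcribes directly; a Python sketch (unlike the destructive Lisp version, this copy-based variant leaves the input list untouched):

```python
# swap as in fig. 4.10: exchange the elements at positions i and j.
def swap(i, j, xs):
    ys = list(xs)                  # copy instead of setf-style mutation
    ys[i], ys[j] = ys[j], ys[i]
    return ys

# Applicable whenever nth(i) > nth(j); note that swap is its own inverse,
# which is what makes backward planning with it possible.
print(swap(0, 2, [3, 2, 1]))       # [1, 2, 3] -- the goal list
```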

Table 4.3. Performance of FPlan: Selection Sort

                           Selection Sort
Number of list elements      3       4       5
States                       6      24     120
Performance gain         27.3%   81.3%   96.1%


The performance gain when planning for the functional representation is large (table 4.3) but not as stunning as in the Hanoi domain.^5 This may be due to the free variables ?i and ?j that have to be assigned a value according to their range. Consequently, the constraint solving takes somewhat longer but is still faster than the matching of the literals for the standard version (6 for three list elements, 10 for four, etc., versus only one literal for the functional representation).

4.4.6 Constraint Satisfaction as Special Case of Planning

Logical programs can be defined over rules and facts (Sterling & Shapiro, 1986). Rules correspond to planning operators. Facts are similar to static relations in planning, that is, they are assumed to be true in all situations. Constraint logic programming extends standard logic programming so that constraints can be used to efficiently restrict variable bindings (Frühwirth & Abdennadher, 1997). In figure 4.11 an example for a program in constraint Prolog is given. The problem is to compose a light meal consisting of an appetizer, a main dish, and a dessert which all together contains not more than a given calorie value (see Frühwirth & Abdennadher, 1997).

lightmeal(A, M, D) :- 1400 >= I + J + K, I > 0, J > 0, K > 0,
                      appetizer(A, I), main(M, J), dessert(D, K).

appetizer(radishes, 50).   main(beef, 1000).   dessert(icecream, 600).
appetizer(soup, 300).      main(pork, 1200).   dessert(fruit, 90).

Fig. 4.11. Lightmeal in Constraint Prolog

The same problem can be specified in FPlan (figure 4.12). The goal, consisting of a conjunction of literals, now contains free variables and a set of constraints determining the values of those variables. In this case ?x, ?y, ?z are free variables that can be assigned any value within their range (here enumerations of dishes). The ranges of the variables ?x, ?y, ?z are specified for the domain and not for a problem. The function calories looks up the calorie value of a dish in an association list.

There are no operators specified for this example. The planning process in this case solves the goal constraint by finding those combinations of dishes that satisfy the constraint, for example a meal consisting of radishes with 50 calories, beef with 1000 calories, and fruit with 90 calories.

^5 The absolute values are (standard/functional): 3 elem.: 0.33 sec/0.24 sec; 4 elem.: 8.93 sec/1.67 sec; 5 elem.: 547.05 sec/21.57 sec; 6 elem. (720 states): >15 h/717.41 sec.


Domain: Lightmeal
Variables: {(?x :range (radishes, soup))
            (?y :range (beef, pork))
            (?z :range (fruit, icecream))}
Initial State: { }   ; empty
Goal: {lightmeal(?x, ?y, ?z),
       (calories(?x) + calories(?y) + calories(?z)) <= 1400}
Functions: calories(?dish) = (cadr (assoc ?dish calorie-list))
           calorie-list := ((radishes 50) (beef 1000) (fruit 90)
                            (icecream 600) (soup 300) (pork 1200) ...)

Fig. 4.12. Problem Specification for Lightmeal
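Since there are no operators, solving this goal amounts to enumerating the variable ranges against the constraint; a minimal Python sketch of that process (our own transcription, not the planner's implementation):

```python
# Calorie association list from fig. 4.12, rendered as a dict.
calories = {'radishes': 50, 'soup': 300, 'beef': 1000,
            'pork': 1200, 'fruit': 90, 'icecream': 600}

# Enumerate the ranges of ?x, ?y, ?z and keep the bindings that
# satisfy the goal constraint calories(x) + calories(y) + calories(z) <= 1400.
meals = [(x, y, z)
         for x in ('radishes', 'soup')
         for y in ('beef', 'pork')
         for z in ('fruit', 'icecream')
         if calories[x] + calories[y] + calories[z] <= 1400]
print(meals)  # includes ('radishes', 'beef', 'fruit') with 1140 calories
```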

5. Conclusions and Further Research

The first part of anything is usually easy.
- Travis McGee in: John D. MacDonald, The Scarlet Ruse, 1973

As mentioned before, DPlan is intended as a tool to support the generation of finite programs. That is, plan generation is for small domains with three or four objects only. Generation of a universal plan covering all possible states of a domain, as we do with DPlan, is necessarily a complex problem because for most interesting domains the number of states grows exponentially with the number of objects. In the following, we first discuss what extensions and modifications would be needed to make DPlan competitive with state-of-the-art planning systems. Afterwards, we discuss extensions which would improve DPlan as a tool for program synthesis. Finally, we discuss some relations to human problem solving.

5.1 Comparing DPlan with the State of the Art

The effort of universal planning is typically higher than the average effort of standard state-based planning because it is based on breadth-first search. But if one is interested in generating optimal plans, this price must be paid. Typically, universal planning terminates if the initial state is reached in some pre-image (see sect. 2.4.4 in chap. 2). In contrast, DPlan terminates after all states which can be transformed into the goal state are covered. Consequently, it is not possible to make DPlan more time efficient: for a domain with exponential growth, an exponential number of steps is needed to enumerate all states. But DPlan could be made more memory efficient by representing plans as OBDDs (Jensen & Veloso, 2000).

To be applied to standard planning problems, it would be necessary to modify DPlan. The main problem to overcome would be to generate complete state descriptions by backward operator application from a set of top-level goals, that is, from an incomplete state description (see sect. 3.1 in chap. 3). Some recent work by Wolf (2000) deals with that problem: Complete state descriptions are constructed as maximal sets of atoms which are not mutually excluded by the operator definitions of the domain.


Furthermore, DPlan currently terminates with the universal plan. For standard planning, a functionality for extracting a linear plan for a fixed state must be provided. Plan extraction can be done, for example, by depth-first search in the universal plan (Schmid, 1999).

In the context of planning and control rule learning (see sect. 2.5.2 in chap. 2), we argued that planning can profit from program synthesis: First, a universal plan is constructed for a domain with a small number of objects, then a recursive rule for the complete domain (with a fixed goal, such as "build a tower of sorted blocks") is generalized. Afterwards, planning can be omitted and instead the recursive rule is applied. To make this argument stronger, it is important to provide empirical evidence. We must show that plan generation by applying the recursive rule results in correct and optimal plans and that these plans are calculated in significantly less time than with a state-of-the-art planner. For that reason, DPlan must be extended such that (1) learned recursive rules are stored with a domain, and (2) for a new planning problem, the appropriate rule is extracted from memory and applied.^1

5.2 Extensions of DPlan

To make it possible that DPlan can be applied to as large a set of problems of interest in program synthesis as possible, DPlan should work for a representation language with expressive power similar to PDDL (see sect. 2.2.1 in chap. 2). We already made an important step in that direction by introducing function application (chap. 4). The extension of the Strips planning language to function application has the following characteristics: Planning operators can be defined using arbitrary symbolical and numerical functions; ADD/DEL effects and updates can be combined; indirect reference to objects via function application allows for infinite domains; planning with resource constraints can be handled as a special case.

The proposed language FPlan can be used to give function application in PDDL (McDermott, 1998b) a clear semantics. The described semantics of operator application can be incorporated in arbitrary Strips planning systems. In contrast to other proposals for dealing with the manipulation of state variables, such as ADL (Pednault, 1994), we do not represent them as specially marked "first class objects" (declared as fluents in PDDL) but as arguments of relational symbols. This results in a greater flexibility of FPlan, because each argument of a relational symbol may in principle be changed by function application. As a consequence, we can model domains usually specified by operators with ADD/DEL effects alternatively by updating state variables (Geffner, 2000).

^1 Preliminary work for control rule application was done in a student project at TU Berlin, 1999.


Work to be done includes detailed empirical comparisons of the efficiency of plan construction with operators with ADD/DEL effects versus updates, and providing proofs of correctness and termination for planning with FPlan. FPlan currently has no restrictions on what functions can be applied. As a consequence, termination is not guaranteed. That is, we have to provide some restrictions on function application.

Currently, DPlan works on Strips extended by conditional effects and function application. For future extensions, we plan to include typing, negation in preconditions, and quantification. These extensions allow generating plans for a larger class of problems. On the other hand, as already discussed for function application, each language extension results in a higher complexity for planning. While PDDL offers the syntax for all mentioned extensions, up to now only a limited amount of work proposes an algorithmic realization (Penberthy & Weld, 1992; Koehler et al., 1997), and careful analyses of planning complexity are necessary (Nebel, 2000).

5.3 Universal Planning versus Incremental Exploration

We argue that, for learning a recursive generalization for a domain, the complete structure of this domain must be known. Universal planning with DPlan results in a DAG which represents the shortest action sequences to transform each possible state over a fixed number of objects into a goal state. In chapter 8 we will show that from the structure of the DAG additional information about the data type underlying this domain can be inferred, which is essential for transforming the plan into a program term for generalization. Our approach is "all-or-nothing": either a generalization can be generated for a given universal plan or it cannot be generated; and if a generalization can be generated, it is guaranteed to produce correct and optimal action sequences for transforming all possible states into a goal state. A restriction of our approach is that the planning goal must also be a generalization of the goal for which the original universal plan was generated (clear a block in an n-block tower, transport n objects from A to B, sort a list with n elements, build a tower of n sorted blocks, solve Tower of Hanoi for n disks).

Other approaches to learning control rules for planning, namely learning in Prodigy (Veloso et al., 1995) and Geffner's approach (Martín & Geffner, 2000), on the one hand offer more flexibility but on the other hand cannot guarantee that learning in every case results in a better performance (see sect. 2.5.2 in chap. 2): The system is exposed incrementally to arbitrary planning experience. For example, in the blocks-world domain, the system might first be confronted with a problem "find an action sequence for transforming a state where all given four blocks A, B, C, and D are lying on the table into one where B is on A and C is on D"; and next with a problem "find an action sequence for transforming a state where six given blocks are stacked in reverse order into one where blocks A to D are stacked into an ordered tower and E and F are lying on the table". Rules generalized from this experience might or might not work for new problems. For each new problem, the learned rules are applied, and if rule application results in failure, the system must extend or modify its learning experience.

This incremental approach to learning is similar to models of human skill acquisition as, for example, proposed by Anderson (1983) in his ACT theory, although these approaches address only the acquisition of linear macros (see sect. 2.5.2 in chap. 2). To get some hints whether our universal planning approach has plausibility for human cognition, the following empirical study could be conducted: For a given problem domain, such as Tower of Hanoi, one group of subjects is confronted with all possible constellations of the three-disk problem and one group of subjects is confronted with arbitrary constellations of problems with different numbers of disks. Both groups have to solve the given problems. Afterwards, performance on arbitrary Tower of Hanoi problems is tested for both groups and both groups are asked for the general rule for solving Tower of Hanoi. We assume that explicit generation of all action sequences for a problem of fixed size facilitates detection of the structure underlying a domain and thereby the formation of a general solution principle.

Part II

Inductive Program Synthesis: Extracting Generalized Rules for a Domain

6. Automatic Programming

"So much is observation. The rest is deduction."
- Sherlock Holmes to Watson in: Arthur Conan Doyle, The Sign of Four, 1930

Automatic programming is investigated in artificial intelligence and software engineering. The overall research goal in automatic programming is to automatize as large a part of the development of computer programs as possible. A more modest goal is to automatize or support special aspects of program development such as program verification or generation of high-level programs from specifications. The focus of this chapter is on program generation from specifications, referred to as automatic program construction or program synthesis. There are two distinct approaches to this problem: Deductive program synthesis is concerned with the automatic derivation of correct programs from complete, formal specifications; inductive program synthesis is concerned with automatic generalization of (recursive) programs from incomplete specifications, mostly from input/output examples. In the following, we first (sect. 6.1) give an introductory overview of approaches to automatic programming, together with pointers to the literature. Afterwards (sect. 6.2), we shortly review theorem proving and transformational approaches to deductive program synthesis. Since our own work is in the context of inductive program synthesis, we will go into more detail presenting this area of research (sect. 6.3): In section 6.3.1, the foundations of automatic induction, that is, of machine learning, are introduced; grammar inference is discussed as a theoretical basis of inductive synthesis. Afterwards, genetic programming (sect. 6.3.2), inductive logic programming (sect. 6.3.3), and the synthesis of functional programs (sect. 6.3.4) are presented. Finally, we discuss deductive versus inductive and functional versus logical program synthesis (sect. 6.4). Throughout the text we give illustrations using simple programming problems.


6.1 Overview of Automatic Programming Research

6.1.1 AI and Software Engineering

Software engineering is concerned with providing methodologies and tools for the development of software (computer programs). Software engineering involves at least the following activities:

Specification: Analysis of requirements and the desired behavior of the program and designing the internal structure of the program. Design specification might be stepwise refined, from an overall structure of software modules to the (formal) specification of algorithms and data structures.

Development: Realizing the software design in executable program code (programming, implementation).

Validation: Ensuring that the program does what it is expected to do. One aspect of validation, called verification, is that the implemented program realizes the specified algorithms. Program verification is realized by giving (formal) proofs that the program fulfills the (formal) specification. A verified program is called correct with respect to a specification. The second aspect of validation is that the program meets the initially specified requirements. This is usually done by testing.

Maintenance: Fixing program errors, modifying and adding features to a program (updating).

Software products are expected to be correct, efficient, and transparent. Furthermore, programs should be easily modifiable, maintaining the quality standards of correctness, efficiency, and transparency. Obviously, software development is a complex task, and since the eighties a variety of computer-aided software engineering (CASE) tools have been developed supporting project management, design (e.g., checking the internal consistency of module hierarchies), and code generation for simple routine tasks (Sommerville, 1996).

6.1.1.1 Knowledge-Based Software Engineering. A more ambitious approach is knowledge-based software engineering (KBSE). The research goal of this area is to support all stages of software development by "intelligent" computer-based assistants (Green et al., 1983). KBSE and automatic programming are often used as synonyms.

A system providing intelligent support for software development must include several aspects of knowledge (Smith, 1991; Green & Barstow, 1978): General knowledge about the application domain and the problem to be solved, programming knowledge about the given domain, as well as general programming knowledge about algorithms, data structures, optimization techniques, and so on. AI technologies, especially for knowledge representation, automated inference, and planning, are necessary to build such systems. That is, a KBSE system is an expert system for software development.


The lessons learned from the limited success of expert systems research in the eighties also apply to KBSE: It is not realistic to demand that a single KBSE system cover all aspects of knowledge and all stages of software development. Instead, systems typically are restricted in at least one of the following ways (Flener, 1995; Rich & Waters, 1988):

- Constructing systems for expert specifiers instead of end-users.
- Restricting the system to a narrow domain.
- Providing interactive assistance instead of full automatization.
- Focusing on a small part of the software development process.

Examples for specialized systems are Aries (Johnson & Feather, 1991) for specification acquisition, KIDS (Smith, 1990) for program synthesis from high-level specifications (see sect. 6.2.2.3), or PVS (Dold, 1995) for program verification. Automatic programming is mainly an area of basic research. Up to now, only a small number of systems are applicable to real-world software engineering problems.

Below, we will introduce approaches to program synthesis, that is, KBSE systems addressing the aspect of code generation from specifications, in detail. From a software engineering standpoint, the main advantage of automatic code generation is that ex-post verification of programs becomes obsolete if it is possible to automatically derive (correct) programs from specifications. Furthermore, the problems of program modification can be shifted to the more abstract, and therefore hopefully more transparent, level of specification.

6.1.1.2 Programming Tutors. KBSE research aims at providing systems that support expert programmers. Another area of research where AI and software engineering interact is the development of tutor systems for the support and education of student programmers. Such tutoring systems must incorporate programming knowledge to a smaller extent than KBSE systems, but additionally, they have to rely on knowledge about efficient teaching methods. Furthermore, user-modeling is critical for providing helpful feedback for programming errors.

Examples for programming tutors are the Lisp-tutors of Anderson, Conrad, and Corbett (1989) and Weber (1996) and the tutor Proust of Johnson (1988). Anderson's Lisp-tutor is based on his ACT theory (Anderson, 1983). The tutoring strategy is based on the assumption that programming is a cognitive skill, based on a growing set of production rules. Training focuses on the acquisition of such rules ("IF the goal is to obtain the first element of a list l THEN write (car l)"), and errors are corrected immediately to avoid students acquiring faulty rules. Error-recognition and feedback are based on a library of expert rules and rules which result in programming errors. The latter were acquired in a series of empirical studies of novice programmers. The system tries to reproduce a faulty program by applying rules from the library, and feedback is based on the rules which generated the errors.


Weber's tutor is based on episodic user modeling. The programs a student generates over a curriculum are stored as schemes, and in a current session the stored programming episodes are used for feedback. The Proust system is based on the representation of plans (corresponding to high-level specifications of algorithms). Ideally, more than one correct program can be derived from a correct plan. Errors are detected by trying to identify the faulty plan underlying a given program.

Programming tutors were mainly developed during the eighties. Simultaneously, novice programmers were studied intensively in cognitive psychology (Widowski & Eyferth, 1986; Mayer, 1988; Soloway & Spohrer, 1989). One reason for this interest was that this area of research offered an opportunity to bring theories and empirical results of psychology to application; a second reason was that programming offers a relatively narrow domain, which does not strongly depend on previous experience in other areas, for studying the development of human problem solving skills (Schmid, 1994; Schmid & Kaup, 1995).

6.1.2 Approaches to Program Synthesis

Research in program synthesis addresses the problem of automatic generation of program code from specifications. The nature of this problem depends on the form in which a specification is given. In the following, we first introduce different ways to specify programming problems. Afterwards we introduce different synthesis methods. Synthesis methods can be divided into two classes: deductive program synthesis from complete formal specifications and inductive program synthesis from incomplete specifications. Therefore, we first contrast deductive and inductive inference, and then characterize deductive and inductive program synthesis as special cases of these inference methods. Finally, we will mention schema-based and analogy-based approaches as extensions of deductive and inductive synthesis.

6.1.2.1 Methods of Program Specification. Table 6.1 gives an illustration of possible ways in which a program for returning the last element of a non-empty list can be specified.

If a specification is given just as an informal (natural language) description of requirements (Green & Barstow, 1978), we are back at the ambitious goal of automatic programming, often ironically called "automagic programming" (Rich & Waters, 1986a, p. xi). Besides the problem of automatic natural language understanding in general, the main difficulty of such an approach would be to derive the required information, for example the desired input/output relation, from an informal statement. For the last specification in table 6.1, the synthesis system must have knowledge about lists, what it means that a list is not empty, what "element of a list" refers to, and what is meant by "last".

If a specification is given as a complete formal representation of an algorithm, we have a more realistic goal, which is addressed in approaches of

6.1 Overview of Automati Programming Resear h 105

Table 6.1. Dierent Spe i ations for Last

Informal Spe i ation

Return the last element of a non-empty list.

De larative Programs

last([X) :- X. (logi program,

last([X|T) :- last(T). Prolog)

fun last(l) = fun tional program

if null(tl(l)) then hd(l) else last(tl(l)); (ML)

fun last (x::nil) = x (ML with

j last(x::rest) = last(rest); pattern mat hing)

Complete, Formal Spe i ations

last(l) ( nd z su h that for some y, l = y Æ [z (Manna & Waldinger,

where islist(l) and l 6= [ 1992, p. 4)

last : seq 1 X --> X (Z)

forall s : seq 1 X last s = s(#s) (Spivey, 1992, p. 117)

In omplete Spe i ations

last([1, 1), last([2 5 6 7, 7), last([9 3 4, 4) (I/O Pairs, logi al)

last([1) = 1, last([2 7) = 7, last([5 3 4) = 4 (I/O Pairs, fun tional,

rst 3 inputs)

last([x

1

) = x

1

, last([x

1

x

2

) = x

2

, last([x

1

x

2

x

3

) = x

3

(Generi I/O Pairs)

last([1 2 3) ,! last([2 3) ,! last([3) ,! 3 (Tra e)

last([l) = if null(tl(l)) then hd(l) else if null(tl(tl(l))) (Complete

then hd(tl(l)) else if null(tl(tl(tl(l)))) then hd(tl(tl(l))) generi tra e)

106 6. Automati Programming

dedu tive program synthesis. Both example spe i ations in table 6.1 give

a non-operational des ription of the problem to be solved. The spe i ation

given by Manna and Waldinger (1992) states that for a non-empty list l there

must be a sub-list y su h that y on atenated with the sear hed for element z

is equal to l. The Z spe i ation (Spivey, 1992) states that for all non-empty

sequen es s (dened by seq

1

X) last returns the element on the last position

of this list (s(#s)) where #s gives the length of s.

The distinction between "very high level" specification languages such as Gist or Z (Potter, Sinclair, & Till, 1991) and "high level" programming languages is not really strict: what is seen as a specification language today might be a programming language in the future. In the fifties, assembler languages and the first compiler language, Fortran, were classified as automatic programming systems (see Rich & Waters, 1986a, p. xi, for a reprint from Communications of the ACM, Vol. 1(4), April 1958, p. 8).

Formal specification languages as well as programming languages allow one to formulate unambiguous, complete statements because they have a clearly defined syntax and semantics. In general, a programming language guarantees that all syntactically correct expressions can be transformed automatically into machine-executable code, while this need not be true for a specification language. Specification languages and high-level declarative programming languages (i.e., functional and logic languages) share the common characteristic that they abstract from "how to solve" a problem on a machine and focus on "what to solve"; that is, they have more expressive power and provide much more abstract constructs than typical imperative programming languages (like C). Nevertheless, generating specifications or declarative programs requires experts who are trained in representing problems correctly and completely, that is, in giving the desired input/output relations and covering all possible inputs.

For that reason, inductive program synthesis addresses the problem of deriving programs from incomplete specifications. The basic idea is that a user presents some examples of the desired program behavior. Specification by examples is incomplete because the synthesis algorithm must generalize the desired program behavior from some inputs to all possible inputs. Furthermore, the examples themselves can contain more or less information. Examples might be presented simply as input/output pairs. For structural list problems (such as last or reverse), generic input/output pairs, abstracting from concrete values of list elements, can be given. This makes the examples less ambiguous.

More information about the desired program can be provided if examples are represented as traces which illustrate the operation of a program. Traces can prescribe the data structures and operations to be used if the inputs are described by tests and the outputs by transformations of the inputs using pre-defined operators. Traces are called complete if all information about data structures changed, operators applied, and control decisions taken is given.


Additionally, examples are more informative if they indicate what kind of computations are not desired in the goal program. This can be done by presenting positive and negative examples. Another strategy is to present examples for the first k inputs, thereby defining an order over the input domain. Of course, the more information is presented by an example specification, the more the system user has to know about the structure of the desired program.

The kind and number of examples must suffice to specify what the desired program is supposed to calculate. Sufficiency depends on the synthesis algorithm which is applied. In general, it is necessary to present more than one example, because otherwise a program with constant output is the most parsimonious induction.

6.1.2.2 Synthesis Methods. Deductive and inductive synthesis are special cases of deductive and inductive inference. Therefore, we first give a general characterization of deduction and induction:

Deductive versus Inductive Inference. Deductive inference addresses the problem of inferring new facts or rules (called theorems) from a set of given facts and rules (called a theory or axioms) which are assumed to be true. For example, from the axioms

1. list(l) → ¬ null(cons(x, l))
2. list([A,B,C])

we can infer the theorem

3. ¬ null(cons(Z, [A,B,C])).

Deductive approaches are typically based on a logical calculus. That is, axioms are represented in a formal language (such as first-order predicate calculus) and inference is based on a syntactical proof mechanism (such as resolution). An example of a logical calculus was given in section 2.2.2 in chapter 2 (situation calculus). The theorem given above can be proved by resolution in the following way:

1. null(cons(Z, [A,B,C])) (negation of the theorem)
2. ¬ list(l) ∨ ¬ null(cons(x, l)) (axiom 1)
3. ¬ list([A,B,C]) (resolve 1, 2 with substitution σ = {x/Z, l/[A,B,C]})
4. list([A,B,C]) (axiom 2)
5. contradiction (resolve 3, 4).
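The ground refutation above can be replayed mechanically. The following sketch is an illustrative Python encoding (not QA3's machinery): ground literals are strings, "~" marks negation, a clause is a frozenset of literals, and the axioms are entered already instantiated with the substitution {x/Z, l/[A,B,C]}.

```python
# A minimal sketch of ground resolution refutation.
# Literals are strings; "~" marks negation; a clause is a frozenset of literals.

def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2):
    """Return all resolvents of two ground clauses."""
    out = []
    for lit in c1:
        if negate(lit) in c2:
            out.append((c1 - {lit}) | (c2 - {negate(lit)}))
    return out

def refute(clauses):
    """Saturate under resolution; True iff the empty clause is derivable."""
    clauses = set(clauses)
    while True:
        new = set()
        for a in clauses:
            for b in clauses:
                if a == b:
                    continue
                for r in resolve(a, b):
                    if not r:                 # empty clause: contradiction found
                        return True
                    new.add(frozenset(r))
        if new <= clauses:                    # no new resolvents: satisfiable set
            return False
        clauses |= new

# The clauses from the text, already instantiated with {x/Z, l/[A,B,C]}:
negated_theorem = frozenset({"null(cons(Z,[A,B,C]))"})
axiom1 = frozenset({"~list([A,B,C])", "~null(cons(Z,[A,B,C]))"})
axiom2 = frozenset({"list([A,B,C])"})

print(refute({negated_theorem, axiom1, axiom2}))  # -> True
```

The refutation proceeds exactly as in steps 1-5 above: resolving the negated theorem with axiom 1 yields ¬ list([A,B,C]), which resolves with axiom 2 to the empty clause.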

Inductive inference addresses the problem of inferring a generalized rule which holds for a domain (called a hypothesis) from a set of observed instances belonging to this domain (called examples). For example, from the examples

list([Z,A,B,C])
list([1,2,3])
list([R,B])


axiom (1) given above might be inferred as a hypothesis for the domain of (flat) lists. For this inference, it is necessary to refer to some background knowledge about lists, namely that lists are constructed by means of the list constructor cons(x, l), and that null(l) is true if l is the empty list and false otherwise. Other hypotheses which could be inferred are

list(l)

which is an over-generalization because it includes objects which are not lists (i.e., atoms), or

l ∈ {[Z,A,B,C], [1,2,3], [R,B]} → list(l)

which is no generalization but just a compact representation of the examples.

Inductive approaches are realized within the different frameworks offered by machine learning research. The language for representing hypotheses depends on the selected framework. Examples are decision trees, a subset of predicate calculus, or functional programs. Also depending on the framework, hypothesis construction can be constrained more or less by the presented examples. In the extreme, hypotheses might be generated just by enumeration (Gold, 1967): a legal expression of the hypothesis language is generated and tested against the examples. If the hypothesis is not consistent with the examples, it is rejected and a new hypothesis is generated. Note that testing whether a hypothesis holds for an example corresponds to deductive inference (Mitchell, 1997, p. 291). An overview of machine learning is given by Mitchell (1997), and we will present approaches to inductive inference in section 6.3.
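The generate-and-test idea can be illustrated in a few lines. The "hypothesis language" below is just a hand-picked list of candidate functions, so this is a toy sketch of enumeration, not Gold's construction; the I/O pairs are the logical examples for last from table 6.1.

```python
# Toy illustration of hypothesis enumeration ("generate and test"):
# candidate programs are tested against the examples until one covers them all.

examples = [([1], 1), ([2, 5, 6, 7], 7), ([9, 3, 4], 4)]  # I/O pairs for last

# Illustrative, hand-fixed "hypothesis language" (an assumption of this sketch).
hypotheses = [
    ("head",     lambda l: l[0]),
    ("constant", lambda l: 1),
    ("length",   lambda l: len(l)),
    ("last",     lambda l: l[-1]),
]

def enumerate_and_test(hypotheses, examples):
    """Return the name of the first hypothesis consistent with all examples."""
    for name, h in hypotheses:
        if all(h(x) == y for x, y in examples):
            return name
    return None

print(enumerate_and_test(hypotheses, examples))  # -> last
```

Each earlier candidate is rejected by at least one example; only the last hypothesis covers all three pairs.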

Deduction guarantees that the inference is correct (with respect to the given axioms), while induction only results in a hypothesis. From the perspective of epistemology, one might say that a deductive proof does not generate knowledge; it explicates information which is already contained in the given axioms. Inductive inference results in new information, but the inference has only hypothetical status, which holds as long as the system is not confronted with examples which contradict the inference.

To make this difference more explicit, let's look at a set-theoretic interpretation of inference:

Definition 6.1.1 (Relations in a Domain). Let D be a set of objects belonging to a domain. A relation with arity i is given as R^i ⊆ D^i. The set of all relations which hold in a domain is ℛ.

For a given domain, for example the set of flat lists, the set of all axioms which hold in this domain is extensionally given by relations. For example, there might be a unary relation R containing all lists, and a binary relation R' containing all pairs of lists (l, cons(x, l)) where l ∈ R. In general, relations can represent formulas of arbitrary complexity.

Now we can define (the semantics of) deduction and induction in the following way:


Definition 6.1.2 (Deduction). Given a set of axioms A ⊆ ℛ, deductive inference means to decide whether for a theorem R ∉ A it holds that R ∈ ℛ.

In a deductive (proof) calculus, R ∈ ℛ is decided by showing that R can be derived from the given axioms A, that is, A ⊨ R. A standard syntactical approach to deductive inference is, for example, resolution (see the illustration above).

Definition 6.1.3 (Induction). Given a set of examples E ⊆ ℛ, inductive inference means to search for a hypothesis H = {R_1, …, R_n} such that H ≠ E and ∀ E_i ∈ E : E_i ∈ H.

The search for a hypothesis takes place in the set of possible hypotheses given a fixed hypothesis language, for example, the set of all syntactically correct Lisp programs. Typically, selection of a hypothesis is not only restricted by demanding that H explains ("covers") all presented examples, but also by additional criteria such as simplicity.

Deductive versus Inductive Synthesis. In program synthesis, the result of inference is a computer program which transforms all legal inputs x into the desired output y = f(x). For deductive synthesis, the complete formal specification of the pre- and post-conditions of the desired program can be represented as a theorem of the following form:

Definition 6.1.4 (Program Specification as Theorem).

∀x ∃y [Pre(x) ⇒ Post(x, y)].

For example, for the specification of the last program (see tab. 6.1), Pre(x) states that x is a non-empty list and Post(x, y) states that y is the last element of list x. A constructive theorem prover (such as Green's approach, described in sect. 2.2.2 in chap. 2) tries to prove that this theorem holds. If the proof fails, there exists no feasible program (given the axioms on which the proof is based); if the proof succeeds, a program f(x) is returned as result of the constructive proof. For f(x) the following must hold:

Definition 6.1.5 (Program Specification as Constructive Theorem).

∀x [Pre(x) ⇒ Post(x, f(x))].

In the proof, the existentially quantified variable y must be explicitly constructed as a function over x (Skolemization).

An alternative to theorem proving is program transformation. In theorem proving, each proof step consists of rewriting the current program (formula) by selecting a given axiom and applying the proof method (e.g., resolution). In program transformation, rewriting is based on transformation rules.

Definition 6.1.6 (Transformation Rule). A transformation rule is defined as a conditioned rule:

If Application-Condition Then Left-Hand-Pattern → Right-Hand-Pattern.

The rules might be given in equational logic (i.e., as a special class of axioms) or they might be uni-directional. Each transformation step consists of rewriting the current program by replacing some part which matches the Left-Hand-Pattern by another expression which matches the Right-Hand-Pattern.
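A single conditioned rewrite step of this kind can be sketched as follows. The term representation (nested tuples, "?"-prefixed pattern variables) and the example simplification rule plus(x, 0) → x are illustrative assumptions of this sketch, not taken from the text.

```python
# A minimal sketch of applying one conditioned transformation rule to a term.
# Terms are nested tuples; pattern variables are strings starting with "?".

def match(pattern, term, subst=None):
    """Syntactic pattern matching; returns a substitution dict or None."""
    subst = dict(subst or {})
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in subst:                       # variable already bound
            return subst if subst[pattern] == term else None
        subst[pattern] = term
        return subst
    if isinstance(pattern, tuple) and isinstance(term, tuple) \
            and len(pattern) == len(term):
        for p, t in zip(pattern, term):
            subst = match(p, t, subst)
            if subst is None:
                return None
        return subst
    return subst if pattern == term else None

def instantiate(pattern, subst):
    """Replace pattern variables by their bindings."""
    if isinstance(pattern, str) and pattern.startswith("?"):
        return subst[pattern]
    if isinstance(pattern, tuple):
        return tuple(instantiate(p, subst) for p in pattern)
    return pattern

# Rule: If <application condition> Then ("plus", ?x, 0) -> ?x
rule = {
    "cond": lambda s: True,          # application condition (trivial here)
    "lhs":  ("plus", "?x", 0),
    "rhs":  "?x",
}

def apply_rule(rule, term):
    s = match(rule["lhs"], term)
    if s is not None and rule["cond"](s):
        return instantiate(rule["rhs"], s)
    return term                      # rule not applicable: term unchanged

print(apply_rule(rule, ("plus", ("times", "a", "b"), 0)))  # -> ('times', 'a', 'b')
```

A full transformation system would additionally search for the sub-term at which a rule applies; here the rule is matched only at the root.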

In section 6.2 we will present theorem proving as well as transformational approaches to deductive synthesis.

For inductive synthesis, a program is constructed as a generalization over the given incomplete specification (input/output examples or traces, see tab. 6.1) such that the following proposition holds:

Definition 6.1.7 (Characteristic of an Induced Program).

∀ (x, y) ∈ E [f(x) = y]

where E is a set of examples with x as a possible input of the desired program and y as the corresponding output value or trace for x.
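The defining condition ∀(x, y) ∈ E : f(x) = y can be checked directly. A minimal sketch, using the functional I/O pairs for last from table 6.1; note that consistency with E says nothing about the candidate's behavior on unseen inputs.

```python
# Checking the defining condition of an induced program f against an
# example set E: for all (x, y) in E, f(x) = y must hold.

def last(l):                      # a candidate induced program
    return l[-1]

# The functional I/O pairs from table 6.1.
E = [([1], 1), ([2, 7], 7), ([5, 3, 4], 4)]

def covers(f, E):
    return all(f(x) == y for x, y in E)

print(covers(last, E))  # -> True
```

A wrong candidate, say one returning the first element, fails the same check on the second and third pairs.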

The formal foundation of inductive synthesis is grammar inference. Here the examples are viewed as words which belong to some unknown formal language, and the inference task is to construct a grammar (or automaton) which generates (or recognizes) a language to which the example words belong. The classical approach to program synthesis is the construction of recursive functional (mostly Lisp) programs from traces. Such traces are either given as input or automatically constructed from input/output examples. Alternatively, in the context of inductive logic programming, the construction of logical (Prolog) programs from input/output examples is investigated. In both research areas, there are methods which base hypothesis construction strongly on the presented examples versus methods which depend more on search in hypothesis space (i.e., generate and test). An approach which is based explicitly on generate and test is genetic programming.

In section 6.3 we will present grammar inference, functional program synthesis, inductive logic programming, and genetic programming as approaches to inductive program synthesis.
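A common concrete starting point in grammar inference is the prefix-tree acceptor: a trie-shaped automaton that accepts exactly the example words, which state-merging methods then generalize. The sketch below is illustrative (the sample words are made up) and stops before any generalization step.

```python
# A sketch of a prefix-tree acceptor: a trie-shaped automaton accepting
# exactly the example words. Grammar-inference methods typically start
# from this automaton and generalize it by merging states.

def build_pta(words):
    trie = {}
    for w in words:
        node = trie
        for ch in w:
            node = node.setdefault(ch, {})   # follow/create the transition
        node["<accept>"] = True              # mark final state
    return trie

def accepts(trie, w):
    node = trie
    for ch in w:
        if ch not in node:
            return False
        node = node[ch]
    return node.get("<accept>", False)

pta = build_pta(["ab", "aab", "aaab"])       # sample words, chosen for illustration
print(accepts(pta, "aab"), accepts(pta, "ba"))  # -> True False
```

The acceptor covers the examples but nothing else; inferring, say, the language a⁺b requires a subsequent generalization step.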

Enriching Program Synthesis with Knowledge. The basic approaches to deductive and inductive program synthesis depend on a limited amount of knowledge: a set of (correct) axioms or transformation rules in deduction; or a restricted hypothesis language and some rudimentary background knowledge (for example, about data types and primitive operators) in induction. These basic approaches can be enriched by additional knowledge. Such knowledge might not be universally valid but can make a system more powerful. As a consequence, a basic fully automatic, algorithmic approach is extended to an interactive, heuristic approach.


A successful approach to knowledge-based program synthesis is to guide synthesis by pre-defined program schemes, also called design strategies. A program scheme is a program template (for example, represented in higher-order logic) representing a fixed overall program structure (i.e., a fixed flow of control) but abstracting from concrete operations. For a new programming problem, an adequate scheme is selected (by the user) from the library, and a program is constructed by stepwise refinement of this scheme. For example, a quicksort program might be synthesized by refining a divide-and-conquer scheme. Scheme-based synthesis is typically realized within the deductive framework (Smith, 1990). An approach to combine inductive program synthesis and schema refinement is proposed by Flener (1995).

A special approach to inductive program synthesis is programming by analogy, that is, program reuse. Here, already known programs (predefined or previously synthesized) are stored in a library. For a new programming problem, a similar problem and the program which solves this problem are retrieved, and the new program is constructed by modifying the retrieved program. Analogy can be seen as a special case of induction because modification is based on the structural similarity of both problems, and structural similarity is typically obtained by generalizing over the common structure of the problems (see calculation of least general generalizations in section 6.3.3 and anti-unification in chapter 7 and chapter 12).

We will come back to schema-based synthesis and programming by analogy in Part III, addressing the problems of (1) automatic acquisition of abstract schemes from example problems, and (2) automatic retrieval of an appropriate schema for a given programming problem.

6.1.3 Pointers to Literature

Although program synthesis has been an active area of research since the beginning of computer science, it is not covered in textbooks on programming, software engineering, or AI. The main reason for this omission might be that program synthesis research does not rely on one common formalism; rather, each research group proposes its own approach. This is even the case within a given framework, such as synthesis by theorem proving or inductive logic programming. Furthermore, the formalisms are typically quite complex, such that it is difficult to present a formalism together with a complete, non-trivial example in a compact way.

An introductory overview of knowledge-based software engineering is given by Lowry and Duran (1989). Here, approaches to and systems for specification acquisition, specification validation and maintenance, as well as deductive program synthesis are presented. A collection of influential papers in AI and software engineering was edited by Rich and Waters (1986b). The papers cover a large variety of research areas, including deductive and inductive program synthesis, program verification, natural language specification, knowledge-based approaches, and programming tutors. Collections of research papers on AI and software engineering are presented by Lowry and McCarthy (1991), Partridge (1991), and Messing and Campbell (1999).

An introductory overview of program synthesis is given by Barr and Feigenbaum (1982). Here the classical approaches to and systems for deductive and inductive program synthesis of functional programs are presented. A collection of influential papers in program synthesis was edited by Biermann, Guiho, and Kodratoff (1984), addressing deductive synthesis, inductive synthesis of functional programs, and grammar inference. A more recent survey of program synthesis is given by Flener (1995, part 1), and a survey of inductive logic program synthesis is given by Flener and Yilmaz (1999).

Current research in program synthesis is, for example, covered in the journal Automated Software Engineering. AI conferences, especially machine learning conferences (ECML, ICML), typically have sections on inductive program synthesis.

6.2 Deductive Approaches

In the following we give a short overview of approaches to deductive program synthesis. First we introduce constructive theorem proving and afterwards program transformation.

6.2.1 Constructive Theorem Proving

Automated theorem proving does not, in general, have to be constructive. That means, for example, that from a statement ¬∀x ¬P(x) it can be concluded that ∃x P(x), without the necessity to actually construct an object a which has property P. In contrast, for a proof to be constructive, such an object a has to be given together with a proof that a has property P (Thompson, 1991). Classical theorem provers which are applied for program verification, such as the Boyer-Moore theorem prover (Boyer & Moore, 1975), cannot be used for program synthesis because they cannot reason constructively about existential quantifications. As introduced above, a constructive proof of the statement ∀x ∃y Pre(x) → Post(x, y) means that a program f must be constructed such that for all inputs a for which Pre(a) holds, Post(a, f(a)) holds.

The observation that constructive proofs correspond to programs was made by many different researchers (e.g., Bates & Constable, 1985) and was explicitly stated as the so-called Curry-Howard isomorphism in the eighties. One of the oldest approaches to constructive theorem proving was proposed by Green (1969), introducing a constructive variant of resolution. In the following, we first present this pioneering work; afterwards we describe the deductive tableau method of Manna and Waldinger (1980), and give a short survey of more contemporary approaches.


6.2.1.1 Program Synthesis by Resolution. Green (1969) introduced a constructive version of clausal resolution and demonstrated with his system QA3 that constructive theorem proving can be applied for constructing plans (see sect. 2.2.2 in chap. 2) and for automatic (Lisp) programming. For automatic programming, the theorem prover needs two sets of axioms: (1) axioms defining the functions and constructs of (a subset of) a programming language (such as Lisp), and (2) axioms defining an input/output relation R(x, y)¹, which is true if and only if x is an appropriate input for some program and y is the corresponding output generated by this program. Green addresses four types of automatic programming problems:

Checking: Proving that R(a, b) is true or false.
  For a fixed input/output pair it is checked whether the relation R(a, b) holds.

Simulation: Proving that ∃x R(a, x) is true (returning x = b) or false.
  For a fixed input value a, the output b is constructed.

Verification: Proving that ∀x R(x, f(x)) is true or false (returning x = c as counter-example).
  For a user-constructed program f(x), its correctness is checked. For proofs concerning looping or recursive programs, induction axioms are required to prove convergence (termination).

Synthesis: Proving that ∀x ∃y R(x, y) is true (returning the program y = f(x)) or false (returning x = c as counter-example).
  Induction axioms are required to synthesize recursive programs.

Green gives two examples for program synthesis: the synthesis of a simple non-recursive program which sorts the two numbers of a dotted pair, and the synthesis of a recursive function for sorting.

Synthesis of a Non-Recursive Function. For the dotted-pair problem, the following axioms concerning Lisp are given:

1. x = car(cons(x, y))
2. y = cdr(cons(x, y))
3. x = nil → cond(x, y, z) = z
4. x ≠ nil → cond(x, y, z) = y
5. ∀x, y [lessp(x, y) ≠ nil ↔ x < y].

The axiom specifying the desired input/output relation is given as

∀x ∃y [[car(x) < cdr(x) → y = x] ∧
       [(car(x) ≥ cdr(x)) → (car(x) = cdr(y) ∧ cdr(x) = car(y))]].

The axiom states that a dotted pair (x_1 . x_2) which is already sorted (x_1 < x_2) is just returned; otherwise, the reversed pair must be returned. By resolving the input/output axiom with axiom (5), the mathematical expressions (<, ≥) are replaced by the Lisp expression lessp. Resolving expressions lessp(x, y) = nil with axioms (3) and (4) introduces conditional expressions; and axioms (1) and (2) are used to introduce cons-expressions. The resulting program is

y = cond(lessp(car(x), cdr(x)), x, cons(cdr(x), car(x))).

¹ Relation R(x, y) corresponds to relation Post(x, y) given in definition 6.1.4.
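The synthesized program can be transliterated into an executable form. The sketch below models a dotted pair (x_1 . x_2) as a Python 2-tuple and defines the Lisp primitives by hand; it is an illustrative rendering of the result, not Green's Lisp output.

```python
# The synthesized dotted-pair program, transliterated from the Lisp
# primitives (cond, lessp, car, cdr, cons). A dotted pair (x1 . x2)
# is modelled as a Python 2-tuple.

def car(p): return p[0]
def cdr(p): return p[1]
def cons(a, d): return (a, d)
def lessp(a, b): return a < b        # lessp(x, y) != nil  <->  x < y

def sort_pair(x):
    # y = cond(lessp(car(x), cdr(x)), x, cons(cdr(x), car(x)))
    return x if lessp(car(x), cdr(x)) else cons(cdr(x), car(x))

print(sort_pair((3, 1)))  # -> (1, 3)
print(sort_pair((1, 3)))  # -> (1, 3)
```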

Synthesis of a Recursive Function. To synthesize a recursive program, the theorem prover additionally requires an induction axiom. For example, for a recursion over finite linear lists, it can be stated that the empty list is reached in a finite number of steps by stepwise applying cdr(l):

Definition 6.2.1 (Induction over Linear Lists).

[P(h(nil)) ∧ ∀x [¬atom(x) ∧ P(h(cdr(x))) → P(h(x))]] → ∀z P(h(z))

where P is a predicate and h is a function.

The axiom specifying the desired input/output relation for a list-sorting function can be stated, for example, as:

∀x ∃y [R(nil, y) ∧ [[¬atom(x) ∧ R(cdr(x), sort(cdr(x)))] → R(x, y)]].

The searched-for function is already named sort.

Given the Lisp axioms above, some additional axioms characterizing the predicates atom(x) and equal(x, y), and axioms specifying a pre-defined function merge(x, l) (inserting number x into a sorted list l such that the result is a sorted list), the following function can be synthesized:

y = cond(equal(x, nil), nil, merge(car(x), sort(cdr(x)))).
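Again, the result can be transliterated for illustration. Below, Python lists stand in for Lisp lists (nil = []), and merge is a hand-written stand-in for the presupposed pre-defined function.

```python
# The synthesized recursive sort, transliterated to Python. merge is the
# presupposed, pre-defined helper inserting a number into a sorted list.

def merge(x, l):
    """Insert x into sorted list l so that the result stays sorted."""
    for i, e in enumerate(l):
        if x <= e:
            return l[:i] + [x] + l[i:]
    return l + [x]

def sort(x):
    # y = cond(equal(x, nil), nil, merge(car(x), sort(cdr(x))))
    return [] if x == [] else merge(x[0], sort(x[1:]))

print(sort([3, 1, 2]))  # -> [1, 2, 3]
```

This is insertion sort in recursive form: the head is merged into the already-sorted rest of the list.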

Drawbacks of Theorem-Proving. For a given set of axioms and a complete, formal specification, program synthesis by theorem-proving results in a program which is correct with respect to the specification. The resulting program is constructed as a side-effect of the proof, by instantiation of variable y in the input/output relation R(x, y).

Fully automated theorem proving has several disadvantages as an approach to program synthesis:

- The domain must be axiomatized completely.
  The given axiomatization has a strong influence on the complexity of the search for the proof, and it determines the way in which the searched-for program is expressed. For example, Green (1969) reports that his theorem prover could not synthesize sort in a "reasonable amount of time" with the given axiomatization. Given a different set of axioms (not reported in the paper), QA3 created the program cond(x, merge(car(x), sort(cdr(x))), nil).
  Providing a set of axioms which is sufficient to prove the input/output relation, and thereby to synthesize a program, presupposes that a lot is already known about the program. For example, to synthesize the sort program, an axiomatization of the merge function had to be provided.

- It can be more difficult to give a correct and complete input/output specification than to directly write the program.
  For the sort example given above, the searched-for recursive structure is already given in the specification by separating the case of an empty list as input from the case of a non-empty list, and by presenting the implication from the rest of a list (cdr(x)) to the list itself.

- Theorem provers lack the power to produce proofs for more complicated specifications.
  The main source of inefficiency is that a variety of induction axioms must be specified for synthesizing recursive programs. The selection of the "suitable" induction scheme for a given problem is crucial for finding a solution in reasonable time. Therefore, selection of the induction scheme to be used is often performed by user interaction!

These disadvantages of theorem proving hold for the original approach of Green and are still true for the more sophisticated approaches of today, which are not restricted to proof by resolution but use a variety of proof mechanisms (see below). The originally given hard problem of automatic program construction is reformulated as the equally hard problem of automatic theorem proving (Rich & Waters, 1986a). Nevertheless, the proofs-as-programs approach is an important contribution to program synthesis: first, the necessity of complete and formal specifications compels the program designer to state all knowledge involved in program construction explicitly. Second, and more important, while theorem proving might not be useful in isolation, it can be helpful or even necessary as part of a larger system. For example, in transformation systems (see sect. 6.2.2), the applicability of rules can be proved by showing that a precondition holds for a given expression.

6.2.1.2 Planning and Program Synthesis with Deductive Tableaus. A second influential contribution to program synthesis as constructive proof was made by Manna and Waldinger (1980). Like Green, they address not only deductive program synthesis but also deductive plan construction (Manna & Waldinger, 1987).

Deductive Tableaus. The approach of Manna and Waldinger is also based on resolution. In contrast to the classical resolution (Robinson, 1965) used by Green, their resolution rule does not depend on axioms represented in clausal form. Their proposed formalism, the deductive tableau, incorporates not only non-clausal resolution but also transformation rules and structural induction. The transformation rules have been taken over from an earlier, transformation-based programming system called Dedalus (Manna & Waldinger, 1975, 1979).

A deductive tableau consists of three columns:

Assertions        Goals             Outputs
Pre(x)
                  Post(x, y)        y

with x as input variable, y as output variable, Pre(x) as precondition, and Post(x, y) as postcondition of the searched-for program. The initial tableau is constructed from a specification of the form (see also table 6.1)

f(x) ⇐ find y such that Post(x, y) where Pre(x).

In general, the semantics of a tableau with assertions A_i(x) and goals G_j(x, y) is

∀x A_1(x) ∧ … ∧ A_n(x) → ∃y G_1(x, y) ∨ … ∨ G_n(x, y).

A proof is performed by adding new rows to the tableau through rules of logical inference. A proof is successful if a row can be constructed where the goals column contains the expression "true" and the output column contains a term which consists only of primitive expressions of the target programming language.

As an example for a derivation step, we present a simplified version of the so-called GG-resolution (omitting possible unification of sub-expressions): for two goals F in column i and G in column j, a new row can be constructed with the goal entry

F[P ← true] ∧ G[P ← false]

where P is a common sub-expression of F and G. For the output column, GG-resolution results in the introduction of a conditional expression:

Assertions        Goals               Outputs
                  a ≥ b               a
                  ¬(a ≥ b)            b
                  true ∧ ¬ false      if a ≥ b then a else b

where the last goal simplifies to true.

Again, induction is used for recursion-formation:

Definition 6.2.2 (Induction Rule). For a searched-for program f(x) with assertion Pre(x) and goal Post(x, y), the induction hypothesis is ∀u (u ≺ x) → (Pre(u) → Post(u, f(u))), where ≺ is a well-founded ordering over the set of input data {x}.

Recursive Plans. Originally, Manna and Waldinger applied the deductive tableau method to synthesize functional programs, such as reversing lists or finding the quotient of two integers (Manna & Waldinger, 1992). Additionally, they demonstrated how imperative recursive programs can be synthesized by deductive planning (Manna & Waldinger, 1987). As an example they used the problem of clearing a block (see also sect. 3.1.4.1 in chap. 3).

The searched-for plan is called makeclear(a), where a denotes a block in a blocks-world. Plan construction is based on proving the theorem


∀s_0 ∀a ∃z_1 [Clear(s_0, z_1, a)]

meaning "for an initial state s_0, the block a is clear after execution of plan z_1".

Among the pre-specified axioms for this problem are:

hat-axiom: If ¬ Clear(s, x) Then On(s, hat(s, x), x)
(If block x is not clear in situation s, then a block hat(s, x) is lying on block x in situation s, where hat(s, x) is the block which lies on x in s.)

put-table-axiom: If Clear(s, x) Then On(put(s, x, table), x, table)
(If block x is clear in situation s, then it lies on the table in a situation where x was put on the table immediately before.)

and the resulting plan (program) is

makeclear(a) ⇐
  If Clear(a)
  Then Λ (the current situation)
  Else makeclear(hat(a)); put(hat(a), table).
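The recursive plan can be simulated on a concrete blocks-world state. The representation below (a dict mapping each block to what it rests on, with hat and Clear derived from it) is an illustrative assumption of this sketch, not Manna and Waldinger's formalism.

```python
# A sketch simulating the recursive makeclear plan: `on` maps each block
# to what it rests on, so hat(x) is the block lying on x, and Clear(x)
# holds if no block is on x.

def hat(on, x):
    for b, below in on.items():
        if below == x:
            return b
    return None

def clear(on, x):
    return hat(on, x) is None

def makeclear(on, a, plan):
    """Recursively clear block a, recording put operations in `plan`."""
    if clear(on, a):
        return on                        # nothing to do: current situation
    h = hat(on, a)
    on = makeclear(on, h, plan)          # first clear the hat itself
    on = dict(on, **{h: "table"})        # put(hat(a), table) on a state copy
    plan.append(("put", h, "table"))
    return on

# Tower: C on B on A on the table; clearing A unstacks C, then B.
state = {"A": "table", "B": "A", "C": "B"}
plan = []
final = makeclear(state, "A", plan)
print(plan)  # -> [('put', 'C', 'table'), ('put', 'B', 'table')]
```

The recorded operations mirror the recursion: the hat of A (block B) must itself be cleared before it can be put on the table.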

The complete derivation of this plan using deductive tableaus is reported in Manna and Waldinger (1987).

6.2.1.3 Further Approaches. A third "classical" approach to program synthesis by theorem proving was proposed by Bibel (1980). Current automatic theorem provers, such as Nuprl² or Isabelle³, can in principle be used for program synthesis. But up to now, pure theorem proving approaches can only be applied to small problems. Current research addresses the problem of enriching constructive theorem proving with knowledge-based methods such as proof tactics or program schemes (Kreitz, 1998; Bibel, Korn, Kreitz, & Kurucz, 1998).

6.2.2 Program Transformation

Program transformation is the predominant technique in deductive program synthesis. Typically, directed rules are used to transform an expression into a syntactically different, but semantically equivalent expression. There are two principal kinds of transformation: lateral and vertical transformation (Rich & Waters, 1986a). Lateral transformation generates expressions on the same level of abstraction. For example, a linear recursive program might be rewritten into a more efficient tail-recursive program. Program synthesis is realized by vertical transformation, rewriting a specification into a program by applying transformation rules which represent the relationship between constructs on an abstract level and constructs on program level. If the starting point for

² see http://www.cs.cornell.edu/Info/Projects/NuPrl/nuprl.html
³ see http://www.cl.cam.ac.uk/Research/HVG/Isabelle/

118 6. Automatic Programming

transformation is already an executable (high-level) program, one speaks of transformational implementation.

In the following, we first present the pioneering approach of Burstall and Darlington (1977). The authors present a small set of powerful rules for transforming given programs into more efficient programs; that is, they propose an approach to transformational implementation. The main contribution of their approach is the introduction of rules for unfolding and folding (synthesizing) recursive programs. Afterwards, we present the CIP (computer-aided, intuition-guided programming) approach (Broy & Pepper, 1981), which focuses on correctness-preserving transformations. CIP does not aim at full automation; it can be classified as a meta-programming approach (Feather, 1987) which supports a program developer in the construction of correct programs. Finally, we present the KIDS system (Smith, 1990), which is the most successful deductive synthesis system today. KIDS is a program synthesis system which transforms initial, not necessarily executable, specifications into efficient programs by stepwise refinement.

6.2.2.1 Transformational Implementation:
The Fold/Unfold Mechanism.

The mind is a cauldron and things bubble up and show for a moment, then slip back into the brew. You can't reach down and find anything by touch. You wait for some order, some relationship in the order in which they appear. Then yell Eureka! and believe that it was a process of cold, pure logic.
(Travis McGee in: John D. MacDonald, The Girl in the Plain Brown Wrapper, 1968)

Starting point for the transformation approach of Burstall and Darlington (1977) is a program which is presented as a set of equations, with left-hand sides representing program heads and right-hand sides representing calculations.

For example, the Fibonacci function is presented as

1. fib(0) ⇐ 1
2. fib(1) ⇐ 1
3. fib(x+2) ⇐ fib(x+1) + fib(x).

The following inference rules for transforming recursive equations are given:

Definition: Introduce a new recursive equation whose left-hand expression is not an instance of the left-hand expression of any previous equation.
We will see below that the introduction of a suitable equation in general cannot be performed automatically ("eureka" step).

Instantiation: Introduce a substitution instance of an existing equation.


Unfolding: If E ⇐ E′ and F ⇐ F′ are equations and there is some occurrence in F′ of an instance of E, replace it by the corresponding instance of E′, obtaining F″, and add equation F ⇐ F″.
This rule corresponds to the expansion of a recursive function by replacing the function call by the function body with according substitution of the parameters.

Folding: If E ⇐ E′ and F ⇐ F′ are equations and there is some occurrence in F′ of an instance of E′, replace it by the corresponding instance of E, obtaining F″, and add equation F ⇐ F″.
This is the "inverse" rule to unfolding. We will discuss folding of terms in detail in Chapter 7.

Abstraction: Introduce a where clause by deriving from a previous equation E ⇐ E′ a new equation

E ⇐ E′[u1/F1, ..., un/Fn] where ⟨u1, ..., un⟩ = ⟨F1, ..., Fn⟩.

Laws: Algebraic laws such as associativity or commutativity can be used to rewrite the right-hand sides of equations.

The authors propose the following strategy for the application of rules:
- Make any necessary definitions.
- Instantiate.
- For each instantiation unfold repeatedly. At each stage of unfolding:
  - Try to apply laws and abstractions.
  - Fold repeatedly.

Transforming Recursive Functions. We will illustrate the approach, demonstrating how the Fibonacci program given above can be transformed into a more efficient program which avoids calculating values twice:

1. fib(0) ⇐ 1 (given)
2. fib(1) ⇐ 1 (given)
3. fib(x+2) ⇐ fib(x+1) + fib(x) (given)
4. g(x) ⇐ ⟨fib(x+1), fib(x)⟩ (definition, eureka!)
5. g(0) ⇐ ⟨fib(1), fib(0)⟩ (instantiation)
   g(0) ⇐ ⟨1, 1⟩ (unfolding with 1, 2)
6. g(x+1) ⇐ ⟨fib(x+2), fib(x+1)⟩ (instantiate 4)
   g(x+1) ⇐ ⟨fib(x+1)+fib(x), fib(x+1)⟩ (unfold with 3)
   g(x+1) ⇐ ⟨u+v, u⟩ where ⟨u,v⟩ = ⟨fib(x+1), fib(x)⟩ (abstract)
   g(x+1) ⇐ ⟨u+v, u⟩ where ⟨u,v⟩ = g(x) (fold with 4)
7. fib(x+2) ⇐ u + v where ⟨u,v⟩ = ⟨fib(x+1), fib(x)⟩ (abstract 3)
   fib(x+2) ⇐ u + v where ⟨u,v⟩ = g(x) (fold with 4)

The new, more efficient definition of Fibonacci is:

fib(0) ⇐ 1
fib(1) ⇐ 1
fib(x+2) ⇐ u + v where ⟨u,v⟩ = g(x)
g(0) ⇐ ⟨1,1⟩
g(x+1) ⇐ ⟨u+v, u⟩ where ⟨u,v⟩ = g(x).
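Transcribed directly into Python (our own illustrative rendering, not part of the source), the transformed equations compute Fibonacci in linear time, because the pair ⟨fib(x+1), fib(x)⟩ introduced by the eureka definition is threaded through the recursion so that each value is computed exactly once:

```python
# Direct transcription of the transformed equations: g(x) returns the pair
# <fib(x+1), fib(x)>, so no value is ever computed twice.

def g(x):
    if x == 0:
        return (1, 1)            # g(0) <= <1,1>
    u, v = g(x - 1)              # g(x+1) <= <u+v, u> where <u,v> = g(x)
    return (u + v, u)

def fib(x):
    if x <= 1:                   # fib(0) <= 1, fib(1) <= 1
        return 1
    u, v = g(x - 2)              # fib(x+2) <= u + v where <u,v> = g(x)
    return u + v

print([fib(n) for n in range(8)])  # [1, 1, 2, 3, 5, 8, 13, 21]
```

Each call to fib(n) now triggers a single linear chain of calls to g, whereas the original equations recompute overlapping subproblems exponentially often.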

Abstract Programming and Data Type Change. Burstall and Darlington (1977) also demonstrated how vertical program transformation can be realized by transforming abstract programs, defined on high-level, abstract data types, into concrete programs defined on concrete data.

Again, we illustrate the approach with an example. Given is the abstract data type of labeled trees:

Abstract Data Type
niltree ∈ labeledTrees
ltree : atoms × labeledTrees × labeledTrees → labeledTrees.

That is, a labeled tree is either empty (niltree) or it is a labeled node (atoms) which branches into two labeled trees.

Furthermore, a data type of binary trees might be available as basic data structure (e.g., in Lisp) with constructors nil and pair:

Concrete Data Structure
nil ∈ binaryTrees
atoms ∈ binaryTrees
pair : binaryTrees × binaryTrees → binaryTrees.

A labeled tree can be represented as a binary tree by representing each node as a pair of a label and a binary tree. For example:

ltree(A, niltree, ltree(B, niltree, niltree))

can be represented as

pair(A, pair(nil, pair(B, pair(nil, nil)))).

The relationship between the abstract data type and the concrete data structure can be expressed by the following representation function:

R : binaryTrees → labeledTrees
R(nil) ⇐ niltree
R(pair(a, pair(p1, p2))) ⇐ ltree(a, R(p1), R(p2)).

The inverse representation function, which in some cases could be generated automatically, defines how to code the data type:

C : labeledTrees → binaryTrees
C(niltree) ⇐ nil
C(ltree(a, t1, t2)) ⇐ pair(a, pair(C(t1), C(t2))).


The user can write an abstract program, using the abstract data type. Typically, writing abstract programs involves less work than writing concrete programs because implementation-specific details are omitted. Furthermore, abstract programs are more flexible because they can be transformed into different concrete realizations.

An example for an abstract program is twist, which mirrors the input tree:

twist : labeledTrees → labeledTrees
twist(niltree) ⇐ niltree
twist(ltree(a, t1, t2)) ⇐ ltree(a, twist(t2), twist(t1))

The goal is to automatically generate a concrete program TWIST(p) = C(twist(R(p))):

1. TWIST(p) ⇐ C(twist(R(p)))
2. TWIST(nil) ⇐ C(twist(R(nil))) (instantiate)
3. TWIST(nil) ⇐ nil (unfold C, twist, R, and evaluate)
4. TWIST(pair(a,pair(p1,p2))) ⇐ C(twist(R(pair(a,pair(p1,p2))))) (instantiate)
5. TWIST(pair(a,pair(p1,p2))) ⇐ C(twist(ltree(a,R(p1),R(p2)))) (unfold R)
6. TWIST(pair(a,pair(p1,p2))) ⇐ C(ltree(a,twist(R(p2)),twist(R(p1)))) (unfold twist)
7. TWIST(pair(a,pair(p1,p2))) ⇐ pair(a,pair(C(twist(R(p2))),C(twist(R(p1))))) (unfold C)
8. TWIST(pair(a,pair(p1,p2))) ⇐ pair(a,pair(TWIST(p2),TWIST(p1))) (fold with 1)

with the resulting program given by equations 3 and 8.
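Equations 3 and 8 can be checked by running them. In this illustrative Python transcription (our own, not from the source), Lisp pairs are modeled as nested tuples and nil as None, so TWIST works directly on the concrete representation:

```python
# Pair representation of labeled trees: nil is None, pair(x, y) is (x, y).
# TWIST mirrors a tree on the concrete representation (equations 3 and 8).

def TWIST(p):
    if p is None:                          # TWIST(nil) <= nil
        return None
    a, (p1, p2) = p                        # p = pair(a, pair(p1, p2))
    return (a, (TWIST(p2), TWIST(p1)))     # pair(a, pair(TWIST(p2), TWIST(p1)))

# ltree(A, niltree, ltree(B, niltree, niltree)) as a binary tree:
t = ("A", (None, ("B", (None, None))))
print(TWIST(t))  # ('A', (('B', (None, None)), None))
```

Note that applying TWIST twice yields the original tree, as expected of a mirroring function.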

Characteristics of Transformational Implementation. To sum up, the approach proposed by Burstall and Darlington (1977) has the following characteristics:

Specification: Input is a structurally simple program whose correctness is obvious (or can easily be proved).

Basic Rules: A transformation system is based on a set of rules. There are rules which define semantics-preserving re-formulations of the (right-hand side, i.e., body of the) program (or specification). Other rules, such as definition and instantiation, cannot be applied mechanically but are based on creativity (eureka) and insight into the problem to be solved.

Partial Correctness: Transformations are semantics-preserving modulo termination.

Selection of Rules: Providing the sequence of rule applications which leads to the desired result typically cannot be performed fully automatically. Burstall and Darlington presented a simple strategy for rule selection. This strategy can in general not be applied without guidance by the user.


6.2.2.2 Meta-Programming: The CIP Approach. The CIP approach was developed during the seventies and eighties (Broy & Pepper, 1981; Bauer, Broy, Möller, Pepper, Wirsing, et al., 1985; Bauer, Ehler, Horsch, Möller, Partsch, Paukner, & Pepper, 1987) as a system to support the formal development of correct and efficient programs (meta-programming). The approach has the following characteristics:

Transformation Rules: Describe mappings from programs to programs. They are represented as directed pairs of program schemes (a → b), together with an enabling condition (c) defining applicability.

Transformational Semantics: A programming language is considered as a term algebra. The given transformation rules define congruence relations on this algebra and thereby establish a notion of "equivalence of programs" (Ehrig & Mahr, 1985).

Correctness of Transformations: The correctness of a basic set of transformation rules can be proved in the usual way by showing that a and b are equivalent with respect to the properties of the semantic model. Correctness of all other rules can be shown by transformational proofs: A transformation rule T : a → b is correct if it can be deduced from a set of already verified rules T1, ..., Tn: a ↦(T1) a1 ↦(T2) ... ↦(Tn) an = b. In general, for all rules a → b it must hold that the relation between a and b is reflexive and transitive. Properties of recursive programs are proved by transformational induction: For two recursive programs p and q with bodies [p] and [q], every call p(x) can be transformed into a call q(x) if there is a transformation rule [p] → [q].

Abstract Data Types: Abstract data types are given by a signature (sorts and operations) and a set of equations (axioms) (see, e.g., Ehrig & Mahr, 1985). These axioms can be considered as transformation rules which are applicable within the whole scope of the type. (Considering a programming language as a term algebra, the programming language itself can be seen as an abstract data type with the transformation rules as axioms.)

Assertions: For some transformation rules there might be enabling conditions which hold only locally. For example, for the rule

  if B then S else T fi  →  S   (enabling condition: B)

the enabling condition B can be given or not, depending on the current input. If B holds, then the conditional statement evaluates to S. Such local properties are expressed by assertions which can be provided by the programmer. Furthermore, transformation rules can be defined for generating or propagating assertions.


Again, we illustrate the approach with an example which is presented in detail in Broy and Pepper (1981). The searched-for program is a realization of the Warshall algorithm for calculating the transitive closure of a graph.

The graph is given by its characteristic function

function edge (x:node, y:node): bool;
(true, if nodes x and y are directly connected).

A path is defined as a sequence of nodes where each pair of succeeding nodes is directly connected:

function ispath (s: seq(node)): bool;
∀ s1, s2: seq(node), x, y: node ::
  s = s1 ◦ x ◦ y ◦ s2 ⇒ edge(x, y).

Starting point for program transformation is the specification of the searched-for function:

function trans (x:node, y:node): bool;
∃ p: seq(node) :: ispath(x ◦ p ◦ y).

The basic idea (eureka!) is to define a more general function which only considers the first i nodes (where nodes are given as natural numbers 1 ... n):

function tc (i:nat, x:node, y:node): bool;
∃ p: seq(node) :: ispath(x ◦ p ◦ y)
  ∧ ∀ z: node :: z ∈ p ⇒ z ≤ i.

In a first step, it must be proved that tc(n,x,y) = trans(x,y), where n is the number of all nodes in the graph:

Lemma 1: tc(n,x,y) = trans(x,y)

tc(n,x,y) = ∃ p: seq(node) :: ispath(x ◦ p ◦ y) ∧ ∀ z: node :: z ∈ p ⇒ z ≤ n (unfold)
= ∃ p: seq(node) :: ispath(x ◦ p ◦ y) (simplification: z ≤ n holds for all nodes z)
= trans(x,y) (fold).

To obtain an executable function from the given specification, a next eureka-step is the introduction of recursion. This is done by formulating a base case (for i = 0) and a recursive case:

Lemma 2: tc(0,x,y) = edge(x,y)
Lemma 3: tc(i+1,x,y) = tc(i,x,y) ∨ [tc(i,x,i+1) ∧ tc(i,i+1,y)].

The proof of lemma 2 is analogous to that of lemma 1. The proof of lemma 3 involves an additional case analysis (all nodes on a path are smaller than or equal to node i vs. a path contains node i+1) and application of logical rules.

The proved lemmas give rise to the program

function tc (i:nat, x:node, y:node): bool;
  if i = 0
  then edge(x,y)
  else tc(i-1,x,y) ∨ [tc(i-1,x,i) ∧ tc(i-1,i,y)].
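A direct Python transcription of tc (our own illustration) shows that the recursion following the proved lemmas indeed computes reachability, if inefficiently compared with the loop version obtained by further transformation. The adjacency set and the node numbering 1 ... n below are assumptions for the example; edge is the characteristic function of the graph:

```python
# Recursive transitive closure following the proved lemmas; nodes are 1..n.

def make_tc(edge):
    def tc(i, x, y):
        if i == 0:
            return edge(x, y)                                # Lemma 2
        # Lemma 3: either avoid node i, or route through node i
        return tc(i - 1, x, y) or (tc(i - 1, x, i) and tc(i - 1, i, y))
    return tc

edges = {(1, 2), (2, 3), (3, 4)}          # a simple chain graph, assumed data
edge = lambda x, y: (x, y) in edges
tc = make_tc(edge)
n = 4
trans = lambda x, y: tc(n, x, y)          # Lemma 1: trans(x,y) = tc(n,x,y)
print(trans(1, 4), trans(4, 1))           # True False
```

The exponential blow-up of this recursion is exactly what the subsequent transformation into for-loops (the classic Warshall algorithm) removes.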

This example demonstrates how the development of a program from a non-executable initial specification can be supported by CIP. Program development was mainly "intuition-guided", but the transformation system supports the programmer in two ways: First, the notion of program transformation results in systematic program development by stepwise refinement, and second, the transformation system (partially) automatically generates proofs for the correctness of the programmer's intuitions.

Given the recursive function tc, in a next step it can be transformed from its inefficient recursive realization into a more efficient version realized by for-loops. This transformational implementation can be performed automatically by means of sophisticated (and verified) transformation rules.

6.2.2.3 Program Synthesis by Stepwise Refinement: KIDS. KIDS⁴ is currently the most successful system for deductive (transformational) program synthesis. KIDS supports the development of correct and efficient programs by applying consistency-preserving transformations to an initial specification, first transforming the specification into an executable but inefficient program and then optimizing this program. The basis of KIDS is Refine, a wide-spectrum language for representing specifications as well as programs. The final program is represented in Common Lisp.

In KIDS a variety of state-of-the-art techniques are integrated into one system, and KIDS makes use of a large, formalized amount of domain and programming knowledge: It offers program schemes for divide-and-conquer, global search (binary search, depth-first search, breadth-first search), and local search (hill climbing); it makes use of a library of reusable domain axioms (equations specifying abstract data types); it integrates deductive inference (with about 500 inference rules represented as directed transformation rules); and it includes a variety of state-of-the-art techniques for program optimization (such as expression simplification and finite differencing).

KIDS is an interactive system where the user typically goes through the following steps of program development (Smith, 1990):

Develop a Domain Theory: The user defines types and functions and provides laws that allow high-level reasoning about the defined functions. The user can develop the theory from scratch or make use of a hierarchical library of types.
For example, a domain theory for the k-Queens problem gives boolean functions representing the constraints that no two queens can be in the same row or column and that no two queens can be in the same diagonal of a chessboard. Laws for reasoning about functions typically include monotonicity and distributive laws.

⁴ see http://www.kestrel.edu/HTML/prototypes/kids.html

Create a Specification: The user enters a specification statement in terms of the domain theory.
For example, a specification of queens states that for k queens on a k×k board, a set of board positions must be returned for which the constraints defined in the domain theory hold.

Apply a Design Tactic: The user selects a program scheme, representing a tactic for algorithm design, and applies it to the specification. This is the crucial step of program development, transforming a typically non-executable specification into an (inefficient) high-level program.
For example, the k-Queens specification can be solved with global search, which can be refined to depth-first search (find legal solutions using backtracking).

Apply Optimizations: The user selects an optimization operation and a program expression to which it should be applied. The optimization techniques are fully automatic.

Apply Data Type Refinements: The user can select implementations for the high-level data types in the program.
Which implementation of a data type will result in the most efficient program depends on the kind of operations to be performed. For example, sets could be realized as lists, arrays, or trees.

Compile: The code is compiled into machine-executable form (first Common Lisp and then machine code).

The k-Queens example is given in detail in Smith (1990). The idea of program development by stepwise refinement of abstract schemes is also illustrated in Smith (1985) for constructing a divide-and-conquer based search algorithm.⁵

The KIDS system demonstrates that it is possible for automatic program synthesis to cover all steps of program development, from high-level specifications to efficient programs. Furthermore, it gives some evidence to the claim of KBSE that automation can result in a higher level of productivity in software development. High productivity is also due to the possibility of reuse of domain axioms. But of course, the system is not "auto-magic": it strongly depends on interactions with an expert user, especially for the initial development steps from domain theory to selection of a suitable design tactic. These first steps can be seen as an approach to meta-programming, similar to CIP. Even for optimization, the user must have knowledge of which parts of the program could be optimized in what way, to select the appropriate expressions and rules to be applied to them.

⁵ Here the system Cypress, which was the predecessor of KIDS, was used.

6.2.2.4 Concluding Remarks. In principle, there is no fundamental difference between constructive theorem proving and program transformation: In both cases, the starting point is a formal specification and the output is a program which is correct with respect to the specification. Theorem proving can be converted into transformation if axioms are represented as rules (equations may correspond to bi-directional rules) and if proof rules (such as resolution) are represented as special rewrite rules. The advantage of transformation over proof systems for program synthesis is that forward reasoning and a smaller formal overhead make program derivation somewhat easier (Kreitz, 1998). In contrast, proof systems are typically used for (ex-post) program verification.

As described above, program transformation systems might include non-verified rules representing domain knowledge which are applied in interaction with the system user. Such rules are mainly applied during the first steps of program development, until an initial executable program is constructed. The following optimization steps (transformational programming) mainly rely on verified rules which can be applied fully automatically. Code optimization by transformation is typically dealt with in textbooks on compilers (Aho, Sethi, & Ullman, 1986). A good introduction to program transformation and optimization is given in Field and Harrison (1988).

An interesting approach to "reverse" transformation from efficient loops (tail recursion) to more easily verifiable (linear) recursion is proposed by Giesl (2000).

6.3 Inductive Approaches

In the following we introduce inductive approaches to program synthesis. First, basic concepts of inductive inference are introduced. Inductive inference is researched in philosophy (epistemology), in cognitive psychology, and in artificial intelligence. In this section we focus on induction in AI, that is, on machine learning. In chapter 9 we discuss some relations between inductive program synthesis and inductive learning of strategies in human problem solving. Grammar inference is introduced as the theoretical background for program synthesis. In the following sections, three different approaches to program synthesis are described: genetic programming, inductive logic programming, and synthesis of functional programs. Synthesis of functional programs is the oldest and genetic programming the newest approach. In genetic programming, hypothesis construction relies most strongly on search in hypothesis space, while in functional program synthesis, hypothesis construction is most strongly guided by the structure of the examples.

6.3.1 Foundations of Induction

As introduced in section 6.1.2.2, induction means the inference of generalized rules from examples. The ability to perform induction is a presupposition for learning; often induction and learning are used as synonyms.


6.3.1.1 Basic Concepts of Inductive Learning. Inductive learning can be characterized in the following way (Mitchell, 1997, p. 2):

Definition 6.3.1 (Learning). A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

In the following, we will flesh out this definition.

Learning Tasks. Learning tasks can be roughly divided into classification tasks versus performance tasks, labeled as concept learning versus skill acquisition. Examples for concept learning are recognizing handwritten letters or identifying dangerous substances. Examples for skill acquisition are navigating without bumping into obstacles, or controlling a chemical process such that certain variables stay in a pre-defined range.

In inductive program synthesis, the learning task is to construct a program which transforms input values from a given domain into the desired output values. A program can be seen as a representation of a concept: Each input value is classified according to the operations which transform it into the output value. Especially, programs returning boolean values can be viewed as concept definitions. Typical examples are recursive definitions of odd or ancestor (see Fig. 6.1). A program can also be seen as a representation of a (cognitive) skill: The program represents the operations which must be performed for a given input value to fulfill a certain goal. For example, a program quicksort represents the skill of efficiently sorting lists, and a program hanoi represents the skill of solving the Tower of Hanoi puzzle with a minimum number of moves (see Fig. 6.1).

Learning Experience. Learning experience typically is provided by presenting a set of training examples. Examples might be pre-classified; then we speak of supervised learning or learning with a teacher. Otherwise we speak of unsupervised learning or learning from observation.

For example, for learning a single concept such as dangerous versus harmless substance, the system might be provided with a number of chemical descriptions labeled as positive (i.e., dangerous) and negative (i.e., harmless) instances of the to-be-learned concept. For learning to control a chemical process, the learning system might be provided with current values of a set of control variables, labeled with the appropriate operations which must be performed in that case; each operation sequence then constitutes a class to be learned. Alternatively, for skill acquisition, a learning system might provide its own experience, performing a sequence of operations and getting feedback for the final outcome (reinforcement learning). In this case, the examples are labeled only indirectly; the system is confronted with the credit assignment problem: determining the degree to which each performed operation is responsible for the final outcome.


; functional program deciding whether a natural number is odd
odd(x) = if (x = 0) then false else if (x = 1) then true else odd(x-2)

; logical program deciding whether p is ancestor of f
ancestor(P,F) :- parent(P,F).
ancestor(P,F) :- parent(P,I), ancestor(I,F).

; logical program for quicksort
qsort([X|Xs],Ys) :-
    partition(Xs,X,S1,S2),
    qsort(S1,S1s),
    qsort(S2,S2s),
    append(S1s,[X|S2s],Ys).
qsort([],[]).

; functional program for hanoi
hanoi(n,a,b,c) = if (n = 1) then move(a,b)
    else append(hanoi((n-1),a,c,b), move(a,b),
                hanoi((n-1),c,b,a))

Fig. 6.1. Programs Represent Concepts and Skills
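As a concrete rendering of the skill view, the hanoi program of Fig. 6.1 can be transcribed into Python (our own sketch, with moves represented as peg pairs rather than move terms):

```python
# Python rendering of the hanoi program from Fig. 6.1: returns the minimal
# move sequence transporting n discs from peg a to peg b using auxiliary peg c.

def hanoi(n, a, b, c):
    if n == 1:
        return [(a, b)]                                        # move(a,b)
    return hanoi(n - 1, a, c, b) + [(a, b)] + hanoi(n - 1, c, b, a)

print(hanoi(3, "A", "B", "C"))  # 7 moves, as in the N = 3 example of Table 6.2
```

The returned sequence for n = 3 coincides with the training example given for hanoi in Table 6.2 below; in general, hanoi(n, ...) produces 2^n - 1 moves.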

In the simplest form, examples might consist just of a set of attributes. Such attributes might be categorical (substance contains fluorine yes/no) or metric (current temperature has value x). In general, examples might be structured objects, that is, sets of relations or terms. For instance, for the ancestor problem (see Fig. 6.1), kinship relations between different persons constitute the learning experience. In program synthesis, examples are always structured objects.

Training examples can be presented stepwise (incremental learning) or all at once ("batch" learning). Both learning modes are possible for program synthesis. Finally, training examples should be a representative sample of the concept or skill to be learned. Otherwise, the learning system might be successful only for a subset of possible situations.

In Table 6.2 possible training examples for the ancestor and the hanoi problems are given. In the first case, given some parent-child relations, positive and negative examples for the concept "ancestor" are presented. In the second case, examples are associated with sequences of move operations for different numbers of discs. The hanoi example can be classified as input to supervised or unsupervised learning: The number of discs (n = i, i = 1 ... 3) represents a possible input, and the move operations represent the class the input must be associated with (see Briesemeister et al., 1996). Alternatively, the disc-number/operations pairs can be seen as patterns, and the learning task is to extract their common structure (descriptive generalization). While in the first case the different operator sequences constitute different classes to be learnt, in the second case all examples are positive. The examples are ordered with respect to the number of discs. Therefore, in an indirect way, they also contain information about which cases do not belong to the domain.

Table 6.2. Training Examples

; ANCESTOR(X,Y)
parent(peter,robert)     parent(mary,robert)
parent(robert,tim)       parent(tim,anna)
parent(tim,john)         parent(julia,john)
...                      ...

; pre-classified positive and negative examples
ancestor(julia,john)     not(ancestor(anna,tim))
ancestor(peter,tim)      not(ancestor(anna,mary))
ancestor(peter,anna)     not(ancestor(john,robert))
...                      ...

; HANOI(N,A,B,C)
(N = 1) --> (move(A,B))
(N = 2) --> (move(A,C),move(A,B),move(C,B))
(N = 3) --> (move(A,B),move(A,C),move(B,C),
             move(A,B),move(C,A),move(C,B),
             move(A,B))
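The pre-classified examples of Table 6.2 can be checked against the recursive ancestor definition of Fig. 6.1. This Python transcription is our own illustration; the parent facts are taken from the table:

```python
# ancestor(P,F) holds if parent(P,F), or if parent(P,I) and ancestor(I,F)
# for some intermediate person I (the two clauses of Fig. 6.1).

parent = {("peter", "robert"), ("mary", "robert"), ("robert", "tim"),
          ("tim", "anna"), ("tim", "john"), ("julia", "john")}

def ancestor(p, f):
    if (p, f) in parent:
        return True
    return any(ancestor(i, f) for (q, i) in parent if q == p)

# positive examples                  # negative examples
print(ancestor("julia", "john"))     # True
print(ancestor("peter", "anna"))     # True
print(ancestor("anna", "tim"))       # False
```

All six labeled examples of the table are reproduced by this intensional definition, which is exactly what an inductive learner is expected to find from the extensional data.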

Additional Information. The learning system might be provided with additional information besides the training examples. For instance, in the ancestor problem in Table 6.2 some parent-child relations are directly given. These facts are essential for learning: It is not possible to come up with a terminating recursive rule (as given in Fig. 6.1) without reference to a direct relation such as parent.

If a system is given additional information besides the training examples, this is typically called background knowledge. Background knowledge can have different functions for learning: it can be necessary for hypothesis construction, it can make hypothesis construction more efficient, or it can make the resulting hypothesis more compact or readable.

The ancestor example could have been presented in a different way, as shown in Table 6.3. Instead of giving parent relations, mother and father relations can be used, together with a rule stating that father(X,Y) or mother(X,Y) implies parent(X,Y). Another example for using background knowledge for learning is to give pre-defined rules for append and partition for learning quicksort (see Fig. 6.1).

Additional information can also be provided in form of an oracle: The learning system constructs new examples which are consistent with its current hypothesis and asks an oracle whether the example belongs to the class to be learned. An oracle is usually the teacher.


Table 6.3. Background Knowledge

; ANCESTOR(X,Y)
father(peter,robert)     mother(mary,robert)
father(robert,tim)       father(tim,anna)
father(tim,john)         mother(julia,john)
...

parent(X,Y) :- mother(X,Y).
parent(X,Y) :- father(X,Y).

Representing Hypotheses. The task of the learning system is to construct generalized rules (representing concepts or skills) from the training examples and background knowledge. In other words, learning means to construct an intensional representation from a sample of an extensional representation of a concept or skill. As mentioned in section 6.1.2.2, such generalized rules have the status of hypotheses.

For supervised concept learning, a hypothesis is a special case of a function f : X → Y which maps a given object belonging to the learning domain to a class. In general, f can be a linear or a non-linear function, mapping a vector of attribute values X into an output value Y representing the class the object described by this vector is supposed to belong to. Alternatively, hypotheses can be represented symbolically; the most prominent example are decision trees.

In the context of program synthesis, inputs X are structural descriptions (relations or terms), outputs Y are operator sequences transforming the input into the desired output, and hypotheses are represented as logical or functional programs (see Fig. 6.1): For the ancestor problem, function f(x) is represented by clauses ancestor(X,Y) which return true or false for a given set of parent-child relations and two persons. For the hanoi problem, function f(x) is represented by a functional program which returns a sequence of move operations to transport a tower of n discs from the start peg to the goal peg.

Identification Criteria and Performance Measure. There are two stages in which the learning system is evaluated: First, a decision criterion is needed to terminate the learning algorithm, and, second, the quality of the learned hypothesis must be tested. An algorithm terminates if some identification criterion with respect to the training examples is fulfilled. The classical concept here is identification in the limit, proposed by Gold (1967). The learner terminates if the current hypothesis holds at some point during learning for all examples. An alternative concept is PAC (probably approximately correct) identification (Valiant, 1984). The learner terminates if the current hypothesis can be guaranteed with a high probability to be consistent with future examples.

6.3 Inductive Approaches 131

The first model represents an all-or-nothing perspective on learning: a hypothesis must be totally correct with respect to the seen examples. The second model is based on probability theory. Most approaches to concept learning are based on PAC-learnability, while program synthesis typically is based on Gold's theory (see sect. 6.3.1.2). An analysis of program synthesis from traces in the PAC theory was presented by Cohen (1995, 1998).

To evaluate the quality of the learned hypothesis, the learning system is presented with new examples and the generalization accuracy is obtained. Typically, the same criterion is used as for termination.

Learning Algorithm. A learning algorithm gets training examples and background knowledge as input and returns a hypothesis as output.

The majority of learning algorithms are for constructing non-recursive hypotheses for supervised concept learning where examples are represented as attribute vectors (Mitchell, 1997). Among the most prominent approaches are function approximation algorithms such as the perceptron, back-propagation, or support vector machines, Bayesian learners, and decision tree algorithms. For unsupervised concept learning, the dominant approach is cluster analysis.

Approaches for learning recursive hypotheses from structured examples, that is, approaches to inductive program synthesis, are genetic programming, inductive logic programming, and synthesis of functional programs. These approaches will be described in detail below.

In general, a learning algorithm is a search algorithm in the space of possible hypotheses. There are two main learning mechanisms: data-driven approaches start with the most specific hypothesis, that is, the training examples, and approximation-driven (or model-driven) approaches start with a set of approximate (incomplete, incorrect) hypotheses. Data-driven learning is incremental, iterating over the training examples (Flener, 1995). Each iteration returns a hypothesis which might be the final hypothesis searched for. Approximation-driven learning is non-incremental. Current hypotheses are generalized if not all positive examples are covered, or specialized if other than the positive examples are covered.

Induction Biases. As usual, there is a trade-off between the expressiveness of the hypothesis language and the efficiency of the learning algorithm (cf. the discussion of domain specification languages and efficiency of planning algorithms, chap. 2). Typically, in machine learning, search is restricted by so-called induction biases (Flener, 1995, pp. 33): Each learning algorithm is restricted by a syntactic bias, that is, a restriction of the hypothesis language. For logical program synthesis, the language can be restricted to a subset of Prolog clauses. For functional program synthesis, the form of functions can be restricted by program schemes. Furthermore, a semantic bias might restrict the vocabulary available for the hypothesis. For example, when synthesizing Lisp programs, the language primitives might be restricted to car, cdr, cons, and atom (see sect. 6.3.4).

132 6. Automatic Programming

Machine Learning Literature. The AI text books of Winston (1992) and Russell and Norvig (1995) include chapters on machine learning techniques. An excellent text book on machine learning was written by Mitchell (1997). Collections of important research papers, including work on inductive program synthesis, are presented by Michalski, Carbonell, and Mitchell (1983) and Michalski, Carbonell, and Mitchell (1986). A collection of papers on knowledge acquisition and learning was edited by Buchanan and Wilkins (1993). Flener (1995) discusses machine learning in relation to program synthesis.

6.3.1.2 Grammar Inference. Most of the basic concepts of machine learning as introduced above have their origin in the work of Gold (1967). Gold modeled learning as the acquisition of a formal grammar from examples and provided a classification of models of language learnability together with fundamental theoretical results. Grammar inference research developed from Gold's work in two directions: providing theoretical results of learnability for certain classes of languages given a certain kind of learning model, and, more recently, the development of efficient learning algorithms for some classes of formal grammars. We focus on the conceptual and theoretical aspect of grammar inference, that is, on algorithmic learning theory. Learning theory can be seen as a special case of complexity theory where the complexity of languages is researched with respect to the effort of their enumeration (by application of rules of a formal grammar) or with respect to the effort of their identification by an automaton.

Definition 6.3.2 (Formal Language). For a given non-empty finite set A, called the alphabet of the language, A* represents the set of all finite strings over elements from A. A language L is defined as L ⊆ A*.

Learning means to assign a grammar (called naming relation by Gold) to a language L after seeing a sequence of example strings:

Definition 6.3.3 (Language Learning). For a sequence of time steps, the learner is at each time step confronted with a unit of information i_t concerning the unknown language L. A stepwise presentation of units i is called a training sequence. At each time step, the learner makes a guess g_t of the grammar characterizing L, based on the information it has received from the start of learning to time step t. That is, the learner is a function G with g_t = G(i_1, ..., i_t).

As introduced above, Gold introduced the concept of "identification in the limit" to define the learnability of a language:

Definition 6.3.4 (Identification in the Limit). L is identified in the limit if after a finite number of time steps the guesses are always the same, that is, g_t = g_{t+i}, i ≥ 0. A class of languages L is called identifiable in the limit if there exists an algorithm such that each language of the class will be identified in the limit for any allowable training sequence.

Identification in the limit is dependent on the method of information presentation. Gold discerns presentation of text and learning with an informant. Learning from text means learning from positive examples only, where examples are presented in an arbitrary order. Learning with an informant corresponds to supervised learning. The informant can present positive and negative examples and can provide some methodical enumeration of examples. Gold investigates three modes of learning from text and three modes of learning with an informant:

Method of information presentation: Text
A text is a sequence of strings x_1, x_2, ... from L such that each string of L occurs at least once. At time t the learner is presented x_t.
Arbitrary Text: x_t may be any function of t.
Recursive Text: x_t may be any recursive function of t.
Primitive Recursive Text: x_t may be any primitive recursive function of t.

Method of information presentation: Informant
An informant for L tells the learner at each time t whether a string y_t is an element of L. There are different ways of how to choose y_t:
Arbitrary Informant: y_t may be any function of t as long as every string of A* occurs at least once.
Methodical Informant: An enumeration is assigned a priori to the strings of A* and y_t is the t-th string of the enumeration.
Request Informant: At time t the learner chooses y_t on the basis of the information received so far.

The method "request informant" is currently an active area of research called active learning (Thompson, Califf, & Mooney, 1999). Gold showed that all three methods of information presentation by informant are equivalent (Theorem I.3 in Gold, 1967). Obviously, it is a harder problem to learn from text only than to learn from an informant, because in the second case not only positive but also negative examples are given.

Identification in the limit is also dependent on the "naming relation", that is, on the way in which a hypothesis about language L, that is, a grammar, is represented. The two possible naming relations for languages are:

Language Generator: A Turing machine which generates L. A generator is a function from positive integers (corresponding to time steps t) to strings in A* such that the function range is exactly L. A generator exists iff L is recursively enumerable.

Language Tester: A Turing machine which is a decision procedure for L. A tester is a function from strings to {0, 1} with value 1 for strings in L and value 0 otherwise.

Testers can be transformed into generators but not the other way round, and it is simpler to learn a generator for a class of languages than a tester (Gold, 1967).

The definition of learnability as identification in the limit together with a method of information presentation and a naming relation constitutes a language learnability model.

Gold presented some methods for identification in the limit; the most prominent one is identification by enumeration.

Definition 6.3.5 (Identification by Enumeration).^6 Let D be a description for a class of languages where the elements of D are effectively enumerable. Let L(d) be the language denoted by description d. For each time step t, search for the least k such that i_t is in L(d_k) and all preceding i_j, j = 1 ... t - 1, are in L(d_k), too.

Identification by enumeration presupposes a linear ordering of possible hypotheses. Each incompatibility between the training examples seen so far and the current hypothesis results in the elimination of this hypothesis. That is, if hypotheses are effectively enumerable, it is guaranteed that the t-th guess is the earliest description compatible with the first t elements of the training sequence and that learning will converge to the first description of L(d) in the given enumeration D. In practice, enumeration is mostly not efficient. It is often possible to organize the class of hypotheses in a better way such that whole subclasses can be eliminated after some incompatibility is identified (Angluin, 1984). The notion of inductive inference as search through a given space of hypotheses is helpful for theoretical considerations. In practice, this space typically is only given indirectly by the inference procedure and hypotheses are generated and tested at each inference step.
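The enumeration procedure of definition 6.3.5 can be sketched in a few lines of Python (with an illustrative toy hypothesis space of our own, not one from the text): the guess after each example is the least index k whose language contains all examples seen so far, so guesses only ever move forward through the enumeration.

```python
# Sketch of identification by enumeration. Hypotheses are an effectively
# enumerated list of membership predicates d_0, d_1, ...; the guess g_t
# is the least k such that L(d_k) contains all examples seen so far.

def identify_by_enumeration(descriptions, training_sequence):
    seen = []
    for example in training_sequence:          # unit of information i_t
        seen.append(example)
        guess = next(k for k, d in enumerate(descriptions)
                     if all(d(x) for x in seen))
        yield guess                            # guess g_t

# Toy hypothesis space: L(d_k) = strings over {a} of length at most k.
descriptions = [lambda s, k=k: set(s) <= {"a"} and len(s) <= k
                for k in range(10)]
guesses = list(identify_by_enumeration(descriptions, ["a", "aaa", "aa"]))
print(guesses)   # the guess index never moves backwards
```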

Gold (1967) provided fundamental results about which classes of languages are identifiable in the limit given which learnability model. These results are summarized in table 6.4. Later work in the area of learning theory is mostly concerned with refining Gold's results by identifying interesting subclasses of the original categorization of Gold and providing analyses of their learnability together with efficiency results (Zeugmann & Lange, 1995; Sakakibara, 1997).

Theorems and proofs concerning all results given in table 6.4 can be found in the appendix of Gold (1967). We just highlight some results, presenting the theorems for the class of finite cardinality languages (languages with a finite number of words) and for the class of superfinite languages, containing all languages of finite cardinality and at least one of infinite cardinality.

Theorem 6.3.1 (Learnability of Finite Cardinality Languages). (Gold, 1967, theorem I.6) Finite cardinality languages are identifiable from (arbitrary) text, that is, from positive examples presented in some arbitrary sequence, only.

^6 The definition is oriented at the formulation of Angluin (1984) rather than at the original formulation of Gold (1967).

Table 6.4. Fundamental Results of Language Learnability (Gold, 1967, tab. 1)

Learnability Model                             Class of Languages
Primitive Recursive Text + Generator Naming    Recursively enumerable, recursive
Informant                                      Primitive recursive, context-sensitive,
                                               context-free, regular, superfinite
Text                                           Finite cardinality languages

Proof (Learnability of Finite Cardinality Languages). Imagine a language which consists of the integers from 1 to a fixed number n. For any information sequence i_1, i_2, ... the learner can assume that the language consists only of the numbers seen so far. Since L is finite, after some time all elements of L have occurred in the information sequence, such that the guess will be correct. It is enough to give the proof for a language consisting of a finite set of positive integers because all other finite languages can be mapped onto this language.
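The learner used in this proof is trivial to state as code (a minimal sketch): guess that the language is exactly the set of strings seen so far; for a finite language the guesses stabilize once every element has occurred in the text.

```python
def learn_finite_language(text):
    # Guess after each presented string: "L is exactly what I have seen".
    guess = set()
    for string in text:
        guess.add(string)
        yield frozenset(guess)

L = {1, 2, 3}
presentation = [1, 2, 1, 3, 2, 3, 1]    # a text: every element occurs
guesses = list(learn_finite_language(presentation))
print(guesses[-1] == frozenset(L))
```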

Theorem 6.3.2 (Learnability of Superfinite Languages). (Gold, 1967, theorem I.9) Using information presentation by recursive text and the generator naming relation, any class of languages which contains all finite languages and at least one infinite language L is not identifiable in the limit.

Proof (Learnability of Superfinite Languages). The infinite language L is recursively enumerable (because otherwise it would not have a generator). It can be shown that there exists a recursive sequence of positive integers which ranges over L without repetitions. If the information sequence is presented as a recursive sequence a_1, a_2, ... then at some time step t, this information sequence is a recursive text for the finite language L = {a_1, ..., a_t}. At a later time step t' some additional element of the infinite language can be presented and the information sequence now is a recursive text for the finite language L' = {a_1, ..., a_t, a_{t'}}. From this observation it follows that the learning system must change its guess an infinite number of times.

Primitive recursive languages are learnable from an informant because for such languages an efficient enumeration exists (Hopcroft & Ullman, 1980, chap. 7). Because learning with an informant includes the presentation of negative examples, these can be used to remove incompatible hypotheses from the search. While primitive recursive languages are identifiable in the limit, it is not possible even for regular languages to find a minimal description (a deterministic finite automaton with a minimal number of states) in polynomial time (Gold, 1978)!

Most research in grammar inference is concerned with regular languages and subclasses of them (such as regular pattern languages). Angluin (1981) could show that a minimal deterministic finite automaton can be inferred in polynomial time from positive and negative examples if, in addition to an informant, an oracle answering membership queries is used. Angluin's ID algorithm is used, for example, by Schrödl and Edelkamp (1999) for inferring recursive functions from user-generated traces. Boström (1998) showed how discriminating predicates can be inferred from only positive examples using an algorithm for inferring a regular grammar.^7

In chapter 7 we will show that the class of recursive programs addressed by our approach to folding finite ("initial") programs corresponds to context-free tree grammars. Finite programs can be seen as primitive recursive text, and folding, that is, generating a recursive program which results in the given finite program as its i-th unfolding, corresponds to generator naming (see the first line in tab. 6.4).

6.3.2 Genetic Programming

Genetic programming as an evolutionary approach to program synthesis was established by Koza (1992). The starting point is a population of randomly generated computer programs. New programs better adapted to the "environment", which is represented by examples together with a "fitness" function, are generated by iterative application of Darwinian natural selection and biologically inspired "reproduction" operations.^8 Genetic programming is applied in system identification, classification, control, robotics, optimization, game playing, and pattern recognition.

6.3.2.1 Basic Concepts.

Representation and Construction of Programs. In genetic programming, typically functional programs are synthesized. Koza (1992) synthesizes Lisp functions; an application of genetic programming to the synthesis of ML functions was presented by Olsson (1995). These programs are represented as trees. A program is constructed as a syntactically correct expression over a set of predefined functions and a set of terminals. The set of functions can contain standard operators (e.g., arithmetic operators, boolean operators, conditional operators) as well as user-defined domain-specific functions. For each function, its arity is given. The set of terminals contains variables and atoms. Atoms represent constants. Variables represent program arguments. They can be instantiated by values obtained, for example, from "sensors".

^7 More information about grammar inference, including algorithms and areas of application, can be found via http://www.cs.iastate.edu/~honavar/gi/gi.html.
^8 The genetic programming home page is http://www.genetic-programming.org/.

A random generation of a program typically starts with an element of the set of functions. The function becomes the root node in the program tree and it branches in the number of arcs given by its arity. For each open arc, an element from the set of functions or terminals is introduced. Program generation terminates if each current node contains a terminal. Examples of program trees are given in figure 6.2.

[Figure omitted.] Fig. 6.2. Construction of a Simple Arithmetic Function (a) and an Even-2-Parity Function (b) Represented as a Labeled Tree with Ordered Branches (Koza, 1992, figs. 6.1, 6.2)

Representation of Problems. A problem together with a "fitness" measure constitutes the "environment" for a program. Problems are represented by a set of examples. For some problems, these can be pairs of possible input values together with their associated outputs. The sought program must transform possible inputs into desired outputs. Fitness can be calculated as the number of erroneous outputs generated by a program. Such problems can be seen as typical for inductive program synthesis. For learning to play a game successfully, the problem can consist of a set of possible constellations together with a scoring function. The sought program represents a strategy for playing this game. For each constellation, the optimal move must be selected. Fitness can, for example, be calculated as the percentage of wins. For learning a plan, the input consists of a set of possible problem states. The sought program generates action sequences to transform a problem state into the given goal state. Fitness can be calculated as the percentage of problem states for which the goal state is reached. An overview of a variety of problems is given in Koza (1992, tab. 2.1).

Generational Loop. Initially, thousands of computer programs are randomly generated. This "initial population" corresponds to a blind, random search in the space of the given problem. The size of the search space depends on the size of the set of functions over which programs can be constructed. Each program corresponds to an "individual" whose "fitness" influences the probability with which it will be selected to participate in the various "genetic operations" (described below). Because selection is not absolutely dependent on fitness, but fitness just influences the probability of selection, genetic programming is not a purely greedy algorithm. Each iteration of the generational loop results in a new "generation" of programs. After many iterations, a program might emerge which solves or approximately solves the given problem. An abstract genetic programming algorithm is given in table 6.5.

Table 6.5. Genetic Programming Algorithm (Koza, 1992, p. 77)

1. Generate an initial population of random compositions of the functions and terminals of the problem.
2. Iteratively perform the following sub-steps until the termination criterion has been satisfied:
   a) Execute each program in the population and assign it a fitness value according to how well it solves the problem.
   b) Select computer program(s) from the current population chosen with a probability based on fitness. Create a new population of computer programs by applying the following two primary operations:
      i. Copy the program to the new population (Reproduction).
      ii. Create new programs by genetically recombining randomly chosen parts of two existing programs.
3. The best-so-far individual (program) is designated as the result.
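The control structure of table 6.5 can be sketched as follows. This is a deliberately simplified sketch: the individuals here are digit tuples matched against a target rather than program trees, and the probabilities are illustrative; but the fitness-proportional generational loop is the same.

```python
import random

TARGET = (3, 1, 4, 1, 5)   # toy "environment": match this tuple

def fitness(ind):
    # Number of positions that already match the target (higher is better).
    return sum(a == b for a, b in zip(ind, TARGET))

def crossover(a, b, rng):
    # One-point recombination: prefix of one parent, suffix of the other.
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def evolve(pop_size=60, generations=40, seed=0):
    rng = random.Random(seed)
    population = [tuple(rng.randrange(10) for _ in TARGET)
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection probability grows with fitness but is never greedy.
        weights = [fitness(ind) + 1 for ind in population]
        nxt = []
        while len(nxt) < pop_size:
            if rng.random() < 0.10:    # reproduction: copy an individual
                nxt.append(rng.choices(population, weights)[0])
            else:                      # crossover of two selected parents
                a, b = rng.choices(population, weights, k=2)
                nxt.append(crossover(a, b, rng))
        population = nxt
    return max(population, key=fitness)

best = evolve()
print(best, fitness(best))
```

Note that even the best-so-far result is only probably good: nothing in the loop guarantees that a perfectly fit individual emerges within the given number of generations.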

Genetic Operations. The set of genetic operations is:

Mutation: Delete a subtree of a program and randomly grow a new subtree in its place. This "asexual" operation is typically performed sparingly, for example with a probability of 1% during each generation.

Crossover: For two programs ("parents"), a crossover point is chosen randomly in each tree, and the subtree rooted at the crossover point of the first program is deleted and replaced by the subtree rooted at the crossover point of the second program. This "sexual recombination" operation is the predominant operation in genetic programming and is performed with a high probability (85% to 90%).

Reproduction: Copy a single individual into the next generation. An individual "survives" with, for example, 10% probability.

Architecture Alteration: Change the structure of a program. There are different structure-changing operations which are applied sparingly (1% probability or below):

  Introduction of Subroutines: Create a subroutine from a part of the main program and create a reference between the main program and the new subroutine.
  Deletion of Subroutines: Delete a subroutine, thereby making the hierarchy of subroutines narrower or shallower.
  Subroutine Duplication: Duplicate a subroutine, give it a new name, and randomly divide the preexisting calls of the subroutine between the old and the new one. (This operation preserves semantics. Later on, each of these subroutines might be changed, for example by mutation.)
  Argument Duplication: Duplicate an argument of a subroutine and randomly divide internal references to it. (This operation is also semantics preserving. It enlarges the dimensionality of the subroutine.)
  Argument Deletion: Delete an argument, thereby reducing the amount of information available to a subroutine ("generalization").
  Automatically Defined Iterations/Recursions: Introduce or delete iterations (ADIs) or recursive calls (ADRs). Introduction of iterations or recursive calls might result in non-termination. Typically, the number of iterations (or recursions) is restricted for a problem. That is, for each problem, each program has a time-out criterion and is terminated "from outside" after a certain number of iterations.
  Automatically Defined Stores: Introduce or delete memory (ADSs).
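The two primary operations, mutation and crossover, manipulate program trees. A minimal sketch (trees as nested tuples, with illustrative function and terminal sets of our own, not Koza's Lisp implementation):

```python
import random

# Program trees as nested tuples, e.g. ("+", ("*", "A", "B"), "C").
FUNCTIONS = {"+": 2, "*": 2}     # operator -> arity
TERMINALS = ["A", "B", "C"]

def random_tree(rng, depth=2):
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    op = rng.choice(list(FUNCTIONS))
    return (op,) + tuple(random_tree(rng, depth - 1)
                         for _ in range(FUNCTIONS[op]))

def subtrees(tree, path=()):
    # Enumerate (path, subtree) pairs; a path is a tuple of child indices.
    yield path, tree
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from subtrees(child, path + (i,))

def replace_at(tree, path, new):
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace_at(tree[i], path[1:], new),) + tree[i + 1:]

def crossover(parent1, parent2, rng):
    # Pick a random crossover point in each parent and graft parent2's
    # subtree into parent1 at parent1's crossover point.
    path1, _ = rng.choice(list(subtrees(parent1)))
    _, sub2 = rng.choice(list(subtrees(parent2)))
    return replace_at(parent1, path1, sub2)

def mutate(tree, rng):
    # Delete a random subtree and grow a fresh random one in its place.
    path, _ = rng.choice(list(subtrees(tree)))
    return replace_at(tree, path, random_tree(rng))

rng = random.Random(1)
p1 = ("+", ("*", "A", "B"), "C")
p2 = ("*", "C", ("+", "A", "A"))
print(crossover(p1, p2, rng))
```

Because both operations only ever move or regrow whole subtrees, every offspring is again a syntactically correct expression over the given functions and terminals.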

Quality criteria for programs synthesized by genetic programming are correctness, which is defined as 100% fitness on the given examples, efficiency, and parsimony. The two latter criteria can additionally be coded in the fitness measure. In the following, we give examples of program synthesis by genetic programming.

6.3.2.2 Iteration and Recursion. First we describe how a plan for stacking blocks in a predefined order can be synthesized where the final program involves iteration of actions (Koza, 1992, chap. 18.1). This problem is similar to the tower problem described in section 3.1.4.5: An initial problem state consists of n blocks which can be stacked or lying on the table. Problem solving operators are puttable(x) and put(x,y), the first operator defining how a block x is moved from another block onto the table and the second operator defining how a block x can be put on top of another block y. The problem solving goal defines in which sequence the n blocks should be stacked into a single tower (see fig. 6.3).

[Figure omitted.] Fig. 6.3. A Possible Initial State, an Intermediate State, and the Goal State for Block Stacking (Koza, 1992, figs. 18.1, 18.2)

The set of terminals is T = {CS, TB, NN} and the set of functions is {MS, MT, NOT, EQ, DU} with:

CS: A sensor that dynamically specifies the top block of the Stack.
TB: A sensor that dynamically specifies the block in the Stack which together with all blocks under it is already positioned correctly ("top correct block").
NN: A sensor that dynamically specifies the block which must be stacked immediately on top of TB according to the goal ("next needed block").
MS: A move-to-stack operator with arity one which moves a block from the Table on top of the Stack.
MT: A move-to-table operator with arity one which moves a block from the top of the Stack to the Table.
NOT: A boolean operator with arity one switching the truth value of its argument.
EQ: A boolean operator with arity two which returns true if its arguments are identical and false otherwise.
DU: A user-defined iterative "do-until" operator with arity two. The expression (DU *Work* *Predicate*) causes *Work* to be iteratively executed until *Predicate* is satisfied.

All functions have defined outputs for all conditions: MS and MT change the Table and Stack as a side effect. They return true if the operator can be applied successfully and nil (false) otherwise. The return value of DU is also a boolean value indicating whether *Predicate* is satisfied or whether the DU operator timed out.

For this example, the set of terminals and the set of functions are carefully crafted. In particular, the predefined sensors carry exactly the information which is relevant for solving the problem! The problem is more restricted than the classical blocks-world problem: While in the blocks-world any constellation of blocks is possible (for example, four stacks containing two blocks each and one with only one block; see tab. 2.3 for the growth of the number of states in dependence of the number of blocks), in the block-stacking problem only one stack is allowed and all other blocks are lying on the table.

For the evolution of the desired program, 166 fitness cases were constructed: ten cases where zero to all nine blocks in the stack were already ordered correctly; eight cases where there is one out-of-order block on top of the stack; and a random sampling of 148 additional cases. Fitness was measured as the number of correctly handled cases. Koza (1992) reports three variants for the synthesis of a block stacking program, summarized in figure 6.4. The first, correct program first moves all blocks onto the table and then constructs the correct stack. This program is not very efficient because it makes unnecessary moves from the stack to the table for partially correct stacks. Over all 166 cases, this function generates 2319 moves in contrast to 1642 necessary moves. In the next trial, efficiency was integrated into the fitness measure and, as a consequence, a function calculating only the minimum number of moves emerged. But this function has an outer loop which is not necessary. By integrating parsimony into the fitness measure, the correct, efficient, and parsimonious function is generated.

Population Size: M = 500
Fitness Cases: 166

Correct Program:
Fitness: 166 - number of correctly handled cases. Termination: Generation 10
(EQ (DU (MT CS) (NOT CS))
    (DU (MS NN) (NOT NN)))

Correct and Efficient Program:
Fitness: 0.75 * C + 0.25 * E with
C = (number of correctly handled cases / 166) * 100 and E = f(n) as a function of the total number of moves n over all 166 cases:
f(n) = 100 for the analytically obtained minimal number of moves for a correct program (min = 1641);
f(n) linearly scaled upwards for zero moves up to 1640 moves with f(0) = 0;
f(n) linearly scaled downwards for 1642 moves up to 2319 moves (obtained by the first correct program), and f(n) = 0 for n > 2319.
Termination: Generation 11
(DU (EQ (DU (MT CS) (EQ CS TB))
        (DU (MS NN) (NOT NN)))
    (NOT NN))

Correct, Efficient, and Parsimonious Program:
Fitness: 0.70 * C + 0.20 * E + 0.10 * (number of nodes in program tree)
Termination: Generation 1
(EQ (DU (MT CS) (EQ CS TB))
    (DU (MS NN) (NOT NN)))

Fig. 6.4. Resulting Programs for the Block Stacking Problem (Koza, 1992, chap. 18.1)
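The behavior of the correct, efficient, and parsimonious program can be reproduced with a small simulation of the block-stacking world. This is a Python sketch, not Koza's Lisp: we assume the goal tower spells UNIVERSAL bottom-up as L, A, S, ..., U, and implement DU as a bounded test-before-execute loop (Koza's exact DU semantics may differ in when the predicate is checked). The two DU calls below are the two loops of the evolved program; the outer EQ merely combines their boolean results.

```python
GOAL = list(reversed("UNIVERSAL"))        # bottom ... top

class World:
    def __init__(self, stack, table):
        self.stack = list(stack)          # bottom ... top
        self.table = set(table)

    # --- sensors -----------------------------------------------------
    def CS(self):                         # current top of stack (or None)
        return self.stack[-1] if self.stack else None

    def _correct_len(self):
        k = 0
        while k < len(self.stack) and self.stack[k] == GOAL[k]:
            k += 1
        return k

    def TB(self):                         # top correct block (or None)
        k = self._correct_len()
        return self.stack[k - 1] if k else None

    def NN(self):                         # next needed block (or None)
        k = self._correct_len()
        return GOAL[k] if k < len(GOAL) else None

    # --- operators ---------------------------------------------------
    def MT(self, x):                      # move top of stack to table
        if x is None or not self.stack or self.stack[-1] != x:
            return None
        self.table.add(self.stack.pop())
        return True

    def MS(self, x):                      # move block from table to stack
        if x is None or x not in self.table:
            return None
        self.table.remove(x)
        self.stack.append(x)
        return True

def DU(work, pred, limit=50):             # bounded "do-until" combinator
    for _ in range(limit):
        if pred():
            return True
        work()
    return False                          # timed out

w = World(stack=["L", "A", "N", "U"], table=set("IVERS"))
DU(lambda: w.MT(w.CS()), lambda: w.CS() == w.TB())   # clear wrong blocks
DU(lambda: w.MS(w.NN()), lambda: w.NN() is None)     # build the tower
print(w.stack == GOAL)
```

On the sample state, the first loop pops only the two misplaced blocks N and U, and the second loop stacks the seven remaining needed blocks, so no superfluous moves are made.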

As an example of synthesizing a recursive program, Koza (1992, chap. 18.3) shows how a function calculating Fibonacci numbers can be generated from examples for the first twenty numbers. The problem is treated as a sequence induction problem: for an ascending order of index positions J, the function f(J) must be detected. As terminal set, T = {J, 0, 1, 2, 3} is given, and as function set, F = {+, -, *, SRF}. Terminal J represents the input value. Function SRF is a "sequence referencing function": (SRF K D) returns either the Fibonacci number for K or a default value D. The strategy used is that the Fibonacci numbers are calculated for K = 0 ... 20 in ascending order and each result is stored in a table. Thus, SRF has access to the Fibonacci numbers for all K which were already calculated, returning the Fibonacci number for K ≤ J - 1 and default value D otherwise. For a population size of 2000 programs, in generation 22 the program given in table 6.6 emerged. A re-implementation in Lisp together with a simplified version is given in appendix C.1. Function SRF realizes recursion in an indirect way: each new element in the sequence is defined recursively in terms of one or more previous elements.

Table 6.6. Calculating the Fibonacci Sequence

(+ (SRF (- J 2) 0)
   (SRF (+ (+ (- J 2) 0) (SRF (- J J) 0))
        (SRF (SRF 3 1) 1)))
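The evolved program can be re-expressed in Python (our own sketch, not the Lisp re-implementation from the appendix; we assume the tabulated sequence starts 1, 1, 2, ...): SRF(K, D) looks up the already-computed value for index K, with only indices below the current J available, and returns the default D otherwise.

```python
def fib_sequence(n):
    table = []                            # values for indices 0 .. J-1

    def SRF(k, d):
        # Sequence referencing function: tabulated value or default.
        return table[k] if 0 <= k < len(table) else d

    for J in range(n):
        # Evolved program from table 6.6:
        # (+ (SRF (- J 2) 0)
        #    (SRF (+ (+ (- J 2) 0) (SRF (- J J) 0))
        #         (SRF (SRF 3 1) 1)))
        value = (SRF(J - 2, 0)
                 + SRF((J - 2 + 0) + SRF(J - J, 0),
                       SRF(SRF(3, 1), 1)))
        table.append(value)
    return table

print(fib_sequence(10))   # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```

For J ≥ 2 the convoluted second summand reduces to a lookup of table[J-1] (since SRF(0, 0) yields table[0] = 1), so the expression is just table[J-2] + table[J-1] in disguise, while the defaults make the two base cases come out as 1.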

Other treatments of recursion in genetic programming can be found, for example, in Olsson (1995) and Yu and Clark (1998).

6.3.2.3 Evaluation of Genetic Programming. Genetic programming is a model-driven approach to learning (see sect. 6.3.1.1); that is, it starts with a set of programs (hypotheses), which are checked against the data and modified accordingly. Because the method relies on (nearly) blind search, the time effort for coming up with a program solving the desired task is very high. Program construction by generate-and-test has next to nothing in common with the way in which a human programmer develops program code, which (hopefully) is guided by the given (complete or incomplete) specification.

Furthermore, effort and success of program construction are heavily dependent on the given representation of the problem and on the predefined set of functions and terminals. For the block-stacking example given above, knowledge about which information in a constellation of blocks is relevant for choosing an action was explicitly coded in the sensors. In chapter 8, we will show that our own approach makes it possible to infer which information (predicates) is relevant for solving a problem.

6.3.3 Inductive Logic Programming

Inductive logic programming (ILP) was so named by Muggleton (1991) because this area of research combines inductive learning and logic programming. Most work done in this area concerns concept learning, that is, induction of non-recursive classification rules (see sect. 6.3.1.1). In principle, all ILP approaches can be applied to learning recursive clauses if it is allowed that the name of the relation to be learned can be used when constructing the body which characterizes that relation. But without special heuristics, neither termination of the learning process nor termination of the learned clauses can be guaranteed. In the following, we present basic concepts of ILP, three kinds of learning mechanisms, and biases used in ILP. Finally, we mention some ILP systems which address learning of recursive clauses. An introduction to ILP is given, for example, by Lavrač and Džeroski (1994) and Muggleton and De Raedt (1994); an overview of program synthesis and ILP is given by Flener (Flener, 1995; Flener & Yilmaz, 1999).

In the following, we give illustrations of the basic concepts and learning algorithms using a simple classification problem presented, for example, in Lavrač and Džeroski (1994). The concept to be learned is the relation daughter(X,Y) (see tab. 6.7).

Table 6.7. Learning the daughter Relation

Training Examples:
E+ = {e_1 = daughter(mary, ann), e_2 = daughter(eve, tom)}
E− = {daughter(tom, ann), daughter(eve, ann)}

Background Knowledge:
B = {parent(ann, mary), parent(ann, tom), parent(tom, eve), parent(tom, ian),
female(ann), female(mary), female(eve)}

To be learned clause (learning methods described below):
daughter(X,Y) ← female(X), parent(Y,X)

6.3.3.1 Basic Concepts. In ILP, examples and hypotheses are represented
in subsets of first-order logic such as definite Horn clauses with some
additional restrictions (see discussion of biases below). Definite Horn clauses
have the form T ∨ ¬L_1 ∨ … ∨ ¬L_n, or in Prolog notation T ← L_1, …, L_n. The
positive literal T is called the head of the clause and represents the relation
to be learned. Often, a clause is represented as a set of literals.

Examples typically can be divided into positive and negative examples.
In addition to the examples, background knowledge can be provided (see
sect. 6.3.1.1, table 6.2). The goal of induction is to obtain a hypothesis which,
together with the background knowledge, is complete and consistent with
respect to the examples:

Definition 6.3.6 (Complete and Consistent Hypotheses). Given a set
of training examples E = E+ ∪ E−, with E+ as positive examples and E− as
negative examples, together with background knowledge B and a hypothesis
H, let covers(B ∪ H, E) be a function which returns all elements of E which
follow from B ∪ H. The following conditions must hold:

144 6. Automatic Programming

Completeness: covers(B ∪ H, E+) = E+.
Each positive example in E+ is covered by hypothesis H together with
the background knowledge B. In other words, the positive examples follow
from hypothesis and background knowledge (B ∧ H ⊨ E+).

Consistency: covers(B ∪ H, E−) = ∅.
No negative example in E− is covered by hypothesis H together with the
background knowledge B. In other words, no negative example follows
from hypothesis and background knowledge (B ∧ H ∧ E− ⊭ □).

In logic programming (Sterling & Shapiro, 1986), SLD-resolution can be
used to check for each example in E whether it is entailed by the hypothesis
together with the background knowledge. An additional requirement is that
the positive examples do not already follow from the background knowledge
alone; in this case, learning (construction of a hypothesis) is not necessary
("prior necessity", Muggleton & De Raedt, 1994).

All possible hypotheses, that is, all definite Horn clauses with the relation
to be learned as head which can be constructed from the examples and the
background knowledge, constitute the so-called hypothesis space or search
space for induction. For a systematic search of the hypothesis space, it is
useful to introduce a partial order over clauses, based on θ-subsumption
(Plotkin, 1969):

Definition 6.3.7 (θ-Subsumption). Let c and c′ be two program clauses.
Clause c θ-subsumes clause c′ if there exists a substitution θ such that
cθ ⊆ c′. Two clauses c and d are θ-subsumption equivalent if c θ-subsumes d
and if d θ-subsumes c. A clause c is reduced if it is not θ-subsumption
equivalent to any proper subset of itself.

An example is given in figure 6.5.

c = daughter(X,Y) ← parent(Y,X), parent(W,V)
d = daughter(X,Y) ← parent(Y,X)

c and d are θ-subsumption equivalent:
cθ ⊆ d with θ = {W/Y, V/X}, and d ⊆ c.
c is not reduced (d is a proper subset of c).
d is reduced.

Fig. 6.5. θ-Subsumption Equivalence and Reduced Clauses
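The θ-subsumption test of definition 6.3.7 has a direct operational reading. The following sketch (our illustration, not part of the original text) represents a clause as a Python set of literal tuples and searches by brute force for a substitution θ with cθ ⊆ c′; the clause names follow figure 6.5, and the convention that uppercase arguments are variables is our own.

```python
from itertools import product

def clause_vars(clause):
    # by convention, arguments starting with an uppercase letter are variables
    return {a for lit in clause for a in lit[1:] if a[0].isupper()}

def apply_sub(clause, sub):
    # apply substitution sub to every literal of the clause
    return {(lit[0],) + tuple(sub.get(a, a) for a in lit[1:]) for lit in clause}

def theta_subsumes(c, d):
    """True iff some substitution theta makes c*theta a subset of d.
    Brute force over all mappings from vars(c) to terms of d (plus the
    variables themselves), hence exponential -- for illustration only."""
    vs = sorted(clause_vars(c))
    terms = sorted({a for lit in d for a in lit[1:]} | set(vs))
    return any(apply_sub(c, dict(zip(vs, image))) <= d
               for image in product(terms, repeat=len(vs)))

# the clauses of figure 6.5, written as sets of literals
c = {("daughter", "X", "Y"), ("parent", "Y", "X"), ("parent", "W", "V")}
d = {("daughter", "X", "Y"), ("parent", "Y", "X")}
```

With θ = {W/Y, V/X} the clause c θ-subsumes d, and d θ-subsumes c via the empty substitution, so `theta_subsumes` confirms that the two are θ-subsumption equivalent.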

Based on θ-subsumption, we can introduce a syntactic notion of generality:
If cθ ⊆ c′, then c is at least as general as c′, written c ≤ c′. If cθ ⊆ c′ holds
and c′θ ⊆ c does not hold, c is a generalization of c′ and c′ is a specialization
(refinement) of c. The relation ≤ introduces a lattice on the set of reduced
clauses. That is, any two clauses have a least upper bound and a greatest
lower bound which are unique except for variable renaming (see figure 6.6).


Based on θ-subsumption, the least general generalization of two clauses
can be defined (Plotkin, 1969)⁹:

Definition 6.3.8 (Least General Generalization). The least general
generalization (lgg) of two reduced clauses c and c′, written lgg(c, c′), is the
least upper bound of c and c′ in the θ-subsumption lattice.

Note that calculating the resolvent of two clauses is the inverse process
to calculating an lgg: When calculating an lgg, two clauses are generalized
by keeping their θ-subsumption equivalent common subset of literals. When
calculating the resolvent, two clauses are integrated into one, more special
clause where variables occurring in the given clauses might be replaced by
terms (see the description of resolution in sect. 2.2.2 in chap. 2 and above in
this chapter). With θ-subsumption as the central concept of ILP we have
the basis for characterizing ILP learning techniques as either bottom-up
construction of least general generalizations or top-down search for refinements
(see fig. 6.6).

Fig. 6.6. θ-Subsumption Lattice: clauses c_1 and c_2 are connected by
θ-substitutions to their least general generalization c = lgg(c_1, c_2) above them
and to their resolvent below them; bottom-up generalization moves upward in
the lattice, top-down refinement moves downward toward specializations c′.

6.3.3.2 Basic Learning Techniques. In the following we present two
techniques for bottom-up generalization (calculating relative lggs and inverse
resolution) and one technique for top-down search in a refinement graph.

Relative Least General Generalization. Relative least general generalization
is an extension of lggs with respect to background knowledge, for example
used in Golem (Muggleton & Feng, 1990).

Definition 6.3.9 (Relative Least General Generalization). The relative
least general generalization (rlgg) of two clauses c_1 and c_2 is their
least general generalization lgg(c_1, c_2) relative to the background knowledge
B: rlgg(c_1, c_2) = lgg(c_1 ← B, c_2 ← B).

⁹ We will come back to this concept applied to terms (instead of logical formulae)
under the name of first-order anti-unification in chapter 7 and in part III.


In the ILP literature, enriching clauses with background knowledge is
discussed under the name of saturation (Rouveirol, 1991).

An algorithm for calculating lggs for clauses is, for example, given in
Plotkin (1969). In table 6.8 we demonstrate calculating rlggs with the daughter
example (tab. 6.7). The rlgg of the two positive examples is calculated
by determining the lgg between Horn clauses where the examples are
consequences of the background knowledge.

Table 6.8. Calculating an rlgg

rlgg(daughter(mary,ann), daughter(eve,tom)) =
lgg(daughter(mary,ann) ← B, daughter(eve,tom) ← B) =
daughter(X,Y) ← female(X), parent(Y,X)
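The core of the lgg computation on atoms is first-order anti-unification, the term-level concept that footnote 9 points ahead to. A minimal sketch of our own (terms as nested tuples, constants as plain strings; the variable-naming scheme V0, V1, … is our choice): each distinct pair of disagreeing subterms is mapped to one shared variable, which is exactly what makes the result least general.

```python
def lgg_term(s, t, table=None):
    """Anti-unify two terms; terms are ('functor', arg1, ...) tuples,
    constants plain strings. 'table' memoizes disagreement pairs so that
    the same pair of subterms is always generalized to the same variable."""
    if table is None:
        table = {}
    if s == t:
        return s
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        # same functor and arity: anti-unify argument-wise
        return (s[0],) + tuple(lgg_term(a, b, table)
                               for a, b in zip(s[1:], t[1:]))
    if (s, t) not in table:
        table[(s, t)] = "V%d" % len(table)
    return table[(s, t)]
```

For the two positive examples of table 6.7 this yields lgg(daughter(mary, ann), daughter(eve, tom)) = daughter(V0, V1); the rlgg of table 6.8 additionally carries the background knowledge in the clause bodies.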

Inverse Resolution. Inverse resolution is a generalization technique based on
inverting the resolution rule of deductive inference. Inverse resolution requires
inverse substitution:

Definition 6.3.10 (Inverse Substitution). For a logical formula W, an
inverse substitution θ⁻¹ of a substitution θ is a function that maps terms in
Wθ to variables, such that Wθθ⁻¹ = W.

Again, we do not present the algorithm, but illustrate inverse resolution
with the daughter example introduced in table 6.7 (see fig. 6.7). We
restrict the positive examples to e_1 = daughter(mary, ann) and the background
knowledge to b_1 = female(mary) and b_2 = parent(ann, mary). Working with
bottom-up generalization, the initial hypothesis is given by the first positive
example. The learning process starts, for example, with clauses e_1 and b_2.
Inverse resolution attempts to find a clause c_1 such that c_1 together with b_2
entails e_1. If such a clause can be found, it becomes the current hypothesis,
and a further clause from the set of positive examples or the background
knowledge is considered. The current hypothesis is then inversely resolved
with this new clause, and so on. The inverse resolution operator used in the
example is called absorption or V operator.

In general, inverse resolution involves backtracking. In systems working
with inverse resolution, it is sometimes necessary to introduce new predicates
into the hypothesis for constructing a complete and consistent hypothesis. The
operator realizing such a necessary predicate invention is called the W operator
(Muggleton, 1994). Inverse resolution is for example used in Marvin (Sammut
& Banerji, 1986).

Search in the Refinement Graph. As an alternative to bottom-up generalization,
hypotheses can be constructed by top-down refinement. The step from a
general to a more specific hypothesis is realized by a so-called refinement
operator:


e_1 = daughter(mary, ann)    b_2 = parent(ann, mary)    b_1 = female(mary)

c_1 = daughter(mary,Y) ← parent(Y,mary)
c_2 = daughter(X,Y) ← female(X), parent(Y,X)

Absorption of b_2 into e_1 yields c_1 via the inverse substitution θ_2⁻¹ = {ann/Y};
absorption of b_1 into c_1 yields c_2 via θ_1⁻¹ = {mary/X}.

Fig. 6.7. An Inverse Linear Derivation Tree (Lavrač and Džeroski, 1994, p. 46)

Definition 6.3.11 (Refinement Operator). Given a hypothesis language
L (e.g., definite Horn clauses), a refinement operator ρ maps a clause c to
a set of clauses ρ(c) = {c′ | c′ ∈ L, c < c′}.

Typically, a refinement operator computes so-called "most general
specializations" under θ-subsumption by performing one of the following syntactic
operations:

– apply a substitution to c,
– add a literal to the body of c.

Learning as search in refinement graphs was first introduced by Shapiro
(1983) in the interactive system MIS (Model Inference System). An outline
of the MIS algorithm is given in table 6.9.

Table 6.9. Simplified MIS-Algorithm (Lavrač and Džeroski, 1994, p. 54)

Initialize hypothesis H to a (possibly empty) set of clauses in L.
REPEAT
  Read next (positive or negative) example.
  REPEAT
    IF there exists a covered negative example e
      THEN delete incorrect clauses from H.
    IF there exists a positive example e not covered by H
      THEN with breadth-first search of the refinement graph, develop a
      clause c which covers e and add it to H.
  UNTIL H is complete and consistent.
  Output: Hypothesis H.
FOREVER.
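The inner breadth-first search of table 6.9 can be made concrete for the daughter example. The sketch below is our illustration, not MIS itself: it restricts the refinement operator to adding body literals over the head variables X and Y (no new variables, no recursion, mirroring the restricted language used in the example) and evaluates coverage directly against the ground facts of table 6.7.

```python
FACTS = {("parent", "ann", "mary"), ("parent", "ann", "tom"),
         ("parent", "tom", "eve"), ("parent", "tom", "ian"),
         ("female", "ann"), ("female", "mary"), ("female", "eve")}
POS = [("mary", "ann"), ("eve", "tom")]   # daughter(X,Y) holds
NEG = [("tom", "ann"), ("eve", "ann")]    # daughter(X,Y) does not hold

# candidate body literals over the head variables only
LITERALS = [("female", ("X",)), ("female", ("Y",))] + \
           [("parent", (a, b)) for a in "XY" for b in "XY"]

def covers(body, x, y):
    """Does daughter(X,Y) <- body cover the ground atom daughter(x,y)?"""
    sub = {"X": x, "Y": y}
    return all(((p,) + tuple(sub[v] for v in args)) in FACTS
               for p, args in body)

def refine(body):
    """Most general specializations: add one literal to the body."""
    return [body + [lit] for lit in LITERALS if lit not in body]

def bfs():
    """Breadth-first search of the refinement graph, starting from the
    most general clause daughter(X,Y) <- true."""
    frontier = [[]]
    while frontier:
        nxt = []
        for body in frontier:
            if not all(covers(body, *e) for e in POS):
                continue          # incomplete; adding literals only shrinks coverage
            if not any(covers(body, *e) for e in NEG):
                return body       # complete and consistent
            nxt.extend(refine(body))
        frontier = nxt
```

`bfs()` returns `[('female', ('X',)), ('parent', ('Y', 'X'))]`, i.e. the target clause daughter(X,Y) ← female(X), parent(Y,X).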

Again, we illustrate the algorithm with the daughter example (tab. 6.7).
A part of the breadth-first search tree (see chap. 2) is given in figure 6.8.
The root node of a refinement graph is formally the most general clause,


that is, false or the empty clause. Typically, a refinement algorithm starts
with the most general definition of the goal relation; for our example that is
H = c = daughter(X,Y) ← true. This hypothesis covers both positive training
examples, that is, it need not be modified if these examples are presented
initially. If a negative example (e.g., e_3 = daughter(tom, ann)) is presented,
the hypothesis needs to be modified because c covers the negative example,
too. Therefore, a clause covering the positive examples but not covering e_3
must be constructed. The set of all possible (minimal) specializations of c is
calculated as ρ(c) = {daughter(X,Y) ← L}. The newly introduced literal L is
either a literal which has the variables used in the clausal head as arguments
or a literal introducing a new variable. The predicate names of the literals
can be obtained from the background knowledge; additional built-in predicates
(such as equality or inequality) can be used.

Already for this small example, the search space becomes rather large: L
can be X = Y, or female(X), or female(Y), or parent(X,X), or parent(X,Y),
or parent(Y,X), or parent(Y,Y), or parent(X,Z), or parent(Z,X), or
parent(Y,Z), or parent(Z,Y), with Z ≠ X and Z ≠ Y.

Note that in this example the language is restricted such that literals with
the goal predicate daughter are not considered in the body of the clause and
therefore no recursive clauses are generated.

The refinement c′ = daughter(X,Y) ← female(X) covers the positive
examples and does not cover e_3. Therefore, it becomes the new hypothesis.
If another negative example, such as e_4 = daughter(eve, ann), is presented,
the hypothesis again must be modified, because c′ covers this example, and
so on.

daughter(X,Y) ←
  daughter(X,Y) ← X = Y
  daughter(X,Y) ← female(X)
  daughter(X,Y) ← parent(Y,X)
  ...
    daughter(X,Y) ← female(X), female(Y)
    daughter(X,Y) ← female(X), parent(Y,X)
    ...

Fig. 6.8. Part of a Refinement Graph (Lavrač and Džeroski, 1994, p. 56)

A current system based on top-down refinement is Foil (Quinlan, 1990).

6.3.3.3 Declarative Biases. The most general setting for learning
hypotheses in an ILP framework would be to allow the hypothesis to be
represented as an arbitrary set of definite Horn clauses, that is, any legal
Prolog program. Because ILP techniques are heavily dependent on search
in the hypothesis space, typically much more restricted hypothesis languages
are considered. In the following, we present such language restrictions, also
called syntactic biases (see sect. 6.3.1.1). Additionally, semantic biases can be
introduced to restrict the relations to be learned.

Syntactic Biases. One possibility to restrict the search space is to provide
second-order schemes which represent the form of the searched-for hypotheses
(Flener, 1995). A second-order scheme is a clause with existentially quantified
predicate variables. An example scheme and an instantiation of such a scheme
are (Muggleton & De Raedt, 1994, p. 656):

Scheme: ∃p, q, r : p(X,Y) ← q(X,XW), q(YW,Y), r(XW,YW).
Instantiation: connected(X,Y) ← partof(X,XW), partof(Y,YW), touches(XW,YW).

A typical syntactic bias is to restrict search to linked clauses:

Definition 6.3.12 (Linked Clause). A clause is linked if all of its variables
are linked. A variable V is linked in a clause c iff V occurs in the head
of c, or there exists a literal l in c that contains variables V and W, W ≠ V,
and W is linked in c.

The scheme given above represents a linked clause.

Many ILP systems provide a number of parameters which can be set
by the user. These parameters represent biases, and by assigning them
different values, so-called bias shifts can be realized. Two examples of such
parameters are the depth of terms and the level of terms.

Definition 6.3.13 (Depth of a Term). The depth d(V) of a variable is
0. The depth d(c) of a constant is 1. The depth d(f(t_1, …, t_n)) of a term is
1 + max({d(t_i)}).

Definition 6.3.14 (Level of a Term). The level l(t) of a term t in a
linked clause c is 0 if t occurs as an argument in the head of c, and
1 + min({l(s)}) where s and t occur as arguments in the same literal of c.

In the scheme given above, X and Y have level 0 and XW and YW
have level 1. For linked languages with maximal depth 1 and level > 1, the
number of literals in the body of a clause can grow exponentially with its
level (Muggleton & De Raedt, 1994).

Semantic Biases. A typical semantic bias is to provide information about
the modes and types of a clause to be learned. This information can be
provided by specifying declarations of predicates in the background knowledge.
An example of specifying modes and types for append is given in figure 6.9:
If the first two arguments of append are instantiated (indicated by +list), the
predicate succeeds once (1); if the last argument is instantiated, it can
succeed finitely many times (*). The predicate is restricted to lists of integers. It
is obvious that such a semantic bias can restrict the effort of search considerably.

mode(1, append(+list,+list,-list))
mode(*, append(-list,-list,+list))
list(nil).
list([X|T]) ← integer(X), list(T).

Fig. 6.9. Specifying Modes and Types for Predicates

Another semantic bias, for example used in Foil (Quinlan, 1990) and
Golem (Muggleton & Feng, 1990), is the notion of determinate clauses.

Definition 6.3.15 (Determinate Clauses). A definite clause h ← l_1, …, l_n
is determinate (with respect to background knowledge B and examples E)
iff for every substitution θ for h that unifies h with a ground instance e ∈ E,
and for all i = 1, …, n, there is a unique substitution θ_i such that
(l_1 ∧ … ∧ l_i)θθ_i is ground and is true for B ∧ E+.

A clause is determinate if all its literals are determinate, and a literal l_i is
determinate if each of its variables which do not appear in preceding literals
l_j, j = 1, …, i−1, has only one possible instantiation given the instantiations
in {l_j | j = 1, …, i−1}. For the daughter example given in table 6.7, the
clause daughter(X,Y) ← female(X), parent(Y,X) is determinate: If X is
instantiated with mary, then Y can only be instantiated with ann.
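Determinacy in the sense of definition 6.3.15 can be checked operationally against the ground facts of table 6.7. The following sketch is our illustration (not an ILP system's actual code): for one positive example it walks through the body left to right and verifies that each literal has exactly one instantiation extending the bindings accumulated so far.

```python
FACTS = {("parent", "ann", "mary"), ("parent", "ann", "tom"),
         ("parent", "tom", "eve"), ("parent", "tom", "ian"),
         ("female", "ann"), ("female", "mary"), ("female", "eve")}

def extensions(lit, sub):
    """All substitutions extending sub that turn literal lit into a true ground fact."""
    pred, args = lit
    result = []
    for fact in FACTS:
        if fact[0] != pred or len(fact) - 1 != len(args):
            continue
        s = dict(sub)
        # bind each argument variable to the corresponding constant of the fact
        if all(s.setdefault(v, t) == t for v, t in zip(args, fact[1:])):
            result.append(s)
    return result

def determinate_for(body, example):
    """Check determinacy of daughter(X,Y) <- body for one positive example."""
    subs = [dict(zip(("X", "Y"), example))]
    for lit in body:
        nxt = []
        for s in subs:
            ext = extensions(lit, s)
            if len(ext) != 1:      # zero or several possible instantiations
                return False
            nxt += ext
        subs = nxt
    return True
```

For the example from the text, `determinate_for([('female', ('X',)), ('parent', ('Y', 'X'))], ('mary', 'ann'))` is True, while a body consisting only of parent(X,Z) checked with X bound to ann is not determinate, since Z can be bound to both mary and tom.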

Restriction to determinate clauses can reduce the exponential growth of
literals in the body of a hypothesis during search. A special case of
determinacy is the so-called ij-determinacy. Parameter i represents the maximum
depth of variables (def. 6.3.13), parameter j represents the maximal degree of
dependency (def. 6.3.14) in determinate clauses (Muggleton & Feng, 1990).

Definition 6.3.16 (ij-Determinacy of Clauses). A clause consisting of
a head only is 0j-determinate. A clause h ← l_1, …, l_m, l_{m+1}, …, l_n is
ij-determinate iff

– h ← l_1, …, l_m is (i−1)j-determinate, and
– every literal in l_{m+1}, …, l_n contains only determinate terms and has a
degree of at most j.

The clause daughter(X,Y) ← female(X), parent(Y,X) is 11-determinate
because all variables in the body appear in the head (i = 1) and the
maximal degree is one (j = 1) for variable Y in parent(Y,X), which depends
on X.

It can be shown that the restriction to ij-determinate clauses allows for
polynomial effort of learning (Muggleton & Feng, 1990).


6.3.3.4 Learning Recursive Prolog Programs. In principle, all ILP
systems can be extended to learning recursive clauses, that is, they can be seen
as program synthesis systems, if the language bias is relaxed such that the
goal predicate is allowed to be used when constructing the body of a hypothesis.
For the systems MIS (Shapiro, 1983), Foil (Quinlan, 1990), and Golem
(Muggleton & Feng, 1990), for example, application to program synthesis was
demonstrated. In the following, we first show how learning recursive clauses
is realized with Golem and afterwards introduce some ILP approaches which
were proposed explicitly for program synthesis.

Learning reverse with Golem. The system Golem (Muggleton & Feng, 1990)
works with bottom-up generalization, calculating relative least general
generalizations (see def. 6.3.9). We give an example of how reverse(X,Y) can be
induced with Golem. In table 6.10 the necessary background knowledge and
the positive and negative examples are presented.

Table 6.10. Background Knowledge and Examples for Learning reverse(X,Y)

%% positive examples
rev([],[]). rev([1],[1]). rev([2],[2]). rev([3],[3]). rev([4],[4]).
rev([1,2],[2,1]). rev([1,3],[3,1]). rev([1,4],[4,1]). rev([2,2],[2,2]).
rev([2,3],[3,2]). rev([2,4],[4,2]). rev([0,1,2],[2,1,0]).
rev([1,2,3],[3,2,1]).

%% negative examples
rev([1],[]). rev([0,1],[0,1]). rev([0,1,2],[2,0,1]).
app([1],[0],[0,1]).

%% background knowledge
!- mode(rev(+,-)).
!- mode(app(+,+,-)).
rev([],[]). ... rev([1,2,3],[3,2,1]). %% all positive examples
app([],[],[]). app([1],[],[1]). app([2],[],[2]). app([3],[],[3]).
app([4],[],[4]). app([],[1],[1]). app([],[2],[2]). app([],[3],[3]).
app([],[4],[4]). app([1],[0],[1,0]). app([2],[1],[2,1]).
app([3],[1],[3,1]). app([4],[1],[4,1]). app([2],[2],[2,2]).
app([3],[2],[3,2]). app([4],[2],[4,2]). app([2,1],[0],[2,1,0]).
app([3,2],[1],[3,2,1]).

As described above, an rlgg is calculated between pairs of clauses of the
form e ← B where e is a positive example and B is the conjunction of
clauses from the background knowledge. The "trick" used for learning
recursive clauses with an rlgg technique is that the positive examples are
additionally made part of the background knowledge. In this way, the name of the
goal predicate can be introduced in the body of the hypothesis. The notion
of ij-determinacy (see def. 6.3.16) is used to avoid exponential size explosion
of rlggs.

For the first examples rev([],[]) and rev([1],[1]) the rlgg is:

rev(X,X) ← rev([X],[X]), rev([2],[2]), …,
  rev([X,2],[2,X]), …, rev([X,X,2],[2,X,X]),
  app(X,X,X), app([X],[],[X]),
  app([2],[],[2]), …, app([3,2],[X],[3,2,X]).

This hypothesis has a head which is still too special and a body which
contains far too many literals. The head will become more general in the
learning process when further positive examples are introduced. Literals in
the body can be reduced by eliminating literals which cover negative examples.
In general, this can be a very time-consuming task because all subsets
of the literals in the body must be checked. In Muggleton and Feng (1990) a
technique working in O(n²) is described.

For the above example, there remains:

rev(X,X) ← app(X,X,Y).

The finally resulting program is:

rev([A|B],[C|D]) ← rev(B,E),
  app(E,[A],[C|D]).

A trace generated by Golem for this example is given in appendix C.2.

Note that no base case is learned, that is, interpretation of rev would
not terminate. Furthermore, the sequence of literals in a clause depends on
the sequence of literals in the background knowledge. If, in our example,
the append literals were presented before the rev literals, the resulting
program would have a body where app (append) appears before rev. Using
a top-down left-to-right strategy, such as Prolog, interpretation of the clause
would fail.

Instead of presenting the append predicate necessary for inducing reverse
by ground instances, a predefined append function can be presented as
background knowledge. But this can make induction harder, because for
calculating an rlgg the background knowledge must be ground. That is, a recursive
definition in the background knowledge must be instantiated and unfolded to
some fixed depth. Because there might be many possible instantiations and
because the "necessary" depth of unfolding is unknown, search effort can
easily explode. Without a predefined restriction of unfolding depth, termination
cannot be guaranteed.

Special Purpose ILP Synthesis Systems. ILP systems explicitly designed for
learning recursive clauses are, for example, Tim (Idestam-Almquist, 1995),
Synapse (Flener, Popelinsky, & Stepankova, 1994), and Merlin (Boström,
1996). The hypothesis language of Tim consists of tail-recursive clauses
together with base clauses. Learning is performed again by calculating rlggs.
Synapse synthesizes programs which conform to a divide-and-conquer scheme
and it incorporates predicate invention. Merlin learns tail-recursive clauses
based on a grammar inference technique (see sect. 6.3.1.2). For a given set
of positive and negative examples, a deterministic finite automaton is
constructed which accepts all positive and no negative examples. This can, for
example, be realized using the ID algorithm of Angluin (1981). The
automaton is then transformed into a set of Prolog rules. Merlin also uses predicate
invention.

6.3.4 Inductive Functional Programming

In contrast to the inductive approaches presented so far, the classical
functional approach to inductive program synthesis is based on a two-step process:

Rewriting I/O examples into constructive expressions: Example computations,
presented as input/output examples, are transformed into traces and
predicates. Each predicate characterizes the structure of one input example.
Each trace calculates the corresponding output for a given input example.

Recurrence detection: Regularities are searched for in the set of predicate/trace
pairs. These regularities are used for generating a recursive generalization.

The different approaches typically use a simple strategy for the first step
which is not discussed in detail. The critical difference between the functional
synthesis algorithms is how the second step is realized. Two pioneering
approaches are from Summers (1977) and Biermann (1978). The work of
Summers was extended by several researchers, for example by Jouannaud
and Kodratoff (1979), Wysotzki (1983), and Le Blanc (1994).

In the following, we first present Summers' approach in some detail,
because our own work is based on his approach. Afterwards we give a short
overview of Biermann's approach and finally we discuss some extensions of
Summers' work.

6.3.4.1 Summers' Recurrence Relation Detection Approach.

Basic Concepts. Input to the synthesis algorithm (called Thesys) is a set of
input/output examples E = {e_1, …, e_k} with e_j = i_j → o_j, j = 1, …, k. The
output to be constructed is a recursive Lisp function which generalizes over
E. The following Lisp primitives¹⁰ are used:

– Atom nil: representing the empty list or the truth-value false.
– Predicate atom(x): which is true if x is an atom.
– Constructor cons(x, l): which inserts an element x in front of l.
– Basic functions: selectors car(x) and cdr(x), which return the first
element of a list / a list without its first element; and compositions of
selectors, which can be written for short as c⟨x⟩⁺r, x ∈ {a, d}, for
instance car(cdr(x)) = cadr(x).
– McCarthy conditional cond(b_1, …, b_n): where the b_i are pairs of
predicates and operations p_i(x) → f_i(x).

¹⁰ For better readability, we write the operator names followed by comma-separated
arguments in parentheses, instead of the usual Lisp notation. That is, for the
Lisp expression (cons x l), we write cons(x, l).

Inputs and outputs of the searched-for recursive function are represented
as single s-expressions (simple expressions):

Definition 6.3.17 (S-Expression). Atom nil and basic functions are
s-expressions. cons(s1, s2) is an s-expression if s1 and s2 are s-expressions.

With Summers' method, Lisp functions with the following underlying
basic program scheme can be induced:

Definition 6.3.18 (Basic Program Scheme).

F(x) ← (p_1(x) → f_1(x),
  …,
  p_k(x) → f_k(x),
  T → C(F(b(x)), x))

where:

– p_1, …, p_k are predicates of the form atom(b_i(x)),
– b, b_i are basic functions,
– f_1, …, f_k are s-expressions,
– T is the truth-value "true", and C(w, x) is a cons-expression with w
occurring exactly once.

This basic program scheme allows for arbitrary linear recursion; more
complex forms, such as tree recursion, are out of reach of this method. For
illustration we give an example presented in Summers (1977) in figure 6.10.
For better readability, we represent inputs and outputs as lists and not as
s-expressions. In the following, we describe how induction of a recursive
function is realized by (1) generating traces, and (2) detecting regularities
(recurrence) in traces.

Input/Output Examples:
{nil → nil,
(A) → ((A)),
(A B) → ((A) (B)),
(A B C) → ((A) (B) (C))}

Recursive Generalization:
unpack(x) ← (atom(x) → nil,
  T → u(x))
u(x) ← (atom(cdr(x)) → cons(x, nil),
  T → cons(cons(car(x), nil), u(cdr(x))))

Fig. 6.10. Learning Function unpack from Examples
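For readers who prefer a modern notation, the induced function of figure 6.10 can be transliterated directly. This rendering is ours: Python lists play the role of Lisp lists and the empty list the role of nil.

```python
def unpack(x):
    # unpack(x) <- (atom(x) -> nil, T -> u(x))
    return [] if not x else u(x)

def u(x):
    # u(x) <- (atom(cdr(x)) -> cons(x, nil),
    #          T -> cons(cons(car(x), nil), u(cdr(x))))
    if not x[1:]:
        return [x]
    return [[x[0]]] + u(x[1:])
```

`unpack(['A', 'B', 'C'])` evaluates to `[['A'], ['B'], ['C']]`, matching the fourth input/output example.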


Generating Example Traces. In the first step of synthesis, the input/output
examples are transformed into a non-recursive program which covers exactly
the given examples. This transformation involves:

– constructing a trace f_j which calculates o_j for i_j for all examples e_j,
j = 1, …, k,
– determining a predicate which associates each input with the trace
calculating the desired output, and
– constructing a program with the following underlying scheme:

F(x) ← (p_1(x) → f_1(x),
  …,
  p_k(x) → f_k(x)).

A trace is a function which calculates an output for a given input. In
Summers' approach, such functions are always cons-expressions. For such
a cons-expression to exist, all atoms appearing in the output must also
appear in the input. That is, it is not possible to generate new symbols in
an output. For such a cons-expression to be uniquely determined, each atom
appearing in the output must appear exactly once in the input. The reason
for this restriction is, as we will see below, that the output is constructed
from the syntactical structure of the input and therefore the position of each
used atom must be unique.

The algorithm for constructing the traces is given in table 6.11.¹¹ An
example is given in figure 6.11.

Table 6.11. Constructing Traces from I/O Examples

Let SE be the set of expressions over basic functions b which describe
subexpressions sub(x) of an s-expression x, associated with that function, that is,
SE = {(b(x), sub(x))}.
Let x be an example input and y be the associated output.

ST(x, y) = nil, if y = nil
ST(x, y) = b(x), if y ≠ nil and (b(x), y) ∈ SE
ST(x, y) = cons(ST(x, car(y)), ST(x, cdr(y))), otherwise

In the next step, predicates must be determined which characterize the
example inputs in such a way that each input can be associated with its
correct trace: For the set of predicates it must hold that p_i(x_i), i = 1, …, k,
evaluate to T (true) and that the predicates p_j(x_i), 1 ≤ j < i, evaluate to nil.
Because the only predicate used is atom, the inputs are discriminated by
their structure only. For catching characteristics of the atoms themselves,
"semantic" predicates, such as equal or greater, would be needed.

¹¹ ST stands for "semi-trace". A semi-trace is a trace where the sequence of
evaluation of expressions is not fixed.


Sub-expressions of i = (A B) can be described by
SE = {(i, (A B)), (car(i), A), (cdr(i), (B)), (cadr(i), B), (cddr(i), nil)}

For the given examples, the traces constructed with ST are:

nil → nil: nil
(A) → ((A)): cons(x, nil)
(A B) → ((A) (B)): cons(cons(car(x), nil), cons(cdr(x), nil))
(A B C) → ((A) (B) (C)): cons(cons(car(x), nil), cons(cons(cadr(x), nil), cons(cddr(x), nil)))

Fig. 6.11. Traces for the unpack Example
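The construction of table 6.11 can be followed in code. In this sketch of ours, Python lists stand in for s-expressions (the empty list for nil), selector chains are printed in their long form car(cdr(x)) rather than the cadr shorthand, and, as the text requires, every atom of the input must occur exactly once so that matching subexpressions by value is unambiguous.

```python
def is_atom(x):
    # nil (the empty list) counts as an atom, like in Lisp
    return not isinstance(x, list) or x == []

def subexprs(x, expr="x"):
    """SE from table 6.11: pairs (b(x), sub(x)) of selector expressions
    and the subexpressions of input x they select."""
    pairs = [(expr, x)]
    if not is_atom(x):
        pairs += subexprs(x[0], "car(%s)" % expr)
        pairs += subexprs(x[1:], "cdr(%s)" % expr)
    return pairs

def st(x, y, se=None):
    """Semi-trace ST(x, y): a cons-expression over selectors of x that
    computes the output y."""
    if se is None:
        se = subexprs(x)
    if y == []:                       # nil is copied literally
        return "nil"
    for expr, sub in se:
        if sub == y:                  # y is a subexpression of x
            return expr
    return "cons(%s, %s)" % (st(x, y[0], se), st(x, y[1:], se))
```

`st(['A', 'B'], [['A'], ['B']])` yields `cons(cons(car(x), nil), cons(cdr(x), nil))`, the third trace of figure 6.11.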

The searched-for predicates have the form atom(b_i(x)) where b_i is a basic
function. The algorithm for determining the predicates must construct
basic functions such that, if x_i < x_{i+1}, b_i(x_i) is an atom but b_i(x_{i+1}) is not. To
decide whether an input is smaller than another input, we need an ordering
relation over possible inputs (that is, over s-expressions). In a first step, we
reduce the inputs to their structure by replacing all atoms by the identical
atom ω:

Definition 6.3.19 (Form of an S-Expression).

form(x) = ω, if atom(x) = T
form(x) = cons(form(car(x)), form(cdr(x))), otherwise.

An example is given in table 6.12.

Table 6.12. Calculating the Form of an S-Expression

List (A (B C) D) is represented by
s-expression (A:((B:(C:nil)):(D:nil))).
form((A:((B:(C:nil)):(D:nil)))) = (ω:((ω:(ω:ω)):(ω:ω)))

Now we can define an order relation over the set of forms of s-expressions:

Definition 6.3.20 (Complete Partial Order over S-Expressions). The
complete partial order ≤ over SF (forms of s-expressions) is recursively defined
as:

– ∀s ∈ SF : ω ≤ s
– ∀a, b, c, d ∈ SF : (a:b) ≤ (c:d) ⇔ (a ≤ c) ∧ (b ≤ d).

The complete partial order over s-expressions S is:
∀s, t ∈ S : s ≤ t ⇔ form(s) ≤ form(t).

Note that the order is partial because not every s-expression can be
compared with every other one. For example, it holds neither that (A (B C)) ≤
((A B) C) nor that ((A B) C) ≤ (A (B C)). For constructing predicates, it must
hold that the example inputs can be totally ordered. For the synthesized
function it is assumed that its hypothetical (infinite) input domain is totally
ordered, too.

The algorithm for constructing predicates is defined as follows:

Definition 6.3.21 (Constructing Predicates). PredGen(x_i, x_{i+1}) =
PG(x_i, x_{i+1}, I), where I is the identity function, and

PG(x, y, c) = ∅, if atom(y)
PG(x, y, c) = {c}, if atom(x) and ¬atom(y)
PG(x, y, c) = PG(car(x), car(y), car ∘ c) ∪ PG(cdr(x), cdr(y), cdr ∘ c), otherwise.

For example, PredGen((A B), (A B C)) returns {cddr} and the resulting
predicate is atom(cddr(x)).
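Definition 6.3.21 translates almost verbatim into code. This is again our sketch, with Python lists for s-expressions and selector chains built up as strings; the empty list counts as an atom, like nil.

```python
def is_atom(x):
    return not isinstance(x, list) or x == []

def pg(x, y, c="x"):
    """PG(x, y, c): selector chains b such that b applied to the smaller
    input x yields an atom while b applied to the larger input y does not."""
    if is_atom(y):
        return set()
    if is_atom(x):
        return {c}
    return pg(x[0], y[0], "car(%s)" % c) | pg(x[1:], y[1:], "cdr(%s)" % c)

def predgen(xi, xnext):
    # PredGen(x_i, x_{i+1}) = PG(x_i, x_{i+1}, I)
    return pg(xi, xnext)
```

`predgen(['A', 'B'], ['A', 'B', 'C'])` returns `{'cdr(cdr(x))'}`, i.e. the predicate atom(cddr(x)) used in figure 6.12.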

The resulting non-recursive function which covers the set of input examples
is given in figure 6.12.

F_L(x) ← (atom(x) → nil,
  atom(cdr(x)) → cons(x, nil),
  atom(cddr(x)) → cons(cons(car(x), nil), cons(cdr(x), nil)),
  T → cons(cons(car(x), nil), cons(cons(cadr(x), nil), cons(cddr(x), nil))))

Fig. 6.12. Result of the First Synthesis Step for unpack

Recurrence Detection. The second step of synthesis is to generalize over the given set of examples. This can be done with a purely syntactical approach by pattern matching: the finite program composed from traces is checked for regularities between pairs of traces. This is done by determining differences between f_i(x) and f_{i+1}(x) and checking whether a recurrence relation f_{i+1} = C(f_i(b(x)), x) holds for all examples. The differences and the resulting recurrence relation for the unpack example are given in figure 6.13.

Summers (1977) presents no algorithm for calculating the regularities, except for mentioning that it is similar to unification algorithms. A detailed description of a method for recurrence detection in finite program terms is presented in chapter 7.

From a recurrence relation which holds for a set of examples for indices k = s, ..., f, where s and f are constants, it is inductively generalized that this relation holds for the complete domain k = s, ..., n with n → ∞. Summers presented a theoretical foundation for this generalization in the form of a set of synthesis theorems. In the following, we present the central aspects of Summers' theory.

Synthesis Theorems.


Differences:

f_2(x) = cons(x, f_1(x))
f_3(x) = cons(cons(car(x), nil), f_2(cdr(x)))
f_4(x) = cons(cons(car(x), nil), f_3(cdr(x)))
p_2(x) = p_1(cdr(x))
p_3(x) = p_2(cdr(x))
p_4(x) = p_3(cdr(x))

Recurrence Relation:

f_1(x) = nil
f_2(x) = cons(x, f_1(x))
f_{k+1}(x) = cons(cons(car(x), nil), f_k(cdr(x))) for k = 2, 3
p_1(x) = atom(x)
p_{k+1}(x) = p_k(cdr(x)) for k = 1, 2, 3

Fig. 6.13. Recurrence Relation for unpack
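Folding this recurrence relation (with j = 2, n = 1) yields the familiar recursive unpack. The following Python sketch, with lists encoded as nested pairs, is our own illustration of the folded program:

```python
# Sketch of the recursive unpack obtained by folding the recurrence of
# figure 6.13; lists as nested pairs: (A B C) = ("A", ("B", ("C", "nil"))).
NIL = "nil"

def atom(x):
    return not isinstance(x, tuple)

def cons(a, b):
    return (a, b)

def car(x): return x[0]
def cdr(x): return x[1]

def unpack(x):
    if atom(x):                      # p_1 -> f_1
        return NIL
    if atom(cdr(x)):                 # p_2 -> f_2 = cons(x, nil)
        return cons(x, NIL)
    return cons(cons(car(x), NIL), unpack(cdr(x)))   # recursive case
```

Applied to (A B C) this produces ((A) (B) (C)), i.e., each element wrapped in a singleton list.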

Definition 6.3.22 (Approximation of a Function). The k-th approximation F_k(x) of a function F(x) is defined as

F_k(x) ≡ (p_1(x) → f_1(x);
          ...
          p_{k−1}(x) → f_{k−1}(x);
          T → ω)

where ω is undefined.

That is, the finite function F(x) constructed over a set of examples, which is undefined for all inputs with p_{k−1}(x) = nil for all k > f, where f is a constant, can be seen as the k-th approximation of the function F(x) to be induced.

Definition 6.3.23 (Recurrence Relation). If there exists an initial point j and an interval n such that 1 ≤ j < k in a k-th approximation F_k(x) defined by a set of examples such that the following fragment differences exist:

f_{j+n}(x) = C(f_j(b_1(x)), x),
f_{j+n+1}(x) = C(f_{j+1}(b_1(x)), x),
...
f_{k−1}(x) = C(f_{k−n−1}(b_1(x)), x),

and such that the following predicate differences exist:

p_{j+n}(x) = p_j(b_2(x)),
p_{j+n+1}(x) = p_{j+1}(b_2(x)),
...
p_{k−1}(x) = p_{k−n−1}(b_2(x)),

then we define the functional recurrence relation for the examples to be

f_1(x), ..., f_j(x), ..., f_{j+n−1}(x), f_{i+n}(x) = C(f_i(b_1(x)), x) for j ≤ i ≤ k − n − 1,

and we define the predicate recurrence relation for the examples to be

p_1(x), ..., p_j(x), ..., p_{j+n−1}(x), p_{i+n}(x) = p_i(b_2(x)) for j ≤ i ≤ k − n − 1.

The index j gives the first position in the set of ordered traces at which the recurrence relation holds. All traces with smaller indices are kept as explicit predicate/function pairs. For the unpack example, j = 2. The index n gives a fixed interval for the recurrence, covering the fact that, in general, recurrence can occur not only between immediately succeeding traces. For the unpack example, n = 1.

Definition 6.3.24 (Induction). If a functional and a predicate recurrence relation exist for a set of examples such that k − j ≥ 2n, then we inductively infer that these relationships hold for all i ≥ j.

The condition k − j ≥ 2n ensures that at least one regularity (between two traces) can be found in the examples. We will see in chapter 7 that, when considering more general recursive schemes, at least four traces from which regularities can be extracted are necessary.

Applying the recurrence relation induced from the examples, the set of examples can be inductively extended to further approximations of the searched-for function F. The m-th approximation, with m ≥ j, has the form:

F_m(x) ≡ (p_1(x) → f_1(x);
          ...
          p_k(x) → f_k(x);
          p_{k+1}(x) → f_{k+1}(x);
          ...
          p_m(x) → f_m(x);
          T → ω).

Lemma 6.3.1 (Order of Approximations). The set of approximating functions, if it exists, is a chain with partial order ⊑_F, where F(x) ⊑_F G(x) holds if, for all x for which F(x) is defined, F(x) = G(x).

Proof (Order of Approximations). We define D_m = {x | p_1(x)} ∪ ... ∪ {x | p_m(x)} to be the domain of the approximating function F_m(x). From this it is clear that F_i(x) ⊑_F F_{i+1}(x) for all i, since D_i is a subset of D_{i+1} (Summers, 1977, p. 167).

Now the searched-for function F(x), which was specified by examples, can be defined as the supremum of a chain of approximations:

Definition 6.3.25 (Supremum of a Chain of Approximations). If a set of examples defines the chain ⟨{F_m(x)}, ⊑_F⟩, the searched-for function F(x) defined by the examples is sup{F_m(x)}, the limit of F_m(x) for m → ∞.

The concepts introduced correspond to the concepts used in the fixpoint theory of semantics of recursive functions (see appendix B.1 and sect. 7.2 in chap. 7). Summers' basic synthesis theorem represents the converse of this problem, that is, to find a recursive program from a recurrence-relation characterization of a partial function. In other words, the synthesis problem is to induce a folding of a set of traces considered as unfoldings of some unknown recursive function.

Theorem 6.3.3 (Basic Synthesis). If a set of examples defines F(x) with recurrence relations f_1(x), ..., f_n(x), f_{i+n}(x) = C(f_i(b(x)), x) and p_1(x), ..., p_n(x), p_{i+n}(x) = p_i(b(x)) for i ≥ 1, then F(x) is equivalent to the following recursive program:

F(x) ≡ (p_1(x) → f_1(x);
        ...
        p_n(x) → f_n(x);
        T → C(F(b(x)), x)).

Proof: see Summers (1977, pp. 168).
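The program scheme of the theorem can be sketched as a higher-order function in Python (our own illustration, not part of Summers' presentation). The instantiation below, which copies a list encoded as nested pairs, assumes n = 1, b = cdr, and C(r, x) = cons(car(x), r):

```python
# Sketch of the scheme of Theorem 6.3.3: given base cases (p_i, f_i), a
# context C, and a basic function b, build
#   F(x) = (p_1(x) -> f_1(x); ...; p_n(x) -> f_n(x); T -> C(F(b(x)), x)).
def fold_scheme(cases, C, b):
    def F(x):
        for p, f in cases:
            if p(x):          # first applicable base case wins
                return f(x)
        return C(F(b(x)), x)  # otherwise recurse on b(x) in context C
    return F

def atom(x):
    return not isinstance(x, tuple)

# Instantiation: copy a list represented as nested pairs ("nil" = empty list).
copy_list = fold_scheme([(atom, lambda x: x)],
                        C=lambda r, x: (x[0], r),
                        b=lambda x: x[1])
```

Different choices of cases, C, and b yield the various functions synthesized in this section.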

This basic synthesis theorem provides the foundation for the synthesis of recursive programs from a finite set of examples, but it is based on constraints which allow only very restricted induction. The first restriction is that the recurrence relationships must hold for all traces, not allowing a finite set of special cases which can be dealt with separately. The second restriction is that the basic function b used in the predicate and functional recurrences must be identical. Summers (1977) presents extensions of the basic theorem which overcome these restrictions.

Summers points out that it would be desirable to give a formal characterization of the class of recursive functions which can be synthesized using this approach, but that such a formal characterization does not exist. In later work, Le Blanc (1994) tries to give such a characterization. In our own work, we give a structural characterization of the class of recursive programs which can be induced from finite programs (see chapter 7). A tight characterization of the class of (semantic) functions which can be synthesized from traces is still missing!

Variable Addition. Summers (1977) presents an extension of his approach for cases where predicate recurrence relations but no functional recurrence relations can be detected in the traces. This problem was later also discussed by Jouannaud and Kodratoff (1979) and Wysotzki (1983). Neither Summers nor the later work points out that this problem occurs exactly in those cases where the underlying recurrence relation corresponds to a simple loop, that is, a tail recursion. All authors propose solutions based on the introduction of additional variables. This is a standard method for transforming linear recursion into tail recursion in program optimization (Field & Harrison, 1988).

We illustrate this problem with the reverse function given in figure 6.14. For the initial traces, all differences have different forms, that is, no recurrence relation can be derived from the examples. The following heuristic can be used: search for a subexpression a which is identical in all traces, rewrite the fragments f_i(x) to g_i(x, a), abstract a to a new variable y, and try again to find recurrence relations.

For the reverse example, it holds that a = nil. After abstraction, two differences exist for each pair of traces. One of these differences is regular and can be transformed into a recurrence relation. For the g_i(x, y), the recursive function G(x, y) can be induced, and the resulting program is F(x) = G(x, a). Variable y plays the role of a collector in which the resulting output is constructed, starting with the "neutral element" with respect to cons, which is nil. In the terminating case, the current value of y is returned.

Examples:

{nil → nil,
 (A) → (A),
 (A B) → (B A),
 (A B C) → (C B A)}

Traces:

f_1(x) = nil
f_2(x) = cons(car(x), nil)
f_3(x) = cons(cadr(x), cons(car(x), nil))
f_4(x) = cons(caddr(x), cons(cadr(x), cons(car(x), nil)))

Differences:

f_2(x) = cons(car(x), f_1(x))
f_3(x) = cons(cadr(x), f_2(x))
f_4(x) = cons(caddr(x), f_3(x))

Variable Introduction:

g_1(x, y) = y
g_2(x, y) = cons(car(x), y)
g_3(x, y) = cons(cadr(x), cons(car(x), y))
g_4(x, y) = cons(caddr(x), cons(cadr(x), cons(car(x), y)))

Pairs of Differences: (ω = anything)

g_2(x, y) = {cons(car(x), g_1(x, y)), g_1(ω, cons(car(x), y))}
g_3(x, y) = {cons(cadr(x), g_2(x, y)), g_2(cdr(x), cons(car(x), y))}
g_4(x, y) = {cons(caddr(x), g_3(x, y)), g_3(cdr(x), cons(car(x), y))}

Recurrence Relation:

f_i(x) = g_i(x, nil), i ≥ 1
g_1(x, y) = y
g_{k+1}(x, y) = g_k(cdr(x), cons(car(x), y))

Recursive Program:

reverse(x) ≡ G(x, nil)
G(x, y) ≡ (atom(x) → y;
           T → G(cdr(x), cons(car(x), y)))

Fig. 6.14. Traces for the reverse Problem
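The resulting tail-recursive program can be sketched in Python as follows (our own illustration; lists as nested pairs, with the collector y as the introduced variable):

```python
# Sketch of the recursive program of figure 6.14: reverse via a collector.
NIL = "nil"

def atom(x):
    return not isinstance(x, tuple)

def g(x, y):
    if atom(x):                       # terminating case: return the collector
        return y
    return g(x[1], (x[0], y))         # G(cdr(x), cons(car(x), y))

def reverse_list(x):
    return g(x, NIL)                  # F(x) = G(x, nil)
```

Because the recursive call is the whole result, this is a tail recursion, i.e., a simple loop over the input list.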

6.3.4.2 Extensions of Summers' Approach. The most well-known extensions of Summers' approach were presented by Jouannaud, Kodratoff, and colleagues (Kodratoff & Fargues, 1978; Jouannaud & Kodratoff, 1979; Kodratoff, Franova, & Partridge, 1989). In the work of this group, the recursive programs which can be synthesized from traces correspond to a more complex program scheme than that underlying Summers':

Definition 6.3.26 (Extension of Summers' Program Scheme).

F(x) ≡ f(x, i(x))
f(x, z) ≡ (p_1(x) → f_1(x, z);
           ...;
           p_k(x) → f_k(x, z);
           T → h(x, f(b(x), g(x, z))))

where:
b is a basic function,
i(x) is an initialization function,
h, g are programs satisfying the scheme.

The greater complexity of the programs that can be synthesized is mainly obtained by the matching algorithm introduced by this group, called BMWk (for "Boyer, Moore, Wegbreit, Kodratoff"), which allows determining how many variables must be introduced in the function to be constructed in order to obtain regular differences between traces. An overview of this work is given in Smith (1984).

Another extension was proposed by Wysotzki (1983). He reformulated the second step of program synthesis, that is, recurrence detection in finite programs, on a more abstract level: the programs to be synthesized are characterized by term algebras instead of Lisp functions. Our work is based on this extension and presented in detail in chapter 7. An additional contribution of Wysotzki was to demonstrate that induction of recursive programs can be applied to planning problems (such as the clearblock problem presented in sect. 3.1.4.1 in chap. 3). This new area of application becomes possible because, when representing programs over term algebras, the choice of the interpreter function is free. Therefore, function symbols might be interpreted by operations defined in a planning domain (such as put or puttable, see chapter 2). Our work on learning recursive control rules for planning (see chapter 8) is based on Wysotzki's ideas. Later, Wysotzki (1987) presented an approach to realizing the first step of program synthesis, that is, constructing finite programs, using universal planning. Extensions of these ideas are also presented in chapter 8.

6.3.4.3 Biermann's Function Merging Approach. An approach which realizes the second step of synthesis differently from Summers' was proposed by Biermann (1978). While Summers-like synthesis is based on detecting regularities in differences between pairs of traces, Biermann works with a single trace. A trace is represented by a set of non-recursive function calls, and regularities are searched for in differences between these functions. The basic Lisp operations used for generating traces from input/output examples are the same as in Summers' approach. The program scheme for the functions to be synthesized, called the semi-regular Lisp scheme, is:

Definition 6.3.27 (Semi-Regular Lisp Scheme).

F_i(x) ≡ (p_1(x) → f_1(x);
          ...;
          p_k(x) → f_k(x);
          T → f_{k+1}(x))

where:
p_1, ..., p_k are predicates of the form atom(b_i(x)),
b_i is a basic function,
f_1, ..., f_{k+1} have one of the following forms: nil, x, F_j(car(x)), F_j(cdr(x)), cons(F_g(x), F_h(x)),
and it holds: p_j(x) = atom(b_{j+1}(x)) ⇒ b_{j+1} = b_j ∘ w with w a basic function.

A set of functions corresponding to such schemes is a program, called a regular Lisp program.

In the first step, a semi-trace is calculated from the given example, using the algorithm ST (see tab. 6.11). This semi-trace is then transformed into a trace by assigning a function definition to each sub-expression. An example is given in table 6.13 (see also Smith, 1984; Flener, 1995). Note that the given input/output example represents nested cons-pairs. A cons-pair has the form form((a:b)) = (!:!).

Table 6.13. Constructing a Regular Lisp Program by Function Merging

Example: ((a:b):(c:d)) → ((d:c):(b:a))
Resulting Semi-Trace: cons(cons(cddr(x), cadr(x)), cons(cdar(x), caar(x)))

Trace:
f_1(x) = cons(f_2(x), f_3(x))
f_2(x) = f_4(cdr(x))
f_3(x) = f_5(car(x))
f_4(x) = cons(f_6(x), f_7(x))
f_5(x) = cons(f_8(x), f_9(x))
f_6(x) = f_10(cdr(x))
f_7(x) = f_11(car(x))
f_8(x) = f_12(cdr(x))
f_9(x) = f_13(car(x))
f_10(x) = f_11(x) = f_12(x) = f_13(x) = x

Resulting Program:
g_1(x) ≡ (atom(x) → x;
          T → cons(g_2(x), g_3(x)))
g_2(x) ≡ g_1(cdr(x))
g_3(x) ≡ g_1(car(x))
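The resulting regular Lisp program mirrors a tree of cons-pairs. A Python sketch (our own illustration; cons-pairs as 2-tuples, atoms as strings):

```python
# Sketch of the regular Lisp program of table 6.13: g_1 mirrors a tree of
# cons-pairs by swapping car and cdr at every level.
def atom(x):
    return not isinstance(x, tuple)

def g1(x):
    if atom(x):
        return x
    return (g2(x), g3(x))     # cons(g2(x), g3(x))

def g2(x):
    return g1(x[1])           # g1(cdr(x))

def g3(x):
    return g1(x[0])           # g1(car(x))
```

Applied to ((a.b).(c.d)) this yields ((d.c).(b.a)), as in the example of the table.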

A regular Lisp program can be represented as a directed graph where each node represents a function. A directed arc represents a part of a conditional expression contained in the start node. Such a conditional expression is a pair of predicate and cons-expression. The goal nodes of an arc are the function calls appearing within the cons-expression. An example is given in the last line (e) of figure 6.15.

The construction of the graph which represents the searched-for program is realized by search with backtracking. The given trace is processed top-down for the given function calls f_i, and for each function call it is checked whether it can be merged with (is identical to) an already existing node in the graph. Merging results in cycles in the graph, representing recursion. If no merge is possible, a new arc is introduced, representing a new conditional expression. We demonstrate this method for the trace given in table 6.13 in figure 6.15: initially, a node g_1 is created, representing function f_1. Function f_1 calls two possibly different functions, that is, two arcs, representing two conditional expressions, are introduced. Function call f_2 could be either identical with g_1 (case b.1) or not (case b.2). The first hypothesis fails and case b.2 is selected. Function call f_3 could be either identical to g_1 (fails) or to g_2 (fails) or represent a new case. Because it cannot be merged with f_1 or f_2, a new node is introduced (case c.3). Function calls f_4 and f_5 can be merged with f_1 and therefore can be represented by node g_1 (case d). As a consequence, f_6 and f_8 are identified with g_2, and f_7 and f_9 with g_3. Identification for f_10 to f_13 fails and a new arc is introduced. The resulting graph represents the final recursive program given in table 6.13.

In contrast to Summers' approach, function merging is not restricted to linear recursion. A drawback is that Biermann's method corresponds to an identification-by-enumeration approach (see sect. 6.3.1.2) and therefore is not very efficient.

6.3.4.4 Generalization to N and Programming by Demonstration. A subfield of inductive program synthesis considers only how traces or protocols can be generalized to recursive programs (Bauer, 1975). Such approaches address only the second step of synthesis: detecting recurrence relations in traces. This area of research is called programming by demonstration (Cypher, 1993; Schrödl & Edelkamp, 1999) or generalization-to-n (Shavlik, 1990; Cohen, 1992). As described for Summers' recurrence detection method above, these approaches are based on the idea that traces can be considered as the k-th unfolding of an unknown recursive program. That is, in contrast to the folding technique in program transformation (see sect. 6.2.2.1), in inductive synthesis folding cannot rely on an already given recursive function!

[Figure: the stepwise graph construction of the regular Lisp program. a: node for function f_1; b.1/b.2: alternatives for f_2; c.1 to c.3: alternatives for f_3; d: merging f_4 and f_5 with g_1; e: the final program graph with arcs atom(x) → x and T → cons(g_2(x), g_3(x)), where g_2 calls g_1(cdr(x)) and g_3 calls g_1(car(x)).]

Fig. 6.15. Synthesis of a Regular Lisp Program

Programming by Demonstration with Tinker. Programming by demonstration is, for example, used for tutoring beginners in Lisp programming. The system Tinker (Lieberman, 1993) can create Lisp programs from user interactions with a graphical interface. For example, the user can manipulate blocks-worlds with the operations put and puttable (see chapter 2), and the system creates Lisp code realizing the operation sequences performed by the user. In the simplest case, generalization is performed by replacing constants by variables. Tinker can induce simple linear recursive functions, such as the clearblock function (see sect. 3.1.4.1 in chap. 3), from multiple examples (see fig. 6.16). Conditional expressions are constructed by asking the user about differences between examples.

Fig. 6.16. Recursion Formation with Tinker

Generalizing Traces Using Grammar Inference. A recent approach to programming by demonstration was presented by Schrödl and Edelkamp (1999). The authors describe how the grammar inference algorithm ID (Angluin, 1981) can be applied to induce control structures from traces. As an example, they demonstrate how the loops for bubble-sort can be learned from user traces for sorting lists with a small number of elements using the swap operation. The trace is enriched by built-in knowledge about conditional branches. A deterministic finite automaton (DFA) is learned from the example traces and membership queries to the user. The automaton accepts all prefixes of traces (that is, truncations of the original traces) of the target program.

Explanation-Based Generalization-to-N. An approach related to program synthesis from examples is explanation-based learning or generalization (EBG) (Mitchell et al., 1986). A (possibly recursive) concept description is learned from a small set of examples based on a (large amount of) background knowledge. The background knowledge (domain theory) is used to generate an "explanation" of why a given example belongs to the searched-for concept. EBG is characterized as analytical learning: the constructed hypotheses do not extend the deductive closure of the domain theory. Typically, Prolog is used as hypothesis language (Mitchell, 1997, chap. 11). EBG techniques are, for example, used in the planning system Prodigy for learning search control rules (see sect. 2.5.2 in chap. 2).

An extension of EBG to generalization-to-n was presented by Shavlik (1990). In a first step, explanations are generated, and in a second step, repeatedly occurring sub-structures are searched for in an explanation. If such sub-structures are found, the explanation is transformed into a recursive rule.

6.4 Final Comments

6.4.1 Inductive versus Deductive Synthesis

As we have seen in this section, both deductive and inductive approaches depend heavily on search. The theorem proving approaches search for a sequence of applications of programming axioms such that an axiom defining the searched-for input/output relation becomes true. The program transformation approaches search for a sequence of transformation rules which, applied to a given input specification, results in an efficiently executable program. Genetic programming searches for a syntactically correct program which works correctly for all presented examples. Inductive logic programming searches for a logic program which covers all positive and none of the negative examples of the relation to be induced. Inductive functional program synthesis searches for regularities in traces which allow interpreting the traces as the k-th approximation of a recursive function.

For all approaches it holds that, in general, search is inefficient and therefore must be restricted by knowledge. One possibility to restrict search in deductive as well as inductive approaches is to provide program schemes: for example, the program transformation system KIDS relies on sophisticated program schemes for divide-and-conquer and for local and global search. In inductive logic programming, a syntactic bias is introduced by restricting the class of clauses which can be induced. In inductive functional program synthesis, the class of functions to be inferred is characterized by schemes.

We have seen for transformational synthesis that the step of transforming a non-executable specification into an (inefficient) executable program cannot be performed automatically. Instead, this step depends heavily on interaction with a user. A similar bottleneck is the first step in inductive functional program synthesis: transforming input/output examples into traces. In the systems described in this section, this problem was reduced by allowing only lists (and not numbers) as input and by considering only the structure but not the content of the input examples. If these constraints are relaxed, there is no longer a unique trace characterizing the transformation of an input example into the desired output but potentially infinitely many, and it depends on the generated traces whether a recurrence relation can be found or not. In contrast, the second step of synthesis, identifying regularities in differences between traces, can be performed straightforwardly by pattern matching. The same is true for program transformation: there are several approaches to automatic optimization of an executable program.

6.4.2 Inductive Functional versus Logic Programming

There exists no systematic comparison of inductive logic and inductive functional programming in the literature. A prerequisite would be a formal characterization of the class of programs which can be inferred within each approach. In inductive functional programming, there exist some preliminary proposals for such a characterization (Le Blanc, 1994). Possibly, the theoretical framework of grammar inference could be used to describe what class of recursive programs can be induced on the basis of what information. Currently, we can offer only some informal observations comparing inductive logic and functional programming:

ILP typically depends on positive and negative examples, where negative examples are used to remove literals from the clauses to be learned. Inductive functional programming uses positive examples only, but these examples must represent the first k elements of a totally ordered input domain. Because of this restriction, inductive functional programming needs far fewer examples, ranging from one to about four, than ILP, where typically more than ten positive examples must be presented.

In inductive functional programming, the resulting recursive function is independent of the sequence in which the input/output examples are presented, while in ILP different programs can result for different example sequences. The reason is that inductive functional programming bases the construction of predicates which discriminate between examples on knowledge about the order of the input domain, while ILP stepwise includes positive examples to construct a hypothesis.

Inductive functional programming works in a two-step process, first transforming input/output examples into traces and then identifying regularities in traces. This corresponds to a bottom-up, data-driven approach to learning. ILP, regardless of whether it works with top-down refinement or bottom-up generalization, incrementally modifies an initial hypothesis by adding or deleting literals.

In inductive functional programming, the resulting program is executable, while this need not be true for ILP: typically, neither a base case for a recursive clause is induced, nor is an order for the literals in the body provided. It is a fundamental difference between logic and functional programs that for logic programs control flow is provided by the interpreter strategy (e. g., SLD-resolution), while functional programs implement a fixed control structure. Furthermore, functional programs typically have one parameter less than logic programs. In the first case, a function transforms a given input into an output value. In the second case, an input/output relation is proved, and it is possible either to construct an output such that the relation holds for the given input or to construct an input such that the relation holds for the desired output.

Because functions are a subset of relations, namely relations which are defined for all inputs and have a unique output, it can be conjectured that, in general, induction of functions is a harder problem than induction of relations.

7. Folding of Finite Program Terms

"It fits into the pattern, I think." "Ah," said Mr. Rattisbon, who knew Alleyn. "The pattern. Your pet theory, Chief Inspector." "Yes, Sir, my pet theory. I hope you may provide me with another lozenge in the pattern."
Ngaio Marsh, Death of a Peer, 1940

In this chapter we present our approach to folding finite program terms into recursive programs. That is, we address inductive program synthesis from traces. This problem is researched in inductive synthesis of functional programs as the second step of synthesis and in programming by demonstration (see chap. 6). Traces can be provided to the system by the system user, they can be recorded from interactions of a user with a program, or they can be constructed over a set of input/output examples. In the next chapter (chap. 8) we will describe how finite programs can be generated by planning. Folding finite programs into recursive programs is a complex problem in itself. Our approach allows folding recursive programs which can be characterized by sets of context-free term rewrite rules. It allows inferring programs consisting of a set of recursive equations and dealing with interdependent and hidden parameters. Traces can be generic or over instantiated parameters, and they can contain incomplete paths. That is, at least for the second step of program synthesis, our approach is more powerful than the other approaches discussed in the literature (see sect. 6.3 in chap. 6).

In the following we will first introduce terms and recursive program schemes as background for our approach (sect. 7.1), then we will formulate the synthesis problem (sect. 7.2), afterwards we present theoretical results and algorithms (sect. 7.3), and finally we give some examples (sect. 7.4).¹

In appendix A.4 we give a short overview of folding algorithms we realized over the last years, and in appendix A.5 we give some details about the current implementation, which is based on the approach presented in this chapter.

¹ This chapter is based on the previous publications Schmid and Wysotzki (1998) and Mühlpfordt and Schmid (1998), and the diploma theses of Mühlpfordt (2000) and Pisch (2000). Formalization, proof, and implementation are mainly based on the thesis of Mühlpfordt (2000).

7.1 Terminology and Basic Concepts

In contrast to other approaches to the synthesis of recursive programs, the approach presented here is independent of a given programming language. Instead, programs are characterized as elements of some arbitrary term algebra, as proposed by Wysotzki (1983), based on the theory of recursive programs developed by Courcelle and Nivat (1978). A similar proposal was made by Le Blanc (1994), who formalized inductive program synthesis in a term rewriting framework. The goal of folding is to induce a recursive program scheme from some arbitrary finite program term. Our approach is purely syntactical, that is, it works on the structure of given program terms, independent of the interpretation of the symbols over which a term is constructed. The overall idea is to detect a recurrent relation in a finite program term. Our approach is an extension of Summers' synthesis theorem, which was presented in detail in section 6.3.4.1 in chapter 6.

We first introduce some basic concepts and notations for terms, mostly following Dershowitz and Jouannaud (1990), then we introduce first-order patterns and anti-unification (Burghardt & Heinz, 1996), and finally recursive program schemes (Courcelle & Nivat, 1978).

7.1.1 Terms and Term Rewriting

Definition 7.1.1 (Signature). A signature Σ is a set of (function) symbols with α : Σ → N giving the arity of a symbol. The set Σ_n represents all symbols with fixed arity n, where symbols in Σ_0 represent constants. A signature can be defined over a set of variables X with X ∩ Σ = ∅.

Although we are only concerned with the syntactical structure of terms built over a signature, in table 7.1 we give a sample of function symbols and their usual interpretation, which we will use in illustrative examples. For better readability, we will sometimes represent numbers or lists in their usual notation instead of as constructor terms (e. g., 3 instead of succ(succ(succ(0))) or [9, 4, 7] instead of cons(9, cons(4, cons(7, nil)))).

Definition 7.1.2 (Term). For the set T_Σ(X) of terms over a signature Σ and a set of variables X it holds that:

1. X ⊆ T_Σ(X),
2. Σ_0 ⊆ T_Σ(X), and
3. t_1, ..., t_n ∈ T_Σ(X), f ∈ Σ_n ⇒ f(t_1, ..., t_n) ∈ T_Σ(X).

Terms without variables are called ground terms, and the set of all ground terms is denoted T_Σ. For a term t = f(t_1, ..., t_n) the terms t_i (i = 1 ... n) are called sub-terms of t. The mapping var : T_Σ(X) → X returns all variables in a term.

Table 7.1. A Sample of Function Symbols

0 ∈ Σ_0: natural number zero
1 ∈ Σ_0: natural number one
nil ∈ Σ_0: empty list, truth value "false"
T ∈ Σ_0: truth value "true"
succ ∈ Σ_1: successor of a natural number
pred ∈ Σ_1: predecessor of a natural number
head ∈ Σ_1: head of a list
tail ∈ Σ_1: tail of a list
eq0 ∈ Σ_1: test whether a natural number equals zero
eq1 ∈ Σ_1: test whether a natural number equals one
empty ∈ Σ_1: test whether a list is empty
cons ∈ Σ_2: insertion of an element in front of a list
+ ∈ Σ_2: addition of two numbers
* ∈ Σ_2: multiplication of two numbers
eq ∈ Σ_2: equality test for two symbols
< ∈ Σ_2: test whether a natural number is smaller than another
if ∈ Σ_3: conditional "if x then y else z"
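The constructor-term representations mentioned above can be sketched in Python (our own illustration; a term is encoded as a tuple whose first component is the function symbol, constants as one-element tuples):

```python
# Sketch: numbers and lists as constructor terms over the signature of
# table 7.1, encoded as symbol-first tuples.
def nat(n):
    """n as succ(succ(... succ(0) ...))."""
    return ("0",) if n == 0 else ("succ", nat(n - 1))

def lst(xs):
    """A Python list as cons(x1, cons(..., nil))."""
    return ("nil",) if not xs else ("cons", xs[0], lst(xs[1:]))
```

So nat(3) is the constructor term for the number 3 and lst([9, 4, 7]) the constructor term for the list [9, 4, 7].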

In the following we use the expressions term, program term, and tree as synonyms. Each symbol contained in a term is called a node of the term.

Definition 7.1.3 (Position). Positions in a term t ∈ T_Σ(X) are defined by:

- λ is the root-position of t,
- if t = f(t_1, …, t_n) and u is a position in t_i, then i.u is a position in t.

An example for a term together with an illustration of positions in the term is given in figure 7.1.

Definition 7.1.4 (Composition and Order of Positions). The composition v ∘ w of two positions v and w is defined as u.w if v = u.λ. A partial order ≤ over positions is defined in the following way: the relation v ≤ w is true iff

- v = w, or
- there exists a position u with v ∘ u = w.

Definition 7.1.5 (Sub-term at a Position). A sub-term of a term t ∈ T_Σ(X) at a position u in t (written t|_u) is defined as

- t|_λ = t,
- if t = f(t_1, …, t_n) and u is a position in t_i, then t|_{i.u} = t_i|_u.

Definition 7.1.6 (Positions of a Node). Function node: T_Σ(X) × Position → Σ returns the symbol at a fixed position in a term: node(t, u) = f with t ∈ T_Σ(X), a position u with t|_u = f(t_1, …, t_n), and

174 7. Folding of Finite Program Terms

[Figure: tree representation of the term below, with the positions 3.λ and 3.3.1.2.λ of two sub-terms marked]

t = if(<(k,n),k,if(<(-(k,n),n),k,if(<(-(-(k,n),n),n),k,0)))

Fig. 7.1. Example Term with Exemplaric Positions of Sub-terms

f ∈ Σ. The set of all positions at which a fixed symbol f appears in a term is denoted pos(t, f) and it holds ∀w ∈ pos(t, f): node(t, w) = f.

The definitions can be extended to Σ ∪ X. If necessary, the definition of node can be extended to additionally return the arity of a symbol. With pos(t) we refer to the set of all positions for a term t and with lpos(t) we refer to the set of all positions of leaf-nodes in a term.

For the term in figure 7.1, a sub-term is for example t|_{3.3.1.λ} = <(-(-(k, n), n), n) and a set of positions is for example pos(t, <) = {1.λ, 3.1.λ, 3.3.1.λ}.

Definition 7.1.7 (Replacement of a Term at a Position). The replacement of a sub-term t|_w by a term s ∈ T_Σ(X) in a term t ∈ T_Σ(X) is written as t[w ← s].

Definition 7.1.8 (Substitution). A substitution σ: X → T_Σ(X) is a mapping of variables to terms. The extension of substitution for terms is σ(f(t_1, …, t_n)) = f(σ(t_1), …, σ(t_n)). Substitutions over a term are enumerated as t{x_1 ← t_1, …, x_n ← t_n} or t[t_1, …, t_n] for short.
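A substitution in the tuple representation is simply a dictionary from variable names to terms; its homomorphic extension to terms (def. 7.1.8) is, as a sketch of ours:

```python
def substitute(t, sigma):
    """Apply substitution sigma: X -> T_Sigma(X), extended to terms."""
    if isinstance(t, str):
        return sigma.get(t, t)   # variables are replaced, other symbols kept
    return (t[0],) + tuple(substitute(s, sigma) for s in t[1:])

# t{x <- suc(0)} on t = +(x, suc(x)):
t = ('+', 'x', ('suc', 'x'))
result = substitute(t, {'x': ('suc', '0')})
```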

Definition 7.1.9 (Term Rewrite System). A term rewrite system over a signature Σ is a set of pairs of terms R ⊆ T_Σ(X) × T_Σ(X). The elements (l, r) of R are called rewrite rules and are written as l → r.


Definition 7.1.10 (Term Rewriting). Let R be a term rewrite system and t, t' ∈ T_Σ(X) two terms. Term t' can be derived in one rewrite step from t using R (t →_R t'), if there exists a position u in t, a rule l → r ∈ R, and a substitution σ: X → T_Σ(X), such that holds

- t|_u = σ(l),
- t' = t[u ← σ(r)].

R implies a rewrite relation →_R ⊆ T_Σ(X) × T_Σ(X) with (t, t') ∈ →_R if t →_R t'. The reflexive and transitive closure of →_R is →*_R.
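A single rewrite step in the sense of definition 7.1.10 can be sketched as follows (our illustration; the helpers repeat the earlier definitions so the fragment is self-contained):

```python
def subterm(t, u):
    for i in u:
        t = t[i]
    return t

def substitute(t, sigma):
    if isinstance(t, str):
        return sigma.get(t, t)
    return (t[0],) + tuple(substitute(s, sigma) for s in t[1:])

def replace(t, u, s):
    """t[u <- s] (def. 7.1.7): replace the sub-term at position u by s."""
    if not u:
        return s
    i = u[0]
    return t[:i] + (replace(t[i], u[1:], s),) + t[i + 1:]

def rewrite_step(t, u, rule, sigma):
    """t ->_R t' at position u with rule l -> r, or None if t|_u != sigma(l)."""
    l, r = rule
    if subterm(t, u) != substitute(l, sigma):
        return None
    return replace(t, u, substitute(r, sigma))

# Rule +(0, x) -> x applied at position 1 of *(+(0, suc(0)), y):
t = ('*', ('+', '0', ('suc', '0')), 'y')
stepped = rewrite_step(t, (1,), (('+', '0', 'x'), 'x'), {'x': ('suc', '0')})
```

The step succeeds only when the matched redex really equals the instantiated left-hand side, mirroring the two conditions of the definition.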

7.1.2 Patterns and Anti-Unification

A term t = σ(p) which can be generated by substitutions over a term p is a specialization of p, and p is called a (first order) pattern of t.

Definition 7.1.11 (First Order Pattern). A term p ∈ T_Σ(Y) is called first order pattern of a term t ∈ T_Σ(X), if there exists a subsumption relation p ≤ t with σ: Y → T_Σ(X) and σ(p) = t. The sets of variables Y and X are disjoint.

A pattern p of a term t is called trivial, if p is a variable, and non-trivial otherwise.

Definition 7.1.12 (Order over Terms). For two terms p, t ∈ T_Σ(X) over the same signature holds p ≤ t if p is a pattern of t, and p < t if additionally there exists no variable renaming such that p and t are unifiable.

Definition 7.1.13 (Maximal Pattern). A term p is called maximal (first order) pattern of a term t, if p ≤ t and there exists no term p' with p < p' and p' ≤ t.

An example for a first order pattern of a term is given in figure 7.2. For all non-trivial patterns holds that terms which can be subsumed by a pattern share a common prefix.

A (maximal) pattern of two terms can be constructed by syntactic anti-unification of these terms.

Definition 7.1.14 (Anti-Unificator). A term p is called (first-order) anti-unificator of two terms t and t', if p ≤ t and p ≤ t' and if for all terms p' with p' ≤ t and p' ≤ t' holds p' ≤ p. That is, the anti-unificator is the most specific generalization of two terms and it is unique except for variable renaming.

Definition 7.1.15 (Anti-Unification). Anti-unification ⊔: T_Σ(X) × T_Σ(X) → T_Σ(X ∪ Y) is defined as:

- f(t_1, …, t_n) ⊔ f(t'_1, …, t'_n) = f(t_1 ⊔ t'_1, …, t_n ⊔ t'_n),
- f(t_1, …, t_n) ⊔ f'(t'_1, …, t'_m) = φ(f(t_1, …, t_n), f'(t'_1, …, t'_m)) for f ≠ f'


[Figure: a first order pattern (left, containing the variables x and y) and a term it subsumes, the example term from figure 7.1 (right)]

Fig. 7.2. Example First Order Pattern

with φ: T_Σ(X) × T_Σ(X) → Y as an injective mapping of terms into a (new) set of variables.

Mapping φ guarantees that the mapping of pairs of terms to variables is unique in both directions. Identical sub-term pairs are replaced by the same variable over the complete term (Lassez, Maher, & Marriott, 1988).
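Definition 7.1.15 can be sketched directly in the tuple representation; the dictionary plays the role of φ and makes sure that identical pairs of differing sub-terms receive the same fresh variable (our illustration, with variable names v0, v1, … chosen by us):

```python
import itertools

def anti_unify(t1, t2, phi=None, fresh=None):
    """First-order anti-unification t1 ⊔ t2 (sketch of def. 7.1.15)."""
    if phi is None:
        phi = {}
        fresh = (f'v{i}' for i in itertools.count())
    same_head = (isinstance(t1, tuple) and isinstance(t2, tuple)
                 and t1[0] == t2[0] and len(t1) == len(t2))
    if same_head:            # equal symbol and arity: descend argument-wise
        return (t1[0],) + tuple(anti_unify(a, b, phi, fresh)
                                for a, b in zip(t1[1:], t2[1:]))
    if t1 == t2:             # equal constants or variables
        return t1
    if (t1, t2) not in phi:  # phi: injective map of term pairs to variables
        phi[(t1, t2)] = next(fresh)
    return phi[(t1, t2)]
```

For instance, anti-unifying f(a, a) with f(b, b) gives f(v0, v0): both occurrences of the mismatching pair (a, b) are mapped to the same variable, as the text requires.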

Anti-unification can be extended from pairs of terms to sets of terms.

Definition 7.1.16 (Anti-Unification of Sets of Terms). Anti-unification of a set of terms ⊓: T_Σ(X)⁺ → T_Σ(X ∪ Y) is defined as:

⊓T = t                     if T = {t},
⊓T = (⊓(T \ {t})) ⊔ t      otherwise.

First order anti-unification is distributive, that is, terms in T can be anti-unified in an arbitrary order (Lassez et al., 1988).

An example for anti-unification is given in figure 7.3.

7.1.3 Recursive Program Schemes

7.1.3.1 Basic Definitions. The notion of a program scheme allows us to characterize the class of recursive functions which can be induced in program synthesis.


[Figure: two terms (top) and their anti-unificator (bottom), in which a differing pair of sub-terms is replaced by the variable x]

Fig. 7.3. Anti-Unification of Two Terms

Definition 7.1.17 (Recursive Program Scheme). Let Σ be a signature and Φ = {G_1, …, G_n} a set of function variables with Φ ∩ Σ = ∅ and arity α(G_i) = m_i > 0. A recursive program scheme (RPS) S is a pair (G, t_0) with t_0 ∈ T_{Σ∪Φ}(X) and G as a system of n equations:

G = ⟨ G_1(x_1, …, x_{m_1}) = t_1,
      ⋮
      G_n(x_1, …, x_{m_n}) = t_n ⟩

with t_i ∈ T_{Σ∪Φ}(X), i = 1 … n.

An RPS corresponds to an abstract functional program with t_0 as initial program call ("main program") and G as a set of (recursive) functions ("subprograms"). The left-hand side of an equation in G is called program head, the right-hand side program body. The parameters x_i in a program head are pairwise different. The body t_i of an equation G_i can contain calls of G_i or calls of other equations in G.

In the following, RPSs are restricted in the following way: For each program body t_i of an equation in G holds

- var(t_i) = {x_1, …, x_{m_i}}, that is, all variables in the program head are used in the program body, and
- pos(t_i, G_i) ≠ ∅, that is, each equation is recursive.

The first restriction does not limit expressiveness: Variables which are never used in the body of a program do not contribute to the computation of a result. The second restriction ensures that induction of an RPS from a finite program term is really due to generalization (learning) and that the resulting RPS does not just represent the given finite program itself.


An example for an RPS is given in table 7.2. Equation ModList checks for each element of a list of natural numbers whether it is a multiple of a number n. The equation is linear recursive because it contains a call of itself as argument of the cons-operator, and it calls a second equation Mod which is tail-recursive, that is, a special case of linear recursion corresponding to a loop. The program call t_0 in this example is simply a call of equation ModList. In general, t_0 can be a more complex expression from T_{Σ∪Φ}(X). For example, we might only be interested in whether the second element of the list is a multiple of n and call t_0 = head(tail(ModList(l, n))).

Interpretation of an RPS, for example by means of an eval-apply method (Field & Harrison, 1988), presupposes that all symbols in Σ are available as pre-defined operations and that the variables in t_0 are instantiated. An RPS can be unfolded by replacing function calls by the corresponding function body, where variables in the body are substituted in accordance with the parameters in the function call. Generation of program terms by unfolding is described in the next section.

Table 7.2. Example for an RPS

S = ⟨G, t_0⟩
G = ⟨
  ModList(l, n) = if( empty(l),
                      nil,
                      cons( if(eq0(Mod(n, head(l))), T, nil),
                            ModList(tail(l), n))),
  Mod(k, n) = if(<(k, n), k, Mod(-(k, n), n))
⟩
t_0 = ModList(l, n)

Variable instantiation is a special case of substitution (see def. 7.1.8) where variables are replaced by ground terms:

Definition 7.1.18 (Variable Instantiation). A mapping β: X → T_Σ is called instantiation of variables X. Variable instantiation can be extended to terms: β: T_Σ(X) → T_Σ such that β(f(t_1, …, t_n)) = f(β(t_1), …, β(t_n)) for f ∈ Σ, arity α(f) = n, and t_1, …, t_n ∈ T_Σ(X).

Please note that we speak of the (formal) parameters of a program to refer to its arguments. For example, the recursive equation ModList(l, n) has two parameters. The parameters can be variables, such as l, n ∈ X, or be instantiated with (ground) terms.

7.1.3.2 RPSs as Term Rewrite Systems. A recursive program scheme can be viewed as a term rewrite system (Courcelle, 1990) or, alternatively, as a context-free tree grammar (see next section).

An RPS as introduced in definition 7.1.17 can be interpreted as a term rewrite system as introduced in definition 7.1.9:


Definition 7.1.19 (RPS as Rewrite System). Let S = (G, t_0) be an RPS and Ω a special symbol. The equations in G constitute rules R_S of a term rewrite system. The system additionally contains rules R_Ω:

- R_S = {G_i(x_1, …, x_{m_i}) → t_i | i ∈ {1, …, n}},
- R_Ω = {G_i(x_1, …, x_{m_i}) → Ω | i ∈ {1, …, n}}.

We write →*_{S,Ω} for the reflexive and transitive closure of the rewrite relation implied by R_S ∪ R_Ω.

By means of the term rewrite system, the head of a user-defined equation (occurring in t_0 or a program body t_i) is either replaced by the associated body or by symbol Ω, which represents the empty or undefined term.

Definition 7.1.20 (Language of an RPS). Let S = (G, t_0) be an RPS and β: var(t_0) → T_Σ an instantiation of the parameters in t_0. The set of all terms L(S, β) = {t | t ∈ T_{Σ∪{Ω}}, β(t_0) →*_{S,Ω} t} is the language generated by the RPS with instantiation β. For a main program t_0 without variables, we can write L(S).

The instantiation of all parameters in program call t_0 implies that the language of an RPS is a set of ground terms, containing neither object variables X nor function variables Φ. That is, all recursive calls are at some rewrite step replaced by Ωs.

Definition 7.1.21 (Language of a Subprogram). Let S = (G, t_0) be an RPS. For an equation G_i ∈ G the rewrite relation →_{G_i,Ω} is implied by the rules

R_{G_i,Ω} = {G_i(x_1, …, x_{m_i}) → t_i, G_i(x_1, …, x_{m_i}) → Ω}.

Let β: {x_1, …, x_{m_i}} → T_Σ be an instantiation of the parameters of G_i; then the set of all terms L(G_i, β) = {t | t ∈ T_{Σ∪{Ω}∪Φ\{G_i}}, β(G_i(x_1, …, x_{m_i})) →*_{G_i,Ω} t} is the language generated by subprogram G_i with instantiation β.

The term given on the left side of figure 7.4 is an example for a term belonging to the language of the RPS given in table 7.2 as well as an example for a term belonging to the language of the subprogram ModList. The term represents an unfolding of the main program t_0 where the ModList subprogram is called. For t_0 = ModList(l, n) variables are instantiated as l = [9, 4, 7] and n = 8.² Within this unfolding, there is an unfolding of the Mod subprogram which is called by ModList. At the same time, the term represents an unfolding of the subprogram ModList itself. The term given on the right side of figure 7.4 is an unfolding of ModList containing the call of the Mod subprogram. Neither of the two terms contains the name of the subprogram ModList itself.

² For a restricted signature with natural number 0 and successor-function only and list constructor cons, a natural number i can be represented as suc^i(0) and a list as a cons-expression.


[Figure: two terms over the symbols if, empty, nil, cons, eq0, T, head, tail, <, Mod, Ω and the list [9,4,7]; the left term is an unfolding of ModList in which the nested Mod call is also unfolded, the right term is an unfolding of ModList in which the Mod call is left in place; t_0 = ModList(l, n), β(l) = [9, 4, 7], β(n) = 8]

Fig. 7.4. Examples for Terms Belonging to the Language of an RPS and of a Subprogram of an RPS

7.1.3.3 RPSs as Context-Free Tree Grammars. In the remaining chapter, we will rely on the notion of RPSs as term rewrite systems. In this section we shortly present an alternative, equivalent definition of RPSs as context-free tree grammars (Guessarian, 1992).

Definition 7.1.22 (Context-Free Tree Grammar). A context-free tree grammar (CFTG), C = (A, N, Σ, P), is a four-tuple of

- a set of terminal symbols Σ,
- a set of non-terminal symbols N with fixed arity and N ∩ Σ = ∅,
- an axiom A ∈ T_{Σ∪N}, and
- a set of production rules P of the form G(x_1, …, x_n) → t with G ∈ N and {x_1, …, x_n} ⊆ X, where X ∩ (Σ ∪ N) = ∅ is a set of reserved variables, and t ∈ T_{Σ∪N}(X).

If there exists for a non-terminal G ∈ N more than one rule G(x_1, …, x_n) → t_1, …, G(x_1, …, x_n) → t_m, we write G(x_1, …, x_n) → t_1 | … | t_m for short.

The language of a context-free tree grammar consists of all words (terms) which can be derived from the axiom and which do not contain non-terminal symbols.

Definition 7.1.23 (Language of a CFTG). Let C = (A, N, Σ, P) be a CFTG and →*_P the reflexive and transitive closure of the derivation relation →_P. The language L(C) of the grammar is L_C = {t ∈ T_Σ | A →*_P t}.

The program call t_0 of an RPS can be interpreted as the axiom of a CFTG and the equations of an RPS as production rules. In contrast to an RPS, the axiom must contain no variables, that is, all variables in the program call must be instantiated.

Definition 7.1.24 (CFTG of an RPS). Let S = (G, t_0) be an RPS and β: X → T_Σ an instantiation of the variables in t_0 (X = var(t_0)); then the CFTG C_{S,β} of S and β is defined as C_{S,β} = (β(t_0), Φ, Σ, P) with

P = { G_1(x_1, …, x_{m_1}) → t_1 | Ω,
      ⋮
      G_n(x_1, …, x_{m_n}) → t_n | Ω }.

With such a CFTG the language generated by an RPS can be defined as L(S, β) = L(C_{S,β}).

7.1.3.4 Initial Programs and Unfoldings. By means of the definition of an RPS as term rewrite system (see def. 7.1.19) and the language of an RPS induced by this system (see def. 7.1.20) we provide for a method to unfold an RPS into a term of finite length. Because all equations in an RPS must be recursive, such a finite term contains at least one Ω symbol (termination of rewriting). Now we can characterize the input of our synthesis system as a term of which we assume that it is generated by a (yet unknown) recursive program scheme.

Definition 7.1.25 (Initial Program). An initial program is a term t ∈ T_{Σ∪{Ω}}(X) which contains at least one Ω: pos(t, Ω) ≠ ∅. The term might be defined over instantiated variables only, that is, t ∈ T_{Σ∪{Ω}}.

Definition 7.1.26 (Order over Initial Programs). Let t, t' ∈ T_{Σ∪{Ω}}(X) be two terms. An order t ≤ t' is inductively defined as

1. Ω ≤ t', if pos(t', Ω) ≠ ∅,
2. x ≤ t', if x ∈ X and pos(t', Ω) ≠ ∅,
3. f(t_1, …, t_n) ≤ f(t'_1, …, t'_n), if ∀i ∈ {1, …, n} holds t_i ≤ t'_i.

The task of folding an initial program involves finding a segmentation and substitutions of the initial program which can be interpreted as the first i elements of an unfolding of a hypothetical RPS. Therefore, we now introduce recursion points and substitution terms for an RPS.

For a given recursive equation, each occurrence of a recursive call in its body constitutes a recursion point, and for each recursive call the parameters are substituted by terms:

Definition 7.1.27 (Recursion Points and Substitution Terms). Let S = (G, t_0) be an RPS, G_i(x_1, …, x_{m_i}) = t_i a recursive equation in S, and X_i = {x_1, …, x_{m_i}} the set of parameters of G_i. The non-empty set of recursion points U_rec of G_i is given by U_rec = pos(t_i, G_i) with indices (positions) R = {1, …, |U_rec|}.


Each recursive call of G_i at position u_r ∈ U_rec, r ∈ R in t_i implies substitutions σ_r: X_i → T_Σ(X_i) of the parameters in G_i. Let sub: X_i × R → T_Σ(X_i) be a mapping sub(x_j, r) = σ_r(x_j) for all x_j ∈ X_i; then holds sub(x_j, r) = t_i|_{u_r ∘ j}. The terms sub(x_j, r) are called substitution terms.

An example for recursion points and substitution terms for the Fibonacci function is given in table 7.3. Fibonacci is tree recursive, that is, it contains two recursive calls in its body. The positions of these calls (see def. 7.1.3) in the term representing the body of Fibonacci are U_rec = {3.3.1.λ, 3.3.2.λ}.

For referring to these recursion points we introduce an index set R = {1, …, |U_rec|}, that is, for Fibonacci R = {1, 2}. With u_1 ∈ U_rec we refer to the first recursive call and with u_2 ∈ U_rec we refer to the second recursive call. In the first recursive call, the parameter x is substituted by pred(x) and in the second recursive call, it is substituted by pred(pred(x)).

Table 7.3. Recursion Points and Substitution Terms for the Fibonacci Function

Recursive Equation:
G_1(x_1) = if( eq0(x_1),
               1,
               if( eq0(pred(x_1)),
                   1,
                   +(G_1(pred(x_1)), G_1(pred(pred(x_1))))))

Recursion Points:  U_rec = {3.3.1.λ, 3.3.2.λ}

Recursive Calls:   t_1|_{3.3.1.λ} = G_1(pred(x_1)),
                   t_1|_{3.3.2.λ} = G_1(pred(pred(x_1)))

Substitution Terms for x_1:
sub(x_1, 1) = t_1|_{3.3.1.λ ∘ 1} = G_1(pred(x_1))|_{1.λ} = pred(x_1),
sub(x_1, 2) = t_1|_{3.3.2.λ ∘ 1} = G_1(pred(pred(x_1)))|_{1.λ} = pred(pred(x_1))
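Recursion points and substitution terms are directly computable in the tuple representation used earlier; a sketch of ours, run on the Fibonacci body of table 7.3 (the helpers repeat earlier definitions so the fragment is self-contained):

```python
def pos(t, f):
    head = t[0] if isinstance(t, tuple) else t
    result = [()] if head == f else []
    if isinstance(t, tuple):
        for i, sub in enumerate(t[1:], start=1):
            result += [(i,) + u for u in pos(sub, f)]
    return result

def subterm(t, u):
    for i in u:
        t = t[i]
    return t

def substitution_terms(body, name, params):
    """sub(x_j, r) = body|_{u_r . j} for every recursion point u_r (def. 7.1.27)."""
    u_rec = pos(body, name)                       # U_rec = pos(t_i, G_i)
    return u_rec, {(x, r): subterm(body, u + (j,))
                   for r, u in enumerate(u_rec, start=1)
                   for j, x in enumerate(params, start=1)}

fib_body = ('if', ('eq0', 'x1'), '1',
            ('if', ('eq0', ('pred', 'x1')), '1',
             ('+', ('G1', ('pred', 'x1')), ('G1', ('pred', ('pred', 'x1'))))))

u_rec, sub = substitution_terms(fib_body, 'G1', ('x1',))
```

This reproduces U_rec = {3.3.1.λ, 3.3.2.λ} and the two substitution terms pred(x_1) and pred(pred(x_1)) from the table.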

For a given recursive equation there is a fixed set of recursion points. This set contains a single position for linear recursion, two positions for tree recursion, and for more complex RPSs the set can contain an arbitrary number of recursion points. If a given term is generated by unfolding a recursive equation, the recursion points occur in a regular way in each unfolding. Because, theoretically, a recursive equation can be unfolded infinitely many times, there are infinitely many positions at which recursive calls occur, that is, unfolding positions. We will describe such positions over indices in the following way:

Definition 7.1.28 (Unfolding Indices). Let S = (G, t_0) be an RPS, G_i(x_1, …, x_{m_i}) = t_i a recursive equation in S, X_i = {x_1, …, x_{m_i}} the set of parameters of G_i, and U_rec the set of recursion points of G_i with index set R = {1, …, |U_rec|}. The set of unfolding indices W constructed over R is inductively defined as the smallest set for which holds:

indu tively dened as the smallest set for whi h holds:

[Figure: the third unfolding of the Fibonacci term for β(x_1) = 5, with the unfolding positions 3.3.1.λ, 3.3.2.λ, 3.3.1.∘3.3.1.λ, … marked and the remaining recursive calls replaced by Ω]

Fig. 7.5. Unfolding Positions in the Third Unfolding of Fibonacci


1. λ ∈ W,
2. if w ∈ W and r ∈ R, then w ∘ r ∈ W.

To illustrate the concept of unfolding indices, we anticipate the concept of unfolding which will be introduced in the next definition. Consider the Fibonacci term in figure 7.5 which represents the third syntactic unfolding of the recursive function given in table 7.3. The zero-th unfolding would be just the empty term Ω and its position is λ, that is, the root of the tree. The first unfolding renders two unfolding positions, 3.3.1.λ and 3.3.2.λ, which are identical with the recursion points of the Fibonacci function. Unfolding the function at each of these positions again by one step results in four new unfolding positions, 3.3.1.3.3.1.λ, 3.3.1.3.3.2.λ, 3.3.2.3.3.1.λ, and 3.3.2.3.3.2.λ, and so on. Just as we refer to the recursion points using the index set R, we refer to the unfolding positions using an index set W. The unfolding points and the corresponding indices for the first three unfoldings of Fibonacci are given in table 7.4. For a tree recursive structure such as Fibonacci, the indices grow in a treelike fashion. For example, index 1.1.2.λ refers to the right unfolding position in the two left unfoldings on the levels above.

Table 7.4. Unfolding Positions and Unfolding Indices for Fibonacci

Unfolding  Unfolding Positions                        Unfolding Indices
0          λ                                          λ
1          3.3.1.λ, 3.3.2.λ                           1.λ, 2.λ
2          3.3.1.3.3.1.λ, 3.3.1.3.3.2.λ,              1.1.λ, 1.2.λ,
           3.3.2.3.3.1.λ, 3.3.2.3.3.2.λ               2.1.λ, 2.2.λ
3          3.3.1.3.3.1.3.3.1.λ, 3.3.1.3.3.1.3.3.2.λ,  1.1.1.λ, 1.1.2.λ,
           3.3.1.3.3.2.3.3.1.λ, 3.3.1.3.3.2.3.3.2.λ,  1.2.1.λ, 1.2.2.λ,
           3.3.2.3.3.1.3.3.1.λ, 3.3.2.3.3.1.3.3.2.λ,  2.1.1.λ, 2.1.2.λ,
           3.3.2.3.3.2.3.3.1.λ, 3.3.2.3.3.2.3.3.2.λ   2.2.1.λ, 2.2.2.λ

Defining the unfolding of a recursive equation is based on the positions in a term at which unfoldings are performed and on a mechanism to replace parameters in the program body:

Definition 7.1.29 (Unfolding of a Recursive Equation). Let S = (G, t_0) be an RPS, G_i(x_1, …, x_{m_i}) = t_i a recursive equation in S, X_i = {x_1, …, x_{m_i}} the set of parameters of G_i, U_rec the set of recursion points of G_i with index set R = {1, …, |U_rec|}, and sub(x_j, r) the substitution term for x_j ∈ X_i in the recursive call of G_i at position u_r ∈ U_rec, r ∈ R. The set of all unfoldings Γ_i of equation G_i over instantiations β: X_i → T_Σ is inductively defined as the smallest set for which holds:

1. Let β_λ = β be the instantiation of variables in unfolding Γ_λ. Then Γ_λ = β_λ(t_i) and Γ_λ ∈ Γ_i.
2. Let Γ_w ∈ Γ_i be an unfolding with instantiation β_w. Then for each r ∈ R there exists an unfolding Γ_{w∘r} ∈ Γ_i with instantiations β_{w∘r}(x_j) = β_w(sub(x_j, r)) for all x_j ∈ X_i, such that Γ_{w∘r} = β_{w∘r}(t_i).


An unfolding is a term which is generated in a derivation step t →_{S,Ω} t' by applying a term rewrite rule G_i(x_1, …, x_{m_i}) → t_i (see def. 7.1.19). The resulting term is an element of language L(G_i, β) (see def. 7.1.21). For a term t which is the result of repeated application of rule G_i(x_1, …, x_{m_i}) → t_i, the unfolding indices w correspond to the sequence in which the recursive calls were replaced.

Please note that the initial instantiation of variables β_λ can be empty, that is, the variables appearing in the program body can remain uninstantiated during unfolding. For syntactical unfolding, as described above, there is no possibility to terminate unfolding for paths which could never be reached during program evaluation. For example, in figure 7.5 at each level of unfolding positions, all recursive calls were unfolded. It is not checked whether one of the conditions eq0(x_1) or eq0(pred(x_1)) is already true for the given instantiation of x_1. Unfolding of terms is a well-established concept, which we already described in chapter 6 (sects. 6.2.2.1 and 6.3.4.1). In the definition above, we simply reformulated unfolding in our terminology.
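Syntactic unfolding as in definition 7.1.29 can be sketched as a recursive expansion that replaces every recursive call at the chosen depth bound by Ω (our illustration, with the string 'OMEGA' standing for Ω and the instantiation β passed as a dictionary, possibly empty):

```python
def unfold(body, name, params, beta, depth):
    """The depth-th syntactic unfolding of G(params) = body under beta;
    depth 0 is the empty term OMEGA."""
    if depth == 0:
        return 'OMEGA'
    def go(t, env):
        if isinstance(t, str):
            return env.get(t, t)
        if t[0] == name:                              # a recursive call of G
            args = tuple(go(a, env) for a in t[1:])   # instantiated substitution terms
            return unfold(body, name, params, dict(zip(params, args)), depth - 1)
        return (t[0],) + tuple(go(s, env) for s in t[1:])
    return go(body, dict(beta))

# Linear recursive example G(x) = if(eq0(x), 0, G(pred(x))), unfolded twice
# with the (here non-empty) instantiation beta(x) = suc(0):
body = ('if', ('eq0', 'x'), '0', ('G', ('pred', 'x')))
u2 = unfold(body, 'G', ('x',), {'x': ('suc', '0')}, 2)
```

As the text notes, this is purely syntactic: the condition eq0(suc(0)) is not evaluated, and every recursive call reached at the depth bound becomes Ω.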

The synthesis problem is solved successfully if an RPS with a set of recursive equations can be induced which can recurrently explain a given initial program.

Definition 7.1.30 ((Recursive) Explanation of an Initial Program). A recursive equation G_i(x_1, …, x_{m_i}) = t_i explains an initial program t_init, if there exist an instantiation β: {x_1, …, x_{m_i}} → T_Σ of the parameters of G_i and a term t ∈ L(G_i, β), such that t_init ≤ t. G_i is a recurrent explanation of t_init if furthermore there exists a term t' ∈ L(G_i, β) which can be derived by at least two applications of G_i(x_1, …, x_{m_i}) → t_i such that t' ≤ t_init.

An RPS S = (G, t_0) explains an initial program t_init, if there exist an instantiation β: var(t_0) → T_Σ of the parameters of the program call t_0 and a term t ∈ L(S, β), such that t_init ≤ t. S is a recurrent explanation of t_init if furthermore there exists a term t' ∈ L(S, β) which can be derived by at least two applications of all rules R_S such that t' ≤ t_init.

An equation/RPS explains a set of initial programs, if it explains all terms in the set and if there is a recurrent explanation for at least one term.

Definition 7.1.30 states that a recursive equation, or an RPS defined over a set of such equations (subprograms) together with an initial program call, explains an initial program if some term t can be derived such that the initial program is completely contained in this term (see def. 7.1.26). For a recurrent explanation it must additionally hold that some term t' can be derived by repeated unfolding (that is, at least two times) such that this term t' is completely contained in the initial program.


7.2 Synthesis of RPSs from Initial Programs

Based on the definitions of RPSs, initial programs, unfoldings, and recurrent explanations introduced above, we are now ready to formulate precisely the synthesis problem which we want to solve. Before we do that, we will first discuss the relation between the fixpoint semantics of recursive functions and induction of recursive functions from initial programs, and give some characteristics of RPSs.

7.2.1 Folding and Fixpoint Semantics

As discussed in chapter 6, induction can be viewed as an inverse process to deduction. For example, in inductive logic programming, more general clauses can be computed by inverse resolution, and a term or clause can be generalized by anti-unification (Plotkin, 1969). Summers (1977) proposed that folding an initial program into a (set of) recursive equations can be seen as the inverse of fixpoint semantics (Field & Harrison, 1988; Davey & Priestley, 1990). Summers' synthesis theorem is given in section 6.3.4.1 in chapter 6.

We introduce fixpoint semantics in appendix B.1. In short, fixpoint semantics allows us to give a denotation to recursively defined syntactic functions. The semantics of a (continuous) function (over an ordered domain) is defined as the supremum of a sequence of unfoldings (expansions) of this function.

For folding, we consider a given finite program term as the i-th unfolding t_init = G^i of an unknown recursively defined syntactic function G. Finding a recursive explanation for G^i as described in definition 7.1.30 corresponds to segmenting G^i into a sequence of unfoldings G^0, G^1, …, G^i where G^k = t[u ← σ(G^{k−1})], with u as a fixed position in t and σ as substitution of parameters, holds for k = 1, …, i. In that case, based on the recurrence relation which holds for G^i, the recursive function G = t[u ← σ(G)] can be extrapolated. An example is given in table 7.5.

7.2.2 Characteristics of RPSs

In section 7.1.3.1 we already introduced two restrictions for RPSs: All variables given in the head of a recursive equation must appear in the body, and each equation contains at least one call of itself. In the following, further characteristics of the RPSs which can be folded using our approach are given.

Definition 7.2.1 (Dependency Relation of Subprograms). Let S = (G, t_0) be an RPS with a set of function variables (names of subprograms) Φ. Let H ∉ Φ be a function variable (for the unnamed main program). A relation calls_S ⊆ ({H} ∪ Φ) × Φ between subprograms and main program of S is defined as:


Table 7.5. Example of Extrapolating an RPS from an Initial Program

t_init = G^i = if(<(k,n), k, if(<(-(k,n),n), k, if(<(-(-(k,n),n),n), k, Ω)))

Segmentation into hypothetical unfoldings of a recursive function G:
G^0 = Ω
G^1 = if(<(k,n), k, Ω)
G^2 = if(<(k,n), k, if(<(-(k,n),n), k, Ω))
G^3 = G^i = t_init

Recurrence Relation and Extrapolation:
G^1 = if(<(k,n), k, G^0[k ← -(k,n)])
G^2 = if(<(k,n), k, G^1[k ← -(k,n)])
G^3 = if(<(k,n), k, G^2[k ← -(k,n)])

extrapolate:
G = if(<(k,n), k, G[k ← -(k,n)])
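The recurrence check behind table 7.5 can be sketched as regenerating the initial program from the extrapolated hypothesis: start from Ω and repeatedly plug the σ-instance of the previous segment back in at the recursion position. We illustrate this with our own code, using the Mod equation of table 7.2 as the hypothesis (helpers repeated so the fragment is self-contained):

```python
def substitute(t, sigma):
    if isinstance(t, str):
        return sigma.get(t, t)
    return (t[0],) + tuple(substitute(s, sigma) for s in t[1:])

def replace(t, u, s):
    if not u:
        return s
    i = u[0]
    return t[:i] + (replace(t[i], u[1:], s),) + t[i + 1:]

def generate(skeleton, u, sigma, depth):
    """G^0 = OMEGA, G^k = skeleton[u <- sigma(G^{k-1})]; returns G^depth."""
    g = 'OMEGA'
    for _ in range(depth):
        g = replace(skeleton, u, substitute(g, sigma))
    return g

# Hypothesis for Mod(k, n) = if(<(k, n), k, Mod(-(k, n), n)):
skeleton = ('if', ('<', 'k', 'n'), 'k', 'OMEGA')   # one segment, hole at position 3
sigma = {'k': ('-', 'k', 'n')}                     # [k <- -(k, n)]
g2 = generate(skeleton, (3,), sigma, 2)
```

A fold hypothesis (skeleton, u, σ) is accepted when generate(..., i) reproduces the given initial program; extrapolation then yields the recursive equation itself.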

calls_S = {(H, G_i) | G_i ∈ Φ, pos(t_0, G_i) ≠ ∅} ∪
          {(G_i, G_j) | G_i, G_j ∈ Φ, pos(t_i, G_j) ≠ ∅}.

The transitive closure calls*_S of calls_S is the smallest set calls*_S ⊆ ({H} ∪ Φ) × Φ for which holds:

1. calls_S ⊆ calls*_S,
2. for all P ∈ {H} ∪ Φ and G_i, G_j ∈ Φ: If P calls*_S G_i and G_i calls*_S G_j, then P calls*_S G_j.
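The calls relation and its transitive closure are easy to compute; a sketch of ours over the ModList/Mod scheme of table 7.2, with 'H' standing for the unnamed main program:

```python
def calls(rps):
    """The relation calls_S: (caller, callee) pairs, callees drawn from Phi."""
    progs = set(rps) - {'H'}               # Phi: the named subprograms
    def occurring(t):
        if isinstance(t, str):
            return set()
        found = {t[0]} if t[0] in progs else set()
        for s in t[1:]:
            found |= occurring(s)
        return found
    return {(p, g) for p, body in rps.items() for g in occurring(body)}

def transitive_closure(rel):
    """calls*_S: smallest superset of rel closed under composition."""
    closure = set(rel)
    grew = True
    while grew:
        grew = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    grew = True
    return closure

rps = {
    'H': ('ModList', 'l', 'n'),
    'ModList': ('if', ('empty', 'l'), 'nil',
                ('cons', ('if', ('eq0', ('Mod', 'n', ('head', 'l'))), 'T', 'nil'),
                 ('ModList', ('tail', 'l'), 'n'))),
    'Mod': ('if', ('<', 'k', 'n'), 'k', ('Mod', ('-', 'k', 'n'), 'n')),
}
star = transitive_closure(calls(rps))
```

The closure records that the main program reaches Mod indirectly through ModList, which is exactly what the "no unused subprograms" criterion below inspects.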

For introducing a notion of minimality of RPSs two further restrictions are necessary:

Only Primitive Recursion: All recursive equations are primitive recursive, that is, there are no recursive calls in substitutions.

No Mutual Recursion: There are no recursive equations G_i, G_j with i ≠ j such that G_i calls*_S G_j and G_j calls*_S G_i.

The first restriction was already given implicitly in definition 7.1.27, where sub maps into T_Σ(X_i). A consequence of this restriction is that we cannot infer general μ-recursive functions, such as the Ackermann function. The second restriction is only syntactical, since each pair of mutually recursive functions can be transformed into semantically equivalent functions which are not mutually recursive.

Definition 7.2.2 (Minimality of an RPS). Let S = (G, t_0) be an RPS. S is minimal if for all subprograms G_i(x_1, …, x_{m_i}) = t_i, G_i ∈ Φ, X_i = {x_1, …, x_{m_i}} holds:

No unused subprograms: H calls*_S G_i.

No unused parameters: For each instantiation β: X_i → T_Σ and instantiations β_j: X_i → T_Σ, j = 1, …, m_i constructed as


β_j(x_k) = t         if k = j, with t ∈ T_Σ, t ≠ β(x_j),
β_j(x_k) = β(x_k)    if k ≠ j,

holds L(G_i, β) ≠ L(G_i, β_j).

No identical parameters: For all β: X_i → T_Σ and all x_i, x_j ∈ X_i holds: For all unfoldings Γ_w ∈ Γ_i, w ∈ W (see def. 7.1.29) with instantiation β_w of variables, i = j follows from β_w(x_i) = β_w(x_j).

Note that similar restrictions were formulated by Le Blanc (1994). To sum up the given criteria for minimality: An RPS is minimal if it only contains recursive equations which are called at least once, if each parameter in the head of an equation is used at least once for instantiating a parameter in the body of the equation, and if there exist no parameters with different names but (always) identical instantiation. It is obvious that each RPS which does not fulfill these criteria can be transformed into one which does fulfill them by deleting unused equations and parameters and by unifying parameters with different names but identical instantiations. That is, these criteria do not restrict the expressiveness of RPSs.

A stricter definition of minimality could additionally consider the size (i.e., number of symbols) of the subprogram bodies (Osherson, Stob, & Weinstein, 1986). But this only makes sense for RPSs with a single recursive equation. In that case, minimality of a subprogram G is given if there exists no subprogram G′ with L(G′, β) ⊂ L(G, β) (Mühlpfordt & Schmid, 1998). For RPSs with more than one equation, the minimality of each single equation cannot be determined by comparison of languages: If equation G calls other equations G_i, there is always a smaller language where these calls are replaced by constant symbols. Therefore, it would be necessary to define minimality over the size of all equation bodies and the term representing the main program. There are too many degrees of freedom to define a tight criterion: For example, is an RPS consisting of a small main program but complex subprograms smaller than an RPS consisting of a complex main program and small subprograms?

To guarantee uniqueness for calculating the substitutions in a subprogram body for a given initial program, we introduce the notion of "substitution uniqueness":

Definition 7.2.3 (Substitution Uniqueness of an RPS). Let T_init ⊆ T_{Σ∪{Ω}} be a set of initial programs and S = (G, t_0) an RPS over Σ and Φ which explains T_init recursively (see def. 7.1.30). S is called substitution unique with regard to T_init if there exists no S′ over Σ and Φ which explains T_init recursively and for which holds: t′_0 = t_0; for all G_i ∈ Φ holds pos(t_i, G_i) = pos(t′_i, G_i) = U_rec and t′_i[U_rec ← Ω] = t_i[U_rec ← Ω]; and there exists an r ∈ R with sub(x_j, r) ≠ sub′(x_j, r) (see def. 7.1.27).

Substitution uniqueness guarantees that it is not possible to replace a substitution term in S such that the resulting RPS S′ still explains a given set of initial programs recursively.

7.2.3 The Synthesis Problem

Now all preliminaries are given to state the synthesis problem:

Definition 7.2.4 (Synthesis Problem). Let T_init ⊆ T_{Σ∪{Ω}} be a set of initial programs with indices I = {1, ..., |T_init|}. The synthesis problem is to induce

- a signature Σ,
- a set of function variables Φ = {G_1, ..., G_n},
- a minimal RPS S = (G, t_0) with a main program t_0 ∈ T_{Σ∪Φ}(X) and a set of recursive equations G = ⟨G_1(x_1, ..., x_{m_1}) = t_1, ..., G_n(x_1, ..., x_{m_n}) = t_n⟩

such that

- S recursively explains T_init (def. 7.1.30), and
- S is substitution unique (def. 7.2.3).

In the following section, an algorithm for solving the synthesis problem will be presented. Identification of a signature Σ is dealt with only implicitly: the RPS is constructed exactly over such symbols which are necessary for explaining the initial programs recursively. Because the folding mechanism should be independent of the way in which the initial programs were constructed (by a planner, as input of a system user), we allow for incomplete unfoldings, and we allow instantiated initial programs as well as programs over parameters (i.e., generic traces), where parameters are treated just like constants.

If synthesis starts with only a single initial program, the main program can contain no parameters (except the parameters which were identified as "constants"). If a parameterized main program is searched for (which is the usual case), we assume that all parameters inferred for the subprograms which are called from the main program are also parameters of the main program itself.

7.3 Solving the Synthesis Problem

In this section, the algorithms for constructing an RPS from a set of initial programs (trees) are presented: For each given initial tree, in a first step, a segmentation is constructed (sect. 7.3.1), which corresponds to the unfoldings (see def. 7.1.29) of a hypothetical subprogram. As intermediate steps, first a set of hypothetical recursion points (see def. 7.1.27) is identified and a skeleton of the body of the searched-for subprogram is constructed.

If recursion points do not explain the complete initial tree, all unexplained subtrees are considered as a new set of initial trees for which the synthesis algorithm is called recursively, and the subtrees are replaced by the name G_i of the to-be-inferred subprogram (sect. 7.3.3). If an initial tree cannot be segmented starting from the root, a constant initial part is identified and included in the body of the calling function, that is, in t_0 for top-level subprograms and in G_i if calls(G_i, G_j) holds and G_j is the currently considered hypothetical subprogram (see figure 7.13 in section 7.3.5.2).

For a given segmentation, the maximal pattern is constructed by including all parts of initial trees which are constant over all segments into the subprogram body (sect. 7.3.2). All remaining parts of the segments are considered as parameters and their substitutions. That is, in a last step, a unique substitution rule must be calculated and, as a consequence, the parameter instantiations of the function call are identified (sect. 7.3.4). The only backtrack-point for folding is calculating a valid segmentation for the initial trees.

We begin each subsection with an illustration. For readers not interested in the formal details, it is enough to study the illustration to get the general idea of our approach.

7.3.1 Constructing Segmentations

In the following, we first give an informal illustration. Then we introduce some concepts for constructing segmentations of initial programs. Afterwards, we introduce criteria which must hold for a segmentation to be valid. Finally, we present an algorithm for constructing valid segmentations.

7.3.1.1 Segmentation for `Mod'. Consider the simple RPS for calculating Mod(k, n), which consists of a single recursive equation and where the main program t_0 is just the call of this equation:

S = ⟨G, t_0⟩
G = ⟨ Mod(k, n) = if(<(k, n), k, Mod(−(k, n), n)) ⟩
t_0 = Mod(k, n)

Remember that all recursive equations are considered as "subprograms". The main program, in general, can be a term which calls one or more of the recursive subprograms of the RPS (see sect. 7.1.3.1).
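The finite terms used as input for folding can be generated by unfolding such an equation a fixed number of times and truncating the remaining call with Ω. A minimal sketch in Python, assuming terms are encoded as nested tuples with the operator in the first slot (our encoding, not the book's):

```python
# unfold(n) produces the n-th unfolding of
#   Mod(k, n) = if(<(k, n), k, Mod(-(k, n), n))
# with the remaining recursive call truncated to 'Omega'.
# Terms are nested tuples: ('if', t1, t2, t3); leaves are symbols.

def unfold(depth, k='k', n='n'):
    if depth == 0:
        return 'Omega'
    return ('if', ('<', k, n), k, unfold(depth - 1, ('-', k, n), n))

t_init = unfold(3)
print(t_init)
```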

In figure 7.6 a term is given which can be folded into this RPS. For better readability, we do not give an initial instantiation for parameters k and n. The first and crucial step of folding is to identify a valid recurrent segmentation of an initial program term t_init. For the given term, such a segmentation is marked in the figure.

As a first step, all paths in the term leading to Ωs are identified; that is, paths to positions where the unfolding of the hypothetical (searched-for) RPS was stopped. These paths constitute the skeleton of the term, which is defined as the minimal pattern of the term (see def. 7.1.11) which contains all Ωs and where all sub-terms which are not on paths to Ωs are ignored.

[Fig. 7.6. Valid Recurrent Segmentation of Mod: the initial term t_init built from if, <, and − over k and n; its skeleton; the hypothesis about the program body; and Segments 1 to 4. Variable leaves y, y′, y1, ..., y6 mark ignored subtrees, and a single Ω marks the truncation point.]

Then, a set of hypothetical recursion points U_rec (see def. 7.1.27) is generated. There are different possible strategies to obtain such a set from t_init. The most simple hypothesis is that, beginning at the root, each node in the skeleton constitutes an unfolding position. For our example, this hypothesis is U_rec = {3.}. We can construct hypothetical unfolding indices W from U_rec (see def. 7.1.28), which leads to the segmentation given in figure 7.5.

Replacing all subtrees at positions U_rec in the skeleton by Ωs results in a hypothesis about the program body, t_skel. For our example, the segmentation induced by U_rec = {3.} is valid because each segment consists of the same (non-trivial) pattern and because all Ωs (here only a single one) can be reached and this Ω has a position which corresponds to a recursion point (unfolding position). The valid segmentation is recurrent because the skeleton of the upper-most segment, t_skel, constitutes a non-trivial pattern for all subtrees t|_u of the initial program where u is a hypothetical recursion point. The notion of a valid recurrent segmentation follows directly from the definition of a recursive explanation of a recursive program (see def. 7.1.30).

If a valid recurrent segmentation can be found, the result of this first step of folding is a skeleton of the sub-program body. For our example it is if(y, y', G), where G is a new function variable in Φ.

As stated in the definition of the synthesis problem (see def. 7.2.4), input into the folding algorithm is a set of initial programs T_init. That is, a user or another system can provide more than one example program, which are considered as unfoldings of the same searched-for RPS with different instantiations and/or different unfolding depths. In our example, T_init consisted of a single initial program. For some definitions below it is enough to consider a single program t_init ∈ T_init; for others it is important that the complete set is taken into account.

In general, finding a valid recurrent segmentation can be much more complicated than illustrated for Mod:

- It might not be possible to find a valid recurrent segmentation starting at the root of the initial tree; that is, there exists a constant initial part of the term which belongs to the main program t_0. In that case, this initial part is introduced in t_0 and segmentation starts with T_init as the set of all sub-terms of this initial part.

- The recurrence relation underlying the term can be of arbitrary complexity. That is, in general, the skeleton does not consist of just a single path. Our strategy for constructing segmentations is to find segmentations which start as high in the given tree as possible and which cover as large a part of the tree as possible. As we will see in the algorithm below, we search for hypothetical recursion points "from left to right" in the tree.

- There might exist sub-terms in the given initial program which contain Ωs but which cannot be covered within any valid recurrent segmentation. In this case, it is assumed that these subtrees belong to calls of a further recursive equation. The positions of such subtrees are considered as "sub-schema positions" U_sub, and finding valid recursive segmentations for these subtrees is dealt with by starting the folding algorithm again with these subtrees as a new set of initial trees.

- We allow for incomplete unfoldings. That is, some paths in the initial program are truncated before an unfolding is complete. For example, for the recursive equation

  addlist(l) = if(empty(l), 0, +(head(l), addlist(tail(l))))

  the last unfolding could result in an Ω at the position of +, because the else-case is not considered if list l is empty. Although we consider unfolding as a purely syntactical process, it might be useful to take into account that construction of the initial program was based on some kind of semantic information (such as evaluation of expressions). Consequently, when constructing a segmentation, it is allowed that a segment contains an Ω above the hypothetical unfolding position (but not below!).

All these cases are covered in our approach, which we will now introduce more formally.

7.3.1.2 Segmentation and Skeleton. The first and crucial step for inducing an RPS from a set of initial programs is the identification of valid, recurrent segmentations of the initial trees. When searching for a segmentation, at first only such nodes of the initial trees are considered which lie on paths leading to hypothetical recursion points. These paths constitute a special (minimal) pattern of an initial tree, called skeleton:

Definition 7.3.1 (Skeleton). The skeleton of a term t ∈ T_{Σ∪{Ω}}(X), written skeleton(t), is the minimal pattern (def. 7.1.11) of t for which holds pos(t, Ω) = pos(skeleton(t), Ω).

Let U ⊆ pos(t, Ω) in t with t|_u = Ω for all u ∈ U. Then we define skeleton(t, U) as that skeleton of term t which contains only the Ωs at positions u ∈ U, that is, the minimal pattern with pos(skeleton(t, U), Ω) = U.
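Definition 7.3.1 can be operationalized directly: keep exactly the nodes lying on a path to some Ω and replace every maximal Ω-free subtree by a fresh variable. A minimal sketch in Python, with terms as nested tuples and 'Omega' standing for Ω (our encoding; the variable names y1, y2, ... are ours):

```python
# skeleton(t) keeps only the paths leading to 'Omega'; every maximal
# subtree containing no 'Omega' is replaced by a fresh variable leaf.

def contains_omega(t):
    if t == 'Omega':
        return True
    return isinstance(t, tuple) and any(contains_omega(s) for s in t[1:])

def skeleton(t, counter=None):
    counter = counter if counter is not None else [0]
    if t == 'Omega':
        return t
    if not contains_omega(t):
        counter[0] += 1
        return 'y%d' % counter[0]   # fresh variable for an ignored subtree
    return (t[0],) + tuple(skeleton(s, counter) for s in t[1:])

# The first Mod segment of figure 7.6:
t = ('if', ('<', 'k', 'n'), 'k', 'Omega')
print(skeleton(t))
```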

An example skeleton for Mod is given in figure 7.5. This is the simple case of a single initial program which can be explained by an RPS consisting of a single recursive equation and a main program which just calls this equation. Note that the paths under consideration lead to Ωs, that is, truncation points of the unfolding of a hypothetical RPS.³

In general, it can be necessary for finding a recursive explanation of an initial program to identify a subprogram G_i which calls further subprograms. The calls of these subprograms must be at fixed positions in G_i, called sub-schema positions⁴. In that case, the skeleton of G_i does not contain all paths leading to Ωs (see def. 7.3.1). For these remaining Ωs it must hold that they can be reached over paths from the hypothetical sub-schema positions. Recursion points and sub-schema positions constitute a hypothesis for inducing an RPS.

Definition 7.3.2 (Recursion Points and Sub-Schema Positions). Let U_rec and U_sub be two sets of positions with U_rec ≠ ∅. If for all u_sub ∈ U_sub and all positions u above u_sub (u_sub = u ∘ i, i ∈ ℕ) holds

1. ∄ u′_rec ∈ U_rec with u_sub ≤ u′_rec, and
2. ∃ u_rec ∈ U_rec with u ≤ u_rec,

then (U_rec, U_sub) is a hypothesis over recursion points U_rec and sub-schema positions U_sub.

For the initial program given in figure 7.7 (see tab. 7.2 for the underlying RPS) the recursion point is 3.2., that is, the second "if" on the rightmost path. The sub-schema position is 3.1., that is, the "if" in the left path under "cons". For this position holds that it is not above the recursion point (condition 1 of def. 7.3.2), but it can be reached over a path leading to this recursion point (condition 2 of def. 7.3.2): u_sub = u ∘ 1 = 3.1. for u = 3.

³ For initial trees generated by a planner or by a system user, that means that cases for which the continuation is unknown or not calculated must be marked by a special symbol.

⁴ Because the inference of a further sub-program can again involve a constant initial part, not just a recursive equation but an additional "main" could be inferred, which will then be integrated in the calling subprogram. Therefore, we speak of sub-schema positions rather than subprogram positions.

[Fig. 7.7. Initial Program for ModList: a finite tree over if, empty, nil, cons, eq0, head, tail, <, and − applied to the list [9, 4, 7] and the number 8. The recursion point 3.2. and the sub-schema position 3.1. are marked, and truncated unfoldings end in Ω.]

7.3.1.3 Valid Hypotheses. Recursion points and sub-schema positions constitute a valid hypothesis if they are on paths leading to Ωs and where the Ωs are at positions corresponding to recursion points or, for incomplete unfoldings, above recursion points. For example, in the initial tree given in figure 7.7 the last unfolding is incomplete: after the `if'-node in the right-most path another `cons'-node should follow; instead, there is an Ω.

Definition 7.3.3 (Valid Recursion Points and Sub-Schema Positions). Let t_init be an initial tree and (U_rec, U_sub) a hypothesis. (U_rec, U_sub) is a valid hypothesis for t_init if

1. ∀u ∈ U_sub: If u ∈ pos(t_init), then pos(t_init|_u, Ω) ≠ ∅,
2. ∀u ∈ U_rec: u ∈ pos(t_init) and (U_rec, U_sub) is valid for t_init|_u, or
3. ∀u ∈ U_rec: ∃u_Ω ∈ pos(t_init) with u_Ω < u and t_init|_{u_Ω} = Ω.

The first condition of definition 7.3.3 ensures that sub-schema positions are at such positions in the initial tree where the subtrees contain at least one Ω. Condition 2 recursively ensures that the hypothesis holds for all sub-terms at the recursion points. Condition 3 ensures that the Ωs have positions at or above recursion points.

Up to now we have only regarded the structure of an initial tree. A valid segmentation additionally must consider the symbols of the initial tree (i.e., the node labels). Therefore, hypothesis (U_rec, U_sub) will be extended to a term t_skel which represents a hypothesis about a sub-program body.

Definition 7.3.4 (Valid Segmentation). Let t_init ∈ T_{Σ∪{Ω}} with pos(t, Ω) ≠ ∅ be an initial program. Let (U_rec, U_sub) be a valid hypothesis about recursion points and sub-schema positions in t. Let t_skel ∈ T_{Σ∪{Ω}}(X) be a term with t_skel ⊑ t_init (def. 7.1.26) and U_rec ∪ U_sub = pos(t_skel, Ω). A segmentation given by (U_rec, U_sub) together with t_skel is valid for t_init if, for

U_∈ = U_rec ∩ pos(t_init) (the set of recursion points in t_init), and
U_∉ = U_rec ∖ U_∈ (the set of recursion points not in t_init),

the following holds:

Reachability of Ωs: For all u ∈ U_∉ there exists a u_Ω ∈ pos(t_init, Ω) with u_Ω < u.

Validity of the Skeleton: Given the set of all positions of Ωs in t_init which are above the missing recursion points as U_Ω = {u_Ω | u_Ω < u, u_Ω ∈ pos(t_init, Ω), u ∈ U_∉}, it holds t_skel[U_Ω ← Ω] ⊑ t_init.

Validity of Segmentation in Subtrees: For all u ∈ U_∈ holds: (U_rec, U_sub) together with t_skel is a valid segmentation for t|_u.

The definition is somewhat complicated because we are allowing incomplete unfoldings. Note that we currently are not concerned with the question of how recursion points are obtained. We just assume that a non-empty set of hypothetical recursion points U_rec is at hand, which can contain positions found in the given t_init as well as positions which are not in t_init.

The term t_skel is a hypothesis about the skeleton of the body of a searched-for subprogram. It contains Ωs (truncations of unfoldings) at the hypothetical recursion points and sub-schema positions. For incomplete unfoldings, for each given recursion point there must exist an Ω at or above this point ("reachability"). The set U_Ω contains all such positions, and the term t_skel[U_Ω ← Ω] is the skeleton of the hypothetical subprogram body reduced to the size of the currently investigated and possibly incomplete sub-term. For a valid hypothesis it must further hold that the currently investigated term contains a sub-term with at least one Ω and that there are no further Ωs in t_init. The positions in U_sub cover all Ωs which are not generated by the currently investigated subprogram ("validity of skeleton"). Finally, these conditions must hold for all further hypothetical segments ("validity of segmentation in subtrees"). Note that for a tree t|_u which is a subtree of t_init with a recursion point as root, it can happen that not all positions in U_rec are given!

If we extend this definition from one initial tree t_init to a set of initial trees T_init, we require that at least one of the initial programs contains a completely unfolded segment:

Definition 7.3.5 (Segmentation of a Set of Initial Programs). (U_rec, U_sub) together with t_skel is a valid segmentation of a set of initial programs T_init if there exists a t_init ∈ T_init with t_skel ⊑ t_init and if (U_rec, U_sub) together with t_skel is a valid segmentation of all t ∈ T_init as defined in 7.3.4.

A valid segmentation partitions an initial tree into a sequence of segments. These segments can be indexed in analogy to the unfoldings of a subprogram (def. 7.1.29):

Definition 7.3.6 (Segments of an Initial Program). Let T_init ⊆ T_{Σ∪{Ω}} be a set of initial programs and E a set of indices with t_e ∈ T_init for all e ∈ E. Let R = {1, ..., |U_rec|} be a set of indices of recursion points. Let (U_rec, U_sub) together with t_skel be a valid segmentation for all t_e ∈ T_init and W the set of indices over segments constructed over U_rec. Let G be a new function variable with G ∉ Σ. Then we define segment(t_e, U_rec, w), with G as name for the subprogram, as the segment of t_e with respect to recursion points U_rec with segment index w ∈ W inductively:

1. segment(t, U_rec, λ) = t[(U_rec ∩ pos(t)) ← G] if t ≠ Ω, and ⊥ otherwise

2. segment(t, U_rec, r.v) = segment(t|_{u_r}, U_rec, v) if r ∈ R and u_r ∈ pos(t), and ⊥ otherwise.

Function segment(t_e, U_rec, w) returns, for a given initial program t_e and a set of hypothetical recursion points U_rec, the according segment with index w if it is contained in the tree, and otherwise "unknown".⁵ A new function variable is inserted at the position of the recursion point, which will be the name of the to-be-induced subprogram. For the case of incomplete unfoldings, the Ωs positioned above hypothetical recursion points remain in the tree. It is not possible that a hypothetical subprogram body consists of the subprogram name G only (item 1 in def. 7.3.6).

The segments of an initial program correspond to unfoldings of a hypothetical recursive equation as defined in 7.1.29. But (up to now) segments can additionally contain subtrees (unfoldings) of further recursive equations (see sect. 7.3.3).

The set of all indices for a given initial program with a given segmentation can be defined as follows:

Definition 7.3.7 (Segment Indices). Let T be a set of initial trees indexed over E with t_e ∈ T for all e ∈ E. Let (U_rec, U_sub) together with t_skel be a valid segmentation for all terms in T and let R = {1, ..., |U_rec|} be a set of indices of recursion points and W the set of unfolding indices calculated over R (def. 7.1.28). Then W(e) is the set of indices over the segments contained in an initial tree t_e with W(e) = {w | w ∈ W, segment(t_e, U_rec, w) ≠ ⊥}.

For inducing a recursive subprogram from a given initial program, this initial program must be of "sufficient size"; that is, it must contain a sufficient number of (hypothetical) unfoldings such that all information necessary for folding can be obtained from the given tree. Especially, it is necessary that each hypothetical subprogram is unfolded at least once at each hypothetical recursion point. A hypothesis over recursion points of a subprogram can only lead to successful folding if the following proposition holds:

Lemma 7.3.1 (Recurrence of a Valid Segmentation). Let (U_rec, U_sub) together with t_skel be a valid segmentation for a set of initial programs T_init and let R = {1, ..., |U_rec|} be the set of indices over recursion points U_rec. (U_rec, U_sub) can only generate a subprogram which recursively explains T_init if for each recursion point u_r ∈ U_rec there exists an initial tree t_e ∈ T_init such that there exists w ∈ W(e) with w = v ∘ r, v ∈ W(e), r ∈ R.

Proof: follows directly from definition 7.1.30.

Lemma 7.3.1 is a weak consequence of the definition of recursive explanations (def. 7.1.30), because it is not required that segments correspond to complete unfoldings. That is, we have given just a necessary but not a sufficient condition for successful induction.

7.3.1.4 Algorithm. The definition of a valid segmentation (def. 7.3.4) allows us to formulate some relations which can be used for an algorithmic construction of a segmentation hypothesis:

⁵ Note that we use the symbol ⊥ to represent an unknown segment rather than Ω, to discriminate between an undefined sub-term in a term (Ω) and an undefined hypothetical function body (⊥).

- Because of the conditions t_skel ⊑ t_init and U_rec ⊆ pos(t_skel), U_rec ⊆ pos(t_init) must hold.

- From the condition "validity of segmentation in subtrees" it follows for all u_rec ∈ U_rec that node(t_init, u_rec) = node(t_init, λ). That is, only such positions in an initial tree can be candidates for recursion points which contain the same symbol as the root node.⁶

- The positions of further sub-schemata can be inferred from the uppermost unfolding in t_init: U_sub = {u | u ∈ lpos(skeleton(t_init[U_rec ← Ω])) ∖ U_rec, pos(t_init|_u, Ω) ≠ ∅}. (Remember that lpos(t) returns the positions of leaf nodes of a term, see def. 7.1.6.)

- Consequently, a hypothesis t_skel about the skeleton of the subprogram body can be constructed: t_skel = skeleton(t_init[(U_rec ∪ U_sub) ← Ω]).

As mentioned before, constructing a hypothesis about the set of possible recursion points U_rec from a given set of initial trees is the first and crucial step of inducing an RPS. This hypothesis determines the set of sub-schema positions and the skeleton of a searched-for subprogram body. It represents an assumption (U_rec, U_sub) about the segmentation of an initial tree which corresponds to the i-th unfolding of a searched-for recursive equation. The validity of a hypothesis can be initially checked using the criteria given in definition 7.3.4, and it can be checked whether it holds recursively over a complete initial tree using lemma 7.3.1. If validation fails, a new set of recursion points U_rec must be constructed. If no such set can be found, the searched-for RPS does not consist of a main program which just calls a recursive equation, and the search for recursion points must start at deeper levels of the tree. In the worst case, the complete tree would be regarded as main program with an empty set of recursive equations.⁷

For constructing possible sets of recursion points, each given initial tree must be completely traversed. The construction is based on a strategy which determines the sequence in which possible hypotheses are generated. The proposed strategy ensures that minimal subprograms which explain maximally large parts of an initial tree are constructed:

Search for a segmentation, starting with U_rec = ∅ at position u = 1., considering the following cases:

Case 1: (The initial trees are completely traversed.)
If there exists no further position:
If U_rec ≠ ∅, then construct U_sub and t_skel with respect to U_rec. If (U_rec, U_sub) together with t_skel is a valid segmentation, then stop, else backtrack.

⁶ Note that it is possible that a given initial tree contains a constant part belonging to the body of the calling function, such that the initial tree for which a recursive subprogram is to be induced has a root which is at a deeper position.

⁷ Because we search for an RPS with at least one recursive equation, our algorithm terminates already at an earlier point if the remaining tree is so small that it is not possible to find at least one unfolding for each hypothetical recursion point.


If U_rec = ∅, then no recursion points could be found and, consequently, there exists no subprogram which recursively explains the initial trees.

Case 2: (The node at position u is a recursion point.)
Set U_rec = U_rec ∪ {u}. Construct U_sub and t_skel with respect to U_rec. If (U_rec, U_sub) together with t_skel is a valid segmentation, then progress with U_rec to the next position u′ right of u. Otherwise go to case 3.

Case 3: (The node at position u is lying above a recursion point.)
If there exists a position u′ = u ∘ 1 in the initial trees, progress with U_rec at position u′. Otherwise go to case 4.

Case 4: (There are no nodes lying below the current node.)
Progress with U_rec at the next position u′ right of u.

This strategy realizes a search for recursion points from left to right, where for each hypothetical recursion point the search proceeds downward. The function for calculating the next position to the right is given in table 7.6.

Table 7.6. Calculation of the Next Position on the Right

Function: nextPosRight: T_{Σ∪{Ω}} × pos(t_init) → pos(t_init) for all t_init ∈ T_init
Pre: T_init is a set of initial trees; u is a position in the initial trees
Post: next position right of u if such a position exists, ⊥ otherwise

1. IF u = λ
2. THEN return ⊥
3. ELSE
   a) Let u = u′ ∘ k
   b) IF ∃t ∈ T_init with u′ ∘ (k + 1) ∈ pos(t)
   c) THEN return u′ ∘ (k + 1)
   d) ELSE return nextPosRight(T_init, u′).

7.3.2 Constructing a Program Body

For a given valid segmentation, the body of a hypothetical subprogram can be constructed. For each node of a segment it must be decided whether it belongs to

- the body of the subprogram, or
- the parameter instantiation, or
- a further subprogram.

In this section we will only consider subprograms without calls of further subprograms, that is, U_sub = ∅. An extension for subprograms with additional subprogram calls is given in section 7.3.3. In the following, we will first give an example for constructing the program body. Afterwards, we will introduce a theorem concerning the maximization of the program body. Finally, we will present an algorithm for calculating the program body for a given segmentation of a set of initial trees.

7.3.2.1 Separating Program Body and Instantiated Parameters. For illustration we use a simple linear subprogram: the recursive equation for calculating the factorial of a natural number. This subprogram and its third unfolding are given in table 7.7. The parameter x is instantiated with 2: in the first case represented as succ(succ(0)) and in the second case as pred(3) (short for pred(succ(succ(succ(0))))).

Table 7.7. Factorial and Its Third Unfolding with Instantiation succ(succ(0)) (a) and pred(3) (b)

Subprogram: G(x) = if(eq0(x), 1, *(x, G(pred(x))))

Unfolding (a): Third unfolding for β(x) = succ(succ(0))
t_init = if(eq0(succ(succ(0))), 1, *(succ(succ(0)),
         if(eq0(pred(succ(succ(0)))), 1, *(pred(succ(succ(0))),
         if(eq0(pred(pred(succ(succ(0))))), 1, *(pred(pred(succ(succ(0)))), Ω))))))

Unfolding (b): Third unfolding for β(x) = pred(3)
t_init = if(eq0(pred(3)), 1, *(pred(3),
         if(eq0(pred(pred(3))), 1, *(pred(pred(3)),
         if(eq0(pred(pred(pred(3)))), 1, *(pred(pred(pred(3))), Ω))))))
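The segments of table 7.8(a) can be reproduced mechanically from unfolding (a): the segment with index w is the subterm reached by following the recursion point |w| times, with the subtree at the recursion point cut out and replaced by the function variable G. A minimal sketch in Python with terms as nested tuples and the recursion point 3.2. as the tuple (3, 2) (our encoding, not the book's):

```python
# t_init: third unfolding of G(x) = if(eq0(x), 1, *(x, G(pred(x))))
# for x = succ(succ(0)), written with nested tuples.

SSZ = ('succ', ('succ', '0'))

def step(x):
    return ('if', ('eq0', x), '1', ('*', x, 'Omega'))  # placeholder at 3.2.

def plug(t, u, s):
    # replace the subterm of t at position u by s
    if u == ():
        return s
    i = u[0]
    return t[:i] + (plug(t[i], u[1:], s),) + t[i + 1:]

def subterm(t, u):
    for i in u:
        t = t[i]
    return t

# build t_init: three unfoldings, innermost call truncated to 'Omega'
u_rec = (3, 2)
t2 = step(('pred', ('pred', SSZ)))
t1 = plug(step(('pred', SSZ)), u_rec, t2)
t_init = plug(step(SSZ), u_rec, t1)

# segments as in table 7.8(a): cut at the recursion point, insert 'G'
def segment(t, u):
    return plug(t, u, 'G')

seg0 = segment(t_init, u_rec)                  # index w = lambda
seg1 = segment(subterm(t_init, u_rec), u_rec)  # index w = 1.
print(seg0)
```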

A valid recurrent segmentation for t_init is U_rec = {3.2.}, and the skeleton is t_skel = if(y1, y2, *(y3, Ω)). The segments for t_init are given in table 7.8.

Table 7.8. Segments of the Third Unfolding of Factorial for Instantiation succ(succ(0)) (a) and pred(3) (b)

(a) Index w   Segment
    λ         if(eq0(succ(succ(0))), 1, *(succ(succ(0)), G))
    1.        if(eq0(pred(succ(succ(0)))), 1, *(pred(succ(succ(0))), G))
    1.1.      if(eq0(pred(pred(succ(succ(0))))), 1, *(pred(pred(succ(succ(0)))), G))

(b) Index w   Segment
    λ         if(eq0(pred(3)), 1, *(pred(3), G))
    1.        if(eq0(pred(pred(3))), 1, *(pred(pred(3)), G))
    1.1.      if(eq0(pred(pred(pred(3)))), 1, *(pred(pred(pred(3))), G))

The program body can be constructed as the maximal pattern (see def. 7.1.13) of all given segments. For the segments constructed from the initial tree with instantiation β(x) = succ(succ(0)), the program body therefore is

t_G′ = if(eq0(x′), 1, *(x′, G′)).

The body is identical with the definition of factorial given in table 7.7.

7.3 Solving the Synthesis Problem 201

For the segments constructed from the initial tree with instantiation β(x) = pred(3), the program body is

t_G'' = if(eq0(pred(x'')), 1, *(pred(x''), G'')).

A part of the parameter instantiation has become part of the program body! This is a consequence of defining the program body as the maximal pattern of the segments of the initial trees. If an instantiation is given which shares a non-trivial pattern (i.e., the anti-instance is not just a variable; see def. 7.1.11) with the substitution term (in the recursive call), then this pattern will be assumed to be part of the program body. A simple practical solution is to present the folding algorithm with a set of initial trees with different instantiations.

In the following, it will be shown that if a recursive equation exists for a set of initial trees, then we can find a recursive subprogram with a "maximized" body. This resulting subprogram together with the found parameter instantiation is equivalent to the "intended" subprogram with the "intended" instantiation. Of course, it does not hold that the induced program and the intended program are equivalent for all possible parameter instantiations!

7.3.2.2 Construction of a Maximal Pattern. The body of a subprogram is constructed by including all nodes in the skeleton which are common over all segmentations. All subtrees which differ over segments are considered as instantiated parameters, that is, the initial instantiations of the parameters when the subprogram is called and the instantiations resulting from substitutions of parameters in recursive calls. A given set of initial trees, in the following called examples, can contain trees with different initial parameter instantiations, as shown in the factorial example above.

Theorem 7.3.1 (Maximization of the Body). For a set of example initial trees, let E be a set of example indices e with |E| ≥ 1. For each recursive subprogram G(x_1, ..., x_n) = t_G with X = {x_1, ..., x_n} and t_G ∈ T_{Σ∪{G}}(X), together with initial instantiations β_e : X → T_Σ for all x ∈ X and all e ∈ E, there exists a subprogram G'(x_1, ..., x_{n'}) = t_G' with X' = {x_1, ..., x_{n'}} and t_G' ∈ T_{Σ∪{G'}}(X'), together with initial instantiations β'_e : X' → T_Σ for all x ∈ X' and all examples e ∈ E, such that L(G, β_e) = L(G', β'_e) for all e ∈ E. Additionally, for each x ∈ X' it holds that the instantiations which can be generated by G' from β'_e do not share a common non-trivial pattern.

Proof: see appendix B.2.

Theorem 7.3.1 states that if a recursive subprogram G exists for a set of initial trees T_init = {t_e | e ∈ E} which can generate all t_e ∈ T_init for a given initial instantiation β_e, then there also exists a subprogram G' such that the instantiations (over all segments and all examples) do not share a non-trivial pattern (a common prefix). Therefore, it is feasible to generate a hypothesis about the subprogram body for a given segmentation with recursion points U_rec by calculating the maximal pattern of the segments.

202 7. Folding of Finite Program Terms

Theorem 7.3.2 (Construction of a Subprogram Body). Let T_init ⊆ T_{Σ∪{Ω}} be a set of initial trees with t_e ∈ T_init for all e ∈ E. Let (U_rec, U_sub) be a valid segmentation of T_init with U_sub = ∅ and G the function variable used for constructing the segments. The maximal pattern ^t_G ∈ T_{Σ∪{G}}(X) of all segments segment(t_e, U_rec, w) is the resulting hypothesis about the program body.

Proof: Follows from theorem 7.3.1 and definition 7.3.6.

The variable instantiations for a maximal pattern ^t_G are given by the subtrees at the positions where the segmentations differ, or (for incomplete unfoldings) by ⊥ if these positions do not occur in a given tree (see def. 7.3.6).

Definition 7.3.8 (Variable Instantiations). Let T ⊆ T_{Σ∪{Ω}} be a set of terms indexed over E, ^t_G the maximal pattern of terms in T, and X = var(^t_G) the set of variables in the maximal pattern. The instantiation β_e : X → T_Σ ∪ {⊥} of variables x ∈ X at positions u ∈ pos(^t_G, x) in term t_e ∈ T is defined as

β_e(x) = t_e|_u  if u ∈ pos(t_e),
         ⊥       otherwise.

7.3.2.3 Algorithm. The maximal pattern of a set of terms can be calculated by first-order anti-unification as described in definition 7.1.16 in section 7.1.2. Only complete segments are considered. For incomplete segments, it is in general not possible to obtain a consistent introduction of variables during generalization. An example problem, using two terms of Fibonacci (introduced in table 7.3), is given in table 7.9.

Table 7.9. Anti-Unification for Incomplete Segments

We define t ⊓ Ω = Ω ⊓ t = t.
Complete Segment:   t_1 = if(eq0(2), 1, if(eq0(pred(2)), 1, +(G, G)))
Incomplete Segment: t_2 = if(eq0(pred(pred(2))), 1, Ω)
Anti-Unification:   if(eq0(x_1), 1, if(eq0(pred(2)), 1, +(G, G)))

Anti-unification gives us the maximal body ^t_G of a subprogram G. The variables in the subprogram represent those subtrees which differ over the segments of an initial tree. After this step of folding, the instantiations of these variables are still a finite set of terms, namely the differing subtrees. In section 7.3.4 it will be shown how this finite set is generalized by inferring input variables, their initial instantiation, and the substitution of these variables in the recursive call.
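The anti-unification step just described can be sketched as follows. This is a minimal assumed implementation (the names anti_unify, subst, and counter are ours, not the book's), including the rule t ⊓ Ω = t of table 7.9 so that incomplete segments are absorbed rather than generalized away.

```python
# First-order anti-unification on terms represented as nested tuples.
OMEGA = 'Omega'

def anti_unify(s, t, subst, counter):
    """Return the most specific generalization of s and t.
    Differing subterm pairs are replaced by variables x1, x2, ...; the
    mapping in `subst` makes equal difference pairs reuse the same variable."""
    if s == OMEGA:          # t  ⊓ Ω = t, Ω ⊓ t = t (table 7.9)
        return t
    if t == OMEGA:
        return s
    if s == t:
        return s
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        return (s[0],) + tuple(anti_unify(a, b, subst, counter)
                               for a, b in zip(s[1:], t[1:]))
    if (s, t) not in subst:
        counter[0] += 1
        subst[(s, t)] = 'x%d' % counter[0]
    return subst[(s, t)]

# The Fibonacci example of table 7.9:
t1 = ('if', ('eq0', '2'), '1',
      ('if', ('eq0', ('pred', '2')), '1', ('+', 'G', 'G')))
t2 = ('if', ('eq0', ('pred', ('pred', '2'))), '1', OMEGA)
pattern = anti_unify(t1, t2, {}, [0])
# pattern == ('if', ('eq0', 'x1'), '1',
#             ('if', ('eq0', ('pred', '2')), '1', ('+', 'G', 'G')))
```

The variable x1 covers the subterms 2 and pred(pred(2)) that differ between the two segments, exactly as in the last row of table 7.9.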

7.3.3 Dealing with Further Subprograms

Up to now we have ignored the case that there are paths leading to Ωs which cannot be explained by the constructed segmentation and that cannot be generated by the resulting maximal subprogram body. In this section we will present how folding works if U_sub is not empty. First, we give an illustrative example. Then we introduce decomposition of initial trees with respect to positions in U_sub. We show that integrating calls to further subprograms in the body of a subprogram called from the main program t_0 is well-defined. Finally, we describe how the original initial tree can be reduced by replacing subtrees corresponding to calls of further subprograms by their name.

7.3.3.1 Inducing Two Subprograms for `ModList'. Remember the ModList example presented in table 7.2 and the example initial tree from which this RPS can be inferred, given in figure 7.7. Let us look at this initial tree again (see fig. 7.8): We can find a first segmentation which explains the rightmost path leading to an Ω with U_rec = {3.2.}. But the initial tree contains further subtrees with Ω-leafs, that is, U_sub = {3.1.} ≠ ∅. If the found first segmentation is valid, there must exist a further subprogram which can generate these remaining subtrees containing Ωs (marked with black boxes in the figure). In segments one to three (segment four is an incomplete unfolding, as discussed above), the left-hand subtrees of the `cons'-nodes contain these yet unexplained subtrees.

Folding has started with inferring an RPS S = (G, t_0), and a promising segmentation has been found which might result in a subprogram G_1 ∈ G and a main program t_0 which calls G_1. Now, the folding algorithm is called recursively with the unexplained subtrees as new sets of initial trees T^u_init with u ∈ U_sub. For the ModList example, there exists only one position in U_sub and we have three example trees which must be explained. The recursive call of the folding algorithm results again in finding a recursive program scheme S^u = (G^u, t^u_0), and not just a recursive subprogram in G. This is because the call of the further subprogram might be embedded in a constant term t^u_0 which later becomes part of the body of the calling subprogram G_1. That is, we start to infer a sub-scheme which afterwards can be integrated in the searched-for RPS.

sear hed-for RPS.

Again, the first step of folding is to find a valid recurrent segmentation. In figure 7.9 the new set of initial trees is given. The initial hypothesis, that t^u_0 consists only of the call of the subprogram, fails. There is a constant initial part if(eq0(x), T, nil). Starting segmentation further down, at the first `if'-node, results in a valid recurrent segmentation which can explain all three initial trees.

We already described how the maximal body can be constructed by finding the common pattern of all segments (see sect. 7.3.2). For the searched-for subprogram G_2 we find the body if(<(x_1, head(x_2)), x_1, G_2). We will explain how initial instantiations and substitutions in recursive calls are calculated below, in section 7.3.4. For the moment, just believe that the resulting subprogram G_2 is the one given in the box in figure 7.9. The calling main program for the sub-scheme is t^u_0 = if(eq0(G_2(x_1, x_2)), T, nil). The initial trees T^u_init can be explained with the following initial instantiations:

First Tree: β(x_1) = 8, β(x_2) = [9, 4, 7],
Second Tree: β(x_1) = 8, β(x_2) = tail([9, 4, 7]),

Fig. 7.8. Identifying Two Recursive Subprograms in the Initial Program for ModList (tree diagram omitted)

Fig. 7.9. Inferring a Sub-Program Scheme for ModList (tree diagram omitted)

Third Tree: β(x_1) = 8, β(x_2) = tail(tail([9, 4, 7])).

Now the inference of the sub-scheme is complete, and the folding algorithm comes back to the original problem. The originally unexplained subtrees are replaced by the three calls of the main program t^u_0 given at the bottom of figure 7.9. The inferred subprogram G_2 is introduced into the set of subprograms G of the RPS S which is to be constructed. The reduced initial tree is given in figure 7.10. The folding algorithm now proceeds with constructing the program body for G_1.

Fig. 7.10. The Reduced Initial Tree of ModList (tree diagram omitted)

7.3.3.2 Decomposition of Initial Trees. If a valid recurrent segmentation (U_rec, U_sub) was found for a set of initial trees T_init and if U_sub ≠ ∅, then induction of an RPS is performed recursively over the set of subtrees at the positions in U_sub.

Definition 7.3.9 (Decomposition of Initial Trees). Let T_init ⊆ T_{Σ∪{Ω}} be a set of initial trees indexed over E. Let (U_rec, U_sub) be a valid segmentation of T_init with U_sub ≠ ∅. The set of initial trees T^u_init for all u ∈ U_sub is defined as:

T^u_init = {segment(t_e, U_rec, w)|_u | e ∈ E, w ∈ W(e), u ∈ pos(segment(t_e, U_rec, w))}.

The index set E^u of elements in T^u_init is constructed as E^u = {(w, e) | e ∈ E, w ∈ W(e), u ∈ pos(segment(t_e, U_rec, w))}, and for elements t_(w,e) ∈ T^u_init with (w, e) ∈ E^u holds t_(w,e) = segment(t_e, U_rec, w)|_u.

From the construction of (U_rec, U_sub) it follows that each subtree at a position in U_sub contains at least one Ω (see def. 7.3.2 and def. 7.3.3). An example for an initial tree with non-empty sub-schema positions was given in figure 7.7.
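Definition 7.3.9 amounts to plain subtree extraction at the positions in U_sub. The following is a minimal sketch under assumed representations: terms are nested tuples, a position is a 1-based index tuple (so the book's 3.1. becomes (3, 1)), and the names subtree and decompose are chosen here.

```python
# Subtree extraction at positions, in the spirit of definition 7.3.9.
def subtree(term, u):
    """Return term|_u, or None if u is not a position of term."""
    for i in u:
        if not (isinstance(term, tuple) and 1 <= i < len(term)):
            return None
        term = term[i]
    return term

def decompose(segments, u_sub):
    """T^u_init for every u in U_sub: all existing subtrees at u."""
    return {u: [s for s in (subtree(seg, u) for seg in segments)
                if s is not None]
            for u in u_sub}

# Tiny illustration in the style of the ModList tree: the subtree at
# position 3.1. is the part that must be explained by a further subprogram.
seg = ('if', ('empty', 'l'), 'nil',
       ('cons', ('if', ('eq0', 'Omega'), 'T', 'nil'), 'Omega'))
trees = decompose([seg], [(3, 1)])
# trees[(3, 1)] == [('if', ('eq0', 'Omega'), 'T', 'nil')]
```

Segments in which a position does not exist (incomplete unfoldings) simply contribute no tree, which matches the restriction that only existing positions enter T^u_init.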

7.3.3.3 Equivalence of Sub-Schemata. In general, there can exist alternative RPSs which can explain a set of initial trees. These RPSs might have different numbers of parameters in the main program with different initial instantiations. We want to deal with the induction of sub-schemata for subtrees of a set of given initial trees as an independent sub-problem. For that reason, it must hold that, if there exists a valid hypothesis for initial trees in T^u_init which is part of a recursive explanation of T_init, each possible solution for T^u_init can be part of the recursive explanation of T_init.

For understanding the following theorem, remember that the main program t^u_0 of a sub-scheme must be valid over the different examples for this scheme. For the ModList illustration (see fig. 7.9) there were three given trees for inferring the sub-scheme, and a main program t^u_0 could be generated covering all three trees with different initial instantiations β_e(t^u_0).

Theorem 7.3.3 (Equivalence of Sub-Schemata). Let T_init be a set of initial trees indexed over E. Let S^1 = (t^1_0, G^1) and S^2 = (t^2_0, G^2) be two RPSs with subprograms G^1_1, ..., G^1_{n_1} and G^2_1, ..., G^2_{n_2}. The RPSs, together with initial instantiations β^1_e : X^1 → T_Σ and β^2_e : X^2 → T_Σ for all e ∈ E and parameters X^1 = var(t^1_0) and X^2 = var(t^2_0), recursively explain T_init. Let t^1_0 be the maximal pattern of all terms β^1_e(t^1_0) and t^2_0 the maximal pattern of all terms β^2_e(t^2_0) for e ∈ E. It holds that for each parameter x^1 ∈ X^1 there exists a parameter x^2 ∈ X^2 such that for all e ∈ E: β^1_e(x^1) = β^2_e(x^2).

Proof (Equivalence of Sub-Schemata). Let S^1 be a sub-schema constructed by the "maximization of body" principle (see theorem 7.3.1) and S^2 a sub-schema which is obtained in some different way, that is, recursion points must be at "deeper" positions in the initial trees.

Let x^1 ∈ X^1 be a parameter of t^1_0. For the instantiations β^1_e(x^1) it holds that they do not share a common non-trivial pattern over all positions of x^1 in the segments of the initial trees. It holds that ∃ e, e' ∈ E with node(β^1_e(x^1)) ≠ node(β^1_{e'}(x^1)) (postulation in theorem 7.3.3).

Variable x^1 is used in RPS S^1 to generate sub-terms of the given initial trees. Let u_{x^1} be a position in initial trees t_e, e ∈ E, where terms are generated by instantiations β^1_e(x^1). The sub-terms t_e|_{u_{x^1}} are identical to the initial instantiation β^1_e(x^1) and do not share a non-trivial pattern. It follows that in RPS S^2 the sub-terms t_e|_{u_{x^1}} must also be explained by an initial instantiation in main (t^2_0) (case 1) or be part of an instantiated parameter occurring in an unfolding of a subprogram (case 2).

Case 1: (sub-terms t_e|_{u_{x^1}} are explained by an initial instantiation in t^2_0)
Let x^2 ∈ X^2 be a variable and u_{x^2} ∈ pos(t^2_0{G^2_1, ..., G^2_{n_2}}, x^2) be a position of this variable in main with u_{x^2} ≤ u_{x^1}. Because x^2 was introduced by anti-unification, there exist e, e' ∈ E with node(u_{x^2}, t_e) ≠ node(u_{x^2}, t_{e'}). Because for all u < u_{x^1} it holds that node(u, t_e) = node(u, t_{e'}) for all e, e' ∈ E, it follows that u_{x^2} = u_{x^1} and therefore β^1_e(x^1) = β^2_e(x^2).

Case 2: (sub-terms t_e|_{u_{x^1}} are part of a parameter instantiation in an unfolding of a subprogram from S^2)
Let u_S be a position with u_S ≤ u_{x^1} such that the terms t_e|_{u_{x^1}} are instantiations of a parameter in an unfolding of a subprogram from S^2. These instantiations are generated by a sequence of substitutions (in the call of a subprogram from main, in the recursive call of a subprogram, in the call of a further subprogram, ...). Let t_subst ∈ T_Σ(X^2) be a combination of such substitutions where the variables in X^2 are parameters of t^2_0. That is, it holds that t_e|_{u_S} = t_subst{x^2_1 ← β^2_e(x^2_1), ..., x^2_m ← β^2_e(x^2_m)} = β^2_e(t_subst) for all e ∈ E.
Let u^s_{x^1} be the position of sub-term t_e|_{u_{x^1}} with respect to position u_S, that is, (t_e|_{u_S})|_{u^s_{x^1}} = t_e|_{u_{x^1}}. Because the sub-terms t_e|_{u_{x^1}} for all e ∈ E, and therefore the sub-terms β^2_e(t_subst)|_{u^s_{x^1}}, do not share a non-trivial pattern, the terms β^2_e(t_subst)|_{u^s_{x^1}} must be part of instantiations of variables x^2 ∈ X^2. Let u^s_{x^2} be a position of variable x^2 in t_subst with u^s_{x^2} ≤ u^s_{x^1} and t_subst|_{u^s_{x^2}} = x^2. Because for each position u < u^s_{x^1} it holds that node(β^2_e(t_subst)|_u) = node(β^2_{e'}(t_subst)|_u) for all e, e' ∈ E, and because the β_e(x^2) do not share a common prefix (postulation), it must hold that β^2_e(x^2) = β^2_e(t_subst)|_{u^s_{x^1}} = t_e|_{u_{x^1}} = β^1_e(x^1).

Theorem 7.3.3 states that each RPS with a given initial instantiation of variables in the main program which explains the set of subtrees at positions U_sub results in an identical instantiation of variables. Therefore, independent of the induced subprogram together with a calling main program, the number and instantiations of parameters occurring in this main program are unique. The proof is based on the idea that, for a given set of initial trees and a given sub-schema S^1, each other sub-schema S^2 which explains the same initial trees shares certain characteristics from which it follows that the parameter instantiations are identical. In the first case, we consider parameter instantiations which are part of the instantiation of parameters in the calling main program. If u_{x^2} is above u_{x^1} in the initial trees, then the subtrees must differ at position u_{x^2}. But we already know that u_{x^1} is exactly that position at which the subtrees differ for the first time. Therefore u_{x^1} and u_{x^2} must be identical positions in the subtrees and, as a consequence, the instantiations which are given as the subtrees at these positions must be identical! Case two is analogous for sub-terms in segments corresponding to unfoldings of a subprogram in S^2.

For illustration consider again the ModList example given in figure 7.10 and the searched-for RPS given in table 7.2. The subprogram ModList with parameters l and n calls a further subprogram Mod. The "partial" program body of ModList, that is, the maximal pattern of all segments without consideration of further subprograms, is:

ModList = if(empty(l), nil, cons(u, ModList)).

For the set of subtrees at segment positions u = 3.1., for example, the Mod subprogram given in table 7.2 can be induced with parameters k and n. For this subprogram the calling main program is if(eq0(Mod(n, head(l))), T, nil), with parameter n being passed through from the main program of the RPS and parameter k being constructed by substituting the parameter l given in the main program by head(l). Because ModList as well as Mod is constructed by calculating the maximal pattern over all segments, each possible realization of Mod together with its calling main must contain parameters n and k = head(l).

7.3.3.4 Reduction of Initial Trees. If sub-schemata explaining all subtrees T^u_init can be induced for all positions U_sub, these sub-schemata can be inserted in the segments of the subprogram which calls the sub-schema. That is, the concerned subtrees are replaced by the main programs t^u_0 which call further sub-programs.

Lemma 7.3.2 (Reduction of Initial Trees). Let T_init be a set of initial trees indexed over E. Let (U_rec, U_sub) together with t_skel be a valid segmentation for all t_e ∈ T_init and let U_sub ≠ ∅.

1. If there exists an RPS S^u = (G^u, t^u_0) for each u ∈ U_sub which explains the initial trees T^u_init constructed as described in definition 7.3.9, together with initial instantiations β^u_(w,e) : var(t^u_0) → T_Σ ∪ {⊥},^8 then the segments of initial trees in T_init can be transformed by segment(t_e, U_rec, w) = segment(t_e, U_rec, w)[u ← β^u_(w,e)(t^u_0)] for all trees e ∈ E, unfolding indices w ∈ W(e), and sub-schema positions u ∈ pos(segment(t_e, U_rec, w)).

2. Let G^u be a set of properly named subprograms (no two subprograms have identical names G_i) with

   G^u = ⟨ G^u_1(x_1, ..., x_{m^u_1}) = t^u_1,
           ...
           G^u_n(x_1, ..., x_{m^u_n}) = t^u_n ⟩

   and let G(x_1, ..., x_m) = t_G be the (yet unknown) subprogram for segmentation (U_rec, U_sub). The RPS S = (G, t_0) which explains T_init contains all subprograms of the sub-schemata S^u = (G^u, t^u_0) for all u ∈ U_sub:

   G = ⟨ G(x_1, ..., x_m) = t_G,
         G^u_1(x_1, ..., x_{m^u_1}) = t^u_1,
         ...
         G^u_n(x_1, ..., x_{m^u_n}) = t^u_n ⟩.

^8 There can exist subtrees for the sub-schemata which are too small to infer all variables.

Proof (Reduction of Initial Trees). 1. If an RPS S^u = (G^u, t^u_0) together with an initial instantiation β^u_(w,e) explains an initial tree t^u_(w,e), then there exists a term t ∈ L(S^u, β^u_(w,e)) with t^u_(w,e) ≤ t. Therefore, if the subtree at position u in segment segment(t_e, U_rec, w) is replaced by the instantiated calling main program β^u_(w,e)(t^u_0), then a term t' with segment(t_e, U_rec, w) ≤ t' can be generated from segment(t_e, U_rec, w)[u ← β^u_(w,e)(t^u_0)] using rules of the term rewrite system R_{S^u}.

2. To generate the original segments (or larger subtrees) from the segments where at all positions u ∈ U_sub the subtrees are replaced by the instantiated calling main programs, all subprogram definitions of the sub-schemata S^u = (G^u, t^u_0) are necessary.

After reduction, the subtrees at positions U_sub in the segments of T_init are replaced by the instantiated calling main of the corresponding subprograms. That is, these subtrees no longer contain Ωs, and the program body of the segments can be constructed as described in section 7.3.2.

The prescribed construction of sub-programs imposes the following restriction on initial trees: A maximal pattern for the subprograms can only be constructed if there exist at least two trees t_(w,e), t_(w',e') ∈ T^u_init such that for the initial instantiation of all variables x ∈ var(t^u_0) holds β^u_(e,w)(x) ≠ ⊥ and β^u_(e',w')(x) ≠ ⊥.

A consequence of inferring sub-schemata separately is that subprograms are introduced locally with unique names. It is possible that two such subprograms are identical and differ only in their names. After folding is completed, the set of subprograms can be reduced such that only subprograms with different program bodies remain in G.
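The reduction step of lemma 7.3.2 is, operationally, replacement at a position. The following is a minimal sketch, assuming terms as nested tuples and positions as 1-based index tuples; replace_at, calling_main, and the G2/list constructors are illustrative names, not the system's.

```python
# Replacement of a subtree at a position, as used in the reduction step.
def replace_at(term, u, new):
    """Return term[u <- new], i.e. term with the subtree at u replaced."""
    if not u:
        return new
    i = u[0]
    return term[:i] + (replace_at(term[i], u[1:], new),) + term[i + 1:]

segment = ('if', ('empty', 'l'), 'nil',
           ('cons', ('if', ('eq0', 'Omega'), 'T', 'nil'), 'Omega'))
# Instantiated calling main t^u_0 for the first ModList tree (sect. 7.3.3.1):
calling_main = ('if', ('eq0', ('G2', '8', ('list', 9, 4, 7))), 'T', 'nil')
reduced = replace_at(segment, (3, 1), calling_main)
# The Omega-containing subtree at 3.1. is replaced; the Omega at 3.2.,
# which belongs to the recursion point of the calling subprogram, remains.
```

After this replacement the segment contains no Ω from the sub-schema any more, so the maximal pattern of the segments can be computed as in section 7.3.2.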

7.3.4 Finding Parameter Substitutions

After constructing a subprogram body as the maximal pattern of all segments, the remaining unexplained subtrees must be parameters and their substitutions. Anti-unification already results in a set of variables. The last component of inducing an RPS from a set of initial trees is to identify the substitutions of these variables in the recursive calls of the subprogram.

Variable substitutions in recursive calls can be quite complex. In table 7.10 some recursive equations with different variants of substitutions are given. In the most simple case, each variable is substituted independently of the other variables and keeps its position in the recursive call (f1). A variable might also be substituted by an operation involving other program parameters (f2). Additionally, variables can switch their positions (given in the head of the equation) in the recursive call (f3). Finally, there might be "hidden" variables which only occur within the recursive call (f4). If you look at the body of f4, variable z occurs only within the recursive call. The existence of such a variable cannot be detected when the program body is constructed by anti-unification, but only a step later, when substitutions for the recursive call are inferred.

Table 7.10. Variants of Substitutions in Recursive Calls

Individual Substitutions, Identical Positions
  f1(x, y) = if(eq0(x), y, +(x, f1(pred(x), succ(y))))
Interdependent Substitutions
  f2(x, y) = if(eq0(x), y, +(x, f2(pred(x), +(x, y))))
Switching of Variables
  f3(x, y, z) = if(eq0(x), +(y, z), +(x, f3(pred(x), z, succ(y))))
Hidden Variable
  f4(x, y, z) = if(eq0(x), y, +(x, f4(pred(x), z, succ(y))))

7.3.4.1 Finding Substitutions for `Mod'. Let us again look at the Mod example. The valid segmentation for the second initial tree given in figure 7.9 is depicted again in figure 7.11. We already know from calculating the subprogram body if(<(x_1, head(x_2)), x_1, G_2) that there remain two different kinds of subtrees which must be explained by variables and their substitutions in recursive calls. Remember that, in constructing the subprogram body for Mod, there were three initial trees at hand and the program body was constructed by anti-unifying the segments gained from all three initial trees. Variable x_1 reflects differences appearing already in the segments of a single such initial tree, but variable x_2 reflects differences between segments of different initial trees.

Fig. 7.11. Substitutions for Mod (tree diagram omitted)

For better readability we write the remaining subtrees for each segment of the given tree into a table:

x_1 (1.1.1.)                                      | x_2 (1.1.2.1.)  | x_3 = x_1 (1.2.)
8                                                 | tail([9,4,7])   | 8
-(8, head(tail([9,4,7])))                         | tail([9,4,7])   | -(8, head(tail([9,4,7])))
-(-(8, head(tail([9,4,7]))), head(tail([9,4,7]))) | tail([9,4,7])   | -(-(8, head(tail([9,4,7]))), head(tail([9,4,7])))

The middle column represents the instantiations of x_2, which are constant over all segments. If only this single initial tree were considered, this term would be part of the program body. The left-hand and right-hand columns contain exactly the same terms on each level. With these considerations, it is enough to consider the changes from one level to the next in the first two columns. Terms on succeeding levels represent changes from one segment to the next; that is, for the program body induced by this segmentation, these changes must be explained by substitutions of parameters in the recursive call.

When going from the first to the second level, we find an occurrence of x_1 as well as of x_2, that is, -(8, head(tail([9,4,7]))) = -(x_1, head(x_2)). This leads to the hypothesis that the initial instantiations are x_1 = 8 and x_2 = tail([9,4,7]) with substitutions x_1 ← -(x_1, head(x_2)) and x_2 ← x_2.

We used the first two segments to construct a hypothesis. As we will see below, in general we need two further segments to validate this hypothesis. For now, let it be enough that we validate the hypothesis by going from the second to the third segment. Applying the found substitution to x_1 = -(8, head(tail([9,4,7]))) results in -(-(8, head(tail([9,4,7]))), head(tail([9,4,7]))), and because x_2 = tail([9,4,7]), the substitution hypothesis holds!

The illustration gives an example for an interdependent substitution (see tab. 7.10), because the substitution of x_1 depends not only on x_1 itself but also on x_2.
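The hypothesize-and-validate loop just walked through for Mod can be sketched as follows. This is an assumed simplification: apply_subst and validate are names chosen here, the levels are the rows of the segment table written as nested tuples, and '-' stands for the subtraction operator in Mod.

```python
# Validating a substitution hypothesis: applying the substitution of the
# recursive call to the instantiations of one segment must yield the
# instantiations of the next segment.
def apply_subst(term, env):
    """Instantiate variables 'x1', 'x2', ... in term according to env."""
    if isinstance(term, str) and term in env:
        return env[term]
    if isinstance(term, tuple):
        return tuple(apply_subst(t, env) for t in term)
    return term

def validate(levels, substitution):
    """levels: per-segment instantiations {'x1': ..., 'x2': ...};
    substitution: terms over x1, x2 describing the recursive call."""
    return all(
        {v: apply_subst(t, levels[i]) for v, t in substitution.items()}
        == levels[i + 1]
        for i in range(len(levels) - 1))

L = ('tail', ('list', 9, 4, 7))          # tail([9,4,7])
levels = [
    {'x1': '8', 'x2': L},
    {'x1': ('-', '8', ('head', L)), 'x2': L},
    {'x1': ('-', ('-', '8', ('head', L)), ('head', L)), 'x2': L},
]
# Hypothesis from the first two segments: x1 <- -(x1, head(x2)), x2 <- x2.
hypothesis = {'x1': ('-', 'x1', ('head', 'x2')), 'x2': 'x2'}
print(validate(levels, hypothesis))  # True
```

The interdependence of the substitution is visible in the code: the term for x1 mentions x2, so its value in the next segment depends on both current instantiations.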

7.3.4.2 Basic Considerations. A necessary characteristic of substitutions is that they are unambiguously determined in the (infinite set of) unfoldings (see def. 7.1.29). In that case, substitutions form regular patterns in the initial trees and can therefore be identified by pattern matching.

Theorem 7.3.4 (Uniqueness of Substitutions). Let S = (G, t_0) be an RPS, G_i(x_1, ..., x_{m_i}) = t_i a subprogram in S, and X_i = {x_1, ..., x_{m_i}} the set of parameters. Let U_rec be the set of recursion points of G_i with index set R = {1, ..., |U_rec|}, W the unfolding indices constructed over R, and the set of all unfoldings of equation G_i be taken over instantiations β : X_i → T_Σ. Let t_i be a program body which corresponds to the maximal pattern of all these unfoldings. Let sub(x_j, r) be a substitution term (def. 7.1.27) for x_j ∈ X_i in the recursive call of G_i at position u_r ∈ U_rec, r ∈ R. For all terms t_s ∈ T_Σ(X) with ∀w ∈ W : β_{w∘r}(x_j) = t_s{x_1 ← β_w(x_1), ..., x_{m_i} ← β_w(x_{m_i})} holds t_s = sub(x_j, r).

Proof: see appendix B.3.

Because substitutions are unique over unfoldings, it holds that for each subtree under a recursion point it is determined whether this term is a parameter with its initial instantiation or a substitution term over such a parameter.

Definition 7.3.10 (Characteristics of Substitution Terms). Let sub(x_j, r) be a substitution term for parameter x_j at position r. For all positions u, starting with u = λ, the following inductively defined characteristic holds:

sub(x_j, r)|_u =
  x_k   if (a) ∀w ∈ W : β_{w∘r}(x_j)|_u = β_w(x_k) and
           (b) ¬∃ t ∈ T_Σ(X_i) with t ≠ x_k such that
               ∀w ∈ W : β_{w∘r}(x_j)|_u = t{x_1 ← β_w(x_1), ..., x_{m_i} ← β_w(x_{m_i})}

  f(sub(x_j, r)|_{u∘1}, ..., sub(x_j, r)|_{u∘n})
        if ∀w ∈ W : node(β_{w∘r}(x_j), u) = f ∈ Σ with arity(f) = n.

For sufficiently large initial trees, it can be decided for each parameter in the r-th unfolding how it is substituted in the next w∘r-th unfolding. For example, in a recursive equation with two parameters x_1 and x_2 with β(x_1) = 0 and β(x_2) = succ(0), both parameters might be substituted by x_i ← succ(x_i). Condition (a) alone allows that x_1 ← x_2 is identified as a substitution, but this is only a legal substitution if condition (b) holds additionally. For the given example, for x_1 in the r-th unfolding a substitution term succ(x_1) can be found in the w∘r-th unfolding. A function for testing the uniqueness of substitutions is given in table 7.11. The function makes use of a function which tests the recurrence of a substitution, given in table 7.12.

Table 7.11. Testing whether Substitutions are Uniquely Determined

Function: uniqueness : X × X × pos(t_init) × T_Σ → bool
Pre:  E is the set of indices for T_init
      R is the index set of recursion points
      W(e) is the set of segment indices for t_e ∈ T_init
      X is the set of variables in ^t_G and x_j, x_k ∈ X
      β(x_k, w, e) returns the instantiation of variable x_k ∈ X for segment w ∈ W(e)
      t_β is an instantiation of x_j ∈ X with t_β ≠ ⊥ in a segment with w = r.
      u is the position which is currently checked
Post: true if for x_j no recurrent substitution can be found for a variable from X \ {x_k}, false otherwise
Side-effect: Break if an ambiguity is detected

uniqueness(x_j, x_k, u, t_β) =
1. IF u ∉ pos(t_β)
2. OR ((¬∃ x_i ∈ (X \ {x_k}) : betaRecurrent(x_j, x_i, u))  (see table 7.12)
3. OR (¬∀ e, e' ∈ E, w ∈ W(e), w' ∈ W(e') : node(β(x_j, w∘r, e), u) = node(β(x_j, w'∘r, e'), u)))
4. AND Let f = node(t_β, u) with arity(f) = n IN
       ∀ l = 1, ..., n : uniqueness(x_j, x_k, u∘l, w_1, e_1)
5. THEN return TRUE
6. ELSE return FALSE and break

214 7. Folding of Finite Program Terms

Table 7.12. Testing whether a Substitution is Recurrent

Function: betaRecurrent : X × X × R × pos(t^init) → bool
Pre: E is the set of indices for T^init
  R is the index set of recursion points
  W(e) is the set of segment indices for t^e ∈ T^init and w ∈ W(e)
  X is the set of variables in t̂_G and x_j, x_k ∈ X
  β(x_k, w, e) returns the instantiation of variable x_k ∈ X for segment w ∈ W(e)
  t_β is an instantiation of x_j ∈ X with t_β ≠ ⊥ in a segment with w = r
  u is the position which is currently checked
Post: true if in all instantiations of x_j ∈ X in segment w∘r, at position u, the instantiation of x_k in segment w can be found; false otherwise

betaRecurrent(x_j, x_k, r, u) =
  ∀e ∈ E, ∀w∘r ∈ W(e) : β(x_j, w∘r, e)|_u =_⊥ β(x_k, w, e)   (see def. 7.3.11 for =_⊥)
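The recurrence test of Table 7.12 can be illustrated with a small runnable sketch. All names and data below are illustrative assumptions, not the book's implementation: terms are nested tuples, there is a single recursion point (so segment indices are plain natural numbers), and beta maps (variable, segment) pairs to instantiations.

```python
def subterm(t, pos):
    """Return the subterm of t at position pos, or None (undefined)."""
    for i in pos:
        if not isinstance(t, tuple) or i >= len(t) - 1:
            return None
        t = t[i + 1]          # t = (symbol, child0, child1, ...)
    return t

def beta_recurrent(beta, xj, xk, segments, u):
    """True if, wherever both sides are defined, the instantiation of xj
    in segment w+1 contains at position u the instantiation of xk in w."""
    for w in range(len(segments) - 1):
        lhs = subterm(beta[(xj, w + 1)], u) if (xj, w + 1) in beta else None
        rhs = beta.get((xk, w))
        if lhs is not None and rhs is not None and lhs != rhs:
            return False
    return True

# Factorial-style segments: x is instantiated with 3, pred(3), pred(pred(3)).
beta = {
    ('x', 0): '3',
    ('x', 1): ('pred', '3'),
    ('x', 2): ('pred', ('pred', '3')),
}
print(beta_recurrent(beta, 'x', 'x', range(3), (0,)))  # True: sub is x -> pred(x)
```

The test succeeds because each segment's instantiation of x reappears as the argument (position (0,)) of the next segment's instantiation.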

For a given hypothesis t̂_G about a subprogram body (see def. 7.3.2 in sect. 7.3.2), variable instantiations can be determined for each segment. After determining the variable instantiations for the segments, the initial instantiations can be separated from the substitution terms by calculating the differences between the segment-wise instantiations. To calculate variable instantiations for segments, we can extend the definition for parameter instantiations (def. 7.1.18) in the following way:

Definition 7.3.11 (Instantiation of Variables in a Segment). Let T^init ⊆ T_{Σ∪{Ω}} be a set of initial trees indexed over E. Let (U_rec, U_sub) with U_sub = ∅ be a segmentation of the trees in T^init and t̂_G the resulting maximal pattern with X = var(t̂_G). Let W(e) be the set of segment indices for t^e ∈ T^init. The instantiations β : X × W × E → T_Σ ∪ {⊥} for variables in X occurring in segments with indices w ∈ W(e) are defined as

β(x, w, e) =
  segment(t^e, U_rec, w)|_u   if (a) ∃u ∈ pos(t̂_G, x) with u ∈ pos(segment(t^e, U_rec, w)) and
                                 (b) segment(t^e, U_rec, w) ≠ ⊥
  ⊥                           otherwise.

With β(x, w, e) =_⊥ β(x′, w′, e′) we denote that two instantiations are either identical or that one of the instantiations is undefined.

Because we allow for incomplete unfoldings, it can happen that in some segments the position of a hypothetical variable does not exist. The instantiation is ⊥ (i.e., undefined) if the searched-for position is non-existent in all segments.

Inducing the substitutions is based on the same principle as inducing the subprogram body, that is, detecting regularities in the yet unexplained subtrees of the segments. We realize the construction of a hypothesis for the variable substitutions in the recursive calls of a subprogram by identifying instantiation and substitution through a comparison of two successive segments (w and w∘r) and validating this hypothesis for all other pairs of successive segments (w′ and w′∘r). For the validation there must exist at least two additional successive segments where the position of the substitution term is given. Therefore, the initial trees in T^init must provide at least four segments for constructing and validating the searched-for substitution of a variable. But it is not necessary for these further segments to be given in the same initial tree; it is enough that this information can be collected over all trees in T^init! Thereby we keep the restrictions on initial trees minimal. Furthermore, for calculating interdependent substitutions (see table 7.10), it is necessary that for each substitution of a variable x_j in a given segment, the substitution terms of all variables are defined in the preceding segment.

Lemma 7.3.3 (Necessary Condition for Substitutions). Let T^init ⊆ T_{Σ∪{Ω}} be a set of initial trees indexed over E and t̂_G the hypothesis of the program body which follows from segmentation (U_rec, U_sub) with U_sub = ∅. Recursion points are indexed over R = {1, ..., |U_rec|}. The variables occurring in the program body are X = var(t̂_G). Recurrent hypotheses over substitutions can only be induced if for each variable x_j ∈ X and each recursive call r ∈ R holds: ∃e_1, e_2 ∈ E and w_1 ∈ W(e_1), w_2 ∈ W(e_2) with w_1 = ε, (w_1, e_1) ≠ (w_2, e_2), and for both h ∈ {1, 2}:

1. β(x_j, w_h∘r, e_h) ≠ ⊥ and
2. ∀x_k ∈ X : β(x_k, w_h, e_h) ≠ ⊥.

An algorithm for testing whether enough instances are given in a set of initial trees is given in table 7.13.

Table 7.13. Testing the Existence of Sufficiently Many Instances for a Variable

Function: enoughInstances : X → bool
Pre: E is the set of indices for T^init
  R is the index set of recursion points
  W(e) is the set of segment indices for t^e ∈ T^init
  X is the set of variables in t̂_G and x_j ∈ X
  β(x_k, w, e) returns the instantiation of variable x_k ∈ X for segment w ∈ W(e)
Post: true if there are sufficiently many instances, false otherwise

enoughInstances(x_j) =
1. (∃e_1 ∈ E : β(x_j, r, e_1) ≠ ⊥ AND
2.  ∀x_k ∈ X : β(x_k, ε, e_1) ≠ ⊥) AND
3. (∃e_2 ∈ E : ∃w ∈ W(e_2) : β(x_j, w∘r, e_2) ≠ ⊥ AND
4.  ∀x_k ∈ X : β(x_k, w, e_2) ≠ ⊥) AND
5. (e_1 ≠ e_2 OR w ≠ ε)

After constructing a substitution hypothesis from a pair of succeeding substitutions and validating it for all (at least one further) pairs of substitutions for which the according subterm positions are defined in the initial trees, it must be checked whether the complete set of substitution terms is consistent.

Lemma 7.3.4 (Consistency of Substitutions). Let T^init be a set of initial trees indexed over E. Let β : X × W × E → T_Σ ∪ {⊥} be the instantiations of the variables x ∈ X in segments w ∈ W(e), e ∈ E. Let sub(x_j, r) be the substitution terms of the variables x_j ∈ X in the recursive calls. The composition of substitutions sub* : X × W → T_Σ(X) is inductively defined as:

sub*(x_j, w) =
  x_j                                                        if w = ε
  sub(x_j, r){x_1 ← sub*(x_1, v), ..., x_m ← sub*(x_m, v)}   if w = v∘r.

For each initial tree t^e ∈ T^init there must exist initial instantiations β(x_j, ε, e) such that for all variables x_j ∈ X and all segments w ∈ W(e) holds: β(x_j, w, e) =_⊥ sub*(x_j, w){x_1 ← β(x_1, ε, e), ..., x_m ← β(x_m, ε, e)}.

A substitution of a variable x_j is consistent if the substitution for a given segment at position w can be constructed by successive application of the hypothetical substitution to the initial instantiation of the variable in the root segment (w = ε).
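This successive application can be sketched in a few lines of Python. Everything below is an illustrative assumption (nested-tuple terms, a single variable and one recursion point, so the composition sub* reduces to iterated application), not the book's code:

```python
def apply_subst(term, subst):
    """Replace variables (keys of subst) in a nested-tuple term."""
    if isinstance(term, tuple):
        return (term[0],) + tuple(apply_subst(a, subst) for a in term[1:])
    return subst.get(term, term)

def sub_star(x, depth, sub):
    """Compose the substitution hypothesis depth times, starting from x."""
    term = x
    for _ in range(depth):
        term = apply_subst(term, sub)
    return term

sub = {'x': ('pred', 'x')}              # hypothesis: sub(x, r) = pred(x)
beta0 = {'x': '3'}                      # initial instantiation (root segment)
observed = ('pred', ('pred', '3'))      # instantiation of x two segments down
predicted = apply_subst(sub_star('x', 2, sub), beta0)
print(predicted == observed)            # True: the hypothesis is consistent
```

An inconsistency (predicted ≠ observed for some segment) would trigger the backtracking to a new segmentation described below.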

7.3.4.3 Detecting Hidden Variables. Remember that the set of parameters of a recursive subprogram initially results from constructing the hypothesis of the program body by anti-unification of the hypothetical segments (see sect. 7.3.2). The maximal pattern of all segments contains variables at all positions where the subtrees differ over the segments. In many cases, the set of variables identified in this way is already the final set of parameters of the recursive functions to be constructed. Only in the case of hidden variables (see table 7.10), that is, variables which occur only in substitution terms, must the set of variables be extended.

In contrast to the standard case, the instantiations of such hidden variables cannot be identified in the segments; they must be identified within the substitution terms instead. This can be done because, if such a variable occurs in a recursive equation, it must be used in parameter substitutions in the recursive call.

Lemma 7.3.5 (Identification of Hidden Variables). Let S = (G, t_0) be an RPS, G_i(x_1, ..., x_{m_i}) = t_i a subprogram in S, and X_i = {x_1, ..., x_{m_i}} the set of parameters of G_i. Let U_rec be the set of recursion points of G_i indexed over R = {1, ..., |U_rec|}, W the set of unfolding indices constructed over R, and consider the set of all unfoldings of G_i with instantiations α : X_i → T_Σ. The program body t_i is the maximal pattern of all these unfoldings.

Let X_i^R = var(t_i[U_rec ← Ω]) be the set of variables occurring in the program body. Let X_i^H = X_i \ X_i^R be the set of hidden variables of subprogram G_i. Let sub(x_j, r) ∈ T_Σ(X_i) be a substitution term with var(sub(x_j, r)) ∩ X_i^H ≠ ∅. Let x_h ∈ var(sub(x_j, r)) ∩ X_i^H be a hidden variable used in the substitution term at position u_h ∈ pos(sub(x_j, r), x_h). It holds:

1. ¬∃x_k ∈ X_i^R with ∀w ∈ W : α_{w∘r}(x_j)|_{u_h} = α_w(x_k).
2. ¬∀w_1, w_2 ∈ W : node(α_{w_1∘r}(x_j), u_h) = node(α_{w_2∘r}(x_j), u_h).
3. ¬∃ prefix u of u_h and ¬∃x_k ∈ X_i with ∀w ∈ W : α_{w∘r}(x_j)|_u = α_w(x_k).

Proof (Identification of Hidden Variables).

1. Suppose there exists a variable x_k ∈ X_i^R for which condition 1 in lemma 7.3.5 holds. Then it must hold that x_k ≠ x_h, because x_k ∈ X_i^R, x_h ∈ X_i^H, and X_i^R ∩ X_i^H = ∅; and it must hold for all w ∈ W that

   α_{w∘r}(x_j)|_{u_h} = α_w(x_k)   (given postulation)
   α_{w∘r}(x_j)|_{u_h} = α_w(x_h)   (follows from supposition)

   Consequently, for all w ∈ W must hold α_w(x_k) = α_w(x_h), which contradicts the assumption that t_i is a maximal pattern of all unfoldings over α.

2. For all w ∈ W it holds that α_{w∘r}(x_j)|_{u_h} = α_w(x_h). Because t_i is a maximal pattern, variables do not have a common non-trivial pattern (a common prefix). Therefore, there must exist two positions w_1, w_2 ∈ W such that node(α_{w_1∘r}(x_h), ε) ≠ node(α_{w_2∘r}(x_h), ε) and therefore also node(α_{w_1∘r}(x_j), u_h) ≠ node(α_{w_2∘r}(x_j), u_h).

3. Follows from Lemma B.3.1 (appendix B.3).

From lemma 7.3.5 it follows that for each substitution term of a variable it can be decided whether it is constructed using a function from Σ or a hidden variable. The initial instantiations of such hidden variables can be induced from the substitution terms. An algorithm for determining hidden variables is given in table 7.14.

Table 7.14. Determining Hidden Variables

Function: hiddenVar : pos(t^init) × X × R → X′
Pre: u is a position in β(x_j, w, e)
  X = {x_1, ..., x_m} is the set of variables
  β(x_k, w, e) returns the instantiation of variable x_k ∈ X for segment w ∈ W(e)
  R is the index set of recursion points
Post: new variable symbol x_{m+1}
Side-effect: X is extended by x_{m+1}; β is extended by the values of x_{m+1}; Break (and backtracking to calculating segmentations) if there are not enough instances of x_{m+1}

1. m = |X|
2. Generate new variable symbol x_{m+1} ∉ X
3. X = X ∪ {x_{m+1}}
4. FORALL e ∈ E DO
5.   FORALL w ∈ W DO
6.     IF w∘r ∈ W(e)
7.     THEN β(x_{m+1}, w, e) = β(x_j, w∘r, e)|_u
8. IF NOT(enoughInstances(x_{m+1})) (see table 7.13)
9. THEN break
10. return x_{m+1}
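Step 7, reading the hidden variable's instantiations off the successor segments, can be sketched with illustrative data from the Tower of Hanoi example of section 7.4.2 (where the spare peg z occurs only in the substitutions of the other pegs). This is an assumption-laden sketch, not the book's code:

```python
def subterm(t, pos):
    """Subterm of a nested-tuple term at position pos; () is the root."""
    for i in pos:
        t = t[i + 1] if isinstance(t, tuple) else None
    return t

def hidden_var_instantiations(beta, xj, u, segments):
    """Induce instantiations for a fresh (hidden) variable: its value in
    segment w is the value of xj in segment w+1 at position u."""
    fresh = {}
    for w in segments:
        if (xj, w + 1) in beta:
            fresh[w] = subterm(beta[(xj, w + 1)], u)
    return fresh

# Hanoi, first recursive call: y is substituted by the hidden variable z,
# so z's value in segment w is the whole instantiation of y in segment w+1.
beta = {('y', 0): 'b', ('y', 1): 'c', ('y', 2): 'b'}
print(hidden_var_instantiations(beta, 'y', (), range(3)))  # {0: 'c', 1: 'b'}
```

The induced values match the spare peg of each call (c at the top level, b one level down), which is exactly how the fourth Hanoi parameter is recovered in section 7.4.2.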

7.3.4.4 Algorithm. The core of the algorithm for calculating substitution terms can be obtained by extending definition 7.3.10 and is given in table 7.15. The algorithm makes use of the functions uniqueness (table 7.11), betaRecurrent (table 7.12), and hiddenVar (table 7.14).

Table 7.15. Calculating Substitution Terms of a Variable in a Recursive Call

Function: calcSub : X × R × pos(t^init) → T_Σ
Pre: E is the set of indices for T^init
  R is the index set of recursion points with r ∈ R
  W(e) is the set of segment indices for t^e ∈ T^init
  X is the set of variables in t̂_G and x_j, x_k ∈ X
  β(x_k, w, e) returns the instantiation of variable x_k ∈ X for segment w ∈ W(e)
Post: a valid substitution term sub(x_j, r) for variable x_j ∈ X in the r-th recursive call.
Side-effect: Break if a variable is not uniquely determined (caused by uniqueness) or if there are not enough instances for calculating the instantiation of a hidden variable (caused by hiddenVar).

calcSub(x_j, r, u) =
1. IF betaRecurrent(x_j, x_k, r, u)
2. AND (Let t_β = β(x_j, w∘r, e) for e ∈ E, w ∈ W(e) with β(x_j, w∘r, e) ≠ ⊥ IN uniqueness(x_j, x_k, u, t_β))
3. THEN x_k
4. ELSE IF ∀e ∈ E, w ∈ W : node(β(x_j, w∘r, e), u) = f with arity(f) = n
5. THEN f(calcSub(x_j, r, u∘1), ..., calcSub(x_j, r, u∘n))
6. ELSE hiddenVar(u, x_j, r)
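The core case analysis of calcSub can be sketched under simplifying assumptions (one recursion point, segments indexed by natural numbers, nested-tuple terms, hidden variables omitted). Names and data are illustrative, not the book's implementation:

```python
def subterm(t, pos):
    for i in pos:
        if not isinstance(t, tuple) or i >= len(t) - 1:
            return None
        t = t[i + 1]
    return t

def calc_sub(beta, xj, variables, segments, u=()):
    # instantiations of xj in all successor segments w+1
    succs = [(w, beta[(xj, w + 1)]) for w in segments if (xj, w + 1) in beta]
    # lines 1-3: some variable's value in segment w recurs at position u
    for xk in variables:
        if all(subterm(t, u) == beta[(xk, w)] for w, t in succs):
            return xk
    # lines 4-5: all successor instantiations share a function symbol at u
    (f,) = {subterm(t, u)[0] for w, t in succs}   # would fail on ambiguity
    arity = len(subterm(succs[0][1], u)) - 1
    return (f,) + tuple(calc_sub(beta, xj, variables, segments, u + (l,))
                        for l in range(arity))

# factorial: x instantiated with 4, pred(4), pred(pred(4)) in segments 0..2
beta = {('x', 0): '4', ('x', 1): ('pred', '4'), ('x', 2): ('pred', ('pred', '4'))}
print(calc_sub(beta, 'x', ['x'], range(3)))   # ('pred', 'x')
```

At the root no variable recurs, so the common symbol pred is emitted and the recursion descends to its argument, where the recurrence of x is found; the result is the substitution term pred(x).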

Integrating all the components for calculating substitutions introduced in this section, we obtain the following algorithm:

1. Instantiate X = var(t̂_G) and calculate the initial instantiations as defined in 7.3.11.
2. Determine whether there are enough succeeding substitution terms as proposed in lemma 7.3.3. If there are not enough terms, break and backtrack to calculating a new segmentation.
3. For each variable x_j ∈ X and each recursive call r ∈ R calculate the substitutions, starting at u = ε, using algorithm 7.15.
4. Check the consistency of the instantiations as proposed in lemma 7.3.4. If an inconsistency is detected, break and backtrack to calculating a new segmentation.
5. If |T^init| ≥ 2, then check whether there exist at least two initial trees in T^init with complete initial instantiations as proposed in 7.3.2 in section 7.3.3. If this is not the case, break and backtrack to calculating a segmentation.
6. Initially, instantiate t_G = t̂_G. Instantiate m = |X|. Calculate indices for the variables in X using {1, ..., m}. For each u_r ∈ U_rec extend t_G in the following way: t_G = t_G[u_r ← G(sub(x_1, r), ..., sub(x_m, r))].
7. Return G(x_1, ..., x_m) = t_G and for all initial trees t^e return β^e = β(x_i, ε, e) for all x_i ∈ X.

7.3.5 Constructing an RPS

7.3.5.1 Parameter Instantiations for the Main Program. When constructing an RPS S(G, t_0) which recursively explains a set of initial trees T^init, the initial instantiation of variables β^e in t_0^e must be calculated for each t^e ∈ T^init. This can be done by using slightly modified versions of definitions 7.3.8 (instantiations of variables in a hypothetical subprogram body, section 7.3.2) and 7.3.11 (instantiations of variables in a segment, defined for parameter instantiations in recursive calls, section 7.3.4).

For a recursive program scheme S(G, t_0), each initial tree t^e ∈ T^init implies an instantiation β^e of the variables in t_0:

Definition 7.3.12 (Instantiation of t_0). Let T^init ⊆ T_{Σ∪{Ω}} be a set of initial trees indexed over E and S(G, t_0) the RPS which recursively explains T^init. Let t_0^e be the main program for t^e ∈ T^init. The instantiations of t_0 for a tree t^e can be constructed as

β^e(x) =
  t_0^e|_u   if ∃u ∈ pos(t_0, x) with u ∈ pos(t_0^e) and t_0^e|_u ≠ ⊥
  ⊥          otherwise.

If there exists a recursive subprogram which explains all trees in T^init, then the variable instantiations in the main program which calls this subprogram can be calculated with respect to the subprogram.

Definition 7.3.13 (Instantiation of t_0 wrt a Subprogram). Let T^init ⊆ T_{Σ∪{Ω}} be a set of initial trees indexed over E and G(x_1, ..., x_m) = t_G be a subprogram which recursively explains the trees t^e ∈ T^init together with variable instantiations β_G^e : {x_1, ..., x_m} → T_Σ ∪ {⊥}. The instantiations of t_0 for a tree t^e can be constructed as

β^e(x) =
  β_G^e(G(x_1, ..., x_m))|_u   if (a) ∃u ∈ pos(t_0, x) with u ∈ pos(β_G^e(G(x_1, ..., x_m))) and
                                  (b) β_G^e(G(x_1, ..., x_m))|_u ≠ ⊥
  ⊥                            otherwise.

7.3.5.2 Putting all Parts Together. Now we have all components for a system for inducing recursive explanations from a set of initial trees. The core of the system is to find a set of recursive subprograms. An overview of the induction of a subprogram is given in figure 7.12. Program BuildSub involves the following steps: (1) finding a valid recurrent segmentation U_rec for a set of initial trees (section 7.3.1), (2) constructing a hypothesis about the program body by calculating the maximal pattern of the segments (section 7.3.2), (3) interpreting the remaining subtrees as variable instantiations, (4) determining the variable substitutions and the initial instantiation of the variables (section 7.3.4). If there are subtrees which are not covered by the current segmentation, these trees are considered to belong to further subprograms and the process of inducing a (sub-)RPS is called recursively (see BuildRPS in figure 7.13). Whenever one of the integrity conditions formulated in the sections above is violated, the algorithm backtracks to step one.

BuildSub(T^init):

[Figure: flow diagram. The main path is: calculate a valid recurrent segmentation; calculate instantiations; calculate substitutions; calculate the maximal pattern t̂_G; calculate the initial instantiations β^e (for |T^init| ≥ 2); set t_0 = G(x_1, ..., x_m); set G = G ∪ {G(x_1, ..., x_m) = t_G}; return (G, t_0), {β^e}. Backtracking edges lead back to the segmentation step whenever no transitivity holds, substitutions are not sufficiently many or not uniquely determined, there are not enough instances for hidden variables, or not enough complete instantiations exist; if no further segmentation is possible, no RPS can be calculated (no solution). To deal with further subprograms, for all u in U_sub: decompose T^init into T^u; BuildRPS(T^u) yields (G^u, t_0^u) and {β_e^u}; reduce the segments in T^init and set G = G ∪ G^u.]

Fig. 7.12. Steps for Calculating a Subprogram

The complete procedure for inducing an RPS is given in figure 7.13. Starting at the roots of the initial trees in T^init, the system searches for a subprogram together with a calling main program and an initial instantiation of variables. Possibly, the searched-for "top-level" subprogram calls further subprograms which are also constructed (BuildSub, figure 7.12). If no solution is found, the root node is interpreted as a constant function in the main program and BuildRPS starts with all subtrees of the root node as new sets of initial trees. If none of the new initial trees contains an Ω, no recursive generalization is constructed and there is only a main program which is the maximal pattern of all trees; otherwise, BuildRPS is called recursively for all sets of initial trees at fixed positions under the root.

The strategy used is to find a subprogram which explains as large a part of the initial trees as possible, at a level in the trees as high as possible. Remember that, if only one initial tree is given, it is not possible to determine the parameters of the main program t_0; otherwise, the parameters of main are calculated by anti-unification of the main programs of all initial trees (see section 7.2.3). Remember further that currently subprograms with different names are induced for each unexplained subtree of a subprogram to be induced. Subprograms with identical bodies can be identified after synthesis is completed (see section 7.3.3.4 and section 7.3.5.4).
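Anti-unification, which yields the maximal pattern used throughout this chapter, can be sketched for first-order terms as follows (nested-tuple terms; a simplified illustrative version, not the book's implementation):

```python
def anti_unify(s, t, pairs=None, counter=None):
    """Maximal pattern of two terms: differing subterm pairs become
    variables, with the same variable reused for the same pair."""
    if pairs is None:
        pairs, counter = {}, [0]
    if s == t:
        return s
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        return (s[0],) + tuple(anti_unify(a, b, pairs, counter)
                               for a, b in zip(s[1:], t[1:]))
    if (s, t) not in pairs:
        counter[0] += 1
        pairs[(s, t)] = 'x%d' % counter[0]
    return pairs[(s, t)]

# two successive unfoldings of factorial differ only in the argument term:
s = ('if', ('eq0', '4'), '1', ('*', '4', ('G', ('pred', '4'))))
t = ('if', ('eq0', ('pred', '4')), '1',
     ('*', ('pred', '4'), ('G', ('pred', ('pred', '4')))))
print(anti_unify(s, t))
# ('if', ('eq0', 'x1'), '1', ('*', 'x1', ('G', ('pred', 'x1'))))
```

Reusing one variable for the same differing pair is what makes the pattern maximal: all three occurrences of the argument term collapse into the single parameter x1.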

7.3.5.3 Existence of an RPS. If an RPS can be induced from a set of initial trees using the approach presented here, then the induction can be seen as a proof of the existence of a recursive explanation, given the restrictions presented in section 7.2.2.

Theorem 7.3.5 (Existence of an RPS). Let T^init be a set of initial trees indexed over E. T^init can be explained recursively by an RPS iff:

1. T^init can be recursively explained by a subprogram G(x_1, ..., x_m) = t_G, or
2. ∀e ∈ E : ∃f ∈ Σ with arity(f) = n, n > 0, and node(t^e, ε) = f, and for all T^k = {t^e|_k | k = 1, ..., n} holds:
   a) ∀t ∈ T^k : pos(t, Ω) = ∅, or
   b) ∀t ∈ T^k : pos(t, Ω) ≠ ∅ and there exists an RPS S^k = (G^k, t_0^k) which recursively explains the trees in T^k.

Proof (Existence of an RPS). It must be shown (1) that an RPS exists if propositions (1) and (2) in theorem 7.3.5 hold, and (2) that no RPS exists if propositions (1) and (2) do not hold.

1. Existence of an RPS if propositions (1) and (2) in theorem 7.3.5 hold:

Proposition 1: Let G(x_1, ..., x_m) = t_G be a subprogram which recursively explains all trees in T^init (def. 7.1.30). Then there exist β_G^e : {x_1, ..., x_m} → T_Σ ∪ {⊥} such that G with instantiations β_G^e recursively explains t^e ∈ T^init. Let t_0 be the maximal pattern of all terms β_G^e(G(x_1, ..., x_m)) (theorem 7.3.1) and β^e : var(t_0) → T_Σ ∪ {⊥} (def. 7.3.13) the initial instantiations of the variables in t_0 for tree t^e ∈ T^init. Then the RPS S(⟨G(x_1, ..., x_m) = t_G⟩, t_0) together with instantiations β^e recursively explains the trees in T^init.

BuildRPS(T^init):

[Figure: flow diagram. Call BuildSub(T^init) ⇒ (G, t_0), {β^e}; if a solution is found, return it. Otherwise check whether for all t ∈ T^init, node(t, ε) = f with arity(f) = n; if not, there is no solution. Else set t_0^e = f(x_1, ..., x_m) for all e ∈ E and construct T^k for k = 1, ..., n. If no Ωs occur in any t ∈ T^k, set t_0^e|_k = t^e|_k for all e ∈ E, set t_0|_k to the maximal pattern of all t_0^e|_k, and calculate β_e^k wrt t_0|_k for all e ∈ E. If Ωs occur in all t ∈ T^k, call BuildRPS(T^k) ⇒ (G^k, t_0^k), {β_e^k}; set t_0^e|_k = β_e^k(t_0^k) for all e ∈ E and set G = G ∪ G^k. In both cases set β^e = β^e ∪ β_e^k; finally set t_0 to the maximal pattern of all β^e(t_0), calculate β^e wrt t_0 for all e ∈ E, and return (G, t_0), {β^e}.]

Fig. 7.13. Overview of Inducing an RPS

Proposition 2: For all T^k, k = 1, ..., n, with pos(t, Ω) ≠ ∅ there must, by assumption, exist an RPS S^k = (G^k, t_0^k) which recursively explains the trees T^k, and there must exist instantiations β_e^k : var(t_0^k) → T_Σ ∪ {⊥} for the parameters in the main programs t_0^k. For each term t_0^e it must hold that node(t_0^e, ε) = f with arity(f) = n. For all k = 1, ..., n must hold:

t_0^e|_k = β_e^k(t_0^k)   if pos(t, Ω) ≠ ∅
t_0^e|_k = t^e|_k         if pos(t, Ω) = ∅.

The "global" main t_0 then is the maximal pattern of all t_0^e with variable instantiations β^e(x) (def. 7.3.12). G is the set of all subprograms of the sub-schemata S^k, and the RPS S(G, t_0) together with the instantiations β^e recursively explains all trees in T^init.

2. Non-existence of an RPS if propositions (1) and (2) in theorem 7.3.5 do not hold:

If proposition (1) does not hold, then there cannot exist an RPS with a main program t_0 which just calls a recursive subprogram. That is, there must exist an RPS with a larger main program, for which proposition (2) does not hold.

If there does not exist a symbol f ∈ Σ which is identical for all roots of the initial trees t^e ∈ T^init, then there cannot exist a (first-order) main program which explains all initial trees.

If there exist t_1 ∈ T^k with pos(t_1, Ω) = ∅ and t_2 ∈ T^k with pos(t_2, Ω) ≠ ∅, then on the one hand, T^k cannot be explained by an RPS (by assumption) because of t_1, and, on the other hand, T^k cannot be explained by a non-recursive term with variable instantiations because of t_2.

7.3.5.4 Extension: Equality of Subprograms. After an RPS S(G, t_0) is induced, the set of subprograms G can possibly be reduced by unifying subprograms with equivalent bodies and different names G_i ∈ G.⁹ In the most simple case, the bodies of the two subprograms are identical except for the names of the variables. In general, G_i and G_j can contain calls of further subprograms. G_i and G_j are equivalent if these subprograms are equivalent, too. Table 7.16 gives an example, where G_A calls a further subprogram G_B and where variables of G_A are partially instantiated with different values at two different positions. Subprogram G_A could be generated from the two partially instantiated subprograms by (a) determining whether the further subprograms are equivalent and by (b) constructing the maximal pattern of the instances of G_A. But note that a strategy for deciding when different subprograms should be identified as a single subprogram would be needed to realize this simplification of program schemes!

⁹ This is a possible extension of the system, not implemented yet.


Table 7.16. Equality of Subprograms

G_A(x_1, x_2, x_3) = if(eq1(x_1), G_B(x_2), G_A(f(x_1, x_3), x_2, x_3))

Induced subprograms:

G_A is called with x_2 = 1 and x_3 = 2:
  G_A1(x_1) = if(eq1(x_1), G_B1(1), G_A1(f(x_1, 2)))
G_A is called with x_2 = 2 and x_3 = 3:
  G_A2(x_1) = if(eq1(x_1), G_B2(2), G_A2(f(x_1, 3)))

7.3.5.5 Fault Tolerance. The current, purely syntactical approach to folding does allow for initial trees where some of the segments of a hypothetical unfolding are incomplete. Incompleteness could, for example, result from cases not considered when the initial trees were generated (by a user or another system). But it does not allow for initial trees which contain defects, for example wrong function symbols. Such defects could result from user-generated initial trees or traces, where it might easily happen that trees are unintentionally flawed (by typing errors etc.). A possible solution for this problem could be that not only an RPS is induced but that, additionally, minimal transformations of the given initial trees are calculated and proposed to the user. This problem is open for further research.

7.4 Example Problems

To illustrate the presented approach to folding finite programs we present some examples. First, we give time measures for some standard problems. Afterwards we demonstrate the application to the planning problems which we introduced in chapter 3.

7.4.1 Time Effort of Folding

Figures 7.14 and 7.15 give the time effort for unfolding and folding the factorial (see tab. 7.7) and the Fibonacci function (see tab. 7.3). The experiments were done by unfolding a given RPS to a certain depth and then folding the resulting finite program term again. In all experiments the original RPS could be inferred from its n-th unfolding (n = 3, ..., 10). The procedure for obtaining the time measures is described in appendix A.6.

Folding has a time effort which is a factor between 3 and 4 higher than unfolding. For linear recursive functions, folding time is linear, and for tree recursive functions, folding time is exponential in the number of unfolding points, as is to be expected.

The greatest amount of folding time is used for constructing the substitutions, and the second greatest amount is used for calculating a valid recurrent segmentation. Figure 7.16 gives these times for the factorial function.
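The unfold half of such an experiment is easy to sketch (an illustrative nested-tuple representation of the factorial scheme; open recursive calls are replaced by Ω, written 'Omega' below; not the book's code):

```python
FAC_BODY = ('if', ('eq0', 'x'), '1', ('*', 'x', ('G', ('pred', 'x'))))

def subst_var(term, var, val):
    if isinstance(term, tuple):
        return (term[0],) + tuple(subst_var(a, var, val) for a in term[1:])
    return val if term == var else term

def unfold(term, body, depth):
    """Replace each recursive call G(arg) by the body, unfolded depth-1
    further times, with x bound to arg; at depth 0 an Omega is left."""
    if isinstance(term, tuple):
        if term[0] == 'G':
            if depth == 0:
                return 'Omega'
            return subst_var(unfold(body, body, depth - 1), 'x', term[1])
        return (term[0],) + tuple(unfold(a, body, depth) for a in term[1:])
    return term

def count(term, sym):
    if isinstance(term, tuple):
        return (term[0] == sym) + sum(count(a, sym) for a in term[1:])
    return int(term == sym)

t3 = unfold(('G', 'x'), FAC_BODY, 3)
print(count(t3, 'if'), count(t3, 'Omega'))   # 3 1
```

For the linear factorial scheme the n-th unfolding contains n if-segments and one open end, which is why folding time grows linearly here, while a tree recursion such as Fibonacci doubles the number of open calls per level.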

[Figure: plot of 'fac-unfold' and 'fac-fold'; unfolding depth n = 3, ..., 10 (x-axis), measured in seconds (y-axis, 0 to 0.5).]

Fig. 7.14. Time Effort for Unfolding/Folding Factorial

[Figure: plot of 'fib-unfold' and 'fib-fold'; unfolding depth n = 3, ..., 10 (x-axis), measured in seconds (y-axis, 0 to 90).]

Fig. 7.15. Time Effort for Unfolding/Folding Fibonacci

[Figure: plot of 'fac-fold', 'fac-seg', and 'fac-sub'; unfolding depth n = 3, ..., 10 (x-axis), measured in seconds (y-axis, 0.05 to 0.5).]

Fig. 7.16. Time Effort Calculating Valid Recurrent Segmentations and Substitutions for Factorial

Both for factorial and for Fibonacci the first segmentation hypothesis leads to success, that is, no backtracking is needed. For a variant of factorial with a constant initial part (see table 7.17), one backtrack is needed. For an unfolding depth n = 3, the time for folding goes up to 0.36 sec (from 0.12 sec), and for n = 4 to 0.44 sec (from 0.16 sec).

Table 7.17. RPS for Factorial with Constant Expression in Main

G = ⟨G(x) = if(eq0(pred(x)), 1, *(x, G(pred(x))))⟩
t_0 = if(eq0(x), 1, G(x))
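Read operationally, this scheme computes the factorial; a sketch in Python with eq0, pred, and * interpreted as the usual arithmetic operations (illustrative, not the book's code):

```python
def G(x):
    # G(x) = if(eq0(pred(x)), 1, *(x, G(pred(x))))
    return 1 if x - 1 == 0 else x * G(x - 1)

def main(x):
    # t_0 = if(eq0(x), 1, G(x))
    return 1 if x == 0 else G(x)

print([main(n) for n in range(6)])   # [1, 1, 2, 6, 24, 120]
```

Note that G alone is undefined for x = 0 (the recursion would not terminate); the constant expression in the main program catches exactly this case.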

For the ModList function (see tab. 7.2) an RPS with two subprograms must be inferred. The folding time for the third unfolding of ModList, where Mod is unfolded once in the first, twice in the second, and three times in the third unfolding, is 1.07 sec (with 0.16 sec for unfolding). For the fourth unfolding, with one to four unfoldings of Mod, folding time is 3.20 sec (with 0.33 sec for unfolding).

7.4.2 Recursive Control Rules

In the following we present how RPSs for planning problems such as the ones presented in chapter 3 can be induced. In this section we do not describe how appropriate initial trees can be generated from universal plans, but just demonstrate that the recursive rules underlying a given planning domain can be induced if appropriate initial trees are given!

For the function symbols used in the signature of the RPSs we assume that they are predefined. Typically such symbols refer to predicates and operators defined in the planning domain. Furthermore, for planning problems, a situation variable s is introduced which represents the current planning state. A state can be represented as a list of literals for Strips-like planning (see chap. 2) or as a list or array over primitive types such as numbers for a functional extension of Strips (see chap. 4). Generating initial trees from universal plans is discussed in detail in chapter 8.

The clearblock problem introduced in section 3.1.4.1 in chapter 3 has an underlying linear recursive structure. In figure 7.17 an initial tree for a three-block problem is given (the ontable literal is omitted for better readability). The resulting RPS is:

G = ⟨G(x, s) = if(clear(x, s), s, puttable(topof(x, s), G(topof(x, s), s)))⟩
t_0 = G(C, list(clear(A), on(A, B), on(B, C), ontable(C))).

The first variable x holds the name of the block to be cleared, for example C. The situation variable s holds a description of a planning state as a list of literals. The predicate clear, the selector topof, and the operator puttable are assumed to be predefined functions. clear(x, s) is true if clear(x) ∈ s; topof(x, s) returns

[Figure: initial tree for clearblock, a threefold unfolding of if(clear(·, s), s, puttable(topof(·, s), ·)) over the state list(on(A, B), on(B, C), clear(A)), with nested topof terms as first arguments and an Ω at the open recursive call.]

Fig. 7.17. Initial Tree for Clearblock

y for on(y, x) ∈ s; puttable(x, s) deletes on(x, y) from s and adds the literals clear(y) and ontable(x) to s.
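The induced clearblock rule can be executed directly. The following is a runnable sketch, not the book's code: states are frozensets of literal tuples, and clear, topof, and puttable mirror the predefined functions just described.

```python
def clear(x, s):
    return ('clear', x) in s

def topof(x, s):
    # the y with on(y, x) in s
    return next(l[1] for l in s if l[0] == 'on' and l[2] == x)

def puttable(x, s):
    (y,) = [l[2] for l in s if l[0] == 'on' and l[1] == x]
    return (s - {('on', x, y)}) | {('clear', y), ('ontable', x)}

def G(x, s):
    # G(x, s) = if(clear(x, s), s, puttable(topof(x, s), G(topof(x, s), s)))
    return s if clear(x, s) else puttable(topof(x, s), G(topof(x, s), s))

s0 = frozenset({('clear', 'A'), ('on', 'A', 'B'),
                ('on', 'B', 'C'), ('ontable', 'C')})
print(sorted(G('C', s0)))   # all three blocks clear and on the table
```

For the three-block tower the rule puts A and then B on the table, so C ends up clear, matching the unfolding shown in figure 7.17.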

The Tower of Hanoi domain is an example of a recursive function relying on a hidden variable (see sect. 7.3.4.3). In figure 7.18, an initial tree is given. The underlying RPS is:

G = ⟨G(n, x, y, z) = if(eq1(n), move(x, y), ap(hanoi(pred(n), x, z, y), move(x, y), hanoi(pred(n), z, y, x)))⟩
t_0 = G(3, a, b, c).

The function symbol ap (which can be interpreted as append) is used to combine two calls of hanoi with a move operation.

Parameter z occurs only in the substitutions in the recursive calls. Therefore, the maximal pattern contains only three variables:

t̂_G = if(eq1(x_1), move(x_2, x_3), ap(G, move(x_2, x_3), G))

and the fourth variable z is included after determining the substitution terms.

In the following hapter, re ursive generalizations for these and further

planning problems will be dis ussed.

Fig. 7.18. Initial Tree for Tower of Hanoi (figure: term tree with Ω leaves over if, eq1, pred, move, and ap nodes, unfolding G(3, a, b, c))


8. Transforming Plans into Finite Programs

"[... I guess I can put two and two together." "Sometimes the answer's four," I said, "and sometimes it's twenty-two. [..."

-- Nick Charles to Morelli in: Dashiell Hammett, The Thin Man, 1932

Now we are ready to combine universal planning as described in chapter 3 and induction of recursive program schemes as described in chapter 7. In this chapter, we introduce an approach to transforming plans generated by universal planning into finite programs which are used as input to our folder. On the one hand, we present an alternative approach to realizing the first step of inductive program synthesis as described in section 6.3.4 in chapter 6, using AI planning as the basis for generating program traces. On the other hand, we demonstrate that inductive program synthesis can be applied to the generation of recursive control rules for planning as discussed in section 2.5.2 in chapter 2. As a reminder of our overall goal introduced in chapter 1, figure 8.1 gives an overview of learning recursive rules from some initial planning experience.

Fig. 8.1. Induction of Recursive Functions from Plans (figure: domain exploration turns a domain specification -- PDDL operators, a set of constants, and a set of states or a goal -- via universal planning into a universal plan, the set of all optimal plans; control knowledge learning then transforms the plan into finite program(s) and folds them by generalization-to-n into a recursive program scheme)


While plan construction and folding are straightforward and can be dealt with by domain-independent, generic algorithms, plan transformation is knowledge dependent and therefore the bottleneck of our approach. What we present here is work in progress. As a consequence, the algorithms presented in this chapter are stated rather informally or given as part of the Lisp code of the current implementation, and we rely heavily on examples. We can demonstrate that our basic idea -- transformation based on data type inference -- works for a variety of domains and hope to elaborate and formalize plan transformation in the near future.

In the following, we first give an overview of our approach (sect. 8.1), then we introduce decomposition, type inference, and introduction of situation variables as components of plan transformation (sect. 8.2). In the remaining sections we give examples for plans over totally ordered sequences (sect. 8.3), sets (sect. 8.4), partially ordered lists (sect. 8.5), and more complex types (sect. 8.6).^1

8.1 Overview of Plan Transformation

8.1.1 Universal Plans

Remember construction of universal plans for a small finite set of states and a fixed goal, as introduced in chapter 3. A universal plan is a DAG which represents an optimal, totally ordered sequence of actions (operations) to transform each input state considered in the plan into the desired output (goal) state. Each state is a node in the plan and is typically represented as a set of atoms over domain objects (constants). Actions in a plan are applied to the current state in working memory. For example, for a given state {on(A, B), on(B, C), clear(A), ontable(C)}, application of action puttable(A) results in {on(B, C), clear(A), clear(B), ontable(A), ontable(C)}.
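The book's implementation is Lisp-based; as a minimal Python sketch (states as sets of atom tuples; the precondition, add, and delete lists for puttable are assumptions mirroring the clearblock operators in the text), the example above can be reproduced like this:

```python
# Minimal sketch of applying a STRIPS-style action to a planning state.
# States are sets of ground atoms; an action has precondition, add, and
# delete lists (the puttable lists below mirror the clearblock example).

def apply_action(state, pre, add, delete):
    """Return the successor state, or None if a precondition fails."""
    if not pre <= state:
        return None
    return (state - delete) | add

def puttable(x, y, state):
    """Put block x (currently lying on y) onto the table."""
    return apply_action(
        state,
        pre={("clear", x), ("on", x, y)},
        add={("clear", y), ("ontable", x)},
        delete={("on", x, y)},
    )

state = {("on", "A", "B"), ("on", "B", "C"), ("clear", "A"), ("ontable", "C")}
result = puttable("A", "B", state)
# result == {("on","B","C"), ("clear","A"), ("clear","B"),
#            ("ontable","A"), ("ontable","C")}
```

The working-memory update of the planner is here just a pure function from state to state, which is exactly the reading needed later when situation variables are introduced.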

8.1.2 Introducing Data Types and Situation Variables

To transform a plan into a finite program, it is necessary to introduce an order over the domain (which can be gained by introducing a data type) and to introduce a variable which holds the current state (i. e., a situation variable).

Data type inference is crucial for plan transformation: The universal plan already represents the structure of the searched-for program, but it does not contain information about the order of the objects of its domain. Typically for

1 This chapter is based on the previous publications Schmid and Wysotzki (2000b) and Schmid and Wysotzki (2000a). An approach which shares some aspects with the plan transformation presented here and is based on the assumption of linearizability of goals is presented in Wysotzki and Schmid (2001).


planning, domain objects are represented as sets of constants. For example, in the clearblock problem (sect. 3.1.4.1 in chap. 3) we have blocks A, B, and C. As discussed in chapter 6, approaches to deductive as well as inductive functional program synthesis rely on a "constructive" representation of the domain. For example, Manna and Waldinger (1987) introduce the hat-axiom for referring to a block which is lying on another block; Summers (1977) relies on a predefined complete partial order over lists. Our aim is to infer the data structure for a given planning domain. If the data type is known, constant objects in the plan can be replaced by constructive expressions.

Furthermore, to transform a plan into a program term, a situation variable (see sect. 2.2.2 in chap. 2 and sect. 6.2.1.2 in chap. 6) must be introduced. While a planning algorithm applies an instantiated operator to the current state in working memory (that is, a "global" variable), interpretation of a functional program depends only on the instantiated parameters (that is, "local" variables) of this program term (Field & Harrison, 1988). Introducing a situation variable in the plan makes it possible to treat a state as a valuated parameter of an expression. That is, the primitive operators are now applied to the set of literals held by the situation variable.

8.1.3 Components of Plan Transformation

Overall, plan transformation consists of three steps, which we will describe in the following section:

Plan Decomposition: If a universal plan consists of parts with different sets of action names (e. g., a part where only unload(x, y) is used and another part where only load(x, y) is used), the plan is split into sub-plans. In this case, the following transformation steps are performed for each sub-plan separately and a term giving the structure of function-calls is generated from the decomposition structure.

Data Type Inference: The ordering underlying the objects involved in action execution is generated from the structure of the plan. From this order, the data type of the domain is inferred.

Introduction of Situation Variables: The plan is re-interpreted in situation calculus and rewritten as a nested conditional expression.

For information about the global data structures and central components of the algorithm, see appendix A.7.

8.1.4 Plans as Programs

As stated above, a universal plan represents the optimal transformation sequences for each state of the finite problem domain for which the plan was constructed. To execute such a plan, the current input state can be searched in the planning graph by depth-first or breadth-first search starting from the root, and the actions along the edges from the current input state to the root can be extracted (Schmid, 1999). If the universal plan is transformed into a finite program term, this program can be evaluated by a functional eval-apply interpreter. For each input state, the resulting transformation sequence should correspond exactly to the sequence of actions associated with that state in the universal plan.

The searched-for finite program is already implicitly given in the plan and we have to extract it by plan transformation. A plan can be considered as a finite program for transforming a fixed set of inputs into the desired output by means of applying a totally ordered sequence of actions to an input state, resulting in a state fulfilling the top-level goals. A plan constructed by backward search with the state(s) fulfilling the top-level goals as root can be read top-down as: IF the literals at the current node are true in a situation THEN you are done after executing the actions on the path from the current node to the root ELSE go to the child node(s) and recur.^2

Our goal is to extract the underlying program structure from the plan. To interpret the plan as a (functional) program term, states are re-interpreted as boolean operators: All literals of a state description which are involved in transformation -- the "footprint" (Veloso, 1994) -- are rewritten with the predicate symbol as boolean operator, introducing a situation variable as additional argument. Each boolean operator is rewritten as a function which returns true if some proposition holds in the current situation, and false otherwise. Additionally, the actions are extended by a situation variable; thus, the current (partial) description of a situation can be passed through the transformations. Finally, to represent the conditional statement given above, additional nodes only containing the situation variable are introduced for all cases where the current situation fulfills a boolean condition. In this case, the current value of the situation variable is returned.

We define plans as programs in the following way:

Definition 8.1.1 (Plan as Program). Each node S (set of literals) in the plan is interpreted as a conjunction B of boolean expressions. The planning tree can now be interpreted as a nested conditional: Φ(s) = IF B(s) THEN t1 ELSE t2 with t1, t2 == s | o(Φ'(s)), where s is a situation variable, o the action given at the edge from B to a child node, and Φ' the sub-plan with this child node as root.

The restriction to binary conditions "if-then-else" is no limitation in expressiveness. Each n-ary condition can be rewritten as a nested binary condition:

(cond (x1 t1) (x2 t2) (x3 t3) ... (xn tn)) ==
(if x1 t1 (if x2 t2 (if x3 t3 (if ... (if xn tn))))).
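This rewriting can be sketched in a few lines of Python over nested tuples standing in for the Lisp terms (the tuple encoding is ours, introduced only for illustration):

```python
# Sketch: rewrite an n-ary cond into nested binary ifs, with expressions
# represented as nested tuples (a stand-in for the book's Lisp terms).

def cond_to_if(clauses):
    """clauses: list of (condition, term) pairs -> nested ('if', c, t, else)."""
    if len(clauses) == 1:
        cond, term = clauses[0]
        return ("if", cond, term)            # innermost if has no else-branch
    cond, term = clauses[0]
    return ("if", cond, term, cond_to_if(clauses[1:]))

nested = cond_to_if([("x1", "t1"), ("x2", "t2"), ("x3", "t3")])
# nested == ("if", "x1", "t1", ("if", "x2", "t2", ("if", "x3", "t3")))
```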

In general, boolean expressions B(s) can involve propositions explicitly given in a situation as well as additional constraints as described in chapter

2 An interpreter function for universal plans is given in (Wysotzki & Schmid, 2001).


4. If the plan results in a term IF B(s) THEN s ELSE o(Φ(s)), the problem is linear; if then- and else-part both involve operator application, the problem is more complex (resulting in a tree recursion). We will see below that for some problems a complex structure can be collapsed into a linear one as a result of data type introduction.

8.1.5 Completeness and Correctness

Starting point for plan transformation is a complete and correct universal plan (see chap. 3). The result of plan transformation is a finite program term. The completeness and correctness of this program can be checked by using each state in the original plan as input to the finite program and checking (1) whether the interpretation of the program results in the goal state and (2) whether the number of operator applications corresponds to the number of edges on the path from the given input state to the root of the plan. Of course, because folding is an inductive step, we cannot guarantee the completeness and correctness of the inferred recursive program. The recursive program could be empirically validated by presenting arbitrary input states of the domain. For example, if a plan for sorting lists of up to three elements was generalized into a recursive sorting program, the user could test this program by presenting lists with four or more elements as input and inspecting the resulting output.
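The two checks can be sketched as a small validator (a Python sketch; `plan_paths` and `run_program` are illustrative stand-ins, not part of the book's implementation):

```python
# Sketch of the completeness/correctness check: for every state covered by
# the plan, the program must (1) reach the goal state and (2) apply exactly
# as many operators as there are edges from that state to the plan's root.

def validate(plan_paths, run_program, goal_state):
    """plan_paths maps each state to its number of edges to the root."""
    for state, path_length in plan_paths.items():
        final_state, n_applications = run_program(state)
        if final_state != goal_state:
            return False                     # check (1) failed
        if n_applications != path_length:
            return False                     # check (2) failed
    return True

# toy usage: states are counters, the goal is 0, the "program" decrements
def countdown(state):
    steps = 0
    while state != 0:
        state, steps = state - 1, steps + 1
    return state, steps

ok = validate({3: 3, 1: 1, 0: 0}, countdown, 0)   # -> True
```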

8.2 Transformation and Type Inference

8.2.1 Plan Decomposition

As an initial step, the plan might be decomposed into uniform sub-plans:

Definition 8.2.1 (Uniform Sub-Plan). A sub-plan is uniform if it contains only fixed, regular sequences of operator-names ⟨o1 ... on⟩ with n ≥ 1.

The most simple uniform sub-plan is a sequence of steps where each action involves an identical operator-name (see fig. 8.2.a). An example for such a plan is the clearblock plan given in figure 3.2, which consists of a linear sequence of puttable actions. It could also be possible that some operators are applied in a regular way -- for example drill-hole/polish-object (see fig. 8.2.b). Single operators or regular sequences can alternatively occur in more complex planning structures (see fig. 8.2.c). An example for such a plan is the sorting plan given in figure 3.7, which is a DAG where each arc is labelled with a swap action.

A plan can contain uniform sub-plans in several ways: The most simple way is that the plan can be decomposed level-wise (see fig. 8.3.a). This is, for example, the case for the rocket domain (see plan in fig. 3.5): The first levels


of the DAG only contain unload actions, followed by a single move-rocket action, followed by some levels which only contain load actions. In general, sub-plans can occur at any position in the planning structure as subgraphs (see fig. 8.3.b).

Fig. 8.2. Examples of Uniform Sub-Plans (figure: (a) a linear sequence of one operator, (b) a regular alternation of two operators, (c) a branching structure with a single operator)

We have only implemented a very restricted mode for this initial plan decomposition: single operators and level-wise splitting (see appendix A.8). A full implementation of decomposition involves complex pattern-matching, as it is realized for identifying sub-programs in our folder (chap. 7). Level-wise decomposition can result in a set of "parallel" sub-plans which might be composed again during later planning steps. Parallel sub-plans occur if the "parent" sub-plan is a tree, that is, it terminates with more than one leaf. Each leaf becomes the root of a potential subsequent sub-plan.

In the current implementation, we return the complete plan if different operators occur at the same level. A reasonable minimal extension (still avoiding complex pattern-matching as described in chap. 7) would be to search for sub-plans fulfilling our simple splitting criterion at lower levels of the plan. But up to now, this case only occurred for complex list problems (such as tower with 4 blocks, see below), and in such cases a minimal spanning tree is extracted from the plan.
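The restricted level-wise splitting just described can be sketched as follows (a Python stand-in for the Lisp implementation; a plan is simplified here to a list of levels, each level a list of action names):

```python
# Sketch of level-wise plan decomposition: split a plan, given as a list of
# levels of action names, into maximal runs of levels that use a single
# operator name (the restricted mode described in the text).
from itertools import groupby

def decompose_levels(levels):
    """Return sub-plans, or None if some level mixes operator names."""
    names = []
    for level in levels:
        if len(set(level)) != 1:
            return None                      # mixed level: keep whole plan
        names.append(level[0])
    # group adjacent levels with the same operator name into one sub-plan
    subplans, i = [], 0
    for _name, run in groupby(names):
        run_len = len(list(run))
        subplans.append(levels[i:i + run_len])
        i += run_len
    return subplans

# rocket-like plan: unload levels, one move-rocket level, load levels
plan = [["unload", "unload"], ["unload"], ["move-rocket"], ["load"], ["load"]]
# decompose_levels(plan) ->
# [[["unload","unload"], ["unload"]], [["move-rocket"]], [["load"], ["load"]]]
```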


Fig. 8.3. Uniform Plans as Subgraphs (figure: (a) level-wise decomposable plan, (b) uniform sub-plan embedded as a subgraph)

If decomposition results in more than one sub-plan, an initial skeleton for the program structure is generated over the (automatically generated) names of the sub-plans (see also chap. 7), which are initially associated with the partial plans and finally with recursive functions. If folding succeeded, the names are extended by the associated lists of parameters. For example, a structure (p1 (p2 (p3))) could be completed to (p1 arg1 (p2 arg2 arg3 (p3 (arg4 s)))) where the last argument of each sub-program with name p_i is a situation variable. For all arguments arg_i, the initial values as given in the finite program are known as a result from folding, where parameters and their initial instantiations are identified in an initial program (chap. 7).

8.2.2 Data Type Inference

The central step of plan transformation is data type inference.^3 The structure of a (sub-)plan is used to generate a hypothesis about the underlying data type. This hypothesis invokes certain data-type-specific concepts, which the system subsequently tries to identify in the plan, and certain rewrite-steps, which are to be performed on the plan. If the data-type-specific concepts cannot be identified from the plan, plan transformation fails.

Definition 8.2.2 (Data Type). A data type τ is a collection of data items, with designated basic items ⊥ (together with a "bottom"-test) and operations

3 Concrete and abstract data types are for example introduced in (Ehrig & Mahr, 1985).


(constructors) c such that all data items can be generated from basic items by operation-application. The constructor function determines the structure of the data type with ⊥ < c(x, ⊥) < c(x', c(x, ⊥)) < .... If x is empty or unique, the data type is simple and the (countable, infinite) set of elements belonging to τ is totally ordered. The structure of data belonging to complex types is usually a partial order. For complex types, additionally selector functions el(c(x, s)) = x and rs(c(x, s)) = s are defined.

An example for a data type is list with the empty list as bottom element, cons(x, l) as list-constructor, head(l) as selector for the first element of a list, and tail(l) as selector for the "rest" of the list, that is, the list without the first element.

The data type hypotheses are checked against the plan, ordered by increasing complexity:

Is the plan a sequence of steps (no branching in the plan)?
Hypothesis: Data Type is Sequence

Does the plan consist of paths with identical sets of actions?
Hypothesis: Data Type is Set

Is the plan a tree?
Hypothesis: Data Type is List or compound type

Is the plan a DAG?
Hypothesis: Data Type is List or compound type.

Data type inference for the different plan structures is discussed in detail below. After the plan is rewritten in accordance with the data type, the order of the operator-applications and the order over the domain objects are represented explicitly.
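These structural checks can be sketched over a plan given as an adjacency list (a Python stand-in; the plan encoding and names are ours, introduced for illustration):

```python
# Sketch: classify the plan structure to pick a data-type hypothesis.
# The plan is a DAG given as {node: [(action, child), ...]}, rooted at `root`.

def paths(plan, node):
    """All root-to-leaf action sequences starting at `node`."""
    if not plan.get(node):
        return [[]]
    return [[act] + rest for act, child in plan[node]
            for rest in paths(plan, child)]

def hypothesis(plan, root):
    ps = paths(plan, root)
    if len(ps) == 1:
        return "sequence"                    # no branching
    if all(sorted(p) == sorted(ps[0]) for p in ps):
        return "set"                         # identical action sets per path
    return "list-or-compound"                # tree or general DAG

linear = {"g": [("unstack O2", "s1")], "s1": [("unstack O1", "s2")], "s2": []}
# hypothesis(linear, "g") -> "sequence"
```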

Explicit order over the domain objects is achieved by replacing object names by functional expressions (selector functions) referring to objects in an indirect way. Referring to objects by functions f(t), where t is a ground term (a constant, such as the bottom-element, or a functional expression over a constant), makes it possible to deal with infinite domains while still using finite, compact representations (Geffner, 2000). For example, (pick oset) can represent a specific object in an object list of arbitrary length.

8.2.3 Introducing Situation Variables

In the final step of plan transformation, the remaining literals of each state and the actions are extended by situation variable s as additional argument, and the plan is rewritten as a conditional expression as defined in definition 8.1.1. An abbreviated version of the rewriting-algorithm is given in appendix A.9.^4

4 This final step is currently only implemented for linearized plans.


8.3 Plans over Sequences of Objects

A plan which consists of a sequence of actions (without branching) is assumed to deal with a sequence of objects. For sequences, there must exist a single bottom element, which is identifiable from the top-level goal(s), together with the goal-predicate(s) as bottom-test. The total order over domain objects is defined over the arguments of the actions from the top (root) of the sequence to the leaf.

Definition 8.3.1 (Sequence). Data type sequence is defined as:

seq = ⊥ | succ(seq) with
null(seq) = true if seq = ⊥, false otherwise.

For a plan or sub-plan with hypothesized data type sequence, data type introduction works as described in the algorithm given in table 8.1. Note that inference is performed over a plan which is generated by our Lisp-implemented planning system. Therefore, expressions are represented within parentheses. For example, the literal representing that a block A is lying on another block B is represented as (on A B); the action to put block A on the table is represented as (puttable A). The restriction to a single top-level goal in the algorithm can be extended to multiple goals by a slight modification.^5

For the application of an inferred recursive control rule, an additional function for identifying successor-elements from the current state has to be provided, as described in algorithm 8.1. The program code for generating this function is given in figure 8.4. The first function (succ-pattern) represents a pre-defined pattern for the successor-function. For a concrete plan, this pattern must be instantiated in accordance with the predicates occurring in the plan. For example, for the clearblock problem (see chap. 3 and below), the successor function topof(x) = y is constructed from predicates on(y, x).
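In Python terms (the book's actual generator is the Lisp code in figure 8.4), an instantiated successor function of this kind reads the state literals directly:

```python
# Sketch: a successor function instantiated from state literals, following
# the rule succ(x) = y if (on y x) is a literal of the state -- "the block
# lying on top of block x" (cf. the instantiated succ-pattern of fig. 8.4).

def succ(x, state):
    for literal in state:
        if literal[0] == "on" and literal[2] == x:
            return literal[1]
    return None

state = [("on", "O1", "O2"), ("on", "O2", "O3"), ("clear", "O3")]
# succ("O3", state) -> "O2";  succ(succ("O3", state), state) -> "O1"
```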

Unstacking Objects. A prototypical example for plans over a sequence of objects is unstacking objects -- either to put all objects on the ground or to clear a specific object located somewhere in the stack. Such a planning domain was specified by clearblock as presented in section 3.1.4.1 in chapter 3. Here we use a slightly different domain, omitting the predicate ontable and with operator puttable renamed unstack. The domain specification and a plan for unstack are given in figure 8.5.

The protocol of plan-transformation is given in figure 8.6. After identifying the data type sequence, the crucial step is the introduction of the successor-function: (succ x) = y if (on y x), which represents the "block lying on top of

5 Data type sequence must be introduced as first step. If the sequence of objects in the plan is identified, it is also identified which predicate characterizes the elements of this sequence. The top-level goals must include this predicate with the bottom-element as argument, and consequently it can be selected as "bottom-test" predicate.


Table 8.1. Introducing Sequence

If the plan starts at level 0:
  If the plan is not a single step:
    If there is a single top-level goal (g a1 ... an), set bottom-test to g, else fail.
    Set type to seq.
    Generate the sequence:
      Collect the argument-tuple of each action along the path from the root to the leaf.
      If the tuple consists of a single argument a1, keep it; otherwise, remove all arguments ai which are constant over the sequence.
      If the sequence consists of single elements and if each element occurs as argument of g, proceed with sequence = (e0 ... em) and set bottom to e0, else fail.
    Construct an association list ((e1 (succ e0)) ... (em (succ^m e0))).
    For (e0, e1) check whether the state on level 1 contains a predicate q(args) with e0, e1 ∈ args at positions (pos e0 1 q) and (pos e1 1 q); if yes, proceed, else fail.
    For each (ei, ei+1) of the sequence with i = 1, ..., m check whether q(args) with ei, ei+1 ∈ args exists at level i+1 with pos(ei, i+1, q) = (pos e0 1 q) = pi and pos(ei+1, i+1, q) = pos(e1, 1, q) = pj.
    If yes, generate a function (succ ei) = ej if (q args) with ei at pi in q and ej at pj in q, else fail.
    Introduce data type sequence into the plan:
      For each state, keep only the bottom-test predicate (g a1 ... an).
      Replace arguments of g and of actions by (succ^i e0) in accordance with the association list.
  If the plan is a single step: identify bottom-test and bottom as above; reduce states to the bottom-test predicate.
If the plan starts at a level > 0: an "intermediate" goal must be identified; afterwards, proceed as above.


; pattern for getting the succ of a constant
; pred has to be replaced by the predicate-name of the rewrite-rule
; x-pos has to be replaced by a list-selector (position of x in pred)
; y-pos ditto (position of y = (succ x) in pred)
(setq succ-pattern '(defun succ (x s)
  (cond ((null s) nil)
        ((and (equal (first (car s)) pred)
              (equal (nth x-pos (car s)) x))
         (nth y-pos (car s))
        )
        (T (succ x (cdr s)))
  )))

; use a pattern for calculating the successor of a constant from a
; planning state (succ-pattern) and replace the parameters for the
; current problem
; this function has to be saved so that the synthesized program
; can be executed
(defun transform-to-fct (r)
  ; r: ((pred ...) (y = (succ x)))
  ; pred = (first (car r))
  ; find variable-names x and y and find their positions in (pred ...)
  ; replace pred, x-pos, y-pos
  (setq r-pred (first (car r)))
  (setq r-x-pos (position (second (third (second r))) (first r)))
  (setq r-y-pos (position (first (second r)) (first r)))
  (nsubst (cons 'quote (list r-pred)) 'pred
          (nsubst r-x-pos 'x-pos (nsubst r-y-pos 'y-pos succ-pattern)))
)

Fig. 8.4. Generating the Successor-Function for a Sequence

D = { ((clear O1) (clear O2) (clear O3)),
      ((on O2 O3) (clear O1) (clear O2)),
      ((on O1 O2) (on O2 O3) (clear O1)) }
G = {(clear O3)}
O = {unstack} with
(unstack ?x)
  PRE {(clear ?x), (on ?x ?y)}
  ADD {(clear ?y)}
  DEL {(on ?x ?y)}

((CLEAR O1) (CLEAR O2) (CLEAR O3))
  | (UNSTACK O2)
((ON O2 O3) (CLEAR O1) (CLEAR O2))
  | (UNSTACK O1)
((ON O1 O2) (ON O2 O3) (CLEAR O1))

Fig. 8.5. The Unstack Domain and Plan


block x". While such functions are usually pre-defined (Manna & Waldinger, 1987; Geffner, 2000; Wysotzki & Schmid, 2001), we can infer them from the universal plan. The "constructively" rewritten plan where data type sequence is introduced is given in figure 8.7. The transformation information stored for the finite program is given in appendix C.3.

+++++++++ Transform Plan to Program ++++++++++
1st step: decompose by operator-type
Single Plan
(SAVE SUBPLAN P1)
-----------------------------------------------
2nd step: Identify and introduce data type
(INSPECTING P1) Plan is linear
(SINGLE GOAL-PREDICATE (CLEAR O3))
Plan is of type SEQUENCE
(SEQUENCE IS O3 O2 O1)
(IN CONSTRUCTIVE TERMS THAT 'S (O2 (SUCC O3)) (O1 (SUCC (SUCC O3))))
Building rewrite-cases:
(((ON O2 O3) (O2 = (SUCC O3))) ((ON O1 O2) (O1 = (SUCC O2))))
(GENERALIZED RULE IS (ON |?x1| |?x2|) (|?x1| = (SUCC |?x2|)))
Storage as LISP-function
Reduce states to relevant predicates (footprint)
((CLEAR O3))
((CLEAR (SUCC O3)))
((CLEAR (SUCC (SUCC O3))))
-----------------------------------------------
3rd step: Transform plan to program
Show Plan as Program? y
---------------------------------------------------------

Fig. 8.6. Protocol for Unstack

The finite program term which can be constructed from the term given in figure 8.7 is:

(IF (CLEAR O3 S)
    S
    (UNSTACK (SUCC O3 S)
        (IF (CLEAR (SUCC O3 S) S)
            S
            (UNSTACK (SUCC (SUCC O3 S) S)
                (IF (CLEAR (SUCC (SUCC O3 S) S) S)
                    S
                    OMEGA))))).

An RPS (see chap. 7) generalizing this unstack term is G = ⟨(unstack-all o s) = (if (clear o s) s (unstack (succ o s) (unstack-all (succ o s) s)))⟩ with t0 = (unstack-all o s) for some constant o and some set of literals s. The executable program is given in figure 8.8.

((CLEAR O3))
  | (UNSTACK (SUCC O3))
((CLEAR (SUCC O3)))
  | (UNSTACK (SUCC (SUCC O3)))
((CLEAR (SUCC (SUCC O3))))

Fig. 8.7. Introduction of Data Type Sequence in Unstack

Plans consisting of a linear sequence of operator applications over a sequence of objects in general result in generalized control knowledge in the form of linear recursive functions. In standard programming domains (over numbers, lists), a large group of problems is solvable by functions of this recursion class. Examples are given in table 8.2.

Table 8.2. Linear Recursive Functions

(unstack-all x s) == (if (clear x) s (unstack (succ x) (unstack-all (succ x) s)))
(factorial x) == (if (eq0 x) 1 (mult x (factorial (pred x))))
(sum x) == (if (eq0 x) 0 (plus x (sum (pred x))))
(expt m n) == (if (eq0 n) 1 (mult m (expt m (pred n))))
(length l) == (if (null l) 0 (succ (length (tail l))))
(sumlist l) == (if (null l) 0 (plus (head l) (sumlist (tail l))))
(reverse l) == (if (null l) nil (append (reverse (tail l)) (list (head l))))
(append l1 l2) == (if (null l1) l2 (cons (head l1) (append (tail l1) l2)))

It is simple and straightforward to generalize over linear plans. For plans of this problem class, our approach automatically provides complete and correct control rules. As a result, planning can be avoided completely, and the transformation sequence for solving an arbitrary problem involving an arbitrary number of objects can be computed in linear time!
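Transliterating the generalized unstack-all rule into Python (the book's executable version is the Lisp program of figure 8.8; the state encoding here is ours), the induced rule indeed handles stacks larger than the three-block plan it was folded from:

```python
# Sketch of the induced linear recursive control rule for unstack, applied
# to a four-block state -- beyond the three-block plan it was folded from.

def succ(x, s):
    """Block lying on top of x, if any: succ(x) = y for (on y x) in s."""
    for lit in s:
        if lit[0] == "on" and lit[2] == x:
            return lit[1]
    return None

def clear(x, s):
    return ("clear", x) in s

def unstack(x, s):
    """Delete (on x y), add (clear y): x is put on the table."""
    for lit in s:
        if lit[0] == "on" and lit[1] == x:
            return [("clear", lit[2])] + [l for l in s if l != lit]
    return s

def unstack_all(o, s):
    if clear(o, s):
        return s
    return unstack(succ(o, s), unstack_all(succ(o, s), s))

s4 = [("on", "O1", "O2"), ("on", "O2", "O3"), ("on", "O3", "O4"),
      ("clear", "O1")]
# unstack_all("O4", s4) clears O4: ("clear", "O4") is in the result
```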

8.4 Plans over Sets of Objects

A plan which has a single root and a single leaf, where the sets of actions for each path from root to leaf are identical, is assumed to deal with a set of objects. For sets, there has to be a complex data object which is a set of elements (constants of the planning domain), a bottom-element which is


; Complete recursive program for the UNSTACK problem
; call (unstack-all <oname> <state-description>)
; e.g. (unstack-all 'O3 '((on o1 o2) (on o2 o3) (clear o3)))
; --------------------------------------------------------------------
; generalized from finite program generated in plan-transform
(defun unstack-all (o s)
  (if (clear o s)
      s
      (unstack (succ o s) (unstack-all (succ o s) s))
  ) )

(defun clear (o s)
  (member (list 'clear o) s :test 'equal)
)

; inferred in plan-transform
(DEFUN SUCC (X S)
  (COND ((NULL S) NIL)
        ((AND (EQUAL (FIRST (CAR S)) 'ON)
              (EQUAL (NTH 2 (CAR S)) X))
         (NTH 1 (CAR S)))
        (T (SUCC X (CDR S)))))

; explicit implementation of "unstack"
; in connection with DPlan: apply unstack-operator on state s and return
; the new state
(defun unstack (o s)
  (cond ((null s) nil)
        ((and (equal (first (car s)) 'on) (equal (second (car s)) o))
         (cons (cons 'clear (list (third (car s)))) (cdr s))
        )
        (T (cons (car s) (unstack o (cdr s))))
  ))

Fig. 8.8. LISP-Program for Unstack


whi h has to be an inferred predi ate over the set, and two sele tors one for

an element of a set (pi k) and one for a set without some xed element (rst).

The partial order over sets with maximally three elements is given in gure

8.9 this order orresponds to the sub-plans for unload or load in the ro ket

domain (se t. 3.1.4.2 in hap. 3). If pi k and rst are dened deterministi ally

(e. g., by list-sele tors), the partial order gets redu ed to a total order.

Fig. 8.9. Partial Order of Set (figure: Hasse diagram from the singletons e1, e2, e3 over the two-element sets {e1,e2}, {e1,e3}, {e2,e3} up to {e1,e2,e3})

Definition 8.4.1 (Set). Data type set is defined as:

set = ⊥ | c(e, set) with
empty(set) = true if set = ⊥, false otherwise
pick(set) = some e ∈ set
rst(set) = set \ e for some e ∈ set.

Because the complex data object is inferred from the top-level goals (given in the root of the plan), we typically infer an "inverted" order with the largest set (containing all objects which can be elements of the set) as bottom element and c(e, set) == rst(set) as a de-structor.

For a plan or sub-plan with hypothesized data type set, data type introduction works as described in the algorithm given in table 8.3. Because we collapse such a plan to one path of a set of paths, this algorithm completely contains the case of sequences (tab. 8.1). Note that collapsing plans with underlying data type set corresponds to the idea of "commutative pruning" as discussed in (Haslum & Geffner, 2000).

as dis ussed in (Haslum & Gener, 2000).

The program code for make-co, generating the bottom-test, and for pick and rst referred to in algorithm 8.3 is given in figure 8.10. As described for data type sequence above, the patterns of the selector functions are pre-defined and the concrete functions are constructed by instantiating these patterns with information gained from the given plan. Introduction of a new predicate with


Table 8.3. Introducing Set

Collapse plan to one path (implemented as: take the "leftmost" path).
Generate a complex data object, like Generate Sequence in tab. 8.1:
  sequence = (e1 ... em) is interpreted as set and bottom is instantiated with CO = (e1 ... em).
  A function for generating CO from the top-level goals (make-co) is provided.
A generalized predicate (g* args) with CO ∈ args is constructed by collecting all predicates (g args) with o ∈ CO ∧ o ∈ args and replacing o by CO. For a plan starting at level 0, g has to be a top-level goal and all top-level goals have to be covered by g*; for other plans, g has to be in the current root node.
A function for testing whether g* holds in a state is generated as bottom-test.
Introduce the data type into the plan:
  For each state keep only the bottom-test predicate (g* args) with CO' ⊆ CO ∈ args.
  Introduce set-selectors for arguments of g* by replacing CO' by (rst^i CO) and afterwards replacing action-arguments by (pick (rst^i CO)), with (rst^i CO) occurring as argument of g* of the parent-node.
  (pick set) and (rst set) are predefined by car and cdr.

a complex argument corresponds to the notion of "predicate-invention" in Wysotzki and Schmid (2001).^6

Oneway Transportation. A typical example for a domain with a set as underlying data type are simple transportation domains, where some objects must be loaded into a vehicle which moves them to a new location. Such a domain is rocket (see sect. 3.1.4.2 in chap. 3). The rocket plan is in a first step levelwise decomposed into sub-plans with uniform actions (see fig. 8.11). Data type inference is done for each sub-plan. For both sub-plans (unload-all and load-all) there is a single root and a single leaf node, and the sets of actions along all (six) possible paths from root to leaf are equal. From this observation we can conclude that the actual sequence in which the actions are performed is irrelevant, that is, the underlying data structure of both sub-plans is a set. Consequently, the (sub-)plan can be collapsed to one path.

Introduction of the data type set involves the following steps:

- The initial set is constructed by collecting all arguments of the actions along the path from root to leaf. That is, the initial value for the set can be I = {O1, O2, O3} when the left-most path of a sub-plan is kept. (If the involved operator has more than one argument, for example (unload <obj> <place>), the constant arguments are ignored.)
- A "generalized" predicate corresponding to the empty-test of the data type is invented by generalizing over all literals in the root node with an element of the initial set I as argument. That is, for the unload-all sub-plan, the new predicate is p* = (at* ⟨objSet⟩ B) with

^6 This kind of predicate invention should not be confused with predicate invention in ILP (see sect. 6.3.3 in chap. 6). In ILP a new predicate must necessarily be introduced if the current hypothesis covers some negative examples.

8.4 Plans over Sets of Objects     247

; make-co
; -------
; collect all objects covered by goal-preds p corresponding to the
; new predicate p*
; the new complex object is referred to as CO
; f.e. for rocket: (at o1 b) (at o2 b) --> CO = (o1 o2)
; how to make the complex object CO: use the pattern of the new
; predicate to collect all objects from the current top-level goal
;; use for call of main function, f.e.: (rocket (make-co ..) s)
(defun make-co (goal newpat)
  (cond ((null goal) nil)
        ((string< (string (caar goal)) (string (car newpat)))
         (cons (nth (position 'CO newpat) (car goal))
               (make-co (cdr goal) newpat)))
))

; rest(set) (implemented as for lists, otherwise pick/rest would not
; --------- be guaranteed to be really complements)
;; named rst because rest is built-in
(defun rst (co)
  (cdr co)
)

; pick(set)
(defun pick (co)
  (car co)
)

; is newpred true in the current situation?
;; used for checking this predicate in the generalized function
;; f.e. (at* CO Place s)
;; newpat has to be replaced by the newpred name (f.e. at*)
;; pname has to be replaced by the original pred name (f.e. at)
(setq newpat '(defun newp (args s)
  (cond ((null args) T)
        ((and (null s) (not (null args))) nil)
        ((and (equal (caar s) pname)
              (intersection args (cdar s)))
         (newp (set-difference args (cdar s)) (cdr s)))
        (T (newp args (cdr s)))
  ))
)

(defun make-npfct (patname gpname)
  (subst patname 'newp (nsubst (cons 'quote (list gpname)) 'pname newpat))
)

Fig. 8.10. Functions Inferred/Provided for Set


[Figure: the rocket plan decomposed into three sub-plans: (a) "unload-all", a DAG from the goal state ((AT O1 B) (AT O2 B) (AT O3 B) (AT R B)) over all (UNLOAD Oi) sequences; (b) "move-rocket", a single (MOVE-ROCKET) step; (c) "load-all", a DAG over all (LOAD Oi) sequences ending in states with all objects in the rocket]

Fig. 8.11. Sub-Plans of Rocket

(at* ⟨objSet⟩ B) =
    true   if ∀o ∈ objSet: (at o B) holds in the current state
    false  otherwise.

The original literals are replaced by p*.

- The arguments of the generalized predicate are rewritten using the predefined rest-selector. We have the following replacements for the unload sub-plan (given from root to leaf):

  (at* {O1, O2, O3} B) == (at* {O1, O2, O3} B)
  (at* {O1, O2} B) → (at* (rst {O1, O2, O3}) B)
  (at* {O1} B) → (at* (rst (rst {O1, O2, O3})) B)
  (at* { } B) → (at* (rst (rst (rst {O1, O2, O3}))) B).

- The arguments of the actions are rewritten using the predefined pick-selector:

  (unload O1) → (unload (pick {O1, O2, O3}))
  (unload O2) → (unload (pick (rst {O1, O2, O3})))
  (unload O3) → (unload (pick (rst (rst {O1, O2, O3})))).

Note that we define pick and rst deterministically (e.g., as head/tail or last/butlast). Although it is irrelevant which object is selected next, it is necessary that rst returns exactly the set of objects without the currently picked one.
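The effect of this rewriting can be illustrated with a small Python sketch (the names are assumptions; the book's code is Lisp): with deterministic selectors, the nested pick/rst terms evaluate back exactly to the original action arguments.

```python
# Deterministic selectors (pick = head, rst = tail), as assumed above.
def pick(oset):
    return oset[0]

def rst(oset):
    return oset[1:]

CO = ["O1", "O2", "O3"]

# The rewritten arguments of the three unload actions:
args = [pick(CO), pick(rst(CO)), pick(rst(rst(CO)))]
print(args)  # ['O1', 'O2', 'O3']
```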


The transformed sub-plan for unload-all is given in figure 8.12.a. After introducing a data type into the plan, there is only one additional step necessary to interpret the plan as a program: introducing a situation variable. Now the plan can be read as a nested conditional expression (see fig. 8.12.b).

[Figure: (a) the collapsed unload-all sub-plan, a chain of states (AT* (RST^i (O1 O2 O3)) B) connected by (UNLOAD (PICK (RST^i (O1 O2 O3)))) actions; (b) the same chain read as a nested conditional over the situation variable s, with Ω at the undefined leaf]

Fig. 8.12. Introduction of the Data Type Set (a) and Resulting Finite Program (b) for the Unload-All Sub-Plan of Rocket (Ω denotes "undefined")

Plan transformation for the rocket problem results in the following program structure

(rocket oset s) = (unload-all oset (move-rocket (load-all oset s)))

where unload-all and load-all are recursive functions which were generated by our folding algorithm from the corresponding finite programs. Note that the parameters involve only the set of objects; this information is the only one necessary for control, while the locations (A, B) and the transport vehicle (Rocket) are additional information necessary for the dynamics of plan construction. The protocol of plan transformation is given in figure 8.13.

Recursive functions for unloading and loading objects are:

unload-all(oset, s) = if(at*(oset, B, s),
                         s,
                         unload(pick(oset), unload-all(rst(oset), s)))

load-all(oset, s) = if(inside*(oset, Rocket, s),
                       s,
                       load(pick(oset), load-all(rst(oset), s)))

which are integrated in the "main" program (rocket oset s) given above.
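For concreteness, the synthesized control program can be sketched in Python (an assumption; the book's programs are Lisp terms), with a situation represented as a list of literal tuples:

```python
# Sketch of the synthesized rocket program; a situation s is a list of
# literal tuples such as ("at", "O1", "A"). The operator definitions here
# are simplified assumptions, not the planner's own.
def pick(oset): return oset[0]
def rst(oset): return oset[1:]

def at_star(oset, place, s):            # bottom-test of unload-all
    return all(("at", o, place) in s for o in oset)

def inside_star(oset, s):               # bottom-test of load-all
    return all(("in", o, "Rocket") in s for o in oset)

def load(o, s):
    return [("in", o, "Rocket") if lit == ("at", o, "A") else lit for lit in s]

def unload(o, s):
    return [("at", o, "B") if lit == ("in", o, "Rocket") else lit for lit in s]

def move_rocket(s):
    return [("at", "Rocket", "B") if lit == ("at", "Rocket", "A") else lit
            for lit in s]

def load_all(oset, s):
    return s if inside_star(oset, s) else load(pick(oset), load_all(rst(oset), s))

def unload_all(oset, s):
    return s if at_star(oset, "B", s) else unload(pick(oset), unload_all(rst(oset), s))

def rocket(oset, s):
    return unload_all(oset, move_rocket(load_all(oset, s)))

s0 = [("at", "O1", "A"), ("at", "O2", "A"), ("at", "O3", "A"),
      ("at", "Rocket", "A")]
print(rocket(["O1", "O2", "O3"], s0))
```

Note that the recursion mirrors the backward plan: the innermost call handles the last object picked, so the actual loading and unloading order is irrelevant, exactly as the set interpretation demands.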

The recursive functions for loading and unloading all objects in some arbitrary order, learned from the rocket problem, can be of use in many transportation problems, as for example the logistics domain^7 which is still

^7 The logistics domain defines problems where objects have to be transported from different places in and between different cities, using trucks within cities and planes between cities.


+++++++++ Transform Plan to Program ++++++++++
1st step: decompose by operator-type
Possible Sub-Plan, decompose...
(SAVE SUBPLAN #:P1)
Possible Sub-Plan, decompose...
(SAVE SUBPLAN #:P2)
Single Plan
(SAVE SUBPLAN #:P3)
(#:P1 (#:P2 (#:P3)))
-----------------------------------------------
2nd step: Identify and introduce data type
(INSPECTING #:P1) Plan is of type SET
Unify equivalent paths...
Introduce complex object (CO)
(CO IS (O1 O2 O3))
A function for generating CO from a goal is provided: make-co
Generalize predicate...
(NEW PREDICATE IS (#:AT* CO B))
Generate a function for testing the new predicate...
New predicate covers top-level goal -> replace goal
Replace basic predicates by new predicate...
Introduce selector functions...
((((O1 O2) (RST (O1 O2 O3))) ((O1) (RST (RST (O1 O2 O3))))
  (NIL (RST (RST (RST (O1 O2 O3))))))
 ((O3 (PICK (O1 O2 O3))) (O2 (PICK (RST (O1 O2 O3))))
  (O1 (PICK (RST (RST (O1 O2 O3)))))))
RST(CO) and PICK(CO) are predefined (as cdr and car).
(INSPECTING #:P2) Plan is linear
Plan consists of a single step
(SET ADD-PRED AS INTERMEDIATE GOAL (AT ROCKET B))
(INSPECTING #:P3) Plan is of type SET
Unify equivalent paths... [... see P1]
Generalize predicate...
(NEW PREDICATE IS (#:INSIDER* CO))
Generate a function for testing the new predicate...
New predicate is set as goal!
Replace basic predicates by new predicate...
Introduce selector functions... [... see P1]
-----------------------------------------------
3rd step: Transform plan to program
Show Plan(s) as Program(s)? y

Fig. 8.13. Protocol of Transforming the Rocket Plan


one of the most challenging domains for planning algorithms (see the AIPS planning competitions 1998 and 2000).

For the rocket domain our system can learn the complete and correct control rules. All problems of this domain can now be solved in linear time. A small flaw is that generalization-to-n assumes infinite capacity of the transport vehicle and does not take capacity into account as an additional constraint. To be more realistic, we have to include a resource variable^8 c for the load operator. The resulting load-all function must involve an additional condition:

load-all(oset, c, s) = if(or(eq0(c), at*(oset, B, s)),
                          s,
                          load(pick(oset), load-all(rst(oset), pred(c), s))).
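A Python sketch of this capacity-limited recursion (assumed names and a simplified bottom-test), where pred decrements the counter and eq0 tests it for zero:

```python
# Capacity-limited load-all: recursion stops when the counter c is
# exhausted or the object set is empty (standing in for the bottom-test).
def pick(oset): return oset[0]
def rst(oset): return oset[1:]

def load(o, s):
    return s + [("in", o, "Rocket")]

def load_all(oset, c, s):
    if c == 0 or not oset:       # eq0(c) or bottom-test
        return s
    return load(pick(oset), load_all(rst(oset), c - 1, s))

s1 = load_all(["O1", "O2", "O3"], 2, [])
print(s1)  # only two of the three objects are loaded
```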

A further extension of the domain would be to take into account different priorities for the objects to be transported. This would involve an extension of pick, always selecting the object with the highest priority; that is, the control function would follow a greedy algorithm.

A Lisp program representing the control knowledge for the rocket domain is given in appendix C.4, together with a short discussion about interleaving the inside and at predicates.

8.5 Plans over Lists of Objects

8.5.1 Structural and Semantic List Problems

List problems can be divided into two classes: (a) problems which involve no knowledge about the elements of the list, and (b) problems which involve such knowledge. Standard programming problems of the first class are, for example, reversing a list, flattening a list, or incrementing the elements of a list. Problems where some operation is performed on every element of a list can be characterized by the higher-order function (map f l). Other list problems which can be solved purely structurally are calculating the length of a list or adding the elements of a list of numbers. Such problems can be characterized by the higher-order function (reduce f b l). The unload and load problems discussed above fall into this class if each object involved has a unique name and if pick and rst are realized in a deterministic way. A third class of problems follows (filter p l), for example the functions member or odd-els. This class already involves some semantic knowledge about the elements of a list, represented by the predicate p in filter and by the equal-test in member. Table 8.4 illustrates structural list problems.

^8 Dealing with resource variables is possible with the DPlan system extended to function applications, see chap. 4. Currently we cannot generate disjunctive boolean operators in plan transformation.


Table 8.4. Structural Functions over Lists

(a) (map f l)    == (if (empty l) nil (cons (f (head l)) (map f (tail l))))
    (inc l)      == (if (empty l) nil (cons (succ (head l)) (inc (tail l))))
(b) (reduce f b l) == (if (empty l) b (f (head l) (reduce f b (tail l))))
    (sumlist l)  == (if (null l) 0 (plus (head l) (sumlist (tail l))))
    (rec-unload oset s) == (if (empty oset) s
                               (unload (pick oset) (rec-unload (rst oset) s)))
(c) (filter p l) == (if (empty l) nil
                        (if (p (head l)) (cons (head l) (filter p (tail l)))
                            (filter p (tail l))))
    (odd-els l)  == (if (empty l) nil
                        (if (odd (head l)) (cons (head l) (odd-els (tail l)))
                            (odd-els (tail l))))
    (member e l) == (if (empty l) nil
                        (if (equal (head l) e) e (member e (tail l))))

Structural list problems, as discussed for example in section 6.3.4.1 in chapter 6, can be dealt with by an algorithm nearly identical to the algorithm in table 8.3 dealing with sets (see tab. 8.5). Because the only relevant information is the length of a list, the partial order can be reduced to a total order (see fig. 8.14). Generating a total order results in linearizing the problem (Wysotzki & Schmid, 2001). The extraction of a unique path in the plan is only slightly more complicated than for sets and is discussed below.

Definition 8.5.1 (List). Data type list is defined as:

list = nil | cons(e, list) with

null(list) = true if list = nil, false otherwise
head(cons(e, list)) = e
tail(cons(e, list)) = list

Table 8.5. Introducing List

- *Collapse plan to one path. (discussed in detail below)
- Generate a complex data object CO = (e1 ... em).
- A function for generating CO from the top-level goals (make-co) is provided.
- A generalized predicate (g* args) with CO ∈ args is constructed by collecting all predicates (g args) with o ∈ CO ∧ o ∈ args and replacing o by CO. For a plan starting at level 0, g has to be a top-level goal and all top-level goals have to be covered by g; for other plans, g has to be in the current root node.
- A function for testing whether g holds in a state is generated as bottom-test.
- Introduce the data type into the plan:
  - For each state keep only the bottom-test predicate (g args) with CO′ ⊆ CO ∈ args.
  - Introduce list-selectors by replacing CO′ by (tail^i CO) and afterwards replacing action-arguments by (head (tail^i CO)) with (tail^i CO) occurring as argument of g of the parent node.


[Figure: (a) the partial order of flat lists over numbers, branching from nil to (1), (2), (3), (4), ... and on to (1 1), (1 2), ..., (1 1 1), ...; (b) the corresponding total order nil, (w), (w w), (w w w), (w w w w) when only list length is relevant]

Fig. 8.14. Partial Order (a) and Total Order (b) of Flat Lists over Numbers

While functions over lists involving only structural knowledge are easy to infer with our approach, as shown for the rocket domain, this is not true for the second class of problems. A prototypical example for this class is sorting: for sorting a list, knowledge about which element is smaller (or larger) than another is necessary. That synthesizing functions involving semantic knowledge is notoriously hard is discussed at length in the ILP literature (Flener & Yilmaz, 1999; Le Blanc, 1994), and inductive functional program synthesis typically is restricted to structural problems (Summers, 1977).

Currently, we approach transformation for such problems by the steps presented in the algorithm in table 8.6. We do not claim that this strategy is applicable to all semantic problems over lists. We developed this strategy by analyzing and implementing plan transformation for the selection sort problem, which is described below. We will describe how semantic knowledge can be "detected" by analyzing the structure of a plan. For the future, we plan to investigate further problems and try to find a strategy which covers as large a class as possible.

8.5.2 Synthesizing 'Selection-Sort'

8.5.2.1 A Plan for Sorting Lists. The specification for sorting lists with four elements is given in table 3.6. In the standard version of DPlan described


Table 8.6. Dealing with Semantic Information in Lists

- Extracting a path (identifying list structure):
  - Extracting a minimal spanning tree from the DAG: The plan is a DAG, but the structure does not fulfill the set-criterion defined above. Therefore, we cannot just select one path, but we have to extract one deterministic set of transformation sequences. For purely structural list problems every minimal spanning tree is suitable for generalization. For problems involving semantic knowledge only some of the minimal spanning trees can be generalized.
  - Regularization of the tree: Generating plan levels with identical actions by shifting nodes downward in the plan and introducing edges with "empty" or "id" actions.
  - Collapsing the tree: Unifying identical subtrees which are positioned at the same level of the tree.
- If there are still branches left (identifying a semantic criterion for elements):
  - Identify a criterion for classifying elements.
  - Unify branches by introducing a list as argument into the operator, using the criterion as selection function.
- Proceed using algorithm 8.5.

in this report we only allow for ADD-DEL effects and we do not discriminate between static predicates (such as greater-than, which are not affected by operator application) and fluid predicates. Note that it is enough to specify the desired position of three of the four list elements in the goal, because positioning three elements determines the position of the fourth. A more natural specification for DPlan with functions is given in figure 4.10. This second version allows for plan construction without using a set of predefined states. Information about which element is at which position in the list or which number is greater than another can simply be "read" from the list by applying predefined (Lisp) functions. The definition of the swap operator determines whether the problem is solved by bubble sort or by selection sort. In the first case, swap is applied to neighboring elements where the first is greater than the other; in the second case, the first condition is omitted. Note that for finding operator sequences for sorting a list in ascending order by backward planning, the greater condition is reversed!

The universal plan is given in figure 3.7. For sorting a list of four elements, there exist 24 states. Swapping elements with the restrictions given for selection sort results in 72 edges. Sorting lists of three elements is illustrated in appendix C.5.

8.5.2.2 Different Realizations of 'Selection Sort'. To make the transformation steps more intuitive, we first discuss functional variants of selsort (see tab. 8.7): The first variant is a standard implementation with two nested for-loops. The outer loop processes the list l (more exactly, the array) from start (s) to end (e); the inner loop (function smpos) searches for the position of the smallest element in l, starting at index (1 + s) where s is the current index of the outer loop. The for-loops are realized as tail recursions.


Table 8.7. Functional Variants for Selection-Sort

; (1) Standard: Two Tail-Recursions
(defun selsort (l s e)
  (if (= s e)
      l
      (selsort (swap s (smpos s (1+ s) e l) l) (1+ s) e)
))

(defun smpos (s ss e l)
  (if (> ss e)
      s
      (if (> (nth s l) (nth ss l))
          (smpos ss (1+ ss) e l)
          (smpos s (1+ ss) e l)
)))

; (2) Realization as Linear Recursion
; c is ``counter'', starting with last list-position e
(defun lselsort (l c s e)
  (if (sorted l s c)
      l
      (swap* (1- c) c e (lselsort l (1- c) s e))
))

(defun swap* (s from to l)
  (swap s (smpos s from to l) l)
)

(defun sorted (l from c)
  (equal (subseq l from c) (subseq (sort (copy-seq l) '<) from c))
)

; (3) Explicit definition of order (gl is list of pos-key pairs)
; e.g. gl = ((3 4) (2 3) (1 2) (0 1)) --> sorted l = (1 2 3 4)
(defun llselsort (gl l)
  (if (lsorted gl l)
      l
      (lswap* (car gl) (llselsort (cdr gl) l))
))

(defun lsorted (gl l)
  (cond ((null gl) T)
        ((equal (second (car gl)) (car l)) (lsorted (cdr gl) (cdr l)))
        (T nil)
))

(defun lswap* (g l)
  (swap (first g) (position (second g) l) l)
)


There is some conflict between a tail-recursive structure, where some input state is transformed step-wise into the desired output, and a plan representing a sequence of actions from the goal to some initial state. Our definition of a plan as program implies a linear recursion (see def. 8.1.1). The second variant of selection sort is a linear recursion: For a list l with starting index s and last index e, it is checked whether l is already sorted from s to a current index c, which is initialized with e and step-wise reduced. If yes, the list is returned; otherwise, it is determined which element at positions c, ..., e should be swapped to position c − 1. The inner loop is replaced by an explicit selector function. We will see below that this corresponds to selecting one of several elements represented at different branches on the same level of the plan.

The second variant corresponds closely to the function we can infer from the plan; it can be inferred from a plan generated by using function application.^9 A plan constructed by manipulating literals contains no knowledge about numbers as indices in a list and order relations between numbers. Looking back at plan transformation for rocket, we introduced a "complex object" oset which guided action application in the recursive unload and load functions. Thus, for sorting, we can infer a complex object from the top-level goals which determines which number should be at which position of the list. The third variant gives an abstract representation of the function which we can infer automatically from the plan in figure 3.7. This function is more general than standard selection sort, because now lists can be sorted in accordance with any arbitrary order relation specified in parameter gl! That is, it does not rely on the static predicate gt(x, y) which represents the order relation of the first n natural numbers (see fig. 3.6).

8.5.2.3 Inferring the Function-Skeleton. The plan given in figure 3.7 is not decomposed by the initial decomposition step, because the complete plan involves only one operator, swap. The plan is a DAG, but it does not fulfill the criterion for sets of objects. Therefore, extraction of one unique set of optimal plans is done by picking one minimal spanning tree from the plan (see sect. 3.2 in chap. 3).

The plan for sorting lists with three elements (see appendix C.5) consists of 3! = 6 nodes and 9 edges. It contains 9 different minimal spanning trees (see also appendix). Not all of them are suitable for generalization: Three of the nine trees can be "regularized" (see next transformation step below). If we have no information for picking a suitable minimal spanning tree, we have to extract a tree, try to regularize it, and backtrack if regularization fails. For 9 candidates with 3 suitable solutions, this is feasible. But for the sorting of 4-element lists, there exist 24 nodes, 72 edges, and more than a million possible minimal spanning trees, with only a small fraction of them being regularizable (see appendix A.10 for the calculation of the number of minimal spanning trees).

^9 Our investigation of plan transformation for plans involving function application is still at the beginning.


Currently, we pre-calculate the number of trees contained in the DAG, and if the number exceeds a given threshold t (say 500), we only generate the first t trees. One possible but unsatisfying remedy would be to parallelize this step, which is possible. We plan to investigate whether tree-extraction and regularization can be integrated into one step. This would solve the problem in an elegant way. One of the regularizable minimal spanning trees for sorting four elements is given in figure 8.15. For better readability, the state descriptions in the nodes are given as lists. In the original plan, each list is described by a set of literals. For example, [1 2 3 4] is a short notation for (is p1 1) (is p2 2) (is p3 3) (is p4 4) where (is x y) means that position x has content y (see sect. 3.1.4.3 in chap. 3 for details).

The algorithm for extracting a minimal spanning tree from a DAG is given in table 8.8; the corresponding program fragment is given in appendix A.11.

Table 8.8. Extract an MST from a DAG

Input: a DAG with edges (n m) annotated by the level of the node
Initialization: t == root(dag) (minimal spanning tree)
For l = 0 to maxlevel − 1 DO:
- Partition all edges (n m) from nodes n at level l to nodes m at level l + 1 into groups with equal end-node m:
  p = {{(n1 m1) (n2 m1) ... (nk m1)}, ..., {(n′1 ml) ... (n′k′ ml)}}.
- Calculate the Cartesian product between the sets in p: Cart = p1 × p2 × ... × pl.
- Generate trees t′ = t ∪ c for all c ∈ Cart.
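The construction in table 8.8 can be sketched in Python (the representation is an assumption): each non-root node chooses one of its incoming edges, and the spanning trees are the Cartesian product of these choices.

```python
from itertools import product

# Enumerate the spanning trees of a leveled DAG: every node below the
# root keeps exactly one of its incoming edges.
def spanning_trees(nodes_by_level, edges):
    choices = []
    for level in nodes_by_level[1:]:        # the root keeps no incoming edge
        for m in level:
            choices.append([(n, m2) for (n, m2) in edges if m2 == m])
    return [set(c) for c in product(*choices)]

# Tiny example: root r, middle nodes a and b, leaf c reachable from both.
nodes = [["r"], ["a", "b"], ["c"]]
edges = [("r", "a"), ("r", "b"), ("a", "c"), ("b", "c")]
print(len(spanning_trees(nodes, edges)))  # 2
```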

The original plan represented how each of the 24 possible lists over four (different) numbers can be transformed into the desired goal (the sorted list) by the set of all possible optimal sequences of actions. The minimal spanning tree gives a unique sequence of actions for each state. In the minimal spanning tree given in figure 8.15, the action sequences for sorting lists by shifting the smallest element to the left are given.

For collapsing a tree into a single path, this tree has to be regular^10:

Definition 8.5.2 (Regular Tree). An edge-labeled tree t with edges (n m) from nodes n to nodes m is regular if for each level l = 0 ... maxlevel − 1 the subtrees for each node n, {(n m1), (n m2), ..., (n mk)}, consist of id-identical label-sets with label ≡ label′ if label = label′ or label = id.
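A small Python sketch of the regularity test (an interpretation of definition 8.5.2 in which "id" is treated as a neutral label; the encoding is an assumption):

```python
# Two label-sets are id-identical if they agree once "id" is ignored.
def id_identical(ls1, ls2):
    return {x for x in ls1 if x != "id"} == {x for x in ls2 if x != "id"}

# levels: for each level, the label-sets of the sibling subtrees there.
def regular(levels):
    return all(id_identical(ls, level[0]) for level in levels for ls in level)

# Bottom level of the regularized selsort tree (cf. fig. 8.16): one subtree
# carries an extra id-edge, which does not destroy regularity.
level = [{"(S P1 P2)", "(S P1 P3)", "(S P1 P4)", "id"},
         {"(S P1 P2)", "(S P1 P3)", "(S P1 P4)"}]
print(regular([level]))  # True
```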

For a given minimal spanning tree it can be tried to transform it into a regular tree, using the algorithm given in table 8.9. The program fragment for tree regularization is given in appendix A.12. A tree is regularized by pushing all edges labeled with actions occurring also on the next level of the tree down to this level. The starting node of such an edge is "copied" and

^10 This definition of regular trees is similar to the definition of irrelevant attributes in decision trees (Unger & Wysotzki, 1981): If a node has only identical subtrees, the node and all but one subtree are eliminated from the tree.


[Figure: a minimal spanning tree over the 24 permutations of [1 2 3 4], rooted in the sorted list, with swap actions (S Pi Pj) as edge labels; the tree sorts each list by shifting the smallest element to the left]

Fig. 8.15. A Minimal Spanning Tree Extracted from the SelSort Plan


an id-edge is introduced between the original and the copied starting node. Note that an edge can be shifted more than once. If the result is a regular tree according to definition 8.5.2, plan transformation proceeds; otherwise, it fails.

Table 8.9. Regularization of a Tree

Input: an edge-labeled tree
For l = 0 to maxlevel − 2 DO:
- Construct a label-set LS_l for all edges (n m) from nodes n on level l to nodes m on level l + 1, and a label-set LS_{l+1} for all edges (o p) from nodes o on level l + 1 to nodes p on level l + 2.
- IF LS_{l+1} ⊂ LS_l, shift all edges (n_s m_s) ∈ LS_l ∩ LS_{l+1} one level down and introduce edges (n_s n_s) with label "id" from level l to the shifted nodes n_s on level l + 1.
Test if the resulting tree is regular.
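The shift step can be sketched in Python (the tree encoding and the one-level scope are assumptions): an edge whose label also occurs on the next level is pushed down behind a fresh "id" edge to a copy of its start node.

```python
# A tree is (node, children) with children = [(edge_label, subtree), ...].
def shift(tree):
    node, children = tree
    # labels occurring on the next level, i.e. on edges out of the children
    below = {lab for (_, (_, grandch)) in children for (lab, _) in grandch}
    new_children = []
    for lab, sub in children:
        if lab in below:                  # push the edge one level down
            new_children.append(("id", (node, [(lab, sub)])))
        else:
            new_children.append((lab, sub))
    return (node, new_children)

# In the selsort tree, (S P3 P4) also labels an edge one level deeper,
# so the left edge is shifted and replaced by an id-edge (cf. fig. 8.16).
t = ("[1 2 3 4]",
     [("(S P3 P4)", ("[1 2 4 3]", [])),
      ("(S P2 P3)", ("[1 3 2 4]", [("(S P3 P4)", ("[1 3 4 2]", []))]))])
print(shift(t)[1][0])
```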

The regularized version of the minimal spanning tree for selsort is given in figure 8.16 (again with abbreviated state descriptions). The structure of the recursion to be inferred is now already visible in the regularized tree: The "parallel" subtrees have identical edge-labels and the states share a large overlap. The actions in the bottom level of the tree all swap the element on position one with some element further to the right in the list (swap(p1, x)). For the resulting states, the first element is already the smallest element of the list ([1 x y z]). On the next level, all actions swap the element on position two with some element further to the right, and so on. A slightly different method for regularization, also with the goal of plan linearization, is presented in Wysotzki and Schmid (2001).

[Figure: the regularized spanning tree for selsort; id-edges lead to copied nodes such as [(ID) 1 2 4 3] and [(ID ID) 1 2 3 4], and the parallel subtrees at each level carry identical swap labels]

Fig. 8.16. The Regularized Tree for SelSort

8.5.2.4 Inferring the Selector Function. Although the selsort plan could be transformed successfully to a regular tree with identical subtrees, the plan is still not linear. Considering the nodes at each planning level, these nodes


share a common sub-set of literals. That is, at each level, the commonalities of the states can be captured by calculating the intersection of the state descriptions. For the branches from one node it holds that their ordering is irrelevant for obtaining the action sequence that transforms a given state into the goal state. Furthermore, the names of all actions are identical, and at each level the first argument of swap is identical. Putting this information together, we can represent the levels of the regularized tree as:

((is p1 1) (is p2 2) (is p3 3))   (swap p3 {p4})
((is p1 1) (is p2 2))             (swap p2 {p3, p4})
((is p1 1))                       (swap p1 {p2, p3, p4}).

From the instantiations of the second argument we can construct a hypothetical complex object: CO_S = {p4} < {p3, p4} < {p2, p3, p4}. Note that the numbers associated with positions are not recognized as numbers. Another minimal spanning tree extracted from the plan would result in a different pattern, for example {p2} < {p4, p2} < {p1, p4, p2}, which is generalizable in the same way as we will describe in the following.

The data object which finally will become the recursive parameter is constructed along a path in the plan (as described for rocket). On each level in the plan, the argument(s) of an action can be characterized relative to the object involved in the parent node. Now we have to introduce a selector function for an argument of an action, involving actions on the same level of the plan with the same parent node. That is, the searched-for function has to be defined with respect to the children of the action. Remember that this is a backward plan and that a child node is input to an action. As a consequence, the selector function has to be applied to the current instantiation of the list (situation).

The searched-for function for selecting one element of the candidate elements represented in the second argument of swap has to be defined relative to the information available at the current "position" in the plan. That is, the literals of the parent node and the first argument of the swap operator can be used to decide which element should be swapped in the current (child) state. For example, for (swap p3 [p4]), we have (sel CO_S) = p4 from ((isc p1 1) (isc p2 2) (isc p3 4) (isc p4 3)). The first argument of swap occurs in (isc p3 3) of the parent node. The position to be selected, p4, is related to (isc p4 3) in the child node. That is, when applying swap, the first position is given and the second position must be obtained somehow. A modified swap function which deals with selecting the "right" position to swap must provide the following characteristics:

- Because the overall structure of the plan indicates a list problem, the goal predicates must be processed in a fixed order which can be derived from the sequence of actions in the plan. For a given list of goal predicates, deal with the first of these predicates.

8.5 Plans over Lists of Obje ts 261

  For example, the goal predicates can be (isc p1 1) (isc p2 2) (isc p3 3). The first element to be considered is (isc p1 1).
- A generalized swap* operator works on the current goal predicate and the current situation s. For example, swap* realizes (isc p1 1) in situation s = (isc p1 4) (isc p2 1) (isc p3 2) (isc p4 3).
- The current goal is split into the current position and the element for this position. For example, (pos (isc p1 1)) = p1 and (key (isc p1 1)) = 1.
- The element currently at position (pos (isc x y)) must be swapped with the element which should be at this position, that is, (key (isc x y)). For s = (isc p1 4) (isc p2 1) (isc p3 2) (isc p4 3) it follows that positions p1 and p2 are swapped.

Constructing this hypothesis presupposes that the plan-transformation system has predefined knowledge about how to access elements in lists.[11] Written as Lisp functions, the general hypothesis is:

(defun swap* (fp s)
  (pswap (ppos fp) (sel (pkey fp) s) s))

(defun sel (k s)
  (ppos (car
    (mapcan #'(lambda (x) (and (equal (pkey x) k) (list x))) s))))

where pswap is the swap function predefined in the domain specification, and ppos and pkey select the second and third element, respectively, of a literal (isc x y).[12]
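The hypothesis can also be checked outside Lisp. The following Python sketch is our own illustration, not part of the synthesized program: the names sel and swap_star mirror the Lisp sel and swap*, and modelling the (isc p k) literals as a dictionary from positions to keys is an assumption of the sketch.

```python
# Sketch of the selector hypothesis: a situation is modelled as a
# dictionary position -> key, standing for the literals (isc p k).

def sel(key, situation):
    # return the position whose current key is the searched-for key
    return next(pos for pos, k in situation.items() if k == key)

def swap_star(goal, situation):
    # realize one goal literal (pos, key) by swapping two positions
    pos, key = goal
    other = sel(key, situation)
    s = dict(situation)
    s[pos], s[other] = s[other], s[pos]
    return s

# the example from the text: for (swap p3 [p4]), sel yields p4,
# so p3 and p4 are swapped
s = {"p1": 1, "p2": 2, "p3": 4, "p4": 3}
print(swap_star(("p3", 3), s))  # {'p1': 1, 'p2': 2, 'p3': 3, 'p4': 4}
```

If the goal literal already holds, sel returns the goal position itself and the swap degenerates to the identity, which is exactly the behaviour required of the "id" branches.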

The generalized swap* function is tested for each swap action occurring in the plan. Because it holds for all cases, plan transformation can proceed. Had the hypothesis failed, a new hypothesis would have had to be constructed; if all hypotheses fail, plan transformation fails. For the selsort plan, only the hypothesis introduced above is possible.

The rest of plan transformation is straightforward: the regularized tree is reduced to a single path, unifying branches by replacing the swap action by swap* (see fig. 8.17). Note that, in contrast to the rocket problem, this path already contains "id" branches, which will constitute the "then" case of the nested conditional into which the plan is finally transformed. Note further that only the first argument of swap* is given, because the second argument s is given by the sub-tree following the action. This is the same way of representing plans as terms as used in the sections before (for unstack and rocket).

[11] In the current implementation we provide this knowledge only partially. For example, we can select an element x ∈ arg from a list (p arg), as needed to construct the definition of succ for sequences. That is, we can construct pos and key. For constructing the definition of sel, we additionally need the selection of an element from a list of literals which follows a given pattern. This can be realized with the filter function mapcan. But up to now, we have not implemented such complex selector functions.

[12] The second condition (list x) for mapcan is due to the definition of this functor in Lisp. Without this condition, the functor returns T or nil; with this condition, it returns a list of all elements fulfilling the filter condition (equal (pkey x) k).

Fig. 8.17. Introduction of a "Semantic" Selector Function in the Regularized Tree (figure not reproduced: a single path whose nodes are the goal sets ((isc p1 1) (isc p2 2) (isc p3 3)), ((isc p1 1) (isc p2 2)), ((isc p1 1)), whose arcs are labelled (swap* (isc p3 3)), (swap* (isc p2 2)), (swap* (isc p1 1)) over situation s, with id branches)

Data type introduction works as described for rocket (for set and structural list problems): a complex object CO = ((isc p1 1) (isc p2 2) (isc p3 3)) and a generalized predicate (isc* CO) for the bottom test are inferred. The state nodes are rewritten using the rest-selector tail on CO. The head selector is introduced in the actions: (swap* (head (tail^i CO))). The resulting recursive function is:

(defun pselsort (pl s)
  (if (isc* pl s)
      s
      (swap* (head pl)
             (pselsort (tail pl) s))))

Note that head and tail here correspond to last and butlast. But because lists are represented as explicit position-key pairs, pselsort transforms lists s into lists sorted according to the specification derived from the top-level goals given in pl, independent of the transformation sequence! The Lisp program realizing selection sort is given in figure 8.18.

8.5.3 Concluding Remarks on List Problems

List sorting is an example of a domain which involves not only structural but also semantic knowledge: while for structural problems the elements of the input data can be replaced by variables with arbitrary interpretation, this is not true for semantic problems. For example, reversing a list [x, y, z] works in the same way regardless of whether this list is [1, 2, 3] or [1, 1, 1]. In semantic problems, such as sorting, operators comparing elements (x < y or x = y)


; Complete Recursive Program for SelSort
; for lists represented as literals
; pl is the inferred complex object, e.g., ((p1 1) (p2 2) (p3 3))
; s is the situation (statics can be omitted),
; e.g. ((isc p1 3) (isc p2 1) (isc p3 4) (isc p4 2))

(defun pselsort (pl s)
  (if (isc* pl s)
      s
      (swap* (head pl)
             (pselsort (tail pl) s))))

(defun swap* (fp s)
  (pswap (ppos fp) (sel (pkey fp) s) s))

(defun sel (k s)
  (spos (car
    (mapcan #'(lambda (x) (and (equal (skey x) k) (list x))) s))))

(defun isc* (pl s)
  (subsetp pl (mapcar #'(lambda (x) (cdr x)) s) :test 'equal))

; selectors for elements of pl (p k)
(defun ppos (p) (first p))
(defun pkey (p) (second p))

; selectors for elements of s (isc p k)
(defun spos (p) (second p))
(defun skey (p) (third p))

; head and tail realized as last and butlast
; (from the order defined in the plan; alternatively: car cdr)
(defun head (l) (car (last l)))
(defun tail (l) (butlast l))

; explicit implementation of add-del effect
; in connection with DPlan: application of the swap operator on s
; "inner" union: i=j case
(defun pswap (i j s)
  (print `(swap ,i ,j ,s))
  (let ((ikey (skey (car (remove-if #'(lambda (x) (not (equal i (spos x)))) s))))
        (jkey (skey (car (remove-if #'(lambda (x) (not (equal j (spos x)))) s))))
        (rsts (remove-if #'(lambda (x) (or (equal i (spos x))
                                           (equal j (spos x)))) s)))
    (union (union (list (list 'isc i jkey))
                  (list (list 'isc j ikey)) :test 'equal)
           rsts :test 'equal)))

Fig. 8.18. LISP Program for SelSort
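For readers who do not want to trace the Lisp of Fig. 8.18, its control structure can be rendered in Python. This is our own transcription, not the synthesized program: the goal stack pl and the dictionary encoding of the situation are assumptions of the sketch, and the all(...) bottom test plays the role of isc*.

```python
# pl is the goal stack, e.g. [("p1", 1), ("p2", 2), ("p3", 3)];
# s maps positions to keys, standing for the literals (isc p k).

def sel(key, s):
    return next(pos for pos, k in s.items() if k == key)

def swap_star(goal, s):
    pos, key = goal
    s = dict(s)
    other = sel(key, s)
    s[pos], s[other] = s[other], s[pos]
    return s

def pselsort(pl, s):
    if all(s[pos] == key for pos, key in pl):   # bottom test (isc*)
        return s
    # head/tail correspond to last/butlast, as in the Lisp program
    return swap_star(pl[-1], pselsort(pl[:-1], s))

goal = [("p1", 1), ("p2", 2), ("p3", 3)]
start = {"p1": 3, "p2": 1, "p3": 4, "p4": 2}
print(pselsort(goal, start))  # {'p1': 1, 'p2': 2, 'p3': 3, 'p4': 4}
```

As in the Lisp version, the recursion is linear: the goal stack is unwound literal by literal, and each swap_star call realizes exactly one goal literal.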


are necessary. A prototypical example of a problem involving semantics is the member function, which returns true if an element x is contained in a list l, and false otherwise. Here an equality test must be performed. A detailed discussion of the reasons why the member function cannot be inferred using standard techniques of inductive logic programming is given by Le Blanc (1994).

Transforming the selsort plan into a finite program involved two critical steps: (1) extracting a suitable minimal spanning tree from the plan and (2) introducing a "semantic" selector function. The inferred complex object represents the number of elements which are already at the goal position. This is in analogy to the rocket problem, where the complex object represented how many objects are already at the goal destination (at location B for unload-all or inside the rocket for load-all). Plan transformation results in a final program which can be generalized to a recursive sort function sharing crucial characteristics with selection sort. But the function inferred by our system differs from standard selection sort in two aspects: first, the recursion is linear, involving a "goal" stack. The nested for-loops (two tail recursions) of the standard function are realized by a single linear recursive call. Of course, the function for selecting the current position is itself a loop: the literal list is searched for a literal corresponding to a given pattern by a higher-order filter function. Second, it does not rely on the ordering of natural numbers, but generates the desired sequence of elements, explicitly given as input.

We demonstrated plan transformation for a plan for lists with four elements. From a list with three elements, evidence for the hypothesis for generating the semantic selector function would have been weaker (involving only the actions from level one to two in the regularized tree). An alternative approach to plan transformation, involving knowledge about numbers, is described for a plan for three-element lists in Wysotzki and Schmid (2001). In general, there are three backtrack points for plan transformation:

- Generating "semantic" functions: if a generated hypothesis for the semantic function fails or if generalization-to-n fails, generate another hypothesis.
- Extracting a minimal spanning tree from a plan: if plan transformation or generalization-to-n fails, select another minimal spanning tree.
- Number of objects involved in planning: if plan transformation or generalization-to-n fails, generate a plan involving an additional object. (A plan must involve at least three objects to identify the recursive parameter and its substitution, as described in chapter 7.)

Because plan construction has exponential effort (all possible states of a domain for a fixed number of objects have to be generated, and the number of states can grow exponentially relative to the number of objects), and because the number of minimal spanning trees might be enormous, generating a finite program suitable for generalization is not efficient in the general case. To reduce the backtracking effort, we hope to come up with a good heuristic for extracting a "suitable" minimal spanning tree in the future. One possibility mentioned above is to try to combine tree extraction and regularization.

8.6 Plans over Complex Data Types

8.6.1 Variants of Complex Finite Programs

The usual way to classify recursive functions is to divide them into different complexity classes (Hinman, 1978; Odifreddi, 1989). In complexity theory, the semantics of a recursive function is under investigation. For example, fibonacci is typically implemented as tree recursion (see tab. 8.10), but it belongs to the class of linear problems, meaning that the fibonacci number of a number n can be calculated by a linear recursive function (Field & Harrison, 1988, pp. 454). In our approach to program synthesis, complexity is determined by the syntactical structure of the finite program, based on the structure of a universal plan. The unfolding (see chap. 7) of all functions in table 8.10 results in a tree structure. Interpretation of max always involves only one of the two tail-recursive calls (that is, the function is linear). Interpretation of fib results in two new recursive calls for each recursive step (resulting in an effort of O(2^n)). The Ackermann function (ack) is the classic example of a non-primitive recursive function with exponential growth: each recursive call results in y + x new recursive calls.
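The contrast between the three classes can be made executable. The following is our own Python transcription of the functions in Table 8.10 (plus, 1+, and 1- become ordinary arithmetic, and head/tail become indexing and slicing):

```python
# Alternative tail recursion: only one of the two recursive calls is
# actually taken per step, so the interpretation is linear.
def max_(m, l):
    if not l:
        return m
    return max_(l[0], l[1:]) if l[0] > m else max_(m, l[1:])

# Tree recursion: two new calls per recursive step, effort O(2^n).
def fib(x):
    return x if x < 2 else fib(x - 1) + fib(x - 2)

# The Ackermann function, the classic non-primitive-recursive example;
# note the nested recursive call in the last branch.
def ack(x, y):
    if x == 0:
        return y + 1
    if y == 0:
        return ack(x - 1, 1)
    return ack(x - 1, ack(x, y - 1))

print(max_(0, [3, 1, 4, 1, 5]), fib(10), ack(2, 3))  # 5 55 9
```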

Table 8.10. Structurally Complex Recursive Functions

Alternative Tail Recursion
(max m l) == (if (null l) m
                 (if (> (head l) m) (max (head l) (tail l)) (max m (tail l))))

Tree Recursion
(fib x) == (if (= 0 x) 0 (if (= 1 x) 1 (plus (fib (- x 1)) (fib (- x 2)))))

μ-Recursion
(ack x y) == (if (= 0 x) (1+ y) (if (= 0 y) (ack (1- x) 1) (ack (1- x) (ack x (1- y)))))

For plan transformation, on the other hand, semantics is taken into account to some extent: as we saw above, plans are linearizable if the data type underlying the plan is a set or a list. For the case of list problems involving semantic attributes of the list elements, it depends on the complexity of the involved "semantic" functions whether the resulting recursion is linear or more complex. Currently we do not have a theory of "linearizability" of universal plans, but clearly such a theory is necessary to make our approach to plan transformation more general. A good starting point for investigating this problem should be the literature on the transformational approach to code optimization in functional programming (Field & Harrison, 1988). In Wysotzki and Schmid (2001) linearizability is discussed in detail.

There are two well-known planning domains for which the underlying data type is more complex than sets or lists: Tower of Hanoi (sect. 3.1.4.4 in chap. 3) and building a tower of alphabetically sorted blocks in the blocks-world domain (sect. 3.1.4.5 in chap. 3). The tower problem is a set-of-lists problem and is used as one of the benchmark problems for planners. The hanoi problem is a list-of-lists problem (the "outer" list is of length 3 for the standard 3-peg problem). For both domains, the general solution procedure to transform an arbitrary state into a state fulfilling the top-level goals is, at least at first glance, more complex than a single linear recursion. Up to now we cannot fully automatically transform plans for such complex domains into finite programs. In the following, we will discuss possible strategies.

8.6.2 The `Tower' Domain

8.6.2.1 A Plan for Three Blocks. The specification of the three-block tower problem was introduced in chapter 2 and is described in section 3.1.4.5 in chapter 3. The unstack/clearblock domain described above as an example of a problem with an underlying sequential data type is a sub-domain of this problem: the puttable operator is structurally identical to the unstack operator. The put operator is represented with a conditioned effect to take care of the case that a block x is moved from the table onto another block y and of the case that x is moved from another block z.

For the 3-block problem, the universal plan is a unique minimal spanning tree (see fig. 3.10). Note that for the given goal to build a tower with {on(A, B), on(B, C)}, from the state {on(C, B), on(B, A)}, that is, a tower where the blocks are sorted in reverse order to the goal, only two actions are needed to reach the goal state: the base of the tower, C, is put on the table, and B can immediately be put on C without putting it on the table first. A program realizing tower is only optimal if these short-cuts are performed.

There are two possible strategies for learning a control program for tower, which are discussed in reinforcement learning under the labels incremental elemental-to-composite learning and simultaneous composite learning (Sun & Sessions, 1999): in the first case, the system is first trained with a simple task, for example clearing an arbitrary block by unstacking all blocks lying on top of it. The learned policy is stored, for example as a CLEAR program, and extends the set of available basic functions. When learning the policy for a more complex problem, such as tower, the already available knowledge (CLEAR) can be used. In the second case, the system immediately learns to solve the complex problem, and the decomposition must be performed autonomously. The application of both strategies to the tower problem is demonstrated in Wysotzki and Schmid (2001). In the work reported here, we focus on the second strategy.


8.6.2.2 Elemental-to-Composite Learning. Let us assume that the control knowledge for clearing an arbitrary block is already available as a CLEARBLOCK function:

CLEARBLOCK(x, s) = if(clear(x, s), s, puttable(topof(x), CLEARBLOCK(topof(x), s))).

This function immediately returns the current state s if clear(x) already holds in s; otherwise, it tries to put the block lying on top of x onto the table.

Now we want to learn a program for solving the more complex tower problem. In Wysotzki and Schmid (2001) we describe an approach based on the assumption of linearizability of the planning problem (see sect. 2.3.4.2 in chap. 2): it is presupposed that sub-goals can be achieved immediately before the corresponding top-level goal is achieved. For example, to reach a state where block A is lying on block B, the action put(A, B) can be applied; but this action is applicable only if both A and B are clear. That is, to realize goal on(A, B), the sub-goals clear(A) and clear(B) must hold in the current situation. Allowing the use of the already learned recursive CLEARBLOCK function and using linearization, the complex plan for solving the tower problem for three blocks can be collapsed to:

(on a b) (on b c)  --[(put a b) ∘ (CLEARBLOCK a) ∘ (CLEARBLOCK b)]-->  (on b c)  --[(put b c) ∘ (CLEARBLOCK b) ∘ (CLEARBLOCK c)]-->  s.

The CLEARBLOCK program applies as many puttable actions as necessary. If a block is already clear, the state is returned unchanged. The recursive program generalizing over this plan is given in appendix C.6.
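Under the linearizability assumption, the collapsed plan can be replayed directly. In the following Python sketch the state encoding (block -> support) and the helper names are our own assumptions; clearblock follows the CLEARBLOCK definition above, and tower3 applies the two collapsed steps for the three-block goal.

```python
# A state maps each block to its support: another block or "table".

def topof(x, s):
    # the block lying on x, or None if x is clear
    return next((b for b, under in s.items() if under == x), None)

def clear(x, s):
    return topof(x, s) is None

def puttable(x, s):
    return {**s, x: "table"}

def put(x, y, s):
    return {**s, x: y}

def clearblock(x, s):
    # CLEARBLOCK(x, s) = if(clear(x, s), s,
    #                       puttable(topof(x), CLEARBLOCK(topof(x), s)))
    if clear(x, s):
        return s
    return puttable(topof(x, s), clearblock(topof(x, s), s))

def tower3(s):
    # (put b c) o (CLEARBLOCK b) o (CLEARBLOCK c), then
    # (put a b) o (CLEARBLOCK a) o (CLEARBLOCK b)
    s = put("b", "c", clearblock("b", clearblock("c", s)))
    return put("a", "b", clearblock("a", clearblock("b", s)))

reversed_tower = {"a": "table", "b": "a", "c": "b"}
print(tower3(reversed_tower))  # {'a': 'b', 'b': 'c', 'c': 'table'}
```

For the reversed tower, the replay performs exactly the short-cut discussed above: (puttable c), (put b c), (put a b), since CLEARBLOCK leaves already-clear blocks untouched.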

For some domains it is possible to identify independent sets of predicates by analyzing the operator specifications, for example using the TIM analysis (see sect. 2.5.1 in chap. 2) for the planner STAN (Long & Fox, 1999).

While the approach of Wysotzki and Schmid (2001) is based on analyzing the goals and sub-goals contained in a plan, the strategy reported in this chapter is based on data type inference for uniform sub-plans (see sect. 8.2.1). If you look at the plan given in figure 3.10, the two upper levels contain only arcs labelled with put actions, and all levels below contain only arcs labelled with puttable actions. Therefore, as demonstrated for rocket above, the plan is decomposed in a first step, and it becomes impossible to infer a program where put and puttable actions are interwoven.

8.6.2.3 Simultaneous Composite Learning. Initial plan decomposition for the 3-block tower plan results in two sub-plans: a sub-plan for put-all and a sub-plan for puttable-all. The put-all sub-plan is a regular tree as defined above. The only level with branching is for actions (put B C), and the plan can be immediately reduced to a linear sequence. Consequently, we introduce the data type list with complex object CO = (A B C) and bottom test (on* (A B C)). The generalized put-all function is structurally analogous to load-all from the rocket domain:

(put-all olist s) ==
  (if (on* olist s)
      s
      (put (first olist) (second olist) (put-all (tail olist) s)))

where first and second are implemented as last and second-last, or olist gives the desired order of the tower in reverse order.

For the puttable sub-plan we have one fragment consisting of a single step (puttable C) for the reversed tower, and a set of four sequences:

A < C < B
B < C < A
B < A < C
C < A < B

with (ct x) as bottom test and the constructor (succ x) = y for (on y x), as introduced above for linear plans. An obvious strategy compatible with our general approach to plan transformation would be to select one of these sequences and generalize a clear-all function identical to the unstack-all function discussed above. There remains the problem of selecting the block which is to be unstacked; that is, we have to infer the bottom element from the goal set (ct a) (ct b) (ct c). As described for sorting above, we have to generate a semantic selector function which depends not on the parent node but on the current state.[13]

We have the following examples for constructing the selector function:

((on b c) (on c a)): (sel (A B C)) = A
((on a c) (on c b)), ((on c a) (on a b)): (sel (A B C)) = B
((on b a) (on a c)): (sel (A B C)) = C

that is, for the complex object (A B C) we always must select the element which is the base of the current tower.

If we model the tower domain with explicit use of an ontable predicate, this predicate can be used as the criterion for the selector function. Without this predicate, we can introduce the (on* CO) already generated for put-all and select the last element of the list. The resulting tower function then would be:

tower(olist, s) = put-all(olist, clear-all(sel(s)))
sel(s) = last(make-olist(s)).
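This correct-but-suboptimal control rule can be simulated concretely. In the Python sketch below (our own encoding), clear-all(sel(s)) is collapsed into "put every block of the current state on the table", which is precisely the behaviour that forgoes the short-cuts; put-all then rebuilds the goal tower given bottom-up by olist.

```python
# State: block -> support ("table" or a block);
# olist gives the goal tower bottom-up, e.g. ("c", "b", "a")
# for the goal tower a on b on c.

def clear_all(s):
    # clear-all(sel(s)) collapsed: unstack everything onto the table
    return {b: "table" for b in s}

def put_all(olist, s):
    # rebuild the goal tower bottom-up
    for lower, upper in zip(olist, olist[1:]):
        s = {**s, upper: lower}
    return s

def tower(olist, s):
    return put_all(olist, clear_all(s))

state = {"a": "table", "b": "a", "c": "b"}       # reversed tower
print(tower(("c", "b", "a"), state))  # {'a': 'b', 'b': 'c', 'c': 'table'}
```

For the reversed tower this executes two puttable and two put actions, whereas the optimal plan needs only (puttable c) (put b c) (put a b); the sketch makes the suboptimality discussed below concrete.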

13

Note, that for selsort we introdu ed a sele tor in the basi operator swap. Here

we introdu e a sele tor in the fun tion lear-all whi h is already a re ursive

generalization!


With the described strategy, the problem is reduced to an underlying data type list with a semantic selector function. The selector function is semantic because it is relevant which block must be cleared. This control rule generates correct transformation sequences for towers with arbitrary numbers of blocks, with the desired sorting of blocks specified by olist, which is generated from the top-level goals. But it does not generate the optimal transformation sequences in all cases!

For generating optimal transformation sequences, we must cover the cases where a block can be put onto another block immediately, without putting it on the table first. For the three-block plan, there is only one such case, and we could come up with the discriminating predicate (on c b):

tower(olist, s) = put-all(olist, if(on(c, b, s),
                                    puttable(c, s),
                                    clear-all(sel(s))))

which generates incorrect plans for larger problems, for example for the state ((on c b) (on b a) (on a d) (ct c))!

For both variants of tower, a generate-and-test strategy would discover the flaw: for the first variant, it would be detected that for ((on c b) (on b a) (ct a)) an additional action (puttable b) would be generated which is not included in the optimal universal plan. For the second variant, all states of the 3-block plan are covered correctly; the faulty condition would only be detected when checking larger problems. But with only one special case of a reversed tower in the three-block plan, every other hypothesis would be highly speculative. Therefore, we now investigate the four-block plan.

The universal plan for the four-block tower problem is a DAG with 73 nodes and 78 edges; thus we have to extract a suitable minimal spanning tree. Because the plan is rather large, we present an abstract version in figure 8.19 and a summary of the action sequences for all 33 leaf nodes in table 8.11.

Fig. 8.19. Abstract Form of the Universal Plan for the Four-Block Tower (figure not reproduced: a DAG of 73 numbered nodes, with the sorted tower as root)

For the four-block problem, we have 15 sequences needing to put all blocks on the table, and 8 cases with shorter optimal plans (only counting leaf nodes)


Table 8.11. Transformation Sequences for Leaf Nodes of the Tower Plan for Four Blocks

15 4-towers, needing 3 puttable actions
((on a b) (on b d) (on d c) (ct a))   (PUTTABLE A) (PUTTABLE B) (PUTTABLE D) (PUT C D) (PUT B C) (PUT A B)
((on a c) (on c b) (on b d) (ct a))   (PUTTABLE A) (PUTTABLE C) (PUTTABLE B) (PUT C D) (PUT B C) (PUT A B)
((on a c) (on c d) (on d b) (ct a))   (PUTTABLE A) (PUTTABLE C) (PUTTABLE D) (PUT C D) (PUT B C) (PUT A B)
((on a d) (on d b) (on b c) (ct a))   (PUTTABLE A) (PUTTABLE D) (PUTTABLE B) (PUT C D) (PUT B C) (PUT A B)
((on b a) (on a d) (on d c) (ct b))   (PUTTABLE B) (PUTTABLE A) (PUTTABLE D) (PUT C D) (PUT B C) (PUT A B)
((on b c) (on c a) (on a d) (ct b))   (PUTTABLE B) (PUTTABLE C) (PUTTABLE A) (PUT C D) (PUT B C) (PUT A B)
((on b c) (on c d) (on d a) (ct b))   (PUTTABLE B) (PUTTABLE C) (PUTTABLE D) (PUT C D) (PUT B C) (PUT A B)
((on b d) (on d a) (on a c) (ct b))   (PUTTABLE B) (PUTTABLE D) (PUTTABLE A) (PUT C D) (PUT B C) (PUT A B)
((on c a) (on a b) (on b d) (ct c))   (PUTTABLE C) (PUTTABLE A) (PUTTABLE B) (PUT C D) (PUT B C) (PUT A B)
((on c a) (on a d) (on d b) (ct c))   (PUTTABLE C) (PUTTABLE A) (PUTTABLE D) (PUT C D) (PUT B C) (PUT A B)
((on c b) (on b a) (on a d) (ct c))   (PUTTABLE C) (PUTTABLE B) (PUTTABLE A) (PUT C D) (PUT B C) (PUT A B)
((on c b) (on b d) (on d a) (ct c))   (PUTTABLE C) (PUTTABLE B) (PUTTABLE D) (PUT C D) (PUT B C) (PUT A B)
((on d a) (on a b) (on b c) (ct d))   (PUTTABLE D) (PUTTABLE A) (PUTTABLE B) (PUT C D) (PUT B C) (PUT A B)
((on d b) (on b a) (on a c) (ct d))   (PUTTABLE D) (PUTTABLE B) (PUTTABLE A) (PUT C D) (PUT B C) (PUT A B)
((on c d) (on d a) (on a b) (ct c))   (PUTTABLE C) (PUTTABLE D) (PUTTABLE A) (PUT C D) (PUT B C) (PUT A B)
                                      ((PUT C D) (PUTTABLE A) also possible)

6 4-towers, needing 2 puttable actions
((on a d) (on d c) (on c b) (ct a))   (PUTTABLE A) (PUTTABLE D) (PUT C D) (PUT B C) (PUT A B)
((on b d) (on d c) (on c a) (ct b))   (PUTTABLE B) (PUTTABLE D) (PUT C D) (PUT B C) (PUT A B)
((on c d) (on d b) (on b a) (ct c))   (PUTTABLE C) (PUTTABLE D) (PUT C D) (PUT B C) (PUT A B)
((on d a) (on a c) (on c b) (ct d))   (PUTTABLE D) (PUTTABLE A) (PUT C D) (PUT B C) (PUT A B)
((on d b) (on b c) (on c a) (ct d))   (PUTTABLE D) (PUTTABLE B) (PUT C D) (PUT B C) (PUT A B)
((on d c) (on c a) (on a b) (ct d))   (PUTTABLE D) (PUT C D) (PUTTABLE A) (PUT B C) (PUT A B)
                                      ((PUT C D) BEFORE (PUTTABLE A)!)

2 4-towers, needing 4 actions
((on b a) (on a c) (on c d) (ct b))   (PUTTABLE B) (PUTTABLE A) (PUT B C) (PUT A B)
((on d c) (on c b) (on b a) (ct d))   (PUTTABLE D) (PUT C D) (PUT B C) (PUT A B)
(the sorted tower, needing 0 actions, is the root of the plan)

5 2-tower pairs, needing 2 puttable actions
((on a c) (on b d) (ct a) (ct b))     (PUTTABLE B) (PUTTABLE A) (PUT C D) (PUT B C) (PUT A B)
((on a c) (on d b) (ct a) (ct d))     (PUTTABLE D) (PUTTABLE A) (PUT C D) (PUT B C) (PUT A B)
((on a d) (on b c) (ct a) (ct b))     (PUTTABLE A) (PUTTABLE B) (PUT C D) (PUT B C) (PUT A B)
((on b c) (on d a) (ct b) (ct d))     (PUTTABLE D) (PUTTABLE B) (PUT C D) (PUT B C) (PUT A B)
((on a b) (on d c) (ct a) (ct d))     (PUTTABLE D) (PUTTABLE A) (PUT C D) (PUT B C) (PUT A B)
                                      ((PUT C D) (PUTTABLE A) also possible)

5 2-tower pairs, needing 1 puttable action
((on a d) (on c b) (ct a) (ct c))     (PUTTABLE A) (PUT C D) (PUT B C) (PUT A B)
((on b a) (on d c) (ct b) (ct d))     (PUTTABLE D) (PUT C D) (PUT B C) (PUT A B)
((on b d) (on c a) (ct b) (ct c))     (PUTTABLE B) (PUT C D) (PUT B C) (PUT A B)
((on c a) (on d b) (ct c) (ct d))     (PUTTABLE D) (PUT C D) (PUT B C) (PUT A B)
((on c b) (on d a) (ct c) (ct d))     (PUTTABLE D) (PUT C D) (PUT B C) (PUT A B)

2 2-tower pairs, needing no (PUT C D) action
((on a b) (on c d) (ct a) (ct c))     (PUTTABLE A) (PUT B C) (PUT A B)
((on b a) (on c d) (ct b) (ct c))     (PUT B C) (PUT A B)
are no leafs)


in contrast to 5 to 1 cases for the 3-block tower. Additionally, we have not only one possible partial tower (with 2 or more blocks stacked) but also a two-tower case (with two towers consisting of two blocks). Only one path in the plan makes it necessary to interleave put and puttable: if D is on top and C immediately under it. There are four cases where puttable has to be performed only two times before the put actions are applied, and one case where puttable has to be performed only once. For one case, only two puttables and two puts have to be applied. For the case of pairs of towers, all three put actions have to be performed for each leaf; puttable has to be performed once or twice.

The underlying data type is now not just a list but a more complicated structure, where for example (D C A B) < (D A B C)! Again, the order is derived from the number of actions needed to transform a state into the goal state: a tower (D C A B) can be transformed into the goal using two puttable actions and three put actions, while for (D A B C) three puttable actions and three put actions are needed (see tab. 8.11). Currently, we do not see an easy way to extract all conditions for generating optimal action sequences from the universal plan. Either we have to be content with the correct but suboptimal control rules inferred from the three-block plan, or we have to rely on incremental learning. A program which covers all conditions for generating optimal plans is given in appendix C.6.

One path we want to investigate in the future is to model the domain specification in a slightly different way, using only a single operator (put block loc), where loc can be a block or the table. This makes the universal plan more uniform. There is no longer a decision to take about which operator to apply next. Instead, the decision whether a block is put on another block or on the table can be included in the "semantic" selector function.

8.6.2.4 Set of Lists. Some deeper insight into the structure of the tower problem might be gained by analyzing the analogous abstract problem. The sequence of the numbers of states in dependence on the number of blocks is given in table 2.3. This sequence corresponds to the number of sets of lists: a(n) = (2n-1) a(n-1) - (n-1)(n-2) a(n-2). For n >= 1 it is the row sum of the "unsigned Lah triangle" (Knuth, 1992). The corresponding formula (the exponential generating function) is exp(x/(1-x)).[14]
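The recurrence is easy to check numerically; note that a(4) = 73 coincides with the 73 nodes of the four-block universal plan reported above (a small Python sketch):

```python
def a(n):
    # number of sets of lists over n elements:
    # a(n) = (2n-1) a(n-1) - (n-1)(n-2) a(n-2), with a(0) = a(1) = 1
    if n < 2:
        return 1
    return (2 * n - 1) * a(n - 1) - (n - 1) * (n - 2) * a(n - 2)

print([a(n) for n in range(6)])  # [1, 1, 3, 13, 73, 501]
```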

The tower problem is related to generating the power set of a list with mutually different elements (see tab. 8.12). But there is also a difference between the two domains: for powerset, each element of the set is a set again; that is, for example {{a}, {b, c}, {b}} is equal to {{a}, {c, b}, {b}}. In contrast, for tower, the elements of the sets are lists. For example, {(a), (b, c), (b)} is equal to {(b), (a), (b, c)} but not to {(a), (c, b), (b)}. A program generating all different sets of lists (that is, towers) can easily be generated from powerset by changing :test 'setequal to :test 'equal in ins-el.

14

The identi ation of the sequen e was resear hed by Bernhard Wolf. More ba k-

ground information an be found at http://www.resear h.att. om/ gi-bin/

a ess. gi/as/njas/sequen es/eisA. gi?Anum=000262.


The tower domain is the inverse problem to set-of-lists: for set-of-lists, a single list is decomposed into all possible partial lists; for tower, each state corresponds to a set of partial lists, and the goal state is the set containing a single list with all elements in a fixed order. The (puttable x) operator corresponds to removing an element from a list and generating a new one-element list (cons (car l) nil); the (put x y) operator corresponds to removing an element from a list and putting it in front of another list (cons (car l1) l2). A program generating a list of sorted numbers is given in appendix C.6.

Table 8.12. Power Set of a List, Set of Lists

(defun powerset (l) (pset l (1+ (length l)) (list (list nil))))

(defun pset (l c ps)
  (cond ((= 0 c) nil)
        (T (union ps (pset l (1- c) (ins-el l ps))))))

(defun ins-el (l ps)
  (cond ((null l) nil)
        (T (union (mapcar #'(lambda (y) (adjoin (car l) y)) ps)
                  (ins-el (cdr l) ps) :test 'setequal))))

; for set of lists use :test 'equal
(defun setequal (s1 s2) (and (subsetp s1 s2) (subsetp s2 s1)))
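The count can be cross-checked by enumerating the tower states directly. The Python sketch below is our own enumeration, not a transcription of Table 8.12: frozensets of tuples play the role that :test 'equal plays in the Lisp, and each element either starts a new tower or is inserted at some position of an existing one.

```python
def sets_of_lists(elems):
    # all partitions of elems into a set of ordered lists (towers)
    if not elems:
        return {frozenset()}
    first, rest = elems[0], elems[1:]
    result = set()
    for part in sets_of_lists(rest):
        result.add(part | {(first,)})            # first forms a new tower
        for tower in part:                       # or enters an existing one
            for i in range(len(tower) + 1):
                grown = tower[:i] + (first,) + tower[i:]
                result.add((part - {tower}) | {grown})
    return result

print([len(sets_of_lists(list("abcd")[:n])) for n in range(5)])
# [1, 1, 3, 13, 73]
```

The counts reproduce the recurrence above; in particular, the 73 states for four blocks match the size of the four-block universal plan.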

8.6.2.5 Concluding Remarks on `Tower'. Inference of generalized control knowledge for the tower domain was also investigated in the context of two alternative approaches. One of these approaches is genetic programming (sect. 6.3.2 in chap. 6). Within this approach, given primitive operators of some functional programming language together with rules for the correct generation of terms, a program covering a set of input/output examples correctly is generated by search in the "evolution space" of programs, guided by an evaluation function (representing knowledge about "good" solutions). Programs generated by this approach are given in figure 6.4. These programs correspond to the "linear" program discussed above. Because all blocks are always first put on the table and the tower is constructed afterwards, the program does not generate optimal transformation sequences for all possible cases.

The second approach, introduced by Martín and Geffner (2000), infers rules from plans for sample input states (see sect. 2.5.2 in chap. 2). The domain is modelled in a concept language (an AI knowledge representation language) and the rules are inferred with a decision list learning approach. The resulting rules are given in table 8.13. For example, rule A3 represents the knowledge that a block should be picked up if it is clear and if its target block is clear and "well-placed". With these rules, 95.5% of 1000 test problems were solved for 5-block problems and 72.2% of 500 test problems were solved for 20-block problems. The generated plans are about two steps longer than the optimal plans. The authors could show that, after a selective extension of the training

8.6 Plans over Complex Data Types 273

set by the input states for which the original rules failed to generate a correct plan, a more extensive set of rules is generated for which the generated plans are about one step longer than the optimal plans.

Table 8.13. Control Rules for Tower Inferred by Decision List Learning

A1: PUT-ON ((on_g = on_s) ∧ (∀on_g⁻¹.holding) ∧ clear_s)
A2: PUT-ON-TABLE (holding)
A3: PICK ((∀on_g.(on_g = on_s)) ∧ (∀on_g.clear_s) ∧ clear_s)
A4: PICK (¬(on_g = on_s) ∧ (∀on_s.(∀on_g⁻¹.clear_s)) ∧ clear_s)
A5: PICK (¬(on_g = on_s) ∧ (∀on_g.(on_s = on_s)) ∧ clear_s)
A6: PICK (¬(on_g = on_s) ∧ (∀on_s.(∀on_s⁻¹.clear_s)))

Our approach differs from these two approaches in two aspects: First, we do not use example sets of input/output pairs or of input/plan pairs; instead, we analyze the complete space of optimal solutions for a problem of small size. Second, we do not rely on incremental hypothesis construction, that is, a learning approach where each new example is used to modify the current hypothesis if this example is not covered in the correct way. Instead, we aim at extracting the control knowledge from the given universal plan by exploiting the structural information contained in it. Although we have so far failed to generate optimal rules for tower, we could show for sequence, set, and list problems that with our analytical approach we can extract correct and optimal rules from the plan.

There is a trade-off between optimality of the policy versus (a) the efficiency of control knowledge application and (b) the efficiency of control knowledge learning. As we can see from the program presented in appendix CC.6 and from the (still non-optimal!) control rules in table 8.13, generating minimal action sequences might involve complex tests which have to be performed on the current state. In the worst case, these tests again involve recursion, for example, a test whether a "well-placed" partial tower already exists (test subtow in our program). Furthermore, we demonstrated that the suboptimal control rules for tower could be extracted quite easily from the 3-block plan, while automatic extraction of the optimal rules from the 4-block plan involves complex reasoning (for generating the tests for "special" cases).

8.6.3 Tower of Hanoi

Up to now, we did not investigate plan transformation for the Tower of Hanoi. Thus, we will only make some more general remarks about this domain. Tower of Hanoi can be seen as a modified tower problem: In contrast to tower, where blocks can be put on arbitrary positions on a table, in Tower of Hanoi the positions of discs are restricted to some (typically three) positions of pegs. This results in a more regular structure of the DAG than for the tower problem, and therefore we hope that, if we come up with a plan transformation


strategy for tower, the Tower of Hanoi domain is covered by this strategy as well.

It is often claimed that hanoi is a highly artificial domain and that the only isomorphic domains are hand-crafted puzzles, such as the monster problems (Simon & Hayes, 1976; Clement & Richard, 1997). I want to point out that there are solitaire ("patience") games which are isomorphic to hanoi.
15
One of these solitaire games (freecell) was included in the AIPS-2000 planning competition.

The domain specification for hanoi is given in table 3.8. The resulting plan is a unique minimal spanning tree, which is already regular (see fig. 3.9). This indicates that data type inference and the resulting plan transformation should be easier than for the tower problem. While hanoi with three discs contains more states than the three-block tower domain (27 vs. 13), the actions for transforming one state into another are much more restricted. The number of states for hanoi is 3^n. The minimal number of moves when starting with a complete tower on one peg is 2^n - 1. Up to now, there seems to be no general formula to calculate the minimal number of moves for an arbitrary starting state, that is, one of the nodes of the universal plan.

16
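Both counts can be checked by brute force; the following Python sketch (our own, not part of the book's planner) enumerates the full hanoi state space for n discs and measures the distance of the full-tower start state from the goal by breadth-first search:

```python
from itertools import product
from collections import deque

def hanoi_bfs(n):
    """Return (number of states, minimal moves from a full tower on peg 0
    to a full tower on peg 2) for the n-disc Tower of Hanoi."""
    # A state assigns each disc (indexed smallest to largest) to a peg; the
    # stacking order on a peg is forced by disc size, so there are 3**n states.
    states = list(product(range(3), repeat=n))

    def moves(state):
        for peg in range(3):
            top = next((d for d in range(n) if state[d] == peg), None)
            if top is None:
                continue  # peg is empty
            for target in range(3):
                if target == peg:
                    continue
                t_top = next((d for d in range(n) if state[d] == target), None)
                if t_top is None or t_top > top:  # only onto a larger disc
                    yield state[:top] + (target,) + state[top + 1:]

    goal = (2,) * n
    dist = {goal: 0}
    queue = deque([goal])
    while queue:           # BFS over the (connected) state graph
        s = queue.popleft()
        for nxt in moves(s):
            if nxt not in dist:
                dist[nxt] = dist[s] + 1
                queue.append(nxt)
    return len(states), dist[(0,) * n]

print(hanoi_bfs(3))  # (27, 7): 3**n states, 2**n - 1 moves for the full tower
```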

Tower of Hanoi is a puzzle investigated extensively in artificial intelligence as well as in cognitive psychology since the 1960s. In computer science classes, Tower of Hanoi is used as a prototypical example of a problem with exponential effort. Coming up with efficient algorithms for (restricted variants of) the Tower of Hanoi problem is still ongoing research (Atkinson, 1981; Pettorossi, 1984; Walsh, 1983; Allouche, 1994; Hinz, 1996). As far as we survey the literature, all algorithms are concerned with the case where a tower of n discs is initially located at a predefined start peg (see for example table 8.14). In general, hanoi is tree-recursive already for the restricted case where the initial state is fixed and only the number of discs is variable, with the structure hanoi ∘ move ∘ hanoi. A standard implementation, as shown in table 8.14, is as tree recursion.

We are interested in learning a control strategy starting with an arbitrary initial state (see program in table 8.15).

15
We plan to conduct a psychological experiment in the domain of problem solving by analogy, demonstrating that subjects who are acquainted with playing patience games perform better on hanoi than subjects with no such experience.
16
see: http://forum.swarthmore.edu/epigone/geometry-puzzles/twim lehmeh/7oen0r212 wyforum.swarthmore.edu, open question from February 2000


Table 8.14. A Tower of Hanoi Program

; (SETQ A '(1 2 3) B NIL C NIL) (start) OR
; (hanoi '(1 2 3) nil nil 3)

(DEFUN move (from to)
  (COND ( (NULL (EVAL from)) (PRINT (LIST 'PEG from 'EMPTY)) )
        ( (OR (NULL (EVAL to))
              (> (CAR (EVAL to)) (CAR (EVAL from)) ))
          (SET to (CONS (CAR (EVAL from)) (EVAL to)) )
          (SET from (CDR (EVAL from)) )
        )
        ( T (PRINT (LIST 'MOVE 'FROM (CAR (EVAL from))
                         'TO (CAR (EVAL to)) 'NOT 'POSSIBLE)))
  )
  (LIST (LIST 'MOVE 'DISC (CAR (EVAL to)) 'FROM from 'TO to)) )

(DEFUN hanoi (from to help n)
  (COND ( (= n 1) (move from to) )
        ( T (APPEND
              (hanoi from help to (- n 1))
              (move from to)
              (hanoi help to from (- n 1))
            ) ) ) )

(DEFUN start () (hanoi 'A 'B 'C (LENGTH A)))
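The hanoi ∘ move ∘ hanoi structure of the standard solution can also be written as a compact Python sketch (our own paraphrase of the Lisp program in table 8.14), which makes the 2^n - 1 move count easy to check:

```python
def hanoi(n, src, dst, aux):
    """Return the move sequence transferring n discs from src to dst."""
    if n == 0:
        return []
    return (hanoi(n - 1, src, aux, dst)     # clear the n-1 smaller discs
            + [(src, dst)]                  # move the largest disc
            + hanoi(n - 1, aux, dst, src))  # restack the smaller discs

print(len(hanoi(3, 'A', 'C', 'B')))  # 7 == 2**3 - 1
```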


Table 8.15. A Tower of Hanoi Program for Arbitrary Starting Constellations

(DEFUN ison (disc peg)
  (COND ( (NULL peg) NIL)
        ( (= (CAR peg) disc) T)
        ( T (ison disc (cdr peg)))))

(DEFUN on (disc from to help)
  (COND ( (ison disc (eval from)) from)
        ( (ison disc (eval to)) to)
        ( T help)))

; whichpeg: peg on which the current disc is NOT lying and peg which
; is not the current goal peg
(DEFUN whichpeg (disc peg)
  (COND ( (or (and (equal (on disc 'A 'B 'C) 'B) (equal peg 'C))
              (and (equal (on disc 'A 'B 'C) 'C) (equal peg 'B)) ) 'A)
        ( (or (and (equal (on disc 'A 'B 'C) 'A) (equal peg 'C))
              (and (equal (on disc 'A 'B 'C) 'C) (equal peg 'A)) ) 'B)
        ( (or (and (equal (on disc 'A 'B 'C) 'A) (equal peg 'B))
              (and (equal (on disc 'A 'B 'C) 'B) (equal peg 'A)) ) 'C) ))

(DEFUN topof (peg)
  (COND ( (null (car (eval peg))) nil) ( T (car (eval peg))) ))

(DEFUN clearpeg (peg)
  (COND ( (null (car (eval peg))) T) ( T nil) ))

(DEFUN cleartop (disc)
  (COND ( (and (equal (on disc 'A 'B 'C) 'A) (= (car A) disc)) T)
        ( (and (equal (on disc 'A 'B 'C) 'B) (= (car B) disc)) T)
        ( (and (equal (on disc 'A 'B 'C) 'C) (= (car C) disc)) T)
        ( T nil)))

(DEFUN gmove (disc peg)
  (COND ( (= disc 0) (PRINT (LIST 'NO 'DISC)))
        ( (equal (on disc 'A 'B 'C) peg)
          (PRINT (LIST 'DISC disc 'IS 'ON 'PEG peg)) )
        ( (OR (clearpeg peg) (> (topof peg) disc))
          (PRINT (LIST 'MOVE 'DISC disc
                       'FROM (on disc 'A 'B 'C)
                       'TO peg ) )
          (SET (on disc 'A 'B 'C) (CDR (eval (on disc 'A 'B 'C))))
          (SET peg (CONS disc (EVAL peg)))
        )
        ( T (PRINT (LIST 'MOVE 'FROM disc 'ON peg 'NOT 'POSSIBLE)))))

(DEFUN ghanoi (disc peg)
  (COND ( (and (= disc 1) (equal (on disc 'A 'B 'C) peg)) T )
        ( T (COND
              ( (equal (on disc 'A 'B 'C) peg) (ghanoi (- disc 1) peg) )
              ( (and (not (equal (on disc 'A 'B 'C) peg))
                     (not (and (cleartop disc) (clearpeg peg)))
                     (> disc 1)) (ghanoi (- disc 1) (whichpeg disc peg)) )
            )
            (gmove disc peg)
            (COND ((> disc 1) (ghanoi (- disc 1) peg))) )))

(DEFUN n-of-discs (p1 p2 p3) (+ (LENGTH p1) (+ (LENGTH p2) (LENGTH p3))))

; ghanoi: "largest" Disc x Goal-Peg --> Solution Sequence
(DEFUN gstart () (ghanoi (n-of-discs A B C) 'C))

9. Conclusions and Further Research

9.1 Combining Planning and Program Synthesis

We demonstrated that planning can be combined with inductive program synthesis by first generating a universal plan for a problem with a small number of objects, then transforming this plan into a finite program term, and folding this term into a recursive program scheme. While planning and folding can be performed by powerful, domain-independent algorithms, plan transformation is knowledge dependent. In part I, we presented the domain-independent universal planner DPlan. In this part, we presented an approach to folding finite program terms based on pattern-matching which is more powerful than other published approaches.

Our approach to plan transformation, presented in the previous chapter, is based on inference of the data type underlying the planning domain. Typically, such knowledge is pre-defined in program synthesis, for example, as domain axioms as in the deductive approach of Manna and Waldinger (1975) or as an inherent restriction of the input domain as in the inductive approach of Summers (1977). We go beyond these approaches, providing a method to infer the data type from the structure of the plan, where inference is based on a set of predefined abstract types. Furthermore, we presented first ideas for dealing with problems relying on semantic knowledge. In program synthesis, typically, only structural list problems (such as reverse) are considered. Planning problems which can be solved by using structural knowledge only are, for example, clearblock and rocket. In clearblock, relations between objects (on(x, y)) are independent of attributes of these objects (such as their size). In rocket, relations between objects are irrelevant. In contrast, sorting lists depends on a semantic relation between objects, that is, whether one number is greater than another. If a universal plan has parallel branches representing the same operation but for different objects, we assume that a semantic selector function must be introduced. The selector is constructed by identifying discriminating literals between the underlying problem states (see sect. 8.5 in chap. 8). To sum up, with our current approach we can deal with structural problems in a fully automatic way, from planning over plan transformation to folding. Dealing with problems relying on semantic information is a topic for further research.


Currently we are not exploiting the full power of the folder when generalizing over plans: In plan transformation, we try to come up with a single, linear structure as input to the folder, although the folder allows inferring sets of recursive equations with arbitrarily complex recursive structures. In future research we plan to investigate possibilities for submitting complex, non-linear plans directly to the folder.

A drawback of using planning as the basis for constructing finite programs is that number problems, such as factorial or fibonacci, cannot be dealt with in a natural way. For such problems, our folder must rely on input traces provided by a user. Using a graphical user interface to support the user in constructing such traces is discussed by Schrödl and Edelkamp (1999).
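Such an input trace can be pictured as a finite unfolding of the intended recursion; the following Python sketch (a hypothetical illustration, not the folder's actual input format) produces finite program terms for factorial which a folder could generalize:

```python
def fac_trace(n):
    """Finite unfolding of factorial after n steps, as a nested term."""
    if n == 0:
        return "1"
    return f"({n} * {fac_trace(n - 1)})"

print(fac_trace(3))  # (3 * (2 * (1 * 1)))
```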

9.2 Acquisition of Problem Solving Strategies

The most important building blocks for the flexible and adaptive nature of human cognition are powerful mechanisms of learning.
1
On the low-level end of learning mechanisms is stimulus-response learning, mostly modelled with artificial neural nets or reinforcement learning. On the high-level end are different principles of induction, that is, generalizing rules from examples. While the majority of work in machine learning focusses on induction of concepts (classification learning, see sect. 6.3.1.1 in chap. 6), our work focusses on inductive learning of cognitive skills from problem solving. While concepts are mostly characterized as declarative knowledge (know what), which can be verbalized and is accessible for reasoning processes, skills are described as highly automated procedural knowledge (know how) (Anderson, 1983).

9.2.1 Learning in Problem Solving and Planning

Problem solving is generally realized as heuristic search in a problem space. In cognitive science most work is in the area of goal-driven production systems (see sect. 2.4.5.1 in chap. 2). In AI, different planning techniques are investigated (see sect. 2.4.2 in chap. 2). In both frameworks, the definition of problem operators together with conditions for their application (that is, production rules) is central. Matching, selection, and application of operators is performed by an interpreter (control strategy): the preconditions of all operators defined for a given domain are matched against the current data (problem state), one of the matching operators is selected and applied to the current state. The process terminates if the goal is fulfilled or if no operator is applicable.

Skill acquisition is usually modelled as composition of predefined primitive operators as a result of their co-occurrence during problem solving, that is,
1
This section is a short version of the previous publications Schmid and Wysotzki (1996) and Schmid and Wysotzki (2000c).


learning by doing. This is true in cognitive modelling (knowledge compilation in ACT, Anderson & Lebiere, 1998; operator chunking in Soar, Rosenbloom & Newell, 1986) as well as in AI planning (macro learning, see sect. 2.5.2 in chap. 2). Acquisition of such "linear" macros can result in a reduction of search, because composite operators can now be applied instead of primitive ones. In cognitive science, operator composition is viewed as the mechanism responsible for the acquisition of automatisms and as the main explanation for speed-up effects of learning (Anderson et al., 1989).

In contrast, in AI planning, learning of domain-specific control knowledge, that is, learning of problem solving strategies, is investigated as an additional mechanism, as discussed in section 2.5.2 in chapter 2. One possibility to model the acquisition of control knowledge is the learning of "cyclic" macros (Shell & Carbonell, 1989; Shavlik, 1990). Learning a problem solving strategy ideally eliminates search completely because the complete sub-goal structure of a problem domain is known. For example, a macro for a one-way transportation problem such as rocket represents the strategy that all objects must be loaded before the rocket moves to its destination (see sect. 3.1.4.2 in chap. 3 and sect. 8.4 in chap. 8). There is empirical evidence, for example in the Tower of Hanoi domain, that people can acquire such kind of knowledge (Anzai & Simon, 1979).

9.2.2 Three Levels of Learning

We propose that our system, as given in figure 1.2, provides a general framework for modeling the acquisition of strategies from problem solving experience: The starting-point is a problem specification, given as primitive operators, their application conditions, and a problem solving goal. Using DPlan, the problem is explored, that is, operator sequences for transforming some initial states into a state fulfilling the goals are generated. This experience is integrated into a finite program, corresponding roughly to a set of operator chunks as discussed above, using plan transformation. Subsequently, this experience is generalized to a recursive program scheme (RPS) using a folding technique. That is, the system infers a domain-specific control strategy which simultaneously represents the (goal) structure of the current domain. Alternatively, as we will discuss in part III, after initial exploration a similar problem might be retrieved from memory. That is, the system recognizes that the new problem can be solved with the same strategy as an already known problem. In this case, a further generalized scheme, representing the abstract strategy for solving both problems, is learned.

All three steps of learning are illustrated in figure 9.1 for the simple clearblock example which was discussed in chapters 3, 7, and 8: To reach the goal that block C has a clear top, all blocks lying above C have to be put on the table. This can be done by applying the operator puttable(x) if block x has a clear top. The finite program represents the experience with three initial states as a nested conditional expression. The most important aspect


of this finite program, which makes it different from the cognitive science approaches to operator chunking, is that objects are not referred to directly by their names but with the help of a selector function (topof). Selector functions are inferred from the structure of the universal plan (as described in chap. 8). This process in a way captures the evolution of perceptual chunks from problem solving experience (Koedinger & Anderson, 1990): For the clearblock example, the introduction of topof(x) represents the knowledge that the relevant part of the problem is the block lying on top of the currently focussed block. For more complex domains, such as Tower of Hanoi, the data type represents "partial solutions" (such as how many discs are already in the correct position, or looking for the largest free disc).
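The role of the selector function can be made concrete with a runnable Python sketch of the recursive clearblock strategy (the state representation, a dict mapping each block to the block lying directly on it, is our own assumption, not the book's notation):

```python
# Hypothetical state representation: each block maps to the block lying
# directly on it, or None if its top is clear.

def topof(x, state):
    return state[x]

def cleartop(x, state):
    return state[x] is None

def puttable(x, state):
    """Apply puttable(x): put the (clear) block x on the table."""
    new = dict(state)
    for below, above in state.items():
        if above == x:
            new[below] = None  # x no longer lies on `below`
    return new

def clearblock(x, state):
    """clearblock(x, s) = IF cleartop(x) THEN s
                          ELSE puttable(topof(x), clearblock(topof(x), s))"""
    if cleartop(x, state):
        return state
    y = topof(x, state)
    return puttable(y, clearblock(y, state))

# tower: A on B on C
s = {'C': 'B', 'B': 'A', 'A': None}
print(clearblock('C', s))  # {'C': None, 'B': None, 'A': None}
```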

[Figure 9.1 illustrates the three levels. Generalization over problem states yields an operator chunk, the finite program
clearblock(x, s) = IF cleartop(x) THEN s
                   ELSE IF cleartop(topof(x)) THEN puttable(topof(x), s)
                   ELSE IF cleartop(topof(topof(x)))
                        THEN puttable(topof(x), puttable(topof(topof(x)), s))
                   ELSE undefined.
Generalization over recursively enumerable problem spaces yields the problem solving strategy
clearblock(x, s) = IF cleartop(x) THEN s ELSE puttable(topof(x), clearblock(topof(x), s)).
Generalization over classes of problems yields a scheme hierarchy with the recursive program scheme
r(x, v) = IF b(x) THEN v ELSE op1(op2(x), r(op2(x), v)),
which covers clearblock as well as addx(x, y) = IF eq0(x) THEN y ELSE plus(pred(x), addx(pred(x), y)).]

Fig. 9.1. Three Levels of Generalization

In the second step, this primitive behavioral program is generalized over recursively enumerable problem spaces: a strategy for clearing a block in n-block problems is extrapolated and interpreted as a recursive program scheme. Induction of generalized structures from examples is a fundamental characteristic of human intelligence, as proposed, for example, by Chomsky as the "language acquisition device" (Chomsky, 1959) or by Holland, Holyoak, Nisbett, and Thagard (1986). This ability to extract general rules from some initial experience is captured in the presented technique of folding finite programs by detecting regularities.

The representation of problem schemes by recursive program schemes differs from the representation formats proposed in cognitive psychology (Rumelhart & Norman, 1981). But RPSs capture exactly the characteristics


which are attributed to cognitive schemes, namely that schemes represent procedural knowledge ("knowledge how") which the system can interrogate to produce "knowledge that", that is, knowledge about the structure of a problem. Thereby problem schemes are suitable for modeling analogical reasoning: The acquired scheme represents not only the solution strategy for unstacking towers of arbitrary height but for all structurally identical problems. Experience with a blocks-world problem can, for example, be used to solve a numerical problem which has the same structure by re-interpreting the meaning of the symbols of an RPS. After solving some problems with similar structures, more general schemes evolve and problem solving can be guided by abstract schemes.
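This re-interpretation can be sketched in Python: one generic scheme of the form r(x, v) = IF b(x) THEN v ELSE op1(op2(x), r(op2(x), v)) is instantiated once for a blocks-world reading and once for a numerical reading (all concrete names and representations below are our own illustration):

```python
def make_r(b, op1, op2):
    """Generic scheme: r(x, v) = v if b(x) else op1(op2(x), r(op2(x), v))."""
    def r(x, v):
        return v if b(x) else op1(op2(x), r(op2(x), v))
    return r

# Blocks-world reading: x is the chain of blocks from the focused block
# upward, v an action sequence; op2 steps to the block above (topof),
# op1 records a puttable action.
unstack = make_r(b=lambda x: len(x) == 1,
                 op1=lambda y, acts: acts + [('puttable', y[0])],
                 op2=lambda x: x[1:])

# Numerical reading of the very same scheme: op2 is the predecessor,
# op1 adds it to the accumulated value.
sumdown = make_r(b=lambda x: x == 0,
                 op1=lambda p, acc: p + acc,
                 op2=lambda x: x - 1)

print(unstack(['C', 'B', 'A'], []))  # [('puttable', 'A'), ('puttable', 'B')]
print(sumdown(4, 0))                 # 3 + 2 + 1 + 0 = 6
```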


Part III

Schema Abstraction: Building a Hierarchy of Problem Schemes


10. Analogical Reasoning and Generalization

"I wish you'd solve the case, Miss Marple, like you did the time Miss Wetherby's gill of picked shrimps disappeared. And all because it reminded you of something quite different about a sack of coals." "You're laughing, my dear," said Miss Marple, "but after all, that is a very sound way of arriving at the truth. It's really what people call intuition and make such a fuss about. Intuition is like reading a word without having to spell it out. A child can't do that because it has had so little experience. But a grown-up person knows the word because they've seen it often before. You catch my meaning, Vicar?" "Yes," I said slowly, "I think I do. You mean that if a thing reminds you of something else, well, it's probably the same kind of thing."

-- Agatha Christie, The Murder at the Vicarage, 1930

Analogical inference is a special case of inductive inference where knowledge is transferred from a known base domain to a new target domain. Analogy is a prominent research topic in cognitive science: In philosophy, it is discussed as a source of creative thinking; in linguistics, similarity-creating metaphors are studied as a special case of analogical reasoning; in psychology, analogical reasoning and problem solving are researched in innumerable experiments and there exist several process models of analogical problem solving and learning; in artificial intelligence, besides some general approaches to analogy, programming by analogy and case-based reasoning are investigated.

In the following (sect. 10.1), we will first give an overview of central concepts and mechanisms of the field. Afterwards (sect. 10.2), we will introduce analogical reasoning for domains with different levels of complexity, from proportional analogies to analogical problem solving and planning. Then we will discuss programming by analogy as a special case of problem solving by analogy (sect. 10.3). Finally (sect. 10.4), we will give some pointers to literature.

10.1 Analogical and Case-Based Reasoning

10.1.1 Characteristics of Analogy

Typical for AI and psychological models of analogical reasoning is that domains or problems are considered as structured objects, such as terms or


relational structures (semantic nets). Structured objects are often represented as graphs with basic objects as nodes and relations between objects as arcs. Relations are often distinguished into object attributes (unary relations), relations between objects (first-order n-ary relations), and higher-order relations.

Based on this representational assumption, Gentner (1983) introduced a characterization of analogy and contrasted analogy with other modes of transfer between two domains (see tab. 10.1): In contrast to mere appearance and literal similarity, analogy is based on mapping the relational structure of domains rather than object attributes. For the famous Rutherford analogy (see fig. 10.4), "The atom is like our solar system", it is relevant that there is a central object (sun/nucleus) and objects revolving around this object (planets/electrons), but it is irrelevant how much these objects weigh or what their temperature is. The same is true for abstraction, but in contrast to analogy, the objects of the base domain (central force system) are generalized concepts rather than concrete instances. Metaphors can be found to be either a form of similarity-based transfer (comparable to mere appearance) or a form of analogy where a similarity is created between two previously unconnected domains (Indurkhya, 1992).
1
In contrast to analogy, case-based reasoning often relies on a simple mapping of domain attributes (Kolodner, 1993).

Table 10.1. Kinds of Predicates Mapped in Different Types of Domain Comparison (Gentner, 1983, Tab. 1, extended)

                     No. of Attributes   No. of Relations
                     mapped to target    mapped to target   Example
Mere Appearance      Many                Few                A sunflower is like the sun.
Literal Similarity   Many                Many               The K5 solar system is like our solar system.
Analogy              Few                 Many               The atom is like our solar system.
Abstraction          Few                 Many               The atom is a central force system.
Metaphor             Many                Few                Her smile is like the sun.
                     Few                 Many               King Louis XIV was like the sun.

Below we will give a hierarchy of analogical reasoning, from simple proportional analogies where only one relation is mapped to complex analogies between (planning or programming) problems. In chapter 11 we will introduce a graph representation for water-jug problems in detail.

A distinction is often made between within- and between-domain analogies (Vosniadou & Ortony, 1989). For example, using the solution of one programming problem (factorial) to construct a solution to a new problem (sum) is considered as within-domain (Anderson & Thompson, 1989), while
1
Famous are the metaphors of Philip Marlowe. Just to give you one: "The purring voice was now as false as an usherette's eyelashes and as slippery as a watermelon seed." Raymond Chandler, The Big Sleep, 1939.


transferring knowledge about the solar system to explain the structure of an atom is considered as between-domain. Because analogy is typically described as structure mapping (see below), we think that this classification is an artificial one: When source and target domains are represented as relational structures and these structures are mapped by a syntactical pattern matching algorithm, the content of these structures is irrelevant. In case-based reasoning, base and target are usually from the same domain.

10.1.2 Sub-Processes of Analogical Reasoning

Analogical reasoning is typically described by the following, possibly interacting, sub-processes:

Retrieval: For a given new target domain/problem, a "suitable", "similar" base domain/problem is retrieved (from memory).
Mapping: The base and target structures are mapped.
Transfer: The target is enriched with information from the base.
Generalization: A structure generalizing over base and target is induced.

Typically, it is assumed that retrieval is based on superficial similarity, that is, a base problem is selected which shares a high number of attributes with the new problem. Superficial similarity and structural similarity need not correspond, and it was shown in numerous studies that human problem solvers have difficulties in finding an adequate base problem (Novick, 1988; Ross, 1989).

Most cognitive science models of analogical reasoning focus on modeling the mapping process (Falkenhainer, Forbus, & Gentner, 1989; Hummel & Holyoak, 1997). The general assumption is that objects of the base domain are mapped to objects of the target domain in a structurally consistent way (see fig. 10.1). The approaches differ with respect to the constraints on mapping, allowing only first-order or also higher-order mapping, and restricting mapping to isomorphism, homomorphism, or weaker relations (Holland et al., 1986).

[Figure: the base objects O, O (connected by relation R*) are mapped by f onto the target objects O', O' (connected by relation R'*).]

Fig. 10.1. Mapping of Base and Target Domain

Gentner (1983) proposed the following constraints for mapping a base domain to a target domain:


First Order Mapping: An object in the base can be mapped on a different object in the target, but base relations must be mapped on identical relations in the target.

Isomorphism: A set of base objects must be mapped one-to-one and structurally consistently to a set of target objects. That is, if r(o_1, o_2) holds for the base, then r(f(o_1), f(o_2)) must hold for the target, where f(o) is the mapping function.

Systematicity: Mapping of large parts of the relational structure is preferred over mapping of small isolated parts, that is, relations with greater arity are preferred.

Relying on these constraints, mapping corresponds to the problem of finding the largest common sub-graph of two graphs (Schädler & Wysotzki, 1999). If the base is mapped to the target, the parts of the base graph which are connected with the common sub-graph are transferred to the target, where base objects are translated to target objects in accordance with the mapping. This kind of transfer is also called inference, because relations known in the base are assumed to also hold in the target. If the isomorphism constraint is relaxed, transfer additionally can involve modifications of the base structure. This kind of transfer is also called adaptation (see chap. 11).
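For very small relational structures, this sub-graph view of mapping can be brute-forced; the following Python sketch (our own simplification, maximizing the number of consistently mapped objects rather than applying a full systematicity criterion) recovers the Rutherford mapping:

```python
from itertools import combinations, permutations

def best_mapping(base_nodes, base_rels, target_nodes, target_rels):
    """Largest structurally consistent first-order object mapping; relations
    are sets of (relation_name, obj1, obj2) triples, names mapped identically."""
    for k in range(len(base_nodes), 0, -1):   # try large mappings first
        for subset in combinations(base_nodes, k):
            for image in permutations(target_nodes, k):
                f = dict(zip(subset, image))
                consistent = all((r, f[a], f[b]) in target_rels
                                 for (r, a, b) in base_rels
                                 if a in f and b in f)
                if consistent:
                    return f
    return {}

# Rutherford analogy: sun/planet vs. nucleus/electron
base = {('attracts', 'sun', 'planet'),
        ('revolves-around', 'planet', 'sun')}
target = {('attracts', 'nucleus', 'electron'),
          ('revolves-around', 'electron', 'nucleus')}
m = best_mapping(['sun', 'planet'], base, ['nucleus', 'electron'], target)
print(m)  # {'sun': 'nucleus', 'planet': 'electron'}
```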

After successful analogical transfer, the common structure of base and target can be generalized to a more general scheme (Novick & Holyoak, 1991). For example, the structure of the solar system and the Rutherford atom model can be generalized to the more general concept of a central force system. To our knowledge, none of the process models in cognitive science models generalization learning (see chap. 12).

In case-based reasoning, mostly only retrieval is addressed, and the retrieved case is presented to the system user as a source of information which he might transfer to a current problem.

10.1.3 Transformational versus Derivational Analogy

The subprocesses described above characterize so-called transformational analogy. An alternative approach for analogical problem solving, called derivational analogy, was proposed by Carbonell (1986). He argues that, from a computational point of view, transformational analogy is often inefficient and can result in suboptimal solutions. Instead of calculating a base/target mapping and solving the target by transfer of the base structure, it might be more efficient to reconstruct the solution process, that is, use a remembered problem solving episode as a guideline for solving the new problem. A problem solving episode consists of the reasoning traces (derivations) of past solution processes, including the explored subgoal structure and the methods used. Derivational analogy can be characterized by the following sub-processes: (1) retrieving a suitable problem solving episode, (2) applying the retrieved derivation to the current situation by "replaying" the problem


solving episode, checking for each step whether the derivation is still applicable in the new problem solving context. Derivational analogy is, for example, used within the AI planning system Prodigy (Veloso & Carbonell, 1993). An empirical comparison of transformational and derivational analogy was conducted by Schmid and Carbonell (1999).

10.1.4 Quantitative and Qualitative Similarity

As mentioned above, retrieval is typically based on similarity of attributes. For comparison of base and target, all kinds of similarity measures defined on attribute vectors can be used. An overview of measures of similarity and distance is typically given in textbooks on cluster analysis, such as Eckes and Roßbach (1980). In psychology, non-symmetrical measures are discussed, for example, the contrast model of Tversky (1977).

In analogy research, the focus is on measures for structural similarity. A variety of measures are based on the size of the greatest common sub-graph of two structures. Such a measure for unlabeled graphs was for example proposed by Zelinka (1975): d(G, H) = |N| - |N_U|, where |N| is the number of nodes of the larger graph and |N_U| is the number of nodes in the greatest common sub-graph of G and H. A measure considering the number of nodes and arcs in the graphs is introduced in chapter 11.

Another approach to structural similarity is to consider the number of transformations which are necessary for making two graphs identical (Bunke & Messmer, 1994). A measure of transformation distance for trees was for example proposed by Lu (1979): d(T1, T2) = w_s·s + w_d·d + w_i·i, where s represents the number of substitutions (renamings of node labels), d the number of deletions of nodes, and i the number of insertions of nodes. The parameters w give operation-specific weights. To guarantee that the measure is a metric, it must hold that w_s < w_d, w_s < w_i, and w_d = w_i (proof in Mercy, 1998). Transformational similarity is also the basis for structural information theory (Leeuwenberg, 1971).
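Given pre-computed edit counts, Lu's distance is just a weighted sum. The following sketch is purely illustrative (the function name and default weights are assumptions, chosen to satisfy the metric condition above):

```python
# Hypothetical helper for Lu's (1979) tree-transformation distance
# d(T1, T2) = w_s*s + w_d*d + w_i*i, given pre-computed counts of
# substitutions (s), deletions (d), and insertions (i).

def lu_distance(s, d, i, w_s=1, w_d=2, w_i=2):
    """Weighted transformation distance between two trees.

    The default weights satisfy the metric condition from the text:
    w_s < w_d, w_s < w_i, and w_d = w_i.
    """
    assert w_s < w_d and w_s < w_i and w_d == w_i, "metric condition violated"
    return w_s * s + w_d * d + w_i * i

# Example: 2 relabelings, 1 deletion, 1 insertion
print(lu_distance(2, 1, 1))  # 1*2 + 2*1 + 2*1 = 6
```

Computing the counts themselves requires a tree-edit-distance algorithm; only the weighting step is shown here.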

All measures discussed so far are quantitative, that is, a numerical value is calculated which represents the similarity between base and target. Usually, a threshold is defined, and a domain or problem is considered as a candidate for analogical reasoning if its similarity to the target problem lies above this threshold. A qualitative approach to structural similarity was proposed by Plaza (1995). Plaza's domain of application is the retrieval of typed feature terms (e. g., records of employees), which can be hierarchically organized; for example, the field "name" consists of a further feature term with the fields "first name" and "surname". For a given (target) feature term, the most similar (base) feature term from a data base is identified by the following method: each pair of the fixed target and a term from the data base is anti-unified. The resulting anti-instances are ordered with respect to their subsumption relation and the most specific anti-instance is returned. Anti-unification was introduced in section 6.3.3 in chapter 6 under the name of


least general generalization. An anti-unification approach for programming by analogy is presented in chapter 12.
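Plaza's retrieval method rests on computing least general generalizations. A minimal first-order anti-unification over terms written as nested tuples might look as follows; the term representation and variable naming are illustrative assumptions, not the cited systems' code:

```python
# Minimal first-order anti-unification (least general generalization).
# Terms are nested tuples ('f', arg1, ...) or atoms. Differing subterms
# are replaced by variables; repeated identical differences share one
# variable. A second-order variant would also generalize function symbols.

import itertools

def anti_unify(s, t, subst=None, counter=None):
    """Return the most specific generalization (anti-instance) of s and t."""
    if subst is None:
        subst, counter = {}, itertools.count()
    if s == t:
        return s
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        # Same function symbol and arity: generalize argument-wise.
        return (s[0],) + tuple(anti_unify(a, b, subst, counter)
                               for a, b in zip(s[1:], t[1:]))
    if (s, t) not in subst:
        subst[(s, t)] = f"X{next(counter)}"
    return subst[(s, t)]

print(anti_unify(('f', 'a', ('g', 'b')), ('f', 'c', ('g', 'b'))))
# ('f', 'X0', ('g', 'b'))
```

Ordering the anti-instances of a target against all stored base terms by subsumption then yields Plaza's most specific candidate.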

10.2 Mapping Simple Relations or Complex Structures

10.2.1 Proportional Analogies

The simplest form of analogical reasoning are proportional analogies of the form

A:B :: C:? "A is to B as C is to ?".

Expression A to B is the base domain and expression C to ? is the target domain. The relation existing between A and B must be identified and applied to the target domain. Such problems are typically used in intelligence tests, and algorithms for solving proportional analogies were introduced early in AI research (Evans, 1968; Winston, 1980). In intelligence tests, items are often semantic categories, as in "Rose is to Flower as Herring is to ?". In this example, the relevant relation is subordinate/superordinate and the superordinate concept to "Herring" must be retrieved (from semantic memory). Because the solution of such analogies is knowledge dependent, letter strings (Hofstadter & The Fluid Analogies Research Group, 1995; Burns, 1996) or simple geometric figures (Evans, 1968; Winston, 1980; O'Hara, 1992) are often used as alternatives.

An example of an analogy which can be solved with Evans' Analogy program (Evans, 1968) is given in figure 10.2. In a first step, each figure is decomposed into simple geometric objects (such as a rectangle and a triangle). Because decomposition is ambiguous for overlapping objects, Evans used Gestalt laws and context information for decomposition. For the pairs of figures (A, B), (A, C), (B, C), (C, 1), ..., (C, 5), relations between objects and between spatial positions are constructed. These relations are used for constructing transformation rules A -> B. Transformation includes object mapping, deletion, and insertion. The rules are abstracted such that they can be applied to figure C, resulting in a new figure which hopefully corresponds to one of the given alternatives for D.

An algorithm which solves geometric analogy problems using context-dependent re-descriptions was proposed by O'Hara (1992). An example is given in figure 10.3. All figures must be composed of lines. An algebra is used to represent/construct figures. Operations are translation, rotation, reflection, and scaling of objects, and glueing of pairs of objects.

In a first step, an initial representation of figure A is constructed. Then, a representation of B and a transformation t are constructed such that B = t(A). A representation of C and an isomorphic mapping ϕ are constructed such that C = ϕ(A). The mapping is extended to ϕ′, preserving isomorphism,


[Figure: figures A, B, C, the unknown D, and answer alternatives 1-5.]

Fig. 10.2. Example for a Geometric-Analogy Problem (Evans, 1968, p. 333)

[Figure: an analogy between figure descriptions built from GLUE and ROT[p,60] operations.]

Fig. 10.3. Context-Dependent Descriptions in Proportional Analogy (O'Hara, 1992)


and D = ϕ′(B) is constructed. If no mapping C = ϕ(A) can be found, the algorithm backtracks and constructs a different representation for B (re-description).

Letter-string domains are easier to model than geometric domains, and different approaches to algebraic representation (Leeuwenberg, 1971; Indurkhya, 1992) have been proposed. The Copycat program of Hofstadter and The Fluid Analogies Research Group (1995) solves letter string analogies of the form abc : abd :: kji : ?.
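Copycat solves such problems with rich mechanisms (conceptual slippage, parallel codelets); the following toy solver is entirely an illustrative assumption that handles only literal successor replacement, giving the answer kjj where Copycat can also produce kjh by treating kji as a descending sequence:

```python
# A toy, purely literal solver for letter-string analogies like
# abc : abd :: kji : ?. It infers "replace the letter at position p by
# its alphabetic successor" from the A -> B pair and applies the rule
# verbatim to C, with no conceptual slippage.

def successor(ch):
    return chr(ord(ch) + 1)

def infer_rule(a, b):
    """Return the position where a and b differ by a successor step."""
    assert len(a) == len(b)
    for pos, (x, y) in enumerate(zip(a, b)):
        if x != y:
            assert y == successor(x), "only successor changes handled"
            return pos
    raise ValueError("no difference found")

def apply_rule(c, pos):
    return c[:pos] + successor(c[pos]) + c[pos + 1:]

pos = infer_rule("abc", "abd")
print(apply_rule("kji", pos))  # literal answer: kjj
```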

10.2.2 Causal Analogies

In proportional analogies, a domain is represented by two structured objects A and B with a single relation or a small set of relations t(A) = B which are relevant for analogical transfer. More complex domains are explanatory structures, for example for physical phenomena. In the Rutherford analogy, introduced above, knowledge about the solar system is used to infer the cause why electrons are rotating around the nucleus of an atom (see fig. 10.4).

Systems realizing mapping and inference for such domains are for example the structure mapping engine (SME, Falkenhainer et al., 1989) or LISA (Hummel & Holyoak, 1997). The SME is based on the constraints for structure mapping proposed by Gentner (1983), which were introduced above. Note that, although the inferred higher-order relation is labeled "cause", mapping and inference are purely syntactical, as described for proportional analogies.

[Figure: structural descriptions of (a) the solar system (Sun, Planet-i, Planet-j; attributes YELLOW, HOT, MASSIVE; relations ATTRACTS, REVOLVES AROUND, MORE MASSIVE THAN, HOTTER THAN; higher-order relation CAUSE) and (b) the atom (Nucleus, Electron-i).]

Fig. 10.4. The Rutherford Analogy (Gentner, 1983)

10.2.3 Problem Solving and Planning by Analogy

Many people have pointed out that analogy is an important mechanism in problem solving. For example, Polya (1957) presented numerous examples of how analogical reasoning can help students learn to prove mathematical theorems. Psychological studies on analogical problem solving mostly address


the solving of mathematical problems (Novick, 1988; Reed, Ackinclose, & Voss, 1990) or program construction (Anderson & Thompson, 1989; Weber, 1996). Typically, students are presented with worked-out examples which they can use to solve a new (target) problem. The standard example for programming is using factorial as base problem to construct sum:

fac(x) = if(eq0(x), 1, *(x, fac(pred(x))))
sum(x) = if(eq0(x), 0, +(x, sum(pred(x)))).
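The two programs above can be written as runnable Python so the shared recursive scheme (zero test, base value, operator applied to x and the recursive call on the predecessor) is visible directly; `sum_to` is renamed only to avoid shadowing Python's builtin:

```python
# The factorial/sum analogy as runnable code. Both functions share one
# scheme: if(eq0(x), <base>, <op>(x, f(pred(x)))).

def fac(x):
    # if(eq0(x), 1, *(x, fac(pred(x))))
    return 1 if x == 0 else x * fac(x - 1)

def sum_to(x):
    # if(eq0(x), 0, +(x, sum(pred(x))))
    return 0 if x == 0 else x + sum_to(x - 1)

print(fac(4), sum_to(4))  # 24 10
```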

Anderson and Thompson (1989) propose an analogy model where programming problems and solutions are represented as schemes with function slots representing the intended operation of the program and form slots representing the syntactical realization. For a target problem, only the function slot is filled and the form slot is inferred from the given base problem.

Examples of simple word algebra problems used in experiments by Reed et al. (1990) are given in table 10.2. Analogical problem solving means to identify the relations between concepts in a given problem with the equation for calculating the solution. For a new (target) problem, the concepts in the new text have to be mapped to the concepts in the base problem and then the equation can be transferred. Reed et al. (1990) could show that problem solving success is higher if a base problem which includes the target problem is used (e. g., the third problem in tab. 10.2 as base and the second as target) but that most subjects did not select the inclusive problems as most helpful if they could choose themselves. In chapter 11 we report experiments with different variants of inclusive problems in the water jug domain.

Table 10.2. Word Algebra Problems (Reed et al., 1990)

A group of people paid $238 to purchase tickets to a play. How many people were in the group if the tickets cost $14 each?
    $14 = $238/n

A group of people paid $306 to purchase theater tickets. When 7 more people joined the group, the total cost was $425. How many people were in the original group if all tickets had the same price?
    $306/n = $425/(n+7)

A group of people paid $70 to watch a basketball game. When 8 more people joined the group, the total cost was $120. How many people were in the original group if the larger group received a 20% discount?
    0.8 ($70/n) = $120/(n+8)
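The three equations in table 10.2 can be checked by a brute-force search over integer group sizes; this is purely an illustrative verification, not part of the cited experiments:

```python
# Check the Table 10.2 equations by searching for the integer group
# size n that makes each equation balance.

def solve(eq, max_n=1000):
    """Return all n in 1..max_n-1 for which eq(n) is (numerically) zero."""
    return [n for n in range(1, max_n) if abs(eq(n)) < 1e-9]

print(solve(lambda n: 14 - 238 / n))                  # [17]
print(solve(lambda n: 306 / n - 425 / (n + 7)))       # [18]
print(solve(lambda n: 0.8 * 70 / n - 120 / (n + 8)))  # [7]
```

Note that the third (inclusive) problem contains the structure of the second: dropping the 0.8 discount factor turns it into the same two-group equation form.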

An AI system which addresses problem solving by analogy is Prodigy (Veloso & Carbonell, 1993). Here, a derivational analogy mechanism is used (see above). Prodigy-Analogy was, for example, applied to the rocket domain (see sect. 3.1.4.2 in chap. 3): a planning episode for solving a rocket problem


is stored (e. g., transporting two objects) and indexed with the initial and goal states. For a new problem (e. g., transporting four objects) the old episode can be retrieved by matching the current initial and goal states with the indices of stored episodes, and the retrieved episode is replayed. While Veloso and Carbonell (1993) could demonstrate efficiency gains of reuse over planning from scratch for some example domains, Nebel and Koehler (1995) provided a theoretical analysis that in the worst case reuse is not more efficient than planning from scratch. The first bottleneck is retrieval of a suitable case and the second is modification of old solutions.

10.3 Programming by Analogy

Program construction can be seen as a special case of problem solving. In part II we have discussed automatic program synthesis as an area of research which tries to come up with mechanisms to automatize or support at least routine parts of program development. Programming by analogy is a further such mechanism, which has been discussed since the beginning of automatic programming research. For example, Manna and Waldinger (1975) claim that retention of previously constructed programs is a powerful way to acquire and store knowledge.

Dershowitz (1986) presented an approach to construct imperative programs by analogical reasoning. A base problem, e. g., calculating the division of two real numbers with some tolerance, is given by its specification and solution. The solution is annotated with additional statements (such as assert, achieve, purpose) which make it possible to relate specification and program. For a new problem, e. g., calculating the cube-root of a number with some tolerance, only the specification is given. By mapping the specifications (see fig. 10.5), the statements in the base program are transformed into statements of the target program to be constructed. Some additional inference methods are used to generate a program which is correct with respect to the specification from the initial program which was constructed by analogy.

Real-Division:
  ASSERT 0 <= c < d; e > 0
  ACHIEVE |c/d - q| < e
  VARYING q

Cube-Root:
  ASSERT a >= 0; e > 0
  ACHIEVE |a^(1/3) - r| < e
  VARYING r

Mapping: q => r, c/d => a^(1/3)

Transformations:
  q -> r,
  u/v -> u^(1/3) (replace each division operator by a cube-root operator),
  c -> a

Fig. 10.5. Base and Target Specification (Dershowitz, 1986)


Both programs are based on the more general strategy of binary search. As a last step of programming by analogy, Dershowitz proposes to construct a scheme by abstracting over the programs. For the example above, the operators for division and cube-root are generalized to Φ(u, v), that is, a second-order variable is introduced. New binary search problems can then be solved by instantiating the scheme. If program schemes are acquired, deductive approaches to program synthesis can be applied, for example, the stepwise refinement approach used in the KIDS system (see sect. 6.2.2.3 in chap. 6).
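The shared binary-search strategy can be sketched as a single scheme parameterized by the operator: find r with |f(r) - target| < e by interval halving. The function shape and names below are illustrative assumptions, not Dershowitz's annotated programs:

```python
# A binary-search scheme: instantiating f with multiplication (to invert
# division) or cubing (to invert cube-root) yields the two examples.

def binary_search(f, target, lo, hi, e=1e-6):
    """Find r in [lo, hi] with |f(r) - target| < e, for monotone f."""
    for _ in range(200):  # 200 halvings shrink any interval far below e
        r = (lo + hi) / 2
        if abs(f(r) - target) < e:
            return r
        if f(r) < target:
            lo = r
        else:
            hi = r
    return (lo + hi) / 2

# Real division c/d as search for q with q*d = c (here c=1, d=3):
q = binary_search(lambda q: q * 3.0, 1.0, 0.0, 1.0)
# Cube root of a as search for r with r**3 = a (here a=27):
r = binary_search(lambda r: r ** 3, 27.0, 0.0, 27.0)
print(round(q, 3), round(r, 3))  # 0.333 3.0
```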

Crucial for analogical programming is to detect relations between pairs of programs. The theoretical foundation for constructing mappings are program morphisms (Burton, 1992; Smith, 1993). A technique to perform such mappings is anti-unification. A completely implemented system for programming by analogy, based on second-order anti-unification and generalization morphisms, was presented by Hasker (1995). For a given specification and program, the system user provides a program derivation (cf. derivational analogy), that is, steps for transforming the specification into the program. Second-order anti-unification is used to detect analogies between pairs of specifications which are represented as combinator terms. First, he introduces anti-unification for monadic combinator terms, that is, allowing only unary terms. Then, he introduces anti-unification for product types. Monadic combinator terms are not expressive enough to represent programs, while combinator terms for cartesian product types which allow sub-terms to be deleted are too general and allow infinitely many minimal generalizations. Therefore, Hasker introduces relevant combinator terms as a subset of combinators for cartesian product types which allow introducing pairs of terms but do not allow ignoring sub-terms. For this class of combinator terms there still exist possibly large sets of minimal generalizations. Therefore, Hasker introduces some heuristics and allows for user interaction to guide construction of a "useful" generalization.

Our own approach to programming by analogy is still at its beginning. We consider how folding of finite program terms into recursive programs (see chap. 7) can alternatively be realized by analogy or abstraction. That is, mapping is performed on pairs of finite programs, where for one program the recursive generalization is known and for the other it is not (see chap. 12). In contrast to most approaches to analogical reasoning, retrieval is not based on attribute similarity but on structural similarity. We use anti-unification for retrieval (Plaza, 1995) as well as for mapping and transfer (Hasker, 1995). Calculating the anti-instance of two terms results in their maximally specific generalization. Our current approach to anti-unification is much more restricted than the approach of Hasker (1995). Our restricted approach guarantees that the minimal generalization of two terms is unique. But for retrieval, based on identifying the maximally specific anti-instance in a subsumption hierarchy, as proposed by Plaza (1995, see above), typically sets of candidates for analogical transfer are returned. Here, we must provide additional information to select a useful base. One possibility is to consider the size and type of structural overlap between terms. We conducted psychological experiments to identify some general criteria of structural base/target relations which allow successful transfer for human problem solvers (see chap. 11). If a finite program (associated with its recursive generalization) is selected as base, the term substitutions calculated while constructing the anti-instance of this program with the target can be applied to the recursive base program, and thereby the recursive generalization for the target is obtained (see chap. 12).

We believe that human problem solvers prefer abstract schemes over concrete base problems to guide solving novel problems and use concrete problems as guidance only if they are inexperienced in a domain. Therefore, in our work we focus on abstraction rather than analogy. Our approach to anti-unification works likewise for concrete program terms and for terms containing object and function variables, that is, schemes. Anti-unifying a new finite program term with the unfolding of some program scheme results in a further generalization. Consequently, a hierarchy of abstract schemes develops over problem solving experience.

Our notion of representing programs as elements of some term algebra instead of some given programming language (see chap. 7) allows us to address analogy and abstraction in the domain of programming as well as in more general domains of problem solving (such as solving blocks-world problems or other domains considered in planning) within the same approach. That is, as discussed in chapter 9, we address the acquisition of problem schemes which represent problem solving strategies (or recursive control rules) from experience.

10.4 Pointers to Literature

Analogy and case-based reasoning is an extensively researched domain. A bibliography for both areas can be found at http://www.ai-cbr.org. Classical books on case-based reasoning are Riesbeck and Schank (1989) and Kolodner (1993). A good source for current research are the proceedings of the international conference on case-based reasoning (ICCBR). Cognitive models of analogy are presented, for example, at the annual conference of the Cognitive Science Society (CogSci) and in the journal Cognitive Science. A discussion of metaphors and analogy is presented by Indurkhya (1992). Holland et al. (1986) discuss induction and analogy from the perspectives of philosophy, psychology, and AI. A collection of cognitive science papers on similarity and analogy was edited by Vosniadou and Ortony (1989). Some AI papers on analogy can be found in Michalski et al. (1986).

11. Structural Similarity in Analogical Transfer

"And what is science about?" He explained that scientists formulate theories about how the physical world works, and then test them out by experiments. As long as the experiments succeed, then the theories hold. If they fail, the scientists have to find another theory to explain the facts. He says that, with science, there's this exciting paradox, that disillusionment needn't be defeat. It's a step forward.

-- P. D. James, Death of an Expert Witness, 1977

In this chapter, we present two psychological experiments to demonstrate that human problem solvers can and do use not only base (source) domains which are isomorphic to target domains but also rely on non-isomorphic structures in analogical problem solving. Of course, not every non-isomorphic relation is suitable for analogical transfer. Therefore, we identified which degree of structural overlap must exist between two problems to guarantee a high probability of transfer success. This research is not only of interest in the context of theories of human problem solving. In the context of our system IPAL, criteria are needed for deciding whether it is worthwhile to try to generate a new recursive program by analogical transfer of a program (or by instantiating a program scheme) given in memory or to generate a new solution from scratch, using inductive program synthesis. In the following, we first (sect. 11.1) give an introduction to psychological theories of analogical problem solving and present the problem domain used in the experiments. Then we report two experiments on analogical transfer of non-isomorphic source problems (sect. 11.2 and sect. 11.3). We conclude with a discussion of the empirical results and possible areas of application (sect. 11.4).¹

11.1 Analogical Problem Solving

Analogical reasoning is an often-used strategy in everyday and academic problem solving. For example, if a person already has experience in planning a trip by train, he/she might transfer this knowledge to planning a trip by

¹ This chapter is based on the papers Schmid, Wirth, and Polkehn (2001) and Schmid, Wirth, and Polkehn (1999).


plane. If a student already has knowledge of solving an equation with one additive variable, he/she might transfer the solution procedure to an equation with a multiplicative variable. For analogical transfer, a previously solved problem, called source, has to be similar to the current problem, called target. While a large number of common attributes might help to find an analogy, source and target have to be structurally similar for transfer success (Holyoak & Koh, 1987). In the ideal case, source and target are structurally identical (isomorphic), but this is seldom true in real-life problem solving.

Analogical problem solving is commonly described by the following (possibly interacting) component processes (e. g., Keane, Ledgeway, & Duff, 1994): representation of the target problem, retrieval of a previously solved source problem from memory, mapping of the structures of source and target, transfer of the source solution to the target problem, and generalizing over the common structure of source and target. The empirically best explored processes are retrieval and mapping (see Hummel & Holyoak, 1997, for an overview). Retrieval of a source is assumed to be guided by overall semantic similarity (i. e., common attributes), often characterized as "superficial" in contrast to structural similarity (Gentner & Landers, 1985; Holyoak & Koh, 1987; Ross, 1989). Empirical results show that retrieval is the bottleneck in analogical reasoning and often can only be performed successfully if explicit hints about a suitable source are given (Gick & Holyoak, 1980; Gentner, Ratterman, & Forbus, 1993). Therefore, a usual procedure for studying mapping and transfer is to circumvent retrieval by explicitly presenting a problem as a helpful example (Novick & Holyoak, 1991). In the following, we will take a closer look at mapping and transfer.

11.1.1 Mapping and Transfer

Mapping is considered the core process in analogical reasoning. The decision whether two problems are analogous is based on identifying structural correspondences between them. Mapping is a necessary but not always sufficient condition for successful transfer (Novick & Holyoak, 1991). There are numerous empirical studies concerning the mapping process (cf. Hummel & Holyoak, 1997) and all computational models of analogical reasoning provide an implementation of this component (Falkenhainer et al., 1989; Keane et al., 1994; Hummel & Holyoak, 1997). Mapping is typically modelled as first identifying the corresponding components of source and target and then carrying over the conceptual structure from the source to the target. For example, in the Rutherford analogy (the atom is like the solar system), planets can be mapped to electrons and the sun to the nucleus of an atom, together with relations such as "revolves around" or "more mass than" (Gentner, 1983). In the structure mapping theory (Gentner, 1983) it is postulated that mapping is performed purely syntactically and that it is guided by the principle of systematicity, preferring mapping of greater portions of structure to


mapping of isolated elements. Alternatively, Holyoak and colleagues postulate that mapping is constrained by semantic and pragmatic aspects of the problem (Holyoak & Thagard, 1989; Hummel & Holyoak, 1997). Mapping might be further constrained such that it results in easy adaptability (Keane, 1996). Currently, it is discussed that a target might be re-represented if source/target mapping cannot be performed successfully (Hofstadter & The Fluid Analogies Research Group, 1995; Gentner, Brem, Ferguson, Markman, Levidow, Wolff, & Forbus, 1997).

Based on the mapping of source and target, the conceptual structure of the source can be transferred to the target. For example, the explanatory structure that the planets revolve around the sun because the sun attracts the planets might be transferred to the domain of atoms. Transfer can be faulty or incomplete, even if mapping was performed successfully (Novick & Holyoak, 1991). Negative transfer can also result from a failure in prior sub-processes: construction of an unsuitable representation of the target, retrieval of an inappropriate source problem, or incomplete, inconsistent or inappropriate mapping of source and target (Novick, 1988). Analogical transfer might lead to the induction of a more general schema which represents an abstraction over the common structure of source and target (Gick & Holyoak, 1983). For example, when solving the Rutherford analogy, the more general concept of central force systems might be learned.

11.1.2 Transfer of Non-Isomorphic Source Problems

Our work focusses on analogical transfer in problem solving. There is a marginal and a crucial difference between general models of analogical reasoning and models of analogical problem solving. While in general source and target might be from different domains (between-domain analogies, as the Rutherford analogy), in analogical problem solving source and target typically are from the same domain (within-domain analogies, e. g., Vosniadou & Ortony, 1989). For example, people can use a previously solved algebra word problem as an example to facilitate solving a new algebra word problem (Novick & Holyoak, 1991; Reed et al., 1990), or they can use a computer program with which they are already familiar as an example to construct a new program (Anderson & Thompson, 1989). While the discrimination of between- and within-domain analogies is relevant for the question of how a suitable source can be retrieved, it has no impact on structure mapping if this process is assumed to be performed purely syntactically.

The more crucial difference between models of analogical reasoning and of problem solving is that in analogical reasoning transfer is mostly described by inference (in so-called explanatory analogies, e. g., Gentner, 1983) vs. by adaptation (in problem solving, e. g., Keane, 1996). In the first case, (higher-order) relations given for the source are carried over to the target, as in the explanation given above of why electrons revolve around the nucleus. In analogical problem solving, on the other hand, most often the complete solution


procedure of the source problem is adapted to the target. Analogical transfer of a problem solution subsumes the structural "carry-over" along with possible changes (adaptation) of the solution structure and the application of the solution procedure. For example, if we are presented with the necessary operations to isolate a variable in an equation, we can solve a new equation by adapting the known solution procedure. If the structures of source and target are identical (isomorphic), transfer can be described as simply replacing the source concepts by the target concepts in the source solution. For a source equation 2x + 5 = 9 with solution x = (9 - 5)/2, the target 3x + 4 = 16 can be solved by (1) mapping the numbers of source and target, that is, 2 is mapped to 3, 5 to 4, and 9 to 16, and by (2) substituting the corresponding numbers in the source solution. An example of source-inclusive source/target pair mapping is given below in figure 11.3.
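The isomorphic case can be rendered literally in code: map the source numbers onto the target numbers, then substitute them into the source solution schema x = (c - b)/a. The function shape and parameter names are illustrative assumptions:

```python
# Isomorphic transfer for linear equations a*x + b = c: map the source
# numbers to the target numbers, then substitute them in the source
# solution x = (c - b)/a.

def solve_by_transfer(source, target):
    """source and target are (a, b, c) triples for a*x + b = c."""
    mapping = dict(zip(source, target))    # 2->3, 5->4, 9->16
    a, b, c = (mapping[v] for v in source)  # substitute in the solution
    return (c - b) / a

print(solve_by_transfer((2, 5, 9), (3, 4, 16)))  # (16 - 4) / 3 = 4.0
```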

We are especially interested in conditions for successful transfer of non-isomorphic source solutions. There is a variety of non-isomorphic source/target relations discussed in the literature. First, there are different types of mapping relations: one-to-one mappings (isomorphism), many-to-one, and one-to-many mappings (Spellman & Holyoak, 1996). Secondly, there are different types and degrees of structural overlap (see fig. 11.1): a source might be "completely contained" in the target (source inclusiveness; Reed et al., 1990), or a source might represent all concepts needed for solving the target together with some additional concepts (target exhaustiveness; Gentner, 1980). These are two special cases of structural overlap between source and target. It seems plausible to assume that if the overlap is too small, a problem is no longer helpful for solving the target. Such a problem would not be characterized as a source problem. While there are some empirical studies investigating transfer of non-isomorphic sources (Reed et al., 1990; Novick & Hmelo, 1994; Gholson, Smither, Buhrman, Duncan, & Pierce, 1996; Spellman & Holyoak, 1996), there is no systematic investigation of the structural relation between source and target which is necessary for successful transfer. Our experimental work focusses on the impact of different types and degrees of structural overlap on transfer success; that is, we currently consider only one-to-one mappings.

11.1.3 Structural Representation of Problems

To determine the structural relation between source and target we have to rely on explicitly defined representations of problem structures. In cognitive models of analogical reasoning, problems are typically represented by schemas (SME, Falkenhainer et al., 1989; IAM, Keane et al., 1994) or by semantic nets (ACME, Holyoak & Thagard, 1989; LISA, Hummel & Holyoak, 1997). From a more abstract view, these representations correspond to graphs, where concepts are represented as nodes and relations between them as arcs. Examples of graphs are given in figure 11.1. For actual problems, nodes (and


[Figure: schematic source/target graph pairs for isomorphism, target exhaustiveness, and source inclusiveness.]

Fig. 11.1. Types and degrees of structural overlap between source and target problems


possibly arcs) are labelled. A graph representation of the solar system contains for instance a node labelled with the relation more mass than connected to a node planet-1 and to a node sun.
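Such a labelled graph can be encoded directly; the dictionary format below and the crude label-intersection used as a stand-in for the greatest common sub-graph are illustrative assumptions, not the cited models:

```python
# A minimal labelled-graph encoding of the solar-system fragment:
# relation nodes connected by arcs to the concept nodes they relate.

solar_system = {
    "nodes": {"sun", "planet-1", "more-mass-than-1", "revolves-around-1"},
    "arcs": [
        ("more-mass-than-1", "sun"),       # first argument of the relation
        ("more-mass-than-1", "planet-1"),  # second argument of the relation
        ("revolves-around-1", "planet-1"),
        ("revolves-around-1", "sun"),
    ],
}

# A Zelinka-style comparison on such graphs, using shared node labels
# as a crude stand-in for the greatest common sub-graph:
atom = {
    "nodes": {"nucleus", "electron-1", "more-mass-than-1", "revolves-around-1"},
    "arcs": [],
}
common = solar_system["nodes"] & atom["nodes"]
d = max(len(solar_system["nodes"]), len(atom["nodes"])) - len(common)
print(d)  # 2
```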

While explicit representations are often presented for explanatory analogies (Gick & Holyoak, 1980; Gentner, 1983), this is not true for problem solving. For algebra problems (Novick & Holyoak, 1991; Reed et al., 1990), the mathematical equations can be used to represent the problem structure (see fig. 11.3). In general, when investigating such problems as the Tower of Hanoi (Clement & Richard, 1997; Simon & Hayes, 1976), both the structure of a problem and the problem solving operators, possibly together with application conditions and constraints (Gholson et al., 1996), have to be taken into account.

In the classical transformational view of analogical problem solving (Gentner, 1983), little work has been done which addresses how to model analogical transfer of problems involving several solution steps. In artificial intelligence, Carbonell (1986) proposed derivational analogy for multi-step problems: he models problem solving by analogy as deriving a solution by replay of an already known solution process, checking on the way whether the conditions for operator applications of the source still hold when solving the target. In the following, we nevertheless adopt the transformational approach, describing analogical problem solving by mapping and transfer. That is, we will assume that both the (declarative) description of the problem and procedural information are structurally represented and that a target problem is solved by adapting the source solution to the target problem based on structure mapping. We assume that problems and solutions are represented in schemas capturing declarative as well as procedural aspects, as argued for example by Anderson and Thompson (1989) and Rumelhart and Norman (1981).

When specifying the representation of a problem, we have to decide on its format as well as its content (Gick & Holyoak, 1980). In general, it is not possible to determine all possible aspects associated with a problem, that is, we cannot claim complete representations. We adopt the position of Gick and Holyoak to model at least all aspects which are relevant for successful transfer. A component of the (declarative) description of a problem is relevant if it is necessary for generating the operation sequence which solves the problem. Furthermore, only the operation sequence which solves the problem is regarded as relevant procedural information. A successful problem solver has to focus on these relevant aspects of a problem and should ignore all other aspects. Of course, we do not assume that human problem solvers in general represent only relevant or all relevant aspects of a problem. Our goal is to systematically control variants of structural source/target relations and their impact on transfer, that is, our representational assumptions are not empirical claims but a means for task analysis. We want to construct "normatively complete" graph representations of problems to explore the impact


of different analytically given structural source/target relations on empirically observable transfer success.

In the following, we will first introduce our problem solving domain, water redistribution tasks, and our problem representations. Then we will present two experiments. In the first experiment we will show that problem solvers can transfer a source solution with moderate structural similarity to the target if the problems do not vary in superficial features. In the second experiment we investigate a variety of different structural overlaps between source and target.

11.1.4 Non-Isomorphic Variants in a Water Redistribution Domain

Because we focus on transfer of declarative and procedural aspects of problems, we constructed a problem type that can be classified as interpolation problems, like the Tower of Hanoi problems (Simon & Hayes, 1976), the water jug problems (Atwood & Polson, 1976), or missionary-cannibal problems (Reed, Ernst, & Banerji, 1974; Gholson et al., 1996). Problem solving means to find the correct multi-step sequence of operators that transforms an initial state into the goal state. Interpolation problems have well-defined initial and goal states and usually one well-defined multi-step solution. Thus, they are as suitable for systematically analyzing their structure as, for instance, mathematical problems (Reed et al., 1990), with the advantage that they are not likely to activate school-trained mathematical pre-knowledge.

We constructed a water redistribution domain that is similar to, but more complex than, the water jug problems described by Atwood and Polson (1976). In the initial state three (or four) jugs of different capacity are given. The jugs are initially filled with different amounts of water (initial quantities). The water has to be redistributed between the jugs in such a way that the pre-specified goal quantities are obtained. For example, given are the three jugs A, B and C with capacities c_A = 36, c_B = 45, and c_C = 54 (units). In the initial state quantities are q_A = 16, q_B = 27, and q_C = 34. To reach the goal state the values of these quantities must be transformed into q_A = 25, q_B = 0 and q_C = 52 by redistributing the water among the different jugs (see fig. 11.2).

The task is to determine the shortest sequence of operators that transforms the initial quantities into the goal quantities. The only legal operator available is a pour-operator (the redistribute operator) that is restricted by the following conditions: (1) The only water to pour is the water contained by the jugs in the initial state. (2) Water can be poured only from a non-empty `pour out'-jug into an incompletely filled `pour in'-jug. (3) Pouring always results in either filling the `pour in'-jug up to its capacity, possibly leaving a rest of water in the `pour out'-jug, or emptying the `pour out'-jug, possibly with remaining free capacity in the `pour in'-jug. (4) The amount of water that is poured out of the `pour out'-jug is always the same amount


Fig. 11.2. A water redistribution problem: jugs A, B, C (name/position) with capacities 36, 45, 54; initial quantities 16, 27, 34; goal quantities 25, 0, 52. Solution: pour(C,B), pour(B,A), pour(A,C), pour(B,A)

that is filled into the `pour in'-jug. Formally this pour-operator is defined in the following way:

IF not(q_X(t) = 0) AND not(q_Y(t) = c_Y) THEN pour(X,Y) resulting in:

  IF q_X(t) <= (c_Y - q_Y(t))
  THEN q_Y(t+1) := q_Y(t) + q_X(t)
       q_X(t+1) := q_X(t) - q_X(t)          (i.e., 0, emptying jug X)
  ELSE q_X(t+1) := q_X(t) - (c_Y - q_Y(t))
       q_Y(t+1) := q_Y(t) + (c_Y - q_Y(t))  (i.e., c_Y, filling jug Y)

with
  q_X(t): quantity of jug X at solution step t,
  c_X: capacity of jug X,
  (c_X - q_X(t)): remaining free capacity of jug X.
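The pour-operator and the solution of figure 11.2 can be checked by direct simulation. The following sketch is not from the book; the function and variable names (pour, caps, state) are my own:

```python
# Minimal sketch of the pour-operator defined above.
def pour(state, caps, x, y):
    """Apply pour(X,Y); return the new state, or None if the operator
    is not applicable (X empty or Y already full)."""
    qx, qy = state[x], state[y]
    if qx == 0 or qy == caps[y]:           # precondition of pour(X,Y)
        return None
    free_y = caps[y] - qy                  # remaining free capacity of Y
    if qx <= free_y:                       # THEN branch: empty jug X into Y
        new_qx, new_qy = 0, qy + qx
    else:                                  # ELSE branch: fill jug Y to capacity
        new_qx, new_qy = qx - free_y, caps[y]
    new_state = dict(state)
    new_state[x], new_state[y] = new_qx, new_qy
    return new_state

# The example of figure 11.2: capacities 36/45/54, initial 16/27/34.
caps = {'A': 36, 'B': 45, 'C': 54}
state = {'A': 16, 'B': 27, 'C': 34}
for x, y in [('C', 'B'), ('B', 'A'), ('A', 'C'), ('B', 'A')]:
    state = pour(state, caps, x, y)
print(state)  # {'A': 25, 'B': 0, 'C': 52} -- the goal quantities
```

Replaying the four-step solution indeed reaches the goal quantities q_A = 25, q_B = 0, q_C = 52.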

Because we are interested in which types and degrees of structural overlap are sufficient for successful analogical transfer, we have to ensure that subjects really refer to the source for solving the target problem. That is, the problems should be complex enough to ensure that the correct solution cannot be found by trial and error, and difficult enough to ensure that the abstract solution principle is not immediately inferable. Therefore, we constructed redistribution problems for which there exists only a single (for two problems, two) shortest operator sequence (in problem spaces with over 1000 states and more than 50 cycle-free solution paths).

To construct a structural representation of a problem we were guided by the following principles: (a) the goal quantity of each jug can be described as the initial quantity transformed by a certain (shortest) sequence of operators; (b) relevant declarative attributes (capacities, quantities and relations between these attributes) of the initial and the goal state determine the solution sequence of operators; and (c) each solution step can be described by the


definition of the pour-operator given above. In terms of these principles, operator applications can be re-formulated by equations where a current quantity can be expressed by adding or subtracting amounts of water. For example, using the parameters introduced above, the first operator pour(C,B) of the example presented in figure 11.2 transforms the quantities of jugs B and C of the initial state t = 0 into quantities of state t = 1 in the following way:

q_B(0) = 27, q_C(0) = 34, c_B = 45, c_C = 54:

BECAUSE not(34 = 0) AND not(27 = 45),
pour(C,B) at t = 0 results in:

BECAUSE not(34 <= (45 - 27)):
  q_B(1) = 27 + (45 - 27) = 45
  q_C(1) = 34 - (45 - 27) = 16.

Redistribution problems define a specific goal quantity for each jug. For this reason, there have to be as many equations constructed as there are jugs involved. This group of equations represents all relevant procedural aspects of the problem, that is, it represents the sequence of operators leading to the goal state. For example, the three equations of the three jugs in figure 11.2 are presented in table 11.1.

Table 11.1. Relevant information for solving the source problem

(a) Procedural: pour(C,B), pour(B,A), pour(A,C), pour(B,A)

  q_A(4) = q_A(0) + (c_A - q_A(0)) - c_A + [c_B - (c_A - q_A(0))]
  q_B(4) = q_B(0) + (c_B - q_B(0)) - (c_A - q_A(0)) - [c_B - (c_A - q_A(0))]
  q_C(4) = q_C(0) - (c_B - q_B(0)) + c_A

(b) Declarative (constraints)

  c_B >= (c_A - q_A(0))
  c_A = 2 * (c_B - q_B(0))
  q_C(0) >= (c_B - q_B(0))
  c_C < q_C(0) + c_A
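Assuming the notation above (c_X for capacities, q_X(0) for initial quantities), the equations and constraints of table 11.1 can be checked numerically against the example of figure 11.2. This is an illustrative sketch, not code from the book:

```python
# Values of the source problem (figure 11.2).
cA, cB, cC = 36, 45, 54
qA0, qB0, qC0 = 16, 27, 34

# (a) Procedural equations for pour(C,B), pour(B,A), pour(A,C), pour(B,A):
qA4 = qA0 + (cA - qA0) - cA + (cB - (cA - qA0))
qB4 = qB0 + (cB - qB0) - (cA - qA0) - (cB - (cA - qA0))
qC4 = qC0 - (cB - qB0) + cA
print(qA4, qB4, qC4)  # 25 0 52 -- the goal quantities

# (b) Declarative constraints:
assert cB >= cA - qA0        # last pour(B,A) is executable
assert cA == 2 * (cB - qB0)  # capacity of A is twice B's free capacity
assert qC0 >= cB - qB0       # B's free capacity can be subtracted from C
assert cC < qC0 + cA         # fixes the relative order of the two pours into C
```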

Each goal state is expressed by the values given in the initial state. On the right side of the equality sign these given values are combined in a way that transforms the initial quantity of each jug into its goal quantity. Certain constraints between the initial values have to be satisfied so that these equations are balanced. These constraints can be analytically derived from the equations given in table 11.1 and they constitute the relevant declarative attributes of the problem. The constraints for water redistribution problems have the same function as the problem solving constraints given for example for the radiation or fortress problems (Holyoak & Koh, 1987) or the missionary-cannibal problems (Reed et al., 1974; Gholson et al., 1996).


As an example of how these constraints can be derived from the equations, you can easily see that the last pour operator of the solution (pouring a certain amount of water from B into A) is only executable if the relation c_B >= (c_A - q_A(0)) holds. As a second constraint, the goal quantity of jug C could be described by the equation q_C(4) = q_C(0) + (c_B - q_B(0)). But the expression (c_B - q_B(0)) does not represent the quantity in jug B. It represents the remaining free capacity of this jug, which, of course, cannot be poured into another jug. Because the relation c_A = 2 * (c_B - q_B(0)) holds, we conclude that the value of the remaining free capacity of jug B has to be subtracted from the quantity of jug C (only possible if q_C(0) >= (c_B - q_B(0)) holds) and that the double of this value (c_A) has to be added to the quantity in jug C by pouring the capacity of jug A into jug C. Additionally, the relation c_C < q_C(0) + c_A determines the relative order of the two pour-operators: you have to subtract (c_B - q_B(0)) before you can add c_A to q_C(0).

The equations describing the transformations for each jug and the (in)equations describing the constraints of the problem are sufficient to represent all relevant declarative and procedural information of the problem. Thus, transforming all of them into one graphical representation leads to a normatively complete representation of the problem which can be used for a task analytical determination of the overlap between two problem structures. The equations and in-equations for all water redistribution problems used in our experiments are given in appendix C.7.

11.1.5 Measurement of Structural Overlap

Structural similarity between two graphs G and H is usually calculated as the size of the greatest common subgraph of G and H in relation to the size of the greater of both graphs (Schadler & Wysotzki, 1999; Bunke & Messmer, 1994). To calculate the graph distance by formula (11.1) we introduce directed "empty" arcs between all pairs of nodes where no arcs exist. The size of the common subgraph is expressed by the sum of common arcs V_GH and nodes N_GH.

  d(G,H) = 1 - (V_GH + N_GH) / (max(V_G, V_H) + max(N_G, N_H))    (11.1)

The graph distance can assume values between 0 and 1, indicating an isomorphic relation between two graphs with d(G,H) = 0 and no systematic relation between G and H with d(G,H) = 1. For the two partially isomorphic graphs in figure 11.3 we obtain (for graph 11.3.a as G and graph 11.3.b as H)

  V_G = 42,  V_H = 72  (V_G < V_H)
  N_G = 7,   N_H = 9   (N_G < N_H)
  V_GH = 30, N_GH = 6


resulting in the difference between G and H

  d(G,H) = 1 - (30 + 6) / (72 + 9) = 0.56.

The value of d(G,H) = 0.56 indicates that the size of the common subgraph of G and H is a little more than half of the size of the larger graph H. Of course, this absolute value of the distance between two problems is highly dependent on the kind of representation of their structures. Thus, for task analysis only the ordinal information of these values should be considered.
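Recomputing formula (11.1) for the reported counts yields roughly 0.56; this is a quick sketch with an invented function name (graph_distance), not code from the book:

```python
# Graph distance of formula (11.1): V = arcs, N = nodes,
# the GH subscripts denote the greatest common subgraph.
def graph_distance(v_g, n_g, v_h, n_h, v_gh, n_gh):
    return 1 - (v_gh + n_gh) / (max(v_g, v_h) + max(n_g, n_h))

# Counts for the two graphs of figure 11.3:
d = graph_distance(v_g=42, n_g=7, v_h=72, n_h=9, v_gh=30, n_gh=6)
print(round(d, 2))  # 0.56
```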

Fig. 11.3. Graphs for the equations 2x + 5 = 9 (a) and 3x + (6 - 2) = 16 (b)

11.2 Experiment 1

Experiment 1 was designed to investigate the suitability of the water-redistribution domain for studying analogical transfer in problem solving, to get some initial information about transfer of isomorphic vs. non-isomorphic sources, and to check for possible interactions of superficial with structural similarity. The problems were constructed in such a way that it is highly improbable that the correct optimal operator-sequence can be found by trial-and-error or that the general solution principle is immediately inferable. To investigate analogical transfer, information about the mapping of source and target can be given before the target problem has to be solved (Novick & Holyoak, 1991). This can be done by pointing out the relevant properties of a problem (conceptual mapping) and by giving information about the corresponding jugs in source and target ("numerical" mapping). Additionally, information about the problem solving strategy of subjects can be obtained by analyzing log-files of subjects' problem solving behavior and by testing mapping after subjects solved the target problem.


To get an indication of the degree of structural similarity between a source and a target which is necessary for transfer success, an isomorphic source/target pair was contrasted with a partially isomorphic source/target pair with "moderately high" structural overlap. This should give us some information about the range of source/target similarities which should be investigated more closely (in experiment 2).

To control possible interactions of structural and superficial similarity, we discriminate structure preserving and structure violating variants of target problems (Holyoak & Koh, 1987): For a given source problem with three jugs (see fig. 11.2), a target problem with four jugs clearly changes the superficial similarity in contrast to a target problem consisting also of three jugs. But this additional jug might or might not result in a change of the problem structure reflected in the sequence of pour operations necessary for solving the problem. In contrast, other superficial variations like changing the sequence of jugs from small/medium/large to large/medium/small are structure preserving, but clearly affect superficial similarity. If the introduction of an additional jug does not lead to additional deviations from the surface appearance of the source problem, we regard the surface as "stable". As a consequence, there are four possible source/target variations: structure preserving problems with stable or changed surface and structure violating problems with stable or changed surface.

11.2.1 Method

Material. As source problem, the problem given in figure 11.2 was used. We constructed five different redistribution problems as target problems (see appendix C.7):

Problem 1: a three jug problem solvable with four operators which is isomorphic to the source problem (condition isomorph/no surface change),

Problem 2: a three jug problem solvable with four operators which is isomorphic to the source problem, but has a surface variation by switching positions of the small jug (A) and the medium jug (B) and renaming these jugs accordingly (A -> B, B -> A) (condition isomorph/small surface change),

Problem 3: a three jug problem solvable with four operators which is isomorphic to the source problem, but has a surface variation by switching positions of all jugs (A -> B, B -> C, C -> A) (condition isomorph/large surface change),

Problem 4: a four jug problem solvable with five operators which has a moderately high structural overlap with the source (condition partial isomorph/no surface change), and

Problem 5: a four jug problem solvable with five operators which is isomorphic to problem 4, but has a surface variation by switching positions of two jugs (A -> B, B -> A) (condition partial isomorph/small surface change).


Because of the exploratory nature of this first experiment, we did not introduce a complete crossing of structure and surface similarity. The main question was whether subjects could successfully use a partial isomorph in analogical transfer.

In addition to the source and target problems, an "initial" problem which is isomorphic to the source was constructed. This problem was introduced before the presentation of the source problem for the following reasons: first, subjects should become familiar with interacting with the problem solving environment (the experiment was fully computer-based, see below); and second, subjects should be "primed" to use analogy as a solution strategy, by getting demonstrated how the source problem could be solved with help of the initial problem.

Subjects. Subjects were 60 pupils (31 male and 29 female) of a gymnasium in Berlin, Germany. Their average age was 17.4 years (minimum 14 and maximum 19 years).

Procedure. The experiment was fully computer based and conducted at the school's PC-cluster. The overall duration of an experimental session was about 45 minutes. All interactions with the program were recorded in log-files. One session consisted of the following parts:

Instruction and Training. First, general instructions were given, informing about the following tasks and the water-redistribution problems. Subjects were introduced to the setting of the screen-layout (graphics of the jugs) and the handling of interactions with the program (performing a pour-operation, un-doing an operation).

Initial problem. Afterwards, the subjects were asked to solve the initial problem with tutorial guidance (performed by the program). In case of a correct solution the tutor asked the subject to repeat it without any error. The tutor intervened if subjects had performed four steps without success, or had started two new attempts to solve the problem, or if they needed longer than three minutes. This part was finished if the problem was correctly solved twice without tutorial help.

Source problem. When introducing the source problem, first some hints about the relevant problem aspects for figuring out a shortest operator-sequence were given (by thinking about the goal quantities in terms of relations to initial quantities and maximum capacities). Afterwards, the correspondence between the three jugs of the initial problem and the three jugs of the source problem was pointed out. Now, the screen offered an additional button to retrieve the solution sequence of the initial problem. The initial solution could be retrieved as often as the subject desired. To perform an operation for solving the source problem, this window had to be closed. Tutorial guidance was identical to the initial problem, but subjects had to solve the source problem only once.

Target problem. Every subject randomly received one of the five target problems. Again, mapping hints (relevant problem aspects, correspondence


between jugs of the source and the target problem) were given. The source solution could be retrieved without limit, but, again, the subjects could only proceed to solve the target problem after this window was closed. Thereby, the number and time of references to the source problem could be obtained in log-files. The subjects had a maximum of 10 minutes to solve the target problem.

Mapping Control. Mapping success was controlled by a short test where subjects had to give relations between the jugs of the source and target problem.

Questionnaire. Finally, mathematical skills (last mark in mathematics, subjective rating of mathematical knowledge, interest in mathematics) and personal data (age and gender) were obtained.

11.2.2 Results and Discussion

Overall, there was a monotonic decrease in problem solving success over the five target problems (see tab. 11.2, lines n and solved). To make sure that problem solving success was determined by successful transfer of the source and not by some other problem solving strategy, only subjects who gave a correct mapping were considered, and a solution was rated as transfer success if the generated solution sequence was the (unique) shortest solution (see tab. 11.2, lines correct mapping and shortest solution). The variable "transfer success" was calculated as the percentage of subjects with correct mapping who generated the shortest solution (see tab. 11.2, line transfer rate).

Table 11.2. Results of Experiment 1

  Problem             1      2      3      4      5
  structure^a         ISO    ISO    ISO    P-ISO  P-ISO
  surface^b           no     small  large  no     small
  n                   12     12     12     12     12
  solved              12     11     9      8      6
  correct mapping     8      12     10     9      6
  shortest solution   8      10     8      8      3
  transfer rate       100%   83.3%  80%    88.9%  50%

  ^a ISO = isomorph, P-ISO = partial isomorph
  ^b no, small, large change of surface
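The transfer rates reported in table 11.2 can be reproduced from the two count rows (a quick sketch; the list names are my own):

```python
# Transfer rate = shortest solution / correct mapping, in percent (table 11.2).
correct_mapping   = [8, 12, 10, 9, 6]
shortest_solution = [8, 10,  8, 8, 3]
rates = [round(100 * s / m, 1)
         for s, m in zip(shortest_solution, correct_mapping)]
print(rates)  # [100.0, 83.3, 80.0, 88.9, 50.0]
```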

Log-file analysis showed that none of the subjects who performed correct mapping retrieved the source solution while solving the target problem. It is highly improbable that the correct shortest solution was found randomly or that subjects could infer the general solution principle when solving the initial and source problems. As a consequence, we have to assume that these subjects solved the target by analogical transfer of the memorized (four-step) source solution.


Transfer success decreased nearly monotonically over the five conditions. Exceptions were problems 3 (isomorph/high surface change) and 4 (partial isomorph/no surface change). The high percentage of transfer success for problem 4 indicates clearly that subjects can successfully transfer a non-isomorphic source problem. Even for problem 5 (partial isomorph/surface change) transfer success was 50%.

There is no overall significant difference between the five experimental conditions (exact 5 x 2 polynomial test^2: P = 0.21). To control interactions between structural and superficial similarity, different contrasts were calculated which we discuss in the following.

11.2.2.1 Isomorph Structure/Change in Surface. There is no significant impact of the variation of superficial features between conditions 1, 2 and 3 (exact binomial tests: 1 vs. 2 with P = 0.225; 2 vs. 3 with P = 0.558; and 1 vs. 3 with P = 0.168). This finding is in contrast to Reed et al. (1990). Reed and colleagues showed that subjects' rating of the suitability of a problem for solving a given target is highly influenced by superficial attributes. However, these ratings were obtained before subjects had to solve the target problem. This indicates that superficial similarity has a high impact on retrieval of a suitable problem but not on transfer success, as was also shown by Holyoak and Koh (1987) when contrasting structure-preserving vs. structure-violating differences.

11.2.2.2 Change in Structure/Stable Surface. Superficial attributes between conditions 1 and 4 (isomorph vs. partial isomorph, both with no surface change), respectively 2 and 5 (isomorph vs. partial isomorph, both with small surface change), can be regarded as stable, because the additional jug (in condition 4 vs. 1 and condition 5 vs. 2) influences only structural attributes. That is, by contrasting these conditions we measure the influence of structural similarity on transfer success. There is no significant difference between conditions 1 and 4 (exact binomial test: 1 vs. 4 with P = 0.394). But there is a significant difference between conditions 2 and 5 (exact binomial test: 2 vs. 5 with P = 0.039): A partial isomorph can be useful for analogical transfer if it shares superficial attributes with the target, but transfer difficulty is high if source and target vary in superficial attributes, even if the mapping is explicitly given!

11.2.2.3 Change in Structure/Change in Surface. The variation of superficial attributes between conditions 4 and 5 has a significant impact (exact binomial test: P = 0.039). As shown for the contrast of conditions 2 and 5, if problems are not isomorphic, superficial attributes gain importance. Of course, this finding is restricted to the special type of source/target pairs and variation of superficial attributes we investigated, that is, to cases where

^2 For this test, the result has to be tested against the number of cell-value distributions corresponding with the given row and column values. The procedure for obtaining this number was implemented by Knut Polkehn.


the target is "larger" than the source and where jugs are always named A, B, C (D), but the names can be associated with jugs of different sizes. In this special case, the intuitive constraint of mapping jugs with identical names and positions has to be overcome and kept active during transfer.

To summarize, this first explorative experiment shows that water redistribution problems are suitable for investigating analogical transfer: most subjects could solve the target problems, but solution success is sensitive to variations of source/target similarity. As a consequence of the interaction found between superficial and structural similarity, in the following, superficial source/target similarity will be kept high for all target problems and we will investigate only target problems varying in their structural similarity to the source (i.e., with stable surface). Finally, the high solution success for the partial isomorph of "moderately high" structural similarity (condition 4) indicates that we can investigate source/target pairs with a smaller degree of structural overlap.

11.3 Experiment 2

In the second experiment we investigated a finer variation of different types and degrees of structural overlap. We focused on two hypotheses about the influence of structural similarity on transfer:

(1) We have been interested in the possibly different effects of different types of structural overlap on transfer, that is, target exhaustiveness versus source inclusiveness of problems (cf. fig. 11.1). If one considers a problem structure as consisting of only relevant declarative and procedural information, different types of structural relations result in differences in the amount of both common relevant declarative and common relevant procedural information. Changing the amount of common declarative information requires ignoring declarative source information in the case of target exhaustiveness and additionally identifying declarative target information in the case of source inclusiveness. Changing the amount of common procedural information means changing the length of the solution (i.e., the minimal number of pour-operators necessary to solve the problem).

Thus, compared to the source solution, target exhaustiveness results in a shorter target solution while the target solution is longer for source inclusiveness. Assuming that ignoring information is easier than additionally identifying information (Schmid, Mercy, & Wysotzki, 1998) and assuming that a shorter target solution is easier to find than a longer one, we expect that successful transfer should be more probable for target exhaustive than for source inclusive problems. In line with this assumption, Reed et al. (1974) reported increasing transfer frequencies for target exhaustive relations if subjects were informed about the correspondences between source and target (see also Reed et al., 1990).


(2) While source inclusiveness and target exhaustiveness are special types of structural overlap, we have also been interested in the overall impact of the degree of structural overlap on transfer. We wanted to determine the minimum size of the common substructure of source and target problem that makes the source useful for analogically solving the target. Or, in other words, we wanted to measure the degree of the distance between source and target structures up to which the source solution is transferable to the target problem.

11.3.1 Method

Material. As initial and source problem we used the same problems as in experiment 1. As target problems we constructed the following five different redistribution problems with constant superficial attributes (see appendix C.7):

Problem 1: a three jug problem solvable with three operators whose structure was completely contained in the structure of the source problem (condition target exhaustiveness),

Problem 2: a three jug problem solvable with five operators whose structure completely contained the structure of the source problem (condition source inclusiveness),

Problem 3: the partial isomorph problem used before (condition 4 in experiment 1), a four jug problem solvable with five operators whose structure completely contains the structure of the source problem; this problem shares a smaller structural overlap with the source than problem 2 (condition "high" structural overlap), and

Problem 4 and 5: two more four jug problems solvable with five operators that have decreasing structural overlap with the source; the structures of source and target share a common substructure, but both structures have additional aspects (conditions "medium" structural overlap and "low" structural overlap).

For all problem structures, distances to the source structure were calculated using formula (11.1) (see appendix C.7). Because of the intrinsic constraints of the water redistribution domain, it was not possible to obtain equi-distance between problems. Nevertheless, the problems we constructed served as good candidates for testing our hypotheses.

To investigate the effect of the type of structural source/target relation, the distances of problem 1 (target exhaustive) and problem 2 (source inclusive) to the source have been kept as low as possible and as similar as possible: d_S1 = 0.16 and d_S2 = 0.17. As discussed above, it can be expected that target exhaustiveness leads to a higher probability of transfer success than source inclusiveness.

Although problem 3 is a source inclusive problem, we used it as an anchor problem for varying the degree of structural overlap. Target problem 4 differed moderately from target problem 3 in its distance value (d_S3 = 0.37 vs. d_S4 = 0.55) while target problem 5 differed from target problem 4 only slightly (d_S4 = 0.55 vs. d_S5 = 0.59). Thus, one could expect strong differences in transfer rates between conditions 3 and 4 and nearly the same transfer rates for conditions 4 and 5. We name problems 3, 4, and 5 "high", "medium" and "low" overlap in accordance with the ranking of their distances to the source problem.

Subjects. Subjects were 70 pupils (18 male and 52 female) of a gymnasium in Berlin, Germany. Their average age was 16.3 years (minimum 16 and maximum 17 years). The data of 2 subjects were not logged due to technical problems. Thus, 68 log-files were available for data analysis.

Procedure. The procedure was the same as in experiment 1. Each subject had to solve one initial problem and one isomorphic source problem first and was then presented one of the five target problems.

11.3.2 Results and Discussion

49 subjects mapped the jugs from the source to the target correctly. Thus 19 subjects had to be excluded from analysis. Table 11.3 shows the frequencies of subjects who performed the correct mapping between source and target and generated the shortest solution sequence, i.e., solved the target problem analogically (cf. experiment 1).

Table 11.3. Results of Experiment 2

Problem            1           2          3         4         5
structure          target      source     "high"    "medium"  "low"
                   exhaustive  inclusive  overlap   overlap   overlap
n                  11          10         16        15        16
correct mapping    7           8          13        9         12
shortest solution  6           7          10        5         1
transfer rate      86%         88%        77%       56%       8%

11.3.2.1 Type of Structural Relation. There is no difference in solving frequencies between conditions 1 and 2 (exact binomial test, P = 0.607). That is, there is no indication of an effect of the type of structural source/target relation on transfer success. In contrast to the findings of Reed et al. (1974) and Reed et al. (1990), it seems that the degree of structural overlap has a much larger influence than the type of structural relation between source and target. Furthermore, looking only at the procedural aspect of our problems, we could not find an impact of the length of the required operator sequence (three steps for problem 1 vs. five steps for problem 2) on solution success.

A possible explanation might be that the type of structural relation has no effect if problems are very similar to the source. It is clearly a topic for further investigation to check whether target exhaustive problems become superior to source inclusive problems with increasing source/target distances.

A general superiority of degree over type of overlap could be explained by assuming mapping to be a symmetrical instead of an asymmetrical (source to target) process. Hummel and Holyoak (1997) argue that during retrieval the target representation "drives" the process. In contrast, during mapping the role of the "driver" can switch from target to source and vice versa. During transfer the source structure again takes control of the process. Thus, an interaction between these processes must lead to decreasing differences between effects of source inclusiveness and effects of target exhaustiveness on analogical transfer.

11.3.2.2 Degree of Structural Overlap. Each problem of conditions 3 to 5 was solvable with at least five operators. That means, one additional operator was needed compared to the source solution. Results for conditions 3 to 5 show a significant difference between the effects of different degrees of structural source/target overlap on solution frequency (exact 3 × 2 test, P = 0.002). Comparing each single frequency against each other indicates that the crucial difference between structural distances is between conditions 4 and 5 (exact binomial test; conditions 3 and 4: P = 0.1; conditions 3 and 5: P < 0.001; conditions 4 and 5: P = 0.0001).

This finding is surprising taking into account that the difference of structural distance between conditions 3 and 4 is much larger than between conditions 4 and 5 (d_S3 = 0.37, d_S4 = 0.55, d_S5 = 0.59). A possible explanation is that with problem 5 we have reached the margin of the range of structural overlap within which a problem can be helpful for solving a target problem. A conjecture worth further investigation is that a problem can be considered a suitable source if it shares at least fifty percent of its structure with the target! An alternative hypothesis is that not the relative but rather the absolute size of structural overlap determines transfer success; that is, a source is no longer helpful for solving the target if the number of nodes contained in the common sub-structure gets smaller than some fixed lower limit.

11.4 General Discussion

In our studies we investigated only a small selection of analytically possible source/target relations. We did not investigate many-to-one versus one-to-many mappings (Spellman & Holyoak, 1996), and we only looked at target exhaustiveness versus source inclusiveness for problems with a large common structure. We plan to investigate these variations in further studies. For source/target relations with a varying degree of structural overlap, we were able to show that a problem is suitable as a source even if it shares only about half of its structure with the target. A first explanation for this finding, which goes along with models of transformational analogy, is that subjects first construct a partial solution guided by the solution for the structurally identical part of the problem, and then use this partial solution as a constraint for finding the missing solution steps by some problem solving strategy, such as means-end analysis (Newell, Shaw, & Simon, 1958), or by internal analogy (Hickman & Larkin, 1990).

Internal analogy describes a strategy where a previously ascertained solution for one part of a problem guides the construction of a solution for another part of the same problem. For the problem domain we investigated, internal analogy gives no plausible explanation: The constraints used to figure out the solution steps for the overlapping part of the target problem are not the same as those used for the non-overlapping part; therefore internal analogy cannot be applied. A second explanation might be that subjects try to re-represent the target problem in such a way that it becomes isomorphic to the source (Hofstadter & The Fluid Analogies Research Group, 1995; Gentner et al., 1997). Again, this explanation seems implausible for our domain: Because the number of jugs and the given initial, goal, and maximum quantities determine the solution steps completely, re-representation (for example, looking at two different jugs as one jug) cannot be helpful for finding a solution.

The results of the present study give some new insights about the nature of the structural similarity underlying transfer success in analogical problem solving. While it is agreed upon that the application of analogies is mostly influenced by structural and not by superficial similarity (Reed et al., 1974; Gentner, 1983; Holyoak & Koh, 1987; Reed et al., 1990; Novick & Holyoak, 1991), there are only few studies that have investigated which type and what degree of structural relationship between a source and a target problem is necessary for transfer success.

Holyoak and Koh (1987) used variants of the radiation problem (Duncker, 1945) to show that structural differences have an impact on transfer. They varied structural similarity by constructing problems with different solution constraints. In studies using variants of the missionaries-and-cannibals problem, structural similarity was varied in the same way (Reed et al., 1974; Gholson et al., 1996). In the area of mathematical problem solving, typically the complexity of the solution procedure is varied (Reed et al., 1990; Reed & Bolstad, 1991). While all of these studies investigated non-isomorphic source/target pairs, none of them controlled the type and degree of structural similarity. Thus, the question of which structural characteristics make a source a suitable candidate for analogical transfer remained unanswered.

Investigating structural source/target relations is of practical interest for several reasons: (1) In an educational context (cf. tutoring systems) the provided examples have to be carefully balanced to allow for generalization (learning). Presenting only isomorphs restricts learning to small problem classes, while too large a degree of structural dissimilarity can result in failure of transfer and thereby obstructs learning (Pirolli & Anderson, 1985). (2) A plausible cognitive model of analogical problem solving (Falkenhainer et al., 1989; Hummel & Holyoak, 1997) should generate correct transfer only for such source/target relations where human subjects perform successfully. (3) Computer systems which employ analogical or case-based reasoning techniques (Carbonell, 1986; Schmid & Wysotzki, 1998) should refrain from analogical transfer when there is a high probability of constructing faulty solutions. Thus, situations can be avoided in which system users have to check and possibly debug generated solutions. Here, information about conditions for successful transfer in human analogical problem solving can provide guidelines for implementing criteria for when the strategy of analogical reasoning should be rejected in favour of other problem solving strategies.


12. Programming By Analogy

In part II we discussed inductive program synthesis as an approach to learning recursive program schemes. Alternatively, in this chapter, we investigate how a recursive program scheme can be generalized from two structurally similar programs and how a new recursive program can be constructed by adapting an already known scheme. Generalization can be seen as the last step of programming or problem solving by analogy. Adaptation of a scheme to a new problem can be seen as abstraction rather than analogy, because an abstract scheme is applied to a new problem, in contrast to mapping two concrete problems. In the following, we first (sect. 12.1) give a motivation for programming by analogy and abstraction. Afterwards (sect. 12.2), we introduce a restricted approach to second-order anti-unification. Then we present first results on retrieval (sect. 12.3), introducing subsumption as a qualitative approach to program similarity. In section 12.4, generalization of recursive program schemes as anti-instances of pairs of programs is described. Finally (sect. 12.5), some preliminary ideas for adaptation are presented.^1

12.1 Program Reuse and Program Schemes

It is an old claim in software engineering that programmers should write less code but reuse code developed in previous efforts (Lowry & Duran, 1989). In the context of programming, reuse can be characterized as transforming an old program into a new program by replacing expressions (Cheatham, 1984; Burton, 1992). But reuse is often problematic on the level of concrete programs, because the relation between code fragments and the function they perform is not always obvious (Cheatham, 1984). There are two approaches to overcome this problem: (1) performing transformations on the program specifications rather than on the concrete programs (Dershowitz, 1986), and (2) providing abstract schemes which are stepwise refined by "vertical" program transformation (Smith, 1985). The second approach proved to be quite successful and is used in the semi-automatic program synthesis system KIDS (Smith, 1990; see sect. 6.2.2.3 in chap. 6).

^1 This chapter is based on the previous publication Schmid, Sinha, and Wysotzki (2001).


In the KIDS system, program schemes, such as divide-and-conquer, local search, and global search, are predefined by the system author, and selection of an appropriate scheme for a new programming problem is performed by the system user. From a software-engineering perspective, it is prudent to exclude these knowledge-based aspects of program synthesis from automatization. Nevertheless, we consider automatic retrieval of an appropriate program scheme and automatic construction of such program schemes from experience as interesting research questions.

While the KIDS system is based on deductive (transformational) program synthesis, the context of our own work is inductive program synthesis: A programming problem is specified by some input/output examples, and a universal plan is constructed which represents the transformation of each possible input state of the initially finite domain into the desired output (see part I). This plan is transformed into a finite program tree which is folded into a recursive function (see part II). It might be interesting to investigate reuse on the level of programming problems, that is, providing a hypothetical recursive program for a new set of input/output examples, omitting planning. But currently we are investigating how the folding step can be replaced by analogical transfer or abstraction and how abstract schemes can be generalized from concrete programs.

Our overall approach is illustrated in figure 1.1: For a given finite program (T) representing some initial experience with a problem, that program scheme (RPS-S) is retrieved from memory for which its nth unfolding (T-S) results in a "maximal similarity" to the current (target) problem. The source program scheme can either be a concrete program with primitive operations or an already abstracted scheme. The source scheme is modified with respect to the mapping obtained between its unfolding and the target. Modification can involve a simple re-instantiation of primitive symbols or more complex adaptations. Finally, the source scheme and the new program are generalized to a more abstract scheme. The new RPS and the abstracted RPS are stored in memory.

After introducing our approach to anti-unification, all components of programming by analogy are discussed. An overview of our work on constructing programming-by-analogy algorithms is given in appendix A.13.

12.2 Restricted 2nd-Order Anti-Unification

12.2.1 Recursive Program Schemes Revisited

A program is represented as a recursive program scheme (RPS) S = (G, t_0), as introduced in definition 7.1.17. The main program t_0 is defined over a term algebra T_Σ(X), where the signature Σ is a set of function symbols and X is a set of variables (def. 7.1.1). In the following, we split Σ into a set of primitive function symbols F and a set Φ of user-defined function names with F ∩ Φ = ∅ and Σ = F ∪ Φ. A recursive equation in G consists of a function head G(x_1, ..., x_m) and a function body t. The function head gives the name of the function G ∈ Φ and the parameters of the function with X = {x_1, ..., x_m}. The function body is defined over the term algebra, t ∈ T_Σ(X). The body t contains the symbol G at least once. Currently, our generalization approach is restricted to RPSs where G contains only a single equation and where t_0 consists only of the call of this equation.

Since program terms are defined over function symbols, in general, an RPS represents a class of programs. For program evaluation, all variables in X must be instantiated by concrete values, all function symbols must be interpreted by executable functions, and all names for user-defined functions must be defined in G.

In the following, we introduce special sets of variables and function symbols to discriminate between concrete programs and program schemes which were generated by anti-unification. The set of variables X is divided into variables X_p representing input variables and variables X_au which represent first-order generalizations, that is, generalizations over first-order constants. The set of function symbols F is divided into symbols for primitive functions F_p with a fixed interpretation (such as +(x, y) representing addition of two numbers) and function symbols with arbitrary interpretation Φ_au. Symbols in Φ_au represent function variables which were generated by second-order generalizations, that is, generalizations over primitive functions. Please note that these function variables do not correspond to names of user-defined functions!

Definition 12.2.1 (Concrete Program). A concrete program is an RPS over T_Σ(X_p) with Σ = F_p ∪ Φ. That is, it contains only input variables in X_p, symbols for primitive functions in F_p, and function calls G_i ∈ Φ which are defined in G.

Definition 12.2.2 (Program Scheme). A program scheme is an RPS defined over T_Σ(X_p ∪ X_au) with Σ = F_p ∪ Φ_au ∪ Φ. That is, it can contain input variables in X_p, symbols for primitive functions in F_p, function calls G_i ∈ Φ which are defined in G, as well as object variables y ∈ X_au and function variables φ ∈ Φ_au.

An RPS can be unfolded by replacing the name of a user-defined function by its body, where variables are substituted in accordance with the function call, as introduced in definition 7.1.29. In principle, a recursive equation can be unfolded infinitely often. Unfolding terminates if the recursive program call is replaced by Ω (the undefined term), and the result of unfolding is a finite program term (see def. 7.1.19) T ∈ T_Σ(X_p ∪ X_au) with Σ = F_p ∪ Φ_au ∪ {Ω}. That is, T does not contain names of user-defined functions G_i ∈ Φ.

322 12. Programming By Analogy

12.2.2 Anti-Unification of Program Terms

First-order anti-unification was introduced in section 7.1.2 in chapter 7. In the following, we present an extension of Huet's declarative, first-order algorithm for anti-unification of pairs of program terms (Huet, 1976; Lassez et al., 1988) by introducing three additional rules. The term that results from the anti-unification of two terms is their most specific anti-instance.

Definition 12.2.3 (Instance/Anti-Instance). If t_1, ..., t_n and u are terms, and for each t_i there exists a substitution σ_i (1 ≤ i ≤ n) such that t_i = uσ_i, then the terms t_i are instances of u, and u is an anti-instance of the set {t_1, ..., t_n}.

u is the most specific anti-instance of the set {t_1, ..., t_n} if, for each u' which is also an anti-instance of {t_1, ..., t_n}, there exists a substitution σ such that u'σ = u.

Simply spoken, an anti-instance reflects some commonalities shared between two or more terms, whereas the most specific anti-instance reflects all commonalities.

Huet's algorithm constructs the most specific anti-instance of two terms in the following way: If both terms start with the same function symbol, this function symbol is kept for the anti-instance, and anti-unification is performed recursively on the function arguments. Otherwise, the anti-instance of the two terms is a variable determined by an injective mapping φ. φ ensures that each occurrence of a certain pair of sub-terms within the given terms is represented by the same variable in the anti-instance.

According to Idestam-Almquist (1993), φ can be considered a term substitution:

Definition 12.2.4 (Term Substitution). A finite set of the form {(t_1, u_1)/x_1, ..., (t_n, u_n)/x_n} is a term substitution iff {t_1/x_1, ..., t_n/x_n} and {u_1/x_1, ..., u_n/x_n} are substitutions and the pairs (t_1, u_1), ..., (t_n, u_n) are distinct.

Huet's algorithm only computes the most specific first-order anti-instance. But in order to capture as much of the common structure of two programs as possible in an abstract scheme, we need at least a second-order (function) mapping. In contrast to first-order approaches, higher-order anti-unification (as well as unification) in general has no unique solution; that is, a notion of the most specific anti-instance does not exist and cannot be calculated efficiently (Siekmann, 1989; Hasker, 1995).

Therefore, we developed a very restricted second-order algorithm. Its main extension compared with Huet's is the introduction of function variables for functions of the same arity.


Based on the results obtained with this algorithm, we plan careful extensions, maintaining uniqueness of the anti-instances and efficiency of calculation. A more powerful approach to second-order anti-unification was proposed, for example, by Hasker (1995), as mentioned in chapter 10.

Our algorithm is presented in table 12.1. The Var-Term, Diff-Arity, and Same-Function rules correspond to the two cases of Huet's algorithm. Same-Term is a trivial rule which makes the implementation more efficient. Our main extension to Huet's algorithm is the Same-Arity rule, by which function variables φ_i ∈ Φ_au are introduced for functions of the same arity. Huet's termination case has been split up into two rules (Var-Term and Diff-Arity) because the Diff-Arity rule can be refined in future versions of this algorithm to allow generalizing over functions with different arities as well.

Table 12.1. A Simple Anti-Unification Algorithm

Function Call: au(t_1, t_2) with terms t_1, t_2 ∈ T_Σ(X)
Initialization: term substitution φ = ∅
Rules:
  Same-Term:     au(t, t) = t
  Var-Term:      au(t_1, t_2) = y with φ = φ ∘ {(t_1, t_2)/y}
                 (where either t_1 ∈ X and t_2 ∈ T_Σ(X), or t_1 ∈ T_Σ(X) and t_2 ∈ X)
  Same-Function: au(f(x_1, ..., x_n), f(x'_1, ..., x'_n)) = f(au(x_1, x'_1), ..., au(x_n, x'_n))
  Same-Arity:    au(f(x_1, ..., x_n), g(x'_1, ..., x'_n)) = φ_j(au(x_1, x'_1), ..., au(x_n, x'_n))
                 with φ = φ ∘ {(f, g)/φ_j}, where φ_j ∈ Φ_au is a new function variable
  Diff-Arity:    au(f(x_1, ..., x_n), g(x'_1, ..., x'_m)) = y
                 with φ = φ ∘ {(f(x_1, ..., x_n), g(x'_1, ..., x'_m))/y} (where n ≠ m)

Proofs of uniqueness, correctness, and termination are given in (Sinha, 2000).

To illustrate the algorithm, we present a simple example: Let the two terms which are input to au(t_1, t_2) be

t_1 = if(eq0(x), 1, *(x, fac(p(x))))
t_2 = if(eq0(z), 0, +(z, sum(p(z))))

Then,

au(t_1, t_2) = if(eq0(y_1), y_2, φ_2(y_1, φ_1(p(y_1))))

with

φ = {(x, z)/y_1, (1, 0)/y_2, (fac, sum)/φ_1, (*, +)/φ_2}

as their most specific anti-instance. The rules used are: Same-Function (twice), Var-Term, Same-Arity (twice), Var-Term (making use of φ), Same-Arity, Same-Function, and finally Var-Term (again, with φ). In table 12.4, additional examples for anti-unification are given.
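The rules above translate directly into a short recursive procedure. The following Python sketch is our illustration, not the original implementation: terms are nested tuples with the function symbol first, variables and constants are plain strings, generated variables are named y1, y2, ... and phi1, phi2, ..., constants are handled like variables by the Var-Term case, and the injective mapping is kept in two dictionaries:

```python
from itertools import count

def anti_unify(t1, t2):
    """Most specific anti-instance of t1 and t2 under the restricted
    second-order rules; also returns the term substitution (two dicts)."""
    objs, funs = {}, {}          # (t1, t2)/y  and  (f, g)/phi  entries
    ys, phis = count(1), count(1)

    def var_for(pair):           # injective mapping: same pair -> same variable
        if pair not in objs:
            objs[pair] = 'y%d' % next(ys)
        return objs[pair]

    def au(a, b):
        if a == b:                                    # Same-Term
            return a
        if isinstance(a, str) or isinstance(b, str):  # Var-Term
            return var_for((a, b))                    # (constants treated as variables)
        if len(a) != len(b):                          # Diff-Arity
            return var_for((a, b))
        args = tuple(au(x, y) for x, y in zip(a[1:], b[1:]))
        if a[0] == b[0]:                              # Same-Function
            return (a[0],) + args
        if (a[0], b[0]) not in funs:                  # Same-Arity: function variable
            funs[(a[0], b[0])] = 'phi%d' % next(phis)
        return (funs[(a[0], b[0])],) + args

    return au(t1, t2), objs, funs

# the fac/sum example from the text (p abbreviates pred)
t1 = ('if', ('eq0', 'x'), '1', ('*', 'x', ('fac', ('p', 'x'))))
t2 = ('if', ('eq0', 'z'), '0', ('+', 'z', ('sum', ('p', 'z'))))
term, objs, funs = anti_unify(t1, t2)
print(term)  # ('if', ('eq0', 'y1'), 'y2', ('phi2', 'y1', ('phi1', ('p', 'y1'))))
```

Running the sketch on the two example terms reproduces the anti-instance and the mapping entries (x, z)/y1, (1, 0)/y2, (fac, sum)/phi1, and (*, +)/phi2 given above.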

12.3 Retrieval Using Term Subsumption

12.3.1 Term Subsumption

By using subsumption of anti-instances, retrieval can be based on a qualitative criterion of structural similarity (Plaza, 1995) instead of on a quantitative similarity measure. The advantage of a qualitative approach is that it is parameter-free. That is, there is no need to invest in the somewhat tedious process of determining "suitable" settings for parameters, and no threshold for similarity must be given.

The subsumption relation is usually defined only for clauses (see sect. 6.3.3 in chap. 6). This definition is based either on the subset/superset relationship which exists between clauses, since they can be viewed as sets (the conjunction of literals is commutative), or on the existence of a substitution which transforms the more general (subsuming) clause into the more special (subsumed) one. Since terms cannot be viewed as sets, and therefore the subset/superset relationship does not exist, our definition of term subsumption has to be based on the existence of substitutions only.

Definition 12.3.1 (Term Subsumption). A term t_1 subsumes a term t_2 (written t_2 ⊑ t_1) iff there exists a substitution σ such that t_1σ = t_2.

Note that σ must be a "proper" substitution, not a term substitution (def. 12.2.4).

Substitution σ is obtained by unifying t_1 and t_2. If σ is empty and t_1 ≠ t_2, unification of t_1 and t_2 has failed. In that case, the subsumption relation between t_1 and t_2 is undecidable. That is, the subsumption relation does not define a total but a partial order over terms. For two isomorphic terms, that is, terms which can be unified by variable renaming only, we write t_1 ∼ t_2.^2
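Because subsumption requires only a one-way substitution, it can be checked by matching rather than full unification. The sketch below is an illustration under several assumptions: terms are nested tuples, variables introduced by anti-unification carry the prefixes y and phi, a function variable in head position may be bound to a function symbol, and u is a hypothetical anti-instance of the factorial body:

```python
def is_generated_var(s):
    # assumption: variables introduced by anti-unification are named
    # y1, y2, ... (object variables) and phi1, phi2, ... (function variables)
    return isinstance(s, str) and (s.startswith('y') or s.startswith('phi'))

def match(general, special, sigma):
    """One-way matching: find sigma with general*sigma == special."""
    if is_generated_var(general):
        if general in sigma:
            return sigma[general] == special
        sigma[general] = special
        return True
    if isinstance(general, str) or isinstance(special, str):
        return general == special
    if len(general) != len(special):
        return False
    # match heads and arguments; a phi-head may be bound to a symbol
    return all(match(a, b, sigma) for a, b in zip(general, special))

def subsumes(t1, t2):
    """t2 <= t1: the more general t1 can be instantiated to t2."""
    return match(t1, t2, {})

u = ('if', ('eq0', 'y1'), 'y2', ('phi1', 'y1', ('fac', ('pred', 'y1'))))
t = ('if', ('eq0', 'x'), '1', ('*', 'x', ('fac', ('pred', 'x'))))
print(subsumes(u, t), subsumes(t, u))
```

Here u subsumes t (binding y1 to x, y2 to 1, and phi1 to *), while the converse direction fails, illustrating that subsumption is only a partial order.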

Our algorithm for retrieving the "maximally similar" RPSs from a linear case base is given in table 12.2. In general, the algorithm returns a set of possible source programs rather than a unique source, due to the undecidability of subsumption for non-unifiable terms. Input to the algorithm is a finite program T. For each RPS in memory, this RPS S_i is unfolded to a term T_i such that it has at least the length of T, and the most specific anti-instance u_i of T and T_i is calculated. This anti-instance is only inserted into the set of anti-instances U if it is either subsumed by (i.e., more special than) an anti-instance u ∈ U or if it is not unifiable with any u ∈ U. Furthermore, all anti-instances in U which subsume (i.e., are more general than) u_i are removed from U. The algorithm returns all RPSs S_i from the case base which are associated with a u_i in the final set of anti-instances U.

^2 Note that here we speak of constructing an order over terms, and isomorphism is defined with respect to the unification of terms. Later in this chapter, we will discuss isomorphism in the context of anti-unification, that is, generalization, of terms.

Table 12.2. An Algorithm for Retrieval of RPSs

Given
  T, a finite program.
  CB = {S_1, ..., S_n}, a case base which is a set (represented as a list) of RPSs S_i.
  U, a set (initially empty) of most specific anti-instances.

∀S_i ∈ CB:
  Let T_i = unfold(S_i). (unfold S_i according to def. 7.1.29 until the resulting term T_i has at least the length of T)
  Let u_i = au(T, T_i). (anti-unify T and T_i)
  If ∃u ∈ U with u_i ⊑ u, or if u_i is not unifiable with any u ∈ U, then U := U ∪ {u_i}.
  ∀u ∈ U with u_i ⊑ u: U := (U \ {u}) ∪ {u_i}

Return the RPSs S_i associated with all u_i ∈ U.

12.3.2 Empirical Evaluation

We presented terms to six subjects (all of whom were familiar with functional programming), asking them to identify similarities among the terms. All these terms were RPSs unfolded twice. We chose ten RPSs which we considered "relevant". The RPSs represented various recursion types, such as linear recursion (SUM, FAC, APPEND, CLEARBLOCK), tail recursion (MEMBER, REVERT), tree recursion (BINOM, FIBO, LAH), and a function calling a further function (GGT calling MOD). All functions are given in appendix C.8.

In each rating task the subjects were presented one term and a "case base" consisting of the other nine terms. The order in which the case base was presented varied among subjects. They were requested to choose the term from the case base which was most similar to the given term.

In table 12.3, the results of this rating study are shown. The rightmost column shows the results of our algorithm. Only for the CLEARBLOCK problem (3) does the algorithm yield a unique result. In all other cases, a set of "most similar" terms is returned. This is due to both subsumption equivalences and the fact that subsumption is not always decidable (see section 12.3.1).

The RPS preferred by a majority (two-thirds or more) of the subjects was always among the RPSs returned by the algorithm (except for GGT). Moreover, there are symmetries in retrieval (e.g., FAC and SUM) that can be explained by a common recursion scheme of these RPSs. Generally, at least one RPS of the same recursion type as the goal RPS is retrieved, provided that there is such an RPS in the case base.

Table 12.3. Results of the similarity rating study

Goal RPS     Subjects' Choice                  Majority  Program Returns
REVERT       4: 33%; 2, 3, 5, 9: 17% each                {7, 8, 10}
SUM          5: 83%; 10: 17%                   5         {5, 10}
CLEARBLOCK   6: 100%                           6         {6}
GGT          1: 67%; 6: 33%                    1         {8, 10}
FAC          2: 100%                           2         {2, 8, 9}
APPEND       3: 83%; 2: 17%                    3         {1, 3, 7}
MEMBER       10: 33%; 1, 2, 4, 6: 17% each               {1, 9}
LAH          10: 50%; 9: 33%; 6: 17%                     {9, 10}
BINOM        10: 100%                          10        {7, 8, 10}
FIBO         9: 50%; 8: 33%; 3: 17%                      {2, 8, 9}

12.3.3 Retrieval from Hierarchical Memory

We demonstrated retrieval of concrete programs or abstract schemes for the case of a linear memory. Memory can be organized more efficiently by introducing a hierarchy of programs. That is, for each pair of programs which were used as source and target in programming by analogy, their anti-instance is introduced as parent.

For hierarchical memory organization, retrieval can be restricted to a subset of the memory in the following way: Given a new finite program term T,

- identify the most specific anti-instance au(T, T_i) for all unfolded RPSs which are root nodes,
- in the following, only investigate the children of this node.

As in the linear case, in general, there might not exist a unique minimal anti-instance; that is, more than one tree in the memory must be explored. Furthermore, retrieval from hierarchical memory is based on a monotonicity assumption: If a program term T_i is less similar to the current term T than a program term T_j, then the children of T_i cannot be more similar to T than T_j and the children of T_j. While this assumption is plausible, it is worth further investigation. Our work on memory hierarchization is work in progress, and we plan to provide a proof of monotonicity together with an empirical comparison of the efficiency of retrieval from hierarchical versus linear memory.


12.4 Generalizing Program Schemes

There are different possibilities to obtain an RPS S_{1,2} which generalizes over a pair of RPSs S_1 and S_2:

- The generalized program term T_{1,2}, which was constructed as au(T_1, T_2) with T_1 = unfold(S_1) and T_2 = unfold(S_2), can be folded, using our standard approach to program synthesis described in chapter 7.
- The term substitution φ, obtained by constructing au(T_1, T_2), can be extended to φ' = φ ∘ {(G_1, G_2)/φ_{1,2}}, that is, by introducing a variable for the names of the recursive functions. Then, for our restricted approach, dealing with functions of different arity in a first-order way, it is enough to apply φ' to either S_1 or S_2. That is, for term substitutions φ' = {(T_{1,i}, T_{2,i})/y_i | i = 1..n}, where y_i ∈ X_au ∪ Φ_au, with projections π_1 = {T_{1,i}/y_i} and π_2 = {T_{2,i}/y_i}, it holds that π_1(S_1) = π_2(S_2).
- The anti-unification algorithm defined above can be applied to recursive equations G_i(x_1, ..., x_n) = t_i if the equations are reformulated as equality terms: (= (Gi x1 ... xn) ti).

All three possibilities have their advantages: Folding of abstracted terms can be used to check whether the obtained generalization really represents a recursive program. Applying term substitutions is the most parsimonious approach to generalization. Applying anti-unification directly to recursive programs allows us to separate reuse and learning.
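The projection property stated in the second possibility — applying to each program the mapping that replaces its own sub-terms by the shared variables yields the same generalized term — can be checked mechanically. In the following sketch (our illustration; terms as nested tuples, projections written as dictionaries from sub-terms and symbols to variable names), both projections map the factorial and the sum body to the same generalized term:

```python
def project(term, proj):
    """Replace every sub-term or symbol that occurs in `proj` by its variable."""
    if term in proj:
        return proj[term]
    if isinstance(term, str):
        return term
    return tuple(project(a, proj) for a in term)

# fac and sum bodies; the two projections are read off the term
# substitution obtained by anti-unifying the unfolded programs
fac_body = ('if', ('eq0', 'x'), '1', ('*', 'x', ('fac', ('pred', 'x'))))
sum_body = ('if', ('eq0', 'z'), '0', ('+', 'z', ('sum', ('pred', 'z'))))
p1 = {'x': 'y1', '1': 'y2', '*': 'phi2', 'fac': 'phi1'}
p2 = {'z': 'y1', '0': 'y2', '+': 'phi2', 'sum': 'phi1'}
print(project(fac_body, p1) == project(sum_body, p2))  # both sides coincide
```

Both projections produce if(eq0(y1), y2, phi2(y1, phi1(pred(y1)))), the generalized body that also results from anti-unifying the two equations directly.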

To illustrate our anti-unification algorithm, we give examples for the generalization of recursive equations in table 12.4. The results are edited for better readability. In each example, we used a function for calculating the factorial of a number as the first instance. If factorial is anti-unified with sum, the resulting scheme preserves the complete structure of both programs. Factorial and sum can be considered syntactical isomorphs; that is, we can map the symbols from factorial into sum in a consistent way. A similar result is obtained for incrementing the elements of a list. If factorial is anti-unified with a program for calculating the square of a number, the resulting scheme abstracts from the first argument of the "else" case, which is a "constant" for factorial and a more complex term for sqr. Anti-unification of factorial and fibonacci returns a very general scheme because the programs have very different structures: one versus two conditionals and a linear recursion versus a tree recursion. The same is true for functions with different numbers of parameters, as factorial and mult.

Because the result of anti-unification is a generalized term together with a term substitution φ (see def. 12.2.4), the original programs can be reconstructed. The term substitution can be considered as representing a (very limited) domain for interpreting the variables in the anti-instance. For example, in the first anti-instance given in table 12.4, the entry (1, 0)/y_2 restricts the interpretation of y_2 to be one or zero. If we want to obtain generalized function schemes from anti-unification, we keep only the anti-instances and get rid

Table 12.4. Example Generalizations

t_1: fac(x) = if(eq0(x), 1, *(x, fac(pred(x))))
t_2: sum(z) = if(eq0(z), 0, +(z, sum(pred(z))))
φ_1(y_1) = if(eq0(y_1), y_2, φ_2(y_1, φ_1(pred(y_1))))

t_1: fac(x) = if(eq0(x), 1, *(x, fac(pred(x))))
t_2: incl(l) = if(null(l), nil, cons(succ(car(l)), incl(tail(l))))
φ_1(y_1) = if(φ_2(y_1), y_2, φ_3(y_3, φ_1(φ_4(y_1))))

t_1: fac(x) = if(eq0(x), 1, *(x, fac(pred(x))))
t_2: sqr(z) = if(eq0(z), 0, +(+(z, pred(z)), sqr(pred(z))))
φ_1(y_1) = if(eq0(y_1), y_2, φ_2(y_3, φ_1(pred(y_1))))

t_1: fac(x) = if(eq0(x), 1, *(x, fac(pred(x))))
t_2: fibo(z) = if(eq0(z), 0, if(eq0(pred(z)), 1, +(fibo(pred(z)), fibo(pred(pred(z))))))
φ_1(y_1) = if(eq0(y_1), y_2, y_3)

t_1: fac(x) = if(eq0(x), 1, *(x, fac(pred(x))))
t_2: mult(u, v) = if(eq0(u), 0, +(v, mult(pred(u), v)))
y_1 = if(eq0(y_2), y_3, φ_1(y_4, y_5))

of the term substitutions. Without onstraints, the variables an now be in-

terpreted in an arbitrary way, that is, instantiation of an anti-instan e an

result in \stupid" programs. For example, we might interpret y

1

as a list,

2

as a truth-value, and

3

as list- onstru tor. In the ontext of automati

programming, however, there is always some information about the new (tar-

get) programming problem whi h restri ts the instantiation of a s heme. A

planned extension of our approa h is, to in orporate type-information into

anti-uni ation. Thereby, it an at least be guaranteed that the instantiated

expressions of an anti-instan e are type- onsistent.

12.5 Adaptation of Program Schemes

If an RPS S is retrieved from memory for a current problem T, this RPS must be adapted such that the resulting RPS defines a language to which T belongs (see def. 7.1.20). Adaptation can be performed by applying the term substitutions obtained by anti-unifying T and T_S to the given RPS S. For example, anti-unification of the third unfolding of RPS fac (as source) and the third unfolding of sum (as new finite program with unknown recursive generalization) results in ϕ = [y1 = (x, z), y2 = (1, 0), ω1 = (*, +)]. The given RPS fac can be adapted by replacing 1 by 0 (first-order) and * by + (second-order) and by renaming the variable x to z.
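This adaptation step can be sketched as a simple symbol-for-symbol replacement, assuming the same tuple representation of terms and a substitution already obtained by anti-unification; the names are illustrative, not the book's implementation:

```python
# Adapt a source RPS by replacing each symbol with its target counterpart
# taken from the term substitution (here written as a plain dict).

def adapt(term, mapping):
    """Replace symbols of the source program by their target counterparts."""
    if isinstance(term, tuple):
        return tuple(adapt(t, mapping) for t in term)
    return mapping.get(term, term)

# fac(x) = if(eq0(x), 1, *(x, fac(pred(x)))) as an equation term:
fac = ("=", ("fac", "x"),
       ("if", ("eq0", "x"), "1", ("*", "x", ("fac", ("pred", "x")))))

# first-order pairs (1 -> 0, x -> z), second-order pair (* -> +),
# plus renaming the function name itself:
phi = {"1": "0", "*": "+", "x": "z", "fac": "sum"}
print(adapt(fac, phi))
```

The result is exactly the recursive equation for sum given in table 12.4; the flat replacement works here because every symbol of fac is mapped consistently.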


A more complicated case occurs if identical symbols in one program are mapped to different symbols in another program. For the finite programs sub and add (see fig. 12.1), anti-unification results in ϕ = [ω1 = (pred, succ)]. This term substitution was obtained for the sub-terms at positions 3.2 and 3.3 of both finite programs, while the sub-terms at position 3.1 are identical (pred(x)). Simply applying the mappings given in ϕ to the RPS sub results in an unintended (and non-terminating) RPS add(x, y) = if(eq0(x), y, add(succ(x), succ(y))). This error can easily be detected, because the given finite program does not belong to the language of the constructed RPS, that is, the finite program cannot be generated by unfolding the RPS. To deal with such cases, the context of the substitution terms must be considered, that is, the function of which the sub-terms are arguments and their position (Schmid et al., 1998; Hasker, 1995).
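The membership check described above (can the given finite program be generated by unfolding the RPS?) can be sketched as follows. The representation is an assumption for illustration: the recursive call is marked by a "rec" tag, and the unknown remainder of an unfolding is written as Omega, which matches anything.

```python
# Check whether a finite program belongs to the language of an RPS by
# unfolding the RPS a few times and matching against the finite program.

OMEGA = "Omega"

def unfold(head_vars, body, args, depth):
    """Unfold a recursive equation depth times, starting from body[args]."""
    if depth == 0:
        return OMEGA
    def subst(t, env):
        if isinstance(t, tuple):
            if t[0] == "rec":  # marked recursive call: unfold one level deeper
                new_args = [subst(a, env) for a in t[1:]]
                return unfold(head_vars, body, new_args, depth - 1)
            return tuple(subst(u, env) for u in t)
        return env.get(t, t)
    return subst(body, dict(zip(head_vars, args)))

def matches(finite, tree):
    """Structural match, treating Omega as a wildcard on either side."""
    if finite == OMEGA or tree == OMEGA:
        return True
    if isinstance(finite, tuple) and isinstance(tree, tuple):
        return len(finite) == len(tree) and all(map(matches, finite, tree))
    return finite == tree

# intended RPS:  add(x, y) = if(eq0(x), y, add(pred(x), succ(y)))
good = ("if", ("eq0", "x"), "y", ("rec", ("pred", "x"), ("succ", "y")))
# wrong adaptation: add(x, y) = if(eq0(x), y, add(succ(x), succ(y)))
bad = ("if", ("eq0", "x"), "y", ("rec", ("succ", "x"), ("succ", "y")))

# second unfolding of the intended add, with Omega as the unknown rest:
finite = ("if", ("eq0", "x"), "y",
          ("if", ("eq0", ("pred", "x")), ("succ", "y"), OMEGA))

print(matches(finite, unfold(("x", "y"), good, ["x", "y"], 2)))  # True
print(matches(finite, unfold(("x", "y"), bad, ["x", "y"], 2)))   # False
```

The wrong adaptation is rejected because its unfolding produces succ(x) where the finite program has pred(x), which is exactly the detection step described in the text.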

[Figure 12.1 shows the third unfoldings of sub and add as trees, with Ω marking the remaining unknown part:

sub: if(eq0(x), y, if(eq0(pred(x)), pred(y), if(eq0(pred(pred(x))), pred(pred(y)), Ω)))
add: if(eq0(x), y, if(eq0(pred(x)), succ(y), if(eq0(pred(pred(x))), succ(succ(y)), Ω)))]

Source: sub(x, y) = if(eq0(x), y, sub(pred(x), pred(y)))
Target: add(x, y) = if(eq0(x), y, add(pred(x), succ(y)))

Fig. 12.1. Adaptation of Sub to Add


13. Conclusions and Further Research

We introduced analogy and abstraction as an alternative to inductive program synthesis by folding of initial trees. In general, analogy is not more efficient than folding for generalizing recursive program schemes from finite programs. Nevertheless, it is a method which we find interesting for two reasons: First, using anti-unification, mapping and generalization can be realized within the same framework, and we can demonstrate (currently only with toy examples) how abstract program schemes can be learned from some initial programming experience. Second, analogy is considered a crucial strategy in human problem solving and learning.

13.1 Learning and Applying Abstract Schemes

Our approach to programming by analogy using anti-unification mainly addresses the problem of obtaining abstract schemes from some concrete programming experience. If such schemes are acquired, a deductive approach to program synthesis by stepwise refinement, as proposed by Smith (1990), can be used. Currently, mainly in the context of object-oriented programming, a similar concept, design patterns, is discussed as an approach to program reuse and an important contribution to the productivity of software development (Gamma, Helm, Johnson, & Vlissides, 1996). Typically, program schemes or design patterns are provided by experts as an aggregation and abstraction of their programming experience. Our research aims at getting more insight into how such abstractions are obtained by human programmers and at exploiting this insight for (partial) automation of the construction of program schemes. Additionally, we address the problem of retrieving a suitable scheme for a given new programming problem. By using subsumption of anti-instances, retrieval can be based on a qualitative criterion of structural similarity (Plaza, 1995) instead of on a quantitative similarity measure.

Currently, our approach is only a minimal extension of first-order anti-unification, allowing generalization over functions with the same arity. General second-order anti-unification has the drawback that uniqueness is lost, mainly because it is open which pairs of arguments of two terms are anti-unified. To constrain second-order anti-unification, we plan to introduce function evaluation, similar to the proposal of Kolbe and Walther (1995) in the context of reusing proofs. Thereby we also extend the purely syntactical approach to similarity and generalization by semantic aspects. Even with this additional information, typically not a single source candidate but a set of RPSs will be retrieved from memory.

While in the system KIDS a knowledgeable user selects the RPS which underlies the program to be constructed, we are interested in identifying heuristics for automatic selection. In a first step, candidate RPSs should be evaluated with respect to their structural similarity to the target RPS. As we have seen in the previous chapter, it is not very helpful to transfer a scheme to a problem belonging to a different recursion class, such as transferring the linear recursive factorial to the tree recursive fibonacci (see tab. 12.4). For this step an interleaving of folding and analogy might be helpful: Information about hypothetical positions of recursive calls, which are known for the source RPS, can be superimposed on the new finite program term, and it can be checked whether this results in a valid recursive segmentation. Additionally, criteria of structural overlap between finite programs, as investigated in chapter 11, can be employed to decide whether it is worthwhile to try analogical transfer. If after this initial reduction of source candidates more than one RPS remains, information about function types can support the final selection. Note that such a strategy is complementary to cognitive science approaches, where in a first step candidates sharing superficial features are obtained and in a second step the best mapping candidate is selected.

Currently, we are using analogy or abstraction as an alternative to folding. A more powerful approach would be to apply analogy already to a given problem specification, the set of operators and top-level goals given as input to DPlan: Each RPS could be associated with the planning domain from which it was originally constructed, and for a new planning domain, a structurally similar domain could be identified for which a recursive control program is already given. First ideas for analyzing planning domains with respect to their underlying recursive structure were proposed by Wolf (2000).

13.2 A Framework for Learning from Problem Solving

Now we have introduced all components of the system proposed in chapter 1: universal planning, inductive program synthesis, and programming by analogy and abstraction. The system is still in its infancy, but we hope that we could demonstrate that the combination of these three areas of research can be fruitful for program synthesis.

Most work in machine learning research addresses concept learning; some work is done in the area of skill acquisition. These fields cover only a fraction of the power of human learning. We believe that our system models some aspects of human learning from problem solving: exploring a new domain by search, inducing a recursive generalization which represents a problem solving strategy, and, incrementally, with growing experience over different domains, constructing a hierarchy of abstract schemes. Folding of finite terms models generalization learning by detecting a pattern in a sequence of actions. Abstraction as a result of analogical problem solving models generalization learning by detecting a common structure between the solutions of different problems. Both approaches address one possible source of human creativity: the discovery of problem solving strategies or algorithms from some example experience.

13.3 Application Perspective

In this book we introduced basic concepts and algorithms for automating (parts of) the process of program construction and generalization learning. We presented learning of recursive control rules for AI planning as a field of application. In the future, we hope to use the ideas presented in this book as a basis for further areas of application (which we already mentioned in chap. 1). We plan to apply learning of recursive control rules to proof planning. As in general AI planning, there is a need to guide search in automatic theorem proving. An obvious area of application is learning from user traces to support end-user programming. Nowadays nearly everybody uses a computer, but only a small fraction of people knows how to program. There already exist some tools for generating "macros" from observation of users' input behavior in a software system, be it a text editor, a graphic editor, or spreadsheet analysis. Allowing not only to learn linear sequences of operations, which is what current tools provide for, but to generalize to loops (recursion) can make end-user support more powerful.
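As a toy illustration of this last point, generalizing a linear trace to a loop can be as simple as detecting the shortest repeated unit of a recorded action sequence; the trace format below is invented for the example:

```python
# Fold a recorded action trace into (loop body, repetition count) by
# finding the shortest prefix whose repetition reproduces the whole trace.

def fold_trace(trace):
    """Return (body, n) such that body * n == trace, with the shortest body."""
    for k in range(1, len(trace) + 1):
        if len(trace) % k == 0 and trace[:k] * (len(trace) // k) == trace:
            return trace[:k], len(trace) // k
    return trace, 1

trace = ["select-row", "copy", "paste"] * 4
body, n = fold_trace(trace)
print(body, n)  # ['select-row', 'copy', 'paste'] 4
```

A macro recorder that only stores the linear trace would replay exactly twelve steps; the folded form "repeat body n times" (or its recursive equivalent) is the kind of generalization meant above.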

Everybody [...] omits something, if only because to include everything is impossible.
– Nero Wolfe in: Rex Stout, Poison à la Carte, 1960


Bibliography

Aho, A. V., Sethi, R., & Ullman, J. D. (1986). Compilers: Principles, Techniques, and Tools. Addison-Wesley, Reading, MA.
Allen, J., Hendler, J., & Tate, A. (Eds.). (1990). Readings in Planning. Morgan Kaufmann, San Mateo, CA.
Allouche, J.-P. (1994). Note on the cyclic Towers of Hanoi. Theoretical Computer Science, 123, 3–7.
Anderson, J. R. (1983). The Architecture of Cognition. Harvard University Press, Cambridge, MA.
Anderson, J. R. (1995). Cognitive Psychology and Its Implications. Freeman, New York.
Anderson, J. R., Conrad, F. G., & Corbett, A. T. (1989). Skill acquisition and the LISP tutor. Cognitive Science, 13, 467–505.
Anderson, J. R., & Lebiere, C. (1998). The Atomic Components of Thought. Erlbaum, Mahwah, NJ.
Anderson, J., & Thompson, R. (1989). Use of analogy in a production system architecture. In Vosniadou, S., & Ortony, A. (Eds.), Similarity and Analogical Reasoning, pp. 267–297. Cambridge University Press.
Angluin, D. (1981). A note on the number of queries needed to identify regular languages. Information and Control, 51(1), 76–87.
Angluin, D. (1984). Inductive inference: Efficient algorithms. In Biermann, A., Guiho, G., & Kodratoff, Y. (Eds.), Automatic Program Construction Techniques, pp. 503–516. Macmillan, New York.
Anzai, Y., & Simon, H. (1979). The theory of learning by doing. Psychological Review, 86, 124–140.
Atkinson, M. D. (1981). The cyclic Towers of Hanoi. Information Processing Letters, 13(3), 118–119.
Atwood, M. E., & Polson, P. G. (1976). A process model for water jug problems. Cognitive Psychology, 8, 191–216.
Bacchus, F., & Kabanza, F. (1996). Using temporal logic to control search in a forward chaining planner. In New Directions in AI Planning, pp. 141–153. IOS Press, Amsterdam.
Bacchus, F., Kautz, H., Smith, D. E., Long, D., Geffner, H., & Koehler, J. (2000). AIPS-00 Planning Competition, Breckenridge, CO. http://www.cs.toronto.edu/aips2000/.

Barr, A., & Feigenbaum, E. A. (Eds.). (1982). The Handbook of Artificial Intelligence, Vol. 2. William Kaufmann, Los Altos, CA.
Bates, J., & Constable, R. (1985). Proofs as programs. ACM Transactions on Programming Languages and Systems, 7(1), 113–136.
Bauer, F., Broy, M., Möller, B., Pepper, P., Wirsing, M., et al. (1985). The Munich Project CIP. Vol. I: The Wide Spectrum Language CIP-L. Springer.
Bauer, F., Ehler, H., Horsch, R., Möller, B., Partsch, H., Paukner, O., & Pepper, P. (1987). The Munich Project CIP. Vol. II: The Transformation System CIP-S, LNCS 292. Springer.
Bauer, M. (1975). A basis for the acquisition of procedures from protocols. In Proc. 4th International Joint Conference on Artificial Intelligence (4th IJCAI), pp. 226–231.
Bibel, W. (1980). Syntax-directed, semantics-supported program synthesis. Artificial Intelligence, 14(3), 243–261.
Bibel, W. (1986). A deductive solution for plan generation. New Generation Computing, 4(2), 115–132.
Bibel, W., Korn, D., Kreitz, C., & Kurucz, F. (1998). A multi-level approach to program synthesis. In Proc. 7th International Workshop on Logic Program Synthesis and Transformation, Vol. 1463 of LNCS, pp. 1–27. Springer.
Biermann, A. W. (1978). The inference of regular LISP programs from examples. IEEE Transactions on Systems, Man, and Cybernetics, 8(8), 585–600.
Biermann, A. W., Guiho, G., & Kodratoff, Y. (Eds.). (1984). Automatic Program Construction Techniques. Macmillan, New York.
Blum, A., & Furst, M. (1997). Fast planning through planning graph analysis. Artificial Intelligence, 90(1–2), 281–300.
Bonet, B., & Geffner, H. (1999). Planning as heuristic search: New results. In Proc. European Conference on Planning (ECP-99), Durham, UK. Springer.
Boström, H. (1996). Theory-guided induction of logic programs by inference of regular languages. In Saitta, L. (Ed.), Proc. 13th International Conference on Machine Learning (ICML-96), pp. 46–53. Morgan Kaufmann.
Boström, H. (1998). Predicate invention and learning from positive examples only. In Proc. 10th European Conference on Machine Learning (ECML-98), No. 1398 in LNAI, pp. 226–237. Springer.
Boyer, R., & Moore, J. (1975). Proving theorems about LISP functions. Journal of the ACM, 22(1), 129–144.
Briesemeister, L., Scheffer, T., & Wysotzki, F. (1996). A concept based algorithmic model for skill acquisition. In Schmid, U., Krems, J., & Wysotzki, F. (Eds.), Proc. 1st European Workshop on Cognitive Modeling (14.–16.11.96), TU Berlin. http://ki.cs.tu-berlin.de/EuroCog/cm-program.html.

Broy, M., & Pepper, P. (1981). Program development as a formal activity. IEEE Transactions on Software Engineering, SE-7(1), 14–22.
Bryant, R. E. (1986). Graph-based algorithms for Boolean function manipulation. IEEE Transactions on Computers, 35, 677–691.
Buchanan, B. G., & Wilkins, D. C. (Eds.). (1993). Readings in Knowledge Acquisition and Learning. Morgan Kaufmann, San Mateo, CA.
Bundy, A. (1988). The use of explicit plans to guide inductive proofs. In Proc. 9th Conference on Automated Deduction (CADE), pp. 111–120. Springer.
Bunke, H., & Messmer, B. T. (1994). Similarity measures for structured representations. In Wess, S., Althoff, K.-D., & Richter, M. (Eds.), Proc. 1st European Workshop on Topics in Case-Based Reasoning, Vol. 837 of LNAI, pp. 106–118. Springer.
Burch, J. R., Clarke, E. M., McMillan, K. L., & Hwang, J. (1992). Symbolic model checking: 10^20 states and beyond. Information and Computation, 98(2), 142–170.
Burghardt, J., & Heinz, B. (1996). Implementing anti-unification modulo equational theory. Tech. rep., Arbeitspapiere der GMD. ftp://ftp.first.gmd.de/pub/jochen/Papers/ap1006.dvi.gz.
Burns, B. (1996). Meta-analogical transfer: Transfer between episodes of analogical reasoning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22(4), 1032–1048.
Burstall, R., & Darlington, J. (1977). A transformation system for developing recursive programs. JACM, 24(1), 44–67.
Burton, C. T. P. (1992). Program morphisms. Formal Aspects of Computing, 4, 693–726.
Bylander, T. (1994). The computational complexity of STRIPS planning. Artificial Intelligence, 69, 165–204.
Carbonell, J. (1986). Derivational analogy: A theory of reconstructive problem solving and expertise acquisition. In Michalski, R., Carbonell, J., & Mitchell, T. (Eds.), Machine Learning: An Artificial Intelligence Approach, Vol. 2, pp. 371–392. Morgan Kaufmann, Los Altos, CA.
Cheatham, T. E. (1984). Reusability through program transformation. IEEE Transactions on Software Engineering, 10(5), 589–594.
Chomsky, N. (1959). Review of Skinner's 'Verbal Behavior'. Language, 35, 26–58.
Christofides, N. (1975). Graph Theory: An Algorithmic Approach. Academic Press, London.
Cimatti, A., Roveri, M., & Traverso, P. (1998). Automatic OBDD-based generation of universal plans in non-deterministic domains. In Proc. 15th National Conference on Artificial Intelligence (AAAI-98), pp. 875–881. AAAI Press, Menlo Park.

Clement, E., & Richard, J.-F. (1997). Knowledge of domain effects in problem representation: The case of Tower of Hanoi isomorphs. Thinking and Reasoning, 3(2), 133–157.
Cohen, W. W. (1992). Desiderata for generalization-to-n algorithms. In Proc. Int. Workshop on Analogical and Inductive Inference (AII '92), Dagstuhl Castle, Germany, Vol. 642 of LNAI, pp. 140–150. Springer.
Cohen, W. W. (1995). PAC-learning recursive logic programs: Efficient algorithms. Journal of Artificial Intelligence Research, 2, 501–539.
Cohen, W. (1998). Hardness results for learning first-order representations and programming by demonstration. Machine Learning, 30, 57–97.
Cormen, T., Leiserson, C. E., & Rivest, R. (1990). Introduction to Algorithms. MIT Press.
Courcelle, B. (1990). Recursive applicative program schemes. In van Leeuwen, J. (Ed.), Handbook of Theoretical Computer Science, Vol. A, pp. 459–492. Elsevier.
Courcelle, B., & Nivat, M. (1978). The algebraic semantics of recursive program schemes. In Winkowski, J. (Ed.), Proc. 7th Symposium on Mathematical Foundations of Computer Science, Vol. 64 of LNCS, pp. 16–30. Springer.
Cypher, A. (Ed.). (1993). Watch What I Do: Programming by Demonstration. MIT Press, Cambridge, MA.
Davey, B. A., & Priestley, H. A. (1990). Introduction to Lattices and Order. Cambridge University Press.
Dean, T., Basye, K., & Shewchuk, J. (1993). Reinforcement learning for planning and control. In Minton, S. (Ed.), Machine Learning Methods for Planning, chap. 3, pp. 67–92. Morgan Kaufmann.
Dershowitz, N. (1986). Programming by analogy. In Michalski, R. S., Carbonell, J. G., & Mitchell, T. M. (Eds.), Machine Learning: An Artificial Intelligence Approach, Vol. 2, pp. 393–422. Morgan Kaufmann, Los Altos, CA.
Dershowitz, N., & Jouannaud, J.-P. (1990). Rewrite systems. In van Leeuwen, J. (Ed.), Handbook of Theoretical Computer Science, Vol. B, pp. 243–320. Elsevier.
Do, M. B., & Kambhampati, S. (2000). Solving planning-graph by compiling it into CSP. In Proc. 5th International Conference on Artificial Intelligence Planning and Scheduling (AIPS 2000), pp. 82–91. AAAI Press.
Dold, A. (1995). Representing, verifying and applying software development steps using the PVS system. In Alagar, V. S., & Nivat, M. (Eds.), Proc. Algebraic Methodology and Software Technology (AMAST'95), Vol. 936 of LNCS, pp. 431–460. Springer.
Duncker, K. (1945). On problem solving. Psychological Monographs, 58.
Eckes, T., & Roßbach, H. (1980). Clusteranalysen. Kohlhammer, Stuttgart.

Edelkamp, S. (2000). Heuristic search planning with BDDs. In Proc. 14th Workshop "New Results in Planning, Scheduling and Design" (PuK2000), Berlin. http://www-is.informatik.uni-oldenburg.de/~sauer/puk2000/paper.html.
Ehrig, H., & Mahr, B. (1985). Fundamentals of Algebraic Specification 1: Equations and Initial Semantics. Springer, Berlin.
Evans, T. G. (1968). A program for the solution of a class of geometric-analogy intelligence-test questions. In Minsky, M. (Ed.), Semantic Information Processing, pp. 271–353. MIT Press.
Falkenhainer, B., Forbus, K., & Gentner, D. (1989). The structure-mapping engine: Algorithm and examples. Artificial Intelligence, 41, 1–63.
Feather, M. S. (1987). A survey and classification of some program transformation approaches and techniques. In Meertens, L. G. L. T. (Ed.), Program Specification and Transformation, pp. 165–198. North-Holland.
Field, A. J., & Harrison, P. G. (1988). Functional Programming. Addison-Wesley, Reading, MA.
Fikes, R., & Nilsson, N. (1971). STRIPS: A new approach to the application of theorem proving to problem solving. Artificial Intelligence, 2(3), 189–208.
Fink, E., & Yang, Q. (1997). Automatically selecting and using primary effects in planning: Theory and experiments. Artificial Intelligence Journal, 89, 285–315.
Flener, P. (1995). Logic Program Synthesis from Incomplete Information. Kluwer Academic Press, Boston.
Flener, P., Popelinsky, L., & Stepankova, O. (1994). ILP and automatic programming: Towards three approaches. In Proc. of ILP-94, pp. 351–364.
Flener, P., & Yilmaz, S. (1999). Inductive synthesis of recursive logic programs: Achievements and prospects. Journal of Logic Programming, 41(2–3), 141–195.
Fox, M., & Long, D. (1998). The automatic inference of state invariants in TIM. Journal of Artificial Intelligence Research, 9, 367–421.
Frühwirth, T., & Abdennadher, S. (1997). Constraint-Programming. Springer, Berlin.
Gamma, E., Helm, R., Johnson, R., & Vlissides, J. (1996). Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, Reading, MA.
Garey, M. R., & Johnson, D. S. (1979). Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman.
Geffner, H. (2000). Functional Strips: A more flexible language for planning and problem solving. In Minker, J. (Ed.), Logic-Based Artificial Intelligence. Kluwer, Dordrecht. Earlier version presented at the "Logic-based AI Workshop", Washington D.C., June 1999.

Gelfond, M., & Lifschitz, V. (1993). Representing action and change by logic programs. Journal of Logic Programming, 17, 301–322.
Gentner, D. (1980). The Structure of Analogical Models in Science. Bolt Beranek and Newman Inc., Cambridge, MA.
Gentner, D. (1983). Structure-mapping: A theoretical framework for analogy. Cognitive Science, 7, 155–170.
Gentner, D., Brem, S., Ferguson, R., Markman, A. B., Levidow, B., Wolff, P., & Forbus, K. (1997). Analogical reasoning and conceptual change: A case study of Johannes Kepler. Journal of the Learning Sciences, 6(1), 3–40.
Gentner, D., & Landers, R. (1985). Analogical access: A good match is hard to find. In Proc. Annual Meeting of the Psychonomic Society, Boston.
Gentner, D., Ratterman, M. J., & Forbus, K. D. (1993). The roles of similarity in transfer: Separating retrievability from inferential soundness. Cognitive Psychology, 25(4), 524–575.
Georgeff, M. P. (1987). Planning. Annual Review of Computer Science, 2, 359–400. Also in J. Allen, J. Hendler and A. Tate (Eds.), Readings in Planning, Morgan Kaufmann, 1990.
Gholson, B., Smither, D., Buhrman, A., Duncan, M. K., & Pierce, K. (1996). The sources of children's reasoning errors during analogical problem solving. Applied Cognitive Psychology, Special Issue: "Reasoning Processes", 10, 85–97.
Gick, M., & Holyoak, K. (1980). Analogical problem solving. Cognitive Psychology, 12, 306–355.
Gick, M., & Holyoak, K. (1983). Schema induction and analogical transfer. Cognitive Psychology, 14, 1–38.
Giesl, J. (2000). Context-moving transformations for function verification. In Proc. 9th International Workshop on Logic-based Program Synthesis and Transformation (LOPSTR '99). Springer.
Ginsberg, M. (1989). Universal planning: An (almost) universally bad idea. AI Magazine, 10(4), 40–44.
Giunchiglia, F., & Traverso, P. (1999). Planning as model checking. In Biundo, S., & Fox, M. (Eds.), Proc. 5th European Conference on Planning (ECP'99), Durham, UK, LNCS. Springer.
Gold, E. M. (1978). Complexity of automaton identification from given data. Information and Control, 37, 302–320.
Gold, E. (1967). Language identification in the limit. Information and Control, 10, 447–474.
Green, C. (1969). Application of theorem proving to problem solving. In Proc. 1st International Joint Conference on Artificial Intelligence (IJCAI-69), pp. 342–344.
Green, C., Luckham, D., Balzer, R., Cheatham, T., & Rich, C. (1983). Report on a knowledge-based software assistant. Tech. rep. KES.U.83.2, Kestrel Institute, Palo Alto, CA.

Green, C., & Barstow, D. (1978). On program synthesis knowledge. Artificial Intelligence, 10, 241–279.
Guessarian, I. (1992). Trees and algebraic semantics. In Nivat, M., & Podelski, A. (Eds.), Tree Automata and Languages, pp. 291–310. Elsevier.
Hasker, R. W. (1995). The Replay of Program Derivations. Ph.D. thesis, Univ. of Illinois at Urbana-Champaign.
Haslum, P., & Geffner, H. (2000). Admissible heuristics for optimal planning. In Proc. 5th International Conference on Artificial Intelligence Planning and Scheduling (AIPS 2000), pp. 150–158. AAAI Press.
Hickman, A. K., & Larkin, J. H. (1990). Internal analogy: A model of transfer within problems. In Proc. 12th Annual Conference of the Cognitive Science Society (CogSci-90), pp. 53–60. Lawrence Erlbaum, Hillsdale, NJ.
Hinman, P. (1978). Recursion-Theoretic Hierarchies. Springer, New York.
Hinz, A. M. (1996). Square-free Tower of Hanoi sequences. Enseign. Math., 2(42), 257–264.
Hofstadter, D., & The Fluid Analogies Research Group (1995). Fluid Concepts and Creative Analogies. Basic Books, New York.
Holland, J., Holyoak, K., Nisbett, R., & Thagard, P. (1986). Induction: Processes of Inference, Learning, and Discovery. MIT Press, Cambridge, MA.
Holyoak, K. J., & Thagard, P. (1989). Analogical mapping by constraint satisfaction. Cognitive Science, 13, 295–355.
Holyoak, K., & Koh, K. (1987). Surface and structural similarity in analogical transfer. Memory and Cognition, 15, 332–340.
Hopcroft, J., & Ullman, J. (1980). Introduction to Automata Theory, Languages, and Computation. Addison-Wesley.
Huang, Y., Selman, B., & Kautz, H. (2000). Learning declarative control rules for constraint-based planning. In Proc. 17th International Conference on Machine Learning (ICML-2000), Stanford, CA, pp. 415–422. Morgan Kaufmann.
Huet, G. (1976). Résolution d'équations dans les langages d'ordre 1, 2, …, ω. Ph.D. thesis, Université de Paris VII.
Hummel, J., & Holyoak, K. (1997). Distributed representation of structure: A theory of analogical access and mapping. Psychological Review, 104(3), 427–466.
Idestam-Almquist, P. (1993). Generalization under implication by recursive anti-unification. In Utgoff, P. (Ed.), Proc. 10th International Conference on Machine Learning (ICML-93), pp. 151–158. Morgan Kaufmann.
Idestam-Almquist, P. (1995). Efficient induction of recursive definitions by structural analysis of saturations. In De Raedt, L. (Ed.), Proc. 5th International Workshop on Inductive Logic Programming (ILP-95), Leuven, pp. 77–94.

Indurkhya, B. (1992). Metaphor and Cognition. Kluwer, Dordrecht, The Netherlands.
Jamnik, M., Kerber, M., & Benzmüller, C. (2000). Towards learning new methods in proof planning. In Proc. of the Calculemus Symposium: Systems for Integrated Computation and Deduction, St. Andrews, Scotland, UK. ftp://ftp.cs.bham.ac.uk/pub/authors/M.Kerber/TR/CSRP-00-09.ps.gz.
Jensen, R. M., & Veloso, M. M. (2000). OBDD-based universal planning for synchronized agents in non-deterministic domains. Journal of Artificial Intelligence Research, 13, 189–226.
Johnson, W. L. (1988). Modelling programmers' intentions. In Self, J. (Ed.), Artificial Intelligence and Human Learning: Intelligent Computer-Aided Instruction, pp. 374–390. Chapman and Hall, London.
Johnson, W., & Feather, M. (1991). Using evolution transformations to construct specifications. In Lowry, M., & McCartney, R. (Eds.), Automating Software Design, chap. 4, pp. 65–92. AAAI Press/MIT Press, Menlo Park.
Jouannaud, J. P., & Kodratoff, Y. (1979). Characterization of a class of functions synthesized from examples by a Summers-like method using a 'B.M.W.' matching technique. In Proc. International Joint Conference on Artificial Intelligence (IJCAI-79), pp. 440–447. Morgan Kaufmann.
Kalmár, Z., & Szepesvári, C. (1999). An evaluation criterion for macro learning and some results. Tech. rep. TR99-01, Mindmaker Ltd., Budapest. http://victoria.mindmaker.hu/~szepes/papers/macro-tr99-01.ps.gz.
Kambhampati, S. (2000). Planning graph as a (dynamic) CSP: Exploiting EBL, DDB and other CSP search techniques in Graphplan. Journal of Artificial Intelligence Research, 12, 1–34.
Kautz, H., & Selman, B. (1996). Pushing the envelope: Planning, propositional logic and stochastic search. In Proc. 13th National Conference on Artificial Intelligence and 8th Innovative Applications of Artificial Intelligence Conference, pp. 1194–1201. AAAI Press/MIT Press, Menlo Park.
Keane, M. (1996). On adaptation in analogy: Tests of pragmatic importance and adaptability in analogical problem solving. The Quarterly Journal of Experimental Psychology, 49A(4), 1062–1085.
Keane, M., Ledgeway, T., & Duff, S. (1994). Constraints on analogical mapping: A comparison of three models. Cognitive Science, 18, 387–438.
Knuth, D. E. (1992). Convolution polynomials. The Mathematica Journal, 2(4), 67–78.
Kodratoff, Y., & Fargues, J. (1978). A sane algorithm for the synthesis of LISP functions from example problems: The Boyer and Moore algorithm. In Proc. AISB Meeting, Hamburg, pp. 169–175.

Kodratoff, Y., Franova, M., & Partridge, D. (1989). Why and how program synthesis?. In Proc. International Workshop on Analogical and Inductive Inference (AII '89), LNCS, pp. 45–59. Springer.

Koedinger, K., & Anderson, J. (1990). Abstract planning and perceptual chunks: Elements of expertise in geometry. Cognitive Science, 14, 511–550.

Koehler, J. (1998). Planning under resource constraints. In Prade, H. (Ed.), Proc. 13th European Conference on Artificial Intelligence (ECAI-98), pp. 489–493. Wiley.

Koehler, J., Nebel, B., & Hoffmann, J. (1997). Extending planning graphs to an ADL subset. In Proc. European Conference on Planning (ECP-97), pp. 273–285. Springer. Extended version as Technical Report No. 88/1997, University Freiburg.

Kolbe, T., & Walther, C. (1995). Second-order matching modulo evaluation: A technique for reusing proofs. In Mellish, C. S. (Ed.), Proc. 14th International Joint Conference on Artificial Intelligence (IJCAI-95), pp. 190–195. Morgan Kaufmann.

Kolodner, J. (1993). Case-based Reasoning. Morgan Kaufmann, San Mateo, CA.

Korf, R. E. (1985). Macro-operators: A weak method for learning. Artificial Intelligence, 1985, 26, 35–77.

Koza, J. (1992). Genetic programming: On the programming of computers by means of natural selection. MIT Press, Cambridge, MA.

Kreitz, C. (1998). Program synthesis. In Bibel, W., & Schmitt, P. (Eds.), Automated Deduction – A Basis for Applications, Vol. III, chap. III.2.5, pp. 105–134. Kluwer.

Laborie, P., & Ghallab, M. (1995). Planning with sharable resource constraints. In Proc. of the 14th IJCAI, pp. 1643–1649. Morgan Kaufmann.

Lassez, J.-L., Maher, M. J., & Marriott, K. (1988). Unification revisited. In Boscarol, M., Aiello, L. C., & Levi, G. (Eds.), Proc. Workshop Foundations of Logic and Functional Programming, Vol. 306 of LNCS, pp. 67–113. Springer.

Lavrač, N., & Džeroski, S. (1994). Inductive Logic Programming: Techniques and Applications. Ellis Horwood, London.

Le Blanc, G. (1994). BMWk revisited: Generalization and formalization of an algorithm for detecting recursive relations in term sequences. In Bergadano, F., & de Raedt, L. (Eds.), Machine Learning, Proc. of ECML-94, pp. 183–197. Springer.

Leeuwenberg, E. (1971). A perceptual encoding language for visual and auditory patterns. American Journal of Psychology, 84, 307–349.

Lieberman, H. (1993). Tinker: A programming by demonstration system for beginning programmers. In Cypher, A. (Ed.), Watch What I Do: Programming by Demonstration, chap. 2. MIT Press, Cambridge, MA. See http://www.acypher.com/wwid/Chapters/02Tinker.html.


Lifschitz, V. (1987). On the semantics of STRIPS. In Georgeff, M. P., & Lansky, A. L. (Eds.), Reasoning about Actions and Plans, pp. 1–9. Kaufmann, Los Altos, CA.

Long, D., & Fox, M. (1999). The efficient implementation of the plan-graph in STAN. Journal of Artificial Intelligence Research, 10, 87–115.

Lowry, M., & Duran, R. (1989). Knowledge-based software engineering. In Barr, A., Cohen, P. R., & Feigenbaum, E. A. (Eds.), Handbook of Artificial Intelligence, Vol. IV, pp. 241–322. Addison-Wesley, Reading, MA.

Lowry, M. L., & McCartney, R. D. (Eds.). (1991). Automating Software Design. MIT Press, Cambridge, MA.

Lu, S. (1979). A tree-to-tree distance and its application to cluster analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-1 (2), 219–224.

Manna, Z., & Waldinger, R. (1975). Knowledge and reasoning in program synthesis. Artificial Intelligence, 6, 175–208.

Manna, Z., & Waldinger, R. (1979). Synthesis: Dreams → programs. IEEE Transactions on Software Engineering, 5 (4), 294–328.

Manna, Z., & Waldinger, R. (1980). A deductive approach to program synthesis. ACM Transactions on Programming Languages and Systems, 2 (1), 90–121. Also in: C. Rich and R. C. Waters (1986). Readings in AI and Software Engineering, Chap. 1. Morgan Kaufmann.

Manna, Z., & Waldinger, R. (1987). How to clear a block: A theory of plans. Journal of Automated Reasoning, 3 (4), 343–378.

Manna, Z., & Waldinger, R. (1992). Fundamentals of deductive program synthesis. IEEE Transactions on Software Engineering, 18 (8), 674–704.

Martín, M., & Geffner, H. (2000). Learning generalized policies in planning using concept languages. In Proc. 7th International Conference on Knowledge Representation and Reasoning (KR 2000), pp. 667–677. Morgan Kaufmann.

Mayer, R. E. (1983). Thinking, Problem Solving, Cognition. Freeman, New York.

Mayer, R. E. (1988). Teaching and Learning Computer Programming: Multiple Research Perspectives. Erlbaum, Hillsdale, NJ.

McCarthy, J., & Hayes, P. J. (1969). Some philosophical problems from the standpoint of artificial intelligence. Machine Intelligence, 4, 463–502.

McCarthy, J. (1963). Situations, actions, and causal laws. Memo 2, Stanford University Artificial Intelligence Project, Stanford, California.

McDermott, D. (1998a). AIPS-98 Planning Competition Results. http://ftp.cs.yale.edu/pub/mcdermott/aipscomp-results.html.

McDermott, D. (1998b). PDDL – the planning domain definition language. http://ftp.cs.yale.edu/pub/mcdermott.

Mercy, R. (1998). Programmieren durch analoges Schließen: Zwei Algorithmen zur Adaptation rekursiver Programmschemata (Programming by analogy: Two algorithms for the adaptation of recursive program schemes). Master's thesis, Fachbereich Informatik, Technische Universität Berlin.

Messing, L. S., & Campbell, R. (Eds.). (1999). Advances in Artificial Intelligence in Software Engineering. JAI Press, Greenwich, Connecticut.

Michalski, R. S., Carbonell, J. G., & Mitchell, T. M. (Eds.). (1983). Machine Learning – An Artificial Intelligence Approach. Tioga.

Michalski, R. S., Carbonell, J. G., & Mitchell, T. M. (Eds.). (1986). Machine Learning – An Artificial Intelligence Approach, Vol. 2. Morgan Kaufmann.

Minton, S. (1988). Learning effective search control knowledge: An explanation-based approach. Kluwer.

Minton, S. (1985). Selectively generalizing plans for problem-solving. In Proc. International Joint Conference on Artificial Intelligence (IJCAI-85), pp. 596–599. Morgan Kaufmann.

Mitchell, T. M., Keller, R. M., & Kedar-Cabelli, S. T. (1986). Explanation-based generalization: A unifying view. Machine Learning, 1, 47–80.

Mitchell, T. M. (1997). Machine Learning. McGraw-Hill.

Muggleton, S. (1994). Predicate invention and utility. Journal of Experimental and Theoretical Artificial Intelligence, 6 (1), 121–130.

Muggleton, S., & De Raedt, L. (1994). Inductive logic programming: Theory and methods. Journal of Logic Programming, Special Issue on 10 Years of Logic Programming, 19-20, 629–679.

Muggleton, S., & Feng, C. (1990). Efficient induction of logic programs. In Proc. of the Workshop on Algorithmic Learning Theory by the Japanese Society for AI, pp. 368–381.

Muggleton, S. (1991). Inductive logic programming. New Generation Computing, 8, 295–318.

Mühlpfordt, M. (2000). Syntaktische Inferenz Rekursiver Programmschemata (Syntactical inference of recursive program schemes). Master's thesis, Dept. of Computer Science, TU Berlin.

Mühlpfordt, M., & Schmid, U. (1998). Synthesis of recursive functions with interdependent parameters. In Proc. of the Annual Meeting of the GI Machine Learning Group, FGML-98, pp. 132–139. TU Berlin.

Müller, W., & Wysotzki, F. (1995). Automatic synthesis of control programs by combination of learning and problem solving methods. In Lavrač, N., & Wrobel, S. (Eds.), Proc. European Conference on Machine Learning (ECML-95), Vol. 912 of LNAI, pp. 323–326. Springer.

Nebel, B. (2000). On the compilability and expressive power of propositional planning formalisms. Journal of Artificial Intelligence Research, 12, 271–315.

Nebel, B., & Koehler, J. (1995). Plan reuse versus plan generation: A theoretical and empirical analysis. Artificial Intelligence, 76, 427–454.


Newell, A. (1990). Unified Theories of Cognition. Harvard University Press, Cambridge, MA.

Newell, A., Shaw, J. C., & Simon, H. A. (1958). Elements of a theory of human problem solving. Psychological Review, 65, 151–166.

Newell, A., & Simon, H. A. (1972). Human Problem Solving. Prentice Hall, Englewood Cliffs, NJ.

Newell, A., & Simon, H. (1961). GPS, a program that simulates human thought. In Billing, H. (Ed.), Lernende Automaten, pp. 109–124. Oldenbourg, München.

Nilsson, N. J. (1980). Principles of Artificial Intelligence. Springer, New York.

Nilsson, N. (1971). Problem-solving Methods in Artificial Intelligence. McGraw-Hill.

Novick, L. R. (1988). Analogical transfer, problem similarity, and expertise. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 510–520.

Novick, L. R., & Hmelo, C. E. (1994). Transferring symbolic representations across nonisomorphic problems. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20 (6), 1296–1321.

Novick, L. R., & Holyoak, K. J. (1991). Mathematical problem solving by analogy. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17 (3), 398–415.

Odifreddi, P. (1989). Classical Recursion Theory, Vol. 125 of Studies in Logic. North-Holland, Amsterdam.

O'Hara, S. (1992). A model of the redescription process in the context of geometric proportional analogy problems. In Proc. International Workshop on Analogical and Inductive Inference (AII '92), Dagstuhl Castle, Germany, Vol. 642 of LNAI, pp. 268–293. Springer.

Olsson, R. (1995). Inductive functional programming using incremental program transformation. Artificial Intelligence, 74 (1).

Osherson, D., Stob, M., & Weinstein, S. (1986). Systems that Learn. MIT Press, Cambridge, MA.

Parandian, B., Schmid, U., & Wysotzki, F. (1995). Program synthesis with a generalized planning approach. In Program Synthesis by Learning and Planning. Technical Report 95-20, Department of Computer Science, TU Berlin.

Partridge, D. (Ed.). (1991). Artificial Intelligence and Software Engineering. Ablex, Norwood, NJ.

Pednault, E. P. D. (1987). Formulating multiagent, dynamic-world problems in the classical planning framework. In Georgeff, M. P., & Lansky, A. L. (Eds.), Proc. Workshop on Reasoning About Actions and Plans, pp. 47–82. Morgan Kaufmann.

Pednault, E. P. D. (1994). ADL and the state-transition model of action. Journal of Logic and Computation, 4 (5), 467–512.


Penberthy, J. S., & Weld, D. S. (1992). UCPOP: A sound, complete partial order planner for ADL. In Proc. 3rd International Conference on Principles in Knowledge Representation and Reasoning (KR-92), pp. 103–114.

Pettorossi, A. (1984). Towers of Hanoi problems: Deriving the iterative solutions using the program transformation technique. Tech. rep. R.82, Instituto di Analisi dei Sistemi ed Informatica, Roma.

Pirolli, P., & Anderson, J. (1985). The role of learning from examples in the acquisition of recursive programming skills. Canadian Journal of Psychology, 39, 240–272.

Pisch, H. (2000). Ein effizienter Pattern-Matcher zur Synthese rekursiver Programme aus endlichen Termen (An efficient pattern-matcher for the synthesis of recursive programs from finite terms). Master's thesis, Dept. of Computer Science, TU Berlin.

Plaza, E. (1995). Cases as terms: A feature term approach to the structured representation of cases. In Proc. 1st International Conference on Case-Based Reasoning (ICCBR-95), Vol. 1010 of LNCS, pp. 265–276. Springer.

Plotkin, G. D. (1969). A note on inductive generalization. In Machine Intelligence, Vol. 5, pp. 153–163. Edinburgh University Press.

Polya, G. (1957). How to Solve It. Princeton University Press, Princeton, NJ.

Potter, B., Sinclair, J., & Till, D. (1991). An Introduction to Formal Specification and Z. Prentice Hall.

Quinlan, J. R. (1990). Learning logical definitions from relations. Machine Learning, 5 (3), 239–266.

Quinlan, J. (1986). Induction of decision trees. Machine Learning, 1 (1), 81–106.

Reed, S. K., Ackinclose, C. C., & Voss, A. A. (1990). Selecting analogous problems: Similarity versus inclusiveness. Memory & Cognition, 18 (1), 83–98.

Reed, S. K., & Bolstad, C. (1991). Use of examples and procedures in problem solving. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17 (4), 753–766.

Reed, S., Ernst, G., & Banerji, R. (1974). The role of analogy in transfer between similar problem states. Cognitive Psychology, 6, 436–450.

Rich, C., & Waters, R. C. (1988). Automatic programming: Myths and prospects. IEEE Computer, 21 (11), 10–25.

Rich, C., & Waters, R. (1986a). Introduction. In Rich, C., & Waters, R. (Eds.), Readings in Artificial Intelligence and Software Engineering. Morgan Kaufmann, Los Altos, CA.

Rich, C., & Waters, R. (Eds.). (1986b). Readings in Artificial Intelligence and Software Engineering. Morgan Kaufmann, Los Altos, CA.


Riesbeck, C., & Schank, R. (1989). Inside Case-based Reasoning. Lawrence Erlbaum, Hillsdale, NJ.

Rivest, R. (1987). Learning decision lists. Machine Learning, 2 (3), 229–246.

Robinson, J. A. (1965). A machine-oriented logic based on the resolution principle. Journal of the ACM, 12 (1), 23–41.

Rosenbloom, P. S., & Newell, A. (1986). The chunking of goal hierarchies: A generalized model of practice. In Michalski, R. S., Carbonell, J. G., & Mitchell, T. M. (Eds.), Machine Learning – An Artificial Intelligence Approach, Vol. 2, pp. 247–288. Morgan Kaufmann.

Ross, B. H. (1989). Distinguishing types of superficial similarities: Different effects on the access and use of earlier problems. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15 (3), 456–468.

Rouveirol, C. (1991). Semantic model for induction of first order theories. In Mylopoulos, J., & Reiter, R. (Eds.), Proc. 12th International Joint Conference on Artificial Intelligence (IJCAI-91), pp. 685–691. Morgan Kaufmann.

Rumelhart, D. E., & Norman, D. A. (1981). Analogical processes in learning. In Anderson, J. R. (Ed.), Cognitive Skills and Their Acquisition, pp. 335–360. Lawrence Erlbaum, Hillsdale, NJ.

Russell, S. J., & Norvig, P. (1995). Artificial Intelligence. A Modern Approach. Prentice-Hall, Englewood Cliffs, NJ.

Sacerdoti, E. (1975). The nonlinear nature of plans. In Proc. International Joint Conference on Artificial Intelligence (IJCAI-75), pp. 206–214. Morgan Kaufmann.

Sakakibara, Y. (1997). Recent advances of grammatical inference. Theoretical Computer Science, 185, 15–45.

Sammut, C., & Banerji, R. (1986). Learning concepts by asking questions. In Michalski, R., Carbonell, J., & Mitchell, T. (Eds.), Machine Learning – An Artificial Intelligence Approach, Vol. 2, pp. 167–192. Morgan Kaufmann, Los Altos, CA.

Schadler, K., Scheffer, T., Schmid, U., & Wysotzki, F. (1995). Program synthesis by learning and planning. Tech. rep. 95-20, Technische Universität Berlin, Computer Science Department.

Schadler, K., & Wysotzki, F. (1999). Comparing structures using a Hopfield-style neural network. Applied Intelligence, 11, 15–30.

Schmid, U. (1994). Programmieren lernen: Unterstützung des Erwerbs rekursiver Programmiertechniken durch Beispielfunktionen und Erklärungstexte (Learning programming: Acquisition of recursive programming skills from examples and explanations). Kognitionswissenschaft, 4 (1), 47–54.

Schmid, U., & Carbonell, J. (1999). Empirical evidence for derivational analogy. In Wachsmuth, I., & Jung, B. (Eds.), Proc. Jahrestagung der Deutschen Gesellschaft f. Kognitionswissenschaft, Sept. 1999, Bielefeld, pp. 116–121. infix.


Schmid, U., & Kaup, B. (1995). Analoges Lernen beim rekursiven Programmieren (Analogical learning in recursive programming). Kognitionswissenschaft, 5, 31–41.

Schmid, U., Mercy, R., & Wysotzki, F. (1998). Programming by analogy: Retrieval, mapping, adaptation and generalization of recursive program schemes. In Proc. of the Annual Meeting of the GI Machine Learning Group, FGML-98, pp. 140–147. TU Berlin.

Schmid, U., Müller, M., & Wysotzki, F. (2000). Integrating function application in state-based planning. Submitted manuscript.

Schmid, U., Sinha, U., & Wysotzki, F. (2001). Program reuse and abstraction by anti-unification. In Professionelles Wissensmanagement – Erfahrungen und Visionen, pp. 183–185. Shaker. Long version: http://ki.cs.tu-berlin.de/~schmid/pub-ps/pba-wm01-3a.ps.

Schmid, U., Wirth, J., & Polkehn, K. (1999). Analogical transfer of non-isomorphic source problems. In Proc. 21st Annual Meeting of the Cognitive Science Society (CogSci-99), August 19-21, 1999, Simon Fraser University, Vancouver, British Columbia, pp. 631–636. Morgan Kaufmann.

Schmid, U., Wirth, J., & Polkehn, K. (2001). A closer look on structural similarity in analogical transfer. Submitted manuscript.

Schmid, U., & Wysotzki, F. (1996). Skill acquisition can be regarded as program synthesis. In Schmid, U., Krems, J., & Wysotzki, F. (Eds.), Proc. 1st European Workshop on Cognitive Modeling (14.-16.11.96), pp. 39–45. Technical University Berlin. http://ki.cs.tu-berlin.de/EuroCog/cm-program.html.

Schmid, U., & Wysotzki, F. (1998). Induction of recursive program schemes. In Proc. 10th European Conference on Machine Learning (ECML-98), Vol. 1398 of LNAI, pp. 214–225. Springer.

Schmid, U., & Wysotzki, F. (2000a). Applying inductive program synthesis to learning domain-dependent control knowledge – Transforming plans into programs. Tech. rep. CMU-CS-143, Computer Science Department, Carnegie Mellon University.

Schmid, U., & Wysotzki, F. (2000b). Applying inductive program synthesis to macro learning. In Proc. 5th International Conference on Artificial Intelligence Planning and Scheduling (AIPS 2000), pp. 371–378. AAAI Press.

Schmid, U., & Wysotzki, F. (2000c). A unifying approach to learning by doing and learning by analogy. In Callaos, N. (Ed.), 4th World Multiconference on Systemics, Cybernetics and Informatics (SCI 2000), Vol. 1, pp. 379–384. IIIS.

Schmid, U. (1999). Iterative macro-operators revisited: Applying program synthesis to learning in planning. Tech. rep. CMU-CS-99-114, Computer Science Department, Carnegie Mellon University.


Schoppers, M. (1987). Universal plans for reactive robots in unpredictable environments. In IJCAI '87, pp. 1039–1046. Morgan Kaufmann.

Schrödl, S., & Edelkamp, S. (1999). Inferring flow of control in program synthesis by example. In Proc. Annual German Conference on Artificial Intelligence (KI'99), Bonn, LNAI, pp. 171–182. Springer.

Shapiro, E. Y. (1983). Algorithmic Program Debugging. MIT Press.

Shavlik, J. W. (1990). Acquiring recursive and iterative concepts with explanation-based learning. Machine Learning, 5, 39–70.

Shell, P., & Carbonell, J. (1989). Towards a general framework for composing disjunctive and iterative macro-operators. In Proc. 11th International Joint Conference on Artificial Intelligence (IJCAI-89), Detroit, MI.

Siekmann, J. H. (1989). Unification theory. Journal of Symbolic Computation, 7, 207–274.

Simon, H. (1958). Rational choice and the structure of the environment. In Models of Bounded Rationality, Vol. 2. MIT Press, Cambridge, MA.

Simon, H. A., & Hayes, J. R. (1976). The understanding process: Problem isomorphs. Cognitive Psychology, 8, 165–190.

Sinha, U. (2000). Gedächtnisorganisation und Abruf von rekursiven Programmschemata beim analogen Programmieren durch typindizierte Anti-Unifikation (Memory organization and retrieval of recursive program schemes in programming by analogy with anti-unification). Master's thesis, Fachbereich Informatik, TU Berlin.

Smith, D. R. (1990). KIDS: A semiautomatic program development system. IEEE Transactions on Software Engineering, 16 (9), 1024–1043.

Smith, D. R. (1991). KIDS: A knowledge-based software development system. In Lowry, M., & McCartney, R. (Eds.), Automating Software Design, chap. 4, pp. 483–514. MIT Press, Menlo Park.

Smith, D. R. (1993). Constructing specification morphisms. Journal of Symbolic Computation, 15 (5-6), 571–606.

Smith, D. R. (1985). Top-down synthesis of divide-and-conquer algorithms. Artificial Intelligence, 27, 43–96.

Smith, D. (1984). The synthesis of LISP programs from examples: A survey. In Biermann, A., Guiho, G., & Kodratoff, Y. (Eds.), Automatic Program Construction Techniques, pp. 307–324. Macmillan.

Soloway, E., & Spohrer, J. (Eds.). (1989). Studying the Novice Programmer. Lawrence Erlbaum, Hillsdale, NJ.

Sommerville, I. (1996). Software Engineering (5th edition). Addison-Wesley.

Spellman, B. A., & Holyoak, K. (1996). Pragmatics in analogical mapping. Cognitive Psychology, 31, 307–346.

Spivey, J. M. (1992). The Z Notation: A Reference Manual (2nd edition). Prentice-Hall.

Sterling, L., & Shapiro, E. (1986). The Art of Prolog: Advanced Programming Techniques. MIT Press.


Summers, P. D. (1977). A methodology for LISP program construction from examples. Journal ACM, 24 (1), 162–175.

Sun, R., & Sessions, C. (1999). Self-segmentation of sequences: Automatic formation of hierarchies of sequential behaviors. Tech. rep. 609-951-2781, NEC Research Institute, Princeton. http://cs.ua.edu/~rsun/sun.sss.ps.

Sutton, R. S., & Barto, A. G. (1998). Reinforcement Learning – An Introduction. MIT Press.

Thompson, C. A., Califf, M. E., & Mooney, R. J. (1999). Active learning for natural language parsing and information extraction. In Bratko, I., & Džeroski, S. (Eds.), Proc. 16th International Conference on Machine Learning (ICML-99), pp. 406–414. Morgan Kaufmann.

Thompson, S. (1991). Type Theory and Functional Programming. Addison-Wesley, Wokingham, Eng.

Tversky, A. (1977). Features of similarity. Psychological Review, 85, 327–352.

Unger, S., & Wysotzki, F. (1981). Lernfähige Klassifizierungssysteme (Classifier Systems). Akademie-Verlag, Berlin.

Valiant, L. (1984). A theory of the learnable. Communications of the ACM, 27 (11), 1134–1142.

Veloso, M. (1994). Planning and Learning by Analogical Reasoning. Springer.

Veloso, M., Carbonell, J., Perez, M. A., Borrajo, D., Fink, E., & Blythe, J. (1995). Integrating planning and learning: The Prodigy architecture. Journal of Experimental and Theoretical Artificial Intelligence, 7 (1), 81–120.

Veloso, M. M., & Carbonell, J. G. (1993). Derivational analogy in Prodigy: Automating case acquisition, storage, and utilization. Machine Learning, 10, 249–278.

Vosniadou, S., & Ortony, A. (Eds.). (1989). Similarity and Analogical Reasoning. Cambridge University Press.

Vosniadou, S., & Ortony, A. (1989). Similarity and analogical reasoning: A synthesis. In Vosniadou, S., & Ortony, A. (Eds.), Similarity and Analogical Reasoning, pp. 1–17. Cambridge University Press, Cambridge.

Waldinger, R. (1977). Achieving several goals simultaneously. In Elcock, E. W., & Michie, D. (Eds.), Machine Intelligence, Vol. 8, pp. 94–136. Wiley.

Walsh, T. (1983). Iteration strikes back – at the cyclic Towers of Hanoi. Information Processing Letters, 16, 91–93.

Weber, G. (1996). Episodic learner modeling. Cognitive Science, 20, 195–236.

Weld, D. (1999). Recent advances in AI planning. AI Magazine, 1999, 20 (2), 93–123.

Widowski, D., & Eyferth, K. (1986). Comprehending and recalling computer programs of different structural and semantic complexity by experts and novices. In Willumeit, H. (Ed.), Human Decision Making and Manual Control. Elsevier, North-Holland.


Winston, P. (1980). Learning and reasoning by analogy. Communications of the ACM, 23, 689–703.

Winston, P. (1992). Artificial Intelligence, 3rd Edition. Addison-Wesley, Reading, MA.

Winston, P., & Horn, B. (1989). LISP. Addison-Wesley, Reading, MA.

Wolf, B. (2000). Automatic generation of state sets from domain specifications. Master's thesis, Fachbereich Informatik, TU Berlin.

Wysotzki, F. (1983). Representation and induction of infinite concepts and recursive action sequences. In Proc. 8th International Joint Conference on Artificial Intelligence (IJCAI-83), Karlsruhe, pp. 409–414. William Kaufmann.

Wysotzki, F. (1987). Program synthesis by hierarchical planning. In Jorrand, P., & Sgurev, V. (Eds.), Artificial Intelligence: Methodology, Systems, Applications, pp. 3–11. Elsevier, Amsterdam.

Wysotzki, F., & Schmid, U. (2001). Synthesis of recursive programs from finite examples by detection of macro-functions. Tech. rep. 01-2, Dept. of Computer Science, TU Berlin, Germany.

Yu, T., & Clark, C. (1998). Recursion, lambda abstraction and genetic programming. In Late Breaking Papers at EuroGP'98, pp. 26–30. University of Birmingham, UK.

Zelinka, B. (1975). On a certain distance between isomorphism classes of graphs. Časopis pro pěstování matematiky, 100, 371–372.

Zeugmann, T., & Lange, S. (1995). A guided tour across the boundaries of learning recursive languages. Lecture Notes in Computer Science, 961, 193–262.

Zweben, M., & Fox, M. S. (1994). Intelligent Scheduling. Morgan Kaufmann.

Appendices

A. Implementation Details

A.1 Short History of DPlan

The original idea of using universal planning as initial step for inductive program synthesis was presented in Wysotzki (1987): sets of basic programs and axioms are used to construct a kind of conditional plan with the top-level goals as root node; ordering of dependent goals was realized by backtracking over plan construction. From 1995 to 1997 we explored (implemented and tested) several strategies for generating finite programs by planning:

The approach proposed in Wysotzki (1987) was implemented in different versions by Ute Schmid, by Babak Parandian, and by Olaf Brandes (see Parandian, Schmid, & Wysotzki, 1995).

Furthermore, we investigated combining forward search with decision tree learning: As a first step, optimal plans are generated for each possible initial state of a domain with small complexity. Initial states are then represented as feature vectors, where the set of all different literals occurring over states is used as features, with value 1 if a literal occurs in a state description and value 0 otherwise. Each initial state is associated with the (or an) optimal action sequence for transforming it into a goal state. A decision tree algorithm (CAL2; see Unger & Wysotzki, 1981) is used to generate a classification program.
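The feature-vector encoding of states described above can be sketched as follows. This is an illustrative helper in the spirit of the description, not part of the actual DPlan or CAL2 code; the function name is hypothetical:

```lisp
;; Hypothetical sketch (not part of DPlan/CAL2): encode a state, given as
;; a set of literals, as a 0/1 feature vector over the set of all literals
;; occurring in any state of the domain.
(defun state->feature-vector (state all-literals)
  "Return a list with 1 for every literal of ALL-LITERALS present in STATE, else 0."
  (mapcar (lambda (lit) (if (member lit state :test #'equal) 1 0))
          all-literals))

;; Example with two literals as features:
;; (state->feature-vector '((on d1 d2)) '((on d1 d2) (clear d1)))
;; evaluates to (1 0)
```

Such vectors, paired with the optimal action sequence of their state, form the training examples for the decision tree algorithm.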

Transforming such a program into a finite program t for generalization-to-n involves the same problems as discussed in chapter 8, along with some additional problems: First, the decision tree does not necessarily represent an order over the number of transformations involved: it is possible that the first attribute already branches to a leaf representing the most complex transformation sequence. Second, there is no interleaving between actions and the conditions which have to be fulfilled before executing these actions, which is typical for recursion (see for example the finite program for unload-all in chap. 8). Instead, all conditions necessary to execute the complete transformation sequence are checked first (along a path in the decision tree) and the transformation sequence is executed completely afterwards. A possible way to split such compound actions is discussed in Wysotzki (1983).¹

A second forward-search approach is presented in (Briesemeister et al., 1996) (see sect. 2.5.2).

In 1998 we came up with the first version of DPlan as a state-based non-linear backward-planner constructing universal plans (implemented by Ute Schmid). This first algorithm, its formalization, and proofs of completeness and correctness are presented in (Schmid, 1999). DPlan1.0 works for STRIPS-like domain specifications extended to binary conditioned effects. The generated universal plan is a minimal spanning tree. That is, for domains with sets of optimal solutions only one alternative is calculated for each possible state. DPlan1.0 was extended over the last year by several people in several ways:

Domain specification in PDDL, STRIPS + general conditioned effects (see report of the student project "Extending DPlan To an ADL-subset" by Janin Toussaint, Ulrich Wagner, and Hakan Irican, 1999, http://ki.cs.tu-berlin.de/~schmid/mlpj2/ss99/mlpj2_99.html).

Constructing universal plans without a predefined set of states (see report of the student project "Universal Planning without state sets" by Michael Christmann and Stefan Ronnecke, 1999, URL as above).

Preliminary work on plan construction using control knowledge represented as recursive macros (see report of the student project "Planning with recursive macros", Mischa Neumann and Ralf Ansorg, 1999, URL as above).

Calculating DAGs instead of minimal spanning trees (Schmid, 1999).

Extending DPlan to function application, as reported in chapter 4. This work is based on a diploma thesis by Marina Müller, May 2000 (http://ki.cs.tu-berlin.de/projects/dipl.html).

Current work in progress is to extend DPlan to further features of PDDL, namely all-quantification and negation, and to investigate soundness and completeness of planning for these more general operator definitions (diploma thesis by Bernhard Wolf, Sept. 2000, http://ki.cs.tu-berlin.de/projects/dipl.html).

DPlan was extended (January 2001) to include recursive program schemes fulfilling a certain class of goals in the domain specification. If the top-level goals of a new planning problem match the goals of a recursive program scheme, this scheme is applied and an optimal transformation sequence is constructed directly, without planning. This extension was recently realized by Janin Toussaint as part of her diploma thesis on plan transformation for program synthesis (see below).

¹ This work is not documented in a paper. The documented program together with some examples can be obtained from Ute Schmid.


A.2 Modules of DPlan

DPlan is implemented in Common Lisp. The program together with example domains and further information can be obtained via http://ki.cs.tu-berlin.de/~schmid/IPAL/dplan.html.

dplan.lsp: main program. Construction of the universal plan, saved in plan-steps as a list of pstep-structures; recursive calculation of pre-images, as described in table 3.1.

plan-dstruc.lsd: global data structure pstep, see below.

ps-back.lsp: calculating the set of legal predecessors of a state (match and backward operator application). The function (apply-rules <current-state>) is called from dplan and returns a list of new psteps; the new psteps are pruned with respect to the current plan.

showplan.lsp: graphical and term output of plans. The function (show plan-steps) is called from dplan; graphs can be displayed with graphlet, trees can be displayed with xterm.

parse.lsp, selectors.lsp: handling the PDDL input (implemented by Marina Müller).

generate.lsp, range.lsp, update.lsp: handling update effects for planning with function application (implemented by Marina Müller).

A plan-step is a pair ⟨instop, child⟩. Additional information is given for the construction of graphical outputs and for plan transformation:

(defstruct pstep
  instop  ; instantiated operator, cf. puttable(A)
  parent  ; parent node: in backward planning successor of instop
          ; input in ps-back "state"
  child   ; child node: in backward planning predecessor of instop
          ; constructed in ps-back
  pre     ; instantiated preconditions of operator
          ; for conditioned operators: union of global and specific
          ; precondition
  add     ; instantiated add-list
  del     ; instantiated del-list
  nodeid  ; identifier
  level   ; level in the ms-dag
)
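The recursive pre-image calculation performed by dplan.lsp (cf. table 3.1) can be sketched roughly as follows. This is an illustrative reconstruction, not the actual DPlan code; pre-image-fn stands in for the backward operator application realized in ps-back.lsp, and the function name is hypothetical:

```lisp
;; Illustrative sketch of universal-plan construction by backward search:
;; starting from the goal states, repeatedly add all legal predecessor
;; states until a fixpoint is reached. PRE-IMAGE-FN is assumed to map a
;; list of states to the list of their legal predecessor states.
(defun build-universal-plan (goal-states pre-image-fn)
  (let ((states goal-states)    ; all states covered by the plan so far
        (frontier goal-states)) ; states whose pre-images are still pending
    (loop while frontier do
      (let ((new (set-difference (funcall pre-image-fn frontier)
                                 states :test #'equal)))
        (setf states (union states new :test #'equal))
        (setf frontier new)))
    states))
```

In the actual implementation each backward step additionally records a pstep structure, so that the result is a minimal spanning tree (or DAG) over the state set rather than a bare set of states.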

A.3 DPlan Specifications

DPlan specifications are given in standard PDDL. For a complete description of the syntax of PDDL see (McDermott, 1998b). We give a Tower of Hanoi problem as example:


(define (domain disk-world-domain)
  (:action move
    :parameters (?d ?x ?y)
    :precondition (and (on ?d ?x) (clear ?d) (clear ?y) (smaller ?d ?y))
    :effect
      ((and ((on ?d ?y) (clear ?x))
            (not ((on ?d ?x) (clear ?y)))
      ))
  ))

(define (problem towers-of-hanoi)
  :domain 'disk-world-domain
  :goal ((on d3 p1) (on d2 d3) (on d1 d2))
)

(define (states states-in-disk-world-domain) ; domain restriction
  :states ; complete goal state
  ( ((on d3 p3) (on d2 d3) (on d1 d2) (clear d1) (clear p2) (clear p3)
     (smaller d1 p1) (smaller d2 p1) (smaller d3 p1) (smaller d4 p1)
     (smaller d1 p2) (smaller d2 p2) (smaller d3 p2) (smaller d4 p2)
     (smaller d1 p3) (smaller d2 p3) (smaller d3 p3) (smaller d4 p3)
     (smaller d1 d2) (smaller d1 d3) (smaller d2 d3) (smaller d1 d4)
     (smaller d2 d4) (smaller d3 d4))
  ))

For planning with update effects we extended PDDL as described in chapter 4. An example of a Tower of Hanoi problem is:

(define (domain disk-world-domain)
  (:action move
    :parameters (?x ?y)
    :precondition ()
    :effect
      ((change (?L2 in (on ?y ?L2) (to (cdr ?L2)))
               (?L1 in (on ?x ?L1) (to (cons-car ?L1 ?L2)))
      ))
    :post (((on ?x ?L1) (on ?y ?L2))
           (f1 ?L2)       ; L2 is not empty
           (f6 ?L1 ?L2))  ; (car L2) < (car L1)
  ))

(defun cons-car (l1 l2)
  ;(print `(cons-car ,l1 ,l2 ,(cons (car l2) l1)))
  (cons (car l2) l1))

(defun f6 (l1 l2)
  ;(print `(f6 ,l1 ,l2))
  (cond ((null l1) T)
        (T (> (car l1) (car l2)))
  ))

(define (problem towers-of-hanoi)
  :domain 'disk-world-domain
  :goal ((on p1 [ ]) (on p2 [ ]) (on p3 [ 1 2 3 ]))
)

A.4 Development of Folding Algorithms

Starting point for our development of a folding algorithm was the approach by Summers (1977) and its extension from a subset of Lisp to term algebras by Wysotzki (1983). Both approaches rely on the identification of differences between pairs of traces in a given finite program. Since the original algorithm of Summers was no longer available and there did not exist an implementation of Wysotzki's approach, we started from scratch:

In the beginning (1997), a simple string pattern-matcher was realized in Common Lisp by Imre Szabo in his diploma thesis (http://ki.cs.tu-berlin.de/projects/dipl.html). This algorithm only worked for detecting linear recursions with independent variable substitutions in the recursive call.

At the same time, an algorithm based on the identification of unifiable sub-terms of a given finite program term was implemented in Common Lisp by Dirk Matzke in his diploma thesis (http://ki.cs.tu-berlin.de/projects/dipl.html). He used the tree-transformation algorithm of Lu (1979), which was originally implemented in a student project where it was applied to matching of different finite programs in the context of programming by analogy (Schadler, Scheffer, Schmid, & Wysotzki, 1995).

In a student proje t in 1997, Martin Mhlpfordt and Markus Jurinka worked

on a formalization of folding based on indu tion of ontext-free tree gram-

mars (Muhlpfordt & S hmid, 1998) and Martin Mhlpfordt implemented

a pattern mat her in Prolog based on this formalization. This implemen-

tation works already for re ursive equations realizing arbitrary re ursion

forms (linear, tree, ombinations) and for interdependent variable substi-

tutions.

Heike Pis h (2000) provided an eÆ ient implementation of this approa h in

her diploma thesis (http://ki. s.tu-berlin.de/proje ts/dipl.html),

using a dierent approa h for generating valid hypotheses for segmenta-

tions.

The pre eding two proje ts were the starting point for the powerful folding

algorithm realized in Common Lisp by Martin Mhlpfordt in his diploma

thesis 2000 (http://ki. s.tu-berlin.de/proje ts/dipl.html) whi h

is presented in detail in hapter 7: the original approa h was extended to

RPSs with sets of re ursive equations and to the identi ation of hidden

variables in substitution terms.


A.5 Modules of TFold

TFold is implemented in Common Lisp (by Martin Mühlpfordt). The program together with example domains and further information can be obtained via http://ki.cs.tu-berlin.de/~schmid/IPAL/fold.html.

- folder.lsp: main program. The outer control loop for search, as given in figure 7.13 in chapter 7.
  Main function: search-RPS, with an initial tree as input and an RPS or nil as output. First, it is tried to find a sub-program for the tree; if this fails, it is tried to find an RPS for each argument of the top-level symbol of the tree.
  Function: search-SubProg-Backtrack implements the search for a sub-program by backtracking, as described in figure 7.12 in chapter 7. The idea is to incrementally build skeletons of the subprogram body until a valid segmentation is found (a) which can be extended to the body and (b) such that the resulting instances of the variables can be explained by substitutions.
  Some test functions (converting, folding, and printing the result).
  The initial tree is first converted into a special representation: all (maximal) sub-trees which do not contain "Omegas" are converted to a vector with 2 elements.
- base-fkt-rps.lsp: functions for constructing an RPS from the induced components (signature, head, parameters, body, substitutions).
- base-fkt-tree.lsp: functions for searching and partitioning trees, anti-unification of terms.
- skeleton.lsp: constructing a skeleton and checking whether it is valid.
- substitutions.lsp: calculating the substitutions for variables in recursive calls.
- term2gml.lsp: transformation of terms into gml format for graphlet, which provides a graphical output.
- read-gml.lsp: transformation of gml-represented terms into terms as lists of lists.
- pretty-rps-print.lsp: functions for the output of RPSs.
- unfolder.lsp: unfolding RPSs into finite programs.
- rps-example.lsp: collection of RPSs.

Currently, a graphical user interface (GUI) realized in Java for the folder is under construction.

RPSs are represented as a collection of sub-programs and a main program. Sub-programs are represented as structures. An example for ModList presented in chapter 7 is:

(defun ModList ()
  (setq Sub1 (make-sub :name 'ModList
                       :params '(n l)
                       :body '(if (empty l)
                                  nil
                                  (cons (if (eq0 (Mod n (head l)))
                                            true
                                            false)
                                        (ModList n (tail l))))
             ))
  (setq Sub2 (make-sub :name 'Mod
                       :params '(n k)
                       :body '(if (< n k) n (Mod (- n k) k))
             ))
  (make-rps :main '(ModList 8 (list 9 4 7))
            :sigma (list Sub1 Sub2)))
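To make the semantics of this RPS concrete, the two equations can be transcribed directly into an executable sketch (Python here for illustration; the function names are transliterations of the Lisp ones, not part of TFold):

```python
def mod(n, k):
    # Mod: if n < k then n else Mod(n - k, k)  (remainder by repeated subtraction)
    return n if n < k else mod(n - k, k)

def mod_list(n, l):
    # ModList: map each element x of l to the truth value of n mod x == 0
    return [mod(n, x) == 0 for x in l]

print(mod_list(8, [9, 4, 7]))  # main: (ModList 8 (list 9 4 7)) → [False, True, False]
```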

A.6 Time Effort of Folding

The time measures were performed on a SUN SPARC 20 workstation. For measurement, the Common Lisp time macro was used for unfolding, the complete folding, and the subprocesses calculating valid segmentations and determining substitutions. Because CPU-time measures are not overly precise, the average of three runs for each problem was calculated.

A.7 Main Components of Plan-Transformation

Plan transformation is work in progress. The current implementation was realized in Common Lisp by Ute Schmid, 2000 (plan-transform.lsp). Currently, the implementation is being extended, and the concept of linearization based on the assumption of sub-goal independence proposed in Wysotzki and Schmid (2001) is included by Janin Toussaint in her diploma thesis.

The main function of the current implementation is (plantransform <plan>):

Input: plan as list of psteps from dplan.lsp
(decompose plan): initial decomposition; generation and initialization of the global variables subplanlist (list of tplan structures) and planstruc (sub-plan structure as term of sub-plan names)
(intro-type subplanlist): data type inference for each sub-plan, successively filling the slots of tplan
(ptransform subplanlist): introducing situation variables and rewriting into a conditional expression
Output: transformation information as given in subplanlist and planstruc; tplan.term is passed to the generalization-to-n algorithm for each sub-plan; planstruc is extended from sub-plan names to arguments and rewritten into a "main" program

Global data structure:

; input from dplan.lsp is a plan saved in plan-steps as a list of psteps

; global variables
(setq subplanlist nil) ; list of all transformed subplans
                       ; (special case: one subplan)
(setq planstruc nil)   ; structure of global plan (subtrees replaced by
                       ; subplan names)
(setq mstlist nil)     ; list of possible minimal spanning trees
                       ; (backtrack-point for dags which are not sets!)
; .....................................................................
; new structure for transformed plan (one for each subplan)
; initial generation in decompose
(defstruct tplan
  pname    ; name of plan (decompose)
  suplan   ; plan as structure (plan-steps) (decompose)
  coplan   ; plan with data types and relevant predicates
  term     ; plan as term
  ptype    ; type of the plan (singleop, seq, set, list)
  newdat   ; *newly constructed data structure (a list/set of objects)
  newpred  ; *newly constructed predicate (if newdat =/= nil) as pattern
  npfct    ; *function definition for new predicate
  goalpred ; goal predicate (might be newpred)
  bottom   ; bottom element (might be newdat)
  passocs  ; const/constr rewrite pairs
  pafct    ; function definition for the rewriting
  pparams  ; input parameters
  pvalues  ; initial values
)
;; newdat, newpred, npfct are not filled for ptype = singleop and ptype = set
;; newpred: p* (... CO ...) with CO as place-holder for a complex object,
;;          maybe additional constant arguments
;;          f.e.: (at* CO B) for rocket
;; bottom == newdat (-> newdat might be superfluous)

A.8 Plan Decomposition

Please note: Extracts from the program are given in an abbreviated pseudo-Lisp notation, omitting implementation details!

; call decompose with complete plan and level = 0 (root)
(decompose plan lv) ==
  (mapcar λx.(r-dec-p (get-first-op plan lv) x lv) (partition plan))

(get-first-op plan lv) ==
  get the first operator-symbol at the upper-most level of the plan

(partition plan) == for each root of ``plan'', return its subplan

(r-dec-p op plan lv) ==
  (let ((dlv (disag-level op plan))
        (pname (gensym "P")))
    (cond ((not dlv) (save-subplan pname plan)) ; single plan
          ((> dlv lv) (save-subplan pname (get-suplan dlv plan)) ; subplan
                      (decompose (append (make-root (1- dlv) plan)
                                         (rest-suplan dlv plan))
                                 dlv)
          )
          (T ; (= dlv lv) ; different operators at one level
             (save-subplan pname plan) ; currently not treated by decomposition
          )
    ))

(disag-level op plan) ==
  returns the level in ``plan'' at which an operator-name different
  from ``op'' appears for the first time

(save-subplan pname plan) ==
  generate a new tplan structure and instantiate the slot pname with
  ``pname'' and suplan with ``plan'' (see section A.7 for the tplan structure)

(get-suplan dlv plan) ==
  return a partial plan from the current top-level to level ``dlv''

(make-root lv plan) == get the leaf of the newly generated subplan

(rest-suplan dlv plan) ==
  return the remaining plan with level ``dlv'' as new top-level

A.9 Introduction of Situation Variables

For each subplan:

(rewrite-to-term coplan) ==
  (list 'if (intro-s (first-state coplan))
        's (r-rewrite (rest coplan)) )

(intro-s e) == extend e by an argument ``s''

(r-rewrite coplan) ==
  (cond ((null coplan) 'omega)
        (T (append (intro-s (first-op coplan))
                   (list (list 'if
                               (intro-s (first-state coplan))
                               's (r-rewrite (rest coplan)) )))))
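Under the assumption that a sub-plan is an alternating list of states and operators, the rewriting can be sketched executably as follows (a Python transliteration of the pseudo-Lisp above; the list-of-lists term representation is an illustrative choice, not the system's own):

```python
def intro_s(expr):
    # extend an operator or state expression by a situation argument "s"
    return list(expr) + ['s']

def r_rewrite(oplan):
    # oplan alternates operators and states; the empty plan rewrites to omega
    if not oplan:
        return 'omega'
    op, state, rest = oplan[0], oplan[1], oplan[2:]
    return intro_s(op) + [['if', intro_s(state), 's', r_rewrite(rest)]]

def rewrite_to_term(oplan):
    # top level: test the goal state first
    return ['if', intro_s(oplan[0]), 's', r_rewrite(oplan[1:])]

print(rewrite_to_term([['sorted'], ['swap', 'a', 'b'], ['unsorted']]))
```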

A.10 Number of MSTs in a DAG

To calculate the number of minimal spanning trees in a DAG, we can use the formula ∏_{i=0}^{n} c_i, multiplying the numbers of alternative choices c_i on each level i. For the sorting of three elements, we have: 1 · 1 · 9 = 9, with a single option for the root and the first level and 9 different options for the third level. To calculate the number of options on a level, the connective structure of the DAG has to be known. The different alternatives on one level can be calculated as described for the construction of minimal spanning trees. For sorting lists with 4 elements we have: 1 · 1 · 52488 · 46656 = 2,448,880,128!

To omit the explicit calculation of options per level, an upper-bound estimate is c_i = a_i^{n_i}, with a_i as the number of arcs from level i−1 to level i and n_i as the number of nodes on level i.

We cannot provide a formula for calculating the proportion of regularizable trees to all trees, because regularizability depends on the identity of edge-labels, which varies between domains.
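Both counts can be checked in a few lines (an illustrative sketch: the per-level choice numbers are taken from the text above, and the upper-bound formula c_i = a_i^{n_i} is the reconstruction given there):

```python
from math import prod

def mst_count(choices):
    # exact count: product of the alternative choices c_i per level
    return prod(choices)

def mst_upper_bound(arcs, nodes):
    # estimate: c_i <= a_i ** n_i, with a_i arcs entering level i
    # and n_i nodes on level i
    return prod(a ** n for a, n in zip(arcs, nodes))

print(mst_count([1, 1, 9]))             # sorting three elements → 9
print(mst_count([1, 1, 52488, 46656]))  # sorting four elements → 2448880128
```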

A.11 Extracting Minimal Spanning Trees from a DAG

; lst is tree until lv (initially: root) ; sp is all plan-steps on levels > lv
(defun extract-trees (lst sp lv)
  (let* ((tvec (number-of-msts 1 sp (1+ lv)))
         (tcnt (reduce '* tvec)))
    (print `(There are * ,tvec = ,tcnt minimal spanning trees))
    (fresh-line)
    (cond ((> tcnt 576) (write-string "How many trees? <number> ")
                        (setq k (read-number))
                        (fresh-line)
                        (extract-msts (list lst) sp (1+ lv) k))
          (T (write-string "Generate all alternatives")
             (fresh-line)
             (extract-msts (list lst) sp (1+ lv) tcnt))
    )))

(defun extract-msts (lst sp lv k)
  (print `(Include next level ,lv))
  (let ((lp (remove-if #'(lambda (x) (/= lv (pstep-level x))) sp))
        (rp (remove-if #'(lambda (x) (= lv (pstep-level x))) sp)))
    (cond ((null lp) nil)
          ((null rp) (first-k k (lift
                       (mapcar #'(lambda (x) (pmerge lst x)) (all-combs lp k)))))
          ; in the last step this is only a "throw-away" of
          ; already calculated trees!
          (T (extract-msts (first-k k
                             (lift (mapcar #'(lambda (x) (pmerge lst x))
                                           (all-combs lp k))))
                           rp (1+ lv) k)
          ))) )

(defun pmerge (lst e)
  (cond ((null lst) (list e)) ; should never occur
        ((flatlst lst) (join lst e))
        (T (mapcar #'(lambda (x) (join x e)) lst))
  ))

(defun all-combs (lp k)
  (let* ((dc (make-sset (mapcar #'(lambda (x) (pstep-child x)) lp)))
         (lcs (child-split lp dc k))
         (tlcs (mapcar #'(lambda (x) (generate-trees x (1- (length x)))) lcs)))
    (cart-product (car tlcs) (cdr tlcs) k)
  ))

(defun child-split (lp dc k)
  (cond ((null dc) nil)
        (T (cons (first-k k (remove-if #'(lambda (x)
                   (not (setequal (pstep-child x) (car dc)))) lp) )
                 (child-split lp (cdr dc) k)))
  ))

(defun generate-trees (lp nt)
  (cond ((< nt 0) nil)
        (T (setq cur-lp (copy-plan lp))
           (cons (nth nt cur-lp)
                 (generate-trees lp (1- nt))))
  ))

(defun cart-product (fst rst k)
  (cond ((null rst) fst)
        (T (cart-product (first-k k (comb fst (car rst))) (cdr rst) k))
  ))

(defun comb (f r)
  (cond ((null f) nil)
        (T (append (mapcar #'(lambda (x) (join (car f) x)) r)
                   (comb (cdr f) r)))
  ))

A.12 Regularizing a Tree

(defun regularize-tree (mst lv)
  (let ((cur (remove-if #'(lambda (y) (/= lv (pstep-level y))) mst))
        (nxt (remove-if #'(lambda (y) (/= (1+ lv) (pstep-level y))) mst))
       )
    (cond ((null cur) nil)
          ((null nxt) mst)
          (T (let ((opsetcl (mapcar #'(lambda (x) (pstep-instop x)) cur))
                   (opsetnl (mapcar #'(lambda (x) (pstep-instop x)) nxt)))
               (cond ((subsetp opsetnl opsetcl :test 'equal)
                      (regularize-tree (smerge (lv-shift cur opsetnl)
                                               mst) (1+ lv)))
                     (T (regularize-tree mst (1+ lv)))
               ))) ) ) )

(defun lv-shift (cur opsetnl)
  (cond ((null cur) nil)
        ((member (pstep-instop (car cur)) opsetnl :test 'equal)
         (setf nc (copy-pstep (car cur)))
         (setf (pstep-parent nc)
               (id-insert (pstep-parent (car cur))))
         (setf (pstep-level nc) (1+ (pstep-level (car cur))))
         (cons
           (make-pstep
             :instop 'id
             :parent (pstep-parent (car cur))
             :child  (id-insert (pstep-parent (car cur)))
             :nodeid (+ 1000 (pstep-nodeid (car cur)))
             :level  (pstep-level (car cur))
           )
           (cons nc (lv-shift (cdr cur) opsetnl)))
        )
        (T (cons (car cur) (lv-shift (cdr cur) opsetnl)))
  ))

(defun id-insert (s)
  (cond ((null s) nil)
        ((idlist (car s)) (cons (cons 'id (car s)) (cdr s)))
        (T (cons '(id) s))
  ))

(defun idlist (l)
  (cond ((null l) T)
        ((equal 'id (car l)) (idlist (cdr l)))
        (T nil)
  ))

; if pstep-child of new is equal to a pstep-child in old -> keep new
; (has higher level)
; if pstep-child without "idlist" is equal to a pstep-child in old
; and both are on the same level -> keep old (the one which has
; no or a shorter "idlist")
; ==> the parent node for the nodes in nw with pstep-child of new as
;     parent has to be set to the "old" parent!
;     these nodes are still in new because of the sequence of
;     node construction in lv-shift!
(defun smerge (nw old)
  (cond ((null nw) old)
        ((member (car nw) old :test 'node-equal)
         (smerge (cdr nw) (cons (car nw)
                   (remove-if #'(lambda (x) (node-equal
                                  (car nw) x)) old))))
        ((member (car nw) old :test 'level-equal)
         (smerge (old-parent (car nw) (cdr nw) old) old))
        (T (smerge (cdr nw) (cons (car nw) old)))
  ))

(defun node-equal (s1 s2)
  (and (= (length (find-if #'(lambda (x) (idlist x)) (pstep-child s1)))
          (length (find-if #'(lambda (x) (idlist x)) (pstep-child s2)))
       )
       (setequal (remove-if #'(lambda (x) (idlist x)) (pstep-child s1))
                 (remove-if #'(lambda (x) (idlist x)) (pstep-child s2))
       )) )

(defun level-equal (s1 s2)
  (and (setequal (remove-if #'(lambda (x) (idlist x)) (pstep-child s1))
                 (remove-if #'(lambda (x) (idlist x)) (pstep-child s2))
       )
       (= (pstep-level s1) (pstep-level s2))
  ))

; if smerge keeps the old state, then the new state with cld as parent
; has to keep the corresponding old parent
(defun old-parent (cld sl old)
  (cond ((null sl) nil)
        ((setequal (pstep-child cld) (pstep-parent (car sl)))
         (cons (update-par (copy-pstep (car sl))
                 (pstep-child (find-if #'(lambda (x)
                   (level-equal cld x)) old))) (cdr sl)))
        (T (old-parent cld (cdr sl) old))
  ))

(defun update-par (sl ud) (setf (pstep-parent sl) ud) sl)

A.13 Programming by Analogy Algorithms

In contrast to other approaches to programming by analogy (in computer science) and analogical problem solving (in cognitive science), our approach deals uniformly with analogy and abstraction: Recursive program schemes can be defined over a signature consisting of a set of primitive functions with fixed interpretation, or over a signature containing (second-order) function variables. Analogy is realized by mapping concrete programs, while abstraction is realized by mapping a second-order scheme with a concrete program. We use the same similarity criterion for retrieval and mapping.

Originally, we investigated tree-transformation; currently we are investigating anti-unification as an approach to mapping. Tree-transformation returns a quantitative measure of structural similarity based on the necessary number of substitutions, deletions, and insertions to make a source program identical to a target candidate. Tree-transformation is introduced in chapter 10 as an example of a measure of structural similarity. Anti-unification is described in chapter 12.

We did some preliminary work on the adaptation of non-isomorphic programs. Again in contrast to other approaches, our focus is on generalization over programs, that is, on constructing a hierarchy of recursive program schemes. An overview is given at http://ki.cs.tu-berlin.de/~schmid/IPAL/aprog.html.

Several aspects of programming by analogy and analogical problem solving were investigated:

- Memory organization and retrieval using tree-transformation was investigated by Mark Müller in his diploma thesis (Aug. 1997, http://ki.cs.tu-berlin.de/projects/dipl.html). He used hierarchical cluster analysis to build a hierarchical memory and could show that memory organization resulted in a grouping of programs sharing recursive structures (tail recursion, linear recursion, tree recursion).

- In two psychological experiments, Knuth Polkehn and Joachim Wirth investigated the use of non-isomorphic base problems by human problem solvers (Sept. 1997 and May 1998, http://ki.cs.tu-berlin.de/projects/dipl.html). They could show that human problem solvers can solve new problems if base and target have a structural overlap of at least fifty percent. For our automatic programming system, we take this result as a hint that non-isomorphic sources should be considered for adaptation. But we need a (qualitative) criterion for deciding whether the source candidate is "similar enough" to prefer adaptation over folding from scratch.

- Adaptation of program schemes to new problems was researched by Rene Mercy in his diploma thesis (April 1998, http://ki.cs.tu-berlin.de/projects/dipl.html). He used tree-transformation to construct a mapping. If the mapping only contains unique substitutions, the programs are isomorphic and adaptation is realized simply by replacing the symbols in the base program by the symbols they are mapped upon in the target. For non-unique substitutions, deletions, and insertions, context in the program tree is used to determine the adaptation of the source program (see chapter 12).

- If base and target share some similarity but adaptation fails, it could be tried to rewrite the base problem to obtain a better mapping. That is, structure mapping is enriched by a background theory. This idea is currently discussed in cognitive science as "re-representation", and Martin Mühlpfordt investigated re-representation in a psychological experiment in his diploma thesis (June 1999, http://ki.cs.tu-berlin.de/projects/dipl.html).

- In 2000 we replaced tree-transformation by (second-order) anti-unification as the backbone of programming by analogy. Thereby we gained a parameter-free, qualitative approach to structural similarity, which furthermore allowed dealing with mapping and generalization in a uniform way. Adaptation can be modeled by the inverse process, that is, (second-order) unification. Mapping and generalization was researched by Uwe Sinha in his diploma thesis (Aug. 2000, http://ki.cs.tu-berlin.de/projects/dipl.html) and is described in chapter 12.

Currently, Ulrich Wagner is investigating retrieval and adaptation.

B. Concepts and Proofs

B.1 Fixpoint Semantics

Fixpoint theory in general concerns the following question: Given an ordered set P and an order-preserving map Φ: P → P, does there exist a fixpoint for Φ? A point x ∈ P is called a fixpoint if Φ(x) = x.

The following introduction of fixpoint semantics closely follows Davey and Priestley (1990). We first introduce some basic concepts and notations. Then we present the fixpoint theorem for complete partial orders. Finally, we give an illustration for the factorial function.

Definition B.1.1 (Map). (Davey & Priestley, 1990, 1.6) Let X be a set and consider a map f: X → X. f assigns a member f(x) ∈ X to each member x ∈ X and is determined by its graph, graph f = {(x, f(x)) | x ∈ X} with graph f ⊆ X × X.

A partial map is a map σ: S → X where S ⊆ X. With dom σ = S we denote the domain of σ. The set of partial maps on X is denoted (X ⇸ X) and is ordered in the following way: given σ, τ ∈ (X ⇸ X), define σ ⊑ τ iff dom σ ⊆ dom τ and σ(x) = τ(x) for all x ∈ dom σ. A subset G of X × X is the graph of a partial map iff ∀s ∈ X: (s, x) ∈ G and (s, x′) ∈ G ⇒ x = x′.

Definition B.1.2 (Order-Preserving Map). (Davey & Priestley, 1990, 1.11) Let P and Q be ordered sets. A map φ: P → Q is order-preserving (or monotone) if x ≤ y in P implies φ(x) ≤ φ(y) in Q.

Definition B.1.3 (Bottom Element). (Davey & Priestley, 1990, 1.23, 1.24) Let P be an ordered set. The least element of P, if such exists, is called the bottom element and denoted ⊥.

Given an ordered set P (with or without ⊥), we form P⊥ (called "P lifted") as follows: Take an element 0 ∉ P, construct P⊥ = P ∪ {0} and define ≤ on P⊥ by x ≤ y iff x = 0 or x ≤ y in P.

Definition B.1.4 (Order-Isomorphism between Partial and Strict Maps). (Davey & Priestley, 1990, 1.29) With each σ ∈ (S ⇸ S) we associate a map ι(σ): S → S⊥, given by ι(σ) = σ⊥ where

    σ⊥(x) = σ(x) if x ∈ dom σ, and 0 otherwise.

Thus ι is a map from (S ⇸ S) to (S → S⊥). We have used the extra element 0 to convert a partial map on S into a total map. ι sets up an order-isomorphism between (S ⇸ S) and (S → S⊥) (Davey & Priestley, 1990, p. 21).

Definition B.1.5 (Supremum). (Davey & Priestley, 1990, 2.1) Let P be an ordered set and let S ⊆ P. An element x ∈ P is an upper bound of S if s ≤ x for all s ∈ S. x is called the least upper bound of S if

1. x is an upper bound of S, and
2. x ≤ y for all upper bounds y of S.

Since least elements are unique, the least upper bound is unique if it exists. The least upper bound is also called the supremum of S and is denoted sup S, or ⋁S ("join of S"), or ⨆S if S is directed.

Definition B.1.6 (CPO). (Davey & Priestley, 1990, 3.9) An ordered set P is a CPO (complete partial order) if

1. P has a bottom element ⊥,
2. sup D exists for each directed subset D of P.

The simplest example of a directed set is a chain, such as 0 ≤ succ(0) ≤ succ(succ(0)) ≤ succ(succ(succ(0))) ≤ … Fixpoint theorems are based on ascending chains.

Definition B.1.7 (Continuous and Strict Maps). (Davey & Priestley, 1990, 3.14) A map φ: P → Q is continuous if for every directed set D in P: φ(⨆D) = ⨆φ(D). Every continuous map is order-preserving (Davey & Priestley, 1990, 3.15).

A map φ: P → Q such that φ(⊥) = ⊥ is called strict.

Definition B.1.8 (Fixpoint). (Davey & Priestley, 1990, 4.4) Let P be an ordered set and let Φ: P → P be a map. We say x ∈ P is a fixpoint of Φ if Φ(x) = x. The set of all fixpoints of Φ is denoted by fix(Φ); it carries the induced order. The least element of fix(Φ), when it exists, is denoted by μ(Φ).

The n-fold composite, Φⁿ, of a map Φ: P → P is defined as follows: Φⁿ is the identity map if n = 0 and Φⁿ = Φ ∘ Φⁿ⁻¹ for n ≥ 1. If Φ is order-preserving, so is Φⁿ.

Theorem B.1.1 (CPO Fixpoint). (Davey & Priestley, 1990, 4.5) Let P be a complete partial order (CPO), let Φ: P → P be an order-preserving map and define α = ⨆_{n≥0} Φⁿ(⊥).

1. If α ∈ fix(Φ), then α = μ(Φ).
2. If Φ is continuous then the least fixpoint μ(Φ) exists and equals α.

Proof (CPO Fixpoint). (Davey & Priestley, 1990, 4.5)

1. Certainly ⊥ ≤ Φ(⊥). Applying the order-preserving map Φⁿ, we have Φⁿ(⊥) ≤ Φⁿ⁺¹(⊥) for all n. Hence we have a chain ⊥ ≤ Φ(⊥) ≤ … ≤ Φⁿ(⊥) ≤ Φⁿ⁺¹(⊥) ≤ … in P. Since P is a CPO, α = ⨆_{n≥0} Φⁿ(⊥) exists. Let β be any fixpoint of Φ. By induction, Φⁿ(β) = β for all n. We have ⊥ ≤ β, hence we obtain Φⁿ(⊥) ≤ Φⁿ(β) = β by applying Φⁿ. The definition of α forces α ≤ β. Hence if α is a fixpoint then it is the least fixpoint.

2. It will be enough to show that α ∈ fix(Φ). We have

       Φ(⨆_{n≥0} Φⁿ(⊥)) = ⨆_{n≥0} Φ(Φⁿ(⊥))   (since Φ is continuous)
                        = ⨆_{n≥1} Φⁿ(⊥)
                        = ⨆_{n≥0} Φⁿ(⊥)       (since ⊥ ≤ Φⁿ(⊥) for all n).

Consider the factorial function facu(x) = x! with its recursive definition:

    facu(x) = 1                 if x = 0
              x · facu(x − 1)   if x ≥ 1.

To each map f: ℕ₀ → ℕ₀ we may associate a new map f̃ given by:

    f̃(x) = 1               if x = 0
           x · f(x − 1)    if x ≥ 1.

The equation satisfied by facu can be reformulated as Φ(f) = f, where Φ(f) = f̃. The entire factorial function cannot be unwound from its recursive specification in a finite number of steps. However we can, for each n = 0, 1, …, determine in a finite number of steps the partial map f_n which is a restriction of facu to {0, 1, …, n}. The graph of f_n is {(0, 1), (1, 1), …, (n, n!)}. Therefore, to accommodate approximations of facu we can consider all partial maps on ℕ₀ and regard f̃ as having the domain {0} ∪ {k | k ≥ 1, k − 1 ∈ dom f}. Similarly, we can work with all maps from ℕ₀ to (ℕ₀)⊥ and take f̃(k) = ⊥ precisely when f(k − 1) = ⊥. The factorial function can be regarded as a solution of a recursive equation Φ(f) = f, where f ∈ (ℕ₀ ⇸ ℕ₀), or equivalently can be regarded as a solution of a recursive equation Φ(f) = f, where f ∈ (ℕ₀ → (ℕ₀)⊥):

    Φ(f)(x) = 1              if x = 0
              x · f(x − 1)   if x ≥ 1.

The domain P of Φ as given in the fixpoint theorem above is (ℕ₀ ⇸ ℕ₀). That is, Φ maps partial maps to partial maps. The bottom element of (ℕ₀ ⇸ ℕ₀) is ∅. This corresponds to that map in (ℕ₀ → (ℕ₀)⊥) which sends every element to ⊥. We denote this map by ⊥.

It is obvious that Φ is order-preserving: We have graph Φ(⊥) = {(0, 1)}, graph Φ²(⊥) = {(0, 1), (1, 1)} and so on. An easy induction confirms that f_n = Φⁿ(⊥) for all n, where {f_n} is the sequence of partial maps ("Kleene sequence") which approximates facu. Taking the union of all graphs, we see that ⨆_{n≥0} Φⁿ(⊥) is the map facu: x ↦ x! on ℕ₀. This is the least fixpoint of Φ in (ℕ₀ ⇸ ℕ₀).
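The Kleene sequence can be computed mechanically by representing each partial map by its graph, here as a Python dictionary (an illustrative sketch of the construction above, not part of the book's system):

```python
def phi(f):
    # one application of Phi; f is the graph of a partial map N0 -> N0
    g = {0: 1}                  # Phi(f)(0) = 1
    for k, v in f.items():
        g[k + 1] = (k + 1) * v  # Phi(f)(x) = x * f(x - 1) for x >= 1
    return g

def kleene(n):
    # the n-th element Phi^n(bottom) of the Kleene sequence
    f = {}                      # bottom: the empty partial map
    for _ in range(n):
        f = phi(f)
    return f

print(kleene(1))  # graph Phi(bottom)   = {0: 1}
print(kleene(2))  # graph Phi^2(bottom) = {0: 1, 1: 1}
print(kleene(5))  # {0: 1, 1: 1, 2: 2, 3: 6, 4: 24}
```

Taking n → ∞, the union of these graphs is exactly the factorial function, matching the least-fixpoint construction.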

B.2 Proof: Maximal Subprogram Body

Theorem 7.3.1 states that if a recursive subprogram G exists for a set of initial trees T_init = {t_e | e ∈ E} which can generate all t_e ∈ T_init for a given initial instantiation β_e, then there also exists a subprogram G′ such that the instantiations (over all segments and all examples) do not share a non-trivial pattern (a common prefix). Therefore, it is feasible to generate a hypothesis about the subprogram body for a given segmentation with recursion points U_rec by calculating the maximal pattern of the segments.

Let U_rec = pos(t_G, G) be the set of recursion points in G. Because G is a recursive subprogram, U_rec ≠ ∅ holds. Let U_rec be indexed over R = {1, …, |U_rec|} and let W be the set of unfolding indices over R (def. 7.1.28). Let the body of the subprogram G be t̂_G = t_G[U_rec ← G]. That is, the recursive calls are replaced by the function variable G with arity 0, and therefore, the substitutions in the recursive calls are not given in t̂_G. Let sub: X × R → T_Σ(X) be the substitution term of variable x_i ∈ X in the r-th recursive call (r ∈ R) with sub(x_i, r) = t_G|_{u_r ∘ i} (def. 7.1.27).

For better readability we introduce unfolding indices w ∈ W and examples e ∈ E as parameters of the instantiation ι: X × W × E → T_Σ. We will use variable names with indices to indicate their position. If the position is not relevant, variables are written without indices.

The instantiation ι: X × W × E → T_Σ of a variable x ∈ X in an unfolding w ∈ W in example e ∈ E is defined by the initial instantiation of this variable and by the substitution terms for this variable. In the first unfolding w = ε, ι is the initial instantiation for the given example. Further instantiations can be generated by substitutions:

Definition B.2.1 (Generating Instantiations). From the instantiations in unfolding w of an example e, the instantiation in an unfolding w ∘ r can be generated by application of the substitution:

1. ι(x, ε, e) = β_e(x).
2. For all w ∈ W, r ∈ R, e ∈ E:
   ι(x, w ∘ r, e) = sub(x, r){x₁ ← ι(x₁, w, e), …, x_n ← ι(x_n, w, e)}.
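Definition B.2.1 can be executed directly on term representations. The following sketch uses hypothetical names and represents terms as nested tuples with variables as strings; it computes ι(x, w, e) by recursing on the unfolding index:

```python
def apply_subst(term, env):
    # replace every variable (a string) occurring in term by its value in env
    if isinstance(term, str):
        return env.get(term, term)
    return tuple(apply_subst(t, env) for t in term)

def inst(x, w, beta, sub, variables):
    # iota(x, epsilon, e) = beta(x); iota(x, w∘r, e) applies sub(x, r)
    # to the instantiations of all variables in unfolding w
    if not w:
        return beta[x]
    v, r = w[:-1], w[-1]
    env = {y: inst(y, v, beta, sub, variables) for y in variables}
    return apply_subst(sub[(x, r)], env)

beta = {'x': ('a',)}          # initial instantiation beta_e
sub = {('x', 1): ('s', 'x')}  # sub(x, 1) = s(x)
print(inst('x', (1, 1), beta, sub, ['x']))  # → ('s', ('s', ('a',)))
```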

There can exist substitutions for x with sub(x, r) ∈ X.

Lemma B.2.1 (Identical Instantiations). If for x ∈ X there exists a substitution sub(x, r) = x′, r ∈ R, x′ ∈ X, then it holds for all e ∈ E: ∀w ∈ W ∃v ∈ W with ι(x′, w, e) = ι(x, v, e).

Proof (Identical Instantiations). If for an r ∈ R it holds that sub(x, r) = x′ with x′ ∈ X, then it follows for all e ∈ E and w ∈ W:

    ι(x, w ∘ r, e) = sub(x, r){x₁ ← ι(x₁, w, e), …, x_n ← ι(x_n, w, e)}
                   = x′{x₁ ← ι(x₁, w, e), …, x_n ← ι(x_n, w, e)}
                   = ι(x′, w, e),

and therefore for each e ∈ E and w ∈ W there exists v ∈ W (v = w ∘ r) such that ι(x′, w, e) = ι(x, v, e).

From lemma B.2.1 follows: If a substitution of x is a replacement of x by another variable x′, then in each example the instantiation of x′ is also an instantiation of x. In general, for x there can exist several substitutions where x is replaced by x′, and additionally, for x′ there can be further substitutions sub(x′, r) ∈ X. We define the set of instantiation-generating variables for x and generalize lemma B.2.1.

Definition B.2.2 (Instantiation Generating Variables). We define X(x) as the minimal set of variables which directly generate instantiations of x, for which holds:

1. x ∈ X(x).
2. If x′ ∈ X(x) then {sub(x′, r) | r ∈ R, sub(x′, r) ∈ X} ⊆ X(x).

Lemma B.2.2 (Continuation of Identical Instantiations). Let X(x) be the set of variables which directly generate instantiations of x. For all x′ ∈ X(x) and all e ∈ E holds: ∀w ∈ W ∃v ∈ W with ι(x′, w, e) = ι(x, v, e).

Proof (Continuation of Identical Instantiations). For the given definition of X(x) it must be shown that (1) lemma B.2.2 holds for x, and (2) if x′ ∈ X(x) then lemma B.2.2 holds for all x″ ∈ {sub(x′, r) | r ∈ R, sub(x′, r) ∈ X}.

1. For x ∈ X(x) holds: for all e ∈ E and w ∈ W there exists v ∈ W with v = w such that ι(x, w, e) = ι(x, v, e).
2. Let x′ ∈ X(x). By lemma B.2.1 it holds for each x″ ∈ {sub(x′, r) | r ∈ R, sub(x′, r) ∈ X} that for each e ∈ E and w ∈ W there exists a v′ ∈ W with ι(x″, w, e) = ι(x′, v′, e). Because x′ ∈ X(x), for each ι(x′, v′, e) there must exist a v ∈ W with ι(x′, v′, e) = ι(x, v, e). Therefore, for each e ∈ E and w ∈ W there exists v ∈ W with ι(x″, w, e) = ι(x, v, e).

From lemma B.2.2 follows: In each example, each instantiation of a variable from X(x) is an instantiation of x. If there exists a non-trivial pattern p of instantiations of x, then it must hold that p ⊑ ι(x, w, e) for all w ∈ W and e ∈ E. Based on lemma B.2.2, for all x′ ∈ X(x) holds:

Definition B.2.3 (Non-Trivial Pattern of an Instantiation). For all x′ ∈ X(x) holds for a non-trivial pattern p of instantiations of x:

1. p ⊑ β_e(x′) for all e ∈ E, and
2. p ⊑ sub(x′, r){x₁ ← ι(x₁, w, e), …, x_n ← ι(x_n, w, e)} for all e ∈ E, w ∈ W, and r ∈ R with sub(x′, r) ∉ X.

Furthermore, there must exist a non-trivial pattern p′ such that for all x′ ∈ X(x) holds:

3. p′ ⊑ β_e(x′) for all e ∈ E, and
4. p′ ⊑ sub(x′, r) for all r ∈ R with sub(x′, r) ∉ X.

The proof of theorem 7.3.1 is divided into the two cases (1) p′ does not contain variables, and (2) p′ does contain variables.

(1) p′ does not contain variables. From var(p′) = ∅ follows

    p′ = β_e(x′) for all e ∈ E, and
    p′ = sub(x′, r) for all r ∈ R with sub(x′, r) ∉ X, and therefore
    p′ = ι(x′, w, e) for all x′ ∈ X(x), e ∈ E, w ∈ W.

That means that the instantiations of x′ ∈ X(x) are identical in all unfoldings over all examples. Therefore, it holds for all w ∈ W and e ∈ E: t̂_G{x′ ← ι(x′, w, e)} = t̂_G{x′ ← p′}. Consequently, the variables x′ ∈ X(x) can be replaced by p′ in the program body.

Furthermore, for all variables x″ ∈ X, e ∈ E, w ∈ W, and r ∈ R holds:

    ι(x″, w ∘ r, e) = sub(x″, r) {x′ ← ι(x′, w, e) | x′ ∈ X(x)}
                                 {x′ ← ι(x′, w, e) | x′ ∈ X \ X(x)}
                    = sub(x″, r) {x′ ← p′ | x′ ∈ X(x)}
                                 {x′ ← ι(x′, w, e) | x′ ∈ X \ X(x)}.

That is, in all substitutions the variables from X(x) can be replaced by p′, and these new substitutions generate identical instantiations in all unfoldings over all examples.

Let X′ = X \ X(x) and m = |X′|. Because |X(x)| > 0, |X′| < |X| holds. Let i₁, …, i_m be the indices of the variables in X′. A new subprogram G′ can be constructed which does not need the variables from X(x):

Head: G′(x_{i₁}, …, x_{i_m}).

Recursive Calls: Each recursive call uses the new program name G′ and the new substitutions. The call at position u_r ∈ U_rec, r ∈ R is:

    G′( sub(x_{i₁}, r){x′ ← p′ | x′ ∈ X(x)},
        …,
        sub(x_{i_m}, r){x′ ← p′ | x′ ∈ X(x)} ).

Maximal Pattern of Body: The maximal patterns of the instantiations can be included in the body: t̂_{G′} = t̂_G{x′ ← p′ | x′ ∈ X(x)}.

Body: The program body can be constructed as:

    t_{G′} = t_G{x′ ← p′ | x′ ∈ X(x)}
             [u_r ← G′( sub(x_{i₁}, r){x′ ← p′ | x′ ∈ X(x)},
                        …,
                        sub(x_{i_m}, r){x′ ← p′ | x′ ∈ X(x)} ) | u_r ∈ U_rec].

The initial instantiations of the variables x′ ∈ X(x) are no longer used in the examples, because these variables do not belong to the new subprogram. The initial instantiations of the remaining variables in X′ are unaffected. With the reduced initial instantiations the new subprogram still has the same language for each example.


Body: The program body can be constructed as:

t_{G′} = t_G {x′ ← p′ | x′ ∈ X_β(x)}
         [u_r ← G′( sub(x_{i_1}, r){x′ ← p′ | x′ ∈ X_β(x)},
                    …,
                    sub(x_{i_m}, r){x′ ← p′ | x′ ∈ X_β(x)} ) | u_r ∈ U_rec].

The initial instantiations of variables x′ ∈ X_β(x) are no longer used in the examples, because these variables do not belong to the new subprogram. The initial instantiations of the remaining variables in X′ are unaffected. With the reduced initial instantiations the new subprogram still has the same language for each example.

(2) p′ does contain variables. Let |var(p′)| = m be the number of variables in p′. For each variable x_j ∈ X_β(x) new variables x_{j1}, …, x_{jm} are constructed. The new variable set is X′ = {x_{jk} | x_j ∈ X_β(x), k = 1, …, m} ∪ X \ X_β(x). For each x_j ∈ X_β(x), for all e ∈ E and w ∈ W there must exist terms β(x_{j1}, w, e), …, β(x_{jm}, w, e) ∈ T_Σ such that β(x_j, w, e) = p′[β(x_{j1}, w, e), …, β(x_{jm}, w, e)] holds, because p′ is a non-trivial pattern of each β(x_j, w, e).

Furthermore, for each x_j ∈ X_β(x) and each r ∈ R with sub(x_j, r) ∉ X there must exist terms sub(x_{j1}, r), …, sub(x_{jm}, r) ∈ T_Σ such that sub(x_j, r) = p′[sub(x_{j1}, r), …, sub(x_{jm}, r)] holds, because p′ is a non-trivial pattern of each sub(x_j, r).

For all w ∈ W, e ∈ E, and r ∈ R holds

β(x_j, w∘r, e) = p′[β(x_{j1}, w∘r, e), …, β(x_{jm}, w∘r, e)]
  = sub(x_j, r){x_1 ← β(x_1, w, e), …, x_n ← β(x_n, w, e)}
  = p′[sub(x_{j1}, r), …, sub(x_{jm}, r)]{x_1 ← β(x_1, w, e), …, x_n ← β(x_n, w, e)}
  = p′[sub(x_{j1}, r){x_1 ← β(x_1, w, e), …, x_n ← β(x_n, w, e)},
       …,
       sub(x_{jm}, r){x_1 ← β(x_1, w, e), …, x_n ← β(x_n, w, e)}].

It follows for k = 1, …, m, w ∈ W, e ∈ E, and r ∈ R:

β(x_{jk}, w∘r, e) = sub(x_{jk}, r){x_1 ← β(x_1, w, e), …, x_n ← β(x_n, w, e)}.

Consequently, the instantiations of all variables x_{jk} can be generated by the (new) substitutions sub(x_{jk}, r) in all unfoldings for all examples.

For all x′ ∈ X′, w ∈ W, e ∈ E, and r ∈ R holds:

β(x′, w∘r, e) = sub(x′, r) {x_j ← β(x_j, w, e) | x_j ∈ X_β(x)}
                           {x_j ← β(x_j, w, e) | x_j ∈ X \ X_β(x)}
  = sub(x′, r) {x_j ← p′[β(x_{j1}, w, e), …, β(x_{jm}, w, e)] | x_j ∈ X_β(x)}
               {x_j ← β(x_j, w, e) | x_j ∈ X \ X_β(x)}
  = sub(x′, r) {x_j ← p′[x_{j1}, …, x_{jm}] | x_j ∈ X_β(x)}
               {x_{jk} ← β(x_{jk}, w, e) | x_j ∈ X_β(x), k = 1, …, m}
               {x_j ← β(x_j, w, e) | x_j ∈ X \ X_β(x)}
  = sub(x′, r) {x_j ← p′[x_{j1}, …, x_{jm}] | x_j ∈ X_β(x)}
               {x″ ← β(x″, w, e) | x″ ∈ X′}.

Consequently, substitutions can be modified in the following way: each variable x_j ∈ X_β(x) is replaced by the pattern p′, in which variables are replaced by the new variables x_{j1}, …, x_{jm}. All substitutions are terms in T_Σ(X′) and do not refer to variables in X_β(x).

Furthermore, for all w ∈ W, e ∈ E holds:

t̂_G {x_j ← β(x_j, w, e) | x_j ∈ X_β(x)}
     {x_j ← β(x_j, w, e) | x_j ∈ X \ X_β(x)}
  = t̂_G {x_j ← p′[β(x_{j1}, w, e), …, β(x_{jm}, w, e)] | x_j ∈ X_β(x)}
         {x_j ← β(x_j, w, e) | x_j ∈ X \ X_β(x)}
  = t̂_G {x_j ← p′[x_{j1}, …, x_{jm}] | x_j ∈ X_β(x)}
         {x_{jk} ← β(x_{jk}, w, e) | x_j ∈ X_β(x), k = 1, …, m}
         {x_j ← β(x_j, w, e) | x_j ∈ X \ X_β(x)}
  = t̂_G {x_j ← p′[x_{j1}, …, x_{jm}] | x_j ∈ X_β(x)}
         {x′ ← β(x′, w, e) | x′ ∈ X′}.

If in the body t_G variables x_j ∈ X_β(x) are replaced by pattern p′, which is instantiated with the new variables x_{j1}, …, x_{jm}, then the same unfoldings are generated with instantiations β(x′, w, e) of variables x′ ∈ X′.

Consequently, a new subprogram can be constructed in an analogous way as in case 1:

- Replace variables x_j ∈ X_β(x) by pattern p′, which is instantiated with the variables x_{j1}, …, x_{jm}, in the body and in all substitution terms.
- Delete the substitutions for the obsolete variables from X_β(x) in all calls.
- Introduce the substitutions for the new variables in all calls.
- Construct the parameter list of the program head consistent with the variables X′ used in the recursive calls.

The initial instantiations for the new variables result from β(x_j, w, e) = p′[β(x_{j1}, w, e), …, β(x_{jm}, w, e)] (introduced above). The initial instantiations for variables from X_β(x) are no longer necessary. With the reduced initial instantiations the new subprogram still has the same language for each example.


Consequences. If instantiations of a variable x ∈ X in a subprogram share a non-trivial pattern p′, then a new maximized subprogram with new initial instantiations can be constructed. By moving p′ to the program body, the number of nodes (symbols) in the initial instantiations gets reduced (case 1). If a variable of the new subprogram has instantiations which share a non-trivial pattern, a new subprogram can be constructed iteratively (case 2). Because the number of nodes in the initial instantiations is finite and is reduced in each iteration, after a finite number of steps a subprogram is constructed together with initial instantiations such that the instantiations of each variable do not share a non-trivial pattern.
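The maximal common pattern used throughout this argument is the least general generalization of a set of terms. As a minimal sketch (a Python re-implementation for illustration, not the system's actual code), first-order anti-unification can be computed over terms represented as nested tuples, mapping every set of disagreeing subterms consistently to one fresh variable:

```python
# Sketch: maximal pattern (least general generalization) of ground terms.
# Terms are nested tuples ('f', arg1, ...); constants are plain strings;
# introduced variables are '?0', '?1', ... (names are illustrative only).

def antiunify(terms, table=None, counter=None):
    """Return the maximal pattern of `terms` via first-order anti-unification."""
    if table is None:
        table, counter = {}, [0]
    heads = {t[0] if isinstance(t, tuple) else t for t in terms}
    same_shape = (all(isinstance(t, tuple) for t in terms)
                  and len({len(t) for t in terms}) == 1)
    if len(heads) == 1 and same_shape:
        # identical function symbol and arity: recurse argument-wise
        head = terms[0][0]
        args = zip(*[t[1:] for t in terms])
        return (head,) + tuple(antiunify(list(a), table, counter) for a in args)
    if len(heads) == 1 and not any(isinstance(t, tuple) for t in terms):
        return terms[0]                      # identical constants
    key = tuple(terms)                       # disagreement: one shared variable
    if key not in table:
        table[key] = '?%d' % counter[0]
        counter[0] += 1
    return table[key]
```

Because the disagreement table is shared across recursive calls, repeated disagreements are mapped to the same variable, which is what makes the resulting pattern maximal rather than merely common.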

B.3 Proof: Uniqueness of Substitutions

For the proof of Theorem 7.3.4, the following lemma is needed:

Lemma B.3.1 (No Common Non-Trivial Pattern for Variables). Let t_s ∈ T_Σ(X_i) be a term such that for x_j ∈ X_i and r ∈ R holds: ∀w ∈ W: β_{w∘r}(x_j) = t_s{x_1 ← β_w(x_1), …, x_{m_i} ← β_w(x_{m_i})}.

1. For all u with u ∈ ⋂_{w∈W} pos(β_{w∘r}(x_j)) holds:
2. if there exists a variable x_k ∈ X_i with ∀w ∈ W: β_{w∘r}(x_j)|_u = β_w(x_k), and
3. ∀u′ < u ∀w_1, w_2 ∈ W: node(β_{w_1∘r}(x_j), u′) = node(β_{w_2∘r}(x_j), u′),

then t_s|_u = x_k.

Proof (No Common Non-Trivial Pattern for Variables). Assume there exists a position u which together with variable x_k ∈ X_i fulfills properties (1), (2), and (3) stated in Lemma B.3.1, but for which t_s|_u = x_k does not hold. We must consider two cases: u ∈ pos(t_s) and u ∉ pos(t_s).

Case 1: u ∈ pos(t_s)

1.a: Let t_s|_u = x_h and x_h ≠ x_k. Then holds for all w ∈ W:

β_{w∘r}(x_j)|_u = β_w(x_k)  (because of Lemma B.3.1, 2)
β_{w∘r}(x_j)|_u = β_w(x_h)  (by assumption)

and therefore, for all w ∈ W: β_w(x_k) = β_w(x_h). Because t_i (that is, the body of subprogram G_i, see Theorem 7.3.4) is a maximal pattern, it must hold x_k = x_h, which is a contradiction to the assumption.

1.b: Let t_s|_u ∉ X_i. Then holds for all w ∈ W:

β_{w∘r}(x_j)|_u = β_w(x_k)  (because of Lemma B.3.1, 2)

and because β_{w∘r}(x_j) = t_s{x_1 ← β_w(x_1), …, x_{m_i} ← β_w(x_{m_i})}:

β_w(x_k) = t_s{x_1 ← β_w(x_1), …, x_{m_i} ← β_w(x_{m_i})}|_u
β_w(x_k) = t_s|_u {x_1 ← β_w(x_1), …, x_{m_i} ← β_w(x_{m_i})}  (because u ∈ pos(t_s)).

Because by assumption t_s|_u is not in X_i, t_s|_u must be a non-trivial pattern of the instantiations β_w(x_k) of x_k. This is a contradiction to the assumption that t_i is a maximal pattern of the unfoldings.

Case 2: u ∉ pos(t_s)

Because u is a position of all β_{w∘r}(x_j) and by assumption

β_{w∘r}(x_j) = t_s{x_1 ← β_w(x_1), …, x_{m_i} ← β_w(x_{m_i})},

there must exist a position u_s < u with u_s ∈ pos(t_s) and t_s|_{u_s} ∈ X_i. Let t_s|_{u_s} = x_h. Then for all w ∈ W holds:

β_{w∘r}(x_j)|_{u_s} = t_s{x_1 ← β_w(x_1), …, x_{m_i} ← β_w(x_{m_i})}|_{u_s}
  = t_s|_{u_s} {x_1 ← β_w(x_1), …, x_{m_i} ← β_w(x_{m_i})}  (because u_s ∈ pos(t_s))
  = x_h {x_1 ← β_w(x_1), …, x_{m_i} ← β_w(x_{m_i})}  (because t_s|_{u_s} = x_h)
  = β_w(x_h).

By assumption, equation (3) in Lemma B.3.1 holds for all u′ < u and therefore also for u_s < u: for all w_1, w_2 ∈ W

node(β_{w_1∘r}(x_j), u_s) = node(β_{w_2∘r}(x_j), u_s)
node(β_{w_1}(x_h), ε) = node(β_{w_2}(x_h), ε).

Consequently, the instantiations of variable x_h share a non-trivial pattern, which is a contradiction to the assumption that t_i is a maximal pattern of the unfoldings.

Now we can prove Theorem 7.3.4:

Proof (Uniqueness of Substitutions). Let t_s ∈ T_Σ(X) be a term which fulfills equation (*)

∀w ∈ W: β_{w∘r}(x_j) = t_s{x_1 ← β_w(x_1), …, x_{m_i} ← β_w(x_{m_i})}

given in Theorem 7.3.4. We must distinguish the cases that (1) t_s has a variable at position u and (2) the subtree in t_s at position u does not contain a variable.

Case 1: Let u ∈ pos(t_s) with t_s|_u ∈ X_i and let t_s|_u = x_k.

Then holds: β_{w∘r}(x_j)|_u = β_w(x_k), and because of (*), for all u′ < u and all w_1, w_2 ∈ W: node(β_{w_1∘r}(x_j), u′) = node(β_{w_2∘r}(x_j), u′). Because of Lemma B.3.1 holds sub(x_j, r)|_u = t_s|_u and for all u′ ≤ u: node(sub(x_j, r), u′) = node(t_s, u′).

Case 2: Let u ∈ pos(t_s) with t_s|_u ∈ T_Σ.

Then holds:

β_{w∘r}(x_j)|_u = t_s{x_1 ← β_w(x_1), …, x_{m_i} ← β_w(x_{m_i})}|_u
  = t_s|_u {x_1 ← β_w(x_1), …, x_{m_i} ← β_w(x_{m_i})}
  = t_s|_u.

2.a: Let u ∈ pos(sub(x_j, r)). Let u′ ∈ pos(sub(x_j, r)) be a position with u ≤ u′ and let sub(x_j, r)|_{u′} ∈ X_i with sub(x_j, r)|_{u′} = x_k. Then holds for all w ∈ W: β_{w∘r}(x_j)|_{u′} = β_w(x_k), and because β_{w∘r}(x_j)|_u = t_s|_u it holds β_{w∘r}(x_j)|_{u′} = t_s|_{u′} and therefore β_w(x_k) = t_s|_{u′}.

The instantiations of x_k then must all be identical and share the non-trivial pattern t_s|_{u′}. This is a contradiction to the assumption that t_i is a maximal pattern of the unfoldings. It follows: if u ∈ pos(sub(x_j, r)), then the term sub(x_j, r)|_u cannot contain variables and it must hold:

β_{w∘r}(x_j)|_u = sub(x_j, r)|_u {x_1 ← β_w(x_1), …, x_{m_i} ← β_w(x_{m_i})}
  = sub(x_j, r)|_u
  = t_s|_u.

2.b: Let u ∉ pos(sub(x_j, r)). Because u is a position in all β_{w∘r}(x_j), in sub(x_j, r) at a position u′ < u there must be a variable x_k ∈ X_i. Because then holds for all w ∈ W: β_{w∘r}(x_j)|_{u′} = β_w(x_k), and for all u″ < u and w_1, w_2 ∈ W: node(β_{w_1∘r}(x_j), u″) = node(β_{w_2∘r}(x_j), u″), and because of Lemma B.3.1 must hold for t_s that t_s|_{u′} = x_k. Because u′ < u, this is a contradiction to the assumption u ∈ pos(t_s).

Because of case 1 holds for all positions u ∈ pos(t_s) with ∃u′: t_s|_{u′} ∈ X_i and u ≤ u′ that node(t_s, u) = node(sub(x_j, r), u). Because of case 2 holds for all positions u ∈ pos(t_s) with ¬∃u′: t_s|_{u′} ∈ X_i and u ≤ u′ that node(t_s, u) = node(sub(x_j, r), u). Therefore, for all u ∈ pos(t_s) holds node(t_s, u) = node(sub(x_j, r), u) and therefore holds t_s = sub(x_j, r).


C. Sample Programs and Problems

C.1 Fibonacci with Sequence Referencing Function

The fibonacci function as inferred with a genetic programming algorithm by Koza (1992, chap. 18.3):

(+ (SRF (- J 2) 0)
   (SRF (+ (+ (- J 2) 0) (SRF (- J J) 0))
        (SRF (SRF 3 1) 1)))

A realization in Lisp:

(defun kfib (x)
  (+ (srf x (- x 2) 0)
     (srf x (+ (+ (- x 2) 0) (srf x (- x x) 0))
          (srf x (srf x 3 1) 1))
  ))

; sequence referencing function (srf cur x def)
; if 0 < x <= cur then fib(x) else default def
(defun srf (cur x def)
  (cond ((<= x 0) def)
        ((<= x cur) (fib x))
        (T def)
  ))

; Koza's SRF calculates the fibonacci numbers for x = 0..20
; in ascending order and saves them in a global table
; we just calculate it by standard fibonacci
(defun fib (x)
  (cond ((= 0 x) 1)
        ((= 1 x) 1)
        (T (+ (fib (- x 1)) (fib (- x 2))))
  ))

A simplification of the synthesized function:

(defun skfib (x)
  (+ (srf x (- x 2) 0)
     (srf x (- x 1)
          (srf x (srf x 3 1) 1))
  ))
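As a quick sanity check (a Python re-implementation of the Lisp above, written here for illustration only), the simplified function skfib reduces to srf(x, x-2, ...) + srf(x, x-1, ...), i.e. fib(x-2) + fib(x-1) = fib(x) for x >= 3, with fib(0) = fib(1) = 1 as in the Lisp code. Whether the unsimplified kfib agrees as well depends on how index 0 is resolved against Koza's precomputed sequence table, so only the simplification is checked:

```python
# Sketch: Python versions of fib, srf, and the simplified skfib from above.

def fib(x):
    # standard Fibonacci with fib(0) = fib(1) = 1, as in the Lisp code
    return 1 if x <= 1 else fib(x - 1) + fib(x - 2)

def srf(cur, x, default):
    # sequence referencing function: fib(x) for valid indices 0 < x <= cur
    if x <= 0 or x > cur:
        return default
    return fib(x)

def skfib(x):
    # simplified synthesized expression
    return srf(x, x - 2, 0) + srf(x, x - 1, srf(x, srf(x, 3, 1), 1))
```

For x >= 3 both referenced indices x-2 and x-1 are valid, so the defaults never fire and skfib(x) = fib(x-2) + fib(x-1) = fib(x).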


C.2 Inducing `Reverse' with Golem

Trace of the ILP system Golem (Muggleton & Feng, 1990) for inducing the recursive clause reverse.

Positive Examples:

rev([],[]).
rev([1],[1]).
rev([2],[2]).
rev([3],[3]).
rev([4],[4]).
rev([1,2],[2,1]).
rev([1,3],[3,1]).
rev([1,4],[4,1]).
rev([2,2],[2,2]).
rev([2,3],[3,2]).
rev([2,4],[4,2]).
rev([0,1,2],[2,1,0]).
rev([1,2,3],[3,2,1]).

Negative Examples:

rev([1],[]).
rev([0,1],[0,1]).
rev([0,1,2],[2,0,1]).
app([1],[0],[0,1]).

Background Knowledge:

:- mode(rev(+,-)).
:- mode(app(+,+,-)).
rev([],[]).
rev([1],[1]).
rev([2],[2]).
rev([3],[3]).
rev([4],[4]).
rev([1,2],[2,1]).
rev([1,3],[3,1]).
rev([1,4],[4,1]).
rev([2,2],[2,2]).
rev([2,3],[3,2]).
rev([2,4],[4,2]).
rev([0,1,2],[2,1,0]).
rev([1,2,3],[3,2,1]).
app([],[],[]).
app([1],[],[1]).
app([2],[],[2]).
app([3],[],[3]).
app([4],[],[4]).
app([],[1],[1]).
app([],[2],[2]).
app([],[3],[3]).
app([],[4],[4]).
app([1],[0],[1,0]).
app([2],[1],[2,1]).
app([3],[1],[3,1]).
app([4],[1],[4,1]).
app([2],[2],[2,2]).
app([3],[2],[3,2]).
app([4],[2],[4,2]).
app([2,1],[0],[2,1,0]).
app([3,2],[1],[3,2,1]).
nat(0).
nat(1).
nat(2).
nat(3).
nat(4).
nat(5).
nat(6).
nat(7).
nat(8).
nat(9).

Trace:

[:- mode(rev(+,-)). - Time taken 0ms
[:- mode(app(+,+,-)). - Time taken 0ms
[Rlgg of pair:
rev([],[]).
rev([2],[2]).
[is
rev(A,A) :- app(A,A,B).
[Number of negatives covered=0
[Number of positives covered=2
[Rlgg of pair:
rev([],[]).
rev([1,2],[2,1]).
[is
rev(A,B) :- app(C,D,B).
[Number of negatives covered=1
[Rlgg of pair:
rev([],[]).
rev([1,3],[3,1]).
[is
rev(A,B) :- app(C,D,B).
[Number of negatives covered=1
[Rlgg of pair:
rev([1],[1]).
rev([1,4],[4,1]).
[is
rev([1|A],[B|C]) :- rev(A,A), rev(C,C), app([2,1],[0],[2,1,0]),
    app(A,[1],[B|C]), app(D,A,[B]), nat(1), nat(B), rev([],[]),
    app([1],[0],[1,0]), app([1],[],[1]), app([2],[1],[2,1]),
    app([2],[2],[2,2]), app([3,2],[1],[3,2,1]), app([B],[],[B]),
    app([B],C,[B|C]), app([],[1],[1]), app([],[B],[B]),
    app([],[],[]), nat(0), nat(2).
[Number of negatives covered=0


[Number of positives covered=4
[Rlgg of pair:
rev([2],[2]).
rev([1,3],[3,1]).
[is
rev([A|B],[C|D]) :- rev([1|E],[F|D]), rev(B,B), rev(D,D), app([G,A],[H],[G,A,H]),
    app(B,[A],[C|D]), app(I,B,[C]), nat(A), nat(C), rev([1],[1]),
    rev([F],[F]), rev([],[]), rev(E,E), app([1],[0],[1,0]),
    app([2,1],[0],[2,1,0]), app([3,2],[1],[3,2,1]), app([A],[H],[A,H]),
    app([A],[],[A]), app([A],I,[A|I]), app([C,F],[J],[C,F,J]),
    app([C],[1],[C,1]), app([C],[A],[C,A]), app([C],[F],[C,F]),
    app([C],[],[C]), app([C],D,[C|D]), app([C],E,[C|E]),
    app([C],I,[C|I]), app([F],[],[F]), app([F],D,[F|D]),
    app([F],E,[F|E]), app([G],[A],[G,A]), app([],[A],[A]),
    app([],[C],[C]), app([],[F],[F]), app([],[],[]),
    app(B,[F],[K|E]), nat(1), nat(F), nat(G), nat(H).
[Number of negatives covered=0
[Number of positives covered=2
[Rlgg of pair:
rev([2],[2]).
rev([2,3],[3,2]).
[is
rev([2|A],[B|C]) :- rev([1|A],[D|E]), rev([F|C],[F|C]), rev(A,A), rev(C,C),
    app([2],[2],[2,2]), app([2],C,[2|C]), app([3,2],[1],[3,2,1]),
    app(A,[2],[B|C]), app(G,A,[B]), nat(2), nat(B), rev([1],[1]),
    rev([D],[D]), rev([F],[F]), rev([],[]), rev(E,E),
    app([1],[0],[1,0]), app([2,1],[0],[2,1,0]), app([2],[1],[2,1]),
    app([2],[],[2]), app([2],E,[2|E]), app([2],G,[2|G]),
    app([3],[1],[3,1]), app([3],[2],[3,2]), app([B,F],[H],[B,F,H]),
    app([B],[1],[B,1]), app([B],[2],[B,2]), app([B],[F],[B,F]),
    app([B],[],[B]), app([B],C,[B|C]), app([B],E,[B|E]),
    app([B],G,[B|G]), app([D],[],[D]), app([D],C,[D|C]),
    app([F],E,[F|E]), app([],[2],[2]), app([],[B],[B]),
    app([],[D],[D]), app([],[],[]), app(A,E,I), app(E,J,[1]),
    app(G,[D],[B|J]), app(J,E,[1]), app(K,[B],[3|G]), app(K,A,[3]),
    nat(1), nat(3), nat(D), nat(F).
[Number of negatives covered=0
[Number of positives covered=2
[Rlgg of pair:
rev([3],[3]).
rev([1,4],[4,1]).
[is
rev([A|B],[C|D]) :- rev(B,B), rev(D,D), app(B,[A],[C|D]), app(E,B,[C]),
    nat(A), nat(C), rev([],[]), app([A],[],[A]), app([C],[],[C]),
    app([C],D,[C|D]), app([],[A],[A]), app([],[C],[C]), app([],[],[]).
[Number of negatives covered=0
[Number of positives covered=10
[Rlgg of pair:
rev([1,4],[4,1]).
rev([2,3],[3,2]).


[is
rev([A,B],[B,A]) :- rev([A],[A]), rev([B],[B]), rev([],[]), app([A],[],[A]),
    app([B],[A],[B,A]), app([B],[],[B]), app([C,A],[D],[C,A,D]),
    app([],[A],[A]), app([],[B],[B]), app([],[],[]), nat(A), nat(B),
    rev([C],[C]), app([A],[D],[A,D]), app([C],[A],[C,A]), app([C],[],[C]),
    app([],[C],[C]), nat(C), nat(D).
[Number of negatives covered=0
[Number of positives covered=6
[Pos-neg cover=10,potential-examples=3
[Rlgg of :
rev([],[]).
rev([3],[3]).
rev([1,4],[4,1]).
[is
rev(A,B).
[Number of negatives covered=3
[Overgeneral
[Rlgg of :
rev([0,1,2],[2,1,0]).
rev([3],[3]).
rev([1,4],[4,1]).
[is
rev([A|B],[C|D]) :- rev(B,E), app(E,[A],[C|D]), nat(A), nat(C),
    rev([],[]), app([],[],[]).
[Number of negatives covered=0
[OK
[Rlgg of :
rev([1,2,3],[3,2,1]).
rev([3],[3]).
rev([1,4],[4,1]).
[is
rev([A|B],[C|D]) :- rev(B,E), rev(F,D), app(E,[A],[C|D]), nat(A),
    nat(C), rev([],[]), app([A],[],[A]), app([],[A],[A]), app([],[],[]).
[Number of negatives covered=0
[OK
[Best-atom rev([0,1,2],[2,1,0])
[Pos-neg cover=12,potential-examples=0
[Cover=12
[Reducing clause
[Adding clause rev([A|B],[C|D]) :- rev(B,E), app(E,[A],[C|D]).
[REMOVED: 12 FORES, 0 NEGS
[FORES: 1, BACKS: 41, NEGS: 4, RULES: 1
[Induction time 498ms

C.3 Finite Program for `Unstack'

#S(TPLAN :PNAME P1

:SUPLAN

(#S(PSTEP :INSTOP NIL :PARENT NIL

:CHILD


((CLEAR (SUCC (SUCC O3))) (CLEAR (SUCC O3))

(CLEAR O3))

:PREC NIL :ADD NIL :DEL NIL :NODEID 0 :LEVEL 0)

#S(PSTEP :INSTOP (UNSTACK (SUCC O3))

:PARENT

((CLEAR (SUCC (SUCC O3))) (CLEAR (SUCC O3))

(CLEAR O3))

:CHILD

((ON O2 O3) (CLEAR (SUCC (SUCC O3)))

(CLEAR (SUCC O3)))

:PREC ((ON O2 O3) (CLEAR O2)) :ADD ((CLEAR O3))

:DEL ((CLEAR O2) (ON O2 O3)) :NODEID 1 :LEVEL 1)

#S(PSTEP :INSTOP (UNSTACK (SUCC (SUCC O3)))

:PARENT

((ON O2 O3) (CLEAR (SUCC (SUCC O3)))

(CLEAR (SUCC O3)))

:CHILD

((ON O1 O2) (ON O2 O3) (CLEAR (SUCC (SUCC O3))))

:PREC ((ON O1 O2) (CLEAR O1)) :ADD ((CLEAR O2))

:DEL ((CLEAR O1) (ON O1 O2)) :NODEID 2 :LEVEL 2))

:COPLAN

(#S(PSTEP :INSTOP NIL :PARENT NIL :CHILD ((CLEAR O3))

:PREC NIL :ADD NIL :DEL NIL :NODEID 0 :LEVEL 0)

#S(PSTEP :INSTOP (UNSTACK (SUCC O3)) :PARENT ((CLEAR O3))

:CHILD ((CLEAR (SUCC O3)))

:PREC ((ON O2 O3) (CLEAR O2)) :ADD ((CLEAR O3))

:DEL ((CLEAR O2) (ON O2 O3)) :NODEID 1 :LEVEL 1)

#S(PSTEP :INSTOP (UNSTACK (SUCC (SUCC O3)))

:PARENT ((CLEAR (SUCC O3)))

:CHILD ((CLEAR (SUCC (SUCC O3))))

:PREC ((ON O1 O2) (CLEAR O1)) :ADD ((CLEAR O2))

:DEL ((CLEAR O1) (ON O1 O2)) :NODEID 2 :LEVEL 2))

:TERM

(IF (CLEAR O3 S)

S

(UNSTACK (SUCC O3) S

(IF (CLEAR (SUCC O3) S)

S

(UNSTACK (SUCC (SUCC O3)) S

(IF (CLEAR (SUCC (SUCC O3)) S) S OMEGA)))))

:PTYPE SEQ :NEWDAT NIL :NEWPRED NIL :NPFCT NIL

:GOALPRED (CLEAR O3) :BOTTOM O3

:PASSOCS ((O2 (SUCC O3)) (O1 (SUCC (SUCC O3))))

:PAFCT

(DEFUN SUCC (X S)

(COND ((NULL S) NIL)

((AND (EQUAL (FIRST (CAR S)) 'ON)

(EQUAL (NTH 2 (CAR S)) X))

(NTH 1 (CAR S)))

(T (SUCC X (CDR S)))))

:PPARAMS NIL :PVALUES NIL)

For sequences, the slots newdat, newpred, and npfct are not required. The slots pparams and pvalues are filled after folding.

C.4 Recursive Control Rules for the `Rocket' Domain

; Control Knowledge for ROCKET
; ----------------------------
; call for example (rocket '(o1 o2 o3) '((at o1 a) (at o2 a) (at o3 a) (at rocket a)))
; or, including generation of oset
; (start-r '((at o1 b) (at o2 b) (at o3 b)) '((at o1 a) (at o2 a) (at o3 a) (at rocket a)))

; predefined set-selectors
(defun pick (oset) (car oset))
(defun rst (oset) (cdr oset))

; generalized predicates inferred during plan transformation
(DEFUN AT* (ARGS S)
  (COND ((NULL ARGS) T)
        ((AND (NULL S) (NOT (NULL ARGS))) NIL)
        ((AND (EQUAL (CAAR S) 'AT)
              (INTERSECTION ARGS (CDAR S)))
         (AT* (SET-DIFFERENCE ARGS (CDAR S)) (CDR S)))
        (T (AT* ARGS (CDR S)))))

(DEFUN INSIDE* (ARGS S)
  (COND ((NULL ARGS) T)
        ((AND (NULL S) (NOT (NULL ARGS))) NIL)
        ((AND (EQUAL (CAAR S) 'INSIDE)
              (INTERSECTION ARGS (CDAR S)))
         (INSIDE* (SET-DIFFERENCE ARGS (CDAR S)) (CDR S)))
        (T (INSIDE* ARGS (CDR S)))))

; explicit operator application
; in combination with DPlan, the add-del-lists are applied to s
(defun unload (o s)
  (print `(unload ,o ,s))
  (cond ((null s) nil)
        ((member o (car s) :test 'equal) (cons (list 'at o 'b) (cdr s)))
        (T (cons (car s) (unload o (cdr s)))) ))

(defun loadr (o s)
  (print `(load ,o ,s))
  (cond ((null s) nil)
        ((member o (car s) :test 'equal)
         (cons (list 'inside o 'rocket) (cdr s)))
        (T (cons (car s) (loadr o (cdr s)))) ))

(defun move-rocket (s)
  (print `(move-rocket ,s))
  (cond ((null s) nil)
        ((equal (car s) '(at rocket a)) (cons (list 'at 'rocket 'b) (cdr s)))
        (T (cons (car s) (move-rocket (cdr s)))) ))

; generalized control rules
; abstraction from destination (B) (for at*) and vehicle (Rocket) (for inside*)
(defun unload-all (oset s)
  (if (at* oset s)
      s
      (unload (pick oset) (unload-all (rst oset) s)) ))

(defun load-all (oset s)
  (if (inside* oset s)
      s
      (loadr (pick oset) (load-all (rst oset) s)) ))

(defun rocket (oset s) (unload-all oset (move-rocket (load-all oset s))) )

; "meta"-function, generating the set of objects to be transported
; from the top-level goals
(defun start-r (g s) (rocket (make-co g '(at* CO x)) s))

(defun make-co (goal newpat)
  (cond ((null goal) nil)
        ((string< (string (caar goal)) (string (car newpat)))
         (cons (nth (position 'CO newpat) (car goal))
               (make-co (cdr goal) newpat))) ))

Interleaving `at' and `inside'. The generalized predicates (at* oset B) and (inside* oset Rocket) are complements. If all objects are at a location, no object is inside the vehicle, and vice versa. The unload-all function presupposes that all objects in oset are inside the rocket, and the load-all function presupposes that all objects in oset are at the current location. The construction of a complex object from the plan is driven by the top-level goals (or, for a sub-plan, by the predicates in its root node). After transforming both sub-plans into finite programs and generalizing over them, it becomes clear that both sub-plans share the parameter oset with initial value (o1 o2 o3).

Analyzing the relationship between the objects in the at*-set and the inside*-set could lead to an alternative implementation of these generalized predicates: (at* oset l) is true if no object is inside the rocket, that means, if for (inside* oset rocket) the oset is empty (s contains no literal (inside o rocket)); and analogously for (inside* oset rocket).
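The overall control rule — load everything, move the rocket, unload everything — can be sketched compactly. The following is a Python re-implementation for illustration (not the book's Lisp): states are sets of fact tuples, the locations 'a'/'b' follow the example call above, and at_all/inside_all are deliberate simplifications of the generalized at*/inside* predicates that check the destination directly:

```python
# Sketch of the recursive rocket control rules over set-based states.

def at_all(oset, s):
    return all(('at', o, 'b') in s for o in oset)

def inside_all(oset, s):
    return all(('inside', o, 'rocket') in s for o in oset)

def load(o, s):
    # operator application: delete (at o a), add (inside o rocket)
    return (s - {('at', o, 'a')}) | {('inside', o, 'rocket')}

def unload(o, s):
    # operator application: delete (inside o rocket), add (at o b)
    return (s - {('inside', o, 'rocket')}) | {('at', o, 'b')}

def load_all(oset, s):
    if inside_all(oset, s):
        return s
    return load(oset[0], load_all(oset[1:], s))

def unload_all(oset, s):
    if at_all(oset, s):
        return s
    return unload(oset[0], unload_all(oset[1:], s))

def move_rocket(s):
    return (s - {('at', 'rocket', 'a')}) | {('at', 'rocket', 'b')}

def rocket(oset, s):
    return unload_all(oset, move_rocket(load_all(oset, s)))
```

The recursion scheme mirrors the Lisp version: the generalized predicate acts as termination condition, and pick/rst correspond to oset[0]/oset[1:].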

C.5 The `Selection Sort' Domain

Sorting Lists with Three Elements. The universal plan for sorting lists with three elements is given in Figure C.1. Note that in constructing the universal plan, it is random whether (swap p q) or (swap q p) is the first instantiation. Because the operator is symmetric, both applications result in an identical state. Only the first application is integrated in the plan. Restriction of swapping only from smaller positions to larger ones (or the other way round) can be done by extending state specifications by ((gt P3 P2) (gt P3 P1) (gt P2 P1)) and the application condition of the swap operator to ((isc p n1) (isc q n2) (gt n1 n2) (gt q p)).

Minimal Spanning Trees. The nine minimal spanning trees of the 3-sort problem are given in Figures C.2, C.3 and C.4. Three of these trees (the last three given) are generalizable to the selsort program: namely, all trees where the branching factor is as regular as possible (i.e., 3 to 2 vs. 3 to 1).
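The symmetry point above can be made concrete with a minimal sketch (Python, for illustration): applying SWAP to the same pair of positions in either argument order yields the identical successor state, so the planner needs to keep only one of the two instantiations. States map positions to contents, mirroring the ISC literals of Figure C.1:

```python
# Sketch: SWAP over position->content states; symmetric in its arguments.

def swap(state, p, q):
    new = dict(state)                      # successor state (no mutation)
    new[p], new[q] = state[q], state[p]    # exchange contents of p and q
    return new

state = {'P1': 2, 'P2': 1, 'P3': 3}
```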

C.6 Recursive Control Rules for the `Tower' Domain

; TOWER -- assuming subgoal independence


Fig. C.1. Universal Plan for Sorting Lists with Three Elements (figure not reproducible here: a DAG over ISC-states connected by SWAP operators)

Fig. C.2. Minimal Spanning Trees for Sorting Lists with Three Elements (figure not reproducible here)

Fig. C.3. Minimal Spanning Trees for Sorting Lists with Three Elements (figure not reproducible here)

Fig. C.4. Minimal Spanning Trees for Sorting Lists with Three Elements (figure not reproducible here)

; abstract representation of control knowledge
; G(n,n,s) = tow(n,n)(s, putt1(n,s))
; G(i,n,s) = tow(i,n)(s, put1(i, s(i), G(s(i),s,n)))
; put1(i,s(i),s) = put(i, s(i), clear(i, clear(s(i),s)))
; putt1(i,s) = putt(i, clear(i,s))
; clear(x,s) = ct(x,s)(s, putt(f(x), clear(f(x),s)))
; main: G(1,n,s)
; ------------------------------------------------------------------
; call (G 1 maxblock s)
; s for maxblock = 3
; ((on 1 2) (on 2 3) (ct 1) (ont 3))
; ((on 1 3) (on 3 2) (ct 1) (ont 2))
; ((on 2 1) (on 1 3) (ct 2) (ont 3))
; ((on 2 3) (on 3 1) (ct 2) (ont 1))
; ((on 3 1) (on 1 2) (ct 3) (ont 2))
; ((on 3 2) (on 2 1) (ct 3) (ont 1))
; ((on 1 2) (ct 1) (ct 3) (ont 2) (ont 3))
; ((on 1 3) (ct 1) (ct 2) (ont 2) (ont 3))
; ((on 2 1) (ct 2) (ct 3) (ont 1) (ont 3))
; ((on 2 3) (ct 2) (ct 1) (ont 1) (ont 3))
; ((on 3 1) (ct 3) (ct 2) (ont 1) (ont 2))
; ((on 3 2) (ct 3) (ct 1) (ont 1) (ont 2))
; ((ct 1) (ct 2) (ct 3) (ont 1) (ont 2) (ont 3))
; some s for maxblock > 3
; ((on 4 2) (on 2 1) (on 1 3) (ct 4) (ont 3))
; ((on 2 5) (on 5 1) (on 4 3) (ct 2) (ct 4) (ont 1) (ont 3))
; ------------------------------------------------------------------
; this would also give true if block n is ont and there are
; unsorted blocks above it!
(defun tow (i n s)
  (cond ((eq i n) (cond ((member (cons 'ont (list n)) s :test 'equal)
                         (print `(tower ,s)) T)
                        (T nil)
                  ))
        (T (cond ((and
                   (member (cons 'on (list i (1+ i))) s :test 'equal)
                   (tow (1+ i) n s)) T)
                 (T nil)
           )) ))

(defun putt1 (i s) (puttable i (clear i s)))

(defun put1 (i s) (put i (1+ i) (clear i (clear (1+ i) s))) )

(defun clear (x s)
  (cond ((member (cons 'ct (list x)) s :test 'equal)
         (print `(block ,(list x) is clear now)) s)
        (T (print 'puttable-call-by-clear)
           (puttable (f x s) (clear (f x s) s))) ))

(defun put (x y s)
  (print s)
  (cond ((null s) (print 'error-in-put) nil)
        ((and (member (cons 'ct (list x)) s :test 'equal)
              (member (cons 'ct (list y)) s :test 'equal)
         ) (print `(put ,(list x) on ,(list y) in ,s)) (exec-put x y s))
        (T (print 'error-in-put-xy-not-clear) nil) ))

(defun exec-put (x y s)
  (cond ((null s) nil)
        ((and (equal (first (car s)) 'ont) (equal (second (car s)) x))
         (exec-put x y (cdr s)))
        ((and (equal (first (car s)) 'on) (equal (second (car s)) x))
         (cons (cons 'ct (list (third (car s)))) (exec-put x y (cdr s)))
        )
        ((and (equal (first (car s)) 'ct) (equal (second (car s)) y))
         (cons (cons 'on (list x y)) (exec-put x y (cdr s)))
        )
        (T (cons (car s) (exec-put x y (cdr s)))) ))

(defun puttable (x s)
  (cond ((null s) (print 'error-in-puttable) nil)
        ((member (cons 'ct (list x)) s :test 'equal)
         (print `(puttable ,(list x) in ,s)) (exec-puttable x s))
        (T (print 'error-in-puttable-x-not-clear) nil) ))

(defun exec-puttable (x s)
  (cond ((null s) (print 'x-maybe-already-on-table) nil)
        ((and (equal (first (car s)) 'on) (equal (second (car s)) x))
         (cons (cons 'ct (list (third (car s))))
               (cons
                (cons 'ont (list (second (car s))))
                (cdr s)))
        )
        (T (cons (car s) (exec-puttable x (cdr s)))) ))

(defun f (x s)
  (cond ((null s) nil)
        ((and (equal (first (car s)) 'on) (equal (third (car s)) x))
         (second (car s)))
        (T (f x (cdr s))) ))

(defun G (i n s)
  (print `(G ,(list i) ,(list n) ,s))
  (cond ((eq i n) (cond ((tow i n s) s) ; G(n,n,s)
                        (T (putt1 n s))
                  ))
        (T (cond ((tow i n s) s) ; G(i,n,s), i < n
                 (T (put1 i (G (1+ i) n s)))
           )) ))

; TOWER-1; call e.g. (tower '(a b c d) '((on a d) (on d c) (on c b) (ct a)))

(defun tower (olist s)
  (if (subtow olist s)
      s
      (if (and (ct (first olist) s) (subtow (cdr olist) s))
          (put (first olist) (second olist) s)
          (if (and (singleblock olist) (ot (first olist) s))
              (clear-all (first olist) s)
              (if (and (singleblock olist) (ct (first olist) s))
                  (puttable (first olist) s)
                  (if (singleblock olist)
                      (puttable (first olist) (clear-all (first olist) s))
                      (if (and (ct (first olist) s) (on* (cdr olist) s))
                          (put (first olist) (second olist) (clear-all (second olist) s))
                          (if (ct (first olist) s)
                              (put (first olist) (second olist) (tower (cdr olist) s))
                              (put (first olist) (second olist) (clear-all (first olist)
                                   (tower (cdr olist) s)))
                          ))))))))

; clear-all macro
(defun clear-all (o s)
(if (ct o s)
s
(puttable (succ o s) (clear-all (succ o s) s) )) )

(defun succ (x s)
(cond ((null s) nil)
((and (equal (first (car s)) 'on)
(equal (nth 2 (car s)) x))
(nth 1 (car s)))
(T (succ x (cdr s)))))

(defun singleblock (l) (null (cdr l)))

C.6 Recursive Control Rules for the `Tower' Domain

; correct subtower?
(defun subtow (olist s) (and (ct (car olist) s) (on* olist s)))

; tower contains a correct subtower?
(defun on* (olist s)
(cond ((null olist) T) ; should not happen
((and (null (cdr olist)) (ot (car olist) s)) T)
((member (list 'on (first olist) (second olist)) s :test 'equal)
(on* (cdr olist) s))
(T nil) ))

; given
(defun ct (o s) (member (list 'ct o) s :test 'equal))

; given as (ontable x) OR (here) inferred from state
(defun ot (o s)
(null (mapcan #'(lambda (x) (and (equal 'on (first x))
(equal o (second x))
(list x))) s) ))

; explicit application of put-operator
(defun put (x y s)
(cond ((null s) (print `(put ,x ,y))
nil)
((equal (car s) (list 'ct y)) (cons (list 'on x y) (put x y (cdr s) )))
((and (equal (first (car s)) 'on)
(equal (second (car s)) x))
(cons (list 'ct (third (car s))) (put x y (cdr s) )))
(T (cons (car s) (put x y (cdr s) )))
))

; explicit application of puttable-operator
(defun puttable (x s)
(cond ((null s) (print `(puttable ,x))
nil)
((and (equal (first (car s)) 'on)
(equal (second (car s)) x)) (cons (list 'ct (third (car s)))
(puttable x (cdr s) )))
(T (cons (car s) (puttable x (cdr s) )))
))

; TOWER-2 ; building a list (tower) of sorted numbers (blocks)
; -----------------------------------------------------------------
; Representation: each partial tower as list
; Input: list of lists
; Examples for the three-blocks world
; ((1 2 3))
; ((2 3) (1))
; ((1) (2) (3))
; ((2 1) (3))
; ((3 2 1))
; ((1 2) (3))
; ((1 3) (2))
; ((3 1) (2))
; ((3 2) (1))
; ((1 3 2))
; ((2 3 1))
; ((2 1 3))
; ((3 1 2))
; -----------------------------------------------------------------

; help functions
; ---------------

; flattens a list l
(defun flatten (l)
(cond ((null l) nil)
(T (append (car l) (flatten (cdr l))))
))

; x+1 = y?
(defun onedif (x y) (= (1+ x) y))

; blocks world selectors
; ----------------------

; topmost block of a tower
(defun topof (tw) (car tw))

; bottom block (base) of a tower
(defun bottom (tw) (car (last tw)))

; next tower
; e.g. ((2 1) (3)) -> (2 1)
(defun get-tower (l) (car l))

; tops of all current towers
(defun topelements (l) (sort (map 'list #'car l) #'>))

; top block with highest number
(defun greatest (l) (car (topelements l)))

; top block with second highest number
(defun scndgreatest (l) (cadr (topelements l)))

; label of the block with the highest number
(defun maxblock (l)
(cond ((null l) 0)
(T (car (sort (flatten l) #'>)))
))

(defun get-all-no-towers (l max)
(cond ((null l) nil)
((and (equal (bottom (car l)) max) (sorted (get-tower l)))
(get-all-no-towers (cdr l) max))
((single-block (get-tower l)) (get-all-no-towers (cdr l) max))
(T (cons (car l) (get-all-no-towers (cdr l) max)))
))

(defun find-greatest (max l)
(cond ((null l) max)
((> (topof max) (topof (car l))) (find-greatest max (cdr l)))
(T (find-greatest (car l) (cdr l)))
))

; find incorrect tower containing highest element
(defun greatest-no-tower (l)
(cond ((null l) nil)
(T (find-greatest (car (get-all-no-towers l (maxblock l)))
(cdr (get-all-no-towers l (maxblock l)))))
))

; blocksworld predicates
; ----------------------

; is tower only a single block?
(defun single-block (tw) (= (length tw) 1))

; exist two partial towers whose top elements differ only by one?
(defun exist-free-neighbours (l) (onedif (scndgreatest l) (greatest l)))

; exists a correct partial tower?
; e.g. (2 3) or (B C)
(defun exists-tower (l)
(cond ((null l) nil)
((and (equal (bottom (get-tower l)) (maxblock l))
(sorted (get-tower l))) T)
(T (exists-tower (cdr l)))
))

; is block x predecessor to top of a tower?
(defun successor (x tw)
(cond ((null tw) T)
((onedif x (car tw)) T) ; (successor x (cdr tw)))
(T nil)
))

; is tower sorted?
(defun sorted (tw)
(cond ((null tw) T)
((successor (car tw) (cdr tw)) (sorted (cdr tw)))
(T nil)
))

; exists only one tower?
(defun single-tower (l) (null (cdr l)))

; goal state?
(defun is-tower (l) (and (single-tower l) (sorted (get-tower l))))
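To make the goal test concrete, the predicates above can be transcribed into Python (a sketch for illustration only; the names follow the Lisp originals, with `is_sorted` standing in for `sorted` to avoid shadowing the Python builtin):

```python
def successor(x, tw):
    # block x directly precedes the top of tower tw (or tw is empty)
    return (not tw) or x + 1 == tw[0]

def is_sorted(tw):
    # a tower is sorted if each block is one less than the block below it
    return (not tw) or (successor(tw[0], tw[1:]) and is_sorted(tw[1:]))

def is_tower(l):
    # goal state: exactly one tower, and that tower is sorted
    return len(l) == 1 and is_sorted(l[0])

print(is_tower([[1, 2, 3]]))    # True
print(is_tower([[2, 1], [3]]))  # False
```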

; ------------------------------------------------------------------
; blocksworld operators
; ---------------------

; put x on y
(defun put (x y l)
(cond ((null l) (print 'put) (print x) (print y)
nil)
((equal (caar l) x) (cond ((not (null (cdar l)))
(append (list (cdar l)) (put x y (cdr l))))
(T (put x y (cdr l)))))
((equal (caar l) y) (cons (cons x (car l)) (put x y (cdr l))))
(T (cons (car l) (put x y (cdr l))))
))

; puttable x
(defun puttable (x l)
(cond ((null l) nil)
((equal (caar l) x) (print 'puttable) (print x)
(cons (list x) (cons (cdar l) (cdr l))))
(T (cons (car l) (puttable x (cdr l))))
))

; ------------------------------------------------------------------
; main function
; -------------

(defun tower (l)
(cond ((is-tower l) l)
((and (exists-tower l)
(exist-free-neighbours l))
(tower (put (scndgreatest l) (greatest l) l)))
(T (tower (puttable (topof (greatest-no-tower l)) l)))
))

C.7 Water Jug Problems

In the following we present all problems used in experiments 1 and 2. Because the graphs of the problem structures are large, we give the equations and inequations (terms) instead. For each problem, a graph can be constructed by representing the terms as demonstrated in figure 11.3. The terms are integrated into a single structure by using each parameter (c_A, c_B, ..., q_A, q_B, ...) as a unique node.

Problems for Experiment 1

Initial Problem

Jug               A   B   C  (D)
Capacity         28  35  42   -
Initial quantity 12  21  26   -
Goal quantity    19   0  40   -

Operator sequence: pour(C,B), pour(B,A), pour(A,C), pour(B,A)
Relevant Procedural Information/Relevant Declarative Information (Constraints): see Source Problem


Source Problem

Jug               A   B   C  (D)
Capacity         36  45  54   -
Initial quantity 16  27  34   -
Goal quantity    25   0  52   -

Operator sequence: pour(C,B), pour(B,A), pour(A,C), pour(B,A)

Relevant Procedural Information

q_A(4) = q_A(0) + (c_A - q_A(0)) - c_A + (c_B - (c_A - q_A(0)))
q_B(4) = q_B(0) + (c_B - q_B(0)) - (c_A - q_A(0)) - (c_B - (c_A - q_A(0)))
q_C(4) = q_C(0) - (c_B - q_B(0)) + c_A

Relevant Declarative Information (Constraints)

c_A = 2(c_B - q_B(0))
c_B >= (c_A - q_A(0))
q_C(0) >= (c_B - q_B(0))
c_C < q_C(0) + c_A
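The operator sequences and equations can be checked mechanically. Below is a minimal Python sketch (an illustration, not part of the original material) of the usual water-jug `pour` semantics, i.e. pour until the source is empty or the target is full, applied to the Source Problem:

```python
def pour(q, cap, src, dst):
    # transfer from src into dst until src is empty or dst is full
    amount = min(q[src], cap[dst] - q[dst])
    q[src] -= amount
    q[dst] += amount

cap = {'A': 36, 'B': 45, 'C': 54}
q = {'A': 16, 'B': 27, 'C': 34}
for src, dst in [('C', 'B'), ('B', 'A'), ('A', 'C'), ('B', 'A')]:
    pour(q, cap, src, dst)
print(q)  # {'A': 25, 'B': 0, 'C': 52} -- the goal quantities
```

The same four-line loop, with the capacities and initial quantities exchanged, reproduces the goal quantities of the isomorphic problems below.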

Problem 1 (Isomorph/No Surface Change)

Jug               A   B   C  (D)
Capacity         48  60  72   -
Initial quantity 21  36  45   -
Goal quantity    33   0  69   -

Operator sequence: pour(C,B), pour(B,A), pour(A,C), pour(B,A)
Structural distance to source: d_S1 = 0
Relevant Procedural Information/Relevant Declarative Information (Constraints): see Source Problem

Problem 2 (Isomorph/Small Surface Change) (Source/Target Mapping: A/B, B/A, C/C)

Jug               A   B   C  (D)
Capacity         60  48  72   -
Initial quantity 36  21  45   -
Goal quantity     0  33  69   -

Operator sequence: pour(C,A), pour(A,B), pour(B,C), pour(A,B)
Structural distance to source: d_S2 = 0

Relevant Procedural Information

q_A(4) = q_A(0) + (c_A - q_A(0)) - (c_B - q_B(0)) - (c_A - (c_B - q_B(0)))
q_B(4) = q_B(0) + (c_B - q_B(0)) - c_B + (c_A - (c_B - q_B(0)))
q_C(4) = q_C(0) - (c_A - q_A(0)) + c_B

Relevant Declarative Information (Constraints)

c_B = 2(c_A - q_A(0))
c_A >= (c_B - q_B(0))
q_C(0) >= (c_A - q_A(0))
c_C < q_C(0) + c_B


Problem 3 (Isomorph/Large Surface Change) (Source/Target Mapping: A/B, B/C, C/A)

Jug               A   B   C  (D)
Capacity         72  48  60   -
Initial quantity 45  21  36   -
Goal quantity    69  33   0   -

Operator sequence: pour(A,C), pour(C,B), pour(B,A), pour(C,B)
Structural distance to source: d_S3 = 0

Relevant Procedural Information

q_A(4) = q_A(0) - (c_C - q_C(0)) + c_B
q_B(4) = q_B(0) + (c_B - q_B(0)) - c_B + (c_C - (c_B - q_B(0)))
q_C(4) = q_C(0) + (c_C - q_C(0)) - (c_B - q_B(0)) - (c_C - (c_B - q_B(0)))

Relevant Declarative Information (Constraints)

c_B = 2(c_C - q_C(0))
c_C >= (c_B - q_B(0))
q_A(0) >= (c_C - q_C(0))
c_A < q_A(0) + c_B

Problem 4 (Partial Isomorph/No Surface Change) (additional jug: A; Source/Target Mapping: A/B, B/C, C/D)

Jug               A   B   C   D
Capacity         16  20  25  31
Initial quantity  3   8  15  18
Goal quantity     0  13   3  28

Operator sequence: pour(D,C), pour(C,B), pour(B,D), pour(C,B), pour(A,C)
Structural distance to source: d_S4 = 0.37

Relevant Procedural Information

q_A(5) = q_A(0) - q_A(0)
q_B(5) = q_B(0) + (c_B - q_B(0)) - c_B + (c_C - (c_B - q_B(0)))
q_C(5) = q_C(0) + (c_C - q_C(0)) - (c_B - q_B(0)) - (c_C - (c_B - q_B(0))) + q_A(0)
q_D(5) = q_D(0) - (c_C - q_C(0)) + c_B

Relevant Declarative Information (Constraints)

c_B = 2(c_C - q_C(0))
c_C >= (c_B - q_B(0))
c_C >= q_A(0)
q_D(0) >= (c_C - q_C(0))
c_D < q_D(0) + c_B

Problem 5 (Partial Isomorph/Small Surface Change) (additional jug: A; Source/Target Mapping: A/C, B/B, C/D)

Jug               A   B   C   D
Capacity         16  25  20  31
Initial quantity  3  15   8  18
Goal quantity     0   3  13  28

Operator sequence: pour(D,B), pour(B,C), pour(C,D), pour(B,C), pour(A,B)
Structural distance to source: d_S5 = 0.37

Relevant Procedural Information

q_A(5) = q_A(0) - q_A(0)
q_C(5) = q_C(0) + (c_C - q_C(0)) - c_C + (c_B - (c_C - q_C(0)))
q_B(5) = q_B(0) + (c_B - q_B(0)) - (c_C - q_C(0)) - (c_B - (c_C - q_C(0))) + q_A(0)
q_D(5) = q_D(0) - (c_B - q_B(0)) + c_C

Relevant Declarative Information (Constraints)

c_C = 2(c_B - q_B(0))
c_B >= (c_C - q_C(0))
c_B >= q_A(0)
q_D(0) >= (c_B - q_B(0))
c_D < q_D(0) + c_C

Problems for Experiment 2

(Initial problem and source problem are identical to experiment 1.)

Problem 1 (Target Exhaustiveness) (Source/Target Mapping: A/A, B/B, C/C)

Jug               A   B   C  (D)
Capacity         48  60  72   -
Initial quantity 21  36  45   -
Goal quantity     0  33  69   -

Operator sequence: pour(C,B), pour(B,A), pour(A,C)
Structural distance to source: d_S1 = 0.16

Relevant Procedural Information

q_A(3) = q_A(0) + (c_A - q_A(0)) - c_A
q_B(3) = q_B(0) + (c_B - q_B(0)) - (c_A - q_A(0))
q_C(3) = q_C(0) - (c_B - q_B(0)) + c_A

Relevant Declarative Information (Constraints)

c_A = 2(c_B - q_B(0))
c_B >= (c_A - q_A(0))
q_C(0) >= (c_B - q_B(0))
c_C < q_C(0) + c_A

Problem 2 (Source Inclusiveness) (Source/Target Mapping: A/A, B/B, C/C)

Jug               A   B   C  (D)
Capacity         48  60  72   -
Initial quantity 21  36  45   -
Goal quantity    33  60   9   -

Operator sequence: pour(C,B), pour(B,A), pour(A,C), pour(B,A), pour(C,B)
Structural distance to source: d_S2 = 0.17

Relevant Procedural Information

q_A(5) = q_A(0) + (c_A - q_A(0)) - c_A + (c_B - (c_A - q_A(0)))
q_B(5) = q_B(0) + (c_B - q_B(0)) - (c_A - q_A(0)) - (c_B - (c_A - q_A(0))) + c_B
q_C(5) = q_C(0) - (c_B - q_B(0)) + c_A - c_B

Relevant Declarative Information (Constraints)

c_A = 2(c_B - q_B(0))
c_B >= (c_A - q_A(0))
q_C(0) >= (c_B - q_B(0))
q_C(0) < c_B
c_C < q_C(0) + c_A

Problem 3 ("High" Structural Overlap) (identical to problem 4 in experiment 1)

Problem 4 ("Medium" Structural Overlap) (additional jug: A; Source/Target Mapping: A/B, B/C, C/D)

Jug               A   B   C   D
Capacity         16  20  25  31
Initial quantity  6   9  15  18
Goal quantity    14  14   0  20

Operator sequence: pour(D,C), pour(C,B), pour(D,A), pour(B,D), pour(C,B)
Structural distance to source: d_S4 = 0.55

Relevant Procedural Information

q_A(5) = q_A(0) + (q_D(0) - (c_C - q_C(0)))
q_B(5) = q_B(0) + (c_B - q_B(0)) - c_B + (c_C - (c_B - q_B(0)))
q_C(5) = q_C(0) + (c_C - q_C(0)) - (c_B - q_B(0)) - (c_C - (c_B - q_B(0)))
q_D(5) = q_D(0) - (c_C - q_C(0)) - (q_D(0) - (c_C - q_C(0))) + c_B

Relevant Declarative Information (Constraints)

c_A < (q_A(0) + q_D(0))
q_A(0) < (c_C - q_C(0))
c_C > (c_B - q_B(0))
c_D < q_D(0) + c_B

Problem 5 ("Low" Structural Overlap) (additional jug: A; Source/Target Mapping: A/B, B/C, C/D)

Jug               A   B   C   D
Capacity         17  20  25  31
Initial quantity  7   9  15  18
Goal quantity    16   5   0  28

Operator sequence: pour(B,A), pour(D,C), pour(C,B), pour(B,D), pour(C,B)
Structural distance to source: d_S5 = 0.59

Relevant Procedural Information

q_A(5) = q_A(0) + q_B(0)
q_B(5) = q_B(0) - q_B(0) + c_B - c_B + (q_C(0) + (c_C - q_C(0)) - c_B)
q_C(5) = q_C(0) + (c_C - q_C(0)) - c_B - (q_C(0) + (c_C - q_C(0)) - c_B)
q_D(5) = q_D(0) - (c_C - q_C(0)) + c_B

Relevant Declarative Information (Constraints)

c_B = 2(c_C - q_C(0))
c_A >= q_A(0) + q_B(0)
c_C >= c_B
q_D(0) >= (c_C - q_C(0))
c_D < q_D(0) + c_B

C.8 Example RPSs

In the following, the ten example programs used in the empirical evaluation of retrieval in chapter 12 are given. Note that we only give the subprograms; the complete RPSs additionally contain a call of the according subprogram.

#S(sub
:NAME REVERT
:PARAMS (L1 L2)
:BODY (IF (NULL L1) L2 (REVERT (TL L1) (++ (HD L1) L2)))
)
#S(sub
:NAME SUM
:PARAMS (X)
:BODY (IF (EQ0 X) 0 (+ X (SUM (PRE X))))
)
#S(sub
:NAME CB
:PARAMS (X SIT)
:BODY (IF (CT X) SIT (PT (TO X) (CB (TO X) SIT)))
)
#S(sub
:NAME GGT
:PARAMS (X Y)
:BODY (IF (EQ0 Y) X (GGT Y (MOD X Y)))
)
#S(sub
:NAME MOD
:PARAMS (X Y)
:BODY (IF (< X Y) X (MOD (- X Y) Y))
)
#S(sub
:NAME FAC
:PARAMS (X)
:BODY (IF (EQ0 X) 1 (* X (FAC (PRE X))))
)
#S(sub
:NAME APPEND
:PARAMS (L1 L2)
:BODY (IF (NULL L1) L2 (++ (HD L1) (APPEND (TL L1) L2)))
)
#S(sub
:NAME MEMBER
:PARAMS (X L)
:BODY (IF (NULL L) F (IF (== X (HD L)) T (MEMBER X (TL L))))
)
#S(sub
:NAME LAH
:PARAMS (X)
:BODY
(IF (EQ0 X) 1
(IF (EQ0 (PRE X)) 1
(- (* (PRE (* 2 X)) (LAH (PRE X)))
(* (* (PRE X) (PRE (PRE X))) (LAH (PRE (PRE X))))
) ) )
)
#S(sub
:NAME BINOM
:PARAMS (X Y)
:BODY (IF (EQ0 X) 1
(IF (== X Y) 1 (+ (BINOM (PRE X) (PRE Y)) (BINOM (PRE X) Y)))
)
)
#S(sub
:NAME FIBO
:PARAMS (X)
:BODY
(IF (EQ0 X) 0
(IF (EQ0 (PRE X)) 1 (+ (FIBO (PRE X)) (FIBO (PRE (PRE X)))))
)
)
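Read with HD/TL as head and tail, ++ as cons, PRE as predecessor, and EQ0 as a zero test, the subprogram bodies are directly executable. For illustration, REVERT and FIBO transcribed into Python (a sketch, not part of the RPS formalism):

```python
def revert(l1, l2):
    # REVERT: accumulator-style list reversal
    return l2 if not l1 else revert(l1[1:], [l1[0]] + l2)

def fibo(x):
    # FIBO: the usual double recursion over the predecessor function
    if x == 0:
        return 0
    if x - 1 == 0:
        return 1
    return fibo(x - 1) + fibo(x - 2)

print(revert([1, 2, 3], []))  # [3, 2, 1]
print(fibo(7))                # 13
```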

Index

A 31, 59
θ-subsumption 144
absorption 146
abstraction 119, 286, 320
ACME 300
ACT 52, 279
action 16, 24, 29, 74
  minimal number 19
  selection 19
  semantics 75
ADL 20, 74, 75
airplane domain 85
analogical reasoning 281, 299
analogy 286
  adaptation 288, 299, 320, 328
  between-domain 286
  causal 292
  cognitive model 300
  derivational 288, 293, 302
  explanatory 292, 299
  first-order mapping 288
  generalization 287, 288
  inference 288
  internal 316
  mapping 286, 287
  mapping constraints 287
  problem solving 292, 296
  programming 294
  proportional 290
  re-description 290
  replay 288
  retrieval 287
  structural 289
  transfer 287, 299
  transformational 288, 302
  within-domain 286
anti-instance 322
  most specific 322
anti-unification 175, 202, 210, 289, 295
  algorithm 323
  first order 322
  higher order 322
  second order 295, 323
  sets of terms 176
anti-unificator 175
approximation chain 159
Aries 103
assertion 115, 122
attribute 286
automatic programming 101–104, 106, 125
automaton 132
  deterministic finite 136, 153, 167
axiom
  domain 45, 124, 277
  effect 25
  frame 25
  induction 113–115
background knowledge 108, 110, 129, 143, 149, 167
bias
  semantic 149
  syntactic 149
Blackbox 44
blocks-world 14, 17, 18, 20, 22, 26, 30, 36, 59, 71, 90
  clearblock 59, 116, 163, 167, 226, 239, 279
  complexity 33
  tower 63, 139, 266, 271
BMWk 163
bottom test 239, 243, 245, 262
breadth-first search 35, 46, 58
canonical combinatorial problem 44
case-based reasoning 286–288
CIP 118, 122, 124, 125
clause
  ij-determinate 150
  determinate 150
  recursive 152
closed world assumption 13, 15, 20, 25
cognition
  adaptive 278
cognitive psychology 49, 280
concept learning 127, 131, 278
constraint
  free variable 77, 78
constraint programming 93
Constraint Prolog 93
constraint satisfaction 44, 50, 75, 93
constructor 238
control knowledge
  rocket 251
  unstack 242
control rule
  completeness 243
  correctness 243
  learning 226, 231, 279
  recursive 53, 163, 226, 231
Curry-Howard isomorphism 112
cycle 30
Cypress 125
DAG 48, 58, 65, 256
data type 149, 237
  abstract 120, 122
  binary tree 120
  change 120
  inference 232, 233, 237
  list 238, 252
  pre-defined 277
  refinement 125
  sequence 238, 239
  set 238, 245
decision tree 46, 108, 130
Dedalus 115
deductive planning 20
deductive tableau 115
depth-first search 28, 58
  backtracking 28, 30
design pattern 331
design strategy 111
design tactic 125
Dijkstra-algorithm 55
DPlan 42, 48, 55, 57, 59, 65–67, 70, 279
  algorithm 57
  completeness 70
  goal 56
  operator 56
  planning problem 57
  soundness 70
  state 56
  termination 67
  universal plan 56, 58
DPlan, universal plan 59
effect 16
  context dependent 23
  mixed ADD/DEL
    update 90
  secondary 23
elevator domain 73
eureka 118, 119, 121, 123
evaluation
  expression 77, 78
  ground term 78
execution monitoring 46
expression simplification 124
factorial 200
  folding 224
  unfolding 200, 224
Fibonacci function 118, 141, 182, 202, 265
  folding 224
  unfolding 224
filter 251, 261, 264
finite differencing 124
fitness measure 137
fixpoint semantics 160, 186
fluents 16, 24, 74
Foil 150, 151
folding 119, 160, 171, 181, 186, 277, 279
  fault tolerance 224
formal language 132
FPlan
  assignment 79
  goal representation 80
  language 76
  performance gain 90, 91
  planning problem 81
  state representation 78
  user-defined functions 84
FPlan, matching 79
function approximation 158
  algorithms 131
  order 159
function merging 165
generalization
  analogy 287, 288
  bottom-up 145, 146, 151
  explanation-based 53, 167
  recursive 153, 280
  RPS 327
generalization-to-n 165
generate and test 110, 269
genetic operations 138
genetic programming 54, 110, 131, 136, 168, 272
goal 115
  ordering 39
  regression 37
goal regression 56
goals
  independence 39
  interleaving 39
Golem 150, 151
GPS 41, 49, 53
grammar
  context-free 136
  context-free tree 180
  regular 136
grammar inference 110, 126, 132, 152, 169
graph 286
graph distance 306
Graphplan 34, 42, 58, 73
hanoi domain 63, 74, 89
hat-axiom 117, 233
heuristic 50
  function 44, 48
  search 44, 278
hidden variable 216, 228
  identification 216
  identifying 217
higher-order function 251
HSP 44
HSPr 42
hypothesis 107, 130, 142, 143, 165
  complete 144
  consistent 144
  non-recursive 131
hypothesis language 109, 131
IAM 300
ID 136, 153, 167
identification
  by enumeration 134, 165
  in the limit 130, 132
  probably approximately correct 130
indexing 54
induction 116, 159, 278
  structural 115
induction bias 131
  syntactic 131
induction scheme 115
inductive functional programming 110, 131, 153, 168, 171
inductive logic programming 54, 110, 131, 142, 168
inference
  deductive 107, 109, 124
  inductive 107, 109, 126
information
  incomplete 45
  incorrect 45
initial instantiation 201, 207, 219
initial program 136, 181
  order 181
  recursive explanation 185
  segments 196
initial tree
  decomposition 206
  reduction 209
input/output examples 106, 110
  rewriting 153, 155
instantiation
  main program 219
  partial 38
IPP 44
Isabelle 117
KIDS 103, 124, 168, 295, 319
knowledge
  declarative 278
  procedural 278, 281
Lah-number 33, 325
language
  context-free tree 180
  generator 133
  learnability 134, 135
  primitive recursive 135
  regular 135, 136
  tester 133
learning 127
  algorithm 131
  analytical 167
  approximation-driven 131
  batch 128, 131
  control programs 53
  control rule 53
  data-driven 131
  decision list 53
  decision tree 53
  experience 127
  from observation 127
  incremental 128, 131
  language 132
  model-driven 131, 142
  policy 53
  simultaneous composite 267
  speed-up 53
  speed-up effects 279
  supervised 127, 133
  task 127
  unsupervised 127
  with informant 133
least general generalization 145
  relative 145, 151
linked clause 149
LISA 292, 300
Lisp 109, 110, 113, 114, 120, 124, 125, 131, 136, 153, 163, 165
Lisp tutor 103
list
  introduction 252
  order 252
list problem
  semantic 251, 254
  structural 251, 277
logistics domain 73, 249
machine learning 45, 50, 108, 126, 278
macro operator
  iterative 53
  linear 52, 279
map 251
match-select-apply cycle 28, 49
MBP 47
McCarthy-conditional 154
means-end analysis 49, 316
memory
  hierarchical 326
  linear 324
Merlin 152
meta-programming 118, 122, 125
metaphor 286
minimal spanning tree 67
  extraction 256, 257
MIPS 47
MIS 147, 151
ML 136
mode 149
modlist 178, 193, 203, 208, 226
mutex relation 43
network flow problem 42
NOAH 41
novice programmer 104
Nuprl 117
OBDD 46, 59
operator
  abstract 45
  effect
    ADD/DEL 16, 84
    update 84
  FPlan 80
  primitive 45
  Strips 16
operator application
  backward 19, 56
    completeness 69
    FPlan 81
    soundness 68
  conditional effect 24
  forward 17, 56
    FPlan 81
operator schemes 16
pattern
  first order 175
  maximal 175, 200, 202, 219
pattern matching 16, 22, 23
PDDL 17, 20, 27, 44, 56, 75
  conditional effect 21, 56
  equality constraints 21
  typing 21
perceptual chunk 280
plan 13, 18, 28
  as program 234
  extraction 29, 42
  levelwise decomposition 235
  linearization 29, 252, 259, 265
  optimal 19, 59
  optimality 35, 47
  partial 28
  partial order 42
  partially ordered 29, 42
  recursive 116
  reuse 54
  totally ordered 29, 42
  universal 232
plan construction 19, 20, 264
plan decomposition 233, 246
plan execution 20, 233
plan transformation 277
  completeness 235
  correctness 235
planner
  partial order 41
  special purpose 45
planning 163, 278
  backward 55
  competition 20
  compilation approach 44
  completeness 35
  conditional 46
  control knowledge 45, 52
  correctness 34
  deductive 41
  forward 29, 30, 44
  function application 73
  heuristic 73
  hierarchical 45, 50
  infinite domain 74
  least commitment 40, 42
  linear 35, 39
  logical inference 20
  non-linear 40, 42, 55
  NP-completeness 31, 58
  number problems 87, 278
  progression 29
  PSPACE-completeness 33, 46, 58
  reactive 46
  regression 36
  resource constraints 85
  resource variable 73
  soundness 34
  state-based 19, 55
  Strips 37, 49
  strong commitment 40
  termination 35
  total order 55
  types 45
  universal 44, 46, 55, 231, 277
planning graph 42
policy learning 46
position
  composition 173
  next right 199
  node 173
  order 173
  term 173
pre-image 46, 56
pre-planning analysis 45
precondition
  secondary 23
  standard 77
predicate construction 157
predicate invention 146, 152
prefix 175
problem decomposition 45
problem scheme 302
problem solving 48
  analogical 298, 299
  human 49, 54
  scheme 281
  strategy 279
problem solving constraints 302, 305
Prodigy 42, 53, 289, 293
production rule 49, 103
production system 49, 52, 278
program 321
  functional 108, 116, 118, 130, 136, 177
  morphism 295
  optimization 125, 160
  regular 164
  validation 102
  verification 101, 111, 113
program body 177
  maximal 201
  maximization 201
program optimization 124
program reuse 111, 319, 331
program scheme 111, 117, 168, 296, 319, 321
  binary search 295
  divide-and-conquer 124, 152
  extension of Summers' 163
  global search 124
  local search 124
  second order 149
  semi-regular 164
  Summers 154
program synthesis 46, 101, 104, 112, 113, 117, 277
  deductive 41, 101, 107, 109, 111, 112, 117, 168
  inductive 101, 107, 110, 111, 126, 131, 168
  knowledge 110
  schema-based 111
program transformation 109, 117, 125, 168, 319
program verification 103
programming
  abstract 120
  by analogy 111, 294
  by demonstration 165, 171
  declarative 106
programming tutor 103, 111
Prolog 20, 27, 110, 131, 143, 149, 152, 153, 167
proof method 50
proof planning 50
proof tactic 117
proof-as-program 115
Proust 103
PVS 103
QA3 24, 41, 113
re-planning 46
recurrence detection 153, 157
recurrence relation 157, 158
recursion
  introduction 123
  lists 114
recursion point 181, 193
  valid 195
recursive
  linear 117, 126, 154, 160, 165, 226, 243, 256, 264, 325
  tail 117, 126, 160, 254, 264, 325
  tree 325
recursive call
  variable substitution 210
recursive clause 143
recursive function 265
recursive program 101, 113–115, 136
recursive subprogram 177, 219
reduce 251
Refine 124
refinement 118, 124
  operator 147
  top-down 145, 146
reinforcement learning 46, 53, 54, 127, 278
relation
  first order 286
  n-ary 286
  second order 286
  unary 286
resolution 24, 27, 112, 115, 144
  inverse 146, 186
resource constraint 50
retrieval 320, 326
  algorithm 325
reverse 151, 160, 251
rewrite rule
  context-free 171
rocket 40, 52, 61, 66, 71, 246, 249, 253, 293
  recursive loading 249
  recursive unloading 249
RPS 177, 277, 279, 280, 320
  CFTG 181
  existence 221
  induction 172, 231
  interpretation 178
  language 179, 328, 329
  minimality 187
  restrictions 177, 187
  rewrite system 179
  substitution uniqueness 188
Rutherford analogy 286, 288, 292, 298
S-expression 154
  complete partial order 156
  form 156
SAT-solver 44
saturation 146
scheduling 49, 87
schema induction 299
search
  algorithm 27
  backward 48
  forward 48
  tree 28–30
segment indices 197
segmentation 181
  algorithm 197
  recurrent 192
  strategy 198
  valid 195
  valid recurrent 197, 219
selection sort 91
selector function 238, 256, 262, 280
sequence
  introduction 240
set
  introduction 246
sets of list problem 33, 271
signature 172
similarity
  attribute 289
  literal 286
  structural 287, 289
  superficial 287
single-source shortest-path algorithm 55
situation calculus 20, 24, 41, 107
situation variable 233
  introduction 233, 238
skeleton 193
skill
  acquisition 278
  cognitive 278
skill acquisition 52, 127
SME 292, 300
SOAR 53
Soar 279
software engineering 101, 102, 319
  computer-aided 102
  knowledge-based 102, 111, 125
solution 28
sorting 73, 91, 253, 277
  bubble-sort 167, 254
  domain 40, 61
  program 114
  selection sort 253, 254, 256
source inclusiveness 300
source program 320
specification 101–103, 121, 319
  as theorem 109
  by examples 106
  complete 104
  formal 104, 126
  incomplete 101, 106
  language 106
  natural language 104, 111
  non-executable 124
specification language 106
stack 30
STAN 44, 267
state description
  complete 37, 56
  consistent 38
  inconsistent 37, 38
  partial 37, 56
state invariant 51
state representation 15
  complete 48
  partial 37, 48
state space 17, 34, 70
state transformation 15
state transition 47
state variable 75
state-action rule 46
statics 16, 17, 23, 74
Strips 24, 27, 56
  domain 17
  Functional 75
  goal 16
  language 14, 44
  operator 16
  planner 41
  planning problem 17
  state 15
structural similarity 298, 308, 312, 316
structure mapping 292, 298, 320
  theory 298
sub-plan
  uniform 235
sub-schema
  equivalence 207
sub-schema position 193
  valid 195
sub-term 173
subprogram
  body 201
  calculation steps 220
  calculation strategy 221
  dependency relation 186
  equality 223
  language 179
substitution 16, 174, 181, 220
  algorithm 218
  consistency 216
  inverse 146
  necessary condition 215
  product 79
  testing recurrence 214
  testing uniqueness 213
  uniqueness 212
substitution term 181
  calculation 218
  characteristics 213
  context 329
subsumption 289
successor identification 239
Sussman anomaly 35, 39
symbolic model checking 44, 46
Synapse 152
synthesis problem 189
synthesis theorem 158
  basic 160
tactic 50
target exhaustiveness 300
target problem 320
term 172
  depth 149
  level 149
  order 175
  replacement 174
  substitution 322, 328
  subsumption 324
term algebra 163
term rewrite system 174
term rewriting 175
theorem prover 25
  Boyer-Moore 50
theorem proving 37, 41, 50
  constructive 112, 113, 115, 125, 168
  drawbacks 114
Thesys 153
TIM 52, 152
Tinker 165
Tower of Hanoi 33, 63, 127, 128, 228, 266, 273, 280, 302, 303
trace 163
traces 106, 110, 153, 154, 163, 165, 171
  user-generated 136
training examples 127
  positive/negative 143
transformation
  correctness 121, 122
  distance measure 289
  lateral 117
  plan to program 231
  vertical 117, 319
transformation rule 109, 115, 121, 122
transformational semantics 122
tree
  regular 257
  regularization 259
type inference 51
UMOP 47
unfolding 119, 136, 160, 165, 184, 320, 321
  depth 152
  indices 182
unification
  higher order 322
universal plan 65
  full 66
  optimal 66
  order over states 65
unpack 154, 157
unstack 239
update 74, 77, 80
  backward 83
user-modeling 103
utility problem 52
variable addition 160
variable instantiation 178, 220
  maximal pattern 202
  segment 214
variables
  range 78
  sufficient instances 215
Warshall-algorithm 123
water jug domain 87, 303