Inductive Synthesis of Functional Programs
Learning Domain-Specific Control Rules and Abstract Schemes
Ute Schmid
Summary.
In this book a novel approach to inductive synthesis of recursive functions is proposed, combining universal planning, folding of finite programs, and schema abstraction by analogical reasoning. In a first step, an example domain of small complexity is explored by universal planning. For example, for all possible lists over four fixed natural numbers, their optimal transformation sequences into the sorted list are calculated and represented as a DAG. In a second step, the plan is transformed into a finite program term. Plan transformation mainly relies on inferring the data type underlying a given plan. In a third step, the finite program is folded into (a set of) recursive functions. Folding is performed by syntactic pattern matching and corresponds to inducing a special class of context-free tree grammars. It is shown that the approach can be successfully applied to learn domain-specific control rules. Control rule learning is important to gain the efficiency of domain-specific planners without the need to hand-code the domain-specific knowledge. Furthermore, an extension of planning based on a purely relational domain description language to function application is presented. This extension makes planning applicable to a larger class of problems which are of interest for program synthesis.
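The step from purely relational operators to function application can be pictured with a minimal sketch. This is an illustrative Python fragment, not the book's FPlan syntax: the operator name, the dictionary state encoding, and the jug capacities are assumptions made for the example.

```python
# Sketch: a planning operator with a functional (numeric) effect.
# Instead of relational ADD/DEL lists, the operator updates state
# variables by computing new values from the current state.

def pour(state, src, dst, capacity):
    """Pour water from jug `src` into jug `dst` with the given capacity."""
    amount = min(state[src], capacity - state[dst])  # how much fits into dst
    new_state = dict(state)                          # keep states immutable
    new_state[src] -= amount                         # functional update of src
    new_state[dst] += amount                         # functional update of dst
    return new_state

state = {"jug3": 3, "jug5": 0}
print(pour(state, "jug3", "jug5", capacity=5))  # {'jug3': 0, 'jug5': 3}
```

Such numeric updates cannot be expressed directly with ADD/DEL effects over ground literals, which is why a function-application extension enlarges the class of problems a planner can handle.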
As a last step, a hierarchy of program schemes (patterns) is generated by generalizing over already synthesized recursive functions. Generalization can be considered as the last step of problem solving by analogy or programming by analogy. Some psychological experiments were performed to investigate which kind of structural relations between problems can be exploited by human problem solvers. Anti-unification is presented as an approach to mapping and generalizing program structures. It is proposed that the integration of planning, program synthesis, and analogical reasoning contributes to cognitive science research on skill acquisition by addressing the problem of extracting generalized rules from some initial experience. Such (control) rules represent domain-specific problem solving strategies.
All parts of the approach are implemented in Common Lisp.
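The core idea of anti-unification can be illustrated with a minimal first-order sketch. The book develops a restricted second-order variant implemented in Common Lisp; the Python term encoding (nested tuples) and variable naming below are assumptions made for this illustration only.

```python
# Sketch: first-order anti-unification. The least general generalization
# of two terms keeps their common structure and replaces each mismatching
# subterm pair by a variable (the same pair always gets the same variable).

def anti_unify(s, t, subst=None):
    if subst is None:
        subst = {}
    if s == t:
        return s, subst
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        # same head symbol and arity: anti-unify the arguments position-wise
        args = [anti_unify(a, b, subst)[0] for a, b in zip(s[1:], t[1:])]
        return (s[0], *args), subst
    # mismatch: map the pair (s, t) consistently to one fresh variable
    var = subst.setdefault((s, t), f"X{len(subst)}")
    return var, subst

schema, _ = anti_unify(("f", "a", ("g", "b")), ("f", "c", ("g", "d")))
print(schema)  # ('f', 'X0', ('g', 'X1'))
```

Generalizing over two already synthesized programs in this way yields a common schema from which both can be recovered by substitution.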
Keywords: Inductive Program Synthesis, Recursive Functions, Universal Planning, Function Application in Planning, Plan Transformation, Data Type Inference, Folding of Program Terms, Context-Free Tree Grammars, Control Rule Learning, Strategy Learning, Analogical Reasoning, Anti-Unification, Scheme Abstraction
IV
Acknowledgment. Research is a kind of work one can never do completely alone. Over the years I profited from the guidance of several professors who supervised or supported my work, namely Fritz Wysotzki, Klaus Eyferth, Bernd Mahr, Jaime Carbonell, Arnold Upmeyer, Peter Pepper, and Gerhard Strube. I learned a lot from discussions with them, with colleagues, and students, such as Jochen Burghardt, Bruce Burns, the group of Hartmut Ehrig, Pierre Flener, Hector Geffner, Peter Geibel, Peter Gerjets, Jürgen Giesl, Wolfgang Grieskamp, Maritta Heisel, Ralf Herbrich, Laurie Hiyakumoto, Petra Hofstedt, Rune Jensen, Emanuel Kitzelmann, Jana Koehler, Steffen Lange, Martin Mühlpfordt, Brigitte Pientka, Heike Pisch, Manuela Veloso, Ulrich Wagner, Bernhard Wolf, Thomas Zeugmann (sorry to everyone I forgot). I am very grateful for the time I could spend at Carnegie Mellon University. Thanks to Klaus Eyferth and Bernd Mahr who motivated me to go, to Gerhard Strube for his support, to Fritz Wysotzki who accepted my absence from teaching, and, of course, to Jaime Carbonell who was my very helpful host. My work profited much from the inspiration I got from talks, classes, and discussions, and from the very special atmosphere suggesting that everything is all right as long as "the heart is in the work". I thank all my diploma students who supported the work reported in this book: Dirk Matzke, René Mercy, Martin Mühlpfordt, Marina Müller, Mark Müller, Heike Pisch, Knut Polkehn, Uwe Sinha, Imre Szabo, Janin Toussaint, Ulrich Wagner, Joachim Wirth, and Bernhard Wolf. Additional thanks to some of them and Peter Pollmanns for proof-reading parts of the draft of this book. I owe a lot to Fritz Wysotzki for giving me the chance to move from cognitive psychology to artificial intelligence, for many interesting discussions and for critically reading and commenting on the draft of this book. Finally, thanks to my colleagues and friends Berry Claus, Robin Hörnig, Barbara Kaup, and Martin Kindsmüller, to my family, and my husband Uwe Konerding for support and high-quality leisure time, and to all authors of good crime novels.
Table of Contents
Acknowledgments . . . . . . . IV
1. Introduction . . . . . . . 1
Part I. Planning
2. State-Based Planning . . . . . . . 13
2.1 Standard Strips . . . . . . . 13
2.1.1 A Blocks-World Example . . . . . . . 14
2.1.2 Basic Definitions . . . . . . . 14
2.1.3 Backward Operator Application . . . . . . . 18
2.2 Extensions and Alternatives to Strips . . . . . . . 20
2.2.1 The Planning Domain Definition Language . . . . . . . 20
2.2.1.1 Typing and Equality Constraints . . . . . . . 22
2.2.1.2 Conditional Effects . . . . . . . 23
2.2.2 Situation Calculus . . . . . . . 24
2.3 Basic Planning Algorithms . . . . . . . 27
2.3.1 Informal Introduction of Basic Concepts . . . . . . . 28
2.3.2 Forward Planning . . . . . . . 29
2.3.3 Formal Properties of Planning . . . . . . . 31
2.3.3.1 Complexity of Planning . . . . . . . 31
2.3.3.2 Termination, Soundness, Completeness . . . . . . . 34
2.3.3.3 Optimality . . . . . . . 35
2.3.4 Backward Planning . . . . . . . 35
2.3.4.1 Goal Regression . . . . . . . 37
2.3.4.2 Incompleteness of Linear Planning . . . . . . . 39
2.4 Planning Systems . . . . . . . 41
2.4.1 Classical Approaches . . . . . . . 41
2.4.1.1 Strips Planning . . . . . . . 41
2.4.1.2 Deductive Planning . . . . . . . 41
2.4.1.3 Partial Order Planning . . . . . . . 41
2.4.1.4 Total Order Non-Linear Planning . . . . . . . 42
2.4.2 Current Approaches . . . . . . . 42
2.4.2.1 Graphplan and Derivatives . . . . . . . 42
2.4.2.2 Forward Planning Revisited . . . . . . . 44
2.4.3 Complex Domains and Uncertain Environments . . . . . . . 44
2.4.3.1 Including Domain Knowledge . . . . . . . 44
2.4.3.2 Planning for Non-Deterministic Domains . . . . . . . 45
2.4.4 Universal Planning . . . . . . . 46
2.4.5 Planning and Related Fields . . . . . . . 48
2.4.5.1 Planning and Problem Solving . . . . . . . 48
2.4.5.2 Planning and Scheduling . . . . . . . 49
2.4.5.3 Proof Planning . . . . . . . 50
2.4.6 Planning Literature . . . . . . . 51
2.5 Automatic Knowledge Acquisition for Planning . . . . . . . 51
2.5.1 Pre-Planning Analysis . . . . . . . 51
2.5.2 Planning and Learning . . . . . . . 52
2.5.2.1 Linear Macro-Operators . . . . . . . 52
2.5.2.2 Learning Control Rules . . . . . . . 53
2.5.2.3 Learning Control Programs . . . . . . . 53
3. Constructing Complete Sets of Optimal Plans . . . . . . . 55
3.1 Introduction to DPlan . . . . . . . 55
3.1.1 DPlan Planning Language . . . . . . . 56
3.1.2 DPlan Algorithm . . . . . . . 57
3.1.3 Efficiency Concerns . . . . . . . 58
3.1.4 Example Problems . . . . . . . 59
3.1.4.1 The Clearblock Problem . . . . . . . 59
3.1.4.2 The Rocket Problem . . . . . . . 61
3.1.4.3 The Sorting Problem . . . . . . . 61
3.1.4.4 The Hanoi Problem . . . . . . . 63
3.1.4.5 The Tower Problem . . . . . . . 63
3.2 Optimal Full Universal Plans . . . . . . . 65
3.3 Termination, Soundness, Completeness . . . . . . . 67
3.3.1 Termination of DPlan . . . . . . . 67
3.3.2 Operator Restrictions . . . . . . . 68
3.3.3 Soundness and Completeness of DPlan . . . . . . . 70
4. Integrating Function Application in Planning . . . . . . . 73
4.1 Motivation . . . . . . . 73
4.2 Extending Strips to Function Applications . . . . . . . 76
4.3 Extensions of FPlan . . . . . . . 81
4.3.1 Backward Operator Application . . . . . . . 81
4.3.2 Introducing User-Defined Functions . . . . . . . 84
4.4 Examples . . . . . . . 84
4.4.1 Planning with Resource Variables . . . . . . . 85
4.4.2 Planning for Numerical Problems . . . . . . . 87
4.4.3 Functional Planning for Standard Problems . . . . . . . 89
4.4.4 Mixing ADD/DEL Effects and Updates . . . . . . . 90
4.4.5 Planning for Programming Problems . . . . . . . 91
4.4.6 Constraint Satisfaction and Planning . . . . . . . 93
5. Conclusions and Further Research . . . . . . . 95
5.1 Comparing DPlan with the State of the Art . . . . . . . 95
5.2 Extensions of DPlan . . . . . . . 96
5.3 Universal Planning versus Incremental Exploration . . . . . . . 97
Part II. Inductive Program Synthesis
6. Automatic Programming . . . . . . . 101
6.1 Overview of Automatic Programming Research . . . . . . . 102
6.1.1 AI and Software Engineering . . . . . . . 102
6.1.1.1 Knowledge-Based Software Engineering . . . . . . . 102
6.1.1.2 Programming Tutors . . . . . . . 103
6.1.2 Approaches to Program Synthesis . . . . . . . 104
6.1.2.1 Methods of Program Specification . . . . . . . 104
6.1.2.2 Synthesis Methods . . . . . . . 107
6.1.3 Pointers to Literature . . . . . . . 111
6.2 Deductive Approaches . . . . . . . 112
6.2.1 Constructive Theorem Proving . . . . . . . 112
6.2.1.1 Program Synthesis by Resolution . . . . . . . 113
6.2.1.2 Planning and Program Synthesis with Deductive Tableaus . . . . . . . 115
6.2.1.3 Further Approaches . . . . . . . 117
6.2.2 Program Transformation . . . . . . . 117
6.2.2.1 Transformational Implementation: The Fold/Unfold Mechanism . . . . . . . 118
6.2.2.2 Meta-Programming: The CIP Approach . . . . . . . 122
6.2.2.3 Program Synthesis by Stepwise Refinement: KIDS . . . . . . . 124
6.2.2.4 Concluding Remarks . . . . . . . 125
6.3 Inductive Approaches . . . . . . . 126
6.3.1 Foundations of Induction . . . . . . . 126
6.3.1.1 Basic Concepts of Inductive Learning . . . . . . . 127
6.3.1.2 Grammar Inference . . . . . . . 132
6.3.2 Genetic Programming . . . . . . . 136
6.3.2.1 Basic Concepts . . . . . . . 136
6.3.2.2 Iteration and Recursion . . . . . . . 139
6.3.2.3 Evaluation of Genetic Programming . . . . . . . 142
6.3.3 Inductive Logic Programming . . . . . . . 142
6.3.3.1 Basic Concepts . . . . . . . 143
6.3.3.2 Basic Learning Techniques . . . . . . . 145
6.3.3.3 Declarative Biases . . . . . . . 148
6.3.3.4 Learning Recursive Prolog Programs . . . . . . . 151
6.3.4 Inductive Functional Programming . . . . . . . 153
6.3.4.1 Summers' Recurrence Relation Detection Approach . . . . . . . 153
6.3.4.2 Extensions of Summers' Approach . . . . . . . 161
6.3.4.3 Biermann's Function Merging Approach . . . . . . . 163
6.3.4.4 Generalization to N and Programming by Demonstration . . . . . . . 165
6.4 Final Comments . . . . . . . 168
6.4.1 Inductive versus Deductive Synthesis . . . . . . . 168
6.4.2 Inductive Functional versus Logic Programming . . . . . . . 169
7. Folding of Finite Program Terms . . . . . . . 171
7.1 Terminology and Basic Concepts . . . . . . . 172
7.1.1 Terms and Term Rewriting . . . . . . . 172
7.1.2 Patterns and Anti-Unification . . . . . . . 175
7.1.3 Recursive Program Schemes . . . . . . . 176
7.1.3.1 Basic Definitions . . . . . . . 176
7.1.3.2 RPSs as Term Rewrite Systems . . . . . . . 178
7.1.3.3 RPSs as Context-Free Tree Grammars . . . . . . . 180
7.1.3.4 Initial Programs and Unfoldings . . . . . . . 181
7.2 Synthesis of RPSs from Initial Programs . . . . . . . 186
7.2.1 Folding and Fixpoint Semantics . . . . . . . 186
7.2.2 Characteristics of RPSs . . . . . . . 186
7.2.3 The Synthesis Problem . . . . . . . 189
7.3 Solving the Synthesis Problem . . . . . . . 189
7.3.1 Constructing Segmentations . . . . . . . 190
7.3.1.1 Segmentation for `Mod' . . . . . . . 190
7.3.1.2 Segmentation and Skeleton . . . . . . . 192
7.3.1.3 Valid Hypotheses . . . . . . . 195
7.3.1.4 Algorithm . . . . . . . 197
7.3.2 Constructing a Program Body . . . . . . . 199
7.3.2.1 Separating Program Body and Instantiated Parameters . . . . . . . 200
7.3.2.2 Construction of a Maximal Pattern . . . . . . . 201
7.3.2.3 Algorithm . . . . . . . 202
7.3.3 Dealing with Further Subprograms . . . . . . . 202
7.3.3.1 Inducing Two Subprograms for `ModList' . . . . . . . 203
7.3.3.2 Decomposition of Initial Trees . . . . . . . 206
7.3.3.3 Equivalence of Sub-Schemata . . . . . . . 207
7.3.3.4 Reduction of Initial Trees . . . . . . . 209
7.3.4 Finding Parameter Substitutions . . . . . . . 210
7.3.4.1 Finding Substitutions for `Mod' . . . . . . . 211
7.3.4.2 Basic Considerations . . . . . . . 212
7.3.4.3 Detecting Hidden Variables . . . . . . . 216
7.3.4.4 Algorithm . . . . . . . 218
7.3.5 Constructing an RPS . . . . . . . 219
7.3.5.1 Parameter Instantiations for the Main Program . . . . . . . 219
7.3.5.2 Putting All Parts Together . . . . . . . 219
7.3.5.3 Existence of an RPS . . . . . . . 221
7.3.5.4 Extension: Equality of Subprograms . . . . . . . 223
7.3.5.5 Fault Tolerance . . . . . . . 224
7.4 Example Problems . . . . . . . 224
7.4.1 Time Effort of Folding . . . . . . . 224
7.4.2 Recursive Control Rules . . . . . . . 226
8. Transforming Plans into Finite Programs . . . . . . . 231
8.1 Overview of Plan Transformation . . . . . . . 232
8.1.1 Universal Plans . . . . . . . 232
8.1.2 Introducing Data Types and Situation Variables . . . . . . . 232
8.1.3 Components of Plan Transformation . . . . . . . 233
8.1.4 Plans as Programs . . . . . . . 233
8.1.5 Completeness and Correctness . . . . . . . 235
8.2 Transformation and Type Inference . . . . . . . 235
8.2.1 Plan Decomposition . . . . . . . 235
8.2.2 Data Type Inference . . . . . . . 237
8.2.3 Introducing Situation Variables . . . . . . . 238
8.3 Plans over Sequences of Objects . . . . . . . 239
8.4 Plans over Sets of Objects . . . . . . . 243
8.5 Plans over Lists of Objects . . . . . . . 251
8.5.1 Structural and Semantic List Problems . . . . . . . 251
8.5.2 Synthesizing `Selection-Sort' . . . . . . . 253
8.5.2.1 A Plan for Sorting Lists . . . . . . . 253
8.5.2.2 Different Realizations of `Selection Sort' . . . . . . . 254
8.5.2.3 Inferring the Function-Skeleton . . . . . . . 256
8.5.2.4 Inferring the Selector Function . . . . . . . 259
8.5.3 Concluding Remarks on List Problems . . . . . . . 262
8.6 Plans over Complex Data Types . . . . . . . 265
8.6.1 Variants of Complex Finite Programs . . . . . . . 265
8.6.2 The `Tower' Domain . . . . . . . 266
8.6.2.1 A Plan for Three Blocks . . . . . . . 266
8.6.2.2 Elemental to Composite Learning . . . . . . . 267
8.6.2.3 Simultaneous Composite Learning . . . . . . . 267
8.6.2.4 Set of Lists . . . . . . . 271
8.6.2.5 Concluding Remarks on `Tower' . . . . . . . 272
8.6.3 Tower of Hanoi . . . . . . . 273
9. Conclusions and Further Research . . . . . . . 277
9.1 Combining Planning and Program Synthesis . . . . . . . 277
9.2 Acquisition of Problem Solving Strategies . . . . . . . 278
9.2.1 Learning in Problem Solving and Planning . . . . . . . 278
9.2.2 Three Levels of Learning . . . . . . . 279
Part III. Schema Abstraction
10. Analogical Reasoning and Generalization . . . . . . . 285
10.1 Analogical and Case-Based Reasoning . . . . . . . 285
10.1.1 Characteristics of Analogy . . . . . . . 285
10.1.2 Sub-Processes of Analogical Reasoning . . . . . . . 287
10.1.3 Transformational versus Derivational Analogy . . . . . . . 288
10.1.4 Quantitative and Qualitative Similarity . . . . . . . 289
10.2 Mapping Simple Relations or Complex Structures . . . . . . . 290
10.2.1 Proportional Analogies . . . . . . . 290
10.2.2 Causal Analogies . . . . . . . 292
10.2.3 Problem Solving and Planning by Analogy . . . . . . . 292
10.3 Programming by Analogy . . . . . . . 294
10.4 Pointers to Literature . . . . . . . 296
11. Structural Similarity in Analogical Transfer . . . . . . . 297
11.1 Analogical Problem Solving . . . . . . . 297
11.1.1 Mapping and Transfer . . . . . . . 298
11.1.2 Transfer of Non-Isomorphic Source Problems . . . . . . . 299
11.1.3 Structural Representation of Problems . . . . . . . 300
11.1.4 Non-Isomorphic Variants in a Water Redistribution Domain . . . . . . . 303
11.1.5 Measurement of Structural Overlap . . . . . . . 306
11.2 Experiment 1 . . . . . . . 307
11.2.1 Method . . . . . . . 308
11.2.2 Results and Discussion . . . . . . . 310
11.2.2.1 Isomorphic Structure/Change in Surface . . . . . . . 311
11.2.2.2 Change in Structure/Stable Surface . . . . . . . 311
11.2.2.3 Change in Structure/Change in Surface . . . . . . . 311
11.3 Experiment 2 . . . . . . . 312
11.3.1 Method . . . . . . . 313
11.3.2 Results and Discussion . . . . . . . 314
11.3.2.1 Type of Structural Relation . . . . . . . 314
11.3.2.2 Degree of Structural Overlap . . . . . . . 315
11.4 General Discussion . . . . . . . 315
12. Programming by Analogy . . . . . . . 319
12.1 Program Reuse and Program Schemes . . . . . . . 319
12.2 Restricted 2nd-Order Anti-Unification . . . . . . . 320
12.2.1 Recursive Program Schemes Revisited . . . . . . . 320
12.2.2 Anti-Unification of Program Terms . . . . . . . 322
12.3 Retrieval Using Term Subsumption . . . . . . . 324
12.3.1 Term Subsumption . . . . . . . 324
12.3.2 Empirical Evaluation . . . . . . . 325
12.3.3 Retrieval from Hierarchical Memory . . . . . . . 326
12.4 Generalizing Program Schemes . . . . . . . 327
12.5 Adaptation of Program Schemes . . . . . . . 328
13. Conclusions and Further Research . . . . . . . 331
13.1 Learning and Applying Abstract Schemes . . . . . . . 331
13.2 A Framework for Learning from Problem Solving . . . . . . . 332
13.3 Application Perspective . . . . . . . 333
Bibliography . . . . . . . 335
Appendices
A. Implementation Details . . . . . . . 355
A.1 Short History of DPlan . . . . . . . 355
A.2 Modules of DPlan . . . . . . . 357
A.3 DPlan Specifications . . . . . . . 357
A.4 Development of Folding Algorithms . . . . . . . 359
A.5 Modules of TFold . . . . . . . 360
A.6 Time Effort of Folding . . . . . . . 361
A.7 Main Components of Plan-Transformation . . . . . . . 361
A.8 Plan Decomposition . . . . . . . 362
A.9 Introduction of Situation Variables . . . . . . . 363
A.10 Number of MSTs in a DAG . . . . . . . 363
A.11 Extracting Minimal Spanning Trees from a DAG . . . . . . . 364
A.12 Regularizing a Tree . . . . . . . 365
A.13 Programming by Analogy Algorithms . . . . . . . 367
B. Concepts and Proofs . . . . . . . 371
B.1 Fixpoint Semantics . . . . . . . 371
B.2 Proof: Maximal Subprogram Body . . . . . . . 374
B.3 Proof: Uniqueness of Substitutions . . . . . . . 379
C. Sample Programs and Problems . . . . . . . 383
C.1 Fibonacci with Sequence Referencing Function . . . . . . . 383
C.2 Inducing `Reverse' with Golem . . . . . . . 384
C.3 Finite Program for `Unstack' . . . . . . . 387
C.4 Recursive Control Rules for the `Rocket' Domain . . . . . . . 388
C.5 The `Selection Sort' Domain . . . . . . . 390
C.6 Recursive Control Rules for the `Tower' Domain . . . . . . . 390
C.7 Water Jug Problems . . . . . . . 398
C.8 Example RPSs . . . . . . . 403
List of Figures
1.1 Analogical Problem Solving and Learning . . . . . . . 5
1.2 Main Components of the Synthesis System . . . . . . . 6
2.1 A Simple Blocks-World . . . . . . . 14
2.2 A Strips Planning Problem in the Blocks-World . . . . . . . 18
2.3 An Alternative Representation of the Blocks-World . . . . . . . 19
2.4 Representation of a Blocks-World Problem in PDDL-Strips . . . . . . . 21
2.5 Blocks-World Domain with Equality Constraints and Conditioned Effects . . . . . . . 22
2.6 Representation of a Blocks-World Problem in Situation Calculus . . . . . . . 26
2.7 A Forward Search Tree for Blocks-World . . . . . . . 32
2.8 A Backward Search Tree for Blocks-World . . . . . . . 36
2.9 Goal-Regression for Blocks-World . . . . . . . 38
2.10 The Sussman Anomaly . . . . . . . 39
2.11 Part of a Planning Graph as Constructed by Graphplan . . . . . . . 43
2.12 Representation of the Boolean Formula f(x1, x2) = x1 ∧ x2 as OBDD . . . . . . . 48
3.1 The Clearblock DPlan Problem . . . . . . . 60
3.2 DPlan Plan for Clearblock . . . . . . . 60
3.3 Clearblock with a Set of Goal States . . . . . . . 60
3.4 The DPlan Rocket Problem . . . . . . . 61
3.5 Universal Plan for Rocket . . . . . . . 62
3.6 The DPlan Sorting Problem . . . . . . . 62
3.7 Universal Plan for Sorting . . . . . . . 63
3.8 The DPlan Hanoi Problem . . . . . . . 64
3.9 Universal Plan for Hanoi . . . . . . . 64
3.10 Universal Plan for Tower . . . . . . . 65
3.11 Minimal Spanning Tree for Rocket . . . . . . . 67
4.1 Tower of Hanoi (a) Without and (b) With Function Application . . . . . . . 75
4.2 Tower of Hanoi in Functional Strips (Geffner, 1999) . . . . . . . 76
4.3 A Plan for Tower of Hanoi . . . . . . . 82
4.4 Tower of Hanoi with User-Defined Functions . . . . . . . 85
4.5 A Problem Specification for the Airplane Domain . . . . . . . 85
4.6 Specification of the Inverse Operator fly⁻¹ for the Airplane Domain . . . . . . . 87
4.7 A Problem Spe i ation for the Water Jug Domain . . . . . . . . . . . . . 88
4.8 A Plan for the Water Jug Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
4.9 Blo ks-World Operators with Indire t Referen e and Update . . . . . 91
4.10 Spe i ation of Sele tion Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
4.11 Lightmeal in Constraint Prolog . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
4.12 Problem Spe i ation for Lightmeal . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
6.1 Programs Represent Con epts and Skills . . . . . . . . . . . . . . . . . . . . . . 128
6.2 Constru tion of a Simple Arithmeti Fun tion (a) and an Even-2-
Parity Fun tion (b) Represented as a Labeled Tree with Ordered
Bran hes (Koza, 1992, gs. 6.1, 6.2) . . . . . . . . . . . . . . . . . . . . . . . . . . 137
6.3 A Possible Initial State, an Intermediate State, and the Goal State
for Blo k Sta king (Koza, 1992, gs. 18.1, 18.2) . . . . . . . . . . . . . . . . 140
6.4 Resulting Programs for the Blo k Sta king Problem (Koza, 1992,
hap. 18.1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
6.5 θ-Subsumption Equivalence and Reduced Clauses . . . . . . . . . . . . . . . 144
6.6 θ-Subsumption Lattice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
6.7 An Inverse Linear Derivation Tree (Lavrač and Džeroski, 1994,
pp. 46) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
6.8 Part of a Refinement Graph (Lavrač and Džeroski, 1994, p. 56) . . 148
6.9 Specifying Modes and Types for Predicates . . . . . . . . . . . . . . . . . . . . 150
6.10 Learning Function unpack from Examples . . . . . . . . . . . . . . . . . . . . . 154
6.11 Traces for the unpack Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
6.12 Result of the First Synthesis Step for unpack . . . . . . . . . . . . . . . . . . . 157
6.13 Recurrence Relation for unpack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
6.14 Traces for the reverse Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
6.15 Synthesis of a Regular Lisp Program . . . . . . . . . . . . . . . . . . . . . . . . . . 166
6.16 Recursion Formation with Tinker . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
7.1 Example Term with Exemplary Positions of Sub-terms . . . . . . . . . . 174
7.2 Example First Order Pattern . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
7.3 Anti-Unification of Two Terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
7.4 Examples for Terms Belonging to the Language of an RPS and of
a Subprogram of an RPS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
7.5 Unfolding Positions in the Third Unfolding of Fibonacci . . . . . . . . . 183
7.6 Valid Recurrent Segmentation of Mod . . . . . . . . . . . . . . . . . . . . . . . . . 191
7.7 Initial Program for ModList . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
7.8 Identifying Two Recursive Subprograms in the Initial Program for
ModList . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
7.9 Inferring a Sub-Program Scheme for ModList . . . . . . . . . . . . . . . . . . . 205
7.10 The Reduced Initial Tree of ModList . . . . . . . . . . . . . . . . . . . . . . . . . . 206
7.11 Substitutions for Mod . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
7.12 Steps for Calculating a Subprogram . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
7.13 Overview of Inducing an RPS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
7.14 Time Effort for Unfolding/Folding Factorial . . . . . . . . . . . . . . . . . . . 225
7.15 Time Effort for Unfolding/Folding Fibonacci . . . . . . . . . . . . . . . . . . . 225
7.16 Time Effort Calculating Valid Recurrent Segmentations and Sub-
stitutions for Factorial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
7.17 Initial Tree for Clearblock . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
7.18 Initial Tree for Tower of Hanoi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
8.1 Induction of Recursive Functions from Plans . . . . . . . . . . . . . . . . . . . 231
8.2 Examples of Uniform Sub-Plans . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
8.3 Uniform Plans as Subgraphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
8.4 Generating the Successor-Function for a Sequence . . . . . . . . . . . . . . 241
8.5 The Unstack Domain and Plan . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
8.6 Protocol for Unstack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
8.7 Introduction of Data Type Sequence in Unstack . . . . . . . . . . . . . . . . 243
8.8 LISP-Program for Unstack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
8.9 Partial Order of Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
8.10 Functions Inferred/Provided for Set . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
8.11 Sub-Plans of Rocket . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
8.12 Introduction of the Data Type Set (a) and Resulting Finite Pro-
gram (b) for the Unload-All Sub-Plan of Rocket (Ω denotes "un-
defined") . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
8.13 Protocol of Transforming the Rocket Plan . . . . . . . . . . . . . . . . . . . . . 250
8.14 Partial Order (a) and Total Order (b) of Flat Lists over Numbers . 253
8.15 A Minimal Spanning Tree Extracted from the SelSort Plan . . . . . . 258
8.16 The Regularized Tree for SelSort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
8.17 Introduction of a "Semantic" Selector Function in the Regularized
Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
8.18 LISP-Program for SelSort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
8.19 Abstract Form of the Universal Plan for the Four-Block Tower . . . 269
9.1 Three Levels of Generalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280
10.1 Mapping of Base and Target Domain . . . . . . . . . . . . . . . . . . . . . . . . . 287
10.2 Example for a Geometric-Analogy Problem (Evans, 1968, p. 333) . 291
10.3 Context Dependent Descriptions in Proportional Analogy (O'Hara,
1992) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
10.4 The Rutherford Analogy (Gentner, 1983) . . . . . . . . . . . . . . . . . . . . . . 292
10.5 Base and Target Specification (Dershowitz, 1986) . . . . . . . . . . . . . . . 294
11.1 Types and degrees of structural overlap between source and target
problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301
11.2 A water redistribution problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304
11.3 Graphs for the equations 2x + 5 = 9 (a) and 3x + (6 − 2) = 16 (b) 307
12.1 Adaptation of Sub to Add . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329
C.1 Universal Plan for Sorting Lists with Three Elements . . . . . . . . . . . 391
C.2 Minimal Spanning Trees for Sorting Lists with Three Elements . . . 391
C.3 Minimal Spanning Trees for Sorting Lists with Three Elements . . . 391
C.4 Minimal Spanning Trees for Sorting Lists with Three Elements . . . 392
List of Tables
2.1 Informal Description of Forward Planning . . . . . . . . . . . . . . . . . . . . . 29
2.2 A Simple Forward Planner . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.3 Number of States in the Blocks-World Domain . . . . . . . . . . . . . . . . . 34
2.4 Planning as Model Checking Algorithm (Giunchiglia, 1999, fig. 4) 47
3.1 Abstract DPlan Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.1 Database with Distances between Airports . . . . . . . . . . . . . . . . . . . . . 86
4.2 Performance of FPlan: Tower of Hanoi . . . . . . . . . . . . . . . . . . . . . . . . 90
4.3 Performance of FPlan: Selection Sort . . . . . . . . . . . . . . . . . . . . . . . . . 92
6.1 Different Specifications for Last . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
6.2 Training Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
6.3 Background Knowledge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
6.4 Fundamental Results of Language Learnability (Gold, 1967, tab. 1) 135
6.5 Genetic Programming Algorithm (Koza, 1992, p. 77) . . . . . . . . . . . . 138
6.6 Calculating the Fibonacci Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
6.7 Learning the daughter Relation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
6.8 Calculating an rlgg . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
6.9 Simplified MIS-Algorithm (Lavrač and Džeroski, 1994, pp. 54) . . . 147
6.10 Background Knowledge and Examples for Learning reverse(X,Y) . 151
6.11 Constructing Traces from I/O Examples . . . . . . . . . . . . . . . . . . . . . . . 155
6.12 Calculating the Form of an S-Expression . . . . . . . . . . . . . . . . . . . . . . 156
6.13 Constructing a Regular Lisp Program by Function Merging . . . . . . 164
7.1 A Sample of Function Symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
7.2 Example for an RPS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
7.3 Recursion Points and Substitution Terms for the Fibonacci Function 182
7.4 Unfolding Positions and Unfolding Indices for Fibonacci . . . . . . . . . 184
7.5 Example of Extrapolating an RPS from an Initial Program . . . . . . 187
7.6 Calculation of the Next Position on the Right . . . . . . . . . . . . . . . . . . 199
7.7 Factorial and Its Third Unfolding with Instantiation succ(succ(0))
(a) and pred(3) (b) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
7.8 Segments of the Third Unfolding of Factorial for Instantiation
succ(succ(0)) (a) and pred(3) (b) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
7.9 Anti-Unification for Incomplete Segments . . . . . . . . . . . . . . . . . . . . . . 202
7.10 Variants of Substitutions in Recursive Calls . . . . . . . . . . . . . . . . . . . . 211
7.11 Testing whether Substitutions are Uniquely Determined . . . . . . . . . 213
7.12 Testing whether a Substitution is Recurrent . . . . . . . . . . . . . . . . . . . 214
7.13 Testing the Existence of Sufficiently Many Instances for a Variable 215
7.14 Determining Hidden Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
7.15 Calculating Substitution Terms of a Variable in a Recursive Call . 218
7.16 Equality of Subprograms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
7.17 RPS for Factorial with Constant Expression in Main . . . . . . . . . . . . 226
8.1 Introducing Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
8.2 Linear Recursive Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
8.3 Introducing Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
8.4 Structural Functions over Lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
8.5 Introducing List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
8.6 Dealing with Semantic Information in Lists . . . . . . . . . . . . . . . . . . . . 254
8.7 Functional Variants for Selection-Sort . . . . . . . . . . . . . . . . . . . . . . . . . 255
8.8 Extract an MST from a DAG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
8.9 Regularization of a Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
8.10 Structural Complex Recursive Functions . . . . . . . . . . . . . . . . . . . . . . . 265
8.11 Transformation Sequences for Leaf-Nodes of the Tower Plan for
Four Blocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270
8.12 Power-Set of a List, Set of Lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
8.13 Control Rules for Tower Inferred by Decision List Learning . . . . . . 273
8.14 A Tower of Hanoi Program . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
8.15 A Tower of Hanoi Program for Arbitrary Starting Constellations . 276
10.1 Kinds of Predicates Mapped in Different Types of Domain Com-
parison (Gentner, 1983, Tab. 1, extended) . . . . . . . . . . . . . . . . . . . . . 286
10.2 Word Algebra Problems (Reed et al., 1990) . . . . . . . . . . . . . . . . . . . . 293
11.1 Relevant information for solving the source problem . . . . . . . . . . . . 305
11.2 Results of Experiment 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 310
11.3 Results of Experiment 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
12.1 A Simple Anti-Unification Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . 323
12.2 An Algorithm for Retrieval of RPSs . . . . . . . . . . . . . . . . . . . . . . . . . . 325
12.3 Results of the similarity rating study . . . . . . . . . . . . . . . . . . . . . . . . . . 326
12.4 Example Generalizations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
1. Introduction
She had written what she felt herself called upon to write; and, though she
was beginning to feel that she might perhaps do this thing better, she had no
doubt that the thing itself was the right thing for her.
—Harriet Vane in: Dorothy L. Sayers, Gaudy Night, 1935
Automatic program synthesis has been an active area of research since the
early seventies. The application goal of program synthesis is to support human
programmers in developing correct and efficient program code and in reasoning
about programs. Program synthesis is the core of knowledge based software
engineering. The second goal of program synthesis research is to gain more
insight into the knowledge and strategies underlying the process of code
generation. Developing algorithms for automatic program construction is
therefore an area of artificial intelligence (AI) research, and human
programmers are studied in cognitive science research.
There are two main approaches to program synthesis: deduction and
induction. Deductive program synthesis addresses the problem of deriving
executable programs from high-level specifications. Typically, the employed
transformation or inference rules guarantee that the resulting program is
correct with respect to the given specification, but of course there is no
guarantee that the specification is valid with respect to the informal idea a
programmer has about what the program should do. The challenge for
deductive program synthesis is to provide formalized knowledge which allows
the synthesis of as large a class of programs as possible with as little
user guidance as possible. That is, a synthesis system can be seen as an
expert system incorporating general knowledge about algorithms, data
structures, and optimization techniques, as well as knowledge about the
specific programming domain.
Inductive program synthesis investigates program construction from
incomplete information, namely from examples of the desired input/output
behavior of the program. Program behavior can be specified at different levels
of detail: as pairs of input and output values, as a selection of computational
traces, or as an ordered set of generic traces abstracting from specific input
values. For induction from examples (not to be confused with mathematical
proofs by induction) it is not possible to give a notion of correctness. The
resulting program has to cover all given examples correctly, but the user has
to judge whether the generalized program corresponds to his/her intention.¹
Because correctness of code is crucial for software development, knowledge
based software engineering relies on deductive approaches to program
synthesis. The challenge for inductive program synthesis is to provide learning
algorithms which can generalize as large a class of programs as possible with
as little background knowledge as possible. That is, inductive synthesis
models the ability to extract structure, in the form of possibly recursive
rules, from some initial experience.
This book focusses on inductive program synthesis, and more specifically on
the induction of recursive functions. There are different approaches for
learning recursive programs from examples. The oldest approach is to
synthesize functional (Lisp) programs. Functional synthesis is realized by a
two-step process: In a first step, input/output examples are rewritten into
finite terms which are integrated into a finite program. In a second step,
the finite program is folded into a recursive function, that is, a program
generalizing over the finite program is induced. The second step is also
called "generalization-to-n" and corresponds to program synthesis from traces
or programming by demonstration. It will be shown later that folding of
finite terms can be described as a grammar inference problem, namely as
induction of a special class of context-free tree grammars. Since the late
eighties, there have been two additional approaches to inductive synthesis:
inductive logic programming and genetic programming. While the classical
functional approach depends on exploiting structural information given in
the examples, these approaches mainly depend on search in the hypothesis
space, that is, the space of syntactically correct programs of a given
programming language. The work presented here is in the context of
functional program synthesis.
While folding of finite programs into recursive functions can be performed
(nearly) by purely syntactical pattern matching, generation of finite terms
from input/output examples is knowledge-dependent. The result of rewriting
examples, that is, the form and complexity of the finite program, is
completely dependent on the set of predefined primitive functions and data
structures provided for the rewrite system. Additionally, the outcome
depends on the rewrite strategy used: even for a fixed set of background
knowledge, rewriting can result in different programs. Theoretically, there
are infinitely many possible ways to represent a finite program which
describes how input examples can be transformed into the desired output.
Because generalizability depends on the form of the finite program, this
first step is the bottleneck of program synthesis. Here program synthesis is
confronted with the crucial problem of AI and cognitive science: problem
solving success is determined by the constructed representation.
¹ Please note that throughout the book I will mostly omit giving both the
masculine and the feminine form and use only the masculine for better
readability. Also, I will refer to my work in the first person plural instead
of the singular, because part of the work was done in collaboration with
other people and I do not want to switch between first and third person.
Overview. In the following, we give a short overview of the different
aspects of inductive synthesis of recursive functions which are covered in
this book.
Universal Planning. We propose to use domain-independent planning to
generate finite programs. Inputs correspond to problem states, outputs to
states fulfilling the desired goals, and the transformation from input to
output is realized by calculating optimal action sequences (shortest
sequences of function applications). We use universal planning, that is, we
consider all possible states of a given finite problem in a single plan.
While a "standard" plan represents an (optimal) sequence of actions for
transforming one specific initial state into a state fulfilling the given
goals, a universal plan represents the set of all optimal plans as a DAG
(directed acyclic graph). For example, a plan for sorting lists (or, more
precisely, arrays) of the four numbers 1, 2, 3, and 4 contains all sequences
of swap operations to transform each of the 4! possible input lists into the
list [1 2 3 4]. Plan construction is realized by a non-linear, state-based,
backward algorithm. To make planning applicable to a larger class of
problems which are of interest for program synthesis, we present an
extension of planning from purely relational domain descriptions to function
application.
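The core idea of backward universal planning can be sketched in a few lines: breadth-first search backward from the goal state assigns every state its optimal distance, and the resulting DAG keeps, for each state, all one-step successors that lie on an optimal path. The following Python sketch is only an illustration (the system described in this book is implemented in Common Lisp, and the state encoding and swap operator here are assumptions made for the example); it builds such a universal plan for sorting arrays over the numbers 1 to 4:

```python
from itertools import combinations
from collections import deque

def swaps(state):
    """All states reachable by swapping two positions."""
    result = []
    for i, j in combinations(range(len(state)), 2):
        s = list(state)
        s[i], s[j] = s[j], s[i]
        result.append(tuple(s))
    return result

def universal_plan(goal):
    """Backward breadth-first search from the goal state.
    Returns a DAG mapping each state to all one-swap neighbours
    that are exactly one step closer to the goal."""
    dist = {goal: 0}
    frontier = deque([goal])
    while frontier:
        s = frontier.popleft()
        for t in swaps(s):
            if t not in dist:          # first visit = optimal distance
                dist[t] = dist[s] + 1
                frontier.append(t)
    return {s: [t for t in swaps(s) if dist[t] == dist[s] - 1]
            for s in dist}

plan = universal_plan((1, 2, 3, 4))    # covers all 4! = 24 input arrays
```

Following any successor chain from an arbitrary permutation reaches the sorted array in the minimal number of swaps; the goal state itself has no successors.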
Plan Transformation. The universal plan already represents the structure
of the searched-for program, because it gives an ordering of operator
applications. Nevertheless, the plan cannot be generalized directly; some
transformation steps are needed to generate a finite program which can be
input to a generalization-to-n algorithm. For a program to be generalizable
into a (terminating) recursive function, the input states have to be
classified with respect to their "size", that is, the objects involved in the
planning problem must have an underlying order. We propose a method by which
the data type underlying a given plan can be inferred. This information can
be exploited in plan transformation to introduce a "constructive"
representation of input states, together with an "empty"-test and selector
functions.
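One way to picture the "size" classification is a minimal check (a deliberately simplified illustration, not the inference method developed later in the book): if every step along a plan path removes exactly one literal from the state, the states are totally ordered by set inclusion, a sequence data type underlies the plan, the final state plays the role of the "empty" constructor, and the literal removed at each step plays the role of the selector. The unstack-style literals below are hypothetical:

```python
def infer_sequence(chain):
    """Check whether a chain of plan states (frozensets of literals)
    forms a 'sequence': each step removes exactly one literal.
    If so, return the constructive view (bottom state, selector literals)."""
    selectors = []
    for s, t in zip(chain, chain[1:]):
        removed = s - t
        if not (t <= s and len(removed) == 1):
            return None            # no underlying sequence order
        selectors.append(next(iter(removed)))
    return chain[-1], selectors

# hypothetical unstack-style chain of states:
chain = [frozenset({'on(a,b)', 'on(b,c)', 'on(c,table)'}),
         frozenset({'on(b,c)', 'on(c,table)'}),
         frozenset({'on(c,table)'}),
         frozenset()]
bottom, selectors = infer_sequence(chain)
```

Here the empty test is comparison with `bottom`, and `selectors` records which literal is "consumed" at each level of the recursion.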
Folding of Finite Programs. Folding of a finite program into a (set of)
recursive programs is done by pattern matching. For inductive
generalization, the notion of fixpoint semantics can be "inverted": a given
finite program is considered as the n-th unfolding of an unknown recursive
program. The term can be folded into a recursion if it can be decomposed such
that each segment of the term matches the same sub-term (called the skeleton)
and if the instantiations of the skeleton with respect to each segment can be
described by a unique substitution (a set of replacements of variables in the
skeleton by terms). Identifying the skeleton corresponds to learning a
regular tree grammar; identifying the substitution corresponds to an
extension of the regular to a context-free tree grammar.
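For the simplest case (linear recursion over one variable with a known substitution x ↦ p(x)), folding can be sketched as a search for a skeleton whose repeated unfolding reproduces the given finite term. The toy Python folder below is a strong simplification of the system described later, which also infers the substitutions and handles tree recursion; the term encoding and the symbols if, eq0, p, REC, and Omega are assumptions made for the example:

```python
OMEGA = 'Omega'   # "undefined" leaf marking where the unfolding stopped

def positions(t, pre=()):
    """All positions in a term; terms are ('op', arg, ...) tuples or leaves."""
    yield pre
    if isinstance(t, tuple):
        for i, a in enumerate(t[1:], 1):
            yield from positions(a, pre + (i,))

def replace(t, pos, s):
    """Replace the subterm of t at position pos by s."""
    if not pos:
        return s
    i = pos[0]
    return t[:i] + (replace(t[i], pos[1:], s),) + t[i + 1:]

def substitute(t, env):
    if isinstance(t, str):
        return env.get(t, t)
    return (t[0],) + tuple(substitute(a, env) for a in t[1:])

def unfold(skeleton, var, step, depth, arg):
    """depth-th unfolding of a linear recursion: REC marks the recursive
    call, `step` is the substitution applied to `var` in each segment."""
    if depth == 0:
        return OMEGA
    inner = unfold(skeleton, var, step, depth - 1, step(arg))
    return substitute(skeleton, {var: arg, 'REC': inner})

def fold(term, var='x', step=lambda a: ('p', a), max_depth=6):
    """Search for a skeleton whose repeated unfolding reproduces `term`."""
    for pos in positions(term):
        skeleton = replace(term, pos, 'REC')
        for d in range(1, max_depth + 1):
            if unfold(skeleton, var, step, d, var) == term:
                return skeleton, d
    return None

# third unfolding of a factorial-like term:
fac3 = ('if', ('eq0', 'x'), '1',
        ('*', 'x',
         ('if', ('eq0', ('p', 'x')), '1',
          ('*', ('p', 'x'),
           ('if', ('eq0', ('p', ('p', 'x'))), '1',
            ('*', ('p', ('p', 'x')), OMEGA))))))
skeleton, depth = fold(fac3)
```

Folding recovers the skeleton `('if', ('eq0', 'x'), '1', ('*', 'x', 'REC'))` at unfolding depth 3, that is, the recursive equation f(x) = if(eq0(x), 1, *(x, f(p(x)))).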
Our approach is independent of a specific programming language: terms
are supposed to be elements of a term algebra, and the inferred recursive
program is represented as a recursive program scheme (RPS) over this term
algebra. An RPS is defined as a "main program" (ground term) together with
a system of possibly recursive equations over a term algebra consisting of
finite sets of variables, function symbols, and function variables
(representing names of user-defined functions). Folding results in the
inference of a program scheme because the elements of the term algebra
(namely the function symbols) can be arbitrarily interpreted. Currently, we
assume a fixed interpreter function which maps RPSs into Lisp functions with
known denotational and operational semantics. We restrict Lisp to its
functional core, disregarding global variables, variable assignments, loops,
and other non-functional elements of the language. As a consequence, an RPS
corresponds to a concrete functional program.
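What such a fixed interpreter function amounts to can be shown with a small sketch. The RPS below (a factorial scheme) is a main term plus one recursive equation; the interpreter assigns meanings to the primitive function symbols and evaluates the scheme. The encoding is an assumption made for illustration, and Python stands in for the Lisp interpretation used in the book:

```python
# An RPS: a main term plus recursive equations over a term algebra.
RPS = {
    'main': ('fac', 'n'),
    'equations': {
        ('fac', ('x',)): ('if', ('eq0', 'x'), '1',
                          ('*', 'x', ('fac', ('pred', 'x')))),
    },
}

# Interpretation of the primitive function symbols and constants:
PRIMITIVES = {
    'eq0':  lambda a: a == 0,
    'pred': lambda a: a - 1,
    '*':    lambda a, b: a * b,
    '1':    1,
}

def evaluate(term, env, eqs):
    if isinstance(term, str):                  # variable or constant leaf
        return env.get(term, PRIMITIVES.get(term, term))
    op, args = term[0], term[1:]
    if op == 'if':                             # non-strict conditional
        cond, then, els = args
        return evaluate(then if evaluate(cond, env, eqs) else els, env, eqs)
    vals = [evaluate(a, env, eqs) for a in args]
    for (name, params), body in eqs.items():   # user-defined function?
        if name == op:
            return evaluate(body, dict(zip(params, vals)), eqs)
    return PRIMITIVES[op](*vals)               # primitive symbol

result = evaluate(RPS['main'], {'n': 4}, RPS['equations'])   # 24
```

Exchanging the `PRIMITIVES` table reinterprets the same scheme over a different algebra, which is exactly why folding yields a program scheme rather than a fixed program.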
For an arbitrarily instantiated finite term which is input to the folder, a
term algebra together with the valuation of the identified parameters of the
main program is inferred as part of the folding process. The current system
can deal with a variety of (syntactic) recursive structures (tail recursion,
linear recursion, tree recursion, and combinations thereof), and it can deal
with recursion over interdependent parameters. Furthermore, a constant
initial segment of a term can be identified and used to construct the main
program, and it is possible to identify sub-patterns distributed over the
term which can be folded separately, resulting in an RPS consisting of two or
more recursive equations. Mutual recursion and non-primitive recursion are
outside the scope of the current system.
Scheme Abstraction and Analogical Problem Solving. A set of (synthesized)
RPSs can be organized into an abstraction hierarchy of schemes by
generalizing over their common structure. For example, an RPS for calculating
the factorial of a natural number and an RPS for calculating the sum over a
natural number can be generalized to an RPS representing a simple linear
recursion over natural numbers. This scheme can be generalized further when
regarding simple linear recursive functions over lists. Abstraction is
realized by introducing function variables which can be instantiated by a
(restricted) set of primitive function symbols. Abstraction is based on
"mapping" two terms such that their common structure is preserved. We present
two approaches to mapping: tree transformation and anti-unification. Tree
transformation is based on transforming one term (tree) into another by
substituting, inserting, and deleting nodes. The performed transformations
define the mapping relation. Anti-unification is based on constructing an
abstract term containing first- and second-order variables (i.e., object and
function variables) together with a set of substitutions such that
instantiating the abstract term results in the original terms.
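A minimal anti-unification sketch shows how the factorial and sum schemes collapse into a common pattern: mismatching constants become object variables (Xi) and mismatching function symbols become second-order function variables (Fi). The mapping policy (each distinct pair of mismatches gets one shared variable) and all names are simplifications of the algorithm presented later in the book:

```python
def antiunify(s, t, subs):
    """First- and second-order anti-unification of two terms.
    Each distinct pair of mismatching subterms (or function symbols)
    is mapped to one object variable Xi (or function variable Fi)."""
    if s == t:
        return s
    if isinstance(s, tuple) and isinstance(t, tuple) and len(s) == len(t):
        args = tuple(antiunify(a, b, subs) for a, b in zip(s[1:], t[1:]))
        if s[0] == t[0]:
            return (s[0],) + args
        # differing roots: introduce a second-order (function) variable
        key = ('F', s[0], t[0])
        if key not in subs:
            subs[key] = 'F%d' % len(subs)
        return (subs[key],) + args
    # differing leaves/subterms: introduce an object variable
    if (s, t) not in subs:
        subs[(s, t)] = 'X%d' % len(subs)
    return subs[(s, t)]

fac = ('if', ('eq0', 'x'), '1', ('*', 'x', ('fac', ('pred', 'x'))))
add = ('if', ('eq0', 'x'), '0', ('+', 'x', ('sum', ('pred', 'x'))))
subs = {}
scheme = antiunify(fac, add, subs)
# scheme: ('if', ('eq0','x'), 'X0', ('F2', 'x', ('F1', ('pred','x'))))
```

The resulting scheme is the linear recursion over natural numbers mentioned above; re-instantiating `X0`, `F1`, and `F2` recovers either original RPS.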
Scheme abstraction can be integrated into a general model of analogical
problem solving (see fig. 1.1): For a given finite program representing
some initial experience with a problem, that RPS is retrieved from memory
whose n-th unfolding is maximally similar to the current (target) problem.
Instead of inducing a general solution strategy by generalization-to-n, the
retrieved RPS is modified with respect to the mapping obtained between its
unfolding and the target. Modification can involve a simple re-instantiation
of primitive symbols or more complex adaptations. If an already abstracted
scheme is adapted to the current problem, we speak of refinement. To obtain
some insight into the pragmatics of analogical problem solving, we
empirically investigated what type and degree of structural mapping between
two problem structures must exist for successful analogical transfer by
human problem solvers.
[Figure: a source finite program term (T-S) is related by unfolding (1) to a
recursive program scheme RPS-S; mapping (2) relates it to the target term T;
retrieval (3) selects RPS-S from memory, adaptation (4) yields an RPS for the
target, and abstraction (5) with storage builds up the abstraction hierarchy
(RPS-ABS).]
Fig. 1.1. Analogical Problem Solving and Learning
Integrating Planning, Program Synthesis, and Analogical Reasoning. The
three components of our approach (planning, folding of finite terms, and
analogical problem solving) can be used in isolation or integrated into a
complex system. Input to the universal planner is a problem specification
represented in an extended version of the standard Strips language; output
is a universal plan, represented as a DAG. Input to the synthesis system is a
finite program term; output is a recursive program scheme. Finite terms can
be obtained in arbitrary ways, for example by hand-coding or by recording
program traces. We propose an approach of plan transformation (from a DAG
to a finite program term) to combine planning with program synthesis.
Generating finite programs by planning is especially suitable for domains
where inputs can be represented as relational objects (sets of literals) and
where operations can be described as manipulations of such objects (changing
literals). This is true for blocks-world problems and puzzles (such as Tower
of Hanoi) and for a variety of list-manipulating problems (such as sorting).
For numerical problems such as factorial or Fibonacci we omit planning and
start program synthesis from finite programs generated from hand-coded
traces. Input to the analogy module is a finite program term; outputs are
the RPS which is most similar to the term, its re-instantiation or adaptation
to cover the current term, and an abstracted scheme generalizing over the
current and the re-used RPS. Figure 1.2 gives an overview of the described
components and their interactions. All components are implemented in Common
Lisp.
[Figure: a problem specification is input to universal planning, which
produces a universal plan; plan transformation turns the plan into a finite
program term; generalization-to-n folds it into a recursive program scheme,
which is stored in the abstraction hierarchy; analogical problem solving
links finite program terms with the stored schemes.]
Fig. 1.2. Main Components of the Synthesis System
Contributions to Research. In the following, we discuss our work in
relation to different areas of research. Our work directly contributes to
research in program synthesis and planning. To other research areas, such as
software engineering and discovery learning, our work does not directly
contribute, but it might offer some new perspectives.
A Novel Approach to Inductive Program Synthesis. Going back to the roots
of early work in functional program synthesis and extending it, exploiting
the experience of research available today, results in a novel approach to
inductive program synthesis with several advantageous aspects: Adopting the
original two-step process of first generating finite programs from examples
and then generalizing over them results in a clear separation of the
knowledge-dependent and the syntactic aspects of induction. Dividing the
complex program synthesis problem into two parts allows us to address the
sub-problems separately. The knowledge-dependent first part, generating
finite programs from
examples, can be realized by different approaches, including the rewrite
approach proposed in the seventies. We propose to use state-based
planning. While there is a long tradition of combining (deductive) planning
and deductive program synthesis, up to now there has been no interaction
between research on state-based planning and research on inductive program
synthesis. State-based planning provides a powerful approach to calculating
(optimal) transformation sequences from input states to a state fulfilling a
set of goal relations, by providing a powerful domain specification language
together with a domain-independent search algorithm for plan construction.
The second part, folding of finite programs, can be solved by a
pattern-matching approach. The correspondence between folding program terms
and inferring context-free tree grammars makes it possible to give an exact
characterization of the class of recursive programs which can be induced.
Defining pattern matching for terms which are elements of an arbitrary term
algebra makes the approach independent of a specific programming language.
Synthesizing program schemes in contrast to programs allows for a natural
combination of induction with analogical reasoning and learning.
Learning Domain-Specific Control Rules for Plans. Cross-fertilization
between state-based planning and inductive program synthesis results in a
powerful approach to learning domain-specific control rules for plans.
Control rule learning is currently becoming a major interest in planning
research: Although a variety of efficient domain-independent planners have
been developed in the nineties, for demanding real-world applications it is
necessary to guide the search for (optimal) plans by exploiting knowledge
about the structure of the planning domain. Learning control rules makes it
possible to gain the efficiency of domain-dependent planning without the
need to hand-code such knowledge, which is a time-consuming and error-prone
knowledge engineering task. Our functional approach to program synthesis is
very well suited for control rule learning because, in contrast to logic
programs, the control flow is represented explicitly. Furthermore, many
inductive logic programming systems do not provide an ordering for the
induced clauses and literals; that is, evaluation of such clauses by a
Prolog interpreter is not guaranteed to terminate with the desired result.
In proof planning, a special domain of planning in the area of mathematical
theorem proving and program verification, the need for learning proof
methods and control strategies is also recognized. Up to now, we have not
applied our approach to this domain, but we see it as an interesting area of
further research.
Knowledge Acquisition for Software Engineering Tools. Knowledge based
software engineering systems are based on formalized knowledge of various
general and domain-specific aspects of program development. For that reason,
such systems are necessarily incomplete. It depends on the "genius" of the
authors of such systems, their analytical insights about program structures
and their ability to explicate these insights, how large the set of domains
and the class of programs is that can be supported. Similarly to control rules
in planning, tactics are used to guide search. Furthermore, some proof
systems provide a variety of proof methods. While execution of transformation
or proof steps is performed autonomously, the selection of an appropriate
tactic or method is performed interactively. We propose that these components
of software engineering tools are candidates for learning. The first reason
is that the guarantee of correctness, which is necessary for the fully
automated parts of such systems, remains intact. Only the "higher level"
strategic aspects of the system, which depend on user interaction, are
subject to learning, and the acquired tactics and methods can be accepted or
rejected by the user. Our approach to program synthesis might complement
this special kind of expert system by providing an approach to model how
some aspects of such expertise develop with experience.
Programming by Demonstration. With the growing number of computer
users, most of them without programming skills, program synthesis from
examples becomes relevant for practical applications. For example, watching
a user's input behavior in a text processing system provides traces which can
be generalized to macros (such as "write a letter-head"), watching a user's
browsing behavior in the world-wide web can be used to generate prefer-
ence classes, or watching a user's inputs into a graphical editor might provide
suggestions for the next actions to be performed. In the simplest case, the re-
sulting programs are just sequences of parameterized operations. By applying
inductive program synthesis, more sophisticated programs, involving loops,
could be generated. A further application is to support beginning program-
mers. A student might interact with an interpreter by first giving examples
for the desired program behavior and then watch how a recursion is formed
to generalize the program to the general case.
Discovery Learning. Folding of finite programs and some aspects of plan
transformation can be characterized as discovery learning. Folding of finite
programs models the (human) ability to extract generalized rules by iden-
tifying relevant structural aspects in perceptive inputs. This ability can be
seen as the core of the flexibility of (human) cognition underlying the acquisi-
tion of perceptual categories and linguistic concepts as well as the extraction
of general strategies from problem solving experience. Furthermore, we will
demonstrate for plan transformation how problem dependent selector func-
tions can be extracted from a universal plan by a purely syntactic analysis
of its structure. Because our approach to program synthesis is driven by
the underlying structure of some initial experience (represented as a universal
plan), it is also more cognitively plausible than the search driven synthesis
of inductive logic and genetic programming.
Integrating Learning by Doing and by Analogy. Cognitive science approaches
to skill acquisition from problem solving and early work on learning macro
operators from planning focus on combining sequences of primitive opera-
tors into more complex ones by merging their preconditions and effects. In
contrast, our work addresses the acquisition of problem solving strategies.
Because domain dependent strategies are represented as recursive program
schemes, they capture the operational aspect of problem solving as well as
the structure of a domain. Thereby we can deal with learning by induction
and analogy in a unified way, showing how schemes can be acquired from
problem solving and how such schemes can be used and generalized in ana-
logical problem solving. Furthermore, the identification of data types from
plans captures the evolution of perceptive chunks by identifying the rele-
vant aspects of a problem description and defining an order over such partial
descriptions.
Organization of the Book. The book is organized in three main parts:
Planning, Inductive Program Synthesis, and Analogical Problem Solving
and Learning. Each part starts with an overview of research, along with an
introduction of the basic concepts and formalisms. Afterwards our own work
is presented, including relations to other work. We finish with summarizing
the contributions of our approach to the field and giving an outlook to further
research. The overview chapters can be read independently of the research
specific chapters. The research specific chapters presuppose that the reader is
familiar with the concepts introduced in the overview chapters. The focus of
our work is on inductive program synthesis and its application to control rule
learning for planning. Additionally, we discuss relations to work on problem
solving and learning in cognitive psychology.
Part I: Planning. In chapter 2, first, the standard Strips language for specify-
ing planning domains and problems and a semantics for operator application
are introduced. Afterwards, extensions of the Strips language (ADL/PDDL)
are discussed and contrasted with situation calculus. Basic algorithms for
forward and backward planning are introduced. Complexity of planning and
formal properties of planning algorithms are discussed. A short overview of
the development of different approaches to planning is given, and pre-planning
analysis and learning are introduced as methods for obtaining domain spe-
cific knowledge for making plan construction more efficient. In chapter 3, the
non-linear, state-based, universal planner DPlan is introduced and in chapter
4 an extension of the Strips language to function application is presented. In
chapter 5, we evaluate our approach and discuss further work to be done.
Part II: Inductive Program Synthesis. In chapter 6 we give a survey of auto-
matic programming research, focussing on automatic program construction,
that is, program synthesis. We give a short overview of constructive theorem
proving and program transformation as approaches to deductive program
synthesis. Afterwards, we present inductive program synthesis as a special
case of machine learning. We introduce grammar inference as theoretical
background for inductive program synthesis. Genetic programming, induc-
tive logic programming and inductive functional programming are presented
as three approaches to generalize recursive programs from incomplete speci-
fications. In chapter 7 we introduce the concept of recursive program schemes
and present our approach to folding finite program terms. In chapter 8 we
present our approach to bridging the gap between planning and program syn-
thesis by transforming plans into program terms. In chapter 9, we evaluate
our approach and discuss relations to human strategy learning.
Part III: Analogical Problem Solving and Learning. In chapter 10 we give
an overview of approaches to analogical and case-based reasoning, discussing
similarity measures and qualitative concepts for structural similarity. In chap-
ter 11 some psychological experiments concerning problem solving with non-
isomorphic example problems are reported. In chapter 12 we present anti-
unification as an approach to structure mapping and generalization along
with preliminary ideas for adaptation of non-isomorphic structures. In chap-
ter 13 we evaluate our approach and discuss further work to be done.
2. State-Based Planning

Plan it first and then take it.
- Travis McGee in: John D. MacDonald, The Long Lavender Look, 1970
Planning is a major sub-discipline of AI. A plan is defined as a sequence of
actions for transforming a given state into a state which fulfills a predefined
set of goals. Planning research deals with the formalization, implementation,
and evaluation of algorithms for constructing plans. In the following, we first
(sect. 2.1.1) introduce a language for representing problems called standard
Strips. Along with defining the language, the basic concepts and notations
of planning are introduced. Afterwards (sect. 2.2) some extensions to Strips
are introduced and situation calculus is discussed as an alternative language
formalism. In section 2.3 basic algorithms for plan construction based on
forward and backward search are introduced. We discuss complexity results
for planning and give results for termination, soundness, and completeness
of planning algorithms. In section 2.4 we give a short survey of well-known
planning systems and an overview of concepts often used in the planning lit-
erature and pointers to literature. Finally (sect. 2.5), pre-planning analysis
and learning are introduced as approaches for acquisition of domain-specific
knowledge which can be used to make plan construction more efficient. The
focus is on state-based algorithms, including universal planning. Throughout
the chapter we give illustrations using blocks-world examples.
2.1 Standard Strips
To introduce the basic concepts and notations of (state-based) planning, we
first review Strips. The original Strips was proposed by Fikes and Nilsson
(1971) and up to now it is used with slight modifications or some extensions
by the majority of planning systems. The main advantage of Strips is that
it has a strong expressive power (mainly due to the so called closed world
assumption, see below) and at the same time allows for efficient planning
algorithms. Informal introductions to Strips are given in all AI text books,
for example in Nilsson (1980) and Russell and Norvig (1995).
2.1.1 A Blocks-World Example
Let us look at a very restricted world, consisting of four distinct blocks. The
blocks, called A, B, C, and D, are the objects of this blocks-world domain.
Each state of the world can be described by the following relations: a block
can be on the table, that is, the proposition ontable(A) is true in a state where
block A is lying on the table; a block can be clear, that is, the proposition
clear(A) is true in a state where no other block is lying on top of block A;
and a block can be lying on another block, that is, the proposition on(B, C)
is true in a state where block B is lying immediately on top of block C. State
changes can be achieved by applying one of the following two operators:
putting a block on the table and putting a block on another block. Both
operators have application conditions: a block can only be moved if it is
clear, and a block can only be put on another block if this block is clear,
too. Figure 2.1 illustrates this simple blocks-world example. If the goal for
which a plan is searched is that A is lying on B and B is lying on C, the right-
hand state is a goal state, because both proposition on(A, B) and proposition
on(B, C) are true. Note that other states, for example a tower with D on top
of A, B, C, also fulfill the goal because it was not demanded that A has to
be clear or that C has to be on the table.
[Figure omitted: a left-hand state with B on C and A, C, D on the table is
transformed by the action "put A on B" into a right-hand state with the
tower A on B on C and D on the table.]
Fig. 2.1. A Simple Blocks-World
To formalize plan construction, a language (or different languages) for
describing states, operators, and goals must be defined. Furthermore, it has
to be defined how a state change can be (syntactically) calculated.
2.1.2 Basic Definitions
The Strips language is defined over literals with constant symbols and vari-
ables as arguments. That is, no general terms (including function symbols)
are considered. We use the notion of term in the following definition in this
specific way. In chapter 8 we will introduce a language allowing for general
terms.
Definition 2.1.1 (Strips Language). The Strips language L_S(X, C, R) is
defined over sets of variables X, constant symbols C, and relational symbols
R in the following way:
- Variables x ∈ X are terms.
- Constant symbols c ∈ C are terms.
- If p ∈ R is a relational symbol with arity α(p) = i and if t_1, ..., t_i are
  terms, then p(t_1, ..., t_i) is a formula.
  Formulas consisting of a single relational symbol are called (positive) liter-
  als. For short, we write p(t).
- If p_1(t_1), ..., p_n(t_n) are formulas, then {p_1(t_1), ..., p_n(t_n)} is a formula,
  representing the conjunction of literals.
- There are no other Strips formulas.

Formulas over L_S(C, R), i.e., formulas not containing variables, are called
ground formulas. A literal without variables is called an atom.
We write L_S as abbreviation for L_S(X, C, R). With X(F) we denote the
variables occurring in formula F.
For the blocks-world domain given in figure 2.1, L_S is defined over the fol-
lowing set of symbols: C = {A, B, C, D}, R = {on_2, clear_1, ontable_1}, where
p_i denotes a relational symbol of arity i. We will see below that for defining
operators additionally a set of variables X = {?x, ?y, ?z} is needed.
The Strips language can be used to represent states¹ and to define syntac-
tic rules for transforming a state representation by applying Strips operators.
A state representation can be interpreted logically by providing a domain, i.e.,
a set of objects of the world, and a denotation for all constant and relational
symbols. A state representation denotes all states of the world where the
interpreted Strips relations are true. That is, a state representation denotes
a family of states in the world. When interpreting a formula, it is assumed
that all relations not given explicitly are false (the closed world assumption).
Definition 2.1.2 (State Representation). A problem state s is a con-
junction of atoms. That is, s ∈ L_S(C, R).

That is, states are propositions (relations over constants).
Examples of problem states are
s_1 = {on(B, C), clear(A), clear(B), clear(D), ontable(A), ontable(C),
ontable(D)}
s_2 = {on(A, B), on(B, C), clear(A), clear(D), ontable(C), ontable(D)}.
An interpretation of s_1 results in a state as depicted on the left-hand side
of figure 2.1, an interpretation of s_2 results in a state as depicted on the
right-hand side.
The relational symbols on(x, y), clear(x), and ontable(x) can change their
truth values over different states. There might be further relational symbols,
denoting relations which are constant over all states, as for example the
color of blocks (if there is no operator paint-block). Relational symbols which
can change their truth values are called fluents, constant relational symbols
are called statics. We will see below that fluents can appear in operator
preconditions and effects, while statics can only appear in preconditions.

¹ Note that we do not introduce different representations to discern between syn-
tactical expressions and their semantic interpretation. We will speak of constant
or relation symbols when referring to the syntax and of constants or relations
when referring to the semantics of a planning domain.
Before introducing goals and operators defined over L_S(X, C, R), we in-
troduce matching of formulas containing variables (also called patterns) with
ground formulas.
Definition 2.1.3 (Substitution and Matching). A substitution is a set
of mappings σ = {x_1 ← t_1, ..., x_n ← t_n} defining replacements of vari-
ables x_i by terms t_i. For language L_S, terms t_i are restricted to variables and
constants. By applying a substitution σ to a formula F, denoted Fσ, all
variables x_i ∈ X(F) with x_i ← t_i ∈ σ are replaced by the associated t_i. Note
that this replacement is unique, i.e., identical variables are replaced by iden-
tical terms. We call Fσ an instantiated formula if all X(F) are replaced by
constant symbols.
For a formula F ∈ L_S and a set of atoms A ∈ L_S(C, R), match(F, A) = Σ
gives all substitutions σ_i ∈ Σ with Fσ_i ⊆ A.
For state s_1 given above, the formula F = {ontable(x), clear(x), clear(y)}
can be instantiated to Σ = {σ_1, σ_2, σ_3, σ_4} with
σ_1 = {x ← A, y ← B}, σ_2 = {x ← A, y ← D},
σ_3 = {x ← D, y ← A}, σ_4 = {x ← D, y ← B}.
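The match operation of definition 2.1.3 can be sketched in a few lines of Python. This is an illustrative sketch, not code from the book: atoms are modeled as tuples such as ("on", "B", "C"), variables as strings starting with "?", and, as assumed for standard Strips, distinct variables are bound to distinct constants.

```python
from itertools import permutations

def is_var(t):
    return t.startswith("?")

def apply_subst(formula, sigma):
    """F sigma: replace every variable in the formula by its associated term."""
    return {tuple(sigma.get(t, t) for t in atom) for atom in formula}

def match(formula, atoms):
    """Return all substitutions sigma with (F sigma) a subset of atoms.
    Distinct variables are bound to distinct constants."""
    variables = sorted({t for atom in formula for t in atom if is_var(t)})
    constants = sorted({t for atom in atoms for t in atom[1:]})
    return [dict(zip(variables, combo))
            for combo in permutations(constants, len(variables))
            if apply_subst(formula, dict(zip(variables, combo))) <= set(atoms)]

# State s_1 and formula F = {ontable(x), clear(x), clear(y)} from the text
s1 = {("on", "B", "C"), ("clear", "A"), ("clear", "B"), ("clear", "D"),
      ("ontable", "A"), ("ontable", "C"), ("ontable", "D")}
F = {("ontable", "?x"), ("clear", "?x"), ("clear", "?y")}
sigmas = match(F, s1)  # yields the four substitutions sigma_1 .. sigma_4
```

Running match(F, s1) enumerates exactly the four substitutions listed above, since x must satisfy both ontable and clear (A or D) and y must be a different clear block.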
For the quantifier-free language L_S(X, C, R), all variables occurring in for-
mulas are assumed to be bound by existential quantifiers. That is, all formulas
correspond to conjunctions of propositions.
Definition 2.1.4 (Goal Representation). A goal G is a conjunction of
literals. That is, G ∈ L_S(C, R).

An example of a planning goal is G = {on(A, B), on(B, C)}. All states s
with G ⊆ s are goal states.
Definition 2.1.5 (Strips Operator). A Strips operator op is described by
preconditions PRE and ADD- and DEL-lists², with PRE, ADD, DEL ∈ L_S.
ADD and DEL describe the operator effect. An instantiated operator o =
opσ ∈ L_S(C, R) is called an action. We write PRE(o), ADD(o), DEL(o) to
refer to the precondition, ADD-, or DEL-list of an (instantiated) operator.
Operators with variables are also called operator schemes.

An example of a Strips operator is:
Operator: put(?x, ?y)
PRE: {ontable(?x), clear(?x), clear(?y)}
ADD: {on(?x, ?y)}
DEL: {ontable(?x), clear(?y)}

² More exactly, the literals given in ADD and DEL are sets.
This operator can, for example, be instantiated to
Operator: put(A, B)
PRE: {ontable(A), clear(A), clear(B)}
ADD: {on(A, B)}
DEL: {ontable(A), clear(B)}
We will see below (fig. 2.2) that a second variant of the put operator is
needed for the case that block x is lying on another block z. In a blocks-
world with additional relations green(A), green(B), red(C), red(D), we could
restrict the application conditions for put further, for example such that only
green blocks are allowed to be moved. Assuming that a put action does not
affect the color of a block, red(x) and green(x) are static symbols.
A usual restriction of operator instantiation is that different variables
have to be instantiated with different constant symbols.³
Definition 2.1.6 ((Forward) Operator Application). For a state s and
an instantiated operator o, operator application is defined as Res(o, s) =
(s \ DEL(o)) ∪ ADD(o) if PRE(o) ⊆ s.

Note that subtracting DEL(o) and adding ADD(o) are commutative (result-
ing in the same successor state) only if DEL(o) ∩ ADD(o) = ∅, DEL(o) ⊆ s,
and ADD(o) ∩ s = ∅. The ADD-list of an operator might contain free vari-
ables, i.e., variables which do not occur as arguments of the relational sym-
bols in the precondition. This means that matching might only result in
partial instantiations. The remaining variables have to be instantiated from
the set of constant symbols C given for the current planning problem.
Operator application gives us a syntactic rule for changing one state repre-
sentation into another one by adding and subtracting atoms. On the semantic
side, operator application describes state transitions in a state space (Newell
& Simon, 1972; Nilsson, 1980), where a relation s' = Res(o, s) denotes that
a state s can be transformed into a state s'. (Syntactic) operator application
is admissible if s' = Res(o, s) holds in the state space underlying the given
planning problem, i.e., if (1) s' denotes a state in the world and if (2) this
state can be reached by applying the action characterized by o in state s.
For the left-hand state in figure 2.1 (s_1) and the instantiated put operator
given above, PRE(o) ⊆ s holds and Res(o, s) results in the right-hand state
in figure 2.1 (s_2).
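Forward application per definition 2.1.6 is a small set computation. A minimal sketch (ours, not the book's code), with the instantiated action put(A, B) and the state s_1 of figure 2.1 encoded as sets of ground-atom tuples:

```python
def res(op, state):
    """Res(o, s) = (s \\ DEL(o)) | ADD(o), defined only if PRE(o) <= s."""
    pre, add, delete = op
    if not pre <= state:
        return None  # action not applicable in this state
    return (state - delete) | add

# Instantiated operator put(A, B) from the text
put_A_B = (
    {("ontable", "A"), ("clear", "A"), ("clear", "B")},  # PRE
    {("on", "A", "B")},                                  # ADD
    {("ontable", "A"), ("clear", "B")},                  # DEL
)
s1 = {("on", "B", "C"), ("clear", "A"), ("clear", "B"), ("clear", "D"),
      ("ontable", "A"), ("ontable", "C"), ("ontable", "D")}

s2 = res(put_A_B, s1)
# s2 is the right-hand state of figure 2.1:
# {on(A, B), on(B, C), clear(A), clear(D), ontable(C), ontable(D)}
```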
A Strips domain is given as a set of operators. Extensions of Strips such as
PDDL allow inclusion of additional information, such as types (see sect. 2.2).
A Strips planning problem is given as P(O, I, G) with O as set of operators,
I as set of initial states, and G as set of top-level goals. An example of a
Strips planning problem is given in figure 2.2.

³ The planning language PDDL (see sect. 2.2) allows explicit use of equality and
inequality constraints, such that different variables can be instantiated with the
same constant symbol if no inequality constraint is given.
Operators:
put(?x, ?y)
  PRE: {ontable(?x), clear(?x), clear(?y)}
  ADD: {on(?x, ?y)}
  DEL: {ontable(?x), clear(?y)}
put(?x, ?y)
  PRE: {on(?x, ?z), clear(?x), clear(?y)}
  ADD: {on(?x, ?y), clear(?z)}
  DEL: {on(?x, ?z), clear(?y)}
puttable(?x)
  PRE: {clear(?x), on(?x, ?y)}
  ADD: {ontable(?x), clear(?y)}
  DEL: {on(?x, ?y)}
Goal: {on(A, B), on(B, C)}
Initial State: {on(D, C), on(C, A), clear(D), clear(B), ontable(A), ontable(B)}
Fig. 2.2. A Strips Planning Problem in the Blocks-World
While the language in which domains and problems can be represented
is clearly defined, there is no unique way of modeling domains and problems.
In figure 2.2, for example, we decided to describe states with the relations
on, ontable, and clear. There are two different put operators, one is applied
if block x is lying on the table, and the other if block x is lying on another
block z.
An alternative representation of the blocks-world domain is given in figure
2.3. Here, not only the blocks but also the table are considered as objects
of the domain. Unary static relations are used to represent that constant
symbols A, B, C, D are of type "block". Now all operators can be represented
as put(x, y). The first variant describes what happens if a block is moved from
the table and put on another block, the second variant describes how a block
is moved from one block onto another, and the third variant describes how a
block is put on the table. Another alternative would be to represent put as a
ternary operator put(block, from, to).
Another decision which influences the representation of domains and prob-
lems concerns the level of detail. We have completely abstracted from
the agent who executes the actions. If a plan is intended for execution by a
robot, it becomes necessary to represent additional operators, such as picking
up an object and holding an object, and states have to be described with ad-
ditional literals, for example what block the agent is currently holding (see
fig. 2.4).
2.1.3 Backward Operator Application

The task of a planning algorithm is to calculate sequences of transformations
from states s_0 ∈ I to states s_G with G ⊆ s_G. The planning problem is solved
if such a transformation sequence, called a plan, is found. Often, it is also
Operators:
put(?x, ?y)
  PRE: {on(?x, Table), block(?x), block(?y), clear(?x), clear(?y)}
  ADD: {on(?x, ?y)}
  DEL: {on(?x, Table), clear(?y)}
put(?x, ?y)
  PRE: {on(?x, ?z), block(?x), block(?y), block(?z), clear(?x), clear(?y)}
  ADD: {on(?x, ?y), clear(?z)}
  DEL: {on(?x, ?z), clear(?y)}
put(?x, Table)
  PRE: {clear(?x), on(?x, ?y), block(?x), block(?y)}
  ADD: {on(?x, Table), clear(?y)}
  DEL: {on(?x, ?y)}
Goal: {on(A, B), on(B, C)}
Initial State: {on(D, C), on(C, A), on(A, Table), on(B, Table),
clear(D), clear(B), block(A), block(B), block(C), block(D)}
Fig. 2.3. An Alternative Representation of the Blocks-World
required that the transformations are optimal. Optimality can be defined as
a minimal number of actions or, for operators associated with different costs,
as an action sequence with a minimal sum of costs.
Common to all state-based planning algorithms is that plan construction
can be characterized as search in the state space. Each planning step involves
the selection of an action. Finding a plan in general involves backtracking
over such selections. To guarantee termination for the planning algorithm, it
is necessary to keep track of the states already constructed, to avoid cycles.
Search in the state space can be performed forward, from an initial state to
a goal state, or backward, from a goal state to an initial state. Backward
planning is based on backward operator application:
Definition 2.1.7 (Backward Operator Application). For a state s and
an instantiated operator o, backward operator application is defined as
Res⁻¹(o, s) = (s \ ADD(o)) ∪ (DEL(o) ∪ PRE(o)) if ADD(o) ⊆ s.
For state
s_2 = {on(A, B), on(B, C), clear(A), clear(D), ontable(C), ontable(D)}
backward application of the instantiated operator put(A, B) given above
results in
s_2 \ {on(A, B)} ∪ {ontable(A), clear(A), clear(B)} =
{on(B, C), clear(A), clear(B), clear(D), ontable(A), ontable(C), ontable(D)}
= s_1.
Backward operator application is sound if for Res⁻¹(o, s) = s' it holds
that Res(o, s') = s for all states of the domain. Backward operator application
is complete if for all Res(o, s') = s it holds that Res⁻¹(o, s) = s' (see sect. 3.3).
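The worked example and the soundness condition can be checked directly with a small sketch (again our own illustration, with states as sets of ground-atom tuples):

```python
def res(op, state):
    """Forward application: Res(o, s) = (s \\ DEL(o)) | ADD(o) if PRE(o) <= s."""
    pre, add, delete = op
    return (state - delete) | add if pre <= state else None

def res_inv(op, state):
    """Backward application: Res^-1(o, s) = (s \\ ADD(o)) | DEL(o) | PRE(o)
    if ADD(o) <= s."""
    pre, add, delete = op
    return (state - add) | delete | pre if add <= state else None

put_A_B = (
    {("ontable", "A"), ("clear", "A"), ("clear", "B")},  # PRE
    {("on", "A", "B")},                                  # ADD
    {("ontable", "A"), ("clear", "B")},                  # DEL
)
s2 = {("on", "A", "B"), ("on", "B", "C"), ("clear", "A"), ("clear", "D"),
      ("ontable", "C"), ("ontable", "D")}

s1 = res_inv(put_A_B, s2)  # the left-hand state s_1 of figure 2.1
# soundness on this example: forward application recovers s_2
assert res(put_A_B, s1) == s2
```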
Originally, backward operator application was not defined for "complete"
state descriptions (i.e., an enumeration of all atoms over R which hold in a
current state) but for conjunctions of (sub-) goals. We will discuss so-called
goal regression in section 2.3.4.
Note that while plan construction can be performed by forward and back-
ward operator applications, plan execution is always performed by forward
operator application transforming an initial state into a goal state.
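Forward plan construction as characterized above, with the visited-state bookkeeping needed to avoid cycles, can be sketched as a breadth-first planner. The helper names and the tiny two-block instance are our own illustration, not from the book:

```python
from collections import deque

def forward_plan(actions, init, goal):
    """Breadth-first forward search. Returns a shortest sequence of action
    names transforming init into a state s with goal <= s, or None.
    Already constructed states are recorded to avoid cycles."""
    start = frozenset(init)
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        state, plan = queue.popleft()
        if goal <= state:
            return plan
        for name, pre, add, delete in actions:
            if pre <= state:
                succ = frozenset((state - delete) | add)
                if succ not in visited:
                    visited.add(succ)
                    queue.append((succ, plan + [name]))
    return None

# A tiny two-block instance of the first put operator from figure 2.2
actions = [
    ("put(A,B)", {("ontable", "A"), ("clear", "A"), ("clear", "B")},
     {("on", "A", "B")}, {("ontable", "A"), ("clear", "B")}),
    ("put(B,A)", {("ontable", "B"), ("clear", "B"), ("clear", "A")},
     {("on", "B", "A")}, {("ontable", "B"), ("clear", "A")}),
]
init = {("ontable", "A"), ("ontable", "B"), ("clear", "A"), ("clear", "B")}
plan = forward_plan(actions, init, {("on", "A", "B")})
# plan == ["put(A,B)"]
```

Breadth-first order guarantees a plan with a minimal number of actions, matching the first notion of optimality mentioned above.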
2.2 Extensions and Alternatives to Strips
Since the introduction of the Strips language in the seventies, different exten-
sions have been introduced by different planning groups, both to make plan-
ning more efficient and to enlarge the scope to a larger set of domains. The
extensions were mainly influenced by work from Pednault (1987, 1994), who
proposed ADL (action description language) as a more expressive but still
efficient alternative to Strips. The language PDDL (Planning Domain Defini-
tion Language) can be seen as a synthesis of all language features which were
introduced in the different planning systems available today (McDermott,
1998b).
Strips and PDDL are based on the closed world assumption, allowing that
state transformations can be calculated by adding and deleting literals from
state descriptions. Alternatively, planning can be seen as a logical inference
problem. In that sense, a Prolog interpreter is a planning algorithm. Situation
calculus as a variant of first order logic was introduced by McCarthy (1963)
and McCarthy and Hayes (1969). Although most of today's planning systems
are based on the Strips approach, situation calculus is still influential in planning
research. Basic concepts from situation calculus are used to reason about
semantic properties of (Strips) planning. Furthermore, deductive planning is
based on this representation language.
2.2.1 The Planning Domain Definition Language

The language PDDL was developed in 1998. Most current planning systems
are based on PDDL specifications as input and planning problems used at
the AIPS planning competitions are presented in PDDL (McDermott, 1998a;
Bacchus et al., 2000). The development of PDDL was a joint project involving
most of the active planning research groups of the nineties. Thus, it can be
seen as a compromise between the different syntactic representations and
language features available in the major planning systems of today.
The core of PDDL is Strips. An example for the blocks-world representa-
tion in PDDL is given in figure 2.4. For this example, we included aspects of
the behavior of an agent in the domain specification (operators pickup and
putdown, relation symbols arm-empty and holding). Note that ADD- and
DEL-lists are given together in an effect-slot. All positive literals are added,
all negated literals are deleted from the current state.
(define (domain blocksworld)
  (:requirements :strips)
  (:predicates (clear ?x)
               (on-table ?x)
               (arm-empty)
               (holding ?x)
               (on ?x ?y))
  (:action pickup
    :parameters (?ob)
    :precondition (and (clear ?ob) (on-table ?ob) (arm-empty))
    :effect (and (holding ?ob) (not (clear ?ob)) (not (on-table ?ob))
                 (not (arm-empty))))
  (:action putdown
    :parameters (?ob)
    :precondition (holding ?ob)
    :effect (and (clear ?ob) (arm-empty) (on-table ?ob)
                 (not (holding ?ob))))
  (:action stack
    :parameters (?ob ?underob)
    :precondition (and (clear ?underob) (holding ?ob))
    :effect (and (arm-empty) (clear ?ob) (on ?ob ?underob)
                 (not (clear ?underob)) (not (holding ?ob))))
  (:action unstack
    :parameters (?ob ?underob)
    :precondition (and (on ?ob ?underob) (clear ?ob) (arm-empty))
    :effect (and (holding ?ob) (clear ?underob)
                 (not (on ?ob ?underob)) (not (clear ?ob)) (not (arm-empty))))
)

(define (problem tower3)
  (:domain blocksworld)
  (:objects a b c)
  (:init (on-table a) (on-table b) (on-table c)
         (clear a) (clear b) (clear c) (arm-empty))
  (:goal (and (on a b) (on b c)))
)
Fig. 2.4. Representation of a Blocks-World Problem in PDDL-Strips
Extensions of Strips included in PDDL domain specifications are
- Typing,
- Equality constraints,
- Conditional effects,
- Disjunctive preconditions,
- Universal quantification,
- Updating of state variables.
Most modern planners are based on Strips plus the first three extensions.
While the first four extensions mainly result in a higher effectiveness for plan
construction, the last two extensions enlarge the class of domains for which
plans can be constructed.
Unfortunately, PDDL (McDermott, 1998b) mainly provides a syntactic
framework for these features but gives no or only an informal description
of their semantics. A semantics for conditional effects and effects with
all-quantification is given in Koehler, Nebel, and Hoffmann (1997). In the fol-
lowing, we introduce typing, equality constraints, and conditional effects. Up-
dating of state variables is discussed in detail in chapter 4. An example for a
blocks-world domain specification using an operator with equality constraints
and conditional effects is given in figure 2.5.
(define (domain blocksworld-adl)
  (:requirements :strips :equality :conditional-effects)
  (:predicates (on ?x ?y)
               (clear ?x)) ; clear(Table) is static
  (:action puton
    :parameters (?x ?y ?z)
    :precondition (and (on ?x ?z) (clear ?x) (clear ?y)
                       (not (= ?y ?z)) (not (= ?x ?z))
                       (not (= ?x ?y)) (not (= ?x Table)))
    :effect
      (and (on ?x ?y) (not (on ?x ?z))
           (when (not (eq ?z Table)) (clear ?z))
           (when (not (eq ?y Table)) (not (clear ?y)))))
)
Fig. 2.5. Blocks-World Domain with Equality Constraints and Conditioned Effects
2.2.1.1 Typing and Equality Constraints. The precondition in figure
2.5 contains equality constraints, expressing that all three objects involved
in the puton operator have to be different.
Equality constraints restrict what substitutions are legal in matching a
current state and a precondition. That is, definition 2.1.3 is modified such
that for all pairs (x ← t), (x' ← t') in a substitution, t ≠ t' has to hold if
(not (= x x')) is specified in the precondition of the operator.
Instead of using explicit equality constraints, matching can be defined
such that variables with different names must generally be instantiated with
different constants. But explicit equality constraints give more expressive
power: giving no equality constraint for two variables x and y allows that
these variables can be instantiated by different constants or the same constant.
Using "implicit" equality constraints makes it necessary to specify two different op-
erators, one with only one variable name (x = y), and one with both variable
names (x ≠ y).
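The effect of explicit inequality constraints on matching can be sketched in Python (our own illustration; names and encoding are not from the book). Here substitutions may bind different variables to the same constant unless a constraint of the form (not (= x y)) forbids it:

```python
from itertools import product

def apply_subst(formula, sigma):
    return {tuple(sigma.get(t, t) for t in atom) for atom in formula}

def match(formula, atoms, unequal=()):
    """All substitutions sigma with (F sigma) <= atoms. Pairs in `unequal`
    model explicit (not (= x y)) constraints from the precondition."""
    variables = sorted({t for atom in formula for t in atom if t.startswith("?")})
    constants = sorted({t for atom in atoms for t in atom[1:]})
    result = []
    for combo in product(constants, repeat=len(variables)):
        sigma = dict(zip(variables, combo))
        if apply_subst(formula, sigma) <= set(atoms) and \
           all(sigma[x] != sigma[y] for x, y in unequal):
            result.append(sigma)
    return result

s = {("clear", "A"), ("clear", "B")}
F = {("clear", "?x"), ("clear", "?y")}
# without a constraint, ?x and ?y may share a constant: 4 substitutions;
# with (not (= ?x ?y)): only the 2 substitutions binding them differently
unconstrained = match(F, s)
constrained = match(F, s, unequal=[("?x", "?y")])
```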
Besides equality constraints, types can be used to restrict matching. Let
us assume a blocks-world where movable objects consist of blocks and of
pyramids and where no objects can be put on top of a pyramid. We can
introduce the following hierarchy of types:
- table
- block
- pyramid
- movable-object: block, pyramid
- flattop-object: table, block.
Operator puton(x, y, z) can now be defined over typed variables x: movable-
object, y: flattop-object, z: flattop-object. When defining a problem for that ex-
tended blocks-world domain, each constant must be declared together with
a type.
Typing can be simulated in standard Strips using static relations. A simple
example for typing is given in figure 2.3. An example covering the type hierar-
chy from above is: {table(T), flattop-object(T), block(B), movable-object(B),
flattop-object(B), pyramid(A), movable-object(A)}. The operator precondi-
tion can then be extended by {movable-object(x), flattop-object(y), flattop-
object(z)}.
2.2.1.2 Conditional Effects. Conditional effects allow representing con-
text dependent effects of actions. In the blocks-world specification given in
figure 2.3, three different operators were used to describe the possible effects
of putting a block somewhere else (on another block or the table). In con-
trast, in figure 2.5, only one operator is needed. All variants of puton(x, y,
z) have some general application restrictions, specified in the precondition:
block x is lying on something (z is another block or the table), and both x
and y are clear, where clear(Table) is static, i.e., holds in all possible situa-
tions. Additionally, the equality constraints in the precondition specify that
all objects involved have to be different. Regardless of the context in which
puton is applied, the result is that x is no longer lying on z, but on y. This
context independent effect is specified in the first line of the effect. The next
two lines specify additional consequences: if x was not lying on the table
but on another block z, this block z is clear after operator application; and if
block x was not put on the table but on a block y, this block y is no longer
clear after operator application. Preconditions for conditioned effects are also
called secondary preconditions, conditioned effects are also called secondary
effects (Penberthy & Weld, 1992; Fink & Yang, 1997).
The main advantage of conditional effects is that plan construction gets
more efficient: in every planning step, all operators must be matched with
the current state representation and all successfully matched operators are
possible candidates for application. If the general precondition of an operator
with conditioned effects does not match with the current state, it can be
rejected immediately, while for the unconditioned variants given in figure 2.3
all three preconditions have to be checked.
24 2. State-Based Planning
The semantics of applying an operator with conditioned effects is:

Definition 2.2.1 (Operator Application with Conditional Effects). Let PRE be the general precondition of an operator, and ADD and DEL the unconditioned effects. Let PRE_i, ADD_i, DEL_i, with i = 0 ... n, be context conditions with their associated context effects. For a state s and an instantiated operator o, operator application is defined as

Res(o, s) = s \ (DEL(o) ∪ ⋃_{i∈M} DEL_i(o)) ∪ (ADD(o) ∪ ⋃_{i∈M} ADD_i(o))

if PRE(o) ⊆ s and M = {i | PRE_i(o) ⊆ s}.

Backward application is defined as

Res⁻¹(o, s) = s \ (ADD(o) ∪ ⋃_{i∈M} ADD_i(o)) ∪ (DEL(o) ∪ ⋃_{i∈M} DEL_i(o)) ∪ (PRE(o) ∪ ⋃_{i∈M} PRE_i(o))

if ADD(o) ⊆ s and M = {i | ADD_i(o) ⊆ s}.
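The forward case of this definition can be sketched operationally. The following is a minimal Python sketch (not an implementation from this book); states and effect sets are modeled as frozensets of ground literals, and the dict encoding of an operator is my own assumption:

```python
def res(op, state):
    """Forward application of an operator with conditional effects.

    op: dict with the general precondition 'pre', unconditioned effects
    'add' and 'delete', and 'cond', a list of (pre_i, add_i, del_i)
    triples. All sets contain ground literals (here: plain strings).
    Returns the successor state, or None if PRE(o) does not hold in s.
    """
    if not op["pre"] <= state:
        return None
    # M: all context conditions PRE_i satisfied in the current state
    active = [(add_i, del_i) for (pre_i, add_i, del_i) in op["cond"]
              if pre_i <= state]
    delete = op["delete"].union(*(d for _, d in active))
    add = op["add"].union(*(a for a, _ in active))
    return (state - delete) | add
```

For a puton-style operator, a context effect such as "z becomes clear if x was lying on a block z" fires only in states where its context condition on(x, z) holds; in all other states only the unconditioned effects apply.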
2.2.2 Situation Calculus

Situation calculus was introduced by McCarthy (McCarthy, 1963; McCarthy & Hayes, 1969) to describe state transitions in first order logic. The world is conceived as a sequence of situations, and situations are generated from previous situations by actions. Like Strips state representations, situations are necessarily incomplete representations of the states in the world.

Relations which can change over time are called fluents. Each fluent has an extra argument for representing a situation. For example, clear(a, s1)⁴ denotes that block a is clear in a situation referred to as s1. Changes in the world are represented by a function Res(action, situation) = situation. For example, we can write s2 = Res(put(a, b), s1) to describe that applying put(a, b) in situation s1 results in a situation s2. Note that the concept of fluents is also used in Strips. Although fluents are not specially marked there, it can be inferred from the operator effects which relational symbols correspond to fluent relations. We already used the Res-function to describe the semantics of operator applications in Strips (sect. 2.1.1). In situation calculus, the result of an operator application is not calculated by adding and deleting literals from a state but by logical inference.

The first implemented system using situation calculus for automated plan construction was proposed by Green (1969) with the QA3 system. Alternatively to the explicit use of a Res-function, not only fluents but also actions can be provided with an additional argument for situations. For example, we can write s2 = put(a, b, s1) to describe that block a is put on block b in situation s1, resulting in a new situation s2. Green's inference system is based on resolution.⁵ For example, the following two axioms might be given (the first axiom is a specific fact):

⁴ We represent constants with small letters and variables with capital letters.
⁵ We do not introduce resolution in a formal way but give an illustrative example. We assume that the reader is familiar with the basic concepts of theorem proving, as they are introduced in logic or AI textbooks.
A1 on(a, table, s1)
A2 ∀S [on(a, table, S) → on(a, b, put(a, b, S))]
   ¬on(a, table, S) ∨ on(a, b, put(a, b, S)) (clausal form).
Green's theorem prover provides two results: first, it infers whether some formula representing a planning goal follows from the axioms, and second, it provides the action sequence (the plan) if the goal statement can be derived. Not only a yes/no answer but also a plan for how the goal can be achieved can be returned because a so-called answer literal is introduced. The answer literal is initially given as answer(S) and at each resolution step, the variable S contained in the answer literal is instantiated in accordance with the involved formulas.

For example, we can ask the theorem prover whether there exists a situation S_F in which on(a, b, S_F) holds, given the pre-defined axioms. If such a situation S_F exists, answer(S_F) will be instantiated throughout the resolution proof with the plan. Resolution proofs work by contradiction. That is, we start with the negated goal:

1. ¬on(a, b, S_F) (Negation of the theorem)
2. ¬on(a, table, S) ∨ on(a, b, put(a, b, S)) (A2)
3. ¬on(a, table, S) (Resolve 1, 2)
   answer(put(a, b, S))
4. on(a, table, s1) (A1)
5. contradiction (Resolve 3, 4)
   answer(put(a, b, s1))
The resolution proof shows that a situation s2 = put(a, b, s1) with on(a, b, s2) exists and that s2 can be reached by putting a on b in situation s1.
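The effect of the answer literal can be illustrated with a small forward-chaining sketch. This is a toy, not a resolution prover (resolution works backward from the negated goal), but the binding collected for the situation variable S is the same; the tuple encoding of terms is my own assumption:

```python
def subst(term, sit):
    """Replace the situation variable "S" in a term by the situation sit."""
    if term == "S":
        return sit
    if isinstance(term, tuple):
        return tuple(subst(t, sit) for t in term)
    return term

# A1: the specific fact on(a, table, s1)
facts = {("on", "a", "table", "s1")}
# A2 as a premise/conclusion pair: on(a, table, S) -> on(a, b, put(a, b, S))
premise = ("on", "a", "table", "S")
conclusion = ("on", "a", "b", ("put", "a", "b", "S"))

derived = set()
for fact in facts:
    if fact[:3] == premise[:3]:   # same fluent with the same ground arguments
        binding = fact[3]         # the instantiation of S, i.e. the "answer"
        derived.add(subst(conclusion, binding))

# derived now contains on(a, b, put(a, b, s1));
# the situation term put(a, b, s1) is the one-step plan.
```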
Situation calculus has the full expressive power of first order predicate logic. Because logical inference does not rely on the closed world assumption, specifying a planning domain involves much more effort than in Strips: In addition to axioms describing the effects of operator applications, frame axioms, describing which predicates remain unaffected by operator applications, have to be specified.

Frame axioms always become necessary when the goal is not only a single literal but a conjunction of literals. For illustration, we extend the example given above:

A3 on(a, table, s1)
A4 on(b, table, s1)
A5 on(c, table, s1)
A6 ¬on(X, table, S) ∨ on(X, Y, put(X, Y, S))
A7 ∀S [on(Y, Z, S) → on(Y, Z, put(X, Y, S))]
   ¬on(Y, Z, S) ∨ on(Y, Z, put(X, Y, S))
A8 ∀S [on(X, table, S) → on(X, table, put(Y, Z, S))]
   ¬on(X, table, S) ∨ on(X, table, put(Y, Z, S)).
Axiom A6 corresponds to axiom A2 given above, now stated in a more general form, abstracting from blocks a and b. Axiom A7 is a frame axiom stating that a block Y is still lying on a block Z after a block X was put on block Y. Axiom A8 is a frame axiom stating that a block X is still lying on the table if a block Y is put on a block Z. Note that the given axioms are only a subset of the information needed for modeling the blocks-world domain. A complete axiomatization is given in figure 2.6 (where we represent the axioms in a Prolog-like notation, writing the conclusion side of an implication on the left-hand side).
Effect Axioms:
on(X, Y, put(X, Y, S)) ← clear(X, S) ∧ clear(Y, S)
clear(Z, put(X, Y, S)) ← on(X, Z, S) ∧ clear(X, S) ∧ clear(Y, S)
clear(Y, puttable(X, S)) ← on(X, Y, S) ∧ clear(X, S)
ontable(X, puttable(X, S)) ← clear(X, S)

Frame Axioms:
clear(X, put(X, Y, S)) ← clear(X, S) ∧ clear(Y, S)
clear(Z, put(X, Y, S)) ← clear(X, S) ∧ clear(Y, S) ∧ clear(Z, S)
ontable(Y, put(X, Y, S)) ← clear(X, S) ∧ clear(Y, S) ∧ ontable(Y, S)
ontable(Z, put(X, Y, S)) ← clear(X, S) ∧ clear(Y, S) ∧ ontable(Z, S)
on(Y, Z, put(X, Y, S)) ← clear(X, S) ∧ clear(Y, S) ∧ on(Y, Z, S)
on(W, Z, put(X, Y, S)) ← clear(X, S) ∧ clear(Y, S) ∧ on(W, Z, S)
clear(Z, puttable(X, S)) ← clear(X, S) ∧ clear(Z, S)
ontable(Z, puttable(X, S)) ← clear(X, S) ∧ ontable(Z, S)
on(Y, Z, puttable(X, S)) ← clear(X, S) ∧ on(Y, Z, S)
clear(Z, puttable(X, S)) ← on(Y, X, S) ∧ clear(Y, S) ∧ clear(Z, S)
ontable(Z, puttable(X, S)) ← on(Y, X, S) ∧ clear(Y, S) ∧ ontable(Z, S)
on(W, Z, puttable(X, S)) ← on(Y, X, S) ∧ clear(Y, S) ∧ on(W, Z, S)

Facts (Initial State):
on(d, c, s1)
on(c, a, s1)
clear(d, s1)
clear(b, s1)
ontable(a, s1)
ontable(b, s1)

Theorem (Goal):
on(a, b, S) ∧ on(b, c, S)

Fig. 2.6. Representation of a Blocks-World Problem in Situation Calculus
For the goal ∃S_F [on(a, b, S_F) ∧ on(b, c, S_F)], the resolution proof is⁶:

1. ¬on(a, b, S_F) ∨ ¬on(b, c, S_F) (Negation of the theorem)
2. ¬on(X, table, S) ∨ on(X, Y, put(X, Y, S)) (A6)
3. ¬on(b, c, put(a, b, S)) ∨ ¬on(a, table, S) (Resolve 1, 2)
4. ¬on(Y, Z, S') ∨ on(Y, Z, put(X, Y, S')) (A7)

⁶ When introducing a new clause, we rename variables such that there can be no confusion, as usual for resolution.
5. ¬on(a, table, S') ∨ ¬on(b, c, S') (Resolve 3, 4)
6. ¬on(X, table, S) ∨ on(X, Y, put(X, Y, S)) (A6)
7. ¬on(a, table, put(b, c, S)) ∨ ¬on(b, table, S) (Resolve 5, 6)
8. ¬on(X, table, S') ∨ on(X, table, put(Y, Z, S')) (A8)
9. ¬on(b, table, S) ∨ ¬on(a, table, S) (Resolve 7, 8)
10. on(b, table, s1) (A4)
11. ¬on(a, table, S) (Resolve 9, 10)
12. on(a, table, s1) (A3)
13. contradiction.
Resolution gives us an inference rule for proving that a goal theorem follows from a set of axioms and facts. For automated plan construction, additionally a strategy is needed which guides the search for the proof (i.e., the sequence in which axioms and facts are introduced into the resolution steps). An example for such a strategy is SLD-resolution as used in Prolog (Sterling & Shapiro, 1986). In general, finding a proof involves backtracking over the resolution steps and over the ordering of goals.

The main reason why Strips and not situation calculus became the standard for domain representations in planning is certainly that representing a domain with effect and frame axioms is much more time consuming and much more error-prone than modeling only operator preconditions and effects. Additionally, special purpose planning algorithms are naturally more efficient than general theorem provers. Furthermore, for a long time, the restricted expressiveness of Strips was considered sufficient for representing domains which are of interest in plan construction. When more interesting domains, for example domains involving resource constraints (see chap. 4), were considered in planning research, the expressiveness of Strips was extended to PDDL. But still, PDDL is a more restricted language than situation calculus. Due to the progress in automated theorem proving over the last decade, the efficiency concerns which caused the prominence of state-based planning might no longer be justified (Bibel, 1986). Therefore, it might be of interest again to compare current state-based and current deductive (situation calculus) approaches.
2.3 Basic Planning Algorithms

In general, a planning algorithm is a special purpose search algorithm. State-based planners search in the state-space, as briefly described in section 2.1.1. Another variant of planners, called partial-order planners, search in the so-called plan space. Deductive planners search in the "proof space", i.e., the possible orderings of resolution steps. Basic search algorithms are introduced in all introductory algorithm textbooks, for example in Cormen, Leiserson, and Rivest (1990).
In the following, we will first introduce basic concepts for plan construction. Then forward planning is described, followed by a discussion of complexity results for planning and formal properties of plans. Finally, we introduce backward planning.
2.3.1 Informal Introduction of Basic Concepts

The definitions for forward and backward operator application given above (def. 2.1.6 and def. 2.1.7) are the crucial component for plan construction: The transformation of a given state into a next state by operator application constitutes one planning step. In general, each planning step consists of matching all operators with the current state description, selecting one instantiated operator which is applicable in the current state, and applying this operator. Plan construction involves a series of such match-select-apply cycles. In each planning step one operator is selected for application; that is, plan construction is based on depth-first search and operator selection is a backtrack point. During plan construction, a search tree is generated. State descriptions are nodes, action applications are arcs in the search tree. Each planning step expands the current leaf node s of the search tree by introducing a new action o and the state description s' resulting from applying o to s. For forward planning, search starts with an initial state as root; for backward planning, search starts with the top-level goals (or a goal state) as root.

Input to a planning algorithm is a planning problem P(O, I, G), as defined in section 2.1.1. Output of a planning algorithm is a plan. For basic Strips planning, a plan is a sequence of actions transforming an initial state into a state fulfilling the top-level goals. Such an executable plan is also called a solution. More generally, a plan is a set of operators together with a set of binding constraints for the variables occurring in the operators and a set of ordering constraints defining the sequence of operator applications. If the plan is not executable, it is also called a partial plan. A plan is not executable if it does not contain all operators necessary to transform an initial state into a goal state, or if not all variables are instantiated, or if there is no complete ordering of the operators.

When implementing a planner, at least the following functionalities must be provided:

- A pattern-matcher and possibly a mechanism for dealing with free variables (i.e., variables which are not bound by matching an operator with a set of atoms).
- A strategy for selecting an applicable action and for handling backtracking (taking back an earlier commitment).
- A mechanism for calculating the effect of an operator application.
- A data structure for storing the partially constructed plan.
- A data structure for holding the information necessary to detect cycles (i.e., states which were already generated) to guarantee termination.
Some planning systems calculate a set of actions before plan construction by instantiating the operators with the given constant symbols. As a consequence, during plan construction, matching is replaced by a simple subset-test, where it is checked whether the set of instantiated preconditions of an operator is completely contained in the set of atoms representing the current state. Instantiation of operators is realized with respect to the (possibly typed) constants defined for a problem, by extracting them from a given initial state or by using the set of explicitly declared constants. In general, such a "freely generated" set of actions is a super-set of the set of "legal" actions.

Often planning algorithms do not directly construct a plan as a fully instantiated, totally ordered sequence of actions. Some planners construct partially ordered plans where some actions for which no ordering constraints were determined during planning are "parallel". Some planners store plans as part of a more general data structure. In such cases, additional functionalities for plan linearization and/or plan extraction must be provided.
2.3.2 Forward Planning

Plan construction based on forward operator application starts with an initial state as root of the search tree. Plan construction terminates successfully if a state is found which satisfies all top-level planning goals. Forward planning is also called progression planning. An informal forward planning algorithm is given in table 2.1.

Table 2.1. Informal Description of Forward Planning

Until the top-level goals are satisfied in a state or until all possible states are explored DO
For the current state s:
MATCH: For all operators op ∈ O calculate all substitutions such that the operator preconditions are contained in s. Generate a set of action candidates A = {o | PRE(o) ⊆ s}.
SELECT: Select one element o from A.
APPLY: Calculate the successor state Res(o, s) = s'.
BACKTRACK: If there is no successor state (A is empty), go back to the predecessor state of s. If the generated successor state is already contained in the plan, select another element from A.
PROCEED: Otherwise, insert the selected action in the plan and proceed with s' as current state.
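The match-select-apply cycle with depth-first backtracking and a path-based cycle test can be condensed into a few lines. This is a minimal sketch under simplifying assumptions (ground, pre-instantiated operators given as (name, PRE, ADD, DEL) tuples; states as frozensets of literals), not the planner developed in this book:

```python
def forward_plan(goals, operators, path):
    """Depth-first forward planning; path is a list of (action, state)
    pairs whose first entry is (None, initial_state). Returns a list of
    action names, or None if the search space is exhausted."""
    state = path[-1][1]
    if goals <= state:                            # top-level goals satisfied
        return [action for action, _ in path[1:]]
    for name, pre, add, delete in operators:      # MATCH
        if pre <= state:
            succ = (state - delete) | add         # APPLY
            if any(succ == s for _, s in path):   # cycle test on current path
                continue
            plan = forward_plan(goals, operators, path + [(name, succ)])
            if plan is not None:                  # SELECT = first success;
                return plan                       # failure means BACKTRACK
    return None
```

For a hypothetical two-block problem with ground operators puttable(a) and put(a, b), a call with initial state {on(a,b), clear(a), ontable(b)} and goals {ontable(a), ontable(b)} returns the one-step plan ["puttable(a)"].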
In this algorithm, we abstract from the data structure for saving a plan and from the data structures necessary for handling backtracking and for detection of cycles. One possibility would be to put in each planning step an action/successor-state pair on a stack. When the algorithm terminates successfully, the plan corresponds to the sequence of actions on the stack. For backtracking and cycle detection a second data structure is needed where all generated and rejected states are saved. Both pieces of information can be represented together if the complete search history is saved explicitly in a search tree (see Winston, 1992, chap. 4). A search tree can be represented as a list of lists:
Definition 2.3.1 (The Data Structure "Search Tree"). A search tree ST can be represented as a list of lists, where each list represents a path: ST = nil | cons(path, ST) with

empty(ST) = true if ST = nil, false else

and selector functions first(cons(path, ST)) = path, rest(cons(path, ST)) = ST.
A (partially expanded) path is defined as a list of action-state pairs (o s). The root (initial state) is represented as (nil s). The selector function getaction((o s)) returns the first, and the selector function getstate((o s)) the second argument of an action-state pair.
A path is defined as: path = nil | rcons(path, pair) with selector function last(rcons(path, pair)) = pair.
With getstate(last(path)) a leaf of the search tree is retrieved. For a current state s = getstate(last(path)) and a list of action candidates A, a path is expanded by expand(path, A) = rcons(path, cons(o, s')) for all o ∈ A with Res(o, s) = s'.
A plan can be extracted from a path with getplan(path) = map(λx. getaction(x), path), where the higher-order function map describes that function getaction is applied to the sequence of all elements in path.
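Definition 2.3.1 translates almost directly into code. A small sketch, where plain Python lists stand in for the cons/rcons cells (my own simplification):

```python
def first(st):  return st[0]        # left-most path of the search tree
def rest(st):   return st[1:]
def last(path): return path[-1]
def getaction(pair): return pair[0]
def getstate(pair):  return pair[1]

def expand(path, candidates):
    """Expand a path by a list AR of (action, successor-state) pairs,
    yielding one extended path per candidate."""
    return [path + [pair] for pair in candidates]

def getplan(path):
    """Map getaction over the path, dropping the root's empty action."""
    return [getaction(pair) for pair in path if getaction(pair) is not None]
```

With ST = [[(None, s0)]] as initial tree, one planning step replaces the first path by its expansions: expand(first(ST), AR) + rest(ST).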
A "fleshed-out" version of the algorithm in table 2.1 is given in table 2.2. For saving the possible backtrack points, now all actions which are applicable in a state are inserted in the search tree, together with their corresponding successor states. That is, selection of an action is delayed one step. The selection strategy is to always expand the first ("left-most") path in the search tree. That is, the algorithm is still depth-first. To guarantee termination, for the cycle-test it is sufficient to check whether an action results in a state which is already contained in the current path. Alternatively, it could be checked whether the new state is already contained anywhere else in the search tree. This extended cycle check makes sure that a state is always reached with the shortest possible action sequence but possibly involves more backtracking. Note that the extended cycle-test does not result in optimal plans: although every state already included in the search tree is reached with the shortest possible action sequence, the optimal path might be found in a not yet expanded part of the search tree which does not include this state.

An example for plan construction by forward search is given in figure 2.7. We use the operators as defined in figure 2.2. The top-level goals are on(A, B), on(B, C), and the initial state is on(A, C), ontable(B), ontable(C), clear(A), clear(B). For better readability, the states are presented graphically and not as sets of literals. All generated and detected cycles are presented in
Table 2.2. A Simple Forward Planner

Main Function fwplan(O, G, ST)
Initial Function Call: fwplan(O, G, [[(nil s)]]) with a set of operators O, an initial state s ∈ I, and a set of top-level goals G
Return: A plan as sequence of actions which transforms s into a state s_G with G ⊆ s_G, extracted from search tree ST.

1. IF empty(ST) THEN "no plan found"
2. ELSE LET the current state be S = getstate(last(first(ST))).
   (corresponds to SELECT action getaction(last(first(ST))))
   a) IF G ⊆ S THEN getplan(first(ST))
   b) ELSE
      i. MATCH: For all op ∈ O calculate match(op, S) as in def. 2.1.3.
         LET A = {o_1, ..., o_n} be the list of all instantiated operators with PRE(o_i) ⊆ S.
      ii. APPLY: For all o_i ∈ A calculate S'_i = Res(o_i, S) as in def. 2.1.6.
         LET AR = {(o_1 S'_1), ..., (o_n S'_n)}
      iii. Cycle-Test: Remove all pairs (o_i S'_i) from AR where S'_i is contained in first(ST).
      iv. Recursive call:
         IF empty(AR) THEN fwplan(O, G, rest(ST)) (BACKTRACK)
         ELSE fwplan(O, G, append(expand(first(ST), AR), rest(ST))).
the figure, but backtracking is explicitly depicted only for the first two. The generation of action candidates is based on an ordering of put before puttable, and instantiation is based on alphabetical order (A before B before C).

For the given sequence in which action candidates are expanded, search is very inefficient. In fact, all possible states of the blocks-world are generated to find the solution. In general, uninformed forward search has a high branching factor. Forward search can be made more efficient if some information about the probable distance of a state to a goal state can be used. The best known heuristic forward search algorithm is A* (Nilsson, 1971), where a lower bound estimate for the distance from the current state to the goal is used to guide search. For domain independent planning, in contrast to specialized problem solving, such domain specific information is not available. An approach for generating such estimates in a pre-processing step to planning was presented by Bonet and Geffner (1999).
2.3.3 Formal Properties of Planning

2.3.3.1 Complexity of Planning. Even if a search tree can be kept smaller than in the example given above, in the worst case planning is NP-complete. That is, a (deterministic) algorithm needs exponential time for finding a solution (or deciding that a problem is not solvable). For a search tree with a maximal depth of n and a maximal branching factor of m, in the worst case,
[Figure: forward search tree with states s0 to s14, arcs labeled with put and puttable actions, detected cycles, unused backtrack points, and the goal state marked]
Fig. 2.7. A Forward Search Tree for Blocks-World
planning effort is m^n, i.e., the complete search tree has to be generated.⁷ Even worse, for problems without a restriction on the maximal length of a possible solution, planning is PSPACE-complete. That is, planning is as hard as any problem solvable with a polynomial amount of memory, and even non-deterministic algorithms are not known to solve it in polynomial time (Garey & Johnson, 1979).
Because in the worst case all possible states have to be generated for plan construction, the number of nodes in the search tree is approximately equivalent to the number of problem states. For problems where the number of states grows systematically with the number of objects involved, the maximal size of the search tree can be calculated exactly. For example, for the Tower of Hanoi domain⁸ with three pegs and n discs, the number of states is 3^n (and the minimal depth of a solution path is 2^n − 1).

The blocks-world domain is similar to the Tower of Hanoi domain but has fewer constraints. First, each block can be put on each other block (instead of a disc only being allowed on a larger disc), and second, the table is assumed to always have additional free space for a block (instead of only three pegs where discs can be put). Enumeration of all states of the blocks-world domain with n blocks corresponds to the abstract problem of generating the set of all possible sets of lists which can be constructed from n elements. For example, for three blocks A, B, C:

{ {(A B C)}, {(A C B)}, {(B A C)}, {(B C A)}, {(C A B)}, {(C B A)},
  {(B C), (A)}, {(C B), (A)}, {(A C), (B)}, {(C A), (B)}, {(A B), (C)}, {(B A), (C)},
  {(A), (B), (C)} }.

A single list with three elements represents a single tower, for example a tower with block A on B and B on C for the first element given above. A set of two lists represents two towers on the table, for example block B lying on block C and A as a one-block tower. The number of sets of lists corresponds to the so-called Lah number (Knuth, 1992).⁹ It can be calculated by the following formula:

a(0) = 0
a(1) = 1
a(n) = (2n − 1) · a(n − 1) − (n − 1)(n − 2) · a(n − 2).

The growth of the number of states for the blocks-world domain is given for up to sixteen blocks in table 2.3.
⁷ Note that this only holds for finite planning domains where all states are enumerable. For more complex domains involving relational symbols over arbitrary numerical arguments, the search tree becomes infinite and therefore other criteria have to be introduced for termination (making planning incomplete, see below).
⁸ The Tower of Hanoi domain is introduced in chapter 4.
⁹ More background information can be found at http://www.research.att.com/cgi-bin/access.cgi/as/njas/sequences/eisA.cgi?Anum=000262.
Table 2.3. Number of States in the Blocks-World Domain

# blocks   1        2        3        4        5
# states   1        3        13       73       501
approx.    1.0·10⁰  3.0·10⁰  1.3·10¹  7.3·10¹  5.0·10²

# blocks   6        7        8        9        10
# states   4051     37633    394353   4596553  58941091
approx.    4.1·10³  3.8·10⁴  3.9·10⁵  4.6·10⁶  5.9·10⁷

# blocks   11         12           13            14             15
# states   824073141  12470162233  202976401213  3535017524403  65573803186921
approx.    8.2·10⁸    1.2·10¹⁰     2.0·10¹¹      3.5·10¹²       6.6·10¹³

# blocks   16 ...
# states   1290434218669921 ...
approx.    1.3·10¹⁵
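The recurrence given above can be checked against table 2.3 with a few lines:

```python
def blocks_world_states(n):
    """Number of blocks-world states with n blocks, i.e. the number of
    sets of lists over n elements, computed with the recurrence
    a(n) = (2n - 1) a(n - 1) - (n - 1)(n - 2) a(n - 2)."""
    a = [0, 1]                       # a(0) = 0, a(1) = 1
    for k in range(2, n + 1):
        a.append((2 * k - 1) * a[k - 1] - (k - 1) * (k - 2) * a[k - 2])
    return a[n]
```

Since Python integers are unbounded, the exact values for sixteen and more blocks are obtained without overflow.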
The search tree might be even larger than the number of legal states of a problem, i.e., the number of nodes in the state-space. First, some states can be constructed more than once (if cycle detection is restricted to paths) and second, for some planning algorithms "illegal" states which do not belong to the state space might be constructed. Backward planning by goal regression can lead to such illegal states (see sect. 2.3.4).

An analysis of the complexity of Strips planning is given by Bylander (1994). In another approach to planning, the state space (or that part of it which might contain the searched-for solution) is already given. The task of the planner is then to extract a solution from the search space (this strategy is used by Graphplan, see sect. 2.4.2.1). This problem is still NP-complete!

Because planning is an inherently hard problem, no planning approach can outperform alternative approaches in every domain. Instead, different planning approaches are better suited for different kinds of domains.
2.3.3.2 Termination, Soundness, Completeness. A planning algorithm should, like every search algorithm, be correct and complete:

Definition 2.3.2 (Soundness). A planning algorithm is sound if it only generates legal solutions for a planning problem. That is, the generated sequence of actions transforms the given initial state into a state satisfying the given top-level goals. Soundness implies that the generated plans are consistent: A state generated by applying an action to a previous state is consistent if it does not contain contradictory literals, i.e., if it belongs to the domain. A solution is consistent if it does not contain contradictions with regard to variable bindings and to the ordering of states.

Proof of soundness relies on the operational semantics given for operator application (such as our definitions 2.1.6 and 2.1.7). The proof can be performed by induction over the length of the sequences of actions.

Correctness follows from soundness and termination:

Definition 2.3.3 (Correctness). A search algorithm is correct if it is sound and termination is guaranteed.
To prove termination, it has to be shown that with each plan step the search space is reduced such that the termination conditions given for the planning algorithm are eventually reached. For finite domains and a cycle-test covering the search tree, the planner always terminates when the complete search tree has been constructed.

Besides making sure that a planner always terminates and that, if it returns a plan, that plan is a solution to the input problem, it is desirable that the planner returns a solution for all problems where a solution exists:

Definition 2.3.4 (Completeness). A search algorithm is complete if it finds a solution whenever a solution exists. That is, if the algorithm terminates without a solution, then the planning problem has no solution.

To prove completeness, it has to be shown that the planner only terminates without a solution if no solution exists for the problem.

We will give proofs of correctness and completeness for our planner DPlan in chapter 3.

Backward planning algorithms based on a linear strategy are incomplete. The Sussman anomaly is based on this incompleteness. Incompleteness of linear backward planning is discussed in section 2.3.4.2.
2.3.3.3 Optimality. When abstracting from different costs (such as time or energy use) of operator applications, optimality of a plan is defined with respect to its length. Operator applications are assumed to have uniform costs (for example one unit per application) and the cost of a plan is equivalent to the number of actions it contains.

Definition 2.3.5 (Optimality of a Plan). A plan is optimal if each other plan which is a solution for the given problem has equal or greater length. For operators involving different (positive) costs, a plan is optimal if for each other plan which is a solution the sum of the costs of the actions in the plan is equal or higher.

For uninformed planning based on depth-first search, as described in section 2.3.2, optimality of plans cannot be guaranteed. In fact, there is a trade-off between efficiency and optimality: to generate optimal plans, the state-space has to be searched more exhaustively. The obvious search strategy for obtaining optimal plans is breadth-first search.¹⁰ Universal planning (see sect. 2.4.4) for deterministic domains results in optimal plans.
2.3.4 Backward Planning

Backward planning often results in smaller search trees than forward planning because the top-level goals can be used to guide the search. For a long

¹⁰ Alternatively, depth-first search can be extended such that all plans are generated, always keeping the currently shortest plan.
time, planning was used as a synonym for backward planning, while forward planning was often associated with problem solving (see sect. 2.4.5.1). For backward planning, plan construction starts with the top-level goals and in each planning step a predecessor state is generated by backward operator application. Backward planning is also called regression planning.

Before we go into the details of backward planning, we present the backward search tree for the example we already presented for forward search (see fig. 2.8). The backward search tree is slightly smaller than the forward search tree. Ignoring the states which were generated as backtrack points, in forward search thirteen states have to be visited, in backward search nine. In general, the savings can be much higher.
[Figure: backward search tree with states s0 to s11, starting from the goal state, arcs labeled with put and puttable actions, detected cycles, unused backtrack points, and the initial state marked]
Fig. 2.8. A Backward Search Tree for Blocks-World
There is a price to pay for obtaining smaller search trees. The problem is that a backward operator application can produce an inconsistent state description (see def. 2.3.4). We will discuss this problem in section 3.3 in chapter 3, in the context of our state-based non-linear backward planner DPlan. In the following, we will discuss the classical (Strips) approach to backward planning, using goal regression, together with the problem of detecting and eliminating inconsistent state descriptions. Furthermore, we will address the problem of incompleteness of classical linear backward planning and present non-linear planning, which is complete.
2.3.4.1 Goal Regression. Regression planning starts with the top-level
goals of a planning problem as input (root of the sear h tree). For a planning
problem involving three blo ks A, B, and C and the top-level goals G =
fon(A, B), on(B, C)g, the state depi ted in the root node in gure 2.8 is
the only legal goal state with G s = fon(A, B), on(B, C), ontable(C),
lear(A)g. In ontrast to forward planning, planning does not start with a
omplete state des ription s but with a partial state des ription G A planning
step is realized by goal regression:
Definition 2.3.6 (Goal Regression). Let G be a set of (goal) literals and o an instantiated operator.
If for an atom p ∈ G holds p ∈ ADD(o), then the goal corresponding to p can be replaced by true, which is equivalent to removing p from G.
If for an atom p ∈ G holds p ∈ DEL(o), then the goal corresponding to p is destroyed, that is, it must be replaced by false, which is equivalent to replacing formula G by false.
If p ∈ G is neither contained in ADD(o) nor in DEL(o), nothing is changed.
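Definition 2.3.6 can be stated compactly in code. The following sketch (illustrative, not the book's implementation) represents an instantiated operator by its precondition, ADD-, and DEL-lists as sets of ground literals; as in the regression tree of figure 2.9, the operator's preconditions become the new subgoals:

```python
def regress(goals, op):
    """Goal regression following def. 2.3.6. `op` is an instantiated
    operator given as a dict of literal sets 'pre', 'add', 'del'.
    Returns the regressed subgoal set, or None if some goal is
    destroyed (i.e., the whole formula collapses to false)."""
    if goals & op["del"]:
        return None
    # Goals in ADD(o) are replaced by true (removed); the operator's
    # preconditions are introduced as new subgoals.
    return (goals - op["add"]) | op["pre"]

# A ground, simplified put(B, C) operator (moving B from the table onto C):
put_B_C = {"pre": {"clear(B)", "clear(C)", "ontable(B)"},
           "add": {"on(B,C)"},
           "del": {"clear(C)", "ontable(B)"}}
```

Regressing {on(A, B), on(B, C)} through this operator removes on(B, C) and adds put's preconditions, while regressing a goal set that contains clear(C) fails, since put(B, C) deletes clear(C).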
If a formula is reduced to false, an inconsistent state description is detected. Another possibility to deal with inconsistency is to introduce domain axioms and deduce contradictions by theorem proving. We give the beginning of the search tree using goal regression in figure 2.9. A complete regression tree using the domain specification given in figure 2.4 is given in (Nilsson, 1980, pp. 293-295). Using goal regression, free variables can occur when introducing new subgoals. Typically, these variables can be instantiated if a subgoal expression is reached which matches the initial state. Depending on the domain, much effort may go into dealing with inconsistent states. If more complex operator specifications such as conditional effects and all-quantified expressions are allowed, the number of impossible states which are constructed and must be detected during plan construction can explode.
To summarize, there are the following differences between forward and backward planning:
Complete vs. Partial State Descriptions. Forward planning starts with a complete state representation (the initial state) while backward planning starts with a partial state representation (the top-level goals, which might contain variables).
38 2. State-Based Planning
Fig. 2.9. Goal-Regression for Blocks-World (regression tree over subgoal sets; branches that become inconsistent, e.g. because puttable(x) deletes on(x, B), are marked)
Consistency of State Descriptions. A planning step in forward planning always generates a consistent successor state. Soundness of forward planning follows easily from the soundness of the planning steps. A planning step in backward planning can result in an inconsistent state description. In general, a planner might not detect all inconsistencies. To prove soundness, it must be shown that if a plan is returned, all intermediate state representations on the path from the top-level goals to the initial state are consistent.
Variable Bindings. In forward planning, all constructed state representations are fully instantiated. This is due to the way in which planning operators are defined: Usually, all variables occurring in an operator are bound by the precondition. In backward planning, newly introduced subgoals (i.e., preconditions of an operator) might contain variables which are not bound by matching the current subgoals with an operator.
In chapter 3 we will introduce a backward planning strategy based on complete state descriptions: Planning starts with a goal state instead of with the top-level goals. An operator is applicable if all elements of its ADD-list match the current state. Operator application is performed by removing all elements of the ADD-list from the current description, i.e., all literals are reduced to true in one step, and by adding the union of preconditions and DEL-list (see def. 2.1.7). Since usually the elements of the DEL-list are a subset of the elements of the preconditions, this state-based backward operator application can be seen as a special kind of goal regression.
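Under the simplifying assumption that states and operator lists are sets of ground literals, this state-based backward step can be sketched as follows (the function name and operator encoding are illustrative, not DPlan's actual code):

```python
def backward_apply(state, op):
    """State-based backward operator application: `op` (a dict of literal
    sets 'pre', 'add', 'del') is applicable if its whole ADD-list holds
    in `state`; the predecessor state is obtained by removing the
    ADD-list and adding the union of preconditions and DEL-list
    (cf. def. 2.1.7)."""
    if not op["add"] <= state:      # every ADD-element must match
        return None
    return (state - op["add"]) | op["pre"] | op["del"]

# Example: regressing the state with B on C through put(B, C)
# yields the state with all three blocks on the table.
put_B_C = {"pre": {"clear(B)", "clear(C)", "ontable(B)"},
           "add": {"on(B,C)"},
           "del": {"clear(C)", "ontable(B)"}}
state = {"on(B,C)", "clear(B)", "ontable(C)", "ontable(A)", "clear(A)"}
```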
A further difference between forward and backward planning is:
Goal Ordering. In forward planning, it must be decided which of the action candidates whose preconditions are satisfied in the current state is applied. In backward planning, it must be decided which (sub-)goal of a list of goals is considered next. That is, backward planning involves goal ordering.
In the following, we will show that the original linear strategy for backward planning is incomplete.
2.3.4.2 Incompleteness of Linear Planning. A planning strategy is called linear if it does not allow interleaving of sub-goals. That means plan construction is based on the assumption that a problem can be solved by solving each goal separately (Sacerdoti, 1975). This assumption only holds for independent goals; Nilsson (1980) uses the term "commutativity" (see also Georgeff, 1987).
A famous demonstration of the incompleteness of linear backward planning is the Sussman Anomaly (see Waldinger, 1977, for a discussion), illustrated in figure 2.10. Here, the linear strategy does not work, regardless of the order in which the sub-goals are approached. If B is put on C first, A is covered by these two blocks and can only be moved if the goal on(B, C) is destroyed. For reaching the goal on(A, B) first, A has to be cleared by putting C on the table; if A is subsequently put on B, C cannot be moved under B without destroying goal on(A, B).
Fig. 2.10. The Sussman Anomaly (initial state: C on A, A and B on the table; goal: on(A, B) and on(B, C))
Linear planning corresponds to dealing with goals organized in a stack:
[on(A, B), on(B, C)]
try to satisfy goal on(A, B)
solve sub-goals [clear(A), clear(B)]^11
all sub-goals hold after puttable(C)
apply put(A, B)
goal on(A, B) is reached
try to satisfy goal on(B, C).
Interleaving of goals (also called non-linear planning) allows that a sequence of planning steps dealing with one goal is interrupted to deal with another goal. For the Sussman Anomaly, this means that after block C is put on the table in pursuit of goal on(A, B), the planner switches to the goal on(B, C). Non-linear planning corresponds to dealing with goals organized in a set:
{on(A, B), on(B, C)}
try to satisfy goal on(A, B)
{clear(A), clear(B), on(A, B), on(B, C)}
clear(A) and clear(B) hold after puttable(C)
try to satisfy goal on(B, C)
apply put(B, C)
try to satisfy goal on(A, B)
apply put(A, B).
The correct sequence of goals might not be found immediately but may involve backtracking.
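The failure of the linear strategy on the Sussman Anomaly can be checked mechanically. The following minimal simulation (a hypothetical encoding, with block positions stored in a dictionary) shows that completely solving on(A, B) first leads to a dead end, while the interleaved order succeeds:

```python
# Sussman anomaly: each block maps to its support ('table' or a block).
initial = {"A": "table", "B": "table", "C": "A"}
goal = {"A": "B", "B": "C"}

def clear(state, x):
    """A block is clear if no block rests on it."""
    return all(support != x for support in state.values())

def move(state, block, dest):
    """Move `block` onto `dest`; returns the new state, or None if illegal."""
    if not clear(state, block) or (dest != "table" and not clear(state, dest)):
        return None
    new_state = dict(state)
    new_state[block] = dest
    return new_state

def run(state, plan):
    for block, dest in plan:
        state = move(state, block, dest)
        assert state is not None, "illegal move"
    return state

# Linear strategy, finishing on(A, B) first:
s = run(initial, [("C", "table"), ("A", "B")])   # on(A, B) reached ...
assert move(s, "B", "C") is None                 # ... but B is now blocked by A

# Non-linear (interleaved) strategy:
s = run(initial, [("C", "table"), ("B", "C"), ("A", "B")])
assert all(s[b] == d for b, d in goal.items())   # both goals hold
```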
Another example illustrating the incompleteness of linear planning was presented by Veloso and Carbonell (1993): Given are a rocket and several packages, together with operators for loading and unloading packages into and from the rocket and an operator for shooting the rocket to the moon, but no operator for driving the rocket back from the moon. The planning goal is to transport some packages, for example at(PackA, Moon), at(PackB, Moon), at(PackC, Moon). If the goals are addressed in a linear way, one package would be loaded into the rocket, the rocket would go to the moon, and the package would be unloaded. The rest of the packages could never be delivered! The correct plan for this problem is to load all packages into the rocket before the rocket moves to its destination. We will give a plan for the rocket domain in chapter 3, and we will show in chapter 8 how this strategy for solving problems of the rocket domain can be learned from some initial planning experience.
A second source of incompleteness arises if a planner instantiates variables in an eager way, also called strong commitment planning. An illustration with a register-swapping problem is given in (Nilsson, 1980, pp. 305-307). A similar problem (sorting of arrays) and its solution is discussed in (Waldinger, 1977). We will introduce sorting problems in chapter 3. Modern planners are based on a non-linear, least commitment strategy.
^11 We ignore the additional subgoal ontable(A) resp. on(A, z) here.
2.4 Planning Systems
In this section we give a short overview of the history of and recent trends in planning research. We will give more detailed descriptions only of those research areas which are relevant for the later parts of the book. Otherwise, we will only give short characterizations together with pointers to the literature.
2.4.1 Classical Approaches
2.4.1.1 Strips Planning. The first well-known planning system was Strips (Fikes & Nilsson, 1971). It integrated concepts developed in the area of problem solving by state-space search and means-end analysis as realized in the General Problem Solver (GPS) of Newell and Simon (1961) (see sect. 2.4.5.1) and concepts from theorem proving and situation calculus (the QA3 system of Green, 1969). As discussed above, the basic notions of the Strips language are still the core of modern planning languages, such as PDDL. The Strips planning algorithm (based on goal regression and linear planning, as described above), on the other hand, was replaced at the end of the eighties by non-linear, least-commitment approaches. In the seventies and eighties, much work addressed problems with the original Strips approach, for example (Waldinger, 1977; Lifschitz, 1987).
2.4.1.2 Deductive Planning. The deductive approach to planning as theorem proving introduced by Green was pursued through the seventies and eighties mainly by Manna and Waldinger (Manna & Waldinger, 1987). Manna and Waldinger combined deductive planning and deductive program synthesis, and we will discuss their work in chapter 6. Current deductive approaches are based on so-called action languages where actions are represented as temporal logic formulas. Here plan construction is a process of reasoning about change (Gelfond & Lifschitz, 1993). At the end of the nineties, symbolic model checking was introduced as an approach to planning as verification of temporal formulas in a semantic model (Giunchiglia & Traverso, 1999). Symbolic model checking is mainly applied in universal planning for deterministic and non-deterministic domains (see sect. 2.4.4).
2.4.1.3 Partial Order Planning. Also in the seventies, the first partial order planner (NOAH) was presented by Sacerdoti (1975).^12 Partial order planning is based on a search in the space of (incomplete) plans. Search starts with a plan containing only the initial state and the top-level goals. In each planning step, the plan is refined by either introducing an action fulfilling a goal or a precondition of another action, or by introducing an ordering that puts one action in front of another action, or by instantiating a previously unbound variable. Partial order planners are based on a non-linear strategy: in each planning step an arbitrary goal or precondition can be focussed. Furthermore, the least commitment strategy, i.e., refraining from committing to a specific ordering of planning steps or to a specific instantiation of a variable as long as possible, was introduced in the context of partial order planning (Penberthy & Weld, 1992).
^12 Sacerdoti called NOAH a hierarchical planner. Today, hierarchical planning means that a plan is constructed on different levels of abstraction (see sect. 2.4.3.1).
The resulting plan is usually not a totally but only a partially ordered set of actions. Actions for which no ordering constraints occurred during plan construction remain unordered. A totally ordered plan can be extracted from the partially ordered plan by putting parallel (i.e., independent) steps in an arbitrary order. An overview of partial order planning together with a survey of the most important contributions to this research is given in (Russell & Norvig, 1995, chap. 11). Partial order planning was the dominant approach to planning from the end of the eighties to the mid-nineties. The main contribution of partial order planning is the introduction of non-linear planning and least commitment.
2.4.1.4 Total Order Non-Linear Planning. Also at the end of the eighties, the Prodigy system was introduced (Veloso, Carbonell, Perez, Borrajo, Fink, & Blythe, 1995). Prodigy is a state-based, total-order planner based on a non-linear strategy. Prodigy is more a framework for planning than a single planning system. It allows the selection of a variety of planning strategies and, more importantly, it allows search to be guided by domain-specific control knowledge, turning a domain-independent into a more efficient domain-specific planner. Prodigy includes different strategies for learning such control knowledge from experience (see sect. 2.5.2) and offers techniques for reusing already constructed plans in the context of new problems based on analogical reasoning.
Two other total-order, non-linear backward planners are HSPr (Haslum & Geffner, 2000) and DPlan (Schmid & Wysotzki, 2000b), which we will present in detail in chapter 3.
2.4.2 Current Approaches
2.4.2.1 Graphplan and Derivates. Since the mid-nineties, a new, more efficient generation of planning algorithms has dominated the field. The Graphplan approach presented by Blum and Furst (1997) can be seen as the starting point of the new development. Graphplan deals with planning as a network flow problem (Cormen et al., 1990). The process of plan construction is divided into two parts: first, a so-called planning graph is constructed by forward search; second, a partial order plan is extracted by backward search. Plan extraction from a planning graph of fixed size corresponds to a bounded-length plan construction problem.
The planning graph can be seen as a partial representation of the state-space of a problem, representing a sub-graph of the state-space which contains paths from the given initial state to the given planning goals. An example of a planning graph is given in figure 2.11. It contains fully instantiated literals and actions. But, in contrast to a state-space representation, nodes in the graph are not states (i.e., conjunctions of literals) but single literals (propositions). Because the number of different literals in a domain is considerably smaller than the number of different sub-sets of those literals (i.e., state representations), the size of a planning graph does not grow exponentially (see sect. 2.3.3.1).
Fig. 2.11. Part of a Planning Graph as Constructed by Graphplan (alternating proposition and action levels 1, 2, 3, ...; mutex relations are marked between incompatible nodes)
The planning graph is organized level-wise. The first level is the set of propositions contained in the initial state; the next level is a set of actions. A proposition from the first level is connected with an action in the second level if it is a precondition for this action. The next level contains propositions again. For each action of the preceding level, all propositions contained in its ADD-list are introduced (and connected with the action). The planning graph contains so-called noop actions at each level which just pass a literal from level k to level k + 2. Furthermore, so-called mutex relations are introduced at each level, representing (an incomplete set of) propositions or actions which are mutually exclusive on a given level. Two actions are mutex if they interfere (one action deletes a precondition or ADD-effect of the other) or have competing needs (have mutually exclusive preconditions). Two propositions p and q are mutex if each action having an add-edge to proposition p is marked as mutex of each action having an add-edge to proposition q. Construction of a planning graph terminates when for the first time a level contains all literals occurring in the planning goal and these literals are not mutex. If backward plan extraction fails, the planning graph is extended by one level.
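The level-wise construction just described can be sketched as follows. This is a deliberately simplified illustration (noop actions are included, but mutex computation and backward plan extraction are omitted, and the operator encoding is an assumption):

```python
def expand(props, operators):
    """One Graphplan expansion step: from a proposition level, build the
    next action level (applicable operators plus noops) and the
    following proposition level (union of all ADD-lists)."""
    actions = [op for op in operators if op["pre"] <= props]
    actions += [{"name": f"noop-{p}", "pre": {p}, "add": {p}, "del": set()}
                for p in props]          # noops pass literals through
    next_props = set().union(*(a["add"] for a in actions))
    return actions, next_props

def build_graph(init, goal, operators, max_levels=10):
    """Extend the graph until a proposition level contains all goal
    literals (the mutex test is omitted in this sketch)."""
    props, levels = set(init), [set(init)]
    for _ in range(max_levels):
        if goal <= props:
            return levels
        _, props = expand(props, operators)
        levels.append(props)
    return None
```

For a toy domain with operators p -> q and q -> r, the goal {p, r} first appears (and would be checked for mutexes) at the third proposition level.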
Originally, Graphplan was developed for the Strips planning language. Koehler et al. (1997) presented an extension to conditional and universally quantified operator effects (system IPP). Another successful Graphplan-based system is STAN (Long & Fox, 1999). After Graphplan, a variety of so-called compilation approaches have become popular. Starting with a planning graph, the bounded-length plan construction problem is addressed by different approaches to solving canonical combinatorial problems, such as satisfiability solvers (Kautz & Selman, 1996, Blackbox), integer programming, or constraint satisfaction algorithms (see Kambhampati, 2000, for a survey). IPP, STAN, and Blackbox solve blocks-world problems with up to 10 objects in under a second and up to thirteen objects in under 1000 seconds (Bacchus et al., 2000).
Traditional planning algorithms worked for a (small) sub-set of first-order logic where in each planning step variables occurring in the operators must be instantiated. In the context of partial order planning, the expressive power of the original Strips approach was extended by concepts included in the PDDL language, as discussed in section 2.2.1 (Penberthy & Weld, 1992). In contrast, compilation approaches are based on propositional logic. The reduction in the expressiveness of the underlying language gives rise to a gain in efficiency of plan construction. Another approach based on propositional representations is symbolic model checking, which we will discuss in context with universal planning (see sect. 2.4.4).
2.4.2.2 Forward Planning Revisited. Another efficient planning system of the late nineties is the system HSP of Bonet and Geffner (1999). HSP is a forward planner based on heuristic search. The power of HSP is based on an efficient procedure for calculating a lower bound estimate h'(s) for the distance (remaining number of actions) from the current state s to a goal state. The heuristic function h'(s) is calculated for a "relaxed" problem P', which is obtained from the given planning problem P by ignoring the DEL-lists of the operators. Thus, h'(s) is set to the number of actions which transform s into a state where the goal literals appear for the first time, ignoring that preconditions of these actions might be deleted on the way.
The new generation of forward planners dominated in the 2000 AIPS competition (Bacchus et al., 2000). For example, HSP can solve blocks-world problems with up to 35 blocks in under 1000 seconds.
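The delete-relaxation behind h'(s) can be illustrated by a small fixpoint computation. Note that this sketch counts relaxed expansion layers rather than summing per-action costs as HSP actually does; it is meant only to show the effect of ignoring the DEL-lists:

```python
def h_relaxed(state, goal, operators):
    """Layered delete-relaxation estimate: ignore DEL-lists and count
    how many expansion layers are needed until every goal literal has
    appeared at least once. Returns None if the goals are unreachable
    even in the relaxed problem."""
    reached, steps = set(state), 0
    while not goal <= reached:
        new = set().union(*(op["add"] for op in operators
                            if op["pre"] <= reached), reached)
        if new == reached:
            return None           # relaxed fixpoint without the goals
        reached, steps = new, steps + 1
    return steps
```

For operators p -> q and q -> r, the goal {r} is reached after two relaxed layers from {p}, while an unknown literal is correctly reported as unreachable.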
2.4.3 Complex Domains and Uncertain Environments
2.4.3.1 Including Domain Knowledge. Most realistic domains are by far more complex than the blocks-world domain discussed so far. Domain specifications for more realistic domains, such as assembling of machines or logistics problems, might involve large sets of (primitive) operators and/or huge state-spaces. The obvious approach to generate plans for complex domains is to restrict search by providing domain-specific knowledge. Planners relying on domain-specific knowledge can solve blocks-world problems with up to 95 blocks in under one second (Bacchus et al., 2000). In contrast to the general purpose planners discussed so far, such knowledge-based systems are called special purpose planners. Note that all domain-specific approaches require that more effort and time is invested in the development of a formalized domain model. Often, such knowledge is not easy to provide.
One way to deal with a complex domain is to specify plans at different levels of detail. For example (see Russell & Norvig, 1995, p. 368), to launch a rocket, a top-level plan might be: prepare booster rocket, prepare capsule, load cargo, launch. This plan can be differentiated into several intermediate-level plans until a plan containing executable actions is reached (on the level of detail of insert nut A into hole B, etc.). This approach is called hierarchical planning. For hierarchical planning, in addition to primitive operators which can be instantiated to executable actions, abstract operators must be specified. An abstract operator represents a decomposition of a problem into smaller problems. To specify an abstract operator, knowledge about the structure of a domain is necessary. An introduction to hierarchical planning by problem decomposition is given in (Russell & Norvig, 1995, chap. 12).
Other techniques to make planning feasible for complex domains are concerned with reducing the effort of search. One source of complexity is that meaningless instantiations and inconsistent states might be generated which must be recognized and removed (see discussion in sect. 2.3.4). This problem can be reduced or eliminated if domain knowledge is provided in the form of axioms and types. A second source of complexity is that uninformed search might lead into areas of the state-space which are far away from a possible solution. This problem can be reduced by providing domain-specific control strategies which guide search. Planners that rely on such domain-specific control knowledge can easily outperform every domain-independent planner (Bacchus & Kabanza, 1996).
As an alternative to explicitly providing a planner with such domain-specific information, this information can be obtained by extracting information from domain specifications by pre-planning analysis (see sect. 2.5.1) or from some example plans using machine learning techniques (see sect. 2.5.2).
2.4.3.2 Planning for Non-Deterministic Domains. Constructing plans for real-world domains must take into account that information might be incomplete or incorrect. For example, it might be unknown at planning time whether the weather conditions are sunny or rainy when the rocket is to be launched (see above); or during planning time it is assumed that a certain tool is stored in a certain shelf, but at plan execution time, the tool might have been moved to another location. The first example addresses incompleteness of information due to environmental changes. The second example addresses incorrect information which might be due to environmental changes (e.g., an agent which does not correspond to the agent executing the plan moved the tool) or to non-deterministic actions (e.g., depending on where the planning agent moves after using the tool, it might place the tool back in the shelf or not).
A classical approach to deal with incomplete information is conditional planning. A conditional plan is a disjunction of sub-plans for different contexts (such as good or bad weather conditions). We will see in chapter 8 that introducing conditions into a plan is a necessary step for combining planning and program synthesis. A classical approach to deal with incorrect information is execution monitoring and re-planning. Plan execution is monitored and if a violation of the preconditions for the action which should be executed next is detected, re-planning is invoked. An introduction to both techniques is given in (Russell & Norvig, 1995, chap. 13).
Another approach to deal with non-deterministic domains is reactive planning. Here, a set or table of state-action rules is generated. Instead of executing a complete plan, the current state of the environment triggers which action is performed next, depending on what conditions are fulfilled in the current state. This approach is also called policy learning and is extensively researched in the domain of reinforcement learning (Dean, Basye, & Shewchuk, 1993; Sutton & Barto, 1998). A similar approach, combining problem solving and decision tree learning, is proposed by Muller and Wysotzki (1995). In the context of symbolic planning, universal planning was proposed as an approach to policy learning (see sect. 2.4.4).
2.4.4 Universal Planning
Universal planning was originally proposed by Schoppers (1987) as an approach to learn state-action rules for non-deterministic domains. A universal plan represents solution paths for all possible states of a planning problem, instead of a solution for one single initial state. State-action rules are extracted from a universal plan covering all possible states of a given planning problem. Generating universal plans instead of plans for a single initial state was also proposed by Wysotzki (1987) in the context of inductive program synthesis (see part II).
A universal plan corresponds to a breadth-first search tree. Search is performed backward, starting with the top-level goals, and for each node at the current level of the plan all (new and consistent) predecessor nodes are generated. The set of predecessor nodes of a set of nodes S is also called the pre-image of S. An abstract algorithm for universal plan construction is given in table 2.4. Universal planning was criticized as impracticable (Ginsberg, 1989), because such search trees can grow exponentially (see discussion of PSPACE-completeness, sect. 2.3.3.1). Currently, universal planning has a renaissance, due to the introduction of OBDDs (ordered binary decision diagrams) as a method for a compact representation of universal plans. OBDD representations were originally developed in the context of hardware design (Bryant, 1986) and later adopted in symbolic model checking for efficient exploration of large state-spaces (Burch, Clarke, McMillan, & Hwang, 1992).
The new, memory-efficient approaches to universal planning are based on the idea of viewing planning as a model checking paradigm instead of planning as a state-space search problem (like Strips planners) or as a theorem proving problem (like deductive planners). Planning as model checking is not only successfully applied to non-deterministic domains, with the planners MBP (Cimatti, Roveri, & Traverso, 1998) and UMOP (Jensen & Veloso, 2000), but also to benchmark problems for planning in deterministic domains (Edelkamp, 2000, planner MIPS). Because plan construction is based on breadth-first search, universal plans for deterministic domains represent optimal solutions.
Table 2.4. Planning as Model Checking Algorithm (Giunchiglia, 1999, fig. 4)

function Plan(P) where P = (D, I, G) is a planning problem with
  I = {s0} as initial state,
  G as goals, and
  D = ⟨F, S, A, R⟩ as planning domain with F as set of fluents, S ⊆ 2^F as finite set of states, A as finite set of actions, and R : S × A → S as transition function. An action a ∈ A is executable in s ∈ S if R(s, a) ≠ ∅.

  CurrentStates := ∅; NextStates := G; Plan := ∅;
  while (NextStates ≠ CurrentStates) do                          (*)
    if I ⊆ NextStates then return Plan;                          (**)
    OneStepPlan := OneStepPlan(NextStates, D);
      (calculate pre-image of NextStates)
    Plan := Plan ∪ PruneStates(OneStepPlan, NextStates);
      (eliminate states which have already been visited)
    CurrentStates := NextStates;
    NextStates := NextStates ∪ ProjectActions(OneStepPlan);
      (ProjectActions, given a set of state-action pairs, returns the corresponding set of states)
  return Fail.
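For explicitly enumerated states, this abstract algorithm can be rendered in a few lines of code. This is an illustrative sketch only: the pre-image is computed by inverting a transition function R, state-action pairs play the role of OneStepPlan, and a single initial state replaces the test I ⊆ NextStates:

```python
def universal_plan(states, actions, R, goal_states, init):
    """Backward breadth-first construction of a universal plan.
    R(s, a) returns the successor state, or None if `a` is not
    executable in s. Returns a set of (state, action) pairs, or None
    on failure (the fixpoint is reached without covering `init`)."""
    current, nxt, plan = set(), set(goal_states), set()
    while nxt != current:                                      # (*)
        if init in nxt:                                        # (**)
            return plan
        one_step = {(s, a) for s in states for a in actions
                    if R(s, a) is not None and R(s, a) in nxt}  # pre-image
        plan |= {(s, a) for (s, a) in one_step if s not in nxt}  # prune visited
        current = nxt
        nxt = nxt | {s for (s, a) in one_step}
    return None
```

On a toy three-state chain 0 -> 1 -> 2 with goal {2} and initial state 0, two backward sweeps suffice and the plan pairs each non-goal state with its action.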
The general idea of planning as model checking is that a planning domain is described as a semantic model of a domain. A semantic model can be given, for example, as a finite state machine (Cimatti et al., 1998) or as a Kripke structure (Giunchiglia & Traverso, 1999). A domain model D represents the states and actions of the domain and the state transitions caused by the execution of actions. States are represented as conjunctions of atoms. State transitions (state, action, state') can be represented as formulas state ∧ action → state'. As usual, a planning problem is the problem of finding plans of actions given a planning domain, initial and goal states. Plan generation is done by exploring the state space of the semantic model. At each step, plans are generated by checking the truth of some formulas in the model. Plans are also represented as formulas and planning is modeled as search through sets of states (instead of single states) by evaluating the assignments verifying the corresponding formulas. Giunchiglia and Traverso (1999) give an overview of planning as model checking.
As mentioned above, plans (as semantic models) can be represented compactly as OBDDs. An OBDD is a canonical representation of a boolean function as a DAG (directed acyclic graph). An example of an OBDD representation is given in figure 2.12. Solid lines represent that the preceding variable is true, broken lines represent that it is false. Note that the size of an OBDD is highly dependent on the ordering of variables (Bryant, 1986).
Fig. 2.12. Representation of the Boolean Formula f(x1, x2) = x1 ∧ x2 as OBDD
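The OBDD of figure 2.12 can be written down directly as a small DAG. This is an illustrative encoding only; a real OBDD package would additionally enforce the variable ordering and perform node sharing and reduction:

```python
# Terminal nodes, and decision nodes as (variable, low_child, high_child).
# Following figure 2.12: the high branch means the variable is true
# (solid edge), the low branch means it is false (broken edge).
ZERO, ONE = "0", "1"
x2 = ("x2", ZERO, ONE)   # test x2: false -> 0, true -> 1
x1 = ("x1", ZERO, x2)    # test x1: false -> 0, true -> test x2
# This DAG represents f(x1, x2) = x1 AND x2.

def evaluate(node, assignment):
    """Walk the diagram from the root under a variable assignment."""
    while node not in (ZERO, ONE):
        var, low, high = node
        node = high if assignment[var] else low
    return node == ONE
```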
Our planning system DPlan is a universal planner for deterministic domains (see chapt. 3). In contrast to the algorithm given in table 2.4, planning problems in DPlan only give planning goals, but not an initial state. DPlan terminates if all states which are reachable from the top-level goals (by calculating the pre-images) are enumerated. That is, DPlan terminates successfully if condition (*) given in table 2.4 is no longer fulfilled, and DPlan does not include condition (**). Furthermore, in DPlan the universal plan is represented as a DAG over states and not as an OBDD. Therefore, DPlan is memory-inefficient. But, as we will discuss in later chapters, DPlan is typically applied only to planning problems of small complexity (involving not more than three or four objects). The universal plan is used as a starting point for inducing a domain-specific control program for generating (optimal) action sequences for problems of arbitrary complexity (see sect. 2.5.2).
2.4.5 Planning and Related Fields
2.4.5.1 Planning and Problem Solving. The distinction between planning and problem solving is not very clear-cut. In the following, we give some discriminations found in the literature (e.g., Russell & Norvig, 1995): Often, forward search based on complete state representations is classified as problem solving, while backward search based on incomplete state representations is classified as planning. While planning is based on a logical representation of states, problem solving can rely on different, special purpose representations, for example feature vectors. While planning is mostly associated with a domain-independent approach, problem solving typically relies on pre-defined domain-specific knowledge. Examples are heuristic functions as used to guide A* search (Nilsson, 1971, 1980) or the difference table used in GPS (Newell & Simon, 1961; Nilsson, 1980).
Cognitive psychology is typically concerned with human problem solving and not with human planning. Computer models of human problem solving are mostly realized as production systems (Mayer, 1983; Anderson, 1995). A production system consists of an interpreter and a set of production rules. The interpreter realizes the match-select-apply cycle. A production rule is an IF-THEN rule. A rule fires if the condition specified in its if-part is satisfied in the current state in working memory. Selection might be influenced by the number of times a rule was already applied successfully, coded as a strength value. Application of a production rule results in a state change. Production rules are similar to operators. But while an operator is specified by preconditions and effects represented as sets of literals, production rules can encode conditions and effects differently. For example, a condition might represent a sequence of symbols which must occur in the current state, and the then-part of the rule might give a sequence of symbols which are appended to the current state representation. In cognitive science, production systems are usually goal-directed: the if-part of a rule represents a currently open goal and the then-part either introduces new sub-goals or specifies an action. Goal-driven production systems are similar to hierarchical planners.
The earliest and most influential approach to problem solving in AI and cognitive psychology is the General Problem Solver (GPS) of Newell and Simon (1961). GPS is very similar to Strips planning. The main difference is that selection of a rule (operator) is guided by a difference table, which has to be provided explicitly for each application domain. For each rule, the table records which difference between a current state and a goal state it can reduce. For example, a rule for put(A, B) fulfills the goal on(A, B). The process of identifying differences and selecting the appropriate rule is called means-ends analysis. The system starts with the top-level goals and a current state. It selects one of the top-level goals; if this goal is already satisfied in the current state, the system proceeds with the next goal. Otherwise, a rule is selected from the difference table. The top-level goal is removed from the list (stack) of goals and replaced by the preconditions of the rule, and so on. Like Strips, GPS is based on a linear strategy and is therefore incomplete. While incompleteness is not a desirable property for an AI program, it might be appropriate to characterize human problem solving. A human problem solver usually wants to generate a solution within reasonable time bounds and can furthermore only hold a restricted amount of information in his or her short-term memory. Therefore, using an incomplete but simple strategy is rational because it still can work for a large class of problems occurring in everyday life (Simon, 1958).
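The means-ends loop can be sketched as follows; the blocks-world facts, the operator encoding, and the goal-indexed difference table are illustrative assumptions, not the original GPS data structures:

```python
def means_ends(state, goals, table, plan, depth=10):
    """Reduce each unsatisfied goal via the difference table (mutates state)."""
    for goal in goals:
        if goal in state:
            continue                          # goal already satisfied
        if depth == 0:
            raise RuntimeError("depth bound exceeded")
        op = table[goal]                      # rule reducing this difference
        # first satisfy the rule's preconditions (new sub-goals) ...
        means_ends(state, op["pre"], table, plan, depth - 1)
        # ... then apply the rule itself
        state -= op["del"]
        state |= op["add"]
        plan.append(op["name"])
    return plan

# hypothetical difference table: goal atom -> rule that reduces it
table = {
    "on-A-B":  {"name": "put-A-B", "pre": {"clear-A", "clear-B"},
                "add": {"on-A-B"}, "del": {"clear-B"}},
    "clear-A": {"name": "puttable-C", "pre": {"clear-C"},
                "add": {"clear-A", "ontable-C"}, "del": {"on-C-A"}},
}
state = {"on-C-A", "clear-C", "clear-B", "ontable-A", "ontable-B"}
plan = means_ends(state, ["on-A-B"], table, [])
```

For the goal on(A, B) with C sitting on A, the loop first clears A (sub-goal) and then stacks A on B, mirroring the goal-replacement strategy described above.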
50 2. State-Based Planning

2.4.5.2 Planning and Scheduling. Planning deals with finding activities which must be performed to satisfy a given goal. Scheduling deals with finding an (optimal) allocation of activities (jobs) to time segments given limited resources (Zweben & Fox, 1994). For a long time, planning and scheduling research were completely separated; only in recent years have the interests of both communities converged. While scheduling systems are successfully applied in many areas (such as factories and transportation companies), planning is mostly done "by hand". Currently, researchers and companies are becoming more interested in automating the planning part. This goal seems far more realistic now than five years ago, due to the emergence of new, efficient planning algorithms. On the other hand, researchers in planning aim at applying their approaches to realistic domains such as the logistics domain and the elevator domain used as benchmark problems in the AIPS-00 planning competition (Bacchus et al., 2000).[13] Planning in realistic domains often involves dealing with resource constraints. One approach, for example proposed by Do and Kambhampati (2000), is to model planning and scheduling as two succeeding phases, both solved with a constraint satisfaction approach. Other work aims at integrating resource constraints in plan construction (Koehler, 1998; Geffner, 2000). In chapter 4 we present an approach to integrating function application in planning and example applications to planning with resource constraints.
2.4.5.3 Proof Planning. In the domain of automatic theorem proving, planning is applied to guide proof construction. Proof planning was originally proposed by Bundy (1988). Planning operators represent proof methods with preconditions, postconditions, and tactics. A tactic represents a number of inference steps. It can be applied if the preconditions hold, and the postconditions must be guaranteed to hold after the inference steps are executed. A tactic guides the search for a proof by prescribing that certain inference steps should be executed in a certain sequence given some current step in a proof.

An example of a high-level proof method is "proof by mathematical induction". Bundy (1988) introduces a heuristic called "rippling" to describe the heuristics underlying the Boyer-Moore theorem prover and represents this heuristic as a method. Such a method specifies tactics on different levels of detail, similar to abstract and primitive operators in hierarchical planning. Thus, a "super-method" for guiding the construction of a complete proof invokes several sub-methods which satisfy the post-condition of the super-method after their execution.

Proof planning, like hierarchical planning, requires insight into the structure of the planning domain. Identifying and formalizing such knowledge can be very time-consuming, error-prone, or even impossible, as discussed in section 2.4.3.1. Again, learning might be a good alternative to explicitly specifying such domain-dependent knowledge: First, proofs are generated using only primitive tactics. Such proofs can then be input to a machine learning algorithm and generalized to more general methods (Jamnik, Kerber, & Benzmuller, 2000). In principle, all strategies available for learning control rules or control programs for planning, as discussed in section 2.5.2, can be applied to proof planning. Of course, such a learned method might be incomplete or non-optimal because it is induced from some specific examples. But the same is true for methods which are generated "by hand". In both cases, such strategic decisions should not be executed automatically. Instead, they can be offered to a system user and accepted or rejected by user interaction.

[13] Because of the converging interests in planning and scheduling research, the conference Artificial Intelligence Planning Systems was renamed into Artificial Intelligence Planning and Scheduling.
2.4.6 Planning Literature

Classical Strips planning is described in all introductory AI textbooks, such as Nilsson (1980) and Winston (1992). The newest textbook, by Russell and Norvig (1995), focuses on partial-order planning, which is also described in Winston (1992). Short introductions to situation calculus can also be found in all textbooks. A collection of influential papers in planning research from the beginning until the end of the eighties is presented by Allen, Hendler, and Tate (Eds., 1990).

The area of planning research has undergone significant changes since the mid-nineties. The newer Graphplan-based and compilation approaches, as well as methods of domain-knowledge learning (see sect. 2.5), are not described in textbooks. Good sources for an overview of current research are the proceedings of the AIPS conference (Artificial Intelligence Planning and Scheduling; http://www-aig.jpl.nasa.gov/public/aips00/) and the Journal of Artificial Intelligence Research (JAIR; http://www.cs.washington.edu/research/jair/home.html). Overview papers on special areas of planning can be found in the AI Magazine, for example, a recent overview of current planning approaches by Weld (1999). Furthermore, research in planning is presented in all major AI conferences (IJCAI, AAAI) and AI journals (e.g., Artificial Intelligence). A collection of current planning systems, together with the possibility to execute planners on a variety of problems, can be found at http://rukbat.fokus.gmd.de:8080/.[14]
[14] This website was realized by Jürgen Müller as part of his diploma thesis at TU Berlin, supervised by Ute Schmid and Fritz Wysotzki.

2.5 Automatic Knowledge Acquisition for Planning

2.5.1 Pre-Planning Analysis

Pre-planning analysis was proposed, for example, by Fox and Long (1998) to eliminate meaningless instantiations when constructing a propositional representation of a planning problem, such as a planning graph. The basic idea is to automatically infer types from domain specifications. These types are then used to extract state invariants. In addition to eliminating meaningless instantiations, state invariants can be used to detect and eliminate unsound plans. The domain analysis of the system TIM, proposed by Fox and Long (1998), extracts information by analyzing the literals occurring in the preconditions, ADD- and DEL-lists of operators (transforming them into a set of finite state machines).

For example, for the rocket domain (shortly described in sect. 2.3.4.2), it can be inferred that each package is always either in the rocket or at a fixed place, but never at two locations simultaneously.

Extracting knowledge from domain specifications makes planning more efficient because it reduces the search space by eliminating impossible states. Another possibility to make plan construction more efficient is to guide search by introducing domain-specific control knowledge.
2.5.2 Planning and Learning

Enriching domains with control knowledge which guides a planner to reduce search for a solution makes planning more efficient and therefore makes a larger class of problems solvable under given time and memory restrictions. Because it is not always easy to provide such knowledge, and because domain modeling is an "art" which is time-consuming and error-prone, one area of planning research deals with the development of approaches to learn such knowledge automatically from some sample experience with a domain.

2.5.2.1 Linear Macro-Operators. In the eighties, approaches to learning (linear) macro-operators were investigated (Minton, 1985; Korf, 1985): After a plan is constructed for a problem of a given domain, operators which appear directly after each other in the plan can be composed into a single macro operator by merging their preconditions and effects. This process can be applied to pairs or longer sequences of primitive operators. If the planner is confronted with a new problem of the given domain, plan construction might involve a smaller number of match-select-apply cycles because macro operators can be applied which generate larger segments of the searched-for plan in one step. This approach to learning in planning, however, did not succeed, mainly because of the so-called utility problem. If macros are extracted indiscriminately from a plan, the system might become "swamped" with macro-operators. Possible efficiency gains from reducing the number of match-select-apply cycles are counterbalanced or even overridden by the number of operators which must be matched.
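The merging step can be sketched roughly as follows; the set-based ground-operator encoding is an assumption, and the sketch ignores variable matching, which a real macro learner must handle:

```python
def compose(o1, o2):
    """Macro operator behaving like o1 followed by o2 (both ground)."""
    # composition is only sound if o1 does not delete a fact o2 needs
    assert not (o2["pre"] - o1["add"]) & o1["del"]
    return {
        # o2-preconditions not established by o1 must hold up front
        "pre": o1["pre"] | (o2["pre"] - o1["add"]),
        # effects of o1 survive unless o2 undoes them, and vice versa
        "add": (o1["add"] - o2["del"]) | o2["add"],
        "del": (o1["del"] - o2["add"]) | o2["del"],
    }

# two ground puttable operators from a three-block tower A on B on C
puttable_A = {"pre": {"clear-A", "on-A-B"},
              "add": {"ontable-A", "clear-B"}, "del": {"on-A-B"}}
puttable_B = {"pre": {"clear-B", "on-B-C"},
              "add": {"ontable-B", "clear-C"}, "del": {"on-B-C"}}
macro = compose(puttable_A, puttable_B)
# the macro unstacks the whole tower in one match-select-apply cycle
```

The resulting macro needs only the facts of the initial tower as precondition, which is exactly why undisciplined extraction floods the rule base: every operator pair in every plan yields such a candidate.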
A similar approach to linear macro learning is investigated in the context of cognitive models of human problem solving (see sect. 2.4.5.1) and learning. The ACT system (Anderson, 1983) and its descendants (Anderson & Lebiere, 1998) are realized as production systems with a declarative component representing factual knowledge and a procedural component representing skills. The procedural knowledge is represented by production rules (if-condition-then-action pairs). Skill acquisition is modelled as "compilation", which includes concatenating primitive rules. This mechanism is used to describe speed-up learning from problem solving experience. Another production system approach to human cognitive skills is the SOAR system (Newell, 1990), a descendant of GPS. Here, "chunking" of rules is invoked if an impasse during generating a problem solution has occurred and was successfully resolved.
2.5.2.2 Learning Control Rules. From the late eighties to the mid-nineties, control rule learning was mainly investigated in the context of the Prodigy system (Veloso et al., 1995). A variety of approaches, mainly based on explanation-based learning (EBL) or generalization (Mitchell, Keller, & Kedar-Cabelli, 1986), were investigated to learn control rules to improve the efficiency of plan construction and the quality (i.e., optimality) of plans. An overview of all investigated methods is given in Veloso et al. (1995). A control rule is represented as a production rule. For a current state reached during plan construction, such a control rule provides the planner with information about which choice (next action to select, next sub-goal to focus on) it should make. The EBL approach proposed by Minton (1988) extracts such control rules from an analysis of search trees by explaining why certain branching decisions were made during search for a solution.

Another approach to control rule learning, based on learning decision trees (Quinlan, 1986) or decision lists (Rivest, 1987), is closely related to policy learning in reinforcement learning (Sutton & Barto, 1998). For example, Briesemeister, Scheffer, and Wysotzki (1996) combined problem solving with decision tree learning. For a given problem solution, each state is represented as a feature vector and associated with the action which was executed in this state. A decision tree is induced, representing relevant features of the state description together with the action which has to be performed given specific values of these features. Each path in the decision tree leads to an action (or a "don't know what to do") in its leaf and can be seen as a condition-action rule. After a decision tree is learned, new problem solving episodes can be guided by the information contained in the decision tree. Learning can be performed incrementally, leading either to instantiations of up-to-now unknown condition-action pairs or to a restructuring of the tree. Similar approaches are proposed by Martin and Geffner (2000) and Huang, Selman, and Kautz (2000). Martin and Geffner (2000) learn policies represented in a concept language (i.e., as logical formulas) with a decision list approach. They can show that a larger percentage of complex problems (such as blocks-world problems involving 20 blocks) can be successfully solved using the learned policies, and that the generated plans are, while not optimal, reasonably short.
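A learned policy of this kind can be sketched as an ordered decision list; the features and rules below are invented for illustration, not taken from the cited systems:

```python
def policy(decision_list, features):
    """Return the action of the first rule whose condition matches."""
    for condition, action in decision_list:
        if all(features.get(k) == v for k, v in condition.items()):
            return action
    return "dont-know"                 # the "don't know what to do" leaf

# hypothetical condition-action rules induced from solved problems,
# each pairing state features with the action executed in such states
rules = [
    ({"holding": True}, "putdown"),
    ({"holding": False, "top-misplaced": True}, "unstack"),
]
a1 = policy(rules, {"holding": True, "top-misplaced": False})
a2 = policy(rules, {"holding": False, "top-misplaced": False})
```

Each (condition, action) pair corresponds to one root-to-leaf path of the induced tree; guiding a new problem-solving episode amounts to evaluating the list on the current state's feature vector.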
2.5.2.3 Learning Control Programs. An alternative to learning "molecular" rules is learning control programs. A control program generates a possibly cyclic (recursive, iterative) sequence of actions. We will also speak of (recursive) control rules when referring to control programs. Shell and Carbonell (1989) contrast iterative with linear macros and give a theoretical analysis and some empirical demonstrations of the efficiency gains which can be expected using iterative macros. Iterative macros can be seen as programs because they provide a control structure for repeatedly executing a sequence of actions until the condition for looping no longer holds. The authors point out that a control program represents strategic knowledge for a domain. Efficient human problem solving should not only be explained by speed-up effects due to operator-merging but also by acquiring problem solving strategies, i.e., knowledge on a higher level of abstraction.

Learning control programs from some initial planning experience brings together inductive program synthesis and planning research, which is the main topic of the work presented in this book. We will present our approach to learning control programs by synthesizing functional programs in detail in part II. Other approaches were presented by Shavlik (1990), who applies inductive logic programming to control program learning, and by Koza (1992) in the context of genetic programming. Recently, learning recursive ("open loop") macros is also investigated in reinforcement learning (Kalmar & Szepesvari, 1999; Sun & Sessions, 1999).

While control rules guide search for a plan, control programs eliminate search completely. In our approach, we learn recursive functions for a domain of arbitrary complexity but with a fixed goal. For example, a program for constructing a tower of sorted blocks can be synthesized from a (universal) plan for a three-block problem with goal {on(A, B), on(B, C)}. The synthesized function can then generate correct and optimal action sequences for tower problems with an arbitrary number of blocks. Instead of searching for a plan, which needs probably exponential effort, the learned program is executed. Execution time corresponds to the complexity class of the program, for example, linear time for a linear recursion. Furthermore, the learned control programs provide optimal action sequences.

While learning a control program for achieving a certain kind of top-level goals in a domain results in a highly efficient generation of an optimal action sequence, this might not be true if control knowledge is only learned for a sub-domain, as, for example, clearing a block as a subproblem of building a tower of blocks. In this case, the problem of intelligent indexing and retrieval of control programs has to be dealt with, similar to the reuse of already generated plans in the context of a new planning problem (Veloso, 1994; Nebel & Koehler, 1995). Furthermore, program execution might lead to a state which is not on the optimal path of the global problem solution (Kalmar & Szepesvari, 1999).
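As a minimal sketch of what such a learned control program looks like (the list encoding of a tower is an assumption made for illustration), a recursive function can emit the optimal unstacking sequence for towers of arbitrary height without any search:

```python
def clear_bottom(tower):
    """Optimal action sequence clearing the bottom block of a tower.

    The tower is given top to bottom, e.g. ["A", "B", "C"] for A on B on C.
    Runs in linear time, matching the complexity class of the recursion.
    """
    if len(tower) <= 1:
        return []                      # bottom block is already clear
    return [("puttable", tower[0])] + clear_bottom(tower[1:])

actions = clear_bottom(["A", "B", "C"])
```

Executing the program replaces search over the state space: the same function handles a tower of 3 or of 300 blocks, which is exactly the generalization over the number of domain objects discussed above.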
3. Constructing Complete Sets of Optimal Plans

The planning system DPlan is designed as a tool to support the first step of inductive program synthesis: generating finite programs for transforming input examples into the desired output. Because our work is in the context of program synthesis, planning is for small, deterministic domains, and completeness and optimality are of more concern than efficiency considerations. In the remaining chapters of part I, we present DPlan as a planning system. In part II we will describe how recursive functions can be induced from plans constructed with DPlan and show how such recursive functions can be used as control programs for plan construction. In this chapter, we will introduce DPlan (sect. 3.1), introduce universal plans as sets of optimal plans (sect. 3.2), and give proofs for termination, soundness, and completeness of DPlan (sect. 3.3).[1] An extension of DPlan to function application, allowing planning for infinite domains, is described in the next chapter (chap. 4).
3.1 Introduction to DPlan

DPlan[2] is a state-based, non-linear, total-order backward planner. DPlan is a universal planner (see sect. 2.4.4 in chap. 2): Instead of a plan representing a sequence of actions transforming a single initial state into a state fulfilling the top-level goals, DPlan constructs a plan representing optimal action sequences for all states belonging to the planning problem. A short history of DPlan is given in appendix A.1.

DPlan differs from standard universal planning in the following aspects:
- In contrast to universal planning for non-deterministic domains, we do not extract state-action rules from the universal plan, but use it as starting-point for program synthesis.
- A planning problem specifies a set of top-level goals and restricts the number of objects of the domain, but no initial state is given. We are interested in optimal transformation sequences for all possible states from which the goal is reachable.
- A universal plan represents the optimal solutions for all states which can be transformed into a state fulfilling the top-level goal. That is, plan construction does not terminate when a given initial state is included in the plan, but only when expansion of the current leaf nodes does not result in new states (which are not already contained in the plan).
- A universal plan is represented as a "minimal spanning DAG" (directed acyclic graph) (see sect. 3.2) with nodes as states and arcs as actions.[3]
- Instead of a memory-efficient representation using OBDDs, an explicit state-based representation is used as starting-point for program synthesis. Typically, plans involving only three or four objects are generated, and a control program for dealing with n objects is generalized by program synthesis.
- Backward operator application is not realized by goal regression over partial state descriptions (see sect. 2.3.4.1) but over complete state descriptions. The first step of plan construction expands the top-level goals to a set of goal states.

[1] The chapter is based on the following previous publications: Schmid (1999), Schmid and Wysotzki (2000b, 2000a).
[2] Our algorithm is named DPlan in reference to the Dijkstra algorithm (Cormen et al., 1990), because it is a single-source shortest-paths algorithm with the state(s) fulfilling the top-level goals as source.
3.1.1 DPlan Planning Language

The current system is based on domain and problem specifications in PDDL syntax (see sect. 2.2.1 in chap. 2), allowing Strips operators (see def. 2.1.1) and operators with conditional effects. As usual, states are defined as sets of atoms (def. 2.1.2) and goals as sets of literals (def. 2.1.4). Operators are defined with preconditions, ADD- and DEL-lists (def. 2.1.5). Secondary preconditions can be introduced to model context-dependent effects (see sect. 2.2.1.2 in chap. 2).

Forward operator application for executing a plan by transforming an initial state into a goal state is defined as usual (see def. 2.1.6). Backward operator application is defined for complete state descriptions (see def. 2.1.7). An extension for conditional effects is given in definition 2.2.1. For universal planning, backward operator application must be extended to calculate the set of all predecessor states for a given state:

Definition 3.1.1 (Pre-Image). With Res^-1(o, s) we denote backward application of an instantiated operator o to a state description s (see def. 2.1.7). We call s' = Res^-1(o, s) a predecessor of s. Backward operator application can be extended in the following way: Res_p^-1({o_1, ..., o_n}, s) represents the "parallel" application of the set of all actions which satisfy the application condition for state s, resulting in a set of predecessor states. The pre-image of a set of states S = {s_1, ..., s_l} is defined as U_{i=1}^{l} Res_p^-1({o_1, ..., o_n}, s_i).

[3] As we will see for some examples below, for some planning domains the universal plan is a specialized DAG: a minimal spanning tree or simply a sequence.
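Definition 3.1.1 can be sketched operationally for Strips-style ground operators (precondition/ADD/DEL sets of atoms); the encoding below is an assumption made for illustration:

```python
def res(op, state):
    """Forward application Res(o, s'); None if preconditions fail."""
    if not op["pre"] <= state:
        return None
    return frozenset((state - op["del"]) | op["add"])

def res_inv(op, state):
    """Backward application Res^-1(o, s): candidate predecessor of s."""
    if not op["add"] <= state:
        return None                    # o cannot have produced s
    return frozenset((state - op["add"]) | op["del"] | op["pre"])

def pre_image(ops, states):
    """Pre-image of a set of states: all sound predecessors (def. 3.1.1)."""
    result = set()
    for s in states:
        for op in ops:
            p = res_inv(op, s)
            # soundness check: forward application must lead back to s
            if p is not None and res(op, p) == s:
                result.add(p)
    return result

# ground puttable(B, C) from the clearblock domain of section 3.1.4.1
puttable_B = {"pre": frozenset({"clear-B", "on-B-C"}),
              "add": frozenset({"ontable-B", "clear-C"}),
              "del": frozenset({"on-B-C"})}
goal = frozenset({"clear-A", "clear-B", "clear-C",
                  "ontable-A", "ontable-B", "ontable-C"})
preds = pre_image([puttable_B], {goal})
```

Applied to the all-blocks-on-table goal state, the pre-image contains exactly the state with B still on C, as in the clearblock plan of figure 3.2.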
To make sure that plan construction based on calculating pre-images is sound and complete, it must be shown that the pre-image of a set of states S only contains consistent state descriptions and that the pre-image contains all state descriptions which can be transformed into a state in S (see sect. 3.3).

A DPlan planning problem is defined as:

Definition 3.1.2 (DPlan Planning Problem). A planning problem P(O, G, D) consists of a set of operators O, a set of top-level goals G, and a domain restriction D which can be specified in two ways:
- as a set of state descriptions,
- as a set of goal states.

In standard planning and universal planning, an initial state is given as part of a planning problem. A planning problem is solved if an action sequence transforming the initial state into a state satisfying the top-level goals is found. In DPlan planning, a planning problem is solved if all states given in D are included in the universal plan, or if a set of goal states is expanded to all possible states from which a goal state is reachable.

The initial state has the following additional functions in backward plan construction (see sect. 2.3.4 in chap. 2): Plan construction terminates if the initial state is included in the search tree. All partial state descriptions on the path from the initial state to the top-level goals can be completed by forward operator application. Consistency of state descriptions can be checked for the states included in the solution path, and inconsistencies on other paths have no influence on the soundness of the plan. To make backward planning sound and complete, in general at least one complete state description, denoting a set of consistent relations between all objects of interest, is necessary. For DPlan, this must be either a (set of) complete goal state(s) or the set of all states to be included in the universal plan. If for a DPlan planning problem D is given as a set of goal states, consistency of a predecessor state can be checked by Res^-1(o, s) = s' -> Res(o, s') = s. If D is given as a set of (consistent) state descriptions, plan construction is reduced to stepwise including states from D in the plan.
3.1.2 DPlan Algorithm

The DPlan algorithm is a variant of the universal planning algorithm given in table 2.4. In table 3.1, the DPlan algorithm is described abstractly. Input is a DPlan planning problem P(O, G, D) with a set of operators O, a set of top-level goals G, and a domain restriction D as specified in definition 3.1.2. Output is a universal plan, representing the union of all optimal plans for the restricted domain and a fixed goal, which we will describe in detail in section 3.2. In the following, the functions used for plan construction are described.

Table 3.1. Abstract DPlan Algorithm

function DPlan(P) where P is a DPlan planning problem.
  CurrentStates := {};
  NextStates := GoalStates(P);
  Plan := {};
  while (NextStates != CurrentStates) do
    OneStepPlan := OneStepPlan(NextStates, P);
    Plan := Plan u PruneStates(OneStepPlan, NextStates);
    CurrentStates := NextStates;
    NextStates := NextStates u ProjectActions(OneStepPlan);
  return Plan.

GoalStates(P):
  If D is defined as a set of goal states, GoalStates(P) := D.
  If D is defined as a set of states S, GoalStates(P) := {s_G | G is contained in s_G and s_G in S}.

OneStepPlan(States, P):
  Calculates the pre-image of States as described in definition 3.1.1. Each state s in States is expanded to all states s' with Res(o, s') = s. States s' are calculated by backward operator application Res^-1(o, s) = s', where all states for which Res(o, s') = s does not hold are removed. If D is given as a set of states, additionally all states not occurring in D are removed. OneStepPlan returns all pairs <o, s'>:
  OneStepPlan(States, P) = {<o, s'> | Res(o, s') = s and s in States}.

PruneStates(Pairs, States):
  Eliminates all pairs <o, s'> from Pairs which have already been visited:
  PruneStates(Pairs, States) := {<o, s'> in Pairs with s' not in States}.

ProjectActions(Pairs):
  Returns the set of all states occurring in Pairs:
  ProjectActions(Pairs) := {s' | <o, s'> in Pairs}.

Plan construction corresponds to building a search tree with breadth-first search, with filtering of states which already occur on higher levels of the tree. Since it is possible that a OneStepPlan contains pairs <o, s'>, <o', s'>, i.e., different actions resulting in the same predecessor state, the plan corresponds not to a tree but to a DAG (Christofides, 1975).
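The loop of Table 3.1 can be sketched as follows; the one_step argument stands in for the pre-image computation, and the toy chain domain in the usage is invented for illustration:

```python
def dplan(one_step, goal_states):
    """Breadth-first universal planning.

    one_step(states) yields (action, predecessor) pairs, i.e. the
    backward pre-image of the given states.
    """
    visited = set(goal_states)
    frontier = set(goal_states)
    plan = set()
    while frontier:                    # terminates once no new state appears
        pairs = one_step(frontier)
        # PruneStates: drop states already covered by shorter paths
        new_pairs = {(a, s) for (a, s) in pairs if s not in visited}
        plan |= new_pairs
        frontier = {s for (_, s) in new_pairs}   # ProjectActions
        visited |= frontier
    return plan

# toy backward step: state 0 is the goal, 1 and 2 its predecessors in a chain
def one_step(states):
    preds = {0: [("b", 1)], 1: [("a", 2)], 2: []}
    return {p for s in states for p in preds[s]}

plan = dplan(one_step, {0})
```

Instead of testing NextStates against CurrentStates as in Table 3.1, this sketch stops when the frontier of newly reached states is empty, which is equivalent for finite domains; the returned pairs are the arcs of the universal-plan DAG.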
3.1.3 Efficiency Concerns

Since planning is an NP-complete or even PSPACE-complete problem (see sect. 2.3.3.1 in chap. 2), the worst-case performance of every possible algorithm is exponential. Nevertheless, current Graphplan-based and compilation approaches (see sect. 2.4.2.1 in chap. 2) show good performance for many benchmark problems. These algorithms are based on depth-first search and therefore in general cannot guarantee optimality, i.e., plans with a minimal-length sequence of actions (Koehler et al., 1997). In contrast, universal planning is based on breadth-first search and can guarantee optimality (for deterministic domains). The main disadvantage of universal planning is that plan size grows exponentially for many domains (see sect. 2.3.3.1 in chap. 2). Encoding plans as OBDDs can often result in compact representations. But the size of an OBDD depends on the variable ordering (see sect. 2.4.4 in chap. 2), and calculating an optimal variable ordering is itself an NP-hard problem (Bryant, 1986).

Planning with DPlan is neither time- nor memory-efficient. Plan construction is based on breadth-first search and therefore works without backtracking. Because paths to nodes which are already covered (by shorter paths) in the plan are not expanded[4], the effort of plan construction depends on the number of states. The number of states can grow exponentially, as shown for example for the blocks-world domain in section 2.3.3.1 in chapter 2. Plans are represented as DAGs with state descriptions as nodes, that is, the size of the plan depends on the number of states, too.[5]

DPlan incorporates none of the state-of-the-art techniques for efficient plan construction: we use no pre-planning analysis (see sect. 2.5.1) and we do not encode plans as OBDDs. The reason is that we are mainly interested in an explicit representation of the structure of a domain with a fixed goal and a small number of domain objects. A universal plan constructed with DPlan represents the structural relations between the optimal transformation sequences for all possible states. This information is necessary for inducing a recursive control program generalizing over the number of domain objects, as we will describe in chapter 8.
[4] Similar to dynamic programming in A* (Nilsson, 1980).
[5] For example, calculating a universal plan with the uncompiled Lisp implementation of DPlan needs about a second for blocks-world with three objects, about 150 seconds for four objects, and already more than an hour for five objects.

3.1.4 Example Problems

In the following, we will give some example domains together with a DPlan problem, that is, a fixed goal and a fixed number of objects. For each DPlan problem, we present the universal plans constructed by DPlan. For better readability, we abstract somewhat from the LISP-based PDDL encoding which is input to the DPlan system (see appendix A.3).

3.1.4.1 The Clearblock Problem. First we look at a very simple sub-domain of blocks-world (see fig. 3.1). There is only one operator, puttable, which puts a block x from another block onto the table. We give the restricted domain Dom as a set of three states: a tower of A, B, C; a tower of B, C with block A lying on the table; and a state where all three blocks are lying on the table.

Operator: puttable(x)
  PRE: {clear(x), on(x, y)}
  ADD: {ontable(x), clear(y)}
  DEL: {on(x, y)}
Goal: {clear(C)}
Dom-S: {{on(A, B), on(B, C), clear(A), ontable(C)},
  {on(B, C), clear(A), clear(B), ontable(A), ontable(C)},
  {clear(A), clear(B), clear(C), ontable(A), ontable(B), ontable(C)}}

Fig. 3.1. The Clearblock DPlan Problem

The planning goal is to clear block C. Dom includes one state fulfilling the goal, and this state becomes the root of the plan. Plan construction results in ordering the three states given in Dom with respect to the length of the optimal transformation sequence (see fig. 3.2). For this simple problem, the plan is a simple sequence, i.e., the states are totally ordered with respect to the given goal and the given operator. Note that we present plans with states and actions in Lisp notation because we use the original DPlan output (see appendix A.2).

((clear A) (clear B) (clear C) (ontable A) (ontable B) (ontable C))
   | (PUTTABLE B)
((on B C) (clear A) (clear B) (ontable A) (ontable C))
   | (PUTTABLE A)
((on A B) (on B C) (clear A) (ontable C))

Fig. 3.2. DPlan Plan for Clearblock (the root, fulfilling the goal, is shown at the top; each arc is labeled with the action transforming the state below into the state above)
Alternatively, the set of all goal states satisfying clear(C) can be presented. Then, the resulting plan (see fig. 3.3) is a forest. When planning for multiple goal states, we introduce the top-level goals as root.

((clear C))
  ((on A B) (clear A) (clear C))
  ((on B A) (clear B) (clear C))
  ((on C A) (on A B) (clear C))
  ((on C B) (on B A) (clear C))
  ((on C B) (clear A) (clear C))
    <-(PUTTABLE A)- ((on A C) (on C B) (clear A))
  ((on C A) (clear B) (clear C))
    <-(PUTTABLE B)- ((on B C) (on C A) (clear B))
  ((clear A) (clear B) (clear C))
    <-(PUTTABLE A)- ((on A C) (clear A) (clear B))
      <-(PUTTABLE B)- ((on B A) (on A C) (clear B))
    <-(PUTTABLE B)- ((on B C) (clear A) (clear B))
      <-(PUTTABLE A)- ((on A B) (on B C) (clear A))

(ontable omitted for better readability; the seven goal states are children of the top-level goal ((clear C)), and each (PUTTABLE x) arc leads from a state to the state obtained by putting block x on the table)
Fig. 3.3. Clearblock with a Set of Goal States
3.1.4.2 The Rocket Problem. An example of a simple transportation (logistics) domain is the rocket domain proposed by Veloso and Carbonell (1993) (see fig. 3.4). The planning goal is to transport three objects O1, O2, and O3 from a place A to a destination B. The transport vehicle (Rocket) can only be moved in one direction (A to B), for example from the earth to the moon. Therefore, it is important to load all objects before the rocket moves to its destination. The rocket domain was introduced as a demonstration of the incompleteness of linear planning (see sect. 2.3.4.2 in chap. 2).

Operators:
load(?o, ?l)
  PRE: {at(?o, ?l), at(Rocket, ?l)}
  ADD: {inside(?o, Rocket)}
  DEL: {at(?o, ?l)}
move-rocket()
  PRE: {at(Rocket, A)}
  ADD: {at(Rocket, B)}
  DEL: {at(Rocket, A)}
unload(?o, ?l)
  PRE: {inside(?o, Rocket), at(Rocket, ?l)}
  ADD: {at(?o, ?l)}
  DEL: {inside(?o, Rocket)}
Goal: {at(O1, B), at(O2, B), at(O3, B)}
Dom-G: {{at(O1, B), at(O2, B), at(O3, B), at(Rocket, B)}}

Fig. 3.4. The DPlan Rocket Problem

The resulting universal plan is given in figure 3.5. For rocket, the plan corresponds to a DAG: loading and unloading of the three objects can be performed in an arbitrary sequence, but finally, each sequence results in all objects being inside the rocket or at the destination.
3.1.4.3 The Sorting Problem. A specification of a sorting problem is given in figure 3.6. The relational symbol isc(p, k) represents that an element k is at position p of a list (or, more exactly, an array). The relational symbol gt(x, y) represents that an element x has a greater value than an element y, that is, the ordering on natural numbers is represented extensionally. The relational symbol gt is static, i.e., its truth value is never changed by operator application. Because swapping of two elements is restricted such that two elements are only swapped if the first is greater than the second, the resulting plan corresponds to a set of traces of a selection sort program. We will discuss synthesis of selection sort in detail in chapter 8.

The universal plan for sorting is given in figure 3.7. For better readability, the lists themselves instead of their logical descriptions are given. In chapter 4 we will describe how such lists can be manipulated directly by extending planning to function application.
62 3. Constructing Complete Sets of Optimal Plans
((AT O1 B) (AT O2 B) (AT O3 B) (AT R B))
((IN O1 R) (AT O2 B) (AT O3 B) (AT R B)) ((IN O2 R) (AT O1 B) (AT O3 B) (AT R B)) ((IN O3 R) (AT O1 B) (AT O2 B) (AT R B))
((IN O2 R) (IN O1 R) (AT O3 B) (AT R B))
((IN O3 R) (IN O1 R) (AT O2 B) (AT R B))
((IN O3 R) (IN O2 R) (AT O1 B) (AT R B))
((IN O3 R) (IN O1 R) (IN O2 R) (AT R B))
((AT R A) (IN O1 R) (IN O2 R) (IN O3) R)
((AT O1 A) (AT R A) (IN O2 R) (IN O3 R)) ((AT O2 A) (AT R A) (IN O1 R) (IN O3 R)) ((AT O3 A) (AT R A) (IN O1 R) (IN O2 R))
((AT O2 A) (AT O1 A) (AT R A) (IN O3 R))
((AT O3 A) (AT O1 A) (AT R A) (IN O2 R))
((AT O3 A) (AT O2 A) (AT R A) (IN O1 R))
((AT O3 A) (AT O1 A) (AT O2 A) (AT R A))
(UNLOAD O1) (UNLOAD O2) (UNLOAD O3)
(UNLOAD O2)(UNLOAD O3)(UNLOAD O1) (UNLOAD O3)
(UNLOAD O1)(UNLOAD O2)
(UNLOAD O3)(UNLOAD O2)
(UNLOAD O1)
(LOAD O1) (LOAD O2) (LOAD O3)
(LOAD O2)(LOAD O3)(LOAD O1) (LOAD O3)
(LOAD O1)(LOAD O2)
(LOAD O3)(LOAD O2)
(LOAD O1)
(MOVE-ROCKET)
(in = inside, R = ro ket)
Fig. 3.5. Universal Plan for Ro ket
Operator: swap(?p, ?q)
PRE: {isc(?p, ?n1), isc(?q, ?n2), gt(?n1, ?n2)}
ADD: {isc(?p, ?n2), isc(?q, ?n1)}
DEL: {isc(?p, ?n1), isc(?q, ?n2)}
Goal: {isc(p1, 1), isc(p2, 2), isc(p3, 3), isc(p4, 4)}
Dom-G: {isc(p1, 1), isc(p2, 2), isc(p3, 3), isc(p4, 4),
gt(4, 3), gt(4, 2), gt(4, 1), gt(3, 2), gt(3, 1), gt(2, 1)}
Fig. 3.6. The DPlan Sorting Problem
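The state space induced by this specification can be reproduced with a small backward breadth-first sketch (our own illustration): swap(?p, ?q) is forward-applicable whenever the element at position p is greater than the element at position q, and the universal plan covers every permutation of the four elements.

```python
from itertools import permutations

def successors(state):
    """Forward swap(?p, ?q): applicable iff the value at p is greater
    than the value at q (the static gt relation)."""
    for p in range(4):
        for q in range(4):
            if state[p] > state[q]:
                s = list(state)
                s[p], s[q] = s[q], s[p]
                yield tuple(s)

goal = (1, 2, 3, 4)
plan, frontier, level = {goal: 0}, {goal}, 0
while frontier:  # levelwise backward construction
    level += 1
    frontier = {s for s in permutations(range(1, 5))
                if s not in plan and any(t in frontier for t in successors(s))}
    for s in frontier:
        plan[s] = level

print(len(plan), max(plan.values()))  # 24 3
```

All 24 permutations are contained in the plan, and no state needs more than three swaps, which matches the general bound of n minus the number of cycles of the permutation.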
[Figure omitted: the universal plan DAG for sorting, with the 24 permutations as nodes and swap actions as edges; states are represented abbreviated as lists, e.g. [1234] for ((isc p1 1) (isc p2 2) (isc p3 3) (isc p4 4))]
Fig. 3.7. Universal Plan for Sorting
3.1.4.4 The Hanoi Problem. A specification of a Tower of Hanoi problem is given in figure 3.8. The Tower of Hanoi problem (e.g., Winston & Horn, 1989, chap. 5) is a well-researched puzzle. Given are n discs (in our case three) with monotonically increasing size and three pegs. The goal is to have a tower of discs on a fixed peg (in our case p3) where the discs are ordered by their size, with the largest disc as base and the smallest as top element. Moving a disc is restricted in the following way: only a disc which has no other disc on top of it can be moved, and a disc can only be moved onto a larger disc or an empty peg. Coming up with efficient algorithms for (restricted variants of) the Tower of Hanoi problem is still ongoing research (Atkinson, 1981; Pettorossi, 1984; Walsh, 1983; Allouche, 1994; Hinz, 1996). We will come back to this problem in chapter 8.
The plan for hanoi is given in figure 3.9. Again, states are represented in abbreviated form.
3.1.4.5 The Tower Problem. The blocks-world domain with operators put and puttable was introduced in chapter 2 (see figs. 2.2, 2.3, 2.4, 2.5). We use a variant of the operator definitions given in figure 2.2 where both variants of put are specified as a single operator with conditioned effect. A plan for the goal {on(A, B), on(B, C)} is given in figure 3.10. We omitted the ontable relations for better readability.
Operator: move(?d, ?from, ?to)
PRE: {on(?d, ?from), clear(?d), clear(?to), smaller(?to, ?d)}
ADD: {on(?d, ?to), clear(?from)}
DEL: {on(?d, ?from), clear(?to)}
Goal: {on(d3, p3), on(d2, d3), on(d1, d2)}
Dom-G: {on(d3, p3), on(d2, d3), on(d1, d2), clear(d1), clear(p1), clear(p2),
smaller(p1, d1), smaller(p1, d2), smaller(p1, d3),
smaller(p2, d1), smaller(p2, d2), smaller(p2, d3),
smaller(p3, d1), smaller(p3, d2), smaller(p3, d3),
smaller(d1, d2), smaller(d1, d3), smaller(d2, d3)}
Fig. 3.8. The DPlan Hanoi Problem
[Figure omitted: the universal plan DAG for hanoi, with states abbreviated as triples of disc lists per peg, e.g. [()()(123)] for the goal state, and move actions as edges]
Fig. 3.9. Universal Plan for Hanoi
[Figure omitted: the universal plan DAG for the tower problem, with PUT and PUTTABLE actions as edges (CT = clear; ontable omitted)]
Fig. 3.10. Universal Plan for Tower
3.2 Optimal Full Universal Plans

A DPlan plan, as constructed with the algorithm reported in table 3.1, is a set of action-state pairs ⟨o, s⟩. For the goal state(s) it holds that o = nil. For every other state contained in a DPlan plan it holds that Res(o, s) = s′ (see sect. 3.1.1); that is, for each state s′ except the goal state(s), there exists at least one state s ≠ s′ which can be directly transformed into s′ by application of an action o. Consequently, a DPlan plan constitutes a directed graph with states as nodes and actions as edges. Because a state which is already contained in the plan is never introduced again on a later level (PruneStates in tab. 3.1), the resulting graph is also acyclic.

Definition 3.2.1 (Universal Plan). A plan constructed with the DPlan algorithm given in table 3.1 constitutes a directed acyclic graph (DAG) Π = (S, R) with S = ProjectActions(Plan) as set of states and R as set of edges. The set of edges is given by the pre-images (see tab. 3.1): R ⊆ S × S = {⟨s, s′⟩ | Res(o, s) = s′}.6 We call Π a universal plan.
A DPlan plan is constructed levelwise, starting with the states fulfilling the top-level goals as root nodes, and the plan contains no cyclic paths. Therefore, all paths leading from a fixed state to the root have equal length, and each state can be assigned a uniquely determined natural number representing the number of action applications for transforming it into a goal state.

Definition 3.2.2 (Universal Plan as Order over States). Each node s ∈ S for which no predecessor Res(o, s) exists has a path-length g(s) = 0 (s is a root node and G ⊆ s). Each node s ∈ S with Res(o, s) = s′ and g(s′) = i has a path-length g(s) = i + 1 (s is a node with G ⊈ s).

6 In contrast to the standard definition of labelled graphs (Christofides, 1975), we do not discriminate between nodes/edges and labels of nodes/edges.
The path length g(s) ∈ N constitutes an order over the states in Π with s < s′ iff g(s) < g(s′).
In contrast to standard universal planning, DPlan constructs a plan which contains all states which can be transformed into a state satisfying the given top-level goals:

Definition 3.2.3 (Full Universal Plan). A universal plan Π = (S, R) is called full with respect to given top-level goals G if, for all states s for which there exists a transformation sequence Res(o, s) = s_i, ..., Res(o_i, s_i) = s_{i-1}, ..., Res(o_1, s_1) = s_0 ⊇ G, it holds that s ∈ S.

For example, the universal plan for rocket (see fig. 3.5), defined for three objects and a single rocket, contains transformation sequences for all legal states which can be defined over C = {O1, O2, O3, Rocket}.
The notion of a full universal plan presupposes that planning with DPlan is complete. Additionally, we require that the universal plan does not contain states which cannot be transformed into a goal state, that is, DPlan must be sound. Completeness and soundness of DPlan are discussed in section 3.3. For proving optimality of DPlan we assume completeness:

Lemma 3.2.1 (Universal Plans are Optimal). For a full universal plan, each path from a state s to a root node contains the minimal number of edges, that is, the minimal number of actions necessary to transform s into a goal state.

Proof (Universal Plans are Optimal). We give an informal proof by induction over the construction of Π = (S, R) with the DPlan algorithm given in table 3.1:
Plan construction starts with the set of goal states S_G = {s | G ⊆ s ∧ s ∈ S}. Since the goal states constitute root nodes in Π, it holds that g(s) = 0 for all s ∈ S_G.
For all nodes s with g(s) = i > 0, i gives the minimal number of actions to transform s into a goal state. Either there exists a state s′ with Res(o, s′) = s, or no such state exists. In the second case, s is a leaf in Π. In the first case, one of the following conditions holds:
1. s′ is already introduced at a higher level with g(s′) ≤ g(s): ⟨o, s′⟩ will not be included in the plan (see PruneStates in tab. 3.1).
2. s′ is not contained at a higher level: ⟨o, s′⟩ will be included in the plan with g(s′) = i + 1, which is the minimal path-length for s′.
The length of minimal paths in a graph constitutes a metric. That is, for all paths it holds that if the transformation sequence from the start to the goal node of the path is optimal, then the transformation sequences for all intermediate nodes on the path are optimal, too.
As a consequence of the optimality of DPlan, each path in Π from some state to a goal state represents an optimal solution for this state. In general, there might be multiple optimal paths from a state to the goal. For example, for rocket (see sect. 3.1.4.2) there are 3! different possible sequences for unloading and loading three objects. For the leaf node in figure 3.5 there are 36 different optimal solution paths contained in the universal plan.
A DPlan plan can be reduced to a minimal spanning tree (Christofides, 1975) by deleting edges such that each node which is not the root node has exactly one predecessor. The minimal spanning tree contains exactly one optimal solution path for each state. A minimal spanning tree for rocket is given in figure 3.11.
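The reduction to a spanning tree amounts to keeping, for every non-root state, exactly one edge that decreases the distance g by one. A minimal sketch (our own toy encoding, with g-values and ⟨state, action, successor⟩ edges):

```python
# a tiny universal plan: g-values and edges <state, action, successor>
levels = {"G": 0, "S1": 1, "S2": 1, "S3": 2}
edges = [("S1", "a1", "G"), ("S2", "a2", "G"),
         ("S3", "a3", "S1"), ("S3", "a4", "S2")]

def spanning_tree(levels, edges):
    """For each non-goal state keep exactly one edge that decreases the
    distance g by one; the result assigns one optimal action per state."""
    tree = {}
    for s, g in levels.items():
        if g == 0:
            continue  # goal states are roots and keep no outgoing edge
        tree[s] = next((act, t) for (src, act, t) in edges
                       if src == s and levels[t] == g - 1)
    return tree

tree = spanning_tree(levels, edges)
print(tree["S3"])  # ('a3', 'S1')
```

S3 has two optimal actions (a3 and a4) in the plan; the spanning tree arbitrarily keeps the first one, so each state retains exactly one optimal solution path.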
[Figure omitted: the minimal spanning tree for rocket, assigning each state exactly one LOAD, UNLOAD, or MOVE-ROCKET action on an optimal path to the goal]
Fig. 3.11. Minimal Spanning Tree for Rocket
3.3 Termination, Soundness, Completeness

3.3.1 Termination of DPlan

For finite domains, termination of the DPlan algorithm given in table 3.1 is guaranteed:

Lemma 3.3.1 (Termination of DPlan). For finite domains, a situation (NextStates = CurrentStates) will be reached.

This fact holds because the number of states grows monotonically during plan construction:

Proof (Termination of DPlan). We denote a plan after the t-th iteration of pre-image calculation with Plan_t. From the definition of the DPlan algorithm it follows that NextStates = ProjectActions(Plan).
If calculating a pre-image only returns state descriptions which are already contained in the plan Plan_t, then (NextStates = CurrentStates) holds and DPlan terminates after t iterations. If calculating a pre-image returns at least one state description which is not already contained in the plan, then these state descriptions are included in the plan and |Plan_{t+1}| > |Plan_t|.
Because planning domains are assumed to be finite, there is only a fixed number of different state descriptions, and a fixed point |Plan_{t+1}| = |Plan_t|, i.e., a situation (NextStates = CurrentStates), will be reached after a finite number of steps t.
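The fixed-point argument can be sketched generically (our own illustration, with res as the forward state-transition function and a toy decrement domain):

```python
def dplan(all_states, actions, res, goal_states):
    """Levelwise pre-image computation; on a finite domain the plan grows
    strictly monotonically, so a fixed point (NextStates = CurrentStates)
    is reached after finitely many iterations."""
    plan = set(goal_states)
    while True:
        pre_image = {s for s in all_states
                     if s not in plan and any(res(o, s) in plan for o in actions)}
        if not pre_image:   # fixed point: nothing new was added
            return plan
        plan |= pre_image   # |Plan_{t+1}| > |Plan_t|

# toy domain: states 0..5, a single action that decrements, goal state 0
reached = dplan(range(6), ["dec"], lambda o, s: s - 1 if s > 0 else None, {0})
print(sorted(reached))  # [0, 1, 2, 3, 4, 5]
```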
3.3.2 Operator Restrictions

Soundness and completeness of plan construction are determined by the soundness and completeness of operator application. State-based backward planning has some inherent restrictions, which we describe in the following. We repeat the definitions of forward and backward operator application given in definitions 2.1.6 and 2.1.7:

Res(o, s) = s \ DEL ∪ ADD if PRE ⊆ s
Res⁻¹(o, s) = s \ ADD ∪ (DEL ∪ PRE) if ADD ⊆ s.

For forward application, \DEL and ∪ADD are only commutative for ADD ∩ DEL = ∅.7 When literals from the DEL-list are deleted before adding literals from the ADD-list, as defined above, we guarantee that everything given in the ADD-list really holds in the new state.
Completeness and soundness of backward planning mean that the following diagram must commute:

    s           ==  Res(o, s′)
    ↓                    ↑
    Res⁻¹(o, s) ==  s′

If Res⁻¹(o, s) = s′ results in s = Res(o, s′) for all s, backward planning is sound. If for all s′ with Res(o, s′) = s it holds that Res⁻¹(o, s) = s′, then backward planning is complete. In the following we prove that these propositions hold with some restrictions.
7 In PDDL, ADD ∩ DEL = ∅ must always hold because the effects are given in a single list with DEL-effects as negated literals. Therefore, an expression (and p (not p)) would represent a contradiction and is not allowed as the specification of an operator effect.

Soundness:
Res(o, Res⁻¹(o, s)) ≟ s

Res(o, Res⁻¹(o, s))
  = Res⁻¹(o, s) \ DEL ∪ ADD
      with PRE ⊆ Res⁻¹(o, s)
  = {s \ ADD ∪ {DEL ∪ PRE}} \ DEL ∪ ADD
      with PRE ⊆ {s \ ADD ∪ {DEL ∪ PRE}} and ADD ⊆ s
  = {s ∪ {DEL ∪ PRE}} \ DEL
      with PRE ⊆ {s \ ADD ∪ {DEL ∪ PRE}}
  = {s ∪ DEL} \ DEL   if PRE ⊆ s ∪ DEL
  = s   if s ∩ DEL = ∅.
The restriction PRE ⊆ s ∪ DEL means, for forward operator application transforming s′ into s, that PRE holds after operator application if it is not explicitly deleted. In backward application, PRE is introduced as a list of sub-goals and constraints which have to hold in s′. Therefore, this restriction is unproblematic. The restriction s ∩ DEL = ∅ means for forward application that s contains no literal from the DEL-list. In backward planning, all literals from the DEL-list are added, and s contained none of these literals. This restriction is in accordance with the definition of legal operators. Thus, soundness of backward application of a Strips operator is given. In general, soundness of backward planning can be guaranteed by introducing a consistency check for each constructed predecessor: for a newly constructed state s′, s = Res(o, s′) must hold. If forward operator application does not result in s, the constructed predecessor is considered not admissible and is not introduced into the plan.
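The consistency check can be sketched as follows (our own encoding of operators as (PRE, ADD, DEL) triples of literal sets; the puttable operator and the state are illustrative):

```python
def res(o, s):
    """Forward application: Res(o, s) = s \\ DEL u ADD if PRE is a subset of s."""
    pre, add, dele = o
    return (s - dele) | add if pre <= s else None

def res_inv(o, s):
    """Backward application: Res-1(o, s) = s \\ ADD u (DEL u PRE) if ADD subset of s."""
    pre, add, dele = o
    return (s - add) | dele | pre if add <= s else None

def sound_predecessor(o, s):
    """Consistency check from the text: accept s' = Res-1(o, s) only if
    forward application of o really maps s' back to s."""
    s_prime = res_inv(o, s)
    if s_prime is not None and res(o, s_prime) == s:
        return s_prime
    return None

# puttable(A): PRE {on(A,B), clear(A)}, ADD {ontable(A), clear(B)}, DEL {on(A,B)}
o = (frozenset({("on", "A", "B"), ("clear", "A")}),
     frozenset({("ontable", "A"), ("clear", "B")}),
     frozenset({("on", "A", "B")}))
s = frozenset({("ontable", "A"), ("clear", "A"), ("clear", "B"), ("ontable", "B")})
print(sound_predecessor(o, s))  # the predecessor {on(A,B), clear(A), ontable(B)}
```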
Completeness:

Res⁻¹(o, Res(o, s′)) ≟ s′

Res⁻¹(o, Res(o, s′))
  = Res(o, s′) \ ADD ∪ {DEL ∪ PRE}
      with ADD ⊆ Res(o, s′)
  = (s′ \ DEL ∪ ADD) \ ADD ∪ {DEL ∪ PRE}
      with ADD ⊆ (s′ \ DEL ∪ ADD) and with PRE ⊆ s′
  = s′ \ DEL ∪ {DEL ∪ PRE}
      with PRE ⊆ s′   if s′ ∩ ADD = ∅
  = s′ ∪ PRE   with PRE ⊆ s′   if DEL ⊆ s′
  = s′.
The restriction s′ ∩ ADD = ∅ means that only such states can be constructed which do not already contain a literal added by an applicable operator. The restriction DEL ⊆ s′ means that only such states can be constructed which contain all literals which are deleted by an applicable operator. While this is a real source of incompleteness, we are still looking for a meaningful domain where these cases occur. In general, incompleteness can be overcome, at a loss of efficiency, by constructing predecessors in the following way:

Res⁻¹(o, s) = s \ ADD′ ∪ {DEL′ ∪ PRE}

for all combinations of subsets ADD′ of ADD and DEL′ of DEL with ADD′ ⊆ s. Thus, all possible states containing subsets of DEL and ADD, with the special case of inserting all literals from DEL and deleting all literals from ADD, would be constructed. Of course, most of these states might not be sound!
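The subset construction can be sketched directly (our own illustration, with abstract literals; note the combinatorial blow-up even for singleton ADD and DEL lists):

```python
from itertools import chain, combinations

def powerset(xs):
    xs = list(xs)
    return (frozenset(c) for c in
            chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1)))

def res_inv_complete(o, s):
    """Complete predecessor construction: s \\ ADD' u (DEL' u PRE) for all
    subsets ADD' of ADD with ADD' subset of s and DEL' of DEL. Most
    candidates are unsound and would have to pass the forward check."""
    pre, add, dele = o
    for add_sub in powerset(add):
        if add_sub <= s:
            for del_sub in powerset(dele):
                yield (s - add_sub) | del_sub | pre

o = (frozenset({"p"}), frozenset({"a"}), frozenset({"d"}))  # PRE, ADD, DEL
s = frozenset({"a", "x"})
candidates = set(res_inv_complete(o, s))
print(len(candidates))  # 4 candidate predecessors
```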
For more general operator definitions, as allowed by PDDL, further restrictions arise. For example, it is necessary that secondary preconditions are mutually exclusive.
Another definition of sound and complete backward operator application is given by Haslum and Geffner (2000): for s ∩ ADD ≠ ∅ and s ∩ DEL = ∅, Res⁻¹(o, s) = s \ ADD ∪ PRE. For DEL ⊆ PRE, this definition is equivalent to ours.
3.3.3 Soundness and Completeness of DPlan

Definition 3.3.1 (Soundness of DPlan). A universal plan Π is sound if each path p = {⟨nil, s_0⟩, ⟨o_1, s_1⟩, ..., ⟨o_n, s_n⟩} defines a solution. That is, for an initial state corresponding to s_n, the sequence of actions o_n ∘ ... ∘ o_1 transforms s_n into a goal state.

Soundness of DPlan follows from the soundness of backward operator application as defined in section 3.3.2.

Definition 3.3.2 (Completeness of DPlan). A universal plan Π is complete if the plan contains all states s for which an action sequence transforming s into a goal state exists.

Completeness of DPlan follows from the completeness of backward operator application as defined in section 3.3.2.

If DPlan works on a domain restriction D given as a set of states, only such states are included in the plan which also occur in D. From soundness and completeness of DPlan it also follows that, for a given set of states D, DPlan terminates with D = ProjectActions(Plan) if the state space underlying D is (strongly) connected.8 If DPlan terminates with ProjectActions(Plan) ⊂ D, then D \ ProjectActions(Plan) only contains states from which no goal state in ProjectActions(Plan) is reachable. One reason for non-reachability can be that the state space is not connected, that is, that it contains states on which no operator is applicable. For example, in the restricted blocks-world domain where only a puttable but no put operator is given (see sect. 3.1.4.1), no operator is applicable in a state where all blocks are lying on the table. As a consequence, a goal on(x, y) is not reachable from such a state. A second reason for non-reachability can be that the state space is only weakly connected (i.e., is a directed graph), such that it contains states s, s′ where Res(o, s) = s′ is defined but no Res(o′, s′) = s exists. For example, in the rocket domain (see sect. 3.1.4.2), it is not possible to reach a state where the rocket is on the earth from a state where the rocket is on the moon. As a consequence, a goal at(Rocket, Earth) is not reachable from any state where the rocket is on the moon.

8 For definitions of connectedness of graphs, see for example (Christofides, 1975).
4. Integrating Function Application in State-Based Planning

Standard planning is defined for logical formulas over variables and constants. Allowing general terms, that is, application of arbitrary symbol-manipulating and numerical functions, enlarges the scope of problems for which plans can be constructed. In the context of using planning as a starting point for the synthesis of functional programs, we are interested in applying planning to standard programming problems, such as list sorting. In the following, we first give a motivation for extending planning to function application (sect. 4.1), then introduce an extension of the Strips language presented in section 2.1 (sect. 4.2) together with some extensions (sect. 4.3), and finally give examples for planning with function application (sect. 4.4).1
4.1 Motivation

The development of efficient planning algorithms in the nineties (see sect. 2.4.2 in chap. 2), such as Graphplan, SAT planning, and heuristic planning (Blum & Furst, 1997; Kautz & Selman, 1996; Bonet & Geffner, 1999), has made it possible to apply planning to more demanding real-world domains. Examples are the logistics or the elevator domain (Bacchus et al., 2000). Many realistic domains involve the manipulation of numerical objects. For example, when planning (optimal) routes for delivering objects as in the logistics domain, it might be necessary to take into account time and other resource constraints such as fuel; plan construction for landing a space vehicle involves calculations for the correct adjustment of thrusts (Pednault, 1987). One obvious way to deal with numerical objects is to assign and update their values by means of function applications.

There is some initial work on extending planning formalisms to deal with resource constraints (Koehler, 1998; Laborie & Ghallab, 1995). Here, function application is restricted to the manipulation of specially marked resource variables in the operator effect. For example, the amount of fuel might be decreased in relation to the distance an airplane flies: $gas -= distance(?x ?y)/3 (Koehler, 1998). A more general approach for including function applications into planning was proposed by Pednault (1987) within the action description language (ADL). In addition to ADD/DEL operator effects, ADL allows variable updates by assigning new values which can be calculated by arbitrary functions. For example, the put(?x, ?y) operator in a blocks-world domain might be extended to update the state variable ?LastBlockMoved to ?LastBlockMoved := ?x; the amount of water in a jug ?j2 might be changed by a pour(?j1, ?j2) action into ?j2 := min[?c2, ?j1 + ?j2], where ?j1, ?j2 are the current quantities of water in the jugs and ?c2 is the capacity of jug ?j2 (Pednault, 1994).

1 The chapter is based on the paper Schmid, Muller, and Wysotzki (2000).
Introducing functions into planning not only makes it possible to deal with numerical values in a more general way than allowed for by a purely relational language, but has several additional advantages (see also Geffner, 2000), which we illustrate with the Tower of Hanoi domain (see fig. 4.1)2:
Allowing not only numerical but also symbol-manipulating functions makes it possible to model operators in a more compact and sometimes also more natural way. For example, representing which disks are lying on which peg in the Tower of Hanoi domain could be realized by a predicate on(?disks, ?peg), where ?disks represents the ordered sequence of disks on peg ?peg, with the list [⊥] representing an empty peg. Instead of modeling a state change by adding and deleting literals from the current state, arguments of the predicates can be changed by applying standard list-manipulation functions (e.g., built-in Lisp functions like car(l) for returning the first element of a list, cdr(l) for returning a list without its first element, or cons(x, l) for inserting a symbol x in front of a list l).
A further advantage is that objects can be referred to in an indirect way. In the hanoi example, car(l) refers to the object which is currently the uppermost disk on a peg. There is no need for any additional fluent predicate besides on(?disks, ?peg). The clear(?disk) predicate given in the standard definition becomes superfluous. Geffner (2000) points out that indirect reference substantially reduces the number of possible ground atoms and, in consequence, the number of possible actions; thus plan construction becomes more efficient. Indirect object reference additionally allows for modeling infinite domains while the state representations remain small and compact. For example, car(l) gives us the top disk of a peg regardless of how many disks are involved in the planning problem.
Finally, introducing functions in modeling planning domains often makes it possible to get rid of static predicates. For example, the smaller(?x, ?y) predicate in the hanoi domain can be eliminated. By representing disks as numbers, the built-in predicate "<" can be used to check whether the application constraint for the hanoi move operator is satisfied in the current

2 Note that we use prefix notation p(x_1, ..., x_n) for relations and functions throughout the paper.
(a) Standard representation
Operator: move(?d, ?from, ?to)
PRE: {on(?d, ?from), clear(?d), clear(?to), smaller(?d, ?to)}
ADD: {on(?d, ?to), clear(?from)}
DEL: {on(?d, ?from), clear(?to)}
Goal: {on(d3, p3), on(d2, d3), on(d1, d2)}
Initial State: {on(d3, p1), on(d2, d3), on(d1, d2), clear(d1), clear(p2), clear(p3),
smaller(d1, p1), smaller(d1, p2), smaller(d1, p3),
smaller(d2, p1), smaller(d2, p2), smaller(d2, p3),
smaller(d3, p1), smaller(d3, p2), smaller(d3, p3),
smaller(d1, d2), smaller(d1, d3), smaller(d2, d3)}
(b) Functional representation
Operator: move(?pi, ?pj)
PRE: {on(?li, ?pi), on(?lj, ?pj), (car(?li) ≠ ⊥), car(?li) < car(?lj)}
UPDATE: change ?lj in on(?lj, ?pj) to cons(car(?li), ?lj)
change ?li in on(?li, ?pi) to cdr(?li)
Goal: {on([⊥], p1), on([⊥], p2), on([1 2 3 ⊥], p3)} ; ⊥ represents a dummy
Initial State: {on([1 2 3 ⊥], p1), on([⊥], p2), on([⊥], p3)} ; bottom disk
Fig. 4.1. Tower of Hanoi (a) Without and (b) With Function Application
state. Allowing arbitrary boolean operators to express preconditions generalizes matching of literals to constraint satisfaction.
The need to extend standard relational specification languages to more expressive languages, resulting in planning algorithms applicable to a wider range of domains, is recognized in the planning community. The PDDL planning language for instance (McDermott, 1998b), the current standard for planners, incorporates all features by which ADL extends classical Strips. But while there is a clear operational semantics for conditional effects and quantification (Koehler et al., 1997), function application is dealt with in a largely ad-hoc manner (see also the remark in Geffner, 2000).
The only published approach allowing functions in domain and problem specifications is Geffner (2000). He proposed Functional Strips, extending standard Strips to support first-order function symbols. He gives a denotational state-based semantics for actions and an operational semantics describing action effects as updates of state representations. In contrast to the classical relational languages, states are not represented as sets of (ground) literals but as state variables (cf. situation calculus). For example, the initial position of disk d1 is represented as loc(d1) = d0 (see fig. 4.2).
While in Functional Strips relations are completely replaced by functional expressions, our approach allows a combination of relational and functional expressions. We extended the Strips planning language to a subset of first-order logic, where relational symbols are defined over terms, and terms can be variables, constant symbols, or functions with arbitrary terms as arguments. Our state representations are still sets of literals, but we introduce a
Domains: Peg: p1, p2, p3 ; the pegs
Disk: d1, d2, d3, d4 ; the disks
Disk*: Disk, d0 ; the disks and a dummy bottom disk d0
Fluents: top: Peg → Disk* ; denotes the top disk on a peg
loc: Disk → Disk* ; denotes the disk below a given disk
size: Disk* → Integer ; represents disk size
Action: move(?pi, ?pj: Peg) ; moves between pegs
Prec: top(?pi) ≠ d0, size(top(?pi)) < size(top(?pj))
Post: top(?pi) := loc(top(?pi)); loc(top(?pi)) := top(?pj); top(?pj) := top(?pi)
Init: loc(d1) = d0, loc(d2) = d1, loc(d3) = d2, loc(d4) = d3,
top(p1) = d4, top(p2) = d0, top(p3) = d0,
size(d0) = 4, size(d1) = 3, size(d2) = 2, size(d3) = 1, size(d4) = 0
Goal: loc(d1) = d0, loc(d2) = d1, loc(d3) = d2, loc(d4) = d3, top(p3) = d4
Fig. 4.2. Tower of Hanoi in Functional Strips (Geffner, 1999)
second class of relational symbols (called constraints) which are defined over arbitrary functional expressions, as shown in figure 4.1. Standard representations containing only literals defined over constant symbols are included as a special case, and we can combine ADD/DEL effects and updates in operator definitions.
4.2 Extending Strips to Function Applications

In the following we introduce an extension of the Strips language (see def. 2.1.1) from propositional representations to a larger subset of first-order logic, allowing general terms as arguments of relational symbols. Operators are defined over these more general formulas, resulting in a more complex state transition function.
Definition 4.2.1 (FPlan Language). The language L(X, C, F, R_P ∪ R_C) is defined over sets of variables X, constant symbols C, function symbols F, and relational symbols R = R_P ∪ R_C in the following way:
Variables x ∈ X are terms.
Constant symbols c ∈ C are terms.
If f ∈ F is a function symbol with arity α(f) = i and t_1, t_2, ..., t_i are terms, then f(t_1, t_2, ..., t_i) is a term.
If p ∈ R_P is a relational symbol with arity α(p) = j and t_1, t_2, ..., t_j are terms, then p(t_1, t_2, ..., t_j) is a formula.
If r ∈ R_C is a relational symbol with arity α(r) = j and t_1, t_2, ..., t_j are terms, then r(t_1, t_2, ..., t_j) is a formula. We call formulas with relational symbols from R_C constraints.
For short, we write p(t).
If p_1(t_1), p_2(t_2), ..., p_k(t_k) are formulas with p_1, p_2, ..., p_k ∈ R, then {p_1(t_1), p_2(t_2), ..., p_k(t_k)} is a formula, representing the conjunction of literals.
There are no other FPlan formulas.

Remarks:
- Formulas consisting of a single relational symbol are called (positive) literals. Terms over C ∪ F, i.e., terms without variables, are called ground terms. Formulas over ground terms are called ground formulas.
- With X(F) we denote the variables occurring in a formula F.
An example of a state representation, goal definition, and operator definition using the FPlan language is given in figure 4.1.b for the hanoi domain. FPlan is instantiated in the following way: X = {?pi, ?pj, ?li, ?lj}, C = {p1, p2, p3, 1, 2, 3, ⊥}, F = {[_i, car_1, cdr_1, cons_2}, R_P = {on_2}, R_C = {≠_2, <_2}.
A state in the hanoi domain, such as {on([2 3 ⊥], p1), on([⊥], p2), on([1 ⊥], p3)}, is given as a set of atoms where the arguments can be arbitrary ground terms. We can use constructor functions to define complex data structures. For example, [1 2 3 ⊥] is a list of constant symbols, where [ is a list constructor function defined over an arbitrary number of elements. The atom on([2 3 ⊥], p1) corresponds to the evaluated expression on(cdr([1 2 3 ⊥]), p1), where cdr(?l) returns a list without its first element. State transformation by updating is defined by such evaluations.
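Such an update evaluation can be sketched as follows (our own illustration of the UPDATE clause of the functional hanoi move operator, with car/cdr/cons on plain Python lists; the dummy bottom element ⊥ marks an otherwise empty peg, and we assume, as the figure suggests, that any disk may be moved onto a peg holding only ⊥):

```python
BOTTOM = "⊥"  # dummy bottom element marking an empty peg

def car(l): return l[0]        # first element of a list
def cdr(l): return l[1:]       # list without its first element
def cons(x, l): return [x] + l # insert x in front of l

def move(state, pi, pj):
    li, lj = state[pi], state[pj]
    if car(li) == BOTTOM:                            # constraint car(?li) != BOTTOM
        return None
    if car(lj) != BOTTOM and not car(li) < car(lj):  # constraint car(?li) < car(?lj)
        return None
    # UPDATE: push car of the source list onto the target list
    return {**state, pj: cons(car(li), lj), pi: cdr(li)}

s = {"p1": [1, 2, 3, BOTTOM], "p2": [BOTTOM], "p3": [BOTTOM]}
s2 = move(s, "p1", "p2")
print(s2["p1"], s2["p2"])  # [2, 3, '⊥'] [1, '⊥']
```

No clear or smaller predicates are needed: both constraints are evaluated directly on the list arguments.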
Relational symbols are divided into two categories: symbols in R_P and symbols in R_C. Symbols in R_P denote relations which characterize a problem state. For example, on([1 2 3 ⊥], p1) with on ∈ R_P is true in a state where disks 1, 2, and 3 are lying on peg p1. Symbols in R_C denote additional characteristics of a state, which can be inferred from a formula over R_P. For example, car(L1) ≠ ⊥ with ≠ ∈ R_C is true for L1 = [1 2 3 ⊥]. Relations over R_C are often useful for representing constraints, especially constraints which must hold for all states, that is, static predicates.
Preconditions are defined as arbitrary formulas over R_P ∪ R_C. Standard preconditions, such as p = on(?l1, ?x), can be transformed into constraints: if match(p, s) results in σ1 = {?l1 ← [1 2 3 ∞], ?x ← p1}, then the constraints eq(?x, p1) and eq(?l1, [1 2 3 ∞]), with eq as equality-test for symbols in R_C, must hold.
We allow constraints to be defined over free variables, i.e., variables that cannot be instantiated by matching an operator precondition with a current state. For example, variables ?i and ?j in the constraint nth(?i, ?l) < nth(?j, ?l) might be free. To restrict the possible instantiations of such variables, they must be declared together with a range in the problem specification (see sect. 4.4.5 for an example).
4. Integrating Function Application in Planning
Definition 4.2.2 (Free Variables in R_C). For a formula F ∈ L(X, C, F, R_P ∪ R_C), variables occurring only in literals over R_C are called free. The set of such variables is denoted as X_C ⊆ X, and with X_C(F) we refer to the free variables in F.

For all variables x in X_C, instantiations must be restricted by a range R(x). For variables belonging to an ordered data type (e.g., natural numbers), a range is declared as R(x) = [min, max] with min giving the smallest and max the largest value x is allowed to assume. Alternatively, for categorial data types, a range is defined by enumerating all possible values the variable can assume: R(x) = [v1, ..., vn].
Definition 4.2.3 (State Representation). A problem state s is a conjunction of atoms over relational symbols in R_P. That is, s ∈ L(C, F, R_P).

We presuppose that terms are always evaluated if they are grounded. Evaluation of terms is defined in the usual way (Field & Harrison, 1988; Ehrig & Mahr, 1985):
Definition 4.2.4 (Evaluation of Ground Terms). A term t over L(C, F) is evaluated in the following way:
eval(c) = c, for c ∈ C
eval(f(t1, ..., tn)) = apply(f(eval(t1), ..., eval(tn))), for f ∈ F and t1, ..., tn as terms over C ∪ F.

Function application apply returns a unique value for f(c1, ..., cn). For constructor functions (such as the list constructor [..]) we define apply(f_c(c1, ..., cn)) = f_c(c1, ..., cn).
For example, eval(cdr([1 2 3 ∞])) returns [2 3 ∞]. Evaluation of eval(plus(3, minus(5, 1))) returns 7. Evaluated states are sets of relational symbols over constant symbols and complex structures, represented by constructor functions over constant symbols. For example, the relational symbol on in state description {on([2 3 ∞], p1), on([∞], p2), on([1 ∞], p3)} has constant arguments p1, p2, and p3, and complex arguments [2 3 ∞], [∞], and [1 ∞].
Relational symbols in R_C can be considered as special terms, namely terms that evaluate to a truth value, also called boolean terms. In contrast, relations in R_P are true if they match with the current state and false otherwise. In the following, we will speak of evaluation of expressions when referring to terms including such boolean terms.
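The evaluation scheme of definition 4.2.4 can be sketched as follows; the tuple encoding of terms, the name eval_term, and the use of INF for the bottom marker ∞ are our own illustration, not part of the FPlan specification:

```python
# A minimal sketch of Definition 4.2.4 (evaluation of ground terms):
# a term is either a constant or a tuple (function-name, arg-terms...).
INF = float("inf")  # plays the role of the list terminator "∞"

FUNCTIONS = {
    "car": lambda l: l[0],
    "cdr": lambda l: l[1:],
    "cons": lambda x, l: (x,) + l,
    "plus": lambda a, b: a + b,
    "minus": lambda a, b: a - b,
}

def eval_term(t):
    """eval(c) = c for constants; eval(f(t1,...,tn)) = apply(f, eval(t1), ...)."""
    if not (isinstance(t, tuple) and t and t[0] in FUNCTIONS):
        return t  # constant, or a constructor term already in normal form
    f, *args = t
    return FUNCTIONS[f](*(eval_term(a) for a in args))

# eval(cdr([1 2 3 ∞])) returns [2 3 ∞]
print(eval_term(("cdr", (1, 2, 3, INF))))        # -> (2, 3, inf)
# eval(plus(3, minus(5, 1))) returns 7
print(eval_term(("plus", 3, ("minus", 5, 1))))   # -> 7
```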
Definition 4.2.5 (Evaluation of Expressions over Ground Terms). An expression over L(C, F, R_C) is evaluated in the following way:
eval(c), eval(f(t1, ..., tn)) as in def. 4.2.4.
eval(r(t1, ..., tn)) = apply(r(eval(t1), ..., eval(tn))), for r ∈ R_C and t1, ..., tn as terms over C ∪ F ∪ R_C with eval(r(t1, ..., tn)) ∈ {TRUE, FALSE}.
For a set of atoms over L(C, F, R_C), i.e., a conjunction of boolean expressions, evaluation is extended to:
eval({r1(t1), ..., rn(tn)}) = TRUE if r1(t1) = TRUE and ... and rn(tn) = TRUE, FALSE else.
For example, {car([1 2 3 ∞]) ≠ ∞, car([1 2 3 ∞]) < car([∞])} evaluates to TRUE. In our planner, evaluation of expressions is realized by calling the meta-function eval provided by Common Lisp.
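The conjunction rule of definition 4.2.5 reduces to testing every constraint in the set; again the tuple/INF encoding is only a sketch of ours:

```python
# Sketch of Definition 4.2.5: constraints over R_C are boolean terms,
# and a set of them evaluates to TRUE iff every member evaluates to TRUE.
INF = float("inf")

def car(l):
    return l[0]

def eval_constraints(constraints):
    """eval({r1(t1),...,rn(tn)}) = TRUE iff all ri(ti) evaluate to TRUE."""
    return all(r(*args) for r, *args in constraints)

state_list = (1, 2, 3, INF)
# {car([1 2 3 ∞]) ≠ ∞, car([1 2 3 ∞]) < car([∞])} evaluates to TRUE
cs = [
    (lambda x, y: x != y, car(state_list), INF),
    (lambda x, y: x < y, car(state_list), car((INF,))),
]
print(eval_constraints(cs))  # -> True
```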
Before introducing goals and operators defined over L(C, F, R), we extend the definitions for substitution and matching (def. 2.1.3):

Definition 4.2.6 (Product of Substitutions). The composition of substitutions θ ∘ θ′ is the set of all compatible pairs (x ← t) ∈ θ and (x′ ← t′) ∈ θ′ with x, x′ ∈ X and t, t′ ∈ L(X, C, F). Substitutions are compatible iff for each (x ← t) ∈ θ there does not exist any (x′ ← t′) ∈ θ′ with x = x′ and t ≠ t′. The product of sets of substitutions Θ = {θ1, ..., θn} and Θ′ = {θ′1, ..., θ′m} is defined as pairwise composition Θ × Θ′ = {θi ∘ θ′j} for i = 1...n, j = 1...m.
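Composition, compatibility, and the product of substitution sets can be sketched as follows, assuming substitutions are encoded as Python dicts from variable names to terms:

```python
# Sketch of Definition 4.2.6 with dict-encoded substitutions.
def compatible(s1, s2):
    """Compatible iff no variable is bound to two different terms."""
    return all(s2.get(x, t) == t for x, t in s1.items())

def compose(s1, s2):
    """Union of the bindings; only meaningful for compatible substitutions."""
    return {**s1, **s2}

def product(S1, S2):
    """Pairwise composition, keeping only the compatible pairs."""
    return [compose(a, b) for a in S1 for b in S2 if compatible(a, b)]

S1 = [{"?x": "p1"}, {"?x": "p2"}]
S2 = [{"?x": "p1", "?y": "p3"}]
# only {?x <- p1} is compatible with {?x <- p1, ?y <- p3}
print(product(S1, S2))  # -> [{'?x': 'p1', '?y': 'p3'}]
```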
Definition 4.2.7 (Matching and Assignment). Let F = F_P ∪ F_C be a formula with
F_P = {p1(t1), ..., pi(ti)} ∈ L(X, C, F, R_P) and
F_C = {r1(t′1), ..., rj(t′j)} ∈ L(X, C, F, R_C), and let
A ∈ L(C, F, R_P) be a set of evaluated atoms.
The set of substitutions Σ for F with respect to A is defined as Σ = Σ_P × Σ_C with F_P σi ⊆ A and eval(F_C σi) = TRUE for all σi ∈ Σ.
Σ_P is calculated by match(F_P, A) as specified in definition 2.1.3.
Σ_C is calculated by ass(X_C) for all free variables x1, ..., xn = X_C(F_C) such that for all σi ∈ Σ_C holds σi = {x1 ← c1, ..., xn ← cn} where ci ∈ R(xi) (constants ci in the range of variable xi, see def. 4.2.2).
Definition 4.2.7 extends matching of formulas with sets of atoms in such a way that only those matches are allowed where the substitutions additionally fulfill the constraints. Consider the state
A = {on([2 3 ∞], p1), on([∞], p2), on([1 ∞], p3)}.
Formula PRE of the move operator given in figure 4.1.b,
F = {on(?li, ?pi), on(?lj, ?pj), car(?li) ≠ ∞, car(?li) < car(?lj)},
can be instantiated to Σ = {σ1, σ2, σ3} with
σ1 = {?pi ← p1, ?pj ← p2, ?li ← [2 3 ∞], ?lj ← [∞]},
σ2 = {?pi ← p3, ?pj ← p2, ?li ← [1 ∞], ?lj ← [∞]},
σ3 = {?pi ← p3, ?pj ← p1, ?li ← [1 ∞], ?lj ← [2 3 ∞]}
but not to
σ4 = {?pi ← p1, ?pj ← p3, ?li ← [2 3 ∞], ?lj ← [1 ∞]},
σ5 = {?pi ← p2, ?pj ← p1, ?li ← [∞], ?lj ← [2 3 ∞]},
σ6 = {?pi ← p2, ?pj ← p3, ?li ← [∞], ?lj ← [1 ∞]}.
A procedural realization for calculating all instantiations of a formula with respect to a set of atoms is to first calculate all matches and then modify Σ by stepwise introducing the constraints. For the example above, we first calculate Σ = {σ1, σ2, σ3, σ4, σ5, σ6} considering only the sub-formula F_P = {on(?li, ?pi), on(?lj, ?pj)}. Next, constraint car(?li) ≠ ∞ is introduced, reducing Σ to Σ = {σ1, σ2, σ3, σ4}. Finally, we obtain Σ = {σ1, σ2, σ3} by applying the constraint car(?li) < car(?lj).
If the constraints contain free variables, each substitution in Σ is combined with instantiations of these variables as specified in definition 4.2.6. An example for instantiating formulas with free variables is given in section 4.4.5.
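The stepwise procedure above (match F_P first, then prune the match set constraint by constraint) can be sketched like this; the dict-based encoding of states and substitutions, and INF for ∞, are illustrative:

```python
# Sketch of Definition 4.2.7: match, then filter by constraints.
from itertools import permutations

INF = float("inf")

def car(l):
    return l[0]

# State A: on([2 3 ∞], p1), on([∞], p2), on([1 ∞], p3)
state = {"p1": (2, 3, INF), "p2": (INF,), "p3": (1, INF)}

# All matches of F_P = {on(?li, ?pi), on(?lj, ?pj)} with ?pi != ?pj
sigma = [
    {"?pi": pi, "?pj": pj, "?li": state[pi], "?lj": state[pj]}
    for pi, pj in permutations(state, 2)
]
assert len(sigma) == 6          # sigma_1 ... sigma_6

# Introduce constraint car(?li) != ∞ (drops the empty-peg sources)
sigma = [s for s in sigma if car(s["?li"]) != INF]
# Introduce constraint car(?li) < car(?lj) (only smaller onto larger)
sigma = [s for s in sigma if car(s["?li"]) < car(s["?lj"])]

print([(s["?pi"], s["?pj"]) for s in sigma])
# -> [('p1', 'p2'), ('p3', 'p1'), ('p3', 'p2')]
```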
Definition 4.2.8 (Goal Representation). A goal G is a formula in L(X, C, F, R).

The goal given for hanoi in figure 4.1.b can alternatively be represented by:
{on([1 2 3 ∞], p3), on(?li, ?pi), on(?lj, ?pj), ∞ = car(?li), ∞ = car(?lj)}.
Definition 4.2.9 (FPlan Operator). An FPlan operator op is described by preconditions (PRE), ADD and DEL lists, and updates (UP) with PRE = PRE_P ∪ PRE_C ∈ L(X, C, F, R_P ∪ R_C) and ADD, DEL ∈ L(X, C, F, R_P).
An update UP is a list of function applications, specifying that a function f(t) ∈ L(X, C, F) is applied to a term t′i which is an argument of literal p(t′) with p ∈ R_P.
Updates overwrite the value of an argument of a literal. Realizing an update using ADD/DEL effects would mean to first calculate the new value of an argument, then delete the literals in which this argument occurs, and finally add the literals with the new argument values. That is, modeling updates as a specific process is more efficient. An example for an update effect is given in figure 4.1.b:
change ?li in on(?li, ?pi) to cdr(?li)
defines that variable ?li in literal on(?li, ?pi) is updated to cdr(?li). If more than one update effect is given in an operator, the updates are performed sequentially from the uppermost to the last update. An update effect cannot contain free variables, that is, X(UP) ⊆ X(PRE).
Definition 4.2.10 (Updating a State). For a state s = {p1(t1), ..., pn(tn)} and an instantiated operator o with update effects u ∈ UP, updating is defined as replacement t′i := eval(f(t)) for a fixed argument t′i in a fixed literal pj(t′) ∈ s with X(t), X(t′) ⊆ X(PRE).
For short we write update(u, s).
Applying a list of updates UP = (u1, ..., un) to a state s is defined as update(UP, s) = update(un, update(un-1, ..., update(u1, s))).
For the initial state of hanoi in figure 4.1, s = {on(p1, [1 2 3 ∞]), on(p2, [∞]), on(p3, [∞])}, the update effects of the move operator can be instantiated to
u1 = change [∞] in on(p3, [∞]) to cons(car([1 2 3 ∞]), [∞])
u2 = change [1 2 3 ∞] in on(p1, [1 2 3 ∞]) to cdr([1 2 3 ∞]).
Applying these updates to s results in:
update((u1, u2), {on(p1, [1 2 3 ∞]), on(p2, [∞]), on(p3, [∞])}) =
update(u2, {on(p1, [1 2 3 ∞]), on(p2, [∞]), on(p3, eval(cons(car([1 2 3 ∞]), [∞])))}) =
update(u2, {on(p1, [1 2 3 ∞]), on(p2, [∞]), on(p3, [1 ∞])}) =
{on(p1, eval(cdr([1 2 3 ∞]))), on(p2, [∞]), on(p3, [1 ∞])} =
{on(p1, [2 3 ∞]), on(p2, [∞]), on(p3, [1 ∞])}.
Note that the operator is fully instantiated before the update effects are calculated. The current substitution σ ∈ Σ is not affected by updating. That is, if the value of an instantiated variable is changed, as for example L2 = [∞] is changed to L2 = [1 ∞], this change remains local to the literal specified in the update.
Definition 4.2.11 ((Forward) Operator Application). For a state s and an instantiated operator o, operator application is defined as Res(o, s) = update(UP, s \ DEL(o) ∪ ADD(o)) if PRE_P(o) ⊆ s and eval(PRE_C(o)) = TRUE.

ADD/DEL effects are always calculated before updating. Examples for operators with combined ADD/DEL and update effects are given in section 4.4.
An FPlan planning problem is given as P(O, I, G), where operators O, initial states I, and goals G are defined over the FPlan language, which extends Strips by allowing function applications. A plan for the hanoi problem specified in figure 4.1.b is given in figure 4.3.
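Forward application of the purely functional move operator (check PRE, then evaluate the updates sequentially, as in definitions 4.2.10 and 4.2.11) can be sketched as follows; the state encoding as a dict from pegs to tuples, with INF playing the role of ∞, is our own:

```python
# Sketch of forward operator application (Definitions 4.2.10/4.2.11)
# for the functional hanoi move operator.
INF = float("inf")

def move(state, pi, pj):
    """move(?pi, ?pj): applicable iff car(?li) != ∞ and car(?li) < car(?lj)."""
    li, lj = state[pi], state[pj]
    assert li[0] != INF and li[0] < lj[0], "precondition violated"
    s = dict(state)
    s[pj] = (li[0],) + lj   # change ?lj in on(?lj, ?pj) to cons(car(?li), ?lj)
    s[pi] = li[1:]          # change ?li in on(?li, ?pi) to cdr(?li)
    return s

s0 = {"p1": (1, 2, 3, INF), "p2": (INF,), "p3": (INF,)}
s1 = move(s0, "p1", "p3")
print(s1)  # -> {'p1': (2, 3, inf), 'p2': (inf,), 'p3': (1, inf)}
```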
4.3 Extensions of FPlan

4.3.1 Backward Operator Application

As described in section 2.1 in chapter 2, plan construction can be performed using either forward or backward search in the state space. To make FPlan applicable for backward planners, we introduce backward operator application.
Backward operator application for Strips (see def. 2.1.7) was defined so that an operator is backward applicable in a state s if its ADD list can be
((ON [1 2 3 ∞] P1) (ON [∞] P2) (ON [∞] P3))
(MOVE P1 P3)
((ON [2 3 ∞] P1) (ON [∞] P2) (ON [1 ∞] P3))
(MOVE P1 P2)
((ON [3 ∞] P1) (ON [2 ∞] P2) (ON [1 ∞] P3))
(MOVE P3 P2)
((ON [3 ∞] P1) (ON [1 2 ∞] P2) (ON [∞] P3))
(MOVE P1 P3)
((ON [∞] P1) (ON [1 2 ∞] P2) (ON [3 ∞] P3))
(MOVE P2 P1)
((ON [1 ∞] P1) (ON [2 ∞] P2) (ON [3 ∞] P3))
(MOVE P2 P3)
((ON [1 ∞] P1) (ON [∞] P2) (ON [2 3 ∞] P3))
(MOVE P1 P3)
((ON [∞] P1) (ON [∞] P2) (ON [1 2 3 ∞] P3))
Fig. 4.3. A Plan for Tower of Hanoi
matched with the current state, resulting in a predecessor state s′ where all elements of the ADD list are removed and which contains the literals of PRE and DEL. For operators containing update effects and constraints, the definition of backward operator application has to be extended with respect to applicability conditions and calculation of effects.
An operator is backward applicable if its ADD list can be matched with the current state s and if those constraints that hold after forward operator application hold in s. We call such constraints the postcondition (POST) of an operator. In many domains, the constraints holding in a state after forward operator application are the inverse of the constraints specified in the operator precondition. For example, for the hanoi domain, the constraints for forward application are {car(?li) ≠ ∞, car(?li) < car(?lj)}, where ?li represents the disks lying on peg ?pi and ?lj represents the disks lying on peg ?pj (see fig. 4.1.b). After executing move(?pi, ?pj), the "top" disk of ?li is inserted as head of list ?lj, and the constraints {car(?lj) ≠ ∞, car(?lj) < car(?li)} hold. The first constraint must hold because a disk is moved to peg ?pj and therefore ?lj cannot be the empty peg, represented as list [∞]. The second constraint must hold because all legal states in the hanoi domain contain stacks of disks of increasing size: because x < y holds for list ?li = [x y ..], and because x is the new head element of ?lj after executing move, the new head element of list ?li (y) is larger than the new head element of list ?lj (x).
To calculate a backward update effect, the inverse function f⁻¹ to update function f must be known. In general, updating can involve a sequence of function applications, that is, f = f1 ∘ ... ∘ fn. To make backward plan construction compatible with forward plan execution, update effects must be restricted to bijective functions: f(x) = y ↔ f⁻¹(y) = x. For the move operator in the hanoi domain, the inverse function of removing the top element of a tower, i.e. the head element of the list ?li, and inserting it as the head of ?lj is to remove the first element of ?lj and insert it as head of ?li.
We do not propose an algorithm for automatic construction of correct constraints for backward application and inverse functions. Instead, we require that for each operator op its inverse operator op⁻¹ has to be defined in the domain specification. It is the responsibility of the author of a domain to guarantee that op⁻¹ is the inverse operator of op; that is, that Res(o, s′) = s holds for Res⁻¹(o⁻¹, s) = s′ for all operators.
When defining an inverse operator, the (inverse) precondition represents the conditions which must hold after forward application of the original operator, i.e. the inverse precondition corresponds to the original postcondition; the updates represent the inverse function as discussed above. The inverse operator move⁻¹ for hanoi is:

Operator: move⁻¹(?pi, ?pj)
PRE: {on(?li, ?pi), on(?lj, ?pj), car(?lj) ≠ ∞, car(?li) > car(?lj)}
UPDATE: change ?li in on(?li, ?pi) to cons(car(?lj), ?li)
change ?lj in on(?lj, ?pj) to cdr(?lj)
Because update effects are explicitly defined for inverse operator application, updates are performed as specified in definition 4.2.10.
For state s = {on([2 3 ∞], p1), on([∞], p2), on([1 ∞], p3)}, backward application of the instantiated operator move⁻¹(p1, p3) results in s′ = {on([1 2 3 ∞], p1), on([∞], p2), on([∞], p3)}.
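This backward step can be sketched by implementing move⁻¹ as a forward-applicable operator and checking that a subsequent forward application of move restores the original state; the encoding (dict of pegs, INF for ∞) is illustrative:

```python
# Sketch of backward application via the explicit inverse operator
# move^-1 (Sect. 4.3.1): Res(move^-1(pi, pj), s) yields the predecessor.
INF = float("inf")

def move(state, pi, pj):
    li, lj = state[pi], state[pj]
    assert li[0] != INF and li[0] < lj[0]
    return {**state, pj: (li[0],) + lj, pi: li[1:]}

def move_inv(state, pi, pj):
    """PRE: car(?lj) != ∞ and car(?li) > car(?lj); moves top of ?lj back to ?li."""
    li, lj = state[pi], state[pj]
    assert lj[0] != INF and li[0] > lj[0]
    return {**state, pi: (lj[0],) + li, pj: lj[1:]}

s = {"p1": (2, 3, INF), "p2": (INF,), "p3": (1, INF)}
s_pred = move_inv(s, "p1", "p3")
print(s_pred)                           # {'p1': (1, 2, 3, inf), 'p2': (inf,), 'p3': (inf,)}
print(move(s_pred, "p1", "p3") == s)    # forward application restores s -> True
```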
For operators defined with mixed ADD/DEL and update effects the inverse operator is also defined explicitly, with inverted precondition, ADD and DEL lists. Application of an inverted operator can then be performed as specified in definition 4.2.11; that is, backward operator application is replaced by forward application of the inverted operator. Examples for backward application of operators with combined ADD/DEL and update effects are given in section 4.4.
4.3.2 Introducing User-Defined Functions

As defined in section 4.2, the set of function symbols F and the set of relational symbols R_C of FPlan can contain arbitrary built-in and user-defined functions. When parsing a problem specification, each symbol which is part of Common Lisp and each symbol corresponding to the name of a user-defined function is treated as an expression to be evaluated. Symbols in R_C have to evaluate to a truth value.
An example specification of hanoi using built-in and user-defined functions is given in figure 4.4.^3
For the Tower of Hanoi domain the FPlan language is instantiated to X = {?pi, ?pj, ?li, ?lj}, C = {p1, p2, p3, 1, 2, 3, nil}, F = {[..], car, cdr, cons}, R_P = {on}, R_C = {not-empty, legal}. The empty list [ ] corresponds to nil. The user-defined functions can be defined over additional built-in and user-defined functions which are not included in L. For a Lisp-implemented planner like our system DPlan, after reading in a problem specification, all declared user-defined functions are appended to the set of built-in Lisp functions and interpreted in the usual manner.
4.4 Examples

In the previous sections, we presented FPlan with a functional version of the hanoi domain, demonstrating how operators with ADD/DEL effects can alternatively be modeled with update effects. In the following, we will give a variety of examples for problem specifications with FPlan. First we will show how problems involving resource constraints can be modeled. Afterwards, we will give examples for a numerical domain and domain specifications involving

^3 For our planning system, functions are defined in Lisp syntax, for example:
(defun legal (l1 l2) (if (null l2) T (< (car l1) (car l2)))).
Operator: move(?pi, ?pj)
PRE: {on(?li, ?pi), on(?lj, ?pj), not-empty(?li), legal(?li, ?lj)}
UPDATE: change ?lj in on(?lj, ?pj) to cons(car(?li), ?lj)
change ?li in on(?li, ?pi) to cdr(?li)
Goal: {on([ ], p1), on([ ], p2), on([1 2 3], p3)}
Initial State: {on([1 2 3], p1), on([ ], p2), on([ ], p3)}
Functions: not-empty(l) = not(null(l))
legal(l1, l2) = if null(l2) then TRUE else car(l1) < car(l2)
Fig. 4.4. Tower of Hanoi with User-Defined Functions
operators with mixed ADD/DEL effects and updates. We will show how standard programming problems can be modeled in planning, and finally we will present preliminary ideas for combining planning with constraint logic programming.
For selected problems we include empirical comparisons, demonstrating that plan construction with function updates is significantly more efficient than plan construction using ADD/DEL effects. We tested all examples with our backward planner DPlan (Schmid & Wysotzki, 2000b). Therefore, we present standard forward operator definitions together with the inverse operators as specified in section 4.3.1.
4.4.1 Planning with Resource Variables

Planning with resource constraints can be modeled as a special case of function updates with FPlan. As an example we use an airplane domain (Koehler, 1998). A problem specification in FPlan is given in figure 4.5.

Operator: fly(?plane, ?x, ?y)
PRE: {at(?plane, ?x), fuelresource(?plane, ?fuel), ?fuel ≥ distance(?x, ?y)/3}
ADD: {at(?plane, ?y)}
DEL: {at(?plane, ?x)}
UPDATE: change ?fuel in fuelresource(?plane, ?fuel) to calcfuel(?fuel, ?x, ?y)
change ?time in timeresource(?plane, ?time) to calctime(?time, ?x, ?y)
Goal: {at(p1, berlin)}
Initial State: {at(p1, london), fuelresource(p1, 750), timeresource(p1, 0)}
Functions: calcfuel(?fuel, ?x, ?y) = ?fuel − distance(?x, ?y)/3
calctime(?time, ?x, ?y) = ?time + 3/20 · distance(?x, ?y)
Fig. 4.5. A Problem Specification for the Airplane Domain
The operator fly specifies what happens when an airplane ?plane flies from airport ?x to airport ?y. We consider two resources: the amount of fuel and the amount of time the plane needs to fly from one airport to another. In the initial state the tank is filled to capacity (fuelresource(p1, 750)) and no time has yet been spent (timeresource(p1, 0)). A resource constraint that must hold before the action of flying the plane from one airport ?x to another airport ?y can be carried out would be: at(?plane, ?x), ?fuel ≥ distance(?x, ?y)/3. That is, the plane has to be at airport ?x and there has to be enough fuel in the tank to travel the distance from ?x to ?y.
The distance between two airports is obtained by calling the function distance(?x, ?y), which returns the distance value by consulting an underlying database (see table 4.1). The distances can alternatively be modeled as static predicates (for example distance(berlin, paris, 540), etc.). Using a database query function is more efficient than modeling the distances with static predicates because static predicates have to be encoded in the current state, and finding a certain distance (instantiating the literal) requires matching, which takes considerably longer than a database query.
Table 4.1. Database with Distances between Airports

         | Berlin | Paris  | London | New York
Berlin   |   -    | 540 m  | 570 m  | 3960 m
Paris    | 540 m  |   -    | 210 m  | 3620 m
London   | 570 m  | 210 m  |   -    | 3460 m
New York | 3960 m | 3620 m | 3460 m |   -
The position of the plane ?plane is modeled with the literal at. When flying from ?x to ?y, the literal at(?plane, ?x) is deleted from and the literal at(?plane, ?y) is added to the set of literals describing the current state. The resource variable ?fuel, described by the relational symbol fuelresource, is updated with the result of the user-defined function calcfuel, which calculates the consumption of fuel according to the traveled distance. The resource variable ?time, described by the relational symbol timeresource, is updated in a similar way, assuming that it takes 3/20 · distance(?x, ?y) time units to fly the distance from airport ?x to ?y.
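The two resource updates can be sketched as follows, using distances from table 4.1; the function names calcfuel/calctime follow our reconstruction of figure 4.5 and the database is reduced to the entries needed here:

```python
# Sketch of the fly operator's resource updates (figure 4.5).
DIST = {
    ("london", "berlin"): 570, ("berlin", "london"): 570,
    ("london", "paris"): 210, ("paris", "london"): 210,
}

def calcfuel(fuel, x, y):
    """Fuel consumption is proportional to the traveled distance."""
    return fuel - DIST[(x, y)] / 3

def calctime(time, x, y):
    """Flying the distance takes 3/20 * distance time units."""
    return time + 3 / 20 * DIST[(x, y)]

# Initial state: fuelresource(p1, 750), timeresource(p1, 0); fly london -> berlin.
# Precondition ?fuel >= distance/3 holds: 750 >= 570/3.
fuel = calcfuel(750, "london", "berlin")
time = calctime(0, "london", "berlin")
print(fuel, time)  # fuel 560.0, time 85.5
```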
When ADD/DEL and update effects occur together in one operator, the update effect is carried out on the state obtained after calculating the ADD/DEL effect (definition 4.2.11). For backward planning we have to specify a corresponding inverse operator fly⁻¹ (see fig. 4.6). The precondition for fly⁻¹ requests that the plane be at airport ?y and that more time is left than it takes to fly the distance between airport ?x and airport ?y. The resource variables ?fuel and ?time are updated with the result of the inverse functions (calcfuel⁻¹ and calctime⁻¹).
Updating of state variables as proposed in FPlan is more flexible than handling resource variables separately (as in Koehler (1998)). While in Koehler (1998) fuel and time are global variables, in FPlan the current value of fuel
Operator: fly⁻¹(?plane, ?x, ?y)
PRE: {at(?plane, ?y), timeresource(?plane, ?time), fuelresource(?plane, ?fuel),
?time ≥ distance(?x, ?y) · 3/20}
ADD: {at(?plane, ?x)}
DEL: {at(?plane, ?y)}
UPDATE: change ?fuel in fuelresource(?plane, ?fuel) to calcfuel⁻¹(?fuel, ?x, ?y)
change ?time in timeresource(?plane, ?time) to calctime⁻¹(?time, ?x, ?y)
Functions: calcfuel⁻¹(?fuel, ?x, ?y) = min(?fuel + distance(?x, ?y)/3, 750)
; 750 is the maximum capacity of the tank
calctime⁻¹(?time, ?x, ?y) = ?time − 3/20 · distance(?x, ?y)
Fig. 4.6. Specification of the Inverse Operator fly⁻¹ for the Airplane Domain
and time are arguments of the relational symbols fuelresource(?plane, ?fuel) and timeresource(?plane, ?time). Therefore, while modeling a problem involving more than one plane can easily be done in FPlan, this is not possible with Koehler's approach. For example, we can specify the following goal and initial state:
Goal: {at(p1, berlin), at(p2, paris)}
Initial State: {at(p1, berlin), fuelresource(p1, 750), timeresource(p1, 0),
at(p2, paris), fuelresource(p2, 750), timeresource(p2, 0)}
Modeling domains with time or cost resources is simple when function applications are allowed. Typical examples are job scheduling problems, for example the machine shop domain presented in (Veloso et al., 1995). To model time steps, a relational symbol time(?t) can be introduced. Time ?t is initially zero, and each operator application results in ?t being incremented by one step. To model a machine that is occupied during a certain time interval, a relational symbol occupied(?m, ?o) can be used, where ?m represents the name of a machine and ?o the last time slot where it is occupied, with ?o = 0 representing that the machine is free to be used. For each operator involving the usage of a machine, a precondition requesting that the machine is free for the current time slot can be introduced. If a machine is free to be used, the occupied relation is updated by adding the current time and the amount of time steps the executed action requires. It can also be modeled that occupation time does not only depend on the kind of action performed but also on the kind of object involved (e.g., polishing a large object could need three time steps, while polishing a small object needs only one time step).
4.4.2 Planning for Numerical Problems: The Water Jug Domain

Numerical domains, such as the water jug domain presented by Pednault (1994), cannot be modeled with a Strips-like representation. Figure 4.7 shows an FPlan specification of the water jug domain: we have three jugs of different volumes and different capacities. The operator pour models the action of pouring water from one jug ?j1 into another jug ?j2 until either ?j1 is empty or ?j2 is filled to capacity. The capacities of the jugs are statics while the actual volumes are fluents. The resulting volume ?v1 for jug ?j1 is either zero or what remains in the jug when pouring as much water as possible into jug ?j2: max(0, ?v1 − ?c2 + ?v2). The resulting volume ?v2 for jug ?j2 is either its capacity or its previous volume plus the volume of the first jug ?j1: min(?c2, ?v1 + ?v2).
Operator: pour(?j1, ?j2)
PRE: {volume(?j1, ?v1), volume(?j2, ?v2), capacity(?j2, ?c2),
(?v2 < ?c2), (?v1 > 0)}
UPDATE: change ?v1 in volume(?j1, ?v1) to max(0, ?v1 − ?c2 + ?v2)
change ?v2 in volume(?j2, ?v2) to min(?c2, ?v1 + ?v2)

Operator: pour⁻¹(?j1, ?j2)
PRE: {volume(?j1, ?v1), volume(?j2, ?v2), capacity(?j1, ?c1), capacity(?j2, ?c2),
((?v2 = ?c2) or (?v1 = 0))}
UPDATE: change ?v1 in volume(?j1, ?v1) to min(?c1, ?v1 + ?v2)
change ?v2 in volume(?j2, ?v2) to max(0, ?v2 − ?c1 + ?v1)

Statics: {capacity(a, 36), capacity(b, 45), capacity(c, 54)}
Goal: {volume(a, 25), volume(b, 0), volume(c, 52)}
Initial State: {volume(a, 16), volume(b, 7), volume(c, 34)}
Fig. 4.7. A Problem Specification for the Water Jug Domain
After pouring water from one jug into another, the postcondition (?v2 = ?c2) or (?v1 = 0) must hold. That is, the volume ?v1 of the first jug ?j1 must be zero or the volume ?v2 of the second jug ?j2 must equal its capacity ?c2. When specifying the inverse operator pour⁻¹ for backward planning, this postcondition becomes the precondition of pour⁻¹. The update functions are inverted: the volume ?v1 of the first jug ?j1 is updated with the result of min(?c1, ?v1 + ?v2), and the volume ?v2 of the second jug ?j2 is updated with the result of max(0, ?v2 − ?c1 + ?v1). The resulting plan for the given problem specification is shown in figure 4.8.
For this numerical problem there exists no backward plan for all initial states which can be transformed into the goal by forward operator application. For instance, for the initial state volume(a, 16), volume(b, 27), volume(c, 34) the following action sequence results in a goal state:
volume(a, 16), volume(b, 27), volume(c, 34)   pour(c, b)
volume(a, 16), volume(b, 45), volume(c, 16)   pour(b, a)
volume(a, 36), volume(b, 25), volume(c, 16)   pour(a, c)
volume(a, 0), volume(b, 25), volume(c, 52)    pour(b, a)
volume(a, 25), volume(b, 0), volume(c, 52)
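The pour updates of figure 4.7 can be sketched and the action sequence above replayed; the dict encoding of the jug volumes is illustrative:

```python
# Sketch of the pour operator (figure 4.7), replaying the forward
# action sequence given above.
CAP = {"a": 36, "b": 45, "c": 54}

def pour(vol, j1, j2):
    """PRE: ?v2 < ?c2 and ?v1 > 0; updates per figure 4.7."""
    v1, v2, c2 = vol[j1], vol[j2], CAP[j2]
    assert v2 < c2 and v1 > 0
    return {**vol,
            j1: max(0, v1 - c2 + v2),   # what remains in ?j1
            j2: min(c2, v1 + v2)}       # ?j2 filled as far as possible

v = {"a": 16, "b": 27, "c": 34}
for j1, j2 in [("c", "b"), ("b", "a"), ("a", "c"), ("b", "a")]:
    v = pour(v, j1, j2)
print(v)  # -> {'a': 25, 'b': 0, 'c': 52}
```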
((VOLUME A 25) (VOLUME B 0) (VOLUME C 52))
(POUR-1 B A)
((VOLUME A 0) (VOLUME B 25) (VOLUME C 52))
(POUR-1 A C)
((VOLUME A 36) (VOLUME B 25) (VOLUME C 16))
(POUR-1 B A)
((VOLUME A 16) (VOLUME B 45) (VOLUME C 16))
(POUR-1 C B)
((VOLUME A 16) (VOLUME B 7) (VOLUME C 54))
(POUR-1 C A)
((VOLUME A 36) (VOLUME B 7) (VOLUME C 34))
Fig. 4.8. A Plan for the Water Jug Problem
but the initial state is not a legal predecessor of volume(a, 16), volume(b, 45), volume(c, 16) when applying the pour⁻¹ operator. That is, backward planning is incomplete! The reason for this incompleteness is that the initial state in forward planning can be any arbitrary amount of water in the jugs, while for backward planning it has to be a state obtainable by operator application. A remedy for this problem is to introduce a second operator filljugs(?j1, ?j2, ?j3) which has no precondition. This operator models the initial filling up of the jugs from a water source.
4.4.3 Functional Planning for Standard Problems: Tower of Hanoi

A main motivation for modeling operator effects with updates of state variables by function application given by Geffner (2000) is that plan construction can be performed more efficiently. We give an empirical demonstration of this claim by comparing the running times when planning for the standard specification with those for the functional specification of Tower of Hanoi (see fig. 4.1). The inverse operator move⁻¹ was specified in section 4.3.1.
The running times were obtained with our system DPlan on an Ultra Sparc 5 system. DPlan is a universal planner implemented in Lisp. Because DPlan is based on a breadth-first strategy and because we did not focus on efficiency (for example, we did not use OBDD representations (Cimatti et al., 1998)), it has longer running times than state-of-the-art planners. Since we are interested in comparing the performance when planning for the different specifications, we do not give the absolute times but the performance gain in percent, calculated as (st − ft) · 100/st, where st is the running time for the standard specification and ft the running time for the functional specification (see table 4.2).^4
Because in the standard representation the relations between the sizes of the different discs must be specified explicitly by static relations, the number of relational symbols is quite large (18 literals for three disks, 25 for four disks, etc.). In the functional representation, built-in functions can be called to determine whether a disk of size 2 is smaller than a disk of size 3. The number of literals is very small (3 literals for an arbitrary number of disks). Consequently, planning for the standard representation requires more matching than planning for the functional one, where the constraint of the precondition has to be solved instead.

Table 4.2. Performance of FPlan: Tower of Hanoi

Number of disks     3       4       5
States              27      81      243
Performance gain    96.3%   98.7%   99.8%
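The gain formula can be checked against the absolute running times reported in footnote 4 of this section (standard st vs. functional ft, in seconds):

```python
# Performance gain (st - ft) * 100 / st, checked against footnote 4.
def gain(st, ft):
    return round((st - ft) * 100 / st, 1)

print(gain(34.5, 1.26))      # 3 discs -> 96.3
print(gain(335.89, 4.47))    # 4 discs -> 98.7
print(gain(10015.59, 19.3))  # 5 discs -> 99.8
```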
4.4.4 Mixing ADD/DEL Effects and Updates: Extended Blocksworld

In this section we present an extended version of the standard blocks-world (see fig. 4.9). First we introduce an operator clearblock which clears a block ?x by putting the block lying on top of ?x on the table if block topof(?x) is clear. That is, a block is addressed in an indirect manner as the result of a function evaluation (Geffner, 2000). As terms can be nested, this corresponds to an infinite type. For a state where on(A, B) holds, the function topof(B) returns A, while for a state where on(C, B) holds, topof(B) returns C. The standard blocks-world operators put and puttable are also formulated using the topof function. Furthermore, all operators are extended by an update for the value of the LastBlockMoved (Pednault, 1994). Note that we used a conditional effect for specifying the put operator. Our current implementation allows conditional effects for ADD/DEL effects but not for updates.
The update effect changing the value of LastBlockMoved cannot be handled in backward planning because we cannot define inverse operators including an inverse update of the argument LastBlockMoved. The inverse function would have to return the block that will be moved in the next step. At the current time step we do not know which block this will be.
^4 The absolute values are (standard/functional): 3 discs: 34.5 sec/1.26 sec; 4 discs: 335.89 sec/4.47 sec; 5 discs: 10015.59 sec/19.3 sec; 6 discs (729 states): >15 h/62.28 sec.
(a) Operator: clearblock(?x)
PRE: {?y = topof(?x), clear(?y)}
ADD: {clear(?x)}
DEL: {on(?y, ?x)}
UPDATE: change ?block in LastBlockMoved(?block) to ?y

(b) Operator: puttable(?x)
PRE: {?x = topof(?y), clear(?x)}
ADD: {clear(?y)}
DEL: {on(?x, ?y)}
UPDATE: change ?block in LastBlockMoved(?block) to ?x

(c) Operator: put(?x, ?y)
PRE: {clear(?x), clear(?y)}
ADD: {on(?x, ?y)}
DEL: {clear(?y)}
WHEN on(?x, ?z): ADD {on(?x, ?y), clear(?z)}
DEL {clear(?y), on(?x, ?z)}
UPDATE: change ?block in LastBlockMoved(?block) to ?x

Functions: topof(?x) = if on(?y, ?x) then ?y else nil
Fig. 4.9. Blocks-World Operators with Indirect Reference and Update
Other extended blocks-world domains can be modeled with FPlan, for example, including a robot agent who stacks and unstacks blocks, with an associated energy level which decreases with each action it performs. The robot's energy can be described with the relational symbol energy(?robot, ?energy) and is consumed according to the weight of the actual block being carried, by updating the variable ?energy in energy(?robot, ?energy).
4.4.5 Planning for Programming Problems: Sorting of Lists
With the extension to functional representation, a number of programming
problems, for example list sorting algorithms such as bubble, merge, or selection
sort, can be planned more efficiently. We have specified selection sort in
the standard and the functional way (see fig. 4.10).
In the functional representation we can use the built-in function ">"
instead of the predicate greater, the constructor for lists slist, and the function
nth(n, L) to reference the nth element of the list L instead of specifying the
position of each element in the list by using the literal isat(?x, ?i). In the
functional representation the indices to the list ?i and ?j are free variables,
which for lists with three elements can range from zero to two. When
calculating the possible instantiations for the operator swap, the variables ?i,
?j can be assigned any value within their range for which the precondition
holds (definition 4.2.2). The function swap is defined as an external function.
The inverse function of swap is the function swap itself.
(a) Standard representation
    Operator: swap(?i, ?j, ?x)
    PRE: {isat(?x, ?i), isat(?y, ?j), greater(?x, ?y)}
    ADD: {isat(?x, ?j), isat(?y, ?i)}
    DEL: {isat(?x, ?i), isat(?y, ?j)}
    Initial State: {isat(3, p1), isat(2, p2), isat(1, p3)}
    Goal: {isat(1, p1), isat(2, p2), isat(3, p3)}
(b) Functional representation
    Operator: swap(?i, ?j, ?x)
    PRE: {slist(?x), nth(?i, ?x) > nth(?j, ?x)}
    UPDATE: change ?x in slist(?x) to swap(?i, ?j, ?x)
    Variables: {(?i :range (0 2)) (?j :range (0 2))}
    Initial State: {slist([3 2 1])}
    Goal: {slist([1 2 3])}
    Functions: swap(?i, ?j, ?x) = (let ((temp (nth j x)))
                                    (setf (nth j x) (nth i x)
                                          (nth i x) temp)
                                    x)
Fig. 4.10. Specification of Selection Sort
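That swap is its own inverse is easy to check with a small stand-in for the Lisp function of fig. 4.10 (a Python sketch, written non-destructively for clarity):

```python
def swap(i, j, x):
    """Exchange the elements at positions i and j of list x (non-destructive)."""
    y = list(x)
    y[i], y[j] = y[j], y[i]
    return y

lst = [3, 2, 1]
once = swap(0, 2, lst)
print(once)              # [1, 2, 3]
print(swap(0, 2, once))  # applying swap again restores [3, 2, 1]
```

Self-inverseness is what makes the same external function usable in both forward and backward planning.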
Table 4.3. Performance of FPlan: Selection Sort

                          selection sort
Number of list elements   3       4       5
States                    6       24      120
Performance gain          27.3%   81.3%   96.1%
The performance gain when planning for the functional representation is
large (table 4.3) but not as stunning as in the Hanoi domain.[5] This may be
due to the free variables ?i and ?j that have to be assigned a value according
to their range. Consequently, the constraint solving takes somewhat longer but is
still faster than the matching of the literals for the standard version (6 for
three list elements, 10 for four, etc., versus only one literal for the functional
representation).
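The gains in table 4.3 follow directly from the absolute timings reported in the accompanying footnote; a quick check, assuming the gain is defined as (1 - t_functional / t_standard) * 100:

```python
# (elements, t_standard, t_functional) from the reported absolute values
timings = [(3, 0.33, 0.24), (4, 8.93, 1.67), (5, 547.05, 21.57)]

for n, t_std, t_fun in timings:
    gain = (1 - t_fun / t_std) * 100
    print(f"{n} elements: {gain:.1f}% gain")
# -> 27.3%, 81.3%, 96.1%, matching table 4.3
```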
4.4.6 Constraint Satisfaction as a Special Case of Planning
Logical programs can be defined over rules and facts (Sterling & Shapiro,
1986). Rules correspond to planning operators. Facts are similar to static
relations in planning; that is, they are assumed to be true in all situations.
Constraint logic programming extends standard logic programming so that
constraints can be used to efficiently restrict variable bindings (Frühwirth &
Abdennadher, 1997). In figure 4.11 an example of a program in constraint
Prolog is given. The problem is to compose a light meal consisting of an
appetizer, a main dish, and a dessert which all together contains no more
than a given calorie value (see Frühwirth & Abdennadher, 1997).
lightmeal(A, M, D) :- 1400 >= I + J + K, I > 0, J > 0, K > 0,
    appetizer(A, I), main(M, J), dessert(D, K).

appetizer(radishes, 50).  main(beef, 1000).  dessert(icecream, 600).
appetizer(soup, 300).     main(pork, 1200).  dessert(fruit, 90).

Fig. 4.11. Lightmeal in Constraint Prolog
The same problem can be specified in FPlan (figure 4.12). The goal, consisting
of a conjunction of literals, now contains free variables and a set of
constraints determining the values of those variables. In this case ?x, ?y, ?z
are free variables that can be assigned any value within their range (here
enumerations of dishes). The ranges of the variables ?x, ?y, ?z are specified
for the domain and not for a problem. The function calories looks up the
calorie value of a dish in an association list.
There are no operators specified for this example. The planning process
in this case solves the goal constraint by finding those combinations of dishes
that satisfy the constraint, for example a meal consisting of radishes with 50
calories, beef with 1000 calories, and fruit with 90 calories.
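This constraint-solving view of planning amounts to enumerating the variable ranges and testing the goal constraint. A minimal Python illustration of the idea (the encoding is a hypothetical stand-in, not FPlan's solver):

```python
from itertools import product

# calorie association list as in fig. 4.12
calories = {"radishes": 50, "soup": 300, "beef": 1000,
            "pork": 1200, "fruit": 90, "icecream": 600}

# variable ranges for ?x, ?y, ?z
appetizers = ["radishes", "soup"]
mains = ["beef", "pork"]
desserts = ["fruit", "icecream"]

# enumerate all instantiations and keep those satisfying the constraint
light_meals = [(a, m, d)
               for a, m, d in product(appetizers, mains, desserts)
               if calories[a] + calories[m] + calories[d] <= 1400]
print(light_meals)  # e.g. ('radishes', 'beef', 'fruit') with 1140 calories
```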
[5] The absolute values are (standard/functional): 3 elem.: 0.33 sec/0.24 sec; 4
elem.: 8.93 sec/1.67 sec; 5 elem.: 547.05 sec/21.57 sec; 6 elem. (720 states):
>15 h/717.41 sec.
Domain: Lightmeal
Variables: {(?x :range (radishes, soup))
            (?y :range (beef, pork))
            (?z :range (fruit, icecream))}
Initial State: { }  ; empty
Goal: {lightmeal(?x, ?y, ?z),
       (calories(?x) + calories(?y) + calories(?z)) <= 1400}
Functions: calories(?dish) = cadr(assoc(?dish, calorie-list))
calorie-list := ((radishes 50) (beef 1000) (fruit 90)
                 (icecream 600) (soup 300) (pork 1200) ...)
Fig. 4.12. Problem Specification for Lightmeal
5. Conclusions and Further Research

The first part of anything is usually easy.
– Travis McGee in: John D. MacDonald, The Scarlet Ruse, 1973
As mentioned before, DPlan is intended as a tool to support the generation
of finite programs. That is, plan generation is for small domains
with three or four objects only. Generation of a universal plan covering all
possible states of a domain, as we do with DPlan, is necessarily a complex
problem because for most interesting domains the number of states grows
exponentially with the number of objects. In the following, we first discuss
what extensions and modifications would be needed to make DPlan competitive
with state-of-the-art planning systems. Afterwards, we discuss extensions
which would improve DPlan as a tool for program synthesis. Finally, we discuss
some relations to human problem solving.
5.1 Comparing DPlan with the State of the Art
The effort of universal planning is typically higher than the average effort of
standard state-based planning because it is based on breadth-first search.
But if one is interested in generating optimal plans, this price must be paid.
Typically, universal planning terminates if the initial state is reached in some
pre-image (see sect. 2.4.4 in chap. 2). In contrast, DPlan terminates after all
states which can be transformed into the goal state are covered. Consequently,
it is not possible to make DPlan more time efficient: for a domain with
exponential growth, an exponential number of steps is needed to enumerate
all states. But DPlan could be made more memory efficient by representing
plans as OBDDs (Jensen & Veloso, 2000).
To be applied to standard planning problems, DPlan would need to be
modified. The main problem to overcome would be to generate complete
state descriptions by backward operator application from a set of top-level
goals, that is, from an incomplete state description (see sect. 3.1 in chap. 3).
Some recent work by Wolf (2000) deals with that problem: complete state
descriptions are constructed as maximal sets of atoms which are not mutually
excluded by the operator definitions of the domain.
Furthermore, DPlan currently terminates with the universal plan. For
standard planning, a functionality for extracting a linear plan for a fixed
state must be provided. Plan extraction can be done, for example, by
depth-first search in the universal plan (Schmid, 1999).
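Plan extraction from a universal plan can be sketched as follows, assuming, hypothetically, that the universal plan is stored as a map from each covered state to its optimal action and successor state (the encoding is illustrative, not DPlan's):

```python
def extract_plan(universal_plan, state, goal):
    """Follow the universal plan's optimal-action pointers from state to goal."""
    plan = []
    while state != goal:
        action, state = universal_plan[state]  # KeyError if state not covered
        plan.append(action)
    return plan

# Toy two-block domain: the goal is the state "on(A,B)"
up = {"A,B on table": ("put(A,B)", "on(A,B)"),
      "on(B,A)":      ("puttable(B)", "A,B on table")}
print(extract_plan(up, "on(B,A)", "on(A,B)"))  # ['puttable(B)', 'put(A,B)']
```

Since every covered state points along a shortest path, the extracted linear plan is optimal by construction.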
In the context of planning and control rule learning (see sect. 2.5.2 in
chap. 2), we argued that planning can profit from program synthesis: first,
a universal plan is constructed for a domain with a small number of objects,
then a recursive rule for the complete domain (with a fixed goal, such as "build
a tower of sorted blocks") is generalized. Afterwards, planning can be omitted
and instead the recursive rule is applied. To make this argument stronger, it is
important to provide empirical evidence. We must show that plan generation
by applying the recursive rule results in correct and optimal plans and that
these plans are calculated in significantly less time than with a state-of-the-art
planner. For that reason, DPlan must be extended such that (1) learned
recursive rules are stored with a domain, and (2) for a new planning problem,
the appropriate rule is extracted from memory and applied.[1]
5.2 Extensions of DPlan
To make DPlan applicable to as large a set of problems of interest in
program synthesis as possible, DPlan should work
for a representation language with expressive power similar to PDDL (see
sect. 2.2.1 in chap. 2). We already made an important step in that direction
by introducing function application (chap. 4). The extension of the Strips
planning language to function application has the following characteristics:
planning operators can be defined using arbitrary symbolic and numerical
functions; ADD/DEL effects and updates can be combined; indirect reference
to objects via function application allows for infinite domains; and planning with
resource constraints can be handled as a special case.
The proposed language FPlan can be used to give function application in
PDDL (McDermott, 1998b) a clear semantics. The described semantics of
operator application can be incorporated in arbitrary Strips planning systems.
In contrast to other proposals for dealing with the manipulation of state
variables, such as ADL (Pednault, 1994), we do not represent them as specially
marked "first class objects" (declared as fluents in PDDL) but as arguments
of relational symbols. This results in greater flexibility of FPlan, because
each argument of a relational symbol may in principle be changed by function
application. As a consequence, we can model domains usually specified by
operators with ADD/DEL effects alternatively by updating state variables
(Geffner, 2000).
[1] Preliminary work for control rule application was done in a student project at
TU Berlin, 1999.
Work to be done includes detailed empirical comparisons of the efficiency
of plan construction with operators with ADD/DEL effects versus updates,
and providing proofs of correctness and termination for planning with FPlan.
FPlan currently places no restrictions on which functions can be applied. As a
consequence, termination is not guaranteed. That is, we have to provide some
restrictions on function application.
Currently, DPlan works on Strips extended by conditional effects and
function application. For future extensions, we plan to include typing, negation
in preconditions, and quantification. These extensions make it possible to generate
plans for a larger class of problems. On the other hand, as already discussed
for function application, each language extension results in a higher complexity
of planning. While PDDL offers the syntax for all mentioned extensions,
up to now only a limited amount of work proposes an algorithmic realization
(Penberthy & Weld, 1992; Koehler et al., 1997), and careful analyses of
planning complexity are necessary (Nebel, 2000).
5.3 Universal Planning versus Incremental Exploration
We argue that for learning a recursive generalization for a domain, the complete
structure of this domain must be known. Universal planning with DPlan
results in a DAG which represents the shortest action sequences to transform
each possible state over a fixed number of objects into a goal state. In chapter
8 we will show that from the structure of the DAG additional information
about the data type underlying this domain can be inferred, which is essential
for transforming the plan into a program term for generalization. Our
approach is "all-or-nothing": either a generalization can be generated
for a given universal plan or it cannot be generated, and if a generalization
can be generated, it is guaranteed to generate correct and optimal action
sequences for transforming all possible states into a goal state. A restriction of
our approach is that the planning goal must be a generalization of the
goal for which the original universal plan was generated (clear a block in an
n-block tower, transport n objects from A to B, sort a list with n elements,
build a tower of n sorted blocks, solve Tower of Hanoi for n discs).
Other approaches to learning control rules for planning, namely learning
in Prodigy (Veloso et al., 1995) and Geffner's approach (Martin & Geffner,
2000), on the one hand offer more flexibility but on the other hand cannot
guarantee that learning in every case results in better performance (see
sect. 2.5.2 in chap. 2): The system is exposed incrementally to arbitrary
planning experience. For example, in the blocks-world domain, the system might
first be confronted with a problem "find an action sequence for transforming
a state where all given four blocks A, B, C, and D are lying on the table
into one where B is on A and C is on D", and next with a problem "find
an action sequence for transforming a state where all given six blocks
are stacked in reverse order into one where blocks A to D are stacked into
an ordered tower and E and F are lying on the table". Rules generalized
from this experience might or might not work for new problems. For each
new problem, the learned rules are applied, and if rule application results in
failure, the system must extend or modify its learning experience.
This incremental approach to learning is similar to models of human
skill acquisition as, for example, proposed by Anderson (1983) in his
ACT theory, although these approaches address only the acquisition of linear
macros (see sect. 2.5.2 in chap. 2). To get some hints whether our universal
planning approach has plausibility for human cognition, the following
empirical study could be conducted: For a given problem domain, such as Tower of
Hanoi, one group of subjects is confronted with all possible constellations of
the three-disc problem, and one group of subjects is confronted with arbitrary
constellations of problems with different numbers of discs. Both groups have
to solve the given problems. Afterwards, performance on arbitrary Tower of
Hanoi problems is tested for both groups, and both groups are asked for
the general rule for solving Tower of Hanoi. We assume that explicit generation
of all action sequences for a problem of fixed size facilitates detection
of the structure underlying a domain and thereby the formation of a general
solution principle.
6. Automatic Programming

"So much is observation. The rest is deduction."
– Sherlock Holmes to Watson in: Arthur Conan Doyle, The Sign of Four, 1930
Automatic programming is investigated in artificial intelligence and software
engineering. The overall research goal in automatic programming is to
automate as large a part of the development of computer programs as possible.
A more modest goal is to automate or support special aspects of program
development such as program verification or generation of high-level
programs from specifications. The focus of this chapter is on program generation
from specifications, referred to as automatic program construction or
program synthesis. There are two distinct approaches to this problem: Deductive
program synthesis is concerned with the automatic derivation of correct
programs from complete, formal specifications; inductive program synthesis
is concerned with automatic generalization of (recursive) programs from
incomplete specifications, mostly from input/output examples. In the following,
we first (sect. 6.1) give an introductory overview of approaches to automatic
programming, together with pointers to the literature. Afterwards (sect. 6.2), we
shortly review theorem proving and transformational approaches to deductive
program synthesis. Since our own work is in the context of inductive
program synthesis, we will go into more detail in presenting this area of
research (sect. 6.3): In section 6.3.1, the foundations of automatic induction,
that is, of machine learning, are introduced; grammar inference is discussed
as a theoretical basis of inductive synthesis. Afterwards, genetic programming
(sect. 6.3.2), inductive logic programming (sect. 6.3.3), and the synthesis of
functional programs (sect. 6.3.4) are presented. Finally, we discuss deductive
versus inductive and functional versus logical program synthesis (sect. 6.4).
Throughout the text we give illustrations using simple programming problems.
6.1 Overview of Automatic Programming Research
6.1.1 AI and Software Engineering
Software engineering is concerned with providing methodologies and tools
for the development of software (computer programs). Software engineering
involves at least the following activities:
Specification: Analysis of requirements and the desired behavior of the program
and design of the internal structure of the program. The design specification
might be stepwise refined, from an overall structure of software
modules to the (formal) specification of algorithms and data structures.
Development: Realizing the software design in executable program code
(programming, implementation).
Validation: Ensuring that the program does what it is expected to do. One
aspect of validation, called verification, is that the implemented program
realizes the specified algorithms. Program verification is realized
by giving (formal) proofs that the program fulfills the (formal) specification.
A verified program is called correct with respect to a specification.
The second aspect of validation is that the program meets the initially
specified requirements. This is usually done by testing.
Maintenance: Fixing program errors, modifying and adding features to a
program (updating).
Software products are expected to be correct, efficient, and transparent.
Furthermore, programs should be easily modifiable, maintaining the quality
standards of correctness, efficiency, and transparency. Obviously, software
development is a complex task, and since the eighties a variety of computer-aided
software engineering (CASE) tools have been developed supporting
project management, design (e.g., checking the internal consistency of module
hierarchies), and code generation for simple routine tasks (Sommerville,
1996).
6.1.1.1 Knowledge-Based Software Engineering. A more ambitious
approach is knowledge-based software engineering (KBSE). The research goal
of this area is to support all stages of software development by "intelligent"
computer-based assistants (Green et al., 1983). KBSE and automatic
programming are often used as synonyms.
A system providing intelligent support for software development must include
several aspects of knowledge (Smith, 1991; Green & Barstow, 1978):
general knowledge about the application domain and the problem to be
solved, programming knowledge about the given domain, as well as general
programming knowledge about algorithms, data structures, optimization
techniques, and so on. AI technologies, especially for knowledge representation,
automated inference, and planning, are necessary to build such systems.
That is, a KBSE system is an expert system for software development.
The lessons learned from the limited success of expert systems research in
the eighties also apply to KBSE: It is not realistic to demand that a single
KBSE system cover all aspects of knowledge and all stages of software
development. Instead, systems typically are restricted in at least one of the
following ways (Flener, 1995; Rich & Waters, 1988):
- Constructing systems for expert specifiers instead of end-users.
- Restricting the system to a narrow domain.
- Providing interactive assistance instead of full automation.
- Focusing on a small part of the software development process.
Examples of specialized systems are Aries (Johnson & Feather, 1991) for
specification acquisition, KIDS (Smith, 1990) for program synthesis from
high-level specifications (see sect. 6.2.2.3), or PVS (Dold, 1995) for program
verification. Automatic programming is mainly an area of basic research. Up
to now, only a small number of systems are applicable to real-world software
engineering problems.
Below, we will introduce approaches to program synthesis, that is, KBSE
systems addressing the aspect of code generation from specifications, in detail.
From a software engineering standpoint, the main advantage of automatic
code generation is that ex-post verification of programs becomes obsolete if
it is possible to automatically derive (correct) programs from specifications.
Furthermore, the problems of program modification can be shifted to the
more abstract, and therefore hopefully more transparent, level of
specification.
6.1.1.2 Programming Tutors. KBSE research aims at providing systems
that support expert programmers. Another area of research where AI and
software engineering interact is the development of tutor systems for the support
and education of student programmers. Such tutoring systems must incorporate
programming knowledge to a smaller extent than KBSE systems, but
additionally, they have to rely on knowledge about efficient teaching methods.
Furthermore, user modeling is critical for providing helpful feedback on
programming errors.
Examples of programming tutors are the Lisp tutors of Anderson,
Conrad, and Corbett (1989) and Weber (1996) and the tutor Proust of
Johnson (1988). Anderson's Lisp tutor is based on his ACT theory (Anderson,
1983). The tutoring strategy is based on the assumption that programming
is a cognitive skill based on a growing set of production rules. Training
focuses on the acquisition of such rules ("IF the goal is to obtain the first
element of a list l THEN write (car l)"), and errors are corrected immediately
so that students do not acquire faulty rules. Error recognition and feedback
are based on a library of expert rules and rules which result in programming
errors. The latter were acquired in a series of empirical studies of novice
programmers. The system tries to reproduce a faulty program by applying rules from
the library, and feedback is based on the rules which generated the errors.
Weber's tutor is based on episodic user modeling. The programs a student
generates over a curriculum are stored as schemes, and in a current session
the stored programming episodes are used for feedback. The Proust system
is based on the representation of plans (corresponding to high-level specifications
of algorithms). Ideally, more than one correct program can be derived
from a correct plan. Errors are detected by trying to identify the faulty plan
underlying a given program.
Programming tutors were mainly developed during the eighties. Simultaneously,
novice programmers were studied intensively in cognitive psychology
(Widowski & Eyferth, 1986; Mayer, 1988; Soloway & Spohrer, 1989). One reason
for this interest was that this area of research offered an opportunity to
bring theories and empirical results of psychology to application; a second
reason was that programming offers a relatively narrow domain, which does
not strongly depend on previous experience in other areas, for studying the
development of human problem solving skills (Schmid, 1994; Schmid & Kaup,
1995).
6.1.2 Approaches to Program Synthesis
Research in program synthesis addresses the problem of automatic generation
of program code from specifications. The nature of this problem depends on
the form in which a specification is given. In the following, we first introduce
different ways to specify programming problems. Afterwards we introduce
different synthesis methods. Synthesis methods can be divided into two classes:
deductive program synthesis from complete formal specifications and
inductive program synthesis from incomplete specifications. Therefore, we first
contrast deductive and inductive inference, and then characterize deductive
and inductive program synthesis as special cases of these inference methods.
Finally, we will mention schema-based and analogy-based approaches as
extensions of deductive and inductive synthesis.
6.1.2.1 Methods of Program Specification. Table 6.1 gives an illustration
of possible ways in which a program for returning the last element of a
non-empty list can be specified.
If a specification is given just as an informal (natural language) description
of requirements (Green & Barstow, 1978), we are back at the ambitious
goal of automatic programming, often ironically called "automagic
programming" (Rich & Waters, 1986a, p. xi). Besides the problem of automatic
natural language understanding in general, the main difficulty of such an
approach would be to derive the required information, for example the desired
input/output relation, from an informal statement. For the last specification
in table 6.1, the synthesis system must have knowledge about lists, what it
means that a list is not empty, what "element of a list" refers to, and what is
meant by "last".
If a specification is given as a complete formal representation of an
algorithm, we have a more realistic goal, which is addressed in approaches of
Table 6.1. Different Specifications for Last

Informal Specification
Return the last element of a non-empty list.

Declarative Programs
last([X], X).                                   (logic program, Prolog)
last([_|T], X) :- last(T, X).

fun last(l) =                                   (functional program, ML)
  if null(tl(l)) then hd(l) else last(tl(l));

fun last (x::nil) = x                           (ML with pattern matching)
  | last (x::rest) = last(rest);

Complete, Formal Specifications
last(l) <= find z such that for some y,         (Manna & Waldinger, 1992, p. 4)
  l = y o [z], where islist(l) and l /= []

last : seq1 X --> X                             (Z)
forall s : seq1 X . last s = s(#s)              (Spivey, 1992, p. 117)

Incomplete Specifications
last([1], 1), last([2 5 6 7], 7),               (I/O pairs, logical)
last([9 3 4], 4)
last([1]) = 1, last([2 7]) = 7,                 (I/O pairs, functional,
last([5 3 4]) = 4                               first 3 inputs)
last([x1]) = x1, last([x1 x2]) = x2,            (generic I/O pairs)
last([x1 x2 x3]) = x3
last([1 2 3]) -> last([2 3]) -> last([3]) -> 3  (trace)
last(l) = if null(tl(l)) then hd(l)             (complete generic trace)
  else if null(tl(tl(l))) then hd(tl(l))
  else if null(tl(tl(tl(l)))) then hd(tl(tl(l)))
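The declarative definitions of last in table 6.1 translate directly into a runnable sketch; the recursion mirrors the ML version (Python is used here only for illustration):

```python
def last(l):
    """Return the last element of a non-empty list, as in the ML definition."""
    if len(l) == 1:       # corresponds to null(tl(l))
        return l[0]       # hd(l)
    return last(l[1:])    # last(tl(l))

# the I/O pairs from table 6.1
print(last([1]), last([2, 5, 6, 7]), last([9, 3, 4]))  # 1 7 4
```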
deductive program synthesis. Both example specifications in table 6.1 give
a non-operational description of the problem to be solved. The specification
given by Manna and Waldinger (1992) states that for a non-empty list l there
must be a sub-list y such that y concatenated with the searched-for element z
is equal to l. The Z specification (Spivey, 1992) states that for all non-empty
sequences s (defined by seq1 X), last returns the element at the last position
of this list (s(#s)), where #s gives the length of s.
The distinction between "very high level" specification languages such
as Gist or Z (Potter, Sinclair, & Till, 1991) and "high level" programming
languages is not really strict: what is seen as a specification language today
might be a programming language in the future. In the fifties, assembler
languages and the first compiler language Fortran were classified as automatic
programming systems (see Rich & Waters, 1986a, p. xi, for a reprint from
Communications of the ACM, Vol. 1(4), April 1958, p. 8).
Formal specification languages as well as programming languages make it possible to
formulate unambiguous, complete statements because they have a clearly defined
syntax and semantics. In general, a programming language guarantees
that all syntactically correct expressions can be transformed automatically into
machine-executable code, while this need not be true for a specification
language. Specification languages and high-level declarative programming
languages (i.e., functional and logic languages) share the common characteristic
that they abstract from "how to solve" a problem on a machine and focus
on "what to solve"; that is, they have more expressive power and provide
much more abstract constructs than typical imperative programming languages
(like C). Nevertheless, generating specifications or declarative programs
requires experts who are trained in representing problems correctly and
completely, that is, in giving the desired input/output relations and covering all
possible inputs.
For that reason, inductive program synthesis addresses the problem of
deriving programs from incomplete specifications. The basic idea is that a user
presents some examples of the desired program behavior. Specification by
examples is incomplete because the synthesis algorithm must generalize the
desired program behavior from some inputs to all possible inputs. Furthermore,
the examples themselves can contain more or less information. Examples
might be presented simply as input/output pairs. For structural list
problems (such as last or reverse), generic input/output pairs abstracting from
concrete values of list elements can be given. This makes the examples less
ambiguous.
More information about the desired program can be provided if examples
are represented as traces which illustrate the operation of a program. Traces
can prescribe the data structures and operations to be used, if the inputs are
described by tests and the outputs by transformations of the inputs using
predefined operators. Traces are called complete if all information about data
structures changed, operators applied, and control decisions taken is given.
Additionally, examples are more informative if they indicate what kind of
computations are not desired in the goal program. This can be done by
presenting positive and negative examples. Another strategy is to present examples
for the first k inputs, thereby defining an order over the input domain.
Of course, the more information is presented by an example specification,
the more the system user has to know about the structure of the desired
program.
The kind and number of examples must suffice to specify what the desired
program is supposed to calculate. Sufficiency depends on the synthesis
algorithm which is applied. In general, it is necessary to present more than
one example, because otherwise a program with constant output is the most
parsimonious induction.
6.1.2.2 Synthesis Methods. Deductive and inductive synthesis are special
cases of deductive and inductive inference. Therefore, we first give a general
characterization of deduction and induction:
Deductive versus Inductive Inference. Deductive inference addresses the problem
of inferring new facts or rules (called theorems) from a set of given facts
and rules (called a theory or axioms) which are assumed to be true. For example,
from the axioms
1. list(l) -> ~null(cons(x, l))
2. list([A,B,C])
we can infer the theorem
3. ~null(cons(Z, [A,B,C])).
Deductive approaches are typically based on a logical calculus. That is,
axioms are represented in a formal language (such as first-order predicate
calculus) and inference is based on a syntactical proof mechanism (such as
resolution). An example of a logical calculus was given in section 2.2.2 in
chapter 2 (situation calculus). The theorem given above can be proved by
resolution in the following way:
1. null(cons(Z, [A,B,C])) (negation of the theorem)
2. ~list(l) v ~null(cons(x, l)) (axiom 1)
3. ~list([A,B,C]) (resolve 1, 2 with substitution theta = {x/Z, l/[A,B,C]})
4. list([A,B,C]) (axiom 2)
5. contradiction (resolve 3, 4).
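The resolution step above can be replayed with a toy encoding. The clause representation below is an illustrative assumption for this one proof, not a general theorem prover (Python used for illustration):

```python
def substitute(term, theta):
    """Apply a substitution to a term given as a nested tuple of symbols."""
    if isinstance(term, str):
        return theta.get(term, term)
    return tuple(substitute(t, theta) for t in term)

# clause from axiom 1: ~list(l) v ~null(cons(x, l)); literals as (sign, atom)
axiom1 = [(False, ("list", "l")), (False, ("null", ("cons", "x", "l")))]
negated_theorem = (True, ("null", ("cons", "Z", ("A", "B", "C"))))

theta = {"x": "Z", "l": ("A", "B", "C")}
instantiated = [(sign, substitute(atom, theta)) for sign, atom in axiom1]

# the second literal is now the complement of the negated theorem ...
assert instantiated[1] == (False, negated_theorem[1])
# ... so resolving the two clauses leaves the resolvent ~list([A,B,C])
resolvent = [instantiated[0]]
print(resolvent)  # [(False, ('list', ('A', 'B', 'C')))]
```

Resolving the resolvent against axiom 2, list([A,B,C]), then yields the empty clause, i.e. the contradiction in step 5.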
Inductive inference addresses the problem of inferring a generalized rule
which holds for a domain (called a hypothesis) from a set of observed instances
belonging to this domain (called examples). For example, from the examples
list([Z,A,B,C])
list([1,2,3])
list([R,B])
axiom (1) given above might be inferred as a hypothesis for the domain of (flat)
lists. For this inference, it is necessary to refer to some background knowledge
about lists, namely that lists are constructed by means of the list constructor
cons(x, l) and that null(l) is true if l is the empty list and false otherwise.
Other hypotheses which could be inferred are
list(l)
which is an over-generalization because it includes objects which are no lists
(i.e., atoms), or
l in {[Z,A,B,C], [1,2,3], [R,B]} -> list(l)
which is no generalization but just a compact representation of the examples.
Indu tive approa hes are realized within the dierent frameworks oered
by ma hine learning resear h. The language for representing hypotheses de-
pends on the sele ted framework. Examples are de ision trees, a subset of
predi ate al ulus, or fun tional programs. Also depending on the framework,
hypothesis onstru tion an be onstrained more or less by the presented
examples. In extreme, hypotheses might be generated just by enumeration
(Gold, 1967): a legal expression of the hypothesis language is generated and
tested against the examples. If the hypothesis is not onsistent with the ex-
amples, it is reje ted and a new hypothesis is generated. Note, that testing
whether a hypothesis holds for an example orresponds to dedu tive inferen e
(Mit hell, 1997, p. 291). An overview of ma hine learning is given by Mit hell
(1997) and we will present approa hes to indu tive inferen e in se tion 6.3.
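The enumeration idea can be sketched in a few lines. The following Python fragment is an illustrative sketch, not taken from the text; the toy hypothesis language of threshold predicates and the example values are invented for the demonstration:

```python
# Identification by enumeration (Gold, 1967), sketched for a toy
# hypothesis language: the predicates "x <= n" for n = 0, 1, 2, ...
# Hypothesis space and examples are invented for illustration.

def enumerate_hypotheses():
    n = 0
    while True:
        # Close over the current n so each hypothesis is fixed.
        yield n, (lambda m: lambda x: x <= m)(n)
        n += 1

def generate_and_test(positive, negative, limit=100):
    """Return the first enumerated hypothesis consistent with all examples."""
    for n, h in enumerate_hypotheses():
        if n > limit:
            return None
        # Testing a hypothesis against an example is a deductive step.
        if all(h(x) for x in positive) and not any(h(x) for x in negative):
            return n

# Positive examples 1, 3, 5 and negative example 7: the first consistent
# hypothesis in the enumeration is "x <= 5".
print(generate_and_test(positive=[1, 3, 5], negative=[7]))  # -> 5
```

Rejected hypotheses ("x <= 0" through "x <= 4") are simply discarded and the next legal expression is generated, exactly as described above.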
Deduction guarantees that the inference is correct (with respect to the
given axioms), while induction only results in a hypothesis. From the perspec-
tive of epistemology, one might say that a deductive proof does not generate
knowledge; it explicates information which is already contained in the given
axioms. Inductive inference results in new information, but the inference has
only hypothetical status, which holds as long as the system is not confronted
with examples which contradict the inference.
To make this difference more explicit, let's look at a set-theoretic inter-
pretation of inference:

Definition 6.1.1 (Relations in a Domain). Let D be a set of objects be-
longing to a domain. A relation with arity i is given as R^i ⊆ D^i. The set of
all relations which hold in a domain is ℛ.

For a given domain, for example the set of flat lists, the set of all axioms
which hold in this domain is extensionally given by relations. For example,
there might be a unary relation R containing all lists, and a binary relation
R' containing all pairs of lists (l, cons(x, l)) where l ∈ R. In general, relations
can represent formulas of arbitrary complexity.
Now we can define (the semantics of) deduction and induction in the
following way:
6.1 Overview of Automatic Programming Research 109
Definition 6.1.2 (Deduction). Given a set of axioms A ⊆ ℛ, deductive
inference means to decide whether for a theorem R ∉ A it holds that R ∈ ℛ.

In a deductive (proof) calculus, R ∈ ℛ is decided by showing that R can
be derived from the given axioms A, that is, A ⊨ R. A standard syntactical
approach to deductive inference is for example resolution (see illustration
above).

Definition 6.1.3 (Induction). Given a set of examples E ⊆ ℛ, inductive
inference means to search for a hypothesis H = {R_1, …, R_n} such that H ≠ E
and ∀ E_i ∈ E : E_i ∈ H.

The search for a hypothesis takes place in the set of possible hypotheses given
a fixed hypothesis language, for example, the set of all syntactically correct
Lisp programs. Typically, selection of a hypothesis is not only restricted by
demanding that H explains ("covers") all presented examples, but by addi-
tional criteria such as simplicity.

Deductive versus Inductive Synthesis. In program synthesis, the result of in-
ference is a computer program which transforms all legal inputs x into the
desired output y = f(x). For deductive synthesis, the complete formal speci-
fication of the pre- and post-conditions of the desired program can be repre-
sented as a theorem of the following form:

Definition 6.1.4 (Program Specification as Theorem).

∀x ∃y [Pre(x) ⇒ Post(x, y)].
For example, for the specification of the last-program (see tab. 6.1), Pre(x)
states that x is a non-empty list and Post(x, y) states that y is the last
element of list x. A constructive theorem prover (such as Green's approach,
described in sect. 2.2.2 in chap. 2) tries to prove that this theorem holds. If
the proof fails, there exists no feasible program (given the axioms on which
the proof is based); if the proof succeeds, a program f(x) is returned as result
of the constructive proof. For f(x) it must hold:

Definition 6.1.5 (Program Specification as Constructive Theorem).

∀x [Pre(x) ⇒ Post(x, f(x))].

In the proof, the existentially quantified variable y must be explicitly con-
structed as a function over x (Skolemization).
An alternative to theorem proving is program transformation. In theorem
proving, each proof step consists of rewriting the current program (formula)
by selecting a given axiom and applying the proof method (e.g., resolution).
In program transformation, rewriting is based on transformation rules.

Definition 6.1.6 (Transformation Rule). A transformation rule is de-
fined as a conditioned rule:
If Application-Condition Then Left-Hand-Pattern → Right-Hand-Pattern.

The rules might be given in equational logic (i.e., a special class of ax-
ioms) or they might be uni-directional. Each transformation step consists of
rewriting the current program by replacing some part which matches the
Left-Hand-Pattern by another expression which matches the Right-Hand-
Pattern.
In section 6.2 we will present theorem proving as well as transformational
approaches to deductive synthesis.
For inductive synthesis, a program is constructed as a generalization over
the given incomplete specification (input/output examples or traces, see
tab. 6.1) such that the following proposition holds:

Definition 6.1.7 (Characteristic of an Induced Program).

∀(x, y) ∈ E [f(x) = y]

where E is a set of examples with x as possible input of the desired program
and y as corresponding output value or trace for x.
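Definition 6.1.7 can be read directly as an executable consistency check. A minimal Python sketch (the example set and the candidate last-function are our own illustration, loosely modelled on the last-program of tab. 6.1):

```python
# Characteristic of an induced program (def. 6.1.7): a candidate f is
# acceptable only if f(x) = y for every presented example (x, y).

def consistent(f, examples):
    return all(f(x) == y for x, y in examples)

# Invented input/output examples for a "last element of a list" program.
E = [([1], 1), ([2, 5], 5), ([3, 4, 9], 9)]

def last(xs):
    return xs[-1]

print(consistent(last, E))  # -> True
```

A candidate returning the first element instead would be rejected by the same check, which is exactly the "test" half of generate and test.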
The formal foundation of inductive synthesis is grammar inference. Here
the examples are viewed as words which belong to some unknown formal
language, and the inference task is to construct a grammar (or automaton)
which generates (or recognizes) a language to which the example words be-
long. The classical approach to program synthesis is the construction of re-
cursive functional (mostly Lisp) programs from traces. Such traces are either
given as input or automatically constructed from input/output examples.
Alternatively, in the context of inductive logic programming, the construction
of logical (Prolog) programs from input/output examples is investigated. In
both research areas, there are methods which base hypothesis construction
strongly on the presented examples versus methods which depend more on
search in hypothesis space (i.e., generate and test). An approach which is
based explicitly on generate and test is genetic programming.
In section 6.3 we will present grammar inference, functional program syn-
thesis, inductive logic programming, and genetic programming as approaches
to inductive program synthesis.

Enriching Program Synthesis with Knowledge. The basic approaches to de-
ductive and inductive program synthesis depend on a limited amount of
knowledge: a set of (correct) axioms or transformation rules in deduction; or a
restricted hypothesis language and some rudimentary background knowledge
(for example about data types and primitive operators) in induction. These
basic approaches can be enriched by additional knowledge. Such knowledge
might not be universally valid but can make a system more powerful. As a
consequence, a basic fully automatic, algorithmic approach is extended to an
interactive, heuristic approach.
A successful approach to knowledge-based program synthesis is to guide
synthesis by pre-defined program schemes, also called design strategies. A
program scheme is a program template (for example represented in higher
order logic) representing a fixed over-all program structure (i.e., a fixed flow
of control) but abstracting from concrete operations. For a new programming
problem, an adequate scheme is selected (by the user) from the library and a
program is constructed by stepwise refinement of this scheme. For example,
a quicksort program might be synthesized by refining a divide-and-conquer
scheme. Scheme-based synthesis is typically realized within the deductive
framework (Smith, 1990). An approach to combine inductive program syn-
thesis and schema-refinement is proposed by Flener (1995).
A special approach to inductive program synthesis is programming by
analogy, that is, program reuse. Here, already known programs (predefined
or previously synthesized) are stored in a library. For a new programming
problem, a similar problem and the program which solves this problem are
retrieved, and the new program is constructed by modifying the retrieved
program. Analogy can be seen as a special case of induction because modi-
fication is based on the structural similarity of both problems, and structural
similarity is typically obtained by generalizing over the common structure of
the problems (see calculation of least general generalizations in section 6.3.3
and anti-unification in chapter 7 and chapter 12).
We will come back to schema-based synthesis and programming by anal-
ogy in Part III, addressing the problems of (1) automatic acquisition of ab-
stract schemes from example problems, and (2) automatic retrieval of an
appropriate schema for a given programming problem.
6.1.3 Pointers to Literature

Although program synthesis has been an active area of research since the begin-
ning of computer science, program synthesis is not covered in text books on
programming, software engineering, or AI. The main reason for that omis-
sion might be that program synthesis research does not rely on one common
formalism but that each research group proposes its own approach. This is
even the case within a given framework such as synthesis by theorem proving
or inductive logic programming. Furthermore, the formalisms are typically
quite complex, such that it is difficult to present a formalism together with
a complete, non-trivial example in a compact way.
An introductory overview of knowledge-based software engineering is
given by Lowry and Duran (1989). Here approaches to and systems for spec-
ification acquisition, specification validation and maintenance, as well as
deductive program synthesis are presented. A collection of influential pa-
pers in AI and software engineering was edited by Rich and Waters (1986b).
The papers cover a large variety of research areas including deductive and
inductive program synthesis, program verification, natural language specifi-
cation, knowledge-based approaches, and programming tutors. Collections of
research papers on AI and software engineering are presented by Lowry and
McCarthy (1991), Partridge (1991), and Messing and Campbell (1999).
An introductory overview of program synthesis is given by Barr and Feigen-
baum (1982). Here the classical approaches to and systems for deductive and
inductive program synthesis of functional programs are presented. A col-
lection of influential papers in program synthesis was edited by Biermann,
Guiho, and Kodratoff (1984), addressing deductive synthesis, inductive syn-
thesis of functional programs, and grammar inference. A more recent survey
of program synthesis is given by Flener (1995, part 1) and a survey of induc-
tive logic program synthesis is given by Flener and Yilmaz (1999).
Current research in program synthesis is for example covered in the
journal Automated Software Engineering. AI conferences, especially machine
learning conferences (ECML, ICML), typically have sections on inductive pro-
gram synthesis.
6.2 Deductive Approaches

In the following we give a short overview of approaches to deductive program
synthesis. First we introduce constructive theorem proving and afterwards
program transformation.

6.2.1 Constructive Theorem Proving

Automated theorem proving in general does not have to be constructive. That
means, for example, that from a statement ¬∀x ¬P(x) it can be concluded that
∃x P(x) without the necessity to actually construct an object a which has
property P. In contrast, for a proof to be constructive, such an object a
has to be given together with a proof that a has property P (Thompson,
1991). Classical theorem provers which are applied for program verification,
such as the Boyer-Moore theorem prover (Boyer & Moore, 1975), cannot be
used for program synthesis because they cannot reason constructively about
existential quantifications. As introduced above, a constructive proof of the
statement ∀x ∃y Pre(x) → Post(x, y) means that a program f must be con-
structed such that for all inputs a for which Pre(a) holds, Post(a, f(a))
holds.
The observation that constructive proofs correspond to programs was
made by a lot of different researchers (e.g., Bates & Constable, 1985) and
was explicitly stated as the so-called Curry-Howard isomorphism in the eight-
ies. One of the oldest approaches to constructive theorem proving was pro-
posed by Green (1969), introducing a constructive variant of resolution. In
the following, we first present this pioneering work; afterwards we describe
the deductive tableau method of Manna and Waldinger (1980), and give a
short survey of more contemporary approaches.
6.2.1.1 Program Synthesis by Resolution. Green (1969) introduced a
constructive version of clausal resolution and demonstrated with his system
QA3 that constructive theorem proving can be applied for constructing plans
(see sect. 2.2.2 in chap. 2) and for automatic (Lisp) programming. For auto-
matic programming, the theorem prover needs two sets of axioms: (1) axioms
defining the functions and constructs of (a subset of) a programming language
(such as Lisp), and (2) axioms defining an input/output relation R(x, y)¹,
which is true if and only if x is an appropriate input for some program and y
is the corresponding output generated by this program. Green addresses four
types of automatic programming problems:

Checking: Proving that R(a,b) is true or false.
  For a fixed input/output pair it is checked whether a relation R(a, b)
  holds.
Simulation: Proving that ∃x R(a, x) is true (returning x = b) or false.
  For a fixed input value a, output b is constructed.
Verification: Proving that ∀x R(x, f(x)) is true or false (returning a
  counter-example x).
  For a user-constructed program f(x), its correctness is checked. For
  proofs concerning looping or recursive programs, induction axioms are
  required to prove convergence (termination).
Synthesis: Proving that ∀x ∃y R(x, y) is true (returning the program y =
  f(x)) or false (returning a counter-example x).
  Induction axioms are required to synthesize recursive programs.

Green gives two examples for program synthesis: the synthesis of a simple
non-recursive program which sorts the two numbers of a dotted pair, and the
synthesis of a recursive function for sorting.

Synthesis of a Non-Recursive Function. For the dotted pair problem, the
following axioms concerning Lisp are given:

1. x = car(cons(x,y))
2. y = cdr(cons(x,y))
3. x = nil → cond(x,y,z) = z
4. x ≠ nil → cond(x,y,z) = y
5. ∀x,y [lessp(x,y) ≠ nil ↔ x < y].

The axiom specifying the desired input/output relation is given as

∀x ∃y [[car(x) < cdr(x) → y = x] ∧
       [(car(x) ≥ cdr(x)) → (car(x) = cdr(y) ∧ cdr(x) = car(y))]].

The axiom states that a dotted pair (x_1 . x_2) which is already sorted (x_1 < x_2)
is just returned; otherwise, the reversed pair must be returned. By resolving

¹ Relation R(x,y) corresponds to relation Post(x,y) given in definition 6.1.4.
the input/output axioms with axiom (5), the mathematical expressions (<,
≥) are replaced by Lisp expressions lessp. Resolving expressions lessp(x,y) =
nil with axioms (3) and (4) introduces conditional expressions; and axioms
(1) and (2) are used to introduce cons-expressions. The resulting program is

y = cond(lessp(car(x), cdr(x)), x, cons(cdr(x), car(x))).

Synthesis of a Recursive Function. To synthesize a recursive program the
theorem prover additionally requires an induction axiom. For example, for a
recursion over finite linear lists, it can be stated that the empty list is reached
in a finite number of steps by stepwise applying cdr(l):

Definition 6.2.1 (Induction over Linear Lists).

[P(h(nil)) ∧ ∀x [¬atom(x) ∧ P(h(cdr(x))) → P(h(x))]] → ∀z P(h(z))

where P is a predicate and h is a function.

The axiom specifying the desired input/output relation for a list-sorting
function can be stated for example as:

∀x ∃y [R(nil, y) ∧ [[¬atom(x) ∧ R(cdr(x), sort(cdr(x)))] → R(x, y)]].

The searched-for function is already named as sort.
Given the Lisp axioms above, some additional axioms characterizing the
predicates atom(x) and equal(x,y), and axioms specifying a pre-defined func-
tion merge(x,l) (inserting number x in a sorted list l such that the result is a
sorted list), the following function can be synthesized:

y = cond(equal(x,nil), nil, merge(car(x), sort(cdr(x)))).
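Transcribed into Python (our own rendering, with a dotted pair modelled as a 2-tuple), the synthesized program reads:

```python
# The program synthesized by QA3 for sorting a dotted pair,
#   y = cond(lessp(car(x), cdr(x)), x, cons(cdr(x), car(x))),
# transcribed with a pair (a . b) modelled as a Python 2-tuple.

def car(p): return p[0]
def cdr(p): return p[1]
def cons(a, b): return (a, b)
def lessp(a, b): return a < b          # lessp(x,y) != nil  <->  x < y
def cond(test, then, otherwise):       # non-recursive, so eager args are fine
    return then if test else otherwise

def sort_pair(x):
    return cond(lessp(car(x), cdr(x)), x, cons(cdr(x), car(x)))

print(sort_pair((2, 7)))  # -> (2, 7): already sorted, just returned
print(sort_pair((7, 2)))  # -> (2, 7): the reversed pair
```

Note that this Python cond evaluates both branches before choosing; that is harmless here precisely because the synthesized program is non-recursive.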
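A Python transcription of this recursive result (our own sketch: lists are tuples, nil is the empty tuple, and the pre-defined merge helper is given a minimal implementation of the behaviour described above). Since Python evaluates function arguments eagerly, the Lisp cond must become a lazy if-expression to keep the recursion well-founded:

```python
# The recursive sort synthesized by QA3,
#   y = cond(equal(x,nil), nil, merge(car(x), sort(cdr(x)))),
# transcribed to Python. nil is modelled as the empty tuple ().

nil = ()

def car(l): return l[0]
def cdr(l): return l[1:]

def merge(x, l):
    """Insert number x into the sorted list l (the pre-defined helper)."""
    if l == nil:
        return (x,)
    if x <= car(l):
        return (x,) + l
    return (car(l),) + merge(x, cdr(l))

def sort(x):
    # cond rendered lazily: recursing on cdr(x) only for non-empty x.
    return nil if x == nil else merge(car(x), sort(cdr(x)))

print(sort((3, 1, 2)))  # -> (1, 2, 3)
```

This makes the drawback discussed next concrete: merge, which carries most of the sorting work, had to be supplied to the prover rather than synthesized.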
Drawbacks of Theorem Proving. For a given set of axioms and a complete,
formal specification, program synthesis by theorem proving results in a pro-
gram which is correct with respect to the specification. The resulting program
is constructed as a side-effect of the proof by instantiation of variable y in the
input/output relation R(x, y).
Fully automated theorem proving has several disadvantages as an approach
to program synthesis:

- The domain must be axiomatized completely.
- The given axiomatization has a strong influence on the complexity of the
  search for the proof and it determines the way in which the searched-for
  program is expressed. For example, Green (1969) reports that his theorem
  prover could not synthesize sort in a "reasonable amount of time" with the
  given axiomatization. Given a different set of axioms (not reported in the
  paper), QA3 created the program cond(x, merge(car(x), sort(cdr(x))), nil).
- Providing a set of axioms which are sufficient to prove the input/output
  relation and thereby to synthesize a program presupposes that a lot is
  already known about the program. For example, to synthesize the sort
  program, an axiomatization of the merge function had to be provided.
- It can be more difficult to give a correct and complete input/output spec-
  ification than to directly write the program. For the sort example given
  above, the searched-for recursive structure is already given in the speci-
  fication by separating the case of an empty list as input from the case of
  a non-empty list and by presenting the implication from the rest of a list
  (cdr(x)) to the list itself.
- Theorem provers lack the power to produce proofs for more complicated
  specifications. The main source of inefficiency is that a variety of induction
  axioms must be specified for synthesizing recursive programs. The selection
  of the "suitable" induction scheme for a given problem is crucial for finding
  a solution in reasonable time. Therefore, selection of the induction scheme
  to be used is often performed by user-interaction!

These disadvantages of theorem proving are true for the original approach
of Green and are still true for the more sophisticated approaches of today,
which are not restricted to proof by resolution but use a variety of proof
mechanisms (see below). The originally given hard problem of automatic
program construction is reformulated as the equally hard problem of automatic
theorem proving (Rich & Waters, 1986a). Nevertheless, the proof-as-program
approach is an important contribution to program synthesis: First, the ne-
cessity of complete and formal specifications compels the program designer
to state all knowledge involved in program construction explicitly. Second,
and more important, while theorem proving might not be useful in isolation,
it can be helpful or even necessary as part of a larger system. For example,
in transformation systems (see sect. 6.2.2), the applicability of rules can be
proved by showing that a precondition holds for a given expression.
6.2.1.2 Planning and Program Synthesis with Deductive Tableaus.
A second influential contribution to program synthesis as constructive proof
was made by Manna and Waldinger (1980). As Green, they address not only
deductive program synthesis but also deductive plan construction (Manna &
Waldinger, 1987).

Deductive Tableaus. The approach of Manna and Waldinger is also based
on resolution. In contrast to the classical resolution (Robinson, 1965) used
by Green, their resolution rule does not depend on axioms represented in
clausal form. Their proposed formalism, the deductive tableaus, incor-
porates not only non-clausal resolution but also transformation rules and
structural induction. The transformation rules have been taken over from an
earlier, transformation-based programming system called Dedalus (Manna &
Waldinger, 1975, 1979).
A deductive tableau consists of three columns

  Assertions    Goals        Outputs
  Pre(x)
                Post(x,y)    y
with x as input variable, y as output variable, Pre(x) as precondition and
Post(x,y) as postcondition of the searched-for program. The initial tableau is
constructed from a specification of the form (see also table 6.1)

f(x) ⇐ find y such that Post(x, y) where Pre(x).

In general, the semantics of a tableau with assertions A_i(x) and goals
G_j(x, y) is

∀x A_1(x) ∧ … ∧ A_n(x) → ∃y G_1(x, y) ∨ … ∨ G_n(x, y).

A proof is performed by adding new rows to the tableau through rules of
logical inference. A proof is successful if a row could be constructed where the
goals column contains the expression "true" and the output column contains
a term which consists only of primitive expressions of the target programming
language.
As an example for a derivation step, we present a simplified version of the
so-called GG-resolution (omitting possible unification of sub-expressions): For
two goals F in row i and G in row j a new row can be constructed
with the goal entry

F[P ← true] ∧ G[P ← false]

where P is a common sub-expression of F and G. For the output column,
GG-resolution results in the introduction of a conditional expression:

  Assertions    Goals            Outputs
                a ≤ b            a
                ¬(a ≤ b)         b
                true ∧ ¬false    if a ≤ b then a else b
                true             if a ≤ b then a else b
Again, induction is used for recursion-formation:

Definition 6.2.2 (Induction Rule). For a searched-for program f(x) with
assertion Pre(x) and goal Post(x, y), the induction hypothesis is ∀u (u ≺ x) →
(Pre(u) → Post(u, f(u))) where ≺ is a well-founded ordering over the set of
input data {x}.

Recursive Plans. Originally, Manna and Waldinger applied the deductive
tableau method to synthesize functional programs, such as reversing lists or
finding the quotient of two integers (Manna & Waldinger, 1992). Additionally,
they demonstrated how imperative recursive programs can be synthesized by
deductive planning (Manna & Waldinger, 1987). As example they used the
problem of clearing a block (see also sect. 3.1.4.1 in chap. 3).
The searched-for plan is called makeclear(a) where a denotes a block in a
blocks-world. Plan construction is based on proving the theorem
∀s_0 ∀a ∃z_1 [Clear(s_0, z_1, a)]

meaning "for an initial state s_0, the block a is clear after execution of plan
z_1".
Among the pre-specified axioms for this problem are:

hat-axiom: If ¬Clear(s, x) Then On(s, hat(s,x), x)
  (If block x is not clear in situation s, then a block hat(s, x) is lying on
  block x in situation s, where hat(s, x) is the block which lies on x in s.)
put-table-axiom: If Clear(s, x) Then On(put(s, x, table), x, table)
  (If block x is clear in situation s, then it lies on the table in a situation
  where x was put on the table immediately before.)

and the resulting plan (program) is

makeclear(a) ⇐
  If Clear(a)
  Then (the current situation)
  Else makeclear(hat(a)); put(hat(a), table).

The complete derivation of this plan using deductive tableaus is reported
in Manna and Waldinger (1987).
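To see the recursive plan at work, here is a small blocks-world simulation. This is our own sketch, not part of the derivation: the state representation (a dictionary mapping each block to what it rests on) and the plan-recording mechanism are invented for illustration.

```python
# Simulating the derived recursive plan
#   makeclear(a) <= If Clear(a) Then (current situation)
#                   Else makeclear(hat(a)); put(hat(a), table).
# A state maps each block to the thing it rests on ("table" or a block).

def clear(state, x):
    return not any(support == x for support in state.values())

def hat(state, x):
    # The block lying directly on x (only defined when x is not clear).
    return next(b for b, support in state.items() if support == x)

def makeclear(state, a, plan):
    if clear(state, a):
        return state                      # the current situation; nothing to do
    b = hat(state, a)
    state = makeclear(state, b, plan)     # recursively clear the hat of a
    state = dict(state, **{b: "table"})   # put(hat(a), table)
    plan.append(("put", b, "table"))
    return state

# Tower: C on B on A on the table; clearing A takes two put actions.
state = {"A": "table", "B": "A", "C": "B"}
plan = []
final = makeclear(state, "A", plan)
print(plan)  # -> [('put', 'C', 'table'), ('put', 'B', 'table')]
```

The recorded action sequence is exactly the unfolding of the recursive plan on this tower: the hat of A (block B) must itself be cleared before it can be put on the table.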
6.2.1.3 Further Approaches. A third "classical" approach to program
synthesis by theorem proving was proposed by Bibel (1980). Current auto-
matic theorem provers, such as Nuprl² or Isabelle³, can in principle be used
for program synthesis. But up to now, pure theorem proving approaches can
only be applied to small problems. Current research addresses the problem
of enriching constructive theorem proving with knowledge-based methods
such as proof tactics or program schemes (Kreitz, 1998; Bibel, Korn, Kreitz,
& Kurucz, 1998).
6.2.2 Program Transformation

Program transformation is the predominant technique in deductive program
synthesis. Typically, directed rules are used to transform an expression into a
syntactically different, but semantically equivalent expression. There are two
principal kinds of transformation: lateral and vertical transformation (Rich
& Waters, 1986a). Lateral transformation generates expressions on the same
level of abstraction. For example, a linear recursive program might be rewrit-
ten into a more efficient tail-recursive program. Program synthesis is realized
by vertical transformation, rewriting a specification into a program by apply-
ing transformation rules which represent the relationship between constructs
on an abstract level and constructs on program level. If the starting point for
² see http://www.cs.cornell.edu/Info/Projects/NuPrl/nuprl.html
³ see http://www.cl.cam.ac.uk/Research/HVG/Isabelle/
transformation is already an executable (high-level) program, one speaks of
transformational implementation.
In the following, we first present the pioneering approach of Burstall and
Darlington (1977). The authors present a small set of powerful rules for trans-
forming given programs into more efficient programs; that is, they propose an
approach to transformational implementation. The main contribution of their
approach is the introduction of rules for unfolding and folding (synthesiz-
ing) recursive programs. Afterwards, we present the CIP (computer-aided
intuition-guided programming) approach (Broy & Pepper, 1981) which fo-
cuses on correctness-preserving transformations. CIP does not aim at full au-
tomatization; it can be classified as a meta-programming approach (Feather,
1987) which supports a program-developer in the construction of correct pro-
grams. Finally, we present the KIDS system (Smith, 1990) which is the most
successful deductive synthesis system today. KIDS is a program synthesis sys-
tem which transforms initial, not necessarily executable, specifications into
efficient programs by stepwise refinement.

6.2.2.1 Transformational Implementation:
The Fold/Unfold Mechanism.

The mind is a cauldron and things bubble up and show for a moment, then
slip back into the brew. You can't reach down and find anything by touch. You
wait for some order, some relationship in the order in which they appear.
Then yell Eureka! and believe that it was a process of cold, pure logic.
-- Travis McGee in: John D. MacDonald, The Girl in the Plain Brown Wrapper, 1968
Starting point for the transformation approach of Burstall and Darlington
(1977) is a program which is presented as a set of equations, with left-hand
sides representing program heads and right-hand sides representing calcula-
tions.
For example, the Fibonacci function is presented as

1. fib(0) ⇐ 1
2. fib(1) ⇐ 1
3. fib(x+2) ⇐ fib(x+1) + fib(x).

The following inference rules for transforming recursive equations are
given:

Definition: Introduce a new recursive equation whose left-hand expression is
  not an instance of the left-hand expression of any previous equation.
  We will see below that introduction of a suitable equation in general
  cannot be performed automatically ("eureka" step).
Instantiation: Introduce a substitution instance of an existing equation.
Unfolding: If E ⇐ E' and F ⇐ F' are equations and there is some occur-
  rence in F' of an instance of E, replace it by the corresponding instance
  of E', obtaining F'', and add equation F ⇐ F''.
  This rule corresponds to the expansion of a recursive function by replac-
  ing the function call by the function body with according substitution of
  the parameters.
Folding: If E ⇐ E' and F ⇐ F' are equations and there is some occurrence
  in F' of an instance of E', replace it by the corresponding instance of E,
  obtaining F'', and add equation F ⇐ F''.
  This is the "inverse" rule to unfolding. We will discuss folding of terms
  in detail in chapter 7.
Abstraction: Introduce a where clause by deriving from a previous equation
  E ⇐ E' a new equation
  E ⇐ E'[u_1/F_1, …, u_n/F_n] where ⟨u_1, …, u_n⟩ = ⟨F_1, …, F_n⟩.
Laws: Algebraic laws such as associativity or commutativity can be used
  to rewrite the right-hand sides of equations.

The authors propose the following strategy for application of rules:

- Make any necessary definitions.
- Instantiate.
- For each instantiation unfold repeatedly. At each stage of unfolding:
  - Try to apply laws and abstractions.
  - Fold repeatedly.
Transforming Recursive Functions. We will illustrate the approach by demon-
strating how the Fibonacci program given above can be transformed into a
more efficient program which avoids calculating values twice:

1. fib(0) ⇐ 1 (given)
2. fib(1) ⇐ 1 (given)
3. fib(x+2) ⇐ fib(x+1) + fib(x) (given)
4. g(x) ⇐ ⟨fib(x+1), fib(x)⟩ (definition, eureka!)
5. g(0) ⇐ ⟨fib(1), fib(0)⟩ (instantiation)
   g(0) ⇐ ⟨1,1⟩ (unfolding with 1, 2)
6. g(x+1) ⇐ ⟨fib(x+2), fib(x+1)⟩ (instantiate 4)
   g(x+1) ⇐ ⟨fib(x+1)+fib(x), fib(x+1)⟩ (unfold with 3)
   g(x+1) ⇐ ⟨u+v, u⟩ where ⟨u,v⟩ = ⟨fib(x+1), fib(x)⟩ (abstract)
   g(x+1) ⇐ ⟨u+v, u⟩ where ⟨u,v⟩ = g(x) (fold with 4)
7. fib(x+2) ⇐ u + v where ⟨u,v⟩ = ⟨fib(x+1), fib(x)⟩ (abstract 3)
   fib(x+2) ⇐ u + v where ⟨u,v⟩ = g(x) (fold with 4)

The new, more efficient definition of Fibonacci is:

fib(0) ⇐ 1
fib(1) ⇐ 1
fib(x+2) ⇐ u + v where ⟨u,v⟩ = g(x)
g(0) ⇐ ⟨1,1⟩
g(x+1) ⇐ ⟨u+v, u⟩ where ⟨u,v⟩ = g(x).
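The transformed equations translate directly into a linear-time Python function (our own transcription); g(x) returns the pair ⟨fib(x+1), fib(x)⟩, so each Fibonacci value is computed exactly once:

```python
# Transcription of the folded equations:
#   g(0)     <= <1,1>
#   g(x+1)   <= <u+v, u> where <u,v> = g(x)
#   fib(x+2) <= u+v     where <u,v> = g(x)

def g(x):
    if x == 0:
        return (1, 1)                 # <fib(1), fib(0)>
    u, v = g(x - 1)
    return (u + v, u)                 # <fib(x+1), fib(x)>

def fib(x):
    if x in (0, 1):
        return 1
    u, v = g(x - 2)
    return u + v                      # fib(x) = fib(x-1) + fib(x-2)

print([fib(n) for n in range(8)])  # -> [1, 1, 2, 3, 5, 8, 13, 21]
```

The tupling introduced by the eureka definition (4) is what removes the duplicated work of the naive exponential-time definition given by equations 1-3.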
Abstract Programming and Data Type Change. Burstall and Darlington
(1977) also demonstrated how vertical program transformation can be real-
ized by transforming abstract programs, defined on high-level, abstract data
types, into concrete programs defined on concrete data.
Again, we illustrate the approach with an example. Given is the abstract
data type of labeled trees:

Abstract Data Type
niltree ∈ labeledTrees
ltree : atoms × labeledTrees × labeledTrees → labeledTrees.

That is, a labeled tree is either empty (niltree) or it is a labeled node
(atoms) which branches into two labeled trees.
Furthermore, a data type of binary trees might be available as basic data
structure (e.g., in Lisp) with constructors nil and pair:

Concrete Data Structure
nil ∈ binaryTrees
atoms ∈ binaryTrees
pair : binaryTrees × binaryTrees → binaryTrees.

A labeled tree can be represented as a binary tree by representing each node
as a pair of a label and a binary tree. For example:

ltree(A, niltree, ltree(B, niltree, niltree))

can be represented as

pair(A, pair(nil, pair(B, pair(nil, nil)))).

The relationship between the abstract data type and the concrete data
structure can be expressed by the following representation function:

R : binaryTrees → labeledTrees
R(nil) ⇐ niltree
R(pair(a, pair(p_1, p_2))) ⇐ ltree(a, R(p_1), R(p_2)).

The inverse representation function, which in some cases could be gen-
erated automatically, defines how to code the data type:

C : labeledTrees → binaryTrees
C(niltree) ⇐ nil
C(ltree(a, t_1, t_2)) ⇐ pair(a, pair(C(t_1), C(t_2))).
The user an write an abstra t program, using the abstra t data type.
Typi ally, writing abstra t programs involves less work than writing on rete
programs be ause implementation spe i details are omitted. Furthermore,
abstra t programs are more exible be ause they an be transformed into
dierent on rete realizations.
An example for an abstra t program is twist whi h mirrors the input tree:
twist : labeledTrees ! labeledTrees
twist(niltree) ( niltree
twist(ltree(a; t
1
; t
2
)) ( ltree(a; twist(t
2
); twist(t
1
))
The goal is to automatically generate a concrete program TWIST(p) =
C(twist(R(p))):
1. TWIST(p) ⇐ C(twist(R(p)))
2. TWIST(nil) ⇐ C(twist(R(nil))) (instantiate)
3. TWIST(nil) ⇐ nil (unfold C, twist, R, and evaluate)
4. TWIST(pair(a,pair(p1,p2))) ⇐ C(twist(R(pair(a,pair(p1,p2))))) (instantiate)
5. TWIST(pair(a,pair(p1,p2))) ⇐ C(twist(ltree(a,R(p1),R(p2)))) (unfold R)
6. TWIST(pair(a,pair(p1,p2))) ⇐ C(ltree(a,twist(R(p2)),twist(R(p1)))) (unfold twist)
7. TWIST(pair(a,pair(p1,p2))) ⇐ pair(a,pair(C(twist(R(p2))),C(twist(R(p1))))) (unfold C)
8. TWIST(pair(a,pair(p1,p2))) ⇐ pair(a,pair(TWIST(p2),TWIST(p1))) (fold with 1)
with the resulting program given by equations 3 and 8.
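The result of the derivation can be checked mechanically. The following Python sketch (the tuple encodings of ltree/pair are our assumption, not the book's notation) implements the abstract twist, the representation and coding functions R and C, and the derived concrete TWIST, and confirms TWIST(p) = C(twist(R(p))) on the example tree:

```python
# Labeled trees: None for niltree, ('ltree', a, t1, t2) for ltree(a, t1, t2).
# Binary trees:  None for nil, ('pair', x, y) for pair(x, y);
# a node is encoded as pair(label, pair(left, right)).

def R(p):                      # representation: binaryTrees -> labeledTrees
    if p is None:
        return None
    a, sub = p[1], p[2]
    return ('ltree', a, R(sub[1]), R(sub[2]))

def C(t):                      # coding: labeledTrees -> binaryTrees
    if t is None:
        return None
    return ('pair', t[1], ('pair', C(t[2]), C(t[3])))

def twist(t):                  # abstract program: mirror the labeled tree
    if t is None:
        return None
    return ('ltree', t[1], twist(t[3]), twist(t[2]))

def TWIST(p):                  # concrete program obtained by fold/unfold
    if p is None:              # equation 3: TWIST(nil) <= nil
        return None
    a, sub = p[1], p[2]        # equation 8
    return ('pair', a, ('pair', TWIST(sub[2]), TWIST(sub[1])))

# pair(A, pair(nil, pair(B, pair(nil, nil)))) from the example above
p = ('pair', 'A', ('pair', None, ('pair', 'B', ('pair', None, None))))
assert TWIST(p) == C(twist(R(p)))
```

The assertion is exactly the synthesis goal: the folded concrete program agrees with coding, abstract twisting, and decoding composed.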
Characteristics of Transformational Implementation. To sum up, the approach
proposed by Burstall and Darlington (1977) has the following characteristics:
Specification: Input is a structurally simple program whose correctness is
obvious (or can easily be proved).
Basic Rules: A transformation system is based on a set of rules. There are
rules which define semantics-preserving re-formulations of the (right-hand
side, i. e., body of the) program (or specification). Other rules, such as
definition and instantiation, cannot be applied mechanically but are based
on creativity (eureka) and insight into the problem to be solved.
Partial Correctness: Transformations are semantics-preserving modulo termination.
Selection of Rules: Providing that sequence of rule applications which leads
to the desired result typically cannot be performed fully automatically.
Burstall and Darlington presented a simple strategy for rule selection.
This strategy can in general not be applied without guidance by the
user.
122 6. Automatic Programming
6.2.2.2 Meta-Programming: The CIP Approach. The CIP approach
was developed during the seventies and eighties (Broy & Pepper, 1981; Bauer,
Broy, Möller, Pepper, Wirsing, et al., 1985; Bauer, Ehler, Horsch, Möller,
Partsch, Paukner, & Pepper, 1987) as a system to support the formal development
of correct and efficient programs (meta-programming). The approach
has the following characteristics:
Transformation Rules: Describe mappings from programs to programs. They
are represented as directed pairs of program schemes (a → b) together with
an enabling condition (c) defining applicability.
Transformational Semantics: A programming language is considered as a term
algebra. The given transformation rules define congruence relations on
this algebra and thereby establish a notion of "equivalence of programs"
(Ehrig & Mahr, 1985).
Correctness of Transformations: The correctness of a basic set of transformation
rules can be proved in the usual way by showing that a and b are
equivalent with respect to the properties of the semantic model. Correctness
of all other rules can be shown by transformational proofs: A
transformation rule T : a → b is correct if it can be deduced from a set of
already verified rules T1, ..., Tn: a ↦T1 a1 ↦T2 ... ↦Tn an = b. In general, for
all rules a → b, it must hold that the relation between a and b is reflexive
and transitive. Properties of recursive programs are proved by transformational
induction: For two recursive programs p and q with bodies [p]
and [q], every call p(x) can be transformed into a call q(x) if there is a
transformation rule [p] → [q].
Abstract Data Types: Abstract data types are given by a signature (sorts
and operations) and a set of equations (axioms) (see, e. g., Ehrig & Mahr,
1985). These axioms can be considered as transformation rules which are
applicable within the whole scope of the type. (Considering a programming
language as a term algebra, the programming language itself can be
seen as an abstract data type with the transformation rules as axioms.)
Assertions: For some transformation rules there might be enabling conditions
which hold only locally. For example, for the rule
if B then S else T fi → S (enabling condition: B)
the enabling condition B can be given or not, depending on the current
input. If B holds, then the conditional statement evaluates to S. Such
local properties are expressed by assertions which can be provided by
the programmer. Furthermore, transformation rules can be defined for
generating or propagating assertions.
Again, we illustrate the approach with an example, which is presented in
detail in Broy and Pepper (1981). The searched-for program is a realization of
Warshall's algorithm for calculating the transitive closure of a graph.
The graph is given by its characteristic function
function edge (x:node, y:node):bool;
(true, if nodes x and y are directly connected).
A path is defined as a sequence of nodes where each pair of succeeding
nodes is directly connected:
function ispath (s:seq(node)):bool;
∀ s1, s2: seq(node), x, y: node:
s = s1 ∘ x ∘ y ∘ s2 ⇒ edge(x, y).
Starting point for program transformation is the specification of the
searched-for function:
function trans (x:node, y:node):bool;
∃ p: seq(node): ispath(x ∘ p ∘ y).
The basic idea (eureka!) is to define a more general function tc, which only
considers the first i nodes (where nodes are given as natural numbers 1 ... n):
function tc (i:nat, x:node, y:node):bool;
∃ p: seq(node): ispath(x ∘ p ∘ y)
∧ ∀ z: node: z ∈ p ⇒ z ≤ i.
In a first step, it must be proved that tc(n,x,y) = trans(x,y), where n is
the number of all nodes in the graph:
Lemma 1: tc(n, x, y) = trans(x, y)
tc(n,x,y) = ∃ p: seq(node): ispath(x ∘ p ∘ y) ∧ ∀ z: node: z ∈ p ⇒ z ≤ n
(unfold)
= ∃ p: seq(node): ispath(x ∘ p ∘ y) (simplification: z ≤ n holds for all
nodes z)
= trans(x,y) (fold).
To obtain an executable function from the given specification, a next
eureka-step is the introduction of recursion. This is done by formulating a
base case (for i = 0) and a recursive case:
Lemma 2: tc(0, x, y) = edge(x, y)
Lemma 3: tc(i+1, x, y) = tc(i, x, y) ∨ [tc(i, x, i+1) ∧ tc(i, i+1, y)].
The proof of lemma 2 is analogous to that of lemma 1. The proof of
lemma 3 involves an additional case analysis (all nodes on a path are smaller
than or equal to node i vs. a path contains node i+1) and application of
logical rules.
The proved lemmas give rise to the program
function tc (i:nat, x:node, y:node):bool;
if i = 0
then edge(x,y)
else tc(i-1,x,y) ∨ [tc(i-1,x,i) ∧ tc(i-1,i,y)].
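A runnable rendering of the recursion stated by lemmas 2 and 3 can make the derived program concrete; the following Python sketch (the example graph and the name tc are illustrative assumptions, not part of the CIP derivation):

```python
# Example graph on nodes 1..4, given by the characteristic function
# of its edge relation.
EDGES = {(1, 2), (2, 3), (3, 4)}
N = 4  # number of nodes

def edge(x, y):
    return (x, y) in EDGES

def tc(i, x, y):
    # tc(i, x, y): is there a path from x to y whose intermediate
    # nodes all lie in {1, ..., i}?  (Lemmas 2 and 3.)
    if i == 0:
        return edge(x, y)
    return tc(i - 1, x, y) or (tc(i - 1, x, i) and tc(i - 1, i, y))

def trans(x, y):
    return tc(N, x, y)   # Lemma 1: tc(n, x, y) = trans(x, y)

assert trans(1, 4) and not trans(4, 1)
```

The assertion checks that the 1-2-3-4 chain is transitively closed in one direction only, as the edge relation dictates.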
This example demonstrates how the development of a program from a
non-executable initial specification can be supported by CIP. Program development
was mainly "intuition-guided", but the transformation system supports
the programmer in two ways: First, the notion of program transformation
results in systematic program development by stepwise refinement, and
second, the transformation system (partially) automatically generates proofs
for the correctness of the programmer's intuitions.
Given the recursive function tc, in a next step it can be transformed from
its inefficient recursive realization into a more efficient version realized by for-loops.
This transformational implementation can be performed automatically
by means of sophisticated (and verified) transformation rules.
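The target of that final transformation is the familiar iterative form of Warshall's algorithm. As a sketch of what the for-loop version looks like (the boolean-matrix representation is our assumption, not the book's):

```python
def warshall(n, edge):
    # t[x][y] starts as the edge relation (the i = 0 case) and is
    # updated once per intermediate node -- the for-loop realization
    # of the recursion tc(i, x, y).
    t = [[edge(x, y) for y in range(1, n + 1)] for x in range(1, n + 1)]
    for i in range(n):
        for x in range(n):
            for y in range(n):
                t[x][y] = t[x][y] or (t[x][i] and t[i][y])
    return t

reach = warshall(4, lambda x, y: (x, y) in {(1, 2), (2, 3), (3, 4)})
assert reach[0][3] and not reach[3][0]
```

The loop version does the same work as the recursion but shares intermediate results, trading the exponential call tree for O(n³) matrix updates.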
6.2.2.3 Program Synthesis by Stepwise Refinement: KIDS. KIDS⁴
is currently the most successful system for deductive (transformational) program
synthesis. KIDS supports the development of correct and efficient programs
by applying consistency-preserving transformations to an initial specification,
first transforming the specification into an executable but inefficient
program and then optimizing this program. The basis of KIDS is Refine, a
wide-spectrum language for representing specifications as well as programs.
The final program is represented in Common Lisp.
In KIDS a variety of state-of-the-art techniques are integrated into one
system, and KIDS makes use of a large, formalized amount of domain and
programming knowledge: It offers program schemes for divide-and-conquer,
global search (binary search, depth-first search, breadth-first search), and
local search (hill climbing); it makes use of a library of reusable domain axioms
(equations specifying abstract data types); it integrates deductive inference
(with about 500 inference rules represented as directed transformation rules);
and it includes a variety of state-of-the-art techniques for program optimization
(such as expression simplification and finite differencing).
KIDS is an interactive system where the user typically goes through the
following steps of program development (Smith, 1990):
Develop a Domain Theory: The user defines types and functions and provides
laws that allow high-level reasoning about the defined functions.
The user can develop the theory from scratch or make use of a hierarchical
library of types.
For example, a domain theory for the k-Queens problem gives boolean
functions representing the constraints that no two queens can be in the
same row or column and that no two queens can be in the same diagonal
of a chessboard. Laws for reasoning about functions typically include
monotonicity and distributive laws.
⁴ see http://www.kestrel.edu/HTML/prototypes/kids.html
Create a Specification: The user enters a specification statement in terms of
the domain theory.
For example, a specification of queens states that for k queens on a k×k
board, a set of board positions must be returned for which the constraints
defined in the domain theory hold.
Apply a Design Tactic: The user selects a program scheme, representing a
tactic for algorithm design, and applies it to the specification. This is
the crucial step of program development, transforming a typically non-executable
specification into an (inefficient) high-level program.
For example, the k-Queens specification can be solved with global search,
which can be refined to depth-first search (find legal solutions using backtracking).
Apply Optimizations: The user selects an optimization operation and a program
expression to which it should be applied. The optimization techniques
are fully automatic.
Apply Data Type Refinements: The user can select implementations for the
high-level data types in the program.
Which implementation of a data type will result in the most efficient
program depends on the kind of operations to be performed. For example,
sets could be realized as lists, arrays, or trees.
Compile: The code is compiled into machine-executable form (first Common
Lisp and then machine code).
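For illustration only (KIDS itself produces Refine and Common Lisp, not Python), a depth-first rendering of the global-search tactic for k-Queens might look as follows: partial placements are extended row by row, and subspaces violating the column or diagonal constraints of the domain theory are pruned.

```python
def queens(k):
    # Global search refined to depth-first search with backtracking:
    # place one queen per row; a partial solution is a list of column
    # positions, one entry per already-filled row.
    def consistent(cols, col):
        row = len(cols)
        return all(c != col and abs(c - col) != row - r
                   for r, c in enumerate(cols))

    solutions = []

    def extend(cols):
        if len(cols) == k:
            solutions.append(tuple(cols))
            return
        for col in range(k):
            if consistent(cols, col):   # prune inconsistent subspaces
                extend(cols + [col])

    extend([])
    return solutions

assert len(queens(4)) == 2   # the two symmetric solutions on a 4x4 board
```

The constraint test is the executable counterpart of the domain-theory laws; the recursive enumeration is the design tactic instantiated for this specification.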
The k-Queens example is given in detail in Smith (1990). The idea of
program development by stepwise refinement of abstract schemes is also illustrated
in Smith (1985) for constructing a divide-and-conquer based search
algorithm.⁵
The KIDS system demonstrates that automatic program synthesis
can cover all steps of program development from high-level
specifications to efficient programs. Furthermore, it gives some evidence for
the claim of KBSE that automatization can result in a higher level of productivity
in software development. High productivity is also due to the possibility
of reuse of domain axioms. But of course, the system is not "auto-magic":
it strongly depends on interactions with an expert user, especially for the initial
development steps from domain theory to selection of a suitable design tactic.
These first steps can be seen as an approach to meta-programming, similar
to CIP. Even for optimization, the user must have knowledge of which parts
of the program could be optimized in what way, in order to select the appropriate
expressions and rules to be applied to them.
⁵ Here the system Cypress, which was the predecessor of KIDS, was used.
6.2.2.4 Concluding Remarks. In principle, there is no fundamental difference
between constructive theorem proving and program transformation:
In both cases, the starting point is a formal specification and the output is a program
which is correct with respect to the specification. Theorem proving can be
converted into transformation if axioms are represented as rules (equations
may correspond to bi-directional rules) and if proof rules (such as resolution)
are represented as special rewrite rules. The advantages of transformation
over proof systems for program synthesis are that forward reasoning and a
smaller formal overhead make program derivation somewhat easier (Kreitz,
1998). In contrast, proof systems are typically used for (ex-post) program
verification.
As described above, program transformation systems might include non-verified
rules representing domain knowledge which are applied in interaction
with the system user. Such rules are mainly applied during the first steps of
program development until an initial executable program is constructed. The
following optimization steps (transformational programming) mainly rely on
verified rules which can be applied fully automatically. Code optimization
by transformation is typically dealt with in textbooks on compilers (Aho,
Sethi, & Ullman, 1986). A good introduction to program transformation and
optimization is given in Field and Harrison (1988).
An interesting approach to "reverse" transformation from efficient loops
(tail recursion) to more easily verifiable (linear) recursion is proposed by
Giesl (2000).
6.3 Inductive Approaches
In the following we introduce inductive approaches to program synthesis.
First, basic concepts of inductive inference are introduced. Inductive inference
is researched in philosophy (epistemology), in cognitive psychology, and
in artificial intelligence. In this section we focus on induction in AI, that is,
on machine learning. In chapter 9 we discuss some relations between inductive
program synthesis and inductive learning of strategies in human problem
solving. Grammar inference is introduced as the theoretical background
for program synthesis. In the following sections, three different approaches
to program synthesis are described: genetic programming, inductive logic
programming, and synthesis of functional programs. Synthesis of functional
programs is the oldest and genetic programming the newest approach. In
genetic programming, hypothesis construction relies most strongly on search in
hypothesis space, while in functional program synthesis hypothesis construction
is most strongly guided by the structure of the examples.
6.3.1 Foundations of Induction
As introduced in section 6.1.2.2, induction means the inference of generalized
rules from examples. The ability to perform induction is a presupposition for
learning; often, induction and learning are used as synonyms.
6.3.1.1 Basic Concepts of Inductive Learning. Inductive learning can
be characterized in the following way (Mitchell, 1997, p. 2):
Definition 6.3.1 (Learning). A computer program is said to learn from
experience E with respect to some class of tasks T and performance measure
P, if its performance at tasks in T, as measured by P, improves with
experience E.
In the following, we will flesh out this definition.
Learning Tasks. Learning tasks can be roughly divided into classification
tasks versus performance tasks, labeled as concept learning versus skill acquisition.
Examples for concept learning are recognizing handwritten letters or
identifying dangerous substances. Examples for skill acquisition are navigating
without bumping into obstacles, or controlling a chemical process such
that certain variables stay in a pre-defined range.
In inductive program synthesis, the learning task is to construct a program
which transforms input values from a given domain into the desired output
values. A program can be seen as a representation of a concept: Each input
value is classified according to the operations which transform it into the
output value. Especially, programs returning boolean values can be viewed
as concept definitions. Typical examples are recursive definitions of odd or
ancestor (see fig. 6.1). A program can also be seen as a representation of a
(cognitive) skill: The program represents the operations which must be performed
for a given input value to fulfill a certain goal. For example, a program
quicksort represents the skill of efficiently sorting lists; or a program hanoi
represents the skill of solving the Tower of Hanoi puzzle with a minimum
number of moves (see fig. 6.1).
Learning Experience. Learning experience typically is provided by presenting
a set of training examples. Examples might be pre-classified; then we
speak of supervised learning or learning with a teacher; otherwise we speak
of unsupervised learning or learning from observation.
For example, for learning a single concept such as dangerous versus
harmless substance, the system might be provided with a number of chemical
descriptions labeled as positive (i. e., dangerous) and negative (i. e.,
harmless) instances of the to-be-learned concept. For learning to control a
chemical process, the learning system might be provided with current values
of a set of control variables, labeled with the appropriate operations which
must be performed in that case; each operation sequence then constitutes a
class to be learned. Alternatively, for skill acquisition a learning system might
provide its own experience, performing a sequence of operations and getting
feedback for the final outcome (reinforcement learning). In this case, the examples
are labeled only indirectly; the system is confronted with the credit
assignment problem: determining the degree to which each performed operation
is responsible for the final outcome.
; functional program deciding whether a natural number is odd
odd(x) = if (x = 0) then false else if (x = 1) then true else odd(x-2)
; logical program deciding whether p is ancestor of f
ancestor(P,F) :- parent(P,F).
ancestor(P,F) :- parent(P,I), ancestor(I,F).
; logical program for quicksort
qsort([X|Xs],Ys) :-
partition(Xs,X,S1,S2),
qsort(S1,S1s),
qsort(S2,S2s),
append(S1s,[X|S2s],Ys).
qsort([],[]).
; functional program for hanoi
hanoi(n,a,b,c) = if (n = 1) then move(a,b)
else append(hanoi((n-1),a,c,b), move(a,b),
hanoi((n-1),c,b,a))
Fig. 6.1. Programs Represent Concepts and Skills
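The functional programs of fig. 6.1 can be traced directly. The following Python transcription (an illustration in our notation, not the book's) returns the move sequence as a list, which is exactly the form of the training examples used below in table 6.2:

```python
def hanoi(n, a, b, c):
    # move a tower of n discs from peg a to peg b, using c as helper;
    # returns the minimal sequence of move operations
    if n == 1:
        return [('move', a, b)]
    return hanoi(n - 1, a, c, b) + [('move', a, b)] + hanoi(n - 1, c, b, a)

def odd(x):
    # recursive boolean program: a concept definition
    return False if x == 0 else True if x == 1 else odd(x - 2)

assert hanoi(2, 'A', 'B', 'C') == [('move', 'A', 'C'),
                                   ('move', 'A', 'B'),
                                   ('move', 'C', 'B')]
assert odd(7) and not odd(10)
```

The assertion on hanoi(2, ...) reproduces the N = 2 training example of table 6.2; odd illustrates a program acting as a concept definition.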
In the simplest form, examples might consist just of a set of attributes.
Such attributes might be categorial (substance contains fluorine yes/no) or
metric (current temperature has value x). In general, examples might be structured
objects, that is, sets of relations or terms. For instance, for the ancestor
problem (see fig. 6.1) kinship relations between different persons constitute the
learning experience. In program synthesis, examples are always structured objects.
Training examples can be presented stepwise (incremental learning) or
all at once ("batch" learning). Both learning modes are possible for program
synthesis. Finally, training examples should be a representative sample of
the concept or skill to be learned. Otherwise, the performance of the learning
system might be successful only for a subset of possible situations.
In table 6.2 possible training examples for the ancestor and the hanoi
problems are given. In the first case, given some parent-child relations, positive
and negative examples for the concept "ancestor" are presented. In the
second case, examples are associated with sequences of move operations for
different numbers of discs. The hanoi example can be classified as input to
supervised or unsupervised learning: The number of discs (n = i, i = 1 ... 3)
represents a possible input and the move operations represent the class the
input must be associated with (see Briesemeister et al., 1996). Alternatively,
the disc-number/operations pairs can be seen as patterns and the learning
task is to extract their common structure (descriptive generalization). While
in the first case the different operator sequences constitute different classes
to be learnt, in the second case all examples are positive. The examples
are ordered with respect to the number of discs. Therefore, in an indirect
way, they also contain information about which cases do not belong to the
domain.
Table 6.2. Training Examples
; ANCESTOR(X,Y)
parent(peter,robert)    parent(mary,robert)
parent(robert,tim)      parent(tim,anna)
parent(tim,john)        parent(julia,john)
...                     ...
; pre-classified positive and negative examples
ancestor(julia,john)    not(ancestor(anna,tim))
ancestor(peter,tim)     not(ancestor(anna,mary))
ancestor(peter,anna)    not(ancestor(john,robert))
...                     ...
; HANOI(N,A,B,C)
(N = 1) --> (move(A,B))
(N = 2) --> (move(A,C),move(A,B),move(C,B))
(N = 3) --> (move(A,B),move(A,C),move(B,C),
             move(A,B),move(C,A),move(C,B),
             move(A,B))
Additional Information. The learning system might be provided with additional
information besides the training examples. For instance, in the ancestor
problem in table 6.2 some parent-child relations are directly given. These facts
are essential for learning: It is not possible to come up with a terminating
recursive rule (as given in fig. 6.1) without reference to a direct relation such
as parent.
If a system is given additional information besides the training examples,
this is typically called background knowledge. Background knowledge can have
different functions for learning: it can be necessary for hypothesis construction,
it can make hypothesis construction more efficient, or it can make the
resulting hypothesis more compact or readable.
The ancestor example could have been presented in a different way, as
shown in table 6.3. Instead of giving parent relations, mother and father
relations can be used, together with a rule stating that father(X,Y) or
mother(X,Y) implies parent(X,Y). Another example of using background
knowledge for learning is to give pre-defined rules for append and partition
for learning quicksort (see fig. 6.1).
Additional information can also be provided in the form of an oracle: The
learning system constructs new examples which are consistent with its current
hypothesis and asks an oracle whether the example belongs to the class to
be learned. An oracle is usually the teacher.
Table 6.3. Background Knowledge
; ANCESTOR(X,Y)
father(peter,robert)    mother(mary,robert)
father(robert,tim)      father(tim,anna)
father(tim,john)        mother(julia,john)
...
parent(X,Y) :- mother(X,Y).
parent(X,Y) :- father(X,Y).
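The effect of these background rules can be mimicked in Python (a sketch using the facts of table 6.3): parent is derived from the mother and father facts, and the recursive ancestor definition of fig. 6.1 is evaluated on top of it.

```python
FATHER = {('peter', 'robert'), ('robert', 'tim'),
          ('tim', 'anna'), ('tim', 'john')}
MOTHER = {('mary', 'robert'), ('julia', 'john')}

def parent(x, y):
    # background rules: parent(X,Y) :- mother(X,Y). / :- father(X,Y).
    return (x, y) in MOTHER or (x, y) in FATHER

PERSONS = {p for fact in FATHER | MOTHER for p in fact}

def ancestor(x, y):
    # ancestor(P,F) :- parent(P,F).
    # ancestor(P,F) :- parent(P,I), ancestor(I,F).
    return parent(x, y) or any(parent(x, i) and ancestor(i, y)
                               for i in PERSONS)

assert ancestor('peter', 'anna') and not ancestor('anna', 'tim')
```

Note that the recursion terminates because parent only follows edges of the (acyclic) kinship graph; without such a direct base relation, no terminating recursive rule could be formulated, as observed above.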
Representing Hypotheses. The task of the learning system is to construct
generalized rules (representing concepts or skills) from the training examples
and background knowledge. In other words, learning means to construct an
intensional representation from a sample of an extensional representation of
a concept or skill. As mentioned in section 6.1.2.2, such generalized rules
have the status of hypotheses.
For supervised concept learning, a hypothesis is a special case of a function
f : X → Y which maps a given object belonging to the learning domain to
a class. In general, f can be a linear or a non-linear function, mapping a
vector of attribute values X into an output value Y, representing the class
the object described by this vector is supposed to belong to. Alternatively,
hypotheses can be represented symbolically; the most prominent example
are decision trees.
In the context of program synthesis, inputs X are structural descriptions
(relations or terms), outputs Y are operator sequences transforming the input
into the desired output, and hypotheses are represented as logical or
functional programs (see fig. 6.1): For the ancestor problem, function f(x) is
represented by clauses ancestor(X,Y) which return true or false for a given
set of parent-child relations and two persons. For the hanoi problem, function
f(x) is represented by a functional program which returns a sequence
of move operations to transport a tower of n discs from the start peg to the
goal peg.
Identification Criteria and Performance Measure. There are two stages at
which the learning system is evaluated: First, a decision criterion is needed
to terminate the learning algorithm, and, second, the quality of the learned
hypothesis must be tested. An algorithm terminates if some identification
criterion with respect to the training examples is fulfilled. The classical concept
here is identification in the limit, proposed by Gold (1967). The learner
terminates if the current hypothesis holds at some point during learning for all
examples. An alternative concept is PAC (probably approximately correct)
identification (Valiant, 1984). The learner terminates if the current hypothesis
can be guaranteed with a high probability to be consistent with future
examples.
The first model represents an all-or-nothing perspective on learning: a
hypothesis must be totally correct with respect to the seen examples. The
second model is based on probability theory. Most approaches to concept
learning are based on PAC-learnability, while program synthesis typically is
based on Gold's theory (see sect. 6.3.1.2). An analysis of program synthesis
from traces in the PAC theory was presented by Cohen (1995, 1998).
To evaluate the quality of the learned hypothesis, the learning system is
presented with new examples and generalization accuracy is measured. Typically,
the same criterion is used as for termination.
Learning Algorithm. A learning algorithm gets training examples and background
knowledge as input and returns a hypothesis as output.
The majority of learning algorithms are for constructing non-recursive hypotheses
for supervised concept learning where examples are represented as
attribute vectors (Mitchell, 1997). Among the most prominent approaches are
function approximation algorithms (such as perceptron, back-propagation, or
support vector machines), Bayesian learners, and decision tree algorithms.
For unsupervised concept learning, the dominant approach is cluster analysis.
Approaches for learning recursive hypotheses from structured examples,
that is, approaches to inductive program synthesis, are genetic programming,
inductive logic programming, and synthesis of functional programs.
These approaches will be described in detail below.
In general, a learning algorithm is a search algorithm in the space of
possible hypotheses. There are two main learning mechanisms: data-driven
approaches start with the most specific hypothesis, that is, the training examples,
and approximation-driven (or model-driven) approaches start with a
set of approximate (incomplete, incorrect) hypotheses. Data-driven learning
is incremental, iterating over the training examples (Flener, 1995). Each iteration
returns a hypothesis which might be the searched-for final hypothesis.
Approximation-driven learning is non-incremental. Current hypotheses are
generalized if not all positive examples are covered, or specialized if other
than the positive examples are covered.
Induction Biases. As usual, there is a trade-off between the expressiveness
of the hypothesis language and the efficiency of the learning algorithm (cf.
the discussion of domain specification languages and efficiency of planning
algorithms, chap. 2). Typically, in machine learning search is restricted by
so-called induction biases (Flener, 1995, pp. 33): Each learning algorithm is
restricted by a syntactic bias, that is, a restriction of the hypothesis language.
For logical program synthesis, the language can be restricted to a subset of
Prolog clauses. For functional program synthesis, the form of functions can be
restricted by program schemes. Furthermore, a semantic bias might restrict
the vocabulary available for the hypothesis. For example, when synthesizing
Lisp programs, the language primitives might be restricted to car, cdr, cons,
and atom (see sect. 6.3.4).
Machine Learning Literature. The AI textbooks of Winston (1992) and Russell
and Norvig (1995) include chapters on machine learning techniques. An
excellent textbook on machine learning was written by Mitchell (1997). Collections
of important research papers, including work on inductive program
synthesis, are presented by Michalski, Carbonell, and Mitchell (1983) and
Michalski, Carbonell, and Mitchell (1986). A collection of papers on knowledge
acquisition and learning was edited by Buchanan and Wilkins (1993).
Flener (1995) discusses machine learning in relation to program synthesis.
6.3.1.2 Grammar Inference. Most of the basic concepts of machine learning
as introduced above have their origin in the work of Gold (1967). Gold
modeled learning as the acquisition of a formal grammar from examples and
provided a classification of models of language learnability together with
fundamental theoretical results. Grammar inference research developed from
Gold's work in two directions: providing theoretical results of learnability
for certain classes of languages given a certain kind of learning model, and,
more recently, the development of efficient learning algorithms for some classes
of formal grammars. We focus on the conceptual and theoretical aspects of
grammar inference, that is, on algorithmic learning theory. Learning theory
can be seen as a special case of complexity theory where the complexity of
languages is researched with respect to the effort of their enumeration (by
application of rules of a formal grammar) or with respect to the effort of their
identification by an automaton.
Definition 6.3.2 (Formal Language). For a given non-empty finite set
A, called the alphabet of the language, A* represents the set of all finite strings
over elements from A. A language L is defined as: L ⊆ A*.
Learning means to assign a grammar (called a naming relation by Gold) to
a language L after seeing a sequence of example strings:
Definition 6.3.3 (Language Learning). For a sequence of time steps, the
learner is at each time step confronted with a unit of information i_t concerning
the unknown language L. A stepwise presentation of units i is called a
training sequence. At each time step, the learner makes a guess g_t of the
grammar characterizing L, based on the information it has received from the
start of learning to time step t. That is, the learner is a function G with
g_t = G(i_1, ..., i_t).
As introduced above, Gold introduced the concept of "identification in
the limit" to define the learnability of a language:
Definition 6.3.4 (Identification in the Limit). L is identified in the limit
if after a finite number of time steps the guesses are always the same, that
is, g_t = g_(t+i), i ≥ 0. A class of languages L is called identifiable in the limit if
there exists an algorithm such that each language of the class will be identified
in the limit for any allowable training sequence.
Identification in the limit is dependent on the method of information presentation.
Gold discerns presentation of text and learning with an informant.
Learning from text means learning from positive examples only, where examples
are presented in an arbitrary order. Learning with an informant corresponds
to supervised learning. The informant can present positive and negative examples
and can provide some methodical enumeration of examples. Gold
investigates three modes of learning from text and three modes of learning
with an informant:
Method of information presentation: Text
A text is a sequence of strings x_1, x_2, ... from L such that each string of L
occurs at least once. At time t the learner is presented x_t.
Arbitrary Text: x_t may be any function of t.
Recursive Text: x_t may be any recursive function of t.
Primitive Recursive Text: x_t may be any primitive recursive function of t.
Method of information presentation: Informant
An informant for L tells the learner at each time t whether a string y_t is an
element of L. There are different ways of how to choose y_t:
Arbitrary Informant: y_t may be any function of t as long as every string
of A* occurs at least once.
Methodical Informant: An enumeration is assigned a priori to the strings
of A* and y_t is the t-th string of the enumeration.
Request Informant: At time t the learner chooses y_t on the basis of
information received so far.
The method "request informant" is currently an active area of research
called active learning (Thompson, Califf, & Mooney, 1999). Gold showed that
all three methods of information presentation by informant are equivalent
(Theorem I.3 in Gold, 1967). Obviously, it is a harder problem to learn from
text only than to learn from an informant, because in the second case not
only positive but also negative examples are given.
Identification in the limit is also dependent on the "naming relation", that
is, on the way in which a hypothesis about language L, that is, a grammar,
is represented. The two possible naming relations for languages are:
Language Generator: A Turing machine which generates L.
A generator is a function from positive integers (corresponding to time
steps t) to strings in A* such that the function range is exactly L. A
generator exists iff L is recursively enumerable.
Language Tester: A Turing machine which is a decision procedure for L.
A tester is a function from strings to {0, 1} with value 1 for strings in L
and value 0 otherwise.
Generators can be transformed into testers but not the other way round,
and it is simpler to learn a generator for a class of languages than a tester
(Gold, 1967).
134 6. Automatic Programming
The definition of learnability as identification in the limit together with
a method of information presentation and a naming relation constitutes a
language learnability model.
Gold presented some methods for identification in the limit; the most
prominent one is identification by enumeration.
Definition 6.3.5 (Identification by enumeration).^6 Let D be a description
for a class of languages where the elements of D are effectively enumerable.
Let L(d) be the language denoted by description d. For each time step t,
search for the least k such that i_t is in L(d_k) and all preceding
i_j, j = 1 ... t-1, are in L(d_k), too.
Identification by enumeration presupposes a linear ordering of possible hypotheses.
Each incompatibility between the training examples seen so far and
the current hypothesis results in the elimination of this hypothesis. That is,
if hypotheses are effectively enumerable, it is guaranteed that the t-th guess
is the earliest description compatible with the first t elements of the training
sequence and that learning will converge to the first description of L(d) in
the given enumeration D. In practice, enumeration is mostly not efficient.
It is often possible to organize the class of hypotheses in a better way such
that whole subclasses can be eliminated after some incompatibility is identified
(Angluin, 1984). The notion of inductive inference as search through a
given space of hypotheses is helpful for theoretical considerations. In practice,
this space typically is only given indirectly by the inference procedure and
hypotheses are generated and tested at each inference step.
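The enumeration strategy of Definition 6.3.5 can be sketched in a few lines of Python (our own illustration, not from Gold; the toy hypothesis space of three finite languages is assumed):

```python
# Sketch of identification by enumeration: hypotheses are an ordered list of
# candidate languages; after each example the learner guesses the first
# (least-index) hypothesis consistent with all examples seen so far.

def identify_by_enumeration(descriptions, text):
    """descriptions: ordered candidate languages (sets of strings);
    text: sequence of positive examples. Yields the guess after each example."""
    seen = set()
    for x in text:
        seen.add(x)
        # least k such that all examples seen so far are in L(d_k)
        guess = next(d for d in descriptions if seen <= d)
        yield guess

# Toy hypothesis space (assumed): languages {"a"}, {"a","b"}, {"a","b","c"}
D = [{"a"}, {"a", "b"}, {"a", "b", "c"}]
guesses = list(identify_by_enumeration(D, ["a", "b", "b", "c"]))
# The learner converges to the first description compatible with the text.
```

Each incompatible guess is abandoned in favor of the next hypothesis in the fixed enumeration, exactly as described above.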
Gold (1967) provided fundamental results about which classes of languages
are identifiable in the limit given which learnability model. These
results are summarized in table 6.4. Later work in the area of learning theory
is mostly concerned with refining Gold's results by identifying interesting
subclasses of the original categorization of Gold and providing analyses of
their learnability together with efficiency results (Zeugmann & Lange, 1995;
Sakakibara, 1997).
Theorems and proofs concerning all results given in table 6.4 can be found
in the appendix of Gold (1967). We just highlight some results, presenting
the theorems for the class of finite cardinality languages (languages with a
finite number of words) and for the class of superfinite languages, containing
all languages of finite cardinality and at least one of infinite cardinality.
Theorem 6.3.1 (Learnability of Finite Cardinality Languages). (Gold,
1967, theorem I.6) Finite cardinality languages are identifiable from (arbitrary)
text, that is, from positive examples presented in some arbitrary sequence,
only.

^6 The definition is oriented at the formulation of Angluin (1984) rather than on
the original formulation of Gold (1967).
6.3 Inductive Approaches 135
Table 6.4. Fundamental Results of Language Learnability (Gold, 1967, tab. 1)

Learnability Model                             Class of Languages
Primitive Recursive Text + Generator Naming    Recursively enumerable, recursive
Informant                                      Primitive recursive, context-sensitive,
                                               context-free, regular, superfinite
Text                                           Finite cardinality languages
Proof (Learnability of Finite Cardinality Languages). Imagine a language which
consists of the integers from 1 to a fixed number n. For any information sequence
i_1, i_2, ... the learner can assume that the language consists only of the numbers
seen so far. Since L is finite, after some time all elements of L have
occurred in the information sequence, such that the guess will be correct.
It is enough to give the proof for a language consisting of a finite set of
positive integers because all other finite languages can be mapped onto this
language.
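The proof idea translates directly into code; the following Python sketch (our own illustration) implements the learner that always guesses exactly the set of strings seen so far:

```python
# Sketch of the proof idea: for a finite language, guessing the set of
# elements seen so far identifies L in the limit, since every element of L
# eventually occurs in the text.

def finite_language_learner(text):
    seen = set()
    for x in text:
        seen.add(x)
        yield frozenset(seen)

L = {1, 2, 3, 4}
text = [2, 1, 2, 3, 4, 1, 3]          # arbitrary text: each element occurs
guesses = list(finite_language_learner(text))
# Once all elements of L have appeared, the guess is correct and never changes.
```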
Theorem 6.3.2 (Learnability of Superfinite Languages). (Gold, 1967,
theorem I.9) Using information presentation by recursive text and the
generator-naming relation, any class of languages which contains all finite
languages and at least one infinite language L is not identifiable in the limit.

Proof (Learnability of Superfinite Languages). The infinite language L is
recursively enumerable (because otherwise it would not have a generator). It
can be shown that there exists a recursive sequence of positive integers which
ranges over L without repetitions. If the information sequence is presented
as a recursive sequence a_1, a_2, ..., then at some time step t this information
sequence is a recursive text for the finite language L = {a_1, ..., a_t}. At a later
time step t' some additional element of the infinite language can be presented
and the information sequence now is a recursive text for the finite language
L' = {a_1, ..., a_t, a_t'}. From this observation it follows that the learning system
must change its guess an infinite number of times.
Primitive recursive languages are learnable from an informant because
for such languages there exists an efficient enumeration (Hopcroft & Ullman, 1980,
chap. 7). Because learning with informant includes the presentation of negative
examples, these can be used to remove incompatible hypotheses from
search. While primitive recursive languages are identifiable in the limit, it is
not possible even for regular languages to find a minimal description (a deterministic
finite automaton with a minimal number of states) in polynomial
time (Gold, 1978)!
Most research in grammar inference is concerned with regular languages
and subclasses of them (such as regular pattern languages). Angluin (1981)
could show that a minimal deterministic finite automaton can be inferred in
polynomial time from positive and negative examples if, in addition to an
informant, an oracle answering membership queries is used. Angluin's ID algorithm
is used, for example, by Schrödl and Edelkamp (1999) for inferring
recursive functions from user-generated traces. Boström (1998) showed how
discriminating predicates can be inferred from only positive examples using
an algorithm for inferring a regular grammar.^7
In chapter 7 we will show that the class of recursive programs addressed
by our approach to folding finite ("initial") programs corresponds to context-free
tree grammars. Finite programs can be seen as primitive recursive text,
and folding, that is, generating a recursive program which results in the given
finite program as its i-th unfolding, corresponds to generator-naming (see first
line in tab. 6.4).
6.3.2 Genetic Programming

Genetic programming as an evolutionary approach to program synthesis was
established by Koza (1992). Starting point is a population of randomly generated
computer programs. New programs, better adapted to the "environment",
which is represented by examples together with a "fitness" function,
are generated by iterative application of Darwinian natural selection and
biologically inspired "reproduction" operations.^8 Genetic programming is
applied in system identification, classification, control, robotics, optimization,
game playing, and pattern recognition.
6.3.2.1 Basic Concepts.
Representation and Construction of Programs. In genetic programming, typically
functional programs are synthesized. Koza (1992) synthesizes Lisp functions;
an application of genetic programming to the synthesis of ML functions
was presented by Olsson (1995). These programs are represented as
trees. A program is constructed as a syntactically correct expression over a
set of predefined functions and a set of terminals. The set of functions can
contain standard operators (e. g., arithmetic operators, boolean operators,
conditional operators) as well as user-defined domain-specific functions. For
each function, its arity is given. The set of terminals contains variables and

^7 More information about grammar inference, including algorithms and areas of
application, can be found via http://www.cs.iastate.edu/~honavar/gi/gi.html.
^8 The genetic programming home page is http://www.genetic-programming.org/.
atoms. Atoms represent constants. Variables represent program arguments.
They can be instantiated by values obtained for example from "sensors".
A random generation of a program typically starts with an element of
the set of functions. The function becomes the root node in the program
tree and it branches into the number of arcs given by its arity. For each open
arc an element from the set of functions or terminals is introduced. Program
generation terminates if each current node contains a terminal. Examples for
program trees are given in figure 6.2.
Fig. 6.2. Construction of a Simple Arithmetic Function (a) and an Even-2-Parity
Function (b) Represented as a Labeled Tree with Ordered Branches (Koza, 1992,
figs. 6.1, 6.2)
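The generation procedure described above can be sketched as follows (a hedged Python illustration; the function set {+, *}, the terminals A, B, C, and the depth control are our own choices for the arithmetic example of fig. 6.2):

```python
import random

FUNCTIONS = {"+": 2, "*": 2}          # name -> arity (assumed function set)
TERMINALS = ["A", "B", "C"]           # assumed terminal set

def random_tree(depth, rng):
    # the root starts with a function; below it, functions or terminals are
    # chosen until every open arc ends in a terminal
    if depth == 0 or (depth < 3 and rng.random() < 0.3):
        return rng.choice(TERMINALS)
    f = rng.choice(list(FUNCTIONS))
    return [f] + [random_tree(depth - 1, rng) for _ in range(FUNCTIONS[f])]

def evaluate(tree, env):
    # a terminal is looked up in the environment; a function node is applied
    # to its recursively evaluated children
    if isinstance(tree, str):
        return env[tree]
    op, args = tree[0], [evaluate(t, env) for t in tree[1:]]
    return args[0] + args[1] if op == "+" else args[0] * args[1]

rng = random.Random(0)
t = random_tree(3, rng)               # a random, syntactically correct program
value = evaluate(t, {"A": 1, "B": 2, "C": 3})
```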
Representation of Problems. A problem together with a "fitness" measure
constitutes the "environment" for a program. Problems are represented by a
set of examples. For some problems, this can be pairs of possible input values
together with their associated outputs. The searched-for program must
transform possible inputs into desired outputs. Fitness can be calculated as
the number of erroneous outputs generated by a program. Such problems
can be seen as typical for inductive program synthesis. For learning to play
a game successfully, the problem can consist of a set of possible constellations
together with a scoring function. The searched-for program represents
a strategy for playing this game. For each constellation, the optimal move
must be selected. Fitness can for example be calculated as the percentage
of wins. For learning a plan, the input consists of a set of possible problem
states. The searched-for program generates action sequences to transform a
problem state into the given goal state. Fitness can be calculated as the percentage
of problem states for which the goal state is reached. An overview of
a variety of problems is given in Koza (1992, tab. 2.1).
Generational Loop. Initially, thousands of computer programs are randomly
generated. This "initial population" corresponds to a blind, random search in
the space of the given problem. The size of the search space depends on
the size of the set of functions over which programs can be constructed. Each
program corresponds to an "individual" whose "fitness" influences the probability
with which it will be selected to participate in the various "genetic
operations" (described below). Because selection is not absolutely dependent
on fitness, but fitness just influences the probability of selection, genetic programming
is not a purely greedy algorithm. Each iteration of the generational
loop results in a new "generation" of programs. After many iterations, a program
might emerge which solves or approximately solves the given problem.
An abstract genetic programming algorithm is given in table 6.5.
Table 6.5. Genetic Programming Algorithm (Koza, 1992, p. 77)

1. Generate an initial population of random compositions of the functions and
   terminals of the problem.
2. Iteratively perform the following sub-steps until the termination criterion has
   been satisfied:
   a) Execute each program in the population and assign it a fitness value according
      to how well it solves the problem.
   b) Select computer program(s) from the current population chosen with a
      probability based on fitness.
      Create a new population of computer programs by applying the following
      two primary operations:
      i. Copy program to the new population (Reproduction).
      ii. Create new programs by genetically recombining randomly chosen
          parts of two existing programs.
3. The best-so-far individual (program) is designated as result.
Genetic Operations. The set of genetic operations are:
Mutation: Delete a subtree of a program and randomly grow a new subtree
in its place.
This "asexual" operation is typically performed sparingly, for example
with a probability of 1% during each generation.
Crossover: For two programs ("parents"), in each tree a cross-over point is
chosen randomly and the subtree rooted at the cross-over point of the
first program is deleted and replaced by the subtree rooted at the cross-over
point of the second program.
This "sexual recombination" operation is the predominant operation in
genetic programming and is performed with a high probability (85% to
90%).
Reproduction: Copy a single individual into the next generation.
An individual "survives" with, for example, 10% probability.
Architecture Alteration: Change the structure of a program.
There are different structure-changing operations which are applied sparingly
(1% probability or below):
Introduction of Subroutines: Create a subroutine from a part of the main
program and create a reference between the main program and the
new subroutine.
Deletion of Subroutines: Delete a subroutine, thereby making the hierarchy
of subroutines narrower or shallower.
Subroutine Duplication: Duplicate a subroutine, give it a new name, and
randomly divide the preexisting calls of the subroutine between the
old and the new one. (This operation preserves semantics. Later on,
each of these subroutines might be changed, for example by mutation.)
Argument Duplication: Duplicate an argument of a subroutine and randomly
divide internal references to it. (This operation is also semantics-preserving.
It enlarges the dimensionality of the subroutine.)
Argument Deletion: Delete an argument, thereby reducing the amount
of information available to a subroutine ("generalization").
Automatically Defined Iterations/Recursions: Introduce or delete iterations
(ADIs) or recursive calls (ADRs).
Introduction of iterations or recursive calls might result in non-termination.
Typically, the number of iterations (or recursions) is
restricted for a problem. That is, for each problem, each program
has a time-out criterion and is terminated "from outside" after a
certain number of iterations.
Automatically Defined Stores: Introduce or delete memory (ADSs).
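Subtree crossover, the predominant operation, can be sketched as follows (our own Python illustration over nested-list program trees as in fig. 6.2; the path handling and the example parents are our own choices):

```python
import copy
import random

def nodes(tree, path=()):
    """Enumerate index paths to all nodes of a program tree."""
    yield path
    if isinstance(tree, list):                  # tree[0] is the operator
        for i, sub in enumerate(tree[1:], start=1):
            yield from nodes(sub, path + (i,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, subtree):
    """Return a copy of tree with the node at path replaced by subtree."""
    tree = copy.deepcopy(tree)
    if not path:
        return subtree
    parent = get(tree, path[:-1])
    parent[path[-1]] = subtree
    return tree

def crossover(p1, p2, rng):
    # pick a random cross-over point in each parent and swap the subtrees
    a = rng.choice(list(nodes(p1)))
    b = rng.choice(list(nodes(p2)))
    return replace(p1, a, get(p2, b)), replace(p2, b, get(p1, a))

rng = random.Random(1)
c1, c2 = crossover(["+", "A", ["*", "B", "C"]], ["AND", "D0", "D1"], rng)
```

Since crossover only exchanges subtrees, the combined node count of the two offspring equals that of the two parents.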
Quality criteria for programs synthesized by genetic programming are correctness,
which is defined as 100% fitness for the given examples, efficiency,
and parsimony. The two latter criteria can additionally be coded in the fitness
measure. In the following, we give examples of program synthesis by genetic
programming.
6.3.2.2 Iteration and Recursion. First we describe how a plan for stacking
blocks in a predefined order can be synthesized where the final program
involves iteration of actions (Koza, 1992, chap. 18.1). This problem is similar
to the tower problem described in section 3.1.4.5: An initial problem state
consists of n blocks which can be stacked or lying on the table. Problem solving
operators are puttable(x) and put(x,y), the first operator defining how a
block x is moved from another block onto the table and the second operator
defining how a block x can be put on top of another block y. The problem
solving goal defines in which sequence the n blocks should be stacked into a
single tower (see fig. 6.3).
Fig. 6.3. A Possible Initial State, an Intermediate State, and the Goal State for
Block Stacking (Koza, 1992, figs. 18.1, 18.2). The nine blocks are labeled with the
letters of the word "UNIVERSAL", which spells the goal tower.
The set of terminals is T = {CS, TB, NN} and the set of functions is
{MS, MT, NOT, EQ, DU} with
CS: A sensor that dynamically specifies the top block of the Stack.
TB: A sensor that dynamically specifies the block in the Stack which together
with all blocks under it is already positioned correctly ("top correct
block").
NN: A sensor that dynamically specifies the block which must be stacked
immediately on top of TB according to the goal ("next needed block").
MS: A move-to-stack operator with arity one which moves a block from the
Table on top of the Stack.
MT: A move-to-table operator with arity one which moves a block from the
top of the Stack to the Table.
NOT: A boolean operator with arity one switching the truth value of its
argument.
EQ: A boolean operator with arity two which returns true if its arguments
are identical and false otherwise.
DU: A user-defined iterative "do-until" operator with arity two. The expression
(DU *Work* *Predicate*) causes *Work* to be iteratively executed
until *Predicate* is satisfied.
All functions have defined outputs for all conditions: MS and MT change
the Table and Stack as side-effect. They return true if the operator can be
applied successfully and nil (false) otherwise. The return value of DU is also
a boolean value, indicating whether *Predicate* is satisfied or whether the
DU operator timed out.
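These primitives can be simulated in a few lines of Python (a hedged re-implementation sketch, not Koza's code; the state representation with a list for the single stack and a set for the table is our own assumption). Running the first correct program of fig. 6.4, (EQ (DU (MT CS) (NOT CS)) (DU (MS NN) (NOT NN))), first empties the stack and then builds the goal tower:

```python
GOAL = list("UNIVERSAL")                    # goal tower, bottom to top

class World:
    """Assumed state model: the single stack as a list (bottom first),
    the remaining blocks as a set lying on the table."""
    def __init__(self, stack, table):
        self.stack, self.table = list(stack), set(table)

    def correct_prefix(self):               # length of correctly built prefix
        n = 0
        while n < len(self.stack) and self.stack[n] == GOAL[n]:
            n += 1
        return n

    def cs(self):                           # sensor CS: top block of the stack
        return self.stack[-1] if self.stack else None

    def nn(self):                           # sensor NN: next needed block
        n = self.correct_prefix()
        if n == len(self.stack) and n < len(GOAL):
            return GOAL[n]
        return None

    def mt(self):                           # operator MT: top of stack -> table
        if self.stack:
            self.table.add(self.stack.pop())
            return True
        return False

    def ms(self, block):                    # operator MS: block -> top of stack
        if block in self.table:
            self.table.remove(block)
            self.stack.append(block)
            return True
        return False

def du(work, pred, limit=100):              # DU: do work until pred, with time-out
    for _ in range(limit):
        if pred():
            return True
        work()
    return pred()

w = World(stack="NA", table=set("UIVERSL"))
du(lambda: w.mt(), lambda: w.cs() is None)        # (DU (MT CS) (NOT CS))
du(lambda: w.ms(w.nn()), lambda: w.nn() is None)  # (DU (MS NN) (NOT NN))
# w.stack is now the goal tower
```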
For this example, the set of terminals and the set of functions are carefully
crafted. Especially the pre-defined sensors carry exactly that information
which is relevant for solving the problem! The problem is more restricted than
the classical blocks-world problem: While in the blocks-world any constellation
of blocks (for example, four stacks containing two blocks each and one
with only one block) is possible (see tab. 2.3 for the growth of the number of
states in dependence of the number of blocks), in the block-stacking problem
only one stack is allowed and all other blocks are lying on the table.
For the evolution of the desired program 166 fitness cases were constructed:
ten cases where zero to all nine blocks in the stack were already
ordered correctly; eight cases where there is one out-of-order block on top of
the stack; and a random sampling of 148 additional cases. Fitness was measured
as the number of correctly handled cases. Koza (1992) reports three
variants for the synthesis of a block stacking program, summarized in figure
6.4. The first, correct program first moves all blocks onto the table and then
constructs the correct stack. This program is not very efficient because unnecessary
moves from the stack to the table are made for partially correct
stacks. Over all 166 cases, this function generates 2319 moves in contrast to
1642 necessary moves. In the next trial, efficiency was integrated into the
fitness measure and as a consequence, a function calculating only the minimum
number of moves emerged. But this function has an outer loop which is
not necessary. By integrating parsimony into the fitness measure, the correct,
efficient, and parsimonious function is generated.
Population Size: M = 500
Fitness Cases: 166

Correct Program:
Fitness: 166 - number of correctly handled cases. Termination: Generation 10.
(EQ (DU (MT CS) (NOT CS))
    (DU (MS NN) (NOT NN)))

Correct and Efficient Program:
Fitness: 0.75 * C + 0.25 * E with
C = (number of correctly handled cases / 166) * 100, E = f(n) as a function of the
total number n of moves over all 166 cases:
f(n) = 100 for the analytically obtained minimal number of moves for a
correct program (min = 1641);
f(n) linearly scaled upwards for zero moves up to 1640 moves with f(0) = 0;
f(n) linearly scaled downwards for 1642 moves up to 2319 moves (obtained by
the first correct program) and f(n) = 0 for n > 2319.
Termination: Generation 11.
(DU (EQ (DU (MT CS) (EQ CS TB))
        (DU (MS NN) (NOT NN)))
    (NOT NN))

Correct, Efficient, and Parsimonious Program:
Fitness: 0.70 * C + 0.20 * E + 0.10 * (number of nodes in program tree).
Termination: Generation 1.
(EQ (DU (MT CS) (EQ CS TB))
    (DU (MS NN) (NOT NN)))

Fig. 6.4. Resulting Programs for the Block Stacking Problem (Koza, 1992,
chap. 18.1)
As an example for synthesizing a recursive program, Koza (1992, chap. 18.3)
shows how a function calculating Fibonacci numbers can be generated from
examples for the first twenty numbers. The problem is treated as a sequence
induction problem: for an ascending order of index positions J, the function
f(J) must be detected. As terminal set, T = {J, 0, 1, 2, 3} is given, and
as function set, F = {+, -, *, SRF}. Terminal J represents the input value.
Function SRF is a "sequence referencing function": (SRF K D) returns
either the Fibonacci number for K or a default value D. The strategy used is
that the Fibonacci numbers are calculated for K = 0 ... 20 in ascending order
and each result is stored in a table. Thus, SRF has access to the Fibonacci
numbers for all K which were already calculated, returning the Fibonacci
number for K <= J - 1 and default value D otherwise. For a population size
of 2000 programs, in generation 22 the program given in table 6.6 emerged.
A re-implementation in Lisp together with a simplified version is given in
appendix C.1. Function SRF realizes recursion in an indirect way: each
new element in the sequence is defined recursively in terms of one or more
previous elements.
Table 6.6. Calculating the Fibonacci Sequence

(+ (SRF (- J 2) 0)
   (SRF (+ (+ (- J 2) 0) (SRF (- J J) 0))
        (SRF (SRF 3 1) 1)))
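The evolved program of table 6.6 can be transcribed to Python and evaluated with the table-filling strategy described above (our own transcription; SRF returns the stored value for 0 <= K <= J-1 and the default otherwise):

```python
def srf(table, k, d):
    """SRF: the already computed table entry for k, or the default d."""
    return table[k] if 0 <= k < len(table) else d

def evolved(j, table):
    # (+ (SRF (- J 2) 0)
    #    (SRF (+ (+ (- J 2) 0) (SRF (- J J) 0))
    #         (SRF (SRF 3 1) 1)))
    return (srf(table, j - 2, 0)
            + srf(table, (j - 2) + 0 + srf(table, j - j, 0),
                  srf(table, srf(table, 3, 1), 1)))

table = []
for j in range(20):                    # indices J = 0 ... 19 in ascending order
    table.append(evolved(j, table))
# table holds the first twenty Fibonacci numbers 1, 1, 2, 3, 5, ...
```

For J >= 2 the expression reduces to fib(J-2) + fib(J-1); the nested defaults make the two base cases come out as 1.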
Other treatments of recursion in genetic programming can be found, for
example, in Olsson (1995) and Yu and Clark (1998).

6.3.2.3 Evaluation of Genetic Programming. Genetic programming is
a model-driven approach to learning (see sect. 6.3.1.1), that is, it starts with a
set of programs (hypotheses); these are checked against the data and modified
accordingly. Because the method relies on (nearly) blind search, the time
effort for coming up with a program solving the desired task is very high.
Program construction by generate-and-test has next to nothing in common
with the way in which a human programmer develops program code, which
(hopefully) is guided by the given (complete or incomplete) specification.
Furthermore, effort and success of program construction are heavily dependent
on the given representation of the problem and of the pre-defined
set of functions and terminals. For the block-stacking example given above,
knowledge about which information in a constellation of blocks is relevant for
choosing an action was explicitly coded in the sensors. In chapter 8, we will
show that our own approach allows inferring which information (predicates)
is relevant for solving a problem.
6.3.3 Inductive Logic Programming

Inductive logic programming (ILP) was named by Muggleton (1991) because
this area of research combines inductive learning and logic programming.
Most work done in this area concerns concept learning, that is, induction
of non-recursive classification rules (see sect. 6.3.1.1). In principle, all ILP
approaches can be applied to learning recursive clauses if it is allowed that
the name of the relation to be learned can be used when constructing the
body which characterizes that relation. But without special
heuristics, neither termination of the learning process nor termination of the
learned clauses can be guaranteed. In the following, we present basic concepts
of ILP, three kinds of learning mechanisms, and biases used in ILP. Finally,
we refer to some ILP systems which address learning of recursive clauses. An
introduction to ILP is given, for example, by Lavrač and Džeroski (1994) and
Muggleton and De Raedt (1994); an overview of program synthesis and ILP
is given by Flener (Flener, 1995; Flener & Yilmaz, 1999).
In the following, we give illustrations of the basic concepts and learning
algorithms using a simple classification problem presented, for example, in
Lavrač and Džeroski (1994). The concept to be learned is the relation
daughter(X,Y) (see tab. 6.7).
Table 6.7. Learning the daughter Relation

Training Examples:
E+ = {e1 = daughter(mary, ann), e2 = daughter(eve, tom)}
E- = {daughter(tom, ann), daughter(eve, ann)}

Background Knowledge:
B = {parent(ann, mary), parent(ann, tom), parent(tom, eve), parent(tom, ian),
female(ann), female(mary), female(eve)}

Clause to be learned: (learning methods described below)
daughter(X,Y) <- female(X), parent(Y,X)
6.3.3.1 Basic Concepts. In ILP, examples and hypotheses are represented
as subsets of first-order logic such as definite Horn clauses with some additional
restrictions (see discussion of biases below). Definite Horn clauses have
the form T ∨ ¬L1 ∨ ... ∨ ¬Ln, or in Prolog notation T <- L1, ..., Ln. The
positive literal T is called the head of the clause and represents the relation
to be learned. Often, a clause is represented as a set of literals.
Examples typically can be divided into positive and negative examples.
In addition to the examples, background knowledge can be provided (see
sect. 6.3.1.1, table 6.2). The goal of induction is to obtain a hypothesis which,
together with the background knowledge, is complete and consistent with
respect to the examples:
Definition 6.3.6 (Complete and Consistent Hypotheses). Given a set
of training examples E = E+ ∪ E-, with E+ as positive examples and E- as
negative examples, together with background knowledge B and a hypothesis
H, let covers(B ∪ H, E) be a function which returns all elements of E which
follow from B ∪ H. The following conditions must hold:
Completeness: covers(B ∪ H, E+) = E+.
Each positive example in E+ is covered by hypothesis H together with
the background knowledge B. In other words, the positive examples follow
from hypothesis and background knowledge (B ∧ H ⊨ E+).
Consistency: covers(B ∪ H, E-) = ∅.
No negative example in E- is covered by hypothesis H together with the
background knowledge B. In other words, no negative example follows
from hypothesis and background knowledge (B ∧ H ⊭ E-).
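For the daughter example, the covers check of Definition 6.3.6 can be sketched directly (our own Python illustration with ground background facts and a single non-recursive hypothesis clause):

```python
# Background knowledge B as a set of ground facts (from table 6.7).
B = {("parent", "ann", "mary"), ("parent", "ann", "tom"),
     ("parent", "tom", "eve"), ("parent", "tom", "ian"),
     ("female", "ann"), ("female", "mary"), ("female", "eve")}

# Hypothesis H: daughter(X, Y) <- female(X), parent(Y, X)
def covered(example):
    """Does the example follow from B together with H?"""
    _, x, y = example
    return ("female", x) in B and ("parent", y, x) in B

E_pos = [("daughter", "mary", "ann"), ("daughter", "eve", "tom")]
E_neg = [("daughter", "tom", "ann"), ("daughter", "eve", "ann")]

complete = all(covered(e) for e in E_pos)        # covers(B u H, E+) = E+
consistent = not any(covered(e) for e in E_neg)  # covers(B u H, E-) = empty
```

The target clause of table 6.7 is both complete and consistent with respect to the examples.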
In logic programming (Sterling & Shapiro, 1986), SLD-resolution can be
used to check for each example in E whether it is entailed by the hypothesis
together with the background knowledge. An additional requirement is that
the positive examples do not already follow from the background knowledge
alone. In this case, learning (construction of a hypothesis) is not necessary ("prior
necessity", Muggleton & De Raedt, 1994).
All possible hypotheses, that is, all definite Horn clauses with the relation
to be learned as head which can be constructed from the examples
and the background knowledge, constitute the so-called hypothesis space or
search space for induction. For a systematic search of the hypothesis space,
it is useful to introduce a partial order over clauses, based on θ-subsumption
(Plotkin, 1969):

Definition 6.3.7 (θ-Subsumption). Let c and c' be two program clauses.
Clause c θ-subsumes clause c', if there exists a substitution θ such that
cθ ⊆ c'. Two clauses c and d are θ-subsumption equivalent if c θ-subsumes d
and d θ-subsumes c. A clause is reduced if it is not θ-subsumption equivalent
to any proper subset of itself.
An example is given in figure 6.5.

c = daughter(X,Y) <- parent(Y,X), parent(W,V)
d = daughter(X,Y) <- parent(Y,X)

c and d are θ-subsumption equivalent:
cθ ⊆ d with θ = {W/Y, V/X}, and d ⊆ c.
c is not reduced (d is a proper subset of c).
d is reduced.

Fig. 6.5. θ-Subsumption Equivalence and Reduced Clauses
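A θ-subsumption test can be sketched by brute-force search over candidate substitutions (our own Python illustration; clauses are modeled as sets of literal tuples, variables as capitalized strings, and head and body literals are treated uniformly, which is an assumption of this sketch):

```python
from itertools import product

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def apply_sub(lit, sub):
    """Apply a substitution to a literal tuple (predicate, arg1, ...)."""
    return (lit[0],) + tuple(sub.get(a, a) for a in lit[1:])

def theta_subsumes(c, c_prime):
    """Does some substitution theta make c*theta a subset of c_prime?
    Naive search: map each variable of c to some term occurring in c_prime."""
    vars_c = sorted({a for lit in c for a in lit[1:] if is_var(a)})
    terms = {a for lit in c_prime for a in lit[1:]}
    cp = set(c_prime)
    for values in product(terms, repeat=len(vars_c)):
        sub = dict(zip(vars_c, values))
        if all(apply_sub(lit, sub) in cp for lit in c):
            return True
    return False

# The clauses of fig. 6.5, as sets of literals:
c = {("daughter", "X", "Y"), ("parent", "Y", "X"), ("parent", "W", "V")}
d = {("daughter", "X", "Y"), ("parent", "Y", "X")}
equivalent = theta_subsumes(c, d) and theta_subsumes(d, c)
```

The substitution {W/Y, V/X} maps c onto d, and d is a subset of c, so the two clauses come out θ-subsumption equivalent.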
Based on θ-subsumption, we can introduce a syntactic notion of generality:
If cθ ⊆ c', then c is at least as general as c', written c ≤ c'. If c ≤ c' holds
and c' ≤ c does not hold, c is a generalization of c' and c' is a specialization
(refinement) of c. The relation ≤ introduces a lattice on the set of reduced
clauses. That is, any two clauses have a least upper bound and a greatest
lower bound which are unique except for variable renaming (see figure 6.6).
Based on θ-subsumption, the least general generalization of two clauses
can be defined (Plotkin, 1969)^9:

Definition 6.3.8 (Least General Generalization). The least general generalization
(lgg) of two reduced clauses c and c', written lgg(c, c'), is the least
upper bound of c and c' in the θ-subsumption lattice.
Note that calculating the resolvent of two clauses is the inverse process
of calculating an lgg: When calculating an lgg, two clauses are generalized
by keeping their common subset of literals (equivalent up to θ-substitution). When
calculating the resolvent, two clauses are integrated into one more special
clause where variables occurring in the given clauses might be replaced by
terms (see description of resolution in sect. 2.2.2 in chap. 2 and above in
this chapter). With θ-subsumption as the central concept for ILP we have
the basis for characterizing ILP learning techniques as either bottom-up construction
of least general generalizations or top-down search for refinements
(see fig. 6.6).
Fig. 6.6. θ-Subsumption Lattice. The least upper bound c = lgg(c1, c2) is reached
from c1 and c2 by bottom-up generalization; the greatest lower bound, the resolvent
of (c1, c2), is reached by top-down refinement.
6.3.3.2 Basic Learning Techniques. In the following we present two techniques
for bottom-up generalization, calculating relative lggs and inverse
resolution, and one technique for top-down search in a refinement graph.

Relative Least General Generalization. Relative least general generalization
is an extension of lggs with respect to background knowledge, for example
used in Golem (Muggleton & Feng, 1990).

Definition 6.3.9 (Relative Least General Generalization). The relative
least general generalization (rlgg) of two clauses c1 and c2 is their
least general generalization lgg(c1, c2) relative to the background knowledge
B: rlgg(c1, c2) = lgg(c1 <- B, c2 <- B).

^9 We will come back to this concept applied to terms (instead of logical formulae)
under the name of first-order anti-unification in chapter 7 and in part III.
In the ILP literature, enriching clauses with background knowledge is
discussed under the name of saturation (Rouveirol, 1991).
An algorithm for calculating lggs of clauses is, for example, given in
Plotkin (1969). In table 6.8 we demonstrate calculating rlggs with the daughter
example (tab. 6.7). The rlgg of the two positive examples is calculated
by determining the lgg between Horn clauses where the examples are consequences
of the background knowledge.

Table 6.8. Calculating an rlgg

rlgg(daughter(mary,ann), daughter(eve,tom)) =
lgg(daughter(mary,ann) <- B, daughter(eve,tom) <- B) =
daughter(X,Y) <- female(X), parent(Y,X)
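The core of lgg computation, Plotkin-style anti-unification of two atoms, can be sketched as follows (our own Python illustration; the variable naming V0, V1, ... is our own convention): matching function symbols are kept, disagreeing pairs of terms are replaced by variables, and the same pair always by the same variable.

```python
from itertools import count

def lgg_terms(s, t, mapping, counter):
    # same functor and arity: keep the symbol, generalize the arguments
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        return (s[0],) + tuple(lgg_terms(a, b, mapping, counter)
                               for a, b in zip(s[1:], t[1:]))
    if s == t:                     # identical terms stay as they are
        return s
    if (s, t) not in mapping:      # each disagreeing pair gets one variable
        mapping[(s, t)] = f"V{next(counter)}"
    return mapping[(s, t)]

def lgg_atoms(a, b):
    return lgg_terms(a, b, {}, count())

g = lgg_atoms(("daughter", "mary", "ann"), ("daughter", "eve", "tom"))
# g == ("daughter", "V0", "V1"), the head of the rlgg in table 6.8
```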
Inverse Resolution. Inverse resolution is a generalization technique based on
inverting the resolution rule of deductive inference. Inverse resolution requires
inverse substitution:

Definition 6.3.10 (Inverse substitution). For a logical formula W, an
inverse substitution θ⁻¹ of a substitution θ is a function that maps terms in
Wθ to variables, such that Wθθ⁻¹ = W.
Again, we do not present the algorithm, but illustrate inverse resolution
with the daughter example introduced in table 6.7 (see fig. 6.7). We
restrict our positive examples to e1 = daughter(mary, ann) and background
knowledge to b1 = female(mary) and b2 = parent(ann, mary). Working with
bottom-up generalization, the initial hypothesis is given as the first positive
example. The learning process starts, for example, with clauses e1 and b2.
Inverse resolution attempts to find a clause c1 such that c1 together with b2
entails e1. If such a clause can be found, it becomes the current hypothesis
and a further clause from the set of positive examples or the background
knowledge is considered. The current hypothesis is attempted to be inversely
resolved with this new clause, and so on. The inverse resolution operator used
in the example is called absorption or the V operator.
In general, inverse resolution involves backtracking. In systems working
with inverse resolution, it is sometimes necessary to introduce new predicates
in the hypothesis for constructing a complete and consistent hypothesis. The
operator realizing such a necessary predicate invention is called the W operator
(Muggleton, 1994). Inverse resolution is for example used in Marvin (Sammut
& Banerji, 1986).

Search in the Refinement Graph. Alternatively to bottom-up generalization,
hypotheses can be constructed by top-down refinement. The step from a
general to a more specific hypothesis is realized by a so-called refinement
operator:
e1 = daughter(mary,ann)    b2 = parent(ann,mary)    θ1⁻¹ = {ann/Y}
c1 = daughter(mary,Y) <- parent(Y,mary)    b1 = female(mary)    θ2⁻¹ = {mary/X}
c2 = daughter(X,Y) <- female(X), parent(Y,X)

Fig. 6.7. An Inverse Linear Derivation Tree (Lavrač and Džeroski, 1994, pp. 46)
Definition 6.3.11 (Refinement Operator). Given a hypothesis language
L (e. g., definite Horn clauses), a refinement operator ρ maps a clause c to
a set of clauses ρ(c) = {c' | c' ∈ L, c < c'}.

Typically, a refinement operator computes so-called "most general specializations"
under θ-subsumption by performing one of the following syntactic
operations:
apply a substitution θ to c,
add a literal to the body of c.
Learning as search in refinement graphs was first introduced by Shapiro (1983) in the interactive system MIS (Model Inference System). An outline of the MIS algorithm is given in table 6.9.
Table 6.9. Simplified MIS Algorithm (Lavrač and Džeroski, 1994, p. 54)
Initialize hypothesis H to a (possibly empty) set of clauses in L.
REPEAT
  Read the next (positive or negative) example.
  REPEAT
    IF there exists a covered negative example e
    THEN delete incorrect clauses from H.
    IF there exists a positive example e not covered by H
    THEN with breadth-first search of the refinement graph, develop a
      clause which covers e and add it to H.
  UNTIL H is complete and consistent.
  Output: Hypothesis H.
FOREVER.
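The loop above can be rendered schematically in code. The following sketch is ours: it assumes caller-supplied `covers` and `refine` procedures (both names are assumptions) and omits MIS's oracle queries and backtracking.

```python
# A schematic rendering of the simplified MIS loop; `covers(H, e)` and
# `refine(clause)` are assumed to be supplied by the caller (a real MIS
# additionally queries an oracle and can backtrack).
from collections import deque

def mis(examples, covers, refine, start_clause):
    hypothesis, seen = [], []
    for label, ex in examples:                       # read next example
        seen.append((label, ex))
        while True:
            # delete incorrect clauses: those covering a negative example
            hypothesis = [c for c in hypothesis
                          if not any(l == '-' and covers([c], e) for l, e in seen)]
            uncovered = [e for l, e in seen
                         if l == '+' and not covers(hypothesis, e)]
            if not uncovered:
                break                                # complete and consistent
            target, found = uncovered[0], False
            queue = deque([start_clause])            # breadth-first refinement
            while queue:
                c = queue.popleft()
                if covers([c], target) and not any(
                        l == '-' and covers([c], e) for l, e in seen):
                    hypothesis.append(c)
                    found = True
                    break
                queue.extend(refine(c))
            if not found:
                break                                # no suitable refinement
    return hypothesis
```

As a toy instance, clauses can be modeled as integer intervals refined by halving: with positives 1 and 2 and negative 5, the loop first adopts the over-general interval (0, 7) and, once the negative example arrives, replaces it by (0, 3).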
Again, we illustrate the algorithm with the daughter example (tab. 6.7). A part of the breadth-first search tree (see chap. 2) is given in figure 6.8. The root node of a refinement graph is formally the most general clause, that is, the empty clause (false). Typically, a refinement algorithm starts with the most general definition of the goal relation; for our example this is H = {c} with c = daughter(X, Y) ← true. This hypothesis covers both positive training examples, that is, it need not be modified if these examples are presented initially. If a negative example (e.g., e3 = daughter(eve, tom)) is presented, the hypothesis needs to be modified because c covers the negative example, too. Therefore, a clause covering the positive examples but not covering e3 must be constructed. The set of all possible (minimal) specializations of c is calculated as ρ(c) = {daughter(X, Y) ← L}. The newly introduced literal L is either a literal which has the variables used in the clausal head as arguments or a literal introducing a new variable. The predicate names of the literals can be obtained from the background knowledge; additional built-in predicates (such as equality or inequality) can be used.
Already for this small example, the search space becomes rather large: L can be
X = Y, or female(X), or female(Y), or parent(X, X), or parent(X, Y), or parent(Y, X), or parent(Y, Y), or parent(X, Z), or parent(Z, X), or parent(Y, Z), or parent(Z, Y), with Z ≠ X and Z ≠ Y.
Note that in this example the language is restricted such that literals with the goal predicate daughter are not considered in the body of the clause, and therefore no recursive clauses are generated.
The refinement c′ = daughter(X, Y) ← female(X) covers the positive examples and does not cover e3. Therefore, it becomes the new hypothesis. If another negative example, such as e4 = daughter(eve, ann), is presented, the hypothesis again must be modified, because c′ covers this example, and so on.
daughter(X, Y) ←
  daughter(X, Y) ← X = Y   daughter(X, Y) ← female(X)   daughter(X, Y) ← parent(Y, X)   ...
    daughter(X, Y) ← female(X), female(Y)   daughter(X, Y) ← female(X), parent(Y, X)   ...
Fig. 6.8. Part of a Refinement Graph (Lavrač and Džeroski, 1994, p. 56)
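The literal-adding specialization step for the daughter example can be sketched as follows. The encoding of clauses, the restriction that the fresh variable Z must co-occur with a head variable, and the omission of equality literals (X = Y) are all our simplifications.

```python
# A sketch of the literal-adding refinement step for the daughter example.
# Clauses are (head, body) with atoms as (predicate, args); equality
# literals are omitted for brevity, and a fresh variable 'Z' is only used
# together with a head variable (both simplifications are ours).
from itertools import product

def refinements(clause, predicates):
    head, body = clause
    vars_ = list(head[1]) + ['Z']            # head variables plus one new one
    result = []
    for pred, arity in predicates:
        for args in product(vars_, repeat=arity):
            if 'Z' in args and not (set(args) & set(head[1])):
                continue                      # Z must co-occur with X or Y
            lit = (pred, args)
            if lit not in body:
                result.append((head, body + [lit]))
    return result

c = (('daughter', ('X', 'Y')), [])
rho = refinements(c, [('female', 1), ('parent', 2)])
# rho contains e.g. daughter(X,Y) <- female(X) and the eight parent variants
```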
A current system based on top-down refinement is Foil (Quinlan, 1990).
6.3.3.3 Declarative Biases. The most general setting for learning hypotheses in an ILP framework would be to allow the hypothesis to be represented as an arbitrary set of definite Horn clauses, that is, any legal Prolog program. Because ILP techniques depend heavily on search in the hypothesis space, typically much more restricted hypothesis languages are considered. In the following, we present such language restrictions, also called syntactic biases (see sect. 6.3.1.1). Additionally, semantic biases can be introduced to restrict the relations to be learned.
Syntactic Biases. One possibility to restrict the search space is to provide second-order schemes which represent the form of the searched-for hypotheses (Flener, 1995). A second-order scheme is a clause with existentially quantified predicate variables. An example scheme and an instantiation of such a scheme are (Muggleton & De Raedt, 1994, p. 656):
Scheme: ∃p, q, r : p(X, Y) ← q(X, XW), q(YW, Y), r(XW, YW).
Instantiation: connected(X, Y) ← partof(X, XW), partof(Y, YW), touches(XW, YW).
A typical syntactic bias is to restrict search to linked clauses:
Definition 6.3.12 (Linked Clause). A clause c is linked if all of its variables are linked. A variable V is linked in a clause c iff V occurs in the head of c, or there exists a literal l in c that contains variables V and W, W ≠ V, and W is linked in c.
The scheme given above represents a linked clause.
Many ILP systems provide a number of parameters which can be set by the user. These parameters represent biases, and by assigning different values to them, so-called bias shifts can be realized. Two examples of such parameters are the depth of terms and the level of terms.
Definition 6.3.13 (Depth of a Term). The depth d(V) of a variable is 0. The depth d(c) of a constant is 1. The depth d(f(t1, ..., tn)) of a term is 1 + max({d(ti)}).
Definition 6.3.14 (Level of a Term). The level l(t) of a term t in a linked clause c is 0 if t occurs as an argument in the head of c, and 1 + min({l(s)}) where s and t occur as arguments in the same literal of c.
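The depth measure of definition 6.3.13 is easy to transcribe. The following sketch uses an encoding of terms that is our own: variables as plain strings, constants tagged 'const', and compound terms as (functor, subterm-list) pairs.

```python
# Term depth as in definition 6.3.13; the term encoding is our own:
# variables are plain strings, constants are ('const', name),
# compound terms are (functor, [subterms]).

def depth(t):
    if isinstance(t, str):                      # d(V) = 0 for a variable
        return 0
    tag, rest = t
    if tag == 'const':                          # d(c) = 1 for a constant
        return 1
    return 1 + max(depth(s) for s in rest)      # d(f(t1,...,tn)) = 1 + max d(ti)

# depth of f(X, g(c)) is 1 + max(0, 1 + 1) = 3
```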
In the scheme given above, X and Y have depth 1 and XW and YW have depth 2. For linked languages with maximal depth 1 and level > 1, the number of literals in the body of the clause can grow exponentially with its level (Muggleton & De Raedt, 1994).
Semantic Biases. A typical semantic bias is to provide information about the modes and types of a clause to be learned. This information can be provided by specifying declarations of predicates in the background knowledge. An example of specifying modes and types for append is given in figure 6.9: If the first two arguments of append are instantiated (indicated by +list), the predicate succeeds once (1); if the last argument is instantiated, it can succeed finitely many times (*). The predicate is restricted to lists of integers. It is obvious that such a semantic bias can reduce the search effort considerably.
mode(1, append(+list,+list,-list))
mode(*, append(-list,-list,+list))
list(nil)
list([X|T]) ← integer(X), list(T)
Fig. 6.9. Specifying Modes and Types for Predicates
Another semantic bias, for example used in Foil (Quinlan, 1990) and Golem (Muggleton & Feng, 1990), is the notion of determinate clauses.
Definition 6.3.15 (Determinate Clauses). A definite clause h ← l1, ..., ln is determinate (with respect to background knowledge B and examples E) iff, for every substitution θ for h that unifies h to a ground instance e ∈ E and for all i = 1, ..., n, there is a unique substitution θi such that (l1 ∧ ... ∧ li)θθi is ground and true for B ∧ E⁺.
A clause is determinate if all its literals are determinate, and a literal li is determinate if each of its variables that does not appear in the preceding literals lj, j = 1, ..., i−1, has only one possible instantiation given the instantiations in {lj | j = 1, ..., i−1}. For the daughter example given in table 6.7, the clause daughter(X, Y) ← female(X), parent(X, Y) is determinate: If X is instantiated with mary, then Y can only be instantiated with ann.
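The "unique instantiation" criterion for a single body literal can be tested roughly as follows. The encoding (variables as upper-case strings) is ours; note that with the ground facts of table 6.7 it is the orientation parent(Y, X) that actually matches the fact parent(ann, mary).

```python
# A rough test of determinacy for one body literal against ground
# background facts (encoding ours: variables are upper-case strings,
# constants lower-case).

def instantiations(lit, bindings, facts):
    """All extensions of `bindings` that turn `lit` into a background fact."""
    pred, args = lit
    out = []
    for fpred, fargs in facts:
        if fpred != pred or len(fargs) != len(args):
            continue
        b, ok = dict(bindings), True
        for a, v in zip(args, fargs):
            if a.isupper():                   # variable position
                if b.setdefault(a, v) != v:
                    ok = False
                    break
            elif a != v:                      # constant must match
                ok = False
                break
        if ok:
            out.append(b)
    return out

facts = [('female', ('mary',)), ('parent', ('ann', 'mary'))]
# With X already bound to mary, parent(Y, X) has exactly one instantiation
# (Y = ann), so the literal is determinate in this situation.
```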
Restriction to determinate clauses can reduce the exponential growth of literals in the body of a hypothesis during search. A special case of determinacy is the so-called ij-determinacy. Parameter i represents the maximum depth of variables (def. 6.3.13), parameter j represents the maximal degree of dependency (def. 6.3.14) in determinate clauses (Muggleton & Feng, 1990).
Definition 6.3.16 (ij-Determinacy of Clauses). A clause consisting of a head only is 0j-determinate. A clause h ← l1, ..., lm, lm+1, ..., ln is ij-determinate iff
– h ← l1, ..., lm is (i−1)j-determinate, and
– every literal in lm+1, ..., ln contains only determinate terms and has a degree of at most j.
The clause daughter(X, Y) ← female(X), parent(X, Y) is 11-determinate because all variables in the body appear in the head (i = 1) and the maximal degree is one (j = 1) for variable Y in parent(X, Y), which depends on X.
It can be shown that the restriction to ij-determinate clauses allows for polynomial effort of learning (Muggleton & Feng, 1990).
6.3.3.4 Learning Recursive Prolog Programs. In principle, all ILP systems can be extended to learning recursive clauses, that is, they can be seen as program synthesis systems, if the language bias is relaxed such that the goal predicate is allowed to be used when constructing the body of a hypothesis. For the systems MIS (Shapiro, 1983), Foil (Quinlan, 1990), and Golem (Muggleton & Feng, 1990), for example, application to program synthesis was demonstrated. In the following, we first show how learning recursive clauses is realized with Golem and afterwards introduce some ILP approaches which were proposed explicitly for program synthesis.
Learning reverse with Golem. The system Golem (Muggleton & Feng, 1990) works with bottom-up generalization, calculating relative least general generalizations (see def. 6.3.9). We give an example of how reverse(X, Y) can be induced with Golem. In table 6.10 the necessary background knowledge and the positive and negative examples are presented.
Table 6.10. Background Knowledge and Examples for Learning reverse(X,Y)
%% positive examples
rev([],[]). rev([1],[1]). rev([2],[2]). rev([3],[3]). rev([4],[4]).
rev([1,2],[2,1]). rev([1,3],[3,1]). rev([1,4],[4,1]). rev([2,2],[2,2]).
rev([2,3],[3,2]). rev([2,4],[4,2]). rev([0,1,2],[2,1,0]).
rev([1,2,3],[3,2,1]).
%% negative examples
rev([1],[]). rev([0,1],[0,1]). rev([0,1,2],[2,0,1]).
app([1],[0],[0,1]).
%% background knowledge
!- mode(rev(+,-)).
!- mode(app(+,+,-)).
rev([],[]). ... rev([1,2,3],[3,2,1]). %% all positive examples
app([],[],[]). app([1],[],[1]). app([2],[],[2]). app([3],[],[3]).
app([4],[],[4]). app([],[1],[1]). app([],[2],[2]). app([],[3],[3]).
app([],[4],[4]). app([1],[0],[1,0]). app([2],[1],[2,1]).
app([3],[1],[3,1]). app([4],[1],[4,1]). app([2],[2],[2,2]).
app([3],[2],[3,2]). app([4],[2],[4,2]). app([2,1],[0],[2,1,0]).
app([3,2],[1],[3,2,1]).
As described above, an rlgg is calculated between pairs of clauses of the form e ← B, where e is a positive example and B is the conjunction of clauses from the background knowledge. The "trick" used for learning recursive clauses with an rlgg technique is that the positive examples are additionally made part of the background knowledge. In this way, the name of the goal predicate can be introduced into the body of the hypothesis. The notion of ij-determinacy (see def. 6.3.16) is used to avoid an exponential size explosion of the rlggs.
For the first examples rev([],[]) and rev([1],[1]) the rlgg is:
rev(X, X) ← rev([X], [X]), rev([2], [2]), ...,
    rev([X,2], [2,X]), ..., rev([X,X,2], [2,X,X]),
    app(X, X, X), app([X], [], [X]),
    app([2], [], [2]), ..., app([3,2], [X], [3,2,X]).
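The core operation behind these rlggs is Plotkin's least general generalization (lgg) of two atoms. The following sketch is ours: terms are either strings (constants, nil) or (functor, args) tuples, and the variable naming scheme (V0, V1, ...) is an arbitrary choice.

```python
# A sketch of Plotkin's least general generalization (lgg) of two terms or
# atoms, the core operation behind Golem's rlggs. The term encoding and the
# variable naming scheme are ours.

def lgg(s, t, table=None):
    if table is None:
        table = {}
    if s == t:
        return s
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s[1]) == len(t[1])):
        return (s[0], tuple(lgg(a, b, table) for a, b in zip(s[1], t[1])))
    # incompatible subterms: one variable per distinct (s, t) pair, so that
    # repeated pairs are mapped to the same variable
    return table.setdefault((s, t), f"V{len(table)}")

cons = lambda h, t: ('cons', (h, t))
a1 = ('rev', ('nil', 'nil'))                          # rev([], [])
a2 = ('rev', (cons('1', 'nil'), cons('1', 'nil')))    # rev([1], [1])
# lgg(a1, a2) yields ('rev', ('V0', 'V0')), i.e. rev(X, X) as in the text
```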
This hypothesis has a head which is still too special and a body which contains far too many literals. The head will become more general in the learning process when further positive examples are introduced. The body can be reduced by eliminating literals which cover negative examples. In general, this can be a very time-consuming task because all subsets of the literals in the body must be checked. In Muggleton and Feng (1990) a technique working in O(n²) is described.
For the above example, there remains:
rev(X, X) ← app(X, X, Y).
The finally resulting program is:
rev([A|B], [C|D]) ← rev(B, E),
    app(E, [A], [C|D]).
A trace generated by Golem for this example is given in appendix C.2. Note that no base case is learned, that is, interpretation of rev would not terminate. Furthermore, the sequence of literals in a clause depends on the sequence of literals in the background knowledge. If, in our example, the append literals were presented before the rev literals, the resulting program would have a body where app (append) appears before rev. Using a top-down left-to-right strategy, such as Prolog's, interpretation of the clause would fail.
Instead of presenting the append predicate necessary for inducing reverse by ground instances, a predefined append function can be given as background knowledge. But this can make induction harder, because for calculating an rlgg the background knowledge must be ground. That is, a recursive definition in the background knowledge must be instantiated and unfolded to some fixed depth. Because there might be many possible instantiations, and because the "necessary" depth of unfolding is unknown, the search effort can easily explode. Without a predefined restriction of the unfolding depth, termination cannot be guaranteed.
Special Purpose ILP Synthesis Systems. ILP systems explicitly designed for learning recursive clauses are, for example, TIM (Idestam-Almquist, 1995), Synapse (Flener, Popelinsky, & Stepankova, 1994), and Merlin (Boström, 1996). The hypothesis language of TIM consists of tail recursive clauses together with base clauses; learning is performed again by calculating rlggs. Synapse synthesizes programs which conform to a divide-and-conquer scheme, and it incorporates predicate invention. Merlin learns tail recursive clauses based on a grammar inference technique (see sect. 6.3.1.2). For a given set of positive and negative examples, a deterministic finite automaton is constructed which accepts all positive and no negative examples. This can, for example, be realized using the ID algorithm of Angluin (1981). The automaton is then transformed into a set of Prolog rules. Merlin also uses predicate invention.
6.3.4 Inductive Functional Programming
In contrast to the inductive approaches presented so far, the classical functional approach to inductive program synthesis is based on a two-step process:
Rewriting I/O examples into constructive expressions: Example computations, presented as input/output examples, are transformed into traces and predicates. Each predicate characterizes the structure of one input example. Each trace calculates the corresponding output for a given input example.
Recurrence detection: Regularities are searched for in the set of predicate/trace pairs. These regularities are used for generating a recursive generalization.
The different approaches typically use a simple strategy for the first step which is not discussed in detail. The critical difference between the functional synthesis algorithms is how the second step is realized. Two pioneering approaches are from Summers (1977) and Biermann (1978). The work of Summers was extended by several researchers, for example by Jouannaud and Kodratoff (1979), Wysotzki (1983), and Le Blanc (1994).
In the following, we first present Summers' approach in some detail, because our own work is based on his approach. Afterwards we give a short overview of Biermann's approach, and finally we discuss some extensions of Summers' work.
6.3.4.1 Summers' Recurrence Relation Detection Approach.
Basic Concepts. Input to the synthesis algorithm (called Thesys) is a set of input/output examples E = {e1, ..., ek} with ej = ij → oj, j = 1, ..., k. The output to be constructed is a recursive Lisp function which generalizes over E. The following Lisp primitives¹⁰ are used:
– Atom nil: representing the empty list or the truth-value false.
– Predicate atom(x): which is true if x is an atom.
– Constructor cons(x, l): which inserts an element x in front of l.
– Basic functions: selectors car(x) and cdr(x), which return the first element of a list / a list without its first element; and compositions of selectors, which can be written for short as c⟨x⟩⁺r, x ∈ {a, d}, for instance car(cdr(x)) = cadr(x).
– McCarthy conditional cond(b1, ..., bn): where the bi are pairs of predicates and operations pi(x) → fi(x).
¹⁰ For better readability, we write the operator names followed by comma-separated arguments in parentheses, instead of the usual Lisp notation. That is, for the Lisp expression (cons x l), we write cons(x, l).
Inputs and outputs of the searched-for recursive function are represented as single s-expressions (simple expressions):
Definition 6.3.17 (S-Expression). Atom nil and basic functions are s-expressions. cons(s1, s2) is an s-expression if s1 and s2 are s-expressions.
With Summers' method, Lisp functions with the following underlying basic program scheme can be induced:
Definition 6.3.18 (Basic Program Scheme).
F(x) ← (p1(x) → f1(x),
        ...,
        pk(x) → fk(x),
        T → C(F(b(x)), x))
where:
– p1, ..., pk are predicates of the form atom(bi(x)),
– b, bi are basic functions,
– f1, ..., fk are s-expressions,
– T is the truth-value "true", and C(w, x) is a cons-expression with w occurring exactly once.
This basic program scheme allows for arbitrary linear recursion; more complex forms, such as tree recursion, are out of reach of this method. For illustration we give an example presented in Summers (1977) in figure 6.10. For better readability, we represent inputs and outputs as lists and not as s-expressions. In the following, we describe how the induction of a recursive function is realized by (1) generating traces and (2) detecting regularities (recurrence) in the traces.
Input/Output Examples:
{nil → nil,
 (A) → ((A)),
 (A B) → ((A) (B)),
 (A B C) → ((A) (B) (C))}
Recursive Generalization:
unpack(x) ← (atom(x) → nil,
             T → u(x))
u(x) ← (atom(cdr(x)) → cons(x, nil),
        T → cons(cons(car(x), nil), u(cdr(x))))
Fig. 6.10. Learning Function unpack from Examples
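The induced program can be transliterated directly into executable form. The following Python rendering is ours; Python lists stand in for s-expressions and the empty list for nil, so atom(x) becomes a test for the empty list.

```python
# The recursive generalization unpack from fig. 6.10, transliterated to
# Python; lists stand in for s-expressions, [] for nil.

def unpack(x):
    if not x:                        # atom(x) -> nil
        return []
    return u(x)

def u(x):
    if not x[1:]:                    # atom(cdr(x)) -> cons(x, nil)
        return [x]
    return [[x[0]]] + u(x[1:])       # cons(cons(car(x), nil), u(cdr(x)))

# unpack(['A', 'B', 'C']) evaluates to [['A'], ['B'], ['C']]
```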
Generating Example Traces. In the first step of synthesis, the input/output examples are transformed into a non-recursive program which covers exactly the given examples. This transformation involves:
– constructing a trace fj which calculates oj from ij for all examples ej, j = 1, ..., k,
– determining a predicate which associates each input with the trace calculating the desired output, and
– constructing a program with the following underlying scheme:
F(x) ← (p1(x) → f1(x),
        ...,
        pk(x) → fk(x)).
A trace is a function which calculates an output for a given input. In Summers' approach, such functions are always cons-expressions. For such a cons-expression to exist, all atoms appearing in the output must also appear in the input; that is, it is not possible to generate new symbols in an output. For such a cons-expression to be uniquely determined, each atom appearing in the output must appear exactly once in the input. The reason for this restriction is, as we will see below, that the output is constructed from the syntactical structure of the input and therefore the position of each used atom must be unique.
The algorithm for constructing the traces is given in table 6.11.¹¹ An example is given in figure 6.11.
Table 6.11. Constructing Traces from I/O Examples
Let SE be the set of expressions over basic functions b which describe subexpressions sub(x) of an s-expression x, associated with that function, that is, SE = {(b(x), sub(x))}.
Let x be an example input and y the associated output.
ST(x, y) = b(x)                                  if y ≠ nil and (b(x), y) ∈ SE
           cons(ST(x, car(y)), ST(x, cdr(y)))    otherwise
In the next step, predicates must be determined which characterize the example inputs in such a way that each input can be associated with its correct trace: For the set of predicates it must hold that the pi(xi), i = 1, ..., k, evaluate to T (true) and that the predicates pj(xi), 1 ≤ j < i, evaluate to nil. Because the only predicate used is atom, the inputs are discriminated by their structure only. For catching characteristics of the atoms themselves, "semantic" predicates, such as equal or greater, would be needed.
¹¹ ST stands for "semi-trace". A semi-trace is a trace where the sequence of evaluation of expressions is not fixed.
Sub-expressions of i = (A B) can be described by
SE = {(i, (A B)), (car(i), A), (cdr(i), (B)), (cadr(i), B), (cddr(i), nil)}
For the given examples, the traces constructed with ST are:
nil → nil: nil
(A) → ((A)): cons(x, nil)
(A B) → ((A) (B)): cons(cons(car(x), nil), cons(cdr(x), nil))
(A B C) → ((A) (B) (C)): cons(cons(car(x), nil), cons(cons(cadr(x), nil), cons(cddr(x), nil)))
Fig. 6.11. Traces for the unpack Example
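The ST construction of table 6.11 can be sketched as follows. The encoding is ours: inputs are Python lists, [] plays the role of nil, and selector compositions are kept as strings such as "car(cdr(x))" instead of the abbreviated cadr notation. As Summers requires, each atom is assumed to occur exactly once in the input.

```python
# A sketch of the ST construction from table 6.11 (encoding ours).

def subexprs(x, path='x'):
    """SE: pairs (b(x), sub(x)) of selector expressions and subexpressions."""
    se = [(path, x)]
    if isinstance(x, list) and x:
        se += subexprs(x[0], f"car({path})")
        se += subexprs(x[1:], f"cdr({path})")
    return se

def st(x, y):
    se = {repr(v): k for k, v in subexprs(x)}
    if y != [] and repr(y) in se:                   # y /= nil and (b(x), y) in SE
        return se[repr(y)]
    if not isinstance(y, list) or not y:
        return 'nil'
    return f"cons({st(x, y[0])}, {st(x, y[1:])})"   # cons(ST(x,car(y)), ST(x,cdr(y)))

# For input (A B) and output ((A) (B)), as in fig. 6.11:
# st(['A', 'B'], [['A'], ['B']]) == "cons(cons(car(x), nil), cons(cdr(x), nil))"
```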
The searched-for predicates have the form atom(bi(x)), where bi is a basic function. The algorithm for determining the predicates must construct basic functions such that, if xi < xi+1, bi(xi) is an atom, but bi(xi+1) is not. To decide whether an input is smaller than another input, we need an ordering relation over possible inputs (that is, over s-expressions). In a first step, we reduce the inputs to their structure by replacing all atoms by identical atoms ω:
Definition 6.3.19 (Form of an S-Expression).
form(x) = ω                                      if atom(x) = T
          cons(form(car(x)), form(cdr(x)))       otherwise.
An example is given in table 6.12.
Table 6.12. Calculating the Form of an S-Expression
List (A (B C) D) is represented by
S-Expression (A.((B.(C.nil)).(D.nil))).
form((A.((B.(C.nil)).(D.nil)))) = (ω.((ω.(ω.ω)).(ω.ω)))
Now we can define an order relation over the set of forms of s-expressions:
Definition 6.3.20 (Complete Partial Order over S-Expressions). The complete partial order ≤ over SF (forms of s-expressions) is recursively defined as:
– ∀s ∈ SF: ω ≤ s,
– ∀a, b, c, d ∈ SF: (a.b) ≤ (c.d) ⇔ (a ≤ c) ∧ (b ≤ d).
The complete partial order over s-expressions S is:
∀s, t ∈ S: s ≤ t ⇔ form(s) ≤ form(t).
Note that the order is partial, because not every s-expression can be compared with every other one. For example, it holds neither that (A (B C)) ≤ ((A B) C) nor that ((A B) C) ≤ (A (B C)). For constructing predicates, it must hold that the example inputs can be totally ordered. For the synthesized function it is assumed that its hypothetical (infinite) input domain is totally ordered, too.
The algorithm for constructing predicates is defined as follows:
Definition 6.3.21 (Constructing Predicates). PredGen(xi, xi+1) = PG(xi, xi+1, I), where I is the identity function and
PG(x, y, c) = ∅                                       if atom(y)
              {c}                                     if atom(x) and ¬atom(y)
              PG(car(x), car(y), car ∘ c) ∪
              PG(cdr(x), cdr(y), cdr ∘ c)             otherwise.
For example, PredGen((A B), (A B C)) returns {cddr} and the resulting predicate is atom(cddr(x)).
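Definition 6.3.21 can be transcribed almost literally. In the following sketch (encoding ours) the selector composition c is accumulated as a string, lists model s-expressions, and [] counts as an atom.

```python
# Definition 6.3.21 transcribed: PG accumulates the selector composition as
# a string (our encoding); lists model s-expressions, [] counts as an atom.

def atom(x):
    return not isinstance(x, list) or x == []

def pg(x, y, c='x'):
    if atom(y):
        return set()                                  # empty set if atom(y)
    if atom(x):
        return {c}                                    # {c} if atom(x), not atom(y)
    return (pg(x[0], y[0], f"car({c})")
            | pg(x[1:], y[1:], f"cdr({c})"))          # union of car/cdr descents

def predgen(x_i, x_next):
    return pg(x_i, x_next)                            # c starts as the identity

# predgen(['A', 'B'], ['A', 'B', 'C']) == {'cdr(cdr(x))'}, i.e. atom(cddr(x))
```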
The resulting non-recursive function which covers the set of input examples is given in figure 6.12.
F_L(x) ← (atom(x) → nil,
          atom(cdr(x)) → cons(x, nil),
          atom(cddr(x)) → cons(cons(car(x), nil), cons(cdr(x), nil)),
          T → cons(cons(car(x), nil), cons(cons(cadr(x), nil), cons(cddr(x), nil))))
Fig. 6.12. Result of the First Synthesis Step for unpack
Recurrence Detection. The second step of synthesis is to generalize over the given set of examples. This can be done with a purely syntactical approach by pattern matching: The finite program composed from the traces is checked for regularities between pairs of traces. This is done by determining the differences between fi(x) and fi+1(x) and checking whether a recurrence relation fi+1(x) = C(fi(b(x)), x) holds for all examples. The differences and the resulting recurrence relation for the unpack example are given in figure 6.13.
In Summers (1977) no algorithm for calculating the regularities is presented, except that he mentions that the algorithm is similar to unification algorithms. A detailed description of a method for recurrence detection in finite program terms is presented in chapter 7.
From a recurrence relation which holds for a set of examples for indices k = s, ..., f, where s and f are constants, it is inductively generalized that this relation holds for the complete domain k = s, ..., n with n → ∞. Summers presented a theoretical foundation for this generalization in the form of a set of synthesis theorems. In the following, we present the central aspects of Summers' theory.
Synthesis Theorems.
Differences:
f2(x) = cons(x, f1(x))
f3(x) = cons(cons(car(x), nil), f2(cdr(x)))
f4(x) = cons(cons(car(x), nil), f3(cdr(x)))
p2(x) = p1(cdr(x))
p3(x) = p2(cdr(x))
p4(x) = p3(cdr(x))
Recurrence Relation:
f1(x) = nil
f2(x) = cons(x, f1(x))
fk+1(x) = cons(cons(car(x), nil), fk(cdr(x))) for k = 2, 3
p1(x) = atom(x)
pk+1(x) = pk(cdr(x)) for k = 1, 2, 3
Fig. 6.13. Recurrence Relation for unpack
Definition 6.3.22 (Approximation of a Function). The k-th approximation Fk(x) of a function F(x) is defined as
Fk(x) ← (p1(x) → f1(x),
         ...
         pk−1(x) → fk−1(x),
         T → Ω)
where Ω is undefined.
That is, the finite function F(x) constructed over a set of examples, which is undefined for all inputs with pk−1(x) = nil for all k > f, where f is a constant, can be seen as the k-th approximation of the function F(x) to be induced.
Definition 6.3.23 (Recurrence Relation). If there exist an initial point j and an interval n with 1 ≤ j < k such that, in a k-th approximation Fk(x) defined by a set of examples, the following fragment differences exist:
fj+n(x) = C(fj(b1(x)), x),
fj+n+1(x) = C(fj+1(b1(x)), x),
...
fk−1(x) = C(fk−n−1(b1(x)), x),
and such that the following predicate differences exist:
pj+n(x) = pj(b2(x)),
pj+n+1(x) = pj+1(b2(x)),
...
pk−1(x) = pk−n−1(b2(x)),
then we define the functional recurrence relation for the examples to be
f1(x), ..., fj(x), ..., fj+n−1(x); fi+n(x) = C(fi(b1(x)), x) for j ≤ i ≤ k − n − 1,
and we define the predicate recurrence relation for the examples to be
p1(x), ..., pj(x), ..., pj+n−1(x); pi+n(x) = pi(b2(x)) for j ≤ i ≤ k − n − 1.
The index j gives the first position in the set of ordered traces where the recurrence relation holds. All traces with smaller indices are kept as explicit predicate/function pairs. For the unpack example, j = 2 holds. The index n gives a fixed interval for the recurrence, covering the fact that, in general, recurrence can occur not only between immediately succeeding traces. For the unpack example, n = 1 holds.
Definition 6.3.24 (Induction). If a functional and a predicate recurrence relation exist for a set of examples such that k − j ≥ 2n, then we inductively infer that these relationships hold for all i ≥ j.
The condition k − j ≥ 2n ensures that at least one regularity (between two traces) can be found in the examples. We will see in chapter 7 that, when considering more general recursive schemes, at least four traces from which regularities can be extracted are necessary.
Applying the recurrence relation induced from the examples, the set of examples can be inductively extended to further approximations of the searched-for function F. The m-th approximation, with m ≥ j, has the form:
Fm(x) ← (p1(x) → f1(x),
         ...
         pk(x) → fk(x),
         pk+1(x) → fk+1(x),
         ...
         pm(x) → fm(x),
         T → Ω).
Lemma 6.3.1 (Order of Approximations). The set of approximating functions, if it exists, is a chain with partial order ⊑F, where F(x) ⊑F G(x) holds if it is true that, for all x for which F(x) is defined, F(x) = G(x).
Proof (Order of Approximations). We define Dm = {x | p1(x)} ∪ ... ∪ {x | pm(x)} to be the domain of the approximating function Fm(x). From this it is clear that Fi(x) ⊑F Fi+1(x) for all i, since Di is a subset of Di+1 (Summers, 1977, p. 167).
Now the searched-for function F(x), which was specified by examples, can be defined as the supremum of a chain of approximations:
Definition 6.3.25 (Supremum of a Chain of Approximations). If a set of examples defines the chain ⟨{Fm(x)}, ⊑F⟩, the searched-for function F(x) defined by the examples is sup{Fm(x)}, or the limit of Fm(x) for m → ∞.
The concepts introduced correspond to the concepts used in the fixpoint theory of the semantics of recursive functions (see appendix B.1 and sect. 7.2 in chap. 7). Summers' basic synthesis theorem represents the converse of this problem, that is, to find a recursive program from a recurrence relation characterization of a partial function. In other words, the synthesis problem is to induce a folding from a set of traces considered as unfoldings of some unknown recursive function.
Theorem 6.3.3 (Basic Synthesis). If a set of examples defines F(x) with recurrence relations f1(x), ..., fn(x), fi+n(x) = C(fi(b(x)), x) and p1(x), ..., pn(x), pi+n(x) = pi(b(x)) for i ≥ 1, then F(x) is equivalent to the following recursive program:
F(x) ← (p1(x) → f1(x),
        ...,
        pn(x) → fn(x),
        T → C(F(b(x)), x)).
Proof: see Summers (1977, p. 168).
This basic synthesis theorem provides the foundation for the synthesis of recursive programs from a finite set of examples. But it is based on constraints which only allow for very restricted induction. The first restriction is that the recurrence relationships must hold for all traces, not allowing for a finite set of special cases which can be dealt with separately. The second restriction is that the basic function b used in the predicate and functional recurrences must be identical. Summers (1977) presents extensions of the basic theorem which overcome these restrictions.
Summers points out that it would be desirable to give a formal characterization of the class of recursive functions which can be synthesized using this approach, but that such a formal characterization does not exist. In later work, Le Blanc (1994) tried to give such a characterization. In our own work, we give a structural characterization of the class of recursive programs which can be induced from finite programs (see chapter 7). A tight characterization of the class of (semantic) functions which can be synthesized from traces is still missing!
Variable Addition. Summers (1977) presents an extension of his approach for cases where predicate recurrence relations but no functional recurrence relations can be detected in the traces. This problem was later also discussed by Jouannaud and Kodratoff (1979) and Wysotzki (1983). Neither in Summers' nor in later work is it pointed out that this problem occurs exactly in those cases where the underlying recurrence relation corresponds to a simple loop, that is, a tail recursion. All authors propose solutions based on the introduction of additional variables. This is a standard method for transforming linear recursion into tail recursion in program optimization (Field & Harrison, 1988).
We illustrate this problem with the reverse function given in figure 6.14. For the initial traces, all differences have different forms, that is, no recurrence relation can be derived from the examples. The following heuristic can be used: Search for a subexpression a which is identical in all traces. Rewrite the fragments to gi(x, a) = fi(x), abstract to g(x, y), and try again to find recurrence relations.
For the reverse example, a = nil holds. After abstraction, for each pair of traces two differences exist. One of these differences is regular and can be transformed into a recurrence relation. For the gi(x, y), the recursive function G(x, y) can be induced, and the resulting program is F(x) = G(x, a). Variable y plays the role of a collector, in which the resulting output is constructed, starting with the "neutral element" with respect to cons, which is nil. In the terminating case, the current value of y is returned.
Examples:
{nil → nil,
 (A) → (A),
 (A B) → (B A),
 (A B C) → (C B A)}
Traces:
f1(x) = nil
f2(x) = cons(car(x), nil)
f3(x) = cons(cadr(x), cons(car(x), nil))
f4(x) = cons(caddr(x), cons(cadr(x), cons(car(x), nil)))
Differences:
f2(x) = cons(car(x), f1(x))
f3(x) = cons(cadr(x), f2(x))
f4(x) = cons(caddr(x), f3(x))
Variable Introduction:
g1(x, y) = y
g2(x, y) = cons(car(x), y)
g3(x, y) = cons(cadr(x), cons(car(x), y))
g4(x, y) = cons(caddr(x), cons(cadr(x), cons(car(x), y)))
Pairs of Differences: (∗ = anything)
g2(x, y) = {cons(car(x), g1(x, y)), g1(∗, cons(car(x), y))}
g3(x, y) = {cons(cadr(x), g2(x, y)), g2(cdr(x), cons(car(x), y))}
g4(x, y) = {cons(caddr(x), g3(x, y)), g3(cdr(x), cons(car(x), y))}
Recurrence Relation:
fi(x) = gi(x, nil), i ≥ 1
g1(x, y) = y
gk+1(x, y) = gk(cdr(x), cons(car(x), y))
Recursive Program:
reverse(x) ← G(x, nil)
G(x, y) ← (atom(x) → y,
           T → G(cdr(x), cons(car(x), y)))
Fig. 6.14. Traces for the reverse Problem
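The induced program of fig. 6.14 can again be transliterated directly; the following Python rendering is ours, with lists standing in for s-expressions and the collector (accumulator) variable y making the recursion a simple loop.

```python
# The program induced in fig. 6.14, transliterated to Python; y is the
# collector variable introduced during synthesis.

def reverse_(x):
    return g(x, [])                    # reverse(x) <- G(x, nil)

def g(x, y):
    if not x:                          # atom(x) -> y
        return y
    return g(x[1:], [x[0]] + y)        # T -> G(cdr(x), cons(car(x), y))

# reverse_(['A', 'B', 'C']) evaluates to ['C', 'B', 'A']
```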
6.3.4.2 Extensions of Summers' Approach. The most well-known extensions of Summers' approach were presented by Jouannaud, Kodratoff, and colleagues (Kodratoff & Fargues, 1978; Jouannaud & Kodratoff, 1979; Kodratoff, Franova, & Partridge, 1989). In the work of this group, the recursive programs which can be synthesized from traces correspond to a more complex program scheme than that underlying Summers':
Definition 6.3.26 (Extension of Summers' Program Scheme).
F(x) ← f(x, i(x))
f(x, z) ← (p_1(x) → f_1(x, z),
           ...,
           p_k(x) → f_k(x, z),
           T → h(x, f(b(x), g(x, z))))
where:
- b is a basic function,
- i(x) is an initialization function,
- h, g are programs satisfying the scheme.
The greater complexity of the programs that can be synthesized is mainly due to the matching algorithm introduced by this group, called BMWk (for "Boyer, Moore, Wegbreit, Kodratoff"), which determines how many variables must be introduced in the function under construction to obtain regular differences between traces. An overview of this work is given in Smith (1984).
Another extension was proposed by Wysotzki (1983). He reformulated the second step of program synthesis, that is, recurrence detection in finite programs, on a more abstract level: the programs to be synthesized are characterized by term algebras instead of Lisp functions. Our work is based on this extension and presented in detail in chapter 7. An additional contribution of Wysotzki was to demonstrate that induction of recursive programs can be applied to planning problems (such as the clearblock problem presented in sect. 3.1.4.1 in chap. 3). This new area of application becomes possible because, when representing programs over term algebras, the choice of the interpreter function is free. Therefore, function symbols might be interpreted by operations defined in a planning domain (such as put or puttable, see chapter 2). Our work on learning recursive control rules for planning (see chapter 8) is based on Wysotzki's ideas. Later on, Wysotzki (1987) presented an approach to realizing the first step of program synthesis, that is, constructing finite programs, using universal planning. Extensions of these ideas are also presented in chapter 8.
6.3.4.3 Biermann's Function Merging Approach. An approach which realizes the second step of synthesis differently from Summers' was proposed by Biermann (1978). While Summers-like synthesis is based on detecting regularities in differences between pairs of traces, Biermann works with a single trace. A trace is represented by a set of non-recursive function calls, and regularities are searched for in differences between these functions. The basic Lisp operations used for generating traces from input/output examples are the same as in Summers' approach. The program scheme for the functions to be synthesized, called semi-regular Lisp scheme, is:
Definition 6.3.27 (Semi-Regular Lisp Scheme).
F_i(x) ← (p_1(x) → f_1(x),
          ...,
          p_k(x) → f_k(x),
          T → f_{k+1}(x))
where:
- p_1, ..., p_k are predicates of the form atom(b_i(x)),
- b_i is a basic function,
- f_1, ..., f_k have one of the following forms: nil, x, F_j(car(x)), F_j(cdr(x)), cons(F_g(x), F_h(x)),
- and it holds: p_j(x) = atom(b_{j+1}(x)) ⇒ b_{j+1} = b_j ∘ w with w a basic function.

A set of functions corresponding to such schemes is a program, called a regular Lisp program.
In the first step, a semi-trace is calculated from the given example, using the algorithm ST (see tab. 6.11). This semi-trace is then transformed into a trace by assigning a function definition to each sub-expression. An example is given in table 6.13 (see also Smith, 1984; Flener, 1995). Note that the given input/output example represents nested cons-pairs; a cons-pair has the form (a.b).
Table 6.13. Constructing a Regular Lisp Program by Function Merging

Example: ((a.b).(c.d)) → ((d.c).(b.a))
Resulting Semi-Trace: cons(cons(cddr(x), cadr(x)), cons(cdar(x), caar(x)))

Trace:
f_1(x) = cons(f_2(x), f_3(x))
f_2(x) = f_4(cdr(x))
f_3(x) = f_5(car(x))
f_4(x) = cons(f_6(x), f_7(x))
f_5(x) = cons(f_8(x), f_9(x))
f_6(x) = f_10(cdr(x))
f_7(x) = f_11(car(x))
f_8(x) = f_12(cdr(x))
f_9(x) = f_13(car(x))
f_10(x) = f_11(x) = f_12(x) = f_13(x) = x

Resulting Program:
g_1(x) ← (atom(x) → x,
          T → cons(g_2(x), g_3(x)))
g_2(x) ← g_1(cdr(x))
g_3(x) ← g_1(car(x))
A regular Lisp program can be represented as a directed graph where each node represents a function. A directed arc represents a part of a conditional expression contained in the start node. Such a conditional expression is a pair of predicate and cons-expression. The goal nodes of an arc are the function calls appearing within the cons-expression. An example is given in the last line (e) in figure 6.15.
The construction of the graph which represents the searched-for program is realized by search with backtracking. The given trace is processed top-down for the given function calls f_i, and for each function call it is checked whether it can be merged with (is identical to) an already existing node in the graph. Merging results in cycles in the graph, representing recursion. If no merge is possible, a new arc is introduced, representing a new conditional expression. We demonstrate this method for the trace given in table 6.13 in figure 6.15: Initially, a node g_1 is created, representing function f_1. Function f_1 calls two possibly different functions, that is, two arcs, representing two conditional expressions, are introduced. Function call f_2 could be either identical with g_1 (case b.1) or not (case b.2). The first hypothesis fails and case b.2 is selected. Function call f_3 could be either identical to g_1 (fails) or to g_2 (fails) or represent a new case. Because it cannot be merged with f_1 or f_2, a new node is introduced (case c.3). Function calls f_4 and f_5 can be merged with f_1 and therefore can be represented by node g_1 (case d). As a consequence, f_6 and f_8 are identified with g_2, and f_7 and f_9 with g_3. Identification for f_10 to f_13 fails and a new arc is introduced. The resulting graph represents the final recursive program given in table 6.13.
In contrast to Summers' approach, function merging is not restricted to linear recursion. A drawback is that Biermann's method corresponds to an identification-by-enumeration approach (see sect. 6.3.1.2) and is therefore not very efficient.
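The merged program of table 6.13 is easily checked to generalize beyond the single training example. A Python transcription of g_1, g_2, g_3 (ours; 2-tuples model cons-pairs, everything else counts as an atom):

```python
def g1(x):
    # g1(x) <- (atom(x) -> x, T -> cons(g2(x), g3(x)))
    if not isinstance(x, tuple):
        return x
    return (g2(x), g3(x))

def g2(x):
    # g2(x) <- g1(cdr(x))
    return g1(x[1])

def g3(x):
    # g3(x) <- g1(car(x))
    return g1(x[0])
```

The cycle g_1 → g_2 → g_1 in the graph corresponds to the recursive calls; the induced program mirrors arbitrarily nested pairs, not only the two-level example it was derived from.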
6.3.4.4 Generalization to N and Programming by Demonstration. A subfield of inductive program synthesis considers only how traces or protocols can be generalized to recursive programs (Bauer, 1975). Such approaches address only the second step of synthesis, detecting recurrence relations in traces. This area of research is called programming by demonstration (Cypher, 1993; Schrödl & Edelkamp, 1999) or generalization-to-n (Shavlik, 1990; Cohen, 1992). As described for Summers' recurrence detection method above, these approaches are based on the idea that traces can be considered as the k-th unfolding of an unknown recursive program. That is, in contrast to the folding technique in program transformation (see sect. 6.2.2.1), in inductive synthesis folding cannot rely on an already given recursive function!

Programming by Demonstration with Tinker. Programming by demonstration is, for example, used for tutoring beginners in Lisp programming. The system Tinker (Lieberman, 1993) can create Lisp programs from user interactions with a graphical interface. For example, the user can manipulate blocks-worlds with operations put and puttable (see chapter 2), and the system creates Lisp code realizing the operation sequences performed by the user. In the most simple case, generalization is performed by replacing constants by variables. Tinker can induce simple linear recursive functions, such as the
[Figure 6.15 shows the stepwise construction of the graph: (a) node for function f1; (b.1, b.2) alternatives for f2; (c.1, c.2, c.3) alternatives for f3; (d) merging f4 and f5 with g1; (e) the final program with arcs atom(x) → x and T → cons(g2(x), g3(x)) at g1, g2 → g1(cdr(x)), and g3 → g1(car(x)).]
Fig. 6.15. Synthesis of a Regular Lisp Program
clearblock function (see sect. 3.1.4.1 in chap. 3) from multiple examples (see fig. 6.16). Conditional expressions are constructed by asking the user about differences between examples.

Fig. 6.16. Recursion Formation with Tinker

Generalizing Traces using Grammar Inference. A more recent approach to programming by demonstration was presented by Schrödl and Edelkamp (1999). The authors describe how the grammar inference algorithm ID (Angluin, 1981) can be applied to induce control structures from traces. As an example, they demonstrate how the loops for bubble-sort can be learned from user traces for sorting lists with a small number of elements using the swap-operation. The trace is enriched by built-in knowledge about conditional branches. A deterministic finite automaton (DFA) is learned from the example traces and from membership queries to the user. The automaton accepts all prefixes of traces (that is, truncations of the original traces) of the target program.
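The prefix-acceptance property can be illustrated independently of the ID algorithm itself: a trie over the example traces, with every reached node accepting, is already a (partial) DFA-like acceptor for all trace prefixes. A toy Python sketch (ours; the operation names `cmp` and `swap` are made up for illustration):

```python
def build_acceptor(traces):
    """Trie over operation sequences; every node is accepting (prefix-closed)."""
    root = {}
    for trace in traces:
        node = root
        for op in trace:
            node = node.setdefault(op, {})
    return root

def accepts(root, seq):
    """Accept exactly the prefixes of the stored traces."""
    node = root
    for op in seq:
        if op not in node:
            return False
        node = node[op]
    return True

ACCEPTOR = build_acceptor([["cmp", "swap", "cmp"], ["cmp", "cmp"]])
```

The learned automaton of Schrödl and Edelkamp additionally generalizes by merging states via membership queries; the sketch only shows the prefix-closed acceptance condition mentioned in the text.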
Explanation-Based Generalization-to-n. An approach related to program synthesis from examples is explanation-based learning or generalization (EBG) (Mitchell et al., 1986). A (possibly recursive) concept description is learned from a small set of examples based on (a large amount of) background knowledge. The background knowledge (domain theory) is used to generate an "explanation" of why a given example belongs to the searched-for concept. EBG is characterized as analytical learning: the constructed hypotheses do not extend the deductive closure of the domain theory. Typically, Prolog is used as hypothesis language (Mitchell, 1997, chap. 11). EBG techniques are, for example, used in the planning system Prodigy for learning search control rules (see sect. 2.5.2 in chap. 2).

An extension of EBG to generalization-to-n was presented by Shavlik (1990). In a first step, explanations are generated, and in a second step it is tried to detect repeatedly occurring sub-structures in an explanation. If such sub-structures are found, the explanation is transformed into a recursive rule.
6.4 Final Comments

6.4.1 Inductive versus Deductive Synthesis

As we have seen in this section, both deductive and inductive approaches heavily depend on search. The theorem proving approaches search for a sequence of applications of programming axioms such that an axiom defining the searched-for input/output relation becomes true. The program transformation approaches search for a sequence of transformation rules which, applied to a given input specification, results in an efficiently executable program. Genetic programming searches for a syntactically correct program which works correctly for all presented examples. Inductive logic programming searches for a logic program which covers all positive and none of the negative examples of the relation to be induced. Inductive functional program synthesis searches for regularities in traces which allow the traces to be interpreted as the k-th approximation of a recursive function.

For all approaches it holds that search is in general inefficient and therefore must be restricted by knowledge. One possibility to restrict search in deductive as well as inductive approaches is to provide program schemes: for example, the program transformation system KIDS relies on sophisticated program schemes for divide-and-conquer and for local and global search. In inductive logic programming, a syntactic bias is introduced by restricting the class of clauses which can be induced. In inductive functional program synthesis, the class of functions to be inferred is characterized by schemes.

We have seen for transformational synthesis that the step of transforming a non-executable specification into an (inefficient) executable program cannot be performed automatically. Instead, this step depends heavily on interaction with a user. A similar bottleneck is the first step in inductive functional program synthesis, transforming input/output examples into traces. In the systems described in this section, this problem was reduced by allowing only lists (and not numbers) as input and by considering only the structure but not the content of the input examples. If these constraints are relaxed, there is no longer a unique trace characterizing the transformation of an input example into the desired output but potentially infinitely many, and it depends on the generated traces whether a recurrence relation can be found or not. In contrast, the second step of synthesis, identifying regularities in differences between traces, can be performed straightforwardly by pattern-matching.
The same is true for program transformation: there are several approaches to automatic optimization of an executable program.
6.4.2 Inductive Functional versus Logic Programming

There exists no systematic comparison of inductive logic and inductive functional programming in the literature. A precondition would be to provide a formal characterization of the class of programs which can be inferred within each approach. In inductive functional programming, there exist some preliminary proposals for such a characterization (Le Blanc, 1994). Possibly, the theoretical framework of grammar inference could be used to describe which class of recursive programs can be induced on the basis of which information. Currently, we can offer only some informal observations comparing inductive logic and functional programming:

ILP typically depends on positive and negative examples, where negative examples are used to remove literals from the clauses to be learned. Inductive functional programming uses positive examples only, but these examples must represent the first k elements of a totally ordered input domain. Because of this restriction, inductive functional programming needs far fewer examples, ranging from one to about four, than ILP, where typically more than ten positive examples must be presented.

In inductive functional programming, the resulting recursive function is independent of the sequence in which the input/output examples are presented, while in ILP different programs can result for different example sequences. The reason is that inductive functional programming bases the construction of predicates which discriminate between examples on knowledge about the order of the input domain, while ILP stepwise includes positive examples to construct a hypothesis.

Inductive functional programming works in a two-step process, first transforming input/output examples into traces and then identifying regularities in traces. This corresponds to a bottom-up, data-driven approach to learning. ILP, regardless of whether it works with top-down refinement or bottom-up generalization, incrementally modifies an initial hypothesis by adding or deleting literals.

In inductive functional programming, the resulting program is executable, while this need not be true for ILP: typically, neither a base case for a recursive clause is induced, nor is an order for the literals in the body provided. It is a fundamental difference between logic and functional programs that for logic programs control flow is provided by the interpreter strategy (e.g., SLD-resolution), while functional programs implement a fixed control structure. Furthermore, functional programs typically have one parameter less than logic programs. In the first case, a function transforms a given input into an output value. In the second case, an input/output relation is proved, and it is possible either to construct an output such that the relation holds for the given input or to construct an input such that the relation holds for the desired output.

Because functions are a subset of relations, namely relations which are defined for all inputs and have a unique output, it can be conjectured that, in general, induction of functions is a harder problem than induction of relations.
7. Folding of Finite Program Terms

"It fits into the pattern, I think." "Ah," said Mr. Rattisbon, who knew Alleyn. "The pattern. Your pet theory, Chief Inspector." "Yes, Sir, my pet theory. I hope you may provide me with another lozenge in the pattern."
Ngaio Marsh, Death of a Peer, 1940

In this chapter we present our approach to folding finite program terms into recursive programs. That is, we address inductive program synthesis from traces. This problem is researched in inductive synthesis of functional programs as the second step of synthesis and in programming by demonstration (see chap. 6). Traces can be provided to the system by the system user, they can be recorded from interactions of a user with a program, or they can be constructed over a set of input/output examples. In the next chapter (chap. 8) we will describe how finite programs can be generated by planning. Folding of finite programs into recursive programs is a complex problem in itself. Our approach allows folding of recursive programs which can be characterized by sets of context-free term rewrite rules. It allows inferring programs consisting of a set of recursive equations, and dealing with interdependent and hidden parameters. Traces can be generic or over instantiated parameters, and they can contain incomplete paths. That is, at least for the second step of program synthesis, our approach is more powerful than other approaches discussed in the literature (see sect. 6.3 in chap. 6).

In the following we will first introduce terms and recursive program schemes as background for our approach (sect. 7.1), then we will formulate the synthesis problem (sect. 7.2), afterwards we present theoretical results and algorithms (sect. 7.3), and finally, we give some examples (sect. 7.4).¹

In appendix A.4 we give a short overview of the folding algorithms we realized over the last years, and in appendix A.5 we give some details about the current implementation, which is based on the approach presented in this chapter.

¹ This chapter is based on the previous publications Schmid and Wysotzki (1998) and Mühlpfordt and Schmid (1998), and the diploma theses of Mühlpfordt (2000) and Pisch (2000). Formalization, proof, and implementation are mainly based on the thesis of Mühlpfordt (2000).
172 7. Folding of Finite Program Terms
7.1 Terminology and Basic Concepts

In contrast to other approaches to the synthesis of recursive programs, the approach presented here is independent of a given programming language. Instead, programs are characterized as elements of some arbitrary term algebra, as proposed by Wysotzki (1983), based on the theory of recursive programs developed by Courcelle and Nivat (1978). A similar proposal was made by Le Blanc (1994), who formalized inductive program synthesis in a term rewriting framework. The goal of folding is to induce a recursive program scheme from some arbitrary finite program term. Our approach is purely syntactical, that is, it works on the structure of given program terms, independent of the interpretation of the symbols over which a term is constructed. The overall idea is to detect a recurrence relation in a finite program term. Our approach is an extension of Summers' synthesis theorem, which was presented in detail in section 6.3.4.1 in chapter 6.

We first introduce some basic concepts and notations for terms, mostly following Dershowitz and Jouannaud (1990), then we introduce first order patterns and anti-unification (Burghardt & Heinz, 1996), and finally, recursive program schemes (Courcelle & Nivat, 1978).

7.1.1 Terms and Term Rewriting

Definition 7.1.1 (Signature). A signature Σ is a set of (function) symbols with α : Σ → N giving the arity of a symbol. The set Σ_n represents all symbols with fixed arity n, where symbols in Σ_0 represent constants. A signature can be defined over a set of variables X with X ∩ Σ = ∅.

Although we are only concerned with the syntactical structure of terms built over a signature, in table 7.1 we give a sample of function symbols and their usual interpretation, which we will use in illustrative examples. For better readability, we will sometimes represent numbers or lists in their usual notation instead of as constructor terms (e.g., 3 instead of succ(succ(succ(0))) or [9, 4, 7] instead of cons(9, cons(4, cons(7, nil)))).

Definition 7.1.2 (Term). For the set T_Σ(X) of terms over a signature Σ and a set of variables X holds:
1. X ⊆ T_Σ(X),
2. Σ_0 ⊆ T_Σ(X), and
3. t_1, ..., t_n ∈ T_Σ(X), f ∈ Σ_n ⇒ f(t_1, ..., t_n) ∈ T_Σ(X).

Terms without variables are called ground terms, and the set of all ground terms is denoted T_Σ. For a term t = f(t_1, ..., t_n) the terms t_i (i = 1 ... n) are called sub-terms of t. The mapping var : T_Σ(X) → 2^X returns the set of all variables in a term.
7.1 Terminology and Basic Concepts 173
Table 7.1. A Sample of Function Symbols

symbol  arity  usual interpretation
0       0      natural number zero
1       0      natural number one
nil     0      empty list, truth value "false"
T       0      truth value "true"
succ    1      successor of a natural number
pred    1      predecessor of a natural number
head    1      head of a list
tail    1      tail of a list
eq0     1      test whether a natural number is equal to zero
eq1     1      test whether a natural number is equal to one
empty   1      test whether a list is empty
cons    2      insertion of an element in front of a list
+       2      addition of two numbers
*       2      multiplication of two numbers
eq      2      equality-test for two symbols
<       2      test whether a natural number is smaller than another
if      3      conditional "if x then y else z"
In the following we use the expressions term, program term, and tree as synonyms. Each symbol contained in a term is called a node in the term.

Definition 7.1.3 (Position). Positions in a term t ∈ T_Σ(X) are defined by:
- λ is the root-position of t,
- if t = f(t_1, ..., t_n) and u is a position in t_i, then i.u is a position in t.

An example of a term together with an illustration of positions in the term is given in figure 7.1.
Definition 7.1.4 (Composition and Order of Positions). The composition v ∘ w of two positions v and w is defined as u.w if v = u.λ.
A partial order ≤ over positions is defined in the following way: the relation v ≤ w is true iff
- v = w, or
- there exists a position u with v ∘ u = w.

Definition 7.1.5 (Sub-term at a Position). A sub-term of a term t ∈ T_Σ(X) at a position u in t (written t|_u) is defined as:
- t|_λ = t,
- if t = f(t_1, ..., t_n) and u is a position in t_i, then t|_{i.u} = t_i|_u.
Definition 7.1.6 (Positions of a Node). The function node : T_Σ(X) × Position → Σ returns the symbol at a fixed position in a term: node(t, u) = f with t ∈ T_Σ(X), a position u with t|_u = f(t_1, ..., t_n), and f ∈ Σ.
The set of all positions at which a fixed symbol f appears in a term is denoted pos(t, f), and it holds ∀w ∈ pos(t, f) : node(t, w) = f.
The definitions can be extended to Σ ∪ X. If necessary, the definition of node can be extended to additionally return the arity of a symbol. With pos(t) we refer to the set of all positions of a term t, and with lpos(t) we refer to the set of all positions of leaf-nodes in a term.

[Figure 7.1 shows this term as a tree, with the positions 3.λ and 3.3.1.2.λ marked.]
t = if(<(k,n), k, if(<(-(k,n),n), k, if(<(-(-(k,n),n),n), k, 0)))
Fig. 7.1. Example Term with Exemplary Positions of Sub-terms

For the term in figure 7.1, a sub-term is, for example, t|_{3.3.1.λ} = <(-(-(k,n),n), n), and a set of positions is, for example, pos(t, <) = {1.λ, 3.1.λ, 3.3.1.λ}.
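These definitions translate directly into code when terms are represented as nested tuples whose first component is the node symbol. A small Python sketch (our representation; positions are tuples of 1-indexed child indices, with the empty tuple () playing the role of the root position λ):

```python
def subterm(t, u):
    """t|_u for terms as nested tuples (symbol first)."""
    for i in u:
        t = t[i]          # child i of f(t_1, ..., t_n) sits at tuple index i
    return t

def node(t, u):
    """Symbol at position u in t."""
    s = subterm(t, u)
    return s[0] if isinstance(s, tuple) else s

def pos(t, f):
    """All positions at which symbol f occurs in t."""
    found = set()
    def walk(s, u):
        if (s[0] if isinstance(s, tuple) else s) == f:
            found.add(u)
        if isinstance(s, tuple):
            for i, child in enumerate(s[1:], start=1):
                walk(child, u + (i,))
    walk(t, ())
    return found

# the term of figure 7.1
t = ("if", ("<", "k", "n"), "k",
     ("if", ("<", ("-", "k", "n"), "n"), "k",
      ("if", ("<", ("-", ("-", "k", "n"), "n"), "n"), "k", "0")))
```

With this encoding, pos(t, "<") recovers exactly the three positions listed above for figure 7.1.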
Definition 7.1.7 (Replacement of a Term at a Position). The replacement of a sub-term t|_w by a term s ∈ T_Σ(X) in a term t ∈ T_Σ(X) is written as t[w ← s].

Definition 7.1.8 (Substitution). A substitution σ : X → T_Σ(X) is a mapping of variables to terms. The extension of a substitution to terms is σ(f(t_1, ..., t_n)) = f(σ(t_1), ..., σ(t_n)). Substitutions over a term are enumerated as t{x_1 ← t_1, ..., x_n ← t_n}, or t[t_1, ..., t_n] for short.

Definition 7.1.9 (Term Rewrite System). A term rewrite system over a signature Σ is a set of pairs of terms R ⊆ T_Σ(X) × T_Σ(X). The elements (l, r) of R are called rewrite rules and are written as l → r.
Definition 7.1.10 (Term Rewriting). Let R be a term rewrite system and t, t' ∈ T_Σ(X) two terms. Term t' can be derived in one rewrite step from t using R (t →_R t') if there exists a position u in t, a rule l → r ∈ R, and a substitution σ : X → T_Σ(X), such that
- t|_u = σ(l),
- t' = t[u ← σ(r)].
R implies a rewrite relation →_R ⊆ T_Σ(X) × T_Σ(X) with (t, t') ∈ →_R if t →_R t'. The reflexive and transitive closure of →_R is →*_R.
7.1.2 Patterns and Anti-Unification

A term t = σ(p) which can be generated by substitutions over a term p is a specialization of p, and p is called a (first order) pattern of t.

Definition 7.1.11 (First Order Pattern). A term p ∈ T_Σ(Y) is called a first order pattern of a term t ∈ T_Σ(X) if there exists a subsumption relation p ≤ t with σ : Y → T_Σ(X) and σ(p) = t. The sets of variables Y and X are disjoint.

A pattern p of a term t is called trivial if p is a variable, and non-trivial otherwise.

Definition 7.1.12 (Order over Terms). For two terms over the same signature p, t ∈ T_Σ(X) holds p ≤ t if p is a pattern of t, and p < t if there exists no variable renaming such that p and t are unifiable.

Definition 7.1.13 (Maximal Pattern). A term p is called a maximal (first order) pattern of a term t if p ≤ t and there exists no term p' with p < p' and p' ≤ t.

An example of a first order pattern of a term is given in figure 7.2. For all non-trivial patterns it holds that terms which can be subsumed by a pattern share a common prefix.
A (maximal) pattern of two terms can be constructed by syntactic anti-unification of these terms.

Definition 7.1.14 (Anti-Unifier). A term p is called a (first-order) anti-unifier of two terms t and t' if p ≤ t and p ≤ t', and if for all terms p' with p' ≤ t and p' ≤ t' it holds that p' ≤ p. That is, the anti-unifier is the most specific generalization of two terms, and it is unique except for variable renaming.
Definition 7.1.15 (Anti-Unification). Anti-unification ⊔ : T_Σ(X) × T_Σ(X) → T_Σ(X ∪ Y) is defined as:
- f(t_1, ..., t_n) ⊔ f(t'_1, ..., t'_n) = f(t_1 ⊔ t'_1, ..., t_n ⊔ t'_n),
- f(t_1, ..., t_n) ⊔ f'(t'_1, ..., t'_m) = φ(f(t_1, ..., t_n), f'(t'_1, ..., t'_m)) for f ≠ f'
with φ : T_Σ(X) × T_Σ(X) → Y as an injective mapping of terms into a (new) set of variables.

The mapping φ guarantees that the mapping of pairs of terms to variables is unique in both directions. Identical sub-term pairs are replaced by the same variable over the complete term (Lassez, Maher, & Marriott, 1988).

[Figure 7.2 shows a non-trivial pattern containing variables x and y (left) which subsumes the term of figure 7.1 (right).]
Fig. 7.2. Example First Order Pattern
Anti-unification can be extended from pairs of terms to sets of terms.

Definition 7.1.16 (Anti-Unification of Sets of Terms). Anti-unification of a set of terms, ⊓ : T_Σ(X)+ → T_Σ(X ∪ Y), is defined as:
⊓ T = t if T = {t}, and ⊓ T = (⊓ (T \ {t})) ⊔ t otherwise.

First order anti-unification is distributive, that is, the terms in T can be anti-unified in an arbitrary order (Lassez et al., 1988).
An example of anti-unification is given in figure 7.3.
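Anti-unification is a simultaneous descent over two terms; a Python sketch of definition 7.1.15 (ours, for terms as nested tuples with the symbol first; the dictionary `table` realizes the injective mapping φ, so equal pairs of disagreeing sub-terms map to the same fresh variable):

```python
def anti_unify(t1, t2, table=None):
    """Most specific generalization of two terms."""
    if table is None:
        table = {}
    if isinstance(t1, tuple) and isinstance(t2, tuple) \
            and len(t1) == len(t2) and t1[0] == t2[0]:
        # same head symbol and arity: anti-unify argument-wise
        return (t1[0],) + tuple(anti_unify(a, b, table)
                                for a, b in zip(t1[1:], t2[1:]))
    if t1 == t2:
        return t1
    # phi: one fresh variable per distinct pair of disagreeing sub-terms
    if (t1, t2) not in table:
        table[(t1, t2)] = "?x%d" % len(table)
    return table[(t1, t2)]
```

Anti-unifying a whole set of terms in an arbitrary order (definition 7.1.16) then amounts to reducing the set with this function.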
7.1.3 Recursive Program Schemes

7.1.3.1 Basic Definitions. The notion of a program scheme allows us to characterize the class of recursive functions which can be induced in program synthesis.

[Figure 7.3 shows two if-terms sharing the prefix if(<(k,n), k, .) together with their anti-unifier, in which the differing sub-terms are replaced by a variable x.]
Fig. 7.3. Anti-Unification of Two Terms
Definition 7.1.17 (Recursive Program Scheme). Let Σ be a signature and Φ = {G_1, ..., G_n} a set of function variables with Φ ∩ Σ = ∅ and arity α(G_i) = m_i > 0. A recursive program scheme (RPS) S is a pair (G, t_0) with t_0 ∈ T_{Σ∪Φ}(X) and G a system of n equations:

G = ⟨ G_1(x_1, ..., x_{m_1}) = t_1,
      ...
      G_n(x_1, ..., x_{m_n}) = t_n ⟩

with t_i ∈ T_{Σ∪Φ}(X), i = 1 ... n.

An RPS corresponds to an abstract functional program with t_0 as initial program call ("main program") and G as a set of (recursive) functions ("subprograms"). The left-hand side of an equation in G is called the program head, the right-hand side the program body. The parameters x_i in a program head are pairwise different. The body t_i of an equation G_i can contain calls of G_i or calls of other equations in G.

In the following, RPSs are restricted in the following way: for each program body t_i of an equation in G holds
- var(t_i) = {x_1, ..., x_{m_i}}, that is, all variables in the program head are used in the program body, and
- pos(t_i, G_i) ≠ ∅, that is, each equation is recursive.

The first restriction does not limit expressiveness: variables which are never used in the body of a program do not contribute to the computation of a result. The second restriction ensures that induction of an RPS from a finite program term is really due to generalization (learning) and that the resulting RPS does not just represent the given finite program itself.
An example of an RPS is given in table 7.2. The equation ModList checks for each element of a list of natural numbers whether it is a multiple of a number n. The equation is linear recursive because it contains a call of itself as an argument of the cons-operator, and it calls a second equation Mod, which is tail-recursive, that is, a special case of linear recursion corresponding to a loop. The program call t_0 in this example is simply a call of the equation ModList. In general, t_0 can be a more complex expression from T_{Σ∪Φ}(X). For example, we might only be interested in whether the second element of the list is a multiple of n and call t_0 = head(tail(ModList(l, n))).

Interpretation of an RPS, for example by means of an eval-apply method (Field & Harrison, 1988), presupposes that all symbols in Σ are available as pre-defined operations and that the variables in t_0 are instantiated. An RPS can be unfolded by replacing function calls by the corresponding function body, where the variables in the body are substituted in accordance with the parameters in the function call. Generation of program terms by unfolding is described in the next section.

Table 7.2. Example for an RPS

S = ⟨G, t_0⟩
G = ⟨ ModList(l, n) = if( empty(l),
                          nil,
                          cons( if(eq0(Mod(n, head(l))), T, nil),
                                ModList(tail(l), n))),
      Mod(k, n) = if(<(k, n), k, Mod(-(k, n), n)) ⟩
t_0 = ModList(l, n)

Variable instantiation is a special case of substitution (see def. 7.1.8) where variables are replaced by ground terms:

Definition 7.1.18 (Variable Instantiation). A mapping β : X → T_Σ is called an instantiation of the variables X. Variable instantiation can be extended to terms, β : T_Σ(X) → T_Σ, such that β(f(t_1, ..., t_n)) = f(β(t_1), ..., β(t_n)) for f ∈ Σ, arity α(f) = n, and t_1, ..., t_n ∈ T_Σ(X).

Please note that we speak of the (formal) parameters of a program to refer to its arguments. For example, the recursive equation ModList(l, n) has two parameters. The parameters can be variables, such as l, n ∈ X, or be instantiated with (ground) terms.
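Interpreting the symbols of table 7.1 as the usual operations, the RPS of table 7.2 runs as an ordinary program. A direct Python transcription (ours), keeping the argument order Mod(n, head(l)) exactly as it appears in the table:

```python
def mod(k, n):
    # Mod(k, n) = if(<(k, n), k, Mod(-(k, n), n)) -- tail recursion, i.e. a loop
    if k < n:
        return k
    return mod(k - n, n)

def modlist(l, n):
    # ModList(l, n) = if(empty(l), nil,
    #                    cons(if(eq0(Mod(n, head(l))), T, nil),
    #                         ModList(tail(l), n)))
    if not l:
        return []
    return [mod(n, l[0]) == 0] + modlist(l[1:], n)
```

For the instantiation used in figure 7.4, modlist([9, 4, 7], 8) evaluates to [False, True, False], corresponding to the term cons(nil, cons(T, cons(nil, nil))).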
7.1.3.2 RPSs as Term Rewrite Systems. A recursive program scheme can be viewed as a term rewrite system (Courcelle, 1990) or, alternatively, as a context-free tree grammar (see next section).
An RPS as introduced in definition 7.1.17 can be interpreted as a term rewrite system as introduced in definition 7.1.9:

Definition 7.1.19 (RPS as Rewrite System). Let S = (G, t_0) be an RPS and Ω a special symbol. The equations in G constitute the rules R_S of a term rewrite system. The system additionally contains rules R_Ω:
R_S = {G_i(x_1, ..., x_{m_i}) → t_i | i ∈ {1, ..., n}},
R_Ω = {G_i(x_1, ..., x_{m_i}) → Ω | i ∈ {1, ..., n}}.
We write →*_{S,Ω} for the reflexive and transitive closure of the rewrite relation implied by R_S ∪ R_Ω.

By means of the term rewrite system, the head of a user-defined equation (occurring in t_0 or a program body t_i) is either replaced by the associated body or by the symbol Ω, which represents the empty or undefined term.
Definition 7.1.20 (Language of an RPS). Let S = (G, t_0) be an RPS and β : var(t_0) → T_Σ an instantiation of the parameters in t_0. The set of all terms L(S, β) = {t | t ∈ T_{Σ∪{Ω}}, β(t_0) →*_{S,Ω} t} is the language generated by the RPS with instantiation β. For a main program t_0 without variables, we can write L(S).

The instantiation of all parameters in the program call t_0 implies that the language of an RPS is a set of ground terms, containing neither object variables X nor function variables Φ. That is, all recursive calls are at some rewrite step replaced by Ωs.
Definition 7.1.21 (Language of a Subprogram). Let S = (G, t_0) be an RPS. For an equation G_i ∈ G the rewrite relation →_{G_i,Ω} is implied by the rules

R_{G_i,Ω} = {G_i(x_1, ..., x_{m_i}) → t_i, G_i(x_1, ..., x_{m_i}) → Ω}.

Let σ: {x_1, ..., x_{m_i}} → T_Σ be an instantiation of the parameters of G_i; then the set of all terms L(G_i, σ) = {t | t ∈ T_{Σ∪{Ω}∪Φ\{G_i}}, σ(G_i(x_1, ..., x_{m_i})) →*_{G_i,Ω} t} is the language generated by subprogram G_i with instantiation σ.
The term given on the left side of figure 7.4 is an example of a term belonging to the language of the RPS given in table 7.2 as well as an example of a term belonging to the language of the subprogram ModList. The term represents an unfolding of the main program t_0 where the ModList subprogram is called. For t_0 = ModList(l, n) the variables are instantiated as l = [9, 4, 7] and n = 8.² Within this unfolding, there is an unfolding of the Mod subprogram which is called by ModList. At the same time, the term represents an unfolding of the subprogram ModList itself. The term given on the right side of figure 7.4 is an unfolding of ModList containing the call of the Mod subprogram. Neither of the two terms contains the name of the subprogram ModList itself.
² For a restricted signature with natural number 0 and successor function only and list constructor cons, a natural number i can be represented as succ^i(0) and a list as a cons-expression.
180 7. Folding of Finite Program Terms
[Figure 7.4 (omitted): two tree-shaped terms over if, empty, nil, cons, eq0, <, -, head, tail, the constant 8, the list [9,4,7], and Ω; the left tree is an unfolding of the main program with the Mod call expanded, the right tree is an unfolding of ModList containing the call of Mod.]

t_0 = ModList(l, n), σ(l) = [9, 4, 7], σ(n) = 8

Fig. 7.4. Examples of Terms Belonging to the Language of an RPS and of a Subprogram of an RPS
7.1.3.3 RPSs as Context-Free Tree Grammars. In the remainder of the chapter, we will rely on the notion of RPSs as term rewrite systems. In this section we briefly present an alternative, equivalent definition of RPSs as context-free tree grammars (Guessarian, 1992).
Definition 7.1.22 (Context-Free Tree Grammar). A context-free tree grammar (CFTG), C = (A, N, Σ, P), is a four-tuple of
– a set of terminal symbols Σ,
– a set of non-terminal symbols N with fixed arity and N ∩ Σ = ∅,
– an axiom A ∈ T_{Σ∪N}, and
– a set of production rules P of the form G(x_1, ..., x_n) → t with G ∈ N and {x_1, ..., x_n} ⊆ X, where X ∩ (Σ ∪ N) = ∅ is a set of reserved variables, and t ∈ T_{Σ∪N}(X).
If there exists for a non-terminal G ∈ N more than one rule G(x_1, ..., x_n) → t_1, ..., G(x_1, ..., x_n) → t_m, we write G(x_1, ..., x_n) → t_1 | ... | t_m for short.
The language of a context-free tree grammar consists of all words (terms) which can be derived from the axiom and which do not contain non-terminal symbols.
Definition 7.1.23 (Language of a CFTG). Let C = (A, N, Σ, P) be a CFTG and →*_P the reflexive and transitive closure of the derivation relation →_P. The language L(C) of the grammar is L(C) = {t ∈ T_Σ | A →*_P t}.
The program call t_0 of an RPS can be interpreted as the axiom of a CFTG and the equations of an RPS as production rules. In contrast to an RPS, the axiom must contain no variables; that is, all variables in the program call must be instantiated.
Definition 7.1.24 (CFTG of an RPS). Let S = (G, t_0) be an RPS and σ: X → T_Σ an instantiation of the variables in t_0 (X = var(t_0)); then the CFTG C_{S,σ} of S and σ is defined as C_{S,σ} = (σ(t_0), Φ, Σ, P) with

P = {G_1(x_1, ..., x_{m_1}) → t_1 | Ω,
     ...,
     G_n(x_1, ..., x_{m_n}) → t_n | Ω}.

With such a CFTG, the language generated by an RPS can be defined as L(S, σ) = L(C_{S,σ}).
7.1.3.4 Initial Programs and Unfoldings. By means of the definition of an RPS as a term rewrite system (see def. 7.1.19) and the language of an RPS induced by this system (see def. 7.1.20), we provide a method to unfold an RPS into a term of finite length. Because all equations in an RPS must be recursive, such a finite term contains at least one symbol Ω (termination of rewriting). Now we can characterize the input to our synthesis system as a term of which we assume that it is generated by a (yet unknown) recursive program scheme.
Definition 7.1.25 (Initial Program). An initial program is a term t ∈ T_{Σ∪{Ω}}(X) which contains at least one Ω: pos(t, Ω) ≠ ∅. The term might be defined over instantiated variables only, that is, t ∈ T_{Σ∪{Ω}}.
Definition 7.1.26 (Order over Initial Programs). Let t, t' ∈ T_{Σ∪{Ω}}(X) be two terms. An order t ≤ t' is inductively defined as

1. Ω ≤ t', if pos(t', Ω) ≠ ∅,
2. x ≤ t', if x ∈ X and pos(t', Ω) ≠ ∅,
3. f(t_1, ..., t_n) ≤ f(t'_1, ..., t'_n), if t_i ≤ t'_i holds for all i ∈ {1, ..., n}.
The task of folding an initial program involves finding a segmentation and substitutions of the initial program which can be interpreted as the first i elements of an unfolding of a hypothetical RPS. Therefore, we now introduce recursion points and substitution terms for an RPS.
For a given recursive equation, each occurrence of a recursive call in its body constitutes a recursion point, and for each recursive call the parameters are substituted by terms:
Definition 7.1.27 (Recursion Points and Substitution Terms). Let S = (G, t_0) be an RPS, G_i(x_1, ..., x_{m_i}) = t_i a recursive equation in S, and X_i = {x_1, ..., x_{m_i}} the set of parameters of G_i. The non-empty set of recursion points U_rec of G_i is given by U_rec = pos(t_i, G_i) with indices (positions) R = {1, ..., |U_rec|}.
Each recursive call of G_i at position u_r ∈ U_rec, r ∈ R in t_i implies substitutions σ_r: X_i → T_Σ(X_i) of the parameters in G_i. Let sub: X_i × R → T_Σ(X_i) be a mapping with sub(x_j, r) = σ_r(x_j) for all x_j ∈ X_i; then sub(x_j, r) = t_i|_{u_r∘j} holds. The terms sub(x_j, r) are called substitution terms.
An example of recursion points and substitution terms for the Fibonacci function is given in table 7.3. Fibonacci is tree recursive, that is, it contains two recursive calls in its body. The positions of these calls (see def. 7.1.3) in the term representing the body of Fibonacci are U_rec = {3.3.1., 3.3.2.}. For referring to these recursion points we introduce an index set R = {1, ..., |U_rec|}, that is, for Fibonacci R = {1, 2}. With u_1 ∈ U_rec we refer to the first recursive call and with u_2 ∈ U_rec we refer to the second recursive call. In the first recursive call, the parameter x is substituted by pred(x), and in the second recursive call, it is substituted by pred(pred(x)).
Table 7.3. Recursion Points and Substitution Terms for the Fibonacci Function

Recursive Equation:
G_1(x_1) = if( eq0(x_1),
               1,
               if( eq0(pred(x_1)),
                   1,
                   +(G_1(pred(x_1)), G_1(pred(pred(x_1))))))

Recursion Points: U_rec = {3.3.1., 3.3.2.}

Recursive Calls: t_1|_{3.3.1.} = G_1(pred(x_1)),
                 t_1|_{3.3.2.} = G_1(pred(pred(x_1)))

Substitution Terms for x_1:
sub(x_1, 1) = t_1|_{3.3.1.∘1.} = G_1(pred(x_1))|_{1.} = pred(x_1),
sub(x_1, 2) = t_1|_{3.3.2.∘1.} = G_1(pred(pred(x_1)))|_{1.} = pred(pred(x_1))
For a given recursive equation there is a fixed set of recursion points. This set contains a single position for linear recursion, two positions for tree recursion, and for more complex RPSs the set can contain an arbitrary number of recursion points. If a given term is generated by unfolding a recursive equation, the recursion points occur in a regular way in each unfolding. Because, theoretically, a recursive equation can be unfolded infinitely many times, there are infinitely many positions at which recursive calls occur, that is, unfolding positions. We describe such positions by indices in the following way:
Definition 7.1.28 (Unfolding Indices). Let S = (G, t_0) be an RPS, G_i(x_1, ..., x_{m_i}) = t_i a recursive equation in S, X_i = {x_1, ..., x_{m_i}} the set of parameters of G_i, and U_rec the set of recursion points of G_i with index set R = {1, ..., |U_rec|}. The set of unfolding indices W constructed over R is inductively defined as the smallest set for which holds:
[Figure 7.5 (omitted): the third unfolding of Fibonacci as a tree over if, eq0, pred, +, 1, 5, and Ω, with the unfolding positions λ, 3.3.1., 3.3.2., 3.3.1.3.3.1., ..., 3.3.2.3.3.2.3.3.2. marked along the branches.]

Fig. 7.5. Unfolding Positions in the Third Unfolding of Fibonacci
1. λ ∈ W,
2. if w ∈ W and r ∈ R, then w ∘ r ∈ W.
To illustrate the concept of unfolding indices, we anticipate the concept of unfolding, which will be introduced in the next definition. Consider the Fibonacci term in figure 7.5, which represents the third syntactic unfolding of the recursive function given in table 7.3. The zeroth unfolding would be just the empty term Ω, and its position is λ, that is, the root of the tree. The first unfolding yields two unfolding positions, 3.3.1. and 3.3.2., which are identical with the recursion points of the Fibonacci function. Unfolding the function at each of these positions again by one step results in four new unfolding positions, 3.3.1.3.3.1., 3.3.1.3.3.2., 3.3.2.3.3.1., and 3.3.2.3.3.2., and so on. Just as we refer to the recursion points using the index set R, we refer to the unfolding positions using an index set W. The unfolding points and the corresponding indices for the first three unfoldings of Fibonacci are given in table 7.4. For a tree recursive structure such as Fibonacci, the indices grow in a treelike fashion. For example, index 1.1.2. refers to the right unfolding position in the two left unfoldings on the levels above.
Table 7.4. Unfolding Positions and Unfolding Indices for Fibonacci

Unfolding  Unfolding Positions                      Unfolding Indices
0          λ                                        λ
1          3.3.1., 3.3.2.                           1., 2.
2          3.3.1.3.3.1., 3.3.1.3.3.2.,              1.1., 1.2.,
           3.3.2.3.3.1., 3.3.2.3.3.2.               2.1., 2.2.
3          3.3.1.3.3.1.3.3.1., 3.3.1.3.3.1.3.3.2.,  1.1.1., 1.1.2.,
           3.3.1.3.3.2.3.3.1., 3.3.1.3.3.2.3.3.2.,  1.2.1., 1.2.2.,
           3.3.2.3.3.1.3.3.1., 3.3.2.3.3.1.3.3.2.,  2.1.1., 2.1.2.,
           3.3.2.3.3.2.3.3.1., 3.3.2.3.3.2.3.3.2.   2.2.1., 2.2.2.
Defining the unfolding of a recursive equation is based on the positions in a term at which unfoldings are performed and on a mechanism to replace parameters in the program body:
Definition 7.1.29 (Unfolding of a Recursive Equation). Let S = (G, t_0) be an RPS, G_i(x_1, ..., x_{m_i}) = t_i a recursive equation in S, X_i = {x_1, ..., x_{m_i}} the set of parameters of G_i, U_rec the set of recursion points of G_i with index set R = {1, ..., |U_rec|}, and sub(x_j, r) the substitution term for x_j ∈ X_i in the recursive call of G_i at position u_r ∈ U_rec, r ∈ R. The set of all unfoldings Υ_i of equation G_i over instantiations σ: X_i → T_Σ is inductively defined as the smallest set for which holds:

1. Let σ_λ = σ be the instantiation of variables in unfolding υ_λ. Then υ_λ = σ_λ(t_i) and υ_λ ∈ Υ_i.
2. Let υ_w ∈ Υ_i be an unfolding with instantiation σ_w. Then for each r ∈ R there exists an unfolding υ_{w∘r} ∈ Υ_i with instantiations σ_{w∘r}(x_j) = σ_w(sub(x_j, r)) for all x_j ∈ X_i, such that υ_{w∘r} = σ_{w∘r}(t_i).
An unfolding is a term which is generated in a derivation step t →_{S,Ω} t' by applying a term rewrite rule G_i(x_1, ..., x_{m_i}) → t_i (see def. 7.1.19). The resulting term is an element of the language L(G_i, σ) (see def. 7.1.21). For a term t which is the result of repeated application of the rule G_i(x_1, ..., x_{m_i}) → t_i, the unfolding indices w correspond to the sequence in which the recursive calls were replaced.
Please note that the initial instantiation of variables σ can be empty, that is, the variables appearing in the program body can remain uninstantiated during unfolding. For syntactical unfolding, as described above, there is no possibility to terminate unfolding for paths which could never be reached during program evaluation. For example, in figure 7.5, at each level of unfolding positions, all recursive calls were unfolded. It is not checked whether one of the conditions eq0(x_1) or eq0(pred(x_1)) is already true for the given instantiation of x_1. Unfolding of terms is a well-introduced concept, which we already described in chapter 6 (sects. 6.2.2.1 and 6.3.4.1). In the definition above, we simply reformulated unfolding in our terminology.
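The purely syntactic character of unfolding can be illustrated by unfolding the Fibonacci equation of table 7.3 without evaluating any condition; after d steps all remaining calls become Ω, giving the 2^d truncation points visible in figure 7.5. A sketch in our own tuple encoding:

```python
# Syntactic unfolding of the Fibonacci equation (def. 7.1.29): every recursive
# call is expanded regardless of the eq0 conditions; after d steps all
# remaining calls are truncated to Omega. Encoding and names are ours.

FIB = ("if", ("eq0", "x1"), "1",
       ("if", ("eq0", ("pred", "x1")), "1",
        ("+", ("G1", ("pred", "x1")), ("G1", ("pred", ("pred", "x1"))))))

def apply_inst(sigma, t):
    if isinstance(t, str):
        return sigma.get(t, t)
    return (t[0],) + tuple(apply_inst(sigma, a) for a in t[1:])

def unfold(t, d):
    if isinstance(t, str):
        return t
    if t[0] == "G1":
        # replace the call by Omega (d = 0) or by the body with x1 substituted
        return "Omega" if d == 0 else unfold(apply_inst({"x1": t[1]}, FIB), d - 1)
    return (t[0],) + tuple(unfold(a, d) for a in t[1:])

def count_omegas(t):
    if isinstance(t, str):
        return int(t == "Omega")
    return sum(count_omegas(a) for a in t[1:])

# the d-th unfolding of G1(x1) has 2^d truncation points (cf. figure 7.5):
print([count_omegas(unfold(("G1", "x1"), d)) for d in range(4)])   # [1, 2, 4, 8]
```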
The synthesis problem is solved successfully if an RPS with a set of recursive equations can be induced which recurrently explains a given initial program.
Definition 7.1.30 ((Recursive) Explanation of an Initial Program). A recursive equation G_i(x_1, ..., x_{m_i}) = t_i explains an initial program t_init if there exist an instantiation σ: {x_1, ..., x_{m_i}} → T_Σ of the parameters of G_i and a term t ∈ L(G_i, σ) such that t_init ≤ t. G_i is a recurrent explanation of t_init if furthermore there exists a term t' ∈ L(G_i, σ) which can be derived by at least two applications of G_i(x_1, ..., x_{m_i}) → t_i such that t' ≤ t_init.
An RPS S = (G, t_0) explains an initial program t_init if there exist an instantiation σ: var(t_0) → T_Σ of the parameters of the program call t_0 and a term t ∈ L(S, σ) such that t_init ≤ t. S is a recurrent explanation of t_init if furthermore there exists a term t' ∈ L(S, σ) which can be derived by at least two applications of all rules R_S such that t' ≤ t_init.
An equation/RPS explains a set of initial programs if it explains all terms in the set and if there is a recurrent explanation for at least one term.
Definition 7.1.30 states that a recursive equation, or an RPS defined over a set of such equations (subprograms) together with an initial program call, explains an initial program if some term t can be derived such that the initial program is completely contained in this term (see def. 7.1.26). For a recurrent explanation it must additionally hold that some term t' can be derived by repeated unfolding (that is, at least two times) such that this term t' is completely contained in the initial program.
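The containment order on which both notions of explanation rest can be sketched as a predicate. In this simplification (our own encoding), Ω is simply taken to be below every term and the variable case of def. 7.1.26 is omitted:

```python
# Sketch of the order of def. 7.1.26: leq(t, t2) holds if t can be turned into
# t2 by growing it at Omega positions. Used in def. 7.1.30 to test whether an
# initial program is contained in a derived term. Encoding is ours.

def leq(t, t2):
    if t == "Omega":
        return True                    # Omega is below every term (simplified)
    if isinstance(t, str) or isinstance(t2, str):
        return t == t2
    return (t[0] == t2[0] and len(t) == len(t2)
            and all(leq(a, b) for a, b in zip(t[1:], t2[1:])))

g1 = ("if", ("<", "k", "n"), "k", "Omega")                      # first unfolding
g2 = ("if", ("<", "k", "n"), "k",
      ("if", ("<", ("-", "k", "n"), "n"), ("-", "k", "n"), "Omega"))
print(leq(g1, g2), leq(g2, g1))    # True False
```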
7.2 Synthesis of RPSs from Initial Programs

Based on the definitions of RPSs, initial programs, unfoldings, and recurrent explanations introduced above, we are now ready to formulate precisely the synthesis problem which we want to solve. Before we do that, we first discuss the relation between the fixpoint semantics of recursive functions and the induction of recursive functions from initial programs, and give some characteristics of RPSs.
7.2.1 Folding and Fixpoint Semantics

As discussed in chapter 6, induction can be viewed as the inverse process to deduction. For example, in inductive logic programming, more general clauses can be computed by inverse resolution, and a term or clause can be generalized by anti-unification (Plotkin, 1969). Summers (1977) proposed that folding an initial program into a (set of) recursive equations can be seen as the inverse of fixpoint semantics (Field & Harrison, 1988; Davey & Priestley, 1990). Summers' synthesis theorem is given in section 6.3.4.1 in chapter 6.
We introduce fixpoint semantics in appendix B.1. In short, fixpoint semantics allows giving a denotation to recursively defined syntactic functions. The semantics of a (continuous) function (over an ordered domain) is defined as the least upper bound of a sequence of unfoldings (expansions) of this function.
For folding, we consider a given finite program term as the i-th unfolding t_init = G^i of an unknown recursively defined syntactic function G. Finding a recursive explanation for G^i as described in definition 7.1.30 corresponds to segmenting G^i into a sequence of unfoldings G^0, G^1, ..., G^i where G^k = t^k[u ← σ(G^{k-1})], with u a fixed position in t^k and σ a substitution of parameters, holds for k = 1, ..., i. In that case, based on the recurrence relation which holds for G^i, the recursive function G = t[u ← σ(G)] can be extrapolated. An example is given in table 7.5.
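The recurrence relation and its extrapolation can be checked mechanically: building G^1, G^2, G^3 via the recurrence used in table 7.5 yields exactly the third syntactic unfolding of Mod(k, n). A sketch with our own tuple encoding of terms:

```python
# Checking the recurrence of section 7.2.1 on the Mod example: each unfolding
# G^k embeds the previous one at position 3 with k substituted by -(k, n).
# Encoding and names are our own illustration.

def apply_inst(sigma, t):
    if isinstance(t, str):
        return sigma.get(t, t)
    return (t[0],) + tuple(apply_inst(sigma, a) for a in t[1:])

def next_unfolding(g):
    """G^k = if(<(k, n), k, G^(k-1)[k <- -(k, n)])."""
    return ("if", ("<", "k", "n"), "k", apply_inst({"k": ("-", "k", "n")}, g))

BODY = ("if", ("<", "k", "n"), "k", ("Mod", ("-", "k", "n"), "n"))

def unfold(t, d):
    """d syntactic unfoldings of the Mod equation, truncated with Omega."""
    if isinstance(t, str):
        return t
    if t[0] == "Mod":
        return "Omega" if d == 0 else unfold(apply_inst({"k": t[1], "n": t[2]}, BODY), d - 1)
    return (t[0],) + tuple(unfold(a, d) for a in t[1:])

g = "Omega"                                # G^0
for _ in range(3):
    g = next_unfolding(g)                  # G^1, G^2, G^3

# the extrapolated recurrence reproduces the third unfolding of Mod(k, n):
print(g == unfold(("Mod", "k", "n"), 3))   # True
```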
7.2.2 Characteristics of RPSs

In section 7.1.3.1 we already introduced two restrictions for RPSs: all variables given in the head of a recursive equation must appear in the body, and each equation contains at least one call of itself. In the following, further characteristics of the RPSs which can be folded using our approach are given.
Definition 7.2.1 (Dependency Relation of Subprograms). Let S = (G, t_0) be an RPS with a set of function variables (names of subprograms) Φ. Let H ∉ Φ be a function variable (for the unnamed main program). A relation calls_S ⊆ ({H} ∪ Φ) × Φ between subprograms and main program of S is defined as:
Table 7.5. Example of Extrapolating an RPS from an Initial Program

t_init = G^i =
if(<(k, n), k, if(<(-(k, n), n), -(k, n), if(<(-(-(k, n), n), n), -(-(k, n), n), Ω)))

Segmentation into hypothetical unfoldings of a recursive function G:
G^0 =_def Ω
G^1 = if(<(k, n), k, Ω)
G^2 = if(<(k, n), k, if(<(-(k, n), n), -(k, n), Ω))
G^3 = G^i

Recurrence Relation and Extrapolation:
G^1 = if(<(k, n), k, G^0[k ← -(k, n)])
G^2 = if(<(k, n), k, G^1[k ← -(k, n)])
G^3 = if(<(k, n), k, G^2[k ← -(k, n)])
extrapolate:
G = if(<(k, n), k, G[k ← -(k, n)])
calls_S = {(H, G_i) | G_i ∈ Φ, pos(t_0, G_i) ≠ ∅} ∪
          {(G_i, G_j) | G_i, G_j ∈ Φ, pos(t_i, G_j) ≠ ∅}.

The transitive closure calls*_S of calls_S is the smallest set calls*_S ⊆ ({H} ∪ Φ) × Φ for which holds:

1. calls_S ⊆ calls*_S,
2. for all P ∈ {H} ∪ Φ and G_i, G_j ∈ Φ: if P calls*_S G_i and G_i calls_S G_j, then P calls*_S G_j.
For introducing a notion of minimality of RPSs, two further restrictions are necessary:

Only Primitive Recursion: All recursive equations are primitive recursive, that is, there are no recursive calls in substitutions.
No Mutual Recursion: There are no recursive equations G_i, G_j with i ≠ j such that G_i calls*_S G_j and G_j calls*_S G_i.

The first restriction was already given implicitly in definition 7.1.27, where sub maps into T_Σ(X_i). A consequence of this restriction is that we cannot infer general μ-recursive functions, such as the Ackermann function. The second restriction is only syntactical since each pair of mutually recursive functions can be transformed into semantically equivalent functions which are not mutually recursive.
Definition 7.2.2 (Minimality of an RPS). Let S = (G, t_0) be an RPS. S is minimal if for all subprograms G_i(x_1, ..., x_{m_i}) = t_i, G_i ∈ Φ, X_i = {x_1, ..., x_{m_i}} holds:

No unused subprograms: H calls*_S G_i.
No unused parameters: For each instantiation σ: X_i → T_Σ and instantiations σ_j: X_i → T_Σ, j = 1, ..., m_i, constructed as

σ_j(x_k) = t        if k = j, with t ∈ T_Σ, t ≠ σ(x_j),
σ_j(x_k) = σ(x_k)   if k ≠ j,

holds L(G_i, σ) ≠ L(G_i, σ_j).
No identical parameters: For all σ: X_i → T_Σ and all x_i, x_j ∈ X_i holds: for all unfoldings υ_w ∈ Υ_i, w ∈ W (see def. 7.1.29) with instantiation σ_w of variables, i = j follows from σ_w(x_i) = σ_w(x_j).
Note that similar restrictions were formulated by Le Blanc (1994). To sum up the given criteria for minimality: an RPS is minimal if it only contains recursive equations which are called at least once, if each parameter in the head of an equation is used at least once for instantiating a parameter in the body of the equation, and if there exist no parameters with different names but (always) identical instantiations. It is obvious that each RPS which does not fulfill these criteria can be transformed into one which does fulfill them by deleting unused equations and parameters and by unifying parameters with different names but identical instantiations. That is, these criteria do not restrict the expressiveness of RPSs.
A stricter definition of minimality could additionally consider the size (i.e., number of symbols) of the subprogram bodies (Osherson, Stob, & Weinstein, 1986). But this makes sense only for RPSs with a single recursive equation. In that case, minimality of a subprogram G is given if there exists no subprogram G' with L(G', σ) ⊂ L(G, σ) (Mühlpfordt & Schmid, 1998). For RPSs with more than one equation, the minimality of each single equation cannot be determined by comparison of languages: if equation G calls other equations G_i, there is always a smaller language where these calls are replaced by constant symbols. Therefore, it would be necessary to define minimality over the size of all equation bodies and the term representing the main program. There are too many degrees of freedom to define a tight criterion: for example, is an RPS consisting of a small main program but complex subprograms smaller than an RPS consisting of a complex main program and small subprograms?
To guarantee uniqueness when calculating the substitutions in a subprogram body for a given initial program, we introduce the notion of "substitution uniqueness":

Definition 7.2.3 (Substitution Uniqueness of an RPS). Let T_init ⊆ T_{Σ∪{Ω}} be a set of initial programs and S = (G, t_0) an RPS over Σ and Φ which explains T_init recursively (see def. 7.1.30). S is called substitution unique with regard to T_init if there exists no S' over Σ and Φ which explains T_init recursively and for which holds:

– t'_0 = t_0,
– for all G_i ∈ Φ holds pos(t_i, G_i) = pos(t'_i, G_i) = U_rec and t'_i[U_rec ← Ω] = t_i[U_rec ← Ω], and there exists an r ∈ R with sub(x_j, r) ≠ sub'(x_j, r) (see def. 7.1.27).
Substitution uniqueness guarantees that it is not possible to replace a substitution term in S such that the resulting RPS S' still explains a given set of initial programs recursively.
7.2.3 The Synthesis Problem

Now all preliminaries are given to state the synthesis problem:

Definition 7.2.4 (Synthesis Problem). Let T_init ⊆ T_{Σ∪{Ω}} be a set of initial programs with indices I = {1, ..., |T_init|}. The synthesis problem is to induce

– a signature Σ,
– a set of function variables Φ = {G_1, ..., G_n}, and
– a minimal RPS S = (G, t_0) with a main program t_0 ∈ T_{Σ∪Φ}(X) and a set of recursive equations G = ⟨G_1(x_1, ..., x_{m_1}) = t_1, ..., G_n(x_1, ..., x_{m_n}) = t_n⟩,

such that

– S recursively explains T_init (def. 7.1.30), and
– S is substitution unique (def. 7.2.3).
In the following section, an algorithm for solving the synthesis problem will be presented. Identification of a signature is dealt with only implicitly: the RPS is constructed exactly over those symbols which are necessary for explaining the initial programs recursively. Because the folding mechanism should be independent of the way in which the initial programs were constructed (by a planner, as input of a system user), we allow for incomplete unfoldings, and we allow instantiated initial programs as well as programs over parameters (i.e., generic traces), where parameters are treated just like constants.
If synthesis starts with only a single initial program, the main program can contain no parameters (except the parameters which were identified as "constants"). If a parameterized main program is searched for (which is the usual case), we assume that all parameters inferred for the subprograms which are called from the main program are also parameters of the main program itself.
7.3 Solving the Synthesis Problem

In this section, the algorithms for constructing an RPS from a set of initial programs (trees) are presented. For each given initial tree, in a first step, a segmentation is constructed (sect. 7.3.1), which corresponds to the unfoldings (see def. 7.1.29) of a hypothetical subprogram. As intermediate steps, first a set of hypothetical recursion points (see def. 7.1.27) is identified and a skeleton of the body of the searched-for subprogram is constructed.
If recursion points do not explain the complete initial tree, all unexplained subtrees are considered as a new set of initial trees for which the synthesis algorithm is called recursively, and the subtrees are replaced by the name G_i of the subprogram to be inferred (sect. 7.3.3). If an initial tree cannot be segmented starting from the root, a constant initial part is identified and included in the body of the calling function, that is, in t_0 for top-level subprograms and in G_i if calls(G_i, G_j) and G_j is the currently considered hypothetical subprogram (see figure 7.13 in section 7.3.5.2).
For a given segmentation, the maximal pattern is constructed by including all parts of the initial trees which are constant over all segments into the subprogram body (sect. 7.3.2). All remaining parts of the segments are considered as parameters and their substitutions. That is, in a last step, a unique substitution rule must be calculated and, as a consequence, the parameter instantiations of the function call are identified (sect. 7.3.4). The only backtrack point for folding is calculating a valid segmentation for the initial trees.
We begin each subsection with an illustration. For readers not interested in the formal details, it is enough to study the illustration to get the general idea of our approach.
7.3.1 Constructing Segmentations

In the following, we first give an informal illustration. Then we introduce some concepts for constructing segmentations of initial programs. Afterwards, we introduce criteria which must hold for a segmentation to be valid. Finally, we present an algorithm for constructing valid segmentations.

7.3.1.1 Segmentation for 'Mod'. Consider the simple RPS for calculating Mod(k, n), which consists of a single recursive equation and where the main program t_0 is just the call of this equation:

S = ⟨G, t_0⟩
G = ⟨Mod(k, n) = if(<(k, n), k, Mod(-(k, n), n))⟩
t_0 = Mod(k, n)

Remember that all recursive equations are considered as "subprograms". The main program, in general, can be a term which calls one or more of the recursive subprograms of the RPS (see sect. 7.1.3.1).
In figure 7.6 a term is given which can be folded into this RPS. For better readability, we do not give an initial instantiation for the parameters k and n. The first and crucial step of folding is to identify a valid recurrent segmentation of an initial program term t_init. For the given term, such a segmentation is marked in the figure.
As a first step, all paths in the term leading to Ωs are identified; that is, paths to positions where the unfolding of the hypothetical (searched-for) RPS was stopped. These paths constitute the skeleton of the term, which is defined as the minimal pattern of the term (see def. 7.1.11) which contains all Ωs and where all subterms which are not on paths to Ωs are ignored.
[Figure 7.6 (omitted): the Mod initial tree cut into four segments along its rightmost if-branch; the skeleton of t_init, the parameter subterms y, y', y1, ..., y6 of the segments, and the hypothesis about the program body if(y, y', G) with G at the Ω position are marked.]

Fig. 7.6. Valid Recurrent Segmentation of Mod
Then, a set of hypothetical recursion points U_rec (see def. 7.1.27) is generated. There are different possible strategies to obtain such a set from t_init. The most simple hypothesis is that, beginning at the root, each node in the skeleton constitutes an unfolding position. For our example, this hypothesis is U_rec = {3.}. We can construct hypothetical unfolding indices W from U_rec (see def. 7.1.28), which leads to the segmentation given in figure 7.6.
Replacing all subtrees at positions U_rec in the skeleton by Ωs results in a hypothesis about the program body, t_skel. For our example, the segmentation induced by U_rec = {3.} is valid because each segment consists of the same (non-trivial) pattern and because all Ωs (here only a single one) can be reached and this Ω has a position which corresponds to a recursion point (unfolding position). The valid segmentation is recurrent because the skeleton of the upper-most segment, t_skel, constitutes a non-trivial pattern for all subtrees t|_u of the initial program where u is a hypothetical recursion point. The notion of a valid recurrent segmentation follows directly from the definition of a recursive explanation of a recursive program (see def. 7.1.30).
If a valid recurrent segmentation can be found, the result of this first step of folding is a skeleton of the subprogram body. For our example it is if(y, y', G), where G is a new function variable in Φ.
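This validity check can be executed for Mod: with hypothesis U_rec = {3.}, every segment of the initial tree must match the hypothesized body once its subtree at position 3 is cut off. A sketch with our own tuple encoding, where '_' marks the parameter positions y and y':

```python
# Sketch of the validity check of section 7.3.1.1 for the Mod initial tree.
# Encoding, wildcard convention, and helper names are our own illustration.

def subterm(t, pos):
    for i in pos:
        t = t[i]
    return t

def replace(t, pos, new):
    if not pos:
        return new
    i = pos[0]
    return t[:i] + (replace(t[i], pos[1:], new),) + t[i + 1:]

def matches(pattern, t):
    """Does t match pattern, where '_' is a wildcard (a parameter position)?"""
    if pattern == "_":
        return True
    if isinstance(pattern, str) or isinstance(t, str):
        return pattern == t
    return (pattern[0] == t[0] and len(pattern) == len(t)
            and all(matches(p, a) for p, a in zip(pattern[1:], t[1:])))

T_INIT = ("if", ("<", "k", "n"), "k",
          ("if", ("<", ("-", "k", "n"), "n"), ("-", "k", "n"),
           ("if", ("<", ("-", ("-", "k", "n"), "n"), "n"),
            ("-", ("-", "k", "n"), "n"), "Omega")))

body_hypothesis = ("if", "_", "_", "Omega")     # skeleton body: if(y, y', Omega)

t, ok = T_INIT, True
while t != "Omega":                             # walk down the unfolding positions
    ok = ok and matches(body_hypothesis, replace(t, (3,), "Omega"))
    t = subterm(t, (3,))
print(ok)                                       # True: the segmentation is valid
```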
As stated in the definition of the synthesis problem (see def. 7.2.4), the input to the folding algorithm is a set of initial programs T_init. That is, a user or another system can provide more than one example program, which are considered as unfoldings of the same searched-for RPS with different instantiations and/or different unfolding depths. In our example, T_init consisted of a single initial program. For some definitions below it is enough to consider a single program t_init ∈ T_init; for others it is important that the complete set is taken into account.
In general, finding a valid recurrent segmentation can be much more complicated than illustrated for Mod:

– It might not be possible to find a valid recurrent segmentation starting at the root of the initial tree; that is, there exists a constant initial part of the term which belongs to the main program t_0. In that case, this initial part is introduced in t_0 and segmentation starts with T_init as the set of all subterms of this initial part.
– The recurrence relation underlying the term can be of arbitrary complexity. That is, in general, the skeleton does not just consist of a single path. Our strategy for constructing segmentations is to find segmentations which start as high in the given tree as possible and which cover as large a part of the tree as possible. As we will see in the algorithm below, we search for hypothetical recursion points "from left to right" in the tree.
– There might exist subterms in the given initial program which contain Ωs but which cannot be covered within any valid recurrent segmentation. In this case, it is assumed that these subtrees belong to calls of a further recursive equation. The positions of such subtrees are considered as "sub-schema positions" U_sub, and finding valid recursive segmentations for these subtrees is dealt with by starting the folding algorithm again with these subtrees as a new set of initial trees.
– We allow for incomplete unfoldings. That is, some paths in the initial program are truncated before an unfolding is complete. For example, for the recursive equation

addlist(l) = if(empty(l), 0, +(head(l), addlist(tail(l))))

the last unfolding could result in an Ω at the position of +, because the else-case is not considered if list l is empty. Although we consider unfolding as a purely syntactical process, it might be useful to take into account that construction of the initial program was based on some kind of semantic information (such as evaluation of expressions). Consequently, when constructing a segmentation, it is allowed that a segment contains an Ω above the hypothetical unfolding position (but not below!).

All these cases are covered in our approach, which we will now introduce more formally.
7.3.1.2 Segmentation and Skeleton. The first and crucial step for inducing an RPS from a set of initial programs is the identification of valid, recurrent segmentations of the initial trees. When searching for a segmentation, at first only those nodes of the initial trees are considered which lie on paths leading to hypothetical recursion points. These paths constitute a special (minimal) pattern of an initial tree, called skeleton:
Definition 7.3.1 (Skeleton). The skeleton of a term t ∈ T_{Σ∪{Ω}}(X), written skeleton(t), is the minimal pattern (def. 7.1.11) of t for which holds pos(t, Ω) = pos(skeleton(t), Ω).
Let U ⊆ pos(t, Ω) with t|_u = Ω for all u ∈ U. Then we define skeleton(t, U) as that skeleton of term t which contains only the Ωs at positions u ∈ U, that is, the minimal pattern with pos(skeleton(t, U), Ω) = U.
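Definition 7.3.1 can be sketched as a small recursive function: a subterm survives into the skeleton exactly if it contains an Ω; everything else is cut off, here replaced by a wildcard '_'. Encoding and names are our own:

```python
# Sketch of skeleton(t) of def. 7.3.1: keep exactly the nodes lying on paths
# to Omega and cut off every subterm that contains no Omega.

def contains_omega(t):
    if isinstance(t, str):
        return t == "Omega"
    return any(contains_omega(a) for a in t[1:])

def skeleton(t):
    if isinstance(t, str):
        return t if t == "Omega" else "_"
    if not contains_omega(t):
        return "_"
    return (t[0],) + tuple(skeleton(a) for a in t[1:])

t = ("if", ("<", "k", "n"), "k",
     ("if", ("<", ("-", "k", "n"), "n"), ("-", "k", "n"), "Omega"))
print(skeleton(t))   # ('if', '_', '_', ('if', '_', '_', 'Omega'))
```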
An example skeleton for Mod is given in figure 7.5. This is the simple case of a single initial program which can be explained by an RPS consisting of a single recursive equation and a main program which just calls this equation. Note that the paths under consideration lead to Ωs, that is, truncation points of the unfolding of a hypothetical RPS.^3
In general, for finding a recursive explanation of an initial program it can be necessary to identify a subprogram G_i which calls further subprograms. The calls of these subprograms must be at fixed positions in G_i, called sub-schema positions^4. In that case, the skeleton of G_i does not contain all paths leading to Ωs (see def. 7.3.1). For the remaining Ωs it must hold that they can be reached over paths from the hypothetical sub-schema positions. Recursion points and sub-schema positions constitute a hypothesis for inducing an RPS.
Definition 7.3.2 (Recursion Points and Sub-Schema Positions). Let U_rec and U_sub be two sets of positions with U_rec ≠ ∅. If for all u_sub ∈ U_sub and all positions u above u_sub (u_sub = u ∘ i, i ∈ N) holds
1. ¬∃ u'_rec ∈ U_rec with u_sub ≤ u'_rec, and
2. ∃ u_rec ∈ U_rec with u ≤ u_rec,
then (U_rec, U_sub) is a hypothesis over recursion points U_rec and sub-schema positions U_sub.
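The two conditions of definition 7.3.2 are purely structural and easy to check once positions are encoded as tuples of child indices (root = ()). The following Python sketch, with naming of our own choosing, tests a candidate pair (U_rec, U_sub); "u ≤ v" is the prefix relation on positions.

```python
# Hedged sketch of the hypothesis conditions in def. 7.3.2.
# Positions are tuples of child indices; is_prefix(u, v) means u lies
# on the path from the root to v.

def is_prefix(u, v):
    return v[:len(u)] == u

def is_hypothesis(U_rec, U_sub):
    if not U_rec:                       # U_rec must be non-empty
        return False
    for u_sub in U_sub:
        # condition 1: no recursion point at or below a sub-schema position
        if any(is_prefix(u_sub, u_rec) for u_rec in U_rec):
            return False
        # condition 2: every position above u_sub lies on a path
        # leading to some recursion point
        for k in range(len(u_sub)):
            u = u_sub[:k]
            if not any(is_prefix(u, u_rec) for u_rec in U_rec):
                return False
    return True
```

For the ModList example below, `is_hypothesis({(3, 2)}, {(3, 1)})` holds: the sub-schema position 3.1 is not above the recursion point 3.2, but shares the path over u = 3 with it.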
For the initial program given in figure 7.7 (see tab. 7.2 for the underlying RPS) the recursion point is 3.2, that is, the second "if" on the rightmost path. The sub-schema position is 3.1, that is, the "if" in the left path under "cons". For this position holds that it is not above the recursion point (condition 1 of def. 7.3.2) but it can be reached over a path leading to this recursion point (condition 2 of def. 7.3.2): u_sub = u ∘ 1 = 3.1 for u = 3.
3 For initial trees generated by a planner or by a system user this means that cases for which the continuation is unknown or not calculated must be marked by a special symbol.
4 Because the inference of a further sub-program can again involve a constant initial part, not just a recursive equation but an additional "main" could be inferred, which will then be integrated in the calling subprogram. Therefore, we speak of sub-schema positions rather than subprogram positions.
194 7. Folding of Finite Program Terms
[Fig. 7.7. Initial Program for ModList (tree diagram with Ω leaves not reproduced)]
7.3.1.3 Valid Hypotheses. Recursion points and sub-schema positions constitute a valid hypothesis if they are on paths leading to Ωs and if the Ωs are at positions corresponding to recursion points or, for incomplete unfoldings, above recursion points. For example, in the initial tree given in figure 7.7 the last unfolding is incomplete: After the `if'-node in the right-most path another `cons'-node should follow; instead, there is an Ω.
Definition 7.3.3 (Valid Recursion Points and Sub-Schema Positions). Let t_init be an initial tree and (U_rec, U_sub) a hypothesis. (U_rec, U_sub) is a valid hypothesis for t_init if
1. ∀u ∈ U_sub: if u ∈ pos(t_init), then pos(t_init|_u, Ω) ≠ ∅,
2. ∀u ∈ U_rec: u ∈ pos(t_init) and (U_rec, U_sub) is valid for t_init|_u, or
3. ∀u ∈ U_rec: ∃u_Ω ∈ pos(t_init) with u_Ω < u and t_init|_{u_Ω} = Ω.
The first condition of definition 7.3.3 ensures that sub-schema positions are at such positions in the initial tree where the subtrees contain at least one Ω. Condition 2 recursively ensures that the hypothesis holds for all sub-terms at the recursion points. Condition 3 ensures that the Ωs have positions at or above recursion points.
Up to now we have only regarded the structure of an initial tree. A valid segmentation must additionally consider the symbols of the initial tree (i.e., the node labels). Therefore, hypothesis (U_rec, U_sub) will be extended to a term t_skel which represents a hypothesis about a sub-program body.
Definition 7.3.4 (Valid Segmentation). Let t_init ∈ T_{Σ∪{Ω}} with pos(t, Ω) ≠ ∅ be an initial program. Let (U_rec, U_sub) be a valid hypothesis about recursion points and sub-schema positions in t. Let t_skel ∈ T_{Σ∪{Ω}}(X) be a term with t_skel ≤ t_init (def. 7.1.26) and U_rec ∪ U_sub = pos(t_skel, Ω). A segmentation given by (U_rec, U_sub) together with t_skel is valid for t_init if for

U_Ω = U_rec ∩ pos(t_init) (the set of recursion points in t_init), and
U_i = U_rec \ U_Ω (the set of recursion points not in t_init)

holds:
Reachability of Ωs: For all u_i ∈ U_i exists a u_Ω ∈ pos(t_init, Ω) with u_Ω < u_i.
Validity of the Skeleton: Given the set of all positions of Ωs in t_init which are above the recursion points as U_Ω^{t_i} = {u_Ω | u_Ω < u, u_Ω ∈ pos(t_init, Ω), u ∈ U_i}, it holds t_skel[U_Ω^{t_i} ← Ω] ≤ t_init.
Validity of Segmentation in Subtrees: For all u ∈ U_Ω holds: (U_rec, U_sub) together with t_skel is a valid segmentation for t|_u.
The definition is somewhat complicated because we are allowing incomplete unfoldings. Note that we are currently not concerned with the question of how recursion points are obtained. We just assume that a non-empty set of hypothetical recursion points U_rec is at hand, which can contain positions which can be found in the given t_init as well as positions which are not in t_init.
The term t_skel is a hypothesis about the skeleton of the body of a searched-for subprogram. It contains Ωs (truncations of unfoldings) at the hypothetical recursion points and sub-schema positions. For incomplete unfoldings, for each given recursion point there must exist an Ω at or above this point ("reachability"). The set U_Ω^{t_i} contains all such positions, and the term t_skel[U_Ω^{t_i} ← Ω] is the skeleton of the hypothetical subprogram body reduced to the size of the currently investigated and possibly incomplete sub-term. For a valid hypothesis it must further hold that the currently investigated term contains a sub-term with at least one Ω and that there are no further Ωs in t_init. The positions in U_sub cover all Ωs which are not generated by the currently investigated subprogram ("validity of skeleton"). Finally, these conditions must hold for all further hypothetical segments ("validity of segmentation in subtrees"). Note that for a tree t|_u, which is a subtree of t_init with a recursion point as root, it can happen that not all positions in U_rec are given!
If we extend this definition from one initial tree t_init to a set of initial trees T_init, we require that at least one of the initial programs contains a completely unfolded segment:
Definition 7.3.5 (Segmentation of a Set of Initial Programs). (U_rec, U_sub) together with t_skel is a valid segmentation of a set of initial programs T_init if there exists a t_init ∈ T_init with t_skel ≤ t_init and if (U_rec, U_sub) together with t_skel is a valid segmentation of all t ∈ T_init as defined in 7.3.4.
A valid segmentation partitions an initial tree into a sequence of segments. These segments can be indexed in analogy to the unfoldings of a subprogram (def. 7.1.29):
Definition 7.3.6 (Segments of an Initial Program). Let T_init ⊆ T_{Σ∪{Ω}} be a set of initial programs and E a set of indices with t_e ∈ T_init for all e ∈ E. Let R = {1, …, |U_rec|} be a set of indices of recursion points. Let (U_rec, U_sub) together with t_skel be a valid segmentation for all t_e ∈ T_init and W the set of indices over segments constructed over U_rec. Let G be a new function variable with G ∉ Σ. Then we can define segment(t_e, U_rec, w), with G as name for the subprogram, as the segment of t_e with respect to recursion points U_rec with segment index w ∈ W inductively:

1. segment(t, U_rec, λ) = t[U_rec ∩ pos(t) ← G] if t ≠ Ω, and ⊥ otherwise
2. segment(t, U_rec, r.v) = segment(t|_{u_r}, U_rec, v) if r ∈ R and u_r ∈ pos(t), and ⊥ otherwise.
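Under the same nested-tuple term encoding as before (an assumption of ours, with Ω as the label "Omega" and G as the fresh function variable), segment extraction can be sketched as follows. The empty tuple plays the role of the empty index λ, None stands for ⊥, and we assume the recursion points are enumerated in sorted order so that index r refers to the r-th point.

```python
# Hedged sketch of def. 7.3.6: cutting segments out of an initial tree.

OMEGA = ("Omega",)
G = ("G",)

def subterm(t, u):
    """Return t|_u, or None if u is not a position of t."""
    for i in u:
        if i > len(t) - 1:
            return None
        t = t[i]
    return t

def replace_at(t, u, s):
    """Return t with the subtree at position u replaced by s."""
    if u == ():
        return s
    i = u[0]
    return t[:i] + (replace_at(t[i], u[1:], s),) + t[i + 1:]

def segment(t, U_rec, w):
    if w == ():                        # empty index: cut at recursion points
        if t == OMEGA:
            return None                # unknown segment (bottom)
        for u in U_rec:
            if subterm(t, u) is not None:
                t = replace_at(t, u, G)
        return t
    r, v = w[0], w[1:]                 # descend along recursion point r
    u_r = sorted(U_rec)[r - 1]         # assumption: r-th point in sorted order
    sub = subterm(t, u_r)
    return segment(sub, U_rec, v) if sub is not None else None
```

On a factorial-like unfolding with U_rec = {3.2}, the segment with index λ is the top unfolding with G at position 3.2, the segment with index 1 is the next unfolding, and descending past the truncation yields ⊥.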
Function segment(t_e, U_rec, w) returns, for a given initial program t_e and a set of hypothetical recursion points U_rec, the according segment with index w if it is contained in the tree, and otherwise "unknown".^5 A new function variable is inserted at the position of the recursion point, which will be the name of the subprogram to be induced. For the case of incomplete unfoldings, the Ωs positioned above hypothetical recursion points remain in the tree. It is not possible that a hypothetical subprogram body consists of the subprogram name G only (item 1 in def. 7.3.6).
The segments of an initial program correspond to unfoldings of a hypothetical recursive equation as defined in 7.1.29. But (up to now), segments can additionally contain subtrees (unfoldings) of further recursive equations (see sect. 7.3.3).
The set of all indices for a given initial program with a given segmentation can be defined as:
Definition 7.3.7 (Segment Indices). Let T be a set of initial trees indexed over E with t_e ∈ T for all e ∈ E. Let (U_rec, U_sub) together with t_skel be a valid segmentation for all terms in T and let R = {1, …, |U_rec|} be a set of indices of recursion points and W the set of unfolding indices calculated over R (def. 7.1.28). Then W(e) is the set of indices over the segments contained in an initial tree t_e, with W(e) = {w | w ∈ W, segment(t_e, U_rec, w) ≠ ⊥}.
For inducing a recursive subprogram from a given initial program, this initial program must be of "sufficient size", that is, it must contain a sufficient number of (hypothetical) unfoldings such that all information necessary for folding can be obtained from the given tree. In particular, it is necessary that each hypothetical subprogram is unfolded at least once at each hypothetical recursion point. A hypothesis over recursion points of a subprogram can only lead to successful folding if the following proposition holds:
Lemma 7.3.1 (Recurrence of a Valid Segmentation). Let (U_rec, U_sub) together with t_skel be a valid segmentation for a set of initial programs T_init and let R = {1, …, |U_rec|} be the set of indices over recursion points U_rec. (U_rec, U_sub) can only generate a subprogram which recursively explains T_init if for each recursion point u_r ∈ U_rec there exists an initial tree t_e ∈ T_init such that there exists a w ∈ W(e) with w = v ∘ r, v ∈ W(e), r ∈ R.
Proof: follows directly from definition 7.1.30.
Lemma 7.3.1 is a weak consequence of the definition of recursive explanations (def. 7.1.30), because it is not required that segments correspond to complete unfoldings. That is, we have given just a necessary but not a sufficient condition for successful induction.
7.3.1.4 Algorithm. The definition of a valid segmentation (def. 7.3.4) allows us to formulate some relations which can be used for an algorithmic construction of a segmentation hypothesis:
5 Note that we use the symbol ⊥ to represent an unknown segment rather than Ω, to discriminate between an undefined sub-term in a term (Ω) and an undefined hypothetical function body (⊥).
- Because of the conditions t_skel ≤ t_init and U_rec ⊆ pos(t_skel), it must hold that U_rec ⊆ pos(t_init).
- From the condition "validity of segmentation in subtrees" it follows for all u_rec ∈ U_rec that node(t_init, u_rec) = node(t_init, λ). That is, only such positions in an initial tree can be candidates for recursion points which contain the same symbol as its root node.^6
- The positions of further sub-schemata can be inferred from the uppermost unfolding in t_init: U_sub = {u | u ∈ lpos(skeleton(t_init[U_rec ← Ω])) \ U_rec, pos(t_init|_u, Ω) ≠ ∅}. (Remember that lpos(t) returns the positions of leaf nodes of a term, see def. 7.1.6.)
- Consequently, a hypothesis t_skel about the skeleton of the subprogram body can be constructed: t_skel = skeleton(t_init[(U_rec ∪ U_sub) ← Ω]).
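The root-label relation already yields a cheap candidate generator: only positions labelled like the root of the tree qualify as recursion points. A hypothetical sketch, again with terms as nested tuples and positions as tuples of child indices:

```python
# Necessary filter only: every recursion-point candidate carries the same
# symbol as the root node (validity must still be checked afterwards).

def candidate_recursion_points(t):
    root = t[0]
    cands = []
    def walk(s, u):
        if u != () and s[0] == root:
            cands.append(u)
        for i, c in enumerate(s[1:], start=1):
            walk(c, u + (i,))
    walk(t, ())
    return cands
```

For a factorial-like unfolding whose root is `if` and whose only inner `if` sits at position 3.2, the single candidate is (3, 2).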
As mentioned before, constructing a hypothesis about the set of possible recursion points U_rec from a given set of initial trees is the first and crucial step of inducing an RPS. This hypothesis determines the set of sub-schema positions and the skeleton of a searched-for subprogram body. It represents an assumption (U_rec, U_sub) about the segmentation of an initial tree which corresponds to the i-th unfolding of a searched-for recursive equation. The validity of a hypothesis can be initially checked using the criteria given in definition 7.3.4, and it can be checked whether it holds recursively over a complete initial tree using lemma 7.3.1. If validation fails, a new set of recursion points U_rec must be constructed. If no such set can be found, the searched-for RPS does not consist of a main program which just calls a recursive equation, and the search for recursion points must start at deeper levels of the tree. In the worst case, the complete tree would be regarded as main program with an empty set of recursive equations.^7
For constructing possible sets of recursion points, each given initial tree must be completely traversed. The construction is based on a strategy which determines the sequence in which possible hypotheses are generated. The proposed strategy ensures that minimal subprograms which explain maximally large parts of an initial tree are constructed:
Search for a segmentation, starting with U_rec = ∅ at position u = 1, considering the following cases:

Case 1: (The initial trees are completely traversed.)
If there exists no further position:
If U_rec ≠ ∅ then construct U_sub and t_skel with respect to U_rec.
If (U_rec, U_sub) together with t_skel is a valid segmentation, then stop, else backtrack.
If U_rec = ∅ then no recursion points could be found and, consequently, there exists no subprogram which recursively explains the initial trees.

Case 2: (The node at position u is a recursion point.)
Set U_rec = U_rec ∪ {u}.
Construct U_sub and t_skel with respect to U_rec.
If (U_rec, U_sub) together with t_skel is a valid segmentation, then proceed with U_rec to the next position u' right of u.
Otherwise go to case 3.

Case 3: (The node at position u is lying above a recursion point.)
If there exists a position u' = u ∘ 1 in the initial trees, proceed with U_rec at position u'.
Otherwise go to case 4.

Case 4: (There are no nodes lying below the current node.)
Proceed with U_rec at the next position u' right of u.

6 Note that it is possible that a given initial tree contains a constant part belonging to the body of the calling function, such that the initial tree for which a recursive subprogram is to be induced has a root which is at a deeper position.
7 Because we search for an RPS with at least one recursive equation, our algorithm terminates already at an earlier point if the remaining tree is so small that it is not possible to find at least one unfolding for each hypothetical recursion point.
This strategy realizes a search for recursion points from left to right, where for each hypothetical recursion point the search proceeds downward. The function for calculating the next position to the right is given in table 7.6.
Table 7.6. Calculation of the Next Position on the Right

Function: nextPosRight: T_{Σ∪{Ω}} × pos(t_init) → pos(t_init) for all t_init ∈ T_init
Pre: T_init is a set of initial trees; u is a position in the initial trees
Post: next position right of u if such a position exists, ⊥ otherwise
1. IF u = λ
2. THEN return ⊥
3. ELSE
   a) Let u = u' ∘ k
   b) IF ∃t ∈ T_init with u' ∘ (k + 1) ∈ pos(t)
   c) THEN return u' ∘ (k + 1)
   d) ELSE return nextPosRight(T_init, u').
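With positions encoded as tuples of 1-based child indices, table 7.6 translates almost literally into Python. This is a sketch under our own encoding, with None standing for ⊥: move to the right sibling if any initial tree has one, otherwise backtrack to the parent's right sibling.

```python
# Sketch of nextPosRight (table 7.6); root position = ().

def has_position(t, u):
    for i in u:
        if i > len(t) - 1:
            return False
        t = t[i]
    return True

def next_pos_right(T_init, u):
    if u == ():                         # the root has no right sibling
        return None
    parent, k = u[:-1], u[-1]
    candidate = parent + (k + 1,)
    if any(has_position(t, candidate) for t in T_init):
        return candidate
    return next_pos_right(T_init, parent)
```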
7.3.2 Constructing a Program Body
For a given valid segmentation the body of a hypothetical subprogram can be constructed. For each node of a segment it must be decided whether it belongs to
- the body of the subprogram, or
- the parameter instantiation, or
- a further subprogram.
In this section we will only consider subprograms without calls of further subprograms, that is, U_sub = ∅. An extension for subprograms with additional subprogram calls is given in section 7.3.3. In the following, we will first give an example for constructing the program body. Afterwards, we will introduce a theorem concerning the maximization of the program body. Finally, we will present an algorithm for calculating the program body for a given segmentation of a set of initial trees.
7.3.2.1 Separating Program Body and Instantiated Parameters. For illustration we use a simple linear subprogram, the recursive equation for calculating the factorial of a natural number. This subprogram and its third unfolding are given in table 7.7. The parameter x is instantiated with 2: in the first case, represented as succ(succ(0)), and in the second case, represented as pred(3) (short for pred(succ(succ(succ(0))))).
Table 7.7. Factorial and Its Third Unfolding with Instantiation succ(succ(0)) (a) and pred(3) (b)

Subprogram: G(x) = if(eq0(x), 1, *(x, G(pred(x))))

Unfolding (a): third unfolding for β(x) = succ(succ(0))
t_init = if(eq0(succ(succ(0))), 1, *(succ(succ(0)),
  if(eq0(pred(succ(succ(0)))), 1, *(pred(succ(succ(0))),
  if(eq0(pred(pred(succ(succ(0))))), 1, *(pred(pred(succ(succ(0)))), Ω))))))

Unfolding (b): third unfolding for β(x) = pred(3)
t_init = if(eq0(pred(3)), 1, *(pred(3),
  if(eq0(pred(pred(3))), 1, *(pred(pred(3)),
  if(eq0(pred(pred(pred(3)))), 1, *(pred(pred(pred(3))), Ω))))))
A valid recurrent segmentation for t_init is U_rec = {3.2}, and the skeleton is t_skel = if(y_1, y_2, *(y_3, Ω)). The segments for t_init are given in table 7.8.
Table 7.8. Segments of the Third Unfolding of Factorial for Instantiation succ(succ(0)) (a) and pred(3) (b)

(a) Index w  Segment
λ    if(eq0(succ(succ(0))), 1, *(succ(succ(0)), G))
1    if(eq0(pred(succ(succ(0)))), 1, *(pred(succ(succ(0))), G))
1.1  if(eq0(pred(pred(succ(succ(0))))), 1, *(pred(pred(succ(succ(0)))), G))

(b) Index w  Segment
λ    if(eq0(pred(3)), 1, *(pred(3), G))
1    if(eq0(pred(pred(3))), 1, *(pred(pred(3)), G))
1.1  if(eq0(pred(pred(pred(3)))), 1, *(pred(pred(pred(3))), G))
The program body can be constructed as the maximal pattern (see def. 7.1.13) of all given segments. For the segments constructed from the initial tree with instantiation β(x) = succ(succ(0)), the program body therefore is

t_G' = if(eq0(x'), 1, *(x', G')).

The body is identical with the definition of factorial given in table 7.7. For the segments constructed from the initial tree with instantiation β(x) = pred(3), the program body is

t_G'' = if(eq0(pred(x'')), 1, *(pred(x''), G'')).

A part of the parameter instantiation has become part of the program body! This is a consequence of defining the program body as the maximal pattern of the segments of the initial trees. If an instantiation is given which shares a non-trivial pattern (i.e., the anti-instance is not just a variable; see def. 7.1.11) with the substitution term (in the recursive call), then this pattern will be assumed to be part of the program body. A simple practical solution is to present the folding algorithm with a set of initial trees with different instantiations.
In the following, it will be shown that if a recursive equation exists for a set of initial trees, then we can find a recursive subprogram with a "maximized" body. This resulting subprogram together with the found parameter instantiation is equivalent to the "intended" subprogram with the "intended" instantiation. Of course, it does not hold that the induced program and the intended program are equivalent for all possible parameter instantiations!
7.3.2.2 Construction of a Maximal Pattern. The body of a subprogram is constructed by including all nodes in the skeleton which are common over all segmentations. All subtrees which differ over segments are considered as instantiated parameters, that is, the initial instantiations of the parameters when the subprogram is called and the instantiations resulting from substitutions of parameters in recursive calls. A given set of initial trees, in the following called examples, can contain trees with different initial parameter instantiations, as shown in the factorial example above.
Theorem 7.3.1 (Maximization of the Body). For a set of example initial trees, let E be a set of example indices e with |E| ≥ 1. For each recursive subprogram G(x_1, …, x_n) = t_G with X = {x_1, …, x_n} and t_G ∈ T_{Σ∪G}(X) together with initial instantiations β_e: X → T_Σ for all x ∈ X and all e ∈ E exists a subprogram G'(x_1, …, x_{n'}) = t_{G'} with X' = {x_1, …, x_{n'}} and t_{G'} ∈ T_{Σ∪G'}(X') together with initial instantiations β'_e: X' → T_Σ for all x ∈ X' and all examples e ∈ E such that L(G, β_e) = L(G', β'_e) for all e ∈ E. Additionally, for each x ∈ X' holds that the instantiations which can be generated by G' from β'_e do not share a common non-trivial pattern.
Proof: see appendix B.2.
Theorem 7.3.1 states that if a recursive subprogram G exists for a set of initial trees T_init = {t_e | e ∈ E} which can generate all t_e ∈ T_init for a given initial instantiation β_e, then there also exists a subprogram G' such that the instantiations (over all segments and all examples) do not share a non-trivial pattern (a common prefix). Therefore, it is feasible to generate a hypothesis about the subprogram body for a given segmentation with recursion points U_rec by calculating the maximal pattern of the segments.
Theorem 7.3.2 (Construction of a Subprogram Body). Let T_init ⊆ T_{Σ∪{Ω}} be a set of initial trees with t_e ∈ T_init for all e ∈ E. Let (U_rec, U_sub) be a valid segmentation of T_init with U_sub = ∅ and G the function variable used for constructing the segments. The maximal pattern t̂_G ∈ T_{Σ∪G}(X) of all segments segment(t_e, U_rec, w) is the resulting hypothesis about the program body.
Proof: Follows from theorem 7.3.1 and definition 7.3.6.
The variable instantiations for a maximal pattern t̂_G are given by the subtrees at the positions where the segmentations differ, or (for incomplete unfoldings) by ⊥ if these positions do not occur in a given tree (see def. 7.3.6).
Definition 7.3.8 (Variable Instantiations). Let T ⊆ T_{Σ∪{Ω}} be a set of terms indexed over E, t̂_G the maximal pattern of terms in T, and X = var(t̂_G) the set of variables in the maximal pattern. The instantiation σ_e: X → T_{Σ∪{⊥}} of variables x ∈ X at positions u ∈ pos(t̂_G, x) in term t_e ∈ T is defined as

σ_e(x) = t_e|_u if u ∈ pos(t_e), and ⊥ otherwise.
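Reading off these instantiations is mechanical once the variable positions in the pattern are known. A hedged sketch under our nested-tuple encoding, assuming for simplicity that variables are the leaves whose label starts with "x" and that we record one position per variable name:

```python
# Sketch of def. 7.3.8: the instantiation of a pattern variable is the
# subtree of t_e at the variable's position, or bottom (None) when an
# incomplete unfolding does not reach that position.

def var_positions(pattern, u=()):
    # assumption: variable leaves are labelled "x1", "x2", ...
    if len(pattern) == 1 and pattern[0].startswith("x"):
        return {pattern[0]: u}
    pos = {}
    for i, c in enumerate(pattern[1:], start=1):
        pos.update(var_positions(c, u + (i,)))
    return pos

def instantiation(pattern, t_e):
    def subterm(t, u):
        for i in u:
            if i > len(t) - 1:
                return None            # position missing -> bottom
            t = t[i]
        return t
    return {x: subterm(t_e, u) for x, u in var_positions(pattern).items()}
```

For the factorial body pattern if(eq0(x1), 1, *(x2, G)) matched against the segment for β(x) = pred(3), both variables are instantiated with pred(3).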
7.3.2.3 Algorithm. The maximal pattern of a set of terms can be calculated by first-order anti-unification as described in definition 7.1.16 in section 7.1.2. Only complete segments are considered. For incomplete segments, it is in general not possible to obtain a consistent introduction of variables during generalization. An example problem, using two terms of Fibonacci (introduced in table 7.3), is given in table 7.9.
Table 7.9. Anti-Unification for Incomplete Segments

We define t ⊓ Ω = Ω ⊓ t = t.
Complete segment: t_1 = if(eq0(2), 1, if(eq0(pred(2)), 1, +(G, G)))
Incomplete segment: t_2 = if(eq0(pred(pred(2))), 1, Ω)
Anti-unification: if(eq0(x_1), 1, if(eq0(pred(2)), 1, +(G, G)))
Anti-unification gives us the maximal body t̂_G of a subprogram G. The variables in the subprogram represent those subtrees which differ over the segments of an initial tree. After this step of folding, the instantiations of these variables are still a finite set of terms, namely, the differing subtrees. In section 7.3.4 it will be shown how this finite set is generalized by inferring input variables, their initial instantiation, and the substitution of these variables in the recursive call.
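First-order anti-unification with the Ω-absorption rule of table 7.9 can be sketched compactly. This is our own illustration, not the book's implementation: equal head symbols are kept, differing subterms are generalized to a variable, and a table ensures the same pair of subterms always maps to the same variable.

```python
# Hedged sketch of first-order anti-unification (def. 7.1.16) extended by
# t ⊓ Ω = Ω ⊓ t = t for incomplete segments (table 7.9).

OMEGA = ("Omega",)

def anti_unify(s, t, table=None):
    if table is None:
        table = {}
    if s == OMEGA:                     # an Omega absorbs: the other term wins
        return t
    if t == OMEGA:
        return s
    if s[0] == t[0] and len(s) == len(t):
        return (s[0],) + tuple(anti_unify(a, b, table)
                               for a, b in zip(s[1:], t[1:]))
    if (s, t) not in table:            # differing subterms -> shared variable
        table[(s, t)] = ("x%d" % (len(table) + 1),)
    return table[(s, t)]
```

Applied to the two Fibonacci segments of table 7.9, the differing arguments of eq0 collapse to x1 while the Ω is absorbed by the complete inner `if`.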
7.3.3 Dealing with Further Subprograms
Up to now we have ignored the case that there are paths leading to Ωs which cannot be explained by the constructed segmentation and that cannot be generated by the resulting maximal subprogram body. In this section we will present how folding works if U_sub is not empty. First, we give an illustrative example. Then we introduce decomposition of initial trees with respect to positions in U_sub. We show that integrating calls to further subprograms in the body of a subprogram called from the main program t_0 is well-defined. Finally, we describe how the original initial tree can be reduced by replacing subtrees corresponding to calls of further subprograms by their names.
7.3.3.1 Inducing Two Subprograms for `ModList'. Remember the ModList example presented in table 7.2 and the example initial tree from which this RPS can be inferred, given in figure 7.7. Let us look at this initial tree again (see fig. 7.8): We can find a first segmentation which explains the rightmost path leading to an Ω with U_rec = {3.2}. But the initial tree contains further subtrees with Ω-leafs, that is, U_sub = {3.1} ≠ ∅. If the found first segmentation is valid, there must exist a further subprogram which can generate these remaining subtrees containing Ωs (marked with black boxes in the figure). In segments one to three (segment four is an incomplete unfolding, as discussed above), the left-hand subtrees of the `cons'-nodes contain these yet unexplained subtrees.
Folding has started with inferring an RPS S = (G, t_0), and a promising segmentation has been found which might result in a subprogram G_1 ∈ G and a main program t_0 which calls G_1. Now, the folding algorithm is called recursively with the unexplained subtrees as new sets of initial trees T^u_init with u ∈ U_sub. For the ModList example, there exists only one position in U_sub and we have three example trees which must be explained. The recursive call of the folding algorithm results again in finding a recursive program scheme S^u = (G^u, t^u_0) and not just a recursive subprogram in G. This is because the call of the further sub-program might be embedded in a constant term t^u_0 which later becomes part of the body of the calling subprogram G_1. That is, we start to infer a sub-scheme which afterwards can be integrated in the searched-for RPS.
Again, the first step of folding is to find a valid recurrent segmentation. In figure 7.9 the new set of initial trees is given. The initial hypothesis that t^u_0 consists only of the call of the subprogram fails. There is a constant initial part if(eq0(x), T, nil). Starting segmentation further down, at the first `if'-node, results in a valid recurrent segmentation which can explain all three initial trees.
We already described how the maximal body can be constructed by finding the common pattern of all segments (see sect. 7.3.2). For the searched-for subprogram G_2 we find the body if(<(x_1, head(x_2)), x_1, G_2). We will explain how initial instantiations and substitutions in recursive calls are calculated below, in section 7.3.4. For the moment, just believe that the resulting subprogram G_2 is the one given in the box in figure 7.9. The calling main program for the sub-scheme is t^u_0 = if(eq0(G_2(x_1, x_2)), T, nil). The initial trees T^u_init can be explained with the following initial instantiations:
First tree: β(x_1) = 8, β(x_2) = [9, 4, 7],
Second tree: β(x_1) = 8, β(x_2) = tail([9, 4, 7]),
[Fig. 7.8. Identifying Two Recursive Subprograms in the Initial Program for ModList (tree diagram not reproduced)]
[Fig. 7.9. Inferring a Sub-Program Scheme for ModList (tree diagram not reproduced)]
Third tree: β(x_1) = 8, β(x_2) = tail(tail([9, 4, 7])).
Now the inference of the sub-scheme is complete and the folding algorithm comes back to the original problem. The originally unexplained subtrees are replaced by the three calls of the main program t^u_0 given at the bottom of figure 7.9. The inferred subprogram G_2 is introduced into the set of subprograms G of the RPS S to be constructed. The reduced initial tree is given in figure 7.10. The folding algorithm now proceeds with constructing the program body for G_1.
[Fig. 7.10. The Reduced Initial Tree of ModList (tree diagram not reproduced)]
7.3.3.2 Decomposition of Initial Trees. If a valid recurrent segmentation (U_rec, U_sub) was found for a set of initial trees T_init and if U_sub ≠ ∅, then induction of an RPS is performed recursively over the set of subtrees at the positions in U_sub.
Definition 7.3.9 (Decomposition of Initial Trees). Let T_init ⊆ T_{Σ∪{Ω}} be a set of initial trees indexed over E. Let (U_rec, U_sub) be a valid segmentation of T_init with U_sub ≠ ∅. The set of initial trees T^u_init for all u ∈ U_sub is defined as:

T^u_init = {segment(t_e, U_rec, w)|_u | e ∈ E, w ∈ W(e), u ∈ pos(segment(t_e, U_rec, w))}.

The index set E^u of elements in T^u_init is constructed as E^u = {(w, e) | e ∈ E, w ∈ W(e), u ∈ pos(segment(t_e, U_rec, w))}, and for elements t_{(w,e)} ∈ T^u_init with (w, e) ∈ E^u holds t_{(w,e)} = segment(t_e, U_rec, w)|_u.
From the construction of (U_rec, U_sub) it follows that each subtree at a position in U_sub contains at least one Ω (see def. 7.3.2 and def. 7.3.3). An example for an initial tree with non-empty sub-schema positions was given in figure 7.7.
7.3.3.3 Equivalence of Sub-Schemata. In general, there can exist alternative RPSs which can explain a set of initial trees. These RPSs might have different numbers of parameters in the main program with different initial instantiations. We want to deal with induction of sub-schemata for subtrees of a set of given initial trees as an independent sub-problem. For that reason, it must hold that if there exists a valid hypothesis for initial trees in T^u_init which is part of a recursive explanation of T_init, then each possible solution for T^u_init can be part of the recursive explanation of T_init.
For understanding the following theorem, remember that the main program t^u_0 of a sub-scheme must be valid over different examples for this scheme. For the ModList illustration (see fig. 7.9) there were three given trees for inferring the sub-scheme, and a main program t^u_0 could be generated covering all three trees with different initial instantiations β_e(t^u_0).
Theorem 7.3.3 (Equivalence of Sub-Schemata). Let T_init be a set of initial trees indexed over E. Let S_1 = (t^1_0, G^1) and S_2 = (t^2_0, G^2) be two RPSs with subprograms G^1_1, …, G^1_{n_1} and G^2_1, …, G^2_{n_2}. The RPSs together with initial instantiations β^1_e: X^1 → T_Σ and β^2_e: X^2 → T_Σ for all e ∈ E and parameters X^1 = var(t^1_0) and X^2 = var(t^2_0) recursively explain T_init. Let t^1_0 be the maximal pattern of all terms β^1_e(t^1_0) and t^2_0 the maximal pattern of all terms β^2_e(t^2_0) for e ∈ E. It holds that for each parameter x^1 ∈ X^1 exists a parameter x^2 ∈ X^2 such that for all e ∈ E: β^1_e(x^1) = β^2_e(x^2).
Proof (Equivalence of Sub-Schemata). Let S_1 be a sub-schema constructed by the "maximization of body" principle (see theorem 7.3.1) and S_2 a sub-schema which is obtained in some different way, that is, recursion points must be at "deeper" positions in the initial trees.
Let x^1 ∈ X^1 be a parameter of t^1_0. For the instantiations β^1_e(x^1) holds that they do not share a common non-trivial pattern over all positions of x^1 in the segments of the initial trees. It holds ∃e, e' ∈ E with node(β^1_e(x^1)) ≠ node(β^1_{e'}(x^1)) (postulation in theorem 7.3.3).
Variable x^1 is used in RPS S_1 to generate sub-terms of the given initial trees. Let u_{x^1} be a position in initial trees t_e, e ∈ E, where terms are generated by instantiations β^1_e(x^1). The sub-terms t_e|_{u_{x^1}} are identical to the initial instantiations β^1_e(x^1) and do not share a non-trivial pattern. It follows that in RPS S_2 the sub-terms t_e|_{u_{x^1}} must also be explained by an initial instantiation in main (t^2_0) (case 1) or be part of an instantiated parameter occurring in an unfolding of a subprogram (case 2).
Case 1: (sub-terms t
e
j
u
x
1
are explained by an initial instantiation in t
2
0
)
Let x
2
2 X
2
be a variable and u
x
22 pos(t
2
0
fG
2
1
; : : : ; G
n
2
g; x
2
) be a
208 7. Folding of Finite Program Terms
position of this variable in main with u_{x^2} ≤ u_{x^1}. Because x^2 was introduced by anti-unification, there exist e, e′ ∈ E with node(u_{x^2}, t_e) ≠ node(u_{x^2}, t_{e′}). Because node(u, t_e) = node(u, t_{e′}) holds for all u < u_{x^1} and all e, e′ ∈ E, it follows that u_{x^2} = u_{x^1} and therefore β^1_e(x^1) = β^2_e(x^2).
Case 2: (sub-terms t_e|_{u_{x^1}} are part of a parameter instantiation in an unfolding of a subprogram from S^2)
Let u_S be a position with u_S ≤ u_{x^1} such that the terms t_e|_{u_{x^1}} are instantiations of a parameter in an unfolding of a subprogram from S^2. These instantiations are generated by a sequence of substitutions (in the call of a subprogram from main, in the recursive call of a subprogram, in the call of a further subprogram, ...). Let t_subst ∈ T_Σ(X^2) be a combination of such substitutions, where the variables in X^2 are parameters of t^2_0. That is, it holds t_e|_{u_S} = t_subst{x^2_1 ← β^2_e(x^2_1), …, x^2_m ← β^2_e(x^2_m)} = β^2_e(t_subst) for all e ∈ E.
Let u_{s_{x^1}} be the position of sub-term t_e|_{u_{x^1}} relative to position u_S, that is, (t_e|_{u_S})|_{u_{s_{x^1}}} = t_e|_{u_{x^1}}. Because the sub-terms t_e|_{u_{x^1}} for all e ∈ E, and therefore the sub-terms β^2_e(t_subst)|_{u_{s_{x^1}}}, do not share a non-trivial pattern, the terms β^2_e(t_subst)|_{u_{s_{x^1}}} must be part of instantiations of variables x^2 ∈ X^2. Let u_{s_{x^2}} be a position of a variable x^2 in t_subst with u_{s_{x^2}} ≤ u_{s_{x^1}} and t_subst|_{u_{s_{x^2}}} = x^2. Because node(β^2_e(t_subst)|_u) = node(β^2_{e′}(t_subst)|_u) holds for each position u < u_{s_{x^1}} and all e, e′ ∈ E, and because the β^2_e(x^2) do not share a common prefix (premise), it must hold that β^2_e(x^2) = β^2_e(t_subst)|_{u_{s_{x^1}}} = t_e|_{u_{x^1}} = β^1_e(x^1).
Theorem 7.3.3 states that each RPS which, with a given initial instantiation of the variables in its main program, explains the set of subtrees at positions U_sub results in an identical instantiation of variables. Therefore, independent of the induced subprogram together with its calling main program, the number and the instantiations of the parameters occurring in this main program are unique. The proof is based on the idea that, for a given set of initial trees and a given sub-schema S^1, each other sub-schema S^2 which explains the same initial trees shares certain characteristics from which it follows that the parameter instantiations are identical. In the first case, we consider parameter instantiations which are part of the instantiation of parameters in the calling main program. If u_{x^2} were above u_{x^1} in the initial trees, then the subtrees would already differ at position u_{x^2}. But we know that u_{x^1} is exactly the position at which the subtrees differ for the first time. Therefore u_{x^1} and u_{x^2} must be identical positions in the subtrees, and as a consequence the instantiations which are given as the subtrees at these positions must be identical! Case two is analogous for sub-terms in segments corresponding to unfoldings of a subprogram in S^2.
For illustration consider again the ModList example given in figure 7.10 and the searched-for RPS given in table 7.2. The subprogram ModList with parameters l and n calls a further subprogram Mod. The "partial" program body of ModList, that is, the maximal pattern of all segments without consideration of further subprograms, is:
ModList = if(empty(l), nil, cons(u, ModList)).
7.3 Solving the Synthesis Problem 209
For the set of subtrees at segment positions u = 3.1, for example, the Mod subprogram given in table 7.2 can be induced with parameters k and n. For this subprogram the calling main program is if(eq0(Mod(n, head(l))), T, nil), with parameter n being passed through from the main program of the RPS and parameter k being constructed by substituting the parameter l given in the main program by head(l). Because ModList as well as Mod is constructed by calculating the maximal pattern over all segments, each possible realization of Mod together with its calling main must contain the parameters n and k = head(l).
7.3.3.4 Reduction of Initial Trees. If sub-schemata explaining all subtrees T^u_init can be induced for all positions U_sub, these sub-schemata can be inserted into the segments of the subprogram which calls the sub-schema. That is, the concerned subtrees are replaced by the main programs t^u_0 which call the further sub-programs.
Lemma 7.3.2 (Reduction of Initial Trees). Let T_init be a set of initial trees indexed over E. Let (U_rec, U_sub) together with t_skel be a valid segmentation for all t_e ∈ T_init and let U_sub ≠ ∅.
1. If for each u ∈ U_sub there exists an RPS S^u = (G^u, t^u_0) which explains the initial trees T^u_init constructed as described in definition 7.3.9, together with initial instantiations β^u_(w,e) : var(t^u_0) → T_Σ ∪ {?},⁸ then the segments of the initial trees in T_init can be transformed by segment(t_e, U_rec, w) = segment(t_e, U_rec, w)[u ← β^u_(w,e)(t^u_0)] for all trees e ∈ E, unfolding indices w ∈ W(e), and sub-schema positions u ∈ pos(segment(t_e, U_rec, w)).
2. Let G^u be a set of properly named subprograms (no two subprograms have identical names G_i) with
G^u = ⟨ G^u_1(x_1, …, x_{m^u_1}) = t^u_1, …, G^u_n(x_1, …, x_{m^u_n}) = t^u_n ⟩
and let G(x_1, …, x_m) = t_G be the (yet unknown) subprogram for segmentation (U_rec, U_sub). The RPS S = (G, t_0) which explains T_init contains all subprograms of the sub-schemata S^u = (G^u, t^u_0) for all u ∈ U_sub:
G = ⟨ G(x_1, …, x_m) = t_G, G^u_1(x_1, …, x_{m^u_1}) = t^u_1, …, G^u_n(x_1, …, x_{m^u_n}) = t^u_n ⟩.
⁸ There can exist subtrees for the sub-schemata which are too small to infer all variables.
Proof (Reduction of Initial Trees). 1. If an RPS S^u = (G^u, t^u_0) together with an initial instantiation β^u_(w,e) explains an initial tree t^u_(w,e), then there exists a term t ∈ L(S^u, β^u_(w,e)) with t^u_(w,e) ⊑ t. Therefore, if the subtree at position u in segment segment(t_e, U_rec, w) is replaced by the instantiated calling main program β^u_(w,e)(t^u_0), then a term t′ with segment(t_e, U_rec, w) ⊑ t′ can be generated from segment(t_e, U_rec, w)[u ← β^u_(w,e)(t^u_0)] using rules of the term rewrite system R_{S^u}.
2. To generate the original segments (or larger subtrees) from the segments in which the subtrees at all positions u ∈ U_sub are replaced by the instantiated calling main programs, all subprogram definitions of the sub-schemata S^u = (G^u, t^u_0) are necessary.
After reduction, the subtrees at positions U_sub in the segments of T_init are replaced by the instantiated calling main of the corresponding subprograms. That is, these subtrees no longer contain Ωs, and the program body of the segments can be constructed as described in section 7.3.2.
The prescribed construction of sub-programs imposes the following restriction on initial trees: a maximal pattern for the subprograms can only be constructed if there exist at least two trees t_(w,e), t_(w′,e′) ∈ T^u_init such that for the initial instantiation of all variables x ∈ var(t^u_0) holds β^u_(e,w)(x) ≠ ? and β^u_(e′,w′)(x) ≠ ?.
A consequence of inferring sub-schemata separately is that subprograms are introduced locally with unique names. It is possible that two such subprograms are identical and differ only in their names. After folding is completed, the set of subprograms can be reduced such that only subprograms with different program bodies remain in G.
7.3.4 Finding Parameter Substitutions
After constructing a subprogram body as the maximal pattern of all segments, the remaining, not yet explained subtrees must be parameters and their substitutions. Anti-unification already results in a set of variables. The last component of inducing an RPS from a set of initial trees is to identify the substitutions of these variables in the recursive calls of the subprogram.
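The body construction referenced here, computing the maximal pattern of the segments, is first-order anti-unification. The following minimal sketch (not the system's implementation; the nested-tuple term representation and the variable-naming convention are assumptions) generalizes two terms, reusing one variable for a recurring difference:

```python
def anti_unify(s, t, subst, counter):
    """Least general generalization of terms s and t.

    Terms are nested tuples like ('cons', '8', 'W'); `subst` maps a pair of
    differing subterms to the variable already introduced for it, so the
    same difference is always generalized by the same variable.
    """
    if s == t:
        return s
    # Same function symbol and same arity: descend into the arguments.
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        return (s[0],) + tuple(anti_unify(a, b, subst, counter)
                               for a, b in zip(s[1:], t[1:]))
    # Differing subterms are replaced by a (reused) fresh variable.
    if (s, t) not in subst:
        counter[0] += 1
        subst[(s, t)] = f'x{counter[0]}'
    return subst[(s, t)]

# Two segments differing only in one recurring subterm (8 vs. 4):
seg1 = ('if', ('eq0', '8'), 'nil', ('cons', '8', 'W'))
seg2 = ('if', ('eq0', '4'), 'nil', ('cons', '4', 'W'))
pattern = anti_unify(seg1, seg2, {}, [0])
```

The recurring difference 8/4 is generalized by a single variable x1 at both positions, which is exactly the behavior the maximal-pattern construction relies on.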
Variable substitutions in recursive calls can be quite complex. In table 7.10 some recursive equations with different variants of substitutions are given. In the simplest case, each variable is substituted independently of the other variables and keeps its position in the recursive call (f1). A variable might also be substituted by an operation involving other program parameters (f2). Additionally, variables can switch their positions (given in the head of the equation) in the recursive call (f3). Finally, there might be "hidden" variables which occur only within the recursive call (f4): in the body of f4, variable z occurs only within the recursive call. The existence of such a variable cannot be detected when the program body is constructed by anti-unification, but only a step later, when the substitutions for the recursive call are inferred.
Table 7.10. Variants of Substitutions in Recursive Calls
Individual Substitutions, Identical Positions
  f1(x, y) = if(eq0(x), y, +(x, f1(pred(x), succ(y))))
Interdependent Substitutions
  f2(x, y) = if(eq0(x), y, +(x, f2(pred(x), +(x, y))))
Switching of Variables
  f3(x, y, z) = if(eq0(x), +(y, z), +(x, f3(pred(x), z, succ(y))))
Hidden Variable
  f4(x, y, z) = if(eq0(x), y, +(x, f4(pred(x), z, succ(y))))
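Interpreted over the natural numbers (with eq0, pred, succ as usual), the four equations of table 7.10 can be transcribed directly; this Python rendering is only illustrative:

```python
def f1(x, y):
    # individual substitutions at identical positions: x <- pred(x), y <- succ(y)
    return y if x == 0 else x + f1(x - 1, y + 1)

def f2(x, y):
    # interdependent substitution: y <- +(x, y) involves another parameter
    return y if x == 0 else x + f2(x - 1, x + y)

def f3(x, y, z):
    # switching of variables: y and z exchange positions in the recursive call
    return y + z if x == 0 else x + f3(x - 1, z, y + 1)

def f4(x, y, z):
    # hidden variable: z is passed into the recursive call but never
    # consumed in the body outside of it
    return y if x == 0 else x + f4(x - 1, z, y + 1)
```

In f4 the argument z never appears in the body outside the recursive call, which is precisely why anti-unification of the unfoldings cannot reveal it.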
7.3.4.1 Finding Substitutions for `Mod'. Let us again look at the Mod example. The valid segmentation for the second initial tree given in figure 7.9 is depicted again in figure 7.11. We already know from calculating the subprogram body if(<(x_1, head(x_2)), x_1, G_2) that there remain two different kinds of subtrees which must be explained by variables and their substitutions in recursive calls. Remember that in constructing the subprogram body for Mod, there were three initial trees at hand and the program body was constructed by anti-unifying the segments gained from all three initial trees. Variable x_1 reflects differences appearing already within the segments of a single such initial tree, whereas variable x_2 reflects differences between segments of different initial trees.
[Figure: four segments of the second initial tree for Mod, built from the symbols if, <, −, head, tail, 8, [9,4,7], and Ω; the tree diagram is not reproducible in text.]
Fig. 7.11. Substitutions for Mod
For better readability we write the remaining subtrees for each segment of the given tree into a table:

  x_1 (1.1.1.)                  x_2 (1.1.2.1.)   x_3 = x_1 (1.2.)
  8                             tail([9,4,7])    8
  −(8, head(tail([9,4,7])))     tail([9,4,7])    −(8, head(tail([9,4,7])))
  −(−(8, head(tail([9,4,7]))),  tail([9,4,7])    −(−(8, head(tail([9,4,7]))),
    head(tail([9,4,7])))                           head(tail([9,4,7])))
The middle column represents the instantiations of x_2, which are constant over all segments. If only this single initial tree were considered, this term would be part of the program body. The left-hand and right-hand columns contain exactly the same terms on each level. With these considerations, it is enough to consider the changes from one level to the next in the first two columns. Terms on succeeding levels represent changes from one segment to the next; that is, for the program body induced by this segmentation, these changes must be explained by substitutions of parameters in the recursive call.
When going from the first to the second level, we find an occurrence of x_1 as well as of x_2, that is, −(8, head(tail([9,4,7]))) = −(x_1, head(x_2)). This leads to the hypothesis that the initial instantiations are x_1 = 8 and x_2 = tail([9,4,7]), with substitutions x_1 ← −(x_1, head(x_2)) and x_2 ← x_2.
We used the first two segments to construct a hypothesis. As we will see below, in general we need two further segments to validate this hypothesis. For now, let it be enough that we validate the hypothesis by going from the second to the third segment. Applying the found substitution to x_1 = −(8, head(tail([9,4,7]))) results in −(−(8, head(tail([9,4,7]))), head(tail([9,4,7]))), and because x_2 = tail([9,4,7]), the substitution hypothesis holds!
The illustration gives an example of an interdependent substitution (see tab. 7.10), because the substitution of x_1 depends not only on x_1 itself but also on x_2.
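The induced body if(<(x_1, head(x_2)), x_1, G_2) with the substitution x_1 ← −(x_1, head(x_2)) corresponds to remainder computation by repeated subtraction. A minimal executable sketch (the Python list encoding and the lowercase names are assumptions):

```python
def head(l):
    return l[0]

def mod(x1, x2):
    """Remainder by repeated subtraction, following the induced schema:
    if(<(x1, head(x2)), x1, mod(-(x1, head(x2)), x2))."""
    if x1 < head(x2):
        return x1
    return mod(x1 - head(x2), x2)

# With x1 = 8 and x2 = tail([9, 4, 7]) = [4, 7] this computes 8 mod 4:
result = mod(8, [4, 7])
```

Each recursive call replays exactly one segment of the initial tree, so the successive values of x1 reproduce the left-hand column of the table above.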
7.3.4.2 Basic Considerations. A necessary characteristic of substitutions is that they are unambiguously determined in the (infinite set of) unfoldings (see def. 7.1.29). In that case, substitutions form regular patterns in the initial trees and can therefore be identified by pattern matching.
Theorem 7.3.4 (Uniqueness of Substitutions). Let S = (G, t_0) be an RPS, G_i(x_1, …, x_{m_i}) = t_i a subprogram in S, and X_i = {x_1, …, x_{m_i}} the set of parameters. Let U_rec be the set of recursion points of G_i with index set R = {1, …, |U_rec|} and W the unfolding indices constructed over R. Let t_i be a program body which corresponds to the maximal pattern of all unfoldings of equation G_i over instantiations β : X_i → T_Σ. Let sub(x_j, r) be a substitution term (def. 7.1.27) for x_j ∈ X_i in the recursive call of G_i at position u_r ∈ U_rec, r ∈ R. For all terms t_s ∈ T_Σ(X) with ∀w ∈ W : β_{w∘r}(x_j) = t_s{x_1 ← β_w(x_1), …, x_{m_i} ← β_w(x_{m_i})} holds t_s = sub(x_j, r).
Proof: see appendix B.3.
Because substitutions are unique over unfoldings, it holds that for each subtree under a recursion point it is determined whether this term is a parameter with its initial instantiation or a substitution term over such a parameter.
Definition 7.3.10 (Characteristics of Substitution Terms). Let sub(x_j, r) be a substitution term for parameter x_j at position r. For all positions u, starting with u = ε, the following inductively defined characteristic holds:

sub(x_j, r)|_u = x_k
  if (a) ∀w ∈ W : β_{w∘r}(x_j)|_u = β_w(x_k) and
     (b) there is no t ∈ T_Σ(X_i) with t ≠ x_k such that
         ∀w ∈ W : β_{w∘r}(x_j)|_u = t{x_1 ← β_w(x_1), …, x_{m_i} ← β_w(x_{m_i})};

sub(x_j, r)|_u = f(sub(x_j, r)|_{u∘1}, …, sub(x_j, r)|_{u∘n})
  if ∀w ∈ W : node(β_{w∘r}(x_j), u) = f ∈ Σ with arity(f) = n.
For sufficiently large initial trees, it can be decided for each parameter in the r-th unfolding how it is substituted in the succeeding w∘r-th unfolding. For example, in a recursive equation with two parameters x_1 and x_2 with β(x_1) = 0 and β(x_2) = succ(0), both parameters might be substituted by x_i ← succ(x_i). Condition (a) alone would allow x_1 ← x_2 to be identified as a substitution, but this is a legal substitution only if condition (b) holds additionally. For the given example, for x_1 in the r-th unfolding the substitution term succ(x_1) can be found in the w∘r-th unfolding. A function for testing the uniqueness of a substitution is given in table 7.11. It makes use of a function testing the recurrence of a substitution, given in table 7.12.
Table 7.11. Testing whether Substitutions are Uniquely Determined

Function: uniqueness : X × X × pos(t_init) × T_Σ → bool
Pre: E is the set of indices for T_init
  R is the index set of recursion points
  W(e) is the set of segment indices for t_e ∈ T_init
  X is the set of variables in t̂_G and x_j, x_k ∈ X
  β(x_k, w, e) returns the instantiation of variable x_k ∈ X for segment w ∈ W(e)
  t_β is an instantiation of x_j ∈ X with t_β ≠ ? in a segment with w = r
  u is the position which is currently checked
Post: true if for x_j no recurrent substitution by a variable from X \ {x_k} can be found, false otherwise
Side-effect: Break if an ambiguity is detected

uniqueness(x_j, x_k, u, t_β) =
1. IF u ∉ pos(t_β)
2.   OR ((¬∃x_i ∈ (X \ {x_k}) : betaRecurrent(x_j, x_i, r, u)) (see table 7.12)
3.   OR (¬∀e, e′ ∈ E, w ∈ W(e), w′ ∈ W(e′) : node(β(x_j, w∘r, e), u) = node(β(x_j, w′∘r, e′), u)))
4.   AND (Let f = node(t_β, u) with arity(f) = n In ∀l = 1, …, n : uniqueness(x_j, x_k, u∘l, t_β))
5. THEN return TRUE
6. ELSE return FALSE and break
Table 7.12. Testing whether a Substitution is Recurrent

Function: betaRecurrent : X × X × R × pos(t_init) → bool
Pre: E is the set of indices for T_init
  R is the index set of recursion points
  W(e) is the set of segment indices for t_e ∈ T_init and w ∈ W(e)
  X is the set of variables in t̂_G and x_j, x_k ∈ X
  β(x_k, w, e) returns the instantiation of variable x_k ∈ X for segment w ∈ W(e)
  t_β is an instantiation of x_j ∈ X with t_β ≠ ? in a segment with w = r
  u is the position which is currently checked
Post: true if in all instantiations of x_j ∈ X in segment w∘r at position u the instantiation of x_k in segment w can be found, false otherwise

betaRecurrent(x_j, x_k, r, u) =
  ∀e ∈ E, ∀w∘r ∈ W(e) : β(x_j, w∘r, e)|_u =_? β(x_k, w, e)
  (see def. 7.3.11 for =_?)
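Over a finite table of segment-wise instantiations, the test of table 7.12 reduces to comparing subterms; the following sketch is illustrative only (the dictionary encoding of β, segment indices as tuples, and the treatment of ? as None are all assumptions):

```python
def subterm(t, u):
    # subterm of term t at position u (1-based argument indices; t[0] is the symbol)
    for i in u:
        t = t[i]
    return t

def eq_maybe(s, t):
    # =_? : identical, or at least one of the two sides undefined (None)
    return s is None or t is None or s == t

def beta_recurrent(beta, x_j, x_k, r, u, segments):
    """Sketch of the recurrence test: at position u, the instantiation of x_j
    in segment w + (r,) always equals the instantiation of x_k in segment w."""
    found = False
    for (e, w) in segments:
        t_j = beta.get((x_j, w + (r,), e))
        if t_j is None:
            continue
        found = True
        if not eq_maybe(subterm(t_j, u), beta.get((x_k, w, e))):
            return False
    return found

# Unfoldings of an equation with substitution x <- succ(x), one tree (e = 0):
beta = {('x', (), 0): ('0',),
        ('x', (1,), 0): ('succ', ('0',)),
        ('x', (1, 1), 0): ('succ', ('succ', ('0',)))}
segs = [(0, ()), (0, (1,))]
hit = beta_recurrent(beta, 'x', 'x', 1, (1,), segs)  # x recurs below succ
```

At the root position the check fails (succ(0) is not 0), while below the succ node the previous instantiation of x recurs in every pair of successive segments.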
For a given hypothesis t̂_G about a subprogram body (see def. 7.3.2 in sect. 7.3.2), variable instantiations can be determined for each segment. After determining the variable instantiations for the segments, the initial instantiations can be separated from the substitution terms by calculating the differences between the segment-wise instantiations. To calculate variable instantiations for segments, we can extend the definition of parameter instantiations (def. 7.1.18) in the following way:
Definition 7.3.11 (Instantiation of Variables in a Segment). Let T_init ⊆ T_{Σ∪{Ω}} be a set of initial trees indexed over E. Let (U_rec, U_sub) with U_sub = ∅ be a segmentation of the trees in T_init and t̂_G the resulting maximal pattern with X = var(t̂_G). Let W(e) be the set of segment indices for t_e ∈ T_init. The instantiations β : X × W × E → T_Σ ∪ {?} for the variables in X occurring in segments with indices w ∈ W(e) are defined as

β(x, w, e) = segment(t_e, U_rec, w)|_u
  if (a) ∃u ∈ pos(t̂_G, x) with u ∈ pos(segment(t_e, U_rec, w)) and
     (b) segment(t_e, U_rec, w) ≠ ?;
β(x, w, e) = ?  otherwise.

With β(x, w, e) =_? β(x′, w′, e′) we denote that two instantiations are either identical or that one of the instantiations is undefined.
Because we allow for incomplete unfoldings, it can happen that in some segments the position of a hypothetical variable does not exist. The instantiation is ? (i.e., undefined) if the searched-for position is non-existent in the segment.
Inducing the substitutions is based on the same principle as inducing the subprogram body, that is, detecting regularities in the yet unexplained subtrees of the segments. We realize the construction of a hypothesis for the variable substitutions in the recursive calls of a subprogram by identifying instantiation and substitution through comparison of two successive segments (w and w∘r), and we validate this hypothesis for all other pairs of successive segments (w′ and w′∘r). For validation there must exist at least two additional successive segments where the position of the substitution term is given. Therefore, the initial trees in T_init must provide at least four segments for constructing and validating the searched-for substitution of a variable. It is not necessary, however, for these further segments to be given in the same initial tree; it is enough that this information can be collected over all trees in T_init! Thereby we keep the restrictions on initial trees minimal. Furthermore, for calculating interdependent substitutions (see table 7.10) it is necessary that, for each substitution of a variable x_j in a given segment, the substitution terms of all variables are defined in the preceding segment.
Lemma 7.3.3 (Necessary Condition for Substitutions). Let T_init ⊆ T_{Σ∪{Ω}} be a set of initial trees indexed over E and t̂_G the hypothesis of the program body which follows from segmentation (U_rec, U_sub) with U_sub = ∅. Recursion points are indexed over R = {1, …, |U_rec|}. The variables occurring in the program body are X = var(t̂_G). Recurrent hypotheses over substitutions can only be induced if for each variable x_j ∈ X and each recursive call r ∈ R holds: ∃e_1, e_2 ∈ E and w_1 ∈ W(e_1), w_2 ∈ W(e_2) with w_1 = ε and (w_1, e_1) ≠ (w_2, e_2) such that for both h ∈ {1, 2}:
1. β(x_j, w_h∘r, e_h) ≠ ? and
2. ∀x_k ∈ X : β(x_k, w_h, e_h) ≠ ?.
An algorithm for testing whether enough instances are given in a set of initial trees is shown in table 7.13.
Table 7.13. Testing the Existence of Sufficiently Many Instances for a Variable

Function: enoughInstances : X → bool
Pre: E is the set of indices for T_init
  R is the index set of recursion points
  W(e) is the set of segment indices for t_e ∈ T_init
  X is the set of variables in t̂_G and x_j ∈ X
  β(x_k, w, e) returns the instantiation of variable x_k ∈ X for segment w ∈ W(e)
Post: true if there are sufficiently many instances, false otherwise

enoughInstances(x_j) =
1. (∃e_1 ∈ E : β(x_j, r, e_1) ≠ ? AND
2.  ∀x_k ∈ X : β(x_k, ε, e_1) ≠ ?) AND
3. (∃e_2 ∈ E : ∃w ∈ W(e_2) : β(x_j, w∘r, e_2) ≠ ? AND
4.  ∀x_k ∈ X : β(x_k, w, e_2) ≠ ?) AND
5. (e_1 ≠ e_2 OR w ≠ ε)
After constructing a substitution hypothesis from one pair of succeeding substitutions and validating it for all (at least one further) pairs of substitutions for which the corresponding sub-term positions are defined in the initial trees, it must be checked whether the complete set of substitution terms is consistent.
Lemma 7.3.4 (Consistency of Substitutions). Let T_init be a set of initial trees indexed over E. Let β : X × W × E → T_Σ ∪ {?} be the set of instantiations of the variables x ∈ X in segments w ∈ W(e), e ∈ E. Let sub(x_j, r) be the substitution terms of the variables x_j ∈ X in the recursive calls. The composition of substitutions sub* : X × W → T_Σ(X) is inductively defined as:

sub*(x_j, w) = x_j   if w = ε
sub*(x_j, w) = sub(x_j, r){x_1 ← sub*(x_1, v), …, x_m ← sub*(x_m, v)}   if w = v∘r.

For each initial tree t_e ∈ T_init there must exist an instantiation β(x_j, ε, e) such that for all variables x_j ∈ X and all segments w ∈ W(e) holds: β(x_j, w, e) =_? sub*(x_j, w){x_1 ← β(x_1, ε, e), …, x_m ← β(x_m, ε, e)}.
A substitution of a variable x_j is consistent if the instantiation in a given segment at position w can be constructed by successive application of the hypothetical substitution to the initial instantiation of the variable in the root segment (w = ε).
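The composition sub* can be sketched directly for a single recursion point, with terms as nested tuples and the substitution given as a mapping from variables to terms (this encoding, and taking the unfolding index as a plain depth, are assumptions):

```python
def apply_subst(t, sigma):
    # apply substitution sigma (variable name -> term) to term t
    if isinstance(t, str):
        return sigma.get(t, t)
    return (t[0],) + tuple(apply_subst(a, sigma) for a in t[1:])

def sub_star(x, w, sub):
    """Compose the recursive-call substitution w times:
    sub*(x, 0) = x;  sub*(x, v+1) = sub(x){x_i <- sub*(x_i, v)}."""
    if w == 0:
        return x
    inner = {y: sub_star(y, w - 1, sub) for y in sub}
    return apply_subst(sub[x], inner)

# f1-style substitutions from table 7.10: x <- pred(x), y <- succ(y)
sub = {'x': ('pred', 'x'), 'y': ('succ', 'y')}
t = sub_star('y', 2, sub)
```

Checking consistency then amounts to verifying, per segment, that the recorded instantiation β(x, w, e) equals apply_subst(sub_star(x, w, sub), initial_instantiation) whenever both sides are defined.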
7.3.4.3 Detecting Hidden Variables. Remember that the set of parameters of a recursive subprogram initially results from constructing the hypothesis of the program body by anti-unification of the hypothetical segments (see sect. 7.3.2). The maximal pattern of all segments contains variables at all positions where the subtrees differ over the segments. In many cases, the set of variables identified in this way is already the final set of parameters of the recursive functions to be constructed. Only in the case of hidden variables (see table 7.10), that is, variables which occur only in substitution terms, must the set of variables be extended.
In contrast to the standard case, the instantiations of such hidden variables cannot be identified in the segments; they must be identified within the substitution terms instead. This is possible because, if such a variable occurs in a recursive equation, it must be used in the parameter substitutions of the recursive call.
Lemma 7.3.5 (Identification of Hidden Variables). Let S = (G, t_0) be an RPS, G_i(x_1, …, x_{m_i}) = t_i a subprogram in S, and X_i = {x_1, …, x_{m_i}} the set of parameters of G_i. Let U_rec be the set of recursion points of G_i indexed over R = {1, …, |U_rec|}, W the set of unfolding indices constructed over R, and let the unfoldings of G_i be given over instantiations β : X_i → T_Σ. The program body t_i is the maximal pattern of all these unfoldings.
Let X^R_i = var(t_i[U_rec ← Ω]) be the set of variables occurring in the program body. Let X^H_i = X_i \ X^R_i be the set of hidden variables of subprogram G_i. Let sub(x_j, r) ∈ T_Σ(X_i) be a substitution term with var(sub(x_j, r)) ∩ X^H_i ≠ ∅. Let x_h ∈ var(sub(x_j, r)) ∩ X^H_i be a hidden variable used in the substitution term at position u_h ∈ pos(sub(x_j, r), x_h). It holds:
1. ¬∃x_k ∈ X^R_i with ∀w ∈ W : β_{w∘r}(x_j)|_{u_h} = β_w(x_k).
2. ¬∀w_1, w_2 ∈ W : node(β_{w_1∘r}(x_j), u_h) = node(β_{w_2∘r}(x_j), u_h).
3. ¬∃u < u_h and ¬∃x_k ∈ X_i with ∀w ∈ W : β_{w∘r}(x_j)|_u = β_w(x_k).
Proof (Identification of Hidden Variables).
1. Suppose there exists a variable x_k ∈ X^R_i for which condition 1 of lemma 7.3.5 holds. Then x_k ≠ x_h must hold, because x_k ∈ X^R_i, x_h ∈ X^H_i, and X^R_i ∩ X^H_i = ∅; and for all w ∈ W it must hold that
  β_{w∘r}(x_j)|_{u_h} = β_w(x_k)  (supposition)
  β_{w∘r}(x_j)|_{u_h} = β_w(x_h)  (by the construction of the substitution term).
Consequently, β_w(x_k) = β_w(x_h) must hold for all w ∈ W, which contradicts the assumption that t_i is a maximal pattern of all unfoldings over β.
2. For all w ∈ W holds β_{w∘r}(x_j)|_{u_h} = β_w(x_h). Because t_i is a maximal pattern, variables do not have a common non-trivial pattern (a common prefix). Therefore there must exist two indices w_1, w_2 ∈ W such that node(β_{w_1}(x_h), ε) ≠ node(β_{w_2}(x_h), ε) and therefore also node(β_{w_1∘r}(x_j), u_h) ≠ node(β_{w_2∘r}(x_j), u_h).
3. Follows from lemma B.3.1 (appendix B.3).
From lemma 7.3.5 it follows that for each substitution term of a variable it can be decided whether it is constructed using a function from Σ or a hidden variable. The initial instantiations of such hidden variables can be induced from the substitution terms. An algorithm for determining hidden variables is given in table 7.14.
Table 7.14. Determining Hidden Variables

Function: hiddenVar : pos(t_init) × X × R → X′
Pre: u is a position in β(x_j, w, e)
  X = {x_1, …, x_m} is the set of variables
  β(x_k, w, e) returns the instantiation of variable x_k ∈ X for segment w ∈ W(e)
  R is the index set of recursion points
Post: new variable symbol x_{m+1}
Side-effect: X is extended by x_{m+1}; β is extended by the values of x_{m+1}; Break (and backtracking to calculating segmentations) if there are not enough instances of x_{m+1}

hiddenVar(u, x_j, r) =
1. m = |X|
2. Generate a new variable symbol x_{m+1} ∉ X
3. X = X ∪ {x_{m+1}}
4. FORALL e ∈ E DO
5.   FORALL w ∈ W DO
6.     IF w∘r ∈ W(e)
7.     THEN β(x_{m+1}, w, e) = β(x_j, w∘r, e)|_u
8. IF NOT(enoughInstances(x_{m+1})) (see table 7.13)
9. THEN break
10. return x_{m+1}
7.3.4.4 Algorithm. The core of the algorithm for calculating substitution terms can be obtained by extending definition 7.3.10 and is given in table 7.15. The algorithm makes use of the functions uniqueness (table 7.11), betaRecurrent (table 7.12), and hiddenVar (table 7.14).
Table 7.15. Calculating Substitution Terms of a Variable in a Recursive Call

Function: calcSub : X × R × pos(t_init) → T_Σ
Pre: E is the set of indices for T_init
  R is the index set of recursion points with r ∈ R
  W(e) is the set of segment indices for t_e ∈ T_init
  X is the set of variables in t̂_G and x_j, x_k ∈ X
  β(x_k, w, e) returns the instantiation of variable x_k ∈ X for segment w ∈ W(e)
Post: a valid substitution term sub(x_j, r) for variable x_j ∈ X in the r-th recursive call.
Side-effect: Break if a variable is not uniquely determined (caused by uniqueness) or if there are not enough instances for calculating the instantiation of a hidden variable (caused by hiddenVar).

calcSub(x_j, r, u) =
1. IF betaRecurrent(x_j, x_k, r, u)
2. AND (Let t_β = β(x_j, w∘r, e) for e ∈ E, w ∈ W(e) with β(x_j, w∘r, e) ≠ ? In uniqueness(x_j, x_k, u, t_β))
3. THEN x_k
4. ELSE IF ∀e ∈ E, w ∈ W : node(β(x_j, w∘r, e), u) = f with arity(f) = n
5. THEN f(calcSub(x_j, r, u∘1), …, calcSub(x_j, r, u∘n))
6. ELSE hiddenVar(u, x_j, r)
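For a single recursion point and instantiations collected from one tree, the recursive structure of calcSub can be sketched as below. The uniqueness test of table 7.11 and the hiddenVar fallback are omitted, and the encoding of β (segments indexed by integer depth) is an assumption:

```python
def subterm(t, u):
    # subterm of term t at position u (1-based argument indices; t[0] is the symbol)
    for i in u:
        t = t[i]
    return t

def calc_sub(beta, xs, x_j, u, depths):
    """beta[(x, w)]: instantiation of variable x in segment w (an int depth).
    Returns a substitution term for x_j at position u, introducing a variable
    wherever the preceding segment's instantiation recurs."""
    # case 1: position u always equals some variable's previous instantiation
    for x_k in xs:
        if all(subterm(beta[(x_j, w + 1)], u) == beta[(x_k, w)] for w in depths):
            return x_k
    # case 2: all segments agree on the function symbol at u -> descend
    subs = [subterm(beta[(x_j, w + 1)], u) for w in depths]
    f = subs[0][0]
    if all(s[0] == f for s in subs):
        n = len(subs[0]) - 1
        return (f,) + tuple(calc_sub(beta, xs, x_j, u + (i,), depths)
                            for i in range(1, n + 1))
    raise ValueError('hidden variable needed')  # hiddenVar case, omitted here

# Mod-style data: x1 <- -(x1, head(x2)), x2 constant
T = ('tail', ('9', '4', '7'))
beta = {('x1', 0): ('8',),
        ('x1', 1): ('-', ('8',), ('head', T)),
        ('x1', 2): ('-', ('-', ('8',), ('head', T)), ('head', T)),
        ('x2', 0): T, ('x2', 1): T, ('x2', 2): T}
s = calc_sub(beta, ['x1', 'x2'], 'x1', (), [0, 1])
```

On this data the sketch returns −(x1, head(x2)) for x1 and x2 for x2, matching the hypothesis found in section 7.3.4.1.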
Integrating all components for calculating substitutions introduced in this section, we obtain the following algorithm:
1. Instantiate X = var(t̂_G) and calculate the initial instantiations β as defined in 7.3.11.
2. Determine whether there are enough succeeding substitution terms as required by lemma 7.3.3. If there are not enough terms, break and backtrack to calculating a new segmentation.
3. For each variable x_j ∈ X and each recursive call r ∈ R calculate the substitutions, starting at u = ε, using algorithm 7.15.
4. Check the consistency of the instantiations as required by lemma 7.3.4. If an inconsistency is detected, break and backtrack to calculating a new segmentation.
5. If |T_init| ≥ 2, then check whether there exist at least two initial trees in T_init with complete initial instantiations as required in 7.3.2 in section 7.3.3. If this is not the case, break and backtrack to calculating a segmentation.
6. Initially, instantiate t_G = t̂_G. Instantiate m = |X|. Calculate indices for the variables in X using {1, …, m}. For each u_r ∈ U_rec extend t_G in the following way: t_G = t_G[u_r ← G(sub(x_1, r), …, sub(x_m, r))].
7. Return G(x_1, …, x_m) = t_G and for all initial trees t_e return β_e = β(x_i, ε, e) for all x_i ∈ X.
7.3.5 Constructing an RPS
7.3.5.1 Parameter Instantiations for the Main Program. When constructing an RPS S = (G, t_0) which recursively explains a set of initial trees T_init, for each t_e ∈ T_init the initial instantiation β_e of the variables in t^e_0 must be calculated. This can be done using slightly modified versions of definition 7.3.8 (instantiations of variables in a hypothetical subprogram body, section 7.3.2) and definition 7.3.11 (instantiations of variables in a segment, defined for parameter instantiations in recursive calls, section 7.3.4).
For a recursive program scheme S = (G, t_0), each initial tree t_e ∈ T_init implies an instantiation β_e of the variables in t_0:
Definition 7.3.12 (Instantiation of t_0). Let T_init ⊆ T_{Σ∪{Ω}} be a set of initial trees indexed over E and S = (G, t_0) the RPS which recursively explains T_init. Let t^e_0 be the main program for t_e ∈ T_init. The instantiations of t_0 for a tree t_e can be constructed as

β_e(x) = t^e_0|_u   if ∃u ∈ pos(t_0, x) with u ∈ pos(t^e_0) and t^e_0|_u ≠ ?;
β_e(x) = ?          otherwise.
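Definition 7.3.12 is a position lookup of the abstract main program's variables in the concrete main program; a small sketch (positions as tuples of 1-based argument indices, with undefined subterms encoded as None, both assumptions):

```python
def positions_of(t, x):
    # all positions of variable x in term t
    if t == x:
        return [()]
    if isinstance(t, tuple):
        return [(i,) + p for i in range(1, len(t))
                for p in positions_of(t[i], x)]
    return []

def subterm(t, u):
    # subterm of t at position u, or None if the position does not exist
    for i in u:
        if not isinstance(t, tuple) or i >= len(t):
            return None
        t = t[i]
    return t

def instantiate_t0(t0, t0_e, x):
    """beta_e(x): subterm of the concrete main program t0_e at a position
    where x occurs in the abstract main program t0 (None if undefined)."""
    for u in positions_of(t0, x):
        val = subterm(t0_e, u)
        if val is not None:
            return val
    return None

t0 = ('G', 'x1', 'x2')
t0_e = ('G', ('succ', ('0',)), ('nil',))
beta_x1 = instantiate_t0(t0, t0_e, 'x1')
```

The same lookup, applied to the instantiated calling main β^G_e(G(x_1, …, x_m)) instead of t^e_0, yields definition 7.3.13 below.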
If there exists a recursive subprogram which explains all trees in T_init, then the variable instantiations in the main program which calls this subprogram can be calculated with respect to the subprogram.
Definition 7.3.13 (Instantiation of t_0 wrt a Subprogram). Let T_init ⊆ T_{Σ∪{Ω}} be a set of initial trees indexed over E and let G(x_1, …, x_m) = t_G be a subprogram which recursively explains the trees t_e ∈ T_init together with variable instantiations β^G_e : {x_1, …, x_m} → T_Σ ∪ {?}. The instantiations of t_0 for a tree t_e can be constructed as

β_e(x) = β^G_e(G(x_1, …, x_m))|_u
  if (a) ∃u ∈ pos(t_0, x) with u ∈ pos(β^G_e(G(x_1, …, x_m))) and
     (b) β^G_e(G(x_1, …, x_m))|_u ≠ ?;
β_e(x) = ?  otherwise.
7.3.5.2 Putting all Parts Together. Now we have all components for a system for inducing recursive explanations from a set of initial trees. The core of the system is to find a set of recursive subprograms. An overview of the induction of a subprogram is given in figure 7.12. Program BuildSub involves the following steps: (1) finding a valid recurrent segmentation U_rec for a set of initial trees (section 7.3.1), (2) constructing a hypothesis about the program body by calculating the maximal pattern of the segments (section 7.3.2), (3)
220 7. Folding of Finite Program Terms
interpreting the remaining subtrees as variable instantiations, (4) determining the variable substitutions and the initial instantiation of the variables (section 7.3.4). If there are subtrees which are not covered by the current segmentation, these trees are considered to belong to further subprograms and the process of inducing a (sub-)RPS is called recursively (see BuildRPS in figure 7.13). Whenever one of the integrity conditions formulated in the sections above is violated, the algorithm backtracks to step one.
[Flow chart, for |T_init| ≥ 2. BuildSub(T_init): calculate valid recurrent segmentation; calculate maximal pattern t̂_G; deal with further subprograms (for all u in U_sub: decompose T_init ⇒ T^u, BuildRPS(T^u) ⇒ (G^u, t_0^u), {θ_e^u}, reduce segments in T_init, set G = G ∪ G^u); calculate substitutions; calculate instantiations; set G = G ∪ {G(x_1, …, x_m) = t_G}; set t_0 = G(x_1, …, x_m); calculate initial instantiations θ_e; return (G, t_0), {θ_e}. Backtrack edges are labelled: no transitivity, not sufficiently many, not uniquely determined, if hidden variables, not enough instances, not enough complete instantiations; failure exits are labelled: no further segmentation possible, no RPS can be calculated, no solution.]
Fig. 7.12. Steps for Calculating a Subprogram
The complete procedure for inducing an RPS is given in figure 7.13. Starting at the roots of the initial trees in T_init, the system searches for a subprogram together with a calling main program and an initial instantiation of variables. Possibly, the searched-for "top-level" subprogram calls further subprograms, which are also constructed (BuildSub, figure 7.12). If no solution is found, the root node is interpreted as a constant function in the main program and BuildRPS starts with all subtrees of the root node as the new set of initial trees. If none of the new initial trees contains an Ω, no recursive generalization is constructed and there is only a main program, which is the maximal pattern of all trees; otherwise, BuildRPS is called recursively for all sets of initial trees at fixed positions under the root.
The strategy used is to find a subprogram which explains as large a part of the initial trees as possible, at a level in the trees as high as possible. Remember that, if only one initial tree is given, it is not possible to determine the parameters of the main program t_0; otherwise, the parameters of main are calculated by anti-unification of the main programs of all initial trees (see section 7.2.3). Remember further that, currently, subprograms with different names are induced for each unexplained subtree of a subprogram to be induced. Subprograms with identical bodies can be identified after synthesis is completed (see section 7.3.3.4 and section 7.3.5.4).
7.3.5.3 Existence of an RPS. If an RPS can be induced from a set of initial trees using the approach presented here, then induction can be seen as a proof of the existence of a recursive explanation, given the restrictions presented in section 7.2.2.
Theorem 7.3.5 (Existence of an RPS). Let T_init be a set of initial trees indexed over E. T_init can be explained recursively by an RPS iff:
1. T_init can be recursively explained by a subprogram G(x_1, …, x_m) = t_G, or
2. ∀e ∈ E: ∃f ∈ Σ with α(f) = n, n > 0, and node(t^e, λ) = f, and for all T_k = {t^e|_k | k = 1, …, n} holds:
   a) ∀t ∈ T_k: pos(t, Ω) = ∅, or
   b) ∀t ∈ T_k: pos(t, Ω) ≠ ∅ and there exists an RPS S_k = (G_k, t_0^k) which recursively explains the trees in T_k.
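The structural part of condition 2 (identical root symbol, and per child position either Ω in all subtrees or in none) can be checked mechanically. A sketch with our own representation (trees as nested tuples, Ω as None; the existence of the sub-RPSs S_k required in case (b) is of course not checked here):

```python
def has_omega(t):
    """True if the tree contains an Omega (represented as None)."""
    return t is None or (isinstance(t, tuple) and any(has_omega(s) for s in t[1:]))

def decomposable(trees):
    """Structural part of condition 2 of theorem 7.3.5."""
    if not trees or not all(isinstance(t, tuple) for t in trees):
        return False
    roots = {t[0] for t in trees}
    arities = {len(t) - 1 for t in trees}
    if len(roots) != 1 or len(arities) != 1 or 0 in arities:
        return False  # roots must share one symbol f with arity n > 0
    n = arities.pop()
    # every T_k must contain Omegas in all of its trees or in none of them
    return all(len({has_omega(t[k + 1]) for t in trees}) == 1 for k in range(n))
```

For two trees f(g(Ω), a) and f(g(g(Ω)), a) the check succeeds: position 1 contains an Ω in both trees, position 2 in neither.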
Proof (Existence of an RPS). It must be shown (1) that an RPS exists if propositions (1) and (2) in theorem 7.3.5 hold, and (2) that no RPS exists if propositions (1) and (2) do not hold.
1. Existence of an RPS, if propositions (1) and (2) in theorem 7.3.5 hold:
Proposition 1: Let G(x_1, …, x_m) = t_G be a subprogram which recursively explains all trees in T_init (def. 7.1.30). Then there exist β_e^G : {x_1, …, x_m} → T_Σ ∪ {Ω} such that G with instantiations β_e^G recursively explains t^e ∈ T_init. Let t_0 be the maximal pattern of all terms β_e^G(G(x_1, …, x_m)) (theorem 7.3.1) and θ_e : var(t_0) → T_Σ ∪ {Ω} (def. 7.3.13) the initial instantiations of variables in t_0 for tree t^e ∈ T_init. Then the RPS S(⟨G(x_1, …, x_m) = t_G⟩, t_0) together with instantiations θ_e recursively explains the trees in T_init.
[Flow chart. BuildRPS(T_init): BuildSub(T_init) ⇒ (G, t_0), {θ_e}; if a solution is found, return (G, t_0), {θ_e}. Otherwise test ∀t ∈ T_init: node(t, λ) = f with α(f) = n; if this fails: no solution. Else set ∀e ∈ E: t_0^e = f(x_1, …, x_m) and, for k = 1, …, n, construct T_k. If there are no Ω's in all t ∈ T_k: set ∀e ∈ E: t_0^e|_k = t. If there are Ω's in all t ∈ T_k: BuildRPS(T_k) ⇒ (G_k, t_0^k), {θ_e^k}; set ∀e ∈ E: t_0^e|_k = θ_e^k(t_0^k); set G = G ∪ G_k; set t_0|_k = maximal pattern of all t_0^e|_k; ∀e ∈ E: calculate θ_e^k wrt t_0|_k; set θ_e = θ_e ∪ θ_e^k. Finally set t_0 = maximal pattern of all θ_e(t_0), ∀e ∈ E: calculate θ_e wrt t_0, and return (G, t_0), {θ_e}.]
Fig. 7.13. Overview of Inducing an RPS
Proposition 2: For all T_k, k = 1, …, n, with pos(t, Ω) ≠ ∅ there must by assumption exist an RPS S_k = (G_k, t_0^k) which recursively explains the trees in T_k, and there must exist instantiations θ_e^k : var(t_0^k) → T_Σ ∪ {Ω} for the parameters in the main programs t_0^k. For each term t_0^e it must hold that node(t_0^e, λ) = f with arity α(f) = n. For all k = 1, …, n must hold:

  t_0^e|_k = θ_e^k(t_0^k)   if pos(t, Ω) ≠ ∅
  t_0^e|_k = t^e|_k         if pos(t, Ω) = ∅.

The "global" main program t_0 then is the maximal pattern of all t_0^e with variable instantiations θ_e(x) (def. 7.3.12). G is the set of all subprograms of the sub-schemata S_k, and the RPS S(G, t_0) together with instantiations θ_e recursively explains all trees in T_init.
2. Non-existence of an RPS, if propositions (1) and (2) in theorem 7.3.5 do not hold:
If proposition (1) does not hold, then there cannot exist an RPS with a main program t_0 which just calls a recursive subprogram. That is, there must exist an RPS with a larger main program, for which proposition (2) does not hold.
If there does not exist a symbol f ∈ Σ which is identical for all roots of the initial trees t^e ∈ T_init, then there cannot exist a (first-order) main program which explains all initial trees.
If there exist t_1 ∈ T_k with pos(t_1, Ω) = ∅ and t_2 ∈ T_k with pos(t_2, Ω) ≠ ∅, then, on the one hand, T_k cannot be explained by an RPS (by assumption) because of t_1, and, on the other hand, T_k cannot be explained by a non-recursive term with variable instantiations because of t_2.
7.3.5.4 Extension: Equality of Subprograms. After an RPS S(G, t_0) is induced, the set of subprograms G can possibly be reduced by unifying subprograms with equivalent bodies and different names G_i ∈ Σ.⁹ In the most simple case, the bodies of the two subprograms are identical except for the names of variables. In general, G_i and G_j can contain calls of further subprograms. G_i and G_j are equivalent if these subprograms are equivalent, too. Table 7.16 gives an example where G_A calls a further subprogram G_B and where variables of G_A are partially instantiated with different values for two different positions. Subprogram G_A could be generated from the two partially instantiated subprograms by (a) determining whether the further subprograms are equivalent and (b) constructing the maximal pattern of the instances of G_A. Note, however, that a strategy for deciding when different subprograms should be identified as a single subprogram would be needed to realize this simplification of program schemes!

⁹ This is a possible extension of the system, not implemented yet.
Table 7.16. Equality of Subprograms

G_A(x_1, x_2, x_3) = if(eq1(x_1), G_B(x_2), G_A(*(x_1, x_3), x_2, x_3))

Induced subprograms:
G_A is called with x_2 = 1 and x_3 = 2:
  G_A1(x_1) = if(eq1(x_1), G_B1(1), G_A1(*(x_1, 2)))
G_A is called with x_2 = 2 and x_3 = 3:
  G_A2(x_1) = if(eq1(x_1), G_B2(2), G_A2(*(x_1, 3)))
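The simplest case, bodies identical except for the names of variables, amounts to checking for a consistent bijective renaming. A sketch (terms as nested tuples; treating every string beginning with "x" as a variable is an assumption of this sketch, not the book's convention):

```python
def equal_up_to_renaming(s, t, renaming=None):
    """True if term t equals term s under a bijective renaming of variables."""
    if renaming is None:
        renaming = {}
    is_var = lambda a: isinstance(a, str) and a.startswith("x")
    if is_var(s) and is_var(t):
        if s in renaming:
            return renaming[s] == t   # consistent with earlier occurrences
        if t in renaming.values():
            return False              # mapping would not be injective
        renaming[s] = t
        return True
    if isinstance(s, tuple) and isinstance(t, tuple):
        return (len(s) == len(t) and s[0] == t[0]
                and all(equal_up_to_renaming(a, b, renaming)
                        for a, b in zip(s[1:], t[1:])))
    return s == t
```

With this check, if(eq1(x_1), G(x_2)) and if(eq1(x_7), G(x_9)) are identified, while f(x_1, x_1) and f(x_2, x_3) are not.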
7.3.5.5 Fault Tolerance. The current, purely syntactical approach to folding does allow for initial trees where some of the segments of a hypothetical unfolding are incomplete. Incompleteness could, for example, result from cases not considered when the initial trees were generated (by a user or another system). But it does not allow for initial trees which contain defects, for example wrong function symbols. Such defects could, for example, occur in user-generated initial trees or traces, where it might easily happen that trees are unintentionally flawed (by typing errors etc.). A possible solution for this problem could be that not only an RPS is induced but that, additionally, minimal transformations of the given initial trees are calculated and proposed to the user. This problem is open for further research.
7.4 Example Problems
To illustrate the presented approach to folding finite programs, we present some examples. First, we give time measures for some standard problems. Afterwards, we demonstrate application to planning problems which we introduced in chapter 3.
7.4.1 Time Effort of Folding
Figures 7.14 and 7.15 give the time effort for unfolding and folding the factorial (see tab. 7.7) and the Fibonacci function (see tab. 7.3). The experiments were done by unfolding a given RPS to a certain depth and then folding the resulting finite program term again. In all experiments the original RPS could be inferred from its n-th unfolding (n = 3, …, 10). The procedure for obtaining the time measures is described in appendix A.6.
Folding has a time effort which is a factor between 3 and 4 higher than unfolding. For linear recursive functions, folding time is linear, and for tree recursive functions, folding time is exponential in the number of unfolding points, as should be expected.
The greatest amount of folding time is used for constructing the substitutions, and the second greatest amount is used for calculating a valid recurrent segmentation. Figure 7.16 gives these times for the factorial function.
[Plot: unfolding depth n = 3, …, 10 (x-axis) against time in seconds (y-axis); curves 'fac-unfold' and 'fac-fold'.]
Fig. 7.14. Time Effort for Unfolding/Folding Factorial
[Plot: unfolding depth n = 3, …, 10 (x-axis) against time in seconds (y-axis); curves 'fib-unfold' and 'fib-fold'.]
Fig. 7.15. Time Effort for Unfolding/Folding Fibonacci
[Plot: unfolding depth n = 3, …, 10 (x-axis) against time in seconds (y-axis); curves 'fac-fold', 'fac-seg', and 'fac-sub'.]
Fig. 7.16. Time Effort for Calculating Valid Recurrent Segmentations and Substitutions for Factorial
Both for factorial and for Fibonacci, the first segmentation hypothesis leads to success, that is, no backtracking is needed. For a variant of factorial with a constant initial part (see table 7.17), one backtrack is needed. For an unfolding depth n = 3, the time for folding goes up to 0.36 sec (from 0.12 sec), and for n = 4 to 0.44 sec (from 0.16 sec).
Table 7.17. RPS for Factorial with Constant Expression in Main
G = ⟨G(x) = if(eq0(pred(x)), 1, *(x, G(pred(x))))⟩
t_0 = if(eq0(x), 1, G(x))
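Read as ordinary functions, the RPS of table 7.17 computes the factorial (a Python sketch; the elided operator in the recursive case is taken to be multiplication, as for factorial):

```python
def pred(x):
    return x - 1

def eq0(x):
    return x == 0

def G(x):
    # G(x) = if(eq0(pred(x)), 1, *(x, G(pred(x))))
    return 1 if eq0(pred(x)) else x * G(pred(x))

def t0(x):
    # t_0 = if(eq0(x), 1, G(x))
    return 1 if eq0(x) else G(x)
```

The constant expression eq0(x) in the main program handles the input 0 directly, so the subprogram G is only ever called for x ≥ 1.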
For the ModList function (see tab. 7.2), an RPS with two subprograms must be inferred. Folding time for the third unfolding of ModList, where Mod is unfolded once in the first, twice in the second, and three times in the third unfolding, is 1.07 sec (with 0.16 sec for unfolding). For the fourth unfolding, with one to four unfoldings of Mod, folding time is 3.20 sec (with 0.33 sec for unfolding).
7.4.2 Recursive Control Rules
In the following we present how RPSs for planning problems such as the ones presented in chapter 3 can be induced. In this section we do not describe how appropriate initial trees can be generated from universal plans but just demonstrate that the recursive rules underlying a given planning domain can be induced if appropriate initial trees are given!
For the function symbols used in the signature of the RPSs we assume that they are predefined. Typically, such symbols refer to predicates and operators defined in the planning domain. Furthermore, for planning problems, a situation variable s is introduced which represents the current planning state. A state can be represented as a list of literals for Strips-like planning (see chap. 2) or as a list or array over primitive types such as numbers for a functional extension of Strips (see chap. 4). Generating initial trees from universal plans is discussed in detail in chapter 8.
The clearblock problem introduced in section 3.1.4.1 in chapter 3 has an underlying linear recursive structure. In figure 7.17, an initial tree for a three-block problem is given (the ontable literal is omitted for better readability). The resulting RPS is:
G = ⟨G(x, s) = if(clear(x, s), s, puttable(topof(x, s), G(topof(x, s), s)))⟩
t_0 = G(C, list(clear(A), on(A, B), on(B, C), ontable(C))).
The first variable x holds the name of the block to be cleared, for example C. The situation variable s holds a description of a planning state as a list of literals. Predicate clear, selector topof, and operator puttable are assumed to be predefined functions. Clear(x, s) is true if clear(x) ∈ s; topof(x, s) returns
[Figure: the initial tree for clearblock, a finite unfolding of the RPS above over the state list(on(A, B), on(B, C), clear(A)): nested if(clear(…), …, puttable(topof(…), …)) terms with increasingly deep topof applications to C, and Ω at the open leaf.]
Fig. 7.17. Initial Tree for Clearblock
y for on(y, x) ∈ s; puttable(x, s) deletes on(x, y) from s and adds the literals clear(y) and ontable(x) to s.
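The clearblock RPS and its predefined functions can be sketched directly in Python (our own state representation, literals as tuples; not the book's Lisp code):

```python
def clear(x, s):
    # clear(x, s) is true if clear(x) is in s
    return ("clear", x) in s

def topof(x, s):
    # topof(x, s) returns y for on(y, x) in s
    for lit in s:
        if lit[0] == "on" and lit[2] == x:
            return lit[1]

def puttable(x, s):
    # puttable(x, s) deletes on(x, y) from s and adds clear(y) and ontable(x)
    for lit in s:
        if lit[0] == "on" and lit[1] == x:
            y = lit[2]
            return [l for l in s if l != lit] + [("clear", y), ("ontable", x)]
    return s

def G(x, s):
    # G(x, s) = if(clear(x, s), s, puttable(topof(x, s), G(topof(x, s), s)))
    return s if clear(x, s) else puttable(topof(x, s), G(topof(x, s), s))
```

Calling G("C", s) on the state of t_0 clears block C by putting first A, then B, on the table.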
The Tower of Hanoi domain is an example for a recursive function relying on a hidden variable (see sect. 7.3.4.3). In figure 7.18, an initial tree is given. The underlying RPS is:
G = ⟨G(n, x, y, z) = if(eq1(n), move(x, y),
        ap(hanoi(pred(n), x, z, y), move(x, y), hanoi(pred(n), z, y, x)))⟩
t_0 = G(3, a, b, c).
The function symbol ap (which can be interpreted as append) is used to combine two calls of hanoi with a move operation.
Parameter z occurs only in substitutions in the recursive calls. Therefore, the maximal pattern contains only three variables:
t̂_G = if(eq1(x_1),
         move(x_2, x_3),
         ap(G, move(x_2, x_3), G))
and the fourth variable z is included after determining the substitution terms.
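Evaluated with ap read as list append and move returning a single-step plan, the RPS computes the usual Tower of Hanoi move sequence (a sketch, not the book's code):

```python
def hanoi(n, x, y, z):
    # G(n, x, y, z) = if(eq1(n), move(x, y),
    #     ap(hanoi(pred(n), x, z, y), move(x, y), hanoi(pred(n), z, y, x)))
    if n == 1:
        return [("move", x, y)]
    return hanoi(n - 1, x, z, y) + [("move", x, y)] + hanoi(n - 1, z, y, x)
```

hanoi(3, "a", "b", "c") yields the seven moves of the unfolding in figure 7.18, beginning with move(a, b).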
In the following chapter, recursive generalizations for these and further planning problems will be discussed.
[Figure: the initial tree for the Tower of Hanoi, the unfolding of the RPS above for n = 3: nested if(eq1(pred(…(3))), move(…, …), ap(…, move(…, …), …)) terms with Ω at the open leaves.]
Fig. 7.18. Initial Tree for Tower of Hanoi
8. Transforming Plans into Finite Programs
"[...] I guess I can put two and two together." "Sometimes the answer's four," I said, "and sometimes it's twenty-two. [...]"
– Nick Charles to Morelli in: Dashiell Hammett, The Thin Man, 1932
Now we are ready to combine universal planning as described in chapter 3 and induction of recursive program schemes as described in chapter 7. In this chapter, we introduce an approach to transform plans generated by universal planning into finite programs which are used as input to our folder. On the one hand, we present an alternative approach to realizing the first step of inductive program synthesis as described in section 6.3.4 in chapter 6, using AI planning as the basis for generating program traces. On the other hand, we demonstrate that inductive program synthesis can be applied to the generation of recursive control rules for planning as discussed in section 2.5.2 in chapter 2. As a reminder of our overall goal introduced in chapter 1, figure 8.1 gives an overview of learning recursive rules from some initial planning experience.
[Figure: domain exploration and control knowledge learning. A domain specification (PDDL operators + goal + set of states or set of constants) is turned by universal planning into a universal plan (the set of all optimal plans); plan transformation yields finite program(s), and generalization-to-n yields a recursive program scheme.]
Fig. 8.1. Induction of Recursive Functions from Plans
While plan construction and folding are straightforward and can be dealt with by domain-independent, generic algorithms, plan transformation is knowledge dependent and therefore the bottleneck of our approach. What we present here is work in progress. As a consequence, the algorithms presented in this chapter are stated rather informally or given as part of the Lisp code of the current implementation, and we rely heavily on examples. We can demonstrate that our basic idea, transformation based on data type inference, works for a variety of domains and hope to elaborate and formalize plan transformation in the near future.
In the following, we first give an overview of our approach (sect. 8.1), then we introduce decomposition, type inference, and introduction of situation variables as components of plan transformation (sect. 8.2). In the remaining sections we give examples for plans over totally ordered sequences (sect. 8.3), sets (sect. 8.4), partially ordered lists (sect. 8.5), and more complex types (sect. 8.6).¹
8.1 Overview of Plan Transformation
8.1.1 Universal Plans
Remember the construction of universal plans for a small finite set of states and a fixed goal, as introduced in chapter 3. A universal plan is a DAG which represents an optimal, totally ordered sequence of actions (operations) to transform each input state considered in the plan into the desired output (goal) state. Each state is a node in the plan and is typically represented as a set of atoms over domain objects (constants). Actions in a plan are applied to the current state in working memory. For example, for a given state {on(A, B), on(B, C), clear(A), ontable(C)}, application of action puttable(A) results in {on(B, C), clear(A), clear(B), ontable(A), ontable(C)}.
8.1.2 Introducing Data Types and Situation Variables
To transform a plan into a finite program, it is necessary to introduce an order over the domain (which can be gained by introducing a data type) and to introduce a variable which holds the current state (i.e., a situation variable).
Data type inference is crucial for plan transformation: The universal plan already represents the structure of the searched-for program, but it does not contain information about the order of the objects of its domain. Typically for
¹ This chapter is based on the previous publications Schmid and Wysotzki (2000b) and Schmid and Wysotzki (2000a). An approach which shares some aspects with the plan transformation presented here and is based on the assumption of linearizability of goals is presented in Wysotzki and Schmid (2001).
planning, domain objects are represented as sets of constants. For example, in the clearblock problem (sect. 3.1.4.1 in chap. 3) we have blocks A, B, and C. As discussed in chapter 6, approaches to deductive as well as inductive functional program synthesis rely on a "constructive" representation of the domain. For example, Manna and Waldinger (1987) introduce the hat-axiom for referring to a block which is lying on another block; Summers (1977) relies on a predefined complete partial order over lists. Our aim is to infer the data structure for a given planning domain. If the data type is known, constant objects in the plan can be replaced by constructive expressions.
Furthermore, to transform a plan into a program term, a situation variable (see sect. 2.2.2 in chap. 2 and sect. 6.2.1.2 in chap. 6) must be introduced. While a planning algorithm applies an instantiated operator to the current state in working memory (that is, a "global" variable), interpretation of a functional program depends only on the instantiated parameters (that is, "local" variables) of this program term (Field & Harrison, 1988). Introducing a situation variable in the plan makes it possible to treat a state as a valuated parameter of an expression. That is, the primitive operators are now applied to the set of literals held by the situation variable.
8.1.3 Components of Plan Transformation
Overall, plan transformation consists of three steps, which we will describe in the following section:
Plan Decomposition: If a universal plan consists of parts with different sets of action names (e.g., a part where only unload(x, y) is used and another part where only load(x, y) is used), the plan is split into sub-plans. In this case, the following transformation steps are performed for each sub-plan separately, and a term giving the structure of function calls is generated from the decomposition structure.
Data Type Inference: The ordering underlying the objects involved in action execution is generated from the structure of the plan. From this order, the data type of the domain is inferred.
Introduction of Situation Variables: The plan is re-interpreted in situation calculus and rewritten as a nested conditional expression.
For information about the global data structures and central components of the algorithm, see appendix A.7.
8.1.4 Plans as Programs
As stated above, a universal plan represents the optimal transformation sequences for each state of the finite problem domain for which the plan was constructed. To execute such a plan, the current input state can be searched in the planning graph by depth-first or breadth-first search starting from the root, and the actions along the edges from the current input state to the root can be extracted (Schmid, 1999). If the universal plan is transformed into a finite program term, this program can be evaluated by a functional eval-apply interpreter. For each input state, the resulting transformation sequence should correspond exactly to the sequence of actions associated with that state in the universal plan.
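Plan execution by search and action extraction can be sketched as follows (our own representation, not the book's code: the universal plan as a mapping from each state to its action-labelled edge toward the root, with the goal state mapping to None):

```python
def extract_actions(plan, state):
    """Collect the actions along the edges from state to the root (goal) state."""
    actions = []
    while plan[state] is not None:
        action, state = plan[state]   # follow the edge toward the root
        actions.append(action)
    return actions
```

For a small clearblock chain, looking up the input state and walking to the root yields the action sequence associated with that state in the plan.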
The searched-for finite program is already implicitly given in the plan, and we have to extract it by plan transformation. A plan can be considered as a finite program for transforming a fixed set of inputs into the desired output by means of applying a totally ordered sequence of actions to an input state, resulting in a state fulfilling the top-level goals. A plan constructed by backward search, with the state(s) fulfilling the top-level goals as root, can be read top-down as: IF the literals at the current node are true in a situation THEN you are done after executing the actions on the path from the current node to the root ELSE go to the child node(s) and recur.² Our goal is to extract the underlying program structure from the plan. To interpret the plan as a (functional) program term, states are re-interpreted as boolean operators: All literals of a state description which are involved in transformation (the "footprint", Veloso, 1994) are rewritten with the predicate symbol as a boolean operator, introducing a situation variable as additional argument. Each boolean operator is rewritten as a function which returns true if some proposition holds in the current situation, and false otherwise. Additionally, the actions are extended by a situation variable; thus, the current (partial) description of a situation can be passed through the transformations. Finally, to represent the conditional statement given above, additional nodes only containing the situation variable are introduced for all cases where the current situation fulfills a boolean condition. In this case, the current value of the situation variable is returned.
We define plans as programs in the following way:
Definition 8.1.1 (Plan as Program). Each node S (set of literals) in a plan Π is interpreted as a conjunction B of boolean expressions. The planning tree can now be interpreted as a nested conditional: Π(s) = IF B(s) THEN t_1 ELSE t_2 with t_1, t_2 ::= s | o(Π′(s)), where s is a situation variable, o the action given at the edge from B to a child node, and Π′ the sub-plan with this child node as root.
The restriction to binary conditions "if-then-else" is no limitation in expressiveness. Each n-ary condition can be rewritten as a nested binary condition:
(cond (x_1 t_1) (x_2 t_2) (x_3 t_3) … (x_n t_n)) ==
(if x_1 t_1 (if x_2 t_2 (if x_3 t_3 (if … (if x_n t_n))))).
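This rewriting of an n-ary cond into nested binary conditionals can be sketched over Lisp-style expressions represented as nested tuples:

```python
def cond_to_if(clauses):
    """((x1 t1) (x2 t2) ... (xn tn)) -> (if x1 t1 (if x2 t2 ... (if xn tn)))."""
    (x, t), rest = clauses[0], clauses[1:]
    if not rest:
        return ("if", x, t)           # last clause: binary if without else-part
    return ("if", x, t, cond_to_if(rest))
```

Applied to three clauses, this yields (if x_1 t_1 (if x_2 t_2 (if x_3 t_3))), exactly the shape given above.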
In general, boolean expressions B(s) can involve propositions explicitly given in a situation as well as additional constraints as described in chapter
² An interpreter function for universal plans is given in (Wysotzki & Schmid, 2001).
4. If the plan results in a term IF B(s) THEN s ELSE o(Π(s)), the problem is linear; if both then- and else-part involve operator application, the problem is more complex (resulting in a tree recursion). We will see below that for some problems a complex structure can be collapsed into a linear one as a result of data type introduction.
8.1.5 Completeness and Correctness
Starting point for plan transformation is a complete and correct universal plan (see chap. 3). The result of plan transformation is a finite program term. The completeness and correctness of this program can be checked by using each state in the original plan as input to the finite program and checking (1) whether the interpretation of the program results in the goal state and (2) whether the number of operator applications corresponds to the number of edges on the path from the given input state to the root of the plan. Of course, because folding is an inductive step, we cannot guarantee the completeness and correctness of the inferred recursive program. The recursive program could be empirically validated by presenting arbitrary input states of the domain. For example, if a plan for sorting lists of up to three elements was generalized into a recursive sorting program, the user could test this program by presenting lists with four or more elements as input and inspecting the resulting output.
8.2 Transformation and Type Inference
8.2.1 Plan Decomposition
As an initial step, the plan might be decomposed into uniform sub-plans:
Definition 8.2.1 (Uniform Sub-Plan). A sub-plan is uniform if it contains only fixed, regular sequences of operator names ⟨o_1 … o_n⟩ with n ≥ 1.
The most simple uniform sub-plan is a sequence of steps where each action involves an identical operator name (see fig. 8.2.a). An example for such a plan is the clearblock plan given in figure 3.2, which consists of a linear sequence of puttable actions. It could also be possible that some operators are applied in a regular way, for example drill-hole, polish-object (see fig. 8.2.b). Single operators or regular sequences can alternatively occur in more complex planning structures (see fig. 8.2.c). An example for such a plan is the sorting plan given in figure 3.7, which is a DAG where each arc is labelled with a swap action.
A plan can contain uniform sub-plans in several ways: The most simple way is that the plan can be decomposed level-wise (see fig. 8.3.a). This is, for example, the case for the rocket domain (see plan in fig. 3.5): The first levels
of the DAG only contain unload actions, followed by a single move-rocket action, followed by some levels which only contain load actions. In general, sub-plans can occur at any position in the planning structure as subgraphs (see fig. 8.3.b).
[Figure: (a) a linear chain of o1-actions; (b) a chain alternating o1- and o2-actions; (c) a branching structure over o1-actions.]
Fig. 8.2. Examples of Uniform Sub-Plans
We have only implemented a very restricted mode for this initial plan decomposition: single operators and level-wise splitting (see appendix A.8). A full implementation of decomposition involves complex pattern matching, as it is realized for identifying sub-programs in our folder (chap. 7). Level-wise decomposition can result in a set of "parallel" sub-plans which might be composed again during later planning steps. Parallel sub-plans occur if the "parent" sub-plan is a tree, that is, if it terminates with more than one leaf. Each leaf becomes the root of a potential subsequent sub-plan.
In the current implementation, we return the complete plan if different operators occur at the same level. A reasonable minimal extension (still avoiding complex pattern matching as described in chap. 7) would be to search for sub-plans fulfilling our simple splitting criterion at lower levels of the plan. But up to now, this case only occurred for complex list problems (such as tower with 4 blocks, see below), and in such cases a minimal spanning tree is extracted from the plan.
[Figure: (a) a plan decomposable level-wise into uniform sub-plans; (b) uniform sub-plans occurring as subgraphs within the planning structure.]
Fig. 8.3. Uniform Plans as Subgraphs
If decomposition results in more than one sub-plan, an initial skeleton for the program structure is generated over the (automatically generated) names of the sub-plans (see also chap. 7), which are initially associated with the partial plans and finally with recursive functions. If folding succeeded, the names are extended by the associated lists of parameters. For example, a structure (p1 (p2 (p3))) could be completed to (p1 arg1 (p2 arg2 arg3 (p3 arg4 s))), where the last argument of each sub-program with name p_i is a situation variable. For all arguments arg_i, the initial values as given in the finite program are known as a result from folding, where parameters and their initial instantiations are identified in an initial program (chap. 7).
8.2.2 Data Type Inference
The central step of plan transformation is data type inference.³ The structure of a (sub-)plan is used to generate a hypothesis about the underlying data type. This hypothesis invokes certain data-type-specific concepts, which the system subsequently tries to identify in the plan, and certain rewrite steps, which are to be performed on the plan. If the data-type-specific concepts cannot be identified in the plan, plan transformation fails.
Definition 8.2.2 (Data Type). A data type is a collection of data items, with designated basic items Ω (together with a "bottom"-test) and operations
³ Concrete and abstract data types are for example introduced in (Ehrig & Mahr, 1985).
(constructors) c such that all data items can be generated from basic items by operation application. The constructor function determines the structure of the data type with Ω < c(x, Ω) < c(x′, c(x, Ω)) < …. If x is empty or unique, the data type is simple and the (countable, infinite) set of elements belonging to c is totally ordered. The structure of data belonging to complex types is usually a partial order. For complex types, additionally selector functions el(c(x, s)) = x and rs(c(x, s)) = s are defined.
An example for a data type is list, with the empty list as bottom element, cons(x, l) as list constructor, head(l) as selector for the first element of a list, and tail(l) as selector for the "rest" of the list, that is, the list without the first element.
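The list example, with the empty list as bottom element, can be sketched as (tuples standing in for lists; the name of the bottom-test is our choice):

```python
def cons(x, l):
    return (x,) + l      # list constructor

def head(l):
    return l[0]          # selector: first element

def tail(l):
    return l[1:]         # selector: the list without the first element

def empty(l):
    return l == ()       # the bottom-test
```

Every list can be generated from the bottom element by constructor application, e.g. cons(1, cons(2, ())).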
The data type hypotheses are checked against the plan, ordered by increasing complexity:
Is the plan a sequence of steps (no branching in the plan)? Hypothesis: data type is sequence.
Does the plan consist of paths with identical sets of actions? Hypothesis: data type is set.
Is the plan a tree? Hypothesis: data type is list or a compound type.
Is the plan a DAG? Hypothesis: data type is list or a compound type.
Data type inference for the different plan structures is discussed in detail below. After the plan is rewritten in accordance with the data type, the order of the operator applications and the order over the domain objects are represented explicitly.
Expli it order over the domain obje ts is a hieved by repla ing obje t
names by fun tional expressions (sele tor fun tions) referring to obje ts in
an indire t way. Referring to obje ts by fun tions f(t), where t is a ground
term (a onstant, su h as the the bottom-element, or a fun tional expression
over a onstant) makes it possible to deal with innite domains while still
using nite, ompa t representations (Gener, 2000). For example, (pi k oset)
an represent a spe i obje t in an obje t list of arbitrary length.
8.2.3 Introducing Situation Variables

In the final step of plan transformation, the remaining literals of each state
and the actions are extended by a situation variable s as additional argument,
and the plan is rewritten as a conditional expression as defined in definition
8.1.1. An abbreviated version of the rewriting algorithm is given in appendix
A.9.⁴

⁴ This final step is currently only implemented for linearized plans.
8.3 Plans over Sequences of Objects

A plan which consists of a sequence of actions (without branching) is assumed
to deal with a sequence of objects. For sequences, there must exist a single
bottom element, which is identifiable from the top-level goal(s) together with
the goal predicate(s) as bottom-test. The total order over domain objects is
defined over the arguments of the actions from the top (root) of the sequence
to the leaf.
Definition 8.3.1 (Sequence). Data type sequence is defined as:

  seq = ⊥ | c(seq)  with
  null(seq) = true if seq = ⊥, false otherwise.
For a plan or sub-plan with hypothesized data type sequence, data
type introduction works as described in the algorithm given in table 8.1.
Note that inference is performed over a plan which is generated by our Lisp-
implemented planning system. Therefore, expressions are represented in
parenthesized form. For example, the literal representing that a block A is lying on
another block B is represented as (on A B); the action to put block A on the
table is represented as (puttable A). The restriction to a single top-level goal
in the algorithm can be extended to multiple goals by a slight modification.⁵
For the application of an inferred recursive control rule, an additional
function for identifying successor elements from the current state has to be
provided, as described in algorithm 8.1. The program code for generating this
function is given in figure 8.4. The first function (succ-pattern) represents
a pre-defined pattern for the successor function. For a concrete plan, this
pattern must be instantiated in accordance with the predicates occurring in
the plan. For example, for the clearblock problem (see chap. 3 and below),
the successor function topof(x) = y is constructed from predicates on(y, x).
Unstacking Objects. A prototypical example for plans over a sequence of
objects is unstacking objects, either to put all objects on the ground or
to clear a specific object located somewhere in the stack. Such a planning
domain was specified by clearblock as presented in section 3.1.4.1 in chapter
3. Here we use a slightly different domain, omitting the predicate ontable and
with operator puttable named unstack. The domain specification and a plan
for unstack are given in figure 8.5.
The protocol of plan transformation is given in figure 8.6. After identifying
the data type sequence, the crucial step is the introduction of the successor
function (succ x) = y if (on y x), which represents the "block lying on top of

⁵ Data type sequence must be introduced as first step. If the sequence of objects
in the plan is identified, it is also identified which predicate characterizes the
elements of this sequence. The top-level goals must include this predicate with
the bottom element as argument, and consequently it can be selected as "bottom-
test" predicate.
Table 8.1. Introducing Sequence

If the plan starts at level 0:
  If the plan is not a single step:
    If there is a single top-level goal (g a_1 ... a_n), set bottom-test to g, else fail.
    Set type to seq.
    Generate the sequence:
      Collect the argument tuple of each action along the path from the root to
      the leaf.
      If the tuple consists of a single argument a_1, keep it;
      otherwise, remove all arguments a_i which are constant over the sequence.
      If the sequence consists of single elements and each element occurs as
      argument of g, proceed with sequence = (e_0 ... e_m) and set bottom to e_0,
      else fail.
    Construct an association list ((e_1 (succ e_0)) ... (e_m (succ^m e_0))).
    For (e_0, e_1) check whether the state on level 1 contains a predicate q(args)
    with e_0, e_1 ∈ args at positions pos(e_0, 1, q) and pos(e_1, 1, q); if yes,
    proceed, else fail.
    For each (e_i, e_{i+1}) of the sequence with i = 1, ..., m-1 check whether
    q(args) with e_i, e_{i+1} ∈ args exists at level i+1 with
    pos(e_i, i+1, q) = pos(e_0, 1, q) = p_i and pos(e_{i+1}, i+1, q) = pos(e_1, 1, q) = p_j.
    If yes, generate a function (succ e_i) = e_j if (q args) with e_i at p_i in q
    and e_j at p_j in q, else fail.
    Introduce data type sequence into the plan:
      For each state, keep only the bottom-test predicate (g a_1 ... a_n).
      Replace the arguments of g and of the actions by (succ^i e_0) in accordance
      with the association list.
  If the plan is a single step: identify bottom-test and bottom as above; reduce
  states to the bottom-test predicate.
If the plan starts at a level > 0: an "intermediate" goal must be identified;
afterwards, proceed as above.
; pattern for getting the succ of a constant
; pred has to be replaced by the predicate-name of the rewrite-rule
; x-pos has to be replaced by a list-selector (position of x in pred)
; y-pos ditto (position of y = (succ x) in pred)
(setq succ-pattern '(defun succ (x s)
  (cond ((null s) nil)
        ((and (equal (first (car s)) pred)
              (equal (nth x-pos (car s)) x))
         (nth y-pos (car s))
        )
        (T (succ x (cdr s)))
  )))
; use a pattern for calculating the successor of a constant from a
; planning state (succ-pattern) and replace the parameters for the
; current problem
; this function has to be saved so that the synthesized program
; can be executed
(defun transform-to-fct (r)
  ; r: ((pred ...) (y = (succ x)))
  ; pred = (first (car r))
  ; find variable-names x and y and find their positions in (pred ...)
  ; replace pred, x-pos, y-pos
  (setq r-pred (first (car r)))
  (setq r-x-pos (position (second (third (second r))) (first r)))
  (setq r-y-pos (position (first (second r)) (first r)))
  (nsubst (cons 'quote (list r-pred)) 'pred
          (nsubst r-x-pos 'x-pos (nsubst r-y-pos 'y-pos succ-pattern)))
)
Fig. 8.4. Generating the Successor Function for a Sequence
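The effect of succ-pattern and transform-to-fct can be mirrored in Python (an illustrative sketch, not the book's code; states are modeled as lists of literal tuples, and make_succ is our own helper name):

```python
def make_succ(pred, x_pos, y_pos):
    """Instantiate the successor-function pattern for a rewrite rule:
    given a predicate name and the argument positions of x and y, return
    a function that looks up y = (succ x) in a state (list of literal tuples)."""
    def succ(x, state):
        for lit in state:
            if lit[0] == pred and lit[x_pos] == x:
                return lit[y_pos]
        return None            # no successor found in this state
    return succ

# For the unstack domain: in (on y x), y is at position 1 and x at position 2.
succ = make_succ("on", 2, 1)
state = [("on", "O1", "O2"), ("on", "O2", "O3"), ("clear", "O1")]
succ("O3", state)   # -> "O2", the block lying on top of O3
```

Just as in the Lisp pattern, the returned function scans the state literal by literal and recurses (here: iterates) on the rest if the head literal does not match.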
D = { ((clear O1) (clear O2) (clear O3)),
      ((on O2 O3) (clear O1) (clear O2)),
      ((on O1 O2) (on O2 O3) (clear O1)) }
G = {(clear O3)}
O = {unstack} with
(unstack ?x)
  PRE {(clear ?x), (on ?x ?y)}
  ADD {(clear ?y)}
  DEL {(on ?x ?y)}

((CLEAR O1) (CLEAR O2) (CLEAR O3))
((ON O2 O3) (CLEAR O1) (CLEAR O2))
((ON O1 O2) (ON O2 O3) (CLEAR O1))
(UNSTACK O2)
(UNSTACK O1)
Fig. 8.5. The Unstack Domain and Plan
block x". While such functions are usually pre-defined (Manna & Waldinger,
1987; Geffner, 2000; Wysotzki & Schmid, 2001), we can infer them from the
universal plan. The "constructively" rewritten plan, where data type sequence
is introduced, is given in figure 8.7. The transformation information stored for
the finite program is given in appendix C.3.
+++++++++ Transform Plan to Program ++++++++++
1st step: decompose by operator-type
Single Plan
(SAVE SUBPLAN P1)
-----------------------------------------------
2nd step: Identify and introduce data type
(INSPECTING P1) Plan is linear
(SINGLE GOAL-PREDICATE (CLEAR O3))
Plan is of type SEQUENCE
(SEQUENCE IS O3 O2 O1)
(IN CONSTRUCTIVE TERMS THAT'S (O2 (SUCC O3)) (O1 (SUCC (SUCC O3))))
Building rewrite-cases:
(((ON O2 O3) (O2 = (SUCC O3))) ((ON O1 O2) (O1 = (SUCC O2))))
(GENERALIZED RULE IS (ON |?x1| |?x2|) (|?x1| = (SUCC |?x2|)))
Storage as LISP-function
Reduce states to relevant predicates (footprint)
((CLEAR O3))
((CLEAR (SUCC O3)))
((CLEAR (SUCC (SUCC O3))))
-----------------------------------------------
3rd step: Transform plan to program
Show Plan as Program? y
---------------------------------------------------------
Fig. 8.6. Protocol for Unstack
The finite program term which can be constructed from the term given
in figure 8.7 is:
(IF (CLEAR O3 S)
S
(UNSTACK (SUCC O3 S)
(IF (CLEAR (SUCC O3 S) S)
S
(UNSTACK (SUCC (SUCC O3 S) S)
(IF (CLEAR (SUCC (SUCC O3 S) S) S)
S
OMEGA))))).
An RPS (see chap. 7) generalizing this unstack term is ⟨(unstack-all o
s) = (if (clear o s) s (unstack (succ o s) (unstack-all (succ o s) s)))⟩ with
((CLEAR O3))
((CLEAR (SUCC O3)))
((CLEAR (SUCC (SUCC O3))))
(UNSTACK (SUCC O3))
(UNSTACK (SUCC (SUCC O3)))
Fig. 8.7. Introduction of Data Type Sequence in Unstack
t_0 = (unstack-rec o s) for some constant o and some set of literals s. The
executable program is given in figure 8.8.
Plans consisting of a linear sequence of operator applications over a
sequence of objects in general result in generalized control knowledge in the
form of linear recursive functions. In standard programming domains (over
numbers and lists), a large group of problems is solvable by functions of this
recursion class. Examples are given in table 8.2.
Table 8.2. Linear Recursive Functions

(unstack-all x s) == (if (clear x) s (unstack (succ x) (unstack-all (succ x) s)))
(factorial x) == (if (eq0 x) 1 (mult x (factorial (pred x))))
(sum x) == (if (eq0 x) 0 (plus x (sum (pred x))))
(expt m n) == (if (eq0 n) 1 (mult m (expt m (pred n))))
(length l) == (if (null l) 0 (succ (length (tail l))))
(sumlist l) == (if (null l) 0 (plus (head l) (sumlist (tail l))))
(reverse l) == (if (null l) nil (append (reverse (tail l)) (list (head l))))
(append l1 l2) == (if (null l1) l2 (cons (head l1) (append (tail l1) l2)))
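A few of the table's entries, transcribed to Python as plain linear recursions (an illustrative sketch; the book works in Lisp, and `append` here is just our direct transliteration, not Python's list method):

```python
def factorial(x):
    # (factorial x) == (if (eq0 x) 1 (mult x (factorial (pred x))))
    return 1 if x == 0 else x * factorial(x - 1)

def sumlist(l):
    # (sumlist l) == (if (null l) 0 (plus (head l) (sumlist (tail l))))
    return 0 if not l else l[0] + sumlist(l[1:])

def append(l1, l2):
    # (append l1 l2) == (if (null l1) l2 (cons (head l1) (append (tail l1) l2)))
    return l2 if not l1 else [l1[0]] + append(l1[1:], l2)

factorial(4)           # -> 24
sumlist([1, 2, 3])     # -> 6
append([1, 2], [3])    # -> [1, 2, 3]
```

Each function has exactly one recursive call in the else-branch, the defining property of this recursion class.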
It is simple and straightforward to generalize over linear plans. For plans
of such problems, our approach automatically provides complete and correct
control rules. As a result, planning can be avoided completely, and the
transformation sequence for solving an arbitrary problem involving an arbitrary
number of objects can be computed in linear time!
8.4 Plans over Sets of Objects

A plan which has a single root and a single leaf, where the sets of actions
for each path from root to leaf are identical, is assumed to deal with a set
of objects. For sets, there has to be a complex data object which is a set of
elements (constants of the planning domain), a bottom element which is
inferred from the elements involved in the top-level goals, a bottom-test
; Complete recursive program for the UNSTACK problem
; call (unstack-all <oname> <state-description>)
; e.g. (unstack-all 'O3 '((on o1 o2) (on o2 o3) (clear o3)))
; --------------------------------------------------------------------
; generalized from finite program generated in plan-transform
(defun unstack-all (o s)
  (if (clear o s)
      s
      (unstack (succ o s) (unstack-all (succ o s) s))
  ) )

(defun clear (o s)
  (member (list 'clear o) s :test 'equal)
)

; inferred in plan-transform
(DEFUN SUCC (X S)
  (COND ((NULL S) NIL)
        ((AND (EQUAL (FIRST (CAR S)) 'ON)
              (EQUAL (NTH 2 (CAR S)) X))
         (NTH 1 (CAR S)))
        (T (SUCC X (CDR S)))))

; explicit implementation of "unstack"
; in connection with DPlan: apply unstack-operator on state s and return
; the new state
(defun unstack (o s)
  (cond ((null s) nil)
        ((and (equal (first (car s)) 'on) (equal (second (car s)) o))
         (cons (cons 'clear (list (third (car s)))) (cdr s))
        )
        (T (cons (car s) (unstack o (cdr s))))
  ))
Fig. 8.8. LISP Program for Unstack
which has to be an inferred predicate over the set, and two selectors, one for
an element of a set (pick) and one for a set without some fixed element (rst).
The partial order over sets with maximally three elements is given in figure
8.9; this order corresponds to the sub-plans for unload or load in the rocket
domain (sect. 3.1.4.2 in chap. 3). If pick and rst are defined deterministically
(e.g., by list selectors), the partial order is reduced to a total order.
[Figure: the subset lattice over {e1, e2, e3}: the singletons {e1}, {e2}, {e3}
below the two-element sets {e1, e2}, {e1, e3}, {e2, e3}, below the full set
{e1, e2, e3}.]
Fig. 8.9. Partial Order of Set
Definition 8.4.1 (Set). Data type set is defined as:

  set = ⊥ | c(e, set)  with
  empty(set) = true if set = ⊥, false otherwise
  pick(set) = some e ∈ set
  rst(set) = set \ e for some e ∈ set.
Because the complex data object is inferred from the top-level goals (given in
the root of the plan), we typically infer an "inverted" order with the largest
set (containing all objects which can be an element of set) as bottom element
and c(e, set) == rst(set) as a de-structor.
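Definition 8.4.1 in Python terms (a hedged sketch; pick and rst are made deterministic via list selectors, as the text suggests, so the set is simply modeled as a tuple):

```python
BOTTOM = ()                 # bottom element: the empty set

def empty(s):               # bottom-test
    return s == BOTTOM

def pick(s):                # selector: some (here: the first) element of the set
    return s[0]

def rst(s):                 # selector: the set without the picked element
    return s[1:]

s = ("O1", "O2", "O3")
pick(s)                     # -> "O1"
rst(s)                      # -> ("O2", "O3")
empty(rst(rst(rst(s))))     # -> True
```

Implementing pick and rst as head/tail guarantees that rst returns exactly the set of objects without the currently picked one, which is what reduces the partial order of figure 8.9 to a total order.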
For a plan or sub-plan with hypothesized data type set, data type
introduction works as described in the algorithm given in table 8.3. Because
we collapse such a plan to one path of a set of paths, this algorithm completely
contains the case of sequences (tab. 8.1). Note that collapsing plans with
underlying data type set corresponds to the idea of "commutative pruning"
as discussed in (Haslum & Geffner, 2000).
The program code for make-co, generating the bottom-test, and for pick and
rst referred to in algorithm 8.3 is given in figure 8.10. As described for data
type sequence above, the patterns of the selector functions are pre-defined, and
the concrete functions are constructed by instantiating these patterns with
information gained from the given plan. Introduction of a new predicate with
Table 8.3. Introducing Set

Collapse plan to one path (implemented as: take the "leftmost" path).
Generate a complex data object: like Generate Sequence in tab. 8.1.
  sequence = (e_1 ... e_m) is interpreted as set and bottom is instantiated with
  CO = (e_1 ... e_m).
  A function for generating CO from the top-level goals (make-co) is provided.
A generalized predicate (g* args) with CO ∈ args is constructed by collecting
all predicates (g args) with o ∈ CO ∧ o ∈ args and replacing o by CO. For a plan
starting at level 0, g has to be a top-level goal and all top-level goals have to
be covered by g*; for other plans, g* has to be in the current root node.
A function for testing whether g* holds in a state is generated as bottom-test.
Introduce the data type into the plan:
  For each state keep only the bottom-test predicate (g* args) with CO′ ⊆ CO ∈ args.
  Introduce set selectors for the arguments of g* by replacing CO′ by (rst^i CO)
  and afterwards replacing action arguments by (pick (rst^i CO)) with (rst^i CO)
  occurring as argument of g* of the parent node.
(pick set) and (rst set) are predefined by car and cdr.
a complex argument corresponds to the notion of "predicate invention" in
Wysotzki and Schmid (2001).⁶
Oneway Transportation. A typical example for a domain with a set as
underlying data type are simple transportation domains, where some objects
must be loaded into a vehicle which moves them to a new location. Such a
domain is rocket (see sect. 3.1.4.2 in chap. 3). The rocket plan is in a first step
levelwise decomposed into sub-plans with uniform actions (see fig. 8.11).

Data type inference is done for each sub-plan. For both sub-plans (unload-
all and load-all) there is a single root and a single leaf node, and the sets of
actions along all (six) possible paths from root to leaf are equal. From this
observation we can conclude that the actual sequence in which the actions
are performed is irrelevant, that is, the underlying data structure of both
sub-plans is a set. Consequently, the (sub-)plan can be collapsed to one
path.
Introduction of the data type set involves the following steps:

- The initial set is constructed by collecting all arguments of the actions
  along the path from root to leaf. That is, the initial value for the set can
  be I = {O1, O2, O3} when the left-most path of a sub-plan is kept. (If
  the involved operator has more than one argument, for example (unload
  <obj> <place>), the constant arguments are ignored.)
- A "generalized" predicate corresponding to the empty-test of the data
  type is invented by generalizing over all literals in the root node with an
  element of the initial set I as argument. That is, for the unload-all sub-plan,
  the new predicate is p = (at* ⟨objSet⟩ B) with

⁶ This kind of predicate invention should not be confused with predicate invention
in ILP (see sect. 6.3.3 in chap. 6). In ILP a new predicate must necessarily be
introduced if the current hypothesis covers some negative examples.
; make-co
; -------
; collect all objects covered by goal-preds p corresponding to the
; new predicate p*
; the new complex object is referred to as CO
; f.e. for rocket: (at o1 b) (at o2 b) --> CO = (o1 o2)
; how to make the complex object CO: use the pattern of the new
; predicate to collect all objects from the current top-level goal
;; use for call of main function, f.e.: (rocket (make-co ..) s)
(defun make-co (goal newpat)
  (cond ((null goal) nil)
        ((string< (string (caar goal)) (string (car newpat)))
         (cons (nth (position 'CO newpat) (car goal))
               (make-co (cdr goal) newpat)))
  ))

; rest(set) (implemented as for lists, otherwise pick/rest would not
; --------- be guaranteed to be really complements)
;; named rst because rest is built-in
(defun rst (co)
  (cdr co)
)

; pick(set)
(defun pick (co)
  (car co)
)

; is newpred true in the current situation?
;; used for checking this predicate in the generalized function
;; f.e. (at* CO Place s)
;; newp has to be replaced by the newpred name (f.e. at*)
;; pname has to be replaced by the original pred name (f.e. at)
(setq newpat '(defun newp (args s)
  (cond ((null args) T)
        ((and (null s) (not (null args))) nil)
        ((and (equal (caar s) pname)
              (intersection args (cdar s)))
         (newp (set-difference args (cdar s)) (cdr s)))
        (T (newp args (cdr s)))
  ))
)

(defun make-npfct (patname gpname)
  (subst patname 'newp (nsubst (cons 'quote (list gpname)) 'pname newpat))
)
Fig. 8.10. Functions Inferred/Provided for Set
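The generalized predicate generated from newpat can be sketched in Python (a hedged sketch with hypothetical helper names; states are again lists of literal tuples, and for simplicity the location argument is not checked separately, since subtracting the literal's arguments covers it):

```python
def make_generalized_pred(pname):
    """Return a test p*(args, state): true iff every object in args is
    covered by some pname-literal of the state (the invented empty-test)."""
    def newp(args, state):
        remaining = set(args)
        for lit in state:
            if lit[0] == pname:
                remaining -= set(lit[1:])   # objects covered by this literal
        return not remaining                # all objects covered -> p* holds
    return newp

at_star = make_generalized_pred("at")
state = [("at", "O1", "B"), ("at", "O2", "B"), ("at", "R", "B")]
at_star(["O1", "O2"], state)         # -> True
at_star(["O1", "O2", "O3"], state)   # -> False (O3 occurs in no at-literal)
```

This mirrors the recursion of newpat: each matching literal removes the objects it covers from args, and the predicate holds when nothing remains.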
(b) "move-rocket"
(c) "load-all"
(a) "unload-all"
(single step)
((AT O1 B) (AT O2 B) (AT O3 B) (AT R B))
((IN O1 R) (AT O2 B) (AT O3 B) (AT R B)) ((IN O2 R) (AT O1 B) (AT O3 B) (AT R B))
((IN O2 R) (IN O1 R) (AT O3 B) (AT R B))
((IN O3 R) (IN O1 R) (AT O2 B) (AT R B))
((IN O3 R) (IN O1 R) (IN O2 R) (AT R B))
(UNLOAD O2)(UNLOAD O1)
(UNLOAD O2)(UNLOAD O3)(UNLOAD O1) (UNLOAD O3)
(UNLOAD O1)
(UNLOAD O3)(UNLOAD O2)
(UNLOAD O1)
((IN O3 R) (AT O1 B) (AT O2 B) (AT R B))
((IN O3 R) (IN O2 R) (AT O1 B) (AT R B))
(UNLOAD O2)
(UNLOAD O3)
((AT O1 A) (AT R A) (IN O2 R) (IN O3 R)) ((AT O2 A) (AT R A) (IN O1 R) (IN O3 R)) ((AT O3 A) (AT R A) (IN O1 R) (IN O2 R))
((AT O2 A) (AT O1 A) (AT R A) (IN O3 R))
((AT O3 A) (AT O1 A) (AT R A) (IN O2 R))
((AT O3 A) (AT O2 A) (AT R A) (IN O1 R))
((AT O3 A) (AT O1 A) (AT O2 A) (AT R A))
(LOAD O1) (LOAD O2) (LOAD O3)
(LOAD O2)(LOAD O3)(LOAD O1) (LOAD O3)
(LOAD O1)(LOAD O2)
(LOAD O3)(LOAD O2)
(LOAD O1)
((AT R A) (IN O1 R) (IN O2 R) (IN O3) R)
((AT R A) (IN O1 R) (IN O2 R) (IN O3) R)
((IN O3 R) (IN O1 R) (IN O2 R) (AT R B))
(MOVE-ROCKET)
Fig. 8.11. Sub-Plans of Ro ket
(at* ⟨objSet⟩ B) =  true   if ∀ o ∈ objSet: (at o B) holds in the current state
                    false  otherwise.

The original literals are replaced by p.
- The arguments of the generalized predicate are rewritten using the pre-
  defined rest-selector. We have the following replacements for the unload
  sub-plan (given from root to leaf):

  (at* {O1, O2, O3} B) == (at* {O1, O2, O3} B)
  (at* {O1, O2} B) → (at* (rst {O1, O2, O3}) B)
  (at* {O1} B) → (at* (rst (rst {O1, O2, O3})) B)
  (at* { } B) → (at* (rst (rst (rst {O1, O2, O3}))) B).

- The arguments of the actions are rewritten using the predefined pick-
  selector:

  (unload O1) → (unload (pick {O1, O2, O3}))
  (unload O2) → (unload (pick (rst {O1, O2, O3})))
  (unload O3) → (unload (pick (rst (rst {O1, O2, O3})))).

Note that we define pick and rst deterministically (e.g., as head/tail or
last/butlast). Although it is irrelevant which object is selected next, it is
necessary that rst returns exactly the set of objects without the currently
picked one.
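The rewriting of action arguments into nested selector terms can be sketched symbolically (an illustrative Python sketch of our own; terms are modeled as nested tuples):

```python
def rewrite_actions(op, co, n):
    """Rewrite the n actions of a collapsed sub-plan so that the i-th
    action refers to its object as (op (pick (rst^i CO)))."""
    terms, t = [], co
    for _ in range(n):
        terms.append((op, ("pick", t)))   # action on the picked element
        t = ("rst", t)                    # descend: one more rst around CO
    return terms

CO = ("O1", "O2", "O3")
rewrite_actions("unload", CO, 3)
# -> [("unload", ("pick", ("O1", "O2", "O3"))),
#     ("unload", ("pick", ("rst", ("O1", "O2", "O3")))),
#     ("unload", ("pick", ("rst", ("rst", ("O1", "O2", "O3")))))]
```

The object names themselves disappear: every action now refers to its object only indirectly, through a selector expression over the complex object CO.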
The transformed sub-plan for unload-all is given in figure 8.12.a. After
introducing a data type into the plan, there is only one additional step necessary
to interpret the plan as a program: introducing a situation variable. Now
the plan can be read as a nested conditional expression (see fig. 8.12.b).
(a)
((AT* (O1 O2 O3) B))
((AT* (RST (O1 O2 O3)) B))
((AT* (RST (RST (O1 O2 O3))) B))
((AT* (RST (RST (RST (O1 O2 O3)))) B))
(UNLOAD (PICK (O1 O2 O3)))
(UNLOAD (PICK (RST (O1 O2 O3))))
(UNLOAD (PICK (RST (RST (O1 O2 O3)))))

(b)
[Figure: the plan from (a) read as a nested conditional: each state yields a
test (AT* ... B s) with consequent s, each action an application
(UNLOAD (PICK ...) s), nested down to Ω at the innermost else-branch.]

Fig. 8.12. Introduction of the Data Type Set (a) and Resulting Finite Program
(b) for the Unload-All Sub-Plan of Rocket (Ω denotes "undefined")
Plan transformation for the rocket problem results in the following program
structure:

(rocket oset s) = (unload-all oset (move-rocket (load-all oset s)))

where unload-all and load-all are recursive functions which were generated by
our folding algorithm from the corresponding finite programs. Note that the
parameters involve only the set of objects; this information is the only one
necessary for control, while the locations (A, B) and the transport vehicle (Rocket)
are additional information necessary for the dynamics of plan construction.
The protocol of plan transformation is given in figure 8.13.
Recursive functions for unloading and loading objects are:

  unload-all(oset, s) = if(at*(oset, B, s),
                           s,
                           unload(pick(oset), unload-all(rst(oset), s)))

  load-all(oset, s) = if(inside*(oset, Rocket, s),
                         s,
                         load(pick(oset), load-all(rst(oset), s)))

which are integrated in the "main" program (rocket oset s) given above.
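Under the same modeling assumptions as the earlier sketches (states as lists of literal tuples; the operator implementations and names below are our own simplifications for a fixed rocket R and locations A, B, not the book's code), the composed control program can be sketched as:

```python
def at_star(oset, place, s):
    return all(("at", o, place) in s for o in oset)

def inside_star(oset, vehicle, s):
    return all(("in", o, vehicle) in s for o in oset)

def load(o, s):      # (at o A) becomes (in o R)
    return [("in", o, "R") if lit == ("at", o, "A") else lit for lit in s]

def unload(o, s):    # (in o R) becomes (at o B)
    return [("at", o, "B") if lit == ("in", o, "R") else lit for lit in s]

def move_rocket(s):  # the rocket flies from A to B
    return [("at", "R", "B") if lit == ("at", "R", "A") else lit for lit in s]

def load_all(oset, s):
    return s if inside_star(oset, "R", s) else load(oset[0], load_all(oset[1:], s))

def unload_all(oset, s):
    return s if at_star(oset, "B", s) else unload(oset[0], unload_all(oset[1:], s))

def rocket(oset, s):
    # (rocket oset s) = (unload-all oset (move-rocket (load-all oset s)))
    return unload_all(oset, move_rocket(load_all(oset, s)))

s0 = [("at", "O1", "A"), ("at", "O2", "A"), ("at", "R", "A")]
rocket(("O1", "O2"), s0)
# -> [("at", "O1", "B"), ("at", "O2", "B"), ("at", "R", "B")]
```

As in the text, only the object set drives the control; the locations and the vehicle are fixed parts of the operators.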
The recursive functions for loading and unloading all objects in some
arbitrary order, learned from the rocket problem, can be of use in many
transportation problems, as for example the logistics domain,⁷ which is still

⁷ The logistics domain defines problems where objects have to be transported
from different places in and between different cities, using trucks within cities
and planes between cities.
+++++++++ Transform Plan to Program ++++++++++
1st step: decompose by operator-type
Possible Sub-Plan, decompose...
(SAVE SUBPLAN #:P1)
Possible Sub-Plan, decompose...
(SAVE SUBPLAN #:P2)
Single Plan
(SAVE SUBPLAN #:P3)
(#:P1 (#:P2 (#:P3)))
-----------------------------------------------
2nd step: Identify and introduce data type
(INSPECTING #:P1) Plan is of type SET
Unify equivalent paths...
Introduce complex object (CO)
(CO IS (O1 O2 O3))
A function for generating CO from a goal is provided: make-co
Generalize predicate...
(NEW PREDICATE IS (#:AT* CO B))
Generate a function for testing the new predicate...
New predicate covers top-level goal -> replace goal
Replace basic predicates by new predicate...
Introduce selector functions...
((((O1 O2) (RST (O1 O2 O3))) ((O1) (RST (RST (O1 O2 O3))))
  (NIL (RST (RST (RST (O1 O2 O3))))))
 ((O3 (PICK (O1 O2 O3))) (O2 (PICK (RST (O1 O2 O3))))
  (O1 (PICK (RST (RST (O1 O2 O3)))))))
RST(CO) and PICK(CO) are predefined (as cdr and car).
(INSPECTING #:P2) Plan is linear
Plan consists of a single step
(SET ADD-PRED AS INTERMEDIATE GOAL (AT ROCKET B))
(INSPECTING #:P3) Plan is of type SET
Unify equivalent paths... [... see P1]
Generalize predicate...
(NEW PREDICATE IS (#:INSIDER* CO))
Generate a function for testing the new predicate...
New predicate is set as goal!
Replace basic predicates by new predicate...
Introduce selector functions... [... see P1]
-----------------------------------------------
3rd step: Transform plan to program
Show Plan(s) as Program(s)? y
Fig. 8.13. Protocol of Transforming the Rocket Plan
one of the most challenging domains for planning algorithms (see the AIPS
planning competitions 1998 and 2000).

For the rocket domain our system can learn the complete and correct control
rules. All problems of this domain can now be solved in linear time. A
small flaw is that generalization-to-n assumes infinite capacity of the transport
vehicle and does not take capacity into account as an additional constraint.
To be more realistic, we have to include a resource variable⁸ for
the load operator. The resulting load-all function must involve an additional
condition:
  load-all(oset, c, s) = if(or(eq0(c), at*(oset, B, s)),
                            s,
                            load(pick(oset), load-all(rst(oset), pred(c), s))).
A further extension of the domain would be to take into account different
priorities for the objects to be transported. This would involve an extension of
pick, always selecting the object with highest priority; that is, the control
function would follow a greedy algorithm.
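Such a priority-sensitive pick can be sketched as follows (a hypothetical extension of our earlier selectors; representing priorities as a dict is our own modeling choice):

```python
def pick_by_priority(oset, priority):
    """Greedy selector: choose the object with the highest priority."""
    return max(oset, key=lambda o: priority[o])

def rst_by_priority(oset, priority):
    """Complementary selector: the set without the picked object."""
    picked = pick_by_priority(oset, priority)
    return tuple(o for o in oset if o != picked)

priority = {"O1": 1, "O2": 3, "O3": 2}
pick_by_priority(("O1", "O2", "O3"), priority)   # -> "O2"
rst_by_priority(("O1", "O2", "O3"), priority)    # -> ("O1", "O3")
```

Because pick and rst remain exact complements, the recursive structure of load-all is unchanged; only the selection order becomes greedy.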
A Lisp program representing the control knowledge for the rocket domain
is given in appendix C.4, together with a short discussion about interleaving
the inside and at predicates.
8.5 Plans over Lists of Objects

8.5.1 Structural and Semantic List Problems

List problems can be divided into two classes: (a) problems which involve no
knowledge about the elements of the list, and (b) problems which involve
such knowledge. Standard programming problems of the first class are, for
example, reversing a list, flattening a list, or incrementing the elements of a list.
Problems where some operation is performed on every element of a list can
be characterized by the higher-order function (map f l). Other list problems
which can be solved purely structurally are calculating the length of a list or
adding the elements of a list of numbers. Such problems can be characterized
by the higher-order function (reduce f b l). The unload and load problems
discussed above fall into this class if each object involved has a unique name
and if pick and rst are realized in a deterministic way. A third class of
problems follows (filter p l), for example the functions member or odd-els. This
class already involves some semantic knowledge about the elements of a list,
represented by the predicate p in filter and by the equal test in member.
Table 8.4 illustrates structural list problems.
⁸ Dealing with resource variables is possible with the DPlan system extended to
function applications, see chap. 4. Currently we cannot generate disjunctive
boolean operators in plan transformation.
Table 8.4. Structural Functions over Lists

(a) (map f l) == (if (empty l) nil (cons (f (head l)) (map f (tail l))))
    (inc l) == (if (empty l) nil (cons (succ (head l)) (inc (tail l))))
(b) (reduce f b l) == (if (empty l) b (f (head l) (reduce f b (tail l))))
    (sumlist l) == (if (null l) 0 (plus (head l) (sumlist (tail l))))
    (rec-unload oset s) == (if (empty oset) s
        (unload (pick oset) (rec-unload (rst oset) s)))
(c) (filter p l) == (if (empty l) nil
        (if (p (head l)) (cons (head l) (filter p (tail l))) (filter p (tail l))))
    (odd-els l) == (if (empty l) nil
        (if (odd (head l)) (cons (head l) (odd-els (tail l))) (odd-els (tail l))))
    (member e l) == (if (empty l) nil
        (if (equal (head l) e) e (member e (tail l))))
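The three higher-order schemata of table 8.4, transcribed to Python as plain recursions (an illustrative sketch; the trailing underscores only avoid shadowing Python's built-in map/filter and functools.reduce):

```python
def map_(f, l):
    # (map f l) == (if (empty l) nil (cons (f (head l)) (map f (tail l))))
    return [] if not l else [f(l[0])] + map_(f, l[1:])

def reduce_(f, b, l):
    # (reduce f b l) == (if (empty l) b (f (head l) (reduce f b (tail l))))
    return b if not l else f(l[0], reduce_(f, b, l[1:]))

def filter_(p, l):
    # (filter p l): keep exactly the elements satisfying p
    if not l:
        return []
    rest = filter_(p, l[1:])
    return [l[0]] + rest if p(l[0]) else rest

map_(lambda x: x + 1, [1, 2, 3])          # -> [2, 3, 4]   (inc)
reduce_(lambda x, y: x + y, 0, [1, 2])    # -> 3           (sumlist)
filter_(lambda x: x % 2 == 1, [1, 2, 3])  # -> [1, 3]      (odd-els)
```

Instances (a) and (b) are purely structural; only (c) consults the elements themselves, via the predicate p.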
Structural list problems, as discussed for example in section 6.3.4.1 in
chapter 6, can be dealt with by an algorithm nearly identical to the algorithm in
table 8.3 dealing with sets (see tab. 8.5). Because the only relevant
information is the length of a list, the partial order can be reduced to a total order
(see fig. 8.14). Generating a total order results in linearizing the problem
(Wysotzki & Schmid, 2001). The extraction of a unique path in the plan is
only slightly more complicated than for sets and is discussed below.
Definition 8.5.1 (List). Data type list is defined as:

  list = nil | cons(e, list)  with
  null(list) = true if list = nil, false otherwise
  head(cons(e, list)) = e
  tail(cons(e, list)) = list.
Table 8.5. Introducing List

*Collapse plan to one path. (discussed in detail below)
Generate a complex data object CO = (e_1 ... e_m).
A function for generating CO from the top-level goals (make-co) is provided.
A generalized predicate (g* args) with CO ∈ args is constructed by collecting
all predicates (g args) with o ∈ CO ∧ o ∈ args and replacing o by CO. For a plan
starting at level 0, g has to be a top-level goal and all top-level goals have to
be covered by g; for other plans g has to be in the current root node.
A function for testing whether g holds in a state is generated as bottom-test.
Introduce the data type into the plan:
  For each state keep only the bottom-test predicate (g args) with CO′ ⊆ CO ∈ args.
  Introduce list selectors by replacing CO′ by (tail^i CO) and afterwards replacing
  action arguments by (head (tail^i CO)) with (tail^i CO) occurring as argument of
  g of the parent node.
(a)
[Figure: the partial order of flat lists over numbers: nil below the one-element
lists (1), (2), (3), (4), ..., these below the two-element lists (1 1), (1 2),
(1 3), (1 4), ..., (2 1), (2 2), (2 3), (2 4), ..., and so on for (1 1 1),
(1 1 2), ..., (1 1 1 1), ...]

(b)
[Figure: the corresponding total order over list lengths:
nil < (w) < (w w) < (w w w) < (w w w w).]

Fig. 8.14. Partial Order (a) and Total Order (b) of Flat Lists over Numbers
While functions over lists involving only structural knowledge are easy
to infer with our approach, as shown for the rocket domain, this is not
true for the second class of problems. A prototypical example for this class is
sorting: for sorting a list, knowledge about which element is smaller (or larger)
than another is necessary. That synthesizing functions involving semantic
knowledge is notoriously hard is discussed at length in the ILP literature
(Flener & Yilmaz, 1999; Le Blanc, 1994), and inductive functional program
synthesis typically is restricted to structural problems (Summers, 1977).
Currently, we approach transformation for such problems by the steps
presented in the algorithm in table 8.6. We do not claim that this strategy
is applicable to all semantic problems over lists. We developed this strategy
by analyzing and implementing plan transformation for the selection sort
problem, which is described below. We will describe how semantic knowledge
can be "detected" by analyzing the structure of a plan. In the future, we
plan to investigate further problems and try to find a strategy which covers
as large a class as possible.
8.5.2 Synthesizing `Selection Sort'

8.5.2.1 A Plan for Sorting Lists. The specification for sorting lists with
four elements is given in table 3.6. In the standard version of DPlan described
Table 8.6. Dealing with Semantic Information in Lists

Extracting a path (identifying list structure):
  Extracting a minimal spanning tree from the DAG: The plan is a DAG, but
  the structure does not fulfill the set criterium defined above. Therefore, we
  cannot just select one path, but we have to extract one deterministic set of
  transformation sequences. For purely structural list problems every minimal
  spanning tree is suitable for generalization. For problems involving semantic
  knowledge only some of the minimal spanning trees can be generalized.
  Regularization of the tree: Generating plan levels with identical actions by
  shifting nodes downward in the plan and introducing edges with "empty" or
  "id" actions.
  Collapsing the tree: Unifying identical subtrees which are positioned at the
  same level of the tree.
If there are still branches left (identifying a semantic criterium for elements):
  Identify a criterium for classifying elements.
  Unify branches by introducing list as argument into the operator, using the
  criterium as selection function.
Proceed using algorithm 8.5.
in this report, we only allow for ADD/DEL effects, and we do not discriminate
between static predicates (such as greater-than, which are not affected by
operator application) and fluid predicates. Note that it is enough to specify
the desired position of three of the four list elements in the goal, because
positioning three elements determines the position of the fourth. A more
natural specification for DPlan with functions is given in figure 4.10. This
second version allows for plan construction without using a set of predefined
states. Information about which element is on which position in the list, or
which number is greater than another, can simply be "read" from the list by
applying predefined (LISP) functions. The definition of the swap operator
determines whether the problem is solved by bubble sort or by selection sort.
In the first case, swap is applied to neighboring elements where the first is
greater than the other; in the second case, the first condition is omitted. Note
that for finding operator sequences for sorting a list in ascending order
by backward planning, the greater condition is reversed!

The universal plan is given in figure 3.7. For sorting a list of four elements,
there exist 24 states. Swapping elements with the restrictions given for
selection sort results in 72 edges. Sorting lists of three elements is illustrated in
appendix C.5.
8.5.2.2 Different Realizations of `Selection Sort'. To make the trans-
formation steps more intuitive, we first discuss functional variants of selsort
(see tab. 8.7): The first variant is a standard implementation with two nested
for-loops. The outer loop processes the list l (more exactly, the array) from
start (s) to end (e); the inner loop (function smpos) searches for the position
of the smallest element in l, starting at index (1 + s) where s is the current
index of the outer loop. The for-loops are realized as tail recursions.
Table 8.7. Functional Variants for Selection-Sort

; (1) Standard: Two Tail-Recursions
(defun selsort (l s e)
  (if (= s e)
      l
      (selsort (swap s (smpos s (1+ s) e l) l) (1+ s) e)
  ))
(defun smpos (s ss e l)
  (if (> ss e)
      s
      (if (> (nth s l) (nth ss l))
          (smpos ss (1+ ss) e l)
          (smpos s (1+ ss) e l)
      )))

; (2) Realization as Linear Recursion
; c is ``counter'', starting with last list-position e
(defun lselsort (l c s e)
  (if (sorted l s c)
      l
      (swap* (1- c) c e (lselsort l (1- c) s e))
  ))
(defun swap* (s from to l)
  (swap s (smpos s from to l) l)
)
(defun sorted (l from c)
  (equal (subseq l from c) (subseq (sort (copy-seq l) '<) from c))
)

; (3) Explicit definition of order (gl is list of pos-key pairs)
; e.g. gl = ((3 4) (2 3) (1 2) (0 1)) --> sorted l = (1 2 3 4)
(defun llselsort (gl l)
  (if (lsorted gl l)
      l
      (lswap* (car gl) (llselsort (cdr gl) l))
  ))
(defun lsorted (gl l)
  (cond ((null gl) T)
        ((equal (second (car gl)) (car l)) (lsorted (cdr gl) (cdr l)))
        (T nil)
  ))
(defun lswap* (g l)
  (swap (first g) (position (second g) l) l)
)
There is some conflict between a tail-recursive structure, where some
input state is transformed step-wise into the desired output, and a plan rep-
resenting a sequence of actions from the goal to some initial state. Our def-
inition of a plan as program implies a linear recursion (see def. 8.1.1). The
second variant of selection sort is a linear recursion: For a list l with starting
index s and last index e, it is checked whether l is already sorted from s to a
current index c which is initialized with e and step-wise reduced. If yes, the
list is returned; otherwise, it is determined which element at positions c, ..., e
should be swapped to position c - 1. The inner loop is replaced by an explicit
selector function. We will see below that this corresponds to selecting one
of several elements represented at different branches on the same level of the
plan.

The second variant corresponds closely to the function we can infer
from the plan; it can be inferred from a plan generated by using function
application.[9]
A plan constructed by manipulating literals contains no knowl-
edge about numbers as indices in a list and order relations between numbers.
Looking back at plan transformation for rocket, we introduced a "complex
object" oset which guided action application in the recursive unload and load
functions. Thus, for sorting, we can infer a complex object from the top-level
goals which determine which number should be at which position of the list.
The third variant gives an abstract representation of the function which we
can infer automatically from the plan in figure 3.7. This function is more
general than standard selection sort, because now lists can be sorted in ac-
cordance with any arbitrary order relation specified in parameter gl! That is,
it does not rely on the static predicate gt(x, y) which represents the order
relation of the first n natural numbers (see fig. 3.6).
8.5.2.3 Inferring the Function Skeleton. The plan given in figure 3.7 is
not decomposed by the initial decomposition step, because the complete plan
involves only one operator, swap. The plan is a DAG, but it does not fulfill
the criterium for sets of objects. Therefore, extraction of one unique set of
optimal plans is done by picking one minimal spanning tree from the plan
(see sect. 3.2 in chap. 3).

The plan for sorting lists with three elements (see appendix C.5) consists
of 3! = 6 nodes and 9 edges. It contains 9 different minimal spanning trees
(see also appendix). Not all of them are suitable for generalization: Three of
the nine trees can be "regularized" (see next transformation step below). If
we have no information for picking a suitable minimal spanning tree, we have
to extract a tree, try to regularize it, and backtrack if regularization fails. For
9 candidates with 3 suitable solutions, this is feasible. But for sorting 4-element
lists, there exist 24 nodes, 72 edges, and more than a million possible
minimal spanning trees, with only a small number of them being regularizable
(see appendix A.10 for the calculation of the number of minimal spanning trees).
[9] Our investigation of plan transformation for plans involving function application
is still at the beginning.
Currently, we pre-calculate the number of trees contained in the DAG, and
if the number exceeds a given threshold t (say 500), we only generate the
first t trees. One possible but unsatisfying solution would be to parallelize
this step, which is possible. We plan to investigate whether tree extraction
and regularization can be integrated into one step. This would solve the
problem in an elegant way. One of the regularizable minimal spanning trees
for sorting four elements is given in figure 8.15. For better readability, the
state descriptions in the nodes are given as lists. In the original plan, each list
is described by a set of literals. For example, [1 2 3 4] is a short notation for
(isc p1 1) (isc p2 2) (isc p3 3) (isc p4 4), where isc(x y) means that position
x has content y (see sect. 3.1.4.3 in chap. 3 for details).

The algorithm for extracting a minimal spanning tree from a DAG is
given in table 8.8, the corresponding program fragment in appendix A.11.
Table 8.8. Extract an MST from a DAG

Input: a DAG with edges (n m) annotated by the level of the node
Initialization: t := root(dag) (minimal spanning tree)
For l = 0 to maxlevel - 1 DO:
  Partition all edges (n m) from nodes n at level l to nodes m at level l+1 into
  groups with equal end-node m:
  p = {{(n_1 m_1), (n_2 m_1), ..., (n_k m_1)}, ..., {(n'_1 m_l), ..., (n'_k' m_l)}}.
  Calculate the Cartesian product between the sets in p: Cart = p_1 × p_2 × ... × p_l.
  Generate trees t' = t ∪ c, for all c ∈ Cart.
The original plan represented how each of the 24 possible lists over four
(different) numbers can be transformed into the desired goal (the sorted list)
by the set of all possible optimal sequences of actions. The minimal spanning
tree gives a unique sequence of actions for each state. In the minimal spanning
tree given in figure 8.15, the action sequences sort lists by shifting the
smallest element to the left.
For collapsing a tree into a single path, this tree has to be regular[10]:

Definition 8.5.2 (Regular Tree). An edge-labeled tree t with edges (n m)
from nodes n to nodes m is regular, if for each level l = 0 ... maxlevel - 1 the
subtrees {(n m_1), (n m_2), ..., (n m_k)} for each node n consist of id-identical
label-sets, with label ≡ label' if label = label' or label = id.
For a given minimal spanning tree it can be tried to transform it into a
regular tree, using the algorithm given in table 8.9. The program fragment
for tree regularization is given in appendix A.12. A tree is regularized by
pushing all edges labeled with actions that also occur on the next level of the
tree down to this level. The starting node of such an edge is "copied" and
an id-edge is introduced between the original and the copied starting node.
Note that an edge can be shifted more than once. If the result is a regular
tree according to definition 8.5.2, plan transformation proceeds; otherwise, it
fails.

[10] This definition of regular trees is similar to the definition of irrelevant attributes
in decision trees (Unger & Wysotzki, 1981): If a node has only identical subtrees,
the node and all but one subtree are eliminated from the tree.

[Figure 8.15 shows one of the regularizable minimal spanning trees for sorting
four elements: the 24 states, abbreviated as lists such as [1 2 3 4], connected
by swap actions (S Pi Pj).]
Fig. 8.15. A Minimal Spanning Tree Extracted from the SelSort Plan
Table 8.9. Regularization of a Tree

Input: an edge-labeled tree
For l = 0 to maxlevel - 2 DO:
  Construct a label-set LS_l for all edges (n m) from nodes n on level l to nodes
  m on level l+1, and a label-set LS_l+1 for all edges (o p) from nodes o on level
  l+1 to nodes p on level l+2.
  IF LS_l+1 ⊆ LS_l, shift all edges (n_s m_s) ∈ LS_l ∩ LS_l+1 one level down and
  introduce edges (n_s n_s) with label "id" from level l to the shifted nodes n_s
  on level l+1.
Test if the resulting tree is regular.
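The shifting step of table 8.9 can be sketched in Python if we abstract the tree, rather drastically, to a list of per-level edge-label sets (the real algorithm additionally copies nodes and inserts concrete id-edges; the encoding and the regularity test below are our own simplification):

```python
def regularize(levels):
    """Push edge labels downward so that, ideally, each level carries
    a single action name plus 'id' (sketch of table 8.9).
    Returns the regularized per-level label sets, or None if the
    result is not regular."""
    levels = [set(ls) for ls in levels]
    for l in range(len(levels) - 1):
        shared = levels[l] & levels[l + 1]
        if shared and levels[l + 1] <= levels[l]:
            levels[l] -= shared        # shift shared labels one level down ...
            levels[l].add('id')        # ... bridging with an 'id' edge
    # regular <=> at most one action name besides 'id' per level
    if all(len(ls - {'id'}) <= 1 for ls in levels):
        return levels
    return None
```

A level carrying {swap-p2, swap-p3} above a pure swap-p3 level becomes {swap-p2, id}; unrelated label sets cannot be regularized and the function reports failure.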
The regularized version of the minimal spanning tree for selsort is given
in figure 8.16 (again with abbreviated state descriptions). The structure of
the recursion to be inferred is now already visible in the regularized tree: The
"parallel" subtrees have identical edge labels and the states share a large
overlap: The actions in the bottom level of the tree all swap the
element on position one with some element further to the right in the list
(swap(p1, x)). For the resulting states, the first element is already the smallest
element of the list ([1 x y z]). On the next level, all actions swap
the element on position two with some element further to the right, and so
on. A slightly different method for regularization, also with the goal of plan
linearization, is presented in Wysotzki and Schmid (2001).
[Figure 8.16 shows the regularized minimal spanning tree: id-edges have been
inserted so that each level of the tree carries a single swap action (plus id).]
Fig. 8.16. The Regularized Tree for SelSort
8.5.2.4 Inferring the Selector Function. Although the selsort plan could
be transformed successfully into a regular tree with identical subtrees, the plan
is still not linear. Considering the nodes at each planning level, these nodes
share a common sub-set of literals. That is, at each level, the commonalities
of the states can be captured by calculating the intersection of the state
descriptions. For the branches from one node it holds that their ordering is
irrelevant for obtaining the action sequence that transforms a given state
into the goal state. Furthermore, the names of all actions are identical, and at
each level the first argument of swap is identical. Putting this information
together, we can represent the levels of the regularized tree as:

((isc p1 1) (isc p2 2) (isc p3 3)) (swap p3 [p4])
((isc p1 1) (isc p2 2)) (swap p2 [p3, p4])
((isc p1 1)) (swap p1 [p2, p3, p4]).
From the instantiations of the second argument we can construct a hypo-
thetical complex object: CO_S = [p4] < [p3; p4] < [p2; p3; p4]. Note that the
numbers associated with positions are not recognized as numbers. Another
minimal spanning tree extracted from the plan would result in a different
pattern, for example [p2] < [p4; p2] < [p1; p4; p2], which is generalizable in the
same way as we will describe in the following.
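The construction of the hypothetical complex object from the per-level candidate arguments can be sketched as follows; the (first-argument, candidates) pair encoding is our own illustration, not the system's internal representation:

```python
def hypothetical_complex_object(level_actions):
    """Collect the candidate second arguments of the swap actions,
    level by level, into an ordered complex object (sketch)."""
    return [candidates for _first_arg, candidates in level_actions]

# the three levels of the regularized selsort tree
co_s = hypothetical_complex_object([
    ('p3', ['p4']),
    ('p2', ['p3', 'p4']),
    ('p1', ['p2', 'p3', 'p4']),
])   # CO_S = [p4] < [p3; p4] < [p2; p3; p4]
```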
The data object which finally will become the recursive parameter is con-
structed along a path in the plan (as described for rocket). On each level in
the plan, the argument(s) of an action can be characterized relative to the
object involved in the parent node. Now, we have to introduce a selector func-
tion for an argument of an action, involving actions on the same level of the
plan with the same parent node. That is, the searched-for function has to be
defined with respect to the children of the action. Remember that this is a
backward plan and that a child node is input to an action. As a consequence,
the selector function has to be applied to the current instantiation of the list
(situation).

The searched-for function for selecting one element of the candidate ele-
ments represented in the second argument of swap has to be defined relative
to the information available at the current "position" in the plan. That is,
the literals of the parent node and the first argument of the swap operator
can be used to decide which element should be swapped in the current (child)
state. For example, for (swap p3 [p4]), we have (sel CO_S) = p4 from ((isc
p1 1) (isc p2 2) (isc p3 4) (isc p4 3)). The first argument of swap occurs in
(isc p3 3) of the parent node. The position to be selected, p4, is related
to (isc p4 3) in the child node. That is, when applying swap, the first posi-
tion is given and the second position must be obtained somehow. A modified
swap function which deals with selecting the "right" position to swap must
provide the following characteristics:
- Because the overall structure of the plan indicates a list problem, the goal
  predicates must be processed in a fixed order which can be derived from
  the sequence of actions in the plan. For a given list of goal predicates, deal
  with the first of these predicates.
  For example, the goal predicates can be (isc p1 1) (isc p2 2) (isc p3 3).
  The first element to be considered is (isc p1 1).
- A generalized swap* operator works on the current goal predicate and the
  current situation s.
  For example, swap* realizes (isc p1 1) in situation s = (isc p1 4) (isc p2
  1) (isc p3 2) (isc p4 3).
- The current goal is split into the current position and the element for
  this position.
  For example, (pos (isc p1 1)) = p1 and (key (isc p1 1)) = 1.
- The element currently at position (pos (isc x y)) must be swapped with
  the element which should be at this position, that is (key (isc x y)).
  For s = (isc p1 4) (isc p2 1) (isc p3 2) (isc p4 3) it results that positions p1
  and p2 are swapped.
Constructing this hypothesis presupposes that the plan-transformation
system has predefined knowledge about how to access elements in lists.[11]
Written as Lisp functions, the general hypothesis is:

(defun swap* (fp s)
  (pswap (ppos fp) (sel (pkey fp) s) s) )
(defun sel (k s)
  (ppos (car
    (mapcan #'(lambda(x) (and (equal (pkey x) k) (list x))) s) )))

where pswap is the swap function predefined in the domain specification and
ppos and pkey select the second/third element of a list (isc x y).[12]
The generalized swap* function is tested for each swap action occurring
in the plan. Because it holds for all cases, plan transformation can proceed.
If the hypothesis had failed, a new hypothesis would have had to be constructed.
If all hypotheses fail, plan transformation fails. For the selsort plan, only the
hypothesis introduced above is possible.

The rest of plan transformation is straightforward: The regularized tree
is reduced to a single path, unifying branches by replacing the swap action by
swap* (see fig. 8.17). Note that, in contrast to the rocket problem, this path
already contains "id" branches which will constitute the "then" case of the
nested conditional to which the plan is finally transformed. Note further that
only the first argument of swap* is given, because the second argument s is
[11] In the current implementation we provide this knowledge only partially. For
example, we can select an element x ∈ arg from a list (p arg), as needed to
construct the definition of succ for sequences. That is, we can construct pos and
key. For constructing the definition of sel, we additionally need the selection of
an element of a list of literals which follows a given pattern. This can be realized
with the filter function mapcan. But up to now, we did not implement such
complex selector functions.
[12] The second condition (list x) for mapcan is due to the definition of this functor
in Lisp. Without this condition, the functor returns T or nil; with this condition,
it returns a list of all elements fulfilling the filter condition (equal (pkey x) k).
given by the sub-tree following the action. This is the same way of representing
plans as terms as used in the sections before (for unstack and rocket).
[Figure 8.17 shows the collapsed path: from the goal literals ((isc p1 1)
(isc p2 2) (isc p3 3)) down to the situation s, with one swap* action per level
((swap* (isc p3 3)), (swap* (isc p2 2)), (swap* (isc p1 1))) and id-edges for
the already-solved cases.]
Fig. 8.17. Introduction of a "Semantic" Selector Function in the Regularized Tree
Data type introduction works as described for rocket (for set and struc-
tural list problems): a complex object CO = ((isc p1 1) (isc p2 2) (isc p3
3)) and a generalized predicate (isc* CO) for the bottom test are inferred. The
state nodes are rewritten using the rest-selector tail on CO. The head selector
is introduced in the actions: (swap* (head (tail^i CO))). The resulting recursive
function is:

(defun pselsort (pl s)
  (if (isc* pl s)
      s
      (swap* (head pl)
             (pselsort (tail pl) s)
      ) ))
Note that head and tail here correspond to last and butlast. But, because
lists are represented as explicit position-key pairs, pselsort transforms lists
s into lists sorted according to the specification derived from the top-level goals
given in pl, independent of the transformation sequence! The Lisp program
realizing selection sort is given in figure 8.18.
; Complete Recursive Program for SelSort
; for lists represented as literals
; pl is inferred complex object, e.g., ((p1 1) (p2 2) (p3 3))
; s is situation (statics can be omitted),
; e.g. ((isc p1 3) (isc p2 1) (isc p3 4) (isc p4 2))

(defun pselsort (pl s)
  (if (isc* pl s)
      s
      (swap* (head pl)
             (pselsort (tail pl) s)
      )))

(defun swap* (fp s)
  (pswap (ppos fp) (sel (pkey fp) s) s) )

(defun sel (k s)
  (spos (car
    (mapcan #'(lambda(x) (and (equal (skey x) k) (list x))) s)
  )))

(defun isc* (pl s)
  (subsetp pl (mapcar #'(lambda(x) (cdr x)) s) :test 'equal))

; selectors for elements of pl (p k)
(defun ppos (p) (first p))
(defun pkey (p) (second p))
; selectors for elements of s (isc p k)
(defun spos (p) (second p))
(defun skey (p) (third p))

; head and tail realized as last and butlast
; (from the order defined in the plan, alternatively: car cdr)
(defun head (l) (car (last l)))
(defun tail (l) (butlast l))

; explicit implementation of add-del effect
; in connection with DPlan: application of swap-operator on s
; "inner" union: i=j case
(defun pswap (i j s)
  (print `(swap ,i ,j ,s))
  (let ((ikey (skey (car (remove-if #'(lambda(x) (not (equal i (spos x)))) s))))
        (jkey (skey (car (remove-if #'(lambda(x) (not (equal j (spos x)))) s))))
        (rsts (remove-if #'(lambda(x) (or (equal i (spos x))
                                          (equal j (spos x)))) s))
       )
    (union (union (list (list 'isc i jkey))
                  (list (list 'isc j ikey)) :test 'equal)
           rsts :test 'equal)
  ))

Fig. 8.18. LISP-Program for SelSort

8.5.3 Concluding Remarks on List Problems

List sorting is an example for a domain which involves not only structural but
also semantic knowledge: While for structural problems the elements of the
input data can be replaced by variables with arbitrary interpretation, this is
not true for semantic problems. For example, reversing a list [x, y, z] works in
the same way, regardless of whether this list is [1, 2, 3] or [1, 1, 1]. In semantic
problems, such as sorting, operators comparing elements (x < y or x = y)
are necessary. A prototypical example for a problem involving semantics is
the member function, which returns true if an element x is contained in a list
l and false otherwise. Here an equality test must be performed. A detailed
discussion of the reasons why the member function cannot be inferred using
standard techniques of inductive logic programming is given by Le Blanc
(1994).
Transforming the selsort plan into a finite program involved two critical
steps: (1) extracting a suitable minimal spanning tree from the plan and (2)
introducing a "semantic" selector function. The inferred complex object rep-
resents the number of elements which are already on their goal position. This is
in analogy to the rocket problem, where the complex object represented how
many objects are already at the goal destination (at location B for unload-
all or inside the rocket for load-all). Plan transformation results in a final
program which can be generalized to a recursive sort function sharing crucial
characteristics with selection sort. But the function inferred by our system
differs from standard selection sort in two aspects: First, the recursion is lin-
ear, involving a "goal" stack. The nested for-loops (two tail recursions) of
the standard function are realized by a single linear recursive call. Of course,
the function for selecting the current position is itself a loop: the literal list
is searched for a literal corresponding to a given pattern by a higher-order
filter function. Second, it does not rely on the ordering of natural numbers,
but generates the desired sequence of elements, explicitly given as input.
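To see the linear recursion over the goal stack at work, the program of figure 8.18 can be transcribed to Python; this is a sketch with our own tuple encoding of isc literals, not the system's output, and it omits the Lisp tracing print:

```python
def spos(lit): return lit[1]          # (isc p k) -> p
def skey(lit): return lit[2]          # (isc p k) -> k

def pswap(i, j, s):
    """Add-del effect of the swap operator: exchange the keys at
    positions i and j of situation s."""
    d = {spos(l): skey(l) for l in s}
    d[i], d[j] = d[j], d[i]
    return [('isc', p, k) for p, k in d.items()]

def sel(k, s):
    """Semantic selector: the position currently holding key k."""
    return next(spos(l) for l in s if skey(l) == k)

def swap_star(fp, s):                 # fp is a goal pair (p k)
    return pswap(fp[0], sel(fp[1], s), s)

def isc_star(pl, s):                  # bottom test (isc* pl s)
    return all(('isc', p, k) in s for p, k in pl)

def pselsort(pl, s):
    """Linear recursion over the complex object pl; head and tail
    are realized as last and butlast."""
    if isc_star(pl, s):
        return s
    return swap_star(pl[-1], pselsort(pl[:-1], s))

goals = [('p1', 1), ('p2', 2), ('p3', 3)]
s = [('isc', 'p1', 3), ('isc', 'p2', 1), ('isc', 'p3', 4), ('isc', 'p4', 2)]
result = pselsort(goals, s)
```

As in the text, the goal list rather than an order on the keys drives the sorting: passing descending goals produces a descending arrangement of the same situation.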
We demonstrated plan transformation for a plan over lists with four ele-
ments. From a list with three elements, evidence for the hypothesis for gener-
ating the semantic selector function would have been weaker (involving only
the actions from level one to two in the regularized tree). An alternative
approach to plan transformation, involving knowledge about numbers, is de-
scribed for a plan for three-element lists in Wysotzki and Schmid (2001). In
general, there are three backtrack points for plan transformation:

Generating "semantic" functions:
  If a generated hypothesis for the semantic function fails or if generaliza-
  tion-to-n fails, generate another hypothesis.
Extracting a minimal spanning tree from a plan:
  If plan transformation or generalization-to-n fails, select another minimal
  spanning tree.
Number of objects involved in planning:
  If plan transformation or generalization-to-n fails, generate a plan involv-
  ing an additional object. (A plan must involve at least three objects
  to identify the recursive parameter and its substitution, as described in
  chapter 7.)
Because plan construction has exponential effort (all possible states of a do-
main for a fixed number of objects have to be generated, and the number of
states can grow exponentially relative to the number of objects) and because
the number of minimal spanning trees might be enormous, generating a fi-
nite program suitable for generalization is not efficient in the general case.
To reduce backtracking effort, we hope to come up with a good heuristic for
extracting a "suitable" minimal spanning tree in the future. One possibility
mentioned above is to try to combine tree extraction and regularization.
8.6 Plans over Complex Data Types

8.6.1 Variants of Complex Finite Programs

The usual way to classify recursive functions is to divide them into different
complexity classes (Hinman, 1978; Odifreddi, 1989). In complexity theory, the
semantics of a recursive function is under investigation. For example, fibonacci
is typically implemented as a tree recursion (see tab. 8.10), but it belongs to
the class of linear problems, meaning that the fibonacci number of a number
n can be calculated by a linear recursive function (Field & Harrison, 1988,
pp. 454). In our approach to program synthesis, complexity is determined
by the syntactical structure of the finite program, based on the structure of
a universal plan. The unfolding (see chap. 7) of all functions in table 8.10
results in a tree structure. Interpretation of max always involves only one
of the two tail-recursive calls (that is, the function is linear). Interpretation
of fib results in two new recursive calls for each recursive step (resulting in
an effort of O(2^n)). The Ackermann function (ack) is the classic example of
a non-primitive recursive function with exponential growth: each recursive
call results in y + x new recursive calls.
Table 8.10. Structurally Complex Recursive Functions

Alternative Tail Recursion
(max m l) == (if (null l) m
                 (if (> (head l) m) (max (head l) (tail l)) (max m (tail l))))

Tree Recursion
(fib x) == (if (= 0 x) 0 (if (= 1 x) 1 (plus (fib (- x 1)) (fib (- x 2)))))

μ-Recursion
(ack x y) == (if (= 0 x) (1+ y) (if (= 0 y) (ack (1- x) 1) (ack (1- x) (ack x (1- y)))))
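The remark that fibonacci is tree-recursive only in its usual formulation can be illustrated by a Python sketch (the accumulator variant is one standard linearization, not taken from the cited sources):

```python
def fib_tree(x):
    """Tree recursion as in table 8.10: two recursive calls per
    recursive step, effort O(2^n)."""
    if x == 0:
        return 0
    if x == 1:
        return 1
    return fib_tree(x - 1) + fib_tree(x - 2)

def fib_linear(x, a=0, b=1):
    """The same function computed by a linear (accumulator)
    recursion with O(n) calls."""
    if x == 0:
        return a
    return fib_linear(x - 1, b, a + b)
```

Both compute the same semantic function; only the syntactical recursion structure, and hence the effort, differs.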
For plan transformation, on the other hand, semantics is taken into ac-
count to some extent: As we saw above, plans are linearizable if the data
type underlying the plan is a set or a list. For the case of list problems in-
volving semantic attributes of the list elements, it depends on the complexity
of the involved "semantic" functions whether the resulting recursion is linear
or more complex. Currently, we do not have a theory of "linearizability" of
universal plans, but clearly such a theory is necessary to make our approach
to plan transformation more general. A good starting point for investigating
this problem should be the literature on the transformational approach to
code optimization in functional programming (Field & Harrison, 1988). In
Wysotzki and Schmid (2001) linearizability is discussed in detail.
There are two well-known planning domains for which the underlying
data type is more complex than sets or lists: Tower of Hanoi (sect. 3.1.4.4 in
chap. 3) and building a tower of alphabetically sorted blocks in the blocks-
world domain (sect. 3.1.4.5 in chap. 3). The tower problem is a set-of-lists
problem and is used as one of the benchmark problems for planners. The hanoi
problem is a list-of-lists problem (the "outer" list is of length 3 for the
standard 3-peg problems). For both domains, the general solution procedure
to transform an arbitrary state into a state fulfilling the top-level goals is,
at least at first glance, more complex than a single linear recursion. Up to
now we cannot fully automatically transform plans for such complex domains
into finite programs. In the following, we will discuss possible strategies.
8.6.2 The `Tower' Domain

8.6.2.1 A Plan for Three Blocks. The specification of the three-block
tower problem was introduced in chapter 2 and is described in section 3.1.4.5
in chapter 3. The unstack/clearblock domain described above as an example for a
problem with an underlying sequential data type is a sub-domain of this problem:
the puttable operator is structurally identical to the unstack operator. The
put operator is represented with a conditioned effect to take care of the
case that a block x is moved from the table to another block y and the
case that x is moved from another block z.

For the 3-block problem, the universal plan is a unique minimal spanning
tree (see fig. 3.10). Note that for a given goal to build a tower with {on(A, B),
on(B, C)}, from the state {on(C, B), on(B, A)}, that is, a tower where the blocks
are sorted in reverse order to the goal, only two actions are needed to reach
the goal state: the base of the tower C is put on the table, and B can immediately
be put on C without putting it on the table first. A program realizing tower
is only optimal if these short-cuts are performed.
There are two possible strategies for learning a control program for tower,
which are discussed in reinforcement learning under the labels incremental
elemental-to-composite learning and simultaneous composite learning (Sun &
Sessions, 1999): In the first case, the system is first trained with a simple
task, for example clearing an arbitrary block by unstacking all blocks lying
on top of it. The learned policy is stored, for example as a CLEAR program,
and extends the set of available basic functions. When learning the policy
for a more complex problem, such as tower, the already available knowledge
(CLEAR) can be used. In the second case, the system immediately learns
to solve the complex problem and the decomposition must be performed
autonomously. The application of both strategies to the tower problem is
demonstrated in Wysotzki and Schmid (2001). In the work reported here, we
focus on the second strategy.
8.6.2.2 Elemental-to-Composite Learning. Let us assume that the
control knowledge for clearing an arbitrary block is already available as a
CLEARBLOCK function:

CLEARBLOCK(x, s) = if(clear(x, s), s, puttable(topof(x), CLEAR-
BLOCK(topof(x), s))).

This function immediately returns the current state s if clear(x) already
holds in s; otherwise it is tried to put the block lying on top of x on the
table.
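The behaviour of CLEARBLOCK can be sketched in Python, with a state modelled as a dictionary mapping each block to the block (or table) it stands on; this encoding is our own illustration:

```python
def clear(x, s):
    return x not in s.values()               # nothing lies on x

def topof(x, s):
    return next(b for b, below in s.items() if below == x)

def puttable(x, s):
    return {**s, x: 'table'}

def clearblock(x, s):
    """CLEARBLOCK(x, s): return s unchanged if x is already clear,
    otherwise recursively clear the block on top of x and put it
    on the table."""
    if clear(x, s):
        return s
    return puttable(topof(x, s), clearblock(topof(x, s), s))

# tower c on b on a: clearing a puts b and c on the table
state = {'a': 'table', 'b': 'a', 'c': 'b'}
cleared = clearblock('a', state)
```

Note that a clear block is returned without any puttable action, exactly as stated in the text.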
Now we want to learn a program for solving the more complex tower
problem. In Wysotzki and Schmid (2001) we describe an approach based on
the assumption of linearizability of the planning problem (see sect. 2.3.4.2
in chap. 2): It is presupposed that sub-goals can be achieved immediately
before the corresponding top-level goal is achieved. For example, to reach a
state where block A is lying on block B, the action put(A, B) can be ap-
plied; but this action is applicable only if both A and B are clear. That is, to
realize the goal on(A, B), the sub-goals clear(A) and clear(B) must hold in the
current situation. Allowing the use of the already learned recursive CLEAR-
BLOCK function and using linearization, the complex plan for solving the
tower problem for three blocks can be collapsed to:

(on a b) (on b c)
  --(put a b) ∘ (CLEARBLOCK a) ∘ (CLEARBLOCK b)-->
(on b c)
  --(put b c) ∘ (CLEARBLOCK b) ∘ (CLEARBLOCK c)-->
s.

The CLEARBLOCK program applies as many puttable actions as neces-
sary. If a block is already clear, the state is returned unchanged. The recursive
program generalizing over this plan is given in appendix C.6.
For some domains, it is possible to identify independent sets of predicates
by analyzing the operator specifications, for example using the TIM analysis
(see sect. 2.5.1 in chap. 2) of the planner STAN (Long & Fox, 1999).

While the approach of Wysotzki and Schmid (2001) is based on analyzing
the goals and sub-goals contained in a plan, the strategy reported in this
chapter is based on data type inference for uniform sub-plans (see sect. 8.2.1).
If you look at the plan given in figure 3.10, the two upper levels contain only
arcs labelled with put actions and all levels below contain only arcs labelled
with puttable actions. Therefore, as demonstrated for rocket above, the plan
is decomposed in a first step, and it becomes impossible to infer a program
where put and puttable actions are interwoven.
8.6.2.3 Simultaneous Composite Learning. Initial plan de omposition
for the 3-blo k tower plan results in two sub-plans a sub-plan for put-all and
a sub-plan for puttable-all. The put-all sub-plan is a regular tree as dened
above. The only level with bran hing is for a tions (put B C) and the plan an
be immediately redu ed to a linear sequen e. Consequently, we introdu e the
data type list with omplex obje t CO = (A B C) and bottom-test (on* (A
268 8. Transforming Plans into Finite Programs
B C)). The generalized put-all fun tion is stru turally analogous to load-all
from the ro ket domain:
(put-all olist s) ==
(if (on* olist s)
s
(put (first olist) (se ond olist) (put-all (tail olist) s))
)
where first and second are implemented as last and second-last, or
olist gives the desired order of the tower in reverse order.
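The unfolding of put-all can be illustrated with a small sketch (Python, hypothetical; it only emits the action sequence instead of transforming a state). Since the recursive call is nested inside the put, the innermost put, i.e. the one nearest the bottom of the goal tower, is executed first:

```python
def put_all_actions(olist):
    # olist names the goal tower top-down, e.g. ('a', 'b', 'c')
    if len(olist) < 2:              # plays the role of the (on* olist s) test
        return []
    # build the lower part of the tower first, then stack the next block
    return put_all_actions(olist[1:]) + [('put', olist[0], olist[1])]
```

For ('a', 'b', 'c') this yields (put b c) before (put a b), matching the plan order.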
For the puttable sub-plan we have one fragment consisting of a single step
(puttable C) for the reversed tower and a set of four sequences:

A < C < B
B < C < A
B < A < C
C < A < B

with (ct x) as bottom-test and the constructor (suc x) = y for (on y x)
as introduced above for linear plans. An obvious strategy compatible with
our general approach to plan transformation would be to select one of these
sequences and generalize a clear-all function identical to the unstack-all function
discussed above. There remains the problem of selecting the block which is to
be unstacked, that is, we have to infer the bottom element from the goal set
(ct a) (ct b) (ct c). As described for sorting above, we have to generate a
semantic selector function which is not dependent on the parent node, but on
the current state.^13
We have the following examples for constructing the selector function:

((on b c) (on c a)): (sel (A B C)) = A
((on a c) (on c b)), ((on c a) (on a b)): (sel (A B C)) = B
((on b a) (on a c)): (sel (A B C)) = C

that is, for the complex object (A B C) we always must select the element
which is the base of the current tower.
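Read over a state given as a set of on-pairs, the examples above amount to picking the block that carries another block but lies on nothing itself. A hypothetical Python sketch (the state representation as a set of pairs is an assumption):

```python
def sel(state):
    # state: set of pairs (x, y) meaning "x lies on y"
    aboves = {x for x, y in state}    # blocks lying on another block
    belows = {y for x, y in state}    # blocks carrying another block
    (base,) = belows - aboves         # the base of the (single) partial tower
    return base
```

On the four example states this selects a, b, b, and c, respectively.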
If we model the tower domain by explicit use of an ontable predicate, this
predicate can be used as criterion for the selector function. Without this
predicate, we can introduce (on* CO), already generated for put-all, and
select the last element of the list. The resulting tower function then would
be:

tower(olist, s) = put-all(olist, clear-all(sel(s)))
sel(s) = last(make-olist(s)).
^13 Note that for selsort we introduced a selector in the basic operator swap. Here
we introduce a selector in the function clear-all, which is already a recursive
generalization!
8.6 Plans over Complex Data Types 269
With the described strategy, the problem is reduced to an underlying
data type list with a semantic selector function. The selector function is
semantic because it is relevant which block must be cleared. This control rule
generates correct transformation sequences for towers with arbitrary numbers
of blocks, with the desired sorting of blocks specified by olist, which is
generated from the top-level goals. But it does not for all cases generate the
optimal transformation sequences!

For generating optimal transformation sequences, we must cover the cases
where a block can be put onto another block immediately, without putting
it on the table first. For the three-block plan, there is only one such case and
we could come up with the discriminating predicate (on c b):

tower(olist, s) = put-all(olist, if(on(c, b, s),
                                    puttable(c, s),
                                    clear-all(sel(s))))

which generates incorrect plans for larger problems, for example for the state
((on c b) (on b a) (on a d) (ct c))!
For both variants of tower a generate-and-test strategy would discover
the flaw: For the first variant, it would be detected that for ((on c b) (on
b a) (ct c)) an additional action (puttable b) would be generated which is
not included in the optimal universal plan. For the second variant, all states
of the 3-block plan are covered correctly; the faulty condition would only
be detected when checking larger problems. But with only one special case
of a reversed tower in the three-block plan, every other hypothesis would be
highly speculative. Therefore, we now will investigate the four-block plan.

The universal plan for the four-block tower problem is a DAG with 73
nodes and 78 edges, thus we have to extract a suitable minimal spanning tree.
Because the plan is rather large, we present an abstract version in figure 8.19
and a summary of the action sequences for all 33 leaf nodes in table 8.11.
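The size of the universal plan can be cross-checked by enumerating all states, that is, all partitions of the blocks into ordered stacks. The following Python sketch uses an insertion-based construction that is ours, not from the book:

```python
def tower_states(blocks):
    # a state is a set of towers; each tower is a tuple of blocks
    if not blocks:
        return {frozenset()}
    first, rest = blocks[0], blocks[1:]
    states = set()
    for s in tower_states(rest):
        states.add(s | {(first,)})            # put `first` alone on the table
        for stack in s:                       # or slot it into an existing tower
            for i in range(len(stack) + 1):
                grown = stack[:i] + (first,) + stack[i:]
                states.add((s - {stack}) | {grown})
    return states
```

This yields 13 states for three blocks and 73 for four, matching the node counts of the 3-block and 4-block plans.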
[Figure omitted: the universal plan as a DAG over 73 numbered nodes.]
Fig. 8.19. Abstract Form of the Universal Plan for the Four-Block Tower
For the four-block problem, we have 15 sequences needing to put all blocks
on the table and 8 cases with shorter optimal plans (only counting leaf nodes)
Table 8.11. Transformation Sequences for Leaf Nodes of the Tower Plan for Four
Blocks

15 4-towers, needing 3 puttable actions
((on a b) (on b d) (on d c) (ct a))  (PUTTABLE A) (PUTTABLE B) (PUTTABLE D) (PUT C D) (PUT B C) (PUT A B)
((on a c) (on c b) (on b d) (ct a))  (PUTTABLE A) (PUTTABLE C) (PUTTABLE B) (PUT C D) (PUT B C) (PUT A B)
((on a c) (on c d) (on d b) (ct a))  (PUTTABLE A) (PUTTABLE C) (PUTTABLE D) (PUT C D) (PUT B C) (PUT A B)
((on a d) (on d b) (on b c) (ct a))  (PUTTABLE A) (PUTTABLE D) (PUTTABLE B) (PUT C D) (PUT B C) (PUT A B)
((on b a) (on a d) (on d c) (ct b))  (PUTTABLE B) (PUTTABLE A) (PUTTABLE D) (PUT C D) (PUT B C) (PUT A B)
((on b c) (on c a) (on a d) (ct b))  (PUTTABLE B) (PUTTABLE C) (PUTTABLE A) (PUT C D) (PUT B C) (PUT A B)
((on b c) (on c d) (on d a) (ct b))  (PUTTABLE B) (PUTTABLE C) (PUTTABLE D) (PUT C D) (PUT B C) (PUT A B)
((on b d) (on d a) (on a c) (ct b))  (PUTTABLE B) (PUTTABLE D) (PUTTABLE A) (PUT C D) (PUT B C) (PUT A B)
((on c a) (on a b) (on b d) (ct c))  (PUTTABLE C) (PUTTABLE A) (PUTTABLE B) (PUT C D) (PUT B C) (PUT A B)
((on c a) (on a d) (on d b) (ct c))  (PUTTABLE C) (PUTTABLE A) (PUTTABLE D) (PUT C D) (PUT B C) (PUT A B)
((on c b) (on b a) (on a d) (ct c))  (PUTTABLE C) (PUTTABLE B) (PUTTABLE A) (PUT C D) (PUT B C) (PUT A B)
((on c b) (on b d) (on d a) (ct c))  (PUTTABLE C) (PUTTABLE B) (PUTTABLE D) (PUT C D) (PUT B C) (PUT A B)
((on d a) (on a b) (on b c) (ct d))  (PUTTABLE D) (PUTTABLE A) (PUTTABLE B) (PUT C D) (PUT B C) (PUT A B)
((on d b) (on b a) (on a c) (ct d))  (PUTTABLE D) (PUTTABLE B) (PUTTABLE A) (PUT C D) (PUT B C) (PUT A B)
((on c d) (on d a) (on a b) (ct c))  (PUTTABLE C) (PUTTABLE D) (PUTTABLE A) (PUT C D) (PUT B C) (PUT A B)
  ((put c d) (puttable a) also possible)

6 4-towers, needing 2 puttable actions
((on a d) (on d c) (on c b) (ct a))  (PUTTABLE A) (PUTTABLE D) (PUT C D) (PUT B C) (PUT A B)
((on b d) (on d c) (on c a) (ct b))  (PUTTABLE B) (PUTTABLE D) (PUT C D) (PUT B C) (PUT A B)
((on c d) (on d b) (on b a) (ct c))  (PUTTABLE C) (PUTTABLE D) (PUT C D) (PUT B C) (PUT A B)
((on d a) (on a c) (on c b) (ct d))  (PUTTABLE D) (PUTTABLE A) (PUT C D) (PUT B C) (PUT A B)
((on d b) (on b c) (on c a) (ct d))  (PUTTABLE D) (PUTTABLE B) (PUT C D) (PUT B C) (PUT A B)
((on d c) (on c a) (on a b) (ct d))  (PUTTABLE D) (PUT C D) (PUTTABLE A) (PUT B C) (PUT A B)
  ((put c d) BEFORE (puttable a)!)

2 4-towers, needing 4 actions
((on b a) (on a c) (on c d) (ct b))  (PUTTABLE B) (PUTTABLE A) (PUT B C) (PUT A B)
((on d c) (on c b) (on b a) (ct d))  (PUTTABLE D) (PUT C D) (PUT B C) (PUT A B)
  (sorted tower, 0 actions, is root of plan)

5 2-tower pairs, needing 2 puttable actions
((on a c) (on b d) (ct a) (ct b))  (PUTTABLE B) (PUTTABLE A) (PUT C D) (PUT B C) (PUT A B)
((on a c) (on d b) (ct a) (ct d))  (PUTTABLE D) (PUTTABLE A) (PUT C D) (PUT B C) (PUT A B)
((on a d) (on b c) (ct a) (ct b))  (PUTTABLE A) (PUTTABLE B) (PUT C D) (PUT B C) (PUT A B)
((on b c) (on d a) (ct b) (ct d))  (PUTTABLE D) (PUTTABLE B) (PUT C D) (PUT B C) (PUT A B)
((on a b) (on d c) (ct a) (ct d))  (PUTTABLE D) (PUTTABLE A) (PUT C D) (PUT B C) (PUT A B)
  ((put c d) (puttable a) also possible)

5 2-tower pairs, needing 1 puttable action
((on a d) (on c b) (ct a) (ct c))  (PUTTABLE A) (PUT C D) (PUT B C) (PUT A B)
((on b a) (on d c) (ct b) (ct d))  (PUTTABLE D) (PUT C D) (PUT B C) (PUT A B)
((on b d) (on c a) (ct b) (ct c))  (PUTTABLE B) (PUT C D) (PUT B C) (PUT A B)
((on c a) (on d b) (ct c) (ct d))  (PUTTABLE D) (PUT C D) (PUT B C) (PUT A B)
((on c b) (on d a) (ct c) (ct d))  (PUTTABLE D) (PUT C D) (PUT B C) (PUT A B)

2 2-tower pairs, needing no (put c d) action
((on a b) (on c d) (ct a) (ct c))  (PUTTABLE A) (PUT B C) (PUT A B)
((on b a) (on c d) (ct b) (ct c))  (PUT B C) (PUT A B)
  (… are no leafs)
in contrast to 5 to 1 cases for the 3-block tower. Additionally, we have not
only one possible partial tower (with 2 or more blocks stacked) but also a
two-tower case (with two towers consisting of two blocks). Only one path
in the plan makes it necessary to interleave put and puttable: if D is on top
and C immediately under it. There are four cases where puttable has to be
performed only two times before the put actions are applied, and one case
where puttable has to be performed only once. For one case, only two puttables
and two puts have to be applied. For the case of pairs of towers, all three put
actions have to be performed for each leaf; puttable has to be performed once or
twice.
The underlying data type is now not just a list, but a more complicated
structure, where for example (D C A B) < (D A B C)! Again, the order
is derived from the number of actions needed to transform a state into the
goal state: a tower (D C A B) can be transformed into the goal using
two puttable actions and three put actions, while for (D A B C) three puttable
actions and three put actions are needed (see tab. 8.11). Currently, we do
not see an easy way to extract all conditions for generating optimal action
sequences from the universal plan. Either we have to be content with the
correct but suboptimal control rules inferred from the three-block plan, or we
have to rely on incremental learning. A program which covers all conditions
for generating optimal plans is given in appendix C.6.

One path we want to investigate in the future is to model the domain
specification in a slightly different way, using only a single operator (put
block loc), where loc can be a block or the table. This makes the universal
plan more uniform. There is no longer the decision to take which operator
to apply next. Instead, the decision whether a block is put on another block
or on the table can be included in the "semantic" selector function.
8.6.2.4 Set of Lists. Some deeper insight into the structure of the tower
problem might be gained by analyzing the analogous abstract problem. The
sequence of the number of states in dependence of the number of blocks is given in
table 2.3. This sequence corresponds to the number of sets of lists: a(n) =
(2n-1) a(n-1) - (n-1)(n-2) a(n-2). For n ≥ 1 it is the row sum of the "unsigned
Lah triangle" (Knuth, 1992). The corresponding formula is exp(x/(1-x)).^14
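As a sketch (Python; taking a(0) = a(1) = 1 is our assumption), the recurrence can be cross-checked against the row sums of the unsigned Lah triangle, L(n, k) = C(n-1, k-1) · n!/k!:

```python
from math import comb, factorial

def a(n):
    # number of sets of lists over n elements (recurrence from the text)
    if n < 2:
        return 1
    return (2 * n - 1) * a(n - 1) - (n - 1) * (n - 2) * a(n - 2)

def lah_row_sum(n):
    # unsigned Lah numbers: L(n, k) = C(n-1, k-1) * n! / k!
    return sum(comb(n - 1, k - 1) * factorial(n) // factorial(k)
               for k in range(1, n + 1))
```

Both give 1, 3, 13, 73, 501, … for n = 1, 2, 3, 4, 5; in particular a(4) = 73, the number of nodes of the four-block universal plan.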
The tower problem is related to generating the power set of a list with
mutually different elements (see tab. 8.12). But there is also a difference
between the two domains: For powerset each element of the set is a set again,
that is, for example {{a}, {b, c}, {b}} is equal to {{a}, {c, b}, {b}}. In
contrast, for tower, the elements of the sets are lists. For example {(a), (b,
c), (b)} is equal to {(b), (a), (b, c)} but not to {(a), (c, b), (b)}. A program
generating all different sets of lists (that is, towers) can easily be generated
from powerset by changing :test 'setequal to :test 'equal in ins-el.
^14 The identification of the sequence was researched by Bernhard Wolf. More
background information can be found at http://www.research.att.com/cgi-bin/
access.cgi/as/njas/sequences/eisA.cgi?Anum=000262.
The tower domain is the inverse problem to set of lists: For sets of lists a single
list is decomposed into all possible partial lists. For tower each state corresponds
to a set of partial lists and the goal state is the set containing a single list
with all elements in a fixed order. The (puttable x) operator corresponds to
removing an element from a list and generating a new one-element list (cons
(car l) nil); the (put x y) operator corresponds to removing an element
from a list and putting it in front of another list (cons (car l1) l2). A
program generating a list of sorted numbers is given in appendix C.6.
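Under this reading, the two operators can be sketched directly on the set-of-lists representation. This is a hypothetical Python rendering (towers are lists with the top block first, which is our assumption, not notation from the book):

```python
def puttable(x, state):
    # (puttable x): take x from the top of its tower, open a new singleton list
    rest = [t[1:] for t in state if t[0] == x] + [t for t in state if t[0] != x]
    return [t for t in rest if t] + [[x]]

def put(x, y, state):
    # (put x y): take x from the top of its tower, push it onto y's tower
    rest = [t[1:] for t in state if t[0] == x] + [t for t in state if t[0] != x]
    return [[x] + t if t[0] == y else t for t in rest if t]
```

Three steps, (puttable c), (put b c), (put a b), turn the reversed tower {(c b a)} into the goal {(a b c)}.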
Table 8.12. Power-Set of a List, Set of Lists

(defun powerset (l) (pset l (1+ (length l)) (list (list nil))))

(defun pset (l c ps)
  (cond ((= 0 c) nil)
        (T (union ps (pset l (1- c) (ins-el l ps)))) ))

(defun ins-el (l ps)
  (cond ((null l) nil)
        (T (union (mapcar #'(lambda (y) (adjoin (car l) y)) ps)
                  (ins-el (cdr l) ps) :test 'setequal)) ))
; for set of lists :test 'equal

(defun setequal (s1 s2) (and (subsetp s1 s2) (subsetp s2 s1)))
8.6.2.5 Concluding Remarks on 'Tower'. Inference of generalized control
knowledge for the tower domain was also investigated in the context of
two alternative approaches. One of these approaches is genetic programming
(sect. 6.3.2 in chap. 6). Within this approach, given primitive operators of
some functional programming language together with rules for the correct
generation of terms, for a set of input/output examples and an evaluation
function (representing knowledge about "good" solutions), a program covering
all I/O examples correctly is generated by search in the "evolution
space" of programs. Programs generated by this approach are given in figure
6.4. These programs correspond to the "linear" program discussed above. Because
always first all blocks are put on the table and afterwards the tower is
constructed, the program does not generate optimal transformation sequences
for all possible cases.
The second approach, introduced by Martín and Geffner (2000), infers
rules from plans for sample input states (see sect. 2.5.2 in chap. 2). The domain
is modelled in a concept language (an AI knowledge representation language)
and the rules are inferred with a decision list learning approach. The resulting
rules are given in table 8.13. For example, rule A3 represents the knowledge
that a block should be picked up if it is clear, and if its target block is clear
and "well-placed". With these rules, 95.5% of 1000 test problems were solved
for 5-block problems and 72.2% of 500 test problems were solved for 20-block
problems. The generated plans are about two steps longer than the optimal
plans. The authors could show that after a selective extension of the training
set by the input states for which the original rules failed to generate a correct
plan, a more extensive set of rules is generated for which the generated plans
are about one step longer than the optimal plans.
Table 8.13. Control Rules for Tower Inferred by Decision List Learning

A1: PUT-ON ((on_g = on_s) ∧ (∀on⁻¹_g . ¬holding) ∧ clear_s)
A2: PUT-ON-TABLE (holding)
A3: PICK ((∀on_g . ¬(on_g = on_s)) ∧ (∀on_g . ¬clear_s) ∧ clear_s)
A4: PICK (¬(on_g = on_s) ∧ (∀on_s . ¬(∀on⁻¹_g . ¬clear_s)) ∧ clear_s)
A5: PICK (¬(on_g = on_s) ∧ (∀on_g . ¬(on_s = on_s)) ∧ clear_s)
A6: PICK (¬(on_g = on_s) ∧ (∀on_s . ¬(∀on⁻¹_s . ¬clear_s)))
Our approach differs from these two approaches in two aspects: First, we
do not use example sets of input/output pairs or of input/plan pairs, but
we analyze the complete space of optimal solutions for a problem of small
size. Second, we do not rely on incremental hypothesis construction, that is,
a learning approach where each new example is used to modify the current
hypothesis if this example is not covered in the correct way. Instead, we
aim at extracting the control knowledge from the given universal plan by
exploiting the structural information contained in it. Although we failed up
to now to generate optimal rules for tower, we could show for sequence, set,
and list problems that with our analytical approach we can extract correct
and optimal rules from the plan.
There is a trade-off between optimality of the policy versus (a) the efficiency
of control knowledge application and (b) the efficiency of control
knowledge learning. As we can see from the program presented in appendix
C.6 and from the (still non-optimal!) control rules in table 8.13, generating
minimal action sequences might involve complex tests which have to be
performed on the current state. In the worst case, these tests again involve
recursion, for example, a test whether already a "well-placed" partial tower
exists (test subtow in our program). Furthermore, we demonstrated that the
suboptimal control rules for tower could be extracted quite easily from the
3-block plan, while automatic extraction of the optimal rules from the 4-block
plan involves complex reasoning (for generating the tests for "special" cases).
8.6.3 Tower of Hanoi

Up to now, we did not investigate plan transformation for the Tower of Hanoi.
Thus, we will make just some more general remarks about this domain. Tower
of Hanoi can be seen as a modified tower problem: In contrast to tower,
where blocks can be put on arbitrary positions on a table, in Tower of Hanoi
the positions of discs are restricted to some (typically three) positions of
pegs. This results in a more regular structure of the DAG than for the tower
problem, and therefore we hope that if we come up with a plan transformation
strategy for tower, the Tower of Hanoi domain is covered by this strategy as
well.

It is often claimed that hanoi is a highly artificial domain and that
the only isomorphic domains are hand-crafted puzzles, as for example the
monster problems (Simon & Hayes, 1976; Clement & Richard, 1997). I want
to point out that there are solitaire ("patience") games which are isomorphic
to hanoi.^15 One of these solitaire games (freecell) was included in the AIPS-2000
planning competition.
The domain specification for hanoi is given in table 3.8. The resulting plan
is a unique minimal spanning tree which is already regular (see fig. 3.9). This
indicates that data type inference and resulting plan transformation should
be easier than for the tower problem. While hanoi with three discs contains
more states than the three-block tower domain (27 to 13), the actions for
transforming one state into another are much more restricted. The number
of states for hanoi is 3^n. The minimal number of moves when starting with a
complete tower on one peg is 2^n - 1. Up to now, there seems to be no general
formula to calculate the minimal number of moves for an arbitrary starting
state, that is, one of the nodes of the universal plan.^16
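Both counts are easy to check with a small sketch (Python; the move count assumes the classical full-tower start):

```python
def hanoi_states(n):
    # each of the n discs sits on one of the three pegs
    return 3 ** n

def hanoi_moves(n):
    # minimal moves for a complete tower: move n-1, move largest, move n-1
    return 0 if n == 0 else 2 * hanoi_moves(n - 1) + 1   # = 2**n - 1
```

For n = 3 this gives 27 states and 7 moves, against 13 states for the three-block tower.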
Tower of Hanoi is a puzzle investigated extensively in artificial intelligence
as well as in cognitive psychology since the 1960s. In computer science classes,
Tower of Hanoi is used as a prototypical example for a problem with exponential
effort. Coming up with efficient algorithms for (restricted variants of)
the Tower of Hanoi problem is still ongoing research (Atkinson, 1981; Pettorossi,
1984; Walsh, 1983; Allouche, 1994; Hinz, 1996). As far as we survey
the literature, all algorithms are concerned with the case where a tower of
n discs is initially located at a predefined start peg (see for example table
8.14). In general, hanoi is tree-recursive already for the restricted case where
the initial state is fixed and only the number of discs is variable, with the
structure hanoi ∘ move ∘ hanoi. A standard implementation, as shown in table
8.14, is as tree recursion.

We are interested in learning a control strategy starting with an arbitrary
initial state (see program in table 8.15).
^15 We plan to conduct a psychological experiment in the domain of problem solving
by analogy, demonstrating that subjects who are acquainted with playing
patience games perform better on hanoi than subjects with no such experience.
^16 See: http://forum.swarthmore.edu/epigone/geometry-puzzles/twim lehmeh/
7oen0r212 wyforum.swarthmore.edu, open question from February 2000.
Table 8.14. A Tower of Hanoi Program

; (SETQ A '(1 2 3) B NIL C NIL) (start) OR
; (hanoi '(1 2 3) nil nil 3)

(DEFUN move (from to)
  (COND ( (NULL (EVAL from)) (PRINT (LIST 'PEG from 'EMPTY)) )
        ( (OR (NULL (EVAL to))
              (> (CAR (EVAL to)) (CAR (EVAL from)) ))
          (SET to (CONS (CAR (EVAL from)) (EVAL to)) )
          (SET from (CDR (EVAL from)) )
        )
        ( T (PRINT (LIST 'MOVE 'FROM (CAR (EVAL from))
                         'TO (CAR (EVAL to)) 'NOT 'POSSIBLE)))
  )
  (LIST (LIST 'MOVE 'DISC (CAR (EVAL to)) 'FROM from 'TO to)) )

(DEFUN hanoi (from to help n)
  (COND ( (= n 1) (move from to) )
        ( T (APPEND
              (hanoi from help to (- n 1))
              (move from to)
              (hanoi help to from (- n 1))
            ) ) ) )

(DEFUN start () (hanoi 'A 'B 'C (LENGTH A)))
Table 8.15. A Tower of Hanoi Program for Arbitrary Starting Constellations

(DEFUN ison (disc peg)
  (COND ( (NULL peg) NIL)
        ( (= (CAR peg) disc) T)
        ( T (ison disc (cdr peg)))))

(DEFUN on (disc from to help)
  (COND ( (ison disc (eval from)) from)
        ( (ison disc (eval to)) to)
        ( T help)))

; whichpeg: peg on which the current disc is NOT lying and peg which
; is not current goal peg
(DEFUN whichpeg (disc peg)
  (COND ( (or (and (equal (on disc 'A 'B 'C) 'B) (equal peg 'C))
              (and (equal (on disc 'A 'B 'C) 'C) (equal peg 'B)) ) 'A)
        ( (or (and (equal (on disc 'A 'B 'C) 'A) (equal peg 'C))
              (and (equal (on disc 'A 'B 'C) 'C) (equal peg 'A)) ) 'B)
        ( (or (and (equal (on disc 'A 'B 'C) 'A) (equal peg 'B))
              (and (equal (on disc 'A 'B 'C) 'B) (equal peg 'A)) ) 'C) ))

(DEFUN topof (peg)
  (COND ( (null (car (eval peg))) nil) ( T (car (eval peg))) ))

(DEFUN clearpeg (peg)
  (COND ( (null (car (eval peg))) T) ( T nil) ))

(DEFUN cleartop (disc)
  (COND ( (and (equal (on disc 'A 'B 'C) 'A) (= (car A) disc)) T)
        ( (and (equal (on disc 'A 'B 'C) 'B) (= (car B) disc)) T)
        ( (and (equal (on disc 'A 'B 'C) 'C) (= (car C) disc)) T)
        ( T nil)))

(DEFUN gmove (disc peg)
  (COND ( (= disc 0) (PRINT (LIST 'NO 'DISC)))
        ( (equal (on disc 'A 'B 'C) peg)
          (PRINT (LIST 'DISC disc 'IS 'ON 'PEG peg)) )
        ( (OR (clearpeg peg) (> (topof peg) disc))
          (PRINT (LIST 'MOVE 'DISC disc
                       'FROM (on disc 'A 'B 'C)
                       'TO peg ) )
          (SET (on disc 'A 'B 'C) (CDR (eval (on disc 'A 'B 'C))))
          (SET peg (CONS disc (EVAL peg)))
        )
        ( T (PRINT (LIST 'MOVE 'FROM disc 'ON peg 'NOT 'POSSIBLE)))))

(DEFUN ghanoi (disc peg)
  (COND ( (and (= disc 1) (equal (on disc 'A 'B 'C) peg)) T )
        ( T (COND
              ( (equal (on disc 'A 'B 'C) peg) (ghanoi (- disc 1) peg) )
              ( (and (not (equal (on disc 'A 'B 'C) peg))
                     (not (and (cleartop disc) (clearpeg peg)))
                     (> disc 1)) (ghanoi (- disc 1) (whichpeg disc peg)) )
            )
            (gmove disc peg)
            (COND ((> disc 1) (ghanoi (- disc 1) peg))) )))

(DEFUN n-of-discs (p1 p2 p3) (+ (LENGTH p1) (+ (LENGTH p2) (LENGTH p3))))

; ghanoi: "largest" Disc x Goal-Peg --> Solution Sequence
(DEFUN gstart () (ghanoi (n-of-discs A B C) 'C))
9. Conclusions and Further Research

9.1 Combining Planning and Program Synthesis

We demonstrated that planning can be combined with inductive program
synthesis by first generating a universal plan for a problem with a small
number of objects, then transforming this plan into a finite program term,
and folding this term into a recursive program scheme. While planning and
folding can be performed by powerful, domain-independent algorithms, plan
transformation is knowledge dependent. In part I, we presented the domain-independent
universal planner DPlan. In this part, we presented an approach
to folding finite program terms based on pattern-matching which is more
powerful than other published approaches.
Our approach to plan transformation, presented in the previous chapter,
is based on inference of the data type underlying the planning domain.
Typically, such knowledge is pre-defined in program synthesis, for example
as domain axiom as in the deductive approach of Manna and Waldinger
(1975), or as inherent restriction of the input domain as in the inductive
approach of Summers (1977). We go beyond these approaches, providing a
method to infer the data type from the structure of the plan, where inference
is based on a set of predefined abstract types. Furthermore, we presented first
ideas for dealing with problems relying on semantic knowledge. In program
synthesis, typically, only structural list problems (such as reverse) are considered.
Planning problems which can be solved by using structural knowledge
only are for example clearblock and rocket. In clearblock, relations between
objects (on(x, y)) are independent of attributes of these objects (such as their size).
In rocket, relations between objects are irrelevant. In contrast, sorting lists
depends on a semantic relation between objects, that is, whether one number
is greater than another. If a universal plan has parallel branches representing
the same operation but for different objects, we assume that a semantic selector
function must be introduced. The selector is constructed by identifying
discriminating literals between the underlying problem states (see sect. 8.5 in
chap. 8). To sum up, with our current approach we can deal with structural
problems in a fully automatic way, from planning over plan transformation
to folding. Dealing with problems relying on semantic information is a topic
for further research.
Currently we are not exploiting the full power of the folder when generalizing
over plans: In plan transformation, it is tried to come up with a
single, linear structure as input to the folder, although the folder allows to
infer sets of recursive equations with arbitrarily complex recursive structures.
In future research we plan to investigate possibilities for submitting complex,
non-linear plans directly to the folder.

A drawback of using planning as basis for constructing finite programs is
that number problems, such as factorial or fibonacci, cannot be dealt with
in a natural way. For such problems, our folder must rely on input traces
provided by a user. Using a graphical user interface to support the user in
constructing such traces is discussed by Schrödl and Edelkamp (1999).
9.2 Acquisition of Problem Solving Strategies

The most important building-stones for the flexible and adaptive nature of
human cognition are powerful mechanisms of learning.^1 On the low-level end
of learning mechanisms is stimulus-response learning, mostly modelled with
artificial neural nets or reinforcement learning. On the high-level end are
different principles of induction, that is, generalizing rules from examples.
While the majority of work in machine learning focusses on induction of
concepts (classification learning, see sect. 6.3.1.1 in chap. 6), our work focusses
on inductive learning of cognitive skills from problem solving. While concepts
are mostly characterized as declarative knowledge (know what), which can be
verbalized and is accessible for reasoning processes, skills are described as
highly automated procedural knowledge (know how) (Anderson, 1983).
9.2.1 Learning in Problem Solving and Planning

Problem solving is generally realized as heuristic search in a problem space.
In cognitive science most work is in the area of goal-driven production systems
(see sect. 2.4.5.1 in chap. 2). In AI, different planning techniques are
investigated (see sect. 2.4.2 in chap. 2). In both frameworks, the definition of
problem operators together with conditions for their application, that is,
production rules, is central. Matching, selection, and application of operators
is performed by an interpreter (control strategy): the preconditions of all
operators defined for a given domain are matched against the current data
(problem state), one of the matching operators is selected and applied to the
current state. The process terminates if the goal is fulfilled or if no operator
is applicable.
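The match-select-apply cycle can be sketched generically (Python, hypothetical names; representing an operator as a precondition/application pair is an assumption of this sketch):

```python
from collections import namedtuple

Op = namedtuple('Op', 'name pre apply')

def interpret(operators, state, goal, select=lambda ops: ops[0]):
    # generic control strategy: match preconditions, select, apply, repeat
    trace = []
    while not goal(state):
        applicable = [op for op in operators if op.pre(state)]
        if not applicable:
            break                        # no operator applicable: failure
        op = select(applicable)
        state = op.apply(state)
        trace.append(op.name)
    return state, trace
```

With a single puttable operator on a stack ('c', 'b', 'a') and the goal of reducing it to its base, the loop applies puttable twice and terminates.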
Skill acquisition is usually modelled as composition of predefined primitive
operators as a result of their co-occurrence during problem solving, that is,

^1 This section is a short version of the previous publications Schmid and Wysotzki
(1996) and Schmid and Wysotzki (2000c).
learning by doing. This is true in cognitive modelling (knowledge compilation
in ACT, Anderson & Lebiere, 1998; operator chunking in Soar, Rosenbloom
& Newell, 1986) as well as in AI planning (macro learning, see sect. 2.5.2
in chap. 2). Acquisition of such "linear" macros can result in a reduction of
search, because now composite operators can be applied instead of primitive
ones. In cognitive science, operator composition is viewed as the mechanism
responsible for acquisition of automatisms and the main explanation for speed-up
effects of learning (Anderson et al., 1989).

In contrast, in AI planning, learning of domain-specific control knowledge,
that is, learning of problem solving strategies, is investigated as an additional
mechanism, as discussed in section 2.5.2 in chapter 2. One possibility to
model acquisition of control knowledge is learning of "cyclic" macros (Shell &
Carbonell, 1989; Shavlik, 1990). Learning a problem solving strategy ideally
eliminates search completely because the complete sub-goal structure of a
problem domain is known. For example, a macro for a one-way transportation
problem as rocket represents the strategy that all objects must be loaded
before the rocket moves to its destination (see sect. 3.1.4.2 in chap. 3 and
sect. 8.4 in chap. 8). There is empirical evidence, for example in the Tower
of Hanoi domain, that people can acquire such kind of knowledge (Anzai &
Simon, 1979).
9.2.2 Three Levels of Learning

We propose that our system, as given in figure 1.2, provides a general framework
for modeling the acquisition of strategies from problem solving experience:
Starting point is a problem specification, given as primitive operators,
their application conditions, and a problem solving goal. Using DPlan, the
problem is explored, that is, operator sequences for transforming some initial
states into a state fulfilling the goals are generated. This experience is
integrated into a finite program, corresponding roughly to a set of operator
chunks as discussed above, using plan transformation. Subsequently, this experience
is generalized to a recursive program scheme (RPS) using a folding
technique. That is, the system infers a domain-specific control strategy which
simultaneously represents the (goal) structure of the current domain. Alternatively,
as we will discuss in part III, after initial exploration a similar
problem might be retrieved from memory. That is, the system recognizes that
the new problem can be solved with the same strategy as an already known
problem. In this case, a further generalized scheme, representing the abstract
strategy for solving both problems, is learned.

All three steps of learning are illustrated in figure 9.1 for the simple clearblock
example which was discussed in chapters 3, 7, and 8: To reach the goal
that block C has a clear top, all blocks lying above C have to be put on
the table. This can be done by applying the operator puttable(x) if block
x has a clear top. The finite program represents the experience with three
initial states as a nested conditional expression. The most important aspect
of this finite program, which makes it different from the cognitive science
approaches to operator chunking, is that objects are not referred to directly
by their name but with help of a selector function (topof). Selector functions
are inferred from the structure of the universal plan (as described in chap. 8).
This process in a way captures the evolution of perceptual chunks from problem
solving experience (Koedinger & Anderson, 1990): For the clearblock
example, the introduction of topof(x) represents the knowledge that the relevant
part of the problem is the block lying on top of the currently focussed
block. For more complex domains, as Tower of Hanoi, the data type represents
"partial solutions" (such as how many discs are already in the correct
position or looking for the largest free disc).
[Figure 9.1 shows three levels of generalization for the clearblock example.
Generalization over problem states yields an operator chunk as a finite program:
clearblock(x, s) = IF cleartop(x) THEN s
ELSE IF cleartop(topof(x)) THEN puttable(topof(x), s)
ELSE IF cleartop(topof(topof(x))) THEN puttable(topof(x), puttable(topof(topof(x)), s))
ELSE undefined.
Generalization over recursively enumerable problem spaces yields a problem solving
strategy as a recursive program scheme:
clearblock(x, s) = IF cleartop(x) THEN s ELSE puttable(topof(x), clearblock(topof(x), s)).
Generalization over classes of problems yields a scheme hierarchy with the abstract scheme
r(x, v) = IF b(x) THEN v ELSE op1(op2(x), r(op2(x), v)),
covering for example clearblock and
addx(x, y) = IF eq0(x) THEN y ELSE plus(pred(x), addx(pred(x), y)).]

Fig. 9.1. Three Levels of Generalization
In the second step, this primitive behavioral program is generalized over
recursively enumerable problem spaces: a strategy for clearing a block in
n-block problems is extrapolated and interpreted as a recursive program
scheme. Induction of generalized structures from examples is a fundamental
characteristic of human intelligence, as proposed, for example, by Chomsky
as a "language acquisition device" (Chomsky, 1959) or by Holland, Holyoak,
Nisbett, and Thagard (1986). This ability to extract general rules from some
initial experience is captured in the presented technique of folding of finite
programs by detecting regularities.
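The step from the finite program to the recursive clearblock strategy can be made concrete in executable form. The following is a minimal Python sketch under our own representational assumptions (a tower is a list with the bottom block first; a plan is a list of operator applications), not the book's term notation:

```python
def cleartop(tower, x):
    # a block is clear if nothing lies on top of it
    return tower[-1] == x

def topof(tower, x):
    # selector function: the block lying directly on top of x
    return tower[tower.index(x) + 1]

def clearblock(tower, x, plan):
    # clearblock(x, s) = IF cleartop(x) THEN s
    #                    ELSE puttable(topof(x), clearblock(topof(x), s))
    if cleartop(tower, x):
        return plan
    y = topof(tower, x)
    plan = clearblock(tower, y, plan)  # first clear the block above x ...
    plan.append(("puttable", y))       # ... then put that block on the table
    return plan
```

For the tower C-B-A (A on top), clearblock(["C", "B", "A"], "C", []) unfolds to the plan [("puttable", "A"), ("puttable", "B")].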
The representation of problem schemes by recursive program schemes differs
from the representation formats proposed in cognitive psychology (Rumelhart
& Norman, 1981). But RPSs capture exactly the characteristics
9.2 Acquisition of Problem Solving Strategies 281
which are attributed to cognitive schemes, namely that schemes represent
procedural knowledge ("knowledge how") which the system can interrogate
to produce "knowledge that", that is, knowledge about the structure of a
problem. Thereby problem schemes are suitable for modeling analogical
reasoning: The acquired scheme represents not only the solution strategy for
unstacking towers of arbitrary height but for all structurally identical
problems. Experience with a blocks-world problem can, for example, be used to
solve a numerical problem which has the same structure by re-interpreting
the meaning of the symbols of an RPS. After solving some problems with
similar structures, more general schemes evolve and problem solving can be
guided by abstract schemes.
10. Analogical Reasoning and Generalization
"I wish you'd solve the case, Miss Marple, like you did the time Miss
Wetherby's gill of picked shrimps disappeared. And all because it reminded
you of something quite different about a sack of coals." "You're laughing, my
dear," said Miss Marple, "but after all, that is a very sound way of arriving
at the truth. It's really what people call intuition and make such a fuss about.
Intuition is like reading a word without having to spell it out. A child can't
do that because it has had so little experience. But a grown-up person knows
the word because they've seen it often before. You catch my meaning, Vicar?"
"Yes," I said slowly, "I think I do. You mean that if a thing reminds you of
something else, well, it's probably the same kind of thing."
--Agatha Christie, The Murder at the Vicarage, 1930
Analogical inference is a special case of inductive inference where knowledge
is transferred from a known base domain to a new target domain.
Analogy is a prominent research topic in cognitive science: In philosophy,
it is discussed as a source of creative thinking; in linguistics, similarity-creating
metaphors are studied as a special case of analogical reasoning; in psychology,
analogical reasoning and problem solving are researched in innumerable
experiments and there exist several process models of analogical problem solving
and learning; in artificial intelligence, besides some general approaches to
analogy, programming by analogy and case-based reasoning are investigated.
In the following (sect. 10.1), we will first give an overview of central
concepts and mechanisms of the field. Afterwards (sect. 10.2), we will introduce
analogical reasoning for domains with different levels of complexity, from
proportional analogies to analogical problem solving and planning. Then we
will discuss programming by analogy as a special case of problem solving
by analogy (sect. 10.3). Finally (sect. 10.4), we will give some pointers to
literature.
10.1 Analogical and Case-Based Reasoning
10.1.1 Characteristics of Analogy
Typical for AI and psychological models of analogical reasoning is that
domains or problems are considered as structured objects, such as terms or
relational structures (semantic nets). Structured objects are often represented
as graphs with basic objects as nodes and relations between objects as arcs.
Relations are often distinguished into object attributes (unary relations),
relations between objects (first-order n-ary relations), and higher-order relations.
Based on this representational assumption, Gentner (1983) introduced
a characterization of analogy and contrasted analogy with other modes of
transfer between two domains (see tab. 10.1): In contrast to mere appearance
and literal similarity, analogy is based on mapping the relational structure of
domains rather than object attributes. For the famous Rutherford analogy
(see fig. 10.4) "The atom is like our solar system", it is relevant that there
is a central object (sun/nucleus) and objects revolving around this object
(planets/electrons), but it is irrelevant how much these objects weigh or what
their temperature is. The same is true for abstraction, but in contrast to
analogy, the objects of the base domain (central force system) are generalized
concepts rather than concrete instances. Metaphors can be found to be either
a form of similarity-based transfer (comparable to mere appearance) or a form
of analogy where a similarity is created between two previously unconnected
domains (Indurkhya, 1992).¹ In contrast to analogy, case-based reasoning
often relies on a simple mapping of domain attributes (Kolodner, 1993).
Table 10.1. Kinds of Predicates Mapped in Different Types of Domain Comparison
(Gentner, 1983, Tab. 1, extended)

                     No. of      No. of
                     Attributes  Relations
                     mapped to   mapped to
                     target      target     Example
Mere Appearance      Many        Few        A sunflower is like the sun.
Literal Similarity   Many        Many       The K5 solar system is like our
                                            solar system.
Analogy              Few         Many       The atom is like our solar system.
Abstraction          Few         Many       The atom is a central force system.
Metaphor             Many        Few        Her smile is like the sun.
                     Few         Many       King Louis XIV was like the sun.
Below we will give a hierarchy of analogical reasoning, from simple proportional
analogies where only one relation is mapped to complex analogies
between (planning or programming) problems. In chapter 11 we will introduce
a graph representation for water-jug problems in detail.
A distinction is often made between within- and between-domain
analogies (Vosniadou & Ortony, 1989). For example, using the solution of one
programming problem (factorial) to construct a solution to a new problem
(sum) is considered as within-domain (Anderson & Thompson, 1989), while
¹ Famous are the metaphors of Philip Marlowe. Just to give you one: "The purring
voice was now as false as an usherette's eyelashes and as slippery as a watermelon
seed." Raymond Chandler, The Big Sleep, 1939.
transferring knowledge about the solar system to explain the structure of an
atom is considered as between-domain. Because analogy is typically described
as structure mapping (see below), we think that this classification is an
artificial one: When source and target domains are represented as relational
structures and these structures are mapped by a syntactical pattern-matching
algorithm, the content of these structures is irrelevant. In case-based
reasoning, base and target are usually from the same domain.
10.1.2 Sub-Processes of Analogical Reasoning
Analogical reasoning is typically described by the following, possibly
interacting, sub-processes:
Retrieval: For a given new target domain/problem, a "suitable", "similar"
base domain/problem is retrieved (from memory).
Mapping: The base and target structures are mapped.
Transfer: The target is enriched with information from the base.
Generalization: A structure generalizing over base and target is induced.
Typically, it is assumed that retrieval is based on superficial similarity,
that is, a base problem is selected which shares a high number of attributes
with the new problem. Superficial similarity and structural similarity do
not necessarily correspond, and it was shown in numerous studies that human
problem solvers have difficulties in finding an adequate base problem (Novick,
1988; Ross, 1989).
Most cognitive science models of analogical reasoning focus on modeling
the mapping process (Falkenhainer, Forbus, & Gentner, 1989; Hummel &
Holyoak, 1997). The general assumption is that objects of the base domain
are mapped to objects of the target domain in a structurally consistent way
(see fig. 10.1). The approaches differ with respect to the constraints on
mapping, allowing only first-order or also higher-order mapping, and restricting
mapping to isomorphism, homomorphism, or weaker relations (Holland et al.,
1986).
[Figure: base objects O are mapped by a function f to target objects O';
the base relation R* corresponds to the target relation R'*.]
Fig. 10.1. Mapping of Base and Target Domain
Gentner (1983) proposed the following constraints for mapping a base
domain to a target domain:
First-Order Mapping: An object in the base can be mapped on a different
object in the target, but base relations must be mapped on identical
relations in the target.
Isomorphism: A set of base objects must be mapped one-to-one and
structurally consistently to a set of target objects. That is, if r(o1, o2)
holds for the base, then r(f(o1), f(o2)) must hold for the target, where
f(o) is the mapping function.
Systematicity: Mapping of large parts of the relational structure is preferred
over mapping of small isolated parts, that is, relations with greater arity
are preferred.
Relying on these constraints, mapping corresponds to the problem of finding
the largest common sub-graph of two graphs (Schadler & Wysotzki, 1999).
If the base is mapped to the target, the parts of the base graph which are
connected with the common sub-graph are transferred to the target, where
base objects are translated to target objects in accordance with the mapping.
This kind of transfer is also called inference, because relations known in the
base are assumed to also hold in the target. If the isomorphism constraint is
relaxed, transfer can additionally involve modifications of the base structure.
This kind of transfer is also called adaptation (see chap. 11).
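The first-order mapping and isomorphism constraints can be checked mechanically. A minimal Python sketch, assuming domains are encoded as sets of binary facts (relation, object1, object2); the example facts and the mapping are illustrative:

```python
def structurally_consistent(base, target, f):
    # f must be one-to-one (isomorphism constraint)
    if len(set(f.values())) != len(f):
        return False
    # if r(o1, o2) holds in the base, r(f(o1), f(o2)) must hold in the target
    return all((r, f[o1], f[o2]) in target for (r, o1, o2) in base)

solar = {("revolves_around", "planet", "sun"),
         ("more_massive_than", "sun", "planet")}
atom = {("revolves_around", "electron", "nucleus"),
        ("more_massive_than", "nucleus", "electron")}
mapping = {"planet": "electron", "sun": "nucleus"}
```

With these facts, structurally_consistent(solar, atom, mapping) succeeds, while swapping the two object images violates the constraint.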
After successful analogical transfer, the common structure of base and
target can be generalized to a more general scheme (Novick & Holyoak, 1991).
For example, the structure of the solar system and the Rutherford atom model
can be generalized to the more general concept of a central force system.
To our knowledge, none of the process models in cognitive science models
generalization learning (see chap. 12).
In case-based reasoning, mostly only retrieval is addressed and the
retrieved case is presented to the system user as a source of information which
he might transfer to a current problem.
10.1.3 Transformational versus Derivational Analogy
The sub-processes described above characterize so-called transformational
analogy. An alternative approach for analogical problem solving, called
derivational analogy, was proposed by Carbonell (1986). He argues that, from a
computational point of view, transformational analogy is often inefficient
and can result in suboptimal solutions. Instead of calculating a base/target
mapping and solving the target by transfer of the base structure, it might
be more efficient to reconstruct the solution process; that is, use a
remembered problem solving episode as a guideline for solving the new problem.
A problem solving episode consists of the reasoning traces (derivations) of
past solution processes, including the explored subgoal structure and used
methods. Derivational analogy can be characterized by the following
sub-processes: (1) retrieving a suitable problem solving episode, (2) applying
the retrieved derivation to the current situation by "replaying" the problem
solving episode, checking for each step if the derivation is still applicable in
the new problem solving context. Derivational analogy is, for example, used
within the AI planning system Prodigy (Veloso & Carbonell, 1993). An
empirical comparison of transformational and derivational analogy was conducted
by Schmid and Carbonell (1999).
10.1.4 Quantitative and Qualitative Similarity
As mentioned above, retrieval is typically based on similarity of attributes.
For comparison of base and target, all kinds of similarity measures defined on
attribute vectors can be used. An overview of measures of similarity and
distance is typically given in textbooks on cluster analysis, such as Eckes and
Roßbach (1980). In psychology, non-symmetrical measures are discussed, for
example, the contrast model of Tversky (1977).
In analogy research, the focus is on measures for structural similarity. A
variety of measures are based on the size of the greatest common sub-graph of
two structures. Such a measure for unlabeled graphs was, for example,
proposed by Zelinka (1975): d(G, H) = |N| - |N_U|, where |N| is the number of
nodes of the larger graph and |N_U| is the number of nodes in the greatest
common sub-graph of G and H. A measure considering the number of nodes
and arcs in the graphs is introduced in chapter 11.
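Zelinka's measure can be computed directly for very small graphs by brute force. A Python sketch; note that finding the greatest common sub-graph is computationally hard in general, and reading "common sub-graph" as common induced sub-graph is our own assumption:

```python
from itertools import combinations, permutations

def induced_edges(nodes, edges):
    # edges of the sub-graph induced by the given node subset
    return {frozenset(e) for e in edges if set(e) <= set(nodes)}

def common_subgraph_size(gn, ge, hn, he):
    # largest k such that some k-node induced sub-graphs of G and H match
    for k in range(min(len(gn), len(hn)), 0, -1):
        for gs in combinations(gn, k):
            g_sub = induced_edges(gs, ge)
            for hs in combinations(hn, k):
                h_sub = induced_edges(hs, he)
                for perm in permutations(hs):
                    m = dict(zip(gs, perm))
                    if {frozenset({m[a], m[b]}) for a, b in g_sub} == h_sub:
                        return k
    return 0

def zelinka_distance(gn, ge, hn, he):
    # d(G, H) = |N| - |N_U|, |N| the node count of the larger graph
    n = max(len(gn), len(hn))
    return n - common_subgraph_size(gn, ge, hn, he)
```

For a triangle and a three-node path the greatest common sub-graph is a single edge (two nodes), so the distance is 3 - 2 = 1.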
Another approach to structural similarity is to consider the number of
transformations which are necessary for making two graphs identical (Bunke
& Messmer, 1994). A measure of transformation distance for trees was, for
example, proposed by Lu (1979): d(T1, T2) = w_s*s + w_d*d + w_i*i, where
s represents the number of substitutions (renamings of node labels), d the
number of deletions of nodes, and i the number of insertions of nodes. The
parameters w give operation-specific weights. To guarantee that the measure
is a metric, it must hold that w_s < w_d, w_i and w_d = w_i (proof in
Mercy, 1998). Transformational similarity is also the basis for structural
information theory (Leeuwenberg, 1971).
All measures discussed so far are quantitative, that is, a numerical value is
calculated which represents the similarity between base and target. Usually,
a threshold is defined, and a domain or problem is considered as a candidate
for analogical reasoning if its similarity to the target problem lies above this
threshold. A qualitative approach to structural similarity was proposed by
Plaza (1995). Plaza's domain of application is the retrieval of typed feature
terms (e.g., records of employees), which can be hierarchically organized; for
example, the field "name" consists of a further feature term with the fields
"first name" and "surname". For a given (target) feature term, the most
similar (base) feature term from a data base is identified by the following
method: Each pair of the fixed target and a term from the data base is
anti-unified. The resulting anti-instances are ordered with respect to their
subsumption relation and the most specific anti-instance is returned.
Anti-unification was introduced in section 6.3.3 in chapter 6 under the name of
least general generalization. An anti-unification approach for programming
by analogy is presented in chapter 12.
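First-order anti-unification can be sketched in a few lines. Terms are encoded as nested tuples ("functor", arg, ...) with strings as constants; this encoding and the variable naming are our own illustrative choices. Unlike second-order variants, mismatching function symbols here collapse into a single variable instead of being generalized to a function variable:

```python
from itertools import count

def anti_unify(s, t, subst=None, fresh=None):
    subst = {} if subst is None else subst
    fresh = count() if fresh is None else fresh
    if s == t:
        return s
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        # same functor and arity: descend into the arguments
        return (s[0],) + tuple(anti_unify(a, b, subst, fresh)
                               for a, b in zip(s[1:], t[1:]))
    if (s, t) not in subst:  # same disagreement pair -> same variable
        subst[(s, t)] = "X%d" % next(fresh)
    return subst[(s, t)]
```

Anti-unifying ("f", "a", "a") with ("f", "b", "b") yields ("f", "X0", "X0"): the same disagreement pair is consistently replaced by the same variable, which is what makes the result a least general generalization rather than an arbitrary common pattern.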
10.2 Mapping Simple Relations or Complex Structures
10.2.1 Proportional Analogies
The simplest form of analogical reasoning are proportional analogies of
the form
A : B :: C : ?   ("A is to B as C is to ?").
Expression A to B is the base domain and expression C to ? is the target
domain. The relation existing between A and B must be identified and applied
to the target domain. Such problems are typically used in intelligence tests,
and algorithms for solving proportional analogies were introduced early in AI
research (Evans, 1968; Winston, 1980). In intelligence tests, items are often
semantic categories, as in "Rose is to Flower as Herring is to ?". In this
example, the relevant relation is subordinate/superordinate and the superordinate
concept to "Herring" must be retrieved (from semantic memory). Because
the solution of such analogies is knowledge dependent, letter strings
(Hofstadter & The Fluid Analogies Research Group, 1995; Burns, 1996) or
simple geometric figures (Evans, 1968; Winston, 1980; O'Hara, 1992) are
often used as alternatives.
An example for an analogy which can be solved with Evans' ANALOGY
program (Evans, 1968) is given in figure 10.2. In a first step, each figure is
decomposed into simple geometric objects (such as a rectangle and a
triangle). Because decomposition is ambiguous for overlapping objects, Evans
used Gestalt laws and context information for decomposition. For the pairs
of figures (A, B), (A, C), (B, C), (C, 1), ..., (C, 5), relations between objects
and between spatial positions are constructed. These relations are used for
constructing transformation rules A -> B. Transformation includes object
mapping, deletion, and insertion. The rules are abstracted such that they can
be applied to figure C, resulting in a new figure which hopefully corresponds
to one of the given alternatives for D.
An algorithm which can solve geometric analogy problems using
context-dependent re-descriptions was proposed by O'Hara (1992). An example is
given in figure 10.3. All figures must be composed of lines. An algebra is
used to represent/construct figures. Operations are translation, rotation,
reflection, and scaling of objects and glueing of pairs of objects.
In a first step, an initial representation of figure A is constructed. Then,
a representation of B and a transformation t are constructed such that B =
t(A). A representation of C and an isomorphic mapping σ are constructed
such that C = σ(A). The mapping is extended to σ', preserving isomorphism,
[Figure: figures A, B, and C, an empty slot D, and five answer alternatives.]
Fig. 10.2. Example for a Geometric-Analogy Problem (Evans, 1968, p. 333)

[Figure: figures built from a primitive p by repeated rotation (ROT[p,60])
and glueing (GLUE) of objects.]
Fig. 10.3. Context-Dependent Descriptions in Proportional Analogy (O'Hara,
1992)
and D = σ'(B) is constructed. If no mapping C = σ(A) can be found,
the algorithm backtracks and constructs a different representation for B
(re-description).
Letter-string domains are easier to model than geometric domains, and
different approaches to their algebraic representation (Leeuwenberg, 1971;
Indurkhya, 1992) have been proposed. The Copycat program of Hofstadter and
The Fluid Analogies Research Group (1995) solves letter-string analogies of
the form ab : abd :: kji : ?.
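The flavor of rule-based letter-string solvers can be illustrated with a deliberately tiny sketch. It knows only two hand-coded rule families and is in no way comparable to Copycat's fluid mapping; the rule names and the rule set are our own illustrative assumptions:

```python
def succ(ch):
    # alphabetic successor of a letter
    return chr(ord(ch) + 1)

RULES = {
    "append_succ": lambda a: a + succ(a[-1]),
    "replace_last_with_succ": lambda a: a[:-1] + succ(a[-1]),
}

def solve(a, b, c):
    for name, rule in RULES.items():
        if rule(a) == b:   # rule explains the base pair A : B
            return rule(c)  # apply the same rule to the target C
    return None             # no known rule fits
```

For instance, solve("abc", "abd", "ijk") infers "replace the last letter by its successor" from the base pair and returns "ijl".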
10.2.2 Causal Analogies
In proportional analogies, a domain is represented by two structured objects
A and B, with a single or a small set of relations t(A) = B which are relevant
for analogical transfer. More complex domains are explanatory structures, for
example for physical phenomena. In the Rutherford analogy, introduced above,
knowledge about the solar system is used to infer the cause why electrons are
rotating around the nucleus of an atom (see fig. 10.4).
Systems realizing mapping and inference for such domains are, for example,
the structure-mapping engine (SME; Falkenhainer et al., 1989) or LISA
(Hummel & Holyoak, 1997). The SME is based on the constraints for
structure mapping proposed by Gentner (1983), which were introduced above.
Note that although the inferred higher-order relation is labeled "cause",
mapping and inference are purely syntactical, as described for proportional
analogies.
[Figure: (a) the relational structure of the solar system: the sun (YELLOW,
HOT, MASSIVE) ATTRACTS the planets and is MORE MASSIVE THAN
them, which CAUSEs that each planet REVOLVES AROUND the sun;
(b) the mapped structure of the atom, with nucleus and electrons in the
corresponding roles.]
Fig. 10.4. The Rutherford Analogy (Gentner, 1983)
10.2.3 Problem Solving and Planning by Analogy
Many people have pointed out that analogy is an important mechanism in
problem solving. For example, Polya (1957) presented numerous examples
of how analogical reasoning can help students learn to prove mathematical
theorems. Psychological studies on analogical problem solving mostly address
solving of mathematical problems (Novick, 1988; Reed, Ackinclose, & Voss,
1990) or program construction (Anderson & Thompson, 1989; Weber, 1996).
Typically, students are presented with worked-out examples which they can
use to solve a new (target) problem. The standard example for programming
is using factorial as base problem to construct sum:
fac(x) = if(eq0(x), 1, *(x, fac(pred(x))))
sum(x) = if(eq0(x), 0, +(x, sum(pred(x)))).
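The analogy can be executed directly: both programs share one recursive structure, and transfer amounts to re-interpreting the base-case value and the combining operator. A small Python sketch (the helper name make_rec is our own):

```python
def make_rec(base_value, combine):
    # shared scheme: if(eq0(x), base_value, combine(x, f(pred(x))))
    def f(x):
        return base_value if x == 0 else combine(x, f(x - 1))
    return f

fac = make_rec(1, lambda x, r: x * r)   # if(eq0(x), 1, *(x, fac(pred(x))))
sum_ = make_rec(0, lambda x, r: x + r)  # if(eq0(x), 0, +(x, sum(pred(x))))
```

Here fac(5) evaluates to 120 and sum_(5) to 15, confirming that the two instantiations behave as factorial and summation.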
Anderson and Thompson (1989) propose an analogy model where
programming problems and solutions are represented as schemes with function
slots representing the intended operation of the program and form slots
representing the syntactical realization. For a target problem, only the function
slot is filled and the form slot is inferred from the given base problem.
Examples of simple word algebra problems used in experiments by Reed
et al. (1990) are given in table 10.2. Analogical problem solving means to
identify the relations between concepts in a given problem with the equation
for calculating the solution. For a new (target) problem, the concepts in the
new text have to be mapped to the concepts in the base problem, and then
the equation can be transferred. Reed et al. (1990) could show that problem
solving success is higher if a base problem which includes the target problem
is used (e.g., the third problem in tab. 10.2 as base and the second as target),
but most subjects did not select the inclusive problems as most helpful when
they could choose themselves. In chapter 11 we report experiments with
different variants of inclusive problems in the water-jug domain.
Table 10.2. Word Algebra Problems (Reed et al., 1990)

A group of people paid $238 to purchase tickets to a play. How many people
were in the group if the tickets cost $14 each?
  $14 = $238/n

A group of people paid $306 to purchase theater tickets. When 7 more people
joined the group, the total cost was $425. How many people were in the
original group if all tickets had the same price?
  $306/n = $425/(n+7)

A group of people paid $70 to watch a basketball game. When 8 more people
joined the group, the total cost was $120. How many people were in the
original group if the larger group received a 20% discount?
  0.8($70/n) = $120/(n+8)
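The three equations in the table can be checked numerically (the second is rearranged into integer form to avoid floating-point issues; the search bound 200 is arbitrary):

```python
n1 = 238 / 14                                  # $14 = $238/n
n2 = next(n for n in range(1, 200)             # $306/n = $425/(n+7)
          if 306 * (n + 7) == 425 * n)
n3 = next(n for n in range(1, 200)             # 0.8($70/n) = $120/(n+8)
          if abs(0.8 * (70 / n) - 120 / (n + 8)) < 1e-9)
```

This gives group sizes of 17, 18, and 7 people, respectively.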
An AI system which addresses problem solving by analogy is Prodigy
(Veloso & Carbonell, 1993). Here, a derivational analogy mechanism is used
(see above). Prodigy/Analogy was, for example, applied to the rocket domain
(see sect. 3.1.4.2 in chap. 3): A planning episode for solving a rocket problem
is stored (e.g., transporting two objects) and indexed with the initial and
goal states. For a new problem (e.g., transporting four objects), the old
episode can be retrieved by matching the current initial and goal states with
the indices of stored episodes, and the retrieved episode is replayed. While
Veloso and Carbonell (1993) could demonstrate efficiency gains of reuse over
planning from scratch for some example domains, Nebel and Koehler (1995)
provided a theoretical analysis showing that in the worst case reuse is not
more efficient than planning from scratch. The first bottleneck is retrieval of
a suitable case and the second is modification of old solutions.
10.3 Programming by Analogy
Program construction can be seen as a special case of problem solving. In
part II we have discussed automatic program synthesis as an area of research
which tries to come up with mechanisms to automate or support at least
routine parts of program development. Programming by analogy is a further
such mechanism, which has been discussed since the beginning of automatic
programming research. For example, Manna and Waldinger (1975) claim that
retention of previously constructed programs is a powerful way to acquire and
store knowledge.
Dershowitz (1986) presented an approach to construct imperative programs
by analogical reasoning. A base problem, e.g., calculating the division
of two real numbers with some tolerance, is given by its specification
and solution. The solution is annotated with additional statements (such as
assert, achieve, purpose) which make it possible to relate specification and
program. For a new problem, e.g., calculating the cube root of a number
with some tolerance, only the specification is given. By mapping the
specifications (see fig. 10.5), the statements in the base program are transformed
into statements of the to-be-constructed target program. Some additional
inference methods are used to generate a program which is correct with
respect to the specification from the initial program which was constructed
by analogy.
Real-Division:
  ASSERT 0 <= c < d, e > 0
  ACHIEVE |c/d - q| < e
  VARYING q

Cube-Root:
  ASSERT a >= 0, e > 0
  ACHIEVE |a^(1/3) - r| < e
  VARYING r

Mapping: q => r, c/d => a^(1/3)
Transformations:
  q -> r,
  u/v -> u^(1/3) (replace each division operator by a cube-root operator),
  c -> a
Fig. 10.5. Base and Target Specification (Dershowitz, 1986)
Both programs are based on the more general strategy of binary search.
As a last step of programming by analogy, Dershowitz proposes to construct
a scheme by abstracting over the programs. For the example above, the
operators for division and cube root are generalized to Φ(u, v), that is, a
second-order variable is introduced. New binary search problems can then be
solved by instantiating the scheme. If program schemes are acquired, deductive
approaches to program synthesis can be applied, for example, the stepwise
refinement approach used in the KIDS system (see sect. 6.2.2.3 in chap. 6).
Crucial for analogical programming is to detect relations between pairs
of programs. The theoretical foundation for constructing mappings are
program morphisms (Burton, 1992; Smith, 1993). A technique to perform such
mappings is anti-unification. A completely implemented system for
programming by analogy, based on second-order anti-unification and generalization
morphisms, was presented by Hasker (1995). For a given specification and
program, the system user provides a program derivation (cf. derivational
analogy), that is, steps for transforming the specification into the program.
Second-order anti-unification is used to detect analogies between pairs of
specifications which are represented as combinator terms. First, he introduces
anti-unification for monadic combinator terms, that is, allowing only unary
terms. Then, he introduces anti-unification for product types. Monadic
combinator terms are not expressive enough to represent programs, while
combinator terms for cartesian product types which allow sub-terms to be deleted
are too general and allow infinitely many minimal generalizations. Therefore,
Hasker introduces relevant combinator terms as a subset of combinators
for cartesian product types which allow introducing pairs of terms but do
not allow ignoring sub-terms. For this class of combinator terms there still
exist possibly large sets of minimal generalizations. Therefore, Hasker
introduces some heuristics and allows for user interaction to guide construction
of a "useful" generalization.
Our own approach to programming by analogy is still at its beginning.
We consider how folding of finite program terms into recursive programs (see
chap. 7) can alternatively be realized by analogy or abstraction. That is,
mapping is performed on pairs of finite programs, where for one program
the recursive generalization is known and for the other not (see chap. 12). In
contrast to most approaches to analogical reasoning, retrieval is not based on
attribute similarity but on structural similarity. We use anti-unification for
retrieval (Plaza, 1995) as well as for mapping and transfer (Hasker, 1995).
Calculating the anti-instance of two terms results in their maximally specific
generalization. Our current approach to anti-unification is much more
restricted than the approach of Hasker (1995). Our restricted approach
guarantees that the minimal generalization of two terms is unique. But for
retrieval based on identifying the maximally specific anti-instance in a
subsumption hierarchy, as proposed by Plaza (1995, see above), typically sets of
candidates for analogical transfer are returned. Here, we must provide
additional information to select a useful base. One possibility is to consider the
size and type of structural overlap between terms. We conducted psychological
experiments to identify some general criteria of structural base/target
relations which allow successful transfer for human problem solvers (see
chap. 11). If a finite program (associated with its recursive generalization) is
selected as base, the term substitutions calculated while constructing the
anti-instance of this program with the target can be applied to the recursive
base program, and thereby the recursive generalization for the target is
obtained (see chap. 12).
We believe that human problem solvers prefer abstract schemes over
concrete base problems to guide solving novel problems and use concrete
problems as guidance only if they are inexperienced in a domain. Therefore, in
our work we focus on abstraction rather than analogy. Our approach to
anti-unification works likewise for concrete program terms and terms containing
object and function variables, that is, schemes. Anti-unifying a new finite
program term with the unfolding of some program scheme results in a further
generalization. Consequently, a hierarchy of abstract schemes develops over
problem solving experience.
Our notion of representing programs as elements of some term algebra
instead of some given programming language (see chap. 7) allows us to address
analogy and abstraction in the domain of programming as well as in more
general domains of problem solving (such as solving blocks-world problems
or other domains considered in planning) within the same approach. That
is, as discussed in chapter 9, we address the acquisition of problem schemes
which represent problem solving strategies (or recursive control rules) from
experience.
10.4 Pointers to Literature
Analogy and case-based reasoning are extensively researched domains. A
bibliography for both areas can be found at http://www.ai-cbr.org. Classical
books on case-based reasoning are Riesbeck and Schank (1989) and Kolodner
(1993). A good source for current research are the proceedings of the
International Conference on Case-Based Reasoning (ICCBR). Cognitive models of
analogy are presented, for example, at the annual conference of the Cognitive
Science Society (CogSci) and in the journal Cognitive Science. A discussion
of metaphors and analogy is presented by Indurkhya (1992). Holland et al.
(1986) discuss induction and analogy from the perspectives of philosophy,
psychology, and AI. A collection of cognitive science papers on similarity and
analogy was presented by Vosniadou and Ortony (1989). Some AI papers on
analogy can be found in Michalski et al. (1986).
11. Structural Similarity in Analogical Transfer
"And what is science about?" "He explained that scientists formulate theories
about how the physical world works, and then test them out by experiments.
As long as the experiments succeed, then the theories hold. If they fail, the
scientists have to find another theory to explain the facts. He says that, with
science, there's this exciting paradox, that disillusionment needn't be defeat.
It's a step forward."
--P. D. James, Death of an Expert Witness, 1977
In this chapter, we present two psychological experiments to demonstrate
that human problem solvers can and do use not only base (source) domains
which are isomorphic to target domains but also rely on non-isomorphic
structures in analogical problem solving. Of course, not every non-isomorphic
relation is suitable for analogical transfer. Therefore, we identified which
degree of structural overlap must exist between two problems to guarantee a
high probability of transfer success. This research is not only of interest in
the context of theories of human problem solving. In the context of our system
IPAL, criteria are needed for deciding whether it is worthwhile to try to
generate a new recursive program by analogical transfer of a program (or by
instantiating a program scheme) given in memory or to generate a new
solution from scratch, using inductive program synthesis. In the following, we
first (sect. 11.1) give an introduction to psychological theories of analogical
problem solving and present the problem domain used in the experiments.
Then we report two experiments on analogical transfer of non-isomorphic
source problems (sect. 11.2 and sect. 11.3). We conclude with a discussion of
the empirical results and possible areas of application (sect. 11.4).¹
11.1 Analogi al Problem Solving
Analogi al reasoning is an often used strategy in everyday and a ademi
problem solving. For example, if a person already has experien e in planning
a trip by train, he/she might transfer this knowledge to planning a trip by
1
This hapter is based on the papers S hmid, Wirth, and Polkehn (2001) and
S hmid, Wirth, and Polkehn (1999).
298 11. Stru tural Similarity in Analogi al Transfer
plane. If a student already has knowledge in solving an equation with one
additive variable, he/she might transfer the solution pro edure to an equation
with a multipli ative variable. For analogi al transfer, a previously solved
problem alled sour e has to be similar to the urrent problem alled
target. While a large number of ommon attributes might help to nd an
analogy, sour e and target have to be stru turally similar for transfer su ess
(Holyoak & Koh, 1987). In the ideal ase, sour e and target are stru turally
identi al (isomorph) but this is seldom true in real-live problem solving.
Analogi al problem solving is ommonly des ribed by the following (possi-
bly intera ting) omponent pro esses (e. g., Keane, Ledgeway, & Du, 1994):
representation of the target problem, retrieval of a previously solved sour e
problem from memory, mapping of the stru tures of sour e and target, trans-
fer of the sour e solution to the target problem, and generalizing over the om-
mon stru ture of sour e and target. The empiri ally best explored pro esses
are retrieval and mapping (see Hummel & Holyoak, 1997, for an overview).
Retrieval of a sour e is assumed to be guided by overall semanti similarity
(i. e., ommon attributes), often hara terized as \super ial" in ontrast to
stru tural similarity (Gentner & Landers, 1985; Holyoak & Koh, 1987; Ross,
1989). Empiri al results show that retrieval is the bottlene k in analogi al
reasoning and often an only be performed su essfully if expli it hints about
a suitable sour e are given (Gi k & Holyoak, 1980; Gentner, Ratterman, &
Forbus, 1993). Therefore, a usual pro edure for studying mapping and trans-
fer is to ir umvent retrieval by expli itly presenting a problem as a helpful
example (Novi k & Holyoak, 1991). In the following, we will give a loser
look at mapping and transfer.
11.1.1 Mapping and Transfer
Mapping is considered the core process in analogical reasoning. The decision
whether two problems are analogous is based on identifying structural
correspondences between them. Mapping is a necessary but not always sufficient
condition for successful transfer (Novick & Holyoak, 1991). There are
numerous empirical studies concerning the mapping process (cf. Hummel
& Holyoak, 1997) and all computational models of analogical reasoning provide
an implementation of this component (Falkenhainer et al., 1989; Keane
et al., 1994; Hummel & Holyoak, 1997). Mapping is typically modelled as
first identifying the corresponding components of source and target and then
carrying over the conceptual structure from the source to the target. For example,
in the Rutherford analogy (the atom is like the solar system), planets
can be mapped to electrons and the sun to the nucleus of an atom, together
with relations such as "revolves around" or "more mass than" (Gentner, 1983).
In the structure mapping theory (Gentner, 1983) it is postulated that mapping
is performed purely syntactically and that it is guided by the principle
of systematicity, preferring mapping of greater portions of structure to
mapping of isolated elements. Alternatively, Holyoak and colleagues postulate
that mapping is constrained by semantic and pragmatic aspects of the
problem (Holyoak & Thagard, 1989; Hummel & Holyoak, 1997). Mapping
might be further constrained such that it results in easy adaptability (Keane,
1996). Currently, it is discussed that a target might be re-represented if
source/target mapping cannot be performed successfully (Hofstadter & The
Fluid Analogies Research Group, 1995; Gentner, Brem, Ferguson, Markman,
Levidow, Wolff, & Forbus, 1997).
Based on the mapping of source and target, the conceptual structure of
the source can be transferred to the target. For example, the explanatory
structure that the planets revolve around the sun because the sun attracts
the planets might be transferred to the domain of atoms. Transfer can be
faulty or incomplete, even if mapping was performed successfully (Novick
& Holyoak, 1991). Negative transfer can also result from a failure in prior
sub-processes: construction of an unsuitable representation of the target,
retrieval of an inappropriate source problem, or incomplete, inconsistent or
inappropriate mapping of source and target (Novick, 1988). Analogical transfer
might lead to the induction of a more general schema which represents an
abstraction over the common structure of source and target (Gick & Holyoak,
1983). For example, when solving the Rutherford analogy, the more general
concept of central force systems might be learned.
11.1.2 Transfer of Non-Isomorphic Source Problems
Our work focusses on analogical transfer in problem solving. There is a
marginal and a crucial difference between general models of analogical reasoning
and models of analogical problem solving. While in general source and
target might be from different domains (between-domain analogies such as the
Rutherford analogy), in analogical problem solving source and target typically
are from the same domain (within-domain analogies, e.g., Vosniadou
& Ortony, 1989). For example, people can use a previously solved algebra
word problem as an example to facilitate solving a new algebra word problem
(Novick & Holyoak, 1991; Reed et al., 1990), or they can use a computer
program with which they are already familiar as an example to construct
a new program (Anderson & Thompson, 1989). While the discrimination of
between- and within-domain analogies is relevant for the question of how a
suitable source can be retrieved, it has no impact on structure mapping if
this process is assumed to be performed purely syntactically.
The more crucial difference between models of analogical reasoning and
of problem solving is that in analogical reasoning transfer is mostly described
by inference (in so-called explanatory analogies, e.g., Gentner, 1983) vs. by
adaptation (in problem solving, e.g., Keane, 1996). In the first case, (higher-order)
relations given for the source are carried over to the target, as in the
explanation given above of why electrons revolve around the nucleus. In analogical
problem solving, on the other hand, most often the complete solution
procedure of the source problem is adapted to the target. Analogical transfer
of a problem solution subsumes the structural "carry-over" along with possible
changes (adaptation) of the solution structure and the application of
the solution procedure. For example, if we are presented with the necessary
operations to isolate a variable in an equation, we can solve a new equation
by adapting the known solution procedure. If the structures of source and target
are identical (isomorphic), transfer can be described as simply replacing the
source concepts by the target concepts in the source solution. For a source
equation 2x + 5 = 9 with solution x = (9 - 5)/2, the target 3x + 4 = 16 can be
solved by (1) mapping the numbers of source and target, that is, 2 is mapped
to 3, 5 to 4, and 9 to 16, and by (2) substituting the corresponding numbers
in the source solution. An example for a source inclusive source/target pair
mapping is given below in figure 11.3.
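The two steps above, mapping the numbers and substituting them into the source solution, can be written down directly. The following sketch is our illustration only (it is not part of any system described in this book); the template name and the role labels a, b, c are assumptions for the example:

```python
# A minimal sketch of transfer as substitution (illustrative only).
# The source solution is abstracted into the template x = (c - b) / a;
# the number mapping re-instantiates it for the target.

def solve_template(a, b, c):
    """Solution template abstracted from the source equation a*x + b = c."""
    return (c - b) / a

source = {"a": 2, "b": 5, "c": 9}          # source equation 2x + 5 = 9
mapping = {2: 3, 5: 4, 9: 16}              # source number -> target number
target = {role: mapping[value] for role, value in source.items()}

x_source = solve_template(**source)        # (9 - 5) / 2 = 2.0
x_target = solve_template(**target)        # (16 - 4) / 3 = 4.0
```

Because source and target are isomorphic, the template itself is reused unchanged; only its instantiation differs.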
We are especially interested in conditions for successful transfer of non-isomorphic
source solutions. There are a variety of non-isomorphic source/target
relations discussed in the literature: First, there are different types of mapping
relations: one-to-one mappings (isomorphism), many-to-one, and one-to-many
mappings (Spellman & Holyoak, 1996). Secondly, there are different
types and degrees of structural overlap (see fig. 11.1): a source might be "completely
contained" in the target (source inclusiveness; Reed et al., 1990), or
a source might represent all concepts needed for solving the target together
with some additional concepts (target exhaustiveness; Gentner, 1980). These
are two special cases of structural overlap between source and target. It seems
plausible to assume that if the overlap is too small, a problem is no longer
helpful for solving the target. Such a problem would not be characterized
as a source problem. While there are some empirical studies investigating
transfer of non-isomorphic sources (Reed et al., 1990; Novick & Hmelo, 1994;
Gholson, Smither, Buhrman, Duncan, & Pierce, 1996; Spellman & Holyoak,
1996), there is no systematic investigation of the structural relation between
source and target which is necessary for successful transfer. Our experimental
work focusses on the impact of different types and degrees of structural overlap
on transfer success; that is, we currently are only considering one-to-one
mappings.
11.1.3 Structural Representation of Problems
To determine the structural relation between source and target we have to
rely on explicitly defined representations of problem structures. In cognitive
models of analogical reasoning, problems are typically represented by schemas
(SME, Falkenhainer et al., 1989; IAM, Keane et al., 1994) or by semantic
nets (ACME, Holyoak & Thagard, 1989; LISA, Hummel & Holyoak, 1997).
From a more abstract view, these representations correspond to graphs, where
concepts are represented as nodes and relations between them as arcs. Examples
for graphs are given in figure 11.1. For actual problems, nodes (and
possibly arcs) are labelled. A graph representation of the solar system contains
for instance a node labelled with the relation more mass than connected
to a node planet-1 and to a node sun.

Fig. 11.1. Types and degrees of structural overlap between source and target
problems (panels: Isomorphism, Target Exhaustiveness, Source Inclusiveness;
each panel shows a source graph and a target graph)
While explicit representations are often presented for explanatory analogies
(Gick & Holyoak, 1980; Gentner, 1983), this is not true for problem
solving. For algebra problems (Novick & Holyoak, 1991; Reed et al., 1990),
the mathematical equations can be used to represent the problem structure
(see fig. 11.3). In general, when investigating such problems as the Tower of
Hanoi (Clement & Richard, 1997; Simon & Hayes, 1976), both the structure
of a problem and the problem solving operators, possibly together with application
conditions and constraints (Gholson et al., 1996), have to be taken
into account.

In the classical transformational view of analogical problem solving (Gentner,
1983), little work has been done which addresses how to model analogical
transfer of problems involving several solution steps. In artificial intelligence,
Carbonell (1986) proposed derivational analogy for multi-step problems:
he models problem solving by analogy as deriving a solution by replay
of an already known solution process, checking on the way whether the conditions
for the operator applications of the source still hold when solving the target.
In the following, we nevertheless adopt the transformational approach, describing
analogical problem solving by mapping and transfer. That is, we will
assume that both the (declarative) description of the problem and procedural
information are structurally represented and that a target problem is solved
by adapting the source solution to the target problem based on structure
mapping. We assume that problems and solutions are represented in schemas
capturing declarative as well as procedural aspects, as argued for example by
Anderson and Thompson (1989) and Rumelhart and Norman (1981).
When specifying the representation of a problem, we have to decide on
its format as well as its content (Gick & Holyoak, 1980). In general, it is not
possible to determine all possible aspects associated with a problem; that
is, we cannot claim complete representations. We adopt the position of Gick
and Holyoak to model at least all aspects which are relevant for successful
transfer. A component of the (declarative) description of a problem is relevant
if it is necessary for generating the operation sequence which solves the
problem. Furthermore, only the operation sequence which solves the problem
is regarded as relevant procedural information. A successful problem solver
has to focus on these relevant aspects of a problem and should ignore all
other aspects. Of course, we do not assume that human problem solvers in
general represent only relevant or all relevant aspects of a problem. Our goal
is to systematically control variants of structural source/target relations and
their impact on transfer; that is, our representational assumptions are not
empirical claims but a means for task analysis. We want to construct "normatively
complete" graph representations of problems to explore the impact
of different analytically given structural source/target relations on empirically
observable transfer success.
In the following, we will first introduce our problem solving domain,
water redistribution tasks, and our problem representations. Then we will
present two experiments. In the first experiment we will show that problem
solvers can transfer a source solution with moderate structural similarity to
the target if the problems do not vary in superficial features. In the second
experiment we investigate a variety of different structural overlaps between
source and target.
11.1.4 Non-Isomorphic Variants in a Water Redistribution Domain
Because we focus on transfer of declarative and procedural aspects of problems,
we constructed a problem type that can be classified as interpolation
problems, like the Tower of Hanoi problems (Simon & Hayes, 1976), the water
jug problems (Atwood & Polson, 1976), or missionary-cannibal problems
(Reed, Ernst, & Banerji, 1974; Gholson et al., 1996). Problem solving means
to find the correct multi-step sequence of operators that transforms an initial
state into the goal state. Interpolation problems have well-defined initial
and goal states and usually one well-defined multi-step solution. Thus, they
are as suitable for systematically analyzing their structure as, for instance,
mathematical problems (Reed et al., 1990), with the advantage that they are
not likely to activate school-trained mathematical pre-knowledge.
We constructed a water redistribution domain that is similar to but more
complex than the water jug problems described by Atwood and Polson (1976).
In the initial state three (or four) jugs of different capacity are given. The
jugs are initially filled with different amounts of water (initial quantities).
The water has to be redistributed between the jugs in such a way that the
pre-specified goal quantities are obtained. For example, given are the three
jugs A, B and C with capacities c_A = 36, c_B = 45, and c_C = 54 (units). In
the initial state the quantities are q_A = 16, q_B = 27, and q_C = 34. To reach the
goal state the values of these quantities must be transformed into q_A = 25,
q_B = 0, and q_C = 52 by redistributing the water among the different jugs (see
fig. 11.2).
The task is to determine the shortest sequence of operators that transforms
the initial quantities into the goal quantities. The only legal operator
available is a pour-operator (the redistribute operator) that is restricted by
the following conditions: (1) The only water to pour is the water contained
in the jugs in the initial state. (2) Water can be poured only from a non-empty
'pour out'-jug into an incompletely filled 'pour in'-jug. (3) Pouring
always results either in filling the 'pour in'-jug up to its capacity, possibly
leaving a rest of water in the 'pour out'-jug, or in emptying the 'pour out'-jug,
possibly leaving remaining free capacity in the 'pour in'-jug. (4) The amount
of water that is poured out of the 'pour out'-jug is always the same amount
that is filled into the 'pour in'-jug.

  name/position:     A    B    C
  capacity:          36   45   54
  initial quantity:  16   27   34
  goal quantity:     25   0    52
  Solution: pour(C,B), pour(B,A), pour(A,C), pour(B,A)
Fig. 11.2. A water redistribution problem

Formally, this pour-operator is defined in the following way:
IF not(q_X(t) = 0) AND not(q_Y(t) = c_Y) THEN pour(X,Y), resulting in:
  IF q_X(t) <= c_Y - q_Y(t)
  THEN q_Y(t+1) := q_Y(t) + q_X(t)
       q_X(t+1) := q_X(t) - q_X(t)          (i.e., 0, emptying jug X)
  ELSE q_X(t+1) := q_X(t) - (c_Y - q_Y(t))
       q_Y(t+1) := q_Y(t) + (c_Y - q_Y(t))  (i.e., c_Y, filling jug Y)

with
  q_X(t): quantity of jug X at solution step t,
  c_X: capacity of jug X,
  (c_X - q_X(t)): remaining free capacity of jug X.
Because we are interested in which types and degrees of structural overlap
are sufficient for successful analogical transfer, we have to ensure that subjects
really refer to the source for solving the target problem. That is, the problems
should be complex enough to ensure that the correct solution cannot be
found by trial and error, and difficult enough to ensure that the abstract
solution principle is not immediately inferable. Therefore, we constructed
redistribution problems for which there exists only a single (for two problems, two)
shortest operator sequence (in problem spaces with over 1000 states and more
than 50 cycle-free solution paths).
To construct a structural representation of a problem we were guided by
the following principles: (a) the goal quantity of each jug can be described
as the initial quantity transformed by a certain (shortest) sequence of operators;
(b) relevant declarative attributes (capacities, quantities, and relations
between these attributes) of the initial and the goal state determine the solution
sequence of operators; and (c) each solution step can be described by the
definition of the pour-operator given above. In terms of these principles, operator
applications can be re-formulated as equations where a current quantity
can be expressed by adding or subtracting amounts of water. For example,
using the parameters introduced above, the first operator pour(C,B) of the
example presented in figure 11.2 transforms the quantities of jugs B and C
of the initial state t = 0 into quantities of state t = 1 in the following way:
q_B(0) = 27, q_C(0) = 34, c_B = 45, c_C = 54:

BECAUSE not(34 = 0) AND not(27 = 45),
pour(C,B) at t = 0 results in:
BECAUSE not(34 <= (45 - 27)):
  q_B(1) = 27 + (45 - 27) = 45
  q_C(1) = 34 - (45 - 27) = 16.
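The pour-operator definition can be run directly as code. The sketch below is our illustration (not the implementation used in this book); jug quantities and capacities are kept in plain dictionaries, and the loop replays the solution of figure 11.2:

```python
# A minimal executable sketch of the pour-operator defined above
# (illustrative only).

def pour(q, cap, x, y):
    """Apply pour(x, y): either empty jug x or fill jug y, whichever
    transfers less water. Raises ValueError if the preconditions
    not(q_x = 0) and not(q_y = c_y) are violated."""
    if q[x] == 0 or q[y] == cap[y]:
        raise ValueError("pour(%s,%s) is not applicable" % (x, y))
    amount = min(q[x], cap[y] - q[y])
    q = dict(q)  # work on a copy; one state per solution step t
    q[x] -= amount
    q[y] += amount
    return q

# Replay the solution of figure 11.2:
cap = {"A": 36, "B": 45, "C": 54}
state = {"A": 16, "B": 27, "C": 34}
for x, y in [("C", "B"), ("B", "A"), ("A", "C"), ("B", "A")]:
    state = pour(state, cap, x, y)
# state now equals the goal quantities {"A": 25, "B": 0, "C": 52}
```

The `min` expression collapses the THEN/ELSE case distinction of the formal definition: the transferred amount is either all of jug x's content (emptying x) or jug y's remaining free capacity (filling y).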
Redistribution problems define a specific goal quantity for each jug. For
this reason, there have to be as many equations constructed as there are jugs
involved. This group of equations represents all relevant procedural aspects
of the problem; that is, it represents the sequence of operators leading to the
goal state. For example, the three equations of the three jugs in figure 11.2
are presented in table 11.1.
Table 11.1. Relevant information for solving the source problem

(a) Procedural: pour(C,B), pour(B,A), pour(A,C), pour(B,A)
q_A(4) = q_A(0) + (c_A - q_A(0)) - c_A + [c_B - (c_A - q_A(0))]
q_B(4) = q_B(0) + (c_B - q_B(0)) - (c_A - q_A(0)) - [c_B - (c_A - q_A(0))]
q_C(4) = q_C(0) - (c_B - q_B(0)) + c_A

(b) Declarative (constraints)
c_B >= (c_A - q_A(0))
c_A = 2 * (c_B - q_B(0))
q_C(0) >= (c_B - q_B(0))
c_C < q_C(0) + c_A
Each goal state is expressed by the values given in the initial state. On
the right side of the equality sign these given values are combined in a way
that transforms the initial quantity of each jug into its goal quantity. Certain
constraints between the initial values have to be satisfied so that these
equations are balanced. These constraints can be analytically derived from
the equations given in table 11.1 and they constitute the relevant declarative
attributes of the problem. The constraints for water redistribution problems
have the same function as the problem solving constraints given for example
for the radiation or fortress problems (Holyoak & Koh, 1987) or the
missionary-cannibal problems (Reed et al., 1974; Gholson et al., 1996).
As an example of how these constraints can be derived from the equations,
one can easily see that the last pour operator of the solution (pouring a
certain amount of water from B into A) is only executable if the relation
c_B >= (c_A - q_A(0)) holds. As a second constraint, the goal quantity of jug C
could be described by the following equation: q_C(4) = q_C(0) + (c_B - q_B(0)).
But the expression (c_B - q_B(0)) does not represent the quantity in jug B. It
represents the remaining free capacity of this jug, which, of course, cannot
be poured into another jug. Because the relation c_A = 2 * (c_B - q_B(0)) holds,
we conclude that the value of the remaining free capacity of jug B has to be
subtracted from the quantity of jug C (only possible if q_C(0) >= (c_B - q_B(0))
holds) and that the double of this value (c_A) has to be added to the quantity
in jug C by pouring the capacity of jug A into jug C. Additionally, the relation
c_C < q_C(0) + c_A determines the relative order of the two pour-operators: you
have to subtract (c_B - q_B(0)) before you can add c_A to q_C(0).
The equations describing the transformations for each jug and the (in-)equations
describing the constraints of the problem are sufficient to represent
all relevant declarative and procedural information of the problem.
Thus, transforming all of them into one graphical representation leads to a
normatively complete representation of the problem which can be used for a
task analytical determination of the overlap between two problem structures.
The equations and in-equations for all water redistribution problems used in
our experiments are given in appendix C.7.
11.1.5 Measurement of Structural Overlap
Structural similarity between two graphs G and H is usually calculated as the
size of the greatest common subgraph of G and H in relation to the size of the
greater of both graphs (Schädler & Wysotzki, 1999; Bunke & Messmer, 1994).
To calculate graph distance by formula 11.1 we introduce directed "empty"
arcs between all pairs of nodes where no arcs exist. The size of the common
subgraph is expressed by the sum of common arcs V_GH and common nodes N_GH.

d(G,H) = 1 - (V_GH + N_GH) / (max(V_G, V_H) + max(N_G, N_H))    (11.1)
The graph distance can assume values between 0 and 1, indicating an isomorphic
relation between two graphs with d(G,H) = 0 and no systematic
relation between G and H with d(G,H) = 1. For the two partially isomorphic
graphs in figure 11.3 we obtain (for graph 11.3.a as G and graph 11.3.b as H):

V_G = 42,  V_H = 72  (V_G < V_H)
N_G = 7,   N_H = 9   (N_G < N_H)
V_GH = 30, N_GH = 6

resulting in the difference between G and H

d(G,H) = 1 - (30 + 6) / (72 + 9) = 0.56.

The value of d(G,H) = 0.56 indicates that the size of the common subgraph
of G and H is a bit less than half of the size of the larger graph H. Of
course, this absolute value of the distance between two problems is highly
dependent on the kind of representation of their structures. Thus, for task
analysis only the ordinal information of these values should be considered.
Fig. 11.3. Graphs for the equations 2 * x + 5 = 9 (a) and 3 * x + (6 - 2) = 16 (b)
(node labels: =, +, *, -, x, and the numbers; arc labels: a1, a2)
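Once the common-subgraph counts are known, formula 11.1 is a one-liner. The sketch below is our illustration, under the assumption that the overlap counts V_GH and N_GH are supplied from elsewhere (computing a greatest common subgraph is NP-hard in general, so it is not attempted here):

```python
# A small sketch of formula 11.1 (illustrative only):
# d(G,H) = 1 - (V_GH + N_GH) / (max(V_G, V_H) + max(N_G, N_H)).

def graph_distance(v_g, n_g, v_h, n_h, v_gh, n_gh):
    """Distance in [0, 1]: 0 for isomorphic graphs,
    1 for no systematic relation."""
    return 1 - (v_gh + n_gh) / (max(v_g, v_h) + max(n_g, n_h))

# Values for the graphs of figure 11.3; with "empty" arcs added, a
# directed graph with n nodes has n * (n - 1) arcs (7 nodes -> 42,
# 9 nodes -> 72).
d = graph_distance(v_g=42, n_g=7, v_h=72, n_h=9, v_gh=30, n_gh=6)
# d = 1 - 36/81, i.e. roughly 0.56
```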
11.2 Experiment 1

Experiment 1 was designed to investigate the suitability of the water redistribution
domain for studying analogical transfer in problem solving, to get some
initial information about transfer of isomorphic vs. non-isomorphic sources,
and to check for a possible interaction of superficial with structural similarity.
The problems were constructed in such a way that it is highly improbable
that the correct optimal operator sequence can be found by trial-and-error
or that the general solution principle is immediately inferable. To investigate
analogical transfer, information about the mapping of source and target can
be given before the target problem has to be solved (Novick & Holyoak,
1991). This can be done by pointing out the relevant properties of a problem
(conceptual mapping) and by giving information about the corresponding
jugs in source and target ("numerical" mapping). Additionally, information
about the problem solving strategy of subjects can be obtained by analyzing
log-files of subjects' problem solving behavior and by testing mapping after
subjects solved the target problem.
To get an indication of the degree of structural similarity between a
source and a target which is necessary for transfer success, an isomorphic
source/target pair was contrasted with a partially isomorphic source/target
pair with "moderately high" structural overlap. This should give us some
information about the range of source/target similarities which should be
investigated more closely (in experiment 2).

To control possible interactions of structural and superficial similarity, we
discriminate structure preserving and structure violating variants of target
problems (Holyoak & Koh, 1987): For a given source problem with three jugs
(see fig. 11.2), a target problem with four jugs clearly changes the superficial
similarity in contrast to a target problem also consisting of three jugs. But this
additional jug might or might not result in a change of the problem structure
reflected in the sequence of pour operations necessary for solving the problem.
In contrast, other superficial variations like changing the sequence of jugs
from small/medium/large to large/medium/small are structure preserving,
but clearly affect superficial similarity. If the introduction of an additional
jug does not lead to additional deviations from the surface appearance of the
source problem, we regard the surface as "stable". As a consequence, there
are four possible source/target variations: structure preserving problems with
stable or changed surface and structure violating problems with stable or
changed surface.
11.2.1 Method
Material. As source problem, the problem given in figure 11.2 was used.
We constructed five different redistribution problems as target problems (see
appendix C.7):

Problem 1: a three jug problem solvable with four operators which is isomorphic
to the source problem (condition isomorph/no surface change),
Problem 2: a three jug problem solvable with four operators which is isomorphic
to the source problem, but has a surface variation by switching the
positions of the small jug (A) and the medium jug (B) and renaming
these jugs accordingly (A → B, B → A) (condition isomorph/small surface
change),
Problem 3: a three jug problem solvable with four operators which is isomorphic
to the source problem, but has a surface variation by switching the
positions of all jugs (A → B, B → C, C → A) (condition isomorph/large
surface change),
Problem 4: a four jug problem solvable with five operators which has a moderately
high structural overlap with the source (condition partial isomorph/no
surface change), and
Problem 5: a four jug problem solvable with five operators which is isomorphic
to problem 4, but has a surface variation by switching the positions of two
jugs (A → B, B → A) (condition partial isomorph/small surface change).
Because of the exploratory nature of this first experiment, we did not introduce
a complete crossing of structure and surface similarity. The main
question was whether subjects could successfully use a partial isomorph in
analogical transfer.

In addition to the source and target problems, an "initial" problem which
is isomorphic to the source was constructed. This problem was introduced
before the presentation of the source problem for the following reasons: first,
subjects should become familiar with interacting with the problem solving
environment (the experiment was fully computer-based, see below); and second,
subjects should be "primed" to use analogy as a solution strategy by
getting demonstrated how the source problem could be solved with the help of
the initial problem.
Subjects. Subjects were 60 pupils (31 male and 29 female) of a gymnasium
in Berlin, Germany. Their average age was 17.4 years (minimum 14 and
maximum 19 years).

Procedure. The experiment was fully computer-based and conducted at
the school's PC cluster. The overall duration of an experimental session was
about 45 minutes. All interactions with the program were recorded in log-files.
One session consisted of the following parts:
Instruction and Training. First, general instructions were given, informing
about the following tasks and the water redistribution problems. Subjects
were introduced to the setting of the screen layout (graphics of the jugs)
and the handling of interactions with the program (performing a pour
operation, undoing an operation).

Initial problem. Afterwards, the subjects were asked to solve the initial problem
with tutorial guidance (performed by the program). In case of a correct
solution the tutor asked the subject to repeat it without any error. The
tutor intervened if subjects had performed four steps without success, or
had started two new attempts to solve the problem, or if they needed
longer than three minutes. This part was finished when the problem was
correctly solved twice without tutorial help.

Source problem. When introducing the source problem, first some hints
about the relevant problem aspects for figuring out a shortest operator
sequence were given (by thinking about the goal quantities in terms of
relations to initial quantities and maximum capacities). Afterwards, the
correspondence between the three jugs of the initial problem and the
three jugs of the source problem was pointed out. Now the screen offered
an additional button to retrieve the solution sequence of the initial
problem. The initial solution could be retrieved as often as the subject
desired. To perform an operation for solving the source problem, this
window had to be closed. Tutorial guidance was identical to the initial
problem, but subjects had to solve the source problem only once.

Target problem. Every subject randomly received one of the five target problems.
Again, mapping hints (relevant problem aspects, correspondence
between jugs of the source and the target problem) were given. The
source solution could be retrieved without limit, but, again, the subjects
could only proceed to solve the target problem after this window was
closed. Thereby, the number and time of references to the source problem
could be obtained from the log-files. The subjects had a maximum of 10 minutes
to solve the target problem.

Mapping Control. Mapping success was controlled by a short test where subjects
had to give the relations between the jugs of the source and the target
problem.

Questionnaire. Finally, mathematical skills (last mark in mathematics, subjective
rating of mathematical knowledge, interest in mathematics) and
personal data (age and gender) were obtained.
11.2.2 Results and Discussion
Overall, there was a monotonic decrease in problem solving success over the
five target problems (see tab. 11.2, lines n and solved). To make sure that
problem solving success was determined by successful transfer of the source
and not by some other problem solving strategy, only subjects which gave a
correct mapping were considered, and a solution was rated as a transfer success
if the generated solution sequence was the (unique) shortest solution (see
tab. 11.2, lines correct mapping and shortest solution). The variable "transfer
success" was calculated as the percentage of subjects with a correct mapping
which generated the shortest solution (see tab. 11.2, line transfer rate).
Table 11.2. Results of Experiment 1

Problem             1      2      3      4      5
structure (a)       ISO    ISO    ISO    P-ISO  P-ISO
surface (b)         no     small  large  no     small
n                   12     12     12     12     12
solved              12     11     9      8      6
correct mapping     8      12     10     9      6
shortest solution   8      10     8      8      3
transfer rate       100%   83.3%  80%    88.9%  50%

(a) ISO = isomorph, P-ISO = partial isomorph
(b) no, small, large change of surface
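The transfer rate line can be recomputed directly from the two count lines above it; a quick check in Python (counts copied from table 11.2):

```python
# Transfer rate per condition: percentage of subjects with a correct mapping
# who also generated the (unique) shortest solution (counts from table 11.2).
correct_mapping   = [8, 12, 10, 9, 6]
shortest_solution = [8, 10,  8, 8, 3]

transfer_rates = [100 * s / c for s, c in zip(shortest_solution, correct_mapping)]
print(["%.1f%%" % r for r in transfer_rates])
# → ['100.0%', '83.3%', '80.0%', '88.9%', '50.0%']
```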
Log-file analysis showed that none of the subjects who performed a correct
mapping retrieved the source solution while solving the target problem. It
is highly improbable that the correct shortest solution was found randomly
or that subjects could infer the general solution principle when solving the
initial and source problem. As a consequence, we have to assume that these
subjects solved the target by analogical transfer of the memorized (four-step)
source solution.
Transfer success decreased nearly monotonically over the five conditions.
Exceptions were problems 3 (isomorph/high surface change) and 4 (partial
isomorph/no surface change). The high percentage of transfer success for
problem 4 indicates clearly that subjects can successfully transfer a non-
isomorphic source problem. Even for problem 5 (partial isomorph/surface
change) transfer success was 50%.
There is no overall significant difference between the five experimental
conditions (exact 5 × 2 polynomial test²: P = 0.21). To control interactions
between structural and superficial similarity, different contrasts were calcu-
lated which we discuss in the following.
11.2.2.1 Isomorph Structure/Change in Surface. There is no signifi-
cant impact of the variation of superficial features between conditions 1, 2 and
3 (exact binomial tests: 1 vs. 2 with P = 0.225; 2 vs. 3 with P = 0.558; and 1
vs. 3 with P = 0.168). This finding is in contrast to Reed et al. (1990). Reed
and colleagues showed that subjects' rating of the suitability of a problem for
solving a given target is highly influenced by superficial attributes. However,
these ratings were obtained before subjects had to solve the target problem.
This indicates that superficial similarity has a high impact on retrieval of a
suitable problem but not on transfer success, as was also shown by Holyoak
and Koh (1987) when contrasting structure-preserving vs. structure-violating
differences.
11.2.2.2 Change in Structure/Stable Surface. Changes in superficial
attributes between conditions 1 and 4 (isomorph vs. partial isomorph, both
with no surface change) respectively 2 and 5 (isomorph vs. partial isomorph,
both with small surface change) can be regarded as stable, because the addi-
tional jug (in condition 4 vs. 1 and condition 5 vs. 2) influences only structural
attributes. That is, by contrasting these conditions we measure the influence
of structural similarity on transfer success. There is no significant difference
between conditions 1 and 4 (exact binomial test: 1 vs. 4 with P = 0.394). But
there is a significant difference between conditions 2 and 5 (exact binomial
test: 2 vs. 5 with P = 0.039): A partial isomorph can be useful for analogical
transfer if it shares superficial attributes with the target, but transfer diffi-
culty is high if source and target vary in superficial attributes, even if the
mapping is explicitly given!
11.2.2.3 Change in Structure/Change in Surface. The variation of
superficial attributes between conditions 4 and 5 has a significant impact
(exact binomial test: P = 0.039). As shown for the contrast of conditions 2
and 5, if problems are not isomorphic, superficial attributes gain importance.
Of course, this finding is restricted to the special type of source/target pairs
and variation of superficial attributes we investigated, that is, to cases where
the target is "larger" than the source and where jugs are always named
A, B, C (D), but the names can be associated with jugs of different sizes.
In this special case, the intuitive constraint of mapping jugs with identical
names and positions has to be overcome and kept active during transfer.

² For this test, the result has to be tested against the number of cell-value distri-
butions corresponding with the given row and column values. The procedure for
obtaining this number was implemented by Knut Polkehn.
To summarize, this first explorative experiment shows that water redis-
tribution problems are suitable for investigating analogical transfer: most
subjects could solve the target problems, but solution success is sensitive to
variations of source/target similarity. As a consequence of the interaction
found between superficial and structural similarity, in the following, super-
ficial source/target similarity will be kept high for all target problems and we
will investigate only target problems varying in their structural similarity to
the source (i.e., with stable surface). Finally, the high solution success for
the partial isomorph of "moderately high" structural similarity (condition 4)
indicates that we can investigate source/target pairs with a smaller degree
of structural overlap.
11.3 Experiment 2
In the second experiment we investigated a finer variation of different types
and degrees of structural overlap. We focused on two hypotheses about the
influence of structural similarity on transfer:
(1) We have been interested in the possibly different effects of differ-
ent types of structural overlap on transfer, that is, target exhaustiveness
versus source inclusiveness of problems (cf. fig. 11.1). If one considers a
problem structure as consisting of only relevant declarative and procedural
information, different types of structural relations result in differences in the
amount of both common relevant declarative and common relevant proce-
dural information. Changing the amount of common declarative information
requires ignoring declarative source information in the case of target exhaus-
tiveness and additionally identifying declarative target information in the
case of source inclusiveness. Changing the amount of common procedural
information means changing the length of the solution (i.e., the minimal
number of pour-operators necessary to solve the problem).
Thus, compared to the source solution, target exhaustiveness results in
a shorter target solution while the target solution is longer for source in-
clusiveness. Assuming that ignoring information is easier than additionally
identifying information (Schmid, Mercy, & Wysotzki, 1998) and assuming
that a shorter target solution is easier to find than a longer one, we expect
that successful transfer should be more probable for target exhaustive than
for source inclusive problems. In line with this assumption, Reed et al. (1974)
reported increasing transfer frequencies for target exhaustive relations, if sub-
jects were informed about the correspondences between source and target (see
also Reed et al., 1990).
(2) While source inclusiveness and target exhaustiveness are special types
of structural overlap, we have also been interested in the overall impact of
the degree of structural overlap on transfer. We wanted to determine the
minimum size of the common substructure of source and target problem
that makes the source useful for analogically solving the target. Or in other
words, we wanted to measure the degree of the distance between source and
target structures up to which the source solution is transferable to the target
problem.
11.3.1 Method
Material. As initial and source problem we used the same problems as in
experiment 1. As target problems we constructed the following five different redis-
tribution problems with constant superficial attributes (see appendix C.7):
Problem 1: a three-jug problem solvable with three operators whose struc-
ture was completely contained in the structure of the source problem
(condition target exhaustiveness),
Problem 2: a three-jug problem solvable with five operators whose struc-
ture contained completely the structure of the source problem (condition
source inclusiveness),
Problem 3: the partial isomorph problem used before (condition 4 in experi-
ment 1), a four-jug problem solvable with five operators whose structure
completely contains the structure of the source problem; this problem
shares a smaller structural overlap with the source than problem 2 (con-
dition "high" structural overlap), and
Problems 4 and 5: more four-jug problems solvable with five operators that
have decreasing structural overlap with the source; the structures of
source and target share a common substructure, but both structures have
additional aspects (conditions "medium" structural overlap and "low"
structural overlap).
For all problem structures, distances to the source structure were calcu-
lated using formula 1 (see appendix C.7). Because of the intrinsic constraints
of the water redistribution domain, it was not possible to obtain equi-distance
between problems. Nevertheless, the problems we constructed served as good
candidates for testing our hypotheses.
To investigate the effect of the type of structural source/target relation,
the distances of problem 1 (target exhaustive) and problem 2 (source in-
clusive) to the source have been kept as low as possible and as similar as
possible: d_S1 = 0.16 and d_S2 = 0.17. As discussed above, it can be expected
that target exhaustiveness leads to a higher probability of transfer success
than source inclusiveness.
Although problem 3 is a source inclusive problem, we used it as an anchor
problem for varying the degree of structural overlap. Target problem 4 differed
moderately from target problem 3 in its distance value (d_S3 = 0.37 vs.
d_S4 = 0.55) while target problem 5 differed from target problem 4 only slightly
(d_S4 = 0.55 vs. d_S5 = 0.59). Thus, one could expect strong differences in
transfer rates between conditions 3 and 4 and nearly the same transfer rates
for conditions 4 and 5. We name problems 3, 4, and 5 as "high", "medium",
and "low" overlap in accordance with the ranking of their distances to the
source problem.
Subjects. Subjects were 70 pupils (18 male and 52 female) of a gymnasium in
Berlin, Germany. Their average age was 16.3 years (minimum 16 and maximum
17 years). The data of 2 subjects were not logged due to technical problems.
Thus, 68 log-files were available for data analysis.
Procedure. The procedure was the same as in experiment 1. Each subject
had to solve one initial problem and one isomorphic source problem first and
was then presented one of the five target problems.
11.3.2 Results and Discussion
49 subjects mapped the jugs from the source to the target correctly. Thus 19
subjects had to be excluded from analysis. Table 11.3 shows the frequencies
of subjects who performed the correct mapping between source and target
and generated the shortest solution sequence, i.e., solved the target problem
analogically (cf. experiment 1).
Table 11.3. Results of Experiment 2

Problem             1           2          3        4         5
structure           target      source     "high"   "medium"  "low"
                    exhaustive  inclusive  overlap  overlap   overlap
n                   11          10         16       15        16
correct mapping     7           8          13       9         12
shortest solution   6           7          10       5         1
transfer rate       86%         88%        77%      56%       8%
11.3.2.1 Type of Structural Relation. There is no difference in solving
frequencies between conditions 1 and 2 (exact binomial test, P = 0.607). That
is, there is no indication of an effect of the type of structural source/target
relation on transfer success. In contrast to the findings of Reed et al. (1974)
and Reed et al. (1990), it seems that the degree of structural overlap has a
much larger influence than the type of structural relation between source and
target. Furthermore, looking only at the procedural aspect of our problems,
we could not find an impact of the length of the required operator sequence
(three steps for problem 1 vs. five steps for problem 2) on solution success.
A possible explanation might be that the type of structural relation has
no effect if problems are very similar to the source. It is clearly a topic for
further investigation to check whether target exhaustive problems become
superior to source inclusive problems with increasing source/target distances.
A general superiority of degree over type of overlap could be explained
by assuming mapping to be a symmetrical instead of an asymmetrical (source
to target) process. Hummel and Holyoak (1997) argue that during retrieval
the target representation "drives" the process. In contrast, during mapping
the role of the "driver" can switch from target to source and vice versa.
During transfer the source structure again takes control of the process. Thus,
an interaction between these processes must lead to decreasing differences
between effects of source inclusiveness and effects of target exhaustiveness on
analogical transfer.
11.3.2.2 Degree of Structural Overlap. Each problem of conditions 3
to 5 has been solvable with at least five operators. That means, there was
one additional operator needed compared to the source solution. Results for
conditions 3 to 5 show a significant difference between the effects of differ-
ent degrees of structural source/target overlap on solution frequency (exact
3 × 2 test, P = 0.002). Comparing each single frequency against each other
indicates that the crucial difference between structural distances is between
conditions 4 and 5 (exact binomial tests, conditions 3 and 4: P = 0.1; condi-
tions 3 and 5: P < 0.001; conditions 4 and 5: P = 0.0001).
This finding is surprising taking into account that the difference of struc-
tural distance between conditions 3 and 4 is much larger than between con-
ditions 4 and 5 (d_S3 = 0.37, d_S4 = 0.55, d_S5 = 0.59). A possible explanation
is that with problem 5 we have reached the margin of the range of structural
overlap where a problem can be helpful for solving a target problem. A con-
jecture worth further investigation is that a problem can be considered as a
suitable source if it shares at least fifty percent of its structure with the tar-
get! An alternative hypothesis is that not the relative but rather the absolute
size of structural overlap determines transfer success, that is, that a source
is no longer helpful to solve the target if the number of nodes contained in
the common sub-structure gets smaller than some fixed lower limit.
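The exact tests above compare solution frequencies between two conditions. As an illustration of how such an exact 2 × 2 comparison can be computed with nothing but the standard library, the sketch below implements a two-sided Fisher exact test over the hypergeometric distribution and applies it to the condition 4 vs. 5 counts from table 11.3 (5 of 9 vs. 1 of 12 shortest solutions). Note that this is not necessarily the procedure used in the study (the text reports exact binomial tests), so the resulting p-value need not reproduce the reported P = 0.0001; it does, however, also indicate a significant difference.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]]:
    sums the hypergeometric probabilities of all tables with the same
    margins that are at most as probable as the observed one."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def p_table(x):  # probability of the table with x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = p_table(a)
    lo, hi = max(0, col1 - row2), min(col1, row1)
    return sum(p_table(x) for x in range(lo, hi + 1)
               if p_table(x) <= p_obs * (1 + 1e-9))

# Condition 4 vs. 5: shortest solution found by 5 of 9 vs. 1 of 12 subjects.
p = fisher_exact_two_sided(5, 4, 1, 11)
print(round(p, 3))  # → 0.046
```

The result (p ≈ 0.046) is below the conventional 5% level, consistent with the significant difference between conditions 4 and 5 reported above.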
11.4 General Discussion
In our studies we investigated only a small selection of analytically possible
source/target relations. We did not investigate many-to-one versus one-to-
many mappings (Spellman & Holyoak, 1996), and we only looked at target
exhaustiveness versus source inclusiveness for problems with a large com-
mon structure. We plan to investigate these variations in further studies. For
source/target relations with a varying degree of structural overlap we were
able to show that a problem is suitable as a source even if it shares only about
half of its structure with the target. A first explanation for this finding, which
goes along with models of transformational analogy, is that subjects first con-
struct a partial solution guided by the solution for the structurally identical
part of the solution, and then use this partial solution as a constraint for
finding the missing solution steps by some problem solving strategy, such as
means-end analysis (Newell, Shaw, & Simon, 1958), or by internal analogy
(Hickman & Larkin, 1990).
Internal analogy describes a strategy where a previously ascertained solu-
tion for a part of a problem guides the construction of a solution for another
part of the same problem. For the problem domain we investigated, internal
analogy gives no plausible explanation: The constraints used to figure out
the solution steps for the overlapping part of the target problem are not the
same as those used for the non-overlapping part; therefore internal analogy
cannot be applied. A second explanation might be that subjects try to re-
represent the target problem in such a way that it becomes isomorphic to
the source (Hofstadter & The Fluid Analogies Research Group, 1995; Gen-
tner et al., 1997). Again, this explanation seems to be implausible for our
domain: Because the number of jugs and the given initial, goal, and maximum
quantities determine the solution steps completely, re-representation (for ex-
ample, looking at two different jugs as one jug) cannot be helpful for finding
a solution.
The results of the present study give some new insights about the nature of
structural similarity underlying transfer success in analogous problem solving.
While it is agreed upon that application of analogies is mostly influenced by
structural and not by superficial similarity (Reed et al., 1974; Gentner, 1983;
Holyoak & Koh, 1987; Reed et al., 1990; Novick & Holyoak, 1991), there
are only few studies that have investigated which type and what degree of
structural relationship between a source and a target problem is necessary
for transfer success.
Holyoak and Koh (1987) used variants of the radiation problem (Duncker,
1945) to show that structural differences have an impact on transfer. They
varied structural similarity by constructing problems with different solution
constraints. In studies using variants of the missionaries-cannibals problem,
structural similarity was varied in the same way (Reed et al., 1974; Gholson
et al., 1996). In the area of mathematical problem solving, typically the com-
plexity of the solution procedure is varied (Reed et al., 1990; Reed & Bolstad,
1991). While in all of these studies non-isomorphic source/target pairs are
investigated, in none of them was the type and degree of structural similarity
controlled. Thus, the question of which structural characteristics make a
source a suitable candidate for analogical transfer remained unanswered.
Investigating structural source/target relations is of practical interest for
several reasons: (1) In an educational context (cf. tutoring systems) the pro-
vided examples have to be carefully balanced to allow for generalization
(learning). Presenting only isomorphs restricts learning to small problem
classes, while too large a degree of structural dissimilarity can result in fail-
ure of transfer and thereby obstructs learning (Pirolli & Anderson, 1985).
(2) A plausible cognitive model of analogical problem solving (Falkenhainer
et al., 1989; Hummel & Holyoak, 1997) should generate correct transfer only
for such source/target relations where human subjects perform successfully.
(3) Computer systems which employ analogical or case-based reasoning tech-
niques (Carbonell, 1986; Schmid & Wysotzki, 1998) should refrain from ana-
logical transfer when there is a high probability of constructing faulty solu-
tions. Thus, situations can be avoided in which system users have to check
and possibly debug generated solutions. Here, information about condi-
tions for successful transfer in human analogical problem solving can provide
guidelines for implementing criteria for when the strategy of analogical reasoning
should be rejected in favour of other problem solving strategies.
12. Programming By Analogy
In part II we discussed inductive program synthesis as an approach to learning
recursive program schemes. Alternatively, in this chapter, we investigate how
a recursive program scheme can be generalized from two structurally similar
programs and how a new recursive program can be constructed by adapting
an already known scheme. Generalization can be seen as the last step of
programming or problem solving by analogy. Adaptation of a scheme to a
new problem can be seen as abstraction rather than analogy, because an
abstract scheme is applied to a new problem, in contrast to mapping two
concrete problems. In the following, we first (sect. 12.1) give a motivation
for programming by analogy and abstraction. Afterwards (sect. 12.2), we
introduce a restricted approach to second-order anti-unification. Then we
present first results on retrieval (sect. 12.3), introducing subsumption as a
qualitative approach to program similarity. In section 12.4, generalization of
recursive program schemes as anti-instances of pairs of programs is described.
Finally (sect. 12.5), some preliminary ideas for adaptation are presented.¹
12.1 Program Reuse and Program Schemes
It is an old claim in software engineering that programmers should write less
code but reuse code developed in previous efforts (Lowry & Duran, 1989).
In the context of programming, reuse can be characterized as transforming
an old program into a new program by replacing expressions (Cheatham,
1984; Burton, 1992). But reuse is often problematic on the level of concrete
programs because the relation between code fragments and the function they
perform is not always obvious (Cheatham, 1984). There are two approaches
to overcome this problem: (1) performing transformations on the program
specifications rather than on the concrete programs (Dershowitz, 1986), and
(2) providing abstract schemes which are stepwise refined by "vertical" pro-
gram transformation (Smith, 1985). The second approach proved to be quite
successful and is used in the semi-automatic program synthesis system KIDS
(Smith, 1990, see sect. 6.2.2.3 in chap. 6).

¹ This chapter is based on the previous publication Schmid, Sinha, and Wysotzki
(2001).
In the KIDS system, program schemes, such as divide-and-conquer, local,
and global search, are predefined by the system author, and selection of an ap-
propriate scheme for a new programming problem is performed by the system
user. From a software engineering perspective, it is prudent to exclude these
knowledge-based aspects of program synthesis from automatization. Never-
theless, we consider automatic retrieval of an appropriate program scheme
and automatic construction of such program schemes from experience as
interesting research questions.
While the KIDS system is based on deductive (transformational) pro-
gram synthesis, the context of our own work is inductive program synthesis:
A programming problem is specified by some input/output examples and
a universal plan is constructed which represents the transformation of each
possible input state of the initially finite domain into the desired output (see
part I). This plan is transformed into a finite program tree which is folded
into a recursive function (see part II). It might be interesting to investigate
reuse on the level of programming problems, that is, providing a hypo-
thetical recursive program for a new set of input/output examples, omitting
planning. But currently we are investigating how the folding step can be re-
placed by analogical transfer or abstraction and how abstract schemes can
be generalized from concrete programs.
Our overall approach is illustrated in figure 1.1: For a given finite pro-
gram (T), representing some initial experience with a problem, that program
scheme (RPS-S) is retrieved from memory for which its nth unfolding (T-S)
results in a "maximal similarity" to the current (target) problem. The source
program scheme can either be a concrete program with primitive operations
or an already abstracted scheme. The source scheme is modified with respect
to the mapping obtained between its unfolding and the target. Modification
can involve a simple re-instantiation of primitive symbols or more complex
adaptations. Finally, the source scheme and the new program are generalized
to a more abstract scheme. The new RPS and the abstracted RPS are stored
in memory.
After introducing our approach to anti-unification, all components of pro-
gramming by analogy are discussed. An overview of our work on constructing
programming by analogy algorithms is given in appendix A.13.
12.2 Restricted 2nd-Order Anti-Unification
12.2.1 Recursive Program Schemes Revisited
A program is represented as a recursive program scheme (RPS) S = (G, t_0),
as introduced in definition 7.1.17. The main program t_0 is defined over a
term algebra T_Σ(X) where the signature Σ is a set of function symbols and
X is a set of variables (def. 7.1.1). In the following, we split Σ into a set of
primitive function symbols F and a set of user-defined function names Γ with
F ∩ Γ = ∅ and Σ = F ∪ Γ. A recursive equation in G consists of a function head
G(x_1, ..., x_m) and a function body t. The function head gives the name of the
function G ∈ Γ and the parameters of the function with X = {x_1, ..., x_m}.
The function body is defined over the term algebra, t ∈ T_Σ(X). The body t
contains the symbol G at least once. Currently, our generalization approach
is restricted to RPSs where G contains only a single equation and where t_0
consists only of the call of this equation.
Since program terms are defined over function symbols, in general, an
RPS represents a class of programs. For program evaluation, all variables
in X must be instantiated by concrete values, all function symbols must be
interpreted by executable functions, and all names for user-defined functions
must be defined in F.
In the following, we introduce special sets of variables and function sym-
bols to discriminate between concrete programs and program schemes which
were generated by anti-unification. The set of variables X is divided into
variables X_p representing input variables and variables X_au which represent
first-order generalizations, that is, generalizations over first-order constants.
The set of function symbols F is divided into symbols for primitive functions
F_p with a fixed interpretation (such as +(x, y) representing addition of two
numbers) and function symbols with arbitrary interpretation Ω_au. Symbols in
Ω_au represent function variables which were generated by second-order gen-
eralizations, that is, generalizations over primitive functions. Please note that
these function variables do not correspond to names of user-defined functions
G_i ∈ Γ!
Definition 12.2.1 (Concrete Program). A concrete program is an RPS
over T_Σ(X_p) with Σ = F_p ∪ Γ. That is, it contains only input variables in
X_p, symbols for primitive functions in F_p, and function calls G_i ∈ Γ which
are defined in G.
Definition 12.2.2 (Program Scheme). A program scheme is an RPS de-
fined over T_Σ(X_p ∪ X_au) with Σ = F_p ∪ Ω_au ∪ Γ. That is, it can contain input
variables in X_p, symbols for primitive functions in F_p, function calls G_i ∈ Γ
which are defined in G, as well as object variables y ∈ X_au, and function
variables ω ∈ Ω_au.
An RPS can be unfolded by replacing the name of a user-defined function
by its body, where variables are substituted in accordance with the function
call, as introduced in definition 7.1.29. In principle, a recursive equation can be
unfolded infinitely often. Unfolding terminates if the recursive program call
is replaced by Ω (the undefined term) and the result of unfolding is a finite
program term (see def. 7.1.19) T ∈ T_Σ(X_p ∪ X_au) with Σ = F_p ∪ Ω_au ∪ {Ω}.
That is, T does not contain names of user-defined functions G_i ∈ Γ.
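To make the unfolding operation concrete, the following sketch (not the implementation used in this book; the nested-tuple term representation and all names are illustrative) unfolds a one-equation RPS a given number of times and replaces remaining recursive calls by the undefined term Ω:

```python
# Terms are nested tuples ("f", arg1, ...); variables are plain strings,
# and constants are nullary tuples such as ("1",).
OMEGA = ("Omega",)   # stands for the undefined term

def substitute(t, binding):
    """Replace variables in term t according to binding (a dict)."""
    if isinstance(t, str):
        return binding.get(t, t)
    return (t[0],) + tuple(substitute(a, binding) for a in t[1:])

def unfold(t, name, params, body, depth):
    """Replace calls of the user-defined function `name` by `body` (with
    parameters bound to the call's arguments) `depth` times; remaining
    recursive calls become the undefined term."""
    if isinstance(t, str):
        return t
    if t[0] == name:
        if depth == 0:
            return OMEGA
        binding = dict(zip(params, t[1:]))
        return unfold(substitute(body, binding), name, params, body, depth - 1)
    return (t[0],) + tuple(unfold(a, name, params, body, depth) for a in t[1:])

# fac(x) = if(eq0(x), 1, *(x, fac(p(x))))
body = ("if", ("eq0", "x"), ("1",), ("*", "x", ("fac", ("p", "x"))))
t1 = unfold(("fac", "x"), "fac", ["x"], body, 1)
print(t1)  # → ('if', ('eq0', 'x'), ('1',), ('*', 'x', ('Omega',)))
```

Unfolding with depth 2 nests the body once more, with the recursive call again cut to Ω; the resulting finite program terms are exactly the inputs to the anti-unification procedure of the next section.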
12.2.2 Anti-Unification of Program Terms
First-order anti-unification was introduced in section 7.1.2 in chapter 7. In the
following, we present an extension of Huet's declarative, first-order algorithm
for anti-unification of pairs of program terms (Huet, 1976; Lassez et al., 1988)
by introducing three additional rules. The term that results from the anti-
unification of two terms is their most specific anti-instance.
Definition 12.2.3 (Instance/Anti-Instance). If t_1, ..., t_n and u are terms
and for each t_i there exists a substitution σ_i (1 ≤ i ≤ n) such that t_i = uσ_i,
then the terms t_i are instances of u and u is an anti-instance of the set
{t_1, ..., t_n}.
u is the most specific anti-instance of the set {t_1, ..., t_n} if, for each u'
which is also an anti-instance of {t_1, ..., t_n}, there exists a substitution σ
such that u'σ = u.

Simply spoken, an anti-instance reflects some commonalities shared be-
tween two or more terms, whereas the most specific anti-instance reflects all
commonalities.
Huet's algorithm constructs the most specific anti-instance of two terms
in the following way: If both terms start with the same function symbol, this
function symbol is kept for the anti-instance, and anti-unification is performed
recursively on the function arguments. Otherwise, the anti-instance of the two
terms is a variable determined by an injective mapping φ. φ ensures that each
occurrence of a certain pair of sub-terms within the given terms is represented
by the same variable in the anti-instance.
According to Idestam-Almquist (1993), φ can be considered a term sub-
stitution:
Definition 12.2.4 (Term Substitution). A finite set of the form
{(t_1, u_1)/x_1, ..., (t_n, u_n)/x_n}
is a term substitution iff {x_1/t_1, ..., x_n/t_n} and {x_1/u_1, ..., x_n/u_n} are
substitutions, and the pairs (t_1, u_1), ..., (t_n, u_n) are distinct.
Huet's algorithm only computes the most specific first-order anti-instance.
But in order to capture as much of the common structure of two programs
as possible in an abstract scheme, we need at least a second-order (function)
mapping. In contrast to first-order approaches, higher-order anti-unification
(as well as unification) in general has no unique solution; that is, a notion
of the most specific anti-instance does not exist and cannot be calculated
efficiently (Siekmann, 1989; Hasker, 1995).
Therefore, we developed a very restricted second-order algorithm. Its main
extension compared with Huet's is the introduction of function variables for
functions of the same arity.
Based on the results obtained with this algorithm, we plan careful exten-
sions, maintaining uniqueness of the anti-instances and efficiency of calcula-
tion. A more powerful approach to second-order anti-unification was proposed,
for example, by Hasker (1995), as mentioned in chapter 10.
Our algorithm is presented in table 12.1. The Var-Term, Diff-Arity,
and Same-Function rules correspond to the two cases of Huet's algorithm.
Same-Term is a trivial rule which makes the implementation more efficient.
Our main extension to Huet's algorithm is the Same-Arity rule, by which
function variables ω ∈ Ω_au are introduced for functions of the same arity.
Huet's termination case has been split up into two rules (Var-Term and
Diff-Arity) because the Diff-Arity rule can be refined in future versions
of this algorithm to allow generalizing over functions with different arities as
well.
Table 12.1. A Simple Anti-Unification Algorithm

Function call: au(t_1, t_2) with terms t_1, t_2 ∈ M(V, F ∪ Ω)
Initialization: term substitution φ = ∅
Rules:
  Same-Term:     au(t, t) = t
  Var-Term:      au(t_1, t_2) = y with φ = φ ∘ {(t_1, t_2)/y}
                 (where either t_1 ∈ X and t_2 ∈ T_Σ(X), or t_1 ∈ T_Σ(X) and t_2 ∈ X)
  Same-Function: au(f(x_1, ..., x_n), f(x'_1, ..., x'_n)) =
                 f(au(x_1, x'_1), ..., au(x_n, x'_n))
  Same-Arity:    au(f(x_1, ..., x_n), g(x'_1, ..., x'_n)) =
                 ω(au(x_1, x'_1), ..., au(x_n, x'_n)) with φ = φ ∘ {(f, g)/ω}
  Diff-Arity:    au(f(x_1, ..., x_n), g(x'_1, ..., x'_m)) = y
                 with φ = φ ∘ {(f(x_1, ..., x_n), g(x'_1, ..., x'_m))/y} (where n ≠ m)
Proofs of uniqueness, correctness, and termination are given in (Sinha, 2000).
To illustrate the algorithm, we present a simple example: Let the two
terms which are input to au(t_1, t_2) be

t_1 = if(eq0(x), 1, *(x, fac(p(x))))
t_2 = if(eq0(z), 0, +(z, sum(p(z))))

Then,

au(t_1, t_2) = if(eq0(y_1), y_2, ω_2(y_1, ω_1(p(y_1))))

with

φ = {(x, z)/y_1, (1, 0)/y_2, (fac, sum)/ω_1, (*, +)/ω_2}

as their most specific anti-instance. The rules used are: Same-Function
(twice), Var-Term, Same-Arity (twice), Var-Term (making use of φ),
Same-Arity, Same-Function, and finally Var-Term (again, with φ). In
table 12.4, additional examples for anti-unification are given.
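The rules of table 12.1 can be transcribed almost one-to-one into executable form. The sketch below is an illustrative reimplementation, not the system's own code: terms are nested tuples with the function symbol first, variables and constants are atoms, and fresh names y1, y2, ... and w1, w2, ... (standing for the ω_i) are generated in order of first occurrence, so the numbering of the function variables may differ from the worked example above. Two distinct atoms are generalized to an object variable, which reproduces the (1, 0)/y_2 generalization of the example.

```python
def anti_unify(t1, t2):
    """Restricted second-order anti-unification following table 12.1.
    Returns the anti-instance and the term substitution phi, which maps
    pairs of sub-terms (or function symbols) to the variable introduced
    for them, so that equal pairs always get the same variable."""
    phi = {}

    def fresh(pair, kind):          # kind is "var" (y...) or "fun" (w...)
        if (kind, pair) not in phi:
            n = sum(1 for k, _ in phi if k == kind) + 1
            phi[(kind, pair)] = ("y" if kind == "var" else "w") + str(n)
        return phi[(kind, pair)]

    def au(a, b):
        if a == b:                                   # Same-Term
            return a
        if not isinstance(a, tuple) or not isinstance(b, tuple):
            return fresh((a, b), "var")              # Var-Term
        if a[0] == b[0] and len(a) == len(b):        # Same-Function
            return (a[0],) + tuple(au(x, y) for x, y in zip(a[1:], b[1:]))
        if len(a) == len(b):                         # Same-Arity
            w = fresh((a[0], b[0]), "fun")
            return (w,) + tuple(au(x, y) for x, y in zip(a[1:], b[1:]))
        return fresh((a, b), "var")                  # Diff-Arity

    return au(t1, t2), phi

# t1 = if(eq0(x), 1, *(x, fac(p(x)))),  t2 = if(eq0(z), 0, +(z, sum(p(z))))
t1 = ("if", ("eq0", "x"), 1, ("*", "x", ("fac", ("p", "x"))))
t2 = ("if", ("eq0", "z"), 0, ("+", "z", ("sum", ("p", "z"))))
g, phi = anti_unify(t1, t2)
print(g)  # → ('if', ('eq0', 'y1'), 'y2', ('w1', 'y1', ('w2', ('p', 'y1'))))
```

Note how the shared pair (x, z) is mapped to the same variable y1 at all three of its occurrences, exactly the reuse of φ described in the rule trace above.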
12.3 Retrieval Using Term Subsumption
12.3.1 Term Subsumption
By using subsumption of anti-instances, retrieval can be based on a qualita-
tive criterion of structural similarity (Plaza, 1995) instead of on a quantita-
tive similarity measure. The advantage of a qualitative approach is that it is
parameter-free. That is, there is no need to invest in the somewhat tedious
process of determining "suitable" settings for parameters, and no threshold
for similarity must be given.
The subsumption relation is usually defined only for clauses (see sect. 6.3.3
in chap. 6). This definition is based either on the subset/superset relation-
ship which exists between clauses since they can be viewed as sets (the
conjunction of literals is commutative) or on the existence of a substitution
which transforms the more general (subsuming) clause into the more special
(subsumed) one. Since terms cannot be viewed as sets, and therefore the sub-
set/superset relationship does not exist, our definition of term subsumption
has to be based on the existence of substitutions only.
Definition 12.3.1 (Term Subsumption). A term t1 subsumes a term t2 (written t2 ⊑ t1) iff there exists a substitution σ such that σ(t1) = t2.

Note that σ must be a "proper" substitution, not a term substitution (def. 12.2.4).
Substitution σ is obtained by unifying t1 and t2. If σ is empty and t1 ≠ t2, unification of t1 and t2 has failed. In that case, the subsumption relation between t1 and t2 is undecidable. That is, the subsumption relation does not define a total but a partial order over terms. For two isomorphic terms, that is, terms which can be unified by variable renaming only, we write t1 ∼ t2.²
² Note that here we speak of constructing an order over terms, and isomorphism is defined with respect to the unification of terms. Later in this chapter, we will discuss isomorphism in the context of anti-unification, that is, generalization of terms.

Our algorithm for retrieving the "maximally similar" RPSs from a linear case base is given in table 12.2. In general, the algorithm returns a set of possible source programs rather than a unique source, due to the undecidability of subsumption for non-unifiable terms. Input to the algorithm is a finite program T. For each RPS Si in memory, the RPS is unfolded to a term Ti such that it has at least the length of T, and the most specific anti-instance ui of T and Ti is calculated. This anti-instance is only inserted into the set of anti-instances U if it is either subsumed by (i.e., is more special than) an anti-instance u ∈ U, or if it is not unifiable with any u ∈ U. Furthermore, all anti-instances in U which subsume (i.e., are more general than) ui are removed from U. The algorithm returns all RPSs Si from the case base which are associated with a ui in the final set of anti-instances U.
Table 12.2. An Algorithm for Retrieval of RPSs

Given
T, a finite program.
CB = {R1, ..., Rn}, a case base, which is a set (represented as a list) of RPSs Si.
U, a set (initially empty) of most specific anti-instances.

∀ Ri ∈ CB:
Let Ti = unfold(Si). (unfold Si according to def. 7.1.29 until the resulting term Ti has at least the length of T)
Let ui = au(T, Ti). (anti-unify T and Ti)
If ∃ u ∈ U with ui ⊑ u, or ui is not unifiable with any u ∈ U, then U := U ∪ {ui}.
∀ u ∈ U with ui ⊑ u: U := (U \ {u}) ∪ {ui}.

Return the RPSs Si associated with all ui ∈ U.
12.3.2 Empirical Evaluation

We presented terms to six subjects (all of whom were familiar with functional programming), asking them to identify similarities among the terms. All these terms were RPSs unfolded twice. We chose ten RPSs which we considered "relevant". The RPSs represented various recursion types, such as linear recursion (SUM, FAC, APPEND, CLEARBLOCK), tail recursion (MEMBER, REVERT), tree recursion (BINOM, FIBO, LAH), and a function calling a further function (GGT calling MOD). All functions are given in appendix C.8.
In each rating task the subjects were presented one term and a "case base" consisting of the other nine terms. The order in which the case base was presented varied among subjects. They were requested to choose the term from the case base which was most similar to the given term.
In table 12.3, the results of this rating study are shown.
The rightmost column shows the results of our algorithm. Only for the CLEARBLOCK problem (3) does the algorithm yield a unique result. In all other cases, a set of "most similar" terms is returned. This is due both to subsumption equivalences and to the fact that subsumption is not always decidable (see section 12.3.1).
The RPS preferred by a majority (two-thirds or more) of the subjects was always among the RPSs returned by the algorithm (except for GGT). Moreover, there are symmetries in retrieval (e.g., FAC and SUM) that can be explained by a common recursion scheme of these RPSs. Generally, at least one RPS of the same recursion type as the goal RPS is retrieved, provided that there is such an RPS in the case base.

Table 12.3. Results of the similarity rating study

Goal RPS     Subjects' Choice                 Majority   Program Returns
REVERT       4: 33%; 2, 3, 5, 9: 17% each     -          {7, 8, 10}
SUM          5: 83%; 10: 17%                  5          {5, 10}
CLEARBLOCK   6: 100%                          6          {6}
GGT          1: 67%; 6: 33%                   1          {8, 10}
FAC          2: 100%                          2          {2, 8, 9}
APPEND       3: 83%; 2: 17%                   3          {1, 3, 7}
MEMBER       10: 33%; 1, 2, 4, 6: 17% each    -          {1, 9}
LAH          10: 50%; 9: 33%; 6: 17%          -          {9, 10}
BINOM        10: 100%                         10         {7, 8, 10}
FIBO         9: 50%; 8: 33%; 3: 17%           -          {2, 8, 9}
12.3.3 Retrieval from Hierarchical Memory

We demonstrated retrieval of concrete programs or abstract schemes for the case of a linear memory. Memory can be organized more efficiently by introducing a hierarchy of programs. That is, for each pair of programs which were used as source and target in programming by analogy, their anti-instance is introduced as parent.
For hierarchical memory organization, retrieval can be restricted to a subset of the memory in the following way: Given a new finite program term T,
- identify the most specific anti-instance au(T, Ti) over all unfolded RPSs which are root nodes,
- in the following, only investigate the children of this node.
As in the linear case, in general there might not exist a unique minimal anti-instance, that is, more than one tree in the memory must be explored.
Furthermore, retrieval from hierarchical memory is based on a monotonicity assumption: If a program term Ti is less similar to the current term T than a program term Tj, then the children of Tj cannot be more similar to T than Ti and the children of Ti. While this assumption is plausible, it is worth further investigation. Our work on memory hierarchization is work in progress, and we plan to provide a proof of monotonicity together with an empirical comparison of the efficiency of retrieval from hierarchical versus linear memory.
12.4 Generalizing Program Schemes

There are different possibilities to obtain an RPS S1,2 which generalizes over a pair of RPSs S1 and S2:

- The generalized program term T1,2, which was constructed as au(T1, T2) with T1 = unfold(S1) and T2 = unfold(S2), can be folded, using our standard approach to program synthesis described in chapter 7.
- The term substitution φ, obtained by constructing au(T1, T2), can be extended to φ' = φ ∘ [(G1, G2)/ω1,2], that is, by introducing a variable for the names of the recursive functions. Then, for our restricted approach dealing with functions of different arity in a first-order way, it is enough to apply φ' to either S1 or S2. That is, for term substitutions φ' = {(T1,i, T2,i)/yi | i = 1..n} where yi ∈ Xau ∪ Ωau, with projections π1 = {(T1,i/yi)} and π2 = {(T2,i/yi)}, it holds that π1(S1) = π2(S2).
- The anti-unification algorithm defined above can be applied to recursive equations Gi(x1, ..., xn) = ti, if the equations are reformulated as equality terms: (= (Gi x1 ... xn) ti).

All three possibilities have their advantages: Folding of abstracted terms can be used to check whether the obtained generalization really represents a recursive program. Applying term substitutions is the most parsimonious approach to generalization. Applying anti-unification directly to recursive programs allows us to separate reuse and learning.
To illustrate our anti-unification algorithm, we give examples for the generalization of recursive equations in table 12.4. The results are edited for better readability. In each example, we used a function for calculating the factorial of a number as the first instance. If factorial is anti-unified with sum, the resulting scheme preserves the complete structure of both programs. Factorial and sum can be considered as syntactical isomorphs, that is, we can map the symbols from factorial into sum in a consistent way. A similar result is obtained for incrementing the elements of a list. If factorial is anti-unified with a program for calculating the square of a number, the resulting scheme abstracts from the first argument of the "else" case, which is a "constant" for factorial and a more complex term for sqr. Anti-unification of factorial and fibonacci returns a very general scheme because the programs have very different structures: one versus two conditionals, and a linear recursion versus a tree recursion. The same is true for functions with different numbers of parameters, such as factorial and mult.
Because the result of anti-unification is a generalized term together with a term substitution φ (see def. 12.2.4), the original programs can be reconstructed. The term substitution can be considered as representing a (very limited) domain for interpreting the variables in the anti-instance. For example, in the first anti-instance given in table 12.4, ω2 = (1, 0) restricts the interpretation of ω2 to be zero or one. If we want to obtain generalized function schemes from anti-unification, we keep only the anti-instances and get rid of the term substitutions. Without constraints, the variables can now be interpreted in an arbitrary way, that is, instantiation of an anti-instance can result in "stupid" programs. For example, we might interpret y1 as a list, ω2 as a truth value, and ω3 as a list constructor. In the context of automatic programming, however, there is always some information about the new (target) programming problem which restricts the instantiation of a scheme. A planned extension of our approach is to incorporate type information into anti-unification. Thereby it can at least be guaranteed that the instantiated expressions of an anti-instance are type-consistent.

Table 12.4. Example Generalizations

t1: fac(x) = if(eq0(x), 1, *(x, fac(pred(x))))
t2: sum(z) = if(eq0(z), 0, +(z, sum(pred(z))))
ω1(y1) = if(eq0(y1), ω2, ω3(y1, ω1(pred(y1))))

t1: fac(x) = if(eq0(x), 1, *(x, fac(pred(x))))
t2: incl(l) = if(null(l), nil, cons(succ(car(l)), incl(tail(l))))
ω1(y1) = if(ω2(y1), ω3, ω4(y3, ω1(ω5(y1))))

t1: fac(x) = if(eq0(x), 1, *(x, fac(pred(x))))
t2: sqr(z) = if(eq0(z), 0, +(+(z, pred(z)), sqr(pred(z))))
ω1(y1) = if(eq0(y1), ω2, ω3(y3, ω1(pred(y1))))

t1: fac(x) = if(eq0(x), 1, *(x, fac(pred(x))))
t2: fibo(z) = if(eq0(z), 0, if(eq0(pred(z)), 1, +(fibo(pred(z)), fibo(pred(pred(z))))))
ω1(y1) = if(eq0(y1), ω2, y2)

t1: fac(x) = if(eq0(x), 1, *(x, fac(pred(x))))
t2: mult(u, v) = if(eq0(u), 0, +(v, mult(pred(u), v)))
y1 = if(eq0(y2), ω1, ω2(y3, y4))
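The reconstruction of the original programs via the two projections of φ can be sketched as follows, using the fac/sum generalization. The encoding is ours (w1, w2 stand for the text's function variables ω1, ω2), an illustration rather than the system's implementation.

```python
# Anti-instance of fac and sum, as a nested tuple:
scheme = ("if", ("eq0", "y1"), "y2", ("w2", "y1", ("w1", ("p", "y1"))))
# Term substitution phi: each variable maps to the pair of subterms
# it replaced in the two original programs.
phi = {"y1": ("x", "z"), "y2": ("1", "0"),
       "w1": ("fac", "sum"), "w2": ("*", "+")}

def project(term, side):
    """Apply one projection of phi (side 0: first program, 1: second)."""
    if isinstance(term, tuple):
        return tuple(project(s, side) for s in term)
    return phi[term][side] if term in phi else term

fac = project(scheme, 0)   # if(eq0(x), 1, *(x, fac(p(x))))
sum_ = project(scheme, 1)  # if(eq0(z), 0, +(z, sum(p(z))))
```

Applying the projections π1 and π2 to the same anti-instance recovers exactly the two input terms, which is the property π1(S1) = π2(S2) used above.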
12.5 Adaptation of Program Schemes

If an RPS S is retrieved from memory for a current problem T, this RPS must be adapted such that the resulting RPS defines a language to which T belongs (see def. 7.1.20). Adaptation can be performed by applying the term substitutions obtained by anti-unifying T and TS to the given RPS S. For example, anti-unification of the third unfolding of the RPS fac (as source) and the third unfolding of sum (as new finite program with unknown recursive generalization) results in φ = [(x, z)/y1, (1, 0)/y2, (*, +)/ω1]. The given RPS fac can be adapted by replacing 1 by 0 (first-order) and * by + (second-order), and by renaming variable x to z.
A more complicated case occurs if identical symbols in one program are mapped to different symbols in another program. For the finite programs sub and add (see fig. 12.1), anti-unification results in φ = [(pred, succ)/ω1]. This term substitution was obtained for the subterms at positions 3.2 and 3.3 of both finite programs, while the subterms at positions 3.1 are identical (pred(x)). Simply applying the mappings given in φ to the RPS sub results in an unintended (and non-terminating) RPS add(x, y) = if(eq0(x), y, add(succ(x), succ(y))). This error can easily be detected, because the given finite program does not belong to the language of the constructed RPS, that is, the finite program cannot be generated by unfolding the RPS. To deal with such cases, the context of the substitution terms must be considered, that is, the function of which the subterms are arguments and their position (Schmid et al., 1998; Hasker, 1995).
Source: sub(x, y) = if(eq0(x), y, sub(pred(x), pred(y)))
Target: add(x, y) = if(eq0(x), y, add(pred(x), succ(y)))
Fig. 12.1. Adaptation of Sub to Add (the figure shows unfoldings of sub and add as term trees, with Ω marking the unexpanded rest)
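The detection of a wrong adaptation by the unfolding check can be sketched as follows. This is a minimal illustration under our own assumptions (single-equation RPSs, terms as nested tuples, a fixed depth bound with ("Omega",) marking the unexpanded rest), not the system's implementation.

```python
def unfold(name, params, body, depth):
    """Unfold a single-equation RPS `name(params) = body` up to `depth`
    expansions, replacing remaining recursive calls by ("Omega",)."""
    def go(term, env, d):
        if isinstance(term, tuple) and term[0] == name:
            if d == 0:
                return ("Omega",)            # recursion budget exhausted
            args = [go(a, env, d) for a in term[1:]]
            return go(body, dict(zip(params, args)), d - 1)
        if isinstance(term, tuple):
            return (term[0],) + tuple(go(a, env, d) for a in term[1:])
        return env.get(term, term)           # parameter or constant
    return go((name,) + tuple(params), {p: p for p in params}, depth)

# Intended target:   add(x, y) = if(eq0(x), y, add(pred(x), succ(y)))
good = ("if", ("eq0", "x"), "y", ("add", ("pred", "x"), ("succ", "y")))
# Naive adaptation:  add(x, y) = if(eq0(x), y, add(succ(x), succ(y)))
bad = ("if", ("eq0", "x"), "y", ("add", ("succ", "x"), ("succ", "y")))

finite_program = unfold("add", ["x", "y"], good, 2)
# The wrong RPS cannot generate the given finite program by unfolding:
assert unfold("add", ["x", "y"], bad, 2) != finite_program
```

The comparison fails at the first recursive step (eq0(succ(x)) instead of eq0(pred(x))), which is exactly the membership test described above.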
13. Conclusions and Further Research

We introduced analogy and abstraction as alternatives to inductive program synthesis by folding of initial trees. In general, analogy is not more efficient than folding for generalizing recursive program schemes from finite programs. Nevertheless, it is a method which we find interesting for two reasons: First, using anti-unification, mapping and generalization can be realized within the same framework, and we can demonstrate (currently only with toy examples) how abstract program schemes can be learned from some initial programming experience. Second, analogy is considered a crucial strategy in human problem solving and learning.
13.1 Learning and Applying Abstract Schemes

Our approach to programming by analogy using anti-unification mainly addresses the problem of obtaining abstract schemes from some concrete programming experience. If such schemes are acquired, a deductive approach to program synthesis by stepwise refinement, as proposed by Smith (1990), can be used. Currently, mainly in the context of object-oriented programming, a similar concept, design patterns, is discussed as an approach to program reuse and an important contribution to the productivity of software development (Gamma, Helm, Johnson, & Vlissides, 1996). Typically, program schemes or design patterns are provided by experts as an aggregation and abstraction of their programming experience. Our research aims at getting more insight into how such abstractions are obtained by human programmers and at exploiting this insight for the (partial) automation of the construction of program schemes. Additionally, we address the problem of retrieving a suitable scheme for a given new programming problem. By using subsumption of anti-instances, retrieval can be based on a qualitative criterion of structural similarity (Plaza, 1995) instead of on a quantitative similarity measure.
Currently, our approach is only a minimal extension of first-order anti-unification, allowing us to generalize over functions with the same arity. General second-order anti-unification has the drawback that uniqueness is lost, mainly because it is open which pairs of arguments of two terms are anti-unified. To constrain second-order anti-unification, we plan to introduce function evaluation, similar to the proposal of Kolbe and Walther (1995) in the context of reusing proofs. Thereby we also extend the purely syntactical approach to similarity and generalization by semantic aspects. Even with this additional information, typically not a single source candidate but a set of RPSs will be retrieved from memory.
While in the system KIDS a knowledgeable user selects the RPS which underlies the program to be constructed, we are interested in identifying heuristics for automatic selection. In a first step, candidate RPSs should be evaluated with respect to their structural similarity to the target RPS. As we have seen in the previous chapter, it is not very helpful to transfer a scheme to a problem belonging to a different recursion class, such as transferring the linear recursive factorial to the tree recursive fibonacci (see tab. 12.4). For this step, an interleaving of folding and analogy might be helpful: Information about hypothetical positions of recursive calls, which are known for the source RPS, can be superimposed on the new finite program term, and it can be checked whether this results in a valid recursive segmentation. Additionally, criteria of structural overlap between finite programs, as investigated in chapter 11, can be employed to decide whether it is worthwhile to try analogical transfer. If, after this initial reduction of source candidates, more than one RPS remains, information about function types can support the final selection. Note that such a strategy is complementary to cognitive science approaches where, in a first step, candidates sharing superficial features are obtained and, in a second step, the best mapping candidate is selected.
Currently, we are using analogy or abstraction as an alternative to folding. A more powerful approach would be to apply analogy already for a given problem specification, the set of operators and top-level goals given as input to DPlan: Each RPS could be associated with the planning domain from which it was originally constructed, and for a new planning domain, a structurally similar domain could be identified for which a recursive control program is already given. First ideas for analyzing planning domains with respect to their underlying recursive structure were proposed by Wolf (2000).
13.2 A Framework for Learning from Problem Solving

Now we have introduced all components of the system proposed in chapter 1: universal planning, inductive program synthesis, and programming by analogy and abstraction. The system is yet in its infancy, but we hope that we could demonstrate that the combination of these three areas of research can be fruitful for program synthesis.
Most work in machine learning research addresses concept learning; some work is done in the area of skill acquisition. These fields cover only a fraction of the power of human learning. We believe that our system models some aspects of human learning from problem solving: exploring a new domain by search, inducing a recursive generalization which represents a problem solving strategy, and incrementally, with growing experience over different domains, constructing a hierarchy of abstract schemes. Folding of finite terms models generalization learning by detecting a pattern in a sequence of actions. Abstraction as a result of analogical problem solving models generalization learning by detecting a common structure between solutions of different problems. Both approaches address one possible source of human creativity: the discovery of problem solving strategies or algorithms from some example experience.
13.3 Application Perspective

In this book we introduced basic concepts and algorithms for automating (parts of) the process of program construction and generalization learning. We presented learning of recursive control rules for AI planning as a field of application. In the future, we hope to use the ideas presented in this book as a basis for further areas of application (which we already mentioned in chap. 1). We plan to apply learning of recursive control rules to proof planning. As in general AI planning, there is a need to guide search in automatic theorem proving. An obvious area of application is learning from user traces to support end-user programming. Nowadays, nearly everybody uses a computer, but only a small fraction of people know how to program. There already exist some tools for generating "macros" from observation of users' input behavior in a software system, be it a text editor, a graphic editor, or a spreadsheet analysis. Allowing not only linear sequences of operations to be learned, which is what current tools provide for, but generalizing to loops (recursion), can make end-user support more powerful.
Everybody [...] omits something, if only because to include everything is impossible.
- Nero Wolfe in: Rex Stout, Poison à la Carte, 1960
Bibliography
Aho, A. V., Sethi, R., & Ullman, J. D. (1986). Compilers: Principles, Techniques, and Tools. Addison Wesley, Reading, MA.
Allen, J., Hendler, J., & Tate, A. (Eds.). (1990). Readings in Planning. Morgan Kaufmann, San Mateo, CA.
Allouche, J.-P. (1994). Note on the cyclic Towers of Hanoi. Theoretical Computer Science, 123, 3-7.
Anderson, J. R. (1983). The Architecture of Cognition. Harvard University Press, Cambridge, MA.
Anderson, J. R. (1995). Cognitive Psychology and Its Implications. Freeman, New York.
Anderson, J. R., Conrad, F. G., & Corbett, A. T. (1989). Skill acquisition and the LISP tutor. Cognitive Science, 13, 467-505.
Anderson, J. R., & Lebiere, C. (1998). The atomic components of thought. Erlbaum, Mahwah, NJ.
Anderson, J., & Thompson, R. (1989). Use of analogy in a production system architecture. In Vosniadou, S., & Ortony, A. (Eds.), Similarity and Analogical Reasoning, pp. 267-297. Cambridge University Press.
Angluin, D. (1981). A note on the number of queries needed to identify regular languages. Information and Control, 51(1), 76-87.
Angluin, D. (1984). Inductive inference: Efficient algorithms. In Biermann, A., Guiho, G., & Kodratoff, Y. (Eds.), Automatic Program Construction Techniques, pp. 503-516. Macmillan, New York.
Anzai, Y., & Simon, H. (1979). The theory of learning by doing. Psychological Review, 86, 124-140.
Atkinson, M. D. (1981). The cyclic Towers of Hanoi. Information Processing Letters, 13(3), 118-119.
Atwood, M. E., & Polson, P. G. (1976). A process model for water jug problems. Cognitive Psychology, 8, 191-216.
Bacchus, F., & Kabanza, F. (1996). Using temporal logic to control search in a forward chaining planner. In New Directions in AI Planning, pp. 141-153. IOS, Amsterdam.
Bacchus, F., Kautz, H., Smith, D. E., Long, D., Geffner, H., & Koehler, J. (2000). AIPS-00 Planning Competition, Breckenridge, CO. http://www.cs.toronto.edu/aips2000/.
Barr, A., & Feigenbaum, E. A. (Eds.). (1982). The Handbook of Artificial Intelligence, Vol. 2. William Kaufmann, Los Altos, CA.
Bates, J., & Constable, R. (1985). Proofs as programs. ACM Transactions on Programming Languages and Systems, 7(1), 113-136.
Bauer, F., Broy, M., Möller, B., Pepper, P., Wirsing, M., et al. (1985). The Munich Project CIP. Vol. I: The Wide Spectrum Language CIP-L. Springer.
Bauer, F., Ehler, H., Horsch, R., Möller, B., Partsch, H., Paukner, O., & Pepper, P. (1987). The Munich Project CIP. Vol. II: The Transformation System CIP-S, LNCS 292. Springer.
Bauer, M. (1975). A basis for the acquisition of procedures from protocols. In Proc. 4th International Joint Conference on Artificial Intelligence (4th IJCAI), pp. 226-231.
Bibel, W. (1980). Syntax-directed, semantics-supported program synthesis. Artificial Intelligence, 14(3), 243-261.
Bibel, W. (1986). A deductive solution for plan generation. New Generation Computing, 4(2), 115-132.
Bibel, W., Korn, D., Kreitz, C., & Kurucz, F. (1998). A multi-level approach to program synthesis. In Proc. 7th International Workshop on Logic Program Synthesis and Transformation, Vol. 1463 of LNCS, pp. 1-27. Springer.
Biermann, A. W. (1978). The inference of regular LISP programs from examples. IEEE Transactions on Systems, Man, and Cybernetics, 8(8), 585-600.
Biermann, A. W., Guiho, G., & Kodratoff, Y. (Eds.). (1984). Automatic Program Construction Techniques. Macmillan, New York.
Blum, A., & Furst, M. (1997). Fast planning through planning graph analysis. Artificial Intelligence, 90(1-2), 281-300.
Bonet, B., & Geffner, H. (1999). Planning as heuristic search: New results. In Proc. European Conference on Planning (ECP-99), Durham, UK. Springer.
Boström, H. (1996). Theory-guided induction of logic programs by inference of regular languages. In Saitta, L. (Ed.), Proc. 13th International Conference on Machine Learning (ICML-96), pp. 46-53. Morgan Kaufmann.
Boström, H. (1998). Predicate invention and learning from positive examples only. In Proc. 10th European Conference on Machine Learning (ECML-98), No. 1398 in LNAI, pp. 226-237. Springer.
Boyer, R., & Moore, J. (1975). Proving theorems about Lisp functions. Journal of the ACM, 22(1), 129-144.
Briesemeister, L., Scheffer, T., & Wysotzki, F. (1996). A concept based algorithmic model for skill acquisition. In Schmid, U., Krems, J., & Wysotzki, F. (Eds.), Proc. 1st European Workshop on Cognitive Modeling (14.-16.11.96), TU Berlin. http://ki.cs.tu-berlin.de/EuroCog/cm-program.html.
Broy, M., & Pepper, P. (1981). Program development as a formal activity. IEEE Transactions on Software Engineering, SE-7(1), 14-22.
Bryant, R. E. (1986). Graph-based algorithms for Boolean function manipulation. IEEE Transactions on Computers, 35, 677-691.
Buchanan, B. G., & Wilkins, D. C. (Eds.). (1993). Readings in Knowledge Acquisition and Learning. Morgan Kaufmann, San Mateo, CA.
Bundy, A. (1988). The use of explicit plans to guide inductive proofs. In Proc. 9th Conference on Automated Deduction (CADE), pp. 111-120. Springer.
Bunke, H., & Messmer, B. T. (1994). Similarity measures for structured representations. In Wess, S., Althoff, K.-D., & Richter, M. (Eds.), Proc. 1st European Workshop on Topics in Case-Based Reasoning, Vol. 837 of LNAI, pp. 106-118. Springer.
Burch, J. R., Clarke, E. M., McMillan, K. L., & Hwang, J. (1992). Symbolic model checking: 10^20 states and beyond. Information and Computation, 98(2), 142-170.
Burghardt, J., & Heinz, B. (1996). Implementing anti-unification modulo equational theory. Tech. rep., Arbeitspapiere der GMD. ftp://ftp.first.gmd.de/pub/jochen/Papers/ap1006.dvi.gz.
Burns, B. (1996). Meta-analogical transfer: Transfer between episodes of analogical reasoning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22(4), 1032-1048.
Burstall, R., & Darlington, J. (1977). A transformation system for developing recursive programs. JACM, 24(1), 44-67.
Burton, C. T. P. (1992). Program morphisms. Formal Aspects of Computing, 4, 693-726.
Bylander, T. (1994). The computational complexity of STRIPS planning. Artificial Intelligence, 69, 165-204.
Carbonell, J. (1986). Derivational analogy: A theory of reconstructive problem solving and expertise acquisition. In Michalski, R., Carbonell, J., & Mitchell, T. (Eds.), Machine Learning: An Artificial Intelligence Approach, Vol. 2, pp. 371-392. Morgan Kaufmann, Los Altos, CA.
Cheatham, T. E. (1984). Reusability through program transformation. IEEE Transactions on Software Engineering, 10(5), 589-594.
Chomsky, N. (1959). Review of Skinner's 'Verbal Behavior'. Language, 35, 26-58.
Christofides, N. (1975). Graph Theory: An Algorithmic Approach. Academic Press, London.
Cimatti, A., Roveri, M., & Traverso, P. (1998). Automatic OBDD-based generation of universal plans in non-deterministic domains. In Proc. 15th National Conference on Artificial Intelligence (AAAI-98), pp. 875-881, Menlo Park. AAAI Press.
Clement, E., & Richard, J.-F. (1997). Knowledge of domain effects in problem representation: The case of Tower of Hanoi isomorphs. Thinking and Reasoning, 3(2), 133-157.
Cohen, W. W. (1992). Desiderata for generalization-to-n algorithms. In Proc. Int. Workshop on Analogical and Inductive Inference (AII '92), Dagstuhl Castle, Germany, Vol. 642 of LNAI, pp. 140-150. Springer.
Cohen, W. W. (1995). Pac-learning recursive logic programs: Efficient algorithms. Journal of Artificial Intelligence Research, 2, 501-539.
Cohen, W. (1998). Hardness results for learning first-order representations and programming by demonstration. Machine Learning, 30, 57-97.
Cormen, T., Leiserson, C. E., & Rivest, R. (1990). Introduction to Algorithms. MIT Press.
Courcelle, B. (1990). Recursive applicative program schemes. In van Leeuwen, J. (Ed.), Handbook of Theoretical Computer Science, Vol. A, pp. 459-492. Elsevier.
Courcelle, B., & Nivat, M. (1978). The algebraic semantics of recursive program schemes. In Winkowski, J. (Ed.), Proc. 7th Symposium on Mathematical Foundations of Computer Science, Vol. 64 of LNCS, pp. 16-30. Springer.
Cypher, A. (Ed.). (1993). Watch What I Do: Programming by Demonstration. MIT Press, Cambridge, MA.
Davey, B. A., & Priestley, H. A. (1990). Introduction to Lattices and Order. Cambridge University Press.
Dean, T., Basye, K., & Shewchuk, J. (1993). Reinforcement learning for planning and control. In Minton, S. (Ed.), Machine Learning Methods for Planning, chap. 3, pp. 67-92. Morgan Kaufmann.
Dershowitz, N. (1986). Programming by analogy. In Michalski, R. S., Carbonell, J. G., & Mitchell, T. M. (Eds.), Machine Learning: An Artificial Intelligence Approach, Vol. 2, pp. 393-422. Morgan Kaufmann, Los Altos, CA.
Dershowitz, N., & Jouannaud, J.-P. (1990). Rewrite systems. In Leeuwen, J. (Ed.), Handbook of Theoretical Computer Science, Vol. B, pp. 243-320. Elsevier.
Do, M. B., & Kambhampati, S. (2000). Solving planning-graph by compiling it into CSP. In Proc. 5th International Conference on Artificial Intelligence Planning and Scheduling (AIPS 2000), pp. 82-91. AAAI Press.
Dold, A. (1995). Representing, verifying and applying software development steps using the PVS system. In Alagar, V. S., & Nivat, M. (Eds.), Proc. Algebraic Methodology and Software Technology (AMAST'95), Vol. 936 of LNCS, pp. 431-460. Springer.
Duncker, K. (1945). On problem solving. Psychological Monographs, 58.
Eckes, T., & Roßbach, H. (1980). Clusteranalysen. Kohlhammer, Stuttgart.
Edelkamp, S. (2000). Heuristic search planning with BDDs. In Proc. 14th Workshop "New Results in Planning, Scheduling and Design" (PuK2000), Berlin. http://www-is.informatik.uni-oldenburg.de/~sauer/puk2000/paper.html.
Ehrig, H., & Mahr, B. (1985). Fundamentals of Algebraic Specification 1: Equations and Initial Semantics. Springer, Berlin.
Evans, T. G. (1968). A program for the solution of a class of geometric-analogy intelligence-test questions. In Minsky, M. (Ed.), Semantic Information Processing, pp. 271-353. MIT Press.
Falkenhainer, B., Forbus, K., & Gentner, D. (1989). The structure mapping engine: Algorithm and examples. Artificial Intelligence, 41, 1-63.
Feather, M. S. (1987). A survey and classification of some program transformation approaches and techniques. In Meertens, L. G. L. T. (Ed.), Program Specification and Transformation, pp. 165-198. North-Holland.
Field, A. J., & Harrison, P. G. (1988). Functional Programming. Addison-Wesley, Reading, MA.
Fikes, R., & Nilsson, N. (1971). STRIPS: A new approach to the application of theorem proving to problem solving. Artificial Intelligence, 2(3), 189-208.
Fink, E., & Yang, Q. (1997). Automatically selecting and using primary effects in planning: Theory and experiments. Artificial Intelligence Journal, 89, 285-315.
Flener, P. (1995). Logic Program Synthesis from Incomplete Information. Kluwer Academic Press, Boston.
Flener, P., Popelinsky, L., & Stepankova, O. (1994). ILP and automatic programming: towards three approaches. In Proc. of ILP-94, pp. 351-364.
Flener, P., & Yilmaz, S. (1999). Inductive synthesis of recursive logic programs: Achievements and prospects. Journal of Logic Programming, 41(2-3), 141-195.
Fox, M., & Long, D. (1998). The automatic inference of state invariants in TIM. Journal of Artificial Intelligence Research, 9, 367-421.
Frühwirth, T., & Abdennadher, S. (1997). Constraint-Programming. Springer, Berlin.
340 BIBLIOGRAPHY
Gelfond, M., & Lifschitz, V. (1993). Representing action and change by logic programs. Journal of Logic Programming, 17, 301–322.
Gentner, D. (1980). The Structure of Analogical Models in Science. Bolt Beranek and Newman Inc., Cambridge, MA.
Gentner, D. (1983). Structure-mapping: A theoretical framework for analogy. Cognitive Science, 7, 155–170.
Gentner, D., Brem, S., Ferguson, R., Markman, A. B., Levidow, B., Wolff, P., & Forbus, K. (1997). Analogical reasoning and conceptual change: A case study of Johannes Kepler. Journal of the Learning Sciences, 6 (1), 3–40.
Gentner, D., & Landers, R. (1985). Analogical access: A good match is hard to find. In Proc. Annual Meeting of the Psychonomic Society, Boston.
Gentner, D., Ratterman, M. J., & Forbus, K. D. (1993). The roles of similarity in transfer: Separating retrievability from inferential soundness. Cognitive Psychology, 25 (4), 524–575.
Georgeff, M. P. (1987). Planning. Annual Review of Computer Science, 2, 359–400. Also in J. Allen, J. Hendler and A. Tate (Eds.), Readings in Planning, Morgan Kaufmann, 1990.
Gholson, B., Smither, D., Buhrman, A., Duncan, M. K., & Pierce, K. (1996). The sources of children's reasoning errors during analogical problem solving. Applied Cognitive Psychology, Special Issue: "Reasoning Processes", 10, 85–97.
Gick, M., & Holyoak, K. (1980). Analogical problem solving. Cognitive Psychology, 12, 306–355.
Gick, M., & Holyoak, K. (1983). Schema induction and analogical transfer. Cognitive Psychology, 14, 1–38.
Giesl, J. (2000). Context-moving transformations for function verification. In Proc. 9th International Workshop on Logic-based Program Synthesis and Transformation (LOPSTR '99). Springer.
Ginsberg, M. (1989). Universal planning: An (almost) universally bad idea. AI Magazine, 10 (4), 40–44.
Giunchiglia, F., & Traverso, P. (1999). Planning as model checking. In Biundo, S., & Fox, M. (Eds.), Proc. 5th European Conference on Planning (ECP'99), Durham, UK, LNCS. Springer.
Gold, E. M. (1978). Complexity of automaton identification from given data. Information and Control, 37, 302–320.
Gold, E. (1967). Language identification in the limit. Information and Control, 10, 447–474.
Green, C. (1969). Application of theorem proving to problem solving. In Proc. 1st International Joint Conference on Artificial Intelligence (IJCAI-69), pp. 342–344.
Green, C., Luckham, D., Balzer, R., Cheatham, T., & Rich, C. (1983). Report on a knowledge-based software assistant. Tech. rep. KES.U.83.2, Kestrel Institute, Palo Alto, CA.
Green, C., & Barstow, D. (1978). On program synthesis knowledge. Artificial Intelligence, 10, 241–279.
Guessarian, I. (1992). Trees and algebraic semantics. In Nivat, M., & Podelski, A. (Eds.), Tree Automata and Languages, pp. 291–310. Elsevier.
Hasker, R. W. (1995). The Replay of Program Derivations. Ph.D. thesis, Univ. of Illinois at Urbana-Champaign.
Haslum, P., & Geffner, H. (2000). Admissible heuristics for optimal planning. In Proc. 5th International Conference on Artificial Intelligence Planning and Scheduling (AIPS 2000), pp. 150–158. AAAI Press.
Hickman, A. K., & Larkin, J. H. (1990). Internal analogy: A model of transfer within problems. In Proc. 12th Annual Conference of the Cognitive Science Society (CogSci-90), pp. 53–60, Hillsdale, NJ. Lawrence Erlbaum.
Hinman, P. (1978). Recursion-Theoretic Hierarchies. Springer, New York.
Hinz, A. M. (1996). Square-free Tower of Hanoi sequences. Enseign. Math., 2 (42), 257–264.
Hofstadter, D., & The Fluid Analogies Research Group (1995). Fluid Concepts and Creative Analogies. Basic Books, New York.
Holland, J., Holyoak, K., Nisbett, R., & Thagard, P. (1986). Induction – Processes of Inference, Learning, and Discovery. MIT Press, Cambridge, MA.
Holyoak, K. J., & Thagard, P. (1989). Analogical mapping by constraint satisfaction. Cognitive Science, 13, 295–355.
Holyoak, K., & Koh, K. (1987). Surface and structural similarity in analogical transfer. Memory and Cognition, 15, 332–440.
Hopcroft, J., & Ullman, J. (1980). Introduction to Automata Theory, Languages, and Computation. Addison-Wesley.
Huang, Y., Selman, B., & Kautz, H. (2000). Learning declarative control rules for constraint-based planning. In Proc. 17th International Conference on Machine Learning (ICML-2000), Stanford, CA, pp. 415–422. Morgan Kaufmann.
Huet, G. (1976). Résolution d'équations dans les langages d'ordre 1, 2, ..., ω. Ph.D. thesis, Université de Paris VII.
Hummel, J., & Holyoak, K. (1997). Distributed representation of structure: A theory of analogical access and mapping. Psychological Review, 104 (3), 427–466.
Idestam-Almquist, P. (1993). Generalization under implication by recursive anti-unification. In Utgoff, P. (Ed.), Proc. 10th International Conference on Machine Learning (ICML-93), pp. 151–158. Morgan Kaufmann.
Idestam-Almquist, P. (1995). Efficient induction of recursive definitions by structural analysis of saturations. In De Raedt, L. (Ed.), Proc. 5th International Workshop on Inductive Logic Programming (ILP-95), Leuven, pp. 77–94.
Indurkhya, B. (1992). Metaphor and Cognition. Kluwer, Dordrecht, The Netherlands.
Jamnik, M., Kerber, M., & Benzmüller, C. (2000). Towards learning new methods in proof planning. In Proc. of the Calculemus Symposium: Systems for Integrated Computation and Deduction, St. Andrews, Scotland, UK. ftp://ftp.cs.bham.ac.uk/pub/authors/M.Kerber/TR/CSRP-00-09.ps.gz.
Jensen, R. M., & Veloso, M. M. (2000). OBDD-based universal planning for synchronized agents in non-deterministic domains. Journal of Artificial Intelligence Research, 13, 189–226.
Johnson, W. L. (1988). Modelling programmers' intentions. In Self, J. (Ed.), Artificial Intelligence and Human Learning – Intelligent Computer-Aided Instruction, pp. 374–390. Chapman and Hall, London.
Johnson, W., & Feather, M. (1991). Using evolution transformations to construct specifications. In Lowry, M., & McCartney, R. (Eds.), Automating Software Design, chap. 4, pp. 65–92. AAAI Press/MIT Press, Menlo Park.
Jouannaud, J. P., & Kodratoff, Y. (1979). Characterization of a class of functions synthesized from examples by a Summers like method using a `B.M.W.' matching technique. In Proc. International Joint Conference on Artificial Intelligence (IJCAI-79), pp. 440–447. Morgan Kaufmann.
Kalmar, Z., & Szepesvari, C. (1999). An evaluation criterion for macro learning and some results. Tech. rep. TR99-01, Mindmaker Ltd., Budapest. http://victoria.mindmaker.hu/~szepes/papers/macro-tr99-01.ps.gz.
Kambhampati, S. (2000). Planning graph as a (dynamic) CSP: Exploiting EBL, DDB and other CSP search techniques in Graphplan. Journal of Artificial Intelligence Research, 12, 1–34.
Kautz, H., & Selman, B. (1996). Pushing the envelope: Planning, propositional logic and stochastic search. In Proc. 13th National Conference on Artificial Intelligence and 8th Innovative Applications of Artificial Intelligence Conference, pp. 1194–1201, Menlo Park. AAAI Press/MIT Press.
Keane, M. (1996). On adaptation in analogy: Tests of pragmatic importance and adaptability in analogical problem solving. The Quarterly Journal of Experimental Psychology, 49A (4), 1062–1085.
Keane, M., Ledgeway, T., & Duff, S. (1994). Constraints on analogical mapping: A comparison of three models. Cognitive Science, 18, 387–438.
Knuth, D. E. (1992). Convolution polynomials. The Mathematica Journal, 2 (4), 67–78.
Kodratoff, Y., & J. Fargues (1978). A sane algorithm for the synthesis of LISP functions from example problems: The Boyer and Moore algorithm. In Proc. AISE Meeting, Hambourg, pp. 169–175.
Kodratoff, Y., Franova, M., & Partridge, D. (1989). Why and how program synthesis?. In Proc. International Workshop on Analogical and Inductive Inference (AII '89), LNCS, pp. 45–59. Springer.
Koedinger, K., & Anderson, J. (1990). Abstract planning and perceptual chunks: Elements of expertise in geometry. Cognitive Science, 14, 511–550.
Koehler, J. (1998). Planning under resource constraints. In Prade, H. (Ed.), Proc. 13th European Conference on Artificial Intelligence (ECAI-98), pp. 489–493. Wiley.
Koehler, J., Nebel, B., & Hoffmann, J. (1997). Extending planning graphs to an ADL subset. In Proc. European Conference on Planning (ECP-97), pp. 273–285. Springer. Extended version as Technical Report No. 88/1997, University Freiburg.
Kolbe, T., & Walther, C. (1995). Second-order matching modulo evaluation: A technique for reusing proofs. In Mellish, C. S. (Ed.), Proc. 14th International Joint Conference on Artificial Intelligence (IJCAI-95), pp. 190–195. Morgan Kaufmann.
Kolodner, J. (1993). Case-based Reasoning. Morgan Kaufmann, San Mateo, CA.
Korf, R. E. (1985). Macro-operators: A weak method for learning. Artificial Intelligence, 26, 35–77.
Koza, J. (1992). Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Cambridge, MA.
Kreitz, C. (1998). Program synthesis. In Bibel, W., & Schmitt, P. (Eds.), Automated Deduction – A Basis for Applications, Vol. III, chap. III.2.5, pp. 105–134. Kluwer.
Laborie, P., & Ghallab, M. (1995). Planning with sharable resource constraints. In Proc. of the 14th IJCAI, pp. 1643–1649. Morgan Kaufmann.
Lassez, J.-L., Maher, M. J., & Marriott, K. (1988). Unification revisited. In Boscarol, M., Aiello, L. C., & Levi, G. (Eds.), Proc. Workshop Foundations of Logic and Functional Programming, Vol. 306 of LNCS, pp. 67–113. Springer.
Lavrac, N., & Dzeroski, S. (1994). Inductive Logic Programming: Techniques and Applications. Ellis Horwood, London.
Le Blanc, G. (1994). BMWk revisited: Generalization and formalization of an algorithm for detecting recursive relations in term sequences. In Bergadano, F., & de Raedt, L. (Eds.), Machine Learning, Proc. of ECML-94, pp. 183–197. Springer.
Leeuwenberg, E. (1971). A perceptual encoding language for visual and auditory patterns. American Journal of Psychology, 84, 307–349.
Lieberman, H. (1993). Tinker: A programming by demonstration system for beginning programmers. In Cypher, A. (Ed.), Watch What I Do: Programming by Demonstration, chap. 2. MIT Press, Cambridge, MA. See http://www.acypher.com/wwid/Chapters/02Tinker.html.
Lifschitz, V. (1987). On the semantics of STRIPS. In Georgeff, M. P., & Lansky, A. L. (Eds.), Reasoning about Actions and Plans, pp. 1–9. Kaufmann, Los Altos, CA.
Long, D., & Fox, M. (1999). The efficient implementation of the plan-graph in STAN. Journal of Artificial Intelligence Research, 10, 87–115.
Lowry, M., & Duran, R. (1989). Knowledge-based software engineering. In A. Barr, P. R. C., & Feigenbaum, E. A. (Eds.), Handbook of Artificial Intelligence, Vol. IV, pp. 241–322. Addison-Wesley, Reading, MA.
Lowry, M. L., & McCartney, R. D. (Eds.). (1991). Automating Software Design. MIT Press, Cambridge, MA.
Lu, S. (1979). A tree-to-tree distance and its application to cluster analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-1 (2), 219–224.
Manna, Z., & Waldinger, R. (1975). Knowledge and reasoning in program synthesis. Artificial Intelligence, 6, 175–208.
Manna, Z., & Waldinger, R. (1979). Synthesis: Dreams → programs. IEEE Transactions on Software Engineering, 5 (4), 294–328.
Manna, Z., & Waldinger, R. (1980). A deductive approach to program synthesis. ACM Transactions on Programming Languages and Systems, 2 (1), 90–121. Also in: C. Rich and R. C. Waters (1986). Readings in AI and Software Engineering, Chap. 1. Morgan Kaufmann.
Manna, Z., & Waldinger, R. (1987). How to clear a block: A theory of plans. Journal of Automated Reasoning, 3 (4), 343–378.
Manna, Z., & Waldinger, R. (1992). Fundamentals of deductive program synthesis. IEEE Transactions on Software Engineering, 18 (8), 674–704.
Martín, M., & Geffner, H. (2000). Learning generalized policies in planning using concept languages. In Proc. 7th International Conference on Knowledge Representation and Reasoning (KR 2000), pp. 667–677. Morgan Kaufmann.
Mayer, R. E. (1983). Thinking, Problem Solving, Cognition. Freeman, New York.
Mayer, R. E. (1988). Teaching and Learning Computer Programming: Multiple Research Perspectives. Erlbaum, Hillsdale, NJ.
McCarthy, J., & Hayes, P. J. (1969). Some philosophical problems from the standpoint of artificial intelligence. Machine Intelligence, 4, 463–502.
McCarthy, J. (1963). Situations, actions, and causal laws. Memo 2, Stanford University Artificial Intelligence Project, Stanford, California.
McDermott, D. (1998a). AIPS-98 Planning Competition Results. http://ftp.cs.yale.edu/pub/mcdermott/aipscomp-results.html.
McDermott, D. (1998b). PDDL – the planning domain definition language. http://ftp.cs.yale.edu/pub/mcdermott.
Mercy, R. (1998). Programmieren durch analoges Schließen: Zwei Algorithmen zur Adaptation rekursiver Programmschemata (Programming by analogy: Two algorithms for the adaptation of recursive program schemes). Master's thesis, Fachbereich Informatik, Technische Universität Berlin.
Messing, L. S., & Campbell, R. (Eds.). (1999). Advances in Artificial Intelligence in Software Engineering. JAI Press, Greenwich, Connecticut.
Michalski, R. S., Carbonell, J. G., & Mitchell, T. M. (Eds.). (1983). Machine Learning – An Artificial Intelligence Approach. Tioga.
Michalski, R. S., Carbonell, J. G., & Mitchell, T. M. (Eds.). (1986). Machine Learning – An Artificial Intelligence Approach, Vol. 2. Morgan Kaufmann.
Minton, S. (1988). Learning Effective Search Control Knowledge: An Explanation-based Approach. Kluwer.
Minton, S. (1985). Selectively generalizing plans for problem-solving. In Proc. International Joint Conference on Artificial Intelligence (IJCAI-85), pp. 596–599. Morgan Kaufmann.
Mitchell, T. M., Keller, R. M., & Kedar-Cabelli, S. T. (1986). Explanation-based generalization: A unifying view. Machine Learning, 1, 47–80.
Mitchell, T. M. (1997). Machine Learning. McGraw-Hill.
Muggleton, S. (1994). Predicate invention and utility. Journal of Experimental and Theoretical Artificial Intelligence, 6 (1), 121–130.
Muggleton, S., & De Raedt, L. (1994). Inductive logic programming: Theory and methods. Journal of Logic Programming, Special Issue on 10 Years of Logic Programming, 19-20, 629–679.
Muggleton, S., & Feng, C. (1990). Efficient induction of logic programs. In Proc. of the Workshop on Algorithmic Learning Theory by the Japanese Society for AI, pp. 368–381.
Muggleton, S. (1991). Inductive logic programming. New Generation Computing, 8, 295–318.
Mühlpfordt, M. (2000). Syntaktische Inferenz Rekursiver Programmschemata (Syntactical inference of recursive program schemes). Master's thesis, Dept. of Computer Science, TU Berlin.
Mühlpfordt, M., & Schmid, U. (1998). Synthesis of recursive functions with interdependent parameters. In Proc. of the Annual Meeting of the GI Machine Learning Group, FGML-98, pp. 132–139, TU Berlin.
Müller, W., & Wysotzki, F. (1995). Automatic synthesis of control programs by combination of learning and problem solving methods. In Lavrac, N., & Wrobel, S. (Eds.), Proc. European Conference on Machine Learning (ECML-95), Vol. 912 of LNAI, pp. 323–326. Springer.
Nebel, B. (2000). On the compilability and expressive power of propositional planning formalisms. Journal of Artificial Intelligence Research, 12, 271–315.
Nebel, B., & Koehler, J. (1995). Plan reuse versus plan generation: A theoretical and empirical analysis. Artificial Intelligence, 76, 427–454.
Newell, A. (1990). Unified Theories of Cognition. Harvard University Press, Cambridge, MA.
Newell, A., Shaw, J. C., & Simon, H. A. (1958). Elements of a theory of human problem solving. Psychological Review, 65, 151–166.
Newell, A., & Simon, H. A. (1972). Human Problem Solving. Prentice Hall, Englewood Cliffs, NJ.
Newell, A., & Simon, H. (1961). GPS, a program that simulates human thought. In Billing, H. (Ed.), Lernende Automaten, pp. 109–124. Oldenbourg, München.
Nilsson, N. J. (1980). Principles of Artificial Intelligence. Springer, New York.
Nilsson, N. (1971). Problem-solving Methods in Artificial Intelligence. McGraw-Hill.
Novick, L. R. (1988). Analogical transfer, problem similarity, and expertise. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 510–520.
Novick, L. R., & Hmelo, C. E. (1994). Transferring symbolic representations across nonisomorphic problems. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20 (6), 1296–1321.
Novick, L. R., & Holyoak, K. J. (1991). Mathematical problem solving by analogy. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17 (3), 398–415.
Odifreddi, P. (1989). Classical Recursion Theory, Vol. 125 of Studies in Logic. North-Holland, Amsterdam.
O'Hara, S. (1992). A model of the redescription process in the context of geometric proportional analogy problems. In Proc. International Workshop on Analogical and Inductive Inference (AII '92), Dagstuhl Castle, Germany, Vol. 642 of LNAI, pp. 268–293. Springer.
Olsson, R. (1995). Inductive functional programming using incremental program transformation. Artificial Intelligence, 74 (1).
Osherson, D., Stob, M., & Weinstein, S. (1986). Systems that Learn. MIT Press, Cambridge, MA.
Parandian, B., Schmid, U., & Wysotzki, F. (1995). Program synthesis with a generalized planning approach. In Program Synthesis by Learning and Planning. Technical Report 95-20, Department of Computer Science, TU Berlin.
Partridge, D. (Ed.). (1991). Artificial Intelligence and Software Engineering. Ablex, Norwood, NJ.
Pednault, E. P. D. (1987). Formulating multiagent, dynamic-world problems in the classical planning framework. In Georgeff, M. P., & Lansky, A. L. (Eds.), Proc. Workshop on Reasoning About Actions and Plans, pp. 47–82. Morgan Kaufmann.
Pednault, E. P. D. (1994). ADL and the state-transition model of action. Journal of Logic and Computation, 4 (5), 467–512.
Penberthy, J. S., & Weld, D. S. (1992). UCPOP: A sound, complete partial order planner for ADL. In Proc. 3rd International Conference on Principles in Knowledge Representation and Reasoning (KR-92), pp. 103–114.
Pettorossi, A. (1984). Towers of Hanoi problems: Deriving the iterative solutions using the program transformation technique. Tech. rep. R.82, Instituto di Analisi dei Sistemi ed Informatica, Roma.
Pirolli, P., & Anderson, J. (1985). The role of learning from examples in the acquisition of recursive programming skills. Canadian Journal of Psychology, 39, 240–272.
Pisch, H. (2000). Ein effizienter Pattern-Matcher zur Synthese rekursiver Programme aus endlichen Termen (An efficient pattern-matcher for the synthesis of recursive programs from finite terms). Master's thesis, Dept. of Computer Science, TU Berlin.
Plaza, E. (1995). Cases as terms: A feature term approach to the structured representation of cases. In Proc. 1st International Conference on Case-Based Reasoning (ICCBR-95), Vol. 1010 of LNCS, pp. 265–276. Springer.
Plotkin, G. D. (1969). A note on inductive generalization. In Machine Intelligence, Vol. 5, pp. 153–163. Edinburgh University Press.
Polya, G. (1957). How to Solve It. Princeton University Press, Princeton, NJ.
Potter, B., Sinclair, J., & Till, D. (1991). An Introduction to Formal Specification and Z. Prentice Hall.
Quinlan, J. R. (1990). Learning logical definitions from relations. Machine Learning, 5 (3), 239–266.
Quinlan, J. (1986). Induction of decision trees. Machine Learning, 1 (1), 81–106.
Reed, S. K., Ackinclose, C. C., & Voss, A. A. (1990). Selecting analogous problems: Similarity versus inclusiveness. Memory & Cognition, 18 (1), 83–98.
Reed, S. K., & Bolstad, C. (1991). Use of examples and procedures in problem solving. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17 (4), 753–766.
Reed, S., Ernst, G., & Banerji, R. (1974). The role of analogy in transfer between similar problem states. Cognitive Psychology, 6, 436–450.
Rich, C., & Waters, R. C. (1988). Automatic programming: Myths and prospects. IEEE Computer, 21 (11), 10–25.
Rich, C., & Waters, R. (1986a). Introduction. In Rich, C., & Waters, R. (Eds.), Readings in Artificial Intelligence and Software Engineering. Morgan Kaufmann, Los Altos, CA.
Rich, C., & Waters, R. (Eds.). (1986b). Readings in Artificial Intelligence and Software Engineering. Morgan Kaufmann, Los Altos, CA.
Riesbeck, C., & Schank, R. (1989). Inside Case-based Reasoning. Lawrence Erlbaum, Hillsdale, NJ.
Rivest, R. (1987). Learning decision lists. Machine Learning, 2 (3), 229–246.
Robinson, J. A. (1965). A machine-oriented logic based on the resolution principle. Journal of the ACM, 12 (1), 23–41.
Rosenbloom, P. S., & Newell, A. (1986). The chunking of goal hierarchies: A generalized model of practice. In Michalski, R. S., Carbonell, J. G., & Mitchell, T. M. (Eds.), Machine Learning – An Artificial Intelligence Approach, Vol. 2, pp. 247–288. Morgan Kaufmann.
Ross, B. H. (1989). Distinguishing types of superficial similarities: Different effects on the access and use of earlier problems. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15 (3), 456–468.
Rouveirol, C. (1991). Semantic model for induction of first order theories. In Myopoulos, R., & Reiter, J. (Eds.), Proc. 12th International Joint Conference on Artificial Intelligence (IJCAI-91), pp. 685–691. Morgan Kaufmann.
Rumelhart, D. E., & Norman, D. A. (1981). Analogical processes in learning. In Anderson, J. R. (Ed.), Cognitive Skills and Their Acquisition, pp. 335–360. Lawrence Erlbaum, Hillsdale, NJ.
Russell, S. J., & Norvig, P. (1995). Artificial Intelligence. A Modern Approach. Prentice-Hall, Englewood Cliffs, NJ.
Sacerdoti, E. (1975). The nonlinear nature of plans. In Proc. International Joint Conference on Artificial Intelligence (IJCAI-75), pp. 206–214. Morgan Kaufmann.
Sakakibara, Y. (1997). Recent advances of grammatical inference. Theoretical Computer Science, 185, 15–45.
Sammut, C., & Banerji, R. (1986). Learning concepts by asking questions. In Michalski, R., Carbonell, J., & Mitchell, T. (Eds.), Machine Learning – An Artificial Intelligence Approach, Vol. 2, pp. 167–192. Morgan Kaufmann, Los Altos, CA.
Schädler, K., Scheffer, T., Schmid, U., & Wysotzki, F. (1995). Program synthesis by learning and planning. Tech. rep. 95-20, Technische Universität Berlin, Computer Science Department.
Schädler, K., & Wysotzki, F. (1999). Comparing structures using a Hopfield-style neural network. Applied Intelligence, 11, 15–30.
Schmid, U. (1994). Programmieren lernen: Unterstützung des Erwerbs rekursiver Programmiertechniken durch Beispielfunktionen und Erklärungstexte (Learning programming: Acquisition of recursive programming skills from examples and explanations). Kognitionswissenschaft, 4 (1), 47–54.
Schmid, U., & Carbonell, J. (1999). Empirical evidence for derivational analogy. In Wachsmuth, I., & Jung, B. (Eds.), Proc. Jahrestagung der Deutschen Gesellschaft f. Kognitionswissenschaft, Sept. 1999, Bielefeld, pp. 116–121. infix.
Schmid, U., & Kaup, B. (1995). Analoges Lernen beim rekursiven Programmieren (Analogical learning in recursive programming). Kognitionswissenschaft, 5, 31–41.
Schmid, U., Mercy, R., & Wysotzki, F. (1998). Programming by analogy: Retrieval, mapping, adaptation and generalization of recursive program schemes. In Proc. of the Annual Meeting of the GI Machine Learning Group, FGML-98, pp. 140–147, TU Berlin.
Schmid, U., Müller, M., & Wysotzki, F. (2000). Integrating function application in state-based planning. Submitted manuscript.
Schmid, U., Sinha, U., & Wysotzki, F. (2001). Program reuse and abstraction by anti-unification. In Professionelles Wissensmanagement – Erfahrungen und Visionen, pp. 183–185. Shaker. Long version: http://ki.cs.tu-berlin.de/~schmid/pub-ps/pba-wm01-3a.ps.
Schmid, U., Wirth, J., & Polkehn, K. (1999). Analogical transfer of non-isomorphic source problems. In Proc. 21st Annual Meeting of the Cognitive Science Society (CogSci-99), August 19-21, 1999, Simon Fraser University, Vancouver, British Columbia, pp. 631–636. Morgan Kaufmann.
Schmid, U., Wirth, J., & Polkehn, K. (2001). A closer look on structural similarity in analogical transfer. Submitted manuscript.
Schmid, U., & Wysotzki, F. (1996). Skill acquisition can be regarded as program synthesis. In Schmid, U., Krems, J., & Wysotzki, F. (Eds.), Proc. 1st European Workshop on Cognitive Modeling (14.-16.11.96), pp. 39–45, Technical University Berlin. http://ki.cs.tu-berlin.de/EuroCog/cm-program.html.
Schmid, U., & Wysotzki, F. (1998). Induction of recursive program schemes. In Proc. 10th European Conference on Machine Learning (ECML-98), Vol. 1398 of LNAI, pp. 214–225. Springer.
Schmid, U., & Wysotzki, F. (2000a). Applying inductive program synthesis to learning domain-dependent control knowledge – Transforming plans into programs. Tech. rep. CMU-CS-143, Computer Science Department, Carnegie Mellon University.
Schmid, U., & Wysotzki, F. (2000b). Applying inductive program synthesis to macro learning. In Proc. 5th International Conference on Artificial Intelligence Planning and Scheduling (AIPS 2000), pp. 371–378. AAAI Press.
Schmid, U., & Wysotzki, F. (2000c). A unifying approach to learning by doing and learning by analogy. In Callaos, N. (Ed.), 4th World Multiconference on Systemics, Cybernetics and Informatics (SCI 2000), Vol. 1, pp. 379–384. IIIS.
Schmid, U. (1999). Iterative macro-operators revisited: Applying program synthesis to learning in planning. Tech. rep. CMU-CS-99-114, Computer Science Department, Carnegie Mellon University.
Schoppers, M. (1987). Universal plans for reactive robots in unpredictable environments. In IJCAI '87, pp. 1039–1046. Morgan Kaufmann.
Schrödl, S., & Edelkamp, S. (1999). Inferring flow of control in program synthesis by example. In Proc. Annual German Conference on Artificial Intelligence (KI'99), Bonn, LNAI, pp. 171–182. Springer.
Shapiro, E. Y. (1983). Algorithmic Program Debugging. MIT Press.
Shavlik, J. W. (1990). Acquiring recursive and iterative concepts with explanation-based learning. Machine Learning, 5, 39–70.
Shell, P., & Carbonell, J. (1989). Towards a general framework for composing disjunctive and iterative macro-operators. In Proc. 11th International Joint Conference on Artificial Intelligence (IJCAI-89), Detroit, MI.
Siekmann, J. H. (1989). Unification theory. Journal of Symbolic Computation, 7, 207–274.
Simon, H. (1958). Rational choice and the structure of the environment. In Models of Bounded Rationality, Vol. 2. MIT Press, Cambridge, MA.
Simon, H. A., & Hayes, J. R. (1976). The understanding process: Problem isomorphs. Cognitive Psychology, 8, 165–190.
Sinha, U. (2000). Gedächtnisorganisation und Abruf von rekursiven Programmschemata beim analogen Programmieren durch typindizierte Anti-Unifikation (Memory organization and retrieval of recursive program schemes in programming by analogy with anti-unification). Master's thesis, Fachbereich Informatik, TU Berlin.
Smith, D. R. (1990). KIDS: A semiautomatic program development system. IEEE Transactions on Software Engineering, 16 (9), 1024–1043.
Smith, D. R. (1991). KIDS: A knowledge-based software development system. In Lowry, M., & McCartney, R. (Eds.), Automating Software Design, chap. 4, pp. 483–514. MIT Press, Menlo Park.
Smith, D. R. (1993). Constructing specification morphisms. Journal of Symbolic Computation, 15 (5-6), 571–606.
Smith, D. R. (1985). Top-down synthesis of divide-and-conquer algorithms. Artificial Intelligence, 27, 43–96.
Smith, D. (1984). The synthesis of LISP programs from examples: A survey. In Biermann, A., Guiho, G., & Kodratoff, Y. (Eds.), Automatic Program Construction Techniques, pp. 307–324. Macmillan.
Soloway, E., & Spohrer, J. (Eds.). (1989). Studying the Novice Programmer. Lawrence Erlbaum, Hillsdale, NJ.
Sommerville, I. (1996). Software Engineering (5th edition). Addison-Wesley.
Spellman, B. A., & Holyoak, K. (1996). Pragmatics in analogical mapping. Cognitive Psychology, 31, 307–346.
Spivey, J. M. (1992). The Z Notation: A Reference Manual (2nd edition). Prentice-Hall.
Sterling, L., & Shapiro, E. (1986). The Art of Prolog: Advanced Programming Techniques. MIT Press.
Summers, P. D. (1977). A methodology for LISP program construction from examples. Journal ACM, 24 (1), 162–175.
Sun, R., & Sessions, C. (1999). Self-segmentation of sequences: Automatic formation of hierarchies of sequential behaviors. Tech. rep. 609-951-2781, NEC Research Institute, Princeton. http://cs.ua.edu/~rsun/sun.sss.ps.
Sutton, R. S., & Barto, A. G. (1998). Reinforcement Learning – An Introduction. MIT Press.
Thompson, C. A., Califf, M. E., & Mooney, R. J. (1999). Active learning for natural language parsing and information extraction. In Bratko, I., & Dzeroski, S. (Eds.), Proc. 16th International Conference on Machine Learning (ICML-99), pp. 406–414. Morgan Kaufmann.
Thompson, S. (1991). Type Theory and Functional Programming. Addison-Wesley, Workingham, Eng.
Tversky, A. (1977). Features of similarity. Psychological Review, 85, 327–352.
Unger, S., & Wysotzki, F. (1981). Lernfähige Klassifizierungssysteme (Classifier Systems). Akademie-Verlag, Berlin.
Valiant, L. (1984). A theory of the learnable. Communications of the ACM, 27 (11), 1134–1142.
Veloso, M. (1994). Planning and Learning by Analogical Reasoning. Springer.
Veloso, M., Carbonell, J., Perez, M. A., Borrajo, D., Fink, E., & Blythe, J. (1995). Integrating planning and learning: The Prodigy architecture. Journal of Experimental and Theoretical Artificial Intelligence, 7 (1), 81–120.
Veloso, M. M., & Carbonell, J. G. (1993). Derivational analogy in Prodigy: Automating case acquisition, storage, and utilization. Machine Learning, 10, 249–278.
Vosniadou, S., & Ortony, A. (Eds.). (1989). Similarity and Analogical Reasoning. Cambridge University Press.
Vosniadou, S., & Ortony, A. (1989). Similarity and analogical reasoning: A synthesis. In Vosniadou, S., & Ortony, A. (Eds.), Similarity and Analogical Reasoning, pp. 1–17. Cambridge University Press, Cambridge.
Waldinger, R. (1977). Achieving several goals simultaneously. In Elcock, E. W., & Michie, D. (Eds.), Machine Intelligence, Vol. 8, pp. 94–136. Wiley.
Walsh, T. (1983). Iteration strikes back – at the cyclic Towers of Hanoi. Information Processing Letters, 16, 91–93.
Weber, G. (1996). Episodic learner modeling. Cognitive Science, 20, 195–236.
Weld, D. (1999). Recent advances in AI planning. AI Magazine, 20 (2), 93–123.
Widowski, D., & Eyferth, K. (1986). Comprehending and recalling computer programs of different structural and semantic complexity by experts and novices. In Willumeit, H. (Ed.), Human Decision Making and Manual Control. Elsevier, North-Holland.
Winston, P. (1980). Learning and reasoning by analogy. Communications of the ACM, 23, 689–703.
Winston, P. (1992). Artificial Intelligence, 3rd Edition. Addison-Wesley, Reading, MA.
Winston, P., & Horn, B. (1989). LISP. Addison-Wesley, Reading, MA.
Wolf, B. (2000). Automatic generation of state sets from domain specifications. Master's thesis, Fachbereich Informatik, TU Berlin.
Wysotzki, F. (1983). Representation and induction of infinite concepts and recursive action sequences. In Proc. 8th International Joint Conference on Artificial Intelligence (IJCAI-83), Karlsruhe, pp. 409–414. William Kaufmann.
Wysotzki, F. (1987). Program synthesis by hierarchical planning. In Jorrand, P., & Sgurev, V. (Eds.), Artificial Intelligence: Methodology, Systems, Applications, pp. 3–11. Elsevier, Amsterdam.
Wysotzki, F., & Schmid, U. (2001). Synthesis of recursive programs from finite examples by detection of macro-functions. Tech. rep. 01-2, Dept. of Computer Science, TU Berlin, Germany.
Yu, T., & Clark, C. (1998). Recursion, lambda abstraction and genetic programming. In Late Breaking Papers at EuroGP'98, pp. 26–30, University of Birmingham, UK.
Zelinka, B. (1975). On a certain distance between isomorphism classes of graphs. Časopis pro pěstování matematiky, 100, 371–372.
Zeugmann, T., & Lange, S. (1995). A guided tour across the boundaries of learning recursive languages. Lecture Notes in Computer Science, 961, 193–262.
Zweben, M., & Fox, M. S. (1994). Intelligent Scheduling. Morgan Kaufmann.
A. Implementation Details
A.1 Short History of DPlan
The original idea of using universal planning as the initial step of inductive program synthesis was presented in Wysotzki (1987): sets of basic programs and axioms are used to construct a kind of conditional plan with the top-level goals as root node; the ordering of dependent goals was realized by backtracking over plan construction. From 1995 to 1997 we explored (implemented and tested) several strategies for generating finite programs by planning:
The approach proposed in Wysotzki (1987) was implemented in different versions by Ute Schmid, by Babak Parandian, and by Olaf Brandes (see Parandian, Schmid, and Wysotzki, 1995).
Furthermore, we investigated combining forward search with decision tree learning:
As a first step, optimal plans are generated for each possible initial state of a domain with small complexity. Initial states are then represented as feature vectors, where the set of all different literals occurring over states is used as features, with value 1 if a literal occurs in a state description and value 0 otherwise. Each initial state is associated with the (or an) optimal action sequence for transforming it into a goal state. A decision tree algorithm (CAL2; see Unger & Wysotzki, 1981) is used to generate a classification program.
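The feature-vector encoding described above can be sketched in a few lines. This is an illustrative Python sketch only (the state literals are invented for illustration; in the actual system the states come from the planner and the tree is built by CAL2):

```python
# Sketch of the 0/1 feature-vector encoding of initial states.
# States are sets of literal strings; the feature set is the union of
# all literals occurring in any state (literals here are invented).

def encode_states(states):
    """Map each state to a 0/1 vector over all occurring literals."""
    features = sorted(set().union(*states))   # all distinct literals
    vectors = [[1 if f in s else 0 for f in features] for s in states]
    return features, vectors

features, vectors = encode_states([{"(on a b)", "(clear a)"},
                                   {"(on b a)", "(clear b)"}])
```

Each vector, paired with its optimal action sequence as class label, is then a training example for the decision tree learner.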
Transforming such a program into a finite program term for generalization-to-n involves the same problems as discussed in chapter 8, along with some additional problems: First, the decision tree does not necessarily represent an order over the number of transformations involved: it is possible that the first attribute already branches to a leaf representing the most complex transformation sequence. Second, there is no interleaving between actions and the conditions which have to be fulfilled before executing these actions, which is typical for recursion (see for example the finite program for unload-all in chap. 8). Instead, all conditions necessary to execute the complete transformation sequence are checked first (along a path in the decision tree) and the transformation sequence is executed completely afterwards. A possible way to split such compound actions is discussed in Wysotzki (1983). [1]
A second forward-search approach is presented in Briesemeister et al. (1996) (see sect. 2.5.2).
In 1998 we came up with the first version of DPlan as a state-based non-linear backward-planner constructing universal plans (implemented by Ute Schmid). This first algorithm, its formalization, and proofs of completeness and correctness are presented in Schmid (1999). DPlan1.0 works for STRIPS-like domain specifications extended to binary conditioned effects. The generated universal plan is a minimal spanning tree. That is, for domains with sets of optimal solutions, only one alternative is calculated for each possible state. DPlan1.0 was extended over the last year by several people in several ways:
- Domain specification in PDDL, STRIPS + general conditioned effects (see report of the student project "Extending DPlan to an ADL-subset" by Janin Toussaint, Ulrich Wagner, and Hakan Irican, 1999, http://ki.cs.tu-berlin.de/~schmid/mlpj2/ss99/mlpj2_99.html).
- Constructing universal plans without a predefined set of states (see report of the student project "Universal Planning without state sets" by Michael Christmann and Stefan Ronnecke, 1999, URL as above).
- Preliminary work on plan construction using control knowledge represented as recursive macros (see report of the student project "Planning with recursive macros", Mischa Neumann and Ralf Ansorg, 1999, URL as above).
- Calculating DAGs instead of minimal spanning trees (Schmid, 1999).
- Extending DPlan to function application, as reported in chapter 4. This work is based on a diploma thesis by Marina Müller, May 2000 (http://ki.cs.tu-berlin.de/projects/dipl.html).
Current work in progress is to extend DPlan to further features of PDDL, namely all-quantification and negation, and to investigate soundness and completeness of planning for these more general operator definitions (diploma thesis by Bernhard Wolf, Sept. 2000, http://ki.cs.tu-berlin.de/projects/dipl.html).
DPlan was extended (January 2001) to include recursive program schemes fulfilling a certain class of goals in the domain specification. If the top-level goals of a new planning problem match the goals of a recursive program scheme, this scheme is applied and an optimal transformation sequence is constructed directly, without planning. This extension was recently realized by Janin Toussaint as part of her diploma thesis on plan transformation for program synthesis (see below).
[1] This work is not documented in a paper. The documented program together with some examples can be obtained from Ute Schmid.
A.2 Modules of DPlan
DPlan is implemented in Common Lisp. The program together with example domains and further information can be obtained via http://ki.cs.tu-berlin.de/~schmid/IPAL/dplan.html.
- dplan.lsp: main program. Construction of the universal plan, saved in plan-steps as a list of pstep structures; recursive calculation of pre-images, as described in table 3.1.
- plan-dstruc.lsd: global data structure pstep, see below.
- ps-back.lsp: calculating the set of legal predecessors of a state (match and backward operator application). The function (apply-rules <current-state>) is called from dplan and returns a list of new psteps; the new psteps are pruned with respect to the current plan.
- showplan.lsp: graphical and term output of plans. The function (show plan-steps) is called from dplan; graphs can be displayed with graphlet, trees can be displayed with xterm.
- parse.lsp, selectors.lsp: handling the PDDL input (implemented by Marina Müller).
- generate.lsp, range.lsp, update.lsp: handling update effects for planning with function application (implemented by Marina Müller).
A plan-step is a pair <instop, child>. Additional information is given for the construction of graphical outputs and for plan transformation:
(defstruct pstep
  instop  ; instantiated operator, cf. puttable(A)
  parent  ; parent node: in backward planning successor of instop
          ; input in ps-back "state"
  child   ; child node: in backward planning predecessor of instop
          ; constructed in ps-back
  pre     ; instantiated preconditions of operator
          ; for conditioned operators: union of global and specific
          ; precondition
  add     ; instantiated add-list
  del     ; instantiated del-list
  nodeid  ; identifier
  level   ; level in the ms-dag
)
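The backward construction of the universal plan performed by dplan.lsp can be sketched roughly as follows. This is an illustrative Python sketch, not the actual Lisp implementation; the state names, operator names, and the predecessor function are invented for illustration (in the real system, predecessors are computed by matching and backward operator application in ps-back.lsp):

```python
# Sketch of universal-plan construction by backward search: starting
# from the goal states, repeatedly compute legal predecessors (the
# pre-image) and prune states that are already in the plan, recording
# one plan-step (instop, child, parent, level) per newly reached state.

def build_universal_plan(goal_states, predecessors):
    """predecessors(state) yields (instop, predecessor_state) pairs."""
    plan, seen = [], set(goal_states)
    frontier, level = list(goal_states), 0
    while frontier:
        level += 1
        nxt = []
        for state in frontier:
            for instop, pred in predecessors(state):
                if pred not in seen:          # prune already-planned states
                    seen.add(pred)
                    nxt.append(pred)
                    plan.append((instop, pred, state, level))
        frontier = nxt
    return plan
```

Because each state is recorded only once, the result is a minimal spanning tree of the state space rooted in the goal states, mirroring the behavior of DPlan1.0.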
A.3 DPlan Specifications
DPlan specifications are given in standard PDDL. For a complete description of the syntax of PDDL see McDermott (1998b). We give a Tower of Hanoi problem as example:
(define (domain disk-world-domain)
  (:action move
   :parameters (?d ?x ?y)
   :precondition (and (on ?d ?x) (clear ?d) (clear ?y) (smaller ?d ?y))
   :effect
     ((and ((on ?d ?y) (clear ?x))
      (not ((on ?d ?x) (clear ?y)))
     ))
))

(define (problem towers-of-hanoi)
  :domain 'disk-world-domain
  :goal ((on d3 p1) (on d2 d3) (on d1 d2))
)

(define (states states-in-disk-world-domain) ; domain restriction
  :states ; complete goal state
  ( ((on d3 p3) (on d2 d3) (on d1 d2) (clear d1) (clear p2) (clear p3)
     (smaller d1 p1) (smaller d2 p1) (smaller d3 p1) (smaller d4 p1)
     (smaller d1 p2) (smaller d2 p2) (smaller d3 p2) (smaller d4 p2)
     (smaller d1 p3) (smaller d2 p3) (smaller d3 p3) (smaller d4 p3)
     (smaller d1 d2) (smaller d1 d3) (smaller d2 d3) (smaller d1 d4)
     (smaller d2 d4) (smaller d3 d4))
))
For planning with update effects we extended PDDL as described in chapter 4. An example of a Tower of Hanoi problem is:
(define (domain disk-world-domain)
  (:action move
   :parameters (?x ?y)
   :precondition ()
   :effect
     ((change (?L2 in (on ?y ?L2) (to (cdr ?L2)))
       (?L1 in (on ?x ?L1) (to (cons-car ?L1 ?L2)))
     ))
   :post (((on ?x ?L1) (on ?y ?L2))
          (f1 ?L2)       ; L2 is not empty
          (f6 ?L1 ?L2))  ; (car L2) < (car L1)
))

(defun cons-car (l1 l2)
  ;(print `(cons-car ,l1 ,l2 ,(cons (car l2) l1)))
  (cons (car l2) l1))

(defun f6 (l1 l2)
  ;(print `(f6 ,l1 ,l2))
  (cond ((null l1) T)
        (T (> (car l1) (car l2)))
  ))
(define (problem towers-of-hanoi)
  :domain 'disk-world-domain
  :goal ((on p1 [ ]) (on p2 [ ]) (on p3 [1 2 3]))
)
A.4 Development of Folding Algorithms
The starting point for our development of a folding algorithm was the approach by Summers (1977) and its extension from a subset of Lisp to term algebras by Wysotzki (1983). Both approaches rely on the identification of differences between pairs of traces in a given finite program. Since the original algorithm of Summers was no longer available and there did not exist an implementation of Wysotzki's approach, we started from scratch:
In the beginning (1997), a simple string pattern-matcher was realized in Common Lisp by Imre Szabo in his diploma thesis (http://ki.cs.tu-berlin.de/projects/dipl.html). This algorithm only worked for detecting linear recursions with independent variable substitutions in the recursive call.
At the same time, an algorithm based on the identification of unifiable sub-terms of a given finite program term was implemented in Common Lisp by Dirk Matzke in his diploma thesis (http://ki.cs.tu-berlin.de/projects/dipl.html). He used the tree-transformation algorithm of Lu (1979), which was originally implemented in a student project where it was applied to matching of different finite programs in the context of programming by analogy (Schadler, Scheffer, Schmid, & Wysotzki, 1995).
In a student project in 1997, Martin Mühlpfordt and Markus Jurinka worked on a formalization of folding based on induction of context-free tree grammars (Mühlpfordt & Schmid, 1998), and Martin Mühlpfordt implemented a pattern matcher in Prolog based on this formalization. This implementation already works for recursive equations realizing arbitrary recursion forms (linear, tree, combinations) and for interdependent variable substitutions.
Heike Pisch (2000) provided an efficient implementation of this approach in her diploma thesis (http://ki.cs.tu-berlin.de/projects/dipl.html), using a different approach for generating valid hypotheses for segmentations.
The preceding two projects were the starting point for the powerful folding algorithm realized in Common Lisp by Martin Mühlpfordt in his diploma thesis 2000 (http://ki.cs.tu-berlin.de/projects/dipl.html), which is presented in detail in chapter 7: the original approach was extended to RPSs with sets of recursive equations and to the identification of hidden variables in substitution terms.
A.5 Modules of TFold
TFold is implemented in Common Lisp (by Martin Mühlpfordt). The program together with example domains and further information can be obtained via http://ki.cs.tu-berlin.de/~schmid/IPAL/fold.html.
- folder.lsp: main program. The outer control loop for search, as given in figure 7.13 in chapter 7.
  The main function search-RPS takes an initial tree as input and returns an RPS or nil. First, it is tried to find a sub-program for the tree; if this fails, it is tried to find an RPS for each argument of the top-level symbol of the tree.
  The function search-SubProg-Backtrack implements the search for a sub-program by backtracking, as described in figure 7.12 in chapter 7. The idea is to incrementally build skeletons of the subprogram body until a valid segmentation is found (a) which can be extended to the body and (b) such that the resulting instances of the variables can be explained by substitutions.
  Some test functions (converting, folding, and printing the result).
  The initial tree is first converted into a special representation: all (maximal) sub-trees which do not contain "Omegas" are converted to a vector with 2 elements.
- base-fkt-rps.lsp: functions for constructing an RPS from the induced components (signature, head, parameters, body, substitutions).
- base-fkt-tree.lsp: functions for searching and partitioning trees, anti-unification of terms.
- skeleton.lsp: constructing a skeleton and checking whether it is valid.
- substitutions.lsp: calculating the substitutions for variables in recursive calls.
- term2gml.lsp: transformation of terms into gml format for graphlet, which provides a graphical output.
- read-gml.lsp: transformation of gml-represented terms into terms as lists of lists.
- pretty-rps-print.lsp: functions for the output of RPSs.
- unfolder.lsp: unfolding RPSs into finite programs.
- rps-example.lsp: collection of RPSs.
Currently, a graphical user interface (GUI) realized in Java for the folder is under construction.
RPSs are represented as a collection of sub-programs and a main program. Sub-programs are represented as structures. An example for ModList, presented in chapter 7, is:
(defun ModList ()
  (setq Sub1 (make-sub :name 'ModList
               :params '(n l)
               :body '(if (empty l)
                          nill
                          (cons (if (eq0 (Mod n (head l)))
                                    true
                                    false)
                                (ModList n (tail l))))
  ))
  (setq Sub2 (make-sub :name 'Mod
               :params '(n k)
               :if '(g (< n k) n (Mod (- n k) k))
  ))
  (make-rps :main '(ModList 8 (list 9 4 7))
            :sigma (list Sub1 Sub2)))
A.6 Time Effort of Folding
The time measures were performed on a SUN Sparc 20 workstation. For measurement, the Common Lisp time macro was used for unfolding, the complete folding, and the subprocesses of calculating valid segmentations and determining substitutions. Because CPU-time measures are not overly precise, the average of three runs for each problem was calculated.
A.7 Main Components of Plan-Transformation
Plan transformation is work in progress. The current implementation was realized in Common Lisp by Ute Schmid, 2000 (plan-transform.lsp). Currently, the implementation is being extended, and the concept of linearization based on the assumption of sub-goal independence proposed in Wysotzki and Schmid (2001) is being included by Janin Toussaint in her diploma thesis.
The main function of the current implementation is (plantransform <plan>):
- Input: plan as a list of psteps from dplan.lsp
- (decompose plan): initial decomposition; generation and initialization of the global variables subplanlist (list of tplan structures) and planstruc (sub-plan structure as term of sub-plan names)
- (intro-type subplanlist): data type inference for each sub-plan, successively filling the slots of tplan
- (ptransform subplanlist): introducing situation variables and rewriting into a conditional expression
- Output: transformation information as given in subplanlist and planstruc; tplan.term is passed to the generalization-to-n algorithm for each sub-plan; planstruc is extended from sub-plan names to arguments and rewritten into a "main" program
Global data structure:
; input from dplan.lsp is a plan saved in plan-steps as a list of psteps
; global variables
(setq subplanlist nil) ; list of all transformed subplans
                       ; (special case: one subplan)
(setq planstruc nil)   ; structure of global plan (subtrees replaced by
                       ; subplan names)
(setq mstlist nil)     ; list of possible minimal spanning trees
                       ; (backtrack-point for dags which are not sets!)
; .....................................................................
; new structure for transformed plan (one for each subplan)
; initial generation in decompose
(defstruct tplan
  pname    ; name of plan (decompose)
  suplan   ; plan as structure (plan-steps) (decompose)
  coplan   ; plan with data types and relevant predicates
  term     ; plan as term
  ptype    ; type of the plan (singleop, seq, set, list)
  newdat   ; *newly constructed datastructure (a list/set of objects)
  newpred  ; *newly constructed predicate (if newdat =/= nil) as pattern
  npfct    ; *function definition for new predicate
  goalpred ; goal-predicate (might be newpred)
  bottom   ; bottom-element (might be newdat)
  passocs  ; const/constr rewrite pairs
  pafct    ; function definition for the rewriting
  pparams  ; input parameters
  pvalues  ; initial values
)
;; newdat, newpred, npfct are not filled for ptype = singleop and ptype = set
;; newpred: p* (... CO ...) with CO as place-holder for complex object
;;          maybe additional constant arguments
;;          f.e.: (at* CO B) for rocket
;; bottom == newdat (-> newdat might be superfluous)
A.8 Plan Decomposition
Please note: Extracts from the program are given in an abbreviated pseudo-Lisp notation, omitting implementation details!
; call decompose with complete plan and level = 0 (root)
(decompose plan level) ==
  (mapcar λx.(r-dec-p (get-first-op plan lv) x lv) (partition plan))

(get-first-op plan level) ==
  get the first operator-symbol at the upper-most level of the plan

(partition plan) == for each root of ``plan'', return its subplan

(r-dec-p op plan lv) ==
  (let ((dlv (disag-level op plan))
        (pname (gensym "P")))
    (cond ((not dlv) (save-subplan pname plan))   ; single plan
                                                  ; subplan
          ((> dlv lv) (save-subplan pname (get-suplan dlv plan))
                      (decompose (append (make-root (1- dlv) plan)
                                         (rest-suplan dlv plan))
                                 dlv)
          )
          (T ; (= dlv lv)              ; different operators
             (save-subplan pname plan) ; at one level
          )                            ; currently not treated by decomposition
    ))

(disag-level op plan) ==
  returns the level in ``plan'' on which a different operator-name
  than ``op'' appears for the first time

(save-suplan pname plan) ==
  generate a new tplan-structure and instantiate the slots pname with
  ``pname'' and suplan with ``plan'' (see appendix for tplan-structure)

(get-suplan dlv plan) ==
  return a partial plan from the current top-level to level ``dlv''

(make-root lv plan) == get the leaf of the newly generated subplan

(rest-suplan dlv plan) ==
  return the remaining plan with level ``dlv'' as new top-level
A.9 Introduction of Situation Variables
For each subplan:

(rewrite-to-term coplan) ==
  (list 'if (intro-s (first-state coplan))
        's (r-rewrite (rest coplan)) )

(intro-s e) == extend e by an argument ``s''

(r-rewrite coplan) ==
  (cond ((null coplan) 'omega)
        (T (append (intro-s (first-op coplan))
                   (list (list 'if
                               (intro-s (first-state coplan))
                               's (r-rewrite (rest coplan)) )))))
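The effect of this rewriting can be illustrated with a small Python sketch (terms as nested lists; the plan contents are invented for illustration): a linear plan, given as an alternating sequence of state descriptions and operators, becomes a nested conditional in which every state description and every operator receives the situation variable s as an additional argument.

```python
# Sketch of introducing situation variables: a linear plan
# [state, op, state, op, state, ...] is rewritten into a nested
# conditional term over a situation variable "s".

def intro_s(expr):
    """Extend an expression (a list) by the situation argument 's'."""
    return expr + ["s"]

def rewrite_to_term(plan):
    if not plan:
        return "omega"
    return ["if", intro_s(plan[0]), "s", r_rewrite(plan[1:])]

def r_rewrite(plan):
    if not plan:
        return "omega"
    op, state = plan[0], plan[1]
    # the operator wraps the conditional for the rest of the plan
    return intro_s(op) + [["if", intro_s(state), "s", r_rewrite(plan[2:])]]
```

For example, a plan [goal-state, op, pre-state] yields a term of the form (if (goal-state s) s (op s (if (pre-state s) s omega))), which interleaves conditions and actions as needed for folding.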
A.10 Number of MSTs in a DAG
To calculate the number of minimal spanning trees in a DAG, we can use the formula $\prod_{i=0}^{n} c_i$, multiplying the number of alternative choices $c_i$ on each level $i$. For the sorting of three elements, we have: $1 \cdot 1 \cdot 9 = 9$, with a single option for the root and the first level and 9 different options for the third level. To calculate the number of options on a level, the connective structure of the DAG has to be known. The different alternatives on one level can be calculated as described for the construction of minimal spanning trees. For sorting lists with 4 elements we have: $1 \cdot 1 \cdot 52488 \cdot 46656 = 2{,}448{,}880{,}128$!
To omit the explicit calculation of options per level, an upper bound estimate is $c_i \le \left(\frac{a_i}{n_i}\right)^{n_i}$ with $a_i$ as the number of arcs from level $i-1$ to level $i$ and $n_i$ as the number of nodes on level $i$.
We cannot provide a formula for calculating the proportion of regularizable trees to all trees, because regularizability depends on the identity of edge-labels, which varies between domains.
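The per-level counting described above can be sketched in Python. This is an illustrative sketch under an assumed representation (a leveled DAG given as one list of (parent, child) arcs per level); the actual system computes the alternatives during MST construction:

```python
from collections import Counter
from math import prod

# Sketch: count the minimal spanning trees of a leveled DAG.  On each
# level, every node independently chooses one of its incoming arcs, so
# the number of choices c_i is the product of the nodes' in-degrees,
# and the total is the product over all levels.

def count_msts(arcs_per_level):
    total = 1
    for arcs in arcs_per_level:
        indeg = Counter(child for (_parent, child) in arcs)
        total *= prod(indeg.values())   # c_i for this level
    return total
```

For the sorting-of-three DAG mentioned in the text, the per-level choices 1, 1, 9 would multiply to 9 in exactly this way.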
A.11 Extracting Minimal Spanning Trees from a DAG
; lst is tree until lv (initially: root) ; sp is all plan-steps on levels > lv
(defun extract-trees (lst sp lv)
  (let* ((tvec (number-of-msts 1 sp (1+ lv)))
         (tcnt (reduce '* tvec)))
    (print `(There are * ,tvec = ,tcnt minimal spanning trees))
    (fresh-line)
    (cond ((> tcnt 576) (write-string "How many trees? <number> ")
                        (setq k (read-number))
                        (fresh-line)
                        (extract-msts (list lst) sp (1+ lv) k))
          (T (write-string "Generate all alternatives")
             (fresh-line)
             (extract-msts (list lst) sp (1+ lv) tcnt))
    )))

(defun extract-msts (lst sp lv k)
  (print `(Include next level ,lv))
  (let ((lp (remove-if #'(lambda(x) (/= lv (pstep-level x))) sp))
        (rp (remove-if #'(lambda(x) (= lv (pstep-level x))) sp)))
    (cond ((null lp) nil)
          ((null rp) (first-k k (lift
                       (mapcar #'(lambda(x) (pmerge lst x)) (all-combs lp k)))))
                     ; in the last step this is only a "throw-away" of
                     ; already calculated trees!
          (T (extract-msts (first-k k
                             (lift (mapcar #'(lambda(x) (pmerge lst x))
                                           (all-combs lp k))))
                           rp (1+ lv) k)
          ))) )
(defun pmerge (lst e)
  (cond ((null lst) (list e)) ; should never occur
        ((flatlst lst) (join lst e))
        (T (mapcar #'(lambda(x) (join x e)) lst))
  ))

(defun all-combs (lp k)
  (let* ((dc (make-sset (mapcar #'(lambda(x) (pstep-child x)) lp)))
         (lcs (child-split lp dc k))
         (tlcs (mapcar #'(lambda(x) (generate-trees x (1- (length x)))) lcs)))
    (cart-product (car tlcs) (cdr tlcs) k)
  ))

(defun child-split (lp dc k)
  (cond ((null dc) nil)
        (T (cons (first-k k (remove-if #'(lambda(x)
                   (not (setequal (pstep-child x) (car dc)))) lp) )
                 (child-split lp (cdr dc) k)))
  ))

(defun generate-trees (lp nt)
  (cond ((< nt 0) nil)
        (T (setq cur-lp (copy-plan lp))
           (cons (nth nt cur-lp)
                 (generate-trees lp (1- nt))))
  ))

(defun cart-product (fst rst k)
  (cond ((null rst) fst)
        (T (cart-product (first-k k (comb fst (car rst))) (cdr rst) k))
  ))

(defun comb (f r)
  (cond ((null f) nil)
        (T (append (mapcar #'(lambda(x) (join (car f) x)) r)
                   (comb (cdr f) r)))
  ))
A.12 Regularizing a Tree

(defun regularize-tree (mst lv)
  (let ((cur (remove-if #'(lambda(y) (/= lv (pstep-level y))) mst))
        (nxt (remove-if #'(lambda(y) (/= (1+ lv) (pstep-level y))) mst))
       )
    (cond ((null cur) nil)
          ((null nxt) mst)
          (T (let ((opsetcl (mapcar #'(lambda(x) (pstep-instop x)) cur))
                   (opsetnl (mapcar #'(lambda(x) (pstep-instop x)) nxt)))
               (cond ((subsetp opsetnl opsetcl :test 'equal)
                      (regularize-tree (smerge (lv-shift cur opsetnl)
                                               mst) (1+ lv)))
                     (T (regularize-tree mst (1+ lv)))
               ))) ) ) )
(defun lv-shift (cur opsetnl)
  (cond ((null cur) nil)
        ((member (pstep-instop (car cur)) opsetnl :test 'equal)
         (setf nc (copy-pstep (car cur)))
         (setf (pstep-parent nc)
               (id-insert (pstep-parent (car cur))))
         (setf (pstep-level nc) (1+ (pstep-level (car cur))))
         (cons
           (make-pstep
             :instop 'id
             :parent (pstep-parent (car cur))
             :child (id-insert (pstep-parent (car cur)))
             :nodeid (+ 1000 (pstep-nodeid (car cur)))
             :level (pstep-level (car cur))
           )
           (cons nc (lv-shift (cdr cur) opsetnl)))
        )
        (T (cons (car cur) (lv-shift (cdr cur) opsetnl)))
  ))

(defun id-insert (s)
  (cond ((null s) nil)
        ((idlist (car s)) (cons (cons 'id (car s)) (cdr s)))
        (T (cons '(id) s))
  ))

(defun idlist (l)
  (cond ((null l) T)
        ((equal 'id (car l)) (idlist (cdr l)))
        (T nil)
  ))
; if pstep-child of new is equal to a pstep-child in old -> keep new
; (has higher level)
; if pstep-child without "idlist" is equal to a pstep-child in old
; and both are on the same level -> keep old (the one which has
; no or a shorter "idlist")
; ==> parent-node for the nodes in nw with pstep-child of new as
;     parent has to be set to "old" parent!
;     these nodes are still in new because of the sequence of
;     node construction in lv-shift!
(defun smerge (nw old)
  (cond ((null nw) old)
        ((member (car nw) old :test 'node-equal)
         (smerge (cdr nw) (cons (car nw)
                   (remove-if #'(lambda(x) (node-equal
                     (car nw) x)) old))))
        ((member (car nw) old :test 'level-equal)
         (smerge (old-parent (car nw) (cdr nw) old) old))
        (T (smerge (cdr nw) (cons (car nw) old)))
  ))

(defun node-equal (s1 s2)
  (and (= (length (find-if #'(lambda(x) (idlist x)) (pstep-child s1)))
          (length (find-if #'(lambda(x) (idlist x)) (pstep-child s2)))
       )
       (setequal (remove-if #'(lambda(x) (idlist x)) (pstep-child s1))
                 (remove-if #'(lambda(x) (idlist x)) (pstep-child s2))
       )) )

(defun level-equal (s1 s2)
  (and (setequal (remove-if #'(lambda(x) (idlist x)) (pstep-child s1))
                 (remove-if #'(lambda(x) (idlist x)) (pstep-child s2))
       )
       (= (pstep-level s1) (pstep-level s2))
  ))

; if smerge keeps the old state, then the new state with cld as parent
; has to keep the corresponding old parent
(defun old-parent (cld sl old)
  (cond ((null sl) nil)
        ((setequal (pstep-child cld) (pstep-parent (car sl)))
         (cons (update-par (copy-pstep (car sl))
                 (pstep-child (find-if #'(lambda(x)
                   (level-equal cld x)) old))) (cdr sl)))
        (T (old-parent cld (cdr sl) old))
  ))

(defun update-par (sl ud) (setf (pstep-parent sl) ud) sl)
A.13 Programming by Analogy Algorithms
In contrast to other approaches to programming by analogy (in computer science) and analogical problem solving (in cognitive science), our approach allows us to deal likewise with analogy and abstraction: Recursive program schemes can be defined over a signature consisting of a set of primitive functions with fixed interpretation, or over a signature containing (second-order) function variables. Analogy is realized by mapping concrete programs, while abstraction is realized by mapping a second-order scheme with a concrete program. We use the same similarity criterion for retrieval and mapping.
Originally, we investigated tree-transformation; currently we are investigating anti-unification as an approach to mapping. Tree-transformation returns a quantitative measure of structural similarity based on the necessary number of substitutions, deletions, and insertions to make a source program identical to a target candidate. Tree-transformation is introduced in chapter 10 as an example of a measure of structural similarity. Anti-unification is described in chapter 12.
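The idea of such a quantitative measure can be illustrated with a deliberately crude Python sketch. This is not the tree-transformation algorithm of Lu (1979) used in the system; it merely counts head substitutions on shared positions plus the size of inserted or deleted subterms (terms are nested tuples, invented for illustration):

```python
# Crude sketch of a quantitative structural-similarity measure:
# the cost of making one term identical to another, counted as one
# unit per substituted symbol and one unit per inserted/deleted node.
# Terms are nested tuples (head, arg1, ..., argn) or plain symbols.

def size(t):
    if not isinstance(t, tuple):
        return 1
    return 1 + sum(size(a) for a in t[1:])

def edit_cost(s, t):
    if not isinstance(s, tuple):
        s = (s,)
    if not isinstance(t, tuple):
        t = (t,)
    cost = 0 if s[0] == t[0] else 1          # head substitution
    common = min(len(s), len(t)) - 1
    for i in range(1, common + 1):
        cost += edit_cost(s[i], t[i])        # compare shared arguments
    for extra in list(s[common + 1:]) + list(t[common + 1:]):
        cost += size(extra)                  # insert/delete the rest
    return cost
```

A small cost indicates a structurally similar source candidate; the real measure additionally has to consider optimal alignments rather than positional comparison.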
We did some preliminary work on adaptation of non-isomorphic programs. Again in contrast to other approaches, our focus is on generalization over programs, that is, on constructing a hierarchy of recursive program schemes. An overview is given at http://ki.cs.tu-berlin.de/~schmid/IPAL/aprog.html.
Several aspects of programming by analogy and analogical problem solving were investigated:
- Memory organization and retrieval using tree-transformation was investigated by Mark Müller in his diploma thesis (Aug. 1997, http://ki.cs.tu-berlin.de/projects/dipl.html). He used hierarchical cluster analysis to build a hierarchical memory and could show that memory organization resulted in a grouping of programs sharing recursive structures (tail recursion, linear recursion, tree recursion).
- In two psychological experiments, Knuth Polkehn and Joachim Wirth investigated the use of non-isomorphic base problems by human problem solvers (Sept. 1997 and May 1998, http://ki.cs.tu-berlin.de/projects/dipl.html). They could show that human problem solvers can solve new problems if base and target have a structural overlap of at least fifty percent. For our automatic programming system, we take this result as a hint that non-isomorphic sources should be considered for adaptation. But we need a (qualitative) criterion for deciding whether a source candidate is "similar enough" to prefer adaptation over folding from scratch.
- Adaptation of program schemes to new problems was researched by Rene Mercy in his diploma thesis (April 1998, http://ki.cs.tu-berlin.de/projects/dipl.html). He used tree-transformation to construct a mapping. If the mapping only contains unique substitutions, the programs are isomorphic, and adaptation is realized simply by replacing the symbols in the base program by the symbols they are mapped upon in the target. For non-unique substitutions, deletions, and insertions, context in the program tree is used to determine the adaptation of the source program (see chapter 12).
- If base and target share some similarity but adaptation fails, it could be tried to rewrite the base problem to obtain a better mapping. That is, structure mapping is enriched by a background theory. This idea is currently discussed in cognitive science as "re-representation", and Martin Mühlpfordt investigated re-representation in a psychological experiment in his diploma thesis (June 1999, http://ki.cs.tu-berlin.de/projects/dipl.html).
- In 2000 we replaced tree-transformation by (second-order) anti-unification as the backbone of programming by analogy. Thereby we gained a parameter-free, qualitative approach to structural similarity, which furthermore allowed us to deal with mapping and generalization in a uniform way. Adaptation can be modeled by the inverse process, that is, (second-order) unification. Mapping and generalization was researched by Uwe Sinha in his diploma thesis (Aug. 2000, http://ki.cs.tu-berlin.de/projects/dipl.html) and is described in chapter 12.
- Currently, Ulrich Wagner is investigating retrieval and adaptation.
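The core idea of anti-unification, computing the least general generalization of two terms, can be sketched in a few lines. This is a simplified first-order Python sketch on invented terms; the system works with second-order patterns as described in chapter 12:

```python
# Minimal first-order anti-unification sketch: the least general
# generalization of two terms (nested tuples), obtained by descending
# through positions where heads and arities agree and introducing a
# shared variable at every disagreement, reusing the same variable for
# the same pair of mismatching subterms.

def anti_unify(s, t, table=None):
    if table is None:
        table = {}
    if s == t:
        return s
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        return (s[0],) + tuple(anti_unify(a, b, table)
                               for a, b in zip(s[1:], t[1:]))
    # disagreement pair -> generalization variable
    if (s, t) not in table:
        table[(s, t)] = "X%d" % len(table)
    return table[(s, t)]
```

Reusing one variable per disagreement pair is what makes the result the *least* general generalization: shared structure between the two programs is preserved rather than abstracted away.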
B. Concepts and Proofs
B.1 Fixpoint Semantics
Fixpoint theory in general concerns the following question: Given an ordered set $P$ and an order-preserving map $\varphi : P \to P$, does there exist a fixpoint for $\varphi$? A point $x \in P$ is called a fixpoint if $\varphi(x) = x$.
The following introduction of fixpoint semantics closely follows Davey and Priestley (1990). We first introduce some basic concepts and notations. Then, we present the fixpoint theorem for complete partial orders. Finally, we give an illustration for the factorial function.
Definition B.1.1 (Map). (Davey & Priestley, 1990, 1.6) Let $X$ be a set and consider a map $f : X \to X$. $f$ assigns a member $f(x) \in X$ to each member $x \in X$ and is determined by its graph, $\mathrm{graph}\,f = \{(x, f(x)) \mid x \in X\}$ with $\mathrm{graph}\,f \subseteq X \times X$.
A partial map is a map $\sigma : S \to X$ where $S \subseteq X$. With $\mathrm{dom}\,\sigma = S$ we denote the domain of $\sigma$. The set of partial maps on $X$ is denoted $(X \overset{\circ}{\to} X)$ and is ordered in the following way: given $\sigma, \tau \in (X \overset{\circ}{\to} X)$, define $\sigma \sqsubseteq \tau$ iff $\mathrm{dom}\,\sigma \subseteq \mathrm{dom}\,\tau$ and $\sigma(x) = \tau(x)$ for all $x \in \mathrm{dom}\,\sigma$. A subset $G$ of $X \times X$ is the graph of a partial map iff $\forall s \in X : (s, x) \in G$ and $(s, x') \in G \Rightarrow x = x'$.
Definition B.1.2 (Order-Preserving Map). (Davey & Priestley, 1990, 1.11) Let $P$ and $Q$ be ordered sets. A map $\varphi : P \to Q$ is order-preserving (or monotone) if $x \le y$ in $P$ implies $\varphi(x) \le \varphi(y)$ in $Q$.
Definition B.1.3 (Bottom Element). (Davey & Priestley, 1990, 1.23, 1.24) Let $P$ be an ordered set. The least element of $P$, if such exists, is called the bottom element and denoted $\bot$.
Given an ordered set $P$ (with or without $\bot$), we form $P_\bot$ (called "$P$ lifted") as follows: Take an element $0 \notin P$, construct $P_\bot = P \cup \{0\}$ and define $\le$ on $P_\bot$ by $x \le y$ iff $x = 0$ or $x \le y$ in $P$.
Definition B.1.4 (Order-Isomorphism between Partial and Strict Maps). (Davey & Priestley, 1990, 1.29) With each $\pi \in (S \overset{\circ}{\to} S)$ we associate a map $\iota(\pi) : S \to S_\bot$, given by $\iota(\pi) = \pi^\bot$ where
$$\pi^\bot(x) = \begin{cases} \pi(x) & \text{if } x \in \mathrm{dom}\,\pi \\ 0 & \text{otherwise.} \end{cases}$$
Thus $\iota$ is a map from $(S \overset{\circ}{\to} S)$ to $(S \to S_\bot)$. We have used the extra element $0$ to convert a partial map on $S$ into a total map. $\iota$ sets up an order-isomorphism between $(S \overset{\circ}{\to} S)$ and $(S \to S_\bot)$ (Davey & Priestley, 1990, p. 21).
Definition B.1.5 (Supremum). (Davey & Priestley, 1990, 2.1) Let $P$ be an ordered set and let $S \subseteq P$. An element $x \in P$ is an upper bound of $S$ if $s \le x$ for all $s \in S$. $x$ is called least upper bound of $S$ if
1. $x$ is an upper bound of $S$, and
2. $x \le y$ for all upper bounds $y$ of $S$.
Since least elements are unique, the least upper bound is unique if it exists. The least upper bound is also called the supremum of $S$ and is denoted $\sup S$, or $\bigvee S$ ("join of $S$"), or $\bigsqcup S$ if $S$ is directed.
Definition B.1.6 (CPO). (Davey & Priestley, 1990, 3.9) An ordered set $P$ is a CPO (complete partial order) if
1. $P$ has a bottom element $\bot$,
2. $\sup D$ exists for each directed subset $D$ of $P$.
The simplest example of a directed set is a chain, such as $0 \le succ(0) \le succ(succ(0)) \le succ(succ(succ(0))) \le \ldots$. Fixpoint theorems are based on ascending chains.
Definition B.1.7 (Continuous and Strict Maps). (Davey & Priestley, 1990, 3.14) A map $\varphi : P \to Q$ is continuous if for every directed set $D$ in $P$: $\varphi(\bigsqcup D) = \bigsqcup \varphi(D)$. Every continuous map is order-preserving (Davey & Priestley, 1990, 3.15).
A map $\varphi : P \to Q$ such that $\varphi(\bot) = \bot$ is called strict.
Definition B.1.8 (Fixpoint). (Davey & Priestley, 1990, 4.4) Let $P$ be an ordered set and let $\varphi : P \to P$ be a map. We say $x \in P$ is a fixpoint of $\varphi$ if $\varphi(x) = x$. The set of all fixpoints of $\varphi$ is denoted by $\mathrm{fix}(\varphi)$; it carries the induced order. The least element of $\mathrm{fix}(\varphi)$, when it exists, is denoted by $\mu(\varphi)$.
The $n$-fold composite, $\varphi^n$, of a map $\varphi : P \to P$ is defined as follows: $\varphi^n$ is the identity map if $n = 0$ and $\varphi^n = \varphi \circ \varphi^{n-1}$ for $n \ge 1$. If $\varphi$ is order-preserving, so is $\varphi^n$.
Theorem B.1.1 (CPO Fixpoint). (Davey & Priestley, 1990, 4.5) Let $P$ be a complete partial order (CPO), let $\varphi : P \to P$ be an order-preserving map, and define $\alpha = \bigsqcup_{n \ge 0} \varphi^n(\bot)$.
1. If $\alpha \in \mathrm{fix}(\varphi)$, then $\alpha = \mu(\varphi)$.
2. If $\varphi$ is continuous, then the least fixpoint $\mu(\varphi)$ exists and equals $\alpha$.
Proof (CPO Fixpoint). (Davey & Priestley, 1990, 4.5)
1. Certainly $\bot \le \varphi(\bot)$. Applying the order-preserving map $\varphi^n$, we have $\varphi^n(\bot) \le \varphi^{n+1}(\bot)$ for all $n$. Hence we have a chain $\bot \le \varphi(\bot) \le \ldots \le \varphi^n(\bot) \le \varphi^{n+1}(\bot) \le \ldots$ in $P$. Since $P$ is a CPO, $\alpha = \bigsqcup_{n \ge 0} \varphi^n(\bot)$ exists. Let $\beta$ be any fixpoint of $\varphi$. By induction, $\varphi^n(\beta) = \beta$ for all $n$. We have $\bot \le \beta$, hence we obtain $\varphi^n(\bot) \le \varphi^n(\beta) = \beta$ by applying $\varphi^n$. The definition of $\alpha$ forces $\alpha \le \beta$. Hence if $\alpha$ is a fixpoint then it is the least fixpoint.
2. It will be enough to show that $\alpha \in \mathrm{fix}(\varphi)$. We have
$$\varphi\Big(\bigsqcup_{n \ge 0} \varphi^n(\bot)\Big) = \bigsqcup_{n \ge 0} \varphi(\varphi^n(\bot)) \quad (\text{since } \varphi \text{ is continuous})$$
$$= \bigsqcup_{n \ge 1} \varphi^n(\bot)$$
$$= \bigsqcup_{n \ge 0} \varphi^n(\bot) \quad (\text{since } \bot \le \varphi^n(\bot) \text{ for all } n).$$
Consider the factorial function facu(x) = x! with its recursive definition:
facu(x) = 1 if x = 0, and facu(x) = x · facu(x − 1) if x ≥ 1.
To each map f: N₀ → N₀ we may associate a new map f̃ given by:
f̃(x) = 1 if x = 0, and f̃(x) = x · f(x − 1) if x ≥ 1.
The equation satisfied by facu can be reformulated as Φ(f) = f, where Φ(f) = f̃. The entire factorial function cannot be unwound from its recursive specification in a finite number of steps. However we can, for each n = 0, 1, …, determine in a finite number of steps the partial map fₙ which is a restriction of facu to {0, 1, …, n}. The graph of fₙ is {(0, 1), (1, 1), …, (n, n!)}. Therefore, to accommodate approximations of facu we can consider all partial maps on N₀ and regard f̃ as having the domain {0} ∪ {k | k − 1 ∈ dom f}. Similarly, we can work with all maps from N₀ to (N₀)⊥ and take f̃(k) = ⊥ precisely when f(k − 1) = ⊥. The factorial function can be regarded as a solution of a recursive equation Φ(f) = f, where f ∈ (N₀ ⇀ N₀), or equivalently can be regarded as a solution of a recursive equation Φ(f) = f, where f ∈ (N₀ → (N₀)⊥):
Φ(f)(x) = 1 if x = 0, and Φ(f)(x) = x · f(x − 1) if x ≥ 1.
The domain P of Φ, as given in the fixpoint theorem above, is (N₀ ⇀ N₀). That is, Φ maps partial maps on partial maps. The bottom element of (N₀ ⇀ N₀) is ∅. This corresponds to that map in (N₀ → (N₀)⊥) which sends every element to ⊥. We denote this map by ⊥.
It is obvious that Φ is order preserving: We have graph Φ(⊥) = {(0, 1)}, graph Φ²(⊥) = {(0, 1), (1, 1)}, and so on. An easy induction confirms that fₙ = Φⁿ⁺¹(⊥) for all n, where {fₙ} is the sequence of partial maps ("Kleene sequence") which approximate facu. Taking the union of all graphs, we see that ⊔_{n≥0} Φⁿ(⊥) is the map facu: x ↦ x! on N₀. This is the least fixpoint of Φ in (N₀ ⇀ N₀).
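The Kleene sequence above can be made concrete with a short sketch (Python is used here only as illustration; the name `phi` and the dictionary encoding are my own, not from the text). Each iterate Φⁿ(⊥) is represented by its finite graph, starting from the bottom element ⊥, the empty partial map:

```python
def phi(f):
    """One application of the functional Phi: extends the partial map f
    (given as a dict, i.e. its graph) by the recursive factorial definition."""
    g = {0: 1}                   # facu(0) = 1
    for k, v in f.items():
        g[k + 1] = (k + 1) * v   # facu(x) = x * facu(x - 1)
    return g

f = {}                           # bottom: the empty partial map
for _ in range(6):               # Kleene sequence: phi(bottom), phi^2(bottom), ...
    f = phi(f)

print(f)  # graph of phi^6(bottom) = f_5: {0: 1, 1: 1, 2: 2, 3: 6, 4: 24, 5: 120}
```

The union of all these graphs is exactly the graph of facu, the least fixpoint of Φ.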
B.2 Proof: Maximal Subprogram Body
Theorem 7.3.1 states that if a recursive subprogram G exists for a set of initial trees T_init = {t_e | e ∈ E} which can generate all t_e ∈ T_init for a given initial instantiation β_e, then there also exists a subprogram G′ such that the instantiations (over all segments and all examples) do not share a non-trivial pattern (a common prefix). Therefore, it is feasible to generate a hypothesis about the subprogram body for a given segmentation with recursion points U_rec by calculating the maximal pattern of the segments.
Let U_rec = pos(t_G, G) be the set of recursion points in G. Because G is a recursive subprogram, U_rec ≠ ∅ holds. Let U_rec be indexed over R = {1, …, |U_rec|} and let W be the set of unfolding indices over R (def. 7.1.28). Let the body of the subprogram G be t̂_G = t_G[U_rec ← G]. That is, the recursive calls are replaced by a function variable G with arity 0, and therefore, the substitutions in the recursive calls are not given in t̂_G. Let sub: X × R → T(X) be the substitution term of variable x_i ∈ X in the r-th recursive call (r ∈ R) with sub(x_i, r) = t_G|_{u_r ∘ i} (def. 7.1.27).
For better readability we introduce unfolding indices w ∈ W and examples e ∈ E as parameters of the instantiation β: X × W × E → T. We will use variable names with indices to indicate their position. If position is not relevant, variables are written without indices.
The instantiation β: X × W × E → T of a variable x ∈ X in an unfolding w ∈ W in example e ∈ E is defined by the initial instantiation of this variable and by the substitution terms for this variable. In the first unfolding w = ε, β is the initial instantiation for the given example. Further instantiations can be generated by substitutions:
Definition B.2.1 (Generating Instantiations). From the instantiations in unfolding w of an example e, the instantiation in an unfolding w ∘ r can be generated by application of the substitution:
1. β(x, ε, e) = β_e(x).
2. For all w ∈ W, r ∈ R, e ∈ E:
β(x, w ∘ r, e) = sub(x, r){x₁ ← β(x₁, w, e), …, xₙ ← β(xₙ, w, e)}.
There can exist substitutions for x with sub(x, r) ∈ X.
Lemma B.2.1 (Identical Instantiations). If for x ∈ X there exists a substitution sub(x, r) = x′, r ∈ R, x′ ∈ X, then for all e ∈ E: ∀w ∈ W ∃v ∈ W with β(x′, w, e) = β(x, v, e).
Proof (Identical Instantiations). If for an r ∈ R it holds that sub(x, r) = x′ with x′ ∈ X, then it follows for all e ∈ E and w ∈ W:
β(x, w ∘ r, e) = sub(x, r){x₁ ← β(x₁, w, e), …, xₙ ← β(xₙ, w, e)}
= x′{x₁ ← β(x₁, w, e), …, xₙ ← β(xₙ, w, e)}
= β(x′, w, e),
and therefore for each e ∈ E and w ∈ W there exists v ∈ W (v = w ∘ r) such that β(x′, w, e) = β(x, v, e).
From lemma B.2.1 it follows: If a substitution of x is a replacement of x by another variable x′, then in each example the instantiation of x′ is also an instantiation of x. In general, for x there can exist several substitutions where x is replaced by x′, and additionally, for x′ there can be further substitutions sub(x′, r) ∈ X. We define the set of instantiation generating variables for x and generalize lemma B.2.1.
Definition B.2.2 (Instantiation Generating Variables). We define X(x) as the minimal set of variables which directly generate instantiations of x, for which holds:
1. x ∈ X(x).
2. If x′ ∈ X(x) then {sub(x′, r) | r ∈ R, sub(x′, r) ∈ X} ⊆ X(x).
Lemma B.2.2 (Continuation of Identical Instantiations). Let X(x) be the set of variables which directly generate instantiations of x. For all x′ ∈ X(x) and all e ∈ E holds: ∀w ∈ W ∃v ∈ W with β(x′, w, e) = β(x, v, e).
Proof (Continuation of Identical Instantiations). For the given definition of X(x) it must be shown that (1) lemma B.2.2 holds for x, and (2) if x′ ∈ X(x), then lemma B.2.2 holds for all x″ ∈ {sub(x′, r) | r ∈ R, sub(x′, r) ∈ X}.
1. For x ∈ X(x) holds: for all e ∈ E and w ∈ W there exists v ∈ W with v = w such that β(x, w, e) = β(x, v, e).
2. Let x′ ∈ X(x). By lemma B.2.1 it holds for each x″ ∈ {sub(x′, r) | r ∈ R, sub(x′, r) ∈ X} that for each e ∈ E and w ∈ W there exists a v′ ∈ W with β(x″, w, e) = β(x′, v′, e). Because x′ ∈ X(x), for each β(x′, v′, e) there must exist a v ∈ W with β(x′, v′, e) = β(x, v, e). Therefore, for each e ∈ E and w ∈ W there exists v ∈ W with β(x″, w, e) = β(x, v, e).
From lemma B.2.2 it follows: In each example, each instantiation of a variable from X(x) is an instantiation of x. If there exists a non-trivial pattern p of instantiations of x, then p ⪯ β(x, w, e) must hold for all w ∈ W and e ∈ E. Based on lemma B.2.2, for all x′ ∈ X(x) holds:
Definition B.2.3 (Non-Trivial Pattern of an Instantiation). For all x′ ∈ X(x) holds for a non-trivial pattern p of instantiations of x:
1. p ⪯ β_e(x′) for all e ∈ E, and
2. p ⪯ sub(x′, r){x₁ ← β(x₁, w, e), …, xₙ ← β(xₙ, w, e)} for all e ∈ E, w ∈ W, and r ∈ R with sub(x′, r) ∉ X.
Furthermore, there must exist a non-trivial pattern p′ such that for all x′ ∈ X(x) holds:
3. p′ ⪯ β_e(x′) for all e ∈ E, and
4. p′ ⪯ sub(x′, r) for all r ∈ R with sub(x′, r) ∉ X.
The proof of theorem 7.3.1 is divided into the two cases (1) p′ does not contain variables, and (2) p′ does contain variables.
(1) p′ does not contain variables. From var(p′) = ∅ follows
p′ = β_e(x′) for all e ∈ E, and
p′ = sub(x′, r) for all r ∈ R with sub(x′, r) ∉ X, and therefore
p′ = β(x′, w, e) for all x′ ∈ X(x), e ∈ E, w ∈ W.
That means that the instantiations of x′ ∈ X(x) are identical in all unfoldings over all examples. Therefore, it holds for all w ∈ W and e ∈ E: t̂_G{x′ ← β(x′, w, e)} = t̂_G{x′ ← p′}. Consequently, variables x′ ∈ X(x) can be replaced by p′ in the program body.
Furthermore, for all variables x″ ∈ X, e ∈ E, w ∈ W, and r ∈ R holds:
β(x″, w ∘ r, e) = sub(x″, r) {x′ ← β(x′, w, e) | x′ ∈ X(x)} ∪ {x′ ← β(x′, w, e) | x′ ∈ X \ X(x)}
= sub(x″, r) {x′ ← p′ | x′ ∈ X(x)} ∪ {x′ ← β(x′, w, e) | x′ ∈ X \ X(x)}.
That is, in all substitutions the variables from X(x) can be replaced by p′, and these new substitutions generate identical instantiations in all unfoldings over all examples.
Let X′ = X \ X(x) and m = |X′|. Because |X(x)| > 0, it holds |X′| < |X|. Let i₁, …, i_m be the indices of the variables in X′. A new subprogram G′ can be constructed which does not need the variables from X(x):
Head: G′(x_{i₁}, …, x_{i_m}).
Recursive Calls: Each recursive call uses the new program name G′ and the new substitutions. The call at position u_r ∈ U_rec, r ∈ R is:
G′(sub(x_{i₁}, r){x′ ← p′ | x′ ∈ X(x)},
…,
sub(x_{i_m}, r){x′ ← p′ | x′ ∈ X(x)}).
Maximal Pattern of Body: The maximal patterns of the instantiations can be included in the body: t̂_{G′} = t̂_G{x′ ← p′ | x′ ∈ X(x)}.
Body: The program body can be constructed as:
t_{G′} = t_G{x′ ← p′ | x′ ∈ X(x)}
[u_r ← G′(sub(x_{i₁}, r){x′ ← p′ | x′ ∈ X(x)},
…,
sub(x_{i_m}, r){x′ ← p′ | x′ ∈ X(x)}) | u_r ∈ U_rec].
The initial instantiations of variables x′ ∈ X(x) are no longer used in the examples, because these variables do not belong to the new subprogram. The initial instantiations of the remaining variables in X′ are unaffected. With the reduced initial instantiations, the new subprogram still has the same language for each example.
(2) p′ does contain variables. Let |var(p′)| = m be the number of variables in p′. For each variable x_j ∈ X(x), new variables x_{j1}, …, x_{jm} are constructed. The new variable set is X′ = {x_{jk} | x_j ∈ X(x), k = 1, …, m} ∪ X \ X(x). For each x_j ∈ X(x), for all e ∈ E and w ∈ W there must exist terms β(x_{j1}, w, e), …, β(x_{jm}, w, e) ∈ T such that β(x_j, w, e) = p′[β(x_{j1}, w, e), …, β(x_{jm}, w, e)] holds, because p′ is a non-trivial pattern of each β(x_j, w, e).
Furthermore, for each x_j ∈ X(x) and each r ∈ R with sub(x_j, r) ∉ X there must exist terms sub(x_{j1}, r), …, sub(x_{jm}, r) ∈ T such that sub(x_j, r) = p′[sub(x_{j1}, r), …, sub(x_{jm}, r)] holds, because p′ is a non-trivial pattern of each sub(x_j, r).
For all w ∈ W, e ∈ E, and r ∈ R holds
β(x_j, w ∘ r, e) = p′[β(x_{j1}, w ∘ r, e), …, β(x_{jm}, w ∘ r, e)]
= sub(x_j, r){x₁ ← β(x₁, w, e), …, xₙ ← β(xₙ, w, e)}
= p′[sub(x_{j1}, r), …, sub(x_{jm}, r)]{x₁ ← β(x₁, w, e), …, xₙ ← β(xₙ, w, e)}
= p′[sub(x_{j1}, r){x₁ ← β(x₁, w, e), …, xₙ ← β(xₙ, w, e)},
…,
sub(x_{jm}, r){x₁ ← β(x₁, w, e), …, xₙ ← β(xₙ, w, e)}].
It follows for k = 1, …, m, w ∈ W, e ∈ E, and r ∈ R:
β(x_{jk}, w ∘ r, e) = sub(x_{jk}, r){x₁ ← β(x₁, w, e), …, xₙ ← β(xₙ, w, e)}.
Consequently, the instantiations of all variables x_{jk} can be generated by the (new) substitutions sub(x_{jk}, r) in all unfoldings for all examples.
For all x′ ∈ X′, w ∈ W, e ∈ E, and r ∈ R holds:
β(x′, w ∘ r, e) = sub(x′, r) {x_j ← β(x_j, w, e) | x_j ∈ X(x)} ∪ {x_j ← β(x_j, w, e) | x_j ∈ X \ X(x)}
= sub(x′, r) {x_j ← p′[β(x_{j1}, w, e), …, β(x_{jm}, w, e)] | x_j ∈ X(x)} ∪ {x_j ← β(x_j, w, e) | x_j ∈ X \ X(x)}
= sub(x′, r) {x_j ← p′[x_{j1}, …, x_{jm}] | x_j ∈ X(x)} {x_{jk} ← β(x_{jk}, w, e) | x_j ∈ X(x), k = 1, …, m} ∪ {x_j ← β(x_j, w, e) | x_j ∈ X \ X(x)}
= sub(x′, r) {x_j ← p′[x_{j1}, …, x_{jm}] | x_j ∈ X(x)} {x″ ← β(x″, w, e) | x″ ∈ X′}.
Consequently, the substitutions can be modified in the following way: Each variable x_j ∈ X(x) is replaced by the pattern p′, in which the variables are replaced by the new variables x_{j1}, …, x_{jm}. All substitutions are terms in T(X′) and do not refer to variables in X(x).
Furthermore, for all w ∈ W, e ∈ E holds:
t̂_G {x_j ← β(x_j, w, e) | x_j ∈ X(x)} ∪ {x_j ← β(x_j, w, e) | x_j ∈ X \ X(x)}
= t̂_G {x_j ← p′[β(x_{j1}, w, e), …, β(x_{jm}, w, e)] | x_j ∈ X(x)} ∪ {x_j ← β(x_j, w, e) | x_j ∈ X \ X(x)}
= t̂_G {x_j ← p′[x_{j1}, …, x_{jm}] | x_j ∈ X(x)} {x_{jk} ← β(x_{jk}, w, e) | x_j ∈ X(x), k = 1, …, m} ∪ {x_j ← β(x_j, w, e) | x_j ∈ X \ X(x)}
= t̂_G {x_j ← p′[x_{j1}, …, x_{jm}] | x_j ∈ X(x)} {x′ ← β(x′, w, e) | x′ ∈ X′}.
If in the body t_G the variables x_j ∈ X(x) are replaced by the pattern p′, instantiated with the new variables x_{j1}, …, x_{jm}, then the same unfoldings are generated with instantiations β(x′, w, e) of variables x′ ∈ X′.
Consequently, a new subprogram can be constructed in a way analogous to case 1:
- Replace the variables x_j ∈ X(x) by the pattern p′, instantiated with the variables x_{j1}, …, x_{jm}, in the body and in all substitution terms.
- Delete the substitutions for the obsolete variables from X(x) in all calls.
- Introduce the substitutions for the new variables in all calls.
- Construct the parameter list of the program head consistently with the variables X′ used in the recursive calls.
The initial instantiations for the new variables result from β(x_j, w, e) = p′[β(x_{j1}, w, e), …, β(x_{jm}, w, e)] (introduced above). The initial instantiations for variables from X(x) are no longer necessary. With the reduced initial instantiations, the new subprogram still has the same language for each example.
Consequences. If the instantiations of a variable x ∈ X in a subprogram share a non-trivial pattern p′, then a new maximized subprogram with new initial instantiations can be constructed. By moving p′ to the program body, the number of nodes (symbols) in the initial instantiations is reduced (case 1). If a variable of the new subprogram has instantiations which share a non-trivial pattern, a new subprogram can be constructed iteratively (case 2). Because the number of nodes in the initial instantiations is finite and is reduced in each iteration, after a finite number of steps a subprogram is constructed, together with initial instantiations, such that the instantiations of each variable do not share a non-trivial pattern.
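The maximal pattern of a set of instantiations, as used throughout this proof, can be computed by first-order anti-unification. A minimal sketch in Python (terms as nested tuples; the helper name `anti_unify` and the variable naming scheme are assumptions for illustration, not part of the text):

```python
def anti_unify(terms, counter=None):
    """Maximal common pattern of a list of terms.
    A term is an atom or a tuple (f, arg1, ..., argn); positions where
    the terms disagree become fresh variables '?0', '?1', ...
    (a lone variable is the trivial pattern)."""
    if counter is None:
        counter = [0]
    first = terms[0]

    def head(t):
        # function symbol plus arity, or the atom itself
        return (t[0], len(t)) if isinstance(t, tuple) else t

    if any(head(t) != head(first) for t in terms):
        v = "?%d" % counter[0]   # disagreement: introduce a fresh variable
        counter[0] += 1
        return v
    if not isinstance(first, tuple):
        return first             # identical atom everywhere
    return tuple([first[0]] + [anti_unify([t[i] for t in terms], counter)
                               for i in range(1, len(first))])

# s(0) and s(s(0)) share the non-trivial pattern s(?0)
print(anti_unify([("s", "0"), ("s", ("s", "0"))]))  # ('s', '?0')
```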
B.3 Proof: Uniqueness of Substitutions
For the proof of theorem 7.3.4, the following lemma is needed:
Lemma B.3.1 (No Common Non-Trivial Pattern for Variables). Let t_s ∈ T(X_i) be a term such that for x_j ∈ X_i and r ∈ R holds: ∀w ∈ W: β_{w∘r}(x_j) = t_s{x₁ ← β_w(x₁), …, x_{m_i} ← β_w(x_{m_i})}.
1. For all u with u ∈ ⋂_{w∈W} pos(β_{w∘r}(x_j)),
2. if there exists a variable x_k ∈ X_i with ∀w ∈ W: β_{w∘r}(x_j)|_u = β_w(x_k), and
3. if ∀u′ < u ∀w₁, w₂ ∈ W: node(β_{w₁∘r}(x_j), u′) = node(β_{w₂∘r}(x_j), u′),
then t_s|_u = x_k.
Proof (No Common Non-Trivial Pattern for Variables). Assume there exists a position u which, together with a variable x_k ∈ X_i, fulfills properties (1), (2), and (3) stated in lemma B.3.1, but for which t_s|_u = x_k does not hold. We must consider two cases: u ∈ pos(t_s) and u ∉ pos(t_s).
Case 1: u ∈ pos(t_s)
1.a: Let t_s|_u = x_h and x_h ≠ x_k. Then for all w ∈ W holds:
β_{w∘r}(x_j)|_u = β_w(x_k) (because of lemma B.3.1, 2)
β_{w∘r}(x_j)|_u = β_w(x_h) (by assumption)
and therefore, for all w ∈ W: β_w(x_k) = β_w(x_h). Because t_i (that is, the body of subprogram G_i, see theorem 7.3.4) is a maximal pattern, it must hold x_k = x_h, which is a contradiction to the assumption.
1.b: Let t_s|_u ∉ X_i. Then for all w ∈ W holds:
β_{w∘r}(x_j)|_u = β_w(x_k) (because of lemma B.3.1, 2)
and because
β_{w∘r}(x_j) = t_s{x₁ ← β_w(x₁), …, x_{m_i} ← β_w(x_{m_i})}
β_w(x_k) = t_s{x₁ ← β_w(x₁), …, x_{m_i} ← β_w(x_{m_i})}|_u
β_w(x_k) = t_s|_u{x₁ ← β_w(x₁), …, x_{m_i} ← β_w(x_{m_i})} (because u ∈ pos(t_s)).
Because by assumption t_s|_u is not in X_i, t_s|_u must be a non-trivial pattern of the instantiations β_w(x_k) of x_k. This is a contradiction to the assumption that t_i is a maximal pattern of unfoldings over β.
Case 2: u ∉ pos(t_s)
Because u is a position of all β_{w∘r}(x_j) and by assumption
β_{w∘r}(x_j) = t_s{x₁ ← β_w(x₁), …, x_{m_i} ← β_w(x_{m_i})},
there must exist a position u_s < u with u_s ∈ pos(t_s) and t_s|_{u_s} ∈ X_i. Let t_s|_{u_s} = x_h. Then for all w ∈ W holds:
β_{w∘r}(x_j)|_{u_s} = t_s{x₁ ← β_w(x₁), …, x_{m_i} ← β_w(x_{m_i})}|_{u_s}
= t_s|_{u_s}{x₁ ← β_w(x₁), …, x_{m_i} ← β_w(x_{m_i})} (because u_s ∈ pos(t_s))
= x_h{x₁ ← β_w(x₁), …, x_{m_i} ← β_w(x_{m_i})} (because t_s|_{u_s} = x_h)
= β_w(x_h).
By assumption, equation (3) in lemma B.3.1 holds for all u′ < u and therefore also for u_s < u: for all w₁, w₂ ∈ W
node(β_{w₁∘r}(x_j), u_s) = node(β_{w₂∘r}(x_j), u_s)
node(β_{w₁}(x_h), ε) = node(β_{w₂}(x_h), ε).
Consequently, the instantiations of the variable x_h share a non-trivial pattern, which is a contradiction to the assumption that t_i is a maximal pattern of unfoldings over β.
Now we can prove theorem 7.3.4:
Proof (Uniqueness of Substitutions). Let t_s ∈ T(X) be a term which fulfills equation (*) ∀w ∈ W: β_{w∘r}(x_j) = t_s{x₁ ← β_w(x₁), …, x_{m_i} ← β_w(x_{m_i})} given in theorem 7.3.4. We must distinguish the cases that (1) t_s has a variable at position u, and (2) the subtree in t_s at position u does not contain a variable.
Case 1: Let u ∈ pos(t_s) with t_s|_u ∈ X_i and let t_s|_u = x_k.
Then β_{w∘r}(x_j)|_u = β_w(x_k) holds, and because of (*), for all u′ < u and all w₁, w₂ ∈ W: node(β_{w₁∘r}(x_j), u′) = node(β_{w₂∘r}(x_j), u′). Because of lemma B.3.1, sub(x_j, r)|_u = t_s|_u holds, and for all u′ ≤ u: node(sub(x_j, r), u′) = node(t_s, u′).
Case 2: Let u ∈ pos(t_s) with t_s|_u ∈ T.
Then holds:
β_{w∘r}(x_j)|_u = t_s{x₁ ← β_w(x₁), …, x_{m_i} ← β_w(x_{m_i})}|_u
= t_s|_u{x₁ ← β_w(x₁), …, x_{m_i} ← β_w(x_{m_i})}
= t_s|_u.
2.a: Let u ∈ pos(sub(x_j, r)). Let u′ ∈ pos(sub(x_j, r)) be a position with u ≤ u′ and let sub(x_j, r)|_{u′} ∈ X_i with sub(x_j, r)|_{u′} = x_k. Then for all w ∈ W holds: β_{w∘r}(x_j)|_{u′} = β_w(x_k), and because β_{w∘r}(x_j)|_u = t_s|_u, it holds β_{w∘r}(x_j)|_{u′} = t_s|_{u′} and therefore β_w(x_k) = t_s|_{u′}.
The instantiations of x_k then must all be identical and share the non-trivial pattern t_s|_{u′}. This is a contradiction to the assumption that t_i is a maximal pattern of unfoldings over β. It follows: if u ∈ pos(sub(x_j, r)), then the term sub(x_j, r)|_u cannot contain variables and it must hold:
β_{w∘r}(x_j)|_u = sub(x_j, r)|_u{x₁ ← β_w(x₁), …, x_{m_i} ← β_w(x_{m_i})}
= sub(x_j, r)|_u
= t_s|_u.
2.b: Let u ∉ pos(sub(x_j, r)). Because u is a position in all β_{w∘r}(x_j), in sub(x_j, r) at a position u′ < u there must be a variable x_k ∈ X_i. Because then for all w ∈ W holds: β_{w∘r}(x_j)|_{u′} = β_w(x_k), and for all u″ < u and w₁, w₂ ∈ W: node(β_{w₁∘r}(x_j), u″) = node(β_{w₂∘r}(x_j), u″), it must hold for t_s, because of lemma B.3.1, that t_s|_{u′} = x_k. Because u′ < u, this is a contradiction to the assumption u ∈ pos(t_s).
Because of case 1, for all positions u ∈ pos(t_s) with ∃u′: t_s|_{u′} ∈ X_i and u ≤ u′, it holds that node(t_s, u) = node(sub(x_j, r), u). Because of case 2, for all positions u ∈ pos(t_s) with ¬∃u′: t_s|_{u′} ∈ X_i and u ≤ u′, it holds that node(t_s, u) = node(sub(x_j, r), u). Therefore, for all u ∈ pos(t_s) holds node(t_s, u) = node(sub(x_j, r), u), and therefore t_s = sub(x_j, r).
C. Sample Programs and Problems
C.1 Fibonacci with Sequence Referencing Function
The Fibonacci function as inferred with a genetic programming algorithm by Koza (1992, chap. 18.3):
(+ (SRF (- J 2) 0)
(SRF (+ (+ (- J 2) 0) (SRF (- J J) 0))
(SRF (SRF 3 1) 1)))
A realization in Lisp:
(defun kfib (x)
  (+ (srf x (- x 2) 0)
     (srf x (+ (+ (- x 2) 0) (srf x (- x x) 0))
          (srf x (srf x 3 1) 1))
))
; sequence referencing function (srf cur x def)
; if 0 < x <= cur then fib(x), else default def
(defun srf (cur x def)
  (cond ((<= x 0) def)
        ((<= x cur) (fib x))
        (T def)
))
; Koza's SRF calculates the fibonacci numbers for x = 0..20
; in ascending order and saves them in a global table
; we just calculate it by standard fibonacci
(defun fib (x)
  (cond ((= 0 x) 1)
        ((= 1 x) 1)
        (T (+ (fib (- x 1)) (fib (- x 2))))
))
A simplification of the synthesized function:
(defun skfib (x)
  (+ (srf x (- x 2) 0)
     (srf x (- x 1)
          (srf x (srf x 3 1) 1))
))
C.2 Inducing `Reverse' with Golem
Trace of the ILP system Golem (Muggleton & Feng, 1990) for inducing the recursive clause for reverse.
Positive Examples:
rev([],[]).
rev([1],[1]).
rev([2],[2]).
rev([3],[3]).
rev([4],[4]).
rev([1,2],[2,1]).
rev([1,3],[3,1]).
rev([1,4],[4,1]).
rev([2,2],[2,2]).
rev([2,3],[3,2]).
rev([2,4],[4,2]).
rev([0,1,2],[2,1,0]).
rev([1,2,3],[3,2,1]).
Negative Examples:
rev([1],[]).
rev([0,1],[0,1]).
rev([0,1,2],[2,0,1]).
app([1],[0],[0,1]).
Background Knowledge:
:- mode(rev(+,-)).
:- mode(app(+,+,-)).
rev([],[]).
rev([1],[1]).
rev([2],[2]).
rev([3],[3]).
rev([4],[4]).
rev([1,2],[2,1]).
rev([1,3],[3,1]).
rev([1,4],[4,1]).
rev([2,2],[2,2]).
rev([2,3],[3,2]).
rev([2,4],[4,2]).
rev([0,1,2],[2,1,0]).
rev([1,2,3],[3,2,1]).
app([],[],[]).
app([1],[],[1]).
app([2],[],[2]).
app([3],[],[3]).
app([4],[],[4]).
app([],[1],[1]).
app([],[2],[2]).
app([],[3],[3]).
app([],[4],[4]).
app([1],[0],[1,0]).
app([2],[1],[2,1]).
app([2],[1],[2,1]).
app([3],[1],[3,1]).
app([4],[1],[4,1]).
app([2],[2],[2,2]).
app([3],[2],[3,2]).
app([4],[2],[4,2]).
app([2,1],[0],[2,1,0]).
app([3,2],[1],[3,2,1]).
nat(0).
nat(1).
nat(2).
nat(3).
nat(4).
nat(5).
nat(6).
nat(7).
nat(8).
nat(9).
Trace:
[:- mode(rev(+,-)). - Time taken 0ms
[:- mode(app(+,+,-)). - Time taken 0ms
[Rlgg of pair:
rev([],[]).
rev([2],[2]).
[is
rev(A,A) :- app(A,A,B).
[Number of negatives covered=0
[Number of positives covered=2
[Rlgg of pair:
rev([],[]).
rev([1,2],[2,1]).
[is
rev(A,B) :- app(C,D,B).
[Number of negatives covered=1
[Rlgg of pair:
rev([],[]).
rev([1,3],[3,1]).
[is
rev(A,B) :- app(C,D,B).
[Number of negatives covered=1
[Rlgg of pair:
rev([1],[1]).
rev([1,4],[4,1]).
[is
rev([1|A],[B|C]) :- rev(A,A), rev(C,C), app([2,1],[0],[2,1,0]),
app(A,[1],[B|C]), app(D,A,[B]), nat(1), nat(B), rev([],[]),
app([1],[0],[1,0]), app([1],[],[1]), app([2],[1],[2,1]),
app([2],[2],[2,2]), app([3,2],[1],[3,2,1]), app([B],[],[B]),
app([B],C,[B|C]), app([],[1],[1]), app([],[B],[B]),
app([],[],[]), nat(0), nat(2).
[Number of negatives covered=0
[Number of positives covered=4
[Rlgg of pair:
rev([2],[2]).
rev([1,3],[3,1]).
[is
rev([A|B],[C|D]) :- rev([1|E],[F|D]), rev(B,B), rev(D,D),
app([G,A],[H],[G,A,H]), app(B,[A],[C|D]), app(I,B,[C]), nat(A),
nat(C), rev([1],[1]), rev([F],[F]), rev([],[]), rev(E,E),
app([1],[0],[1,0]), app([2,1],[0],[2,1,0]), app([3,2],[1],[3,2,1]),
app([A],[H],[A,H]), app([A],[],[A]), app([A],I,[A|I]),
app([C,F],[J],[C,F,J]), app([C],[1],[C,1]), app([C],[A],[C,A]),
app([C],[F],[C,F]), app([C],[],[C]), app([C],D,[C|D]),
app([C],E,[C|E]), app([C],I,[C|I]), app([F],[],[F]),
app([F],D,[F|D]), app([F],E,[F|E]), app([G],[A],[G,A]),
app([],[A],[A]), app([],[C],[C]), app([],[F],[F]), app([],[],[]),
app(B,[F],[K|E]), nat(1), nat(F), nat(G), nat(H).
[Number of negatives covered=0
[Number of positives covered=2
[Rlgg of pair:
rev([2],[2]).
rev([2,3],[3,2]).
[is
rev([2|A],[B|C]) :- rev([1|A],[D|E]), rev([F|C],[F|C]), rev(A,A),
rev(C,C), app([2],[2],[2,2]), app([2],C,[2|C]),
app([3,2],[1],[3,2,1]), app(A,[2],[B|C]), app(G,A,[B]), nat(2),
nat(B), rev([1],[1]), rev([D],[D]), rev([F],[F]), rev([],[]),
rev(E,E), app([1],[0],[1,0]), app([2,1],[0],[2,1,0]),
app([2],[1],[2,1]), app([2],[],[2]), app([2],E,[2|E]),
app([2],G,[2|G]), app([3],[1],[3,1]), app([3],[2],[3,2]),
app([B,F],[H],[B,F,H]), app([B],[1],[B,1]), app([B],[2],[B,2]),
app([B],[F],[B,F]), app([B],[],[B]), app([B],C,[B|C]),
app([B],E,[B|E]), app([B],G,[B|G]), app([D],[],[D]),
app([D],C,[D|C]), app([F],E,[F|E]), app([],[2],[2]),
app([],[B],[B]), app([],[D],[D]), app([],[],[]),
app(A,E,I), app(E,J,[1]), app(G,[D],[B|J]), app(J,E,[1]),
app(K,[B],[3|G]), app(K,A,[3]), nat(1), nat(3),
nat(D), nat(F).
[Number of negatives covered=0
[Number of positives covered=2
[Rlgg of pair:
rev([3],[3]).
rev([1,4],[4,1]).
[is
rev([A|B],[C|D]) :- rev(B,B), rev(D,D), app(B,[A],[C|D]),
app(E,B,[C]), nat(A), nat(C), rev([],[]), app([A],[],[A]),
app([C],[],[C]), app([C],D,[C|D]), app([],[A],[A]),
app([],[C],[C]), app([],[],[]).
[Number of negatives covered=0
[Number of positives covered=10
[Rlgg of pair:
rev([1,4],[4,1]).
rev([2,3],[3,2]).
[is
rev([A,B],[B,A]) :- rev([A],[A]), rev([B],[B]), rev([],[]),
app([A],[],[A]), app([B],[A],[B,A]), app([B],[],[B]),
app([C,A],[D],[C,A,D]), app([],[A],[A]), app([],[B],[B]),
app([],[],[]), nat(A), nat(B), rev([C],[C]), app([A],[D],[A,D]),
app([C],[A],[C,A]), app([C],[],[C]), app([],[C],[C]),
nat(C), nat(D).
[Number of negatives covered=0
[Number of positives covered=6
[Pos-neg cover=10,potential-examples=3
[Rlgg of :
rev([],[]).
rev([3],[3]).
rev([1,4],[4,1]).
[is
rev(A,B).
[Number of negatives covered=3
[Overgeneral
[Rlgg of :
rev([0,1,2],[2,1,0]).
rev([3],[3]).
rev([1,4],[4,1]).
[is
rev([A|B],[C|D]) :- rev(B,E), app(E,[A],[C|D]), nat(A), nat(C),
rev([],[]), app([],[],[]).
[Number of negatives covered=0
[OK
[Rlgg of :
rev([1,2,3],[3,2,1]).
rev([3],[3]).
rev([1,4],[4,1]).
[is
rev([A|B],[C|D]) :- rev(B,E), rev(F,D), app(E,[A],[C|D]), nat(A),
nat(C), rev([],[]), app([A],[],[A]), app([],[A],[A]),
app([],[],[]).
[Number of negatives covered=0
[OK
[Best-atom rev([0,1,2],[2,1,0])
[Pos-neg cover=12,potential-examples=0
[Cover=12
[Reducing clause
[Adding clause rev([A|B],[C|D]) :- rev(B,E), app(E,[A],[C|D]).
[REMOVED: 12 FORES, 0 NEGS
[FORES: 1, BACKS: 41, NEGS: 4, RULES: 1
[Induction time 498ms
C.3 Finite Program for `Unstack'
#S(TPLAN :PNAME P1
:SUPLAN
(#S(PSTEP :INSTOP NIL :PARENT NIL
:CHILD
((CLEAR (SUCC (SUCC O3))) (CLEAR (SUCC O3))
(CLEAR O3))
:PREC NIL :ADD NIL :DEL NIL :NODEID 0 :LEVEL 0)
#S(PSTEP :INSTOP (UNSTACK (SUCC O3))
:PARENT
((CLEAR (SUCC (SUCC O3))) (CLEAR (SUCC O3))
(CLEAR O3))
:CHILD
((ON O2 O3) (CLEAR (SUCC (SUCC O3)))
(CLEAR (SUCC O3)))
:PREC ((ON O2 O3) (CLEAR O2)) :ADD ((CLEAR O3))
:DEL ((CLEAR O2) (ON O2 O3)) :NODEID 1 :LEVEL 1)
#S(PSTEP :INSTOP (UNSTACK (SUCC (SUCC O3)))
:PARENT
((ON O2 O3) (CLEAR (SUCC (SUCC O3)))
(CLEAR (SUCC O3)))
:CHILD
((ON O1 O2) (ON O2 O3) (CLEAR (SUCC (SUCC O3))))
:PREC ((ON O1 O2) (CLEAR O1)) :ADD ((CLEAR O2))
:DEL ((CLEAR O1) (ON O1 O2)) :NODEID 2 :LEVEL 2))
:COPLAN
(#S(PSTEP :INSTOP NIL :PARENT NIL :CHILD ((CLEAR O3))
:PREC NIL :ADD NIL :DEL NIL :NODEID 0 :LEVEL 0)
#S(PSTEP :INSTOP (UNSTACK (SUCC O3)) :PARENT ((CLEAR O3))
:CHILD ((CLEAR (SUCC O3)))
:PREC ((ON O2 O3) (CLEAR O2)) :ADD ((CLEAR O3))
:DEL ((CLEAR O2) (ON O2 O3)) :NODEID 1 :LEVEL 1)
#S(PSTEP :INSTOP (UNSTACK (SUCC (SUCC O3)))
:PARENT ((CLEAR (SUCC O3)))
:CHILD ((CLEAR (SUCC (SUCC O3))))
:PREC ((ON O1 O2) (CLEAR O1)) :ADD ((CLEAR O2))
:DEL ((CLEAR O1) (ON O1 O2)) :NODEID 2 :LEVEL 2))
:TERM
(IF (CLEAR O3 S)
S
(UNSTACK (SUCC O3) S
(IF (CLEAR (SUCC O3) S)
S
(UNSTACK (SUCC (SUCC O3)) S
(IF (CLEAR (SUCC (SUCC O3)) S) S OMEGA)))))
:PTYPE SEQ :NEWDAT NIL :NEWPRED NIL :NPFCT NIL
:GOALPRED (CLEAR O3) :BOTTOM O3
:PASSOCS ((O2 (SUCC O3)) (O1 (SUCC (SUCC O3))))
:PAFCT
(DEFUN SUCC (X S)
(COND ((NULL S) NIL)
((AND (EQUAL (FIRST (CAR S)) 'ON)
(EQUAL (NTH 2 (CAR S)) X))
(NTH 1 (CAR S)))
(T (SUCC X (CDR S)))))
:PPARAMS NIL :PVALUES NIL)
For sequences, the slots newdat, newpred, and npfct are not required. The slots pparams and pvalues are filled after folding.
C.4 Recursive Control Rules for the `Rocket' Domain
; Control Knowledge for ROCKET
; ----------------------------
; call for example (rocket '(o1 o2 o3) '((at o1 a) (at o2 a) (at o3 a) (at rocket a)))
; or, including generation of oset
; (start-r '((at o1 b) (at o2 b) (at o3 b)) '((at o1 a) (at o2 a) (at o3 a) (at rocket a)))
; predefined set-selectors
(defun pick (oset) (car oset) )
(defun rst (oset) (cdr oset) )
; generalized predicates inferred during plan transformation
(DEFUN AT* (ARGS S)
(COND ((NULL ARGS) T)
((AND (NULL S) (NOT (NULL ARGS))) NIL)
((AND (EQUAL (CAAR S) 'AT)
(INTERSECTION ARGS (CDAR S)))
(AT* (SET-DIFFERENCE ARGS (CDAR S)) (CDR S)))
(T (AT* ARGS (CDR S)))))
(DEFUN INSIDE* (ARGS S)
(COND ((NULL ARGS) T)
((AND (NULL S) (NOT (NULL ARGS))) NIL)
((AND (EQUAL (CAAR S) 'INSIDE)
(INTERSECTION ARGS (CDAR S)))
(INSIDE* (SET-DIFFERENCE ARGS (CDAR S)) (CDR S)))
(T (INSIDE* ARGS (CDR S)))))
; explicit operator application
; in combination with DPlan, the add-del-lists are applied to s
(defun unload (o s)
  (print `(unload ,o ,s))
  (cond ((null s) nil)
        ((member o (car s) :test 'equal) (cons (list 'at o 'b) (cdr s)))
        (T (cons (car s) (unload o (cdr s)))) ))
(defun loadr (o s)
  (print `(load ,o ,s))
  (cond ((null s) nil)
        ((member o (car s) :test 'equal)
         (cons (list 'inside o 'rocket) (cdr s)))
        (T (cons (car s) (loadr o (cdr s)))) ))
(defun move-rocket (s)
  (print `(move-rocket ,s))
  (cond ((null s) nil)
        ((equal (car s) '(at rocket a)) (cons (list 'at 'rocket 'b) (cdr s)))
        (T (cons (car s) (move-rocket (cdr s)))) ))
; generalized control rules
; abstraction from destination (B) (for at*) and vehicle (Rocket) (for inside*)
(defun unload-all (oset s)
  (if (at* oset s)
      s
      (unload (pick oset) (unload-all (rst oset) s)) ))
(defun load-all (oset s)
  (if (inside* oset s)
      s
      (loadr (pick oset) (load-all (rst oset) s)) ))
(defun rocket (oset s) (unload-all oset (move-rocket (load-all oset s))) )
; "meta"-function, generating the set of objects to be transported
; from the top-level goals
(defun start-r (g s) (rocket (make-co g '(at* CO x)) s))
(defun make-co (goal newpat)
  (cond ((null goal) nil)
        ((string< (string (caar goal)) (string (car newpat)))
         (cons (nth (position 'CO newpat) (car goal))
               (make-co (cdr goal) newpat))) ))
Interleaving `at' and `inside'. The generalized predicates (at* oset B) and (inside* oset Rocket) are complements. If all objects are at a location, no object is inside the vehicle, and vice versa. The unload-all function presupposes that all objects in oset are inside the rocket, and the load-all function presupposes that all objects in oset are at the current location. The construction of a complex object from the plan is driven by the top-level goals (or, for a sub-plan, by the predicates in its root node). After transforming both sub-plans into finite programs and generalizing over them, it becomes clear that both sub-plans share the parameter oset with initial value (o1 o2 o3).
Analyzing the relationship between the objects in the at*-set and the inside*-set could lead to an alternative implementation of these generalized predicates: (at* oset l) is true if no object is inside the rocket, that means, if for (inside* oset rocket) the oset is empty (s contains no literal (inside o rocket)); and analogously for (inside* oset rocket).
C.5 The `Selection Sort' Domain
Sorting Lists with Three Elements. The universal plan for sorting lists with three elements is given in figure C.1. Note that in constructing the universal plan, it is random whether (swap p q) or (swap q p) is the first instantiation. Because the operator is symmetric, both applications result in an identical state; only the first application is integrated into the plan. Restricting swaps to go only from smaller positions to larger ones (or the other way round) can be done by extending the state specifications by ((gt P3 P2) (gt P3 P1) (gt P2 P1)) and the application condition of the swap operator to ((is p n1) (is q n2) (gt n1 n2) (gt q p)).
Minimal Spanning Trees. The nine minimal spanning trees of the 3-sort problem are given in figures C.2, C.3 and C.4. Three of these trees (the last three given) are generalizable to the selsort program: namely, all trees where the branching factor is as regular as possible (i.e., 3 to 2 vs. 3 to 1).
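The universal-plan construction for this domain can be sketched as a backward breadth-first search from the goal state; every permutation then carries an optimal swap sequence, as in figure C.1 (the Python transcription below, including the 0-indexed state encoding, is an assumption for illustration):

```python
goal = (1, 2, 3)

def swap(state, p, q):
    s = list(state)
    s[p], s[q] = s[q], s[p]
    return tuple(s)

# backward BFS from the goal; since swap is its own inverse, the
# predecessors of a state are exactly its swap-successors
parent = {goal: None}
frontier = [goal]
while frontier:
    nxt = []
    for state in frontier:
        for p in range(3):
            for q in range(p + 1, 3):
                pre = swap(state, p, q)
                if pre not in parent:
                    parent[pre] = (("swap", p, q), state)
                    nxt.append(pre)
    frontier = nxt

# read off the optimal transformation sequence for one permutation
plan, s = [], (2, 3, 1)
while parent[s]:
    action, s = parent[s]
    plan.append(action)
print(len(parent), plan)  # 6 reachable states; a two-step swap sequence
```

The resulting parent structure is the DAG of optimal transformation sequences over all 3! = 6 permutations.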
C.6 Recursive Control Rules for the `Tower' Domain
; TOWER -- assuming subgoal independence
((ISC P1 1) (ISC P2 2) (ISC P3 3))
((ISC P1 2) (ISC P2 1) (ISC P3 3)) ((ISC P1 3) (ISC P2 2) (ISC P3 1)) ((ISC P1 1) (ISC P2 3) (ISC P3 2))
((ISC P1 3) (ISC P2 1) (ISC P3 2)) ((ISC P1 2) (ISC P2 3) (ISC P3 1))
(SWAP P2 P3)(SWAP P1 P3)(SWAP P1 P2)
(SWAP P2 P3)(SWAP P1 P3) (SWAP P1 P3)(SWAP P1 P2)(SWAP P2 P3) (SWAP P1 P2)
Fig. C.1. Universal Plan for Sorting Lists with Three Elements
((ISC P1 1) (ISC P2 2) (ISC P3 3))
((ISC P1 2) (ISC P2 1) (ISC P3 3)) ((ISC P1 3) (ISC P2 2) (ISC P3 1)) ((ISC P1 1) (ISC P2 3) (ISC P3 2))
((ISC P1 3) (ISC P2 1) (ISC P3 2)) ((ISC P1 2) (ISC P2 3) (ISC P3 1))
(SWAP P2 P3)(SWAP P1 P3)(SWAP P1 P2)
(SWAP P2 P3)(SWAP P1 P3)
((ISC P1 1) (ISC P2 2) (ISC P3 3))
((ISC P1 2) (ISC P2 1) (ISC P3 3)) ((ISC P1 3) (ISC P2 2) (ISC P3 1)) ((ISC P1 1) (ISC P2 3) (ISC P3 2))
((ISC P1 3) (ISC P2 1) (ISC P3 2)) ((ISC P1 2) (ISC P2 3) (ISC P3 1))
(SWAP P2 P3)(SWAP P1 P3)(SWAP P1 P2)
(SWAP P1 P3) (SWAP P1 P2)
((ISC P1 1) (ISC P2 2) (ISC P3 3))
((ISC P1 2) (ISC P2 1) (ISC P3 3)) ((ISC P1 3) (ISC P2 2) (ISC P3 1)) ((ISC P1 1) (ISC P2 3) (ISC P3 2))
((ISC P1 3) (ISC P2 1) (ISC P3 2))((ISC P1 2) (ISC P2 3) (ISC P3 1))
(SWAP P2 P3)(SWAP P1 P3)(SWAP P1 P2)
(SWAP P2 P3) (SWAP P2 P3)
Fig. C.2. Minimal Spanning Trees for Sorting Lists with Three Elements
((ISC P1 1) (ISC P2 2) (ISC P3 3))
((ISC P1 2) (ISC P2 1) (ISC P3 3)) ((ISC P1 3) (ISC P2 2) (ISC P3 1)) ((ISC P1 1) (ISC P2 3) (ISC P3 2))
(SWAP P2 P3)(SWAP P1 P3)(SWAP P1 P2)
((ISC P1 2) (ISC P2 3) (ISC P3 1)) ((ISC P1 3) (ISC P2 1) (ISC P3 2))
(SWAP P2 P3)(SWAP P1 P2)
((ISC P1 1) (ISC P2 2) (ISC P3 3))
((ISC P1 2) (ISC P2 1) (ISC P3 3)) ((ISC P1 3) (ISC P2 2) (ISC P3 1)) ((ISC P1 1) (ISC P2 3) (ISC P3 2))
((ISC P1 3) (ISC P2 1) (ISC P3 2)) ((ISC P1 2) (ISC P2 3) (ISC P3 1))
(SWAP P2 P3)(SWAP P1 P3)(SWAP P1 P2)
(SWAP P1 P3) (SWAP P1 P3)
((ISC P1 1) (ISC P2 2) (ISC P3 3))
((ISC P1 2) (ISC P2 1) (ISC P3 3)) ((ISC P1 3) (ISC P2 2) (ISC P3 1)) ((ISC P1 1) (ISC P2 3) (ISC P3 2))
((ISC P1 3) (ISC P2 1) (ISC P3 2))((ISC P1 2) (ISC P2 3) (ISC P3 1))
(SWAP P2 P3)(SWAP P1 P3)(SWAP P1 P2)
(SWAP P2 P3) (SWAP P1 P2)
Fig. C.3. Minimal Spanning Trees for Sorting Lists with Three Elements
392 C. Sample Programs and Problems
Fig. C.4. Minimal Spanning Trees for Sorting Lists with Three Elements [figure omitted]
; abstract representation of control knowledge
; G(n,n,s) = tow(n,n)(s, putt1(n,s))
; G(i,n,s) = tow(i,n)(s, put1(i,s(i),G(s(i),s,n)))
; put1(i,s(i),s) = put(i,s(i),clear(i,clear(s(i),s)))
; putt1(i,s) = putt(i,clear(i,s))
; clear(x,s) = ct(x,s)(s,putt(f(x),clear(f(x),s)))
; main: G(1,n,s)
; ------------------------------------------------------------------
; call (G 1 maxblock s)
; s for maxblock = 3
; ((on 1 2) (on 2 3) (ct 1) (ont 3))
; ((on 1 3) (on 3 2) (ct 1) (ont 2))
; ((on 2 1) (on 1 3) (ct 2) (ont 3))
; ((on 2 3) (on 3 1) (ct 2) (ont 1))
; ((on 3 1) (on 1 2) (ct 3) (ont 2))
; ((on 3 2) (on 2 1) (ct 3) (ont 1))
; ((on 1 2) (ct 1) (ct 3) (ont 2) (ont 3))
; ((on 1 3) (ct 1) (ct 2) (ont 2) (ont 3))
; ((on 2 1) (ct 2) (ct 3) (ont 1) (ont 3))
; ((on 2 3) (ct 2) (ct 1) (ont 1) (ont 3))
; ((on 3 1) (ct 3) (ct 2) (ont 1) (ont 2))
; ((on 3 2) (ct 3) (ct 1) (ont 1) (ont 2))
; ((ct 1) (ct 2) (ct 3) (ont 1) (ont 2) (ont 3))
; some s for maxblock > 3
; ((on 4 2) (on 2 1) (on 1 3) (ct 4) (ont 3))
; ((on 2 5) (on 5 1) (on 4 3) (ct 2) (ct 4) (ont 1) (ont 3))
; ------------------------------------------------------------------
; this would also give true if block n is ont and there are
; unsorted blocks above it!
(defun tow (i n s)
  (cond ((eq i n) (cond ((member (cons 'ont (list n)) s :test 'equal)
                         (print `(tower ,s)) T)
                        (T nil)
                  ) )
        (T (cond ((and
                   (member (cons 'on (list i (1+ i))) s :test 'equal)
                   (tow (1+ i) n s)) T)
                 (T nil)
           ) ) ))

(defun putt1 (i s) (puttable i (clear i s)))

(defun put1 (i s) (put i (1+ i) (clear i (clear (1+ i) s))) )

(defun clear (x s)
  (cond ((member (cons 'ct (list x)) s :test 'equal)
         (print `(block ,(list x) is clear now)) s)
        (T (print 'puttable-call-by-clear)
           (puttable (f x s) (clear (f x s) s))) ))

(defun put (x y s)
  (print s)
  (cond ((null s) (print 'error-in-put) nil)
        ((and (member (cons 'ct (list x)) s :test 'equal)
              (member (cons 'ct (list y)) s :test 'equal)
         ) (print `(put ,(list x) on ,(list y) in ,s)) (exec-put x y s))
        (T (print 'error-in-put-xy-not-clear) nil) ))

(defun exec-put (x y s)
  (cond ((null s) nil)
        ((and (equal (first (car s)) 'ont) (equal (second (car s)) x))
         (exec-put x y (cdr s)))
        ((and (equal (first (car s)) 'on) (equal (second (car s)) x))
         (cons (cons 'ct (list (third (car s)))) (exec-put x y (cdr s)))
        )
        ((and (equal (first (car s)) 'ct) (equal (second (car s)) y))
         (cons (cons 'on (list x y)) (exec-put x y (cdr s)))
        )
        (T (cons (car s) (exec-put x y (cdr s)))) ))

(defun puttable (x s)
  (cond ((null s) (print 'error-in-puttable) nil)
        ((member (cons 'ct (list x)) s :test 'equal)
         (print `(puttable ,(list x) in ,s)) (exec-puttable x s))
        (T (print 'error-in-puttable-x-not-clear) nil) ))

(defun exec-puttable (x s)
  (cond ((null s) (print 'x-maybe-already-on-table) nil)
        ((and (equal (first (car s)) 'on) (equal (second (car s)) x))
         (cons (cons 'ct (list (third (car s))))
               (cons
                (cons 'ont (list (second (car s))))
                (cdr s)))
        )
        (T (cons (car s) (exec-puttable x (cdr s)))) ))

(defun f (x s)
  (cond ((null s) nil)
        ((and (equal (first (car s)) 'on) (equal (third (car s)) x))
         (second (car s)))
        (T (f x (cdr s))) ))

(defun G (i n s)
  (print `(G ,(list i) ,(list n) ,s))
  (cond ((eq i n) (cond ((tow i n s) s) ; G(n,n,s)
                        (T (putt1 n s))
                  ) )
        (T (cond ((tow i n s) s) ; G(i,n,s), i < n
                 (T (put1 i (G (1+ i) n s)))
           ) ) ))
; TOWER-1 ; call e.g. (tower '(a b c d) '((on a d) (on d c) (on c b) (ct a)))
(defun tower (olist s)
  (if (subtow olist s)
      s
      (if (and (ct (first olist) s) (subtow (cdr olist) s))
          (put (first olist) (second olist) s)
          (if (and (singleblock olist) (ot (first olist) s))
              (clear-all (first olist) s)
              (if (and (singleblock olist) (ct (first olist) s))
                  (puttable (first olist) s)
                  (if (singleblock olist)
                      (puttable (first olist) (clear-all (first olist) s))
                      (if (and (ct (first olist) s) (on* (cdr olist) s))
                          (put (first olist) (second olist) (clear-all (second olist) s))
                          (if (ct (first olist) s)
                              (put (first olist) (second olist) (tower (cdr olist) s))
                              (put (first olist) (second olist) (clear-all (first olist)
                                                                (tower (cdr olist) s)))
                              ))))))))
; clear-all macro
(defun clear-all (o s)
  (if (ct o s)
      s
      (puttable (suc o s) (clear-all (suc o s) s)) ))

(defun suc (x s)
  (cond ((null s) nil)
        ((and (equal (first (car s)) 'on)
              (equal (nth 2 (car s)) x))
         (nth 1 (car s)))
        (T (suc x (cdr s)))))

(defun singleblock (l) (null (cdr l)))
; correct subtower?
(defun subtow (olist s) (and (ct (car olist) s) (on* olist s)))

; tower contains a correct subtower?
(defun on* (olist s)
  (cond ((null olist) T) ; should not happen
        ((and (null (cdr olist)) (ot (car olist) s)) T)
        ((member (list 'on (first olist) (second olist)) s :test 'equal)
         (on* (cdr olist) s))
        (T nil) ))

; given
(defun ct (o s) (member (list 'ct o) s :test 'equal))

; given as (ontable x) OR (here) inferred from state
(defun ot (o s)
  (null (mapcan #'(lambda (x) (and (equal 'on (first x))
                                   (equal o (second x))
                                   (list x))) s) ))

; explicit application of put-operator
(defun put (x y s)
  (cond ((null s) (print `(put ,x ,y))
         nil)
        ((equal (car s) (list 'ct y)) (cons (list 'on x y) (put x y (cdr s))))
        ((and (equal (first (car s)) 'on)
              (equal (second (car s)) x))
         (cons (list 'ct (third (car s))) (put x y (cdr s))))
        (T (cons (car s) (put x y (cdr s))))
  ))

; explicit application of puttable-operator
(defun puttable (x s)
  (cond ((null s) (print `(puttable ,x))
         nil)
        ((and (equal (first (car s)) 'on)
              (equal (second (car s)) x)) (cons (list 'ct (third (car s)))
                                                (puttable x (cdr s))))
        (T (cons (car s) (puttable x (cdr s))))
  ))
; TOWER-2 ; Building a list (tower) of sorted numbers (blocks)
; -----------------------------------------------------------------
; Representation: each partial tower as list
; Input: list of lists
; Examples for the three blocks world
; ((1 2 3))
; ((2 3) (1))
; ((1) (2) (3))
; ((2 1) (3))
; ((3 2 1))
; ((1 2) (3))
; ((1 3) (2))
; ((3 1) (2))
; ((3 2) (1))
; ((1 3 2))
; ((2 3 1))
; ((2 1 3))
; ((3 1 2))
; -----------------------------------------------------------------
; help functions
; ---------------
; flattens a list l
(defun flatten (l)
  (cond ((null l) nil)
        (T (append (car l) (flatten (cdr l))))
  ))

; x+1 = y?
(defun onedif (x y) (= (1+ x) y))

; blocks world selectors
; ----------------------
; topmost block of a tower
(defun topof (tw) (car tw))

; bottom block (base) of a tower
(defun bottom (tw) (car (last tw)))

; next tower
; f.e. ((2 1) (3)) -> (2 1)
(defun get-tower (l) (car l))

; tops of all current towers
(defun topelements (l) (sort (map 'list #'car l) #'>))

; topblock with highest number
(defun greatest (l) (car (topelements l)))

; topblock with second highest number
(defun scndgreatest (l) (cadr (topelements l)))

; label of the block with the highest number
(defun maxblock (l)
  (cond ((null l) 0)
        (T (car (sort (flatten l) #'>)))
  ))

(defun get-all-no-towers (l max)
  (cond ((null l) nil)
        ((and (equal (bottom (car l)) max) (sorted (get-tower l)))
         (get-all-no-towers (cdr l) max))
        ((single-block (get-tower l)) (get-all-no-towers (cdr l) max))
        (T (cons (car l) (get-all-no-towers (cdr l) max)))
  ))
(defun find-greatest (max l)
  (cond ((null l) max)
        ((> (topof max) (topof (car l))) (find-greatest max (cdr l)))
        (T (find-greatest (car l) (cdr l)))
  ))

; find incorrect tower containing highest element
(defun greatest-no-tower (l)
  (cond ((null l) nil)
        (T (find-greatest (car (get-all-no-towers l (maxblock l)))
                          (cdr (get-all-no-towers l (maxblock l)))))
  ))
; blocksworld predicates
; ----------------------
; is tower only a single block?
(defun single-block (tw) (= (length tw) 1))

; exist two partial towers whose top elements differ only by one?
(defun exist-free-neighbours (l) (onedif (scndgreatest l) (greatest l)))

; exists a correct partial tower?
; f.e. (2 3) or (B C)
(defun exists-tower (l)
  (cond ((null l) nil)
        ((and (equal (bottom (get-tower l)) (maxblock l))
              (sorted (get-tower l))) T)
        (T (exists-tower (cdr l)))
  ))

; is block x predecessor to top of a tower?
(defun successor (x tw)
  (cond ((null tw) T)
        ((onedif x (car tw)) T) ; (successor x (cdr tw)))
        (T nil)
  ))

; is tower sorted?
(defun sorted (tw)
  (cond ((null tw) T)
        ((successor (car tw) (cdr tw)) (sorted (cdr tw)))
        (T nil)
  ))

; exists only one tower?
(defun single-tower (l) (null (cdr l)))

; goal state?
(defun is-tower (l) (and (single-tower l) (sorted (get-tower l))))
; ------------------------------------------------------------------
; blocksworld operators
; ---------------------
; put x on y
(defun put (x y l)
  (cond ((null l) (print 'put) (print x) (print y)
         nil)
        ((equal (caar l) x) (cond ((not (null (cdar l)))
                                   (append (list (cdar l)) (put x y (cdr l))))
                                  (T (put x y (cdr l)))))
        ((equal (caar l) y) (cons (cons x (car l)) (put x y (cdr l))))
        (T (cons (car l) (put x y (cdr l))))
  ))

; puttable x
(defun puttable (x l)
  (cond ((null l) nil)
        ((equal (caar l) x) (print 'puttable) (print x)
         (cons (list x) (cons (cdar l) (cdr l))))
        (T (cons (car l) (puttable x (cdr l))))
  ))

; ------------------------------------------------------------------
; main function
; -------------
(defun tower (l)
  (cond ((is-tower l) l)
        ((and (exists-tower l)
              (exist-free-neighbours l))
         (tower (put (scndgreatest l) (greatest l) l)))
        (T (tower (puttable (topof (greatest-no-tower l)) l)))
  ))
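The TOWER-2 control strategy above can also be paraphrased in Python (a hypothetical transcription for illustration only, not part of the original code; all names are our own). States are lists of towers, each tower a list with its topmost block first:

```python
# Sketch of the TOWER-2 control rule: build the single sorted tower.

def tops(l):
    return sorted((tw[0] for tw in l), reverse=True)

def is_sorted(tw):
    return all(tw[i] + 1 == tw[i + 1] for i in range(len(tw) - 1))

def max_block(l):
    return max(b for tw in l for b in tw)

def exists_tower(l):
    # a correct partial tower has the maximal block at the bottom
    m = max_block(l)
    return any(tw[-1] == m and is_sorted(tw) for tw in l)

def free_neighbours(l):
    ts = tops(l)
    return len(ts) > 1 and ts[1] + 1 == ts[0]

def greatest_no_tower(l):
    # incorrect multi-block tower with the greatest top element
    m = max_block(l)
    bad = [tw for tw in l
           if not (tw[-1] == m and is_sorted(tw)) and len(tw) > 1]
    return max(bad, key=lambda tw: tw[0])

def put(x, y, l):
    l = [tw[1:] if tw[0] == x else ([x] + tw if tw[0] == y else tw)
         for tw in l]
    return [tw for tw in l if tw]

def puttable(x, l):
    out = []
    for tw in l:
        if tw[0] == x:
            out.append([x])
            if tw[1:]:
                out.append(tw[1:])
        else:
            out.append(tw)
    return out

def tower(l):
    if len(l) == 1 and is_sorted(l[0]):
        return l
    if exists_tower(l) and free_neighbours(l):
        ts = tops(l)
        return tower(put(ts[1], ts[0], l))
    return tower(puttable(greatest_no_tower(l)[0], l))

print(tower([[3, 1], [2]]))  # -> [[1, 2, 3]]
```

As in the Lisp version, the rule extends the correct subtower whenever two free neighbouring blocks exist, and otherwise unstacks the incorrect tower with the greatest top element.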
C.7 Water Jug Problems

In the following we present all problems used in experiments 1 and 2. Because
the graphs of the problem structures are large, we give the equations and
inequations (terms) instead. For each problem, a graph can be constructed by
representing the terms as demonstrated in figure 11.3. The terms get integrated
into a single structure by using each parameter (c_A, c_B, ..., q_A, q_B, ...)
as unique node.
Problems for Experiment 1

Initial Problem

Jug               A   B   C  (D)
Capacity         28  35  42   -
Initial quantity 12  21  26   -
Goal quantity    19   0  40   -

Operator sequence: pour(C,B), pour(B,A), pour(A,C), pour(B,A)
Relevant Procedural Information/Relevant Declarative Information (Constraints): see Source problem
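The operator sequences below assume a pour operator that transfers water until the source jug is empty or the target jug is full. A small Python sketch (function and variable names are our own) lets one replay, for example, the initial problem:

```python
# Sketch of the assumed pour semantics: transfer until the source jug
# is empty or the target jug is full.

def pour(state, caps, src, dst):
    amount = min(state[src], caps[dst] - state[dst])
    new = dict(state)
    new[src] -= amount
    new[dst] += amount
    return new

caps = {'A': 28, 'B': 35, 'C': 42}
s = {'A': 12, 'B': 21, 'C': 26}
for src, dst in [('C', 'B'), ('B', 'A'), ('A', 'C'), ('B', 'A')]:
    s = pour(s, caps, src, dst)
print(s)  # goal quantities: A=19, B=0, C=40
```

Replacing the capacities and initial quantities by those of the other problems reproduces their goal quantities in the same way.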
Source Problem

Jug               A   B   C  (D)
Capacity         36  45  54   -
Initial quantity 16  27  34   -
Goal quantity    25   0  52   -

Operator sequence: pour(C,B), pour(B,A), pour(A,C), pour(B,A)

Relevant Procedural Information
q_A(4) = q_A(0) + (c_A - q_A(0)) - c_A + (c_B - (c_A - q_A(0)))
q_B(4) = q_B(0) + (c_B - q_B(0)) - (c_A - q_A(0)) - (c_B - (c_A - q_A(0)))
q_C(4) = q_C(0) - (c_B - q_B(0)) + c_A

Relevant Declarative Information (Constraints)
c_A = 2 (c_B - q_B(0))
c_B >= (c_A - q_A(0))
q_C(0) >= (c_B - q_B(0))
c_C < q_C(0) + c_A
Problem 1 (Isomorph/No Surface Change)

Jug               A   B   C  (D)
Capacity         48  60  72   -
Initial quantity 21  36  45   -
Goal quantity    33   0  69   -

Operator sequence: pour(C,B), pour(B,A), pour(A,C), pour(B,A)
Structural distance to source: d_S1 = 0
Relevant Procedural Information/Relevant Declarative Information (Constraints): see Source problem
Problem 2 (Isomorph/Small Surface Change) (Source/Target Mapping: A/B, B/A, C/C)

Jug               A   B   C  (D)
Capacity         60  48  72   -
Initial quantity 36  21  45   -
Goal quantity     0  33  69   -

Operator sequence: pour(C,A), pour(A,B), pour(B,C), pour(A,B)
Structural distance to source: d_S2 = 0

Relevant Procedural Information
q_A(4) = q_A(0) + (c_A - q_A(0)) - (c_B - q_B(0)) - (c_A - (c_B - q_B(0)))
q_B(4) = q_B(0) + (c_B - q_B(0)) - c_B + (c_A - (c_B - q_B(0)))
q_C(4) = q_C(0) - (c_A - q_A(0)) + c_B

Relevant Declarative Information (Constraints)
c_B = 2 (c_A - q_A(0))
c_A >= (c_B - q_B(0))
q_C(0) >= (c_A - q_A(0))
c_C < q_C(0) + c_B
Problem 3 (Isomorph/Large Surface Change) (Source/Target Mapping: A/B, B/C, C/A)

Jug               A   B   C  (D)
Capacity         72  48  60   -
Initial quantity 45  21  36   -
Goal quantity    69  33   0   -

Operator sequence: pour(A,C), pour(C,B), pour(B,A), pour(C,B)
Structural distance to source: d_S3 = 0

Relevant Procedural Information
q_A(4) = q_A(0) - (c_C - q_C(0)) + c_B
q_B(4) = q_B(0) + (c_B - q_B(0)) - c_B + (c_C - (c_B - q_B(0)))
q_C(4) = q_C(0) + (c_C - q_C(0)) - (c_B - q_B(0)) - (c_C - (c_B - q_B(0)))

Relevant Declarative Information (Constraints)
c_B = 2 (c_C - q_C(0))
c_C >= (c_B - q_B(0))
q_A(0) >= (c_C - q_C(0))
c_A < q_A(0) + c_B
Problem 4 (Partial Isomorph/No Surface Change) (additional jug: A; Source/Target Mapping: A/B, B/C, C/D)

Jug               A   B   C   D
Capacity         16  20  25  31
Initial quantity  3   8  15  18
Goal quantity     0  13   3  28

Operator sequence: pour(D,C), pour(C,B), pour(B,D), pour(C,B), pour(A,C)
Structural distance to source: d_S4 = 0.37

Relevant Procedural Information
q_A(5) = q_A(0) - q_A(0)
q_B(5) = q_B(0) + (c_B - q_B(0)) - c_B + (c_C - (c_B - q_B(0)))
q_C(5) = q_C(0) + (c_C - q_C(0)) - (c_B - q_B(0)) - (c_C - (c_B - q_B(0))) + q_A(0)
q_D(5) = q_D(0) - (c_C - q_C(0)) + c_B

Relevant Declarative Information (Constraints)
c_B = 2 (c_C - q_C(0))
c_C >= (c_B - q_B(0))
c_C >= q_A(0)
q_D(0) >= (c_C - q_C(0))
c_D < q_D(0) + c_B
Problem 5 (Partial Isomorph/Small Surface Change) (additional jug: A; Source/Target Mapping: A/C, B/B, C/D)

Jug               A   B   C   D
Capacity         16  25  20  31
Initial quantity  3  15   8  18
Goal quantity     0   3  13  28

Operator sequence: pour(D,B), pour(B,C), pour(C,D), pour(B,C), pour(A,B)
Structural distance to source: d_S5 = 0.37

Relevant Procedural Information
q_A(5) = q_A(0) - q_A(0)
q_C(5) = q_C(0) + (c_C - q_C(0)) - c_C + (c_B - (c_C - q_C(0)))
q_B(5) = q_B(0) + (c_B - q_B(0)) - (c_C - q_C(0)) - (c_B - (c_C - q_C(0))) + q_A(0)
q_D(5) = q_D(0) - (c_B - q_B(0)) + c_C

Relevant Declarative Information (Constraints)
c_C = 2 (c_B - q_B(0))
c_B >= (c_C - q_C(0))
c_B >= q_A(0)
q_D(0) >= (c_B - q_B(0))
c_D < q_D(0) + c_C
Problems for Experiment 2

(Initial problem and source problem are identical to experiment 1.)

Problem 1 (Target Exhaustiveness) (Source/Target Mapping: A/A, B/B, C/C)

Jug               A   B   C  (D)
Capacity         48  60  72   -
Initial quantity 21  36  45   -
Goal quantity     0  33  69   -

Operator sequence: pour(C,B), pour(B,A), pour(A,C)
Structural distance to source: d_S1 = 0.16

Relevant Procedural Information
q_A(3) = q_A(0) + (c_A - q_A(0)) - c_A
q_B(3) = q_B(0) + (c_B - q_B(0)) - (c_A - q_A(0))
q_C(3) = q_C(0) - (c_B - q_B(0)) + c_A

Relevant Declarative Information (Constraints)
c_A = 2 (c_B - q_B(0))
c_B >= (c_A - q_A(0))
q_C(0) >= (c_B - q_B(0))
c_C < q_C(0) + c_A
Problem 2 (Source Inclusiveness) (Source/Target Mapping: A/A, B/B, C/C)

Jug               A   B   C  (D)
Capacity         48  60  72   -
Initial quantity 21  36  45   -
Goal quantity    33  60   9   -

Operator sequence: pour(C,B), pour(B,A), pour(A,C), pour(B,A), pour(C,B)
Structural distance to source: d_S2 = 0.17

Relevant Procedural Information
q_A(5) = q_A(0) + (c_A - q_A(0)) - c_A + (c_B - (c_A - q_A(0)))
q_B(5) = q_B(0) + (c_B - q_B(0)) - (c_A - q_A(0)) - (c_B - (c_A - q_A(0))) + c_B
q_C(5) = q_C(0) - (c_B - q_B(0)) + c_A - c_B

Relevant Declarative Information (Constraints)
c_A = 2 (c_B - q_B(0))
c_B >= (c_A - q_A(0))
q_C(0) >= (c_B - q_B(0))
q_C(0) < c_B
c_C < q_C(0) + c_A
Problem 3 ("High" Structural Overlap) (identical to problem 4 in experiment 1)

Problem 4 ("Medium" Structural Overlap) (additional jug: A; Source/Target Mapping: A/B, B/C, C/D)

Jug               A   B   C   D
Capacity         16  20  25  31
Initial quantity  6   9  15  18
Goal quantity    14  14   0  20

Operator sequence: pour(D,C), pour(C,B), pour(D,A), pour(B,D), pour(C,B)
Structural distance to source: d_S4 = 0.55

Relevant Procedural Information
q_A(5) = q_A(0) + (q_D(0) - (c_C - q_C(0)))
q_B(5) = q_B(0) + (c_B - q_B(0)) - c_B + (c_C - (c_B - q_B(0)))
q_C(5) = q_C(0) + (c_C - q_C(0)) - (c_B - q_B(0)) - (c_C - (c_B - q_B(0)))
q_D(5) = q_D(0) - (c_C - q_C(0)) - (q_D(0) - (c_C - q_C(0))) + c_B

Relevant Declarative Information (Constraints)
c_A < (q_A(0) + q_D(0))
q_A(0) < (c_C - q_C(0))
c_C > (c_B - q_B(0))
c_D > (q_D(0) + c_B - (c_C - q_C(0)))
Problem 5 ("Low" Structural Overlap) (additional jug: A; Source/Target Mapping: A/B, B/C, C/D)

Jug               A   B   C   D
Capacity         17  20  25  31
Initial quantity  7   9  15  18
Goal quantity    16   5   0  28

Operator sequence: pour(B,A), pour(D,C), pour(C,B), pour(B,D), pour(C,B)
Structural distance to source: d_S5 = 0.59

Relevant Procedural Information
q_A(5) = q_A(0) + q_B(0)
q_B(5) = q_B(0) - q_B(0) + c_B - c_B + (c_C - q_C(0))
q_C(5) = q_C(0) + (c_C - q_C(0)) - c_B - (c_C - q_C(0))
q_D(5) = q_D(0) - (c_C - q_C(0)) + c_B

Relevant Declarative Information (Constraints)
c_B = 2 (c_C - q_C(0))
c_A >= q_A(0) + q_B(0)
c_C >= c_B
q_D(0) >= (c_C - q_C(0))
c_D < q_D(0) + c_B
C.8 Example RPSs

In the following the ten example programs used in the empirical evaluation
of retrieval in chapter 12 are given. Note that we only give the subprograms.
The complete RPSs additionally contain a call of the according subprogram.
#S(sub
:NAME REVERT
:PARAMS (L1 L2)
:BODY (IF (NULL L1) L2 (REVERT (TL L1) (++ (HD L1) L2)))
)
#S(sub
:NAME SUM
:PARAMS (X)
:BODY (IF (EQ0 X) 0 (+ X (SUM (PRE X))))
)
#S(sub
:NAME CB
:PARAMS (X SIT)
:BODY (IF (CT X) SIT (PT (TO X) (CB (TO X) SIT)))
)
#S(sub
:NAME GGT
:PARAMS (X Y)
:BODY (IF (EQ0 Y) X (GGT Y (MOD X Y)))
)
#S(sub
:NAME MOD
:PARAMS (X Y)
:BODY (IF (< X Y) X (MOD (- X Y) Y))
)
#S(sub
:NAME FAC
:PARAMS (X)
:BODY (IF (EQ0 X) 1 (* X (FAC (PRE X))))
)
#S(sub
:NAME APPEND
:PARAMS (L1 L2)
:BODY (IF (NULL L1) L2 (++ (HD L1) (APPEND (TL L1) L2)))
)
#S(sub
:NAME MEMBER
:PARAMS (X L)
:BODY (IF (NULL L) F (IF (== X (HD L)) T (MEMBER X (TL L))))
)
#S(sub
:NAME LAH
:PARAMS (X)
:BODY
(IF (EQ0 X) 1
(IF (EQ0 (PRE X)) 1
(- (* (PRE (* 2 X)) (LAH (PRE X)))
(* (* (PRE X) (PRE (PRE X))) (LAH (PRE (PRE X))))
) ) )
)
#S(sub
:NAME BINOM
:PARAMS (X Y)
:BODY (IF (EQ0 X) 1
(IF (== X Y) 1 (+ (BINOM (PRE X) (PRE Y)) (BINOM (PRE X) Y)))
)
)
#S(sub
:NAME FIBO
:PARAMS (X)
:BODY
(IF (EQ0 X) 0
(IF (EQ0 (PRE X)) 1 (+ (FIBO (PRE X)) (FIBO (PRE (PRE X)))))
)
)
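For illustration, several of the subprogram bodies above can be transcribed one-to-one into Python (an informal translation, not part of the original RPSs; the primitives EQ0, PRE, and the conditional map to `== 0`, subtraction by one, and `if`/`else`):

```python
# Direct transcriptions of some RPS subprogram bodies (SUM, FAC, MOD,
# GGT, FIBO); each recursion mirrors the IF-term of the struct above.

def rec_sum(x):                       # SUM
    return 0 if x == 0 else x + rec_sum(x - 1)

def fac(x):                           # FAC
    return 1 if x == 0 else x * fac(x - 1)

def mod(x, y):                        # MOD
    return x if x < y else mod(x - y, y)

def ggt(x, y):                        # GGT (greatest common divisor)
    return x if y == 0 else ggt(y, mod(x, y))

def fibo(x):                          # FIBO
    if x == 0:
        return 0
    return 1 if x - 1 == 0 else fibo(x - 1) + fibo(x - 2)

print(rec_sum(5), fac(5), ggt(36, 45), fibo(7))  # 15 120 9 13
```

The list-processing subprograms (REVERT, APPEND, MEMBER) and the tree-recursive LAH and BINOM follow the same pattern, with HD/TL playing the role of first element and rest of a list.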
Index
A* 31, 59
θ-subsumption 144
absorption 146
abstraction 119, 286, 320
ACME 300
ACT 52, 279
action 16, 24, 29, 74
minimal number 19
selection 19
semantics 75
ADL 20, 74, 75
airplane domain 85
analogical reasoning 281, 299
analogy 286
adaptation 288, 299, 320, 328
between-domain 286
causal 292
cognitive model 300
derivational 288, 293, 302
explanatory 292, 299
first-order mapping 288
generalization 287, 288
inference 288
internal 316
mapping 286, 287
mapping constraints 287
problem solving 292, 296
programming 294
proportional 290
re-description 290
replay 288
retrieval 287
structural 289
transfer 287, 299
transformational 288, 302
within-domain 286
anti-instance 322
most specific 322
anti-unification 175, 202, 210, 289, 295
algorithm 323
first order 322
higher order 322
second order 295, 323
sets of terms 176
anti-unificator 175
approximation chain 159
Aries 103
assertion 115, 122
attribute 286
automatic programming 101-104, 106, 125
automaton 132
deterministic finite 136, 153, 167
axiom
domain 45, 124, 277
effect 25
frame 25
induction 113-115
background knowledge 108, 110, 129, 143, 149, 167
bias
semantic 149
syntactic 149
Blackbox 44
blocks-world 14, 17, 18, 20, 22, 26, 30, 36, 59, 71, 90
clearblock 59, 116, 163, 167, 226, 239, 279
complexity 33
tower 63, 139, 266, 271
BMWk 163
bottom test 239, 243, 245, 262
breadth-first search 35, 46, 58
canonical combinatorial problem 44
case-based reasoning 286-288
CIP 118, 122, 124, 125
clause
ij-determinate 150
determinate 150
recursive 152
closed world assumption 13, 15, 20, 25
cognition
adaptive 278
cognitive psychology 49, 280
concept learning 127, 131, 278
constraint
free variable 77, 78
constraint programming 93
Constraint Prolog 93
constraint satisfaction 44, 50, 75, 93
constructor 238
control knowledge
rocket 251
unstack 242
control rule
completeness 243
correctness 243
learning 226, 231, 279
recursive 53, 163, 226, 231
Curry-Howard isomorphism 112
cycle 30
Cypress 125
DAG 48, 58, 65, 256
data type 149, 237
abstract 120, 122
binary tree 120
change 120
inference 232, 233, 237
list 238, 252
pre-defined 277
refinement 125
sequence 238, 239
set 238, 245
decision tree 46, 108, 130
Dedalus 115
deductive planning 20
deductive tableau 115
depth-first search 28, 58
backtracking 28, 30
design pattern 331
design strategy 111
design tactic 125
Dijkstra-algorithm 55
DPlan 42, 48, 55, 57, 59, 65-67, 70, 279
algorithm 57
completeness 70
goal 56
operator 56
planning problem 57
soundness 70
state 56
termination 67
universal plan 56, 58
DPlan, universal plan 59
effect 16
context dependent 23
mixed ADD/DEL
update 90
secondary 23
elevator domain 73
eureka 118, 119, 121, 123
evaluation
expression 77, 78
ground term 78
execution monitoring 46
expression simplification 124
factorial 200
folding 224
unfolding 200, 224
Fibonacci function 118, 141, 182, 202, 265
folding 224
unfolding 224
filter 251, 261, 264
finite differencing 124
fitness measure 137
fixpoint semantics 160, 186
fluents 16, 24, 74
Foil 150, 151
folding 119, 160, 171, 181, 186, 277, 279
fault tolerance 224
formal language 132
FPlan
assignment 79
goal representation 80
language 76
performance gain 90, 91
planning problem 81
state representation 78
user-defined functions 84
FPlan, matching 79
function approximation 158
algorithms 131
order 159
function merging 165
generalization
analogy 287, 288
bottom-up 145, 146, 151
explanation-based 53, 167
recursive 153, 280
RPS 327
generalization-to-n 165
generate and test 110, 269
genetic operations 138
genetic programming 54, 110, 131, 136, 168, 272
goal 115
ordering 39
regression 37
goal regression 56
goals
independence 39
interleaving 39
Golem 150, 151
GPS 41, 49, 53
grammar
context-free 136
context-free tree 180
regular 136
grammar inference 110, 126, 132, 152, 169
graph 286
graph distance 306
Graphplan 34, 42, 58, 73
hanoi domain 63, 74, 89
hat-axiom 117, 233
heuristic 50
function 44, 48
search 44, 278
hidden variable 216, 228
identification 216
identifying 217
higher-order function 251
HSP 44
HSPr 42
hypothesis 107, 130, 142, 143, 165
complete 144
consistent 144
non-recursive 131
hypothesis language 109, 131
IAM 300
ID 136, 153, 167
identification
by enumeration 134, 165
in the limit 130, 132
probably approximately correct 130
indexing 54
induction 116, 159, 278
structural 115
induction bias 131
syntactic 131
induction scheme 115
inductive functional programming 110, 131, 153, 168, 171
inductive logic programming 54, 110, 131, 142, 168
inference
deductive 107, 109, 124
inductive 107, 109, 126
information
incomplete 45
incorrect 45
initial instantiation 201, 207, 219
initial program 136, 181
order 181
recursive explanation 185
segments 196
initial tree
decomposition 206
reduction 209
input/output examples 106, 110
rewriting 153, 155
instantiation
main program 219
partial 38
IPP 44
Isabelle 117
KIDS 103, 124, 168, 295, 319
knowledge
declarative 278
procedural 278, 281
Lah-number 33, 325
language
context-free tree 180
generator 133
learnability 134, 135
primitive recursive 135
regular 135, 136
tester 133
learning 127
algorithm 131
analytical 167
approximation-driven 131
batch 128, 131
control programs 53
control rule 53
data-driven 131
decision list 53
decision tree 53
experience 127
from observation 127
incremental 128, 131
language 132
model-driven 131, 142
policy 53
simultaneous composite 267
speed-up 53
speed-up effects 279
supervised 127, 133
task 127
unsupervised 127
with informant 133
least general generalization 145
relative 145, 151
linked clause 149
LISA 292, 300
Lisp 109, 110, 113, 114, 120, 124, 125, 131, 136, 153, 163, 165
Lisp tutor 103
list
introduction 252
order 252
list problem
semantic 251, 254
structural 251, 277
logistics domain 73, 249
machine learning 45, 50, 108, 126, 278
macro operator
iterative 53
linear 52, 279
map 251
match-select-apply cycle 28, 49
MBP 47
McCarthy-conditional 154
means-end analysis 49, 316
memory
hierarchical 326
linear 324
Merlin 152
meta-programming 118, 122, 125
metaphor 286
minimal spanning tree 67
extraction 256, 257
MIPS 47
MIS 147, 151
ML 136
mode 149
modlist 178, 193, 203, 208, 226
mutex relation 43
network flow problem 42
NOAH 41
novice programmer 104
Nuprl 117
OBDD 46, 59
operator
abstract 45
effect
ADD/DEL 16, 84
update 84
FPlan 80
primitive 45
Strips 16
operator application
backward 19, 56
completeness 69
FPlan 81
soundness 68
conditional effect 24
forward 17, 56
FPlan 81
operator schemes 16
pattern
first order 175
maximal 175, 200, 202, 219
pattern matching 16, 22, 23
PDDL 17, 20, 27, 44, 56, 75
conditional effect 21, 56
equality constraints 21
typing 21
perceptual chunk 280
plan 13, 18, 28
as program 234
extraction 29, 42
levelwise decomposition 235
linearization 29, 252, 259, 265
optimal 19, 59
optimality 35, 47
partial 28
partial order 42
partially ordered 29, 42
recursive 116
reuse 54
totally ordered 29, 42
universal 232
plan construction 19, 20, 264
plan decomposition 233, 246
plan execution 20, 233
plan transformation 277
completeness 235
correctness 235
planner
partial order 41
special purpose 45
planning 163, 278
backward 55
competition 20
compilation approach 44
completeness 35
conditional 46
control knowledge 45, 52
correctness 34
deductive 41
forward 29, 30, 44
function application 73
heuristic 73
hierarchical 45, 50
infinite domain 74
least commitment 40, 42
linear 35, 39
logical inference 20
non-linear 40, 42, 55
NP-completeness 31, 58
number problems 87, 278
progression 29
PSPACE-completeness 33, 46, 58
reactive 46
regression 36
resource constraints 85
resource variable 73
soundness 34
state-based 19, 55
Strips 37, 49
strong commitment 40
termination 35
total order 55
types 45
universal 44, 46, 55, 231, 277
planning graph 42
policy learning 46
position
composition 173
next right 199
node 173
order 173
term 173
pre-image 46, 56
pre-planning analysis 45
precondition
secondary 23
standard 77
predicate construction 157
predicate invention 146, 152
prefix 175
problem decomposition 45
problem scheme 302
problem solving 48
analogical 298, 299
human 49, 54
scheme 281
strategy 279
problem solving constraints 302, 305
Prodigy 42, 53, 289, 293
production rule 49, 103
production system 49, 52, 278
program 321
functional 108, 116, 118, 130, 136, 177
morphism 295
optimization 125, 160
regular 164
validation 102
verification 101, 111, 113
program body 177
maximal 201
maximization 201
program optimization 124
program reuse 111, 319, 331
program scheme 111, 117, 168, 296, 319, 321
binary search 295
divide-and-conquer 124, 152
extension of Summers' 163
global search 124
local search 124
second order 149
semi-regular 164
Summers 154
program synthesis 46, 101, 104, 112, 113, 117, 277
deductive 41, 101, 107, 109, 111, 112, 117, 168
inductive 101, 107, 110, 111, 126, 131, 168
knowledge 110
schema-based 111
program transformation 109, 117, 125, 168, 319
program verification 103
programming
abstract 120
by analogy 111, 294
by demonstration 165, 171
declarative 106
programming tutor 103, 111
Prolog 20, 27, 110, 131, 143, 149, 152, 153, 167
proof method 50
proof planning 50
proof tactic 117
proof-as-program 115
Proust 103
PVS 103
QA3 24, 41, 113
re-planning 46
recurrence detection 153, 157
recurrence relation 157, 158
recursion
introduction 123
lists 114
recursion point 181, 193
valid 195
recursive
linear 117, 126, 154, 160, 165, 226, 243, 256, 264, 325
tail 117, 126, 160, 254, 264, 325
tree 325
recursive call
variable substitution 210
recursive clause 143
recursive function 265
recursive program 101, 113-115, 136
recursive subprogram 177, 219
reduce 251
Refine 124
refinement 118, 124
operator 147
top-down 145, 146
reinforcement learning 46, 53, 54, 127, 278
relation
first order 286
n-ary 286
second order 286
unary 286
resolution 24, 27, 112, 115, 144
inverse 146, 186
resource constraint 50
retrieval 320, 326
algorithm 325
reverse 151, 160, 251
rewrite rule
context-free 171
rocket 40, 52, 61, 66, 71, 246, 249, 253, 293
recursive loading 249
recursive unloading 249
RPS 177, 277, 279, 280, 320
CFTG 181
existence 221
induction 172, 231
interpretation 178
language 179, 328, 329
minimality 187
restrictions 177, 187
rewrite system 179
substitution uniqueness 188
Rutherford analogy 286, 288, 292, 298
S-expression 154
complete partial order 156
form 156
SAT-solver 44
saturation 146
scheduling 49, 87
schema induction 299
search
algorithm 27
backward 48
forward 48
tree 28-30
segment indices 197
segmentation 181
algorithm 197
recurrent 192
strategy 198
valid 195
valid recurrent 197, 219
selection sort 91
selector function 238, 256, 262, 280
sequence
introduction 240
set
introduction 246
sets of list problem 33, 271
signature 172
similarity
attribute 289
literal 286
structural 287, 289
superficial 287
single-source shortest-path algorithm 55
situation calculus 20, 24, 41, 107
situation variable 233
introduction 233, 238
skeleton 193
skill
acquisition 278
cognitive 278
skill acquisition 52, 127
SME 292, 300
SOAR 53
Soar 279
software engineering 101, 102, 319
computer-aided 102
knowledge-based 102, 111, 125
solution 28
sorting 73, 91, 253, 277
bubble-sort 167, 254
domain 40, 61
program 114
selection sort 253, 254, 256
source inclusiveness 300
source program 320
specification 101-103, 121, 319
as theorem 109
by examples 106
complete 104
formal 104, 126
incomplete 101, 106
language 106
natural language 104, 111
non-executable 124
specification language 106
stack 30
STAN 44, 267
state description
complete 37, 56
consistent 38
inconsistent 37, 38
partial 37, 56
state invariant 51
state representation 15
complete 48
partial 37, 48
state space 17, 34, 70
state transformation 15
state transition 47
state variable 75
state-action rule 46
statics 16, 17, 23, 74
Strips 24, 27, 56
domain 17
Functional 75
goal 16
language 14, 44
operator 16
planner 41
planning problem 17
state 15
structural similarity 298, 308, 312, 316
structure mapping 292, 298, 320
theory 298
sub-plan
uniform 235
sub-schema
equivalence 207
sub-schema position 193
valid 195
sub-term 173
subprogram
body 201
calculation steps 220
calculation strategy 221
dependency relation 186
equality 223
language 179
substitution 16, 174, 181, 220
algorithm 218
consistency 216
inverse 146
necessary condition 215
product 79
testing recurrence 214
testing uniqueness 213
uniqueness 212
substitution term 181
calculation 218
characteristics 213
context 329
subsumption 289
successor identification 239
Sussman anomaly 35, 39
symbolic model checking 44, 46
Synapse 152
synthesis problem 189
synthesis theorem 158
basic 160
tactic 50
target exhaustiveness 300
target problem 320
term 172
depth 149
level 149
order 175
replacement 174
substitution 322, 328
subsumption 324
term algebra 163
term rewrite system 174
term rewriting 175
theorem prover 25
Boyer-Moore 50
theorem proving 37, 41, 50
constructive 112, 113, 115, 125, 168
drawbacks 114
Thesys 153
TIM 52, 152
Tinker 165
Tower of Hanoi 33, 63, 127, 128, 228, 266, 273, 280, 302, 303
trace 163
traces 106, 110, 153, 154, 163, 165, 171
user-generated 136
training examples 127
positive/negative 143
transformation
correctness 121, 122
distance measure 289
lateral 117
plan to program 231
vertical 117, 319
transformation rule 109, 115, 121, 122
transformational semantics 122
tree
regular 257
regularization 259
type inference 51
UMOP 47
unfolding 119, 136, 160, 165, 184, 320, 321
depth 152
indices 182
unification
higher order 322
universal plan 65
full 66
optimal 66
order over states 65
unpack 154, 157
unstack 239
update 74, 77, 80
backward 83
user-modeling 103
utility problem 52
variable addition 160
variable instantiation 178, 220
maximal pattern 202
segment 214
variables
range 78
sufficient instances 215
Warshall-algorithm 123
water jug domain 87, 303