Optimizing and Incrementalizing
Higher-order Collection Queries
by AST Transformation
Dissertation
der Mathematisch- und Naturwissenschaftlichen Fakultät
der Eberhard Karls Universität Tübingen
zur Erlangung des Grades eines
Doktors der Naturwissenschaften
(Dr. rer. nat.)
vorgelegt von
Paolo G Giarrusso
aus Catania (Italien)
Tübingen
2017
Gedruckt mit Genehmigung der Mathematisch-Naturwissenschaftlichen Fakultät
der Eberhard Karls Universität Tübingen

Tag der mündlichen Qualifikation: mm2017
Dekan: Prof. Dr. Wolfgang Rosenstiel
1. Berichterstatter: Prof. Dr. Klaus Ostermann
2. Berichterstatter: Prof. Dr. Fritz Henglein
To the memory of my mother Rosalia
Alla memoria di mia madre Rosalia
Abstract
In modern programming languages, queries on in-memory collections are often more expensive than needed. While database queries can be readily optimized, collection queries often don't translate trivially to databases, as modern programming languages are far more expressive than databases, supporting both nested data and first-class functions.

Collection queries can be optimized and incrementalized by hand, but this reduces modularity, and is often too error-prone to be feasible or to enable maintenance of resulting programs. In this thesis, we optimize and incrementalize such collection queries to free programmers from such burdens. Resulting programs are expressed in the same core language, so that they can be subjected to other standard optimizations.

To enable optimizing collection queries which occur inside programs, we develop a staged variant of the Scala collection API that reifies queries as ASTs. On top of this interface, we adapt domain-specific optimizations from the fields of programming languages and databases; among others, we rewrite queries to use indexes chosen by programmers. Thanks to the use of indexes we show significant speedups in our experimental evaluation, with an average of 12x and a maximum of 12800x.

To incrementalize higher-order programs by program transformation, we extend finite differencing [Paige and Koenig, 1982; Blakeley et al., 1986; Gupta and Mumick, 1999] and develop the first approach to incrementalization by program transformation for higher-order programs. Base programs are transformed to derivatives, programs that transform input changes to output changes. We prove that our incrementalization approach is correct. We develop the theory underlying this approach for simply-typed and untyped λ-calculus, and discuss extensions to System F.

Derivatives often need to reuse results produced by base programs: to enable such reuse, we extend work by Liu and Teitelbaum [1995] to higher-order programs, and develop and prove correct a program transformation converting higher-order programs to cache-transfer style.

For efficient incrementalization, it is necessary to choose and incrementalize by hand appropriate primitive operations. We incrementalize a significant subset of collection operations, and perform case studies showing order-of-magnitude speedups both in practice and in asymptotic complexity.
Zusammenfassung
In modernen universellen Programmiersprachen sind Abfragen auf Speicher-basierten Kollektionen oft rechenintensiver als erforderlich. Während Datenbankenabfragen vergleichsweise einfach optimiert werden können, fällt dies bei Speicher-basierten Kollektionen oft schwer, denn universelle Programmiersprachen sind in aller Regel ausdrucksstärker als Datenbanken. Insbesondere unterstützen diese Sprachen meistens verschachtelte, rekursive Datentypen und Funktionen höherer Ordnung.

Kollektionsabfragen können per Hand optimiert und inkrementalisiert werden, jedoch verringert dies häufig die Modularität und ist oft zu fehleranfällig, um realisierbar zu sein oder um Instandhaltung der entstandenen Programme zu gewährleisten. Die vorliegende Doktorarbeit demonstriert, wie Abfragen auf Kollektionen systematisch und automatisch optimiert und inkrementalisiert werden können, um Programmierer von dieser Last zu befreien. Die so erzeugten Programme werden in derselben Kernsprache ausgedrückt, um weitere Standardoptimierungen zu ermöglichen.

Teil I entwickelt eine Variante der Scala API für Kollektionen, die Staging verwendet, um Abfragen als abstrakte Syntaxbäume zu reifizieren. Auf Basis dieser Schnittstelle werden anschließend domänenspezifische Optimierungen von Programmiersprachen und Datenbanken angewandt; unter anderem werden Abfragen umgeschrieben, um vom Programmierer ausgewählte Indizes zu benutzen. Dank dieser Indizes kann eine erhebliche Beschleunigung der Ausführungsgeschwindigkeit gezeigt werden; eine experimentelle Auswertung zeigt hierbei Beschleunigungen von durchschnittlich 12x bis zu einem Maximum von 12800x.

Um Programme mit Funktionen höherer Ordnung durch Programmtransformation zu inkrementalisieren, wird in Teil II eine Erweiterung der Finite-Differenzen-Methode vorgestellt [Paige and Koenig, 1982; Blakeley et al., 1986; Gupta and Mumick, 1999] und ein erster Ansatz zur Inkrementalisierung durch Programmtransformation für Programme mit Funktionen höherer Ordnung entwickelt. Dabei werden Programme zu Ableitungen transformiert, d.h. zu Programmen, die Eingangsdifferenzen in Ausgangsdifferenzen umwandeln. Weiterhin werden in den Kapiteln 12–13 die Korrektheit des Inkrementalisierungsansatzes für einfach-getypten und ungetypten λ-Kalkül bewiesen und Erweiterungen zu System F besprochen.

Ableitungen müssen oft Ergebnisse der ursprünglichen Programme wiederverwenden. Um eine solche Wiederverwendung zu ermöglichen, erweitert Kapitel 17 die Arbeit von Liu and Teitelbaum [1995] zu Programmen mit Funktionen höherer Ordnung und entwickelt eine Programmtransformation solcher Programme im Cache-Transfer-Stil.

Für eine effiziente Inkrementalisierung ist es weiterhin notwendig, passende Grundoperationen auszuwählen und manuell zu inkrementalisieren. Diese Arbeit deckt einen Großteil der wichtigsten Grundoperationen auf Kollektionen ab. Die Durchführung von Fallstudien zeigt deutliche Laufzeitverbesserungen sowohl in der Praxis als auch in der asymptotischen Komplexität.
Contents
Abstract v
Zusammenfassung vii
Contents ix
List of Figures xv
List of Tables xvii
1 Introduction 1
1.1 This thesis 2
1.1.1 Included papers 3
1.1.2 Excluded papers 3
1.1.3 Navigating this thesis 4
I Optimizing Collection Queries by Reification 7
2 Introduction 9
2.1 Contributions and summary 10
2.2 Motivation 10
2.2.1 Optimizing by Hand 11
3 Automatic optimization with SQuOpt 15
3.1 Adapting a Query 15
3.2 Indexing 17
4 Implementing SQuOpt 19
4.1 Expression Trees 19
4.2 Optimizations 21
4.3 Query Execution 22
5 A deep EDSL for collection queries 23
5.1 Implementation: Expressing the interface in Scala 24
5.2 Representing expression trees 24
5.3 Lifting first-order expressions 25
5.4 Lifting higher-order expressions 27
5.5 Lifting collections 29
5.6 Interpretation 31
5.7 Optimization 32
6 Evaluating SQuOpt 35
6.1 Study Setup 35
6.2 Experimental Units 37
6.3 Measurement Setup 40
6.4 Results 40
7 Discussion 43
7.1 Optimization limitations 43
7.2 Deep embedding 43
7.2.1 What worked well 43
7.2.2 Limitations 44
7.2.3 What did not work so well: Scala type inference 44
7.2.4 Lessons for language embedders 46
8 Related work 47
8.1 Language-Integrated Queries 47
8.2 Query Optimization 48
8.3 Deep Embeddings in Scala 48
8.3.1 LMS 49
8.4 Code Querying 49
9 Conclusions 51
9.1 Future work 51
II Incremental λ-Calculus 53
10 Introduction to differentiation 55
10.1 Generalizing the calculus of finite differences 57
10.2 A motivating example 57
10.3 Introducing changes 58
10.3.1 Incrementalizing with changes 59
10.4 Differentiation on open terms and functions 61
10.4.1 Producing function changes 61
10.4.2 Consuming function changes 62
10.4.3 Pointwise function changes 62
10.4.4 Passing change targets 62
10.5 Differentiation informally 63
10.6 Differentiation on our example 64
10.7 Conclusion and contributions 66
10.7.1 Navigating this thesis part 66
10.7.2 Contributions 66
11 A tour of differentiation examples 69
11.1 Change structures as type-class instances 69
11.2 How to design a language plugin 70
11.3 Incrementalizing a collection API 70
11.3.1 Changes to type-class instances 71
11.3.2 Incrementalizing aggregation 71
11.3.3 Modifying list elements 73
11.3.4 Limitations 74
11.4 Efficient sequence changes 75
11.5 Products 75
11.6 Sums, pattern matching and conditionals 76
11.6.1 Optimizing filter 78
11.7 Chapter conclusion 78
12 Changes and differentiation, formally 79
12.1 Changes and validity 79
12.1.1 Function spaces 82
12.1.2 Derivatives 83
12.1.3 Basic change structures on types 84
12.1.4 Validity as a logical relation 85
12.1.5 Change structures on typing contexts 85
12.2 Correctness of differentiation 86
12.2.1 Plugin requirements 88
12.2.2 Correctness proof 88
12.3 Discussion 91
12.3.1 The correctness statement 91
12.3.2 Invalid input changes 92
12.3.3 Alternative environment changes 92
12.3.4 Capture avoidance 92
12.4 Plugin requirement summary 94
12.5 Chapter conclusion 95
13 Change structures 97
13.1 Formally defining ⊕ and change structures 97
13.1.1 Example: Group changes 98
13.1.2 Properties of change structures 99
13.2 Operations on function changes, informally 100
13.2.1 Examples of nil changes 100
13.2.2 Nil changes on arbitrary functions 101
13.2.3 Constraining ⊕ on functions 102
13.3 Families of change structures 102
13.3.1 Change structures for function spaces 102
13.3.2 Change structures for products 104
13.4 Change structures for types and contexts 105
13.5 Development history 107
13.6 Chapter conclusion 107
14 Equational reasoning on changes 109
14.1 Reasoning on changes syntactically 109
14.1.1 Denotational equivalence for valid changes 110
14.1.2 Syntactic validity 111
14.2 Change equivalence 114
14.2.1 Preserving change equivalence 114
14.2.2 Sketching an alternative syntax 117
14.2.3 Change equivalence is a PER 118
14.3 Chapter conclusion 119
15 Extensions and theoretical discussion 121
15.1 General recursion 121
15.1.1 Differentiating general recursion 121
15.1.2 Justification 122
15.2 Completely invalid changes 122
15.3 Pointwise function changes 123
15.4 Modeling only valid changes 124
15.4.1 One-sided vs two-sided validity 125
16 Differentiation in practice 127
16.1 The role of differentiation plugins 127
16.2 Predicting nil changes 128
16.2.1 Self-maintainability 128
16.3 Case study 129
16.4 Benchmarking setup 132
16.5 Benchmark results 133
16.6 Chapter conclusion 133
17 Cache-transfer-style conversion 135
17.1 Introduction 135
17.2 Introducing Cache-Transfer Style 136
17.2.1 Background 137
17.2.2 A first-order motivating example: computing averages 137
17.2.3 A higher-order motivating example: nested loops 140
17.3 Formalization 141
17.3.1 The source language λAL 141
17.3.2 Static differentiation in λAL 145
17.3.3 A new soundness proof for static differentiation 146
17.3.4 The target language iλAL 147
17.3.5 CTS conversion from λAL to iλAL 151
17.3.6 Soundness of CTS conversion 152
17.4 Incrementalization case studies 154
17.4.1 Averaging integers bags 154
17.4.2 Nested loops over two sequences 156
17.4.3 Indexed joins of two bags 157
17.5 Limitations and future work 159
17.5.1 Hiding the cache type 159
17.5.2 Nested bags 159
17.5.3 Proper tail calls 159
17.5.4 Pervasive replacement values 159
17.5.5 Recomputing updated values 160
17.5.6 Cache pruning via absence analysis 160
17.5.7 Unary vs n-ary abstraction 160
17.6 Related work 161
17.7 Chapter conclusion 162
18 Towards differentiation for System F 163
18.1 The parametricity transformation 164
18.2 Differentiation and parametricity 166
18.3 Proving differentiation correct externally 167
18.4 Combinators for polymorphic change structures 168
18.5 Differentiation for System F 171
18.6 Related work 172
18.7 Prototype implementation 172
18.8 Chapter conclusion 172
19 Related work 173
19.1 Dynamic approaches 173
19.2 Static approaches 174
19.2.1 Finite differencing 174
19.2.2 λ–diff and partial differentials 175
19.2.3 Static memoization 176
20 Conclusion and future work 177
Appendixes 181
A Preliminaries 181
A.1 Our proof meta-language 181
A.1.1 Type theory versus set theory 182
A.2 Simply-typed λ-calculus 182
A.2.1 Denotational semantics for STLC 184
A.2.2 Weakening 185
A.2.3 Substitution 186
A.2.4 Discussion: Our mechanization and semantic style 186
A.2.5 Language plugins 187
B More on our formalization 189
B.1 Mechanizing plugins modularly and limitations 189
C (Un)typed ILC, operationally 193
C.1 Formalization 195
C.1.1 Types and contexts 196
C.1.2 Base syntax for λA 196
C.1.3 Change syntax for λ∆A 199
C.1.4 Differentiation 199
C.1.5 Typing λA→ and λ∆A→ 199
C.1.6 Semantics 200
C.1.7 Type soundness 200
C.2 Validating our step-indexed semantics 201
C.3 Validity, syntactically (λA→, λ∆A→) 202
C.4 Step-indexed extensional validity (λA→, λ∆A→) 204
C.5 Step-indexed intensional validity 207
C.6 Untyped step-indexed validity (λA, λ∆A) 210
C.7 General recursion in λA→ and λ∆A→ 210
C.8 Future work 212
C.8.1 Change composition 212
C.9 Development history 213
C.10 Conclusion 213
D Defunctionalizing function changes 215
D.1 Setup 215
D.2 Defunctionalization 217
D.2.1 Defunctionalization with separate function codes 219
D.2.2 Defunctionalizing function changes 220
D.3 Defunctionalization and cache-transfer-style 224
Acknowledgments 229
Bibliography 231
List of Figures
2.1 Definition of the schema and of some content 11
2.2 Our example query on the schema in Fig. 2.1, and a function which postprocesses its result 12
2.3 Composition of queries in Fig. 2.2 after inlining, query unnesting and hoisting 13
3.1 SQuOpt version of Fig. 2.2. recordQuery contains a reification of the query, records its result 16
5.1 Desugaring of code in Fig. 2.2 29
5.2 Lifting TraversableLike 31
6.1 Measurement Setup Overview 37
6.5 Find covariant equals methods 39
6.6 A simple index definition 40
12.1 Defining differentiation and proving it correct. The rest of this chapter explains and motivates the above definitions 80
13.1 Defining change structures 108
16.1 The λ-term histogram with Haskell-like syntactic sugar 128
16.2 A Scala implementation of primitives for bags and maps 130
16.3 Performance results in log–log scale 134
17.1 Source language λAL (syntax) 142
17.2 Step-indexed big-step semantics for base terms of source language λAL 142
17.3 Step-indexed big-step semantics for the change terms of the source language λAL 144
17.4 Static differentiation in λAL 145
17.5 Step-indexed relation between values and changes 146
17.6 Target language iλAL (syntax) 148
17.7 Target language iλAL (semantics of base terms and caches) 149
17.8 Target language iλAL (semantics of change terms and cache updates) 150
17.9 Cache-Transfer Style (CTS) conversion from λAL to iλAL 151
17.10 Extending CTS translation to values, change values, environments and change environments 152
17.11 Extension of an environment with cache values dF ↑Ci dF′ (for i = 1, 2) 153
18.1 A change structure that allows modifying list elements 170
A.1 Standard definitions for the simply-typed lambda calculus 183
C.1 ANF λ-calculus: λA and λ∆A 196
C.2 ANF λ-calculus: λA→ and λ∆A→ type system 197
C.3 ANF λ-calculus (λA and λ∆A), CBV semantics 198
C.4 Defining extensional validity via logical relations and big-step semantics 203
C.5 Defining extensional validity via step-indexed logical relations and big-step semantics 206
C.6 Defining extensional validity via untyped step-indexed logical relations and big-step semantics 211
D.1 A small example program for defunctionalization 217
D.2 Defunctionalized program 218
D.3 Implementing change structures using Codelike instances 226
D.4 Implementing FunOps using Codelike instances 227
List of Tables
6.2 Performance results. As in Sec. 6.1, (1) denotes the modular Scala implementation, (2) the hand-optimized Scala one, and (3-), (3o), (3x) refer to the SQuOpt implementation when run, respectively, without optimizations, with optimizations, and with optimizations and indexing. Queries marked with the R superscript were selected by random sampling 38
6.3 Average performance ratios. This table summarizes all interesting performance ratios across all queries, using the geometric mean [Fleming and Wallace, 1986]. The meaning of speedups is discussed in Sec. 6.1 39
6.4 Description of abstractions removed during hand-optimization, and number of queries where the abstraction is used (and optimized away) 39
Chapter 1
Introduction
Many programs perform queries on collections of data, and for non-trivial amounts of data it is useful to execute the queries efficiently. When the data is updated often enough, it can also be useful to update the results of some queries incrementally whenever the input changes, so that up-to-date results are quickly available, even for queries that are expensive to execute.

For instance, a program manipulating demographic data about the citizens of a country might need to compute statistics on them, such as their average age, and update those statistics when the set of citizens changes.
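The average-age scenario can be sketched in a few lines of Scala. This is a hypothetical, hand-written illustration of the kind of incremental update meant here, not a mechanism from this thesis; the names `Stats` and `addCitizen` are invented for the sketch:

```scala
object AverageAge {
  // Cached aggregate: just enough state to answer `average` and to update in O(1).
  final case class Stats(sum: Long, count: Long) {
    def average: Double = if (count == 0) 0.0 else sum.toDouble / count
    // React to a change (one citizen added or removed) without rescanning everyone.
    def addCitizen(age: Int): Stats    = Stats(sum + age, count + 1)
    def removeCitizen(age: Int): Stats = Stats(sum - age, count - 1)
  }

  // Building the cache still takes one full pass; each later update is then cheap.
  def fromAges(ages: Seq[Int]): Stats =
    ages.foldLeft(Stats(0, 0))((s, a) => s.addCitizen(a))
}
```

Recomputing the average from scratch costs O(n) per change, while the cached `Stats` absorbs each change in constant time; this asymptotic gap is what incrementalization aims for.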
Traditional relational database management systems (RDBMSs) support both query optimization and (in quite a few cases) incremental update of query results (called there incremental view maintenance).

However, queries are often executed on collections of data that are not stored in a database, but in collections manipulated by some program. Moving in-memory data to RDBMSs typically does not improve performance [Stonebraker et al., 2007; Rompf and Amin, 2015], and reusing database optimizers is not trivial.

Moreover, many programming languages are far more expressive than RDBMSs. A typical RDBMS can only manipulate SQL relations, that is, multisets (or bags) of tuples (or sometimes sets of tuples, after duplicate elimination). Typical programming languages (PLs) also support arbitrarily nested lists and maps of data, and allow programmers to define new data types with few restrictions; a typical PL will also allow a far richer set of operations than SQL.

However, typical PLs do not apply typical database optimizations to collection queries, and if queries are incrementalized, this is often done by hand, even though code implementing incremental queries is error-prone and hard to maintain [Salvaneschi and Mezini, 2013].

What's worse, some of these manual optimizations are best done over the whole program. Some optimizations only become possible after inlining, which reduces modularity if done by hand.1

Worse, adding an index on some collection can speed up looking for some information, but each index must be maintained incrementally when the underlying data changes. Depending on the actual queries and updates performed on the collection, and on how often they happen, it might turn out that updating an index takes more time than the index saves; hence, the choice of which indexes to enable depends on the whole program. However, adding or removing an index requires updating all PL queries to use it, while RDBMS queries can use an index transparently.
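To make this trade-off concrete, here is a hypothetical sketch (not code from this thesis; all names are invented) of a manually maintained index: lookups become cheap, but every update now pays an index-maintenance cost, and every query must be rewritten to mention the index explicitly:

```scala
object IndexSketch {
  final case class Person(name: String, city: String)

  final class PeopleStore {
    private var people: List[Person] = Nil
    // Index from city to inhabitants; must be kept in sync with `people` by hand.
    private var byCity: Map[String, List[Person]] = Map.empty

    def add(p: Person): Unit = {
      people = p :: people // base collection update
      // Index maintenance: extra work paid on *every* update.
      byCity = byCity.updated(p.city, p :: byCity.getOrElse(p.city, Nil))
    }

    // Without the index: scan the whole collection, O(n).
    def inCityScan(city: String): List[Person] = people.filter(_.city == city)
    // With the index: a single map lookup.
    def inCityIndexed(city: String): List[Person] = byCity.getOrElse(city, Nil)
  }
}
```

Whether `inCityIndexed` pays off depends on the ratio of lookups to updates, which is exactly why the decision is a whole-program concern.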
Overall, manual optimizations are not only effort-intensive and error-prone, but they also significantly reduce modularity.

1 In Chapter 2 we define modularity as the ability to abstract behavior in a separate function (possibly part of a different module) to enable reuse and improve understandability.
1.1 This thesis

To reduce the need for manual optimizations, in this thesis we propose techniques for optimizing higher-order collection queries and executing them incrementally. By generalizing database-inspired approaches, we provide approaches to query optimization and incrementalization by code transformation, applicable to higher-order collection queries, including user-defined functions. Instead of building on existing approaches to incrementalization such as self-adjusting computation [Acar, 2009], we introduce a novel incrementalization approach which enables further transformations on the incrementalization results. This approach is in fact not restricted to collection queries but applicable to other domains, though it requires adaptation for the domain under consideration. Further research is needed, but a number of case studies suggest applicability, even in some scenarios beyond existing techniques [Koch et al., 2016].
We consider the problem for functional programming languages such as Haskell or Scala, and we consider collection queries written using the APIs of their collection libraries, which we treat as an embedded domain-specific language (EDSL). Such APIs (or DSLs) contain powerful operators on collections, such as map, which are higher-order: that is, they take as arguments arbitrary functions in the host language, which we must also handle.
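For concreteness, here is the kind of query we mean, written against the plain Scala collections API; the `Book` type and the `isLong` predicate are invented for this illustration. Note how the library operators take an arbitrary user-defined host-language function as an argument:

```scala
object QueryExample {
  final case class Book(title: String, pages: Int)

  // Arbitrary user-defined host-language predicate.
  def isLong(b: Book): Boolean = b.pages > 300

  // A higher-order collection query, written as a for-comprehension...
  def longTitles(books: Seq[Book]): Seq[String] =
    for (b <- books if isLong(b)) yield b.title

  // ...which corresponds to chained higher-order operators:
  def longTitlesDesugared(books: Seq[Book]): Seq[String] =
    books.filter(isLong).map(_.title)
}
```

Both formulations run `isLong` on every element; it is precisely such opaque host-language functions that an optimizer for collection queries must cope with.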
Therefore, our optimizations and incrementalizations must handle programs in higher-order EDSLs that can contain arbitrary code in the host language. Hence, many of our optimizations will exploit properties of our collection EDSL, but will need to handle host-language code. We restrict the problem to purely functional programs (without mutation or other side effects, mostly including non-termination), because such programs can be more "declarative", and because avoiding side effects can enable more powerful optimizations and simplify the work of the optimizer at the same time.

This thesis is divided into two parts:

• In Part I, we optimize collection queries by static program transformation, based on a set of rewrite rules [Giarrusso et al., 2013].

• In Part II, we incrementalize programs by transforming them to new programs. This thesis presents the first approach that handles higher-order programs through program transformation; hence, while our main examples use collection queries, we phrase the work in terms of λ-calculi with unspecified primitives [Cai, Giarrusso, Rendel and Ostermann, 2014]. In Chapter 17 we extend ILC with a further program transformation step, so that base programs can store intermediate results and derivatives can reuse them, but without resorting to dynamic memoization and necessarily needing to look results up at runtime. To this end, we build on work by Liu [2000] and extend it to a higher-order, typed setting.
Part II is more theoretical than Part I, because optimizations in Part I are much better understood than our approach to incrementalization.

To incrementalize programs, we are the first to extend to higher-order programs techniques based on finite differencing for queries on collections [Paige and Koenig, 1982] and databases [Blakeley et al., 1986; Gupta and Mumick, 1999]. Incrementalizing by finite differencing is a well-understood technique for database queries. How to generalize it for higher-order programs or beyond databases was less clear, so we spend significant energy on providing sound mathematical foundations for this transformation.
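The core idea of finite differencing can be sketched as follows: for a base program f, a derivative f' maps an input change to an output change, so that f(a ⊕ da) = f(a) ⊕ f'(a, da). The Scala below is illustrative only (the change representation and all names are invented for this sketch); it shows a derivative of summation whose cost is proportional to the change, not to the whole input:

```scala
object Differencing {
  // A change to a multiset of integers: elements added and elements removed.
  final case class BagChange(added: List[Int], removed: List[Int])

  // Base program.
  def sum(xs: List[Int]): Int = xs.sum

  // Its derivative: maps the input change to an output change.
  // It ignores the base input entirely, i.e. it is self-maintainable.
  def dsum(xs: List[Int], dxs: BagChange): Int =
    dxs.added.sum - dxs.removed.sum

  // Applying a change to the base input (the ⊕ operation on inputs).
  def applyChange(xs: List[Int], dxs: BagChange): List[Int] =
    (xs ++ dxs.added).diff(dxs.removed)
}
```

The correctness condition here instantiates to `sum(applyChange(xs, dxs)) == sum(xs) + dsum(xs, dxs)`, where ⊕ on integer outputs is ordinary addition.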
In fact, it took us a while to be sure that our transformation was correct, and to understand why; our first correctness proof [Cai, Giarrusso, Rendel and Ostermann, 2014], while a significant step, was still more complex than needed. In Part II, especially Chapter 12, we offer a mathematically much simpler proof.
Contributions of Part I are listed in Sec. 2.1. Contributions of Part II are listed at the end of Sec. 10.7.
1.1.1 Included papers

This thesis includes material from joint work with colleagues.

Part I is based on work by Giarrusso, Ostermann, Eichberg, Mitschke, Rendel and Kästner [2013], and Chapters 2 to 4, 6, 8 and 9 come from that manuscript. While the work was done in collaboration, a few clearer responsibilities arose. I did most of the implementation work and collaborated on the writing; among other things, I devised the embedding for collection operations, implemented optimizations and indexing, implemented compilation when interpretation did not achieve sufficient performance, and performed the performance evaluation. Michael Eichberg and Ralf Mitschke contributed to the evaluation by adapting FindBugs queries. Christian Kästner contributed, among other things, to the experiment design for the performance evaluation. Klaus Ostermann proposed the original idea (together with an initial prototype) and supervised the project.

The first chapters of Part II are originally based on work by Cai, Giarrusso, Rendel and Ostermann [2014], though significantly revised. Chapter 10 contains significantly revised text from that paper, and Chapters 16 and 19 survive mostly unchanged. This work was even more of a team effort. I initiated and led the overall project and came up with the original notion of change structures. Cai Yufei contributed differentiation itself and its first correctness proof. Tillmann Rendel contributed significant simplifications and started the overall mechanization effort. Klaus Ostermann provided senior supervision that proved essential to the project. Chapters 12 and 13 contain a novel correctness proof for simply-typed λ-calculus; for its history and contributions see Sec. 13.5.

Furthermore, Chapter 17 comes from a recently submitted manuscript [Giarrusso, Régis-Gianas and Schuster, Submitted]. I designed the overall approach, the transformation and the case study on sequences and nested loops. Proofs were done in collaboration with Yann Régis-Gianas; he is the main author of the Coq mechanization and proof, though I contributed significantly to the correctness proof for ILC, in particular with the proofs described in Appendix C and their partial Agda mechanization. The history of this correctness proof is described in Appendix C.9. Philipp Schuster contributed to the evaluation, devising language plugins for bags and maps in collaboration with me.
1.1.2 Excluded papers

This thesis improves modularity of collection queries by automating different sorts of optimizations on them.

During my PhD work I collaborated on several other papers, mostly on different facets of modularity, which do not appear in this thesis.

Modularity Module boundaries hide information on implementation that is not relevant to clients. However, it often turns out that some scenarios, clients or tasks require such hidden information. As discussed, manual optimization requires hidden information; hence, researchers strive to automate optimizations. But other tasks often violate these boundaries. I collaborated on research on understanding why this happens and how to deal with this problem [Ostermann, Giarrusso, Kästner and Rendel, 2011].

Software Product Lines In some scenarios, a different form of modular software is realized through software product lines (SPLs), where many variants of a single software artifact (with
different sets of features) can be generated. This is often done using conditional compilation, for instance through the C preprocessor.

But if a software product line uses conditional compilation, processing its source code automatically is difficult; in fact, even lexing or parsing it is a challenge. Since the number of variants is exponential in the number of features, generating and processing each variant does not scale.

To address these challenges, I collaborated within the TypeChef project to develop a variability-aware lexer [Kästner, Giarrusso and Ostermann, 2011a] and parser [Kästner, Giarrusso, Rendel, Erdweg, Ostermann and Berger, 2011b].

Another challenge is predicting non-functional properties (such as code size or performance) of a variant of a software product line, before generation and direct measurement. Such predictions can be useful to efficiently select a variant that satisfies some non-functional requirements. I collaborated on research on predicting such non-functional properties: we measure the properties on a few variants and extrapolate the results to other variants. In particular, I collaborated on the experimental evaluation on a few open-source projects, where even a simple model reached significant accuracy in a few scenarios [Siegmund, Rosenmüller, Kästner, Giarrusso, Apel and Kolesnikov, 2011, 2013].

Domain-specific languages This thesis is concerned with DSLs, both ones for the domain of collection queries and also (in Part II) more general ones.

Different forms of language composition, both among DSLs and across general-purpose languages and DSLs, are possible, but their relationship is nontrivial. I collaborated on work classifying the different forms of language composition [Erdweg, Giarrusso and Rendel, 2012].

The implementation of SQuOpt, as described in Part I, requires an extensible representation of ASTs, similar to the one used by LMS [Rompf and Odersky, 2010]. While this representation is pretty flexible, it relies on Scala's support of GADTs [Emir et al., 2006, 2007b], which is known to be somewhat fragile due to implementation bugs. In fact, sound pattern matching on Scala's extensible GADTs is impossible without imposing significant restrictions on extensibility, due to language extensions not considered during formal study [Emir et al., 2006, 2007a]: the problem arises from the interaction between GADTs, declaration-site variance annotations and variant refinement of type arguments at inheritance time. To illustrate the problem, I've shown it already applies to an extensible definitional interpreter for λ<: [Giarrusso, 2013]. While solutions have been investigated in other settings [Scherer and Rémy, 2013], the problem remains open for Scala to this day.
1.1.3 Navigating this thesis

The two parts of this thesis, while related by the common topic of collection queries, can be read mostly independently from each other. Summaries of the two parts are given respectively in Sec. 2.1 and Sec. 10.7.1.

Numbering We use numbering and hyperlinks to help navigation; we explain our conventions here to enable exploiting them.

To simplify navigation, we number all sorts of "theorems" (including here definitions, remarks, even examples or descriptions of notation) per chapter, with counters including section numbers. For instance, Definition 12.2.1 is the first such item in Chapter 12, followed by Theorem 12.2.2, and they both appear in Sec. 12.2. And we can be sure that Lemma 12.2.8 comes after both "theorems" because of its number, even if they are of different sorts.

Similarly, we number tables and figures per chapter, with a shared counter. For instance, Fig. 6.1 is the first figure or table in Chapter 6, followed by Table 6.2.
To help reading, at times we will repeat or anticipate statements of "theorems" to refer to them without forcing readers to jump back and forth. We will then reuse the original number. For instance, Sec. 12.2 contains a copy of Slogan 10.3.3 with its original number.

For ease of navigation, all such references are hyperlinked in electronic versions of this thesis, and the table of contents is exposed to PDF readers via PDF bookmarks.
Part I
Optimizing Collection Queries by Reification
Chapter 2
Introduction
In-memory collections of data often need ecient processing For on-disk data ecient processingis already provided by database management systems (DBMS) thanks to their query optimizerswhich support many optimizations specic to the domain of collections Moving in-memory datato DBMSs however typically does not improve performance [Stonebraker et al 2007] and queryoptimizers cannot be reused separately since DBMS are typically monolithic and their optimizersdeeply integrated A few collection-specic optimizations such as shortcut fusion [Gill et al1993] are supported by compilers for purely functional languages such as Haskell However theimplementation techniques for those optimizations do not generalize to many other ones suchas support for indexes In general collection-specic optimizations are not supported by thegeneral-purpose optimizers used by typical (JIT) compilers
Therefore programmers when needing collection-related optimizations perform them manuallyTo allow that they are often forced to perform manual inlining [Peyton Jones and Marlow 2002]But manual inlining modies source code by combining distinct functions together while oftendistinct functions should remain distinct because they deal with dierent concerns or because onefunction need to be reused in a dierent context In either case manual inlining reduces modularitymdash dened here as the ability to abstract behavior in a separate function (possibly part of a dierentmodule) to enable reuse and improve understandability
For these reasons currently developers need to choose between modularity and performanceas also highlighted by Kiczales et al [1997] on a similar example Instead we envision that theyshould rely on an automatic optimizer performing inlining and collection-specic optimizationsThey would then achieve both performance and modularity1
One way to implement such an optimizer would be to extend the compiler of the languagewith a collection-specic optimizer or to add some kind of external preprocessor to the languageHowever such solutions would be rather brittle (for instance they lack composability with otherlanguage extensions) and they would preclude optimization opportunities that arise only at runtime
For this reason, our approach is implemented as an embedded domain-specific language (EDSL), that is, as a regular library. We call this library SQuOpt, the Scala QUery OPTimizer. SQuOpt consists of a Scala EDSL for queries on collections, based on the Scala collections API. An expression in this EDSL produces at run time an expression tree in the host language: a data structure which represents the query to execute, similar to an abstract syntax tree (AST) or a query plan. Thanks to
¹ In the terminology of Kiczales et al. [1997], our goal is to be able to decompose different generalized procedures of a program according to its primary decomposition, while separating the handling of some performance concerns. To this end, we are modularizing these performance concerns into a metaprogramming-based optimization module, which we believe could be called, in that terminology, an aspect.
Chapter 2 Introduction
the extensibility of Scala, expressions in this language look almost identical to expressions with the same meaning in Scala. When executing the query, SQuOpt optimizes and compiles these expression trees for more efficient execution. Doing optimization at run time instead of compile time avoids the need for control-flow analyses to determine which code will be actually executed [Chambers et al. 2010], as we will see later.
We have chosen Scala [Odersky et al. 2011] to implement our library for two reasons: (i) Scala is a good meta-language for EDSLs, because it is syntactically flexible and has a powerful type system, and (ii) Scala has a sophisticated collections library with an attractive syntax (for-comprehensions) to specify queries.
To evaluate SQuOpt, we study queries of the FindBugs tool [Hovemeyer and Pugh 2004]. We rewrote a set of queries to use the Scala collections API and show that modularization incurs significant performance overhead. Subsequently, we consider versions of the same queries using SQuOpt. We demonstrate that the automatic optimization can reconcile modularity and performance in many cases. Adding advanced optimizations such as indexing can even improve the performance of the analyses beyond the original non-modular analyses.
2.1 Contributions and summary

Overall, our main contributions in Part I are the following:
• We illustrate the trade-off between modularity and performance when manipulating collections, caused by the lack of domain-specific optimizations (Sec. 2.2). Conversely, we illustrate how domain-specific optimizations lead to more readable and more modular code (Chapter 3).
• We present the design and implementation of SQuOpt, an embedded DSL for queries on collections in Scala (Chapter 4).
• We evaluate SQuOpt to show that it supports writing queries that are at the same time modular and fast. We do so by re-implementing several code analyses of the FindBugs tool. The resulting code is more modular and/or more efficient, in some cases by orders of magnitude. In these case studies, we measured average speedups of 12x, with a maximum of 12800x (Chapter 6).
2.2 Motivation

In this section we show how the absence of collection-specific optimizations forces programmers to trade modularity against performance, which motivates our design of SQuOpt to resolve this conflict.
As our running example through the chapter, we consider representing and querying a simple in-memory bibliography. A book has, in our schema, a title, a publisher and a list of authors. Each author, in turn, has a first and last name. We represent authors and books as instances of the Scala classes Author and Book, shown in Fig. 2.1. The class declarations list the type of each field. Titles, publishers, and first and last names are all stored in fields of type String. The list of authors is stored in a field of type Seq[Author], that is, a sequence of authors – something that would be more complex to model in a relational database. The code fragment also defines a collection of books named books.
As a common idiom to query such collections, Scala provides for-comprehensions. For instance, the for-comprehension computing records in Fig. 2.2 finds all books published by Pearson Education and yields, for each of those books and for each of its authors, a record containing the book title,
package schema
case class Author(firstName: String, lastName: String)
case class Book(title: String, publisher: String,
                authors: Seq[Author])

val books: Set[Book] = Set(
  new Book("Compilers: Principles, Techniques and Tools",
           "Pearson Education",
           Seq(new Author("Alfred V.", "Aho"),
               new Author("Monica S.", "Lam"),
               new Author("Ravi", "Sethi"),
               new Author("Jeffrey D.", "Ullman"))),
  /* other books... */)

Figure 2.1: Definition of the schema and of some content.
the full name of that author and the number of additional coauthors. The generator book <- books functions like a loop header. The remainder of the for-comprehension is executed once per book in the collection. Consequently, the generator author <- book.authors starts a nested loop. The return value of the for-comprehension is a collection of all yielded records. Note that if a book has multiple authors, this for-comprehension will return multiple records relative to this book, one for each author.
We can further process this collection with another for-comprehension, possibly in a different module. For example, still in Fig. 2.2, the function titleFilter filters book titles containing the word "Principles" and drops from each record the number of additional coauthors.
In Scala, the implementation of for-comprehensions is not fixed. Instead, the compiler desugars a for-comprehension to a series of API calls, and different collection classes can implement this API differently. Later, we will use this flexibility to provide an optimizing implementation of for-comprehensions, but in this section we focus on the behavior of the standard Scala collections, which implement for-comprehensions as loops that create intermediate collections.
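This desugaring can be made concrete in plain Scala, independently of SQuOpt. In the sketch below, the schema class P and the sample data are hypothetical stand-ins for the bibliography; the second query spells out the withFilter/flatMap/map calls that the compiler produces from the first.

```scala
// Plain Scala, no SQuOpt; P and ps are hypothetical stand-ins for the schema.
// The compiler rewrites the for-comprehension into the explicit calls below.
case class P(name: String, tags: Seq[String])
val ps = Set(P("a", Seq("x", "y")), P("b", Seq("z")))

val sugared =
  for {
    p <- ps
    if p.name == "a"
    t <- p.tags
  } yield (p.name, t)

val desugared =
  ps.withFilter(p => p.name == "a")
    .flatMap(p => p.tags.map(t => (p.name, t)))
```

Both expressions evaluate to the same set of pairs; a library that implements flatMap, withFilter and map differently thereby changes what for-comprehensions mean.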
2.2.1 Optimizing by Hand

In the naive implementation in Fig. 2.2, different concerns are separated, hence it is modular. However, it is also inefficient. To execute this code, we first build the original collection and only later perform further processing to build the new result; creating the intermediate collection at the interface between these functions is costly. Moreover, the same book can appear in records more than once if the book has more than one author, but all of these duplicates have the same title. Nevertheless, we test each duplicate title separately for whether it contains the searched keyword. If books have 4 authors on average, this means a slowdown of a factor of 4 for the filtering step.
In general, one can only resolve these inefficiencies by manually optimizing the query; however, we will observe that these manual optimizations produce less modular code.²
To address the first problem above, that is, to avoid creating intermediate collections, we can manually inline titleFilter and records; we obtain two nested for-comprehensions. Furthermore, we can unnest the inner one [Fegaras and Maier 2000].
² The existing Scala collections API supports optimization, for instance through non-strict variants of the query operators (called 'views' in Scala), but they can only be used for a limited set of optimizations, as we discuss in Chapter 8.
case class BookData(title: String, authorName: String, coauthors: Int)

val records =
  for {
    book <- books
    if book.publisher == "Pearson Education"
    author <- book.authors
  } yield new BookData(book.title,
                       author.firstName + " " + author.lastName,
                       book.authors.size - 1)

def titleFilter(records: Set[BookData],
                keyword: String) =
  for {
    record <- records
    if record.title.contains(keyword)
  } yield (record.title, record.authorName)

val res = titleFilter(records, "Principles")

Figure 2.2: Our example query on the schema in Fig. 2.1, and a function which postprocesses its result.
To address the second problem above, that is, to avoid testing the same title multiple times, we hoist the filtering step; that is, we change the order of the processing steps in the query to first look for keyword within book.title and then iterate over the set of authors. This does not change the overall semantics of the query, because the filter only accesses the title, but does not depend on the author. In the end, we obtain the code in Fig. 2.3. The resulting query processes the title of each book only once. Since filtering in Scala is done lazily, the resulting query avoids building an intermediate collection.
This second optimization is only possible after inlining, and thereby reducing the modularity of the code, because it mixes together processing steps from titleFilter and from the definition of records. Therefore, reusing the code creating records would now be harder.
To make titleFilterHandOpt more reusable, we could turn the publisher name into a parameter. However, the new versions of titleFilter cannot be reused as-is if some details of the inlined code change; for instance, we might need to filter publishers differently, or not at all. On the other hand, if we express queries modularly, we might lose some opportunities for optimization. The design of the collections API, both in Scala and in typical languages, forces us to manually optimize our code by repeated inlining and subsequent application of query optimization rules, which leads to a loss of modularity.
def titleFilterHandOpt(books: Set[Book],
                       publisher: String, keyword: String) =
  for {
    book <- books
    if book.publisher == publisher && book.title.contains(keyword)
    author <- book.authors
  } yield (book.title, author.firstName + " " + author.lastName)

val res = titleFilterHandOpt(books,
  "Pearson Education", "Principles")

Figure 2.3: Composition of queries in Fig. 2.2, after inlining, query unnesting and hoisting.
Chapter 3
Automatic optimization with SQuOpt
The goal of SQuOpt is to let programmers write queries modularly and at a high level of abstraction, and deal with optimization by a dedicated domain-specific optimizer. In our concrete example, programmers should be able to write queries similar to the one in Fig. 2.2, but get the efficiency of the one in Fig. 2.3. To allow this, SQuOpt overloads for-comprehensions and other constructs, such as string concatenation with + and field access (book.author). Our overloads of these constructs reify the query as an expression tree. SQuOpt can then optimize this expression tree and execute the resulting optimized query. Programmers explicitly trigger processing by SQuOpt by adapting their queries, as we describe in the next subsection.
3.1 Adapting a Query

To use SQuOpt instead of native Scala queries, we first assume that the query does not use side effects and is thus purely functional. We argue that purely functional queries are more declarative. Side effects are used to improve performance, but SQuOpt makes that unnecessary through automatic optimizations. In fact, the lack of side effects enables more optimizations.
In Fig. 3.1 we show a version of our running example adapted to use SQuOpt. We first discuss changes to records. To enable SQuOpt, a programmer needs to (a) import the SQuOpt library, (b) import some wrapper code specific to the types the collection operates on, in this case Book and Author (more about that later), (c) convert explicitly the native Scala collections involved to collections of our framework, by a call to asSquopt, (d) rename a few operators, for instance == becomes ==# (this is necessary due to some Scala limitations), and (e) add a separate step where the query is evaluated (possibly after optimization). All these changes are lightweight and mostly of a syntactic nature.
For parameterized queries like titleFilter, we need to also adapt type annotations. The ones in titleFilterQuery reveal some details of our implementation: expressions that are reified have type Exp[T] instead of T. As the code shows, resQuery is optimized before compilation. This call will perform the optimizations that we previously did by hand, and will return a query equivalent to that in Fig. 2.3, after verifying their safety conditions. For instance, after inlining, the filter if book.title.contains(keyword) does not reference author, hence it is safe to hoist. Note that checking this safety condition would not be possible without reifying the predicate. For instance, it would not be sufficient to only reify the calls to the collection API, because the predicate
import squopt._
import schema.squopt._

val recordsQuery =
  for {
    book <- books.asSquopt
    if book.publisher ==# "Pearson Education"
    author <- book.authors
  } yield new BookData(book.title,
                       author.firstName + " " + author.lastName,
                       book.authors.size - 1)

val records = recordsQuery.eval

def titleFilterQuery(records: Exp[Set[BookData]],
                     keyword: Exp[String]) =
  for {
    record <- records
    if record.title.contains(keyword)
  } yield (record.title, record.authorName)

val resQuery = titleFilterQuery(recordsQuery, "Principles")
val res = resQuery.optimize.eval

Figure 3.1: SQuOpt version of Fig. 2.2. recordsQuery contains a reification of the query, records its result.
is represented as a boolean function parameter. In general, our automatic optimizer inspects the whole reification of the query implementation to check that optimizations do not introduce changes in the overall result of the query, and are therefore safe.
3.2 Indexing

SQuOpt also supports the transparent usage of indexes. Indexes can further improve the efficiency of queries, sometimes by orders of magnitude. In our running example, the query scans all books to look for the ones having the right publisher. To speed up this query, we can preprocess books to build an index, that is, a dictionary mapping from each publisher to a collection of all the books it published. This index can then be used to answer the original query without scanning all books.
We construct a query representing the desired dictionary and inform the optimizer that it should use this index where appropriate:
val idxByPublisher = books.asSquopt.indexBy(_.publisher)
Optimization.addIndex(idxByPublisher)
The indexBy collection method accepts a function that maps a collection element to a key: coll.indexBy(key) returns a dictionary mapping each key to the collection of all elements of coll having that key. Missing keys are mapped to an empty collection.¹ Optimization.addIndex simply preevaluates the index and updates a dictionary mapping the index to its preevaluated result.
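To make these semantics concrete, the behavior of indexBy can be sketched in plain Scala on top of groupBy (indexBy itself belongs to SQuOpt, not the standard API; the Book2 class and sample data below are hypothetical):

```scala
// A plain-Scala sketch of indexBy's semantics: like groupBy, but missing
// keys map to an empty collection instead of throwing.
def indexBy[A, K](coll: Set[A])(key: A => K): Map[K, Set[A]] =
  coll.groupBy(key).withDefaultValue(Set.empty[A])

case class Book2(title: String, publisher: String)  // hypothetical schema
val sample = Set(Book2("TAPL", "MIT Press"), Book2("SICP", "MIT Press"),
                 Book2("EOPL", "Other"))
val byPublisher = indexBy(sample)(_.publisher)
```

Here byPublisher("MIT Press") yields the two MIT Press books, while a lookup with an unknown publisher yields the empty set rather than failing.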
A call to optimize on a query will then take this index into account, and rewrite the query to perform index lookup instead of scanning, if possible. For instance, the code in Fig. 3.1 would be transparently rewritten by the optimizer to a query similar to the following:
val indexedQuery =
  for {
    book <- idxByPublisher("Pearson Education")
    author <- book.authors
  } yield new BookData(book.title,
                       author.firstName + " " + author.lastName,
                       book.authors.size - 1)
Since dictionaries in Scala are functions, in the above code dictionary lookup on idxByPublisher is represented simply as function application. The above code iterates over books having the desired publisher, instead of scanning the whole library, and performs the remaining computation from the original query. Although the index use in the listing above is notated as idxByPublisher("Pearson Education"), only the cached result of evaluating the index is used when the query is executed, not the reified index definition.
This optimization could also be performed manually, of course, but the queries are on a higher abstraction level and more maintainable if indexing is defined separately and applied automatically. Manual application of indexing is a crosscutting concern, because adding or removing an index affects potentially many queries. SQuOpt does not free the developer from the task of assessing which index will 'pay off' (we have not considered automatic index creation yet), but at least it becomes simple to add or remove an index, since the application of the indexes is modularized in the optimizer.
¹ For readers familiar with the Scala collection API, we remark that the only difference with the standard groupBy method is the handling of missing keys.
Chapter 4
Implementing SQuOpt
After describing how to use SQuOpt, we explain how SQuOpt represents queries internally and optimizes them. Here we give only a brief overview of our implementation technique; it is described in more detail in Sec. 5.1.
4.1 Expression Trees

In order to analyze and optimize collection queries at runtime, SQuOpt reifies their syntactic structure as expression trees. The expression tree reflects the syntax of the query after desugaring, that is, after for-comprehensions have been replaced by API calls. For instance, recordsQuery from Fig. 3.1 points to the following expression tree (with some boilerplate omitted for clarity):
new FlatMap(
  new Filter(
    new Const(books),
    v2 => new Eq(new Book_publisher(v2),
                 new Const("Pearson Education"))),
  v3 => new MapNode(
    new Book_authors(v3),
    v4 => new BookData(
      new Book_title(v3),
      new StringConcat(
        new StringConcat(
          new Author_firstName(v4),
          new Const(" ")),
        new Author_lastName(v4)),
      new Plus(new Size(new Book_authors(v3)),
               new Negate(new Const(1))))))
The structure of the for-comprehension is encoded with the FlatMap, Filter and MapNode instances. These classes correspond to the API methods that for-comprehensions get desugared to; SQuOpt arranges for the implementation of flatMap to construct a FlatMap instance, etc. The instances of the other classes encode the rest of the structure of the collection query, that is, which methods are called on which arguments. On the one hand, SQuOpt defines classes such as Const or Eq that are generic and applicable to all queries. On the other hand, classes such as Book_publisher cannot be predefined, because they are specific to the user-defined types used
in a query. SQuOpt provides a small code generator which creates a case class for each method and field of a user-defined type. Functions in the query are represented by functions that create expression trees; representing functions in this way is frequently called higher-order abstract syntax [Pfenning and Elliott 1988].
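The essence of higher-order abstract syntax can be sketched in a few lines (hypothetical, simplified node classes, not SQuOpt's actual ones): a reified function wraps a host-language function from expression trees to expression trees, so object-language variables reuse host-language binding and need no explicit names.

```scala
// Simplified, hypothetical nodes illustrating HOAS.
sealed trait Exp[T]
case class Const[T](t: T) extends Exp[T]
case class Plus(a: Exp[Int], b: Exp[Int]) extends Exp[Int]
// A reified function is a host-level Scala function on trees.
case class Fun[A, B](f: Exp[A] => Exp[B]) extends Exp[A => B]

// The object-language variable is just the Scala parameter x.
val inc = Fun[Int, Int](x => Plus(x, Const(1)))
// Applying it to an argument tree builds the instantiated body.
val body = inc.f(Const(41))
```

Note that no name for the bound variable ever appears in the tree; substitution is performed by the host language when inc.f is applied.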
We can see that the reification of this code corresponds closely to an abstract syntax tree for the code which is executed; however, many calls to specific methods like map are represented by special nodes like MapNode, rather than as method calls. For the optimizer it becomes easier to match and transform those nodes than with a generic abstract syntax tree.
Nodes for collection operations are carefully defined by hand, to provide them highly generic type signatures and make them reusable for all collection types. In Scala, collection operations are highly polymorphic; for instance, map has a single implementation working on all collection types like List and Set, and we similarly want to represent all usages of map through instances of a single node type, namely MapNode. Having separate nodes ListMapNode, SetMapNode and so on would be inconvenient, for instance when writing the optimizer. However, map on a List[Int] will produce another List, while on a Set it will produce another Set, and so on for each specific collection type (in first approximation); moreover, this is guaranteed statically by the type of map. Yet, thanks to advanced type-system features, map is defined only once, avoiding redundancy, but has a type polymorphic enough to guarantee statically that the correct return value is produced. Since our tree representation is strongly typed, we need to have a similar level of polymorphism in MapNode. We achieved this by extending the techniques described by Odersky and Moors [2009], as detailed in our technical report [Giarrusso et al. 2012].
We get these expression trees by using Scala implicit conversions in a particular style, which we adopted from Rompf and Odersky [2010]. Implicit conversions allow adding, for each method A.foo(B), an overload Exp[A].foo(Exp[B]). Where a value of type Exp[T] is expected, a value of type T can be used, thanks to other implicit conversions which wrap it in a Const node. The initial call of asSquopt triggers the application of the implicit conversions, by converting the collection to the leaf of an expression tree.
It is also possible to call methods that do not return expression trees; however, such method calls would then only be represented by an opaque MethodCall node in the expression tree, which means that the code of the method cannot be considered in optimizations.
Crucially, these expression trees are generated at runtime. For instance, the first Const contains a reference to the actual collection of books to which books refers. If a query uses another query, such as records in Fig. 3.1, then the subquery is effectively inlined. The same holds for method calls inside queries: if these methods return an expression tree (such as the titleFilterQuery method in Fig. 3.1), then these expression trees are inlined into the composite query. Since the reification happens at runtime, it is not necessary to predict the targets of dynamically bound method calls. A new (and possibly different) expression tree is created each time a block of code containing queries is executed.
Hence, we can say that expression trees represent the computation which is going to be executed after inlining; control flow or virtual calls in the original code typically disappear, especially if they manipulate the query as a whole. This is typical of deeply embedded DSLs like ours, where code, instead of performing computations, produces a representation of the computation to perform [Elliott et al. 2003; Chambers et al. 2010].
This inlining can duplicate computations; for instance, in this code
val num: Exp[Int] = 10
val square = num * num
val sum = square + square
evaluating sum will evaluate square twice. Like Elliott et al. [2003], we avoid this using common-subexpression elimination.
4.2 Optimizations

Our optimizer currently supports several algebraic optimizations. Any query, and in fact every reified expression, can be optimized by calling the optimize function on it. The ability to optimize reified expressions that are not queries is useful: for instance, optimizing a function that produces a query is similar to a "prepared statement" in relational databases.
The optimizations we implemented are mostly standard in compilers [Muchnick 1997] or databases:
• Query unnesting merges a nested query into the containing one [Fegaras and Maier 2000; Grust and Scholl 1999], replacing for instance
for {
  val1 <- (for { val2 <- coll } yield f(val2))
} yield g(val1)
with
for {
  val2 <- coll
  val1 = f(val2)
} yield g(val1)
• Bulk operation fusion fuses higher-order operators on collections.
• Filter hoisting tries to apply filters as early as possible; in database query optimization, it is known as selection pushdown. For filter hoisting it is important that the full query is reified, because otherwise the dependencies of the filter condition cannot be determined.
• We reduce during optimization tuple/case class accesses: for instance, (a, b)._1 is simplified to a. This is important because the produced expression does not depend on b; removing this false dependency can allow, for instance, a filter containing this expression to be hoisted to a context where b is not bound.
• Indexing tries to apply one or more of the available indexes to speed up the query.
• Common subexpression elimination (CSE) avoids performing the same computation multiple times; we use techniques similar to Rompf and Odersky [2010].
• Smaller optimizations include constant folding, reassociation of associative operators, and removal of identity maps (coll.map(x => x), typically generated by the translation of for-comprehensions).
Each optimization is applied recursively bottom-up until it does not trigger anymore; different optimizations are composed in a fixed pipeline.
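This bottom-up discipline can be sketched as follows (a hypothetical, minimal expression type; SQuOpt's actual transformer is more general): children are rewritten first, and the rule is re-applied at each node until it no longer matches, here with constant folding as the example rule.

```scala
// Hypothetical minimal node set; constant folding as the example rule.
sealed trait Exp
case class Const(n: Int) extends Exp
case class Plus(a: Exp, b: Exp) extends Exp

def transform(e: Exp)(rule: PartialFunction[Exp, Exp]): Exp = {
  // First rewrite the children (bottom-up)...
  val e1 = e match {
    case Plus(a, b) => Plus(transform(a)(rule), transform(b)(rule))
    case other      => other
  }
  // ...then apply the rule at this node until it no longer triggers.
  if (rule.isDefinedAt(e1)) transform(rule(e1))(rule) else e1
}

val constFold: PartialFunction[Exp, Exp] = {
  case Plus(Const(a), Const(b)) => Const(a + b)
}

val folded = transform(Plus(Plus(Const(1), Const(2)), Const(3)))(constFold)
```

On the sample tree, the inner Plus folds first, which then enables folding the outer one.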
Optimizations are only guaranteed to be semantics-preserving if queries obey the restrictions we mentioned: for instance, queries should not involve side effects, such as assignments or I/O, and all collections used in queries should implement the specifications stated in the collections API. Obviously, the choice of optimizations involves many trade-offs; for that reason, we believe that it is all the more important that the optimizer is not hard-wired into the compiler, but implemented as a library, with potentially many different implementations.
To make changes to the optimizer more practical, we designed our query representation so that optimizations are easy to express; restricting to pure queries also helps. For instance,
filter fusion can be implemented in a few lines of code: just match against expressions of the form coll.filter(pred2).filter(pred1) and rewrite them:¹
val mergeFilters = ExpTransformer {
  case Sym(Filter(Sym(Filter(collection, pred2)), pred1)) =>
    collection.filter(x => pred2(x) && pred1(x))
}
A more complex optimization such as filter hoisting requires only 20 lines of code.
We have implemented a prototype of the optimizer with the mentioned optimizations. Many additional algebraic optimizations can be added in future work, by us or others; a candidate would be loop hoisting, which moves out of loops arbitrary computations not depending on the loop variable (and not just filters). With some changes to the optimizer's architecture, it would also be possible to perform cost-based and dynamic optimizations.
4.3 Query Execution

Calling the eval method on a query will convert it to executable bytecode; this bytecode will be loaded and invoked by using Java reflection. We produce a thunk that, when evaluated, will execute the generated code.
In our prototype, we produce bytecode by converting expression trees to Scala code and invoking on the result the Scala compiler, scalac. Invoking scalac is typically quite slow, and we currently use caching to limit this concern; however, we believe it is merely an engineering problem to produce bytecode directly from expression trees, just as compilers do.
Our expression trees contain native Scala values wrapped in Const nodes, and in many cases one cannot produce Scala program text evaluating to the same value. To allow executing such expression trees, we need to implement cross-stage persistence (CSP): the generated code will be a function accepting the actual values as arguments [Rompf and Odersky 2010]. This allows sharing the compiled code for expressions which differ only in the embedded values.
More in detail, our compilation algorithm is as follows. (a) We implement CSP by replacing embedded Scala values by references to the function arguments; so, for instance, List(1, 2, 3).map(x => x + 1) becomes the function (s1: List[Int], s2: Int) => s1.map(x => x + s2). (b) We look up the produced expression tree, together with the types of the constants we just removed, in a cache mapping to the generated classes. If the lookup fails, we update the cache with the result of the next steps. (c) We apply CSE on the expression. (d) We convert the tree to code, compile it, and load the generated code.
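The payoff of step (a) can be illustrated in plain Scala (a sketch, not the generated code itself; the parameter names s1 and s2 mirror the example above): the compiled artifact is a function of the embedded constants, so one artifact serves all queries that differ only in those constants.

```scala
// Sketch of cross-stage persistence: the query List(1, 2, 3).map(x => x + 1)
// is compiled to a function of its embedded constants (names hypothetical).
val compiled: (List[Int], Int) => List[Int] =
  (s1, s2) => s1.map(x => x + s2)

// One compiled artifact serves all queries differing only in constants:
val r1 = compiled(List(1, 2, 3), 1)
val r2 = compiled(List(10, 20), 5)
```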
Preventing errors in generated code. Compiler errors in generated code are typically a concern with SQuOpt; however, they can only arise due to implementation bugs in SQuOpt (for instance in pretty-printing, which cannot be checked statically), so they do not concern users. Since our query language and tree representation are statically typed, type-incorrect queries will be rejected statically. For instance, consider again idxByPublisher, described previously:
val idxByPublisher = books.asSquopt.indexBy(_.publisher)
Since Book.publisher returns a String, idxByPublisher has type Exp[Map[String, Set[Book]]]. Looking up a key of the wrong type, for instance by writing idxByPublisher(book) where book: Book, will make scalac emit a static type error.
¹ Sym nodes are part of the boilerplate we omitted earlier.
Chapter 5
A deep EDSL for collection queries
In this chapter we discuss collections as a critical case study [Flyvbjerg 2006] for deep DSL embedding.
As discussed in the previous chapter, to support optimizations we require a deep embedding of the collections DSL.
While the basic idea of deep embedding is well known, it is not obvious how to realize deep embedding when considering the following additional goals:
• To support users adopting SQuOpt, a generic SQuOpt query should share the "look and feel" of the ordinary collections API. In particular, query syntax should remain mostly unchanged. In our case, we want to preserve Scala's for-comprehension¹ syntax and its notation for anonymous functions.
• Again to support users adopting SQuOpt, a generic SQuOpt query should not only share the syntax of the ordinary collections API; it should also be well-typed if and only if the corresponding ordinary query is well-typed. This is particularly challenging in the Scala collections library, due to its deep integration with advanced type-system features such as higher-kinded generics and implicit objects [Odersky and Moors 2009]. For instance, calling map on a List will return a List, and calling map on a Set will return a Set. Hence, the object-language representation and the transformations thereof should be as "typed" as possible. This precludes, among others, a first-order representation of object-language variables as strings.
• SQuOpt should be interoperable with ordinary Scala code and Scala collections. For instance, it should be possible to call normal, non-reified functions within a SQuOpt query, or mix native Scala queries and SQuOpt queries.
• The performance of SQuOpt queries should be reasonable even without optimizations. A non-optimized SQuOpt query should not be dramatically slower than a native Scala query. Furthermore, it should be possible to create new queries at run time and execute them without excessive overhead. This goal limits the options of applicable interpretation or compilation techniques.
We think that these challenges characterize deep embedding of queries on collections as a critical case study [Flyvbjerg 2006] for DSL embedding. That is, it is so challenging that embedding
¹ Also known as for expressions [Odersky et al. 2011, Ch. 23].
techniques successfully used in this case are likely to be successful on a broad range of other DSLs. In this chapter, we report from the case study the successes and failures of achieving these goals in SQuOpt.
5.1 Implementation: Expressing the interface in Scala

To optimize a query as described in the previous section, SQuOpt needs to reify, optimize and execute queries. Our implementation assigns responsibility for these steps to three main components: a generic library for reification and execution of general Scala expressions, a more specialized library for reification and execution of query operators, and a dedicated query optimizer. Queries then need to be executed, through either compilation (already discussed in Sec. 4.3) or interpretation (to be discussed in Sec. 5.6). We describe the implementation in more detail in the rest of this section. The full implementation is also available online.²
A core idea of SQuOpt is to reify Scala code as a data structure in memory. A programmer could directly create instances of that data structure, but we also provide a more convenient interface, based on advanced Scala features such as implicit conversions and type inference. That interface allows reifying code automatically, with a minimum of programmer annotations, as shown in the examples in Chapter 3. Since this is a case study on Scala's support for deep embedding of DSLs, we also describe in this section how Scala supports our task. In particular, we report on techniques we used and issues we faced.
5.2 Representing expression trees

In the previous section, we have seen that expressions that would have type T in a native Scala query are reified and have type Exp[T] in SQuOpt. The generic type Exp[T] is the base for our reification of Scala expressions as expression trees, that is, as data structures in memory. We provide a subclass of Exp[T] for every different form of expression we want to reify. For example, in Fig. 2.2 the expression author.firstName + " " + author.lastName must be reified, even though it is not collection-related, for otherwise the optimizer could not see whether author is used. Knowing this is needed, for instance, to remove variables which are bound but not used. Hence, this expression is reified as
StringConcat(StringConcat(AuthorFirstName(author), Const(" ")),
             AuthorLastName(author))
This example uses the constructors of the following subclasses of Exp[T] to create the expression tree:
case class Const[T](t: T) extends Exp[T]
case class StringConcat(str1: Exp[String], str2: Exp[String]) extends Exp[String]
case class AuthorFirstName(t: Exp[Author]) extends Exp[String]
case class AuthorLastName(t: Exp[Author]) extends Exp[String]
Expression nodes additionally implement support code for tree traversals, needed by the optimizations, which we omit here.
This representation of expression trees is well-suited for representing the structure of expressions in memory, and also for pattern matching (which is automatically supported for case classes in Scala), but it is inconvenient for query writers. In fact, in Fig. 3.1 we have seen that SQuOpt provides a much more convenient front-end: the programmer writes almost the usual code for type T, and SQuOpt converts it automatically to Exp[T].
[2] http://www.informatik.uni-marburg.de/~pgiarrusso/SQuOpt
Chapter 5 A deep EDSL for collection queries 25
5.3 Lifting first-order expressions

We call the process of converting from T to Exp[T] lifting. Here we describe how we lift first-order expressions, that is, Scala expressions that do not contain anonymous function definitions.
To this end, consider again the fragment
author.firstName + " " + author.lastName
now in the context of the SQuOpt-enabled query in Fig. 3.1. It looks like a normal Scala expression, even syntactically unchanged from Fig. 2.2. However, evaluating that code in the context of Fig. 3.1 does not concatenate any strings, but creates an expression tree instead. Although the code looks like the same expression, it has a different type: Exp[String] instead of String. This difference in the type is caused by the context: the variable author is now bound in a SQuOpt-enabled query and therefore has type Exp[Author] instead of Author. We can still access the firstName field of author, because expression trees of type Exp[T] provide the same interface as values of type T, except that all operations return expression trees instead of values.
To understand how an expression tree of type Exp[T] can have the same interface as a value of type T, we consider two expression trees str1 and str2 of type Exp[String]. The implementation of lifting differs depending on the kind of expression we want to lift.
Method calls and operators. In our example, the operator + should be available on Exp[String] but not on Exp[Boolean], because + is available on String but not on Boolean. Furthermore, we want str1 + str2 to have type Exp[String] and to evaluate not to a string concatenation, but to a call of StringConcat, that is, to an expression tree which represents str1 + str2. This is a somewhat unusual requirement, because usually the interface of a generic type does not depend on its type parameters.
To provide such operators and to encode expression trees, we use implicit conversions in a similar style as Rompf and Odersky [2010]. Scala allows making expressions of a type T implicitly convertible to a different type U. To this end, one must declare an implicit conversion function having type T => U. Calls to such functions will be inserted by the compiler when required to fix a type mismatch between an expression of type T and a context expecting a value of type U. In addition, a method call e.m(args) can trigger the conversion of e to a type where the method m is present.[3] Similarly, an operator usage as in str1 + str2 can also trigger an implicit conversion: an expression using an operator, like str1 + str2, is desugared to the method call str1.+(str2), which can trigger an implicit conversion from str1 to a type providing the + method. Therefore, from now on, we do not distinguish between operators and methods.
To provide the method + on Exp[String], we define an implicit conversion from Exp[String] to a new type providing a + method, which creates the appropriate expression node:
implicit def expToStringOps(t: Exp[String]) = new StringOps(t)
class StringOps(t: Exp[String]) {
  def +(that: Exp[String]): Exp[String] = StringConcat(t, that)
}
This is an example of the well-known Scala enrich-my-library pattern[4] [Odersky, 2006]. With these declarations in scope, the Scala compiler rewrites str1 + str2 to expToStringOps(str1).+(str2),
which evaluates to StringConcat(str1, str2), as desired. The implicit conversion function expToStringOps is not applicable to Exp[Boolean], because it explicitly specifies the receiver of the +-call to have type Exp[String]. In other words, expressions like str1 + str2 are now lifted
[3] For the exact rules, see Odersky et al. [2011, Ch. 21] and Odersky [2011].
[4] Also known as the pimp-my-library pattern.
to the level of expression trees in a type-safe way. For brevity, we refer to the defined operator as Exp[String].+.
Literal values. However, a string concatenation might also include constants, as in str1 + "foo" or "bar" + str1. To lift str1 + "foo", we introduce a lifting for constants, which wraps them in a Const node:
implicit def pure[T](t: T): Exp[T] = Const(t)
The compiler will now rewrite str1 + "foo" to expToStringOps(str1) + pure("foo"), and str1 + "foo" + str2 to expToStringOps(str1) + pure("foo") + str2. Different implicit conversions cooperate successfully to lift the expression.
Analogously, it would be convenient if the similar expression "bar" + str1 were rewritten to expToStringOps(pure("bar")) + str1, but this is not the case, because implicit coercions are not chained automatically in Scala. Instead, we have to manually chain existing implicit conversions into a new one:
implicit def toStringOps(t: String) = expToStringOps(pure(t))
so that "bar" + str1 is rewritten to toStringOps("bar") + str1.

User-defined methods. Calls of user-defined methods like author.firstName are lifted the same way as calls to built-in methods such as the string concatenation shown earlier. For the running example, the following definitions are necessary to lift the methods from Author to Exp[Author]:
package schema.squopt
implicit def expToAuthorOps(t: Exp[Author]) = new AuthorOps(t)
implicit def toAuthorOps(t: Author) = expToAuthorOps(pure(t))

class AuthorOps(t: Exp[Author]) {
  def firstName: Exp[String] = AuthorFirstName(t)
  def lastName: Exp[String] = AuthorLastName(t)
}
Implicit conversions for user-defined types cooperate with the other ones; for instance, author.firstName + " " + author.lastName is rewritten to
(expToStringOps(expToAuthorOps(author).firstName) + pure(" ")) +
  expToAuthorOps(author).lastName
Author is not part of SQuOpt or the standard Scala library, but an application-specific class; hence SQuOpt cannot pre-define the necessary lifting code. Instead, the application programmer needs to provide this code to connect SQuOpt to his application. To support the application programmer with this tedious task, we provide a code generator which discovers the interface of a class through reflection on its compiled version and generates the boilerplate code, such as the one above for Author. It also generates the application-specific expression tree types, such as AuthorFirstName, shown in Sec. 5.2. In general, query writers need to generate and import the boilerplate lifting code for all application-specific types they want to use in a SQuOpt query.
If desired, we can exclude some methods to restrict the language supported in our deeply embedded programs. For instance, SQuOpt requires the user to write side-effect-free queries; hence we do not lift methods which perform side effects.
Using similar techniques, we can also lift existing functions and implicit conversions.

Tuples and other generic constructors. The techniques presented above for the lifting of method calls rely on overloading the name of the method with a signature that involves Exp. Implicit
resolution (for method calls) will then choose our lifted version of the function or method to satisfy the typing requirements of the context or arguments of the call. Unfortunately, this technique does not work for tuple constructors, which in Scala are not resolved like ordinary calls. Instead, support for tuple types is hard-wired into the language, and tuples are always created by the predefined tuple constructors.
For example, the expression (str1, str2) will always call Scala's built-in Tuple2 constructor and correspondingly have type (Exp[String], Exp[String]). We would prefer that it call a lifting function and produce an expression tree of type Exp[(String, String)] instead.
Even though we cannot intercept the call to Tuple2, we can add an implicit conversion to be called after the tuple is constructed:
implicit def tuple2ToTuple2Exp[A1, A2](tuple: (Exp[A1], Exp[A2])): LiftTuple2[A1, A2] =
  LiftTuple2[A1, A2](tuple._1, tuple._2)
case class LiftTuple2[A1, A2](t1: Exp[A1], t2: Exp[A2]) extends Exp[(A1, A2)]
We generate such conversions for different arities with a code generator. These conversions will be used only when the context requires an expression tree. Note that this technique is only applicable because tuples are generic and support arbitrary components, including expression trees.
In fact, we use the same technique also for other generic constructors, to avoid problems associated with shadowing of constructor functions. For example, an implicit conversion is used to lift Seq[Exp[T]] to Exp[Seq[T]]: code like Seq(str1, str2) first constructs a sequence of expression trees and then wraps the result with an expression node that describes a sequence.
Subtyping. So far we have seen that for each first-order method m operating on instances of T, we can create a corresponding method which operates on Exp[T]. If the method accepts parameters having types A1, ..., An and has return type R, the corresponding lifted method will accept parameters having types Exp[A1], ..., Exp[An] and have return type Exp[R]. However, Exp[T] also needs to support all methods that T inherits from its super-type S. To ensure this, we declare the type constructor Exp to be covariant in its type parameter, so that Exp[T] correctly inherits the liftings from Exp[S]. This works even with the enrich-my-library pattern, because implicit resolution respects subtyping in an appropriate way.
Limitations of lifting. Lifting methods of Any or AnyRef (Scala types at the root of the inheritance hierarchy) is not possible with this technique: Exp[T] inherits such methods and makes them directly available, hence the compiler will not insert an implicit conversion. Therefore, it is not possible to lift expressions such as x == y; rather, we have to rely on developer discipline to use ==# and !=# instead of == and !=.
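Since == itself cannot be lifted, an alternative lifted equality operator can be provided with the same enrich-my-library technique. The following minimal, self-contained sketch (our own simplified definitions, not SQuOpt's actual ones, with ==# as an illustrative operator name) shows the idea:

```scala
object LiftedEq {
  trait Exp[+T]
  case class Const[T](t: T) extends Exp[T]
  case class Eq[T](lhs: Exp[T], rhs: Exp[T]) extends Exp[Boolean]

  // Enrichment providing a lifted equality operator on expression trees:
  // instead of comparing values now, it builds an Eq node that the
  // optimizer and interpreter can process later.
  implicit class EqOps[T](lhs: Exp[T]) {
    def ==#(rhs: Exp[T]): Exp[Boolean] = Eq(lhs, rhs)
  }
}
```

Unlike ==, the operator ==# is not inherited from Any, so the compiler is free to resolve it through the enrichment.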
An expression like "foo" + "bar" + str1 is converted to
toStringOps("foo" + "bar") + str1
Hence, part of the expression is evaluated before being reified. This is harmless here, since we want "foo" + "bar" to be evaluated at compile time, that is, constant-folded; but in other cases it is preferable to prevent the constant folding. We will see later examples where queries on collections are evaluated before reification, defeating the purpose of our framework, and how we work around those cases.
5.4 Lifting higher-order expressions

We have shown how to lift first-order expressions; however, the interface of collections also uses higher-order methods, that is, methods that accept functions as parameters, and we need to lift them as well to reify the complete collection EDSL. For instance, the map method applies a function to
each element of a collection. In this section, we describe how we reify such expressions of function type.
Higher-order abstract syntax. To represent functions, we have to represent the bound variables in the function bodies. For example, a reification of str => str + "!" needs to reify the variable str of type String in the body of the anonymous function. This reification should retain the information where str is bound. We achieve this by representing bound variables using higher-order abstract syntax (HOAS) [Pfenning and Elliott, 1988], that is, we represent them by variables bound at the meta level. To continue the example, the above function is reified as (str: Exp[String]) => str + "!". Note how the type of str in the body of this version is Exp[String], because str is a reified variable now. Correspondingly, the expression str + "!" is lifted as described in the previous section.
With all operations in the function body automatically lifted, the only remaining syntactic difference between normal and lifted functions is the type annotation for the function's parameter. Fortunately, Scala's local type inference can usually deduce the argument type from the context, for example from the signature of the map operation being called. Type inference plays a dual role here: first, it allows the query writer to leave out the annotation; second, it triggers lifting in the function body by requesting a lifted function instead of a normal function. This is how, in Fig. 3.1, a single call to asSquopt triggers lifting of the overall query.
Note that reified functions have type Exp[A] => Exp[B] instead of the more regular Exp[A => B]. We chose the former over the latter to support Scala's syntax for anonymous functions and for-comprehensions, which is hard-coded to produce or consume instances of the predefined A => B type. We have to reflect this irregularity in the lifting of methods and functions by treating the types of higher-order arguments accordingly.
User-defined methods, revised. We can now extend the lifting of signatures for methods or functions from the previous section to the general case, that is, the case of higher-order functions. We lift a method or function with signature
def m[A1, ..., An](a1: T1, ..., an: Tn): R
to a method or function with the following signature:
def m[A1, ..., An](a1: Lift⟦T1⟧, ..., an: Lift⟦Tn⟧): Lift⟦R⟧

As before, the definition of the lifted method or function will return an expression node representing the call. If the original was a function, the lifted version is also defined as a function. If the original was a method on type T, then the lifted version is enriched onto T.
The type transformation Lift⟦·⟧ converts the argument and return types of the method or function to be lifted. For most types, Lift⟦·⟧ just wraps the type in the Exp type constructor, but function types are treated specially: Lift⟦·⟧ recursively descends into function types to convert their arguments separately. Overall, Lift⟦·⟧ behaves as follows:
Lift⟦(A1, ..., An) => R⟧ = (Lift⟦A1⟧, ..., Lift⟦An⟧) => Lift⟦R⟧
Lift⟦A⟧ = Exp[A]   (for non-function types A)
We can use this extended definition of method lifting to implement the lifting of map for lists, that is, a method with signature Exp[List[T]].map(Exp[T] => Exp[U]):
implicit def expToListOps[T](coll: Exp[List[T]]) = new ListOps(coll)
implicit def toListOps[T](coll: List[T]) = expToListOps(pure(coll))

class ListOps[T](coll: Exp[List[T]]) {
  def map[U](f: Exp[T] => Exp[U]) = ListMapNode(coll, Fun(f))
}
val records = books.withFilter(book =>
    book.publisher == "Pearson Education")
  .flatMap(book => book.authors.map(author =>
    BookData(book.title,
      author.firstName + " " + author.lastName,
      book.authors.size - 1)))
Figure 5.1: Desugaring of the code in Fig. 2.2.
case class ListMapNode[T, U](coll: Exp[List[T]], mapping: Exp[T => U]) extends Exp[List[U]]
Note how map's parameter f has type Exp[T] => Exp[U], as necessary to enable Scala's syntax for anonymous functions and automatic lifting of the function body. This implementation would work for queries on lists, but it does not support other collection types or queries where the collection type changes. We show in Sec. 5.5 how SQuOpt integrates with such advanced features of the Scala collection EDSL.
5.5 Lifting collections

In this section, we first explain how for-comprehensions are desugared to calls to library functions, allowing an external library to give them a different meaning. Afterwards, we summarize the needed background on the subset of the Scala collection EDSL that we reify. Then we present how we perform this reification. Finally, we present the reification of the running example (Fig. 3.1).
For-comprehensions. As we have seen, an idiomatic encoding of queries on collections in Scala is for-comprehensions. Although Scala is an impure functional language and supports side-effectful for-comprehensions, only pure queries are supported in our framework, because this enables or simplifies many domain-specific analyses. Hence, we restrict our attention to pure queries.
The Scala compiler desugars for-comprehensions into calls to three collection methods, map, flatMap and withFilter, which we explain shortly; the query in Fig. 2.2 is desugared to the code in Fig. 5.1.
The compiler performs type inference and type checking on a for-comprehension only after desugaring it; this affords some freedom for the types of the map, flatMap and withFilter methods.
The Scala collection EDSL. A Scala collection containing elements of type T implements the trait Traversable[T]. On an expression coll of type Traversable[T], one can invoke methods declared (in first approximation) as follows:
def map[U](f: T => U): Traversable[U]
def flatMap[U](f: T => Traversable[U]): Traversable[U]
def withFilter(p: T => Boolean): Traversable[T]
For a Scala collection coll, the expression coll.map(f) applies f to each element of coll, collects the results in a new collection and returns it; coll.flatMap(f) applies f to each element of coll, concatenates the results in a new collection and returns it; coll.withFilter(p) produces a collection containing the elements of coll which satisfy the predicate p.
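These semantics, and the desugaring of for-comprehensions into exactly these three methods, can be checked directly on plain Scala collections (the example data is ours):

```scala
object ComprehensionDemo {
  val words = List("ab", "c", "de")

  // map, flatMap and withFilter called explicitly.
  val explicit =
    words.withFilter(w => w.length == 2)
         .flatMap(w => w.toList.map(c => c.toUpper))

  // The equivalent for-comprehension; the compiler desugars it
  // into the calls above.
  val sugared = for {
    w <- words
    if w.length == 2
    c <- w.toList
  } yield c.toUpper
}
```

Both values are List('A', 'B', 'D', 'E'): the filter keeps "ab" and "de", and flatMap concatenates the per-word character lists.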
However, Scala supports many different collection types, and this complicates the actual types of these methods. Each collection can further implement subtraits like Seq[T] <: Traversable[T] (for sequences), Set[T] <: Traversable[T] (for sets) and Map[K, V] <: Traversable[(K, V)] (for dictionaries); for each such trait, different implementations are provided.
One consequence of this syntactic desugaring is that a single for-comprehension can operate over different collection types. The type of the result depends essentially on the type of the root collection, that is, books in the example above. The example above can hence be altered to produce a sequence rather than a set by simply converting the root collection to another type:
val recordsSeq = for {
  book <- books.toSeq
  if book.publisher == "Pearson Education"
  author <- book.authors
} yield BookData(book.title,
  author.firstName + " " + author.lastName,
  book.authors.size - 1)
Precise static typing. The Scala collections EDSL achieves precise static typing while avoiding code duplication [Odersky and Moors, 2009]. Precise static typing is necessary because the return type of a query operation can depend on subtle details of the base collection and query arguments. To return the most specific static type, the Scala collection DSL defines a type-level relation between the source collection type, the element type of the transformed collection and the type of the resulting collection. The relation is encoded through the concept pattern [Oliveira et al., 2010], that is, through a type-class-style trait CanBuildFrom[From, Elem, To], and elements of the relation are expressed as implicit instances.
For example, a finite map can be treated as a set of pairs, so that mapping a function from pairs to pairs over it produces another finite map. This behavior is encoded in an instance of type CanBuildFrom[Map[K, V], (K1, V1), Map[K1, V1]]. Here, Map[K, V] is the type of the base collection, (K1, V1) is the return type of the function, and Map[K1, V1] is the return type of the map operation.
It is also possible to map some other function over the finite map, but the result will be a general collection instead of a finite map. This behavior is described by an instance of type CanBuildFrom[Traversable[T], U, Traversable[U]]. Note that this instance is also applicable to finite maps, because Map is a subclass of Traversable. Together, these two instances describe how to compute the return type of mapping over a finite map.
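This typing behavior is directly observable on plain Scala collections (the example data is ours; in recent Scala versions CanBuildFrom has been replaced by a different mechanism, but the observable behavior is the same):

```scala
object MapTypingDemo {
  val ages = Map("ann" -> 30, "bob" -> 40)

  // Mapping pairs to pairs: the static and dynamic result is again a Map.
  val swapped = ages.map { case (k, v) => (v, k) }

  // Mapping pairs to non-pairs: the result is a general collection,
  // not a Map.
  val names = ages.map { case (k, _) => k }
}
```

Here swapped is Map(30 -> "ann", 40 -> "bob"), while names is a plain collection of the keys.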
Code reuse. Even though these two use cases for mapping over a finite map have different return types, they are implemented as a single method that uses its implicit CanBuildFrom parameter to compute both the static type and the dynamic representation of its result. So the Scala collections EDSL provides precise typing without code duplication. In our deep embedding, we want to preserve this property.
CanBuildFrom is used in the implementation of map, flatMap and withFilter. To further increase reuse, the implementations are provided in a helper trait TraversableLike[T, Repr] with the following signatures:
def map[U, That](f: T => U)(implicit cbf: CanBuildFrom[Repr, U, That]): That
def flatMap[U, That](f: T => Traversable[U])(implicit cbf: CanBuildFrom[Repr, U, That]): That
def withFilter(p: T => Boolean): Repr
The Repr type parameter represents the specific type of the receiver of the method call.

The lifting. The basic idea is to use the enrich-my-library pattern to lift collection methods from TraversableLike[T, Repr] to TraversableLikeOps[T, Repr]:
implicit def expToTraversableLikeOps[T, Repr](v: Exp[TraversableLike[T, Repr]]) =
  new TraversableLikeOps[T, Repr](v)
class TraversableLikeOps[T, Repr <: Traversable[T] with TraversableLike[T, Repr]](val t: Exp[Repr]) {

  def withFilter(f: Exp[T] => Exp[Boolean]): Exp[Repr] =
    Filter(this.t, FuncExp(f))

  def map[U, That <: TraversableLike[U, That]](f: Exp[T] => Exp[U])
      (implicit c: CanBuildFrom[Repr, U, That]): Exp[That] =
    MapNode(this.t, FuncExp(f))

  // other methods; definitions of MapNode and Filter omitted
}
Figure 5.2: Lifting TraversableLike.
Subclasses of TraversableLike[T, Repr] also subclass both Traversable[T] and Repr; to take advantage of this during interpretation and optimization, we need to restrict the type of expToTraversableLikeOps, getting the following conversion:[5]

implicit def expToTraversableLikeOps[T, Repr <: Traversable[T] with TraversableLike[T, Repr]](
    v: Exp[Repr]) =
  new TraversableLikeOps[T, Repr](v)
The query operations are defined in class TraversableLikeOps[T, Repr]; a few examples are shown in Fig. 5.2.[6]
Note how the lifted query operations use CanBuildFrom to compute the same static return type as the corresponding non-lifted query operation would compute. This reuse of type-level machinery allows SQuOpt to provide the same interface as the Scala collections EDSL.
Code reuse, revisited. We already saw how we could reify List[T].map through a specific expression node ListMapNode. However, this approach would require generating many variants for different collections with slightly different types; writing an optimizer able to handle all such variations would be unfeasible because of the amount of code duplication required. Instead, by reusing Scala's type-level machinery, we obtain a reification which is statically typed and at the same time avoids code duplication in our lifting, in our optimizer, and in general in all possible consumers of our reification, making them feasible to write.
5.6 Interpretation

After optimization, SQuOpt needs to interpret the optimized expression trees to perform the query. Therefore, the trait Exp[T] declares a method def interpret(): T, and each expression node overrides it appropriately to implement a mostly standard typed tagless [Carette et al., 2009] environment-based interpreter. The interpreter computes a value of type T from an expression tree of type Exp[T]. This design allows query writers to extend the interpreter to handle application-specific operations; in fact, the lifting generator described in Sec. 5.4 automatically generates appropriate definitions of interpret for the lifted operations.
[5] Due to type inference bugs, the actual implicit conversion needs to be slightly more complex, to mention T directly in the argument type. We reported the bug at https://issues.scala-lang.org/browse/SI-5298.
[6] Similar liftings are introduced for traits similar to TraversableLike, like SeqLike, SetLike, MapLike and so on.
For example, the interpretation of string concatenation is simply string concatenation, as shown in the following fragment of the interpreter. Note that type safety of the interpreter is checked automatically by the Scala compiler when it compiles such fragments:
case class StringConcat(str1: Exp[String], str2: Exp[String]) extends Exp[String] {
  def interpret() = str1.interpret() + str2.interpret()
}
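This fragment can be reproduced in isolation with a minimal Exp trait; the following is a simplified sketch of the interpreter design (our own reduced version, omitting variables and environments):

```scala
object InterpDemo {
  // Each node knows how to compute its own value.
  trait Exp[T] { def interpret(): T }

  case class Const[T](t: T) extends Exp[T] {
    def interpret() = t
  }

  case class StringConcat(str1: Exp[String], str2: Exp[String]) extends Exp[String] {
    // Interpretation of a concatenation node is the concatenation
    // of the interpretations of its children.
    def interpret() = str1.interpret() + str2.interpret()
  }
}
```

Note that the Scala compiler checks that each interpret body produces the type promised by Exp[T].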
The subset of Scala we reify roughly corresponds to a typed lambda calculus with subtyping and type constructors. It does not include constructs for looping or recursion, so it should be strongly normalizing, as long as application programmers do not add expression nodes with non-terminating interpretations. However, query writers can use the host language to create a reified expression of infinite size. This should not be an issue if SQuOpt is used as a drop-in replacement for the Scala collection EDSL.
During optimization, nodes of the expression tree might get duplicated; the interpreter could in principle observe this sharing and treat the expression tree as a DAG, to avoid recomputing results. Currently, we do not exploit this during interpretation, unlike during compilation.
5.7 Optimization

Our optimizer is structured as a pipeline of different transformations on a single intermediate representation, constituted by our expression trees. Each phase of the pipeline, and the pipeline as a whole, produces a new expression having the same type as the original one. Most of our transformations express simple rewrite rules, with or without side conditions, which are applied to expression trees from the bottom up and are implemented using Scala's support for pattern matching [Emir et al., 2007b].
Some optimizations, like filter hoisting (which we applied manually to produce the code in Fig. 2.3), are essentially domain-specific and can improve the complexity of a query. For such optimizations to trigger, however, one often needs to perform inlining-like transformations and to simplify the result. Inlining-related transformations can, for instance, produce code like (x, y)._1, which we simplify to x, reducing abstraction overhead but also (more importantly) making it syntactically clear that the result does not depend on y, and hence might be computed before y is even bound. This simplification extends to user-defined product types: with the definitions in Fig. 2.1, code like BookData(book.title, ...).title is simplified to book.title.
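Such simplifications are implemented as bottom-up rewrites via pattern matching. The following self-contained sketch (with a minimal AST of our own, not SQuOpt's actual one) implements the tuple-projection rule, rewriting (a, b)._1 to a:

```scala
object SimplifyDemo {
  sealed trait Exp
  case class Var(name: String) extends Exp
  case class Pair(a: Exp, b: Exp) extends Exp
  case class Proj1(p: Exp) extends Exp

  // One rewrite rule, expressed with Scala pattern matching:
  // Proj1(Pair(a, b)) --> a; everything else is left unchanged.
  def rewrite(e: Exp): Exp = e match {
    case Proj1(Pair(a, _)) => a
    case other             => other
  }

  // Apply the rule bottom-up: simplify the children first,
  // then try the rule on the resulting node.
  def simplify(e: Exp): Exp = rewrite(e match {
    case Pair(a, b) => Pair(simplify(a), simplify(b))
    case Proj1(p)   => Proj1(simplify(p))
    case v: Var     => v
  })
}
```

For instance, simplify(Proj1(Pair(Var("x"), Var("y")))) yields Var("x"), making syntactically evident that y is unused.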
We have thus implemented optimizations of three classes:

- general-purpose simplifications, like inlining, compile-time beta-reduction, constant folding, reassociation on primitive types and other simplifications;[7]

- domain-specific simplifications, whose main goal is still to enable more important optimizations;

- domain-specific optimizations which can change the complexity class of a query, such as filter hoisting, hash-join transformation or indexing.
Among the domain-specific simplifications, we implement a few described in the context of the monoid comprehension calculus [Grust and Scholl, 1996b; Grust, 1999], such as query unnesting and fusion of bulk operators. Query unnesting allows unnesting a for-comprehension nested inside another one, producing a single for-comprehension. Furthermore, we can fuse different collection
[7] Beta-reduction and simplification are run in a fixpoint loop [Peyton Jones and Marlow, 2002]. Termination is guaranteed because our language does not admit general recursion.
operations together: collection operators like map, flatMap and withFilter can be expressed as folds producing new collections, and these folds can be combined. Scala for-comprehensions are, however, more general than monoid comprehensions;[8] hence, to ensure the safety of some optimizations, we need some additional side conditions.[9]
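The effect of query unnesting can be checked on plain lists, where the side conditions mentioned above are satisfied (no duplicate elimination is involved; the example data is ours): a comprehension iterating over the result of an inner comprehension equals a single flat comprehension.

```scala
object UnnestingDemo {
  val xs = List(1, 2, 3)

  // Nested query: the generator iterates over an inner comprehension.
  val nested = for {
    y <- (for { x <- xs; if x % 2 == 1 } yield x * 10)
  } yield y + 1

  // Unnested equivalent: a single flat comprehension.
  val unnested = for {
    x <- xs
    if x % 2 == 1
  } yield x * 10 + 1
}
```

Both evaluate to List(11, 31); the unnested form avoids materializing the intermediate list, which fusion then exploits.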
Manipulating functions. To be able to inspect a HOAS function body funBody: Exp[S] => Exp[T], like str => str + "!", we convert it to first-order abstract syntax (FOAS), that is, to an expression tree of type Exp[T]. To this end, we introduce a representation of variables and a generator of fresh ones; since variable names are auto-generated, they are internally represented simply as integers instead of strings, for efficiency.
To convert funBody from HOAS to FOAS, we apply it to a fresh variable v of type TypedVar[S], obtaining a first-order representation of the function body having type Exp[T] and containing occurrences of v.
This transformation is hidden in the constructor Fun, which converts Exp[S] => Exp[T], a representation of an expression with one free variable, to Exp[S => T], a representation of a function:
case class App[S, T](f: Exp[S => T], t: Exp[S]) extends Exp[T]

def Fun[S, T](funBody: Exp[S] => Exp[T]): Exp[S => T] = {
  val v = Fun.gensym[S]()
  FOASFun(funBody(v), v)
}

case class FOASFun[S, T](foasBody: Exp[T], v: TypedVar[S]) extends Exp[S => T]

implicit def app[S, T](f: Exp[S => T]): Exp[S] => Exp[T] =
  arg => App(f, arg)
Conversely, function applications are represented using the constructor App; an implicit conversion allows App to be inserted implicitly. Whenever f can be applied to arg and f is an expression tree, the compiler will convert f(arg) to app(f)(arg), that is, App(f, arg).
In our example, Fun(str => str + "!") produces
FOASFun(StringConcat(TypedVar[String](1), Const("!")), TypedVar[String](1))
Since we auto-generate variable names, it is easiest to represent variable occurrences using the Barendregt convention, where bound variables must always be globally unique; we must be careful to perform renaming after beta-reduction to restore this invariant [Pierce, 2002, Ch. 6].
We can now easily implement substitution and, through that, beta-reduction; as shown before, this enables other optimizations to trigger more easily and speeds up queries.
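The following self-contained sketch (an untyped, simplified AST of our own, not SQuOpt's actual one) shows substitution and beta-reduction under the Barendregt convention:

```scala
object BetaDemo {
  sealed trait Exp
  case class TypedVar(id: Int) extends Exp
  case class Const(v: Any) extends Exp
  case class Concat(a: Exp, b: Exp) extends Exp
  case class FOASFun(body: Exp, v: TypedVar) extends Exp
  case class App(f: Exp, arg: Exp) extends Exp

  // Substitute `replacement` for variable `v` in `e`. Under the
  // Barendregt convention, bound variables are globally unique, so no
  // capture-avoidance check is needed when descending under a binder.
  def subst(e: Exp, v: TypedVar, replacement: Exp): Exp = e match {
    case `v`              => replacement
    case Concat(a, b)     => Concat(subst(a, v, replacement), subst(b, v, replacement))
    case FOASFun(body, p) => FOASFun(subst(body, v, replacement), p)
    case App(f, a)        => App(subst(f, v, replacement), subst(a, v, replacement))
    case other            => other
  }

  // Beta-reduction: App(FOASFun(body, v), arg) --> body[v := arg].
  // (A full implementation would then rename bound variables in the
  // result to restore global uniqueness.)
  def beta(e: Exp): Exp = e match {
    case App(FOASFun(body, v), arg) => subst(body, v, arg)
    case other                      => other
  }
}
```

For the running example, beta-reducing the application of FOASFun(Concat(TypedVar(1), Const("!")), TypedVar(1)) to a constant substitutes the constant for TypedVar(1) in the body.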
[8] For instance, a for-comprehension producing a list cannot iterate over a set.
[9] For instance, consider a for-comprehension producing a set and nested inside another producing a list. This comprehension does not correspond to a valid monoid comprehension (see previous footnote), and query unnesting does not apply here: if we unnested the inner comprehension into the outer one, we would not perform duplicate elimination on the inner comprehension, affecting the overall result.
Chapter 6
Evaluating SQuOpt
The key goals of SQuOpt are to reconcile modularity and efficiency. To evaluate this claim, we perform a rigorous performance evaluation of queries with and without SQuOpt. We also analyze the modularization potential of these queries and evaluate how modularization affects performance (with and without SQuOpt).
We show that modularization introduces a significant slowdown. The overhead of using SQuOpt is usually moderate, and optimizations can compensate for this overhead, remove the modularization slowdown and improve the performance of some queries by orders of magnitude, especially when indexes are used.
6.1 Study Setup

Throughout this chapter, we have already shown several compact queries for which our optimizations increase performance significantly compared to a naive execution. Since some optimizations change the complexity class of the query (e.g., by using an index), the speedups grow with the size of the data. However, to get a more realistic evaluation of SQuOpt, we decided to perform an experiment with existing real-world queries.
As we are interested in both performance and modularization, we have a specification and three different implementations of each query that we need to compare:
(0) Query specification: We selected a set of existing real-world queries, specified and implemented independently from our work and prior to it. We used only the specification of these queries.
(1) Modularized Scala implementation: We reimplemented each query as an expression on Scala collections; this is our baseline implementation. For modularity, we separated reusable domain abstractions into subqueries. We confirmed the abstractions with a domain expert and will later illustrate them, to emphasize their general nature.
(2) Hand-optimized Scala implementation: Next, we asked a domain expert to perform manual optimizations on the modularized queries. The expert was asked to perform optimizations such as inlining and filter hoisting wherever he could find performance improvements.
(3) SQuOpt implementation: Finally, we rewrote the modularized Scala queries from (1) as SQuOpt queries. The rewrites are of a purely syntactic nature, to use our library (as described in Sec. 3.1), and preserve the modularity of the queries.
Since SQuOpt supports executing queries with and without optimizations and indexes, we actually measured three different execution modes of the SQuOpt implementation:
(3−) SQuOpt without optimizer: First, we execute the SQuOpt queries without performing optimization first, which should show the SQuOpt overhead compared to the modular Scala implementation (1). However, common-subexpression elimination is still used here, since it is part of the compilation pipeline. This is appropriate to counter the effects of excessive inlining due to using a deep embedding, as explained in Sec. 4.1.
(3o) SQuOpt with optimizer: Next, we execute SQuOpt queries after optimization.
(3x) SQuOpt with optimizer and indexes: Finally, we execute the queries after providing a set of indexes that the optimizer can consider.
In all cases, we measure query execution time for the generated code, excluding compilation. We consider this appropriate because the results of compilations are cached aggressively and can be reused when the underlying data is changed, potentially even across executions (even though this is not yet implemented), as the data is not part of the compiled code.
We use additional indexes in (3x), but not in the hand-optimized Scala implementation (2). We argue that indexes are less likely to be applied manually, because index application is a crosscutting concern and makes the whole query implementation more complicated and less abstract. Still, we offer measurement (3o) to compare the speedup without additional indexes.
This gives us a total of five settings to measure and compare (1, 2, 3−, 3o and 3x). Between them, we want to observe the following interesting performance ratios (speedups or slowdowns, computed through the indicated divisions):
(M) Modularization overhead (the relative performance difference between the modularized and the hand-optimized Scala implementation, 1/2).
(S) SQuOpt overhead (the overhead of executing unoptimized SQuOpt queries, 1/3−; smaller is better).
(H) Hand-optimization challenge (the performance overhead of our optimizer against hand-optimizations of a domain expert, 2/3o; bigger is better). This overhead is partly due to the SQuOpt overhead (S) and partly to optimizations which have not been automated or have not been effective enough. This comparison excludes the effects of indexing, since this is an optimization we did not perform by hand; we also report (H') = 2/3x, which includes indexing.
(O) Optimization potential (the speedup obtained by optimizing modularized queries, 1/3o; bigger is better).
(X) Index influence (the speedup gained by using indexes, 3o/3x; bigger is better).
(T) Total optimization potential with indexes (1/3x; bigger is better), which is equal to (O) × (X).
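These ratios are plain quotients of the measured times; the following sketch (with made-up round timings, not actual measurements) shows how they are computed and that (T) equals (O) × (X) by construction:

```scala
// Hypothetical timings in ms for a single query (illustrative values only):
val t1 = 1000.0 // (1)  modularized Scala
val t2 = 500.0  // (2)  hand-optimized Scala
val tU = 800.0  // (3-) SQuOpt without optimizer
val tO = 200.0  // (3o) SQuOpt with optimizer
val tX = 20.0   // (3x) SQuOpt with optimizer and indexes

val M = t1 / t2 // modularization overhead       = 2.0
val S = t1 / tU // SQuOpt overhead               = 1.25
val H = t2 / tO // hand-optimization challenge   = 2.5
val O = t1 / tO // optimization potential        = 5.0
val X = tO / tX // index influence               = 10.0
val T = t1 / tX // total potential, equals O * X = 50.0
```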
In Fig. 6.1 we provide an overview of the setup. We made our raw data available and our results reproducible [Vitek and Kalibera 2011].1
[Figure 6.1: Measurement Setup Overview. The diagram relates the specification (0) with its reference implementation, the modularized Scala implementation (1), the hand-optimized Scala implementation (2), and the SQuOpt implementation without optimizer (3−), with optimizer (3o), and with optimizer and indexes (3x); the ratios M, S, H, O, X and T are derived from comparisons between these settings.]
6.2 Experimental Units

As experimental units, we sampled a set of queries on code structures from FindBugs 2.0 [Hovemeyer and Pugh 2004]. FindBugs is a popular bug-finding tool for Java bytecode, available as open source. To detect instances of bug patterns, it queries a structural in-memory representation of a code base (extracted from bytecode). Concretely, a single loop traverses each class and invokes all visitors (implemented as listeners) on each element of the class. Many visitors, in turn, perform activities concerning multiple bug detectors, which are fused together. An extreme example: in FindBugs, query 4 is defined in class DumbMethods, together with 41 other bug detectors for distinct types of bugs. Typically, a bug detector is furthermore scattered across the different methods of the visitor, which handle different elements of the class. We believe this architecture has been chosen to achieve good performance; however, we do not consider such manual fusion of distinct bug detectors as modular. We selected queries from FindBugs because they represent typical non-trivial queries on in-memory collections and because we believe our framework allows expressing them more modularly.
We sampled queries in two batches. First, we manually selected 8 queries (from approx. 400 queries in FindBugs), chosen mainly to evaluate the potential speedups of indexing (queries that primarily looked for declarations of classes, methods or fields with specific properties, queries that inspect the type hierarchy, and queries that required analyzing method implementations). Subsequently, we randomly selected a batch of 11 additional queries. The batch excluded queries that rely on control-/dataflow analyses (i.e., analyzing the effect of bytecode instructions on the stack), due to limitations of the bytecode toolkit we use. In total we have 19 queries, as listed in Table 6.2 (the randomly selected queries are marked with the superscript R).
We implemented each query three times (see implementations (1)–(3) in Sec. 6.1), following the specifications given in the FindBugs documentation (0). Instead of using a hierarchy of visitors as in the original implementations of the queries in FindBugs, we wrote the queries as for-comprehensions in Scala on an in-memory representation created by the Scala toolkit BAT.2 BAT in particular provides comprehensive support for writing queries against Java bytecode in an idiomatic way. We exemplify an analysis in Fig. 6.5. It detects all co-variant equals methods in a project by iterating over all class files (line 2) and all methods, searching for methods named "equals" that return a boolean
1 Data available at http://www.informatik.uni-marburg.de/~pgiarrusso/SQuOpt
2 http://github.com/Delors/BAT
Id   Query description
1    Covariant compareTo() defined
2    Explicit garbage collection call
3    Protected field in final class
4    Explicit runFinalizersOnExit() call
5    clone() defined in non-Cloneable class
6    Covariant equals() defined
7    Public finalizer defined
8    Dubious catching of IllegalMonitorStateException
9R   Uninit. field read during construction of super
10R  Mutable static field declared public
11R  Refactor anon. inner class to static
12R  Inefficient use of toArray(Object[])
13R  Primitive boxed and unboxed for coercion
14R  Double precision conversion from 32 bit
15R  Privileged method used outside doPrivileged
16R  Mutable public static field should be final
17R  Serializable class is member of non-ser. class
18R  Swing methods used outside Swing thread
19R  Finalizer only calls superclass finalize

Table 6.2: Performance results: per-query execution times (ms) for each setting, and the performance ratios M (1/2), H (2/3o) and T (1/3x). As in Sec. 6.1, (1) denotes the modular Scala implementation, (2) the hand-optimized Scala one, and (3−), (3o), (3x) refer to the SQuOpt implementation when run, respectively, without optimizations, with optimizations, and with optimizations and indexing. Queries marked with the R superscript were selected by random sampling.
                     M (1/2)  S (1/3−)  H (2/3o)  H' (2/3x)  O (1/3o)  X (3o/3x)  T (1/3x)
Geometric means of
performance ratios   2.4x     1.2x      0.8x      5.1x       1.9x      6.3x       12x

Table 6.3: Average performance ratios. This table summarizes all interesting performance ratios across all queries, using the geometric mean [Fleming and Wallace 1986]. The meaning of the speedups is discussed in Sec. 6.1.
Abstraction                                                    Used
All fields in all class files                                  4
All methods in all class files                                 3
All method bodies in all class files                           3
All instructions in all method bodies and their bytecode index 5
Sliding window (size n) over all instructions (and their index) 3

Table 6.4: Description of abstractions removed during hand-optimization and number of queries where the abstraction is used (and optimized away).
value and define a single parameter of the type of the current class.

Abstractions. In the reference implementations (1), we identified several reusable abstractions, as shown in Table 6.4. The reference implementations of all queries except 17R use exactly one of these abstractions, which encapsulate the main loops of the queries.
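For illustration, the following plain-Scala sketch shows the flavor of one such abstraction ("all methods in all class files") and a query reusing it; ClassFile and Method are hypothetical stubs, not BAT's actual types:

```scala
// Hypothetical stub types standing in for BAT's bytecode representation:
case class Method(name: String, isPublic: Boolean)
case class ClassFile(thisClass: String, methods: Seq[Method])

// Reusable abstraction encapsulating the main loop (cf. Table 6.4):
def allMethods(classFiles: Seq[ClassFile]): Seq[(ClassFile, Method)] =
  for { cf <- classFiles; m <- cf.methods } yield (cf, m)

// A query reuses the abstraction instead of re-nesting the loops by hand:
def publicFinalizers(classFiles: Seq[ClassFile]): Seq[(ClassFile, Method)] =
  allMethods(classFiles).filter { case (_, m) => m.name == "finalize" && m.isPublic }
```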
Indexes. For executing (3x) (SQuOpt with indexes), we have constructed three indexes to speed up navigation over the queried data of queries 1–8: indexes for method names, exception handlers, and instruction types. We illustrate the implementation of the method-name index in Fig. 6.6: it produces a collection of all methods and then indexes them using indexBy; its argument extracts from an entry the key, that is, the method name. We selected which indexes to implement using guidance from SQuOpt itself: during optimizations, SQuOpt reports which indexes it could have applied to the given query. Among those, we tried to select indexes giving a reasonable compromise between construction cost and optimization speedup. We first measured the construction cost of these indexes.
for {
  classFile ← classFiles.asSquopt
  method ← classFile.methods
  if !method.isAbstract && method.name == "equals" &&
    method.descriptor.returnType == BooleanType
  parameterTypes ← Let(method.descriptor.parameterTypes)
  if parameterTypes.length == 1 && parameterTypes(0) == classFile.thisClass
} yield (classFile, method)

Figure 6.5: Find covariant equals methods.
val methodNameIdx: Exp[Map[String, Seq[(ClassFile, Method)]]] =
  (for {
    classFile ← classFiles.asSquopt
    method ← classFile.methods
  } yield (classFile, method)) indexBy (entry ⇒ entry._2.name)

Figure 6.6: A simple index definition.
Index               Elapsed time (ms)
Method name         97.99 ± 2.94
Exception handlers  179.29 ± 3.21
Instruction type    4166.49 ± 202.85

For our test data, index construction takes less than 200 ms for the first two indexes, which is moderate compared to the time for loading the bytecode into the BAT representation (4755.32 ± 141.66 ms). Building the instruction index took around 4 seconds, which we consider acceptable since this index maps each type of instruction (e.g. INSTANCEOF) to a collection of all bytecode instructions of that type.
6.3 Measurement Setup

To measure performance, we executed the queries on the preinstalled JDK class library (rt.jar), containing 58 MB of uncompressed Java bytecode. We also performed a preliminary evaluation by running queries on the much smaller ScalaTest library, getting comparable results that we hence do not discuss. Experiments were run on an 8-core Intel Core i7-2600 (3.40 GHz) with 8 GB of RAM, running Scientific Linux release 6.2. The benchmark code itself is single-threaded, so it uses only one core; however, the JVM used other cores to offload garbage collection. We used the preinstalled OpenJDK Java version 1.7.0_05-icedtea and Scala 2.10.0-M7.
We measure steady-state performance, as recommended by Georges et al. [2007]. We invoke the JVM p = 15 times; at the beginning of each JVM invocation, all the bytecode to analyze is loaded in memory and converted into BAT's representation. In each JVM invocation, we iterate each benchmark until the variation of results becomes low enough. We measure the variation of results through the coefficient of variation (CoV; standard deviation divided by the mean). Thus, we iterate each benchmark until the CoV in the last k = 10 iterations drops under the threshold θ = 0.1, or until we complete q = 50 iterations. We report the arithmetic mean of these measurements (and also report the usually low standard deviation on our web page).
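The iteration policy just described can be sketched as follows, where measure stands for one timed benchmark iteration (an illustration of the policy, not our actual harness):

```scala
// Iterate until the CoV of the last k measurements drops below theta,
// or until q iterations are reached; return all collected timings.
def steadyState(measure: () => Double, k: Int = 10,
                theta: Double = 0.1, q: Int = 50): Seq[Double] = {
  def cov(xs: Seq[Double]): Double = {
    val mean = xs.sum / xs.size
    val sd = math.sqrt(xs.map(x => (x - mean) * (x - mean)).sum / xs.size)
    sd / mean // coefficient of variation: std. deviation over mean
  }
  val times = scala.collection.mutable.ArrayBuffer.empty[Double]
  while (times.size < q && (times.size < k || cov(times.takeRight(k).toSeq) >= theta)) {
    times += measure()
  }
  times.toSeq
}
```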
6.4 Results

Correctness. We machine-checked that, for each query, all variants in Table 6.2 agree.
Modularization Overhead. We first observe that performance suffers significantly when using the abstractions we described in Table 6.4. These abstractions, while natural in the domain and in the setting of a declarative language, are not idiomatic in Java or Scala, because without optimization they will obviously lead to bad performance. They are still useful abstractions from the point of view of modularity, though (as indicated by Table 6.4), and as such it would be desirable if one could use them without paying the performance penalty.
Scala Implementations vs. FindBugs. Before actually comparing the different Scala and SQuOpt implementations, we first ensured that the implementations are comparable to the original FindBugs implementation. A direct comparison between the FindBugs reference implementation and any of our implementations is not possible in a rigorous and fair manner: FindBugs bug detectors are not fully modularized, therefore we cannot reasonably isolate the implementation of the selected queries from support code. Furthermore, the architecture of the implementation has many differences that affect performance; among others, FindBugs also uses multithreading. Moreover, while in our case each query loops over all classes, in FindBugs, as discussed above, a single loop considers each class and invokes all visitors (implemented as listeners) on it.
We measured startup performance [Georges et al. 2007], that is, the performance of running the queries only once, to minimize the effect of compiler optimizations. We set up our SQuOpt-based analyses to only perform optimization and run the optimized query. To set up FindBugs, we manually disabled all unrelated bug detectors; we also made the modified FindBugs source code available. The result is that the Scala implementations of the queries (3−) have performance of the same order of magnitude as the original FindBugs queries; in our tests, the SQuOpt implementation was about twice as fast. However, since the comparison cannot be made fair, we refrained from a more detailed investigation.
SQuOpt Overhead and Optimization Potential. We present the results of our benchmarks in Table 6.2. Column names refer to a few of the definitions described above; for readability, we do not present all the ratios previously introduced for each query, but report the raw data. In Table 6.3 we report the geometric mean [Fleming and Wallace 1986] of each ratio, computed with the same weight for each query.
We see that in its current implementation, SQuOpt can cause an overhead S (1/3−) of up to 3.4x. On average, SQuOpt queries are 1.2x faster. These differences are due to minor implementation details of certain collection operators. For query 18R, instead, the basic SQuOpt implementation is 12.9x faster; we are investigating the reason, and suspect this might be related to the use of pattern matching in the original query.
As expected, not all queries benefit from optimizations: out of 19 queries, optimization affords significant speedups for 15 of them, ranging from a factor of 1.2x to a factor of 12800x; 10 queries are faster by a factor of at least 5. Only queries 10R, 11R and 12R fail to recover any modularization overhead.
We have analyzed the behavior of a few queries after optimization to understand why theirperformance has (or has not) improved
Optimization makes query 17R slower; we believe this is because optimization replaces filtering by lazy filtering, which is usually faster, but not here. Among queries where indexing succeeds, query 2 has the least speedup. After optimization, this query uses the instruction-type index to find all occurrences of invocation opcodes (INVOKESTATIC and INVOKEVIRTUAL); after this step, the query looks among those invocations for ones targeting runFinalizersOnExit. Since invocation opcodes are quite frequent, the used index is not very specific, hence it allows for little speedup (9.5x). However, no other index applies to this query; moreover, our framework does not maintain any selectivity statistics on indexes to predict these effects. Query 19R benefits from indexing without any specific tuning on our part, because it looks for implementations of finalize with some characteristic; hence, the highly selective method-name index applies. After optimization, query 8 becomes simply an index lookup on the index for exception handlers, looking for handlers of IllegalMonitorStateException; it is thus not surprising that its speedup is extremely high (12800x). This speedup relies on an index which is specific to this kind of query, and building this index is slower than executing the unoptimized query. On the other hand, building this index is entirely appropriate in a situation where similar queries are common enough. Similar considerations apply to the usage of indexing in general, similarly to what happens in databases.
Optimization Overhead. The current implementation of the optimizer is not yet optimized for speed (of the optimization algorithm). For instance, expression trees are traversed and rebuilt completely once for each transformation. However, the optimization overhead is usually not excessive: it is 54.8 ± 85.5 ms, varying between 3.5 ms and 381.7 ms (mostly depending on the query size).
Limitations. Although many speedups are encouraging, our optimizer is currently a proof of concept and we experienced some limitations:
• In a few cases, hand-optimized queries are still faster than what the optimizer can produce. We believe these problems could be addressed by adding further optimizations.
• Our implementation of indexing is currently limited to immutable collections. For mutable collections, indexes must be maintained incrementally. Since indexes are defined as special queries in SQuOpt, incremental index maintenance becomes an instance of incremental maintenance of query results, that is, of incremental view maintenance. We plan to support incremental view maintenance as part of future work; however, indexing in its current form is already useful, as illustrated by our experimental results.
Threats to Validity. With rigorous performance measurements and the chosen setup, our study was set up to maximize internal and construct validity. Although we did not involve an external domain expert, and we did not compare the results of our queries with the ones from FindBugs (except while developing the queries), we believe that the queries adequately represent the modularity and performance characteristics of FindBugs and SQuOpt. However, since we selected queries only from a single project, external validity is limited. While we cannot generalize our results beyond FindBugs yet, we believe that the FindBugs queries are representative of complex in-memory queries performed by applications.
Summary. We demonstrated on our real-world queries that relying on declarative abstractions in collection queries often causes a significant slowdown. As we have seen, using SQuOpt without optimization, or when no optimizations are possible, usually provides performance comparable to using standard Scala; however, SQuOpt optimizations can in most cases remove the slowdown due to declarative abstractions. Furthermore, relying on indexing allows achieving even greater speedups while still using a declarative programming style. Some implementation limitations restrict the effectiveness of our optimizer, but since this is a preliminary implementation, we believe our evaluation shows the great potential of optimizing queries to in-memory collections.
Chapter 7
Discussion
In this chapter, we discuss the degree to which SQuOpt fulfilled our original design goals, and the conclusions for host and domain-specific language design.
7.1 Optimization limitations

In our experiments, indexing achieved significant speedups, but when indexing does not apply, speedups are more limited in comparison; later projects working on collection query optimization, such as OptiQL [Rompf et al. 2013; Sujeeth et al. 2013b], gave better speedups, as also discussed in Sec. 8.3.1.
A design goal of this project was to incrementalize optimized queries, and while it is easy to incrementalize collection operators such as map, flatMap or filter, it was much less clear to us how to optimize the result of inlining. We considered using shortcut fusion, but we did not know a good way to incrementalize the resulting programs; later work, as described in Part II, clarified what is possible and what is not.
Another problem is that most fusion techniques are designed for sequential processing, and hence conflict with incrementalization. Fusion techniques generally assume that bulk operations scan collections in linear order; for instance, shortcut fusion rewrites operators in terms of foldr. During parallel and/or incremental computation, instead, it is better to use tree-shaped folds, that is, to split the task in a divide-and-conquer fashion, so that the various subproblems form a balanced tree. This division minimizes the height of the tree, and hence the number of steps needed to combine results from subproblems into the final result, as also discussed by Steele [2009]. It is not so clear how to apply shortcut fusion to parallel and incremental programs. Maier and Odersky [2013] describe an incremental tree-shaped fold where each subproblem that is small enough is solved by scanning its input in linear order, but they do not perform code transformation and do not study how to perform fusion.
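To illustrate the difference, a minimal tree-shaped fold can be written as below; it assumes an associative combine and is only a sketch, not Maier and Odersky's implementation:

```scala
// Divide and conquer: the combination tree has logarithmic height, unlike
// the linear chain of applications built by foldLeft or foldr.
def treeFold[A](xs: IndexedSeq[A])(zero: A)(combine: (A, A) => A): A =
  if (xs.isEmpty) zero
  else if (xs.size == 1) xs.head
  else {
    val (l, r) = xs.splitAt(xs.size / 2)
    combine(treeFold(l)(zero)(combine), treeFold(r)(zero)(combine))
  }
```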
7.2 Deep embedding
7.2.1 What worked well

Several features of Scala contributed greatly to the success we achieved. With implicit conversions, the lifting can be made mostly transparent. The advanced type system features were quite helpful to make the expression tree representation typed. The fact that for-comprehensions are desugared
before type inference and type checking was also a prerequisite for automatic lifting. The syntactic expressiveness and uniformity of Scala, in particular the fact that custom types can have the same look-and-feel as primitive types, were also vital to lift expressions on primitive types.
7.2.2 Limitations

Despite these positive experiences and our experimental success, our embedding has a few significant limitations.
The first limitation is that we only lift a subset of Scala, and some interesting features are missing. We do not support statements in nested blocks in our queries, but this could be implemented by reusing techniques from Delite [Rompf et al. 2011]. More importantly for queries, pattern matching cannot be supported by a deep embedding similar to ours. In contrast to for-comprehension syntax, pattern matching is desugared only after type checking [Emir et al. 2007b], which prevents us from lifting pattern matching notation. More specifically, because an extractor [Emir et al. 2007b] cannot return the representation of a result value (say, Exp[Boolean]) to evaluate later, it must produce its final result at pattern matching time. There is initial work on "virtualized pattern matching",1 and we hope to use this feature in future work.
We also experienced problems with operators that cannot be overloaded, such as == or if-else, and with lifting methods in scala.Any, which forced us to provide alternative syntax for these features in queries. The Scala-virtualized project [Moors et al. 2012] aims to address these limitations; unfortunately, it advertises no change on the other problems we found, which we subsequently detail.
It would also be desirable if we could enforce the absence of side effects in queries, but since Scala, like most practical programming languages except Haskell, does not track side effects in the type system, this does not seem to be possible.
Finally, compared to lightweight modular staging [Rompf and Odersky 2010] (the foundation of Delite) and to polymorphic embedding [Hofer et al. 2008], we have less static checking for some programming errors when writing queries. The recommended way to use Delite is to write an EDSL program in one trait, in terms of the EDSL interface only, and combine it with the implementation in another trait. In polymorphic embedding, the EDSL program is a function of the specific implementation (in this case, semantics). Either approach ensures that the DSL program is parametric in the implementation and hence cannot refer to details of a specific implementation. However, we judged the syntactic overhead for the programmer to be too high to use those techniques – in our implementation, we rely on encapsulation and on dynamic checks at query construction time to achieve similar guarantees.
The choice of interpreting expressions turned out to be a significant performance limitation. It could likely be addressed by using Delite and lightweight modular staging instead, but we wanted to experiment with how far we can get within the language, in a well-typed setting.
7.2.3 What did not work so well: Scala type inference

When implementing our library, we often struggled against limitations and bugs in the Scala compiler, especially regarding type inference and its interaction with implicit resolution, and we were often constrained by its limitations. Not only is Scala's type inference incomplete, but we learned that its behavior is only specified by the behavior of the current implementation: in many cases where there is a clear desirable solution, type inference fails or finds an incorrect substitution, which leads to a type error. Hence, we cannot distinguish in the discussion the Scala language from its implementation. We regard many of Scala's type inference problems as bugs, and reported them
1 http://stackoverflow.com/questions/8533826/what-is-scalas-experimental-virtual-pattern-matcher
as such when no previous bug report existed, as noted in the rest of this section. Some of them are long-standing issues; others were accepted; for other ones we had received no feedback yet at the time of this writing; and another one was already closed as WONTFIX, indicating that a fix would be possible but would have excessive complexity for the time being.2
Overloading. The code in Fig. 3.1 uses the lifted BookData constructor. Two definitions of BookData are available, with signatures BookData(String, String, Int) and BookData(Exp[String], Exp[String], Exp[Int]), and it seems like the Scala compiler should be able to choose which one to apply using overload resolution. This, however, is not supported, simply because the two functions are defined in different scopes;3 hence, importing BookData(Exp[String], Exp[String], Exp[Int]) locally shadows the original definition.
Type inference vs. implicit resolution. The interaction between type inference and implicit resolution is a hard problem, and Scalac also has many bugs, but the current situation requires further research; for instance, there is not even a specification for the behavior of type inference.4
As a consequence, to the best of our knowledge, some properties of type inference have not been formally established. For instance, a reasonable user expectation is that removing a call to an implicit conversion does not alter the program, if it is the only implicit conversion with the correct type in scope or if it is more specific than the others [Odersky et al. 2011, Ch. 21]. This is not always correct, because removing the implicit conversion reduces the information available for the type inference algorithm; we observed multiple cases5 where type inference becomes unable to figure out enough information about the type to trigger the implicit conversion.
We also consider it significant that Scala 2.8 required making both type inference and implicit resolution more powerful, specifically in order to support the collection library [Moors 2010; Odersky et al. 2011, Sec. 21.7]; further extensions would be possible and desirable. For instance, if type inference were extended with higher-order unification6 [Pfenning 1988], it would better support a part of our DSL interface (not discussed in this chapter) by removing the need for type annotations.
Nested pattern matches for GADTs in Scala. Writing a typed decomposition for Exp requires pattern-matching support for generalized algebraic datatypes (GADTs). We found that support for GADTs in Scala is currently insufficient. Emir et al. [2007b] define the concept of typecasing, essentially a form of pattern matching limited to non-nested patterns, and show that Scala supports typecasing on GADTs by presenting a typed evaluator; however, typecasing is rather verbose for deep patterns, since one has to nest multiple pattern-matching expressions. When using normal pattern matches instead, the support for GADTs seems much weaker.7 Hence, one has to choose between support for GADTs and the convenience of nested pattern matching. A third alternative is to ignore or disable compiler warnings, but we did not consider this option.
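For concreteness, the following is a typed evaluator in the typecasing style, on a small GADT much simpler than our Exp; this non-nested shape of match is what Scala handles reasonably well, while nesting such patterns is where support weakens:

```scala
sealed trait Exp[A]
case class Lit(n: Int) extends Exp[Int]
case class Plus(a: Exp[Int], b: Exp[Int]) extends Exp[Int]
case class Eq[T](a: Exp[T], b: Exp[T]) extends Exp[Boolean]

// The result type A is refined in each branch (to Int, Int, Boolean):
def eval[A](e: Exp[A]): A = e match {
  case Lit(n)     => n
  case Plus(a, b) => eval(a) + eval(b)
  case Eq(a, b)   => eval(a) == eval(b)
}
```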
Implicit conversions do not chain. While implicit conversions by default do not chain, it is sometimes convenient to allow chaining selectively. For instance, let us assume a context such that a: Exp[A], b: Exp[B] and c: Exp[C]. In this context, let us consider again how we lift tuples. We have seen that the expression (a, b) has type (Exp[A], Exp[B]) but can be converted to Exp[(A, B)] through an implicit conversion. Let us now consider nested tuples, like ((a, b), c): it has type ((Exp[A], Exp[B]), Exp[C]), hence the previous conversion cannot be applied to this expression.
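The following toy sketch, with a deliberately simplified Exp rather than SQuOpt's actual definitions, shows the problem: the pair conversion applies to (a, b), but for ((a, b), c) it must be nested explicitly:

```scala
import scala.language.implicitConversions

case class Exp[A](run: A) // simplified stand-in for the real Exp

implicit def liftPair[A, B](p: (Exp[A], Exp[B])): Exp[(A, B)] =
  Exp((p._1.run, p._2.run))

val a = Exp(1); val b = Exp("x"); val c = Exp(true)

val ab: Exp[(Int, String)] = (a, b) // the conversion applies here
// ((a, b), c) has type ((Exp[Int], Exp[String]), Exp[Boolean]), which is not
// a pair of Exps, so liftPair does not apply implicitly; we must nest it:
val abc: Exp[((Int, String), Boolean)] = liftPair((liftPair((a, b)), c))
```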
Odersky et al. [2011, Ch. 21] describe a pattern which can address this goal. Using this pattern, to lift pairs we must write an implicit conversion from pairs of elements which can be implicitly
2 https://issues.scala-lang.org/browse/SI-2551
3 https://issues.scala-lang.org/browse/SI-2551
4 https://issues.scala-lang.org/browse/SI-5298?focusedCommentId=55971#comment-55971, reported by us
5 https://issues.scala-lang.org/browse/SI-5592, reported by us
6 https://issues.scala-lang.org/browse/SI-2712
7 Due to bug https://issues.scala-lang.org/browse/SI-5298?focusedCommentId=56840#comment-56840, reported by us
converted to expressions. Instead of requiring (Exp[A], Exp[B]), the implicit conversion should require (A, B), with the condition that A can be converted to Exp[A'] and B to Exp[B']. This conversion solves the problem if applied explicitly, but the compiler refuses to apply it implicitly, again because of type inference issues.8
Because of these type inference limitations, we failed to provide support for reifying code like ((a, b), c).9
Error messages for implicit conversions. The enrich-my-library pattern has the declared goal of allowing programmers to extend existing libraries transparently. However, implementation details shine through when a user program using this feature contains a type error. When invoking a method would require an implicit conversion which is not applicable, the compiler often just reports that the method is not available. The recommended approach to debugging such errors is to manually add the missing implicit conversion and investigate the type error [Odersky et al. 2011, Ch. 21.8], but this destroys the transparency of the approach when creating or modifying code. We believe this could be solved in principle by research on error reporting: the compiler could automatically insert all implicit conversions enabling the method calls and report the corresponding errors, even if at some performance cost.
7.2.4 Lessons for language embedders

Various domains, such as the one considered in our case study, allow powerful domain-specific optimizations. Such optimizations often are hard to express in a compositional way; hence, they cannot be performed while building the query, but must be expressed as global optimization passes. For those domains, deep embedding is key to allow significant optimizations. On the other hand, deep embedding requires implementing an interpreter or a compiler.
On the one hand, interpretation overhead is significant in Scala, even when using HOAS to take advantage of the metalevel implementation of argument access.
On the other hand, instead of interpreting a program, one can compile an EDSL program to Scala and load it, as done by Rompf et al. [2011]; when using this approach, the disadvantage is the compilation delay, especially for Scala, whose compilation process is complex and time-consuming. Possible alternatives include generating bytecode directly, or combining interpretation and compilation, similarly to tiered JIT compilers, where only code which is executed often is compiled. We plan to investigate such alternatives in future work.
8 https://issues.scala-lang.org/browse/SI-5651, reported by us
9 One could of course write a specific implicit conversion for this case; however, (a, (b, c)) requires already a different implicit conversion, and there are infinite ways to nest pairs, let alone tuples of bounded arity.
Chapter 8
Related work
This chapter discusses related work: prior work on language-integrated queries, query optimization, techniques for DSL embedding, and other work on code querying.
8.1 Language-Integrated Queries

Microsoft's Language-Integrated Query technology (Linq) [Meijer et al. 2006; Bierman et al. 2007] is similar to our work in that it also reifies queries on collections to enable analysis and optimization. Such queries can be executed against a variety of backends (such as SQL databases or in-memory objects), and adding new backends is supported. Its implementation uses expression trees, a compiler-supported implicit conversion between expressions and their reification as a syntax tree. There are various major differences, though. First, the support for expression trees is hard-coded into the compiler. This means that the techniques are not applicable in languages that do not explicitly support expression trees. More importantly, the way expression trees are created in Linq is generic and fixed. For instance, it is not possible to create different tree nodes for method calls that are relevant to an analysis (such as the map method) than for method calls that are irrelevant for the analysis (such as the toString method). For this reason, expression trees in Linq cannot be customized to the task at hand and contain too much low-level information. It is well known that this makes it quite hard to implement programs operating on expression trees [Eini 2011].
Linq queries can also not easily be decomposed and modularized. For instance, consider the task of refactoring the filter in the query from x in y where x.z == 1 select x into a function. Defining this function as bool comp(int v) { return v == 1; } would destroy the possibility of analyzing the filter for optimization, since the resulting expression tree would only contain a reference to an opaque function. The function could be declared as returning an expression tree instead, but then this function could not be used in the original query anymore, since the compiler expects an expression of type bool and not an expression tree of type bool. It could only be integrated if the expression tree of the original query is created by hand, without using the built-in support for expression trees.

Although queries against in-memory collections could theoretically also be optimized in Linq, the standard implementation, Linq2Objects, performs no optimizations.

A few optimized embedded DSLs allow executing queries or computations on distributed clusters. DryadLINQ [Yu et al. 2008], based on Linq, optimizes queries for distributed execution. It inherits Linq's limitations, and thus does not support decomposing queries in different modules. Modularizing queries is supported instead by FlumeJava [Chambers et al. 2010], another library (in Java) for distributed query execution. However, FlumeJava cannot express many optimizations, because its representation of expressions is more limited; also, its query language is more cumbersome. Both problems are rooted in Java's limited support for embedded DSLs. Other embedded DSLs support parallel platforms such as GPUs or many-core CPUs, such as Delite [Rompf et al. 2013].
Willis et al. [2006, 2008] add first-class queries to Java through a source-to-source translator, and implement a few selected optimizations, including join order optimization and incremental maintenance of query results. They investigate how well their techniques apply to Java programs, and they suggest that programmers use manual optimizations to avoid expensive constructs like nested loops. While the goal of these works is similar to ours, their implementation as an external source-to-source translator makes the adoption, extensibility and composability of their technique difficult.

There have been many approaches for a closer integration of SQL queries into programs, such as HaskellDB [Leijen and Meijer 1999] (which also inspired Linq) or Ferry [Grust et al. 2009] (which moves part of a program execution to a database). In Scala, there are also APIs which integrate SQL queries more closely, such as Slick¹. Its frontend allows defining and combining type-safe queries, similarly to ours (also in the way it is implemented). However, the language for defining queries maps to SQL, so it does not support nesting collections in other collections (a feature which simplified our example in Sec. 2.2), nor does it distinguish statically between different kinds of collections, such as Set or Seq. Based on Ferry, ScalaQL [Garcia et al. 2010] extends Scala with a compiler-plugin to integrate a query language on top of a relational database. The work by Spiewak and Zhao [2009] is unrelated to [Garcia et al. 2010], but also called ScalaQL. It is similar to our approach in that it also proposes to reify queries based on for-comprehensions, but it is not clear from their paper how the reification works².
8.2 Query Optimization

Query optimization on relational data is a long-standing issue in the database community, but there are also many works on query optimization on objects [Fegaras and Maier 2000; Grust 1999]. Compared to these works, we have only implemented a few simple query optimizations, so there is potential for further improvement of our work by incorporating more advanced optimizations.

Henglein [2010] and Henglein and Larsen [2010] embed relational algebra in Haskell. Queries are not reified in their approach, but due to a particularly sophisticated representation of multisets, it is possible to execute some queries containing cross-products using faster equi-joins. While their approach appears to not rely on non-confluent rewriting rules, and hence appears more robust, it is not yet clear how to incorporate other optimizations.

8.3 Deep Embeddings in Scala

Technically, our implementation of SQuOpt is a deep embedding of a part of the Scala collections API [Odersky and Moors 2009]. Deep embeddings were pioneered by Leijen and Meijer [1999] and Elliott et al. [2003] in Haskell, for other applications.

We regard the Scala collections API [Odersky and Moors 2009] as a shallowly embedded query DSL. Query operators are eager, that is, they immediately perform collection operations when called, so that it is not possible to optimize queries before execution.
Scala collections also provide views, which are closer to SQuOpt. Unlike standard query operators, they create lazy collections. Like SQuOpt, views reify query operators as data structures and interpret them later. Unlike SQuOpt, views are not used for automatic query optimization, but for explicitly changing the evaluation order of collection processing. Moreover, they cannot be reused by SQuOpt, as they reify too little. Views embed deeply only the outermost pipeline of collection operators, while they embed shallowly nested collection operators and other Scala code in queries, such as arguments of filter, map and flatMap. Deep embedding of the whole query is necessary for many optimizations, as discussed in Chapter 3.

¹https://slick.typesafe.com
²We contacted the authors; they were not willing to provide more details or the sources of their approach.
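The evaluation-order point can be seen concretely. The following small sketch (ours) shows that a view defers the operator pipeline until it is forced:

```scala
// Views reify the operator pipeline lazily; forcing the view runs it.
val xs = List(1, 2, 3, 4)
var evaluated = 0
val v = xs.view.map { x => evaluated += 1; x * 2 }.filter(_ > 4)
val before = evaluated        // still 0: no operator has run yet
val result = v.toList         // forcing the view evaluates the whole pipeline
```

Note that the functions passed to `map` and `filter` remain opaque Scala closures: the view records *that* a map happens, but cannot inspect or rewrite *what* it computes, which is exactly why views reify too little for optimization.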
8.3.1 LMS

Our deep embedding is inspired by some of the Scala techniques presented by Lightweight Modular Staging (LMS) [Rompf and Odersky 2010] for using implicits and for adding infix operators to a type. Like Rompf and Odersky [2010], we also generate and compile Scala code on-the-fly, reusing the Scala compiler. A plausible alternative backend for SQuOpt would have been to use LMS and Delite [Rompf et al. 2011], a framework for building highly efficient DSLs in Scala.

An alternative to SQuOpt based on LMS and Delite, named OptiQL, was indeed built and presented in concurrent [Rompf et al. 2013] and subsequent work [Sujeeth et al. 2013b]. Like SQuOpt, OptiQL enables writing and optimizing collection queries in Scala. On the minus side, OptiQL supports fewer collection types (ArrayList and HashMap).

On the plus side, OptiQL supports queries containing effects, and can reuse support for automatic parallelization and multiple platforms present in Delite. While LMS allows embedding effectful queries, it is unclear how many of the implemented optimizations keep being sound on such queries, and how many of those have been extended to such queries.

OptiQL achieves impressive speedups by fusing collection operations and transforming (or lowering) high-level bulk operators to highly optimized imperative programs using while loops. This gains significant advantages because it can avoid boxing and intermediate allocations, and because inlining heuristics in the Hotspot JIT compiler have never been tuned to bulk operators in functional programs³. We did not investigate such lowerings. It is unclear how well these optimizations would extend to other collections, which intrinsically carry further overhead. Moreover, it is unclear how to execute incrementally the result of these lowerings, as we discuss in Sec. 7.1.

Those works did not support embedding arbitrary libraries in an automated or semi-automated way; this was only addressed later in Forge [Sujeeth et al. 2013a] and Yin-Yang [Jovanovic et al. 2014].

Ackermann et al. [2012] present Jet, which also optimizes collection queries, but targets MapReduce-style computations in a distributed environment. Like OptiQL, Jet does not apply typical database optimizations such as indexing or filter hoisting.

8.4 Code Querying

In our evaluation, we explore the usage of SQuOpt to express queries on code, and re-implement a subset of the FindBugs [Hovemeyer and Pugh 2004] analyses. There are various other specialized code query languages, such as CodeQuest [Hajiyev et al. 2006] or D-CUBED [Wegrzynowicz and Stencel 2009]. Since these are special-purpose query languages that are not embedded into a host language, they are not directly comparable to our approach.

³See discussion by Cliff Click at http://www.azulsystems.com/blog/cliff/2011-04-04-fixing-the-inlining-problem
Chapter 9
Conclusions
We have illustrated the tradeoff between performance and modularity for queries on in-memory collections. We have shown that it is possible to design a deep embedding of a version of the collections API which reifies queries and can optimize them at runtime. Writing queries using this framework is, except for minor syntactic details, the same as writing queries using the collection library; hence, the adoption barrier to using our optimizer is low.

Our evaluation shows that using abstractions in queries introduces a significant performance overhead with native Scala code, while SQuOpt in most cases makes the overhead much more tolerable, or removes it completely. Optimizations are not sufficient on some queries, but since our optimizer is a proof-of-concept with many opportunities for improvement, we believe a more elaborate version will achieve even better performance and reduce these limitations.

9.1 Future work

To make our DSL more convenient to use, it would be useful to use the virtualized pattern matcher of Scala 2.10, when it will be more robust, to add support for pattern matching in our virtualized queries.

Finally, while our optimizations are type-safe, as they rewrite an expression tree to another of the same type, currently the Scala type-checker cannot verify this statically, because of its limited support for GADTs. Solving this problem conveniently would allow checking statically that transformations are safe, and make developing them easier.
Part II
Incremental λ-Calculus
Chapter 10
Introduction to differentiation
Incremental computation (or incrementalization) has a long-standing history in computer science [Ramalingam and Reps 1993]. Often, a program needs to update quickly the output of some nontrivial function f when the input to the computation changes. In this scenario, we assume we have computed y1 = f x1, and we need to compute y2 that equals f x2. In this scenario, programmers typically have to choose between a few undesirable options.

• Programmers can call again function f on the updated input x2 and repeat the computation from scratch. This choice guarantees correct results and is easy to implement, but typically wastes computation time. Often, if the updated input is close to the original input, the same result can be computed much faster.

• Programmers can write by hand a new function df that updates the output based on input changes, using various techniques. Running a hand-written function df can be much more efficient than rerunning f, but writing df requires significant developer effort, is error-prone, and requires updating df by hand to keep it consistent with f whenever f is modified. In practice, this complicates code maintenance significantly [Salvaneschi and Mezini 2013].

• Programmers can write f using domain-specific languages that support incrementalization, for tasks where such languages are available. For instance, build scripts (our f) are written in domain-specific languages that support (coarse-grained) incremental builds. Database query languages also often have support for incrementalization.

• Programmers can attempt using general-purpose techniques for incrementalizing programs, such as self-adjusting computation and variants such as Adapton. Self-adjusting computation applies to arbitrary purely functional programs, and has been extended to imperative programs; however, it only guarantees efficient incrementalization when applied to base programs that are designed for efficient incrementalization. Nevertheless, self-adjusting computation enabled incrementalizing programs that had never been incrementalized by hand before.

No approach guarantees automatic efficient incrementalization for arbitrary programs. We propose instead to design domain-specific languages (DSLs) that can be efficiently incrementalized, that we call incremental DSLs (IDSLs).

To incrementalize IDSL programs, we use a transformation that we call (finite) differentiation. Differentiation produces programs in the same language, called derivatives, that can be optimized further and compiled to efficient code. Derivatives represent changes to values through further values, that we call simply changes.
For primitives, IDSL designers must specify the result of differentiation: IDSL designers are to choose primitives that encapsulate efficiently incrementalizable computation schemes, while IDSL users are to express their computation using the primitives provided by the IDSL.

Helping IDSL designers to incrementalize primitives automatically is a desirable goal, though one that we leave open. In our setting, incrementalizing primitives becomes a problem of program synthesis, and we agree with Shah and Bodik [2017] that it should be treated as such. Among others, Liu [2000] develops a systematic approach to this synthesis problem for first-order programs, based on equational reasoning, but it is unclear how scalable this approach is. We provide foundations for using equational reasoning, and sketch an IDSL for handling different sorts of collections. We also discuss avenues at providing language plugins for more fundamental primitives, such as algebraic datatypes with structural recursion.

In the IDSLs we consider, similarly to database languages, we use primitives for high-level operations, of complexity similar to SQL operators. On the one hand, IDSL designers wish to design few general primitives, to limit the effort needed for manual incrementalization. On the other hand, overly general primitives can be harder to incrementalize efficiently. Nevertheless, we also provide some support for more general building blocks, such as product or sum types, and even (in some situations) recursive types. Other approaches provide more support for incrementalizing primitives, but even then, ensuring efficient incrementalization is not necessarily easy. Dynamic approaches to incrementalization are most powerful: they can find work that can be reused at runtime, using memoization, as long as the computation is structured so that memoization matches will occur. Moreover, for some programs, it seems more efficient to detect that some output can be reused thanks to a description of the input changes, rather than through runtime detection.

We propose that IDSLs be higher-order, so that primitives can be parameterized over functions and hence highly flexible, and purely functional, to enable more powerful optimizations both before and after differentiation. Hence, an incremental DSL is a higher-order purely functional language, composed of a λ-calculus core extended with base types and primitives. Various database query languages support forms of finite differentiation (see Sec. 19.2.1), but only over first-order languages, which provide only restricted forms of operators, such as map, filter or aggregation.

To support higher-order IDSLs, we define the first form of differentiation that supports higher-order functional languages; to this end, we introduce the concept of function changes, which contain changes to either the code of a function value or the values it closes over. While higher-order programs can be transformed to first-order ones, incrementalizing resulting programs is still beyond reach for previous approaches to differentiation (see Sec. 19.2.1 for earlier work and Sec. 19.2.2 for later approaches). In Chapter 17 and Appendix D, we transform higher-order programs to first-order ones by closure conversion or defunctionalization, but we incrementalize defunctionalized programs using similar ideas, including changes to (defunctionalized) functions.

Our primary example will be DSLs for operations on collections: as discussed earlier (Chapter 2), we favor higher-order collection DSLs over relational databases.

We build our incremental DSLs based on simply-typed λ-calculus (STLC), extended with language plugins to define the domain-specific parts, as discussed in Appendix A.2 and summarized in Fig. A.1. We call our approach ILC, for Incremental Lambda Calculus.

The rest of this chapter is organized as follows. In Sec. 10.1 we explain that differentiation generalizes the calculus of finite differences, a relative of differential calculus. In Sec. 10.2 we show a motivating example for our approach. In Sec. 10.3 we introduce informally the concept of changes as values, and in Sec. 10.4 we introduce changes to functions. In Sec. 10.5 we define differentiation and motivate it informally. In Sec. 10.6 we apply differentiation to our motivating example.

Correctness of ILC is far from obvious. In Chapters 12 and 13, we introduce a formal theory of changes, and we use it to formalize differentiation and prove it correct.
10.1 Generalizing the calculus of finite differences

Our theory of changes generalizes an existing field of mathematics called the calculus of finite differences: if f is a real function, one can define its finite difference, that is a function ∆f such that ∆f a da = f (a + da) − f a. Readers might notice the similarity (and the differences) between the finite difference and the derivative of f, since the latter is defined as

f ′(a) = lim_{da → 0} (f (a + da) − f (a)) / da

The calculus of finite differences helps computing a closed formula for ∆f given a closed formula for f. For instance, if function f is defined by f x = 2 · x, one can prove its finite difference is ∆f x dx = 2 · (x + dx) − 2 · x = 2 · dx.
Finite differences are helpful for incrementalization because they allow computing functions on updated inputs based on results on base inputs, if we know how inputs change. Take again for instance f x = 2 · x: if x is a base input and x + dx is an updated input, we can compute f (x + dx) = f x + ∆f x dx. If we already computed y = f x and reuse the result, we can compute f (x + dx) = y + ∆f x dx. Here, the input change is dx and the output change is ∆f x dx.
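This reasoning is easy to check mechanically; a minimal sketch (in Scala, our encoding) for f x = 2 · x:

```scala
// For f x = 2*x, the finite difference is Δf x dx = 2*dx (independent of x).
def f(x: Int): Int = 2 * x
def df(x: Int, dx: Int): Int = 2 * dx

val (x, dx) = (10, 3)
val y = f(x)                  // base result, computed once
val updated = y + df(x, dx)   // reuse y instead of recomputing f(x + dx)
```

The point of the closed formula is that `df` runs in time independent of the size of `x`, while agreeing with recomputation from scratch.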
However, the calculus of finite differences is usually defined for real functions. Since it is based on operators + and −, it can be directly extended to commutative groups. Incrementalization based on finite differences for groups and first-order programs has already been researched [Paige and Koenig 1982; Gluche et al. 1997], most recently and spectacularly with DBToaster [Koch 2010; Koch et al. 2016].

But it is not immediate how to generalize finite differencing beyond groups. And many useful types do not form a group: for instance, lists of integers don't form a group, but only a monoid. Moreover, it's hard to represent list changes simply through a list: how do we specify which elements were inserted (and where), which were removed, and which were subjected to change themselves?

In ILC, we generalize the calculus of finite differences by using distinct types for base values and changes, and adapting the surrounding theory. ILC generalizes operators + and − as operators ⊕ (pronounced "oplus" or "update") and ⊖ (pronounced "ominus" or "difference"). We show how ILC subsumes groups in Sec. 13.1.1.
10.2 A motivating example

In this section, we illustrate informally incrementalization on a small example.

In the following program, grandTotal xs ys sums integer numbers in input collections xs and ys.

grandTotal :: Bag Z → Bag Z → Z
s :: Z
grandTotal xs ys = sum (merge xs ys)
s = grandTotal xs ys
This program computes output s from input collections xs and ys. These collections are multisets, or bags: that is, collections that are unordered (like sets) where elements are allowed to appear more than once (unlike sets). In this example, we assume a language plugin that supports a base type of integers Z and a family of base types of bags, Bag τ, for any type τ.

We can run this program on specific inputs xs1 = {{1, 2, 3}} and ys1 = {{4}} to obtain output s1. Here, double braces denote a bag containing the elements among the double braces.

s1 = grandTotal xs1 ys1
   = sum {{1, 2, 3, 4}} = 10

This example uses small inputs for simplicity, but in practice they are typically much bigger; we use n to denote the input size. In this case, the asymptotic complexity of computing s is Θ(n).
Consider computing updated output s2 from updated inputs xs2 = {{1, 1, 2, 3}} and ys2 = {{4, 5}}. We could recompute s2 from scratch as

s2 = grandTotal xs2 ys2
   = sum {{1, 1, 2, 3, 4, 5}} = 16

But if the size of the updated inputs is Θ(n), recomputation also takes time Θ(n), and we would like to obtain our result asymptotically faster.

To compute the updated output s2 faster, we assume the changes to the inputs have a description of size dn that is asymptotically smaller than the input size n; that is, dn = o(n). All approaches to incrementalization require small input changes. Incremental computation will then process the input changes, rather than just the new inputs.
10.3 Introducing changes

To talk about the differences between old values and new values, we introduce a few concepts, for now without full definitions. In our approach to incrementalization, we describe changes to values as values themselves: we call such descriptions simply changes. Incremental programs examine changes to inputs to understand how to produce changes to outputs. Just like in STLC we have terms (programs) that evaluate to values, we also have change terms, which evaluate to change values. We require that going from old values to new values preserves types: that is, if an old value v1 has type τ, then also its corresponding new value v2 must have type τ. To each type τ we associate a type of changes, or change type, ∆τ: a change between v1 and v2 must be a value of type ∆τ. Similarly, environments can change: to typing context Γ we associate change typing contexts ∆Γ, such that we can have an environment change dρ : ⟦∆Γ⟧ from ρ1 : ⟦Γ⟧ to ρ2 : ⟦Γ⟧.

Not all descriptions of changes are meaningful, so we also talk about valid changes. Valid changes satisfy additional invariants that are useful during incrementalization. A change value dv can be a valid change from v1 to v2. We can also consider a valid change as an edge from v1 to v2 in a graph associated to τ (where the vertexes are values of type τ), and we call v1 the source of dv and v2 the destination of dv. We only talk of source and destination for valid changes, so a change from v1 to v2 is (implicitly) valid. We'll discuss examples of valid and invalid changes in Examples 10.3.1 and 10.3.2.
We also introduce an operator ⊕ on values and changes: if dv is a valid change from v1 to v2, then v1 ⊕ dv (read as "v1 updated by dv") is guaranteed to return v2. If dv is not a valid change from v1, then v1 ⊕ dv can be defined to some arbitrary value, or not, without any effect on correctness. In practice, if ⊕ detects an invalid input change, it can trigger an error or return a dummy value; in our formalization, we assume for simplicity that ⊕ is total. Again, if dv is not valid from v1 to v1 ⊕ dv, then we do not talk of the source and destination of dv.

We also introduce operator ⊖: given two values v1, v2 for the same type, v2 ⊖ v1 is a valid change from v1 to v2.

Finally, we introduce change composition: if dv1 is a valid change from v1 to v2 and dv2 is a valid change from v2 to v3, then dv1 ⊚ dv2 is a valid change from v1 to v3.

Change operators are overloaded over different types. Coherent definitions of validity and of operators ⊕ and ⊖ for a type τ form a change structure over values of type τ (Definition 13.1.1). For each type τ we'll define a change structure (Definition 13.4.2), and operators will have types ⊕ : τ → ∆τ → τ, ⊖ : τ → τ → ∆τ, and ⊚ : ∆τ → ∆τ → ∆τ.
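As a sketch, these operators can be bundled as a Scala interface; this encoding and its names are ours, the formal counterpart being the change structures developed in Chapter 13:

```scala
// A change structure bundles validity, ⊕ (update), ⊖ (difference)
// and change composition, for base values V and changes D.
trait ChangeStruct[V, D] {
  def valid(v1: V, dv: D): Boolean
  def oplus(v: V, dv: D): V        // v1 ⊕ dv: update v1 by dv
  def ominus(v2: V, v1: V): D      // v2 ⊖ v1: a valid change from v1 to v2
  def compose(dv1: D, dv2: D): D   // sequencing two consecutive changes
}

// Integers with integer changes: every change is valid, ⊕ is addition.
object IntChanges extends ChangeStruct[Int, Int] {
  def valid(v1: Int, dv: Int) = true
  def oplus(v: Int, dv: Int) = v + dv
  def ominus(v2: Int, v1: Int) = v2 - v1
  def compose(dv1: Int, dv2: Int) = dv1 + dv2
}
```

For instance, `IntChanges.oplus(10, IntChanges.ominus(16, 10))` recovers 16, matching the requirement that v2 ⊖ v1 is a valid change from v1 to v2.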
Example 10.3.1 (Changes on integers and bags)
To show how incrementalization affects our example, we next describe valid changes for integers and bags. For now, a change das to a bag as1 simply contains all elements to be added to the initial bag as1 to obtain the updated bag as2 (we'll ignore removing elements for this section and discuss it later). In our example, the change from xs1 (that is {{1, 2, 3}}) to xs2 (that is {{1, 1, 2, 3}}) is dxs = {{1}}, while the change from ys1 (that is {{4}}) to ys2 (that is {{4, 5}}) is dys = {{5}}. To represent the output change ds from s1 to s2, we need integer changes. For now, we represent integer changes as integers, and define ⊕ on integers as addition: v1 ⊕ dv = v1 + dv.

For both bags and integers, a change dv is always valid between v1 and v2 = v1 ⊕ dv; for other changes, however, validity will be more restrictive.
Example 10.3.2 (Changes on naturals)
For instance, say we want to define changes on a type of natural numbers, and we still want to have v1 ⊕ dv = v1 + dv. A change from 3 to 2 should still be −1, so the type of changes must be Z. But the result of ⊕ should still be a natural, that is an integer ≥ 0: to ensure that v1 ⊕ dv ≥ 0, we need to require that dv ≥ −v1. We use this requirement to define validity on naturals: dv ▷ v1 ↪ v1 + dv : N is defined as equivalent to dv ≥ −v1. We can guarantee equation v1 ⊕ dv = v1 + dv not for all changes, but only for valid changes. Conversely, if a change dv is invalid for v1, then v1 + dv < 0. We then define v1 ⊕ dv to be 0, though any other definition on invalid changes would work.¹
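The natural-number example can be played out directly; the clamping of invalid changes to 0 follows the text, while the Scala encoding itself is ours:

```scala
// Changes on naturals: the change type is Z; dv is valid for v1 iff dv ≥ −v1.
def valid(v1: Int, dv: Int): Boolean = dv >= -v1
def oplusN(v1: Int, dv: Int): Int =
  if (valid(v1, dv)) v1 + dv else 0   // arbitrary choice on invalid changes

val a = oplusN(3, -1)   // valid: 3 ⊕ (−1) = 2
val b = oplusN(3, -5)   // invalid (−5 < −3): clamped to 0
```

Only for valid changes does the equation v1 ⊕ dv = v1 + dv hold; on invalid ones, `oplusN` just picks some natural.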
10.3.1 Incrementalizing with changes

After introducing changes and related notions, we describe how we incrementalize our example program.

We consider again the scenario of Sec. 10.2: we need to compute the updated output s2, the result of calling our program grandTotal on updated inputs xs2 and ys2. And we have the initial output s1, from calling our program on initial inputs xs1 and ys1. In this scenario, we can compute s2 non-incrementally, by calling grandTotal on the updated inputs, but we would like to obtain the same result faster. Hence, we compute s2 incrementally: that is, we first compute the output change ds from s1 to s2, then we update the old output s1 by change ds. Successful incremental computation must compute the correct s2 asymptotically faster than non-incremental computation. This speedup is possible because we take advantage of the computation already done to compute s1.

To compute the output change ds from s1 to s2, we propose to transform our base program grandTotal to a new program dgrandTotal, that we call the derivative of grandTotal: to compute ds, we call dgrandTotal on initial inputs and their respective changes. Unlike other approaches to incrementalization, dgrandTotal is a regular program in the same language as grandTotal, hence it can be further optimized with existing technology.

Below, we give the code for dgrandTotal and show that, in this example, incremental computation computes s2 correctly.

For ease of reference, we recall inputs, changes and outputs:
xs1 = {{1, 2, 3}}    dxs = {{1}}    xs2 = {{1, 1, 2, 3}}
ys1 = {{4}}          dys = {{5}}    ys2 = {{4, 5}}
s1 = grandTotal xs1 ys1
   = 10
s2 = grandTotal xs2 ys2
   = 16

¹In fact, we could leave ⊕ undefined on invalid changes. Our original presentation [Cai et al. 2014] in essence restricted ⊕ to valid changes through dependent types, by ensuring that applying it to invalid changes would be ill-typed. Later, Huesca [2015], in similar developments, simply made ⊕ partial on its domain instead of restricting the domain, achieving similar results.
Incremental computation uses the following definitions to compute s2 correctly, and with fewer steps, as desired:

dgrandTotal xs dxs ys dys = sum (merge dxs dys)
ds = dgrandTotal xs1 dxs ys1 dys
   = sum {{1, 5}} = 6
s2 = s1 ⊕ ds = s1 + ds
   = 10 + 6 = 16
Incremental computation should be asymptotically faster than non-incremental computation; hence, the derivative we run should be asymptotically faster than the base program. Here, derivative dgrandTotal is faster simply because it ignores initial inputs altogether. Therefore, its time complexity depends only on the total size of changes, dn. In particular, the complexity of dgrandTotal is Θ(dn) = o(n).
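The whole scenario can be replayed concretely with bags encoded as multiplicity maps; the encoding and helper names here are ours, not the language plugin assumed by the thesis:

```scala
// Bags as maps from elements to multiplicities; merge is additive bag union.
type Bag = Map[Int, Int]
def bag(xs: Int*): Bag = xs.groupMapReduce(identity)(_ => 1)(_ + _)
def merge(a: Bag, b: Bag): Bag =
  (a.keySet ++ b.keySet)
    .map(k => k -> (a.getOrElse(k, 0) + b.getOrElse(k, 0))).toMap
def sum(b: Bag): Int = b.map { case (elem, mult) => elem * mult }.sum

def grandTotal(xs: Bag, ys: Bag): Int = sum(merge(xs, ys))
// Derivative: with additive-only bag changes, it ignores the initial inputs,
// so its cost depends only on the change sizes.
def dgrandTotal(xs: Bag, dxs: Bag, ys: Bag, dys: Bag): Int = sum(merge(dxs, dys))

val (xs1, ys1) = (bag(1, 2, 3), bag(4))
val (dxs, dys) = (bag(1), bag(5))
val s1 = grandTotal(xs1, ys1)             // base output, computed once
val ds = dgrandTotal(xs1, dxs, ys1, dys)  // output change
val s2 = s1 + ds                          // s1 ⊕ ds
```

The incremental result `s2` agrees with recomputing from scratch on the updated bags `merge(xs1, dxs)` and `merge(ys1, dys)`.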
We generate derivatives through a program transformation from terms to terms, which we call differentiation (or sometimes, simply, derivation). We write D⟦t⟧ for the result of differentiating term t. We apply D⟦–⟧ on terms of our non-incremental programs, or base terms, such as grandTotal. To define differentiation, we assume that we already have derivatives for the primitive functions they use; we discuss later how to write such derivatives by hand.

We define differentiation in Definition 10.5.1; some readers might prefer to peek ahead, but we prefer to first explain what differentiation is supposed to do.

A derivative of a function can be applied to initial inputs and changes from initial inputs to updated inputs, and returns a change from an initial output to an updated output. For instance, take derivative dgrandTotal, initial inputs xs1 and ys1, and changes dxs and dys from initial inputs to updated inputs. Then, change dgrandTotal xs1 dxs ys1 dys, that is ds, goes from initial output grandTotal xs1 ys1, that is s1, to updated output grandTotal xs2 ys2, that is s2. And because ds goes from s1 to s2, it follows as a corollary that s2 = s1 ⊕ ds. Hence, we can compute s2 incrementally through s1 ⊕ ds, as we have shown, rather than by evaluating grandTotal xs2 ys2.

We often just say that a derivative of function f maps changes to the inputs of f to changes to the outputs of f, leaving the initial inputs implicit. In short:

Slogan 10.3.3
Term D⟦t⟧ maps input changes to output changes. That is, D⟦t⟧, applied to initial base inputs and valid input changes (from initial inputs to updated inputs), gives a valid output change from t applied on old inputs to t applied on new inputs.

For a generic unary function f : A → B, the behavior of D⟦f⟧ can be described as
f a2 ≃ f a1 ⊕ D⟦f⟧ a1 da (10.1)

or as

f (a1 ⊕ da) ≃ f a1 ⊕ D⟦f⟧ a1 da (10.2)

where da is a metavariable standing for a valid change from a1 to a2 (with a1, a2 : A), and where ≃ denotes denotational equivalence (Definition A.2.5)². Moreover, D⟦f⟧ a1 da is also a valid change, and can hence be used as an argument for operations that require valid changes. These equations follow from Theorem 12.2.2 and Corollary 13.4.5; we iron out the few remaining details to obtain these equations in Sec. 14.1.2.

In our example, we have applied D⟦–⟧ to grandTotal and simplified the result via β-reduction to produce dgrandTotal, as we show in Sec. 10.6. Correctness of D⟦–⟧ guarantees that sum (merge dxs dys) evaluates to a change from sum (merge xs ys), evaluated on old inputs xs1 and ys1, to sum (merge xs ys), evaluated on new inputs xs2 and ys2.

In this section, we have sketched the meaning of differentiation informally. We discuss incrementalization on higher-order terms in Sec. 10.4, and actually define differentiation in Sec. 10.5.
10.4 Differentiation on open terms and functions

We have shown that applying D⟦–⟧ on closed functions produces their derivatives. However, D⟦–⟧ is defined for all terms, hence also for open terms and for non-function types.

Open terms Γ ⊢ t : τ are evaluated with respect to an environment for Γ, and when this environment changes, the result of t changes as well. D⟦t⟧ computes the change to t's output. If Γ ⊢ t : τ, evaluating term D⟦t⟧ requires as input a change environment dρ : ⟦∆Γ⟧, containing changes from the initial environment ρ1 : ⟦Γ⟧ to the updated environment ρ2 : ⟦Γ⟧. The (environment) input change dρ is mapped by D⟦t⟧ to output change dv = ⟦D⟦t⟧⟧ dρ, a change from initial output ⟦t⟧ ρ1 to updated output ⟦t⟧ ρ2. If t is a function, dv maps in turn changes to the function arguments to changes to the function result. All this behavior, again, follows our slogan.
Environment changes contain changes for each variable in Γ. More precisely, if variable x appears with type τ in Γ, and hence in ρ1, ρ2, then dx appears with type ∆τ in ∆Γ, and hence in dρ. Moreover, dρ extends ρ1, to provide D⟦t⟧ with initial inputs and not just with their changes.
The two environments ρ1 and ρ2 can share entries mdash in particular environment change dρ canbe a nil change from ρ to ρ For instance Γ can be empty then ρ1 and ρ2 are also empty (since theymatch Γ) and equal so dρ is a nil change Alternatively some or all the change entries in dρ can benil changes Whenever dρ is a nil change nD n t o o dρ is also a nil change
If t is a function, D⟦t⟧ will be a function change. Changes to functions, in turn, map input changes to output changes, following our Slogan 10.3.3. If a change df from f1 to f2 is applied (via df a1 da) to an input change da from a1 to a2, then df will produce a change dv = df a1 da from v1 = f1 a1 to v2 = f2 a2. The definition of function changes is recursive on types: that is, dv can in turn be a function change, mapping input changes to output changes.
Derivatives are a special case of function changes: a derivative df is simply a change from f to f itself, which maps input changes da from a1 to a2 to output changes dv = df a1 da from f a1 to f a2. This definition coincides with the earlier definition of derivatives, and it also coincides with the definition of function changes for the special case where f1 = f2 = f. That is why D⟦t⟧ produces derivatives if t is a closed function term: we can only evaluate D⟦t⟧ against a nil environment change, producing a nil function change.
Since the concept of function changes can be surprising, we examine it more closely next.
10.4.1 Producing function changes

A first-class function can close over free variables that can change; hence function values themselves can change, and hence we introduce function changes to describe these changes.
²Nitpick: if da is read as an object variable, denotational equivalence will detect that these terms are not equivalent if da maps to an invalid change. Hence we said that da is a metavariable. Later we define denotational equivalence for valid changes (Definition 14.1.2), which gives a less cumbersome way to state such equations.

62 Chapter 10 Introduction to differentiation

For instance, term tf = λx → x + y is a function that closes over y, so different values v for y give rise to different values for f = ⟦tf⟧ (y = v). Take a change dv from v1 = 5 to v2 = 6: different inputs v1 and v2 for y give rise to different outputs f1 = ⟦tf⟧ (y = v1) and f2 = ⟦tf⟧ (y = v2). We describe the difference between outputs f1 and f2 through a function change df from f1 to f2. Consider again Slogan 10.3.3 and how it applies to term f:
Slogan 10.3.3
Term D⟦t⟧ maps input changes to output changes. That is, D⟦t⟧ applied to initial base inputs and valid input changes (from initial inputs to updated inputs) gives a valid output change from t applied on old inputs to t applied on new inputs.
Since y is free in tf, the value for y is an input of tf. So, continuing our example, dtf = D⟦tf⟧ must map a valid input change dv from v1 to v2 for variable y to a valid output change df from f1 to f2; more precisely, we must have df = ⟦dtf⟧ (y = v1, dy = dv).
10.4.2 Consuming function changes

Function changes can not only be produced but also be consumed in programs obtained from D⟦–⟧. We discuss next how.
As discussed, we consider the value for y as an input to tf = λx → x + y. However, we also choose to consider the argument for x as an input (of a different sort) to tf = λx → x + y, and we require our Slogan 10.3.3 to apply to input x too. While this might sound surprising, it works out well. Specifically, since df = ⟦D⟦tf⟧⟧ (y = v1, dy = dv) is a change from f1 to f2, we require df a1 da to be a change from f1 a1 to f2 a2, so df maps base input a1 and input change da to output change df a1 da, satisfying the slogan.

More generally, any valid function change df from f1 to f2 (where f1, f2 : ⟦σ → τ⟧) must in turn be a function that takes an input a1 and a change da, valid from a1 to a2, to a valid change df a1 da from f1 a1 to f2 a2.
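As a concrete (hypothetical) rendering of this example, assume the oversimplified change structure ∆Int = Int with ⊕ = (+), and pass the environment entry y explicitly; then tf and its function change dtf can be sketched as:

```haskell
-- tf = λx → x + y, with the free variable y passed as an explicit argument.
tf :: Int -> Int -> Int
tf y x = x + y

-- dtf takes y and its change dy, then base input x and input change dx,
-- and returns the output change (here, with ∆Int = Int and ⊕ = (+)).
dtf :: Int -> Int -> Int -> Int -> Int
dtf _y dy _x dx = dx + dy
```

With (v1, dv) = (5, 1) and (a1, da) = (3, 2), one can check that tf v1 a1 + dtf v1 dv a1 da equals tf (v1 + dv) (a1 + da): df a1 da is a change from f1 a1 to f2 a2.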
This way, to satisfy our slogan on application t = tf x, we can simply define D⟦–⟧ so that D⟦tf x⟧ = D⟦tf⟧ x dx. Then

⟦D⟦tf x⟧⟧ (y = v1, dy = dv, x = a1, dx = da) = ⟦D⟦tf⟧⟧ (y = v1, dy = dv) a1 da = df a1 da

As required, that's a change from f1 a1 to f2 a2.
Overall, valid function changes preserve validity, just like D⟦t⟧ in Slogan 10.3.3, and map valid input changes to valid output changes. In turn, output changes can be function changes; since they are valid, they in turn map valid changes to their inputs to valid output changes (as we'll see in Lemma 12.1.10). We'll later formalize this and define validity by recursion on types, that is, as a logical relation (see Sec. 12.1.4).
10.4.3 Pointwise function changes

It might seem more natural to describe a function change df′ between f1 and f2 via df′ = λx → f2 x ⊖ f1 x. We call such a df′ a pointwise change. While some might find pointwise changes a more natural concept, the output of differentiation often needs to compute f2 x2 ⊖ f1 x1, using a change df from f1 to f2 and a change dx from x1 to x2. Function changes perform this job directly. We discuss this point further in Sec. 15.3.
10.4.4 Passing change targets

It would be more symmetric to make function changes also take the updated input a2, that is, have df a1 da a2 compute a change from f1 a1 to f2 a2. However, passing a2 explicitly adds no information: the value a2 can be computed from a1 and da as a1 ⊕ da. Indeed, in various cases
a function change can compute its required output without actually computing a1 ⊕ da. Since we expect the size of a1 and a2 to be asymptotically larger than that of da, actually computing a2 could be expensive.³ Hence, we stick to our asymmetric form of function changes.
10.5 Differentiation, informally

Next we define differentiation and explain informally why it does what we want. We then give an example of how differentiation applies to our example. A formal proof will follow soon in Sec. 12.2, justifying more formally why this definition is correct, but we proceed more gently.
Definition 10.5.1 (Differentiation)
Differentiation is the following term transformation:

D⟦λ(x : σ) → t⟧ = λ(x : σ) (dx : ∆σ) → D⟦t⟧
D⟦s t⟧ = D⟦s⟧ t D⟦t⟧
D⟦x⟧ = dx
D⟦c⟧ = DC⟦c⟧
where DC⟦c⟧ defines differentiation on primitives and is provided by language plugins (see Appendix A.2.5), and dx stands for a variable generated by prefixing x's name with d, so that D⟦y⟧ = dy, and so on.
If we extend the language with (non-recursive) let-bindings, we can give derived rules for it, such as:

D⟦let x = t1 in t2⟧ = let x = t1
                          dx = D⟦t1⟧
                      in D⟦t2⟧

In Sec. 15.1 we will explain that the same transformation rules apply for recursive let-bindings.
If t contains occurrences of both (say) x and dx, capture issues arise in D⟦t⟧. We defer these issues to Sec. 12.3.4, and assume throughout that such issues can be avoided by α-renaming and the Barendregt convention [Barendregt 1984].
This transformation might seem deceptively simple. Indeed, pure λ-calculus only handles binding and higher-order functions, leaving "real work" to primitives. Similarly, our transformation incrementalizes binding and higher-order functions, leaving "real work" to derivatives of primitives. However, our support of λ-calculus allows us to glue primitives together. We'll discuss later how we add support for various primitives and families of primitives.
Now we try to motivate the transformation informally. We claimed that D⟦–⟧ must satisfy Slogan 10.3.3, which reads:

Slogan 10.3.3
Term D⟦t⟧ maps input changes to output changes. That is, D⟦t⟧ applied to initial base inputs and valid input changes (from initial inputs to updated inputs) gives a valid output change from t applied on old inputs to t applied on new inputs.
Let's analyze the definition of D⟦–⟧ by case analysis on the input term u. In each case, we assume that our slogan applies to any subterms of u, and sketch why it applies to u itself.
³We show later efficient change structures where ⊕ reuses part of a1 to output a2 in logarithmic time.
• if u = x, by our slogan D⟦x⟧ must evaluate to the change of x when inputs change, so we set D⟦x⟧ = dx.

• if u = c, we simply delegate differentiation to DC⟦c⟧, which is defined by plugins. Since plugins can define arbitrary primitives, they need to provide their derivatives.

• if u = λx → t, then u introduces a function. Assume for simplicity that u is a closed term. Then D⟦t⟧ evaluates to the change of the result of this function u, evaluated in a context binding x and its change dx. Then, because of how function changes are defined, the change of u is the change of output t as a function of the base input x and its change dx, that is, D⟦u⟧ = λx dx → D⟦t⟧.

• if u = s t, then s is a function. Assume for simplicity that u is a closed term. Then D⟦s⟧ evaluates to the change of s, as a function of D⟦s⟧'s base input and input change. So we apply D⟦s⟧ to its actual base input t and actual input change D⟦t⟧, and obtain D⟦s t⟧ = D⟦s⟧ t D⟦t⟧.
This is not quite a correct proof sketch, because of many issues, but we fix these issues with our formal treatment in Sec. 12.2. In particular, in the case for abstraction u = λx → t, D⟦t⟧ depends not only on x and dx, but also on other free variables of u and their changes. Similarly, we must deal with free variables also in the case for application u = s t. But first, we apply differentiation to our earlier example.
10.6 Differentiation on our example

To exemplify the behavior of differentiation concretely, and to help fix ideas for later discussion, in this section we show what the derivative of grandTotal looks like.
grandTotal = λxs ys → sum (merge xs ys)
s = grandTotal {{1, 2, 3}} {{4}} = 10
Differentiation is a structurally recursive program transformation, so we first compute D⟦merge xs ys⟧. To compute its change, we simply call the derivative of merge, that is dmerge, and apply it to the base inputs and their changes; hence we write
D⟦merge xs ys⟧ = dmerge xs dxs ys dys
As we'll see in more detail later, we can define function dmerge as
dmerge = λxs dxs ys dys → merge dxs dys
so D⟦merge xs ys⟧ can be simplified by β-reduction to merge dxs dys:
D⟦merge xs ys⟧
= dmerge xs dxs ys dys
=β (λxs dxs ys dys → merge dxs dys) xs dxs ys dys
=β merge dxs dys
Let's next derive sum (merge xs ys). First, like above, the derivative of sum zs would be dsum zs dzs, which depends on base input zs and its change dzs. As we'll see, dsum zs dzs can simply call sum on dzs, so dsum zs dzs = sum dzs. To derive sum (merge xs ys), we must call
the derivative of sum, that is dsum, on its base argument and its change: so, on merge xs ys and D⟦merge xs ys⟧. We can later simplify again by β-reduction and obtain
D⟦sum (merge xs ys)⟧
= dsum (merge xs ys) D⟦merge xs ys⟧
=β sum D⟦merge xs ys⟧
= sum (dmerge xs dxs ys dys)
=β sum (merge dxs dys)
Here we see that the output of differentiation is defined in a bigger typing context: while merge xs ys only depends on base inputs xs and ys, D⟦merge xs ys⟧ also depends on their changes. This property extends beyond the examples we just saw: if a term t is defined in context Γ, then the output of derivation D⟦t⟧ is defined in context Γ, ∆Γ, where ∆Γ is a context that binds a change dx for each base input x bound in the context Γ.
Next, we consider λxs ys → sum (merge xs ys). Since xs, dxs, ys, dys are free in D⟦sum (merge xs ys)⟧ (ignoring later optimizations), term D⟦λxs ys → sum (merge xs ys)⟧ must bind all those variables.

D⟦λxs ys → sum (merge xs ys)⟧
= λxs dxs ys dys → D⟦sum (merge xs ys)⟧
=β λxs dxs ys dys → sum (merge dxs dys)
Next, we need to transform the binding of grandTotal to its body b = λxs ys → sum (merge xs ys). We copy this binding and add a new additional binding from dgrandTotal to the derivative of b:
grandTotal = λxs ys → sum (merge xs ys)
dgrandTotal = λxs dxs ys dys → sum (merge dxs dys)
Finally, we need to transform the binding of s and its body. By iterating similar steps, in the end we get:
grandTotal = λxs ys → sum (merge xs ys)
dgrandTotal = λxs dxs ys dys → sum (merge dxs dys)
s = grandTotal {{1, 2, 3}} {{4}}
ds = dgrandTotal {{1, 2, 3}} {{1}} {{4}} {{5}}
   =β sum (merge {{1}} {{5}})
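The derived bindings can be checked executably. The following sketch is ours, not from the thesis: it models bags as Haskell lists, merge as list concatenation, and applying a bag change as concatenating the added elements:

```haskell
-- grandTotal and its derivative, with bags modeled as lists of Ints.
grandTotal :: [Int] -> [Int] -> Int
grandTotal xs ys = sum (xs ++ ys)

-- dgrandTotal ignores its base inputs: it is self-maintainable.
dgrandTotal :: [Int] -> [Int] -> [Int] -> [Int] -> Int
dgrandTotal _xs dxs _ys dys = sum (dxs ++ dys)
```

Updating the old result with the output change agrees with recomputation: grandTotal [1,2,3] [4] + dgrandTotal [1,2,3] [1] [4] [5] equals grandTotal ([1,2,3] ++ [1]) ([4] ++ [5]).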
Self-maintainability Differentiation does not always produce efficient derivatives without further program transformations; in particular, derivatives might need to recompute results produced by the base program. In the above example, if we don't inline derivatives and use β-reduction to simplify programs, D⟦sum (merge xs ys)⟧ is just dsum (merge xs ys) D⟦merge xs ys⟧. A direct execution of this program will compute merge xs ys, which would waste time linear in the base inputs.
We'll show how to avoid such recomputations in general in Sec. 17.2; but here we can avoid computing merge xs ys simply because dsum does not use its base argument, that is, it is self-maintainable. Without the approach described in Chapter 17, we are restricted to self-maintainable derivatives.
10.7 Conclusion and contributions

In this chapter, we have seen how a correct differentiation transform allows us to incrementalize programs.
10.7.1 Navigating this thesis part

Differentiation This chapter and Chapters 11 to 14 form the core of incrementalization theory, with other chapters building on top of them. We study incrementalization for STLC; we summarize our formalization of STLC in Appendix A. Together with Chapter 10, Chapter 11 introduces the overall approach to incrementalization informally. Chapters 12 and 13 show that incrementalization using ILC is correct. Building on a set-theoretic semantics, these chapters also develop the theory underlying this correctness proof and further results. Equational reasoning on terms is then developed in Chapter 14.
Chapters 12 to 14 contain full formal proofs; readers are welcome to skip or skim those proofs where appropriate. For ease of reference, and to help navigation and skimming, these chapters highlight and number all definitions, notations, theorem statements and so on, and we strive not to hide important definitions inside theorems.
Later chapters build on this core but are again independent from each other
Extensions and theoretical discussion Chapter 15 discusses a few assorted aspects of the theory that do not fit elsewhere and do not suffice for standalone chapters. We show how to differentiate general recursion (Sec. 15.1); we exhibit a function change that is not valid for any function (Sec. 15.2); we contrast our representation of function changes with pointwise function changes (Sec. 15.3); and we compare our formalization with the one presented in [Cai et al. 2014] (Sec. 15.4).
Performance of differentiated programs Chapter 16 studies how to apply differentiation (as introduced in previous chapters) to incrementalize a case study, and empirically evaluates performance speedups.
Differentiation with cache-transfer-style conversion Chapter 17 is self-contained, even though it builds on the rest of the material; it summarizes the basics of ILC in Sec. 17.2.1. Terminology in that chapter is sometimes subtly different from the rest of the thesis.
Towards differentiation for System F Chapter 18 outlines how to extend differentiation to System F and suggests a proof avenue. Differentiation for System F requires a generalization of change structures to changes across different types (Sec. 18.4), which appears of independent interest, though further research remains to be done.
Related work, conclusions and appendixes Finally, Sec. 17.6 and Chapter 19 discuss related work, and Chapter 20 concludes this part of the thesis.
10.7.2 Contributions

In Chapters 11 to 14 and Chapter 16, we make the following contributions:
• We present a novel mathematical theory of changes and derivatives, which is more general than other work in the field, because changes are first-class entities, they are distinct from base values, and they are defined also for functions.
• We present the first approach to incremental computation for pure λ-calculi by a source-to-source transformation, D, that requires no run-time support. The transformation produces an incremental program in the same language; all optimization techniques for the original program are applicable to the incremental program as well.

• We prove that our incrementalizing transformation D is correct, by a novel machine-checked logical relation proof, mechanized in Agda.

• While we focus mainly on the theory of changes and derivatives, we also perform a performance case study. We implement the derivation transformation in Scala, with a plugin architecture that can be extended with new base types and primitives. We define a plugin with support for different collection types, and use the plugin to incrementalize a variant of the MapReduce programming model [Lämmel 2007]. Benchmarks show that, on this program, incrementalization can reduce asymptotic complexity and can turn O(n) performance into O(1), improving running time by over 4 orders of magnitude on realistic inputs (Chapter 16).
In Chapter 17, we make the following contributions:
• via examples, we motivate extending ILC to remember intermediate results (Sec. 17.2);

• we give a novel proof of correctness for ILC for untyped λ-calculus, based on step-indexed logical relations (Sec. 17.3.3);

• building on top of ILC-style differentiation, we show how to transform untyped higher-order programs to cache-transfer style (CTS) (Sec. 17.3.5);

• we show through formal proofs that programs and derivatives in cache-transfer style simulate correctly their non-CTS variants (Sec. 17.3.6);

• we perform performance case studies (in Sec. 17.4), applying (by hand) an extension of this technique to Haskell programs, and incrementalize efficiently also programs that do not admit self-maintainable derivatives.
Chapter 18 describes how to extend differentiation to System F. To this end, we extend change structures to allow changes where source and destination have different types, and we enable defining more powerful combinators for change structures. While the results in this chapter call for further research, we consider them exciting.
Appendix C proposes the first correctness proofs for ILC via operational methods and (step-indexed) logical relations, for simply-typed λ-calculus (without and with general recursion) and for untyped λ-calculus. A later variant of this proof, adapted for use of cache-transfer style, is presented in Chapter 17, but we believe the presentation in Appendix C might also be interesting for theorists, as it explains how to extend ILC correctness proofs with fewer extraneous complications. Nevertheless, given the similarity between the proofs in Appendix C and Chapter 17, we relegate the former to an appendix.
Finally, Appendix D shows how to implement all change operations on function changes efficiently, by defunctionalizing functions and function changes, rather than using mere closure conversion.
Chapter 11
A tour of differentiation examples
Before formalizing ILC, we show more examples of change structures and primitives, to show (a) designs for reusable primitives and their derivatives, (b) to what extent we can incrementalize basic building blocks such as recursive functions and algebraic data types, and (c) to sketch how we can incrementalize collections efficiently. We make no attempt at incrementalizing a complete collection API here; we briefly discuss more complete implementations in Chapter 16 and Sec. 17.4.
To describe these examples informally, we use Haskell notation and let-polymorphism as appropriate (see Appendix A.2).
We also motivate a few extensions to differentiation that we describe later. As we'll see in this chapter, we'll need to enable some forms of introspection on function changes to manipulate the embedded environments, as we discuss in Appendix D. We will also need ways to remember intermediate results, which we will discuss in Chapter 17. We will also use overly simplified change structures to illustrate a few points.
11.1 Change structures as type-class instances
We encode change structures, as sketched earlier in Sec. 10.3, through a type class named ChangeStruct. An instance ChangeStruct t defines a change type ∆t as an associated type, and operations ⊕ and ⊖ are defined as methods. We also define method oreplace, such that oreplace v2 produces a replacement change from any source to v2; by default, v2 ⊖ v1 is simply an alias for oreplace v2.
class ChangeStruct t where
  type ∆t
  (⊕) :: t → ∆t → t
  oreplace :: t → ∆t
  (⊖) :: t → t → ∆t
  v2 ⊖ v1 = oreplace v2
  (⊙) :: ∆t → ∆t → ∆t -- change composition
  0 :: t → ∆t -- nil change of a value
In this chapter, we will often show change structures where only some methods are defined; in actual implementations, we use a type-class hierarchy to encode what operations are available, but we collapse this hierarchy here to simplify presentation.
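For illustration, here is a minimal (hypothetical) instance for Int, defining only ⊕ and ⊖ (spelled oplus/ominus in plain ASCII), with ∆Int = Int:

```haskell
{-# LANGUAGE TypeFamilies #-}

-- A cut-down version of the class, with only ⊕ and ⊖.
class ChangeStruct t where
  type Delta t
  oplus  :: t -> Delta t -> t   -- ⊕
  ominus :: t -> t -> Delta t   -- ⊖

-- Oversimplified change structure for Int: a change is the difference.
instance ChangeStruct Int where
  type Delta Int = Int
  oplus  = (+)
  ominus = (-)
```

By construction, x `oplus` (y `ominus` x) equals y for all Ints, so ominus produces a valid change from its second argument to its first.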
11.2 How to design a language plugin

When adding support for a datatype T, we will strive to define both a change structure and derivatives of introduction and elimination forms for T, since such forms constitute a complete API for using that datatype. However, we will sometimes have to restrict elimination forms to scenarios that can be incrementalized efficiently.
In general, to differentiate a primitive f :: A → B, once we have defined a change structure for A, we can start by defining

df a1 da = f (a1 ⊕ da) ⊖ f a1    (11.1)
where da is a valid change from a1 to a2. We then try to simplify and rewrite the expression using equational reasoning, so that it does not refer to ⊖ any more, as far as possible. We can assume that all argument changes are valid, especially if that allows producing faster derivatives; we formalize equational reasoning for valid changes in Sec. 14.1.1. In fact, instead of defining f a2 ⊖ f a1 and simplifying it to not use ⊖, it is sufficient to produce a change from f a1 to f a2, even a different one. We write da1 =∆ da2 to mean that changes da1 and da2 are equivalent, that is, they have the same source and destination. We define this concept properly in Sec. 14.2.
We try to avoid running ⊖ on arguments of non-constant size, since it might easily take time linear or superlinear in the argument sizes; if ⊖ produces replacement values, it completes in constant time, but derivatives invoked on the produced changes are not efficient.
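As a toy illustration of this recipe (our own example, not from the text), take f x = 10 * x over the oversimplified change structure ∆Int = Int with ⊕ = (+) and ⊖ = (-); starting from Eq. (11.1), ⊖ simplifies away:

```haskell
f :: Int -> Int
f x = 10 * x

-- df a1 da = f (a1 ⊕ da) ⊖ f a1
--          = 10 * (a1 + da) - 10 * a1
--          = 10 * da
df :: Int -> Int -> Int
df _a1 da = 10 * da  -- no ⊖ left; the derivative ignores the base input
</imports>
```

One can check that f a1 + df a1 da equals f (a1 + da) for any inputs, as the slogan requires.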
11.3 Incrementalizing a collection API

In this section, we describe a collection API that we incrementalize (partially) in this chapter.
To avoid notation conflicts, we represent lists via datatype List a, defined as follows:
data List a = Nil | Cons a (List a)
We also consider as primitive operation a standard mapping function, map. We also support two restricted forms of aggregation: (a) folding over an abelian group via fold, similar to how one usually folds over a monoid;¹ (b) list concatenation via concat. We will not discuss how to differentiate concat, as we reuse existing solutions by Firsov and Jeltsch [2016].
singleton :: a → List a
singleton x = Cons x Nil

map :: (a → b) → List a → List b
map f Nil = Nil
map f (Cons x xs) = Cons (f x) (map f xs)

fold :: AbelianGroupChangeStruct b ⇒ List b → b
fold Nil = mempty
fold (Cons x xs) = x ⋄ fold xs -- where ⋄ is infix for mappend

concat :: List (List a) → List a
concat = ...
While usually fold requires only an instance Monoid b of type class Monoid to aggregate collection elements, our variant of fold requires an instance of type class AbelianGroupChangeStruct, a subclass of Monoid. This type class is not used by fold itself, but only by its derivative, as we explain in Sec. 11.3.2; nevertheless, we add this stronger constraint to fold itself, because we forbid derivatives
¹https://hackage.haskell.org/package/base-4.9.1.0/docs/Data-Foldable.html
with stronger type-class constraints. With this approach, all clients of fold can be incrementalized using differentiation.
Using those primitives, one can define further higher-order functions on collections, such as concatMap, filter, foldMap. In turn, these functions form the kernel of a collection API, as studied for instance by work on the monoid comprehension calculus [Grust and Scholl 1996a; Fegaras and Maier 1995; Fegaras 1999], even if they are not complete by themselves.
concatMap :: (a → List b) → List a → List b
concatMap f = concat ∘ map f

filter :: (a → Bool) → List a → List a
filter p = concatMap (λx → if p x then singleton x else Nil)

foldMap :: AbelianGroupChangeStruct b ⇒ (a → b) → List a → b
foldMap f = fold ∘ map f
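These definitions can be sanity-checked on Haskell's built-in lists (a sketch: we use [] in place of List a, and primed names to avoid clashing with the Prelude):

```haskell
singleton' :: a -> [a]
singleton' x = [x]

-- filter defined via concatMap, as in the text.
filter' :: (a -> Bool) -> [a] -> [a]
filter' p = concatMap (\x -> if p x then singleton' x else [])
```

For example, filter' even [1 .. 6] yields the even elements, matching the Prelude's filter.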
In first-order DSLs such as SQL, such functionality must typically be added through separate primitives (consider for instance filter), while here we can simply define, for instance, filter on top of concatMap, and incrementalize the resulting definitions using differentiation.
Function filter uses conditionals, which we haven't discussed yet; we show how to incrementalize filter successfully in Sec. 11.6.
11.3.1 Changes to type-class instances

In this whole chapter, we assume that type-class instances, such as fold's AbelianGroupChangeStruct argument, do not undergo changes. Since type-class instances are closed, top-level definitions of operations, and are canonical for a datatype, it is hard to imagine a change to a type-class instance. On the other hand, type-class instances can be encoded as first-class values. We can for instance imagine a fold taking a unit value and an associative operation as arguments. In such scenarios, one needs additional effort to propagate changes to operation arguments, similarly to changes to the function argument to map.
11.3.2 Incrementalizing aggregation

Let's now discuss how to incrementalize fold. We consider an oversimplified change structure that allows only two sorts of changes, prepending an element to a list or removing the list head of a non-empty list, and study how to incrementalize fold for such changes:
data ListChange a = Prepend a | Removeinstance ChangeStruct (List a) wheretype ∆(List a) = ListChange axs oplus Prepend x = Cons x xs(Cons x xs) oplus Remove = xsNil oplus Remove = error Invalid change
Removing an element from an empty list is an invalid change, hence it is safe to give an error in that scenario, as mentioned when introducing ⊕ (Sec. 10.3).
By using equational reasoning, as suggested in Sec. 11.2, starting from Eq. (11.1), one can show formally that dfold xs (Prepend x) should be a change that, in a sense, "adds" x to the result, using group operations:
dfold xs (Prepend x)
  =∆ fold (xs ⊕ Prepend x) ⊖ fold xs
  = fold (Cons x xs) ⊖ fold xs
  = (x ⋄ fold xs) ⊖ fold xs
Similarly, dfold (Cons x xs) Remove should instead "subtract" x from the result:
dfold (Cons x xs) Remove
  =∆ fold (Cons x xs ⊕ Remove) ⊖ fold (Cons x xs)
  = fold xs ⊖ fold (Cons x xs)
  = fold xs ⊖ (x ⋄ fold xs)
As discussed, using ⊖ is fast enough on, say, integers or other primitive types, but not in general. To avoid using ⊖, we must rewrite its invocation to an equivalent expression. In this scenario, we can use group changes for abelian groups, and restrict fold to situations where such changes are available:
dfold :: AbelianGroupChangeStruct b ⇒ List b → ∆(List b) → ∆b
dfold xs (Prepend x) = inject x
dfold (Cons x xs) Remove = inject (invert x)
dfold Nil Remove = error "Invalid change"
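As a runnable sketch (simplified from the text), specialize to the abelian group (Int, +, negate), so that inject is the identity and a result change is itself an Int delta:

```haskell
data ListChange a = Prepend a | Remove

fold' :: [Int] -> Int            -- fold for the group (Int, +, negate)
fold' = sum

-- Constant-time derivative of fold' for the two change constructors.
dfold' :: [Int] -> ListChange Int -> Int
dfold' _xs     (Prepend x) = x          -- inject x
dfold' (x : _) Remove      = negate x   -- inject (invert x)
dfold' []      Remove      = error "Invalid change"

-- ⊕ on lists, for checking the derivative against recomputation.
applyChange :: [Int] -> ListChange Int -> [Int]
applyChange xs       (Prepend x) = x : xs
applyChange (_ : xs) Remove      = xs
applyChange []       Remove      = error "Invalid change"
```

Here dfold' runs in constant time, while recomputing fold' on the updated list takes time linear in its length.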
To support group changes, we define the following type classes to model abelian groups and group change structures, omitting APIs for more general groups. AbelianGroupChangeStruct only requires that group elements of type g can be converted into changes (type ∆g), allowing change type ∆g to contain other sorts of changes.
class Monoid g ⇒ AbelianGroup g where
  invert :: g → g

class (AbelianGroup a, ChangeStruct a) ⇒
    AbelianGroupChangeStruct a where
  -- Inject group elements into changes. Law:
  -- a ⊕ inject b = a ⋄ b
  inject :: a → ∆a
Chapter 16 discusses how we can use group changes without assuming a single group is defined on elements; but here we simply select the canonical group, as chosen by type-class resolution. To use a different group, as usual, one defines a different but isomorphic type via the Haskell newtype construct. As a downside, derivatives of newtype constructors must convert changes across the different representations.
Rewriting ⊖ away can also be possible for other specialized folds, though sometimes they can be incrementalized directly; for instance, map can be written as a fold. Incrementalizing map for the insertion of x into xs requires simplifying map f (Cons x xs) ⊖ map f xs. To avoid ⊖, we can rewrite this change statically to Insert (f x); indeed, we can incrementalize map also for more realistic change structures.
Associative tree folds Other usages of fold over sequences produce a result of small bounded size (such as integers). In this scenario, one can incrementalize the given fold efficiently using ⊖, instead of relying on group operations. For such scenarios, one can design a primitive
foldMonoid :: Monoid a ⇒ List a → a
for associative tree folds, that is, a function that folds over the input sequence using a monoid (that is, an associative operation with a unit). For efficient incrementalization, foldMonoid's intermediate results should form a balanced tree, and updating this tree should take logarithmic time; one approach to ensure this is to represent the input sequence itself using a balanced tree, such as a finger tree [Hinze and Paterson 2006].
Various algorithms store intermediate results of folding inside an input balanced tree, as described by Cormen et al. [2001, Ch. 14] or by Hinze and Paterson [2006]. But intermediate results can also be stored outside the input tree, as is commonly done using self-adjusting computation [Acar 2005, Sec. 9.1], or as can be done in our setting. While we do not use such folds, we describe the existing algorithms briefly and sketch how to integrate them in our setting.
Function foldMonoid must record the intermediate results, and the derivative dfoldMonoid must propagate input changes to affected intermediate results.² To study the time complexity of input change propagation, it is useful to consider the dependency graph of intermediate results: in this graph, an intermediate result v1 has an arc to intermediate result v2 if and only if computing v1 depends on v2. To ensure dfoldMonoid is efficient, the dependency graph of intermediate results from foldMonoid must form a balanced tree of logarithmic height, so that changes to a leaf only affect a logarithmic number of intermediate results.
In contrast, implementing foldMonoid using foldr on a list produces an unbalanced graph of intermediate results. For instance, take input list xs = [1 .. 8], containing numbers from 1 to 8, and assume a given monoid ⋄. Summing them with foldr (⋄) mempty xs means evaluating

1 ⋄ (2 ⋄ (3 ⋄ (4 ⋄ (5 ⋄ (6 ⋄ (7 ⋄ (8 ⋄ mempty)))))))
Then a change to the last element of input xs affects all intermediate results, hence incrementalization takes at least O(n). In contrast, using foldAssoc on xs should evaluate a balanced tree similar to

((1 ⋄ 2) ⋄ (3 ⋄ 4)) ⋄ ((5 ⋄ 6) ⋄ (7 ⋄ 8))
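A balanced bracketing like the one above can be produced by a simple divide-and-conquer fold; the following sketch (the function name foldMonoidBalanced is ours) shows the idea, though it does not yet record intermediate results for reuse:

```haskell
import Data.Monoid (Sum (..))

-- Fold a list with a monoid, combining halves recursively, so the
-- dependency graph of intermediate results has logarithmic height.
foldMonoidBalanced :: Monoid a => [a] -> a
foldMonoidBalanced []  = mempty
foldMonoidBalanced [x] = x
foldMonoidBalanced xs  = foldMonoidBalanced ls <> foldMonoidBalanced rs
  where (ls, rs) = splitAt (length xs `div` 2) xs
```

On xs = [1 .. 8] with the Sum monoid, this computes the same total as foldr, but each element now sits under only O(log n) intermediate results.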
so that any individual change to a leaf, insertion, or deletion only affects O(log n) intermediate results (where n is the sequence size). Upon modifications to the tree, one must ensure that the balancing is stable [Acar 2005, Sec. 9.1]. In other words, altering the tree (by inserting or removing an element) must only alter O(log n) nodes.
We have implemented associative tree folds on very simple but unbalanced tree structures; we believe they could be implemented and incrementalized over balanced trees representing sequences, such as finger trees or random-access zippers [Headley and Hammer 2016], but doing so requires transforming the implementation of their data structure to cache-transfer style (CTS) (Chapter 17). We leave this for future work, together with an automated implementation of the CTS transformation.
11.3.3 Modifying list elements

In this section, we consider another change structure on lists, which allows expressing changes to individual elements. Then we present dmap, the derivative of map for this change structure. Finally, we sketch informally the correctness of dmap, which we prove formally in Example 14.1.1.
We can then represent changes to a list (List a) as a list of changes (List (∆a)), one for each element. A list change dxs is valid for source xs if they have the same length and each element change is valid for its corresponding element. For this change structure, we can define ⊕ and 0, but not a total ⊖: such list changes can't express the difference between two lists of different lengths. Nevertheless, this change structure is sufficient to define derivatives that act correctly
²We discuss in Chapter 17 how base functions communicate results to derivatives.
on the changes that can be expressed. We can describe this change structure in Haskell using a type-class instance for class ChangeStruct:
instance ChangeStruct a ⇒ ChangeStruct (List a) where
  type ∆(List a) = List (∆a)
  Nil ⊕ Nil = Nil
  (Cons x xs) ⊕ (Cons dx dxs) = Cons (x ⊕ dx) (xs ⊕ dxs)
  _ ⊕ _ = Nil
The following dmap function is a derivative for the standard map function (included for reference) and the given change structure. We discuss derivatives for recursive functions in Sec. 15.1.
map :: (a → b) → List a → List b
map f Nil = Nil
map f (Cons x xs) = Cons (f x) (map f xs)

dmap :: (a → b) → ∆(a → b) → List a → ∆(List a) → ∆(List b)
-- A valid list change has the same length as the base list
dmap f df Nil Nil = Nil
dmap f df (Cons x xs) (Cons dx dxs) =
  Cons (df x dx) (dmap f df xs dxs)
-- Remaining cases deal with invalid changes, and a dummy
-- result is sufficient
dmap f df xs dxs = Nil
Function dmap is a correct derivative of map for this change structure, according to Slogan 10.3.3; we sketch an informal argument by induction. The equation for dmap f df Nil Nil returns Nil, a valid change from initial to updated outputs, as required. In the equation for dmap f df (Cons x xs) (Cons dx dxs), we compute changes to the head and tail of the result, to produce a change from map f (Cons x xs) to map (f ⊕ df) (Cons x xs ⊕ Cons dx dxs). To this end, (a) we use df x dx to compute a change to the head of the result, from f x to (f ⊕ df) (x ⊕ dx); (b) we use dmap f df xs dxs recursively to compute a change to the tail of the result, from map f xs to map (f ⊕ df) (xs ⊕ dxs); (c) we assemble the changes to head and tail with Cons into a change from map f (Cons x xs) to map (f ⊕ df) (Cons x xs ⊕ Cons dx dxs). In other words, dmap turns input changes to output changes correctly according to our Slogan 10.3.3: it is a correct derivative for map according to this change structure. We have reasoned informally; we formalize this style of reasoning in Sec. 14.1. Crucially, our conclusions only hold if input changes are valid, hence term map f xs ⊕ dmap f df xs dxs is not denotationally equal to map (f ⊕ df) (xs ⊕ dxs) for arbitrary change environments: these two terms only evaluate to the same result for valid input changes.
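To make this argument concrete, here is a small executable sketch in ASCII Haskell. All encoding choices here are ours, for illustration only: elements are Ints, element changes are Int deltas, and ⊕ on lists is spelled out as the function oplusL.

```haskell
-- Minimal monomorphic sketch of map/dmap for Int lists, where an
-- element change is an Int delta and a list change is an equal-length
-- list of deltas (names and encoding are ours, for illustration).
data List a = Nil | Cons a (List a) deriving (Eq, Show)

mapL :: (a -> b) -> List a -> List b
mapL _ Nil = Nil
mapL f (Cons x xs) = Cons (f x) (mapL f xs)

-- dmap, as in the text: apply the function change df elementwise.
dmapL :: (a -> b) -> (a -> da -> db) -> List a -> List da -> List db
dmapL _ _ Nil Nil = Nil
dmapL f df (Cons x xs) (Cons dx dxs) = Cons (df x dx) (dmapL f df xs dxs)
dmapL _ _ _ _ = Nil  -- invalid change: a dummy result is sufficient

-- oplus for Int lists: update each element by its delta.
oplusL :: List Int -> List Int -> List Int
oplusL (Cons x xs) (Cons dx dxs) = Cons (x + dx) (oplusL xs dxs)
oplusL _ _ = Nil

fromList :: [a] -> List a
fromList = foldr Cons Nil
```

For instance, with f = (* 2) and its derivative df x dx = 2 * dx, one can check on concrete valid inputs that mapL f xs `oplusL` dmapL f df xs dxs equals mapL f (xs `oplusL` dxs), as the informal correctness argument predicts.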
Since this definition of dmap is a correct derivative, we could use it in an incremental DSL for list manipulation, together with other primitives. Because of limitations we describe next, we will instead use improved language plugins for sequences.
11.3.4 Limitations

We have shown simplified list changes, but they have a few limitations. Fixing those requires more sophisticated definitions.

As discussed, our list changes intentionally forbid changing the length of a list. And our definition of dmap has further limitations: a change to a list of n elements takes size O(n), even when most elements do not change, and calling dmap f df on it requires n calls to df. This is only faster if df is faster than f, but it adds no further speedup.

We can instead describe a change to an arbitrary list element x in xs by giving the change dx and the position of x in xs. A list change is then a sequence of such changes:
type ∆(List a) = List (AtomicChange a)
data AtomicChange a = Modify ℤ (∆a)
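A sketch of how such positional changes could be applied, using plain Haskell lists and Int deltas for element changes (both assumptions of this illustration, not the thesis's actual representation):

```haskell
-- An atomic change modifies the element at one position; a list
-- change is a sequence of atomic changes (applied left to right).
data AtomicChange = Modify Int Int  -- position, Int delta for that element
  deriving (Eq, Show)

applyAtomic :: [Int] -> AtomicChange -> [Int]
applyAtomic xs (Modify i dx) =
  case splitAt i xs of
    (before, x : after) -> before ++ (x + dx) : after
    _                   -> xs  -- out-of-range position: invalid change, ignored

applyListChange :: [Int] -> [AtomicChange] -> [Int]
applyListChange = foldl applyAtomic
```

Note that applyAtomic must traverse the list up to position i, which is exactly the linear-time cost that motivates switching to a better sequence representation.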
However, fetching the i-th list element still takes time linear in i: we need a better representation of sequences. In the next section, we switch to a change structure on sequences defined in terms of finger trees [Hinze and Paterson, 2006], following Firsov and Jeltsch [2016].

11.4 Efficient sequence changes

Firsov and Jeltsch [2016] define an efficient representation of list changes in a framework similar to ILC, and incrementalize selected operations over this change structure. They also provide combinators to assemble further operations on top of the provided ones. We extend their framework to handle function changes, and generate derivatives for all functions that can be expressed in terms of the primitives.

Conceptually, a change for type Sequence a is a sequence of atomic changes. Each atomic change inserts one element at a given position, or removes one element, or changes the element at one position.³
data SeqSingleChange a
  = Insert   { idx :: ℤ, x :: a }
  | Remove   { idx :: ℤ }
  | ChangeAt { idx :: ℤ, dx :: ∆a }

data SeqChange a = SeqChange (Sequence (SeqSingleChange a))
type ∆(Sequence a) = SeqChange a
We use Firsov and Jeltsch's variant of this change structure in Sec. 17.4.

11.5 Products

It is also possible to define change structures for arbitrary sum and product types, and to provide derivatives for the introduction and elimination forms of such datatypes. In this section we discuss products; in the next section, sums.

We define a simple change structure for product type A × B from change structures for A and B, similar to change structures for environments: operations act pointwise on the two components.
instance (ChangeStruct a, ChangeStruct b) ⇒ ChangeStruct (a, b) where
  type ∆(a, b) = (∆a, ∆b)
  (a, b) ⊕ (da, db) = (a ⊕ da, b ⊕ db)
  (a2, b2) ⊖ (a1, b1) = (a2 ⊖ a1, b2 ⊖ b1)
  oreplace (a2, b2) = (oreplace a2, oreplace b2)
  0_{(a, b)} = (0_a, 0_b)
  (da1, db1) ⊚ (da2, db2) = (da1 ⊚ da2, db1 ⊚ db2)
Through equational reasoning, as in Sec. 11.2, we can also compute derivatives for basic primitives on product types: both the introduction form (which we alias as pair) and the elimination forms fst and snd. We just present the resulting definitions:

pair a b = (a, b)
dpair a da b db = (da, db)
³Firsov and Jeltsch [2016], and our actual implementation, allow changes to multiple elements.
fst (a, b) = a
snd (a, b) = b

dfst :: ∆(a, b) → ∆a
dfst (da, db) = da

dsnd :: ∆(a, b) → ∆b
dsnd (da, db) = db

uncurry :: (a → b → c) → (a, b) → c
uncurry f (a, b) = f a b

duncurry :: (a → b → c) → ∆(a → b → c) → (a, b) → ∆(a, b) → ∆c
duncurry f df (x, y) (dx, dy) = df x dx y dy
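As an executable sanity check of these definitions, here is a monomorphic Int sketch (pair-component changes are Int deltas; this encoding is our assumption for illustration):

```haskell
-- Product changes are pairs of changes; oplus acts componentwise.
oplusP :: (Int, Int) -> (Int, Int) -> (Int, Int)
oplusP (a, b) (da, db) = (a + da, b + db)

-- Derivatives of the elimination forms, as in the text.
dfst :: (Int, Int) -> Int
dfst (da, _) = da

duncurry :: (Int -> Int -> Int) -> (Int -> Int -> Int -> Int -> Int)
         -> (Int, Int) -> (Int, Int) -> Int
duncurry _ df (x, y) (dx, dy) = df x dx y dy
```

For instance, with f = (+) and its derivative df x dx y dy = dx + dy, the value uncurry f p + duncurry f df p dp equals uncurry f (p `oplusP` dp) on valid changes.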
One can also define n-ary products in a similar way. However, a product change contains as many entries as the product itself.

11.6 Sums, pattern matching and conditionals

In this section we define change structures for sum types, together with derivatives of their introduction and elimination forms. We also obtain support for booleans (which can be encoded as sum type 1 + 1) and conditionals (which can be encoded in terms of elimination for sums). We have mechanically proved correctness of this change structure and these derivatives, but we do not present the tedious details in this thesis and refer to our Agda formalization.

Change structures for sums are more challenging than ones for products. We can define them, but in many cases we can do better with specialized structures. Nevertheless, such changes are useful in some scenarios.

In Haskell, sum types a + b are conventionally defined via datatype Either a b, with introduction forms Left and Right and elimination form either, which will be our primitives:

data Either a b = Left a | Right b

either :: (a → c) → (b → c) → Either a b → c
either f g (Left a) = f a
either f g (Right b) = g b
We can define the following change structure:

data EitherChange a b
  = LeftC (∆a)
  | RightC (∆b)
  | EitherReplace (Either a b)

instance (ChangeStruct a, ChangeStruct b) ⇒
    ChangeStruct (Either a b) where
  type ∆(Either a b) = EitherChange a b
  Left a ⊕ LeftC da = Left (a ⊕ da)
  Right b ⊕ RightC db = Right (b ⊕ db)
  Left _ ⊕ RightC _ = error "Invalid change"
  Right _ ⊕ LeftC _ = error "Invalid change"
  _ ⊕ EitherReplace e2 = e2
  oreplace = EitherReplace
  0_{Left a} = LeftC 0_a
  0_{Right a} = RightC 0_a
Changes to a sum value can either keep the same "branch" (left or right) and modify the contained value, or replace the sum value with another one altogether. Specifically, change LeftC da is valid from Left a1 to Left a2 if da is valid from a1 to a2. Similarly, change RightC db is valid from Right b1 to Right b2 if db is valid from b1 to b2. Finally, replacement change EitherReplace e2 is valid from e1 to e2, for any e1.

Using Eq. (11.1), we can then obtain definitions for derivatives of primitives Left, Right and either. The resulting code is as follows:
dLeft :: a → ∆a → ∆(Either a b)
dLeft a da = LeftC da

dRight :: b → ∆b → ∆(Either a b)
dRight b db = RightC db

deither :: (NilChangeStruct a, NilChangeStruct b, DiffChangeStruct c) ⇒
  (a → c) → ∆(a → c) → (b → c) → ∆(b → c) →
  Either a b → ∆(Either a b) → ∆c
deither f df g dg (Left a) (LeftC da) = df a da
deither f df g dg (Right b) (RightC db) = dg b db
deither f df g dg e1 (EitherReplace e2) =
  either (f ⊕ df) (g ⊕ dg) e2 ⊖ either f g e1
deither _ _ _ _ _ _ = error "Invalid sum change"
We show only one case of the derivation of deither, as an example:
  deither f df g dg (Left a) (LeftC da)
=∆ { using variants of Eq. (11.1) for multiple arguments }
  either (f ⊕ df) (g ⊕ dg) (Left a ⊕ LeftC da) ⊖ either f g (Left a)
= { simplify ⊕ }
  either (f ⊕ df) (g ⊕ dg) (Left (a ⊕ da)) ⊖ either f g (Left a)
= { simplify either }
  (f ⊕ df) (a ⊕ da) ⊖ f a
=∆ { because df is a valid change for f, and da for a }
  df a da
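The derivation can be checked on concrete values with a small monomorphic sketch (Int payloads and Int deltas as changes; note that f ⊕ df is computed pointwise as λx → f x ⊕ df x 0, applying df to a nil change — the whole encoding is our assumption for illustration):

```haskell
-- Sum changes: keep the branch and change the payload, or replace outright.
data EitherChange = LeftC Int | RightC Int | EitherReplace (Either Int Int)

oplusE :: Either Int Int -> EitherChange -> Either Int Int
oplusE (Left a)  (LeftC da)         = Left (a + da)
oplusE (Right b) (RightC db)        = Right (b + db)
oplusE _         (EitherReplace e2) = e2
oplusE _         _                  = error "Invalid change"

-- deither, specialized: result changes are Int deltas.
deither :: (Int -> Int) -> (Int -> Int -> Int)
        -> (Int -> Int) -> (Int -> Int -> Int)
        -> Either Int Int -> EitherChange -> Int
deither _ df _ _  (Left a)  (LeftC da)  = df a da
deither _ _  _ dg (Right b) (RightC db) = dg b db
deither f df g dg e1 (EitherReplace e2) =
  either f' g' e2 - either f g e1  -- recompute, then take a difference
  where f' x = f x + df x 0        -- (f oplus df) x, with 0 as the nil change
        g' x = g x + dg x 0
deither _ _ _ _ _ _ = error "Invalid sum change"
```

With f = (* 3) and derivative df x dx = 3 * dx, the LeftC case computes df 5 2 = 6, matching the specification f (5 + 2) - f 5; the replacement case recomputes from scratch, as the text observes.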
Unfortunately, with this change structure a change from Left a1 to Right b2 is simply a replacement change, so derivatives processing it must recompute results from scratch. In general we cannot do better, since there need be no shared data between the two branches of a datatype. We need to find specialized scenarios where better implementations are possible.

Extensions: changes for ADTs and future work. In some cases, we consider changes for type Either a b where a and b both contain some other type c. Take again lists: a change from list as to list Cons a2 as should simply say that we prepend a2 to the list. In a sense, we are just using a change structure from type List a to (a, List a). More generally, if change das from as1 to as2 is small, a change from list as1 to list Cons a2 as2 should simply "say" that we prepend a2 and that we modify as1 into as2, and similarly for removals.

In Sec. 18.4 we suggest how to construct such change structures, based on the concept of polymorphic change structures, where changes have source and destination of different types. Based on initial experiments, we believe one could develop these constructions into a powerful combinator language for change structures. In particular, it should be possible to build change structures for lists
similar to the ones in Sec. 11.3.2. Generalizing beyond lists, similar systematic constructions should be able to represent insertions and removals of one-hole contexts [McBride, 2001] for arbitrary algebraic datatypes (ADTs); for ADTs representing balanced data structures, such changes could enable efficient incrementalization in many scenarios. However, many questions remain open, so we leave this effort for future work.

11.6.1 Optimizing filter

In Sec. 11.3 we have defined filter using a conditional, and we have just explained that in general conditionals are inefficient. This seems rather unfortunate. But luckily, we can optimize filter using equational reasoning.

Consider again the earlier definition of filter:

filter :: (a → Bool) → List a → List a
filter p = concatMap (λx → if p x then singleton x else Nil)
As explained, we can encode conditionals using either and differentiate the resulting program. However, if p x changes from True to False, or vice versa (that is, for all elements x for which dp x dx is a non-nil change), we must compute ⊖ at runtime. That ⊖ will compare an empty list Nil with a singleton list produced by singleton x (in one direction or the other). We can have ⊖ detect this situation at runtime. But since the implementation of filter is known statically, we can instead optimize this code ahead of time, rewriting singleton x ⊖ Nil to Insert x, and Nil ⊖ singleton x to Remove. To enable this optimization in dfilter, we need to inline the function that filter passes as argument to concatMap, and all the functions it calls, except p. Moreover, we need to case-split on the possible return values for p x and dp x dx. We omit the steps because they are both tedious and standard.
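The rewriting described above can be sketched as a tiny peephole step: when the predicate's truth value for an element flips, emit an atomic sequence change directly instead of computing ⊖ on lists. The constructor names follow the atomic sequence changes introduced earlier (Insert/Remove); the function name and its Bool-based interface are our own invention for this sketch.

```haskell
-- Hypothetical peephole for dfilter: given the old and new values of
-- p x, produce the atomic change for that element directly.
data SeqSingleChange a = Insert a | Remove | NoChange
  deriving (Eq, Show)

elementChange :: Bool -> Bool -> a -> SeqSingleChange a
elementChange False True x = Insert x  -- now kept: singleton x `minus` Nil
elementChange True False _ = Remove    -- now dropped: Nil `minus` singleton x
elementChange _     _    _ = NoChange  -- predicate result unchanged
```

This captures exactly the two rewrite rules from the text; a real dfilter would also have to thread element positions through, which we elide here.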
It appears in principle possible to automate such transformations by adding domain-specific knowledge to a sufficiently smart compiler, though we have made no attempt at an actual implementation. It would first be necessary to investigate further classes of examples where such optimizations are applicable. Sufficiently smart compilers are rare; but since our approach produces purely functional programs, we have access to GHC and HERMIT [Farmer et al., 2012]. An interesting alternative (which does have some support for side effects) is LMS [Rompf and Odersky, 2010] and Delite [Brown et al., 2011]. We leave further investigation for future work.

11.7 Chapter conclusion

In this chapter, we have toured what can and cannot be incrementalized using differentiation, and how using higher-order functions allows defining generic primitives to incrementalize.
Chapter 12
Changes and differentiation, formally
To support incrementalization, in this chapter we introduce differentiation and formally prove it correct. That is, we prove that ⟦D⟦t⟧⟧ produces derivatives. As we explain in Sec. 12.1.2, derivatives transform valid input changes into valid output changes (Slogan 10.3.3). Hence, we define what valid changes are (Sec. 12.1). As we'll explain in Sec. 12.1.4, validity is a logical relation. As we explain in Sec. 12.2.2, our correctness theorem is the fundamental property for the validity logical relation, proved by induction over the structure of terms. Crucial definitions and derived facts are summarized in Fig. 12.1. Later, in Chapter 13, we study consequences of correctness and change operations.

All definitions and proofs in this and the next chapter are mechanized in Agda, except where otherwise indicated. To this writer, given these definitions, all proofs have become straightforward and unsurprising. Nevertheless, first obtaining these proofs took a while. So we typically include full proofs. We also believe these proofs clarify the meaning and consequences of our definitions. To make proofs as accessible as possible, we try to provide enough detail that our target readers can follow along without pencil and paper, at the expense of making our proofs look longer than they would usually be. As we target readers proficient with STLC (but not necessarily proficient with logical relations), we'll still omit routine steps needed to reason on STLC, such as typing derivations or binding issues.
12.1 Changes and validity

In this section, we introduce formally (a) a description of changes and (b) a definition of which changes are valid. We have already introduced these notions informally in Chapter 10, together with how they fit together. We next define the same notions formally and deduce their key properties. Language plugins extend these definitions for the base types and constants that they provide.

To formalize the notion of changes for elements of a set V, we define the notion of a basic change structure on V.

Definition 12.1.1 (Basic change structures)
A basic change structure on set V, written V̂, comprises:

(a) a change set ∆V;
∆τ

    ∆ι = …
    ∆(σ → τ) = σ → ∆σ → ∆τ

(a) Change types (Definition 12.1.17)

∆Γ

    ∆ε = ε
    ∆(Γ, x : τ) = ∆Γ, x : τ, dx : ∆τ

(b) Change contexts (Definition 12.1.23)

D⟦t⟧

    D⟦λ(x : σ) → t⟧ = λ(x : σ) (dx : ∆σ) → D⟦t⟧
    D⟦s t⟧ = D⟦s⟧ t D⟦t⟧
    D⟦x⟧ = dx
    D⟦c⟧ = DC⟦c⟧

(c) Differentiation (Definition 10.5.1)

    Γ ⊢ t : τ
    ──────────────────── Derive
    ∆Γ ⊢ D⟦t⟧ : ∆τ

(d) Differentiation typing (Lemma 12.2.8)

dv ▷ v1 ↪ v2 : τ    (with v1, v2 : ⟦τ⟧, dv : ⟦∆τ⟧)

    dv ▷ v1 ↪ v2 : ι = …
    df ▷ f1 ↪ f2 : σ → τ =
      ∀da ▷ a1 ↪ a2 : σ. df a1 da ▷ f1 a1 ↪ f2 a2 : τ

dρ ▷ ρ1 ↪ ρ2 : Γ    (with ρ1, ρ2 : ⟦Γ⟧, dρ : ⟦∆Γ⟧)

    ──────────────
    ∅ ▷ ∅ ↪ ∅ : ε

    dρ ▷ ρ1 ↪ ρ2 : Γ    da ▷ a1 ↪ a2 : τ
    ──────────────────────────────────────────────────────────────
    (dρ, x = a1, dx = da) ▷ (ρ1, x = a1) ↪ (ρ2, x = a2) : Γ, x : τ

(e) Validity (Definitions 12.1.21 and 12.1.24)

If Γ ⊢ t : τ then

    ∀dρ ▷ ρ1 ↪ ρ2 : Γ. ⟦D⟦t⟧⟧ dρ ▷ ⟦t⟧ ρ1 ↪ ⟦t⟧ ρ2 : τ

(f) Correctness of D⟦–⟧ (from Theorem 12.2.2)

Figure 12.1: Defining differentiation and proving it correct. The rest of this chapter explains and motivates the above definitions.
(b) a ternary validity relation dv ▷ v1 ↪ v2 : V, for v1, v2 ∈ V and dv ∈ ∆V, which we read as "dv is a valid change from source v1 to destination v2 (on set V)".

Example 12.1.2
In Examples 10.3.1 and 10.3.2 we exemplified informally change types and validity on naturals, integers and bags. We now define formally basic change structures on naturals and integers. Compared to validity for integers, validity for naturals ensures that the destination v1 + dv is again a natural. For instance, given source v1 = 1, change dv = −2 is valid (with destination v2 = −1) only on integers, not on naturals.

Definition 12.1.3 (Basic change structure on integers)
Basic change structure ℤ on integers has integers as changes (∆ℤ = ℤ) and the following validity judgment:

    dv ▷ v1 ↪ v1 + dv : ℤ

Definition 12.1.4 (Basic change structure on naturals)
Basic change structure ℕ on naturals has integers as changes (∆ℕ = ℤ) and the following validity judgment:

    v1 + dv ≥ 0
    ─────────────────────────
    dv ▷ v1 ↪ v1 + dv : ℕ
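An executable reading of these two judgments, as a sketch: we model both change sets as Haskell Int, and each predicate decides whether dv is a valid change from source v1 (the destination is always v1 + dv). The function names are ours.

```haskell
-- Validity of dv from source v1, on integers: every delta is valid.
validInt :: Int -> Int -> Bool
validInt _v1 _dv = True

-- Validity on naturals: the source must be a natural, and the
-- destination v1 + dv must again be a natural.
validNat :: Int -> Int -> Bool
validNat v1 dv = v1 >= 0 && v1 + dv >= 0
```

For source v1 = 1, change dv = −2 is valid on integers but not on naturals, matching the example above.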
Intuitively, we can think of a valid change from v1 to v2 as a graph edge from v1 to v2, so we'll often use graph terminology when discussing changes. This intuition is robust and can be made fully precise.¹ More specifically, a basic change structure on V can be seen as a directed multigraph, having as vertexes the elements of V and as edges from v1 to v2 the valid changes dv from v1 to v2. This is a multigraph because our definition allows multiple edges between v1 and v2.

A change dv can be valid from v1 to v2 and from v3 to v4, but we'll still want to talk about the source and the destination of a change. When we talk about a change dv valid from v1 to v2, value v1 is dv's source and v2 is dv's destination. Hence we'll systematically quantify theorems over valid changes dv, with their sources v1 and destinations v2, using the following notation.²

Notation 12.1.5 (Quantification over valid changes)
We write

    ∀dv ▷ v1 ↪ v2 : V. P

and say "for all (valid) changes dv from v1 to v2 on set V we have P", as a shortcut for

    ∀v1, v2 ∈ V, dv ∈ ∆V. if dv ▷ v1 ↪ v2 : V then P.

Since we focus on valid changes, we'll omit the word "valid" when clear from context. In particular, a change from v1 to v2 is necessarily valid.
¹See for instance Robert Atkey's blog post [Atkey, 2015] or Yufei Cai's PhD thesis [Cai, 2017].
²If you prefer, you can tag a change with its source and destination by using a triple, and regard the whole triple (v1, dv, v2) as a change. Mathematically this gives the correct results, but we'll typically not use such triples as changes in programs, for performance reasons.
We can have multiple basic change structures on the same set.

Example 12.1.6 (Replacement changes)
For instance, for any set V we can talk about replacement changes on V: a replacement change dv = !v2 for a value v1 : V simply specifies directly a new value v2 : V, so that !v2 ▷ v1 ↪ v2 : V. We read ! as the "bang" operator.

A basic change structure can decide to use only replacement changes (which can be appropriate for primitive types with values of constant size), or to make ∆V a sum type allowing both replacement changes and other ways to describe a change (as long as we're using a language plugin that adds sum types).

Nil changes. Just like integers have the null element 0, among changes there can be nil changes:

Definition 12.1.7 (Nil changes)
We say that dv : ∆V is a nil change for v : V if dv ▷ v ↪ v : V.

For instance, 0 is a nil change for any integer number n. However, in general a change might be nil for one element but not for another. For instance, the replacement change !6 is a nil change on 6, but not on 5.
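A sketch of replacement changes and the nil-change check, for Int values (the encoding is ours: a change is either a plain delta or a replacement "bang" value):

```haskell
-- A change is a plain delta or a replacement (!v2).
data IntChange = Delta Int | Bang Int
  deriving (Eq, Show)

oplusI :: Int -> IntChange -> Int
oplusI v (Delta dv) = v + dv
oplusI _ (Bang v2)  = v2

-- dv is a nil change for v when its destination is v itself.
isNilFor :: IntChange -> Int -> Bool
isNilFor dv v = oplusI v dv == v
```

As in the text: !6 is a nil change on 6 but not on 5, while Delta 0 is a nil change for every integer.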
We'll say more on nil changes in Secs. 12.1.2 and 13.2.1.

12.1.1 Function spaces

Next, we define a basic change structure for an arbitrary function space A → B, assuming we have basic change structures for both A and B. We claimed earlier that valid function changes map valid input changes to valid output changes. We make this claim formal through the next definition.

Definition 12.1.8 (Basic change structure on A → B)
Given basic change structures on A and B, we define a basic change structure on A → B as follows:

(a) Change set ∆(A → B) is A → ∆A → ∆B.

(b) Function change df is valid from f1 to f2 (that is, df ▷ f1 ↪ f2 : A → B) if and only if, for all valid input changes da ▷ a1 ↪ a2 : A, value df a1 da is a valid output change from f1 a1 to f2 a2 (that is, df a1 da ▷ f1 a1 ↪ f2 a2 : B).

Notation 12.1.9 (Applying function changes to changes)
When reading out df a1 da, we'll often talk for brevity about applying df to da, leaving da's source a1 implicit when it can be deduced from context.

We'll also consider valid changes df for curried n-ary functions. We show what their validity means for curried binary functions f : A → B → C. We omit similar statements for higher arities, as they add no new ideas.

Lemma 12.1.10 (Validity on A → B → C)
For any basic change structures A, B and C, function change df : ∆(A → B → C) is valid from f1 to f2 (that is, df ▷ f1 ↪ f2 : A → B → C) if and only if applying df to valid input changes da ▷ a1 ↪ a2 : A and db ▷ b1 ↪ b2 : B gives a valid output change

    df a1 da b1 db ▷ f1 a1 b1 ↪ f2 a2 b2 : C.
Proof. The equivalence follows from applying the definition of function validity of df twice.

That is, function change df is valid (df ▷ f1 ↪ f2 : A → (B → C)) if and only if it maps valid input change da ▷ a1 ↪ a2 : A to valid output change

    df a1 da ▷ f1 a1 ↪ f2 a2 : B → C.

In turn, df a1 da is a function change, which is valid if and only if it maps valid input change db ▷ b1 ↪ b2 : B to

    df a1 da b1 db ▷ f1 a1 b1 ↪ f2 a2 b2 : C,

as required by the lemma.
12.1.2 Derivatives

Among valid function changes, derivatives play a central role, especially in the statement of correctness of differentiation.

Definition 12.1.11 (Derivatives)
Given function f : A → B, function df : A → ∆A → ∆B is a derivative for f if, for all changes da from a1 to a2 on set A, change df a1 da is valid from f a1 to f a2.

Moreover, it follows that derivatives are nil function changes:

Lemma 12.1.12 (Derivatives as nil function changes)
Given function f : A → B, function change df : ∆(A → B) is a derivative of f if and only if df is a nil change of f (df ▷ f ↪ f : A → B).

Proof. First, we show that nil changes are derivatives. A nil change df ▷ f ↪ f : A → B has the right type to be a derivative, because A → ∆A → ∆B = ∆(A → B). Since df is a nil change from f to f, by definition it maps valid input changes da ▷ a1 ↪ a2 : A to valid output changes df a1 da ▷ f a1 ↪ f a2 : B. Hence df is a derivative, as required.

In fact, all proof steps are equivalences, and by tracing them backward we can show that derivatives are nil changes: since a derivative df maps valid input changes da ▷ a1 ↪ a2 : A to valid output changes df a1 da ▷ f a1 ↪ f a2 : B, df is a change from f to f, as required.

Applying derivatives to nil changes gives again nil changes. This fact is useful when reasoning on derivatives. The proof is a useful exercise on using validity.
Lemma 12.1.13 (Derivatives preserve nil changes)
For any basic change structures A and B, function change df : ∆(A → B) is a derivative of f : A → B (df ▷ f ↪ f : A → B) if and only if applying df to an arbitrary input nil change da ▷ a ↪ a : A gives a nil change

    df a da ▷ f a ↪ f a : B.

Proof. Just rewrite the definition of derivatives (Lemma 12.1.12) using the definition of validity of df.

In detail, by definition of validity for function changes (Definition 12.1.8), df ▷ f1 ↪ f2 : A → B means that from da ▷ a1 ↪ a2 : A follows df a1 da ▷ f1 a1 ↪ f2 a2 : B. Just substitute f1 = f2 = f and a1 = a2 = a to get the required equivalence.

Also derivatives of curried n-ary functions f preserve nil changes. We only state this formally for curried binary functions f : A → B → C; higher arities require no new ideas.
Lemma 12.1.14 (Derivatives preserve nil changes on A → B → C)
For any basic change structures A, B and C, change df : ∆(A → B → C) is a derivative of f : A → B → C if and only if applying df to nil changes da ▷ a ↪ a : A and db ▷ b ↪ b : B gives a nil change

    df a da b db ▷ f a b ↪ f a b : C.

Proof. Similarly to validity on A → B → C (Lemma 12.1.10), the thesis follows by applying twice the fact that derivatives preserve nil changes (Lemma 12.1.13).

In detail, since derivatives preserve nil changes, df is a derivative if and only if for all da ▷ a ↪ a : A we have df a da ▷ f a ↪ f a : B → C. But then df a da is a nil change, that is, a derivative, and since it preserves nil changes, df is a derivative if and only if for all da ▷ a ↪ a : A and db ▷ b ↪ b : B we have df a da b db ▷ f a b ↪ f a b : C.
12.1.3 Basic change structures on types

After studying basic change structures in the abstract, we apply them to the study of our object language.

For each type τ, we can define a basic change structure on domain ⟦τ⟧, which we write τ̂. Language plugins must provide basic change structures for base types. To provide basic change structures for function types σ → τ, we use the earlier construction of basic change structures on function spaces ⟦σ → τ⟧ (Definition 12.1.8).

Definition 12.1.15 (Basic change structures on types)
For each type τ, we associate a basic change structure on domain ⟦τ⟧, written τ̂, through the following equations: ι̂ is provided by plugins, while the structure for σ → τ is the function-space structure built from σ̂ and τ̂.

Plugin Requirement 12.1.16 (Basic change structures on base types)
For each base type ι, the plugin defines a basic change structure on ⟦ι⟧, which we write ι̂.

Crucially, for each type τ we can define a type ∆τ of changes, which we call its change type, such that the change set ∆⟦τ⟧ is just the domain ⟦∆τ⟧ associated to change type ∆τ: ∆⟦τ⟧ = ⟦∆τ⟧. This equation allows writing change terms that evaluate directly to change values.³

Definition 12.1.17 (Change types)
The change type ∆τ of a type τ is defined as follows:

    ∆ι = … (provided by plugins)
    ∆(σ → τ) = σ → ∆σ → ∆τ

Lemma 12.1.18 (∆ and ⟦–⟧ commute on types)
For each type τ, change set ∆⟦τ⟧ equals the domain of change type ⟦∆τ⟧.

Proof. By induction on types. For the case of function types, we simply prove equationally that ∆⟦σ → τ⟧ = ∆(⟦σ⟧ → ⟦τ⟧) = ⟦σ⟧ → ∆⟦σ⟧ → ∆⟦τ⟧ = ⟦σ⟧ → ⟦∆σ⟧ → ⟦∆τ⟧ = ⟦σ → ∆σ → ∆τ⟧ = ⟦∆(σ → τ)⟧. The case for base types is delegated to plugins (Plugin Requirement 12.1.19).

³Instead, in earlier proofs [Cai et al., 2014], the values of change terms were not change values, but had to be related to change values through a logical relation; see Sec. 15.4.
Plugin Requirement 12.1.19 (Base change types)
For each base type ι, the plugin defines a change type ∆ι such that ∆⟦ι⟧ = ⟦∆ι⟧.

We refer to values of change types as change values, or just changes.

Notation 12.1.20
We write basic change structures for types as τ̂, not ⟦τ⟧̂, and write dv ▷ v1 ↪ v2 : τ, not dv ▷ v1 ↪ v2 : ⟦τ⟧. We also consistently write ⟦∆τ⟧, not ∆⟦τ⟧.

12.1.4 Validity as a logical relation

Next, we show an equivalent definition of validity for values of terms, directly by induction on types, as a ternary logical relation between a change, its source and its destination. A typical logical relation constrains functions to map related inputs to related outputs. In a twist, validity constrains function changes to map related inputs to related outputs.

Definition 12.1.21 (Change validity)
We say that dv is a (valid) change from v1 to v2 (on type τ), and write dv ▷ v1 ↪ v2 : τ, if dv : ⟦∆τ⟧, v1, v2 : ⟦τ⟧ and dv is a "valid" description of the difference from v1 to v2, as we define in Fig. 12.1e.

The key equations for function types are:

    ∆(σ → τ) = σ → ∆σ → ∆τ
    df ▷ f1 ↪ f2 : σ → τ =
      ∀da ▷ a1 ↪ a2 : σ. df a1 da ▷ f1 a1 ↪ f2 a2 : τ

Remark 12.1.22
We have kept repeating the idea that valid function changes map valid input changes to valid output changes. As seen in Sec. 10.4 and Lemmas 12.1.10 and 12.1.14, such valid outputs can in turn be valid function changes. We'll see the same idea at work in Lemma 12.1.27, in the correctness proof of D⟦–⟧.

As we have finally seen in this section, this definition of validity can be formalized as a logical relation, defined by induction on types. We'll later take for granted the consequences of validity, together with lemmas such as Lemma 12.1.10.
12.1.5 Change structures on typing contexts

To describe changes to the inputs of a term, we now also introduce change contexts ∆Γ, environment changes dρ : ⟦∆Γ⟧, and validity for environment changes dρ ▷ ρ1 ↪ ρ2 : Γ.

A valid environment change from ρ1 : ⟦Γ⟧ to ρ2 : ⟦Γ⟧ is an environment dρ : ⟦∆Γ⟧ that extends environment ρ1 with valid changes for each entry. We first define the shape of environment changes through change contexts:

Definition 12.1.23 (Change contexts)
For each context Γ, we define change context ∆Γ as follows:

    ∆ε = ε
    ∆(Γ, x : τ) = ∆Γ, x : τ, dx : ∆τ

Then, we describe validity of environment changes via a judgment.
Definition 12.1.24 (Environment change validity)
We define validity for environment changes through judgment dρ ▷ ρ1 ↪ ρ2 : Γ, pronounced "dρ is an environment change from ρ1 to ρ2 (at context Γ)", where ρ1, ρ2 : ⟦Γ⟧ and dρ : ⟦∆Γ⟧, via the following inference rules:

    ──────────────
    ∅ ▷ ∅ ↪ ∅ : ε

    dρ ▷ ρ1 ↪ ρ2 : Γ    da ▷ a1 ↪ a2 : τ
    ──────────────────────────────────────────────────────────────
    (dρ, x = a1, dx = da) ▷ (ρ1, x = a1) ↪ (ρ2, x = a2) : Γ, x : τ

Definition 12.1.25 (Basic change structures for contexts)
To each context Γ we associate a basic change structure on set ⟦Γ⟧. We take ⟦∆Γ⟧ as change set and reuse validity on environment changes (Definition 12.1.24).

Notation 12.1.26
We write dρ ▷ ρ1 ↪ ρ2 : Γ rather than dρ ▷ ρ1 ↪ ρ2 : ⟦Γ⟧.

Finally, to state and prove correctness of differentiation, we are going to need to discuss function changes on term semantics. The semantics of a term Γ ⊢ t : τ is a function ⟦t⟧ from environments in ⟦Γ⟧ to values in ⟦τ⟧. To discuss changes to ⟦t⟧, we need a basic change structure on function space ⟦Γ⟧ → ⟦τ⟧.

Lemma 12.1.27
The construction of basic change structures on function spaces (Definition 12.1.8) associates to each context Γ and type τ a basic change structure Γ̂ → τ̂ on function space ⟦Γ⟧ → ⟦τ⟧.

Notation 12.1.28
As usual, we write the change set as ∆(⟦Γ⟧ → ⟦τ⟧); for validity, we write df ▷ f1 ↪ f2 : Γ → τ rather than df ▷ f1 ↪ f2 : ⟦Γ⟧ → ⟦τ⟧.
12.2 Correctness of differentiation

In this section, we state and prove correctness of differentiation, a term-to-term transformation, written D⟦t⟧, that produces incremental programs. We recall that all our results apply only to well-typed terms (since we formalize no other ones).

Earlier, we described how D⟦–⟧ behaves through Slogan 10.3.3; here it is again for reference:

Slogan 10.3.3
Term D⟦t⟧ maps input changes to output changes. That is, D⟦t⟧ applied to initial base inputs and valid input changes (from initial inputs to updated inputs) gives a valid output change from t applied on old inputs to t applied on new inputs.

In our slogan, we did not specify what we meant by inputs, though we gave examples during the discussion. We now have the notions needed for a more precise statement. Term D⟦t⟧ must satisfy our slogan for two sorts of inputs:

1. Evaluating D⟦t⟧ must map an environment change dρ from ρ1 to ρ2 into a valid result change ⟦D⟦t⟧⟧ dρ, going from ⟦t⟧ ρ1 to ⟦t⟧ ρ2.

2. As we learned since stating our slogan, validity is defined by recursion over types. If term t has type σ → τ, change ⟦D⟦t⟧⟧ dρ can in turn be a (valid) function change (Remark 12.1.22). Function changes map valid changes for their inputs to valid changes for their outputs.
Instead of saying that nD n t o o maps dρ B ρ1 rarr ρ2 Γ to a change from n t o ρ1 to n t o ρ2we can say that function n t o∆ = λρ dρ rarr nD n t o o dρ must be a nil change for n t o that is aderivative for n t o We give a name to this function change and state D n ndash orsquos correctness theorem
Denition 1221 (Incremental semantics)We dene the incremental semantics of a well-typed term Γ ` t τ in terms of dierentiation as
n t o∆ = (λρ1 dρ rarr nD n t o o dρ) n Γ orarr n∆Γ orarr n∆τ o
Theorem 1222 (D n ndasho is correct)Function n t o∆ is a derivative of n t o That is if Γ ` t τ and dρ B ρ1 rarr ρ2 Γ then nD n t o o dρ Bn t o ρ1 rarr n t o ρ2 τ
For now we discuss this statement further; we defer the proof to Sec. 12.2.2.

Remark 12.2.3 (Why ⟦–⟧∆ ignores ρ1)
You might wonder why ⟦t⟧∆ = λρ1 dρ → ⟦D⟦t⟧⟧ dρ appears to ignore ρ1. But for all dρ ⊳ ρ1 ↪ ρ2 : Γ, change environment dρ extends ρ1, which hence provides no further information. We are only interested in applying ⟦t⟧∆ to valid environment changes dρ, so ⟦t⟧∆ ρ1 dρ can safely ignore ρ1.

Remark 12.2.4 (Term derivatives)
In Chapter 10 we suggested that D⟦t⟧ only produced a derivative for closed terms, not for open ones. But ⟦t⟧∆ = λρ dρ → ⟦D⟦t⟧⟧ dρ is always a nil change and derivative of ⟦t⟧, for any Γ ⊢ t : τ. There is no contradiction, because the value of D⟦t⟧ is ⟦D⟦t⟧⟧ dρ, which is only a nil change if dρ is a nil change as well. In particular, for closed terms (Γ = ε), dρ must equal the empty environment, hence a nil change. If τ is a function type, df = ⟦D⟦t⟧⟧ dρ accepts further inputs; since df must be a valid function change, it will also map them to valid outputs, as required by our Slogan 10.3.3. Finally, if Γ = ε and τ is a function type, then df = ⟦D⟦t⟧⟧ is a derivative of f = ⟦t⟧.

We summarize this remark with the following definition and corollary.
Definition 12.2.5 (Derivatives of terms)
For all closed terms of function type ⊢ t : σ → τ, we call D⟦t⟧ the (term) derivative of t.

Corollary 12.2.6 (Term derivatives evaluate to derivatives)
For all closed terms of function type ⊢ t : σ → τ, function ⟦D⟦t⟧⟧ is a derivative of ⟦t⟧.

Proof. Because ⟦t⟧∆ is a derivative (Theorem 12.2.2), and applying derivative ⟦t⟧∆ to nil change ∅ gives a derivative (Lemma 12.1.13). □

Remark 12.2.7
We typically talk about a derivative of a function value f : A → B, not the derivative, since multiple different functions can satisfy the specification of derivatives. We talk about the derivative to refer to a canonically chosen derivative. For terms and their semantics, the canonical derivative is the one produced by differentiation. For language primitives, the canonical derivative is the one chosen by the language plugin under consideration.

Theorem 12.2.2 only makes sense if D⟦–⟧ has the right static semantics:
Lemma 12.2.8 (Typing of D⟦–⟧)
The typing rule

  Γ ⊢ t : τ
  ───────────────── Derive
  ∆Γ ⊢ D⟦t⟧ : ∆τ

is derivable.
After we define ⊕ in the next chapter, we will be able to relate ⊕ to validity, by proving Lemma 13.4.4, which we state in advance here:

Lemma 13.4.4 (⊕ agrees with validity)
If dv ⊳ v1 ↪ v2 : τ, then v1 ⊕ dv = v2.

Hence, updating base result ⟦t⟧ ρ1 by change ⟦D⟦t⟧⟧ dρ via ⊕ gives the updated result ⟦t⟧ ρ2:

Corollary 13.4.5 (D⟦–⟧ is correct, corollary)
If Γ ⊢ t : τ and dρ ⊳ ρ1 ↪ ρ2 : Γ, then ⟦t⟧ ρ1 ⊕ ⟦D⟦t⟧⟧ dρ = ⟦t⟧ ρ2.

We anticipate the proof of this corollary:
Proof. First, differentiation is correct (Theorem 12.2.2), so under the hypotheses

  ⟦D⟦t⟧⟧ dρ ⊳ ⟦t⟧ ρ1 ↪ ⟦t⟧ ρ2 : τ;

that judgement implies the thesis,

  ⟦t⟧ ρ1 ⊕ ⟦D⟦t⟧⟧ dρ = ⟦t⟧ ρ2,

because ⊕ agrees with validity (Lemma 13.4.4). □
12.2.1 Plugin requirements

Differentiation is extended by plugins on constants, so plugins must prove their extensions correct.

Plugin Requirement 12.2.9 (Typing of DC⟦–⟧)
For all ⊢C c : τ, the plugin defines DC⟦c⟧ satisfying ⊢ DC⟦c⟧ : ∆τ.

Plugin Requirement 12.2.10 (Correctness of DC⟦–⟧)
For all ⊢C c : τ, ⟦DC⟦c⟧⟧ is a derivative for ⟦c⟧.

Since constants are typed in the empty context, and the only change for an empty environment is an empty environment, Plugin Requirement 12.2.10 means that for all ⊢C c : τ we have

  ⟦DC⟦c⟧⟧ ⊳ ⟦c⟧ ↪ ⟦c⟧ : τ.

12.2.2 Correctness proof

We next recall D⟦–⟧'s definition and prove that it satisfies its correctness statement, Theorem 12.2.2.
Definition 10.5.1 (Differentiation)
Differentiation is the following term transformation:

  D⟦λ(x : σ) → t⟧ = λ(x : σ) (dx : ∆σ) → D⟦t⟧
  D⟦s t⟧ = D⟦s⟧ t D⟦t⟧
  D⟦x⟧ = dx
  D⟦c⟧ = DC⟦c⟧

where DC⟦c⟧ defines differentiation on primitives and is provided by language plugins (see Appendix A.2.5), and dx stands for a variable generated by prefixing x's name with d, so that D⟦y⟧ = dy and so on.
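To make the transformation concrete, here is a small executable sketch in Python (our illustration only; the thesis's actual implementations are in Scala and Agda). Terms are encoded as nested tuples, `derive_const` plays the role of the plugin-provided DC⟦–⟧ and is stubbed for a single hypothetical `plus` primitive, and the capture issues of Sec. 12.3.4 are ignored:

```python
# Tiny term encoding (our assumption, not the thesis's):
#   ('var', x) | ('lam', x, body) | ('app', s, t) | ('const', c)
# A lambda with a change parameter is encoded as two nested 'lam' nodes.

def derive(t):
    """D[-]: differentiate a term, prefixing variable names with 'd'."""
    tag = t[0]
    if tag == 'var':                  # D[x] = dx
        return ('var', 'd' + t[1])
    if tag == 'lam':                  # D[lam x. t] = lam x. lam dx. D[t]
        _, x, body = t
        return ('lam', x, ('lam', 'd' + x, derive(body)))
    if tag == 'app':                  # D[s t] = D[s] t D[t]
        _, s, u = t
        return ('app', ('app', derive(s), u), derive(u))
    if tag == 'const':                # D[c] = DC[c], plugin-provided
        return derive_const(t[1])
    raise ValueError('unknown term: %r' % (t,))

def derive_const(c):
    # Stand-in for the plugin-provided DC[-]; 'dplus' is a hypothetical
    # primitive whose intended meaning is lam x dx y dy. dx + dy.
    if c == 'plus':
        return ('const', 'dplus')
    raise ValueError('no derivative for constant %r' % (c,))

# D[lam x. x] = lam x. lam dx. dx
assert derive(('lam', 'x', ('var', 'x'))) == \
       ('lam', 'x', ('lam', 'dx', ('var', 'dx')))
```

Each equation of Definition 10.5.1 corresponds to one branch of `derive`; the recursion mirrors the structural recursion of the formal definition.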
Before proving correctness, we prove Lemma 12.2.8:
Lemma 12.2.8 (Typing of D⟦–⟧)
The typing rule

  Γ ⊢ t : τ
  ───────────────── Derive
  ∆Γ ⊢ D⟦t⟧ : ∆τ

is derivable.
Proof. The thesis can be proven by induction on the typing derivation Γ ⊢ t : τ. The case for constants is delegated to plugins in Plugin Requirement 12.2.9. □

We prove Theorem 12.2.2 using a typical logical relations strategy. We proceed by induction on term t and prove, for each case, that if D⟦–⟧ preserves validity on subterms of t, then also D⟦t⟧ preserves validity. Hence, if the input environment change dρ is valid, then the result of differentiation evaluates to valid change ⟦D⟦t⟧⟧ dρ.

Readers familiar with logical relations proofs should be able to reproduce this proof on their own, as it is rather standard once one uses the given definitions. In particular, this proof closely resembles the proof of the abstraction theorem or relational parametricity (as given by Wadler [1989, Sec. 6] or by Bernardy and Lasson [2011, Sec. 3.3, Theorem 3]), and the proof of the fundamental theorem of logical relations by Statman [1985].

Nevertheless, we spell this proof out and use it to motivate how D⟦–⟧ is defined, more formally than we did in Sec. 10.5. For each case, we first give a short proof sketch and then redo the proof in more detail, to make the proof easier to follow.

Theorem 12.2.2 (D⟦–⟧ is correct)
Function ⟦t⟧∆ is a derivative of ⟦t⟧. That is, if Γ ⊢ t : τ and dρ ⊳ ρ1 ↪ ρ2 : Γ, then ⟦D⟦t⟧⟧ dρ ⊳ ⟦t⟧ ρ1 ↪ ⟦t⟧ ρ2 : τ.

Proof. By induction on typing derivation Γ ⊢ t : τ.

• Case Γ ⊢ x : τ. The thesis is that ⟦D⟦x⟧⟧ is a derivative for ⟦x⟧, that is, ⟦D⟦x⟧⟧ dρ ⊳ ⟦x⟧ ρ1 ↪ ⟦x⟧ ρ2 : τ. Since dρ is a valid environment change from ρ1 to ρ2, ⟦dx⟧ dρ is a valid change from ⟦x⟧ ρ1 to ⟦x⟧ ρ2. Hence, defining D⟦x⟧ = dx satisfies our thesis.

• Case Γ ⊢ s t : τ. The thesis is that ⟦D⟦s t⟧⟧ is a derivative for ⟦s t⟧, that is, ⟦D⟦s t⟧⟧ dρ ⊳ ⟦s t⟧ ρ1 ↪ ⟦s t⟧ ρ2 : τ. By inversion of typing, there is some type σ such that Γ ⊢ s : σ → τ and Γ ⊢ t : σ.
To prove the thesis, in short, you can apply the inductive hypothesis to s and t, obtaining respectively that ⟦D⟦s⟧⟧ and ⟦D⟦t⟧⟧ are derivatives for ⟦s⟧ and ⟦t⟧. In particular, ⟦D⟦s⟧⟧ evaluates to a validity-preserving function change. Term D⟦s t⟧, that is D⟦s⟧ t D⟦t⟧, applies validity-preserving function change D⟦s⟧ to valid input change D⟦t⟧, and this produces a valid change for s t, as required.

In detail, our thesis is that for all dρ ⊳ ρ1 ↪ ρ2 : Γ we have

  ⟦D⟦s t⟧⟧ dρ ⊳ ⟦s t⟧ ρ1 ↪ ⟦s t⟧ ρ2 : τ,

where ⟦s t⟧ ρ = (⟦s⟧ ρ) (⟦t⟧ ρ) and

  ⟦D⟦s t⟧⟧ dρ
    = ⟦D⟦s⟧ t D⟦t⟧⟧ dρ
    = (⟦D⟦s⟧⟧ dρ) (⟦t⟧ dρ) (⟦D⟦t⟧⟧ dρ)
    = (⟦D⟦s⟧⟧ dρ) (⟦t⟧ ρ1) (⟦D⟦t⟧⟧ dρ).

The last step relies on ⟦t⟧ dρ = ⟦t⟧ ρ1. Since weakening preserves meaning (Lemma A.2.8), this follows because dρ : ⟦∆Γ⟧ extends ρ1 : ⟦Γ⟧ and t can be typed in context Γ.

Our thesis becomes

  ⟦D⟦s⟧⟧ dρ (⟦t⟧ ρ1) (⟦D⟦t⟧⟧ dρ) ⊳ ⟦s⟧ ρ1 (⟦t⟧ ρ1) ↪ ⟦s⟧ ρ2 (⟦t⟧ ρ2) : τ.

By the inductive hypothesis on s and t, we have

  ⟦D⟦s⟧⟧ dρ ⊳ ⟦s⟧ ρ1 ↪ ⟦s⟧ ρ2 : σ → τ
  ⟦D⟦t⟧⟧ dρ ⊳ ⟦t⟧ ρ1 ↪ ⟦t⟧ ρ2 : σ.

Since ⟦s⟧ is a function, its validity means

  ∀da ⊳ a1 ↪ a2 : σ. ⟦D⟦s⟧⟧ dρ a1 da ⊳ ⟦s⟧ ρ1 a1 ↪ ⟦s⟧ ρ2 a2 : τ.

Instantiating in this statement the hypothesis da ⊳ a1 ↪ a2 : σ by ⟦D⟦t⟧⟧ dρ ⊳ ⟦t⟧ ρ1 ↪ ⟦t⟧ ρ2 : σ gives the thesis.
• Case Γ ⊢ λx → t : σ → τ. By inversion of typing, Γ, x : σ ⊢ t : τ. By typing of D⟦–⟧, you can show that

  ∆Γ, x : σ, dx : ∆σ ⊢ D⟦t⟧ : ∆τ.

In short, our thesis is that ⟦λx → t⟧∆ = λρ1 dρ → ⟦λx dx → D⟦t⟧⟧ dρ is a derivative of ⟦λx → t⟧. After a few simplifications, our thesis reduces to

  ⟦D⟦t⟧⟧ (dρ, x = a1, dx = da) ⊳ ⟦t⟧ (ρ1, x = a1) ↪ ⟦t⟧ (ρ2, x = a2) : τ

for all dρ ⊳ ρ1 ↪ ρ2 : Γ and da ⊳ a1 ↪ a2 : σ. But then, the thesis is simply that ⟦t⟧∆ is the derivative of ⟦t⟧, which is true by inductive hypothesis.

More in detail, our thesis is that ⟦λx → t⟧∆ is a derivative for ⟦λx → t⟧, that is,

  ∀dρ ⊳ ρ1 ↪ ρ2 : Γ.
  ⟦D⟦λx → t⟧⟧ dρ ⊳ ⟦λx → t⟧ ρ1 ↪ ⟦λx → t⟧ ρ2 : σ → τ.   (12.1)
By simplifying, the thesis Eq. (12.1) becomes

  ∀dρ ⊳ ρ1 ↪ ρ2 : Γ.
  λa1 da → ⟦D⟦t⟧⟧ (dρ, x = a1, dx = da) ⊳
    (λa1 → ⟦t⟧ (ρ1, x = a1)) ↪ (λa2 → ⟦t⟧ (ρ2, x = a2)) : σ → τ.   (12.2)

By definition of validity at function type, the thesis Eq. (12.2) becomes

  ∀dρ ⊳ ρ1 ↪ ρ2 : Γ. ∀da ⊳ a1 ↪ a2 : σ.
  ⟦D⟦t⟧⟧ (dρ, x = a1, dx = da) ⊳ ⟦t⟧ (ρ1, x = a1) ↪ ⟦t⟧ (ρ2, x = a2) : τ.   (12.3)

To prove the rewritten thesis Eq. (12.3), take the inductive hypothesis on t: it says that ⟦D⟦t⟧⟧ is a derivative for ⟦t⟧, so ⟦D⟦t⟧⟧ maps valid environment changes on Γ, x : σ to valid changes on τ. But by inversion of the validity judgment, all valid environment changes on Γ, x : σ can be written as

  (dρ, x = a1, dx = da) ⊳ (ρ1, x = a1) ↪ (ρ2, x = a2) : Γ, x : σ

for valid changes dρ ⊳ ρ1 ↪ ρ2 : Γ and da ⊳ a1 ↪ a2 : σ. So the inductive hypothesis is that

  ∀dρ ⊳ ρ1 ↪ ρ2 : Γ. ∀da ⊳ a1 ↪ a2 : σ.
  ⟦D⟦t⟧⟧ (dρ, x = a1, dx = da) ⊳ ⟦t⟧ (ρ1, x = a1) ↪ ⟦t⟧ (ρ2, x = a2) : τ.   (12.4)

But that is exactly our thesis Eq. (12.3), so we're done.
• Case Γ ⊢ c : τ. In essence, since weakening preserves meaning, we can rewrite the thesis to match Plugin Requirement 12.2.10.

In more detail, the thesis is that DC⟦c⟧ is a derivative for c, that is, if dρ ⊳ ρ1 ↪ ρ2 : Γ, then ⟦D⟦c⟧⟧ dρ ⊳ ⟦c⟧ ρ1 ↪ ⟦c⟧ ρ2 : τ. Since constants don't depend on the environment and weakening preserves meaning (Lemma A.2.8), and by the definitions of ⟦–⟧ and D⟦–⟧ on constants, the thesis can be simplified to

  ⟦DC⟦c⟧⟧ ⊳ ⟦c⟧ ↪ ⟦c⟧ : τ,

which is delegated to plugins in Plugin Requirement 12.2.10. □

12.3 Discussion

12.3.1 The correctness statement

We might have asked for the following correctness property:

Theorem 12.3.1 (Incorrect correctness statement)
If Γ ⊢ t : τ and ρ1 ⊕ dρ = ρ2, then (⟦t⟧ ρ1) ⊕ (⟦D⟦t⟧⟧ dρ) = (⟦t⟧ ρ2).

However, this property is not quite right. We can only prove correctness if we restrict the statement to input changes dρ that are valid. Moreover, to prove this statement by induction, we need to strengthen its conclusion: we require that the output change ⟦D⟦t⟧⟧ dρ is also valid. To
see why, consider term (λx → s) t. Here, the output of t is an input of s. Similarly, in D⟦(λx → s) t⟧, the output of D⟦t⟧ becomes an input change for subterm D⟦s⟧, and D⟦s⟧ behaves correctly only if D⟦t⟧ produces a valid change.

Typically, change types contain values that are invalid in some sense, but incremental programs will preserve validity. In particular, valid changes between functions are in turn functions that take valid input changes to valid output changes. This is why we formalize validity as a logical relation.

12.3.2 Invalid input changes

To see concretely why invalid changes, in general, can cause derivatives to produce incorrect results, consider again grandTotal = λxs ys → sum (merge xs ys) from Sec. 10.2. Suppose a bag change dxs removes an element 20 from input bag xs, while dys makes no changes to ys: in this case, the output should decrease, so dz = dgrandTotal xs dxs ys dys should be −20. However, that is only correct if 20 is actually an element of xs. Otherwise, xs ⊕ dxs will make no change to xs, hence the correct output change dz would be 0 instead of −20. Similar but trickier issues apply with function changes; see also Sec. 15.2.
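The following Python sketch makes this failure concrete. The bag encoding (via `collections.Counter`, with multiplicities clamped at zero on update) and the helper names are our assumptions for illustration, not definitions from the thesis:

```python
from collections import Counter

def bag_update(xs, dxs):
    """xs ⊕ dxs: add signed multiplicities; clamp at zero, so removing an
    element absent from the bag leaves the bag unchanged."""
    out = Counter(xs)
    for k, n in dxs.items():
        out[k] += n
    return Counter({k: n for k, n in out.items() if n > 0})

def total(b):
    """Sum of a bag's elements, with (possibly signed) multiplicity."""
    return sum(k * n for k, n in b.items())

def dgrand_total(xs, dxs, ys, dys):
    """The expected derivative of grandTotal: sum both input changes."""
    return total(dxs) + total(dys)

xs, ys = Counter({20: 1, 5: 1}), Counter({7: 1})
dxs, dys = Counter({20: -1}), Counter()

# Valid change: 20 is in xs, so the derivative's output change -20 is correct.
new_out = total(bag_update(xs, dxs)) + total(ys)
assert new_out == total(xs) + total(ys) + dgrand_total(xs, dxs, ys, dys)

# Invalid change: 20 is not in xs2, so xs2 ⊕ dxs leaves xs2 unchanged,
# yet the derivative still reports -20: an incorrect output change.
xs2 = Counter({5: 1})
assert total(bag_update(xs2, dxs)) == total(xs2)
assert dgrand_total(xs2, dxs, ys, dys) == -20
```

The derivative is fast precisely because it trusts the change to be valid; feeding it the invalid change produces a result that disagrees with recomputation.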
12.3.3 Alternative environment changes

Environment changes can also be defined differently. We will use this alternative definition later (in Appendix D.2.2).

A change dρ from ρ1 to ρ2 contains a copy of ρ1. Thanks to this copy, we can use an environment change as environment for the result of differentiation: that is, we can evaluate D⟦t⟧ with environment dρ, and Definition 12.2.1 can define ⟦t⟧∆ as λρ1 dρ → ⟦D⟦t⟧⟧ dρ.

But we could adapt definitions to omit the copy of ρ1 from dρ, by setting

  ∆(Γ, x : τ) = ∆Γ, dx : ∆τ

and adapting other definitions. Evaluating D⟦t⟧ still requires base inputs; we could then set ⟦t⟧∆ = λρ1 dρ → ⟦D⟦t⟧⟧ (ρ1, dρ), where ρ1, dρ simply merges the two environments appropriately (we omit a formal definition). This is the approach taken by Cai et al. [2014]. When proving Theorem 12.2.2, using one or the other definition for environment changes makes little difference; if we embed the base environment in environment changes, we reduce noise, because we need not define environment merging formally.

Later (in Appendix D.2.2), we will deal with environments explicitly and manipulate them in programs. Then we will use this alternative definition for environment changes, since it will be convenient to store base environments separately from environment changes.

12.3.4 Capture avoidance

Differentiation generates new names, so a correct implementation must prevent accidental capture. Till now we have ignored this problem.

Using de Bruijn indexes. Our mechanization has no capture issues because it uses de Bruijn indexes. Change contexts just alternate variables for base inputs and input changes. A context such as Γ = x : Z, y : Bool is encoded as Γ = Z, Bool; its change context is ∆Γ = Z, ∆Z, Bool, ∆Bool. This solution is correct and robust, and is the one we rely on.

Alternatively, we can mechanize ILC using a separate syntax for change terms dt, one that uses separate namespaces for base variables and change variables:
  ds, dt ::= dc
    | λ(x : σ) (dx : ∆σ) → dt
    | ds t dt
    | dx

In that case, change variables live in a separate namespace. Example context Γ = Z, Bool gives rise to a different sort of change context, ∆Γ = ∆Z, ∆Bool. And a change term in context Γ is evaluated with separate environments for Γ and ∆Γ. This is appealing because it allows defining differentiation and proving it correct without using weakening and applying its proof of soundness. We still need to use weakening to convert change terms to their equivalents in the base language, but proving that conversion correct is more straightforward.

Using names. Next, we discuss issues in implementing this transformation with names rather than de Bruijn indexes. Using names rather than de Bruijn indexes makes terms more readable; this is also why, in this thesis, we use names in our on-paper formalization.

Unlike the rest of this chapter, we keep this discussion informal, also because we have not mechanized any definitions using names (as it may be possible using nominal logic), nor attempted formal proofs. The rest of the thesis does not depend on this material, so readers might want to skip to the next section.

Using names introduces the risk of capture, as is common for name-generating transformations [Erdweg et al., 2014]. For instance, differentiating term t = λx → f dx gives D⟦t⟧ = λx dx → df dx ddx. Here, variable dx represents a base input and is free in t, yet it is incorrectly captured in D⟦t⟧ by the other variable dx, the one representing x's change. Differentiation gives instead a correct result if we α-rename x in t to any other name (more on that in a moment).

A few workarounds and fixes are possible.

• As a workaround, we can forbid names starting with the letter d for variables in base terms, as we do in our examples; that's formally correct, but pretty unsatisfactory and inelegant. With this approach, our term t = λx → f dx is simply forbidden.

• As a better workaround, instead of prefixing variable names with d, we can add change variables as a separate construct to the syntax of variables, and forbid differentiation on terms that contain change variables. This is a variant of the earlier approach using separate change terms. While we used this approach in our prototype implementation in Scala [Cai et al., 2014], it makes our output language annoyingly non-standard. Converting to a standard language using names (not de Bruijn indexes) raises capture issues again.

• We can try to α-rename existing bound variables, as in the implementation of capture-avoiding substitution. As mentioned, in our case we can rename bound variable x to y and get t′ = λy → f dx. Differentiation gives the correct result D⟦t′⟧ = λy dy → df dx ddx. In general, we can define D⟦λx → t⟧ = λy dy → D⟦t[x := y]⟧, where neither y nor dy appears free in t; that is, we search for a fresh variable y (which, being fresh, does not appear anywhere else) such that also dy does not appear free in t.
This solution is however subtle: it reuses ideas from capture-avoiding substitution, which is well-known to be subtle. We have not attempted to formally prove such a solution correct (or even test it), and have no plan to do so.

• Finally, and most easily, we can α-rename new bound variables, the ones used to refer to changes, or rather only pick them fresh. But if we pick, say, fresh variable dx1 to refer to the change of variable x, we must use dx1 consistently for every occurrence of x, so that
D⟦λx → x⟧ is not λdx1 → dx2. Hence, we extend D⟦–⟧ to also take a map from names to names, as follows:

  D⟦λ(x : σ) → t⟧ m = λ(x : σ) (dx : ∆σ) → D⟦t⟧ (m[x → dx])
  D⟦s t⟧ m = D⟦s⟧ m t D⟦t⟧ m
  D⟦x⟧ m = m(x)
  D⟦c⟧ m = DC⟦c⟧

where m(x) represents lookup of x in map m, dx is now a fresh variable that does not appear in t, and m[x → dx] extends m with a new mapping from x to dx.

But this approach, that is, using a map from base variables to change variables, affects the interface of differentiation. In particular, it affects which variables are free in output terms; hence, we must also update the definition of ∆Γ and derived typing rule Derive. With this approach, if term s is closed, then D⟦s⟧ emptyMap gives a result α-equivalent to the old D⟦s⟧, as long as s triggers no capture issues. But if instead s is open, invoking D⟦s⟧ emptyMap is not meaningful: we must pass an initial map m containing mappings from s's free variables to fresh variables for their changes. These change variables appear free in D⟦s⟧ m; hence, we must update typing rule Derive and modify ∆Γ to use m.

We define ∆mΓ by adding m as a parameter to ∆Γ, and use it in a modified rule Derive′:

  ∆mε = ε
  ∆m(Γ, x : τ) = ∆mΓ, x : τ, m(x) : ∆τ

  Γ ⊢ t : τ
  ───────────────────── Derive′
  ∆mΓ ⊢ D⟦t⟧ m : ∆τ

We conjecture that Derive′ holds and that D⟦t⟧ m is correct, but we have attempted no formal proof.
12.4 Plugin requirement summary

For reference, we repeat here the plugin requirements spread throughout the chapter.

Plugin Requirement 12.1.16 (Basic change structures on base types)
For each base type ι, the plugin defines a basic change structure on ⟦ι⟧ that we write ι̂.

Plugin Requirement 12.1.19 (Base change types)
For each base type ι, the plugin defines a change type ∆ι such that ∆⟦ι⟧ = ⟦∆ι⟧.

Plugin Requirement 12.2.9 (Typing of DC⟦–⟧)
For all ⊢C c : τ, the plugin defines DC⟦c⟧ satisfying ⊢ DC⟦c⟧ : ∆τ.

Plugin Requirement 12.2.10 (Correctness of DC⟦–⟧)
For all ⊢C c : τ, ⟦DC⟦c⟧⟧ is a derivative for ⟦c⟧.
12.5 Chapter conclusion

In this chapter, we have formally defined changes for values and environments of our language, and when changes are valid. Through these definitions, we have explained that D⟦t⟧ is correct, that is, that it maps changes to the input environment to changes to the output. All of this assumes, among other things, that language plugins define valid changes for their base types and derivatives for their primitives.

In the next chapter, we discuss the operations we provide to construct and use changes. These operations will be especially useful to provide derivatives of primitives.
Chapter 13
Change structures
In the previous chapter, we have shown that evaluating the result of differentiation produces a valid change dv from the old output v1 to the new one v2. To compute v2 from v1 and dv, in this chapter we formally introduce the operator ⊕ mentioned earlier.

To define differentiation on primitives, plugins need a few operations on changes, not just ⊕ and 0, but also ⊖ and ⊚.

To formalize these operators and specify their behavior, we extend the notion of basic change structure into the notion of change structure in Sec. 13.1. The change structure for function spaces is not entirely intuitive, so we motivate it in Sec. 13.2. Then, we show how to take change structures on A and B and define new ones on A → B in Sec. 13.3. Using these structures, we finally show that, starting from change structures for base types, we define change structures for all types τ and contexts Γ in Sec. 13.4, completing the core theory of changes.

13.1 Formally defining ⊕ and change structures

In this section, we define what a change structure on a set V is. A change structure V̂ extends a basic change structure on V with change operators ⊕, ⊖, 0 and ⊚. Change structures also require change operators to respect validity, as described below. Key properties of change structures follow in Sec. 13.1.2.

As usual, we'll use metavariables v, v1, v2 to range over elements of V, while dv, dv1, dv2 range over elements of ∆V.

Let's first recall the change operators. Operator ⊕ ("oplus") updates a value with a change: if dv is a valid change from v1 to v2, then v1 ⊕ dv (read as "v1 updated by dv" or "v1 oplus dv") is guaranteed to return v2. Operator ⊖ ("ominus") produces a difference between two values: v2 ⊖ v1 is a valid change from v1 to v2. Operator 0 ("nil") produces nil changes: 0v is a nil change for v. Finally, change composition ⊚ ("ocompose") "pastes changes together": if dv1 is a valid change from v1 to v2 and dv2 is a valid change from v2 to v3, then dv1 ⊚ dv2 (read "dv1 composed with dv2") is a valid change from v1 to v3.

We summarize these descriptions in the following definition.

Definition 13.1.1
A change structure V̂ over a set V requires:

(a) A basic change structure for V (hence, change set ∆V and validity dv ⊳ v1 ↪ v2 : V).
(b) An update operation ⊕ : V → ∆V → V that updates a value with a change. Update must agree with validity: for all dv ⊳ v1 ↪ v2 : V, we require v1 ⊕ dv = v2.

(c) A nil change operation 0 : V → ∆V that must produce nil changes: for all v : V, we require 0v ⊳ v ↪ v : V.

(d) A difference operation ⊖ : V → V → ∆V that produces a change across two values: for all v1, v2 : V, we require v2 ⊖ v1 ⊳ v1 ↪ v2 : V.

(e) A change composition operation ⊚ : ∆V → ∆V → ∆V that composes together two changes relative to a base value. Change composition must preserve validity: for all dv1 ⊳ v1 ↪ v2 : V and dv2 ⊳ v2 ↪ v3 : V, we require dv1 ⊚ dv2 ⊳ v1 ↪ v3 : V.

Notation 13.1.2
Operators ⊕ and ⊖ can be subscripted to highlight their base set, but we will usually omit such subscripts. Moreover, ⊕ is left-associative, so that v ⊕ dv1 ⊕ dv2 means (v ⊕ dv1) ⊕ dv2.

Finally, whenever we have a change structure such as Â, B̂, V̂ and so on, we write respectively A, B, V to refer to its base set.

13.1.1 Example: Group changes

As an example, we show next that each group induces a change structure on its carrier. This change structure also subsumes the basic change structures we saw earlier on integers.

Definition 13.1.3 (Change structure on groups Ĝ)
Given any group (G, e, +, −), we can define a change structure Ĝ on carrier set G as follows.
(a) The change set is G.

(b) Validity is defined as dg ⊳ g1 ↪ g2 : G if and only if g2 = g1 + dg.

(c) Change update coincides with +: g1 ⊕ dg = g1 + dg. Hence, ⊕ agrees perfectly with validity: for all g1 ∈ G, all changes dg are valid from g1 to g1 ⊕ dg (that is, dg ⊳ g1 ↪ g1 ⊕ dg : G).

(d) We define difference as g2 ⊖ g1 = (−g1) + g2. Verifying g2 ⊖ g1 ⊳ g1 ↪ g2 : G reduces to verifying g1 + ((−g1) + g2) = g2, which follows from the group axioms.

(e) The only nil change is the identity element: 0g = e. Validity 0g ⊳ g ↪ g : G reduces to g + e = g, which follows from the group axioms.

(f) Change composition also coincides with +: dg1 ⊚ dg2 = dg1 + dg2. Let's prove that composition respects validity. Our hypothesis is dg1 ⊳ g1 ↪ g2 : G and dg2 ⊳ g2 ↪ g3 : G. Because ⊕ agrees perfectly with validity, the hypothesis is equivalent to g2 = g1 ⊕ dg1 and

  g3 = g2 ⊕ dg2 = (g1 ⊕ dg1) ⊕ dg2.

Our thesis is dg1 ⊚ dg2 ⊳ g1 ↪ g3 : G, that is,

  g3 = g1 ⊕ (dg1 ⊚ dg2).

Hence, the thesis reduces to

  (g1 ⊕ dg1) ⊕ dg2 = g1 ⊕ (dg1 ⊚ dg2),

hence to g1 + (dg1 + dg2) = (g1 + dg1) + dg2, which is just the associative law for group G. □

As we show later, group change structures are useful to support aggregation.
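For instance, the additive group of integers (Z, 0, +, −) induces the familiar change structure on Z, where changes are themselves integers. A small Python sketch of Definition 13.1.3 for this group (our illustration, not code from the thesis):

```python
# Change structure induced by the additive group of integers:
# the change set is Z itself.

def oplus(g, dg):            # g ⊕ dg = g + dg
    return g + dg

def ominus(g2, g1):          # g2 ⊖ g1 = (−g1) + g2
    return -g1 + g2

def nil(g):                  # 0_g = e, the identity element
    return 0

def ocompose(dg1, dg2):      # dg1 ⊚ dg2 = dg1 + dg2
    return dg1 + dg2

def valid(dg, g1, g2):       # dg ⊳ g1 ↪ g2 : G iff g2 = g1 + dg
    return g2 == g1 + dg

# ⊖ produces valid changes, and ⊕ agrees with validity:
assert valid(ominus(7, 3), 3, 7) and oplus(3, ominus(7, 3)) == 7
# Nil changes are valid from a value to itself:
assert valid(nil(42), 42, 42)
# ⊚ respects validity: composing changes 3 ↪ 7 and 7 ↪ 10 gives 3 ↪ 10.
assert valid(ocompose(ominus(7, 3), ominus(10, 7)), 3, 10)
```

The assertions check exactly the laws required by Definition 13.1.1 on a few sample values.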
13.1.2 Properties of change structures

To understand better the definition of change structures, we present next a few lemmas following from this definition.

Lemma 13.1.4 (⊖ inverts ⊕)
⊕ inverts ⊖, that is,

  v1 ⊕ (v2 ⊖ v1) = v2

for any change structure V̂ and any values v1, v2 : V.

Proof. For change structures, we know v2 ⊖ v1 ⊳ v1 ↪ v2 : V, and v1 ⊕ (v2 ⊖ v1) = v2 follows.
More in detail: change dv = v2 ⊖ v1 is a valid change from v1 to v2 (because ⊖ produces valid changes: v2 ⊖ v1 ⊳ v1 ↪ v2 : V), so updating dv's source v1 with dv produces dv's destination v2 (because ⊕ agrees with validity, that is, if dv ⊳ v1 ↪ v2 : V then v1 ⊕ dv = v2). □

Lemma 13.1.5 (A change can't be valid for two destinations with the same source)
Given a change dv : ∆V and a source v1 : V, dv can only be valid with v1 as source for a single destination. That is, if dv ⊳ v1 ↪ v2a : V and dv ⊳ v1 ↪ v2b : V, then v2a = v2b.

Proof. The proof follows, intuitively, because ⊕ also maps change dv and its source v1 to its destination, and ⊕ is a function.
More technically, since ⊕ respects validity, the hypotheses mean that v2a = v1 ⊕ dv = v2b, as required. □

Beware that changes can be valid for multiple sources, and associate them to different destinations. For instance, integer 0 is a valid change for all integers.

If a change dv has source v, dv's destination equals v ⊕ dv. So, to specify that dv is valid with source v, without mentioning dv's destination, we introduce the following definition.

Definition 13.1.6 (One-sided validity)
We define relation 𝒱V as {(v, dv) ∈ V × ∆V | dv ⊳ v ↪ v ⊕ dv : V}.
We use this definition right away:

Lemma 13.1.7 (⊚ and ⊕ interact correctly)
If (v1, dv1) ∈ 𝒱V and (v1 ⊕ dv1, dv2) ∈ 𝒱V, then v1 ⊕ (dv1 ⊚ dv2) = v1 ⊕ dv1 ⊕ dv2.

Proof. We know that ⊚ preserves validity, so under the hypotheses (v1, dv1) ∈ 𝒱V and (v1 ⊕ dv1, dv2) ∈ 𝒱V, we get that dv = dv1 ⊚ dv2 is a valid change from v1 to v1 ⊕ dv1 ⊕ dv2:

  dv1 ⊚ dv2 ⊳ v1 ↪ v1 ⊕ dv1 ⊕ dv2 : V.

Hence, updating dv's source v1 with dv produces dv's destination v1 ⊕ dv1 ⊕ dv2:

  v1 ⊕ (dv1 ⊚ dv2) = v1 ⊕ dv1 ⊕ dv2. □

We can define 0 in terms of the other operations, and prove that it satisfies its requirements for change structures.

Lemma 13.1.8 (0 can be derived from ⊖)
If we define 0v = v ⊖ v, then 0 produces valid changes as required (0v ⊳ v ↪ v : V), for any change structure V̂ and value v : V.

Proof. This follows from validity of ⊖ (v2 ⊖ v1 ⊳ v1 ↪ v2 : V), instantiated with v1 = v and v2 = v. □
Moreover, nil changes are a right identity element for ⊕:

Corollary 13.1.9 (Nil changes are identity elements)
Any nil change dv for a value v is a right identity element for ⊕; that is, we have v ⊕ dv = v for every set V with a basic change structure, every element v ∈ V, and every nil change dv for v.

Proof. This follows from Lemma 13.4.4 and the definition of nil changes. □

The converse of this theorem does not hold: there exist changes dv such that v ⊕ dv = v, yet dv is not a valid change from v to v. More in general, v1 ⊕ dv = v2 does not imply that dv is a valid change. For instance, under earlier definitions for changes on naturals, if we take v = 0 and dv = −5, we have v ⊕ dv = v, even though dv is not valid; examples of invalid changes on functions are discussed in Sec. 12.3.2 and Sec. 15.2. However, we prefer to define "dv is a nil change" as we do, to imply that dv is valid, not just a neutral element.
13.2 Operations on function changes, informally

13.2.1 Examples of nil changes

We have not defined any change structure yet, but we can already exhibit nil changes for some values, including a few functions.

Example 13.2.1
• An environment change for an empty environment ∅ : ⟦ε⟧ must be an environment for the empty context ∆ε = ε, so it must be the empty environment. In other words, if and only if dρ ⊳ ∅ ↪ ∅ : ε, then and only then dρ = ∅, and in particular dρ is a nil change.

• If all values in an environment ρ have nil changes, the environment has a nil change dρ = 0ρ, associating each value to a nil change. Indeed, take a context Γ and a suitable environment ρ : ⟦Γ⟧. For each typing assumption x : τ in Γ, assume we have a nil change 0v for v. Then we define 0ρ : ⟦∆Γ⟧ by structural recursion on ρ as

  0∅ = ∅
  0(ρ, x = v) = 0ρ, x = v, dx = 0v

Then we can see that 0ρ is indeed a nil change for ρ, that is, 0ρ ⊳ ρ ↪ ρ : Γ.

• We have seen in Theorem 12.2.2 that, whenever Γ ⊢ t : τ, ⟦t⟧ has nil change ⟦t⟧∆. Moreover, if we have an appropriate nil environment change dρ ⊳ ρ ↪ ρ : Γ (which we often do, as discussed above), then we also get a nil change ⟦t⟧∆ ρ dρ for ⟦t⟧ ρ:

  ⟦t⟧∆ ρ dρ ⊳ ⟦t⟧ ρ ↪ ⟦t⟧ ρ : τ.

In particular, for all closed well-typed terms ⊢ t : τ, we have

  ⟦t⟧∆ ⊳ ⟦t⟧ ↪ ⟦t⟧ : τ.
13.2.2 Nil changes on arbitrary functions

We have discussed how to find a nil change 0f for a function f if we know the intension of f, that is, its definition. What if we have only its extension, that is, its behavior? Can we still find 0f? That's necessary to implement 0 as an object-language function 0 from f to 0f, since such a function does not have access to f's implementation. That's also necessary to define 0 on metalanguage function spaces; we look at this case first.

We seek a nil change 0f for an arbitrary metalanguage function f : A → B, where A and B are arbitrary sets; we assume a basic change structure on A → B, and will require A and B to support a few additional operations. We require that

  0f ⊳ f ↪ f : A → B.   (13.1)

Equivalently, whenever da ⊳ a1 ↪ a2 : A, then 0f a1 da ⊳ f a1 ↪ f a2 : B. By Lemma 13.4.4, it follows that

  f a1 ⊕ 0f a1 da = f a2,   (13.2)

where a1 ⊕ da = a2.

To define 0f, we solve Eq. (13.2). To understand how, we use an analogy: ⊕ and ⊖ are intended to resemble + and −. To solve f a1 + 0f a1 da = f a2, we subtract f a1 from both sides and write 0f a1 da = f a2 − f a1.

Similarly, here we use operator ⊖: 0f must equal

  0f = λa1 da → f (a1 ⊕ da) ⊖ f a1.   (13.3)

Because b2 ⊖ b1 ⊳ b1 ↪ b2 : B for all b1, b2 : B, we can verify that 0f as defined by Eq. (13.3) satisfies our original requirement Eq. (13.1), not just its weaker consequence Eq. (13.2).

We have shown that, to define 0 on functions f : A → B, we can use ⊖ at type B. Without using f's intension, we are aware of no alternative. To ensure 0f is defined for all f, we require that change structures define ⊖. We can then define 0 as a derived operation via 0v = v ⊖ v, and verify that this derived definition satisfies the requirements for 0.
Next, we show how to define ⊖ on functions. We seek a valid function change f2 ⊖ f1 from f1 to f2. We have just sought and found a valid change from f to f; generalizing the reasoning we used, we obtain that whenever da ⊳ a1 ↪ a2 : A, we need to have (f2 ⊖ f1) a1 da ⊳ f1 a1 ↪ f2 a2 : B. Since a2 = a1 ⊕ da, we can define

  f2 ⊖ f1 = λa1 da → f2 (a1 ⊕ da) ⊖ f1 a1.   (13.4)

One can verify that Eq. (13.4) defines f2 ⊖ f1 as a valid function change from f1 to f2, as desired. And after defining f2 ⊖ f1, we need no more to define 0f separately using Eq. (13.3). We can just define 0f = f ⊖ f, simplify through the definition of ⊖ in Eq. (13.4), and reobtain Eq. (13.3) as a derived equation:

  0f = f ⊖ f = λa1 da → f (a1 ⊕ da) ⊖ f a1.

We defined f2 ⊖ f1 on metalanguage functions. We can also internalize change operators in our object language, that is, define for each type τ object-level terms ⊕τ, ⊖τ and so on, with the same behavior. We can define object-language change operators such as ⊖ using the same equations. However, the produced function change df is slow: it recomputes the old output f1 a1, computes the new output f2 a2, and takes the difference.

However, we can implement ⊖σ→τ using replacement changes, if they are supported by the change structure on type τ. Let us define ⊖τ as b2 ⊖ b1 = !b2, and simplify Eq. (13.4); we obtain

  f2 ⊖ f1 = λa1 da → !(f2 (a1 ⊕ da)).
We could even imagine allowing replacement changes on functions themselves However herethe bang operator needs to be dened to produce a function change that can be applied hence
f2 = λa1 dararr (f2 (a1 oplus da))
Alternatively as we see in Appendix D we could represent function changes not as functionsbut as data through defunctionalization and provide a function applying defunctionalized functionchanges df ∆(σ rarr τ ) to inputs t1 σ and dt ∆σ In this case f2 would simply be another way toproduce defunctionalized function changes
1323 Constraining oplus on functionsNext we discuss how oplus must be dened on functions and show informally why we must denef1oplusdf = λv rarr f1 xoplusdf v 0v to prove that oplus on functions agrees with validity (that is Lemma 1344)
We know that a valid function change df ▷ f1 → f2 : A → B takes valid input changes dv ▷ v1 → v2 : A to valid output changes df v1 dv ▷ f1 v1 → f2 v2 : B. We require that ⊕ agrees with validity (Lemma 13.4.4), so f2 = f1 ⊕ df, v2 = v1 ⊕ dv and

f2 v2 = (f1 ⊕ df) (v1 ⊕ dv) = f1 v1 ⊕ df v1 dv    (13.5)

Instantiating dv with 0v gives the equation

(f1 ⊕ df) v1 = (f1 ⊕ df) (v1 ⊕ 0v) = f1 v1 ⊕ df v1 0v

which is not only a requirement on ⊕ for functions, but also defines ⊕ effectively.
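As an illustration (ours, not the thesis's: we instantiate both A and B to `Int` with plain difference changes), the forced definition of ⊕ on functions can be checked directly against Eq. (13.5):

```haskell
-- Difference changes on Int: dv stands for v2 - v1.
type DInt = Int

oplus :: Int -> DInt -> Int
oplus = (+)

-- A function change from f1 to f2: df a da = f2 (a ⊕ da) ⊖ f1 a.
type DFun = Int -> DInt -> DInt

-- The forced definition: (f1 ⊕ df) v = f1 v ⊕ df v 0.
oplusF :: (Int -> Int) -> DFun -> Int -> Int
oplusF f1 df v = f1 v `oplus` df v 0

-- Example base and updated functions, with the change between them.
f1, f2 :: Int -> Int
f1 = (* 3)
f2 = (* 5)

df :: DFun
df a da = f2 (a `oplus` da) - f1 a
```

With these definitions, `oplusF f1 df` agrees with `f2` on every input, and `oplusF f1 df (v1 `oplus` dv) == f1 v1 `oplus` df v1 dv` is exactly an instance of Eq. (13.5).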
13.3 Families of change structures

In this section we derive change structures for A → B and A × B from two change structures Â and B̂. The change structure on A → B enables defining change structures for function types. Similarly, the change structure on A × B enables defining a change structure for product types in a language plugin, as described informally in Sec. 11.5. In Sec. 11.6 we also discuss informally change structures for disjoint sums. Formally, we can derive a change structure for the disjoint union of sets A + B (from change structures for A and B), and this enables defining change structures for sum types; we have mechanized the required proofs, but omit the tedious details here.
13.3.1 Change structures for function spaces

Sec. 13.2 introduces informally how to define change operations on A → B. Next, we define formally change structures on function spaces, and then prove that their operations respect their constraints.
Definition 13.3.1 (Change structure for A → B)
Given change structures Â and B̂, we define a change structure on their function space A → B, written Â → B̂, where:

(a) The change set is defined as ∆(A → B) = A → ∆A → ∆B.

(b) Validity is defined as

df ▷ f1 → f2 : A → B = ∀a1, da, a2. (da ▷ a1 → a2 : A) implies (df a1 da ▷ f1 a1 → f2 a2 : B)
(c) We define change update by

f1 ⊕ df = λa → f1 a ⊕ df a 0a

(d) We define difference by

f2 ⊖ f1 = λa da → f2 (a ⊕ da) ⊖ f1 a

(e) We define 0 like in Lemma 13.1.8, as

0f = f ⊖ f

(f) We define change composition as

df1 ⊚ df2 = λa da → df1 a 0a ⊚ df2 a da
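The operations of this definition can be transcribed directly into Haskell. This sketch (ours, under the same assumption as before: both A and B fixed to `Int` with difference changes, where change composition on the base structure is just addition) spells out items (c), (d) and (f):

```haskell
type DInt = Int            -- difference changes on Int

oplus :: Int -> DInt -> Int
oplus = (+)

ominus :: Int -> Int -> DInt
ominus = (-)

-- Change composition on Int: a change from v1 to v2 composed with one
-- from v2 to v3 is a change from v1 to v3.
ocompose :: DInt -> DInt -> DInt
ocompose = (+)

type DFun = Int -> DInt -> DInt

-- (c) f1 ⊕ df = λa → f1 a ⊕ df a 0a
oplusF :: (Int -> Int) -> DFun -> Int -> Int
oplusF f1 df a = f1 a `oplus` df a 0

-- (d) f2 ⊖ f1 = λa da → f2 (a ⊕ da) ⊖ f1 a
ominusF :: (Int -> Int) -> (Int -> Int) -> DFun
ominusF f2 f1 a da = f2 (a `oplus` da) `ominus` f1 a

-- (f) df1 ⊚ df2 = λa da → df1 a 0a ⊚ df2 a da
ocomposeF :: DFun -> DFun -> DFun
ocomposeF df1 df2 a da = df1 a 0 `ocompose` df2 a da
```

One can check on samples that composing `f2 `ominusF` f1` with `f3 `ominusF` f2` yields a change that updates `f1` to `f3`, as the correctness lemma below requires.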
Lemma 13.3.2
Definition 13.3.1 defines a correct change structure Â → B̂.

Proof.
• We prove that ⊕ agrees with validity on A → B. Consider f1, f2 : A → B and df ▷ f1 → f2 : A → B; we must show that f1 ⊕ df = f2. By functional extensionality, we only need prove that (f1 ⊕ df) a = f2 a, that is, that f1 a ⊕ df a 0a = f2 a. Since ⊕ agrees with validity on B, we just need to show that df a 0a ▷ f1 a → f2 a : B, which follows because 0a is a valid change from a to a and because df is a valid change from f1 to f2.

• We prove that ⊖ produces valid changes on A → B. Consider df = f2 ⊖ f1 for f1, f2 : A → B. For any valid input da ▷ a1 → a2 : A, we must show that df produces a valid output with the correct vertexes, that is, that df a1 da ▷ f1 a1 → f2 a2 : B. Since ⊕ agrees with validity, a2 equals a1 ⊕ da. By substituting away a2 and df, the thesis becomes f2 (a1 ⊕ da) ⊖ f1 a1 ▷ f1 a1 → f2 (a1 ⊕ da) : B, which is true because ⊖ produces valid changes on B.

• 0 produces valid changes, as proved in Lemma 13.1.8.
• We prove that change composition preserves validity on A → B. That is, we must prove that

df1 a1 0a1 ⊚ df2 a1 da ▷ f1 a1 → f3 a2 : B

for every f1, f2, f3, df1, df2, a1, da, a2 satisfying df1 ▷ f1 → f2 : A → B, df2 ▷ f2 → f3 : A → B and da ▷ a1 → a2 : A.

Because change composition preserves validity on B, it's enough to prove that (1) df1 a1 0a1 ▷ f1 a1 → f2 a1 : B and (2) df2 a1 da ▷ f2 a1 → f3 a2 : B. That is, intuitively, we create a composite change using ⊚, and it goes from f1 a1 to f3 a2 passing through f2 a1. Part (1) follows because df1 is a valid function change from f1 to f2, applied to a valid change 0a1 from a1 to a1. Part (2) follows because df2 is a valid function change from f2 to f3, applied to a valid change da from a1 to a2.
13.3.2 Change structures for products

We can define change structures on products A × B, given change structures on A and B: a change on pairs is just a pair of changes, all other change structure definitions distribute componentwise in the same way, and their correctness reduces to the correctness on components.

Change structures on n-ary products or records present no additional difficulty.
Definition 13.3.3 (Change structure for A × B)
Given change structures Â and B̂, we define a change structure Â × B̂ on product A × B:

(a) The change set is defined as ∆(A × B) = ∆A × ∆B.

(b) Validity is defined as

(da, db) ▷ (a1, b1) → (a2, b2) : A × B = (da ▷ a1 → a2 : A) ∧ (db ▷ b1 → b2 : B)

In other words, validity distributes componentwise: a product change is valid if each component is valid.

(c) We define change update by

(a1, b1) ⊕ (da, db) = (a1 ⊕ da, b1 ⊕ db)

(d) We define difference by

(a2, b2) ⊖ (a1, b1) = (a2 ⊖ a1, b2 ⊖ b1)

(e) We define 0 to distribute componentwise:

0(a, b) = (0a, 0b)

(f) We define change composition to distribute componentwise:

(da1, db1) ⊚ (da2, db2) = (da1 ⊚ da2, db1 ⊚ db2)
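Since everything distributes componentwise, the product construction is immediate to transcribe. Here is a Haskell sketch (ours, not the thesis's: both components are instantiated to `Int` with difference changes):

```haskell
type DInt = Int                    -- difference changes on Int

oplus :: Int -> DInt -> Int
oplus = (+)

ominus :: Int -> Int -> DInt
ominus = (-)

type DPair = (DInt, DInt)          -- (a) the change set: ∆(A × B) = ∆A × ∆B

-- (c) change update, componentwise
oplusP :: (Int, Int) -> DPair -> (Int, Int)
oplusP (a, b) (da, db) = (a `oplus` da, b `oplus` db)

-- (d) difference, componentwise
ominusP :: (Int, Int) -> (Int, Int) -> DPair
ominusP (a2, b2) (a1, b1) = (a2 `ominus` a1, b2 `ominus` b1)

-- (e) nil changes, componentwise
nilP :: (Int, Int) -> DPair
nilP p = p `ominusP` p
```

Updating a pair by the difference of two pairs recovers the destination, which is the "⊕ agrees with validity" property proved in full in the lemma that follows.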
Lemma 13.3.4
Definition 13.3.3 defines a correct change structure Â × B̂.

Proof. Since all these proofs are similar, and spelling out their details does not make them clearer, we only give the first such proof in full.

• ⊕ agrees with validity on A × B because ⊕ agrees with validity on both A and B. For this property we give a full proof.
For each p1, p2 : A × B and dp ▷ p1 → p2 : A × B, we must show that p1 ⊕ dp = p2. Instead of quantifying over pairs p : A × B, we can quantify equivalently over components a : A, b : B. Hence, consider a1, a2 : A, b1, b2 : B, and changes da, db that are valid, that is, da ▷ a1 → a2 : A and db ▷ b1 → b2 : B. We must show that

(a1, b1) ⊕ (da, db) = (a2, b2)

That follows from a1 ⊕ da = a2 (which follows from da ▷ a1 → a2 : A) and b1 ⊕ db = b2 (which follows from db ▷ b1 → b2 : B).
• ⊖ produces valid changes on A × B because ⊖ produces valid changes on both A and B. We omit a full proof; the key step reduces the thesis

(a2, b2) ⊖ (a1, b1) ▷ (a1, b1) → (a2, b2) : A × B

to a2 ⊖ a1 ▷ a1 → a2 : A and b2 ⊖ b1 ▷ b1 → b2 : B (where free variables range over suitable domains).

• 0(a, b) is correct, that is (0a, 0b) ▷ (a, b) → (a, b) : A × B, because 0 is correct on each component.

• Change composition is correct on A × B, that is,

(da1 ⊚ da2, db1 ⊚ db2) ▷ (a1, b1) → (a3, b3) : A × B

whenever (da1, db1) ▷ (a1, b1) → (a2, b2) : A × B and (da2, db2) ▷ (a2, b2) → (a3, b3) : A × B, in essence because change composition is correct on both A and B. We omit details.
13.4 Change structures for types and contexts

As promised, given change structures for base types, we can provide change structures for all types.

Plugin Requirement 13.4.1 (Change structures for base types)
For each base type ι, we must have a change structure ι̂ defined on base set ⟦ι⟧, based on the basic change structures defined earlier.

Definition 13.4.2 (Change structure for types)
For each type τ, we define a change structure τ̂ on base set ⟦τ⟧ by structural recursion: for a base type ι, the change structure ι̂ is the one required from the plugin; for a function type σ → τ, the change structure is the function-space change structure σ̂ → τ̂ of Definition 13.3.1.
Lemma 13.4.3
Change sets and validity as defined in Definition 13.4.2 give rise to the same basic change structures as the ones defined earlier in Definition 12.1.15, and to the change operations described in Fig. 13.1a.

Proof. This can be verified by induction on types. For each case, it is sufficient to compare definitions.

Lemma 13.4.4 (⊕ agrees with validity)
If dv ▷ v1 → v2 : τ then v1 ⊕ dv = v2.

Proof. Because τ̂ is a change structure, and in change structures ⊕ agrees with validity.
As proved briefly in Sec. 12.2, since ⊕ agrees with validity (Lemma 13.4.4) and D⟦–⟧ is correct (Theorem 12.2.2), we get Corollary 13.4.5.
Corollary 13.4.5 (D⟦–⟧ is correct, corollary)
If Γ ⊢ t : τ and dρ ▷ ρ1 → ρ2 : Γ, then ⟦t⟧ ρ1 ⊕ ⟦D⟦t⟧⟧ dρ = ⟦t⟧ ρ2.
We can also define a change structure for environments. Recall that change structures for products define their operations to act on each component. Each change structure operation for environments acts "variable-wise". Recall that a typing context Γ is a list of variable assignments x : τ. For each such entry, environments ρ and environment changes dρ contain a base entry x = v where v : ⟦τ⟧, and possibly a change dx = dv where dv : ⟦∆τ⟧.

Definition 13.4.6 (Change structure for environments)
To each context Γ we associate a change structure Γ̂ that extends the basic change structure from Definition 12.1.25. Operations are defined as shown in Fig. 13.1b.

Base values v′ in environment changes are redundant with base values v in base environments, because for valid changes v = v′. So, when consuming an environment change, we choose arbitrarily to use v instead of v′. Alternatively, we could also use v′ and get the same results for valid inputs. When producing an environment change, base values are chosen to ensure validity of the resulting environment change.
Lemma 13.4.7
Definition 13.4.6 defines a correct change structure Γ̂ for each context Γ.

Proof. All proofs are by structural induction on contexts. Most details are analogous to the ones for products and add nothing new, so we refer to our mechanization for most proofs.

However, we show by induction that ⊕ agrees with validity. For the empty context there's a single environment in ⟦ε⟧, so ⊕ returns the correct environment. For the inductive case Γ′, x : τ, inversion on the validity judgment reduces our hypothesis to dv ▷ v1 → v2 : τ and dρ ▷ ρ1 → ρ2 : Γ′, and our thesis to (ρ1, x = v1) ⊕ (dρ, x = v1, dx = dv) = (ρ2, x = v2), where v1 appears both in the base environment and in the environment change. The thesis follows because ⊕ agrees with validity on both Γ̂′ and τ̂.
We summarize definitions on types in Fig. 13.1.

Finally, we can lift change operators from the semantic level to the syntactic level, so that their meaning is coherent.

Definition 13.4.8 (Term-level change operators)
We define type-indexed families of change operators at the term level, with the following signatures:

⊕τ : τ → ∆τ → τ
⊖τ : τ → τ → ∆τ
0τ : τ → ∆τ
⊚τ : ∆τ → ∆τ → ∆τ

and definitions:

tf1 ⊕σ→τ dtf = λx → tf1 x ⊕ dtf x 0x
tf2 ⊖σ→τ tf1 = λx dx → tf2 (x ⊕ dx) ⊖ tf1 x
0σ→τ tf = tf ⊖σ→τ tf
dtf1 ⊚σ→τ dtf2 = λx dx → dtf1 x 0x ⊚ dtf2 x dx
tf1 ⊕ι dtf = …    tf2 ⊖ι tf1 = …    0ι tf = …    dtf1 ⊚ι dtf2 = …

(the base-type cases are left to the language plugin; see Plugin Requirement 13.4.10).
Lemma 13.4.9 (Term-level change operators agree with change structures)
The following equations hold for all types τ, contexts Γ, well-typed terms Γ ⊢ t1, t2 : τ and ∆Γ ⊢ dt, dt1, dt2 : ∆τ, and environments ρ : ⟦Γ⟧, dρ : ⟦∆Γ⟧, such that all expressions are defined:

⟦t1 ⊕τ dt⟧ dρ = ⟦t1⟧ dρ ⊕ ⟦dt⟧ dρ
⟦t2 ⊖τ t1⟧ ρ = ⟦t2⟧ ρ ⊖ ⟦t1⟧ ρ
⟦0τ t⟧ ρ = 0(⟦t⟧ ρ)
⟦dt1 ⊚τ dt2⟧ dρ = ⟦dt1⟧ dρ ⊚ ⟦dt2⟧ dρ
Proof. By induction on types, and simplifying both sides of the equalities. The proofs for ⊕ and ⊖ must be done by simultaneous induction.

At the time of writing, we have not mechanized the proof for ⊚.

To define the lifting and prove it coherent on base types, we must add a further plugin requirement.

Plugin Requirement 13.4.10 (Term-level change operators for base types)
For each base type ι, we define change operators as required by Definition 13.4.8, satisfying the requirements of Lemma 13.4.9.
13.5 Development history

The proof presented in this and the previous chapter is a significant evolution of the original one by Cai et al. [2014]. While this formalization and the mechanization are both original to this thesis, some ideas were suggested by other (currently unpublished) developments by Yufei Cai and by Yann Régis-Gianas. Yufei Cai gives a simpler pen-and-paper set-theoretic proof by separating validity, while we noticed that separating validity works equally well in a mechanized type theory and simplifies the mechanization. The first to use a two-sided validity relation was Yann Régis-Gianas, but using a big-step operational semantics, while we were collaborating on an ILC correctness proof for untyped λ-calculus (as in Appendix C). I gave the first complete and mechanized ILC correctness proof using two-sided validity, again for a simply-typed λ-calculus with a denotational semantics. Based on two-sided validity, I also reconstructed the rest of the theory of changes.
13.6 Chapter conclusion

In this chapter we have seen how to define change operators both on semantic values and on terms, and what their guarantees are, via the notion of change structures. We have also defined change structures for groups, function spaces and products. Finally, we have explained and shown Corollary 13.4.5. We continue in the next chapter by discussing how to reason syntactically about changes, before finally showing how to define some language plugins.
⊕τ : ⟦τ⟧ → ⟦∆τ⟧ → ⟦τ⟧        ⊖τ : ⟦τ⟧ → ⟦τ⟧ → ⟦∆τ⟧
0– : ⟦τ⟧ → ⟦∆τ⟧              ⊚τ : ⟦∆τ⟧ → ⟦∆τ⟧ → ⟦∆τ⟧

f1 ⊕σ→τ df = λv → f1 v ⊕ df v 0v                v1 ⊕ι dv = …
f2 ⊖σ→τ f1 = λv dv → f2 (v ⊕ dv) ⊖ f1 v         v2 ⊖ι v1 = …
0v = v ⊖τ v
df1 ⊚σ→τ df2 = λv dv → df1 v 0v ⊚ df2 v dv      dv1 ⊚ι dv2 = …

(a) Change structure operations on types (see Definition 13.4.2)

⊕Γ : ⟦Γ⟧ → ⟦∆Γ⟧ → ⟦Γ⟧        ⊖Γ : ⟦Γ⟧ → ⟦Γ⟧ → ⟦∆Γ⟧
0– : ⟦Γ⟧ → ⟦∆Γ⟧              ⊚Γ : ⟦∆Γ⟧ → ⟦∆Γ⟧ → ⟦∆Γ⟧

∅ ⊕ ∅ = ∅
(ρ, x = v) ⊕ (dρ, x = v′, dx = dv) = (ρ ⊕ dρ, x = v ⊕ dv)

∅ ⊖ ∅ = ∅
(ρ2, x = v2) ⊖ (ρ1, x = v1) = (ρ2 ⊖ ρ1, x = v1, dx = v2 ⊖ v1)

0∅ = ∅
0(ρ, x = v) = (0ρ, x = v, dx = 0v)

∅ ⊚ ∅ = ∅
(dρ1, x = v1, dx = dv1) ⊚ (dρ2, x = v2, dx = dv2) = (dρ1 ⊚ dρ2, x = v1, dx = dv1 ⊚ dv2)

(b) Change structure operations on environments (see Definition 13.4.6)

Lemma 13.4.4 (⊕ agrees with validity)
If dv ▷ v1 → v2 : τ then v1 ⊕ dv = v2.

Corollary 13.4.5 (D⟦–⟧ is correct, corollary)
If Γ ⊢ t : τ and dρ ▷ ρ1 → ρ2 : Γ, then ⟦t⟧ ρ1 ⊕ ⟦D⟦t⟧⟧ dρ = ⟦t⟧ ρ2.

Figure 13.1: Defining change structures
Chapter 14
Equational reasoning on changes
In this chapter, we formalize equational reasoning directly on terms, rather than on semantic values (Sec. 14.1), and we discuss when two changes can be considered equivalent (Sec. 14.2). We also show, as an example, a simple change structure on lists and a derivative of map for it (Example 14.1.1).

To reason on terms, instead of describing the updated value of a term t by using an updated environment ρ2, we substitute in t each variable xi with expression xi ⊕ dxi, to produce a term that computes the updated value of t; this way we can say that dx is a change from x to x ⊕ dx, or that df x dx is a change from f x to (f ⊕ df) (x ⊕ dx). Much of the work in this chapter is needed to modify definitions and go from using an updated environment to using substitution in this fashion.

To compare for equivalence terms that use changes, we can't use denotational equivalence, but must restrict attention to valid changes.

Comparing changes is trickier: most often we are not interested in whether two changes produce the same value, but in whether two changes have the same source and destination. Hence, if two changes share source and destination, we say they are equivalent. As we show in this chapter, operations that preserve validity also respect change equivalence, because for all those operations the source and destination of any output change only depend on the source and destination of the input changes. Between the same source and destination there are often multiple changes, and the difference among them can affect how long a derivative takes, but not whether its result is correct.

We also show that change equivalence is a particular form of logical relation: a logical partial equivalence relation (PER). PERs are well known to semanticists, and are often used to study sets with invalid elements, together with the appropriate equivalence on those sets.

The rest of the chapter expands on the details, even if they are not especially surprising.
14.1 Reasoning on changes syntactically

To define derivatives of primitives, we will often discuss validity of changes directly on programs, for instance saying that dx is a valid change from x to x ⊕ dx, or that f x ⊕ df x dx is equivalent to f (x ⊕ dx) if all changes in scope are valid.

In this section we formalize these notions. We have not mechanized proofs involving substitutions, but we include them here, even though they are not especially interesting.

But first, we exemplify these notions informally.
Example 14.1.1 (Deriving map on lists)
Let's consider again the example from Sec. 11.3.3, in particular dmap. Recall that a list change dxs is valid for source xs if they have the same length and each element change is valid for its corresponding element.
map :: (a → b) → List a → List b
map f Nil = Nil
map f (Cons x xs) = Cons (f x) (map f xs)

dmap :: (a → b) → ∆(a → b) → List a → ∆List a → ∆List b
-- A valid list change has the same length as the base list
dmap f df Nil Nil = Nil
dmap f df (Cons x xs) (Cons dx dxs) =
  Cons (df x dx) (dmap f df xs dxs)
-- Remaining cases deal with invalid changes, and a dummy
-- result is sufficient
dmap f df xs dxs = Nil
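The listing above is written against abstract change types; to run it, we can instantiate the change structure concretely. The following sketch is our instantiation, not the thesis's: it uses Haskell's built-in lists for List, integer differences for element changes, and same-length lists of element changes for list changes.

```haskell
type DInt = Int            -- element changes: differences
type DList = [DInt]        -- a valid list change has the base list's length

oplus :: Int -> DInt -> Int
oplus = (+)

-- ⊕ on lists applies element changes pointwise.
oplusL :: [Int] -> DList -> [Int]
oplusL = zipWith oplus

type DFun = Int -> DInt -> DInt

dmap :: (Int -> Int) -> DFun -> [Int] -> DList -> DList
dmap f df (x : xs) (dx : dxs) = df x dx : dmap f df xs dxs
dmap _ _ _ _ = []          -- dummy result for invalid changes

-- A base function, an updated function, and a valid change between them.
f1, f2 :: Int -> Int
f1 = (* 3)
f2 = (* 5)

df :: DFun
df x dx = f2 (x `oplus` dx) - f1 x
```

On sample inputs one can then check the interchangeability claim, namely that `map f1 xs `oplusL` dmap f1 df xs dxs` equals `map f2 (xs `oplusL` dxs)`.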
In our example, one can show that dmap is a correct derivative for map. As a consequence, terms map (f ⊕ df) (xs ⊕ dxs) and map f xs ⊕ dmap f df xs dxs are interchangeable in all valid contexts, that is, contexts that bind df and dxs to valid changes for, respectively, f and xs.

We sketch an informal proof directly on terms.

Proof sketch. We must show that dy = dmap f df xs dxs is a change from initial output y1 = map f xs to updated output y2 = map (f ⊕ df) (xs ⊕ dxs), for valid inputs df and dxs.

We proceed by structural induction on xs and dxs (technically, on a proof of validity of dxs). Since dxs is valid, those two lists have to be of the same length. If xs and dxs are both empty, y1 = dy = y2 = Nil, so dy is a valid change as required.
For the inductive step, both lists are Cons nodes, so we need to show that output change

dy = dmap f df (Cons x xs) (Cons dx dxs)

is a valid change from

y1 = map f (Cons x xs)

to

y2 = map (f ⊕ df) (Cons (x ⊕ dx) (xs ⊕ dxs))

To restate validity, we name heads and tails of dy, y1 and y2: if we write dy = Cons dh dt, y1 = Cons h1 t1 and y2 = Cons h2 t2, we need to show that dh is a change from h1 to h2 and that dt is a change from t1 to t2.

Indeed, head change dh = df x dx is a valid change from h1 = f x to h2 = f (x ⊕ dx). And tail change dt = dmap f df xs dxs is a valid change from t1 = map f xs to t2 = map (f ⊕ df) (xs ⊕ dxs) by induction. Hence dy is a valid change from y1 to y2.
Hopefully this proof is already convincing, but it relies on undefined concepts. On metalevel functions we could already make this proof formal, but not yet on terms. In this section, we define the required concepts.

14.1.1 Denotational equivalence for valid changes

This example uses the notion of denotational equivalence for valid changes. We now proceed to formalize it. For reference, we first recall denotational equivalence of terms, and then introduce its restriction.
Definition A.2.5 (Denotational equivalence)
We say that two terms Γ ⊢ t1 : τ and Γ ⊢ t2 : τ are denotationally equal, and write Γ ⊨ t1 ≃ t2 : τ (or sometimes t1 ≃ t2), if for all environments ρ : ⟦Γ⟧ we have that ⟦t1⟧ ρ = ⟦t2⟧ ρ.

For open terms t1, t2 that depend on changes, denotational equivalence is too restrictive, since it requires t1 and t2 to also be equal when the changes they depend on are not valid. By restricting denotational equivalence to valid environment changes, and letting terms depend on change contexts, we obtain the following definition.

Definition 14.1.2 (Denotational equivalence for valid changes)
For any context Γ and type τ, we say that two terms ∆Γ ⊢ t1 : τ and ∆Γ ⊢ t2 : τ are denotationally equal for valid changes, and write ∆Γ ⊨ t1 ≃∆ t2 : τ, if for all valid environment changes dρ ▷ ρ1 → ρ2 : Γ we have that t1 and t2 evaluate in environment dρ to the same value, that is, ⟦t1⟧ dρ = ⟦t2⟧ dρ.
Example 14.1.3
Terms f x ⊕ df x dx and f (x ⊕ dx) are denotationally equal for valid changes (for any types σ, τ): ∆(f : σ → τ, x : σ) ⊨ f x ⊕ df x dx ≃∆ f (x ⊕ dx) : τ.

Example 14.1.4
One of our claims in Example 14.1.1 can now be written as

∆Γ ⊨ map (f ⊕ df) (xs ⊕ dxs) ≃∆ map f xs ⊕ dmap f df xs dxs : List τ

for a suitable context Γ = f : σ → τ, xs : List σ, map : (σ → τ) → List σ → List τ (and for any types σ, τ).
Arguably, the need for a special equivalence is a defect in our semantics of change programs; it might be preferable to make the type of changes abstract throughout the program (except for derivatives of primitives, which must inspect changes), but this is not immediate, especially in a module system like Haskell's. Other possible alternatives are discussed in Sec. 15.4.
14.1.2 Syntactic validity

Next, we define syntactic validity, that is, when a change term dt is a (valid) change from source term t1 to destination t2. Intuitively, dt is valid from t1 to t2 if dt, t1 and t2, all evaluated against the same valid environment change dρ, produce a valid change together with its source and destination. Formally:

Definition 14.1.5 (Syntactic validity)
We say that term ∆Γ ⊢ dt : ∆τ is a (syntactic) change from ∆Γ ⊢ t1 : τ to ∆Γ ⊢ t2 : τ, and write Γ ⊨ dt ▷ t1 → t2 : τ, if

∀dρ ▷ ρ1 → ρ2 : Γ. ⟦dt⟧ dρ ▷ ⟦t1⟧ dρ → ⟦t2⟧ dρ : τ

Notation 14.1.6
We often simply say that dt is a change from t1 to t2, leaving everything else implicit when not important.

Using syntactic validity, we can show for instance that dx is a change from x to x ⊕ dx, and that df x dx is a change from f x to (f ⊕ df) (x ⊕ dx); both examples follow from a general statement about D⟦–⟧ (Theorem 14.1.9). Our earlier informal proof of the correctness of dmap (Example 14.1.1) can also be justified in terms of syntactic validity.
Just like (semantic) ⊕ agrees with validity, term-level (or syntactic) ⊕ agrees with syntactic validity, up to denotational equivalence for valid changes.

Lemma 14.1.7 (Term-level ⊕ agrees with syntactic validity)
If dt is a change from t1 to t2 (Γ ⊨ dt ▷ t1 → t2 : τ), then t1 ⊕ dt and t2 are denotationally equal for valid changes (∆Γ ⊨ t1 ⊕ dt ≃∆ t2 : τ).

Proof. Follows because term-level ⊕ agrees with semantic ⊕ (Lemma 13.4.9), and ⊕ agrees with validity. In more detail, fix an arbitrary valid environment change dρ ▷ ρ1 → ρ2 : Γ. First, we have ⟦dt⟧ dρ ▷ ⟦t1⟧ dρ → ⟦t2⟧ dρ : τ, because of syntactic validity. Then we conclude with this calculation:

⟦t1 ⊕ dt⟧ dρ
= { term-level ⊕ agrees with ⊕ (Lemma 13.4.9) }
⟦t1⟧ dρ ⊕ ⟦dt⟧ dρ
= { ⊕ agrees with validity }
⟦t2⟧ dρ
Beware that the definition of Γ ⊨ dt ▷ t1 → t2 : τ evaluates all terms against a change environment dρ, containing separately base values and changes. In contrast, if we use validity in the change structure for ⟦Γ⟧ → ⟦τ⟧, we evaluate different terms against different environments. That is why we have that dx is a change from x to x ⊕ dx (where x ⊕ dx is evaluated in environment dρ), while ⟦dx⟧ is a valid change from ⟦x⟧ to ⟦x⟧ (where the destination ⟦x⟧ gets applied to environment ρ2).
Is syntactic validity trivial? Without needing syntactic validity, based on earlier chapters one can show that dv is a valid change from v to v ⊕ dv, or that df v dv is a valid change from f v to (f ⊕ df) (v ⊕ dv), or further examples. But that's all about values. In this section we are just translating these notions to the level of terms, and formalizing them.

Our semantics is arguably (intuitively) a trivial one, similar to a metainterpreter interpreting object-language functions in terms of metalanguage functions: our semantics simply embeds an object-level λ-calculus into a meta-level and more expressive λ-calculus, mapping for instance λf x → f x (an AST) into λf v → f v (syntax for a metalevel function). Hence, proofs in this section about syntactic validity deal mostly with this translation. We don't expect the proofs to give special insights, and we expect most developments would keep such issues informal (which is certainly reasonable).

Nevertheless, we write out the statements to help readers refer to them, and write out (mostly) full proofs to help ourselves (and readers) verify them. Proofs are mostly standard, but with a few twists, since we must often consider and relate three computations: the computation on initial values and the ones on the change and on updated values.

We're also paying a proof debt. Had we used substitution and a small-step semantics, we'd have directly simple statements on terms, instead of trickier ones involving terms and environments. We produce those statements now.
Differentiation and syntactic validity

We can also show that D⟦t⟧ produces a syntactically valid change from t, but we need to specify its destination. In general, D⟦t⟧ is not a change from t to t. The destination must evaluate to the updated value of t; to produce a term that evaluates to the right value, we use substitution. If the only free variable in t is x, then D⟦t⟧ is a syntactic change from t to t [x = x ⊕ dx]. To repeat the same for all variables in context Γ, we use the following notation.
Notation 14.1.8
We write t [Γ = Γ ⊕ ∆Γ] to mean t [x1 = x1 ⊕ dx1, x2 = x2 ⊕ dx2, …, xn = xn ⊕ dxn].

Theorem 14.1.9 (D⟦–⟧ is correct, syntactically)
For any well-typed term Γ ⊢ t : τ, term D⟦t⟧ is a syntactic change from t to t [Γ = Γ ⊕ ∆Γ].

We present the following straightforward (if tedious) proof (formal but not mechanized).
Proof. Let t2 = t [Γ = Γ ⊕ ∆Γ]. Take any dρ ▷ ρ1 → ρ2 : Γ. We must show that ⟦D⟦t⟧⟧ dρ ▷ ⟦t⟧ dρ → ⟦t2⟧ dρ : τ.

Because dρ extends ρ1 and t only needs entries in ρ1, we can show that ⟦t⟧ dρ = ⟦t⟧ ρ1, so our thesis becomes ⟦D⟦t⟧⟧ dρ ▷ ⟦t⟧ ρ1 → ⟦t2⟧ dρ : τ.

Because D⟦–⟧ is correct (Theorem 12.2.2), we know that ⟦D⟦t⟧⟧ dρ ▷ ⟦t⟧ ρ1 → ⟦t⟧ ρ2 : τ; that's almost our thesis, so we must only show that ⟦t2⟧ dρ = ⟦t⟧ ρ2. Since ⊕ agrees with validity and dρ is valid, we have that ρ2 = ρ1 ⊕ dρ, so our thesis is now the following equation, which we leave to Lemma 14.1.10:

⟦t [Γ = Γ ⊕ ∆Γ]⟧ dρ = ⟦t⟧ (ρ1 ⊕ dρ)
Here's the technical lemma to complete the proof.

Lemma 14.1.10
For any Γ ⊢ t : τ and dρ ▷ ρ1 → ρ2 : Γ, we have

⟦t [Γ = Γ ⊕ ∆Γ]⟧ dρ = ⟦t⟧ (ρ1 ⊕ dρ)

Proof. This follows from the structure of valid environment changes, because term-level ⊕ (used on the left-hand side) agrees with value-level ⊕ (used on the right-hand side) by Lemma 13.4.9, and because of the substitution lemma.

More formally, we can show the thesis by induction over environments: for empty environments, the equation reduces to ⟦t⟧ = ⟦t⟧. For the case of Γ, x : σ (where x ∉ Γ), the thesis can be rewritten as

⟦(t [Γ = Γ ⊕ ∆Γ]) [x = x ⊕ dx]⟧ (dρ, x = v1, dx = dv) = ⟦t⟧ (ρ1 ⊕ dρ, x = v1 ⊕ dv)
We prove it via the following calculation:

⟦(t [Γ = Γ ⊕ ∆Γ]) [x = x ⊕ dx]⟧ (dρ, x = v1, dx = dv)
= { substitution lemma on x }
⟦t [Γ = Γ ⊕ ∆Γ]⟧ (dρ, x = (⟦x ⊕ dx⟧ (dρ, x = v1, dx = dv)), dx = dv)
= { term-level ⊕ agrees with ⊕ (Lemma 13.4.9); then simplify ⟦x⟧ and ⟦dx⟧ }
⟦t [Γ = Γ ⊕ ∆Γ]⟧ (dρ, x = v1 ⊕ dv, dx = dv)
= { t [Γ = Γ ⊕ ∆Γ] does not mention dx, so we can modify the environment entry for dx; this makes the environment into a valid environment change }
⟦t [Γ = Γ ⊕ ∆Γ]⟧ (dρ, x = v1 ⊕ dv, dx = 0(v1 ⊕ dv))
= { by applying the induction hypothesis and simplifying 0 away }
⟦t⟧ (ρ1 ⊕ dρ, x = v1 ⊕ dv)
14.2 Change equivalence

To optimize programs that manipulate changes, we often want to replace a change-producing term by another one, while preserving the overall program meaning. Hence, we define an equivalence on valid changes that is preserved by change operations, that is (in spirit) a congruence. We call this relation (change) equivalence, and refrain from using other equivalences on changes.

Earlier (say, in Sec. 15.3) we have sometimes required that changes be equal, but that's often too restrictive.

Change equivalence is defined in terms of validity, to ensure that validity-preserving operations preserve change equivalence: if two changes dv1 and dv2 are equivalent, one can be substituted for the other in a validity-preserving context. We can define this once for all change structures, hence in particular for values and environments.

Definition 14.2.1 (Change equivalence)
Given a change structure V̂, changes dva, dvb : ∆V are equivalent relative to source v1 : V (written dva =∆ dvb ▷ v1 → v2 : V) if and only if there exists v2 such that both dva and dvb are valid from v1 to v2 (that is, dva ▷ v1 → v2 : V and dvb ▷ v1 → v2 : V).
Notation 14.2.2
When not ambiguous, we often abbreviate dva =∆ dvb ▷ v1 → v2 : V as dva =∆^v1 dvb, or simply dva =∆ dvb.

Two changes are often equivalent relative to one source but not to others. Hence dva =∆ dvb is always an abbreviation for change equivalence relative to a specific source.

Example 14.2.3
For instance, we later use a change structure for integers using both replacement changes and differences (Example 12.1.6). In this structure, change 0 is a nil change for all numbers, while change !5 ("bang 5") replaces any number with 5. Hence, changes 0 and !5 are equivalent only relative to source 5, and we write 0 =∆^5 !5.
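Change equivalence is easy to test on an integer change structure of this kind. In this Haskell sketch (our names, instantiating the abstract definitions), two changes are equivalent relative to a source exactly when they update it to the same destination:

```haskell
-- Integer changes: differences and replacements (hypothetical names).
data DInt = Diff Int | Replace Int
  deriving (Eq, Show)

oplus :: Int -> DInt -> Int
oplus v (Diff d)    = v + d
oplus _ (Replace n) = n

-- dva =∆ dvb relative to source v: both valid from v to the same v2.
-- (In this change structure every change is valid for every source,
-- so the check reduces to comparing destinations.)
equivAt :: Int -> DInt -> DInt -> Bool
equivAt v dva dvb = (v `oplus` dva) == (v `oplus` dvb)
```

For instance, `equivAt 5 (Diff 0) (Replace 5)` holds, matching 0 =∆^5 !5, while `equivAt 3 (Diff 0) (Replace 5)` does not.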
By applying the definitions, one can verify that change equivalence relative to a source v is a symmetric and transitive relation on ∆V. However, it is not an equivalence relation on ∆V, because it is only reflexive on changes valid for source v. Using the set-theoretic concept of subset, we can then state the following lemma (whose proof we omit, as it is brief).

Lemma 14.2.4 (=∆ is an equivalence on valid changes)
For each set V and source v ∈ V, change equivalence relative to source v is an equivalence relation over the set of changes

{ dv ∈ ∆V | dv is valid with source v }

We elaborate on this peculiar sort of equivalence in Sec. 14.2.3.
14.2.1 Preserving change equivalence

Change equivalence relative to a source v is respected, in an appropriate sense, by all validity-preserving expression contexts that accept changes with source v. To explain what this means, we study an example lemma: we show that, because valid function changes preserve validity, they also respect change equivalence. At first, we use "(expression) context" informally, to refer to expression contexts in the metalanguage. Later, we'll extend our discussion to actual expression contexts in the object language.

Lemma 14.2.5 (Valid function changes respect change equivalence)
Any valid function change

df ▷ f1 → f2 : A → B
respects change equivalence: if dva =∆ dvb ▷ v1 → v2 : A, then df v1 dva =∆ df v1 dvb ▷ f1 v1 → f2 v2 : B. We also say that (expression) context df v1 – respects change equivalence.

Proof. The thesis means that df v1 dva ▷ f1 v1 → f2 v2 : B and df v1 dvb ▷ f1 v1 → f2 v2 : B. Both judgments follow in one step from the validity of df, dva and dvb.
This lemma holds because the source and destination of df v1 dv don't depend on dv, only on its source and destination, and source and destination are shared by equivalent changes. Hence, validity-preserving functions map equivalent changes to equivalent changes.

In general, all operations that preserve validity also respect change equivalence, because for all those operations the source and destination of any output change, and the resulting value, only depend on the source and destination of the input changes.

However, Lemma 14.2.5 does not mean that df v1 dva = df v1 dvb, because there can be multiple changes with the same source and destination. For instance, say that dva is a list change that removes an element and re-adds it, while dvb is a list change that describes no modification. They are both nil changes, but a function change might handle them differently.

Moreover, we only proved that context df v1 – respects change equivalence relative to source v1. If value v3 differs from v1, then df v3 dva and df v3 dvb need not be equivalent. Hence, we say that context df v1 – accepts changes with source v1. More in general, a context accepts changes with source v1 if it preserves validity for changes with source v1; we can say, informally, that all such contexts also respect change equivalence.
Another example: context v1 ⊕ – also accepts changes with source v1. Since this context produces a base value and not a change, it maps equivalent changes to equal results.

Lemma 14.2.6 (⊕ respects change equivalence)
If dva =∆ dvb ▷ v1 → v2 : V, then v1 ⊕ – respects the equivalence between dva and dvb, that is, v1 ⊕ dva = v1 ⊕ dvb.

Proof. v1 ⊕ dva = v2 = v1 ⊕ dvb.
There are more contexts that preserve equivalence. As discussed, valid function changes respect change equivalence, and D⟦–⟧ produces function changes, so D⟦t⟧ preserves equivalence on its environment and on any of its free variables.

Lemma 14.2.7 (D⟦–⟧ preserves change equivalence)
For any term Γ ⊢ t : τ, D⟦t⟧ preserves change equivalence of environments: for all dρa =∆ dρb ▷ ρ1 → ρ2 : Γ, we have ⟦D⟦t⟧⟧ dρa =∆ ⟦D⟦t⟧⟧ dρb ▷ ⟦t⟧ ρ1 → ⟦t⟧ ρ2 : τ.

Proof. To verify this, just apply correctness of differentiation to both changes dρa and dρb.
To show more formally in what sense change equivalence is a congruence we rst lift changeequivalence to terms (Denition 1429) similarly to syntactic change validity in Sec 1412 To doso we rst need a notation for one-sided or source-only validityNotation 1428 (One-sided validity)We write dv B v1 V to mean there exists v2 such that dv B v1 rarr v2 V We will reuseexisting conventions and write dv B v1 τ instead of dv B v1 nτ o and dρ B ρ1 Γ instead ofdρ B ρ1 n Γ o
Denition 1429 (Syntactic change equivalence)Two change terms ∆Γ ` dta ∆τ and ∆Γ ` dtb ∆τ are change equivalent relative to sourceΓ ` t τ if for all valid environment changes dρ B ρ1 rarr ρ2 Γ we have that
n dta o dρ =∆ n dtb o dρ B n t o ρ1 τ
We write then Γ |= dta =∆ dtb I t τ or dta =t∆ dtb or simply dta =∆ dtb
116 Chapter 14 Equational reasoning on changes
Saying that dta and dtb are equivalent relative to t does not specify the destination of dta and dtb, only their source. The only reason is to simplify the statement and proof of Theorem 14.2.10.
If two change terms are change equivalent with respect to the right source, we can replace one for the other in an expression context to optimize a program, as long as the expression context is validity-preserving and accepts the change.
In particular, substituting into D⟦t⟧ preserves syntactic change equivalence, according to the following theorem (for which we have only a pen-and-paper formal proof).
Theorem 14.2.10 (D⟦–⟧ preserves syntactic change equivalence)
For any equivalent changes Γ ⊨ dsa =∆ dsb ◁ s : σ, and for any term t typed as Γ, x : σ ⊢ t : τ, we can produce equivalent results by substituting into D⟦t⟧ either s and dsa, or s and dsb:
Γ ⊨ D⟦t⟧ [x = s, dx = dsa] =∆ D⟦t⟧ [x = s, dx = dsb] ◁ t [x = s] : τ.
Proof sketch. The thesis holds because D⟦–⟧ preserves change equivalence (Lemma 14.2.7). A formal proof follows through routine (and tedious) manipulations of bindings. In essence, we can extend a change environment dρ for context Γ to equivalent environment changes for context Γ, x : σ, with the values of dsa and dsb. The tedious calculations follow.
Proof. Assume dρ ▷ ρ1 → ρ2 : Γ. Because dsa and dsb are change-equivalent, we have:
⟦dsa⟧ dρ =∆ ⟦dsb⟧ dρ ▷ ⟦s⟧ ρ1 : σ.
Moreover, ⟦s⟧ ρ1 = ⟦s⟧ dρ, because dρ extends ρ1. We'll use this equality without explicit mention.
Hence, we can construct change-equivalent environments for evaluating D⟦t⟧, by combining dρ and the values of, respectively, dsa and dsb:

(dρ, x = ⟦s⟧ dρ, dx = ⟦dsa⟧ dρ) =∆
(dρ, x = ⟦s⟧ dρ, dx = ⟦dsb⟧ dρ)
▷ (ρ1, x = ⟦s⟧ ρ1) : (Γ, x : σ)   (14.1)
This environment change equivalence is respected by D⟦t⟧, hence:

⟦D⟦t⟧⟧ (dρ, x = ⟦s⟧ dρ, dx = ⟦dsa⟧ dρ) =∆
⟦D⟦t⟧⟧ (dρ, x = ⟦s⟧ dρ, dx = ⟦dsb⟧ dρ)
▷ ⟦t⟧ (ρ1, x = ⟦s⟧ ρ1) : Γ → τ   (14.2)

We want to deduce the thesis by applying to this statement the substitution lemma for denotational semantics: ⟦t⟧ (ρ, x = ⟦s⟧ ρ) = ⟦t [x = s]⟧ ρ.
To apply the substitution lemma to the substitution of dx, we must adjust Eq. (14.2) using soundness of weakening. We get:

⟦D⟦t⟧⟧ (dρ, x = ⟦s⟧ dρ, dx = ⟦dsa⟧ (dρ, x = ⟦s⟧ dρ)) =∆
⟦D⟦t⟧⟧ (dρ, x = ⟦s⟧ dρ, dx = ⟦dsb⟧ (dρ, x = ⟦s⟧ dρ))
▷ ⟦t⟧ (ρ1, x = ⟦s⟧ ρ1) : Γ → τ   (14.3)
This equation can now be rewritten (by applying the substitution lemma to the substitutions of dx and x) to the following one:

⟦D⟦t⟧ [dx = dsa] [x = s]⟧ dρ =∆
⟦D⟦t⟧ [dx = dsb] [x = s]⟧ dρ
▷ ⟦t [x = s]⟧ ρ1 : Γ → τ   (14.4)

Since x is not in scope in s, dsa or dsb, we can permute substitutions to conclude that:

Γ ⊨ D⟦t⟧ [x = s, dx = dsa] =∆ D⟦t⟧ [x = s, dx = dsb] ◁ t [x = s] : τ,

as required.
In this theorem, if x appears once in t, then dx appears once in D⟦t⟧ (this follows by induction on t). Hence, D⟦t⟧ [x = s, dx = –] produces a one-hole expression context.
Further validity-preserving contexts. There are further operations that preserve validity. To represent terms with "holes" where other terms can be inserted, we can define one-level contexts F, and contexts E, as is commonly done:

F ::= [ ] t dt | ds t [ ] | λx dx → [ ] | t ⊕ [ ] | dt1 ⊚ [ ] | [ ] ⊚ dt2
E ::= [ ] | F [E]

If dt1 =∆ dt2 ▷ t1 → t2 : τ and our context E accepts changes from t1, then F [dt1] and F [dt2] are change equivalent. It is easy to prove such a lemma for each possible shape of one-level context F, both on values (like Lemmas 14.2.5 and 14.2.6) and on terms. We have been unable to state a more general theorem, because it is not clear how to formalize the notion of a context accepting a change in general: the syntax of a context does not always hint at the validity proofs embedded in it.
Cai et al. [2014] solve this problem for metalevel contexts by typing them with dependent types, but, as discussed, the overall proof is more awkward. Alternatively, it appears that the use of dependent types in Chapter 18 also ensures that change equivalence is a congruence (though at present this is still a conjecture), without overly complicating correctness proofs. However, it is not clear whether such a type system can be expressive enough without requiring additional coercions. Consider a change dv1 from v1 to v1 ⊕ dv1, a value v2 which is known to be (propositionally) equal to v1 ⊕ dv1, and a change dv2 from v2 to v3. Then, term dv1 ⊚ dv2 is not type correct (for instance in Agda): the typechecker will complain that dv1 has destination v1 ⊕ dv1, while dv2 has source v2. When working in Agda, to solve this problem, we can explicitly coerce terms through propositional equalities, and can use Agda to prove such equalities in the first place. We leave the design of a sufficiently expressive object language where change equivalence is a congruence for future work.
14.2.2 Sketching an alternative syntax
If we exclude composition, we can sketch an alternative syntax which helps construct a congruence on changes. The idea is to manipulate, instead of changes alone, pairs of sources v ∈ V and valid changes {dv | dv ▷ v : V}:

t ::= src dt | dst dt | x | t t | λx → t
dt ::= dt dt | src dt | λdx → dt | dx

Adapting differentiation to this syntax is easy:
D⟦x⟧ = dx
D⟦s t⟧ = D⟦s⟧ D⟦t⟧
D⟦λx → t⟧ = λdx → D⟦t⟧

Derivatives of primitives only need to use dst dt instead of t ⊕ dt, and src dt instead of t, whenever dt is a change for t.
With this syntax, we can define change expression contexts, which can be filled in by change expressions:

E ::= src dE | dst dE | E t | t E | λx → E
dE ::= dt dE | dE dt | λdx → dE | [ ]

We conjecture that change equivalence is a congruence with respect to contexts dE, and that contexts E map change-equivalent changes to results that are denotationally equivalent for valid changes. We leave a proof to future work, but we expect it to be straightforward. It also appears straightforward to provide an isomorphism between this syntax and the standard one.
However, extending such contexts with composition does not appear trivial: contexts such as dt ⊚ dE or dE ⊚ dt only respect validity when the changes' sources and destinations align correctly.
We make no further use of this alternative syntax in this work.
14.2.3 Change equivalence is a PER
Readers with relevant experience will recognize that change equivalence is a partial equivalence relation (PER) [Mitchell 1996, Ch. 5]. It is standard to use PERs to identify valid elements in a model [Harper 1992]. In this section, we state the connection, showing that change equivalence is not an ad-hoc construction, so that mathematical constructions using PERs can be adapted to use change equivalence.
We recall the definition of a PER.
Definition 14.2.11 (Partial equivalence relation (PER))
A relation R ⊆ S × S is a partial equivalence relation if it is symmetric (if a R b then b R a) and transitive (if a R b and b R c then a R c).
Elements related to another one are also related to themselves: if a R b, then a R a (by transitivity from a R b and b R a). So a PER on S identifies a subset of valid elements of S. Since PERs are equivalence relations on that subset, they also induce a (partial) partition of the elements of S into equivalence classes of change-equivalent elements.
Lemma 14.2.12 (=∆ is a PER)
Change equivalence relative to a source a : A is a PER on the set ∆A.
Proof. A restatement of Lemma 14.2.4.
Typically, one studies logical PERs, which are logical relations and PERs at the same time [Mitchell 1996, Ch. 8]. In particular, with a logical PER, two functions are related if they map related inputs to related outputs. This helps showing that a PER is a congruence. Luckily, our PER is equivalent to such a definition.
Lemma 14.2.13 (Alternative definition for =∆)
Change equivalence is equivalent to the following logical relation:

dva =∆ dvb ▷ v1 → v2 : ι  =def  dva ▷ v1 → v2 : ι ∧ dvb ▷ v1 → v2 : ι
dfa =∆ dfb ▷ f1 → f2 : σ → τ  =def  ∀ dva =∆ dvb ▷ v1 → v2 : σ. dfa v1 dva =∆ dfb v2 dvb ▷ f1 v1 → f2 v2 : τ

Proof. By induction on types.
14.3 Chapter conclusion
In this chapter, we have put on a more solid foundation forms of reasoning about changes on terms, and defined an appropriate equivalence on changes.
Chapter 15
Extensions and theoretical discussion
In this chapter, we collect discussions of a few additional topics related to ILC that do not suffice for standalone chapters. We show how to differentiate terms containing general recursion (Sec. 15.1); we exhibit a function change that is not valid for any function (Sec. 15.2); we contrast our representation of function changes with pointwise function changes (Sec. 15.3); and we compare our formalization with the one presented in [Cai et al. 2014] (Sec. 15.4).
15.1 General recursion
This section discusses informally how to differentiate terms using general recursion, and what the behavior of the resulting terms is.
15.1.1 Differentiating general recursion
Earlier we gave a rule for deriving (non-recursive) let:

D⟦let x = t1 in t2⟧ = let x = t1
                          dx = D⟦t1⟧
                      in D⟦t2⟧

It turns out that we can use the same rule also for recursive let-bindings, which we write here (and only here) letrec for distinction:

D⟦letrec x = t1 in t2⟧ = letrec x = t1
                                dx = D⟦t1⟧
                         in D⟦t2⟧

Example 15.1.1
In Example 14.1.1 we presented a derivative dmap for map. By using the rules for differentiating recursive functions, we obtain from map the dmap presented there:
dmap f df Nil Nil = Nil
dmap f df (Cons x xs) (Cons dx dxs) =
  Cons (df x dx) (dmap f df xs dxs)
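The same structural derivative can be sketched in Python (with list changes modeled, hypothetically, as lists of element changes of matching length, and integer changes as additive differences):

```python
def dmap(f, df, xs, dxs):
    """Structural derivative of map, for changes that modify elements
    in place: dxs carries one element change per element of xs."""
    return [df(x, dx) for x, dx in zip(xs, dxs)]

# Example: f doubles its input, so its derivative maps an additive
# input change dx to the additive output change 2 * dx.
f = lambda x: 2 * x
df = lambda x, dx: 2 * dx

xs, dxs = [1, 2, 3], [10, 0, -1]
new = [f(x + dx) for x, dx in zip(xs, dxs)]                 # from scratch
old = [f(x) for x in xs]
incr = [y + dy for y, dy in zip(old, dmap(f, df, xs, dxs))]  # incremental
assert incr == new == [22, 4, 4]
```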
However, derivative dmap is not asymptotically faster than map. Even when we consider less trivial change structures, derivatives of recursive functions produced using this rule are often not asymptotically faster. Deriving letrec x = t1 in t2 can still be useful if D⟦t1⟧ and/or D⟦t2⟧ is faster than its base term, but during our work we focus mostly on using structural recursion. Alternatively, in Chapter 11 and Sec. 11.2, we have shown how to incrementalize functions (including recursive ones) using equational reasoning.
In general, when we invoke dmap on a change dxs from xs1 to xs2, it is important that xs1 and xs2 are similar enough that enough computation can be reused. Say that xs1 = Cons 2 (Cons 3 (Cons 4 Nil)) and xs2 = Cons 1 (Cons 2 (Cons 3 (Cons 4 Nil))): in this case, a change modifying each element of xs1 and then replacing Nil by Cons 4 Nil would be inefficient to process, and naive incrementalization would produce this scenario. In this case, it is clear that a preferable change should simply insert 1 at the beginning of the list, as illustrated in Sec. 11.3.2 (though we have omitted the straightforward definition of dmap for such a change structure). In approaches like self-adjusting computation, this is ensured by using memoization. In our approach, instead, we rely on changes that are nil or small to detect when a derivative can reuse input computation.
The same problem affects naive attempts to incrementalize, for instance, a simple factorial function; we omit the details. Because of these issues, we focus on incrementalization of structurally recursive functions, and on incrementalizing generally recursive primitives using equational reasoning. We return to this issue in Chapter 20.
15.1.2 Justification
Here we justify informally the rule for differentiating recursive functions using fixpoint operators.
Let's consider STLC extended with a family of standard fixpoint combinators fixτ : (τ → τ) → τ, with fix-reduction defined by equation fix f → f (fix f); we search for a definition of D⟦fix f⟧.
Using informal equational reasoning, if a correct definition of D⟦fix f⟧ exists, it must satisfy:

D⟦fix f⟧ = fix (D⟦f⟧ (fix f))

We can proceed as follows:

D⟦fix f⟧
= {- imposing that D⟦–⟧ respects fix-reduction here -}
D⟦f (fix f)⟧
= {- using the rules for D⟦–⟧ on application -}
D⟦f⟧ (fix f) D⟦fix f⟧

This is a recursive equation in D⟦fix f⟧, so we can try to solve it using fix itself:

D⟦fix f⟧ = fix (λdxf → D⟦f⟧ (fix f) dxf)
Indeed, this rule gives a correct derivative. Formalizing our reasoning using denotational semantics would presumably require the use of domain theory. Instead, we prove correct a variant of fix in Appendix C, but using operational semantics.
15.2 Completely invalid changes
In some change sets, some changes might not be valid relative to any source. In particular, we can construct examples in ∆(Z → Z).
To understand why this is plausible, we recall that, as described in Sec. 15.3, df can be decomposed into a derivative and a pointwise function change that is independent of da. While pointwise changes can be defined arbitrarily, the behavior of the derivative of f on changes is determined by the behavior of f.
Example 15.2.1
We search for a function change df : ∆(Z → Z) such that there exist no f1, f2 : Z → Z for which df ▷ f1 → f2 : Z → Z. To find df, we assume that there are f1, f2 such that df ▷ f1 → f2 : Z → Z, prove a few consequences, and construct a df that cannot satisfy them. Alternatively, we could pick the desired definition for df right away, and prove by contradiction that there exist no f1, f2 such that df ▷ f1 → f2 : Z → Z.
Recall that, on integers, a1 ⊕ da = a1 + da, and that da ▷ a1 → a2 : Z means a2 = a1 ⊕ da = a1 + da. So, for any numbers a1, da, a2 such that a1 + da = a2, validity of df implies that

f2 (a1 + da) = f1 a1 + df a1 da.

For any two numbers b1, db such that b1 + db = a1 + da, we have that

f1 a1 + df a1 da = f2 (a1 + da) = f2 (b1 + db) = f1 b1 + df b1 db.
Rearranging terms, we have

df a1 da − df b1 db = f1 b1 − f1 a1,

that is, df a1 da − df b1 db does not depend on da and db.
For concreteness, let us fix a1 = 0, b1 = 1, and a1 + da = b1 + db = s. We have then that

df 0 s − df 1 (s − 1) = f1 1 − f1 0.

Once we set h = f1 1 − f1 0, we have df 0 s − df 1 (s − 1) = h. Because s is just the sum of two arbitrary numbers, while h only depends on f1, this equation must hold for a fixed h and for all integers s.
To sum up, we assumed that, for a given df, there exist f1, f2 such that df ▷ f1 → f2 : Z → Z, and concluded that there exists h = f1 1 − f1 0 such that, for all s,

df 0 s − df 1 (s − 1) = h.
At this point, we can try concrete families of functions df to obtain a contradiction. Substituting a linear polynomial df a da = c1 · a + c2 · da fails to obtain a contradiction: in fact, we can construct various f1, f2 such that df ▷ f1 → f2 : Z → Z. So we try quadratic polynomials. Substituting df a da = c · da² succeeds: we have that there is h such that, for all integers s,

c · (s² − (s − 1)²) = h.

However, c · (s² − (s − 1)²) = 2 · c · s − c, which isn't constant, so there can be no such h.
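The contradiction can also be checked numerically. Under the assumption (shown false above) that df a da = c · da² were valid for some source f1, the difference df 0 s − df 1 (s − 1) would have to be a constant h, independent of s:

```python
c = 1
df = lambda a, da: c * da * da  # the candidate completely invalid change

# If df were valid with some source f1, then df 0 s - df 1 (s - 1)
# would equal a constant h = f1 1 - f1 0 for every integer s.
vals = [df(0, s) - df(1, s - 1) for s in range(5)]

# c * (s^2 - (s - 1)^2) = 2*c*s - c: an arithmetic progression, not a constant.
assert vals == [-1, 1, 3, 5, 7]
assert len(set(vals)) > 1  # not constant, hence no such f1 exists
```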
15.3 Pointwise function changes
We can also decompose function changes into orthogonal (and possibly easier to understand) concepts.
Consider two functions f1, f2 : A → B and two inputs a1, a2 : A. The difference between f2 a2 and f1 a1 is due to changes to both the function and its argument. We can compute the whole
change at once via a function change df, as df a1 da. Or we can compute separately the effects of the function change and of the argument change. We can account for changes from f1 a1 to f1 a2 using f′1, a derivative of f1: f′1 a1 da = f1 a2 ⊖ f1 a1 = f1 (a1 ⊕ da) ⊖ f1 a1.¹
We can account for changes from f1 to f2 using the pointwise difference of two functions, ∇f1 = λ(a : A) → f2 a ⊖ f1 a; in particular, f2 (a1 ⊕ da) ⊖ f1 (a1 ⊕ da) = ∇f (a1 ⊕ da). Hence, a function change simply combines a derivative with a pointwise change, using change composition:

df a1 da = f2 a2 ⊖ f1 a1
         = (f1 a2 ⊖ f1 a1) ⊚ (f2 a2 ⊖ f1 a2)
         = f′1 a1 da ⊚ ∇f (a1 ⊕ da)   (15.1)

One can also compute a pointwise change from a function change:

∇f a = df a 0a
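On integers (where a change is a difference, ⊕ is +, ⊖ is −, and change composition ⊚ is +), Eq. (15.1) can be checked concretely; the functions below are illustrative choices, not taken from the text:

```python
# Toy instance on integers: changes are differences,
# ⊕ is +, ⊖ is -, and change composition ⊚ is +.
f1 = lambda a: 2 * a          # base function
f2 = lambda a: 2 * a + 5      # updated function

df = lambda a, da: f2(a + da) - f1(a)   # full function change: f2 a2 ⊖ f1 a1
d1 = lambda a, da: f1(a + da) - f1(a)   # derivative f1'
nabla = lambda a: f2(a) - f1(a)         # pointwise change ∇f

for a, da in [(0, 0), (3, 4), (-2, 7)]:
    # Eq. (15.1): df a da = f1' a da ⊚ ∇f (a ⊕ da)
    assert df(a, da) == d1(a, da) + nabla(a + da)
    # and the pointwise change is recovered as ∇f a = df a 0
    assert nabla(a) == df(a, 0)
```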
While some might find pointwise changes a more natural concept, we find it easier to use our definition of function changes, which combines both pointwise changes and derivatives into a single concept. Some related works explore the use of pointwise changes; we discuss them in Sec. 19.2.2.
15.4 Modeling only valid changes
In this section, we contrast briefly the formalization of ILC in this thesis (for short, ILC'17) with the one we used in our first formalization [Cai et al. 2014] (for short, ILC'14). We keep the discussion somewhat informal; we have sketched proofs of our claims and mechanized some, but we omit all proofs here. We discuss both formalizations using our current notation and terminology, except for concepts that are not present here.
Both formalizations model function changes semantically, but the two models we present are different. Overall, ILC'17 uses simpler machinery and seems easier to extend to more general base languages, and its mechanization appears simpler and smaller. Instead, ILC'14 studies additional, but better-behaved, entities.
In ILC'17, input and output domains of function changes contain invalid changes, while in ILC'14 these domains are restricted to valid changes via dependent types. ILC'14 also considers the denotation of D⟦t⟧, whose domains include invalid changes, but such denotations are studied only indirectly. In both cases, function changes must map valid changes to valid changes. But ILC'14 application df v1 dv is only well-typed if dv is a change valid from v1, hence we can simply say that df v1 respects change equivalence. As discussed in Sec. 14.2, in ILC'17 the analogous property has a trickier statement: we can write df v1 and apply it to arbitrary equivalent changes dv1 =∆ dv2, even if their source is not v1, but such change equivalences are not preserved.
We can relate the two models by defining a logical relation called erasure (similar to the one described by Cai et al.): an ILC'14 function change df erases to an ILC'17 function change df′ relative to source f : A → B if, given any change da that erases to da′ relative to source a1 : A, output change df a1 da erases to df′ a1 da′ relative to source f a1. For base types, erasure simply connects corresponding da (with source) with da′, in a manner dependent on the base type (often just throwing away any embedded proofs of validity). In all cases, one can show that if and only if

¹For simplicity, we use equality on changes, even though equality is too restrictive. In Sec. 14.2 we define an equivalence relation on changes, called change equivalence and written =∆, and use it systematically to relate changes in place of equality. For instance, we write that f′1 a1 da =∆ f1 (a1 ⊕ da) ⊖ f1 a1. But for the present discussion, equality will do.
dv erases to dv′ with source v1, then v1 ⊕ dv = v1 ⊕ dv′ (for suitable variants of ⊕); in other words, dv and dv′ share source and destination (technically, ILC'17 changes have no fixed source, so we say that they are changes from v1 to v2, for some v2).
In ILC'14, there is a different incremental semantics ⟦t⟧∆ for terms t, but it is still a valid ILC'14 change. One can show that ⟦t⟧∆ (as defined in ILC'14) erases to ⟦D⟦t⟧⟧ (as defined in ILC'17) relative to source ⟦t⟧; in fact, the needed proof is sketched by Cai et al., though in disguise.
It seems clear there is no isomorphism between ILC'14 changes and ILC'17 changes. An ILC'17 function change also accepts invalid changes, and the behavior on those changes can't be preserved by an isomorphism. Worse, it seems hard to define even a non-isomorphic mapping: to map an ILC'14 change df to an ILC'17 change erase df, we have to define behavior for (erase df) a da even when da is invalid. As long as we work in a constructive setting, we cannot decide whether da is valid in general, because da can be a function change with infinite domain.
We can, however, give a definition that does not need to detect such invalid changes: just extract source and destination from a function change using valid change 0v, and take the difference of source and destination, using ⊖ in the target system.

unerase (σ → τ) df′ = let f = λv → df′ v 0v in (f ⊕ df′) ⊖ f
unerase _ dv′ = …
erase (σ → τ) df = let f = λv → df v 0v in (f ⊕ df) ⊖ f
erase _ dv = …

We define these functions by induction on types (for elements of ∆τ, not arbitrary change structures), and we overload ⊖ for ILC'14 and ILC'17. We conjecture that, for all types τ and for all ILC'17 changes dv′ (of the right type), unerase τ dv′ erases to dv′, and that, for all ILC'14 changes dv, dv erases to erase τ dv.
1541 One-sided vs two-sided validity
There are also further supercial dierences among the two denitions In ILCrsquo14 changes validwith soure a have dependent type ∆a This dependent type is indexed by the source but not by thedestination Dependent function changes with source f Ararr B have type (a A) rarr ∆ararr ∆(f a)relating the behavior of function change df with the behavior of f on original inputs But this ishalf of function validity to relate the behavior of df with the behavior of df on updated inputs inILCrsquo14 valid function changes have to satisfy an additional equation called preservation of future2
f1 a1 oplus df a1 da = (f1 oplus df ) (a1 oplus da)
2Name suggested by Yufei Cai
This equation appears inelegant, and mechanized proofs were often complicated by the need to perform rewritings using it. Worse, to show that a function change is valid, we have to use different approaches to prove it has the correct source and the correct destination.
This difference is, however, superficial. If we replace f1 ⊕ df with f2, and a1 ⊕ da with a2, this equation becomes f1 a1 ⊕ df a1 da = f2 a2, a consequence of df ▷ f1 → f2 : A → B. So one might suspect that ILC'17 valid function changes also satisfy this equation. This is indeed the case:
Lemma 15.4.1
A valid function change df ▷ f1 → f2 : A → B satisfies the equation

f1 a1 ⊕ df a1 da = (f1 ⊕ df) (a1 ⊕ da)

on any valid input da ▷ a1 → a2 : A.
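On the toy integer instance from Sec. 15.3 (changes as additive differences; all concrete functions below are illustrative), preservation of future can be checked directly:

```python
# Integers as a toy change structure: v ⊕ dv = v + dv, and
# (f ⊕ df) is the function λa. f a + df a 0 (updating f pointwise).
f1 = lambda a: 3 * a
f2 = lambda a: 3 * a + 7
df = lambda a, da: f2(a + da) - f1(a)      # a valid change from f1 to f2
f1_plus_df = lambda a: f1(a) + df(a, 0)    # equals f2 pointwise

for a1, da in [(0, 0), (2, 5), (-4, 1)]:
    # preservation of future: f1 a1 ⊕ df a1 da = (f1 ⊕ df) (a1 ⊕ da)
    assert f1(a1) + df(a1, da) == f1_plus_df(a1 + da)
```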
Conversely, one can also show that ILC'14 function changes satisfy two-sided validity, as defined in ILC'17. Hence, the only true difference between the ILC'14 and ILC'17 models is the one we discussed earlier: namely, whether function changes can be applied to invalid inputs or not.
We believe it could be possible to formalize the ILC'14 model using two-sided validity, by defining a dependent type of valid changes: ∆2 (A → B) f1 f2 = (a1, a2 : A) → ∆2 A a1 a2 → ∆2 B (f1 a1) (f2 a2). We provide more details on such a transformation in Chapter 18.
Models restricted to valid changes (like ILC'14) are related to models based on directed graphs and reflexive graphs, where values are graph vertices and changes are edges between change source and change destination (as hinted earlier). In graph language, validity preservation means that function changes are graph homomorphisms.
Based on similar insights, Atkey [2015] suggests modeling ILC using reflexive graphs, which have been used to construct parametric models for System F and extensions, and calls for research on the relation between ILC and parametricity. As follow-up work, Cai [2017] studies models of ILC based on directed and reflexive graphs.
Chapter 16
Differentiation in practice
In practice, successful incrementalization requires both correctness and performance of the derivatives. Correctness of derivatives is guaranteed by the theoretical development in the previous sections, together with the interface for differentiation and proof plugins, whereas performance of derivatives has to come from careful design and implementation of differentiation plugins.
16.1 The role of differentiation plugins
Users of our approach need to (1) choose which base types and primitives they need, (2) implement suitable differentiation plugins for these base types and primitives, (3) rewrite (relevant parts of) their programs in terms of these primitives, and (4) arrange for their programs to be called on changes instead of updated inputs.
As discussed in Sec. 10.5, differentiation supports abstraction, application and variables, but since computation on base types is performed by primitives for those types, efficient derivatives for primitives are essential for good performance.
To make such derivatives efficient, change types must also have efficient implementations, and allow describing precisely what changed. The efficient derivative of sum in Chapter 10 is possible only if bag changes can describe deletions and insertions, and integer changes can describe additive differences.
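As an illustration (a Python sketch with hypothetical encodings, not the plugin interface itself): bags can be represented as multisets with multiplicities, bag changes as bags of insertions (positive multiplicity) and deletions (negative multiplicity), and integer changes as additive differences; the derivative of summing a bag then only inspects the change:

```python
from collections import Counter

def bag_oplus(b, db):
    """b ⊕ db: apply a bag change (insertions have positive multiplicity,
    deletions negative) to a bag of integers with multiplicities."""
    out = Counter(b)
    for x, n in db.items():
        out[x] += n
    return Counter({x: n for x, n in out.items() if n != 0})

def bag_sum(b):
    return sum(x * n for x, n in b.items())

def dsum(b, db):
    """Derivative of bag_sum: an additive integer difference, computed
    from the change alone -- the base bag b is never inspected."""
    return bag_sum(db)

xs = Counter({2: 1, 3: 2})        # the bag {2, 3, 3}
dxs = Counter({3: -1, 5: 1})      # remove one 3, insert one 5
assert bag_sum(xs) + dsum(xs, dxs) == bag_sum(bag_oplus(xs, dxs))  # 8 + 2 == 10
```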
For many conceivable base types, we do not have to design the differentiation plugins from scratch. Instead, we can reuse the large body of existing research on incrementalization in first-order and domain-specific settings. For instance, we reuse the approach from Gluche et al. [1997] to support incremental bags and maps. By wrapping a domain-specific incrementalization result in a differentiation plugin, we adapt it to be usable in the context of a higher-order and general-purpose programming language, and in interaction with the differentiation plugins for the other base types of that language.
For base types with no known incrementalization strategy, the precise interfaces for differentiation and proof plugins can guide the implementation effort. These interfaces could also form the basis for a library of differentiation plugins that work well together.
Rewriting whole programs in our language would be an excessive requirement. Instead, we embed our object language as an EDSL in some more expressive meta-language (Scala in our case study), so that embedded programs are reified. The embedded language can be made to resemble the metalanguage [Rompf and Odersky 2010]. To incrementalize a part of a computation, we write it in our embedded object language, invoke D on the embedded program, optionally optimize the resulting programs, and finally invoke them. The metalanguage also acts as a macro system for the object language, as usual. This allows us to simulate polymorphic collections such as (Bag ι), even though the object language is simply-typed; technically, our plugin exposes a family of base types to the object language.

histogram : Map Int (Bag word) → Map word Int
histogram = mapReduce groupOnBags additiveGroupOnIntegers histogramMap histogramReduce
  where additiveGroupOnIntegers = Group (+) (λn → −n) 0
        histogramMap = foldBag groupOnBags (λn → singletonBag (n, 1))
        histogramReduce = foldBag additiveGroupOnIntegers id

-- Precondition:
-- for every key1 k1 and key2 k2, the terms mapper key1 and reducer key2 are homomorphisms.
mapReduce : Group v1 → Group v3 → (k1 → v1 → Bag (k2, v2)) → (k2 → Bag v2 → v3) → Map k1 v1 → Map k2 v3
mapReduce group1 group3 mapper reducer = reducePerKey ∘ groupByKey ∘ mapPerKey
  where mapPerKey = foldMap group1 groupOnBags mapper
        groupByKey = foldBag (groupOnMaps groupOnBags) (λ(key, val) → singletonMap key (singletonBag val))
        reducePerKey = foldMap groupOnBags (groupOnMaps group3) (λkey bag → singletonMap key (reducer key bag))

Figure 16.1: The λ-term histogram with Haskell-like syntactic sugar. additiveGroupOnIntegers is the abelian group induced on integers by addition, (Z, +, 0, −).
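The histogram pipeline of Figure 16.1 can be approximated in Python, with documents as Counters of words keyed by document ID (a simplification that bypasses the group-based encoding of the figure):

```python
from collections import Counter

def histogram(docs):
    """Sketch of histogram: input maps integer document IDs to bags
    (Counters) of words; output maps each word to its total count."""
    out = Counter()
    for bag in docs.values():
        for word, n in bag.items():
            out[word] += n
    return out

docs = {1: Counter({"a": 2, "b": 1}), 2: Counter({"b": 3})}
assert histogram(docs) == Counter({"a": 2, "b": 4})
```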
16.2 Predicting nil changes
Handling changes to all inputs can induce excessive overhead in incremental programs [Acar 2009]. It is also often unnecessary: for instance, the function argument of fold in Chapter 10 does not change, since it is a closed subterm of the program, so fold will receive a nil change for it. A (conservative) static analysis can detect changes that are guaranteed to be nil at runtime. We can then specialize derivatives that receive this change, so that they need not inspect the change at runtime.
For our case study, we have implemented a simple static analysis which detects and propagates information about closed terms. The analysis is not interesting, and we omit its details for lack of space.
16.2.1 Self-maintainability
In databases, a self-maintainable view [Gupta and Mumick 1999] is a function that can update its result from input changes alone, without looking at the actual input. By analogy, we call a derivative self-maintainable if it uses no base parameters, only their changes. Self-maintainable derivatives describe efficient incremental computations: since they do not use their base input, their running time does not have to depend on the input size.
Example 16.2.1
D⟦merge⟧ = λx dx y dy → merge dx dy is self-maintainable with the change structure Bag S described in Example 10.3.1, because it does not use the base inputs x and y. Other derivatives are self-maintainable only in certain contexts. The derivative of element-wise function application (map f xs) ignores the original value of the bag xs if the changes to f are always nil, because the underlying primitive foldBag is self-maintainable in this case (as discussed in the next section). We take advantage of this by implementing a specialized derivative for foldBag.
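A sketch of the self-maintainable derivative of merge, with bags again as signed-multiplicity Counters (a hypothetical encoding): the derivative merges the two changes and never touches the base bags.

```python
from collections import Counter

def merge(x, y):
    """Merge two bags (multiset sum over signed multiplicities)."""
    out = Counter(x)
    for k, n in y.items():
        out[k] += n
    return Counter({k: n for k, n in out.items() if n != 0})

def dmerge(x, dx, y, dy):
    """Derivative of merge: self-maintainable, since it ignores x and y."""
    return merge(dx, dy)

x, y = Counter({1: 2}), Counter({2: 1})
dx, dy = Counter({1: -1}), Counter({3: 1})
# merge (x ⊕ dx) (y ⊕ dy) == merge x y ⊕ dmerge x dx y dy
assert merge(merge(x, dx), merge(y, dy)) == merge(merge(x, y), dmerge(x, dx, y, dy))
```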
Similarly to what we have seen in Sec. 10.6, dgrandTotal needlessly recomputes merge xs ys without optimizations. However, the result is a base input to fold′. In the next section, we'll replace fold′ by a self-maintainable derivative (based again on foldBag), and will avoid this recomputation.
To conservatively predict whether a derivative is going to be self-maintainable (and thus efficient), one can inspect whether the program restricts itself to (conditionally) self-maintainable primitives,
like merge (always) or map f (only if df is nil, which is guaranteed when f is a closed term).
To avoid recomputing base arguments for self-maintainable derivatives (which never need them), we currently employ lazy evaluation. Since we could use standard techniques for dead-code elimination [Appel and Jim 1997] instead, laziness is not central to our approach.
A significant restriction is that non-self-maintainable derivatives can require expensive computations to supply their base arguments. Since those arguments are also computed while running the base program, one could reuse the previously computed values through memoization or extensions of static caching (as discussed in Sec. 19.2.3). We leave implementing these optimizations for future work. As a consequence, our current implementation delivers good results only if most derivatives are self-maintainable.
16.3 Case study
We perform a case study on a nontrivial, realistic program, to demonstrate that ILC can speed it up. We take the MapReduce-based skeleton of the word-count example [Lämmel 2007]. We define a suitable differentiation plugin, adapt the program to use it, and show that incremental computation is faster than recomputation. We designed and implemented the differentiation plugin following the requirements of the corresponding proof plugin, even though we did not formalize the proof plugin (e.g. in Agda). For lack of space, we focus on the base types that are crucial for our example and its performance, that is, collections. The plugin also implements tuples, tagged unions, Booleans and integers, with the usual introduction and elimination forms, with a few optimizations for their derivatives.
wordcount takes a map from document IDs to documents and produces a map from words appearing in the input to the count of their appearances, that is, a histogram:
wordcount : Map ID Document → Map Word Int
For simplicity, instead of modeling strings, we model documents as bags of words and document IDs as integers. Hence, what we implement is:
histogram : Map Int (Bag a) → Map a Int
We model words by integers (a = Int), but treat them parametrically. Other than that, we adapt Lämmel's code directly to our language. Figure 16.1 shows the λ-term histogram.
Figure 16.2 shows a simplified Scala implementation of the primitives used in Fig. 16.1. As bag primitives, we provide constructors and a fold operation, following Gluche et al. [1997]. The constructors for bags are empty (constructing the empty bag), singleton (constructing a bag with one element), merge (constructing the merge of two bags) and negate (negate b constructs a bag with the same elements as b, but negated multiplicities); all but singleton represent abelian group operations. Unlike for usual ADT constructors, the same bag can be constructed in different ways, which are equivalent by the equations defining abelian groups; for instance, since merge is commutative, merge x y = merge y x. Folding on a bag will represent the bag through constructors in an arbitrary way, and then replace constructors with arguments; to ensure a well-defined result, the arguments of fold should respect the same equations, that is, they should form an abelian group; for instance, the binary operator should be commutative. Hence, the fold operator foldBag can be defined to take a function (corresponding to singleton) and an abelian group (for the other constructors). foldBag is then defined by equations:
foldBag : Group τ → (σ → τ) → Bag σ → τ
foldBag g@(⊞, ⊟, e) f empty = e
// Abelian groups
abstract class Group[A] {
  def merge(value1: A, value2: A): A
  def inverse(value: A): A
  def zero: A
}

// Bags
type Bag[A] = collection.immutable.HashMap[A, Int]

def groupOnBags[A] = new Group[Bag[A]] {
  def merge(bag1: Bag[A], bag2: Bag[A]) =
    bag1.merged(bag2)({
      case ((v, count1), (_, count2)) ⇒ (v, count1 + count2)
    }).filter({
      case (v, count) ⇒ count != 0
    })
  def inverse(bag: Bag[A]) = bag.map({
    case (value, count) ⇒ (value, -count)
  })
  def zero = collection.immutable.HashMap()
}

def foldBag[A, B](group: Group[B], f: A ⇒ B, bag: Bag[A]): B =
  bag.flatMap({
    case (x, c) if c ≥ 0 ⇒ Seq.fill(c)(f(x))
    case (x, c) if c < 0 ⇒ Seq.fill(-c)(group.inverse(f(x)))
  }).fold(group.zero)(group.merge)

// Maps
type Map[K, A] = collection.immutable.HashMap[K, A]

def groupOnMaps[K, A](group: Group[A]) = new Group[Map[K, A]] {
  def merge(dict1: Map[K, A], dict2: Map[K, A]): Map[K, A] =
    dict1.merged(dict2)({
      case ((k, v1), (_, v2)) ⇒ (k, group.merge(v1, v2))
    }).filter({
      case (k, v) ⇒ v != group.zero
    })
  def inverse(dict: Map[K, A]): Map[K, A] = dict.map({
    case (k, v) ⇒ (k, group.inverse(v))
  })
  def zero = collection.immutable.HashMap()
}

// The general map fold
def foldMapGen[K, A, B](zero: B, merge: (B, B) ⇒ B)
    (f: (K, A) ⇒ B, dict: Map[K, A]): B =
  dict.map(Function.tupled(f)).fold(zero)(merge)

// By using foldMap instead of foldMapGen, the user promises that
// f k is a homomorphism from groupA to groupB for each k: K.
def foldMap[K, A, B](groupA: Group[A], groupB: Group[B])
    (f: (K, A) ⇒ B, dict: Map[K, A]): B =
  foldMapGen(groupB.zero, groupB.merge)(f, dict)
Figure 16.2: A Scala implementation of primitives for bags and maps. In the code, we call ⊞, ⊟ and e respectively merge, inverse and zero. We also omit the relatively standard primitives.
foldBag g@(⊞, ⊟, e) f (merge b1 b2) =
  foldBag g f b1 ⊞ foldBag g f b2
foldBag g@(⊞, ⊟, e) f (negate b) = ⊟ (foldBag g f b)
foldBag g@(⊞, ⊟, e) f (singleton v) = f v
If g is a group, these equations specify foldBag g precisely [Gluche et al. 1997]. Moreover, the first three equations mean that foldBag g f is the abelian group homomorphism between the abelian group on bags and the group g (because those equations coincide with the definition). Figure 16.2 shows an implementation of foldBag as specified above. Moreover, all functions which deconstruct a bag can be expressed in terms of foldBag with suitable arguments. For instance, we can sum the elements of a bag of integers with foldBag gZ (λx → x), where gZ is the abelian group induced on integers by addition (Z, +, 0, −). Users of foldBag can define different abelian groups to specify different operations (for instance, to multiply floating-point numbers).
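As a concrete illustration of these equations, the fold can be sketched in a few lines of executable pseudocode. The following Python model is ours (not the thesis's Scala or object-language code): it represents a bag as a map from elements to (possibly negative) multiplicities and folds it through a user-supplied abelian group.

```python
# A minimal model of foldBag (our sketch; names are not from the thesis).
class Group:
    def __init__(self, merge, inverse, zero):
        self.merge, self.inverse, self.zero = merge, inverse, zero

def fold_bag(g, f, bag):
    """Replace each occurrence of x by f(x), inverted for negative
    multiplicities, and combine everything with the group operation."""
    result = g.zero
    for x, count in bag.items():
        y = f(x) if count >= 0 else g.inverse(f(x))
        for _ in range(abs(count)):
            result = g.merge(result, y)
    return result

# The abelian group (Z, +, 0, -) induced on integers by addition.
g_z = Group(lambda a, b: a + b, lambda a: -a, 0)
```

With this, fold_bag(g_z, lambda x: x, bag) sums a bag of integers, and the homomorphism property of the fold (the fold of a merge is the group-merge of the folds) can be checked on small examples.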
If g and f do not change, foldBag g f has a self-maintainable derivative. By the equations above,
foldBag g f (b ⊕ db)
= foldBag g f (merge b db)
= foldBag g f b ⊞ foldBag g f db
= foldBag g f b ⊕ GroupChange g (foldBag g f db)
We will describe the GroupChange change constructor in a moment. Before that, we note that as a consequence, the derivative of foldBag g f is
λb db → GroupChange g (foldBag g f db)
and we can see it does not use b: as desired, it is self-maintainable. Additional restrictions are required to make foldMap's derivative self-maintainable. Those restrictions require the precondition on mapReduce in Fig. 16.1. foldMapGen has the same implementation, but without those restrictions; as a consequence, its derivative is not self-maintainable, but it is more generally applicable. Lack of space prevents us from giving more details.
To define GroupChange, we need a suitable erased change structure on τ, such that ⊕ will be equivalent to ⊞. Since there might be multiple groups on τ, we allow the changes to specify a group, and have ⊕ delegate to ⊞ (which we extract by pattern-matching on the group):
∆τ = Replace τ | GroupChange (AbelianGroup τ) τ
v ⊕ (Replace u) = u
v ⊕ (GroupChange (⊞, ⊟, e) dv) = v ⊞ dv
v ⊖ u = Replace v
That is, a change between two values is either simply the new value (which replaces the old one, triggering recomputation), or their difference (computed with abelian group operations, like in the change structures for groups from Sec. 13.1.1). The ⊖ operator does not know which group to use, so it does not take advantage of the group structure. However, foldBag is now able to generate a group change.
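A small executable rendering of this change structure may help. The Python sketch below is our own encoding (with ⊕ and ⊖ spelled oplus and ominus); it mirrors the two constructors above.

```python
# Our sketch of the erased change structure on τ.
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Replace:
    new: Any                          # the replacement value

@dataclass
class GroupChange:
    merge: Callable[[Any, Any], Any]  # the group's binary operation
    inverse: Callable[[Any], Any]
    zero: Any
    delta: Any                        # the difference, as a group element

def oplus(v, dv):
    if isinstance(dv, Replace):
        return dv.new                 # replacement: triggers recomputation
    return dv.merge(v, dv.delta)      # group change: update via the group

def ominus(v2, v1):
    # ominus does not know which group to use, so it falls back to Replace.
    return Replace(v2)
```

For instance, oplus(10, GroupChange(add, neg, 0, 5)) yields 15, while applying a Replace change discards the old value entirely.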
We rewrite grandTotal in terms of foldBag to take advantage of group-based changes:
id = λx → x
G+ = (Z, +, −, 0)
grandTotal = λxs → λys → foldBag G+ id (merge xs ys)
D⟦grandTotal⟧ =
  λxs → λdxs → λys → λdys →
    foldBag′ G+ G′+ id id′
      (merge xs ys)
      (merge′ xs dxs ys dys)
It is now possible to write down the derivative of foldBag:
-- (if static analysis detects that dG and df are nil changes)
foldBag′ = D⟦foldBag⟧ =
  λG → λdG → λf → λdf → λzs → λdzs →
    GroupChange G (foldBag G f dzs)
We know from Sec. 10.6 that:
merge′ = λu → λdu → λv → λdv → merge du dv
Inlining foldBag′ and merge′ gives us a more readable term, β-equivalent to the derivative of grandTotal:
D⟦grandTotal⟧ =
  λxs → λdxs → λys → λdys → foldBag G+ id (merge dxs dys)
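We can sanity-check this derivative on a toy model. In the Python sketch below (ours, not the thesis's object language), bags are multiplicity dictionaries and ⊕ on bags is merge; the check confirms that updating the old output with the derivative's result matches recomputation from scratch.

```python
# Toy check of the derivative of grandTotal (our model).
from collections import Counter

def merge(b1, b2):
    out = Counter(b1)
    out.update(b2)          # pointwise sum of multiplicities
    return out

def grand_total(xs, ys):
    return sum(v * c for v, c in merge(xs, ys).items())

def dgrand_total(xs, dxs, ys, dys):
    # foldBag G+ id (merge dxs dys): sum both change bags.
    return sum(v * c for v, c in merge(dxs, dys).items())

xs, dxs = Counter({1: 1, 2: 3}), Counter({5: 1})  # dxs inserts a 5
ys, dys = Counter({4: 1}), Counter({2: -1})       # dys removes a 2
old = grand_total(xs, ys)
new = grand_total(merge(xs, dxs), merge(ys, dys))
```

Here old + dgrand_total(xs, dxs, ys, dys) equals new, i.e. the incremental update and recomputation agree, and the derivative only traverses the (small) change bags.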
16.4 Benchmarking setup
We run object language programs by generating corresponding Scala code. To ensure rigorous benchmarking [Georges et al. 2007], we use the ScalaMeter benchmarking library. To show that the performance difference from the baseline is statistically significant, we show 99% confidence intervals in graphs.
We verify Eq. (10.1) experimentally, by checking that the two sides of the equation always evaluate to the same value.
We ran benchmarks on an 8-core Intel Core i7-2600 CPU running at 3.4 GHz, with 8 GB of RAM, running Scientific Linux release 6.4. While the benchmark code is single-threaded, the JVM offloads garbage collection to other cores. We use the preinstalled OpenJDK 1.7.0_25 and Scala 2.10.2.
Input generation. Inputs are randomly generated to resemble English words over all webpages on the internet: the vocabulary size and the average length of a webpage stay relatively the same, while the number of webpages grows day by day. To generate a size-n input of type Map Int (Bag Int), we generate n random numbers between 1 and 1000 and distribute them randomly in n/1000 bags. Changes are randomly generated to resemble edits: a change has 50% probability to delete a random existing number, and 50% probability to insert a random number at a random location.
Experimental units. Thanks to Eq. (10.1), both recomputation f (a ⊕ da) and incremental computation f a ⊕ D⟦f⟧ a da produce the same result. Both programs are written in our object language. To show that derivatives are faster, we compare these two computations. To measure incremental computation, we measure the aggregated running time for running the derivative on the change and for updating the original output with the result of the derivative.
16.5 Benchmark results
Our results (Fig. 16.3) show that our program reacts to input changes in essentially constant time, as expected, hence orders of magnitude faster than recomputation. Constant factors are small enough that the speedup is apparent on realistic input sizes.
We present our results in Fig. 16.3. As expected, the runtime of incremental computation is essentially constant in the size of the input, while the runtime of recomputation is linear in the input size. Hence, on our biggest inputs, incremental computation is over 10^4 times faster.
Derivative time is in fact slightly irregular for the first few inputs, but this irregularity decreases slowly with increasing warmup cycles. In the end, for derivatives, we use 10^4 warmup cycles. With fewer warmup cycles, running time for derivatives decreases significantly during execution, going from 2.6 ms for n = 1000 to 0.2 ms for n = 512000. Hence, we believe extended warmup is appropriate, and the changes do not affect our general conclusions. Considering confidence intervals, in our experiments the running time for derivatives varies between 0.139 ms and 0.378 ms.
In our current implementation, the code of the generated derivatives can become quite big. For the histogram example (which is around 1 KB of code), a pretty-print of its derivative is around 40 KB of code. The function application case in Fig. 12.1c can lead to quadratic growth in the worst case. More importantly, we realized that our home-grown transformation system in some cases performs overly aggressive inlining, especially for derivatives, even though this is not required for incrementalization; we believe this explains a significant part of the problem. Indeed, code blowup problems do not currently appear in later experiments (see Sec. 17.4).
16.6 Chapter conclusion
Our results show that the incrementalized program runs in essentially constant time, and hence orders of magnitude faster than the alternative of recomputation from scratch.
An important lesson from the evaluation is that, as anticipated in Sec. 16.2.1, to achieve good performance our current implementation requires some form of dead-code elimination, such as laziness.
[Figure: log–log plot of runtime (ms, from 0.1 to 10000) against input size (1k to 4096k), with one curve for Incremental and one for Recomputation.]
Figure 16.3: Performance results in log–log scale, with input size on the x-axis and runtime in ms on the y-axis. Confidence intervals are shown by the whiskers; most whiskers are too small to be visible.
Chapter 17
Cache-transfer-style conversion
17.1 Introduction
Incremental computation is often desirable: after computing an output from some input, we often need to produce new outputs corresponding to changed inputs. One can simply rerun the same base program on the new input; but instead, incremental computation transforms the input change to an output change. This can be desirable because it is more efficient.
Incremental computation could also be desirable because the changes themselves are of interest: imagine a typechecker explaining how some change to input source propagates to changes to the typechecking result. More generally, imagine a program explaining how some change to its input propagates through a computation into changes to the output.
ILC (Incremental λ-Calculus) [Cai et al. 2014] is a recent approach to incremental computation for higher-order languages. ILC represents changes from an old value v1 to an updated value v2 as a first-class value, written dv. ILC also statically transforms base programs to incremental programs, or derivatives: derivatives produce output changes from changes to all their inputs. Since functions are first-class values, ILC introduces a notion of function changes.
However, as mentioned by Cai et al. and as we explain below, ILC as currently defined does not allow reusing intermediate results created by the base computation during the incremental computation. That restricts ILC to supporting efficiently only self-maintainable computations, a rather restrictive class: for instance, mapping self-maintainable functions on a sequence is self-maintainable, but dividing numbers isn't. In this paper, we remove this limitation.
To remember intermediate results, many incrementalization approaches rely on forms of memoization: one uses hashtables to memoize function results, or dynamic dependence graphs [Acar 2005] to remember the computation trace. However, such data structures often remember results that might not be reused; moreover, the data structures themselves (as opposed to their values) occupy additional memory, looking up intermediate results has a cost in time, and typical general-purpose optimizers cannot predict results from memoization lookups. Instead, ILC aims to produce purely functional programs that are suitable for further optimizations.
We eschew memoization: instead, we transform programs to cache-transfer style (CTS), following ideas from Liu and Teitelbaum [1995]. CTS functions output caches of intermediate results along with their primary results. Caches are just nested tuples whose structure is derived from code, and accessing them does not involve looking up keys depending on inputs. We also extend differentiation to produce CTS derivatives, which can extract from caches any intermediate results they need. This approach was inspired and pioneered by Liu and Teitelbaum for untyped first-order functional languages, but we integrate it with ILC and extend it to higher-order typed languages.
While CTS functions still produce additional intermediate data structures, produced programs can be subject to further optimizations. We believe static analysis of a CTS function and its CTS derivative can identify and remove unneeded state (similar to what has been done by Liu and Teitelbaum), as we discuss in Sec. 17.5.6. We leave a more careful analysis to future work.
We prove most of our formalization correct in Coq. To support non-simply-typed programs, all our proofs are for untyped λ-calculus, while previous ILC correctness proofs were restricted to simply-typed λ-calculus. Unlike previous ILC proofs, we simply define which changes are valid via a logical relation, then show the fundamental property for this logical relation (see Sec. 17.2.1). To extend this proof to untyped λ-calculus, we switch to step-indexed logical relations.
To support differentiation on our case studies, we also represent function changes as closures that can be inspected, to support manipulating them more efficiently and detecting at runtime when a function change is nil, hence need not be applied. To show this representation is correct, we also use closures in our mechanized proof.
Unlike plain ILC, typing programs in CTS is challenging, because the shape of caches for a function depends on the function implementation. Our case studies show how to non-trivially embed resulting programs in typed languages, at least for our case studies, but our proofs support an untyped target language.
In sum, we present the following contributions:
• via examples, we motivate extending ILC to remember intermediate results (Sec. 17.2);
• we give a novel proof of correctness for ILC for untyped λ-calculus, based on step-indexed logical relations (Sec. 17.3.3);
• building on top of ILC-style differentiation, we show how to transform untyped higher-order programs to cache-transfer style (CTS) (Sec. 17.3.5);
• we show that programs and derivatives in cache-transfer style simulate correctly their non-CTS variants (Sec. 17.3.6);
• we mechanize in Coq most of our proofs;
• we perform performance case studies (in Sec. 17.4), applying (by hand) an extension of this technique to Haskell programs, and incrementalize efficiently also programs that do not admit self-maintainable derivatives.
The rest of the paper is organized as follows. Sec. 17.2 summarizes ILC and motivates the extension to cache-transfer style. Sec. 17.3 presents our formalization and proofs. Sec. 17.4 presents our case studies and benchmarks. Sec. 17.5 discusses limitations and future work. Sec. 17.6 discusses related work, and Sec. 17.7 concludes.
17.2 Introducing Cache-Transfer Style
In this section, we motivate cache-transfer style (CTS). Sec. 17.2.1 summarizes a reformulation of ILC, so we recommend it also to readers familiar with Cai et al. [2014]. In Sec. 17.2.2, we consider a minimal first-order example, namely an average function. We incrementalize it using ILC, explain why the result is inefficient, and show that remembering results via cache-transfer style enables efficient incrementalization with asymptotic speedups. In Sec. 17.2.3, we consider a higher-order example that requires not just cache-transfer style, but also efficient support for both nil and non-nil function changes, together with the ability to detect nil changes.
17.2.1 Background
ILC considers simply-typed programs, and assumes that base types and primitives come with support for incrementalization.
The ILC framework describes changes as first-class values, and types them using dependent types. To each type A we associate a type ∆A of changes for A, and an update operator ⊕ : A → ∆A → A, that updates an initial value with a change to compute an updated value. We also consider changes for evaluation environments, which contain changes for each environment entry.
A change da : ∆A can be valid from a1 : A to a2 : A, and we then write da ▷ a1 → a2 : A. Then a1 is the source or initial value for da, and a2 is the destination or updated value for da. From da ▷ a1 → a2 : A follows that a2 coincides with a1 ⊕ da, but validity imposes additional invariants that are useful during incrementalization. A change can be valid for more than one source, but a change da and a source a1 uniquely determine the destination a2 = a1 ⊕ da. To avoid ambiguity, we always consider a change together with a specific source.
Each type comes with its definition of validity; validity is a ternary logical relation. For function types A → B, we define ∆(A → B) = A → ∆A → ∆B, and say that a function change df : ∆(A → B) is valid from f1 : A → B to f2 : A → B (that is, df ▷ f1 → f2 : A → B) if and only if df maps valid input changes to valid output changes; by that we mean that if da ▷ a1 → a2 : A, then df a1 da ▷ f1 a1 → f2 a2 : B. Source and destination of df a1 da, that is f1 a1 and f2 a2, are the results of applying two different functions, that is f1 and f2.
ILC expresses incremental programs as derivatives. Generalizing previous usage, we simply say derivative for all terms produced by differentiation. If dE is a valid environment change from E1 to E2, and term t is well-typed and can be evaluated against environments E1, E2, then term Dι⟦t⟧, the derivative of t, evaluated against dE, produces a change from v1 to v2, where v1 is the value of t in environment E1, and v2 is the value of t in environment E2. This correctness property follows immediately from the fundamental property for the logical relation of validity, and can be proven accordingly; we give a step-indexed variant of this proof in Sec. 17.3.3. If t is a function and dE is a nil change (that is, its source E1 and destination E2 coincide), then Dι⟦t⟧ produces a nil function change, and is also a derivative according to Cai et al. [2014].
To support incrementalization, one must define change types and validity for each base type, and a correct derivative for each primitive. Functions written in terms of primitives can be differentiated automatically. As in all approaches to incrementalization (see Sec. 17.6), one cannot incrementalize efficiently an arbitrary program: ILC limits the effort to base types and primitives.
17.2.2 A first-order motivating example: computing averages
Suppose we need to compute the average y of a bag of numbers xs : Bag Z, and that whenever we receive a change dxs : ∆(Bag Z) to this bag, we need to compute the change dy to the average y.
In fact, we expect multiple consecutive updates to our bag. Hence, we say we have an initial bag xs1 and compute its average y1 as y1 = avg xs1; then we receive consecutive changes dxs1, dxs2, and so on. Change dxs1 goes from xs1 to xs2, dxs2 goes from xs2 to xs3, and so on. We need to compute y2 = avg xs2, y3 = avg xs3, but more quickly than we would by calling avg again.
We can compute the average through the following function (that we present in Haskell):
avg xs =
  let s = sum xs
      n = length xs
      r = div s n
  in r
We write this function in A'-normal form (A'NF), a small variant of A-normal form (ANF) [Sabry and Felleisen 1993] that we introduce. In A'NF, programs bind to a variable the result of each function call in avg, instead of using it directly; unlike plain ANF, A'NF also binds the result of tail calls, such as div s n in avg. A'NF simplifies conversion to cache-transfer style.
We can incrementalize efficiently both sum and length by generating via ILC their derivatives dsum and dlength, assuming a language plugin supporting bags and folds.
But division is more challenging. To be sure, we can write the following derivative:
ddiv a1 da b1 db =
  let a2 = a1 ⊕ da
      b2 = b1 ⊕ db
  in div a2 b2 − div a1 b1
Function ddiv computes the difference between the updated and the original result, without any special optimization, but still takes O(1) for machine integers. But unlike other derivatives, ddiv uses its base inputs a1 and b1, that is, it is not self-maintainable [Cai et al. 2014].
Because ddiv is not self-maintainable, a derivative calling it will not be efficient. To wit, let us look at davg, the derivative of avg:
davg xs dxs =
  let s = sum xs
      ds = dsum xs dxs
      n = length xs
      dn = dlength xs dxs
      r = div s n
      dr = ddiv s ds n dn
  in dr
This function recomputes s, n and r just like in avg, but r is not used, so its recomputation can easily be removed by later optimizations. On the other hand, derivative ddiv will use the values of its base inputs a1 and b1, so derivative davg will need to recompute s and n, and save no time over recomputation. If ddiv were instead a self-maintainable derivative, the computations of s and n would also be unneeded and could be optimized away. Cai et al. leave efficient support for non-self-maintainable derivatives for future work.
To avoid recomputation, we must simply remember intermediate results as needed. Executing davg xs1 dxs1 will compute exactly the same s and n as executing avg xs1, so we must just save and reuse them, without needing to use memoization proper. Hence, we CTS-convert each function f to a CTS function fC and a CTS derivative dfC: CTS function fC produces, together with its final result, a cache, which the caller must pass to CTS derivative dfC. A cache is just a tuple of values, containing information from subcalls: either inputs (as we explain in a bit), intermediate results, or subcaches, that is, caches returned from further function calls. In fact, typically only primitive functions like div need to recall actual results; automatically transformed functions only need to remember subcaches or inputs.
CTS conversion is simplified by first converting to A'NF, where all results of subcomputations are bound to variables: we just collect all caches in scope and return them.
As an additional step, we avoid always passing base inputs to derivatives, by defining ∆(A → B) = ∆A → ∆B. Instead of always passing a base input and possibly not using it, we can simply assume that primitives whose derivative needs a base input store the input in the cache.
To make the translation uniform, we stipulate that all functions in the program are transformed to CTS, using a (potentially empty) cache. Since the type of the cache for a function f : A → B
depends on the implementation of f, we introduce for each function f a type for its cache FC, so that CTS function fC has type A → (B, FC), and CTS derivative dfC has type ∆A → FC → (∆B, FC).
The definition of FC is only needed inside fC and dfC, and it can be hidden in other functions to keep implementation details hidden in transformed code; because of limitations of Haskell modules, we can only hide such definitions from functions in other modules.
Since functions of the same type translate to functions of different types, the translation does not preserve well-typedness in a higher-order language in general, but it works well in our case studies (Sec. 17.4); Sec. 17.4.1 shows how to map such functions. We return to this point briefly in Sec. 17.5.1.
CTS-converting our example produces the following code:
data AvgC = AvgC SumC LengthC DivC
avgC :: Bag Z → (Z, AvgC)
avgC xs =
  let (s, cs1) = sumC xs
      (n, cn1) = lengthC xs
      (r, cr1) = s `divC` n
  in (r, AvgC cs1 cn1 cr1)

davgC :: ∆(Bag Z) → AvgC → (∆Z, AvgC)
davgC dxs (AvgC cs1 cn1 cr1) =
  let (ds, cs2) = dsumC dxs cs1
      (dn, cn2) = dlengthC dxs cn1
      (dr, cr2) = ddivC ds dn cr1
  in (dr, AvgC cs2 cn2 cr2)
In the above program, sumC and lengthC are self-maintainable, that is, they need no base inputs and can be transformed to use empty caches. On the other hand, ddiv is not self-maintainable, so we transform it to remember and use its base inputs:
divC a b = (a `div` b, (a, b))
ddivC da db (a1, b1) =
  let a2 = a1 ⊕ da
      b2 = b1 ⊕ db
  in (div a2 b2 − div a1 b1, (a2, b2))
Crucially, ddivC must return an updated cache, to ensure correct incrementalization, so that application of further changes works correctly. In general, if (y, c1) = fC x and (dy, c2) = dfC dx c1, then (y ⊕ dy, c2) must equal the result of the base function fC applied to the new input x ⊕ dx, that is, (y ⊕ dy, c2) = fC (x ⊕ dx).
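This invariant is easy to exercise concretely. The following Python transliteration of divC/ddivC (our sketch; integer changes, with ⊕ as addition) checks that the updated result and cache match a from-scratch call:

```python
# CTS-converted division (our transliteration of the Haskell sketch).
def div_c(a, b):
    return a // b, (a, b)         # result plus cache of base inputs

def ddiv_c(da, db, cache):
    a1, b1 = cache
    a2, b2 = a1 + da, b1 + db     # oplus on integer changes is addition
    return a2 // b2 - a1 // b1, (a2, b2)

y, c1 = div_c(10, 2)              # y = 5, cache (10, 2)
dy, c2 = ddiv_c(4, 5, c1)         # change to div 14 7
```

Here (y + dy, c2) equals div_c(14, 7), exactly the invariant required above.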
Finally, to use these functions, we need to adapt the caller. Let's first show how to deal with a sequence of changes: imagine that dxs1 is a valid change for xs, and that dxs2 is a valid change for xs ⊕ dxs1. To update the average for both changes, we can then call avgC and davgC as follows:
-- A simple example caller with two consecutive changes
avgDAvgC :: Bag Z → ∆(Bag Z) → ∆(Bag Z) → (Z, ∆Z, ∆Z, AvgC)
avgDAvgC xs dxs1 dxs2 =
  let (res1, cache1) = avgC xs
      (dres1, cache2) = davgC dxs1 cache1
      (dres2, cache3) = davgC dxs2 cache2
  in (res1, dres1, dres2, cache3)
Incrementalization guarantees that the produced changes update the output correctly in response to the input changes: that is, we have res1 ⊕ dres1 = avg (xs ⊕ dxs1) and res1 ⊕ dres1 ⊕ dres2 = avg (xs ⊕ dxs1 ⊕ dxs2). We also return the last cache, to allow further updates to be processed.
Alternatively, we can try writing a caller that gets an initial input and a (lazy) list of changes, does incremental computation, and prints updated outputs:
processChangeList (dxsN : dxss) yN cacheN = do
  let (dy, cacheN′) = avg′ dxsN cacheN
      yN′ = yN ⊕ dy
  print yN′
  processChangeList dxss yN′ cacheN′
-- Example caller with multiple consecutive changes
someCaller xs1 dxss = do
  let (y1, cache1) = avgC xs1
  processChangeList dxss y1 cache1
More in general, we produce both an augmented base function and a derivative, where the augmented base function communicates with the derivative by returning a cache. The contents of this cache are determined statically, and can be accessed by tuple projections without dynamic lookups. In the rest of the paper, we use the above idea to develop a correct transformation that allows incrementalizing programs using cache-transfer style.
We'll return to this example in Sec. 17.4.1.
17.2.3 A higher-order motivating example: nested loops
Next, we consider CTS differentiation on a minimal higher-order example. To incrementalize this example, we enable detecting nil function changes at runtime, by representing function changes as closures that can be inspected by incremental programs. We'll return to this example in Sec. 17.4.2.
We take an example of nested loops over sequences, implemented in terms of the standard Haskell functions map and concat. For simplicity, we compute the Cartesian product of inputs:
cartesianProduct :: Sequence a → Sequence b → Sequence (a, b)
cartesianProduct xs ys =
  concatMap (λx → map (λy → (x, y)) ys) xs
concatMap f xs = concat (map f xs)
Computing cartesianProduct xs ys loops over each element x from sequence xs and y from sequence ys, and produces a list of pairs (x, y), taking quadratic time O(n^2) (we assume for simplicity that |xs| and |ys| are both O(n)). Adding a fresh element to either xs or ys generates an output change containing Θ(n) fresh pairs, hence derivative dcartesianProduct must take at least linear time. Thanks to specialized derivatives dmap and dconcat for primitives map and concat, dcartesianProduct has asymptotically optimal time complexity. To achieve this complexity, dmap f df must detect when df is a nil function change, and avoid applying it to unchanged input elements.
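For reference, the base computation corresponds to this small Python analogue (our rendering, not the thesis's Haskell):

```python
# Python analogue of cartesianProduct (our sketch).
def cartesian_product(xs, ys):
    # concatMap (\x -> map (\y -> (x, y)) ys) xs
    return [(x, y) for x in xs for y in ys]
```

Its output has |xs| · |ys| pairs, which is why adding one element to either input produces Θ(n) fresh pairs in the output change.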
To simplify the transformations we describe, we λ-lift programs before differentiating and transforming them:
cartesianProduct xs ys =
  concatMap (mapPair ys) xs
mapPair ys = λx → map (pair x) ys
pair x = λy → (x, y)
concatMap f xs =
  let yss = map f xs
  in concat yss
Suppose we add an element to either xs or ys. If change dys adds one element to ys, then dmapPair ys dys, the argument to dconcatMap, is a non-nil function change taking constant time, so dconcatMap must apply it to each element of xs ⊕ dxs.
Suppose next that change dxs adds one element to xs, and dys is a nil change for ys. Then dmapPair ys dys is a nil function change, and we must detect this dynamically: if a function change df : ∆(A → B) is represented as a function and A is infinite, one cannot detect dynamically that it is a nil change. To enable runtime nil change detection, we apply closure conversion on function changes: a function change df, represented as a closure, is nil for f only if all environment changes it contains are also nil, and if the contained function is a derivative of f's function.
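One way to realize this check can be sketched as follows. In this Python model (entirely ours, not the formalization's representation), a function change is a closure pairing a derivative code pointer with an environment change, and it is nil for f exactly when its code is the derivative of f's code and all its environment changes are nil:

```python
# Hypothetical model of closure-converted function changes.
from dataclasses import dataclass

@dataclass
class Closure:
    code: object        # code pointer of the base function
    env: tuple          # captured environment

@dataclass
class ClosureChange:
    dcode: object       # code pointer of a derivative
    denv: tuple         # changes for each captured environment entry

NIL = object()          # marker for a nil value change

def is_nil_change(df, f, derivative_of):
    # derivative_of maps base code pointers to their derivative code pointers.
    return (derivative_of.get(f.code) is df.dcode
            and all(dv is NIL for dv in df.denv))
```

With this representation, dmap can test its function-change argument in constant time (plus the size of the environment) and skip unchanged elements when the test succeeds.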
17.3 Formalization
In this section, we formalize cache-transfer-style (CTS) differentiation, and formally prove its soundness with respect to differentiation. We furthermore present a novel proof of correctness for differentiation itself.
To simplify the proof, we encode many invariants of input and output programs by defining input and output languages with a suitably restricted abstract syntax. Restricting the input language simplifies the transformations, while restricting the output languages simplifies the semantics. Since base programs and derivatives have different shapes, we define different syntactic categories of base terms and change terms.
We define and prove sound three transformations, across two languages. First, we present a variant of differentiation (as defined by Cai et al. [2014]), going from base terms of language λAL to change terms of λAL, and we prove it sound with respect to non-incremental evaluation. Second, we define CTS conversion as a pair of transformations going from base terms of λAL: CTS translation produces CTS versions of base functions, as base terms in iλAL, and CTS differentiation produces CTS derivatives, as change terms in iλAL.
As source language for CTS differentiation, we use a core language named λAL. This language is a common target of compiler front-ends for functional programming languages: in this language, terms are written in λ-lifted and A'-normal form (A'NF), so that every intermediate result is named, and can thus be stored in a cache by CTS conversion and reused later (as described in Sec. 17.2).
The target language for CTS dierentiation is named iλAL Programs in this target language arealso λ-lifted and in ArsquoNF But additionally in these programs every toplevel function f produces acache which is to be conveyed to the derivatives of f
The rest of this section is organized as follows Sec 1731 presents syntax and semantics ofsource language λAL Sec 1732 denes dierentiation and Sec 1733 proves it correct Sec 1734presents syntax and semantics of target language λAL Sec 1735 dene CTS conversion andSec 1736 proves it correct Most of the formalization has been mechanized in Coq (the proofs ofsome straightforward lemmas are left for future work)
1731 The source language λALSyntax The syntax of λAL is given in Figure 171 Our source language allows representing bothbase terms t and change terms dt
142 Chapter 17 Cache-transfer-style conversion
Base terms
  t  ::= let y = f x in t                  (Call)
       | let y = (x̄) in t                  (Tuple)
       | x                                 (Result)

Change terms
  dt ::= let y = f x, dy = df x dx in dt   (Call)
       | let y = (x̄), dy = (d̄x) in dt      (Tuple)
       | dx                                (Result)

Closed values
  v  ::= E[λx. t]                          (Closure)
       | (v̄)                               (Tuple)
       | ℓ                                 (Literal)
       | p                                 (Primitive)

Value environments
  E  ::= E; x = v                          (Value binding)
       | •                                 (Empty)

Change values
  dv ::= dE[λx dx. dt]                     (Closure)
       | (d̄v)                              (Tuple)
       | dℓ                                (Literal)
       | !v                                (Replace)
       | nil                               (Nil)

Change environments
  dE ::= dE; x = v, dx = dv                (Binding)
       | •                                 (Empty)

  n, k ∈ ℕ                                 (Step indexes)

Figure 17.1: Source language λAL (syntax).
[SVar]   ———————————————
         E ⊢ x ⇓₁ E(x)

[STuple]   E; y = (E(x̄)) ⊢ t ⇓ₙ v
           ————————————————————————————
           E ⊢ let y = (x̄) in t ⇓ₙ₊₁ v

[SPrimitiveCall]   E(f) = p    E; y = δ_p(E(x)) ⊢ t ⇓ₙ v
                   ———————————————————————————————————————
                   E ⊢ let y = f x in t ⇓ₙ₊₁ v

[SClosureCall]   E(f) = E_f[λx. t_f]    E_f; x = E(x) ⊢ t_f ⇓ₘ v_y    E; y = v_y ⊢ t ⇓ₙ v
                 ————————————————————————————————————————————————————————————————————————
                 E ⊢ let y = f x in t ⇓ₘ₊ₙ₊₁ v

Figure 17.2: Step-indexed big-step semantics for base terms of source language λAL.
Our syntax for base terms t represents λ-lifted programs in A'-normal form (Sec. 17.2.2). We write x̄ for a sequence of variables of some unspecified length x1, x2, ..., xm. A term can be either a bound variable x, or a let-binding of y in subterm t, to either a new tuple (x̄) (let y = (x̄) in t) or the result of calling function f on argument x (let y = f x in t). Both f and x are variables, to be looked up in the environment. Terms cannot contain λ-abstractions, as they have been λ-lifted to top-level definitions, which we encode as closures in the initial environments. Unlike standard ANF, we add no special syntax for function calls in tail position (see Sec. 17.5.3 for a discussion of this limitation). We often inspect the result of a function call f x, which is not a valid term in our syntax. To enable this, we write "(f x)" for "let y = f x in y", where y is chosen fresh.

Our syntax for change terms dt mimics the syntax for base terms, except that (i) each function call is immediately followed by the evaluation of its derivative, and (ii) the final value returned by a change term is a change variable dx. As we will see, this ad hoc syntax of change terms is tailored to capture the programs produced by differentiation. We only allow α-renamings that maintain the invariant that the definition of x is immediately followed by the definition of dx: if x is renamed to y, then dx must be renamed to dy.

Semantics. A closed value for base terms is either a closure, a tuple of values, a constant or a primitive. A closure is a pair of an evaluation environment E and a λ-abstraction closed with respect to E. The set of available constants is left abstract; it may contain usual first-order constants like integers. We also leave abstract the primitives p, like if-then-else or projections of tuple components. As usual, environments E map variables to closed values. With no loss of generality, we assume that all bound variables are distinct.
Figure 17.2 shows a step-indexed big-step semantics for base terms, defined through judgment E ⊢ t ⇓ₙ v, pronounced "Under environment E, base term t evaluates to closed value v in n steps". Our step-indexes count the number of "nodes" of a big-step derivation.¹ We explain each rule in turn. Rule [SVar] looks variable x up in environment E. The other rules evaluate a let-binding let y = ... in t in environment E: each rule computes y's new value vy (taking m steps, where m can be zero) and evaluates in n steps the body t to v, using environment E extended by binding y to vy; the overall let-binding evaluates to v in m + n + 1 steps. But different rules compute y's value differently. [STuple] looks each variable in x̄ up in E to evaluate tuple (x̄) (in m = 0 steps). [SPrimitiveCall] evaluates function calls where variable f is bound in E to a primitive p. How primitives are evaluated is specified by a function δ_p(–) from closed values to closed values; to evaluate such a primitive call, this rule applies δ_p(–) to x's value (in m = 0 steps). [SClosureCall] evaluates function calls where variable f is bound in E to closure E_f[λx. t_f]: this rule evaluates the closure body t_f in m steps, using closure environment E_f extended with the value of argument x in E.
Change semantics. We move on to define how change terms evaluate to change values. We start with the required auxiliary definitions.

A closed change value is either a closure change, a tuple change, a literal change, a replacement change or a nil change. A closure change is a pair made of an evaluation environment dE and a λ-abstraction expecting a value and a change value as arguments to evaluate a change term into an output change value. An evaluation environment dE follows the same structure as let-bindings of change terms: it binds variables to closed values, and each variable x is immediately followed by a binding for its associated change variable dx. As with let-bindings of change terms, α-renamings in an environment dE must rename dx into dy if x is renamed into y. With no loss of generality, we assume that all bound term variables are distinct in these environments.

We define the original environment ⌊dE⌋₁ and the new environment ⌊dE⌋₂ of a change environment dE by induction over dE:
⌊•⌋ᵢ = •                                    (i = 1, 2)
⌊dE; x = v, dx = dv⌋₁ = ⌊dE⌋₁; x = v
⌊dE; x = v, dx = dv⌋₂ = ⌊dE⌋₂; x = v ⊕ dv
This last rule makes use of the operation ⊕ to update a value with a change, which may fail at runtime. Indeed, change update is a partial function written "v ⊕ dv", defined as follows:
v ⊕ nil = v
v₁ ⊕ !v₂ = v₂
ℓ ⊕ dℓ = δ⊕(ℓ, dℓ)
E[λx. t] ⊕ dE[λx dx. dt] = (E ⊕ dE)[λx. t]
(v₁, ..., vₙ) ⊕ (dv₁, ..., dvₙ) = (v₁ ⊕ dv₁, ..., vₙ ⊕ dvₙ)

where (E; x = v) ⊕ (dE; x = v, dx = dv) = (E ⊕ dE); x = (v ⊕ dv)
Nil and replacement changes can be used to update all values (constants, tuples, primitives and closures), while tuple changes can only update tuples, literal changes can only update literals, and closure changes can only update closures. A nil change leaves a value unchanged. A replacement change overrides the current value v with a new one v′. On literals, ⊕ is defined via some interpretation function δ⊕. Change update for a closure ignores dt instead of combining it with t somehow. This may seem surprising, but we only need ⊕ to behave well for valid changes (as shown by Lemma 17.3.1): for valid closure changes, dt must behave similarly to Dι⟦t⟧ anyway, so only environment updates matter. This definition also avoids having to modify terms at runtime, which would be difficult to implement safely.
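The partiality of ⊕ can be seen in a direct transcription; this is a sketch under an assumed tagged encoding of values and changes, with integer addition standing in for δ⊕.

```python
# Sketch of change update "v ⊕ dv" as a partial Python function: nil and
# replacement changes apply to any value, literal changes use integer
# addition as δ⊕, and shape mismatches fail (the function is partial).

def oplus(v, dv):
    if dv == ('nil',):                    # v ⊕ nil = v
        return v
    if dv[0] == 'replace':                # v1 ⊕ !v2 = v2
        return dv[1]
    if dv[0] == 'dlit' and isinstance(v, int):
        return v + dv[1]                  # ℓ ⊕ dℓ = δ⊕(ℓ, dℓ)
    if dv[0] == 'dtuple' and v[0] == 'tuple':
        return ('tuple', [oplus(vi, dvi) for vi, dvi in zip(v[1], dv[1])])
    if dv[0] == 'dclosure' and v[0] == 'closure':
        _, env, p, body = v               # dt is ignored: only the
        denv = dv[1]                      # closure environment is updated
        return ('closure', {x: oplus(env[x], denv[x]) for x in env}, p, body)
    raise TypeError('ill-matched value and change')
```

Note how the closure case keeps the original body, mirroring the definition above: only the environment is updated.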
¹Instead, Ahmed [2006] and Acar et al. [2008] count the number of steps that small-step evaluation would take (as we did in Appendix C); counting derivation nodes simplifies some proof steps and makes a minor difference in others.
[SDVar]   ————————————————————
          dE ⊢ dx ⇓ dE(dx)

[SDTuple]   dE(x̄, d̄x) = v̄ₓ, d̄vₓ    dE; y = (v̄ₓ), dy = (d̄vₓ) ⊢ dt ⇓ dv
            ——————————————————————————————————————————————————————————
            dE ⊢ let y = (x̄), dy = (d̄x) in dt ⇓ dv

[SDReplaceCall]   dE(df) = !v_f    ⌊dE⌋₁ ⊢ (f x) ⇓ₙ v_y    ⌊dE⌋₂ ⊢ (f x) ⇓ₘ v′_y
                  dE; y = v_y, dy = !v′_y ⊢ dt ⇓ dv
                  ——————————————————————————————————————————
                  dE ⊢ let y = f x, dy = df x dx in dt ⇓ dv

[SDClosureNil]   dE(f, df) = E_f[λx. t_f], nil    E_f; x = dE(x) ⊢ t_f ⇓ₙ v_y
                 E_f; x = ⌊dE⌋₂(x) ⊢ t_f ⇓ₘ v′_y    dE; y = v_y, dy = !v′_y ⊢ dt ⇓ dv
                 ——————————————————————————————————————————
                 dE ⊢ let y = f x, dy = df x dx in dt ⇓ dv

[SDPrimitiveNil]   dE(f, df) = p, nil    dE(x, dx) = vₓ, dvₓ
                   dE; y = δ_p(vₓ), dy = ∆_p(vₓ, dvₓ) ⊢ dt ⇓ dv
                   ——————————————————————————————————————————
                   dE ⊢ let y = f x, dy = df x dx in dt ⇓ dv

[SDClosureChange]   dE(f, df) = E_f[λx. t_f], dE_f[λx dx. dt_f]    dE(x, dx) = vₓ, dvₓ
                    E_f; x = vₓ ⊢ t_f ⇓ₙ v_y    dE_f; x = vₓ, dx = dvₓ ⊢ dt_f ⇓ dv_y
                    dE; y = v_y, dy = dv_y ⊢ dt ⇓ dv
                    ——————————————————————————————————————————
                    dE ⊢ let y = f x, dy = df x dx in dt ⇓ dv

Figure 17.3: Step-indexed big-step semantics for the change terms of the source language λAL.
Having given these definitions, we show in Fig. 17.3 a big-step semantics for change terms (without step-indexing), defined through judgment dE ⊢ dt ⇓ dv, pronounced "Under the environment dE, the change term dt evaluates into the closed change value dv". [SDVar] looks up into dE to return a value for dx. [SDTuple] builds a tuple out of the values of x̄ and a change tuple out of the change values of d̄x, as found in the environment dE. There are four rules to evaluate let-bindings, depending on the nature of dE(df). These four rules systematically recompute the value v_y of y in the original environment; they differ in the way they compute the change dy to y.

If dE(df) is a replacement, [SDReplaceCall] applies. Replacing the value of f in the environment forces recomputing f x from scratch in the new environment. The resulting value v′_y is the new value which must replace v_y, so dy binds to !v′_y when evaluating the let body.

If dE(df) is a nil change, we have two rules, depending on the nature of dE(f). If dE(f) is a closure, [SDClosureNil] applies, and in that case the nil change of dE(f) is the exact same closure. Hence, to compute dy we reevaluate this closure applied to the updated argument ⌊dE⌋₂(x) to a value v′_y, and bind dy to !v′_y. In other words, this rule is equivalent to [SDReplaceCall] in the case where a closure is replaced by itself.² If dE(f) is a primitive, [SDPrimitiveNil] applies. The nil change of a primitive p is its derivative, whose interpretation is realized by a function ∆_p(–). The evaluation of this function on the input value and the input change leads to the change dv_y bound to dy.

If dE(f) is a closure change dE_f[λx dx. dt_f], [SDClosureChange] applies. The change dv_y results from the evaluation of dt_f in the closure change environment dE_f, augmented with an input value for x and a change value for dx. Again, let us recall that we will maintain the invariant that the term dt_f behaves as the derivative of f, so this rule can be seen as the invocation of f's derivative.

²Based on Appendix C, we are confident we could in fact use the derivative of t_f instead of replacement changes, but transforming terms in a semantics seems aesthetically wrong. We can also restrict nil to primitives, as we essentially did in Appendix C.
Expressiveness. A closure in the initial environment can be used to represent a top-level definition. Since environment entries can point to primitives, we need no syntax to directly represent calls of primitives in the syntax of base terms. To encode in our syntax a program with top-level definitions, and a term to be evaluated representing the entry point, one can produce a term t representing the entry point, together with an environment E containing as values any top-level definitions, primitives and constants used in the program.

Our formalization does not model n-ary functions directly, but they can be encoded through unary functions and tuples. This encoding does not support currying efficiently, but we discuss possible solutions in Sec. 17.5.7.

Control operators, like recursion combinators or branching, can be introduced as primitive operations as well. If the branching condition changes, expressing the output change in general requires replacement changes. Similarly to branching, we can add tagged unions.
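The need for replacement changes under a flipped branching condition can be illustrated by treating if-then-else as a primitive with a derivative; the sketch below uses hypothetical names and the same tagged change encoding as before.

```python
# Hypothetical sketch: if-then-else as a primitive δ_if with derivative
# ∆_if. Changes are tagged tuples: ('nil',) or ('replace', new_value).
# When the condition's truth value flips, no incremental description of
# the output exists, so the derivative emits a replacement change.

def if_prim(arg):                       # δ_if(cond, then_v, else_v)
    c, t, e = arg
    return t if c else e

def new(v, dv):                         # apply a change to get the new value
    return v if dv == ('nil',) else dv[1]

def dif_prim(arg, darg):                # ∆_if: derivative of the primitive
    c, t, e = arg
    dc, dt, de = darg
    c2 = new(c, dc)                     # updated condition
    if c2 == c:                         # branch unchanged: forward its change
        return dt if c else de
    # condition flipped: replace the output wholesale
    return ('replace', new(t, dt) if c2 else new(e, de))
```

When only the taken branch changes, the derivative passes that change through; when the condition itself changes, it falls back to replacement.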
17.3.2 Static differentiation in λAL

Definition 10.5.1 defines differentiation for simply-typed λ-calculus terms. Figure 17.4 shows differentiation for λAL syntax.
Dι⟦x⟧ = dx
Dι⟦let y = (x̄) in t⟧ = let y = (x̄), dy = (d̄x) in Dι⟦t⟧
Dι⟦let y = f x in t⟧ = let y = f x, dy = df x dx in Dι⟦t⟧

Figure 17.4: Static differentiation in λAL.
Differentiating a base term t produces a change term Dι⟦t⟧, its derivative. Differentiating the final result variable x produces its change variable dx. Differentiation copies each binding of an intermediate result y to the output and adds a new binding for its change dy. If y is bound to tuple (x̄), then dy will be bound to the change tuple (d̄x). If y is bound to function application f x, then dy will be bound to the application of function change df to input x and its change dx.

Evaluating Dι⟦t⟧ recomputes all intermediate results computed by t. This recomputation will be avoided through cache-transfer style in Sec. 17.3.5.
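The rules of Figure 17.4 amount to a short recursive AST transformation; the sketch below uses a hypothetical tagged-tuple encoding of λAL terms, with "d" name-prefixing standing in for the x ↦ dx convention.

```python
# Static differentiation Dι⟦t⟧ as a recursive transformation: every
# binding for y gains a companion binding for dy, and the result
# variable x becomes dx.

def derive(t):
    tag = t[0]
    if tag == 'var':                      # Dι⟦x⟧ = dx
        return ('dvar', 'd' + t[1])
    if tag == 'lettuple':                 # let y = (x̄) in t
        _, y, xs, body = t
        return ('dlettuple', y, xs,
                'd' + y, ['d' + x for x in xs], derive(body))
    if tag == 'letcall':                  # let y = f x in t
        _, y, f, x, body = t
        return ('dletcall', y, f, x,
                'd' + y, 'd' + f, 'd' + x, derive(body))
    raise ValueError(tag)
```

For instance, differentiating "let y = f x in y" yields "let y = f x, dy = df x dx in dy".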
The original transformation for static differentiation of λ-terms [Cai et al. 2014] has three cases, which we recall here:
Derive(x) = dx
Derive(t u) = Derive(t) u Derive(u)
Derive(λx. t) = λx dx. Derive(t)
Even though the first two cases of Cai et al.'s differentiation map into the two cases of our differentiation variant, one may ask where the third case is realized now. Actually, this third case occurs while we transform the initial environment. Indeed, we will assume that the closures of the environment of the source program have been adjoined a derivative. More formally, we suppose that the derivative of t is evaluated under an environment Dι⟦E⟧ obtained as follows:
Dι⟦•⟧ = •
Dι⟦E; f = E_f[λx. t]⟧ = Dι⟦E⟧; f = E_f[λx. t], df = Dι⟦E_f⟧[λx dx. Dι⟦t⟧]
Dι⟦E; x = v⟧ = Dι⟦E⟧; x = v, dx = nil        (if v is not a closure)
17.3.3 A new soundness proof for static differentiation

As in Cai et al. [2014]'s development, static differentiation is only sound on input changes that are valid. However, our definition of validity needs to be significantly different. Cai et al. prove soundness for a strongly normalizing simply-typed λ-calculus using denotational semantics. We generalize this result to an untyped and Turing-complete language, using purely syntactic arguments. In both scenarios, a function change is only valid if it turns valid input changes into valid output changes, so validity is a logical relation. Since standard logical relations only apply to typed languages, we turn to step-indexed logical relations.
Validity as a step-indexed logical relation
To state and prove soundness of differentiation, we define validity by introducing a ternary step-indexed relation over base values, changes and updated values, following previous work on step-indexed logical relations [Ahmed 2006; Acar et al. 2008]. Experts might notice small differences in our step-indexing, as mentioned in Sec. 17.3.1, but they do not affect the substance of the proof. We write

dv ▷ₖ v₁ → v₂

and say that "dv is a valid change from v₁ to v₂, up to k steps" to mean that dv is a change from v₁ to v₂, and that dv is a valid description of the differences between v₁ and v₂, with validity tested with up to k steps. To justify this intuition of validity, we state and prove two lemmas: a valid change from v₁ to v₂ goes indeed from v₁ to v₂ (Lemma 17.3.1), and if a change is valid up to k steps, it is also valid up to fewer steps (Lemma 17.3.2).
Lemma 17.3.1 (⊕ agrees with validity)
If we have dv ▷ₖ v₁ → v₂ for all step-indexes k, then v₁ ⊕ dv = v₂.

Lemma 17.3.2 (Downward closure)
If N ≥ n, then dv ▷_N v₁ → v₂ implies dv ▷ₙ v₁ → v₂.
• dℓ ▷ₙ ℓ → δ⊕(ℓ, dℓ)

• !v₂ ▷ₙ v₁ → v₂

• nil ▷ₙ v → v

• (dv₁, ..., dvₘ) ▷ₙ (v₁, ..., vₘ) → (v′₁, ..., v′ₘ)
  if and only if ∀k < n, ∀i ∈ [1 ... m], dvᵢ ▷ₖ vᵢ → v′ᵢ

• dE[λx dx. dt] ▷ₙ E₁[λx. t] → E₂[λx. t]
  if and only if E₂ = E₁ ⊕ dE and ∀k < n, v₁, dv, v₂,
  if dv ▷ₖ v₁ → v₂
  then (dE; x = v₁, dx = dv ⊢ dt) ◁ₖ (E₁; x = v₁ ⊢ t) → (E₂; x = v₂ ⊢ t)

Figure 17.5: Step-indexed relation between values and changes.
As usual with step-indexed logical relations, validity is defined by well-founded induction over naturals ordered by <. To show this, it helps to observe that evaluation always takes at least one step.
Validity is formally defined by cases in Figure 17.5; we describe each case in turn. First, a constant change dℓ is a valid change from ℓ to ℓ ⊕ dℓ = δ⊕(ℓ, dℓ). Since the function δ⊕ is partial, the relation only holds for the constant changes dℓ which are valid changes for ℓ. Second, a replacement change !v₂ is always a valid change from any value v₁ to v₂. Third, a nil change is a valid change between any value and itself. Fourth, a tuple change is valid up to step n if each of its components is valid up to any step strictly less than n. Fifth, we define validity for closure changes. Roughly speaking, this statement means that a closure change is valid if (i) its environment change dE is valid for the original closure environment E₁ and for the new closure environment E₂; (ii) when applied to related values, the closure bodies t₁ and t₂ are related by dt. The validity relation between terms is defined as follows:
(dE ⊢ dt) ◁ₙ (E₁ ⊢ t₁) → (E₂ ⊢ t₂)
if and only if ∀k < n, v₁, v₂,
E₁ ⊢ t₁ ⇓ₖ v₁ and E₂ ⊢ t₂ ⇓ v₂
implies that ∃dv, dE ⊢ dt ⇓ dv and dv ▷ₙ₋ₖ v₁ → v₂
We extend this relation from values to environments by defining a judgment dE ▷ₖ E₁ → E₂, defined as follows:

            ——————————————
            • ▷ₖ • → •

dE ▷ₖ E₁ → E₂    dv ▷ₖ v₁ → v₂
——————————————————————————————————————————————————————
(dE; x = v₁, dx = dv) ▷ₖ (E₁; x = v₁) → (E₂; x = v₂)
The above lemmas about validity for values extend to environments
Lemma 17.3.3 (⊕ agrees with validity for environments)
If dE ▷ₖ E₁ → E₂, then E₁ ⊕ dE = E₂.

Lemma 17.3.4 (Downward closure for environments)
If N ≥ n, then dE ▷_N E₁ → E₂ implies dE ▷ₙ E₁ → E₂.

Finally, for values, terms and environments, omitting the step count k from validity means that validity holds for all k. That is, for instance, dv ▷ v₁ → v₂ means dv ▷ₖ v₁ → v₂ for all k.
Soundness of differentiation

We can state a soundness theorem for differentiation without mentioning step-indexes. Instead of proving it directly, we must first prove a more technical statement (Lemma 17.3.6) that mentions step-indexes explicitly.

Theorem 17.3.5 (Soundness of differentiation in λAL)
If dE is a valid change environment from base environment E₁ to updated environment E₂, that is dE ▷ E₁ → E₂, and if t converges both in the base and updated environment, that is E₁ ⊢ t ⇓ v₁ and E₂ ⊢ t ⇓ v₂, then Dι⟦t⟧ evaluates under the change environment dE to a valid change dv between base result v₁ and updated result v₂, that is dE ⊢ Dι⟦t⟧ ⇓ dv, dv ▷ v₁ → v₂ and v₁ ⊕ dv = v₂.

Lemma 17.3.6 (Fundamental property)
For each n, if dE ▷ₙ E₁ → E₂, then (dE ⊢ Dι⟦t⟧) ◁ₙ (E₁ ⊢ t) → (E₂ ⊢ t).
17.3.4 The target language iλAL

In this section we present the target language of a transformation that extends static differentiation with CTS conversion. As said earlier, the functions of iλAL compute both their output and a cache, which contains the intermediate values that contributed to that output. Derivatives receive this cache and use it to compute changes without recomputing intermediate values; derivatives also update the caches according to their input changes.
Base terms
  M  ::= let y, c_y_fx = f x in M              (Call)
       | let y = (x̄) in M                      (Tuple)
       | (x, C)                                (Result)

Cache terms
  C  ::= •                                     (Empty)
       | C; c_y_fx                             (Sub-cache)
       | C; x                                  (Cached value)

Change terms
  dM ::= let dy, c_y_fx = df dx c_y_fx in dM   (Call)
       | let dy = (d̄x) in dM                   (Tuple)
       | (dx, dC)                              (Result)

Cache updates
  dC ::= •                                     (Empty)
       | dC; c_y_fx                            (Sub-cache)
       | dC; (x ⊕ dx)                          (Updated value)

Closed values
  V  ::= F[λx. M]                              (Closure)
       | (V̄)                                   (Tuple)
       | ℓ                                     (Literal)
       | p                                     (Primitive)

Cache values
  Vc ::= •                                     (Empty)
       | Vc; V′c                               (Sub-cache)
       | Vc; V                                 (Cached value)

Change values
  dV ::= dF[λdx C. dM]                         (Closure)
       | (d̄V)                                  (Tuple)
       | dℓ                                    (Literal)
       | nil                                   (Nil)
       | !V                                    (Replacement)

Base definitions
  Dv ::= x = V                                 (Value definition)
       | c_y_fx = Vc                           (Cache definition)

Change definitions
  dDv ::= Dv                                   (Base)
        | dx = dV                              (Change)

Evaluation environments
  F  ::= F; Dv                                 (Binding)
       | •                                     (Empty)

Change environments
  dF ::= dF; dDv                               (Binding)
       | •                                     (Empty)

Figure 17.6: Target language iλAL (syntax).
Syntax. The syntax of iλAL is defined in Figure 17.6. Base terms of iλAL again follow λ-lifted A'NF, like λAL, except that a let-binding for a function application f x now binds an extra cache identifier c_y_fx besides output y. Cache identifiers have non-standard syntax: a cache identifier can be seen as a triple referring to the value identifiers f, x and y. Hence, an α-renaming of one of these three identifiers must refresh the cache identifier accordingly. Result terms explicitly return cache C through syntax (x, C).

The syntax for caches has three cases: a cache can be empty, or it can extend a cache with a value or a cache variable. In other words, a cache is a tree-like data structure which is isomorphic to an execution trace, containing both immediate values and the execution traces of the function calls issued during the evaluation.

The syntax for change terms of iλAL witnesses the CTS discipline followed by the derivatives: to determine dy, the derivative of f evaluated at point x with change dx expects the cache produced by evaluating y in the base term. The derivative returns the updated cache, which contains the intermediate results that would be gathered by the evaluation of f (x ⊕ dx). The result term of every change term returns a cache update dC in addition to the computed change.

The syntax for cache updates resembles the one for caches, but each value identifier x of the input cache is updated with its corresponding change dx.

Semantics. An evaluation environment F of iλAL contains not only values but also cache values. The syntax for values V includes closures, tuples, primitives and constants. The syntax for cache values Vc mimics the one for cache terms. The evaluation of change terms expects the evaluation environments dF to also include bindings for change values.

There are five kinds of change values: closure changes, tuple changes, literal changes, nil changes and replacements. Closure changes embed an environment dF and a code pointer for a function waiting for both a base value x and a cache C. By abuse of notation, we reuse the same syntax C to both deconstruct and construct caches. Other changes are similar to the ones found in λAL.

Base terms of the language are evaluated using a big-step semantics, defined in Figure 17.7.
Evaluation of base terms   F ⊢ M ⇓ (V, Vc)

[TResult]   F(x) = V    F ⊢ C ⇓ Vc
            ————————————————————————
            F ⊢ (x, C) ⇓ (V, Vc)

[TTuple]   F; y = F(x̄) ⊢ M ⇓ (V, Vc)
           ————————————————————————————————
           F ⊢ let y = (x̄) in M ⇓ (V, Vc)

[TClosureCall]   F(f) = F′[λx′. M′]    F′; x′ = F(x) ⊢ M′ ⇓ (V′, V′c)
                 F; y = V′, c_y_fx = V′c ⊢ M ⇓ (V, Vc)
                 ———————————————————————————————————————
                 F ⊢ let y, c_y_fx = f x in M ⇓ (V, Vc)

[TPrimitiveCall]   F(f) = p    δ_p(F(x)) = (V′, V′c)    F; y = V′, c_y_fx = V′c ⊢ M ⇓ (V, Vc)
                   ———————————————————————————————————————
                   F ⊢ let y, c_y_fx = f x in M ⇓ (V, Vc)

Evaluation of caches   F ⊢ C ⇓ Vc

[TEmptyCache]   ——————————
                F ⊢ • ⇓ •

[TCacheVar]   F(x) = V    F ⊢ C ⇓ Vc
              ———————————————————————
              F ⊢ C; x ⇓ Vc; V

[TCacheSubCache]   F(c_y_fx) = V′c    F ⊢ C ⇓ Vc
                   ——————————————————————————————
                   F ⊢ C; c_y_fx ⇓ Vc; V′c

Figure 17.7: Target language iλAL (semantics of base terms and caches).
Judgment "F ⊢ M ⇓ (V, Vc)" is read "Under evaluation environment F, base term M evaluates to value V and cache Vc". Auxiliary judgment "F ⊢ C ⇓ Vc" evaluates cache terms into cache values. Rule [TResult] not only looks into the environment for the return value V, but it also evaluates the returned cache C. Rule [TTuple] is similar to the corresponding rule of the source language, since no cache is produced by the allocation of a tuple. Rule [TClosureCall] works exactly as [SClosureCall], except that the cache value returned by the closure is bound to cache identifier c_y_fx. In the same way, [TPrimitiveCall] resembles [SPrimitiveCall] but also binds c_y_fx. Rule [TEmptyCache] evaluates an empty cache term into an empty cache value. Rule [TCacheVar] computes the value of cache term C; x by appending the value of variable x to the cache value Vc computed for cache term C. Similarly, rule [TCacheSubCache] appends the cache value of a cache named c_y_fx to the cache value Vc computed for C.
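The base-term semantics above can be sketched as a small interpreter that threads caches; the encoding is hypothetical (caches are flat lists of names in terms and lists of values at runtime), chosen only to make the (value, cache) discipline concrete.

```python
# Sketch (assumed encoding) of the iλAL base-term semantics of Figure
# 17.7: a closure call yields a value and a cache, the cache identifier
# is bound alongside y, and the result term assembles the cache from the
# environment.

def eval_cts(env, m):
    tag = m[0]
    if tag == 'result':                   # [TResult]: (x, C)
        _, x, cache_names = m
        return env[x], [env[c] for c in cache_names]
    if tag == 'lettuple':                 # [TTuple]: no cache produced
        _, y, xs, body = m
        return eval_cts({**env, y: tuple(env[x] for x in xs)}, body)
    if tag == 'letcallc':                 # [TClosureCall]: binds y and c_y_fx
        _, y, cid, f, x, body = m
        _, cenv, p, fbody = env[f]
        vy, vc = eval_cts({**cenv, p: env[x]}, fbody)
        return eval_cts({**env, y: vy, cid: vc}, body)
    raise ValueError(tag)

idf = ('closure', {}, 'a', ('result', 'a', []))
v, vc = eval_cts({'f': idf, 'x': 3},
                 ('letcallc', 'y', 'c_y_f_x', 'f', 'x',
                  ('result', 'y', ['y', 'c_y_f_x'])))
```

The result cache contains both the intermediate value y and the (here empty) sub-cache of the call to f, mirroring the tree-like cache structure.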
Change terms of the target language are also evaluated using a big-step semantics, defined in Fig. 17.8. Judgment "dF ⊢ dM ⇓ (dV, Vc)" is read "Under evaluation environment dF, change term dM evaluates to change value dV and updated cache Vc". The first auxiliary judgment "dF ⊢ dC ⇓ Vc" defines the evaluation of cache update terms. We omit the rules for this judgment, since it is similar to the one for cache terms, except that cached values are computed by x ⊕ dx, not simply x. The final auxiliary judgment "Vc ∼ C → dF" describes a limited form of pattern matching used by CTS derivatives: namely, how a cache pattern C matches a cache value Vc to produce a change environment dF.

Rule [TDResult] returns the final change value of a computation, as well as an updated cache resulting from the evaluation of the cache update term dC. Rule [TDTuple] resembles its counterpart in the source language, but the tuple for y is not built, as it has already been pushed in the environment by the cache.

As for λAL, there are four rules to deal with let-bindings, depending on the shape of the change bound to df in the environment. If df is bound to a replacement, the rule [TDReplaceCall] applies.
Evaluation of change terms   dF ⊢ dM ⇓ (dV, Vc)

[TDResult]   dF(dx) = dV    dF ⊢ dC ⇓ Vc
             ————————————————————————————
             dF ⊢ (dx, dC) ⇓ (dV, Vc)

[TDTuple]   dF; dy = dF(d̄x) ⊢ dM ⇓ (dV, Vc)
            ——————————————————————————————————————
            dF ⊢ let dy = (d̄x) in dM ⇓ (dV, Vc)

[TDReplaceCall]   dF(df) = !V_f    ⌊dF⌋₂ ⊢ (f x) ⇓ (V′, V′c)
                  dF; dy = !V′, c_y_fx = V′c ⊢ dM ⇓ (dV, Vc)
                  —————————————————————————————————————————————————
                  dF ⊢ let dy, c_y_fx = df dx c_y_fx in dM ⇓ (dV, Vc)

[TDClosureNil]   dF(f, df) = F_f[λx. M_f], nil    F_f; x = ⌊dF⌋₂(x) ⊢ M_f ⇓ (V′_y, V′c)
                 dF; dy = !V′_y, c_y_fx = V′c ⊢ dM ⇓ (dV, Vc)
                 —————————————————————————————————————————————————
                 dF ⊢ let dy, c_y_fx = df dx c_y_fx in dM ⇓ (dV, Vc)

[TDPrimitiveNil]   dF(f, df) = p, nil    dF(x, dx) = Vₓ, dVₓ
                   dF; dy, c_y_fx = ∆_p(Vₓ, dVₓ, dF(c_y_fx)) ⊢ dM ⇓ (dV, Vc)
                   —————————————————————————————————————————————————
                   dF ⊢ let dy, c_y_fx = df dx c_y_fx in dM ⇓ (dV, Vc)

[TDClosureChange]   dF(df) = dF_f[λdx C. dM_f]    dF(c_y_fx) ∼ C → dF′
                    dF_f; dx = dF(dx); dF′ ⊢ dM_f ⇓ (dV_y, V′c)
                    dF; dy = dV_y, c_y_fx = V′c ⊢ dM ⇓ (dV, Vc)
                    —————————————————————————————————————————————————
                    dF ⊢ let dy, c_y_fx = df dx c_y_fx in dM ⇓ (dV, Vc)

Binding of caches   Vc ∼ C → dF

[TMatchEmptyCache]   ——————————
                     • ∼ • → •

[TMatchCachedValue]   Vc ∼ C → dF
                      ——————————————————————————————
                      Vc; V ∼ C; x → dF; (x = V)

[TMatchSubCache]   Vc ∼ C → dF
                   ————————————————————————————————————————
                   Vc; V′c ∼ C; c_y_fx → dF; (c_y_fx = V′c)

Figure 17.8: Target language iλAL (semantics of change terms and cache updates).
CTS translation of toplevel definitions   C⟦f = λx. t⟧
C⟦f = λx. t⟧ = f  = λx. M
               df = λdx C. D_{•; (x⊕dx)}⟦t⟧
  where (C, M) = C_{•; x}⟦t⟧

CTS differentiation of terms   D_dC⟦t⟧
D_dC⟦let y = f x in t⟧ = let dy, c_y_fx = df dx c_y_fx in M
  where M = D_{dC; (y⊕dy); c_y_fx}⟦t⟧
D_dC⟦let y = (x̄) in t⟧ = let dy = (d̄x) in M
  where M = D_{dC; (y⊕dy)}⟦t⟧
D_dC⟦x⟧ = (dx, dC)

CTS translation of terms   C_C⟦t⟧
C_C⟦let y = f x in t⟧ = (C′, let y, c_y_fx = f x in M)
  where (C′, M) = C_{C; y; c_y_fx}⟦t⟧
C_C⟦let y = (x̄) in t⟧ = (C′, let y = (x̄) in M)
  where (C′, M) = C_{C; y}⟦t⟧
C_C⟦x⟧ = (C, (x, C))

Figure 17.9: Cache-Transfer Style (CTS) conversion from λAL to iλAL.
In that case, we reevaluate the function call in the updated environment ⌊dF⌋₂ (defined similarly as in the source language). This evaluation leads to a new value V′, which replaces the original one, as well as an updated cache for c_y_fx.

If df is bound to a nil change and f is bound to a closure, the rule [TDClosureNil] applies. This rule again mimics its counterpart in the source language, with the difference that only the resulting change and the updated cache are bound in the environment.

If df is bound to a nil change and f is bound to primitive p, the rule [TDPrimitiveNil] applies. The derivative of p is invoked with the value of x, its change value, and the cache of the original call to p. The semantics of p's derivative is given by the builtin function ∆_p(–), as in the source language.

If df is bound to a closure change and f is bound to a closure, the rule [TDClosureChange] applies. The body of the closure change is evaluated under the closure change environment, extended with the value of the formal argument dx and with the environment resulting from the binding of the original cache value to the variables occurring in the cache C. This evaluation leads to a change and an updated cache, bound in the environment to continue with the evaluation of the rest of the term.
17.3.5 CTS conversion from λAL to iλAL

CTS conversion from λAL to iλAL is defined in Figure 17.9. It comprises CTS differentiation D⟦–⟧, from λAL base terms to iλAL change terms, and CTS translation C⟦–⟧, from λAL to iλAL, which is overloaded over top-level definitions C⟦f = λx. t⟧ and terms C_C⟦t⟧.

By the first rule, C⟦–⟧ maps each source toplevel definition "f = λx. t" to the compiled code of the function f and to the derivative df of f, expressed in the target language iλAL. These target definitions are generated by a first call to the compilation function C_{•; x}⟦t⟧: it returns both M, the compiled body of f, and the cache term C, which contains the names of the intermediate values
CTS translation of values   C⟦v⟧
C⟦E[λx. t]⟧ = C⟦E⟧[λx. M]    where (C, M) = C_{•; x}⟦t⟧
C⟦ℓ⟧ = ℓ

CTS translation of change values   C⟦dv⟧
C⟦dE[λx dx. Dι⟦t⟧]⟧ = C⟦dE⟧[λdx C. D_{•; (x⊕dx)}⟦t⟧]    where (C, M) = C_{•; x}⟦t⟧
C⟦!v⟧ = !C⟦v⟧
C⟦dℓ⟧ = dℓ

CTS translation of value environments   C⟦E⟧
C⟦•⟧ = •
C⟦E; x = v⟧ = C⟦E⟧; x = C⟦v⟧

CTS translation of change environments   C⟦dE⟧
C⟦•⟧ = •
C⟦dE; x = v, dx = dv⟧ = C⟦dE⟧; x = C⟦v⟧, dx = C⟦dv⟧

Figure 17.10: Extending CTS translation to values, change values, environments and change environments.
computed by the evaluation of M. This cache term C is used as a cache pattern to define the second argument of the derivative of f. That way, we make sure that the shape of the cache expected by df is consistent with the shape of the cache produced by f. The derivative body of df is computed by the derivation call D_{•; (x⊕dx)}⟦t⟧.

CTS translation on terms C_C⟦t⟧ accepts a term t and a cache term C. This cache is a fragment of output code: in tail position (t = x), it generates code to return both the result x and the cache C. When the transformation visits let-bindings, it outputs extra bindings for caches c_y_fx and appends all variables newly bound in the output to the cache used when visiting the let-body.

Similarly to C_C⟦t⟧, CTS derivation D_dC⟦t⟧ accepts a cache update dC to return in tail position. While cache terms record intermediate results, cache updates record result updates. For let-bindings, to update y by change dy, CTS derivation appends to dC the term y ⊕ dy to replace y.
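The cache-threading behavior of C_C⟦t⟧ can be sketched as a recursive transformation over the tagged-tuple term encoding used earlier; the names (and the flat-list cache representation) are hypothetical.

```python
# Sketch of CTS translation C_C⟦t⟧: each function-call binding gains a
# cache identifier, the cache grows with every newly bound name, and the
# tail position returns both the result and the accumulated cache.

def cts_translate(t, cache):
    tag = t[0]
    if tag == 'var':                              # C_C⟦x⟧ = (C, (x, C))
        return cache, ('result', t[1], cache)
    if tag == 'lettuple':                         # cache grows by y
        _, y, xs, body = t
        c2, m = cts_translate(body, cache + [y])
        return c2, ('lettuple', y, xs, m)
    if tag == 'letcall':                          # cache grows by y and c_y_fx
        _, y, f, x, body = t
        cid = f'c_{y}_{f}_{x}'
        c2, m = cts_translate(body, cache + [y, cid])
        return c2, ('letcallc', y, cid, f, x, m)
    raise ValueError(tag)

c, m = cts_translate(('letcall', 'y', 'f', 'x', ('var', 'y')), [])
```

The returned cache term c lists exactly the names the compiled body binds, which is what lets the toplevel rule reuse it as the cache pattern of the derivative df.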
17.3.6 Soundness of CTS conversion

In this section, we outline the definitions and main lemmas needed to prove CTS conversion sound. The proof is based on a mostly straightforward simulation in lock-step, but two subtle points emerge. First, we must relate λAL environments, which do not contain caches, with iλAL environments, which do. Second, while evaluating CTS derivatives, the evaluation environment mixes caches from the base computation and updated caches computed by the derivatives.

Evaluation commutes with CTS conversion. Figure 17.10 extends CTS translation to values, change values, environments and change environments. CTS translation of base terms commutes with our semantics:
———————————————
dF ↑ • ⇝ᵢ dF

dF ↑ C ⇝ᵢ dF′
——————————————————
dF ↑ C; x ⇝ᵢ dF′

dF ↑ C ⇝ᵢ dF′    ⌊dF⌋ᵢ ⊢ let y, c_y_fx = f x in (y, c_y_fx) ⇓ (_, Vc)
——————————————————————————————————————————————————————————————————————
dF ↑ C; c_y_fx ⇝ᵢ dF′; c_y_fx = Vc

Figure 17.11: Extension of an environment with cache values dF ↑ C ⇝ᵢ dF′ (for i = 1, 2).
Lemma 1737 (Evaluation commutes with CTS translation)For all E t and v such that E ` t dArr v and for all C there exists Vc C nE o ` CCn t o dArr (C nv oVc )
Stating a corresponding lemma for CTS translation of derived terms is trickier. If the derivative of t evaluates correctly in some environment (that is, dE ⊢ Dι⟦t⟧ ⇓ dv), CTS derivative D_dC⟦t⟧ cannot be evaluated in environment C⟦dE⟧. A CTS derivative can only evaluate against environments containing cache values from the base computation, but no cache values appear in C⟦dE⟧.
As a fix, we enrich C⟦dE⟧ with the values of a cache C, using the pair of judgments dF ↑ C ⇝ᵢ dF′ (for i = 1, 2) defined in Fig. 17.11. Judgment dF ↑ C ⇝ᵢ dF′ is read "Target change environment dF′ extends dF with the original (for i = 1) or updated (for i = 2) values of cache C". Since C is essentially a list of variables containing no values, cache values in dF′ must be computed from ⌊dF⌋ᵢ.
Lemma 17.38 (Evaluation commutes with CTS differentiation)
Let C be such that (C, _) = C_•⟦t⟧. For all dE, t and dv, if dE ⊢ Dι⟦t⟧ ⇓ dv and C⟦dE⟧ ↑ C ⇝₁ dF, then, for some Vc, dF ⊢ D_dC⟦t⟧ ⇓ (C⟦dv⟧, Vc).
The proof of this lemma is not immediate, since during the evaluation of D_dC⟦t⟧ the new caches replace the old caches. In our Coq development, we enforce a physical separation between the part of the environment containing old caches and the one containing new caches, and we maintain the invariant that the second part of the environment corresponds to the remaining part of the term.
Soundness of CTS conversion. Finally, we can state soundness of CTS differentiation relative to differentiation. The theorem says that (a) the CTS derivative D_C⟦t⟧ computes the CTS translation C⟦dv⟧ of the change computed by the standard derivative Dι⟦t⟧; (b) the updated cache Vc₂ produced by the CTS derivative coincides with the cache produced by the CTS-translated base term M in the updated environment ⌊dF₂⌋₂. We must use ⌊dF₂⌋₂ instead of dF₂ to evaluate CTS-translated base term M, since dF₂ produced by environment extension contains updated caches, changes and original values. Since we require a correct cache via condition (b), we can use this cache to invoke the CTS derivative on further changes, as described in Sec. 17.2.2.
Theorem 17.39 (Soundness of CTS differentiation wrt. differentiation)
Let C and M be such that (C, M) = C_•⟦t⟧. For all dE, t and dv such that dE ⊢ Dι⟦t⟧ ⇓ dv, and for all dF₁ and dF₂ such that

C⟦dE⟧ ↑ C ⇝₁ dF₁
C⟦dE⟧ ↑ C ⇝₂ dF₂
⌊dF₁⌋₁ ⊢ M ⇓ (V₁, Vc₁)
⌊dF₂⌋₂ ⊢ M ⇓ (V₂, Vc₂)

we have

dF₁ ⊢ D_C⟦t⟧ ⇓ (C⟦dv⟧, Vc₂).
17.4 Incrementalization case studies
In this section we investigate whether our transformations incrementalize programs efficiently in a typed language such as Haskell. And indeed, after providing support for the needed bulk operations on sequences, bags and maps, we successfully transform a few case studies and obtain efficient incremental programs, that is, ones for which incremental computation is faster than from-scratch recomputation.
We type CTS programs by associating to each function f a cache type FC. We can transform programs that use higher-order functions by making closure construction explicit.
We illustrate those points on three case studies: average computation over bags of integers, a nested loop over two sequences, and a more involved example inspired by Koch et al.'s work on incrementalizing database queries.
In all cases, we confirm that results are consistent between from-scratch recomputation and incremental evaluation.
Our benchmarks were compiled by GHC 8.0.2. They were run on a 2.60GHz dual-core Intel i7-6600U CPU with 12GB of RAM running Ubuntu 16.04.
17.4.1 Averaging integer bags

Sec. 17.2.2 motivates our transformation with a running example of computing the average over a bag of integers. We represent bags as maps from elements to (possibly negative) multiplicities. Earlier work [Cai et al. 2014; Koch et al. 2014] represents bag changes as bags of removed and added elements: to update element a with change da, one removes a and adds a ⊕ da. Instead, our bag changes contain element changes: a valid bag change is either a replacement change (with a new bag) or a list of atomic changes. An atomic change contains an element a, a valid change for that element da, a multiplicity n and a change to that multiplicity dn. Updating a bag with an atomic change means removing n occurrences of a and inserting n ⊕ dn occurrences of updated element a ⊕ da.
type Bag a = Map a Z
type ∆(Bag a) = Replace (Bag a) | Ch [AtomicChange a]
data AtomicChange a = AtomicChange a (∆a) Z (∆Z)
This change type enables both users and derivatives to express efficiently various bag changes, such as inserting a single element or changing a single element, but also changing some but not all occurrences of an element in a bag.
insertSingle :: a → ∆(Bag a)
insertSingle a = Ch [AtomicChange a 0_a 0 (Ch 1)]
changeSingle :: a → ∆a → ∆(Bag a)
changeSingle a da = Ch [AtomicChange a da 1 0_1]
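To make this change structure concrete, here is a runnable sketch in plain Haskell, specialized to bags of integers, with additive integer deltas standing in for the general element-change type ∆a (a simplifying assumption of ours); applyBag plays the role of the ⊕ operator on bags.

```haskell
import qualified Data.Map.Strict as Map

-- Bags as maps from elements to (possibly negative) multiplicities.
type Bag = Map.Map Int Int

data BagChange = Replace Bag | Ch [AtomicChange]

-- AtomicChange a da n dn: remove n occurrences of a and insert
-- (n + dn) occurrences of the updated element (a + da).
data AtomicChange = AtomicChange Int Int Int Int

applyBag :: Bag -> BagChange -> Bag
applyBag _ (Replace b) = b
applyBag b (Ch cs)     = foldl step b cs
  where
    step m (AtomicChange a da n dn) =
      insertN (a + da) (n + dn) (insertN a (negate n) m)
    -- adjust a multiplicity, dropping zero entries
    insertN a k m = Map.filter (/= 0) (Map.insertWith (+) a k m)

insertSingle :: Int -> BagChange
insertSingle a = Ch [AtomicChange a 0 0 1]

changeSingle :: Int -> Int -> BagChange
changeSingle a da = Ch [AtomicChange a da 1 0]
```

For example, changeSingle a da removes one occurrence of a and inserts one occurrence of a + da, leaving other occurrences of a untouched.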
Based on this change structure, we provide efficient primitives and their derivatives. The CTS variant of map, which we call mapC, takes a function fC in CTS and a bag as, and produces a bag and a cache. Because map is not self-maintainable, the cache stores the arguments fC and as. In addition, the cache stores, for each invocation of fC, and therefore for each distinct element in as, the result of fC of type b and the cache of type c. The incremental function dmapC has to produce an updated cache. We check that this is the same cache that mapC would produce when applied to updated inputs, using the QuickCheck library.
Following ideas inspired by Rossberg et al. [2010], all higher-order functions (and typically also their caches) are parametric over cache types of their function arguments. Here, functions mapC and dmapC and cache type MapC are parametric over the cache type c of fC and dfC.
map :: (a → b) → Bag a → Bag b
type MapC a b c = (a → (b, c), Bag a, Map a (b, c))
mapC :: (a → (b, c)) → Bag a → (Bag b, MapC a b c)
dmapC :: (∆a → c → (∆b, c)) → ∆(Bag a) → MapC a b c → (∆(Bag b), MapC a b c)
We wrote the length and sum functions used in our benchmarks in terms of primitives map and foldGroup. We applied our transformations to get lengthC, dlengthC, sumC and dsumC. This transformation could in principle be automated. Implementing the CTS variant and CTS derivative of primitives efficiently requires manual work.
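As an illustration, here is a minimal runnable model of the avg pipeline and its CTS form. This is our own simplification: it folds the bag directly instead of going through the map and foldGroup primitives, handles only a single-insertion change, and returns the updated result directly rather than a result change; the cache type AvgC is a hypothetical name.

```haskell
import qualified Data.Map.Strict as Map

type Bag = Map.Map Int Int       -- element -> multiplicity

sumB, lenB :: Bag -> Int
sumB = Map.foldrWithKey (\a n acc -> a * n + acc) 0
lenB = Map.foldr (+) 0

avg :: Bag -> Rational
avg b = fromIntegral (sumB b) / fromIntegral (lenB b)

-- CTS variant: also return a cache of the intermediate sum and
-- length, so the derivative need not refold the bag.
data AvgC = AvgC Int Int deriving (Eq, Show)

avgC :: Bag -> (Rational, AvgC)
avgC b = let s = sumB b
             n = lenB b
         in (fromIntegral s / fromIntegral n, AvgC s n)

-- Derivative for the change "insert one occurrence of a":
-- O(1) work against the cache, independent of bag size.
davgC :: Int -> AvgC -> (Rational, AvgC)
davgC a (AvgC s n) =
  let s' = s + a
      n' = n + 1
  in (fromIntegral s' / fromIntegral n', AvgC s' n')
```

The returned cache AvgC s' n' is exactly what avgC would produce on the updated bag, which is the cache-consistency property checked with QuickCheck in the text.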
We evaluate whether we can produce an updated result with davgC faster than by from-scratch recomputation with avg. We use the criterion library for benchmarking. The benchmarking code and our raw results are available in the supplementary material. We fix an initial bag of size n: it contains the integers from 1 to n. We define a sequence of consecutive changes of length r as the insertion of 1, the deletion of 1, the insertion of 2, the deletion of 2, and so on. We update the initial bag with these changes, one after another, to obtain a sequence of input bags.
We measure the time taken by from-scratch recomputation: we apply avg to each input bag in the sequence and fully force all results. We report the time from-scratch recomputation takes as this measured time divided by the number r of input bags. We measure the time taken by our CTS variant avgC similarly. We make sure to fully force all results and caches avgC produces on the sequence of input bags.
We measure the time taken by our CTS derivative davgC. We assume a fully forced initial result and cache. We apply davgC to each change in the sequence of bag changes, using the cache from the previous invocation. We then apply the result change to the previous result. We fully force the obtained sequence of results. Finally, we measure the time taken to only produce the sequence of result changes, without applying them.
Because of laziness, choosing what exactly to measure is tricky. We ensure that reported times include the entire cost of computing the whole sequence of results. To measure this cost, we ensure any thunks within the sequence are forced. We need not force any partial results, such as output changes or caches. Doing so would distort the results, because forcing the whole cache can cost asymptotically more than running derivatives. However, not using the cache at all underestimates the cost of incremental computation. Hence, we measure a sequence of consecutive updates, each using the cache produced by the previous one.
The plot in Fig. 17.12a shows execution time versus the size n of the initial input. To produce the initial result and cache, avgC takes longer than the original avg function takes to produce just the result. This is to be expected. Producing the result incrementally is much faster than from-scratch recomputation. This speedup increases with the size of the initial bag. For an initial bag size of 800, incrementally updating the result is 70 times faster than from-scratch recomputation.
The difference between only producing the output change versus producing the output change and updating the result is very small in this example. This is to be expected, because in this example the result is an integer, and updating and evaluating an integer is fast. We will see an example where there is a bigger difference between the two.
Figure 17.12: (a) Benchmark results for avg; (b) Benchmark results for totalPrice. [Plots omitted: execution time in milliseconds versus input size n, for the original computation, the caching (CTS) variant, the incremental update, and the change-only measurement.]
17.4.2 Nested loops over two sequences

We implemented incremental sequences and related primitives following Firsov and Jeltsch [2016]: our change operations and first-order operations (such as concat) reuse their implementation. On the other hand, we must extend higher-order operations such as map to handle non-nil function changes and caching. A correct and efficient CTS incremental dmapC function has to work differently depending on whether the given function change is nil or not: for a non-nil function change it has to go over the input sequence; for a nil function change it can avoid that.
Consider the running example again, this time in A'NF. The partial application of a lambda-lifted function constructs a closure. We made that explicit with a closure function.
cartesianProduct xs ys = let
  mapPairYs = closure mapPair ys
  xys = concatMap mapPairYs xs
  in xys

mapPair ys = λx → let
  pairX = closure pair x
  xys = map pairX ys
  in xys
While the only valid change for closed functions is their nil change, for closures we can have non-nil function changes. We represent closed functions and closures as variants of the same type. Correspondingly, we represent changes to a closed function and changes to a closure as variants of the same type of function changes. We inspect this representation at runtime to find out if a function change is a nil change.
data Fun a b c where
  Closed :: (a → (b, c)) → Fun a b c
  Closure :: (e → a → (b, c)) → e → Fun a b c

data ∆(Fun a b c) where
  DClosed :: (∆a → c → (∆b, c)) → ∆(Fun a b c)
  DClosure :: (∆e → ∆a → c → (∆b, c)) → ∆e → ∆(Fun a b c)
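The runtime nil-check can be sketched with a monomorphic, runnable model (our own simplification: one closure shape, an additive Int delta as the environment change with 0 as the nil change, and sequences as plain lists).

```haskell
-- A closure pairs code with its captured environment.
data Fun = Closure (Int -> Int -> Int) Int

-- A closure change carries a change to the environment.
newtype FunChange = DClosure Int

applyFun :: Fun -> Int -> Int
applyFun (Closure f env) = f env

-- Incremental map: given the base closure, its change, the
-- (unchanged) input sequence and the cached outputs, either reuse
-- the cache (nil change) or recompute with the updated environment.
dmap :: Fun -> FunChange -> [Int] -> [Int] -> [Int]
dmap (Closure f env) (DClosure de) xs cachedYs
  | de == 0   = cachedYs               -- nil change: skip the loop
  | otherwise = map (f (env + de)) xs  -- non-nil: traverse xs
```

When the function change is nil, dmap returns the cached outputs without touching xs, which is exactly why changes to the outer sequence are cheap in the Cartesian-product benchmark below.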
(a) Benchmark results for Cartesian product, changing inner sequence; (b) Benchmark results for Cartesian product, changing outer sequence. [Plots omitted: execution time in seconds versus input size n, for the original computation, the caching (CTS) variant, the incremental update, and the change-only measurement.]
We have also evaluated another representation of function changes, with different tradeoffs, where we use defunctionalization instead of closure conversion. We discuss the use of defunctionalization in Appendix D.
We use the same benchmark setup as in the benchmark for the average computation on bags. The input for size n is a pair of sequences (xs, ys). Each sequence initially contains the integers from 1 to n. Updating the result in reaction to a change dxs to xs takes less time than updating the result in reaction to a change dys to ys. The reason is that when dys is a nil change, we dynamically detect that the function change passed to dconcatMapC is the nil change and avoid looping over xs entirely.
We benchmark changes to the outer sequence xs and to the inner sequence ys separately. In both cases, the sequence of changes consists of insertions and deletions of different numbers at different positions. When we benchmark changes to the outer sequence xs, we take this sequence of changes for xs, and all changes to ys will be nil changes. When we benchmark changes to the inner sequence ys, we take this sequence of changes for ys, and all changes to xs will be nil changes.
In this example, preparing the cache takes much longer than from-scratch recomputation. The speedup of incremental computation over from-scratch recomputation increases with the size of the initial sequences. It reaches 25 for a change to the inner sequence and 88 for a change to the outer sequence when the initial sequences have size 800. Producing and fully forcing only the changes to the results is 5 times faster for a change to the inner sequence and 370 times faster for a change to the outer sequence.
While we do get speedups for both kinds of changes (to the inner and to the outer sequence), the speedups for changes to the outer sequence are bigger. Due to our benchmark methodology, we have to fully force updated results. This alone takes time quadratic in the size of the input sequences n. Therefore, we also report the time it takes to produce and fully force only the output changes, which is of a lower time complexity.
17.4.3 Indexed joins of two bags
As expected, we found that we can compose functions into larger and more complex programs and apply CTS differentiation to get a fast incremental program.
For this example imagine we have a bag of orders and a bag of line items An order is a pair of
an integer key and an exchange rate, represented as an integer. A line item is a pair of a key of a corresponding order and a price, represented as an integer. The price of an order is the sum of the prices of the corresponding line items, multiplied by the exchange rate. We want to find the total price, defined as the sum of the prices of all orders.
To do so efficiently, we build an index for line items and an index for orders. We represent indexes as maps. We build a map from order key to the sum of the prices of corresponding line items. Because orders are not necessarily unique by key, and because multiplication distributes over addition, we can build an index mapping each order key to the sum of all exchange rates for this order key. We then compute the total price by merging the two maps by key, multiplying corresponding sums of exchange rates and sums of prices. We compute the total price as the sum of those products.
Because our indexes are maps, we need a change structure for maps. We generalize the change structure for bags by generalizing from multiplicities to arbitrary values, as long as they have a group structure. A change to a map is either a replacement change (with a new map) or a list of atomic changes. An atomic change is a key k, a valid change to that key dk, a value a and a valid change to that value da. To update a map with an atomic change means to use the group structure of the type of a to subtract a at key k and to add a ⊕ da at key k ⊕ dk.
type ∆(Map k a) = Replace (Map k a) | Ch [AtomicChange k a]
data AtomicChange k a = AtomicChange k (∆k) a (∆a)
Bags have a group structure as long as the elements have a group structure. Maps have a group structure as long as the values have a group structure. This means, for example, that we can efficiently incrementalize programs on maps from keys to bags of elements. We implemented efficient caching and incremental primitives for maps and verified their correctness with QuickCheck.
To build the indexes, we use a groupBy function built from primitive functions foldMapGroup on bags and singleton for bags and maps respectively. While computing the indexes with groupBy is self-maintainable, merging them is not. We need to cache and incrementally update the intermediately created indexes to avoid recomputing them.
type Order = (Z, Z)
type LineItem = (Z, Z)
totalPrice :: Bag Order → Bag LineItem → Z
totalPrice orders lineItems = let
  orderIndex = groupBy fst orders
  orderSumIndex = Map.map (Bag.foldMapGroup snd) orderIndex
  lineItemIndex = groupBy fst lineItems
  lineItemSumIndex = Map.map (Bag.foldMapGroup snd) lineItemIndex
  merged = Map.merge orderSumIndex lineItemSumIndex
  total = Map.foldMapGroup multiply merged
  in total
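As a reference point, here is a self-contained, non-incremental version of this computation in plain Haskell. This is our own simplification: bags are reduced to lists, the index is built with Data.Map.fromListWith, and merging is done with intersectionWith in place of the thesis's Map.merge primitive.

```haskell
import qualified Data.Map.Strict as Map

type Order    = (Int, Int)  -- (order key, exchange rate)
type LineItem = (Int, Int)  -- (order key, price)

totalPrice :: [Order] -> [LineItem] -> Int
totalPrice orders lineItems =
  let index xs = Map.fromListWith (+) xs  -- key -> sum of values
      orderSumIndex    = index orders     -- sums of exchange rates
      lineItemSumIndex = index lineItems  -- sums of prices
      -- merge by key, multiplying sums of rates with sums of prices
      merged = Map.intersectionWith (*) orderSumIndex lineItemSumIndex
  in sum (Map.elems merged)
```

The grouping exploits distributivity: summing exchange rates per key before multiplying gives the same total as multiplying per order.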
This example is inspired by Koch et al. [2014]. Unlike them, we don't automate indexing, so to get good performance we start from a program that explicitly uses indexes.
We evaluate the performance in the same way we did for the average computation on bags. The initial input of size n is a pair of bags where both contain the pairs (i, i) for i between 1 and n. The sequence of changes alternates between insertion and deletion of pairs (j, j) for different j. We alternate the insertions and deletions between the orders bag and the line items bag.
Our CTS derivative of the original program produces updated results much faster than from-scratch recomputation, and again the speedup increases with the size of the initial bags. For an
initial size of the two bags of 800, incrementally updating the result is 180 times faster than from-scratch recomputation.
17.5 Limitations and future work

In this section we describe limitations to be addressed in future work.
17.5.1 Hiding the cache type

In our experiments, functions of the same type f1, f2 :: A → B can be transformed to CTS functions f1 :: A → (B, C1), f2 :: A → (B, C2) with different cache types C1, C2, since cache types depend on the implementation. We can fix this problem, at some runtime overhead, by using a single cache type Cache, defined as a tagged union of all cache types. If we defunctionalize function changes, we can index this cache type with tags representing functions, but other approaches are possible, and we omit details. We conjecture (but have not proven) that this fix gives a type-preserving translation, but leave this question for future work.
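A small sketch of the tagged-union fix, with hypothetical functions f1, f2 and hypothetical cache contents:

```haskell
-- One cache type, tagged per function, so f1 and f2 share the
-- exposed type Int -> (Int, Cache).
data Cache = CacheF1 Int | CacheF2 (Int, Int)
  deriving (Eq, Show)

f1 :: Int -> (Int, Cache)
f1 x = let y = x + 1 in (y, CacheF1 y)

f2 :: Int -> (Int, Cache)
f2 x = let y = x * 2
           z = y + x
       in (z, CacheF2 (y, z))

-- A derivative pattern-matches on the tag; receiving another
-- function's cache is a runtime error, the price of erasing the
-- per-function cache types C1, C2. Since f1 x = x + 1, an input
-- change dx yields an output change dx.
df1 :: Int -> Cache -> (Int, Cache)
df1 dx (CacheF1 y) = (dx, CacheF1 (y + dx))
df1 _  _           = error "df1: cache from a different function"
```

The runtime overhead mentioned in the text is this tag: each cache access now inspects the constructor, and ill-matched caches only fail dynamically rather than being ruled out by types.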
17.5.2 Nested bags

Our implementation of bags makes nested bags overly slow: we represent bags as tree-based maps from elements to multiplicity, so looking up a bag b in a bag of bags takes time proportional to the size of b. Possible solutions in this case include shredding, as done by Koch et al. [2016]. We have no such problem for nested sequences or other nested data, which can be addressed in O(1).
17.5.3 Proper tail calls

CTS transformation conflicts with proper tail calls, as it turns most tail calls into non-tail calls. In A'NF syntax, tail calls such as let y = f x in g y become let y = f x in let z = g y in z, and in CTS that becomes let (y, cy) = f x in let (z, cz) = g y in (z, (cy, cz)), where the call to g is genuinely not in tail position. This prevents recursion on deeply nested data structures like long lists, but such programs incrementalize inefficiently if deeply nested data is affected, so it is advisable to replace lists by other sequences anyway. It's unclear whether such fixes are available for other uses of tail calls.
17.5.4 Pervasive replacement values

Thanks to replacement changes, we can compute a change from any v1 to any v2 in constant time. Cai et al. [2014] use a difference operator ⊖ instead, but it's hard to implement ⊖ in constant time on values of non-constant size. So our formalization and implementation allow replacement values everywhere, to ensure all computations can be incrementalized in some sense. Supporting replacement changes introduces overhead even if they are not used, because it prevents writing self-maintainable CTS derivatives. Hence, to improve performance, one should consider dropping support for replacement values and restricting supported computations. Consider a call to a binary CTS derivative dfc da db c1, after computing (y1, c1) = fc a1 b1: if db is a replacement change with new value b2, then dfc must compute a result afresh by invoking fc a2 b2, and a2 = a1 ⊕ da requires remembering previous input a1 inside c1. By the same argument, c1 must also remember input b1. Worse, replacement values are only needed to handle cases where incremental computation reduces to recomputation, because the new input is completely different, or because the condition of an if
expression changed. Such changes to conditions are forbidden by other works [Koch et al. 2016], we believe for similar reasons.
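The argument above can be seen on a toy binary derivative for multiplication (our own example, not the thesis's generated code; for brevity the derivative returns the updated result rather than a result change):

```haskell
-- A change is either additive or a replacement.
data Change = Additive Int | Replacement Int

apply :: Int -> Change -> Int
apply x (Additive dx)    = x + dx
apply _ (Replacement x2) = x2

-- The cache must keep both base inputs: on a replacement change to
-- either argument the derivative falls back to recomputation, which
-- is only possible because a and b were remembered.
data MulC = MulC Int Int deriving (Eq, Show)

mulC :: Int -> Int -> (Int, MulC)
mulC a b = (a * b, MulC a b)

dmulC :: Change -> Change -> MulC -> (Int, MulC)
dmulC (Additive da) (Additive db) (MulC a b) =
  -- incremental path: (a+da)(b+db) = ab + a*db + da*b + da*db
  (a * b + a * db + da * b + da * db, MulC (a + da) (b + db))
dmulC da db (MulC a b) =
  -- replacement on either side: recompute afresh from the
  -- remembered base inputs
  mulC (apply a da) (apply b db)
```

Dropping the Replacement constructor would let the cache drop a and b entirely, making dmulC self-maintainable, which is the performance tradeoff discussed above.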
17.5.5 Recomputing updated values
In some cases, the same updated input might be recomputed more than once. If a derivative df needs some base input x (that is, if df is not self-maintainable), df's input cache will contain a copy of x, and df's output cache will contain its updated value x ⊕ dx. When all or most derivatives are self-maintainable, this is convenient, because in most cases updated inputs will not need to be computed. But if most derivatives are not self-maintainable, the same updated input might be computed multiple times: specifically, if derivative dh calls functions df and dg, and both df and dg need the same base input x, caches for both df and dg will contain the updated value x ⊕ dx, computed independently. Worse, because of pervasive replacement values (Sec. 17.5.4), derivatives in our case studies tend not to be self-maintainable.
In some cases, such repeated updates should be removable by a standard optimizer after inlining and common-subexpression elimination, but it is unclear how often this happens. To solve this problem, derivatives could take and return both old inputs x1 and updated ones x2 = x1 ⊕ dx, and x2 could be computed at the single location where dx is bound. In this case, to avoid updates for unused base inputs, we would have to rely more on absence analysis (Sec. 17.5.6); pruning function inputs appears easier than pruning caches. Otherwise, computations of updated inputs that are not used in a lazy context might cause space leaks, where thunks for x2 = x1 ⊕ dx1, x3 = x2 ⊕ dx2, and so on, might accumulate and grow without bounds.
17.5.6 Cache pruning via absence analysis
To reduce memory usage and runtime overhead, it should be possible to automatically remove from transformed programs any caches or cache fragments that are not used (directly or indirectly) to compute outputs. Liu [2000] performs this transformation on CTS programs by using absence analysis, which was later extended to higher-order languages by Sergey et al. [2014]. In lazy languages, absence analysis removes thunks that are not needed to compute the output. We conjecture that, as long as the analysis is extended to not treat caches as part of the output, it should be able to remove unused caches or inputs (assuming unused inputs exist, see Sec. 17.5.4).
17.5.7 Unary vs. n-ary abstraction
We only show our transformation correct for unary functions and tuples. But many languages provide efficient support for applying curried functions such as div :: Z → Z → Z, via either the push-enter or eval-apply evaluation model. For instance, invoking div m n should not require the overhead of a function invocation for each argument [Marlow and Jones 2006]. Naively transforming such a curried function to CTS would produce a function divC of type Z → ((Z → (Z, DivC2)), DivC1) with DivC1 = (), which adds excessive overhead.³ Based on preliminary experiments, we believe we can straightforwardly combine our approach with a push-enter or eval-apply evaluation strategy for transformed curried functions. Alternatively, we expect that translating Haskell to Strict Core [Bolingbroke and Peyton Jones 2009] would take care of turning Haskell into a language where function types describe their arity, which our transformation can easily take care of. We leave a more proper investigation for future work.
³In Sec. 17.2 and our evaluation, we use curried functions and never need to use this naive encoding, but only because we always invoke functions of known arity.
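The overhead is visible in a toy encoding (DivC1 and DivC2 are hypothetical cache types of our choosing):

```haskell
data DivC1 = DivC1 deriving (Eq, Show)  -- the useless () cache
data DivC2 = DivC2 Int Int deriving (Eq, Show)

-- Naive unary CTS: applying the first argument already returns a
-- (trivial) cache, paired with a function awaiting the second.
divNaiveC :: Int -> (Int -> (Int, DivC2), DivC1)
divNaiveC m = (\n -> (m `div` n, DivC2 m n), DivC1)

-- Arity-aware CTS: one known-arity invocation, one cache.
divC :: Int -> Int -> (Int, DivC2)
divC m n = (m `div` n, DivC2 m n)
```

Both compute the same result and final cache, but the naive version allocates an intermediate closure and a useless DivC1 at every partial application.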
17.6 Related work

Of all research on incremental computation in both programming languages and databases [Gupta and Mumick 1999; Ramalingam and Reps 1993], we discuss the most closely related works. Other related work, closer to cache-transfer style, is discussed in Chapter 19.
Previous work on cache-transfer style. Liu [2000]'s work has been the fundamental inspiration for this work, but her approach has no correctness proof and is restricted to a first-order untyped language (in part because no absence analysis for higher-order languages was available). Moreover, while the idea of cache-transfer style is similar, it's unclear if her approach to incrementalization would extend to higher-order programs.
Firsov and Jeltsch [2016] also approach incrementalization by code transformation, but their approach does not deal with changes to functions. Instead of transforming functions written in terms of primitives, they provide combinators to write CTS functions and derivatives together. On the other hand, they extend their approach to support mutable caches, while restricting to immutable ones, as we do, might lead to a logarithmic slowdown.
Finite differencing. Incremental computation on collections or databases by finite differencing has a long tradition [Paige and Koenig 1982; Blakeley et al. 1986]. The most recent and impressive line of work is the one on DBToaster [Koch 2010; Koch et al. 2014], which is a highly efficient approach to incrementalize queries over bags by combining iterated finite differencing with other program transformations. They show asymptotic speedups both in theory and through experimental evaluations. Changes are only allowed for datatypes that form groups (such as bags or certain maps), but not, for instance, for lists or sets. Similar ideas were recently extended to higher-order and nested computation [Koch et al. 2016], though still restricted to datatypes that can be turned into groups.
Logical relations. To study correctness of incremental programs we use a logical relation among initial values v1, updated values v2 and changes dv. To define a logical relation for an untyped λ-calculus, we use a step-indexed logical relation, following Appel and McAllester [2001] and Ahmed [2006]; in particular, our definitions are closest to the ones by Acar et al. [2008], who also work with an untyped language, big-step semantics and (a different form of) incremental computation. However, their logical relation does not mention changes explicitly, since they do not have first-class status in their system. Moreover, we use environments rather than substitution, and use a slightly different step-indexing for our semantics.
Dynamic incrementalization. The approaches to incremental computation with the widest applicability are in the family of self-adjusting computation [Acar 2005, 2009], including its descendant Adapton [Hammer et al. 2014]. These approaches incrementalize programs by combining memoization and change propagation: after creating a trace of base computations, updated inputs are compared with old ones in O(1) to find corresponding outputs, which are updated to account for input modifications. Compared to self-adjusting computation, Adapton only updates results when they are demanded. As usual, incrementalization is not efficient on arbitrary programs. To incrementalize efficiently, a program must be designed so that input changes produce small changes to the computation trace; refinement type systems have been designed to assist in this task [Çiçek et al. 2016; Hammer et al. 2016]. Instead of comparing inputs by pointer equality, Nominal Adapton [Hammer et al. 2015] introduces first-class labels to identify matching inputs, enabling reuse in more situations.
Recording traces often has significant overheads in both space and time (slowdowns of 20-30× are common), though Acar et al. [2010] alleviate that with datatype-specific support for tracing higher-level operations, while Chen et al. [2014] reduce that overhead by optimizing traces to not record redundant entries, and by logging chunks of operations at once, which reduces memory overhead but also potential speedups.
17.7 Chapter conclusion

We have presented a program transformation which turns a functional program into its derivative and efficiently shares redundant computations between them thanks to a statically computed cache. Although our first practical case studies show promising results, it remains now to integrate this transformation into a realistic compiler.
Chapter 18
Towards differentiation for System F
Differentiation is closely related to both logical relations and parametricity, as noticed by various authors in discussions of differentiation. Appendix C presents novel proofs of ILC correctness by adapting some logical relation techniques. As a demonstration, we define in this chapter differentiation for System F, by adapting results about parametricity. We stop short of a full proof that this generalization is correct, but we have implemented and tested it on a mature implementation of a System F typechecker; we believe the proof will mostly be a straightforward adaptation of existing ones about parametricity, but we leave verification for future work. A key open issue is discussed in Remark 18.2.1.
History and motivation. Various authors noticed that differentiation appears related to (binary) parametricity (including Atkey [2015]). In particular, it resembles a transformation presented by Bernardy and Lasson [2011] for arbitrary PTSs.¹ By analogy with unary parametricity, Yufei Cai sketched an extension of differentiation for arbitrary PTSs, but many questions remained open, and at the time, our correctness proof for ILC was significantly more involved and trickier to extend to System F, since it was defined in terms of denotational equivalence. Later, we reduced the proof core to defining a logical relation, proving its fundamental property and a few corollaries, as shown in Chapter 12. Extending this logical relation to System F proved comparably more straightforward.
Parametricity versus ILC. Both parametricity and ILC define logical relations across program executions on different inputs. When studying parametricity, differences are only allowed in the implementations of abstractions (through abstract types or other mechanisms). To be related, different implementations of the same abstraction must give results that are equivalent according to the calling program. Indeed, parametricity defines not just a logical relation but a logical equivalence, which can be shown to be equivalent to contextual equivalence (as explained for instance by Harper [2016, Ch. 48] or by Ahmed [2006]).
When studying ILC, logical equivalence between terms t1 and t2 (written (t1, t2) ∈ P⟦τ⟧) appears to be generalized by the existence of a valid change dt between t1 and t2 (that is, dt ▷ t1 ↪ t2 : τ). As in earlier chapters, if terms t1 and t2 are equivalent, any valid change dt between them is a nil change, but validity allows describing other changes.
¹Bernardy and Lasson were not the first to introduce such a transformation, but most earlier work focuses on System F, and our presentation follows theirs and uses their added generality. We refer for details to existing discussions of related work [Wadler 2007; Bernardy and Lasson 2011].
18.1 The parametricity transformation

First, we show a variant of their parametricity transformation, adapted to a variant of STLC without base types but with type variables. Presenting λ→ using type variables will help when we come back to System F, and allows discussing parametricity on STLC. This transformation is based on the presentation of STLC as calculus λ→, a Pure Type System (PTS) [Barendregt 1992, Sec. 5.2].
Background. In PTSs, terms and types form a single syntactic category, but are distinguished through an extended typing judgement (written Γ ⊢ t : t′) using additional terms called sorts. Function types σ → τ are generalized by Π-types Πx : s. t, where x can in general appear free in t: a generalization of Π-types in dependently-typed languages, but also of universal types ∀X. T in the System F family; if x does not appear free in t, we write s → t. Typing restricts whether terms of some sort s1 can abstract over terms of sort s2; different PTSs allow different combinations of sorts s1 and s2 (specified by relations R), but lots of metatheory for PTSs is parameterized over the choice of combinations. In STLC presented as a PTS, there is an additional sort: those terms τ of that sort are types.² We do not intend to give a full introduction to PTSs, only to introduce what's strictly needed for our presentation, and refer the readers to existing presentations [Barendregt 1992].
Instead of base types, λ→ uses uninterpreted type variables α, but does not allow terms to bind them. Nevertheless, we can write open terms, free in type variables for, say, naturals, and in term variables for operations on naturals. STLC restricts Π-types Πx : A. B to the usual arrow types A → B through R: one can show that in Πx : A. B, variable x cannot occur free in B.
Parametricity. Bernardy and Lasson show how to transform a typed term Γ ⊢ t : τ in a strongly normalizing PTS P into a proof that t satisfies a parametricity statement for τ. The proof is in a logic represented by PTS P². PTS P² is constructed uniformly from P, and is strongly normalizing whenever P is. When the base PTS P is λ→, Bernardy and Lasson's transformation produces terms in a PTS λ2→, produced by transforming λ→. PTS λ2→ extends λ→ with a separate sort ∗ᵖ of propositions, together with enough abstraction power to abstract propositions over values. Like λ→, λ2→ uses uninterpreted type variables α and does not allow abstracting over them.

In parametricity statements about λ→, we write (t₁, t₂) ∈ P⟦τ⟧ for a proposition (hence living in ∗ᵖ) that states that t₁ and t₂ are related. This proposition is defined, as usual, via a logical relation. We write dxx for a proof that x₁ and x₂ are related. For type variables α, transformed terms abstract over an arbitrary relation Rα between α₁ and α₂. When α is instantiated by τ, Rα can (but does not have to) be instantiated with relation (–, –) ∈ P⟦τ⟧; Rα abstracts over arbitrary candidate relations (similar to the notion of reducibility candidates [Girard et al., 1989, Ch. 14.1.1]). Allowing alternative instantiations makes parametricity statements more powerful, and it is also necessary to define parametricity for impredicative type systems (like System F) in a predicative ambient logic.³

A transformed term P⟦t⟧ relates two executions of term t in different environments: it can be read as a proof that term t maps related inputs to related outputs. The proof strategy that P⟦t⟧ uses is the standard one for proving fundamental properties of logical relations; but instead of doing a proof in the metalevel logic (by induction on terms and typing derivations), λ2→ serves as an object-level logic and P⟦–⟧ generates proof terms in this logic by recursion on terms. At the metalevel, Bernardy and Lasson [2011, Th. 3] prove that for any well-typed term Γ ⊢ t : τ, proof P⟦t⟧
²Calculus λ→ also has a sort □ such that ⊢ ∗ : □, but □ itself has no type. Such a sort appears only in PTS typing derivations, not in well-typed terms themselves, so we do not discuss it further.

³Specifically, if we require Rα to be instantiated with the logical relation itself for τ when α is instantiated with τ, the definition becomes circular. Consider the definition of (t₁, t₂) ∈ P⟦∀α. τ⟧: type variable α can be instantiated with the same type ∀α. τ, so the definition of (t₁, t₂) ∈ P⟦∀α. τ⟧ becomes impredicative.
shows that t satisfies the parametricity statement for type τ, given the hypotheses P⟦Γ⟧ (or, more formally, P⟦Γ⟧ ⊢ P⟦t⟧ : (S₁⟦t⟧, S₂⟦t⟧) ∈ P⟦τ⟧).
(f₁, f₂) ∈ P⟦σ → τ⟧ = Π(x₁ : S₁⟦σ⟧) (x₂ : S₂⟦σ⟧) (dxx : (x₁, x₂) ∈ P⟦σ⟧). (f₁ x₁, f₂ x₂) ∈ P⟦τ⟧
(x₁, x₂) ∈ P⟦α⟧ = Rα x₁ x₂
P⟦x⟧ = dxx
P⟦λ(x : σ) → t⟧ = λ(x₁ : S₁⟦σ⟧) (x₂ : S₂⟦σ⟧) (dxx : (x₁, x₂) ∈ P⟦σ⟧) → P⟦t⟧
P⟦s t⟧ = P⟦s⟧ S₁⟦s⟧ S₂⟦s⟧ P⟦t⟧
P⟦ε⟧ = ε
P⟦Γ, x : τ⟧ = P⟦Γ⟧, x₁ : S₁⟦τ⟧, x₂ : S₂⟦τ⟧, dxx : (x₁, x₂) ∈ P⟦τ⟧
P⟦Γ, α⟧ = P⟦Γ⟧, α₁, α₂, Rα : α₁ → α₂ → ∗ᵖ
In the above, S₁⟦s⟧ and S₂⟦s⟧ simply subscript all (term and type) variables in their arguments with 1 and 2, to make them refer to the first or second computation. To wit, the straightforward definition is:

S_i⟦x⟧ = x_i
S_i⟦λ(x : σ) → t⟧ = λ(x_i : S_i⟦σ⟧) → S_i⟦t⟧
S_i⟦s t⟧ = S_i⟦s⟧ S_i⟦t⟧
S_i⟦σ → τ⟧ = S_i⟦σ⟧ → S_i⟦τ⟧
S_i⟦α⟧ = α_i
It might be unclear how the proof P⟦t⟧ references the original term t. Indeed, it does not do so explicitly. But since in a PTS β-equivalent types have the same members, we can construct typing judgements that mention t. As mentioned, from Γ ⊢ t : τ it follows that P⟦Γ⟧ ⊢ P⟦t⟧ : (S₁⟦t⟧, S₂⟦t⟧) ∈ P⟦τ⟧ [Bernardy and Lasson, 2011, Th. 3]. This is best shown on a small example.

Example 18.1.1. Take for instance an identity function id = λ(x : α) → x, which is typed in an open context (that is, α ⊢ λ(x : α) → x : α → α). The transformation gives us

pid = P⟦id⟧ = λ(x₁ : α₁) (x₂ : α₂) (dxx : (x₁, x₂) ∈ P⟦α⟧) → dxx
which simply returns the proof that the inputs are related:

α₁, α₂, Rα : α₁ → α₂ → ∗ᵖ ⊢ pid : Π(x₁ : α₁) (x₂ : α₂). (x₁, x₂) ∈ P⟦α⟧ → (x₁, x₂) ∈ P⟦α⟧

This typing judgement does not mention id. But since x₁ =β S₁⟦id⟧ x₁ and x₂ =β S₂⟦id⟧ x₂, we can also show that id's outputs are related whenever the inputs are:

α₁, α₂, Rα : α₁ → α₂ → ∗ᵖ ⊢ pid : Π(x₁ : α₁) (x₂ : α₂). (x₁, x₂) ∈ P⟦α⟧ → (S₁⟦id⟧ x₁, S₂⟦id⟧ x₂) ∈ P⟦α⟧

More concisely, we can show that:

α₁, α₂, Rα : α₁ → α₂ → ∗ᵖ ⊢ pid : (S₁⟦id⟧, S₂⟦id⟧) ∈ P⟦α → α⟧
18.2 Differentiation and parametricity

We obtain a close variant of differentiation by altering the transformation for binary parametricity; the result is very close to the transformation investigated by Bernardy, Jansson and Paterson [2010]. Instead of only having proofs that values are related, we can modify (t₁, t₂) ∈ P⟦τ⟧ to be a type of values, more precisely a dependent type (t₁, t₂) ∈ ∆₂⟦τ⟧ of valid changes, indexed by source t₁ : S₁⟦τ⟧ and destination t₂ : S₂⟦τ⟧. Similarly, Rα is replaced by a dependent type of changes, not propositions, that we write ∆α. Formally, this is encoded by replacing sort ∗ᵖ with ∗ in Bernardy and Lasson [2011]'s transform, and by moving target programs to a PTS that allows for both simple and dependent types, like the Edinburgh Logical Framework; such a PTS is known as λP [Barendregt, 1992].

For type variables α, we specialize the transformation further, ensuring that α₁ = α₂ = α (and adapting S₁⟦–⟧ and S₂⟦–⟧ accordingly, to not add subscripts to type variables). Without this specialization, we would have to deal with changes across different types, which we don't do just yet but defer to Sec. 18.4.
We first show how the transformation affects typing contexts:

D⟦ε⟧ = ε
D⟦Γ, x : σ⟧ = D⟦Γ⟧, x₁ : σ, x₂ : σ, dx : (x₁, x₂) ∈ ∆₂⟦σ⟧
D⟦Γ, α⟧ = D⟦Γ⟧, α, ∆α : α → α → ∗
Unlike standard differentiation, transformed programs bind, for each input variable x : σ, a source x₁ : σ, a destination x₂ : σ, and a valid change dx from x₁ to x₂. In more detail: for each variable x : σ in input programs, programs produced by standard differentiation bind both x : σ and dx : ∆σ; valid use of these programs requires dx to be a valid change with source x, but this is not enforced through types. Instead, programs produced by this variant of differentiation bind, for each variable x : σ in input programs, a source x₁ : σ, a destination x₂ : σ, and a valid change dx from x₁ to x₂, where validity is enforced through dependent types.

For input type variables α, output programs also bind a dependent type of valid changes ∆α.

Next, we define the type of changes. Since change types are indexed by source and destination, the type of function changes forces them to be valid, and the definition of function changes resembles our earlier logical relation for validity. More precisely, if df is a change from f₁ to f₂ (df : (f₁, f₂) ∈ ∆₂⟦σ → τ⟧), then df must map any change dx from initial input x₁ to updated input x₂ to a change from initial output f₁ x₁ to updated output f₂ x₂:
(f₁, f₂) ∈ ∆₂⟦σ → τ⟧ = Π(x₁ x₂ : σ) (dx : (x₁, x₂) ∈ ∆₂⟦σ⟧). (f₁ x₁, f₂ x₂) ∈ ∆₂⟦τ⟧
(x₁, x₂) ∈ ∆₂⟦α⟧ = ∆α x₁ x₂
At this point, the transformation on terms acts on abstraction and application to match the transformation on typing contexts. The definition of D⟦λ(x : σ) → t⟧ follows the definition of D⟦Γ, x : σ⟧: both transformation results bind the same variables x₁, x₂, dx, with the same types. Application provides corresponding arguments:

D⟦x⟧ = dx
D⟦λ(x : σ) → t⟧ = λ(x₁ x₂ : σ) (dx : (x₁, x₂) ∈ ∆₂⟦σ⟧) → D⟦t⟧
D⟦s t⟧ = D⟦s⟧ S₁⟦s⟧ S₂⟦s⟧ D⟦t⟧
If we extend the language to support primitives and their manually-written derivatives, it is useful to have contexts also bind, next to type variables α, change structures for α, to allow terms to use change operations. Since the differentiation output does not use change operations, we omit change structures for now.
Remark 18.2.1 (Validity and ⊕). Because we omit change structures (here and through most of the chapter), the type of differentiation only suggests that D⟦t⟧ is a valid change, but not that validity agrees with ⊕.
In other words, dependent types as used here do not prove all theorems expected of an incremental transformation. While we can prove the fundamental property of our logical relation, we cannot prove that ⊕ agrees with differentiation for abstract types. As we did in Sec. 13.4 (in particular Plugin Requirement 13.4.1), we must require that validity relations provided for type variables agree with implementations of ⊕ as provided for type variables: formally, we must state (internally) that ∀(x₁ x₂ : α) (dx : ∆α x₁ x₂). (x₁ ⊕ dx, x₂) ∈ P⟦α⟧. We can state this requirement by internalizing logical equivalence (such as shown in Sec. 18.1, or using Bernardy et al. [2010]'s variant in λP). But since this statement quantifies over members of α, it is not clear if it can be proven internally when instantiating α with a type argument. We leave this (admittedly crucial) question for future work.

This transformation is not incremental, as it recomputes both source and destination for each application; but we could fix this by replacing S₂⟦s⟧ with S₁⟦s⟧ ⊕ D⟦s⟧ (and passing change structures, to make ⊕ available to terms), or by not passing destinations. We ignore such complications here.
18.3 Proving differentiation correct externally

Instead of producing dependently-typed programs that show their own correctness, we might want to produce simply-typed programs (which are easier to compile efficiently) and use an external correctness proof, as in Chapter 12. We show how to generate such external proofs here. For simplicity, here we produce external correctness proofs for the transformation we just defined on dependently-typed outputs, rather than defining a separate simply-typed transformation.

Specifically, we show how to generate, from well-typed terms t, a proof that D⟦t⟧ is correct, that is, a proof that (S₁⟦t⟧, S₂⟦t⟧, D⟦t⟧) ∈ ∆V⟦τ⟧.

Proofs are generated through the following transformation, defined in terms of the transformation from Sec. 18.2. Again, we show first typing contexts and the logical relation:
DP⟦ε⟧ = ε
DP⟦Γ, x : τ⟧ = DP⟦Γ⟧, x₁ : τ, x₂ : τ, dx : (x₁, x₂) ∈ ∆₂⟦τ⟧, dxx : (x₁, x₂, dx) ∈ ∆V⟦τ⟧
DP⟦Γ, α⟧ = DP⟦Γ⟧, α, ∆α : α → α → ∗, Rα : Π(x₁ : α) (x₂ : α) (dx : (x₁, x₂) ∈ ∆₂⟦α⟧) → ∗ᵖ

(f₁, f₂, df) ∈ ∆V⟦σ → τ⟧ = Π(x₁ x₂ : σ) (dx : (x₁, x₂) ∈ ∆₂⟦σ⟧) (dxx : (x₁, x₂, dx) ∈ ∆V⟦σ⟧). (f₁ x₁, f₂ x₂, df x₁ x₂ dx) ∈ ∆V⟦τ⟧
(x₁, x₂, dx) ∈ ∆V⟦α⟧ = Rα x₁ x₂ dx
The transformation from terms to proofs then matches the definition of typing contexts:

DP⟦x⟧ = dxx
DP⟦λ(x : σ) → t⟧ = λ(x₁ x₂ : σ) (dx : (x₁, x₂) ∈ ∆₂⟦σ⟧) (dxx : (x₁, x₂, dx) ∈ ∆V⟦σ⟧) → DP⟦t⟧
DP⟦s t⟧ = DP⟦s⟧ S₁⟦s⟧ S₂⟦s⟧ D⟦t⟧ DP⟦t⟧
This transformation produces a proof object in PTS λ2P, which is obtained by augmenting λP following Bernardy and Lasson [2011]. The informal proof content of generated proofs follows the proof
of D⟦–⟧'s correctness (Theorem 12.2.2). For a variable x, we just use the assumption dxx : (x₁, x₂, dx) ∈ ∆V⟦τ⟧ that we have in context. For abstractions λx → t, we have to show that D⟦t⟧ is correct for all valid input changes x₁, x₂, dx and for all proofs dxx : (x₁, x₂, dx) ∈ ∆V⟦σ⟧ that dx is indeed a valid input change; so we bind all those variables in context, including proof dxx, and use DP⟦t⟧ recursively to prove that D⟦t⟧ is correct in the extended context. For applications s t, we use the proof that D⟦s⟧ is correct (obtained recursively via DP⟦s⟧): to show that D⟦s t⟧, that is D⟦s⟧ S₁⟦s⟧ S₂⟦s⟧ D⟦t⟧, is correct, we simply show that D⟦s⟧ is being applied to valid inputs, using the proof that D⟦t⟧ is correct (obtained recursively via DP⟦t⟧).
18.4 Combinators for polymorphic change structures

Earlier, we restricted our transformation on λ→ terms so that there could be a change dt from t₁ to t₂ only if t₁ and t₂ have the same type. In this section, we lift this restriction and define polymorphic change structures (also called simply change structures when no ambiguity arises). To do so, we sketch how to extend core change-structure operations to polymorphic change structures.

We also show a few combinators, combining existing polymorphic change structures into new ones. We believe the combinator types are more enlightening than their implementations, but we include the implementations here for completeness. We already described a few constructions on non-polymorphic change structures; however, polymorphic change structures enable new constructions that insert or remove data, or apply isomorphisms to the source or destination type.

We conjecture that combinators for polymorphic change structures can be used to compose, out of small building blocks, change structures that, for instance, allow inserting or removing elements from a recursive datatype such as lists. Derivatives for primitives could then be produced using equational reasoning, as described in Sec. 11.2. However, we leave investigation of such avenues to future work.
We describe change operations for polymorphic change structures via a Haskell record containing those operations, and we define combinators on polymorphic change structures. Of all change operations, we only consider ⊕; to avoid confusion, we write ⊙ for the polymorphic variant of ⊕.
data CS τ1 δτ τ2 = CS { (⊙) :: τ1 → δτ → τ2 }

This code defines a record type constructor CS, with a data constructor also written CS, and a field accessor written (⊙); we use τ1, δτ and τ2 (and later also σ1, δσ and σ2) for Haskell type variables. To follow Haskell lexical rules, we use lowercase letters here (even though Greek ones).
We have not formalized definitions of validity, or proofs that it agrees with ⊙; but for all the change structures and combinators in this section, this exercise appears to be no harder than the ones in Chapter 13.

In Sec. 11.1 and 17.2, change structures are embedded in Haskell using a type class ChangeStruct τ. Conversely, here we do not define a type class of polymorphic change structures, because (apart from the simplest scenarios) Haskell type class resolution is unable to choose a canonical way to construct a polymorphic change structure using our combinators.

All existing change structures (that is, instances of ChangeStruct τ) induce generalized change structures CS τ (∆τ) τ:
typeCS :: ChangeStruct τ ⇒ CS τ (∆τ) τ
typeCS = CS (⊕)
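As a concrete sketch of these definitions (in plain ASCII Haskell; `applyC` stands in for the record's infix accessor, and `intCS` is a hypothetical change structure on Int whose changes are integer differences, standing in for the ChangeStruct instance of earlier chapters):

```haskell
-- Polymorphic change structure: applying a change of type dt to a
-- source of type t1 yields a destination of type t2.
newtype CS t1 dt t2 = CS { applyC :: t1 -> dt -> t2 }

-- Hypothetical analogue of typeCS at type Int: changes are integer
-- differences, and change application is addition.
intCS :: CS Int Int Int
intCS = CS (+)
```

For instance, applyC intCS 10 3 evaluates to 13.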
We can also have change structures across different types. Replacement changes are possible:
replCS :: CS τ1 τ2 τ2
replCS = CS $ λx1 x2 → x2
But replacement changes are not the only option. For product types, or for any form of nested data, we can apply changes to the different components, changing the type of some components. We can also define a change structure for nullary products (the unit type), which can be useful as a building block in other change structures:
prodCS :: CS σ1 δσ σ2 → CS τ1 δτ τ2 → CS (σ1, τ1) (δσ, δτ) (σ2, τ2)
prodCS scs tcs = CS $ λ(s1, t1) (ds, dt) → ((⊙) scs s1 ds, (⊙) tcs t1 dt)
unitCS :: CS () () ()
unitCS = CS $ λ() () → ()
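For instance (a runnable ASCII sketch, with applyC in place of the infix accessor and a hypothetical Int change structure whose changes are differences), product changes apply componentwise:

```haskell
newtype CS t1 dt t2 = CS { applyC :: t1 -> dt -> t2 }

-- Hypothetical change structure on Int: changes are differences.
intCS :: CS Int Int Int
intCS = CS (+)

-- prodCS as in the text: a pair change updates both components.
prodCS :: CS s1 ds s2 -> CS t1 dt t2 -> CS (s1, t1) (ds, dt) (s2, t2)
prodCS scs tcs =
  CS $ \(s1, t1) (ds, dt) -> (applyC scs s1 ds, applyC tcs t1 dt)
```

so that applyC (prodCS intCS intCS) (1, 2) (10, 20) yields (11, 22).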
The ability to modify a field to one of a different type is also known in the Haskell community as polymorphic record update, a feature that has proven desirable in the context of lens libraries [O'Connor, 2012; Kmett, 2012].
We can also define a combinator sumCS for change structures on sum types, similarly to our earlier construction described in Sec. 11.6. This time, we choose to forbid changes across branches, since they're inefficient, though we could support them as well if desired:
sumCS :: CS s1 ds s2 → CS t1 dt t2 →
         CS (Either s1 t1) (Either ds dt) (Either s2 t2)
sumCS scs tcs = CS go
  where
    go (Left s1) (Left ds) = Left $ (⊙) scs s1 ds
    go (Right t1) (Right dt) = Right $ (⊙) tcs t1 dt
    go _ _ = error "Invalid changes"
Given two change structures from τ1 to τ2, with respective change types δτa and δτb, we can also define a new change structure with change type Either δτa δτb, which allows using changes from either structure. We capture this construction through combinator mSumCS, having the following signature:
mSumCS :: CS τ1 δτa τ2 → CS τ1 δτb τ2 → CS τ1 (Either δτa δτb) τ2
mSumCS acs bcs = CS go
  where
    go t1 (Left dt) = (⊙) acs t1 dt
    go t1 (Right dt) = (⊙) bcs t1 dt
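For example (again an ASCII sketch with a hypothetical intCS whose changes are differences), mSumCS lets a single change structure accept either a difference or a replacement value:

```haskell
newtype CS t1 dt t2 = CS { applyC :: t1 -> dt -> t2 }

-- Hypothetical change structure on Int: changes are differences.
intCS :: CS Int Int Int
intCS = CS (+)

-- Replacement changes, as in the text.
replCS :: CS t1 t2 t2
replCS = CS $ \_ x2 -> x2

-- mSumCS as in the text: a change is drawn from either structure.
mSumCS :: CS t1 dta t2 -> CS t1 dtb t2 -> CS t1 (Either dta dtb) t2
mSumCS acs bcs = CS go
  where
    go t1 (Left da)  = applyC acs t1 da
    go t1 (Right db) = applyC bcs t1 db
```

Here applyC (mSumCS intCS replCS) 10 (Left 5) gives 15, while the change Right 0 replaces the source outright, giving 0.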
This construction is also possible for non-polymorphic change structures; we only need change structures to be first-class (instead of a type class) to be able to internalize this construction in Haskell.
Using combinator lInsCS, we can describe updates going from type τ1 to type (σ, τ2), assuming a change structure from τ1 to τ2: that is, we can prepend a value of type σ to our data while we modify it. Similarly, combinator lRemCS allows removing values:
lInsCS :: CS t1 dt t2 → CS t1 (s, dt) (s, t2)
lInsCS tcs = CS $ λt1 (s, dt) → (s, (⊙) tcs t1 dt)
lRemCS :: CS t1 dt (s, t2) → CS t1 dt t2
lRemCS tcs = CS $ λt1 dt → snd $ (⊙) tcs t1 dt
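Concretely (same ASCII conventions and hypothetical intCS as above), lInsCS pairs the updated value with the prepended one, and lRemCS projects it away:

```haskell
newtype CS t1 dt t2 = CS { applyC :: t1 -> dt -> t2 }

-- Hypothetical change structure on Int: changes are differences.
intCS :: CS Int Int Int
intCS = CS (+)

-- lInsCS prepends a component of type s while modifying the data.
lInsCS :: CS t1 dt t2 -> CS t1 (s, dt) (s, t2)
lInsCS tcs = CS $ \t1 (s, dt) -> (s, applyC tcs t1 dt)

-- lRemCS drops a component from the destination.
lRemCS :: CS t1 dt (s, t2) -> CS t1 dt t2
lRemCS tcs = CS $ \t1 dt -> snd (applyC tcs t1 dt)
```

For instance, applyC (lInsCS intCS) 10 ("tag", 5) yields ("tag", 15), and composing lRemCS (lInsCS intCS) recovers the bare 15.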
We can also transform change structures given suitable conversion functions
lIsoCS :: (t1 → s1) → CS s1 dt t2 → CS t1 dt t2
mIsoCS :: (dt → ds) → CS t1 ds t2 → CS t1 dt t2
rIsoCS :: (s2 → t2) → CS t1 dt s2 → CS t1 dt t2
isoCS :: (t1 → s1) → (dt → ds) → (s2 → t2) → CS s1 ds s2 → CS t1 dt t2
To do so, we must only compose with the given conversion functions according to the types. Combinator isoCS arises by simply combining the other ones:
lIsoCS f tcs = CS $ λt1 dt → (⊙) tcs (f t1) dt
mIsoCS g tcs = CS $ λt1 dt → (⊙) tcs t1 (g dt)
rIsoCS h tcs = CS $ λt1 dt → h $ (⊙) tcs t1 dt
isoCS f g h scs = lIsoCS f $ mIsoCS g $ rIsoCS h scs
With a bit of datatype-generic programming infrastructure, we can reobtain, only using combinators, the change structure for lists shown in Sec. 11.3.3, which allows modifying list elements. The core definition is the following one:
listMuChangeCS :: CS a1 da a2 → CS (Listμ a1) (Listμ da) (Listμ a2)
listMuChangeCS acs = go
  where
    go = isoCS unRollL unRollL rollL $ sumCS unitCS $ prodCS acs go
The needed infrastructure appears in Fig. 18.1.
-- Our list type:
data List a = Nil | Cons a (List a)

-- Equivalently, we can represent lists as a fixpoint of a sum-of-products pattern functor:
data Mu f = Roll { unRoll :: f (Mu f) }
data ListF a x = L { unL :: Either () (a, x) }
type Listμ a = Mu (ListF a)
rollL :: Either () (a, Listμ a) → Listμ a
rollL = Roll . L
unRollL :: Listμ a → Either () (a, Listμ a)
unRollL = unL . unRoll

-- Isomorphism between List and Listμ:
lToLMu :: List a → Listμ a
lToLMu Nil = rollL $ Left ()
lToLMu (Cons a as) = rollL $ Right (a, lToLMu as)
lMuToL :: Listμ a → List a
lMuToL = go . unRollL
  where
    go (Left ()) = Nil
    go (Right (a, as)) = Cons a (lMuToL as)

-- A change structure for List:
listChangesCS :: CS a1 da a2 → CS (List a1) (List da) (List a2)
listChangesCS acs = isoCS lToLMu lToLMu lMuToL $ listMuChangeCS acs

Figure 18.1: A change structure that allows modifying list elements.
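The construction can be exercised end to end with the following condensed, runnable sketch (ASCII Haskell; built-in lists replace the List type of the figure, applyC replaces the infix accessor, isoCS is inlined rather than derived from lIsoCS, mIsoCS and rIsoCS, and intCS is a hypothetical change structure on Int whose changes are differences):

```haskell
newtype CS t1 dt t2 = CS { applyC :: t1 -> dt -> t2 }

-- Hypothetical change structure on Int: changes are differences.
intCS :: CS Int Int Int
intCS = CS (+)

unitCS :: CS () () ()
unitCS = CS $ \() () -> ()

prodCS :: CS s1 ds s2 -> CS t1 dt t2 -> CS (s1, t1) (ds, dt) (s2, t2)
prodCS scs tcs =
  CS $ \(s1, t1) (ds, dt) -> (applyC scs s1 ds, applyC tcs t1 dt)

sumCS :: CS s1 ds s2 -> CS t1 dt t2
      -> CS (Either s1 t1) (Either ds dt) (Either s2 t2)
sumCS scs tcs = CS go
  where
    go (Left s1)  (Left ds)  = Left  (applyC scs s1 ds)
    go (Right t1) (Right dt) = Right (applyC tcs t1 dt)
    go _ _ = error "Invalid changes"

-- isoCS, inlined: convert source, change, and destination.
isoCS :: (t1 -> s1) -> (dt -> ds) -> (s2 -> t2) -> CS s1 ds s2 -> CS t1 dt t2
isoCS f g h scs = CS $ \t1 dt -> h (applyC scs (f t1) (g dt))

-- Lists as a fixpoint of a sum-of-products pattern functor.
newtype Mu f = Roll { unRoll :: f (Mu f) }
newtype ListF a x = L { unL :: Either () (a, x) }
type ListMu a = Mu (ListF a)

rollL :: Either () (a, ListMu a) -> ListMu a
rollL = Roll . L
unRollL :: ListMu a -> Either () (a, ListMu a)
unRollL = unL . unRoll

lToLMu :: [a] -> ListMu a
lToLMu []       = rollL (Left ())
lToLMu (a : as) = rollL (Right (a, lToLMu as))
lMuToL :: ListMu a -> [a]
lMuToL = go . unRollL
  where
    go (Left ())       = []
    go (Right (a, as)) = a : lMuToL as

-- Element-wise list changes, built only from combinators.
listMuChangeCS :: CS a1 da a2 -> CS (ListMu a1) (ListMu da) (ListMu a2)
listMuChangeCS acs = go
  where
    go = isoCS unRollL unRollL rollL $ sumCS unitCS $ prodCS acs go

listChangesCS :: CS a1 da a2 -> CS [a1] [da] [a2]
listChangesCS acs = isoCS lToLMu lToLMu lMuToL (listMuChangeCS acs)
```

Then applyC (listChangesCS intCS) [1, 2, 3] [10, 20, 30] evaluates to [11, 22, 33], updating each element in place.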
Section summary. We have defined polymorphic change structures and shown that they admit a rich combinator language. Now we return to using these change structures for differentiating λ→ and System F.
18.5 Differentiation for System F

After introducing changes across different types, we can also generalize differentiation for λ→, so that it allows for the now generalized changes:
(f₁, f₂) ∈ ∆₂⟦σ → τ⟧ = Π(x₁ : S₁⟦σ⟧) (x₂ : S₂⟦σ⟧) (dx : (x₁, x₂) ∈ ∆₂⟦σ⟧). (f₁ x₁, f₂ x₂) ∈ ∆₂⟦τ⟧
(x₁, x₂) ∈ ∆₂⟦α⟧ = ∆α x₁ x₂
D⟦x⟧ = dx
D⟦λ(x : σ) → t⟧ = λ(x₁ : S₁⟦σ⟧) (x₂ : S₂⟦σ⟧) (dx : (x₁, x₂) ∈ ∆₂⟦σ⟧) → D⟦t⟧
D⟦s t⟧ = D⟦s⟧ S₁⟦s⟧ S₂⟦s⟧ D⟦t⟧
D⟦ε⟧ = ε
D⟦Γ, x : τ⟧ = D⟦Γ⟧, x₁ : S₁⟦τ⟧, x₂ : S₂⟦τ⟧, dx : (x₁, x₂) ∈ ∆₂⟦τ⟧
D⟦Γ, α⟧ = D⟦Γ⟧, α₁, α₂, ∆α : α₁ → α₂ → ∗
By adding a few additional rules, we can extend differentiation to System F (the PTS λ2). We choose to present the additional rules using System F syntax, rather than PTS syntax:
(f₁, f₂) ∈ ∆₂⟦∀α. T⟧ = Π(α₁ : ∗) (α₂ : ∗) (∆α : α₁ → α₂ → ∗). (f₁ [α₁], f₂ [α₂]) ∈ ∆₂⟦T⟧
D⟦Λα. t⟧ = λ(α₁ α₂ : ∗) (∆α : α₁ → α₂ → ∗) → D⟦t⟧
D⟦t [τ]⟧ = D⟦t⟧ S₁⟦τ⟧ S₂⟦τ⟧ ((–, –) ∈ ∆₂⟦τ⟧)

Produced terms use a combination of System F and dependent types, which is known as λP2 [Barendregt, 1992] and is strongly normalizing. This PTS corresponds to second-order (intuitionistic) predicate logic and is part of Barendregt's lambda cube. λP2 does not admit types depending on types (that is, type-level functions), but admits all other forms of abstraction in the lambda cube (terms on terms like λ→, terms on types like System F, and types on terms like LF (λP)).
Finally, we sketch a transformation producing proofs that differentiation is correct for System F:
DP⟦ε⟧ = ε
DP⟦Γ, x : τ⟧ = DP⟦Γ⟧, x₁ : S₁⟦τ⟧, x₂ : S₂⟦τ⟧, dx : (x₁, x₂) ∈ ∆₂⟦τ⟧, dxx : (x₁, x₂, dx) ∈ ∆V⟦τ⟧
DP⟦Γ, α⟧ = DP⟦Γ⟧, α₁, α₂, ∆α : α₁ → α₂ → ∗, Rα : Π(x₁ : α₁) (x₂ : α₂) (dx : (x₁, x₂) ∈ ∆₂⟦α⟧) → ∗ᵖ

(f₁, f₂, df) ∈ ∆V⟦∀α. T⟧ = Π(α₁ : ∗) (α₂ : ∗) (∆α : α₁ → α₂ → ∗) (Rα : Π(x₁ : α₁) (x₂ : α₂) (dx : (x₁, x₂) ∈ ∆₂⟦α⟧) → ∗ᵖ). (f₁ [α₁], f₂ [α₂], df [α₁] [α₂] [∆α]) ∈ ∆V⟦T⟧
(f₁, f₂, df) ∈ ∆V⟦σ → τ⟧ = Π(x₁ : S₁⟦σ⟧) (x₂ : S₂⟦σ⟧) (dx : (x₁, x₂) ∈ ∆₂⟦σ⟧) (dxx : (x₁, x₂, dx) ∈ ∆V⟦σ⟧). (f₁ x₁, f₂ x₂, df x₁ x₂ dx) ∈ ∆V⟦τ⟧
(x₁, x₂, dx) ∈ ∆V⟦α⟧ = Rα x₁ x₂ dx
DP⟦x⟧ = dxx
DP⟦λ(x : σ) → t⟧ = λ(x₁ : S₁⟦σ⟧) (x₂ : S₂⟦σ⟧) (dx : (x₁, x₂) ∈ ∆₂⟦σ⟧) (dxx : (x₁, x₂, dx) ∈ ∆V⟦σ⟧) → DP⟦t⟧
DP⟦s t⟧ = DP⟦s⟧ S₁⟦s⟧ S₂⟦s⟧ D⟦t⟧ DP⟦t⟧
DP⟦Λα. t⟧ = λ(α₁ α₂ : ∗) (∆α : α₁ → α₂ → ∗) (Rα : Π(x₁ : α₁) (x₂ : α₂) (dx : (x₁, x₂) ∈ ∆₂⟦α⟧) → ∗ᵖ) → DP⟦t⟧
DP⟦t [τ]⟧ = DP⟦t⟧ S₁⟦τ⟧ S₂⟦τ⟧ ((–, –) ∈ ∆₂⟦τ⟧) ((–, –, –) ∈ ∆V⟦τ⟧)

Produced terms live in λ2P2, the logic produced by extending λP2 following Bernardy and Lasson. A variant producing proofs for a non-dependently-typed differentiation (as suggested earlier) would produce proofs in λ22, the logic produced by extending λ2 following Bernardy and Lasson.
18.6 Related work

Dependently-typed differentiation for System F, as given, coincides with the parametricity transformation for System F given by Bernardy, Jansson and Paterson [2010, Sec. 3.1]. But our application is fundamentally different: for known types, Bernardy et al. only consider identity relations, while we can consider non-identity relations, as long as we assume that ⊕ is available for all types. What is more, Bernardy et al. do not consider changes, the update operator ⊕, or change structures across different types: changes are replaced by proofs of relatedness, with no computational significance. Finally, non-dependently-typed differentiation (and its correctness proof) is novel here, as it makes limited sense in the context of parametricity, even though it is a small variant of Bernardy et al.'s parametricity transform.
18.7 Prototype implementation

We have written a prototype implementation of the above rules for a PTS presentation of System F, and verified it on a few representative examples of System F terms. We have built our implementation on top of an existing typechecker for PTSs.⁴

Since our implementation is defined in terms of a generic PTS, some of the rules presented above are unified and generalized in our implementation, suggesting that the transformation might generalize to arbitrary PTSs. In the absence of further evidence on the correctness of this generalization, we leave a detailed investigation as future work.
18.8 Chapter conclusion

In this chapter, we have sketched how to define differentiation and prove it correct, following Bernardy and Lasson [2011]'s and Bernardy et al. [2010]'s work on parametricity by code transformation. We give no formal correctness proof, but proofs appear to be mostly an extension of their methods. An important open point is Remark 18.2.1. We have implemented and tested differentiation in an existing, mature PTS implementation, and verified that it is type-correct on a few typical terms.
We leave further investigation as future work.

⁴https://github.com/Toxaris/pts, by Tillmann Rendel, based on van Benthem Jutting et al. [1994]'s algorithm for PTS typechecking.
Chapter 19
Related work
Existing work on incremental computation can be divided into two groups: static incrementalization and dynamic incrementalization. Static approaches analyze a program statically and generate an incremental version of it. Dynamic approaches create dynamic dependency graphs while the program runs and propagate changes along these graphs.

The trade-off between the two is that static approaches have the potential to be faster, because no dependency tracking at runtime is needed, whereas dynamic approaches can support more expressive programming languages. ILC is a static approach, but compared to the other static approaches it supports more expressive languages.

In the remainder of this section, we analyze the relation to the most closely related prior work. Ramalingam and Reps [1993], Gupta and Mumick [1999] and Acar et al. [2006] discuss further related work. Other work, more closely related to cache-transfer style, is discussed in Sec. 17.6.
19.1 Dynamic approaches

One of the most advanced dynamic approaches to incrementalization is self-adjusting computation, which has been applied to Standard ML and large subsets of C [Acar, 2009; Hammer et al., 2011]. In this approach, programs execute on the original input in an enhanced runtime environment that tracks the dependencies between values in a dynamic dependence graph [Acar et al., 2006]; intermediate results are memoized. Later, changes to the input propagate through dependency graphs from changed inputs to results, updating both intermediate and final results; this processing is often more efficient than recomputation.

However, creating dynamic dependence graphs imposes a large constant-factor overhead during runtime, ranging from 2 to 30 in reported experiments [Acar et al., 2009, 2010], and affecting the initial run of the program on its base input. Acar et al. [2010] show how to support high-level data types in the context of self-adjusting computation; however, the approach still requires expensive runtime bookkeeping during the initial run. Our approach, like other static ones, uses a standard runtime environment and has no overhead during base computation, but may be less efficient when processing changes. This pays off if the initial input is big compared to its changes.

Chen et al. [2011] have developed a static transformation for purely functional programs, but this transformation just provides a superior interface to use the runtime support with less boilerplate, and does not reduce this performance overhead. Hence, it is still a dynamic approach, unlike the transformation this work presents.

Another property of self-adjusting computation is that incrementalization is only efficient if the program has a suitable computation structure. For instance, a program folding the elements of a
bag with a left or right fold will not have efficient incremental behavior; instead, it's necessary that the fold be shaped like a balanced tree. In general, incremental computations become efficient only if they are stable [Acar, 2005]. Hence, one may need to massage the program to make it efficient. Our methodology is different: since we do not aim to incrementalize arbitrary programs written in standard programming languages, we can select primitives that have efficient derivatives, and thereby require the programmer to use them.
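The point about fold shape can be sketched as follows (a hypothetical illustration, not from the text): both functions below compute the same sum, but the partial results of the left fold form a chain, so changing one element invalidates all later partial sums, whereas the balanced fold's partial results form a tree, where changing one element invalidates only the O(log n) sums on the path to the root, which is the shape self-adjusting computation can reuse.

```haskell
-- Chain-shaped partial results: poor incremental behavior.
sumChain :: [Int] -> Int
sumChain = foldl (+) 0

-- Tree-shaped partial results: the stable shape needed for
-- efficient change propagation.
sumBalanced :: [Int] -> Int
sumBalanced []  = 0
sumBalanced [x] = x
sumBalanced xs  = sumBalanced left + sumBalanced right
  where (left, right) = splitAt (length xs `div` 2) xs
```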
Functional reactive programming [Elliott and Hudak, 1997] can also be seen as a dynamic approach to incremental computation; recent work by Maier and Odersky [2013] has focused on speeding up reactions to input changes by making them incremental on collections. Willis et al. [2008] use dynamic techniques to incrementalize JQL queries.
19.2 Static approaches

Static approaches analyze a program at compile time and produce an incremental version that efficiently updates the output of the original program according to changing inputs.

Static approaches have the potential to be more efficient than dynamic approaches, because no bookkeeping at runtime is required. Also, the computed incremental versions can often be optimized using standard compiler techniques, such as constant folding or inlining. However, none of them supports first-class functions, and some approaches have further restrictions.

Our aim is to apply static incrementalization to more expressive languages; in particular, ILC supports first-class functions and an open set of base types with associated primitive operations.
Sundaresh and Hudak [1991] propose to incrementalize programs using partial evaluation, given a partitioning of program inputs in parts that change and parts that stay constant. Their technique partially evaluates a given program relative to the constant input parts, and combines the result with the changing inputs.
19.2.1 Finite differencing

Paige and Koenig [1982] present derivatives for a first-order language with a fixed set of primitives. Blakeley et al. [1986] apply these ideas to a class of relational queries. The database community extended this work to queries on relational data, such as in algebraic differencing [Gupta and Mumick, 1999], which inspired our work and terminology. However, most of this work does not apply to nested collections or algebraic data types, but only to relational (flat) data, and no previous approach handles first-class functions or programs resulting from defunctionalization or closure conversion. Incremental support is typically designed monolithically for a whole language, rather than piecewise. Improving on algebraic differencing, DBToaster [Koch, 2010; Koch et al., 2014] guarantees asymptotic speedups with a compositional query transformation and delivers huge speedups in realistic benchmarks, though still for a first-order database language.
More general (non-relational) data types are considered in the work by Gluche et al. [1997]; our support for bags and the use of groups is inspired by their work. But their architecture is still rather restrictive: they lack support for function changes and restrict incrementalization to self-maintainable views, without hinting at a possible solution.

It seems possible to transform higher-order functional programs to database queries using a variety of approaches [Grust et al., 2009; Cheney et al., 2013], some of which support first-class functions via closure conversion [Grust and Ulrich, 2013; Grust et al., 2013], and to incrementalize the resulting programs using standard database technology. Such a solution would inherit the limitations of database incrementalization approaches: in particular, it appears that database incrementalization approaches such as DBToaster can handle the insertion and removal of entire table rows, but not smaller changes. Nevertheless, such an alternative approach might be worth investigating.
Unlike later approaches to higher-order differentiation, we do not restrict our base types to groups, unlike Koch et al. [2016]; and the transformed programs we produce do not require further runtime transformation, unlike Koch et al. [2016] and Huesca [2015], as we discuss further next.
19.2.2 λ-diff and partial differentials

In work concurrent with Cai et al. [2014], Huesca [2015] introduces an alternative formalism for incremental computation by program transformation, called λ-diff. While λ-diff has some appealing features, it currently appears to require program transformations at runtime. Other systems appear to share this feature [Koch et al., 2016]. Hence, this section discusses the reason in some detail.

Instead of differentiating a term t relative to all inputs (free variables and function arguments) via D⟦t⟧ like ILC, λ-diff differentiates terms relative to one input variable, and writes ∂t/∂x dx for the result of differentiating t relative to x: a term that computes the change in t when the value for x is updated by change dx. The formalism also uses pointwise function changes, similar to what we discussed in Sec. 15.3.
Unfortunately, it is not known how to define such a transformation for a λ-calculus with a standard semantics, and the needed semantics appear to require runtime term transformations, which are usually considered problematic when implementing functional languages. In particular, it appears necessary to introduce a new term constructor D t, which evaluates t to a function value λy → u and then evaluates to λ(y, dy) → ∂u/∂y dy, differentiating t at runtime relative to its head variable y. As an indirect consequence, if the program under incrementalization contains a function term Γ ⊢ t : σ → τ where Γ contains n free variables, it can be necessary in the worst case to differentiate t over any subset of the n free variables in Γ. There are 2ⁿ such subsets. To perform all term transformations before runtime, it hence seems necessary in the worst case to precompute 2ⁿ partial derivatives for each function term t, which appears unfeasible. On the other hand, it is not clear how often this worst case is realized, how big n grows in typical programs, or whether it is simply feasible to perform differentiation at runtime, similarly to JIT compilation. Overall, an efficient implementation of λ-diff remains an open problem. It appears that Koch et al. [2016] suffers from similar problems, though a few details appear simpler, since they restrict their focus to functions over groups.
To see why λndashdi need introduce D t consider dierentiating parts tpartx dx that is the change d ofs t when xx is updated by change dx Change d depends (a) on the change of t when x is updatedby dx that is parttpartx dx (b) on how s changes when its input t is updated by parttpartx dx to express thischange λndashdi expresses this via (D s) t parttpartx dx (c) on the change of s (applied to the updated t )when x is updated by dx that is parttpartx dx To compute component (b) λndashdi writes D s to dierentiates not relative to x but relative to the still unknown head variable of s If s evaluates to λy rarr u theny is the head variable of s and D s dierentiates u relative to y and evaluates to λ(ydy ) rarr partuparty dy
Overall, the rule for differentiating application in λ-diff is

    ∂(s t)/∂x dx = (D s) (t, ∂t/∂x dx) ⊚ (∂s/∂x dx) (t ⊕ ∂t/∂x dx)

where ⊚ denotes change composition.
This rule appears closely related to Eq. (15.1); hence, we refer to its discussion for clarification.

On the other hand, differentiating a term relative to all its inputs introduces a different sort of overhead. For instance, it is much more efficient to differentiate map f xs relative to xs than relative to f: if f undergoes a non-nil change df, D⟦map f xs⟧ must apply df to each element in the updated input xs. Therefore, in our practical implementations, D⟦map f xs⟧ tests whether df is nil and uses a more efficient implementation. In Chapter 16, we detect at compile time whether df is guaranteed to be nil. In Sec. 17.4.2, we instead detect at runtime whether df is nil. In both cases, authors of derivatives must implement this optimization by hand. Instead, λ-diff hints at a more general solution.
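As a rough illustration, here is how such a runtime nil-test could look. This is our own Python sketch, not the thesis implementation: for simplicity, a list change is modeled as a list of per-element changes (no insertions or deletions), and a function change df is modeled pointwise, with the nil change represented by None.

```python
# Hypothetical sketch of a derivative of `map` with a runtime nil-test on df.
# Simplifying assumptions (ours): xs is a list of numbers; dxs has the same
# length as xs; if df is not None, the updated function maps y to f(y) + df(y).
def dmap(f, df, xs, dxs):
    if df is None:
        # Nil function change: only the element changes contribute.
        return [f(x + dx) - f(x) for x, dx in zip(xs, dxs)]
    # Non-nil df: we must apply df to each element of the updated input.
    return [(f(x + dx) + df(x + dx)) - f(x) for x, dx in zip(xs, dxs)]

f = lambda x: 2 * x
base = [f(x) for x in [1, 2, 3]]
change = dmap(f, None, [1, 2, 3], [0, 1, 0])
updated = [b + c for b, c in zip(base, change)]
assert updated == [f(x + dx) for x, dx in zip([1, 2, 3], [0, 1, 0])]
```

The nil branch skips recomputation for unchanged elements, which is the optimization the text says derivative authors currently implement by hand.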
176 Chapter 19 Related work
19.2.3 Static memoization

Liu's work [Liu, 2000] allows incrementalizing a first-order base program f(x_old) to compute f(x_new), knowing how x_new is related to x_old. To this end, they transform f(x_new) into an incremental program which reuses the intermediate results produced while computing f(x_old), the base program. Specifically, (i) first the base program is transformed to save all its intermediate results, then (ii) the incremental program is transformed to reuse those intermediate results, and finally (iii) intermediate results which are not needed are pruned from the base program. However, to reuse intermediate results, the incremental program must often be rearranged, using some form of equational reasoning, into some equivalent program where partial results appear literally. For instance, if the base program f uses a left fold to sum the elements of a list of integers x_old, accessing them from the head onwards, and x_new prepends a new element h to the list, at no point does f(x_new) recompute the same results. But since addition is commutative on integers, we can rewrite f(x_new) as f(x_old) + h. The author's CACHET system will try to perform such rewritings automatically, but it is not guaranteed to succeed. Similarly, CACHET will try to synthesize any additional results which can be computed cheaply by the base program, to help make the incremental program more efficient.
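For the left-fold example, the rewriting that CACHET searches for can be made concrete. The following Python sketch (ours, for illustration only, not CACHET's output) shows the base program and the incremental program justified by the rewriting f(x_new) = f(x_old) + h:

```python
from functools import reduce

def f(xs):
    # Base program: a left fold summing a list of integers.
    return reduce(lambda acc, x: acc + x, xs, 0)

def f_incremental(prev_result, h):
    # Incremental program, justified by commutativity of + on integers:
    # f(h : xs) = h + f(xs), so we reuse the cached result f(xs).
    return prev_result + h

xs_old = [1, 2, 3]
cached = f(xs_old)   # intermediate result saved by step (i)
h = 10
assert f([h] + xs_old) == f_incremental(cached, h)
```

The point of the example: the base fold never computes f(x_old) + h literally, so the equivalence must be established by equational reasoning, not by matching intermediate results.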
Since it is hard to fully automate such reasoning, we move equational reasoning to the plugin design phase. A plugin provides general-purpose higher-order primitives, for which the plugin authors have devised efficient derivatives (by using equational reasoning in the design phase). Then, the differentiation algorithm computes incremental versions of user programs without requiring further user intervention. It would be useful to combine ILC with some form of static caching, to make the computation of derivatives which are not self-maintainable more efficient. We plan to do so in future work.
Chapter 20
Conclusion and future work
In databases, standard finite differencing technology allows incrementalizing programs in a specific domain-specific first-order language of collection operations. Unlike other approaches, finite differencing transforms programs to programs, which can in turn be transformed further.

Differentiation (Chapter 10) transforms higher-order programs to incremental ones, called derivatives, as long as one can provide incremental support for primitive types and operations.

Case studies in this thesis consider support for several such primitives. We first study a setting restricted to self-maintainable derivatives (Chapter 16). Then we study a more general setting, where it is possible to remember inputs and intermediate results from one execution to the next, thanks to a transformation to cache-transfer style (Chapter 17). In all cases, we are able to deliver order-of-magnitude speedups on non-trivial examples; moreover, incrementalization produces programs in standard languages that are subject to further optimization and code transformation.

Correctness of incrementalization appeared initially surprising, so we devoted significant effort to formal correctness proofs, either on paper or in mechanized proof assistants. The original correctness proof of differentiation, using a set-theoretic denotational semantics [Cai et al., 2014], was a significant milestone, but since then we have simplified the proof to a logical relations proof that defines when a change is valid from a source to a destination, and proves that differentiation produces valid changes (Chapter 12). Valid changes witness the difference between sources and destinations; since changes can be nil, change validity arises as a generalization of the concepts of logical equivalence and parametricity for a language (at least in terminating languages) (Chapter 18). Crucially, changes are not just witnesses: operation ⊕ takes a change and its source to the change destination. One can consider further operations, which give rise to an algebraic structure that we call a change structure (Chapter 13).
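To make the operations concrete, here is a minimal change structure for integers, sketched in Python (our illustration; the names oplus, ominus and nil stand for the operations that apply a change, compute a change between two values, and produce the nil change):

```python
class IntChanges:
    # Change structure for integers (sketch): changes are integers,
    # v ⊕ dv = v + dv, v2 ⊖ v1 = v2 - v1, and the nil change is 0.
    @staticmethod
    def oplus(v, dv):
        return v + dv

    @staticmethod
    def ominus(v2, v1):
        return v2 - v1

    @staticmethod
    def nil(v):
        return 0

v1, v2 = 10, 17
dv = IntChanges.ominus(v2, v1)
assert IntChanges.oplus(v1, dv) == v2                  # v1 ⊕ (v2 ⊖ v1) = v2
assert IntChanges.oplus(v1, IntChanges.nil(v1)) == v1  # nil changes preserve v1
```

The two assertions instantiate the change-structure laws that ⊕ takes a change and its source to the destination, and that nil changes are valid from a value to itself.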
Based on this simplified proof, in this thesis we generalize correctness to further languages, using big-step operational semantics and (step-indexed) logical relations (Appendix C). Using operational semantics, we reprove correctness for simply-typed λ-calculus, then add support for recursive functions (which would require domain-theoretic methods when using denotational semantics), and finally extend the proof to untyped λ-calculus. Building on a variant of this proof (Sec. 17.3.3), we show that conversion to cache-transfer style also preserves correctness.

Based on a different formalism for logical equivalence and parametricity [Bernardy and Lasson, 2011], we sketch variants of the transformation and correctness proofs for simply-typed λ-calculus with type variables, and then for full System F (Chapter 18).
Future work. To extend differentiation to System F, we must consider changes where source and destination have different types. This generalization of changes makes change operations much more flexible: it becomes possible to define a combinator language for change structures, and it appears possible to define nontrivial change structures for algebraic datatypes using this combinator language.
We conjecture such a combinator language will allow programmers to define correct change structures out of simple, reusable components.

Incrementalizing primitives correctly remains, at present, a significant challenge. We provide tools to support this task by formalizing equational reasoning, but it appears necessary to provide more tools to programmers, as done by Liu [2000]. Conceivably, it might be possible to build such a rewriting tool on top of the rewriting and automation support of theorem provers such as Coq.
Appendixes
Appendix A
Preliminaries: syntax and semantics of simply-typed λ-calculus

To discuss how we incrementalize programs and prove that our incrementalization technique gives correct results, we specify which foundation we use for our proofs, and what object language we study throughout most of Part II.

We mechanize our correctness proof using Agda, hence we use Agda's underlying type theory as our foundation. We discuss what this means in Appendix A.1.

Our object language is a standard simply-typed λ-calculus (STLC) [Pierce, 2002, Ch. 9], parameterized over base types and constants. We term the set of base types and constants a language plugin (see Appendix A.2.5). In our examples, we assume that the language plugin supports the needed base types and constants. Later (e.g., in Chapter 12), we add further requirements to language plugins, to support incrementalization of the language features they add to our STLC. Rather than operational semantics, we use a denotational semantics, which is however set-theoretic rather than domain-theoretic. Our object language and its semantics are summarized in Fig. A.1.
At this point readers might want to skip to Chapter 10 right away or focus on denotationalsemantics and refer to this section as needed
A.1 Our proof meta-language

In this section we describe the logic (or meta-language) used in our mechanized correctness proof.

First, as usual, we distinguish between "formalization" (that is, on-paper formalized proofs) and "mechanization" (that is, proofs encoded in the language of a proof assistant for computer-aided, mechanized verification).

To prove the correctness of ILC, we provide a mechanized proof in Agda [Agda Development Team, 2013]. Agda implements intensional Martin-Löf type theory (from now on, simply type theory), so type theory is also the foundation of our proofs.

At times, we use conventional set-theoretic language to discuss our proofs, but the differences are only superficial. For instance, we might talk about a set of elements S, with elements such as s ∈ S, though we always mean that S is a metalanguage type, that s is a metalanguage value, and that s : S. Talking about sets avoids ambiguity between types of our meta-language and types of our object-language (which we discuss next, in Appendix A.2).
Notation A.1.1
We'll let uppercase Latin letters A, B, C, V, U range over sets, never over types.

We do not prove correctness of all our language plugins. However, in earlier work [Cai et al., 2014], we prove correctness for a language plugin supporting bags, a type of collection (described in Sec. 10.2). For that proof, we extend our logic by postulating a few standard axioms on the implementation of bags, to avoid proving correct an implementation of bags, or needing to account for different values representing the same bag (such different values typically arise when implementing bags as search trees).
A.1.1 Type theory versus set theory

Here we summarize a few features of type theory over set theory.

Type theory is dependently typed, so it generalizes the function type A → B to the dependent function type (x : A) → B, where x can appear free in B. Such a type guarantees that, if we apply a function f : (x : A) → B to an argument a : A, the result has type B[x := a], that is, B where x is substituted by a. At times, we will use dependent types in our presentation.

Moreover, by using type theory:

• We do not postulate the law of excluded middle; that is, our logic is constructive.

• Unlike set theory, type theory is proof-relevant: that is, proofs are first-class mathematical objects.

• Instead of subsets {x ∈ A | P(x)}, we must use Σ-types Σ(x : A). P(x), which contain pairs of elements x and proofs that they satisfy predicate P.

• In set theory, we can assume without further ado functional extensionality, that is, that functions that give equal results on all equal inputs are equal themselves. Intuitionistic type theory does not prove functional extensionality, so we need to add it as a postulate. In Agda, this postulate is known to be consistent [Hofmann, 1996], hence it is safe to assume.¹

To handle binding issues in our object language, our formalization uses typed de Bruijn indexes, because this technique takes advantage of Agda's support for type refinement in pattern matching. On top of that, we implement a HOAS-like frontend, which we use for writing specific terms.
A.2 Simply-typed λ-calculus

We consider as object language a strongly-normalizing simply-typed λ-calculus (STLC). We choose STLC as it is the simplest language with first-class functions and types, while being a sufficient model of realistic total languages.² We recall the syntax and typing rules of STLC in Figs. A.1a and A.1b, together with the metavariables we use. Language plugins define base types ι and constants c. Types can be base types ι or function types σ → τ. Terms can be constants c, variables x, function applications t₁ t₂, or λ-abstractions λ(x : σ) → t. To describe assumptions on variable types when typing terms, we define (typing) contexts Γ as being either empty ε, or as context extensions Γ, x : τ, which extend context Γ by asserting that variable x has type τ. Typing is defined through a judgment
¹http://permalink.gmane.org/gmane.comp.lang.agda/2343
²To know why we restrict to total languages, see Sec. 15.1.
ι ::= …                              (base types)
σ, τ ::= ι | τ → τ                   (types)
Γ ::= ε | Γ, x : τ                   (typing contexts)
c ::= …                              (constants)
s, t ::= c | λ(x : σ) → t | t t | x  (terms)

(a) Syntax

Γ ⊢ t : τ

⊢^C c : τ
——————————— Const
Γ ⊢ c : τ

————————————————————— Lookup
Γ₁, x : τ, Γ₂ ⊢ x : τ

Γ, x : σ ⊢ t : τ
————————————————————————— Lam
Γ ⊢ λ(x : σ) → t : σ → τ

Γ ⊢ s : σ → τ    Γ ⊢ t : σ
——————————————————————————— App
Γ ⊢ s t : τ

(b) Typing

⟦ι⟧ = …
⟦σ → τ⟧ = ⟦σ⟧ → ⟦τ⟧

(c) Values

⟦ε⟧ = {ε}
⟦Γ, x : τ⟧ = {(ρ, x = v) | ρ ∈ ⟦Γ⟧ ∧ v ∈ ⟦τ⟧}

(d) Environments

⟦c⟧ρ = ⟦c⟧^C
⟦λ(x : σ) → t⟧ρ = λ(v : ⟦σ⟧) → ⟦t⟧(ρ, x = v)
⟦s t⟧ρ = (⟦s⟧ρ) (⟦t⟧ρ)
⟦x⟧ρ = lookup x in ρ

(e) Denotational semantics

Figure A.1: Standard definitions for the simply-typed lambda calculus.
Γ ⊢ t : τ, stating that term t under context Γ has type τ.³ For a proper introduction to STLC, we refer the reader to Pierce [2002, Ch. 9]. We will assume significant familiarity with it.
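The evaluation function of Fig. A.1e is essentially a definitional interpreter. As a rough illustration (ours, written in Python rather than Agda, and necessarily on untyped syntax, so only an approximation of the intrinsically-typed mechanization), it can be transcribed as:

```python
# Terms encoded as tuples: ('const', c), ('var', x), ('lam', x, body),
# ('app', s, t). `const_sem` plays the role of the plugin-provided ⟦c⟧^C.
def ev(t, rho, const_sem):
    tag = t[0]
    if tag == 'const':            # ⟦c⟧ρ = ⟦c⟧^C, independent of ρ
        return const_sem[t[1]]
    if tag == 'var':              # ⟦x⟧ρ = lookup x in ρ
        return rho[t[1]]
    if tag == 'lam':              # ⟦λ(x : σ) → t⟧ρ = λv → ⟦t⟧(ρ, x = v)
        _, x, body = t
        return lambda v: ev(body, {**rho, x: v}, const_sem)
    if tag == 'app':              # ⟦s t⟧ρ = (⟦s⟧ρ) (⟦t⟧ρ)
        _, s, u = t
        return ev(s, rho, const_sem)(ev(u, rho, const_sem))
    raise ValueError(f"unknown term: {t!r}")

# (λx → x) one evaluates to the semantics of the constant `one`.
t = ('app', ('lam', 'x', ('var', 'x')), ('const', 'one'))
assert ev(t, {}, {'one': 1}) == 1
```

Each branch is compositional: the meaning of a term is computed from the meanings of its subterms, which is what makes this style convenient in a proof assistant.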
An extensible syntax of types. In fact, the definition of base types can be mutually recursive with the definition of types. So a language plugin might add as base types, for instance, collections of elements of type τ, products and sums of type σ and type τ, and so on. However, this mutual recursion must satisfy a few technical restrictions to avoid introducing subtle inconsistencies, and Agda cannot enforce these restrictions across modules. Hence, if we define language plugins as separate modules in our mechanization, we need to verify by hand that such restrictions are satisfied (which they are). See Appendix B.1 for the gory details.
Notation A.2.1
We typically omit type annotations on λ-abstractions, that is, we write λx → t rather than λ(x : σ) → t. Such type annotations can often be inferred from context (or by type inference). Nevertheless, whenever we discuss terms of shape λx → t, we're in fact discussing λ(x : σ) → t for some arbitrary σ. We write λx → t instead of λx. t for consistency with the notation we use later for Haskell programs.

We often omit ε from typing contexts with some assumptions. For instance, we write x : τ₁, y : τ₂ instead of ε, x : τ₁, y : τ₂.

³We only formalize typed terms, not untyped ones, so that each term has a unique type. That is, in the relevant jargon, we use Church-style typing, as opposed to Curry-style typing. Alternatively, we use an intrinsically-typed term representation. In fact, arguably we mechanize at once both well-typed terms and their typing derivations. This is even more clear in our mechanization; see the discussion in Appendix A.2.4.
We overload symbols (often without warning) when they can be disambiguated from context, especially when we can teach modern programming languages to disambiguate such overloadings. For instance, we reuse → for lambda abstractions λx → t, function spaces A → B, and function types σ → τ, even though the first is a separator.

Extensions. In our examples, we will use some unproblematic syntactic sugar over STLC, including let expressions, global definitions and type inference, and we will use a Haskell-like concrete syntax. In particular, when giving type signatures or type annotations in Haskell snippets, we will use :: to separate terms or variables from their types, rather than : as in λ-calculus. To avoid confusion, we never use : to denote the constructor for Haskell lists.

At times, our concrete examples will use Hindley-Milner (prenex) polymorphism, but this is also not such a significant extension. A top-level definition using prenex polymorphism, that is of type ∀α. τ (where α is free in τ), can be taken as sugar for a metalevel family of object-level programs, indexed by type argument τ₁, of definitions of type τ[α := τ₁]. We use this trick without explicit mention in our first implementation of incrementalization in Scala [Cai et al., 2014].
A.2.1 Denotational semantics for STLC

To prove that incrementalization preserves the semantics of our object-language programs, we define a semantics for STLC. We use a naive set-theoretic denotational semantics. Since STLC is strongly normalizing [Pierce, 2002, Ch. 12], its semantics need not handle partiality. Hence, we can use denotational semantics but eschew using domain theory, and simply use sets from the metalanguage (see Appendix A.1). Likewise, we can use normal functions as domains for function types.

We first associate, to every type τ, a set of values ⟦τ⟧, so that terms of a type τ evaluate to values in ⟦τ⟧. We call the set ⟦τ⟧ a domain. Domains associated to types τ depend on the domains associated to base types ι, which must be specified by language plugins (Plugin Requirement A.2.10).

Definition A.2.2 (Domains and values)
The domain ⟦τ⟧ of a type τ is defined as in Fig. A.1c. A value is a member of a domain.

We let metavariables u, v, a, b, … range over members of domains; we tend to use v for generic values, and a for values we use as function arguments. We also let metavariables f, g, … range over values in the domain of a function type. At times, we might also use metavariables f, g, … to range over terms of function types; the context will clarify what is intended.

Given this domain construction, we can now define a denotational semantics for terms. The plugin has to provide the evaluation function for constants. In general, the evaluation function ⟦t⟧ computes the value of a well-typed term t, given the values of all free variables in t. The values of the free variables are provided in an environment.

Definition A.2.3 (Environments)
An environment ρ assigns values to the names of free variables:

ρ ::= ε | ρ, x = v

We write ⟦Γ⟧ for the set of environments that assign values to the names bound in Γ (see Fig. A.1d).
Notation. We often omit ε from environments with some assignments. For instance, we write x = v₁, y = v₂ instead of ε, x = v₁, y = v₂.

Definition A.2.4 (Evaluation)
Given Γ ⊢ t : τ, the meaning of t is defined by the function ⟦t⟧ of type ⟦Γ⟧ → ⟦τ⟧ in Fig. A.1e.

This is a standard denotational semantics of the simply-typed λ-calculus.

For each constant c : τ, the plugin provides ⟦c⟧^C ∈ ⟦τ⟧, the semantics of c (by Plugin Requirement A.2.11); since constants don't contain free variables, ⟦c⟧^C does not depend on an environment. We define a program equivalence across terms of the same type, t₁ ≃ t₂, to mean ⟦t₁⟧ = ⟦t₂⟧.

Definition A.2.5 (Denotational equivalence)
We say that two terms Γ ⊢ t₁ : τ and Γ ⊢ t₂ : τ are denotationally equal, and write Γ ⊨ t₁ ≃ t₂ : τ (or sometimes t₁ ≃ t₂), if for all environments ρ ∈ ⟦Γ⟧ we have that ⟦t₁⟧ρ = ⟦t₂⟧ρ.

Remark A.2.6
Beware that denotational equivalence cannot always be strengthened by dropping unused variables: that is, Γ, x : σ ⊨ t₁ ≃ t₂ : τ does not imply Γ ⊨ t₁ ≃ t₂ : τ, even if x does not occur free in either t₁ or t₂. Counterexamples rely on σ being an empty type. For instance, we cannot weaken x : 0τ ⊨ 0 ≃ 1 : Z (where 0τ is an empty type): this equality is only true vacuously, because there exists no environment for context x : 0τ.
A.2.2 Weakening

While we don't discuss our formalization of variables in full, in this subsection we briefly discuss weakening on STLC terms, and state as a lemma that weakening preserves meaning. This lemma is needed in a key proof, the one of Theorem 12.2.2.

As usual, if a term t is well-typed in a given context Γ₁, and context Γ₂ extends Γ₁ (which we write as Γ₁ ⪯ Γ₂), then t is also well-typed in Γ₂.

Lemma A.2.7 (Weakening is admissible)
The following typing rule is admissible:

Γ₁ ⊢ t : τ    Γ₁ ⪯ Γ₂
————————————————————— Weaken
Γ₂ ⊢ t : τ

Weakening also preserves semantics. If a term t is typed in context Γ₁, evaluating it requires an environment matching Γ₁. So, if we weaken t to a bigger context Γ₂, evaluation requires an extended environment matching Γ₂, and is going to produce the same result.

Lemma A.2.8 (Weakening preserves meaning)
Take Γ₁ ⊢ t : τ and ρ₁ ∈ ⟦Γ₁⟧. If Γ₁ ⪯ Γ₂ and ρ₂ ∈ ⟦Γ₂⟧ extends ρ₁, then we have that

⟦t⟧ρ₁ = ⟦t⟧ρ₂.

Mechanizing these statements and their proofs requires some care. We have a meta-level type Term Γ τ of object terms having type τ in context Γ. Evaluation has type ⟦–⟧ : Term Γ τ → ⟦Γ⟧ → ⟦τ⟧, so the equation ⟦t⟧ρ₁ = ⟦t⟧ρ₂ is not directly well-typed. To remedy this, we define formally the subcontext relation Γ₁ ⪯ Γ₂, and an explicit operation that weakens a term in context Γ₁ to a corresponding term in the bigger context Γ₂: weaken : Γ₁ ⪯ Γ₂ → Term Γ₁ τ → Term Γ₂ τ. We define the subcontext relation Γ₁ ⪯ Γ₂ as a judgment, using order-preserving embeddings.⁴ We refer to our mechanized proof for details, including auxiliary definitions and relevant lemmas.

⁴As mentioned by James Chapman at https://lists.chalmers.se/pipermail/agda/2011/003423.html, who attributes them to Conor McBride.
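Concretely, the content of Lemma A.2.8 can be observed on a toy interpreter in the style of Fig. A.1e (our Python illustration, untyped and using dictionaries for environments, so weakening needs no explicit term operation here):

```python
def ev(t, rho):
    # Minimal evaluator for terms ('var', x), ('lam', x, body), ('app', s, t).
    tag = t[0]
    if tag == 'var':
        return rho[t[1]]
    if tag == 'lam':
        _, x, body = t
        return lambda v: ev(body, {**rho, x: v})
    if tag == 'app':
        _, s, u = t
        return ev(s, rho)(ev(u, rho))
    raise ValueError(t)

t = ('app', ('lam', 'x', ('var', 'x')), ('var', 'y'))
rho1 = {'y': 42}                 # environment matching the smaller context
rho2 = {**rho1, 'z': 'unused'}   # extended environment for a bigger context
assert ev(t, rho1) == ev(t, rho2) == 42
```

Evaluation only consults the variables actually free in t, so any extension of the environment with unused bindings produces the same result, which is exactly what the lemma states.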
A.2.3 Substitution

Some facts can be presented using (capture-avoiding) substitution rather than environments, and we do so at some points, so let us fix notation. We write t[x := s] for the result of substituting variable x in term t by term s.

We have mostly avoided mechanizing proofs about substitution, but we have mechanized substitution following Keller and Altenkirch [2010], and proved the following substitution lemma.

Lemma A.2.9 (Substitution lemma)
For any term Γ ⊢ t : τ and variable x : σ bound in Γ, we write Γ − x for the result of removing variable x from Γ (as defined by Keller and Altenkirch). Take a term Γ − x ⊢ s : σ and an environment ρ ∈ ⟦Γ − x⟧. Then, we have that substitution and evaluation commute as follows:

⟦t[x := s]⟧ρ = ⟦t⟧(ρ, x = ⟦s⟧ρ)
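The commutation equation can be checked on a concrete instance with a small untyped interpreter and a naive substitution function (our Python illustration; for simplicity, s is closed here, so no capture-avoidance machinery is needed):

```python
def ev(t, rho):
    # Tiny evaluator: ('lit', n), ('var', x), ('lam', x, body), ('app', s, t).
    tag = t[0]
    if tag == 'lit':
        return t[1]
    if tag == 'var':
        return rho[t[1]]
    if tag == 'lam':
        _, x, body = t
        return lambda v: ev(body, {**rho, x: v})
    if tag == 'app':
        _, s, u = t
        return ev(s, rho)(ev(u, rho))
    raise ValueError(t)

def subst(t, x, s):
    # t[x := s], assuming s is closed (so no variable capture can occur).
    tag = t[0]
    if tag == 'var':
        return s if t[1] == x else t
    if tag == 'lam':
        _, y, body = t
        return t if y == x else ('lam', y, subst(body, x, s))
    if tag == 'app':
        _, f, u = t
        return ('app', subst(f, x, s), subst(u, x, s))
    return t

t = ('app', ('lam', 'y', ('var', 'y')), ('var', 'x'))
s = ('lit', 7)
rho = {}
# ⟦t[x := s]⟧ρ = ⟦t⟧(ρ, x = ⟦s⟧ρ)
assert ev(subst(t, 'x', s), rho) == ev(t, {**rho, 'x': ev(s, rho)}) == 7
```

The final assertion is one instance of the lemma: substituting s into t and evaluating agrees with evaluating t in an environment extended with the value of s.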
A.2.4 Discussion: our mechanization and semantic style

To formalize the meaning of our programs, we use denotational semantics, while nowadays most prefer operational semantics, in particular small-step ones. Hence, we next justify our choice and discuss related work.

We expect we could use other semantic techniques, such as big-step or small-step semantics. But at least for such a simple object language, working with denotational semantics as we use it is easier than other approaches in a proof assistant, especially in Agda:

• Our semantics ⟦–⟧ is a function, and not a relation like in small-step or big-step semantics.

• It is clear to Agda that our semantics is a total function, since it is structurally recursive.

• Agda can normalize ⟦–⟧ on partially-known terms when normalizing goals.

• The meanings of our programs are well-behaved Agda functions, not syntax, so we know "what they mean" and need not prove any lemmas about it. We need not prove, say, that evaluation is deterministic.

In Agda, the domains for our denotational semantics are simply Agda types, and semantic values are Agda values; in other words, we give a denotational semantics in terms of type theory. Using denotational semantics allows us to state the specification of differentiation directly in the semantic domain, and to take advantage of Agda's support for equational reasoning for proving equalities between Agda functions.

Related work. Our variant is used, for instance, by McBride [2010], who attributes it to Augustsson and Carlsson [1999] and Altenkirch and Reus [1999]. In particular, Altenkirch and Reus [1999] already define our type Term Γ τ of simply-typed λ-terms t, well-typed with type τ in context Γ, while Augustsson and Carlsson [1999] define semantic domains by induction over types. Benton et al. [2012] and Allais et al. [2017] also discuss this approach to formalizing λ-terms, and discuss how to best prove the various lemmas needed to reason, for instance, about substitution.

More in general, similar approaches are becoming more common when using proof assistants. Our denotational semantics could be otherwise called a definitional interpreter (which is in particular compositional), and mechanized formalizations using a variety of definitional interpreters are nowadays often advocated, either using denotational semantics [Chlipala, 2008] or using functional big-step semantics. Functional semantics are so convenient that their use has been advocated even for languages that are not strongly normalizing [Owens et al., 2016; Amin and Rompf, 2017], even at the cost of dealing with step-indexes.
A.2.5 Language plugins

Our object language is parameterized by language plugins (or just plugins) that encapsulate its domain-specific aspects.

In our examples, our language plugin will typically support integers and primitive operations on them. However, our plugin will also support various sorts of collections and base operations on them. Our first example of a collection will be bags. Bags are unordered collections (like sets) where elements are allowed to appear more than once (unlike in sets); they are also called multisets.
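For instance, bags and one simple representation of bag changes can be sketched as follows (our Python illustration, not the plugin's actual definitions: multiplicities are stored in a Counter, and a change maps elements to multiplicity deltas):

```python
from collections import Counter

def bag_oplus(b, db):
    # b ⊕ db: apply multiplicity deltas, keeping only positive multiplicities.
    out = Counter(b)
    for elem, delta in db.items():
        out[elem] += delta
    return +out  # unary + on Counter drops non-positive counts

b = Counter({'a': 2, 'b': 1})
db = {'a': -1, 'c': 2}  # remove one 'a', insert two copies of 'c'
assert bag_oplus(b, db) == Counter({'a': 1, 'b': 1, 'c': 2})
```

Negative deltas encode removals and positive deltas encode insertions, so a single change value can describe both at once.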
Our formalization is parameterized over one language plugin providing all base types and primitives. In fact, we expect a language plugin to be composed out of multiple language plugins, merged together [Erdweg et al., 2012]. Our mechanization is mostly parameterized over language plugins, but see Appendix B.1 for a discussion of a few limitations.

The sets of base types and primitive constants, as well as the types for primitive constants, are on purpose left unspecified and only defined by plugins: they are extension points. We write some extension points using ellipses ("…"), and other ones by creating names, which typically use C as a superscript.

A plugin defines a set of base types ι, primitives c, and their denotational semantics ⟦c⟧^C. As usual, we require that ⟦c⟧^C ∈ ⟦τ⟧ whenever c : τ.
Summary. To sum up the discussion of plugins, we collect formally the plugin requirements we have mentioned in this chapter.

Plugin Requirement A.2.10 (Base types)
There is a set of base types ι, and for each there is a domain ⟦ι⟧.

Plugin Requirement A.2.11 (Constants)
There is a set of constants c. To each constant is associated a type τ, such that the constant has that type, that is ⊢^C c : τ, and the constant's semantics matches that type, that is ⟦c⟧^C ∈ ⟦τ⟧.

After discussing the metalanguage of our proofs, the object language we study, and its semantics, we begin discussing incrementalization in the next chapter.
Appendix B
More on our formalization
B.1 Mechanizing plugins modularly, and limitations

Next, we discuss our mechanization of language plugins in Agda, and its limitations. For the concerned reader, we can say that these limitations affect essentially how modular our proofs are, and not their cogency.

In essence, it's not possible to express in Agda the correct interface for language plugins, so some parts of language plugins can't be modularized as desirable. Alternatively, we can mechanize any fixed language plugin together with the main formalization, which is not truly modular; or mechanize a core formulation parameterized on a language plugin, which runs into a few limitations; or encode plugins so that they can be modularized, and deal with the encoding overhead.

This section requires some Agda knowledge not provided here, but we hope that readers familiar with both Haskell and Coq will be able to follow along.

Our mechanization is divided into multiple Agda modules. Most modules have definitions that depend on language plugins, directly or indirectly. Those take definitions from language plugins as module parameters.

For instance, STLC object types are formalized through Agda type Type, defined in module Parametric.Syntax.Type. The latter is parameterized over Base, the type of base types.

For instance, Base can support a base type of integers, and a base type of bags of elements of type ι (where ι : Base). Simplifying a few details, our definition is equivalent to the following Agda code:
module Parametric.Syntax.Type (Base : Set) where
  data Type : Set where
    base : (ι : Base) → Type
    _⇒_ : (σ : Type) → (τ : Type) → Type

-- Elsewhere, in the plugin:
data Base : Set where
  baseInt : Base
  baseBag : (ι : Base) → Base
-- …
But with these definitions, we only have bags of elements of base type. If ι is a base type, base (baseBag ι) is the type of bags with elements of type base ι. Hence, we have bags of elements of base type. But we don't have a way to encode Bag τ if τ is an arbitrary non-base type, such as base baseInt ⇒ base baseInt (the encoding of object type Z → Z). Can we do better? If we ignore modularization, we can define types through the following mutually recursive datatypes:
mutual
  data Type : Set where
    base : (ι : Base) → Type
    _⇒_ : (σ : Type) → (τ : Type) → Type

  data Base : Set where
    baseInt : Base
    baseBag : (ι : Type) → Base

So far so good, but these types have to be defined together. We can go a step further by defining:
mutual
  data Type : Set where
    base : (ι : Base Type) → Type
    _⇒_ : (σ : Type) → (τ : Type) → Type

  data Base (Type : Set) : Set where
    baseInt : Base Type
    baseBag : (ι : Type) → Base Type
Here, Base takes the type of object types as a parameter, and Type uses Base Type to tie the recursion. This definition still works, but only as long as Base and Type are defined together.
If we try to separate the definitions of Base and Type into different modules, we run into trouble:

module Parametric.Syntax.Type (Base : Set → Set) where
  data Type : Set where
    base : (ι : Base Type) → Type
    _⇒_ : (σ : Type) → (τ : Type) → Type

-- Elsewhere, in the plugin:
data Base (Type : Set) : Set where
  baseInt : Base Type
  baseBag : (ι : Type) → Base Type
Here, Type is defined for an arbitrary function on types, Base : Set → Set. However, this definition is rejected by Agda's positivity checker. Like Coq, Agda forbids defining datatypes that are not strictly positive, as they can introduce inconsistencies.

The above definition of Type is not strictly positive, because we could pass to it as argument Base = λτ → (τ → τ), so that Base Type = Type → Type, making Type occur in a negative position. However, the actual uses of Base we are interested in are fine. The problem is that we cannot inform the positivity checker that Base is supposed to be a strictly positive type function, because Agda doesn't supply the needed expressivity.

This problem is well-known. It could be solved if Agda function spaces supported positivity annotations,¹ or by encoding a universe of strictly-positive type constructors. This encoding is not fully transparent, and adds hard-to-hide noise to the development [Schwaab and Siek, 2013].² Few alternatives remain:

• We can forbid types from occurring in base types, as we did in our original paper [Cai et al., 2014]. There, we did not discuss a recursive definition of base types at all.
¹As discussed in https://github.com/agda/agda/issues/570 and https://github.com/agda/agda/issues/2411.
²Using pattern synonyms and DISPLAY pragmas might successfully hide this noise.
• We can use the modular mechanization, disable positivity checking, and risk introducing inconsistencies. We tried this successfully in a branch of that formalization. We believe we did not introduce inconsistencies in that branch, but have no hard evidence.

• We can simply combine the core formalization with a sample plugin. This is not truly modular, because the core modularization can only be reused by copy-and-paste. Moreover, in dependently-typed languages, the content of a definition can affect whether later definitions typecheck, so alternative plugins using the same interface might not typecheck.³

Sadly, giving up on modularity altogether appears the more conventional choice. Either way, as we claimed at the outset, these modularity concerns only limit the modularity of the mechanized proofs, not their cogency.

³Indeed, if Base were not strictly positive, the application Type Base would be ill-typed, as it would fail positivity checking, even though Base : Set → Set does not require Base to be strictly positive.
Appendix C
(Un)typed ILC operationally
In Chapter 12 we have proved ILC correct for a simply-typed λ-calculus What about other languageswith more expressive type systems or no type system at all
In this chapter we prove that ILC is still correct in untyped call-by-value (CBV) λ-calculusWe do so without using denotational semantics but using only an environment-based big-stepoperational semantics and step-indexed logical relations The formal development in this chapterstands alone from the rest of the thesis though we do not repeat ideas present elsewhere
We prove ILC correct using, in increasing order of complexity:
1. STLC and standard syntactic logical relations;
2. STLC and step-indexed logical relations;
3. an untyped λ-calculus and step-indexed logical relations.
We have fully mechanized the second proof in Agda,1 and done the others on paper. In all cases, we prove the fundamental property for validity; we detail later which corollaries we prove in which case. The proof for untyped λ-calculus is the most interesting, but the others can serve as stepping stones. Yann Régis-Gianas, in collaboration with me, recently mechanized a similar proof for untyped λ-calculus in Coq,2 which appears in Sec. 17.3.
Using operational semantics and step-indexed logical relations simplifies extending the proofs to more expressive languages, where denotational semantics or other forms of logical relations would require more sophistication, as argued by Ahmed [2006].
Proofs by (step-indexed) logical relations also promise to be scalable. All these proofs appear to be slight variants of proof techniques for logical program equivalence and parametricity, which are well-studied topics, suggesting the approach might scale to more expressive type systems. Hence, we believe these proofs clarify the relation with parametricity that has been noticed earlier [Atkey 2015]. However, actually proving ILC correct for a polymorphic language (such as System F) is left as future work.
We also expect that from our logical relations one might derive a logical partial equivalence relation among changes, similarly to Sec. 14.2, but we leave such a development for future work.
Compared to earlier chapters, this one will be more technical and concise, because we already introduced the ideas behind both ILC and logical relation proofs.
1 Source code available in this GitHub repo: https://github.com/inc-lc/ilc-agda.
2 Mechanizing the proof for untyped λ-calculus is harder for purely technical reasons: mechanizing well-founded induction in Agda is harder than mechanizing structural induction.
Binding and environments. On the technical side, we are able to mechanize our proof without needing any technical lemmas about binding or weakening, thanks to a number of choices we mention later.
Among other reasons, we avoid lemmas on binding because, instead of substituting variables with arbitrary terms, we record mappings from variables to closed values via environments. We also used environments in Appendix A, but this time we use syntactic values rather than semantic ones. As a downside, on paper, using environments makes for more and bigger definitions, because we need to carry around both an environment and a term instead of merging them into a single term via substitution, and because values are not a subset of terms but an entirely separate category. But on paper the extra work is straightforward, and in a mechanized setting it is simpler than substitution, especially in a setting with an intrinsically typed term representation (see Appendix A.2.4).
Background/related work. Our development is inspired significantly by the use of step-indexed logical relations by Ahmed [2006] and Acar, Ahmed and Blume [2008]. We refer to those works and to Ahmed's lectures at OPLSS 20133 for an introduction to (step-indexed) logical relations.
Intensional and extensional validity. Until this point, change validity only specifies how function changes behave, that is, their extension. Using operational semantics, we can also specify how valid function changes are concretely defined, that is, their intension. To distinguish the two concepts, we contrast extensional validity and intensional validity. Some extensionally valid changes are not intensionally valid, but such changes are never created by derivation. Defining intensional validity helps to understand function changes: function changes are produced from changes to values in environments, or from functions being replaced altogether. Requiring intensional validity helps to implement change operations such as ⊕ more efficiently, by operating on environments. Later, in Appendix D, we use similar ideas to implement change operations on defunctionalized function changes; intensional validity helps put such efforts on more robust foundations, even though we do not account formally for defunctionalization but only for the use of closures in the semantics.
We use operational semantics to define extensional validity in Appendix C.3 (using plain logical relations) and Appendix C.4 (using step-indexed logical relations). We switch to the intensional validity definition in Appendix C.5.
Non-termination and general recursion. This proof implies correctness of ILC in the presence of general recursion, because untyped λ-calculus supports general recursion via fixpoint combinators. However, the proof only applies to terminating executions of base programs: like for earlier authors [Acar, Ahmed and Blume 2008], we prove that if a function terminates against both a base input v1 and an updated one v2, its derivative terminates against the base input and a valid input change dv from v1 to v2.
We can also add a fixpoint construct to our typed λ-calculus and to our mechanization, without significant changes to our relations. However, a mechanical proof would require use of well-founded induction, which our current mechanization avoids. We return to this point in Appendix C.7.
While this support for general recursion is effective in some scenarios, other scenarios can still be incrementalized better using structural recursion. More efficient support for general recursion is a separate problem that we do not tackle here and leave for future work; we refer to the discussion in Sec. 15.1.
Correctness statement. Our final correctness theorem is a variant of Corollary 13.4.5, which we repeat for comparison:
3 https://www.cs.uoregon.edu/research/summerschool/summer13/curriculum.html
Corollary 13.4.5 (D⟦–⟧ is correct, corollary). If Γ ⊢ t : τ and dρ ▷ ρ1 → ρ2 : Γ, then ⟦t⟧ρ1 ⊕ ⟦D⟦t⟧⟧dρ = ⟦t⟧ρ2.
We present our final correctness theorem statement in Appendix C.5. We anticipate it here for illustration: the new statement is more explicit about evaluation, but otherwise broadly similar.
Corollary C.5.6 (D⟦–⟧ is correct, corollary). Take any term t that is well-typed (Γ ⊢ t : τ) and any suitable environments ρ1, dρ, ρ2 that are intensionally valid at any step count (∀k. (k, ρ1, dρ, ρ2) ∈ RG⟦Γ⟧). Assume t terminates in both the old environment ρ1 and the new environment ρ2, evaluating to output values v1 and v2 (ρ1 ⊢ t ⇓ v1 and ρ2 ⊢ t ⇓ v2). Then D⟦t⟧ evaluates in environment ρ1 and change environment dρ to a change value dv (ρ1, dρ ⊢Δ D⟦t⟧ ⇓ dv), and dv is a valid change from v1 to v2, so that v1 ⊕ dv = v2.
Overall, in this chapter we present the following contributions:
• We give an alternative presentation of derivation that can be mechanized without any binding-related lemmas, not even weakening-related ones, by introducing a separate syntax for change terms (Appendix C.1).
• We prove formally ILC correct for STLC, using big-step semantics and logical relations (Appendix C.3).
• We show formally (with pen-and-paper proofs) that our semantics is equivalent to small-step semantics definitions (Appendix C.2).
• We introduce a formalized step-indexed variant of our definitions and proofs for simply-typed λ-calculus (Appendix C.4), which scales directly to definitions and proofs for untyped λ-calculus.
• For typed λ-calculus, we also mechanize our step-indexed proof.
• In addition to (extensional) validity, we introduce a concept of intensional validity, which captures formally that function changes arise from changing environments or from functions being replaced altogether (Appendix C.5).
C.1 Formalization
To present the proofs, we first describe our formal model of a CBV ANF λ-calculus. We define an untyped ANF language, called λA. We also define a simply-typed variant, called λA→, by adding on top of λA a separate Curry-style type system.
In our mechanization of λA→, however, we find it more convenient to define a Church-style type system (that is, a syntax that only describes typing derivations for well-typed terms), separately from the untyped language.
Terms resulting from differentiation satisfy additional invariants, and exploiting those invariants helps simplify the proof. Hence, we define separate languages for change terms produced from differentiation, again in untyped (λΔA) and typed (λΔA→) variants.
The syntax is summarized in Fig. C.1, the type systems in Fig. C.2 and the semantics in Fig. C.3. The base languages are mostly standard, while the change languages make a few different choices.
196 Chapter C (Un)typed ILC operationally
(a) Base terms (λA):

p ::= succ | add | ...
m, n ::= ...        -- numbers
c ::= n | ...
w ::= x | λx → t | ⟨w, w⟩ | c
s, t ::= w | w w | p w | let x = t in t
v ::= ρ [λx → t] | ⟨v, v⟩ | c
ρ ::= x1 = v1, ..., xn = vn

(b) Change terms (λΔA):

dc ::= 0
dw ::= dx | λx dx → dt | ⟨dw, dw⟩ | dc
ds, dt ::= dw | dw w dw | 0p w dw | let x = t, dx = dt in dt
dv ::= ρ, dρ [λx dx → dt] | ⟨dv, dv⟩ | dc
dρ ::= dx1 = dv1, ..., dxn = dvn

(c) Differentiation:

DC⟦n⟧ = 0
D⟦x⟧ = dx
D⟦λx → t⟧ = λx dx → D⟦t⟧
D⟦⟨wa, wb⟩⟧ = ⟨D⟦wa⟧, D⟦wb⟧⟩
D⟦c⟧ = DC⟦c⟧
D⟦wf wa⟧ = D⟦wf⟧ wa D⟦wa⟧
D⟦p w⟧ = 0p w D⟦w⟧
D⟦let x = s in t⟧ = let x = s, dx = D⟦s⟧ in D⟦t⟧

(d) Types and contexts:

τ ::= N | τa × τb | σ → τ
ΔN = N
Δ(τa × τb) = Δτa × Δτb
Δ(σ → τ) = σ → Δσ → Δτ
Δε = ε
Δ(Γ, x : τ) = ΔΓ, dx : Δτ

Figure C.1: ANF λ-calculus, λA and λΔA.
C.1.1 Types and contexts

We show the syntax of types, contexts and change types in Fig. C.1d. We introduce types for functions, binary products and naturals. Tuples can be encoded as usual through nested pairs. Change types are mostly like earlier, but this time we use naturals as changes for naturals (hence, we cannot define a total ⊖ operation).
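To make this concrete, here is a minimal Python sketch (ours, not part of the mechanization) of the change operations at type N when ΔN = N: ⊕ is total addition, while ⊖, which should compute a change from n1 to n2, is only partial on naturals.

```python
# Sketch: changes for naturals are naturals (ΔN = N), as in this chapter.
# Then oplus (⊕) is total, but ominus (⊖) is partial: n2 - n1 may
# fall outside the naturals.

def oplus(n1: int, dn: int) -> int:
    """n1 ⊕ dn: always defined on naturals."""
    return n1 + dn

def ominus(n2: int, n1: int) -> int:
    """n2 ⊖ n1: defined only when n2 >= n1."""
    if n2 < n1:
        raise ValueError(f"no natural change from {n1} to {n2}")
    return n2 - n1

assert oplus(5, ominus(7, 5)) == 7  # v1 ⊕ (v2 ⊖ v1) = v2, when defined
```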
We modify the definition of change contexts and environment changes to not contain entries for base values: in this presentation, we use separate environments for base variables and change variables. This choice avoids the need to define weakening lemmas.
C.1.2 Base syntax for λA

For convenience, we consider a λ-calculus in A-normal form. We do not parameterize this calculus over language plugins, to reduce mechanization overhead, but we define separate syntactic categories for possible extension points.
We show the syntax of terms in Fig. C.1a.

Meta-variable v ranges over (closed) syntactic values, that is, evaluation results. Values are numbers, pairs of values or closures. A closure is a pair of a function and an environment, as usual. Environments ρ are finite maps from variables to syntactic values; in our mechanization using de Bruijn indexes, environments are in particular finite lists of syntactic values.

Meta-variable t ranges over arbitrary terms and w ranges over neutral forms. Neutral forms evaluate to values in zero steps, but unlike values they can be open: a neutral form is either a
(a) λA→ base typing (judgement Γ ⊢ t : τ):

T-Lit: ⊢C n : N
T-Succ: ⊢P succ : N → N
T-Add: ⊢P add : (N × N) → N
T-Var: x : τ ∈ Γ ⟹ Γ ⊢ x : τ
T-Const: ⊢C c : τ ⟹ Γ ⊢ c : τ
T-App: Γ ⊢ wf : σ → τ and Γ ⊢ wa : σ ⟹ Γ ⊢ wf wa : τ
T-Lam: Γ, x : σ ⊢ t : τ ⟹ Γ ⊢ λx → t : σ → τ
T-Let: Γ ⊢ s : σ and Γ, x : σ ⊢ t : τ ⟹ Γ ⊢ let x = s in t : τ
T-Pair: Γ ⊢ wa : τa and Γ ⊢ wb : τb ⟹ Γ ⊢ ⟨wa, wb⟩ : τa × τb
T-Prim: ⊢P p : σ → τ and Γ ⊢ w : σ ⟹ Γ ⊢ p w : τ

(b) λΔA→ change typing (judgement Γ ⊢Δ dt : τ; it means that variables from both Γ and ΔΓ are in scope in dt, and the final type is in fact Δτ):

T-DVar: x : τ ∈ Γ ⟹ Γ ⊢Δ dx : τ
T-DConst: ⊢C c : Δτ ⟹ Γ ⊢Δ c : τ
T-DApp: Γ ⊢Δ dwf : σ → τ and Γ ⊢ wa : σ and Γ ⊢Δ dwa : σ ⟹ Γ ⊢Δ dwf wa dwa : τ
T-DLet: Γ ⊢ s : σ and Γ ⊢Δ ds : σ and Γ, x : σ ⊢Δ dt : τ ⟹ Γ ⊢Δ let x = s, dx = ds in dt : τ
T-DLam: Γ, x : σ ⊢Δ dt : τ ⟹ Γ ⊢Δ λx dx → dt : σ → τ
T-DPair: Γ ⊢Δ dwa : τa and Γ ⊢Δ dwb : τb ⟹ Γ ⊢Δ ⟨dwa, dwb⟩ : τa × τb
T-DPrim: ⊢P p : σ → τ and Γ ⊢ w : σ and Γ ⊢Δ dw : σ ⟹ Γ ⊢Δ 0p w dw : τ

Figure C.2: ANF λ-calculus, λA→ and λΔA→ type system.
(a) Base semantics. Judgement ρ ⊢ t ⇓n v says that ρ ⊢ t, a pair of environment ρ and term t, evaluates to value v in n steps, and app vf va ⇓n v constructs the pair to evaluate via app vf va:

E-Val: ρ ⊢ w ⇓0 V⟦w⟧ρ
E-Prim: ρ ⊢ p w ⇓1 P⟦p⟧ (V⟦w⟧ρ)
E-Let: ρ ⊢ s ⇓ns vs and (ρ, x = vs) ⊢ t ⇓nt vt ⟹ ρ ⊢ let x = s in t ⇓1+ns+nt vt
E-App: ρ ⊢ wf ⇓0 vf and ρ ⊢ wa ⇓0 va and app vf va ⇓n v ⟹ ρ ⊢ wf wa ⇓1+n v

app (ρ′ [λx → t]) va = (ρ′, x = va) ⊢ t
V⟦x⟧ρ = ρ(x)
V⟦λx → t⟧ρ = ρ [λx → t]
V⟦⟨wa, wb⟩⟧ρ = ⟨V⟦wa⟧ρ, V⟦wb⟧ρ⟩
V⟦c⟧ρ = c
P⟦succ⟧ n = 1 + n
P⟦add⟧ (m, n) = m + n

(b) Change semantics. Judgement ρ, dρ ⊢Δ dt ⇓ dv says that ρ, dρ ⊢Δ dt, a triple of environment ρ, change environment dρ and change term dt, evaluates to change value dv, and dapp dvf va dva ⇓ dv constructs the triple to evaluate via dapp dvf va dva:

E-DVal: ρ, dρ ⊢Δ dw ⇓ VΔ⟦dw⟧ρ dρ
E-DPrim: ρ, dρ ⊢Δ 0p w dw ⇓ PΔ⟦p⟧ (V⟦w⟧ρ) (VΔ⟦dw⟧ρ dρ)
E-DApp: ρ, dρ ⊢Δ dwf ⇓ dvf and ρ ⊢ wa ⇓ va and ρ, dρ ⊢Δ dwa ⇓ dva and dapp dvf va dva ⇓ dv ⟹ ρ, dρ ⊢Δ dwf wa dwa ⇓ dv
E-DLet: ρ ⊢ s ⇓ vs and ρ, dρ ⊢Δ ds ⇓ dvs and (ρ, x = vs), (dρ, dx = dvs) ⊢Δ dt ⇓ dvt ⟹ ρ, dρ ⊢Δ let x = s, dx = ds in dt ⇓ dvt

dapp (ρ′, dρ′ [λx dx → dt]) va dva = (ρ′, x = va), (dρ′, dx = dva) ⊢Δ dt
VΔ⟦dx⟧ρ dρ = dρ(dx)
VΔ⟦λx dx → dt⟧ρ dρ = ρ, dρ [λx dx → dt]
VΔ⟦⟨dwa, dwb⟩⟧ρ dρ = ⟨VΔ⟦dwa⟧ρ dρ, VΔ⟦dwb⟧ρ dρ⟩
VΔ⟦c⟧ρ dρ = c
PΔ⟦succ⟧ n dn = dn
PΔ⟦add⟧ ⟨m, n⟩ ⟨dm, dn⟩ = dm + dn

Figure C.3: ANF λ-calculus (λA and λΔA) CBV semantics.
variable, a constant value c, a λ-abstraction or a pair of neutral forms.

A term is either a neutral form, an application of neutral forms, a let expression, or an application of a primitive function p to a neutral form. Multi-argument primitives are encoded as primitives taking (nested) tuples of arguments. Here we use literal numbers as constants, and +1 and addition as primitives (to illustrate different arities), but further primitives are possible.
Notation C.1.1. We use subscripts a, b for pair components, f, a for function and argument, and keep using 1, 2 for old and new values.
C.1.3 Change syntax for λΔA

Next, we consider a separate language for change terms, which can be transformed into the base language. This language supports directly the structure of change terms: base variables and change variables live in separate namespaces. As we show later, for the typed language those namespaces are represented by typing contexts Γ and ΔΓ; that is, the typing context for change variables is always the change context for Γ.
We show the syntax of change terms in Fig. C.1b.

Change terms often take or bind two parameters at once: one for a base value and one for its change. Since a function change is applied to a base input and a change for it at once, the syntax for change terms has a special binary application node dwf wa dwa; otherwise, in ANF, such syntax must be encoded through separate applications, via let dfa = dwf wa in dfa dwa. In the same way, closure changes ρ, dρ [λx dx → dt] bind two variables at once and close over separate environments for base and change variables. Various other choices in the same spirit simplify similar formalization and mechanization details.
In change terms, we write 0p as syntax for the derivative of p, evaluated as such by the semantics. Strictly speaking, differentiation must map primitives to standard terms, so that the resulting programs can be executed by a standard semantics; hence, we should replace 0p by a concrete implementation of the derivative of p. However, doing so in a new formalization yields little additional insight, and requires writing concrete derivatives of primitives as de Bruijn terms.
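For illustration, the derivatives of our two primitives can be written concretely. This Python sketch (our encoding, with Python tuples for pairs) mirrors the equations for PΔ⟦succ⟧ and PΔ⟦add⟧ from Fig. C.3b, and checks the expected validity equation on sample inputs.

```python
# Concrete derivatives for the primitives (a sketch; the change language
# keeps 0p abstract). Validity at N means n1 + dn = n2, so a correct
# derivative satisfies: p v1 + dp v1 dv = p v2 whenever v2 = v1 ⊕ dv.

def succ(n): return 1 + n              # P⟦succ⟧ n = 1 + n
def dsucc(n, dn): return dn            # PΔ⟦succ⟧ n dn = dn

def add(p): return p[0] + p[1]         # P⟦add⟧ (m, n) = m + n
def dadd(p, dp): return dp[0] + dp[1]  # PΔ⟦add⟧ ⟨m,n⟩ ⟨dm,dn⟩ = dm + dn

# sample validity checks: updated input = base input ⊕ change, componentwise
assert succ(2) + dsucc(2, 5) == succ(2 + 5)
assert add((2, 3)) + dadd((2, 3), (10, 1)) == add((2 + 10, 3 + 1))
```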
C.1.4 Differentiation

We show differentiation in Fig. C.1c. Differentiation maps constructs in the language of base terms one-to-one to constructs in the language of change terms.
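This one-to-one mapping can be rendered directly as a structural recursion over the AST. The following Python sketch uses our own tuple encoding of the syntax from Fig. C.1 (constructor tags are our invention, not the thesis's mechanization); each branch transcribes one equation of Fig. C.1c.

```python
# Differentiation D⟦–⟧ over (our encoding of) the ANF base syntax:
# ('const', n), ('var', x), ('lam', x, t), ('pair', wa, wb),
# ('app', wf, wa), ('prim', p, w), ('let', x, s, t).

def derive(t):
    tag = t[0]
    if tag == 'const':                 # DC⟦n⟧ = 0
        return ('dconst', 0)
    if tag == 'var':                   # D⟦x⟧ = dx
        return ('dvar', t[1])
    if tag == 'lam':                   # D⟦λx → t⟧ = λx dx → D⟦t⟧
        return ('dlam', t[1], derive(t[2]))
    if tag == 'pair':                  # D⟦⟨wa, wb⟩⟧ = ⟨D⟦wa⟧, D⟦wb⟧⟩
        return ('dpair', derive(t[1]), derive(t[2]))
    if tag == 'app':                   # D⟦wf wa⟧ = D⟦wf⟧ wa D⟦wa⟧
        return ('dapp', derive(t[1]), t[1], derive(t[2]))
    if tag == 'prim':                  # D⟦p w⟧ = 0p w D⟦w⟧
        return ('dprim', t[1], t[2], derive(t[2]))
    if tag == 'let':                   # D⟦let x = s in t⟧ = let x = s, dx = D⟦s⟧ in D⟦t⟧
        return ('dlet', t[1], t[2], derive(t[2]), derive(t[3]))
    raise ValueError(f"unknown construct {tag}")

assert derive(('lam', 'x', ('var', 'x'))) == ('dlam', 'x', ('dvar', 'x'))
```

Note how the derivative of an application passes along the base argument wa unchanged, producing the binary change application node of λΔA.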
C.1.5 Typing λA→ and λΔA→

We define typing judgements for λA→ base terms and for λΔA→ change terms. We show the typing rules in Fig. C.2.
Typing for base terms is mostly standard. We use judgements ⊢P p and ⊢C c to specify the typing of primitive functions and constant values. For change terms, one could expect a type system only proving judgements with shape Γ, ΔΓ ⊢ dt : Δτ (where Γ, ΔΓ stands for the concatenation of Γ and ΔΓ). To simplify inversion on such judgements (especially in Agda), we write instead Γ ⊢Δ dt : τ, so that one can verify the following derived typing rule for D⟦–⟧:
T-Derive: Γ ⊢ t : τ ⟹ Γ ⊢Δ D⟦t⟧ : τ
We also use mutually recursive typing judgements ⊨ v : τ for values and ⊨ ρ : Γ for environments, and similarly ⊨Δ dv : τ for change values and ⊨Δ dρ : Γ for change environments. We only show the (unsurprising) rules for ⊨ v : τ and omit the others. One could alternatively (and equivalently) define typing on syntactic values v by translating them to neutral forms w = v* (using unsurprising definitions in Appendix C.2) and reusing typing on terms, but as usual we prefer to avoid substitution.

TV-Nat: ⊨ n : N
TV-Pair: ⊨ va : τa and ⊨ vb : τb ⟹ ⊨ ⟨va, vb⟩ : τa × τb
TV-Lam: Γ, x : σ ⊢ t : τ and ⊨ ρ : Γ ⟹ ⊨ ρ [λx → t] : σ → τ
C.1.6 Semantics

We present our semantics for base terms in Fig. C.3a. Our semantics gives meaning to pairs of an environment ρ and a term t, consistent with each other, which we write ρ ⊢ t. By "consistent" we mean that ρ contains values for all free variables of t and, in a typed language, values of compatible types (if Γ ⊢ t : τ then ⊨ ρ : Γ). Judgement ρ ⊢ t ⇓n v says that ρ ⊢ t evaluates to value v in n steps. The definition is given via a CBV big-step semantics. Following Acar, Ahmed and Blume [2008], we index our evaluation judgements via a step count, which counts, in essence, β-reduction steps; we use such step counts later to define step-indexed logical relations. Since our semantics uses environments, β-reduction steps are implemented not via substitution but via environment extension, but the resulting step counts are the same (Appendix C.2). Applying closure vf = ρ′ [λx → t] to argument va produces the environment-term pair (ρ′, x = va) ⊢ t, which we abbreviate as app vf va. We'll reuse this syntax later to define logical relations.
In our mechanized formalization, we have additionally proved lemmas to ensure that this semantics is sound relative to our earlier denotational semantics (adapted for the ANF syntax).
Evaluation of primitives is delegated to function P⟦–⟧ –. We show complete equations for the typed case; for the untyped case, we must turn P⟦–⟧ – and PΔ⟦–⟧ – into relations (or add explicit error results), but we omit the standard details (see also Appendix C.6). For simplicity, we assume evaluation of primitives takes one step. We conjecture that higher-order primitives might need to be assigned different costs, but leave details for future work.
We can evaluate neutral forms w to syntactic values v using a simple evaluation function V⟦w⟧ρ, and use P⟦p⟧ v to evaluate primitives. When we need to omit indexes, we write ρ ⊢ t ⇓ v to mean that for some n we have ρ ⊢ t ⇓n v.
We can also define an analogous non-indexed big-step semantics for change terms, and we present it in Fig. C.3b.
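To see how the step counts behave, the base semantics of Fig. C.3a can be transcribed as a small definitional interpreter. This Python sketch (our encoding; closures represented as ('clo', ρ, x, t) tuples) charges one step for E-App, E-Let and E-Prim, and zero steps for E-Val, like the rules above.

```python
# Environment-based big-step evaluator returning (value, step_count),
# i.e. ρ ⊢ t ⇓n v. Neutral forms cost zero steps (E-Val).

def eval_w(w, rho):                        # V⟦w⟧ρ
    tag = w[0]
    if tag == 'var':   return rho[w[1]]
    if tag == 'const': return w[1]
    if tag == 'lam':   return ('clo', dict(rho), w[1], w[2])
    if tag == 'pair':  return (eval_w(w[1], rho), eval_w(w[2], rho))
    raise ValueError(tag)

def eval_t(t, rho):                        # ρ ⊢ t ⇓n v
    tag = t[0]
    if tag == 'app':                       # E-App: 1 + steps for the body
        _, rho1, x, body = eval_w(t[1], rho)
        va = eval_w(t[2], rho)
        v, n = eval_t(body, {**rho1, x: va})
        return v, 1 + n
    if tag == 'let':                       # E-Let: 1 + ns + nt
        vs, ns = eval_t(t[2], rho)
        vt, nt = eval_t(t[3], {**rho, t[1]: vs})
        return vt, 1 + ns + nt
    if tag == 'prim':                      # E-Prim: one step
        v = eval_w(t[2], rho)
        return ((1 + v) if t[1] == 'succ' else v[0] + v[1]), 1
    return eval_w(t, rho), 0               # E-Val

# (λx → succ x) 2 evaluates to 3 in two steps: one β-step, one primitive step
term = ('app', ('lam', 'x', ('prim', 'succ', ('var', 'x'))), ('const', 2))
assert eval_t(term, {}) == (3, 2)
```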
C.1.7 Type soundness

Evaluation preserves types in the expected way.
Lemma C.1.2 (Big-step preservation).
1. If Γ ⊢ t : τ, ⊨ ρ : Γ and ρ ⊢ t ⇓n v, then ⊨ v : τ.
2. If Γ ⊢Δ dt : τ, ⊨ ρ : Γ, ⊨Δ dρ : Γ and ρ, dρ ⊢Δ dt ⇓ dv, then ⊨Δ dv : τ.
Proof. By structural induction on evaluation judgements. In our intrinsically typed mechanization, this is actually part of the definition of values and big-step evaluation rules.
To ensure that our semantics for λA→ is complete for the typed language, instead of proving a small-step progress lemma or extending the semantics with errors, we just prove that all typed terms normalize, in the standard way. As usual, this fails if we add fixpoints, or for untyped terms. If we wanted to ensure type safety in such a case, we could switch to functional big-step semantics or definitional interpreters [Amin and Rompf 2017; Owens et al. 2016].
Theorem C.1.3 (CBV normalization). For any well-typed and closed term ⊢ t : τ, there exist a step count n and a value v such that ⊢ t ⇓n v.
Proof. A variant of standard proofs of normalization for STLC [Pierce 2002, Ch. 12], adapted for big-step semantics rather than small-step semantics (similarly to Appendix C.3). We omit the needed definitions and refer interested readers to our Agda mechanization.
We haven't attempted to prove this lemma for arbitrary change terms (though we expect we could prove it by defining an erasure to the base language and relating evaluation results), but we prove it for the result of differentiating well-typed terms in Corollary C.3.2.
C.2 Validating our step-indexed semantics

In this section, we show how we ensure that the step counts in our base semantics are set correctly, and how we can relate this environment-based semantics to more conventional semantics based on substitution and/or small-step. We only consider the core calculus, without primitives, constants and pairs. Results from this section are not needed later; we have proved them formally on paper, but not mechanized them, as our goal is to use environment-based big-step semantics in our mechanization.
To this end, we relate our semantics first with a big-step semantics based on substitution (rather than environments), and then relate this alternative semantics to a small-step semantics. Results in this section are useful to understand our semantics better, and as a design aide to modify it, but are not necessary to the proof, so we have not mechanized them.
As a consequence, we also conjecture that our logical relations and proofs could be adapted to small-step semantics, along the lines of Ahmed [2006]. We however do not find that necessary. While small-step semantics gives meaning to non-terminating programs, and that is important for type soundness proofs, it does not seem useful (or possible) to try to incrementalize them, or to ensure we do so correctly.
In proofs using step-indexed logical relations, the use of step counts in definitions is often delicate and tricky to get right. But Acar, Ahmed and Blume provide a robust recipe to ensure correct step-indexing in the semantics. To be able to prove the fundamental property of logical relations, we ensure step counts agree with the ones induced by small-step semantics (which counts β-reductions). Such a lemma is not actually needed in other proofs, but only useful as a sanity check. We also attempted using the style of step-indexing used by Amin and Rompf [2017], but were unable to produce a proof. To the best of our knowledge, all proofs using step-indexed logical relations, even with functional big-step semantics [Owens et al. 2016], use step-indexing that agrees with small-step semantics.
Unlike Acar, Ahmed and Blume, we use environments in our big-step semantics; this avoids the need to define substitution in our mechanized proof. Nevertheless, one can show the two semantics correspond to each other. Our environments ρ can be regarded as closed value substitutions, as long as we also substitute away environments in values. Formally, we write ρ*(t) for the "homomorphic extension" of substitution ρ to terms, which produces other terms. If v is a value using environments, we write w = v* for the result of translating that value to not use environments; this operation produces a closed neutral form w. Operations ρ*(t) and v* can be defined in a mutually recursive way:
ρ*(x) = (ρ(x))*
ρ*(λx → t) = λx → ρ*(t)
ρ*(w1 w2) = ρ*(w1) ρ*(w2)
ρ*(let x = t1 in t2) = let x = ρ*(t1) in ρ*(t2)
(ρ [λx → t])* = λx → ρ*(t)
If ρ ⊢ t ⇓n v in our semantics, a standard induction over the derivation of ρ ⊢ t ⇓n v shows that ρ*(t) ⇓n v* in a more conventional big-step semantics, using substitution rather than environments (also following Acar, Ahmed and Blume):
Var′: x ⇓0 x
Lam′: λx → t ⇓0 λx → t
App′: t [x := w2] ⇓n w′ ⟹ (λx → t) w2 ⇓1+n w′
Let′: t1 ⇓n1 w1 and t2 [x := w1] ⇓n2 w2 ⟹ let x = t1 in t2 ⇓1+n1+n2 w2
In this form, it is more apparent that the step indexes count steps of β-reduction or substitution. It's also easy to see that this big-step semantics agrees with a standard small-step semantics ↦ (which we omit): if t ⇓n w, then t ↦n w. Overall, the two statements can be composed, so our original semantics agrees with small-step semantics: if ρ ⊢ t ⇓n v, then ρ*(t) ⇓n v*, and finally ρ*(t) ↦n v*. Hence, we can translate evaluation derivations using big-step semantics to derivations using small-step semantics, with the same step count.
However, to state and prove the fundamental property, we need not prove that our semantics is sound relative to some other semantics. We simply define the appropriate logical relation for validity and show it agrees with a suitable definition for ⊕.
Having defined our semantics, we proceed to define extensional validity.
C.3 Validity, syntactically (λA→, λΔA→)

For our typed language λA→, at least as long as we do not add fixpoint operators, we can define logical relations using big-step semantics, but without using step-indexes. The resulting relations are well-founded only because they use structural recursion on types. We present in Fig. C.4 the needed definitions, as a stepping stone to the definitions using step-indexed logical relations.
Following Ahmed [2006] and Acar, Ahmed and Blume [2008], we encode extensional validity through two mutually recursive type-indexed families of ternary logical relations: RV⟦τ⟧ over closed values, and RC⟦τ⟧ over terms (and environments).
These relations are analogous to notions we considered earlier and express similar informal notions.
• With denotational semantics, we write dv ▷ v1 → v2 : τ to say that change value dv ∈ ⟦Δτ⟧ is a valid change from v1 to v2 at type τ. With operational semantics, instead, we write (v1, dv, v2) ∈ RV⟦τ⟧, where v1, dv and v2 are now closed syntactic values.
• For terms, with denotational semantics we write ⟦dt⟧ dρ ▷ ⟦t1⟧ ρ1 → ⟦t2⟧ ρ2 : τ to say that dt is a valid change from t1 to t2, considering the respective environments. With operational semantics, instead, we write (ρ1 ⊢ t1, ρ, dρ ⊢Δ dt, ρ2 ⊢ t2) ∈ RC⟦τ⟧.
Since we use Church typing and only mechanize typed terms, we must include in all cases appropriate typing assumptions.
Relation RC⟦τ⟧ relates tuples of environments and computations, ρ1 ⊢ t1, ρ, dρ ⊢Δ dt and ρ2 ⊢ t2: it holds if, whenever t1 evaluates in environment ρ1 to v1 and t2 evaluates in environment ρ2 to v2, then dt evaluates in environments ρ and dρ to a change value dv, with v1, dv, v2 related by RV⟦τ⟧. The environments themselves need not be related: this definition characterizes validity extensionally, that is, it can relate t1, dt and t2 that have unrelated implementations and unrelated environments (in fact, even unrelated typing contexts). This flexibility is useful when relating closures of type σ → τ: two closures might be related even if they close over environments of different shape. For instance, closures v1 = [λx → 0] and v2 = (y = 0) [λx → y] are related by a nil change such as dv = [λx dx → 0]. In Appendix C.5 we discuss instead an intensional definition of validity.
In particular, for function types, the relation RV⟦σ → τ⟧ relates function values f1, df and f2 if they map related input values (and, for df, input changes) to related output computations.
We also extend the relation on values to environments, via RG⟦Γ⟧: environments are related if their corresponding entries are related values.
RV⟦N⟧ = { (n1, dn, n2) | n1, dn, n2 ∈ N ∧ n1 + dn = n2 }
RV⟦τa × τb⟧ = { (⟨va1, vb1⟩, ⟨dva, dvb⟩, ⟨va2, vb2⟩) | (va1, dva, va2) ∈ RV⟦τa⟧ ∧ (vb1, dvb, vb2) ∈ RV⟦τb⟧ }
RV⟦σ → τ⟧ = { (vf1, dvf, vf2) | ⊨ vf1 : σ → τ ∧ ⊨Δ dvf : σ → τ ∧ ⊨ vf2 : σ → τ ∧ ∀(v1, dv, v2) ∈ RV⟦σ⟧. (app vf1 v1, dapp dvf v1 dv, app vf2 v2) ∈ RC⟦τ⟧ }
RC⟦τ⟧ = { (ρ1 ⊢ t1, ρ, dρ ⊢Δ dt, ρ2 ⊢ t2) | (∃Γ1, Γ, Γ2. Γ1 ⊢ t1 : τ ∧ Γ ⊢Δ dt : τ ∧ Γ2 ⊢ t2 : τ) ∧ ∀v1, v2. (ρ1 ⊢ t1 ⇓ v1 ∧ ρ2 ⊢ t2 ⇓ v2) ⟹ ∃dv. ρ, dρ ⊢Δ dt ⇓ dv ∧ (v1, dv, v2) ∈ RV⟦τ⟧ }
RG⟦ε⟧ = { (∅, ∅, ∅) }
RG⟦Γ, x : τ⟧ = { ((ρ1, x = v1), (dρ, dx = dv), (ρ2, x = v2)) | (ρ1, dρ, ρ2) ∈ RG⟦Γ⟧ ∧ (v1, dv, v2) ∈ RV⟦τ⟧ }
Γ ⊨ dt ▷ t1 → t2 : τ = ∀(ρ1, dρ, ρ2) ∈ RG⟦Γ⟧. (ρ1 ⊢ t1, ρ1, dρ ⊢Δ dt, ρ2 ⊢ t2) ∈ RC⟦τ⟧

Figure C.4: Defining extensional validity via logical relations and big-step semantics.
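To build intuition for these definitions, one can test candidate triples against a finite approximation of RV⟦τ⟧. The Python sketch below (our encoding, with Python functions standing in for closures and plain values standing in for computations) checks the N and product cases exactly, but the function case only on supplied sample triples, so it is a necessary-condition check, not the real universally-quantified definition.

```python
# Finite sanity-checker for RV⟦τ⟧: exact at N and pair types, sampled at
# function types. Types encoded as 'N', ('pair', a, b), ('fun', a, b).

def rv(ty, v1, dv, v2, samples=()):
    if ty == 'N':                       # (n1, dn, n2) ∈ RV⟦N⟧ iff n1 + dn = n2
        return v1 + dv == v2
    if ty[0] == 'pair':                 # componentwise on both components
        return (rv(ty[1], v1[0], dv[0], v2[0], samples) and
                rv(ty[2], v1[1], dv[1], v2[1], samples))
    # 'fun': related arguments must go to related results (sampled, not ∀)
    return all(rv(ty[2], v1(a1), dv(a1)(da), v2(a2), samples)
               for (a1, da, a2) in samples)

def f1(x): return x + 1                 # old function
def f2(x): return x + 3                 # updated function
def df(x): return lambda dx: dx + 2     # a valid change: df x dx = dx + 2

args = [(n, dn, n + dn) for n in range(5) for dn in range(5)]
assert rv('N', 5, 2, 7)
assert not rv('N', 5, 1, 7)
assert rv(('fun', 'N', 'N'), f1, df, f2, args)
```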
Given these definitions, one can prove the fundamental property:
Theorem C.3.1 (Fundamental property: correctness of D⟦–⟧). For every well-typed term Γ ⊢ t : τ, we have that Γ ⊨ D⟦t⟧ ▷ t → t : τ.
Proof sketch. By induction on the structure of terms, using ideas similar to Theorem 12.2.2.
It also follows that D⟦t⟧ normalizes:
Corollary C.3.2 (D⟦–⟧ normalizes to a nil change). For any well-typed and closed term ⊢ t : τ, there exist a value v and a change value dv such that ⊢ t ⇓ v, ⊢Δ D⟦t⟧ ⇓ dv and (v, dv, v) ∈ RV⟦τ⟧.
Proof. A corollary of the fundamental property and of the normalization theorem (Theorem C.1.3): since t is well-typed and closed, it normalizes to a value v for some step count (⊢ t ⇓ v). The empty environment change is valid, (∅, ∅, ∅) ∈ RG⟦ε⟧, so from the fundamental property we get that (⊢ t, ⊢Δ D⟦t⟧, ⊢ t) ∈ RC⟦τ⟧. From the definition of RC⟦τ⟧ and ⊢ t ⇓ v, it follows that there exists dv such that ⊢Δ D⟦t⟧ ⇓ dv and (v, dv, v) ∈ RV⟦τ⟧.
Remark C.3.3. Compared to prior work, these relations are unusual for two reasons. First, instead of just relating two executions of a term, we relate two executions of a term with an execution of a change term. Second, most such logical relations (including Ahmed [2006]'s, but excluding Acar et al. [2008]'s) define a logical relation (sometimes called logical equivalence) that characterizes contextual equivalence, while we don't.
Consider a logical equivalence defined through sets RC⟦τ⟧ and RV⟦τ⟧. If (t1, t2) ∈ RC⟦τ⟧ holds and t1 terminates (with result v1), then t2 must terminate as well (with result v2), and their results v1 and v2 must in turn be logically equivalent ((v1, v2) ∈ RV⟦τ⟧). And at base types like N, (v1, v2) ∈ RV⟦N⟧ means that v1 = v2.
Here, instead, the fundamental property relates two executions of a term on different inputs, which might take different paths during execution. In a suitably extended language, we could even write term t = λx → if x = 0 then 1 else loop, and run it on inputs v1 = 0 and v2 = 1: these inputs are related by change dv = 1, but t will converge on v1 and diverge on v2. We must use a semantics that allows such behavioral differences. Hence, at base type N, (v1, dv, v2) ∈ RV⟦N⟧ means just that dv is a change from v1 to v2, hence that v1 ⊕ dv is equivalent to v2 (because ⊕ agrees with extensional validity in this context as well). And if (ρ1 ⊢ t1, ρ, dρ ⊢Δ dt, ρ2 ⊢ t2) ∈ RC⟦τ⟧, term t1 might converge while t2 diverges: only if both converge must their results be related.
These subtleties become more relevant in the presence of general recursion and non-terminating programs, as in the untyped language λA, or in a hypothetical extension of λA→ with fixpoint operators.
C.4 Step-indexed extensional validity (λA→, λΔA→)

Step-indexed logical relations define approximations to a relation, to enable dealing with non-terminating programs. Logical relations relate the behavior of multiple terms during evaluation; with step-indexed logical relations, we can take a bound k and restrict attention to evaluations that take at most k steps overall, as made precise by the definitions. Crucially, entities related at step count k are also related at any step count j < k. Conversely, the higher the step count, the more precise the defined relation. In the limit, if entities are related at all step counts, we simply say they are related. This construction of limits resembles various other approaches to constructing relations by approximations, but the entire framework remains elementary. In particular, the relations are defined simply because they are well-founded (that is, they are only defined by induction on smaller numbers).
Proofs of logical relation statements need to deal with step-indexes, but when they work (as here), they are otherwise not much harder than other syntactic or logical relation proofs.
For instance, if we define equivalence as a step-indexed logical relation, we can say that two terms are equivalent for k or fewer steps, even if they might have different behavior with more steps available. In our case, we can say that a change appears valid at step count k if it behaves like a valid change in "observations" using at most k steps.
Like before we dene a relation on values and one on computations as sets RV nτ o and RC nτ oInstead of indexing the sets with the step-count we add the step counts to the tuples they containso for instance (k v1 dv v2) isin RV nτ o means that value v1 change value dv and value v2 arerelated at step count k (or k-related) and similarly for RC nτ o
The details or the relation denitions are subtle but follow closely the use of step-indexingby Acar Ahmed and Blume [2008] We add mention of changes and must decide how to usestep-indexing for them
How step-indexing proceeds. We explain gradually, in words, how the definition proceeds. First, we say that k-related function values take j-related arguments to j-related results, for all j less than k. That is reflected in the definition of RV⟦σ → τ⟧: it contains (k, vf1, dvf, vf2) if, for all (j, v1, dv, v2) ∈ RV⟦σ⟧ with j < k, the results of application are also j-related. However, the results of application are not syntactic applications encoding vf1 v1⁴. It is instead necessary to use app vf1 v1, the result of one step of reduction. The two definitions are not equivalent, because a syntactic application would take one extra step to reduce.
The definition for computations takes longer to describe. Roughly speaking, computations are k-related if, after j steps of evaluation (with j < k), they produce values related at k − j steps; in particular, if the computations happen to be neutral forms and evaluate in zero steps, they are k-related as computations if the values they produce are k-related. In fact, the rule for evaluation has a wrinkle. Formally, instead of saying that computations (ρ1 ⊢ t1, ρ dρ ⊢∆ dt, ρ2 ⊢ t2) are k-related, we say that (k, ρ1 ⊢ t1, ρ dρ ⊢∆ dt, ρ2 ⊢ t2) ∈ RC⟦τ⟧. We do not require all three computations to take j steps. Instead, if the first computation ρ1 ⊢ t1 evaluates to a value in j < k steps, and the second computation ρ2 ⊢ t2 evaluates to a value in any number of steps, then the change computation ρ dρ ⊢∆ dt must also terminate to a change value dv (in an unspecified number of steps), and the resulting values must be related at step-count k − j (that is, (k − j, v1, dv, v2) ∈ RV⟦τ⟧).
What is new in the above definition is the addition of changes, and the choice to allow change term dt to evaluate to dv in an unbounded number of steps (like t2), as no bound is necessary for our proofs. This is why the semantics we defined for change terms has no step counts.
Well-foundedness of step-indexed logical relations has a small wrinkle, because k − j need not be strictly less than k. But we define the two relations in a mutually recursive way, and the pair of relations at step-count k is defined in terms of the pair of relations at smaller step-counts. All other recursive uses of relations are at smaller step-indexes.
In this section, we index the relation by both types and step-indexes, since this is the version we use in our mechanized proof. This relation is defined by structural induction on types. We show this definition in Fig. C.5. Instead, in Appendix C.6, we consider untyped λ-calculus and drop types. The resulting definition is very similar, but is defined by well-founded recursion on step-indexes.
Again, since we use Church typing and only mechanize typed terms, we must include in all cases appropriate typing assumptions. This choice does not match Ahmed [2006], but is one alternative she describes as equivalent. Indeed, while adapting the proof, the extra typing assumptions and proof obligations were not a problem.
⁴That happens to be illegal syntax in this presentation, but it can be encoded, for instance, as f = vf1, x = v1 ⊢ (f x); and the problem is more general.
RV⟦N⟧ = { (k, n1, dn, n2) | n1, dn, n2 ∈ N ∧ n1 + dn = n2 }

RV⟦τa × τb⟧ = { (k, ⟨va1, vb1⟩, ⟨dva, dvb⟩, ⟨va2, vb2⟩) |
    (k, va1, dva, va2) ∈ RV⟦τa⟧ ∧ (k, vb1, dvb, vb2) ∈ RV⟦τb⟧ }

RV⟦σ → τ⟧ = { (k, vf1, dvf, vf2) |
    ⊢ vf1 : σ → τ ∧ ⊢∆ dvf : σ → τ ∧ ⊢ vf2 : σ → τ ∧
    ∀(j, v1, dv, v2) ∈ RV⟦σ⟧. j < k ⇒
      (j, app vf1 v1, dapp dvf v1 dv, app vf2 v2) ∈ RC⟦τ⟧ }

RC⟦τ⟧ = { (k, ρ1 ⊢ t1, ρ dρ ⊢∆ dt, ρ2 ⊢ t2) |
    (∃Γ1, Γ, Γ2. Γ1 ⊢ t1 : τ ∧ Γ ⊢∆ dt : τ ∧ Γ2 ⊢ t2 : τ) ∧
    ∀j, v1, v2. (j < k ∧ ρ1 ⊢ t1 ⇓j v1 ∧ ρ2 ⊢ t2 ⇓ v2) ⇒
      ∃dv. ρ dρ ⊢∆ dt ⇓ dv ∧ (k − j, v1, dv, v2) ∈ RV⟦τ⟧ }

RG⟦ε⟧ = { (k) }

RG⟦Γ, x : τ⟧ = { (k, (ρ1, x = v1), (dρ, dx = dv), (ρ2, x = v2)) |
    (k, ρ1, dρ, ρ2) ∈ RG⟦Γ⟧ ∧ (k, v1, dv, v2) ∈ RV⟦τ⟧ }

Γ ⊨ dt ▷ t1 → t2 : τ = ∀(k, ρ1, dρ, ρ2) ∈ RG⟦Γ⟧.
    (k, ρ1 ⊢ t1, ρ1 dρ ⊢∆ dt, ρ2 ⊢ t2) ∈ RC⟦τ⟧

Figure C.5: Defining extensional validity via step-indexed logical relations and big-step semantics.
At this moment, we do not require that related closures contain related environments: again, we are defining extensional validity.
Given these definitions, we can prove that all relations are downward-closed: that is, relations at step-count n imply relations at step-count k < n.
Lemma C.4.1 (Extensional validity is downward-closed)
Assume k ≤ n.
1. If (n, v1, dv, v2) ∈ RV⟦τ⟧ then (k, v1, dv, v2) ∈ RV⟦τ⟧.
2. If (n, ρ1 ⊢ t1, ρ dρ ⊢∆ dt, ρ2 ⊢ t2) ∈ RC⟦τ⟧ then
(k, ρ1 ⊢ t1, ρ dρ ⊢∆ dt, ρ2 ⊢ t2) ∈ RC⟦τ⟧.
3. If (n, ρ1, dρ, ρ2) ∈ RG⟦Γ⟧ then (k, ρ1, dρ, ρ2) ∈ RG⟦Γ⟧.
Proof sketch. For RV⟦τ⟧, case split on τ and expand hypothesis and thesis. If τ = N, they coincide. For RV⟦σ → τ⟧, parts of the hypothesis and thesis match. For some relation P, the rest of the hypothesis has shape ∀j < n. P(j, v1, dv, v2), and the rest of the thesis has shape ∀j < k. P(j, v1, dv, v2). Assume j < k. We must prove P(j, v1, dv, v2); but since j < k ≤ n, we can just apply the hypothesis.
The proof for RC⟦τ⟧ follows the same idea as RV⟦σ → τ⟧. For RG⟦Γ⟧, apply the theorem for RV⟦τ⟧ to each environment entry x : τ.
At this point, we prove the fundamental property.
Theorem C.4.2 (Fundamental property: correctness of D⟦–⟧)
For every well-typed term Γ ⊢ t : τ, we have that Γ ⊨ D⟦t⟧ ▷ t → t : τ.
Proof sketch. By structural induction on typing derivations, using ideas similar to Theorem C.3.1, and relying on Lemma C.4.1 to reduce step counts where needed.
C.5 Step-indexed intensional validity
Up to now, we have defined when a function change is valid extensionally, that is, purely based on its behavior, as we have done earlier when using denotational semantics. We conjecture that with these definitions one can define ⊕ and prove it agrees with extensional validity. However, we have not done so.
Instead, we modify the definitions in Appendix C.4 to define validity intensionally. To ensure that f1 ⊕ df = f2 (for a suitable ⊕), we choose to require that closures f1, df and f2 close over environments of matching shapes. This change does not complicate the proof of the fundamental lemma: all the additional proof obligations are automatically satisfied.
However, it can still be necessary to replace a function value with a different one. Hence, we extend our definition of values to allow replacement values. Closure replacements produce replacements as results, so we make replacement values into valid changes for all types. We must also extend the change semantics, both to allow evaluating closure replacements, and to allow derivatives of primitives to handle replacement values.
We have not added replacement values to the syntax, so currently they can just be added to change environments; but we don't expect adding them to the syntax would cause any significant trouble.
We present the changes described above for the typed semantics. We have successfully mechanized this variant of the semantics as well. Adding replacement values !v requires extending the definition of change values, evaluation and validity. We add replacement values to change values:

dv ::= · · · | !v

Derivatives of primitives, when applied to replacement changes, must recompute their output. The required additional equations are not interesting, but we show them anyway for completeness:

P∆⟦succ⟧ n1 (!n2) = !(n2 + 1)
P∆⟦add⟧ ⟨_, _⟩ (!⟨m2, n2⟩) = !(m2 + n2)
P∆⟦add⟧ p1 (dp@⟨dm, !n2⟩) = !(P⟦add⟧ (p1 ⊕ dp))
P∆⟦add⟧ p1 (dp@⟨!m, dv⟩) = !(P⟦add⟧ (p1 ⊕ dp))

Evaluation requires a new rule, E-BangApp, to evaluate change applications where the function change evaluates to a replacement change:

ρ dρ ⊢∆ dwf ⇓ !vf    ρ ⊢ wa ⇓ va    ρ dρ ⊢∆ dwa ⇓ dva    app vf (va ⊕ dva) ⇓ v
-------------------------------------------------------------- E-BangApp
ρ dρ ⊢∆ dwf wa dwa ⇓ !v

Evaluation rule E-BangApp requires defining ⊕ on syntactic values. We define it intensionally.
Definition C.5.1 (Update operator ⊕)
Operator ⊕ is defined on values by the following equations, where match⊢ ρ dρ (whose definition we omit) tests if ρ and dρ are environments for the same typing context Γ:

v1 ⊕ !v2 = v2
ρ [λx → t] ⊕ dρ [λx dx → dt] =
  if match⊢ ρ dρ then
    -- If ρ and dρ are environments for the same typing context Γ:
    (ρ ⊕ dρ) [λx → t]
  else
    -- Otherwise, the input change is invalid, so just give
    -- any type-correct result:
    ρ [λx → t]
n ⊕ dn = n + dn
⟨va1, vb1⟩ ⊕ ⟨dva, dvb⟩ = ⟨va1 ⊕ dva, vb1 ⊕ dvb⟩
-- An additional equation is needed in the untyped language,
-- not in the typed one. This equation is for invalid
-- changes, so we can just return v1:
v1 ⊕ dv = v1

We define ⊕ on environments for matching contexts to combine values and changes pointwise:

(x1 = v1, . . . , xn = vn) ⊕ (dx1 = dv1, . . . , dxn = dvn) =
  (x1 = v1 ⊕ dv1, . . . , xn = vn ⊕ dvn)

The definition of update for closures can only update them in few cases, but this is not a problem: as we show in a moment, we restrict validity to the closure changes for which it is correct.
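The first-order cases of Definition C.5.1 can be transcribed into Haskell almost literally. The following sketch is illustrative only: the value and change constructors (N, P, DN, DP, Bang) are hypothetical names of ours, closures are omitted for brevity, and the final catch-all equation plays the role of the fallback for invalid changes in the untyped language.

```haskell
-- First-order syntactic values: naturals and pairs (closures omitted).
data Val = N Int | P Val Val
  deriving (Eq, Show)

-- Changes: additive change on naturals, pointwise change on pairs,
-- and replacement values (the !v of the text).
data DVal = DN Int | DP DVal DVal | Bang Val
  deriving (Eq, Show)

-- Update operator, mirroring Definition C.5.1 equation by equation.
oplusV :: Val -> DVal -> Val
oplusV _       (Bang v2)  = v2                             -- v1 ⊕ !v2 = v2
oplusV (N n)   (DN dn)    = N (n + dn)                     -- n ⊕ dn = n + dn
oplusV (P a b) (DP da db) = P (oplusV a da) (oplusV b db)  -- pairs, pointwise
oplusV v       _          = v                              -- invalid change: keep v
```

For instance, `oplusV (P (N 1) (N 2)) (DP (DN 10) (Bang (N 5)))` updates the first component additively and replaces the second.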
We ensure replacement values are accepted as valid for all types by requiring that the following equation holds (hence modifying all equations for RV⟦–⟧; we omit details):

RV⟦τ⟧ ⊇ { (k, v1, !v2, v2) | ⊢ v1 : τ ∧ ⊢ v2 : τ }   (C.1)

where we write ⊢ v : τ to state that value v has type τ; we omit the unsurprising rules for this judgement.
To restrict valid closure changes, RV⟦σ → τ⟧ now requires that related elements satisfy predicate matchImpl vf1 dvf vf2, defined by the following equations:

matchImpl
  (ρ1 [λx → t])
  (ρ1 dρ [λx dx → D⟦t⟧])
  ((ρ1 ⊕ dρ) [λx → t]) = True
matchImpl _ _ _ = False

In other words, validity (k, vf1, dvf, vf2) ∈ RV⟦σ → τ⟧ now requires, via matchImpl, that the base closure environment ρ1 and the base environment of the closure change dvf coincide, that ρ2 = ρ1 ⊕ dρ, and that vf1 and vf2 have λx → t as body, while dvf has body λx dx → D⟦t⟧.
To define intensional validity for function changes, RV⟦σ → τ⟧ must use matchImpl and
explicitly support replacement closures, to satisfy Eq. (C.1). Its definition is as follows:

RV⟦σ → τ⟧ = { (k, vf1, dvf, vf2) |
    ⊢ vf1 : σ → τ ∧ ⊢∆ dvf : σ → τ ∧ ⊢ vf2 : σ → τ ∧
    matchImpl vf1 dvf vf2 ∧
    ∀(j, v1, dv, v2) ∈ RV⟦σ⟧. j < k ⇒
      (j, app vf1 v1, dapp dvf v1 dv, app vf2 v2) ∈ RC⟦τ⟧ }
  ∪ { (k, vf1, !vf2, vf2) | ⊢ vf1 : σ → τ ∧ ⊢ vf2 : σ → τ }

Definitions of RV⟦–⟧ for other types can be similarly updated to support replacement changes and satisfy Eq. (C.1).
Using these updated definitions, we can again prove the fundamental property, with the same statement as Theorem C.4.2. Furthermore, we now prove that ⊕ agrees with validity.
Theorem C.5.2 (Fundamental property: correctness of D⟦–⟧)
For every well-typed term Γ ⊢ t : τ, we have that Γ ⊨ D⟦t⟧ ▷ t → t : τ.
Theorem C.5.3 (⊕ agrees with step-indexed intensional validity)
If (k, v1, dv, v2) ∈ RV⟦τ⟧ then v1 ⊕ dv = v2.
Proof. By induction on types. For type N, validity coincides with the thesis. For type τa × τb, we must apply the induction hypothesis on both pair components.
For closures, validity requires that v1 = ρ1 [λx → t], dv = dρ [λx dx → D⟦t⟧], v2 = ρ2 [λx → t], with ρ1 ⊕ dρ = ρ2, and that there exists Γ such that Γ, x : σ ⊢ t : τ. Moreover, from validity we can show that ρ and dρ have matching shapes: ρ is an environment matching Γ, and dρ is a change environment matching ∆Γ. Hence, v1 ⊕ dv can update the stored environment, and we can show the thesis by the following calculation:

v1 ⊕ dv = ρ1 [λx → t] ⊕ dρ [λx dx → dt] =
  (ρ1 ⊕ dρ) [λx → t] = ρ2 [λx → t] = v2

We can also define 0 intensionally, as a metafunction on values and environments, and prove it correct. For closures, we differentiate the body and recurse on the environment. The definition extends from values to environments variable-wise, so we omit the standard formal definition.
Definition C.5.4 (Nil changes 0)
Nil changes on values are defined as follows:

0ρ [λx→t] = ρ 0ρ [λx dx → D⟦t⟧]
0⟨a,b⟩ = ⟨0a, 0b⟩
0n = 0

Lemma C.5.5 (0 produces valid changes)
For all values ⊢ v : τ and indexes k, we have (k, v, 0v, v) ∈ RV⟦τ⟧.
Proof sketch. By induction on v. For closures, we must apply the fundamental property (Theorem C.5.2) to D⟦t⟧.
Because 0 transforms closure bodies, we cannot define it internally to the language. This problem can be avoided by defunctionalizing functions and function changes, as we do in Appendix D.
We conclude with the overall correctness theorem, analogous to Corollary 13.4.5.
Corollary C.5.6 (D⟦–⟧ is correct, corollary)
Take any term t that is well-typed (Γ ⊢ t : τ) and any suitable environments ρ1, dρ, ρ2, intensionally valid at any step count (∀k. (k, ρ1, dρ, ρ2) ∈ RG⟦Γ⟧). Assume t terminates in both the old environment ρ1 and the new environment ρ2, evaluating to output values v1 and v2 (ρ1 ⊢ t ⇓ v1 and ρ2 ⊢ t ⇓ v2). Then D⟦t⟧ evaluates in environment ρ1 and change environment dρ to a change value dv (ρ1 dρ ⊢∆ D⟦t⟧ ⇓ dv), and dv is a valid change from v1 to v2, so that v1 ⊕ dv = v2.
Proof. Follows immediately from Theorem C.5.2 and Theorem C.5.3.
C.6 Untyped step-indexed validity (λA, λ∆A)
By removing mentions of types from step-indexed validity (intensional or extensional, though we show the extensional definitions), we can adapt it to an untyped language. We can still distinguish between functions, numbers and pairs by matching on values themselves, instead of matching on types. Without types, typing contexts Γ now degenerate to lists of the free variables of a term; we still use them to ensure that environments contain enough valid entries to evaluate a term. Validity applies to terminating executions, hence we need not consider executions producing dynamic type errors when proving the fundamental property.
We show the resulting definitions for extensional validity in Fig. C.6, but we can also define intensional validity and prove the fundamental lemma for it. As mentioned earlier, for λA we must turn P⟦–⟧ – and P∆⟦–⟧ – into relations, and update E-Prim accordingly.
The main difference in the proof is that, this time, the recursion used in the relations can only be proved to be well-founded because of the use of step-indexes; we omit details [Ahmed 2006].
Otherwise, the proof proceeds just as earlier in Appendix C.4. We prove that the relations are downward-closed, just like in Lemma C.4.1 (we omit the new statement), and we prove the new fundamental lemma by induction on the structure of terms (not of typing derivations).
Theorem C.6.1 (Fundamental property: correctness of D⟦–⟧)
If FV(t) ⊆ Γ, then we have that Γ ⊨ D⟦t⟧ ▷ t → t.
Proof sketch. Similar to the proof of Theorem C.4.2, but by structural induction on terms and complete induction on step counts, not on typing derivations.
However, we can use the induction hypothesis in the same way as in earlier proofs for typed languages: all uses of the induction hypothesis in the proof are on smaller terms, and some also at smaller step counts.
C.7 General recursion in λA→ and λ∆A→
We have sketched informally, in Sec. 15.1, how to support fixpoint combinators.
We have also extended our typed languages with fixpoint combinators, and proved them correct formally (though not mechanically yet). In this section we include the definitions needed to make our claim precise. They are mostly unsurprising, if long to state.
Since we are in a call-by-value setting, we only add recursive functions, not recursive values in general. To this end, we replace λ-abstraction λx → t with recursive function rec f x → t, which binds both f and x in t, and replaces f with the function itself upon evaluation.
RV = { (k, n1, dn, n2) | n1, dn, n2 ∈ N ∧ n1 + dn = n2 }
  ∪ { (k, vf1, dvf, vf2) |
      ∀(j, v1, dv, v2) ∈ RV. j < k ⇒
        (j, app vf1 v1, dapp dvf v1 dv, app vf2 v2) ∈ RC }
  ∪ { (k, ⟨va1, vb1⟩, ⟨dva, dvb⟩, ⟨va2, vb2⟩) |
      (k, va1, dva, va2) ∈ RV ∧ (k, vb1, dvb, vb2) ∈ RV }

RC = { (k, ρ1 ⊢ t1, ρ dρ ⊢∆ dt, ρ2 ⊢ t2) |
    ∀j, v1, v2. (j < k ∧ ρ1 ⊢ t1 ⇓j v1 ∧ ρ2 ⊢ t2 ⇓ v2) ⇒
      ∃dv. ρ dρ ⊢∆ dt ⇓ dv ∧ (k − j, v1, dv, v2) ∈ RV }

RG⟦ε⟧ = { (k) }

RG⟦Γ, x⟧ = { (k, (ρ1, x = v1), (dρ, dx = dv), (ρ2, x = v2)) |
    (k, ρ1, dρ, ρ2) ∈ RG⟦Γ⟧ ∧ (k, v1, dv, v2) ∈ RV }

Γ ⊨ dt ▷ t1 → t2 = ∀(k, ρ1, dρ, ρ2) ∈ RG⟦Γ⟧.
    (k, ρ1 ⊢ t1, ρ1 dρ ⊢∆ dt, ρ2 ⊢ t2) ∈ RC

Figure C.6: Defining extensional validity via untyped step-indexed logical relations and big-step semantics.
The associated small-step reduction rule would be (rec f x → t) v → t [x := v, f := rec f x → t]. As before, we formalize reduction using big-step semantics.
Typing rule T-Lam is replaced by a rule for recursive functions:

Γ, f : σ → τ, x : σ ⊢ t : τ
--------------------------- T-Rec
Γ ⊢ rec f x → t : σ → τ

We replace closures with recursive closures in the definition of values:

w ::= rec f x → t | . . .
v ::= ρ [rec f x → t] | . . .

We also modify the semantics for abstraction and application. Rules E-Val and E-App are unchanged; it is sufficient to adapt the definitions of V⟦–⟧ – and app, so that the evaluation of a function value vf has access to vf in the environment:

V⟦rec f x → t⟧ ρ = ρ [rec f x → t]
app (vf @(ρ′ [rec f x → t])) va =
  (ρ′, f = vf, x = va) ⊢ t

Like in Haskell, we write x@p in equations to bind an argument as metavariable x and match it against pattern p.
Similarly, we modify the language of changes, the definition of differentiation, and the evaluation metafunctions V∆⟦–⟧ – and dapp. Since the derivative of a recursive function f = rec f x → t
can call the base function, we remember the original function body t in the derivative, together with its derivative D⟦t⟧. This should not be surprising: in Sec. 15.1, where recursive functions are defined using letrec, a recursive function f is in scope in the body of its derivative df. Here we use a different syntax, but still ensure that f is in scope in the body of derivative df. The definitions are otherwise unsurprising, if long.

dw ::= drec f df x dx → t dt | . . .
dv ::= ρ dρ [drec f df x dx → t dt] | . . .
D⟦rec f x → t⟧ = drec f df x dx → t D⟦t⟧
V∆⟦drec f df x dx → t dt⟧ ρ dρ =
  ρ dρ [drec f df x dx → t dt]

dapp (dvf @(ρ′ dρ′ [drec f df x dx → t dt])) va dva =
  let vf = ρ′ [rec f x → t]
  in (ρ′, f = vf, x = va) (dρ′, df = dvf, dx = dva) ⊢∆ dt

Γ, f : σ → τ, x : σ ⊢ t : τ    Γ, f : σ → τ, x : σ ⊢∆ dt : τ
----------------------------------------------------------- T-DRec
Γ ⊢∆ drec f df x dx → t dt : σ → τ

We can adapt the proof of the fundamental property to the use of recursive functions.
Theorem C.7.1 (Fundamental property: correctness of D⟦–⟧)
For every well-typed term Γ ⊢ t : τ, we have that Γ ⊨ D⟦t⟧ ▷ t → t : τ.
Proof sketch. Mostly as before, modulo one interesting difference. To prove the fundamental property for recursive functions at step-count k, this time we must use the fundamental property inductively on the same term, but at step-count j < k. This happens because, to evaluate dw = D⟦rec f x → t⟧, we evaluate D⟦t⟧ with the value dv for dw in the environment; to show this invocation is valid, we must show that dw is itself a valid change. But the step-indexed definition of RV⟦σ → τ⟧ constrains the evaluation of the body only for all j < k.
C.8 Future work
We have shown that ⊕ and 0 agree with validity, which we consider a key requirement of a core ILC proof. However, change structures support further operators. We leave operator ⊖ for future work, though we are not aware of particular difficulties. However, ⊚ deserves special attention.
C.8.1 Change composition
We have looked into change composition, and it appears that composition of change expressions is not always valid, but we conjecture that composition of change values preserves validity. Showing that change composition is valid appears related to showing that Ahmed's logical equivalence is a transitive relation, which is a subtle issue. She only proves transitivity in a typed setting and with a stronger relation, and her proof does not carry over directly; indeed, there is no corresponding proof in the untyped setting of Acar, Ahmed and Blume [2008].
However, the failure of transitivity we have verified is not overly worrisome: the problem is that transitivity is too strong an expectation in general. Assume that Γ ⊨ de1 ▷ e1 → e2 and Γ ⊨ de2 ▷ e2 → e3, and try to show that Γ ⊨ de1 ⊚ de2 ▷ e1 → e3; that is, very roughly and ignoring the environments, we can assume that e1 and e3 terminate, and have to show that their
results satisfy some properties. To use both our hypotheses, we need to know that e1, e2 and e3 all terminate, but we have no such guarantee for e2. Indeed, if e2 always diverges (because it is, say, the diverging term ω = (λx → x x) (λx → x x)), then de1 and de2 are vacuously valid. If e1 and e3 terminate, we can't expect de1 ⊚ de2 to be a change between them. To wit, take e1 = 0, e2 = ω, e3 = 10, and de1 = de2 = 0. We can verify that for any Γ we have Γ ⊨ de1 ▷ e1 → e2 and Γ ⊨ de2 ▷ e2 → e3, while Γ ⊨ de1 ⊚ de2 ▷ e1 → e3 means the absurd Γ ⊨ 0 ⊚ 0 ▷ 0 → 10.
A possible fix. Does transitivity hold if e2 terminates? If (k, e1, de1, e2) ∈ RC⟦τ⟧ and (k, e2, de2, e3) ∈ RC⟦τ⟧, we still cannot conclude anything. But, like in Ahmed [2006], if e2 and e3 are related at all step counts, that is, if (k, e1, de1, e2) ∈ RC⟦τ⟧ and (∀n. (n, e2, de2, e3) ∈ RC⟦τ⟧), and if additionally e2 terminates, we conjecture that Ahmed's proof goes through. We have, however, not yet examined all details.
C.9 Development history
The proof strategy used in this chapter comes from a collaboration between me and Yann Régis-Gianas, who came up with the general strategy and the first partial proofs for untyped λ-calculi. After we both struggled for a while to set up step-indexing correctly enough for a full proof, I first managed to give the definitions in this chapter and complete the proofs here described. Régis-Gianas then mechanized a variant of our proof for untyped λ-calculus in Coq [Giarrusso et al., Submitted], which appears here in Sec. 17.3. That proof makes a few different choices, and unfortunately, strictly speaking, neither proof subsumes the other. (1) We also give a non-step-indexed syntactic proof for simply-typed λ-calculus, together with proofs defining validity extensionally. (2) To support remembering intermediate results by conversion to cache-transfer style (CTS), Régis-Gianas' proof uses a lambda-lifted A'NF syntax instead of plain ANF. (3) Régis-Gianas' formalization adds to change values a single token 0, which is a valid nil change for all valid values. Hence, if we know a change is nil, we can erase it. As a downside, evaluating df a da when df is 0 requires looking up f. In our presentation, instead, if f and g are different function values, they have different nil changes. Such nil changes carry information, so they can be evaluated directly, but they cannot be erased. Techniques in Appendix D enable erasing and reconstructing a nil change df for function value f, as long as the value of f is available.
C.10 Conclusion
In this chapter, we have shown how to construct novel models for ILC by using (step-indexed) logical relations, and have used this technique to deliver a new syntactic proof of correctness for ILC for simply-typed lambda calculus, and to deliver the first proof of correctness of ILC for untyped λ-calculus. Moreover, our proof appears rather close to existing logical-relations proofs; hence, we believe it should be possible to translate other results to ILC theorems.
By formally defining intensional validity for closures, we provide a solid foundation for the use of defunctionalized function changes (Appendix D).
This proof builds on Ahmed [2006]'s work on step-indexed logical relations, which enables handling of powerful semantic features using rather elementary techniques. The only downside is that it can be tricky to set up the correct definitions, especially for a slightly non-standard semantics like ours. As an exercise, we have shown that our semantics is equivalent to more conventional presentations, down to the produced step counts.
Appendix D
Defunctionalizing function changes
In Chapter 12 and throughout most of Part II, we represent function changes as functions, which can only be applied. However, incremental programs often inspect changes to decide how to react to them most efficiently; being able to inspect function changes, too, would help performance further. Representing function changes as closures, as we do in Chapter 17 and Appendix C, allows implementing some operations more efficiently, but is not fully satisfactory. In this chapter, we address these restrictions by defunctionalizing functions and function changes, so that we can inspect both at runtime without restrictions.
Once we defunctionalize function changes, we can detect at runtime whether a function change is nil. As we have mentioned in Sec. 16.3, nil function changes can typically be handled more efficiently. For instance, consider t = map f xs, a term that maps function f to each element of sequence xs. In general, D⟦t⟧ = dmap f df xs dxs must handle any change in dxs (which we assume to be small), but also apply function change df to each element of xs (which we assume to be big). However, if df is nil, we can skip this step, decreasing the time complexity of dmap f df xs dxs from O(|xs| + |dxs|) to O(|dxs|).
We will also present a change structure on defunctionalized function changes, and show that operations on defunctionalized function changes become cheaper.
D.1 Setup
We write incremental programs based on ILC by manually writing Haskell code, containing both manually-written plugin code, and code that is transformed systematically, based on informal generalizations and variants of D⟦–⟧. Our main goal is to study variants of differentiation and of encodings in Haskell, while also studying language plugins for non-trivial primitives, such as different sorts of collections. We make liberal use of GHC extensions where needed.
Code in this chapter has been extracted and type-checked, though we elide a few details (mostly language extensions and imports from the standard library). Code in this appendix is otherwise self-contained. We have also tested a copy of this code more extensively.
As sketched before, we define change structures inside Haskell:
class ChangeStruct t where
  -- Next line declares ∆t as an injective type function:
  type ∆t = r | r → t
  (⊕) :: t → ∆t → t
  oreplace :: t → ∆t

class ChangeStruct t ⇒ NilChangeStruct t where
  0 :: t → ∆t

class ChangeStruct a ⇒ CompChangeStruct a where
  -- Compose change dx1 with dx2, so that
  -- x ⊕ (dx1 ⊚ dx2) ≡ x ⊕ dx1 ⊕ dx2:
  (⊚) :: ∆a → ∆a → ∆a
With this code we define type classes ChangeStruct, NilChangeStruct and CompChangeStruct. We explain each of the declared members in turn.
First, type ∆t represents the change type for type t. To improve Haskell type inference, we declare that ∆ is injective, so that ∆t1 = ∆t2 implies t1 = t2. This forbids some potentially useful change structures, but in exchange it makes type inference vastly more usable.
Next, we declare ⊕, 0 and ⊚ as available to object programs. Last, we introduce oreplace to construct replacement changes, characterized by the absorption law x ⊕ oreplace y = y for all x. Function oreplace encodes !t, that is, the bang operator !. We use a different notation because ! is reserved for other purposes in Haskell.
These typeclasses omit operation ⊖ intentionally: we do not require that change structures support a proper difference operation. Nevertheless, as discussed, b ⊖ a can be expressed through oreplace b.
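As a concrete (if monomorphic) sketch of these laws, consider a change structure for Int whose changes are either additive deltas or replacements. The names DeltaInt, oplusI, oreplaceI and ocompose are ours, not the thesis's; oreplaceI demonstrates both the absorption law and how a difference b ⊖ a can be simulated:

```haskell
-- Changes on Int: either add a delta or replace the value outright.
data DeltaInt = Add Int | Replace Int
  deriving (Eq, Show)

-- x ⊕ dx
oplusI :: Int -> DeltaInt -> Int
oplusI x (Add d)     = x + d
oplusI _ (Replace y) = y        -- absorption: x ⊕ oreplace y = y

-- oreplace, i.e. the bang operator !
oreplaceI :: Int -> DeltaInt
oreplaceI = Replace

-- Composition, satisfying x ⊕ (d1 ⊚ d2) == (x ⊕ d1) ⊕ d2.
ocompose :: DeltaInt -> DeltaInt -> DeltaInt
ocompose (Add d1)    (Add d2)     = Add (d1 + d2)
ocompose _           (Replace y)  = Replace y
ocompose (Replace y) (Add d2)     = Replace (y + d2)
```

With this representation, b ⊖ a is simply oreplaceI b, independently of a, as the text suggests.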
We can then differentiate Haskell functions, even polymorphic ones. We show a few trivial examples, to highlight how derivatives are encoded in Haskell, especially polymorphic ones.
-- The standard id function:
id :: a → a
id x = x

-- and its derivative:
did :: a → ∆a → ∆a
did x dx = dx

instance (NilChangeStruct a, ChangeStruct b) ⇒
    ChangeStruct (a → b) where
  type ∆(a → b) = a → ∆a → ∆b
  f ⊕ df = λx → f x ⊕ df x 0x
  oreplace f = λx dx → oreplace (f (x ⊕ dx))

instance (NilChangeStruct a, ChangeStruct b) ⇒
    NilChangeStruct (a → b) where
  0f = oreplace f

-- Same for apply:
apply :: (a → b) → a → b
apply f x = f x

dapply :: (a → b) → ∆(a → b) → a → ∆a → ∆b
dapply f df x dx = df x dx
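To see dapply at work, here is a tiny monomorphic sketch under additive Int changes: a function change takes the base input and the input change and returns the output change. The function double and its derivative dDouble are our own illustrative examples, not taken from the thesis:

```haskell
-- A function change maps a base input and an input change to an output change.
type DFun a da b db = a -> da -> db

-- dapply just applies the function change, as in the definition above.
dapply :: (a -> b) -> DFun a da b db -> a -> da -> db
dapply _f df x dx = df x dx

-- Example base function and its derivative under additive Int changes:
double :: Int -> Int
double x = 2 * x

-- d(2 * x) = 2 * dx
dDouble :: Int -> Int -> Int
dDouble _x dx = 2 * dx
```

One can check the incrementalization equation double (x ⊕ dx) = double x ⊕ dapply double dDouble x dx, where ⊕ on Int is plain addition.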
Which polymorphism? As visible here, polymorphism does not cause particular problems. However, we only support ML (or prenex) polymorphism, not first-class polymorphism, for two reasons.
First, with first-class polymorphism, we can encode existential types ∃X. T, and two values v1, v2 of the same existential type ∃X. T can hide different types T1, T2. Hence, a change between v1 and v2 requires handling changes between types. While we discuss such topics in Chapter 18, we avoid them here.
Second, prenex polymorphism is a small extension of simply-typed lambda calculus, metatheoretically. We can treat prenex-polymorphic definitions as families of monomorphic (hence simply-typed) definitions; to each definition we can apply all the ILC theory we developed to show that differentiation is correct.
D.2 Defunctionalization
Defunctionalization is a whole-program transformation that turns a program relying on first-class functions into a first-order program. The resulting program is expressed in a first-order language (often a subset of the original language): closures are encoded by data values, which embed both the closure environment and a tag to distinguish different functions. Defunctionalization also generates a function that interprets encoded closures, which we call applyFun.
In a typed language, defunctionalization can be done using generalized algebraic datatypes (GADTs) [Pottier and Gauthier, 2004]. Each first-class function of type σ → τ is replaced by a value of a new GADT Fun σ τ, which represents defunctionalized function values and has a constructor for each different function. If a first-class function t1 closes over x : τ1, the corresponding constructor C1 will take x : τ1 as an argument. The interpreter for defunctionalized function values has type signature applyFun :: Fun σ τ → σ → τ. The resulting programs are expressed in a first-order subset of the original programming language. In defunctionalized programs, all remaining functions are first-order, top-level functions.
For instance, consider the program on sequences in Fig. D.1.

successors :: [Z] → [Z]
successors xs = map (λx → x + 1) xs

nestedLoop :: [σ] → [τ] → [(σ, τ)]
nestedLoop xs ys = concatMap (λx → map (λy → (x, y)) ys) xs

map :: (σ → τ) → [σ] → [τ]
map f [] = []
map f (x : xs) = f x : map f xs

concatMap :: (σ → [τ]) → [σ] → [τ]
concatMap f xs = concat (map f xs)

Figure D.1: A small example program for defunctionalization.

In this program, the first-class function values arise from evaluating the three terms λy → (x, y), that we call pair; λx → map (λy → (x, y)) ys, that we call mapPair; and λx → x + 1, that we call addOne. Defunctionalization creates a type Fun σ τ with a constructor for each of the three terms: respectively Pair, MapPair and AddOne. Both pair and mapPair close over some free variables, so their corresponding constructors will take an argument for each free variable: for pair we have

x : σ ⊢ λy → (x, y) : τ → (σ, τ)

while for mapPair we have

ys : [τ] ⊢ λx → map (λy → (x, y)) ys : σ → [(σ, τ)]
Hence the type of defunctionalized functions Fun σ τ and its interpreter applyFun become
data Fun σ τ whereAddOne Fun Z ZPair σ rarr Fun τ (σ τ )MapPair [τ ] rarr Fun σ [(σ τ )]
applyFun Fun σ τ rarr σ rarr τapplyFun AddOne x = x + 1applyFun (Pair x) y = (x y)applyFun (MapPair ys) x = mapDF (Pair x) ys
We need to also transform the rest of the program accordingly
successors [Z] rarr [Z]successors xs = map (λx rarr x + 1) xsnestedLoopDF [σ ] rarr [τ ] rarr [(σ τ )]nestedLoopDF xs ys = concatMapDF (MapPair ys) xsmapDF Fun σ τ rarr [σ ] rarr [τ ]mapDF f [ ] = [ ]mapDF f (x xs) = applyFun f x mapDF f xsconcatMapDF Fun σ [τ ] rarr [σ ] rarr [τ ]concatMapDF f xs = concat (mapDF f xs)
Figure D.2: Defunctionalized program.
Some readers might notice this program still uses first-class functions, because it encodes multi-argument functions through currying. To get a fully first-order program, we encode multi-argument functions using tuples instead of currying.1 Using tuples, our example becomes:
applyFun :: (Fun σ τ, σ) → τ
applyFun (AddOne, x) = x + 1
applyFun (Pair x, y) = (x, y)
applyFun (MapPair ys, x) = mapDF (Pair x, ys)

mapDF :: (Fun σ τ, [σ]) → [τ]
mapDF (f, []) = []
mapDF (f, x : xs) = applyFun (f, x) : mapDF (f, xs)

concatMapDF :: (Fun σ [τ], [σ]) → [τ]
concatMapDF (f, xs) = concat (mapDF (f, xs))

nestedLoopDF :: ([σ], [τ]) → [(σ, τ)]
nestedLoopDF (xs, ys) = concatMapDF (MapPair ys, xs)
However, we'll often write such defunctionalized programs using Haskell's typical curried syntax, as in Fig. D.2. Such programs must not contain partial applications.
1 Strictly speaking, the resulting program is still not first-order, because in Haskell multi-argument data constructors, such as the pair constructor (,) that we use, are still first-class curried functions, unlike for instance in OCaml. To make this program truly first-order, we must formalize tuple constructors as a term constructor, or formalize these function definitions as multi-argument functions. At this point, this discussion is merely a technicality that does not affect our goals, but it becomes relevant if we formalize the resulting first-order language, as in Sec. 17.3.
In general, defunctionalization creates a constructor C of type Fun σ τ for each first-class function expression Γ ⊢ t : σ → τ in the source program.2 While lambda expression l closes implicitly over environment ρ : ⟦Γ⟧, C closes over it explicitly: the values bound to free variables in environment ρ are passed as arguments to constructor C. As a standard optimization, we only include variables that actually occur free in l, not all those that are bound in the context where l occurs.
D.2.1 Defunctionalization with separate function codes

Next, we show a slight variant of defunctionalization, that we use to achieve our goals with less code duplication, even at the expense of some efficiency; we call this variant defunctionalization with separate function codes.
We first encode contexts as types and environments as values. Empty environments are encoded as empty tuples. Environments for a context such as x : τ1, y : τ2 are encoded as values of type τ1 × τ2 × …
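For instance (a sketch of ours, with concrete types standing in for τ1 and τ2), the context x : Int, ys : [Bool] would be encoded as a right-nested tuple type ending in the empty environment ():

-- Sketch: encoding a context as a right-nested tuple type, ending in ().
-- The context  x :: Int, ys :: [Bool]  becomes:
type Env1 = (Int, ([Bool], ()))

-- An environment binding x = 1 and ys = [True, False] becomes the value:
env1 :: Env1
env1 = (1, ([True, False], ()))

This nesting is what lets the change structures for () and pairs, defined later, cover all environment types.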
In this defunctionalization variant, instead of defining directly a GADT of defunctionalized functions Fun σ τ, we define a GADT of function codes, Code env σ τ, whose values contain no environment. Type Code is indexed not just by σ and τ but also by the type of environments, and has a constructor for each first-class function expression in the source program, like Fun σ τ does in conventional defunctionalization. We then define Fun σ τ as a pair of a function code of type Code env σ τ and an environment of type env.
As a downside, separating function codes adds a few indirections to the memory representation of closures: for instance, we use (AddOne, ()) instead of AddOne, and (Pair, 1) instead of Pair 1.
As an upside, with separate function codes we can define many operations generically across all function codes (see Appendix D.2.2), instead of generating definitions matching on each function. What's more, we later define operations that use raw function codes and need no environment; we could alternatively define function codes without using them in the representation of function values, at the expense of even more code duplication. Code duplication is especially relevant because we currently perform defunctionalization by hand, though we are confident it would be conceptually straightforward to automate the process.
Defunctionalizing the program with separate function codes produces the following GADT of function codes:
type Env env = (CompChangeStruct env, NilTestable env)

data Code env σ τ where
  AddOne :: Code () Z Z
  Pair :: Code σ τ (σ, τ)
  MapPair :: Env σ ⇒ Code [τ] σ [(σ, τ)]
In this definition, type Env names a set of typeclass constraints on the type of the environment, using the ConstraintKinds GHC language extension. Satisfying these constraints is necessary to implement a few operations on functions. We also require an interpretation function applyCode for function codes. If c is the code for a function f = λx → t, calling applyCode c computes f's output from an environment env for f and an argument for x.
applyCode :: Code env σ τ → env → σ → τ
applyCode AddOne () x = x + 1
2 We only need codes for functions that are used as first-class arguments, not for other functions, though codes for the latter can be added as well.
applyCode Pair x y = (x, y)
applyCode MapPair ys x = mapDF (F (Pair, x)) ys
The implementation of applyCode MapPair only works because of the Env σ constraint for constructor MapPair: this constraint is required when constructing the defunctionalized function value that we pass as argument to mapDF.
We represent defunctionalized function values through type RawFun env σ τ, a type synonym of the product of Code env σ τ and env. We encode type σ → τ through type Fun σ τ, defined as RawFun env σ τ where env is existentially bound. We add constraint Env env to the definition of Fun σ τ, because implementing ⊕ on function changes will require using ⊕ on environments.
type RawFun env σ τ = (Code env σ τ, env)

data Fun σ τ where
  F :: Env env ⇒ RawFun env σ τ → Fun σ τ
To interpret defunctionalized function values, we wrap applyCode in a new version of applyFun, having the same interface as the earlier applyFun:
applyFun :: Fun σ τ → σ → τ
applyFun (F (code, env)) arg = applyCode code env arg
The rest of the source program is defunctionalized like before, using the new definition of Fun σ τ and of applyFun.
D.2.2 Defunctionalizing function changes

Defunctionalization encodes function values as pairs of function codes and environments. In ILC, a function value can change because the environment changes or because the whole closure is replaced by a different one, with a different function code and different environment. For now, we focus on environment changes for simplicity. To allow inspecting function changes, we defunctionalize them as well, but treat them specially.
Assume we want to defunctionalize a function change df with type ∆(σ → τ) = σ → ∆σ → ∆τ, valid for function f : σ → τ. Instead of transforming type ∆(σ → τ) into Fun σ (Fun ∆σ ∆τ), we transform ∆(σ → τ) into a new type DFun σ τ, the change type of Fun σ τ (∆(Fun σ τ) = DFun σ τ). To apply DFun σ τ we introduce an interpreter dapplyFun :: Fun σ τ → ∆(Fun σ τ) → ∆(σ → τ), or equivalently dapplyFun :: Fun σ τ → DFun σ τ → σ → ∆σ → ∆τ, which also serves as derivative of applyFun.
Like we did for Fun σ τ, we define DFun σ τ using function codes. That is, DFun σ τ pairs a function code Code env σ τ together with an environment change and a change structure for the environment type:
data DFun σ τ = ∀env. ChangeStruct env ⇒ DF (∆env, Code env σ τ)
Without separate function codes, the definition of DFun would have to include one case for each first-class function.
Environment changes
Instead of defining change structures for environments, we encode environments using tuples, and define change structures for tuples.
We first define change structures for empty tuples and pairs (Sec. 12.3.3):
instance ChangeStruct () where
  type ∆() = ()
  _ ⊕ _ = ()
  oreplace _ = ()
instance (ChangeStruct a, ChangeStruct b) ⇒ ChangeStruct (a, b) where
  type ∆(a, b) = (∆a, ∆b)
  (a, b) ⊕ (da, db) = (a ⊕ da, b ⊕ db)
  oreplace (a2, b2) = (oreplace a2, oreplace b2)
To define change structures for n-tuples of other arities, we have two choices, which we show on triples (a, b, c) and which can be easily generalized.
We can encode triples as nested pairs (a, (b, (c, ()))). Or we can define change structures for triples directly:
instance (ChangeStruct a, ChangeStruct b, ChangeStruct c) ⇒
    ChangeStruct (a, b, c) where
  type ∆(a, b, c) = (∆a, ∆b, ∆c)
  (a, b, c) ⊕ (da, db, dc) = (a ⊕ da, b ⊕ db, c ⊕ dc)
  oreplace (a2, b2, c2) = (oreplace a2, oreplace b2, oreplace c2)
Generalizing from pairs and triples, one can define similar instances for n-tuples in general (say, for values of n up to some high threshold).
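With the nested-pair encoding, instead, no additional instances are needed: the instances for () and for pairs above compose. A small sketch (names ours):

-- A triple encoded as nested pairs, as suggested above:
type Triple a b c = (a, (b, (c, ())))

-- The ChangeStruct instances for () and for pairs then give, with no further code,
--   ∆(Triple a b c) = (∆a, (∆b, (∆c, ())))
-- and ⊕ applies componentwise:
--   (x, (y, (z, ()))) ⊕ (dx, (dy, (dz, ()))) = (x ⊕ dx, (y ⊕ dy, (z ⊕ dz, ())))

The tradeoff is between extra nesting in values (nested pairs) and extra instance definitions (one per arity).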
Validity and oplus on defunctionalized function changes
A function change df is valid for f if df has the same function code as f, and if df's environment change is valid for f's environment:
DF (dρ, c) ▷ F (ρ1, c) ↪ F (ρ2, c) : Fun σ τ = dρ ▷ ρ1 ↪ ρ2 : env
where c is a function code of type Code env σ τ, and where c's type binds the type variable env we use on the right-hand side.
Next, we implement ⊕ on function changes to match our definition of validity, as required. We only need f ⊕ df to give a result if df is a valid change for f. Hence, if the function code embedded in df does not match the one in f, we give an error.3 However, our first attempt does not typecheck, since the typechecker does not know whether the environment and the environment change have compatible types:
instance ChangeStruct (Fun σ τ) where
  type ∆(Fun σ τ) = DFun σ τ
  F (env, c1) ⊕ DF (denv, c2) =
    if c1 ≡ c2 then
      F (env ⊕ denv, c1)
    else
      error "Invalid function change in ⊕"
In particular, env ⊕ denv is reported as ill-typed, because we don't know that env and denv have compatible types. Take c1 = Pair, c2 = MapPair, f = F (env, Pair) : σ → τ and df =
3 We originally specified ⊕ as a total function to avoid formalizing partial functions, but as mentioned in Sec. 10.3 we do not rely on the behavior of ⊕ on invalid changes.
DF (denv, MapPair) : ∆(σ → τ). Assume we evaluate f ⊕ df = F (env, Pair) ⊕ DF (denv, MapPair): there, indeed, env : σ and denv : ∆[τ], so env ⊕ denv is not type-correct. Yet, evaluating f ⊕ df would not fail: because MapPair and Pair are different, c1 ≡ c2 will return false and env ⊕ denv won't be evaluated. But the typechecker does not know that.
Hence we need an equality operation that produces a witness of type equality. We define the needed infrastructure with few lines of code. First, we need a GADT of witnesses of type equality; we can borrow its definition from GHC's standard library, which is just:

-- From Data.Type.Equality
data τ1 ∼ τ2 where
  Refl :: τ ∼ τ
If x has type τ1 ∼ τ2 and matches pattern Refl, then by standard GADT typing rules τ1 and τ2 are equal. Even if τ1 ∼ τ2 has only constructor Refl, a match is necessary, since x might be bottom. Readers familiar with type theory, Agda or Coq will recognize that ∼ resembles Agda's propositional equality or Martin-Löf's identity types, even though it can only represent equality between types and not between values.
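For illustration, here is the standard cast operation definable with this GADT (castWith from Data.Type.Equality, shown as a sketch): matching on Refl brings the type equality into scope, so the typechecker accepts returning the argument at the other type.

{-# LANGUAGE GADTs, TypeOperators #-}

-- As in Data.Type.Equality:
data a :~: b where
  Refl :: a :~: a

-- Matching on Refl brings the equality a ~ b into scope,
-- so x :: a can be returned at type b:
castWith :: a :~: b -> a -> b
castWith Refl x = x

Our use of codeMatch below follows exactly this pattern: the Refl match licenses env ⊕ denv.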
Next, we implement function codeMatch to compare codes. For equal codes, this operation produces a witness that their environment types match.4 Using this operation, we can complete the above instance of ChangeStruct (Fun σ τ):
codeMatch :: Code env1 σ τ → Code env2 σ τ → Maybe (env1 ∼ env2)
codeMatch AddOne AddOne = Just Refl
codeMatch Pair Pair = Just Refl
codeMatch MapPair MapPair = Just Refl
codeMatch _ _ = Nothing

instance ChangeStruct (Fun σ τ) where
  type ∆(Fun σ τ) = DFun σ τ
  F (c1, env) ⊕ DF (c2, denv) =
    case codeMatch c1 c2 of
      Just Refl → F (c1, env ⊕ denv)
      Nothing → error "Invalid function change in ⊕"
Applying function changes
After defining environment changes, we define an incremental interpretation function dapplyCode. If c is the code for a function f = λx → t, as discussed, calling applyCode c computes the output of f from an environment env for f and an argument for x. Similarly, calling dapplyCode c computes the output of D⟦f⟧ from an environment env for f, an environment change denv valid for env, an argument for x, and an argument change dx.
In our example we have
dapplyCode :: Code env σ τ → env → ∆env → σ → ∆σ → ∆τ
dapplyCode AddOne () () x dx = dx
dapplyCode Pair x dx y dy = (dx, dy)
dapplyCode MapPair ys dys x dx =
  dmapDF (F (Pair, x)) (DF (Pair, dx)) ys dys
4 If a code is polymorphic in the environment type, it must take as argument a representation of its type argument, to be used to implement codeMatch. We represent type arguments at runtime via instances of Typeable, and omit standard details here.
On top of dapplyCode, we can define dapplyFun, which functions as a derivative for applyFun and allows applying function changes:
dapplyFun :: Fun σ τ → DFun σ τ → σ → ∆σ → ∆τ
dapplyFun (F (c1, env)) (DF (c2, denv)) x dx =
  case codeMatch c1 c2 of
    Just Refl → dapplyCode c1 env denv x dx
    Nothing → error "Invalid function change in dapplyFun"
However, we can also implement further accessors that inspect function changes. We can now finally detect if a change is nil. To this end, we first define a typeclass that allows testing changes to determine if they're nil:
class NilChangeStruct t ⇒ NilTestable t where
  isNil :: ∆t → Bool
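The NilTestable instances for the tuple types encoding environments are not shown here; a plausible sketch (ours, not part of the original listings) mirrors the ChangeStruct instances for () and pairs:

-- Sketch: the nil change for () is (), so any change to () is nil;
-- a pair change is nil iff both components are nil.
instance NilTestable () where
  isNil () = True

instance (NilTestable a, NilTestable b) => NilTestable (a, b) where
  isNil (da, db) = isNil da && isNil db

With these instances, isNil on an environment change reduces to isNil on each variable's change.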
Now, a function change is nil only if the contained environment change is nil:
instance NilTestable (Fun σ τ) where
  isNil :: DFun σ τ → Bool
  isNil (DF (denv, code)) = isNil denv
However, this definition of isNil only works given a typeclass instance for NilTestable env: we need to add this requirement as a constraint, but we cannot add it to isNil's type signature, since type variable env is not in scope there. Instead, we must add the constraint where we introduce it, by existential quantification, just like the constraint ChangeStruct env. In particular, we can reuse the constraint Env env:
data DFun σ τ =
  ∀env. Env env ⇒
    DF (Code env σ τ, ∆env)
instance NilChangeStruct (Fun σ τ) where
  0 (F (code, env)) = DF (code, 0 env)
instance NilTestable (Fun σ τ) where
  isNil :: DFun σ τ → Bool
  isNil (DF (code, denv)) = isNil denv
We can then wrap a derivative via function wrapDF, to return a nil change immediately if at runtime all input changes turn out to be nil. This was not possible in the setting described by Cai et al. [2014], because nil function changes could not be detected at runtime, only at compile time. To do so, we must produce directly a nil change for v = applyFun f x. To avoid computing v, we assume we can compute a nil change for v without access to v, via operation onil and typeclass OnilChangeStruct (omitted here); argument Proxy is a constant, required for purely technical reasons:
wrapDF :: OnilChangeStruct τ ⇒ Fun σ τ → DFun σ τ → σ → ∆σ → ∆τ
wrapDF f df x dx =
  if isNil df then
    onil Proxy -- Change-equivalent to 0 (applyFun f x)
  else
    dapplyFun f df x dx
D.3 Defunctionalization and cache-transfer-style

We can combine the above ideas with cache-transfer-style (Chapter 17). We show the details quickly.
Combining the above with caching, we can use defunctionalization as described to implement the following interface for functions in caches. For extra generality, we use extension ConstraintKinds to allow instances to define the required typeclass constraints:
class FunOps k where
  type Dk k = (dk :: ∗ → ∗ → ∗ → ∗ → ∗) | dk → k
  type ApplyCtx k i o :: Constraint
  apply :: ApplyCtx k i o ⇒ k i o cache → i → (o, cache)
  type DApplyCtx k i o :: Constraint
  dApply :: DApplyCtx k i o ⇒
    Dk k i o cache1 cache2 → ∆i → cache1 → (∆o, cache2)
  type DerivCtx k i o :: Constraint
  deriv :: DerivCtx k i o ⇒
    k i o cache → Dk k i o cache cache
  type FunUpdCtx k i o :: Constraint
  funUpdate :: FunUpdCtx k i o ⇒
    k i o cache1 → Dk k i o cache1 cache2 → k i o cache2
  isNilFun :: Dk k i o cache1 cache2 → Maybe (cache1 ∼ cache2)

updatedDeriv :: (FunOps k, FunUpdCtx k i o, DerivCtx k i o) ⇒
  k i o cache1 → Dk k i o cache1 cache2 → Dk k i o cache2 cache2
updatedDeriv f df = deriv (f `funUpdate` df)
Type constructor k defines the specific constructor for the function type. In this interface, the type of function changes, Dk k i o cache1 cache2, represents changes for functions (encoded by type constructor k) with inputs of type i, outputs of type o, input cache type cache1 and output cache type cache2. Types cache1 and cache2 coincide for typical function changes, but can be different for replacement function changes, or more generally for function changes across functions with different implementations and cache types. Correspondingly, dApply supports applying such changes across closures with different implementations; unfortunately, unless the two implementations are similar, the cache content cannot be reused.
To implement this interface, it is sufficient to define a type of codes that admits an instance of typeclass Codelike, containing the earlier definitions of codeMatch, applyCode and dapplyCode, adapted to cache-transfer style:
class Codelike code where
  codeMatch :: code env1 a1 r1 cache1 → code env2 a2 r2 cache2 →
    Maybe ((env1, cache1) ∼ (env2, cache2))
  applyCode :: code env a b cache → env → a → (b, cache)
  dapplyCode :: code env a b cache → ∆env → ∆a → cache → (∆b, cache)
Typically, a defunctionalized program uses no first-class functions and has a single type of functions. Having a type class of function codes weakens that property. We can still use a single type of codes throughout our program; we can also use different types of codes for different parts of a program, without allowing for communication between those parts.
On top of the Codelike interface, we can define instances of interface FunOps and ChangeStruct. To demonstrate this, we show a complete implementation in Figs. D.3 and D.4. Similarly to ⊕, we
can implement ⊚ by comparing codes contained in function changes; this is not straightforward when using closure conversion, as in Sec. 17.4.2, unless we store even more type representations.
We can detect nil changes at runtime even in cache-passing style. We can for instance define function wrapDer1, which does something trickier than wrapDF: here, we assume that dg is a derivative taking function change df as argument. Then, if df is nil, dg df must also be nil, so we can return a nil change directly, together with the input cache. The input cache has the required type, because in this case type cache2 of the desired output cache matches type cache1 of the input cache (because we have a nil change df across them); the return value of isNilFun witnesses this type equality:
wrapDer1 :: (FunOps k, OnilChangeStruct r′) ⇒
  (Dk k i o cache1 cache2 → f cache1 → (∆r′, f cache2)) →
  (Dk k i o cache1 cache2 → f cache1 → (∆r′, f cache2))
wrapDer1 dg df c =
  case isNilFun df of
    Just Refl → (onil Proxy, c)
    Nothing → dg df c
We can also hide the difference between different cache types by defining a uniform type of caches, Cache code. To hide caches, we can use a pair of a cache (of type cache) and a code for that cache type. When applying a function (change) to a cache, or when composing the function, we can compare the function code with the cache code.
In this code, we have not shown support for replacement values for functions; we leave details to our implementation.
type RawFun a b code env cache = (code env a b cache, env)
type RawDFun a b code env cache = (code env a b cache, ∆env)

data Fun a b code = ∀env cache. Env env ⇒
  F (RawFun a b code env cache)
data DFun a b code = ∀env cache. Env env ⇒
  DF (RawDFun a b code env cache)

-- This cache is not indexed by a and b:
data Cache code = ∀a b env cache.
  C (code env a b cache) cache

-- Wrapper:
data FunW code a b cache where
  W :: Fun a b code → FunW code a b (Cache code)
data DFunW code a b cache1 cache2 where
  DW :: DFun a b code → DFunW code a b (Cache code) (Cache code)
derivFun :: Fun a b code → DFun a b code
derivFun (F (code, env)) = DF (code, 0 env)

oplusBase :: Codelike code ⇒ Fun a b code → DFun a b code → Fun a b code
oplusBase (F (c1, env)) (DF (c2, denv)) =
  case codeMatch c1 c2 of
    Just Refl →
      F (c1, env ⊕ denv)
    _ → error "INVALID call to oplusBase"
ocomposeBase :: Codelike code ⇒ DFun a b code → DFun a b code → DFun a b code
ocomposeBase (DF (c1, denv1)) (DF (c2, denv2)) =
  case codeMatch c1 c2 of
    Just Refl →
      DF (c1, denv1 ⊚ denv2)
    _ → error "INVALID call to ocomposeBase"
instance Codelike code ⇒ ChangeStruct (Fun a b code) where
  type ∆(Fun a b code) = DFun a b code
  (⊕) = oplusBase
instance Codelike code ⇒ NilChangeStruct (Fun a b code) where
  0 (F (c, env)) = DF (c, 0 env)
instance Codelike code ⇒ CompChangeStruct (Fun a b code) where
  df1 ⊚ df2 = ocomposeBase df1 df2
instance Codelike code ⇒ NilTestable (Fun a b code) where
  isNil (DF (c, env)) = isNil env
Figure D.3: Implementing change structures using Codelike instances.
applyRaw :: Codelike code ⇒ RawFun a b code env cache → a → (b, cache)
applyRaw (code, env) = applyCode code env
dapplyRaw :: Codelike code ⇒ RawDFun a b code env cache → ∆a → cache → (∆b, cache)
dapplyRaw (code, denv) = dapplyCode code denv

applyFun :: Codelike code ⇒ Fun a b code → a → (b, Cache code)
applyFun (F f@(code, env)) arg =
  (id *** C code) $ applyRaw f arg

dapplyFun :: Codelike code ⇒ DFun a b code → ∆a → Cache code → (∆b, Cache code)
dapplyFun (DF (code1, denv)) darg (C code2 cache1) =
  case codeMatch code1 code2 of
    Just Refl →
      (id *** C code1) $ dapplyCode code1 denv darg cache1
    _ → error "INVALID call to dapplyFun"
instance Codelike code ⇒ FunOps (FunW code) where
  type Dk (FunW code) = DFunW code
  apply (W f) = applyFun f
  dApply (DW df) = dapplyFun df
  deriv (W f) = DW (derivFun f)
  funUpdate (W f) (DW df) = W (f ⊕ df)
  isNilFun (DW df) =
    if isNil df then
      Just Refl
    else
      Nothing
Figure D.4: Implementing FunOps using Codelike instances.
Acknowledgments
I thank my supervisor Klaus Ostermann for encouraging me to do a PhD, and for his years of supervision. He has always been patient and kind.
Thanks also to Fritz Henglein for agreeing to review my thesis.

I have many people to thank and too little time to do it properly; I'll finish these acknowledgments in the final version.

The research presented here has benefited from the help of coauthors, reviewers and people who offered helpful comments, including comments from the POPL '18 reviewers. Moreover, this PhD thesis wouldn't be possible without all the people who have supported me, both during all the years of my PhD but also in the years before.
Bibliography

[Citing pages are listed after each reference.]
Umut A. Acar. Self-Adjusting Computation. PhD thesis, Princeton University, 2005. [Pages 73, 135, 161, and 174]

Umut A. Acar. Self-adjusting computation (an overview). In PEPM, pages 1–6. ACM, 2009. [Pages 2, 128, 161, and 173]

Umut A. Acar, Guy E. Blelloch, and Robert Harper. Adaptive functional programming. TOPLAS, 28(6):990–1034, November 2006. [Page 173]

Umut A. Acar, Amal Ahmed, and Matthias Blume. Imperative self-adjusting computation. In Proceedings of the 35th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages, POPL '08, pages 309–322, New York, NY, USA, 2008. ACM. ISBN 978-1-59593-689-9. doi: 10.1145/1328438.1328476. URL http://doi.acm.org/10.1145/1328438.1328476. [Pages 143, 146, 161, 194, 200, 201, 202, 204, 205, and 212]

Umut A. Acar, Guy E. Blelloch, Matthias Blume, Robert Harper, and Kanat Tangwongsan. An experimental analysis of self-adjusting computation. TOPLAS, 32(1):3:1–3:53, November 2009. [Page 173]

Umut A. Acar, Guy Blelloch, Ruy Ley-Wild, Kanat Tangwongsan, and Duru Turkoglu. Traceable data types for self-adjusting computation. In PLDI, pages 483–496. ACM, 2010. [Pages 162 and 173]

Stefan Ackermann, Vojin Jovanovic, Tiark Rompf, and Martin Odersky. Jet: An embedded DSL for high performance big data processing. In Int'l Workshop on End-to-end Management of Big Data (BigData), 2012. [Page 49]

Agda Development Team. The Agda Wiki. http://wiki.portal.chalmers.se/agda, 2013. Accessed on 2017-06-29. [Page 181]

Amal Ahmed. Step-Indexed Syntactic Logical Relations for Recursive and Quantified Types. In Programming Languages and Systems, pages 69–83. Springer Berlin Heidelberg, March 2006. doi: 10.1007/11693024_6. [Pages 143, 146, 161, 163, 193, 194, 201, 202, 204, 205, 210, 212, and 213]

Guillaume Allais, James Chapman, Conor McBride, and James McKinna. Type-and-scope safe programs and their proofs. In Yves Bertot and Viktor Vafeiadis, editors, CPP 2017. ACM, New York, NY, January 2017. ISBN 978-1-4503-4705-1. [Page 186]

Thorsten Altenkirch and Bernhard Reus. Monadic Presentations of Lambda Terms Using Generalized Inductive Types. In Jörg Flum and Mario Rodriguez-Artalejo, editors, Computer Science Logic, Lecture Notes in Computer Science, pages 453–468. Springer Berlin Heidelberg, September 1999. ISBN 978-3-540-66536-6, 978-3-540-48168-3. doi: 10.1007/3-540-48168-0_32. [Page 186]
Nada Amin and Tiark Rompf. Type soundness proofs with definitional interpreters. In Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages, POPL 2017, pages 666–679, New York, NY, USA, 2017. ACM. ISBN 978-1-4503-4660-3. doi: 10.1145/3009837.3009866. URL http://doi.acm.org/10.1145/3009837.3009866. [Pages 186 and 201]

Andrew W. Appel and Trevor Jim. Shrinking lambda expressions in linear time. JFP, 7:515–540, 1997. [Page 129]

Andrew W. Appel and David McAllester. An indexed model of recursive types for foundational proof-carrying code. ACM Trans. Program. Lang. Syst., 23(5):657–683, September 2001. ISSN 0164-0925. doi: 10.1145/504709.504712. URL http://doi.acm.org/10.1145/504709.504712. [Page 161]

Robert Atkey. The incremental λ-calculus and relational parametricity, 2015. URL http://bentnib.org/posts/2015-04-23-incremental-lambda-calculus-and-parametricity.html. Last accessed on 29 June 2017. [Pages 81, 126, 163, and 193]

Lennart Augustsson and Magnus Carlsson. An exercise in dependent types: A well-typed interpreter. In Workshop on Dependent Types in Programming, Gothenburg, 1999. [Page 186]

H. P. Barendregt. The Lambda Calculus: Its Syntax and Semantics. Elsevier, 1984. [Page 63]

Henk P. Barendregt. Lambda Calculi with Types, pages 117–309. Oxford University Press, New York, 1992. [Pages 164, 166, and 171]

Nick Benton, Chung-Kil Hur, Andrew J. Kennedy, and Conor McBride. Strongly Typed Term Representations in Coq. Journal of Automated Reasoning, 49(2):141–159, August 2012. ISSN 0168-7433, 1573-0670. doi: 10.1007/s10817-011-9219-0. [Page 186]

Jean-Philippe Bernardy and Marc Lasson. Realizability and parametricity in pure type systems. In Proceedings of the 14th International Conference on Foundations of Software Science and Computational Structures: Part of the Joint European Conferences on Theory and Practice of Software, FOSSACS'11/ETAPS'11, pages 108–122, Berlin, Heidelberg, 2011. Springer-Verlag. ISBN 978-3-642-19804-5. [Pages 89, 163, 164, 165, 166, 167, 172, and 177]

Jean-Philippe Bernardy, Patrik Jansson, and Ross Paterson. Parametricity and dependent types. In Proceedings of the 15th ACM SIGPLAN International Conference on Functional Programming, ICFP '10, pages 345–356, New York, NY, USA, 2010. ACM. ISBN 978-1-60558-794-3. doi: 10.1145/1863543.1863592. URL http://doi.acm.org/10.1145/1863543.1863592. [Pages 166, 167, and 172]

Gavin M. Bierman, Erik Meijer, and Mads Torgersen. Lost in translation: formalizing proposed extensions to C♯. In OOPSLA, pages 479–498. ACM, 2007. [Page 47]

Jose A. Blakeley, Per-Ake Larson, and Frank Wm. Tompa. Efficiently updating materialized views. In SIGMOD, pages 61–71. ACM, 1986. [Pages v, vii, 2, 161, and 174]

Maximilian C. Bolingbroke and Simon L. Peyton Jones. Types are calling conventions. In Proceedings of the 2nd ACM SIGPLAN Symposium on Haskell, Haskell '09, pages 1–12, New York, NY, USA, 2009. ACM. ISBN 978-1-60558-508-6. doi: 10.1145/1596638.1596640. [Page 160]

K. J. Brown, A. K. Sujeeth, Hyouk Joong Lee, T. Rompf, H. Chafi, M. Odersky, and K. Olukotun. A heterogeneous parallel framework for domain-specific languages. In Parallel Architectures and Compilation Techniques (PACT), 2011 International Conference on, pages 89–100, 2011. doi: 10.1109/PACT.2011.15. [Page 78]
Yufei Cai. PhD thesis, University of Tübingen, 2017. In preparation. [Pages 81 and 126]

Yufei Cai, Paolo G. Giarrusso, Tillmann Rendel, and Klaus Ostermann. A theory of changes for higher-order languages — Incrementalizing λ-calculi by static differentiation. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '14, pages 145–155, New York, NY, USA, 2014. ACM. ISBN 978-1-4503-2784-8. doi: 10.1145/2594291.2594304. URL http://doi.acm.org/10.1145/2594291.2594304. [Pages 2, 3, 59, 66, 84, 92, 93, 107, 117, 121, 124, 125, 135, 136, 137, 138, 141, 145, 146, 154, 159, 175, 177, 182, 184, 190, and 223]

Jacques Carette, Oleg Kiselyov, and Chung-chieh Shan. Finally tagless, partially evaluated: Tagless staged interpreters for simpler typed languages. JFP, 19:509–543, 2009. [Page 31]

Craig Chambers, Ashish Raniwala, Frances Perry, Stephen Adams, Robert R. Henry, Robert Bradshaw, and Nathan Weizenbaum. FlumeJava: easy, efficient data-parallel pipelines. In PLDI, pages 363–375. ACM, 2010. [Pages 10, 20, and 47]

Yan Chen, Joshua Dunfield, Matthew A. Hammer, and Umut A. Acar. Implicit self-adjusting computation for purely functional programs. In ICFP, pages 129–141. ACM, 2011. [Page 173]

Yan Chen, Umut A. Acar, and Kanat Tangwongsan. Functional programming for dynamic and large data with self-adjusting computation. In Proceedings of the 19th ACM SIGPLAN International Conference on Functional Programming, ICFP '14, pages 227–240, New York, NY, USA, 2014. ACM. ISBN 978-1-4503-2873-9. doi: 10.1145/2628136.2628150. URL http://doi.acm.org/10.1145/2628136.2628150. [Page 162]

James Cheney, Sam Lindley, and Philip Wadler. A Practical Theory of Language-integrated Query. In Proceedings of the 18th ACM SIGPLAN International Conference on Functional Programming, ICFP '13, pages 403–416, New York, NY, USA, 2013. ACM. ISBN 978-1-4503-2326-0. doi: 10.1145/2500365.2500586. [Page 174]

Adam Chlipala. Parametric higher-order abstract syntax for mechanized semantics. In Proceedings of the 13th ACM SIGPLAN international conference on Functional programming, ICFP '08, pages 143–156, New York, NY, USA, 2008. ACM. ISBN 978-1-59593-919-7. doi: 10.1145/1411204.1411226. URL http://doi.acm.org/10.1145/1411204.1411226. [Page 186]

Ezgi Çiçek, Zoe Paraskevopoulou, and Deepak Garg. A Type Theory for Incremental Computational Complexity with Control Flow Changes. In Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming, ICFP 2016, pages 132–145, New York, NY, USA, 2016. ACM. ISBN 978-1-4503-4219-3. doi: 10.1145/2951913.2951950. [Page 161]

Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms. MIT Press, Cambridge, MA, USA, 2nd edition, 2001. ISBN 0-262-03293-7, 9780262032933. [Page 73]

Oren Eini. The pain of implementing LINQ providers. Commun. ACM, 54(8):55–61, 2011. [Page 47]

Conal Elliott and Paul Hudak. Functional reactive animation. In ICFP, pages 263–273. ACM, 1997. [Page 174]

Conal Elliott, Sigbjorn Finne, and Oege de Moor. Compiling embedded languages. JFP, 13(2):455–481, 2003. [Pages 20, 21, and 48]

Burak Emir, Andrew Kennedy, Claudio Russo, and Dachuan Yu. Variance and generalized constraints for C♯ generics. In ECOOP, pages 279–303. Springer-Verlag, 2006. [Page 4]
Burak Emir, Qin Ma, and Martin Odersky. Translation correctness for first-order object-oriented pattern matching. In Proceedings of the 5th Asian conference on Programming languages and systems, APLAS'07, pages 54–70, Berlin, Heidelberg, 2007a. Springer-Verlag. ISBN 3-540-76636-7, 978-3-540-76636-0. URL http://dl.acm.org/citation.cfm?id=1784774.1784782. [Page 4]

Burak Emir, Martin Odersky, and John Williams. Matching objects with patterns. In ECOOP, pages 273–298. Springer-Verlag, 2007b. [Pages 4, 32, 44, and 45]

Sebastian Erdweg, Paolo G. Giarrusso, and Tillmann Rendel. Language composition untangled. In Proceedings of Workshop on Language Descriptions, Tools and Applications (LDTA), LDTA '12, pages 7:1–7:8, New York, NY, USA, 2012. ACM. ISBN 978-1-4503-1536-4. doi: 10.1145/2427048.2427055. URL http://doi.acm.org/10.1145/2427048.2427055. [Pages 4 and 187]

Sebastian Erdweg, Tijs van der Storm, and Yi Dai. Capture-Avoiding and Hygienic Program Transformations. In ECOOP 2014 – Object-Oriented Programming, pages 489–514. Springer Berlin Heidelberg, July 2014. doi: 10.1007/978-3-662-44202-9_20. [Page 93]

Andrew Farmer, Andy Gill, Ed Komp, and Neil Sculthorpe. The HERMIT in the Machine: A Plugin for the Interactive Transformation of GHC Core Language Programs. In Proceedings of the 2012 Haskell Symposium, Haskell '12, pages 1–12, New York, NY, USA, 2012. ACM. ISBN 978-1-4503-1574-6. doi: 10.1145/2364506.2364508. [Page 78]

Leonidas Fegaras. Optimizing queries with object updates. Journal of Intelligent Information Systems, 12:219–242, 1999. ISSN 0925-9902. URL http://dx.doi.org/10.1023/A:1008757010516. [Page 71]

Leonidas Fegaras and David Maier. Towards an effective calculus for object query languages. In Proceedings of the 1995 ACM SIGMOD international conference on Management of data, SIGMOD '95, pages 47–58, New York, NY, USA, 1995. ACM. ISBN 0-89791-731-6. doi: 10.1145/223784.223789. URL http://doi.acm.org/10.1145/223784.223789. [Page 71]

Leonidas Fegaras and David Maier. Optimizing object queries using an effective calculus. ACM Trans. Database Systems (TODS), 25:457–516, 2000. [Pages 11, 21, and 48]

Denis Firsov and Wolfgang Jeltsch. Purely Functional Incremental Computing. In Fernando Castor and Yu David Liu, editors, Programming Languages, number 9889 in Lecture Notes in Computer Science, pages 62–77. Springer International Publishing, September 2016. ISBN 978-3-319-45278-4, 978-3-319-45279-1. doi: 10.1007/978-3-319-45279-1_5. [Pages 70, 75, 156, and 161]

Philip J. Fleming and John J. Wallace. How not to lie with statistics: the correct way to summarize benchmark results. Commun. ACM, 29(3):218–221, March 1986. [Pages xvii, 39, and 41]

Bent Flyvbjerg. Five misunderstandings about case-study research. Qualitative Inquiry, 12(2):219–245, 2006. [Page 23]

Miguel Garcia, Anastasia Izmaylova, and Sibylle Schupp. Extending Scala with database query capability. Journal of Object Technology, 9(4):45–68, 2010. [Page 48]

Andy Georges, Dries Buytaert, and Lieven Eeckhout. Statistically rigorous Java performance evaluation. In OOPSLA, pages 57–76. ACM, 2007. [Pages 40, 41, and 132]

Paolo G. Giarrusso. Open GADTs and declaration-site variance: A problem statement. In Proceedings of the 4th Workshop on Scala, SCALA '13, pages 51–54, New York, NY, USA, 2013. ACM. ISBN 978-1-4503-2064-1. doi: 10.1145/2489837.2489842. URL http://doi.acm.org/10.1145/2489837.2489842. [Page 4]