Upload
others
View
8
Download
0
Embed Size (px)
Citation preview
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
Formal Foundations of Serverless ComputingAbhinav Jangda Donald Pinckney Samuel Baxter
Joseph Spitzer Breanna Devore-McDonald Yuriy Brun
Arjun GuhaUniversity of Massachusetts Amherst
AbstractA robust, large-scale web service can be di�cult to engineer.When demand spikes, it must con�gure new machines andmanage load-balancing; when demand falls, it must shutdown idle machines to reduce costs; and when a machinecrashes, it must quickly work around the failure without los-ing data. In recent years, serverless computing, a new cloudcomputing abstraction, has emerged to help address thesechallenges. In serverless computing, programmers writeserverless functions, and the cloud platform transparentlymanages the operating system, resource allocation, load-balancing, and fault tolerance. In 2014, AmazonWeb Servicesintroduced the �rst serverless platform, AWS Lambda, andsimilar abstractions are now available on all major clouds.Unfortunately, the serverless computing abstraction ex-
poses several low-level operational details that make it hardfor programmers to write and reason about their code. Thispaper sheds light on this problem by presenting � , an op-erational semantics of the essence of serverless computing.Despite being a small core calculus (less than one column),� models all the low-level details that serverless functionscan observe. To show that � is useful, we present threeapplications. First, to make it easier for programmers to rea-son about their code, we present a simpli�ed semantics ofserverless execution and precisely characterize when thesimpli�ed semantics and � coincide. Second, we augment� with a key-value store, which allows us to reason aboutstateful serverless functions. Third, since a handful of server-less platforms support serverless function composition, weshow how to extend � with a composition language. Wehave implemented this composition language and show thatit outperforms prior work.
1 IntroductionServerless computing, also known as functions as a service, is anew approach to cloud computing that allows programmersto run event-driven functions in the cloud without the needto manage resource allocation or con�gure the runtime en-vironment. Instead, when a programmer deploys a serverlessfunction, the cloud platform automatically manages depen-dencies, compiles code, con�gures the operating system,
and manages resource allocation. Unlike virtual machinesor containers, which require low-level system and resourcemanagement, serverless computing allows programmers tofocus entirely on application code. If demand for the functionsuddenly increases, the cloud platform may transparentlyload another instance of the function on a new machine andmanage load-balancing without programmer intervention.Conversely, the platform will transparently terminate under-utilized instances when demand for the function falls. In fact,the cloud provider may shutdown all running instances ofthe function if there is no demand for an extended period oftime. Since the platform completely manages the operatingsystem and resource allocation in this manner, serverlesscomputing is a language-level abstraction for programmingthe cloud.An economic advantage of serverless functions is that
they only incur costs for the time spent processing events.Therefore, a function that is never invoked incurs no cost.By contrast, virtual machines incur costs when they areidle, and they need idle capacity to smoothly handle unex-pected increases in demand. Serverless computing allowscloud platforms to more easily optimize utilization acrossseveral customers, which increases hardware utilization andlowers costs (e.g., by lowering energy consumption).
Amazon Web Services introduced the �rst serverless com-puting platform, AWS Lambda, in 2014, and similar abstrac-tions are now available from all major cloud providers [1,14, 22, 26, 29, 31]. Serverless computing has seen rapid adop-tion [11], and programmers now often use serverless com-puting to write short, event-driven computations, such asweb services, backends for mobile apps, and aggregators forIoT devices.In the research community, there is burgeoning interest
in developing new programming abstractions for server-less computing, including abstractions for big data process-ing [4, 18, 28], modular programming [7], information �owcontrol [2], chatbot design [8], and virtual network func-tions [40]. However, serverless computing has signi�cant,peculiar characteristics that prior work does not address.
The serverless computing abstraction, despite its many ad-vantages, exposes several low-level operational details thatmake it hard for programmers to write and reason abouttheir code. For example, to reduce latency, serverless plat-forms try to reuse the same function instance to process
1
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
Abhinav Jangda, Donald Pinckney, Samuel Baxter, Joseph Spitzer, Breanna Devore-McDonald, Yuriy Brun, and Arjun Guha
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
multiple requests. However, this behavior is not transparent,and it is easy to write a serverless function that producesincorrect results or leaks con�dential data when reused. Arelated problem is that serverless platforms abruptly termi-nate function instances when they are idle, which can leadto data loss if the programmer is not careful. To tolerate net-work and system failures, serverless platforms automaticallyre-execute functions on di�erent machines. However, it isup to the programmer to ensure that their functions performcorrectly when they are re-executed, which may include con-current re-execution when a transient failure occurs. Theseproblems are exacerbated when an application is composedof several functions. In summary, serverless functions are notfunctions in any typical sense of the word.
Our contributions. This paper presents a formal founda-tion for serverless computing. Based on our experience withseveral major serverless computing platforms (Google CloudFunctions, Apache OpenWhisk, and AWS Lambda), we havedeveloped a detailed, operational semantics of serverlessplatforms, called � , which manifests the essential low-levelbehaviors of these serverless platforms, including failures,concurrency, function restarts, and instance reuse. The de-tails of � matter because they are observable by programs,but programmers �nd it hard to write code that correctlyaddresses all of these behaviors. By elucidating these behav-iors, � can guide the development of programming toolsand extensions to serverless platforms. To demonstrate that� is useful, this paper presents three distinct case studiesthat employ it.First, we address the problem that it can be hard for pro-
grammers to reason about the complex, low-level behavior ofserverless platforms, which � models. We present an alter-native, naive semantics of serverless computing, that elidesthese complex behaviors, and precisely characterize when itis safe for a programmer to use the naive semantics insteadof � . Speci�cally, we establish a weak bisimulation between� and the naive semantics that is conditioned on simplesemantic criteria. This result helps programmers con�dentlyabstract away the low-level details of serverless computing.Second, we address the fact that serverless functions fre-
quently use a cloud-hosted key-value store to share state.We have designed � to be an extensible semantics, so weextend it with a model of a key-value store. Using this exten-sion, we precisely characterize idempotence for serverlessfunctions, which allows programmers to reason using anextended naive semantics, where each event is processed inisolation. The recipe that we follow to add a key-value storeto � could also be used to augment � with models of othercloud computing services.
Third, we account for serverless platforms that go beyondrunning individual serverless functions and also supportserverless function orchestration. We extend � to supporta serverless programming language (���), which has a suite
of I/O primitives, data processing operations, and composi-tion operators that can run safely and e�ciently without theneed for operating system isolation mechanisms. To performoperations that are beyond the scope of ���, we allow ���programs to invoke existing serverless functions as blackboxes. We implement ���, evaluate its performance, and �ndthat it can outperform a popular alternative in certain com-mon cases. Using case studies, we show that ��� is expressiveand easy to extend with new features.Our hope is that � will be a foundation for further re-
search on language-based abstractions for serverless comput-ing. The case studies that we present are detailed examplesthat show how to to design new abstractions and study ex-isting abstractions for serverless computing, using � .
The rest of this paper is organized as follows. §2 presentsan overview of serverless computing, and a variety of is-sues that arise when writing serverless code. §3 presents � ,our formal semantics of serverless computing. §4 presentsa simpli�ed semantics of serverless computing and provesexactly when it coincides with � . §5 augments � with akey-value store. §6 and §7 extend � with a language forserverless orchestration. Finally, §8 discusses related workand §9 concludes.
2 Overview of Serverless ComputingTo motivate the need for a formal foundation of serverlesscomputing, consider the serverless function in Figure 1a.1During deployment, the programmer can specify the typesof events that will trigger the function, such as, messageson a message-bus, updates to a database, or web requests toa ���. The function receives the event as a ����-formattedobject. In this example, the type �eld of the event deter-mines whether the function deposits money to an accountor transfers money between accounts. The function also re-ceives a callback, which it must call when event processingis complete. The callback allows the function to run asyn-chronously, although our example is presently synchronous.For brevity, we have elided authentication, authorization,error handling, and most input validation. However, thisfunction su�ers several other problems because it is notproperly designed for serverless execution.
Ephemeral state. The �rst problem arises after a few min-utes of inactivity: all updates to the accounts global variableare lost. This problem occurs because the serverless platformruns the function in an ephemeral container, and silentlyshuts down the container when it is idle. The exact time-out depends on overall load on the cloud provider’s infras-tructure, and is not known to the developer. Moreover, thefunction does not receive a noti�cation before a shut down
1The examples in this paper are in JavaScript— the language that is mostwidely supported by serverless platforms— and arewritten for Google CloudFunctions. However, it is easy to port our examples to other languages andserverless platforms.
2
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
Formal Foundations of Serverless Computing
1 let accounts = new Map();
2 exports.bank = function(req, res) {
3 if (req.body.type === �deposit�) {
4 accounts.set(req.body.name, req.body.value);
5 res.send(true);
6 } else if (req.body.type === �transfer�) {
7 let { from, to, amnt } = req.body;
8 if (accounts.get(from) >= amnt) {
9 accounts.set(to, accounts.get(to) + amnt);
10 accounts.set(from, accounts.get(from) - amnt);
11 res.send(true);
12 } else {
13 res.send(false); }}};
(a) An incorrect implementation that does not address low-leveldetails of serverless execution.1 let Datastore = require(�@google-cloud/datastore�);
2 exports.bank = function(req, res) {
3 let ds = new Datastore({ projectId: �bank-app� });
4 let dst = ds.transaction();
5 dst.run(function() {
6 let tId = ds.key([�Transaction�, req.body.transId]);
7 dst.get(tId, function(err, trans) {
8 if (err || trans) {
9 dst.rollback(function() { res.send(false); });
10 } else if (req.body.type === �deposit�) {
11 let to = ds.key([�Account�, req.body.to]);
12 dst.get(to, function(err, acct) {
13 acct.balance += req.body.amount;
14 dst.save({ key: to, data: acct });
15 dst.save({ key: tId, data: {} });
16 dst.commit(function() { res.send(true); });
17 });
18 } else if (req.body.type === �transfer�) {
19 let amnt = req.body.amount;
20 let from = ds.key([�Account�, req.body.from]);
21 let to = ds.key([�Account�, req.body.to]);
22 dst.get([from, to], function(err, accts) {
23 if (accts[0].balance >= amnt) {
24 accts[0].balance -= amnt;
25 accts[1].balance += amnt;
26 dst.save([{ key: from, data: accts[0] },
27 { key: to, data: accts[1] }]);
28 dst.save({ key: tId, data: {} });
29 dst.commit(function() { res.send(true); });
30 } else {
31 dst.rollback(function() { res.send(false);
32 });}});}}});});};
(b) A correct implementation that addresses instance termination,concurrency, and idempotence.
Figure 1. Serverless functions for banking.
occurs. Similarly, the platform automatically starts a newcontainer when a event eventually arrives, but all state islost. Therefore, the function must serialize all state updatesto a persistent store. In serverless environments, the localdisk is also ephemeral, therefore our example function mustuse a network-attached database or storage system.
Implicit parallel execution. The second problem ariseswhen the function receives multiple events over a shortperiod of time. When a function is under high load, theserverless platform transparently starts new instances of the
function and manages load-balancing to reduce latency. Eachinstance runs in isolation (in a container), there is no guaran-tee that two events from the same requester will be processedby the same instance, and instances may process eventsin parallel. Therefore, a correct implementation should usetransactions to correctly manage this implicit parallelism.
At-least-once execution. The third problem arises whenfailures occur in the serverless computing infrastructure.Serverless platforms are distributed systems that are de-signed to re-invoke functions when a failure is detected.2However, if the failure is transient, a single event may beprocessed to completion multiple times. Most platforms runfunctions at least once in response to a single event, whichcan cause problems when functions have side-e�ects [3, 23,30, 32]. In our example function, this would duplicate de-posits and transfers. We can �x this problem in several ways.A common approach is to require each request to have aunique identi�er, maintain a consistent log of processed iden-ti�ers in the database, and ignore requests that are alreadyin the log.
Figure 1b �xes the three problems mentioned above. Thisimplementation uses a cloud-hosted key-value store for per-sistence, uses transactions to address concurrency, and re-quires each request to have a unique identi�er to supportsafe re-invocation. This version of the function is more thantwice as long as the original, and requires the programmer tohave a deep understanding of the serverless execution model.The next section presents an operational semantics of server-less platforms (� ) that succinctly describes its peculiarities.Using � , the rest of the paper precisely characterizes thekinds of properties that are needed for serverless functions,such as our example, to be correct.
3 Semantics of Serverless ComputingWe now present our operational semantics of serverless plat-forms (� ) that captures the essential details of the serverlessexecution model, including concurrency, failures, executionretries, and function instance reuse. Figure 2 presents �in full, and is divided into two parts: (1) a model of server-less functions, and (2) a model of a serverless platform thatreceives requests (i.e., events) and produces responses by run-ning serverless functions. The peculiar features of serverlesscomputing are captured by the latter part.
Serverless functions. Serverless platforms allow program-mers to write serverless functions in a variety of sourcelanguages, but platforms themselves are source-language ag-nostic. Most platforms only require that serverless functionsoperate asynchronously, process a single request at a time,
2Even if the serverless platform does not re-invoke functions itself, there aresituations where the external caller will re-invoke a function to workarounda transient failure. For example, this arises when functions are triggered by���� requests.
3
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
Abhinav Jangda, Donald Pinckney, Samuel Baxter, Joseph Spitzer, Breanna Devore-McDonald, Yuriy Brun, and Arjun Guha
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
Serverless Functions hf , �0, �, recv, stepiFunction name f B · · ·Internal states �B · · ·Initial state �0 2 �Receive event recv 2 � ⇥ �! �Internal step step 2 �! � ⇥ t With e�ect tValues � B · · · ����, ����, etc.Commands t B �
| return(� ) Return value
Serverless PlatformRequest �� x B · · ·Instance �� � B · · ·Execution mode mB idle Idle
| busy(x ) Processing xTransition labels `B Internal transition
| start(x, � ) Receive �| stop(x, � ) Respond �
Components C B F(f ,m, � , � ) Function instance| R(f , x, � ) Apply f to �| S(x, � ) Respond with value �
Component set CB {C1, · · · , Cn }
C`7�! C
R��x is fresh
Cstart(x,� )========) CR(f , x, � )
C���� is fresh recv(�, �0) = �
CR(f , x, � ) ) CR(f , x, � )F(f , busy(x ), � , � )
W���recv(�, � ) = � 0
CR(f , x, � )F(f , idle, � , � ) ) CR(f , x, � )F(f , busy(x ), � 0, � )
H�����step(� ) = (� 0, � )
CF(f , busy(x ), � , � ) ) CF(f , busy(x ), � 0, � )
R���step(� ) = (� 0, return(� 0))
CR(f , x, � )F(f , busy(x ), � , � )stop(x,� 0)========) CF(f , idle, � , � )S(x, � 0)
D��CF(f ,m, � , � ) ) C
Figure 2. � : An operational model of serverless platforms.
and usually consume and produce ���� values. These are thefeatures that our abstract model includes (Figure 2).
In our model, the operation of a serverless function (f ) isde�ned by two functions: a function that receives a request(recv), and a function that takes an internal step (step) thatmay produce a command for the serverless platform (t ). Fornow, the only valid command is return(� ), which indicatesthat the response to the last request is the value � .3 Thestate of a serverless function (� ) is abstract to the serverlessplatform. However, there is a distinguished initial state (�0)in which the platform launches a serverless function. In3 §5 extends our model with new commands.
practice, the initial state depends on the source language.For example, if f were written in JavaScript then �0 wouldbe the initial JavaScript heap.
Serverless platform. � is an operational semantics of aserverless platform that processes several concurrent re-quests. � is written in a process-calculus style, where thestate of the platform consists of a collection of running or idlefunctions (known as function instances), pending requests,and responses. A new request may arrive at any time for aserverless function f , and each request is given a globallyunique identi�er x (the R�� rule). However, the platformdoes not process requests immediately. Instead, at some laterstep, the platform may either cold-start a new instance off (the C��� rule), or it may warm-start by processing therequest on an existing, idle instance of f (the W��� rule).The internal steps of a function instance are unobservable(the H����� rule). The only observable command that aninstance can produce is to respond to a pending request(the R��� rule). When an instance responds to a request, �produces a response object, an observable response event,and marks the function instance as idle, which allows it tobe reused. Finally, a function instance may die at any timewithout noti�cation (the D�� rule).
� captures several subtle behaviors that occur in server-less platforms. (1) The platform produces an observable event(start(x ,� )) when it receives a request (R��), and not whenit starts to process the request. This is necessary because theplatform may start several instances for a single event x , forexample, if the platform detects a potential failure. (2) Duringa warm-start, it does not re-initialize the state. Instead, itapplies recv to whatever last state the instance had. (3) Theplatform can cold-start or warm-start a function instancefor any pending request at any time, even if the request iscurrently being processed. In practice, this behavior occursduring a transient failure, when an instance becomes tem-porarily unreachable. (4) After responding to a request, theplatform does not discard function state (R������), whichallows functions to cache results across invocations. (5) Iftwo instances are processing a single request x and one func-tion responds, the other function will eventually get stuckbecause there is no longer a request for it to respond to. Thestuck function will eventually terminate (D��). (6) Instancetermination is not observable and may occur due to failure,or when the platform reclaims the resources held by an in-stance that is idle or stuck (D��). (7) There is no rule in � torestart function execution after it dies. Instead, the semanticscan nondeterministically launch a new instance at any time.
In summary, � succinctly models the low-level details ofserverless platforms. The semantics manifests the subtletiesthat make serverless programming hard. The next sectionuses � to present a simpler model to make serverless pro-gramming easier.
4
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
Formal Foundations of Serverless Computing
Naive function state AB hf ,m, *� i
A `7�! A
N�S����x is fresh recv(�, �0) = � 0
hf , idle, *� i start(x,� )7��������! hf , busy(x ), [�0, � 0]i
N�S���step(� ) = (� 0, � )
hf , busy(x ), *� ++[� ]i 7! hf , busy(x ), *� ++[� , � 0]i
N�S���step(� ) = (� 0, return(� ))
hf , busy(x ), *� ++[� ]i stop(x,� )7��������! hf , idle, [� ]i
Figure 3. A naive semantics of serverless functions.
4 A Simpler Serverless SemanticsA natural way to make serverless programming easier is tosimplify the model. For example, we could execute functionsexactly once, or avoid reusing state. Unfortunately, imple-menting these changes is likely to be expensive (and, in manysituations, beyond our control). Therefore, the approach thatwe take in this section is to de�ne a simpler, naive semanticsof serverless computing. Moreover, using a weak bisimu-lation theorem, we precisely characterize when the naivesemantics and � coincide. This theorem addresses the low-level details of serverless execution once and for all, thusfreeing programmers to reason using the naive semanticsinstead of � .In the naive serverless semantics (Figure 3), serverless
functions (f ) are the same as the serverless functions in � .However, the operational semantics of the naive platform ismuch simpler: the platform runs a single function f on onerequest at a time, without any concurrency or failures. Ateach step of execution, the naive semantics either (1) startsprocessing a new event if the platform is idle, (2) takes aninternal step if the platform is busy, or (3) responds andreturns to being idle. The state of a naive platform consistsof (1) the function’s name (f ); (2) its execution mode (m); and(3) a trace of function states (*� ), where the last element of thetrace is the current state, and the �rst element was the initialstate of the function instance when it received the currentrequest. The trace is a convenience that helps us relate thenaive semantics to � , but has no e�ect on execution.
Serverless function correctness. Note that the naive se-mantics is an idealized model and is not correct for arbitraryserverless functions. However, we can precisely characterizethe exact conditions when it is safe for a programmer to rea-son with the naive semantics, even if their code is running ona full-�edged serverless platform (i.e., using � ). We requirethe programmer to de�ne a safety relation (R) over the stateof the serverless function. At a high-level, the programmermust ensure that (1) R is an equivalence relation, (2) pairsof equivalent states step to pairs of equivalent states, and(3) that the serverless function produces the same observable
command (if any) on equivalent states. The safety relation isformally de�ned as follows.
De�nition 4.1 (Safety Relation). For a serverless functionhf ,�0, �, recv, stepi, the relationR ✓ �⇥� is a safety relationif:
1. R is an equivalence relation,2. for all (�1,�2) 2 R and � , (recv(�,�1), recv(�,�2)) 2R,
3. for all (�1,�2) 2 R , if (� 01, t1) = step(�1) and (� 02, t2) =step(�2) then (� 01,�
02 ) 2 R and t1 = t2.
Weak bisimulation. We now prove a weak bisimulationbetween the naive semantics and � , conditioned on theserverless functions satisfying the safely relation de�nedabove. We prove a weak (rather than a strong) bisimulationbecause � models serverless execution inmore detail. There-fore, a single step in the naive semantics may correspond toseveral steps in � . The theorem below states that the naivesemantics and � are indistinguishable to the programmer,modulo unobservable steps.To establish the weak bisimulation, we de�ne a relation
(A ⇡ C) that precisely describes when naive state (A) isequivalent to a � state (C), as follows.
De�nition 4.2 (Bisimulation Relation). A ⇡ C is de�nedas:
1. hf , idle,*� i ⇡ C, and2. If for all F( f , busy(x ),� ,�) 2 C, 9� 0.� 0 2 *� such that
(� ,� 0) 2 R then hf , busy(x ),*� i ⇡ R( f ,x ,� ) C.
This relation formally captures several key ideas that arenecessary to reason about serverless execution. (1) A singlenaive state may be equivalent to multiple distinct � states.This may occur due to failures and restarts. (2) Conversely,a single � state may be equivalent to several naive states.This occurs when a serverless platform is processing severalrequests. In fact, we require all � states to be equivalentto all idle naive states, which is necessary for � to receiverequests at any time. (3) The � state may have several func-tion instances evaluating the same request. (4) Due to warmstarts, the state of a function may not be identical in thetwo semantics; however, they will be equivalent (per R).(5) Due to failures, the � semantics can “fall behind” thenaive semantics during evaluation, but the state of any func-tion instance in � will be equivalent to some state in theexecution history of the naive semantics. The weak bisimu-lation theorem accounts for failures, by specifying the seriesof � steps needed to then catch up with the naive semanticsbefore an observable event occurs.
An important simpli�cation in the naive semantics is thatit executes a single request at a time. Therefore, to relatea naive execution to a � execution, we need to �lter outevents that are generated by other requests. To do so, we
5
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
Abhinav Jangda, Donald Pinckney, Samuel Baxter, Joseph Spitzer, Breanna Devore-McDonald, Yuriy Brun, and Arjun Guha
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
de�ne x (*`) as the sub-sequence of ` that only consists of
events labeled x .We can now state the weak bisimulation theorem.
Theorem 4.3 (Weak Bisimulation). For a serverless functionf with a safety relation R, for all A, C, `:
1. For all A 0, if A `7�! A 0 and A ⇡ C then there exists*`1,
*`2, C0, Ci and Ci+1 such that C )
*`1=) Ci
`=) Ci+1 )
*`2=) C0,
x (*`1) = � , x (
*`2) = � , and A 0 ⇡ C0
2. For all C0, if C`=) C0 and A ⇡ C then there exists A 0
such that A 0 ⇡ C0 and A 7�!`�! A 0.Proof. By Theorems A.4 and A.5 in the appendix (part of thesubmitted supplemental material). ⇤
The �rst part of this theorem states that every step inthe naive semantics corresponds to some sequence of stepsin � . We can interpret this as the sequence of steps that aserverless platform needs to execute to faithfully implementthe naive semantics. On the other hand, the second part ofthis theorem states that any arbitrary step in � — includingfailures, retries, and warm starts — corresponds to a (possiblyempty) sequence of steps in the naive semantics.In summary, this theorem allows programmers to justi�-
ably ignore the low-level details of � , and simply use thenaive semantics, if their code satis�es De�nition 4.1. Thereare now several tools that are working toward verifyingthese kinds of properties in scripting languages, such asJavaScript [19, 34], which is the most widely supported lan-guage for writing serverless functions. Our work, whichis source language-neutral, complements this work by es-tablishing the veri�cation conditions necessary for correctserverless execution.
5 Serverless Functions and Cloud StorageIt is common for serverless functions to use an externaldatabase for persistent storage because their local state isephemeral. But, serverless platforms warn programmers thatstateful serverless functions must be idempotent [23, 30, 32].In other words, they should be able to tolerate re-execution.Unfortunately, it is completely up to programmers to ensurethat their code is idempotent, and platforms do not pro-vide a clear explanation of what idempotence means, giventhat serverless functions perform warm-starts, execute con-currently, and may fail at any time. We now address theseproblems by adding a key-value store to both � and thenaive semantics, and present an extended weak bisimula-tion. In particular, the naive semantics still processes a singlerequest at a time, which is a convenient mental model forprogrammers.
Figure 4 augments � with a key-value store that supportstransactions. To the set of components, we add exactly onekey-value store (D(M,L)), which has a map from keys to
Serverless FunctionsKey set k B stringsKey-value map M 2 k * �Commands t B · · ·
| beginTx Lock data store| read(k ) Read value| write(k, � ) Write value| endTx Unlock data store
Component set CB {C1, · · · , Cn, D(M, L) }Lock state LB free Data store is free
| owned(�, M ) Owned by �
R���step(� ) = (� 0, read(k )) recv(M 0(k ), � 0) = � 00
CF(f , � , busy(x ), � )D(M, owned(�, M 0))) CF(f , � 00, busy(x ), � )D(M, owned(�, M 0))
W����step(� ) = (� 0, write(k, � )) M 00 = M 0[k 7! �]
CF(f , � , busy(x ), � )D(M, owned(�, M 0))) CF(f , � 0, busy(x ), � )D(M, owned(�, M 00))
B����T�step(� ) = (� 0, beginTx)
CF(f , � , busy(x ), � )D(M, free)) CF(f , � 0, busy(x ), � )D(M, owned(�, M ))
E��T�step(� ) = (� 0, endTx)
CF(f , � , busy(x ), � )D(M, owned(�, M 0))) CF(f , � 0, busy(x ), � )D(M 0, free)
D���T�8f �x . F(f , � , busy(x ), � ) < C
CD(M, owned(�, M 0)) ) CD(M, free)
Figure 4. � augmented with a key-value store.
values (M) and a lock (L), which is either unlocked (free) orcontains uncommitted updates from the function instancethat holds the lock (owned(�,M 0)). Note that the lock isheld by a function instance and not a request, since theremay be several running instances processing the same re-quest. We allow serverless functions to produce four newcommands: beginTx starts a transaction, endTx commits atransaction, read(k ) reads the value associated with key k ,and write(k,� ) sets the key k to value � . We add four newrules to � that execute these commands in the natural way:B����T� blocks until it can acquire a lock, E��T� commitschanges and releases a lock, and for simplicity, the R���andW���� rules require the running instance to have a lock.Finally, we need a �fth rule (D���T�) that releases a lock anddiscards its uncommitted changes if the function instancethat held the lock no longer exists. This may occur if thefunction instance dies before committing its changes.
Idempotence in the naive semantics. There are severalways to ensure that a serverless function is idempotent. Acommon protocol is to commit each output value, keyedby the unique request ��, to the key-value store, within atransactional update. Therefore, if the request is re-tried,
6
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
Formal Foundations of Serverless Computing
Serverless FunctionsOptional key-value map HM B lo�ed(M, *� ) | commit(� ) | ·HM, A `7�! HM, A
A 7! A0HM, A 7! HM, A0
A start(x,� )7��������! A0
·, A start(x,� )7��������! ·, A0A stop(x,� )7��������! A0
commit(� ), A stop(x,� )7��������! ·, A0
N�R���step(� ) = (� 0, read(k )) recv(M (k ), � 0) = � 00
lo�ed(M, *� 0), hf , busy(x ), *� ++[� ]i7! lo�ed(M, *� 0), hf , busy(x ), *� ++[� , � 0, � 00]i
N�W����step(� ) = (� 0, write(k, � )) M 0 = M[k 7! �]
lo�ed(M, *� 0), hf , busy(x ), *� ++[� ]i7! lo�ed(M 0, *� 0), hf , busy(x ), *� ++[� , � 0]i
N�B����T�step(� ) = (� 0, beginTx)
·, hf , busy(x ), *� ++[� ]i7! lo�ed(M, *� ++[� ]), hf , busy(x ), *� ++[� , � 0]i
N�E��T�step(� ) = (� 0, endTx)
lo�ed(M, *� 0), hf , busy(x ), *� ++[� ]i7! commit(M (x )), hf , busy(x ), *� ++[� , � 0]i
N�R�������lo�ed(M, *� 0), hf , busy(x ), *� i 7! ·, hf , busy(x ), *� 0i
Figure 5. The naive semantics with a key-value store.
the function can lookup and return the saved output value.We now formally characterize this protocol, and use it toprove a weak bisimulation theorem between � and thenaive semantics, where each is extended with a key-valuestore. This will allow programmers to reason about serverlessexecution using the naive semantics, which processes exactlyone request at a time, without concurrency.The challenge we face is to extend the bisimulation rela-
tion (De�nition 4.2) to account for the key-value store. Inthat de�nition, when the � state (C) and the naive state(A) are equivalent (A ⇡ C), it is possible for a failure tooccur (C ) C0), such that after the failure, � falls behindthe naive semantics. Nevertheless, we still treat the statesas equivalent (A ⇡ C0), and let the weak bisimulation proofre-invoke a function in C0 until it catches up with A. Un-fortunately, for an arbitrary serverless function, replay doesnot work because the key-value store may have changed.To address this, we need to ensure that functions that usethe key-value store follow an appropriate protocol to ensureidempotence. A more subtle problem arises when the naivestate (A) is within a transaction, and the equivalent � statetakes several steps (C)) C0) that result in a failure, followedby other updates to the key-value store. If this occurs, thenaive semantics must rollback to the start of the transactionand re-execute with the key-value store in C0.Figure 5 shows the extended naive semantics, which ad-
dresses these issues. In this semantics, the naive key-value
store (HM) goes through three states: (1) at the start of exe-cution, it is not present; (2) when a transaction begins, thesemantics selects a new mapping nondeterministically; and(3) when the transaction completes, the mapping moves to acommitted state, where it only contains the �nal result. Forsimplicity, we assume that reads andwrites only occurwithintransactions. The semantics also includes aN�R������� rule,which allows execution to rollback to the start of transac-tion. However, once a transaction is complete (N�E��T�), arollback is not possible.The extended bisimulation relation, shown below, uses
the bisimulation relation from the previous section (De�ni-tion 4.2). When the naive semantics is within a transaction,the relation requires some instance in � to be operating inlock-step with the naive semantics. However, it allows otherinstances that are not using the key-value store to makeprogress. Therefore, a transaction is not globally atomic in� , and other requests can be received and processed whilesome instance is in a transaction.
De�nition 5.1 (Extended Bisimulation Relation). HM,A ⇡D(M,L),C is de�ned as:
1. If A ⇡ C then ·,A ⇡ D(M, free)C, or2. If A = hf , busy(x ),*�++[� ]i, L = owned(�,M 0), HM =
lo�ed(M 0,*� ), and there exists F( f ,H� , busy(x ),�) 2C such that (� ,H� ) 2 R then HM,A ⇡ D(M,L),C, or
3. If A = hf , busy(x ),*� i, HM = commit(� ), and M (x ) =� then HM,A ⇡ D(M, free)C.
With this de�nition in place, we can prove an extendedweak bisimulation relation.
Theorem 5.2 (Extended Weak Bisimulation). For a server-less function f , if R is its safely relation and for all requestsR( f ,x ,� ) the following conditions hold:
1. f produces the value stored at key x , if it exists,2. When f completes a transaction, it stores a value � 0 at
key x , and3. When f stops, it produces � 0,
then, for all HM , A, C, ` :
1. For all HM0,A 0, if HM,A `7�! HM
0,A 0 and A ⇡ C then
there exists*`1,*`2, C0, Ci and Ci+1 such that C)
*`1=) Ci
`=)
Ci+1 )*`1=) C0, x (*`1) = � , x (
*`2) = � , and A 0 ⇡ C0
2. For all C0, if C`=) C0 and HM,A ⇡ C then there exists
A 0 such that A 0 ⇡ C0 and A 7�!`�! A 0.Proof. By Theorems A.9 and A.10 in the appendix (part ofthe submitted supplemental material). ⇤
The theorem statement in the appendix formalizes theconditions of the bisimulation, but the less formal conditionsare useful for programmers, since they are simple require-ments that are easy to ensure. Therefore, this theorem gives
7
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
Abhinav Jangda, Donald Pinckney, Samuel Baxter, Joseph Spitzer, Breanna Devore-McDonald, Yuriy Brun, and Arjun Guha
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
the assurance that the reasoning with the naive semanticsis adequate, even though the serverless platform operatesusing � .
6 Serverless CompositionsThus far, we have discussed the basic serverless computingabstraction, where serverless functions are black-boxes tothe platform. However, there are many situations where thisabstraction makes it hard to write serverless programs in amodular way [7]. This section extends � with a signi�cantnew feature: a serverless programming language (���) forcomposing serverless functions that is designed to directlyrun on a serverless platform (i.e., without operating systemvirtualization). We motivate the need for ��� (§6.1), presentthe design of ��� (§6.2), and discuss our implementation(§6.3). Then, §7 evaluates ���’s performance.
6.1 The Need for Serverless Composition LanguagesBaldini et al. [7] present several problems that arise whenprogrammers build new serverless functions by composingexisting serverless functions. The natural approach to server-less function composition is to have one serverless functionact as a coordinator that invokes other functions. For ex-ample, Figure 6 shows a serverless function that depositsfunds into two accounts by calling our example function(bank) from §2. A problem with this approach is that theprogrammer gets “double-billed” for the time spent runningthe coordinator function (deposit2) which is mostly idleand the time spent doing actual work in bank. An alternativeapproach is to merge several functions into a single function.Unfortunately, this approach hinders code-reuse. In partic-ular, it does not work when source code is unavailable orwhen the serverless functions are written in di�erent lan-guages. A third approach is to write serverless functions thateach pass their output as input to another function, insteadof returning to the caller (i.e., continuation-passing style).However, this approach requires rewriting code. Moreover,some clients, such as web browsers, cannot produce a contin-uation ��� to receive the �nal result. Baldini et al. show thatthe only way to address these problems is to add primitivecomposition operators to serverless platforms [7].
6.2 Composing Serverless Functions with ArrowsWe now extend � with a domain speci�c language for com-posing serverless functions, which we call ��� (serverlessprogramming language). Since the serverless platform is ashared resource and programs are untrusted, ��� cannot runarbitrary code. However, ��� programs can invoke serverlessfunctions to perform arbitrary computation when needed.Therefore, invoking a serverless function is a primitive op-eration in ���, which serves as the wiring between severalserverless functions.
1 let request = require(�request-promise-native�);
2 exports.deposit2 = function(req, res) {
3 let { amount, tId1, tId2 } = req.body;
4 if (amount > 100) {
5 request.post({ url: ..., json: { type: �deposit�,
6 to: �checking�, amount: amount - 100,
7 transId: tId1 }})
8 .then(function() {
9 request.post({ url: ..., json: { type: �deposit�,
10 to: �savings�, amount: 100, transId: tId2 }})
11 .then(function() { res.send(true); });});
12 } else {
13 request.post({ url: ..., json: { type: �deposit�,
14 to: �checking�, amount: amount, transId: tId1 }})
15 .then(function() { res.send(true); });}};
Figure 6.A serverless function to deposit funds, based on theamount provided, using the serverless function in Figure 1b.
Values � B · · ·| (�1, �2) Tuples
��� expressions e B invoke f Invoke serverless function| first e Run e to �rst part of input| e1 >>> e2 Sequencing
��� continuations � B ret x Response to request| seq e � In a sequence| first � � In first
Components C B · · ·| E(e, �, � ) Running program| E(x, � ) Waiting program| R(e, x, � ) Run program e on �
C`7�! C
P�N��R��x is fresh
Cstart(� )======) CR(e, x, � )
P�S���� CR(e, x, � ) ) CR(e, x, � )E(e, �, ret x )
P�R������ CE(� 0, ret x )R(e, x, � )stop(� 0)======) CS(x, � 0)
P�S��1 CE(e1 >>> e2, �, � ) ) CE(e1, �, seq e2 � )
P�S��2 CE(�, seq e � ) ) CE(e, �, � )
P�I�����1x 0 is fresh
CE(invoke f , �, � ) ) CE(x 0, � )R(f , x 0, � )
P�I�����2 CE(x, � )S(x, � ) ) CE(�, � )
P�F����1 CE(first e, (�1, �2), � ) ) CE(e, �1, first �2 � )
P�F����2 CE(�1, first �2 � ) ) CE((�1, �2), � )
P�D�� CE(�, � ) ) C
Figure 7. Extending � with ���.
Figure 7 extends the � with ���. This extension allowsrequests to run ��� programs (R(e,x ,� )), in addition to ordi-nary requests that name serverless functions.4 ��� is based
4In practice, a request would name an ��� program instead of carrying theprogram itself.
8
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
Formal Foundations of Serverless Computing
��� expressions e B · · · | p Run transformation���� values � ::=n | b | str | null���� pattern p B � ���� literal
| [p1, · · · , pn ] Array| {str1 : p1, · · · , strn : pn } Object| p1 op p2 Operators| if (p1) then p2 else p3 Conditional| [str1 ! p1] Update �eld| in q Input reference
���� query q B Empty query| .[n]q Array index| .idq Field lookup
Figure 8. ���� transformation language.
on Hughes’ arrows [27], thus it supports the three basic ar-row combinators. An ��� program can (1) invoke a serverlessfunction (invoke f ); (2) run two subprograms in sequence(e1 >>> e2); or (3) run a subprogram on the �rst componentof a tuple, and return the second component unchanged(first e). These three operations are su�cient to describeloop- and branch-free compositions of serverless functions.It is straightforward to add support for bounded loops andbranches, which we do in our implementation.
To run ��� programs, we introduce a new kind of compo-nent (E) that executes programs using an abstract machinethat is similar to a �� machine [15]. In other words, theevaluation rules de�ne a small-step semantics with an ex-plicit representation of the continuation (�). This design isnecessary because programs need to suspend execution toinvoke serverless functions (P�I�����1) and then later re-sume execution (P�I�����2). Similar to serverless functions,��� programs also execute at-least-once. Therefore, a sin-gle request may spawn several programs (P�S����) and aprogram may die while waiting for a serverless function toresponse (P�D��).
A sub-language for ���� transformations. A problemthat arises in practice is that input and output values toserverless functions (�) are frequently formatted as ����values, which makes hard to de�ne the first operator in a sat-isfactory way. For example, we could de�ne first to operateover two-element ���� arrays, and then require program-mers to write serverless functions to transform arbitrary���� into this format. However, this approach is cumbersomeand resource-intensive. For even simple transformations, theprogrammer would have to write and deploy serverless func-tions; the serverless platform would need to sandbox theprocess using heavyweight �� mechanisms; and the plat-form would have to copy values to and from the process.Instead, we augment ��� with a sub-language of ����
transformations (Figure 8). This language is a superset of����. It has a distinguished variable (in) that refers to theinput ���� value, which may be followed by a query to selectfragments of the input. For example, we can use this trans-formation language to write an ��� program that receives
a <- invoke f(in); b <- invoke g(in);c <- invoke h({ x: b, y: a.d }); ret c;
(a) Surface syntax program.[in, { input: in }] >>> first (invoke f) >>>
[in[1].input, in[1][a -> in[0]]] >>>
first (invoke g) >>>
[{ x: in[0], y: in[1].a.d }, in[1]] >>>
first (invoke h) >>> in[0]
(b) Naive translation to ���.[in, { input: in }] >>> first (invoke f) >>>
[in[1].input, { a: in[0] }] >>> first (invoke g) >>>
[{ x: in[0], y: in[1].a.d }, {}] >>>
first (invoke h) >>> in[0]
(c) Live variable analysis eliminates several �elds.[in, { input: in }] >>> first (invoke f) >>>
[in[1].input, { ad: in[0].d }] >>> first (invoke g) >>>
[{ x: in[0], y: in[1].ad }, {}] >>>
first (invoke h) >>> in[0]
(d) Live key analysis immediately projects a.d.
Figure 9. Compiling the surface syntax of ���.
a two-element array as input and then runs two di�erentserverless functions on each element:first (invoke f) >>> [in[1], in[0]] >>> first (invoke g)
Without the ���� transformation language, we would needan auxiliary serverless function to swap the elements.
A simpler notation for ��� programs. ��� is designed tobe a minimal set of primitives that are straightforward toimplement in a serverless platform. However, ��� programsare di�cult for programmers to comprehend. To address thisproblem, we have also developed a surface syntax for ���that is based on Paterson’s notation for arrows [35]. Usingthe surface syntax, we can rewrite the previous example asfollows:x <- invoke f(in[0]); y <- invoke g(in[1]); ret [y, x];
This version is far less cryptic than the original.We describe the surface syntax compiler by example. At
a high level, the compiler produces ��� programs in store-passing style. For example, Figure 9a shows a surface syntaxprogram that invokes three serverless functions (f , �, and h).However, the composition is non-trivial because the inputof each function is not simply the output of the previousfunction. We compile this program to an equivalent ��� pro-gram that uses the ���� transformation language to saveintermediate values in a dictionary (Figure 9b). However,this naive translation carries unnecessary intermediate state.We address this problem with two optimizations. First, thecompiler performs a live variable analysis, which producesthe more compact program shown in Figure 9c. In the origi-nal program, the input reference (in) is not live after �, and
9
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
Abhinav Jangda, Donald Pinckney, Samuel Baxter, Joseph Spitzer, Breanna Devore-McDonald, Yuriy Brun, and Arjun Guha
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
c is the only live variable after h, thus these are eliminatedfrom the state. Second, the compiler performs a livenessanalysis of the ���� keys returned by serverless functions,which produces an even smaller program (Figure 9d). In ourexample, f returns an object a, but the program only usesa.d and discards any other �elds that a may have. There aremany situations where the entire object a may be signi�-cantly larger than a.d , thus extracting it early can shrink theamount of state a program carries.
6.3 ImplementationOpenWhisk implementation. Apache OpenWhisk is ama-ture and widely-deployed serverless platform that is writ-ten in Scala and is the foundation of ��� Cloud Functions.We have implemented ��� as a 1200 ��� patch to Open-Whisk, which includes the surface syntax compiler and sev-eral changes to the OpenWhisk runtime system. We inheritOpenWhisk’s fault tolerance mechanisms (e.g., at-least-onceexecution) and reuse OpenWhisk’s support for serverlessfunction sequences [7] to implement the >>> operator of ���.Our OpenWhisk implementation of ��� has three di�er-
ences from the language presented so far. First, it supportsbounded loops, which are a programming convenience. Sec-ond, instead of implementing the first operator and the ����transformation language as independent expressions, wehave a single operator that performs the same action as first,but applies a ���� transformation to the input and output,which is how transformations are most commonly used. Fi-nally, we implement a multi-armed conditional, which is astraightforward extension to ���. These operators allow usto compile the surface syntax to smaller ��� programs, whichmoderately improves performance.
Portable implementation. We have also built a portableimplementation of ��� (1156 ��� of Rust) that can invokeserverless functions in public clouds. (We have tested withGoogle Cloud Functions.) Whereas the OpenWhisk imple-mentation allows us to carefully measure load and utilizationon our own hardware test-bed, we cannot perform the sameexperiments with our standalone implementation, since pub-lic clouds abstract away the hardware used to run serverlessfunctions. The portable implementation has helped us ensurethat the design of ��� is independent of the design and im-plementation of OpenWhisk, and we have used it to exploreother kinds of features that a serverless platformmay wish toprovide. For example, we have added a fetch operator to ���that receives the name of a �le in cloud storage as input andproduces the �le’s contents as output. It is common to haveserverless functions fetch private �les from cloud storage(after an access control check). The fetch operator can makethese kinds of functions faster and consume fewer resources.
7 EvaluationThis section �rst evaluates the performance of ��� using mi-crobenchmarks, and then highlights the expressivity of ���with three case studies. We will release our implementationsas open-source and submit our experiments to the ���.
7.1 Comparison to OpenWhisk ConductorThe Apache OpenWhisk serverless platform has built-insupport for composing serverless functions using a Conduc-tor [37]. A Conductor is special type of serverless functionthat, when invoked, may respond with the name and argu-ments of an auxiliary serverless function for the platform toinvoke, instead of returning immediately. When the auxil-iary function returns a response, the platform re-invokes theConductor with the response value. Therefore, the platforminterleaves the execution of the Conductor function and itsauxiliary functions, which allows a Conductor to implementsequential control �ow, similar to ���. The key di�erencebetween ��� and a Conductor, is that ��� is designed to run di-rectly on the platform, whereas the Conductor is a serverlessfunction itself, which consumes additional resources.
This section compares the performance of ��� to Conduc-tor with a microbenchmark that stresses the performance ofthe serverless platform. The benchmark ��� program runs asequence of ten serverless functions, and we translate it to anequivalent program for Conductor. We run two experimentsthat (1) vary the number of concurrent requests and (2) varythe size of the requests. In both experiments, we measureend-to-end response time, which is the metric that is mostrelevant to programmers. We �nd that ��� outperforms Con-ductor in both experiments, which is expected because itsdesign requires fewer resources, as explained below.
We conduct our experiments on a six-core Intel Xeon E5-1650 with 64 GB RAM with Hyper-Threading enabled.
Concurrent invocations. Our �rst experiment shows howresponse times change when the system is processing sev-eral concurrent requests. We run N concurrent requests ofthe same program, and then measure �ve response times.Figure 10a shows that ��� is slightly faster than Conductorwhen N 12, but approximately twice as fast when N > 12.We attribute this to the fact that Conductor interleaves theconductor function with the ten serverless functions, thus itrequires twice as many containers to run. Moreover, sinceour ��� has six hyper-threaded cores, Conductor is over-loaded with 12 concurrent requests.
Request size. Our second experiment shows how responsetimes depend on the size of the input request. We use thesame microbenchmark as before, and ensure that the plat-form only processes one request at a time. We vary the re-quest body size from 0KB to 1MB and measure �ve responsetimes at each request size. Figure 10b shows that ��� is al-most twice as fast as Conductor. We again attribute this to
10
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
Formal Foundations of Serverless Computing
●●●●● ●●●●● ●●
●●● ●●
●●●
●●●●●
●●
●
●●
●●●●●
●●●●●
●
●●
●
●
●
●●●●
●●●
●●
●●●
●
●
●
●
●
●
●●
●
●
●●
●
●
●●●
0
30
60
90
0 4 8 12 16 20 24 28
Concurrent Requests
Res
pons
e T
ime
(s)
● Conductor
SPL
(a) Response time vs. concurrent requests.
●●●●●
●
●●●
●
●●
●●
●
●
●●●●
●●●●●
●●●●●
●●●
●
●●●
●●● ●
●●●●
●
●
●●
●
●
●
●●
●
●●●●●
●●
●●●
●●●
●
● ●●●●●
●●●●●
●●●●● ●
●●●●
●
●●●●
●●●●●
0.5
1.0
1.5
025
050
075
0
Request Size (KB)
Res
pons
e T
ime
(s)
● Conductor
SPL
(b) Response time vs. request size.
●● ●
●●● ●●●●●●●●●●
●●●
●● ● ●●
●● ● ● ● ●● ● ● ●● ● ●●0
10
20
30
40
0 1 2 3 5 9 15 25 41 67 108
174
281
456
73711
8619
0430
7949
92
Size of CSV (KB)
Run
tim
e (s
)
Overhead
Total
(c) Response time and overhead.
Figure 10. ��� benchmarks. Figures 10a and 10b show that ��� is faster than OpenWhisk Conductor when the serverlessplatform is processing several concurrent requests, or when the request size increases. Figure 10c shows that most of theexecution time is spent in serverless functions, and not running ���.
the fact that Conductor needs to copy the request across asequence of functions that is twice as long as ���.
Summary. The ��-based isolation techniques that Conduc-tor uses have a nontrivial cost. ���, since it uses language-based isolation, is able to lower the resource utilization ofserverless compositions by up to a factor of two.
7.2 Case StudiesThe core ��� language, presented in §6, is a minimal fragmentthat lacks convenient features that are needed for real-worldprogramming. To identity the additional features that arenecessary, we have written several di�erent kinds of ���programs, and added new features to ��� when necessary.Fortunately, these new features easily �t the structure estab-lished by the core language. This section presents some ofthe programs that we’ve built and discusses the new featuresthat they employ. These examples illustrate that it is easy togrow core ��� into a convenient and expressive programminglanguage for serverless function composition.
Conditional bank deposits. Figure 11a uses ��� to rewritethe bank deposit function (Figure 6). The original programsu�ered the “double billing” problem, since the compositeserverless function needs to be active while waiting for thebank serverless function to do actual work. In contrast, theserverless platform suspends the ��� program when it in-vokes a serverless function and resumes it when the responseis ready. This example also shows that our implementationsupports basic arithmetic, which we add to the ���� trans-formation sub-language in a straightforward way.
Continuous integration. Baldini et al. [7] present a com-pelling example of serverless composition, which we adaptfor ���. Our ��� program (Figure 11b) connects GitHub, Slack,and Google Cloud Build (which is a continuous integrationtool, similar to TravisCI). Cloud Build makes it easy to runtests when a new commit is pushed to GitHub. But, it is muchharder to see the test results in a convenient way. However,
1 if (in.amount > 100) {
2 invoke bank({ type: �deposit�, to: �checking�,
3 amount: in.amount - 100, transId: in.tId1 });
4 invoke bank({ type: �deposit�, to: �savings�,
5 amount: 100, transId: in.tId2 });
6 } else {
7 invoke bank({ type: �deposit�, to: �checking�,
8 amount: in.amount, transId: in.tId1 });
9 } ret;
(a) Receive funds, deposit $100 in savings (if feasible), and depositthe rest in checking.1 invoke postStatusToGitHub({ state: in.state,2 sha: in.sha, url: in.url, repo: �<repo owner/name>� });
3 if (in.state == �failure�) {
4 invoke postToSlack({ channel: �<id>�, text: �<msg>� });
5 } ret;
(b) Receive build state from Google Cloud Build, set status onGitHub, and report failures on Slack.1 data <- get in.url;2 json <- invoke csvToJson(data);
3 out <- invoke plotJson({data:json,x:in.xAxis,y:in.yAxis});4 ret out;
(c) Receive a ��� of a ��� �le and two column names, downloadthe �le, convert it to ����, then plot the ����.
Figure 11. Example ��� programs.
Cloud Build can invoke a serverless function when testscomplete, and we use it run an ��� program that (1) uses theGitHub ��� to add a test-status icon next to each commitmessage, and (2) uses the Slack ��� to post a message when-ever a test fails. However, instead of writing a monolithicserverless function, we �rst write two serverless functionsthat post to Slack and set GitHub status icons respectively,and let the ��� program invoke them. It is easy to reuse theGitHub and Slack functions in other applications.
Data visualization. Our last example (Figure 11c) receivesthe ��� of a ���-formatted data �le with two columns, plots
11
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
Abhinav Jangda, Donald Pinckney, Samuel Baxter, Joseph Spitzer, Breanna Devore-McDonald, Yuriy Brun, and Arjun Guha
991
992
993
994
995
996
997
998
999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
the data, and responds with the plotted image. This is thekind of task that a power-constrained mobile applicationmay wish to o�oad to a serverless platform, especially whenthe size of the data is signi�cantly larger than the plot. Ourprogram invokes two independent serverless functions thatare trivial wrappers around popular JavaScript libraries: csvj-son, which converts ��� to ���� and vega, which createsplots from ���� data. This example uses a new primitive(get) that our implementation supports to download the data�le. Downloading is a common operation that is natural forthe platform to provide. Our implementation simply issuesan ���� ��� request and suspends the ��� program until theresponse is available. However, it is easy to imagine moresophisticated implementations that support caching, autho-rization, and other features. Finally, this example show thatour ��� implementation is not limited to processing ����.The get command produces a plain-text ��� �le, and theplotJson invocation produces a ���� image.A natural question to ask about this example is whether
the decomposition into three parts introduces excessive com-munication overhead. We investigate this by varying thesize of the input ���s (ten trials per size), and measuringthe total running time and the overhead, which we de�neas the time spent outside serverless functions (i.e., transfer-ring data, running get, and applying ���� transformations).Figure 10c shows that even as the �le size approaches 5 MB,the overhead remains low (less than 3% for a 5 MB �le, andup to 25% for 1 KB �le).
8 Related WorkServerless computing. Baldini et al. [7] introduce the prob-lem of serverless function composition and present a newprimitive for serverless function sequencing. Subsequentwork develops serverless state machines (Conductor) and a��� (Composer) thatmakes statemachines easier towrite [37].In §6, we present an alternative set of composition opera-tors that we formalize as an extension to � , implement inOpenWhisk, and evaluate their performance.
Trapeze [2] presents dynamic ��� for serverless computing,and further sandboxes serverless functions to mediate theirinteractions with shared storage. Their Coq formalizationof termination-sensitive noninterference does not modelseveral features of serverless platforms, such as warm startsand failures, that our semantics does model.Several projects exploit serverless computing for elastic
parallelization [4, 17, 18, 28]. §6 addresses modularity anddoes not support parallel execution. However, it would bean interesting challenge to grow the ��� in §6 to the pointwhere it can support the aforementioned applications with-out any non-serverless computing components. It is worthnoting that today’s serverless execution model is not a good�t for all applications. For example, Singhvi et al. [40] list
several barriers to running network functions on serverlessplatforms.
Serverless computing and other container-based platformssu�er several performance and utilization barriers. Thereare several ways to address these problems, including data-center design [20], resource allocation [9], programming ab-stractions [7, 37], and edge computing [5]. � is designed toelucidate subtle semantic issues (not performance problems)that a�ect programmers building serverless applications.
Language-based approaches to microservices. �FAIL [38]is a semantics for horizontally-scaled services with durablestorage, which are related to serverless computing. A keydi�erence between � and �FAIL is that � models warm-starts, which occur when a serverless platform runs a newrequest on an old function instance, without resetting itsstate. Warm-starts make it hard to reason about correctness,but this paper presents an approach to do so. Both � and�FAIL present weak bisimulations between detailed and naivesemantics. However, � ’s naive semantics processes a sin-gle request at a time, whereas �FAIL’s idealized semanticshas concurrency. We use � to specify a protocol to ensurethat serverless functions are idempotent and fault tolerant.However, �FAIL also presents a compiler that automaticallyensures that these properties hold for C# and F# code. Webelieve the approach would work for � . §6 extends � withnew primitives, which we then implement and evaluate.
Whip [41] and ucheck [33] are tools that check propertiesof microservice-based applications at run-time. These worksare complementary to ours. For example, our paper identi�esseveral important properties of serverless functions, whichcould then be checked using Whip or ucheck.
Cloud orchestration frameworks. Ballerina [42] is a lan-guage for managing cloud environments; Engage [16] is adeployment manager that supports inter-machine depen-dencies; and Pulumi [36] is an embedded ��� for writingprograms that con�gure and run in the cloud. In contrast,� is a semantics of serverless computing. §6 uses � to de-sign and implement a language for composing serverlessfunctions that runs within a serverless platform.
Veri�cation. There is a large body of work on veri�cation,testing, and modular programming for distributed systemsand algorithms (e.g., [6, 10, 12, 13, 21, 24, 25, 39, 43]). Theserverless computation model is more constrained than arbi-trary distributed systems and algorithms. This paper presentsa formal semantics of serverless computing, � , with an em-phasis on low-level details that are observable by programs,and thus hard for programmers to get right. To demonstratethat � is useful, we present three applications that employit and extend it in several ways. This paper does not addressveri�cation for serverless computing, but � could be usedas a foundation for future veri�cation work.
12
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077
1078
1079
1080
1081
1082
1083
1084
1085
1086
1087
1088
1089
1090
1091
1092
1093
1094
1095
1096
1097
1098
1099
1100
Formal Foundations of Serverless Computing
9 ConclusionWe have presented � , an operational semantics that modelsthe low-level of serverless platforms that are observable byprogrammers. We have also presented three applications of� . (1) We prove a weak bisimulation to characterize whenprogrammers can ignore the low-level details of � . (2) Weextend � with a key-value store to reason about statefulfunctions. (3) We extend � with a language for serverlessfunction composition, implement it, and evaluate its per-formance. We hope that these applications show that �can be a foundation for further research on language-basedapproaches to serverless computing.
References[1] A����, I. E., C���, R., R����, I., S����, M., S�����, K., B���, A.,
A�����, P., ��� H���, V. SAND: Towards high-performance serverlesscomputing. In USENIX Annual Technical Conference (ATC) (2018).
[2] A�������, K., F�������, C., F������, S., R�����, L., S����, M.,S������, T., ���W�������, K. Secure serverless computing using dy-namic information �ow control. Proc. ACM Program. Lang. 2, OOPSLA(Oct. 2018).
[3] AWS Lambda developer guide: Invoke. h�ps://docs.aws.amazon.com/lambda/latest/dg/API_Invoke.html. Accessed Jul 5 2018, 2018.
[4] A�, L., I���������, L., V������, G. M., ��� P�����, G. Sprocket: Aserverless video processing framework. In Proceedings of the ACMSymposium on Cloud Computing (2018), SoCC ’18.
[5] A���, A., ��� Z���, X. Supporting multi-provider serverless comput-ing on the edge. In Proceedings of the 47th International Conference onParallel Processing Companion (2018), ICPP ’18.
[6] B����, A., G������������, K. �., K, R. G., ��� J����, R. Verifyingdistributed programs via canonical sequentialization. InACM SIGPLANConference on Object Oriented Programming, Systems, Languages andApplications (OOPSLA) (2017).
[7] B������, I., C����, P., F���, S. J., M�������, N., M��������, V.,R�����, R., S����, P., ��� T������, O. The serverless trilemma:Function composition for serverless computing. In ACM SIGPLANInternational Symposium on New Ideas, New Paradigms, and Re�ectionson Programming and Software (Onward!) (2017).
[8] B������, G., D����, J., D����������, E., H�����, M., ��� S������,A. Protecting chatbots from toxic content. In 2018 ACM SIGPLANInternational Symposium on New Ideas, New Paradigms, and Re�ectionson Programming and Software (2018).
[9] B���������, M., B����, R., ��� B�����, W. Resource managementof replicated service systems provisioned in the cloud. In NetworkOperations and Management Symposium (NOMS) (2016).
[10] C�����, T., K�������, F., L������, B., ��� Z��������, N. Verifyingconcurrent software using movers in CSPEC. In 13th USENIX Sym-posium on Operating Systems Design and Implementation (OSDI 18)(2018).
[11] C�����, S. Cloud native technologies are scaling pro-duction applications. h�ps://www.cncf.io/blog/2017/12/06/cloud-native-technologies-scaling-production-applications/,2017. Accessed Jul 12 2018.
[12] D����, A., P����������, A., Q�����, S., ��� S�����, S. A. Composi-tional programming and testing of dynamic distributed systems. InACM SIGPLAN Conference on Object Oriented Programming, Systems,Languages and Applications (OOPSLA) (2018).
[13] D�����, C., H��������, T. A., ��� Z�������, D. Psync: A partiallysynchronous language for fault-tolerant distributed algorithms. InACM SIGPLAN-SIGACT Symposium on Principles of Programming Lan-guages (POPL) (2016).
[14] E����, A. OpenFaaS. h�ps://www.openfaas.com. Accessed Jul 5 2018.
[15] F��������, M., ��� F�������, D. P. Control operators, the SECD-machine, and the �-calculus. In Proceedings of the IFIP TC 2/WG 2.2Working Conference on Formal Description of Programming Concepts(1986).
[16] F������, J., M�������, R., ��� E�������������, S. Engage: A deploy-ment management system. In ACM SIGPLAN Conference on Program-ming Language Design and Implementation (PLDI) (2012).
[17] F������, S., I���, D., C���������, S., K��������, C., Z������, M.,��� W�������, K. A thunk to remember: make-j1000 (and other jobs)on functions-as-a-service infrastructure, 2017.
[18] F������, S., W����, R. S., S��������, B., B��������������, K. V.,Z���, W., B�������, R., S��������, A., P�����, G., ��� W�������,K. Encoding, fast and slow: Low-latency video processing using thou-sands of tiny threads. In USENIX Symposium on Networked SystemDesign and Implementation (NSDI) (2017).
[19] F������ S�����, J., M���������, P., N������̄�����̇, D., W���, T.,���G������, P. JaVerT: JavaScript veri�cation toolchain. Proceedingsof the ACM on Programming Languages 2, POPL (2018), 50:1–50:33.
[20] G��, Y., ��� D���������, C. The Architectural Implications of CloudMicroservices. In Computer Architecture Letters (CAL), vol.17, iss. 2(Jul-Dec 2018).
[21] G����, V. B. F., K��������, M., M�������, D. P., ��� B��������,A. R. Verifying strong eventual consistency in distributed systems. InACM SIGPLAN Conference on Object Oriented Programming, Systems,Languages and Applications (OOPSLA) (2017).
[22] Google Cloud Functions. h�ps://cloud.google.com/functions/. Ac-cessed Jul 5 2018.
[23] Cloud Functions execution environment. h�ps://cloud.google.com/functions/docs/concepts/exec. Accessed Jul 5 2018, 2018.
[24] G���, A., R��������, M., ��� F�����, N. Machine veri�ed networkcontrollers. In ACM SIGPLAN Conference on Programming LanguageDesign and Implementation (PLDI) (2013).
[25] H���������, C., H�����, J., K��������, M., L����, J. R., P����, B.,R������, M. L., S����, S., ��� Z���, B. Iron�eet: Proving practicaldistributed systems correct. In ACM Symposium on Operating SystemsPrinciples (SOSP) (2015).
[26] H����������, S., S���������, S., H�����, T., V������������, V.,A������D������, A. C., ��� A������D������, R. H. Serverless com-putation with OpenLambda. In USENIX Workshop on Hot Topics inCloud Computing (HotCloud) (2016).
[27] H�����, J. Generalising monads to arrows. Science of ComputerProgramming 37, 1–3 (May 2000), 67–111.
[28] J����, E., P�, Q., V�����������, S., S�����, I., ��� R����, B. Occupythe cloud: Distributed computing for the 99%. In Symposium on CloudComputing (2017).
[29] Microsoft Azure Functions. h�ps://azure.microso�.com/en-us/services/functions/. Accessed Jul 5 2018.
[30] Choose between Azure services that deliver messages. h�ps://docs.microso�.com/en-us/azure/event-grid/compare-messaging-services.Accessed Jul 5 2018, 2018.
[31] Apache OpenWhisk. h�ps://openwhisk.apache.org. Accessed Jul 52018.
[32] OpenWhisk actions. h�ps://github.com/apache/incubator-openwhisk/blob/master/docs/actions.md. AccessedJul 5 2018, 2018.
[33] P����, A., S����, M., ��� S������, S. Veri�cation in the age ofmicroservices. In Workshop on Hot Topics in Operating Systems (2017).
[34] P���, D., S���������, A., ��� R���, G. KJS: A complete formalsemantics of JavaScript. In ACM SIGPLAN Conference on ProgrammingLanguage Design and Implementation (PLDI) (2015).
[35] P�������, R. A new notation for arrows. In ACM International Con-ference on Functional Programming (ICFP) (2001).
[36] Pulumi. cloud native infrastructure as code. h�ps://www.pulumi.com/.Accessed Jul 5 2018, 2018.
13
1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
1118
1119
1120
1121
1122
1123
1124
1125
1126
1127
1128
1129
1130
1131
1132
1133
1134
1135
1136
1137
1138
1139
1140
1141
1142
1143
1144
1145
1146
1147
1148
1149
1150
1151
1152
1153
1154
1155
Abhinav Jangda, Donald Pinckney, Samuel Baxter, Joseph Spitzer, Breanna Devore-McDonald, Yuriy Brun, and Arjun Guha
1156
1157
1158
1159
1160
1161
1162
1163
1164
1165
1166
1167
1168
1169
1170
1171
1172
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193
1194
1195
1196
1197
1198
1199
1200
1201
1202
1203
1204
1205
1206
1207
1208
1209
1210
[37] R�����, R. Composing functions into applicationsthe serverless way. h�ps://medium.com/openwhisk/composing-functions-into-applications-70d3200d0fac. Accessed Jul 52018, 2017.
[38] R���������, G., ��� V������, K. Fault tolerance via idempotence.In ACM SIGPLAN-SIGACT Symposium on Principles of ProgrammingLanguages (POPL) (2013).
[39] S�����, I., W�����, J. R., ��� T������, Z. Programming and provingwith distributed protocols. In ACM SIGPLAN-SIGACT Symposium onPrinciples of Programming Languages (POPL) (2017).
[40] S������, A., B�������, S., H������, Y., A�����, A., P���, M., ���R����, P. Granular computing and network intensive applications:
Friends or foes? In Proceedings of the 16th ACMWorkshop on Hot Topicsin Networks (2017), HotNets-XVI.
[41] W���, L., C����, S., ��� D�������, C. Whip: Higher-order contractsfor modern services. In ACM International Conference on FunctionalProgramming (ICFP) (2017).
[42] W����������, S., E��������, C., P�����, S., ��� L������, F. Bring-ing middleware to everyday programmers with ballerina. In BusinessProcess Management (2018).
[43] W�����, J. R., W���, D., P��������, P., T������, Z., W���, X., E����,M. D., ��� A�������, T. Verdi: A framework for implementing andformally verifying distributed systems. In ACM SIGPLAN Conferenceon Programming Language Design and Implementation (PLDI) (2015).
14