Fault-tolerant and Transactional Stateful Serverless Workflows

Haoran Zhang, Adney Cardoza, Peter Baile Chen, Sebastian Angel, Vincent Liu



What is serverless?

[Diagram, built up over several slides. Labels: Developer, User, Client, Cloud, API Gateway, Worker (×4), Shared Database, Database. The later builds add an X to the diagram.]

Workers can fail!

How could serverless go wrong?

Client: Send Request → Receive Error/Timeout → Should I Retry?

Cloud: Start → N = Read("attendees") → Write("attendees", N+1) → End

Write Idempotent Functions!
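A minimal sketch of why the retry is dangerous, assuming an in-memory dictionary as a stand-in for the shared database. The names `store`, `register_attendee`, and `client_call_with_retry` are hypothetical and exist only for this illustration. The handler performs the read and write from the slide, then fails before the client hears back, so the client's retry applies the increment a second time.

```python
# Hypothetical in-memory stand-in for the shared database.
store = {"attendees": 10}


def register_attendee(fail_after_write: bool) -> None:
    """Non-idempotent handler: read the counter, then write it back plus one."""
    n = store["attendees"]        # N = Read("attendees")
    store["attendees"] = n + 1    # Write("attendees", N+1)
    if fail_after_write:
        # The worker dies (or the reply is lost) after the write has landed.
        raise TimeoutError("no response from worker")


def client_call_with_retry() -> None:
    """Client that blindly retries once on an error or timeout."""
    try:
        register_attendee(fail_after_write=True)
    except TimeoutError:
        register_attendee(fail_after_write=False)


client_call_with_retry()
print(store["attendees"])  # 12, although only one attendee registered
```

One logical request has been counted twice, which is the outcome an idempotent function must rule out.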


Beldi makes stateful serverless functions idempotent automatically!

Outline

• Beldi’s Infrastructure
• Linked DAAL
• Invocation with exactly-once semantics
• Evaluation
• Conclusion

Beldi’s architecture

Columns: Worker | Beldi Runtime | Storage

Worker: Start → N = Read("attendees") → Write("attendees", N+1) → End

Beldi Runtime: Database API, Transaction API, Invocation API

Storage:

Key        Value
attendees  10

InstanceId  Done

Operation  Value

Beldi’s architecture

Columns: Worker | Beldi Runtime | Storage

Worker: Start → N = Read("attendees") → Write("attendees", N+1) → End

Beldi Runtime: Database API, Progress Lambda

Storage:

Key        Value
attendees  10

InstanceId  Done
d78590e     False

Operation  Value
d78590e-1  10
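The Operation/Value table on these slides records that step 1 of instance d78590e read the value 10. A minimal sketch of how such a read log could make a read repeatable, assuming in-memory dictionaries in place of the real storage; `data`, `read_log`, and `logged_read` are hypothetical names for illustration and not Beldi's actual API.

```python
data = {"attendees": 10}   # data table: Key -> Value
read_log = {}              # read log: "<InstanceId>-<step>" -> Value


def logged_read(instance_id: str, step: int, key: str):
    """Return the value this step saw the first time it ran."""
    op_id = f"{instance_id}-{step}"
    if op_id not in read_log:
        read_log[op_id] = data[key]   # first execution: record what was read
    return read_log[op_id]            # re-execution: replay the logged value


first = logged_read("d78590e", 1, "attendees")
data["attendees"] = 11                # the stored value changes before a retry
replayed = logged_read("d78590e", 1, "attendees")
print(first, replayed)
```

A re-executed instance therefore sees the same N as its first attempt, even if the stored value has moved on in the meantime.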

Beldi’s architecture

Columns: Worker | Beldi Runtime | Storage

Worker: Start → N = Read("attendees") → Write("attendees", N+1) → End

Beldi Runtime: Database API

Storage:

Key        Value
attendees  11

Operation  Value
d78590e-1  10
d78590e-2

[➀ and ➁ label the two storage updates in the diagram: the new value and the log entry.]

Problem: ➀ and ➁ must be done atomically

Solution: Collocate write log with the data!

Beldi’s architecture

Columns: Worker | Beldi Runtime | Storage

Worker: Start → N = Read("attendees") → Write("attendees", N+1) → End

Beldi Runtime: Database API, Progress Lambda, Garbage Collector

Storage (after the write; before it, the row held attendees = 10 with an empty RecentWrites):

Key        Value  RecentWrites
attendees  11     [d78590e-2]

InstanceId  Done
d78590e     False

Operation  Value
d78590e-1  10
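A minimal sketch of the collocated write log, assuming the storage layer offers an atomic conditional update on a single row. Here that primitive is simulated by the hypothetical function `conditional_update`; `table` and `logged_write` are likewise names invented for this illustration. Because the value and its RecentWrites entry live in the same row, one conditional update changes both or neither.

```python
# One row per key; the write log (RecentWrites) sits next to the value.
table = {"attendees": {"Value": 10, "RecentWrites": []}}


def conditional_update(key: str, op_id: str, value) -> bool:
    """Stand-in for a single-row atomic conditional update.

    Applies the write only if op_id is not already in the row's log.
    """
    row = table[key]
    if op_id in row["RecentWrites"]:
        return False                  # this step already wrote: skip
    row["Value"] = value
    row["RecentWrites"].append(op_id)
    return True


def logged_write(instance_id: str, step: int, key: str, value) -> bool:
    return conditional_update(key, f"{instance_id}-{step}", value)


applied = logged_write("d78590e", 2, "attendees", 11)   # first execution
retried = logged_write("d78590e", 2, "attendees", 11)   # re-execution
print(applied, retried, table["attendees"])
```

The second call is a no-op, so re-running the instance after a crash cannot apply the same write twice.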


Technical Challenges

1. Limitation of databases

2. Federated setup

3. Transactions across multiple lambdas

Limitation of databases

Key        Value  RecentWrites
attendees  10     [d78590e-1, d78590e-2, …, d78590e-1000]

Solution: spread the log for a given key across multiple rows

RowId    Key        Value  RecentWrites                             NextRow
HEAD     attendees  10     [d78590e-1, d78590e-2, …, d78590e-1000]  f9cec2e
f9cec2e  attendees  11     [d78590e-1001]
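A minimal sketch of spreading one key's log across linked rows, assuming a fixed cap on log entries per row. `ROW_CAPACITY` is set to 3 here purely to keep the demo short (the slide's example fills a row with 1000 entries), and `new_row`, `rows`, and `append_write` are hypothetical names. The sketch is single-threaded and ignores the concurrency control a real store would need.

```python
import uuid

ROW_CAPACITY = 3  # assumed cap, chosen small for the demo


def new_row(row_id, key, value):
    return {"RowId": row_id, "Key": key, "Value": value,
            "RecentWrites": [], "NextRow": None}


# All rows for one key, indexed by RowId; the chain starts at HEAD.
rows = {"HEAD": new_row("HEAD", "attendees", 10)}


def append_write(op_id: str, value) -> bool:
    """Apply a logged write at the tail, adding a new row when the tail is full."""
    row = rows["HEAD"]
    while True:
        if op_id in row["RecentWrites"]:
            return False              # already applied somewhere in the chain
        if row["NextRow"] is None:
            break                     # reached the tail
        row = rows[row["NextRow"]]
    if len(row["RecentWrites"]) >= ROW_CAPACITY:
        fresh = new_row(uuid.uuid4().hex[:7], row["Key"], row["Value"])
        rows[fresh["RowId"]] = fresh
        row["NextRow"] = fresh["RowId"]   # link the old tail to the new row
        row = fresh
    row["Value"] = value
    row["RecentWrites"].append(op_id)
    return True


for step in range(1, 5):
    append_write(f"d78590e-{step}", 10 + step)
```

After four writes the HEAD row is full and points to a second row that holds the latest value, mirroring the two-row picture on the slide.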

Linked DAAL

[Diagram: three linked rows, each with the fields RowId, Key, Value, RecentWrites, NextRow. The first row's RowId is HEAD. A brace is labeled "Primary Key".]

How do we traverse to the tail?

Solution: Use scan and projection to download a skeleton version of Linked DAAL

[Diagram: the skeleton keeps only RowId and NextRow for each row, annotated "256 Bits".]
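A minimal sketch of the skeleton traversal, assuming the database can scan the rows of one key and project a subset of columns. `scan_projection` is a hypothetical stand-in for that operation, and the row ID `a1b2c3d` is invented for the example. Only the (RowId, NextRow) pairs are fetched; the chain is then followed locally from HEAD.

```python
# Full rows as stored, deliberately out of order.
table = [
    {"RowId": "a1b2c3d", "Key": "attendees", "Value": 12,
     "RecentWrites": ["d78590e-2001"], "NextRow": None},
    {"RowId": "HEAD", "Key": "attendees", "Value": 10,
     "RecentWrites": ["d78590e-1"], "NextRow": "f9cec2e"},
    {"RowId": "f9cec2e", "Key": "attendees", "Value": 11,
     "RecentWrites": ["d78590e-1001"], "NextRow": "a1b2c3d"},
]


def scan_projection(rows, key, fields):
    """Stand-in for a scan with a projection: only `fields` are returned."""
    return [{f: r[f] for f in fields} for r in rows if r["Key"] == key]


def find_tail(skeleton) -> str:
    """Follow NextRow pointers from HEAD using only the downloaded skeleton."""
    next_of = {r["RowId"]: r["NextRow"] for r in skeleton}
    row_id = "HEAD"
    while True:
        nxt = next_of[row_id]
        if nxt is None or nxt not in next_of:
            return row_id             # last row known to this snapshot
        row_id = nxt


skeleton = scan_projection(table, "attendees", ["RowId", "NextRow"])
tail_id = find_tail(skeleton)
print(tail_id)
```

The values and write logs never leave the database during the traversal, which keeps the download small regardless of how large each row's log is.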


Invocation with exactly-once semantics

Lambda1 → Call Lambda2 → Lambda2

Lambda1's table:

Operation  Callee
d78590e-1  b97bbe0

Lambda2: Log in Progress Table → make some writes → Mark as Done

Lambda2's progress table:

InstanceId  Done
b97bbe0     False (True after Mark as Done)

[Later builds add X marks to the diagram, then a repeated Call Lambda2 while the progress table already shows b97bbe0 with Done = True, and finally a Receive Response step on Lambda1.]

Invocation with exactly-once semantics

Lambda1 → Call Lambda2 → Lambda2

Lambda1's table:

Operation  Callee
d78590e-1  b97bbe0

[Build sequence with GC: the row (b97bbe0, True) disappears from Lambda2's progress table. A subsequent Call Lambda2 then logs b97bbe0 in the progress table with Done = False again and repeats "make some writes".]

Invocation with exactly-once semantics

Lambda1 → Call Lambda2 → Lambda2

Lambda1's table, now with a Result column:

Operation  Callee   Result
d78590e-1  b97bbe0  result

Lambda2: Log in Intent Table → make some writes → Callback → Mark as Done

[The Result cell is filled in at the Callback step. The final build adds an X to the diagram.]
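A minimal sketch of the callback idea, assuming in-memory tables and a synchronous call. `invoke_log`, `intent_table`, `executions`, `lambda2`, and `sync_invoke` are hypothetical names, and the sketch leaves out crashes in the middle of Lambda2 as well as any retry machinery. It only shows why storing the result on the caller's side keeps a repeated call from re-running the callee after garbage collection has removed the callee's intent row.

```python
import uuid

invoke_log = {}     # Lambda1 side: op_id -> {"Callee": ..., "Result": ...}
intent_table = {}   # Lambda2 side: instance_id -> {"Done": bool}
executions = []     # records each time Lambda2's body actually runs


def lambda2(instance_id: str, op_id: str) -> None:
    intent = intent_table.setdefault(instance_id, {"Done": False})  # log intent
    if intent["Done"]:
        return                                  # already finished: do nothing
    executions.append(instance_id)              # "make some writes"
    invoke_log[op_id]["Result"] = "result"      # callback into the caller's log
    intent["Done"] = True                       # mark as done


def sync_invoke(op_id: str):
    """Call Lambda2 for this operation unless its result is already logged."""
    entry = invoke_log.setdefault(
        op_id, {"Callee": uuid.uuid4().hex[:7], "Result": None})
    if entry["Result"] is None:
        lambda2(entry["Callee"], op_id)
    return entry["Result"]


first = sync_invoke("d78590e-1")
intent_table.clear()                  # simulate GC removing the finished intent
second = sync_invoke("d78590e-1")     # Lambda1 re-executes the same step
print(first, second, len(executions))
```

Without the Result column, the second call would find no intent row for the callee and run Lambda2's body again, which is the situation the GC slides illustrate.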


Evaluation

1. What are the costs of Beldi’s API operations?

2. How does Beldi perform in real-world applications?

3. What is the effect of garbage collection?

What are the costs of Beldi’s API operations?

20 rows in Linked DAAL, 2-4× more expensive than baseline

[Figure: chart. Axis labels and values are not recoverable from the extracted text.]

How does Beldi perform in real-world applications?

DeathStarBench (ASPLOS 19): open-source microservices benchmark
• Movie review service (Cf. IMDB)
• Travel reservation (Cf. Expedia)
• Social media site (Cf. Twitter)

[Diagram labels: Client, Frontend, Search, Recommend, Reserve, User, Profile, Geo, Rate, Reserve Flight, Reserve Hotel.]

How does Beldi perform in real-world applications?

[Figure: chart. Axis labels and values are not recoverable from the extracted text.]

<400 req/s: 2× higher than baseline

700 req/s (saturation): 3.3× higher than baseline


Conclusion

1. A framework to write transactional and fault-tolerant applications on serverless.

2. A lock-free data structure (Linked DAAL) to support fast logging and exactly-once semantics.

3. A collaborative distributed transaction protocol across multiple lambdas.

4. An efficient garbage collection algorithm that runs independently without affecting running lambdas or requiring any pauses.

https://github.com/eniac/beldi


Thank you!