20
Early Scheduling in Parallel State Machine Replication Eduardo Alchieri, Fernando Dotti, and Fernando Pedone Universidade de Brasilia, Pontifica Universidade Católica do Rio Grande do Sul, and University of Lugano 1

Early Scheduling in Parallel State Machine Replication · MySQL Group Replication Galera Cluster Blockchain, ... Parallel State Machine Replication 5 R1 R4 R2 R5 R3 R0 scheduler worker

  • Upload
    others

  • View
    8

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Early Scheduling in Parallel State Machine Replication · MySQL Group Replication Galera Cluster Blockchain, ... Parallel State Machine Replication 5 R1 R4 R2 R5 R3 R0 scheduler worker

Early Scheduling in Parallel State Machine Replication

Eduardo Alchieri, Fernando Dotti, and Fernando Pedone

Universidade de Brasilia, Pontifica Universidade Católica do Rio Grande do Sul, and University of Lugano

�1

Page 2: Early Scheduling in Parallel State Machine Replication · MySQL Group Replication Galera Cluster Blockchain, ... Parallel State Machine Replication 5 R1 R4 R2 R5 R3 R0 scheduler worker

State Machine Replication (SMR)Fundamental approach to fault tolerance

Google Spanner

Apache Zookeeper

Windows Azure Storage

MySQL Group Replication

Galera Cluster

Blockchain, …

�2

Page 3: Early Scheduling in Parallel State Machine Replication · MySQL Group Replication Galera Cluster Blockchain, ... Parallel State Machine Replication 5 R1 R4 R2 R5 R3 R0 scheduler worker

SMR is intuitive and simple

�3

Clients Servers

R0 R1 R2

same order deterministic

execution

R0 R1 R2

R0 R1 R2

Page 4: Early Scheduling in Parallel State Machine Replication · MySQL Group Replication Galera Cluster Blockchain, ... Parallel State Machine Replication 5 R1 R4 R2 R5 R3 R0 scheduler worker

Key observation

Independent requests can execute concurrently

Conflicting requests must be serialized and executed in the same order by the replicas

Two requests conflict if they access common state and at least one of them updates the state

Parallel State Machine Replication

�4

Page 5: Early Scheduling in Parallel State Machine Replication · MySQL Group Replication Galera Cluster Blockchain, ... Parallel State Machine Replication 5 R1 R4 R2 R5 R3 R0 scheduler worker

Parallel State Machine Replication

�5

R1

R2R4

R5 R3 R0

scheduler

worker

worker

scheduler

worker tx

Replica

Replicaworker ty

R0(x)R2(x)R3(x)R5(x)

R1(y)R4(y)

Late scheduling

Scheduling happens after requests are ordered

Early scheduling

Scheduling decisions happen before requests are ordered

E.g., worker tx executes requests on X, worker ty executes requests on Y

Page 6: Early Scheduling in Parallel State Machine Replication · MySQL Group Replication Galera Cluster Blockchain, ... Parallel State Machine Replication 5 R1 R4 R2 R5 R3 R0 scheduler worker

Scheduling tradeoff

�6

Con

curr

ency

Low

High

Synchronization Overhead

Low High

Classic SMR

Late Scheduling

Ideal

Early Scheduling

Early Scheduling

This paper

Page 7: Early Scheduling in Parallel State Machine Replication · MySQL Group Replication Galera Cluster Blockchain, ... Parallel State Machine Replication 5 R1 R4 R2 R5 R3 R0 scheduler worker

Our contributions

Generalization of Early Scheduling

Classes of requests: expressing application concurrency

How to automatically map classes to worker threads

How the resulting technique compares to late scheduling

�7

Page 8: Early Scheduling in Parallel State Machine Replication · MySQL Group Replication Galera Cluster Blockchain, ... Parallel State Machine Replication 5 R1 R4 R2 R5 R3 R0 scheduler worker

Classes of requests

Readers and writers

Class CR: read requests

Class CW: write requests

�8

CR

CWInternal conflict

External conflict

Page 9: Early Scheduling in Parallel State Machine Replication · MySQL Group Replication Galera Cluster Blockchain, ... Parallel State Machine Replication 5 R1 R4 R2 R5 R3 R0 scheduler worker

Mapping classes to workers

Define workers that execute requests in the class

Define class type

Sequential: one request at a time

Concurrent: requests executed concurrently

�9

CR

CWSequential

Concurrentt0, …, tk

tk, …, tn

Page 10: Early Scheduling in Parallel State Machine Replication · MySQL Group Replication Galera Cluster Blockchain, ... Parallel State Machine Replication 5 R1 R4 R2 R5 R3 R0 scheduler worker

Early Scheduling execution model

�10

scheduler

worker t0

worker t1

Replica

class ➝ workers mapping

ordered requestsR1, R2,… in class C

Class C is CONCURRENT: request assigned to t0 OR t1

R1R4 R3R6

R5 R2R7

Page 11: Early Scheduling in Parallel State Machine Replication · MySQL Group Replication Galera Cluster Blockchain, ... Parallel State Machine Replication 5 R1 R4 R2 R5 R3 R0 scheduler worker

Early Scheduling execution model

�11

scheduler

worker t0R1R2

worker t1R1R2

Replica

class ➝ workers mapping

ordered requestsR1, R2,… in class C

Class C is SEQUENTIAL: request assigned to t0 AND t1

barrier

Page 12: Early Scheduling in Parallel State Machine Replication · MySQL Group Replication Galera Cluster Blockchain, ... Parallel State Machine Replication 5 R1 R4 R2 R5 R3 R0 scheduler worker

Mapping classes to workers

�12

Every class must have at least one worker thread

t0,t1,t2

t3

Rule #1

C1

C2

If C has internal conflicts, then it must be sequential

Rule #2CR

CW SequentialC1

If C1 and C2 conflict, at least one must be sequential

Rule #3

Sequential

Sequential

or

C1

C2If C1 and C2 conflict, C1 is

concurrent, and C2 is sequential, workers of C1 are workers of C2

Rule #4

Sequential

Concurrent C1

C2

t0,t1

t0,t1,t2If C1 and C2 conflict, and are

sequential, then C1 and C2 must have one worker in common

Rule #5

Sequential

Sequential C1

C2

t0,t1,t2

t2,t3,t4

Page 13: Early Scheduling in Parallel State Machine Replication · MySQL Group Replication Galera Cluster Blockchain, ... Parallel State Machine Replication 5 R1 R4 R2 R5 R3 R0 scheduler worker

Local reads mostcommon requests

Workers: t0,t1,t2,t3

Mapping classes to workers

�13

CR1

CW1

CR2

CW2

CRg

CWg

Synchronizedt0, t1, t2

t0, t1, t2, t3

t0, t2, t3Sequential

Concurrent

Sequential

Concurrent

Sequential

Concurrent

t0, t1 t2, t3

t0, t2

Page 14: Early Scheduling in Parallel State Machine Replication · MySQL Group Replication Galera Cluster Blockchain, ... Parallel State Machine Replication 5 R1 R4 R2 R5 R3 R0 scheduler worker

Optimizing scheduling

O1a: Minimize workers in sequential classes

O1b: Maximize workers in concurrent classes

O2: Assign workers to concurrent classes in proportion to class weight (i.e., more work, more workers)

O3: Minimize unnecessary synchronization among classes

�14

Page 15: Early Scheduling in Parallel State Machine Replication · MySQL Group Replication Galera Cluster Blockchain, ... Parallel State Machine Replication 5 R1 R4 R2 R5 R3 R0 scheduler worker

Optimization model

�15

. . .

Described in AMPL

Solved with KNitro

Page 16: Early Scheduling in Parallel State Machine Replication · MySQL Group Replication Galera Cluster Blockchain, ... Parallel State Machine Replication 5 R1 R4 R2 R5 R3 R0 scheduler worker

Naive vs Optimized mapping

�16

Local reads mostcommon requests

Workers: t0,t1,t2,t3

CR1

CW1

CR2

CW2

CRg

CWg

Concurrent t0, t1

Concurrent t2, t3

Sequential t0, t1, t2, t3

Sequential t0, t1

Sequential t2, t3

parallel

Sequentialt0, t2

Page 17: Early Scheduling in Parallel State Machine Replication · MySQL Group Replication Galera Cluster Blockchain, ... Parallel State Machine Replication 5 R1 R4 R2 R5 R3 R0 scheduler worker

Experimental evaluationPrototype in BFT-SMaRt environment

Early scheduling and late scheduling

Configured to crash failures (not BFT)

Linked-list application

Single- and multi-shard deployments

Light, moderate, and heavy execution costs

Uniform and skewed workloads

�17

Page 18: Early Scheduling in Parallel State Machine Replication · MySQL Group Replication Galera Cluster Blockchain, ... Parallel State Machine Replication 5 R1 R4 R2 R5 R3 R0 scheduler worker

Single-shard, reads, moderate

�18

Page 19: Early Scheduling in Parallel State Machine Replication · MySQL Group Replication Galera Cluster Blockchain, ... Parallel State Machine Replication 5 R1 R4 R2 R5 R3 R0 scheduler worker

Multi-shard, mixed, moderate

�19

Page 20: Early Scheduling in Parallel State Machine Replication · MySQL Group Replication Galera Cluster Blockchain, ... Parallel State Machine Replication 5 R1 R4 R2 R5 R3 R0 scheduler worker

�20

http://www.inf.usi.ch/faculty/pedone/