Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring 2009

Principles of Reliable Distributed Systems

Lecture 5: Synchronous (Uniform) Consensus

Spring 2009

Idit Keidar

Today’s Material

• Distributed Algorithms, Nancy Lynch
  – Ch. 6

• Distributed Computing, Attiya and Welch
  – Ch. 5

Reminder: State Machine Replication (SMR)

[Figure: Client A and Client B connected through atomic broadcast]

Replica Coordination Requirements

• Agreement: all replicas receive all client requests
  – What happens when a replica (server) fails?
  – What happens when a client fails?

• Order: replicas process requests in the same order

Uniform Atomic Broadcast

• Uniform Reliable Broadcast
  – Validity: if a correct process broadcasts m then all correct processes eventually deliver m
  – Uniform Agreement: if any process delivers m then all correct processes eventually deliver m
  – Integrity: m is delivered by a correct process at most once, and only if it was previously broadcast

• Uniform Total Order
  – If any two processes deliver both m and m’, they deliver them in the same order

Today’s Problem: Uniform Consensus

Each process has an input, should decide on an output (one-shot problem)

• Uniform Agreement: every two decisions are the same

• Validity: every decision is an input of one of the processes

• Termination: eventually all correct processes decide
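
The three properties above can be checked mechanically on the outcome of a finished run. The following is a minimal sketch, not part of the lecture: the function name and the dictionary-based encoding of inputs and decisions are assumptions made for illustration, and "eventually" is approximated by "by the end of the recorded run".

```python
def check_uniform_consensus(inputs, decisions, correct):
    """Check the Uniform Consensus properties on one finished run.

    inputs:    dict pid -> input value (all processes)
    decisions: dict pid -> decided value (only processes that decided,
               including processes that decided and later failed)
    correct:   set of pids that never fail in this run
    """
    decided_values = set(decisions.values())
    return {
        # Uniform Agreement: every two decisions are the same,
        # even those made by processes that fail later.
        "uniform_agreement": len(decided_values) <= 1,
        # Validity: every decision is the input of some process.
        "validity": decided_values <= set(inputs.values()),
        # Termination: every correct process has decided.
        "termination": all(p in decisions for p in correct),
    }


if __name__ == "__main__":
    inputs = {1: 5, 2: 3, 3: 7}
    # p2 decided 5 before failing while p1 and p3 decided 3:
    # this violates *uniform* agreement.
    print(check_uniform_consensus(inputs, {1: 3, 2: 5, 3: 3}, correct={1, 3}))
```

Note that the second run in the usage example would satisfy plain (non-uniform) agreement, since the two correct processes agree; the uniform variant also constrains the faulty process.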

(Uniform) Consensus versus (Uniform) Atomic Broadcast

• From Atomic Broadcast to Consensus

• From Consensus to Atomic Broadcast – Homework question

• From now on, we will focus mainly on consensus, and keep in mind that it suffices for Atomic Broadcast and SMR
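
The slide does not spell out the reduction from Atomic Broadcast to Consensus. One standard construction is that every process broadcasts its input and decides on the first value delivered. The sketch below illustrates that idea only; `ToyAtomicBroadcast` is a hypothetical single-log stand-in for a real atomic broadcast service, not an implementation of one.

```python
class ToyAtomicBroadcast:
    """Hypothetical stand-in: one shared log, so every process sees the
    same messages in the same order (the total-order guarantee)."""

    def __init__(self):
        self._log = []

    def broadcast(self, sender, message):
        self._log.append((sender, message))

    def delivery_order(self):
        return list(self._log)


def decide_via_atomic_broadcast(inputs, abcast):
    """Each process broadcasts its input and decides the first delivered value."""
    for pid, value in inputs.items():
        abcast.broadcast(pid, value)
    _first_sender, first_value = abcast.delivery_order()[0]
    # Every process applies the same rule to the same delivery order.
    return {pid: first_value for pid in inputs}


if __name__ == "__main__":
    print(decide_via_atomic_broadcast({"p1": 7, "p2": 4, "p3": 9},
                                      ToyAtomicBroadcast()))
```

Agreement follows from total order (all processes see the same first message), and validity holds because the first delivered value was broadcast by some process as its input.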

Today’s Model(s)

• Round-based synchronous

• Static set P = {p1, …, pn} of processes

• Reliable links
  – What happens if links can fail?

• Fault tolerance:

1. Crash failures

2. Byzantine failures

Synchronous Round-Based Model

• Synchronous rounds:

1. Send messages to any set of processes;

2. Receive messages from this round;

3. Do local processing (possibly decide, halt)

Model 1: Round-Based Failstop

• If pi does not crash in step 1 of round r, and pj does not crash in or before step 2 of round r, then any message sent by pi to pj in round r is received by pj in round r

• Note: If pi crashes in step 1 of a round, then any subset of the messages pi sends in this round can be lost

Round-Based Failstop Model

• If a message from pj is expected, and no message from pj is received, then pj is suspected

• If pi is suspected in round r, then pi fails in round r or r-1, and no further messages from pi will arrive

[Figure: timeline of p1, p2, p3 across round 1 and round 2. p1 crashes in round 2, step 1; p2 receives p1’s round 2 message; p3 suspects p1 in round 2.]

t-Resilient Algorithm

• t is a threshold on the number of potential failures
  – The algorithm is correct as long as no more than t processes fail

• In the following algorithm, 0 ≤ t < n

• We denote by f the number of actual failures that occur in a given run, 0 ≤ f ≤ t

• We’d like t to be big (robust algorithm)
  – But f will usually be small (failures are rare)

Example: t=0 versus f=0

• Think of a simple algorithm for t=0

• What happens if we run this algorithm where failures do occur?

Notation

• P = {p1, …, pn} is the set of processes

• initi is pi’s initial value (input)

• The decide action determines the output

• We show the code for process pi

• Local variables of pi are denoted: vi, Alivei

t-Resilient Failstop Uniform Consensus Algorithm

    vi = initi; Alivei = P
    in every round 1 ≤ k ≤ t+2:
        send vi to all
        receive round k messages
        for all pj
            if (received vj) then vi = min(vi, vj)
            otherwise pj is suspected
        if ( (∀pj ∈ Alivei : received vj = vi) && !decided ) then decide vi
        for all pj
            if (suspect pj) then Alivei = Alivei \ {pj}
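
To see the algorithm run, here is a small round-by-round simulation sketch. It is an illustration under stated assumptions, not the lecture's code: processes are numbered from 0, the `crashes` schedule (who crashes in step 1 of which round, and which recipients still get the message) is a hypothetical way to inject failures, and input values are assumed comparable and never `None`.

```python
def run_failstop_consensus(inputs, t, crashes=None):
    """Simulate the (t+2)-round failstop uniform consensus algorithm.

    inputs:  list of initial values; index = process id.
    crashes: hypothetical schedule {pid: (round, recipients)}: pid crashes in
             step 1 of `round` after sending only to `recipients`.
    Returns (decision, decided_round); entries are None if never decided.
    """
    crashes = crashes or {}
    n = len(inputs)
    v = list(inputs)                            # v_i = init_i
    alive = [set(range(n)) for _ in range(n)]   # Alive_i = P
    decision = [None] * n
    decided_round = [None] * n
    crashed = set()

    for k in range(1, t + 3):                   # rounds 1 .. t+2
        # Step 1: send v_i to all (a crashing sender reaches only a subset).
        inbox = [dict() for _ in range(n)]
        for i in range(n):
            if i in crashed:
                continue
            recipients = range(n)
            if i in crashes and crashes[i][0] == k:
                recipients = crashes[i][1]
                crashed.add(i)
            for j in recipients:
                inbox[j][i] = v[i]

        # Steps 2 and 3: receive, then do local processing.
        for i in range(n):
            if i in crashed:
                continue
            received = inbox[i]
            v[i] = min([v[i], *received.values()])
            suspected = set(range(n)) - set(received)
            # Decide if every process still in Alive_i sent exactly v_i.
            if decision[i] is None and all(
                received.get(j) == v[i] for j in alive[i]
            ):
                decision[i] = v[i]
                decided_round[i] = k
            alive[i] -= suspected               # Alive_i = Alive_i \ {p_j}
    return decision, decided_round


if __name__ == "__main__":
    # p1 crashes in round 1 and its value 1 reaches only p0.
    print(run_failstop_consensus([3, 1, 2, 5], t=2, crashes={1: (1, {0})}))
```

In the usage example there is one actual failure (f = 1), and the surviving processes decide in round 3, matching the f+2 bound discussed later in the lecture.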

Proof: Validity

• Lemma: For every process pi, vi always holds the initial value initj of some process pj.

Proof: Uniform Agreement

• Lemma: If there exist a value v, a round r, and a process pi such that all processes that are in Alivei at the beginning of round r send v in round r, then v is the only possible decision value from r onward.

Proof: Uniform Agreement (Cont’d)

• From the Lemma, we get that if some process decides v in round r, then v is the only possible decision value from r onward.

• Now look at the first round in which some process decides.

Termination Lemma

• After a round r in which no process fails, all processes have the same vi forever

• Proof:
  – Because all receive the same messages in r,
  – By induction…

Proof: Termination 1/2

• Consider a run where f processes fail
  – There are at most f rounds with failures
  – There are at most f rounds when Alivei changes at any correct pi
  – Alivei can change to reflect a failure either in the round of the failure or in the ensuing round

Proof: Termination 2/2

• In f+2 rounds, there is at least one failure-free round and later at least one round in which Alivei does not change
  – Thus, from the Termination Lemma, after at most f+2 rounds, there is a round in which Alivei does not change and all received values are the same

How Long Does it Take?

• Early-deciding: in a run with f failures, decision is reached by the end of round f+2

• This is optimal
  – For Uniform Consensus, but not for Consensus
  – As long as f < t-1

Deciding vs. Stopping (Halting)

• The algorithm is not early-stopping:
  – It continues running for t+2 rounds
  – Even after reaching a decision

• Homework question: can you change the algorithm to be early-stopping?
  – Stop (halt) after f+k rounds in runs with t≥f≥0 failures for some constant k

Model 2: Byzantine Faults

Synchronous Byzantine Consensus

The Byzantine Generals Problem

• First formulation of the consensus problem [Pease, Shostak, Lamport 80]

[Figure: generals exchanging conflicting orders, “Let’s attack” and “Let’s not attack”]

Byzantine Faults

• Faulty processes can behave arbitrarily, i.e., they don’t have to follow the protocol, e.g.,
  – can suffer benign failures (crash, timing);
  – can send bogus values in messages;
  – can send messages at the wrong time;
  – can send different messages to different processes; etc.

• Captures software bugs, hacker intrusions

Byzantine Nodes can Lead Correct Nodes to Conflicting Decisions

Correct nodes cannot know whom to believe

[Figure: nodes receive conflicting messages, “Let’s vote out Marina” and “Let’s vote out Guy”]

Byzantine-Fault-Tolerant (BFT) Consensus

• Only non-uniform makes sense. Why?

• Recall, we defined consensus as follows:
  – Agreement: correct processes’ decisions are the same
  – Termination: eventually all correct processes decide
  – Validity: decision is input of one process

• Problem?

Validity: Take II

• Strong unanimity: If the input of all the correct processes is v then no correct process decides a value other than v

• How resilient can an algorithm satisfying this property be?
  – Homework: prove this!

Consensus w/ Strong Unanimity

Each process has input, should decide on output

• Agreement: correct processes’ decisions are the same

• Validity (Strong Unanimity): If the input of all the correct processes is v then no correct process decides a value other than v

• Termination: eventually all correct processes decide
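
As a concrete reading of this specification, the sketch below checks the three properties on one finished run, looking only at correct processes. The function name and the dictionary encoding are illustrative assumptions; they do not come from the lecture.

```python
def check_bft_consensus(inputs, decisions, correct):
    """Check Byzantine consensus with Strong Unanimity on one finished run.

    inputs:    dict pid -> input value (only correct pids are consulted)
    decisions: dict pid -> decided value
    correct:   set of pids that followed the protocol in this run
    """
    correct_decisions = {decisions[p] for p in correct if p in decisions}
    correct_inputs = {inputs[p] for p in correct}

    # Strong Unanimity only constrains runs where all correct inputs are equal.
    if len(correct_inputs) == 1:
        unanimity = correct_decisions <= correct_inputs
    else:
        unanimity = True

    return {
        "agreement": len(correct_decisions) <= 1,
        "strong_unanimity": unanimity,
        "termination": all(p in decisions for p in correct),
    }


if __name__ == "__main__":
    # p3 is Byzantine: its claimed input and decision are ignored.
    print(check_bft_consensus({1: "a", 2: "a", 3: "z"},
                              {1: "a", 2: "a", 3: "q"},
                              correct={1, 2}))
```

Unlike the uniform crash-failure definition, nothing here restricts what a faulty process outputs, which is why only the non-uniform variant is meaningful in the Byzantine setting.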

2 Byzantine Models

1. Authenticated
  – Uses digital signatures
  – Assumes PKI (Public Key Infrastructure)

2. Un-authenticated
  – No digital signatures
  – Secure point-to-point communication
  – Over the Internet: implemented with symmetric keys

1. Authenticated (Byzantine) Model

• Authentication: The receiver of a message can ascertain its origin
  – An intruder cannot masquerade as someone else

• Integrity: The receiver of a message can verify that it has not been modified in transit
  – An intruder cannot substitute a false message for a legitimate one

• Nonrepudiation: A sender cannot falsely deny later that he sent a message

Implementing Authentication

• Uses a Cryptographic Public Key Infrastructure (PKI)

• Each process has a well-known public key and a matching private key

• ⟨M⟩p is message M signed by p’s private key
  – Only p can generate ⟨M⟩p
  – Every process can verify p’s signature on ⟨M⟩p using p’s public key

Exploiting Authentication

• All messages are signed by their source

• Every receiver can verify the message

• Signed messages can be forwarded as proof
  – “I can prove that Idit said that I don’t have to submit this homework assignment”
  – ⟨Yossy does not have to submit homework assignment 2⟩Idit

• Liars can be exposed
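
One way to read "liars can be exposed": once signed messages are forwarded, anyone holding two validly signed messages from the same signer with different values has proof that the signer lied. The sketch below shows only that bookkeeping. It is a toy model: a signed message is represented as a `(signer, value)` pair and signature verification is assumed to have already happened, so no real cryptography is involved.

```python
from collections import defaultdict


def find_liars(signed_messages):
    """Return the signers that signed more than one distinct value.

    signed_messages: iterable of (signer, value) pairs. Each pair stands in
    for a message whose signature has already been verified (assumption).
    """
    values_by_signer = defaultdict(set)
    for signer, value in signed_messages:
        values_by_signer[signer].add(value)
    return {s for s, values in values_by_signer.items() if len(values) > 1}


if __name__ == "__main__":
    forwarded = [
        ("Dan", "vote out Marina"),   # what Dan signed for one node
        ("Dan", "vote out Guy"),      # what Dan signed for another node
        ("Guy", "vote out Marina"),
    ]
    print(find_liars(forwarded))
```

Because only the signer can produce its own signature, the two conflicting pairs cannot have been fabricated by the processes that forwarded them.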

Today’s Model 2

• Round-based synchronous

• Static set P = {p1, …, pn} of processes

• t-out-of-n Byzantine (arbitrary) failures
  – t < n/2

• Authentication

Exponential Information Gathering (EIG) Algorithms

• Forward all received messages in each round, for t+1 rounds:

    In round 1:
        send your value to all
    In later rounds:
        for every received message m (w/out my_id)
            forward m + my_id to all

EIG with Signatures for t < n/2

    send ⟨vi⟩pi to all
    in every round 2 ≤ k ≤ t+1:
        for every received message m:
            if (m has k-1 different valid signatures and not mine) then send ⟨m⟩pi to all

    Validi = { ⟨vj⟩pj | all messages with t+1 different valid signatures starting with pj’s have same value vj }

    decide on most common value in Validi
    in case of a tie, choose the default value
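
The following simulation sketch walks through the signed EIG algorithm on a tiny system. Several things here are assumptions made for illustration, not part of the lecture: a signature chain is modeled as a tuple of signer ids (unforgeability is implied because the simulator only ever appends the real sender), "send to all" includes the sender itself, and a Byzantine process misbehaves only by equivocating or omitting in round 1 (described by the hypothetical `byz_round1` parameter) and then relays like a correct process.

```python
from collections import Counter

DEFAULT = "default"


def run_signed_eig(inputs, t, byz_round1=None):
    """inputs: dict pid -> value. byz_round1: {byz_pid: {recipient: value}}.
    Returns the decisions of the correct processes."""
    byz_round1 = byz_round1 or {}
    pids = sorted(inputs)

    # Round 1: every process signs its value and sends it to all.
    inbox = {p: set() for p in pids}
    for s in pids:
        for r in pids:
            if s in byz_round1:
                if r in byz_round1[s]:          # may equivocate or omit
                    inbox[r].add((byz_round1[s][r], (s,)))
            else:
                inbox[r].add((inputs[s], (s,)))

    # Rounds 2..t+1: countersign and forward messages that carry
    # k-1 different signatures, none of them mine.
    for k in range(2, t + 2):
        nxt = {p: set() for p in pids}
        for s in pids:
            for value, chain in inbox[s]:
                if len(chain) == k - 1 and s not in chain:
                    for r in pids:
                        nxt[r].add((value, chain + (s,)))
        inbox = nxt

    decisions = {}
    for p in pids:
        if p in byz_round1:
            continue                            # only correct processes decide
        values_by_origin = {}
        for value, chain in inbox[p]:
            if len(chain) == t + 1:
                values_by_origin.setdefault(chain[0], set()).add(value)
        # Valid_p: origins whose (t+1)-signature messages all agree.
        valid = [next(iter(vs)) for vs in values_by_origin.values()
                 if len(vs) == 1]
        top = Counter(valid).most_common(2)
        tie = len(top) == 2 and top[0][1] == top[1][1]
        decisions[p] = DEFAULT if (not top or tie) else top[0][0]
    return decisions


if __name__ == "__main__":
    # p2 is Byzantine: it tells p0 "A" and tells p1 and itself "B".
    print(run_signed_eig({0: "A", 1: "A", 2: "B"}, t=1,
                         byz_round1={2: {0: "A", 1: "B", 2: "B"}}))
```

In the usage example both correct processes see conflicting two-signature messages originating at p2, drop p2 from Valid, and decide "A", which is what Strong Unanimity requires since all correct inputs are "A".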

Signatures Expose Liars

[Figure: nodes Guy, Dan, and Marina. Messages signed by Dan carry conflicting values, “Let’s vote out Marina” and “Let’s vote out Guy”. Caption: Remove from Valid]

Validity

• Need to prove Strong Unanimity: If the input of all correct processes is v then no correct process decides a value other than v

• Claim: At every correct pi, for all correct pj, Validi includes ⟨vj⟩pj

• Validity follows

Agreement

• Claim: For two correct processes pi and pj, Validi and Validj include the same values

• Agreement follows

Termination

• Decide always happens after t+1 rounds

Can We Improve the Resilience?

Validity: Take III

• Weak unanimity: If the input of all the correct processes is v and no process fails then no correct process decides a value other than v

• Does this prevent a trivial solution?

• Resilience?
  – See recitation

Summary of Known Results

• Synchronous, Byzantine Fault-Tolerant, t-resilient consensus algorithms:
  – Strong unanimity with authentication: iff t < n/2
      • As we just saw
  – Weak unanimity with authentication: iff t < n
      • Recitation
  – Without authentication: iff t < n/3
      • Next week
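
As a worked arithmetic example of these bounds, the sketch below turns each strict inequality into the largest tolerable t for a given n. The function name and the case labels are made up for illustration; the three bounds are the ones listed in the summary.

```python
def max_byzantine_faults(n, case):
    """Largest t satisfying the summary's bound for a system of n processes.

    case is one of (labels are illustrative, not from the lecture):
      "strong_auth": strong unanimity with authentication, t < n/2
      "weak_auth":   weak unanimity with authentication,   t < n
      "no_auth":     without authentication,               t < n/3
    """
    if case == "strong_auth":
        return (n - 1) // 2      # t < n/2  <=>  2t <= n-1
    if case == "weak_auth":
        return n - 1             # t < n    <=>  t <= n-1
    if case == "no_auth":
        return (n - 1) // 3      # t < n/3  <=>  3t <= n-1
    raise ValueError(f"unknown case: {case!r}")


if __name__ == "__main__":
    for n in (3, 4, 7, 10):
        print(n, {c: max_byzantine_faults(n, c)
                  for c in ("strong_auth", "weak_auth", "no_auth")})
```

For example, with n = 4 processes the bounds allow one Byzantine fault under strong unanimity with signatures, three under weak unanimity with signatures, and one without signatures.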