
Consensus and Related Problems

Béat Hirsbrunner

References
G. Coulouris, J. Dollimore and T. Kindberg, "Distributed Systems: Concepts and Design", Ed. 4, Addison-Wesley 2005, Chap. 12.5

Distributed Systems
Béat Hirsbrunner (Fribourg) and Peter Kropf (Neuchâtel)

Summer Semester 2007, Lecture 3b, 25 May 2007


The problem

Roughly speaking, the problem is for processes to agree on a value after one or more of the processes has proposed what that value should be.

Assumptions about communication
• Point-to-point communication is reliable
• Group communication is based on B-multicast

Assumptions about processes
• Processes communicate by message passing
• Processes may crash (in the case of the Byzantine Generals Problem, the processes may even fail arbitrarily, i.e. be treacherous!)

(Figure omitted; reminder, cf. p. 53)

B-multicast (reminder, cf. p. 486)
Basic multicast primitive that guarantees, unlike IP multicast, that a correct process will eventually deliver the message:

• To B-multicast(g,m): for each process p in g, R-send(p,m)
• On R-receive(m) at p: B-deliver(m) at p

R-send and R-receive (reminder, cf. p. 56)
The term reliable communication is defined in terms of validity and integrity as follows:

• validity: any message in the outgoing message buffer is eventually delivered to the incoming message buffer.

• integrity: the message received is identical to the one sent, and no messages are delivered twice.
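The two reminders above can be illustrated with a small in-memory sketch. It is not taken from the lecture: the `Process` class, the explicit `msg_id` argument and the duplicate filter are assumptions made for illustration, and "reliable" here simply means that appending to a local buffer never loses or alters a message.

```python
from collections import deque


class Process:
    """Toy process with an incoming message buffer (hypothetical helper)."""

    def __init__(self, name):
        self.name = name
        self.incoming = deque()  # incoming message buffer
        self.seen = set()        # ids of messages already received
        self.delivered = []      # messages handed to the application

    def r_receive(self):
        """Drain the buffer; each distinct message is B-delivered once."""
        while self.incoming:
            msg_id, m = self.incoming.popleft()
            if msg_id in self.seen:  # integrity: no message delivered twice
                continue
            self.seen.add(msg_id)
            self.b_deliver(m)

    def b_deliver(self, m):
        self.delivered.append(m)


def r_send(p, msg_id, m):
    """Validity (in this toy model): the message reaches p's buffer unchanged."""
    p.incoming.append((msg_id, m))


def b_multicast(group, msg_id, m):
    """To B-multicast(g, m): for each process p in g, R-send(p, m)."""
    for p in group:
        r_send(p, msg_id, m)


group = [Process("p1"), Process("p2"), Process("p3")]
b_multicast(group, 1, "proceed")
r_send(group[0], 1, "proceed")  # a duplicate copy, filtered on receipt
for p in group:
    p.r_receive()
```

After the run, every process in `group` has delivered the message exactly once, including `p1`, which was sent a duplicate.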


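Consensus Problem (C)

(Figure: consensus for three processes. p1 and p2 propose v1 = proceed and v2 = proceed; p3 proposes v3 = abort and then crashes. After running the consensus algorithm, p1 and p2 decide d1 = proceed and d2 = proceed.)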


Byzantine Generals Problem (BG)

Assumption: communication channels between pairs of processes are reliable and private.

Requirements
• Termination and Agreement: same as for the consensus problem.
• Integrity: if the commander is correct, then all correct processes decide on the value that the commander proposed.


Interactive Consistency Problem (IC)
• Each process pi suggests one value vi.
• Goal: all correct processes agree on a vector of values, each component corresponding to one process's agreed value. Example: agreement about each process's local state.

Requirements
• Termination: same as for the consensus problem.
• Agreement: the decision vector of all correct processes is the same.
• Integrity: if pi is correct, then all correct processes decide on vi as the i-th component of their vector.

Lemma
The four problems (a) Consensus, (b) Byzantine Generals, (c) Interactive Consistency and (d) Reliable Totally Ordered Multicast are equivalent in the sense that a solution to any one of them can be used to construct a solution to each of the others.

"Proof" (for more detail see p. 502-503)
• IC from BG: run BG N times, once with each process pi as commander.
• BG from C: all processes run C with the value received from the commander pj.
• C from IC: run IC, then apply an appropriate function (e.g. majority) to the vector to produce a single value.
• C from RTO-multicast: each pi RTO-multicasts its proposed value and chooses the first value that RTO-multicast delivers.
• RTO-multicast from C: see Chandra and Toueg [1996] (not trivial, only for interested students).
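Two of the reductions (IC from BG, C from IC) can be sketched as plain function composition. This is an illustration only, seen from the point of view of a single correct process: the names `ic_from_bg`, `c_from_ic` and `failure_free_bg` are hypothetical, and `failure_free_bg` is a stand-in that returns the commander's value, as any BG solution must in a run without failures.

```python
from collections import Counter


def majority(values):
    """Value held by a strict majority of a non-empty list, else None."""
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else None


def ic_from_bg(bg, proposals):
    """IC from BG: run BG N times, once with each process p_i as commander."""
    return [bg(commander=i, value=v) for i, v in enumerate(proposals)]


def c_from_ic(ic, proposals, choose=majority):
    """C from IC: run IC, then apply one agreed function to the vector."""
    return choose(ic(proposals))


def failure_free_bg(commander, value):
    """Stand-in BG solution for a failure-free run (assumption)."""
    return value


proposals = ["proceed", "proceed", "abort"]
vector = ic_from_bg(failure_free_bg, proposals)
decision = c_from_ic(lambda ps: ic_from_bg(failure_free_bg, ps), proposals)
```

Because every correct process obtains the same vector from IC and applies the same function to it, all correct processes reach the same single value.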


Assumption: only crashes, no Byzantine faults.

Proof
• Termination: obvious, as the system is synchronous!
• Agreement and integrity: follow from the Lemma "every process arrives at the same final set Values(f+1,_)", since every correct process then applies the same decision function to the same set.

"Proof" of the Lemma (for more detail see p. 504)
– If a process crashes during a round, its B-multicast value may not reach every correct process.
– At most f processes crash and there are f+1 rounds, so at least one round has no process crash; after that round, a value v held by a correct pi is also held by every other correct pk (proof by induction over the rounds).
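The algorithm that this proof refers to is not reproduced in the text above. The sketch below assumes it is the round-based algorithm of Coulouris et al.: each process B-multicasts its newly learned values in each of f+1 rounds and finally decides on the minimum of its set. The crash schedule format (`crashes`) and the function name `run_consensus` are made up for this illustration.

```python
def run_consensus(proposals, f, crashes=None):
    """Simulate f+1 synchronous rounds with crash failures only.

    proposals: dict pid -> proposed value
    crashes:   dict pid -> (round, recipients reached before crashing)
    Returns a dict pid -> decision for the processes that did not crash.
    """
    crashes = crashes or {}
    assert len(crashes) <= f, "the algorithm tolerates at most f crashes"
    values = {p: {v} for p, v in proposals.items()}  # Values(r, p)
    sent = {p: set() for p in proposals}             # values already multicast
    alive = set(proposals)

    for r in range(1, f + 2):                        # rounds 1 .. f+1
        inbox = {p: set() for p in proposals}
        for p in sorted(alive):
            new = values[p] - sent[p]                # only values not yet sent
            sent[p] |= new
            if p in crashes and crashes[p][0] == r:
                recipients = crashes[p][1]           # partial multicast, then crash
            else:
                recipients = set(proposals)
            for q in recipients:
                inbox[q] |= new
        alive -= {p for p, (cr, _) in crashes.items() if cr == r}
        for p in alive:
            values[p] |= inbox[p]

    # Every correct process applies the same function (minimum) to its set.
    return {p: min(values[p]) for p in alive}


# p2 crashes in round 1 after reaching only p1; f = 1, so there are two rounds.
decisions = run_consensus({1: 5, 2: 3, 3: 9}, f=1, crashes={2: (1, {1})})
```

In this run p3 misses the value 3 in round 1 but learns it from p1 in round 2, which is the crash-free round that the Lemma relies on.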

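Byzantine generals in a synchronous system: Solution with one faulty process

Example (four processes, one of them faulty; a message "3:1:u" means "p3 says that p1 says u")

Case 1: lieutenant p3 is faulty
• p1 (Commander) sends 1:v to p2, p3 and p4.
• p2 relays 2:1:v to p3 and p4; p4 relays 4:1:v to p2 and p3.
• p3 relays 3:1:u to p2 and 3:1:w to p4.
• p2: majority({v,u,v}) = v
• p4: majority({v,v,w}) = v

Case 2: the commander p1 is faulty and sends two different values
• p1 (Commander) sends 1:v to p2, 1:w to p3 and 1:v to p4.
• p2 relays 2:1:v, p3 relays 3:1:w and p4 relays 4:1:v to the other lieutenants.
• p2: majority({v,w,v}) = v
• p3: majority({v,v,w}) = v
• p4: majority({w,v,v}) = v

Case 3: the commander p1 is faulty and sends three different values
• p1 (Commander) sends 1:u to p2, 1:w to p3 and 1:v to p4.
• p2 relays 2:1:u, p3 relays 3:1:w and p4 relays 4:1:v to the other lieutenants.
• p2, p3, p4: majority({v,u,w}) = ⊥ (no majority value exists)

(Figure © Addison-Wesley Publishers 2000)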
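The three cases of the example can be replayed with a short simulation. This is a sketch rather than the lecture's own code: `byzantine_round` and its `relay_override` parameter are hypothetical names, a faulty lieutenant is modelled only by the values it relays, and `None` stands for ⊥.

```python
from collections import Counter


def majority(values):
    """Value held by a strict majority, else None (standing for ⊥)."""
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else None


def byzantine_round(commander_sends, relay_override=None):
    """One run with commander p1 and lieutenants p2, p3, p4.

    commander_sends: dict lieutenant -> value received from the commander
    relay_override:  dict (sender, receiver) -> value relayed by a faulty
                     lieutenant instead of the value it actually received
    Returns a dict lieutenant -> decision.
    """
    relay_override = relay_override or {}
    lieutenants = sorted(commander_sends)
    decisions = {}
    for q in lieutenants:
        received = [commander_sends[q]]              # "1:x" from the commander
        for p in lieutenants:
            if p != q:                               # "p:1:x" relayed by p
                received.append(
                    relay_override.get((p, q), commander_sends[p]))
        decisions[q] = majority(received)
    return decisions


# Case 1: p3 is faulty and relays u to p2 and w to p4.
case1 = byzantine_round({2: "v", 3: "v", 4: "v"},
                        relay_override={(3, 2): "u", (3, 4): "w"})
# Case 2: the commander sends v, w, v.
case2 = byzantine_round({2: "v", 3: "w", 4: "v"})
# Case 3: the commander sends u, w, v.
case3 = byzantine_round({2: "u", 3: "w", 4: "v"})
```

In case 1 only the decisions of the correct lieutenants p2 and p4 matter; the entry computed for the faulty p3 can be ignored.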

Byzantine generals in a synchronous system: Discussion

Impossibility of Agreement in Asynchronous Systems

Previous algorithms: synchrony assumption
– message exchanges in rounds
– timeouts

In asynchronous systems, consensus is challenged by:
– crashes that may not be detected
– network partitioning
– etc.

Idea: use handshake protocols to "commit" the transfer of information, so that all data is known to have been delivered to all parties.

Theorem: In asynchronous systems, no algorithm can guarantee reaching consensus, even with just one process crash failure.

i.e. no completely asynchronous consensus protocol can tolerate even a single unannounced process death
– even with no Byzantine failures, only crashes considered
– and with reliable messaging assumed (all messages delivered, no duplication)

Proof idea: Show that there is always some continuation of the processes' execution that avoids consensus being reached.

Reference: M. Fischer, N. Lynch and M. Paterson, "Impossibility of Distributed Consensus with One Faulty Process", Journal of the ACM, Vol. 32, No. 2, April 1985, pp. 374-382.
