Distributed Algorithms:
Agreement Protocols
Problems of Agreement
A set of processes need to agree on a value (decision), after one or more processes have proposed what that value (decision) should be
Examples: mutual exclusion, election, transactions
Processes may be correct, crashed, or they may exhibit arbitrary (Byzantine) failures
Messages are exchanged on a one-to-one basis, and they are not signed
Consensus and related problems
System model
N processes {p1, p2, ..., pN}
Communication is reliable but processes may fail
At most f processes out of N may be faulty
Failure types: crash failure, Byzantine (arbitrary) failure
The system is logically fully connected
A receiver process knows the identity of the sender process
Limiting faults solely to the processes simplifies the solution to the agreement problems
Recently agreement problems have been studied under the failure of communication channels only & under the failure of both process & communication channels
Authenticated & Non-authenticated messages
To reach an agreement, processes have to exchange their values and relay the received values to other processes
Authenticated or signed message system – A (faulty) process cannot forge a message or change the contents of a received message (before it relays the message to others).
A process can verify the authenticity of a received message
Non-authenticated or unsigned or oral message – A (faulty) process can forge a message and claim to have received it from another process, or change the contents of a received message before it relays the message to others.
A process has no way of verifying the authenticity of a received message
Two Agreement Problems
Consensus problem: N processes agree on a value (e.g. synchronized action – go / abort)
Consensus may have to be reached in the presence of failure
Process failure – crash/fail-stop, arbitrary failure
Communication failure
Each process i starts in an “undecided” state
Every process i proposes a value vi from a set D while in the undecided state.
Process i exchanges messages until it makes decision di and moves to the “decided” state.
A consensus is reached if all correct processes agree on the same value di
Consensus Requirements
Termination: Eventually each correct process sets its decision value
This may not be possible in the presence of process crashes in an asynchronous system
Agreement: The decision value is same for all correct processes
Arbitrary (Byzantine) failures may cause inconsistency and prevent agreement
Integrity: if all correct processes propose the same value, any correct process decides that value
Consensus may involve a proposal stage and an agreement stage
Byzantine Generals Problem
Proposed and solved by Lamport
Consider a battleground. A number of generals at different positions want to reach an agreement on their attack plan, i.e., “attack” or “retreat”.
Generals are separated geographically and communicate through messengers. Some of the generals are “loyal” and some are “traitors”.
Upper bound on number of traitors
Pease et al. showed that it is impossible to reach a consensus if f exceeds (N-1)/3
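The bound can be checked with integer arithmetic: f must not exceed (N-1)/3, which is the same as N > 3f. A minimal sketch follows; the function names are illustrative and do not come from the slides.

```python
def max_tolerable_faults(n: int) -> int:
    """Largest f that does not exceed (n - 1) / 3, i.e. the largest f with n > 3f."""
    return (n - 1) // 3


def agreement_possible(n: int, f: int) -> bool:
    """True when the number of traitors f stays within the bound n > 3f."""
    return n > 3 * f
```

For example, three generals cannot tolerate a single traitor, while four can.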
“Byzantine generals” problem: a “commander” process i orders value v.
The “lieutenant” processes must agree on what the commander ordered.
Faulty processes may send wrong or contradictory messages
Integrity requirement: A distinguished process decides a value for others to agree upon
Solution only exists if N > 3f, where f : #faulty processes
Differs from consensus in that a distinguished process supplies a value that the others are to agree upon, instead of each of them proposing a value
Requirements
Termination: Eventually each process sets its decision variable
Agreement: The decision value of all correct processes is the same
Integrity: If the commander is correct, then all correct processes agree on the value the commander proposed
Note: integrity implies agreement when the commander is correct; but the commander need not be correct
IC: A Variant of Consensus
Interactive Consistency Problem
Every process proposes a single value.
The goal of the algorithm is for the correct processes to agree on a vector of values, one for each process – the “decision vector”
Ex – for each of a set of processes to obtain the same information about their respective states
Requirements
Termination: Eventually each process sets its decision variable
Agreement: The decision vector of all correct processes is the same
Integrity: If pi is correct, then all correct processes agree on vi as the ith component of their vectors
Relationship between C, BG & IC
Although it is common to consider the BG problem with arbitrary process failures, in fact each of the three problems – C, BG, & IC – is meaningful in the context of either arbitrary or crash failures
Each can be framed assuming either a synchronous or an asynchronous system
It is sometimes possible to derive a solution to one problem using a solution to another
Suppose that there exist solutions to C, BG & IC
Ci(v1, v2, … vN) returns the decision value of pi in a run of the solution to the consensus problem where v1, v2, … are the values that the processes proposed
BGi(j, v) returns the decision value of pi in a run of the solution to the BG problem, where pj, the commander proposed the value v
ICi(v1, v2, … vN)[ j ] returns the jth value in the decision vector of pi in a run of the solution to the IC problem, where v1, v2, … are the values that the processes proposed
It is possible to construct solutions out of the solutions to other problems
IC can be solved by using BG’s solution by running it N times, once with each process pj (j = 1, 2, … N) acting as the commander:
ICi(v1, v2, … vN)[ j ] = BGi(j, vj) (i, j = 1, 2, … N)
C can be solved by using IC’s solution by running IC to produce a vector of values at each process, then applying an appropriate function on the vector’s values to derive a single value:
Ci(v1, v2, … vN) = majority(ICi(v1, v2, … vN)[1], … ICi(v1, v2, … vN)[N] )
BG can be solved from C as follows:
The commander pj sends its proposed value v to itself and each of the remaining processes
All processes run C with the values v1, v2, … vN that they receive (pj may be faulty)
They derive BGi(j, v) = Ci(v1, v2, … vN) (i = 1, 2, … N)
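The first two constructions can be sketched as higher-order functions. This is only an illustration under stated assumptions: a BG solution is passed in as a callable bg(i, j, v), indices are 0-based, and failure_free_bg is a hypothetical stand-in that models a run with no faults.

```python
from collections import Counter
from typing import Callable, Hashable, List, Optional, Sequence

Value = Hashable
# bg(i, j, v): decision of process i when commander j proposes v (assumed given).
BG = Callable[[int, int, Value], Value]


def majority(values: Sequence[Value]) -> Optional[Value]:
    """Return the value held by a strict majority, or None if there is none."""
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else None


def ic_from_bg(bg: BG, i: int, proposals: Sequence[Value]) -> List[Value]:
    """Decision vector of process i: run BG once per commander j."""
    return [bg(i, j, v_j) for j, v_j in enumerate(proposals)]


def c_from_ic(bg: BG, i: int, proposals: Sequence[Value]) -> Optional[Value]:
    """Consensus decision of process i: majority over its IC decision vector."""
    return majority(ic_from_bg(bg, i, proposals))


def failure_free_bg(i: int, j: int, v: Value) -> Value:
    """Hypothetical BG solution for a run with no faults: everyone adopts v."""
    return v
```

With proposals such as ["attack", "attack", "retreat"], every process obtains the same decision vector and the same majority value.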
Consensus
Solving consensus is equivalent to solving reliable and totally ordered multicast
Given a solution to one, we can solve the other
Implementing consensus with RTO-multicast
Collect all processes into a group g
Each process pi performs RTO-multicast(g, vi)
Each process pi chooses di = mi, where mi is the first value that pi RTO-delivers
Termination property follows from the reliability of the multicast
The agreement and integrity properties follow from the reliability and total ordering of multicast delivery
Chandra & Toueg [1996] demonstrate how RTO-multicast can be derived from consensus
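The construction of consensus from RTO-multicast can be illustrated with a toy model. The TotalOrderGroup class below is an assumption made for this sketch: a single shared log stands in for the reliability and total-order guarantees, and it is not a real multicast implementation.

```python
from typing import Dict, Hashable, List, Tuple


class TotalOrderGroup:
    """Toy stand-in for reliable, totally ordered multicast.

    A single shared log fixes one delivery order for every member. This models
    the guarantees only; it is not a real RTO-multicast implementation.
    """

    def __init__(self, members: List[str]) -> None:
        self.members = members
        self.log: List[Tuple[str, Hashable]] = []

    def rto_multicast(self, sender: str, value: Hashable) -> None:
        # Appending to the shared log assigns the message its global position.
        self.log.append((sender, value))

    def delivered_to(self, member: str) -> List[Hashable]:
        # Reliability and total order: every member sees the same sequence.
        return [value for _, value in self.log]


def consensus_via_rto(proposals: Dict[str, Hashable]) -> Dict[str, Hashable]:
    """Each process multicasts its proposal and decides the first value delivered."""
    group = TotalOrderGroup(list(proposals))
    for process, value in proposals.items():
        group.rto_multicast(process, value)
    return {process: group.delivered_to(process)[0] for process in proposals}
```

Because every member delivers the same sequence, the first delivered value is the same everywhere, which gives agreement; it is also one of the proposed values.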
Consensus in a synchronous system
We discuss an algorithm that uses only a basic multicast protocol to solve consensus in a synchronous system
The algorithm assumes that up to f of the N processes exhibit crash failures
Communication Model
[Figure: five processes p1 to p5 connected as a complete graph]
• Complete graph (i.e. logically fully connected)
• Synchronous network
Multicast
Send a message to all processors in one round
[Figure: one of the processes p1 to p5 sends a to all the others]
At the end of round: everybody receives a
Two or more processes can multicast at the same round
[Figure: two processes multicast a and b in the same round; at the end of the round the other processes hold a,b]
Crash Failures
[Figure: a faulty processor among p1 to p5 starts to multicast a and crashes; only some of its messages arrive]
Some of the messages are lost, they are never received
Consensus
Start: everybody has an initial value
[Figure: five processes with initial values 0, 1, 2, 3, 4]
Finish: everybody must decide the same value
[Figure: all five processes decide 3]
Validity condition: if everybody starts with the same value, they must decide that value
[Figure: all five processes start with 1 and all finish with 1]
A simple algorithm using B-multicast
Each processor:
1. B-multicast value to all processors
2. Decide on the minimum
(only one round is needed)
Start: [Figure: five processes with initial values 0, 1, 2, 3, 4]
B-multicast values: [Figure: every process holds 0,1,2,3,4]
Decide on minimum: [Figure: every process decides 0]
Finish: [Figure: all five processes have decided 0]
This algorithm satisfies the validity condition:
if everybody starts with the same initial value, everybody decides on that value (minimum)
[Figure: all five processes start with 1 and all finish with 1]
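A failure-free run of the one-round algorithm can be sketched in a few lines. The process names and the dictionary representation are assumptions made for illustration only.

```python
from typing import Dict, List


def one_round_min(initial: Dict[str, int]) -> Dict[str, int]:
    """Failure-free run: each process B-multicasts its value, then takes the minimum."""
    # With no failures, every process receives every value (including its own).
    received: Dict[str, List[int]] = {p: list(initial.values()) for p in initial}
    return {p: min(values) for p, values in received.items()}
```

Starting from 0, 1, 2, 3, 4 every process decides 0, and starting from all 1s every process decides 1, which is the validity condition.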
Consensus with Crash Failures
The simple algorithm doesn’t work
Each processor:
1. B-multicast value to all processors
2. Decide on the minimum
Start: [Figure: five processes with initial values 0, 1, 2, 3, 4; the process with value 0 fails]
The failed processor doesn’t multicast its value to all processors
Multicasted values: [Figure: two of the surviving processes hold 0,1,2,3,4 and the other two hold 1,2,3,4]
Decide on minimum: [Figure: two surviving processes decide 0 and the other two decide 1]
Finish: No Consensus!!!
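The failing run can be reproduced with a small simulation. This is a sketch under assumptions: the process names, and the choice of which processes the crashed sender still reaches, are picked for illustration and are not taken from the slides.

```python
from typing import Dict, Set


def one_round_min_with_crash(
    initial: Dict[str, int], crashed: str, reached: Set[str]
) -> Dict[str, int]:
    """One-round minimum where `crashed` fails in the middle of its multicast.

    Only the processes in `reached` receive the crashed process's value.
    Returns the decisions of the surviving processes.
    """
    decisions = {}
    for p in initial:
        if p == crashed:
            continue  # a crashed process never decides
        # Values from all surviving processes, including p's own value.
        values = [v for q, v in initial.items() if q != crashed]
        if p in reached:
            values.append(initial[crashed])
        decisions[p] = min(values)
    return decisions
```

When the process holding 0 reaches only two of the four others, those two decide 0 and the rest decide 1, so agreement is violated.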
If an algorithm solves consensus for f failed processes we say it is an f-resilient consensus algorithm
Example: the input and output of a 3-resilient consensus algorithm
[Figure: five processes start with values 0, 1, 4, 3, 2; the two processes that finish both decide 1]
New validity condition: if all non-faulty processes start with the same value then all non-faulty processes decide that value
[Figure: five processes all start with 1; the two processes that finish both decide 1]
An f-resilient algorithm
Round 1: B-multicast my value
Round 2 to round f+1: Multicast any new received values
End of round f+1: Decide on the minimum value received
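The three steps can be simulated round by round. This is a sketch rather than a definitive implementation: the CrashPlan type, which maps a process to the round in which it crashes and the processes its last multicast still reaches, is an assumption introduced here to script failures.

```python
from typing import Dict, Set, Tuple

# crashes maps a process to (round in which it crashes, processes it still reaches).
CrashPlan = Dict[str, Tuple[int, Set[str]]]


def f_resilient_min(initial: Dict[str, int], f: int, crashes: CrashPlan) -> Dict[str, int]:
    """Simulate f+1 synchronous rounds of 'multicast new values, decide the minimum'."""
    known: Dict[str, Set[int]] = {p: {v} for p, v in initial.items()}
    fresh: Dict[str, Set[int]] = {p: {v} for p, v in initial.items()}
    alive = set(initial)

    for rnd in range(1, f + 2):  # rounds 1 .. f+1
        inbox: Dict[str, Set[int]] = {p: set() for p in initial}
        for sender in sorted(alive):
            crash_round, reached = crashes.get(sender, (None, set()))
            crashing = crash_round == rnd
            # A crashing sender reaches only part of the group in its last round.
            targets = reached if crashing else set(initial) - {sender}
            for target in targets:
                inbox[target] |= fresh[sender]
        # Processes that crashed this round stop participating.
        alive -= {p for p, (r, _) in crashes.items() if r == rnd}
        for p in alive:
            fresh[p] = inbox[p] - known[p]  # values first learned this round
            known[p] |= fresh[p]

    return {p: min(known[p]) for p in sorted(alive)}
```

With two scripted crashes that each pass the value 0 to a single process, running f+1 = 3 rounds yields agreement on 0, while stopping after 2 rounds leaves the survivors split between 0 and 1.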
Example: f=1 failures, f+1 = 2 rounds needed
Start: [Figure: five processes with initial values 0, 1, 2, 3, 4]
Round 1: B-multicast all values to everybody
[Figure: the process with value 0 fails after sending 0 to only some processes; two surviving processes hold 0,1,2,3,4 and two hold 1,2,3,4]
Round 2: B-multicast all new values to everybody
[Figure: every surviving process holds 0,1,2,3,4]
Finish: Decide on minimum value
[Figure: every surviving process decides 0]
Example: f=2 failures, f+1 = 3 rounds needed
An example execution with 2 failures
Start: [Figure: five processes with initial values 0, 1, 2, 3, 4]
Round 1: Multicast all values to everybody
[Figure: Failure 1: the process with value 0 fails after sending 0 to only one process; that process holds 0,1,2,3,4 and the other three hold 1,2,3,4]
Round 2: Multicast new values to everybody
[Figure: Failure 2: a second process fails; two processes hold 0,1,2,3,4 and two hold 1,2,3,4]
Round 3: Multicast new values to everybody
[Figure: every remaining process holds 0,1,2,3,4]
Finish: Decide on the minimum value
[Figure: the surviving processes decide 0]
Example: f=2 failures, f+1 = 3 rounds needed
Another example execution with 2 failures
Start: [Figure: five processes with initial values 0, 1, 2, 3, 4]
Round 1: Multicast all values to everybody
[Figure: Failure 1: the process with value 0 fails after sending 0 to only one process; that process holds 0,1,2,3,4 and the other three hold 1,2,3,4]
Round 2: Multicast new values to everybody
[Figure: no failure in this round; every surviving process holds 0,1,2,3,4]
Remark: At the end of this round all processes know about all the other values
Round 3: Multicast new values to everybody
[Figure: Failure 2 occurs; every process still holds 0,1,2,3,4]
(no new values are learned in this round)
Finish: Decide on minimum value
[Figure: the surviving processes decide 0]
If there are f failures and f+1 rounds then there is a round with no failed process
Example: 5 failures, 6 rounds
[Figure: timeline of rounds 1 to 6 with one round marked “No failure”]
In the algorithm, at the end of the round with no failure:
• Every (non-faulty) process knows about all the values of all other participating processes
• This knowledge doesn’t change until the end of the algorithm
Therefore, at the end of the round with no failure, everybody would decide the same value
However, we don’t know the exact position of this round, so we have to let the algorithm execute for f+1 rounds
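The counting argument behind the failure-free round can be made concrete: at most f crashes can occupy at most f of the f+1 rounds. A minimal sketch, with a hypothetical helper name and crash schedules invented for illustration:

```python
from typing import List, Optional


def first_clean_round(crash_rounds: List[int], f: int) -> Optional[int]:
    """Return the first of rounds 1..f+1 in which no process crashes.

    crash_rounds lists the round of each crash; at most f entries are expected.
    """
    if len(crash_rounds) > f:
        return None  # more than f failures: the guarantee no longer applies
    busy = set(crash_rounds)
    # At most f distinct rounds are busy, so one of the f+1 rounds is free.
    return next(r for r in range(1, f + 2) if r not in busy)
```

The algorithm itself cannot call such a function, since crashes are not known in advance; that is why it always runs all f+1 rounds.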
Validity of algorithm:
when all processes start with the same input value then the consensus is that value
This holds, since the value decided by each process is some input value
A Lower Bound
Theorem: Any f-resilient consensus algorithm requires at least f+1 rounds
Proof sketch:
Assume for contradiction that f or fewer rounds are enough
Worst case scenario: there is a process that fails in each round
Round 1: before process pi fails, it sends its value a to only one process pk
Round 2: before process pk fails, it sends value a to only one process pm
Rounds 3 … f: the same pattern repeats, so at the end of round f only one process pn knows a
Process pn may decide a, and all other processes may decide another value (b)
Therefore f rounds are not enough
At least f+1 rounds are needed
Consensus in synchronous systems
Up to f faulty processes
Duration of round: max. delay of B-multicast
Dolev & Strong, 1983: Any algorithm to reach consensus despite up to f failures requires (f + 1) rounds.
Byzantine agreement: synchronous
[Figure, left: p3 is the faulty process. p1 (Commander) sends 1:v to p2 and p3; p2 relays 2:1:v and p3 relays 3:1:u.]
[Figure, right: p1 (Commander) is the faulty process. p1 sends 1:w to p2 and 1:x to p3; p2 relays 2:1:w and p3 relays 3:1:x.]
Notation: “3:1:u” means 3 says 1 says ‘u’
Lamport et al, 1982: No solution for N = 3, f = 1
Nothing can be done to improve a correct process’ knowledge beyond the first stage:
- It cannot tell which process is faulty.
Pease et al, 1982: No solution for N <= 3*f (assuming private comm. channels)
Byzantine agreement for N > 3*f
Example with N=4, f=1:
- 1st round: Commander sends a value to each lieutenant
- 2nd round: Each of the lieutenants sends the value it has received to each of its peers.
- A lieutenant receives a total of (N – 2) + 1 values, of which (N – 2) are correct.
- By majority(), the correct lieutenants compute the same value.
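The two rounds for N = 4, f = 1 can be sketched as follows. The data layout is an assumption made for illustration: from_commander records what each lieutenant received in round 1, and lies scripts what a faulty lieutenant relays in round 2. None stands for "no majority".

```python
from collections import Counter
from typing import Dict, Hashable, List, Optional

Value = Hashable


def majority(values: List[Value]) -> Optional[Value]:
    """Strict-majority value, or None when no value has a majority."""
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else None


def lieutenant_decisions(
    from_commander: Dict[str, Value],
    lies: Dict[str, Dict[str, Value]],
) -> Dict[str, Optional[Value]]:
    """Two-round exchange for N = 4, f = 1.

    from_commander[l] is what the commander sent to lieutenant l (round 1).
    lies[s][r], if present, is what a faulty lieutenant s relays to r instead
    of the value it actually received (round 2).
    Returns the decision of every lieutenant not listed in `lies`.
    """
    decisions = {}
    for me in from_commander:
        if me in lies:
            continue  # faulty lieutenant: its decision is irrelevant
        received = [from_commander[me]]
        for peer, value in from_commander.items():
            if peer != me:
                received.append(lies.get(peer, {}).get(me, value))
        decisions[me] = majority(received)
    return decisions
```

With a correct commander and one lying lieutenant, the two correct lieutenants still agree on the commander's value; with a faulty commander that sends three different values, all lieutenants see the same three values and reach the same "no majority" outcome.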
[Figure: four Byzantine generals, p1 (Commander) and lieutenants p2, p3, p4, shown in two scenarios: a faulty lieutenant p3 and a faulty commander p1]
In general, O(N^(f+1)) messages
O(N^2) for signed messages
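Four Byzantine Generals: N = 4, f = 1 in a Synchronous DS
[Figure, left: lieutenant p3 is faulty. Commander p1 sends 1:v to p2, p3 and p4. p2 relays 2:1:v and p4 relays 4:1:v to their peers; p3 relays 3:1:u to p2 and 3:1:w to p4.]
p2 decides on majority(v, u, v) = v
p4 decides on majority(v, v, w) = v
[Figure, right: commander p1 is faulty. p1 sends 1:u to p2, 1:w to p3 and 1:v to p4. p2 relays 2:1:u, p3 relays 3:1:w and p4 relays 4:1:v to their peers.]
p2, p3, p4 decide on majority(u, v, w) = ⊥ (no majority value exists)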
Asynchronous system
Solutions to consensus and the BG problem (and to IC) exist in synchronous systems
No algorithm can guarantee to reach consensus in an asynchronous system, even with one process crash failure
In an asynchronous system, processes can respond to messages at arbitrary times – so a crashed process is indistinguishable from a slow one
There is always some continuation of the processes’ execution that avoids consensus being reached
Impossibility of (deterministic) consensus in asynchronous systems
M.J. Fischer, N. Lynch, and M. Paterson: “Impossibility of distributed consensus with one faulty process”, J. ACM, 32(2), pp. 374-382, 1985.
A crashed process cannot be distinguished from a slow one.
- Not even with a 100% reliable comm. network!
There is always a chance that some continuation of the processes’ execution avoids consensus being reached.
Impossibility of (deterministic) consensus in asynchronous systems (contd.)
Note the word “guarantee” in the statement of the impossibility result
The result does not mean that processes can never reach consensus in an asynchronous system if one is faulty – it allows that consensus can be reached with some probability greater than zero
For example, despite the fact that our systems are often effectively asynchronous, transaction systems have been reaching consensus regularly for many years