Upload
lynne
View
43
Download
0
Tags:
Embed Size (px)
DESCRIPTION
Byzantine Fault Tolerance. CS 425: Distributed Systems Fall 2011. Material drived from slides by I . Gupta and N .Vaidya. Reading List. L. Lamport , R. Shostak , M. Pease, “The Byzantine Generals Problem,” ACM ToPLaS 1982. - PowerPoint PPT Presentation
Citation preview
1
Byzantine Fault Tolerance
CS 425: Distributed SystemsFall 2011
Material drived from slides by I. Gupta and N.Vaidya
2
Reading List
• L. Lamport, R. Shostak, M. Pease, “The Byzantine Generals Problem,” ACM ToPLaS 1982.
• M. Castro and B. Liskov, “Practical Byzantine Fault Tolerance,” OSDI 1999.
Byzantine Generals Problem
A sender wants to send message to n-1 other peers
• Fault-free nodes must agree
• Sender fault-free agree on its message
• Up to f failures
Byzantine Generals Problem
A sender wants to send message to n-1 other peers
• Fault-free nodes must agree
• Sender fault-free agree on its message
• Up to f failures
S
321
v
vv
Byzantine Generals Algorithm
5
value v
Faulty peer
S
321
v
vv
6
value v
v v
Byzantine Generals Algorithm
S
321
v
vv
7
value v
v v
?
?
Byzantine Generals Algorithm
S
321
v
vv
8
value v
v
v v
?
v?
Byzantine Generals Algorithm
32
v
vv
9
value v
x
v v
?
v?[v,v,?]
[v,v,?]
S
1
Byzantine Generals Algorithm
32
v
vv
10
value v
x
v v
?
v?v
v
S
1Majorityvote resultsin correctresult atgood peers
Byzantine Generals Algorithm
S
321
v
wx
11
Faulty source
Byzantine Generals Algorithm
S
32
v
w
x
12
w w1
Byzantine Generals Algorithm
S
32
v
wx
13x
w w
v
xv
1
Byzantine Generals Algorithm
S
32
v
wx
14x
w w
v
xv
1[v,w,x]
[v,w,x]
[v,w,x]
Byzantine Generals Algorithm
S
32
v
wx
15x
w w
v
xv
1[v,w,x]
[v,w,x]
[v,w,x]
Vote resultidentical atgood peers
Byzantine Generals Algorithm
Known Results
• Need 3f + 1 nodes to tolerate f failures
• Need Ω(n2) messages in general
16
Ω(n2) Message Complexity
• Each message at least 1 bit
• Ω(n2) bits “communication complexity” to agree on just 1 bit value
17
18
Practical Byzantine Fault Tolerance
• Computer systems provide crucial services• Computer systems fail– Crash-stop failure– Crash-recovery failure– Byzantine failure
• Example: natural disaster, malicious attack, hardware failure, software bug, etc.
• Need highly available service Replicate to increase availability
19
Challenges
Request A Request B
Client Client
20
Requirements
• All replicas must handle same requestsdespite failure.
• Replicas must handle requests in identical order despite failure.
21
Challenges
2: Request B
1: Request A
Client Client
22
State Machine Replication
2: Request B
1: Request A
2: Request B
1: Request A
2: Request B
1: Request A
2: Request B
1: Request A
Client Client
How to assign sequence number to requests?
23
Primary Backup Mechanism
Client Client
2: Request B
1: Request A
What if the primary is faulty?Agreeing on sequence number
Agreeing on changing the primary (view change)
View 0
24
Normal Case Operation
• Three phase algorithm:– PRE-PREPARE picks order of requests– PREPARE ensures order within views– COMMIT ensures order across views
• Replicas remember messages in log• Messages are authenticated– .σk denotes a message sent by k
25
Pre-prepare Phase
Primary: Replica 0
Replica 1
Replica 2
Replica 3
Request: m
PRE-PREPARE, v, n, mσ0
Fail
26
Prepare PhaseRequest: m
PRE-PREPARE
Primary: Replica 0
Replica 1
Replica 2
Replica 3 Fail
Accepted PRE-PREPARE
27
Prepare PhaseRequest: m
PRE-PREPARE
Primary: Replica 0
Replica 1
Replica 2
Replica 3 Fail
PREPARE, v, n, D(m), 1σ1
Accepted PRE-PREPARE
28
Prepare PhaseRequest: m
PRE-PREPARE
Primary: Replica 0
Replica 1
Replica 2
Replica 3 Fail
PREPARE, v, n, D(m), 1σ1
Accepted PRE-PREPARE
Collect PRE-PREPARE + 2f matching PREPARE
29
Commit PhaseRequest: m
PRE-PREPARE
Primary: Replica 0
Replica 1
Replica 2
Replica 3 Fail
PREPARE
COMMIT, v, n, D(m)σ2
30
Commit Phase (2)Request: m
PRE-PREPARE
Primary: Replica 0
Replica 1
Replica 2
Replica 3 Fail
PREPARE COMMIT
Collect 2f+1 matching COMMIT: execute and reply
31
View Change
• Provide liveness when primary fails– Timeouts trigger view changes– Select new primary (= view number mod 3f+1)
• Brief protocol– Replicas send VIEW-CHANGE message along with
the requests they prepared so far– New primary collects 2f+1 VIEW-CHANGE messages– Constructs information about committed requests
in previous views
32
View Change Safety
• Goal: No two different committed request with same sequence number across views
Quorum for Committed Certificate (m, v, n)
At least one correct replica has Prepared Certificate (m, v, n)
View Change Quorum
33
Related WorksFault Tolerance
Fail Stop Fault Tolerance
Paxos1989 (TR)
VS ReplicationPODC 1988
Byzantine Fault Tolerance
Byzantine Agreement
RampartTPDS 1995
SecureRingHICSS 1998
PBFT OSDI ‘99
BASETOCS ‘03
Byzantine Quorums
Malkhi-ReiterJDC 1998
PhalanxSRDS 1998
FleetToKDI ‘00
Q/USOSP ‘05
Hybrid Quorum
HQ Replication OSDI ‘06