30
Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The Byzantine Generals Problem”, by Lamport, Shostak, Pease, In ACM Transactions on Programming Languages and Systems, July 1982 why we need agreement assumptions algorithm steps through examples Bigger Picture: How to handle malicious components Play variant of Mafia/Werewolf

Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

Embed Size (px)

Citation preview

Page 1: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

Agreement: Byzantine Generals

UNIVERSITY of WISCONSIN-MADISONComputer Sciences Department

CS 739Distributed Systems

Andrea C. Arpaci-Dusseau

Paper:

• “The Byzantine Generals Problem”, by Lamport, Shostak, Pease, In ACM Transactions on Programming Languages and Systems, July 1982

• why we need agreement • assumptions• algorithm steps through examples

Bigger Picture: How to handle malicious components

Play variant of Mafia/Werewolf

Page 2: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

Motivation

Build reliable systems in presence of faulty components

Common approach:• Send request (or input) to some “f-tolerant” server• Have multiple (potentially faulty) components compute same function

• Perform majority vote on outputs to get “right” result

C1

C2C3

majority(v1,v2,v3)

f faulty, f+1 good components ==> 2f+1 total

Page 3: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

What is a Byzantine Failure?

Three primary differences from Fail-Stop Failure1) Component can produce arbitrary output

• Fail-stop: produces correct output or none

2) Cannot always detect output is faulty• Fail-stop: can always detect that component has

stopped

3) Components may work together maliciously

With fail-stop failures: How many components are needed to be f-tolerant??

Page 4: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

Assumption for F-tolerant Servers

Good (non-faulty) components must use same input• Otherwise, can’t trust their output result either

For majority voting to work:1) All non-faulty processors must use same input2) If input is non-faulty, then all non-faulty

processes use the value it providesMust agree on value of input

C1

C2C3

A

B

Page 5: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

Byzantine Generals

Algorithm to achieve agreement among “loyal generals” (i.e., working components) given m “traitors” (i.e., faulty components)

Agreement such that:A) All loyal generals decide on same plan

(important even when input is faulty! Why?)B) Small number of traitors cannot cause loyal

generals to adopt “bad plan”

Terminology• Let v(i) be information communicated (I.e., input

observed) by ith general• Each general combines values v(1)...v(n) to form

plan

Page 6: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

Agreement Conditions

Rephrase agreement conditions:A) Loyal generals decide on same plan if

All loyal generals use same method for combining information (and see same inputs)

B) Small number of traitors can’t hurt loyal generals ifUse robust function for decision, such as majority function of values v(1)...v(n)

Page 7: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

Key Step: Agree on inputs

Generals communicate v(i) values to one another:1) Every loyal general must obtain same v(1)..v(n)1’) Any two loyal generals use same value of v(i)

– Traitor i will try to trick loyal generals into using different v(i)’s

2) If ith general is loyal, then the value he sends must be used by every other general as v(i)

How can each general send his value to n-1 others?

A commanding general must send an order (order: Use v(i) as my value) to his n-1 lieutenants such that:IC1) All loyal lieutenants obey same orderIC2) If commanding general is loyal, every loyal lieutenant

obeys the order he sends

Interactive Consistency conditions

Page 8: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

Impossibility Result

With only 3 generals, no solution can work with even 1 traitor (given oral messages)

commander

attack

retreatL1 L2

What should L1 do? Is commander or L2 the traitor???

Page 9: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

Option 1: Loyal Commander

commander

attack

retreatL1 L2

attack

What must L1 do?

By IC2: L1 must obey commander and attack

Page 10: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

Option 2: Loyal L2

commander

attack

retreatL1 L2

retreat

What must L1 do?

By IC1: L1 and L2 must obey same order --> L1 must retreat

Problem: L1 can’t distinguish between 2 scenarios

Page 11: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

General Impossibility Result

No solution with fewer than 3m+1 generals can cope with m traitors

< see paper for details >

Page 12: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

Oral Messages

AssumptionsA1) Every message sent is delivered correctly

– What if it is not?

A2) Receiver knows who sent message– What scenarios is this true for?

A3) Absence of message can be detected

– How can this be done?

Page 13: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

Oral Message Algorithm

OM(m), m>0• Commander sends his value to every lieutenant• For each i, let vi be value Lieutenant i receives from commander; act as commander for OM(m-1) and send vi to n-2 other lieutenants

• For each i and each j not i, let vj be value Lieut i received from Lieut j. Lieut i computes majority(v1,...,vn-1)

OM(0)• Commander sends his value to every lieutenant

Page 14: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

Example: Bad Lieutenant

Scenario: m=1, n=4, traitor = L3

C

L1 L3L2

AA

AOM(1):

OM(0):???

C

L1 L3L2A

AR

R

Decision??L1 = m (A, A, R); L2 = m (A, A, R); Both attack!

Page 15: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

Example: Bad Commander

Scenario: m=1, n=4, traitor = C

C

L1 L3L2

AR

AOM(1):

OM(0):???L1 L3L2A

RA

A

Decision??L1=m(A, R, A); L2=m(A, R, A); L3=m(A,R,A); Attack!

R

A

Page 16: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

Bigger Example: Bad Lieutenants

Scenario: m=2, n=3m+1=7, traitors=L5, L6

C

A A A

L2 L6L3 L5L4L1

AA

A

L2 L6L3 L5L4L1

A A AA RR

Decision???

Messages?

m(A,A,A,A,R,R) ==> All loyal lieutenants attack!

Page 17: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

Bigger Example: Bad Commander+

Scenario: m=2, n=7, traitors=C, L6

C

L2 L6L3 L5L4L1

R A RA

Ax

A,R,A,R,AA R RA A

Decision???

L2 L6L3 L5L4L1

Messages?

Page 18: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

Decision with Bad Commander+

L1: m(A,R,A,R,A,A) ==> AttackL2: m(A,R,A,R,A,R) ==> RetreatL3: m(A,R,A,R,A,A) ==> AttackL4: m(A,R,A,R,A,R) ==> RetreatL5: m(A,R,A,R,A,A) ==> AttackProblem: All loyal lieutenants do NOT choose same action

Page 19: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

Next Step of Algorithm

Verify that lieutenants tell each other the same thing• Requires rounds = m+1• OM(0): Msg from Lieut i of form: “L0 said v0, L1 said v1, etc...”

What messages does L1 receive in this example?• OM(2): A• OM(1): 2R, 3A, 4R, 5A, 6A (doesn’t know 6 is traitor)• OM(0): 2{ 3A, 4R, 5A, 6R}• 3{2R, 4R, 5A, 6A}• 4{2R, 3A, 5A, 6R}• 5{2R, 3A, 4R, 6A}• 6{ total confusion }

All see same messages in OM(0) from L1,2,3,4, and 5m(A,R,A,R,A,-) ==> All attack

Page 20: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

Signed Messages

Problem: Traitors can lie about what others said; how can we remove that ability?

New assumption: Signed messages (Cryptography)A4) a. Loyal general’s signature cannot be forged and contents cannot be altered

b. Anyone can verify authenticity of signature

Simplifies problem:• When lieutenant i passes on signed message from j, receiver knows that i did not lie about what j said

• Lieutenants cannot do any harm alone (cannot forge loyal general’s orders)

• Only have to check for traitor commander

With cryptographic primitives, can implement Byzantine Agreement with m+2 nodes, using SM(m)

Page 21: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

Signed Messages Algorithm: SM(m)

1. Commander signs v and sends to all as (v:0)2. Each lieut i:

A) If receive (v:0) and no other order1) Vi = v2) send (V:0:i) to all

B) If receive (v:0:j:...:k) and v not in Vi1) Add v to Vi2) if (k<m) send (v:0:j:...:k:i)

to all not in j...k3. When no more msgs, obey order of choice(Vi)

Page 22: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

SM(1) Example: Bad Commander

Scenario: m=1, n=m+2=3, bad commander

C

L1 L2

A:0 R:0

What next?

L1 L2

A:0:L1

R:0:L2

V1={A,R} V2={R,A}Both L1 and L2 can trust orders are from CBoth apply same decision to {A,R}

Page 23: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

SM(2): Bad Commander+

Scenario: m=2, n=m+2=4, bad commander and L3

C

L1 L3L2

A:0A:0

xGoal? L1 and L2 must make same decision

L1 L3L2A:0:L1

A:0:L2A:0:L3

R:0:L3

L1 L2

R:0:L3:L1

V1 = V2 = {A,R} ==> Same decision

Page 24: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

Other Variations

How to handle missing communication paths

< see paper for details>

Page 25: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

AssumptionsA1) Every message sent by nonfaulty processor is delivered correctly• Network failure ==> processor failure• Handle as less connectivity in graph

A2) Processor can determine sender of message• Communication is over fixed, dedicated lines• Switched network???

A3) Absence of message can be detected• Fixed max time to send message + synchronized clocks ==> If msg not received in fixed time, use default

A4) Processors sign msgs such that nonfaulty signatures cannot be forged• Use randomizing function or cryptography to make liklihood of forgery very small

Page 26: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

Importance of Assumptions

“Separating Agreement from Execution for Byzantine Fault Tolerant Services” - SOSP’03

Goal: Reduce replication costs• 3f+1 agreement replicas• 2g+1 execution replicas

– Costly part to replicate– Often uses different software versions– Potentially long running time

To provide A2, protocol assumes cryptographic primitives, such that one can be sure “i said v” in switched environment

What is the problem??

Page 27: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

Conclusions

Problem: To implement a fault-tolerant service with coordinated replicas, must agree on inputs

Byzantine failures make agreement challenging• Produce arbitrary output, can’t detect, collude

User different agreement protocol depending on assumptions• Oral messages: Need 3f+1 nodes to tolerate f failures– Difficult because traitors can lie about what others said

• Signed messages: Need f+2 nodes– Easier because traitors can only lie about other traitors

Page 28: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

Byzantine Werewolves

Werewolf/Mafia: Psychological party game• Two groups: Werewolves and villagers• Multiple rounds of night and day

– Night: Werewolves kill a villager– Day: All vote on who to kill…hopefully a werewolf

• Werewolves trick villages into bad decisions (killing one of their own)• Werewolves lie, act in collision

• Werewolves have more information than villagers (I.e., who is a werewolf)• Traditional: Only 1 village “seer” has any info• My variant: Every villager can ask one question per round

• Traditional: More psychological…

Page 29: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

Byzantine-Werewolf Game Rules

Everyone secretly assigned as werewolf or villager• 3 werewolves, rest are “seeing” villagers• I am moderator

Night round:• “Close your eyes”; make noises to hide activity• “Werewolves, open your eyes”: 3 can see who is who

– “Werewolves, pick someone to kill” (Not first round)– Silently agree on villager to kill by pointing– “Werewolves, close your eyes”

• For all: “NAME, open your eyes” “Pick someone to ask about”– Useless for Werewolves, but hides their identity…– Point to another player– Moderator signs thumbs up for werewolf, down for villager– “NAME, close your eyes”

Page 30: Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The

Rules: Day TimeDay Time: “Everyone open your eyes; its daytime”

• “NAME, you have been killed by the werewolves!”– They are now out of the game

• Agreement time: Everyone talks and votes on who should be “decommissioned”– Villagers try to decommission werewolf– Werewolves try to trick villagers with bad info– Pairwise communication

• Variant: Signed messages; leave a note with everyone telling them what you know; can show this note to others

– When you’ve made up your mind, tell moderator– Moderator: Uses majority voting to determine who is decommissioned “Okay, NAME is dead”

– Person is out of game (can’t talk anymore) and shows card

Repeat cycle until All werewolves dead OR werewolves >= villagers