24
CPSC 668 Set 9: Fault Tolerant Consensus 1 CPSC 668 Distributed Algorithms and Systems Fall 2006 Prof. Jennifer Welch

CPSC 668Set 9: Fault Tolerant Consensus1 CPSC 668 Distributed Algorithms and Systems Fall 2006 Prof. Jennifer Welch

  • View
    214

  • Download
    0

Embed Size (px)

Citation preview

CPSC 668 Set 9: Fault Tolerant Consensus 1

CPSC 668Distributed Algorithms and Systems

Fall 2006

Prof. Jennifer Welch

CPSC 668 Set 9: Fault Tolerant Consensus 2

Processor Failures in Message Passing• Crash: at some point the processor

stops taking steps– at the processor's final step, it might

succeed in sending only a subset of the messages it is supposed to send

• Byzantine: processor changes state arbitrarily and sends messages with arbitrary content

CPSC 668 Set 9: Fault Tolerant Consensus 3

Consensus Problem

• Every processor has an input.

• Termination: Eventually every nonfaulty processor must decide on a value.

• Agreement: All decisions by nonfaulty processors must be the same.

• Validity: If all inputs are the same, then the decision of a nonfaulty processor must equal the common input.

CPSC 668 Set 9: Fault Tolerant Consensus 4

Overview of Consensus Results• Let f be the maximum number of faulty

processors.

• Tight bounds for message passing:

crash failures Byzantine failures

number of rounds

f + 1 f + 1

total number of processors

f + 1 3f + 1

message size polynomial polynomial

CPSC 668 Set 9: Fault Tolerant Consensus 5

Overview of Consensus Results

• Impossible in asynchronous case.

• Even if we only want to tolerate a single crash failure.

• True both for message passing and shared read-write memory.

CPSC 668 Set 9: Fault Tolerant Consensus 6

Modeling Processor Failures• Modify failure-free definitions of admissible

execution to accommodate crash failures:• All but a set of at most f processors (the faulty

ones) taken an infinite number of steps.– In synchronous case: once a faulty processor fails

to take a step in a round, it takes no more steps.

• In a faulty processor's last step, an arbitrary subset of the processor's outgoing messages make it into the channels.

CPSC 668 Set 9: Fault Tolerant Consensus 7

Modeling Processor Failures

• Modify failure-free definitions of admissible execution to accommodate Byzantine failures:

• A set of at most f processors (the faulty ones) can send messages with arbitrary content and change state arbitrarily (i.e., not according to their transition functions).

CPSC 668 Set 9: Fault Tolerant Consensus 8

Consensus Algorithm for Crash FailuresCode for each processor:

v := my inputat each round 1 through f+1: if I have not yet sent v then send v to all wait to receive messages for this round v := minimum among all received values and

current value of v if this is round f+1 then decide on v

CPSC 668 Set 9: Fault Tolerant Consensus 9

Correctness of Crash Consensus Algorithm

Termination: By the code, finish in round f+1.

Validity: Holds since processors do not introduce spurious messages: if all inputs are the same, then that is the only value ever in circulation.

CPSC 668 Set 9: Fault Tolerant Consensus 10

Correctness of Crash Consensus AlgorithmAgreement:

• Suppose in contradiction pj decides on a smaller value, x, than does pi.

• Then x was hidden from pi by a chain of faulty processors:

• There are f + 1 faulty processors in this chain, a contradiction.

q1 q2 qf qf+1 pj

pi

round1

round2

roundf

roundf+1

CPSC 668 Set 9: Fault Tolerant Consensus 11

Performance of Crash Consensus Algorithm

• Number of processors n > f

• f + 1 rounds

• n2 •|V| messages, each of size log|V| bits, where V is the input set.

CPSC 668 Set 9: Fault Tolerant Consensus 12

Lower Bound on Rounds

Assumptions:

• n > f + 1

• every processor is supposed to send a message to every other processor in every round

• Input set is {0,1}

CPSC 668 Set 9: Fault Tolerant Consensus 13

Failure-Sparse Executions

• Bad behavior for the crash algorithm was when there was one crash per round.

• This is bad in general.

• A failure-sparse execution has at most one crash per round.

• We will deal exclusively with failure-sparse executions in this proof.

CPSC 668 Set 9: Fault Tolerant Consensus 14

Valence of a Configuration

• The valence of a configuration C is the set of all values decided by a nonfaulty processor in some configuration reachable from C by an admissible (failure-sparse) execution.

• Bivalent: set contains 0 and 1.

• Univalent: set contains only one value– 0-valent or 1-valent

CPSC 668 Set 9: Fault Tolerant Consensus 15

Valence of a Configuration

C

E F GD

0 0 0 0 0 1 0 1 1 1 1 1 1 0 0 1 <= decisions

0/1 : bivalent1 : 1-valent0 : 0-valent

0/1

0 0/1 1 0/1

CPSC 668 Set 9: Fault Tolerant Consensus 16

Statement of Round Lower Bound

Theorem (5.3): Any crash-resilient consensus algorithm requires at least f + 1 rounds in the worst case.

Proof Strategy:

show bivalentinitialconfig.

…round

1round

2roundf - 2

roundf - 1

show we can keep things bivalentthrough round f - 1

roundf

show we cankeep a n.f.proc. fromdeciding inround f

CPSC 668 Set 9: Fault Tolerant Consensus 17

Existence of Bivalent Initial Config.

• Suppose in contradiction all initial configurations are univalent.

inputs valency

000…00 0

000…01 ?

000…11 ?

001…11 ?

011…11 ?

111…11 1

by validity condition

CPSC 668 Set 9: Fault Tolerant Consensus 18

Existence of Bivalent Initial Config.

• Let – I0 be a 0-valent initial config– I1 be a 1-valent initial config– s.t. they differ only in pi 's input

I0

pi fails initially, no other failures.By termination, eventually rest decide.

all but pi

decide 0

I1

This execution looks the same as the oneabove to all the processors except pi.

all but pi

decide 0

CPSC 668 Set 9: Fault Tolerant Consensus 19

Keeping Things Bivalent

• Let ' be a (failure-sparse) k-1 round execution ending in a bivalent config.– for k - 1 < f - 1

• Show there is a one-round (f-s) extension of ' ending in a bivalent config.– so has k < f rounds

• Suppose in contradiction every one-round (f-s) extension of ' is univalent.

CPSC 668 Set 9: Fault Tolerant Consensus 20

Keeping Things Bivalent

'

failure-free round k1-val

pi crashes0-val

pi fails to send to

pi fails to send to q1,…,qm

pi fails to send to q1,…,qj+1

pi fails to send to q1,…,qj

rounds 1 to k-1

1-val

0-val

bi-val

……

CPSC 668 Set 9: Fault Tolerant Consensus 21

Keeping Things Bivalent

'

1-val

0-val

pi fails to sendto q1,…,qj

pi fails to sendto q1,…,qj+1

rounds 1 to k-1

round k

n.f.decide1

n.f.decide1

qj+1 fails inrd. k+1; no other failures

CPSC 668 Set 9: Fault Tolerant Consensus 22

Cannot Decide in Round f

• We've shown there is an f - 1 round (failure-sparse) execution, call it , ending in a bivalent configuration.

• Extending this execution to f rounds might not preserve bivalence.

• However, we can keep a processor from explicitly deciding in round f, thus requiring at least one more round (f+1).

CPSC 668 Set 9: Fault Tolerant Consensus 23

Cannot Decide in Round f

Case 1: There is a 1-round (f-s) extension of ending in a bivalent config. Then we are done.

Case 2: All 1-round (f-s) extensions of end in univalent configs.

CPSC 668 Set 9: Fault Tolerant Consensus 24

Cannot Decide in Round f

0-val

pi fails to send to nf pj

rounds 1 to f-1

1-valround f

failure free

bi-val.

pi fails to send to nf pj ,sends to another nf pk

pi might send to pk

pi sends to pj

and pklook same to pk

look same to pj

pk either undecidedor decided 1

pj either undecidedor decided 0