Leader Election
Leader Election: the idea
We study Leader Election in rings
Why rings?
• historical reasons
  – original motivation: regenerate a lost token in token ring networks
• illustrates techniques and principles
• good for lower bounds and impossibility results
Outline
• Specification of Leader Election
• YAIR
• Leader election in asynchronous rings:
  • An $O(n^2)$ algorithm
  • An $O(n \log n)$ algorithm
• The revenge of the lower bound!
• Leader election in synchronous rings
  • Breaking the $\Omega(n \log n)$ barrier
Message passing: Model
• n processors p0,…pn-1
• connected by bi-directional communication channels
• topology represented by undirected graph
[Figure: example topology with processors p0, p1, p2, p3, p4; some links may be missing]
Processors
Each pi is a state machine
• state set Qi
• distinguished initial states
• could be infinite
pi’s state includes
• outbufi[l]: set of messages sent on l-th channel and not yet delivered
• inbufi[l]: set of messages delivered on l-th channel and not yet processed
• inbufi initially empty
• outbufi not accessible
State Transitions
A state transition:
• input: accessible state of pi (doesn’t depend on outbufi)
• consumes all messages in inbufi
• outputs at most a message per channel
Terminology
Definition: A configuration is a vector C = (q0,…,qn-1)
• each qi is a state of pi
• the contents of the outbufi are the messages in transit
In an initial configuration each qi is an initial state of pi
Definition: An event is
• a computation event comp(i)
• a delivery event del(i,j,m)
Definition: An execution is an infinite sequence C0, φ0, C1, φ1, … where
• C0 is an initial configuration
• each Ci is a configuration
• each φi is an event
Definition: A schedule for the above execution is the sequence of events φ0, φ1, …
Safety and Liveness
Safety property: “nothing bad happens”
• holds in every finite execution prefix
  – Windows™ never crashes
  – if one general attacks, both do
  – a program never terminates with a wrong answer
Liveness property: “something good eventually happens”
• no partial execution is irremediable
  – Windows™ always reboots
  – both generals eventually attack
  – a program eventually terminates
Admissible executions satisfy safety and liveness properties for a particular system type.
A really cool theorem
Every property is a combination of a safety property and a liveness property
(Alpern and Schneider)
Asynchronous Message-Passing Systems
if φk = del(i,j,m)
• in Ck-1
  – m is in outbufi[l], where l is pi’s label for channel {pi, pj}
• in Ck
  – remove m from outbufi[l]
  – add m to inbufj[h], where h is pj’s label for channel {pi, pj}
if φk = comp(i)
• pi changes state according to its transition function
• empties inbufi in Ck-1
• might add messages to outbufi in Ck
C0, φ0, C1, φ1, C2, …
Admissible if:
• Every processor takes an infinite number of computation steps
• Every message sent is eventually delivered
Synchronous Message-Passing Systems
C0, φ0, C1, φ1, C2, …
• all asynchronous constraints, plus
• execution partitioned into disjoint rounds
• one delivery event for every message in every outbuf
• followed by one computation event for every processor
Remarks
• not realistic, but
• good for algorithm design
• good for lower bounds
Complexity
TIME
• each processor’s state set includes terminated states
• termination:
  – all processors in terminated states
  – no messages in transit
Synchronous: count number of rounds until termination
Asynchronous: set unit of time as maximum message delay
SPACE
• Count maximum total number of messages
The Problem
• Final states of processes are partitioned into two classes: elected and non-elected
• In every admissible execution, exactly one process (the leader) enters an elected state. All remaining processes enter a non-elected state
• Once a process enters an elected or non-elected state, it stays in that state
Lots of variations...
• The ring can be unidirectional or bidirectional
• The number n of processors may be known or unknown
• Processors can be identical or can be somehow distinguished
• Communication may be synchronous or asynchronous
Uni- vs. Bidirectional
In unidirectional rings, messages can only be sent in a clockwise direction
Can processors be distinguished?
If no, anonymous algorithms
• Processors have no UID
• Formally: identical automata
• Can distinguish between left and right.
Can processors be distinguished?
If yes:
• processors have unique IDs
• chosen from some large totally ordered space of ids (e.g. N+)
• no constraint on which IDs are used (e.g. integers may not be consecutive)
• IDs can be either manipulated only by certain operations (e.g. comparison)
• or by unrestricted operations
Is n known?
If no, uniform algorithms
• Algorithm cannot use information about ring size
Communication: Asynchronous vs. Synchronous
Asynchronous:
• no upper bound on message delivery time
• no centralized clock
• no bound on relative speed of processes
Synchronous:
• communication in rounds
• In a round a process:
  – delivers all pending messages
  – takes an execution step (which may involve sending one or more messages)
If there are no failures, every message sent is eventually delivered
An Impossibility Result
Theorem: There is no deterministic solution to the leader election problem for a synchronous, non-uniform, anonymous bidirectional ring.
Proof: Suppose that a solution exists for a system A of n > 1 processes.
Each process of A starts in the same state
Lemma: The states of all processors at the end of each round of the execution of A are the same.
Proof: By induction on the number of rounds k
• Base case: k = 0
  Easy, since processes start in the same state.
• Inductive step: assume the Lemma holds for k = t-1
  – processors are in identical states at the end of round t-1
  – so they send the same messages to their left and right neighbors
  – every processor receives the same messages on its left and right channels as every other processor
  – all processors apply the same transition function to identical states in round t
  – all processors have identical states at the end of round t
Then, if one enters leader state, all do!
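The symmetry argument can be checked on a toy example. The sketch below is only an illustration: `run_anonymous_ring` and `example_step` are hypothetical names, and the transition function is an arbitrary deterministic one chosen for the demo. Any other deterministic function passed as `step` should give the same outcome, since all processors start in the same state.

```python
from typing import Callable, List, Optional, Tuple

# step(state, msg_from_left, msg_from_right) -> (new_state, msg_to_left, msg_to_right)
Step = Callable[[int, Optional[int], Optional[int]], Tuple[int, int, int]]

def run_anonymous_ring(n: int, rounds: int, init: int, step: Step) -> List[List[int]]:
    """Run n identical processors on a synchronous bidirectional ring.

    Returns one state vector per round (index 0 is the initial configuration).
    """
    states = [init] * n
    from_left: List[Optional[int]] = [None] * n
    from_right: List[Optional[int]] = [None] * n
    history = [list(states)]
    for _ in range(rounds):
        results = [step(states[i], from_left[i], from_right[i]) for i in range(n)]
        states = [r[0] for r in results]
        # What p_(i+1) sends to its left arrives at p_i from the right, and vice versa.
        from_right = [results[(i + 1) % n][1] for i in range(n)]
        from_left = [results[(i - 1) % n][2] for i in range(n)]
        history.append(list(states))
    return history

def example_step(state, from_left, from_right):
    """An arbitrary deterministic transition function (assumption for the demo)."""
    left = 0 if from_left is None else from_left
    right = 0 if from_right is None else from_right
    new_state = (31 * state + 7 * left + 3 * right + 1) % 1009
    return new_state, new_state + 1, new_state + 2

history = run_anonymous_ring(n=6, rounds=10, init=0, step=example_step)
# In every round, all processors hold the same state.
symmetric = all(len(set(states)) == 1 for states in history)
```

Because the states never diverge, no state can be reached by exactly one processor, which is the content of the lemma.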
Observations
• What are the implications for asynchronous rings?
• What are the implications for uniform rings?
Outline
• Specification of Leader Election
• YAIR
• Leader election in asynchronous rings:
  • An $O(n^2)$ algorithm
  • An $O(n \log n)$ algorithm
• The revenge of the lower bound!
• Leader election in synchronous rings
  • Breaking the $\Omega(n \log n)$ barrier
The LCR Algorithm
LeLann (1977), Chang and Roberts (1979)
• unidirectional
• asynchronous
• non anonymous: every process has uid
• uniform (does not depend on n)
upon receiving no message
    send uidi to left (clockwise)
upon receiving m from right
    case
        m.uid > uidi :
            send m to left
        m.uid < uidi :
            discard m
        m.uid = uidi :
            leader := i
            send <terminate, i> to left
            terminate
    endcase
upon receiving <terminate, j> from right neighbor
    leader := j
    send <terminate, j> to left
    terminate
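A minimal simulation sketch of LCR follows. The names `lcr`, `inbox`, and `send` are hypothetical, FIFO channels are assumed, and the round-robin loop is just one admissible asynchronous schedule among many. Processor p_i's clockwise neighbor is taken to be p_(i+1 mod n), which is an assumption about orientation.

```python
from collections import deque

def lcr(uids):
    """Simulate LCR on a unidirectional ring; uids[i] is the UID of p_i.

    Returns (leader, sent): leader[i] is the index p_i believes is the leader,
    and sent is the total number of messages sent.
    """
    n = len(uids)
    inbox = [deque() for _ in range(n)]  # inbox[i]: messages waiting at p_i (FIFO)
    leader = [None] * n
    done = [False] * n
    sent = 0

    def send(i, msg):
        nonlocal sent
        inbox[(i + 1) % n].append(msg)   # p_i sends to its clockwise neighbor
        sent += 1

    for i in range(n):                   # every processor starts by sending its UID
        send(i, ("uid", uids[i]))

    while any(inbox):
        for i in range(n):
            if not inbox[i]:
                continue
            kind, value = inbox[i].popleft()
            if done[i]:
                continue                 # a terminated processor ignores messages
            if kind == "terminate":
                leader[i] = value
                send(i, ("terminate", value))
                done[i] = True
            elif value > uids[i]:
                send(i, ("uid", value))  # forward larger UIDs
            elif value == uids[i]:
                leader[i] = i            # own UID came back: elected
                send(i, ("terminate", i))
                done[i] = True
            # value < uids[i]: discard
    return leader, sent

worst_leader, worst_sent = lcr([4, 3, 2, 1, 0])  # UIDs decrease along the ring
best_leader, best_sent = lcr([0, 1, 2, 3, 4])    # UIDs increase along the ring
```

With UIDs decreasing in the direction of travel, the token with UID k is forwarded k + 1 times, so the count grows quadratically. With UIDs increasing, all tokens but the largest are discarded after one hop.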
Correctness
• messages from process with highest ID are never discarded
• therefore the correct leader is elected
• no other processor ID can traverse the entire ring
• therefore no one else is elected
Complexity
Message complexity: $O(n^2)$
Time complexity: $O(n)$
Can we do better?
This bound is tight…
[Figure: ring with processors labeled 0, 1, 2, …, n-2, n-1]
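One way to see that the quadratic bound is reached, assuming the UIDs 0, …, n-1 are arranged so that they decrease in the direction of message travel (an assumption about what the figure depicts): the token carrying UID i is forwarded until it meets a larger UID, which takes i + 1 hops, and the largest UID goes all the way around.

```latex
\[
\sum_{i=0}^{n-1} (i+1) \;=\; \frac{n(n+1)}{2} \;=\; \Theta(n^2)
\]
```

The n terminate messages add only a linear term, so the total stays $\Theta(n^2)$.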
The HS algorithm
Hirschberg and Sinclair (1980)
• Ring is bidirectional
• Each process pi operates in phases
• In each phase l, pi sends out “tokens” containing uidi in both directions
• Tokens are intended to travel distance $2^l$ and return to pi
[Figure: token paths in phases 0, 1, and 2]
• However, tokens may not make it back
• A token continues outbound only if its UID is greater than the UIDs of the processes on its path
• Otherwise it is discarded
• All processes always forward tokens moving inbound
If pi receives its own token while it is going outbound, pi is the leader
The Protocol
Init: asleep := true
upon receiving no message
    if asleep then
        asleep := false
        send <uidi,out,1> to left and right
upon receiving <uidj,out,h> from left
    case
        uidj > uidi and h > 1 :
            send <uidj,out,h-1> to right
        uidj > uidi and h = 1 :
            send <uidj,in,1> to left
        uidj = uidi :
            leader := i
    endcase
upon receiving <uidj,out,h> from right
    case
        uidj > uidi and h > 1 :
            send <uidj,out,h-1> to left
        uidj > uidi and h = 1 :
            send <uidj,in,1> to right
        uidj = uidi :
            leader := i
    endcase
upon receiving <uidj,in,1> from right
    send <uidj,in,1> to left
upon receiving <uidj,in,1> from left
    send <uidj,in,1> to right
upon receiving <uidi,in,1> from left and right
    phase := phase + 1
    send <uidi,out,2^phase> to left and right
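The protocol can be exercised with a small simulation. This is a sketch under assumptions: `hs`, `queue`, and `returned` are hypothetical names, a single global FIFO queue stands in for one possible asynchronous schedule, and the simulation stops as soon as some processor sets `leader` (the protocol as written does not announce the result).

```python
from collections import deque

def hs(uids):
    """Simulate the HS protocol on a bidirectional ring; uids[i] is the UID of p_i.

    Returns (leader_index, messages_sent).
    """
    n = len(uids)
    phase = [0] * n
    returned = [0] * n      # own inbound tokens received in the current phase
    leader = None
    sent = 0
    queue = deque()         # entries: (destination, side it arrives from, message)

    def send(i, side, msg):
        nonlocal sent
        sent += 1
        if side == "left":  # arrives at the left neighbor, coming from its right
            queue.append(((i - 1) % n, "right", msg))
        else:
            queue.append(((i + 1) % n, "left", msg))

    for i in range(n):      # every processor wakes up and starts phase 0
        send(i, "left", (uids[i], "out", 1))
        send(i, "right", (uids[i], "out", 1))

    while queue and leader is None:
        i, came_from, (u, direction, h) = queue.popleft()
        onward = "left" if came_from == "right" else "right"
        if direction == "out":
            if u > uids[i] and h > 1:
                send(i, onward, (u, "out", h - 1))   # keep going outbound
            elif u > uids[i]:
                send(i, came_from, (u, "in", 1))     # h = 1: turn around
            elif u == uids[i]:
                leader = i                           # own token, still outbound
            # u < uids[i]: token is discarded
        elif u != uids[i]:
            send(i, onward, (u, "in", 1))            # relay inbound tokens
        else:
            returned[i] += 1
            if returned[i] == 2:                     # both tokens are back
                returned[i] = 0
                phase[i] += 1
                send(i, "left", (u, "out", 2 ** phase[i]))
                send(i, "right", (u, "out", 2 ** phase[i]))
    return leader, sent

hs_leader, hs_sent = hs([3, 7, 1, 9, 4, 6, 2, 8])
```

For this ring of 8 processors the processor holding UID 9 (index 3) is elected, and the message count can be compared against the bound 8n(1 + ⌈log n⌉) = 256.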
Correctness
Same as LCR:
• messages from process with highest ID are never discarded
• therefore the correct leader is elected
• no other processor ID can traverse the entire ring
• therefore no one else is elected
Communication Complexity
• Every processor sends a token in phase 0: 4n messages
• For phase l > 0,
  – the only processors to send tokens are those who “won” in phase l-1
  – There is at most one winner for every $2^{l-1}+1$ processors
  – Winners in phase l > 0: $\lfloor n / (2^{l-1}+1) \rfloor$
  – Tokens travel distance $2^l$
  – Total number of messages sent in phase l is bounded by $4 \cdot 2^l \cdot \lfloor n / (2^{l-1}+1) \rfloor \le 8n$
• Total number of phases: $1 + \lceil \log n \rceil$
• No. of messages bounded by $8n(1 + \lceil \log n \rceil)$, which is $O(n \log n)$
Time Complexity
• Time for each phase l: $2 \cdot 2^l = 2^{l+1}$
• Final phase takes n (tokens only traveling outbound)
• Next to last phase is $l = \lceil \log n \rceil - 1$
• Total time complexity excluding last phase: $2 \cdot 2^{\lceil \log n \rceil}$
Time complexity is at most 3n (when n is a power of 2) to 5n (otherwise)
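The totals follow from summing the per-phase times over all phases before the last one, then adding the n time units of the final phase. A short derivation, using only the quantities listed on this slide:

```latex
\[
\sum_{l=0}^{\lceil \log n \rceil - 1} 2^{l+1}
  \;=\; 2^{\lceil \log n \rceil + 1} - 2
  \;<\; 2 \cdot 2^{\lceil \log n \rceil}
\]
\[
n \text{ a power of } 2:\quad 2 \cdot 2^{\lceil \log n \rceil} + n = 2n + n = 3n
\qquad\qquad
\text{otherwise}:\quad 2^{\lceil \log n \rceil} < 2n \;\Rightarrow\; 2 \cdot 2^{\lceil \log n \rceil} + n < 5n
\]
```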
The revenge of the lower bound
So far we have seen:
• a simple $O(n^2)$ algorithm
• a more clever $O(n \log n)$ algorithm
• focus on message complexity
Facts:
• $\Omega(n \log n)$ lower bound in asynchronous networks
• $\Omega(n \log n)$ lower bound in synchronous networks when using only comparisons
Outline
• Specification of Leader Election
• YAIR
• Leader election in asynchronous rings:
  • An $O(n^2)$ algorithm
  • An $O(n \log n)$ algorithm
• The revenge of the lower bound!
• Leader election in synchronous rings
  • Breaking the $\Omega(n \log n)$ barrier
• The rise and fall of randomization
Leader Election with fewer than O(n log n) messages
• Synchronous rings
• UIDs are positive integers
• Can be manipulated using arbitrary arithmetic operations
TimeSlice
• n is known to all processors
• unidirectional communication
• O(n) messages
VariableSpeeds
• n is not known to all processors
• unidirectional communication
• O(n) messages
What about Time complexity?
What is special about synchronous rings?
• Can convey information by not sending a message
“when your phone doesn’t ring, it’s me”
TimeSlice
Runs in phases
• each phase consists of n rounds
• in phase i ≥ 0
  – if no one has been elected yet
  – the processor with id i
  – declares itself the leader
  – sends a token with its UID around the ring
Message complexity: n
Time complexity: n · UIDmin
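A minimal sketch of TimeSlice follows. The name `time_slice` is hypothetical, and numbering phases from 0 is an assumption. The function reports the number of silent rounds that pass before the leader's phase begins; the token then needs up to n further rounds to go around the ring.

```python
def time_slice(uids):
    """Sketch of TimeSlice; uids[i] is the UID of p_i and n = len(uids) is known to all.

    Returns (leader_index, messages, silent_rounds).
    """
    n = len(uids)
    phase = 0
    while True:
        # Every processor locally compares its own UID with the phase number.
        candidates = [i for i, uid in enumerate(uids) if uid == phase]
        if candidates:
            leader = candidates[0]     # UIDs are unique, so there is exactly one
            silent_rounds = n * phase  # rounds in which nobody sent anything
            messages = n               # the leader's token visits each processor once
            return leader, messages, silent_rounds
        phase += 1                     # n silent rounds pass, nobody is elected

ts_result = time_slice([5, 3, 9, 4])
```

Here the smallest UID is 3 on a ring of 4 processors, so 12 rounds pass in silence before processor 1 announces itself with 4 messages.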
VariableSpeeds
• Each process pi initiates a token
• Different tokens travel at different speeds:
  • for token carrying UIDv, 1 message every $2^{UID_v}$ rounds
  • (each process waits $2^{UID_v}$ rounds after receiving the token before sending it out)
• Each process keeps track of smallest UID seen
• Discard token with UID greater than smallest UID
Complexity Analysis
• By the time UIDmin goes around the ring, the second smallest UID has gone at most half way, the third smallest at most a fourth of the way, etc.
• Forwarding the token carrying UIDmin has caused more messages than all the other tokens combined
• Message complexity bounded by 2n
• Time complexity: $n \cdot 2^{UID_{min}}$
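The analysis can be checked against a round-by-round simulation. This is a sketch under assumptions: `variable_speeds` is a hypothetical name, all processors start in the same round, the originator also waits $2^{UID_v}$ rounds before its first send, and the simulation stops when the smallest UID returns to its origin.

```python
def variable_speeds(uids):
    """Simulate VariableSpeeds on a unidirectional synchronous ring.

    uids[i] is the UID of p_i. Returns (leader_index, messages, rounds).
    """
    n = len(uids)
    smallest = list(uids)                       # smallest UID each processor has seen
    origin = {u: i for i, u in enumerate(uids)}
    # Each token: [uid, current position, round in which it is sent onward].
    tokens = [[u, i, 2 ** u] for i, u in enumerate(uids)]
    messages = 0
    rnd = 0
    while True:
        rnd += 1
        survivors = []
        for uid, pos, due in sorted(tokens):    # smaller UIDs are handled first
            if uid > smallest[pos]:
                continue                        # holder has seen a smaller UID: discard
            if due > rnd:
                survivors.append([uid, pos, due])
                continue
            pos = (pos + 1) % n                 # one message to the next processor
            messages += 1
            if pos == origin[uid]:
                return pos, messages, rnd       # token came home: leader elected
            if uid > smallest[pos]:
                continue                        # receiver discards it
            smallest[pos] = uid
            survivors.append([uid, pos, rnd + 2 ** uid])
        tokens = survivors

vs_leader, vs_messages, vs_rounds = variable_speeds([3, 1, 4, 2])
```

For this ring the token with UID 1 returns after 4 · 2¹ = 8 rounds, and the total message count stays below 2n = 8.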
Variable start times
Processors can start the protocol at different times
• processors that wake up spontaneously (participants) send token with UID around ring
• processors that wake up on receiving a UID (relays) do not initiate their own token
A message life cycle
• A message is in phase one
  • until it is received by an awake processor
  • forwarded immediately
• A message is in phase two
  • once received by an awake processor
  • forwarded after $2^{UID_i} - 1$ rounds
The New Algorithm
When participant receives a message from pi:
• if UIDi larger than minimal seen (including own), swallow it
• otherwise, delay for $2^{UID_{min}} - 1$ rounds
When relay receives a message from pi:
• if UIDi larger than minimal seen (not including own), swallow it
• otherwise, delay for $2^{UID_{min}} - 1$ rounds
Correctness
Lemma: Only the participant processor with the smallest identifier receives its token back
Proof:
• Let pi be the participating processor with smallest UID
• No processor can swallow UIDi
• All tokens must go through pi , and will be swallowed
• No other processor can receive token back
Complexity
Three categories of messages:
• phase one messages
• phase two messages sent before the message of eventual leader enters its second phase
• phase two messages sent after the eventual leader enters its second phase
Complexity
Lemma: The total number of messages in the first category is at most n.
Proof: The lemma follows because at most one phase one message is forwarded by each processor
• Suppose pi forwards two phase 1 messages, carrying UIDj and UIDk
• Assume, WLOG, that pj is closer to pi than pk.
• Then, the phase 1 message with UIDk must go through pj
• If pj is awake, then it becomes a phase 2 message
• Otherwise, pj becomes a relay and does not send its UID
Complexity
Lemma: The total number of messages in the second category is at most n
Proof:
• After the first process awakens, it takes at most n rounds before the message with UIDmin reaches a participant
• During this time, the token with UIDv is responsible for at most $n / 2^{UID_v}$ messages
• Max number of messages obtained when UIDs are small (0,1,…,n-1)
• Max number of messages in second category: $\sum_{v=1}^{n} \frac{n}{2^{UID_v}} < n$
Complexity
Lemma: The total number of messages in the third category is at most 2n
Proof: analogous to complexity analysis for Variable Speeds
In summary:
Message Complexity: At most 4n
Time complexity: $n + n \cdot 2^{UID_{min}}$
And now for something completely different...
RANDOMIZATION
Randomized Algorithms
Extend transition function to accept as input
• a random number
• from a bounded range
• under some fixed distribution
Why is it important?
The bad news:
randomization alone does not generally affect
• impossibility results – leader election in anonymous network still impossible!
• worst case bounds
The good news:
randomization + weakening of problem statement does
Example: Randomized Leader Election
• Impossibility in anonymous rings still holds
• but can now elect a leader with some probability
• So weaken LE as follows
Safety: In every configuration of every admissible execution, at most one processor is in an elected state
Liveness: At least one processor is elected with some non-zero probability
Behaviors allowed by weakened specification:
• terminate without a leader
• never terminate
Back to Leader Election
• Use randomization to have processes generate a pseudo identifier
• Use a deterministic leader election algorithm to work with pseudo identifiers
• Not just any deterministic LE algorithm:
  • needs to work correctly if multiple processes generate the same pseudo id
• a plus is the ability to detect if no leader elected
A first result
Assume
• synchronous ring
• non-uniform ring
• processor can randomly choose identifiers
Theorem: There is a randomized algorithm which, with probability c > 1/e, elects a leader in a synchronous ring; the algorithm sends $O(n^2)$ messages
The Algorithm
Code for processor pi
Initially
    pidi := 1 with probability $1 - \frac{1}{n}$, 2 with probability $\frac{1}{n}$
    send <pidi> to left
upon receiving <S> from right
    if |S| = n then
        if pidi is unique max(S) then
            elected := true
        else
            elected := false
    else
        send <S||pidi> to left
Observations:
• randomization used once
• one execution for each element of ℜ = {1,2}^n
Definitions
• exec(R): the execution for R in ℜ
• Given a predicate P on executions, Pr[P]: probability of the event {R ∈ ℜ : exec(R) satisfies P}
Analysis
What is the probability that the algorithm terminates with a leader?
$c = \binom{n}{1} \frac{1}{n} \left(1 - \frac{1}{n}\right)^{n-1} = \left(1 - \frac{1}{n}\right)^{n-1} > \left(1 - \frac{1}{n}\right)^{n} \to \frac{1}{e}$
Message Complexity: $O(n^2)$
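The success probability can be checked exactly for small rings by enumerating every element of {1,2}^n. The sketch below uses the hypothetical name `success_probability` and exact rational arithmetic. It only evaluates the probability that the drawn pseudo-identifiers have a unique maximum; it does not simulate the message exchange.

```python
from fractions import Fraction
from itertools import product
import math

def success_probability(n):
    """Exact probability that the pseudo-identifiers have a unique maximum.

    Each processor independently draws 2 with probability 1/n and 1 otherwise.
    """
    p2 = Fraction(1, n)
    total = Fraction(0)
    for pids in product((1, 2), repeat=n):      # one outcome per element of {1,2}^n
        if pids.count(max(pids)) == 1:          # a unique maximum elects a leader
            prob = Fraction(1)
            for pid in pids:
                prob *= p2 if pid == 2 else 1 - p2
            total += prob
    return total

# Compare the enumeration with the closed form (1 - 1/n)^(n-1) for small n.
checks = [
    (success_probability(n), (1 - Fraction(1, n)) ** (n - 1))
    for n in range(2, 8)
]
```

For every n tried, the enumerated value agrees with $(1 - 1/n)^{n-1}$ and stays above 1/e.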
Not good enough?
Trade off more time and messages for higher probability of success
• if |S| = n and pi detects no single max in S
  – choose new pidi
  – restart algorithm
• ℜ becomes a set of n-tuples, each of which is a possibly infinite sequence over {1,2}
Analysis
Probability of success in iteration k: $(1-c)^{k-1} \cdot c$
Time complexity:
• worst-case number of iterations: $\infty$
• expected number of iterations: $1/c < e$
Expected value of T: $E[T] = \sum_{x \in T} x \cdot \Pr[T = x]$, where x ranges over the values T can take
Expected message complexity: $O(n^2)$
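The expected number of iterations follows from the geometric distribution: iteration k is the first success with probability $(1-c)^{k-1} c$. A short derivation, writing T for the number of iterations:

```latex
\[
E[T] \;=\; \sum_{k=1}^{\infty} k\,(1-c)^{k-1}\,c
      \;=\; c \cdot \frac{1}{\bigl(1-(1-c)\bigr)^{2}}
      \;=\; \frac{1}{c} \;<\; e
      \qquad \text{since } c > \tfrac{1}{e}.
\]
```

Each iteration costs $O(n^2)$ messages and the expected number of iterations is bounded by a constant, which gives the expected $O(n^2)$ message complexity.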
Impossibility of Uniform Algorithms
Theorem: There is no uniform randomized algorithm for leader election in a synchronous anonymous ring that terminates in even a single execution for a single ring size
Summary
• No deterministic solution for anonymous rings
• No solution for uniform anonymous rings (even when using randomization)
• Protocols with $O(n^2)$ and $O(n \log n)$ messages for uniform rings
• $\Omega(n \log n)$ lower bound on message complexity for practical protocols
• $O(n)$ message complexity for uniform synchronous rings