The snapshot algorithm (Chandy and Lamport)
Goal: design a snapshot (=global-state-detection) algorithm that:
will record a collection of states of all system components (which forms a global system state),
will not change the underlying computation,
will not freeze the underlying computation
A process can:
record its own state,
send and receive messages,
record messages it sends and receives,
cooperate with other processes.
Processes do not share clocks or memory.
Processes cannot record their state precisely at the same instant.
Motivation
Many problems in distributed systems can be stated in terms of the problem of detecting global states:
Stable property detection problems: termination detection, deadlock detection, etc.
Checkpointing
Stable Property Detection Problem
D – a distributed system
y – a predicate function defined on the set of global states of D
S, S’ – global states of D
y is stable if y(S) implies y(S’) for all S’ reachable from S
many distributed algorithms are structured as a sequence of phases
A phase: transient part, then a stable part
phase termination vs. computation termination
our view on the problem:
i. detect the termination of a phase
ii. initiate a new phase
Notice that “the kth phase has terminated” is a stable property
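The definition of a stable property can be checked mechanically on a small, explicitly given state graph. The sketch below is only an illustration: the two-state graph and the names `GRAPH`, `reachable` and `is_stable` are hypothetical and do not come from the slides.

```python
from collections import deque

# Hypothetical global-state graph: each state maps to its direct successors.
GRAPH = {
    "working": ["working", "terminated"],
    "terminated": ["terminated"],
}


def reachable(graph, start):
    """Return every state reachable from `start` (including `start` itself)."""
    seen, queue = {start}, deque([start])
    while queue:
        for successor in graph[queue.popleft()]:
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return seen


def is_stable(graph, y):
    """y is stable iff y(S) implies y(S') for all S' reachable from S."""
    return all(y(s2) for s in graph if y(s) for s2 in reachable(graph, s))
```

In this toy graph, "the computation has terminated" is stable, while "the computation is still working" is not, because a working state can reach the terminated one.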
Model
Distributed system D is a finite, labeled, directed graph.
[Figure: processes p and q connected by channels C1 and C2.]
Channels have infinite buffers, are error-free, and preserve FIFO order.
Message delay is finite, but arbitrary and unknown.
State of a Channel
[Figure: p has sent messages 1, 2, 3 on channel C1; q has received message 1.]
[1, 2, 3] – sequence X of messages that were sent
[1] – sequence Y of received messages ( prefix of X )
[2, 3] – state of C1: X \ Y
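The computation X \ Y above can be written as a small function. This is a minimal sketch; the name `channel_state` is chosen here for illustration only.

```python
def channel_state(sent, received):
    """State of a FIFO channel: messages sent (X) minus the received prefix (Y)."""
    if received != sent[:len(received)]:
        raise ValueError("on a FIFO channel, received must be a prefix of sent")
    return sent[len(received):]
```

With the values from the slide, `channel_state([1, 2, 3], [1])` gives `[2, 3]`.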
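Example: System
Distributed system: processes p and q, connected by channels C1 and C2.
Initial global state: p in state B, q in state A, C1 = Ø, C2 = Ø.
State transitions (same for p and q):
[Figure: two states, A and B, linked by a send transition and a receive transition.]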
Global state transition diagram
[Figure: the four global states of the system, linked by the events p sends, q receives, q sends, p receives.]
A computation corresponds to a path in the diagram.
The system is deterministic.
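The global state transition diagram of this example can be enumerated by a short search. The sketch below rests on an assumed reading of the figures: state B means "holds the token", state A means "no token", C1 carries messages from p to q and C2 from q to p. The names `INITIAL`, `successors`, `explore` and `DIAGRAM` are hypothetical.

```python
from collections import deque

# Global state: (state of p, state of q, contents of C1, contents of C2).
# Assumption: B = holds the token, A = no token; C1 is p -> q, C2 is q -> p.
INITIAL = ("B", "A", (), ())


def successors(state):
    """Return the (event, next global state) pairs enabled in `state`."""
    p, q, c1, c2 = state
    result = []
    if p == "B":
        result.append(("p sends", ("A", q, c1 + ("token",), c2)))
    if q == "B":
        result.append(("q sends", (p, "A", c1, c2 + ("token",))))
    if p == "A" and c2:
        result.append(("p receives", ("B", q, c1, c2[1:])))
    if q == "A" and c1:
        result.append(("q receives", (p, "B", c1[1:], c2)))
    return result


def explore(initial):
    """Breadth-first enumeration of all reachable global states."""
    diagram, queue = {}, deque([initial])
    while queue:
        state = queue.popleft()
        if state in diagram:
            continue
        diagram[state] = successors(state)
        queue.extend(nxt for _, nxt in diagram[state])
    return diagram


DIAGRAM = explore(INITIAL)
```

Under these assumptions every reachable global state has exactly one enabled event, which is what makes the system deterministic.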
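Example: System
Distributed system: processes p and q, connected by channels C1 and C2.
State transitions:
p: [Figure: two states, A and B, linked by a send transition and a receive transition.]
q: [Figure: two states, C and D, linked by a send transition and a receive transition.]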
Global state transition diagram
[Figure: global states of the system, with transitions labeled p sends, q sends, p receives, q receives.]
The system is non-deterministic.
We look at the following sequence of events: p sends, q sends, p receives.
In the snapshot algorithm:
Each process records its own state.
p and q cooperate to record the state of C.
[Figure: channel C from p to q.]
Example: System
[Figure: the two-state example system with p in state B, q in state A, and empty channels.]
Order of recording: Record C first, then Record q and Record p.
Recorded state: p = A, q = A, C = Ø.
No token.
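Example: System
[Figure: the two-state example system with p in state B, q in state A, and empty channels.]
Order of recording: Record p first, then Record C and Record q.
Recorded state: p = B, q = A, C1 contains the token.
Two tokens.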
[Figure: two timelines.]
No token: C's state recorded, then p sends a message on C, then p's state recorded.
Two tokens: p's state recorded, then p sends a message on C, then C's state recorded.
q will record the state of C.
q starts recording C after it records its own state.
p and q have to coordinate, using a special marker message.
q stops recording when it receives the marker from p.
But: how does q know when to record its state?
The snapshot algorithm (Intuition for the Algorithm)
Who starts? We assume one process.
HW: extend the discussion and the proof to any number of starters.
Who will record the state of channel C? q.
When does q start recording? q starts recording after it records its own state.
How does q know when to stop recording? p sends a marker right after it records its state, and before sending any other message.
The snapshot algorithm
Channel recording (channel C from p to q):
Starts when q records its own state.
Ends when q receives a marker along C.
Note: for any q ≠ p0, the channel along which a marker arrived first is recorded as Ø (empty).
The snapshot algorithm
State recording:
p0 starts.
p0 records its state, and then broadcasts a marker.
(Shout algorithm = PI (propagation of information) = hot potato = …)
When q receives a marker for the first time, it records its own state.
The snapshot algorithm
Marker-Sending Rule for a process p:
1. record the state of p
2. send a marker along each outgoing channel c before sending any other message along c
Marker-Receiving Rule for a process q, on receiving a marker along channel c:
if q’s state is not recorded: 1. record q’s state; 2. record c’s state = Ø
else: c’s state is the sequence of messages received along c since q recorded its state
Termination
Assumption: no marker remains forever in an input channel.
Claim: If the graph is strongly connected and at least one process records its state, then all processes will record their state in finite time
Proof: by induction
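The two marker rules can be tried out on the two-process example (p with states A and B, q with states C and D). The following is a minimal sketch under stated assumptions, not a definitive implementation: the `Process` class, `connect`, `run_example`, the message names `M` and `M'`, and the channel directions (C1 from p to q, C2 from q to p) are all introduced here for illustration. It also assumes that a process that records its state on receiving its first marker applies the marker-sending rule at that moment.

```python
from collections import deque

MARKER = object()  # special marker, distinct from every application message


class Process:
    def __init__(self, name, state):
        self.name = name
        self.state = state
        self.recorded_state = None    # None means "state not recorded yet"
        self.outgoing = {}            # destination name -> FIFO channel
        self.incoming = {}            # source name -> FIFO channel
        self.recorded_channels = {}   # source name -> recorded channel state
        self.still_recording = set()  # incoming channels whose marker is pending

    def record_state(self):
        """Marker-sending rule: record own state, then put a marker on every
        outgoing channel before any further message."""
        self.recorded_state = self.state
        for source in self.incoming:
            self.recorded_channels[source] = []
        self.still_recording = set(self.incoming)
        for channel in self.outgoing.values():
            channel.append(MARKER)

    def send(self, destination, message, new_state):
        self.outgoing[destination].append(message)
        self.state = new_state

    def receive(self, source, new_state=None):
        """Marker-receiving rule, applied to the head of the channel from `source`."""
        message = self.incoming[source].popleft()
        if message is MARKER:
            if self.recorded_state is None:
                self.record_state()               # this channel is recorded as empty
            self.still_recording.discard(source)  # recording of this channel ends
        else:
            if source in self.still_recording:
                self.recorded_channels[source].append(message)
            if new_state is not None:
                self.state = new_state
        return message


def connect(sender, receiver):
    channel = deque()
    sender.outgoing[receiver.name] = channel
    receiver.incoming[sender.name] = channel


def run_example():
    p, q = Process("p", "A"), Process("q", "C")
    connect(p, q)            # C1 (assumed direction: p to q)
    connect(q, p)            # C2 (assumed direction: q to p)
    p.record_state()         # p initiates the snapshot while in state A
    p.send("q", "M", "B")    # p sends
    q.send("p", "M'", "D")   # q sends
    p.receive("q", "A")      # p receives M' while it is recording C2
    q.receive("p")           # q receives the marker: records D, C1 = empty
    q.receive("p", "C")      # q receives M after its recording
    p.receive("q")           # p receives the marker: recording of C2 ends
    return (p.recorded_state, q.recorded_state,
            q.recorded_channels["p"], p.recorded_channels["q"])
```

In this run the recorded global state is p = A, q = D, C1 empty, C2 containing `M'`. No global state of the run itself looks like this, yet it is a state the system could have reached (for example if q had sent first).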
The Recorded Global State
Ex: System
[Figure: the two-process example system: p with states A and B, q with states C and D, channels C1 and C2.]
[Figure: global state transition diagram with the execution p sends, q sends, p receives.]
What did we get?
Event e in process p is an atomic action: it can change the state of p, and the state of at most one channel c incident on p (by sending or receiving a message M along c).
e is defined by <p, s, s’, M, c>.
e = <p, s, s’, M, c> may occur in global state S if:
1. the state of p in S is s;
2. a. if c is directed towards p: M is at the head of c’s state, and M is deleted from c by applying e;
   b. if c is directed from p: M is at the tail of c’s state after applying e;
3. the state of p after applying e is s’.
Process State and Global State
A process: a set of states, an initial state, and a set of events.
A global state S: a collection of process states and channel states.
Initially, each process is in its initial state and all channels are empty.
next(S, e) is the global state after event e is applied to global state S.
seq = (ei : i = 0…n) is a computation of the system iff
ei may occur in Si, 0 ≤ i ≤ n,
Si+1 = next(Si, ei)
(S0 is the initial global state).
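The definitions of "may occur", next(S, e) and "computation" translate directly into code. The sketch below is an illustration under assumptions: a global state is modelled as a pair (process states, channel states), and the `direction` field of `Event` is an addition of this sketch to say whether c is directed towards p or from p. The example events use the two-process system with the hypothetical message names `M` and `M'`.

```python
from collections import namedtuple

# e = <p, s, s', M, c>. `direction` is an assumption of this sketch:
# "in" if c is directed towards p, "out" if c is directed from p,
# None for an event that touches no channel.
Event = namedtuple("Event", ["p", "s", "s_next", "M", "c", "direction"])


def may_occur(S, e):
    process_states, channel_states = S
    if process_states[e.p] != e.s:
        return False
    if e.direction == "in":
        # M must be at the head of c's state.
        return channel_states[e.c][:1] == (e.M,)
    return True


def next_state(S, e):
    process_states, channel_states = dict(S[0]), dict(S[1])
    process_states[e.p] = e.s_next
    if e.direction == "in":
        channel_states[e.c] = channel_states[e.c][1:]        # remove M from the head
    elif e.direction == "out":
        channel_states[e.c] = channel_states[e.c] + (e.M,)   # append M at the tail
    return process_states, channel_states


def is_computation(S0, seq):
    """Check that each e_i may occur in S_i, with S_{i+1} = next(S_i, e_i)."""
    S = S0
    for e in seq:
        if not may_occur(S, e):
            return False
        S = next_state(S, e)
    return True


S0 = ({"p": "A", "q": "C"}, {"C1": (), "C2": ()})
p_sends = Event("p", "A", "B", "M", "C1", "out")
q_sends = Event("q", "C", "D", "M'", "C2", "out")
p_receives = Event("p", "B", "A", "M'", "C2", "in")
```

For instance, `is_computation(S0, [p_sends, q_sends, p_receives])` holds, while `[p_receives]` alone is not a computation because C2 is empty in S0.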
The Recorded Global State
seq = (ei : i ≥ 0) – a distributed computation
Si – the state of the system right before ei occurs
S0 – the initial state of the system
St – the state of the system at the termination of the algorithm
S* – the recorded global state
Definition:
Event ej is called pre-recording if ej is in a process p and p records its state after ej in seq.
Event ej is called post-recording if ej is in a process p and p records its state before ej in seq.
Assume that ej-1 is a post-recording event that comes before a pre-recording event ej in seq.
Lemma: If $S_1 \xrightarrow{\,e_{j-1},\ e_j\,} S_2$, then
a. $e_j$ can be applied in $S_1$, say $S_1 \xrightarrow{\,e_j\,} S_3$,
b. $e_{j-1}$ can be applied in $S_3$, say $S_3 \xrightarrow{\,e_{j-1}\,} S_4$, and
c. $S_2 = S_4$.
Proof: ej-1 occurs in p and ej in q, and q ≠ p (since ej-1 is post-recording and ej is pre-recording).
The only scenario that might prevent interchanging the two events is that a message M is sent at ej-1 and received at ej.
But this is not possible: if M is sent at ej-1, then M is sent after p recorded its state, so a marker was sent to q before M. So when M is received in ej, q has already recorded its state, so ej is post-recording, a contradiction.
Hence, event ej can occur in global state Sj-1. The state of process p is not altered by ej, hence ej-1 can occur after ej.
We have to show that the states of all processes and channels are the same in S2 and S4. This clearly holds for processes and channels that do not take part in ej-1 and ej.
Process states: the states of p and q in S2 and in S4 are the same.
Channels: whether ej-1 and ej send a message, receive a message, or do neither along a channel, the same is done in both scenarios, so the states of the channels in S2 and S4 are the same. (End of proof.)
(The Recorded Global State)
Theorem: Given an execution seq and an output S* of the snapshot algorithm, there exists a computation $seq' = (e'_j : j \ge 0)$, where
1. For all $j < i$ or $j \ge t$: $e'_j = e_j$.
2. The subsequence $(e'_j \mid i \le j < t)$ is a permutation of the subsequence $(e_j \mid i \le j < t)$.
3. For all $j \le i$ or $j \ge t$: $S'_j = S_j$.
4. There exists $k$, $i \le k \le t$, such that $S^* = S'_k$.
(Here the snapshot algorithm is initiated in $S_i$ and terminates in $S_t$.)
Proof: Using the lemma, swap events until all post-recording events appear after all pre-recording events. The resulting computation is seq’. All that is left to show: S* is the global state after all pre-recording events and before all post-recording events.
1. Process states
2. Channel states
Claim: The state of a channel in S* is
(sequence of messages corresponding to pre-recording sends) − (sequence of messages corresponding to pre-recording receives).
Proof: The state of channel c from process p to process q recorded in S* is the sequence of messages received on c by q after q records its state and before q receives a marker on c. The sequence of messages sent by p before the marker is the sequence corresponding to pre-recording sends on c. The messages that q receives on c before it records its state correspond to the pre-recording receives on c, so the recorded sequence is the difference of the two.
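The claim can be checked on the example execution p sends, q sends, p receives. The sketch below is illustrative only: the function name, the tuple layout of events, the message names `M` and `M'`, the channel directions (C1 from p to q, C2 from q to p) and the assumption that p records its state before the first event while q records after the third are all choices made here.

```python
def recorded_channel_state(seq, channel, record_index):
    """Pre-recording sends on `channel` minus pre-recording receives on it.

    seq: list of (process, kind, channel, message), kind is "send" or "receive".
    record_index[process]: number of events of seq that precede the moment the
    process records its state, so event i of that process is pre-recording
    iff i < record_index[process].
    """
    pre_sends = [m for i, (proc, kind, ch, m) in enumerate(seq)
                 if ch == channel and kind == "send" and i < record_index[proc]]
    pre_receives = [m for i, (proc, kind, ch, m) in enumerate(seq)
                    if ch == channel and kind == "receive" and i < record_index[proc]]
    # FIFO: what was received must be a prefix of what was sent.
    if pre_receives != pre_sends[:len(pre_receives)]:
        raise ValueError("pre-recording receives are not a prefix of pre-recording sends")
    return pre_sends[len(pre_receives):]


SEQ = [
    ("p", "send", "C1", "M"),      # post-recording (p has already recorded)
    ("q", "send", "C2", "M'"),     # pre-recording (q records later)
    ("p", "receive", "C2", "M'"),  # post-recording
]
RECORD_INDEX = {"p": 0, "q": 3}
```

Under these assumptions C1 is recorded as empty and C2 as the single message `M'`.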
[Figure: the execution p sends, q sends, p receives, with the events labeled post-recording, pre-recording, post-recording.]
(Another execution)
[Figure: the execution q sends, p sends, p receives, with the events labeled pre-recording, post-recording, post-recording.]
What did we get?
A configuration that could have happened
Stable Detection
D – a distributed system
y – a predicate function defined on the set of global states of D
S, S’ – global states of D
y is a stable property of D if y(S) implies y(S’) for all S’ reachable from S
Algorithm
Input: a stable property y
Output: a boolean value b with the property: y(S0) → b and b → y(St)
Algorithm:
begin
  record a global state S*
  b := y(S*)
end
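The detection algorithm is short enough to write out directly. In this sketch `record_global_state` stands for any procedure that returns a recorded global state S*; it is passed in as a parameter, and the stub `fake_snapshot` and the property `terminated` are hypothetical stand-ins used only to exercise the function. Whether a termination predicate of this form is really stable depends on the system, which is assumed here.

```python
def detect_stable_property(record_global_state, y):
    """Return b := y(S*), where S* is a recorded global state."""
    s_star = record_global_state()
    return y(s_star)


def terminated(global_state):
    """Hypothetical stable property: all processes idle, all channels empty."""
    process_states, channel_states = global_state
    return (all(state == "idle" for state in process_states.values())
            and all(len(messages) == 0 for messages in channel_states.values()))


def fake_snapshot():
    """Stub that stands in for a run of the snapshot algorithm."""
    return ({"p": "idle", "q": "idle"}, {"C1": [], "C2": []})
```

A call such as `detect_stable_property(fake_snapshot, terminated)` evaluates the property on the recorded state only; it never inspects the live system.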
Correctness
1. S* is reachable from S0
2. St is reachable from S*
3. y(S) → y(S’) for all S’ reachable from S
S0 → S* → St
y(S*) = true → y(St) = true
y(S*) = false → y(S0) = false
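Spelled out as a short derivation, using only the three facts listed under Correctness and $b = y(S^*)$:

```latex
\begin{align*}
y(S_0) &\Rightarrow y(S^*) && \text{($S^*$ is reachable from $S_0$ and $y$ is stable)}\\
y(S^*) &\Rightarrow y(S_t) && \text{($S_t$ is reachable from $S^*$ and $y$ is stable)}\\
\text{so, with } b = y(S^*):\quad y(S_0) &\Rightarrow b \quad\text{and}\quad b \Rightarrow y(S_t).
\end{align*}
```

The contrapositive of the first line gives the remaining case: if $y(S^*)$ is false, then $y(S_0)$ is false.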
References
K. M. Chandy and L. Lamport, Distributed Snapshots: Determining Global States of Distributed Systems