Upload
others
View
36
Download
0
Embed Size (px)
Citation preview
Koo-Toueg coordinated checkpointing
algorithm
โขA coordinated checkpointing and recovery
technique that takes a consistent set of
checkpointing and avoids domino effect and
livelock problems during the recovery
โขIncludes 2 parts: the checkpointing
algorithm and the recovery algorithm
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
1
Koo-Toueg coordinated checkpointingalgorithm(cont.)
โข Checkpointing algorithmโ Assumptions: FIFO channel, end-to-end protocols,
communication failures do not partition the network, single process initiation, no process fails during the execution of the algorithm
โ Two kinds of checkpoints: permanent and tentativeโข Permanent checkpoint: local checkpoint, part of a
consistent global checkpoint
โข Tentative checkpoint: temporary checkpoint, becomepermanent checkpoint when the algorithm terminatessuccessfully
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
2/31
Koo-Toueg coordinated checkpointingalgorithm(cont.)
โข Checkpointing algorithmโ 2 phases
โข The initiating process takes a tentative checkpoint and requests all other processes to take tentative checkpoints. Every process can not send messages after taking tentative checkpoint. All processes will finally have the single same decision: do or discard
โข All processes will receive the final decision from initiating processand act accordingly
โ Correctness: for 2 reasonsโข Either all or none of the processes take permanent checkpoint
โข No process sends message after taking permanent checkpoint
โ Optimization: maybe not all of the processes need to takecheckpoints (if not change since the last checkpoint)
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
3/31
Koo-Toueg coordinated checkpointingalgorithm(cont.)
โข The rollback recovery algorithmโ Restore the system state to a consistent state after a
failure with assumptions: single initiator, checkpoint and rollback recovery algorithms are not invoked concurrently
โ 2 phasesโข The initiating process send a message to all other processes
and ask for the preferences โ restarting to the previous checkpoints. All need to agree about either do or not.
โข The initiating process send the final decision to all processes, all the processes act accordingly after receiving the final decision.
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
4/31
Koo-Toueg coordinated checkpointingalgorithm(cont.)
โข Correctness: resume from a consistent state
โข Optimization: may not to recover all, since some of the processes did not change anything
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
5/31
Juang-Venkatesan algorithm for asynchronouscheckpointing and recovery
โข Assumptions: communication channels are reliable, delivery messages in FIFO order, infinite buffers, message transmission delay is arbitrary but finite
โข Underlying computation/application is event-driven: process P is at state s, receives message m, processes the message, moves to state sโ and send messages out. So the triplet (s, m, msgs_sent) represents the state of P
โข Two type of log storage are maintained:
โ Volatile log: short time to access but lost if processor crash. Move to stable log periodically.
โ Stable log: longer time to access but remained if crashed
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
6/31
Juang-Venkatesan algorithm for asynchronous
checkpointing and recovery(cont.)
โข Asynchronous checkpointing:โ After executing an event, the triplet is recorded without any
synchronization with other processes.โ Local checkpoint consist of set of records, first are stored in volatile
log, then moved to stable log.
โข Recovery algorithmโ Notations:
โข ๐ ๐ถ๐๐ท๐โ๐ (๐ถ๐๐๐ก๐ ): number of messages received by ๐๐ from ๐๐ , from the beginning of computation to checkpoint ๐ถ๐๐๐ก๐
โข ๐๐ธ๐๐๐โ๐ (๐ถ๐๐๐ก๐): number of messages sent by ๐๐ to ๐๐ , from the beginning of computation to checkpoint ๐ถ๐๐๐ก๐โ Idea:
โข From the set of checkpoints, find a set of consistent checkpoints
โข Doing that based on the number of messages sent and received
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
7/31
Juang-Venkatesan algorithm for asynchronous
checkpointing and recovery(cont.)
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
8
Juang-Venkatesan algorithm for asynchronous
checkpointing and recovery(cont.)
โข Example
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
9
Manivannan-Singhal algorithm
โข Observation: there are some checkpoints useless (i.e. never included in any consistent global checkpoint), even none of them are useful
โข Combine the coordinated and uncoordinated checkpointing approachesโ Take checkpoint asynchronously
โ Use communication-induced checkpointing to eliminates the useless checkpointโ Every checkpoint lies on a consistent checkpoint, determine the recovery
โข line is easy and fast
โข Ideaโ Each checkpoint of a process has a unique sequence number โ local number, increased
periodicallyโ When a process send out a message, its sequence number is piggybacked
โ When a process received a message, if the received sequence number > its sequence number, it is forced to take checkpoint, and any basic checkpointing with smaller sequence number is skipped
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
10/31
Manivannan-Singhal โ Checkpointing Alg. (1)
โข Checkpointing algorithm
โ Checkpoints satisfy the following interesting properties
โข Ci,m of Pi is concurrent with C*, m of all other processes
โข Checkpoints C*,m of all processes form a consistent global checkpoint
โข Checkpoint Ci,m of Pi is concurrent with earliest checkpoint Cj, n
with m โค n
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
11/31
Manivannan-Singhal โ Checkpointing Alg. (2)
For a forced
checkpoint
For a basic
checkpoint
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
12/31
Manivannan-Singhal โ Checkpointing Ex
โข M1 forces P2 to take a forced checkpoint with sequence number
3 before processing M1 because M1.sn> sn2
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
13/31
Manivannan-Singhal โ Recovery Alg. (1)
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
14
Manivannan-Singhal โ Recovery Alg. (2)
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
15
Manivannan-Singhal โ Recovery Ex
โข When ๐3 recovers,
โ broadcast rollback(inc3, rec_lin3) where inc3 = 1 and rec_line3 = 5
โ ๐2 rollback to ๐ถ2,5
โ ๐1 does not have a checkpoint with sequence number โฅ 5. So it takes
a local check point and assign 5 as its sequence number
๐ถ1,5
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
16/31
Activity
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
17/31
Manivannan-Singhal quasi-synchronouscheckpointing algorithm(cont.)
โข Comprehensive handling messages during recoveryโ Handling the replay of messages
โ Handle of received messages
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
18/31
Peterson-Kearns algorithm โ Definition (1)
โข Based on optimistic recovery protocol
โข Rollback based on vector time
โข Ring configuration : each processor knows its successor on the ring
โข Each process ๐๐ has a vector clock ๐๐ [๐], 0 โค j โค N-1
โข ๐๐ (๐๐) : the clock value of an event ๐๐ which occurred at ๐๐โข ๐๐ (๐๐) : the current vector clock time of ๐๐ and ๐๐ denotes the most
recent event in ๐๐, thus ๐๐ (๐๐) = ๐๐ (๐๐)
โข ๐๐ : i th event on ๐๐
โข s : A send event of the underlying computation
โข ๐(๐ ) : The process where send event s occurs
โข ๐(s) : The process where the receive event matched with send event s
occurs
โข ๐๐ : The i th failure on ๐๐
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
19/31
Peterson-Kearns algorithm โ Definition (2)
โข ๐๐๐ : The i th state checkpoint on ๐๐ . The check point resides on
the
stable stoarge
โข ๐๐ ๐ : The i th restart event on ๐๐
โข ๐๐๐ : The i th rollback event on ๐๐
โข LastEvent (๐๐ ) = ๐ iff ๐ โฆ ๐๐ ๐
โข
๐ โฒ โฒ ๐
๐ถ๐ ,๐ (๐) : The arrival of the final polling wave message for rollbackfrom failure ๐๐ at process ๐๐โข๐ค๐,๐(๐) : The response to this final polling wave by ๐๐. If no response is required, ๐ค๐,๐(๐) = ๐ถ๐,๐ (๐)
๐
โข The final polling wave for recovery from failure ๐๐ :๐
๐๐๐ (๐) = ๐โ1 ๐ค๐,๐(๐) ๐โ1 ๐ถ๐=0 ๐=0๐ ,๐
(๐)
โข tk(i, m).ts : the token with failure ๐๐ and restart event ๐๐ ๐โข tk(i, m).inc : incarnation number of in the token
๐ ๐
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
20/31
Peterson-Kearns Alg. โ Informal Description (1)
โข Step 1
โ When a process ๐๐ restarts after failure, it retrieves its latest
checkpoint, including its vector time value, and roll back to it
โข Step 2
โ The message log is replayed
โข Step 3
โ The recovering process executes a restart event ๐๐ ๐ to begin the
global rollback protocol
โ creates a token message containing the vector timestamp of the restart event
โ sends the token to its successor process
๐
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
21/31
Peterson-Kearns Alg. โ Informal Description (2)
โข Step 4
โ The token is circulated through all the processes on the ring
(propagation rule : from ๐๐ to ๐ ๐+1 ๐๐๐
๐)โ When the token arrives at process ๐๐ , the timestamp in the token
is used to determine whether ๐๐ must roll back
If tk(i, m).ts < ๐๐ (๐๐),
then ๐๐ must roll back to an earlier state because an orphan event has
occurred at ๐๐
Otherwise, the state of ๐๐ is not changed
โข Step 5
โ When the token returns to the originating process, the roll back
recovery is complete
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
22/31
Peterson-Kearns Alg. โ Formal Description (1)
โข described as set of six rules, CRB1 to CRB6
โข CRB1
โ A formerly failed process creates and propagates a token, event
๐ค๐,๐(๐), only after restoring the state from the latest checkpoint and
executing the message log from the stable storage
โข CRB2
โ The restart event increments the incarnation number at the recovering process, and the token carries the vector timestamp of the restart event and the newly incremented incarnation number
โข CRB3
โ A non-failed process will propagate the token only after it has
rolled back
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
23/31
Peterson-Kearns Alg. โ Formal Description (2)
โข CRB4
โ A non-failed process will propagate the token only after it has
incremented its incarnation number and has stored the vector
timestamp of the token and the incarnation number of the token in
its OrVect set
โข CRB5
โ When the process that failed, recovered, and initiated the token,
receives its token back, the rollback is complete
โข CRB6
โ Messages that were in transit and which were orphaned by the
failure and subsequent restart and recovery must be discarded
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
24/31
Peterson-Kearns - example
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
25
Helary-Mostefaoui-Netzer-Raynal protocol (1)
โข Communication-induced checking protocol
โข Some coordination is required in taking local checkpoints
โข Achieve the coordination by piggybacking control information on
application messages
โข Basic checkpoints
โ Processes take local checkpoints independently
โข Forced checkpoints
โ The protocol directs processes to take additional local checkpoints
โ A process takes a forces checkpoint when it receives a message and its predicate becomes true
โข No local checkpoint is useless
โข Takes as few forced checkpoints as possible
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
26/31
Helary-Mostefaoui-Netzer-Raynal protocol (2)
โข Based on Z-path and Z-cycle theory
โ A useless checkpoint exactly corresponds to the existence of a Z-
cycle in the distributed computation
โ The protocol prevents Z-Cycles
โข A Z-path exists from local check point A to local checkpoint B iff (i)A precedes B in the same process, or (ii) a sequence of message [๐1,
๐2, . . . , ๐๐] (q โฅ 1) exists such that
โ (1) A precedes send(๐ฆ๐) in the same process, and
โ (2) for each ๐ฆ๐ข, i < q, delivery(๐ฆ๐ข) is in the same or earlier interval as send(๐ฆ๐ข+๐), and
โ (3) delivery(๐ฆ๐ช) precedes B in the same process
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
27/31
Helary-Mostefaoui-Netzer-Raynal protocol (3)
โข A Z-path from a local checkpoint ๐ถ๐,๐ฅ to the same local checkpoint
๐ถ๐,๐ฅ is called a Z-cycle (i.e., it involves the local checkpoint ๐ถ๐,๐ฅ)
โข In a Z-path [๐1, ๐2, . . . , ๐๐], two consecutive messages ๐๐ผ and
๐๐ผ+1 form a Z-pattern iff send(๐๐ผ+1) โ delivery(๐๐ผ)
โข Theorem : For any pair of checkpoints ๐ถ๐,๐ฆ and ๐ถ๐,๐ง, such that there is
a Z-path from ๐ถ๐,๐ฆ to ๐ถ๐,๐ง, ๐ถ๐,๐ฆ. ๐ก < ๐ถ๐,๐ง. ๐ก implies that there is no Z-
cycle
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
28/31
H-M-N-R protocol โ Z-path & Z-cycle ex.
โข [๐3, ๐2] is a Z-path from ๐ถ๐,0 to ๐ถ๐,2
โข [๐5, ๐4] and [๐5, ๐6] are two Z-paths from ๐ถ๐,2 to ๐ถ๐,2
โข [๐3, ๐2] and [๐5, ๐4] are two Z-patterns
โขThe Z-path [๐7, ๐5, ๐6] is a Z-cycle that involves the local
checkpoint ๐ถ๐,2CS8603 -Distributed Systems-Algorithm for asynchronous
checkpointing and recovery29/31
H-M-N-R protocol โ forced checkpoints ex.
โข (a) ๐๐. ๐ โค ๐๐. ๐ : ๐ถ๐,๐ฆ. ๐ก < ๐1. ๐ก < ๐2. ๐ก < ๐ถ๐,๐ง. ๐ก . Hence, the Z-
pattern [๐1, ๐2] is consistent with the assumption of the above
theorem
โข (b) ๐๐. ๐ > ๐๐. ๐ : A safe strategy to prevent Z-cycle formation is
to direct ๐๐ to take a forced checkpoint ๐ถ๐,๐ฅ before delivering ๐1.
This โbreaksโ [๐1, ๐2], so it is no longer a Z-pattern
โข How to implement โtaking a forced checkpointโ?
โ ๐๐ takes a forced checkpoint if C is true, where
C โก (โ k: ๐ ๐๐๐ก_๐ก๐๐ ๐ โง ๐1. ๐ก > ๐๐๐_๐ก๐๐ ๐ )CS8603 -Distributed Systems-Algorithm for asynchronous
checkpointing and recovery30/31
Helary-Mostefaoui-Netzer-Raynal protocol โ Alg.
CS8603 -Distributed Systems-Algorithm for asynchronous checkpointing and recovery
31