Lecture 4-1
Computer Science 425 Distributed Systems (Fall 2009)
Lecture 4: Chandy-Lamport Snapshot Algorithm and Multicast Communication
Reading: Sections 11.5 & 12.4
Klara Nahrstedt



Lecture 4-2

Acknowledgement

• The slides during this semester are based on ideas and material from the following sources:

– Slides prepared by Professors M. Harandi, J. Hou, I. Gupta, N. Vaidya, Y-Ch. Hu, S. Mitra.

– Slides from Professor S. Ghosh’s course at the University of Iowa.


Lecture 4-3

Administrative

• Form groups for MP projects
  – Up to 3 members per group
  – Email names to TA today
• Homework 1 posted today, September 3 (Thursday)
  – Deadline: September 17 (Thursday)
• Introductory material for the Eclipse/Android programming environment is posted
  – Optional MP0
  – Posted on the class website


Lecture 4-4

Plan for Today

• Chandy-Lamport Global Snapshot Algorithm

• New Topic: Multicast Communication


Lecture 4-5

Review
• A run is a total ordering of events in H that is consistent with each h_i’s ordering
  – E.g., <e_1^0, e_1^1, e_1^2, e_1^3, e_2^0, e_2^1, e_2^2, e_3^0, e_3^1, e_3^2>
• A linearization is a run consistent with the happens-before (→) relation in H
  – E.g., <e_1^0, e_1^1, e_3^0, e_2^0, …>, <e_1^0, e_3^0, e_1^1, e_2^0, …>
  – Concurrent events are ordered arbitrarily
  – Linearizations pass through consistent global states

[Figure: timelines of P1 (events e_1^0 to e_1^3), P2 (events e_2^0 to e_2^2), and P3 (events e_3^0 to e_3^2), with cuts C0, C1, C2, and C3]
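The difference between a run and a linearization can be made concrete with a small check. The following is a minimal Python sketch, not taken from the slides: the event names mirror the figure, but the single message edge (`e3_0` sent, `e2_1` received) is an assumption made up for illustration, as are the function names.

```python
def is_run(order, histories):
    """True if `order` contains every event exactly once and keeps each process's local order."""
    all_events = [e for h in histories.values() for e in h]
    if sorted(order) != sorted(all_events):
        return False
    pos = {e: i for i, e in enumerate(order)}
    return all(pos[a] < pos[b] for h in histories.values() for a, b in zip(h, h[1:]))


def is_linearization(order, histories, messages):
    """True if `order` is a run that also places every send before its matching receive.

    Respecting local order and send-before-receive is enough, because
    happens-before is the transitive closure of those two relations.
    """
    if not is_run(order, histories):
        return False
    pos = {e: i for i, e in enumerate(order)}
    return all(pos[send] < pos[recv] for send, recv in messages)


histories = {
    "p1": ["e1_0", "e1_1", "e1_2", "e1_3"],
    "p2": ["e2_0", "e2_1", "e2_2"],
    "p3": ["e3_0", "e3_1", "e3_2"],
}
messages = [("e3_0", "e2_1")]  # hypothetical message: sent at e3_0, received at e2_1

# Process-by-process order: a valid run, but it receives before it sends.
run_only = histories["p1"] + histories["p2"] + histories["p3"]
# Interleaved order that places e3_0 before e2_1.
linear = ["e1_0", "e1_1", "e3_0", "e2_0", "e1_2", "e1_3", "e2_1", "e2_2", "e3_1", "e3_2"]
```

Under the assumed message edge, `run_only` is a run but not a linearization, while `linear` is both.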


Lecture 4-6

Chandy-Lamport “Snapshot” Algorithm
• A “snapshot” is a set of process and channel states for a set of processes Pi (i = 1, …, N) such that the recorded global state is consistent
  – Even though the combination of recorded states may never have occurred at the same time
• The Chandy-Lamport algorithm records a set of process and channel states such that the combination is a consistent global state.
• Assumptions:
  – No failures; all messages arrive intact, exactly once
  – Communication channels are unidirectional and FIFO-ordered
  – There is a communication channel between each pair of processes
  – Any process may initiate the snapshot (send “Marker”)
  – The snapshot does not interfere with normal execution


Lecture 4-7

Chandy-Lamport Snapshot Algorithm (2)
1. Initiator process P0 records its state locally
2. Marker sending rule for process Pi:
   After Pi has recorded its state, for each outgoing channel Chij, Pi sends one marker message over Chij (before Pi sends any other message over Chij)
3. Marker receiving rule for process Pi:
   On receipt of a marker over channel Chji:
   If (Pi has not yet recorded its state), Pi
     – records its process state now;
     – records the state of Chji as the empty set;
     – starts recording messages arriving over its other incoming channels;
   else (Pi has already recorded its state)
     – Pi records the state of Chji as the set of all messages it has received over Chji since it saved its state
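The marker rules above can be sketched in a few lines of Python. This is a single-threaded illustration under stated assumptions, not the course's implementation: the `Process` class, the `MARKER` token, the integer "balance" state, and the in-memory `deque` channels are all hypothetical stand-ins for real processes and FIFO links.

```python
from collections import deque

MARKER = "MARKER"  # hypothetical marker token, distinct from application messages


class Process:
    def __init__(self, pid, peers, network, state):
        self.pid = pid
        self.peers = peers            # ids of all other processes
        self.network = network        # (src, dst) -> deque: one FIFO channel per ordered pair
        self.state = state            # application state (here: an integer balance)
        self.recorded_state = None    # process state saved by the snapshot
        self.channel_state = {}       # src -> messages recorded for channel src -> self
        self.recording = set()        # incoming channels still being recorded

    def send(self, dst, msg):
        self.network[(self.pid, dst)].append(msg)

    def record_state(self):
        """Record own state, start recording all incoming channels, then send markers."""
        self.recorded_state = self.state
        for q in self.peers:
            self.channel_state[q] = []
            self.recording.add(q)
            self.send(q, MARKER)      # marker goes out before any later message

    def receive(self, src):
        """Take the next message from channel src -> self and apply the receiving rule."""
        msg = self.network[(src, self.pid)].popleft()
        if msg == MARKER:
            if self.recorded_state is None:
                self.record_state()           # first marker: record state now
            self.recording.discard(src)       # state of channel src -> self is final
        else:
            if src in self.recording:
                self.channel_state[src].append(msg)  # message was in flight at snapshot time
            self.state += msg                 # application behavior: add the transferred amount


network = {("p1", "p2"): deque(), ("p2", "p1"): deque()}
p1 = Process("p1", ["p2"], network, state=10)
p2 = Process("p2", ["p1"], network, state=20)

p2.state -= 5
p2.send("p1", 5)     # 5 units are in flight on channel p2 -> p1
p1.record_state()    # p1 initiates the snapshot
p1.receive("p2")     # the 5 units arrive after p1 recorded its state
p2.receive("p1")     # p2 receives the marker and records its state
p1.receive("p2")     # p1 receives p2's marker; channel p2 -> p1 is closed
```

In this toy run the recorded process states plus the recorded channel contents add up to the same total (30) that the system started with, which is what a consistent snapshot should give.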


Lecture 4-8

Snapshot Example

[Figure: timelines of P1, P2, and P3 with application messages a and b and marker messages M; the numbered steps below annotate the figure]

1. P1 initiates snapshot: records its state (S1); sends Markers to P2 & P3; turns on recording for channels Ch21 and Ch31
2. P2 receives Marker over Ch12, records its state (S2), sets state(Ch12) = {}, sends Marker to P1 & P3; turns on recording for channel Ch32
3. P1 receives Marker over Ch21, sets state(Ch21) = {a}
4. P3 receives Marker over Ch13, records its state (S3), sets state(Ch13) = {}, sends Marker to P1 & P2; turns on recording for channel Ch23
5. P2 receives Marker over Ch32, sets state(Ch32) = {b}
6. P3 receives Marker over Ch23, sets state(Ch23) = {}
7. P1 receives Marker over Ch31, sets state(Ch31) = {}


Lecture 4-9

Another Example: Trading

[Figure: processes p1 and p2 connected by channels c2 and c1. p1: account $1000, widgets (none). p2: account $50, widgets 2000.]


Lecture 4-10

Execution of the Processes

Each global state lists the process states as <account, widgets> and the contents of channels c2 and c1.

1. Global state S0: p1 <$1000, 0>; p2 <$50, 2000>; c2: (empty); c1: (empty)
2. Global state S1: p1 <$900, 0>; p2 <$50, 2000>; c2: (Order 10, $100), M; c1: (empty)
3. Global state S2: p1 <$900, 0>; p2 <$50, 1995>; c2: (Order 10, $100), M; c1: (five widgets)
4. Global state S3: p1 <$900, 5>; p2 <$50, 1995>; c2: (Order 10, $100); c1: (empty)

Figure annotation: Send 5 widgets

(M = Marker Message)
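One way to read this run: p1 records <$1000, 0> before sending the marker and the order, p2 records <$50, 1995> when the marker arrives, c2 is recorded as empty, and c1 is recorded as holding the five widgets. That recorded snapshot is an interpretation of the four global states shown, not something stated on the slide. The minimal Python sketch below, with hypothetical helper names, checks that this snapshot conserves money and widgets even though it matches none of S0 to S3.

```python
def gs(p1, p2, c1, c2):
    """Build a global state from (dollars, widgets) pairs and channel contents."""
    return {
        "p1": {"dollars": p1[0], "widgets": p1[1]},
        "p2": {"dollars": p2[0], "widgets": p2[1]},
        "c1": c1,   # channel carrying widgets from p2 to p1
        "c2": c2,   # channel carrying orders from p1 to p2
    }


def totals(state):
    """Sum dollars and widgets over both processes and everything in transit."""
    parts = [state["p1"], state["p2"], *state["c1"], *state["c2"]]
    return (sum(p.get("dollars", 0) for p in parts),
            sum(p.get("widgets", 0) for p in parts))


ORDER = {"dollars": 100}   # (Order 10, $100): the payment travelling with the order
FIVE = {"widgets": 5}      # the shipment of five widgets
# Markers carry no application data, so they are left out of the channel contents.

S0 = gs((1000, 0), (50, 2000), [], [])
S1 = gs((900, 0), (50, 2000), [], [ORDER])
S2 = gs((900, 0), (50, 1995), [FIVE], [ORDER])
S3 = gs((900, 5), (50, 1995), [], [ORDER])

# Assumed recorded snapshot (see the lead-in for how it was derived).
SNAP = gs((1000, 0), (50, 1995), [FIVE], [])

actual = [S0, S1, S2, S3]
```

Every state, including the assumed snapshot, totals $1050 and 2000 widgets, yet the snapshot is not equal to any state the system actually passed through.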


Lecture 4-11

Reachability between States in Snapshot Algorithm

[Figure: states S_init, S_snap, and S_final. The actual execution e_0, e_1, ... leads from S_init (recording begins) to S_final (recording ends). Pre-snap events e'_0, e'_1, ..., e'_{R-1} lead to S_snap; post-snap events e'_R, e'_{R+1}, ... lead from S_snap to S_final.]

Let e_i and e_j be events occurring at p_i and p_j, respectively, such that e_i → e_j. The snapshot algorithm ensures that
• if e_j is in the cut then e_i is also in the cut.
• if e_j occurred before p_j recorded its state, then e_i must have occurred before p_i recorded its state.
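The cut condition can be checked mechanically. The sketch below is a minimal Python illustration under assumptions of its own: a cut is represented as the number of events included from each process, and each message is a pair of (process, event index) positions. The function name and the sample message are hypothetical.

```python
def is_consistent_cut(cut, messages):
    """Return True if every receive inside the cut has its send inside the cut.

    cut:      dict mapping process id -> number of that process's events in the cut
    messages: list of ((src, send_index), (dst, recv_index)), indices are 0-based

    Because a cut takes a prefix of each process's events, local order is
    respected automatically; only message edges need checking.
    """
    for (src, send_idx), (dst, recv_idx) in messages:
        receive_in_cut = recv_idx < cut[dst]
        send_in_cut = send_idx < cut[src]
        if receive_in_cut and not send_in_cut:
            return False
    return True


# Hypothetical message: sent as p1's second event, received as p2's first event.
messages = [(("p1", 1), ("p2", 0))]
```

A cut that includes the receive but not the send, such as `{"p1": 1, "p2": 1}`, is rejected, while `{"p1": 2, "p2": 0}` is accepted: the message is simply still in the channel.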


Lecture 4-12

Causality Violation

[Figure: timelines of P1, P2, and P3 against physical time, with logical clock values on the events and message labels “Send obj1”, “obj1.method()”, and “Send obj1”]

• A causality violation occurs when the order of messages causes an action based on information that another host has not yet received.

• When designing a distributed system, the potential for causality violations is an important consideration.


Lecture 4-13

Detecting Causality Violation

[Figure: timelines of P1, P2, and P3 against physical time, all starting at vector timestamp (0,0,0). Messages carry the timestamps (1,0,0), (2,0,0), and (2,0,2); local timestamps shown include (1,0,0), (2,0,0), (2,0,1), (2,0,2), (2,1,2), and (2,2,2).]

• A potential causality violation can be detected by vector timestamps.

• If the vector timestamp of a message is less than the local vector timestamp on arrival, there is a potential causality violation.

Violation: (1,0,0) < (2,1,2)
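The comparison used on this slide can be written out directly. The following minimal Python sketch assumes the usual vector-clock ordering (every component less than or equal, and the vectors not identical) and the standard receive rule (component-wise maximum, then increment the receiver's own entry); the function names are made up for illustration.

```python
def vt_less(a, b):
    """a < b for vector timestamps: a[k] <= b[k] for every k, and a != b."""
    return all(x <= y for x, y in zip(a, b)) and tuple(a) != tuple(b)


def potential_violation(msg_ts, local_ts):
    """Flag a message whose timestamp is already in the receiver's causal past."""
    return vt_less(msg_ts, local_ts)


def on_receive(local_ts, msg_ts, i):
    """Standard receive rule: component-wise max, then tick the receiver's entry i."""
    merged = [max(x, y) for x, y in zip(local_ts, msg_ts)]
    merged[i] += 1
    return tuple(merged)


# Receiver is the second process (index 1), starting at (0,0,0).
after_first = on_receive((0, 0, 0), (2, 0, 2), 1)      # message stamped (2,0,2) arrives first
late = potential_violation((1, 0, 0), after_first)     # message stamped (1,0,0) arrives late
after_second = on_receive(after_first, (1, 0, 0), 1)
```

With the timestamps from the slide, the receiver moves to (2,1,2) after the first message, the late message (1,0,0) is flagged, and the receiver ends at (2,2,2).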


Lecture 4-14

Multicast Communication


Lecture 4-15

Communication Modes in DS
• Unicast (best effort or reliable)
  – Messages are sent from exactly one process to one process.
  – Best effort guarantees that if a message is delivered, it is intact.
  – Reliable guarantees delivery of messages.
• Broadcast
  – Messages are sent from exactly one process to all processes on the network.
  – Reliable broadcast protocols are not practical.
• Multicast
  – Messages are sent from exactly one process to several processes on the network.
  – Reliable multicast can be implemented “above” (i.e., “using”) a reliable unicast.


Lecture 4-17

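What are we designing in this class?

[Figure: within one process p, the application (at process p) sits above the MULTICAST PROTOCOL layer; the application sends multicasts down, and the protocol turns incoming messages into multicast deliveries. Across processes p1, p2, and p3, one process sends a multicast and the others deliver it.]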


Lecture 4-18

Open and closed groups

[Figure: two panels, “Closed group” and “Open group”]


Lecture 4-19

Basic Multicast (B-multicast)

• A straightforward way to implement B-multicast is to use a reliable one-to-one send operation:

    B-multicast(g, m): for each process p in g, send(p, m).
    On receive(m) at p: B-deliver(m) at p.

• A “correct” process = a “non-faulty” process

• The basic multicast (B-multicast) primitive guarantees that a correct (i.e., non-faulty) process will eventually deliver the message, as long as the sender (the multicasting process) does not crash.

  – Can we provide reliability even when the sender crashes (after it has sent the multicast)?
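The B-multicast pseudocode maps almost line for line onto code. Below is a minimal in-memory Python sketch: the `Node` class and its `inbox` queue are hypothetical stand-ins for processes and a reliable one-to-one channel, and the `crash_after` parameter is an invented knob that simulates the sender failing partway through the loop.

```python
from collections import deque


class Node:
    def __init__(self, name):
        self.name = name
        self.inbox = deque()    # stands in for a reliable one-to-one channel
        self.delivered = []     # messages handed up to the application

    def send(self, dest, m):
        """Reliable one-to-one send (here: append to the destination's queue)."""
        dest.inbox.append(m)

    def b_multicast(self, group, m, crash_after=None):
        """B-multicast(g, m): send m to every process in the group.

        crash_after (hypothetical) stops the loop after that many sends,
        imitating a sender that crashes in the middle of the multicast.
        """
        for count, p in enumerate(group):
            if crash_after is not None and count >= crash_after:
                return
            self.send(p, m)

    def on_receive(self):
        """On receive(m): B-deliver(m)."""
        while self.inbox:
            self.b_deliver(self.inbox.popleft())

    def b_deliver(self, m):
        self.delivered.append(m)


a, b, c = Node("a"), Node("b"), Node("c")
group = [a, b, c]

a.b_multicast(group, "m1")                  # sender stays up: everyone gets m1
a.b_multicast(group, "m2", crash_after=1)   # sender "crashes" after the first send
for node in group:
    node.on_receive()
```

In this run every node delivers `m1`, but only the first group member delivers `m2`, which shows why the B-multicast guarantee is conditional on the sender not crashing.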


Lecture 4-20

Summary

• Multicast is the operation of sending one message to multiple processes
• Reliable multicast algorithm built using reliable unicast
• Introduction to ordering – FIFO, total, causal

Next week
• Read Section 12.4
• Homework 1 due September 17
  – Come by office hours with your questions