CPSC 668 Set 12: Causality 1 CPSC 668 Distributed Algorithms and Systems Fall 2009 Prof. Jennifer Welch


CPSC 668 Distributed Algorithms and Systems

Fall 2009

Prof. Jennifer Welch


Logical Clocks Motivation

• In an asynchronous system, often cannot tell which of two events occurred before the other:

[Figure: Example A and Example B, two space-time diagrams of processors p0 and p1 exchanging messages m0 and m1.]


Logical Clocks Motivation

• In Example A, processors cannot tell which message was sent first. Probably not important.

• In Example B, processors can tell which message was sent first. Might be important.

• Let's try to determine relative ordering of some (not all) events.


Happens Before Partial Order

• Given an execution, computation event a happens before computation event b, denoted a → b, if

– Rule 1: a and b occur at the same processor and a precedes b, or

– Rule 2: a results in sending m and b includes receipt of m, or

– Rule 3: there exists computation event c such that a → c and c → b (transitive closure)
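The three rules translate directly into a small computation. The following is a minimal sketch, assuming events are given as string labels and that the caller lists each processor's events in local order together with the (send, receive) pairs. The function and parameter names are illustrative, not from the slides.

```python
from itertools import product

def happens_before(program_order, messages):
    """Return the set of pairs (a, b) such that a -> b.

    program_order: one list of events per processor, in local order.
    messages: (send_event, receive_event) pairs.
    """
    events = [e for proc in program_order for e in proc]
    hb = set(messages)                      # message rule: send -> receive
    for proc in program_order:
        # Same-processor rule: each event precedes the later local events.
        for idx, a in enumerate(proc):
            for b in proc[idx + 1:]:
                hb.add((a, b))
    # Transitivity rule: closure, with the intermediate event c
    # varying in the outermost position (Floyd-Warshall order).
    for c, a, b in product(events, repeat=3):
        if (a, c) in hb and (c, b) in hb:
            hb.add((a, b))
    return hb

# Hypothetical execution: p0 runs a then d, p1 runs b then c,
# one message goes from a to b and another from c to d.
hb = happens_before([["a", "d"], ["b", "c"]], [("a", "b"), ("c", "d")])
print(sorted(hb))
```

In this hypothetical execution the closure contains six pairs. Two of them, a → c and b → d, arise only through transitivity.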


Happens Before Partial Order

• Happens before means that information can flow from a to b, i.e., that a might cause b.

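[Figure: space-time diagram of p0 and p1 with messages m0 and m1 and events a, b, c, d.]

Relations shown: b → c, a → b, a → c, c → d, a → d, b → d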


Concurrent Events

• If a does not happen before b, and b does not happen before a, then a and b are concurrent, denoted a || b.


Happens Before Example

Rule 1: a → b, c → d → e → f, g → h → i

Rule 2: a → d, g → e, f → i

Rule 3: a → e, c → i, …

Concurrent: h || e, …
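The relations in this example can be checked by treating the Rule 1 and Rule 2 pairs as edges of a directed graph and asking about reachability. This is only a sketch: the dictionary `edges` and the helper names are assumptions made for illustration, and only consecutive same-processor pairs are listed because longer chains follow from reachability.

```python
# Direct edges from the example: Rule 1 (same processor, consecutive events)
# and Rule 2 (message send -> message receipt).
edges = {
    "a": ["b", "d"], "b": [],
    "c": ["d"], "d": ["e"], "e": ["f"], "f": ["i"],
    "g": ["h", "e"], "h": ["i"], "i": [],
}

def reaches(a, b):
    """True if a -> b: b is reachable from a along one or more edges."""
    stack, seen = list(edges[a]), set()
    while stack:
        e = stack.pop()
        if e == b:
            return True
        if e not in seen:
            seen.add(e)
            stack.extend(edges[e])
    return False

def concurrent(a, b):
    """a || b: neither event happens before the other."""
    return a != b and not reaches(a, b) and not reaches(b, a)

print(reaches("a", "e"), reaches("c", "i"), concurrent("h", "e"))
```

The three printed values correspond to a → e, c → i, and h || e from the slide.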


Logical Clocks

• Logical clocks are values assigned to events to provide some information about the order in which events happen.

• Goal is to assign an integer L(e) to each computation event e in an execution such that if a → b, then L(a) < L(b).


Logical Timestamps Algorithm

• Each pi keeps a counter (logical timestamp) Li, initially 0

• Every message pi sends is timestamped with current value of Li

• Li is incremented at each step to be greater than
– its current value
– the timestamps on all messages received at this step

• If a is an event at pi, then assign L(a) to be the value of Li at the end of a.
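The algorithm above can be sketched as a small class. Two assumptions are made here that the slide does not state: the counter moves to the smallest value satisfying both conditions (maximum plus one), and a message sent during a step carries the value reached at the end of that step. The class and method names are illustrative.

```python
class LogicalClock:
    """The counter Li kept by one processor, initially 0."""

    def __init__(self):
        self.value = 0

    def step(self, received=()):
        """Take one computation step.

        received: timestamps on the messages received at this step.
        Returns the new value of Li, which serves as L(event) and as the
        timestamp on any message sent in this step.
        """
        self.value = max([self.value, *received]) + 1
        return self.value

# Replay of a three-processor execution with messages a->d, g->e, f->i.
p0, p1, p2 = LogicalClock(), LogicalClock(), LogicalClock()
a = p0.step()
b = p0.step()
c = p1.step()
g = p2.step()
d = p1.step([a])
e = p1.step([g])
f = p1.step()
h = p2.step()
i = p2.step([f])
print(a, b, c, d, e, f, g, h, i)
```

Every message edge ends at a strictly larger value, for example L(f) = 4 < 5 = L(i).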


Logical Timestamps Example

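[Figure: the execution from the Happens Before Example, with each event labeled by its logical timestamp.]

p0: L(a) = 1, L(b) = 2

p1: L(c) = 1, L(d) = 2, L(e) = 3, L(f) = 4

p2: L(g) = 1, L(h) = 2, L(i) = 5

a → b : L(a) = 1 < 2 = L(b)

f → i : L(f) = 4 < 5 = L(i)

a → e : L(a) = 1 < 3 = L(e)

etc.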


Getting a Total Order

• If a total order is required, break ties using ids.

• In the example, L(a) = (1,0), L(c) = (1,1), etc.

• Timestamps are ordered lexicographically.

• In the example, L(a) < L(c).
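Python tuples already compare lexicographically, so the tie-breaking rule can be illustrated in a few lines. The pairs for a and c are the ones on the slide. The pairs for g and b are filled in from the same example (L(g) = 1 at p2, L(b) = 2 at p0) and should be read as an illustration.

```python
# (logical timestamp, processor id) pairs.
stamps = {"b": (2, 0), "g": (1, 2), "c": (1, 1), "a": (1, 0)}

# Sorting by the pair compares timestamps first and ids only on ties.
order = sorted(stamps, key=stamps.get)
print(order)
```

No two events share a pair, because events at the same processor get distinct timestamps, so the resulting order is total.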


Drawback of Logical Clocks

• a → b implies L(a) < L(b), but L(a) < L(b) does not necessarily imply a → b.

• In previous example, L(g) = 1 and L(b) = 2, but g does not happen before b.

• Reason is that "happens before" is a partial order, but logical clock values are integers, which are totally ordered.


Vector Clocks

• Generalize logical clocks to provide non-causality information as well as causality information.

• Implement with values drawn from a partially ordered set instead of a totally ordered set.

• Assign a value V(e) to each computation event e in an execution such that a → b if and only if V(a) < V(b).


Vector Timestamps Algorithm

• Each pi keeps an n-vector Vi, initially all 0's

• Entry j in Vi is pi's estimate of how many steps pj has taken

• Every msg pi sends is timestamped with current value of Vi

• At every step, increment Vi[i] by 1

• When receiving a message with vector timestamp T, update Vi's components j ≠ i so that Vi[j] = max(T[j],Vi[j])

• If a is an event at pi, then assign V(a) to be value of Vi at end of a.
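The update rules above can be sketched as follows. The class and method names are illustrative, and the sketch assumes that a message sent during a step carries the vector as it stands at the end of that step.

```python
class VectorClock:
    """The n-vector Vi kept by processor pi, initially all zeros."""

    def __init__(self, i, n):
        self.i = i
        self.v = [0] * n

    def step(self, received=()):
        """Take one computation step.

        received: vector timestamps T on the messages received at this step.
        Returns V(event) as a tuple; the same value is the timestamp on any
        message sent in this step.
        """
        self.v[self.i] += 1                      # count pi's own step
        for t in received:
            for j, tj in enumerate(t):
                if j != self.i:                  # only components j != i
                    self.v[j] = max(tj, self.v[j])
        return tuple(self.v)

# Replay of a three-processor execution with messages a->d, g->e, f->i.
p0, p1, p2 = VectorClock(0, 3), VectorClock(1, 3), VectorClock(2, 3)
a = p0.step()
b = p0.step()
c = p1.step()
g = p2.step()
d = p1.step([a])
e = p1.step([g])
f = p1.step()
h = p2.step()
i = p2.step([f])
print(d, e, i)
```

The replay gives V(d) = (1,2,0), V(e) = (1,3,1), and V(i) = (1,4,3): each receipt merges the sender's knowledge into the receiver's vector.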


Manipulating Vector Timestamps

Let V and W be two n-vectors of integers.

Equality: V = W iff V[i] = W[i] for all i.

Example: (3,2,4) = (3,2,4)

Less than or equal: V ≤ W iff V[i] ≤ W[i] for all i.

Example: (2,2,3) ≤ (3,2,4) and (3,2,4) ≤ (3,2,4)

Less than: V < W iff V ≤ W but V ≠ W.

Example: (2,2,3) < (3,2,4)

Incomparable: V || W iff !(V ≤ W) and !(W ≤ V).

Example: (3,2,4) || (4,1,4)
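These definitions are short enough to write out directly. A minimal sketch, assuming vectors are represented as equal-length tuples of integers; the function names are illustrative.

```python
def leq(v, w):
    """V <= W iff V[i] <= W[i] for all i."""
    return all(x <= y for x, y in zip(v, w))

def less(v, w):
    """V < W iff V <= W but V != W."""
    return leq(v, w) and v != w

def incomparable(v, w):
    """V || W iff neither V <= W nor W <= V."""
    return not leq(v, w) and not leq(w, v)

print(leq((2, 2, 3), (3, 2, 4)), less((3, 2, 4), (3, 2, 4)),
      incomparable((3, 2, 4), (4, 1, 4)))
```

Note that Python's built-in `<` on tuples is lexicographic, so it must not be used in place of `less` here.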


Manipulating Vector Timestamps

• The partial order on n-vectors just defined is not the same as lexicographic ordering.

• Lexicographic ordering is a total order on vectors.

• Consider (3,2,4) vs. (4,1,4) in the two approaches.
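The difference shows up on exactly this pair. The sketch below relies on the fact that Python compares tuples lexicographically; the variable names are illustrative.

```python
v, w = (3, 2, 4), (4, 1, 4)

# Lexicographic (total) order: decided by the first differing entry, 3 < 4.
lexicographic_less = v < w

# Component-wise (partial) order: every entry must agree on the direction.
componentwise_leq = all(x <= y for x, y in zip(v, w))
componentwise_geq = all(x >= y for x, y in zip(v, w))

print(lexicographic_less, componentwise_leq, componentwise_geq)
```

Lexicographically (3,2,4) comes before (4,1,4), while component-wise neither vector is ≤ the other, so the two are incomparable.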


Vector Timestamps Example

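[Figure: the execution from the Happens Before Example, with each event labeled by its vector timestamp.]

p0: V(a) = (1,0,0), V(b) = (2,0,0)

p1: V(c) = (0,1,0), V(d) = (1,2,0), V(e) = (1,3,1), V(f) = (1,4,1)

p2: V(g) = (0,0,1), V(h) = (0,0,2), V(i) = (1,4,3)

V(g) = (0,0,1) and V(b) = (2,0,0), which are incomparable.

Compare with logical clocks L(g) = 1 and L(b) = 2.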


Correctness of Vector Timestamps

Theorem (6.5 & 6.6): Vector timestamps implement vector clocks.

Proof: First, show a → b implies V(a) < V(b).

Case 1: a and b both occur at pi, a first. Since Vi increases at each step, V(a) < V(b).


Correctness of Vector Timestamps

Case 2: a occurs at pi and causes m to be sent, while b occurs at pj and includes the receipt of m.

– During b, pj updates its vector timestamp in such a way that V(a) ≤ V(b).

– pi's estimate of number of steps taken by pj is never an over-estimate. Since m is not received before it is sent, pi's estimate of the number of steps taken by pj when a occurs is less than the number of steps taken by pj when b occurs. So V(a)[j] < V(b)[j].

– Thus V(a) < V(b).


Correctness of Vector Timestamps

Case 3: There exists c such that a → c and c → b. By induction (from Cases 1 and 2) and transitivity of <, V(a) < V(b).

Next show V(a) < V(b) implies a → b. Equivalent to showing !(a → b) implies !(V(a) < V(b)).


Correctness of Vector Timestamps

• Suppose a occurs at pi, b occurs at pj, and a does not happen before b.

• Let V(a)[i] = k.

• Since a does not happen before b, there is no chain of messages from pi to pj originating at pi's k-th step or later and ending at pj before b.

• Thus V(b)[i] < k.

• Thus !(V(a) < V(b)).


Size of Vector Timestamps

• Vector timestamps are big:
– n components in each one
– values in the components grow without bound

• Is there a more efficient way to implement vector clocks?

• Answer is NO, at least under some conditions.


Vector Clock Size Lower Bound

Theorem (6.9): Any implementation of vector clocks using vectors of real numbers requires vectors of length n (number of processors).

Proof: For any value of n, consider this execution:


Example Bad Execution

For n = 4:


Vector Clock Size Lower Bound

Claim 1: a_{i+1} || b_i for all i (with wraparound)

Proof: Since each proc. does all sends before any receives, there is no transitivity. Also p_{i+1} does not send to p_i.

Claim 2: a_{i+1} → b_j for all j ≠ i.

Proof: If j = i+1, obvious.

If j ≠ i+1, then p_{i+1} sends to p_j:


Vector Clock Size Lower Bound

• Suppose in contradiction, there is a way to implement vector clocks with k-vectors of reals, where k < n.

• By Claim 1, a_{i+1} || b_i

=> V(a_{i+1}) and V(b_i) are incomparable

=> V(a_{i+1}) is larger than V(b_i) in some coordinate h(i)

=> h : {0,…,n-1} → {0,…,k-1}


Vector Clock Size Lower Bound

• Since k < n, the function h is not 1-1. So there exist distinct i and j such that h(i) = h(j). Let r be this common value of h.

[Figure: the timestamps V(a_0), V(a_1), …, V(a_{i+1}), …, V(a_{j+1}), …, V(a_{n-1}) set against V(b_0), …, V(b_i), …, V(b_j), …, V(b_{n-2}), V(b_{n-1}). Each V(a_{i+1}) is larger than V(b_i) in component h(i), for i = 0, …, n-1 (with wraparound). Two of these components are the same, say h(i) = h(j) = r.]


Vector Clock Size Lower Bound

[Figure: V(a_{i+1}) > V(b_i) in component r, and V(a_{j+1}) > V(b_j) in component r. V(a_{i+1}) ≤ V(b_j) in all components, since a_{i+1} → b_j. Hence V(a_{j+1}) > V(b_i) in component r, which contradicts a_{j+1} → b_i.]


Vector Clock Size Lower Bound

• So V(a_{i+1}) is larger than V(b_i) in coordinate r and V(a_{j+1}) is larger than V(b_j) in coordinate r also.

• V(a_{j+1})[r] > V(b_j)[r]   by def. of r
             ≥ V(a_{i+1})[r]   by Claim 2 (a_{i+1} → b_j) & correctness of V
             ≥ V(b_i)[r]   by def. of r

• Thus !(V(a_{j+1}) < V(b_i)), contradicting Claim 2 (a_{j+1} → b_i) and assumed correctness of V.


Application of Causality: Consistent Cuts

• Consider an asynchronous message passing system with
– FIFO message delivery per channel
– at most one msg received per computation step

• Number the computation steps of each processor 1,2,3,…

• A cut of an execution is K = (k0,…,kn-1), where ki indicates number of computation steps taken by pi


Consistent Cuts

In a consistent cut K = (k0,…,kn-1), if step s of pj happens before step ki of pi, then s ≤ kj.

[Figure: a two-processor execution with some cuts marked.]

(1,3) and (2,4) are consistent.

(3,6) is inconsistent: step 4 by p0 happens before step 6 of p1, but 4 is greater than 3.
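One way to test the condition mechanically is to look at messages only: a cut is consistent exactly when every message received inside the cut was also sent inside it. The sketch below assumes messages are given as (sender, send step, receiver, receive step) tuples. The single message in the example is a hypothetical stand-in for the figure, chosen so that it agrees with the three verdicts stated on the slide.

```python
def is_consistent(cut, messages):
    """Check whether a cut is consistent.

    cut[i] is the number of computation steps of p_i included in the cut.
    messages: (sender, send_step, receiver, receive_step) tuples.
    """
    return all(
        send_step <= cut[sender]
        for sender, send_step, receiver, receive_step in messages
        if receive_step <= cut[receiver]      # receipt lies inside the cut
    )

# Hypothetical execution: p0 sends in its step 4, p1 receives in its step 6.
messages = [(0, 4, 1, 6)]
print(is_consistent((1, 3), messages),
      is_consistent((2, 4), messages),
      is_consistent((3, 6), messages))
```

The cut (3,6) fails because it includes the receipt at step 6 of p1 but not the send at step 4 of p0.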


Finding a Recent Consistent Cut

Problem Version 1: Processors all given a cut K and must find a maximal consistent cut that is ≤ K.

Application: Logging-based crash recovery.
– Procs periodically write their state to stable storage
– When a proc recovers from a crash, it tries to recover to latest logged state, but needs to coordinate with other procs


Vector Clocks Solution

• Implement vector clocks using vector timestamps appended to application msgs.

• Store the vector clock of each computation step in a local array store[1,…]

• When pi is given input cut K:

for x := K[i] downto 1 do

    if store[x] ≤ K then return x   // entry for pi of global answer

return 0   // no stored step of pi is ≤ K
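The loop can be written out as a runnable sketch. The stored timestamps below are those of the three-processor Vector Timestamps Example; the input cut K = (2,3,3), the function names, and the use of index 0 as unused padding in each store are assumptions made for illustration.

```python
def leq(v, w):
    """Component-wise V <= W."""
    return all(x <= y for x, y in zip(v, w))

def local_answer(store, i, K):
    """Entry for p_i of the maximal consistent cut that is <= K.

    store[x] is the vector timestamp of p_i's x-th step (store[0] unused).
    """
    for x in range(K[i], 0, -1):
        if leq(store[x], K):
            return x
    return 0          # no step of p_i fits; fall back to the initial state

stores = [
    [None, (1, 0, 0), (2, 0, 0)],                        # p0: a, b
    [None, (0, 1, 0), (1, 2, 0), (1, 3, 1), (1, 4, 1)],  # p1: c, d, e, f
    [None, (0, 0, 1), (0, 0, 2), (1, 4, 3)],             # p2: g, h, i
]
K = (2, 3, 3)
answer = tuple(local_answer(stores[i], i, K) for i in range(3))
print(answer)
```

Here p2 backs off from step 3 to step 2: its third step received a message sent in step 4 of p1, which K excludes.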


What About Channel State?

• Processor states are not sufficient to capture entire system state.

• Messages in transit must be calculated.

• Solution here requires
– additional storage (number of messages)
– additional computation at recovery time (involving replaying original execution to capture messages sent but not received)


Another Take on Recent Consistent State

Problem Version 2: A subset of procs initiate (at arbitrary times) trying to find a consistent cut that includes the state of at least one of the initiators when it started.

• Called a distributed snapshot.

• Snapshot info can be collected at one proc. and then analyzed.

Application: termination detection


Marker Algorithm

• Instead of adding extra information on each application message, insert control messages ("markers") into the channels.

• Code for pi:

initially answer = -1 and num = 0

when application msg arrives:
    num++; do application action

when marker arrives or when initiating snapshot:
    if answer = -1 then
        answer := num // pi's part of final answer
        send marker to all neighbors
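The code for pi can be exercised in a tiny simulation. This is a sketch under several assumptions that are not on the slide: FIFO channels are modeled as deques keyed by (source, destination), the class and method names are invented for illustration, and application sends are issued directly by the driver code rather than as part of an application action.

```python
from collections import deque

MARKER = "marker"

class Proc:
    def __init__(self, pid, neighbors, channels):
        self.pid, self.neighbors, self.channels = pid, neighbors, channels
        self.answer, self.num = -1, 0

    def send(self, dest, msg):
        self.channels[(self.pid, dest)].append(msg)   # FIFO channel pid -> dest

    def record(self):
        """Run when a marker arrives or when initiating a snapshot."""
        if self.answer == -1:
            self.answer = self.num        # this processor's part of the answer
            for q in self.neighbors:
                self.send(q, MARKER)

    def receive(self, src):
        msg = self.channels[(src, self.pid)].popleft()
        if msg == MARKER:
            self.record()
        else:
            self.num += 1                 # then do the application action

channels = {(0, 1): deque(), (1, 0): deque()}
p0, p1 = Proc(0, [1], channels), Proc(1, [0], channels)
p0.send(1, "x")     # application message already in the channel
p0.record()         # p0 initiates: marker is queued behind "x"
p1.receive(0)       # "x" arrives, num becomes 1
p1.receive(0)       # marker arrives, p1 records and sends its own marker
p0.receive(1)       # marker arrives at p0, which has already recorded
print(p0.answer, p1.answer)
```

Because the channel is FIFO, the marker cannot overtake "x", so p1 counts "x" before recording its part of the cut.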


What About Channel States?

• pi records sequence of msgs received from pj between the time pi records its answer and the time pi gets the marker from pj

• These are the msgs in transit from pj to pi in the cut returned by the algorithm.
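The recording rule can be sketched from the point of view of a single processor pi that is handed the sequence of arrivals it observes. The function name, the `arrivals` format, and the `initiator` flag are assumptions made for illustration.

```python
MARKER = "marker"

def channel_states(arrivals, neighbors, initiator=False):
    """Messages in transit to p_i in the snapshot, per incoming channel.

    arrivals: (src, msg) pairs in the order p_i receives them, where msg is
    MARKER or an application payload.
    initiator: True if p_i records its answer before any arrival.
    """
    recorded = initiator
    open_channels = set(neighbors) if recorded else set()
    in_transit = {q: [] for q in neighbors}
    for src, msg in arrivals:
        if msg == MARKER:
            if not recorded:              # first marker: record answer now
                recorded = True
                open_channels = set(neighbors)
            open_channels.discard(src)    # marker from src ends its recording
        elif src in open_channels:
            in_transit[src].append(msg)   # arrived after answer, before marker
    return in_transit

arrivals = [(1, "m1"), (1, MARKER), (2, "m2"), (2, "m3"), (2, MARKER), (1, "m4")]
print(channel_states(arrivals, [1, 2], initiator=True))
```

In this run m1 is in transit from p1, m2 and m3 are in transit from p2, and m4 is not recorded because it arrives after p1's marker.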