CPSC 668 Set 13: Clocks 1
CPSC 668: Distributed Algorithms and Systems
Fall 2009
Prof. Jennifer Welch
Hardware Clocks
• Suppose processors have access to some approximation of real time.
• Mechanism is through hardware clocks, one at each processor.
• pi's hardware clock HCi is modeled as a function from real times to clock times.
• Consider timed executions: associate a real time with each event (increasing).
• During pi's computation event at real time t, the value of HCi(t) can be used as input to pi's transition function.
Possible H/W Clock Properties
• HCi is increasing
– a minimal property
• HCi(t) = number of steps taken by pi through real time t
– easy to implement in software
• HCi(t) = t
– perfect
• HCi(t) = t + ci
– h/w clock runs at same rate as real time but offset
• HCi(t) = ai·t + bi
– h/w clock drifts away from real time
Adjusted Clocks
• Clocks are particularly useful if they are synchronized.
• But typically hardware clocks cannot be changed.
• Instead, consider adjusted clock, obtained by adding some value to the hardware clock value:
ACi(t) = HCi(t) + adji(t)
• adji is adjustment variable of pi
Measuring Clock Differences
• How to evaluate how close together clocks are?
• Skew: how far apart clock times are at a given real time, or
• Precision: how far apart in real time clocks reach same clock time
• These are the same when there is no drift…
Skew and Precision
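[Figure: adjusted clocks ACi and ACj plotted as clock time against real time. Skew is the vertical gap between the two clocks at a real time t; precision is the horizontal gap between the real times at which they reach a clock time T.]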
Synchronizing Clocks
If hardware clocks don't drift, then once clocks are adjusted, they stay the same distance apart.
Achieving ε-synchronized clocks:
• Termination: no processor assigns to its adj variable after some real time tf
• ε-bounded skew: for all i and j, and all real times t ≥ tf, |ACi(t) - ACj(t)| ≤ ε.
Bounded Message Delays
• We'll study the clock synchronization problem in message passing with bounded delays.
• Define a timed execution to be admissible if:
– every processor takes an infinite number of steps (no failures)
– every message has delay in the range [d–u,d]; call u the uncertainty
Two Processor Algorithm
• Consider this simple algorithm:
• p0 uses its hardware clock as its adjusted clock
• p1 adopts (its best estimate of) p0's adjusted clock as its adjusted clock
• How does p1 do this? p0 sends its clock time to p1 in a message
• How to handle uncertain delay? Assume delay is in the middle of the range: d – u/2
Code for Two Processor Algorithm
p0:
adj0 := 0
send HC0 to p1
p1:
when receive T from p0:
adj1 := (T + d – u/2) – HC1
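The two-processor rule above can be simulated with a minimal Python sketch. It assumes drift-free hardware clocks of the form HCi(t) = t + offset; the function name and the offset parameters are hypothetical and are not part of the slides.

```python
def two_processor_skew(delay, d, u, hc0_offset=0.0, hc1_offset=0.0, send_time=0.0):
    """Return |AC0 - AC1| after p1 adjusts, for one message with the given delay.

    Assumption: drift-free hardware clocks, HCi(t) = t + offset_i.
    """
    assert d - u <= delay <= d, "delay must lie in [d - u, d]"
    T = send_time + hc0_offset                # HC0 value carried in the message
    receive_time = send_time + delay
    hc1_at_receive = receive_time + hc1_offset
    adj1 = (T + d - u / 2) - hc1_at_receive   # p1's rule from the slide
    # Compare adjusted clocks at the receive time; without drift the gap stays constant.
    ac0 = receive_time + hc0_offset           # adj0 = 0
    ac1 = hc1_at_receive + adj1
    return abs(ac0 - ac1)


if __name__ == "__main__":
    for delay in (6.0, 8.0, 10.0):
        print(delay, two_processor_skew(delay, d=10.0, u=4.0, hc0_offset=3.0, hc1_offset=100.0))
```

The initial clock offsets cancel out: the skew depends only on how far the actual delay is from the assumed delay d – u/2.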
Analysis of Two Proc. Algorithm
• What is the skew attained by the algorithm?
• If message really did take d – u/2 time to arrive, skew is 0 (best case).
• If message took d or d – u time, skew is u/2 (worst case).
• Can we do better, perhaps with a more complicated algorithm?
Proving Lower Bounds on Skew
• A useful technique for proving lower bounds on skew for clock synchronization is that of shifting executions.
• To define it, we first need to look at some modeling issues.
Modeling Executions: Two Ways
• We've been modeling an execution as a sequence of events.
[Figure: a single timeline of events in sequence, e.g. step by p2, step by p0, step by p1.]
Modeling Executions: Two Ways
• An alternative approach is to model with a set of sequences, one sequence per processor.
[Figure: three separate timelines, one each for p0, p1, and p2.]
Modeling Executions: Two Ways
• Having one sequence per processor is technically convenient for lower bound proofs
• Can convert back and forth between the two modeling styles
Processor Views
• A view of processor pi is:
– an initial state of pi
– a sequence of events (computation and delivery) occurring at pi
– a hardware clock value for each event
• A timed view of pi is a view with a real time associated with each event (increasing)
Views vs. Timed Views
Two different timed views with the same (untimed) view:
Timed view 1
real times: 11:15 11:20 11:45 11:52
h/w clock times: 3:00 3:05 3:10 4:00
Timed view 2
real times: 8:08 9:00 9:10 10:10
h/w clock times: 3:00 3:05 3:10 4:00
Extracting Views from Executions
• Given a timed execution, straightforward to extract timed views for all the processors:
– get initial state of a processor from the initial configuration
– get sequence of events occurring at that processor and their times from the events in the execution
Merging Views into an Execution
Given a set of timed views, one per proc:
1. initial config is combination of initial states
2. obtain sequence of events by interleaving events from views in real-time order (break ties with ids)
3. apply events in order to initial config to obtain the other configs.
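Step 2 of the merge above (interleaving by real time, ties broken by ids) might look like the following Python sketch. The `TimedEvent` type and its fields are hypothetical names chosen for illustration; steps 1 and 3 (building configurations) are not modeled.

```python
from typing import NamedTuple


class TimedEvent(NamedTuple):
    real_time: float
    proc_id: int
    label: str  # hypothetical description of the event


def merge_views(views):
    """Interleave per-processor timed views into one event sequence.

    views: dict mapping processor id -> list of TimedEvent in increasing real time.
    Events are ordered by real time, with ties broken by processor id.
    """
    all_events = [event for events in views.values() for event in events]
    return sorted(all_events, key=lambda e: (e.real_time, e.proc_id))


if __name__ == "__main__":
    views = {
        1: [TimedEvent(1.0, 1, "a"), TimedEvent(3.0, 1, "b")],
        0: [TimedEvent(1.0, 0, "c"), TimedEvent(2.0, 0, "d")],
    }
    print([e.label for e in merge_views(views)])
```

Because real times increase within each view, the sort never reorders two events of the same processor.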
But is Result Admissible?
• The result might not be admissible.
• Biggest issue is the message delays: must be in range d – u to d.
Why Care About Views?
To prove lower bounds on skew:
1. Start with a (carefully chosen) timed execution
2. Modify processors' views (in a carefully chosen way)
3. Merge resulting views to get a new execution:
• check that it is admissible
• show that it violates some bound
Shifting Timed Executions
Given timed execution α and real numbers x0, x1, …, xn-1,
shift(α,(x0, x1, …, xn-1)) is created by:
• extracting timed views v0, …, vn-1 from α
• adding xi to the real time of each event in each vi
• merging the resulting timed views
Shifting Examples
[Figure: three timelines pairing real times with h/w clock times. Original: HCi(t) = T at real time t. After a shift by a positive amount and after a shift by a negative amount: HCi(t + x) = T at real time t + x.]
Facts About Shifted Executions
Result of shifting and merging might not be admissible: could shift receipt of a message earlier than its sending, for example.
But these facts hold:
1. New hardware clock HC'i satisfies:
HC'i(t) = HCi(t – xi) = HCi(t) – xi
2. Delay of a msg from pi to pj goes from δ to δ – xi + xj, since msg is sent xi later and received xj later
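The second fact gives a mechanical way to check whether a shift keeps message delays in range. The Python sketch below uses hypothetical helper names; it checks only the delay condition of admissibility (not the infinite-steps condition).

```python
def shifted_delay(delay, x_sender, x_receiver):
    """Delay of a message after shifting the sender by x_sender and the receiver by x_receiver."""
    return delay - x_sender + x_receiver


def delays_in_range(delays, shifts, d, u):
    """Check that every shifted delay stays within [d - u, d].

    delays: dict mapping (sender, receiver) -> original delay.
    shifts: list where shifts[i] is the amount x_i added to p_i's real times.
    """
    return all(
        d - u <= shifted_delay(delta, shifts[i], shifts[j]) <= d
        for (i, j), delta in delays.items()
    )


if __name__ == "__main__":
    delays = {(0, 1): 6, (1, 0): 10}  # d = 10, u = 4
    print(delays_in_range(delays, [-4, 0], d=10, u=4))  # shift p0 earlier by u
    print(delays_in_range(delays, [4, 0], d=10, u=4))   # shift p0 later by u
```

Shifting p0 earlier by u swaps the two delays and stays in range, while shifting it later pushes the p0-to-p1 delay below d – u.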
Lower Bound for 2 Processors
• Let A be any 2-proc. alg that achieves ε-clock synchronization.
• Let α be the timed admissible execution of A in which
– every msg from p0 to p1 has delay d – u
– every msg from p1 to p0 has delay d
• After A terminates in α,
(1) AC0 ≥ AC1 – ε
Lower Bound for 2 Processors
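[Figure: timelines for p0 and p1 before and after shifting p0 backwards by u. Before: delay d – u from p0 to p1 and delay d from p1 to p0. After: the two delays are exchanged.]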
Lower Bound for 2 Processors
• Let α' = shift(α,(–u,0)).
• Shift p0 earlier by u, leave p1 alone.
• In α',
– every msg from p0 to p1 has delay d
– every msg from p1 to p0 has delay d – u
• After A terminates in α',
AC'1 ≥ AC'0 – ε
Lower Bound for 2 Processors
AC'1 ≥ AC'0 – ε implies
AC1 ≥ (AC0 + u) – ε, since AC'1 = AC1 and AC'0 = AC0 + u
Remember inequality (1):
AC0 ≥ AC1 – ε ≥ (AC0 + u – ε) – ε (from just above)
Implies ε ≥ u/2
Star Algorithm for n Processors
• Assume the network topology is a clique and message delay range for every edge is d – u to d.
• Pick one proc (say p0) and let every other proc try to adopt p0's clock using the 2-processor algorithm.
• Worst-case skew can be as large as u (one proc is u/2 behind p0's clock and another is u/2 ahead)
Improved Algorithm for n Processors
• All processors exchange h/w clock values.
• Each processor estimates the difference between its own h/w clock and that of each other processor.
• Each processor computes the average of the differences and sets its adj variable to the result
Code for Processor pi
initially diffi[i] = 0
send HCi to all procs
when receive T from pj:
diffi[j] := (T + d – u/2) – HCi
when heard from all procs:
adji := (1/n) ∑ diffi[k], summing over k = 0, …, n–1
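A small Python simulation of this averaging rule is sketched below. It assumes drift-free clocks HCi(t) = t + offset and that every processor sends at real time 0; the function names and the `delays[j][i]` layout are hypothetical choices for illustration.

```python
def averaged_adjustments(hc, delays, d, u):
    """Compute adj_i for every processor under the averaging rule.

    hc: hardware clock offsets, assuming drift-free clocks HC_i(t) = t + hc[i].
    delays: delays[j][i] is the delay of p_j's message to p_i (ignored when i == j).
    Assumption: all processors send at real time 0.
    """
    n = len(hc)
    adj = []
    for i in range(n):
        diff = [0.0] * n                      # diff_i[i] stays 0
        for j in range(n):
            if j == i:
                continue
            T = hc[j]                         # HC_j at send time 0
            hc_i_at_receive = delays[j][i] + hc[i]
            diff[j] = (T + d - u / 2) - hc_i_at_receive
        adj.append(sum(diff) / n)
    return adj


def max_skew(hc, adj):
    """Largest |AC_i - AC_j|; constant over time because the clocks do not drift."""
    ac = [h + a for h, a in zip(hc, adj)]
    return max(ac) - min(ac)


if __name__ == "__main__":
    d, u, hc = 10.0, 4.0, [0.0, 5.0, -3.0, 7.0]
    n = len(hc)
    # Delay d - u from lower to higher index, d in the other direction.
    delays = [[d - u if j < i else d for i in range(n)] for j in range(n)]
    print(max_skew(hc, averaged_adjustments(hc, delays, d, u)), u * (1 - 1 / n))
```

With delays chosen as d – u from lower-indexed to higher-indexed processors and d in the other direction, the resulting skew equals u(1 – 1/n) for this input.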
Analysis of n-Processor Algorithm
• To bound the skew, start with |ACi – ACj|
• Then substitute the formula for each AC from the code: HCi + (1/n)∑diffi[k]
• Then do some algebra (rearranging terms and using properties of absolute value) to get…
Analysis of n-Processor Algorithm
|ACi – ACj| ≤ (X + Y + Z)/n where
• X = |diffj[i] – (HCi – HCj)|
– error in pj's estimate of the difference between its own clock and pi's clock, at most u/2
• Y = |diffi[j] – (HCj – HCi)|
– error in pi's estimate of the difference between its own clock and pj's clock, at most u/2
• Z = sum over all k other than i and j of |diffi[k] – (HCk – HCi)| + |diffj[k] – (HCk – HCj)|
– error in pi's estimate of pk's clock plus error in pj's estimate of pk's clock, at most u/2 + u/2 = u for each k.
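One way to carry out the algebra is sketched below. The shorthand e_i[k] for pi's estimation error is introduced only for this sketch and does not appear in the slides; all clock values are taken at the same real time after termination.

```latex
\begin{align*}
e_i[k] &:= \mathit{diff}_i[k] - (HC_k - HC_i), \qquad e_i[i] = 0,\\
AC_i &= HC_i + \frac{1}{n}\sum_{k=0}^{n-1} \mathit{diff}_i[k]
      = \frac{1}{n}\sum_{k=0}^{n-1} HC_k + \frac{1}{n}\sum_{k=0}^{n-1} e_i[k],\\
|AC_i - AC_j| &= \frac{1}{n}\Bigl|\sum_{k=0}^{n-1}\bigl(e_i[k] - e_j[k]\bigr)\Bigr|
  \le \frac{1}{n}\Bigl(\underbrace{|e_j[i]|}_{X} + \underbrace{|e_i[j]|}_{Y}
     + \underbrace{\sum_{k \ne i,j}\bigl(|e_i[k]| + |e_j[k]|\bigr)}_{Z}\Bigr).
\end{align*}
```

The average of the hardware clocks is the same for every processor, so it cancels in the difference and only the estimation errors remain.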
Analysis of n-Processor Algorithm
To finish up,
|ACi – ACj| ≤ (u/2 + u/2 + (n–2)u)/n
= u(1 – 1/n).
Lower Bound for n-Processor CS
Theorem (6.17): No algorithm can achieve ε-synchronized clocks for ε < u(1–1/n).
Proof:
• Choose any algorithm A that achieves ε-synchronized clocks.
• Let α be a timed admissible exec. s.t.
– every msg from pi to pj has delay d – u, i < j.
– every msg from pj to pi has delay d, i < j.
Example of Reference Execution
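For n = 4, the message delays in α can be represented schematically like this:
[Figure: p0, p1, p2, p3 fully connected; each message from a lower-indexed to a higher-indexed processor has delay d – u, and each message in the opposite direction has delay d.]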
Additive Lemma
Lemma (6.18): ACk-1 ≤ ACk – u + ε, for all k.
Proof:
Take α and shift p0 through pk-1 earlier by u: α' = shift(α,(–u,…, –u,0,…,0))
Verify that α' is admissible by checking that message delays are in range:
– if sender and recipient were both shifted, then delays are same as in α
– if one is shifted and other is not, then delays that used to be d–u become d and delays that used to be d become d–u.
Example of Shifted Execution
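[Figure: the n = 4 reference execution before and after shifting p0 and p1 earlier by u. Delays between p0 and p1 and between p2 and p3 are unchanged; on every edge between a shifted and an unshifted processor, delays of d – u become d and delays of d become d – u.]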
Additive Lemma Completed
• Since α' is admissible and algorithm achieves ε-synchronized clocks, after termination
ACk-1' ≤ ACk' + ε
• By shifting facts,
ACk-1' = ACk-1 + u and ACk' = ACk
• Thus ACk-1 ≤ ACk – u + ε.
Back to Main Lower Bound Proof
After termination in α:
ACn-1 ≤ AC0 + ε by correctness of algorithm
≤ AC1 – u + 2ε by Additive Lemma
≤ AC2 – 2u + 3ε by Additive Lemma
…
≤ ACn-1 – (n–1)u + nε by Additive Lemma
Thus ε ≥ u(1 – 1/n).
Message Delays in the Real World
• In reality, message delays are not uniformly distributed between a minimum and a maximum.
• Typically the distribution has a spike close to the minimum and a long tail going to infinity.
• One approach to deal with the lack of a maximum is to fix a "timeout" value d and consider any msg taking longer to be lost.
• But if d is chosen to be fairly large (to reduce the number of slow msgs incorrectly classified as lost), most msgs will take significantly less than d, and even significantly less than d – u/2.
Estimating Clock Differences
• Take advantage of small delays that occur most of the time.
• pi sends a query to pj, which pj answers immediately with its current clock value.
• When pi gets the response, it assumes pj's response took half the round trip time.
• If the round trip time is small, error is reduced compared to original approach.
• pi can query repeatedly until getting a round trip time that is "sufficiently" small.
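The round-trip idea above can be sketched in Python as follows. The function names, the sample format, and the error bound of half the round trip (which assumes a minimum one-way delay of 0 and negligible drift during the exchange) are assumptions made for this illustration.

```python
def estimate_offset(send_hc, reply_clock, receive_hc):
    """Estimate how far pj's clock is ahead of pi's, plus a bound on the error.

    send_hc:     pi's clock when the query was sent
    reply_clock: pj's clock value carried in the response
    receive_hc:  pi's clock when the response arrived
    Assumptions: pj answers immediately; pi's clock drift during the round trip is negligible.
    """
    round_trip = receive_hc - send_hc
    # Assume the response spent half the round trip in transit.
    estimated_pj_now = reply_clock + round_trip / 2
    offset = estimated_pj_now - receive_hc
    max_error = round_trip / 2  # true response delay lies somewhere in [0, round_trip]
    return offset, max_error


def query_until_tight(samples, threshold):
    """Return the first estimate whose round trip is at most threshold, else the tightest seen.

    samples: iterable of (send_hc, reply_clock, receive_hc) tuples, one per query.
    """
    best = None
    for send_hc, reply_clock, receive_hc in samples:
        offset, err = estimate_offset(send_hc, reply_clock, receive_hc)
        if best is None or err < best[1]:
            best = (offset, err)
        if receive_hc - send_hc <= threshold:
            return offset, err
    return best


if __name__ == "__main__":
    print(query_until_tight([(0.0, 50.0, 8.0), (10.0, 61.0, 12.0)], threshold=3.0))
```

A shorter round trip directly tightens the error bound, which is why repeated querying helps when most delays are small.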
Clock Drift
• Hardware clocks typically suffer from drift (gain or lose time).
• Usually the drift is bounded, though.
• Bounded Drift: There exists ρ > 0 such that for all i, and all real times t1 and t2,
$(1+\rho)^{-1}(t_2 - t_1) \le HC_i(t_2) - HC_i(t_1) \le (1+\rho)(t_2 - t_1)$
• That is, hardware clocks measure elapsed real time approximately correctly.
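As a minimal sketch, the bounded-drift condition can be checked for one pair of clock readings like this. The function name and argument layout are hypothetical, and the check assumes t1 < t2.

```python
def within_drift_bound(t1, hc1, t2, hc2, rho):
    """Check the bounded-drift inequality for readings hc1 = HC(t1) and hc2 = HC(t2).

    Assumption: t1 < t2, so both elapsed quantities are positive.
    """
    elapsed_real = t2 - t1
    elapsed_clock = hc2 - hc1
    return elapsed_real / (1 + rho) <= elapsed_clock <= (1 + rho) * elapsed_real


if __name__ == "__main__":
    # A clock that gains 0.5 ms over 1000 s passes with rho = 1e-6; one that gains 1 s does not.
    print(within_drift_bound(0.0, 0.0, 1000.0, 1000.0005, 1e-6))
    print(within_drift_bound(0.0, 0.0, 1000.0, 1001.0, 1e-6))
```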
Hardware Clock Drift
For quartz crystal clocks, ρ is about $10^{-6}$.
[Figure: hardware clock value HCi(t) plotted against real time t; the slope of the clock stays between a minimum of $(1+\rho)^{-1}$ and a maximum of $1+\rho$.]
Clock Synchronization with Drift
• When clocks can drift, processors must continually resynchronize. Two problems:
1. Establish: Get clocks close together.
2. Maintain: Keep clocks close together.
• We will focus on the maintenance problem, assuming clocks are initially within some B of each other.
Maintaining Clock Synchronization with Drift
Clock Agreement: There exists ε s.t. for all i and j, and all real times t:
|ACi(t) – ACj(t)| ≤ ε
Clock Validity: There exists γ > 0 s.t. for all i and all real times t:
$(1+\gamma)^{-1}(HC_i(t) - HC_i(0)) \le AC_i(t) - AC_i(0) \le (1+\gamma)(HC_i(t) - HC_i(0))$
When taking the "long view", adjusted clocks measure elapsed time approximately as well as the hardware clocks.
Byzantine Failures and Clock Synchronization
• Suppose up to f processors can exhibit Byzantine failures.
• Modify definition of maintaining clock synchronization with drift so that clock agreement and clock validity only need to hold for nonfaulty processors.
• To solve the problem, total number of processors n must satisfy n > 3f.
Lower Bound on Number of Processors
• The n > 3f condition is also true of consensus.
• The consensus problem and the clock maintenance problem are similar.
• Can we use the n > 3f bound for consensus via a reduction?
• No one knows how. Instead, we'll do a direct proof, but using familiar ideas
– scaling (similar to shifting)
– specify faulty behavior with a big ring
Scaling Clocks
• Given a timed execution α and a real number s > 0, scale(α,s) is the result of multiplying every real time in α by s.
• If s > 1, scaling causes clocks to slow down and delays to increase.
• If s < 1, scaling causes clocks to speed up and delays to decrease.
Scaling Example
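[Figure: scaling an execution by s = 2. Before scaling, HC0(t) = 3t and HC1(t) = 4t; p0 sends a message at real time 2:00 (h/w clock 6:00) that p1 receives at real time 3:00 (h/w clock 12:00), so delay = 1:00. After scaling, HC'0(t) = (3/2)t and HC'1(t) = 2t; the message is sent at real time 4:00 (h/w clock 6:00) and received at real time 6:00 (h/w clock 12:00), so delay = 2:00.]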
Scaling Clocks
Lemma (13.1): In α' = scale(α,s),
• HCi'(t) = HCi(t/s)
• ACi'(t) = ACi(t/s)
• if a msg has delay δ in α, then it has delay sδ in α'.
Lemma (13.2): If α satisfies ε-clock agreement and γ-clock validity for a set of procs, then so does scale(α,s).
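The effect of scaling on clocks and delays (Lemma 13.1) can be illustrated with a short Python sketch. Clocks are modeled as plain functions of real time, and the helper names are hypothetical.

```python
def scale_clock(clock, s):
    """Return the clock function after scaling by s: HC'(t) = HC(t / s)."""
    return lambda t: clock(t / s)


def scale_delay(delay, s):
    """A message with the given delay has delay s * delay after scaling by s."""
    return s * delay


if __name__ == "__main__":
    hc0 = lambda t: 3 * t
    hc1 = lambda t: 4 * t
    s = 2
    # After scaling by 2 the clocks run at half their old rates and delays double.
    print(scale_clock(hc0, s)(4.0), scale_clock(hc1, s)(6.0), scale_delay(1.0, s))
```

Scaling by s > 1 stretches real time, so each clock reaches a given value later and every delay grows by the same factor.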
Processor Lower Bound for CS
Assume
• f = 1
– extend to larger f with reduction
• $u \ge d(1 - (1+\rho)^{-4})$
– needed for calculations to work out
– since ρ is tiny, this is not a significant restriction (uncertainty must be at least slightly larger than 0)
Processor Lower Bound for CS
• Assume in contradiction there is an algorithm (A,B,C) for n = 3 and f = 1 that achieves ε-clock agreement and γ-clock validity.
• Consider a ring of k processors, where
– k is a multiple of 3
– $(1+\rho)^{2(k-1)} > (1+\gamma)^2$ (needed for the calculations to work out)
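Big Ring
[Figure: ring of k processors p0, p1, p2, p3, …, pi-1, pi, pi+1, …, pk-1. The local algorithms A, B, C are assigned in repeating order around the ring: p0 runs A, p1 runs B, p2 runs C, p3 runs A, …, pk-1 runs C.]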
Execution on Big Ring
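[Figure: the big ring of p0, p1, …, pk-1 running local algorithms A, B, C in repeating order, annotated as follows.]
• hardware clocks: $HC_i(t) = t(1+\rho)^{1-2i}$, i.e. $t(1+\rho)$ at p0, $t(1+\rho)^{-1}$ at p1, $t(1+\rho)^{-3}$ at p2, $t(1+\rho)^{-5}$ at p3, …, $t(1+\rho)^{1-2(k-1)}$ at pk-1
• message delays: $d(1+\rho)^{2i-4}$ between pi-1 and pi, i.e. $d(1+\rho)^{-2}$ between p0 and p1, $d(1+\rho)^{0} = d$ between p1 and p2, $d(1+\rho)^{2}$ between p2 and p3, …, $d(1+\rho)^{2k-6}$ between pk-2 and pk-1, and $d(1+\rho)^{-4}$ between pk-1 and p0
• adj. variables are initially 0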
Execution on Big Ring
• We cannot rely on the big-ring execution satisfying the clock synch properties:
– more than 3 processors
– some h/w clock drift rates are out of range
– some message delays are out of range
• However, we can make some deductions about how processors behave in it:
– show that pieces of the ring "look like" certain systems in which the algorithm is supposed to be correct.
Behavior in Big Ring
Lemma (13.4): In the execution on the big ring, for all t:
a) |ACi(t) - ACi+1(t)| ≤ ε
b) $(1+\gamma)^{-1}HC_i(t) \le AC_i(t) \le (1+\gamma)HC_i(t)$
Proof: Take pi and pi+1 from big ring and put them in a triangle in which 3rd processor is faulty and acts like the rest of the big ring. Call this the triangle execution.
Triangle Based on Big Ring
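[Figure: triangle of pi, pi+1, and a faulty third processor, which acts like pi-1 toward pi and like pi+2 toward pi+1, as in the execution on the big ring. Hardware clocks: $t(1+\rho)^{1-2i}$ at pi and $t(1+\rho)^{1-2(i+1)}$ at pi+1. Message delays: $d(1+\rho)^{2(i+1)-4}$ between pi and pi+1, $d(1+\rho)^{2i-4}$ between the faulty processor and pi, $d(1+\rho)^{2(i+2)-4}$ between the faulty processor and pi+1.]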
Relationship of Triangle and Ring
Claim: pi and pi+1 behave the same in the execution on the triangle (with the Byzantine processor) as they do in the execution on the big ring.
Scaled Triangle
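Scale the triangle execution by $(1+\rho)^{-2i}$ to get the scaled triangle execution:
[Figure: the same triangle after scaling. Hardware clocks: $t(1+\rho)$ at pi and $t(1+\rho)^{-1}$ at pi+1. Message delays: $d(1+\rho)^{-2}$ between pi and pi+1; $d(1+\rho)^{-4}$, which is ≥ d – u by assumption, between pi and the faulty processor acting like pi-1; d between pi+1 and the faulty processor acting like pi+2.]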
Relating the Three Executions
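• Since the scaled triangle execution is admissible, it satisfies ε-clock agreement and γ-clock validity for pi and pi+1.
• By Scaling Lemma (13.2), the unscaled triangle execution also satisfies those conditions for pi and pi+1.
• Since the triangle execution and the big-ring execution look the same to pi and pi+1, the big-ring execution also satisfies those conditions for pi and pi+1.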
Finishing the Main Lower Bound
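Referring back to the execution on the big ring,
AC0(t) ≤ AC1(t) + ε by Lemma 13.4(a)
≤ AC2(t) + 2ε by Lemma 13.4(a)
…
≤ ACk-1(t) + (k-1)ε by Lemma 13.4(a)
So
$AC_{k-1}(t) \ge AC_0(t) - (k-1)\varepsilon$
$\ge (1+\gamma)^{-1}HC_0(t) - (k-1)\varepsilon$ by Lemma 13.4(b)
$= (1+\gamma)^{-1}(1+\rho)^{2(k-1)}HC_{k-1}(t) - (k-1)\varepsilon$
The last step uses $HC_0(t) = t(1+\rho)$ and $HC_{k-1}(t) = t(1+\rho)^{1-2(k-1)}$.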
Finishing the Main Lower Bound
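From previous slide:
$AC_{k-1}(t) \ge (1+\gamma)^{-1}(1+\rho)^{2(k-1)}HC_{k-1}(t) - (k-1)\varepsilon$
By Lemma 13.4(b):
$AC_{k-1}(t) \le (1+\gamma)HC_{k-1}(t)$
Combining and rearranging gives:
$HC_{k-1}(t)\,[(1+\gamma)^{-1}(1+\rho)^{2(k-1)} - (1+\gamma)] \le (k-1)\varepsilon$
• HCk-1(t) grows without bound
• the bracketed factor is positive, by assumption about k
• the right-hand side is constant
So the inequality fails once t is large enough, which is the contradiction.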
Fault-Tolerant Clock Synchronization Algorithms
• Continue to focus on maintenance algorithms.
• Assume clocks are initially close together
– different algorithms state this condition differently
• Processors resynchronize every P time units:
– different algorithms have different constraints on P.
A Fault-Tolerant CS Algorithm
[Welch & Lynch, 1988]
• Assume adjusted clocks reach clock time 0 within B real time of each other
• Resynch every P time units; choose P
– large enough to avoid confusion between resynchronizations
– small enough to prevent skew due to drift from becoming too large
Code for a Processor
when AC = kP (k = 1, 2, …):
  send AC to all
  set timer for (1 + ρ)(B + d) in the future
when receive T msg from pj:
  diff[j] := (T + d – u/2) – AC
when timer goes off:
  adj := adj + midpoint(trim(f,diff))
  clear diff array
(trim(f,diff): discard f largest and f lowest values)
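The trim and midpoint helpers used in the code above might be sketched in Python as follows. This assumes that midpoint means the midpoint of the range of the remaining values and that diff holds one entry per processor; how entries missing from faulty processors are filled in is not modeled here.

```python
def trim(f, values):
    """Discard the f largest and f smallest values."""
    ordered = sorted(values)
    assert len(ordered) > 2 * f, "need more than 2f values"
    return ordered[f:len(ordered) - f]


def midpoint(values):
    """Midpoint of the range spanned by the values (assumed meaning of midpoint)."""
    return (min(values) + max(values)) / 2


def new_adjustment(adj, diff, f):
    """Adjustment update performed when the timer goes off."""
    return adj + midpoint(trim(f, diff))


if __name__ == "__main__":
    # One wildly low and one wildly high estimate are trimmed away with f = 1.
    print(new_adjustment(10.0, [5.0, -100.0, 1.0, 3.0, 1000.0], f=1))
```

Trimming f values from each end means that up to f arbitrarily wrong estimates cannot pull the result outside the range of estimates from nonfaulty processors.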
Explanation of Timer Value
• Why wait (1 + ρ)(B + d) time to collect messages?
• Want to hear from all nonfaulty processors before adjusting.
– All nonfaulty procs will reach clock time kP within B time of each other (true for k = 0 by assumption, shown by induction for k > 0)
– Maximum msg delay is d
– Waiting B + d clock time might not be long enough if your clock is fast. To be safe, wait extra factor of (1 + ρ)
Clock Agreement
Claim: Nonfaulty clocks reach each kP within B real time of each other.
– Proved by induction.
Claim: After adjusting their clocks in each resynch period, the new (nonfaulty) clocks reach kP within real time B/2 + u + O(ρ) of each other. See figure.
– Proved using properties of the trim and midpoint functions: difference is roughly halved.
Figure for Resynchronization
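[Figure: ACi and ACj plotted against real time across one resynchronization period. Clock times marked: kP, kP + (B+d)(1+ρ), (k+1)P, (k+1)P + (B+d)(1+ρ). Real-time gaps marked: at most B (before adjusting) and at most B/2 + u + O(ρ) (after adjusting).]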
Clock Agreement
• Due to drift, new clocks reach (k+1)P (start of next resynch) within real time B/2 + u + 2ρP of each other.
• B/2 + u + 2ρP ≤ B implies B ≥ 2u + 4ρP = 2u + O(ρ)
• So B cannot be any smaller than 2u plus terms of order ρ.
Clock Agreement
Claim: The algorithm achieves ε-clock agreement, where
ε = B + u/2 + O(ρ)
Using the smallest possible B, the best this algorithm gives is
ε = 5u/2 + O(ρ).
Clock Validity
• Paper analyzes drift of adjusted clocks with respect to real time, not hardware clock time.
• Adjusted clock drift rate is calculated to be ρ + O(1/P), as opposed to ρ for the hardware clocks.
– The more frequently the processors resynchronize, the more they degrade the drift rate (tradeoff with Clock Agreement)
• Careful analysis for the version of clock validity given in textbook is open.