Chapter 5 Synchronization

Synchronization

Multiple processes must not simultaneously access a shared resource
Ordering may be important
o For example, msg 1 must come before msg 2
Time
o Absolute time vs. relative time
May want one process to coordinate
o Election algorithms

Synchronization

Special topics…
Distributed mutual exclusion
o Protect shared resources from simultaneous access
Distributed transactions
o Similar, but try to optimize access through “advanced concurrency control”

What Time is It?

Easy to answer in a non-distributed system
o Suppose A asks for the time, then B does
o B’s time will be later than A’s
In a distributed system, this may not be true
o Suppose A checks the time, then B does
o B’s time might not be later than A’s
o That is, the clocks on A and B might not agree
If time comes from a central location, variation in network delay is a problem

What Time is It?

Why do we care about time? Consider the make example
make is used to compile and link multiple source files into one executable file
If file.o was last modified before file.c, then file.c must be recompiled
If file.o was last modified after file.c, then there is no need to recompile file.c
This breaks if time is not the same across the distributed system
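The make rule above can be sketched in a few lines. This is only an illustration: the function name `needs_rebuild` and the timestamps are made up, and real make reads modification times from the file system.

```python
def needs_rebuild(source_mtime, object_mtime):
    """make's rule: recompile when the object file is older than its source."""
    return object_mtime < source_mtime


# Hypothetical scenario: the compiler machine stamps file.o at 2144 on its
# own clock. The source is edited afterwards, but on a machine whose clock
# runs 2 ticks behind, so file.c is stamped 2143.
stale_but_skipped = needs_rebuild(source_mtime=2143, object_mtime=2144)

# With agreeing clocks, the later edit would get a later stamp.
correctly_rebuilt = needs_rebuild(source_mtime=2145, object_mtime=2144)
```

In the first case make skips the recompile even though the source really is newer, which is exactly the failure the slide describes.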

Clock Synchronization

Both machines have their own clock
o Clocks differ by “2”
What will make do with output.c? Oops!

Time

With a single-processor system
o Doesn’t matter if time is incorrect
o Relative time is what’s important
With more than one processor
o Clock skew is inevitable
Multiple clock problems
o How to synchronize with the “real” clock?
o How to synchronize clocks with each other?
But first we digress…

Physical Clocks

Time between 2 transits of the sun
o Solar day
Solar second is 1/86400th of a solar day

Physical Clocks

Period of the earth’s rotation is not constant
o Earth is slowing due to drag
o Days are getting longer
Atomic clock invented in 1948
Official second is now
o 9,192,631,770 transitions of cesium 133
International Atomic Time (TAI)
Today, 86,400 TAI seconds is about 3 msec less than a mean solar day!

Physical Clocks

Solar seconds are not of constant length
TAI seconds are of constant length
o Leap seconds are used to keep in phase with the sun
o Add a leap second when the discrepancy exceeds 800 msec
Otherwise noon would eventually come before breakfast, which might cause riots!

Physical Clocks

TAI with leap seconds is known as
o Coordinated Universal Time (UTC)
UTC replaces Greenwich Mean Time (GMT)
NIST operates radio station WWV from Colorado
o Sends out a pulse at the start of each UTC second
o But only accurate to within 1 msec
o Due to atmospheric effects, can vary by 10 msec
Some satellites offer a similar service
In any case, must know relative position
o To compensate for propagation delay

Clock Sync. Algorithms

Suppose one machine monitors WWV
How to keep other clocks in sync?
o Let t be UTC time
o Let Cp(t) be the time on machine p
Ideally, want Cp(t) = t
o We’ll be happy if dCp/dt = 1

Clock Sync. Algorithms

Clocks drift
Suppose one clock is slow and one is fast…
They drift apart at twice the drift rate

Clock Sync. Algorithms

Let Cp(t) be the time on machine p
Ideally, want Cp(t) = t
o Or dCp/dt = 1
But processor clocks can drift
o If the maximum rate of drift is ρ
o After Δt, two clocks could be 2ρΔt apart
If you want clocks to differ by less than δ
o Must synchronize clocks every δ/(2ρ) seconds
How to synchronize?
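The resynchronization bound is simple arithmetic. A small sketch, with a hypothetical helper name and illustrative numbers (a drift rate of 10 microseconds per second and a 1 msec tolerance are assumptions, not values from the slides):

```python
def resync_interval(max_skew, max_drift_rate):
    """Longest time between synchronizations that keeps two clocks within
    max_skew of each other when each may drift at up to max_drift_rate
    (seconds of error per second), in opposite directions."""
    return max_skew / (2 * max_drift_rate)


# Two clocks drifting apart at 2 * 1e-5 s/s reach a 1 msec gap after
# roughly 50 seconds, so they must resynchronize at least that often.
interval = resync_interval(max_skew=1e-3, max_drift_rate=1e-5)
```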

Clock Sync. Algorithms

How to synchronize clocks?
Cristian’s algorithm
o Pull protocol
Berkeley algorithm
o Push protocol
Averaging algorithms
o Decentralized approach
Network Time Protocol (NTP)
Multiple external time sources

Cristian's Algorithm

Suppose the time server has WWV time
Clients want to stay within δ of each other
Every δ/(2ρ) seconds or less…
o Client asks the time server for the time
Somebody got an algorithm named after themselves for that?
See next slide

Cristian's Algorithm

What are the potential problems?
o Time cannot run backwards
o Takes (variable) time to get a reply

Cristian's Algorithm

Time cannot run backwards
o If the clock is fast…
o Increment time more slowly than usual
Must account for the time to get a reply
o How to do this?
o Educated guess! Round-trip time divided by 2
o Account for the time the server takes to process, multiple round-trip measurements, etc.
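A minimal sketch of the round-trip correction, assuming the delay is symmetric. The names `cristian_estimate`, `request_server_time`, and `server_processing` are hypothetical; `request_server_time` stands in for whatever call actually contacts the time server.

```python
import time


def cristian_estimate(request_server_time, clock=time.monotonic,
                      server_processing=0.0):
    """Estimate the server's current time from one request/reply exchange."""
    sent_at = clock()
    server_time = request_server_time()   # blocks until the reply arrives
    received_at = clock()
    # Educated guess: the reply spent half of the round trip in flight,
    # after subtracting any known processing time on the server.
    one_way_delay = (received_at - sent_at - server_processing) / 2
    return server_time + one_way_delay


# Simulated exchange: the local clock reads 10.0 when the request is sent
# and 10.5 when the reply arrives; the server reported 100.0.
_ticks = iter([10.0, 10.5])
estimate = cristian_estimate(lambda: 100.0, clock=lambda: next(_ticks))
```

Taking several measurements and keeping the one with the shortest round trip is one of the refinements the slide alludes to.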

Berkeley Algorithm

Cristian’s “algorithm”
o Time server is passive
Berkeley algorithm
o Time server is aggressive
o Does not require the server to know UTC
o Server polls clients
o Computes average time
o Pushes result to clients

Berkeley Algorithm

a) Server asks others for their clock values
b) Machines answer
c) Server tells others how to adjust their clocks
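Steps a) to c) can be sketched as a single averaging function. The function name and the clock readings (minutes past midnight) are illustrative assumptions; a real implementation would also compensate for message delay.

```python
def berkeley_adjustments(server_time, client_times):
    """Return the correction each machine should apply, server first."""
    readings = [server_time] + list(client_times)
    average = sum(readings) / len(readings)
    # Each machine is told how far to move, not what time it is.
    return [average - reading for reading in readings]


# Server reads 3:00 (180 min), clients read 2:50 (170) and 3:25 (205).
# The average is 3:05, so the corrections are +5, +15 and -20 minutes.
adjustments = berkeley_adjustments(180, [170, 205])
```

A negative correction would normally be applied by slowing the clock down rather than setting it back, since time cannot run backwards.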

Averaging Algorithms

Cristian’s and Berkeley are centralized
Averaging (decentralized) approach…
o All machines broadcast their time
o Everybody computes the average
o The usual refinements apply
When to broadcast?
Only practical on a LAN

Network Time Protocol

According to the book, NTP uses
o “advanced clock synchronization algorithms”
o Accuracy range of 1 to 50 msec
But NTP is not very secure
NTP actually uses Marzullo’s Algorithm
o Aka the Intersection Algorithm
Have a collection of time intervals
o Example: time of 10 ± 2 gives interval [8,12]

Network Time Protocol

Given a collection of time intervals
o Of the form [a,b]
Marzullo’s algorithm finds a consistent interval
o Efficient: linear in space, and O(n log n) in time because the endpoints are sorted
o If no consistent interval, finds interval(s) consistent with the most sources
Marzullo takes the center of the resulting interval
Intersection Algorithm refines this
o Use statistical info on confidence intervals
o Selected time not necessarily the midpoint
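A compact sketch of the interval sweep at the heart of Marzullo's algorithm. The function name and the sample intervals are illustrative, and this is not NTP's actual implementation; it only shows how sorting the endpoints finds the region agreed on by the most sources.

```python
def marzullo(intervals):
    """Return (sources_in_agreement, (low, high)) for closed intervals."""
    edges = []
    for low, high in intervals:
        edges.append((low, -1))    # -1 marks the start of an interval
        edges.append((high, +1))   # +1 marks the end of an interval
    edges.sort()                   # at equal offsets, starts come first

    best = count = 0
    best_interval = None
    for index, (offset, kind) in enumerate(edges):
        count -= kind              # a start raises the overlap, an end lowers it
        if count > best:
            best = count
            # The overlap lasts until the next edge, which always exists
            # because every start is followed by its own end.
            best_interval = (offset, edges[index + 1][0])
    return best, best_interval


# Three sources: 10 +/- 2, 12 +/- 1 and 11 +/- 1.
agreeing, (low, high) = marzullo([(8, 12), (11, 13), (10, 12)])
midpoint = (low + high) / 2
```

With the third source replaced by an outlier such as (14, 15), the sweep still returns the interval supported by the remaining two sources.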

Multiple External Time Sources

Suppose very accurate time is needed
Multiple UTC sources?
But these will not agree
So need to average (or similar)
o Network delays
o Processing delays, etc.
Not clear that this helps very much!

Use of Synchronized Clocks

Today, computers can be at or near UTC
How to make use of this?
To enforce “at most once delivery”
Traditional approach
o Server keeps track of msg numbers
o Checks list against incoming msg numbers
o How long to keep the list? What if the server crashes?
Alternative is to use timestamps
o We discuss other apps in later sections

Logical Clocks

Usually good enough to agree on time
o Even if it’s not the actual time
Often sufficient to agree on order
o Recall the make example
Lamport time
o Synchronize logical clocks
Vector timestamps
o Extension of Lamport’s algorithm

Lamport Timestamps

“Happens before”: a → b
According to Tanenbaum: a → b if all processes agree that a came before b
Lamport actually defines “→” as the “smallest” relation satisfying
o If a occurs before b on the same processor then a → b
o If a == send, b == receive (of the same msg), then a → b
o Transitive: a → b and b → c implies a → c

Lamport Timestamps

“Happens before”: a → b
Does “happens before” equal “really happened before”?
If a and b are on the same process and a occurs before b, then a → b
If a == msg sent, b == (same) msg received, then a → b
o It takes time for a message to be sent
If a ↛ b and b ↛ a, the events are concurrent

Lamport Timestamps

For event a, want timestamp C(a)
o If a → b then C(a) < C(b)
o C is a non-decreasing function
o Time cannot go backwards!
Lamport’s solution
o Each msg carries a timestamp with it
o If local time is less than the timestamp, set local time to timestamp + 1
o Advance clock between any two events
Illustrated on next slide…
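Lamport's rules fit in a small class. This is a sketch with hypothetical names (`LamportClock`, `tick`, `send`, `receive`); it folds "jump past the message timestamp" and "advance between events" into a single max-plus-one step on receipt.

```python
class LamportClock:
    """Logical clock for one process."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Advance the clock for a local event."""
        self.time += 1
        return self.time

    def send(self):
        """Sending is an event; the returned value travels with the msg."""
        return self.tick()

    def receive(self, msg_timestamp):
        """Move past both the local time and the msg's timestamp."""
        self.time = max(self.time, msg_timestamp) + 1
        return self.time


p, q = LamportClock(), LamportClock()
for _ in range(5):
    p.tick()                 # p is "ahead" of q
stamp = p.send()             # p's clock is now 6
received = q.receive(stamp)  # q jumps from 0 to 7
```

The receive always gets a larger timestamp than the matching send, so C(send) < C(receive) holds no matter how far apart the two clocks were.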

Lamport Timestamps

[Figure: a) Three processes with different clocks (ticking in steps of 6, 8, and 10), exchanging messages A, B, C, and D. b) Lamport's algorithm corrects the clocks.]

Lamport Timestamps

Can also ensure that no two events ever occur at exactly the same time
o 40.1 for process 1
o 40.2 for process 2, etc.
With this refinement, we have a total ordering on all events in the system
o If a → b on the same process then C(a) < C(b)
o If a == msg sent, b == msg received, then we have C(a) < C(b)
o If a ≠ b then C(a) ≠ C(b)
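One way to realize the "40.1, 40.2" idea is to compare (timestamp, process number) pairs. A sketch with made-up event data; the tuple layout is an assumption chosen for illustration.

```python
def total_order(events):
    """Sort (timestamp, process_id, label) events; ties on the timestamp
    are broken by process number, so no two events compare equal."""
    return sorted(events, key=lambda event: (event[0], event[1]))


events = [(40, 2, "b"), (40, 1, "a"), (39, 3, "c")]
ordered = total_order(events)   # 39.3, then 40.1, then 40.2
```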

Totally-Ordered Multicast

Consider a replicated database
o Suppose there are replicas in San Jose and in New York
o Query goes to the nearest copy
Updates are tricky
o Must have updates in the same order at all replicas
o For example: interest calculation and deposit
For consistency, there is no “right” order
o Just want updates to happen in the same order
Correctness is a different story…

Non-Totally-Ordered Multicast

Assumptions
o $1000 in acct, deposit is $1000, interest rate is 10%
On left, $2200, on right $2100
Inconsistent!
[Figure: the Deposit and Interest updates reach the two replicas in different orders]
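The two balances follow directly from applying the same two updates in different orders. A short check using integer arithmetic (the helper names are illustrative):

```python
def deposit(balance, amount):
    return balance + amount


def add_interest(balance, percent):
    return balance + balance * percent // 100


# One replica applies the deposit first, the other applies interest first.
left = add_interest(deposit(1000, 1000), 10)    # (1000 + 1000) * 1.10 = 2200
right = deposit(add_interest(1000, 10), 1000)   # 1000 * 1.10 + 1000 = 2100
```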

Totally-Ordered Multicast

Assume msgs are received in order and there is no loss
Using Lamport timestamps…
o Msgs are timestamped with the sender’s logical time
o Multicast is sent to all sites, including the sender
o Msgs go into a local queue in timestamp order
o Multicast ACK msgs (to yourself too)
Message is only removed from the queue if
o It is at the head of the queue and
o It has been ACKed (by every process)
Does this work? See next slide…
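The queue discipline can be sketched as follows. The `Replica` class and its method names are hypothetical, networking is left out, and the caller is assumed to feed in messages in an order that respects the in-order, no-loss assumption above.

```python
import heapq


class Replica:
    """Holds multicast msgs until they are at the head and fully ACKed."""

    def __init__(self, n_processes):
        self.n = n_processes
        self.queue = []       # heap of (timestamp, sender, payload)
        self.acks = {}        # (timestamp, sender) -> set of ACKing pids
        self.delivered = []

    def on_message(self, timestamp, sender, payload):
        heapq.heappush(self.queue, (timestamp, sender, payload))
        self._deliver()

    def on_ack(self, timestamp, sender, acker):
        self.acks.setdefault((timestamp, sender), set()).add(acker)
        self._deliver()

    def _deliver(self):
        # Only the head of the queue may leave, and only once every
        # process (including this one) has ACKed it.
        while self.queue:
            timestamp, sender, payload = self.queue[0]
            if len(self.acks.get((timestamp, sender), ())) < self.n:
                break
            heapq.heappop(self.queue)
            self.delivered.append(payload)


# Process 0 multicasts a deposit at time 45, process 1 multicasts an
# interest update at time 10. Arrival order differs at the two replicas.
r0, r1 = Replica(2), Replica(2)

r0.on_message(45, 0, "deposit"); r0.on_ack(45, 0, acker=0)
r0.on_message(10, 1, "interest"); r0.on_ack(10, 1, acker=1)
r0.on_ack(10, 1, acker=0); r0.on_ack(45, 0, acker=1)

r1.on_message(10, 1, "interest"); r1.on_ack(10, 1, acker=1)
r1.on_message(45, 0, "deposit"); r1.on_ack(45, 0, acker=0)
r1.on_ack(45, 0, acker=1); r1.on_ack(10, 1, acker=0)
```

Both replicas end up applying the interest update before the deposit, even though each saw its own update first.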

Totally-Ordered Multicast

$1000 in acct, deposit is $1000, interest rate 10%
What happens in this case?
[Figure: two replicas multicast Deposit (timestamp 45) and Interest (timestamp 10), followed by ACK(D) and ACK(I). After 10, one queue holds Interest: 10; after 45, the other holds Deposit: 45. Later, one queue holds Interest: 10, Deposit: 45, ACK(D): 46, ACK(I): 90 and the other holds Interest: 10, Deposit: 45, ACK(I): 90, ACK(D): 105.]

Totally-Ordered Multicast

When is the interest calculation done?
When is the deposit made?
[Figure: same message exchange as on the previous slide]

Scalar Timestamps

Scalar timestamps (such as Lamport timestamps) give a total ordering using C(a)
But C(a) < C(b) does not mean that event a really happened before b
[Figure: space-time diagram of processes P1, P2, and P3 with a scalar timestamp on each event]
The “4” at P2 occurs before the “3” at P1

Vector Timestamps

Lamport timestamps don’t reflect causality
o Local events are causally ordered
Example: multicast news posting
o “Happens after” is not necessarily “response to”
Vector timestamps do reflect causality
Must specify
o Local data structures to represent logical time
o Update mechanism/protocol
Tanenbaum’s description is confusing!

Vector Timestamps

Want vector timestamp such that
o If VT(a) < VT(b) then a causally precedes b
Process Pi maintains vector Vi
o Vi[i] is incremented for each event at Pi
o Vi[j] is Pi’s current view of the number of events that have occurred at process Pj
Vi[i] is easy to maintain
Vi[j] is obtained from info sent with msgs
o Each message includes a vector timestamp

Vector Timestamps

Suppose Pj received msg m from Pi
Pi includes its vector timestamp, vt
Then Pj adjusts its values according to vt
Pj then knows the number of events on which m can depend
Tanenbaum claims…
o Pj knows the no. of messages it must receive before it has seen everything that m could depend on
o Not true! Event ≠ msg!
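A sketch of the event-counting vector clock described above. The class and function names are hypothetical, and processes are numbered from 0 here rather than from 1.

```python
class VectorClock:
    """Vector timestamp for process i out of n processes."""

    def __init__(self, n, i):
        self.v = [0] * n
        self.i = i

    def event(self):
        """Local event (including a send): bump our own entry."""
        self.v[self.i] += 1
        return list(self.v)      # copy to attach to an outgoing msg

    def receive(self, vt):
        """Merge the sender's view entry by entry, then count the receive."""
        self.v = [max(mine, theirs) for mine, theirs in zip(self.v, vt)]
        return self.event()


def causally_precedes(va, vb):
    """VT(a) < VT(b): every entry <= and the vectors differ."""
    return all(a <= b for a, b in zip(va, vb)) and va != vb


p1, p2, p3 = VectorClock(3, 0), VectorClock(3, 1), VectorClock(3, 2)
p1.event()                    # [1, 0, 0]
sent = p1.event()             # [2, 0, 0], attached to a msg for p2
p2.event()                    # [0, 1, 0]
after_receive = p2.receive(sent)
unrelated = p3.event()        # [0, 0, 1]
```

Two timestamps where neither precedes the other, such as `unrelated` and `after_receive`, mark concurrent events.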

Vector Timestamps

[Figure: space-time diagram of processes P1, P2, and P3 with a vector timestamp on each event, starting from 100 at P1, 010 at P2, and 001 at P3]

Vector Timestamp

Modified (useful) form of VT
Suppose Vi[i] counts msgs sent by Pi
Now consider a multicast newsgroup
Suppose Pi posts a message a
o Includes vector vt(a)
Suppose Pj posts a response b
o Includes vector vt(b)

Vector Timestamp

Pi posts a and includes vt(a)
Pj posts response b with vector vt(b)
Suppose Pk receives b before a
Pk waits to deliver msg until
o vt(b)[j] == Vk[j] + 1
  This is the next msg expected from Pj
o vt(b)[i] <= Vk[i], for all i ≠ j
  Ensures that Pk must have seen msg a
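The two delivery conditions translate directly into a predicate. A sketch with a hypothetical function name and 0-based process numbers, where each vector entry counts messages sent by that process:

```python
def can_deliver(vt_msg, sender, v_local):
    """True if a msg stamped vt_msg from `sender` may be delivered at a
    process whose current vector is v_local."""
    # Must be the very next msg from the sender...
    if vt_msg[sender] != v_local[sender] + 1:
        return False
    # ...and we must already have seen everything the sender had seen.
    return all(vt_msg[i] <= v_local[i]
               for i in range(len(v_local)) if i != sender)


vt_a = [1, 0, 0]      # P0 posts a
vt_b = [1, 1, 0]      # P1 posts response b after seeing a
v_k = [0, 0, 0]       # P2 has seen nothing yet

b_too_early = can_deliver(vt_b, sender=1, v_local=v_k)
a_ok = can_deliver(vt_a, sender=0, v_local=v_k)
b_after_a = can_deliver(vt_b, sender=1, v_local=[1, 0, 0])
```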


Vector Timestamp Example

Global State

Global state of a distributed system
o All local states plus msgs in transit
o Definition of “state” can vary
Useful to know global state to
o Know that computation is finished
o Detect deadlock
How to record global state?
o Distributed snapshot

Global State

Distributed snapshot
o A consistent state “in which the system might have been”
o For example, if Q received a msg from P then the snapshot must show that P sent the msg
o A msg that P sent but Q has not yet received is OK
Global state is represented by a cut
Next slide…

Global State

a) Consistent cut
b) Inconsistent cut

Global State

Assume the distributed system uses point-to-point unidirectional communication
Any process can initiate a snapshot
Suppose P starts the snapshot
o P records its state
o P sends a “marker” to its neighbors
When Q receives a marker
o First marker on any channel: Q records its state (and sends a marker on each of its outgoing channels)
o Record incoming messages until…
o …Q has received a marker on all incoming channels, then Q is done
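The receiving side of the marker protocol can be sketched for a single process. The `Branch` class is hypothetical and borrows the bank setting (the state is a balance, messages are transfers); sending markers onward is left as a comment because no network is modeled.

```python
class Branch:
    """One process taking part in a distributed snapshot."""

    def __init__(self, balance, incoming):
        self.balance = balance
        self.incoming = list(incoming)   # names of incoming channels
        self.recorded_balance = None     # local state in the snapshot
        self.in_transit = {}             # channel -> msgs caught in flight
        self.awaiting = set()            # channels with no marker yet

    def record_state(self):
        """Called by the initiator, or on the first marker received."""
        self.recorded_balance = self.balance
        self.in_transit = {channel: [] for channel in self.incoming}
        self.awaiting = set(self.incoming)
        # A real process would now send a marker on every outgoing channel.

    def on_marker(self, channel):
        if self.recorded_balance is None:
            self.record_state()
        self.awaiting.discard(channel)
        return not self.awaiting         # True once this process is done

    def on_transfer(self, channel, amount):
        self.balance += amount
        # Msgs arriving after our state was recorded but before the marker
        # on that channel were in transit when the snapshot was taken.
        if self.recorded_balance is not None and channel in self.awaiting:
            self.in_transit[channel].append(amount)


b = Branch(balance=100, incoming=["A", "C"])
b.on_transfer("A", 10)            # before the snapshot: part of the state
done_early = b.on_marker("A")     # first marker: record 110
b.on_transfer("C", 25)            # in transit on channel C
b.on_transfer("A", 5)             # after A's marker: not in the snapshot
done = b.on_marker("C")
```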

Global State

This figure does not match the algorithm!
See next few slides…

Global State

Consider the following example
Bank has 3 branches, A, B, C
Each branch is connected to the others
o Point-to-point links
State consists of
o Money in branch and…
o …money in transit

Global State

Note that no messages are in transit
Global state: (SA,SB,SC)
[Figure: branches A, B, and C exchange messages M1 to M6; A begins the snapshot and records SA, B records SB, and C records SC]

Global State

Note that no messages are in transit
Global state: (SA,T,SB,SC)
[Figure: same exchange as on the previous slide, but a transfer T is recorded along with A's state, so A finishes with (SA,T)]

Global State

Example: Termination detection
Process Q receives a marker for the 1st time
o Process that sent it is Q’s predecessor
o When Q completes its part…
o …Q sends a DONE msg to its predecessor
When is the snapshot DONE?
o When the initiator of the snapshot has received DONE from all of its successors

Global State

Problem: if DONE and msgs in transit, then computation may not really be done
Are msgs part of the snapshot or the computation?
Modification: send DONE provided
o All of Q’s successors returned DONE and
o Q has not received any msg between the time its state was recorded and the marker(s) received
Otherwise send a CONTINUE msg
DONE when the initiator receives all DONEs
o If CONTINUEs, must do it again

Election Algorithms

May want one process to coordinate
o We don’t care which process
How to choose a coordinator? Have an election!
o Assume each process has a unique number
o All processes know everybody else’s number
o But some processes may be down
o Want to elect the (live) process with the highest number
We’ll consider two election algorithms
o Bully algorithm and ring algorithm

Bully Algorithm

P notices the coordinator is not responding
o P sends an ELECTION msg to all processes with a higher number than P’s
o If no one responds, P becomes coordinator
o If a higher number responds, P is done
Process receives ELECTION from a lower no.
o Responds with OK
o If not already doing so, it initiates an election
Eventually, everybody gives up…
o Except for the biggest bully
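The bully rules can be simulated without any networking. In this sketch the function name is hypothetical and "responds with OK" is modeled simply as "is in the `alive` set".

```python
def bully_election(initiator, alive):
    """Simulate an election started by `initiator`; return the winner."""
    to_run = [initiator]     # processes that still have to hold an election
    started = set()
    coordinator = None
    while to_run:
        p = to_run.pop(0)
        if p in started:
            continue         # already holding an election
        started.add(p)
        # Every live process with a higher number answers OK...
        responders = [q for q in sorted(alive) if q > p]
        if not responders:
            coordinator = p  # nobody bigger answered: p wins
        else:
            to_run.extend(responders)   # ...and starts its own election
    return coordinator


# Process 7 (the old coordinator) has died; process 4 notices first.
winner = bully_election(initiator=4, alive={0, 1, 2, 3, 4, 5, 6})
```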

Bully Algorithm

Process 7 was coordinator until he died
Process 4 is first to notice, so holds an election
5 and 6 respond, 4 gives up (why not stop here?)
Now 5 and 6 each hold an election

Bully Algorithm

d) Process 6 tells 5 to give up
e) Process 6 wins, then tells everyone

Ring Algorithm

Assume processes are ordered
o Everyone knows their successor
o Note that no “token” is involved
Suppose P notices the coordinator has died
o P sends an ELECTION msg to its successor with P’s number attached
o If no response, sends the msg to P’s successor’s successor, and so on
o Each process in the chain appends its number
o When the msg gets back to P, it selects the highest number on the list and sends a COORDINATOR msg
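A sketch of one trip of the ELECTION msg around the ring. The function name is hypothetical, and a dead process is modeled as one that is missing from the `alive` set and is therefore skipped.

```python
def ring_election(ring, alive, initiator):
    """Return (coordinator, numbers collected by the ELECTION msg)."""
    n = len(ring)
    collected = [initiator]
    position = (ring.index(initiator) + 1) % n
    while ring[position] != initiator:
        if ring[position] in alive:
            collected.append(ring[position])  # live process appends its number
        # a dead process gives no response, so the msg goes to the next one
        position = (position + 1) % n
    return max(collected), collected


ring = [0, 1, 2, 3, 4, 5, 6, 7]
alive = {0, 1, 2, 3, 4, 5, 6}          # 7, the old coordinator, has died

from_5, list_5 = ring_election(ring, alive, initiator=5)
from_2, list_2 = ring_election(ring, alive, initiator=2)
```

Two simultaneous elections cost extra messages but agree on the result, since both lists contain the same live processes.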

Ring Algorithm

5 and 2 both initiate ELECTION
What will happen?

Mutual Exclusion

Critical region: a place where mutual exclusion is required
o Example: update to a shared data structure
For a single-processor system
o Use semaphores, monitors, etc.
Possible distributed system approaches
o Imitate single-processor approach
o Distributed approach
o Token ring approach

Centralized Algorithm

Elect a coordinator
If P wants to enter a critical region
o Checks with the coordinator
How does the coordinator deny a request?
o Either explicit denial or no response
o Queues any pending requests
Fair, efficient, etc.
o No starvation?
But it’s centralized and we hate that!

Centralized Algorithm

a) Process 1 is OK to enter a critical region
b) Process 2 asks permission to enter the same critical region, but gets no reply
c) Process 1 exits, coordinator replies to 2
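The coordinator's bookkeeping is just a holder plus a FIFO queue. A sketch with hypothetical names, using the "no response" style of denial and leaving out message passing:

```python
from collections import deque


class Coordinator:
    """Grants one critical region to one process at a time."""

    def __init__(self):
        self.holder = None
        self.waiting = deque()

    def request(self, pid):
        if self.holder is None:
            self.holder = pid
            return "OK"
        self.waiting.append(pid)   # no reply: the requester stays blocked
        return None

    def release(self, pid):
        if pid != self.holder:
            raise ValueError("only the current holder can release")
        # Hand the region to the longest-waiting process, if any.
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder         # the process that now receives OK


c = Coordinator()
first = c.request(1)       # a) process 1 gets OK
second = c.request(2)      # b) process 2 is queued, no reply
granted = c.release(1)     # c) process 1 exits, coordinator replies to 2
```

Serving the queue in arrival order is what makes the scheme fair; the three message types (request, OK, release) give the 3 messages per entry/exit.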

Distributed Algorithm

For this, we need a total ordering on events
o We know how to do this, right?
P wants to enter critical region
o Send request msg (with timestamp) to everybody
o Including itself
When a request is received
o Receiver not in critical region? Send OK
o Receiver in critical region? No reply, queue the request
o Receiver wants to enter critical region but has not yet? Check timestamps, low one wins
After OKed by everybody, enter critical region
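The receiver's three cases reduce to one decision function. In this sketch the state names (RELEASED, WANTED, HELD) and the (timestamp, process number) request tuples are assumptions made for illustration; comparing tuples gives the total order the slide asks for.

```python
def on_request(state, my_request, incoming):
    """Decide how to answer an incoming request.

    state      -- "RELEASED", "WANTED" or "HELD" for this process
    my_request -- our own (timestamp, pid) if state is "WANTED", else None
    incoming   -- the requester's (timestamp, pid)
    Returns "OK" to reply now, or "QUEUE" to hold the reply until we exit.
    """
    if state == "HELD":
        return "QUEUE"                       # in the critical region
    if state == "WANTED" and my_request < incoming:
        return "QUEUE"                       # our request is older: we win
    return "OK"                              # not interested, or we lose


# Processes 0 and 2 both want in; 0 has the lower timestamp.
reply_from_0 = on_request("WANTED", (8, 0), incoming=(12, 2))
reply_from_2 = on_request("WANTED", (12, 2), incoming=(8, 0))
reply_from_1 = on_request("RELEASED", None, incoming=(8, 0))
```

Process 0 collects OK from everybody and enters; process 2 gets its missing OK only when 0 leaves and answers its queued request.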

Distributed Algorithm

a) Processes 0 and 2 want to enter critical region
b) Process 0 has the lowest timestamp, so it wins
c) When process 0 is done, 2 gets its turn

Token Ring Algorithm

A logical ring with a token
Token is passed around the ring
Process can only enter critical region when it has the token
Easy to see that this works!
Usual token ring problems apply

Token Ring Algorithm

a) Unordered group of processes
b) Logical ring (also need a token)

Comparison of Mutual Exclusion Algorithms

Algorithm     Messages per entry/exit   Delay before entry (in message times)   Problems
Centralized   3                         2                                       Coordinator crash
Distributed   2 (n – 1)                 2 (n – 1)                               Crash of any process
Token ring    1 to ∞                    0 to n – 1                              Lost token, process crash


Distributed Transactions Blah


Transaction Model

Updating master tape is fault tolerant

Transaction Model

Primitives for transactions

Primitive            Description
BEGIN_TRANSACTION    Mark the start of a transaction
END_TRANSACTION      Terminate the transaction and try to commit
ABORT_TRANSACTION    Kill the transaction and restore the old values
READ                 Read data from a file, a table, or otherwise
WRITE                Write data to a file, a table, or otherwise

The Transaction Model

a) Transaction to reserve 3 flights commits
b) Aborts when 3rd flight is unavailable

BEGIN_TRANSACTION
    reserve WP -> JFK;
    reserve JFK -> Nairobi;
    reserve Nairobi -> Malindi;
END_TRANSACTION
(a)

BEGIN_TRANSACTION
    reserve WP -> JFK;
    reserve JFK -> Nairobi;
    reserve Nairobi -> Malindi full =>
ABORT_TRANSACTION
(b)

Distributed Transactions

a) A nested transaction
b) A distributed transaction

Private Workspace

a) File index and disk blocks of a 3-block file
b) After a transaction has modified block 0 and appended block 3
c) After committing

Writeahead Log

a) A transaction
b) – d) The log before each statement is executed

x = 0;
y = 0;
BEGIN_TRANSACTION;
    x = x + 1;
    y = y + 2;
    x = y * y;
END_TRANSACTION;
(a)

Log
[x = 0/1]
(b)

Log
[x = 0/1]
[y = 0/2]
(c)

Log
[x = 0/1]
[y = 0/2]
[x = 1/4]
(d)
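The log entries in (b) to (d) record the old and new value of each variable before it is changed, which is what makes an abort possible. A sketch with hypothetical helper names, keeping everything in memory (a real writeahead log would be forced to stable storage before each update):

```python
def run_with_log(values, updates):
    """Apply (name, function) updates, logging (name, old, new) first."""
    log = []
    for name, compute in updates:
        old = values[name]
        new = compute(values)
        log.append((name, old, new))   # write the log record first...
        values[name] = new             # ...then change the value
    return log


def rollback(values, log):
    """Abort: undo the updates by replaying the log backwards."""
    for name, old, _new in reversed(log):
        values[name] = old


values = {"x": 0, "y": 0}
log = run_with_log(values, [
    ("x", lambda v: v["x"] + 1),
    ("y", lambda v: v["y"] + 2),
    ("x", lambda v: v["y"] * v["y"]),
])
committed_state = dict(values)
rollback(values, log)
```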


Concurrency Control

Managers for handling transactions

Concurrency Control

Managers for distributed transactions

Serializability

a) – c) Transactions T1, T2, and T3
d) Possible schedules

BEGIN_TRANSACTION
    x = 0;
    x = x + 1;
END_TRANSACTION
(a)

BEGIN_TRANSACTION
    x = 0;
    x = x + 2;
END_TRANSACTION
(b)

BEGIN_TRANSACTION
    x = 0;
    x = x + 3;
END_TRANSACTION
(c)

Schedule 1   x = 0; x = x + 1; x = 0; x = x + 2; x = 0; x = x + 3;   Legal
Schedule 2   x = 0; x = 0; x = x + 1; x = x + 2; x = 0; x = x + 3;   Legal
Schedule 3   x = 0; x = 0; x = x + 1; x = 0; x = x + 2; x = x + 3;   Illegal
(d)
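The three schedules can be checked by running them. This sketch uses a simplified test: a schedule counts as legal if its final value of x matches what some serial order of T1, T2, T3 could produce (1, 2, or 3). That final-state check is an assumption that suffices for this example, not a general serializability test.

```python
def final_x(schedule):
    """Execute a list of ("set", value) / ("add", value) steps on x."""
    x = None
    for operation, value in schedule:
        if operation == "set":
            x = value
        else:
            x += value
    return x


# Whichever transaction runs last in a serial order decides the result.
SERIAL_RESULTS = {1, 2, 3}

schedule_1 = [("set", 0), ("add", 1), ("set", 0), ("add", 2), ("set", 0), ("add", 3)]
schedule_2 = [("set", 0), ("set", 0), ("add", 1), ("add", 2), ("set", 0), ("add", 3)]
schedule_3 = [("set", 0), ("set", 0), ("add", 1), ("set", 0), ("add", 2), ("add", 3)]

results = [final_x(s) for s in (schedule_1, schedule_2, schedule_3)]
legal = [r in SERIAL_RESULTS for r in results]
```

Schedule 3 ends with x = 5, a value no serial execution can produce, which is why it is marked illegal.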


Two-Phase Locking

Two-phase locking (duh!)


Two-Phase Locking

Strict two-phase locking


Pessimistic Timestamp Ordering

Concurrency control using timestamps
