
Page 1: DISTRIBUTED ALGORITHMS AND SYSTEMS

Spring 2014
Prof. Jennifer Welch
CSCE 668

Page 2: Shared Memory Model

Processors communicate via a set of shared variables, instead of passing messages.

Each shared variable has a type, defining a set of operations that can be performed atomically.


Page 3: Shared Memory Model Example

[Figure: processors p0, p1, p2 communicate by reading and writing shared variables X and Y.]

Page 4: Shared Memory Model

Changes to the model from the message-passing case:
- no inbuf and outbuf state components
- a configuration includes a value for each shared variable
- the only event type is a computation step by a processor
- an execution is admissible if every processor takes an infinite number of steps


Page 5: Computation Step in Shared Memory Model

When processor pi takes a step:
- pi's state in the old configuration specifies which shared variable is to be accessed and with which operation
- the operation is done: the shared variable's value in the new configuration changes according to the operation's semantics
- pi's state in the new configuration changes according to its old state and the result of the operation


Page 6: Observations on SM Model

Accesses to the shared variables are modeled as occurring instantaneously (atomically) during a computation step, one access per step.

The definition of admissible execution implies:
- asynchrony
- no failures


Page 7: Mutual Exclusion (Mutex) Problem

Each processor's code is divided into four sections:
- entry: synchronize with others to ensure mutually exclusive access to the critical section
- critical: use some resource; when done, enter the exit section
- exit: clean up; when done, enter the remainder section
- remainder: not interested in using the resource

[Figure: each processor cycles through the sections: entry → critical → exit → remainder → entry → …]

Page 8: Mutual Exclusion Algorithms

A mutual exclusion algorithm specifies code for the entry and exit sections to ensure:
- mutual exclusion: at most one processor is in its critical section at any time, and
- some kind of "liveness" or "progress" condition. There are three commonly considered ones…


Page 9: Mutex Progress Conditions

- no deadlock: if a processor is in its entry section at some time, then later some processor is in its critical section
- no lockout: if a processor is in its entry section at some time, then later that same processor is in its critical section
- bounded waiting: no lockout + while a processor is in its entry section, other processors enter the critical section no more than a certain number of times

These conditions are increasingly strong.


Page 10: Mutual Exclusion Algorithms

The code for the entry and exit sections is allowed to assume that:
- no processor stays in its critical section forever
- shared variables used in the entry and exit sections are not accessed during the critical and remainder sections


Page 11: Complexity Measure for Mutex

An important complexity measure for shared memory mutex algorithms is the amount of shared space needed.

Space complexity is affected by:
- how powerful the type of the shared variables is
- how strong the progress property to be satisfied is (no deadlock vs. no lockout vs. bounded waiting)


Page 12: Test-and-Set Shared Variable

A test-and-set variable V holds two values, 0 or 1, and supports two (atomic) operations:

    test&set(V):
        temp := V
        V := 1
        return temp

    reset(V):
        V := 0
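For concreteness, here is a minimal Java sketch of this variable type (an illustration added to these notes, not part of the slides). It uses java.util.concurrent.atomic.AtomicBoolean, whose getAndSet has exactly the atomic test&set semantics above:

    import java.util.concurrent.atomic.AtomicBoolean;

    // A test-and-set variable: holds 0 or 1 (here false/true),
    // supporting atomic test&set and reset operations.
    class TestAndSetVariable {
        private final AtomicBoolean v = new AtomicBoolean(false); // initially 0

        // test&set(V): atomically set V to 1 and return the old value
        boolean testAndSet() {
            return v.getAndSet(true);
        }

        // reset(V): V := 0
        void reset() {
            v.set(false);
        }
    }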


Page 13: Mutex Algorithm Using Test&Set

Code for entry section:

    repeat
        t := test&set(V)
    until (t = 0)

An alternative syntactic construction is: wait until test&set(V) = 0.

Code for exit section:

    reset(V)
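Combining the two sections gives the classic test-and-set spin lock. A minimal Java sketch, building on the TestAndSetVariable sketch above (the class and method names are illustrative):

    // Test-and-set spin lock: the entry section spins until test&set
    // returns 0 (false); the exit section resets the variable.
    class TASLock {
        private final TestAndSetVariable v = new TestAndSetVariable();

        void lock() {                // entry section
            while (v.testAndSet()) { // repeat until test&set returns false (0)
                // spin
            }
        }

        void unlock() {              // exit section
            v.reset();
        }
    }

A processor then brackets its critical section between lock() and unlock().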


Page 14: Mutual Exclusion is Ensured

Suppose not. Consider the first violation: some pi enters the CS while another pj is already in the CS.

- pj entered the CS: its test&set saw V = 0 and set V to 1.
- No processor leaves the CS in between, so V stays 1.
- Yet pi enters the CS, which requires that its test&set saw V = 0. Impossible!

Page 15: No Deadlock

Claim: V = 0 iff no processor is in the CS. The proof is by induction on the events in the execution, and relies on the fact that mutual exclusion holds.

Suppose there is a time after which a processor p is in its entry section but no processor ever enters the CS.

- From some point on, p is still in its entry section and no processor is in the CS.
- By the claim, V then always equals 0, so the next test&set by p returns 0.
- So p enters the CS. Contradiction!

Page 16: What About No Lockout?

One processor could always grab V (i.e., win the test&set competition) and starve the others.

So no lockout does not hold, and thus bounded waiting does not hold either.


Page 17: Read-Modify-Write Shared Variable

The state of this kind of variable can be anything and of any size. Variable V supports the (atomic) operation rmw(V, f), where f is any function:

    rmw(V, f):
        temp := V
        V := f(V)
        return temp

This variable type is so strong there is no point in having multiple variables (from a theoretical perspective).
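For concreteness, a minimal Java sketch of an rmw variable (an illustration added to these notes). It uses java.util.concurrent.atomic.AtomicReference, whose getAndUpdate atomically applies a function and returns the old value, which is exactly rmw(V, f):

    import java.util.concurrent.atomic.AtomicReference;
    import java.util.function.UnaryOperator;

    // A read-modify-write variable: rmw(f) atomically replaces the
    // current value with f(value) and returns the old value.
    class RMWVariable<T> {
        private final AtomicReference<T> v;

        RMWVariable(T initial) {
            v = new AtomicReference<>(initial);
        }

        T rmw(UnaryOperator<T> f) {
            return v.getAndUpdate(f); // temp := V; V := f(V); return temp
        }
    }

Note that test&set is the special case where V holds 0 or 1 and f maps everything to 1, and a plain read is the special case f(x) = x (the rmw(V, V) used on the following pages).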


Page 18: Mutex Algorithm Using RMW

- Conceptually, the list of waiting processors is stored in a shared circular queue of length n.
- Each waiting processor remembers in its local state its location in the queue (instead of keeping this info in the shared variable).
- The shared RMW variable V keeps track of the active part of the queue with first and last pointers, which are indices into the queue (between 0 and n-1); so V has two components, first and last.


Page 19: Conceptual Data Structure

[Figure: a circular queue with slots 0 through 15; first and last pointers mark the active segment. The RMW shared object just contains these two "pointers".]

Page 20: Mutex Algorithm Using RMW

Code for entry section:

    // increment last to enqueue self
    position := rmw(V, (V.first, V.last+1))
    // wait until first equals this value
    repeat
        queue := rmw(V, V)
    until (queue.first = position.last)

Code for exit section:

    // increment first to dequeue self
    rmw(V, (V.first+1, V.last))
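A minimal Java sketch of this ticket-style queue lock (an illustration added to these notes). The single RMW variable V is emulated with an AtomicReference over an illustrative (first, last) pair; the plain get() in the spin loop plays the role of rmw(V, V), and the mod-n wrap-around is the one implied by the circular queue of length n:

    import java.util.concurrent.atomic.AtomicReference;

    // Queue-based mutex built from one RMW variable holding (first, last).
    class QueueLock {
        private static final class Pair {
            final int first, last;
            Pair(int first, int last) { this.first = first; this.last = last; }
        }

        private final int n; // number of processors
        private final AtomicReference<Pair> v = new AtomicReference<>(new Pair(0, 0));

        QueueLock(int n) { this.n = n; }

        void lock() {
            // increment last (mod n) to enqueue self; getAndUpdate is the rmw
            Pair position = v.getAndUpdate(p -> new Pair(p.first, (p.last + 1) % n));
            // wait until first equals the ticket we drew
            while (v.get().first != position.last) {
                // spin
            }
        }

        void unlock() {
            // increment first (mod n) to dequeue self
            v.getAndUpdate(p -> new Pair((p.first + 1) % n, p.last));
        }
    }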


Page 21: Correctness Sketch

Mutual Exclusion: Only the processor at the head of the queue (V.first) can enter the CS, and only one processor is at the head at any time.

n-Bounded Waiting: The FIFO order of enqueueing, and the fact that no processor stays in the CS forever, give this result.


Page 22: Space Complexity

The shared RMW variable V has two components in its state, first and last.

Both are integers that take on values from 0 to n-1: n different values each.

The total number of different states of V is thus n^2.

And thus the required size of V in bits is ⌈2 log2 n⌉.
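Writing the counting out (a sketch; it assumes the two components are packed together into a single state):

    |\text{states}(V)| = n \cdot n = n^2,
    \qquad
    \text{bits}(V) = \lceil \log_2(n^2) \rceil = \lceil 2 \log_2 n \rceil.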


Page 23: Spinning

A drawback of the RMW queue algorithm is that processors in the entry section repeatedly access the same shared variable; this is called spinning.

Having multiple processors spinning on the same shared variable can be very time-inefficient in certain multiprocessor architectures.

Idea: alter the queue algorithm so that each waiting processor spins on a different shared variable.


Page 24: RMW Mutex Algorithm With Separate Spinning

Shared RMW variables:

Last: corresponds to the last "pointer" from the previous algorithm
- cycles through 0 to n-1
- keeps track of the index to be given to the next processor that starts waiting
- initially 0


Page 25: RMW Mutex Algorithm With Separate Spinning

Shared RMW variables (continued):

Flags[0..n-1]: array of binary variables
- these are the variables that processors spin on
- the algorithm makes sure no two processors spin on the same variable at the same time
- initially Flags[0] = 1 (proc "has lock") and Flags[i] = 0 (proc "must wait") for i > 0


Page 26: Overview of Algorithm

entry section:
- get the next index from Last and store it in a local variable myPlace
- increment Last (with wrap-around)
- spin on Flags[myPlace] until it equals 1 (means proc "has lock" and can enter the CS)
- set Flags[myPlace] to 0 ("doesn't have lock")

exit section:
- set Flags[(myPlace+1) mod n] to 1 (i.e., give the priority to the next proc); the modular arithmetic provides the wrap-around (see the sketch below)
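A minimal Java sketch of this separate-spinning (array-based queue) lock, an illustration added to these notes: AtomicInteger and AtomicIntegerArray stand in for the RMW variables Last and Flags, and a ThreadLocal plays the role of each processor's local myPlace:

    import java.util.concurrent.atomic.AtomicInteger;
    import java.util.concurrent.atomic.AtomicIntegerArray;

    // Array lock: each waiting processor spins on its own slot of Flags.
    class ArrayQueueLock {
        private final int n;                    // number of processors
        private final AtomicInteger last;       // Last: next index to hand out
        private final AtomicIntegerArray flags; // Flags: 1 = has lock, 0 = must wait
        private final ThreadLocal<Integer> myPlace = new ThreadLocal<>();

        ArrayQueueLock(int n) {
            this.n = n;
            this.last = new AtomicInteger(0);
            this.flags = new AtomicIntegerArray(n); // all slots 0 initially...
            this.flags.set(0, 1);                   // ...except Flags[0] = 1 ("has lock")
        }

        void lock() {
            // atomically take my index and increment Last with wrap-around (the rmw step)
            int place = last.getAndUpdate(i -> (i + 1) % n);
            myPlace.set(place);
            while (flags.get(place) == 0) {
                // spin on my own flag only
            }
            flags.set(place, 0); // "doesn't have lock" once I move on
        }

        void unlock() {
            // give priority to the next processor's slot (wrap-around via mod n)
            flags.set((myPlace.get() + 1) % n, 1);
        }
    }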


Page 27: Question

Do the shared variables Last and Flags have to be RMW variables?

Answer: The RMW semantics (atomically reading and updating a variable) are needed for Last, to make sure two processors don't get the same index at overlapping times.


Page 28: Invariants of the Algorithm

1. At most one element of Flags has value 1 ("has lock")

2. If no element of Flags has value 1, then some processor is in the CS.

3. If Flags[k] = 1, then exactly (Last - k) mod n processors are in the entry section, spinning on Flags[i], for i = k, (k+1) mod n, …, (Last-1) mod n.


Page 29: Example of Invariant

    Flags: [0, 0, 1, 0, 0, 0, 0, 0]   (indices 0 through 7)
    Last = 5

Here k = 2 and Last = 5, so 5 - 2 = 3 procs are in entry, spinning on Flags[2], Flags[3], Flags[4].

Page 30: Correctness

Those three invariants can be used to prove:
- mutual exclusion is satisfied
- n-bounded waiting is satisfied


Page 31: Lower Bound on Number of Memory States

Theorem (4.4): Any mutex algorithm with k-bounded waiting (and no deadlock) uses at least n states of shared memory.

Proof: Assume in contradiction there is an algorithm using fewer than n states of shared memory.


Page 32: Lower Bound on Number of Memory States

Consider this execution of the algorithm. Starting from the initial configuration C, in which all processors are in the remainder section, run p0 alone; by no-deadlock it eventually enters the CS, giving configuration C0. Then let p1 take steps until it is in its entry section (configuration C1), then p2 (configuration C2), and so on through pn-1 (configuration Cn-1).

Since there are n configurations C0, …, Cn-1 but fewer than n states of shared memory, by the pigeonhole principle there exist i and j with i < j such that Ci and Cj have the same state of shared memory.

Page 33: Lower Bound on Number of Memory States

The shared memory state is the same in Ci as in Cj, where:
- in Ci: p0 is in the CS, p1, …, pi are in their entry sections, and the rest are in the remainder
- in Cj: p0 is in the CS, p1, …, pj are in their entry sections, and the rest are in the remainder

Let σ be a schedule in which p0, …, pi take steps in round robin. Applying σ from Ci, no-deadlock implies that some ph (h ≤ i) enters the CS k+1 times. Applying the same σ from Cj, ph again enters the CS k+1 times, all while pi+1 remains in its entry section, violating k-bounded waiting.

Page 34: Lower Bound on Number of Memory States

But why does ph do the same thing when executing the sequence of steps in σ starting from Cj as when starting from Ci?

All the processors p0, …, pi do the same thing because:
- they are in the same states in the two configurations
- the shared memory state is the same in the two configurations
- the only differences between Ci and Cj are (potentially) the states of pi+1, …, pj, and those processors don't take any steps in σ