Upload
others
View
11
Download
0
Embed Size (px)
Citation preview
Concurrent Objects
Companion slides for
The Art of Multiprocessor Programming
by Maurice Herlihy & Nir Shavit
Art of Multiprocessor
Programming
2
Concurrent Computation
memory
object object
Art of Multiprocessor
Programming
3
Objectivism
• What is a concurrent object?
– How do we describe one?
– How do we implement one?
– How do we tell if we’re right?
Art of Multiprocessor
Programming
4
Objectivism
• What is a concurrent object?
– How do we describe one?
– How do we tell if we’re right?
Art of Multiprocessor
Programming
5
FIFO Queue: Enqueue Method
q.enq( )
Art of Multiprocessor
Programming
6
FIFO Queue: Dequeue Method
q.deq()/
Art of Multiprocessor
Programming
7
Lock-Based Queue
headtail0
2
1
5 4
3
yx
capacity = 8
7
6
Art of Multiprocessor
Programming
8
Lock-Based Queue
headtail0
2
1
5 4
3
capacity = 8
7
6
Fields protected by
single shared lock
yx
class LockBasedQueue<T> {
int head, tail;
T[] items;
Lock lock;
public LockBasedQueue(int capacity) {
head = 0; tail = 0;
lock = new ReentrantLock();
items = (T[]) new Object[capacity];
}
Art of Multiprocessor
Programming
9
A Lock-Based Queue
class LockBasedQueue<T> {
int head, tail;
T[] items;
Lock lock;
public LockBasedQueue(int capacity) {
head = 0; tail = 0;
lock = new ReentrantLock();
items = (T[]) new Object[capacity];
}
Art of Multiprocessor
Programming
10
A Lock-Based Queue
Fields protected by
single shared lock
0 1
capacity-12
head tail
y z
Art of Multiprocessor
Programming
11
Lock-Based Queue
head
tail
0
2
1
5 4
3
Initially: head = tail
7
6
class LockBasedQueue<T> {
int head, tail;
T[] items;
Lock lock;
public LockBasedQueue(int capacity) {
head = 0; tail = 0;
lock = new ReentrantLock();
items = (T[]) new Object[capacity];
}
Art of Multiprocessor
Programming
12
A Lock-Based Queue
Initially head = tail
0 1
capacity-12
head tail
y z
Art of Multiprocessor
Programming
13
Lock-Based deq()
headtail0
2
5 4
7
36
1
yx
Art of Multiprocessor
Programming
14
Acquire Lock
headtail0
2
5 4
7
36
yx
1
Waiting to
enqueue…
My turn … yx
public T deq() throws EmptyException {
lock.lock();
try {
if (tail == head)
throw new EmptyException();
T x = items[head % items.length];
head++;
return x;
} finally {
lock.unlock();
}
}
Art of Multiprocessor
Programming
15
Implementation: deq()
Acquire lock at
method start
0 1
capacity-12
head tail
y z
Art of Multiprocessor
Programming
16
Check if Non-Empty
headtail
0
2
5 4
7
36
1
yx
Waiting to
enqueue…
Not equal?
public T deq() throws EmptyException {
lock.lock();
try {
if (tail == head)
throw new EmptyException();
T x = items[head % items.length];
head++;
return x;
} finally {
lock.unlock();
}
}
Art of Multiprocessor
Programming
17
Implementation: deq()
If queue empty
throw exception
0 1
capacity-12
head tail
y z
Art of Multiprocessor
Programming
18
Modify the Queue
headtail0
2
1
5 4
7
36
head
Waiting to
enqueue…
yx
public T deq() throws EmptyException {
lock.lock();
try {
if (tail == head)
throw new EmptyException();
T x = items[head % items.length];
head++;
return x;
} finally {
lock.unlock();
}
}
Art of Multiprocessor
Programming
19
Implementation: deq()
Queue not empty?
Remove item and update head
0 1
capacity-12
head tail
y z
public T deq() throws EmptyException {
lock.lock();
try {
if (tail == head)
throw new EmptyException();
T x = items[head % items.length];
head++;
return x;
} finally {
lock.unlock();
}
}
Art of Multiprocessor
Programming
20
Implementation: deq()
Return result
0 1
capacity-12
head tail
y z
Art of Multiprocessor
Programming
21
Release the Lock
tail0
2
1
5 4
7
36
y
x
head
Waiting…
Art of Multiprocessor
Programming
22
Release the Lock
tail0
2
1
5 4
7
36
y
x
head
My turn!
public T deq() throws EmptyException {
lock.lock();
try {
if (tail == head)
throw new EmptyException();
T x = items[head % items.length];
head++;
return x;
} finally {
lock.unlock();
}
}
Art of Multiprocessor
Programming
23
Implementation: deq()
Release lock no
matter what!
0 1
capacity-12
head tail
y z
public T deq() throws EmptyException {
lock.lock();
try {
if (tail == head)
throw new EmptyException();
T x = items[head % items.length];
head++;
return x;
} finally {
lock.unlock();
}
}
Art of Multiprocessor
Programming
24
Implementation: deq()
Art of Multiprocessor
Programming
25
Now consider the following
implementation
• The same thing without mutual exclusion
• For simplicity, only two threads
– One thread enq only
– The other deq only
Art of Multiprocessor
Programming
26
Wait-free 2-Thread Queue
headtail0
2
1
5 4
7
36
yx
capacity = 8
Art of Multiprocessor
Programming
27
Wait-free 2-Thread Queue
tail0
2
5 4
7
36
yx
1
enq(z)deq()
z
head
Art of Multiprocessor
Programming
28
Wait-free 2-Thread Queuehead
tail0
2
5 4
7
36
y
1
queue[tail]
= z
result = x
z
x
Art of Multiprocessor
Programming
29
Wait-free 2-Thread Queue
tail0
2
5 4
7
36
y
1
tail--head++
z
head
x
public class WaitFreeQueue {
int head = 0, tail = 0;
items = (T[]) new Object[capacity];
public void enq(Item x) {
if (tail-head == capacity) throw
new FullException();
items[tail % capacity] = x; tail++;
}
public Item deq() {
if (tail == head) throw
new EmptyException();
Item item = items[head % capacity]; head++;
return item;
}}Art of Multiprocessor
Programming
30
Wait-free 2-Thread Queue
No lock needed !
0 1
capacity-12
head tail
y z
Wait-free 2-Thread Queue
Art of Multiprocessor
Programming
31
public T deq() throws EmptyException {
lock.lock();
try {
if (tail == head)
throw new EmptyException();
T x = items[head % items.length];
head++;
return x;
} finally {
lock.unlock();
}
}
Art of Multiprocessor
Programming
32
What is a Concurrent Queue?
• Need a way to specify a concurrent queue object
• Need a way to prove that an algorithm implements the object’s specification
• Lets talk about object specifications …
Correctness and Progress
• In a concurrent setting, we need to specify
both the safety and the liveness properties
of an object
• Need a way to define
– when an implementation is correct
– the conditions under which it guarantees
progress
Art of Multiprocessor
Programming
33
Lets begin with correctness
Art of Multiprocessor
Programming
34
Sequential Objects
• Each object has a state
– Usually given by a set of fields
– Queue example: sequence of items
• Each object has a set of methods
– Only way to manipulate state
– Queue example: enq and deq methods
Art of Multiprocessor
Programming
35
Sequential Specifications
• If (precondition)
– the object is in such-and-such a state
– before you call the method,
• Then (postcondition)
– the method will return a particular value
– or throw a particular exception.
• and (postcondition, con’t)
– the object will be in some other state
– when the method returns,
Art of Multiprocessor
Programming
36
Pre and PostConditions for
Dequeue
• Precondition:
– Queue is non-empty
• Postcondition:
– Returns first item in queue
• Postcondition:
– Removes first item in queue
Art of Multiprocessor
Programming
37
Pre and PostConditions for
Dequeue
• Precondition:
– Queue is empty
• Postcondition:
– Throws Empty exception
• Postcondition:
– Queue state unchanged
Art of Multiprocessor
Programming
38
Why Sequential Specifications
Totally Rock
• Interactions among methods captured by side-
effects on object state
– State meaningful between method calls
• Documentation size linear in number of methods
– Each method described in isolation
• Can add new methods
– Without changing descriptions of old methods
Art of Multiprocessor
Programming
39
What About Concurrent
Specifications ?
• Methods?
• Documentation?
• Adding new methods?
Art of Multiprocessor
Programming
40
Methods Take Time
timetime
Art of Multiprocessor
Programming
41
Methods Take Time
time
invocation 12:00
q.enq(...)
time
Art of Multiprocessor
Programming
42
Methods Take Time
time
Method call
invocation 12:00
time
q.enq(...)
Art of Multiprocessor
Programming
43
Methods Take Time
time
Method call
invocation 12:00
time
q.enq(...)
Art of Multiprocessor
Programming
44
Methods Take Time
time
Method call
invocation 12:00
time
void
response 12:01
q.enq(...)
Art of Multiprocessor
Programming
45
Sequential vs Concurrent
• Sequential
– Methods take time? Who knew?
• Concurrent
– Method call is not an event
– Method call is an interval.
Art of Multiprocessor
Programming
46
time
Concurrent Methods Take
Overlapping Time
time
Art of Multiprocessor
Programming
47
time
Concurrent Methods Take
Overlapping Time
time
Method call
Art of Multiprocessor
Programming
48
time
Concurrent Methods Take
Overlapping Time
time
Method call
Method call
Art of Multiprocessor
Programming
49
time
Concurrent Methods Take
Overlapping Time
time
Method call Method call
Method call
Art of Multiprocessor
Programming
50
Sequential vs Concurrent
• Sequential:
– Object needs meaningful state only between
method calls
• Concurrent
– Because method calls overlap, object might
never be between method calls
Art of Multiprocessor
Programming
51
Sequential vs Concurrent
• Sequential:
– Each method described in isolation
• Concurrent
– Must characterize all possible interactions
with concurrent calls
• What if two enq() calls overlap?
• Two deq() calls? enq() and deq()? …
Art of Multiprocessor
Programming
52
Sequential vs Concurrent
• Sequential:
– Can add new methods without affecting older
methods
• Concurrent:
– Everything can potentially interact with
everything else
Art of Multiprocessor
Programming
53
Sequential vs Concurrent
• Sequential:
– Can add new methods without affecting older
methods
• Concurrent:
– Everything can potentially interact with
everything else
Art of Multiprocessor
Programming
54
The Big Question
• What does it mean for a concurrent object to be correct?
– What is a concurrent FIFO queue?
– FIFO means strict temporal order
– Concurrent means ambiguous temporal order
Art of Multiprocessor
Programming
55
Intuitively…
public T deq() throws EmptyException {
lock.lock();
try {
if (tail == head)
throw new EmptyException();
T x = items[head % items.length];
head++;
return x;
} finally {
lock.unlock();
}
}
Art of Multiprocessor
Programming
56
Intuitively…
public T deq() throws EmptyException {
lock.lock();
try {
if (tail == head)
throw new EmptyException();
T x = items[head % items.length];
head++;
return x;
} finally {
lock.unlock();
}
}
All queue modifications
are mutually exclusive
Art of Multiprocessor
Programming
57
time
Intuitively
q.deq
q.enq
enq deq
lock() unlock()
lock() unlock()
Behavior is
“Sequential”
enq
deq
Lets capture the idea of describing
the concurrent via the sequential
Art of Multiprocessor
Programming
58
Linearizability
• Each method should
– “take effect”
– Instantaneously
– Between invocation and response events
• Object is correct if this “sequential” behavior is correct
• Any such concurrent object is
– Linearizable™
Art of Multiprocessor
Programming
59
Is it really about the object?
• Each method should
– “take effect”
– Instantaneously
– Between invocation and response events
• Sounds like a property of an execution…
• A linearizable object: one all of whose
possible executions are linearizable
Art of Multiprocessor
Programming
60
Example
timetime
Art of Multiprocessor
Programming
61
Example
time
q.enq(x)
time
Art of Multiprocessor
Programming
62
Example
time
q.enq(x)
q.enq(y)
time
Art of Multiprocessor
Programming
63
Example
time
q.enq(x)
q.enq(y) q.deq(x)
time
Art of Multiprocessor
Programming
64
Example
time
q.enq(x)
q.enq(y) q.deq(x)
q.deq(y)
time
Art of Multiprocessor
Programming
65
Example
time
q.enq(x)
q.enq(y) q.deq(x)
q.deq(y)q.enq(x)
q.enq(y) q.deq(x)
q.deq(y)
time
Art of Multiprocessor
Programming
66
Example
time
q.enq(x)
q.enq(y) q.deq(x)
q.deq(y)q.enq(x)
q.enq(y) q.deq(x)
q.deq(y)
time
Art of Multiprocessor
Programming
67
Example
time
Art of Multiprocessor
Programming
68
Example
time
q.enq(x)
Art of Multiprocessor
Programming
69
Example
time
q.enq(x) q.deq(y)
Art of Multiprocessor
Programming
70
Example
time
q.enq(x)
q.enq(y)
q.deq(y)
Art of Multiprocessor
Programming
71
Example
time
q.enq(x)
q.enq(y)
q.deq(y)q.enq(x)
q.enq(y)
Art of Multiprocessor
Programming
72
Example
time
q.enq(x)
q.enq(y)
q.deq(y)q.enq(x)
q.enq(y)
Art of Multiprocessor
Programming
73
Example
timetime
Art of Multiprocessor
Programming
74
Example
time
q.enq(x)
time
Art of Multiprocessor
Programming
75
Example
time
q.enq(x)
q.deq(x)
time
Art of Multiprocessor
Programming
76
Example
time
q.enq(x)
q.deq(x)
q.enq(x)
q.deq(x)
time
Art of Multiprocessor
Programming
77
Example
time
q.enq(x)
q.deq(x)
q.enq(x)
q.deq(x)
time
Art of Multiprocessor
Programming
78
Example
time
q.enq(x)
time
Art of Multiprocessor
Programming
79
Example
time
q.enq(x)
q.enq(y)
time
Art of Multiprocessor
Programming
80
Example
time
q.enq(x)
q.enq(y)
q.deq(y)
time
Art of Multiprocessor
Programming
81
Example
time
q.enq(x)
q.enq(y)
q.deq(y)
q.deq(x)
time
Art of Multiprocessor
Programming
82
q.enq(x)
q.enq(y)
q.deq(y)
q.deq(x)
Comme ci Example
time
Comme ça
Art of Multiprocessor
Programming
83
Read/Write Register Example
time
read(1)write(0)
write(1)
write(2)
time
read(0)
Art of Multiprocessor
Programming
84
Read/Write Register Example
time
read(1)write(0)
write(1)
write(2)
time
read(0)
write(1) already
happened
Art of Multiprocessor
Programming
85
Read/Write Register Example
time
read(1)write(0)
write(1)
write(2)
time
read(0)write(1)
write(1) already
happened
Art of Multiprocessor
Programming
86
Read/Write Register Example
time
read(1)write(0)
write(1)
write(2)
time
read(0)write(1)
write(1) already
happened
Art of Multiprocessor
Programming
87
Read/Write Register Example
time
read(1)write(0)
write(1)
write(2)
time
read(1)
write(1) already
happened
Art of Multiprocessor
Programming
88
Read/Write Register Example
time
read(1)write(0)
write(1)
write(2)
time
read(1)write(1)
write(2)
write(1) already
happened
Art of Multiprocessor
Programming
89
Read/Write Register Example
time
read(1)write(0)
write(1)
write(2)
time
read(1)write(1)
write(2)
write(1) already
happened
Art of Multiprocessor
Programming
90
Read/Write Register Example
time
write(0)
write(1)
write(2)
time
read(1)
Art of Multiprocessor
Programming
91
Read/Write Register Example
time
write(0)
write(1)
write(2)
time
read(1)write(1)
write(2)
Art of Multiprocessor
Programming
92
Read/Write Register Example
time
write(0)
write(1)
write(2)
time
read(1)write(1)
write(2)
Art of Multiprocessor
Programming
93
Talking About Executions
• Why?
– Can’t we specify the linearization point of
each operation without describing an
execution?
• Not Always
– In some cases, linearization point depends
on the execution
Art of Multiprocessor
Programming
94
Formal Model of Executions
• Define precisely what we mean
– Ambiguity is bad when intuition is weak
• Allow reasoning
– Formal
– But mostly informal
• In the long run, actually more important
• Ask me why!
Art of Multiprocessor
Programming
95
Split Method Calls into Two Events
• Invocation
– method name & args
– q.enq(x)
• Response
– result or exception
– q.enq(x) returns void
– q.deq() returns x
– q.deq() throws empty
Art of Multiprocessor
Programming
96
Invocation Notation
A q.enq(x)
Art of Multiprocessor
Programming
97
Invocation Notation
A q.enq(x)
thread
Art of Multiprocessor
Programming
98
Invocation Notation
A q.enq(x)
thread method
Art of Multiprocessor
Programming
99
Invocation Notation
A q.enq(x)
thread
object
method
Art of Multiprocessor
Programming
100
Invocation Notation
A q.enq(x)
thread
object
method
arguments
Art of Multiprocessor
Programming
101
Response Notation
A q: void
Art of Multiprocessor
Programming
102
Response Notation
A q: void
thread
Art of Multiprocessor
Programming
103
Response Notation
A q: void
thread result
Art of Multiprocessor
Programming
104
Response Notation
A q: void
thread
object
result
Art of Multiprocessor
Programming
105
Response Notation
A q: void
thread
object
result
Art of Multiprocessor
Programming
106
Response Notation
A q: empty()
thread
object
exception
Art of Multiprocessor
Programming
107
History - Describing an Execution
A q.enq(3)
A q:void
A q.enq(5)
B p.enq(4)
B p:void
B q.deq()
B q:3Sequence of
invocations and
responses
H =
Art of Multiprocessor
Programming
108
Definition
• Invocation & response match if
A q.enq(3)
A q:void
Thread
names agree
Object names agree
Method call
Art of Multiprocessor
Programming
109
Object Projections
A q.enq(3)
A q:void
B p.enq(4)
B p:void
B q.deq()
B q:3
H =
Art of Multiprocessor
Programming
110
Object Projections
A q.enq(3)
A q:void
B p.enq(4)
B p:void
B q.deq()
B q:3
H|q =
Art of Multiprocessor
Programming
111
Thread Projections
A q.enq(3)
A q:void
B p.enq(4)
B p:void
B q.deq()
B q:3
H =
Art of Multiprocessor
Programming
112
Thread Projections
A q.enq(3)
A q:void
B p.enq(4)
B p:void
B q.deq()
B q:3
H|B =
Art of Multiprocessor
Programming
113
Complete Subhistory
A q.enq(3)
A q:void
A q.enq(5)
B p.enq(4)
B p:void
B q.deq()
B q:3An invocation is
pending if it has no
matching respnse
H =
Art of Multiprocessor
Programming
114
Complete Subhistory
A q.enq(3)
A q:void
A q.enq(5)
B p.enq(4)
B p:void
B q.deq()
B q:3May or may not
have taken effect
H =
Art of Multiprocessor
Programming
115
Complete Subhistory
A q.enq(3)
A q:void
A q.enq(5)
B p.enq(4)
B p:void
B q.deq()
B q:3discard pending
invocations
H =
Art of Multiprocessor
Programming
116
Complete Subhistory
A q.enq(3)
A q:void
B p.enq(4)
B p:void
B q.deq()
B q:3
Complete(H) =
Art of Multiprocessor
Programming
117
Sequential Histories
A q.enq(3)
A q:void
B p.enq(4)
B p:void
B q.deq()
B q:3
A q:enq(5)
Art of Multiprocessor
Programming
118
Sequential Histories
A q.enq(3)
A q:void
B p.enq(4)
B p:void
B q.deq()
B q:3
A q:enq(5)
match
Art of Multiprocessor
Programming
119
Sequential Histories
A q.enq(3)
A q:void
B p.enq(4)
B p:void
B q.deq()
B q:3
A q:enq(5)
match
match
Art of Multiprocessor
Programming
120
Sequential Histories
A q.enq(3)
A q:void
B p.enq(4)
B p:void
B q.deq()
B q:3
A q:enq(5)
match
match
match
Art of Multiprocessor
Programming
121
Sequential Histories
A q.enq(3)
A q:void
B p.enq(4)
B p:void
B q.deq()
B q:3
A q:enq(5)
match
match
match
Final pending
invocation OK
Art of Multiprocessor
Programming
122
Sequential Histories
A q.enq(3)
A q:void
B p.enq(4)
B p:void
B q.deq()
B q:3
A q:enq(5)
match
match
match
Final pending
invocation OK
Art of Multiprocessor
Programming
123
Well-Formed Histories
H=
A q.enq(3)
B p.enq(4)
B p:void
B q.deq()
A q:void
B q:3
Art of Multiprocessor
Programming
124
Well-Formed Histories
H=
A q.enq(3)
B p.enq(4)
B p:void
B q.deq()
A q:void
B q:3
H|B=
B p.enq(4)
B p:void
B q.deq()
B q:3
Per-thread projections
sequential
Art of Multiprocessor
Programming
125
Well-Formed Histories
H=
A q.enq(3)
B p.enq(4)
B p:void
B q.deq()
A q:void
B q:3
H|B=
B p.enq(4)
B p:void
B q.deq()
B q:3
A q.enq(3)
A q:voidH|A=
Per-thread projections
sequential
Art of Multiprocessor
Programming
126
Equivalent Histories
H=
A q.enq(3)
B p.enq(4)
B p:void
B q.deq()
A q:void
B q:3
Threads see the same
thing in both
A q.enq(3)
A q:void
B p.enq(4)
B p:void
B q.deq()
B q:3
G=
H|A = G|A
H|B = G|B
Art of Multiprocessor
Programming
127
Sequential Specifications
• A sequential specification is some way of
telling whether a
– Single-thread, single-object history
– Is legal
• For example:
– Pre and post-conditions
– But plenty of other techniques exist …
Art of Multiprocessor
Programming
128
Legal Histories
• A sequential (multi-object) history H is
legal if
– For every object x
– H|x is in the sequential spec for x
Art of Multiprocessor
Programming
129
Precedence
A q.enq(3)
B p.enq(4)
B p.void
A q:void
B q.deq()
B q:3
A method call precedes
another if response
event precedes
invocation event
Method call Method call
Art of Multiprocessor
Programming
130
Non-Precedence
A q.enq(3)
B p.enq(4)
B p.void
B q.deq()
A q:void
B q:3
Some method calls
overlap one another
Method call
Method call
Art of Multiprocessor
Programming
131
Notation
• Given
– History H
– method executions m0 and m1 in H
• We say m0 ➔H m1, if
– m0 precedes m1
• Relation m0 ➔H m1 is a
– Partial order
– Total order if H is sequential
m0 m1
Art of Multiprocessor
Programming
132
Linearizability
• History H is linearizable if it can be extended to G by
– Appending zero or more responses to pending invocations
– Discarding other pending invocations
• So that G is equivalent to
– Legal sequential history S
– where ➔G ➔S
Art of Multiprocessor
Programming
133
Remarks
• Some pending invocations
– Took effect, so keep them
– Discard the rest
• Condition ➔G ➔S
– Means that S respects “real-time order” of G
Art of Multiprocessor
Programming
134
Ensuring ➔G ➔S
time
a
b
time ➔S
c➔G
➔G = {a→c,b→c}
➔S = {a→b,a→c,b→c}
Art of Multiprocessor
Programming
135
A q.enq(3)
B q.enq(4)
B q:void
B q.deq()
B q:4
B q:enq(6)
Example
time
B q.enq(4)
A q.enq(3)
B q.deq(4) B q.enq(6)
Art of Multiprocessor
Programming
136
Example
Complete this
pending
invocation
time
B q.enq(4) B q.deq(3) B q.enq(6)
A q.enq(3)
B q.enq(4)
B q:void
B q.deq()
B q:4
B q:enq(6)
A q.enq(3)
Art of Multiprocessor
Programming
137
Example
Complete this
pending
invocation
time
B q.enq(4) B q.deq(4) B q.enq(6)
A q.enq(3)
A q.enq(3)
B q.enq(4)
B q:void
B q.deq()
B q:4
B q:enq(6)
A q:void
Art of Multiprocessor
Programming
138
Example
time
B q.enq(4) B q.deq(4) B q.enq(6)
A q.enq(3)
A q.enq(3)
B q.enq(4)
B q:void
B q.deq()
B q:4
B q:enq(6)
A q:void
discard this one
Art of Multiprocessor
Programming
139
Example
time
B q.enq(4) B q.deq(4)
A q.enq(3)
A q.enq(3)
B q.enq(4)
B q:void
B q.deq()
B q:4
A q:void
discard this one
Art of Multiprocessor
Programming
140
A q.enq(3)
B q.enq(4)
B q:void
B q.deq()
B q:4
A q:void
Example
time
B q.enq(4) B q.deq(4)
A q.enq(3)
Art of Multiprocessor
Programming
141
A q.enq(3)
B q.enq(4)
B q:void
B q.deq()
B q:4
A q:void
Example
time
B q.enq(4)
B q:void
A q.enq(3)
A q:void
B q.deq()
B q:4
B q.enq(4) B q.deq(4)
A q.enq(3)
Art of Multiprocessor
Programming
142
B q.enq(4) B q.deq(4)
A q.enq(3)
A q.enq(3)
B q.enq(4)
B q:void
B q.deq()
B q:4
A q:void
Example
time
B q.enq(4)
B q:void
A q.enq(3)
A q:void
B q.deq()
B q:4
Equivalent sequential history
Art of Multiprocessor
Programming
143
Concurrency
• How much concurrency does linearizability
allow?
• When must a method invocation block?
Art of Multiprocessor
Programming
144
Concurrency
• Focus on total methods
– Defined in every state
• Example:– deq() that throws Empty exception
– Versus deq() that waits …
• Why?
– Otherwise, blocking unrelated to synchronization
Art of Multiprocessor
Programming
145
Concurrency
• Question: When does linearizability
require a method invocation to block?
• Answer: never.
• Linearizability is non-blocking
Art of Multiprocessor
Programming
146
Non-Blocking Theorem
If method invocation
A q.inv(…)
is pending in history H, then there exists a
response
A q:res(…)
such that
H + A q:res(…)
is linearizable
Art of Multiprocessor
Programming
147
Proof
• Pick linearization S of H
• If S already contains
– Invocation A q.inv(…) and response,
– Then we are done.
• Otherwise, pick a response such that
– S + A q.inv(…) + A q:res(…)
– Possible because object is total.
Art of Multiprocessor
Programming
148
Composability Theorem
• History H is linearizable if and only if
– For every object x
– H|x is linearizable
• We care about objects only!
– (Materialism?)
Art of Multiprocessor
Programming
149
Why Does Composability Matter?
• Modularity
• Can prove linearizability of objects in
isolation
• Can compose independently-implemented
objects
Art of Multiprocessor
Programming
150
Reasoning About
Linearizability: Locking public T deq() throws EmptyException {
lock.lock();
try {
if (tail == head)
throw new EmptyException();
T x = items[head % items.length];
head++;
return x;
} finally {
lock.unlock();
}
}
0 1
capacity-12
head tail
y z
Art of Multiprocessor
Programming
151
Reasoning About
Linearizability: Locking public T deq() throws EmptyException {
lock.lock();
try {
if (tail == head)
throw new EmptyException();
T x = items[head % items.length];
head++;
return x;
} finally {
lock.unlock();
}
}
Linearization points
are when locks are
released
0 1
capacity-12
head tail
y z
Art of Multiprocessor
Programming
152
More Reasoning: Wait-free
0 1
capacity-12
head tail
y z
public class WaitFreeQueue {
int head = 0, tail = 0;
items = (T[]) new Object[capacity];
public void enq(Item x) {
if (tail-head == capacity) throw
new FullException();
items[tail % capacity] = x; tail++;
}
public Item deq() {
if (tail == head) throw
new EmptyException();
Item item = items[head % capacity]; head++;
return item;
}}
0 1
capacity-12
head tail
y z
public class WaitFreeQueue {
int head = 0, tail = 0;
items = (T[]) new Object[capacity];
public void enq(Item x) {
if (tail-head == capacity) throw
new FullException();
items[tail % capacity] = x; tail++;
}
public Item deq() {
if (tail == head) throw
new EmptyException();
Item item = items[head % capacity]; head++;
return item;
}}Art of Multiprocessor
Programming
153
More Reasoning: Wait-free
Linearization order is
order head and tail
fields modified
Art of Multiprocessor
Programming
154
Strategy
• Identify one atomic step where method
“happens”
– Critical section
– Machine instruction
• Doesn’t always work
– Might need to define several different steps
for a given method
Art of Multiprocessor
Programming
155
Linearizability: Summary
• Powerful specification tool for shared
objects
• Allows us to capture the notion of objects
being “atomic”
• Don’t leave home without it
Art of Multiprocessor
Programming
156
Alternative: Sequential
Consistency
• History H is Sequentially Consistent if it can be extended to G by
– Appending zero or more responses to pending invocations
– Discarding other pending invocations
• So that G is equivalent to a
– Legal sequential history S
– Where ➔G ➔S
Differs from
linearizability
Art of Multiprocessor
Programming
157
Sequential Consistency
• No need to preserve real-time order
– Cannot re-order operations done by the
same thread
– Can re-order non-overlapping operations
done by different threads
• Often used to describe multiprocessor
memory architectures
Art of Multiprocessor
Programming
158
Example
time
Art of Multiprocessor
Programming
159
Example
time
q.enq(x)
Art of Multiprocessor
Programming
160
Example
time
q.enq(x) q.deq(y)
Art of Multiprocessor
Programming
161
Example
time
q.enq(x)
q.enq(y)
q.deq(y)
Art of Multiprocessor
Programming
162
Example
time
q.enq(x)
q.enq(y)
q.deq(y)q.enq(x)
q.enq(y)
Art of Multiprocessor
Programming
163
Example
time
q.enq(x)
q.enq(y)
q.deq(y)q.enq(x)
q.enq(y)
Art of Multiprocessor
Programming
164
Example
time
q.enq(x)
q.enq(y)
q.deq(y)q.enq(x)
q.enq(y)
Art of Multiprocessor
Programming
165
Theorem
Sequential Consistency is not
composable
Art of Multiprocessor
Programming
166
FIFO Queue Example
time
p.enq(x) p.deq(y)q.enq(x)
time
Art of Multiprocessor
Programming
167
FIFO Queue Example
time
p.enq(x) p.deq(y)q.enq(x)
q.enq(y) q.deq(x)p.enq(y)
time
Art of Multiprocessor
Programming
168
FIFO Queue Example
time
p.enq(x) p.deq(y)q.enq(x)
q.enq(y) q.deq(x)p.enq(y)
History H
time
Art of Multiprocessor
Programming
169
H|p Sequentially Consistent
time
p.enq(x) p.deq(y)
p.enq(y)
q.enq(x)
q.enq(y) q.deq(x)
time
Art of Multiprocessor
Programming
170
H|q Sequentially Consistent
time
p.enq(x) p.deq(y)q.enq(x)
q.enq(y) q.deq(x)p.enq(y)
time
Art of Multiprocessor
Programming
171
Ordering imposed by p
time
p.enq(x) p.deq(y)q.enq(x)
q.enq(y) q.deq(x)p.enq(y)
time
Art of Multiprocessor
Programming
172
Ordering imposed by q
time
p.enq(x) p.deq(y)q.enq(x)
q.enq(y) q.deq(x)p.enq(y)
time
Art of Multiprocessor
Programming
173
p.enq(x)
Ordering imposed by both
time
q.enq(x)
q.enq(y) q.deq(x)
time
p.deq(y)
p.enq(y)
Art of Multiprocessor
Programming
174
p.enq(x)
Combining orders
time
q.enq(x)
q.enq(y) q.deq(x)
time
p.deq(y)
p.enq(y)
Art of Multiprocessor
Programming
175
Fact
• Most hardware architectures don’t support
sequential consistency
• Because they think it’s too strong
• Here’s another story …
Art of Multiprocessor
Programming
176
The Flag Example
time
x.write(1) y.read(0)
y.write(1) x.read(0)
time
Art of Multiprocessor
Programming
177
The Flag Example
time
x.write(1) y.read(0)
y.write(1) x.read(0)
• Each thread’s view is sequentially
consistent
– It went first
Art of Multiprocessor
Programming
178
The Flag Example
time
x.write(1) y.read(0)
y.write(1) x.read(0)
• Entire history isn’t sequentially
consistent
– Can’t both go first
Art of Multiprocessor
Programming
179
The Flag Example
time
x.write(1) y.read(0)
y.write(1) x.read(0)
• Is this behavior really so wrong?
– We can argue either way …
Art of Multiprocessor
Programming
180
Opinion: It’s Wrong
• This pattern
– Write mine, read yours
• Is exactly the flag principle
– Beloved of Alice and Bob
– Heart of mutual exclusion• Peterson
• Bakery, etc.
• It’s non-negotiable!
Art of Multiprocessor
Programming
181
Peterson's Algorithm
public void lock() {
flag[i] = true;
victim = i;
while (flag[j] && victim == i) {};
}
public void unlock() {
flag[i] = false;
}
Art of Multiprocessor
Programming
182
Crux of Peterson Proof
(1) writeB(flag[B]=true)➔writeB(victim=B)
(3) writeB(victim=B)➔writeA(victim=A)
(2) writeA(victim=A)➔readA(flag[B])
➔ readA(victim)
Art of Multiprocessor
Programming
183
Crux of Peterson Proof
(1) writeB(flag[B]=true)➔writeB(victim=B)
(3) writeB(victim=B)➔writeA(victim=A)
(2) writeA(victim=A)➔readA(flag[B])
➔ readA(victim)
Observation: proof relied on fact that if a
location is stored, a later load by some thread
will return this or a later stored value.
Art of Multiprocessor
Programming
184
Opinion: But It Feels So Right …
• Many hardware architects think that
sequential consistency is too strong
• Too expensive to implement in modern
hardware
• OK if flag principle
– violated by default
– Honored by explicit request
Hardware Consistancy
mov 1, a ;Store
mov b, %ebx ;Load
mov 1, b ;Store
mov a, %eax ;Load
Initially, a = b = 0.
Processor 0 Processor 1
What are the final possible values of %eax
and %ebx after both processors have
executed?
Sequential consistency implies that no
execution ends with %eax= %ebx = 0
Slide used with permission of
Charles E. Leiserson
Hardware Consistency
∙ No modern-day processor implements
sequential consistency.
∙ Hardware actively reorders instructions.
∙ Compilers may reorder instructions, too.
∙ Why?
∙ Because most of performance is derived
from a single thread’s unsynchronized
execution of code.
Art of Multiprocessor
Programming
Instruction Reordering
Q. Why might the hardware or compiler
decide to reorder these instructions?A. To obtain higher performance by covering
load latency — instruction-level
parallelism.
mov 1, a ;Store
mov b, %ebx ;Load
mov b, %ebx ;Load
mov 1, a ;Store
Program Order Execution Order
Slide used with permission of
Charles E. Leiserson
Instruction Reordering
Q. When is it safe for the hardware or
compiler to perform this reordering?
A. When a ≠ b.
A′. And there’s no concurrency.
mov 1, a ;Store
mov b, %ebx ;Load
mov b, %ebx ;Load
mov 1, a ;Store
Program Order Execution Order
Slide used with permission of
Charles E. Leiserson
Hardware Reordering
∙ Processor can issue stores faster than the
network can handle them ⇒ store buffer.
∙ Loads take priority, bypassing the store buffer.
∙ Except if a load address matches an address in
the store buffer, the store buffer returns the result.
Memory
System
Load Bypass
Processor Network
Store Buffer
Slide used with permission of
Charles E. Leiserson
X86: Memory Consistency
1. Loads are not reordered with loads.
2. Stores are not reordered with stores.
3. Stores are not reordered with prior
loads.
4. A load may be reordered with a prior
store to a different location but not
with a prior store to the same
location.
5. Stores to the same location respect a
global total order.
Store1
Store2
Load1
Store3
Store4
Load3
Load2
Load4
Load5
Thread’s
Code
Art of Multiprocessor
Programming
1. Loads are not reordered with loads.
2. Stores are not reordered with stores.
3. Stores are not reordered with prior
loads.
4. A load may be reordered with a prior
store to a different location but not
with a prior store to the same
location.
5. Stores to the same location respect a
global total order.
X86: Memory Consistency
Store1
Store2
Load1
Store3
Store4
Load3
Load2
Load4
Load5
Thread’s
Code
Total Store Ordering
(TSO)…weaker than
sequential consistency
LO
A
D
S
OK!
Art of Multiprocessor
Programming
Memory Barriers (Fences)
∙A memory barrier (or memory fence) is a
hardware action that enforces an ordering
constraint between the instructions before
and after the fence.
∙A memory barrier can be issued explicitly
as an instruction (x86: mfence)
∙The typical cost of a memory fence is
comparable to that of an L2-cache
access.
Art of Multiprocessor
Programming
1. Loads are not reordered with loads.
2. Stores are not reordered with stores.
3. Stores are not reordered with prior
loads.
4. A load may be reordered with a prior
store to a different location but not
with a prior store to the same
location.
5. Stores to the same location respect a
global total order.
X86: Memory Consistency
Store1
Store2
Load1
Store3
Store4
Load3
Load2
Load4
Load5
Thread’s
Code
Total Store Ordering +
properly placed memory
barriers = sequential
consistency
Barrier
Art of Multiprocessor
Programming
194
Memory Barriers
• Explicit Synchronization
• Memory barrier will
– Flush write buffer
– Bring caches up to date
• Compilers often do this for you
– Entering and leaving critical sections
Art of Multiprocessor
Programming
195
Volatile Variables
• In Java, can ask compiler to keep a
variable up-to-date by declaring it
volatile
• Adds a memory barrier after each store
• Inhibits reordering, removing from
loops, & other “compiler optimizations”
• Will talk about it in detail in later
lectures
Art of Multiprocessor
Programming
196
Summary: Real-World
• Hardware weaker than sequential
consistency
• Can get sequential consistency at a price
• Linearizability better fit for high-level
software
Art of Multiprocessor
Programming
197
Linearizability
• Linearizability
– Operation takes effect instantaneously
between invocation and response
– Uses sequential specification, locality implies
composablity
Art of Multiprocessor
Programming
198
Summary: Correctness
• Sequential Consistency
– Not composable
– Harder to work with
– Good way to think about hardware models
• We will use linearizability as our
consistency condition in the remainder of
this course unless stated otherwise
Progress
• We saw an implementation whose
methods were lock-based (deadlock-free)
• We saw an implementation whose
methods did not use locks (lock-free)
• How do they relate?
Art of Multiprocessor
Programming
199
Progress Conditions
• Deadlock-free: some thread trying to acquire the
lock eventually succeeds.
• Starvation-free: every thread trying to acquire
the lock eventually succeeds.
• Lock-free: some thread calling a method
eventually returns.
• Wait-free: every thread calling a method
eventually returns.
Art of Multiprocessor
Programming
200
Progress Conditions
Art of Multiprocessor
Programming201
Wait-free
Lock-free
Starvation-free
Deadlock-free
Everyone
makes
progress
Non-Blocking Blocking
Someone
makes
progress
Art of Multiprocessor
Programming
202
Summary
• We will look at linearizable blocking and
non-blocking implementations of objects.
Art of Multiprocessor
Programming
203
This work is licensed under a Creative Commons Attribution-
ShareAlike 2.5 License.
• You are free:
– to Share — to copy, distribute and transmit the work
– to Remix — to adapt the work
• Under the following conditions:
– Attribution. You must attribute the work to “The Art of
Multiprocessor Programming” (but not in any way that suggests that
the authors endorse you or your use of the work).
– Share Alike. If you alter, transform, or build upon this work, you
may distribute the resulting work only under the same, similar or a
compatible license.
• For any reuse or distribution, you must make clear to others the license
terms of this work. The best way to do this is with a link to
– http://creativecommons.org/licenses/by-sa/3.0/.
• Any of the above conditions can be waived if you get permission from
the copyright holder.
• Nothing in this license impairs or restricts the author's moral rights.