Fence Complexity in Concurrent Algorithms


Petr Kuznetsov, TU Berlin/DT-Labs

STM is about ease-of-programming and efficiency

What is “efficient” in a concurrent system?

Cost metrics

Space: used memory
  Cheap
  Advanced garbage collection

Time:
  the number of reads and writes (per operation)
  the number of stalls

Relaxed memory models

Memory is much slower than the CPU
  Read: check the cache -> read the memory
  Write: invalidate the caches -> update the memory
To overcome “stalled writes”: reorder operations

Reordering may result in inconsistency

What is inconsistency?

Process P:

Write(X,1)

Read(Y)

Process Q:

Write(Y,1)

Read(X)

[Figure: timelines of P and Q. P performs W(X,1) and R(Y); Q performs W(Y,1) and R(X).]

Possible outcomes

P reads before Q writes

P reads after Q writes

Q reads after P writes

Q reads before P writes

Out-of-order: P reads before Q writes and Q reads before P writes
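The cases above can be checked mechanically. The following Python sketch enumerates every interleaving of the two processes under sequential consistency (program order kept, one shared memory, X and Y initially 0) and collects the values read. The function name sc_outcomes and the tuple encoding of steps are illustrative choices, not something taken from the talk.

```python
from itertools import combinations

# Each process is a list of steps in program order.
# ("W", var, value) is a write; ("R", var) is a read.
P = [("W", "X", 1), ("R", "Y")]
Q = [("W", "Y", 1), ("R", "X")]

def sc_outcomes(p, q):
    """Return the set of (value read by P, value read by Q) over all
    interleavings that preserve each process's program order."""
    n = len(p) + len(q)
    outcomes = set()
    # Choose which positions of the merged schedule belong to P.
    for p_slots in combinations(range(n), len(p)):
        memory = {"X": 0, "Y": 0}   # assumed initial values
        reads = {}
        ip = iq = 0
        for pos in range(n):
            if pos in p_slots:
                name, step = "P", p[ip]
                ip += 1
            else:
                name, step = "Q", q[iq]
                iq += 1
            if step[0] == "W":
                memory[step[1]] = step[2]
            else:
                reads[name] = memory[step[1]]
        outcomes.add((reads["P"], reads["Q"]))
    return outcomes

print(sorted(sc_outcomes(P, Q)))
```

The pair (0, 0) never shows up in the result: no in-order interleaving lets both processes read before the other one writes, which is why that outcome is called out-of-order.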

Fixing out-of-order

Memory fences: read-after-write (RAW)

write(X,1)
fence() // enforce the order
read(Y)

[Figure: timelines of P and Q with the fence in place. P performs W(X,1) and then R(Y); Q performs W(Y,1) and R(X).]
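To see what the fence buys, here is a toy store-buffer machine in Python. It is only a sketch under stated assumptions: each process has a FIFO buffer of pending writes, a read looks in the process's own buffer first and then in memory, and a FENCE step may run only when the buffer is empty. The model, the function tso_outcomes and the step encoding are hypothetical illustrations, not the memory model used in the talk.

```python
def tso_outcomes(programs):
    """Explore every schedule of a toy store-buffer machine.

    programs maps a process name to its steps: ("W", var, value),
    ("R", var) or ("FENCE",). Variables start at 0 (an assumption).
    Returns the set of tuples of read values, ordered by process name.
    """
    names = sorted(programs)
    outcomes = set()

    def explore(pcs, buffers, memory, reads):
        progressed = False
        for i, name in enumerate(names):
            buf = buffers[i]
            # Option 1: flush the oldest buffered write to memory.
            if buf:
                progressed = True
                var, val = buf[0]
                flushed = buffers[:i] + (buf[1:],) + buffers[i + 1:]
                explore(pcs, flushed, {**memory, var: val}, reads)
            # Option 2: execute the next step of this process.
            if pcs[i] < len(programs[name]):
                step = programs[name][pcs[i]]
                nxt = pcs[:i] + (pcs[i] + 1,) + pcs[i + 1:]
                if step[0] == "W":
                    progressed = True
                    grown = buf + ((step[1], step[2]),)
                    explore(nxt, buffers[:i] + (grown,) + buffers[i + 1:],
                            memory, reads)
                elif step[0] == "R":
                    progressed = True
                    # Own pending writes win over memory.
                    pending = [v for (x, v) in buf if x == step[1]]
                    val = pending[-1] if pending else memory.get(step[1], 0)
                    explore(nxt, buffers, memory, {**reads, name: val})
                elif step[0] == "FENCE" and not buf:
                    progressed = True
                    explore(nxt, buffers, memory, reads)
        if not progressed:  # all programs finished, all buffers drained
            outcomes.add(tuple(reads.get(n) for n in names))

    explore(tuple(0 for _ in names), tuple(() for _ in names), {}, {})
    return outcomes

unfenced = {"P": [("W", "X", 1), ("R", "Y")],
            "Q": [("W", "Y", 1), ("R", "X")]}
fenced = {"P": [("W", "X", 1), ("FENCE",), ("R", "Y")],
          "Q": [("W", "Y", 1), ("FENCE",), ("R", "X")]}

print(sorted(tso_outcomes(unfenced)))
print(sorted(tso_outcomes(fenced)))
```

In this model the unfenced programs can produce (0, 0), because both writes may still sit in their buffers when the reads run. With a fence between the write and the read in both processes, (0, 0) disappears. Fencing only one of the two processes is not enough in this model.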

Fixing out-of-order

Atomic operations: atomic-write-after-read (AWAR)

atomic {
  read(Y)
  write(X,1)
}

E.g., CAS, TAS, Fetch&Add, …
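The read-then-write shape shared by these primitives can be sketched in Python. Python does not expose hardware CAS, so the sketch below emulates atomicity with a threading.Lock; the class AtomicCell and its method names are hypothetical, and the code only illustrates the pattern, not how a processor implements it.

```python
import threading

class AtomicCell:
    """A shared cell whose read-then-write operations are made atomic
    with a lock. This emulates primitives such as CAS; it is not one."""

    def __init__(self, value=0):
        self._value = value
        self._guard = threading.Lock()

    def compare_and_swap(self, expected, new):
        with self._guard:               # atomic {
            old = self._value           #   read
            if old == expected:
                self._value = new       #   write
            return old == expected      # }

    def test_and_set(self):
        with self._guard:
            old = self._value           # read the old value
            self._value = 1             # then write 1
            return old

    def fetch_and_add(self, delta):
        with self._guard:
            old = self._value           # read
            self._value = old + delta   # then write
            return old

def count_with_threads(n_threads=4, increments=1000):
    """Increment one cell from several threads; return the final value."""
    cell = AtomicCell(0)

    def worker():
        for _ in range(increments):
            cell.fetch_and_add(1)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cell.fetch_and_add(0)
```

Because every increment is one atomic read-then-write, no update is lost: count_with_threads() returns n_threads * increments.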

RAW/AWAR fences take ~60 RMRs


Our result


Any concurrent program in a certain class must use RAW/AWARs

What programs?

Concurrent data types: queues, counters, hash tables, trees, …
  Non-commutative operations
  Linearizable solo-terminating implementations

Mutual exclusion

Non-commutative operations

Operation A is non-commutative if there exists an operation B such that (applied to some state):
  A influences B, and
  B influences A

Example: Queue

enq(v): adds v to the end of the queue
deq(): dequeues the item at the head of the queue

Q = 1;2

Q.deq():1; Q.deq():2 vs. Q.deq():2; Q.deq():1
  deq() operations influence each other

Q.enq(3):ok; Q.deq():1 vs. Q.deq():1; Q.enq(3):ok
  enq() is commutative
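The queue example can be replayed with a small sequential model. The sketch below assumes that "A influences B" means: running A first changes the response B gets from the same starting state. The helpers run, influences and mutually_influence are hypothetical names introduced for illustration.

```python
from collections import deque

def run(initial, ops):
    """Apply ops in order to a fresh queue; return the list of responses."""
    q = deque(initial)
    responses = []
    for op in ops:
        if op[0] == "enq":
            q.append(op[1])
            responses.append("ok")
        else:  # ("deq",)
            responses.append(q.popleft() if q else None)
    return responses

def influences(initial, a, b):
    """True if running a first changes the response b gets from initial."""
    alone = run(initial, [b])[0]
    after_a = run(initial, [a, b])[1]
    return alone != after_a

def mutually_influence(initial, a, b):
    """True if a influences b and b influences a from this state."""
    return influences(initial, a, b) and influences(initial, b, a)

DEQ = ("deq",)
ENQ3 = ("enq", 3)
state = [1, 2]   # the queue Q = 1;2 from the slide

print(mutually_influence(state, DEQ, DEQ))   # two deq() operations
print(mutually_influence(state, ENQ3, DEQ))  # enq(3) and deq()
```

From the state 1;2, two deq() operations influence each other (the second one returns 2 instead of 1), while enq(3) and deq() leave each other's responses unchanged.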

Proof sketch

A non-commutative operation must write
Suppose not

[Figure: starting from the queue 1;2, two deq() operations both return 1.]

There must be a write w!

Proof sketch

Let w be the first write
Suppose there are no AWARs

[Figure: deq():1 applied to the queue 1;2, with the write w marked.]

A(w): the longest atomic construct containing w

w must be the first base-object event in A(w)!

Proof sketch

Suppose there are no RAWs

[Figure: two deq():1 operations on the queue 1;2, with A(w) marked.]

No RAW: no difference for deq()!

Mutual exclusion

Lock(): acquire the lock
Unlock(): release the lock

(Mutex) No two processes hold the lock at the same time
(Deadlock-freedom) If at least one process executes Lock() and no active process fails, at least one process acquires the lock

Two Lock() operations influence each other!

Our result

In any implementation of mutual exclusion or of a concurrent data type with a non-commutative operation op, a complete execution of op or lock() contains a RAW or an AWAR

Every successful lock acquire incurs a RAW/AWAR fence
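The statement can be turned into a quick sanity check on the memory events of one lock() execution. The sketch below assumes the following reading of the two patterns: a RAW is a write to X followed by a read of a different variable Y with no write to Y in between, and an AWAR is a single atomic event that reads and then writes (CAS, TAS, ...). The trace format, the function names and the three example traces are hypothetical and simplified.

```python
def has_raw(trace):
    """True if some write to X is later followed by a read of a different
    variable Y, with no write to Y between the two."""
    for i, (kind, var) in enumerate(trace):
        if kind != "W":
            continue
        overwritten = set()
        for kind2, var2 in trace[i + 1:]:
            if kind2 == "W":
                overwritten.add(var2)
            elif kind2 == "R" and var2 != var and var2 not in overwritten:
                return True
    return False

def has_awar(trace):
    """True if the trace contains an atomic read-then-write event."""
    return any(kind == "AWAR" for kind, _ in trace)

def has_raw_or_awar(trace):
    return has_raw(trace) or has_awar(trace)

# Simplified lock() traces of one process, as (kind, variable) pairs.
# Peterson-style lock for process 0: announce, yield, then check the peer.
peterson_lock = [("W", "flag0"), ("W", "turn"), ("R", "flag1"), ("R", "turn")]
# Test-and-set lock: one atomic read-then-write on the lock word.
tas_lock = [("AWAR", "lock")]
# A naive lock that reads the lock word and then writes it non-atomically.
naive_lock = [("R", "lock"), ("W", "lock")]

for name, trace in [("peterson", peterson_lock), ("tas", tas_lock),
                    ("naive", naive_lock)]:
    print(name, has_raw_or_awar(trace))
```

A trace with neither pattern, like the naive one, cannot come from a correct lock() according to the result above. Passing the check proves nothing about correctness; it is only a necessary condition.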

Why do we care?

Hardware design: what primitives must be optimized?

API design: returned values matter
  Set with add returning fail vs. returning ok

Verification: early catch of obviously incorrect algorithms

What’s next?

Weaker primitives?
  Idempotent Work Stealing [Michael et al., PPoPP’09]

Tight lower bounds?
  How many RAW/AWAR fences are incurred?

Other patterns
  Read-after-read
  Write-after-write
  Multi-RAW:
    write(Xi,1)
    collect(X1,..,Xn)

References

H. Attiya, R. Guerraoui, D. Hendler, P. Kuznetsov, M. Michael, M. Vechev. Laws of Order: Expensive Synchronization in Concurrent Algorithms Cannot be Eliminated. In POPL 2011

Srivatsan’s talk on STM fence complexity, TR on the way


QUESTIONS?
