Multiprocessor Cache Consistency (or, what does volatile mean?) Andrew Whitaker CSE451


Multiprocessor Cache Consistency

(or, what does volatile mean?)

Andrew Whitaker

CSE451


What Does This Program Print?

    public class VisiblityExample extends Thread {
        private static int x = 1;
        private static int y = 1;
        private static boolean ready = false;

        public static void main(String[] args) {
            Thread t = new VisiblityExample();
            t.start();

            x = 2;
            y = 2;
            ready = true;
        }

        public void run() {
            while (!ready)
                Thread.yield(); // give up the processor
            System.out.println("x= " + x + " y= " + y);
        }
    }


Answer

It’s a race condition. Many different outputs are possible:
  x=2, y=2
  x=1, y=2
  x=2, y=1
  x=1, y=1

Or, the program may print nothing!
  The ready loop runs forever
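One way to remove the race, sketched here under the assumption that the only change is declaring `ready` as `volatile`. The class name `VolatileVisibilityDemo` and the helper `runOnce` are made up for illustration and are not part of the original program. Under the Java memory model, the volatile write to `ready` also publishes the earlier writes to `x` and `y`, so the reader should always see both values as 2.

```java
// Illustrative sketch: the slide's program with `ready` declared volatile.
class VolatileVisibilityDemo {
    private static int x = 1;
    private static int y = 1;
    private static volatile boolean ready = false; // the only substantive change

    // Hypothetical helper: runs the writer/reader pair once and
    // returns the line the reader thread would have printed.
    static String runOnce() {
        x = 1;
        y = 1;
        ready = false;
        final String[] seen = new String[1];
        Thread reader = new Thread(() -> {
            while (!ready) {
                Thread.yield(); // give up the processor
            }
            seen[0] = "x= " + x + " y= " + y;
        });
        reader.start();

        x = 2;
        y = 2;
        ready = true; // volatile write: publishes x and y as well

        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce());
    }
}
```

Only `ready` needs to be volatile here: the reader touches `x` and `y` only after it has observed `ready == true`.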


What’s Going on Here?

Processor caches ($) can get out-of-sync

[Figure: four CPUs, each with its own cache ($), and one shared Memory]


A Mental Model

Every thread/processor has its own copy of every variable. Yikes!

    // Not real code; for illustration purposes only
    public class Example extends Thread {
        private static final int NUM_PROCESSORS = 4;

        // One slot per processor: each processor sees only its own copy
        private static int[] x = new int[NUM_PROCESSORS];
        private static int[] y = new int[NUM_PROCESSORS];
        private static boolean[] ready = new boolean[NUM_PROCESSORS];
        // …


Two Issues

Cache coherence
  Do caches eventually converge on the same state?
  All modern caches are coherent

Cache consistency
  When are operations by one processor visible on other processors?
  Sometimes called “publication”
  How much re-ordering is possible across processors?


Subjective View of Cache Consistency Strategies

[Figure: consistency strategies on a spectrum from Strict to Relaxed; axis labels “Amount of reordering” and “Fast and scalable”]


Factors Pushing Towards Relaxed Consistency Models

Hardware perspective: consistency operations are expensive
  The writing processor must invalidate the copies held by all other processors
  A reading processor must re-validate its cached state

Compiler perspective: optimizations frequently re-arrange memory operations to hide latency
  These re-arrangements are guaranteed to be transparent, but only on a single processor


Caches 101

Caches store blocks of main memory
  Blocks are fairly small (perhaps 64 bytes)

Each cache block exists in one of three states
  Invalid, shared, exclusive

Memory operations cause the cache block to change states

CPUs must communicate to implement cache block state changes


Cache Block State During a Coherence Operation

[Figure: cache block states Invalid, Shared (read-only), and Exclusive (read-write), with labels “Writing processor” and “Reading processors”]
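The three states can be sketched as a toy state machine for a single cache block. This is a simplified illustration, not a real coherence protocol: the class `CacheBlockSketch` and its methods are invented here, and the transition rules (a write makes the writer exclusive and invalidates everyone else; a read downgrades an exclusive holder to shared) are an assumption about what the slides intend.

```java
import java.util.Arrays;

// Toy model of one cache block as seen by several processors.
class CacheBlockSketch {
    enum State { INVALID, SHARED, EXCLUSIVE }

    // states[i] is the state of this block in processor i's cache
    private final State[] states;

    CacheBlockSketch(int numProcessors) {
        states = new State[numProcessors];
        Arrays.fill(states, State.INVALID);
    }

    // Reading: any other exclusive holder is downgraded to shared,
    // and the reader ends up with at least a shared copy.
    void read(int cpu) {
        for (int i = 0; i < states.length; i++) {
            if (i != cpu && states[i] == State.EXCLUSIVE) {
                states[i] = State.SHARED;
            }
        }
        if (states[cpu] == State.INVALID) {
            states[cpu] = State.SHARED;
        }
    }

    // Writing: the writer gains exclusive access; all other copies are invalidated.
    void write(int cpu) {
        for (int i = 0; i < states.length; i++) {
            states[i] = (i == cpu) ? State.EXCLUSIVE : State.INVALID;
        }
    }

    State stateOf(int cpu) {
        return states[cpu];
    }
}
```

Each loop over `states` stands in for the inter-CPU communication the slides mention; in hardware those would be messages, not array writes.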


Some Terminology

Publication: A CPU announces its updates to some or all of cache memory

Fetch: A CPU loads the latest values for previously published updates


Hardware Support: Memory Fences (Barriers)

No memory operation can be moved across a fence
  No operation after the fence appears before the fence
  No operation before the fence appears after the fence

Several variants:
  Write fences (for publication)
  Read fences (for fetch)
  Read/write (total) fences
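In Java (9 and later), explicit fences are exposed as static methods on `java.lang.invoke.VarHandle`. The sketch below pairs a release fence on the publishing side with an acquire fence on the fetching side. The class `FenceSketch` and its fields are hypothetical, and this is a low-level illustration only; everyday code would normally use `volatile` or `synchronized` instead.

```java
import java.lang.invoke.VarHandle;

// Illustrative only: plain fields ordered with explicit fences.
class FenceSketch {
    private static int data = 0;
    private static boolean ready = false;

    // Publication: the store to data may not be reordered below the fence,
    // so it cannot appear after the store to ready.
    static void publish(int value) {
        data = value;
        VarHandle.releaseFence(); // acts as the write fence
        ready = true;
    }

    // Fetch: the load of data may not be reordered above the fence,
    // so it cannot appear before the load of ready.
    // Returns -1 if nothing has been published yet.
    static int tryFetch() {
        boolean isReady = ready;
        VarHandle.acquireFence(); // acts as the read fence
        return isReady ? data : -1;
    }
}
```

`VarHandle.fullFence()` corresponds to the read/write (total) variant in the list above.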


Sequential Consistency

All writes are immediately published
All reads fetch the latest value
All processors agree on the order of memory accesses
  Every operation is a fence

Behaves like shuffling cards


Sequential Consistency Example

Processor 1        Processor 2
A. x = 2;          C. x = 4;
B. y = 3;          D. y = 5;

A always appears before B
C always appears before D

A subset of legal orderings:
  A. x = 2;  B. y = 3;  C. x = 4;  D. y = 5;
  C. x = 4;  D. y = 5;  A. x = 2;  B. y = 3;
  C. x = 4;  A. x = 2;  D. y = 5;  B. y = 3;
  A. x = 2;  C. x = 4;  D. y = 5;  B. y = 3;
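The full set of legal orderings can be enumerated mechanically: merge the two processors' operation lists in every way that keeps each processor's own order, much like a riffle shuffle of two decks. The sketch below does this for the four operations above; the class and method names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class InterleavingSketch {
    // Returns every merge of p1 and p2 that preserves each processor's program order.
    static List<String> interleavings(String[] p1, String[] p2) {
        List<String> out = new ArrayList<>();
        merge(p1, 0, p2, 0, "", out);
        return out;
    }

    private static void merge(String[] p1, int i, String[] p2, int j,
                              String prefix, List<String> out) {
        if (i == p1.length && j == p2.length) {
            out.add(prefix); // both processors have finished
            return;
        }
        if (i < p1.length) {
            merge(p1, i + 1, p2, j, prefix + p1[i], out); // processor 1 goes next
        }
        if (j < p2.length) {
            merge(p1, i, p2, j + 1, prefix + p2[j], out); // processor 2 goes next
        }
    }

    public static void main(String[] args) {
        // prints [ABCD, ACBD, ACDB, CABD, CADB, CDAB]
        System.out.println(interleavings(new String[] {"A", "B"},
                                         new String[] {"C", "D"}));
    }
}
```

There are six legal orderings in total; the four shown on the slide are among them, and none places B before A or D before C.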


The Cost of Sequential Consistency

Every write requires a complete cache invalidation
  Writing processor acquires exclusive access
  Writing processor sends an invalidation message
  Writing processor receives acknowledgements from all processors

Expensive!


Relaxed Consistency Models

Updates are published lazily
  Therefore, updates may appear out-of-order

Challenge: Exposing a programming model that a human can understand


Release Consistency

Observation: concurrent programs usually use proper synchronization
  “All shared, mutable state must be properly synchronized”

It suffices to sync-up memory during synchronized operations

Big performance win: the number of cache coherency operations scales with synchronization, not the number of loads and stores


Simple Example

    synchronized (this) {   // on entry: fetch current values
        x++;
        y++;
    }                       // on exit: publish new values

Within the critical section, updates can be re-ordered

Without publication, updates may never be visible
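A runnable version of this pattern, as a sketch: two threads increment `x` and `y` inside a `synchronized` block, and a reader takes the same lock to fetch the published values. The class `SyncCounterSketch` and the helper `runTwoThreads` are hypothetical names introduced for this example.

```java
class SyncCounterSketch {
    private int x = 0;
    private int y = 0;

    void increment() {
        synchronized (this) { // lock acquire: fetch current values
            x++;
            y++;
        }                     // lock release: publish new values
    }

    int[] snapshot() {
        synchronized (this) { // the reader must take the same lock to fetch
            return new int[] {x, y};
        }
    }

    // Hypothetical driver: two threads each call increment() perThread times.
    static int[] runTwoThreads(int perThread) {
        SyncCounterSketch counter = new SyncCounterSketch();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.snapshot();
    }
}
```

Both counters end at exactly twice `perThread`, because the lock provides mutual exclusion and publication together.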


Java Volatile Variables

Java synchronized does double-duty
  It provides mutual exclusion, atomicity
  It ensures safe publication of updates

Sometimes, we don’t want to pay the cost of mutual exclusion

Volatile variables provide safe publication without mutual exclusion

volatile int x = 7;


More on Volatile

Updates to volatile fields are propagated immediately
  “Don’t cache me!”
  Effectively, this activates sequential consistency for accesses to that field

Volatile serves as a fence to the compiler and hardware
  Memory operations are not re-ordered around a volatile access


Rule #1, Revised

All shared, mutable state must be properly synchronized
  With a synchronized statement, an Atomic variable, or with volatile
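As a sketch of the two lock-free options in the rule: an `AtomicInteger` for a counter updated by several threads, and a `volatile` flag for a value that is only written once and then read. The class `SharedStateSketch` is hypothetical. Note that `volatile` alone would not be enough for the counter, since `count++` on a volatile field is still a separate read and write.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SharedStateSketch {
    // Atomic variable: safe publication plus an atomic read-modify-write
    private static final AtomicInteger hits = new AtomicInteger();
    // Volatile flag: safe publication of a simple write, no mutual exclusion
    private static volatile boolean done = false;

    // Hypothetical driver: two threads each record perThread hits.
    static int countWithTwoThreads(int perThread) {
        hits.set(0);
        done = false;
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                hits.incrementAndGet();
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        done = true; // visible to any thread that later reads done
        return hits.get();
    }

    static boolean isDone() {
        return done;
    }
}
```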


Example: Lazy Initialization

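    import java.util.LinkedList;
    import java.util.List;

    class Example {
        static List<Object> list = null;

        // Not thread-safe as written: nothing publishes the new list
        public static List<Object> getList() {
            if (list == null) {
                list = new LinkedList<>();
            }
            return list;
        }
    }

Need synchronization to ensure publication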
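One straightforward way to add the missing synchronization is to make the accessor `synchronized`, sketched below. The class name `LazyListSketch` is invented for this example; other fixes exist (for instance, eager initialization in a static initializer), and this is only the simplest one consistent with the rule above.

```java
import java.util.LinkedList;
import java.util.List;

class LazyListSketch {
    private static List<Object> list = null;

    // The class lock gives mutual exclusion (only one thread creates the list)
    // and publication (every caller sees the fully constructed list).
    static synchronized List<Object> getList() {
        if (list == null) {
            list = new LinkedList<>();
        }
        return list;
    }
}
```

Every call returns the same instance, so callers share one list regardless of which thread initialized it.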