
The Pillars Of Concurrency


Page 1: The Pillars Of Concurrency

The Pillars Of Concurrency

Responsiveness and Isolation

Throughput and Scalability

Consistency

Page 2: The Pillars Of Concurrency

Pillar I – Responsiveness and Isolation

[Diagram: a UI thread exchanging asynchronous messages with background and thread-pool threads]

Running expensive work asynchronously on background threads (or thread-pool threads) to avoid blocking time-sensitive threads, such as the UI thread.

The background threads communicate with the time-sensitive threads via asynchronous messages.


Page 3: The Pillars Of Concurrency

How? Using BackgroundWorker

private void btnStart_Click(object sender, EventArgs e)
{
    backgroundWorker1.RunWorkerAsync();
}

// Runs on a thread-pool thread.
private void backgroundWorker1_DoWork(object sender, DoWorkEventArgs e)
{
    for (int i = 0; i <= 100; i++)
    {
        if (backgroundWorker1.CancellationPending)
        {
            e.Cancel = true;
            return;
        }
        // ... expensive work (elided in the original) ...
        backgroundWorker1.ReportProgress(i);
    }
}

// This event is raised on the main (UI) thread.
private void backgroundWorker1_ProgressChanged(object sender, ProgressChangedEventArgs e)
{
    progressBar1.Value = e.ProgressPercentage;
}

Page 4: The Pillars Of Concurrency

Pillar II – Throughput and Scalability

Distribute the work among threads in order to put as many cores as possible to work and maximize throughput.

[Diagram: a workload divided into chunks of 1KB or more instructions each, distributed across the cores by frameworks such as OpenMP, the TPL, and Parallel Studio]
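As one concrete illustration, here is a minimal sketch using the TPL's Parallel.For (the workload below is a stand-in): the iteration range is partitioned across worker threads so that all available cores are put to work.

using System;
using System.Threading.Tasks;

class ThroughputDemo
{
    static void Main()
    {
        var results = new double[1000000];

        // Parallel.For partitions the range across worker threads,
        // adjusting the degree of parallelism to the machine.
        Parallel.For(0, results.Length, i =>
        {
            results[i] = Math.Sqrt(i); // stand-in for "1KB and more instructions"
        });

        Console.WriteLine(results[999999]);
    }
}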

Page 5: The Pillars Of Concurrency

Why? The Free Lunch is Over

The number of transistors never stopped climbing; however, clock speed stopped somewhere near 3 GHz.

Page 6: The Pillars Of Concurrency

The Solution

Re-Enable the Free Lunch: use the thread pool to execute your work asynchronously.

Add a concurrency-control mechanism that adjusts the number of work items thrown into the pool according to the workload and the machine architecture, in order to put the maximum number of cores to work with minimum contention.

Two questions follow: how many callbacks to put in the pool? How to separate the work? (See the sketch below.)
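A minimal sketch, assuming the .NET ThreadPool: the code only queues callbacks, and the pool decides how many run concurrently based on the workload and core count (the work items here are illustrative).

using System;
using System.Threading;

class PoolDemo
{
    static void Main()
    {
        using (var done = new CountdownEvent(10))
        {
            for (int i = 0; i < 10; i++)
            {
                int item = i; // capture a stable copy for the closure
                // The pool, not the caller, decides the degree of concurrency.
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    Console.WriteLine("work item " + item + " on thread " +
                                      Thread.CurrentThread.ManagedThreadId);
                    done.Signal();
                });
            }
            done.Wait(); // block until all ten callbacks have run
        }
    }
}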

Page 7: The Pillars Of Concurrency

The Future: Lock-Free Thread-Pool

The increasing number of work items and worker threads results in problematic contention on the pool.

Instead of using a linked list, use the array-style, lock-free, GC-friendly ConcurrentQueue&lt;T&gt; class.
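A minimal usage sketch of ConcurrentQueue&lt;T&gt; (from System.Collections.Concurrent): producers and consumers share it without taking any explicit lock.

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class QueueDemo
{
    static void Main()
    {
        var queue = new ConcurrentQueue<int>();

        // Many threads enqueue concurrently with no explicit locking.
        Parallel.For(0, 1000, i => queue.Enqueue(i));

        int item, count = 0;
        // TryDequeue never blocks; it reports whether an item was available.
        while (queue.TryDequeue(out item))
            count++;

        Console.WriteLine(count); // 1000
    }
}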

Page 8: The Pillars Of Concurrency

The Future: Work Stealing Queues

Each worker thread in the pool has its own private work-stealing queue (WSQ). A WSQ has two ends: it allows lock-free pushes and pops from one end ("private"), but requires synchronization at the other end ("public").

When work is queued from a non-pool thread, it goes into the global queue, and a worker thread is created or assigned to grab work from there. When work is queued from a pool worker thread, the work goes into that thread's WSQ, most of the time avoiding all locking.

A worker thread grabs work from its own WSQ in a LIFO fashion, avoiding all locking. Worker threads steal work from other threads' WSQs in a FIFO fashion, where synchronization is required.

Page 9: The Pillars Of Concurrency

The Future: Work Stealing Queues (cont.)

When threads are looking for work, they can follow a preferred search order:

1. Check the local WSQ. Work here can be dequeued without locks.
2. Check the global queue. Work here must be dequeued using locks.
3. Check other threads' WSQs. This is called "stealing", and requires locks.
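A much-simplified sketch of these queues and that search order (all types here are hypothetical; for brevity the private end is guarded by the same lock as the public end, whereas the real pool keeps the private end lock-free).

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical, simplified work-stealing queue: a deque guarded by one lock.
class WorkStealingQueue
{
    private readonly LinkedList<Action> _items = new LinkedList<Action>();

    public void PushPrivate(Action work)
    {
        lock (_items) _items.AddLast(work);
    }

    // Private end: the owner pops the most recently pushed item (LIFO).
    public bool TryPopPrivate(out Action work)
    {
        lock (_items)
        {
            if (_items.Count == 0) { work = null; return false; }
            work = _items.Last.Value;
            _items.RemoveLast();
            return true;
        }
    }

    // Public end: thieves steal the oldest item (FIFO).
    public bool TrySteal(out Action work)
    {
        lock (_items)
        {
            if (_items.Count == 0) { work = null; return false; }
            work = _items.First.Value;
            _items.RemoveFirst();
            return true;
        }
    }
}

class Worker
{
    public WorkStealingQueue Local;          // this worker's WSQ
    public ConcurrentQueue<Action> Global;   // shared global queue
    public Worker[] Others;                  // peers to steal from

    public bool TryGetWork(out Action work)
    {
        // 1. Local WSQ first (lock-free in the real pool).
        if (Local.TryPopPrivate(out work)) return true;
        // 2. Then the global queue.
        if (Global.TryDequeue(out work)) return true;
        // 3. Finally, steal from other workers' WSQs (synchronized).
        foreach (var other in Others)
            if (other.Local.TrySteal(out work)) return true;
        return false;
    }
}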

Page 10: The Pillars Of Concurrency

The Future: Task Parallel Library

Aims to lower the cost of fine-grained parallelism by executing the asynchronous work (tasks) in a way that fits the number of available cores, and by providing developers with more control over the way in which the tasks get scheduled and executed.

Page 11: The Pillars Of Concurrency

The Future: Task Parallel Library

Exposes a rich set of APIs that enable operations such as:

  • Waiting for tasks
  • Canceling tasks
  • Optimizing the fairness when scheduling tasks
  • Marking tasks known to be long-running to help the scheduler execute efficiently
  • Creating tasks as attached or detached to the parent task
  • Scheduling continuations that run if a task threw an exception or got cancelled
  • Running tasks synchronously
  • and more
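A minimal sketch exercising a few of these APIs (waiting, canceling, the long-running hint, and a faulted-case continuation); the timing values are illustrative.

using System;
using System.Threading;
using System.Threading.Tasks;

class TaskDemo
{
    static void Main()
    {
        var cts = new CancellationTokenSource();

        // LongRunning hints the scheduler that this task may deserve
        // a dedicated thread rather than a pool thread.
        var task = Task.Factory.StartNew(
            () => SlowWork(cts.Token),
            cts.Token,
            TaskCreationOptions.LongRunning,
            TaskScheduler.Default);

        // A continuation that runs only if the task threw an exception.
        task.ContinueWith(
            t => Console.WriteLine("Faulted: " + t.Exception.InnerException.Message),
            TaskContinuationOptions.OnlyOnFaulted);

        Thread.Sleep(50);
        cts.Cancel();                   // canceling the task

        try { task.Wait(); }            // waiting for the task
        catch (AggregateException) { }  // cancellation surfaces here

        Console.WriteLine(task.Status); // Canceled
    }

    static void SlowWork(CancellationToken token)
    {
        for (int i = 0; i < 100; i++)
        {
            token.ThrowIfCancellationRequested();
            Thread.Sleep(10);
        }
    }
}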

Page 12: The Pillars Of Concurrency

Pillar III – Consistency

With the addition of concurrency comes the responsibility for ensuring safety. We need to deal with shared memory without causing either corruption or deadlock.

Page 13: The Pillars Of Concurrency

What’s the Problem?

Race conditions – when one or more threads write a piece of data while one or more other threads are also reading or writing that piece of data.
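A minimal sketch of such a race (the counter and iteration counts are illustrative): two threads increment a shared counter; without the lock shown, increments are lost because ++ is a read-modify-write.

using System;
using System.Threading;

class RaceDemo
{
    static int _counter;
    static readonly object _gate = new object();

    static void Main()
    {
        var t1 = new Thread(Work);
        var t2 = new Thread(Work);
        t1.Start(); t2.Start();
        t1.Join(); t2.Join();

        // Without the lock below, this often prints less than 2000000:
        // concurrent read-modify-write cycles silently overwrite each other.
        Console.WriteLine(_counter);
    }

    static void Work()
    {
        for (int i = 0; i < 1000000; i++)
        {
            lock (_gate)
            {
                _counter++; // safe: only one thread at a time in here
            }
        }
    }
}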

Page 14: The Pillars Of Concurrency

What can we do?

  • In the meanwhile – learn to love locks!
  • Use ‘out of the box’ lock-free data structures
  • Move towards functional languages
  • Avoid sharing altogether; communicate through message passing
  • Use immutable objects (good luck with that! In F#, types are immutable by default)
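A minimal sketch of the immutable-object approach in C# (the Point type is illustrative): all fields are readonly, so instances can be shared across threads without locks, and "changing" one produces a new instance.

using System;

// Illustrative immutable type: no instance state ever changes after
// construction, so it is safe to share between threads without locks.
sealed class Point
{
    public readonly int X;
    public readonly int Y;

    public Point(int x, int y) { X = x; Y = y; }

    // "Mutation" returns a fresh instance instead of writing in place.
    public Point WithX(int x) { return new Point(x, Y); }
}

class ImmutableDemo
{
    static void Main()
    {
        var p1 = new Point(1, 2);
        var p2 = p1.WithX(10); // p1 is untouched; no reader can observe a torn write

        Console.WriteLine(p1.X + ", " + p2.X); // 1, 10
    }
}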

Page 15: The Pillars Of Concurrency

What’s the problem with Locks?

  • Races – even when using a single lock
  • Deadlocks – when multiple locks are acquired in the wrong order
  • Composability – locks do not compose
  • Modularity – locks do not support modular programming
  • And more

Page 16: The Pillars Of Concurrency

What do the experts suggest?

Isolation first. Immutability second. Synchronization last.

Page 17: The Pillars Of Concurrency

The Future: Transactional Memory

Wrap all the code that accesses shared memory in a transaction, and let the runtime execute it atomically and in isolation by doing the appropriate synchronization behind the scenes; in effect, maintaining a single-lock illusion.

With STM we can write code that looks sequential, can be reasoned about sequentially, and yet runs concurrently in a scalable fashion.

atomic
{
    // Safely access shared memory
}

Page 18: The Pillars Of Concurrency

The Future: Transactional Memory (cont.)

STM generates a single-lock illusion and still maintains great scalability by utilizing an optimistic concurrency-control technique:

1. Store the versions of the reads.
2. Execute the transaction; instead of writing into the master locations, write into shadow copies.
3. Validate the reads.
4. Submit: copy the shadow copies into their master locations and update the writes’ versions.
5. On conflict, re-execute the transaction (and consider switching to pessimistic concurrency control).
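A toy sketch of that scheme for a single shared location (all names here are hypothetical; a real STM tracks whole read/write sets): the transaction records the version it read, computes into a shadow copy, and at commit time validates the version before publishing.

using System;
using System.Threading;

// Hypothetical versioned cell: a toy model of one STM-managed location.
class VersionedCell
{
    private readonly object _gate = new object();
    public int Value;
    public int Version;

    // Optimistically run 'transform' until commit-time validation passes.
    public void Atomic(Func<int, int> transform)
    {
        while (true)
        {
            int readVersion, snapshot;
            lock (_gate) { readVersion = Version; snapshot = Value; } // store read version

            int shadow = transform(snapshot); // write into a shadow copy, not the master

            lock (_gate) // validate + submit
            {
                if (Version == readVersion)   // validate the read
                {
                    Value = shadow;           // copy shadow into master location
                    Version++;                // update write version
                    return;
                }
                // Conflict: another transaction committed first; re-execute.
            }
        }
    }
}

class StmDemo
{
    static void Main()
    {
        var cell = new VersionedCell();
        var t1 = new Thread(() => { for (int i = 0; i < 100000; i++) cell.Atomic(v => v + 1); });
        var t2 = new Thread(() => { for (int i = 0; i < 100000; i++) cell.Atomic(v => v + 1); });
        t1.Start(); t2.Start(); t1.Join(); t2.Join();
        Console.WriteLine(cell.Value); // 200000
    }
}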