Multithreading patterns
Cristian Nicola, Development Manager
Net Evidence (SLM) Ltd, http://www.tonicola.com, [email protected]
1. Introduction to multithreading
2. Multithreading patterns
1. Introduction to multithreading
In this section…
• Why do multithreading?
• When and when not to use threads?
• Multithreading basic structures (critical sections, mutexes, events, semaphores and timers)
• Multithreading problems (atomic operations, race conditions, priority inversion, deadlocks, livelocks, boxcar / lock convoys / thundering herd)
Why multi-threading?
• Multi-core / multi-CPU machines are now standard
• Makes programming more fun
When to use threads?
• The work-tasks are clearly defined and long enough to justify the threading overhead
• The data needed to complete the work-tasks does not overlap (or overlaps only a little)
• Generally UI interaction is not needed – background tasks
When NOT to use threads?
• Work-tasks are not clearly defined
• There is a lot of shared data between the tasks
• UI interaction is a requirement
• Work-tasks are small
• You do not have a good reason to use threads
Multithreading structures
Jobs, processes, threads, fibers
[Diagram: a job groups processes 1…N; each process contains threads 1…M; a thread can host fibers 1…X]
What we need…a way to
i. …avoid simultaneous access to a common resource (mutexes, critical sections)
ii. …signal an occurrence or an action (events)
iii. …restrict/throttle the access to some shared resources (semaphores)
iv. …signal a due time – sometimes periodically (timers)
Critical sections
• User object - lightweight
• Their number is limited by memory
• Re-entrant
• Very fast when uncontended (tens of instructions)
• Falls back to a kernel object only when there is contention
• No time-out
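The re-entrancy of a critical section can be illustrated with Python's threading.RLock, which behaves the same way: the owning thread may acquire it again without deadlocking. A minimal sketch, with Python standing in for the native Win32 API:

```python
import threading

lock = threading.RLock()  # re-entrant, like a Win32 critical section

def inner():
    with lock:            # same thread re-acquires: no deadlock
        return "inner done"

def outer():
    with lock:            # first acquisition
        return inner()

print(outer())  # → inner done
```

With a plain non-re-entrant lock, the nested acquisition inside inner() would deadlock.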
Mutexes
• Kernel object
• Can be named for inter-process communication
• Can have security flags
• Can be inherited by child processes
• Can be acquired/released
Events
• Kernel object
• Can be named for inter-process communication
• Can have security flags
• Can be inherited by child processes
• Holds a state: signalled, non-signalled
• Can be auto-reset; PulseEvent also exists but should not be used (it can lose wake-ups)
• Auto-reset events are NOT re-entrant
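Python's threading.Event models a manual-reset event: it stays signalled until explicitly cleared, releasing every waiter. A minimal signalling sketch (an auto-reset event would instead atomically release one waiter and flip back to non-signalled):

```python
import threading

event = threading.Event()     # manual-reset: stays signalled until clear()
results = []

def waiter():
    event.wait()              # block until the event is signalled
    results.append("woken")

t = threading.Thread(target=waiter)
t.start()
event.set()                   # signal: releases all current and future waiters
t.join()
print(results)                # → ['woken']
```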
Semaphores
• Kernel object
• Can be named for inter-process communication
• Can have security flags
• Can be inherited by child processes
• Have a count, but it cannot be queried (it is only decremented on wait and incremented on release)
• Signalled when count > 0
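Throttling access to a shared resource with a semaphore can be sketched as follows (Python for illustration; MAX_CONCURRENT, peak and the guard lock are illustrative names, not part of any API):

```python
import threading

MAX_CONCURRENT = 2
sem = threading.Semaphore(MAX_CONCURRENT)   # count starts at 2
active = 0
peak = 0
guard = threading.Lock()                    # protects the counters

def worker():
    global active, peak
    with sem:                               # blocks while count == 0
        with guard:
            active += 1
            peak = max(peak, active)
        # ... use the throttled resource here ...
        with guard:
            active -= 1

threads = [threading.Thread(target=worker) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(peak)   # never exceeds MAX_CONCURRENT
```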
Timers
• Kernel object
• Can be named for inter-process communication
• Can have security flags
• Can be inherited by child processes
• Can be auto-reset
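A timer signalling a due time can be approximated with Python's threading.Timer, which is one-shot (a periodic/auto-reset timer would re-arm itself; in this sketch that would mean re-creating the Timer from the callback):

```python
import threading

fired = threading.Event()

# one-shot timer: runs the callback on its own thread after the delay
timer = threading.Timer(0.1, fired.set)
timer.start()

fired.wait(timeout=5)     # wait for the due time
print(fired.is_set())     # → True
```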
Kernel-land / User-land
• Kernel transition – expensive
• User transition – fast
• Avoid kernel transitions when possible (system calls, use of kernel objects, unneeded thread creation or destruction)
Multithreading problems
Atomic operations
• A set of operations that must be executed as a whole, so they appear to the rest of the system to be a single operation
• There are only 2 possible outcomes: success, or failure (in which case the state is left as if nothing had happened)
For example, the code:

    I = J + 1;

can be compiled as:

    MOV EAX, [EBP-$10]   ; read J into a register
    ;                    <- possible task switch
    INC EAX              ; increment the register
    ;                    <- possible task switch
    MOV [EBP-$0C], EAX   ; write the register back to I

A task switch between any two of these instructions leaves the operation half-done.

Solution:

    Lock;
    I = J + 1;
    Unlock;
Race conditions
• A task switch can occur at any time
• Occur when 2 threads race to change the same data

Problem:
Unpredictable results
Example: 2 threads incrementing a variable by 1
Input: A = 1 (starting from 1, the expected result is 3)

Thread 1: read A=1 into a register; increment the register; write the register (=2) back to A
Thread 2: read A=1 into a register; increment the register; write the register (=2) back to A

Output: A = 2 – one increment has been lost
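The lost update above disappears once the read-modify-write sequence is protected by a lock; a runnable sketch (Python for illustration, ITERATIONS is an arbitrary choice):

```python
import threading

ITERATIONS = 100_000
counter = 0
lock = threading.Lock()

def increment():
    global counter
    for _ in range(ITERATIONS):
        with lock:            # without this lock the read-modify-write
            counter += 1      # can interleave and lose updates

threads = [threading.Thread(target=increment) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # → 200000
```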
Priority inversion
• A thread with a higher priority waits for a resource used by a thread with a lower priority
Problem:
A high priority thread is executed less often than a lower priority thread
Example: 2 threads accessing the same file

Thread 1 (low priority): lock the file, write some data into it, do some more work with it, release the file
Thread 2 (high priority): wait for the file to become available, then use the file

Out of 3 context switches, the low-priority thread gets 2 and the high-priority thread only 1.
Deadlock
• 2 or more actions depend on each other for completion, and as a result none finishes
Problem:
One or more threads stop working for an indefinite amount of time
Deadlock conditions
1. Mutual exclusion: resources are locked exclusively
2. Hold and wait: resources are held while others are waited for
3. No pre-emption: a resource cannot be forcibly taken away from the thread holding it
4. A circular wait condition exists
Deadlock
Example: 2 threads accessing the same resources

Thread 1: lock resource A, then wait for resource B to become available
Thread 2: lock resource B, then wait for resource A to become available

Both threads are now stopped, with no way to wake up.
Livelock
• Like a deadlock, except that deadlock detection/prevention keeps waking the threads up – without any of them making progress

Problem:
One or more threads make no progress – they just spin

• Analogy: 2 people travelling in opposite directions in a corridor; each politely moves aside to make space for the other, and neither can pass as they keep stepping from side to side
Boxcar / Lock Convoys / Thundering herd
• Can carry a serious performance penalty, even though the application still works correctly
• A flag wakes up many threads, but only the first one to run has any work to do

Problem:
Threads wake up, wait on a resource, and then find there is no work to do
Example: 2 threads wake up to use the same resource

The flag is signalled.
Thread 1: sleeps waiting for the event; wakes, locks the data, uses it, unlocks it, goes back to sleep
Thread 2: sleeps waiting for the event; wakes, waits for the data lock, locks the data, finds nothing to do, unlocks it, goes back to sleep
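One common mitigation is to wake a single waiter instead of the whole herd; with a Python Condition, notify() wakes one thread while notify_all() would recreate the convoy. A sketch (the timeouts and names are illustrative):

```python
import threading
import queue
import time

work = queue.Queue()
cond = threading.Condition()
handled = []

def worker():
    with cond:
        cond.wait(timeout=1)       # sleep until the flag is signalled
    try:                           # only one waiter finds work; any other
        handled.append(work.get_nowait())  # would have woken for nothing
    except queue.Empty:
        pass

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
time.sleep(0.2)                    # let the workers reach wait()
work.put("item")
with cond:
    cond.notify()                  # wake ONE waiter, not notify_all()
for t in threads:
    t.join()
print(handled)                     # → ['item']
```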
2. Multithreading patterns
In this section…
• What is a design pattern?
• Groups of patterns (control-flow patterns, data patterns, resource patterns, exception/error patterns)
• Multithreading patterns sources
What is a design pattern?
• A design pattern is a reusable solution to a recurring problem in the context of object-oriented development
• Patterns can be about other topics too
Groups of patterns
• Control-flow: aspects related to control and flow dependencies between various threads (e.g. parallelism, choice, synchronization)
• Data perspective: passing of information, scoping of variables, etc.
• Resource perspective: resource-to-thread allocation, delegation, etc.
• Exception handling: the various causes of exceptions and the actions that need to be taken when they occur
Control-flow patterns
Worker threads
• Sometimes referred to as "Active Object", "Cyclic Executive" or "Concurrency Pattern"
• Generic threads doing some work without being aware of what kind of work they do
• They share a common work queue
• Very useful in highly parallel systems
Worker threads
• Windows Vista/Server has API support for creating thread pools (CreateThreadpool)
• Use a semaphore to limit the number of active threads to a figure related to the CPU count (usually 2 x CPUs)

Worker threads - variants
• Background Worker Pattern – notifies when the thread completes and provides updates on the status of the operation; may need to support cancelling the operation
• Asynchronous Results Pattern – you are more interested in the result than in the status of the operation
Worker threads - termination
• Implicit termination – the worker has finished its work and can end
• Explicit termination – the worker is asked to terminate
Scheduler
• Explicitly control when threads may execute single-threaded code (sequences waiting threads)
• Independent mechanism to implement a scheduling policy
• Read/Write lock is usually implemented using the scheduler pattern to ensure fairness in scheduling
• Adds significant overhead
Thread pool
• A number of threads are created to perform a number of tasks, usually organized in a queue
• There are many more tasks than threads
• When a thread completes its task:
  – If more tasks exist -> it requests the next task from the queue
  – If no more tasks exist -> it terminates, or sleeps
• Number of threads used is a parameter that can be tuned - can be dynamic based on the number of waiting tasks
Thread pool
• The thread creation/destruction policy impacts overall performance:
  – Create too many threads = resources and time are wasted
  – Destroy too many threads = time is spent re-creating them
  – Create threads too slowly = poor client performance
  – Destroy threads too slowly = starvation of resources
• Avoids repeated thread creation and destruction overhead
• Better performance and better system stability
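The queue-driven pool and its explicit termination can be sketched in Python (queue.Queue stands in for the work queue; the STOP sentinel is an illustrative convention, not a library feature):

```python
import threading
import queue

STOP = object()                 # sentinel: explicit termination request

def worker(tasks, results):
    while True:
        task = tasks.get()      # request the next task from the queue
        if task is STOP:
            break               # asked to terminate
        results.put(task * task)

tasks, results = queue.Queue(), queue.Queue()
pool = [threading.Thread(target=worker, args=(tasks, results))
        for _ in range(4)]      # far fewer threads than tasks
for t in pool:
    t.start()

for n in range(10):
    tasks.put(n)
for _ in pool:
    tasks.put(STOP)             # one sentinel per worker
for t in pool:
    t.join()

total = sum(results.get() for _ in range(10))
print(total)  # → 285
```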
Thread pool - triggers
• Transient trigger
  – Offers the capability to signal currently running threads
  – Lost if not acted upon right away
• Persistent trigger
  – Generally results in pool actions
  – Persisted, and eventually handled
Message Queuing
• Asynchronous communication, implemented via queued messages
• Simple, without mutual exclusion problems
• No resource is shared by reference
• The shared information is passed by value
Interrupt
• Occurs when the event of interest occurs
• Executes very quickly and with little overhead
• Provides a means for timely response to urgent needs
• There are circumstances where interrupt use can lead to system failure
• Asynchronous procedure calls (APC)
Guarded Call
• Used when it may not be possible to wait for an asynchronous rendezvous
• Calling a method of an object in another thread can lead to mutual exclusion problems if the called object is currently busy doing something else
• The Guarded Call Pattern handles this case through the use of a mutual-exclusion semaphore
Rendezvous
• Concerned with modelling the preconditions for synchronization or rendezvous of threads
• Ready threads register with the Rendezvous class, then block until the Rendezvous class releases them to run
• Builds a collaboration structure that allows an arbitrary set of preconditions to be met for thread synchronization
• Independent of task phrasings, scheduling policies, and priorities
Data patterns
Thread-Specific Storage
• Also called thread-local storage (TLS)
• TLS is allocated per thread: any function in that thread reads the same value
• Similar to global storage, except that functions in other threads will not see the same value
• "Thread-specific storage" sometimes refers to the private virtual address space of a running task
Static Allocation
• Dynamic memory has two problems: non-deterministic timing of allocation/de-allocation, and memory fragmentation
• A simple approach that avoids both: disallow dynamic memory allocation altogether
• Only usable in simple systems with highly predictable and consistent loads
• All objects are allocated during system initialization (the system takes longer to initialize, but it operates well during execution)
Pool Allocation
• Involves creating pools of objects at start-up
• Does not address arbitrary dynamic-memory needs
• The pools are not necessarily initialized at start-up
• Objects from the pools are handed out upon request
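A pool allocator can be sketched as a free list of pre-allocated buffers handed out on request (BufferPool and its methods are illustrative names, not an existing API):

```python
import threading
from collections import deque

class BufferPool:
    """Objects are created once, up front; acquire/release replaces allocate/free."""
    def __init__(self, count, size):
        self._free = deque(bytearray(size) for _ in range(count))
        self._lock = threading.Lock()

    def acquire(self):
        with self._lock:
            if not self._free:
                raise MemoryError("pool exhausted")
            return self._free.popleft()

    def release(self, buf):
        with self._lock:
            self._free.append(buf)

pool = BufferPool(count=4, size=1024)
buf = pool.acquire()
buf[0] = 0xFF            # use the buffer
pool.release(buf)
print(len(pool._free))   # → 4 (all buffers back in the pool)
```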
Fixed Sized Buffer
• Memory fragmentation occurs when memory is allocated in various sizes from the heap and the order of allocation is independent of the release order
• Used when we cannot tolerate dynamic-allocation problems such as fragmentation
• Gives fragmentation-free dynamic allocation, at the cost of less optimal memory usage
• Similar to dynamic allocation, but only fixed, pre-defined sizes can be allocated
Garbage Collection
• Solves memory leaks and dangling pointers
• Does not address memory fragmentation
• Takes the programmer out of the loop
• Adds run-time overhead
• Adds a loss of execution predictability
Garbage Compactor
• Removes memory fragmentation
• Maintains two memory segments in the heap
• Moves live objects from one segment to the other
• The free memory in one of the segments is always a contiguous block
Resource patterns
Locked structures
• Structures that use a locking mechanism
• Easy to implement, easy to debug
• Can deadlock
• Do not scale well
Lock-free structures
• They do not need to lock
• They need hardware support (e.g. compare-and-swap instructions)
• They can “burn” CPU
• Hard to implement and debug
Wait-free structures
• Same as lock-free structures, but with a guarantee that every operation finishes in a bounded number of steps
• All wait-free structures are lock-free
• Very difficult to implement
• Very few real life applications
Single writer / multi reader
• A special kind of lock that allows multiple readers concurrent access to the data but only a single writer (exclusive write access)
• Promoting from read to write is problematic (reader starvation, writer starvation) – the Scheduler pattern can enforce fairness
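A minimal single-writer/multi-reader lock can be built from two plain locks. A sketch only: this simple version favours readers and can starve writers, which is exactly the fairness problem the Scheduler pattern addresses (ReadWriteLock and its methods are illustrative names):

```python
import threading

class ReadWriteLock:
    """Many concurrent readers OR one exclusive writer; no fairness policy."""
    def __init__(self):
        self._readers = 0
        self._count_lock = threading.Lock()   # guards the reader count
        self._write_lock = threading.Lock()   # held by the writer, or on behalf of readers

    def acquire_read(self):
        with self._count_lock:
            self._readers += 1
            if self._readers == 1:
                self._write_lock.acquire()    # first reader blocks writers

    def release_read(self):
        with self._count_lock:
            self._readers -= 1
            if self._readers == 0:
                self._write_lock.release()    # last reader lets writers in

    def acquire_write(self):
        self._write_lock.acquire()            # exclusive access

    def release_write(self):
        self._write_lock.release()

rw = ReadWriteLock()
rw.acquire_read(); rw.acquire_read()          # two readers at once: fine
rw.release_read(); rw.release_read()
rw.acquire_write(); rw.release_write()        # writer gets exclusive access
print(rw._readers)  # → 0
```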
Double-checked locking
• Also known as the "Double-Checked Locking Optimization"
• Reduces the overhead of acquiring a lock
• Used to implement "lazy initialization" in a multi-threaded environment

    If check failed then
        Lock
        If check failed then
            Initialize
        Unlock
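Double-checked locking sketched in Python (safe here thanks to CPython's GIL; in languages with weaker memory models this pattern famously requires a memory barrier or volatile to be correct):

```python
import threading

_instance = None
_lock = threading.Lock()

def get_instance():
    global _instance
    if _instance is None:             # first check, no lock (fast path)
        with _lock:
            if _instance is None:     # second check, under the lock
                _instance = object()  # the expensive lazy initialization
    return _instance

print(get_instance() is get_instance())  # → True (only one instance ever made)
```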
Shared Memory
• A common memory area addressable by multiple processors
• Almost always involves a combined hardware/software solution
• If the shared data is read-only, concurrency-protection mechanisms may not be required
• Used when responses to messages and events are not desired or are too slow
Simultaneous Locking
• Avoids deadlocks
• Works in an all-or-none fashion: either all the needed resources are locked at once, or none are
• Prevents the condition of holding some resources while requesting others
• Allows higher-priority tasks to run if they don't need any of the locked resources
Ordered Locking
• Eliminates deadlocks
• Orders the resources and enforces a policy in which resources must be allocated in that order
• If the policy is enforced, no circular-wait condition can ever occur
• Resources are explicitly locked and released, so the potential for neglecting to unlock a resource exists
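Ordered locking can be sketched by always acquiring locks in one global order (here, sorted by object identity, an arbitrary but consistent choice; the helper names are illustrative):

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()

def acquire_in_order(*locks):
    # every thread acquires in the same global order,
    # so a circular wait can never arise
    ordered = sorted(locks, key=id)
    for lock in ordered:
        lock.acquire()
    return ordered

def use_both():
    held = acquire_in_order(lock_a, lock_b)
    try:
        pass  # ... touch both resources safely ...
    finally:
        for lock in reversed(held):
            lock.release()

t1 = threading.Thread(target=use_both)
t2 = threading.Thread(target=use_both)
t1.start(); t2.start()
t1.join(); t2.join()
print(lock_a.locked(), lock_b.locked())  # → False False
```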
Exception/error patterns
Exceptions/errors
• Work failure
• Deadline expiry
• Resource unavailability
• External trigger
• Constraint violation

Handling:
• Continue
• Remove the work item
• Remove all items

Recovery:
• No action
• Rollback
• Compensate
Balking
• Executes an action on an object only when the object is in a particular state
• An attempt to use the object outside its legal state results in an "Illegal State Exception"
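A balking object in Python (Connection and its methods are illustrative names; Python has no built-in IllegalStateException, so a RuntimeError stands in):

```python
import threading

class Connection:
    """Balking: requests made while the object is in the wrong state are refused."""
    def __init__(self):
        self._open = False
        self._lock = threading.Lock()

    def open(self):
        with self._lock:
            self._open = True

    def send(self, data):
        with self._lock:
            if not self._open:
                # balk: refuse immediately rather than wait for a legal state
                raise RuntimeError("illegal state: connection not open")
            return len(data)

conn = Connection()
try:
    conn.send(b"hello")           # balks: the connection is not open yet
except RuntimeError as exc:
    print(exc)                    # → illegal state: connection not open
conn.open()
print(conn.send(b"hello"))        # → 5
```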
Triple Modular Redundancy
• Used when there is no fail-safe state
• Based on an odd number of channels operating in parallel
• The computational results or resulting actuation signals are compared; if there is a disagreement, a two-out-of-three majority wins
• The deviating channel's computation is discarded
Watchdog
• Lightweight and inexpensive
• Minimal coverage
• Watches out over processing of another component
• Usually checks a computation time base, or ensures that computation steps are proceeding in a predefined order
Multithreading patterns sources
• http://www.workflowpatterns.com
• "Real-Time Design Patterns: Robust Scalable Architecture for Real-Time Systems" by Bruce Powel Douglass
Questions ?
Big thank you!