Concurrent Programming: Introducing the principles of reentrancy, mutual exclusion and thread-synchronization

Page 1: Concurrent Programming

Concurrent Programming

Introducing the principles of reentrancy, mutual exclusion and thread-synchronization

Page 2: Concurrent Programming

Advantages of multithreading

• For multiprocessor systems (two or more CPUs), there are potential efficiencies in the parallel execution of separate threads (a computing job may be finished sooner)

• For uniprocessor systems (just one CPU), there are likely software design benefits in dividing a complex job into simpler pieces (easier to debug and maintain -- or reuse)

Page 3: Concurrent Programming

Some Obstacles

• Separate tasks need to coordinate actions, share data, and avoid competing for same system resources

• Management ‘overhead’ could seriously degrade the system’s overall efficiency

• Examples:
    – Frequent task-switching is costly in CPU time
    – Busy-waiting is wasteful of system resources

Page 4: Concurrent Programming

Some ‘work-arounds’

• Instead of using ‘pipes’ for the exchange of data among separate processes, Linux lets ‘threads’ use the same address-space (reduces ‘overhead’ in context-switching)

• Instead of requiring one thread to waste time busy-waiting while another finishes some particular action, Linux lets a thread voluntarily give up its control of the CPU

Page 5: Concurrent Programming

Additional pitfalls

• Every thread needs some private memory that cannot be ‘trashed’ by another thread (for example, it needs a private stack for handling interrupts, passing arguments to functions, creating local variables, saving CPU register-values temporarily)

• Each thread needs a way to prevent being interrupted in a ‘critical’ multi-stage action

Page 6: Concurrent Programming

Example of a ‘critical section’

• Recall Disk-Drive device-programming (status-register and control-register)

• Algorithm:
    – (1) Loop rereads status-register until ‘ready’
    – (2) Write drive-command to control-register

• If an interrupt occurs between these steps, another thread can send its own command
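The hazard can be replayed deterministically in a small simulation. The sketch below is only an illustration: `FakeDrive`, its fields and `unlucky_schedule()` are hypothetical names invented here, not real device-programming code, and the 'interrupt' is modeled by hand-ordering the steps of two threads A and B.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a drive with a status- and a control-register.
struct FakeDrive {
    bool ready = true;                  // status-register: the 'ready' bit
    std::vector<std::string> accepted;  // commands the drive acted upon
    int clobbered = 0;                  // commands written while drive was busy

    // Step (1): read the status-register.
    bool status_ready() const { return ready; }

    // Step (2): write a command to the control-register.
    void write_command(const std::string& cmd) {
        if (ready) {
            accepted.push_back(cmd);
            ready = false;              // drive is now busy with this command
        } else {
            ++clobbered;                // command arrived at a busy drive
        }
    }
};

// Replays one unlucky schedule: an 'interrupt' switches from thread A to
// thread B after A has finished step (1) but before A performs step (2).
int unlucky_schedule(FakeDrive& d) {
    bool a_saw_ready = d.status_ready();            // A: step (1)

    // --- interrupt: the scheduler now runs thread B ---
    if (d.status_ready())                           // B: step (1)
        d.write_command("B: read sector");          // B: step (2)

    // --- thread A resumes, still believing the drive is ready ---
    if (a_saw_ready)
        d.write_command("A: write sector");         // A: step (2), too late

    return d.clobbered;
}
```

On a fresh `FakeDrive`, this schedule lets B's command through and delivers A's command to a drive that is no longer ready, which is exactly the interference that steps (1) and (2) need protection against.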

Page 7: Concurrent Programming

‘mutual exclusion’

• To prevent one thread from ‘sabotaging’ the actions of another, some mechanism is needed that allows a thread to temporarily ‘block’ other threads from gaining control of the CPU -- until the first thread has completed its ‘critical’ action

• Some ways to accomplish this:
    – Disable interrupts (stops CPU time-sharing)
    – Use a ‘mutex’ (a mutual exclusion variable)
    – Put other tasks to sleep (remove from run-queue)

Page 8: Concurrent Programming

What about ‘cli’?

• Disabling interrupts will stop ‘time-sharing’ among tasks on a uniprocessor system

• But it would be ‘unfair’ to allow this in a multi-user system (one task could monopolize the CPU)

• So ‘cli’ is a privileged instruction: it cannot normally be executed by user-mode tasks

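• And it won’t work on a multiprocessor system: ‘cli’ disables interrupts only on the CPU that executes it, so tasks on the other CPUs keep running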

Page 9: Concurrent Programming

What about a ‘mutex’?

• A shared global variable acts as a ‘lock’

• Initially it’s ‘unlocked’: e.g., int mutex = 1;

• Before entering a ‘critical section’ of code, a task ‘locks’ the mutex: i.e., mutex = 0;

• As soon as it leaves its ‘critical section’, it ‘unlocks’ the mutex: i.e., mutex = 1;

• While the mutex is ‘locked’, no other task can enter the ‘critical section’ of code

Page 10: Concurrent Programming

Advantages and cautions

• A mutex can be used in both uniprocessor and multiprocessor systems – provided it is possible for a CPU to ‘lock’ the mutex with a single ‘atomic’ instruction (requires special support by processors’ hardware)

• Use of a mutex can introduce busy-waiting by tasks trying to enter the ‘critical section’ (thereby severely degrading performance)

Page 11: Concurrent Programming

Software mechanism

• The operating system can assist threads needing mutual exclusion, simply by not scheduling other threads that might want to enter the same ‘critical section’ of code

• Linux accomplishes this by implementing ‘wait-queues’ for those threads that are all contending for access to the same system resource – including ‘critical sections’
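A rough user-space analogy of the same idea, not the kernel's own wait-queue code, can be sketched with the C++ standard library. Everything here is an assumption made for illustration: the `SleepingLock` class, `run_sleeping()` and the shared total are hypothetical names, and `std::condition_variable` stands in for the queue of sleeping contenders.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// A lock whose contenders sleep instead of busy-waiting.
class SleepingLock {
    std::mutex guard;               // protects the 'locked' flag
    std::condition_variable queue;  // contending threads sleep here
    bool locked = false;
public:
    void enter() {
        std::unique_lock<std::mutex> g(guard);
        queue.wait(g, [this] { return !locked; });  // sleep until unlocked
        locked = true;
    }
    void leave() {
        {
            std::lock_guard<std::mutex> g(guard);
            locked = false;
        }
        queue.notify_one();         // wake one sleeping contender, if any
    }
};

static SleepingLock section;
static int shared_total = 0;        // only touched between enter() and leave()

// Starts 'nthreads' threads that each add 1 to the total 'iterations' times.
int run_sleeping(int nthreads, int iterations) {
    shared_total = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < nthreads; t++)
        pool.emplace_back([iterations] {
            for (int i = 0; i < iterations; i++) {
                section.enter();
                shared_total += 1;
                section.leave();
            }
        });
    for (auto& th : pool) th.join();
    return shared_total;
}
```

A thread that finds the lock taken is not scheduled again until `leave()` wakes it, so no CPU time is spent spinning.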

Page 12: Concurrent Programming

Demo programs

• To show why ‘synchronization’ is needed in multithreaded programs, we wrote the ‘concur1.cpp’ demo-program

• Here several separate threads will all try to increment a shared ‘counter’ – but without any mechanism for doing synchronization

• The result is unpredictable – the final total can differ from one run to the next!
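The source of ‘concur1.cpp’ is not reproduced on these slides, so the following is only a guess at its general shape. It assumes `std::thread` for thread creation, and it stores the counter in a `std::atomic<int>` purely so that each individual read and write is well-defined C++. The read-modify-write sequence as a whole stays unprotected, which is the point. The names `unsafe_worker()` and `run_unsynchronized()` are hypothetical.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Shared counter: individual loads and stores are atomic, but nothing
// prevents another thread from running between a load and the next store.
static std::atomic<int> counter{0};

static void unsafe_worker(int iterations) {
    for (int i = 0; i < iterations; i++) {
        int temp = counter.load(std::memory_order_relaxed);  // read
        temp += 1;                                           // modify
        counter.store(temp, std::memory_order_relaxed);      // write: may
                                     // overwrite another thread's update
    }
}

// Runs 'nthreads' workers and returns the final value of the counter.
int run_unsynchronized(int nthreads, int iterations) {
    counter.store(0);
    std::vector<std::thread> pool;
    for (int t = 0; t < nthreads; t++)
        pool.emplace_back(unsafe_worker, iterations);
    for (auto& th : pool) th.join();
    return counter.load();
}
```

With several threads the returned total is often smaller than `nthreads * iterations`, because increments performed between another thread's read and write are lost.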

Page 13: Concurrent Programming

How to employ a ‘mutex’

• Declare a global variable: int mutex = 1;

• Define a pair of shared subroutines:
    – void enter_critical_section( void );
    – void leave_critical_section( void );

• Insert calls to these subroutines before and after accessing the global ‘counter’

Page 14: Concurrent Programming

Special x86 instructions

• We need to use x86 assembly-language (to implement ‘atomic’ mutex-operations)

• Several instruction-choices are possible, but ‘btr’ and ‘bts’ are simplest to use:
    – ‘btr’ means ‘bit-test-and-reset’
    – ‘bts’ means ‘bit-test-and-set’

• Syntax and semantics:
    – asm(" btr $0, mutex "); // acquire the mutex
    – asm(" bts $0, mutex "); // release the mutex

Page 15: Concurrent Programming

The two mutex-functions

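void enter_critical_section( void )
{
    asm(" spin: btr $0, mutex ");   // copy bit 0 into CF, then clear bit 0
    asm(" jnc spin ");              // CF=0: mutex was already locked, so retry
}

void leave_critical_section( void )
{
    asm(" bts $0, mutex ");         // set bit 0 again: mutex is unlocked
}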

Page 16: Concurrent Programming

Where to use the functions

void my_thread( int * data )
{
    int i, temp;
    for (i = 0; i < maximum; i++)
    {
        enter_critical_section();
        temp = counter;
        temp += 1;
        counter = temp;
        leave_critical_section();
    }
}
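For readers who want to try the same structure without inline assembly, here is a minimal portable sketch. It swaps in a different technique from the slides: `std::atomic<int>::exchange` replaces the ‘btr’/‘bts’ instructions, and `std::thread` is assumed for thread creation. The names `mutex_word` and `run_with_mutex()` are hypothetical, and the loop bound is passed as a parameter instead of being read from a global ‘maximum’.

```cpp
#include <atomic>
#include <thread>
#include <vector>

static std::atomic<int> mutex_word{1};  // 1 = unlocked, 0 = locked
static int counter = 0;                 // protected by mutex_word

void enter_critical_section() {
    // Atomically store 0 and fetch the previous value in one step.
    // A previous value of 1 means the mutex was free and is now ours;
    // a previous value of 0 means another thread holds it, so spin.
    while (mutex_word.exchange(0, std::memory_order_acquire) == 0) {
        // busy-wait
    }
}

void leave_critical_section() {
    mutex_word.store(1, std::memory_order_release);  // unlock
}

static void my_thread(int maximum) {
    for (int i = 0; i < maximum; i++) {
        enter_critical_section();
        int temp = counter;
        temp += 1;
        counter = temp;
        leave_critical_section();
    }
}

// Runs 'nthreads' copies of my_thread() and returns the final counter.
int run_with_mutex(int nthreads, int maximum) {
    counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < nthreads; t++)
        pool.emplace_back(my_thread, maximum);
    for (auto& th : pool) th.join();
    return counter;
}
```

Because only one thread at a time can be between `enter_critical_section()` and `leave_critical_section()`, the final total equals `nthreads * maximum` on every run.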

Page 17: Concurrent Programming

‘reentrancy’

• As an aside, our ‘my_thread()’ function (on the previous slide) is an example of ‘reentrant’ code

• More than one process (or processor) can be safely executing it concurrently

• It needs to obey two cardinal rules:
    – It contains no ‘self-modifying’ instructions
    – Access to shared variables is ‘exclusive’

Page 18: Concurrent Programming

In-class exercise #1

• Rewrite the ‘concur1.cpp’ demo-program, as ‘concur2.cpp’, inserting these functions that will implement ‘mutual exclusion’ for our thread’s ‘critical section’

• Then try running your ‘concur2.cpp’ on a uniprocessor system (your workstation)

• Also try running your ‘concur2.cpp’ on a multiprocessor system (e.g., dept server)

Page 19: Concurrent Programming

The x86 ‘lock’ prefix

• In order for the ‘btr’ instruction to perform an ‘atomic’ update (when multiple CPUs are using the same bus to access memory simultaneously), it is necessary to insert an x86 ‘lock’ prefix, like this:

asm(" spin: lock btr $0, mutex ");

• The prefix ‘locks’ the shared system-bus while the instruction executes -- so another CPU cannot intervene
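One way the ‘lock’ prefix might be used from GCC or Clang is through extended inline assembly, which lets the compiler supply the address of the variable and read back the carry-flag. This is a sketch under stated assumptions, not the course's own solution: `mutex_word`, `try_acquire()`, `release()` and `run_locked()` are hypothetical names, and the non-x86 branch (using compiler builtins) is added only so that the example still builds on other processors.

```cpp
#include <thread>
#include <vector>

static int mutex_word = 1;   // bit 0: 1 = unlocked, 0 = locked
static int counter = 0;      // protected by mutex_word

// Tries once to take the mutex; returns true if bit 0 was set beforehand.
static bool try_acquire() {
#if defined(__x86_64__) || defined(__i386__)
    unsigned char was_set;
    asm volatile("lock btrl $0, %0\n\t"  // atomically: CF = bit 0, bit 0 = 0
                 "setc %1"               // was_set = CF
                 : "+m"(mutex_word), "=q"(was_set)
                 :
                 : "cc", "memory");
    return was_set != 0;
#else
    // Assumed fallback for other CPUs: same effect via a compiler builtin.
    return (__atomic_fetch_and(&mutex_word, ~1, __ATOMIC_ACQUIRE) & 1) != 0;
#endif
}

static void release() {
#if defined(__x86_64__) || defined(__i386__)
    asm volatile("lock btsl $0, %0"      // atomically set bit 0 (unlock)
                 : "+m"(mutex_word)
                 :
                 : "cc", "memory");
#else
    __atomic_fetch_or(&mutex_word, 1, __ATOMIC_RELEASE);
#endif
}

// Runs 'nthreads' threads that each increment the counter 'maximum' times.
int run_locked(int nthreads, int maximum) {
    counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < nthreads; t++)
        pool.emplace_back([maximum] {
            for (int i = 0; i < maximum; i++) {
                while (!try_acquire()) { /* spin until the mutex is free */ }
                counter += 1;
                release();
            }
        });
    for (auto& th : pool) th.join();
    return counter;
}
```

The "memory" clobber tells the compiler not to move the accesses to `counter` across the lock and unlock instructions.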

Page 20: Concurrent Programming

In-class exercise #2

• Add the ‘lock’ prefix to your ‘concur2.cpp’ demo, and then try executing it again on the multiprocessor system

• Use the Linux ‘time’ command to measure how long it takes for your demo to finish

• Observe the ‘degraded’ performance due to adding the ‘mutex’ functions – penalty for achieving a ‘correct’ parallel program

Page 21: Concurrent Programming

The ‘nanosleep()’ system-call

• Your multithreaded demo-program shows poor performance because your threads are doing lots of ‘busy-waiting’

• When a thread can’t acquire the mutex, it should voluntarily give up control of the CPU (so another thread can do real work)

• The Linux ‘nanosleep()’ system-call allows a thread to ‘yield’ its time-slice
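A yielding version of the acquire-loop could look roughly like this. The ‘yielding.cpp’ demo is not shown on the slides, so the details here are assumptions: the mutex is modeled with `std::atomic<int>::exchange` instead of ‘btr’, the requested sleep of one nanosecond is an arbitrary choice, and `enter_critical_section_yielding()` and `run_yielding()` are hypothetical names. The `nanosleep()` call itself is the standard POSIX one declared in `<time.h>`.

```cpp
#include <atomic>
#include <thread>
#include <vector>
#include <time.h>    // nanosleep(), struct timespec (POSIX)

static std::atomic<int> mutex_word{1};  // 1 = unlocked, 0 = locked
static int counter = 0;                 // protected by mutex_word

void enter_critical_section_yielding() {
    // Try to take the mutex; on failure, sleep briefly instead of spinning
    // so that the thread holding the mutex can use the CPU.
    while (mutex_word.exchange(0, std::memory_order_acquire) == 0) {
        struct timespec nap = {0, 1};   // 0 seconds, 1 nanosecond
        nanosleep(&nap, nullptr);       // give up the CPU, then retry
    }
}

void leave_critical_section() {
    mutex_word.store(1, std::memory_order_release);
}

// Runs 'nthreads' threads that each increment the counter 'maximum' times.
int run_yielding(int nthreads, int maximum) {
    counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < nthreads; t++)
        pool.emplace_back([maximum] {
            for (int i = 0; i < maximum; i++) {
                enter_critical_section_yielding();
                counter += 1;
                leave_critical_section();
            }
        });
    for (auto& th : pool) th.join();
    return counter;
}
```

The total is still exact; what changes is where the waiting time goes, which is what the Linux ‘time’ command can reveal.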

Page 22: Concurrent Programming

In-class exercise #3

• Revise your ‘concur3.cpp’ program so that a thread will ‘yield’ if it cannot immediately acquire the mutex (see our ‘yielding.cpp’ demo for header-files and call-syntax)

• Use the Linux ‘time’ command to compare the performance of ‘concur3’ and ‘concur2’:
    – On a uniprocessor platform
    – On a multiprocessor platform