Threads Tutorial #7 CPSC 261. A thread is a virtual processor Each thread is provided the illusion that it owns a core – Copy of the registers – It is

Threads

Tutorial #7, CPSC 261

A thread is a virtual processor

• Each thread is provided the illusion that it owns a core
  – Copy of the registers
  – It is running all the time

• In fact, all of the threads share the hardware cores
  – The operating system rapidly switches the cores among the threads that want to run, providing the illusion that each thread owns a core

POSIX standard: pthreads

• Threads are created via pthread_create()
• A thread dies when it:
  – returns from the function given to pthread_create()
  – or calls pthread_exit()

• You can wait for a thread to complete via pthread_join()

Visualizing thread execution

[Figure: diagram of the main thread and the child threads, labeled with pthread_create() and pthread_join()]

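PingPong.c

    #include <pthread.h>

    // LIMIT, counter, t1, and t2 are assumed to be declared elsewhere (not shown on the slide).
    void *p(void *arg) {
        long i;
        for (i = 0; i < LIMIT; ++i) {
            counter = counter + 1;   // unsynchronized update of a shared variable
        }
        return 0;
    }

    int main(void) {
        pthread_create(&t1, NULL, &p, "ping");
        pthread_create(&t2, NULL, &p, "pong");
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        return 0;
    }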

Questions you might ask

• What if t2 finishes before t1?
  – It will just wait as long as necessary for the main thread to join with it
• How many threads can I create?
  – It depends. On Linux, a few tens of thousands

The big thing that goes wrong

• Uncontrolled access to memory
  – race condition
  – with multiple writers of a single shared variable, updates can get lost
• Fixed by:
  – single writers
  – locks

Before-and-after atomicity

• Sometimes you need an arbitrary sequence of operations to be atomic

• A sequence of operations that needs to be atomic is also called a critical section

• Mutual exclusion means only one thread at a time (one thread in the critical section excludes all others)

Achieving mutual exclusion

• In pthreads, mutual exclusion is provided by mutex objects
  – created like any other object (e.g. with malloc)
  – initialized by pthread_mutex_init()
  – acquired by pthread_mutex_lock()
  – released by pthread_mutex_unlock()

The lock idiom

• Every critical section is protected by a mutex
• The code looks like:

    pthread_mutex_lock(&lock);
    // critical section
    pthread_mutex_unlock(&lock);

Locking PingPong

    void *p(void *arg) {
        long i;
        for (i = 0; i < LIMIT; ++i) {
            pthread_mutex_lock(&lock);
            // Once the lock is held, this
            // “critical section” can be as long
            // as you need or want it to be
            counter = counter + 1;
            pthread_mutex_unlock(&lock);
        }
        return 0;
    }

Multiple critical sections

• If there are multiple critical sections that access the same shared data
  – They need to be protected by the same lock

Multiple critical section idiom

    pthread_mutex_lock(&lock);
    // critical section
    // for thread 1
    pthread_mutex_unlock(&lock);

    pthread_mutex_lock(&lock);
    // critical section
    // for thread 2
    pthread_mutex_unlock(&lock);

Locking issues

• Fine-grained locks
  – more parallelism
  – more overhead
  – more complexity
• Coarse-grained locks
  – less parallelism
  – less overhead
  – simpler
• Deadlock

Sample thread code

• Lots of examples in the threads directory of the lectures repository

Using threads for speedup

[Figure: plot of speedup against cores used; perfect speedup is the 45° line]
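For reference, the usual definition of speedup, stated here as an assumption since the slide gives only the plot, compares the running time on one core with the running time on n cores. The symbols S, T, and n are introduced for this note.

```latex
S(n) = \frac{T(1)}{T(n)}, \qquad \text{perfect speedup: } S(n) = n
```

Plotted with equal scales on both axes, S(n) = n is the 45° line; measured speedups usually fall below it.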

Things to think about

• Each core has its own cache
• The L3 cache is shared between all the cores
• If you have 8 cores, how many “pieces of work” should you create?
  – 8?
  – <8?
  – >8?

More things to think about

• What if the “pieces of work” aren’t all the same size?

• Or what if one thread is slower than the other threads?
  – Why could this be?
    • Randomness
    • Interference with other activity on the machine
  – These slow threads are called “stragglers” and are a real problem in practice

The ideal case – all threads finish at the same time

What might happen

Even more things to think about

• Suppose I have 8 cores.
• Should I create 8 threads, one for each core?
• Or more than 8 threads to deal with “stragglers”?
• Or ...
