Threads
Tutorial #7
CPSC 261
A thread is a virtual processor
• Each thread is provided the illusion that it owns a core
  – Copy of the registers
  – It is running all the time
• In fact, all of the threads share the hardware cores
  – The operating system rapidly switches the cores among the threads that want to run, providing this illusion that each thread owns a core
POSIX standard: pthreads
• Threads are created via pthread_create()
• A thread dies when it:
  – returns from the function given to pthread_create()
  – or calls pthread_exit()
• You can wait for a thread to complete via pthread_join()
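The three calls above can be sketched together as follows. This is a minimal illustration, not code from the tutorial: double_it and run_in_thread are hypothetical names, and passing a small integer through the void * argument is just one convenient convention.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical worker: doubles the number it is given and hands the
 * result back as the thread's exit value. */
static void *double_it(void *arg) {
    intptr_t n = (intptr_t)arg;
    if (n < 0) {
        /* pthread_exit() ends the thread right here with this exit value. */
        pthread_exit((void *)(intptr_t)-1);
    }
    /* Returning from the start function also ends the thread. */
    return (void *)(n * 2);
}

/* Hypothetical helper: runs double_it(n) in a new thread and waits for it.
 * Returns the thread's exit value, or -2 if a pthread call failed. */
long run_in_thread(long n) {
    pthread_t t;
    void *result;

    if (pthread_create(&t, NULL, double_it, (void *)(intptr_t)n) != 0)
        return -2;
    /* pthread_join() blocks until the thread has finished and
     * stores its exit value in result. */
    if (pthread_join(t, &result) != 0)
        return -2;
    return (long)(intptr_t)result;
}
```

Calling run_in_thread(21) from main exercises the "return" path; a negative argument exercises the pthread_exit() path. On most systems the program is built with the -pthread compiler flag.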
Visualizing thread execution
[Figure: timeline of the main thread and its child threads, with the pthread_create() and pthread_join() calls marked]
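PingPong.c

/* The declarations of counter, LIMIT, t1 and t2 are not shown. */
void *p(void *arg) {
    long i;
    for (i = 0; i < LIMIT; ++i) {
        counter = counter + 1;   /* shared variable, updated without a lock */
    }
    return 0;
}

int main(...) {
    pthread_create(&t1, NULL, &p, "ping");
    pthread_create(&t2, NULL, &p, "pong");
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}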
Questions you might ask
• What if t2 finishes before t1?
  – t2 will just wait as long as necessary for the main thread to join with it
• How many threads can I create?
  – It depends. On Linux, a few tens of thousands
The big thing that goes wrong
• Uncontrolled access to memory
  – race condition
  – with multiple writers of a single shared variable, updates can get lost
• Fixed by:
  – single writers
  – locks
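The "single writers" fix can be sketched as below: each thread counts into its own slot, and the main thread reads the slots only after joining. This is an illustrative sketch, not the tutorial's code; NTHREADS, LIMIT's value, partial, count_own_slot and single_writer_total are all assumed names and values.

```c
#include <pthread.h>

#define NTHREADS 2          /* assumed thread count */
#define LIMIT    1000000L   /* assumed iteration count */

/* One slot per thread, so every slot has exactly one writer. */
static long partial[NTHREADS];

static void *count_own_slot(void *arg) {
    long *slot = arg;        /* this thread's private slot */
    long local = 0;
    for (long i = 0; i < LIMIT; ++i)
        local = local + 1;
    *slot = local;           /* no other thread writes this slot */
    return NULL;
}

/* Starts the threads, waits for them, and adds up the slots.
 * Returns the total, or -1 if a thread could not be created. */
long single_writer_total(void) {
    pthread_t t[NTHREADS];
    long total = 0;

    for (int i = 0; i < NTHREADS; ++i) {
        partial[i] = 0;
        if (pthread_create(&t[i], NULL, count_own_slot, &partial[i]) != 0)
            return -1;
    }
    for (int i = 0; i < NTHREADS; ++i) {
        pthread_join(t[i], NULL);
        total += partial[i];   /* safe to read: the writer has finished */
    }
    return total;
}
```

Because no variable has more than one writer, no update can be lost and no lock is needed; the cost is that the partial results must be combined at the end.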
Before-and-after atomicity
• Sometimes you need an arbitrary sequence of operations to be atomic
• A sequence of operations that needs to be atomic is also called a critical section
• Mutual exclusion means only one thread at a time (one thread in the critical section excludes all others)
Achieving mutual exclusion
• In pthreads, mutual exclusion is provided by mutex objects
  – created like any other object (for example with malloc)
  – initialized by pthread_mutex_init()
  – acquired by pthread_mutex_lock()
  – released by pthread_mutex_unlock()
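The life cycle of a heap-allocated mutex might look like the sketch below. The helper names make_lock, free_lock and lock_roundtrip are hypothetical; pthread_mutex_destroy() is not mentioned above but is the standard counterpart to pthread_mutex_init().

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical helper: allocates and initializes a mutex.
 * Returns NULL if either step fails. */
pthread_mutex_t *make_lock(void) {
    pthread_mutex_t *lock = malloc(sizeof *lock);
    if (lock == NULL)
        return NULL;
    if (pthread_mutex_init(lock, NULL) != 0) {   /* NULL = default attributes */
        free(lock);
        return NULL;
    }
    return lock;
}

/* Hypothetical helper: destroys and frees a mutex that nobody holds. */
void free_lock(pthread_mutex_t *lock) {
    pthread_mutex_destroy(lock);
    free(lock);
}

/* Acquires and releases a fresh mutex once.
 * Returns 0 on success, -1 if the mutex could not be created,
 * or the error code of the failing pthread call. */
int lock_roundtrip(void) {
    pthread_mutex_t *lock = make_lock();
    if (lock == NULL)
        return -1;
    int rc = pthread_mutex_lock(lock);
    if (rc == 0)
        rc = pthread_mutex_unlock(lock);
    free_lock(lock);
    return rc;
}
```

A mutex with static storage can instead be initialized with PTHREAD_MUTEX_INITIALIZER, which avoids the explicit init call.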
The lock idiom
• Every critical section is protected by a mutex
• The code looks like:

pthread_mutex_lock(&lock);
// critical section
pthread_mutex_unlock(&lock);
Locking PingPong

void *p(void *arg) {
    long i;
    for (i = 0; i < LIMIT; ++i) {
        pthread_mutex_lock(&lock);
        // Once the lock is held, this
        // "critical section" can be as long
        // as you need or want it to be
        counter = counter + 1;
        pthread_mutex_unlock(&lock);
    }
    return 0;
}
Multiple critical sections
• If there are multiple critical sections that access the same shared data
  – They need to be protected by the same lock
Multiple critical section idiom
pthread_mutex_lock(&lock);
// critical section
// for thread 1
pthread_mutex_unlock(&lock);

pthread_mutex_lock(&lock);
// critical section
// for thread 2
pthread_mutex_unlock(&lock);
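A concrete instance of this idiom, as a sketch: two different thread functions update the same variable, so both critical sections take the same mutex. The names balance, balance_lock, adder, subtractor, run_add_and_subtract and the value of STEPS are assumptions made for illustration.

```c
#include <pthread.h>

#define STEPS 100000L   /* assumed iteration count */

static long balance = 0;   /* shared data */
/* One lock for every critical section that touches balance. */
static pthread_mutex_t balance_lock = PTHREAD_MUTEX_INITIALIZER;

static void *adder(void *arg) {
    (void)arg;
    for (long i = 0; i < STEPS; ++i) {
        pthread_mutex_lock(&balance_lock);
        balance = balance + 2;        /* critical section for thread 1 */
        pthread_mutex_unlock(&balance_lock);
    }
    return NULL;
}

static void *subtractor(void *arg) {
    (void)arg;
    for (long i = 0; i < STEPS; ++i) {
        pthread_mutex_lock(&balance_lock);
        balance = balance - 1;        /* critical section for thread 2 */
        pthread_mutex_unlock(&balance_lock);
    }
    return NULL;
}

/* Runs both threads to completion and returns the final balance,
 * or -1 if a thread could not be created. */
long run_add_and_subtract(void) {
    pthread_t t1, t2;

    balance = 0;
    if (pthread_create(&t1, NULL, adder, NULL) != 0)
        return -1;
    if (pthread_create(&t2, NULL, subtractor, NULL) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return balance;   /* both writers have finished */
}
```

If adder and subtractor each used a different mutex, each would exclude only other copies of itself, and the two could still interleave and lose updates to balance.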
Locking issues
• Fine-grained locks
  – more parallelism
  – more overhead
  – more complexity
• Coarse-grained locks
  – less parallelism
  – less overhead
  – simpler
• Deadlock
Sample thread code
• Lots of examples in the threads directory of the lectures repository
Using threads for speedup
[Figure: speedup plotted against cores used; the 45° line marks perfect speedup]
Things to think about
• Each core has its own cache
• The L3 cache is shared between all the cores
• If you have 8 cores, how many “pieces of work” should you create?
  – 8?
  – <8?
  – >8?
More things to think about
• What if the “pieces of work” aren’t all the same size?
• Or what if one thread is slower than the other threads?
  – Why could this be?
    • Randomness
    • Interference with other activity on the machine
  – These slow threads are called “stragglers” and are a real problem in practice
The ideal case – all threads finish at the same time
What might happen
Even more things to think about
• Suppose I have 8 cores.
• Should I create 8 threads, one for each core?
• Or more than 8 threads to deal with “stragglers”?
• Or ...
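One possible answer to "Or ...", offered only as a sketch and not as the tutorial's intended solution: keep one thread per core but cut the job into many more pieces than threads, and let each thread claim the next unclaimed piece under a lock. A slow thread then simply completes fewer pieces. All names and sizes here (NWORKERS, NPIECES, do_piece, piece_worker, run_pieces) are assumptions, and do_piece is a stand-in for real work.

```c
#include <pthread.h>

#define NWORKERS 8    /* assumed: one thread per core */
#define NPIECES  64   /* assumed: many more pieces than threads */

static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static int next_piece;               /* next unclaimed piece; guarded by queue_lock */
static long piece_result[NPIECES];   /* slot i is written only by the claimer of piece i */

/* Stand-in for real work: sums 1..(piece + 1), so pieces differ in size. */
static long do_piece(int piece) {
    long sum = 0;
    for (long k = 1; k <= piece + 1; ++k)
        sum += k;
    return sum;
}

static void *piece_worker(void *arg) {
    (void)arg;
    for (;;) {
        /* Critical section: claim the next piece, if any remain. */
        pthread_mutex_lock(&queue_lock);
        int piece = (next_piece < NPIECES) ? next_piece++ : -1;
        pthread_mutex_unlock(&queue_lock);

        if (piece < 0)
            break;                               /* nothing left to do */
        piece_result[piece] = do_piece(piece);   /* the work happens outside the lock */
    }
    return NULL;
}

/* Runs all pieces on up to NWORKERS threads and returns the sum of the
 * results, or -1 if no thread could be started. */
long run_pieces(void) {
    pthread_t t[NWORKERS];
    int started = 0;
    long total = 0;

    next_piece = 0;
    for (int i = 0; i < NWORKERS; ++i)
        if (pthread_create(&t[started], NULL, piece_worker, NULL) == 0)
            ++started;
    if (started == 0)
        return -1;
    for (int i = 0; i < started; ++i)
        pthread_join(t[i], NULL);
    for (int i = 0; i < NPIECES; ++i)
        total += piece_result[i];
    return total;
}
```

The trade-off mirrors the locking slide: smaller pieces balance the load better but mean more trips through the lock, so the piece size is something to tune rather than a fixed rule.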