Duke Systems
Servers and Threads, Continued
Jeff Chase, Duke University
Processes and threads
Each process has a thread bound to the VAS, with stacks (user and kernel). If we say a process does something, we really mean its thread does it. The kernel can suspend/restart the thread wherever and whenever it wants.
Each process has a virtual address space (VAS): a private name space for the virtual memory it uses. The VAS is both a “sandbox” and a “lockbox”: it limits what the process can see/do, and protects its data from others.
From now on, we suppose that a process could have additional threads. We are not concerned with how to implement them, but we presume that they can all make system calls and block independently.
Inside your Web server
packet queues
listen queue
accept queue
Server application (Apache, Tomcat/Java, etc.)
Server operations:
– create socket(s)
– bind to port number(s)
– listen to advertise port
– wait for client to arrive on port (select/poll/epoll of ports)
– accept client connection
– read or recv request
– write or send response
– close client socket
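The server operations above can be sketched with the standard sockets API. This is a minimal illustration, not the slides' code: the names `make_listener` and `serve_forever` are ours, and the "response" is just an echo of the request.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP socket, bind it to the given port (0 = any free port),
 * and mark it as a listening server socket. Returns the fd, or -1. */
int make_listener(unsigned short port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    int on = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 128) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Serial server loop: wait for a client, read a request,
 * send a response, close the client socket, repeat. */
void serve_forever(int lfd) {
    for (;;) {
        int cfd = accept(lfd, NULL, NULL);        /* may block */
        if (cfd < 0) continue;
        char buf[4096];
        ssize_t n = read(cfd, buf, sizeof(buf));  /* read request */
        if (n > 0)
            write(cfd, buf, n);                   /* echo as "response" */
        close(cfd);
    }
}
```

Note that this loop serves one client at a time; the rest of the lecture is about how to serve many concurrently.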
disk queue
Web server: handling a request
Accept Client Connection
Read HTTP Request Header
Find File
Send HTTP Response Header
Read File / Send Data
(may block waiting on disk I/O)
Want to be able to process requests concurrently.
(may block waiting on network)
Multi-programmed server: idealized
Incoming request queue
worker loop
Handle one request,
blocking as necessary.
When request is complete,
return to worker pool.
Magic elastic worker pool
Resize worker pool to match incoming request load: create/destroy workers as needed.
dispatch
idle workers
Workers wait here for next request dispatch.
Workers could be processes or threads.
Multi-process server architecture
Accept Conn → Read Request → Find File → Send Header → Read File / Send Data
Accept Conn → Read Request → Find File → Send Header → Read File / Send Data
Process 1
Process N…
separate address spaces
Multi-process server architecture
• Each of P processes can execute one request at a time, concurrently with other processes.
• If a process blocks, the other processes may still make progress on other requests.
• Max # requests in service concurrently == P
• The processes may loop and handle multiple requests serially, or can fork a process per request.
– Tradeoffs?
• Examples:
– inetd “internet daemon” for standard /etc/services
– Design pattern for (Web) servers: “prefork” a fixed number of worker processes.
Example: inetd
• Classic Unix systems run an inetd “internet daemon”.
• Inetd receives requests for standard services.
– Standard services and ports listed in /etc/services.
– inetd listens on the ports and accepts connections.
• For each connection, inetd forks a child process.
• Child execs the service configured for the port.
• Child executes the request, then exits.
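The fork/exec pattern inetd uses can be sketched as below. The names `spawn_service` and `inetd_loop` are ours, and real inetd looks up the program for each port in its configuration (/etc/services plus inetd.conf) rather than taking a path argument.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child to handle one connection: the child makes the socket
 * its stdin/stdout and execs the service program configured for the
 * port. Returns the child pid, or -1 if fork fails. */
pid_t spawn_service(int cfd, const char *service_path) {
    pid_t pid = fork();
    if (pid == 0) {                    /* child */
        dup2(cfd, STDIN_FILENO);       /* socket becomes stdin... */
        dup2(cfd, STDOUT_FILENO);      /* ...and stdout */
        close(cfd);
        execl(service_path, service_path, (char *)NULL);
        _exit(1);                      /* exec failed */
    }
    return pid;
}

/* inetd-style accept loop (sketch): accept, fork+exec, reap. */
void inetd_loop(int lfd, const char *service_path) {
    for (;;) {
        int cfd = accept(lfd, NULL, NULL);
        if (cfd < 0) continue;
        spawn_service(cfd, service_path);
        close(cfd);                    /* parent drops its copy */
        while (waitpid(-1, NULL, WNOHANG) > 0)   /* reap exited children */
            ;
    }
}
```

With `/bin/cat` as the "service", the child simply echoes whatever the client sends, which makes the pattern easy to observe.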
[Apache Modeling Project: http://www.fmc-modeling.org/projects/apache]
Children of init: inetd
New child processes are created to run network services.
They may be created on demand on connect attempts from the network for designated service ports.
Should they run as root?
Prefork
[Apache Modeling Project: http://www.fmc-modeling.org/projects/apache]
In the Apache MPM “prefork” option, only one child polls or accepts at a time: the child at the head of a queue. This avoids the “thundering herd”.
Details, details
“Scoreboard” keeps track of child/worker activity, so the parent can manage an elastic worker pool.
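A minimal sketch of the prefork pattern: fork a fixed pool of workers up front, each of which loops handling one connection at a time. The names `fork_workers` and `worker_loop` are ours, and this omits Apache's accept serialization and scoreboard.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork n worker processes up front ("prefork"). Each child runs
 * worker() (if non-NULL) and then exits; returns how many workers
 * were forked. In a real server the worker never returns: it loops
 * accepting and handling connections. */
int fork_workers(int n, void (*worker)(void)) {
    int forked = 0;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0) {              /* worker child */
            if (worker) worker();
            _exit(0);
        }
        if (pid > 0) forked++;
    }
    return forked;
}

/* What each worker would do: serially accept and handle connections
 * on the listening socket inherited across fork. Real Apache prefork
 * also serializes accept (one child at the head of a queue) to avoid
 * the thundering herd. */
void worker_loop(int lfd, void (*handle)(int cfd)) {
    for (;;) {
        int cfd = accept(lfd, NULL, NULL);   /* may block */
        if (cfd < 0) continue;
        handle(cfd);
        close(cfd);
    }
}
```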
Multi-threaded server architecture
Accept Conn → Read Request → Find File → Send Header → Read File / Send Data
Accept Conn → Read Request → Find File → Send Header → Read File / Send Data
Thread 1
Thread N
…
This structure might have lower cost than the multi-process architecture if threads are “cheaper” than processes.
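The multi-threaded structure can be sketched with pthreads as a thread-per-connection server. The names `client_thread` and `start_worker` are ours, and the "request handling" is just an echo.

```c
#include <pthread.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <unistd.h>

/* One worker thread per connection: read the request, write a
 * response (here just an echo), close the socket, and exit. */
static void *client_thread(void *arg) {
    int cfd = *(int *)arg;
    free(arg);
    char buf[4096];
    ssize_t n = read(cfd, buf, sizeof buf);   /* may block on network */
    if (n > 0)
        write(cfd, buf, n);
    close(cfd);
    return NULL;
}

/* Hand one accepted connection to its own detached thread.
 * Returns 0 on success, -1 on failure. */
int start_worker(int cfd) {
    int *arg = malloc(sizeof *arg);   /* heap copy outlives this frame */
    if (!arg) return -1;
    *arg = cfd;
    pthread_t t;
    if (pthread_create(&t, NULL, client_thread, arg) != 0) {
        free(arg);
        return -1;
    }
    pthread_detach(t);   /* no join: thread cleans up on exit */
    return 0;
}
```

All the threads share one address space, so creating a worker here costs a stack and a TCB rather than a whole process.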
Server structure, recap
• The server structure discussion motivates threads, and illustrates the need for concurrency management.
– We return later to performance impacts and effective I/O overlap.
• A continuing theme of the class presentation: Unix systems fall short of the idealized model.
– Thundering herd problem when multiple workers wake up and contend for an arriving request: one worker wins and consumes the request, the others go back to sleep – their work was wasted. Recent fix in Linux.
– Separation of poll/select and accept in Unix syscall interface: multiple workers wake up when a socket has new data, but only one can accept the request: thundering herd again, requires an API change to fix it.
– There is no easy way to manage an elastic worker pool.
• Real servers (e.g., Apache/MPM) incorporate lots of complexity to overcome these problems. We skip this topic.
Threads
• We now enter the topic of threads and concurrency control.
– This will be a focus for several lectures.
– We start by introducing more detail on thread management, and the problem of nondeterminism in concurrent execution schedules.
• Server structure discussion motivates threads, but there are other motivations.
– Harnessing parallel computing power in the multicore era
– Managing concurrent I/O streams
– Organizing/structuring processing for user interface (UI)
– Threading and concurrency management are fundamental to OS kernel implementation: processes/threads execute concurrently in the kernel address space for system calls and fault handling. The kernel is a multithreaded program.
• So let’s get to it….
The theater analogy
Threads
Address space
Program
script
context (stage)
[lpcox]
Running a program is like performing a play.
A Thread
Thread* t
machine state
name/status etc
“fencepost”
0xdeadbeef
Stack
low
high stack top
unused
thread object or thread control block (TCB)
int stack[StackSize]
ucontext_t
Example: pthreads
pthread_t threads[N];
int rc;
int t = …;

rc = pthread_create(&threads[t], NULL, PrintHello, (void *)t);
if (rc)
    error….

void *PrintHello(void *threadid)
{
    long tid;
    tid = (long)threadid;
    printf("Hello World! It's me, thread #%ld!\n", tid);
    pthread_exit(NULL);
}
[http://computing.llnl.gov/tutorials/pthreads/]
Example: Java Threads (1)
class PrimeThread extends Thread {
    long minPrime;
    PrimeThread(long minPrime) {
        this.minPrime = minPrime;
    }

    public void run() {
        // compute primes larger than minPrime
        . . .
    }
}

PrimeThread p = new PrimeThread(143);
p.start();
[http://download.oracle.com/javase/6/docs/api/java/lang/Thread.html]
Example: Java Threads (2)
[http://download.oracle.com/javase/6/docs/api/java/lang/Thread.html]
class PrimeRun implements Runnable {
    long minPrime;
    PrimeRun(long minPrime) {
        this.minPrime = minPrime;
    }

    public void run() {
        // compute primes larger than minPrime
        . . .
    }
}

PrimeRun p = new PrimeRun(143);
new Thread(p).start();
Thread states and transitions
running
ready
blocked
exited
The kernel process/thread scheduler governs these transitions.
exit
wait, STOP, read, write, listen, receive, etc.
sleep
STOP wait
wakeup
Sleep and wakeup are internal primitives. Wakeup adds a thread to the scheduler’s ready pool: a set of threads in the ready state.
EXIT
yield
Two threads sharing a CPU
reality
concept
context switch
CPU Scheduling 101
The OS scheduler makes a sequence of “moves”.
– Next move: if a CPU core is idle, pick a ready thread t from the ready pool and dispatch it (run it).
– Scheduler’s choice is “nondeterministic”
– Scheduler’s choice determines interleaving of execution
Wakeup
GetNextToRun
SWITCH()
ready pool
blocked threads
If timer expires, or wait/yield/terminate
Yield() {
    disable;
    next = FindNextToRun();
    ReadyToRun(this);
    Switch(this, next);
    enable;
}

Sleep() {
    disable;
    this->status = BLOCKED;
    next = FindNextToRun();
    Switch(this, next);
    enable;
}
A Rough Idea
Issues to resolve:What if there are no ready threads?How does a thread terminate?How does the first thread start?
/*
 * Save context of the calling thread (old), restore registers of
 * the next thread to run (new), and return in context of new.
 */
switch/MIPS (old, new) {
    old->stackTop = SP;
    save RA in old->MachineState[PC];
    save callee registers in old->MachineState

    restore callee registers from new->MachineState
    RA = new->MachineState[PC];
    SP = new->stackTop;

    return (to RA)
}
This example (from the old MIPS ISA) illustrates how context switch saves/restores the user register context for a thread, efficiently and without assigning a value directly into the PC.
switch/MIPS (old, new) {
    old->stackTop = SP;
    save RA in old->MachineState[PC];
    save callee registers in old->MachineState

    restore callee registers from new->MachineState
    RA = new->MachineState[PC];
    SP = new->stackTop;

    return (to RA)
}
Example: Switch()
Caller-saved registers (if needed) are already saved on its stack, and restored automatically on return.
Return to procedure that called switch in new thread.
Save current stack pointer and caller’s return address in old thread object.
Switch off of old stack and over to new stack.
RA is the return address register. It contains the address that a procedure return instruction branches to.
What to know about context switch
• The Switch/MIPS example is an illustration for those of you who are interested. It is not required to study it. But you should understand how a thread system would use it (refer to the state transition diagram):
• Switch() is a procedure that returns immediately, but it returns onto the stack of new thread, and not in the old thread that called it.
• Switch() is called from internal routines to sleep or yield (or exit).
• Therefore, every thread in the blocked or ready state has a frame for Switch() on top of its stack: it was the last frame pushed on the stack before the thread switched out. (Need per-thread stacks to block.)
• The thread create primitive seeds a Switch() frame manually on the stack of the new thread, since it is too young to have switched before.
• When a thread switches into the running state, it always returns immediately from Switch() back to the internal sleep or yield routine, and from there back on its way to wherever it goes next.
Creating a new thread
Also called “forking” a thread.
Idea: create initial state, put on ready queue.
1. Allocate, initialize a new TCB
2. Allocate a new stack
3. Make it look like the thread was going to call a function:
   – PC points to first instruction in function
   – SP points to new stack
   – Stack contains arguments passed to function
4. Add thread to ready queue
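These steps can be approximated in user space with the `ucontext_t` API that appeared in the thread picture earlier. This is only a sketch of "seeding" a context, assuming our own names (`thread_start`, `run_demo`); a real thread library would also build a TCB and put it on a ready queue.

```c
#include <ucontext.h>

/* Seed a new execution context: allocate a stack, point the saved PC
 * at a start function, then switch to it. swapcontext saves the
 * caller's registers (like Switch) and restores the new context's. */
static ucontext_t main_ctx, thread_ctx;
static int ran = 0;

static void thread_start(void) {
    ran = 1;     /* the "thread body" runs here; uc_link resumes main */
}

int run_demo(void) {
    static char stack[64 * 1024];          /* step 2: allocate a stack */
    getcontext(&thread_ctx);               /* step 1: init context (TCB-like) */
    thread_ctx.uc_stack.ss_sp = stack;
    thread_ctx.uc_stack.ss_size = sizeof stack;
    thread_ctx.uc_link = &main_ctx;        /* where to return on exit */
    makecontext(&thread_ctx, thread_start, 0);  /* step 3: PC -> function */
    swapcontext(&main_ctx, &thread_ctx);   /* step 4 analog: dispatch it */
    return ran;
}
```

This is how "green" user-level thread libraries (discussed later in the deck) typically create their first stacks.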
Thread control block
[Figure: a CPU and an address space with per-thread code and stacks. TCB1, TCB2, and TCB3 each hold a saved PC, SP, and registers. Thread 1 is running, with its PC, SP, and registers loaded on the CPU; the other TCBs wait on the ready queue.]
Thread control block
[Figure: Thread 1 is running, with its PC, SP, and registers loaded on the CPU; TCB2 and TCB3 wait on the ready queue with their saved PC, SP, and registers. Each thread has its own stack in the shared address space.]
Kernel threads (“native”)
[Figure: each thread has its own PC and SP; every user thread is backed by its own kernel thread, and the scheduler runs in kernel mode.]
User-level threads (“green”)
[Figure: each thread has its own PC and SP; a user-mode scheduler multiplexes many user threads over a single kernel thread, so the kernel sees only one.]
Andrew Birrell
Bob Taylor
Concurrency: An Example
int counters[N];
int total;

/*
 * Increment a counter by a specified value, and keep a running sum.
 */
void
TouchCount(int tid, int value)
{
    counters[tid] += value;
    total += value;
}
Reading Between the Lines of C

/*
 ; counters and total are global data
 ; tid and value are local data
 counters[tid] += value;
 total += value;
*/
load  counters, R1   ; load counters base
load  8(SP), R2      ; load tid index
shl   R2, #2, R2     ; index = index * sizeof(int)
add   R1, R2, R1     ; compute index to array
load  (R1), R2       ; load counters[tid]
load  4(SP), R3      ; load value
add   R2, R3, R2     ; counters[tid] += value
store R2, (R1)       ; store back to counters[tid]
load  total, R2      ; load total
add   R2, R3, R2     ; total += value
store R2, total      ; store total
Reading Between the Lines of C
load  total, R2      ; load total
add   R2, R3, R2     ; total += value
store R2, total      ; store total

Thread A: load, add, store
Thread B: load, add, store

Two executions of this code, so two values are added to total.
Interleaving matters
load  total, R2      ; load total
add   R2, R3, R2     ; total += value
store R2, total      ; store total

Thread A: load, add, store
Thread B: load, add, store
(both threads load total before either stores it)

In this schedule, only one value is added to total: last writer wins.
The scheduler made a legal move that broke this program.
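The lost-update schedule can be reproduced with pthreads: two threads repeatedly add to a shared total, with and without a lock. The function names and iteration counts here are ours; note that the unlocked version is a data race, which C formally leaves undefined, so its result merely cannot exceed the correct sum.

```c
#include <pthread.h>

/* Two threads each add 1 to a shared total ITERS times. Without a
 * lock, the load/add/store sequences can interleave and updates are
 * lost; with a mutex the result is exact. */
enum { NTHREADS = 2, ITERS = 100000 };

static long total;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *adder(void *use_lock) {
    for (int i = 0; i < ITERS; i++) {
        if (use_lock) pthread_mutex_lock(&lock);
        total += 1;                 /* compiles to load; add; store */
        if (use_lock) pthread_mutex_unlock(&lock);
    }
    return NULL;
}

/* Run the experiment and return the final total. */
long run(int use_lock) {
    total = 0;
    pthread_t t[NTHREADS];
    for (int i = 0; i < NTHREADS; i++)
        pthread_create(&t[i], NULL, adder, use_lock ? &lock : NULL);
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);
    return total;
}
```

With the lock, run(1) always yields NTHREADS * ITERS; without it, any smaller value is a legal outcome of the scheduler's moves.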
Non-determinism and ordering
Time
Thread A
Thread B
Thread C
Global ordering
Why do we care about the global ordering?
– Might have dependencies between events
– Different orderings can produce different results
Why is this ordering unpredictable?
– Can’t predict how fast processors will run
Non-determinism example
y = 10;
Thread A: x = y + 1;
Thread B: y = y * 2;
Possible results?
A goes first: x = 11 and y = 20
B goes first: y = 20 and x = 21
What is shared between threads? Variable y.
Another example
Two threads (A and B):
A tries to increment i; B tries to decrement i.

Thread A:
i = 0;
while (i < 10) {
    i++;
}
print “A done.”

Thread B:
i = 0;
while (i > -10) {
    i--;
}
print “B done.”
Example continued
Who wins? Does someone have to win?
Thread A:
i = 0;
while (i < 10) {
    i++;
}
print “A done.”

Thread B:
i = 0;
while (i > -10) {
    i--;
}
print “B done.”
Debugging non-determinism
Requires worst-case reasoning:
– Eliminate all ways for the program to break
Debugging is hard:
– Can’t test all possible interleavings
– Bugs may only happen sometimes
Heisenbug:
– Re-running the program may make the bug disappear
– Doesn’t mean it isn’t still there!