Duke Systems: Servers and Threads, Continued. Jeff Chase, Duke University


Processes and threads

[Figure: a process shown as a virtual address space plus a main thread with its stack]

Each process has a thread bound to the VAS, with stacks (user and kernel).

If we say a process does something, we really mean its thread does it.

The kernel can suspend/restart the thread wherever and whenever it wants.

Each process has a virtual address space (VAS): a private name space for the virtual memory it uses.

The VAS is both a “sandbox” and a “lockbox”: it limits what the process can see/do, and protects its data from others.

From now on, we suppose that a process could have additional threads.

We are not concerned with how to implement them, but we presume that they can all make system calls and block independently.

[Figure label: other threads (optional)]

Inside your Web server

[Figure: packet queues, listen queue, accept queue, and disk queue, with the server application (Apache, Tomcat/Java, etc.)]

Server operations
• create socket(s)
• bind to port number(s)
• listen to advertise port
• wait for client to arrive on port (select/poll/epoll of ports)
• accept client connection
• read or recv request
• write or send response
• close client socket
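The server operations listed above can be sketched in C with the standard sockets calls. This is a minimal sketch, not how Apache or Tomcat is built: the names make_listener and serve_one, the loopback address, the backlog of 16, and the fixed reply are all assumptions made for illustration, and the select/poll/epoll step is skipped because accept itself blocks until a client arrives.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP socket, bind it to a loopback port, and listen on it.
 * Passing port 0 lets the kernel pick a free port; the port actually
 * bound is stored in *bound_port. Returns the listening fd, or -1. */
int make_listener(unsigned short port, unsigned short *bound_port)
{
    assert(bound_port != NULL);
    int fd = socket(AF_INET, SOCK_STREAM, 0);          /* create socket */
    if (fd < 0)
        return -1;

    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||  /* bind to port */
        listen(fd, 16) < 0 ||                                   /* advertise port */
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *bound_port = ntohs(addr.sin_port);
    return fd;
}

/* Handle exactly one client: accept, read the request, send a fixed
 * response, close. Any of these calls may block.
 * Returns 0 on success, -1 on error. */
int serve_one(int listen_fd)
{
    static const char reply[] = "HTTP/1.0 200 OK\r\n\r\nhello\n";
    char request[1024];

    int conn = accept(listen_fd, NULL, NULL);          /* accept client connection */
    if (conn < 0)
        return -1;
    ssize_t n = read(conn, request, sizeof request);   /* read request */
    int ok = n > 0 &&
             write(conn, reply, sizeof reply - 1) == (ssize_t)(sizeof reply - 1);
    close(conn);                                       /* close client socket */
    return ok ? 0 : -1;
}
```

A server built this way would call make_listener once and then call serve_one in a loop. Because serve_one handles one client at a time and blocks in accept, read, and write, it cannot overlap requests, which is the limitation the following slides address.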

Web server: handling a request

• Accept Client Connection
• Read HTTP Request Header
• Find File
• Send HTTP Response Header
• Read File, Send Data

Some steps may block waiting on disk I/O; others may block waiting on the network.

Want to be able to process requests concurrently.

Multi-programmed server: idealized

[Figure: an incoming request queue is dispatched to a worker loop; idle workers wait in a “magic elastic worker pool”]

Worker loop: handle one request, blocking as necessary. When the request is complete, return to the worker pool.

Magic elastic worker pool: resize the worker pool to match incoming request load, creating/destroying workers as needed.

Idle workers wait in the pool for the next request dispatch.

Workers could be processes or threads.
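A rough sketch of the worker-pool idea with threads as workers, under several assumptions: the pool is fixed-size (not elastic), a "request" is just an int, and handling a request only adds it to a running sum. The names struct pool, worker_loop, and run_pool are hypothetical. The sketch uses a pthreads mutex and condition variable, which the lecture has not introduced yet, only so that idle workers have somewhere to wait for dispatch.

```c
#include <assert.h>
#include <pthread.h>

#define QUEUE_CAP   64
#define MAX_WORKERS 16

/* Hypothetical request queue shared by the dispatcher and the workers. */
struct pool {
    pthread_mutex_t lock;
    pthread_cond_t  wake;        /* signaled on new request or shutdown */
    int requests[QUEUE_CAP];
    int head, tail;              /* requests[head..tail) are pending */
    int shutting_down;
    long handled_sum;            /* stand-in for "work done" */
};

static void *worker_loop(void *arg)
{
    struct pool *p = arg;
    pthread_mutex_lock(&p->lock);
    for (;;) {
        /* Idle workers wait here for the next request dispatch. */
        while (p->head == p->tail && !p->shutting_down)
            pthread_cond_wait(&p->wake, &p->lock);
        if (p->head == p->tail)
            break;               /* queue drained and pool shutting down */
        int req = p->requests[p->head++];
        pthread_mutex_unlock(&p->lock);
        /* A real worker would handle req here, blocking as necessary. */
        pthread_mutex_lock(&p->lock);
        p->handled_sum += req;
    }
    pthread_mutex_unlock(&p->lock);
    return NULL;
}

/* Start nworkers workers, dispatch requests 1..nrequests, shut down,
 * and return the sum of all request values the workers handled. */
long run_pool(int nworkers, int nrequests)
{
    assert(nworkers > 0 && nworkers <= MAX_WORKERS);
    assert(nrequests >= 0 && nrequests <= QUEUE_CAP);

    struct pool p = { .head = 0, .tail = 0, .shutting_down = 0, .handled_sum = 0 };
    pthread_t tids[MAX_WORKERS];
    pthread_mutex_init(&p.lock, NULL);
    pthread_cond_init(&p.wake, NULL);

    for (int i = 0; i < nworkers; i++) {
        int rc = pthread_create(&tids[i], NULL, worker_loop, &p);
        assert(rc == 0);
        (void)rc;
    }
    for (int r = 1; r <= nrequests; r++) {       /* dispatch */
        pthread_mutex_lock(&p.lock);
        p.requests[p.tail++] = r;
        pthread_cond_signal(&p.wake);
        pthread_mutex_unlock(&p.lock);
    }
    pthread_mutex_lock(&p.lock);
    p.shutting_down = 1;
    pthread_cond_broadcast(&p.wake);
    pthread_mutex_unlock(&p.lock);

    for (int i = 0; i < nworkers; i++)
        pthread_join(tids[i], NULL);
    pthread_mutex_destroy(&p.lock);
    pthread_cond_destroy(&p.wake);
    return p.handled_sum;
}
```

Which worker handles which request is up to the scheduler, but every request is handled exactly once, so run_pool(4, 50) returns 1 + 2 + ... + 50 = 1275 regardless of the interleaving.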

Multi-process server architecture

[Figure: Process 1 through Process N, in separate address spaces. Each process runs the same pipeline: Accept Conn, Read Request, Find File, Send Header, Read File / Send Data.]

Multi-process server architecture

• Each of P processes can execute one request at a time, concurrently with other processes.

• If a process blocks, the other processes may still make progress on other requests.

• Max # requests in service concurrently == P

• The processes may loop and handle multiple requests serially, or can fork a process per request.

– Tradeoffs?

• Examples:

– inetd “internet daemon” for standard /etc/services

– Design pattern for (Web) servers: “prefork” a fixed number of worker processes.

Example: inetd

• Classic Unix systems run an inetd “internet daemon”.

• Inetd receives requests for standard services.

– Standard services and ports listed in /etc/services.

– inetd listens on the ports and accepts connections.

• For each connection, inetd forks a child process.

• Child execs the service configured for the port.

• Child executes the request, then exits.
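The fork-then-exec step can be sketched as follows. This is an illustration of the pattern, not inetd's actual code: spawn_service and reap are hypothetical names, and wiring the connection to the child's stdin and stdout is the conventional arrangement assumed here.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child to serve one accepted connection. The child makes the
 * connection its stdin and stdout, then execs the service program named
 * by argv[0]. Returns the child's pid to the parent, or -1 if fork failed. */
pid_t spawn_service(int conn_fd, char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);
    pid_t pid = fork();
    if (pid == 0) {
        /* Child: wire the connection to stdin and stdout, then exec. */
        if (dup2(conn_fd, STDIN_FILENO) < 0 || dup2(conn_fd, STDOUT_FILENO) < 0)
            _exit(126);
        if (conn_fd > STDERR_FILENO)
            close(conn_fd);
        execvp(argv[0], argv);
        _exit(127);              /* reached only if exec failed */
    }
    return pid;                  /* parent */
}

/* Wait for a child; return its exit status, or -1 if it did not exit normally. */
int reap(pid_t pid)
{
    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

After spawn_service returns, the parent would close its own copy of conn_fd and go back to accepting; the child exits when the service program finishes the request.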

[Apache Modeling Project: http://www.fmc-modeling.org/projects/apache]

Children of init: inetd

New child processes are created to run network services.

They may be created on demand on connect attempts from the network for designated service ports.

Should they run as root?

Prefork

[Apache Modeling Project: http://www.fmc-modeling.org/projects/apache]

In the Apache MPM “prefork” option, only one child polls or accepts at a time: the child at the head of a queue. This avoids the “thundering herd”.

Details, details

“Scoreboard” keeps track of child/worker activity, so parent can manage an elastic worker pool.

Multi-threaded server architecture

[Figure: Thread 1 through Thread N. Each thread runs the same pipeline: Accept Conn, Read Request, Find File, Send Header, Read File / Send Data.]

This structure might have lower cost than the multi-process architecture if threads are “cheaper” than processes.

Server structure, recap

• The server structure discussion motivates threads, and illustrates the need for concurrency management.

– We return later to performance impacts and effective I/O overlap.

• A continuing theme of the class presentation: Unix systems fall short of the idealized model.

– Thundering herd problem when multiple workers wake up and contend for an arriving request: one worker wins and consumes the request, the others go back to sleep – their work was wasted. Recent fix in Linux.

– Separation of poll/select and accept in Unix syscall interface: multiple workers wake up when a socket has new data, but only one can accept the request: thundering herd again, requires an API change to fix it.

– There is no easy way to manage an elastic worker pool.

• Real servers (e.g., Apache/MPM) incorporate lots of complexity to overcome these problems. We skip this topic.

Threads

• We now enter the topic of threads and concurrency control.

– This will be a focus for several lectures.

– We start by introducing more detail on thread management, and the problem of nondeterminism in concurrent execution schedules.

• Server structure discussion motivates threads, but there are other motivations.

– Harnessing parallel computing power in the multicore era

– Managing concurrent I/O streams

– Organizing/structuring processing for user interface (UI)

– Threading and concurrency management are fundamental to OS kernel implementation: processes/threads execute concurrently in the kernel address space for system calls and fault handling. The kernel is a multithreaded program.

• So let’s get to it….

The theater analogy

Running a program is like performing a play. [lpcox]

[Figure: threads, the address space (the context, or stage), and the program (the script)]

A Thread

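[Figure: Thread* t points to a thread object, or thread control block (TCB). The TCB holds the machine state (ucontext_t), name/status etc., and the stack (int stack[StackSize]). The stack is drawn from low to high addresses with the stack top, an unused region, and a “fencepost” value 0xdeadbeef.]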
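A thread object along these lines might be declared as below. This is a sketch under assumptions, not the course's thread library: the field names, the sizes, and the placement of the fencepost at the low end of a downward-growing stack are chosen for illustration, and the stack is declared unsigned (rather than int) so that it can hold 0xdeadbeef without a sign conversion.

```c
#include <assert.h>
#include <string.h>

#define STACK_SIZE 1024               /* in words; arbitrary for this sketch */
#define FENCEPOST  0xdeadbeefu

enum status { READY, RUNNING, BLOCKED, EXITED };

/* Hypothetical thread object (TCB). The stack is assumed to grow downward
 * from stack[STACK_SIZE - 1] toward stack[0], so stack[0] is the fencepost. */
struct thread {
    unsigned *stack_top;              /* saved stack pointer */
    unsigned long machine_state[16];  /* saved registers (placeholder size) */
    const char *name;
    enum status status;
    unsigned stack[STACK_SIZE];
};

void thread_init(struct thread *t, const char *name)
{
    assert(t != NULL);
    memset(t, 0, sizeof *t);
    t->name = name;
    t->status = READY;
    t->stack[0] = FENCEPOST;                /* low end: overflow detector */
    t->stack_top = &t->stack[STACK_SIZE];   /* empty stack: one past the high end */
}

/* Returns 1 if the fencepost was overwritten, i.e. the stack overflowed. */
int thread_stack_overflowed(const struct thread *t)
{
    assert(t != NULL);
    return t->stack[0] != FENCEPOST;
}
```

A thread system could call thread_stack_overflowed at each context switch: if the thread ran off the end of its stack, the fencepost word will usually have been overwritten.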

Example: pthreads

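#include <pthread.h>
#include <stdio.h>

/* Thread body: print the id passed in as the argument, then exit. */
void *PrintHello(void *threadid)
{
    long tid;
    tid = (long)threadid;
    printf("Hello World! It's me, thread #%ld!\n", tid);
    pthread_exit(NULL);
}

pthread_t threads[N];
int rc;
long t = …;   /* long, so the cast to void * and back is lossless */
rc = pthread_create(&threads[t], NULL, PrintHello, (void *)t);
if (rc) error….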

[http://computing.llnl.gov/tutorials/pthreads/]
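For reference, here is a complete variant that also waits for its threads. It is a sketch: run_hello, print_hello, and MAX_THREADS are hypothetical names, and returning tid * tid from each thread is just a way to show a value coming back through pthread_join.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

#define MAX_THREADS 16

/* Thread body: print the thread number, hand back its square. */
static void *print_hello(void *arg)
{
    long tid = (long)arg;
    printf("Hello World! It's me, thread #%ld!\n", tid);
    return (void *)(tid * tid);      /* collected by pthread_join */
}

/* Start n threads, wait for all of them, and return the sum of their
 * results, or -1 if a pthreads call fails. */
long run_hello(int n)
{
    assert(n >= 0 && n <= MAX_THREADS);
    pthread_t threads[MAX_THREADS];
    long sum = 0;

    for (long t = 0; t < n; t++)
        if (pthread_create(&threads[t], NULL, print_hello, (void *)t) != 0)
            return -1;
    for (int t = 0; t < n; t++) {
        void *result;
        if (pthread_join(threads[t], &result) != 0)
            return -1;
        sum += (long)result;
    }
    return sum;
}
```

The order of the printed lines is up to the scheduler and may differ from run to run; the returned sum does not depend on it (run_hello(4) returns 0 + 1 + 4 + 9 = 14).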

Example: Java Threads (1)

class PrimeThread extends Thread {
    long minPrime;
    PrimeThread(long minPrime) {
        this.minPrime = minPrime;
    }
    public void run() {
        // compute primes larger than minPrime
        . . .
    }
}

PrimeThread p = new PrimeThread(143);
p.start();

[http://download.oracle.com/javase/6/docs/api/java/lang/Thread.html]

Example: Java Threads (2)

[http://download.oracle.com/javase/6/docs/api/java/lang/Thread.html]

class PrimeRun implements Runnable {
    long minPrime;
    PrimeRun(long minPrime) {
        this.minPrime = minPrime;
    }
    public void run() {
        // compute primes larger than minPrime
        . . .
    }
}

PrimeRun p = new PrimeRun(143);
new Thread(p).start();

Thread states and transitions

[Figure: thread states running, ready, blocked, and exited. Transitions are labeled sleep (on wait, STOP, read, write, listen, receive, etc.), wakeup, yield, and exit.]

The kernel process/thread scheduler governs these transitions.

Sleep and wakeup are internal primitives. Wakeup adds a thread to the scheduler’s ready pool: a set of threads in the ready state.
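The state diagram can be written down as a small transition function. This is only a model for checking one's understanding: the enum names and next_state are hypothetical, and DISPATCH is the name assumed here for the scheduler's move from ready to running.

```c
#include <assert.h>

enum state { READY, RUNNING, BLOCKED, EXITED };
enum event { DISPATCH, YIELD, SLEEP, WAKEUP, EXIT };

/* Apply an event to a thread in state s. Returns the new state,
 * or -1 if the diagram has no such transition. */
int next_state(enum state s, enum event e)
{
    switch (e) {
    case DISPATCH: return s == READY   ? RUNNING : -1;
    case YIELD:    return s == RUNNING ? READY   : -1;
    case SLEEP:    return s == RUNNING ? BLOCKED : -1;
    case WAKEUP:   return s == BLOCKED ? READY   : -1;
    case EXIT:     return s == RUNNING ? EXITED  : -1;
    }
    return -1;
}

/* Like next_state, but treats an illegal transition as a bug. */
enum state must_step(enum state s, enum event e)
{
    int n = next_state(s, e);
    assert(n >= 0);
    return (enum state)n;
}
```

Note that a blocked thread never goes straight to running: wakeup only puts it in the ready pool, and it runs when the scheduler next dispatches it.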

Two threads sharing a CPU

[Figure: “concept” versus “reality” timelines for two threads sharing a CPU, with a context switch between them]

CPU Scheduling 101

The OS scheduler makes a sequence of “moves”.

– Next move: if a CPU core is idle, pick a ready thread t from the ready pool and dispatch it (run it).

– Scheduler’s choice is “nondeterministic”

– Scheduler’s choice determines interleaving of execution

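[Figure: Wakeup moves blocked threads into the ready pool; GetNextToRun picks a thread from the ready pool and SWITCH() runs it. A running thread leaves the CPU if the timer expires, or on wait/yield/terminate.]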

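Yield() {
    disable;                    /* disable interrupts */
    next = FindNextToRun();
    ReadyToRun(this);           /* put the caller back in the ready pool */
    Switch(this, next);
    enable;                     /* re-enable interrupts */
}

Sleep() {
    disable;                    /* disable interrupts */
    this->status = BLOCKED;     /* caller is not put back in the ready pool */
    next = FindNextToRun();
    Switch(this, next);
    enable;                     /* re-enable interrupts */
}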

A Rough Idea

Issues to resolve:
• What if there are no ready threads?
• How does a thread terminate?
• How does the first thread start?

/*
 * Save context of the calling thread (old), restore registers of
 * the next thread to run (new), and return in context of new.
 */
switch/MIPS (old, new) {
    old->stackTop = SP;
    save RA in old->MachineState[PC];
    save callee registers in old->MachineState

    restore callee registers from new->MachineState
    RA = new->MachineState[PC];
    SP = new->stackTop;

    return (to RA)
}

This example (from the old MIPS ISA) illustrates how context switch saves/restores the user register context for a thread, efficiently and without assigning a value directly into the PC.

Example: Switch()

Notes on the switch/MIPS code:

• Save current stack pointer and caller’s return address in old thread object.

• Caller-saved registers (if needed) are already saved on its stack, and restored automatically on return.

• Switch off of old stack and over to new stack.

• Return to procedure that called switch in new thread.

• RA is the return address register. It contains the address that a procedure return instruction branches to.

What to know about context switch

• The Switch/MIPS example is an illustration for those of you who are interested. It is not required to study it. But you should understand how a thread system would use it (refer to state transition diagram):

• Switch() is a procedure that returns immediately, but it returns onto the stack of new thread, and not in the old thread that called it.

• Switch() is called from internal routines to sleep or yield (or exit).

• Therefore, every thread in the blocked or ready state has a frame for Switch() on top of its stack: it was the last frame pushed on the stack before the thread switched out. (Need per-thread stacks to block.)

• The thread create primitive seeds a Switch() frame manually on the stack of the new thread, since it is too young to have switched before.

• When a thread switches into the running state, it always returns immediately from Switch() back to the internal sleep or yield routine, and from there back on its way to wherever it goes next.

Creating a new thread

Also called “forking” a thread

Idea: create initial state, put on ready queue

1. Allocate, initialize a new TCB
2. Allocate a new stack
3. Make it look like thread was going to call a function
   – PC points to first instruction in function
   – SP points to new stack
   – Stack contains arguments passed to function
4. Add thread to ready queue
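The four steps can be tried out at user level with the POSIX ucontext calls (getcontext, makecontext, swapcontext), which do the register save/restore that Switch() does. This is a sketch under assumptions: all names (struct tcb, thread_create, thread_yield, run_all, run_demo) are hypothetical, the "ready queue" is just an array scanned round-robin, and threads only give up the CPU by calling thread_yield (there is no timer).

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

#define MAX_THREADS 8
#define STACK_BYTES (64 * 1024)

struct tcb {
    ucontext_t ctx;              /* saved machine state: PC, SP, registers */
    char *stack;
    void (*fn)(int);
    int arg;
    int done;
};

static struct tcb threads[MAX_THREADS];
static int nthreads, current;
static ucontext_t scheduler_ctx;
static char trace[64];
static int trace_len;

/* First frame of every new thread: call the thread's function, mark done. */
static void trampoline(void)
{
    struct tcb *t = &threads[current];
    t->fn(t->arg);
    t->done = 1;                 /* returning resumes uc_link: the scheduler */
}

int thread_create(void (*fn)(int), int arg)
{
    assert(nthreads < MAX_THREADS);
    struct tcb *t = &threads[nthreads];      /* 1. allocate, initialize a TCB */
    t->stack = malloc(STACK_BYTES);          /* 2. allocate a new stack */
    if (t->stack == NULL || getcontext(&t->ctx) < 0)
        return -1;
    t->ctx.uc_stack.ss_sp = t->stack;
    t->ctx.uc_stack.ss_size = STACK_BYTES;
    t->ctx.uc_link = &scheduler_ctx;
    t->fn = fn;
    t->arg = arg;
    t->done = 0;
    makecontext(&t->ctx, trampoline, 0);     /* 3. PC and SP set up for a call */
    return nthreads++;                       /* 4. now visible to the scheduler */
}

/* Give up the CPU: save this thread's context, resume the scheduler. */
void thread_yield(void)
{
    swapcontext(&threads[current].ctx, &scheduler_ctx);
}

/* Round-robin over the thread table until every thread has finished. */
void run_all(void)
{
    int live = nthreads;
    while (live > 0) {
        for (current = 0; current < nthreads; current++) {
            if (threads[current].done)
                continue;
            swapcontext(&scheduler_ctx, &threads[current].ctx);
            if (threads[current].done) {
                free(threads[current].stack);
                live--;
            }
        }
    }
}

static void worker(int id)
{
    for (int i = 0; i < 3; i++) {
        trace[trace_len++] = (char)('A' + id);
        thread_yield();
    }
}

/* Create two threads and return the order in which they ran. */
const char *run_demo(void)
{
    nthreads = 0;
    trace_len = 0;
    thread_create(worker, 0);
    thread_create(worker, 1);
    run_all();
    trace[trace_len] = '\0';
    return trace;
}
```

With this fixed round-robin policy run_demo returns "ABABAB". A brand-new thread has never called thread_yield, so makecontext plays the role of seeding its first frame: the first switch into the thread starts it in trampoline.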

Thread control block

[Figure: CPU, address space, and ready queue. TCB1, TCB2, and TCB3 each hold a PC, SP, and registers. The address space is drawn with code and a stack for each thread. Thread 1 is running, with its PC, SP, and registers in the CPU.]

Thread control block

[Figure: CPU, address space, and ready queue. The address space holds one copy of the code and three stacks. Thread 1 is running, with its PC, SP, and registers in the CPU. TCB2 and TCB3 each hold a PC, SP, and registers.]

Kernel threads (“native”)

[Figure: several threads, each with its own PC and SP, in user mode; the scheduler is in kernel mode]

User-level threads (“green”)

[Figure: several threads, each with its own PC and SP, together with a thread scheduler in user mode; a separate scheduler (“Sched”) with its own PC and SP is in kernel mode]

[Photos: Andrew Birrell and Bob Taylor]

Concurrency: An Example

int counters[N];
int total;

/*
 * Increment a counter by a specified value, and keep a running sum.
 */
void
TouchCount(int tid, int value)
{
    counters[tid] += value;
    total += value;
}

Reading Between the Lines of C

/*
  ; counters and total are global data
  ; tid and value are local data
  counters[tid] += value;
  total += value;
*/
load  counters, R1    ; load counters base
load  8(SP), R2       ; load tid index
shl   R2, #2, R2      ; index = index * sizeof(int)
add   R1, R2, R1      ; compute index to array
load  (R1), R2        ; load counters[tid]

load  4(SP), R3       ; load value
add   R2, R3, R2      ; counters[tid] += value
store R2, (R1)        ; store back to counters[tid]

load  total, R2       ; load total
add   R2, R3, R2      ; total += value
store R2, total       ; store total

Reading Between the Lines of C


[Figure: two threads each execute load, add, store on total, one sequence after the other]

Two executions of this code, so two values are added to total.

Interleaving matters


[Figure: the load, add, store sequences of two threads interleaved in time]

In this schedule, only one value is added to total: last writer wins. The scheduler made a legal move that broke this program.
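To see the lost update without relying on luck, the three instructions can be simulated and run under a chosen schedule. This is a model, not real concurrent code: struct cpu, run_schedule, and the schedule strings are assumptions for illustration. Each letter in the schedule names the thread (A or B) that executes its next instruction.

```c
#include <assert.h>

/* One thread's private register, plus the value it wants to add. */
struct cpu { int reg; int value; };

static int total;                                /* the shared variable */

static void do_load(struct cpu *c)  { c->reg = total; }       /* load total, R2 */
static void do_add(struct cpu *c)   { c->reg += c->value; }   /* add R2, R3, R2 */
static void do_store(struct cpu *c) { total = c->reg; }       /* store R2, total */

/* Run one schedule, e.g. "AAABBB", and return the final value of total.
 * Each thread must appear at most three times (load, add, store). */
int run_schedule(const char *schedule, int value_a, int value_b)
{
    struct cpu cpus[2] = { { 0, value_a }, { 0, value_b } };
    int step[2] = { 0, 0 };

    total = 0;
    for (const char *p = schedule; *p != '\0'; p++) {
        int who = *p - 'A';
        assert(who == 0 || who == 1);
        assert(step[who] < 3);
        switch (step[who]++) {
        case 0: do_load(&cpus[who]);  break;
        case 1: do_add(&cpus[who]);   break;
        case 2: do_store(&cpus[who]); break;
        }
    }
    return total;
}
```

With values 5 and 7, the serial schedule "AAABBB" leaves total at 12. The schedule "AABBAB" leaves it at 7: both threads loaded 0 before either stored, so B's store overwrites A's.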

Non-determinism and ordering

[Figure: events of Thread A, Thread B, and Thread C along a time axis, merged into a global ordering]

Why do we care about the global ordering?
– Might have dependencies between events
– Different orderings can produce different results

Why is this ordering unpredictable?
– Can’t predict how fast processors will run

Non-determinism example

y=10;
Thread A: x = y+1;
Thread B: y = y*2;

Possible results?
– A goes first: x = 11 and y = 20
– B goes first: y = 20 and x = 21

What is shared between threads? Variable y
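The two orderings can be worked through mechanically. A small sketch, assuming each thread's statement runs as one indivisible step; run_order and struct result are hypothetical names.

```c
#include <assert.h>

struct result { int x, y; };

/* Run Thread A's statement and Thread B's statement in the given order,
 * starting from y = 10, and report the final x and y. */
struct result run_order(int a_first)
{
    assert(a_first == 0 || a_first == 1);
    int x = 0, y = 10;
    if (a_first) {
        x = y + 1;      /* Thread A sees y = 10 */
        y = y * 2;      /* Thread B */
    } else {
        y = y * 2;      /* Thread B */
        x = y + 1;      /* Thread A sees y = 20 */
    }
    struct result r = { x, y };
    return r;
}
```

Both orders end with y = 20; only x differs (11 or 21), because x is the only value that depends on when A reads the shared variable y.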

Another example

Two threads (A and B)
– A tries to increment i
– B tries to decrement i

Thread A:
i = 0;
while (i < 10){
    i++;
}
print “A done.”

Thread B:
i = 0;
while (i > -10){
    i--;
}
print “B done.”

Example continued

Who wins? Does someone have to win?

Thread A:
i = 0;
while (i < 10){
    i++;
}
print “A done.”

Thread B:
i = 0;
while (i > -10){
    i--;
}
print “B done.”

Debugging non-determinism

Requires worst-case reasoning
– Eliminate all ways for program to break

Debugging is hard
– Can’t test all possible interleavings
– Bugs may only happen sometimes

Heisenbug
– Re-running program may make the bug disappear
– Doesn’t mean it isn’t still there!
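For a very small program the interleavings can be enumerated by brute force, which gives a feel for why testing alone is not enough. The sketch below models two threads that each run load, add, store on a shared total, and tries every schedule. The names run_mask, enumerate_schedules, and struct tally are hypothetical, and the model assumes each of the three instructions is indivisible.

```c
#include <assert.h>

/* Simulate one schedule of six steps. Bit i of mask set means step i is
 * executed by thread A, clear means thread B. Returns the final total. */
static int run_mask(unsigned mask, int va, int vb)
{
    int total = 0;
    int reg[2] = { 0, 0 }, step[2] = { 0, 0 };
    int value[2] = { va, vb };

    for (int i = 0; i < 6; i++) {
        int who = ((mask >> i) & 1u) ? 0 : 1;
        switch (step[who]++) {
        case 0: reg[who] = total;       break;   /* load  */
        case 1: reg[who] += value[who]; break;   /* add   */
        case 2: total = reg[who];       break;   /* store */
        }
    }
    return total;
}

/* Number of set bits among the low six bits of m. */
static int popcount6(unsigned m)
{
    int n = 0;
    for (int i = 0; i < 6; i++)
        n += (int)((m >> i) & 1u);
    return n;
}

struct tally { int schedules; int correct; };

/* Try every way to interleave A's three steps with B's three steps and
 * count the schedules in which both values end up added to total. */
struct tally enumerate_schedules(int va, int vb)
{
    struct tally t = { 0, 0 };
    for (unsigned mask = 0; mask < 64; mask++) {
        if (popcount6(mask) != 3)
            continue;                 /* each thread runs exactly 3 steps */
        t.schedules++;
        if (run_mask(mask, va, vb) == va + vb)
            t.correct++;
    }
    assert(t.schedules == 20);        /* 6 choose 3 */
    return t;
}
```

With values 5 and 7 there are 20 schedules and only 2 of them (the two serial ones) add both values. Here exhaustive search is easy, but the number of schedules grows combinatorially with the number of threads and instructions, so real programs need reasoning about all cases rather than testing each one.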
