31
Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley Presenter: Olusanya Soyannwo

Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Embed Size (px)

Citation preview

Page 1: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Capriccio: Scalable Threads for Internet Services

Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer

University of California at Berkeley

Presenter: Olusanya Soyannwo

Page 2: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Outline

Motivation Background Goals Approach Experiments Results Related work Conclusion & Future work

EECS Advanced Operating Systems Northwestern University

Page 3: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Motivation

Increasing scalability demands for Internet services

Hardware improvements are limited by existing software

Current implementations are event based

EECS Advanced Operating Systems Northwestern University

Page 4: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Background : Event Based Systems - Drawbacks

Events systems hide the control flowDifficult to understand and debug

Programmers need to match related events

Burdens programmers

EECS Advanced Operating Systems Northwestern University

Page 5: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Goals: Capriccio

Support for existing thread API

Scalability to hundreds of thousands of threads

Automate application-specific customization

EECS Advanced Operating Systems Northwestern University

Page 6: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Approach: Capriccio

Thread packageCooperative scheduling

Linked stacksAddress the problem of stack

allocation for large numbers of threadsCombination of compile-time and run-

time analysis Resource-aware scheduler

EECS Advanced Operating Systems Northwestern University

Page 7: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Approach: User Level Thread – The Choice

POSIX API(-)Complex preemption

(-)Bad interaction with Kernel scheduler

Performance Ease thread synchronization overhead No kernel crossing for preemptive threading More efficient memory management at user level

Flexibility Decoupling user and kernel threads allows faster innovation Can use new kernel thread features without changing application

code Scheduler tailored for applications

EECS Advanced Operating Systems Northwestern University

Page 8: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Approach: User Level Thread – Disadvantages

Additional OverheadReplacing blocking calls with non-

blocking calls

Multiple CPU synchronization

EECS Advanced Operating Systems Northwestern University

Page 9: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Approach: User Level Thread – Implementation Context Switches

Built on top of Edgar Toernig’s coroutine library Fast context switches when threads voluntarily yield

I/O Capriccio intercepts blocking I/O calls Uses epoll for asynchronous I/O

Scheduling Very much like an event-driven application Events are hidden from programmers

Synchronization Supports cooperative threading on single-CPU machines

• Requires only Boolean checks

EECS Advanced Operating Systems Northwestern University

Page 10: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Approach: Linked Stack

The problem: fixed stacksOverflow vs. wasted spaceLimits thread numbers

The solution: linked stacksAllocate space as neededCompiler analysis

• Add runtime checkpoints • Guarantee enough space until next check

Fixed Stacks

Linked Stack

EECS Advanced Operating Systems Northwestern University

Page 11: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Approach: Linked Stack Parameters

MaxPathMinChunk

Steps Break cycles Trace back

Special Cases Function pointers External calls

5

4

2

6

3

3

2

3

MaxPath = 8

EECS Advanced Operating Systems Northwestern University

Page 12: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Approach: Linked Stack Parameters

MaxPathMinChunk

Steps Break cycles Trace back

Special Cases Function pointers External calls

5

4

2

6

3

3

2

3

MaxPath = 8

EECS Advanced Operating Systems Northwestern University

Page 13: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Approach: Linked Stack Parameters

MaxPathMinChunk

Steps Break cycles Trace back

Special Cases Function pointers External calls

5

4

2

3

3

2

3

MaxPath = 8

6

EECS Advanced Operating Systems Northwestern University

Page 14: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Approach: Linked Stack Parameters

MaxPathMinChunk

Steps Break cycles Trace back

Special Cases Function pointers External calls

5

3

MaxPath = 8

6

3

2

2

4

3

EECS Advanced Operating Systems Northwestern University

Page 15: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Approach: Linked Stack Parameters

MaxPathMinChunk

Steps Break cycles Trace back

Special Cases Function pointers External calls MaxPath = 8

6

3

2

2

4

3

3

3

EECS Advanced Operating Systems Northwestern University

Page 16: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Approach: Scheduling

Advantages of event-based schedulingTailored for applicationsWith event handlers

Events provide two important pieces of information for schedulingWhether a process is close to

completionWhether a system is overloaded

EECS Advanced Operating Systems Northwestern University

Page 17: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Approach: Scheduling -The Blocking Graph

Close

Write

WriteReadSleep

ThreadcreateMain

Thread-based View applications as

sequence of stages, separated by blocking calls

Analogous to event-based scheduler

EECS Advanced Operating Systems Northwestern University

Page 18: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Approach: Resource-aware Scheduling Track resources used along BG edges

Memory, file descriptors, CPU Predict future from the past Algorithm

• Increase use when underutilized• Decrease use near saturation

Advantages Operate near the knee w/o thrashing Automatic admission control

EECS Advanced Operating Systems Northwestern University

Page 19: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Experiment: Threading Microbenchmarks

SMP, two 2.4 GHz Xeon processors 1 GB memory two 10 K RPM SCSI Ultra II hard

drives Linux 2.5.70 Compared Capriccio, LinuxThreads,

and Native POSIX Threads for Linux

EECS Advanced Operating Systems Northwestern University

Page 20: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Experiment: Thread Scalability

Producer-consumer microbenchmark LinuxThreads begin to degrade after

20 threads NPTL degrades after 100 Capriccio scales to 32K producers

and consumers (64K threads total)

EECS Advanced Operating Systems Northwestern University

Page 21: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Results: Thread Primitive - Latency

Capriccio LinuxThreads NPTL

Thread creation

21.5 21.5 17.7

Thread context switch

0.24 0.71 0.65

Uncontended mutex lock

0.04 0.14 0.15

EECS Advanced Operating Systems Northwestern University

Page 22: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Results: Thread Scalability

EECS Advanced Operating Systems Northwestern University

Page 23: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Results: I/O performance

Network performanceToken passing among pipesSimulates the effect of slow client links10% overhead compared to epollTwice as fast as both LinuxThreads

and NPTL when more than 1000 threads

Disk I/O comparable to kernel threads

EECS Advanced Operating Systems Northwestern University

Page 24: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Results: Runtime Overhead

Tested Apache 2.0.44 Stack linking

73% slowdown for null call 3-4% overall

Resource statistics 2% (on all the time) 0.1% (with sampling)

Stack traces 8% overhead

EECS Advanced Operating Systems Northwestern University

Page 25: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Results: Web Server Performance

EECS Advanced Operating Systems Northwestern University

Page 26: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Related Work

Programming Model of high concurrency Event based models are a result of poor thread implementations

User-Level Threads Capriccio is unique

Kernel Threads NPTL

Application Specific Optimization SPIN & Exokernel Burden on programmers Portability

Asynchronous I/O Stack Management

Using heap requires a garbage collector (ML of NJ)

EECS Advanced Operating Systems Northwestern University

Page 27: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Related Work (cont’d)

Resource Aware SchedulingSeveral similar to capriccio

Page 28: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Future Work

ThreadingMulti-CPU supportKernel interface

(enabled) Compile-time techniquesVariations on linked stacksStatic blocking graph

SchedulingMore sophisticated prediction

EECS Advanced Operating Systems Northwestern University

Page 29: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

Conclusion

Capriccio simplifies high concurrency Scalable & high performanceControl over concurrency model

• Stack safety• Resource-aware scheduling• Enables compiler support, invariants

Issues Additional burden to programmer Resource controlled sched.? What

hysteresis?

EECS Advanced Operating Systems Northwestern University

Page 30: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

OTHER GRAPHS

Page 31: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley

OTHER GRAPHS