Computer Science
Cataclysm: Policing Extreme Overloads in Internet Applications
Bhuvan Urgaonkar and Prashant Shenoy
University of Massachusetts
Motivation
Internet applications are used in a variety of domains
  • Online banking, online brokerage, online music stores, e-commerce
Internet usage continues to grow rapidly
  • Broadband deployment is accelerating
Outages of Internet applications are becoming more common
  • “Site not responding”, “connection timed out”
Internet Application Outages
Holiday Shopping Season 2000:
  • Down for 30 minutes
  • Average download time ~ 260 sec
  • Periodic outages over 4 days
9/11: site inaccessible for brief periods
Cause: Too many users leading to overload
Internet Data Centers
Internet applications run on data centers
  • Server farms
  • Provide computational and storage resources
Applications share data center resources
Problem: How can the platform handle extreme overloads seen by applications?
Handling Extreme Overloads
Existing work is based on three approaches
  • Request policing [Kanodia00, Li00, Verma03, Welsh03, …]
  • Dynamic capacity provisioning [Chase01, Ranjan04]
  • Degrade performance of admitted requests [Abdelzaher99]
Shortcomings of existing work:
  • Does not attempt to integrate these three approaches
  • Does not address scalability of the policer!
    • The policer itself may become the bottleneck during overloads
Our Contribution: Cataclysm
Comprehensive approach
  • Novel policer that can scale during overloads
  • Dynamic provisioning for both application and policer
  • SLA-based performance adaptation
Implementation and evaluation on a Linux cluster
Focus of this talk: design of the policer
Talk Outline
  • Motivation
  • Internet data center model
  • Request policing
  • Cataclysm Server Platform
  • Experimental results
  • Summary
Data Center Model
Dedicated hosting: each application runs on a subset of servers in the data center
  • Subsets are mutually exclusive: no server sharing
  • Data center hosts multiple applications
Free server pool: unused servers
[Figure: example hosted applications: retail Web site, streaming]
Internet Application Model
Internet applications are replicated on multiple servers
  • E.g., clustered HTTP
Each application employs a sentry
  • Load balancing and request policing
One or more request classes
Service-level agreement
  • Specifies a guaranteed request admission rate per class
  • Specifies allowed degradation in response time with arrival rate
[Figure: requests, load-balancing sentry, HTTP servers, dropped requests]
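One way to picture the per-class SLA described above is as a small record holding the guaranteed admission rate plus a step function from arrival rate to response-time target. This is only a sketch: the name `ClassSLA`, the step-function shape, and every number are assumptions made for illustration, not values from the talk.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassSLA:
    """Hypothetical SLA record for one request class."""
    name: str
    guaranteed_rate: float  # req/s that must always be admitted
    # (arrival-rate threshold in req/s, response-time target in ms),
    # sorted by threshold; the first threshold should be 0.
    rt_steps: tuple

    def response_time_target(self, arrival_rate):
        """Return the target that applies at the given arrival rate."""
        thresholds = [t for t, _ in self.rt_steps]
        i = bisect_right(thresholds, arrival_rate) - 1
        return self.rt_steps[max(i, 0)][1]


# Illustrative numbers only.
gold = ClassSLA("gold", 100.0, ((0, 1000), (500, 2000), (1000, 3000)))
```

With these made-up steps, the response-time target relaxes from 1000 ms to 3000 ms as the arrival rate climbs, which is one reading of "allowed degradation in response time with arrival rate".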
Policer: Design Goals
Class-based differentiation
  • Each class should sustain its guaranteed admission rate
Revenue maximization
  • Challenging due to online nature of the problem
    • An admitted request may cause a more important request arriving later to be dropped
  • Approach: Preferential admission to higher class requests
Scalability
  • The policer should remain operational even under extremely high arrival rates
Overview of Policer Design
Cataclysm policer has three components
  • Request classifier and per-class leaky buckets
  • Class-specific queues
  • Admission control
[Figure: classifier -> per-class leaky buckets (gold, silver, bronze) -> class-specific queues with delays d_gold, d_silver, d_bronze -> admission control -> admitted or dropped]
Class-based Differentiation
Each incoming request undergoes classification
Per-class leaky buckets used to ensure that rates guaranteed in SLA are admitted
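A per-class leaky bucket can be sketched as a simple meter: the bucket drains at the class's guaranteed rate, and a request conforms if adding it does not overflow the bucket. The class name `LeakyBucket`, the `capacity` parameter, and the rates below are assumptions for illustration. The slides do not say what happens to requests that exceed the guaranteed rate, so this sketch only reports conformance.

```python
class LeakyBucket:
    """Leaky bucket used as a meter for one request class."""

    def __init__(self, rate, capacity):
        self.rate = rate          # leak rate in req/s (the guaranteed rate)
        self.capacity = capacity  # burst tolerance in requests (assumed)
        self.level = 0.0          # current bucket fill
        self.last = 0.0           # time of the previous update, in seconds

    def conforms(self, now):
        """Return True if a request arriving at time `now` is within the rate."""
        elapsed = now - self.last
        self.level = max(0.0, self.level - elapsed * self.rate)
        self.last = now
        if self.level + 1.0 <= self.capacity:
            self.level += 1.0
            return True
        return False


# Hypothetical guaranteed rates, one bucket per class.
buckets = {
    "gold": LeakyBucket(rate=100.0, capacity=10.0),
    "silver": LeakyBucket(rate=50.0, capacity=10.0),
    "bronze": LeakyBucket(rate=10.0, capacity=10.0),
}


def within_guarantee(request_class, now):
    """Check an already classified request against its class's bucket."""
    return buckets[request_class].conforms(now)
```

Timestamps are passed in explicitly so the behaviour is easy to reason about; a live sentry would presumably read a monotonic clock instead.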
Revenue Maximization
Idea: Add different delays in processing of requests of different classes
  • More important requests processed more frequently
Methodology to compute delay values in an online manner
  • Bounds probability of a request denying admission to a more important request
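The effect of per-class delays can be illustrated with a small schedule generator: each class queue is examined once every d_class time units, so a class with a smaller delay is examined more often. The delay values below are fixed, made-up numbers; the online method for computing them is not given on the slide and is not reproduced here. The function name `processing_schedule` is likewise hypothetical.

```python
import heapq


def processing_schedule(delays, horizon):
    """Return (time, class) pairs in the order class queues are examined.

    delays maps class name -> delay between successive examinations.
    Ties at the same instant go to the class with the smaller delay.
    """
    heap = [(d, d, name) for name, d in delays.items()]
    heapq.heapify(heap)
    schedule = []
    while heap and heap[0][0] <= horizon:
        t, d, name = heapq.heappop(heap)
        schedule.append((t, name))
        heapq.heappush(heap, (t + d, d, name))
    return schedule


# Illustrative delays: gold is examined four times as often as bronze.
schedule = processing_schedule({"gold": 1, "silver": 2, "bronze": 4}, horizon=4)
```

Over this horizon the gold queue is examined at every time unit, silver at every second one, and bronze once, so a gold request waits less before reaching the admission control test.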
Admission Control
Goal: Ensure that an admitted request meets its response time target
Measurement-based admission control algorithm
  • Use information about current load on servers and estimated size of new request to make decision
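A minimal sketch of a measurement-based test, under a deliberately simple assumption: the predicted response time of a new request is the pending work on the least-loaded server plus the request's own estimated service time. The actual Cataclysm test is not spelled out on the slide; the function name `admit` and the millisecond figures are hypothetical.

```python
def admit(server_backlog_ms, est_service_ms, target_ms):
    """Admit a request if its predicted response time meets the target.

    server_backlog_ms: list of measured pending work per server, in ms.
    Returns the index of the chosen server, or None if the request is dropped.
    """
    # Pick the least-loaded server according to the current measurements.
    idx = min(range(len(server_backlog_ms)), key=server_backlog_ms.__getitem__)
    predicted = server_backlog_ms[idx] + est_service_ms
    if predicted <= target_ms:
        server_backlog_ms[idx] += est_service_ms  # account for the new work
        return idx
    return None


backlog = [300.0, 100.0]
first = admit(backlog, est_service_ms=50.0, target_ms=200.0)    # fits on server 1
second = admit(backlog, est_service_ms=100.0, target_ms=200.0)  # would miss the target
```

The first request is placed on server 1 (predicted 150 ms); the second would reach 250 ms on the best server and is therefore rejected.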
Scalability of Admission Control
Idea #1: Reduce the per-request admission control cost
  • Admission control on every request may be expensive
  • Bursty arrivals during overloads => batches get formed
  • Delays for class-based differentiation => batches get formed
  • Admission control test that operates on batches instead of requests
Idea #2: Sacrifice accuracy for lower computational overhead
  • When batch-based processing becomes prohibitive
  • Threshold-based scheme
    • E.g., Admit all Gold requests, drop all Silver and Bronze requests
    • Thresholds chosen based on observed arrival rates and service times
    + Extremely efficient
    - Wrong threshold => bad response times or fewer requests admitted
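Both ideas can be sketched in a few lines. `admit_batch` runs one test per batch and returns how many requests of the batch fit under the response-time target, and `choose_threshold` picks which classes to admit outright from observed arrival rates and service times. Both function names, the backlog model, and all numbers are assumptions for illustration rather than the algorithm used in Cataclysm.

```python
def admit_batch(batch_size, est_service_ms, backlog_ms, target_ms):
    """One admission test per batch instead of one per request.

    Returns how many of the batch's requests can be admitted so that the
    last admitted one still meets the response-time target.
    """
    if est_service_ms <= 0:
        raise ValueError("est_service_ms must be positive")
    headroom = target_ms - backlog_ms
    if headroom <= 0:
        return 0
    return min(batch_size, int(headroom // est_service_ms))


def choose_threshold(classes, capacity):
    """Pick the classes to admit wholesale; all others are dropped.

    classes: list of (name, arrival_rate in req/s, service_time in s),
             ordered from most to least important.
    capacity: server-seconds of work available per second.
    """
    admitted = []
    load = 0.0
    for name, rate, service in classes:
        load += rate * service
        if load > capacity:
            break
        admitted.append(name)
    return admitted


observed = [("gold", 100, 0.01), ("silver", 200, 0.01), ("bronze", 400, 0.01)]
admit_now = choose_threshold(observed, capacity=2.0)
```

With these made-up measurements only gold fits within capacity, which matches the slide's example of admitting all Gold requests and dropping Silver and Bronze. The threshold decision costs nothing per request, but if the observed rates are stale the cut-off is wrong, which is the accuracy loss noted above.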
Scaling Even Further …
Protocol processing overheads will saturate sentry resources at extremely high arrival rates
  • Indiscriminate dropping of requests will occur
    • Important requests may be turned away without even undergoing the admission control test
    • Loss in revenue!
  • Sentry should still be able to process each arriving request!
Idea: Dynamic capacity provisioning for sentry
  • Pull in an additional sentry if CPU utilization of existing sentries exceeds a threshold (e.g., 90%)
  • Round-robin DNS to load balance among sentries
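The provisioning rule can be sketched as a small pool object: when the mean CPU utilization of the active sentries exceeds the threshold, one server is moved from the free pool into the sentry set, and new connections are spread over the active sentries in rotation. Using the mean (rather than, say, the maximum) is an assumption, as are the class name `SentryPool` and the host names. The `resolve` method only imitates round-robin DNS in memory; it does not talk to a real DNS server.

```python
class SentryPool:
    """Hypothetical model of dynamic sentry provisioning."""

    def __init__(self, active, free, cpu_threshold=0.90):
        self.active = list(active)      # sentries currently serving requests
        self.free = list(free)          # servers available in the free pool
        self.cpu_threshold = cpu_threshold
        self._next = 0                  # rotation counter for resolve()

    def maybe_provision(self, cpu_util):
        """cpu_util maps each active sentry to its utilization in [0, 1].

        Returns the newly added sentry, or None if nothing changed.
        """
        mean_util = sum(cpu_util[s] for s in self.active) / len(self.active)
        if mean_util > self.cpu_threshold and self.free:
            sentry = self.free.pop(0)
            self.active.append(sentry)
            return sentry
        return None

    def resolve(self):
        """Return the next sentry in rotation, as round-robin DNS would."""
        sentry = self.active[self._next % len(self.active)]
        self._next += 1
        return sentry


pool = SentryPool(active=["sentry1"], free=["sentry2"])
unchanged = pool.maybe_provision({"sentry1": 0.50})  # below threshold
added = pool.maybe_provision({"sentry1": 0.95})      # above threshold
rotation = [pool.resolve() for _ in range(3)]
```

After the second measurement the pool holds two sentries and successive lookups alternate between them, so each sentry sees roughly half of the arriving requests.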
Cataclysm Server Platform
Prototype data center
  • 20 Pentium servers
  • Gigabit switches
  • Linux-based platform
Sentry implemented in Layer-7 switch
  • Linux module ktcpvs
Replicated Web server applications using Apache
  • Dynamic content using PHP
Class-based Differentiation
[Figure, two panels: "Arrival rate": arrival rate (req/s) vs. time (sec); "Fraction admitted": fraction admitted vs. time (sec); curves for Gold, Silver, Bronze]
Three classes of requests: Gold, Silver, Bronze
Policer successful in providing preferential admission to important requests
Threshold-based: Higher Scalability
[Figure "Scalability": CPU utilization (%) vs. arrival rate (req/s); curves for Batch, Threshold]
Threshold-based processing allows the policer to handle up to 4 times higher arrival rate
  • Single sentry can handle about 19000 req/s
Threshold-based: Loss of Accuracy
[Figure, two panels: "Admission rate": admission rate (req/s) vs. time (sec); "95th resp time": 95th-percentile response time (msec) vs. time (sec); curves for Gold, Silver, Bronze]
Higher scalability comes at a loss in accuracy of admission control
Occasional violations of response time targets
Sentry Provisioning
[Figure, two panels: "CPU util": CPU utilization (%) vs. time (sec); "Arrival rate": arrival rate (req/s) vs. time (sec), with curves for total arrival and arrival at sentry 1]
Summary
Cataclysm: a comprehensive overload management technique consisting of
  • Request policing
  • Dynamic capacity provisioning
  • SLA-based performance adaptation
Cataclysm achieves the following
  • Class-based differentiation
  • Revenue maximization
  • Ability to scale to extreme overloads
More information: http://lass.cs.umass.edu
Policing and Provisioning
[Figure, Application 1, two panels: 95th-percentile response time (msec) vs. time (sec), with curves for 95th resp time and target resp time; rate (req/s) vs. time (sec), with curves for arrival rate and admission rate]
Policing and Provisioning
[Figure, Application 2, two panels: rate (req/s) vs. time (sec), with curves for arrival rate and admission rate; 95th-percentile response time (msec) vs. time (sec), with curves for 95th resp time and target resp time]