John Chuang 2
http://bits.blogs.nytimes.com/2007/11/26/yahoos-cybermonday-meltdown/index.html
John Chuang 3
Computing Trends
Multi-core CPUs
Data centers
Cloud computing
What are the drivers?
- scalability, availability, cost-effectiveness
[Figure: diagram with labels Client (x2), Service, Server (x3)]
John Chuang 4
Lecture Outline
Performance Metrics
Availability
Queuing theory
- M/M/1 queue
Scalability
- M/M/m queue
John Chuang 5
What is Performance?
Users want fast response time and high availability
Managers want happy users, and many of them, while minimizing cost
What are standard measures of system performance?
John Chuang 6
Performance Metrics
Response time (seconds)
Throughput (MIPS, Mbps, TPS, ...)
Resource utilization (%)
Availability (%)
John Chuang 7
Availability
Availability   Down-time per year   One hour of down-time per:
90%            36 days              9 hours
99%            3.7 days             4.1 days
99.9%          9 hours              41.6 days
99.99%         53 minutes           1.14 years
99.999%        5 minutes            11.41 years
Availability = MTTF / (MTTF + MTTR)
- Mean-time-to-failure (MTTF)
- Mean-time-to-recover (MTTR)
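The availability formula and the down-time figures in the table can be checked with a short Python sketch. The function names and the 365-day year are assumptions made for illustration; they are not from the slides.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a 365-day year, leap years ignored


def availability(mttf, mttr):
    """Availability = MTTF / (MTTF + MTTR); both arguments in the same time unit."""
    return mttf / (mttf + mttr)


def downtime_hours_per_year(avail):
    """Expected hours of down-time in one year at the given availability."""
    return (1.0 - avail) * HOURS_PER_YEAR


def uptime_hours_per_hour_down(avail):
    """Hours of up-time that accompany each hour of down-time."""
    return avail / (1.0 - avail)


if __name__ == "__main__":
    for a in (0.9, 0.99, 0.999, 0.9999, 0.99999):
        print(f"{a:.3%}: {downtime_hours_per_year(a):9.2f} h down per year, "
              f"one hour down per {uptime_hours_per_hour_down(a):9.0f} h up")
```

Reading the last column as up-time per hour of down-time reproduces the table: 99 hours is about 4.1 days, and 999 hours is about 41.6 days.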
John Chuang 8
Response Time
[Figure: M/M/1 Queue (μ = 100); response time (s) vs. utilization]
[Figure: client-server timeline across the network: formulate request, message latency, queuing time, processing time, message latency, interpret response]
Adapted from: David Messerschmitt
John Chuang 9
Queuing Theory
1. Arrival Process
2. Service Time Distribution
3. Number of Servers
4. System Capacity
5. Customer Population
6. Service Discipline
Source: Raj Jain
John Chuang 10
Kendall’s Notation (1953)
A/B/c/k/N/D
- A: arrival process
- B: service time distribution
- c: number of servers
- k: system capacity
- N: population size
- D: service discipline
M: Markov (exponential, memoryless, random, Poisson)
D: deterministic
E: Erlang
H: hyper-exponential
G: general
FCFS: first come first served
FCLS: first come last served
RR: round-robin
etc.
John Chuang 11
Example Systems
M/M/1/∞/∞/FCFS (simplified as M/M/1)
- Markovian (Poisson, memoryless) arrival
- Markovian service time
- 1 server
- Infinite system capacity
- Infinite arrival stream
- First-come-first-serve discipline
Other examples:
- M/M/1/k (finite capacity)
- M/M/m (m servers)
- G/D/1 (arbitrary arrival, deterministic service time)
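The shorthand rule (omitted trailing fields default to infinite capacity, infinite population, and FCFS) can be illustrated with a small parser. This is only a sketch: the `Kendall` class and `parse_kendall` function are hypothetical names, and it accepts only a numeric server count, so a symbolic form such as M/M/m is out of scope.

```python
import math
from dataclasses import dataclass


@dataclass
class Kendall:
    arrival: str                    # A: arrival process
    service: str                    # B: service time distribution
    servers: int                    # c: number of servers
    capacity: float = math.inf      # k: system capacity
    population: float = math.inf    # N: population size
    discipline: str = "FCFS"        # D: service discipline


def _size(token):
    """Interpret a capacity or population field; blank or infinity means unbounded."""
    token = token.strip()
    return math.inf if token in ("", "inf", "\u221e") else int(token)


def parse_kendall(spec):
    """Parse a string such as 'M/M/1' or 'M/M/1/10', filling in the defaults."""
    parts = spec.split("/")
    if not 3 <= len(parts) <= 6:
        raise ValueError(f"expected 3 to 6 fields, got {len(parts)}")
    result = Kendall(parts[0].strip(), parts[1].strip(), int(parts[2]))
    if len(parts) > 3:
        result.capacity = _size(parts[3])
    if len(parts) > 4:
        result.population = _size(parts[4])
    if len(parts) > 5 and parts[5].strip():
        result.discipline = parts[5].strip()
    return result


if __name__ == "__main__":
    print(parse_kendall("M/M/1"))
    print(parse_kendall("M/M/1/10"))
```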
John Chuang 12
M/M/1 Queue
Poisson arrival, with average arrival rate of λ jobs/sec
Poisson service, with average service rate of μ jobs/sec
Single server with infinite queue
System utilization (hopefully < 1): ρ = λ/μ
Average number of jobs in system: N = Σ n·p_n = ρ/(1 - ρ)
System throughput (if ρ < 1): X = λ
Average response time (from Little's Law): R = N/X = 1/(μ - λ)
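The M/M/1 relations above translate directly into code. The following Python sketch is illustrative only; the function name `mm1_metrics` and the dictionary keys are assumptions, not part of the lecture.

```python
def mm1_metrics(lam, mu):
    """Steady-state M/M/1 metrics for arrival rate lam and service rate mu (jobs/sec)."""
    if lam < 0 or mu <= 0:
        raise ValueError("need lam >= 0 and mu > 0")
    rho = lam / mu                      # utilization
    if rho >= 1:
        raise ValueError("unstable queue: utilization must be below 1")
    return {
        "utilization": rho,
        "jobs_in_system": rho / (1 - rho),   # N
        "throughput": lam,                   # X, valid because rho < 1
        "response_time": 1 / (mu - lam),     # R, in seconds
    }


if __name__ == "__main__":
    m = mm1_metrics(lam=40, mu=100)
    # Little's Law: N = X * R should hold for the returned values.
    print(m, m["throughput"] * m["response_time"])
```

Raising an error for ρ ≥ 1 is a design choice: the closed forms only describe a stable queue, so returning a number there would be misleading.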
John Chuang 13
Example: Web Server
Web server receives 40 requests/second
Web server can process 100 requests/second
What is server utilization?
At any given time, how many requests are at server (waiting plus being processed)?
What is the mean total delay at server (waiting plus processing)?
What happens when traffic rate doubles?
John Chuang 14
Example: Web Server
λ = 40 requests/second
μ = 100 requests/second
Utilization = ρ = λ/μ = 40/100 = 40%
# of requests = N = ρ/(1 - ρ) = 0.67
Average time spent at server = R = N/X = 0.67/40 = 17 ms
John Chuang 15
Example: Traffic Doubled
λ = 80 requests/second
μ = 100 requests/second
Utilization = ρ = λ/μ = 80/100 = 80%
# of requests = N = ρ/(1 - ρ) = 4
Average time spent at server = R = N/X = 4/80 = 50 ms (more than doubled!)
John Chuang 16
Approaching Congestion
λ = 99 requests/second
μ = 100 requests/second
Utilization = ρ = λ/μ = 99/100 = 99%
# of requests = N = ρ/(1 - ρ) = 99
Average time spent at server = R = N/X = 99/99 = 1 second!
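The three web-server cases (40, 80, and 99 requests/second against μ = 100) can be reproduced in a few lines. This is a sketch under the M/M/1 assumptions of the lecture; the helper name `response_time_ms` is hypothetical.

```python
def response_time_ms(lam, mu):
    """Average M/M/1 response time in milliseconds, computed as R = N / X."""
    rho = lam / mu
    if rho >= 1:
        raise ValueError("unstable queue: utilization must be below 1")
    n = rho / (1 - rho)       # average number of requests at the server
    return 1000.0 * n / lam   # Little's Law with throughput X = lam


if __name__ == "__main__":
    mu = 100
    for lam in (40, 80, 99):
        print(f"lambda = {lam:3d}/s  utilization = {lam / mu:4.0%}  "
              f"R = {response_time_ms(lam, mu):7.1f} ms")
```

Doubling the load from 40 to 80 requests/second roughly triples the response time, and the last 19 requests/second multiply it by another factor of 20.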
John Chuang 17
Utilization Affects Performance
[Figure: M/M/1 Queue (μ = 100); response time (s) vs. utilization]
John Chuang 18
M/M/1/k Queue (Finite Capacity)
ρ = λ/μ
N = ρ/(1 - ρ) - (k+1)ρ^(k+1)/(1 - ρ^(k+1))
R = N/X = N/λ_eff
- where λ_eff = λ(1 - P_k) = effective arrival rate
- and P_k = ρ^k (1 - ρ)/(1 - ρ^(k+1)) = probability of a full queue
Loss rate = λ - λ_eff
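A minimal Python sketch of the finite-capacity formulas follows. The function name and return keys are assumptions. The ρ = 1 branch is also an addition not on the slide: the closed forms divide by zero there, and the sketch uses the limiting values P_k = 1/(k+1) and N = k/2 instead.

```python
def mm1k_metrics(lam, mu, k):
    """Steady-state M/M/1/k metrics; k is the system capacity, including the job in service."""
    if lam <= 0 or mu <= 0 or k < 1:
        raise ValueError("need lam > 0, mu > 0, k >= 1")
    rho = lam / mu
    if rho == 1:
        # Limiting case: all k+1 states are equally likely.
        p_full = 1.0 / (k + 1)
        n = k / 2.0
    else:
        p_full = rho**k * (1 - rho) / (1 - rho**(k + 1))
        n = rho / (1 - rho) - (k + 1) * rho**(k + 1) / (1 - rho**(k + 1))
    lam_eff = lam * (1 - p_full)        # arrivals that are actually admitted
    return {
        "jobs_in_system": n,
        "p_full": p_full,
        "throughput": lam_eff,
        "response_time": n / lam_eff,   # Little's Law on admitted jobs
        "loss_rate": lam - lam_eff,
    }


if __name__ == "__main__":
    for k in (1, 2, 10, 100):
        m = mm1k_metrics(lam=80, mu=100, k=k)
        print(f"M/M/1/{k}: R = {m['response_time'] * 1000:6.2f} ms, "
              f"X = {m['throughput']:6.2f}/s, loss = {m['loss_rate']:5.2f}/s")
```

A small capacity keeps response time low because jobs never wait long, but it pays for that with rejected requests; as k grows, the results approach the plain M/M/1 values.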
John Chuang 19
M/M/1/k Response Time
[Figure: M/M/1 and M/M/1/k Queues (μ = 100); response time (s) vs. utilization; curves for M/M/1, M/M/1/1, M/M/1/2, M/M/1/10, M/M/1/100]
John Chuang 20
M/M/1/k Throughput
[Figure: Throughput given service rate μ = 100 jobs/sec; throughput (jobs/sec) vs. utilization; curves for M/M/1, M/M/1/1, M/M/1/2, M/M/1/10, M/M/1/100]
John Chuang 21
Lecture Outline
Performance Metrics
Availability
Queuing theory
- M/M/1 queue
Scalability
- M/M/m queue
John Chuang 22
Scalability
The capability of a system to increase total throughput under an increased load when resources (typically hardware) are added
- Cost of additional resource
- Performance degradation under increased load
John Chuang 23
Scalability Example
Original web server: can process μ requests/sec; accepts requests at λ/sec
Now request rate increases to 10λ/sec and web server is swamped (ρ = 10λ/μ)!
Need to add new hardware!
John Chuang 24
Which is better?
Option 1: One big web server that can process 10μ requests/sec
Option 2: Ten web servers, each can process μ requests/sec; each accepts 10% of requests (λ/sec per server)
Option 3: Ten web servers, each can process μ requests/sec; share single queue (load balancer) that accepts requests at 10λ/sec
John Chuang 25
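[Figure: queueing diagrams for the three options: (1) arrivals at 10λ into one queue served by a single 10μ server; (2) ten separate queues, each with arrivals at λ and its own μ server; (3) arrivals at 10λ into one shared queue feeding ten μ servers]
Option 1: M/M/1 queue with big server
Option 2: ten M/M/1 queues
Option 3: M/M/10 queue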
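Options 1 and 2 can already be compared using only the M/M/1 response time R = 1/(μ - λ). The sketch below uses hypothetical rates (μ = 100 and λ = 8 requests/sec, so the total load is 10λ = 80); these numbers are not from the slides.

```python
def mm1_response_time(lam, mu):
    """Average M/M/1 response time in seconds: R = 1 / (mu - lam)."""
    if lam >= mu:
        raise ValueError("unstable queue: need lam < mu")
    return 1.0 / (mu - lam)


def option1_big_server(lam, mu, factor=10):
    """One server that is `factor` times faster and takes all of the traffic."""
    return mm1_response_time(factor * lam, factor * mu)


def option2_split_servers(lam, mu):
    """Ten independent servers; each one sees the original lam and mu."""
    return mm1_response_time(lam, mu)


if __name__ == "__main__":
    lam, mu = 8.0, 100.0   # hypothetical per-server rates
    r1 = option1_big_server(lam, mu)
    r2 = option2_split_servers(lam, mu)
    print(f"Option 1: {r1 * 1000:.2f} ms, Option 2: {r2 * 1000:.2f} ms, "
          f"ratio = {r2 / r1:.1f}")
```

Because 1/(10μ - 10λ) is one tenth of 1/(μ - λ), the single big server responds ten times faster than the ten split servers at the same utilization.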
John Chuang 26
M/M/m Queue (m Servers)
ρ = λ/(mμ)
N = mρ + ρϱ/(1 - ρ)
where [equation image not recovered]
and [equation image not recovered]
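For a numerical comparison of Option 3 against the other two, the sketch below uses the standard textbook (Erlang C) expressions for an M/M/m queue. These closed forms are an assumption brought in from the general queueing literature, since the slide's own equations did not survive. Here `p_queue` is the probability that an arriving job finds all m servers busy, `p0` is the probability of an empty system, and the example rates are hypothetical.

```python
import math


def mmm_metrics(lam, mu, m):
    """Steady-state M/M/m metrics: m servers of rate mu sharing one queue, total arrival rate lam."""
    if lam <= 0 or mu <= 0 or m < 1:
        raise ValueError("need lam > 0, mu > 0, m >= 1")
    rho = lam / (m * mu)                # per-server utilization
    if rho >= 1:
        raise ValueError("unstable queue: utilization must be below 1")
    a = m * rho                         # offered load, lam / mu
    tail = a**m / (math.factorial(m) * (1 - rho))
    p0 = 1.0 / (sum(a**n / math.factorial(n) for n in range(m)) + tail)
    p_queue = tail * p0                 # Erlang C probability of queueing
    n_jobs = m * rho + rho * p_queue / (1 - rho)
    return {
        "utilization": rho,
        "p_queue": p_queue,
        "jobs_in_system": n_jobs,
        "response_time": n_jobs / lam,  # Little's Law
    }


if __name__ == "__main__":
    lam, mu = 8.0, 100.0                          # hypothetical per-server rates
    r1 = 1.0 / (10 * mu - 10 * lam)               # Option 1: one 10x server
    r2 = 1.0 / (mu - lam)                         # Option 2: ten separate M/M/1 queues
    r3 = mmm_metrics(10 * lam, mu, 10)["response_time"]   # Option 3: M/M/10
    for name, r in (("Option 1", r1), ("Option 2", r2), ("Option 3", r3)):
        print(f"{name}: {r * 1000:.3f} ms")
```

With these example rates the ordering is Option 1, then Option 3, then Option 2. The shared queue removes most of the waiting that the split servers suffer, but each job still takes 1/μ to serve, so it cannot match the single fast server.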