41
ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT INSTANCES MICHELE MAZZUCCO* UNIVERSITY OF TARTU * And Marlon Dumas. Thanks to Madeline Gonzalez for the useful discussions.

ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

ACHIEVING PERFORMANCE AND AVAILABILITY

GUARANTEES WITH AMAZON SPOT INSTANCES

MICHELE MAZZUCCO*

UNIVERSITY OF TARTU

* And Marlon Dumas. Thanks to Madeline Gonzalez for the useful discussions.

Page 2: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

INTRODUCTION (1)

•  Cloud computing = pay-as-you go + “unlimited” capacity •  Amazon Elastic Cloud Compute (EC2)

•  Web service providing resizable compute capacity •  Different instance types, e.g., small, large, xlarge servers •  Different purchasing options

1.  On-Demand Instances •  Pay for resources by the hour •  No long-term commitments

2.  Reserved Instances •  Upfront payment for reserving instances •  Discounted usage rate

2

Page 3: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

INTRODUCTION (2)

3.  Spot Instances •  Users bid for unused resources at a “limit price” (the

maximum price the bidder is willing to pay) •  Amazon gathers the bids and determines a clearing

price (“spot price”) based on the bids and the available capacity

•  A bidder gets the required instances if his/her limit price is above the clearing price. In this case, the bidder pays the clearing price (not his/her limit price)

•  The clearing price is updated as new bids arrive. If the clearing price goes above the bidder’s limit price, the bidder’s running spot instances are terminated.

3

Page 4: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

INTRODUCTION (3)

•  In this presentation I will focus on “spot instances” •  They allow IaaS providers to sell spare capacity

(via auctions), thus improving resource utilization •  At the same time they enable IaaS users to

acquire resources at a price lower than on-demand price

•  However, spot instances are terminated any time the “clearing price” goes above the user’s “limit price”

4

Page 5: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

PROBLEM STATEMENT

“Can we use spot instances to run paid web services while achieving

performance and availability guarantees?”!

5

Page 6: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

TALK OUTLINE

6

Bidding strategy

• Decides how much to bid on the spot market

Server allocation

• Decides how many servers to use

Admission control

• Decides how many jobs to admit

Mission accomplished Meet performance

targets

Meet availability goals

Page 7: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

PART 1: SPOT PRICE PREDICTION

•  We must determine the optimal “limit price” that the SaaS provider should bid on the spot market •  The goal is to bid the lowest possible price capable of

achieving the desired level of availability

•  Let’s see how some spot prices look like…

7

Page 8: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

SPOT PRICES (M1.SMALL)

8

0.03

0.04

0.05

0.06

0.07

0.08

0.09

25/12 01/01 08/01 15/01 22/01 29/01 05/02 12/02 19/02 26/02

Pri

ce (

$/h

)

m1.small (Linux/UNIX)

0.029

0.0295

0.03

0.0305

0.031

11/Fri 12/Sat 13/Sun 14/Mon 15/Tue

Pri

ce (

$/h

)

Page 9: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

SPOT PRICES (M1.LARGE)

9

0.1

0.15

0.2

0.25

0.3

0.35

25/12 01/01 08/01 15/01 22/01 29/01 05/02 12/02 19/02 26/02

Pri

ce (

$/h)

m1.large (Linux/UNIX)

0.113

0.116

0.119

0.122

0.125

0.128

11/Fri 12/Sat 13/Sun 14/Mon 15/Tue

Pri

ce (

$/h

)

Page 10: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

SPOT PRICES (M1.XLARGE)

10

0.2

0.3

0.4

0.5

0.6

0.7

0.8

25/12 01/01 08/01 15/01 22/01 29/01 05/02 12/02 19/02 26/02

Pri

ce (

$/h)

m1.xlarge (Linux/UNIX)

0.24

0.27

0.3

0.33

0.36

0.39

11/Fri 12/Sat 13/Sun 14/Mon 15/Tue

Pri

ce (

$/h

)

Page 11: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

1ST ATTEMPT •  Assume that prices follow patterns, e.g., day/night, week

days/week ends, etc. •  Use triple exponential smoothing (“Holt-Winters”) for

predicting future prices

11

St =!ytIt+ (1!!)(St!1 + bt!1)

bt = " (St ! St!1)+ (1!" )bt!1

It = #ytSt+ (1!#)It!L

Ft+m = (St +mbt )It!L+m

Updates the trend

Updates the smoothed value

Updates the seasonal component

m periods ahead’s forecast

Con

stan

ts m

inim

izin

g th

e M

SE

Page 12: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

WIKIPEDIA TRAFFIC

Time [hours]

Arr.

rate

[req

./hou

r]

0 100 200 300 400 500 600 700

2.0e

+07

2.5e

+07

3.0e

+07

3.5e

+07

12

~ 10,000 req./sec at peak

Page 13: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

WIKIPEDIA TRAFFIC: TIME SERIES DECOMPOSITION

13

2.0e+07

3.0e+07

data

−6e+06

−2e+06

2e+06

seasonal

2.3e+07

2.6e+07

2.9e+07

trend

−1e+07

−4e+06

2e+06

0 5 10 15 20 25 30

remainder

time

The max irregular component is ~11% of the data

= da

ta –

(tre

nd +

sea

sona

l)

Page 14: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

PREDICTING WIKIPEDIA TRAFFIC WITH HOLT WINTERS

Predicting Wikipedia traffic with Holt−Winters

Time [days]

Arr.

rate

[req

./hou

r]

0 5 10 15 20 25 30

2.0e

+07

2.5e

+07

3.0e

+07

3.5e

+07 Observed

Predicted

14

Page 15: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

PREDICTING SPOT PRICES WITH HOLT WINTERS (1)

m1.small

Time [~ days]

Pric

e [$

/h]

10 20 30 40

0.03

0.04

0.05

0.06

0.07

0.08

ObservedPredicted

15

Days [5−9]

Time [hours]

Pric

e [$

/h]

0 20 40 60 80 100 120

0.03

00.

035

0.04

00.

045

0.05

0

ObservedPredicted

Page 16: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

PREDICTING SPOT PRICES WITH HOLT WINTERS (2)

m1.large

Time [~ days]

Pric

e [$

/h]

10 20 30 40

0.15

0.20

0.25

0.30

ObservedPredicted

16

Days [5−9]

Time [hours]

Pric

e [$

/h]

0 20 40 60 80 100 120

0.15

0.20

0.25

0.30

ObservedPredicted

Page 17: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

PREDICTING SPOT PRICES WITH HOLT WINTERS (3)

m1.xlarge

Time [~ days]

Pric

e [$

/h]

10 20 30 40

0.3

0.4

0.5

0.6

0.7

0.8 Observed

Predicted

17

Days [5−9]

Time [hours]

Pric

e [$

/h]

0 20 40 60 80 100 120

0.3

0.4

0.5

0.6

0.7

0.8 Observed

Predicted

Page 18: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

XLARGE PRICES: TIME SERIES DECOMPOSITION

m1.xlarge

0.3

0.5

0.7

data

−0.03

−0.01

0.01

0.03

seasonal

0.25

0.35

0.45

trend

−0.2

0.0

0.2

0.4

5 10 15

remainder

time

18

The max irregular component is ~50% of the data

Page 19: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

2ND ATTEMPT (1) •  What about employing the distribution of the relative

errors to correct the prediction?

19

Histogram of errors

Relative error [(observed − predicted)/observed]

Den

sity

−0.2 −0.1 0.0 0.1 0.2

01

23

45

6

m1.large

We are interested in this part

Page 20: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

2ND ATTEMPT (2)

20

0.00 0.05 0.10 0.15 0.20

0.0

0.2

0.4

0.6

0.8

1.0

CDF rel. error (Wikipedia)

Relative error [(observed − predicted)/observed]

Fn(x

)

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●● ●●●●●●●●

●●●●● ●●●●●● ● ●● ●●● ● ● ● ● ● ●● ● ●● ● ● ● ●

All the arrival rates can be estimated with 83% accuracy

Page 21: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

2ND ATTEMPT (3)

21

0.0 0.1 0.2 0.3 0.4 0.5 0.6

0.2

0.4

0.6

0.8

1.0

CDF rel. error. (xlarge)

Relative error [(observed − predicted)/observed]

Fn(x

)

It does not really work, does it?

Page 22: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

ONE MORE THING… Dec. 25, 2010 − Feb. 23, 2011

Time [hours]

Pric

e [$

/h]

0 200 400 600 800 1000

0.03

0.05

0.07

Jun. 22, 2011 − Sept. 23, 2011

Time [hours]

Pric

e [$

/h]

0 100 200 300 400 500

0.03

0.05

0.07

22

•  ... they look quite different, don’t they?

Page 23: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

ACF(l) =(xt ! x )(xt+l ! x )

t=1

N

"

(xt ! x )2

t=1

N

"

3RD ATTEMPT

1.  Employ the autocorrelation function (ACF) to determine the similarity between prices as a function of the time difference between them (lag)

23

Average value

Value at time t

No. of observations

Lag

Page 24: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

2ND ATTEMPT

•  The autocorrelation function (ACF) of the prices confirms the lack of any relationship between prices over the time

2.  Use a normal approximation to model prices •  There is some evidence that prices in similar markets are

approximately normally distributed 24

0 50 100 150

−0.2

0.0

0.2

0.4

0.6

0.8

1.0

LAG (hours)

ACF

m1.small

Page 25: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

2ND ATTEMPT

•  The autocorrelation function (ACF) of the prices confirms the lack of any relationship between prices over the time

2.  Use a normal approximation to model prices •  There is some evidence that prices in similar markets are

approximately normally distributed 25

0 50 100 150

0.0

0.2

0.4

0.6

0.8

1.0

LAG (hours)

ACF

m1.large

Page 26: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

0 50 100 150

0.0

0.2

0.4

0.6

0.8

1.0

LAG (hours)

ACF

2ND ATTEMPT

•  The autocorrelation function (ACF) of the prices confirms the lack of any relationship between prices over the time

2.  Use a normal approximation to model prices •  There is some evidence that prices in similar markets are

approximately normally distributed 26

m1.xlarge

Page 27: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

SPOT PRICE DISTRIBUTION

Data Normal Approx. Minimum 0.029 0.02098 1st Quartile 0.029 0.02877 Median 0.031 0.03020 3rd Quartile 0.031 0.03163 Maximum 0.085 0.04010 Mean (1st moment) 0.0302 0.03025 Variance (2nd moment) 4.503941 × 10-6 4.515323 × 10-6

Skewness (3rd moment) 1.877202 × 10 1.659259 × 10-2 Kurtosis (4th moment) 4.700735 × 102 3.004374

27

m1.small (Linux/Unix), us-east1 region. Dec. 25, 2010 – Feb. 23, 2011

The

dist

ribut

ion

of th

e pr

ices

is m

ore

heav

y ta

iled

(c

onfir

med

als

o by

Q-Q

plo

ts a

nd S

hapi

ro-W

ilk te

st)

Page 28: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

PRICE PREDICTION ALGORITHM ①  Collect price statistics, and compute mean and variance ②  Compute ACF of the prices for lag l, with l being the prediction

horizon ③  If ACF(l) > 0.4,

a)  Predict the future prices using linear regression, y = a + bx, otherwise

b)  Use the quantile function (inverse CDF) of the Normal distribution to predict the future prices

•  The inverse CDF returns the value of x such as P(X ≤ x) = p

④  Use the max value returned by (3) as a “limit price”

⑤  In case of out-of-bid events, increase the bid by 40%

28

Page 29: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

PREDICTION ALGORITHM PERFORMANCE

29 m1.small (Linux/Unix), us-east1 region. Dec. 25, 2010 – Feb. 23, 2011.

The price of “on-demand” instances is 0.085$/h

Prediction (hours)

Target availability

Achieved availability

Avg. big ($/h)

6 90% 99.673% 0.03170 6 99% 99.673% 0.03270 6 99.999% 99.673% 0.03463 12 90% 99.675% 0.03207 12 99% 99.675% 0.03307 12 99.999% 99.675% 0.03499 24 90% 99.679% 0.03253 24 99% 99.679% 0.03350 24 99.999% 99.679% 0.03533

Page 30: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

PART 2: MODEL AND SLA (1)

The SaaS provider 1.  Pays the IaaS for renting computing resources 2.  Charges its customer for executing WS transactions 3.  Pays penalties to its customer for failing to meet performance and availability objectives

30

Page 31: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

MODEL AND SLA (2)…

1.  For each accepted and completed request, the client pays a charge of c $

2.  Performance: if the avg. resp. time, β, over an interval of length t exceeds the threshold q, then the provider pays a penalty of r1 $ for each job executed in the interval

3.  Availability: the provider should pay a penalty of r2 $ for each rejected job

4.  Disaster: the provider is liable to pay a r3 $ for every accepted job that is lost due to resources becoming unavailable

5.  The provider pays r4 $/h to rent each server •  The provider tries to optimize the average net revenue

earned per unit time

31

Page 32: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

… OR

•  r1 = penalty for “performance” •  r2 = penalty for “availability” •  r3 = penalty for “disaster” •  r4 = server rental cost •  γ = rate at which jobs are accepted into the system

32

R = ![c! r1P(" > q)]! r2 (# !! )! r3P(lp < sp )L ! r4n

Prob. that the avg. resp. time exceeds the threshold

Rate at which jobs are rejected

Prob. that spot instances are terminated

Avg. no. of jobs in the system

Page 33: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

POLICIES Resource allocation and admission control

•  Admission control defined by means of a threshold, K

1.  “Threshold” policy

•  The simultaneous optimization of K and n is non trivial •  Treats the system as an M/M/n/K queue •  Uses a “Hill Climbing” algorithm to find the “best” value of

n and K 2.  “Heuristic”: assumes that the response time is dominated

by the service time (true in “large” systems) •  Treats the system as an GI/G/n queue •  K = ∞, e.g., no job is rejected

33

Page 34: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

PERFORMANCE EVALUATION - SETTINGS

b 0.5 sec. Average service time q 1 sec. Performance threshold c 2.951 × 10-5 $ Charge per job

r1 1.5 × c Penalty (performance) r2 2 × c Penalty (availability) r3 3 × c Penalty (disaster) r4 0.085 $/h Rental cost (“on-demand” instances) t 1 min. Interval length (SLA evaluation)

34

Page 35: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

REVENUE AS FUNCTION OF K AND N

35

-100

-80

-60

-40

-20

0

20

40

60

20 40 60 80 100 120 140 160 180 200 220

Rev

enu

e, $

/h

Admission threshold, K

n=23n=27n=30n=31n=35

p(lp < cp) = 0

Page 36: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

REVENUE AS FUNCTION OF K AND N

36

-100

-80

-60

-40

-20

0

20

40

60

20 40 60 80 100 120 140 160 180 200 220

Rev

enu

e, $

/h

Admission threshold, K

n=23n=27n=30n=31n=35

p(lp < cp) = 0.25

Page 37: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

REVENUE AS FUNCTION OF K AND N

37

-100

-80

-60

-40

-20

0

20

40

60

20 40 60 80 100 120 140 160 180 200 220

Rev

enu

e, $

/h

Admission threshold, K

n=23n=27n=30n=31n=35

p(lp < cp) = 0.5

Page 38: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

REVENUE AS FUNCTION OF R1, R2 AND K

38 n = 10, ρ = 10, p(lp < cp) = 0 (e.g., no disasters occur)

-4

-3

-2

-1

0

1

2

10 15 20 25 30 35 40

Rev

enu

e, $

/h

Admission threshold, K

r1=2.5c, r2=0r1=2c, r2=0.5c

r1=1.5c, r2=cr1=c, r2=1.5c

r1=0.5c, r2=2cr1=0, r2=2.5c

The system is saturated •  Have to choose between a

long queue and rejecting many jobs

Page 39: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

REVENUE AS FUNCTION OF THE ARRIVAL RATE

39 ca2=1, cs2=1, ρ = 7.5,…,25, p(lp < cp) = 0.001

0

0.5

1

1.5

2

2.5

3

3.5

4

4.5

15 20 25 30 35 40 45 50

Rev

enue

, $/h

Arrival rate, ! (req./sec.)

On-Demand (Threshold)On-Demand (Heuristic)

Spot (Threshold)Spot (Heuristic)

1.  The revenue increases with the load

2.  The use spot instances makes the system more profitable

3.  The “Heuristic” policy performs almost as good as the “Threshold” algorithm

Each point represents a run lasting 28 days

Page 40: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

CONCLUSIONS

•  I have discussed an approach aiming at maximizing the net revenue earned by a SaaS using spot instances to provide a web service to paying customers 1.  How much to bid for resources on the spot market? 2.  How many servers should be allocated for a given time

period, and how many jobs to accept?

•  The number of running servers, as well as the maximum number of jobs admitted into the system, have a significant effect on the earned revenues •  The optimal queue length is highly dependent on the

availability level (e.g., shorter queue when the likelihood premature termination is high)

40

Page 41: ACHIEVING PERFORMANCE AND AVAILABILITY GUARANTEES WITH AMAZON SPOT …tarmo/tday-torve/mazzucco-slides.pdf · 2011-10-13 · INTRODUCTION (1) • Cloud computing = pay-as-you go +

41