
©2009 HP Confidential template rev. 12.10.09 1

Dynamic Capacity Provisioning of Server Farm

Yuan Chen, Senior Research Scientist, Sustainable Ecosystems Research Group, Hewlett Packard Laboratories


Resource Provisioning in Data Centers

Consume lots of power
• 100 billion kWh in 2011
• $7.4 billion

Have to meet SLAs
• SLA violations lead to revenue loss
• 20% traffic loss for an additional 500 ms delay in Google search

How to allocate servers to meet SLA requirements while minimizing power consumption?


Workload Traces

• SAP: an SAP enterprise application that hosts enterprise applications such as customer relationship management applications for small and medium-sized businesses

• VDR: a high-availability, multi-tier, business-critical HP application serving both external customers and HP users on six continents

• Web 2.0: a popular HP Web service application with more than 85 million registered users in 22 countries (over 10 million users daily)


Variability in the workload demands

[Figure: workload demand for a single day for (a) the SAP trace; (b) the VDR trace; and (c) the Web trace]

Observation: The workload demands typically have significant variability.

Implication: A single size cannot fit all, and will result in either over-provisioning or under-provisioning.


Variability

[Figure: workload demand for two hours for (a) the SAP trace; (b) the VDR trace; and (c) the Web trace]

Observation: Workload demands can change abruptly during short intervals.

Implication: To handle such variations, a provisioning mechanism is required at short time-scales.


Periodicity

Observation: The workloads exhibit prominent daily patterns.

[Figure: time-series and periodogram for (a) the SAP trace; (b) the VDR trace; and (c) the Web trace]
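As an illustration of how a periodogram exposes a daily pattern, here is a minimal sketch using NumPy's FFT. The function name `dominant_period`, the 5-minute sampling interval, and the synthetic sinusoidal demand are assumptions made for this example; they are not taken from the traces in the talk.

```python
import numpy as np

def dominant_period(demand, sample_interval):
    """Return the period (in units of sample_interval) carrying the most spectral power."""
    x = np.asarray(demand, dtype=float)
    x = x - x.mean()                      # remove the mean so bin 0 does not dominate
    power = np.abs(np.fft.rfft(x)) ** 2   # periodogram: power per frequency bin
    freqs = np.fft.rfftfreq(len(x), d=sample_interval)
    k = 1 + int(np.argmax(power[1:]))     # skip bin 0, whose frequency is zero
    return 1.0 / freqs[k]

# Hypothetical data: one week of 5-minute samples with a 24-hour (1440-minute) cycle.
t = np.arange(7 * 288) * 5.0
demand = 100 + 50 * np.sin(2 * np.pi * t / 1440.0)
period_minutes = dominant_period(demand, sample_interval=5.0)
```

On real traces the peak is less clean, so it may be worth inspecting the few strongest bins rather than only the largest one.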


Prior Work

Predictive [Krioukov, …, Culler, Katz '10] [Chen, He, …, Zhao '08] [Bobroff, Kochut, Beaty '07] [Chen, Das, …, Gautam '05] [Castellanos et al. '05]
• Every few hours
• Stable and easy
• Cannot react to changes

Reactive [Leite, Kusic, Mosse '10] [Nathuji, Kansal, Ghaffarkhah '10] [Wang, Chen '08] [Fan, Weber, Barroso '07] [Wood, Shenoy, … '07]
• Every few minutes
• Can react quickly
• Unstable and expensive


Observations

• Demand varies, but there are periodic patterns
• There will be deviations from these patterns
• Provisioning is not free; there are various associated costs and risks
  – Turning servers on can take a significant amount of time and consume a lot of power
  – “Wear and tear”

[Figure panels: SAP trace, VDR trace, Web trace]


Our Approach

Combines predictive and reactive control to allocate resources at multi-time scales

• Identify long-term sustained patterns: a “base” workload

• Predictive provisioning proactively handles the estimated base workload (every few hours)

• Reactive provisioning handles any excess workload (every few minutes)


Hybrid Provisioning

[Architecture diagram. Components: Base Workload Predictor, Predictive Controller, Reactive Controller, Coordinator, Server pool 1, Server pool 2. Labeled flows: historic workload traces, predicted base workload, base provisioning, live workload trace, noise provisioning, actual workload, workload not exceeding base, excess workload.]


Base Workload Prediction

1. Periodicity analysis

• Use Fast Fourier Transform (FFT)

2. Workload prediction

• Autoregressive model

3. Workload discretization
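To make the prediction step concrete, the sketch below fits a generic autoregressive model of order p by ordinary least squares and produces a one-step forecast. The slides do not state the model order, the fitting method, or which lags are used, so `fit_ar`, `predict_next`, and the toy series are illustrative assumptions only.

```python
import numpy as np

def fit_ar(series, p):
    """Least-squares fit of x[t] = c0 + c1*x[t-1] + ... + cp*x[t-p]."""
    x = np.asarray(series, dtype=float)
    # Each row holds an intercept term followed by the p most recent values, newest first.
    rows = [np.concatenate(([1.0], x[t - p:t][::-1])) for t in range(p, len(x))]
    coef, *_ = np.linalg.lstsq(np.array(rows), x[p:], rcond=None)
    return coef

def predict_next(series, coef):
    """One-step-ahead forecast from the fitted coefficients."""
    x = np.asarray(series, dtype=float)
    p = len(coef) - 1
    return float(coef[0] + coef[1:] @ x[-p:][::-1])

# Hypothetical series that follows x[t] = 2 + 0.5 * x[t-1] exactly.
history = [0.0]
for _ in range(9):
    history.append(2 + 0.5 * history[-1])
coef = fit_ar(history, p=1)
forecast = predict_next(history, coef)
```

For a workload with a daily cycle, one plausible variant is to regress each time slot on the same slot from previous days, using the period found in step 1.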


Workload Discretization

– Discretize the demands into consecutive, disjoint time intervals with a single representative demand value in each interval

– Given the demand time-series X on the domain [s, t], a time-series Y on the same domain is a workload characterization of X if [s, t] can be partitioned into n successive disjoint time intervals, {[s, t_1], [t_1, t_2], ..., [t_{n-1}, t]}, such that Y(j) = r_i for all j in the i-th interval, [t_{i-1}, t_i], where t_0 = s and t_n = t.

1. Deviation from actual demand is small

2. Avoid having too many intervals


Optimization problem: minimize (Error + c · #Changes), where Error is the mean-squared error between X and Y and #Changes is the number of changes in Y (n − 1 for n intervals)

• For a given partition, setting r_i to be the mean of the time-series values on that partition minimizes the mean-squared error

• The optimal solution for the domain [t_0, t_n] contains the optimal solution for the domain [t_0, t_{n-1}]

• This optimal substructure allows a dynamic programming solution

• Pick the normalization constant c
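The dynamic programming idea can be sketched as follows, assuming the objective is the mean-squared error plus c times the number of changes (intervals minus one). The function name `discretize`, the O(n²) formulation, and the toy input are assumptions made for illustration; the talk does not give the exact recurrence.

```python
import numpy as np

def discretize(demand, c):
    """Piecewise-constant fit minimizing MSE + c * (number of changes)."""
    x = np.asarray(demand, dtype=float)
    n = len(x)
    # Prefix sums give the squared error of any segment in O(1).
    s1 = np.concatenate(([0.0], np.cumsum(x)))
    s2 = np.concatenate(([0.0], np.cumsum(x * x)))

    def sse(i, j):
        """Squared error of x[i:j] around its own mean."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    penalty = c * n                 # objective scaled by n: SSE + c*n*changes
    best = np.full(n + 1, np.inf)   # best[j]: optimal cost for the prefix x[:j]
    best[0] = -penalty              # the first interval is not a change
    prev = np.zeros(n + 1, dtype=int)
    for j in range(1, n + 1):
        for i in range(j):          # last interval covers x[i:j]
            cost = best[i] + penalty + sse(i, j)
            if cost < best[j]:
                best[j], prev[j] = cost, i

    bounds, j = [], n               # backtrack to recover the intervals
    while j > 0:
        i = int(prev[j])
        bounds.append((i, j))
        j = i
    bounds.reverse()
    y = np.empty(n)
    for i, j in bounds:
        y[i:j] = x[i:j].mean()      # the interval mean minimizes squared error
    return y, bounds

levels, intervals = discretize([1, 1, 1, 5, 5, 5], c=0.1)
```

A larger c yields fewer, longer intervals; with c=100 the same input collapses to a single interval at the overall mean.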


Different Discretization Techniques

[Figure panels: Mean, 90th percentile, Max; Mean/1hr, Mean/3hrs, Mean/6hrs; DP, SAX, K-means]


Dynamic Provisioning


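Predictive Controller

How to go from the predicted base workload to base server provisioning?

$$\text{response time} = \frac{1}{\dfrac{1}{\text{job size}} - \dfrac{\text{arrival rate}(t)}{\text{Num. servers}(t)}}$$

• Response time: given by the SLA
• Job size: derived experimentally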
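The sizing rule can be sketched in code. This assumes the slide's relation reads response time = 1 / (1/job size − arrival rate(t)/Num. servers(t)), which rearranges to Num. servers(t) = arrival rate(t) / (1/job size − 1/response time). The function name `servers_needed` and the numbers in the example are hypothetical.

```python
import math

def servers_needed(arrival_rate, job_size, sla_response_time):
    """Smallest server count whose modeled response time meets the SLA target.

    arrival_rate: requests per second across the farm
    job_size: mean service time per request in seconds (measured experimentally)
    sla_response_time: target mean response time in seconds
    """
    if sla_response_time <= job_size:
        raise ValueError("SLA target must exceed the job size")
    # Rate each server can absorb while still meeting the target.
    per_server_rate = 1.0 / job_size - 1.0 / sla_response_time
    return math.ceil(arrival_rate / per_server_rate)

# Hypothetical numbers: 100 req/s, 0.1 s jobs, 0.5 s target.
base_servers = servers_needed(100, job_size=0.1, sla_response_time=0.5)
```

Here each server can take 10 − 2 = 8 req/s, so 100 req/s needs 12.5 servers, rounded up to 13.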


Reactive Controller

Real-time noise provisioning, where noise = actual − base

$$\text{response time} = \frac{1}{\dfrac{1}{\text{job size}} - \dfrac{\text{arrival rate}(t)}{\text{Num. servers}(t)}}$$

• Simple feedback model: noise(interval(t)) = noise(interval(t−1))
• Interval length = 10 mins
• Can use sophisticated control-theoretic models
• Not the focus of this work
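A minimal sketch of the simple feedback model: the noise expected in the next interval is taken to be the noise observed in the previous one, and it is sized with the same response-time relation, assumed here to be response time = 1 / (1/job size − arrival rate/servers). The function name `reactive_servers`, the clamping of negative noise to zero, and the example rates are assumptions.

```python
import math

def reactive_servers(prev_actual_rate, prev_base_rate, job_size, sla_response_time):
    """Servers for the noise pool in the next interval, from last interval's noise."""
    # noise = actual - base; demand below the base forecast needs no extra servers.
    noise = max(0.0, prev_actual_rate - prev_base_rate)
    per_server_rate = 1.0 / job_size - 1.0 / sla_response_time
    return math.ceil(noise / per_server_rate)

# Hypothetical interval: 120 req/s observed against a 100 req/s base forecast.
extra = reactive_servers(120, 100, job_size=0.1, sla_response_time=0.5)
```

Because the estimate lags by one interval, a sudden spike is only covered from the following interval onward.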


Coordinator

Forward incoming requests to either server farm based on the predicted base demand and the actual demand

– Load Balancing Dispatcher

• Load-balance the incoming requests among all servers of the two server farms

– Priority Dispatcher

• Forward and load-balance the job requests to the base provisioning servers as long as the request rate is below the forecasted base workload request rate

• Requests that exceed the forecasted request rate are forwarded to the reactive provisioning servers

• Isolate the base workload from the noise workload

• Provide stronger performance guarantees for the base workload

• Dispatch the important jobs to the (more robust) base workload server farm
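The priority dispatching rule might look like the sketch below. It counts requests within an interval and sends everything beyond the forecast base quota to the reactive pool. The class name, the round-robin balancing inside each pool, and the per-interval counting are assumptions for illustration; the slides do not specify these details.

```python
from itertools import cycle

class PriorityDispatcher:
    """Route requests to the base pool up to the forecast rate, then to the reactive pool."""

    def __init__(self, base_servers, reactive_servers, base_quota):
        self._base = cycle(base_servers)          # round-robin within each pool
        self._reactive = cycle(reactive_servers)
        self.base_quota = base_quota              # forecast base requests per interval
        self._seen = 0

    def start_interval(self, base_quota):
        """Reset the counter when a new measurement interval begins."""
        self.base_quota = base_quota
        self._seen = 0

    def route(self):
        """Return the server that should handle the next request."""
        self._seen += 1
        pool = self._base if self._seen <= self.base_quota else self._reactive
        return next(pool)

dispatcher = PriorityDispatcher(["b1", "b2"], ["r1"], base_quota=3)
targets = [dispatcher.route() for _ in range(5)]
```

The first three requests stay on the base servers and the remaining two spill over to the reactive server, which keeps the base workload isolated from the noise.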


Put It All Together

[Architecture diagram. Components: Base Workload Predictor, Predictive Controller, Reactive Controller, Coordinator, Server pool 1, Server pool 2. Labeled flows: actual workload, workload not exceeding base, excess workload.]


Trace-driven Simulation

Provision resources for a single-tier Web server farm

Different policies:

• Predictive: 24-hour, 6-hour, 4-hour, and variable-length intervals

• Reactive: 10-minute intervals

• Hybrid: predictive (fixed-length and variable-length) plus reactive

Metrics:

• Percentage of SLA violations

• Power consumption

• Number of provisioning changes
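A rough sketch of how these metrics could be computed for one policy in a trace-driven simulation. It assumes the response-time relation response time = 1 / (1/job size − arrival rate/servers), and it uses total server-intervals as a stand-in for power consumption because the power model is not given in the slides. The function name `evaluate` and all numbers are hypothetical.

```python
def evaluate(arrival_rates, servers, job_size, sla_response_time):
    """Score a provisioning schedule against a demand trace, one entry per interval."""
    service_rate = 1.0 / job_size
    violations = 0
    for rate, k in zip(arrival_rates, servers):
        if rate == 0:
            response = 0.0
        elif k == 0 or rate / k >= service_rate:
            response = float("inf")            # overloaded: the model has no finite value
        else:
            response = 1.0 / (service_rate - rate / k)
        if response > sla_response_time:
            violations += 1
    changes = sum(1 for a, b in zip(servers, servers[1:]) if a != b)
    return {
        "sla_violation_pct": 100.0 * violations / len(servers),
        "server_intervals": sum(servers),      # assumed proxy for power consumption
        "changes": changes,
    }

result = evaluate([70, 120, 100, 40], [10, 10, 13, 13],
                  job_size=0.1, sla_response_time=0.5)
```

In this toy run the second interval is overloaded (120 req/s against 10 servers), giving one violation in four intervals and one provisioning change.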


Results for SAP Trace

[Figure: trace-based analysis results for the SAP trace showing the SLA violations, power consumption and number of provisioning changes]

[Figure: time-series for the demand, SLA violations, power consumption and the number of servers for the SAP trace]


Results for Web 2.0 and World Cup Traces

[Figure: trace-based analysis results for the Web trace showing the SLA violations, power consumption and number of provisioning changes]

[Figure: trace-based analysis results for the World Cup 98 trace showing the SLA violations, power consumption and number of provisioning changes]


Experimental Setup

• 10-server test bed (web server farm)
• Multiple workload traces (SAP, VDR, Web)
• Workload generator (httperf) + load balancer (Apache) + back-end servers
• Provisioning strategies:
  – Predictive (every 1 hr)
  – Reactive (every 10 mins)
  – Hybrid (base provisioning + noise provisioning)


Experiment Results

[Figure panels: Web trace, VDR trace]

• Hybrid reduces response times by as much as 40% compared to Predictive.
• Hybrid provides better response times than Predictive and Reactive, and invokes fewer changes than Reactive.


Conclusion

• Hybrid (Predictive + Reactive) server provisioning
• Good performance, low power consumption, very few changes, across various traces
• Important to have a good “base workload”: use dynamic programming


References

1. Minimizing Data Center SLA Violations and Power Consumption via Hybrid Resource Provisioning. Anshul Gandhi, Yuan Chen, Daniel Gmach, Martin Arlitt, and Manish Marwah. Proceedings of the Second International Green Computing Conference (IGCC 2011), July 2011.

2. Hybrid Resource Provisioning for Minimizing Data Center SLA Violations and Power Consumption. Anshul Gandhi, Yuan Chen, Daniel Gmach, Martin Arlitt, and Manish Marwah. Journal of Sustainable Computing: Informatics and Systems (SUSCOM), 2012. (Extended version of the IGCC paper.)