
Autonomous Resource Provision in Virtual Data Centers

Presented by: Noha Elprince

noha.elprince@uwaterloo.ca

IFIP/IEEE DANMS, 31 May 2013

Static Data Centers vs. Dynamic

[Figure: two resources-vs-time plots (demand vs. capacity), one for a static data center with unused resources and one for a dynamic one. Figure: RAD Lab, UC Berkeley]

Static:
•  Fixed, pre-assigned resources (provisioned for peak)
•  Static environment
•  Manual change of configurations

Dynamic:
•  Cloud elasticity ~ "pay as you go"
•  Virtualized environment
•  Automated, "self-service" change of configurations

Cloud Elasticity problems…

•  Under-provisioning => heavy penalty: lost revenue, lost users

[Figure: three resources-vs-time panels (days 1-3) comparing demand and capacity. Figures: RAD Lab, UC Berkeley]

Cloud Elasticity …

•  Over-provisioning => unused resources (underutilization)

[Figure: resources vs. time with capacity above demand; the gap is unused resources. Figure: RAD Lab, UC Berkeley]

Virtualization (Cloud Foundation)

•  Virtualization allows a computational resource to be partitioned into multiple isolated execution environments (VMs).
•  Turning the machine into a "virtual image" gives a degree of self-immunity from:
   Ø  Hardware breakdowns
   Ø  Running out of resources

Challenge: Service Differentiation

Problem
q  Over- and under-provisioning still occur, because of the difficulty of estimating actual needs under a time-varying and diverse workload.
q  Enabling service differentiation in a virtualized environment.

Methodology

•  Develop and implement an autonomic resource-management controller that:
   Ø  Effectively optimizes resources by predicting current resource needs.
   Ø  Continuously self-tunes resources to accommodate load variations and enforce service differentiation during resource allocation.
•  Test the proposed prototype on real traces.

Motivation

•  Help data centers manage resources effectively.
•  Promote cloud computing (more cloud users => lower cost).
•  Optimize resources (green IT!).

Related Work

v  Approaches to autonomic resource management:
   –  Utility-based self-optimizing approach
   –  Model-based approach built on performance modeling
   –  Machine-learning approach
   –  Fuzzy-logic approach

Proposed Solution Architecture: System Modeling

[Figure: architecture block diagram (output labeled r(t+1)).]

I. System Modeling : Data set

•  Idea: learn from successful jobs (normal termination, fulfilling the client's anticipated performance).
•  Data: a real computing-center trace from Los Alamos National Laboratory (LANL).
•  LANL is a United States Department of Energy (DOE) national laboratory.
•  LANL conducts multidisciplinary research in fields such as national security, space exploration, renewable energy, medicine, nanotechnology and supercomputing.
•  System: 1024-node Connection Machine CM-5 from Thinking Machines.
•  Jobs: 201,387; Duration: 2 years.

I. System Modeling : Data Preprocessing

v  Feature Selection
•  Use stepwise regression to sort the variables and keep the more influential ones in the model.
•  Result: out of 18 features, 5 were selected:
   => run_time, wait_time, Avg_cpu_time, used_mem, status

v  Filter
•  Remove jobs with status = unsuccessful (failed / aborted).
•  Discard records with average_cpu_time_used <= 0 and used_mem <= 0.

v  Data Cleaning
•  Normalize the data to remove noise. (See the preprocessing sketch below.)
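A minimal pandas sketch of these preprocessing steps, assuming the LANL trace has already been loaded into a DataFrame; the column names and the min-max normalization are illustrative choices, since the talk does not specify them.

```python
# Minimal preprocessing sketch (pandas). Column names and the min-max
# normalization are assumptions for illustration, not taken from the talk.
import pandas as pd

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    # Keep only the features retained by stepwise regression.
    features = ["run_time", "wait_time", "avg_cpu_time", "used_mem", "status"]
    df = df[features].copy()

    # Filter: drop unsuccessful jobs (failed / aborted).
    df = df[~df["status"].isin(["failed", "aborted"])]

    # Filter: discard records with non-positive CPU time or memory usage.
    df = df[(df["avg_cpu_time"] > 0) & (df["used_mem"] > 0)]

    # Data cleaning: normalize numeric columns to [0, 1] to reduce scale effects.
    numeric = ["run_time", "wait_time", "avg_cpu_time", "used_mem"]
    df[numeric] = (df[numeric] - df[numeric].min()) / (df[numeric].max() - df[numeric].min())
    return df
```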

I. System Modeling : Statistical Analysis


I. System Modeling : Model I/O

Cascaded classifiers (MISO model). (See the sketch below.)
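The slide only names the model structure, so the following is a hedged illustration of what a cascade of multi-input single-output (MISO) regressors could look like: each stage feeds its prediction forward as an extra input to the next stage. The number of stages, the targets and the wiring are assumptions for illustration, not taken from the talk.

```python
# Illustrative cascade of three MISO regressors (C1, C2, C3): each stage's
# prediction is appended to the feature set of the next stage. The wiring and
# the choice of targets are assumptions for illustration only.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

class CascadedMISO:
    def __init__(self, n_stages=3):
        self.stages = [DecisionTreeRegressor(max_depth=5) for _ in range(n_stages)]

    def fit(self, X, targets):
        # targets: array of shape (n_samples, n_stages), one column per stage.
        X_aug = X
        for model, y in zip(self.stages, targets.T):
            model.fit(X_aug, y)
            # Feed this stage's prediction forward as an extra input feature.
            X_aug = np.column_stack([X_aug, model.predict(X_aug)])
        return self

    def predict(self, X):
        X_aug, outputs = X, []
        for model in self.stages:
            y_hat = model.predict(X_aug)
            outputs.append(y_hat)
            X_aug = np.column_stack([X_aug, y_hat])
        return np.column_stack(outputs)
```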

I. System Modeling : ML approaches

§  Linear Regression
§  Sugeno Fuzzy Inference System (FCM, SUB)
§  Regression Tree (REP-Tree)
§  Model Tree (M5P)
§  Boosting (REP-Tree, M5P)
§  Bagging (REP-Tree, M5P)

Why ML?
-  The non-linear nature of the data.
-  Ability to deal with the complex nature of the data.
-  Detects dependencies between inputs and outputs efficiently.

Bagging vs. Boosting Classifiers

•  Bagging (bootstrap aggregating) uses bootstrap sampling.
•  Trains k classifiers, one on each bootstrap sample.
•  Combines the k learned classifiers by a (weighted) majority vote, using equal weights.

•  Boosting: weak classifiers are combined into a final strong classifier.
•  After a weak learner is added, the data is reweighted:
   Ø  misclassified examples => gain weight
   Ø  correctly classified examples => lose weight
•  Thus future learners focus more on the data that previous weak learners misclassified. (A sketch comparing the two follows.)
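As a concrete contrast, the sketch below trains a bagged and a boosted ensemble on the same base learner. scikit-learn's DecisionTreeRegressor stands in for the Weka REP-Tree/M5P learners used in the talk, and the synthetic data is only there to make the example runnable; it is not the LANL trace.

```python
# Sketch comparing bagging and boosting with the same base learner.
# DecisionTreeRegressor is a stand-in for REP-Tree / M5P (Weka learners).
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import BaggingRegressor, AdaBoostRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.random((2000, 4))                      # stand-in for job features
y = X @ np.array([0.4, 0.3, 0.2, 0.1]) + 0.05 * rng.standard_normal(2000)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

base = DecisionTreeRegressor(max_depth=4)
bagging = BaggingRegressor(base, n_estimators=50, random_state=0).fit(X_tr, y_tr)
boosting = AdaBoostRegressor(base, n_estimators=50, random_state=0).fit(X_tr, y_tr)

for name, model in [("Bagging", bagging), ("Boosting", boosting)]:
    rmse = mean_squared_error(y_te, model.predict(X_te)) ** 0.5
    print(f"{name:8s} RMSE = {rmse:.4f}")
```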

II. Resource Predictor

v  The client requests hosting of a specific type of application with a pre-specified response time.
v  An initial resource estimate is generated.
v  Predictions are issued at the rate at which clients arrive at the data center.

Validation: Performance Measures for Different Prediction Models

Classifier Type      Classifier   RMSE     MAE      RAE       CC
Linear Reg.          C1           0.0024   0.0008   50.33%    0.70
                     C2           0.0023   0.0001   57.29%    0.71
                     C3           0.0026   0.0003   58.15%    0.98
Sugeno FIS (SUB)     C1           0.0021   0.0009   44.89%    0.66
                     C2           0.0012   0.0002   51.06%    0.66
                     C3           0.0011   0.0002   53.93%    0.85
Boosting (M5P)       C1           0.0020   0.0006   34.59%    0.80
                     C2           0.0018   0.0007   39.20%    0.84
                     C3           0.0003   0.0001   10.99%    0.99
Bagging Tree (M5P)   C1           0.0018   0.0005   32.57%    0.84
                     C2           0.0017   0.0007   36.38%    0.84
                     C3           0.0003   0.0001   11.82%    0.99

(A sketch of how these measures are computed follows.)
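For reference, the four measures in the table can be computed as below. This sketch assumes RAE is the relative absolute error (the model's absolute error relative to that of predicting the mean) and CC is the Pearson correlation coefficient, which is how Weka reports regression results; the talk does not spell out the definitions.

```python
# Minimal numpy sketch of the reported measures (RMSE, MAE, RAE, CC),
# assuming Weka-style definitions of RAE and CC.
import numpy as np

def regression_metrics(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    err = y_pred - y_true
    rmse = np.sqrt(np.mean(err ** 2))                                  # root mean squared error
    mae = np.mean(np.abs(err))                                         # mean absolute error
    rae = np.sum(np.abs(err)) / np.sum(np.abs(y_true - y_true.mean())) # relative absolute error
    cc = np.corrcoef(y_true, y_pred)[0, 1]                             # correlation coefficient
    return {"RMSE": rmse, "MAE": mae, "RAE": f"{100 * rae:.2f}%", "CC": cc}
```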

Resource Predictor: Learning Time Comparison

III. Resource Allocator

1.  The Resource Allocator initially allocates resources (based on the prediction model).
2.  It checks the error reported by the tuner.
3.  The tuner calculates the normalized error in resource allocation:

    RespTimeError(k) = (RespTime_ref(k) - RespTime_obs(k)) / RespTime_ref(k)

4.  The allocator takes the feedback from the tuner (ResAdjustment) and sends a command to the VC in the VM with the appropriate decision. (A sketch of this loop follows.)
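A schematic of steps 1-4 as code. `prediction_model`, `tuner` and `send_to_vc` are placeholder components (their interfaces are assumptions); only the normalized response-time error follows the formula given on this slide, and adding the adjustment to the initial allocation is one plausible reading of step 4.

```python
# Schematic of the allocator loop (steps 1-4). The components passed in are
# placeholders; only RespTimeError matches the formula on the slide.
def allocation_step(prediction_model, tuner, send_to_vc,
                    client_request, resp_time_ref, resp_time_obs, vc_resources):
    # 1. Initial allocation from the prediction model.
    allocation = prediction_model.predict(client_request)

    # 2-3. Normalized response-time error, as computed by the tuner.
    resp_time_error = (resp_time_ref - resp_time_obs) / resp_time_ref

    # 4. Feedback from the tuner (ResAdjustment) becomes a command to the VC.
    res_adjustment = tuner.adjust(resp_time_error, client_request)
    send_to_vc(allocation + res_adjustment, vc_resources)
    return allocation, res_adjustment
```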

IV. Resource Tuner : Rule-Based Fuzzy System

Inputs:  RespTimeError, ClientClass, Status
Output:  ResDirection
Controller: ResController (Mamdani)

IV. Resource Tuner: Rule-Based Fuzzy System

[Fuzzy rule table: ResDirection as a function of RespTimeError (Low / Medium / High), Client Class (Gold / Silver / Bronze) and provisioning status (over- / under-provision); entries are speed-up (SUL / SUM / SUH) and step-down (SDL / SDM / SDH) actions.]

-  Total number of rules: 18.
-  The grades of membership of each attribute (high, medium, low) are adjusted by experts in the data center.
-  ResDir:
   •  reflects a percentage of the resource that should be utilized in the VC (ResAdjust = ResDir x ResWt x VCres).
   •  ranges over [-1, +1], with MFs (low, med, high) for:
      Ø  speed up (+ve side)
      Ø  step down (-ve side)
   (A simplified tuner sketch follows.)
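A simplified sketch of the tuner. Triangular membership functions and weighted-average defuzzification stand in for the full 18-rule Mamdani controller; the breakpoints, consequent values and the small rule excerpt are illustrative, with only the Gold/Silver/Bronze "medium error, under-provision" rows mirroring the examples shown later in the deck.

```python
# Simplified sketch of the rule-based tuner. Not the actual 18-rule Mamdani
# controller: breakpoints, consequents and most rules are illustrative.
def tri(x, a, b, c):
    """Triangular membership function with corners a, b, c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def resp_time_error_mfs(e):
    # RespTimeError in [0, 1]: low / medium / high (illustrative breakpoints).
    return {"low": tri(e, -0.01, 0.0, 0.4), "medium": tri(e, 0.2, 0.5, 0.8),
            "high": tri(e, 0.6, 1.0, 1.01)}

# Illustrative crisp ResDirection values on [-1, +1]:
# speed-up (SU*) for under-provision, step-down (SD*) for over-provision.
CONSEQUENT = {"SUL": 0.25, "SUM": 0.5, "SUH": 1.0,
              "SDL": -0.25, "SDM": -0.5, "SDH": -1.0, "noAction": 0.0}

# Excerpt of the rule base: (error level, client class, status) -> action.
# Only the "medium ... under" rows mirror the validation examples in the deck.
RULES = {
    ("medium", "gold", "under"): "SUM", ("high", "gold", "under"): "SUH",
    ("medium", "silver", "under"): "SUM", ("medium", "bronze", "under"): "noAction",
    ("medium", "gold", "over"): "SDM", ("high", "gold", "over"): "SDH",
}

def res_direction(resp_time_error, client_class, status):
    """Fire the matching rules and defuzzify by weighted average."""
    degrees = resp_time_error_mfs(resp_time_error)
    num = den = 0.0
    for (level, cls, st), action in RULES.items():
        if cls == client_class and st == status and degrees[level] > 0:
            num += degrees[level] * CONSEQUENT[action]
            den += degrees[level]
    return num / den if den else 0.0

def res_adjust(res_dir, res_wt, vc_res):
    # ResAdjust = ResDir x ResWt x VCres (from the slide).
    return res_dir * res_wt * vc_res

# Example: medium error, Gold client, under-provisioned -> speed up medium (0.5).
print(res_direction(0.5, "gold", "under"))
```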

V. Adaptive Learning

New incoming data is fed into the prediction model in different ways, depending on the prediction model used:
-  Directly, via clustering (if clustering is used, as in TS-FIS) => online learning.
-  Or it is stored in the database until a certain threshold is reached; then an ECA rule fires, initiating re-modeling => offline learning (see the sketch below).
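The offline path can be sketched as an ECA-style trigger: on each new record (event), check whether the buffer has reached the threshold (condition) and, if so, re-model (action). The threshold value and the `retrain` callback are placeholders, not values from the talk.

```python
# Sketch of the offline adaptive-learning path as an ECA-style trigger.
class OfflineAdaptiveLearner:
    def __init__(self, retrain, threshold=1000):
        self.retrain = retrain          # action: rebuild the prediction model
        self.threshold = threshold      # condition: enough new records stored
        self.buffer = []                # "database" of new incoming records

    def on_new_record(self, record):    # event: a new job/record arrives
        self.buffer.append(record)
        if len(self.buffer) >= self.threshold:
            self.retrain(self.buffer)   # fire the ECA rule: re-model offline
            self.buffer.clear()
```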

V. Adaptive Learning: Update Rules in the Fuzzy Tuner FIS

[Figure: Rule Editor.]

Resource Tuner Validation - Example

[Fuzzy rule table repeated from the Resource Tuner slide.]

Method: Testing cases using the fuzzy rule viewer.

Resource Tuner Validation - Example

Inputs: RespTimeError: medium, Client class: Gold, Status: underprovision
Output: ResDirection: SUM (speed up medium)
(Rule viewer values: RespTimeError = 0.5, ClientClass = 0.9, Status = 0.2, ResDirection = 0.5)

Resource Tuner Validation - Example

Inputs: RespTimeError: medium, Client class: Silver, Status: underprovision
Output: ResDirection: SUM (speed up medium)
(Rule viewer values: RespTimeError = 0.5, ClientClass = 0.5, Status = 0.2, ResDirection = 0.5)

Resource Tuner Validation - Example

Inputs: RespTimeError: medium, Client class: Bronze, Status: underprovision
Output: ResDirection: noAction
(Rule viewer values: RespTimeError = 0.5, ClientClass = 0.19, Status = 0.2, ResDirection = 0.01)

Conclusions

•  The proposed ML model predicts the right amount of resources (Bagging/Boosting is promising).
•  The fuzzy tuner:
   -  accommodates deviations in workload characteristics.
   -  enforces service differentiation.
•  Adaptive learning guarantees an up-to-date model, lowering future SLA violations.

Questions ?
