Self-Tuning Data Centers. Reza Rahimi, PhD, Senior Staff Software Architect, Huawei R&D Storage Lab, Santa Clara, USA. Keynote Talk, Big Data Innovation Summit, Jan 2017, Las Vegas, USA.

Self-Tuning Data Centers


Page 1: Self-Tuning Data Centers

Self-Tuning Data Centers

R&D

Reza Rahimi, PhD

Senior Staff Software Architect,

Huawei R&D Storage Lab,

Santa Clara, USA.

Keynote Talk,

Big Data Innovation Summit, Jan 2017,

Las Vegas, USA.

Page 2: Self-Tuning Data Centers

Prologue

Every important trend has an interesting story, and so does big data.

Page 3: Self-Tuning Data Centers

Business Challenge:

Software Defined Business

Software Defined Transportation

Software Defined Video Streaming

Software Defined Leasing/Hostelling

Software Defined Data Centers

Control, Management and Analytics Tier

Resource Pool Tier

Page 4: Self-Tuning Data Centers

Engineering Challenge:

Big Data Problem

Resource/Supply Providing

Monitoring, Analytics, Management

Prediction/Optimization

Customer/Service User

Control and Management Tier

Resource Pool Tier

Large amounts of data and metadata are generated.

Page 5: Self-Tuning Data Centers

Workloads and Applications:

First Class Citizens of Data Centers


Page 6: Self-Tuning Data Centers

Applications Spectrum

Self-Driving Cars

Robotic/AI Applications

Data Management Systems

Video Streaming/IoT, …

Computing (CPU, GPU, DSP, FPGA, ...)

Storage (DRAM, SSD, HDD, ...)

Network (Wired, Wi-Fi, 4G, …)

These applications will be fully or partially supported by data center services (cloud-based).

Page 7: Self-Tuning Data Centers

Typical Data Center Architecture

As a simple rule of thumb:

Enterprise Data Center Size :

100 Hosts

1000 VMs

Logs : ~40 GB/Day
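The rule of thumb above can be broken down per VM and per host with simple arithmetic. The sketch below is only illustrative: it assumes decimal units (1 GB = 1000 MB) and that logs are spread evenly across VMs, neither of which is stated on the slide. The function name `log_volume` is hypothetical.

```python
# Rule-of-thumb figures taken from the slide.
HOSTS = 100
VMS = 1000
LOGS_GB_PER_DAY = 40


def log_volume(hosts=HOSTS, vms=VMS, gb_per_day=LOGS_GB_PER_DAY):
    """Break the daily log volume down per VM, per host, and per year.

    Assumptions (not from the slide): decimal units and an even
    distribution of log volume across VMs and hosts.
    """
    return {
        "vms_per_host": vms / hosts,
        "mb_per_vm_per_day": gb_per_day * 1000 / vms,
        "mb_per_host_per_day": gb_per_day * 1000 / hosts,
        "tb_per_year": gb_per_day * 365 / 1000,
    }


if __name__ == "__main__":
    for name, value in log_volume().items():
        print(f"{name}: {value:.1f}")
```

Under these assumptions a single enterprise data center accumulates on the order of tens of terabytes of logs per year, which is why the deck treats data center management as a big data problem.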

[Figure: Data Center Management; Host 1, Host 2, ..., Host n, with apps running on VMs (VM-1-1 ... VM-1-k, VM-2-1 ... VM-2-m, VM-n-1 ... VM-n-l); logs from each host; Storage Pool; Big Data Engineering and Science]

Page 8: Self-Tuning Data Centers

Data Center Management

Challenges and Opportunities


Page 9: Self-Tuning Data Centers

Challenges in Data Center

Management

Service Level Agreement (SLA) : Throughput/Latency (e-commerce applications):

► US e-commerce sales reached $304 billion in 2014, increasing 15.4% yearly [1],

► 100 ms of latency costs a 1% decrease in sales [3],

► Pages should load in less than 2 seconds to avoid losing customers; slower loading will decrease overall sales by 7% [2],

Availability and Fault Tolerance :

► Example: Huawei public cloud, 99.999% availability [4] = maximum downtime of

Daily: 0.9s

Weekly: 6.0s

Monthly: 26.3s

Yearly: 5m 15.6s
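The downtime budget follows directly from the availability percentage: allowed downtime = (1 − availability) × period length. The sketch below reproduces the arithmetic; it assumes a 365.25-day year and a month of one twelfth of a year, and the function name is hypothetical. The figures listed above (0.9 s, 6.0 s, 26.3 s, 5 m 15.6 s) correspond to five nines (99.999%); six nines (99.9999%) would allow one tenth of each.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Assumption: a year is 365.25 days and a month is one twelfth of a year.
PERIOD_DAYS = {
    "daily": 1,
    "weekly": 7,
    "monthly": 365.25 / 12,
    "yearly": 365.25,
}


def allowed_downtime_seconds(availability_percent, period_days):
    """Maximum downtime in seconds for an availability level over a period."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * period_days * SECONDS_PER_DAY


if __name__ == "__main__":
    for label, nines in (("five nines", 99.999), ("six nines", 99.9999)):
        for period, days in PERIOD_DAYS.items():
            seconds = allowed_downtime_seconds(nines, days)
            print(f"{label:10s} {period:8s} {seconds:8.2f} s")
```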

Scalable and Elastic (on Demand) :

► Should know when and how to scale to satisfy the SLA dynamically,
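One very simple way to decide "when and how to scale" is a threshold rule on observed latency against the SLA target. This is a minimal sketch, not a method described in the talk: the function name, the one-instance step size, and the `headroom` parameter are all hypothetical choices made for illustration.

```python
def scaling_decision(p95_latency_ms, sla_latency_ms, instances,
                     min_instances=1, headroom=0.5):
    """Return the new instance count under a simple threshold rule.

    Assumptions (illustrative only):
    - scale out by one instance when observed latency exceeds the SLA;
    - scale in by one instance when latency is below headroom * SLA;
    - otherwise keep the current size.
    """
    if p95_latency_ms > sla_latency_ms:
        return instances + 1
    if p95_latency_ms < headroom * sla_latency_ms and instances > min_instances:
        return instances - 1
    return instances


if __name__ == "__main__":
    # Latency above a 200 ms SLA target triggers a scale-out.
    print(scaling_decision(p95_latency_ms=250, sla_latency_ms=200, instances=4))
```

A production policy would also need cooldown periods and prediction of upcoming load; the rule here only shows the shape of the decision.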

Page 10: Self-Tuning Data Centers

Challenges in Data Center

Management

Security and Privacy :

► Should guarantee data privacy (like medical data, financial data, …) and security against attacks, data ownership, …

Data Center Energy Efficiency and Resource Utilization :

► By 2020, a 30% reduction of energy cost based on European law (Green DC) [5],

► US data centers consume ~90 billion kilowatt-hours annually = the households of New York for two years,

► They emit over 150 million tons of carbon yearly in the USA [5],

► ~90 percent of VMs utilize < 15% of their assigned cores [9],

► ~90 percent of VMs have < 10 IOPS [9],

► The average server runs at 12%-18% of its capacity most of the time while still consuming 30% to 60% of its maximum power [6,7],

► High utilization -> savings in power consumption -> low carbon footprint.
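The link between utilization and power can be illustrated with a rough consolidation estimate. The sketch below assumes a linear power model (power rises linearly from an idle fraction to maximum power as utilization goes from 0 to 1); that model, the idle fraction of 0.3, the target utilization of 0.6, and all function names are assumptions chosen for illustration, not figures from the talk.

```python
import math


def power_fraction(utilization, idle_fraction):
    """Fraction of maximum power drawn, under an assumed linear model."""
    return idle_fraction + (1 - idle_fraction) * utilization


def consolidation_saving(servers, utilization, target_utilization, idle_fraction):
    """Estimate servers needed and power saved by packing the same load
    onto fewer, busier servers.

    Returns (servers_needed, fractional_power_saving).
    """
    total_load = servers * utilization
    # round() guards against floating-point noise before taking the ceiling.
    needed = math.ceil(round(total_load / target_utilization, 9))
    before = servers * power_fraction(utilization, idle_fraction)
    after = needed * power_fraction(total_load / needed, idle_fraction)
    return needed, 1 - after / before


if __name__ == "__main__":
    needed, saving = consolidation_saving(
        servers=100, utilization=0.15, target_utilization=0.6, idle_fraction=0.3
    )
    print(f"servers needed: {needed}, power saving: {saving:.1%}")
```

The point of the sketch is only the direction of the effect: because a lightly loaded server still draws a large share of its maximum power, running the same work on fewer servers at higher utilization reduces total power.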

Page 11: Self-Tuning Data Centers

Challenges in Data Center

Management

Software Compliance and License :

► ~$500,000 is spent on software licensing for an average-size data center,

► Licensing could be per User/Device/VM/Core/…

► Different license models and policies exist, such as [8]:

1) Running licensed workloads on bare metal (no virtualization),

2) Running licensed workloads on a dedicated cluster,

3) Migrating licensed workloads,

4) …

► Workload and cluster growth bring challenges for software licensing,

► The challenge is how to minimize the cost of software in data centers without violating license policy,
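To make the cost question concrete, the sketch below compares two placements under a per-core license. It assumes a policy in which every core of every host that may run the licensed workload must be licensed; that policy, the prices, and the function names are hypothetical and are not taken from the talk or from reference [8].

```python
def license_cost(licensed_cores, price_per_core):
    """Cost of a per-core license for the given number of cores."""
    return licensed_cores * price_per_core


def compare_models(hosts, cores_per_host, dedicated_hosts, price_per_core):
    """Compare licensing the whole cluster with licensing a dedicated subset.

    Assumption: every core on a host that may run the licensed workload
    must be licensed, so restricting placement reduces licensed cores.
    """
    whole = license_cost(hosts * cores_per_host, price_per_core)
    dedicated = license_cost(dedicated_hosts * cores_per_host, price_per_core)
    return {
        "whole_cluster": whole,
        "dedicated_cluster": dedicated,
        "saving": whole - dedicated,
    }


if __name__ == "__main__":
    print(compare_models(hosts=100, cores_per_host=16,
                         dedicated_hosts=10, price_per_core=100))
```

Under such a policy, placement and migration decisions directly change the license bill, which is why licensing becomes an optimization constraint for the scheduler.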

Dynamic Service Pricing :

► Computing, network and storage are utilities for workloads.

► Should model the market to find a dynamic pricing policy that stays competitive among cloud providers while increasing revenue.

Page 12: Self-Tuning Data Centers

Self-Tuning Data Center :

Simplified Service Architecture

Components:

► VM

► Scheduling and Orchestrating Services and Resources

► Real-time Log and Monitoring Service

► Alert and Policy Service

► Recommendation Service

Initial State

1) Request resource

2) Ask for the correct size, type and location for the resource based on the request

3) Correct configuration, resource size and placement

4) Allocate required resources

Operational and Recovery State

1) Telemetry and log sending

2) Query logs for policy and alert checks

3) Collected data

4) Check for violations and warnings

5) Alert of violation

6) Ask for recommendation

7) Send recommendations and recipes

8) Apply recommendation

Self-Tuning State

1) Ask for a recommendation for self-tuning (for example, in a low-traffic state)

2) Send tuning plan and recommendation (like VM migration or resizing)

3) Apply self-tuning recommendation
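The Operational and Recovery State can be read as a control loop: telemetry is collected, checked against policy, and each violation leads to a recommendation that the orchestrator applies. The sketch below is one possible reading of that loop, not the architecture's actual implementation. All class names, the CPU-utilization policy, and the fixed "resize" action are hypothetical; the slide does not specify which service sends which message.

```python
from dataclasses import dataclass


@dataclass
class Telemetry:
    vm: str
    cpu_utilization: float  # between 0.0 and 1.0


@dataclass
class Recommendation:
    vm: str
    action: str


class MonitoringService:
    """Stands in for the Real-time Log and Monitoring Service."""

    def __init__(self):
        self.records = []

    def ingest(self, telemetry):   # step 1: telemetry and log sending
        self.records.append(telemetry)

    def query(self):               # steps 2-3: query logs, return collected data
        return list(self.records)


class AlertPolicyService:
    """Stands in for the Alert and Policy Service (hypothetical CPU policy)."""

    def __init__(self, max_cpu):
        self.max_cpu = max_cpu

    def violations(self, records):  # step 4: check for violations
        return [r for r in records if r.cpu_utilization > self.max_cpu]


class RecommendationService:
    def recommend(self, violation):  # steps 6-7: ask for and send a recommendation
        return Recommendation(vm=violation.vm, action="resize")


class Orchestrator:
    """Stands in for Scheduling and Orchestrating Services and Resources."""

    def __init__(self):
        self.applied = []

    def apply(self, recommendation):  # step 8: apply recommendation
        self.applied.append(recommendation)


def operational_cycle(telemetry, monitoring, policy, recommender, orchestrator):
    """Run one pass of the loop and return the applied recommendations."""
    for item in telemetry:
        monitoring.ingest(item)
    for violation in policy.violations(monitoring.query()):  # step 5: alert
        orchestrator.apply(recommender.recommend(violation))
    return orchestrator.applied


if __name__ == "__main__":
    applied = operational_cycle(
        [Telemetry("VM-1-1", 0.95), Telemetry("VM-2-1", 0.10)],
        MonitoringService(), AlertPolicyService(max_cpu=0.9),
        RecommendationService(), Orchestrator(),
    )
    print(applied)
```

The Self-Tuning State would reuse the same pieces, with the orchestrator asking for a tuning plan proactively (for example, during low traffic) instead of reacting to an alert.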

Page 13: Self-Tuning Data Centers

Huawei Position in

Self-Tuning Data Center

► Huawei Cloud is growing very fast: > 50% revenue increase year over year.

► Huawei launched its first public cloud outside China in Europe (announced at CeBIT 2016) with 50,000 hosts.

► Working on intelligent services at the Huawei R&D Storage Lab in the USA to address self-tuning data centers and provide solutions for Huawei customers and their needs.

► Using and contributing ideas from/to the open source big data community.

Page 14: Self-Tuning Data Centers

Conclusions and

Future Directions

► The cloud-based ecosystem is the future of IT.

► A cloud data center is composed of different resources to satisfy application requirements.

► Managing these resources is a complicated task that humans cannot do manually.

► Machines in data centers generate a large amount of logs that describe what happens in the data center.

► Data scientists and engineers are needed to study system behavior and data center optimization.

► This will result in next-generation data centers that are self-tuning and need minimal human effort.

Page 15: Self-Tuning Data Centers

References

[1] U.S Census Bureau News : http://www2.census.gov/retail/releases/historical/ecomm/14q4.pdf

[2] Akamai Newsroom : http://www.akamai.com/html/about/press/releases/2009/press_091409.html

[3] High Scalability Blog : http://highscalability.com/latency-everywhere-and-it-costs-you-sales-how-crush-it

[4] High Availability : https://en.wikipedia.org/wiki/High_availability

[5] European Commission on Renewable Energy : https://ec.europa.eu/energy/en/topics/renewable-energy

[6] ISSUE Brief : https://www.nrdc.org/sites/default/files/data-center-efficiency-assessment-IB.pdf

[7] ISSUE Paper : https://www.nrdc.org/sites/default/files/data-center-efficiency-assessment-IP.pdf

[8] Turbonomic white paper "Licensing, Compliance & Audits in the Cloud Era", 2015.

[9] CloudPhysics, Global IT Data Lake Report, Q4 2016.

Page 16: Self-Tuning Data Centers

Thanks ☺ Questions?