13
Christina Delimitrou Cornell University WAX – April 9 th 2017 SAIL (Systems, Architecture and Infrastructure Lab) IMPROVING RESOURCE EFFICIENCY IN CLOUD COMPUTING Neeraj Kulkarni, Feng Qi, Glyfina Fernando and Christina Delimitrou Cornell University Leveraging Approximation to Improve Resource Efficiency in the Cloud

Leveraging Approximation to Improve IMPROVING RESOURCE ...delimitrou/slides/2017.wax.pliant.slides.… · Leveraging Approximation to Improve Resource Efficiency in the Cloud. 2 Datacenter

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Leveraging Approximation to Improve IMPROVING RESOURCE ...delimitrou/slides/2017.wax.pliant.slides.… · Leveraging Approximation to Improve Resource Efficiency in the Cloud. 2 Datacenter

Christina Delimitrou

Cornell University

WAX – April 9th 2017

SAIL (Systems, Architecture and Infrastructure Lab)

IMPROVING RESOURCE EFFICIENCY

IN CLOUD COMPUTING

Neeraj Kulkarni, Feng Qi, Glyfina Fernando

and Christina Delimitrou

Cornell University

Leveraging Approximation to Improve

Resource Efficiency in the Cloud

Page 2: Leveraging Approximation to Improve IMPROVING RESOURCE ...delimitrou/slides/2017.wax.pliant.slides.… · Leveraging Approximation to Improve Resource Efficiency in the Cloud. 2 Datacenter

2

Datacenter Underutilization

1 C. Delimitrou and C. Kozyrakis. Quasar: Resource-Efficient and QoS-Aware Cluster Management, ASPLOS 2014.

2 L. A. Barroso, U. Holzle. The Datacenter as a Computer, 2013.

Twitter (Mesos)1

4-5x

Google (Borg)2

3-5x

0 10 20 30 40 50 60 70 80 90 100

CPU Utilization (%)

Page 3: Leveraging Approximation to Improve IMPROVING RESOURCE ...delimitrou/slides/2017.wax.pliant.slides.… · Leveraging Approximation to Improve Resource Efficiency in the Cloud. 2 Datacenter

3

Co-schedule multiple cloud services on same physical platform

Often leads to resource interference, especially when sharing

cores

A Common Approach

App1 App2

Page 4: Leveraging Approximation to Improve IMPROVING RESOURCE ...delimitrou/slides/2017.wax.pliant.slides.… · Leveraging Approximation to Improve Resource Efficiency in the Cloud. 2 Datacenter

4

Co-schedule one high priority and one/more best-effort apps

Performance is non-critical for best effort jobs

Disadvantage: assume best-effort apps are always low priority

A Common Cure

App1 App2

Page 5: Leveraging Approximation to Improve IMPROVING RESOURCE ...delimitrou/slides/2017.wax.pliant.slides.… · Leveraging Approximation to Improve Resource Efficiency in the Cloud. 2 Datacenter

5

Approximate computing apps can absorb a loss of resources as

loss of output quality instead of a loss in performance

Advantage: performance of all co-scheduled applications is high-

priority

Approximate Computing Apps to the Rescue

App1 App2

Page 6: Leveraging Approximation to Improve IMPROVING RESOURCE ...delimitrou/slides/2017.wax.pliant.slides.… · Leveraging Approximation to Improve Resource Efficiency in the Cloud. 2 Datacenter

6

Enables latency-critical & approximate apps to share resources

(including cores) without penalizing their performance

Tunes degree and type of approximation based on measured

interference

Pliant

App1 App2

Pliant runtime

Page 7: Leveraging Approximation to Improve IMPROVING RESOURCE ...delimitrou/slides/2017.wax.pliant.slides.… · Leveraging Approximation to Improve Resource Efficiency in the Cloud. 2 Datacenter

7

1. Identify opportunities for approximation

ACCEPT (precision, loop perforation, sync

elision), algorithmic exploration

2. Lightweight profiling to determine when to

employ approximation

End-to-end latency/throughput & perf counters

3. Determine what resource(s) to constrain?

Based on measured interference

4. Determine what type of approximation & to

what extent?

Based on interference and performance impact

Challenges

App1 App2

Pliant runtime

Page 8: Leveraging Approximation to Improve IMPROVING RESOURCE ...delimitrou/slides/2017.wax.pliant.slides.… · Leveraging Approximation to Improve Resource Efficiency in the Cloud. 2 Datacenter

8

DynamoRIO for switching between precise/approximate versions

Initial implementation, overheads high but not prohibitive

Looking into Petabricks and LLVM

Pliant

App1 App2

Pliant runtimeClient

Performance monitor

Workload generator

Interference

monitor

Server

Page 9: Leveraging Approximation to Improve IMPROVING RESOURCE ...delimitrou/slides/2017.wax.pliant.slides.… · Leveraging Approximation to Improve Resource Efficiency in the Cloud. 2 Datacenter

9

Incremental approximation:

Employ the minimum amount of approximation (quality loss) to

restore the performance of the interactive service

Several versions for each type of approximation, choose online

Interference-aware approximation:

Choose the type of interference that minimizes pressure in the

bottlenecked resource

Example:

High memory interference prioritize algo tuning

High CPU interference prioritize sync elision, loop perforation

Adaptive Approximation

Page 10: Leveraging Approximation to Improve IMPROVING RESOURCE ...delimitrou/slides/2017.wax.pliant.slides.… · Leveraging Approximation to Improve Resource Efficiency in the Cloud. 2 Datacenter

10

Latency-critical interactive services: memcached & nginx

Open-loop workload generator & performance monitor

Facebook traffic pattern

Approximate computing apps: PARSEC, SPLASH, Spark MLlib

System: 2 2-socket, 40-core servers, 128GB RAM each

Methodology

Page 11: Leveraging Approximation to Improve IMPROVING RESOURCE ...delimitrou/slides/2017.wax.pliant.slides.… · Leveraging Approximation to Improve Resource Efficiency in the Cloud. 2 Datacenter

11

Evaluation

memcached sharing physical cores with PARSEC

Latency Degree of approximation

Page 12: Leveraging Approximation to Improve IMPROVING RESOURCE ...delimitrou/slides/2017.wax.pliant.slides.… · Leveraging Approximation to Improve Resource Efficiency in the Cloud. 2 Datacenter

12

Approximate computing: opportunity to improve cloud efficiency

without loss in performance

Pliant: cloud runtime to co-schedule interactive services with

approximate computing apps

Incremental and interference-aware approximation

Preserves QoS for interactive service with minimal loss in quality for

approximate computing application

Current work:

DynamoRIO Petabricks/LLVM

Add cloud approximate computing application

Improve interference awareness

Leverage hardware isolation techniques

Conclusions

Page 13: Leveraging Approximation to Improve IMPROVING RESOURCE ...delimitrou/slides/2017.wax.pliant.slides.… · Leveraging Approximation to Improve Resource Efficiency in the Cloud. 2 Datacenter

13

Approximate computing: opportunity to improve cloud efficiency

without loss in performance

Pliant: cloud runtime to co-schedule interactive services with

approximate computing apps

Incremental and interference-aware approximation

Preserves QoS for interactive service with minimal loss in quality for

approximate computing application

Current work:

DynamoRIO Petabricks/LLVM

Add cloud approximate computing application

Improve interference awareness

Leverage hardware isolation techniques

Questions?