PARTIES: QOS-AWARE RESOURCE PARTITIONING FOR MULTIPLE INTERACTIVE SERVICES
Shuang Chen, Christina Delimitrou, José F. Martínez
Cornell University




  • COLOCATION OF APPLICATIONS

    [Diagram: latency-critical and best-effort applications colocated on a multicore server; each core (P) has private caches, and all cores share the last-level cache]

    Motivation • Characterization • PARTIES • Evaluation • Conclusions

  • PRIOR WORK

    § Interference during colocation
    § Scheduling [Nathuji'10, Mars'13, Delimitrou'14]
    • Avoid co-scheduling apps that may interfere
    - May require offline knowledge
    - Limits colocation options

    § Resource partitioning [Sanchez'11, Lo'15]
    • Partition shared resources
    - At most 1 LC app + multiple best-effort jobs

  • TRENDS IN DATACENTERS

    [Diagram: monoliths are giving way to microservices; colocation shifts from 1 latency-critical (LC) + many best-effort (BE) jobs to many LC + many BE jobs]

    • More LC jobs
    • All have QoS targets

  • MAIN CONTRIBUTIONS

    § Workload characterization
    • The impact of resource sharing
    • The effectiveness of resource isolation
    • The relationship between different resources

    § PARTIES: first QoS-aware resource manager for colocation of many LC services
    • Dynamic partitioning of 9 shared resources
    • No a priori application knowledge
    • 61% higher throughput under QoS constraints
    • Adapts to varying load patterns

  • INTERACTIVE LC APPLICATIONS

    Table 1: Latency-Critical Applications

                        Memcached    Xapian      NGINX       Moses        MongoDB      Sphinx
    Domain              Key-value    Web         Web         Real-time    Persistent   Speech
                        store        search      server      translation  database     recognition
    Target QoS          600us        5ms         10ms        15ms         300ms        2.5s
    Max Load            1,280,000    8,000       560,000     2,800        240          14
    User/Sys/IO CPU%    13/78/0      42/23/0     20/50/0     50/14/0      0.3/0.2/57   85/0.6/0
    LLC MPKI            0.55         0.03        0.06        10.48        0.01         6.28
    Memory Capacity     9.3 GB       0.02 GB     1.9 GB      2.5 GB       18 GB        1.4 GB
    Memory Bandwidth    0.6 GB/s     0.01 GB/s   0.6 GB/s    26 GB/s      0.03 GB/s    3.1 GB/s
    Disk Bandwidth      0 MB/s       0 MB/s      0 MB/s      0 MB/s       5 MB/s       0 MB/s
    Network Bandwidth   3.0 Gbps     0.07 Gbps   6.2 Gbps    0.001 Gbps   0.01 Gbps    0.001 Gbps

    Table 2: Impact of resource interference. Each row corresponds to a type of resource. Values are the maximum percentage of max load for which the server can satisfy QoS when the LC application runs under interference on that resource. Smaller values mean the application is more sensitive to that type of interference.

                        Memcached    Xapian      NGINX       Moses        MongoDB      Sphinx
    Hyperthread         0%           0%          0%          0%           90%          0%
    CPU                 10%          50%         60%         70%          100%         30%
    Power               60%          80%         90%         90%          100%         70%
    LLC Capacity        90%          90%         70%         80%          100%         30%
    LLC Bandwidth       0%           60%         70%         30%          90%          0%
    Memory Bandwidth    0%           50%         40%         10%          70%          0%
    Memory Capacity     0%           100%        0%          0%           20%          80%
    Disk Bandwidth      100%         100%        100%        100%         10%          100%
    Network Bandwidth   20%          90%         10%         90%          90%          80%


Max Load: the maximum RPS that meets the QoS target when the application runs alone

  • INTERFERENCE STUDY

    Cell values in Table 2 give the % of max load sustained under the QoS target: 0% means extremely sensitive, 100% means not sensitive at all.

    • Applications are sensitive to resources with high usage
    • Applications with strict QoS targets are more sensitive
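The sensitivity data can be queried programmatically. A minimal sketch, with the values transcribed from Table 2 and "most sensitive" defined as the lowest sustainable fraction of max load:

```python
# Sensitivity data transcribed from Table 2: the max % of peak load each
# latency-critical app sustains under its QoS target while a given
# resource is contended. Lower = more sensitive.
SENSITIVITY = {
    "Hyperthread":       {"Memcached": 0,   "Xapian": 0,   "NGINX": 0,   "Moses": 0,   "MongoDB": 90,  "Sphinx": 0},
    "CPU":               {"Memcached": 10,  "Xapian": 50,  "NGINX": 60,  "Moses": 70,  "MongoDB": 100, "Sphinx": 30},
    "Power":             {"Memcached": 60,  "Xapian": 80,  "NGINX": 90,  "Moses": 90,  "MongoDB": 100, "Sphinx": 70},
    "LLC Capacity":      {"Memcached": 90,  "Xapian": 90,  "NGINX": 70,  "Moses": 80,  "MongoDB": 100, "Sphinx": 30},
    "LLC Bandwidth":     {"Memcached": 0,   "Xapian": 60,  "NGINX": 70,  "Moses": 30,  "MongoDB": 90,  "Sphinx": 0},
    "Memory Bandwidth":  {"Memcached": 0,   "Xapian": 50,  "NGINX": 40,  "Moses": 10,  "MongoDB": 70,  "Sphinx": 0},
    "Memory Capacity":   {"Memcached": 0,   "Xapian": 100, "NGINX": 0,   "Moses": 0,   "MongoDB": 20,  "Sphinx": 80},
    "Disk Bandwidth":    {"Memcached": 100, "Xapian": 100, "NGINX": 100, "Moses": 100, "MongoDB": 10,  "Sphinx": 100},
    "Network Bandwidth": {"Memcached": 20,  "Xapian": 90,  "NGINX": 10,  "Moses": 90,  "MongoDB": 90,  "Sphinx": 80},
}

def most_sensitive_resources(app):
    """Resources for which `app` tolerates the least contention."""
    worst = min(row[app] for row in SENSITIVITY.values())
    return sorted(r for r, row in SENSITIVITY.items() if row[app] == worst)

# MongoDB is I/O-bound (57% I/O CPU time in Table 1), so disk contention hurts most:
print(most_sensitive_resources("MongoDB"))  # ['Disk Bandwidth']
```

This reproduces the first observation above: each application's worst resource is one it uses heavily in Table 1.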

  • ISOLATION MECHANISMS

    [Diagram: cores (P) with private caches and a shared last-level cache, annotated with the mechanism that controls each resource]

    • cgroup: core mapping (hyperthreads, core counts), memory capacity, disk bandwidth
    • ACPI frequency driver: core frequency (power)
    • Intel CAT: LLC capacity (cache capacity, cache bandwidth, memory bandwidth)
    • qdisc: network bandwidth
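As a concrete illustration of how these knobs are typically driven on Linux, the sketch below builds (but does not execute) the corresponding commands. These are the standard cgroup v2, cpufreq sysfs, `pqos` (intel-cmt-cat), and `tc`/HTB interfaces, not PARTIES's actual implementation; the app name, core set, masks, and rates are invented examples:

```python
def partition_commands(app, cores, max_khz, llc_mask, mem_bytes, rate_mbit, dev="eth0"):
    """Build, without running, the Linux commands that would apply one
    per-app partition: cores and memory via cgroup v2, frequency via the
    cpufreq sysfs interface, LLC ways via Intel CAT (pqos), and network
    bandwidth via a tc/HTB class. Values are illustrative only."""
    return [
        f"mkdir -p /sys/fs/cgroup/{app}",
        f"echo {cores} > /sys/fs/cgroup/{app}/cpuset.cpus",        # core mapping
        f"echo {mem_bytes} > /sys/fs/cgroup/{app}/memory.max",     # memory capacity
        f"echo {max_khz} > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq",  # frequency
        f"pqos -e 'llc:1={llc_mask}'",                             # LLC capacity (CAT)
        f"tc class add dev {dev} parent 1: classid 1:10 htb rate {rate_mbit}mbit",  # network
    ]

for cmd in partition_commands("memcached", "0-3", 2_200_000, "0xf", 10 * 2**30, 3000):
    print(cmd)
```

Disk bandwidth (the remaining knob) would go through the cgroup `io.max` interface in the same style.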

  • RESOURCE FUNGIBILITY

    § Resources are fungible
    • More flexibility in resource allocation
    • Simplifies the resource manager

    [Figure: QoS-feasible allocations (cores x cache ways) for Xapian, stand-alone vs. with memory interference; many core/cache-way combinations meet the QoS target, and the feasible region shrinks under interference]
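The fungibility idea, that many (cores, cache-way) combinations meet the same QoS target so one resource can be traded for another, can be sketched with a toy feasibility model. The predicate below is invented for illustration and is not Xapian's measured behavior:

```python
def feasible(cores, ways, budget=24):
    """Toy QoS model, invented for illustration: an allocation meets the
    latency target when its combined capacity covers a fixed budget."""
    return 2 * cores + ways >= budget

# All QoS-feasible points on a 14-core x 20-way grid, similar to the figure's axes:
region = {(c, w) for c in range(1, 15) for w in range(1, 21) if feasible(c, w)}

# Fungibility: cores taken away can be compensated with extra cache ways.
assert feasible(10, 4) and feasible(4, 16)
print(len(region), "feasible allocations")
```

A resource manager only needs to find some point in this region, which is what makes dynamic trial-and-error allocation (rather than exhaustive profiling) workable.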

  • PARTIES: DESIGN PRINCIPLES

    § PARTIES: PARTitioning for multiple InteractivE Services

    § Design principles
    • LC apps are equally important
    • Allocation should be dynamic and fine-grained
    • No a priori application knowledge or offline profiling
    • Recover quickly from incorrect decisions
    • Migration is used as a last resort

  • PARTIES

    • A latency monitor (client side or server side) polls each application's tail latency every 100 ms; the main function adjusts allocations between App1, App2, ..., and an unallocated pool
    • 5 knobs organized into 2 wheels: a compute wheel (cores C, core frequency F, LLC capacity $) and a storage wheel (memory capacity M, disk bandwidth D)
    • Start from a random resource; follow the wheels to visit all resources
    • QoS violations? Upsize! Excess resources (latency slack above ~20%)? Downsize!
    • If upsizing a resource brings no benefit, revert and move to the next resource on the wheel; switch from the compute wheel to the storage wheel when there is no benefit and memory slack < 1 GB

    [Diagram: upsize and downsize wheels, shown while downsizing App2 and upsizing App1]
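The adjustment loop sketched on this slide can be written out as follows. The wheel contents, the ~20% slack threshold, and the revert-on-no-benefit rule come from the slides; the latency callback, unit step sizes, and toy latency model are placeholders:

```python
import random

COMPUTE_WHEEL = ["cores", "frequency", "llc_ways"]   # C, F, $ on the slide
STORAGE_WHEEL = ["memory", "disk_bw"]                # M, D on the slide

def parties_step(measure, alloc, pool, target, slack=0.2, max_tries=10):
    """One control interval for a single app. `measure(alloc)` stands in
    for the 100 ms tail-latency monitor. Downsize on large latency slack;
    on a QoS violation, upsize by walking the wheels from a random knob,
    reverting any step that brings no benefit."""
    lat = measure(alloc)
    if lat <= target * (1 - slack):                   # excess resources? downsize!
        shrinkable = [r for r in alloc if alloc[r] > 1]
        if shrinkable:
            r = random.choice(shrinkable)
            alloc[r] -= 1
            pool[r] += 1
        return "downsize"
    wheel = COMPUTE_WHEEL + STORAGE_WHEEL
    i = random.randrange(len(wheel))                  # start from a random resource
    for _ in range(max_tries):
        if lat <= target:
            return "met"
        r = wheel[i % len(wheel)]
        if pool[r] == 0:
            i += 1                                    # nothing left of this resource
            continue
        alloc[r] += 1                                 # upsize! take from the pool
        pool[r] -= 1
        new_lat = measure(alloc)
        if new_lat >= lat:                            # no benefit: revert, next knob
            alloc[r] -= 1
            pool[r] += 1
            i += 1
        else:
            lat = new_lat
    return "met" if lat <= target else "violating"

# Toy demo with an invented latency model (latency falls with cores + LLC ways):
random.seed(1)
measure = lambda a: 40.0 / (a["cores"] + a["llc_ways"])
alloc = {"cores": 2, "frequency": 1, "llc_ways": 1, "memory": 1, "disk_bw": 1}
pool  = {"cores": 6, "frequency": 3, "llc_ways": 8, "memory": 4, "disk_bw": 4}
print(parties_step(measure, alloc, pool, target=10.0))  # "met"
```

Because steps with no measured benefit are reverted immediately, the controller recovers quickly from incorrect decisions, matching the design principles above.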

  • METHODOLOGY

    § Platform: Intel E5-2699 v4
    • Single socket with 22 cores (8 IRQ cores)

    § Virtualization
    • LXC 2.0.7

    § Load generators
    • Open loop
    • Request inter-arrival distribution: exponential
    • Request popularity: Zipfian

    § Testing strategy
    • Constant load: 30s warmup, 1m measurement (x5)
    • Varying load simulates diurnal load patterns
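The load-generator settings above (open loop, exponential inter-arrivals, Zipfian popularity) can be sketched with the standard library. The QPS, key count, and skew below are arbitrary examples, not the paper's parameters:

```python
import random

def open_loop_trace(qps, num_keys, n, zipf_s=0.99, seed=42):
    """Generate n (timestamp, key) request pairs: exponential inter-arrival
    times at rate `qps` (open loop: arrival times never wait for responses),
    with keys drawn from a Zipfian popularity distribution of skew `zipf_s`."""
    rng = random.Random(seed)
    weights = [1.0 / (k + 1) ** zipf_s for k in range(num_keys)]
    t, trace = 0.0, []
    for _ in range(n):
        t += rng.expovariate(qps)
        trace.append((t, rng.choices(range(num_keys), weights=weights)[0]))
    return trace

trace = open_loop_trace(qps=1000, num_keys=100, n=5000)
print(f"mean inter-arrival = {trace[-1][0] / len(trace) * 1e3:.2f} ms")  # ~1 ms at 1000 QPS
```

The open-loop property matters for tail-latency studies: a closed-loop generator would slow down when the server slows down and hide queueing delay.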

  • CONSTANT LOADS: MEMCACHED, XAPIAN & NGINX

    [Figure: maximum load of NGINX (%) achievable at each combination of Memcached and Xapian load (% of max), under Unmanaged, Heracles, PARTIES, and Oracle]

    Oracle
    - Offline profiling
    - Always finds the global optimum

    Heracles
    - No partitioning between BE jobs
    - Suspends BE upon QoS violation
    - No interaction between resources

  • MORE EVALUATION

    § Constant loads
    • All 2- and 3-app mixes under PARTIES
    • Comparison with Heracles for 2- to 6-app mixes

    § Diurnal load pattern
    • Colocation of Memcached, Xapian and Moses

    § PARTIES overhead
    • Convergence time for 2- to 6-app mixes

  • CONCLUSIONS

    § Need to manage multiple LC apps

    § Insights
    • Resource partitioning
    • Resource fungibility

    § PARTIES
    • Partitions 9 shared resources
    • No offline knowledge required
    • 61% higher throughput under QoS targets
    • Adapts to varying load patterns

  • PARTIES: QOS-AWARE RESOURCE PARTITIONING FOR MULTIPLE INTERACTIVE SERVICES

    Shuang Chen, Christina Delimitrou, José F. Martínez
    Cornell University

    http://tiny.cc/parties