
Page 1: Towards Predictable Multi-Tenant Shared Cloud Storage

Princeton University

Towards Predictable Multi-Tenant Shared Cloud Storage

David Shue*, Michael Freedman*, and Anees Shaikh✦

*Princeton ✦IBM Research

Page 2: Towards Predictable Multi-Tenant Shared Cloud Storage

Shared Services in the Cloud

[Figure: tenants Y, Z, T, and F each use shared cloud services: DynamoDB, S3, EBS, SQS]

Page 3: Towards Predictable Multi-Tenant Shared Cloud Storage

Co-located Tenants Contend For Resources

[Figure: tenants Z, Y, T, and F are co-located across the four nodes of a shared storage service; two tenants ramp to 2x demand]

Page 4: Towards Predictable Multi-Tenant Shared Cloud Storage

Co-located Tenants Contend For Resources

[Figure: tenants Y and F each drive 2x demand against the shared storage nodes]

Page 5: Towards Predictable Multi-Tenant Shared Cloud Storage

Co-located Tenants Contend For Resources

[Figure: with Y and F at 2x demand, co-located tenants interfere on every node]

Resource contention = unpredictable performance

Page 6: Towards Predictable Multi-Tenant Shared Cloud Storage

Towards Predictable Shared Cloud Storage

[Figure: the shared storage service (SS nodes) serving tenants Z, Y, T, and F]

Per-tenant resource allocation and performance isolation

Page 7: Towards Predictable Multi-Tenant Shared Cloud Storage

Towards Predictable Shared Cloud Storage

[Figure: tenants Zynga, Yelp, TP, and Foursquare on the shared storage service, each capped by a hard rate limit (80, 120, 160, and 40 kreq/s)]

Hard limits are too restrictive and achieve lower utilization.

Page 8: Towards Predictable Multi-Tenant Shared Cloud Storage

Towards Predictable Shared Cloud Storage

[Figure: the same tenants share the service under weights wz = 20%, wy = 30%, wt = 10%, wf = 40%; e.g., tenant Z demands 40% but receives ratez = 30%, while tenant F demands only 30%]

Goal: per-tenant max-min fair share of system resources
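
To make the goal concrete, here is a minimal Python sketch (my illustration, not code from the talk) of weighted max-min fairness via progressive filling: tenants demanding less than their weighted share are capped at their demand, and the excess is redistributed to the remaining tenants in proportion to weight. With the slide's numbers it reproduces ratez = 30% for demandz = 40%.

    # Illustrative sketch: weighted max-min fairness via progressive filling.
    # Weights and demands are fractions of total system capacity.
    def weighted_max_min(weights, demands, capacity=1.0):
        alloc = {t: 0.0 for t in weights}
        active = set(weights)
        remaining = capacity
        while active and remaining > 1e-12:
            total_w = sum(weights[t] for t in active)
            # Tentative share of the remaining capacity, by weight.
            shares = {t: remaining * weights[t] / total_w for t in active}
            satisfied = {t for t in active if alloc[t] + shares[t] >= demands[t]}
            if not satisfied:
                for t in active:
                    alloc[t] += shares[t]
                break
            # Cap satisfied tenants at their demand; recycle the excess.
            for t in satisfied:
                remaining -= demands[t] - alloc[t]
                alloc[t] = demands[t]
            active -= satisfied
        return alloc

    # The slide's example: Z demands 40% but its fair rate is 30%.
    weights = {"Z": 0.20, "Y": 0.30, "T": 0.10, "F": 0.40}
    demands = {"Z": 0.40, "Y": 0.30, "T": 0.10, "F": 0.30}
    print(weighted_max_min(weights, demands))
    # -> Z: 0.30, Y: 0.30, T: 0.10, F: 0.30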

Page 9: Towards Predictable Multi-Tenant Shared Cloud Storage

PISCES: Predictable Shared Cloud Storage

•PISCES Goals
- Allocate a per-tenant fair share of total system resources
- Isolate tenant request performance
- Minimize overhead and preserve system throughput

•PISCES Mechanisms (from the minutes timescale down to microseconds)
- Partition Placement (Migration)
- Local Weight Allocation
- Replica Selection
- Fair Queuing

Page 10: Towards Predictable Multi-Tenant Shared Cloud Storage

Place Partitions By Fairness Constraints

[Figure: tenants A-D, each a set of VMs with global weights weightA-weightD, map their keyspaces onto PISCES nodes 1-4; each keyspace is split into partitions of varying popularity]

Page 11: Towards Predictable Multi-Tenant Shared Cloud Storage

Place Partitions By Fairness Constraints

[Figure: the controller computes a feasible partition placement (PP) subject to per-node constraints Rate A < wA, Rate B < wB, Rate C < wC; one node is over-loaded]
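
The constraint on each node is that the resident partitions' demanded rates fit within capacity, so every tenant's rate stays under its weighted share. As a toy illustration of that feasibility check (a first-fit-decreasing heuristic of my own, not the paper's placement algorithm):

    # Hypothetical sketch: a placement is feasible when no node's resident
    # demand exceeds its capacity. First-fit-decreasing bin packing.
    def place_partitions(partition_rates, nodes, capacity):
        load = {n: 0.0 for n in nodes}
        placement = {}
        # Heaviest partitions first, each onto the least-loaded node.
        for pid, rate in sorted(partition_rates.items(), key=lambda kv: -kv[1]):
            node = min(nodes, key=lambda n: load[n])
            if load[node] + rate > capacity:
                raise ValueError(f"no feasible placement found for {pid}")
            load[node] += rate
            placement[pid] = node
        return placement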

Page 12: Towards Predictable Multi-Tenant Shared Cloud Storage

Place Partitions By Fairness Constraints

[Figure: after migration, the controller's computed placement satisfies the fairness constraints on every node]

Page 13: Towards Predictable Multi-Tenant Shared Cloud Storage

PISCES Mechanisms

[Timescale diagram: the controller drives Partition Placement (Migration) at the minutes timescale]

Page 14: Towards Predictable Multi-Tenant Shared Cloud Storage

Allocate Local Weights By Tenant Demand

[Figure: each tenant's global weight is split into per-node local weights, with wA = wa1 + wa2 + wa3 + wa4 and likewise for wB, wC, and wD. Demands are skewed: RA > wA and RD > wD, while RB < wB and RC < wC]

Page 15: Towards Predictable Multi-Tenant Shared Cloud Storage

Allocate Local Weights By Tenant Demand

Swap weights to minimize tenant latency (demand)

[Figure: the controller swaps local weight between tenants across nodes (A→C, C→A, D→B, C→D, B→C) so each node's local weights track local demand]
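
A rough sketch of the idea (names and step size are my own simplification): on each node, shift a little weight from the tenant with the largest surplus to the tenant with the largest deficit. PISCES itself pairs swaps across nodes so each tenant's global weight is preserved; this per-node version preserves each node's total weight instead.

    # Toy sketch: move local weight from over- to under-provisioned tenants.
    def swap_local_weights(local_weight, local_demand, step=0.01):
        for node, weights in local_weight.items():
            demand = local_demand[node]
            surplus = {t: weights[t] - demand[t] for t in weights}
            donor = max(surplus, key=surplus.get)      # most over-provisioned
            recipient = min(surplus, key=surplus.get)  # most under-provisioned
            if surplus[donor] > 0 > surplus[recipient]:
                delta = min(step, surplus[donor], -surplus[recipient])
                weights[donor] -= delta
                weights[recipient] += delta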

Page 16: Towards Predictable Multi-Tenant Shared Cloud Storage

Achieving System-wide Fairness

[Timescale diagram: the controller drives Partition Placement (Migration) at minutes and Local Weight Allocation at seconds]

Page 17: Towards Predictable Multi-Tenant Shared Cloud Storage

Select Replicas By Local Weight

[Figure: request routers (RR) initially split a tenant's GETs 1/2-1/2 across two replicas, leaving tenant C over-loaded on one node and under-utilized on the other; the replica-selection policy (RS) chooses replicas based on node latency]

Page 18: Towards Predictable Multi-Tenant Shared Cloud Storage

Select Replicas By Local Weight

[Figure: with latency-aware replica selection (RS), the routers shift the tenant's GET split to 1/3-2/3, matching the nodes' local weights]
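
A minimal sketch of latency-driven replica weighting, assuming per-replica latency estimates are available (the slide only specifies that replicas are selected by node latency; the smoothing rule below is my illustration). If one replica is twice as slow as the other, an initial 1/2-1/2 split drifts toward the 1/3-2/3 split shown above.

    # Send weights drift toward the inverse of each replica's observed
    # latency, so load drains from over-loaded replicas.
    import random

    def update_weights(weights, latencies, gain=0.1):
        inverse = {r: 1.0 / latencies[r] for r in weights}
        total = sum(inverse.values())
        for r in weights:
            target = inverse[r] / total
            weights[r] += gain * (target - weights[r])  # smooth toward target

    def pick_replica(weights):
        replicas = list(weights)
        return random.choices(replicas, weights=[weights[r] for r in replicas])[0]

    # Hypothetical replicas n1 and n2, where n2 is twice as fast as n1.
    w = {"n1": 0.5, "n2": 0.5}
    for _ in range(100):
        update_weights(w, {"n1": 2.0, "n2": 1.0})
    print(w)  # converges to ~{'n1': 0.33, 'n2': 0.67}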

Page 19: Towards Predictable Multi-Tenant Shared Cloud Storage

Achieving System-wide Fairness

[Timescale diagram: controller-driven Partition Placement (minutes) and Local Weight Allocation (seconds), plus Replica Selection at the request routers (RR)]

Page 20: Towards Predictable Multi-Tenant Shared Cloud Storage

Fair Queuing

Queue Tenants By Dominant Resource

[Figure: two equal-weight tenants share a node; one is bandwidth-limited (out bytes dominant), the other request-limited. Fair-sharing out bytes alone splits that one resource evenly, ignoring each tenant's actual bottleneck]

Page 21: Towards Predictable Multi-Tenant Shared Cloud Storage

Queue Tenants By Dominant Resource

[Figure: with dominant-resource fair sharing, each tenant receives an equal share of its dominant resource on the shared out-bytes bottleneck]
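
A hedged sketch of the scheduling rule (my paraphrase of dominant-resource fairness, not PISCES's actual queuing code): charge each request's resource-cost vector (in bytes, out bytes, request count), and always serve the backlogged tenant whose dominant share is smallest.

    # Dominant-resource fair queuing sketch. `usage` maps tenants to their
    # weight-normalized resource consumption; `queues` holds pending requests.
    def dominant_share(usage, weight, tenant):
        # Largest weight-normalized consumption, e.g. out-bytes for a
        # bandwidth-bound tenant, request count for a request-bound one.
        return max(c / weight[tenant] for c in usage[tenant].values())

    def dequeue_next(usage, weight, queues):
        backlogged = [t for t in queues if queues[t]]
        tenant = min(backlogged, key=lambda t: dominant_share(usage, weight, t))
        request = queues[tenant].pop(0)
        for resource, cost in request["cost"].items():
            usage[tenant][resource] += cost
        return tenant, request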

Page 22: Towards Predictable Multi-Tenant Shared Cloud Storage

Achieving System-wide Fairness

[Timescale diagram: the full stack: controller-driven Partition Placement (minutes) and Local Weight Allocation (seconds), Replica Selection at the request routers, and Fair Queuing at every node (microseconds)]

Page 23: Towards Predictable Multi-Tenant Shared Cloud Storage

Evaluation

•Does PISCES achieve system-wide fairness?

•Does PISCES provide performance isolation?

•Can PISCES adapt to shifting tenant distributions?

[Testbed: 8 YCSB client machines and 8 PISCES server nodes connected by Gigabit Ethernet through a ToR switch]

Page 24: Towards Predictable Multi-Tenant Shared Cloud Storage

PISCES Achieves System-wide Fairness

[Plot: GET requests (kreq/s) vs. time (s) under four configurations: Membase (no queuing), FQ, Replica Selection, and FQ + Replica Selection, with min-max ratios of 0.68, 0.51, 0.79, and 0.97 MMR and latency spreads of 3.77-5.30 ms, 3.31-5.77 ms, 3.80-4.69 ms, and 4.05-4.23 ms. Ideal fair share: 110 kreq/s (1 kB requests)]

Page 25: Towards Predictable Multi-Tenant Shared Cloud Storage

PISCES Provides Strong Performance Isolation

2x-demand vs. 1x-demand tenants (equal weights)

[Plot: GET requests (kreq/s) vs. time (s) for Membase (no queuing), FQ, Replica Selection, and FQ + Replica Selection, with min-max ratios of 0.42, 0.50, 0.50, and 0.97 MMR; each configuration shows two latency spreads, one per demand class (3.94-4.99 / 3.27-4.14 ms, 4.29-6.15 / 2.41-3.72 ms, 4.16-4.27 / 3.57-4.04 ms, 5.40-5.45 / 2.78-2.83 ms)]

Page 26: Towards Predictable Multi-Tenant Shared Cloud Storage

PISCES Achieves Dominant Resource Fairness

Equal global weight, differing resource workloads

[Plots: bandwidth (Mb/s), GET requests (kreq/s), and latency (ms) vs. time (s); each tenant receives 76% of the effective bandwidth or 76% of the effective throughput on its bottleneck resource]

Page 27: Towards Predictable Multi-Tenant Shared Cloud Storage

PISCES Achieves System-wide Weighted Fairness

Differentiated global weights (4-3-2-1)

[Plots: GET requests (kreq/s), latency (ms), and fraction of requests vs. time (s)]

Page 28: Towards Predictable Multi-Tenant Shared Cloud Storage

PISCES Achieves Local Weighted Fairness

Equal global weights, staggered local weights

[Plots: GET requests (kreq/s), latency (ms), and fraction of requests vs. time (s)]

Page 29: Towards Predictable Multi-Tenant Shared Cloud Storage

PISCES Achieves Local Weighted Fairness

Differentiated (staggered) local weights

[Plot: GET requests (kreq/s) vs. time (s) on Servers 1-4, with staggered per-server tenant weights wt = 4, 3, 2, 1]

Page 30: Towards Predictable Multi-Tenant Shared Cloud Storage

PISCES Adapts To Shifting Tenant Distributions

[Figure: tenants 1-4, with weights of 1x, 2x, 1x, and 2x, shift their demand across servers 1-4]

Page 31: Towards Predictable Multi-Tenant Shared Cloud Storage

PISCES Adapts To Shifting Tenant Distributions

[Plots: global tenant throughput and tenant latency (1 s average) for tenants 1-4; per-server tenant weights on servers 1-4 adapt as demand shifts]

Page 32: Towards Predictable Multi-Tenant Shared Cloud Storage

Conclusion

•PISCES achieves:
- System-wide per-tenant fair sharing
- Strong performance isolation
- Low operational overhead (< 3%)

•PISCES combines:
- Partition Placement: finds a feasible fair allocation (TBD)
- Weight Allocation: adapts to (shifting) per-tenant demand
- Replica Selection: distributes load according to local weights
- Fair Queuing: enforces per-tenant fairness and isolation

Page 33: Towards Predictable Multi-Tenant Shared Cloud Storage

Future Work

•Implement partition placement (for T >> N)

•Generalize the fairness mechanisms to other services and resources (CPU, memory, disk)

•Scale the evaluation to a larger test-bed (simulation)

Thanks!