
Page 1: Towards Predictable Multi-Tenant Shared Cloud Storage

Princeton University

Towards Predictable Multi-Tenant Shared Cloud Storage

David Shue*, Michael Freedman*, and Anees Shaikh✦

*Princeton ✦IBM Research

Page 2: Towards Predictable Multi-Tenant Shared Cloud Storage

Shared Services in the Cloud

[Figure: tenants Y, Z, T, and F each use shared cloud services: DynamoDB, S3, EBS, SQS]

Page 3: Towards Predictable Multi-Tenant Shared Cloud Storage

Co-located Tenants Contend For Resources

[Figure: tenants Z, Y, T, and F are co-located across the four nodes of a shared storage service; two tenants ramp to 2x demand]

Page 4: Towards Predictable Multi-Tenant Shared Cloud Storage

Co-located Tenants Contend For Resources

[Figure: tenants Y and F each drive 2x demand against the shared storage nodes]

Page 5: Towards Predictable Multi-Tenant Shared Cloud Storage

Co-located Tenants Contend For Resources

[Figure: with Y and F at 2x demand, co-located tenants interfere on every node]

Resource contention = unpredictable performance

Page 6: Towards Predictable Multi-Tenant Shared Cloud Storage

Towards Predictable Shared Cloud Storage

[Figure: the shared storage service (SS nodes) serving tenants Z, Y, T, and F]

Per-tenant resource allocation and performance isolation

Page 7: Towards Predictable Multi-Tenant Shared Cloud Storage

Towards Predictable Shared Cloud Storage

[Figure: tenants Zynga, Yelp, TP, and Foursquare on the shared storage service, each capped by a hard rate limit (80, 120, 160, and 40 kreq/s)]

Hard limits are too restrictive and achieve lower utilization.

Page 8: Towards Predictable Multi-Tenant Shared Cloud Storage

Towards Predictable Shared Cloud Storage

[Figure: the same tenants share the service under weights wz = 20%, wy = 30%, wt = 10%, wf = 40%; e.g., tenant Z demands 40% but receives ratez = 30%, while tenant F demands only 30%]

Goal: per-tenant max-min fair share of system resources
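
To make the goal concrete, here is a minimal Python sketch (my illustration, not code from the talk) of weighted max-min fairness via progressive filling: tenants demanding less than their weighted share are capped at their demand, and the excess is redistributed to the remaining tenants in proportion to weight. With the slide's numbers it reproduces ratez = 30% for demandz = 40%.

    # Illustrative sketch: weighted max-min fairness via progressive filling.
    # Weights and demands are fractions of total system capacity.
    def weighted_max_min(weights, demands, capacity=1.0):
        alloc = {t: 0.0 for t in weights}
        active = set(weights)
        remaining = capacity
        while active and remaining > 1e-12:
            total_w = sum(weights[t] for t in active)
            # Tentative share of the remaining capacity, by weight.
            shares = {t: remaining * weights[t] / total_w for t in active}
            satisfied = {t for t in active if alloc[t] + shares[t] >= demands[t]}
            if not satisfied:
                for t in active:
                    alloc[t] += shares[t]
                break
            # Cap satisfied tenants at their demand; recycle the excess.
            for t in satisfied:
                remaining -= demands[t] - alloc[t]
                alloc[t] = demands[t]
            active -= satisfied
        return alloc

    # The slide's example: Z demands 40% but its fair rate is 30%.
    weights = {"Z": 0.20, "Y": 0.30, "T": 0.10, "F": 0.40}
    demands = {"Z": 0.40, "Y": 0.30, "T": 0.10, "F": 0.30}
    print(weighted_max_min(weights, demands))
    # -> Z: 0.30, Y: 0.30, T: 0.10, F: 0.30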

Page 9: Towards Predictable Multi-Tenant Shared Cloud Storage

PISCES: Predictable Shared Cloud Storage

•PISCES Goals
- Allocate a per-tenant fair share of total system resources
- Isolate tenant request performance
- Minimize overhead and preserve system throughput

•PISCES Mechanisms (from the minutes timescale down to microseconds)
- Partition Placement (Migration)
- Local Weight Allocation
- Replica Selection
- Fair Queuing

Page 10: Towards Predictable Multi-Tenant Shared Cloud Storage

Place Partitions By Fairness Constraints

[Figure: tenants A-D, each a set of VMs with global weights weightA-weightD, map their keyspaces onto PISCES nodes 1-4; each keyspace is split into partitions of varying popularity]

Page 11: Towards Predictable Multi-Tenant Shared Cloud Storage

Place Partitions By Fairness Constraints

[Figure: the controller computes a feasible partition placement (PP) subject to per-node constraints Rate A < wA, Rate B < wB, Rate C < wC; one node is over-loaded]
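
The constraint on each node is that the resident partitions' demanded rates fit within capacity, so every tenant's rate stays under its weighted share. As a toy illustration of that feasibility check (a first-fit-decreasing heuristic of my own, not the paper's placement algorithm):

    # Hypothetical sketch: a placement is feasible when no node's resident
    # demand exceeds its capacity. First-fit-decreasing bin packing.
    def place_partitions(partition_rates, nodes, capacity):
        load = {n: 0.0 for n in nodes}
        placement = {}
        # Heaviest partitions first, each onto the least-loaded node.
        for pid, rate in sorted(partition_rates.items(), key=lambda kv: -kv[1]):
            node = min(nodes, key=lambda n: load[n])
            if load[node] + rate > capacity:
                raise ValueError(f"no feasible placement found for {pid}")
            load[node] += rate
            placement[pid] = node
        return placement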

Page 12: Towards Predictable Multi-Tenant Shared Cloud Storage

Place Partitions By Fairness Constraints

[Figure: after migration, the controller's computed placement satisfies the fairness constraints on every node]

Page 13: Towards Predictable Multi-Tenant Shared Cloud Storage

PISCES Mechanisms

[Timescale diagram: the controller drives Partition Placement (Migration) at the minutes timescale]

Page 14: Towards Predictable Multi-Tenant Shared Cloud Storage

Allocate Local Weights By Tenant Demand

[Figure: each tenant's global weight is split into per-node local weights, with wA = wa1 + wa2 + wa3 + wa4 and likewise for wB, wC, and wD. Demands are skewed: RA > wA and RD > wD, while RB < wB and RC < wC]

Page 15: Towards Predictable Multi-Tenant Shared Cloud Storage

Allocate Local Weights By Tenant Demand

Swap weights to minimize tenant latency (demand)

[Figure: the controller swaps local weight between tenants across nodes (A→C, C→A, D→B, C→D, B→C) so each node's local weights track local demand]
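
A rough sketch of the idea (names and step size are my own simplification): on each node, shift a little weight from the tenant with the largest surplus to the tenant with the largest deficit. PISCES itself pairs swaps across nodes so each tenant's global weight is preserved; this per-node version preserves each node's total weight instead.

    # Toy sketch: move local weight from over- to under-provisioned tenants.
    def swap_local_weights(local_weight, local_demand, step=0.01):
        for node, weights in local_weight.items():
            demand = local_demand[node]
            surplus = {t: weights[t] - demand[t] for t in weights}
            donor = max(surplus, key=surplus.get)      # most over-provisioned
            recipient = min(surplus, key=surplus.get)  # most under-provisioned
            if surplus[donor] > 0 > surplus[recipient]:
                delta = min(step, surplus[donor], -surplus[recipient])
                weights[donor] -= delta
                weights[recipient] += delta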

Page 16: Towards Predictable Multi-Tenant Shared Cloud Storage

Achieving System-wide Fairness

[Timescale diagram: the controller drives Partition Placement (Migration) at minutes and Local Weight Allocation at seconds]

Page 17: Towards Predictable Multi-Tenant Shared Cloud Storage

Select Replicas By Local Weight

[Figure: request routers (RR) initially split a tenant's GETs 1/2-1/2 across two replicas, leaving tenant C over-loaded on one node and under-utilized on the other; the replica-selection policy (RS) chooses replicas based on node latency]

Page 18: Towards Predictable Multi-Tenant Shared Cloud Storage

Select Replicas By Local Weight

[Figure: with latency-aware replica selection (RS), the routers shift the tenant's GET split to 1/3-2/3, matching the nodes' local weights]
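
A minimal sketch of latency-driven replica weighting, assuming per-replica latency estimates are available (the slide only specifies that replicas are selected by node latency; the smoothing rule below is my illustration). If one replica is twice as slow as the other, an initial 1/2-1/2 split drifts toward the 1/3-2/3 split shown above.

    # Send weights drift toward the inverse of each replica's observed
    # latency, so load drains from over-loaded replicas.
    import random

    def update_weights(weights, latencies, gain=0.1):
        inverse = {r: 1.0 / latencies[r] for r in weights}
        total = sum(inverse.values())
        for r in weights:
            target = inverse[r] / total
            weights[r] += gain * (target - weights[r])  # smooth toward target

    def pick_replica(weights):
        replicas = list(weights)
        return random.choices(replicas, weights=[weights[r] for r in replicas])[0]

    # Hypothetical replicas n1 and n2, where n2 is twice as fast as n1.
    w = {"n1": 0.5, "n2": 0.5}
    for _ in range(100):
        update_weights(w, {"n1": 2.0, "n2": 1.0})
    print(w)  # converges to ~{'n1': 0.33, 'n2': 0.67}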

Page 19: Towards Predictable Multi-Tenant Shared Cloud Storage

Achieving System-wide Fairness

[Timescale diagram: controller-driven Partition Placement (minutes) and Local Weight Allocation (seconds), plus Replica Selection at the request routers (RR)]

Page 20: Towards Predictable Multi-Tenant Shared Cloud Storage

Fair Queuing

Queue Tenants By Dominant Resource

[Figure: two equal-weight tenants share a node; one is bandwidth-limited (out bytes dominant), the other request-limited. Fair-sharing out bytes alone splits that one resource evenly, ignoring each tenant's actual bottleneck]

Page 21: Towards Predictable Multi-Tenant Shared Cloud Storage

Queue Tenants By Dominant Resource

[Figure: with dominant-resource fair sharing, each tenant receives an equal share of its dominant resource on the shared out-bytes bottleneck]
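
A hedged sketch of the scheduling rule (my paraphrase of dominant-resource fairness, not PISCES's actual queuing code): charge each request's resource-cost vector (in bytes, out bytes, request count), and always serve the backlogged tenant whose dominant share is smallest.

    # Dominant-resource fair queuing sketch. `usage` maps tenants to their
    # weight-normalized resource consumption; `queues` holds pending requests.
    def dominant_share(usage, weight, tenant):
        # Largest weight-normalized consumption, e.g. out-bytes for a
        # bandwidth-bound tenant, request count for a request-bound one.
        return max(c / weight[tenant] for c in usage[tenant].values())

    def dequeue_next(usage, weight, queues):
        backlogged = [t for t in queues if queues[t]]
        tenant = min(backlogged, key=lambda t: dominant_share(usage, weight, t))
        request = queues[tenant].pop(0)
        for resource, cost in request["cost"].items():
            usage[tenant][resource] += cost
        return tenant, request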

Page 22: Towards Predictable Multi-Tenant Shared Cloud Storage

Achieving System-wide Fairness

[Timescale diagram: the full stack: controller-driven Partition Placement (minutes) and Local Weight Allocation (seconds), Replica Selection at the request routers, and Fair Queuing at every node (microseconds)]

Page 23: Towards Predictable Multi-Tenant Shared Cloud Storage

Evaluation

•Does PISCES achieve system-wide fairness?

•Does PISCES provide performance isolation?

•Can PISCES adapt to shifting tenant distributions?

[Testbed: 8 YCSB client machines and 8 PISCES server nodes connected by Gigabit Ethernet through a ToR switch]

Page 24: Towards Predictable Multi-Tenant Shared Cloud Storage

PISCES Achieves System-wide Fairness

[Plot: GET requests (kreq/s) vs. time (s) under four configurations: Membase (no queuing), FQ, Replica Selection, and FQ + Replica Selection, with min-max ratios of 0.68, 0.51, 0.79, and 0.97 MMR and latency spreads of 3.77-5.30 ms, 3.31-5.77 ms, 3.80-4.69 ms, and 4.05-4.23 ms. Ideal fair share: 110 kreq/s (1 kB requests)]

Page 25: Towards Predictable Multi-Tenant Shared Cloud Storage

PISCES Provides Strong Performance Isolation

2x-demand vs. 1x-demand tenants (equal weights)

[Plot: GET requests (kreq/s) vs. time (s) for Membase (no queuing), FQ, Replica Selection, and FQ + Replica Selection, with min-max ratios of 0.42, 0.50, 0.50, and 0.97 MMR; each configuration shows two latency spreads, one per demand class (3.94-4.99 / 3.27-4.14 ms, 4.29-6.15 / 2.41-3.72 ms, 4.16-4.27 / 3.57-4.04 ms, 5.40-5.45 / 2.78-2.83 ms)]

Page 26: Towards Predictable Multi-Tenant Shared Cloud Storage

PISCES Achieves Dominant Resource Fairness

Equal global weight, differing resource workloads

[Plots: bandwidth (Mb/s), GET requests (kreq/s), and latency (ms) vs. time (s); each tenant receives 76% of the effective bandwidth or 76% of the effective throughput on its bottleneck resource]

Page 27: Towards Predictable Multi-Tenant Shared Cloud Storage

PISCES Achieves System-wide Weighted Fairness

Differentiated global weights (4-3-2-1)

[Plots: GET requests (kreq/s), latency (ms), and fraction of requests vs. time (s)]

Page 28: Towards Predictable Multi-Tenant Shared Cloud Storage

PISCES Achieves Local Weighted Fairness

Equal global weights, staggered local weights

[Plots: GET requests (kreq/s), latency (ms), and fraction of requests vs. time (s)]

Page 29: Towards Predictable Multi-Tenant Shared Cloud Storage

PISCES Achieves Local Weighted Fairness

Differentiated (staggered) local weights

[Plot: GET requests (kreq/s) vs. time (s) on Servers 1-4, with staggered per-server tenant weights wt = 4, 3, 2, 1]

Page 30: Towards Predictable Multi-Tenant Shared Cloud Storage

PISCES Adapts To Shifting Tenant Distributions

[Figure: tenants 1-4, with weights of 1x, 2x, 1x, and 2x, shift their demand across servers 1-4]

Page 31: Towards Predictable Multi-Tenant Shared Cloud Storage

PISCES Adapts To Shifting Tenant Distributions

[Plots: global tenant throughput and tenant latency (1 s average) for tenants 1-4; per-server tenant weights on servers 1-4 adapt as demand shifts]

Page 32: Towards Predictable Multi-Tenant Shared Cloud Storage

Conclusion

•PISCES achieves:
- System-wide per-tenant fair sharing
- Strong performance isolation
- Low operational overhead (< 3%)

•PISCES combines:
- Partition Placement: finds a feasible fair allocation (TBD)
- Weight Allocation: adapts to (shifting) per-tenant demand
- Replica Selection: distributes load according to local weights
- Fair Queuing: enforces per-tenant fairness and isolation

Page 33: Towards Predictable Multi-Tenant Shared Cloud Storage

Future Work

•Implement partition placement (for T >> N)

•Generalize the fairness mechanisms to other services and resources (CPU, memory, disk)

•Scale the evaluation to a larger test-bed (simulation)

Thanks!