From application requests to Virtual IOPs: Provisioned key-value storage with Libra

Preview:

DESCRIPTION

From application requests to Virtual IOPs: Provisioned key-value storage with Libra. David Shue * and Michael J. Freedman. (*now at Google). Shared Cloud. Tenant C. Tenant A. Tenant B. VM. VM. VM. VM. VM. VM. VM. VM. VM. Shared Cloud Storage. Tenant C. Tenant A. Tenant B. VM. - PowerPoint PPT Presentation

Citation preview

PrincetonUniversity

From application requests to Virtual IOPs:Provisioned key-value storage with Libra

David Shue* and Michael J. Freedman

(*now at Google)

Shared Cloud

Tenant A Tenant BVM VM VM VM VM VM

Tenant CVM VM VM

Shared Cloud Storage

Tenant A Tenant BVM VM VM VM VM VM

Key-Value Storage

Block Storage

SQL Database

Tenant CVM VM VM

Unpredictable Shared Cloud Storage

Tenant A Tenant BVM VM VM VM VM VM

Key-Value Storage

Block Storage

SQL Database

Tenant CVM VM VM

Disk IO-bound TenantsSSD-backed storage

Key-Value Storage

Provisioned Shared Key-Value Storage

Tenant AVM VM VM

Tenant BVM VM VM

Tenant CVM VM VM

Application Requests

Shared Key-Value Storage

Low-level IO

SSDSSD SSD SSD SSD

GET/sPUT/s

IOPsBW

ReservationA ReservationB ReservationC

(1KB normalized)

6

Libra Contributions

•Libra IO Scheduler- Provisions low-level IO allocations for app-

request reservations w/ high utilization.

- Supports arbitrary object distributions and workloads.

•2 key mechanisms- Track per-tenant app-request resource profiles.

- Model IO resources with Virtual IOPs.

7

Related Work

Storage Type

App-reques

ts

WorkConservin

g

Media

Maestro Block N N HDD

mClock Block N Y HDD

FlashFQ Block N Y SSD

DynamoDB

Key-Value

Y N SSD

Provisioned Distributed Key-Value Storage

Tenant A Tenant BVM VM VM VM VM VM

ReservationA ReservationB Tenant B

VM VM VM

Storage Node N

Storage Node 1 ...Shared Key-Value Storage

Global Reservation Problem

Local Reservation Problem

[Pisces OSD1 ’12]

9

Storage Node N

Provisioned Distributed Key-Value Storage

ReservationA ReservationB

Storage Node 1

data partitionsdem

an

d

...

Tenant AVM VM VM

Tenant BVM VM VM

10

Storage Node N

Storage Node 1

Provisioned Distributed Key-Value Storage

ReservationA ReservationB

...

ReservationA ReservationB

11

Storage Node N

Storage Node 1

Provisioned Distributed Key-Value Storage

...

ResAn ResBnResA1 ResB1 ReservationA ReservationB

ReservationA ReservationB

12

Key-Value Protocol

Persistence Engine

IO Scheduler

Physical Disk

Retrieve K

Read l337

IO operation

Provisioned Local Key-Value Storage

GET K

GET 1001100

Libra IO Scheduler

13

blah

blah

Libra Design

LibraProvisionin

gPolicy

Libra IO Scheduler

GET

PU

T

How much IO to consume?

Reservation

Distribution Policy

How much IO to provision?

Persistence Engine

Physical Disk

DRR

14

Provisioning App-request Reservations is Hard

IO Amplification

IO Interference

Non-linear IO Performance

1 KB PUT ≥ 1 KB Write

Variable IO throughput

Underestimate provisionable IO

Non-linear cost per KB

Model IO with Virtual IOPs

Track tenant app-request resource profiles

15

Workload-dependent IO AmplificationPU

T PUT K,V3

V3V2

FLU

SH

V3 indexA M

LevelDB (LSM-Tree)

CO

MPA

CT

V1 indexH V

V0 indexF O

V3 index

A O

16

Workload-dependent IO AmplificationPU

TFL

US

HC

OM

PA

CT

PUT K,V3

V3V2

V3 indexA M

V1 indexH V

V0 indexF O

V3 index

A O

LevelDB (LSM-Tree)

17

Workload-dependent IO AmplificationPU

TFL

US

HC

OM

PA

CT

PUT K,V3

V3V2

V3 indexA M

V1 indexH V

V0 indexF O

V3 index

A O

LevelDB (LSM-Tree)

18

Workload-dependent IO Amplification

GET K

indexA M

indexH V

indexF O

K indexG Z

19

Libra Tracks App-request IO Consumption to Determine IO

Allocations

5 GET

25 PUT

FLUSH

COMPACT

Per-PUT

Per-GET

Compute app-request IO

profiles

GET

x

x PU

T IO=

Tenant A 5 IO units

PUT

FLUSH

Track IOconsumption

Provision IOallocations

blah

blah

LibraProvisionin

gPolicy

IOTenant A 500 IO/s

5

100

50

1

+6

10.5

80

70 500

20

Unpredictable IO Interference

Die-level parallelism, low latency IOPs

4 read/4 write tenants

Shared-controller and bus contention

Erase-before-write overhead

FTL and read-modify-write garbage colleciton

21

Unpredictable IO Interference

21

22

Libra Underestimates IO Capacity to Ensure Provisionable Throughput

Provisionable IO throughput = floor(workloads)

(18 Kop/s)Pro

vis

ionable

IO

lim

it

23

Libra Underestimates IO Capacity to Ensure Provisionable Throughput

Provisionable IO throughput = floor(workloads)

(18 Kop/s)Pro

vis

ionable

IO

lim

it

24

Libra Underestimates IO Capacity to Ensure Provisionable Throughput

Provisionable IO throughput = floor(workloads)

(18 Kop/s)Pro

vis

ionable

IO

lim

it

25

Non-linear IO Performance

Max BW Max IOP/s

26

Non-linear IO Performance

Max BW Max IOP/s

27

Libra Uses Virtual IOPs to Model IO Resources

Unifies IO cost into a single metric

Captures non-linear IO performance

Provides IO insulation

2 equal-allocation tenantsIO Insulation = 1/2 Max

Read/Write

28

Libra Uses Virtual IOPs to Model IO Resources

2 equal-allocation tenantsIO Insulation = 1/2 Max

Read/Write

29

blah

blah

Libra Design

LibraProvisionin

gPolicy

Libra IO Scheduler

Persistence Engine

Physical Disk

Charge tenant IOPs based on

VOP cost

Track app-request VOP consumption

Provision VOPs within

provisionable limit

Update tenant VOP

allocations

30

Evaluation

• Does Libra's IO resource model achieve accurate resource allocations?

• Does Libra's IO threshold make an acceptable tradeoff of performance for predictability in a real storage stack?

• Can Libra ensure per-tenant app-request reservations while achieving high utilization?

31

Interference-free Ideal

Libra Achieves Accurate IO Allocations

Throughput Ratio = Actual / Expected (IO Insulation)

Read 1 KB

even

32

Interference-free Ideal

Libra Achieves Accurate IO Allocations

Write 1-256 KB

Throughput Ratio = Actual / Expected (IO Insulation)

33

Libra Achieves Accurate IO Allocations

34

Libra Achieves Accurate IO Allocations

35

Libra Achieves Accurate IO AllocationsMin-Max Ratio =

Min Throughput Ratio / Max Throughput Ratio

36

Libra Trades-off Nominal IO Throughput For Predictability

37

Libra Trades-off Nominal IO Throughput For Predictability

< 10th percentile covered by SLA and higher-level policies

PercentileWorkloa

d10th 50th 80th All

99:1 1.6% 30.5% 40.5% 45.8%

25:75 1.4% 14.9% 25.0% 34.7%

1:99 0.7% 12.2% 19.5% 28.1%

Unprovisionable Throughput As a Percentage of Total Throughput

38

Libra Achieves App-request Reservations

Read Heavy Mixed Write Heavy

1.5x0.5x

Work-conserving

consumption of unprovisioned

resources

Fully provisioned allocations

39

Libra Achieves App-request Reservations

Read Heavy Mixed Write Heavy

40

•Libra IO Scheduler- Provisions IO allocations for app-request

reservations w/ high utilization.

- Supports arbitrary object distributions and workloads.

•2 key mechanisms

- Track per-tenant app-request resource profiles.

- Model IO resources with Virtual IOPs.

•Evaluation

- Achieves accurate low-level IO allocations.

- Provisions the majority of IO resources over a wide range of workloads

- Satisfies app-request reservations w/ high utilization.

Conclusion

Recommended