1
RAMCloud: Scalable High-Performance Storage Entirely in DRAM
John Ousterhout, Parag Agrawal, David Erickson, Christos Kozyrakis, Jacob Leverich, David Mazières, Subhasish Mitra, Aravind Narayanan,
Guru Parulkar, Mendel Rosenblum, Stephen M. Rumble, Eric Stratmann, and Ryan Stutsman
Stanford University
Presented by Sangjin Han (many slides are stolen from https://ramcloud.stanford.edu)
2
Features (or Goals)
• Low latency: 5-10 µs (not milliseconds)
• High throughput: 1M operations/s
• Key-value storage with 1000s of servers
• No replicas
• Entirely in DRAM
  – Disks only for backup
• Fast recovery (1-2 secs)
3
Why DRAM?
                                    Mid-1980s   2009       Change
Disk capacity                       30 MB       500 GB     16,667x
Max. transfer rate                  2 MB/s      100 MB/s   50x
Latency (seek & rotate)             20 ms       10 ms      2x
Capacity/bandwidth (large blocks)   15 s        5,000 s    333x
Capacity/bandwidth (1 KB blocks)    600 s       58 days    8,333x
Jim Gray's Rule (1 KB)              5 min.      30 hours   360x
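The "capacity/bandwidth" rows above can be reproduced with a few lines of arithmetic. The sketch below assumes decimal units (1 GB = 1,000 MB), which is how the slide's numbers come out; function names are illustrative.

```python
# Reproduce the capacity/bandwidth figures from the table above.
# Assumption: decimal units (1 GB = 1,000 MB), matching the slide.

def scan_time_s(capacity_mb, xfer_mb_s):
    """Time to read the whole disk sequentially (large blocks)."""
    return capacity_mb / xfer_mb_s

def random_read_time_s(capacity_mb, latency_ms, block_kb=1):
    """Time to read the whole disk in random 1 KB accesses, where each
    access is dominated by seek + rotational latency."""
    accesses = capacity_mb * 1000 / block_kb
    return accesses * latency_ms / 1000

print(scan_time_s(30, 2))                        # mid-1980s: 15 s
print(scan_time_s(500_000, 100))                 # 2009: 5,000 s
print(random_read_time_s(30, 20))                # mid-1980s: 600 s
print(random_read_time_s(500_000, 10) / 86400)   # 2009: ~58 days
```

The point of the slide falls out of the last two lines: sequential bandwidth grew 50x while random access barely improved, so reading a modern disk in small blocks takes weeks.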
                    Today    5-10 years
# servers           1,000    4,000
GB/server           64 GB    256 GB
Total capacity      64 TB    1 PB
Total server cost   $4M      $6M
$/GB                $60      $6
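The $/GB row is just the two rows above it divided; a quick check (decimal units assumed, 1 TB = 1,000 GB):

```python
# Sanity-check the $/GB row of the cost table above.
def dollars_per_gb(total_cost, total_gb):
    return total_cost / total_gb

print(dollars_per_gb(4_000_000, 64_000))     # today: $62.5/GB (slide rounds to $60)
print(dollars_per_gb(6_000_000, 1_000_000))  # in 5-10 years: $6/GB
```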
June 3, 2011 RAMCloud Overview & Status Slide 4

Data Model

Tables of objects; each object:
  Identifier (64b)
  Version (64b)
  Blob (≤1 MB)

Operations:
  create(tableId, blob) => objectId, version
  read(tableId, objectId) => blob, version
  write(tableId, objectId, blob) => version
  cwrite(tableId, objectId, blob, version) => version
    (only overwrites if version matches)
  delete(tableId, objectId)
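The data model above can be sketched as a tiny in-memory key-value table. The method names mirror the slide's API; the dict standing in for DRAM, the `Table` class, and the `VersionMismatch` exception are all illustrative, not RAMCloud's actual implementation.

```python
# Minimal sketch of the RAMCloud data model: versioned objects in a table,
# with cwrite() as a conditional (compare-version-and-swap) write.

class VersionMismatch(Exception):
    """Raised when cwrite()'s version does not match the stored version."""

class Table:
    def __init__(self):
        self._objects = {}   # objectId -> (blob, version); stands in for DRAM
        self._next_id = 0

    def create(self, blob):
        object_id, version = self._next_id, 1
        self._next_id += 1
        self._objects[object_id] = (blob, version)
        return object_id, version

    def read(self, object_id):
        return self._objects[object_id]          # (blob, version)

    def write(self, object_id, blob):
        _, version = self._objects[object_id]
        self._objects[object_id] = (blob, version + 1)
        return version + 1

    def cwrite(self, object_id, blob, version):
        # Only overwrite if the caller's version matches (optimistic locking).
        _, current = self._objects[object_id]
        if current != version:
            raise VersionMismatch(f"expected {version}, found {current}")
        return self.write(object_id, blob)

    def delete(self, object_id):
        del self._objects[object_id]
```

A typical use of `cwrite` is read-modify-write without locks: read an object and its version, compute a new blob, and let the write fail if anyone else updated the object in between.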
5

Overall Architecture

[Architecture diagram: 1,000-100,000 application servers, each running the
client library (Appl. + Library), reach 1,000-10,000 storage servers
(32-64 GB/server, each a Master plus a Backup) through the datacenter
network; a Coordinator manages the cluster.]
6
Per-Node Architecture
• Each object lives in DRAM on exactly one server (its master)
• Each write is logged to 2-3 backups via buffered logging
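The write path implied by these bullets can be sketched as follows. The idea is that a write updates the master's DRAM and appends a log entry to the in-memory buffers of a few backups; backups flush buffered entries to disk asynchronously, so no write waits on a disk. All class and method names here are illustrative assumptions, not RAMCloud's code.

```python
# Sketch of buffered logging: DRAM write + in-memory log copies on backups,
# with disk flushes deferred (in practice, asynchronous).

class Backup:
    def __init__(self):
        self.buffer = []   # log entries still in DRAM
        self.disk = []     # entries persisted to "disk"

    def append(self, entry):
        self.buffer.append(entry)        # fast path: memory only

    def flush(self):
        # In a real system this runs in the background, a segment at a time.
        self.disk.extend(self.buffer)
        self.buffer.clear()

class Master:
    def __init__(self, backups):
        self.dram = {}                   # the only full copy of the data
        self.backups = backups           # e.g. 2-3 backups per master

    def write(self, key, value):
        self.dram[key] = value
        entry = (key, value)
        for b in self.backups:
            b.append(entry)              # replicated in memory before ack
```

Durability comes from having the entry in several backups' memories across machines; the disks only matter for recovering after a crash.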
7
Coordinator
• Centralized server for data placement
• Clients obtain a map from the coordinator
  – And cache it
• E.g.:

  Table #   First Key   Last Key      Server
  12        0           2^64 - 1      192.168.0.1
  47        63,742      5,723,742     192.168.0.2
  …         …           …             …
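A client-side lookup over a map like the one above is a range search: find the tablet whose key range covers the requested key, and send the RPC to that server. The sketch below uses Python's `bisect` for the search; the `TabletMap` class and its methods are illustrative, not RAMCloud's actual data structures.

```python
# Sketch of the coordinator's map, as cached and queried by a client.
import bisect

class TabletMap:
    def __init__(self):
        # table_id -> sorted list of (first_key, last_key, server)
        self._tablets = {}

    def add(self, table_id, first_key, last_key, server):
        entries = self._tablets.setdefault(table_id, [])
        bisect.insort(entries, (first_key, last_key, server))

    def locate(self, table_id, key):
        """Return the server holding `key` in `table_id`."""
        entries = self._tablets[table_id]
        # Rightmost tablet whose first_key <= key:
        i = bisect.bisect_right(entries, (key, float("inf"), "")) - 1
        first, last, server = entries[i]
        assert first <= key <= last, "key not covered by any tablet"
        return server

# The two example rows from the slide:
m = TabletMap()
m.add(12, 0, 2**64 - 1, "192.168.0.1")
m.add(47, 63_742, 5_723_742, "192.168.0.2")
print(m.locate(12, 1_000))      # 192.168.0.1
print(m.locate(47, 100_000))    # 192.168.0.2
```

Because clients cache the map, a lookup costs no network round trip; the coordinator is only consulted on a cache miss or when a tablet moves.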
8
Fast Recovery (First Try)

[Diagram: a single Recovery Master reads the Crashed Master's log from its
Backups' disks]

• Reconstruct data from backup logs
• Bottleneck: disk bandwidth
• Solution: more disks
9
Fast Recovery (Second Try)

[Diagram: log shards scattered across ~1,000 Backups, all read by a single
Recovery Master]

• Randomly distribute log shards
• Bottleneck: the single recovery master
• Solution: no single recovery master
10
Fast Recovery (Third Try)

[Diagram: many Recovery Masters rebuild the Crashed Master's data in
parallel from the Backups]

• Temporarily spread recovery across many recovery masters
• Happy?
  – 35 GB recovered in 1.6 s using 60 nodes (SOSP 2011)
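The progression of the three tries can be captured in a back-of-the-envelope model: recovery is gated by whichever is slower, the backups reading the log from disk or the recovery masters replaying it. The bandwidth figures below (100 MB/s per disk, 600 MB/s replay per master) are illustrative assumptions, not measured RAMCloud numbers.

```python
# Rough model of recovery time: the read stage (backup disks) and the
# replay stage (recovery masters) overlap, so the slower one dominates.
# Decimal units assumed (1 GB = 1,000 MB).

def recovery_time_s(data_gb, n_backups, disk_mb_s, n_masters, replay_mb_s):
    data_mb = data_gb * 1000
    read_s = data_mb / (n_backups * disk_mb_s)       # backups read the log
    replay_s = data_mb / (n_masters * replay_mb_s)   # masters rebuild data
    return max(read_s, replay_s)

# First try: few disks, one master -> disk bandwidth dominates (~2 minutes).
print(recovery_time_s(35, n_backups=3, disk_mb_s=100, n_masters=1, replay_mb_s=600))
# Second try: ~1,000 backups, still one master -> replay dominates (~1 minute).
print(recovery_time_s(35, n_backups=1000, disk_mb_s=100, n_masters=1, replay_mb_s=600))
# Third try: many masters too -> both stages parallel, seconds-scale recovery.
print(recovery_time_s(35, n_backups=1000, disk_mb_s=100, n_masters=60, replay_mb_s=600))
```

Under these assumed bandwidths the third configuration lands around one second for 35 GB, which is in the same ballpark as the 1.6 s / 60 nodes result cited above.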
11
Typical Datacenter Latency
12
RAMCloud Latency
13
Publication History
• Unveiling: SOSP 2009 WIP
• White paper: Operating Systems Review 2009
  – What you read
• (Call for) low latency: HotOS 2011
• Fast recovery: SOSP 2011
• Ongoing work: index, transaction, transport, …
• Visit http://ramcloud.stanford.edu
14
Thoughts & Discussion
• Garbage collection?
• InfiniBand vs. Ethernet
  – HW support for the transport layer?
• Killer app?
• Non-volatile memory?
• Synchronous vs. asynchronous queries
• Moving data vs. moving code
• How many papers out of this project? :P
15
Thanks
16
SORRY
• My English still s^^ks
• Would you repeat the question?
  – SLOWLY and
  – CLEARLY
???