Consistent and Durable Data Structures for
Non-Volatile Byte-Addressable Memory
Shivaram Venkataraman*†, Niraj Tolia‡, Parthasarathy Ranganathan* and Roy H. Campbell†
*HP Labs, Palo Alto, ‡Maginatics, and †University of Illinois, Urbana-Champaign
3/4/11 2
Non-Volatile Byte-Addressable Memory (NVBM)
Phase Change Memory, Memristor
§ 50-150 nanoseconds
§ Scalable
§ Non-Volatile
§ Lower energy
Access Times
[Figure: access times in nanoseconds, log-scale axis from 1 to 10,000,000]
Processor clock cycle – 1 ns
Access L2 cache – 10 ns
Update DRAM – 55 ns
Writes to PCM / Memristor – 100-150 ns
Write to SLC Flash – 200 μs
Hard Disk Writes – 3 ms
Data Stores - Disk
[Diagram: Core1 and Core2, each with an L1 Cache, above a shared L2 Cache, DRAM, and Disk]
Traditional DB, File systems
Data Stores - DRAM
[Diagram: Core1 and Core2, each with an L1 Cache, above a shared L2 Cache and DRAM; Commit Log - Disk]
RAMCloud, memcached, Memory-based DB
Data Stores - NVBM
[Diagram: Core1 and Core2, each with an L1 Cache, above a shared L2 Cache; labels for DRAM and Non-Volatile Memory]
Single-level store
Challenges
[Figure: diagram with node values 10, 5, 20, 15, 2, 1]
Consistency, Durability
Outline
§ Motivation
§ Consistent durable data structures
§ Consistent durable B-Tree
§ Tembo – Distributed Data Store Implementation
§ Evaluation
Consistent Durable Data Structures
§ Versioning for consistency across failures
§ Restore to last consistent version on recovery
§ Atomic change across versions
§ No new processor extensions!
Versioning
§ Totally ordered – Increasing natural numbers
§ Every update creates a new version
§ Last consistent version
  § Stored in a well-known location
  § Used by reader threads and for recovery
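The versioning rules above can be illustrated with a minimal sketch. It is not the CDDS implementation: the class `VersionedStore`, its method names, and the `crash_before_commit` flag are hypothetical, and an ordinary Python attribute stands in for the well-known location that holds the last consistent version.

```python
class VersionedStore:
    """Toy model: every update creates a new version number."""

    def __init__(self):
        self.entries = []          # append-only (key, value, version) records
        self.last_consistent = 0   # stand-in for the well-known location

    def update(self, key, value, crash_before_commit=False):
        new_version = self.last_consistent + 1
        self.entries.append((key, value, new_version))
        if crash_before_commit:
            return  # simulated failure: the new version is never published
        # A single assignment models the atomic change across versions.
        self.last_consistent = new_version

    def read(self, key):
        # Readers ignore anything newer than the last consistent version.
        result = None
        for k, v, version in self.entries:
            if k == key and version <= self.last_consistent:
                result = v
        return result

    def recover(self):
        # Restore to the last consistent version by dropping newer records.
        self.entries = [e for e in self.entries if e[2] <= self.last_consistent]


store = VersionedStore()
store.update("a", 1)
store.update("b", 2, crash_before_commit=True)
visible_before_recovery = store.read("b")   # unpublished update is not visible
store.recover()
```

In this sketch a partially applied update is harmless because nothing refers to it until the version number is advanced, which is the property the slide relies on for recovery.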
Consistent Durable B-Tree
B – Size of a B-Tree node
[Figure legend: each entry is shown as Key [start, end); entries are marked as live or deleted]
Lookup
Find key 20 at version 5
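A lookup at a given version can be sketched as follows, assuming each entry carries a half-open version range [start, end) as in the legend. The `Entry` class, the `lookup` function, the use of `None` for a live entry's open end, and the sample entries are all hypothetical. The sketch scans a single flat node rather than walking a B-Tree.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    key: int
    value: str
    start: int                  # first version in which the entry is visible
    end: Optional[int] = None   # None marks a live entry (assumption)

    def visible_at(self, version: int) -> bool:
        # Half-open range: visible from start up to, but not including, end.
        return self.start <= version and (self.end is None or version < self.end)


def lookup(entries, key, version):
    """Return the value for key as of the given version, or None."""
    for entry in entries:
        if entry.key == key and entry.visible_at(version):
            return entry.value
    return None


# Hypothetical node contents, not taken from the slide's figure.
node = [
    Entry(5, "a", start=2),
    Entry(20, "old", start=1, end=4),   # superseded at version 4
    Entry(20, "new", start=4),          # live
    Entry(99, "gone", start=3, end=5),  # deleted at version 5
]

found = lookup(node, 20, 5)
```

With these sample entries, finding key 20 at version 5 selects the live entry, while the same lookup at version 3 would select the older one.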
Insert / Split
Garbage Collection
Tembo – Distributed Data Store Implementation
Based on open source key-value store
Widely used in production
In-memory dataset
Tembo – Distributed Data Store Implementation
Key Value Server
Consistent durable B-Tree
Single writer, shared reader
Consistent Hashing
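The slide names consistent hashing without giving details, so the following is only a generic sketch of the technique, not Tembo's code. The `Ring` class, the server names, the MD5-based hash, and the number of virtual points per server are assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(text):
    # Any stable hash works; MD5 is used here only for determinism.
    return int(hashlib.md5(text.encode()).hexdigest(), 16)


class Ring:
    """Maps keys to servers by position on a hash ring."""

    def __init__(self, servers, points_per_server=64):
        # Each server owns several points on the ring to even out the load.
        self._points = sorted(
            (_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(points_per_server)
        )
        self._hashes = [h for h, _ in self._points]

    def server_for(self, key):
        # First point clockwise from the key's hash, wrapping at the end.
        index = bisect.bisect(self._hashes, _hash(key)) % len(self._points)
        return self._points[index][1]


ring = Ring(["server-a", "server-b", "server-c"])
smaller = Ring(["server-a", "server-b"])   # same ring after server-c leaves
keys = [f"user:{i}" for i in range(200)]

# Keys that did not live on server-c should keep their server.
moved = [
    k for k in keys
    if ring.server_for(k) != "server-c"
    and ring.server_for(k) != smaller.server_for(k)
]
```

The point of the design is that removing a server only remaps the keys that server held, so `moved` stays empty in this sketch.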
Ease of Integration
                         Lines of Code
Original STX B-Tree      2110
CDDS Modifications       1902 (90%)
Redis (v2.0.0-rc4)       18539
Tembo Modifications      321 (1.7%)
Evaluation - Setup
§ API Microbenchmarks
  § Compare with Berkeley DB
  § Tembo: Versioning vs. write-ahead logging
§ End-to-End Comparison
  § NoSQL systems – Cassandra
  § Yahoo Cloud Serving Benchmark
§ 15 node test cluster
  § 13 servers, 2 clients
  § 720 GB RAM, 120 cores
Durability - Logging vs. Versioning
[Figure: throughput (Ops/sec, axis from 0 to 14000) vs. value size (256, 1024, 4096 bytes) for Redis - BTree+Logging, Redis - Hashtable+Logging, and Tembo - CDDS BTree]
2M insert operations, two client threads
Yahoo Cloud Serving Benchmark
[Figure: throughput (Ops/sec, axis from 0 to 160000) vs. client threads (2, 10, 20, 30) for Tembo, Cassandra-inmemory, and Cassandra-disk]
Chart annotations: 286%, 44%
Furthermore
§ Algorithms for deletion
§ Analysis for space usage and height of B-Tree
§ Durability techniques for current processors
Related Work
§ Multi-version data structures
  § Used in transaction time databases
§ NVBM based systems
  § BPFS – File system (SOSP 2009)
  § NV-Heaps – Transaction Interface (ASPLOS 2011)
§ In-memory data stores
  § H-Store – MIT, Brown University, Yale University
  § RAMCloud – Stanford University
Work-in-progress
§ Robust reliability testing
§ Support for transaction-like operations
§ Integration of versioning and wear-leveling
Conclusion
§ Changes in storage media
  § Rethink software stack
§ Consistent Durable Data Structures
  § Single-level store
  § Durability through versioning
  § Up to 286% faster than memory-backed systems