Upload
mahdi-atawneh
View
261
Download
0
Embed Size (px)
Citation preview
SILT: A Memory-Efficient, High-Performance Key-Value Store Hyeontaek Lim, Bin Fan, David G. Andersen, Michael Kaminsky Carnegie Mellon University, Intel Labs
1
Outline1- Introduction.2- Problem statement.3. Contributions.
4- Experiments5- Results and conclusion
This template is free to use under Creative Commons Attribution license. If you use the graphic assets (photos, icons and typographies) provided with this presentation you must keep the Credits slide.2
1.Introduction
4
What is key-value store ?
a data storage paradigm designed for storing, retrieving, and managing associative arrays,
5
Introduction
Ecommerce (Amazon) Picture stores (facebook) Web object caching.
6
key-value stores used in:
Many projects have examined flash memory-based key-value stores ; Faster than disk, cheaper than DRAM
7
DRAM vs Flash
RAM (main memory) a bit more expensive . requires constant power. is much faster.
Flash memory (HD) low-cost . retains data when power is
removed (nonvolatile), but its performance is also
slow.
Memory overhead: Index size per entry, Ideally 0 (no memory overhead)
Read amplification: Flash reads per query, Limits query throughput, Ideally 1 (no wasted flash reads).
Write amplification: Flash writes per entry, Limits insert throughput, Also reduces flash life expectancy
8
Three Metrics to Minimize
2.MotivationProblem statement
9
MotivationAs key-value stores scale in both size and importance,
index memory efficiency is increasingly becoming one of the most important factors for the system’s scalability and overall cost effectiveness.
10
Challenge
Memory efficiency
High performance
11
This talk will introduce SILT, which uses drastically less memory than previous systems while retaining high performance.
Related WorkMany studies tried to reduce in-memory index overhead,but:
• there solutions either require more memory.
• or keeping parts of there index on disk (low performance "called read amplification")
12
3.Contributions
SILT (Small Index Large Table)
13
Contributions1. The design and implementation of three basic key-value
stores (LogStore, HashStore, and SortedStore) .2. Synthesis of these basic stores to build SILT.3. An analytic model that enables an explicit and careful
balance between memory, storage, and computation .
14
Contributions1. The design and implementation of three basic key-value
stores (LogStore, HashStore, and SortedStore) .2. Synthesis of these basic stores to build SILT.3. An analytic model that enables an explicit and careful
balance between memory, storage, and computation .
15
Basic store designLogStore, HashStore, and SortedStore . ( overall overview).
16
Basic store designLogStore, HashStore, and SortedStore . ( overall overview).
17
SILT: 1. LogStore1. Write friendly key-value store.2. Use a tag (15 bits) for an index rather than an
entire hash index (160bits) .3. A customized version of Cuckoo hashing is used.4. In-memory hash table to map contents in flash.5. Only one instance.
18
SILT: 1. LogStoreHow it works? .
19
SILT: 1. LogStore
K2h1(k2)
Tag Offset
DRAM (memory)** store short tag (15b)
FLASH (hard disk)** store the full key (160)
h2(k2)
1 2 3 4
Cuckoo hashing
20
SILT: 1. LogStore
K2h1(k2)
Tag Offset
h2(k2)
1
K2
DRAM (memory)** store short tag
FLASH (hard disk)** store the full key (160)
h2(k2)
1 2 3 4
21
SILT: 1. LogStore
K1h1(k1)
Tag Offset
h2(k2)
1
h1(k1)
2
K2 K1
DRAM (memory)** store short tag
FLASH (hard disk)** store the full key (160)
h2(k1)
1 2 3 4
22
SILT: 1. LogStore
K4h1(k2)
Tag Offset
h2(k2)
1
h1(k4)
3
h1(k1)
2
K2 K1 K4
DRAM (memory)** store short tag
FLASH (hard disk)** store the full key (160)
h2(k2)
1 2 3 4
23
SILT: 1. LogStore
K3h1(k3)
Tag Offset
h2(k2)
1
h1(k4)
3
h1(k1)
2
K2 K1 K4 K3
DRAM (memory)** store short tag
FLASH (hard disk)** store the full key (160)
h2(k3)
1 2 3 4
24
SILT: 1. LogStore
K3h1(k3)
Tag Offset
h1(k4)
3
h2(k2)
1
h1(k1)
2
K2 K1 K4 K3
DRAM (memory)** store short tag
FLASH (hard disk)** store the full key (160)
h2(k3)
1 2 3 4
25
SILT: 1. LogStore
K3h1(k3)
Tag Offset
h2(k3)
4
h1(k4)
3
h2(k2)
1
h1(k1)
2
K2 K1 K4 K3
DRAM (memory)** store short tag
FLASH (hard disk)** store the full key (160)
h2(k3)
1 2 3 4
26
SILT: 1. LogStore
K5? h1(k5)
Tag Offset
h2(k3)
4
h1(k4)
3
h2(k2)
1
h1(k1)
2
K2 K1 K4 K3
DRAM (memory)** store short tag
FLASH (hard disk)** store the full key (160)
h2(k5)
1 2 3 4
27
SILT: 1. LogStore
K5? h1(k5)
Tag Offset
h2(k3)
4
h1(k4)
3
h2(k2)
1
h1(k1)
2
K2 K1 K4 K3
DRAM (memory)** store short tag
FLASH (hard disk)** store the full key (160)
h2(k5)
1 2 3 4
28
SILT: 1. LogStore
K5? h1(k5)
Tag Offset
h2(k3)
4
h1(k4)
3
h2(k2)
1
h1(k1)
2
K2 K1 K4 K3
DRAM (memory)** store short tag
FLASH (hard disk)** store the full key (160)
h2(k5)
1 2 3 4LogStore is full?1. SILT freezes the LogStore 2. initializes a new one without expensive rehashing.3. Convert the old LogStore to a HashStore ( in
background).
29
Basic store designLogStore, HashStore, and SortedStore . ( overall overview).
30
SILT: 2. HashStore1. Convert logStore into a more memory-efficient data
structure.2. Sort the LogStore based on ‘HashOrder’3. Saves lots of in-memory by eliminating the index and
reordering the on-flash pairs from insertion order to hash-order
31
SILT: 2. HashStore
32
SILT: 2. HashStore
Tag Offset
h2(k3)
4
h1(k4)
3
h2(k2)
1
h1(k1)
2
K2 K1 K4 K3
DRAM (memory)** store short tag
FLASH (hard disk)** store the full key (160)
1 2 3 4
Convert logStore into a more memory-efficient data structure..
33
SILT: 2. HashStore
Tag Offset
h2(k3)
4
h1(k4)
3
h2(k2)
1
h1(k1)
2
K2 K1 K4 K3
DRAM (memory)** store short tag
FLASH (hard disk)** store the full key (160)
1 2 3 4
Convert logStore into a more memory-efficient data structure..1. Remove the Offset
column
34
SILT: 2. HashStore
Tag
h2(k3)
h1(k4)
h2(k2)
h1(k1)
K2 K1 K4 K3
DRAM (memory)** store short tag
FLASH (hard disk)** store the full key (160)
1 2 3 4
Convert logStore into a more memory-efficient data structure..
35
SILT: 2. HashStore
Tag
h2(k3)
h1(k4)
h2(k2)
h1(k1)
K2 K1 K4 K3
DRAM (memory)** store short tag
FLASH (hard disk)** store the full key (160)
1 2 3 4
Convert logStore into a more memory-efficient data structure..
2. Sort according to
the hash
36
SILT: 2. HashStore
Tag
h2(k3)
h1(k4)
h2(k2)
h1(k1)
K3 K4 K2 K1
DRAM (memory)** store short tag
FLASH (hard disk)** store the full key (160)
1 2 3 4
Convert logStore into a more memory-efficient data structure..
37
SILT: 2. HashStore
K3 K4 K2 K1
Hashstore store many logstores .
K3 K4 K2 K1
K3 K4 K2 K1
K3 K4 K2 K1
38
Basic store designLogStore, HashStore, and SortedStore . ( overall overview).
39
SILT: 3. SortedStore Multiple HashStore can be aggregated into one
SortedStore. Focuses on minimize the bit presentation (by Using
Sorted Data on Flash). From the sorted results, indices are re-made ( trie data
structure , uses 0.4 bytes of index memory per key on average).
keeps read amplification low (exactly 1) by directly pointing to the correct location on flash.
40
SILT: 3. SortedStoreIndexing Sorted Data with a Trie:leaf = key internal node = common prefix of the keys represented by its descendants
How it works?41
SILT: 3. SortedStoreIndexing Sorted Data with a Trie:
Example
42
0 0 0 1 1 1 1 1
0 0 1 0 0 0 1 1
0 1 1 0 1 1 0 1
1 0 1 1 0 1 1 0
0 1 0 0 0 1 0 1
0 1 2 3 4 5 6 7
43
0 0 0 1 1 1 1 1
0 0 1 0 0 0 1 1
0 1 1 0 1 1 0 1
1 0 1 1 0 1 1 0
0 1 0 0 0 1 0 1
0 1 2 3 4 5 6 7
44
0 1
0 0 0 1 1 1 1 1
0 0 1 0 0 0 1 1
0 1 1 0 1 1 0 1
1 0 1 1 0 1 1 0
0 1 0 0 0 1 0 1
0 1 2 3 4 5 6 7
45
0
0 1
1
0 0 0 1 1 1 1 1
0 0 1 0 0 0 1 1
0 1 1 0 1 1 0 1
1 0 1 1 0 1 1 0
0 1 0 0 0 1 0 1
0 1 2 3 4 5 6 7
46
0
0 1
1
0 0 0 1 1 1 1 1
0 0 1 0 0 0 1 1
0 1 1 0 1 1 0 1
1 0 1 1 0 1 1 0
0 1 0 0 0 1 0 1
0 1 2 3 4 5 6 7
47
0
0
0
1
1
1
0 0 0 1 1 1 1 1
0 0 1 0 0 0 1 1
0 1 1 0 1 1 0 1
1 0 1 1 0 1 1 0
0 1 0 0 0 1 0 1
0 1 2 3 4 5 6 7
48
0
0
0
1
1
1
0 0 0 1 1 1 1 1
0 0 1 0 0 0 1 1
0 1 1 0 1 1 0 1
1 0 1 1 0 1 1 0
0 1 0 0 0 1 0 1
0 1 2 3 4 5 6 7
0
49
0
0
0
1
1
1
0 0 0 1 1 1 1 1
0 0 1 0 0 0 1 1
0 1 1 0 1 1 0 1
1 0 1 1 0 1 1 0
0 1 0 0 0 1 0 1
0 1 2 3 4 5 6 7
0
Ignored
50
0
0
0
1
1
1
0 0 0 1 1 1 1 1
0 0 1 0 0 0 1 1
0 1 1 0 1 1 0 1
1 0 1 1 0 1 1 0
0 1 0 0 0 1 0 1
0 1 2 3 4 5 6 7
0 1
51
0
0
0
1
1
1
0 0 0 1 1 1 1 1
0 0 1 0 0 0 1 1
0 1 1 0 1 1 0 1
1 0 1 1 0 1 1 0
0 1 0 0 0 1 0 1
0 1 2 3 4 5 6 7
0 1
2
52
10
2
763
54
0
0
0
0
0
0
0
1
1
1
1
1 1
1
0 0 0 1 1 1 1 1
0 0 1 0 0 0 1 1
0 1 1 0 1 1 0 1
1 0 1 1 0 1 1 0
0 1 0 0 0 1 0 1
0 1 2 3 4 5 6 7
53
SortedStore uses a compact recursive representation to eliminate pointers
54
SLIT Lookup Process Queries look up stores in sequence (from new to old) Note : Inserts only go to Log
55
SLIT Lookup Process Queries look up stores in sequence (from new to old) Note : Inserts only go to Log
56
Contributions1. The design and implementation of three basic key-value
stores (LogStore, HashStore, and SortedStore) .2. Synthesis of these basic stores to build SILT.3. An analytic model that enables an explicit and careful
balance between memory, storage, and computation .
57
Contributions1. The design and implementation of three basic key-value
stores (LogStore, HashStore, and SortedStore) .2. Synthesis of these basic stores to build SILT.3. An analytic model that enables an explicit and careful
balance between memory, storage, and computation .
58
Analytic model
59
Memory overhead (MA)=
Read amplification (RA)=
Write amplification (WA)=data written to flash
data written by application
data read from flash
data read by application
total memory consumed
number of items
4.Evaluation & Experiments
60
Experiment Setup
61
CPU 2.80 GHz (4 cores)
Flash driveSATA 256 GB
(48 K random 1024-byte reads/sec)
Workload size 20-byte key, 1000-byte value, ≥ 50 M keys
Query pattern Uniformly distributed
Experiment 1LogStore Alone: Too Much Memory Workload: 90% GET (50-100 M keys) + 10% PUT (50 M keys)
62
Experiment 2LogStore + SortedStore: Still Much MemoryWorkload: 90% GET (50-100 M keys) + 10% PUT (50 M keys)
63
Experiment 3Full SILT: Very Memory EfficientWorkload: 90% GET (50-100 M keys) + 10% PUT (50 M keys)
64
“Thanks