MemC3: Compact and Concurrent MemCache with Dumber Caching and Smarter Hashing

Bin Fan, David G. Andersen, Michael Kaminsky

Presenter: Son Nguyen


Memcached internals

• LRU caching using a chaining hash table and a doubly linked list

Goals

• Reduce space overhead (bytes/key)
• Improve throughput (queries/sec)
• Target read-intensive workloads with small objects
• Result: 3X throughput, 30% more objects

Problems with the doubly linked list

• At least two pointers per item -> expensive
• Both reads and writes change the list's structure -> threads need locking (no concurrency)
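
To illustrate why even a read needs the lock, here is a minimal sketch of strict LRU. It uses Python's `collections.OrderedDict` as a stand-in for the hash table plus doubly linked list; that substitution, and the class name `StrictLRU`, are assumptions made for brevity and are not how Memcached itself is written.

```python
from collections import OrderedDict


class StrictLRU:
    """Strict LRU cache; OrderedDict plays the role of hash table + linked list."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        if key not in self.items:
            return None
        # A read relinks the entry at the most-recently-used end, so it
        # mutates the shared list just like a write does.
        self.items.move_to_end(key)
        return self.items[key]

    def set(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        elif len(self.items) >= self.capacity:
            self.items.popitem(last=False)  # evict the least recently used
        self.items[key] = value


cache = StrictLRU(capacity=2)
cache.set("ka", "va")
cache.set("kb", "vb")
order_before = list(cache.items)
cache.get("ka")                  # a plain read...
order_after = list(cache.items)  # ...has reordered the list
cache.set("kc", "vc")            # evicts kb, not ka
```

Because `get` reorders the list, two threads reading at the same time would both be modifying shared structure, which is the concurrency problem named above.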

Solution: CLOCK-based LRU

• Approximate LRU
• Multiple readers/single writer
• Circular queue instead of linked list -> less space overhead
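
A rough sketch of the CLOCK idea follows: a circular array with one recency bit per slot and a "hand" that sweeps for a victim. The class name `ClockCache`, the Python dict used as the key index, and the choice to mark freshly written entries as recent are all assumptions of this sketch, not details taken from the slides.

```python
class ClockCache:
    """Approximate LRU: a circular array of slots with one recency bit each."""

    def __init__(self, capacity):
        self.keys = [None] * capacity
        self.values = [None] * capacity
        self.recency = [0] * capacity
        self.index = {}   # key -> slot number (stand-in for the hash table)
        self.hand = 0     # next slot the eviction sweep will inspect

    def get(self, key):
        slot = self.index.get(key)
        if slot is None:
            return None
        self.recency[slot] = 1  # a read only sets a bit; nothing is relinked
        return self.values[slot]

    def set(self, key, value):
        slot = self.index.get(key)
        if slot is None:
            slot = self._claim_slot()
            self.keys[slot] = key
            self.index[key] = slot
        self.values[slot] = value
        self.recency[slot] = 1  # assumption: fresh entries count as recent

    def _claim_slot(self):
        # Sweep: recently used slots get a second chance (bit cleared);
        # the first slot found empty or with bit 0 is reused.
        while True:
            slot = self.hand
            self.hand = (self.hand + 1) % len(self.keys)
            if self.keys[slot] is None:
                return slot
            if self.recency[slot] == 0:
                del self.index[self.keys[slot]]
                return slot
            self.recency[slot] = 0


cache = ClockCache(3)
cache.set("ka", "va")
cache.set("kb", "vb")
cache.set("kc", "vc")
cache.set("kd", "vd")  # full sweep clears every bit, then evicts ka
cache.get("kb")        # marks kb as recently used
cache.set("ke", "ve")  # kb gets a second chance; kc is evicted instead
```

Compared with the linked list, a read touches a single bit and the per-item cost is one bit rather than two pointers.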

CLOCK example

Originally:
entry    (ka, va) (kb, vb) (kc, vc) (kd, vd) (ke, ve)
recency  1 0 1 1 0

Read(kd):
entry    (ka, va) (kb, vb) (kc, vc) (kd, vd) (ke, ve)
recency  1 0 1 0 0

Write(kf, vf):
entry    (ka, va) (kb, vb) (kf, vf) (kd, vd) (ke, ve)
recency  1 1 0 0 0

Write(kg, vg):
entry    (kg, vg) (kb, vb) (kf, vf) (kd, vd) (ke, ve)
recency  0 1 0 1 1

Problems with the chaining hash table

• Uses linked lists -> costly space overhead for pointers
• Pointer dereferences are slow (poor CPU cache locality)
• Reads are not constant time (chains can grow long)

Solution: Cuckoo Hashing

• Use 2 hash tables
• Each bucket has exactly 4 slots (fits in a CPU cache line)
• Each (key, value) object can therefore reside in any one of 8 possible slots
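
The "8 possible slots" can be made concrete with a small sketch: two tables, one bucket per table chosen by hashing, and 4 slots in each bucket. The salted SHA-256 hash, the table size, and the function names below are illustrative assumptions; the slides do not say which hash functions are used.

```python
import hashlib

SLOTS_PER_BUCKET = 4
NUM_BUCKETS = 1024  # per table; an arbitrary size for the sketch


def bucket_index(key, table):
    """Bucket of `key` in table 0 or 1, using a salted SHA-256 as the hash."""
    digest = hashlib.sha256(bytes([table]) + key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_BUCKETS


def candidate_slots(key):
    """All (table, bucket, slot) positions where `key` may be stored."""
    return [
        (table, bucket_index(key, table), slot)
        for table in (0, 1)
        for slot in range(SLOTS_PER_BUCKET)
    ]


slots = candidate_slots("ka")
```

A lookup only ever has to inspect these 2 x 4 positions, no matter how full the tables are.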

Cuckoo Hashing

[Figure: (ka, va) is hashed by HASH1(ka) and HASH2(ka) to its candidate buckets]

Cuckoo Hashing

• Read: always 8 lookups (constant, fast)
• Write: write(ka, va)
  – Find an empty slot among the 8 possible slots of ka
  – If all are full, randomly kick some (kb, vb) out
  – Now find an empty slot for (kb, vb)
  – Repeat up to 500 times or until an empty slot is found
  – If no slot is found, expand the table
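
The read and write steps above can be sketched as follows. This is an illustration under stated assumptions, not the MemC3 implementation: the class `CuckooTable`, the salted SHA-256 hash, and the choice to hand the unplaced item back to the caller (instead of expanding the table) are all hypothetical, and the sketch assumes each key is inserted only once.

```python
import hashlib
import random

SLOTS_PER_BUCKET = 4
MAX_KICKS = 500  # bound taken from the slide


class CuckooTable:
    def __init__(self, num_buckets, seed=0):
        self.num_buckets = num_buckets
        # tables[t][b] is a bucket: 4 slots holding (key, value) or None
        self.tables = [
            [[None] * SLOTS_PER_BUCKET for _ in range(num_buckets)]
            for _ in range(2)
        ]
        self.rng = random.Random(seed)

    def _bucket(self, key, table):
        digest = hashlib.sha256(bytes([table]) + key.encode()).digest()
        index = int.from_bytes(digest[:8], "big") % self.num_buckets
        return self.tables[table][index]

    def _candidates(self, key):
        """The 8 (bucket, slot) positions where `key` may live."""
        return [
            (self._bucket(key, t), s)
            for t in (0, 1)
            for s in range(SLOTS_PER_BUCKET)
        ]

    def get(self, key):
        for bucket, slot in self._candidates(key):  # never more than 8 probes
            if bucket[slot] is not None and bucket[slot][0] == key:
                return bucket[slot][1]
        return None

    def insert(self, key, value):
        """Return None on success, or the (key, value) left without a slot.

        Assumes `key` is not already present.
        """
        item = (key, value)
        for _ in range(MAX_KICKS):
            candidates = self._candidates(item[0])
            for bucket, slot in candidates:
                if bucket[slot] is None:
                    bucket[slot] = item
                    return None
            # All 8 slots are taken: evict a random occupant and re-home it.
            bucket, slot = self.rng.choice(candidates)
            bucket[slot], item = item, bucket[slot]
        return item  # the caller would expand the table and reinsert this


table = CuckooTable(num_buckets=64)
leftovers = [table.insert(f"key{i}", f"value{i}") for i in range(200)]
```

Note that while an evicted item is held in `item`, it is in no slot at all, so a concurrent reader looking for it at that moment would not find it.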

Cuckoo Hashing

Insert a:

[Figure: hash table grid with occupied slots marked X. (ka, va) is hashed by HASH1(ka) and HASH2(ka); a displaces b.]

Cuckoo Hashing

Insert b:

[Figure: a is now in the table. (kb, vb) is hashed by HASH1(kb) and HASH2(kb); b displaces c.]

Cuckoo Hashing

Insert c:

[Figure: a and b are now in the table. (kc, vc) is hashed by HASH1(kc) and HASH2(kc); c is placed.]

Done!

Cuckoo Hashing

• Problem: after (kb, vb) is kicked out, a reader might attempt to read (kb, vb) and get a false cache miss

• Solution: Compute the kick out path (Cuckoo path) first, then move items backward

• Before: (b,c,Null)->(a,c,Null)->(a,b,Null)->(a,b,c)
• Fixed: (b,c,Null)->(b,c,c)->(b,b,c)->(a,b,c)
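
The two state sequences can be reproduced with a short sketch. The cuckoo path is taken as given here (the search that discovers it is omitted), the table is reduced to three slots, and the function names `kick_forward` and `move_backward` are made up for this illustration.

```python
def kick_forward(slots, path, new_item):
    """Naive insert: each displaced item is briefly absent from the table."""
    states = [tuple(slots)]
    in_hand = new_item
    for pos in path:
        slots[pos], in_hand = in_hand, slots[pos]
        states.append(tuple(slots))
    return states


def move_backward(slots, path, new_item):
    """Path-first insert: start at the empty end of the path and copy each
    item one hop along it, so every key stays visible at all times."""
    states = [tuple(slots)]
    for dst, src in zip(reversed(path), reversed(path[:-1])):
        slots[dst] = slots[src]
        states.append(tuple(slots))
    slots[path[0]] = new_item
    states.append(tuple(slots))
    return states


# Slots 0..2 hold b, c and nothing; the cuckoo path for a is 0 -> 1 -> 2.
before = kick_forward(["b", "c", None], [0, 1, 2], "a")
fixed = move_backward(["b", "c", None], [0, 1, 2], "a")
```

In `before`, the intermediate state `("a", "c", None)` has no copy of b, which is the false miss described above. In `fixed`, an item may briefly appear twice, but b and c are present in every state.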

Cuckoo path

Insert a:

[Figure: hash table grid with occupied slots marked X. (ka, va) is hashed by HASH1(ka) and HASH2(ka); b and c are still in their slots.]

Cuckoo path backward insert

Insert a:

[Figure: hash table grid with occupied slots marked X. Items move backward along the cuckoo path from HASH1(ka)/HASH2(ka): c, then b, then a.]

Cuckoo’s advantages

• Concurrency: multiple readers/single writer
• Read optimized (entries fit in CPU cache)
• Still O(1) amortized time for write
• 30% less space overhead
• 95% table occupancy

Evaluation

68% throughput improvement in the all-hit case; 235% in the all-miss case

Evaluation

3x throughput on a “real” workload

Discussion

• Writes are slower than with the chaining hash table
  – Chaining hash table: 14.38 million keys/sec
  – Cuckoo: 7 million keys/sec
• Idea: find the cuckoo path in parallel
  – Benchmark doesn't show much improvement

• Can we make it write-concurrent?