46
Parallel Hashing Algorithms Prepared By : Akash Panchal (14MCEC17) & Nisarg Patel (14MCEC20) @Nirma University, Ahmedabad

Parallel Hashing Algorithms

Embed Size (px)

Citation preview

Page 1: Parallel Hashing Algorithms

Parallel Hashing Algorithms

Prepared By : Akash Panchal (14MCEC17) & Nisarg Patel (14MCEC20)@Nirma University, Ahmedabad

Page 2: Parallel Hashing Algorithms

Concurrent hash table

• Indexing key-value objects• Lookup(key)• Insert(key, value)• Delete(key)

•Fundamental building block for modern systems

• System applications (e.g., kernel caches)

• Concurrent user-level applications

Page 3: Parallel Hashing Algorithms

Goal: memory-efficient and high-throughput

• Memory efficient (e.g., > 90% space utilized)

• Fast concurrent reads (scale with # of cores/threads)

• Fast concurrent writes (scale with # of cores/threads)

Page 4: Parallel Hashing Algorithms

separate chaining hash table

K V

K V

K V

K V

K V

Chaining items hashed in same bucket

lookup K V

Page 5: Parallel Hashing Algorithms

separate chaining hash table

- Acquiring a Linked List Node Location- Only one thread should insert new node.

- find the end of the linked list, lock out other threads.

- Make sure we are still at the end of the linked list.

- Then allocate a new node from the unfilled region, and finally link it in.

Page 6: Parallel Hashing Algorithms

Good: simple

Bad: poor cache locality

Bad: pointers cost space

separate chaining hash table

Page 7: Parallel Hashing Algorithms

open addressing hash table

Probing alternate locations for vacancy

e.g., linear/quadratic probing, double hashing

- Inserting a Key :

- 1) The bucket is empty.- Must acquire a location.- TwoStepAcquireAttempt algorithm is used.- Insert the key.

- 2) The bucket is already filled.- Find next possible bucket using any probing

method.

Page 8: Parallel Hashing Algorithms

Procedure: TwoStepAcquireAttempt

Parameters: (location, key)

// first check without locking

1: if location is empty then

2: lock(location)

// then check with the lock

3: if location is still empty then

4: Reserve location for key

5: unlock(location)

6: return true

7: end if

// Another thread modified the structure before we could acquire location

8: unlock(location)

9: end if

10: return false . // Caller needs to continue searching

Page 9: Parallel Hashing Algorithms

open addressing hash table

Probing alternate locations for vacancy

lookupGood: cache friendly

Bad: poor memory efficiency

- performance dramatically degrades when the usage grows beyond 70% capacity or so

- e.g., Google dense_hash_map wastes 50% memory by default.

Page 10: Parallel Hashing Algorithms

Linear Hashing with linear Probing

H1 H2 H3 H4 H5 H6 H7 H8 H9 h10

Parallel Insert :Case 1 : Two keys are inserted at different location.

xH1 = insert( ) yH5 = insert( )

Pass : Pass 1

Page 11: Parallel Hashing Algorithms

Linear Hashing with linear Probing

H1 H2 H3 H4 H5 H6 H7 H8 H9 h10

s u t

Parallel Insert :Case 2 : Two keys are inserted at different location but collision occurs.

xH1 = insert( ) yH5 = insert( )

Pass : Pass 1 Pass 2 Pass 3

Page 12: Parallel Hashing Algorithms

Linear Hashing with linear Probing

H1 H2 H3 H4 H5 H6 H7 H8 H9 h10

xH1 = insert( ) yH1 = insert( )

Pass : Pass 1

Parallel Insert :Case 3 : Two keys tried to insert at same location and contention occurred.

x y

Pass 2

Page 13: Parallel Hashing Algorithms

Analysis : linear hashing

Serial Approach Parallel Approach

Lookup : Best case : O(n)Worst Case : O(n2)

Best case : O(1),Worst case : O(n)

Insert Best Case : O(n),Worst Case : O(n2)

Best case : O(1)Worst Case : O(n)

Removal : Best case : O(n),Worst Case : O(n2)

Best case : O(1)Worst case : O(n)

Page 14: Parallel Hashing Algorithms

Cuckoo hashing

• Use two hash tables and two hash functions

• Each element will have exactly one “nest” (hash location) in each table• Guarantee that any element will only ever exist in one of its “nests”.

• Lookup/delete are O(1) because we can check 2 locations (“nests”) in O(1) time.

Page 15: Parallel Hashing Algorithms

Cuckoo hashing

Insertion :

1. Insert an element by finding one of its “nests” and putting it there

• This may evict another element!

2. Insert the evicted element into its *other* “nest” This may evict another element!

Page 16: Parallel Hashing Algorithms

X

Cuckoo hashing in Sequential: ‘X’ is inserted by moving ‘Y’ and ‘Z’

Page 17: Parallel Hashing Algorithms

Cuckoo hashing

Asymmetric cuckoo hashing :

• Choose one (the first) table to be larger than the other

– Improves the probability that we get a hit on the first lookup

– Only a minor slowdown on insert

Page 18: Parallel Hashing Algorithms

Cuckoo hashing

Same Table cuckoo Hashing :

• We didn’t actually need two separate tables.

– It made the analysis much easier.

– But… In practice, we just need two hash functions.

Page 19: Parallel Hashing Algorithms

Cuckoo hashing

Each bucket has b slots for items (b-way set-associative)

Each key is mapped to two random buckets

• stored in one of thembuckets

012345678

key x

hash1(x)

hash2(x)

Page 20: Parallel Hashing Algorithms

Predictable and fast lookup

• Lookup: read 2 buckets in parallel• constant time in the worst case

x

012345678

Lookup x

Page 21: Parallel Hashing Algorithms

012345678

move keys to alternate buckets

Insert may need “cuckoo move”

• Insert:

Both are full?

Insert y a

x

b

k

r

cse

nf

x

a

bWrite to an empty slot in one of the

possible locations

x

a

bpossible locations

possible locations

Page 22: Parallel Hashing Algorithms

Insert may need “cuckoo move”

• Insert: move keys to alternate buckets

• find a “cuckoo path” to an empty slot

• move hole backwards

012345678

Insert y

x

a

b

y

b

a

x

Page 23: Parallel Hashing Algorithms

Case 1 : Inserting ‘a’ and ‘b’ in parallel, ‘b’ find its place in 1st hash function and ‘a’ has to evict value.

Page 24: Parallel Hashing Algorithms

k

Case 2 : Inserting ‘a’ and ‘b’ in parallel, ‘a’ and ‘b’ has to evict value.

Page 25: Parallel Hashing Algorithms

Algorithmic optimizations

• Lock after discovering a cuckoo path• minimize critical sections

• But…

• Decreases parallelization

Page 26: Parallel Hashing Algorithms

Previous approach: writer locks the table during the whole insert process

lock();

Search for a cuckoo path;

Cuckoo move and insert;

unlock();

All Insert operations of other threads are blocked

Page 27: Parallel Hashing Algorithms

Case 3 : Inserting ‘a’ and ‘b’ in parallel, ‘a’ and ‘b’ has to evict value.But they are having same path, so contention will occur on path.

while(1) {

Search for a cuckoo path;

lock();

Cuckoo move and insert;

if(success)

unlock();

break;

}

This algo will work to avoid contention on path and one who came first will be completing the process and other will try after some time.

Page 28: Parallel Hashing Algorithms

Lock after discovering a cuckoo path

while(1) {Search for a cuckoo path;

lock();Cuckoo move and insert;if(success)

unlock();break;

}

// no locking required

Multiple Insert threads can look for cuckoo paths concurrently

←collision

Cuckoo move and insert;

Page 29: Parallel Hashing Algorithms

Analysis : parallel cuckoo hashing

Serial Approach Parallel Approach

Lookup : Best case : O(n)Worst Case : O(n2)

Best case : O(1),Worst case : O(n)

Insert : Best Case : O(n),Worst Case : O(n),

Best case : O(1)Worst Case : O(n),

Removal : Best case : O(n),Worst Case : O(n2)

Best case : O(1)Worst case : O(n)

Cuckoo with 2 hash function performs well until 50% load.

Cuckoo with 3 hash function performs well until 90% load.

Page 30: Parallel Hashing Algorithms

Lock-free Cuckoo Hashing

• The algorithm allows mutating operations to operate concurrently with query ones and requires only single word compare-and-swap primitives.

• When an insertion triggers a sequence of key displacements, instead of locking the whole cuckoo path, this algorithm breaks down the chain of relocations into several single relocations which can be executed independently and concurrently with other operations.

Page 31: Parallel Hashing Algorithms

Lock-free Cuckoo Hashing

• Here, concurrent cuckoo hashing contains two hash tables (sub-tables), which correspond to two independent hash functions.

• Each key can be stored at one of its two possible positions, one in each sub-table called primary and secondary respectively.

Page 32: Parallel Hashing Algorithms

Lock-free Cuckoo Hashing

• Problem :

• The original cuckoo approach inserts the new key to the primary sub-table by evicting a collided key and re-inserting it to the other sub-table.

• This approach, however, causes the evicted key to be “nestless”, i.e. absent from both sub-tables, until the re-insertion is completed.

• This might be an issue in concurrent environments: the “nestless” key is unreachable by other concurrent operations which want to operate on it.

Page 33: Parallel Hashing Algorithms

Lock-free Cuckoo Hashing

• Problem :

• Moving key problem :

• The key present in table[1] is relocated to table[0], meanwhile search reads

from table[0] and then table[1].

• To avoid such missing, search performs the second round query.

• Another issue with insertion operations is that a key can be inserted to two sub-tables simultaneously. • Since concurrent insertions can operate independently on two sub-tables, they can

both succeed. This results in two existing instances of one key, possibly mapped to different values.

• To prevent such a case to happen, one can use a lock during insertion to lock both slots so that only one instance of a key is successfully added to the table at a time.

Page 34: Parallel Hashing Algorithms

Lock-free Cuckoo Hashing

• Searching :

• Searching for a key in our lock-free cuckoo hash table includes querying for the existence of the key in two sub-tables.

• A key is available if it is found in one of them. A search operation starts with the first query round by reading from the possible slots in the primary sub-table and then in the secondary sub-table

• If the key is found in one of them, the value mapped to key is returned.

Page 35: Parallel Hashing Algorithms

Search(x) : Case 1 : Value found in the Primary sub-table.

x

Primary sub-table Secondary sub-table

H1 = hash1(x)

h1

Page 36: Parallel Hashing Algorithms

Search(x) : Case 2: Value found in the secondary sub-table.

Primary sub-table Secondary sub-table

x

H2= hash2(x)

h2

Page 37: Parallel Hashing Algorithms

Lock-free Cuckoo Hashing

• Insert :

• Find :

• It examines both sub-tables to discover if two instances of the key exist in two sub-tables. When the same key is found on both sub-tables, the one in the secondary sub-table is deleted.

Page 38: Parallel Hashing Algorithms

Find(x) :

Case 1 : If key already exist then update & return.

Primary sub-table Secondary sub-table

x

H1 = hash1(x)

Page 39: Parallel Hashing Algorithms

Find(x) :

Case 2 : If the same key found on both then one from secondary sub-table gets deleted.

Primary sub-table Secondary sub-table

x

H1 = hash1(x)

x

H2= hash2(x)

h1

h2

Page 40: Parallel Hashing Algorithms

Find(x) :

Case 3 : returns current items which are stored at possible slots where key should be hashed to.

Primary sub-table Secondary sub-table

y

H1 = hash1(x)

z

H2= hash2(x)

return y;

return z;

Page 41: Parallel Hashing Algorithms

Lock-free Cuckoo Hashing

• Insert :• Insert a new value using CAS( Compare-And-Swap) primitives.

• CAS is a synchronization primitive available in most modern processors. It compares the content of a memory word to a given value and, only if they are the same, modifies the content of that word to a given new value.

Page 42: Parallel Hashing Algorithms

y

Insert (x) :

Case 1 : If one of the two slots is empty, the new entry is inserted with a CAS.

Primary sub-table Secondary sub-table

H1 = hash1(x)

x

H2 = hash2(x)

x

h1

h2

Page 43: Parallel Hashing Algorithms

r

e

c

a

b

n

Insert (x) :

Case 2 : If both the slots are occupied then relocationprocess initiated using the cuckoo path.

Primary sub-table Secondary sub-table

H1 = hash1(x)

x

Cuckoo Path : s y k

s

y k

ys

k

k

y

s

f

Page 44: Parallel Hashing Algorithms

X

Cuckoo hash : Problem while Cycle

Solution : After Threshold time, use another hash function or extend the size of hash table.

S T

U

Z

Y V

Page 45: Parallel Hashing Algorithms

Analysis : lock-free cuckoo

Serial Approach Parallel Approach

Lookup Best Case : O(n)Worst Case : O(d*n)

Best Case : O(1)Worst Case : O(d)

Insert Best Case : O(n),Worst Case : > O(d*n),

Best case : O(1)Worst case : > O(d),

Removal Best Case : O(n)Worst Case : O(d*n)

Best Case : O(1)Worst Case : O(d)

Where, d>2 is number of hash functions and sub table

Concurrent cuckoo hash table

- high memory efficiency

- fast concurrent writes and reads

Page 46: Parallel Hashing Algorithms

References

1. Eric L. Goodman, David J. Hagliny, Chad Scherrer,Daniel Chavarr´ia-Miranda, Jace Mogil, John Feo, “Hashing Strategies for the Cray XMT”

2. Parallel and Sequential Data Structures and Algorithms, 15-210 (Spring 2012) Lectured by Margaret Reid-Miller — 26 April 2012.

3. Xiaozhou Li, David G. Andersen, Michael Kaminsky, Michael J. Freedman, “Algorithmic Improvements for Fast Concurrent Cuckoo Hashing”, “Princeton University, Carnegie Mellon University, Intel Labs.”

4. Nhan Nguyen, Philippas Tsigas,Chalmers University of Technology Gothenburg, Sweden, “Lock-free Cuckoo Hashing”.