2
External Hashing
What if the hash table is a file in which each bucket is a record in that file?
Observations: A bucket may contain more than one key value. The number of buckets may expand or contract dynamically.
3
Extendible Hashing
Handling multiple key values per bucket is not a problem.
Collisions are resolved with overflow buckets rather than by placing the key in the next bucket.
Keep track of the number of times all buckets have been split (the “level”) and the next bucket to split.
4
The Hash Function
The standard hash function would now be something like:
H(x, L) = x mod (n * 2^L)
“L” is the level, initially zero, and “n” is the initial number of buckets. If H(x, L) < b, then calculate H(x, L+1). “b” is the next bucket to split.
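The address calculation above can be sketched in a few lines of Python. This is only an illustration: the function names `h` and `bucket_for` are made up for this example, and n = 3 is taken from the insert example later in these slides.

```python
def h(x: int, level: int, n: int) -> int:
    """H(x, L) = x mod (n * 2^L)."""
    return x % (n * 2 ** level)


def bucket_for(x: int, level: int, b: int, n: int) -> int:
    """Choose the bucket for key x, given level L and next bucket to split b."""
    bucket = h(x, level, n)
    if bucket < b:
        # Buckets below b have already been split at this level,
        # so their keys are addressed with the next level's function.
        bucket = h(x, level + 1, n)
    return bucket


# Keys and parameters taken from the insert example (n = 3).
print(bucket_for(24, level=0, b=0, n=3))  # 24 mod 3 = 0
print(bucket_for(61, level=0, b=2, n=3))  # 61 mod 3 = 1 < b, so 61 mod 6 = 1
print(bucket_for(33, level=0, b=2, n=3))  # 33 mod 3 = 0 < b, so 33 mod 6 = 3
```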
5
The “Split”
Questions: When do I split the next bucket? What does a split entail?
We split when the load factor reaches or exceeds a certain threshold. The load factor is the number of key values divided by the number of slots, counting the slots in overflow buckets.
A split entails creating a new bucket and rehashing all keys in bucket b at level L+1.
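As a rough sketch of these two ideas, the Python below computes a load factor and performs one split. It assumes each bucket is stored as a single list of keys, with overflow buckets implied whenever the list is longer than the bucket capacity; that representation and the names `load_factor` and `split` are choices made for this example, not part of the slides.

```python
import math


def load_factor(buckets, capacity):
    """Key values divided by slots, counting overflow buckets as extra slots."""
    keys = sum(len(chain) for chain in buckets)
    # Every bucket owns at least one block of `capacity` slots; longer
    # chains own one extra block per overflow bucket.
    blocks = sum(max(1, math.ceil(len(chain) / capacity)) for chain in buckets)
    return keys / (blocks * capacity)


def split(buckets, b, level, n):
    """Append a new bucket and rehash the keys of bucket b at level L+1."""
    buckets.append([])
    old, buckets[b] = buckets[b], []
    for x in old:
        # Each key lands either back in bucket b or in the new last bucket.
        buckets[x % (n * 2 ** (level + 1))].append(x)


# Table state taken from the insert example, just before the first split.
table = [[24, 15, 33, 60], [10], [11]]
print(load_factor(table, capacity=2))  # 6 keys / 8 slots = 0.75
split(table, b=0, level=0, n=3)
print(table)  # [[24, 60], [10], [11], [15, 33]]
```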
6
The Insert Algorithm
Initialize L = 0 and b = 0.
Calculate bucket = H(x, L).
If bucket < b, set bucket = H(x, L+1).
If the bucket has an empty slot, fill it with x.
Else, create an overflow bucket for x.
If the new load factor >= the threshold:
    Add a new bucket at the end.
    Rehash all key values in bucket b at level L+1.
    Add one to b.
7
The Insert Algorithm II
If b = n * 2^L:
    We have split all the buckets at the current level, so set L = L + 1 and b = 0.
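Putting the insert steps together, a minimal Python sketch might look like the following. It is an illustration under assumptions, not the definitive algorithm: the class name `HashFile` is hypothetical, a bucket plus its overflow buckets is modeled as one list, and the split check repeats until the load factor drops below the threshold (which is how the worked example in these slides behaves). The scheme described here resembles what is usually called linear hashing.

```python
import math


class HashFile:
    """Sketch of the insert algorithm; names and layout are assumptions."""

    def __init__(self, n=3, capacity=2, threshold=0.75):
        self.n = n                  # initial number of buckets
        self.capacity = capacity    # key values per bucket (and per overflow bucket)
        self.threshold = threshold  # split when load factor >= threshold
        self.level = 0              # L
        self.b = 0                  # next bucket to split
        self.buckets = [[] for _ in range(n)]

    def _h(self, x, level):
        return x % (self.n * 2 ** level)

    def _address(self, x):
        bucket = self._h(x, self.level)
        if bucket < self.b:
            bucket = self._h(x, self.level + 1)
        return bucket

    def load_factor(self):
        keys = sum(len(chain) for chain in self.buckets)
        blocks = sum(max(1, math.ceil(len(chain) / self.capacity))
                     for chain in self.buckets)
        return keys / (blocks * self.capacity)

    def _split(self):
        # Add a new bucket at the end and rehash bucket b at level L+1.
        self.buckets.append([])
        old, self.buckets[self.b] = self.buckets[self.b], []
        for x in old:
            self.buckets[self._h(x, self.level + 1)].append(x)
        self.b += 1
        if self.b == self.n * 2 ** self.level:
            # Every bucket of the current level has been split.
            self.level += 1
            self.b = 0

    def insert(self, x):
        self.buckets[self._address(x)].append(x)
        while self.load_factor() >= self.threshold:
            self._split()


table = HashFile()
for key in [24, 10, 15, 33, 60, 11, 61, 41]:
    table.insert(key)
print(table.buckets, table.b, table.level)
```

With the keys from the insert example, this sketch ends with buckets 0 to 5 holding (24, 60), (61), (), (15, 33), (10) and (11, 41), with b = 0 and L = 1.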
8
Insert Example
Insert 24: bucket = H(24, 0) = 0. bucket >= b, so bucket 0 it is:
Insert: 24, 10, 15, 33, 60, 11, 61, 41
Buckets: 0: (empty) | 1: (empty) | 2: (empty)
b = 0, L = 0; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 0/6 = 0; threshold = 0.75
9
Insert Example
Insert 10: bucket = H(10, 0) = 1. bucket >= b, so bucket 1 it is:
Insert: 10, 15, 33, 60, 11, 61, 41
Buckets: 0: 24 | 1: (empty) | 2: (empty)
b = 0, L = 0; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 1/6 = 0.17; threshold = 0.75
10
Insert Example
Insert 15: bucket = H(15, 0) = 0. bucket >= b, so bucket 0 it is:
Insert: 15, 33, 60, 11, 61, 41
Buckets: 0: 24 | 1: 10 | 2: (empty)
b = 0, L = 0; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 2/6 = 0.33; threshold = 0.75
11
Insert Example
Insert 33: bucket = H(33, 0) = 0. bucket >= b, so bucket 0 it is:
Insert: 33, 60, 11, 61, 41
Buckets: 0: 24, 15 | 1: 10 | 2: (empty)
b = 0, L = 0; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 3/6 = 0.5; threshold = 0.75
12
Insert Example
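Bucket 0 is already full, so inserting 33 requires an overflow bucket. Let’s assume overflow buckets can also hold 2 key values. Now, update the load factor:
Insert: 60, 11, 61, 41
Buckets: 0: 24, 15 -> overflow: 33 | 1: 10 | 2: (empty)
b = 0, L = 0; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 4/6 = 0.67 before the overflow slots are counted; threshold = 0.75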
13
Insert Example
Insert 60: bucket = H(60, 0) = 0. bucket >= b, so bucket 0 it is:
Insert: 60, 11, 61, 41
Buckets: 0: 24, 15 -> overflow: 33 | 1: 10 | 2: (empty)
b = 0, L = 0; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 4/8 = 0.5; threshold = 0.75
14
Insert Example
Insert 11: bucket = H(11, 0) = 2. bucket >= b, so bucket 2 it is:
Insert: 11, 61, 41
Buckets: 0: 24, 15 -> overflow: 33, 60 | 1: 10 | 2: (empty)
b = 0, L = 0; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 5/8 = 0.63; threshold = 0.75
15
Insert Example
Load factor >= threshold, so it is time to rehash all keys in bucket b = 0.
First, create a new bucket:
Insert: 61, 41
Buckets: 0: 24, 15 -> overflow: 33, 60 | 1: 10 | 2: 11
b = 0, L = 0; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 6/8 = 0.75; threshold = 0.75
16
Insert Example
Rehash 24 at level L+1: H(24, 1) = 24 mod 6 = 0. 24 stays at bucket 0.
Insert: 61, 41
Buckets: 0: 24, 15 -> overflow: 33, 60 | 1: 10 | 2: 11 | 3: (empty)
b = 0, L = 0; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 6/10 = 0.6; threshold = 0.75
17
Insert Example
Rehash 15 at level L+1: H(15, 1) = 15 mod 6 = 3. 15 moves to bucket 3.
Insert: 61, 41
Buckets: 0: 24, 15 -> overflow: 33, 60 | 1: 10 | 2: 11 | 3: (empty)
b = 0, L = 0; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 6/10 = 0.6; threshold = 0.75
18
Insert Example
Rehash 33 at level L+1: H(33, 1) = 33 mod 6 = 3. 33 moves to bucket 3.
Insert: 61, 41
Buckets: 0: 24 -> overflow: 33, 60 | 1: 10 | 2: 11 | 3: 15
b = 0, L = 0; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 6/10 = 0.6; threshold = 0.75
19
Insert Example
Rehash 60 at level L+1: H(60, 1) = 60 mod 6 = 0. 60 stays at bucket 0.
Insert: 61, 41
Buckets: 0: 24 -> overflow: 60 | 1: 10 | 2: 11 | 3: 15, 33
b = 0, L = 0; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 6/10 = 0.6; threshold = 0.75
20
Insert Example
Add 1 to b; it is less than 3, so we are done with the first split.
Moving 60 into bucket 0 leaves an empty overflow bucket; remove it and recalculate the load factor:
Insert: 61, 41
Buckets: 0: 24, 60 | 1: 10 | 2: 11 | 3: 15, 33
b = 0, L = 0; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 6/10 = 0.6; threshold = 0.75
21
Insert Example
Load factor is now 0.75, so I need to split again, this time with b = 1.
Insert: 61, 41
Buckets: 0: 24, 60 | 1: 10 | 2: 11 | 3: 15, 33
b = 1, L = 0; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 6/8 = 0.75; threshold = 0.75
22
Insert Example
Add bucket 4 and rehash all key values at bucket 1.
10 mod 6 = 4, so it should move:
Insert: 61, 41
Buckets: 0: 24, 60 | 1: 10 | 2: 11 | 3: 15, 33 | 4: (empty)
b = 1, L = 0; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 6/10 = 0.6; threshold = 0.75
23
Insert Example
Note the update of b to 2; the load factor is OK, so continue with the insert of 61.
Insert: 61, 41
Buckets: 0: 24, 60 | 1: (empty) | 2: 11 | 3: 15, 33 | 4: 10
b = 2, L = 0; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 6/10 = 0.6; threshold = 0.75
24
Insert Example
bucket = H(61, 0) = 1. Since bucket < b, recalculate at L+1: bucket = H(61, 1) = 61 mod 6 = 1.
Insert: 61, 41
Buckets: 0: 24, 60 | 1: (empty) | 2: 11 | 3: 15, 33 | 4: 10
b = 2, L = 0; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 6/10 = 0.6; threshold = 0.75
25
Insert Example
Finally, insert 41: bucket = H(41, 0) = 2. bucket >= b, so 2 it is:
Insert: 41
Buckets: 0: 24, 60 | 1: 61 | 2: 11 | 3: 15, 33 | 4: 10
b = 2, L = 0; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 7/10 = 0.7; threshold = 0.75
26
Insert Example
Load factor >= threshold, so split bucket 2:
Insert: done
Buckets: 0: 24, 60 | 1: 61 | 2: 11, 41 | 3: 15, 33 | 4: 10
b = 2, L = 0; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 8/10 = 0.8; threshold = 0.75
27
Insert Example
Both 11 and 41 are 5 mod 6, so both go to the new bucket 5.
Update b...
Insert: done
Buckets: 0: 24, 60 | 1: 61 | 2: (empty) | 3: 15, 33 | 4: 10 | 5: 11, 41
b = 2, L = 0; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 8/12 = 0.67; threshold = 0.75
28
Insert Example
b = 3 * 2^L = 3, so set b = 0 and L = L + 1:
Insert: done
Buckets: 0: 24, 60 | 1: 61 | 2: (empty) | 3: 15, 33 | 4: 10 | 5: 11, 41
b = 3, L = 0; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 8/12 = 0.67; threshold = 0.75
29
Insert Example
Done.
Insert: done
Buckets: 0: 24, 60 | 1: 61 | 2: (empty) | 3: 15, 33 | 4: 10 | 5: 11, 41
b = 0, L = 1; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 8/12 = 0.67; threshold = 0.75
31
Deleting with Extendible Hashing
Delete works the opposite of insert: When the load factor falls to or below a lower threshold, combine buckets.
Note: if b = 0, it is necessary to decrement the level (unless L is already 0).
32
Delete Algorithm
Hash the key value to delete in the standard way, hashing at level L+1 if necessary.
If the key value is not found, report failure and stop.
Else, remove it and continue.
Update the load factor.
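The lookup-and-remove part of delete can be sketched as below. The function name `remove_key` and the list-of-lists table layout are assumptions made for this example; the table state is taken from the delete example in these slides, just before 33 is deleted.

```python
def remove_key(buckets, x, level, b, n):
    """Remove x from its bucket; return False if x is not in the table."""
    bucket = x % (n * 2 ** level)
    if bucket < b:
        # Bucket already split at this level: hash at level L+1 instead.
        bucket = x % (n * 2 ** (level + 1))
    if x not in buckets[bucket]:
        return False  # report failure and stop
    buckets[bucket].remove(x)
    return True


# Five buckets, b = 2, L = 0, n = 3.
table = [[24], [61], [11, 41], [15, 33], []]
print(remove_key(table, 33, level=0, b=2, n=3))  # 33 mod 3 = 0 < b, so 33 mod 6 = 3
print(remove_key(table, 99, level=0, b=2, n=3))  # 99 is not stored
print(table)
```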
33
Delete Algorithm II
If the load factor <= lower threshold:
    Decrement b.
    If b == -1:
        If L == 0, set b = 0 and stop.
        Otherwise, set L = L - 1 and b = n * 2^L - 1.
    Combine the last bucket with bucket b.
    Repeat if necessary.
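A minimal Python sketch of the full delete, including the combine step, follows. The names `Table`, `contract` and `delete` are hypothetical, and each bucket (with any overflow buckets) is again modeled as one list. The starting state is the final table of the insert example, and 0.5 is used as the lower threshold.

```python
import math
from dataclasses import dataclass


@dataclass
class Table:
    buckets: list          # one list of keys per bucket
    level: int             # L
    b: int                 # next bucket to split
    n: int = 3             # initial number of buckets
    capacity: int = 2      # key values per bucket
    lower: float = 0.5     # lower threshold


def load_factor(t):
    keys = sum(len(chain) for chain in t.buckets)
    blocks = sum(max(1, math.ceil(len(chain) / t.capacity)) for chain in t.buckets)
    return keys / (blocks * t.capacity)


def address(t, x):
    bucket = x % (t.n * 2 ** t.level)
    if bucket < t.b:
        bucket = x % (t.n * 2 ** (t.level + 1))
    return bucket


def contract(t):
    """Undo the most recent split; return False if there is none to undo."""
    t.b -= 1
    if t.b == -1:
        if t.level == 0:
            t.b = 0
            return False
        t.level -= 1
        t.b = t.n * 2 ** t.level - 1
    # Combine the last bucket with bucket b and drop the last bucket.
    t.buckets[t.b].extend(t.buckets.pop())
    return True


def delete(t, x):
    chain = t.buckets[address(t, x)]
    if x not in chain:
        return False
    chain.remove(x)
    while load_factor(t) <= t.lower and contract(t):
        pass  # repeat if necessary
    return True


t = Table(buckets=[[24, 60], [61], [], [15, 33], [10], [11, 41]], level=1, b=0)
for key in [60, 10, 33]:
    delete(t, key)
print(t.buckets, t.b, t.level)
```

Deleting 60, 10 and 33 in that order leaves four buckets holding (24), (61), (11, 41) and (15), with b = 1 and L = 0.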
34
Delete Example
Let’s start with the final table from our insert example.
We’ll use 0.5 as our lower threshold.
Delete: 60, 10, 33
Buckets: 0: 24, 60 | 1: 61 | 2: (empty) | 3: 15, 33 | 4: 10 | 5: 11, 41
b = 0, L = 1; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 8/12 = 0.67; lower threshold = 0.5
35
Delete Example
Delete 60: H(60, 1) = 0, which is >= b. Remove 60 from bucket 0:
Delete: 60, 10, 33
Buckets: 0: 24, 60 | 1: 61 | 2: (empty) | 3: 15, 33 | 4: 10 | 5: 11, 41
b = 0, L = 1; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 8/12 = 0.67; lower threshold = 0.5
36
Delete Example
Delete 10: H(10, 1) = 4, which is >= b. Remove 10 from bucket 4:
Delete: 10, 33
Buckets: 0: 24 | 1: 61 | 2: (empty) | 3: 15, 33 | 4: 10 | 5: 11, 41
b = 0, L = 1; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 7/12 = 0.58; lower threshold = 0.5
37
Delete Example
Time to combine buckets. Decrementing b results in b = -1, so set L = 0 and b = 3 * 2^0 - 1 = 2.
Delete: 33
Buckets: 0: 24 | 1: 61 | 2: (empty) | 3: 15, 33 | 4: (empty) | 5: 11, 41
b = 0, L = 1; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 6/12 = 0.5; lower threshold = 0.5
38
Delete Example
Next, combine the last bucket (5) with bucket 2:
Delete: 33
Buckets: 0: 24 | 1: 61 | 2: (empty) | 3: 15, 33 | 4: (empty) | 5: 11, 41
b = 2, L = 0; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 6/12 = 0.5; lower threshold = 0.5
39
Delete Example
Bucket 5 is deleted and the load factor is updated.
Load factor > lower threshold, so done.
Delete: 33
Buckets: 0: 24 | 1: 61 | 2: 11, 41 | 3: 15, 33 | 4: (empty)
b = 2, L = 0; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 6/10 = 0.6; lower threshold = 0.5
40
Delete Example
Delete 33: H(33, 0) = 0 < b, so rehash at L+1: H(33, 1) = 3. Remove 33 from bucket 3:
Delete: 33
Buckets: 0: 24 | 1: 61 | 2: 11, 41 | 3: 15, 33 | 4: (empty)
b = 2, L = 0; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 6/10 = 0.6; lower threshold = 0.5
41
Delete Example
Load factor <= lower threshold, so time to combine...
First, decrement b:
Delete: done
Buckets: 0: 24 | 1: 61 | 2: 11, 41 | 3: 15 | 4: (empty)
b = 2, L = 0; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 5/10 = 0.5; lower threshold = 0.5
42
Delete Example
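Now, combine the last bucket (4) with bucket b = 1, and remove bucket 4.
Update the load factor too: with bucket 4 removed it becomes 5/8 = 0.63, which is above the lower threshold, so we are done.
Delete: done
Buckets: 0: 24 | 1: 61 | 2: 11, 41 | 3: 15 | 4: (empty, to be removed)
b = 1, L = 0; H(x, L) = x mod (3 * 2^L); 2 key values per bucket
Load factor = 5/10 = 0.5 before bucket 4 is removed; lower threshold = 0.5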