Hash Tables
Saurav Karmakar
Motivation
What are the dictionary operations?
(1) Insert
(2) Delete
(3) Search (most of the time, we will be focusing on search)
Objective
Searching takes Θ(n) time in the worst case (when the data is unorganized).
Even with binary search, it takes Θ(log n) time in the worst case when the data is sorted.
Our objective?
O(1) time on average using hashing, under a reasonable assumption.
Definitions
A hash table is a generalization of an array (in which direct addressing is allowed), so let's first talk about the direct-address table.
Universe of keys U = {0,1,2,…,m-1}; no two elements have the same key.
To represent a dynamic set, we use an array, or direct-address table T[0..m-1], in which each position (slot) corresponds to a key in the universe.
[Figure: direct-address table T[0..9]. U (universe of keys) = {0,1,…,9}; K (actual keys) = {2, 3, 5, 8}. Slots 2, 3, 5, and 8 each hold an element (key plus satellite data); every other slot is empty (/).]
With a direct-address table T[0..m-1], how do we search for an element x with key k?
Direct-Address-Search(T,k): return T[k]
How do we insert or delete an element x with key k?
Direct-Address-Insert(T,x): T[key[x]] ← x
Direct-Address-Delete(T,x): T[key[x]] ← Nil
Each of these operations (search, insert, delete) takes O(1) time!
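The three direct-address operations can be sketched in a few lines of Python. This is only an illustration: the class name `DirectAddressTable`, its method names, and the `data-…` strings used as satellite data are assumptions, not part of the slides.

```python
class DirectAddressTable:
    """Direct-address table over the key universe {0, 1, ..., m-1}."""

    def __init__(self, m):
        # One slot per possible key; None plays the role of Nil.
        self.slots = [None] * m

    def search(self, k):
        # Direct-Address-Search: return T[k]
        return self.slots[k]

    def insert(self, key, data):
        # Direct-Address-Insert: T[key[x]] <- x
        self.slots[key] = (key, data)

    def delete(self, key):
        # Direct-Address-Delete: T[key[x]] <- Nil
        self.slots[key] = None


# Hypothetical usage mirroring the figure: keys 2, 3, 5, 8 in T[0..9].
table = DirectAddressTable(10)
for key in (2, 3, 5, 8):
    table.insert(key, f"data-{key}")
table.delete(3)
```

Every operation is a single array access, which is where the O(1) bound comes from.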
Problem?
The table needs one slot for every key in U, so direct addressing wastes space (or becomes infeasible) when the universe is much larger than the set K of keys actually stored.
With direct addressing, an element with key k is stored in slot k. With hashing, it is stored in slot h(k); h is called a hash function.
h maps the universe U of keys into the slots of a hash table T[0..m-1].
h : U → {0,1,…,m-1}
[Figure: Hash Table. h maps the actual keys K = {2, 3, 5, 8} from the universe U to slots of T[0..9]; unused slots hold /.]
Basic Idea
Use a hash function to map keys into positions in a hash table.
Ideally: if element e has key k and h is the hash function, then e is stored in position h(k) of the table.
To search for e, compute h(k) to locate the position. If there is no element there, the dictionary does not contain e.
Hash Table: Collision Problem
[Figure: the same hash table, but keys 5 and 8 are mapped to the same slot of T. If h(5) = h(8), there is a collision.]
Collision
When two keys hash to the same slot, we have a collision.
While collisions are hard to avoid, if we design the hash function carefully we can at least decrease the chance of a collision (and in some cases avoid collisions altogether).
Collision Resolution by Chaining
[Figure: if h(5) = h(8), keys 5 and 8 are placed in a linked list attached to their shared slot of T.]
Chained-Hash-Insert(T,x): insert x at the head of list T[h(key[x])]
Chained-Hash-Search(T,k): search for an element with key k in list T[h(k)]
Chained-Hash-Delete(T,x): delete x from the list T[h(key[x])]
Time?
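A minimal sketch of the three chained operations follows. The class name `ChainedHashTable`, the default hash function `k mod m`, and the use of `collections.deque` as a stand-in for a linked list are all assumptions made for illustration.

```python
from collections import deque


class ChainedHashTable:
    """Hash table with collisions resolved by chaining."""

    def __init__(self, m, h=None):
        self.m = m
        # Assumed default hash function: k mod m.
        self.h = h if h is not None else (lambda k: k % m)
        # One chain per slot; a deque stands in for a linked list.
        self.slots = [deque() for _ in range(m)]

    def insert(self, key, value=None):
        # Chained-Hash-Insert: insert at the head of the chain.
        self.slots[self.h(key)].appendleft((key, value))

    def search(self, key):
        # Chained-Hash-Search: scan the chain T[h(k)].
        for k, v in self.slots[self.h(key)]:
            if k == key:
                return (k, v)
        return None

    def delete(self, key):
        # Chained-Hash-Delete: remove the first element with this key.
        chain = self.slots[self.h(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                return True
        return False


# With m = 3, keys 5 and 8 collide: 5 mod 3 == 8 mod 3 == 2.
t = ChainedHashTable(3)
t.insert(5, "five")
t.insert(8, "eight")
```

Insertion at the head is constant time, while search and delete take time proportional to the length of the chain they scan.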
Example: Let h(k) = k mod 11, insert 5, 28, 19, 15, 20, 33, 12, 17, 39, 11 into T[0..10].
Slot: keys in the chain (listed in insertion order)
0: 33, 11
1: 12
2: /
3: /
4: 15
5: 5
6: 28, 17, 39
7: /
8: 19
9: 20
10: /
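The slot assignments in this example can be checked with a short script. The helper name `build_chains` is hypothetical, and for readability it appends each key to the end of its chain (insertion order) instead of inserting at the head.

```python
def build_chains(keys, m):
    """Group keys by h(k) = k mod m, keeping insertion order in each chain."""
    chains = [[] for _ in range(m)]
    for k in keys:
        chains[k % m].append(k)
    return chains


chains = build_chains([5, 28, 19, 15, 20, 33, 12, 17, 39, 11], 11)
for slot, chain in enumerate(chains):
    print(slot, chain if chain else "/")
```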
Hash function
A hash function that causes no collisions is called a perfect hash function.
A good hash function is one that satisfies simple uniform hashing: each key is equally likely to hash to any of the m slots. (This condition is difficult to check, though.)
Now let's see some examples of hash functions. Assume that all the keys can be represented as natural numbers.
Famous Examples of Hash Functions
Division: h(k) = k mod m. m should be a prime number, preferably one not too close to a power of 2.
Multiplication: h(k) = ⌊m(kA mod 1)⌋, A = (√5 – 1)/2 = 0.61803...
Two-step process
Step 1: Multiply the key k by a constant A, 0 < A < 1, and extract the fractional part of kA.
Step 2: Multiply this fractional part by m and take the floor of the result.
Example. k = 123456, m = 10000.
h(k) = ⌊10000(123456 × 0.61803… mod 1)⌋
= ⌊10000(76300.0041151… mod 1)⌋
= ⌊10000 × 0.0041151…⌋
= ⌊41.151…⌋
= 41
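The division and multiplication methods can be sketched as follows. The function names are assumptions, and the multiplication version uses floating-point arithmetic, which is adequate for keys of this size but loses precision for very large keys.

```python
import math

# The constant A = (sqrt(5) - 1) / 2 = 0.61803...
A = (math.sqrt(5) - 1) / 2


def hash_division(k, m):
    """Division method: h(k) = k mod m."""
    return k % m


def hash_multiplication(k, m, a=A):
    """Multiplication method: h(k) = floor(m * (k*a mod 1))."""
    frac = (k * a) % 1.0        # Step 1: fractional part of k*a
    return math.floor(m * frac)  # Step 2: scale by m and take the floor


print(hash_multiplication(123456, 10000))  # the worked example: 41
```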
Folding: The key is divided into several parts. These parts are combined or folded together and are transformed in a certain way to create the target address.
Example 1. Shift folding: 123-456-789 (SSN)
123 + 456 + 789 = 1368
1368 mod 1000 = 368.
Example 2. Boundary folding: 123-456-789 (SSN); the middle part is reversed before adding.
123 + 654 + 789 = 1566
1566 mod 1000 = 566.
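Both folding variants can be sketched in Python. The function names and the `width` parameter are assumptions, and so is the rule of reversing every second part in boundary folding: that rule reproduces the example above, but other conventions exist.

```python
def split_digits(key, width):
    """Split the decimal digits of key into parts of `width` digits."""
    s = str(key)
    return [s[i:i + width] for i in range(0, len(s), width)]


def shift_fold(key, width, m):
    """Shift folding: add the parts as they are, then reduce mod m."""
    return sum(int(p) for p in split_digits(key, width)) % m


def boundary_fold(key, width, m):
    """Boundary folding: reverse every second part before adding."""
    parts = split_digits(key, width)
    total = sum(int(p[::-1]) if i % 2 == 1 else int(p)
                for i, p in enumerate(parts))
    return total % m


print(shift_fold(123456789, 3, 1000))     # 123 + 456 + 789 = 1368 -> 368
print(boundary_fold(123456789, 3, 1000))  # 123 + 654 + 789 = 1566 -> 566
```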
Mid-square function: the key is squared and the middle part of the result is taken as the address.
Example. k = 3121, 3121² = 9740641, so h(k) = 406.
You can also encode the square into binary representation and take the middle part.
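A decimal version of the mid-square function might look like this. The name `mid_square` and the `digits` parameter are assumptions; when the middle cannot be centered exactly, this sketch leans one position to the left.

```python
def mid_square(k, digits):
    """Square k and return the middle `digits` decimal digits."""
    s = str(k * k)
    start = (len(s) - digits) // 2
    return int(s[start:start + digits])


print(mid_square(3121, 3))  # 3121**2 = 9740641, middle three digits: 406
```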
Extraction: Only a part of the key is used to compute the address.
Example: 123456789, first 4 digits 1234, last 4 digits 6789.
Concatenating the first 2 digits of 1234 with the last 2 digits of 6789, we have 1289.
Radix Transformation: k is transformed into another number base.
Example: 345₁₀ = 423₉, then 423 mod 100 = 23.
264₁₀ = 323₉, then 323 mod 100 = 23.
Collision is hard to avoid in the worst case!
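The radix transformation, and the collision it produces in this example, can be reproduced with a small sketch. The names `to_base` and `radix_hash` are hypothetical, and the code assumes a base of at most 10 so that the converted digits can be read back as a decimal number.

```python
def to_base(n, base):
    """Return the digits of non-negative n in the given base (base <= 10)."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % base))
        n //= base
    return "".join(reversed(digits))


def radix_hash(k, base, m):
    """Rewrite k in `base`, read the digits as decimal, reduce mod m."""
    return int(to_base(k, base)) % m


print(to_base(345, 9), radix_hash(345, 9, 100))  # 423 23
print(to_base(264, 9), radix_hash(264, 9, 100))  # 323 23
```

The two distinct keys 345 and 264 end up at the same address.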
Open Addressing
In some applications, it is hard to dynamically allocate additional space for handling the chaining.
So it is natural to come up with a different way to handle collisions, in which all elements are stored in the hash table itself. Then, instead of following pointers, we simply compute the sequence of slots to be examined.
Let's use insertion as an example.
To perform insertion using open addressing, we successively examine, or probe, the hash table until we find an empty slot in which to put the element.
Moreover, the sequence of positions probed depends on the key being inserted; i.e.,
h: U × {0,1,…,m-1} → {0,1,…,m-1}
For every key k, the probe sequence <h(k,0), h(k,1),…,h(k,m-1)> should be a permutation of <0,1,…,m-1>, so that every position in the hash table is eventually considered as a slot for a new key as the table fills up.
Now, for simplicity, assume that each element x is just its key k, and that there is no deletion.
Hash-Insert(T,k)
    i ← 0
    repeat
        j ← h(k,i)
        if T[j] == Nil
            then T[j] ← k
                 return j
            else i ← i + 1
    until i = m
    error “hash table overflow”
Example. Insert keys 10, 22, 31, 4, 15, 28, 17, 88, 59 into T[0..10] (m = 11).
h(k,i) = [h’(k) + i] mod m,
h’(k) = k mod m.
h(10,0) = (10+0) mod 11 = 10
h(22,0) = 0
h(31,0) = 9
h(4,0) = 4
h(15,0) = (4+0) mod 11 = 4 (occupied)
h(15,1) = (4+1) mod 11 = 5
h(28,0) = 6
h(17,0) = 6 (occupied), h(17,1) = 7
h(88,0) = 0 (occupied), h(88,1) = 1
h(59,0) = 4; slots 4, 5, 6, and 7 are occupied, so 59 goes to h(59,4) = 8
Resulting table T:
0: 22
1: 88
2: /
3: /
4: 4
5: 15
6: 28
7: 17
8: 59
9: 31
10: 10
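Hash-Insert with the linear probe function from this example can be sketched in Python. The names `linear_probe` and `hash_insert` are assumptions; `None` stands for Nil, and keys are stored directly with no satellite data.

```python
def linear_probe(k, i, m):
    """h(k, i) = (h'(k) + i) mod m with h'(k) = k mod m."""
    return (k % m + i) % m


def hash_insert(table, k, probe=linear_probe):
    """Probe until an empty slot is found; return the slot index used."""
    m = len(table)
    for i in range(m):
        j = probe(k, i, m)
        if table[j] is None:
            table[j] = k
            return j
    raise OverflowError("hash table overflow")


table = [None] * 11
for key in [10, 22, 31, 4, 15, 28, 17, 88, 59]:
    hash_insert(table, key)
print(table)
```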
Hash-Search(T,k)
    i ← 0
    repeat
        j ← h(k,i)
        if T[j] == k
            then return j
        i ← i + 1
    until T[j] = Nil or i = m
    return Nil
Example. Search 15 in T (the table built above).
h(k,i) = [h’(k) + i] mod m,
h’(k) = k mod m.
i = 0
j ← h(15,0) = 4
T[j] != 15
i = 1
j ← h(15,1) = 5
T[j] = 15
return 5
How about deletion?
You can simply use Hash-Search to find the key first. Then what?
    i ← 0
    repeat
        j ← h(k,i)
        if T[j] != Nil and T[j] == k
            then T[j] ← Nil?
                 exit
        i ← i + 1
    until T[j] = Nil or i = m
Example. Delete 4, 15 in T (the same table, with h(k,i) = [h’(k) + i] mod m and h’(k) = k mod m).
Suppose deleting 4 simply sets T[4] ← Nil. Then:
Delete 15:
i = 0
j ← h(15,0) = 4
T[j] = Nil
exit
The probe sequence stops at the emptied slot 4, so 15, which is still stored in slot 5, is never found.
Instead, mark the slot with a special value, deleted, rather than Nil:
    i ← 0
    repeat
        j ← h(k,i)
        if T[j] != Nil and T[j] == k
            then T[j] ← deleted
                 exit
        i ← i + 1
    until T[j] = Nil or i = m
Example. Delete 4, 15 in T.
After deleting 4, T[4] = deleted.
Delete 15:
i = 0
j ← h(15,0) = 4
T[j] = deleted
i = 1
j ← h(15,1) = 5
T[j] = 15
15 is deleted! Slots 4 and 5 of T now both hold deleted.
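The deleted marker can be sketched in Python with a sentinel object. The names `DELETED`, `probe`, `hash_search`, and `hash_delete` are assumptions; the starting table is the one produced by the linear-probing example.

```python
# Sentinel for the special "deleted" value; None plays the role of Nil.
DELETED = object()


def probe(k, i, m):
    """Linear probing: h(k, i) = (k mod m + i) mod m."""
    return (k % m + i) % m


def hash_search(table, k):
    """Return the slot holding k, or None. Deleted slots do not stop the scan."""
    m = len(table)
    for i in range(m):
        j = probe(k, i, m)
        if table[j] is None:
            return None  # a truly empty slot ends the search
        if table[j] is not DELETED and table[j] == k:
            return j
    return None


def hash_delete(table, k):
    """Mark the slot of k as deleted instead of emptying it."""
    j = hash_search(table, k)
    if j is not None:
        table[j] = DELETED
    return j


table = [22, 88, None, None, 4, 15, 28, 17, 59, 31, 10]
hash_delete(table, 4)
found = hash_search(table, 15)  # still reachable past the deleted slot 4
hash_delete(table, 15)
```

An insertion routine could reuse slots marked `DELETED`, although the Hash-Insert shown on the slides only tests for Nil.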
Linear probing
That is what we have just seen.
h’ is an ordinary hash function; i.e., h’: U → {0,1,2,…,m-1}
h(k,i) = [h’(k) + i] mod m.
Initial slot probed is exactly T[h’(k)].
Quadratic probing
h’ is an ordinary hash function; i.e., h’: U → {0,1,2,…,m-1}
h(k,i) = [h’(k) + C₁i + C₂i²] mod m, where C₁ and C₂ are two non-zero constants.
Initial slot probed is also T[h’(k)], but for i > 0 the later probes are offset by an amount that is quadratic in i, which is intuitively better than linear probing.
Example. Insert keys 10, 22, 31, 4, 15, 28, 17, 88, 59 into T[0..10] (m = 11).
h(k,i) = [h’(k) + C₁i + C₂i²] mod m,
h’(k) = k mod m,
C₁ = 1, C₂ = 3.
h(10,0) = 10
h(22,0) = 0
h(31,0) = 9
h(4,0) = 4
h(15,0) = 4 (occupied)
h(15,1) = [4+1+3] mod 11 = 8
h(28,0) = 6
h(17,0) = 6 (occupied)
h(17,1) = 10 (occupied)
h(17,2) = [6+2+3×2²] mod 11 = 9 (occupied)
h(17,3) = [6+3+3×3²] mod 11 = 3
h(88,i) hits an occupied slot for i = 0,…,7; h(88,8) = [0+8+3×8²] mod 11 = 2
h(59,0) = 4 (occupied), h(59,1) = 8 (occupied), h(59,2) = [4+2+3×2²] mod 11 = 7
Resulting table T:
0: 22
1: /
2: 88
3: 17
4: 4
5: /
6: 28
7: 59
8: 15
9: 31
10: 10
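The quadratic-probing example can be replayed with a short script that also counts the probes each key needs. The names `quadratic_probe` and `insert_all` are hypothetical; the constants C₁ = 1 and C₂ = 3 are taken from the example.

```python
def quadratic_probe(k, i, m, c1=1, c2=3):
    """h(k, i) = (k mod m + c1*i + c2*i^2) mod m."""
    return (k % m + c1 * i + c2 * i * i) % m


def insert_all(keys, m):
    """Insert keys in order; return the table and the probe count per key."""
    table = [None] * m
    probes = {}
    for k in keys:
        for i in range(m):
            j = quadratic_probe(k, i, m)
            if table[j] is None:
                table[j] = k
                probes[k] = i + 1
                break
        else:
            raise OverflowError("hash table overflow")
    return table, probes


table, probes = insert_all([10, 22, 31, 4, 15, 28, 17, 88, 59], 11)
print(table)
print(probes)
```

With these constants the probe sequence revisits slots (for key 88 it sees slots 0, 4, 3, and 8 twice each before reaching slot 2), so it is not a permutation of all m slots.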
Double hashing
h(k,i) = [h₁(k) + i·h₂(k)] mod m,
where h₁ and h₂ are two auxiliary hash functions.
Initial slot probed is also T[h₁(k)].
Example. Insert keys 10, 22, 31, 4, 15, 28, 17, 88, 59 into T[0..10] (m = 11).
h(k,i) = [h₁(k) + i·h₂(k)] mod m,
h₁(k) = k mod m,
h₂(k) = 1 + [k mod (m-1)].
h(10,0) = 10
h(22,0) = 0
h(31,0) = 9
h(4,0) = 4
h(15,0) = 4 (occupied)
h(15,1) = [4+1×6] mod 11 = 10 (occupied)
h(15,2) = [4+2×6] mod 11 = 5
h(28,0) = 6
h(17,0) = 6 (occupied)
h(17,1) = [6+1×8] mod 11 = 3
…
h(88,2) = [0+2×9] mod 11 = 7
…
h(59,2) = [4+2×10] mod 11 = 2
Resulting table T:
0: 22
1: /
2: 59
3: 17
4: 4
5: 15
6: 28
7: 88
8: /
9: 31
10: 10
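The same insertion loop works with the double-hashing probe function from this example. The names `double_hash_probe` and `insert_all` are assumptions made for this sketch.

```python
def double_hash_probe(k, i, m):
    """h(k, i) = (h1(k) + i*h2(k)) mod m, h1 = k mod m, h2 = 1 + k mod (m-1)."""
    h1 = k % m
    h2 = 1 + (k % (m - 1))
    return (h1 + i * h2) % m


def insert_all(keys, m):
    """Insert keys in order using double hashing; return the final table."""
    table = [None] * m
    for k in keys:
        for i in range(m):
            j = double_hash_probe(k, i, m)
            if table[j] is None:
                table[j] = k
                break
        else:
            raise OverflowError("hash table overflow")
    return table


table = insert_all([10, 22, 31, 4, 15, 28, 17, 88, 59], 11)
print(table)
```

Because h₂(k) is never 0 and m = 11 is prime, each key's probe sequence visits all 11 slots.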
Analysis of hashing (in general tough)
In a hash table of size m, we want to store n elements with collisions resolved by chaining.
Load factor α = n/m.
Theorem: An unsuccessful search takes expected time O(1+α), under the assumption of simple uniform hashing.
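To make the load factor concrete, the sketch below computes α for the earlier chaining example (n = 10 keys, m = 11 slots, h(k) = k mod 11). The variable names are assumptions. An unsuccessful search for a key hashing to slot j scans the whole chain at j, so if each slot is equally likely, the expected number of elements scanned is the average chain length, which equals α.

```python
keys = [5, 28, 19, 15, 20, 33, 12, 17, 39, 11]
m = 11

# Build the chains with h(k) = k mod m.
chains = [[] for _ in range(m)]
for k in keys:
    chains[k % m].append(k)

n = len(keys)
alpha = n / m  # load factor

# Average chain length over all m slots: the expected number of elements
# scanned by an unsuccessful search when every slot is equally likely.
avg_chain_length = sum(len(c) for c in chains) / m

print(alpha, avg_chain_length)
```

The extra 1 in O(1+α) accounts for computing h(k) and accessing the slot.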