The Hash Table Data Structure
Mugurel Ionuț Andreica
Spring 2012
Operations
• put(key, value)
– Inserts the pair (key, value) in the hash table
– If a pair (key, value’) with the same key already exists, then value’ is replaced by value
– We say that value is associated with the key key
• get(key)
– Returns the value associated with the key key
– If no value is associated with key, then an error occurs
• hasKey(key)
– Returns 1 if the key key exists in the hash table, and 0 otherwise
Example
• put(3, 7.9)
• put(2, 8.3)
• get(3) -> returns 7.9
• put(3, 10.2)
• get(3) -> returns 10.2
• get(2) -> returns 8.3
• hasKey(5) -> returns 0
• hasKey(2) -> returns 1
• get(5) -> generates an error
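The sequence above can be replayed against std::unordered_map from the C++ standard library, used here only as a stand-in for the hash table of these slides. The free functions put, get and hasKey and the helper replayExample are names assumed for this sketch. Reporting the missing key with an exception is just one possible way to signal the error.

```cpp
#include <cassert>        // for the check shown in the usage note
#include <stdexcept>
#include <unordered_map>

// Stand-in table: std::unordered_map plays the role of the hash table.
typedef std::unordered_map<int, double> Table;

// put: insert the pair, or replace the value if the key already exists.
void put(Table &t, int key, double value) { t[key] = value; }

// get: return the value associated with key; throw if there is none.
double get(const Table &t, int key) {
    Table::const_iterator it = t.find(key);
    if (it == t.end())
        throw std::out_of_range("key not found");
    return it->second;
}

// hasKey: 1 if the key exists, 0 otherwise.
int hasKey(const Table &t, int key) { return t.count(key) ? 1 : 0; }

// Replays the example; returns true if every step behaves as listed.
bool replayExample() {
    Table t;
    put(t, 3, 7.9);
    put(t, 2, 8.3);
    bool ok = (get(t, 3) == 7.9);
    put(t, 3, 10.2);                       // replaces the value stored for key 3
    ok = ok && get(t, 3) == 10.2 && get(t, 2) == 8.3;
    ok = ok && hasKey(t, 5) == 0 && hasKey(t, 2) == 1;
    try {
        get(t, 5);                         // key 5 was never inserted
        ok = false;
    } catch (const std::out_of_range &) {
        // expected: get(5) generates an error
    }
    return ok;
}
```

Calling assert(replayExample()) from main checks the whole trace in one step.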
Possible implementation
• Maintain an array H[HMAX] of linked lists
– The info field of each element of a list consists of a struct containing a key and a value
• Each key is mapped to a value hkey=hash(key), such that 0≤hkey≤HMAX-1
– hash(key) is called the hash function
• put(k, v)
– Searches for the key k in the list H[hkey=hash(k)]
– If the key is found, then we replace the value by v
– If the key is not found, then we insert the pair (k,v) in H[hkey]
• get(k)
– Search for the key k in H[hkey=hash(k)]
– If it finds the key, then it returns its associated value; otherwise, an error occurs
• hasKey(k)
– Search for the key k in H[hkey=hash(k)]
– If it finds the key, then it returns 1; otherwise, it returns 0
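The three operations described above can be sketched in a self-contained form. This is only an illustration under stated assumptions: std::list stands in for the linked lists, keys are int and values are double, the hash function is a simple remainder, and the names ChainedTable and find are hypothetical. Unlike the description above, get reports a missing key through its return value.

```cpp
#include <cassert>
#include <list>
#include <utility>
#include <vector>

// Sketch of an array of linked lists (separate chaining).
struct ChainedTable {
    typedef std::pair<int, double> Pair;   // (key, value)
    std::vector<std::list<Pair> > H;       // H[hkey] is the list for hkey

    explicit ChainedTable(int hmax) : H(hmax) { assert(hmax > 0); }

    // Assumed hash function: maps any int to the range 0..HMAX-1.
    int hash(int key) const {
        int n = static_cast<int>(H.size());
        return ((key % n) + n) % n;
    }

    // Returns a pointer to the stored pair for key k, or nullptr if absent.
    Pair *find(int k) {
        for (Pair &p : H[hash(k)])
            if (p.first == k)
                return &p;
        return nullptr;
    }

    void put(int k, double v) {
        Pair *p = find(k);
        if (p)
            p->second = v;                          // key found: replace the value
        else
            H[hash(k)].push_back(Pair(k, v));       // key not found: insert (k, v)
    }

    int hasKey(int k) { return find(k) ? 1 : 0; }

    // Writes the value into out and returns true; returns false if k is absent.
    bool get(int k, double &out) {
        Pair *p = find(k);
        if (!p)
            return false;
        out = p->second;
        return true;
    }
};
```

With ChainedTable t(17), the keys 3 and 20 land in the same list H[3], and each lookup walks only that list.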
Possible implementation (cont.)
• Class Hashtable
• HMAX and hash => arguments in the constructor
– The function hash will be passed as an argument (actually, a pointer to the function will be passed)
– Obviously, hash must be defined differently according to the data type of the keys (see later some examples for int and char*)
– The array H: allocated dynamically in the constructor & deallocated in the destructor
Hash Table – Implementation (hash_table.h)
#include <stdio.h>
#include "linked_list.h"

template<typename Tkey, typename Tvalue> struct elem_info {
    Tkey key;
    Tvalue value;
};

template<typename Tkey, typename Tvalue> class Hashtable {
private:
    LinkedList<struct elem_info<Tkey, Tvalue> > *H; /* array of HMAX linked lists */
    int HMAX;                                       /* number of lists */
    int (*hash) (Tkey);                             /* pointer to the hash function */

public:
    Hashtable(int hmax, int (*h) (Tkey)) {
        HMAX = hmax;
        hash = h;
        H = new LinkedList<struct elem_info<Tkey, Tvalue> > [HMAX];
    }

    ~Hashtable() {
        for (int i = 0; i < HMAX; i++) {
            while (!H[i].isEmpty())
                H[i].removeFirst();
        }
        /* H was allocated with new[], so it must be released with delete[] */
        delete[] H;
    }
    void put(Tkey key, Tvalue value) {
        struct list_elem<struct elem_info<Tkey, Tvalue> > *p;
        struct elem_info<Tkey, Tvalue> info;

        int hkey = hash(key);
        p = H[hkey].pfirst;

        while (p != NULL) {
            /* the == operator must be meaningful when comparing
               values of the type Tkey ; otherwise, an equality testing
               function should be passed as an argument to the constructor */
            if (p->info.key == key)
                break;
            p = p->next;
        }

        if (p != NULL)
            p->info.value = value;
        else {
            info.key = key;
            info.value = value;
            H[hkey].addLast(info);
        }
    }
Hash Table – Implementation (hash_table.h) (cont.)
    Tvalue get(Tkey key) {
        struct list_elem<struct elem_info<Tkey, Tvalue> > *p;

        int hkey = hash(key);
        p = H[hkey].pfirst;

        while (p != NULL) {
            if (p->info.key == key)
                break;
            p = p->next;
        }

        if (p != NULL)
            return p->info.value;
        else {
            fprintf(stderr, "Error 101 - The key does not exist in the hashtable\n");
            /* value-initialized, so that the value returned on error is not indeterminate */
            Tvalue x = Tvalue();
            return x;
        }
    }

    int hasKey(Tkey key) {
        struct list_elem<struct elem_info<Tkey, Tvalue> > *p;

        int hkey = hash(key);
        p = H[hkey].pfirst;

        while (p != NULL) {
            if (p->info.key == key)
                break;
            p = p->next;
        }

        if (p != NULL)
            return 1;
        else
            return 0;
    }
};
Using the Hash Table - example
#include <stdio.h>
#include <string.h>
#include "hash_table.h"

#define VMAX 17
#define P 13

int hfunc(int key) {
    return (P * key) % VMAX;
}

Hashtable<int, double> hid(VMAX, hfunc);

int hfunc2(char* key) {
    int hkey = 0;
    for (int i = 0; i < (int) strlen(key); i++)
        hkey = (hkey * P + key[i]) % VMAX;
    return hkey;
}

Hashtable<char*, int> hci(VMAX, hfunc2);

/* arrays instead of pointers to string literals: standard C++ does not
   allow a string literal to be assigned to a non-const char* */
char k1[] = "abc";
char k2[] = "xyze";
char k3[] = "Abc";
char k4[] = "abcD";

int main() {
    hid.put(3, 7.9);
    hid.put(2, 8.3);
    printf("%.3lf\n", hid.get(3));
    hid.put(3, 10.2);
    printf("%.3lf\n", hid.get(3));
    printf("%.3lf\n", hid.get(2));
    printf("%d\n", hid.hasKey(5));
    printf("%d\n", hid.hasKey(2));
    printf("%.3lf\n", hid.get(5));

    hci.put(k1, 10);
    hci.put(k2, 20);
    printf("%d\n", hci.get(k1));
    hci.put(k1, 30);
    printf("%d\n", hci.get(k1));
    printf("%d\n", hci.get(k2));
    printf("%d\n", hci.hasKey(k3));
    printf("%d\n", hci.hasKey(k2));
    printf("%d\n", hci.get(k4));

    char *k5 = new char[4];
    k5[0] = 'a'; k5[1] = 'b'; k5[2] = 'c'; k5[3] = 0;
    printf("%d\n", hci.get(k5)); // what happens ?
    delete[] k5;

    return 0;
}
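On the question raised for k5: hfunc2 hashes the characters, so k5 lands in the same list as k1, but the Hashtable compares keys with ==, which for char* compares addresses rather than contents. The lookup therefore fails and get(k5) takes the error branch. The sketch below isolates that difference. The names equalStrings, equalPointers and comparisonsDisagree are hypothetical. equalStrings is one candidate for the equality testing function that the comment in put suggests passing to the constructor.

```cpp
#include <cassert>   // for the check shown in the usage note
#include <cstring>

// Hypothetical equality function for C-string keys: compares the contents.
int equalStrings(const char *a, const char *b) {
    return std::strcmp(a, b) == 0;
}

// What operator== does on char* keys: compares the addresses only.
int equalPointers(const char *a, const char *b) {
    return a == b;
}

// Returns 1 if the two comparisons disagree for two buffers that hold
// the same characters, as k1 and k5 do in the example.
int comparisonsDisagree() {
    char k1[] = "abc";
    char *k5 = new char[4];
    k5[0] = 'a'; k5[1] = 'b'; k5[2] = 'c'; k5[3] = 0;

    int samePointer = equalPointers(k1, k5);  // 0: two distinct buffers
    int sameContent = equalStrings(k1, k5);   // 1: both hold "abc"

    delete[] k5;
    return samePointer == 0 && sameContent == 1;
}
```

Here assert(comparisonsDisagree() == 1) passes. A table that compared keys with equalStrings instead of == would find the pair stored under k1 when queried with k5.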
Final remarks
• The hash table is a fundamental data structure in many situations
– Packet routing in the Internet
– Session management in web servers
– Web search (e.g. Google)
– etc.
• Many other implementations exist, besides using an array of linked lists
– For example: linear probing, cuckoo hashing, etc.
– Many of them are more efficient, but more difficult to understand
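As an illustration of one of the alternatives named above, here is a minimal sketch of linear probing. All pairs live in a single array; on a collision, the search moves to the next slot, wrapping around at the end. The class name ProbingTable and its members are assumptions made for this sketch. To stay short it supports only int keys and double values, has a fixed capacity, and offers no deletion.

```cpp
#include <cassert>
#include <vector>

// Sketch of open addressing with linear probing.
class ProbingTable {
    struct Slot { bool used; int key; double value; };
    std::vector<Slot> slots;

    int hash(int key) const {
        int n = static_cast<int>(slots.size());
        return ((key % n) + n) % n;
    }

    // Index of the slot holding key, or of the first free slot on its
    // probe sequence; -1 if the table is full and the key is absent.
    int locate(int key) const {
        int n = static_cast<int>(slots.size());
        int start = hash(key);
        for (int step = 0; step < n; ++step) {
            int i = (start + step) % n;
            if (!slots[i].used || slots[i].key == key)
                return i;
        }
        return -1;
    }

public:
    explicit ProbingTable(int capacity) : slots(capacity, Slot{false, 0, 0.0}) {
        assert(capacity > 0);
    }

    // Inserts or replaces; returns false only if the table is full
    // and the key is not already present.
    bool put(int key, double value) {
        int i = locate(key);
        if (i < 0)
            return false;
        slots[i] = Slot{true, key, value};
        return true;
    }

    int hasKey(int key) const {
        int i = locate(key);
        return (i >= 0 && slots[i].used) ? 1 : 0;
    }

    // Writes the value into out and returns true; returns false if absent.
    bool get(int key, double &out) const {
        int i = locate(key);
        if (i < 0 || !slots[i].used)
            return false;
        out = slots[i].value;
        return true;
    }
};
```

Stopping at the first free slot is correct only because nothing is ever deleted; supporting removal would need extra bookkeeping, which this sketch leaves out.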