8/16/2019 Final Hashing in Java Aima311 30apr16
Concept of Hashing
Hash tables
The problem at hand is to speed up searching.
Consider the problem of searching an array for a given value. If the array is not sorted, the search might require examining every element of the array.
If the array is sorted, we can use binary search and therefore reduce the worst-case runtime complexity to O(log n).
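As a rough illustration of the sorted-array case, the following Java sketch shows binary search halving the search range at each step. The class name, method name, and sample data are illustrative assumptions, not part of the slides.

```java
// Illustrative sketch: names and data are assumptions, not from the slides.
class BinarySearchSketch {
    // Returns the index of target in the sorted array, or -1 if it is absent.
    static int indexOf(int[] sorted, int target) {
        int lo = 0;
        int hi = sorted.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;   // avoids int overflow of (lo + hi)
            if (sorted[mid] == target) {
                return mid;
            }
            if (sorted[mid] < target) {
                lo = mid + 1;               // discard the lower half
            } else {
                hi = mid - 1;               // discard the upper half
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] data = {3, 8, 15, 23, 42, 57};
        System.out.println(indexOf(data, 23));  // prints 3
        System.out.println(indexOf(data, 4));   // prints -1
    }
}
```

Each loop iteration discards half of the remaining range, which is where the O(log n) bound comes from.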
Hash function
We could search even faster if we knew in advance the index at which the value is located in the array.
Suppose we do have a magic function that would tell us the index for a given value.
With this magic function our search is reduced to just one probe, giving us a constant runtime O(1).
Such a function is called a hash function. A hash function is a function which, when given a key, generates an address in the table.
Thus, we say that our hash function has the following properties:
it always returns a number for an object;
two equal objects will always have the same number;
two unequal objects will not always have different numbers.
The procedure for storing objects using a hash function is the following:
1. Create an array of size M.
2. Choose a hash function h, that is, a mapping from objects into integers 0, 1, ..., M-1.
3. Put these objects into the array at indexes computed via the hash function: index = h(object).
Such an array is called a hash table.
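The three steps can be sketched in Java as follows. This is a minimal illustration under stated assumptions: the table size M = 11 and the choice h(object) = hashCode() mod M are not given in the slides, and collisions are deliberately not handled here.

```java
// Minimal sketch of the three steps; M and h are assumptions for illustration.
class SimpleHashTableSketch {
    static final int M = 11;                       // step 1: array of size M (assumed value)
    static final Object[] table = new Object[M];

    // Step 2: a hash function mapping any object to an integer in 0 .. M-1.
    static int h(Object o) {
        return Math.floorMod(o.hashCode(), M);     // floorMod keeps the result non-negative
    }

    // Step 3: store the object at index = h(object).
    static void store(Object o) {
        table[h(o)] = o;                           // no collision handling: a later object overwrites
    }

    static boolean contains(Object o) {
        return o.equals(table[h(o)]);              // a single probe
    }

    public static void main(String[] args) {
        store("apple");
        System.out.println(contains("apple"));     // prints true
        System.out.println(contains("pear"));      // prints false
    }
}
```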
How to choose a hash function?
One approach to creating a hash function is to use Java's hashCode() method.
The hashCode() method is implemented in the Object class and therefore each class in Java inherits it.
The hash code provides a numeric representation of an object (this is somewhat similar to the toString method, which gives a text representation of an object).
Consider the following code example:
    Integer obj1 = new Integer(…);
    String obj2 = new String("…");
    System.out.println("hashCode for an integer is " + obj1.hashCode());
    System.out.println("hashCode for a string is " + obj2.hashCode());
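A runnable variant of this example might look like the sketch below. The sample value 2009 is purely an assumption chosen for illustration, and Integer.valueOf is used in place of new Integer, which is deprecated in recent Java versions.

```java
// Sketch with an assumed sample value (2009); not the values from the slide.
class HashCodeDemo {
    static int intHash(int value) {
        return Integer.valueOf(value).hashCode();  // an Integer's hash code is its int value
    }

    static int stringHash(String s) {
        return s.hashCode();                       // s[0]*31^(n-1) + ... + s[n-1]
    }

    public static void main(String[] args) {
        // prints: hashCode for an integer is 2009
        System.out.println("hashCode for an integer is " + intHash(2009));
        // prints: hashCode for a string is 1537223
        System.out.println("hashCode for a string is " + stringHash("2009"));
    }
}
```

The same digits give different hash codes because Integer and String each define their own hashCode().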
Collisions
When we put objects into a hash table, it is possible that different objects (by the equals() method) might have the same hash code.
This is called a collision.
Here is an example of a collision. Two different strings "Aa" and "BB" have the same key:
    "Aa" = 'A' * 31 + 'a' = 2112
    "BB" = 'B' * 31 + 'B' = 2112
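The collision can be checked directly with String.hashCode(). The helper method name below is an illustrative assumption.

```java
// Checks that two unequal strings can share a hash code.
class CollisionDemo {
    // True when a and b are different by equals() but have the same hash code.
    static boolean collide(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode());       // prints 2112 (65 * 31 + 97)
        System.out.println("BB".hashCode());       // prints 2112 (66 * 31 + 66)
        System.out.println(collide("Aa", "BB"));   // prints true
    }
}
```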
How to resolve collisions?
Collision handling techniques:
1. Linear Probing
2. …
One of them is based on the idea of putting the keys that collide in a linked list.
A hash table then is an array of lists.
This technique is called separate chaining collision resolution.
Methods of defining a hash function:
1. Mid-Square Method
2. Folding Method
3. …
Mid-square
f_m(x) = middle(x^2)
Frequently used in symbol table applications.
We compute f_m by squaring the identifier and then using an appropriate number of bits from the middle of the square to obtain the bucket address.
The number of bits used to obtain the bucket address depends on the table size. If we use r bits, the range of the values is 2^r.
Since the middle bits of the square usually depend upon all the characters in an identifier, there is a high probability that different identifiers will produce different hash addresses.
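One possible way to take the "middle" bits is sketched below. The slides do not say exactly which bits count as the middle, so this sketch assumes an integer key, measures the bit length of the square, and drops roughly equal numbers of bits on each side; a fixed-width variant would be equally valid. The class and method names are hypothetical.

```java
// Sketch of a mid-square hash under the assumptions stated above.
class MidSquareSketch {
    // Returns r bits from the middle of key * key, a value in 0 .. 2^r - 1.
    // Assumes 1 <= r <= 30.
    static int hash(int key, int r) {
        long square = (long) key * key;                       // fits in a long for any int key
        int bits = 64 - Long.numberOfLeadingZeros(square);    // bit length of the square
        int shift = Math.max(0, (bits - r) / 2);              // low-order bits to discard
        return (int) ((square >>> shift) & ((1L << r) - 1));  // keep the next r bits
    }

    public static void main(String[] args) {
        // 1234 * 1234 = 1522756, which has 21 bits; with r = 4 the shift is 8.
        System.out.println(hash(1234, 4));   // prints 12
    }
}
```

With r = 4 the result always lies in 0..15, matching a table of 2^r = 16 buckets.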
Folding Method
Partition the identifier x into several parts.
All parts except for the last one have the same length.
Add the parts together to obtain the hash address.
The sum is truncated if necessary.
For example, a numeric key is split into groups of digits, the groups are added, and the sum is truncated if it is too large.
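A small Java sketch of folding on a decimal key follows. The key 123456789, the group size of 3 digits, and the table size of 1000 are assumed values chosen for illustration; truncation is done here by taking the sum modulo the table size.

```java
// Sketch of the folding method; all sample values are assumptions.
class FoldingSketch {
    // Splits the decimal digits of key into groups of groupSize (the last group
    // may be shorter), adds the groups, and truncates the sum to 0 .. tableSize-1.
    // Assumes key >= 0.
    static int hash(long key, int groupSize, int tableSize) {
        String digits = Long.toString(key);
        long sum = 0;
        for (int i = 0; i < digits.length(); i += groupSize) {
            int end = Math.min(i + groupSize, digits.length());
            sum += Long.parseLong(digits.substring(i, end));
        }
        return (int) (sum % tableSize);
    }

    public static void main(String[] args) {
        // 123 + 456 + 789 = 1368, truncated to 368
        System.out.println(hash(123456789L, 3, 1000));   // prints 368
    }
}
```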
Array of lists
This technique is called separate chaining collision resolution.
Implementing Dynamic Dictionaries
Want a data structure in which finds/searches are very fast
  As close to O(1) as possible
  Minimum number of executed instructions per method
Inserts and deletes should be fast too
Objects in the dictionary have unique keys
  A key may be a single property/attribute value
  Or may be created from multiple properties/values
Hash tables vs. Other Data Structures
We want to implement the dictionary operations Insert(), Delete() and Search()/Find() efficiently.
Arrays:
  can accomplish this in O(1) time
  but are not space efficient (assumes we leave empty space for keys not currently in the dictionary)
Binary search trees:
  can accomplish this in O(log n) time
  are space efficient
Hash tables:
  a generalization of an array that under some reasonable assumptions is O(1) for Insert/Delete/Search of a key
Array Approach example
A social security application keeping track of people, where the primary search key is a person's social security number (SSN)
You can use an array to hold references to all the person objects
  Use an array with range 0 - 999,999,999
  Using the SSN as a key, you have O(1) access to any person object
Unfortunately, the number of active keys (Social Security Numbers) is much less than the array size (1 billion entries)
  Est. US population, Oct. …: …
  Over …% of the array would be unused
Hash Tables
Very useful data structure
  Good for storing and retrieving key-value pairs
  Not good for iterating through a list of items
Example applications:
  Storing objects according to ID numbers
    When the ID numbers are widely spread out
    When you don't need to access items in ID order
Hash Tables
Hash tables solve these problems by using a much smaller array and mapping keys with a hash function.
Let U be the universe of keys and let the array have size m. A hash function h is a function from U to Z_m, that is:
h : U → {0, 1, …, m-1}
[Figure: keys from the universe of keys U mapped by h to entries of the hash table]
Hash index/value
A hash value or hash index is used to index the hash table (array)
A hash function takes a key and returns a hash value/index
  The hash index is an integer (to index an array)
The key is a specific value associated with a specific object being stored in the hash table
  It is important that the key remain constant for the lifetime of the object
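The reason the key must stay constant can be seen with java.util.HashMap. In the sketch below, the Key class and its mutable id field are hypothetical, introduced only to show what goes wrong when a key changes after insertion.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: a key that changes after insertion can no longer be found.
class MutableKeyDemo {
    // Hypothetical key type whose hash code depends on a mutable field.
    static final class Key {
        int id;
        Key(int id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).id == id;
        }
        @Override public int hashCode() {
            return Integer.hashCode(id);
        }
    }

    // Inserts an entry, mutates its key, then looks the key up again.
    static boolean foundAfterMutation() {
        Map<Key, String> map = new HashMap<>();
        Key k = new Key(1);
        map.put(k, "alice");
        k.id = 2;                      // the key changes while it is stored
        return map.containsKey(k);     // lookup now probes a different bucket
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation());   // prints false
    }
}
```

The entry is still in the map, but it sits in the bucket chosen by the old hash code, so a lookup with the changed key misses it.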
Dealing with Collisions
A problem arises when we have two keys that hash to the same array entry; this is called a collision.
There are two ways to resolve collisions:
Hashing with Chaining (a.k.a. "Separate Chaining"): every hash table entry contains a pointer to a linked list of keys that hash to the same entry.
Hashing with Open Addressing: every hash table entry contains only one key. If a new key hashes to a table entry which is filled, systematically examine other table entries until you find an empty entry in which to place the new key.
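Linear probing is one simple form of open addressing: when an entry is filled, try the next one, wrapping around at the end of the array. The sketch below assumes integer keys, the hash function key mod capacity, and no deletions; these choices and all names are illustrative assumptions.

```java
// Sketch of open addressing with linear probing (no deletion support).
class LinearProbingSketch {
    private final Integer[] slots;     // null marks an empty entry

    LinearProbingSketch(int capacity) {
        slots = new Integer[capacity];
    }

    // Assumed hash function: key mod capacity.
    int h(int key) {
        return Math.floorMod(key, slots.length);
    }

    // Returns false only when the table is full and the key is absent.
    boolean insert(int key) {
        int i = h(key);
        for (int probes = 0; probes < slots.length; probes++) {
            if (slots[i] == null) {
                slots[i] = key;        // first empty entry found
                return true;
            }
            if (slots[i] == key) {
                return true;           // already present
            }
            i = (i + 1) % slots.length;   // examine the next entry
        }
        return false;
    }

    boolean contains(int key) {
        int i = h(key);
        for (int probes = 0; probes < slots.length && slots[i] != null; probes++) {
            if (slots[i] == key) {
                return true;
            }
            i = (i + 1) % slots.length;
        }
        return false;
    }

    public static void main(String[] args) {
        LinearProbingSketch t = new LinearProbingSketch(10);
        t.insert(34);                  // goes to entry 4
        t.insert(54);                  // entry 4 is filled, so it goes to entry 5
        System.out.println(t.contains(54));   // prints true
        System.out.println(t.contains(64));   // prints false
    }
}
```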
Hashing with Chaining
The problem is that two keys hash to the same entry. We resolve this collision by placing all keys that hash to the same hash table entry in a chain (linked list) or bucket (array) pointed to by this entry.
[Figure: hash table entries pointing to chains of nodes (key, other key data); panels: Insert 54, Insert 101]
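A minimal Java sketch of hashing with chaining is given below. The table size of 10, the hash function key mod table size, and the key 34 are assumptions for illustration; 54 and 101 are the keys inserted in the figure.

```java
import java.util.LinkedList;

// Sketch of separate chaining; names, table size and hash function are assumptions.
class ChainingSketch {
    private final LinkedList<Integer>[] table;

    @SuppressWarnings("unchecked")
    ChainingSketch(int m) {
        table = new LinkedList[m];             // array of lists
        for (int i = 0; i < m; i++) {
            table[i] = new LinkedList<>();
        }
    }

    // Assumed hash function: key mod table size.
    int h(int key) {
        return Math.floorMod(key, table.length);
    }

    void insert(int key) {
        LinkedList<Integer> chain = table[h(key)];
        if (!chain.contains(key)) {
            chain.add(key);                    // colliding keys share one chain
        }
    }

    boolean contains(int key) {
        return table[h(key)].contains(key);
    }

    int chainLength(int index) {
        return table[index].size();
    }

    public static void main(String[] args) {
        ChainingSketch t = new ChainingSketch(10);
        t.insert(34);
        t.insert(54);     // 34 and 54 both hash to entry 4
        t.insert(101);    // hashes to entry 1
        System.out.println(t.chainLength(4));   // prints 2
        System.out.println(t.contains(101));    // prints true
    }
}
```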
Data manipulation in Hash Tables
Data is placed into a hashtable through the put method. Once we put a value in a Hashtable, we can take it back out with the remove() method.
We can test for the existence of a key with the containsKey() method.
A value can be accessed using the get method. The get method returns a null value if no element exists for the given key.
Note that hashtables don't store elements sequentially, so there is no ordering to the list.
The Hashtable class and methods supplied in the Java foundation classes are a powerful tool for data manipulation, …
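The four methods mentioned above can be tried with java.util.Hashtable as in the sketch below. The keys and values are made-up sample data.

```java
import java.util.Hashtable;

// Usage sketch for put, get, containsKey and remove; the data is made up.
class HashtableUsageDemo {
    static Hashtable<String, Integer> sample() {
        Hashtable<String, Integer> ages = new Hashtable<>();
        ages.put("alice", 30);
        ages.put("bob", 25);
        return ages;
    }

    public static void main(String[] args) {
        Hashtable<String, Integer> ages = sample();
        System.out.println(ages.containsKey("alice"));  // prints true
        System.out.println(ages.get("alice"));          // prints 30
        System.out.println(ages.get("carol"));          // prints null (no such key)
        System.out.println(ages.remove("bob"));         // prints 25, the removed value
        System.out.println(ages.containsKey("bob"));    // prints false
    }
}
```

Note that Hashtable rejects null keys and null values, so a null returned by get always means the key is absent.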