Final Hashing in Java Aima311 30apr16

  • Upload
    dinesh


  • 8/16/2019 Final Hashing in Java Aima311 30apr16

    Concept of Hashing

    Hash tables

    The problem at hand is to speed up searching.

    Consider the problem of searching an array for a given value. If the array is not sorted, the search might require examining each and every element of the array. If the array is sorted, we can use binary search, and therefore reduce the worst-case runtime complexity to O(log n).
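As a rough illustration of the two cases, the sketch below compares a hand-written linear scan with the standard `java.util.Arrays.binarySearch` on a sorted array. The class name `SearchDemo` and the sample data are assumptions made for this example.

```java
import java.util.Arrays;

class SearchDemo {
    // Linear search: may examine every element, O(n) in the worst case.
    static int linearSearch(int[] a, int target) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] == target) {
                return i;
            }
        }
        return -1; // not found
    }

    public static void main(String[] args) {
        int[] sorted = {3, 8, 15, 23, 42, 57};
        System.out.println(linearSearch(sorted, 42));
        // Binary search requires a sorted array and takes O(log n) steps.
        System.out.println(Arrays.binarySearch(sorted, 42));
    }
}
```

Both calls find the value at index 4; the difference is only in how many elements are examined on the way.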

    Hash Function

    We could search even faster if we knew in advance the index at which that value is located in the array.

    Suppose we do have a magic function that would tell us the index for a given value.

    With this magic function our search is reduced to just one probe, giving us a constant runtime, O(1).

    Such a function is called a hash function. A hash function is a function which, when given a key, generates an address in the table.

    Thus, we say that our hash function has the following properties:

    - it always returns a number for an object
    - two equal objects will always have the same number
    - two unequal objects do not always have different numbers
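In Java these properties correspond to the contract between `equals()` and `hashCode()`. A minimal sketch, assuming a hypothetical `Point` class that is not part of the slides:

```java
import java.util.Objects;

class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Point)) return false;
        Point p = (Point) other;
        return x == p.x && y == p.y;
    }

    // Equal points must return the same number;
    // unequal points are allowed to return the same number too.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }

    public static void main(String[] args) {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
    }
}
```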

    The procedure for storing objects using a hash function is the following:

    1. Create an array of size M.
    2. Choose a hash function h, that is, a mapping from objects into integers 0, 1, ..., M-1.
    3. Put these objects into the array at indexes computed via the hash function: index = h(object).

    Such an array is called a hash table.
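The three steps can be sketched as follows. This is only an illustration under stated assumptions: the class `SimpleHashTable` is hypothetical, h is taken to be `hashCode()` reduced modulo M, and collisions are deliberately ignored.

```java
class SimpleHashTable {
    private final Object[] slots;

    SimpleHashTable(int m) {
        slots = new Object[m]; // step 1: create an array of size M
    }

    // Step 2: map any object to an index in 0, 1, ..., M-1.
    // floorMod keeps the result non-negative even for negative hash codes.
    int h(Object o) {
        return Math.floorMod(o.hashCode(), slots.length);
    }

    // Step 3: store the object at index h(object).
    // Collisions are ignored here: a later object overwrites an earlier one.
    void put(Object o) {
        slots[h(o)] = o;
    }

    boolean contains(Object o) {
        return o.equals(slots[h(o)]);
    }

    public static void main(String[] args) {
        SimpleHashTable table = new SimpleHashTable(10);
        table.put(42);
        System.out.println(table.h(42));        // 2
        System.out.println(table.contains(42)); // true
    }
}
```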

    How to choose a hash function?

    One approach to creating a hash function is to use Java's hashCode() method. The hashCode() method is implemented in the Object class, and therefore each class in Java inherits it. The hash code provides a numeric representation of an object (this is somewhat similar to the toString method that gives a text representation of an object).

    Consider the following code example:

        Integer obj1 = Integer.valueOf(2009);   // example value; any int works
        String obj2 = new String("2009");       // example value; any string works
        System.out.println("hashCode for an integer is " + obj1.hashCode());
        System.out.println("hashCode for a string is " + obj2.hashCode());

    Collisions

    When we put objects into a hash table, it is possible that different objects (by the equals() method) have the same hash code. This is called a collision.

    Here is an example of a collision. Two different strings "Aa" and "BB" have the same key:

        "Aa" = 'A' * 31 + 'a' = 2112
        "BB" = 'B' * 31 + 'B' = 2112
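The collision can be checked directly. The sketch below relies on the documented definition of `String.hashCode()`, which for a two-character string reduces to `s[0] * 31 + s[1]`; the class name `CollisionDemo` and the helper `twoCharHash` are assumptions made for this example.

```java
class CollisionDemo {
    // For a two-character string, String.hashCode() equals s[0] * 31 + s[1].
    static int twoCharHash(String s) {
        return s.charAt(0) * 31 + s.charAt(1);
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode());   // 2112
        System.out.println("BB".hashCode());   // 2112
        System.out.println("Aa".equals("BB")); // false
    }
}
```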

    How to resolve collisions?

    Collision handling techniques:

      1: Linear Probing
      2:

    One of them is based on the idea of putting the keys that collide in a linked list. A hash table then is an array of lists. This technique is called separate chaining collision resolution.
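A minimal sketch of an array of lists, assuming string keys and a hypothetical class `ChainedHashSet`; it is meant to show the idea, not a production-quality table (there is no resizing, for instance).

```java
import java.util.LinkedList;

class ChainedHashSet {
    private final LinkedList<String>[] buckets;

    @SuppressWarnings("unchecked")
    ChainedHashSet(int m) {
        buckets = new LinkedList[m];
        for (int i = 0; i < m; i++) {
            buckets[i] = new LinkedList<>();
        }
    }

    private int index(String key) {
        return Math.floorMod(key.hashCode(), buckets.length);
    }

    // Colliding keys simply end up in the same list.
    void add(String key) {
        LinkedList<String> chain = buckets[index(key)];
        if (!chain.contains(key)) {
            chain.add(key);
        }
    }

    boolean contains(String key) {
        return buckets[index(key)].contains(key);
    }

    boolean remove(String key) {
        return buckets[index(key)].remove(key);
    }

    // Number of keys stored in the chain that key hashes to.
    int chainLength(String key) {
        return buckets[index(key)].size();
    }

    public static void main(String[] args) {
        ChainedHashSet set = new ChainedHashSet(8);
        set.add("Aa");
        set.add("BB"); // same hash code as "Aa", so same chain
        System.out.println(set.chainLength("Aa")); // 2
    }
}
```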

    Methods of defining a hash function

      1: Mid Square Method
      2: Folding Method
      3:

    Mid-square

        f_m(x) = middle(x^2)

    Frequently used in symbol table applications.

    We compute f_m by squaring the identifier and then using an appropriate number of bits from the middle of the square to obtain the bucket address.

    The number of bits used to obtain the bucket address depends on the table size. If we use r bits, the range of the value is 2^r.

    Since the middle bits of the square usually depend upon all the characters in an identifier, there is a high probability that different identifiers will produce different hash addresses.
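One possible reading of "the middle r bits" is sketched below for integer keys. The class `MidSquareHash`, the choice to centre the window on the significant bits of the square, and the assumption that r is between 1 and 31 are all illustrative decisions, not something the slides specify.

```java
class MidSquareHash {
    // Returns r bits taken from the middle of key * key,
    // so the result always lies in the range [0, 2^r).
    // Assumes 1 <= r <= 31.
    static int hash(int key, int r) {
        long square = (long) key * key;
        // Number of significant bits in the square.
        int bits = 64 - Long.numberOfLeadingZeros(square);
        if (bits <= r) {
            return (int) square; // the square already fits in r bits
        }
        int shift = (bits - r) / 2; // drop this many low-order bits
        return (int) ((square >>> shift) & ((1L << r) - 1));
    }

    public static void main(String[] args) {
        // 12 * 12 = 144 = 10010000 in binary; the middle 4 bits are 0100.
        System.out.println(hash(12, 4)); // 4
    }
}
```

With r bits the table would have 2^r buckets, which is why mid-square tables are usually sized as a power of two.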

    Folding Method

    - Partition the identifier x into several parts.
    - All parts except for the last one have the same length.
    - Add the parts together to obtain the hash address; the sum is truncated if necessary.

    For example, a key is divided into three parts, the parts are added, and the sum is truncated to give the address.
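The folding steps can be sketched for decimal keys as follows. The class `FoldingHash`, its parameters, and the sample key 123456789 are assumptions chosen for illustration; they are not the numbers used on the slide.

```java
class FoldingHash {
    // Splits the decimal digits into parts of partLength digits
    // (the last part may be shorter), adds the parts, and keeps
    // only the lowest addressDigits digits of the sum.
    static int hash(String digits, int partLength, int addressDigits) {
        long sum = 0;
        for (int i = 0; i < digits.length(); i += partLength) {
            int end = Math.min(i + partLength, digits.length());
            sum += Long.parseLong(digits.substring(i, end));
        }
        long modulus = 1;
        for (int i = 0; i < addressDigits; i++) {
            modulus *= 10;
        }
        return (int) (sum % modulus); // truncate the sum
    }

    public static void main(String[] args) {
        // 123 + 456 + 789 = 1368, truncated to 368.
        System.out.println(hash("123456789", 3, 3)); // 368
    }
}
```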

    Array of lists

    This technique is called separate chaining collision resolution.

    Implementing Dynamic Dictionaries

    - Want a data structure in which finds/searches are very fast:
        - as close to O(1) as possible
        - minimum number of executed instructions per method
    - Inserts and deletes should be fast too.
    - Objects in the dictionary have unique keys:
        - a key may be a single property/attribute value
        - or it may be created from multiple properties/values

    Hash tables vs. Other Data Structures

    We want to implement the dictionary operations Insert(), Delete() and Search()/Find() efficiently.

    - Arrays: can accomplish this in O(1) time, but are not space efficient (assumes we leave empty space for keys not currently in the dictionary).
    - Binary search trees: can accomplish this in O(log n) time (when the tree is balanced) and are space efficient.
    - Hash tables: a generalization of an array that, under some reasonable assumptions, is O(1) for Insert/Delete/Search of a key.

    Array Approach example

    A social security application keeps track of people, where the primary search key is a person's social security number (SSN).

    - You can use an array to hold references to all the person objects.
    - Use an array with range 0 - 999,999,999.
    - Using the SSN as a key, you have O(1) access to any person object.

    Unfortunately, the number of active keys (Social Security Numbers) is much less than the array size (1 billion entries). Compared with the estimated US population, most of the array would be unused.

    Hash Tables

    - Very useful data structure
        - Good for storing and retrieving key-value pairs
        - Not good for iterating through a list of items
    - Example applications: storing objects according to ID numbers
        - When the ID numbers are widely spread out
        - When you don't need to access items in ID order

    Hash Tables

    Hash tables solve these problems by using a much smaller array and mapping keys with a hash function. Let U be the universe of keys and let the array have size m. A hash function h is a function from U to Z_m, that is:

        h : U → {0, 1, …, m-1}

    [Figure: keys from the universe U mapped by h to entries of the hash table]

    Hash index/value

    - A hash value or hash index is used to index the hash table (array).
    - A hash function takes a key and returns a hash value/index.
        - The hash index is an integer (to index an array).
    - The key is a specific value associated with a specific object being stored in the hash table.
        - It is important that the key remain constant for the lifetime of the object.
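To see why the key should remain constant, the sketch below uses a mutable `ArrayList` as a key in a `java.util.HashMap` and changes it after insertion. The class `MutableKeyDemo` and its data are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MutableKeyDemo {
    // Returns what the map finds for the key after the key has been modified.
    static String lookupAfterMutation() {
        Map<List<String>, String> table = new HashMap<>();
        List<String> key = new ArrayList<>(Arrays.asList("a"));
        table.put(key, "value");
        // Adding an element changes key.hashCode(), so the entry
        // is still filed under the old, now stale, hash value.
        key.add("b");
        return table.get(key);
    }

    public static void main(String[] args) {
        System.out.println(lookupAfterMutation()); // null
    }
}
```

The entry is still inside the map, but it can no longer be found through its own key.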

    Dealing with Collisions

    A problem arises when we have two keys that hash to the same array entry; this is called a collision.

    There are two ways to resolve collisions:

    - Hashing with Chaining (a.k.a. "Separate Chaining"): every hash table entry contains a pointer to a linked list of keys that hash to the same entry.
    - Hashing with Open Addressing: every hash table entry contains only one key. If a new key hashes to a table entry which is filled, systematically examine other table entries until you find one empty entry to place the new key.
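Linear probing is the simplest form of open addressing: when an entry is filled, the next entries are examined one by one, wrapping around at the end of the array. A minimal sketch, assuming integer keys, h(key) = key mod m, and a hypothetical class `LinearProbingTable`; deletion is left out because it needs extra bookkeeping.

```java
class LinearProbingTable {
    private final Integer[] slots; // null marks an empty entry

    LinearProbingTable(int m) {
        slots = new Integer[m];
    }

    private int h(int key) {
        return Math.floorMod(key, slots.length);
    }

    // Stores key and returns the index where it ended up
    // (or where it was already stored).
    int insert(int key) {
        int i = h(key);
        for (int probes = 0; probes < slots.length; probes++) {
            if (slots[i] == null) {
                slots[i] = key;
                return i;
            }
            if (slots[i] == key) {
                return i;
            }
            i = (i + 1) % slots.length; // try the next entry
        }
        throw new IllegalStateException("table is full");
    }

    boolean contains(int key) {
        int i = h(key);
        for (int probes = 0; probes < slots.length; probes++) {
            if (slots[i] == null) {
                return false; // an empty entry ends the probe sequence
            }
            if (slots[i] == key) {
                return true;
            }
            i = (i + 1) % slots.length;
        }
        return false;
    }

    public static void main(String[] args) {
        LinearProbingTable table = new LinearProbingTable(10);
        System.out.println(table.insert(36)); // 6
        System.out.println(table.insert(16)); // 7, because entry 6 is taken
    }
}
```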

    Hashing with Chaining

    The problem is that two keys hash to the same entry. We resolve this collision by placing all keys that hash to the same hash table entry in a chain (linked list) or bucket (array) pointed to by this entry.

    [Figure: a hash table whose entries point to chains of nodes holding a key and its other key data, shown after successive insertions]

    Data manipulation in Hash Tables

    Data is placed into a hashtable through the put method. Once we put a value in a Hashtable, we can take it back out with the remove() method.

    We can test for the existence of a key with the containsKey() method.

    A value can be accessed using the get method. The get method returns null if no element exists for the given key.

    Note that hashtables don't store elements sequentially, so there is no ordering to the list.

    The Hashtable class and methods supplied in the Java foundation classes are a powerful tool for data manipulation.
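The methods mentioned above can be tried out with `java.util.Hashtable`. The class `HashtableDemo`, the helper `sampleTable`, and the names and ages stored are assumptions made for this example.

```java
import java.util.Hashtable;

class HashtableDemo {
    // Builds a small table mapping names to ages.
    static Hashtable<String, Integer> sampleTable() {
        Hashtable<String, Integer> ages = new Hashtable<>();
        ages.put("alice", 30);
        ages.put("bob", 25);
        return ages;
    }

    public static void main(String[] args) {
        Hashtable<String, Integer> ages = sampleTable();
        System.out.println(ages.containsKey("alice")); // true
        System.out.println(ages.get("alice"));         // 30
        System.out.println(ages.get("carol"));         // null: no such key
        System.out.println(ages.remove("bob"));        // 25: the removed value
        System.out.println(ages.containsKey("bob"));   // false
    }
}
```

For new code that does not need synchronized access, `java.util.HashMap` offers the same put, get, remove and containsKey methods.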