Hashing (Chapter 8)


Symbol Table
- Symbol tables are used widely in many applications.
  - A dictionary is a kind of symbol table.
  - A data dictionary, as used in database management, is another example.
- In general, the following operations are performed on a symbol table:
  - Determine if a particular name is in the table
  - Retrieve the attributes of that name
  - Modify the attributes of that name
  - Insert a new name and its attributes
  - Delete a name and its attributes

Symbol Table (continued)
- Popular operations on a symbol table include search, insertion, and deletion.
- A binary search tree could be used to represent a symbol table.
  - The worst-case complexity of each operation is O(n).
- Hashing: insertions, deletions, and finds in O(1) expected time.
  - Static hashing
  - Dynamic hashing

Static Hashing
- Dictionary pairs are stored in a fixed-size table called a hash table.
- The address or location of a dictionary pair with key x is obtained by computing some arithmetic function h(x).
- The memory available to maintain the symbol table (hash table) is assumed to be sequential.
- The hash table consists of b buckets, and each bucket can hold s records.
- h(x) maps the set of possible keys onto the integers 0 through b-1.

Hash Tables
- The key density of a hash table is the ratio n/T, where n is the number of dictionary pairs in the table and T is the total number of possible keys.
- The loading density or loading factor of a hash table is α = n/(sb).

Hash Tables (continued)
- Two keys, I1 and I2, are said to be synonyms with respect to h if h(I1) = h(I2).
- An overflow occurs when a new key i is mapped or hashed by h into a full bucket.
- A collision occurs when two non-identical keys are hashed into the same bucket.
- If the bucket size is 1, collisions and overflows occur at the same time.

Example 8.1
- b = 26, s = 2
- Assume there are 10 distinct keys: GA, A, G, L, A2, A1, A3, A4, and E.
- If no overflow occurs, the time required for hashing depends only on the time required to compute the hash function h.
- What happens to A1?
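The definitions above (buckets, slots, collision, overflow, loading density) can be tried out on the keys listed in Example 8.1. The following is a minimal Python sketch, not the slides' own code. It assumes a hash function that maps a key to the alphabet position of its first letter, which the slides do not state. The names `StaticHashTable` and `first_letter` are hypothetical, and overflow is only reported, not handled.

```python
class StaticHashTable:
    """Fixed-size table of b buckets with s slots each (no overflow handling)."""

    def __init__(self, b, s, h):
        self.b, self.s, self.h = b, s, h
        self.buckets = [[] for _ in range(b)]

    def insert(self, key):
        """Try to store key; return 'ok', 'collision', 'overflow', or 'duplicate'."""
        bucket = self.buckets[self.h(key)]
        if key in bucket:
            return "duplicate"
        if len(bucket) == self.s:
            # Full bucket: by the definitions above this is an overflow
            # (and also a collision); the key is NOT stored.
            return "overflow"
        status = "collision" if bucket else "ok"
        bucket.append(key)
        return status

    def contains(self, key):
        return key in self.buckets[self.h(key)]

    def loading_density(self):
        """alpha = n / (s * b), where n counts the keys actually stored."""
        n = sum(len(bucket) for bucket in self.buckets)
        return n / (self.s * self.b)


def first_letter(key):
    # Assumed hash function (not given in the slides): 'A' -> 0, ..., 'Z' -> 25.
    return ord(key[0].upper()) - ord("A")


table = StaticHashTable(b=26, s=2, h=first_letter)
keys = ["GA", "A", "G", "L", "A2", "A1", "A3", "A4", "E"]  # keys as listed above
results = {k: table.insert(k) for k in keys}
for k in keys:
    print(k, results[k])
print("loading density:", table.loading_density())
```

Under this assumed h, bucket 0 fills up with A and A2, so A1, A3, and A4 all overflow, while G merely collides with GA in a bucket that still has a free slot.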

Hash Functions
- Requirements
  - Simple to compute
  - Minimize the number of collisions
  - Dependent upon all the characters in the keys
- Uniform hash function
  - If there are b buckets, we hope to have h(x) = i with probability 1/b for every bucket i.
- Common choices: mid-square, division, folding, digit analysis

Division
- Uses the modulo (%) operator.
- A key k is divided by some number D and the remainder is used as the hash address for k: h(k) = k % D.
- The bucket addresses are in the range 0 through D-1.
- If D is a power of 2, then h(k) depends only on the least significant bits of k.

Division (continued)
- If a division function h is used as the hash function, the table size should not be a power of two.
  - Programmers tend to use many variable names with the same suffix, so a hash address that depends only on the low-order bits produces too many collisions.
- If D is divisible by two, the odd keys are mapped to odd buckets and even keys are mapped to even buckets. Thus, the hash table is biased.

Mid-Square
- The mid-square function is computed by squaring the key and then using an appropriate number of bits from the middle of the square to obtain the bucket address.
- The table size is a power of two.

Folding
- The key k is partitioned into several parts, all but the last being of the same length.
- All partitions are added together to obtain the hash address for k.
- Shift folding: the partitions are added together as they are to get h(k).
- Folding at the boundaries: the key is folded at the partition boundaries, and digits falling into the same position are added together to obtain h(k).
  - This is similar to reversing every other partition and then adding.

Example 8.2

Secure Hash Functions
- Can be applied to a message M of any size
- Produces a fixed-length output h
- It is easy to compute h = H(M) for any message M
- Given h, it is infeasible to find x such that H(x) = h (one-way property)
- Given x, it is infeasible to find y ≠ x such that H(y) = H(x) (weak collision resistance)
- It is infeasible to find any pair x, y with x ≠ y such that H(y) = H(x) (strong collision resistance)

Secure Hash Algorithm (SHA)
1: Preprocess the message so that its length is q*512 bits for some positive integer q.
2: Initialize the 160-bit output buffer OB, made up of five 32-bit registers A, B, C, D, E (values in hexadecimal):
   A = 67452301
   B = efcdab89
   C = 98badcfe
   D = 10325476
   E = c3d2e1f0
3: for (int i = 1; i
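The division, mid-square, and folding functions described under Hash Functions can be sketched in Python as follows. This is an illustration under stated assumptions, not code from the slides. All function names are hypothetical, keys are taken to be non-negative integers, folding works on decimal digits, and `mid_square_hash` assumes keys that fit in `key_bits` bits. The sample key 12320324111220 is chosen for illustration only.

```python
def division_hash(k, D):
    """h(k) = k % D; bucket addresses fall in 0 .. D-1."""
    return k % D


def mid_square_hash(k, r, key_bits=16):
    """Square k and return r bits from the middle of the square.

    Assumes k fits in key_bits bits, so the square fits in 2*key_bits bits.
    The result lies in 0 .. 2**r - 1, i.e. the table size is a power of two.
    """
    square = k * k
    shift = (2 * key_bits - r) // 2      # low-order bits to discard
    return (square >> shift) & ((1 << r) - 1)


def partitions(k, width):
    """Split the decimal digits of k into parts of `width` digits;
    only the last part may be shorter."""
    digits = str(k)
    return [digits[i:i + width] for i in range(0, len(digits), width)]


def shift_fold(k, width):
    """Shift folding: add the partitions as they are."""
    return sum(int(p) for p in partitions(k, width))


def boundary_fold(k, width):
    """Folding at the boundaries: reverse every other partition
    (the 2nd, 4th, ...) and then add."""
    parts = partitions(k, width)
    return sum(int(p[::-1]) if i % 2 == 1 else int(p)
               for i, p in enumerate(parts))


key = 12320324111220                      # illustrative key (assumption)
print(partitions(key, 3))                 # ['123', '203', '241', '112', '20']
print(shift_fold(key, 3))                 # 123 + 203 + 241 + 112 + 20 = 699
print(boundary_fold(key, 3))              # 123 + 302 + 241 + 211 + 20 = 897
print(division_hash(0b10110101, 8))       # D = 2**3 keeps only the low 3 bits: 5
print(mid_square_hash(0b1010, r=4, key_bits=4))  # middle 4 bits of 100 = 0b01100100: 9
```

In practice the folded sum may exceed the number of buckets, so it would typically be reduced further, for example with `division_hash`, before being used as a bucket address.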