78
Priority Queue Sorting We can use a priority queue to sort a list of comparable elements 1. Insert the elements one by one with a series of insert operations 2. Remove the elements in sorted order with a series of removeMin operations The running time of this sorting method depends on the priority queue implementation Algorithm PQ-Sort(S, C) Input list S, comparator C for the elements of S Output list S sorted in increasing order according to C P priority queue with comparator C while ¬S.isEmpty () e S.remove(S.first ()) P.insert (e, ) while ¬P.isEmpty() e P.removeMin().getKey() S.addLast(e) © 2014 Goodrich, Tamassia, Goldwasser 1 Heaps

Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Priority Queue Sorting q  We can use a priority queue

to sort a list of comparable elements 1.  Insert the elements one by

one with a series of insert operations

2.  Remove the elements in sorted order with a series of removeMin operations

q  The running time of this sorting method depends on the priority queue implementation

Algorithm PQ-Sort(S, C) n  Input list S, comparator C

for the elements of S n  Output list S sorted in

increasing order according to C

P ← priority queue with comparator C

while ¬S.isEmpty () e ← S.remove(S.first ())

P.insert (e, ∅) while ¬P.isEmpty()

e ← P.removeMin().getKey() S.addLast(e)

© 2014 Goodrich, Tamassia, Goldwasser 1 Heaps

Page 2: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Selection-Sort q  Selection-sort is the variation of PQ-sort where the

priority queue is implemented with an unsorted sequence

q  Running time of Selection-sort: 1.  Inserting the elements into the priority queue with n insert

operations takes O(n) time 2.  Removing the elements in sorted order from the priority

queue with n removeMin operations takes time proportional to n + n-1 + … +2 +1

q  Selection-sort runs in O(n2) time

© 2014 Goodrich, Tamassia, Goldwasser 2 Heaps

Page 3: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Selection-Sort Example

© 2014 Goodrich, Tamassia, Goldwasser Heaps 3

Sequence S Priority Queue P Input: (7,4,8,2,5,3,9) () Phase 1

(a) (4,8,2,5,3,9) (7) (b) (8,2,5,3,9) (7,4) .. .. .. (g) () (7,4,8,2,5,3,9)

Phase 2

(a) (2) (7,4,8,5,3,9) (b) (2,3) (7,4,8,5,9) (c) (2,3,4) (7,8,5,9) (d) (2,3,4,5) (7,8,9) (e) (2,3,4,5,7) (8,9) (f) (2,3,4,5,7,8) (9) (g) (2,3,4,5,7,8,9) ()

Page 4: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Insertion-Sort q  Insertion-sort is the variation of PQ-sort

where the priority queue is implemented with a sorted sequence

q  Running time of Insertion-sort: n  Inserting the elements into the priority queue

with n insert operations takes time proportional to 1 + 2 + …+ n

n  Removing the elements in sorted order from the priority queue with a series of n removeMin operations takes O(n) time

q  Insertion-sort runs in O(n2) time

© 2014 Goodrich, Tamassia, Goldwasser 4 Heaps

Page 5: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Insertion-Sort Example Sequence S Priority queue P

Input: (7,4,8,2,5,3,9) () Phase 1 (a) (4,8,2,5,3,9) (7)

(b) (8,2,5,3,9) (4,7) (c) (2,5,3,9) (4,7,8) (d) (5,3,9) (2,4,7,8) (e) (3,9) (2,4,5,7,8) (f) (9) (2,3,4,5,7,8) (g) () (2,3,4,5,7,8,9)

Phase 2

(a) (2) (3,4,5,7,8,9) (b) (2,3) (4,5,7,8,9) .. .. .. (g) (2,3,4,5,7,8,9) ()

© 2014 Goodrich, Tamassia, Goldwasser 5 Heaps

Page 6: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

In-place Insertion-Sort q  Instead of using an

external data structure, we can implement selection-sort and insertion-sort in-place

q  A portion of the input sequence itself serves as the priority queue

q  For in-place insertion-sort n  We keep sorted the initial

portion of the sequence n  We can use swaps instead

of modifying the sequence

© 2014 Goodrich, Tamassia, Goldwasser 6 Heaps

5 4 2 3 1

5 4 2 3 1

4 5 2 3 1

2 4 5 3 1

2 3 4 5 1

1 2 3 4 5

1 2 3 4 5

Page 7: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Heap Sort q  Consider a priority queue

with n items implemented by means of a heap n  the space used is O(n) n  methods insert and

removeMin take O(log n) time

n  methods size, isEmpty, and min take time O(1) time

q  Using a heap-based priority queue, we can sort a sequence of n elements in O(n log n) time

q  The resulting algorithm is called heap-sort

q  Heap-sort is much faster than quadratic sorting algorithms, such as insertion-sort and selection-sort

© 2014 Goodrich, Tamassia, Goldwasser 7 Heaps

Page 8: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Heap Sort q  Create a heap q  Do removeMin repeatedly till head

becomes empty q  To do an in place sort, we move deleted

element to end of heap

8 Heaps

Page 9: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Heap Sort

Heaps 9

12

20 23

25

31 26

11

17

33

29

13

8

8 11 13 12 25 17 29 20 23 31 26 33

Page 10: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Heap Sort

Heaps 10

12

20 23

25

31 26

11

17

8

29

13

33

33 11 13 12 25 17 29 20 23 31 26 8

Page 11: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Heap Sort

Heaps 11

12

20 23

25

31 26

33

17

8

29

13

11

11 33 13 12 25 17 29 20 23 31 26 8

Page 12: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Heap Sort

Heaps 12

33

20 23

25

31 26

12

17

8

29

13

11

11 12 13 33 25 17 29 20 23 31 26 8

Page 13: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Heap Sort

Heaps 13

20

33 23

25

31 26

12

17

8

29

13

11

11 12 13 20 25 17 29 33 23 31 26 8

Page 14: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Heap Sort

Heaps 14

20

33 23

25

31 11

12

17

8

29

13

26

26 12 13 20 25 17 29 33 23 31 11 8

Page 15: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Heap Sort

Heaps 15

20

33 23

25

31 11

26

17

8

29

13

12

12 26 13 20 25 17 29 33 23 31 11 8

Page 16: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Heap Sort

Heaps 16

26

33 23

25

31 11

20

17

8

29

13

12

12 20 13 26 25 17 29 33 23 31 11 8

Page 17: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Heap Sort

Heaps 17

23

33 26

25

31 11

20

17

8

29

13

12

12 20 13 23 25 17 29 33 26 31 11 8

Page 18: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Heap Sort

Heaps 18

23

33 26

25

12 11

20

17

8

29

13

31

31 20 13 23 25 17 29 33 26 12 11 8

Page 19: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Heap Sort

Heaps 19

26

17 13

25

12 11

31

23

8

20

29

33

33 31 29 26 25 23 20 17 13 12 11 8

Page 20: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Heapsort - Analysis q  Create a heap - O(n) q  Do removeMin repeatedly till head

becomes empty n  O(log n) + O(log n-1) + O(log n-2) + …

O(1) = O(n log n)

q  To do an in place sort, we move deleted element to end of heap

20 Heaps

Page 21: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Maps

Page 22: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Maps q  A map models a searchable collection of

key-value entries q  The main operations of a map are for

searching, inserting, and deleting items q  Multiple entries with the same key are

not allowed q  Applications:

n  address book n  student-record database

© 2014 Goodrich, Tamassia, Goldwasser 22 Maps

Page 23: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

The Map ADT q  get(k): if the map M has an entry with key k, return its

associated value; else, return null q  put(k, v): insert entry (k, v) into the map M; if key k is not

already in M, then return null; else, return old value associated with k

q  remove(k): if the map M has an entry with key k, remove it from M and return its associated value; else, return null

q  size(), isEmpty() q  entrySet(): return an iterable collection of the entries in M q  keySet(): return an iterable collection of the keys in M q  values(): return an iterator of the values in M

© 2014 Goodrich, Tamassia, Goldwasser 23 Maps

Page 24: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Example Operation Output Map isEmpty() true Ø put(5,A) null (5,A) put(7,B) null (5,A),(7,B) put(2,C) null (5,A),(7,B),(2,C) put(8,D) null (5,A),(7,B),(2,C),(8,D) put(2,E) C (5,A),(7,B),(2,E),(8,D) get(7) B (5,A),(7,B),(2,E),(8,D) get(4) null (5,A),(7,B),(2,E),(8,D) get(2) E (5,A),(7,B),(2,E),(8,D) size() 4 (5,A),(7,B),(2,E),(8,D) remove(5) A (7,B),(2,E),(8,D) remove(2) E (7,B),(8,D) get(2) null (7,B),(8,D) isEmpty() false (7,B),(8,D)

© 2014 Goodrich, Tamassia, Goldwasser 24 Maps

Page 25: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

A Simple List-Based Map q  We can implement a map using an

unsorted list n  We store the items of the map in a list S

(based on a doubly linked list), in arbitrary order

© 2014 Goodrich, Tamassia, Goldwasser 25 Maps

trailer header nodes/positions

entries

9 c 6 c 5 c 8 c

Page 26: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

The get(k) Algorithm Algorithm get(k):

B = S.positions() {B is an iterator of the positions in S} while B.hasNext() do p = B.next() { the next position in B } if p.element().getKey() = k then return p.element().getValue() return null {there is no entry with key equal to k}

© 2014 Goodrich, Tamassia, Goldwasser 26 Maps

Page 27: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

The put(k,v) Algorithm Algorithm put(k,v): B = S.positions() while B.hasNext() do

p = B.next() if p.element().getKey() = k then t = p.element().getValue() S.set(p,(k,v)) return t {return the old value}

S.addLast((k,v)) n = n + 1 {increment variable storing number of

entries} return null { there was no entry with key equal to k }

© 2014 Goodrich, Tamassia, Goldwasser 27 Maps

Page 28: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

The remove(k) Algorithm Algorithm remove(k): B =S.positions() while B.hasNext() do

p = B.next() if p.element().getKey() = k then t = p.element().getValue() S.remove(p) n = n – 1 {decrement number of entries} return t {return the removed value}

return null {there is no entry with key equal to k}

© 2014 Goodrich, Tamassia, Goldwasser 28 Maps

Page 29: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Performance of a List-Based Map q  Performance:

n  put takes O(1) time since we can insert the new item at the beginning or at the end of the sequence

n  get and remove take O(n) time since in the worst case (the item is not found) we traverse the entire sequence to look for an item with the given key

n  Average case time?

q  The unsorted list implementation is effective only for maps of small size or for maps in which puts are the most common operations, while searches and removals are rarely performed (e.g., historical record of logins to a workstation)

29 Maps

Page 30: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Different Data Structures to Implement Map ADT q  arrays, linked lists (inefficient) q  Binary Trees q  Hash Tables q  Red/Black Trees q  AVL Trees q  B-Trees q  Java

n  java.util.Dictionary - abstract class n  java.util.Map - interface

30 Maps

Page 31: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Binary Search

© 2014 Goodrich, Tamassia, Goldwasser 31 Maps

Page 32: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Direct Addressing q  an array that is indexed by the key

n  O(1) - time n  O(r) space - where r is the range of

numbers n  wastage of space

32 Maps

A B C D

XXXX-XXXX XXXX-XXXX 1234-5678 XXXX-XXXX

Page 33: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Hash Tables

0 1 2 3 4 451-229-0004

981-101-0002 025-612-0001

Page 34: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Intuitive Notion of a Map q  Intuitively, a map M supports the abstraction

of using keys as indices with a syntax such as M[k].

q  As a mental warm-up, consider a restricted setting in which a map with n items uses keys that are known to be integers in a range from 0 to N − 1, for some N ≥ n.

© 2014 Goodrich, Tamassia, Goldwasser 34 Hash Tables

Page 35: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Hash Table Solution q  O(1) - expected time q  O(n+m) - space, where m is size of the table q  instead of a one-to-one map between the key values

and array locations, find a function to map the large range into one which we can manage n  e.g., key value modulo size of array, and use that as an index n  Insert (12345678, C) into a hashed array of size 5, -

12345678 mod 5 = 3

35 Hash Tables

C

XXXX-XXXX XXXX-XXXX XXXX-XXXX 1234-5678 XXXX-XXXX

Page 36: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

More General Kinds of Keys q  But what should we do if our keys are not

integers in the range from 0 to N – 1? n  Use a hash function to map general keys to

corresponding indices in a table. n  For instance, the last four digits of a Identification

number.

© 2014 Goodrich, Tamassia, Goldwasser 36 Hash Tables

0 1 2 3 4 451-229-0004

981-101-0002 025-612-0001

Page 37: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Hash Functions and Hash Tables q  A hash function h maps keys of a given type to

integers in a fixed interval [0, N - 1] q  Example:

h(x) = x mod N is a hash function for integer keys

q  The integer h(x) is called the hash value of key x

© 2014 Goodrich, Tamassia, Goldwasser 37 Hash Tables

q  A hash table for a given key type consists of n  Hash function h n  Array (called table) of size N

q  When implementing a map with a hash table, the goal is to store item (k, o) at index i = h(k)

Page 38: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Example

0 1 2 3 4 5 6 7 8 9 10

(1,D) (25,C)

(3,F)

(14,Z)

(39,C)

(6,A) (7,Q)

38 Hash Tables

Collision – Two different keys resulting in the same hash value

Page 39: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Hash Functions q  Need to choose a good hash function

n  quick to compute n  uniform distribution of keys throughout the

table n  good hash functions are very rare.

q  How to deal with hashing non-integer keys n  find some way to turn keys into integers n  use standard hash functions on these

integers

39 Hash Tables

Page 40: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Hash Functions (2) q  A hash function is

usually specified as the composition of two functions: Hash code: h1: keys → integers Compression function: h2: integers → [0, N - 1]

q  The hash code is applied first, and the compression function is applied next on the result, i.e.,

h(x) = h2(h1(x)) q  The goal of the hash

function is to “disperse” the keys in an apparently random way

© 2014 Goodrich, Tamassia, Goldwasser 40 Hash Tables

Page 41: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Hash Codes q  Integer cast:

n  We reinterpret the bits of the key as an integer

n  Suitable for keys of length less than or equal to the number of bits of the integer type (e.g., byte, short, int and float in Java)

n  For keys of length greater than the number of bits of the integer type, ignore the exceeding bits

q  Component sum: n  We partition the bits of the

key into components of fixed length (e.g., 16 or 32 bits) and we sum the components (ignoring overflows)

n  Suitable for numeric keys of fixed length greater than or equal to the number of bits of the integer type (e.g., long and double in Java)

n  Not a good choice for strings

41 Hash Tables

Page 42: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Hash Codes (2) q  Polynomial accumulation:

n  We partition the bits of the key into a sequence of components of fixed length (e.g., 8, 16 or 32 bits) a0 a1 … an-1

n  We evaluate the polynomial p(z) = a0 + a1 z + a2 z2 + …

… + an-1zn-1

at a fixed value z, ignoring overflows

n  Especially suitable for strings (e.g., the choice z = 33 gives at most 6 collisions on a set of 50,000 English words)

q  Polynomial p(z) can be evaluated in O(n) time using Horner’s rule: n  The following polynomials

are successively computed, each from the previous one in O(1) time p0(z) = an-1 pi (z) = an-i-1 + zpi-1(z) (i = 1, 2, …, n -1)

q  We have p(z) = pn-1(z)

© 2014 Goodrich, Tamassia, Goldwasser 42 Hash Tables

Page 43: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Compression Functions q  Division:

n  Use the remainder n  h2 (y) = y mod N

w y is the key w N is size of the table

n  how to choose N?

43 Hash Tables

Consider (200, 205,210,215,220,…600) N = 100 N = 101

Page 44: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Compression Functions (2) q  Division:

n  Use the remainder n  h2 (y) = y mod N

w  y is the key w N is size of the table

n  how to choose N? n  N = bx (bad)

w N is a power of 2, h2(y) gives the x least significant bits of y.

w all keys with the same ending go to the same place n  N is prime (good)

w helps ensure uniform distribution

44 Hash Tables

Consider (200,205,210,215,220,…600) N = 100 N = 101

Page 45: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Compression Functions (3) q  Multiply, Add and Divide (MAD):

n  h2 (y) = [(ay + b) mod p ]mod N n  a and b are nonnegative integers such that

a mod N ≠ 0 n  Otherwise, every integer would map to the

same value b n  p is a prime number larger than N n  a and b are chosen at random from the

interval [0, p-1], with a > 0

45 Hash Tables

Page 46: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Compression Functions (4) q  For any choice of hash function, there

always exists a bad set of keys q  You could choose only these keys,

resulting in all of them getting mapped to the same slot. n  reduction in performance.

q  Solution n  collection of hash functions n  a random hash function n  choose a hash function that is independent of

the keys

46 Hash Tables

Page 47: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Collision Handling q  Collisions occur when

different elements are mapped to the same cell

q  Separate Chaining: let each cell in the table point to a linked list of entries that map there n  complexity depends

on load factor

q  Separate chaining is simple, but requires additional memory outside the table

© 2014 Goodrich, Tamassia, Goldwasser 47 Hash Tables

∅ ∅

0 1 2 3 4 451-229-0004 981-101-0004

025-612-0001

Page 48: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Map with Separate Chaining Delegate operations to a list-based map at each cell:

Algorithm get(k): return A[h(k)].get(k) Algorithm put(k,v): t = A[h(k)].put(k,v) if t = null then {k is a new key}

n = n + 1 return t Algorithm remove(k): t = A[h(k)].remove(k) if t ≠ null then {k was found}

n = n - 1 return t

© 2014 Goodrich, Tamassia, Goldwasser 48 Hash Tables

Page 49: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Linear Probing q  Open addressing: the colliding item is placed

in a different cell of the table n  load factor is at most 1

q  Linear probing: handles collisions by placing the colliding item in the next (circularly) available table cell

q  Each table cell inspected is referred to as a “probe”

q  Colliding items lump together, causing future collisions to cause a longer sequence of probes

49 Hash Tables

Page 50: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Linear Probing (2) q  Example:

n  h(x) = x mod 13 n  Insert keys 18, 41, 22, 44, 57, 32, 31, 73,

in this order

© 2014 Goodrich, Tamassia, Goldwasser 50 Hash Tables

0 1 2 3 4 5 6 7 8 9 10 11 12

Page 51: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Linear Probing (2) q  Example:

n  h(x) = x mod 13 n  Insert keys 18, 41, 22, 44, 57, 32, 31, 73,

in this order

© 2014 Goodrich, Tamassia, Goldwasser 51 Hash Tables

0 1 2 3 4 5 6 7 8 9 10 11 12

18

Page 52: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Linear Probing (2) q  Example:

n  h(x) = x mod 13 n  Insert keys 18, 41, 22, 44, 57, 32, 31, 73,

in this order

© 2014 Goodrich, Tamassia, Goldwasser 52 Hash Tables

0 1 2 3 4 5 6 7 8 9 10 11 12

18

Page 53: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Linear Probing (2) q  Example:

n  h(x) = x mod 13 n  Insert keys 18, 41, 22, 44, 57, 32, 31, 73,

in this order

© 2014 Goodrich, Tamassia, Goldwasser 53 Hash Tables

0 1 2 3 4 5 6 7 8 9 10 11 12

41 18

Page 54: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Linear Probing (2) q  Example:

n  h(x) = x mod 13 n  Insert keys 18, 41, 22, 44, 57, 32, 31, 73,

in this order

© 2014 Goodrich, Tamassia, Goldwasser 54 Hash Tables

0 1 2 3 4 5 6 7 8 9 10 11 12

41 18

Page 55: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Linear Probing (2) q  Example:

n  h(x) = x mod 13 n  Insert keys 18, 41, 22, 44, 57, 32, 31, 73,

in this order

© 2014 Goodrich, Tamassia, Goldwasser 55 Hash Tables

0 1 2 3 4 5 6 7 8 9 10 11 12

41 18 22

Page 56: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Linear Probing (2) q  Example:

n  h(x) = x mod 13 n  Insert keys 18, 41, 22, 44, 57, 32, 31, 73,

in this order

© 2014 Goodrich, Tamassia, Goldwasser 56 Hash Tables

0 1 2 3 4 5 6 7 8 9 10 11 12

41 18 22

Page 57: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Linear Probing (2) q  Example:

n  h(x) = x mod 13 n  Insert keys 18, 41, 22, 44, 57, 32, 31, 73,

in this order

© 2014 Goodrich, Tamassia, Goldwasser 57 Hash Tables

0 1 2 3 4 5 6 7 8 9 10 11 12

41 18 22

44

Page 58: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Linear Probing (2) q  Example:

n  h(x) = x mod 13 n  Insert keys 18, 41, 22, 44, 57, 32, 31, 73,

in this order w h(x) = (x+1) mod 13

© 2014 Goodrich, Tamassia, Goldwasser 58 Hash Tables

0 1 2 3 4 5 6 7 8 9 10 11 12

41 18 22

44

Page 59: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Linear Probing (2) q  Example:

n  h(x) = x mod 13 n  Insert keys 18, 41, 22, 44, 57, 32, 31, 73,

in this order w h(x) = (x+1) mod 13

© 2014 Goodrich, Tamassia, Goldwasser 59 Hash Tables

0 1 2 3 4 5 6 7 8 9 10 11 12

41 18 44 22

Page 60: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Linear Probing (2) q  Example:

n  h(x) = x mod 13 n  Insert keys 18, 41, 22, 44, 57, 32, 31, 73,

in this order

© 2014 Goodrich, Tamassia, Goldwasser 60 Hash Tables

0 1 2 3 4 5 6 7 8 9 10 11 12

41 18 44 22

Page 61: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Linear Probing (2) q  Example:

n  h(x) = x mod 13 n  Insert keys 18, 41, 22, 44, 57, 32, 31, 73,

in this order

© 2014 Goodrich, Tamassia, Goldwasser 61 Hash Tables

0 1 2 3 4 5 6 7 8 9 10 11 12

41 18 44 22

57

Page 62: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Linear Probing (2) q  Example:

n  h(x) = x mod 13 n  Insert keys 18, 41, 22, 44, 57, 32, 31, 73,

in this order w h(x) = (x+1) mod 13

© 2014 Goodrich, Tamassia, Goldwasser 62 Hash Tables

0 1 2 3 4 5 6 7 8 9 10 11 12

41 18 44 22

57

Page 63: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Linear Probing (2) q  Example:

n  h(x) = x mod 13 n  Insert keys 18, 41, 22, 44, 57, 32, 31, 73,

in this order w h(x) = (x+1) mod 13 w h(x) = (x+2) mod 13

© 2014 Goodrich, Tamassia, Goldwasser 63 Hash Tables

0 1 2 3 4 5 6 7 8 9 10 11 12

41 18 44 22

57

Page 64: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Linear Probing (2) q  Example:

n  h(x) = x mod 13 n  Insert keys 18, 41, 22, 44, 57, 32, 31, 73,

in this order w h(x) = (x+1) mod 13 w h(x) = (x+2) mod 13

© 2014 Goodrich, Tamassia, Goldwasser 64 Hash Tables

0 1 2 3 4 5 6 7 8 9 10 11 12

41 18 44 57 22

Page 65: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Linear Probing (2) q  Example:

n  h(x) = x mod 13 n  Insert keys 18, 41, 22, 44, 57, 32, 31, 73,

in this order w h(x) = (x+1) mod 13 w h(x) = (x+2) mod 13

© 2014 Goodrich, Tamassia, Goldwasser 65 Hash Tables

0 1 2 3 4 5 6 7 8 9 10 11 12

41 18 44 57 32 22

Page 66: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Linear Probing (2) q  Example:

n  h(x) = x mod 13 n  Insert keys 18, 41, 22, 44, 57, 32, 31, 73,

in this order w h(x) = (x+1) mod 13 w h(x) = (x+2) mod 13

© 2014 Goodrich, Tamassia, Goldwasser 66 Hash Tables

0 1 2 3 4 5 6 7 8 9 10 11 12

41 18 44 57 32 22 31

Page 67: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Linear Probing (2) q  Example:

n  h(x) = x mod 13 n  Insert keys 18, 41, 22, 44, 57, 32, 31, 73,

in this order w h(x) = (x+1) mod 13 w h(x) = (x+2) mod 13

© 2014 Goodrich, Tamassia, Goldwasser 67 Hash Tables

0 1 2 3 4 5 6 7 8 9 10 11 12

41 18 44 57 32 22 31 73

Page 68: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Search with Linear Probing q  Consider a hash table A

that uses linear probing q  get(k)

n  We start at cell h(k) n  We probe consecutive

locations until one of the following occurs w  An item with key k is

found, or w  An empty cell is found,

or w  N cells have been

unsuccessfully probed

© 2014 Goodrich, Tamassia, Goldwasser 68 Hash Tables

Algorithm get(k) i ← h(k) p ← 0 repeat c ← A[i] if c = ∅

return null else if c.getKey () = k return c.getValue() else i ← (i + 1) mod N

p ← p + 1 until p = N return null

Page 69: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Search with Linear Probing (2) q  Example:

n  h(x) = x mod 13 n  Insert keys 18, 41, 22, 44, 57, 32, 31, 73,

in this order n  Search for 57

69 Hash Tables

0 1 2 3 4 5 6 7 8 9 10 11 12

41 18 44 57 32 22 31 73

Page 70: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Updates with Linear Probing q  To handle insertions and deletions, we introduce a

special object, called DEFUNCT, which replaces deleted elements

q  remove(k) n  We search for an entry with key k

n  If such an entry (k, o) is found, we replace it with the

special item DEFUNCT and we return element o n  Else, we return null

70 Hash Tables

Page 71: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Update with Linear Probing (2) q  Example:

n  h(x) = x mod 13 n  Insert keys 18, 41, 22, 44, 57, 32, 31, 73,

in this order n  Remove 57 n  Search for 32

71 Hash Tables

0 1 2 3 4 5 6 7 8 9 10 11 12

41 18 44 32 22 31 73

Page 72: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Update with Linear Probing (3) q  Example:

n  h(x) = x mod 13 n  Insert keys 18, 41, 22, 44, 57, 32, 31, 73,

in this order n  Remove 57 n  Search for 32

72 Hash Tables

0 1 2 3 4 5 6 7 8 9 10 11 12

41 18 44 X 32 22 31 73

Page 73: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Updates with Linear Probing (4) q  To handle insertions and deletions, we introduce a special

object, called DEFUNCT, which replaces deleted elements

q  put(k, o) n  We throw an exception if the table is full n  We start at cell h(k) n  We probe consecutive cells until one of the

following occurs w A cell i is found that is either empty or stores

DEFUNCT, or w N cells have been unsuccessfully probed

n  We store (k, o) in cell i

© 2014 Goodrich, Tamassia, Goldwasser 73 Hash Tables

Page 74: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Update with Linear Probing (5) q  Example:

n  h(x) = x mod 13 n  Insert keys 18, 41, 22, 44, 57, 32, 31, 73,

in this order n  Remove 57 n  Search for 32 n  Insert 58

74 Hash Tables

0 1 2 3 4 5 6 7 8 9 10 11 12

41 18 44 X 32 22 31 73

Page 75: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Update with Linear Probing (5) q  Example:

n  h(x) = x mod 13 n  Insert keys 18, 41, 22, 44, 57, 32, 31, 73,

in this order n  Remove 57 n  Search for 32 n  Insert 58

75 Hash Tables

0 1 2 3 4 5 6 7 8 9 10 11 12

41 18 44 58 32 22 31 73

Page 76: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Double Hashing q  Double hashing uses a secondary hash

function d(k) and handles collisions by placing an item in the first available cell of the series

(i + jd(k)) mod N for j = 0, 1, … , N - 1

q  The secondary hash function d(k) cannot have zero values

q  The table size N must be a prime to allow probing of all the cells

© 2014 Goodrich, Tamassia, Goldwasser 76 Hash Tables

Page 77: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Example of Double Hashing q  Consider a hash table

storing integer keys that handles collision with double hashing n  N = 13 n  h(k) = k mod 13 n  d(k) = 7 - k mod 7

q  Insert keys 18, 41, 22, 44, 59, 32, 31, 73, in this order

© 2014 Goodrich, Tamassia, Goldwasser 77 Hash Tables

k h (k ) d (k ) Probes18 5 3 541 2 1 222 9 6 944 5 5 5 1059 7 4 732 6 3 631 5 4 5 9 073 8 4 8

0 1 2 3 4 5 6 7 8 9 10 11 12

41 18 44 59 32 22 31 73

31 41 18 32 59 73 22 44

Page 78: Priority Queue Sorting - cse.iitrpr.ac.incse.iitrpr.ac.in/ckn/courses/f2015/csl201/w6.pdf · to sort a list of comparable elements 1. Insert the elements one by one with a series

Performance of Hashing q  In the worst case, searches,

insertions and removals on a hash table take O(n) time

q  The worst case occurs when all the keys inserted into the map collide

q  The load factor α = n/N affects the performance of a hash table

q  Assuming that the hash values are like random numbers, it can be shown that the expected number of probes for an insertion with open addressing is

1 / (1 - α)

q  The expected running time of all the dictionary ADT operations in a hash table is O(1)

q  In practice, hashing is very fast provided the load factor is not close to 100%

q  Applications of hash tables: n  small databases n  compilers n  browser caches

© 2014 Goodrich, Tamassia, Goldwasser 78 Hash Tables