Building Java Programs Chapter 18 Advanced Data Structures: Hashing and Heaps Copyright (c) Pearson 2013. All rights reserved.


Hashing

Reading: 18.1

Recall: ADTs

• abstract data type (ADT): A specification of a collection of data and the operations that can be performed on it.
  – Describes what a collection does, not how it does it.

• Java's collection framework describes ADTs with interfaces:
  – Collection, Deque, List, Map, Queue, Set, SortedMap

• An ADT can be implemented in multiple ways by classes:
  – ArrayList and LinkedList implement List
  – HashSet and TreeSet implement Set
  – LinkedList, ArrayDeque, etc. implement Queue

SearchTree as a set

• We implemented a class SearchTree to store a BST of ints:

[Figure: a BST whose overallRoot is 55, with children 29 and 87 and leaves -3, 42, 60, 91]

• Our BST is essentially a set of integers. Operations we support:
  – add
  – contains
  – remove
  – ...

• But there are other ways to implement a set...

Sets

• set: A collection of unique values (no duplicates allowed) that can perform the following operations efficiently:
  – add, remove, search (contains)
  – The client doesn't think of a set as having indexes; we just add things to the set in general and don't worry about order.

set = { "the", "of", "from", "to", "she", "you", "him", "why", "in", "down", "by", "if" }

    set.contains("to")    // true
    set.contains("be")    // false

Int Set ADT interface

• Let's think about how to write our own implementation of a set.
  – To simplify the problem, we only store ints in our set for now.
  – As is (usually) done in the Java Collection Framework, we will define sets as an ADT by creating a Set interface.
  – Core operations are: add, contains, remove.

    public interface IntSet {
        void add(int value);
        boolean contains(int value);
        void clear();
        boolean isEmpty();
        void remove(int value);
        int size();
    }

Unfilled array set

• Consider storing a set in an unfilled array.
  – It doesn't really matter what order the elements appear in a set, so long as they can be added and searched quickly.
  – What would make a good ordering for the elements?

• If we store them in the next available index, as in a list, ...

    set.add(9);
    set.add(23);
    set.add(8);
    set.add(-3);
    set.add(49);
    set.add(12);

  – How efficient is add? contains? remove?
    • O(1), O(N), O(N)
    • (contains must loop over the array; remove must shift elements.)

index  0   1   2   3   4   5   6   7   8   9
value  9  23   8  -3  49  12   0   0   0   0
size   6
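The costs above are easier to see in code. The following is a minimal sketch, not the book's implementation: the class name ArrayIntSet and the fixed capacity of 10 are assumptions made for illustration. Note that the O(1) figure for add counts only the store into the next free index; a set that rejects duplicates must scan first.

```java
// Hypothetical sketch: an int set kept in insertion order in a partially filled array.
class ArrayIntSet {
    private final int[] elements = new int[10]; // assumed fixed capacity
    private int size = 0;

    // O(N): scans the used part of the array.
    private int indexOf(int value) {
        for (int i = 0; i < size; i++) {
            if (elements[i] == value) {
                return i;
            }
        }
        return -1;
    }

    public boolean contains(int value) {
        return indexOf(value) >= 0;
    }

    // The store itself is O(1); the duplicate check before it is an O(N) scan.
    public void add(int value) {
        if (contains(value)) {
            return;
        }
        if (size == elements.length) {
            throw new IllegalStateException("set is full");
        }
        elements[size] = value;
        size++;
    }

    // O(N): finds the value, then shifts later elements left by one.
    public void remove(int value) {
        int index = indexOf(value);
        if (index < 0) {
            return;
        }
        for (int i = index; i < size - 1; i++) {
            elements[i] = elements[i + 1];
        }
        size--;
        elements[size] = 0;
    }

    public int size() {
        return size;
    }
}
```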

Sorted array set

• Suppose we store the elements in an unfilled array, but in sorted order rather than order of insertion.

    set.add(9);
    set.add(23);
    set.add(8);
    set.add(-3);
    set.add(49);
    set.add(12);

  – How efficient is add? contains? remove?
    • O(N), O(log N), O(N)
    • (You can do an O(log N) binary search to find elements in contains, and to find the proper index in add/remove; but add/remove still need to shift elements right/left to make room, which is O(N) on average.)

index   0   1   2   3   4   5   6   7   8   9
value  -3   8   9  12  23  49   0   0   0   0
size    6
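As a rough illustration of why contains drops to O(log N) while add stays O(N), here is a sketch that leans on java.util.Arrays.binarySearch. The class name SortedArrayIntSet, the fixed capacity, and the toArray helper are assumptions for this example only.

```java
import java.util.Arrays;

// Hypothetical sketch: an int set kept in sorted order in a partially filled array.
class SortedArrayIntSet {
    private final int[] elements = new int[10]; // assumed fixed capacity
    private int size = 0;

    // O(log N): binary search over the used range [0, size).
    public boolean contains(int value) {
        return Arrays.binarySearch(elements, 0, size, value) >= 0;
    }

    // O(log N) to find the slot, then O(N) to shift elements right.
    public void add(int value) {
        int index = Arrays.binarySearch(elements, 0, size, value);
        if (index >= 0) {
            return; // already present
        }
        if (size == elements.length) {
            throw new IllegalStateException("set is full");
        }
        int insertAt = -(index + 1); // binarySearch encodes the insertion point
        System.arraycopy(elements, insertAt, elements, insertAt + 1, size - insertAt);
        elements[insertAt] = value;
        size++;
    }

    // Copy of the stored values, in sorted order.
    public int[] toArray() {
        return Arrays.copyOf(elements, size);
    }
}
```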

A strange idea

• Silly idea: When client adds value i, store it at index i in the array.
  – Would this work?
  – Problems / drawbacks of this approach? How to work around them?

    set.add(7);
    set.add(1);
    set.add(9);
    ...

index  0  1  2  3  4  5  6  7  8  9
value  0  1  0  0  0  0  0  7  0  9
size   3

    set.add(18);
    set.add(12);

index  0  1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19
value  0  1  0  0  0  0  0  7  0  9   0   0  12   0   0   0   0   0  18   0
size   5

Hashing

• hash: To map a large domain of values to a smaller fixed domain.
  – Typically, mapping a set of elements to integer indexes in an array.
  – Idea: Store any given element value in a particular predictable index.
    • That way, adding / removing / looking for it are constant-time (O(1)).
  – hash table: An array that stores elements via hashing.

• hash function: An algorithm that maps values to indexes.
  – hash code: The output of a hash function for a given value.
  – In previous slide, our "hash function" was: hash(i) → i
    • Potentially requires a large array (a.length > i).
    • Doesn't work for negative numbers.
    • Array could be very sparse, mostly empty (memory waste).

Improved hash function

• To deal with negative numbers: hash(i) → abs(i)
• To deal with large numbers: hash(i) → abs(i) % length

    set.add(37);    // abs(37) % 10 == 7
    set.add(-2);    // abs(-2) % 10 == 2
    set.add(49);    // abs(49) % 10 == 9

    // inside HashIntSet class
    private int hash(int i) {
        return Math.abs(i) % elements.length;
    }

index  0   1   2   3   4   5   6   7   8   9
value  0   0  -2   0   0   0   0  37   0  49
size   3
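One edge case is worth knowing about: Math.abs(Integer.MIN_VALUE) overflows and returns Integer.MIN_VALUE itself, so the abs-then-mod hash can produce a negative index for that single input. The sketch below compares the slide's formula with Math.floorMod, which is offered here only as one possible alternative (it is not from the slides, and it places negative values at different indexes than abs does). The class and method names are made up for this example.

```java
// Hypothetical demo comparing two ways to turn an int into an array index.
class HashIndexDemo {
    // The slide's formula. Correct for every int except Integer.MIN_VALUE.
    static int hashWithAbs(int i, int length) {
        return Math.abs(i) % length;
    }

    // Alternative (an assumption, not the book's code): floorMod always
    // returns a value in 0..length-1 when length is positive.
    static int hashWithFloorMod(int i, int length) {
        return Math.floorMod(i, length);
    }

    public static void main(String[] args) {
        System.out.println(hashWithAbs(37, 10));                     // 7
        System.out.println(hashWithAbs(-2, 10));                     // 2
        System.out.println(hashWithAbs(Integer.MIN_VALUE, 10));      // -8 (not a valid index)
        System.out.println(hashWithFloorMod(Integer.MIN_VALUE, 10)); // 2
        System.out.println(hashWithFloorMod(-2, 10));                // 8 (differs from abs)
    }
}
```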

Sketch of implementation

    public class HashIntSet implements IntSet {
        private int[] elements;
        ...
        public void add(int value) {
            elements[hash(value)] = value;
        }

        public boolean contains(int value) {
            return elements[hash(value)] == value;
        }

        public void remove(int value) {
            elements[hash(value)] = 0;
        }
    }

  – Runtime of add, contains, and remove: O(1) !!
    • Are there any problems with this approach?

Collisions

• collision: When hash function maps 2 values to same index.

    set.add(11);
    set.add(49);
    set.add(24);
    set.add(37);
    set.add(54);    // collides with 24!

• collision resolution: An algorithm for fixing collisions.

index  0   1   2   3   4   5   6   7   8   9
value  0  11   0   0  54   0   0  37   0  49
size   5

Probing

• probing: Resolving a collision by moving to another index.
  – linear probing: Moves to the next available index (wraps if needed).

    set.add(11);
    set.add(49);
    set.add(24);
    set.add(37);
    set.add(54);    // collides with 24; must probe

  – variation: quadratic probing moves increasingly far away: +1, +4, +9, ...

index  0   1   2   3   4   5   6   7   8   9
value  0  11   0   0  24  54   0  37   0  49
size   5

Implementing HashIntSet

• Let's implement an int set using a hash table with linear probing.
  – For simplicity, assume that the set cannot store 0s for now.

    public class HashIntSet implements IntSet {
        private int[] elements;
        private int size;

        // constructs new empty set
        public HashIntSet() {
            elements = new int[10];
            size = 0;
        }

        // hash function maps values to indexes
        private int hash(int value) {
            return Math.abs(value) % elements.length;
        }
        ...

The add operation

• How do we add an element to the hash table?
  – Use the hash function to find the proper bucket index.
  – If we see a 0, put it there.
  – If not, move forward until we find an empty (0) index to store it.
  – If we see that the value is already in the table, don't re-add it.

    set.add(54);    // client code
    set.add(14);

index  0   1   2   3   4   5   6   7   8   9
value  0  11   0   0  24  54  14  37   0  49
size   6

Implementing add

• How do we add an element to the hash table?

    public void add(int value) {
        int h = hash(value);
        // linear probing for empty slot
        while (elements[h] != 0 && elements[h] != value) {
            h = (h + 1) % elements.length;
        }
        if (elements[h] != value) {    // avoid duplicates
            elements[h] = value;
            size++;
        }
    }

index  0   1   2   3   4   5   6   7   8   9
value  0  11   0   0  24  54   0  37   0  49
size   5

The contains operation

• How do we search for an element in the hash table?
  – Use the hash function to find the proper bucket index.
  – Loop forward until we either find the value, or an empty index (0).
  – If we find the value, it is contained (true). If we find 0, it is not (false).

    set.contains(24)    // true
    set.contains(14)    // true
    set.contains(35)    // false

index  0   1   2   3   4   5   6   7   8   9
value  0  11   0   0  24  54  14  37   0  49
size   6

Implementing contains

    public boolean contains(int value) {
        int h = hash(value);
        while (elements[h] != 0) {    // linear probing to search
            if (elements[h] == value) {
                return true;
            }
            h = (h + 1) % elements.length;
        }
        return false;    // not found
    }

index  0   1   2   3   4   5   6   7   8   9
value  0  11   0   0  24  54   0  37   0  49
size   5

The remove operation

• We cannot remove by simply zeroing out an element:

    set.remove(54);     // set index 5 to 0
    set.contains(14)    // false??? oops

index  0   1   2   3   4   5   6   7   8   9
value  0  11   0   0  24   0  14  34   0  49
size   5

• Instead, we replace it by a special "removed" placeholder value
  – (can be re-used on add, but keep searching on contains)

index  0   1   2   3   4   5   6   7   8   9
value  0  11   0   0  24  XX  14  34   0  49
size   5

Implementing remove

    public void remove(int value) {
        int h = hash(value);
        while (elements[h] != 0 && elements[h] != value) {
            h = (h + 1) % elements.length;
        }
        if (elements[h] == value) {
            elements[h] = -999;    // "removed" flag value
            size--;
        }
    }

    set.remove(54);    // client code
    set.remove(11);
    set.remove(34);

Table state after set.remove(54):

index  0   1   2   3   4     5   6   7   8   9
value  0  11   0   0  24  -999  14  34   0  49
size   5

Patching add, contains

    private static final int REMOVED = -999;

    public void add(int value) {
        // Check the whole probe sequence first: the value may sit beyond a
        // REMOVED slot, and stopping at that slot would store a duplicate.
        if (contains(value)) {
            return;
        }
        int h = hash(value);
        while (elements[h] != 0 && elements[h] != REMOVED) {
            h = (h + 1) % elements.length;
        }
        elements[h] = value;    // first empty or REMOVED slot
        size++;
    }

    // contains does not need patching;
    // it should keep going on a -999, which it already does
    public boolean contains(int value) {
        int h = hash(value);
        while (elements[h] != 0 && elements[h] != value) {
            h = (h + 1) % elements.length;
        }
        return elements[h] == value;
    }

Problem: full array

• clustering: Clumps of elements at neighboring indexes.
  – Slows down the hash table lookup; you must loop through them.

    set.add(11);
    set.add(49);
    set.add(24);
    set.add(37);
    set.add(54);    // collides with 24
    set.add(14);    // collides with 24, then 54
    set.add(86);    // collides with 14, then 37

    • Where does each value go in the array?
    • How many indexes must be examined to answer contains(94)?
    • What will happen if the array completely fills?

index  0   1   2   3   4   5   6   7   8   9
value  0   0   0   0   0   0   0   0   0   0
size   0

Rehashing

• rehash: Growing to a larger array when the table is too full.
  – Cannot simply copy the old array to a new one. (Why not?)

• load factor: ratio of (# of elements) / (hash table length)
  – many collections rehash when load factor ≈ 0.75

Before rehashing (length 10):

index   0   1   2   3   4   5   6   7   8   9
value  95  11   0   0  24  54  14  37  66  48
size    8

After rehashing (length 20):

index  0  1  2  3   4  5   6  7   8  9  10  11  12  13  14  15  16  17  18  19
value  0  0  0  0  24  0  66  0  48  0   0  11   0   0  54  95  14  37   0   0
size   8

Implementing rehash

    // Grows hash table to twice its original size.
    private void rehash() {
        int[] old = elements;
        elements = new int[2 * old.length];
        size = 0;
        for (int value : old) {
            if (value != 0 && value != REMOVED) {
                add(value);
            }
        }
    }

    public void add(int value) {
        if ((double) size / elements.length >= 0.75) {
            rehash();
        }
        ...
    }

Hash table sizes

• Can use prime numbers as hash table sizes to reduce collisions.

• Also improves spread / reduces clustering on rehash.

    set.add(11);     // 11 % 13 == 11
    set.add(39);     // 39 % 13 == 0
    set.add(21);     // 21 % 13 == 8
    set.add(29);     // 29 % 13 == 3
    set.add(71);     // 71 % 13 == 6
    set.add(41);     // 41 % 13 == 2
    set.add(101);    // 101 % 13 == 10

index   0   1   2   3   4   5   6   7   8   9   10  11  12
value  39   0  41  29   0   0  71   0  21   0  101  11   0
size    7

Other details

• How would we implement toString on our HashIntSet?

    System.out.println(set);
    // [11, 24, 54, 37, 49]

index  0   1   2   3   4   5   6   7   8   9
value  0  11   0   0  24  54   0  37   0  49
size   5
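One possible answer to the toString question is to walk the array in index order and skip empty and removed slots. The helper below is a standalone sketch under that assumption; the class and method names are invented for the example, and inside HashIntSet the same loop would run over the elements field.

```java
// Hypothetical helper: formats the live values of a probing hash table.
class HashTableFormatter {
    static final int REMOVED = -999; // same placeholder value as on the slides

    static String tableToString(int[] elements) {
        StringBuilder sb = new StringBuilder("[");
        boolean first = true;
        for (int value : elements) {
            if (value != 0 && value != REMOVED) { // skip empty and removed slots
                if (!first) {
                    sb.append(", ");
                }
                sb.append(value);
                first = false;
            }
        }
        return sb.append("]").toString();
    }
}
```

The values come out in table order, not insertion or sorted order, which matches the `[11, 24, 54, 37, 49]` output shown on the slide.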

Separate chaining

• separate chaining: Solving collisions by storing a list at each index.
  – add/contains/remove must traverse lists, but the lists are short
  – impossible to "run out" of indexes, unlike with probing

    private class Node {
        public int data;
        public Node next;
        ...
    }

index 1: 11
index 4: 24 -> 14 -> 54
index 7: 7
index 9: 49
(all other indexes are null)

Implementing HashIntSet

• Let's implement a hash set of ints using separate chaining.

    public class HashIntSet implements IntSet {
        // array of linked lists;
        // elements[i] = front of list #i (null if empty)
        private Node[] elements;
        private int size;

        // constructs new empty set
        public HashIntSet() {
            elements = new Node[10];
            size = 0;
        }

        // hash function maps values to indexes
        private int hash(int value) {
            return Math.abs(value) % elements.length;
        }
        ...

The add operation

• How do we add an element to the hash table?
  – When you want to modify a linked list, you must either change the list's front reference, or the next field of a node in the list.
  – Where in the list should we add the new element?
  – Must make sure to avoid duplicates.

    set.add(24);

index 1: 11
index 4: 24 (new node) -> 14 -> 54
index 7: 7
index 9: 49
(all other indexes are null)

Implementing add

    public void add(int value) {
        if (!contains(value)) {
            int h = hash(value);
            // add to front of list #h
            Node newNode = new Node(value);
            newNode.next = elements[h];
            elements[h] = newNode;
            size++;
        }
    }

The contains operation

• How do we search for an element in the hash table?
  – Must loop through the linked list for the appropriate hash index, looking for the desired value.
  – Looping through a linked list requires a "current" node reference.

    set.contains(14)    // true
    set.contains(84)    // false
    set.contains(53)    // false

index 1: 11
index 4: 24 -> 14 -> 54    (current walks this list)
index 7: 7
index 9: 49
(all other indexes are null)

Implementing contains

    public boolean contains(int value) {
        Node current = elements[hash(value)];
        while (current != null) {
            if (current.data == value) {
                return true;
            }
            current = current.next;
        }
        return false;
    }

The remove operation

• How do we remove an element from the hash table?
  – Cases to consider: front (24), non-front (14), not found (94), null (32)
  – To remove a node from a linked list, you must either change the list's front reference, or the next field of the previous node in the list.

    set.remove(54);

index 1: 11
index 4: 24 -> 14 -> 54    (current walks this list)
index 7: 7
index 9: 49
(all other indexes are null)

Implementing remove

    public void remove(int value) {
        int h = hash(value);
        if (elements[h] != null && elements[h].data == value) {
            elements[h] = elements[h].next;    // front case
            size--;
        } else {
            Node current = elements[h];        // non-front case
            while (current != null && current.next != null) {
                if (current.next.data == value) {
                    current.next = current.next.next;
                    size--;
                    return;
                }
                current = current.next;
            }
        }
    }
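For readers who want to run the chaining operations end to end, here is a compact self-contained sketch. It follows the slides' add, contains, and remove logic, but the class name ChainingIntSet, the two-argument Node constructor, and the remainder-before-abs hash are assumptions of this example; rehashing is left out to keep it short.

```java
// Hypothetical sketch of an int set using separate chaining (no rehashing).
class ChainingIntSet {
    private static class Node {
        final int data;
        Node next;

        Node(int data, Node next) {
            this.data = data;
            this.next = next;
        }
    }

    // elements[i] is the front of list #i, or null if that list is empty.
    private final Node[] elements = new Node[10];
    private int size = 0;

    private int hash(int value) {
        return Math.abs(value % elements.length);
    }

    public boolean contains(int value) {
        Node current = elements[hash(value)];
        while (current != null) {
            if (current.data == value) {
                return true;
            }
            current = current.next;
        }
        return false;
    }

    // Adds to the front of the list, skipping values already present.
    public void add(int value) {
        if (!contains(value)) {
            int h = hash(value);
            elements[h] = new Node(value, elements[h]);
            size++;
        }
    }

    public void remove(int value) {
        int h = hash(value);
        Node current = elements[h];
        if (current == null) {
            return;                                // null case: empty list
        }
        if (current.data == value) {
            elements[h] = current.next;            // front case
            size--;
            return;
        }
        while (current.next != null) {
            if (current.next.data == value) {
                current.next = current.next.next;  // non-front case
                size--;
                return;
            }
            current = current.next;
        }
        // not-found case: nothing to do
    }

    public int size() {
        return size;
    }
}
```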

Rehashing w/ chaining

• Separate chaining handles rehashing similarly to linear probing.
  – Loop over the list in each hash bucket; re-add each element.
  – An optimal implementation re-uses node objects, but this is optional.

[Figure: a 10-bucket table holding 11, 24, 54, 14, 7, 49, and the 20-bucket table that results from rehashing it]

Hash set of objects

    public class HashSet<E> implements Set<E> {
        ...
        private class Node {
            public E data;
            public Node next;
        }
    }

• It is easy to hash an integer i (use index abs(i) % length).
  – How can we hash other types of values (such as objects)?

The hashCode method

• All Java objects contain the following method:

    public int hashCode()

  Returns an integer hash code for this object.

  – We can call hashCode on any object to find its preferred index.
  – HashSet, HashMap, and the other built-in "hash" collections call hashCode internally on their elements to store the data.

• We can modify our set's hash function to be the following:

    private int hash(E e) {
        return Math.abs(e.hashCode()) % elements.length;
    }

Issues with generics

• You must make an unusual cast on your array of generic nodes:

    public class HashSet<E> implements Set<E> {
        private Node[] elements;
        ...
        public HashSet() {
            elements = (Node[]) new HashSet.Node[10];
        }

• Perform all element comparisons using equals:

    public boolean contains(E value) {
        ...
        // if (current.data == value) {
        if (current.data.equals(value)) {
            return true;
        }
        ...

Implementing hashCode

• You can write your own hashCode methods in classes you write.
  – All classes come with a default version based on memory address.
  – Your overridden version should somehow "add up" the object's state.
    • Often you scale/multiply parts of the result to distribute the results.

    public class Point {
        private int x;
        private int y;
        ...
        public int hashCode() {
            // better than just returning (x + y);
            // spreads out numbers, fewer collisions
            return 137 * x + 23 * y;
        }
    }

Good hashCode behavior

• A well-written hashCode method is:
  – Consistent with itself (must produce the same result on each call):
    o.hashCode() == o.hashCode(), if o's state doesn't change
  – Consistent with equality:
    a.equals(b) must imply that a.hashCode() == b.hashCode(),
    !a.equals(b) does NOT necessarily imply that a.hashCode() != b.hashCode() (why not?)
    • When your class has an equals or hashCode, it should have both.
  – Well distributed:
    • For a large set of objects with distinct states, they will generally return unique hash codes rather than all colliding into the same hash bucket.
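The properties above can be checked on a small class. The sketch below reuses the 137 * x + 23 * y formula from the Point slide under a different, made-up class name (GridPoint) with an equals method added; the particular colliding pair is just one instance chosen for illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical point class with equals and hashCode defined together.
final class GridPoint {
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof GridPoint)) {
            return false;
        }
        GridPoint other = (GridPoint) o;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() {
        return 137 * x + 23 * y; // equal points always produce equal codes
    }

    public static void main(String[] args) {
        Set<GridPoint> set = new HashSet<>();
        set.add(new GridPoint(3, 4));
        // A different but equal object is found because equals and hashCode agree.
        System.out.println(set.contains(new GridPoint(3, 4)));            // true

        // Unequal objects may still share a hash code: 137 * 23 == 23 * 137.
        GridPoint a = new GridPoint(23, 0);
        GridPoint b = new GridPoint(0, 137);
        System.out.println(a.hashCode() == b.hashCode());                 // true
        System.out.println(a.equals(b));                                  // false
    }
}
```

The last two lines suggest an answer to the "why not?" question: an int has only 2^32 possible values, so distinct states must sometimes share a hash code.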

Example: String hashCode

• The hashCode function inside a String object looks like this:

    public int hashCode() {
        int hash = 0;
        for (int i = 0; i < this.length(); i++) {
            hash = 31 * hash + this.charAt(i);
        }
        return hash;
    }

  – As with any general hashing function, collisions are possible.
    • Example: "Ea" and "FB" have the same hash value.
  – Early versions of Java examined at most 16 characters of the string. For some common data this led to poor hash table performance.
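The "Ea" / "FB" collision can be verified by hand: 31 * 'E' + 'a' = 31 * 69 + 97 = 2236, and 31 * 'F' + 'B' = 31 * 70 + 66 = 2236. The short program below does the same check; slideHash is a made-up name for a copy of the loop shown on the slide.

```java
// Hypothetical demo: recomputes the slide's hash loop and compares with String.hashCode.
class StringHashDemo {
    static int slideHash(String s) {
        int hash = 0;
        for (int i = 0; i < s.length(); i++) {
            hash = 31 * hash + s.charAt(i);
        }
        return hash;
    }

    public static void main(String[] args) {
        System.out.println(slideHash("Ea"));   // 2236
        System.out.println(slideHash("FB"));   // 2236
        System.out.println("Ea".hashCode() == "FB".hashCode()); // true
    }
}
```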

hashCode tricks

• If one of your object's fields is an object, call its hashCode:

    public int hashCode() {    // Student
        return 531 * firstName.hashCode() + ...;
    }

• To incorporate a double or boolean, use the hashCode method from the Double or Boolean wrapper classes:

    public int hashCode() {    // BankAccount
        return 37 * Double.valueOf(balance).hashCode()
                + Boolean.valueOf(isCheckingAccount).hashCode();
    }

• Guava includes an Objects.hashCode(...) method that takes any number of values and combines them into one hash code.

    public int hashCode() {    // BankAccount
        return Objects.hashCode(name, id, balance);
    }
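If Guava is not on the classpath, the JDK's own java.util.Objects.hash(...) (available since Java 7) plays a similar role. The following sketch assumes a BankAccount with the three fields named on the slide; the field types, constructor, and equals method are guesses made for the sake of a runnable example.

```java
import java.util.Objects;

// Hypothetical BankAccount whose hashCode combines its fields with java.util.Objects.hash.
final class BankAccount {
    private final String name;
    private final int id;
    private final double balance;

    BankAccount(String name, int id, double balance) {
        this.name = name;
        this.id = id;
        this.balance = balance;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof BankAccount)) {
            return false;
        }
        BankAccount other = (BankAccount) o;
        return id == other.id
                && Double.compare(balance, other.balance) == 0
                && Objects.equals(name, other.name);
    }

    @Override
    public int hashCode() {
        // Boxes the primitives and combines all three values into one hash code.
        return Objects.hash(name, id, balance);
    }
}
```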

Implementing a hash map

• A hash map is like a set where the nodes store key/value pairs:

    public class HashMap<K, V> implements Map<K, V> {
        ...
    }

    //      key      value
    map.put("Marty", 14);
    map.put("Jeff",  21);
    map.put("Kasey", 20);
    map.put("Stef",  35);

  – Must modify your Node class to store a key and a value

[Figure: a 10-bucket table whose nodes hold the pairs "Jeff" 21, "Marty" 14, "Kasey" 20, "Stef" 35]

Map ADT interface

• Let's think about how to write our own implementation of a map.
  – As is (usually) done in the Java Collection Framework, we will define map as an ADT by creating a Map interface.
  – Core operations: put (add), get, contains key, remove

    public interface Map<K, V> {
        void clear();
        boolean containsKey(K key);
        V get(K key);
        boolean isEmpty();
        void put(K key, V value);
        void remove(K key);
        int size();
    }

Hash map vs. hash set

  – The hashing is always done on the keys, not the values.
  – The contains method is now containsKey; there and in remove, you search for a node whose key matches a given key.
  – The add method is now put; if the given key is already there, you must replace its old value with the new one.
    • map.put("Bill", 66);    // replace 49 with 66

[Figure: the table from the previous slide with added pairs "Abby" 57 and "Bill" 49; the put call replaces Bill's 49 with 66]
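The differences listed above show up mainly in put. As an illustration only, the sketch below builds a small chained map in which put replaces the value when the key already exists. The class name ChainedHashMap, the private find helper, the lack of rehashing, and the lack of support for null keys are all simplifying assumptions.

```java
// Hypothetical sketch of a hash map using separate chaining (no rehashing, no null keys).
class ChainedHashMap<K, V> {
    private static class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] elements = (Node<K, V>[]) new Node[10];
    private int size = 0;

    // Hashing is done on the key only.
    private int hash(K key) {
        return Math.abs(key.hashCode() % elements.length);
    }

    // Returns the node holding the key, or null if the key is absent.
    private Node<K, V> find(K key) {
        Node<K, V> current = elements[hash(key)];
        while (current != null) {
            if (current.key.equals(key)) {
                return current;
            }
            current = current.next;
        }
        return null;
    }

    public void put(K key, V value) {
        Node<K, V> node = find(key);
        if (node != null) {
            node.value = value; // key already present: replace the old value
        } else {
            int h = hash(key);
            elements[h] = new Node<>(key, value, elements[h]);
            size++;
        }
    }

    public V get(K key) {
        Node<K, V> node = find(key);
        return node == null ? null : node.value;
    }

    public boolean containsKey(K key) {
        return find(key) != null;
    }

    public int size() {
        return size;
    }
}
```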

Priority Queues and Heaps

Reading: 18.2

Prioritization problems

• print jobs: CSE lab printers constantly accept and complete jobs from all over the building. We want to print faculty jobs before staff before student jobs, and grad students before undergrad, etc.

• ER scheduling: Scheduling patients for treatment in the ER. A gunshot victim should be treated sooner than a guy with a cold, regardless of arrival time. How do we always choose the most urgent case when new patients continue to arrive?

• key operations we want:
  – add an element (print job, patient, etc.)
  – get/remove the most "important" or "urgent" element

Priority Queue ADT

• priority queue: A collection of ordered elements that provides fast access to the minimum (or maximum) element.
  – add adds in order
  – peek returns minimum or "highest priority" value
  – remove removes/returns minimum value
  – isEmpty, clear, size, iterator: O(1)

    pq.add("if");
    pq.add("from");
    ...

priority queue = { "the", "of", "from", "to", "she", "you", "him", "why", "in", "down", "by", "if" }

    pq.remove()    // "by"
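Java's built-in java.util.PriorityQueue provides this ADT, ordering elements by their natural ordering by default. The short demo below replays the slide's example; the wrapper class and the smallest helper are invented for illustration.

```java
import java.util.PriorityQueue;

// Hypothetical demo of the priority queue ADT using java.util.PriorityQueue.
class PriorityQueueDemo {
    // Adds every word, then removes and returns the minimum.
    static String smallest(String... words) {
        PriorityQueue<String> pq = new PriorityQueue<>();
        for (String word : words) {
            pq.add(word);
        }
        return pq.remove();
    }

    public static void main(String[] args) {
        String first = smallest("the", "of", "from", "to", "she", "you",
                                "him", "why", "in", "down", "by", "if");
        System.out.println(first); // by
    }
}
```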

Unfilled array?

• Consider using an unfilled array to implement a priority queue.
  – add: Store it in the next available index, as in a list.
  – peek: Loop over elements to find minimum element.
  – remove: Loop over elements to find min. Shift to remove.

    queue.add(9);
    queue.add(23);
    queue.add(8);
    queue.add(-3);
    queue.add(49);
    queue.add(12);
    queue.remove();

  – How efficient is add? peek? remove?
    • O(1), O(N), O(N)
    • (peek must loop over the array; remove must shift elements)

index  0   1   2   3   4   5   6   7   8   9
value  9  23   8  -3  49  12   0   0   0   0
size   6

Sorted array?

• Consider using a sorted array to implement a priority queue.
  – add: Store it in the proper index to maintain sorted order.
  – peek: Minimum element is in index [0].
  – remove: Shift elements to remove min from index [0].

    queue.add(9);
    queue.add(23);
    queue.add(8);
    queue.add(-3);
    queue.add(49);
    queue.add(12);
    queue.remove();

  – How efficient is add? peek? remove?
    • O(N), O(1), O(N)
    • (add and remove must shift elements)

index   0   1   2   3   4   5   6   7   8   9
value  -3   8   9  12  23  49   0   0   0   0
size    6

Linked list?

• Consider using a doubly linked list to implement a priority queue.
  – add: Store it at the end of the linked list.
  – peek: Loop over elements to find minimum element.
  – remove: Loop over elements to find min. Unlink to remove.

    queue.add(9);
    queue.add(23);
    queue.add(8);
    queue.add(-3);
    queue.add(49);
    queue.add(12);
    queue.remove();

  – How efficient is add? peek? remove?
    • O(1), O(N), O(N)
    • (peek and remove must loop over the linked list)

front -> 9 <-> 23 <-> 8 <-> -3 <-> 49 <-> 12 <- back

Sorted linked list?

• Consider using a sorted linked list to implement a priority queue.
  – add: Store it in the proper place to maintain sorted order.
  – peek: Minimum element is at the front.
  – remove: Unlink front element to remove.

    queue.add(9);
    queue.add(23);
    queue.add(8);
    queue.add(-3);
    queue.add(49);
    queue.add(12);
    queue.remove();

  – How efficient is add? peek? remove?
    • O(N), O(1), O(1)
    • (add must loop over the linked list to find the proper insertion point)

front -> -3 <-> 8 <-> 9 <-> 12 <-> 23 <-> 49 <- back

Binary search tree?

• Consider using a binary search tree to implement a PQ.
  – add: Store it in the proper BST L/R-ordered spot.
  – peek: Minimum element is at the far left edge of the tree.
  – remove: Unlink far left element to remove.

    queue.add(9);
    queue.add(23);
    queue.add(8);
    queue.add(-3);
    queue.add(49);
    queue.add(12);
    queue.remove();

  – How efficient is add? peek? remove?
    • O(log N), O(log N), O(log N)...?
    • (good in theory, but the tree tends to become unbalanced to the right)

[Figure: the BST after the six adds: root 9; left child 8 with left child -3; right child 23 with children 12 and 49]

Unbalanced binary tree

    queue.add(9);
    queue.add(23);
    queue.add(8);
    queue.add(-3);
    queue.add(49);
    queue.add(12);
    queue.remove();

    queue.add(16);
    queue.add(34);
    queue.remove();
    queue.remove();
    queue.add(42);
    queue.add(45);
    queue.remove();

  – Simulate these operations. What is the tree's shape?
  – A tree that is unbalanced has a height close to N rather than log N, which breaks the expected runtime of many operations.

[Figure: the resulting unbalanced tree over the values 12, 16, 23, 34, 42, 45, 49]

Heaps

• heap: A complete binary tree with vertical ordering.
  – complete tree: Every level is full except possibly the lowest level, which must be filled from left to right
    • (i.e., a node may not have any children until all possible siblings exist)

Heap ordering

• heap ordering: P ≤ X for every element X with parent P.
  – Parents' values are never larger than those of their children.
  – Implies that minimum element is always the root (a "min-heap").
    • variation: "max-heap" stores largest element at root, reverses ordering
  – Is a heap a BST? How are they related?

Which are min-heaps?

[Figure: six candidate binary trees; the answer key marks four of them "no"]

Which are max-heaps?

[Figure: several candidate binary trees; the answer key marks two of them "no"]

Page 60: Building Java Programs Chapter 18 Advanced Data Structures: Hashing and Heaps Copyright (c) Pearson 2013. All rights reserved

60

Heap height and runtime

• The height of a complete tree with N nodes is always $\lfloor \log_2 N \rfloor$, which is O(log N).
– How do we know this for sure?

• Because of this, if we implement a priority queue using a heap, we can provide the following runtime guarantees:
– add: O(log N)
– peek: O(1)
– remove: O(log N)

For an n-node complete tree of height h:

$2^h \le n \le 2^{h+1} - 1$, so $h = \lfloor \log_2 n \rfloor$
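The height bound can be checked numerically. The following is a minimal sketch, assuming the nodes of a complete tree are numbered level by level starting at 1, so that the parent of node i is node i / 2; the class and method names are hypothetical and not part of the slides.

```java
class CompleteTreeHeight {
    // Height (edges on the longest root-to-leaf path) of a complete
    // binary tree with n >= 1 nodes. Walks from the last node (number n)
    // up to the root (number 1), halving the node number at each step.
    static int height(int n) {
        int h = 0;
        for (int index = n; index > 1; index /= 2) {
            h++;
        }
        return h;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 2, 3, 4, 7, 8, 12, 1000}) {
            int h = height(n);
            // every n should satisfy 2^h <= n <= 2^(h+1) - 1
            boolean inRange = (1 << h) <= n && n <= (1 << (h + 1)) - 1;
            System.out.println(n + " nodes -> height " + h + ", bound holds: " + inRange);
        }
    }
}
```

Because the loop halves the node number each time, it runs about log2 n times, which is the same reasoning that bounds the number of swaps in add and remove.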


The add operation

• When an element is added to a heap, where should it go?
– Must insert a new node while maintaining heap properties.
– queue.add(15);

[Figure: a min-heap with root 10, and a new node 15 waiting to be inserted.]


The add operation

• When an element is added to a heap, it should be initially placed as the new rightmost leaf on the bottom level, i.e. the next open position (to maintain the completeness property).
– But the heap ordering property becomes broken!

[Figure: the heap before and after 15 is attached as the last leaf.]


"Bubbling up" a node

• bubble up: To restore heap ordering, the newly added element is shifted ("bubbled") up the tree until it reaches its proper place.
– Weiss: "percolate up" by swapping with its parent
– How many bubble-ups are necessary, at most?

[Figure: the heap before and after 15 bubbles up.]


Bubble-up exercise

• Draw the tree state of a min-heap after adding these elements:
– 6, 50, 11, 25, 42, 20, 104, 76, 19, 55, 88, 2

[Figure: the resulting min-heap, level by level: 2 / 19, 6 / 25, 42, 11, 104 / 76, 50, 55, 88, 20]


The peek operation

• A peek on a min-heap is trivial to perform.
– because of heap properties, the minimum element is always the root
– O(1) runtime

• Peek on a max-heap would be O(1) as well (return max, not min)

[Figure: a min-heap whose root, 10, is the value returned by peek.]


The remove operation

• When an element is removed from a heap, what should we do?
– The root is the node to remove. How do we alter the tree?
– queue.remove();

[Figure: a min-heap with root 10, the element to be removed.]


The remove operation

• When the root is removed from a heap, it should be initially replaced by the rightmost leaf (to maintain completeness).
– But the heap ordering property becomes broken!

[Figure: the heap before and after the last leaf, 65, replaces the removed root 10.]


"Bubbling down" a node

• bubble down: To restore heap ordering, the new improper root is shifted ("bubbled") down the tree until it reaches its proper place.
– Weiss: "percolate down" by swapping with its smaller child (why?)
– How many bubble-downs are necessary, at most?

[Figure: the heap before and after 65 bubbles down from the root.]


Bubble-down exercise

• Suppose we have the min-heap shown below.
• Show the state of the heap tree after remove has been called 3 times, and which elements are returned by the removal.

[Figure: min-heap, level by level: 2 / 19, 6 / 25, 42, 11, 104 / 76, 50, 55, 88, 20]


Array heap implementation

• Though a heap is conceptually a binary tree, since it is a complete tree, when implementing it we actually can "cheat" and just use an array!
– index of root = 1 (leave 0 empty to simplify the math)
– for any node n at index i:
• index of n.left = 2i
• index of n.right = 2i + 1
• parent index of n?
– This array representation is elegant and efficient (O(1)) for common tree operations.
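To make the index arithmetic concrete, here is a small sketch that checks min-heap ordering directly on a 1-based array. It assumes the layout described above (root at index 1, index 0 unused); the class and method names are hypothetical.

```java
class HeapArrayCheck {
    // Returns true if heap[1..size] satisfies min-heap ordering.
    // Index 0 is ignored, matching the 1-based layout.
    static boolean isMinHeap(int[] heap, int size) {
        // Every node except the root (index 1) has a parent at index i / 2.
        for (int i = 2; i <= size; i++) {
            int parent = i / 2;
            if (heap[parent] > heap[i]) {
                return false;   // a parent is larger than its child
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] heap = {0, 2, 19, 6, 25, 42, 11, 104, 76, 50, 55, 88, 20};
        System.out.println(isMinHeap(heap, 12));
    }
}
```

Because the tree is complete, no node objects or child references are needed; the array positions alone encode the tree shape.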


Implementing HeapPQ

• Let's implement an int priority queue using a min-heap array.

public class HeapIntPriorityQueue implements IntPriorityQueue {
    private int[] elements;
    private int size;

    // constructs a new empty priority queue
    public HeapIntPriorityQueue() {
        elements = new int[10];
        size = 0;
    }

    ...
}


Helper methods

• Since we will treat the array as a complete tree/heap, and walk up/down between parents/children, these methods are helpful:

// helpers for navigating indexes up/down the tree
private int parent(int index) { return index / 2; }
private int leftChild(int index) { return index * 2; }
private int rightChild(int index) { return index * 2 + 1; }
private boolean hasParent(int index) { return index > 1; }
private boolean hasLeftChild(int index) {
    return leftChild(index) <= size;
}
private boolean hasRightChild(int index) {
    return rightChild(index) <= size;
}
private void swap(int[] a, int index1, int index2) {
    int temp = a[index1];
    a[index1] = a[index2];
    a[index2] = temp;
}


Implementing add

• Let's write the code to add an element to the heap:

public void add(int value) {
    ...
}

[Figure: the heap before and after adding 15 and bubbling it up.]


Implementing add

// Adds the given value to this priority queue in order.
public void add(int value) {
    elements[size + 1] = value;   // add as rightmost leaf

    // "bubble up" as necessary to fix ordering
    int index = size + 1;
    boolean found = false;
    while (!found && hasParent(index)) {
        int parent = parent(index);
        if (elements[index] < elements[parent]) {
            swap(elements, index, parent);
            index = parent;
        } else {
            found = true;   // found proper location; stop
        }
    }

    size++;
}


Resizing a heap

• What if our array heap runs out of space?
– We must enlarge it.
– When enlarging hash sets, we needed to carefully rehash the data.
– What must we do here?
– (We can simply copy the data into a larger array.)


Modified add code

// Adds the given value to this priority queue in order.
// (Arrays.copyOf requires: import java.util.Arrays;)
public void add(int value) {
    // resize to enlarge the heap if necessary
    if (size == elements.length - 1) {
        elements = Arrays.copyOf(elements, 2 * elements.length);
    }
    ...
}


Implementing peek

• Let's write code to retrieve the minimum element in the heap:

public int peek() {
    ...
}

[Figure: a min-heap with root 10.]


Implementing peek

// Returns the minimum element in this priority queue.
// precondition: queue is not empty
public int peek() {
    return elements[1];
}


Implementing remove

• Let's write code to remove the minimum element in the heap:

public int remove() {
    ...
}

[Figure: the heap before and after the last leaf, 65, replaces the removed root 10.]


Implementing remove

public int remove() {   // precondition: queue is not empty
    int result = elements[1];       // last leaf -> root
    elements[1] = elements[size];
    size--;

    int index = 1;                  // "bubble down" to fix ordering
    boolean found = false;
    while (!found && hasLeftChild(index)) {
        int left = leftChild(index);
        int right = rightChild(index);
        int child = left;
        if (hasRightChild(index) && elements[right] < elements[left]) {
            child = right;
        }
        if (elements[index] > elements[child]) {
            swap(elements, index, child);
            index = child;
        } else {
            found = true;   // found proper location; stop
        }
    }
    return result;
}
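The pieces shown so far (1-based array, resizing add, peek, and remove) can be combined into one compact class for experimenting. This is a condensed sketch rather than the slides' HeapIntPriorityQueue: the class name IntMinHeapSketch is hypothetical, the helper methods are inlined as index arithmetic, and the exception on an empty heap is an added assumption (the slides only state a precondition).

```java
import java.util.Arrays;
import java.util.NoSuchElementException;

class IntMinHeapSketch {
    private int[] elements = new int[10];   // index 0 is left unused
    private int size = 0;

    public boolean isEmpty() { return size == 0; }

    public void add(int value) {
        if (size == elements.length - 1) {  // array is full: double it
            elements = Arrays.copyOf(elements, 2 * elements.length);
        }
        size++;
        elements[size] = value;             // place as the last leaf
        int index = size;
        // bubble up while the parent (index / 2) holds a larger value
        while (index > 1 && elements[index] < elements[index / 2]) {
            swap(index, index / 2);
            index /= 2;
        }
    }

    public int peek() {
        if (size == 0) throw new NoSuchElementException("heap is empty");
        return elements[1];
    }

    public int remove() {
        int result = peek();
        elements[1] = elements[size];       // last leaf becomes the root
        size--;
        int index = 1;
        while (2 * index <= size) {         // while there is a left child
            int child = 2 * index;
            if (child + 1 <= size && elements[child + 1] < elements[child]) {
                child++;                    // right child is the smaller one
            }
            if (elements[index] <= elements[child]) break;  // in place
            swap(index, child);
            index = child;
        }
        return result;
    }

    private void swap(int i, int j) {
        int temp = elements[i];
        elements[i] = elements[j];
        elements[j] = temp;
    }
}
```

Adding the values from the bubble-up exercise and then calling remove repeatedly yields them in ascending order, which is a quick way to sanity-check both bubbling directions.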


Int PQ ADT interface

• Let's write our own implementation of a priority queue.
– To simplify the problem, we only store ints in our priority queue for now.
– As is (usually) done in the Java Collection Framework, we will define priority queues as an ADT by creating an interface.
– Core operations are: add, peek (at min), remove (min).

public interface IntPriorityQueue {
    void add(int value);
    void clear();
    boolean isEmpty();
    int peek();     // return min element
    int remove();   // remove/return min element
    int size();
}


Generic PQ ADT

• Let's modify our priority queue so it can store any type of data.
– As with past collections, we will use Java generics (a type parameter).

public interface PriorityQueue<E> {
    void add(E value);
    void clear();
    boolean isEmpty();
    E peek();     // return min element
    E remove();   // remove/return min element
    int size();
}


Generic HeapPQ class

• We can modify our heap priority queue class to use generics as usual...

public class HeapPriorityQueue<E> implements PriorityQueue<E> {
    private E[] elements;
    private int size;

    // constructs a new empty priority queue
    public HeapPriorityQueue() {
        elements = (E[]) new Object[10];
        size = 0;
    }

    ...
}


Problem: ordering elements

// Adds the given value to this priority queue in order.
public void add(E value) {
    ...
    int index = size + 1;
    boolean found = false;
    while (!found && hasParent(index)) {
        int parent = parent(index);
        if (elements[index] < elements[parent]) {   // error
            swap(elements, index, parent(index));
            index = parent(index);
        } else {
            found = true;   // found proper location; stop
        }
    }
}

– Even changing the < to a compareTo call does not work.
• Java cannot be sure that type E has a compareTo method.


Comparing objects

• Heaps rely on being able to order their elements.
• Operators like < and > do not work with objects in Java.
– But we do think of some types as having an ordering (e.g. Dates).
– (In other languages, we can enable <, > with operator overloading.)

• natural ordering: Rules governing the relative placement of all values of a given type.
– Implies a notion of equality (like equals) but also < and >.
– total ordering: All elements can be arranged in A ≤ B ≤ C ≤ ... order.
– The Comparable interface provides a natural ordering.


The Comparable interface

• The standard way for a Java class to define a comparison function for its objects is to implement the Comparable interface.

public interface Comparable<T> {
    public int compareTo(T other);
}

• A call of A.compareTo(B) should return:
a value < 0 if A comes "before" B in the ordering,
a value > 0 if A comes "after" B in the ordering,
or exactly 0 if A and B are considered "equal" in the ordering.

• Effective Java Tip #12: Consider implementing Comparable.
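As an illustration of the compareTo contract, here is a sketch of a small class with a natural ordering. CalendarDate is a hypothetical class invented for this example (it is not from the slides); it orders dates by month and breaks ties by day.

```java
class CalendarDate implements Comparable<CalendarDate> {
    private final int month;
    private final int day;

    CalendarDate(int month, int day) {
        this.month = month;
        this.day = day;
    }

    // Negative if this date comes before other, positive if after,
    // and 0 if both have the same month and day.
    @Override
    public int compareTo(CalendarDate other) {
        if (month != other.month) {
            return Integer.compare(month, other.month);
        }
        return Integer.compare(day, other.day);
    }

    @Override
    public String toString() {
        return month + "/" + day;
    }
}
```

With this in place, `new CalendarDate(3, 15).compareTo(new CalendarDate(4, 1))` is negative, and library code such as Arrays.sort or TreeSet can order CalendarDate objects without any extra arguments.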


Bounded type parameters

<Type extends SuperType>
– An upper bound; accepts the given supertype or any of its subtypes.
– Works for multiple superclass/interfaces with & :
<Type extends ClassA & InterfaceB & InterfaceC & ...>

<? super Type>
– A lower bound; accepts the given type or any of its supertypes.
– Java allows super bounds only on wildcards (?), not on named type parameters.

• Example:
// can be instantiated with any animal type
public class Nest<T extends Animal> {
    ...
}
...
Nest<Bluebird> nest = new Nest<Bluebird>();
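An upper bound is also what lets generic code call compareTo. The sketch below uses a hypothetical BoundedMax class (not from the slides) with a generic method whose type parameter must be Comparable to itself.

```java
class BoundedMax {
    // T must implement Comparable<T>, so the compiler knows that
    // a.compareTo(b) exists for any T this method is called with.
    static <T extends Comparable<T>> T max(T a, T b) {
        return a.compareTo(b) >= 0 ? a : b;
    }

    public static void main(String[] args) {
        System.out.println(max(3, 7));             // Integer is Comparable<Integer>
        System.out.println(max("apple", "pear"));  // String is Comparable<String>
    }
}
```

Calling max with a type that has no natural ordering, such as Object, is rejected at compile time instead of failing at runtime.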


Corrected HeapPQ class

public class HeapPriorityQueue<E extends Comparable<E>>
        implements PriorityQueue<E> {
    private E[] elements;
    private int size;

    // constructs a new empty priority queue
    public HeapPriorityQueue() {
        // With the bound, E[] erases to Comparable[], so the array must be
        // a Comparable[]; casting an Object[] here would throw a
        // ClassCastException at runtime.
        elements = (E[]) new Comparable[10];
        size = 0;
    }
    ...
    public void add(E value) {
        ...
        while (...) {
            if (elements[index].compareTo(elements[parent]) < 0) {
                swap(...);
            }
        }
    }
}


Ordering and Comparators


What's the "natural" order?

public class Rectangle implements Comparable<Rectangle> {
    private int x, y, width, height;

    public int compareTo(Rectangle other) {
        // ...?
    }
}

• What is the "natural ordering" of rectangles?
– By x, breaking ties by y?
– By width, breaking ties by height?
– By area? By perimeter?

• Do rectangles have any "natural" ordering?
– Might we want to arrange rectangles into some order anyway?


Comparator interface

public interface Comparator<T> {
    public int compare(T first, T second);
}

• A Comparator is an external object that specifies a comparison function over some other type of objects.
– Allows you to define multiple orderings for the same type.
– Allows you to define a specific ordering for a type even if there is no obvious "natural" ordering for that type.
– Allows you to externally define an ordering for a class that, for whatever reason, you are not able to modify to make it Comparable:
• a class that is part of the Java class libraries
• a class that is final and can't be extended
• a class from another library or author, that you don't control
• ...


Comparator examples

public class RectangleAreaComparator implements Comparator<Rectangle> {
    // compare in ascending order by area (WxH)
    public int compare(Rectangle r1, Rectangle r2) {
        // Integer.compare avoids the overflow that subtraction can cause
        return Integer.compare(r1.getArea(), r2.getArea());
    }
}

public class RectangleXYComparator implements Comparator<Rectangle> {
    // compare by ascending x, break ties by y
    public int compare(Rectangle r1, Rectangle r2) {
        if (r1.getX() != r2.getX()) {
            return Integer.compare(r1.getX(), r2.getX());
        } else {
            return Integer.compare(r1.getY(), r2.getY());
        }
    }
}


Using Comparators

• TreeSet, TreeMap, and PriorityQueue can use a Comparator:

Comparator<Rectangle> comp = new RectangleAreaComparator();
Set<Rectangle> set = new TreeSet<Rectangle>(comp);
Queue<Rectangle> pq = new PriorityQueue<Rectangle>(10, comp);

• Searching and sorting methods can accept Comparators.
Arrays.binarySearch(array, value, comparator)
Arrays.sort(array, comparator)
Collections.binarySearch(list, value, comparator)
Collections.max(collection, comparator)
Collections.min(collection, comparator)
Collections.sort(list, comparator)

• Methods are provided to reverse a Comparator's ordering:
public static Comparator Collections.reverseOrder()
public static Comparator Collections.reverseOrder(comparator)
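A few of the library calls listed above can be tried with a small comparator. The sketch below uses a hypothetical AbsoluteValueComparator (invented for this example, not from the slides) that orders Integers by absolute value; the library methods themselves are the standard java.util ones.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical comparator for illustration: orders by absolute value.
class AbsoluteValueComparator implements Comparator<Integer> {
    public int compare(Integer a, Integer b) {
        return Integer.compare(Math.abs(a), Math.abs(b));
    }
}

class ComparatorUsageDemo {
    public static void main(String[] args) {
        Integer[] values = {-5, 2, -1, 4};
        Comparator<Integer> comp = new AbsoluteValueComparator();

        Arrays.sort(values, comp);
        System.out.println(Arrays.toString(values));      // [-1, 2, 4, -5]

        Arrays.sort(values, Collections.reverseOrder(comp));
        System.out.println(Arrays.toString(values));      // [-5, 4, 2, -1]

        List<Integer> list = Arrays.asList(values);
        System.out.println(Collections.max(list, comp));  // -5
    }
}
```

Note that the array must hold Integer objects rather than ints, since Comparators work only with object types.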


PQ and Comparator

• Our heap priority queue currently relies on the Comparable natural ordering of its elements:

public class HeapPriorityQueue<E extends Comparable<E>>
        implements PriorityQueue<E> {
    ...
    public HeapPriorityQueue() {...}
}

• To allow other orderings, we can add a constructor that accepts a Comparator so clients can arrange elements in any order:

    ...
    public HeapPriorityQueue(Comparator<E> comp) {...}
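One possible way to support both constructors is sketched below. This is an assumption about the design, not the slides' implementation: the class name ComparatorHeapSketch is hypothetical, the Comparable bound on E is dropped so that any type can be stored when a Comparator is supplied, and a null comparator is taken to mean "use the natural ordering" (the same convention java.util.PriorityQueue follows).

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.NoSuchElementException;

class ComparatorHeapSketch<E> {
    private Object[] elements = new Object[10];  // index 0 is left unused
    private int size = 0;
    private final Comparator<E> comp;            // null = natural ordering

    ComparatorHeapSketch() { this(null); }

    ComparatorHeapSketch(Comparator<E> comp) { this.comp = comp; }

    public boolean isEmpty() { return size == 0; }

    // Compares the elements stored at indexes i and j.
    @SuppressWarnings("unchecked")
    private int compare(int i, int j) {
        E a = (E) elements[i];
        E b = (E) elements[j];
        if (comp != null) {
            return comp.compare(a, b);
        }
        return ((Comparable<E>) a).compareTo(b);  // fails if E is not Comparable
    }

    public void add(E value) {
        if (size == elements.length - 1) {
            elements = Arrays.copyOf(elements, 2 * elements.length);
        }
        elements[++size] = value;                 // place as the last leaf
        for (int i = size; i > 1 && compare(i, i / 2) < 0; i /= 2) {
            swap(i, i / 2);                       // bubble up
        }
    }

    @SuppressWarnings("unchecked")
    public E remove() {
        if (size == 0) throw new NoSuchElementException("heap is empty");
        E result = (E) elements[1];
        elements[1] = elements[size];             // last leaf becomes the root
        elements[size--] = null;
        int i = 1;
        while (2 * i <= size) {                   // bubble down
            int child = 2 * i;
            if (child + 1 <= size && compare(child + 1, child) < 0) child++;
            if (compare(i, child) <= 0) break;
            swap(i, child);
            i = child;
        }
        return result;
    }

    private void swap(int i, int j) {
        Object temp = elements[i];
        elements[i] = elements[j];
        elements[j] = temp;
    }
}
```

Routing every comparison through one private compare method keeps add and remove identical for both orderings; only the constructor decides which ordering is in effect.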


PQ Comparator exercise

• Write code that stores strings in a priority queue and reads them back out in ascending order by length.
– If two strings are the same length, break the tie by ABC order.

Queue<String> pq = new PriorityQueue<String>(...);
pq.add("you");
pq.add("meet");
pq.add("madam");
pq.add("sir");
pq.add("hello");
pq.add("goodbye");
while (!pq.isEmpty()) {
    System.out.print(pq.remove() + " ");
}
// sir you meet hello madam goodbye


PQ Comparator answer

• Use the following comparator class to organize the strings:

public class LengthComparator implements Comparator<String> {
    public int compare(String s1, String s2) {
        if (s1.length() != s2.length()) {
            // if lengths are unequal, compare by length
            return s1.length() - s2.length();
        } else {
            // break ties by ABC order
            return s1.compareTo(s2);
        }
    }
}
...
Queue<String> pq = new PriorityQueue<String>(100, new LengthComparator());


Heap sort

• heap sort: An algorithm to sort an array of N elements by turning the array into a heap, then calling remove N times.
– The elements will come out in sorted order.
– We can put them into a new sorted array.
– What is the runtime?


Heap sort implementation

public static void heapSort(int[] a) {
    PriorityQueue<Integer> pq = new HeapPriorityQueue<Integer>();
    for (int n : a) {
        pq.add(n);   // add each element (not the array itself)
    }
    for (int i = 0; i < a.length; i++) {
        a[i] = pq.remove();
    }
}

– This code is correct and runs in O(N log N) time but wastes memory.

– It makes an entire copy of the array a into the internal heap of the priority queue.

– Can we perform a heap sort without making a copy of a?


Improving the code

• Idea: Treat a itself as a max-heap, whose data starts at 0 (not 1).
– a is not actually in heap order.
– But if you repeatedly "bubble down" each non-leaf node, starting from the last one, you will eventually have a proper heap.

• Now that a is a valid max-heap:
– Call remove repeatedly until the heap is empty.
– But make it so that when an element is "removed", it is moved to the end of the array instead of completely evicted from the array.
– When you are done, voila! The array is sorted.
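The two phases described above can be sketched as follows. This is one possible rendering, not the slides' code: the class and helper names are hypothetical, and because the heap starts at index 0, the children of index i are assumed to sit at 2i + 1 and 2i + 2.

```java
import java.util.Arrays;

class InPlaceHeapSort {
    static void heapSort(int[] a) {
        int n = a.length;
        // Phase 1: build a max-heap by bubbling down every non-leaf node,
        // starting from the last one (index n / 2 - 1) and moving to the root.
        for (int i = n / 2 - 1; i >= 0; i--) {
            bubbleDown(a, i, n);
        }
        // Phase 2: repeatedly "remove" the max by swapping it with the last
        // element of the heap, shrinking the heap by one, and re-fixing the root.
        for (int end = n - 1; end > 0; end--) {
            swap(a, 0, end);
            bubbleDown(a, 0, end);   // heap now occupies a[0..end-1]
        }
    }

    // Moves a[index] down until it is >= both children, within a[0..size-1].
    private static void bubbleDown(int[] a, int index, int size) {
        while (2 * index + 1 < size) {
            int child = 2 * index + 1;                  // left child
            if (child + 1 < size && a[child + 1] > a[child]) {
                child++;                                // right child is larger
            }
            if (a[index] >= a[child]) return;
            swap(a, index, child);
            index = child;
        }
    }

    private static void swap(int[] a, int i, int j) {
        int temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }

    public static void main(String[] args) {
        int[] a = {21, 66, 40, 10, 70, 81, 30, 22, 45, 95, 88, 38};
        heapSort(a);
        System.out.println(Arrays.toString(a));
    }
}
```

A max-heap is used so that each removed maximum lands in the slot just vacated at the end of the heap, which leaves the array in ascending order without any extra storage.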


Step 1: Build heap in-place

• "Bubble" down non-leaf nodes until the array is a max-heap:
– int[] a = {21, 66, 40, 10, 70, 81, 30, 22, 45, 95, 88, 38};
– Swap each node with its larger child as needed.

[Figure: the array drawn as a complete binary tree rooted at 21.]

index:  0  1  2  3  4  5  6  7  8  9 10 11
value: 21 66 40 10 70 81 30 22 45 95 88 38
size: 12


Build heap in-place answer

– 30: nothing to do
– 81: nothing to do
– 70: swap with 95
– 10: swap with 45
– 40: swap with 81
– 66: swap with 95, then 88
– 21: swap with 95, then 88, then 70

[Figure: the resulting max-heap drawn as a tree rooted at 95.]

index:  0  1  2  3  4  5  6  7  8  9 10 11
value: 95 88 81 45 70 40 30 22 10 66 21 38
size: 12


Remove to sort

• Now that we have a max-heap, remove elements repeatedly until we have a sorted array.
– Move each removed element to the end, rather than tossing it.

[Figure: the max-heap drawn as a tree rooted at 95.]

index:  0  1  2  3  4  5  6  7  8  9 10 11
value: 95 88 81 45 70 40 30 22 10 66 21 38
size: 12


Remove to sort answer

– 95: move 38 up, swap with 88, 70, 66
– 88: move 21 up, swap with 81, 40
– 81: move 38 up, swap with 70, 66
– 70: move 10 up, swap with 66, 45, 22
– ...

– (Notice that after 4 removes, the last 4 elements in the array are sorted. If we remove every element, the entire array will be sorted.)

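[Figure: the remaining heap drawn as a tree rooted at 66, with the removed elements 70, 81, 88, 95 at the end.]

index:  0  1  2  3  4  5  6  7  8  9 10 11
value: 66 45 40 22 38 21 30 10 70 81 88 95
size: 8 (the last 4 slots hold the removed elements)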