Chapter 4, Part II Sorting Algorithms

Heap Details

• A heap is a tree structure where for each subtree the value stored at the root is larger than all of the values stored in the subtree

• There is no ordering between the children of any node other than that they are smaller

Heap Details

• A heap is also a complete tree, so nodes are filled in along the bottom of the tree from left to right and a new level is started only when the previous level has been filled

• The largest value stored in a heap will be in the root of the heap and the smallest value will be in one of the leaves

Heap Example

Heapsort

• Heapsort begins by constructing a heap

• The root (the largest value in the heap) is moved to the last location of the list

• The heap is fixed and the process is repeated

Heap Storage

• We can store the heap using an array

• For an element at location i, its children will be in locations 2i and 2i+1

• If 2i and 2i+1 are greater than the list size, then the element at location i is a leaf

• If only 2i+1 is greater than the list size, then the element at location i has just one child
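
The index rules above can be tried out directly. The following is a minimal Python sketch, assuming 1-based locations as on the slide; the helper names `children` and `is_leaf` and the sample values are illustrative and not part of the original material.

```python
def children(i, size):
    """Return the locations of the children of location i in a heap of `size` elements."""
    # Candidates are 2i and 2i+1; keep only those that fall inside the list.
    return [c for c in (2 * i, 2 * i + 1) if c <= size]


def is_leaf(i, size):
    """A location is a leaf when even its first child, 2i, is past the end of the list."""
    return 2 * i > size


# Sample heap stored in locations 1..6 (location 0 is unused padding so that
# the 1-based arithmetic from the slide applies unchanged).
heap = [None, 90, 70, 80, 30, 60, 50]
size = len(heap) - 1

for i in range(1, size + 1):
    kids = [heap[c] for c in children(i, size)]
    print(i, heap[i], "leaf" if is_leaf(i, size) else kids)
```

Location 3 in this sample has only one child, because 2i = 6 is inside the list while 2i+1 = 7 is not.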

Heap Construction Example

Final Heapsort Loop

Final Heapsort Loop

Heapsort Algorithm

construct the heap
for i = 1 to N do
    copy the root to the list
    fix the heap
end for

FixHeap Algorithm

vacant = root
while 2*vacant ≤ bound do
    largerChild = 2*vacant
    if (largerChild < bound) and (list[largerChild+1] > list[largerChild]) then
        largerChild = largerChild + 1
    end if
    if key > list[ largerChild ] then
        break
    else
        list[ vacant ] = list[ largerChild ]
        vacant = largerChild
    end if
end while
list[ vacant ] = key

Constructing the Heap

for i = N/2 down to 1 do
    FixHeap( list, i, list[ i ], N )
end for

The Final Heapsort Algorithm

for i = N/2 down to 1 do
    FixHeap( list, i, list[ i ], N )
end for
for i = N down to 2 do
    max = list[ 1 ]
    FixHeap( list, 1, list[ i ], i-1 )
    list[ i ] = max
end for
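
The FixHeap and heapsort pseudocode can be transcribed into runnable form. The following Python sketch is one possible rendering, assuming the list is padded with an unused slot at index 0 so that the 1-based index arithmetic carries over unchanged; the function names and the padding trick are choices made for this example.

```python
def fix_heap(lst, root, key, bound):
    """Sift `key` down from location `root`, looking only at lst[1..bound]."""
    vacant = root
    while 2 * vacant <= bound:
        larger_child = 2 * vacant
        # Pick the larger of the two children, if a second child exists.
        if larger_child < bound and lst[larger_child + 1] > lst[larger_child]:
            larger_child += 1
        if key > lst[larger_child]:
            break  # key belongs in the vacant location
        # Otherwise move the larger child up and continue one level down.
        lst[vacant] = lst[larger_child]
        vacant = larger_child
    lst[vacant] = key


def heapsort(values):
    """Return a new list with the values in increasing order."""
    lst = [None] + list(values)  # index 0 is padding; data lives in 1..n
    n = len(values)
    # Construction phase: fix every non-leaf location, bottom up.
    for i in range(n // 2, 0, -1):
        fix_heap(lst, i, lst[i], n)
    # Main loop: move the root to the end, then re-fix the smaller heap.
    for i in range(n, 1, -1):
        largest = lst[1]
        fix_heap(lst, 1, lst[i], i - 1)
        lst[i] = largest
    return lst[1:]


print(heapsort([5, 3, 8, 1, 9, 2]))
```

Returning a copy rather than sorting in place is a convenience for the example; the slide's version sorts the list in place.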

Worst-Case Analysis

• We analyze FixHeap because the rest of the analysis depends on it

• For each level of the heap, FixHeap does two comparisons – one between the two children and the other between the new value and the larger child

• For a heap with D levels, there will be at most 2D comparisons

Worst-Case Analysis

• During heap construction, FixHeap is called N/2 times

• On the first pass, the heap will have depth of 1

• On the last pass, the heap will have depth of lg N

• We need to determine how many nodes there are on each of the levels

Worst-Case Analysis

• For binary trees, we know that there is 1 node on the first level, 2 nodes on the second level, 4 nodes on the third level, and so on

• Putting this together gives:

$$W_{\text{Construction}}(N) = \sum_{i=0}^{D-1} 2 \cdot (D - i) \cdot 2^{i} \quad \text{for } D = \lg N$$

$$= 4 \cdot N - 2 \cdot \lg N - 4 = O(N)$$
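
As a sanity check on the construction bound, the sum can be evaluated numerically. The sketch below assumes the reading that level i holds 2^i nodes, each costing at most 2(D - i) comparisons, with N = 2^D; under that assumption the total matches 4N - 2 lg N - 4 exactly. The function name is illustrative.

```python
def construction_sum(depth):
    """Sum 2 * (depth - i) * 2**i over the levels i = 0 .. depth-1."""
    return sum(2 * (depth - i) * 2**i for i in range(depth))


for depth in range(1, 11):
    n = 2**depth  # N = 2^D, so lg N = depth
    closed_form = 4 * n - 2 * depth - 4
    assert construction_sum(depth) == closed_form
    print(n, construction_sum(depth), closed_form)
```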

Worst-Case Analysis

• In the second loop, the size of the heap decreases by one each pass

• If there are k nodes left in the heap, then the heap will have a depth of lg k

• This gives:

$$W_{\text{Loop}}(N) = \sum_{k=1}^{N-1} 2 \lg k = O(N \lg N)$$

Worst-Case Analysis

• Overall, the worst case is given by:

$$W(N) = W_{\text{Construction}}(N) + W_{\text{Loop}}(N)$$

$$= O(N) + O(N \lg N) = O(N \lg N)$$

Best-Case Analysis

• In the best case, the elements will be in the array in reverse order (largest first), so the list is already a heap

• The construction phase will still be of O(N)

• Once the heap is constructed, the main loop will take the same O(N lg N) work

• So, the best case for heapsort is also O(N lg N)

Average-Case Analysis

• Average case must be between the best case and the worst case

• The average case for heapsort must be O(N lg N), because best and worst case are both O(N lg N)

Merge Sort

• If you have two sorted lists, you can create a combined sorted list if you merge the lists

• We know that the smallest value will be the first one in either of the two lists

• If we move the smallest value to the new list, we can repeat the process until the entire list is created

Merge Sort Example

Merge Sort Example (continued)

Merge Sort Example (continued)

The Algorithm

if first < last then
    middle = ( first + last ) / 2
    MergeSort( list, first, middle )
    MergeSort( list, middle + 1, last )
    MergeLists( list, first, middle, middle + 1, last )
end if

MergeLists Algorithm, Part 1

finalStart = start1
finalEnd = end2
indexC = 1
while (start1 ≤ end1) and (start2 ≤ end2) do
    if list[start1] < list[start2] then
        result[indexC] = list[start1]
        start1 = start1 + 1
    else
        result[indexC] = list[start2]
        start2 = start2 + 1
    end if
    indexC = indexC + 1
end while

MergeLists Algorithm, Part 2

if start1 ≤ end1 then
    for i = start1 to end1 do
        result[indexC] = list[i]
        indexC = indexC + 1
    end for
else
    for i = start2 to end2 do
        result[indexC] = list[i]
        indexC = indexC + 1
    end for
end if

MergeLists Algorithm, Part 3

indexC = 1
for i = finalStart to finalEnd do
    list[i] = result[indexC]
    indexC = indexC + 1
end for
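
The three parts of the merge step and the recursive driver can be put together as follows. This is a minimal Python sketch, assuming 0-based inclusive index ranges and a temporary Python list in place of the `result` array; the function names are adapted for this example.

```python
def merge_lists(lst, start1, end1, start2, end2):
    """Merge the sorted ranges lst[start1..end1] and lst[start2..end2] in place."""
    final_start, final_end = start1, end2
    result = []
    # Part 1: repeatedly take the smaller front element of the two ranges.
    while start1 <= end1 and start2 <= end2:
        if lst[start1] < lst[start2]:
            result.append(lst[start1])
            start1 += 1
        else:
            result.append(lst[start2])
            start2 += 1
    # Part 2: copy whatever is left over (at most one of these is non-empty).
    result.extend(lst[start1:end1 + 1])
    result.extend(lst[start2:end2 + 1])
    # Part 3: copy the merged result back into the original list.
    lst[final_start:final_end + 1] = result


def merge_sort(lst, first, last):
    """Sort lst[first..last] in place."""
    if first < last:
        middle = (first + last) // 2
        merge_sort(lst, first, middle)
        merge_sort(lst, middle + 1, last)
        merge_lists(lst, first, middle, middle + 1, last)


data = [6, 2, 4, 7, 1, 3, 8, 5]
merge_sort(data, 0, len(data) - 1)
print(data)
```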

MergeLists Analysis

• The best case is when the elements of one list are larger than all of the elements of the other list

• One worst case is when the elements are interleaved

• If each list has N elements, we will do N comparisons in the best case, and 2N-1 comparisons in the worst case
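
The two comparison counts can be observed by instrumenting the merge loop. The sketch below assumes two separate Python lists rather than two ranges of one list, and counts only the element comparisons in the main loop; the helper name is illustrative.

```python
def count_merge_comparisons(a, b):
    """Count the element comparisons made while merging sorted lists a and b."""
    i = j = comparisons = 0
    while i < len(a) and j < len(b):
        comparisons += 1
        if a[i] < b[j]:
            i += 1
        else:
            j += 1
    # The leftover elements are copied without any further comparisons.
    return comparisons


n = 5
# Best case: every element of one list is smaller than all of the other list.
best = count_merge_comparisons(list(range(n)), list(range(n, 2 * n)))
# A worst case: the two lists interleave perfectly.
worst = count_merge_comparisons(list(range(0, 2 * n, 2)), list(range(1, 2 * n, 2)))
print(best, worst)  # N and 2N-1 for N = 5
```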

MergeSort Analysis

• MergeSort divides the list in half each time, so the difference between the best and worst cases is how much work MergeLists does

• In the analysis, we consider that a list of N elements gets broken into two lists of N/2 elements that are recursively sorted and then merged together

MergeSort Analysis

• The worst case is:

W(N) = 2W(N/2) + N – 1
W(0) = W(1) = 0

which solves to W(N) = O(N lg N)

• The best case is:

B(N) = 2B(N/2) + N/2
B(0) = B(1) = 0

which solves to B(N) = O(N lg N)
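
Both recurrences can be evaluated directly to see the N lg N growth. The following Python sketch assumes N is a power of two so that N/2 is always exact; for such N the recurrences give W(N) = N lg N - N + 1 and B(N) = (N lg N)/2, which the loop checks. The function names are illustrative.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def worst(n):
    """W(N) = 2 W(N/2) + N - 1, with W(0) = W(1) = 0."""
    return 0 if n <= 1 else 2 * worst(n // 2) + n - 1


@lru_cache(maxsize=None)
def best(n):
    """B(N) = 2 B(N/2) + N/2, with B(0) = B(1) = 0."""
    return 0 if n <= 1 else 2 * best(n // 2) + n // 2


for k in range(1, 11):
    n = 2**k  # lg N = k
    assert worst(n) == n * k - n + 1
    assert best(n) == n * k // 2
    print(n, worst(n), best(n))
```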

Quicksort

• In Chapter 3, we saw a partition process used to help us find the Kth largest element in a list

• We now use the partitioning process to help us sort a list

• We now will apply the process to both parts of the list instead of just one of them

Quicksort

• Quicksort will partition a list into two pieces:

– Those elements smaller than the pivot value

– Those elements larger than the pivot value

• Quicksort is then called recursively on both pieces

Quicksort Example

Quicksort Example (continued)

Quicksort Algorithm

if first < last then
    pivot = PivotList( list, first, last )
    Quicksort( list, first, pivot-1 )
    Quicksort( list, pivot+1, last )
end if

Partitioning Process

• The algorithm moves through the list comparing values to the pivot

• During this process, there are sections of elements as indicated below

PivotList Algorithm

PivotValue = list[ first ]
PivotPoint = first
for index = first+1 to last do
    if list[ index ] < PivotValue then
        PivotPoint = PivotPoint + 1
        Swap( list[ PivotPoint ], list[ index ] )
    end if
end for
// move pivot value into correct place
Swap( list[ first ], list[ PivotPoint ] )
return PivotPoint
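
PivotList and the recursive Quicksort driver translate almost line for line into Python. The sketch below assumes 0-based inclusive index ranges and uses tuple assignment in place of Swap; the function names are adapted for this example.

```python
def pivot_list(lst, first, last):
    """Partition lst[first..last] around lst[first]; return the pivot's final index."""
    pivot_value = lst[first]
    pivot_point = first
    for index in range(first + 1, last + 1):
        if lst[index] < pivot_value:
            # Grow the "smaller than pivot" section by one and swap the value in.
            pivot_point += 1
            lst[pivot_point], lst[index] = lst[index], lst[pivot_point]
    # Move the pivot value into its correct place.
    lst[first], lst[pivot_point] = lst[pivot_point], lst[first]
    return pivot_point


def quicksort(lst, first, last):
    """Sort lst[first..last] in place."""
    if first < last:
        pivot = pivot_list(lst, first, last)
        quicksort(lst, first, pivot - 1)
        quicksort(lst, pivot + 1, last)


data = [6, 2, 4, 7, 1, 3, 8, 5]
quicksort(data, 0, len(data) - 1)
print(data)
```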

Worst-Case Analysis

• In the worst case, PivotList will do N – 1 comparisons, but create one partition that has N – 1 elements and the other will have no elements

• Because it winds up just reducing the partition by one element each time, worst case is given by:

$$W(N) = \sum_{i=2}^{N} (i - 1) = \frac{N(N-1)}{2} = O(N^2)$$
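
The quadratic worst case is easy to reproduce: with the first element as the pivot, an already sorted list leaves one partition empty on every call. The sketch below counts the comparisons made against the pivot value; it assumes an explicit stack instead of recursion (so deep, unbalanced partitions do not hit Python's recursion limit), and the function name is illustrative.

```python
def count_quicksort_comparisons(values):
    """Count comparisons against the pivot when the first element is the pivot."""
    lst = list(values)
    comparisons = 0
    stack = [(0, len(lst) - 1)]  # pending (first, last) ranges to partition
    while stack:
        first, last = stack.pop()
        if first >= last:
            continue
        pivot_value = lst[first]
        pivot_point = first
        for index in range(first + 1, last + 1):
            comparisons += 1
            if lst[index] < pivot_value:
                pivot_point += 1
                lst[pivot_point], lst[index] = lst[index], lst[pivot_point]
        lst[first], lst[pivot_point] = lst[pivot_point], lst[first]
        stack.append((first, pivot_point - 1))
        stack.append((pivot_point + 1, last))
    return comparisons


n = 20
print(count_quicksort_comparisons(range(n)), n * (n - 1) // 2)
```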

Average-Case Analysis

• In the average case, we need to consider all of the possible places where the pivot point winds up

• Because there are N – 1 comparisons done to partition the list, and there are N ways this can be done, we have:

$$A(N) = (N - 1) + \frac{1}{N} \sum_{i=1}^{N} \left[ A(i-1) + A(N-i) \right]$$

$$A(1) = A(0) = 0$$

Average-Case Analysis

• Algebra can be used to simplify this recurrence relation to:

$$A(N) = \frac{(N+1) \cdot A(N-1) + 2N - 2}{N}$$

$$A(1) = A(0) = 0$$

• This will then solve to:

$$A(N) \approx 1.4 \, (N+1) \lg N$$
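
The simplification can be checked without redoing the algebra. The sketch below assumes the full recurrence is A(N) = (N - 1) + (1/N) Σ [A(i-1) + A(N-i)] for i = 1..N and the simplified one is A(N) = ((N+1) A(N-1) + 2N - 2) / N, both with A(0) = A(1) = 0; it evaluates the two forms with exact fractions and confirms they agree. The function names are illustrative.

```python
from fractions import Fraction


def average_full(limit):
    """Evaluate the summation form of the recurrence for N = 0 .. limit."""
    a = [Fraction(0), Fraction(0)]
    for n in range(2, limit + 1):
        total = sum(a[i - 1] + a[n - i] for i in range(1, n + 1))
        a.append((n - 1) + total / n)
    return a


def average_simplified(limit):
    """Evaluate the simplified form of the recurrence for N = 0 .. limit."""
    a = [Fraction(0), Fraction(0)]
    for n in range(2, limit + 1):
        a.append(((n + 1) * a[n - 1] + 2 * n - 2) / n)
    return a


full = average_full(30)
simple = average_simplified(30)
assert full == simple  # the two forms agree term by term
print(float(simple[30]))
```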

External Polyphase Merge Sort

• Used when the data to be sorted is so large it will not fit in the computer’s memory

• External files are used to hold partial results

• Read in as many records as possible into memory and then sort them using one of the other sorts

• Alternate writing these runs of sorted records to one of two files

External Polyphase Merge Sort

• Merge pairs of runs from the two files into one run that is twice the length

• To do this, the runs might have to be read into memory in pieces, but the two runs must be merged completely before moving on to the next pair of runs

• This doubles the run length and halves the number of runs

External Polyphase Merge Sort

• The bigger runs are written alternately between two new files

• The process continues to merge pairs of runs until the entire data set has been merged back into a single sorted file

Run Creation Algorithm

CurrentFile = A
while not at the end of the input file do
    read S records from the input file
    sort the S records
    write the records to file CurrentFile
    if CurrentFile == A then
        CurrentFile = B
    else
        CurrentFile = A
    end if
end while

Run Merge Algorithm

Size = S
Input1 = A
Input2 = B
CurrentOutput = C
while not done do
    *** Merge runs process on next slide ***
    Size = Size * 2
    if Input1 == A then
        Input1 = C
        Input2 = D
        CurrentOutput = A
    else
        Input1 = A
        Input2 = B
        CurrentOutput = C
    end if
end while

Merge Runs Process

while more runs this pass do
    Merge one run of length Size from file Input1
        with one run of length Size from file Input2
        sending output to CurrentOutput
    if CurrentOutput == A then
        CurrentOutput = B
    elseif CurrentOutput == B then
        CurrentOutput = A
    elseif CurrentOutput == C then
        CurrentOutput = D
    elseif CurrentOutput == D then
        CurrentOutput = C
    end if
end while
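
The run creation and run merging steps can be simulated in memory to see how the runs double in length on each pass. The following Python sketch is only a model: it assumes each "file" is a Python list of runs rather than a real file, uses the standard library's `heapq.merge` to merge two sorted runs, and returns fresh output lists instead of alternating between the file pairs A/B and C/D. All function names are illustrative.

```python
import heapq


def create_runs(records, run_size):
    """Sort run_size records at a time, dealing the runs alternately to two 'files'."""
    files = {"A": [], "B": []}
    current = "A"
    for start in range(0, len(records), run_size):
        files[current].append(sorted(records[start:start + run_size]))
        current = "B" if current == "A" else "A"
    return files["A"], files["B"]


def merge_pass(input1, input2):
    """Merge the k-th runs of the two inputs, alternating between two outputs."""
    outputs = ([], [])
    for k in range(len(input1)):
        # The second input may be one run short when the run count is odd.
        run2 = input2[k] if k < len(input2) else []
        outputs[k % 2].append(list(heapq.merge(input1[k], run2)))
    return outputs


def external_sort(records, run_size):
    """Repeat merge passes until a single sorted run remains."""
    input1, input2 = create_runs(records, run_size)
    while len(input1) + len(input2) > 1:
        input1, input2 = merge_pass(input1, input2)
    return input1[0] if input1 else []


data = [9, 4, 7, 1, 8, 2, 6, 3, 5, 0, 11, 10]
print(external_sort(data, 3))
```

A real implementation would stream records from disk in pieces instead of holding whole runs in memory, which is the situation the slides describe.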

Run Creation Analysis

• This analysis assumes that there are N elements in the list and that they are broken down into R runs of S elements (N = R * S)

• If we use an efficient sort to create the runs, each run will take O(S lg S) and there will be R of them, for a total time of O(R * S * lg S) = O(N lg S)

Run Merging Analysis

• On the first pass, we have R runs of S elements, so there will be R/2 merges that can take up to 2S – 1 comparisons, which is R/2 * (2S – 1) = R*S – R/2

• On the second pass, we will have R/2 runs of 2S elements, so there will be R/4 merges that can take up to 4S – 1 comparisons, which is R/4 * (4S – 1) = R*S – R/4

Run Merging Analysis

• There will be lg R passes of the merge phase, so that the complexity is given by:

$$\sum_{i=1}^{\lg R} \left( R \cdot S - \frac{R}{2^{i}} \right) \approx N \cdot \lg R - R$$
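
The per-pass counts from the previous slide, R*S - R/2^i comparisons on pass i, can be summed numerically. The sketch below assumes R is a power of two so that lg R is a whole number of passes; under that assumption the exact total is N lg R - R + 1, which is O(N lg R). The function name is illustrative.

```python
def merge_phase_comparisons(r, s):
    """Sum R*S - R/2**i over the passes i = 1 .. lg R (R a power of two)."""
    passes = r.bit_length() - 1  # equals lg R when R is a power of two
    return sum(r * s - r // 2**i for i in range(1, passes + 1))


for r in (2, 4, 8, 16, 64):
    for s in (1, 10, 100):
        n = r * s
        passes = r.bit_length() - 1
        assert merge_phase_comparisons(r, s) == n * passes - r + 1
        print(r, s, merge_phase_comparisons(r, s))
```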

External Polyphase Merge Sort Analysis

• Putting the run creation and run merging calculations together gives O(N lg S) + O(N lg R) = O(N lg (R * S)), so the overall complexity is O(N lg N)