Chapter 4, Part II Sorting Algorithms


Page 1: Chapter 04 02

8/10/2019 Chapter 04 02

http://slidepdf.com/reader/full/chapter-04-02 1/51

Chapter 4, Part II

Sorting Algorithms


Heap Details

• A heap is a tree structure where, for each subtree, the value stored at the root is larger than all of the values stored in the subtree
• There is no ordering between the children of any node other than that they are smaller than their parent


Heap Details

• A heap is also a complete tree, so nodes are filled in along the bottom of the tree from left to right and a new level is started only when the previous level has been filled
• The largest value stored in a heap will be in the root of the heap and the smallest value will be in one of the leaves


Heap Example


Heapsort

• Heapsort begins by constructing a heap
• The root (the largest value in the heap) is moved to the last location of the list
• The heap is fixed and the process is repeated


Heap Storage

• We can store the heap using an array
• For an element at location i, its children will be in locations 2i and 2i+1
• If 2i and 2i+1 are greater than the list size, then the element at location i is a leaf
• If only 2i+1 is greater than the list size, then the element at location i has just one child
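The index rules above can be sketched in Python. This is a minimal illustration, assuming 1-based locations as on the slide; the helper names `children` and `is_leaf` are hypothetical and do not come from the slides.

```python
def children(i, n):
    """Return the 1-based locations of the children of node i
    in a heap that holds n elements (hypothetical helper)."""
    # A child exists only if its location does not exceed the list size.
    return [c for c in (2 * i, 2 * i + 1) if c <= n]


def is_leaf(i, n):
    """A node is a leaf when even its left child (2i) is past the end."""
    return 2 * i > n


if __name__ == "__main__":
    n = 16
    for i in (1, 8, 9):
        print(i, children(i, n), is_leaf(i, n))
```

With 16 elements, location 8 has only the single child at location 16, and every location from 9 onward is a leaf.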


Heap Construction Example

[Figure: a list with indices 1 to 16; i = 8 (2i = 16) is marked as the last internal node, with the annotation "no change"]


Final Heapsort Loop



Final Heapsort Loop


Heapsort Algorithm

construct the heap
for i = 1 to N do
    copy the root to the list
    fix the heap
end for


FixHeap Algorithm

vacant = root
while 2*vacant ≤ bound do
    largerChild = 2*vacant
    if (largerChild < bound) and (list[largerChild+1] > list[largerChild]) then
        largerChild = largerChild + 1
    end if
    if key > list[ largerChild ] then
        break
    else
        list[ vacant ] = list[ largerChild ]
        vacant = largerChild
    end if
end while
list[ vacant ] = key

[Figure: the node at location i and its children at locations 2i and 2i+1]


Constructing the Heap

for i = N/2 down to 1 do
    FixHeap( list, i, list[ i ], N )
end for


Final Heapsort Algorithm

// Constructing the heap from the initial list
for i = N/2 down to 1 do
    FixHeap( list, i, list[ i ], N )
end for

// Sorting on the constructed heap
for i = N down to 2 do
    max = list[ 1 ]
    FixHeap( list, 1, list[ i ], i-1 )
    list[ i ] = max
end for
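The FixHeap and heapsort pseudocode can be transcribed into runnable Python roughly as follows. This is a sketch, not the slides' own code: it assumes the parameter order FixHeap( list, root, key, bound ) implied by the calls, and it pads the list with an unused slot at index 0 so the 1-based index arithmetic (2i, 2i+1) carries over unchanged.

```python
def fix_heap(lst, root, key, bound):
    """Sift `key` down from `root` within lst[1..bound].
    lst[0] is an unused placeholder so locations are 1-based."""
    vacant = root
    while 2 * vacant <= bound:
        larger_child = 2 * vacant
        # Pick the larger of the two children when a right child exists.
        if larger_child < bound and lst[larger_child + 1] > lst[larger_child]:
            larger_child += 1
        if key > lst[larger_child]:
            break  # key belongs at the vacant location
        lst[vacant] = lst[larger_child]  # move the larger child up
        vacant = larger_child
    lst[vacant] = key


def heapsort(values):
    """Return a new sorted list (ascending) built with the two loops above."""
    n = len(values)
    lst = [None] + list(values)  # shift to 1-based locations
    # Construct the heap, starting from the last internal node.
    for i in range(n // 2, 0, -1):
        fix_heap(lst, i, lst[i], n)
    # Repeatedly move the root to the end and fix the smaller heap.
    for i in range(n, 1, -1):
        largest = lst[1]
        fix_heap(lst, 1, lst[i], i - 1)
        lst[i] = largest
    return lst[1:]


if __name__ == "__main__":
    print(heapsort([5, 3, 8, 1, 9, 2]))
```

Returning a copy rather than sorting in place is a choice made here for easier testing; the pseudocode itself sorts the list in place.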


Worst-Case Analysis

• We analyze FixHeap because the rest of the analysis depends on it
• For each level of the heap, FixHeap does two comparisons: one between the children and the other between the new value and the larger child
• For a heap with D levels, there will be at most 2D comparisons


Worst-Case Analysis

• During heap construction, FixHeap is called N/2 times
• On the first pass, the heap will have a depth of 1
• On the last pass, the heap will have a depth of lg N
• We need to determine how many nodes there are on each of the levels


Worst-Case Analysis

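• For binary trees, we know that there is 1 node on the first level, 2 nodes on the second level, 4 nodes on the third level, and so on
• Putting this together gives:

\[
W_{\text{Construction}}(N) = \sum_{i=0}^{D-1} 2 \cdot (D - i) \cdot 2^{i} \quad \text{for } D = \lg N
\]
\[
W_{\text{Construction}}(N) = 4N - 2\lg N - 4 = O(N)
\]

In the sum, 2 is the number of comparisons per level, D - i is the maximum depth of the subtree rooted at level i, and 2^i is the number of nodes on level i (levels L-1 through L-4 in the figure correspond to i = 0 through i = 3).

(See p. 104 for the derivation.)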
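As a quick sanity check, the construction bound can be evaluated numerically. The sketch below assumes N = 2^D exactly and sums 2 comparisons times (D - i) levels times 2^i nodes over levels i = 0 to D - 1, then compares the total with the closed form 4N - 2 lg N - 4; the function names are hypothetical.

```python
def construction_comparisons(depth):
    """Sum 2 * (depth - i) * 2**i over levels i = 0 .. depth - 1."""
    return sum(2 * (depth - i) * 2**i for i in range(depth))


def closed_form(depth):
    """4N - 2 lg N - 4 with N = 2**depth, so lg N = depth."""
    n = 2**depth
    return 4 * n - 2 * depth - 4


if __name__ == "__main__":
    for depth in range(1, 11):
        print(2**depth, construction_comparisons(depth), closed_form(depth))
```

The two columns agree for every depth tried, which is consistent with the O(N) result for heap construction.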


Worst-Case Analysis

• In the second loop, the size of the heap decreases by one each pass
• If there are k nodes left in the heap, then the heap will have a depth of lg k
• This gives:

\[
W_{\text{Loop}}(N) = 2 \sum_{k=1}^{N-1} \lg k = O(N \lg N)
\]


Worst-Case Analysis

• Overall, the worst case is given by:

\[
W(N) = W_{\text{Construction}}(N) + W_{\text{Loop}}(N) = O(N) + O(N \lg N) = O(N \lg N)
\]


Best-Case Analysis

• In the best case, the elements will be in the array in reverse order
• The construction phase will still be O(N)
• Once the heap is constructed, the main loop will take the same O(N lg N) work
• So, the best case for heapsort is also O(N lg N)


Average-Case Analysis

• The average case must be between the best case and the worst case
• The average case for heapsort must be O(N lg N), because the best and worst cases are both O(N lg N)


Merge Sort

• If you have two sorted lists, you can create a combined sorted list if you merge the lists
• We know that the smallest value will be the first one in either of the two lists
• If we move the smallest value to the new list, we can repeat the process until the entire list is created


Merge Sort Example


Merge Sort Example (continued)


Merge Sort Example (continued)


The Algorithm

if first < last then
    middle = ( first + last ) / 2
    MergeSort( list, first, middle )
    MergeSort( list, middle + 1, last )
    MergeLists( list, first, middle, middle + 1, last )
end if


MergeLists Algorithm, Part 2

if start1 ≤ end1 then
    for i = start1 to end1 do
        result[indexC] = list[i]
        indexC = indexC + 1
    end for
else
    for i = start2 to end2 do
        result[indexC] = list[i]
        indexC = indexC + 1
    end for
end if


MergeLists Algorithm, Part 3

indexC = 1
for i = finalStart to finalEnd do
    list[i] = result[indexC]
    indexC = indexC + 1
end for
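The merge sort pseudocode can be sketched end to end in Python. Part 1 of the merge (the comparison loop) does not appear in this copy of the slides, so the version below is an assumption based on the description of merging: repeatedly move the smaller front value into the result. The sketch uses 0-based indices and takes from the first list on ties; both are choices made here rather than details from the slides.

```python
def merge_lists(lst, start1, end1, start2, end2):
    """Merge the sorted ranges lst[start1..end1] and lst[start2..end2]
    (inclusive, adjacent) back into lst."""
    final_start, final_end = start1, end2
    result = []
    # Part 1 (assumed): compare the front values and move the smaller one.
    while start1 <= end1 and start2 <= end2:
        if lst[start1] <= lst[start2]:
            result.append(lst[start1])
            start1 += 1
        else:
            result.append(lst[start2])
            start2 += 1
    # Part 2: copy whatever remains of the list that is not exhausted.
    if start1 <= end1:
        result.extend(lst[start1:end1 + 1])
    else:
        result.extend(lst[start2:end2 + 1])
    # Part 3: copy the merged result back into the original list.
    lst[final_start:final_end + 1] = result


def merge_sort(lst, first, last):
    """Sort lst[first..last] in place."""
    if first < last:
        middle = (first + last) // 2
        merge_sort(lst, first, middle)
        merge_sort(lst, middle + 1, last)
        merge_lists(lst, first, middle, middle + 1, last)


if __name__ == "__main__":
    data = [5, 2, 9, 1, 5, 6]
    merge_sort(data, 0, len(data) - 1)
    print(data)
```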


MergeLists Analysis

• The best case is when the elements of one list are larger than all of the elements of the other list
• One worst case is when the elements are interleaved
• If each list has N elements, we will do N comparisons in the best case, and 2N - 1 comparisons in the worst case


MergeSort Analysis

• MergeSort divides the list in half each time, so the difference between the best and worst cases is how much work MergeLists does
• In the analysis, we consider that a list of N elements gets broken into two lists of N/2 elements that are recursively sorted and then merged together


MergeSort Analysis

• The worst case is:
    W(N) = 2W(N/2) + N - 1
    W(0) = W(1) = 0
  which solves to W(N) = O(N lg N)
• The best case is:
    B(N) = 2B(N/2) + N/2
    B(0) = B(1) = 0
  which solves to B(N) = O(N lg N)
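A worked unrolling of the two recurrences shows where the N lg N comes from. This is a sketch under the simplifying assumption that N is a power of two, N = 2^k, so that every split is even:

```latex
\begin{align*}
W(N) &= 2W(N/2) + (N - 1) \\
     &= 4W(N/4) + (N - 2) + (N - 1) \\
     &= \dots = 2^{k} W(1) + \sum_{j=0}^{k-1} \left( N - 2^{j} \right) \\
     &= kN - (2^{k} - 1) = N \lg N - N + 1 \\[6pt]
B(N) &= 2B(N/2) + N/2 \\
     &= 4B(N/4) + N/2 + N/2 \\
     &= \dots = 2^{k} B(1) + k \cdot \frac{N}{2} = \frac{N \lg N}{2}
\end{align*}
```

For example, N = 4 gives W(4) = 5 and B(4) = 4, matching a direct evaluation of the recurrences. Both results are O(N lg N); the best case only saves a constant factor.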


Quicksort

• In Chapter 3, we saw a partition process used to help us find the Kth largest element in a list
• We now use the partitioning process to help us sort a list
• We now will apply the process to both parts of the list instead of just one of them


Quicksort

• Quicksort will partition a list into two pieces:

 – Those elements smaller than the pivot value

 – Those elements larger than the pivot value

• Quicksort is then called recursively on both pieces


Quicksort Example


Quicksort Example (continued)


Quicksort Algorithm

if first < last then
    pivot = PivotList( list, first, last )
    Quicksort( list, first, pivot-1 )
    Quicksort( list, pivot+1, last )
end if


Partitioning Process

• The algorithm moves through the list comparing values to the pivot
• During this process, there are sections of elements as indicated below


PivotList Algorithm

PivotValue = list[ first ]
PivotPoint = first
for index = first+1 to last do
    if list[ index ] < PivotValue then
        PivotPoint = PivotPoint + 1
        Swap( list[ PivotPoint ], list[ index ] )
    end if
end for
// move pivot value into correct place
Swap( list[ first ], list[ PivotPoint ] )
return PivotPoint
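The PivotList and Quicksort pseudocode can be transcribed into Python roughly as follows. This is a sketch that uses 0-based inclusive indices and tuple assignment in place of Swap; the snake_case names are this sketch's own.

```python
def pivot_list(lst, first, last):
    """Partition lst[first..last] around lst[first] and return the
    final location of the pivot value."""
    pivot_value = lst[first]
    pivot_point = first
    for index in range(first + 1, last + 1):
        if lst[index] < pivot_value:
            # Grow the "smaller than pivot" section by one element.
            pivot_point += 1
            lst[pivot_point], lst[index] = lst[index], lst[pivot_point]
    # Move the pivot value into its correct place.
    lst[first], lst[pivot_point] = lst[pivot_point], lst[first]
    return pivot_point


def quicksort(lst, first, last):
    """Sort lst[first..last] in place."""
    if first < last:
        pivot = pivot_list(lst, first, last)
        quicksort(lst, first, pivot - 1)
        quicksort(lst, pivot + 1, last)


if __name__ == "__main__":
    data = [3, 6, 1, 8, 2, 9, 2]
    quicksort(data, 0, len(data) - 1)
    print(data)
```

Because the pivot is always the first element, an already sorted list produces one empty partition on every call, which is the worst case analyzed on the next slide.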


Worst-Case Analysis

• In the worst case, PivotList will do N - 1 comparisons, but create one partition that has N - 1 elements and the other will have no elements
• Because it winds up just reducing the partition by one element each time, the worst case is given by:

\[
W(N) = \sum_{i=2}^{N} (i - 1) = \frac{N(N - 1)}{2} = O(N^{2})
\]


Average-Case Analysis

• In the average case, we need to consider all of the possible places where the pivot point winds up
• Because there are N - 1 comparisons done to partition the list, and there are N ways this can be done, we have:

\[
A(N) = (N - 1) + \frac{1}{N} \sum_{i=1}^{N} \left[ A(i - 1) + A(N - i) \right], \qquad A(1) = A(0) = 0
\]


Average-Case Analysis

• Algebra can be used to simplify this recurrence relation to:

\[
A(N) = \frac{(N + 1) \cdot A(N - 1) + 2N - 2}{N}, \qquad A(1) = A(0) = 0
\]

• This will then solve to:

\[
A(N) \approx 1.4 \, (N + 1) \lg N
\]
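The algebraic simplification can be checked numerically. The sketch below evaluates the full average-case recurrence, A(N) = (N - 1) + (1/N) * sum over i = 1..N of [A(i - 1) + A(N - i)], and the simplified form, A(N) = [(N + 1) * A(N - 1) + 2N - 2] / N, with exact rational arithmetic so the two can be compared without rounding error. The function names are hypothetical.

```python
from fractions import Fraction


def average_full(limit):
    """Evaluate the full recurrence for N = 0 .. limit; returns a list
    indexed by N, with A(0) = A(1) = 0."""
    a = [Fraction(0), Fraction(0)]
    for n in range(2, limit + 1):
        total = sum(a[i - 1] + a[n - i] for i in range(1, n + 1))
        a.append((n - 1) + total / n)
    return a


def average_simplified(limit):
    """Evaluate the simplified recurrence for N = 0 .. limit."""
    a = [Fraction(0), Fraction(0)]
    for n in range(2, limit + 1):
        a.append(((n + 1) * a[n - 1] + 2 * n - 2) / n)
    return a


if __name__ == "__main__":
    full = average_full(20)
    simple = average_simplified(20)
    print(full == simple, float(full[20]))
```

The two recurrences produce identical values for every N tried, for example A(2) = 1 and A(3) = 8/3.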


External Polyphase Merge Sort

• Used when the data to be sorted is so large it will not fit in the computer's memory
• External files are used to hold partial results
• Read in as many records as possible into memory and then sort them using one of the other sorts
• Alternate writing these runs of sorted records to one of two files


External Polyphase Merge Sort

• Merge pairs of runs from the two files into one run that is twice the length
• To do this, the runs might have to be read into memory in pieces, but the entire two runs must be merged before moving on to the next pair of runs
• This doubles the run length and halves the number of runs


External Polyphase Merge Sort

• The bigger runs are written alternately between two new files
• The process continues to merge pairs of runs until the entire data set has been merged back into a single sorted file


Run Creation Algorithm

CurrentFile = A
while not at the end of the input file do
    read S records from the input file
    sort the S records
    write the records to file CurrentFile
    if CurrentFile == A then
        CurrentFile = B
    else
        CurrentFile = A
    end if
end while


Run Merge Algorithm

Size = S
Input1 = A
Input2 = B
CurrentOutput = C
while not done do
    *** Merge Runs Process (see the next slide) ***
    Size = Size * 2
    if Input1 == A then
        Input1 = C
        Input2 = D
        CurrentOutput = A
    else
        Input1 = A
        Input2 = B
        CurrentOutput = C
    end if
end while


Merge Runs Process

while more runs this pass do
    Merge one run of length Size from file Input1 with one run of length Size from file Input2, sending output to CurrentOutput
    if CurrentOutput == A then
        CurrentOutput = B
    elseif CurrentOutput == B then
        CurrentOutput = A
    elseif CurrentOutput == C then
        CurrentOutput = D
    elseif CurrentOutput == D then
        CurrentOutput = C
    end if
end while
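The run creation and run merging steps can be simulated in memory to see how the passes fit together. The sketch below is an illustration under loud assumptions, not an external sort: each "file" is a Python list of runs rather than a file on disk, each pass builds two fresh output lists instead of switching between the file pairs A/B and C/D, and the standard library's `heapq.merge` stands in for the run-merging step. All function names are hypothetical.

```python
import heapq


def create_runs(records, s):
    """Split records into sorted runs of up to s records each, written
    alternately to two simulated files (lists of runs)."""
    files = {"A": [], "B": []}
    current = "A"
    for start in range(0, len(records), s):
        files[current].append(sorted(records[start:start + s]))
        current = "B" if current == "A" else "A"
    return files["A"], files["B"]


def merge_pass(input1, input2):
    """Merge the k-th run of input1 with the k-th run of input2,
    writing the longer runs alternately to two new simulated files."""
    out1, out2 = [], []
    current = out1
    for k in range(max(len(input1), len(input2))):
        run1 = input1[k] if k < len(input1) else []
        run2 = input2[k] if k < len(input2) else []
        current.append(list(heapq.merge(run1, run2)))
        current = out2 if current is out1 else out1
    return out1, out2


def external_merge_sort(records, s):
    """Create runs of size s, then merge passes until one run remains."""
    file1, file2 = create_runs(records, s)
    while len(file1) + len(file2) > 1:
        file1, file2 = merge_pass(file1, file2)
    return file1[0] if file1 else []


if __name__ == "__main__":
    print(external_merge_sort([9, 4, 7, 1, 8, 2, 6, 3, 5], 2))
```

Each pass roughly halves the number of runs and doubles their length, which is the behavior the analysis on the following slides relies on.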


Run Creation Analysis

• This analysis assumes that there are N elements in the list and that they are broken down into R runs of S elements (N = R * S)
• If we use an efficient sort to create the runs, each run will take O(S lg S) and there will be R of them for a total time of O(R * S * lg S) = O(N lg S)


Run Merging Analysis

• On the first pass, we have R runs of S elements, so there will be R/2 merges that can take up to 2S - 1 comparisons, which is R/2 * (2S - 1) = R*S - R/2
• On the second pass, we will have R/2 runs of 2S elements, so there will be R/4 merges that can take up to 4S - 1 comparisons, which is R/4 * (4S - 1) = R*S - R/4


Run Merging Analysis

• There will be lg R passes of the merge phase, so that the complexity is given by:

\[
\sum_{i=1}^{\lg R} \left( R \cdot S - \frac{R}{2^{i}} \right) = N \lg R - (R - 1) \approx N \lg R - R
\]


External Polyphase Merge Sort Analysis

• Putting the run creation and run merging calculations together, O(N lg S) + O(N lg R) = O(N lg (R * S)), we find that the overall complexity is O(N lg N)