8/10/2019 Chapter 04 02
http://slidepdf.com/reader/full/chapter-04-02 1/51
Chapter 4, Part II
Sorting Algorithms
Heap Details
• A heap is a tree structure where, for each subtree, the value stored at the root is larger than all of the values stored in the subtree
• There is no ordering between the children of a node other than that both are smaller than their parent
Heap Details
• A heap is also a complete tree, so nodes are filled in along the bottom of the tree from left to right, and a new level is started only when the previous level has been filled
• The largest value stored in a heap will be in
the root of the heap and the smallest value
will be in one of the leaves
Heap Example
Heapsort
• Heapsort begins by constructing a heap
• The root (the largest value in the heap)
is moved to the last location of the list
• The heap property is then restored on the remaining elements and the process is repeated
Heap Storage
• We can store the heap using an array
• For an element at location i , its children will be in
locations 2i and 2i +1
• If 2i and 2i +1 are greater than the list size, then the
element at location i is a leaf
• If only 2i +1 is greater than the list size, then the
element at location i has just one child
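The index arithmetic above can be sketched directly; `children` and `is_leaf` are hypothetical helper names for a heap stored 1-based in positions 1 through n:

```python
def children(i, n):
    """Indices of node i's children in a 1-based heap stored in positions 1..n."""
    return [c for c in (2 * i, 2 * i + 1) if c <= n]

def is_leaf(i, n):
    # A node is a leaf when even its left child index 2i falls past the end of the list.
    return 2 * i > n
```

For a 10-element heap, node 3 has children at 6 and 7, node 5 has only the child at 10, and node 6 is a leaf.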
Heap Construction Example
(Figure: a 16-element list, indices 1 through 16; construction starts at i = 8, the last internal node, whose only child is at 2i = 16; FixHeap makes no change here.)
Final Heapsort Loop
(Figure: the final heapsort loop traced on the values 11, 1, 9, 33, 3, 66.)
Final Heapsort Loop (continued)
Heapsort Algorithm
construct the heap
for i = 1 to N do
copy the root to the list
fix the heap
end for
FixHeap Algorithm
vacant = root
while 2*vacant ≤ bound do
largerChild = 2*vacant
if (largerChild < bound) and
   (list[ largerChild + 1 ] > list[ largerChild ]) then
      largerChild = largerChild + 1
end if
if key > list[ largerChild ] then
break
else
list[ vacant ] = list[ largerChild ]
vacant = largerChild
end if
end while
list[ vacant ] = key
(Diagram: node i with its children at indices 2i and 2i + 1.)
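The pseudocode above translates almost line for line into Python; this sketch keeps the slides' 1-based indexing by leaving slot 0 unused (the names come from the pseudocode, not from any library):

```python
def fix_heap(lst, root, key, bound):
    """Sift `key` down from `root` in a 1-based max-heap; `bound` is the last valid index."""
    vacant = root
    while 2 * vacant <= bound:
        larger_child = 2 * vacant
        # If a right child exists and is bigger, it is the one that may move up.
        if larger_child < bound and lst[larger_child + 1] > lst[larger_child]:
            larger_child += 1
        if key > lst[larger_child]:
            break                        # key dominates both children: stop here
        lst[vacant] = lst[larger_child]  # move the larger child up into the vacancy
        vacant = larger_child
    lst[vacant] = key
```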
Constructing the Heap
for i = N/2 down to 1 do
FixHeap( list, i, list[ i ], N )
end for
Final Heapsort Algorithm
// Constructing the heap from the initial list
for i = N/2 down to 1 do
FixHeap( list, i, list[ i ], N )
end for
// Sorting on the constructed heap
for i = N down to 2 do
max = list[ 1 ]
FixHeap( list, 1, list[ i ], i-1 )
list[ i ] = max
end for
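Putting construction and the sorting loop together, a runnable sketch of the whole algorithm (again 1-based, with lst[0] an unused placeholder; FixHeap is nested here so the example is self-contained):

```python
def heapsort(lst):
    """Heapsort on a 1-based list (lst[0] is a placeholder), following the slides' outline."""
    def fix_heap(root, key, bound):
        vacant = root
        while 2 * vacant <= bound:
            child = 2 * vacant
            if child < bound and lst[child + 1] > lst[child]:
                child += 1
            if key > lst[child]:
                break
            lst[vacant] = lst[child]
            vacant = child
        lst[vacant] = key

    n = len(lst) - 1
    # Construct the heap from the initial list.
    for i in range(n // 2, 0, -1):
        fix_heap(i, lst[i], n)
    # Repeatedly move the root to the end and fix the smaller heap.
    for i in range(n, 1, -1):
        maximum = lst[1]
        fix_heap(1, lst[i], i - 1)
        lst[i] = maximum

data = [None, 11, 1, 9, 33, 3, 66]
heapsort(data)
```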
Worst-Case Analysis
• We analyze FixHeap first because the rest of the analysis depends on it
• For each level of the heap, FixHeap does two comparisons: one between the two children, and the other between the new value and the larger child
• For a heap with D levels, there will be at most 2D comparisons
Worst-Case Analysis
• During heap construction, FixHeap is called N/2 times
• On the first pass, the subheaps will have a depth of 1
• On the last pass, the heap will have a depth of lg N
• We need to determine how many nodes there are on
each of the levels
Worst-Case Analysis
• For binary trees, we know that there is 1 node on the first level, 2 nodes on the second level, 4 nodes on the third level, and so on
• Putting this together gives:
  W_Construction(N) = Σ_{i=0}^{D−1} 2 * (D − i) * 2^i    for D = lg N
                    = 4 * 2^(lg N) − 2 * lg N − 4
                    = 4N − 2 lg N − 4
                    = O(N)

where 2^i is the number of nodes on level i, (D − i) is the maximum depth of the subtree rooted at a node on that level, and FixHeap does 2 comparisons per level ( see p. 104 for the derivation )
Worst-Case Analysis
• In the second loop, the size of the heap decreases by
one each pass
• If there are k nodes left in the heap, then the heap will have a depth of lg k
• This gives:
  W_Loop(N) = Σ_{k=1}^{N−1} 2 * lg k = O(N lg N)
Worst-Case Analysis
• Overall, the worst case is given by:
  W(N) = W_Construction(N) + W_Loop(N)
       = O(N) + O(N lg N)
       = O(N lg N)
Best-Case Analysis
• In the best case, the elements will be in the array in
reverse order
• The construction phase will still be of O(N )
• Once the heap is constructed, the main loop will take
the same O(N lg N ) work
• So, the best case for heapsort is also
O(N lg N )
Average-Case Analysis
• Average case must be between the best case and
the worst case
• The average case for heapsort must be O(N lg N ), because best and worst case are both O(N lg N )
Merge Sort
• If you have two sorted lists, you can create a combined sorted list by merging them
• We know that the smallest value will be the first one in one of the two lists
• If we move the smallest value to the new list, we can repeat the process until the entire list is created
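The repeated take-the-smaller-front-element step can be sketched as (`merge` is an illustrative name, not from the slides):

```python
def merge(a, b):
    """Merge two sorted lists by repeatedly taking the smaller front element."""
    result, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            result.append(a[i])
            i += 1
        else:
            result.append(b[j])
            j += 1
    # One list is exhausted; append the remainder of the other.
    return result + a[i:] + b[j:]
```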
Merge Sort Example
Merge Sort Example (continued)
Merge Sort Example (continued)
The Algorithm
if first < last then
middle = ( first + last ) / 2
MergeSort( list, first, middle )
MergeSort( list, middle + 1, last )
MergeLists( list, first, middle, middle + 1, last )
end if
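A runnable sketch of this recursion, with the MergeLists step done inline on a scratch list rather than split into the three parts shown on the following slides:

```python
def merge_sort(lst, first, last):
    """In-place top-down merge sort over lst[first..last], mirroring the slides' recursion."""
    if first < last:
        middle = (first + last) // 2
        merge_sort(lst, first, middle)
        merge_sort(lst, middle + 1, last)
        # MergeLists: merge lst[first..middle] with lst[middle+1..last].
        merged = []
        i, j = first, middle + 1
        while i <= middle and j <= last:
            if lst[i] <= lst[j]:
                merged.append(lst[i])
                i += 1
            else:
                merged.append(lst[j])
                j += 1
        merged += lst[i:middle + 1] + lst[j:last + 1]
        lst[first:last + 1] = merged

data = [6, 2, 9, 5, 2, 8]
merge_sort(data, 0, len(data) - 1)
```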
MergeList Algorithm, Part 1

indexC = 1
while (start1 ≤ end1) and (start2 ≤ end2) do
   if list[ start1 ] < list[ start2 ] then
      result[ indexC ] = list[ start1 ]
      start1 = start1 + 1
   else
      result[ indexC ] = list[ start2 ]
      start2 = start2 + 1
   end if
   indexC = indexC + 1
end while
MergeList Algorithm, Part 2
if start1 ≤ end1 then
for i = start1 to end1 do
result[indexC] = list[i]
indexC = indexC + 1
end for
else
for i = start2 to end2 do
result[indexC] = list[i]
indexC = indexC + 1
end for
end if
MergeList Algorithm, Part 3
indexC = 1
for i = finalStart to finalEnd do
list[i] = result[indexC]
indexC = indexC + 1
end for
MergeLists Analysis
• The best case is when all of the elements of one list are larger than all of the elements of the other list
• One worst case is when the elements are interleaved
• If each list has N elements, we will do N comparisons in the best case and 2N − 1 comparisons in the worst case
MergeSort Analysis
• MergeSort divides the list in half each time, so the difference between the best and worst cases is how much work MergeLists does
• In the analysis, we consider that a list of N elements
gets broken into two lists of N /2 elements that are
recursively sorted and then merged together
MergeSort Analysis
• The worst case is:
  W(N) = 2 * W(N/2) + N − 1
  W(0) = W(1) = 0
  which solves to W(N) = O(N lg N)
• The best case is:
  B(N) = 2 * B(N/2) + N/2
  B(0) = B(1) = 0
  which solves to B(N) = O(N lg N)
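The worst-case recurrence can be checked numerically; for N a power of two it solves exactly to N lg N − N + 1 (this closed form is not on the slide, but is easy to verify against the recurrence):

```python
import math

def w(n):
    """Worst-case comparison count from W(N) = 2*W(N/2) + N - 1 (N a power of two)."""
    return 0 if n <= 1 else 2 * w(n // 2) + n - 1

# For powers of two the recurrence solves exactly to N*lg(N) - N + 1, i.e. O(N lg N).
for n in (2, 4, 8, 16, 1024):
    assert w(n) == n * int(math.log2(n)) - n + 1
```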
Quicksort
• In Chapter 3, we saw a partition process used to help us find the Kth largest element in a list
• We now use the partitioning process to help us sort a list
• This time we apply the process to both parts of the list instead of just one of them
Quicksort
• Quicksort will partition a list into two pieces:
– Those elements smaller than the pivot value
– Those elements larger than the pivot value
• Quicksort is then called recursively on both pieces
Quicksort Example
Quicksort Example (continued)
Quicksort Algorithm
if first < last then
pivot = PivotList( list, first, last )
Quicksort( list, first, pivot-1 )
Quicksort( list, pivot+1, last )
end if
Partitioning Process
• The algorithm moves through the list
comparing values to the pivot
• During this process, the list is divided into sections: the pivot, the elements found to be smaller than it, the elements found to be larger, and the elements not yet examined
PivotList Algorithm
PivotValue = list[ first ]
PivotPoint = first
for index = first+1 to last do
if list[ index ] < PivotValue then
      PivotPoint = PivotPoint + 1
      Swap( list[ PivotPoint ], list[ index ] )
end if
end for
// move pivot value into correct place
Swap( list[ first ], list[ PivotPoint ] )
return PivotPoint
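PivotList and the Quicksort driver translate directly to Python (0-based here, since nothing in the algorithm depends on 1-based indexing):

```python
def pivot_list(lst, first, last):
    """Partition lst[first..last] around lst[first]; return the pivot's final index."""
    pivot_value = lst[first]
    pivot_point = first
    for index in range(first + 1, last + 1):
        if lst[index] < pivot_value:
            pivot_point += 1
            lst[pivot_point], lst[index] = lst[index], lst[pivot_point]
    # Move the pivot value into its correct place.
    lst[first], lst[pivot_point] = lst[pivot_point], lst[first]
    return pivot_point

def quicksort(lst, first, last):
    if first < last:
        pivot = pivot_list(lst, first, last)
        quicksort(lst, first, pivot - 1)
        quicksort(lst, pivot + 1, last)

data = [7, 3, 9, 1, 5, 8, 2]
quicksort(data, 0, len(data) - 1)
```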
Worst-Case Analysis
• In the worst case, PivotList will do N – 1
comparisons, but create one partition that has N – 1
elements and the other will have no elements
• Because each call winds up reducing the partition by only one element, the worst case is given by:
  W(N) = Σ_{i=2}^{N} (i − 1) = N(N − 1)/2 = O(N²)
Average-Case Analysis
• In the average case, we need to consider all
of the possible places where the pivot point
winds up
• Because there are N − 1 comparisons done to partition the list, and there are N places the pivot can end up, each equally likely, we have:
  A(N) = (N − 1) + (1/N) * Σ_{i=1}^{N} [ A(i − 1) + A(N − i) ]
  A(1) = A(0) = 0
Average-Case Analysis
• Algebra can be used to simplify this recurrence relation to:

  A(N) = ((N + 1)/N) * A(N − 1) + 2 − 2/N
  A(1) = A(0) = 0

• This will then solve to:

  A(N) ≈ 1.4 * (N + 1) * lg N
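The simplified recurrence can be iterated numerically; it agrees with the closed form 2(N + 1)H_N − 4N (H_N is the Nth harmonic number, a standard result not shown on the slide), whose leading term 2N ln N ≈ 1.386 N lg N is consistent with the ≈1.4(N + 1) lg N above:

```python
def average_comparisons(n):
    """Iterate A(N) = ((N + 1)/N) * A(N - 1) + 2 - 2/N with A(0) = A(1) = 0."""
    a = 0.0
    for k in range(2, n + 1):
        a = (k + 1) / k * a + 2 - 2 / k
    return a

# Check against the closed form 2(N + 1)H_N - 4N at N = 100.
harmonic = sum(1 / k for k in range(1, 101))
assert abs(average_comparisons(100) - (2 * 101 * harmonic - 400)) < 1e-6
```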
External Polyphase Merge Sort
• Used when the data to be sorted is so large that it will not fit in the computer's memory
• External files are used to hold partial results
• Read in as many records as possible into memory and then sort them using one of the other sorts
• Alternate writing these runs of sorted records to one of two files
External Polyphase Merge Sort
• Merge pairs of runs from the two files into one run that is twice the length
• To do this, the runs might have to be read into memory in pieces, but the entire two runs must be merged before moving on to the next pair of runs
• This doubles the run length and halves the number of runs
External Polyphase Merge Sort
• The bigger runs are written alternately
between two new files
• The process continues to merge pairs of runs until the entire data set has been merged back into a single sorted file
Run Creation Algorithm
CurrentFile = A
while not at the end of the input file do
read S records from the input file
sort the S records
write the records to file CurrentFile
if CurrentFile == A then
CurrentFile = B
else
CurrentFile = A
end if
end while
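A minimal in-memory sketch of run creation, using Python lists as stand-ins for the input file and for files A and B (`create_runs` is an illustrative name):

```python
def create_runs(records, s):
    """Split `records` into sorted runs of length s, written alternately to 'files' A and B."""
    files = {"A": [], "B": []}
    current = "A"
    for start in range(0, len(records), s):
        run = sorted(records[start:start + s])   # stand-in for sorting S records in memory
        files[current].append(run)
        current = "B" if current == "A" else "A"
    return files["A"], files["B"]

a, b = create_runs([9, 4, 7, 1, 8, 2, 6, 3], 2)
# a holds runs 1 and 3, b holds runs 2 and 4
```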
Run Merge Algorithm
Size = S
Input1 = A
Input2 = B
CurrentOutput = C
while not done do
   // merge runs process (next slide)
Size = Size * 2
if Input1 == A then
Input1 = C
Input2 = D
      CurrentOutput = A
   else
Input1 = A
Input2 = B
CurrentOutput = C
end if
end while
Merge Runs Process
while more runs this pass do
Merge one run of length Size from file Input1
with one run of length Size from file Input2
sending output to CurrentOutput
   if CurrentOutput == A then
      CurrentOutput = B
elseif CurrentOutput == B then
CurrentOutput = A
elseif CurrentOutput == C then
      CurrentOutput = D
   elseif CurrentOutput == D then
CurrentOutput = C
end if
end while
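The whole merge phase can be simulated in memory; this simplified sketch assumes the number of runs is a power of two and uses lists of runs as stand-ins for the four files, with heapq.merge standing in for the piece-by-piece two-run merge:

```python
import heapq

def merge_pass(input1, input2):
    """One pass: merge paired runs from the two input 'files', alternating two output 'files'."""
    out = {0: [], 1: []}
    target = 0
    for run_a, run_b in zip(input1, input2):
        out[target].append(list(heapq.merge(run_a, run_b)))
        target = 1 - target
    return out[0], out[1]

def polyphase_sort(records, s):
    """Simulate the external sort in memory: create runs of length s, then merge until one remains."""
    runs = [sorted(records[i:i + s]) for i in range(0, len(records), s)]
    a, b = runs[0::2], runs[1::2]          # runs written alternately to the two files
    while len(a) + len(b) > 1:
        a, b = merge_pass(a, b)
    return (a + b)[0]
```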
Run Creation Analysis
• This analysis assumes that there are N
elements in the list and that they are
broken down into R runs of S elements (N = R * S)
• If we use an efficient sort to create the
runs, each run will take O(S lg S) and
there will be R of them for a total time of
O(R * S * lg S) = O(N lg S)
Run Merging Analysis
• On the first pass, we have R runs of S elements, so there will be R/2 merges that can each take up to 2S − 1 comparisons, which is R/2 * (2S − 1) = R*S − R/2
• On the second pass, we will have R/2 runs of 2S elements, so there will be R/4 merges that can each take up to 4S − 1 comparisons, which is R/4 * (4S − 1) = R*S − R/4
Run Merging Analysis
• There will be lg R passes of the merge
phase, so that the complexity is given
by: R R N
RS R
R
i i
lg*2 *
lg
1
External Polyphase Merge Sort Analysis
• Putting the run creation and run
merging calculations together we find
that the overall complexity is O(N lg N )