Lecture 7: Sorting Techniques
Prakash Gautam
https://prakashgautam.com.np/dipit02/
[email protected]
26 April, 2018
Agenda
➔ Introduction to Sorting
➔ Different Sorting Techniques
   ◆ Bubble Sort
   ◆ Selection Sort
   ◆ Insertion Sort
   ◆ Merge Sort
   ◆ Quick Sort
   ◆ Shell Sort
2
Sorting
➔ An operation that segregates items into groups according to a specified criterion
➔ Input: A = { 3 1 6 2 1 3 4 5 9 0 }
➔ Output: A = { 0 1 1 2 3 3 4 5 6 9 }
➔ Sorting: the act of ordering
➔ Sorted: ordered in a particular way
➔ Examples
   ◆ Words in the dictionary
6
➔ Sorting is arranging the elements in a list or collection in increasing or decreasing order of some property
➔ The list may hold any data type
   ◆ Strings or words: lexicographical order
   ◆ Integers: increasing order of value
➔ 2, 3, 9, 4, 6
   ◆ 2, 3, 4, 6, 9 [increasing order of value]
   ◆ 9, 6, 4, 3, 2 [decreasing order of value]
   ◆ 2, 3, 9, 4, 6 [increasing order of # of factors]
7
➔ Sorted data are useful not only for representing & retrieving data
   ◆ Sorting also significantly speeds up later computation
   ◆ Unsorted: Linear Search, O(n)
   ◆ Sorted: Binary Search, O(log n)
➔ Goal: study, analyze & compare the various sorting algorithms
8
Sorting Algorithms
➔ Bubble Sort
➔ Selection Sort
➔ Insertion Sort
➔ Merge Sort
➔ Quick Sort
➔ Shell Sort
➔ Radix Sort
➔ Swap Sort
➔ Heap Sort
9
Classification of Sorting Algorithms
➔ Time Complexity
   ◆ Rate of growth of time taken by an algorithm with respect to input size, n
➔ Space Complexity
   ◆ In-place algorithm or not…?
➔ Stability
   ◆ Does it preserve the relative order of equal keys?
➔ Internal or External Sort (RAM or disk)
➔ Recursive or Non-Recursive
10
Bubble Sort
➔ Bubble sort is also known as sinking sort
➔ It is a simple sorting algorithm that works by repeatedly stepping through the list to be sorted, comparing each pair of adjacent items and swapping them if they are in the wrong order
➔ The pass through the list is repeated until no swaps are needed, which indicates that the list is sorted
11
12
Pass 1
13
Pass 2
14
Pass 3
15
Pass 4
16
➔ The algorithm gets its name from the way smaller elements "bubble" to the top of the list
➔ As it only uses comparisons to operate on elements, it is a comparison sort
➔ Notice that after each pass, at least one more element reaches its final position
➔ Although the algorithm is simple, it is too slow for practical use
17
Bubble Sort: Algorithm
18
Bubble_Sort(A, n)
  for k = 1 to n-1
    for i = 0 to n-2
      if (A[i] > A[i+1])
        Swap(A[i], A[i+1])

Swap(A[i], A[i+1])
  temp = A[i]
  A[i] = A[i+1]
  A[i+1] = temp
Bubble Sort: Improvement-I
19
Bubble_Sort(A, n)
  for k = 1 to n-1
    for i = 0 to n-k-1
      if (A[i] > A[i+1])
        Swap(A[i], A[i+1])
Bubble Sort: Improvement-II
20
Bubble_Sort(A, n)
  for k = 1 to n-1
    flag = 0
    for i = 0 to n-k-1
      if (A[i] > A[i+1])
        Swap(A[i], A[i+1])
        flag = 1
    if (flag == 0) break
Bubble Sort: Complexity Analysis
➔ Best Case: O(n) [already sorted, with the early-exit flag]
➔ Worst Case: O(n²)
➔ Average Case: O(n²)
21
Selection Sort
➔ The array is conceptually divided into two parts - a sorted one & an unsorted one
➔ At the beginning, the sorted part is empty, while the unsorted one contains the whole array
➔ At every step, the algorithm finds the minimal element in the unsorted part and adds it to the end of the sorted one
➔ When the unsorted part becomes empty, the algorithm stops
22
Selection Sort: Algorithm
26
Selection_Sort(A, n)
  for i = 0 to n-1
    iMin = i
    for j = i+1 to n-1
      if (A[j] < A[iMin])
        iMin = j
    Swap(A[i], A[iMin])
Selection Sort: Complexity Analysis
➔ O(n²) in all cases
➔ It minimizes the number of swaps: at most n-1
27
Insertion Sort
➔ The array is conceptually divided into two parts - a sorted one & an unsorted one
➔ At the beginning, the sorted part is empty, while the unsorted one contains the whole array
➔ It keeps a prefix of the array sorted
➔ This prefix is grown by inserting the next value into it at the correct place
➔ Eventually, the prefix is the entire array, which is therefore sorted
28
3 7 4 9 5 2 6 1
3 7 4 9 5 2 6 1
3 7 4 9 5 2 6 1
3 4 7 9 5 2 6 1
3 4 7 9 5 2 6 1
3 4 5 7 9 2 6 1
29
2 3 4 5 7 9 6 1
2 3 4 5 6 7 9 1
1 2 3 4 5 6 7 9
1 2 3 4 5 6 7 9
30
Insertion Sort: Algorithm
31
Insertion_Sort(A, n)
  for i = 1 to n-1
    Value = A[i]
    hole = i
    while (hole > 0 && A[hole-1] > Value)
      A[hole] = A[hole-1]
      hole = hole - 1
    A[hole] = Value
Insertion Sort: Complexity Analysis
➔ Best Case: O(n) [already sorted]
➔ Worst Case: O(n²)
➔ Average Case: O(n²)
➔ It minimizes the number of swaps
➔ In practice, its comparisons & shifts are far fewer than those of Bubble & Selection sort
32
Merge Sort
➔ Divide & Conquer
   ◆ DIVIDE: Partition the n-element sequence to be sorted into two subsequences of n/2 elements each
   ◆ CONQUER: Sort the two subsequences recursively using merge sort
   ◆ COMBINE: Merge the two sorted subsequences of size n/2 each to produce the sorted sequence
➔ Note that recursion "bottoms out" when the sequence to be sorted is of unit length
33
➔ Since every sequence of length 1 is in sorted order, no further recursive call is necessary
➔ The key operation of the merge sort algorithm is the merging of the two sorted subsequences in the "combine" step
34
35
Merge Sort: Algorithm
36
Merge_Sort(A)
  n = length(A)
  if (n < 2) return            // already sorted
  mid = n/2
  left = array of size mid
  right = array of size n-mid
  for i = 0 to mid-1
    left[i] = A[i]
  for i = mid to n-1
    right[i-mid] = A[i]
  Merge_Sort(left)
  Merge_Sort(right)
  Merge(left, right, A)
37
Merge(L, R, A)
  nL = length(L); nR = length(R)
  i = j = k = 0
  while (i < nL && j < nR)
    if (L[i] <= R[j])
      A[k] = L[i]; i++
    else
      A[k] = R[j]; j++
    k++
  while (i < nL)
    A[k] = L[i]; i++; k++
  while (j < nR)
    A[k] = R[j]; j++; k++
Merge Sort: Analysis
➔ Time Complexity: O(n log n)
➔ Space Complexity
   ◆ Not an in-place algorithm ….WHY? It needs the left & right temporaries
   ◆ If the extra memory for left & right is never freed: O(n log n)
   ◆ If the extra memory is freed in each call: O(n)
38
Quick Sort
➔ It is among the fastest known general-purpose sorting algorithms in practice and is often the best practical choice for sorting
➔ Pick an element, called a pivot, from the array
➔ Reorder the array so that all elements with values less than the pivot come before the pivot, while all elements with values greater than the pivot come after it (equal values can go either way). After this partitioning, the pivot is in its final position. This is called the partition operation
39
➔ Recursively apply the above steps to the sub-array of elements with smaller values and separately to the sub-array of elements with greater values
➔ Divide & Conquer
   ◆ Divide: Partition T[i..j] into T[i..k-1] & T[k+1..j] such that each element of T[i..k-1] <= Pvt & each element of T[k+1..j] > Pvt
   ◆ Conquer: Sort the two sub-arrays T[i..k-1] & T[k+1..j] by recursive calls to quicksort
   ◆ Combine: Since the sub-arrays are sorted in place, no work is needed to combine them: the entire array T[i..j] is now sorted
40
Quick Sort: Algorithm
45
Quick_Sort(A, Start, End)
  if (Start < End)
    Pindex = Partition(A, Start, End)
    Quick_Sort(A, Start, Pindex-1)
    Quick_Sort(A, Pindex+1, End)
46
Partition(A, Start, End)
  pivot = A[End]
  Pindex = Start
  for i = Start to End-1
    if (A[i] <= pivot)
      Swap(A[i], A[Pindex])
      Pindex = Pindex + 1
  Swap(A[Pindex], A[End])
  return Pindex
Quick Sort: Analysis
➔ Time Complexity
   ◆ Best Case (balanced partitions): O(n log n)
   ◆ Worst Case (already sorted / unbalanced partitions): O(n²)
      ● Solution: Randomized Partition
   ◆ Average Case: O(n log n)
➔ Space Complexity
   ◆ An in-place algorithm, but the recursion stack costs space
   ◆ Worst Case: O(n) stack depth
47
48
Randomized_Partition(A, Start, End)
  pivotIndex = Random(Start, End)
  Swap(A[pivotIndex], A[End])
  return Partition(A, Start, End)
Shell Sort
➔ Donald L. Shell (1959)
➔ A generalization of insertion sort
➔ We compare elements that are far apart rather than adjacent
➔ Comparison of elements: if there are N elements, then we start with a value gap < N
➔ In each pass, we keep reducing the value of gap till we reach the last pass, when gap is 1
49
➔ In the last pass: Shell Sort = Insertion Sort
[14 18 19 37 23 40 29 30 11] - A[ ]
0 1 2 3 4 5 6 7 8 Index
➔ Total Elements (N) = 9
➔ "gap" must be less than N
➔ gap = Floor[N/2]
50
➔ Here gap = 4 { Floor[9/2] }
➔ So, Pass = 1 & gap = 4
   ◆ First element at index 0
   ◆ Second at index 0+4 = 4
   ◆ Third at index 4+4 = 8
[14 18 19 37 23 40 29 30 11] - A[ ]
0 1 2 3 4 5 6 7 8 Index
51
[14 18 19 37 23 40 29 30 11] - A[ ]
0 1 2 3 4 5 6 7 8 Index
➔ Is 14 > 23…? No. Is 18 > 40…? No. Is 19 > 29…? No. Is 37 > 30…? Yes, swap. Is 23 > 11…? Yes, swap. Then is 14 > 11…? Yes, swap.
[11 18 19 30 14 40 29 37 23] - A[ ]
52
➔ Pass = 2 & gap = 2
   ◆ gap = Floor[gap/2] = Floor[4/2] = 2
Before Pass 2 (result of Pass 1):
[11 18 19 30 14 40 29 37 23] - A[ ]
0 1 2 3 4 5 6 7 8 Index
After Pass 2:
[11 18 14 30 19 37 23 40 29] - A[ ]
0 1 2 3 4 5 6 7 8 Index
53
➔ Pass = 3 & gap = 1
   ◆ gap = Floor[2/2] = 1
   ◆ Equivalent to insertion sort when gap = 1
[11 18 14 30 19 37 23 40 29] - A[ ]
0 1 2 3 4 5 6 7 8 Index
FINALLY SORTED:
[11 14 18 19 23 29 30 37 40] - A[ ]
0 1 2 3 4 5 6 7 8 Index
54
Shell Sort: Algorithm
55
Shell_Sort(A, Size)
  gap = Size/2
  while (gap > 0)
    i = gap
    while (i < Size)
      temp = A[i]
      for (j = i; j >= gap && A[j-gap] > temp; j = j-gap)
        A[j] = A[j-gap]
      A[j] = temp
      i = i+1
    gap = gap/2
Shell Sort: Algorithm
56
1. Calculate gap
2. While gap > 0
   FOR each element in the list, gap apart
     Extract the current item
     Locate the position to insert
     Insert the item at that position
   END FOR
3. Calculate gap
4. END While
Shell Sort: Analysis
➔ Time Complexity
   ◆ Average Case: O(n^(5/4)) to O(n^(3/2)), depending on the gap sequence
   ◆ Worst Case: that of insertion sort, O(n²)
   ◆ The exact complexity of this algorithm is still being debated
➔ Space Complexity
   ◆ In place
➔ Stable Sorting…?
   ◆ No, it doesn't preserve the relative order of duplicates
➔ In practice: not better than O(n log n)
57
Heap Sort
58
Heap Data Structure
➔ A special tree-based data structure
➔ It must be a complete binary tree
➔ Max-heap property: every node is >= its children
62
Heapify
63
Heapify(A, n, i)
  Largest = index of largest among A[i], A[2i+1], A[2i+2] (children that exist)
  if (Largest != i)
    Swap(A[i], A[Largest])
    Heapify(A, n, Largest)
64
heapify(int arr[], int n, int i)
{
    int largest = i;
    int l = 2*i + 1;
    int r = 2*i + 2;
    if (l < n && arr[l] > arr[largest]) largest = l;
    if (r < n && arr[r] > arr[largest]) largest = r;
    if (largest != i) {
        swap(arr[i], arr[largest]);
        heapify(arr, n, largest);
    }
}
65
Heap Sort
➔ Max-Heap: the largest item is stored at the root node
➔ Remove the root node & put it at the end of the array
➔ Reduce the size of the heap by 1 and heapify the root element again so that we have the highest element at the root
➔ The process is repeated until all the items of the list are sorted
68
Heap Sort: Algorithm
76
// assumes arr[0..n-1] has already been built into a max-heap
for (int i = n-1; i > 0; i--) {
    swap(arr[0], arr[i]);        // move the current max to the end
    // call max heapify on the reduced heap
    heapify(arr, i, 0);
}
Heap Sort: Analysis
➔ Time Complexity: O(n log n)
   ◆ The height of a complete binary tree containing n elements is log(n)
➔ Space Complexity
   ◆ In place
➔ Stable Sorting…?
   ◆ No, it doesn't preserve the relative order of duplicates
➔ Recursive (as heapify is written here)
77
78
➔ Radix Sort
➔ Counting Sort
➔ Topological Sort
➔ Bucket Sort
➔ Comb Sort
➔ Cycle Sort
➔ Cocktail Sort
➔ Bitonic Sort
➔ Gnome Sort
➔ Sleep Sort
79
Thank You…!
80
…?