Sorting algorithms
Properties of sorting algorithms
A sorting algorithm is
Comparison based: if it works by pairwise key comparisons.
In place: if only a constant number of elements of the input array are ever stored outside the array.
Stable: if it preserves the relative order of elements with equal keys.
The space complexity of a sorting algorithm is the amount of memory it uses in addition to the input array.
Insertion Sort: Θ(1). Merge Sort: Θ(n). Quicksort: Θ(n) (worst case).
Lower bound for comparison based sorting algorithms
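All possible runs of a comparison based sorting algorithm can be modeled by a decision tree.
The height of the decision tree is the worst-case number of comparisons, and therefore a lower bound on the worst-case time complexity of the algorithm.
The decision tree of insertion sort for n = 3 is shown below.
[Figure: decision tree of insertion sort for n = 3. The root compares 1:2, the other internal nodes compare 2:3 and 1:3, and the six leaves are the permutations 1,2,3; 1,3,2; 3,1,2; 2,1,3; 2,3,1; 3,2,1.]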
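The decision tree can also be explored by running the algorithm on every input. The following is a minimal sketch, assuming a textbook insertion sort that compares adjacent elements; the function name `insertion_sort_trace` is introduced here for illustration. Each run records the outcomes of its comparisons, which is the root-to-leaf path that the input follows in the tree.

```python
from itertools import permutations


def insertion_sort_trace(keys):
    """Sort a copy of keys and return (sorted list, tuple of comparison outcomes)."""
    a = list(keys)
    outcomes = []
    for j in range(1, len(a)):
        i = j
        while i > 0:
            greater = a[i - 1] > a[i]  # one pairwise key comparison
            outcomes.append(greater)
            if not greater:
                break
            a[i - 1], a[i] = a[i], a[i - 1]
            i -= 1
    return a, tuple(outcomes)


# One root-to-leaf path per input permutation of three distinct keys.
paths = {p: insertion_sort_trace(p)[1] for p in permutations((1, 2, 3))}
num_leaves = len(set(paths.values()))
height = max(len(path) for path in paths.values())
print(num_leaves, height)  # 6 3
```

The six permutations reach six different leaves, and the longest path uses three comparisons, which is the height of the tree for n = 3.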
Theorem
For every comparison based sorting algorithm, the decision tree for n elements has height Ω(n log n).
Proof.
The number of leaves in the tree is ≥ n!, since each of the n! possible orderings of the input requires a different output permutation.
A binary tree with height h has ≤ 2^h leaves.
Therefore, 2^h ≥ n! ⇒ h ≥ log(n!).
By Stirling's approximation, n! ≥ (n/e)^n. Therefore,
h ≥ log((n/e)^n) = n log(n/e) = n log n − n log e = Ω(n log n).
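The bound in the proof can be checked numerically. The sketch below is only an illustration with hypothetical helper names (`min_tree_height`, `stirling_bound`); it compares the exact value ⌈log2(n!)⌉ with the weaker estimate n log2(n/e) used in the last step.

```python
import math


def min_tree_height(n):
    """Smallest h with 2^h >= n!, i.e. ceil(log2(n!))."""
    return math.ceil(math.log2(math.factorial(n)))


def stirling_bound(n):
    """The estimate n * log2(n / e) that follows from n! >= (n/e)^n."""
    return n * math.log2(n / math.e)


for n in (3, 10, 100, 1000):
    print(n, min_tree_height(n), round(stirling_bound(n), 1), round(n * math.log2(n), 1))
```

For n = 3 the exact bound is 3 comparisons, which matches the height of the insertion sort decision tree.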
Lemma
A binary tree with height h has ≤ 2^h leaves.
Proof.
The proof is by induction on h.
The base case h = 0 holds: a tree of height 0 is a single node, which is its only leaf.
If T is a binary tree of height h > 0, then removing the root of T gives at most two trees T1, T2 of height at most h − 1 each.
By induction, each of T1, T2 has ≤ 2^(h−1) leaves.
Therefore, T has ≤ 2 · 2^(h−1) = 2^h leaves.
Sorting in linear time
We will see several sorting algorithms that run in Θ(n) time:
Counting Sort.
Radix Sort.
Bucket Sort.
These algorithms are not comparison based.
Counting Sort
A = Input array. Values in A are from 0, 1, . . . , k.
B = Output array.
CountingSort(A, B, k)
    Let C[0..k] be a new array
    for i = 0 to k
        C[i] ← 0
    for j = 1 to A.length
        C[A[j]] ← C[A[j]] + 1
    % C[i] now contains the number of elements = i
    for i = 1 to k
        C[i] ← C[i] + C[i − 1]
    % C[i] now contains the number of elements ≤ i
    for j = A.length downto 1
        B[C[A[j]]] ← A[j]
        C[A[j]] ← C[A[j]] − 1
Example with k = 5 and A[1..8] = 2 5 3 0 2 3 0 3. The array C[0..5] changes as follows.
After the initialization loop:          C = 0 0 0 0 0 0
After counting A[1] = 2:                C = 0 0 1 0 0 0
After counting A[2] = 5:                C = 0 0 1 0 0 1
After the whole counting loop:          C = 2 0 2 3 0 1
Prefix sums, after i = 1:               C = 2 2 2 3 0 1
Prefix sums, after i = 2:               C = 2 2 4 3 0 1
After the whole prefix-sum loop:        C = 2 2 4 7 7 8
j = 8: B[7] ← A[8] = 3,                 C = 2 2 4 6 7 8
j = 7: B[2] ← A[7] = 0,                 C = 1 2 4 6 7 8
j = 6: B[6] ← A[6] = 3,                 C = 1 2 4 5 7 8
At the end: B[1..8] = 0 0 2 2 3 3 3 5,  C = 0 2 2 4 7 7
Time complexity: Θ(n + k), where n = A.length. The loops over C take Θ(k) time and the loops over A take Θ(n) time.
If k = O(n) then the time is Θ(n).
Counting Sort is stable (elements with the same key maintain their relative order): the last loop scans A from right to left and fills the positions of each key in B from right to left.
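The CountingSort pseudocode can be sketched in Python as follows. This is an illustrative translation rather than a definitive implementation: it uses 0-based indices (so the counter is decremented before the element is placed), and the `key` parameter is an assumption added here so that stability can be observed on records.

```python
def counting_sort(a, k, key=lambda x: x):
    """Return a new list with the items of a sorted by key(item), where 0 <= key(item) <= k."""
    c = [0] * (k + 1)
    for item in a:
        c[key(item)] += 1          # c[i] = number of items with key == i
    for i in range(1, k + 1):
        c[i] += c[i - 1]           # c[i] = number of items with key <= i
    b = [None] * len(a)
    for item in reversed(a):       # right-to-left scan keeps the sort stable
        c[key(item)] -= 1
        b[c[key(item)]] = item
    return b


print(counting_sort([2, 5, 3, 0, 2, 3, 0, 3], 5))  # [0, 0, 2, 2, 3, 3, 3, 5]

records = [(3, "a"), (0, "b"), (3, "c"), (0, "d")]
print(counting_sort(records, 3, key=lambda rec: rec[0]))
# [(0, 'b'), (0, 'd'), (3, 'a'), (3, 'c')]
```

In the second call, records with equal keys keep their input order, which is the stability property.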
Radix Sort
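A = Input array. Values in A are numbers with d digits.
RadixSort(A, d)
    for i = 1 to d
        Use a stable sort to sort array A on digit i
Digit 1 is the least significant digit. Example (d = 3):
Input   After digit 1   After digit 2   After digit 3
329     720             720             329
457     355             329             355
657     436             436             436
839     457             839             457
436     657             355             657
720     329             457             720
355     839             657             839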
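The passes of RadixSort can be reproduced with a short Python sketch. As an assumption made for brevity, the stable sort on each digit is done by appending to per-digit lists instead of calling Counting Sort; the function name `radix_sort_passes` and the `base` parameter are hypothetical.

```python
def radix_sort_passes(a, d, base=10):
    """Return the array after each of the d passes, least significant digit first."""
    passes = []
    for i in range(d):
        buckets = [[] for _ in range(base)]
        for x in a:
            # Appending in input order keeps each pass stable.
            buckets[(x // base**i) % base].append(x)
        a = [x for bucket in buckets for x in bucket]
        passes.append(a)
    return passes


for step in radix_sort_passes([329, 457, 657, 839, 436, 720, 355], 3):
    print(step)
# [720, 355, 436, 457, 657, 329, 839]
# [720, 329, 436, 839, 355, 457, 657]
# [329, 355, 436, 457, 657, 720, 839]
```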
Time complexity
In the following, assume that the stable sorting algorithm used by radix sort is Counting Sort.
Claim
Given n numbers, where each number has d digits in base k, the time complexity of Radix Sort is Θ(d(n + k)).
Proof.
The algorithm has d iterations.
The time of a single iteration is Θ(n + k).
Claim
Given n b-bit numbers, and r ≤ b, the time complexity of Radix Sort is Θ((b/r)(n + 2^r)).
Proof.
Each number can be viewed as a number in base k = 2^r with d = ⌈b/r⌉ digits.
The time complexity is Θ(d(n + k)) = Θ((b/r)(n + 2^r)).
Example: for b = 9 and r = 3, the number 011101001 is split into the digits 011 101 001, that is, 3 5 1 in base 8.
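Viewing a b-bit number as d digits of r bits each amounts to shifting and masking. A minimal sketch, with a hypothetical helper name `digits_base_2r`, that returns the digits with the least significant one first (the order in which Radix Sort processes them):

```python
def digits_base_2r(x, b, r):
    """Split the b-bit number x into ceil(b / r) digits of r bits, least significant first."""
    d = -(-b // r)            # ceil(b / r) using integer arithmetic
    mask = (1 << r) - 1       # r one-bits, i.e. 2^r - 1
    return [(x >> (r * i)) & mask for i in range(d)]


print(digits_base_2r(0b011101001, 9, 3))  # [1, 5, 3]
```

Read from the most significant digit, the result is 3 5 1, as in the example with b = 9 and r = 3.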
Choosing r — example
Suppose that n = 50000 and b = 32.
r     d = ⌈b/r⌉   k = 2^r   d · (n + 2^r)
1     32          2         32 · (50000 + 2) = 1600064
2     16          4         16 · (50000 + 4) = 800064
3     11          8         11 · (50000 + 8) = 550088
...
9     4           512       4 · (50000 + 512) = 202048
10    4           1024      4 · (50000 + 1024) = 204096
11    3           2048      3 · (50000 + 2048) = 156144
12    3           4096      3 · (50000 + 4096) = 162288
13    3           8192      3 · (50000 + 8192) = 174576
14    3           16384     3 · (50000 + 16384) = 199152
15    3           32768     3 · (50000 + 32768) = 248304
16    2           65536     2 · (50000 + 65536) = 231072
...
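The rows of the table, including the omitted ones, can be recomputed with a few lines of Python. This is a sketch of the cost estimate ⌈b/r⌉ · (n + 2^r) only, not a measurement of real running time; the name `radix_cost` is introduced here for illustration.

```python
def radix_cost(n, b, r):
    """Cost estimate d * (n + 2^r) with d = ceil(b / r) passes."""
    d = -(-b // r)
    return d * (n + 2**r)


n, b = 50000, 32
costs = {r: radix_cost(n, b, r) for r in range(1, b + 1)}
best_r = min(costs, key=costs.get)
print(best_r, costs[best_r])  # 11 156144
```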
Choosing r
Claim
Given n b-bit numbers, the time complexity of Radix Sort is
Θ(n) if b ≤ log n
Θ(bn/ log n) if b > log n.
Proof.
If b ≤ log n:
For every r ≤ b, (b/r)(n + 2^r) ≤ (b/r)(n + n) = (2b/r) · n.
For r = b, the time is Θ(n), which is optimal.
If b > log n:
For r = ⌊log n⌋, (b/r)(n + 2^r) = Θ((b/ log n) n).
If r < ⌊log n⌋, the b/r term increases while the n + 2^r term remains Θ(n).
If r > ⌊log n⌋, the 2^r term increases faster than the r term in the denominator.
Choosing r — example
Suppose that n = 50000 and b = 32.
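log2 n = 15.609, so an asymptotically optimal choice is r = ⌊log2 n⌋ = 15.
Among the values of r tabulated above, the estimate (b/r)(n + 2^r) is smallest for r = 11; asymptotic optimality determines the running time only up to a constant factor.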
Bucket Sort
The Bucket Sort algorithm sorts an array whose elements are numbers from the interval [0, 1).
The main idea is to partition the interval [0, 1) into n intervals of equal length. The elements of the array are partitioned into buckets according to the intervals. Then, each bucket is sorted using some sorting algorithm (for example, insertion sort).
A = Input array. 0 ≤ A[i] < 1 for all i.
BucketSort(A)
    n ← A.length
    Let B[0..n − 1] be a new array
    for i = 0 to n − 1
        B[i] ← NULL
    for i = 1 to n
        Insert A[i] to list B[⌊n · A[i]⌋]
    for i = 0 to n − 1
        Sort B[i] (with Insertion Sort)
    Concatenate the lists B[0], B[1], . . . , B[n − 1] in order
Example with n = 10 and input values 0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68. Each value x goes to bucket ⌊10x⌋. After all buckets are sorted:
B[0]: empty
B[1]: 0.12, 0.17
B[2]: 0.21, 0.23, 0.26
B[3]: 0.39
B[4]: empty
B[5]: empty
B[6]: 0.68
B[7]: 0.72, 0.78
B[8]: empty
B[9]: 0.94
Concatenating the lists gives 0.12, 0.17, 0.21, 0.23, 0.26, 0.39, 0.68, 0.72, 0.78, 0.94.
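The BucketSort pseudocode can be sketched in Python as follows. This is an illustrative version under two assumptions that are not in the slides: buckets are Python lists rather than linked lists, and the bucket index is clamped to n − 1 as a guard against floating-point rounding.

```python
def bucket_sort(a):
    """Sort numbers from the interval [0, 1) using n buckets and insertion sort."""
    n = len(a)
    buckets = [[] for _ in range(n)]
    for x in a:
        # floor(n * x); the min() guard is an assumption against rounding up to n.
        buckets[min(int(n * x), n - 1)].append(x)
    result = []
    for bucket in buckets:
        # Insertion sort on one bucket.
        for j in range(1, len(bucket)):
            key = bucket[j]
            i = j - 1
            while i >= 0 and bucket[i] > key:
                bucket[i + 1] = bucket[i]
                i -= 1
            bucket[i + 1] = key
        result.extend(bucket)  # concatenate the buckets in order
    return result


values = [0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68]
print(bucket_sort(values))
# [0.12, 0.17, 0.21, 0.23, 0.26, 0.39, 0.68, 0.72, 0.78, 0.94]
```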
Time complexity
Assume that the elements of A are chosen randomly from [0, 1) (uniformly and independently).
Let n_i be the number of elements in B[i].
The expected time complexity is
\[
E\left[\Theta(n) + \sum_{i=0}^{n-1} O(n_i^2)\right]
= \Theta(n) + \sum_{i=0}^{n-1} E[O(n_i^2)]
= \Theta(n) + \sum_{i=0}^{n-1} O(E[n_i^2])
= \Theta(n) + O(n \cdot E[n_1^2])
\]
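Let X_i = 1 if A[i] falls into bucket 1, and X_i = 0 otherwise. Then
\[
n_1 = \sum_{i=1}^{n} X_i
\]
\[
E[n_1^2] = E\left[\left(\sum_{i=1}^{n} X_i\right)^2\right]
= E\left[\sum_{i=1}^{n} X_i^2 + \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n} X_i X_j\right]
= \sum_{i=1}^{n} E[X_i^2] + \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n} E[X_i X_j]
= n E[X_1^2] + n(n-1) E[X_1 X_2]
\]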
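\[
E[X_1^2] = 0 \cdot \left(1 - \frac{1}{n}\right) + 1 \cdot \frac{1}{n} = \frac{1}{n}
\]
\[
E[X_1 X_2] = 0 \cdot \left(1 - \frac{1}{n^2}\right) + 1 \cdot \frac{1}{n^2} = \frac{1}{n^2}
\]
Therefore,
\[
E[n_1^2] = n \cdot \frac{1}{n} + n(n-1) \cdot \frac{1}{n^2} = 1 + \frac{n-1}{n} = 2 - \frac{1}{n}.
\]
The expected time complexity of Bucket Sort is Θ(n).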
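The value 2 − 1/n can be double-checked with exact arithmetic. Under the uniform and independent assumption, the number of elements in a fixed bucket follows a binomial distribution with n trials and success probability 1/n, so its second moment can be summed directly. A small sketch, with a hypothetical function name:

```python
from fractions import Fraction
from math import comb


def expected_square_bucket_size(n):
    """Exact E[m^2] for m ~ Binomial(n, 1/n), the size of one fixed bucket."""
    p = Fraction(1, n)
    return sum(m * m * comb(n, m) * p**m * (1 - p) ** (n - m) for m in range(n + 1))


for n in (2, 10, 50):
    print(n, expected_square_bucket_size(n), 2 - Fraction(1, n))
```

Both columns agree for every n tried, so each bucket contributes O(1) expected sorting time.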