24
Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8 Some of these slides are courtesy of D. Plaisted et al, UNC and M. Nicolescu,

Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8 Some of these slides are courtesy of D. Plaisted et al, UNC and M. Nicolescu, UNR

Embed Size (px)

Citation preview

Page 1: Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8 Some of these slides are courtesy of D. Plaisted et al, UNC and M. Nicolescu, UNR

Computer AlgorithmsLecture 11

Sorting in Linear Time

Ch. 8

Some of these slides are courtesy of D. Plaisted et al, UNC and M. Nicolescu, UNR

Page 2: Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8 Some of these slides are courtesy of D. Plaisted et al, UNC and M. Nicolescu, UNR

Comparison-based Sorting

• Comparison sort– Only comparison of pairs of elements may be used to gain order

information about a sequence.– Hence, a lower bound on the number of comparisons will be a

lower bound on the complexity of any comparison-based sorting algorithm.

• What sorts we have analyzed so far have been comparison sorts

• The best worst-case complexity so far is (n lg n)– To sort n elements, comparison sorts must make (nlgn)

comparisons in the worst case

• We prove a lower bound of n lg n, (or (n lg n)) for any comparison sort, implying that merge sort and heapsort are optimal.

Page 3: Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8 Some of these slides are courtesy of D. Plaisted et al, UNC and M. Nicolescu, UNR

Decision Tree Model• Full binary tree: every node is either a leaf of has degree 2• Represents the comparisons made by a sorting algorithm on an

input of a given size: models all possible execution traces• Control, data movement, other operations are ignored• Count only the comparisons• Decision tree for insertion sort on three elements:

node

leaf:

one execution trace

Page 4: Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8 Some of these slides are courtesy of D. Plaisted et al, UNC and M. Nicolescu, UNR

Decision Tree Model

• All permutations on n elements must appear as one of the leaves in the decision tree

• Worst-case number of comparisons – the length of the longest path from the root to a leaf – the height of the decision tree

n! permutations

Page 5: Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8 Some of these slides are courtesy of D. Plaisted et al, UNC and M. Nicolescu, UNR

Decision Tree Model

• Goal: finding a lower bound on the running time on any comparison sort algorithm– find a lower bound on the heights of all decision trees for all

algorithms

Page 6: Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8 Some of these slides are courtesy of D. Plaisted et al, UNC and M. Nicolescu, UNR

Lemma• Any binary tree of height h has at most

Proof: induction on h

Basis: h = 0 tree has one node, which is a leaf

2h = 1

Inductive step: assume true for h-1 (2h-1 leaves)– Extend the height of the tree with one more level– Each leaf becomes parent to two (complete binary tree) new

leaves

No. of leaves for tree of height h =

= 2 (no. of leaves for tree of height h-1)

≤ 2 2h-1

= 2h

2h leaves

Page 7: Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8 Some of these slides are courtesy of D. Plaisted et al, UNC and M. Nicolescu, UNR

Lower Bound for Comparison SortsTheorem: Any comparison sort algorithm requires (nlgn) comparisons in the worst case.

Proof: How many leaves does the tree have? – At least n! (each of the n! permutations of the input appears as

some leaf) n! ≤ l – At most 2h leaves

n! ≤ l ≤ 2h

h ≥ lg(n!) = (nlgn)

(3.18)

We can beat the (nlgn) running time if we use other operations than comparisons!

h

l leaves

What comparison sorts are asymptotically optimal? Merge sort and heapsort

Page 8: Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8 Some of these slides are courtesy of D. Plaisted et al, UNC and M. Nicolescu, UNR

Beating the lower bound

• We can beat the lower bound if we don’t base our sort on comparisons:– Counting sort for keys in [0..k], k=O(n)– Radix sort for keys with a fixed number of “digits”– Bucket sort for random keys (uniformly distributed)

Page 9: Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8 Some of these slides are courtesy of D. Plaisted et al, UNC and M. Nicolescu, UNR

Counting Sort

• Assumption: – The elements to be sorted are integers in the range 0 to k

• Idea:– Determine for each input element x, the number of elements

smaller than or equal to x– Place element x into its correct position in the output array

303203521 2 3 4 5 6 7 8

A 774221 2 3 4 5

C 80

533322001 2 3 4 5 6 7 8

B

Page 10: Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8 Some of these slides are courtesy of D. Plaisted et al, UNC and M. Nicolescu, UNR

Counting Sort

Alg.: COUNTING-SORT(A, B, n, k)1. for i ← 0 to k2. do C[ i ] ← 03. for j ← 1 to n4. do C[A[ j ]] ← C[A[ j ]] + 15. C[i] contains the number of elements equal to i6. for i ← 1 to k7. do C[ i ] ← C[ i ] + C[i -1]8. C[i] contains the number of elements ≤ i9. for j ← n downto 110. do B[C[A[ j ]]] ← A[ j ]11. C[A[ j ]] ← C[A[ j ]] - 1

1 n

0 k

A

C

1 n

B

j

Page 11: Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8 Some of these slides are courtesy of D. Plaisted et al, UNC and M. Nicolescu, UNR

Example

303203521 2 3 4 5 6 7 8

A

032021 2 3 4 5

C 10

774221 2 3 4 5

C 80

31 2 3 4 5 6 7 8

B

764221 2 3 4 5

C 80

301 2 3 4 5 6 7 8

B

764211 2 3 4 5

C 80

3301 2 3 4 5 6 7 8

B

754211 2 3 4 5

C 80

33201 2 3 4 5 6 7 8

B

753211 2 3 4 5

C 80

CountingSort(A, B, k)

1. for i 1 to k

2. do C[i] 03. for j 1 to length[A]

4. do C[A[j]] C[A[j]] + 1

5. for i 2 to k

6. do C[i] C[i] + C[i –1]

7. for j length[A] downto 1

8. do B[C[A[ j ]]] A[j]

9. C[A[j]] C[A[j]]–1

CountingSort(A, B, k)

1. for i 1 to k

2. do C[i] 03. for j 1 to length[A]

4. do C[A[j]] C[A[j]] + 1

5. for i 2 to k

6. do C[i] C[i] + C[i –1]

7. for j length[A] downto 1

8. do B[C[A[ j ]]] A[j]

9. C[A[j]] C[A[j]]–1

A[8]=3 A[7]=0

A[6]=3 A[5]=2

Page 12: Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8 Some of these slides are courtesy of D. Plaisted et al, UNC and M. Nicolescu, UNR

Example (cont.)

303203521 2 3 4 5 6 7 8

A

332001 2 3 4 5 6 7 8

B

753201 2 3 4 5

C 80

53332001 2 3 4 5 6 7 8

B

743201 2 3 4 5

C 70

3332001 2 3 4 5 6 7 8

B

743201 2 3 4 5

C 80

533322001 2 3 4 5 6 7 8

B

A[4]=0

A[3]=3

A[2]=5

A[1]=2

Page 13: Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8 Some of these slides are courtesy of D. Plaisted et al, UNC and M. Nicolescu, UNR

Counting-Sort (A, B, k)

CountingSort(A, B, k)

1. for i 1 to k

2. do C[i] 03. for j 1 to length[A]

4. do C[A[j]] C[A[j]] + 1

5. for i 2 to k

6. do C[i] C[i] + C[i –1]

7. for j length[A] downto 1

8. do B[C[A[ j ]]] A[j]

9. C[A[j]] C[A[j]]–1

CountingSort(A, B, k)

1. for i 1 to k

2. do C[i] 03. for j 1 to length[A]

4. do C[A[j]] C[A[j]] + 1

5. for i 2 to k

6. do C[i] C[i] + C[i –1]

7. for j length[A] downto 1

8. do B[C[A[ j ]]] A[j]

9. C[A[j]] C[A[j]]–1

O(k)

O(k)

O(n)

O(n)

Overall time: (n + k)

Stable, but not in place.

No comparisons made: it uses actual values of the elements to index into an array.

Page 14: Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8 Some of these slides are courtesy of D. Plaisted et al, UNC and M. Nicolescu, UNR

Radix Sort• It was used by the card-sorting

machines.• Card sorters worked on one column at

a time.• It is the algorithm for using the

machine that extends the technique to multi-column sorting.

• The human operator was part of the algorithm!

Key idea: sort on the “least significant digit” first and on the remaining digits in sequential order. The sorting method used to sort each digit must be “stable”.

If we start with the “most significant digit”, we’ll need extra storage.

Page 15: Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8 Some of these slides are courtesy of D. Plaisted et al, UNC and M. Nicolescu, UNR

Radix Sort

• Considers keys as numbers in a base-k number– A d-digit number will occupy a field of d columns

• Sorting looks at one column at a time– For a d-digit number, sort the least significant digit first

– Continue sorting on the next least significant digit, until all digits have been sorted

– Requires only d passes through the list

Page 16: Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8 Some of these slides are courtesy of D. Plaisted et al, UNC and M. Nicolescu, UNR

An Example

392 631 928 356

356 392 631 392

446 532 532 446

928 495 446 495

631 356 356 532

532 446 392 631

495 928 495 928

Input After sortingon LSD

After sortingon middle digit

After sortingon MSD

Page 17: Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8 Some of these slides are courtesy of D. Plaisted et al, UNC and M. Nicolescu, UNR

Radix-Sort(A, d)

Correctness of Radix Sort

By induction on the number of digits sorted.

Assume that radix sort works for d – 1 digits.

Show that it works for d digits.

Radix sort of d digits radix sort of the low-order d – 1 digits followed by a sort on digit d .

RadixSort(A, d)

1. for i 1 to d

2. do use a stable sort to sort array A on digit I

1 is the lowest order digit, d is the highest-order digit

RadixSort(A, d)

1. for i 1 to d

2. do use a stable sort to sort array A on digit I

1 is the lowest order digit, d is the highest-order digit

Page 18: Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8 Some of these slides are courtesy of D. Plaisted et al, UNC and M. Nicolescu, UNR

Correctness of Radix sort• We use induction on the number d of passes through the digits

• Basis: If d = 1, there’s only one digit, trivial

• Inductive step: assume digits 1, 2, . . . , d-1 are sorted– Now sort on the d-th digit

– If ad < bd, sort will put a before b: correct

a < b regardless of the low-order digits

– If ad > bd, sort will put a after b: correct

a > b regardless of the low-order digits

– If ad = bd, sort will leave a and b in the

same order and a and b are already sorted

on the low-order d-1 digits

Page 19: Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8 Some of these slides are courtesy of D. Plaisted et al, UNC and M. Nicolescu, UNR

Analysis of Radix Sort

• Given n numbers of d digits each, where each digit may

take up to k possible values, RADIX-SORT correctly

sorts the numbers in (d(n+k))

– One pass of sorting per digit takes (n+k) assuming that we use

counting sort

– There are d passes (for each digit)

Page 20: Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8 Some of these slides are courtesy of D. Plaisted et al, UNC and M. Nicolescu, UNR

Bucket Sort

• Assumption: – the input is generated by a random process that distributes

elements uniformly over [0, 1)– the discrete () uniform distribution is a discrete probability distribution

that can be characterized by saying that all values of a finite set of possible values are equally probable.

• Idea:– Divide [0, 1) into n equal-sized buckets– Distribute the n input values into the buckets– Sort each bucket– Go through the buckets in order, listing elements in each one

• Input: A[1 . . n], where 0 ≤ A[i] < 1 for all i • Output: elements in A sorted• Auxiliary array: B[0 . . n - 1] of linked lists, each list initially empty

Page 21: Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8 Some of these slides are courtesy of D. Plaisted et al, UNC and M. Nicolescu, UNR

Example - Bucket Sort

.78

.17

.39

.26

.72

.94

.21

.12

.23

.68

0

1

2

3

4

5

6

7

8

9

1

2

3

4

5

6

7

8

9

10

.21

.12 /

.72 /

.23 /

.78

.94 /

.68 /

.39 /

.26

.17

/

/

/

/

Page 22: Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8 Some of these slides are courtesy of D. Plaisted et al, UNC and M. Nicolescu, UNR

Example - Bucket Sort

0

1

2

3

4

5

6

7

8

9

.23

.17 /

.78 /

.26 /

.72

.94 /

.68 /

.39 /

.21

.12

/

/

/

/

.17.12 .23 .26.21 .39 .68 .78.72 .94 /

Concatenate the lists from 0 to n – 1 together, in order

Page 23: Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8 Some of these slides are courtesy of D. Plaisted et al, UNC and M. Nicolescu, UNR

Analysis

• Relies on no bucket getting too many values.

• All lines except insertion sorting take O(n) altogether.

• Intuitively, if each bucket gets a constant number of elements, it takes O(1) time to sort each bucket O(n) sort time for all buckets.

• We “expect” each bucket to have few elements, since the average is 1 element per bucket.

• But we need to do a careful analysis.

Page 24: Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8 Some of these slides are courtesy of D. Plaisted et al, UNC and M. Nicolescu, UNR

Non-Comparison Based Sorts

Counting Sort O(n + k) O(n + k) O(n + k) noRadix Sort O(d(n + k')) O(d(n + k')) O(d(n + k')) noBucket Sort O(n) no

worst-case average-case best-case in place

Running Time

Summary

Counting sort assumes input elements are in range [0,1,2,..,k] and uses array indexing to count the number of occurrences of each value.

Radix sort assumes each integer consists of d digits, and each digit is in range [1,2,..,k'].

Bucket sort requires advance knowledge of input distribution (sorts n numbers uniformly distributed in range in O(n) time).