Sorting
Suppose you wanted to write a computer game like Doom 4: The Caverns of Calvin…
How do you render those nice (lurid) pictures of Calvin College torture chambers, with hidden surfaces removed?
Given a collection of polygons (points, tests, values), how do you sort them?
My favorite sort:
What are your favorite sorts?
Read 6.1-6.5, omit rest of chapter 6.
Simple (and slow) algorithms
Bubble sort:
Selection sort:
Insertion sort:
Which is best? Important factors: comparisons, data movement
Sorting out Sorting
A collection or file of items with keys
Sorting may be on items or pointers
Sorting may be internal or external
Sorting may or may not be stable
Simple algorithms:
easy to implement
slow (on big sets of data)
show the basic approaches, concepts
May be used to improve fancier algorithms
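On the stability point above: a sort is stable if items with equal keys come out in the same relative order they went in. The sketch below is one way to see this, assuming a hypothetical `Record` type and a hypothetical function name; it is a plain insertion sort on the key, which is stable because an item only moves left past items with a strictly larger key.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record type for illustration: sorted on key only.
struct Record {
    int key;
    std::string name;
};

// Insertion sort on the key. Stable: v moves left only past records
// whose key is strictly larger, so equal keys never change order.
void stableSortByKey(std::vector<Record>& a) {
    for (std::size_t i = 1; i < a.size(); i++) {
        Record v = a[i];
        std::size_t j = i;
        while (j > 0 && v.key < a[j - 1].key) {
            a[j] = a[j - 1];
            j--;
        }
        a[j] = v;
    }
}
```

If the loop test were changed to "less than or equal", records with equal keys would swap places and the sort would no longer be stable.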
Sorting Utilities
We’d like our sorting algorithms to work with all data types…

template <class Item>
void exch(Item &A, Item &B)
{ Item t = A; A = B; B = t; }

template <class Item>
void compexch(Item &A, Item &B)
{ if (B < A) exch(A, B); }
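The utilities only ask two things of an item type: it can be copied, and it has an operator<. A small sketch of what that buys us, using a hypothetical `Card` type that is not part of the slides:

```cpp
#include <string>

template <class Item>
void exch(Item &A, Item &B) { Item t = A; A = B; B = t; }

template <class Item>
void compexch(Item &A, Item &B) { if (B < A) exch(A, B); }

// Hypothetical item type for illustration: ordered by rank only.
struct Card {
    int rank;
    std::string suit;
};

// The only thing the sorting utilities need from a new type.
bool operator<(const Card &x, const Card &y) { return x.rank < y.rank; }
```

With this in place, `compexch` works unchanged on ints, on `std::string` (which already has operator<), and on `Card`.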
Bubble Sort
The first sort most students learn
And the worst…

template <class Item>
void bubble(Item a[], int l, int r)
{
    for (int i = l; i < r; i++)
        for (int j = r; j > i; j--)
            compexch(a[j-1], a[j]);
}

comparisons? something like n^2/2
data movements? something like n^2/2
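The "something like n^2/2" figure can be checked by counting. A minimal sketch, assuming int keys and a hypothetical function name; it runs the same two loops as the bubble sort above and returns the number of key comparisons, which is exactly n(n-1)/2 regardless of the input order.

```cpp
#include <vector>

// Hypothetical instrumented bubble sort: sorts a[l..r] and returns
// the number of key comparisons it made.
long bubbleCountComparisons(std::vector<int>& a, int l, int r) {
    long comparisons = 0;
    for (int i = l; i < r; i++) {
        for (int j = r; j > i; j--) {
            comparisons++;
            if (a[j] < a[j - 1]) {           // same test as compexch
                int t = a[j - 1];
                a[j - 1] = a[j];
                a[j] = t;
            }
        }
    }
    return comparisons;
}
```

For 10 items the count is 45 whether the input is sorted, reversed, or random: bubble sort never stops early in this form.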
Selection Sort
Find smallest element
Exchange with first
Recursively sort rest

template <class Item>
void selection(Item a[], int l, int r)
{
    for (int i = l; i < r; i++) {
        int min = i;
        for (int j = i+1; j <= r; j++)
            if (a[j] < a[min]) min = j;
        exch(a[i], a[min]);
    }
}

comparisons? n^2/2
swaps? n
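Insertion Sort
Like sorting cards
Put next one in place

template <class Item>
void insertion(Item a[], int l, int r)
{
    int i;
    // first pass: move the smallest item to a[l] to act as a sentinel
    for (i = r; i > l; i--) compexch(a[i-1], a[i]);
    // insert a[i] into the sorted run a[l..i-1], shifting larger items right
    for (i = l+2; i <= r; i++) {
        int j = i; Item v = a[i];
        while (v < a[j-1]) { a[j] = a[j-1]; j--; }
        a[j] = v;
    }
}

comparisons? n^2/4
data moves? n^2/4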
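The n^2/4 figures are averages: unlike bubble and selection sort, insertion sort's cost depends heavily on the input order. A sketch that makes this visible, assuming int keys, a hypothetical function name, and a simpler bounds-checked inner loop in place of the sentinel pass:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical instrumented insertion sort: sorts a and returns the
// number of key comparisons. Uses a bounds check instead of a sentinel.
long insertionCountComparisons(std::vector<int>& a) {
    long comparisons = 0;
    for (std::size_t i = 1; i < a.size(); i++) {
        int v = a[i];
        std::size_t j = i;
        while (j > 0) {
            comparisons++;
            if (!(v < a[j - 1])) break;      // v is in place
            a[j] = a[j - 1];                 // shift larger item right
            j--;
        }
        a[j] = v;
    }
    return comparisons;
}
```

On 10 items already in order this makes 9 comparisons; on 10 items in reverse order it makes 45. Random input falls about halfway between, which is where n^2/4 comes from.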
Which one to use?
Selection: few data movements
Insertion: few comparisons
Bubble: blows
But all of these are Θ(n^2), which, as you know, is TERRIBLE for large n
Can we do better than Θ(n^2)?
         32-bit int keys       String keys
N        S     I     B         S     I     B
1000     5     4     11        13    8     19
2000     21    15    45        56    31    78
4000     85    62    182       228   126   321
(S = selection, I = insertion, B = bubble)
Merge Sort
The quintessential divide-and-conquer algorithm
Divide the list in half
Sort each half recursively
Merge the results.
Base case: left as an exercise to the reader
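The three steps above can be sketched as follows. This is one possible top-down version on int keys, not the course's reference code: the function names, the scratch vector, and the choice of base case (a range of zero or one items is already sorted) are all assumptions made for the illustration.

```cpp
#include <vector>

// Merge the sorted runs a[l..m] and a[m+1..r], using aux as scratch space.
void mergeRuns(std::vector<int>& a, std::vector<int>& aux, int l, int m, int r) {
    int i = l, j = m + 1;
    for (int k = l; k <= r; k++) {
        if (i > m)             aux[k] = a[j++];   // left run used up
        else if (j > r)        aux[k] = a[i++];   // right run used up
        else if (a[j] < a[i])  aux[k] = a[j++];   // take the smaller item
        else                   aux[k] = a[i++];   // ties go left: stable
    }
    for (int k = l; k <= r; k++) a[k] = aux[k];
}

void mergeSortRange(std::vector<int>& a, std::vector<int>& aux, int l, int r) {
    if (r <= l) return;                 // assumed base case: 0 or 1 items
    int m = l + (r - l) / 2;            // divide the list in half
    mergeSortRange(a, aux, l, m);       // sort each half recursively
    mergeSortRange(a, aux, m + 1, r);
    mergeRuns(a, aux, l, m, r);         // merge the results
}

void mergeSort(std::vector<int>& a) {
    std::vector<int> aux(a.size());
    mergeSortRange(a, aux, 0, static_cast<int>(a.size()) - 1);
}
```

Taking from the left run on ties is what keeps this version stable.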
Merge Sort Analysis
Recall runtime recurrence: T(1) = 0; T(n) = 2T(n/2) + cn
Θ(n log n) runtime in the worst case
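For reference, the recurrence can be unrolled as follows, assuming for simplicity that n is a power of two (n = 2^k):

```latex
\begin{align*}
T(n) &= 2\,T(n/2) + cn \\
     &= 4\,T(n/4) + 2cn \\
     &= 8\,T(n/8) + 3cn \\
     &\;\;\vdots \\
     &= 2^k\,T(n/2^k) + k\,cn \\
     &= n\,T(1) + cn \lg n \qquad (n = 2^k,\ k = \lg n) \\
     &= cn \lg n .
\end{align*}
```

Each of the lg n levels of recursion contributes cn work in total, which is where the n log n comes from.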
Much better than the simple sorts on big data files – and easy to implement!
Can implement in-place and bottom-up to avoid some data movement and recursion overhead
Still, empirically, it’s slower than Quicksort, which we’ll study next.
Quicksort
Pick a pivot; partition the list; sort halves recursively.
The most widely used algorithm
A heavily studied algorithm with many variations and improvements (“it seems to invite tinkering”)
A carefully tuned quicksort is usually fastest (e.g. Unix’s qsort standard library function)
but not stable, and in some situations slooow…
Quicksort

template <class Item>
void qsort(Item a[], int l, int r)
{
    if (r <= l) return;
    int i = partition(a, l, r);
    qsort(a, l, i-1);
    qsort(a, i+1, r);
}

partition:
pick an item as pivot, p (last item?)
rearrange list into items smaller, equal, and greater than p

Partitioning

template <class Item>
int partition(Item a[], int l, int r)
{
    int i = l-1, j = r;
    Item v = a[r];                              // pivot: the last item
    for (;;) {
        while (a[++i] < v) ;                    // scan right past smaller items
        while (v < a[--j]) if (j == l) break;   // scan left past larger items
        if (i >= j) break;                      // pointers have crossed
        exch(a[i], a[j]);
    }
    exch(a[i], a[r]);                           // put the pivot in its final place
    return i;
}
Quicksort Analysis
What is the runtime for Quicksort? Recurrence relation?
Worst case: Θ(n^2)
Best, average case: Θ(n log n)
When does the worst case arise? When the list is (nearly) sorted! oops…
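To see why sorted input is the bad case when the last item is the pivot, it helps to measure how deep the recursion goes. The sketch below assumes int keys and hypothetical function names; the partition follows the same last-item scheme as the slides, and the sort returns the deepest recursion level reached.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Partition a[l..r] around the last item; returns the pivot's final index.
int partitionLast(std::vector<int>& a, int l, int r) {
    int i = l - 1, j = r;
    int v = a[r];
    for (;;) {
        while (a[++i] < v) ;
        while (v < a[--j]) if (j == l) break;
        if (i >= j) break;
        std::swap(a[i], a[j]);
    }
    std::swap(a[i], a[r]);
    return i;
}

// Sorts a[l..r] and returns the deepest level of recursion reached.
int quicksortDepth(std::vector<int>& a, int l, int r) {
    if (r <= l) return 0;
    int i = partitionLast(a, l, r);
    int left = quicksortDepth(a, l, i - 1);
    int right = quicksortDepth(a, i + 1, r);
    return 1 + std::max(left, right);
}
```

On 100 items already in order, every partition peels off only the pivot, so the depth is 99: one level per item, and each level scans the whole remaining range. A pivot that lands in the middle halves the range instead.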
Recursive algorithms also have lots of overhead. How to reduce the recursion overhead?
Quick Hacks: Cutoff
How to reduce the recursion overhead?
Don’t sort lists of size <= 10 (e.g.)
At the end, run a pass of insertion sort.
In practice, this speeds up the algorithm
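A sketch of the cutoff hack, assuming int keys, a cutoff of 10 as in the example above, and hypothetical function names. Quicksort simply returns on small ranges, leaving them unsorted but with every item within a few places of its final position; one insertion sort pass over the whole array then finishes the job.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

const int kCutoff = 10;   // assumed threshold; tune empirically

int partitionLastItem(std::vector<int>& a, int l, int r) {
    int i = l - 1, j = r;
    int v = a[r];
    for (;;) {
        while (a[++i] < v) ;
        while (v < a[--j]) if (j == l) break;
        if (i >= j) break;
        std::swap(a[i], a[j]);
    }
    std::swap(a[i], a[r]);
    return i;
}

// Quicksort that ignores ranges of kCutoff items or fewer.
void quicksortLeavingSmallFiles(std::vector<int>& a, int l, int r) {
    if (r - l < kCutoff) return;
    int i = partitionLastItem(a, l, r);
    quicksortLeavingSmallFiles(a, l, i - 1);
    quicksortLeavingSmallFiles(a, i + 1, r);
}

// One insertion sort pass; cheap because no item is far from home.
void insertionPass(std::vector<int>& a) {
    for (std::size_t i = 1; i < a.size(); i++) {
        int v = a[i];
        std::size_t j = i;
        while (j > 0 && v < a[j - 1]) { a[j] = a[j - 1]; j--; }
        a[j] = v;
    }
}

void hybridSort(std::vector<int>& a) {
    quicksortLeavingSmallFiles(a, 0, static_cast<int>(a.size()) - 1);
    insertionPass(a);
}
```

The final pass is fast because partitioning has already put every item in the right small block, so each item moves at most a few positions.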
Quick Hacks: Picking a Pivot
How to prevent that nasty worst-case behavior?
Be smarter about picking a pivot
E.g. pick median of first, middle, and last elements as pivot
Again, this yields an improvement in empirical performance: the worst case is much rarer
(what would have to happen to get the worst case?)
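One way to sketch median-of-three, assuming int keys and hypothetical function names: order the first, middle, and last items with three compare-exchanges, then move the middle one into the last slot so that an ordinary last-item partition uses it as the pivot. Tuned versions arrange the three items more cleverly; this is only meant to show the idea.

```cpp
#include <utility>
#include <vector>

// Put the median of a[l], a[m], a[r] into a[r], the pivot slot.
void medianToEnd(std::vector<int>& a, int l, int r) {
    int m = l + (r - l) / 2;
    if (a[m] < a[l]) std::swap(a[l], a[m]);   // a[l] <= a[m]
    if (a[r] < a[l]) std::swap(a[l], a[r]);   // a[l] is the smallest
    if (a[r] < a[m]) std::swap(a[m], a[r]);   // a[l] <= a[m] <= a[r]
    std::swap(a[m], a[r]);                    // median becomes the pivot
}

int partitionAroundLast(std::vector<int>& a, int l, int r) {
    int i = l - 1, j = r;
    int v = a[r];
    for (;;) {
        while (a[++i] < v) ;
        while (v < a[--j]) if (j == l) break;
        if (i >= j) break;
        std::swap(a[i], a[j]);
    }
    std::swap(a[i], a[r]);
    return i;
}

void quicksortMedian3(std::vector<int>& a, int l, int r) {
    if (r <= l) return;
    medianToEnd(a, l, r);
    int i = partitionAroundLast(a, l, r);
    quicksortMedian3(a, l, i - 1);
    quicksortMedian3(a, i + 1, r);
}
```

On input that is already sorted, the first pivot is now the true median, so the first split is even rather than n-1 to 0.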
Quicksort empirical results
           Basic Quicksort        Median-of-three
N          c=0    c=10   c=20     c=0    c=10   c=20
100000     24     22     22       25     20     28
200000     53     48     50       52     44     54
400000     116    105    110      114    97     118
800000     255    231    241      252    213    258
(c = cutoff size for small lists)
A solution attempt
Adjust gray levels so that
5% of the pixels are black
70% are white
The rest are interpolated [then gamma correction is used]
But there are some undesirable artifacts…
Median, Order Statistics
Quicksort improvement idea: use the median as pivot
Order Statistics: an algorithm to
find the smallest element of a list
find the n/2th element (median)
find the largest 20% of the items
find the kth element from the bottom
Algorithm idea: sort, then pick the middle element.
Θ(n log n) worst, average case. This won’t help for quicksort! Can we do better?
Quicksort-based selection
Pick a pivot; partition list. Let i be location of pivot.
If i > k search left part; if i < k search right part; if i = k the pivot is the answer.

template <class Item>
Item select(Item a[], int l, int r, int k)
{
    if (r <= l) return a[r];
    int i = partition(a, l, r);
    if (i > k) return select(a, l, i-1, k);
    if (i < k) return select(a, i+1, r, k);
    return a[i];                    // i == k: the pivot is the kth item
}

Worst-case runtime? O(n^2)
Expected runtime? O(n)
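The C++ standard library offers both routes to the kth item, which makes for an easy comparison. The sketch below uses `std::sort` for the "sort, then pick" idea and `std::nth_element` for selection; the standard specifies `std::nth_element` as linear on average. The wrapper names and the 0-based k are assumptions made for the illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Baseline: sort a copy, then index. k is 0-based.
int kthBySorting(std::vector<int> a, std::size_t k) {
    std::sort(a.begin(), a.end());
    return a[k];
}

// Selection: only guarantees that a[k] is the item a full sort would
// put there, with smaller items before it and larger ones after.
int kthBySelection(std::vector<int> a, std::size_t k) {
    std::nth_element(a.begin(),
                     a.begin() + static_cast<std::ptrdiff_t>(k),
                     a.end());
    return a[k];
}
```

Both functions take the vector by value so the caller's data is left untouched; for the median of n items, pass k = n/2.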
Lower Bound on Sorting
Do you think that there will always be improvements in sorting algorithms?
better than Θ(n)?
better than Θ(n log n)?
how to prove that no comparison sort is better than Θ(n log n) in the worst case? consider all algorithms!?
Few non-trivial lower bounds are known. Hard!
But, we can say that the runtime for any comparison sort is Ω(n log n).
Comparison sort lower bound
How many comparisons are needed to sort?
decision tree: each leaf a permutation; each node a comparison: a < b?
A sort of a particular list: a path from root to leaf.
How many leaves? n!
Shortest possible decision tree? A binary tree with n! leaves has height at least lg n!, so Ω(log n!)
Stirling’s formula (p. 43): lg n! is about n lg n - n lg e + lg(sqrt(2 pi n))
Ω(n log n)!
There is no comparison sort better than Θ(n log n)
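The decision-tree bound can be made concrete for small n: a binary tree of height h has at most 2^h leaves, so the worst-case number of comparisons is at least the smallest h with 2^h >= n!. A sketch that computes this exactly in 64-bit arithmetic (the function name is hypothetical, and the result is only valid for n up to 20, where n! still fits):

```cpp
// Smallest h such that 2^h >= n!: a lower bound on the worst-case
// number of comparisons for any comparison sort of n items.
// Exact for 1 <= n <= 20 (20! fits in 64 bits).
int comparisonLowerBound(int n) {
    unsigned long long factorial = 1;
    for (int i = 2; i <= n; i++) factorial *= static_cast<unsigned long long>(i);
    int h = 0;
    unsigned long long leaves = 1;       // leaves available at height h
    while (leaves < factorial) {
        leaves *= 2;
        h++;
    }
    return h;
}
```

For example, sorting 3 items needs at least 3 comparisons in the worst case, 4 items need at least 5, and 5 items need at least 7.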
(but are there other approaches to sorting?)