Sorting

  • Suppose you wanted to write a computer game like Doom 4: The Caverns of Calvin…
  • How do you render those nice (lurid) pictures of Calvin College torture chambers, with hidden surfaces removed?
  • Given a collection of polygons (points, tests, values), how do you sort them?
  • My favorite sort:
  • What are your favorite sorts?
  • Read 6.1-6.5, omit rest of chapter 6.

Simple (and slow) algorithms

  • Bubble sort:
  • Selection sort:
  • Insertion sort:
  • Which is best? Important factors: comparisons, data movement

Sorting out Sorting

  • A collection or file of items with keys
  • Sorting may be on items or pointers
  • Sorting may be internal or external
  • Sorting may or may not be stable
  • Simple algorithms:
      • easy to implement
      • slow (on big sets of data)
      • show the basic approaches, concepts
      • may be used to improve fancier algorithms
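A sort is stable when items with equal keys keep their original relative order. The sketch below is one way to see the difference, not code from the slides: `Record`, `insertion_sorted`, `selection_sorted`, and `tags` are hypothetical names, and the record is compared on `key` only so that `tag` reveals the input order.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record type: ordered by key only; tag remembers the input order.
struct Record {
    int key;
    char tag;
};

bool operator<(const Record& a, const Record& b) { return a.key < b.key; }

// Insertion sort shifts an item only past strictly larger keys,
// so two equal keys never change places: stable.
std::vector<Record> insertion_sorted(std::vector<Record> a) {
    for (std::size_t i = 1; i < a.size(); ++i) {
        Record v = a[i];
        std::size_t j = i;
        while (j > 0 && v < a[j - 1]) { a[j] = a[j - 1]; --j; }
        a[j] = v;
    }
    return a;
}

// Selection sort exchanges over long distances, which can carry
// an item past another item with the same key: not stable.
std::vector<Record> selection_sorted(std::vector<Record> a) {
    for (std::size_t i = 0; i + 1 < a.size(); ++i) {
        std::size_t min = i;
        for (std::size_t j = i + 1; j < a.size(); ++j)
            if (a[j] < a[min]) min = j;
        std::swap(a[i], a[min]);
    }
    return a;
}

// Concatenate the tags so the order of equal keys is easy to inspect.
std::string tags(const std::vector<Record>& a) {
    std::string s;
    for (const Record& r : a) s += r.tag;
    return s;
}
```

For the input `{2,'a'}, {2,'b'}, {1,'c'}`, the insertion version keeps `a` ahead of `b`, while the selection version swaps `a` to the far end and so reverses the two equal keys.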

Sorting Utilities

We’d like our sorting algorithms to work with all data types…

    template <class Item>
    void exch(Item &A, Item &B)
    { Item t = A; A = B; B = t; }

    template <class Item>
    void compexch(Item &A, Item &B)
    { if (B < A) exch(A, B); }

Bubble Sort

  • The first sort most students learn
  • And the worst…

    template <class Item>
    void bubble(Item a[], int l, int r)
    {
        // Each pass carries the smallest remaining item down to position i.
        for (int i = l; i < r; i++)
            for (int j = r; j > i; j--)
                compexch(a[j-1], a[j]);
    }

  • comparisons? something like n^2/2
  • data movements? something like n^2/2
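To check the counts above on a small file, here is a minimal instrumented sketch. `Counts` and `bubble_counted` are hypothetical names introduced for illustration; the loop structure follows the bubble sort on the slide, with `std::swap` standing in for `exch`.

```cpp
#include <utility>

// Hypothetical tally type for this sketch.
struct Counts {
    long comparisons = 0;
    long exchanges = 0;
};

// Bubble sort on a[l..r] that reports how much work it did.
template <class Item>
Counts bubble_counted(Item a[], int l, int r) {
    Counts c;
    for (int i = l; i < r; i++)
        for (int j = r; j > i; j--) {
            ++c.comparisons;                 // one comparison per inner step
            if (a[j] < a[j - 1]) {
                std::swap(a[j - 1], a[j]);   // each exchange removes one inversion
                ++c.exchanges;
            }
        }
    return c;
}
```

The comparison count is always n(n-1)/2, whatever the input. The exchange count equals the number of inversions, so it ranges from 0 on a sorted file to n(n-1)/2 on a reversed one; "something like n^2/2" is the worst case for data movement.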

Selection Sort

  • Find smallest element
  • Exchange with first
  • Recursively sort rest

    template <class Item>
    void selection(Item a[], int l, int r)
    {
        for (int i = l; i < r; i++)          // start at l, the left end of the file
        {
            int min = i;
            for (int j = i+1; j <= r; j++)
                if (a[j] < a[min]) min = j;
            exch(a[i], a[min]);
        }
    }

  • comparisons? n^2/2
  • swaps? n
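The description above is recursive, while the usual code is a loop. A minimal sketch of the recursive phrasing, assuming the same `a[l..r]` convention; `selection_recursive` is a hypothetical name, and `std::min_element` and `std::iter_swap` come from the standard library rather than from the slides.

```cpp
#include <algorithm>

// Recursive phrasing of selection sort on a[l..r].
template <class Item>
void selection_recursive(Item a[], int l, int r) {
    if (r <= l) return;                                   // 0 or 1 items: already sorted
    Item* smallest = std::min_element(a + l, a + r + 1);  // find smallest element
    std::iter_swap(a + l, smallest);                      // exchange with first
    selection_recursive(a, l + 1, r);                     // recursively sort rest
}
```

Each call does one exchange and the recursion runs over n-1 non-trivial subfiles, which is where the "about n swaps" figure comes from.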

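Insertion Sort

  • Like sorting cards
  • Put next one in place

    template <class Item>
    void insertion(Item a[], int l, int r)
    {
        int i;
        // First pass: move the smallest item to a[l] so it acts as a sentinel.
        for (i = r; i > l; i--) compexch(a[i-1], a[i]);
        for (i = l+2; i <= r; i++)
        {
            int j = i; Item v = a[i];
            // Shift larger items right; the sentinel stops the scan at a[l].
            while (v < a[j-1])
            { a[j] = a[j-1]; j--; }
            a[j] = v;
        }
    }

  • comparisons? n^2/4
  • data moves? n^2/4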
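One way to see where n^2/4 comes from, assuming distinct keys in uniformly random order (an assumption the slide does not state): every step of the inner loop moves one item and removes exactly one inversion, so the number of data moves equals the number of inversions in the input, and the number of comparisons is larger by at most about n. Each pair of positions is inverted with probability 1/2, which gives

```latex
\begin{aligned}
E[\text{inversions}]
  &= \sum_{1 \le i < j \le n} \Pr[a_i > a_j]
   = \binom{n}{2} \cdot \frac{1}{2}
   = \frac{n(n-1)}{4}
   \approx \frac{n^2}{4}.
\end{aligned}
```

On a reversed file every pair is inverted and the count doubles to about n^2/2; on a sorted file it drops to zero.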

Which one to use?

  • Selection: few data movements
  • Insertion: few comparisons
  • Bubble: blows
  • But all of these are Θ(n^2), which, as you know, is TERRIBLE for large n
  • Can we do better than Θ(n^2)?

            32-bit int keys       String keys
     N       S     I     B       S     I     B
  1000       5     4    11      13     8    19
  2000      21    15    45      56    31    78
  4000      85    62   182     228   126   321

(S = selection, I = insertion, B = bubble)

Merge Sort

  • The quintessential divide-and-conquer algorithm
  • Divide the list in half
  • Sort each half recursively
  • Merge the results.
  • Base case: left as an exercise to the reader
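The steps above can be sketched as follows. This is a minimal top-down version under stated assumptions, not the course's implementation: it works on a `std::vector`, merges through a temporary buffer, and takes "zero or one item" as the base case. `merge_runs` and `merge_sort` are hypothetical names.

```cpp
#include <vector>

// Merge the sorted runs a[l..m] and a[m+1..r] through a temporary buffer.
template <class Item>
void merge_runs(std::vector<Item>& a, int l, int m, int r) {
    std::vector<Item> merged;
    merged.reserve(r - l + 1);
    int i = l, j = m + 1;
    while (i <= m && j <= r) {
        // Take from the right run only when it is strictly smaller,
        // so equal keys keep their order (the sort stays stable).
        if (a[j] < a[i]) merged.push_back(a[j++]);
        else             merged.push_back(a[i++]);
    }
    while (i <= m) merged.push_back(a[i++]);
    while (j <= r) merged.push_back(a[j++]);
    for (int k = 0; k < static_cast<int>(merged.size()); ++k) a[l + k] = merged[k];
}

// Sort a[l..r] by divide and conquer.
template <class Item>
void merge_sort(std::vector<Item>& a, int l, int r) {
    if (r <= l) return;          // assumed base case: 0 or 1 items are already sorted
    int m = l + (r - l) / 2;     // divide the list in half
    merge_sort(a, l, m);         // sort each half recursively
    merge_sort(a, m + 1, r);
    merge_runs(a, l, m, r);      // merge the results
}
```

A call such as `merge_sort(v, 0, static_cast<int>(v.size()) - 1)` sorts the whole vector. The temporary buffer costs extra space in exchange for simplicity.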

Merge Sort Analysis

  • Recall runtime recurrence: T(1) = 0; T(n) = 2T(n/2) + cn
  • Θ(n log n) runtime in the worst case
  • Much better than the simple sorts on big data files, and easy to implement!
  • Can implement in-place and bottom-up to avoid some data movement and recursion overhead
  • Still, empirically, it’s slower than Quicksort, which we’ll study next.
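The recurrence T(1) = 0, T(n) = 2T(n/2) + cn can be unrolled directly. The derivation below assumes n is a power of two, n = 2^k, which is a simplification made here for the sake of a clean closed form:

```latex
\begin{aligned}
T(n) &= 2\,T(n/2) + cn \\
     &= 4\,T(n/4) + 2cn \\
     &= 8\,T(n/8) + 3cn \\
     &\;\;\vdots \\
     &= 2^{k}\,T(n/2^{k}) + k\,cn \\
     &= n\,T(1) + cn \lg n \qquad (k = \lg n) \\
     &= cn \lg n .
\end{aligned}
```

Each of the lg n levels of recursion does cn work in total, which is the source of the n log n bound.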

Quicksort

  • Pick a pivot; partition the list around it; sort the two parts recursively.
  • The most widely used sorting algorithm
  • A heavily studied algorithm with many variations and improvements (“it seems to invite tinkering”)
  • A carefully tuned quicksort is usually fastest (e.g. unix’s qsort standard library function)
  • but not stable, and in some situations slooow…

Quicksort

    template <class Item>
    void qsort(Item a[], int l, int r)
    {
        if (r <= l) return;
        int i = partition(a, l, r);
        qsort(a, l, i-1);
        qsort(a, i+1, r);
    }

  • partition: pick an item as pivot, p (last item?)
  • rearrange list into items smaller, equal, and greater than p

Partitioning

    template <class Item>
    int partition(Item a[], int l, int r)
    {
        int i = l-1, j = r;
        Item v = a[r];                              // pivot: the last item
        for (;;)
        {
            while (a[++i] < v) ;                    // scan from the left for an item >= v
            while (v < a[--j]) if (j == l) break;   // scan from the right for an item <= v
            if (i >= j) break;                      // scans crossed: done
            exch(a[i], a[j]);
        }
        exch(a[i], a[r]);                           // put the pivot into its final place
        return i;
    }

Quicksort Analysis

  • What is the runtime for Quicksort?
  • Recurrence relation?
  • Worst case: Θ(n^2)
  • Best, Average case: Θ(n log n)
  • When does the worst case arise? When the list is (nearly) sorted! oops…
  • Recursive algorithms also have lots of overhead. How to reduce the recursion overhead?
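To make the sorted-input worst case concrete, here is a small sketch that measures recursion depth instead of time. It assumes the last-item pivot scheme described in these slides; `partition_last` and `quicksort_depth` are hypothetical names, and `std::swap` stands in for `exch`.

```cpp
#include <algorithm>
#include <utility>

// Partition a[l..r] around the last item; returns the pivot's final index.
template <class Item>
int partition_last(Item a[], int l, int r) {
    int i = l - 1, j = r;
    Item v = a[r];
    for (;;) {
        while (a[++i] < v) ;
        while (v < a[--j]) if (j == l) break;
        if (i >= j) break;
        std::swap(a[i], a[j]);
    }
    std::swap(a[i], a[r]);
    return i;
}

// Sorts a[l..r] and returns the depth of the recursion tree
// (0 for a subfile of at most one item).
template <class Item>
int quicksort_depth(Item a[], int l, int r) {
    if (r <= l) return 0;
    int i = partition_last(a, l, r);
    int left = quicksort_depth(a, l, i - 1);
    int right = quicksort_depth(a, i + 1, r);
    return 1 + std::max(left, right);
}
```

On an already-sorted file of n distinct keys the pivot is always the largest item, so each partition peels off just one element and the depth is n-1, with a partitioning pass over the remaining subfile at every level. That is the Θ(n^2) case. A well-shuffled file gives a depth closer to a small multiple of lg n.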

Quick Hacks: Cutoff

  • How to reduce the recursion overhead?
  • Don’t sort lists of size <= 10 (e.g.)
  • At the end, run a pass of insertion sort.
  • In practice, this speeds up the algorithm

Quick Hacks: Picking a Pivot

  • How to prevent that nasty worst-case behavior?
  • Be smarter about picking a pivot
  • E.g. pick median of first, middle, and last elements as pivot
  • Again, this yields an improvement in empirical performance: the worst case is much rarer
  • (what would have to happen to get the worst case?)
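The two quick hacks (a cutoff for small subfiles and a median-of-three pivot) can be combined in one routine. The sketch below is one possible arrangement, not a tuned implementation: the cutoff value `M = 10` is the example figure from the cutoff slide, and `partition_last`, `quicksort_m3`, `insertion_pass`, and `hybrid_sort` are hypothetical names.

```cpp
#include <utility>

static const int M = 10;   // cutoff: subfiles with r - l <= M are left for insertion sort

template <class Item>
void compexch(Item& a, Item& b) { if (b < a) std::swap(a, b); }

// Partition a[l..r] around the last item; returns the pivot's final index.
template <class Item>
int partition_last(Item a[], int l, int r) {
    int i = l - 1, j = r;
    Item v = a[r];
    for (;;) {
        while (a[++i] < v) ;
        while (v < a[--j]) if (j == l) break;
        if (i >= j) break;
        std::swap(a[i], a[j]);
    }
    std::swap(a[i], a[r]);
    return i;
}

template <class Item>
void quicksort_m3(Item a[], int l, int r) {
    if (r - l <= M) return;                  // cutoff: skip small subfiles
    std::swap(a[(l + r) / 2], a[r - 1]);     // bring the middle item next to the end
    compexch(a[l], a[r - 1]);                // order first, middle, last so that
    compexch(a[l], a[r]);                    //   a[l] <= a[r-1] <= a[r];
    compexch(a[r - 1], a[r]);                //   the median now sits at a[r-1]
    int i = partition_last(a, l + 1, r - 1); // median is the pivot
    quicksort_m3(a, l, i - 1);
    quicksort_m3(a, i + 1, r);
}

// Plain insertion sort with a bounds check; fast when the file is nearly sorted.
template <class Item>
void insertion_pass(Item a[], int l, int r) {
    for (int i = l + 1; i <= r; i++) {
        Item v = a[i];
        int j = i;
        while (j > l && v < a[j - 1]) { a[j] = a[j - 1]; j--; }
        a[j] = v;
    }
}

template <class Item>
void hybrid_sort(Item a[], int l, int r) {
    quicksort_m3(a, l, r);    // leaves every item within a small block of its final place
    insertion_pass(a, l, r);  // one pass finishes the job
}
```

After `quicksort_m3` returns, no item is more than about M positions from where it belongs, so the final insertion pass does little work per item. With median-of-three, a sorted input no longer triggers the worst case, although adversarial inputs still exist.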

Quicksort empirical results (c = cutoff size)

             Basic Quicksort        Median-of-three
      N     c=0   c=10   c=20     c=0   c=10   c=20
 100000      24     22     22      25     20     28
 200000      53     48     50      52     44     54
 400000     116    105    110     114     97    118
 800000     255    231    241     252    213    258

Problem: Page image contrast enhancement

  • How can we adjust contrast so that all of the text looks good, despite illumination variations?

A solution attempt

  • Adjust gray levels so that
      • 5% of the pixels are black
      • 70% are white
      • The rest are interpolated
  • [then gamma correction is used]
  • But there are some undesirable artifacts…
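The gray-level adjustment above can be sketched as a global stretch between two order statistics of the pixel values. This is only one reading of the slide, with several assumptions: 8-bit gray levels, the darkest 5% mapped to black, everything at or above the 30th percentile mapped to white (so roughly 70% end up white), linear interpolation in between, and no gamma step. `level_at` and `stretch_contrast` are hypothetical names; `std::nth_element` is the standard library's selection routine.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Gray level at the given fraction of the sorted pixel values (an order statistic).
// Precondition: pixels is non-empty. Takes a copy because nth_element reorders its input.
std::uint8_t level_at(std::vector<std::uint8_t> pixels, double fraction) {
    std::size_t k = static_cast<std::size_t>(fraction * (pixels.size() - 1));
    std::nth_element(pixels.begin(), pixels.begin() + k, pixels.end());
    return pixels[k];
}

std::vector<std::uint8_t> stretch_contrast(const std::vector<std::uint8_t>& pixels) {
    if (pixels.empty()) return {};
    const int black = level_at(pixels, 0.05);   // at or below this level: black
    const int white = level_at(pixels, 0.30);   // at or above this level: white
    std::vector<std::uint8_t> out;
    out.reserve(pixels.size());
    for (std::uint8_t p : pixels) {
        int v;
        if (p <= black)      v = 0;
        else if (p >= white) v = 255;
        else                 v = (p - black) * 255 / (white - black);  // interpolate
        out.push_back(static_cast<std::uint8_t>(v));
    }
    return out;
}
```

Because the two thresholds are computed once for the whole page, regions that are lit differently get the same mapping, which is a plausible source of the artifacts mentioned above.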

Median, Order Statistics

  • Quicksort improvement idea: use the median as pivot
  • Order Statistics: an algorithm to
      • find the smallest element of a list
      • find the n/2th element (median)
      • find the largest 20% of the items
      • find the kth element from the bottom
  • Algorithm idea: sort, then pick the middle element.
  • Θ(n log n) worst, average case. This won’t help for quicksort!
  • Can we do better?

Quicksort-based selection

  • Pick a pivot; partition list. Let i be location of pivot.
  • If i>k search left part; if i<k search right part; if i=k the pivot is the answer

    template <class Item>
    Item select(Item a[], int l, int r, int k)
    {
        if (r <= l) return a[r];
        int i = partition(a, l, r);
        if (i > k) return select(a, l, i-1, k);
        if (i < k) return select(a, i+1, r, k);
        return a[i];                        // i == k: the pivot is the kth item
    }

  • Worst-case runtime? O(n^2)
  • Expected runtime? O(n)
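Since only one side is ever searched, the recursion in quicksort-based selection is easy to turn into a loop. A minimal non-recursive sketch, assuming the last-item pivot used elsewhere in these slides and an index `k` counted within `a[l..r]`; `partition_last` and `select_kth` are hypothetical names.

```cpp
#include <utility>

// Partition a[l..r] around the last item; returns the pivot's final index.
template <class Item>
int partition_last(Item a[], int l, int r) {
    int i = l - 1, j = r;
    Item v = a[r];
    for (;;) {
        while (a[++i] < v) ;
        while (v < a[--j]) if (j == l) break;
        if (i >= j) break;
        std::swap(a[i], a[j]);
    }
    std::swap(a[i], a[r]);
    return i;
}

// Returns the item that would be at index k if a[l..r] were sorted.
// Rearranges the array as a side effect. Precondition: l <= k <= r.
template <class Item>
Item select_kth(Item a[], int l, int r, int k) {
    while (r > l) {
        int i = partition_last(a, l, r);
        if (i == k) break;          // the pivot landed on k: done
        if (i > k) r = i - 1;       // answer is in the left part
        else       l = i + 1;       // answer is in the right part
    }
    return a[k];
}
```

For a file of n items, `select_kth(a, 0, n - 1, n / 2)` returns a median. The expected cost is linear because, on average, each pass discards a constant fraction of the remaining items.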

Lower Bound on Sorting

  • Do you think that there will always be improvements in sorting algorithms?
      • better than Θ(n)?
      • better than Θ(n log n)?
  • How to prove that no comparison sort is better than Θ(n log n) in the worst case? Consider all algorithms!?
  • Few non-trivial lower bounds are known. Hard!
  • But, we can say that the runtime for any comparison sort is Ω(n log n).

Comparison sort lower bound

  • How many comparisons are needed to sort?
  • decision tree: each leaf a permutation; each node a comparison: a < b?
  • A sort of a particular list: a path from root to leaf.
  • How many leaves? n!
  • Shortest possible decision tree? Height Ω(log n!)
  • Stirling’s formula (p. 43): lg n! is about n lg n - n lg e + lg(sqrt(2 pi n))
  • Ω(n log n)!
  • There is no comparison sort better than Θ(n log n)
  • (but are there other approaches to sorting?)
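The decision-tree argument gives a concrete number for each n: a binary tree with n! leaves has height at least ceil(lg n!), so that many comparisons are needed in the worst case. A minimal sketch that evaluates the bound; `min_comparisons` is a hypothetical name, and the small epsilon is an implementation detail added here to guard against floating-point round-up when lg n! is an exact integer.

```cpp
#include <cmath>

// Information-theoretic lower bound: any comparison sort needs at least
// ceil(lg(n!)) comparisons in the worst case.
int min_comparisons(int n) {
    double lg_factorial = 0.0;
    for (int k = 2; k <= n; ++k)
        lg_factorial += std::log2(static_cast<double>(k));  // lg(n!) = lg 2 + ... + lg n
    return static_cast<int>(std::ceil(lg_factorial - 1e-9));
}
```

For example, `min_comparisons(5)` is 7 because 2^6 = 64 < 5! = 120 <= 2^7 = 128, and `min_comparisons(10)` is 22. The sum of logarithms grows like n lg n, in line with Stirling's formula.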