CSC 2300 Data Structures & Algorithms March 23, 2007 Chapter 7. Sorting

CSC 2300 Data Structures & Algorithms

March 23, 2007

Chapter 7. Sorting

Today – Sorting

Quicksort: Algorithm, Pivot, Analysis (Worst Case, Best Case, Average Case)

Quicksort – Algorithm

1. If the number of elements in S is 0 or 1, then return.

2. Pick any element v in S. This is called the pivot.

3. Partition S – {v} into two disjoint groups:

S1 = { x ∈ S – {v} | x ≤ v }

and

S2 = { x ∈ S – {v} | x ≥ v }.

4. Return { quicksort(S1) followed by v followed by quicksort(S2)}.
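The four steps can be sketched directly in Python. This is a minimal illustration, not an efficient implementation: it builds new lists instead of partitioning in place, it picks the middle element as the pivot (an arbitrary choice), and it sends elements equal to the pivot to S1 (an assumption of this sketch, since the definition allows either group).

```python
def quicksort(s):
    """Return a new sorted list, following the four steps above."""
    # Step 1: zero or one element is already sorted.
    if len(s) <= 1:
        return list(s)
    # Step 2: pick any element as the pivot; the middle one is an arbitrary choice.
    mid = len(s) // 2
    v = s[mid]
    rest = s[:mid] + s[mid + 1:]          # S - {v}
    # Step 3: partition; ties go to S1 (an assumption of this sketch).
    s1 = [x for x in rest if x <= v]
    s2 = [x for x in rest if x > v]
    # Step 4: sorted S1, then the pivot, then sorted S2.
    return quicksort(s1) + [v] + quicksort(s2)


print(quicksort([8, 1, 4, 9, 6, 3, 5, 2, 7, 0]))
```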

Quicksort – Example

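Quicksort – Partition Strategy

Example. Input: 8, 1, 4, 9, 6, 3, 5, 2, 7, 0. Say 6 is chosen as pivot. The pivot is first swapped with the last element; i then scans right from the first element and j scans left from the element next to the pivot.

8 1 4 9 0 3 5 2 7 6    (start: i at 8, j at 7, pivot 6 at the end)

8 1 4 9 0 3 5 2 7 6    (i stops at 8, j stops at 2)

2 1 4 9 0 3 5 8 7 6    (after swapping 8 and 2)

2 1 4 9 0 3 5 8 7 6    (i stops at 9, j stops at 5)

2 1 4 5 0 3 9 8 7 6    (after swapping 9 and 5)

2 1 4 5 0 3 9 8 7 6    (i stops at 9, j stops at 3: the pointers have crossed)

2 1 4 5 0 3 6 8 7 9    (after swapping the pivot with the element at i)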
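The trace can be reproduced with a short in-place partition routine. This is a sketch under stated assumptions: the function name `partition` and its signature are illustrative, the list is assumed non-empty, and both pointers stop on elements equal to the pivot.

```python
def partition(a, pivot_index):
    """Partition list a in place around a[pivot_index]; return the pivot's final index."""
    last = len(a) - 1
    a[pivot_index], a[last] = a[last], a[pivot_index]   # park the pivot at the end
    pivot = a[last]
    i, j = 0, last - 1
    while True:
        while i < last and a[i] < pivot:    # i skips elements smaller than the pivot
            i += 1
        while j > 0 and a[j] > pivot:       # j skips elements larger than the pivot
            j -= 1
        if i >= j:                          # pointers met or crossed
            break
        a[i], a[j] = a[j], a[i]
        i += 1
        j -= 1
    a[i], a[last] = a[last], a[i]           # put the pivot between the two groups
    return i


a = [8, 1, 4, 9, 6, 3, 5, 2, 7, 0]
p = partition(a, 4)     # the pivot 6 sits at index 4
print(p, a)             # 6 [2, 1, 4, 5, 0, 3, 6, 8, 7, 9]
```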

Choices of Pivot

Four suggestions: First element of array; Larger of first two distinct elements of array; Middle element of array; Randomly.

What do you think about these choices? All bad choices. Why?
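One way to see the problem with the first suggestion is to count comparisons on presorted input. The sketch below is only an illustration: the helper name `comparisons_first_pivot` is hypothetical, it covers just the first-element rule, and it counts one comparison per non-pivot element at each partition step.

```python
def comparisons_first_pivot(s):
    """Count element-vs-pivot comparisons when the first element is always the pivot."""
    count = 0
    stack = [list(s)]                 # explicit stack avoids deep recursion on bad inputs
    while stack:
        group = stack.pop()
        if len(group) <= 1:
            continue
        v, rest = group[0], group[1:]
        count += len(rest)            # each remaining element is compared with the pivot once
        stack.append([x for x in rest if x <= v])
        stack.append([x for x in rest if x > v])
    return count


n = 200
print(comparisons_first_pivot(range(n)))   # presorted input: n(n - 1)/2 = 19900
```

On sorted input every partition puts all remaining elements on one side, so the count grows quadratically.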

Good Choice of Pivot

Best choice: median of array. Disadvantage? Practical choice: Median of Three. What is it? Median of left, right, and center elements. Example: 8, 1, 4, 9, 6, 3, 5, 2, 7, 0. Median of 8, 6, and 0.

Example

Example: 8, 1, 4, 9, 6, 3, 5, 2, 7, 0. Pivot = Median of 8, 6, and 0. What should new array look like? Recall what we have done:

8 1 4 9 0 3 5 2 7 6    (markers, left to right: i, j, pivot)

Can we do better?

0 1 4 9 6 3 5 2 7 8    (markers, left to right: i, pivot, j)

Where should we move pivot?

0 1 4 9 7 3 5 2 6 8    (markers, left to right: i, j, pivot)

Median-of-Three Code
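A minimal sketch of median-of-three pivot selection, patterned on the common textbook routine and not necessarily the code shown in the lecture. The name `median_of_three` and the choice to return the pivot value are assumptions of this sketch; it also assumes the subarray has at least three elements.

```python
def median_of_three(a, left, right):
    """Order a[left], a[center], a[right] in place and return the pivot (their median)."""
    center = (left + right) // 2
    if a[center] < a[left]:
        a[left], a[center] = a[center], a[left]
    if a[right] < a[left]:
        a[left], a[right] = a[right], a[left]
    if a[right] < a[center]:
        a[center], a[right] = a[right], a[center]
    # Now a[left] <= a[center] <= a[right]; move the pivot next to the right end.
    a[center], a[right - 1] = a[right - 1], a[center]
    return a[right - 1]


a = [8, 1, 4, 9, 6, 3, 5, 2, 7, 0]
print(median_of_three(a, 0, len(a) - 1), a)   # 6 [0, 1, 4, 9, 7, 3, 5, 2, 6, 8]
```

After the call, the left and right elements are already on the correct sides of the pivot, so the scan only needs to cover the positions between them.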

Quicksort – Analysis

Quicksort is recursive. We thus get a recurrence formula:

T(0) = T(1) = 1,

T(N) = T(i) + T(N – i – 1) + cN,

where i denotes the number of elements in S1. What value of i gives worst case? What value of i gives best case?

Worst Case Analysis

We have i = 0, always. What does that say about the pivot? Always the smallest element. Recurrence becomes

T(N) = T(0) + T(N – 1) + cN.

Ignore T(0), and get

T(N) = T(N – 1) + cN.

Hence

T(N – 1) = T(N – 2) + c(N – 1),

T(N – 2) = T(N – 3) + c(N – 2),

…

T(2) = T(1) + c(2).

We get

T(N) = T(1) + c ∑_{i=2}^{N} i = 1 + c [N(N+1)/2 – 1] = O(N²).
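The closed form can be checked numerically by iterating the recurrence. This is a small sanity check, assuming T(1) = 1 and an integer constant c; the function names are illustrative.

```python
def worst_case_T(n, c=1):
    """Evaluate T(N) = T(N-1) + cN with T(1) = 1 by direct iteration."""
    t = 1
    for k in range(2, n + 1):
        t += c * k
    return t


def closed_form(n, c=1):
    """1 + c [N(N+1)/2 - 1], the result of the telescoping sum."""
    return 1 + c * (n * (n + 1) // 2 - 1)


for n in (1, 2, 10, 100):
    print(n, worst_case_T(n), closed_form(n))
```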

Best Case Analysis

We have i = N/2, always. What does that say about the pivot? Always the median. Recurrence becomes

T(N) = T(N/2) + T(N/2) + cN = 2 T(N/2) + cN.

Do you remember how to solve this recurrence? Divide by N to get

T(N)/N = T(N/2)/(N/2) + c.

Thus,

T(N/2)/(N/2) = T(N/4)/(N/4) + c,

T(N/4)/(N/4) = T(N/8)/(N/8) + c,

…

T(2)/2 = T(1)/1 + c.

We get

T(N)/N = T(1)/1 + c log N,

and so

T(N) = N + c N log N = O(N log N).
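For N a power of two, the solution can be verified exactly. The sketch assumes T(1) = 1, an integer c, and a base-2 logarithm (the number of halvings); `best_case_T` is an illustrative name.

```python
def best_case_T(n, c=1):
    """Evaluate T(N) = 2 T(N/2) + cN with T(1) = 1, for N a power of two."""
    if n == 1:
        return 1
    return 2 * best_case_T(n // 2, c) + c * n


for k in range(0, 11):
    n = 2 ** k
    # Closed form N + c N log2(N), here with c = 1 and log2(N) = k.
    assert best_case_T(n) == n + n * k
    print(n, best_case_T(n))
```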

Average Case Analysis

Always much harder than worst and best cases. What can we assume about the pivot? Assume that each of the sizes for S1 is equally likely and thus has probability 1/N. The average value of T(i) is thus (1/N) ∑_{j=0}^{N–1} T(j). What can we say about the value of T(N – i – 1)? It has the same average. Recurrence becomes

T(N) = (2/N) ∑_{j=0}^{N–1} T(j) + cN.

Does this recurrence look familiar? We saw a similar one when we did an internal path length analysis in Chapter 4 (Trees).

Average Case Analysis

Recurrence:

T(N) = (2/N) ∑_{j=0}^{N–1} T(j) + cN.

How can we solve this recurrence? Divide by N? No: multiply by N. We get this recurrence:

N T(N) = 2 ∑_{j=0}^{N–1} T(j) + cN².

How do we get rid of the ∑ T(j)? We use the same recurrence with N replaced by N – 1:

(N – 1) T(N – 1) = 2 ∑_{j=0}^{N–2} T(j) + c(N – 1)².

Subtracting one recurrence from the other, we get

N T(N) – (N – 1) T(N – 1) = 2 T(N – 1) + c(2N – 1).

Simplifying and dropping the insignificant –c term, we get

N T(N) = (N+1) T(N – 1) + 2cN.
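The subtraction step can be checked with exact rational arithmetic. The sketch assumes T(0) = T(1) = 1 and applies the averaged recurrence from N = 2 onward, so the identity before the –c term is dropped, N T(N) = (N+1) T(N – 1) + c(2N – 1), is expected to hold for N ≥ 3. The name `average_case_table` is illustrative.

```python
from fractions import Fraction


def average_case_table(n_max, c=1):
    """Tabulate T(N) = (2/N) * sum_{j=0}^{N-1} T(j) + cN with T(0) = T(1) = 1."""
    t = [Fraction(1), Fraction(1)]
    running_sum = t[0] + t[1]          # sum of T(0..N-1) for the next N
    for n in range(2, n_max + 1):
        t_n = Fraction(2, n) * running_sum + c * n
        t.append(t_n)
        running_sum += t_n
    return t


t = average_case_table(30)
for n in range(3, 31):
    # Exact form of the subtracted recurrence, before the -c term is dropped.
    assert n * t[n] == (n + 1) * t[n - 1] + (2 * n - 1)
print(t[2], t[3])   # 4 7
```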

Recurrence

Recurrence:

N T(N) = (N+1) T(N – 1) + 2cN.

How can we solve this recurrence? Divide by N? Divide by N+1? No: divide by N(N+1). We get this recurrence:

T(N)/(N+1) = T(N – 1)/N + 2c/(N+1).

What to do now? We can telescope:

T(N – 1)/N = T(N – 2)/(N – 1) + 2c/N,

T(N – 2)/(N – 1) = T(N – 3)/(N – 2) + 2c/(N – 1),

…

T(2)/3 = T(1)/2 + 2c/3.

We get this solution:

T(N)/(N+1) = T(1)/2 + 2c ∑_{i=3}^{N+1} (1/i).

What does ∑ (1/i) equal? About ln(N + 1) + γ – 3/2, where γ ≈ 0.577 is Euler's constant. We get T(N) = O(N log N).
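The telescoped solution can be compared against the recurrence itself. This is a sketch assuming T(1) = 1 and the sum running from i = 3 to N + 1, as the telescoping steps imply; the function names are illustrative. The printed ratio T(N)/(N ln N) is only a rough indication of the N log N growth.

```python
from fractions import Fraction
import math


def simplified_T(n_max, c=1):
    """Iterate N*T(N) = (N+1)*T(N-1) + 2cN from T(1) = 1; index 0 is unused."""
    t = [None, Fraction(1)]
    for n in range(2, n_max + 1):
        t.append((Fraction(n + 1) * t[n - 1] + 2 * c * n) / n)
    return t


def telescoped_T(n, c=1):
    """T(N) = (N+1) * (T(1)/2 + 2c * sum_{i=3}^{N+1} 1/i), from the telescoping."""
    harmonic_tail = sum(Fraction(1, i) for i in range(3, n + 2))
    return (n + 1) * (Fraction(1, 2) + 2 * c * harmonic_tail)


t = simplified_T(1000)
for n in (10, 100, 1000):
    assert t[n] == telescoped_T(n)
    print(n, float(t[n]) / (n * math.log(n)))
```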