Divide and Conquer: Merge Sort and Quick Sort

"Nothing is particularly hard if you divide it into small jobs." (Henry Ford)

Divide and Conquer

Recursive in structure:

Divide
▪ The problem into sub-problems that are similar to the original but smaller in size.

Conquer
▪ The sub-problems by solving them recursively.
▪ If they are small enough, just solve them in a straightforward manner.

Combine
▪ The solutions to create a solution to the original problem.
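The three steps can be seen on a problem smaller than sorting. The following is a minimal sketch, not taken from the slides: the function name dc_max and the choice of problem (largest element of an array) are assumptions made only for illustration.

```c
#include <stddef.h>

/* Largest value in a[lo..hi] (inclusive), found by divide and conquer.
   Assumes lo <= hi, i.e. the range is non-empty.
   dc_max is a hypothetical name used only for this illustration. */
int dc_max(const int a[], size_t lo, size_t hi)
{
    if (lo == hi)                              /* small enough: solve directly */
        return a[lo];

    size_t mid = lo + (hi - lo) / 2;           /* divide at the midpoint */
    int left_max  = dc_max(a, lo, mid);        /* conquer the left half */
    int right_max = dc_max(a, mid + 1, hi);    /* conquer the right half */

    /* combine: the answer is the larger of the two partial answers */
    return left_max > right_max ? left_max : right_max;
}
```

Merge sort and quick sort follow the same shape; they differ in whether the real work happens in the combine step (merge sort) or in the divide step (quick sort).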

Merge Sort

Divide the array into two halves, recursively sort the left and right halves, then merge the two halves.

To sort an array A[p . . r]:

Divide
▪ Divide the n-element sequence to be sorted into two subsequences of n/2 elements each.

Conquer
▪ Sort the subsequences recursively using merge sort.
▪ When the size of a sequence is 1, there is nothing more to do.

Combine
▪ Merge the two sorted subsequences to produce the sorted answer.

Merge Sort Approach

▪ Divide it in two at the midpoint.
▪ Conquer each side in turn (by recursively sorting).
▪ Merge the two halves together.

Example input: 8 2 9 4 5 3 1 6

Merge Sort Algorithm

void Mergesort(int A[], int first, int last)
{
    if (first < last)                    // Check for base case
    {
        int mid = (first + last) / 2;    // Divide
        Mergesort(A, first, mid);        // Conquer
        Mergesort(A, mid + 1, last);     // Conquer
        Merge(A, first, mid, last);      // Combine
    }
}

Figure: the array 6 2 3 1 7 4 2 5 at indices 0 to 7, with first = 0, mid = 3 and last = 7 marked.

Merge Sort – Example

Original Sequence: 18 26 32 6 43 15 9 1

Divide:
18 26 32 6 | 43 15 9 1
18 26 | 32 6 | 43 15 | 9 1
18 | 26 | 32 | 6 | 43 | 15 | 9 | 1

Merge:
18 26 | 6 32 | 15 43 | 1 9
6 18 26 32 | 1 9 15 43
1 6 9 15 18 26 32 43

Sorted Sequence: 1 6 9 15 18 26 32 43

Merge Function

Figure: A[first..last] = 6 8 26 32 1 9 42 43 is copied into B = 6 8 26 32 and C = 1 9 42 43. Index i scans B, index j scans C, and index k marks the next position to fill in A. Result: 1 6 8 9 26 32 42 43.

void Merge(int A[], int first, int mid, int last)
{
    int n1, n2, i, j, k;
    n1 = mid - first + 1;
    n2 = last - mid;

    // Declaring auxiliary arrays of size n1 and n2
    int B[n1];
    int C[n2];

    // Moving elements to the auxiliary arrays
    for (i = 0; i < n1; i++)
        B[i] = A[first + i];
    for (j = 0; j < n2; j++)
        C[j] = A[mid + j + 1];

    i = 0; j = 0; k = first;

    // Compare two elements, move one of them to the original array (constant cost per step)
    while (i < n1 && j < n2)
    {
        if (B[i] <= C[j]) { A[k] = B[i]; i = i + 1; }
        else              { A[k] = C[j]; j = j + 1; }
        k = k + 1;
    }

    // Copying the remaining elements to the original array
    while (i < n1) { A[k] = B[i]; k = k + 1; i = i + 1; }
    while (j < n2) { A[k] = C[j]; k = k + 1; j = j + 1; }
}
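As a usage check for the merge step, the sketch below merges the two sorted runs from the figure. It is an illustrative variant, not the slide's listing: the name merge_runs is hypothetical, and it copies into one heap buffer obtained with malloc instead of two variable-length arrays, so that it also compiles where variable-length arrays are unavailable.

```c
#include <stdlib.h>
#include <string.h>

/* Merge the sorted runs a[first..mid] and a[mid+1..last].
   merge_runs is a hypothetical name; the temporary buffer is an
   assumption of this sketch (the slide uses two local arrays B and C). */
static void merge_runs(int a[], int first, int mid, int last)
{
    int n = last - first + 1;
    int *tmp = malloc((size_t)n * sizeof *tmp);
    if (tmp == NULL)
        abort();                 /* sketch only: no recovery when out of memory */

    int i = first, j = mid + 1, k = 0;
    while (i <= mid && j <= last)            /* take the smaller front element */
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i <= mid)                         /* left run has leftovers */
        tmp[k++] = a[i++];
    while (j <= last)                        /* right run has leftovers */
        tmp[k++] = a[j++];

    memcpy(a + first, tmp, (size_t)n * sizeof *tmp);   /* copy back into a */
    free(tmp);
}
```

Calling merge_runs(a, 0, 3, 7) on 6 8 26 32 1 9 42 43 leaves 1 6 8 9 26 32 42 43 in a. Taking from the left run when the two front elements are equal (the <= test) keeps equal keys in their original order.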

Analysis of Merge Function

Cost of each step of Merge(A, first, mid, last), where n = last - first + 1:

▪ Declaring auxiliary arrays of size n1 and n2: O(1)
▪ Moving elements to the auxiliary arrays: O(n)
▪ Initializing i, j and k: O(1)
▪ Compare two elements, move one of them to the original array (constant cost per step): O(n)
▪ Copying the remaining elements to the original array: O(n)

Total cost of Merge: O(n)

Analysis of Merge Sort

So far we have seen that it takes
▪ O(n) time to merge two subarrays of size n/2
▪ O(n) time to merge four subarrays of size n/4 into two subarrays of size n/2
▪ O(n) time to merge eight subarrays of size n/8 into four subarrays of size n/4
▪ Etc.

How many levels deep do we have to proceed? How many times can we divide an array of size n into two halves? O(log n)

So if our recursion goes log n levels deep, and we do O(n) work at each level, our total time is log n * O(n), or in other words O(n log n).

For large arrays, this is much better than Bubblesort, Selection sort, or Insertion sort, all of which are O(n^2).

Not in place: Mergesort does, however, require a "workspace" array as large as our original array (O(n) extra space).
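Both claims, O(n log n) comparisons and O(n) extra space, can be observed directly. The sketch below is one possible arrangement, not the slide's code: it allocates a single workspace of n elements once and counts element comparisons. The names merge_sort_counted, sort_range and comparisons are assumptions of this example.

```c
#include <stdlib.h>
#include <string.h>

static long comparisons;   /* element comparisons made by the last sort */

/* Sort a[first..last], using work[first..last] as scratch space. */
static void sort_range(int a[], int work[], int first, int last)
{
    if (first >= last)
        return;                                   /* size 0 or 1: nothing to do */
    int mid = first + (last - first) / 2;
    sort_range(a, work, first, mid);
    sort_range(a, work, mid + 1, last);

    /* Merge a[first..mid] and a[mid+1..last] into work, then copy back. */
    int i = first, j = mid + 1, k = first;
    while (i <= mid && j <= last) {
        comparisons++;
        work[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    }
    while (i <= mid)
        work[k++] = a[i++];
    while (j <= last)
        work[k++] = a[j++];
    memcpy(a + first, work + first, (size_t)(last - first + 1) * sizeof a[0]);
}

/* Sorts a[0..n-1]. Returns the number of comparisons, or -1 if the
   n-element workspace could not be allocated. */
long merge_sort_counted(int a[], int n)
{
    if (n <= 1)
        return 0;
    int *work = malloc((size_t)n * sizeof *work);   /* the O(n) workspace */
    if (work == NULL)
        return -1;
    comparisons = 0;
    sort_range(a, work, 0, n - 1);
    free(work);
    return comparisons;
}
```

For n = 8 the count never exceeds n lg n = 24, because each of the lg n = 3 merge levels makes fewer than n comparisons.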

Analysis of Merge Sort

Running time T(n) of Merge Sort, step by step through Mergesort(A, first, last):

▪ Check for base case: Θ(1)
▪ Divide: computing the middle takes Θ(1)
▪ Conquer: solving 2 sub-problems takes 2T(n/2)
▪ Combine: merging n elements takes Θ(n)

Total:
T(n) = Θ(1)              if n = 1
T(n) = 2T(n/2) + Θ(n)    if n > 1

T(n) = Θ(n lg n)

Recursion Tree – Example

Running time of Merge Sort:
T(n) = Θ(1)              if n = 1
T(n) = 2T(n/2) + Θ(n)    if n > 1

Rewrite the recurrence as
T(n) = c                 if n = 1
T(n) = 2T(n/2) + cn      if n > 1

where c > 0 is the running time for the base case and the time per array element for the divide and combine steps.

Recursion Tree for Merge Sort

For the original problem, we have a cost of cn (the cost of divide and merge), plus two subproblems each of size n/2 and running time T(n/2) (the cost of sorting the subproblems).

          cn
    T(n/2)    T(n/2)

Each of the size n/2 problems has a cost of cn/2 plus two subproblems, each costing T(n/4).

               cn
       cn/2          cn/2
  T(n/4) T(n/4)  T(n/4) T(n/4)

Continue expanding until the problem size reduces to 1.

Level    Cost of each node            Total for the level
0        cn                           cn
1        cn/2  cn/2                   cn
2        cn/4  cn/4  cn/4  cn/4       cn
...
lg n     c  c  c  ...  c  c  c        cn

• Each level has total cost cn.
• Each time we go down one level, the number of subproblems doubles but the cost per subproblem halves, so the cost per level remains the same.
• There are lg n + 1 levels; the height is lg n (assuming n is a power of 2). This can be proved by induction.
• Total cost = sum of costs at each level = (lg n + 1)cn = cn lg n + cn = Θ(n lg n).
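The closed form can be checked against the recurrence for small powers of two. The sketch below is only a numerical sanity check under the same assumption (n a power of 2); the function names recurrence_cost and closed_form_cost are made up for this example.

```c
/* T(n) evaluated directly from the recurrence
   T(1) = c, T(n) = 2T(n/2) + cn, for n a power of two. */
long recurrence_cost(long n, long c)
{
    if (n == 1)
        return c;
    return 2 * recurrence_cost(n / 2, c) + c * n;
}

/* Closed form read off the recursion tree: cn lg n + cn. */
long closed_form_cost(long n, long c)
{
    long lg = 0;
    for (long m = n; m > 1; m /= 2)
        lg++;                        /* lg n, exact when n is a power of two */
    return c * n * lg + c * n;
}
```

With c = 1 and n = 8, both give 32: three levels of merging at cost 8 each, plus 8 for the leaves.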

Quick Sort

Partition the array into items that are "small" and items that are "large", then recursively sort the two sets.

Divide
Partition the array into left and right sub-arrays:
▪ Choose an element of the array, called the pivot.
▪ The elements in the left sub-array are all less than the pivot.
▪ The elements in the right sub-array are all greater than or equal to the pivot.

Conquer
▪ Recursively sort the left and right sub-arrays.

Combine
▪ Trivial: the arrays are sorted in place.
▪ No additional work is required to combine them.
▪ The entire array is now sorted.

Quick Sort…

A key step in the Quicksort algorithm is partitioning the array.
▪ We choose some (any) number p in the array to use as a pivot.
▪ We partition the array into three parts:

numbers less than p | p | numbers greater than or equal to p

Quick Sort Approach

S:                            13 81 92 43 65 31 57 26 75 0
Select pivot value:           65
Partition S:                  S1 = 13 43 31 57 26 0 | 65 | S2 = 81 92 75
QuickSort(S1), QuickSort(S2): 0 13 26 31 43 57 | 65 | 75 81 92
S is sorted:                  0 13 26 31 43 57 65 75 81 92

Quick Sort Algorithm

void quicksort(int array[], int left, int right)
{
    if (left < right)                              // Check for base case
    {
        int p = partition(array, left, right);     // Divide
        quicksort(array, left, p - 1);             // Conquer
        quicksort(array, p + 1, right);            // Conquer
    }                                              // Combine: nothing to do
}

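The Partition method

int partition(int a[], int left, int right)
{
    int p = a[left], l = left + 1, r = right;   // the pivot is the first element
    do                                          // scan at least once, so that a two-element range is also partitioned
    {
        while (l < right && a[l] < p)           // skip elements smaller than the pivot
            l++;
        while (r > left && a[r] >= p)           // skip elements greater than or equal to the pivot
            r--;
        if (l < r)
        {
            swap(a[l], a[r]);                   // exchange the out-of-place pair
        }
    } while (l < r);
    a[left] = a[r];                             // move the pivot into its final position
    a[r] = p;
    return r;
}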
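To try the partition scheme end to end, here is a self-contained sketch in plain C. The helper swap_ints is an assumption of this example (C has no built-in swap), and the outer loop is written as do ... while so that the two scans run at least once, which matters when the range has only two elements.

```c
/* Exchange two ints; a hypothetical helper for this sketch. */
static void swap_ints(int *x, int *y)
{
    int t = *x;
    *x = *y;
    *y = t;
}

/* Partition a[left..right] around p = a[left].
   Returns the final index of the pivot. */
int partition(int a[], int left, int right)
{
    int p = a[left], l = left + 1, r = right;
    do {
        while (l < right && a[l] < p)     /* l stops at an element >= p */
            l++;
        while (r > left && a[r] >= p)     /* r stops at an element < p */
            r--;
        if (l < r)
            swap_ints(&a[l], &a[r]);
    } while (l < r);
    a[left] = a[r];                       /* pivot goes between the two parts */
    a[r] = p;
    return r;
}

void quicksort(int a[], int left, int right)
{
    if (left < right) {
        int p = partition(a, left, right);
        quicksort(a, left, p - 1);
        quicksort(a, p + 1, right);
    }
}
```

On 40 20 10 80 60 50 7 30 100, partition(a, 0, 8) returns 4 and leaves 7 20 10 30 40 50 60 80 100; quicksort(a, 0, 8) then sorts each side.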

Example

We are given an array of n integers to sort:

40 20 10 80 60 50 7 30 100

Pick Pivot Element

There are a number of ways to pick the pivot element. In this example, we will use the first element in the array:

40 20 10 80 60 50 7 30 100

Partitioning Array

Given a pivot, partition the elements of the array such that the resulting array consists of:

1. One sub-array that contains elements < pivot
2. Another sub-array that contains elements >= pivot

The sub-arrays are stored in the original data array.

Partitioning loops through, swapping elements below/above pivot.

Partition trace (pivot_index = 0, p = 40, left = 0, right = 8):

Steps of the partition loop:
1. Repeat steps 2 to 4 while l < r.
2. Advance l while l < right and a[l] < p.
3. Move r left while r > left and a[r] >= p.
4. If l < r, swap a[l] and a[r].
Finally: a[left] = a[r]; a[r] = p; return r;

Index:                          [0] [1] [2] [3] [4] [5] [6] [7] [8]

Step                            l   r   Array
Start                           1   8   40 20 10 80 60 50 7 30 100
2. l skips 20, 10               3   8   40 20 10 80 60 50 7 30 100
3. r skips 100                  3   7   40 20 10 80 60 50 7 30 100
4. swap a[3] and a[7]           3   7   40 20 10 30 60 50 7 80 100
2. l skips 30                   4   7   40 20 10 30 60 50 7 80 100
3. r skips 80                   4   6   40 20 10 30 60 50 7 80 100
4. swap a[4] and a[6]           4   6   40 20 10 30 7 50 60 80 100
2. l skips 7                    5   6   40 20 10 30 7 50 60 80 100
3. r skips 60, 50               5   4   40 20 10 30 7 50 60 80 100
4. l > r: no swap, loop ends    5   4   40 20 10 30 7 50 60 80 100
a[left] = a[r]; a[r] = p        5   4   7 20 10 30 40 50 60 80 100

The function returns r = 4, so pivot_index = 4.

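Partition Result

7 20 10 30 | 40 | 50 60 80 100
< pivot    | pivot | >= pivot

Recursion: Quicksort Sub-arrays

[0] [1] [2] [3] [4] [5] [6] [7] [8]
 7  20  10  30  40  50  60  80  100

Elements [0] to [3] are < data[pivot] and elements [5] to [8] are >= data[pivot]. Each of these two sub-arrays is now sorted recursively.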

Example of partitioning

choose pivot:     4 3 6 9 2 4 3 1 2 1 8 9 3 5 6
search:           4 3 6 9 2 4 3 1 2 1 8 9 3 5 6
swap:             4 3 3 9 2 4 3 1 2 1 8 9 6 5 6
search:           4 3 3 9 2 4 3 1 2 1 8 9 6 5 6
swap:             4 3 3 1 2 4 3 1 2 9 8 9 6 5 6
search:           4 3 3 1 2 4 3 1 2 9 8 9 6 5 6
swap:             4 3 3 1 2 2 3 1 4 9 8 9 6 5 6
search:           4 3 3 1 2 2 3 1 4 9 8 9 6 5 6    (left > right)
swap with pivot:  1 3 3 1 2 2 3 4 4 9 8 9 6 5 6

Analysis of Partition method

In partition(a, left, right), the index l only moves right and the index r only moves left, and the loop ends once they meet or cross. Each element of a[left..right] is therefore examined a constant number of times and swapped at most once.

Running time of partition on n elements: O(n)

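Analysis of Quick Sort

Cost of each step of quicksort(array, left, right):

▪ Check for base case: Θ(1)
▪ Divide (partition): Θ(n)
▪ Conquer: two recursive calls, whose cost depends on how the partition splits the array
▪ Combine: Θ(0), no work is needed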

Best Case

Suppose each partition operation divides the array almost exactly in half.

Then the depth of the recursion is log2 n.

At each level of the recursion, all the partitions at that level do work that is linear in n.

O(log2 n) * O(n) = O(n log2 n)

Hence in the best case, quicksort has time complexity O(n log2 n). The average case, discussed later, has the same bound.

Figure: Partitioning at various levels, Best Case

Analysis of Quick Sort: Best Case

▪ Check for base case: Θ(1)
▪ Divide (partition): Θ(n)
▪ Conquer: 2T(n/2)
▪ Combine: Θ(0)

Total:
T(n) = Θ(1)              if n = 1
T(n) = 2T(n/2) + Θ(n)    if n > 1

T(n) = Θ(n lg n)

Best Case Partitioning

▪ Partitioning produces two regions of size n/2.
▪ Recurrence (q = n/2): T(n) = 2T(n/2) + Θ(n)
▪ T(n) = Θ(n lg n) (Master theorem)

Worst case

In the worst case, partitioning always divides the size n array into these three parts:
▪ A length one part, containing the pivot itself
▪ A length zero part, and
▪ A length n-1 part, containing everything else

We don't recur on the zero-length part.

Recurring on the length n-1 part requires (in the worst case) recurring to depth n.

Figure: Worst Case Call Tree, N = 4

Worst Case Partitioning

Worst-case partitioning
▪ One region has zero elements and the other has n - 1 elements.
▪ Maximally unbalanced.

Recurrence (q = 1):
T(n) = T(n - 1) + T(0) + Θ(n)
T(1) = Θ(1)
T(n) = T(n - 1) + n

Recursion tree: the levels cost n, n - 1, n - 2, n - 3, ..., 2, 1, and every zero-size subproblem costs 0, so

T(n) = n + (n - 1) + (n - 2) + ... + 2 + T(1) = Θ(n^2)

When does the worst case happen?

Analysis of Quick Sort: Worst Case

▪ Check for base case: Θ(1)
▪ Divide (partition): Θ(n)
▪ Conquer: T(n-1) + T(0) = T(n-1)
▪ Combine: Θ(0)

Total:
T(n) = Θ(1)              if n = 1
T(n) = T(n-1) + Θ(n)     if n > 1

T(n) = Θ(n^2)
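The quadratic behaviour is easy to observe by adding up the sizes of all ranges handed to the partition step. The sketch below is an illustration only: the names quicksort_work, sort_counted, partition_first and the counter work are assumptions, and the pivot is always the first element of the range.

```c
static long work;   /* sum of the sizes of all partitioned ranges */

/* First-element-pivot partition; returns the pivot's final index. */
static int partition_first(int a[], int left, int right)
{
    int p = a[left], l = left + 1, r = right;
    do {
        while (l < right && a[l] < p)
            l++;
        while (r > left && a[r] >= p)
            r--;
        if (l < r) {
            int t = a[l]; a[l] = a[r]; a[r] = t;
        }
    } while (l < r);
    a[left] = a[r];
    a[r] = p;
    return r;
}

static void sort_counted(int a[], int left, int right)
{
    if (left < right) {
        work += right - left + 1;          /* partition touches this many elements */
        int p = partition_first(a, left, right);
        sort_counted(a, left, p - 1);
        sort_counted(a, p + 1, right);
    }
}

/* Sorts a[0..n-1] and returns the total partition work. */
long quicksort_work(int a[], int n)
{
    work = 0;
    sort_counted(a, 0, n - 1);
    return work;
}
```

For the already sorted array 2 4 10 12 13 50 57 63 100 (n = 9) the ranges have sizes 9, 8, ..., 2, so the total is 44 = n(n + 1)/2 - 1. For 40 20 10 80 60 50 7 30 100 the total is only 25.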

Quicksort: Worst Case

Assume first element is chosen as pivot.

Assume we get array that is already in order:

[0] [1] [2] [3] [4] [5] [6] [7] [8]
 2   4  10  12  13  50  57  63  100     pivot_index = 0

1. While data[too_big_index] <= data[pivot], ++too_big_index
2. While data[too_small_index] > data[pivot], --too_small_index
3. If too_big_index < too_small_index, swap data[too_big_index] and data[too_small_index]
4. While too_small_index > too_big_index, go to 1.
5. Swap data[too_small_index] and data[pivot_index]

Trace: too_big_index stops at index 1 at once, because 4 > 2. too_small_index then walks all the way down from index 8 to index 0, because every other element is greater than the pivot. The indices have crossed, so nothing is swapped, and step 5 swaps the pivot with itself.

Result: the "<= data[pivot]" region is just the pivot at index 0, and the "> data[pivot]" region is the remaining n - 1 elements at indices 1 to 8.

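Case Between Worst and Best

9-to-1 proportional split:

Q(n) = Q(9n/10) + Q(n/10) + n

Even with this lopsided split, the recursion tree has depth log_{10/9} n = Θ(lg n) and each level costs at most n, so Q(n) = O(n lg n).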

Average-Case Analysis

Assume each of the sizes for S1 is equally likely.

This assumption is valid for our pivoting (median-of-three) strategy.

On average, the running time is O(N log N).

Quicksort Analysis

Assume that keys are random, uniformly distributed.

Best case and average case running time: O(n log n)

Worst case running time? Recursion:
1. Partition splits the array in two sub-arrays:
   • one sub-array of size 0
   • the other sub-array of size n-1
2. Quicksort each sub-array.

Depth of recursion tree? O(n)
Number of accesses per partition? O(n)

Quicksort Analysis

Assume that keys are random, uniformly distributed.

Best case running time: O(n log n)
Average case running time: O(n log n)
Worst case running time: O(n^2)

What can we do to avoid the worst case?

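Improved Pivot Selection

Pick the median value of three elements from the data array: the first, the middle and the last, that is data[0], data[n/2] and data[n-1].

Use this median value as pivot.

Index:  0 1 2 3 4 5 6 7 8 9
Data:   8 1 4 9 0 3 5 2 7 6

In this example the three candidates are data[0] = 8, the middle element data[4] = 0 (the middle index is taken as (0 + 9)/2 = 4) and data[9] = 6.

Median of 0, 6, 8 is 6. Pivot is 6.

After swapping the pivot with the first element:

Data:   6 1 4 9 0 3 5 2 7 8

Choose the pivot as the median of three.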

Partitioning: Choosing the pivot

One implementation (there are others):
▪ median3 finds the pivot and sorts left, center, right.
▪ median3 takes the median of the leftmost, middle, and rightmost elements.
▪ An alternative is to choose the pivot randomly (needs a random number generator; "expensive").
▪ Another alternative is to choose the first element (but this can be very bad. Why?).
▪ Swap the pivot with the first element.
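A minimal sketch of median-of-three selection follows. It is one possible reading of the description above, not the slide's median3: it only finds the median of the three candidates and swaps it to the front, without sorting the other two. The names median_index and median3_to_front are hypothetical.

```c
/* Index (i, j or k) holding the median of a[i], a[j], a[k]. */
static int median_index(const int a[], int i, int j, int k)
{
    if ((a[i] <= a[j] && a[j] <= a[k]) || (a[k] <= a[j] && a[j] <= a[i]))
        return j;
    if ((a[j] <= a[i] && a[i] <= a[k]) || (a[k] <= a[i] && a[i] <= a[j]))
        return i;
    return k;
}

/* Move the median of the first, middle and last elements of
   a[left..right] to a[left], and return its value. A partition
   routine that uses a[left] as pivot can then run unchanged. */
int median3_to_front(int a[], int left, int right)
{
    int center = left + (right - left) / 2;
    int m = median_index(a, left, center, right);
    int t = a[left];
    a[left] = a[m];
    a[m] = t;
    return a[left];
}
```

On 8 1 4 9 0 3 5 2 7 6 the candidates are 8, 0 and 6, so the call returns 6 and the array becomes 6 1 4 9 0 3 5 2 7 8. On an already sorted array the middle element is chosen, which splits the array evenly instead of producing the worst case.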

Summary of Quicksort

▪ Best case: split in the middle: Θ(n log n)
▪ Worst case: sorted array!: Θ(n^2)
▪ Average case: random arrays: Θ(n log n)
▪ Memory requirement? In-place sorting algorithm
▪ Considered as the method of choice for internal sorting for large files (n ≥ 10000)

Improvements:
▪ Better pivot selection: median-of-three partitioning avoids the worst case in sorted files.
▪ Switch to insertion sort on small subfiles.
▪ Elimination of recursion.
These combine to a 20-25% improvement.

Small arrays

For very small arrays, quicksort does not perform as well as insertion sort.
▪ How small depends on many factors, such as the time spent making a recursive call, the compiler, etc.

Do not use quicksort recursively for small arrays.
▪ Instead, use a sorting algorithm that is efficient for small arrays, such as insertion sort.
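The idea can be sketched as a quicksort that hands short ranges to insertion sort. The cutoff of 10 below is an assumed value chosen only for illustration; as noted above, the right threshold depends on the machine and compiler and should be measured. All function names are hypothetical.

```c
#define CUTOFF 10   /* assumed threshold; tune by measurement */

/* Insertion sort on a[left..right]. */
static void insertion_sort(int a[], int left, int right)
{
    for (int i = left + 1; i <= right; i++) {
        int key = a[i], j = i - 1;
        while (j >= left && a[j] > key) {   /* shift larger elements right */
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = key;
    }
}

/* First-element-pivot partition; returns the pivot's final index. */
static int partition_first(int a[], int left, int right)
{
    int p = a[left], l = left + 1, r = right;
    do {
        while (l < right && a[l] < p)
            l++;
        while (r > left && a[r] >= p)
            r--;
        if (l < r) {
            int t = a[l]; a[l] = a[r]; a[r] = t;
        }
    } while (l < r);
    a[left] = a[r];
    a[r] = p;
    return r;
}

/* Quicksort that stops recursing on ranges of at most CUTOFF elements. */
void hybrid_quicksort(int a[], int left, int right)
{
    if (right - left + 1 <= CUTOFF) {
        insertion_sort(a, left, right);     /* also covers empty and 1-element ranges */
        return;
    }
    int p = partition_first(a, left, right);
    hybrid_quicksort(a, left, p - 1);
    hybrid_quicksort(a, p + 1, right);
}
```

A common variant skips the small ranges entirely and runs a single insertion sort over the whole array at the end; either way the recursion never descends into tiny sub-arrays.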

Properties of Quicksort

▪ Not stable because of long distance swapping.
▪ No iterative version without using a stack.
▪ Pure quicksort is not good for small arrays.
▪ "In-place", but uses auxiliary storage because of the recursive calls (O(log n) stack space when the splits are balanced).
▪ O(n log n) average case performance, but O(n^2) worst case performance.
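An iterative version with an explicit stack might look like the sketch below. It is an illustration under stated assumptions, not a reference implementation: the names are hypothetical, the pivot is the first element, and the larger part is always pushed while the smaller part is processed next, which keeps the number of pending ranges at about lg n even when the running time degrades to O(n^2).

```c
/* First-element-pivot partition; returns the pivot's final index. */
static int partition_first(int a[], int left, int right)
{
    int p = a[left], l = left + 1, r = right;
    do {
        while (l < right && a[l] < p)
            l++;
        while (r > left && a[r] >= p)
            r--;
        if (l < r) {
            int t = a[l]; a[l] = a[r]; a[r] = t;
        }
    } while (l < r);
    a[left] = a[r];
    a[r] = p;
    return r;
}

/* Sorts a[0..n-1] without recursion. */
void quicksort_iterative(int a[], int n)
{
    /* Pending ranges. Each pushed range is the larger part of a split and
       the loop continues with the smaller part, so the current range at
       least halves with every push: 64 entries are ample for int indices. */
    int lo_stack[64], hi_stack[64];
    int top = 0;
    int left = 0, right = n - 1;

    for (;;) {
        while (left < right) {
            int p = partition_first(a, left, right);
            if (p - left < right - p) {        /* left part is the smaller one */
                lo_stack[top] = p + 1;
                hi_stack[top] = right;
                top++;
                right = p - 1;
            } else {                           /* right part is the smaller one */
                lo_stack[top] = left;
                hi_stack[top] = p - 1;
                top++;
                left = p + 1;
            }
        }
        if (top == 0)
            break;                             /* nothing left to sort */
        top--;
        left = lo_stack[top];
        right = hi_stack[top];
    }
}
```

The same smaller-part-first ordering can be applied to the recursive version to bound its call stack depth in the same way.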
