Design & Analysis of Algorithms CSc 4520/6520

Preview:

DESCRIPTION

Design & Analysis of Algorithms CSc 4520/6520. Sorting and Asymptotic Notations Fall 2013 -- GSU Anu Bourgeois*. Sorting. Insertion sort Design approach: Sorts in place: Best case: Worst case: Bubble Sort Design approach: Sorts in place: Running time:. incremental. Yes.  (n). - PowerPoint PPT Presentation

Citation preview

Design & Analysis of AlgorithmsCSc 4520/6520

Sorting and Asymptotic NotationsFall 2013 -- GSUAnu Bourgeois*

2

Sorting

• Insertion sort– Design approach:– Sorts in place:– Best case:– Worst case:

• Bubble Sort– Design approach:– Sorts in place:– Running time:

Yes(n)

(n2)

incremental

Yes(n2)

incremental

Insertion Sort

• while some elements unsorted:– Using linear search, find the location in the sorted portion

where the 1st element of the unsorted portion should be inserted

– Move all the elements after the insertion location up one position to make space for the new element

13 2145 79 47 2238 74 3666 94 2957 8160 16

45

666045

the fourth iteration of this loop is shown here

“Bubbling” All the Elements

77123542 5

1 2 3 4 5 6

101

5421235 77

1 2 3 4 5 6

101

42 5 3512 77

1 2 3 4 5 6

101

42 35 512 77

1 2 3 4 5 6

101

42 35 12 5 77

1 2 3 4 5 6

101

N -

1

5

Sorting

• Selection sort– Design approach:– Sorts in place:– Running time:

• Merge Sort– Design approach:– Sorts in place:– Running time:

Yes

(n2)

incremental

NoLet’s see!!

divide and conquer

Selection Sort

Selection Sort1. Start with the 1st element, scan the entire list to find its smallest element

and exchange it with the 1st element2. Start with the 2nd element, scan the remaining list to find the the smallest

among the last (N-1) elements and exchange it with the 2nd element3. …

Example: 89 45 68 90 29 34 17 17 | 45 68 90 29 34 89 29 | 68 90 45 34 89 34 | 90 45 68 89 45 | 90 68 89 68 | 90 89 89 | 90 90

7

Divide-and-Conquer

• Divide the problem into a number of sub-problems

– Similar sub-problems of smaller size

• Conquer the sub-problems

– Solve the sub-problems recursively

– Sub-problem size small enough solve the problems in

straightforward manner

• Combine the solutions of the sub-problems

– Obtain the solution for the original problem

8

Merge Sort Approach

• To sort an array A[p . . r]:

• Divide– Divide the n-element sequence to be sorted into two

subsequences of n/2 elements each

• Conquer

– Sort the subsequences recursively using merge sort

– When the size of the sequences is 1 there is nothing

more to do

• Combine

– Merge the two sorted subsequences

9

Merge Sort

Alg.: MERGE-SORT(A, p, r)

if p < r Check for base case

then q ← (p + r)/2 Divide

MERGE-SORT(A, p, q) Conquer

MERGE-SORT(A, q + 1, r) Conquer

MERGE(A, p, q, r) Combine

• Initial call: MERGE-SORT(A, 1, n)

1 2 3 4 5 6 7 8

62317425

p rq

10

Example – n Power of 2

1 2 3 4 5 6 7 8

q = 462317425

1 2 3 4

7425

5 6 7 8

6231

1 2

25

3 4

74

5 6

31

7 8

62

1

5

2

2

3

4

4

7 1

6

3

7

2

8

6

5

Divide

11

Example – n Power of 2

1

5

2

2

3

4

4

7 1

6

3

7

2

8

6

5

1 2 3 4 5 6 7 8

76543221

1 2 3 4

7542

5 6 7 8

6321

1 2

52

3 4

74

5 6

31

7 8

62

ConquerandMerge

12

Example – n Not a Power of 2

62537416274

1 2 3 4 5 6 7 8 9 10 11

q = 6

416274

1 2 3 4 5 6

62537

7 8 9 10 11

q = 9q = 3

274

1 2 3

416

4 5 6

537

7 8 9

62

10 11

74

1 2

2

3

16

4 5

4

6

37

7 8

5

9

2

10

6

11

4

1

7

2

6

4

1

5

7

7

3

8

Divide

13

Example – n Not a Power of 2

77665443221

1 2 3 4 5 6 7 8 9 10 11

764421

1 2 3 4 5 6

76532

7 8 9 10 11

742

1 2 3

641

4 5 6

753

7 8 9

62

10 11

2

3

4

6

5

9

2

10

6

11

4

1

7

2

6

4

1

5

7

7

3

8

74

1 2

61

4 5

73

7 8

ConquerandMerge

14

Merging

• Input: Array A and indices p, q, r such that p ≤ q < r– Subarrays A[p . . q] and A[q + 1 . . r] are sorted

• Output: One single sorted subarray A[p . . r]

1 2 3 4 5 6 7 8

63217542

p rq

15

Merging

• Idea for merging:

– Two piles of sorted cards

• Choose the smaller of the two top cards

• Remove it and place it in the output pile

– Repeat the process until one pile is empty

– Take the remaining input pile and place it face-down

onto the output pile

1 2 3 4 5 6 7 8

63217542

p rq

A1 A[p, q]

A2 A[q+1, r]

A[p, r]

16

Example: MERGE(A, 9, 12, 16)p rq

17

Example: MERGE(A, 9, 12, 16)

18

Example (cont.)

19

Example (cont.)

20

Example (cont.)

Done!

21

Merge - Pseudocode

Alg.: MERGE(A, p, q, r)1. Compute n1 and n2

2. Copy the first n1 elements into L[1 . . n1 + 1] and the next n2 elements into R[1 . . n2 + 1]

3. L[n1 + 1] ← ; R[n2 + 1] ←

4. i ← 1; j ← 15. for k ← p to r6. do if L[ i ] ≤ R[ j ]7. then A[k] ← L[ i ]8. i ←i + 19. else A[k] ← R[ j ]10. j ← j + 1

p q

7542

6321rq + 1

L

R

1 2 3 4 5 6 7 8

63217542

p rq

n1 n2

22

Running Time of Merge(assume last for loop)

• Initialization (copying into temporary arrays):

(n1 + n2) = (n)

• Adding the elements to the final array:

- n iterations, each taking constant time (n)

• Total time for Merge: (n)

23

Analyzing Divide-and Conquer Algorithms

• The recurrence is based on the three steps of the paradigm:– T(n) – running time on a problem of size n– Divide the problem into a subproblems, each of size n/b: takes D(n)

– Conquer (solve) the subproblems aT(n/b) – Combine the solutions C(n)

(1) if n ≤ c

T(n) = aT(n/b) + D(n) + C(n)otherwise

24

MERGE-SORT Running Time

• Divide: – compute q as the average of p and r: D(n) = (1)

• Conquer: – recursively solve 2 subproblems, each of size n/2

2T (n/2)

• Combine: – MERGE on an n-element subarray takes (n) time

C(n) = (n)

(1) if n =1

T(n) = 2T(n/2) + (n) if n > 1

25

Solve the Recurrence

T(n) = c if n = 12T(n/2) + cn if n > 1

Use Master’s Theorem:

Compare n with f(n) = cnCase 2: T(n) = Θ(nlgn)

Asymptotic Notations

• BIG O: O – f = O(g) if f is no faster then g– f / g < some constant

• BIG OMEGA: – f = (g) if f is no slower then g– f / g > some constant

• BIG Theta: – f = (g) if f has the same growth rate as g– some constant < f / g < some constant

26

27

Analysis of Algorithms

• An algorithm is a finite set of precise instructions for performing a computation or for solving a problem.

• What is the goal of analysis of algorithms?– To compare algorithms mainly in terms of running

time but also in terms of other factors (e.g., memory requirements, programmer's effort etc.)

• What do we mean by running time analysis?– Determine how running time increases as the size

of the problem increases.

28

Input Size

• Input size (number of elements in the input)

– size of an array

– polynomial degree

– # of elements in a matrix

– # of bits in the binary representation of the input

– vertices and edges in a graph

29

Types of Analysis

• Worst case– Provides an upper bound on running time

– An absolute guarantee that the algorithm would not run longer, no matter what the inputs are

• Best case– Provides a lower bound on running time

– Input is the one for which the algorithm runs the fastest

• Average case– Provides a prediction about the running time

– Assumes that the input is random

Lower Bound Running Time Upper Bound

30

How do we compare algorithms?

• We need to define a number of objective measures.

(1) Compare execution times? Not good: times are specific to a particular

computer !!

(2) Count the number of statements executed? Not good: number of statements vary with the programming language as well as the style of the individual programmer.

31

Ideal Solution

• Express running time as a function of the input size n (i.e., f(n)).

• Compare different functions corresponding to running times.

• Such an analysis is independent of machine time, programming style, etc.

32

Asymptotic Analysis

• To compare two algorithms with running times f(n) and g(n), we need a rough measure that characterizes how fast each function grows.

• Hint: use rate of growth

• Compare functions in the limit, that is, asymptotically!(i.e., for large values of n)

33

Rate of Growth

• Consider the example of buying elephants and goldfish:

Cost: cost_of_elephants + cost_of_goldfish

Cost ~ cost_of_elephants (approximation)• The low order terms in a function are relatively

insignificant for large n

n4 + 100n2 + 10n + 50 ~ n4

i.e., we say that n4 + 100n2 + 10n + 50 and n4 have the same rate of growth

34

Asymptotic Notation

• O notation: asymptotic “less than”:

– f(n)=O(g(n)) implies: f(n) “≤” g(n)

notation: asymptotic “greater than”:

– f(n)= (g(n)) implies: f(n) “≥” g(n)

notation: asymptotic “equality”:

– f(n)= (g(n)) implies: f(n) “=” g(n)

35

Big-O Notation

• We say fA(n)=30n+8 is order n, or O (n) It is, at most, roughly proportional to n.

• fB(n)=n2+1 is order n2, or O(n2). It is, at most, roughly proportional to n2.

• In general, any O(n2) function is faster- growing than any O(n) function.

36

Big-O Visualization

O(g(n)) is the set of

functions with smaller

or same order of

growth as g(n)

37

Asymptotic notations

• O-notation

38

• Note 30n+8 isn’tless than nanywhere (n>0).

• It isn’t evenless than 31neverywhere.

• But it is less than31n everywhere tothe right of n=8.

n>n0=8

Big-O example, graphically

Increasing n

Val

ue o

f fu

ncti

on

n

30n+8cn =31n

30n+8O(n)

39

No Uniqueness

• There is no unique set of values for n0 and c in proving the

asymptotic bounds

• Prove that 100n + 5 = O(n2)

– 100n + 5 ≤ 100n + n = 101n ≤ 101n2

for all n ≥ 5

n0 = 5 and c = 101 is a solution

– 100n + 5 ≤ 100n + 5n = 105n ≤ 105n2

for all n ≥ 1

n0 = 1 and c = 105 is also a solution

Must find SOME constants c and n0 that satisfy the asymptotic notation relation

40

Asymptotic notations (cont.)

- notation

(g(n)) is the set of functions

with larger or same order of

growth as g(n)

41

Examples

– 5n2 = (n)

– 100n + 5 ≠ (n2)

– n = (2n), n3 = (n2), n = (logn)

c, n0 such that: 0 cn 5n2 cn 5n2 c = 1 and n0 = 1

c, n0 such that: 0 cn2 100n + 5

100n + 5 100n + 5n ( n 1) = 105n

cn2 105n n(cn – 105) 0

Since n is positive cn – 105 0 n 105/c

contradiction: n cannot be smaller than a constant

42

Asymptotic notations (cont.)

-notation

(g(n)) is the set of functions

with the same order of growth

as g(n)

43

Examples

– n2/2 –n/2 = (n2)

• ½ n2 - ½ n ≤ ½ n2 n ≥ 0 c2= ½

• ½ n2 - ½ n ≥ ½ n2 - ½ n * ½ n ( n ≥ 2 ) = ¼ n2

c1= ¼

– n ≠ (n2): c1 n2 ≤ n ≤ c2 n2

only holds for: n ≤ 1/c1

44

Examples

– 6n3 ≠ (n2): c1 n2 ≤ 6n3 ≤ c2 n2

only holds for: n ≤ c2 /6

– n ≠ (logn): c1 logn ≤ n ≤ c2 logn

c2 ≥ n/logn, n≥ n0 – impossible

45

• Subset relations between order-of-growth sets.

Relations Between Different Sets

RR( f )O( f )

( f )• f

Asymptotic Notations

46

47

More Examples

• For each of the following pairs of functions, either f(n) is O(g(n)), f(n) is Ω(g(n)), or f(n) = Θ(g(n)). Determine which relationship is correct.

– f(n) = log n2; g(n) = log n + 5

– f(n) = n; g(n) = log n2

– f(n) = log log n; g(n) = log n

– f(n) = n; g(n) = log2 n

– f(n) = n log n + n; g(n) = log n

– f(n) = 10; g(n) = log 10

– f(n) = 2n; g(n) = 10n2

– f(n) = 2n; g(n) = 3n

f(n) = (g(n))

f(n) = (g(n))

f(n) = O(g(n))

f(n) = (g(n))

f(n) = (g(n))

f(n) = (g(n))

f(n) = (g(n))

f(n) = O(g(n))

Master Theorem

48

49

Solve the Recurrence

T(n) = c if n = 12T(n/2) + cn if n > 1

Use Master’s Theorem:

Compare n with f(n) = cnCase 2: T(n) = Θ(nlgn)

50

Merge Sort - Discussion

• Running time insensitive of the input

• Advantages:– Guaranteed to run in (nlgn)

• Disadvantage– Requires extra space N

Merge Sort51

Merge-Sort Tree

• An execution of merge-sort is depicted by a binary tree– each node represents a recursive call of merge-sort and stores

• unsorted sequence before the execution and its partition• sorted sequence at the end of the execution

– the root is the initial call – the leaves are calls on subsequences of size 0 or 1

7 2 9 4 2 4 7 9

7 2 2 7 9 4 4 9

7 7 2 2 9 9 4 4

Merge Sort52

Execution Example

• Partition

7 2 9 4 2 4 7 9 3 8 6 1 1 3 8 6

7 2 2 7 9 4 4 9 3 8 3 8 6 1 1 6

7 7 2 2 9 9 4 4 3 3 8 8 6 6 1 1

7 2 9 4 3 8 6 1 1 2 3 4 6 7 8 9

Merge Sort53

Execution Example (cont.)

• Recursive call, partition

7 2 9 4 2 4 7 9 3 8 6 1 1 3 8 6

7 2 2 7 9 4 4 9 3 8 3 8 6 1 1 6

7 7 2 2 9 9 4 4 3 3 8 8 6 6 1 1

7 2 9 4 3 8 6 1 1 2 3 4 6 7 8 9

Merge Sort54

Execution Example (cont.)

• Recursive call, partition

7 2 9 4 2 4 7 9 3 8 6 1 1 3 8 6

7 2 2 7 9 4 4 9 3 8 3 8 6 1 1 6

7 7 2 2 9 9 4 4 3 3 8 8 6 6 1 1

7 2 9 4 3 8 6 1 1 2 3 4 6 7 8 9

Merge Sort55

Execution Example (cont.)

• Recursive call, base case

7 2 9 4 2 4 7 9 3 8 6 1 1 3 8 6

7 2 2 7 9 4 4 9 3 8 3 8 6 1 1 6

7 7 2 2 9 9 4 4 3 3 8 8 6 6 1 1

7 2 9 4 3 8 6 1 1 2 3 4 6 7 8 9

Merge Sort56

Execution Example (cont.)

• Recursive call, base case

7 2 9 4 2 4 7 9 3 8 6 1 1 3 8 6

7 2 2 7 9 4 4 9 3 8 3 8 6 1 1 6

7 7 2 2 9 9 4 4 3 3 8 8 6 6 1 1

7 2 9 4 3 8 6 1 1 2 3 4 6 7 8 9

Merge Sort57

Execution Example (cont.)

• Merge

7 2 9 4 2 4 7 9 3 8 6 1 1 3 8 6

7 2 2 7 9 4 4 9 3 8 3 8 6 1 1 6

7 7 2 2 9 9 4 4 3 3 8 8 6 6 1 1

7 2 9 4 3 8 6 1 1 2 3 4 6 7 8 9

Merge Sort58

Execution Example (cont.)

• Recursive call, …, base case, merge

7 2 9 4 2 4 7 9 3 8 6 1 1 3 8 6

7 2 2 7 9 4 4 9 3 8 3 8 6 1 1 6

7 7 2 2 3 3 8 8 6 6 1 1

7 2 9 4 3 8 6 1 1 2 3 4 6 7 8 9

9 9 4 4

Merge Sort59

Execution Example (cont.)

• Merge

7 2 9 4 2 4 7 9 3 8 6 1 1 3 8 6

7 2 2 7 9 4 4 9 3 8 3 8 6 1 1 6

7 7 2 2 9 9 4 4 3 3 8 8 6 6 1 1

7 2 9 4 3 8 6 1 1 2 3 4 6 7 8 9

Merge Sort60

Execution Example (cont.)

• Recursive call, …, merge, merge

7 2 9 4 2 4 7 9 3 8 6 1 1 3 6 8

7 2 2 7 9 4 4 9 3 8 3 8 6 1 1 6

7 7 2 2 9 9 4 4 3 3 8 8 6 6 1 1

7 2 9 4 3 8 6 1 1 2 3 4 6 7 8 9

Merge Sort61

Execution Example (cont.)

• Merge

7 2 9 4 2 4 7 9 3 8 6 1 1 3 6 8

7 2 2 7 9 4 4 9 3 8 3 8 6 1 1 6

7 7 2 2 9 9 4 4 3 3 8 8 6 6 1 1

7 2 9 4 3 8 6 1 1 2 3 4 6 7 8 9

Merge Sort62

Analysis of Merge-Sort

• The height h of the merge-sort tree is O(log n) – at each recursive call we divide in half the sequence,

• The overall amount or work done at the nodes of depth i is O(n) – we partition and merge 2i sequences of size n2i

– we make 2i1 recursive calls

• Thus, the total running time of merge-sort is O(n log n)

depth #seqs

size

0 1 n

1 2 n2

i 2i n2i

… … …

63

Quicksort

• Sort an array A[p…r]

• Divide– Partition the array A into 2 subarrays A[p..q] and A[q+1..r],

such that each element of A[p..q] is smaller than or equal to

each element in A[q+1..r]

– Need to find index q to partition the array

≤A[p…q] A[q+1…r]

64

Quicksort

• Conquer

– Recursively sort A[p..q] and A[q+1..r] using

Quicksort

• Combine

– Trivial: the arrays are sorted in place

– No additional work is required to combine them

– The entire array is now sorted

A[p…q] A[q+1…r]≤

65

QUICKSORT

Alg.: QUICKSORT(A, p, r)

if p < r

then q PARTITION(A, p, r)

QUICKSORT (A, p, q)

QUICKSORT (A, q+1, r)

Recurrence:

Initially: p=1, r=n

PARTITION())T(n) = T(q) + T(n – q) + f(n)

66

Partitioning the Array

• Choosing PARTITION()

– There are different ways to do this

– Each has its own advantages/disadvantages

– Select a pivot element x around which to partition

– Grows two regions

A[p…i] x

x A[j…r]A[p…i] x x A[j…r]

i j

67

Example

68

Partitioning the Array

Alg. PARTITION (A, p, r)

1. x A[p]2. i p – 13. j r + 14. while TRUE

5. do repeat j j – 16. until A[j] ≤ x7. do repeat i i + 18. until A[i] ≥ x9. if i < j10. then exchange A[i] A[j]11. else return j

Running time: (n)n = r – p + 1

73146235

i j

A:

arap

ij=q

A:

A[p…q] A[q+1…r]≤

p r

Each element isvisited once!

69

Recurrence

Alg.: QUICKSORT(A, p, r)

if p < r

then q PARTITION(A, p, r)

QUICKSORT (A, p, q)

QUICKSORT (A, q+1, r)

Recurrence:

Initially: p=1, r=n

T(n) = T(q) + T(n – q) + n

70

Worst Case Partitioning

• Worst-case partitioning

– One region has one element and the other has n – 1 elements

– Maximally unbalanced

• Recurrence: q=1

T(n) = T(1) + T(n – 1) + n,

T(1) = (1)

T(n) = T(n – 1) + n

=

2 2

1

1 ( ) ( ) ( )n

k

n k n n n

nn - 1

n - 2

n - 3

2

1

1

1

1

1

1

n

nnn - 1

n - 2

3

2

(n2)

When does the worst case happen?

71

Best Case Partitioning

• Best-case partitioning– Partitioning produces two regions of size n/2

• Recurrence: q=n/2

T(n) = 2T(n/2) + (n)

T(n) = (nlgn) (Master theorem)

Runtime of Quicksort

• Worst case: – every time nothing to move– pivot = left (right) end of subarray– O(n^2)

0123456789

123456789

8

0

9

89

n

Runtime of Quicksort

• Best case: – every time partition in (almost) equal parts – no worse than in given proportion– O(n log n)

• Average case– O(n log n)

Summary of Sorting Algorithms

* Slides adopted from the following sources

• Haidong Xue – Georgia State University• Alex Zelikovsky – Georgia State University• George Bebis – University of Nevada, Reno• Gregory Ball – University of Buffalo