CISC121 – Lecture 9
Summer 2007 - Prof. McLeod

Last time:
– Finished Linked Lists
– Stacks & Queues




You Will Need To:

• Work on exercise 4 and assignment 4.
• Assn 4 is due Saturday at 5pm. If you hand it in on Friday instead, Krista will have it marked earlier.


Today

• Analysis of Complexity
• Sorting
  – Insertion
  – Selection
  – Bubble
• Searching
  – Sequential
  – Binary

• Start Recursion?


Comparing Programs

• Possible bases for comparison:
  – Speed of execution
  – Memory consumption
  – Ease of debugging
  – Ease of interfacing with other program elements
  – Backwards compatibility
  – Quality of external documentation, support
  – Ease of use
  – Quality of user interface
  – Cost…


Comparing Algorithms, Cont.

• Nobody will buy a program that is too slow!
  – Regardless of all its other “features”.

• It is too much work to finish an entire program only to find out that it is too slow!

• It would be nice if we could predict speed of operation, or at least predict bottlenecks as code is developed.


Comparing Algorithms, Cont.

• For example, it was fairly obvious that deleting the tail node in a doubly linked list is much faster than deleting the tail node in a singly linked list.

• It is not always this easy to compare code on the basis of expected execution speed.

• The total time of execution is the sum of many complex processor and memory operations.


The Analysis of Complexity

• Is really all about simplifying complexity.

• Complexity is what really goes on when a program is run: the accumulation of all the nanosecond operations that take place.

• All we really need is to be able to say that algorithm “A” will be faster than algorithm “B” if they are both tested under the same conditions with the same kind of data.


Analysis of Complexity - Cont.

• This theoretical technique can be explained by first generalizing the complexity of what goes on when code is run.

• First, define a “Primitive Operation”:


Primitive Operations

• A primitive operation takes a unit of time (a “tick”). The actual length of time will depend on external factors such as the hardware and software environment.

• Select operations that would all take about the same length of time. For example:
  – Assigning a value to a variable.
  – Calling a method.
  – Performing an arithmetic operation.
  – Comparing two numbers.
  – Indexing an array element.
  – Following an object reference.
  – Returning from a method.


Primitive Operations - Cont.

• Assume: Each of these kinds of operations will take the same amount of time on a given hardware and software platform.

• Count primitive operations for a simple method.

• Example - find the highest integer in an array:


Primitive Operations - Cont.

public static int arrayMax(int[] A) {
    int currentMax = A[0];          // 3 operations
    int n = A.length;               // 3 operations
    for (int i = 0; i < n; i++)     // 2 + 3n operations
        if (currentMax < A[i])      // 3n operations
            currentMax = A[i];      // 0 to 2n operations
    return currentMax;              // 1 operation
} // end arrayMax

• Number of primitive operations:
  – Minimum = 3 + 3 + 2 + 3n + 3n + 1 = 9 + 6n
  – Maximum = 9 + 8n


Primitive Operations - Cont.

• Or, best case is 9 + 6n, worst case is 9 + 8n.
• What is the average case?

• If the maximum element had exactly the same probability of being in each of the array locations, then the average case would be 9 + 7n.

• But if the data is not randomly distributed, then getting the average case is very difficult.

• It is much easier to just assume the worst case analysis, 9 + 8n.
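The bookkeeping above can be checked mechanically. Below is a hypothetical instrumented version of arrayMax (an illustration, not from the slides) that tallies the same per-line operation counts, so the total can be compared against the 9 + 6n and 9 + 8n bounds:

```java
public class ArrayMaxCount {

    static long ops; // running tally of primitive operations

    static int arrayMax(int[] A) {
        ops = 0;
        int currentMax = A[0];   ops += 3;   // index + assign: 3 operations
        int n = A.length;        ops += 3;   // follow reference + assign: 3 operations
        ops += 2;                            // loop initialization: 2 operations
        for (int i = 0; i < n; i++) {
            ops += 3;                        // loop test + increment: 3 per pass
            ops += 3;                        // if-comparison + index: 3 per pass
            if (currentMax < A[i]) {
                currentMax = A[i]; ops += 2; // index + assign: 0 to 2 per pass
            }
        }
        ops += 1;                            // return: 1 operation
        return currentMax;
    }

    public static void main(String[] args) {
        arrayMax(new int[]{5, 4, 3, 2, 1});  // max is first: no re-assignments
        System.out.println(ops);             // prints 39, which is 9 + 6n for n = 5
        arrayMax(new int[]{1, 2, 3, 4, 5});  // ascending: re-assignment every pass
        System.out.println(ops);             // prints 49, which is 9 + 8n for n = 5
    }
}
```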


Analysis of Complexity - Cont.

• Consider the case when n gets very large (Asymptotic Analysis).

• Then 8n is much greater than 9, so just using 8n is a good approximation.

• Or, it is even easier to say that the running time grows proportionally to n.

• Or the order is to the first power of n, or just first order, or O(n).

• For the most simple analysis, it makes sense to just ignore the constant values in the equation.


“Big O” Notation

• Mathematical definition of Big O Notation:

Let f(n) and g(n) be functions that map nonnegative integers to real numbers. We say that f(n) is O(g(n)) if there is a real constant, c, where c > 0, and an integer constant n0, where n0 ≥ 1, such that f(n) ≤ c·g(n) for every integer n ≥ n0.


“Big O” Notation - Cont.

• f(n) is the function that describes the actual time of the program as a function of the size of the dataset, n.

• The idea here is that g(n) is a much simpler function than f(n).

• Given all the assumptions and approximations made to get any kind of a function, it does not make sense to use anything more than the simplest representation.

• The more complicated the functions are, the harder they are to compare.


“Big O” Notation - Cont.

• For example, consider the function:

f(n) = n² + 100n + log10(n) + 1000

• For small values of n the last term, 1000, dominates. When n is around 10, the terms 100n + 1000 dominate. When n is around 100, the terms n² and 100n dominate. When n gets much larger than 100, the n² term dominates all others. The log10(n) term never gets to play at all!

• So it would be safe to say that this function is O(n²) for values of n > 100.
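The O(n²) claim can be sanity-checked against the definition. A small sketch (the witnesses c = 3 and n0 = 100 are assumptions chosen for illustration, not from the slides) that tests f(n) ≤ c·n² at several values of n ≥ n0:

```java
public class BigOCheck {

    // f(n) = n^2 + 100n + log10(n) + 1000, the example function above
    static double f(double n) {
        return n * n + 100.0 * n + Math.log10(n) + 1000.0;
    }

    public static void main(String[] args) {
        boolean holds = true;
        for (int n = 100; n <= 1_000_000; n *= 10)
            holds &= f(n) <= 3.0 * n * n;   // is f(n) <= c*g(n) with c = 3?
        System.out.println(holds);          // prints true
    }
}
```

This is a spot check, not a proof; the proof would show 2n² ≥ 100n + log10(n) + 1000 for all n ≥ 100.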


“Big O” Notation - Cont.

• Similarly, a function like:

20n³ + 10n·log(n) + 5

is O(n³). (It can be proved that the function is ≤ 35n³ for n ≥ 1. So c is 35 and n0 is 1.)


“Big O” Notation - Cont.

• A question: if an algorithm is O(n²), why is it not also O(n³) or, generally, O(n^i) where i > 2?
• The answer: it is!
• O(n²) is just a subset of O(n^i).

• However, O(n²) is probably the closest representation of the algorithm. While other notations satisfy the definition of “big O”, they would lie further away from the algorithm’s actual behaviour, and would not do as good a job at describing the algorithm.


“Big O” Notation - Cont.

• So, use the simplest notation that lies closest to the actual behaviour of the algorithm.

• If you can choose between algorithms based on the simplest notation, great!

• If you cannot, then you need to increase the complexity of the equations.

• If this does not work, then give up and go to experiment!


“Big O” Notation - Cont.

• Common big O notations:

– Constant       O(1)
– Logarithmic    O(log(n))
– Linear         O(n)
– N log n        O(n·log(n))
– Quadratic      O(n²)
– Cubic          O(n³)
– Polynomial     O(n^x)
– Exponential    O(2^n)


“Big O” Notation - Cont.

• On the next slide: how these functions vary with n. Assume a rate of 1 instruction per µsec (micro-second):


“Big O” Notation - Cont.

Notation     | n = 10   | n = 10²  | n = 10³  | n = 10⁴   | n = 10⁵    | n = 10⁶
O(1)         | 1 µsec   | 1 µsec   | 1 µsec   | 1 µsec    | 1 µsec     | 1 µsec
O(log(n))    | 3 µsec   | 7 µsec   | 10 µsec  | 13 µsec   | 17 µsec    | 20 µsec
O(n)         | 10 µsec  | 100 µsec | 1 msec   | 10 msec   | 100 msec   | 1 sec
O(n·log(n))  | 33 µsec  | 664 µsec | 10 msec  | 133 msec  | 1.6 sec    | 20 sec
O(n²)        | 100 µsec | 10 msec  | 1 sec    | 1.7 min   | 2.8 hours  | 11.6 days
O(n³)        | 1 msec   | 1 sec    | 16.7 min | 11.6 days | 31.7 years | 31709 years
O(2^n)       | 1 msec   | 3×10¹⁷ years | —    | —         | —          | —


“Big O” Notation - Cont.

• So, algorithms of O(n³) or O(2^n) are not practical for any more than a few iterations.

• Quadratic, O(n²) (common for nested “for” loops!) gets pretty ugly for n > 1000 (Bubble sort, for example…).

• To figure out the big O notation for more complicated algorithms, it is necessary to understand some mathematical concepts.


Some Math…

• Logarithms and exponents - a quick review:
• By definition:

log_b(a) = c, when a = b^c


More Math…

• Some rules when using logarithms and exponents - for a, b, c positive real numbers:

  – log_b(a·c) = log_b(a) + log_b(c)
  – log_b(a/c) = log_b(a) - log_b(c)
  – log_b(a^c) = c·log_b(a)
  – log_b(a) = log_c(a) / log_c(b)
  – b^(log_c(a)) = a^(log_c(b))
  – (b^a)^c = b^(a·c)
  – b^a · b^c = b^(a+c)
  – b^a / b^c = b^(a-c)
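One place the change-of-base rule shows up in practice: Java's Math class has log (base e) and log10, but no base-2 logarithm, so log2(n) has to be computed via the rule. A small sketch:

```java
public class Log2 {

    // log_b(a) = log_c(a) / log_c(b), with c = e (the natural log)
    static double log2(double a) {
        return Math.log(a) / Math.log(2.0);
    }

    public static void main(String[] args) {
        System.out.println(log2(1024.0)); // approximately 10.0, since 2^10 = 1024
    }
}
```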


More Math…

• Summations:

Σ_{i=a}^{b} f(i) = f(a) + f(a+1) + f(a+2) + … + f(b)

• If a does not depend on i:

Σ_{i=1}^{n} a·f(i) = a · Σ_{i=1}^{n} f(i)


More Math…

• Also:

Σ_{i=1}^{n} i = n(n+1)/2

Σ_{i=1}^{n} i² = n(n+1)(2n+1)/6


Still More Math…

• The Geometric Sum, for any real number a > 0 and a ≠ 1:

Σ_{i=0}^{n} a^i = (a^(n+1) - 1) / (a - 1)

• Also, for 0 < a < 1:

Σ_{i=0}^{∞} a^i = 1 / (1 - a)


Still More Math…

• Finally, for a > 0, a ≠ 1, and n ≥ 2:

Σ_{i=1}^{n} i·a^i = (a - (n+1)·a^(n+1) + n·a^(n+2)) / (1 - a)²

• Lots of other summation formulae exist (mathematicians just love to collect them!), but these are the ones commonly found in proofs of big O notations.


Properties of Big O Notation

1. If d(n) is O(f(n)), then ad(n) is still O(f(n)), for any constant, a > 0.

2. If d(n) is O(f(n)) and e(n) is O(g(n)), then d(n) + e(n) is O(f(n) + g(n)).

3. If d(n) is O(f(n)) and e(n) is O(g(n)), then d(n)e(n) is O(f(n)g(n)).

4. If d(n) is O(f(n)) and f(n) is O(g(n)), then d(n) is O(g(n)).


Properties of Big O Notation – Cont.

5. If f(n) is a polynomial of degree d (i.e. f(n) = a₀ + a₁n + a₂n² + a₃n³ + … + a_d·n^d), then f(n) is O(n^d).

6. log_a(n) is O(log_b(n)) for any a, b > 1. That is to say, log_a(n) is O(log2(n)) for any a > 1.


“Big O” Notation - Cont.

• While correct, it is considered “bad form” to include constants and lower order terms when presenting a Big O notation unless the other terms cannot be simplified and are close in magnitude.

• For most comparisons between algorithms, the simplest representation of the notation is sufficient.

• When simplifying a complex function always look for the obviously dominant term (for large n) and toss all the others.


Application of Big O to Loops

• For example, a single “for” loop:

sum = 0;
for (i = 0; i < n; i++)
    sum += a[i];

• 2 operations before the loop, 6 operations inside the loop. Total is 2 + 6n. This gives the loop O(n).


Application of Big O to Loops - Cont.

• Nested “for” loops, where both loop until n:

for (i = 0; i < n; i++) {
    sum[i] = 0;
    for (j = 0; j < n; j++)
        sum[i] += a[i][j];
}

• Outer loop: 1 + 5n + n·(inner loop). Inner loop = 1 + 7n. Total = 1 + 6n + 7n², so the algorithm is O(n²).


Application of Big O to Loops - Cont.

• Nested “for” loops where the inner loop goes less than n times (used in simple sorts):

for (i = 0; i < n; i++) {
    sum[i] = 0;
    for (j = 0; j < i; j++)
        sum[i] += a[i][j];
}

• Outer loop: 1 + 6n + (7i, for i = 1, 2, 3, …, n-1).


Application of Big O to Loops - Cont.

• Or:

1 + 6n + Σ_{i=1}^{n-1} 7i = 1 + 6n + 7n(n-1)/2

• Which is O(n) + O(n²), which is finally just O(n²).


Application of Big O to Loops - Cont.

• Nested for loops are usually O(n²), but not always!

for (i = 3; i < n; i++) {
    sum[i] = 0;
    for (j = i-3; j <= i; j++)
        sum[i] += a[i][j];
}

• Inner loop runs 4 times for every value of i. So, this is actually O(n), linear, not quadratic.


Big O and Other Algorithms

• For the rest of this course, as we develop algorithms for:
  – Searching
  – Sorting
  – Recursion
  – Etc.
we will analyze the algorithm in terms of its “Big O” notation.

• We’ll find out what’s good and what’s bad!


Sorting – Overview

• Sorting algorithms can be compared on the basis of:
  – The number of comparisons for a dataset of size n,
  – The number of data movements (“swaps”) necessary,
  – How these measures depend on the state of the data (random, mostly in order, reverse order), and
  – How these measures change with n (“Big O” notation).


Simple Sorting Algorithms – Insertion Sort

• Probably the most “instinctive” kind of sort: find the location for an element, move all others up one, and insert the element.

• Pseudocode:

• Loop through array from i=1 to array.length-1, selecting the element at position i as temp.
  – Locate the position for temp (position j, where j <= i), and move all elements above j up one location.
  – Put temp at position j.


Simple Sorting Algorithms – Insertion Sort – Cont.

public static void insertionSort (int[] A) {

    int temp;
    int i, j;

    for (i = 1; i < A.length; i++) {
        temp = A[i];
        for (j = i; j > 0 && temp < A[j-1]; j--)
            A[j] = A[j-1];
        A[j] = temp;
    } // end for
} // end insertionSort


Complexity of Insertion Sort

• “A.length” is the same as “n”.
• Outer loop is going to go n times, regardless of the state of the data.
• How about the inner loop?

• Best case (data in order) is O(n).

• Worst and average case is O(n²).
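The best/worst gap is easy to see experimentally. A small illustrative sketch (not from the slides) that counts the inner-loop comparisons for in-order versus reverse-order data:

```java
public class InsertionCount {

    static long comparisons; // inner-loop comparisons made by the last sort

    static void insertionSort(int[] A) {
        comparisons = 0;
        for (int i = 1; i < A.length; i++) {
            int temp = A[i];
            int j = i;
            while (j > 0) {
                comparisons++;             // one "temp < A[j-1]" test
                if (!(temp < A[j - 1]))
                    break;
                A[j] = A[j - 1];           // shift element up one position
                j--;
            }
            A[j] = temp;
        }
    }

    public static void main(String[] args) {
        int n = 100;
        int[] inOrder = new int[n];
        int[] reversed = new int[n];
        for (int k = 0; k < n; k++) {
            inOrder[k] = k;
            reversed[k] = n - k;
        }
        insertionSort(inOrder);
        System.out.println(comparisons); // prints 99: n-1 comparisons, so O(n)
        insertionSort(reversed);
        System.out.println(comparisons); // prints 4950: n(n-1)/2, so O(n^2)
    }
}
```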


Simple Sorting Algorithms – Selection Sort

• This one works by selecting the smallest element and then putting it in its proper location.

• Pseudocode:

• Loop through the array from i=0 to one element short of the end of the array.
  – Select the smallest element in the array range from i plus one to the end of the array.
  – Swap this value with the value at position i.


Simple Sorting Algorithms – Selection Sort

• First, a “swap” method that will be used by this and other sorts:

public static void swap(int[] A, int pos1, int pos2) {

    int temp = A[pos1];
    A[pos1] = A[pos2];
    A[pos2] = temp;

} // end swap


Simple Sorting Algorithms – Selection Sort

public static void selectionSort(int[] A) {

    int i, j, least;

    for (i = 0; i < A.length-1; i++) {
        least = i;
        for (j = i+1; j < A.length; j++)
            if (A[j] < A[least])
                least = j;
        if (i != least)
            swap(A, least, i);
    } // end for

} // end selectionSort


Complexity of Selection Sort

• Outer loop goes n times, just like Insertion sort.
• Does the number of executions of the inner loop depend on the state of the data?

• (See the nested loop analysis above.) The complexity for comparisons is O(n²), regardless of the state of the data.

• The complexity of swaps is O(n), since the swap only takes place in the outer loop.


Simple Sorting Algorithms – Bubble Sort

• Best envisioned as a vertical column of numbers treated as bubbles: the larger bubbles gradually work their way to the top of the column, with the smaller ones pushed down to the bottom.

• Pseudocode:

• Loop through array from i=0 to the length of the array.
  – Loop down from the last element in the array to i.
    - Swap adjacent elements if they are in the wrong order.


Simple Sorting Algorithms – Bubble Sort – Cont.

public static void bubbleSort(int[] A) {

    int i, j;

    for (i = 0; i < A.length; i++)
        for (j = A.length-1; j > i; j--)
            if (A[j] < A[j-1])
                swap(A, j, j-1);

} // end bubbleSort


Complexity of Bubble Sort

• Outer loop goes n times, as usual.
• Both the comparison and swap are in the inner loop (yuk!)

• Best case (data already in order): comparisons are still O(n²), but swaps will be O(n).

• Bubble sort can be improved (slightly) by adding a flag to the inner loop. If you loop through the array without swapping, then the data is in order, and the method can be halted.
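One way the flag might look in code (a sketch; this exact variant is not shown in the slides):

```java
public class BubbleFlag {

    static void swap(int[] A, int pos1, int pos2) {
        int temp = A[pos1];
        A[pos1] = A[pos2];
        A[pos2] = temp;
    }

    public static void bubbleSort(int[] A) {
        for (int i = 0; i < A.length; i++) {
            boolean swapped = false;        // the flag
            for (int j = A.length - 1; j > i; j--)
                if (A[j] < A[j - 1]) {
                    swap(A, j, j - 1);
                    swapped = true;
                }
            if (!swapped)                   // a full pass with no swaps:
                return;                     // the data is in order, halt early
        }
    }

    public static void main(String[] args) {
        int[] A = {5, 1, 4, 2, 3};
        bubbleSort(A);
        System.out.println(java.util.Arrays.toString(A)); // prints [1, 2, 3, 4, 5]
    }
}
```

On data that is already in order, the flagged version now makes a single O(n) pass instead of n passes of comparisons.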


Measurements of Execution Speed

• Refer to the Excel spreadsheet “SortintTimes.xls”:
• Sorted between 1000 and 10,000 random int values.
• Used System.currentTimeMillis() to measure execution time.
• Measured times for completely random array (average case), almost sorted array (best case) and reverse order array (worst case).
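A minimal harness in the same spirit (the array size, the fixed seed, and the use of Arrays.sort as a stand-in for the sort under test are assumptions for illustration; the spreadsheet itself is not reproduced here):

```java
import java.util.Arrays;
import java.util.Random;

public class TimeSort {
    public static void main(String[] args) {
        int n = 10_000;
        int[] A = new int[n];
        Random rng = new Random(42);     // fixed seed: repeatable "random" data
        for (int i = 0; i < n; i++)
            A[i] = rng.nextInt();

        long start = System.currentTimeMillis();
        Arrays.sort(A);                  // replace with the sort being measured
        long elapsed = System.currentTimeMillis() - start;

        System.out.println(n + " ints sorted in " + elapsed + " ms");
    }
}
```

Note that currentTimeMillis() has coarse resolution, so small n can report 0 ms; repeating the sort many times and averaging gives more stable numbers.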


Choosing Between the Simple Sorts

• (Assuming you have less than about 10,000 data items…)

• First: what is the state of the data?
• Second: what is slower: comparisons or swaps?
• Third: never use bubble sort, anyways!

• Since both selection and insertion are O(n²), you will need to look at the coefficient (c), calculated using sample data, to choose between them.


Sorting Animations

• For a collection of animation links see:

http://www.hig.no/~algmet/animate.html

• Here are a couple that I liked:

http://www.cs.pitt.edu/~kirk/cs1501/animations/Sort3.html

http://cs.smith.edu/~thiebaut/java/sort/demo.html


Sequential Search

• Sequential search pseudocode:

• Loop through array starting at the first element until the value of target matches one of the array elements.

• If a match is not found, return –1.


Sequential Search, Cont.

• As code:

public int seqSearch(int[] A, int target) {

    for (int i = 0; i < A.length; i++)
        if (A[i] == target)
            return i;

    return -1;
}


Complexity of Sequential Search

• Best case - first element is the match: O(1).
• Worst case - element not found: O(n).
• Average case - element in middle: O(n).


Binary Search

• Binary search pseudocode. Only works on ordered sets:

a) Locate midpoint of array to search.
b) Determine if target is in lower half or upper half of array.
   • If in lower half, make this half the array to search.
   • If in upper half, make this half the array to search.
• Loop back to step a), unless the size of the array to search is one, and this element does not match, in which case return -1.


Binary Search – Cont.

public int binSearch (int[] A, int key) {

    int lo = 0;
    int hi = A.length - 1;
    int mid = (lo + hi) / 2;

    while (lo <= hi) {
        if (key < A[mid])
            hi = mid - 1;
        else if (A[mid] < key)
            lo = mid + 1;
        else
            return mid;
        mid = (lo + hi) / 2;
    }
    return -1;
}
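A quick usage sketch (the sample array is illustrative). Note the precondition: the array must already be in ascending order:

```java
public class BinSearchDemo {

    static int binSearch(int[] A, int key) {
        int lo = 0;
        int hi = A.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) / 2;     // midpoint of the current range
            if (key < A[mid])
                hi = mid - 1;            // search the lower half
            else if (A[mid] < key)
                lo = mid + 1;            // search the upper half
            else
                return mid;              // match found
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] A = {2, 5, 8, 12, 16, 23, 38}; // must be sorted
        System.out.println(binSearch(A, 23)); // prints 5 (the index of 23)
        System.out.println(binSearch(A, 7));  // prints -1 (not present)
    }
}
```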


Complexity of Binary Search

• For the best case, the element matches right at the middle of the array, and the loop only executes once.

• For the worst case, key will not be found and the maximum number of loops will occur.

• Note that the loop will execute until there is only one element left that does not match.

• Each time through the loop the number of elements left is halved, giving the progression below for the number of elements:


Complexity of Binary Search, Cont.

• Number of elements to be searched - progression:

n, n/2, n/2², n/2³, …, n/2^m

• The last comparison is for n/2^m, when the number of elements is one (worst case).

• So, n/2^m = 1, or n = 2^m.

• Or, m = log2(n).

• So, the algorithm is O(log(n)) for the worst case.
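The halving argument can also be checked directly: count how many times a range of n elements can be cut in half before one element remains. A small sketch:

```java
public class Halvings {

    static int halvings(int n) {
        int m = 0;
        while (n > 1) {   // each pass of binary search halves the range
            n /= 2;
            m++;
        }
        return m;
    }

    public static void main(String[] args) {
        System.out.println(halvings(1024));      // prints 10, and log2(1024) = 10
        System.out.println(halvings(1_000_000)); // prints 19, about log2(10^6)
    }
}
```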


Binary Search - Cont.

• Binary search at O(log(n)) is much better than sequential at O(n).

• Major reason to sort datasets!


A Caution on the Use of “Big O”

• Big O notation usually ignores the constant c.
• The constant’s magnitude depends on external factors (hardware, OS, etc.).

• For example, algorithm A has the number of operations = 10⁸n, and algorithm B has the form 10n².

• A is then O(n), and B is O(n²).
• Based on this alone, A would be chosen over B.
• But, if c is considered, B would be faster for n up to 10⁷ - which is a very large dataset!
• So, in this case, B would be preferred over A, in spite of the Big O notation.


A Caution on the Use of “Big O” – Cont.

• For “mission critical” algorithm comparison, nothing beats experiments where actual times are measured.

• When measuring times, keep everything else constant, just change the algorithm (same hardware, OS, type of data, etc.).


Recursion

• An advanced programming technique that can be used with any language that supports procedure or function calls (or “methods!”).

• Advanced sorting algorithms, like Quicksort, use recursion.

• We will also look at how to analyze the complexity of recursive algorithms.


Recursion, Cont.

• Most simply defined as when a method “calls itself” as part of its normal operation. For example:

public int factorial (int n) {

    if (n == 0)
        return 1;
    else
        return n * factorial(n - 1);

} // end factorial method


Recursion, Cont.

• Start by considering recursive definitions:
• A recursive definition has two parts:

– The “anchor”, “ground case” or “stopping case” supplies all the basic elements out of which all other elements of the set are constructed. This part is not recursive.

– The second part consists of the rules that determine how all other elements are constructed. It will define a new element starting from an element that has already been constructed (the “recursive” part).


Recursive Definitions

• For example, the factorial function, n!, is evaluated as:

n! = n(n-1)(n-2)(n-3)…(2)(1)

• So 15! = 15(14)(13)(12)(11)… = 1307674368000.

• Or, in terms of a mathematical definition:

n! = Π_{i=1}^{n} i


Recursive Definitions - Cont.

• Or, in terms of a recursive definition:

n! = 1           if n = 0
n! = n·(n-1)!    if n > 0

• The upper part is the anchor, and the lower part is the definition of the rules that are used to get all other values. It is also called the “inductive step”.

• The lower part is recursive.


Coding a Recursive Definition

• In code:

n! = 1           if n = 0
n! = n·(n-1)!    if n > 0

public int factorial (int n) {

    if (n == 0)
        return 1;
    else
        return n * factorial(n - 1);

} // end factorial method


Coding a Recursive Definition – Cont.

• “But, wait!” you say! Can’t this just be done in a loop?

• Yes, most recursive codes can be carried out using loop(s) instead.

• However, the “looping” version is seldom as compact as the recursive one.

• Recursion is used in a program when it produces code that is shorter and more easily understood (and debugged!) than code using a loop.

• Recursive versions are usually slower than looping versions.


Coding a Recursive Definition – Cont.

• A non-recursive factorial method:

public int factorial (int n) {

    int f = 1;

    if (n > 0)
        for (int i = 2; i <= n; i++)
            f *= i;

    return f;

} // end factorial method


Coding a Recursive Definition – Cont.

• (Note that we are ignoring the very real possibility of numeric overflow in these factorial methods. It would be better to use a long, or even a “BigInteger” variable.)
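A sketch of the overflow-safe variant the note suggests (Java's arbitrary-precision integer class is java.math.BigInteger; an int factorial overflows at 13! and a long at 21!):

```java
import java.math.BigInteger;

public class BigFactorial {

    static BigInteger factorial(int n) {
        BigInteger f = BigInteger.ONE;
        for (int i = 2; i <= n; i++)
            f = f.multiply(BigInteger.valueOf(i)); // never overflows
        return f;
    }

    public static void main(String[] args) {
        System.out.println(factorial(15)); // prints 1307674368000, matching 15! above
    }
}
```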


Coding a Recursive Definition – Cont.

• Back to the recursive factorial method:

public int factorial (int n) {
   if (n == 0) return 1;
   else return n * factorial(n - 1);
} // end factorial method

• This is a slightly more “elegant” piece of code, and it is easy to write and debug.

• What is not so easy to understand is how it works!


Summer 2007 CISC121 - Prof. McLeod 74

Back to Stacks - Activation Frames

• Stacks are an integral part of computer architecture.

• For example, Java byte code is run by the “Java Virtual Machine”, or “JVM”.

• The JVM is stack-based.

• Each thread (or process…) in the JVM has its own private run-time stack in memory (our programs are “single threaded”).

• Each run-time stack contains “activation frames” – one for each method that has been activated.

• Only one activation frame (or method) can be active at a time for each thread.

• An activation frame is created, or “pushed”, every time a method is invoked. When the method is completed, the frame is popped off the stack.
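Java even lets a program peek at its own run-time stack. This sketch (class and method names our own invention) counts the extra frames that recursive calls add, using Thread.currentThread().getStackTrace():

```java
public class FrameDemo {
    // Each recursive call pushes one activation frame;
    // getStackTrace() lets us measure the stack depth at the bottom.
    public static int depthAt(int n) {
        if (n == 0) {
            return Thread.currentThread().getStackTrace().length;
        }
        return depthAt(n - 1);
    }

    public static void main(String[] args) {
        int base = depthAt(0);
        int deeper = depthAt(5);
        // Five extra recursive calls mean five extra frames on the stack
        // (typically 5 on a standard JVM).
        System.out.println("extra frames: " + (deeper - base));
    }
}
```

Note that the JVM specification allows some implementations to omit frames from a stack trace, so treat this as an illustration rather than a guarantee.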


Summer 2007 CISC121 - Prof. McLeod 75

Recursion and the Run-Time Stack

• Consider another simple recursive definition (ignore n<0):

public static double power (double x, int n) {
   if (n == 0) return 1.0;
   else return x * power(x, n - 1);
} // end power

x^n = 1,            if n = 0
x^n = x · x^(n-1),  if n > 0
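A quick sanity check of the method above (the wrapper class name PowerDemo is our own):

```java
public class PowerDemo {
    // The recursive power method from the slide, unchanged.
    public static double power(double x, int n) {
        if (n == 0) return 1.0;
        return x * power(x, n - 1);
    }

    public static void main(String[] args) {
        System.out.println(power(2.0, 10)); // prints 1024.0
        System.out.println(power(5.6, 0));  // base case: prints 1.0
    }
}
```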


Summer 2007 CISC121 - Prof. McLeod 76

Recursion and the Run-Time Stack – Cont.

• In a main method:

public static void main(String[] args) {
   double z = power(5.6, 2);
} // end main

• Look at what happens in debug mode in Eclipse…


Summer 2007 CISC121 - Prof. McLeod 77

• An activation frame for this first call to power is pushed onto the run-time stack, and execution is transferred to this frame (execution does not continue in main).

• The power method runs until it hits the second call to power on line:

power(5.6, 1) // second call

Recursion and the Run-Time Stack – Cont.

[Run-time stack: main → power(5.6, 2)]


Summer 2007 CISC121 - Prof. McLeod 78

• Execution in the first call frame stops and a frame for the second call is pushed onto the stack.

• Execution in the second call frame continues until it hits the third call to power:

power(5.6, 0) // third call

Recursion and the Run-Time Stack – Cont.

[Run-time stack: main → power(5.6, 2) → power(5.6, 1)]


Summer 2007 CISC121 - Prof. McLeod 79

• Execution in the second call frame stops and a frame for the third call is pushed onto the stack.

• Now, power executes until it reaches the line:

return 1.0;

• This causes the value 1.0 to be placed at the “Return value” part of the third call’s activation frame.

Recursion and the Run-Time Stack – Cont.

[Run-time stack: main → power(5.6, 2) → power(5.6, 1) → power(5.6, 0)]


Summer 2007 CISC121 - Prof. McLeod 80

• This third call to power completes, and the third call’s frame is popped off the stack.

• The second call’s frame accepts the value of 1.0, and execution resumes when 5.6 * 1.0 is calculated.

Recursion and the Run-Time Stack – Cont.

[Run-time stack: main → power(5.6, 2) → power(5.6, 1) → power(5.6, 0) returns 1.0]


Summer 2007 CISC121 - Prof. McLeod 81

• This allows the second call’s frame to complete execution. The resulting value (5.6) is placed in the second call’s Return value part of the frame.

Recursion and the Run-Time Stack – Cont.

[Run-time stack: main → power(5.6, 2) → power(5.6, 1) returns 5.6]


Summer 2007 CISC121 - Prof. McLeod 82

Recursion and the Run-Time Stack – Cont.

• The second call’s frame is popped off the stack, the value “5.6” is given to the first call’s frame, and execution resumes when 5.6 * 5.6 is calculated.

• The first call’s execution completes and the first call’s frame is popped off the stack providing the value “31.36” to the next frame on the stack.

[Run-time stack: main → power(5.6, 2) returns 31.36]


Summer 2007 CISC121 - Prof. McLeod 83

Recursion and the Run-Time Stack – Cont.

• Execution resumes in the main frame, where the value “31.36” is put into the variable z.

• See the factorial method, as well (when does the factorial calculation actually take place?).
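One way to answer that question is to instrument the factorial method so that it prints each multiplication (class name FactorialTrace is our own). The output shows that the multiplications happen only while frames are being popped off the run-time stack, not on the way down:

```java
public class FactorialTrace {
    // The multiplication happens only AFTER the recursive call returns,
    // i.e. while the activation frames are unwinding.
    public static int factorial(int n) {
        if (n == 0) return 1;
        int result = n * factorial(n - 1);
        System.out.println("multiplying " + n + " gives " + result);
        return result;
    }

    public static void main(String[] args) {
        factorial(3);
        // Output order shows the unwinding:
        // multiplying 1 gives 1
        // multiplying 2 gives 2
        // multiplying 3 gives 6
    }
}
```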

[Run-time stack: main, z = 31.36]


Summer 2007 CISC121 - Prof. McLeod 84

Recursion and the Run-Time Stack – Summary

• Execution pauses at any method call, and control is passed to that method.

• Information on the calling method is preserved in the run-time stack, and a new frame is created for the newly called method.

• As recursive calls continue, the frames continue to pile up, each one waiting to complete its execution.

• Finally, the last recursive call is made and the “anchor condition” or “stopping case” is encountered. The method completes without making another recursive call and may provide a Return value, for example.
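If the stopping case is never reached, the frames keep piling up until the JVM runs out of stack space. A minimal sketch (class and method names our own invention):

```java
public class StackLimit {
    // No reachable stopping case: every call pushes another
    // activation frame until the run-time stack is exhausted.
    public static int runaway(int n) {
        return runaway(n + 1);
    }

    public static void main(String[] args) {
        try {
            runaway(0);
        } catch (StackOverflowError e) {
            System.out.println("run-time stack exhausted");
        }
    }
}
```

This is why every recursive method needs an anchor condition that is guaranteed to be reached.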


Summer 2007 CISC121 - Prof. McLeod 85

Recursion and the Run-Time Stack – Summary, Cont.

• The frame for the last call, the “stopping case” is popped off the stack, with the Return value being passed down into the next frame.

• The frame below accepts the Return value, completes, gets popped off, and so on, until the only frame remaining is the frame that contains the original call to the recursive method.

• This frame accepts the value from the method, and continues its execution.


Summer 2007 CISC121 - Prof. McLeod 86

Another Example

• Remember that the activation frames can also hold code after the recursive call.

• See TestRecursion.java
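TestRecursion.java is not reproduced here, but the idea can be sketched with a method of our own: work placed after the recursive call runs only during the unwinding, so the characters below come out in reverse order.

```java
public class AfterTheCall {
    // The concatenation of s.charAt(0) happens AFTER the recursive
    // call returns, so the first character ends up last.
    public static String reversed(String s) {
        if (s.isEmpty()) return "";
        return reversed(s.substring(1)) + s.charAt(0);
    }

    public static void main(String[] args) {
        System.out.println(reversed("stack")); // prints "kcats"
    }
}
```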