CISC121 – Lecture 9
Summer 2007 - Prof. McLeod

Last time:
– Finished Linked Lists
– Stacks & Queues




You Will Need To:

• Work on exercise 4 and assignment 4.
• Assn 4 is due Saturday at 5pm. If you hand it in on Friday instead, Krista will have it marked earlier.


Today

• Analysis of Complexity
• Sorting
  – Insertion
  – Selection
  – Bubble
• Searching
  – Sequential
  – Binary

• Start Recursion?


Comparing Programs

• Possible bases for comparison:
  – Speed of execution
  – Memory consumption
  – Ease of debugging
  – Ease of interfacing with other program elements
  – Backwards compatibility
  – Quality of external documentation, support
  – Ease of use
  – Quality of user interface
  – Cost…


Comparing Algorithms, Cont.

• Nobody will buy a program that is too slow!
  – Regardless of all its other “features”.

• It is too much work to finish an entire program only to find out that it is too slow!

• It would be nice if we could predict speed of operation, or at least predict bottlenecks as code is developed.


Comparing Algorithms, Cont.

• For example, it was fairly obvious that deleting the tail node in a doubly linked list is much faster than deleting the tail node in a singly linked list.

• It is not always this easy to compare code on the basis of expected execution speed.

• The total time of execution is the sum of many complex processor and memory operations.


The Analysis of Complexity

• Is really all about simplifying complexity.

• Complexity is what really goes on when a program is run: the accumulation of all the nanosecond operations that take place.

• All we really need is to be able to say that algorithm “A” will be faster than algorithm “B” if they are both tested under the same conditions with the same kind of data.


Analysis of Complexity - Cont.

• This theoretical technique can be explained by first generalizing the complexity of what goes on when code is run.

• First, define a “Primitive Operation”:


Primitive Operations

• A primitive operation takes a unit of time (a “tick”). The actual length of time will depend on external factors such as the hardware and software environment.

• Select operations that would all take about the same length of time. For example:
  – Assigning a value to a variable.
  – Calling a method.
  – Performing an arithmetic operation.
  – Comparing two numbers.
  – Indexing an array element.
  – Following an object reference.
  – Returning from a method.


Primitive Operations - Cont.

• Assume: Each of these kinds of operations will take the same amount of time on a given hardware and software platform.

• Count primitive operations for a simple method.

• Example - find the highest integer in an array:


Primitive Operations - Cont.

public static int arrayMax(int[] A) {
    int currentMax = A[0];          // 3 operations
    int n = A.length;               // 3 operations
    for (int i = 0; i < n; i++)     // 2 + 3n operations
        if (currentMax < A[i])      // 3n operations
            currentMax = A[i];      // 0 to 2n operations
    return currentMax;              // 1 operation
} // end arrayMax

• Number of primitive operations:
  – Minimum = 3 + 3 + 2 + 3n + 3n + 1 = 9 + 6n
  – Maximum = 9 + 8n


Primitive Operations - Cont.

• Or, best case is 9 + 6n, worst case is 9 + 8n.
• What is the average case?

• If the maximum element had exactly the same probability of being in each of the array locations, then the average case would be 9 + 7n.

• But if the data is not randomly distributed, then getting the average case is very difficult.

• It is much easier to just assume the worst case analysis, 9 + 8n.
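The bookkeeping above can be checked mechanically. Below is a hypothetical instrumented version of arrayMax (an illustration, not from the slides) that tallies the same per-line operation counts, so the total can be compared against the 9 + 6n and 9 + 8n bounds:

```java
public class ArrayMaxCount {

    static long ops; // running tally of primitive operations

    static int arrayMax(int[] A) {
        ops = 0;
        int currentMax = A[0];   ops += 3;   // index + assign: 3 operations
        int n = A.length;        ops += 3;   // follow reference + assign: 3 operations
        ops += 2;                            // loop initialization: 2 operations
        for (int i = 0; i < n; i++) {
            ops += 3;                        // loop test + increment: 3 per pass
            ops += 3;                        // if-comparison + index: 3 per pass
            if (currentMax < A[i]) {
                currentMax = A[i]; ops += 2; // index + assign: 0 to 2 per pass
            }
        }
        ops += 1;                            // return: 1 operation
        return currentMax;
    }

    public static void main(String[] args) {
        arrayMax(new int[]{5, 4, 3, 2, 1});  // max is first: no re-assignments
        System.out.println(ops);             // prints 39, which is 9 + 6n for n = 5
        arrayMax(new int[]{1, 2, 3, 4, 5});  // ascending: re-assignment every pass
        System.out.println(ops);             // prints 49, which is 9 + 8n for n = 5
    }
}
```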


Analysis of Complexity - Cont.

• Consider the case when n gets very large (Asymptotic Analysis).

• Then 8n is much greater than 9, so just using 8n is a good approximation.

• Or, it is even easier to say that the running time grows proportionally to n.

• Or the order is to the first power of n, or just first order, or O(n).

• For the most simple analysis, it makes sense to just ignore the constant values in the equation.


“Big O” Notation

• Mathematical definition of Big O Notation:

Let f(n) and g(n) be functions that map nonnegative integers to real numbers. We say that f(n) is O(g(n)) if there is a real constant, c, where c > 0, and an integer constant n0, where n0 ≥ 1, such that f(n) ≤ c·g(n) for every integer n ≥ n0.


“Big O” Notation - Cont.

• f(n) is the function that describes the actual time of the program as a function of the size of the dataset, n.

• The idea here is that g(n) is a much simpler function than f(n).

• Given all the assumptions and approximations made to get any kind of a function, it does not make sense to use anything more than the simplest representation.

• The more complicated the functions are, the harder they are to compare.


“Big O” Notation - Cont.

• For example, consider the function:

f(n) = n² + 100n + log10(n) + 1000

• For small values of n the last term, 1000, dominates. When n is around 10, the terms 100n + 1000 dominate. When n is around 100, the terms n² and 100n dominate. When n gets much larger than 100, the n² term dominates all others. The log10(n) term never gets to play at all!

• So it would be safe to say that this function is O(n²) for values of n > 100.
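The O(n²) claim can be sanity-checked against the definition. A small sketch (the witnesses c = 3 and n0 = 100 are assumptions chosen for illustration, not from the slides) that tests f(n) ≤ c·n² at several values of n ≥ n0:

```java
public class BigOCheck {

    // f(n) = n^2 + 100n + log10(n) + 1000, the example function above
    static double f(double n) {
        return n * n + 100.0 * n + Math.log10(n) + 1000.0;
    }

    public static void main(String[] args) {
        boolean holds = true;
        for (int n = 100; n <= 1_000_000; n *= 10)
            holds &= f(n) <= 3.0 * n * n;   // is f(n) <= c*g(n) with c = 3?
        System.out.println(holds);          // prints true
    }
}
```

This is a spot check, not a proof; the proof would show 2n² ≥ 100n + log10(n) + 1000 for all n ≥ 100.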


“Big O” Notation - Cont.

• Similarly, a function like:

20n³ + 10n·log(n) + 5

is O(n³). (It can be proved that the function is ≤ 35n³ for n ≥ 1. So c is 35 and n0 is 1.)


“Big O” Notation - Cont.

• A question: if an algorithm is O(n²), why is it not also O(n³) or, generally, O(n^i) where i > 2?
• The answer: it is!
• O(n²) is just a subset of O(n^i).

• However, O(n²) is probably the closest representation of the algorithm. While other notations satisfy the definition of “big O”, they would lie further away from the algorithm’s actual behaviour, and would not do as good a job at describing the algorithm.


“Big O” Notation - Cont.

• So, use the simplest notation that lies closest to the actual behaviour of the algorithm.

• If you can choose between algorithms based on the simplest notation, great!

• If you cannot, then you need to increase the complexity of the equations.

• If this does not work, then give up and go to experiment!


“Big O” Notation - Cont.

• Common big O notations:

– Constant       O(1)
– Logarithmic    O(log(n))
– Linear         O(n)
– N log n        O(n·log(n))
– Quadratic      O(n²)
– Cubic          O(n³)
– Polynomial     O(n^x)
– Exponential    O(2^n)


“Big O” Notation - Cont.

• On the next slide: how these functions vary with n. Assume a rate of 1 instruction per µsec (micro-second):


“Big O” Notation - Cont.

Notation     | n = 10   | n = 10²  | n = 10³  | n = 10⁴   | n = 10⁵    | n = 10⁶
O(1)         | 1 µsec   | 1 µsec   | 1 µsec   | 1 µsec    | 1 µsec     | 1 µsec
O(log(n))    | 3 µsec   | 7 µsec   | 10 µsec  | 13 µsec   | 17 µsec    | 20 µsec
O(n)         | 10 µsec  | 100 µsec | 1 msec   | 10 msec   | 100 msec   | 1 sec
O(n·log(n))  | 33 µsec  | 664 µsec | 10 msec  | 133 msec  | 1.6 sec    | 20 sec
O(n²)        | 100 µsec | 10 msec  | 1 sec    | 1.7 min   | 2.8 hours  | 11.6 days
O(n³)        | 1 msec   | 1 sec    | 16.7 min | 11.6 days | 31.7 years | 31709 years
O(2^n)       | 1 msec   | 3×10¹⁷ years | —    | —         | —          | —


“Big O” Notation - Cont.

• So, algorithms of O(n³) or O(2^n) are not practical for any more than a few iterations.

• Quadratic, O(n²) (common for nested “for” loops!) gets pretty ugly for n > 1000 (Bubble sort, for example…).

• To figure out the big O notation for more complicated algorithms, it is necessary to understand some mathematical concepts.


Some Math…

• Logarithms and exponents - a quick review:
• By definition:

log_b(a) = c, when a = b^c


More Math…

• Some rules when using logarithms and exponents - for a, b, c positive real numbers:

  – log_b(a·c) = log_b(a) + log_b(c)
  – log_b(a/c) = log_b(a) - log_b(c)
  – log_b(a^c) = c·log_b(a)
  – log_b(a) = log_c(a) / log_c(b)
  – b^(log_c(a)) = a^(log_c(b))
  – (b^a)^c = b^(a·c)
  – b^a · b^c = b^(a+c)
  – b^a / b^c = b^(a-c)
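One place the change-of-base rule shows up in practice: Java's Math class has log (base e) and log10, but no base-2 logarithm, so log2(n) has to be computed via the rule. A small sketch:

```java
public class Log2 {

    // log_b(a) = log_c(a) / log_c(b), with c = e (the natural log)
    static double log2(double a) {
        return Math.log(a) / Math.log(2.0);
    }

    public static void main(String[] args) {
        System.out.println(log2(1024.0)); // approximately 10.0, since 2^10 = 1024
    }
}
```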


More Math…

• Summations:

Σ_{i=a}^{b} f(i) = f(a) + f(a+1) + f(a+2) + … + f(b)

• If a does not depend on i:

Σ_{i=1}^{n} a·f(i) = a · Σ_{i=1}^{n} f(i)


More Math…

• Also:

Σ_{i=1}^{n} i = n(n+1)/2

Σ_{i=1}^{n} i² = n(n+1)(2n+1)/6


Still More Math…

• The Geometric Sum, for any real number a > 0 and a ≠ 1:

Σ_{i=0}^{n} a^i = (a^(n+1) - 1) / (a - 1)

• Also, for 0 < a < 1:

Σ_{i=0}^{∞} a^i = 1 / (1 - a)


Still More Math…

• Finally, for a > 0, a ≠ 1, and n ≥ 2:

Σ_{i=1}^{n} i·a^i = (a - (n+1)·a^(n+1) + n·a^(n+2)) / (1 - a)²

• Lots of other summation formulae exist (mathematicians just love to collect them!), but these are the ones commonly found in proofs of big O notations.


Properties of Big O Notation

1. If d(n) is O(f(n)), then ad(n) is still O(f(n)), for any constant, a > 0.

2. If d(n) is O(f(n)) and e(n) is O(g(n)), then d(n) + e(n) is O(f(n) + g(n)).

3. If d(n) is O(f(n)) and e(n) is O(g(n)), then d(n)e(n) is O(f(n)g(n)).

4. If d(n) is O(f(n)) and f(n) is O(g(n)), then d(n) is O(g(n)).


Properties of Big O Notation – Cont.

5. If f(n) is a polynomial of degree d (i.e. f(n) = a₀ + a₁n + a₂n² + a₃n³ + … + a_d·n^d), then f(n) is O(n^d).

6. log_a(n) is O(log_b(n)) for any a, b > 1. That is to say, log_a(n) is O(log2(n)) for any a > 1.


“Big O” Notation - Cont.

• While correct, it is considered “bad form” to include constants and lower order terms when presenting a Big O notation unless the other terms cannot be simplified and are close in magnitude.

• For most comparisons between algorithms, the simplest representation of the notation is sufficient.

• When simplifying a complex function always look for the obviously dominant term (for large n) and toss all the others.


Application of Big O to Loops

• For example, a single “for” loop:

sum = 0;
for (i = 0; i < n; i++)
    sum += a[i];

• 2 operations before the loop, 6 operations inside the loop. Total is 2 + 6n. This gives the loop O(n).


Application of Big O to Loops - Cont.

• Nested “for” loops, where both loop until n:

for (i = 0; i < n; i++) {
    sum[i] = 0;
    for (j = 0; j < n; j++)
        sum[i] += a[i][j];
}

• Outer loop: 1 + 5n + n·(inner loop). Inner loop = 1 + 7n. Total = 1 + 6n + 7n², so the algorithm is O(n²).


Application of Big O to Loops - Cont.

• Nested “for” loops where the inner loop goes less than n times (used in simple sorts):

for (i = 0; i < n; i++) {
    sum[i] = 0;
    for (j = 0; j < i; j++)
        sum[i] += a[i][j];
}

• Outer loop: 1 + 6n + (7i, for i = 1, 2, 3, …, n-1).


Application of Big O to Loops - Cont.

• Or:

1 + 6n + Σ_{i=1}^{n-1} 7i = 1 + 6n + 7n(n-1)/2

• Which is O(n) + O(n²), which is finally just O(n²).


Application of Big O to Loops - Cont.

• Nested for loops are usually O(n²), but not always!

for (i = 3; i < n; i++) {
    sum[i] = 0;
    for (j = i-3; j <= i; j++)
        sum[i] += a[i][j];
}

• Inner loop runs 4 times for every value of i. So, this is actually O(n), linear, not quadratic.


Big O and Other Algorithms

• For the rest of this course, as we develop algorithms for:
  – Searching
  – Sorting
  – Recursion
  – Etc.
we will analyze the algorithm in terms of its “Big O” notation.

• We’ll find out what’s good and what’s bad!


Sorting – Overview

• Sorting algorithms can be compared on the basis of:
  – The number of comparisons for a dataset of size n,
  – The number of data movements (“swaps”) necessary,
  – How these measures depend on the state of the data (random, mostly in order, reverse order), and
  – How these measures change with n (“Big O” notation).


Simple Sorting Algorithms – Insertion Sort

• Probably the most “instinctive” kind of sort: find the location for an element, move all others up one, and insert the element.

• Pseudocode:

• Loop through array from i=1 to array.length-1, selecting the element at position i as temp.
  – Locate the position for temp (position j, where j <= i), and move all elements above j up one location.
  – Put temp at position j.


Simple Sorting Algorithms – Insertion Sort – Cont.

public static void insertionSort (int[] A) {

    int temp;
    int i, j;

    for (i = 1; i < A.length; i++) {
        temp = A[i];
        for (j = i; j > 0 && temp < A[j-1]; j--)
            A[j] = A[j-1];
        A[j] = temp;
    } // end for
} // end insertionSort


Complexity of Insertion Sort

• “A.length” is the same as “n”.
• Outer loop is going to go n times, regardless of the state of the data.
• How about the inner loop?

• Best case (data in order) is O(n).

• Worst and average case is O(n²).
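The best/worst gap is easy to see experimentally. A small illustrative sketch (not from the slides) that counts the inner-loop comparisons for in-order versus reverse-order data:

```java
public class InsertionCount {

    static long comparisons; // inner-loop comparisons made by the last sort

    static void insertionSort(int[] A) {
        comparisons = 0;
        for (int i = 1; i < A.length; i++) {
            int temp = A[i];
            int j = i;
            while (j > 0) {
                comparisons++;             // one "temp < A[j-1]" test
                if (!(temp < A[j - 1]))
                    break;
                A[j] = A[j - 1];           // shift element up one position
                j--;
            }
            A[j] = temp;
        }
    }

    public static void main(String[] args) {
        int n = 100;
        int[] inOrder = new int[n];
        int[] reversed = new int[n];
        for (int k = 0; k < n; k++) {
            inOrder[k] = k;
            reversed[k] = n - k;
        }
        insertionSort(inOrder);
        System.out.println(comparisons); // prints 99: n-1 comparisons, so O(n)
        insertionSort(reversed);
        System.out.println(comparisons); // prints 4950: n(n-1)/2, so O(n^2)
    }
}
```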


Simple Sorting Algorithms – Selection Sort

• This one works by selecting the smallest element and then putting it in its proper location.

• Pseudocode:

• Loop through the array from i=0 to one element short of the end of the array.
  – Select the smallest element in the array range from i plus one to the end of the array.
  – Swap this value with the value at position i.


Simple Sorting Algorithms – Selection Sort

• First, a “swap” method that will be used by this and other sorts:

public static void swap(int[] A, int pos1, int pos2) {

    int temp = A[pos1];
    A[pos1] = A[pos2];
    A[pos2] = temp;

} // end swap


Simple Sorting Algorithms – Selection Sort

public static void selectionSort(int[] A) {

    int i, j, least;

    for (i = 0; i < A.length-1; i++) {
        least = i;
        for (j = i+1; j < A.length; j++)
            if (A[j] < A[least])
                least = j;
        if (i != least)
            swap(A, least, i);
    } // end for

} // end selectionSort


Complexity of Selection Sort

• Outer loop goes n times, just like Insertion sort.
• Does the number of executions of the inner loop depend on the state of the data?

• (See the nested loop analysis above.) The complexity for comparisons is O(n²), regardless of the state of the data.

• The complexity of swaps is O(n), since the swap only takes place in the outer loop.


Simple Sorting Algorithms – Bubble Sort

• Best envisioned as a vertical column of numbers treated as bubbles: the larger bubbles gradually work their way to the top of the column, with the smaller ones pushed down to the bottom.

• Pseudocode:

• Loop through array from i=0 to the length of the array.
  – Loop down from the last element in the array to i.
    - Swap adjacent elements if they are in the wrong order.


Simple Sorting Algorithms – Bubble Sort – Cont.

public static void bubbleSort(int[] A) {

    int i, j;

    for (i = 0; i < A.length; i++)
        for (j = A.length-1; j > i; j--)
            if (A[j] < A[j-1])
                swap(A, j, j-1);

} // end bubbleSort


Complexity of Bubble Sort

• Outer loop goes n times, as usual.
• Both the comparison and swap are in the inner loop (yuk!)

• Best case (data already in order): comparisons are still O(n²), but swaps will be O(n).

• Bubble sort can be improved (slightly) by adding a flag to the inner loop. If you loop through the array without swapping, then the data is in order, and the method can be halted.
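One way the flag might look in code (a sketch; this exact variant is not shown in the slides):

```java
public class BubbleFlag {

    static void swap(int[] A, int pos1, int pos2) {
        int temp = A[pos1];
        A[pos1] = A[pos2];
        A[pos2] = temp;
    }

    public static void bubbleSort(int[] A) {
        for (int i = 0; i < A.length; i++) {
            boolean swapped = false;        // the flag
            for (int j = A.length - 1; j > i; j--)
                if (A[j] < A[j - 1]) {
                    swap(A, j, j - 1);
                    swapped = true;
                }
            if (!swapped)                   // a full pass with no swaps:
                return;                     // the data is in order, halt early
        }
    }

    public static void main(String[] args) {
        int[] A = {5, 1, 4, 2, 3};
        bubbleSort(A);
        System.out.println(java.util.Arrays.toString(A)); // prints [1, 2, 3, 4, 5]
    }
}
```

On data that is already in order, the flagged version now makes a single O(n) pass instead of n passes of comparisons.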


Measurements of Execution Speed

• Refer to the Excel spreadsheet “SortintTimes.xls”:
• Sorted between 1000 and 10,000 random int values.
• Used System.currentTimeMillis() to measure execution time.
• Measured times for completely random array (average case), almost sorted array (best case) and reverse order array (worst case).
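A minimal harness in the same spirit (the array size, the fixed seed, and the use of Arrays.sort as a stand-in for the sort under test are assumptions for illustration; the spreadsheet itself is not reproduced here):

```java
import java.util.Arrays;
import java.util.Random;

public class TimeSort {
    public static void main(String[] args) {
        int n = 10_000;
        int[] A = new int[n];
        Random rng = new Random(42);     // fixed seed: repeatable "random" data
        for (int i = 0; i < n; i++)
            A[i] = rng.nextInt();

        long start = System.currentTimeMillis();
        Arrays.sort(A);                  // replace with the sort being measured
        long elapsed = System.currentTimeMillis() - start;

        System.out.println(n + " ints sorted in " + elapsed + " ms");
    }
}
```

Note that currentTimeMillis() has coarse resolution, so small n can report 0 ms; repeating the sort many times and averaging gives more stable numbers.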


Choosing Between the Simple Sorts

• (Assuming you have less than about 10,000 data items…)

• First: what is the state of the data?
• Second: what is slower: comparisons or swaps?
• Third: never use bubble sort, anyways!

• Since both selection and insertion are O(n²), you will need to look at the coefficient (c), calculated using sample data, to choose between them.


Sorting Animations

• For a collection of animation links see:

http://www.hig.no/~algmet/animate.html

• Here are a couple that I liked:

http://www.cs.pitt.edu/~kirk/cs1501/animations/Sort3.html

http://cs.smith.edu/~thiebaut/java/sort/demo.html


Sequential Search

• Sequential search pseudocode:

• Loop through array starting at the first element until the value of target matches one of the array elements.

• If a match is not found, return –1.


Sequential Search, Cont.

• As code:

public int seqSearch(int[] A, int target) {

    for (int i = 0; i < A.length; i++)
        if (A[i] == target)
            return i;

    return -1;
}


Complexity of Sequential Search

• Best case - first element is the match: O(1).
• Worst case - element not found: O(n).
• Average case - element in middle: O(n).


Binary Search

• Binary search pseudocode. Only works on ordered sets:

a) Locate midpoint of array to search.
b) Determine if target is in lower half or upper half of array.
   • If in lower half, make this half the array to search.
   • If in upper half, make this half the array to search.
• Loop back to step a), unless the size of the array to search is one, and this element does not match, in which case return -1.


Binary Search – Cont.

public int binSearch (int[] A, int key) {

    int lo = 0;
    int hi = A.length - 1;
    int mid = (lo + hi) / 2;

    while (lo <= hi) {
        if (key < A[mid])
            hi = mid - 1;
        else if (A[mid] < key)
            lo = mid + 1;
        else
            return mid;
        mid = (lo + hi) / 2;
    }
    return -1;
}
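A quick usage sketch (the sample array is illustrative). Note the precondition: the array must already be in ascending order:

```java
public class BinSearchDemo {

    static int binSearch(int[] A, int key) {
        int lo = 0;
        int hi = A.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) / 2;     // midpoint of the current range
            if (key < A[mid])
                hi = mid - 1;            // search the lower half
            else if (A[mid] < key)
                lo = mid + 1;            // search the upper half
            else
                return mid;              // match found
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] A = {2, 5, 8, 12, 16, 23, 38}; // must be sorted
        System.out.println(binSearch(A, 23)); // prints 5 (the index of 23)
        System.out.println(binSearch(A, 7));  // prints -1 (not present)
    }
}
```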


Complexity of Binary Search

• For the best case, the element matches right at the middle of the array, and the loop only executes once.

• For the worst case, key will not be found and the maximum number of loops will occur.

• Note that the loop will execute until there is only one element left that does not match.

• Each time through the loop the number of elements left is halved, giving the progression below for the number of elements:


Complexity of Binary Search, Cont.

• Number of elements to be searched - progression:

n, n/2, n/2², n/2³, …, n/2^m

• The last comparison is for n/2^m, when the number of elements is one (worst case).

• So, n/2^m = 1, or n = 2^m.

• Or, m = log2(n).

• So, the algorithm is O(log(n)) for the worst case.
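The halving argument can also be checked directly: count how many times a range of n elements can be cut in half before one element remains. A small sketch:

```java
public class Halvings {

    static int halvings(int n) {
        int m = 0;
        while (n > 1) {   // each pass of binary search halves the range
            n /= 2;
            m++;
        }
        return m;
    }

    public static void main(String[] args) {
        System.out.println(halvings(1024));      // prints 10, and log2(1024) = 10
        System.out.println(halvings(1_000_000)); // prints 19, about log2(10^6)
    }
}
```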


Binary Search - Cont.

• Binary search at O(log(n)) is much better than sequential at O(n).

• Major reason to sort datasets!


A Caution on the Use of “Big O”

• Big O notation usually ignores the constant c.
• The constant’s magnitude depends on external factors (hardware, OS, etc.).

• For example, algorithm A has the number of operations = 10⁸n, and algorithm B has the form 10n².

• A is then O(n), and B is O(n²).
• Based on this alone, A would be chosen over B.
• But, if c is considered, B would be faster for n up to 10⁷ - which is a very large dataset!
• So, in this case, B would be preferred over A, in spite of the Big O notation.


A Caution on the Use of “Big O” – Cont.

• For “mission critical” algorithm comparison, nothing beats experiments where actual times are measured.

• When measuring times, keep everything else constant, just change the algorithm (same hardware, OS, type of data, etc.).


Recursion

• An advanced programming technique that can be used with any language that supports procedure or function calls (or “methods!”).

• Advanced sorting algorithms, like Quicksort, use recursion.

• We will also look at how to analyze the complexity of recursive algorithms.


Recursion, Cont.

• Most simply defined as when a method “calls itself” as part of its normal operation. For example:

public int factorial (int n) {

    if (n == 0)
        return 1;
    else
        return n * factorial(n - 1);

} // end factorial method


Recursion, Cont.

• Start by considering recursive definitions:
• A recursive definition has two parts:

– The “anchor”, “ground case” or “stopping case” supplies all the basic elements out of which all other elements of the set are constructed. This part is not recursive.

– The second part consists of the rules that determine how all other elements are constructed. It will define a new element starting from an element that has already been constructed (the “recursive” part).


Recursive Definitions

• For example, the factorial function, n!, is evaluated as:

n! = n(n-1)(n-2)(n-3)…(2)(1)

• So 15! = 15(14)(13)(12)(11)… = 1307674368000.

• Or, in terms of a mathematical definition:

n! = Π_{i=1}^{n} i


Recursive Definitions - Cont.

• Or, in terms of a recursive definition:

n! = 1           if n = 0
n! = n·(n-1)!    if n > 0

• The upper part is the anchor, and the lower part is the definition of the rules that are used to get all other values. It is also called the “inductive step”.

• The lower part is recursive.


Coding a Recursive Definition

• In code:

n! = 1           if n = 0
n! = n·(n-1)!    if n > 0

public int factorial (int n) {

    if (n == 0)
        return 1;
    else
        return n * factorial(n - 1);

} // end factorial method


Coding a Recursive Definition – Cont.

• “But, wait!” you say! Can’t this just be done in a loop?

• Yes, most recursive codes can be carried out using loop(s) instead.

• However, the “looping” version is seldom as compact as the recursive one.

• Recursion is used in a program when it produces code that is shorter and more easily understood (and debugged!) than code using a loop.

• Recursive versions are usually slower than looping versions.


Coding a Recursive Definition – Cont.

• A non-recursive factorial method:

public int factorial (int n) {

    int f = 1;

    if (n > 0)
        for (int i = 2; i <= n; i++)
            f *= i;

    return f;

} // end factorial method


Coding a Recursive Definition – Cont.

• (Note that we are ignoring the very real possibility of numeric overflow in these factorial methods. It would be better to use a long, or even a “BigInteger” variable.)
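A sketch of the overflow-safe variant the note suggests (Java's arbitrary-precision integer class is java.math.BigInteger; an int factorial overflows at 13! and a long at 21!):

```java
import java.math.BigInteger;

public class BigFactorial {

    static BigInteger factorial(int n) {
        BigInteger f = BigInteger.ONE;
        for (int i = 2; i <= n; i++)
            f = f.multiply(BigInteger.valueOf(i)); // never overflows
        return f;
    }

    public static void main(String[] args) {
        System.out.println(factorial(15)); // prints 1307674368000, matching 15! above
    }
}
```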


Coding a Recursive Definition – Cont.

• Back to the recursive factorial method:

public int factorial (int n) {
   if (n == 0) return 1;
   else return n * factorial(n - 1);
} // end factorial method

• This is a slightly more “elegant” piece of code, and it is easy to write and debug.

• What is not so easy to understand is how it works!


Summer 2007 CISC121 - Prof. McLeod 74

Back to Stacks - Activation Frames

• Stacks are an integral part of computer architecture.

• For example, Java byte code is run by the “Java Virtual Machine”, or “JVM”.

• The JVM is stack-based.

• Each thread (or process…) in the JVM has its own private run-time stack in memory (our programs are “single threaded”).

• Each run-time stack contains “activation frames” – one for each method that has been activated.

• Only one activation frame (or method) can be active at a time for each thread.

• An activation frame is created, or “pushed”, every time a method is invoked. When the method is completed, the frame is popped off the stack.
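Java even lets a program peek at its own run-time stack. This sketch (class and method names our own invention) counts the extra frames that recursive calls add, using Thread.currentThread().getStackTrace():

```java
public class FrameDemo {
    // Each recursive call pushes one activation frame;
    // getStackTrace() lets us measure the stack depth at the bottom.
    public static int depthAt(int n) {
        if (n == 0) {
            return Thread.currentThread().getStackTrace().length;
        }
        return depthAt(n - 1);
    }

    public static void main(String[] args) {
        int base = depthAt(0);
        int deeper = depthAt(5);
        // Five extra recursive calls mean five extra frames on the stack
        // (typically 5 on a standard JVM).
        System.out.println("extra frames: " + (deeper - base));
    }
}
```

Note that the JVM specification allows some implementations to omit frames from a stack trace, so treat this as an illustration rather than a guarantee.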


Summer 2007 CISC121 - Prof. McLeod 75

Recursion and the Run-Time Stack

• Consider another simple recursive definition (ignore n<0):

public static double power (double x, int n) {
   if (n == 0) return 1.0;
   else return x * power(x, n - 1);
} // end power

x^n = 1,            if n = 0
x^n = x · x^(n-1),  if n > 0
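A quick sanity check of the method above (the wrapper class name PowerDemo is our own):

```java
public class PowerDemo {
    // The recursive power method from the slide, unchanged.
    public static double power(double x, int n) {
        if (n == 0) return 1.0;
        return x * power(x, n - 1);
    }

    public static void main(String[] args) {
        System.out.println(power(2.0, 10)); // prints 1024.0
        System.out.println(power(5.6, 0));  // base case: prints 1.0
    }
}
```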


Summer 2007 CISC121 - Prof. McLeod 76

Recursion and the Run-Time Stack – Cont.

• In a main method:

public static void main(String[] args) {
   double z = power(5.6, 2);
} // end main

• Look at what happens in debug mode in Eclipse…


Summer 2007 CISC121 - Prof. McLeod 77

• An activation frame for this first call to power is pushed onto the run-time stack, and execution is transferred to this frame (execution does not continue in main).

• The power method runs until it hits the second call to power on line:

power(5.6, 1) // second call

Recursion and the Run-Time Stack – Cont.

[Run-time stack: main → power(5.6, 2)]


Summer 2007 CISC121 - Prof. McLeod 78

• Execution in the first call frame stops and a frame for the second call is pushed onto the stack.

• Execution in the second call frame continues until it hits the third call to power:

power(5.6, 0) // third call

Recursion and the Run-Time Stack – Cont.

[Run-time stack: main → power(5.6, 2) → power(5.6, 1)]


Summer 2007 CISC121 - Prof. McLeod 79

• Execution in the second call frame stops and a frame for the third call is pushed onto the stack.

• Now, power executes until it reaches the line:

return 1.0;

• This causes the value 1.0 to be placed at the “Return value” part of the third call’s activation frame.

Recursion and the Run-Time Stack – Cont.

[Run-time stack: main → power(5.6, 2) → power(5.6, 1) → power(5.6, 0)]


Summer 2007 CISC121 - Prof. McLeod 80

• This third call to power completes, and the third call’s frame is popped off the stack.

• The second call’s frame accepts the value of 1.0, and execution resumes when 5.6 * 1.0 is calculated.

Recursion and the Run-Time Stack – Cont.

[Run-time stack: main → power(5.6, 2) → power(5.6, 1) → power(5.6, 0) returns 1.0]


Summer 2007 CISC121 - Prof. McLeod 81

• This allows the second call’s frame to complete execution. The resulting value (5.6) is placed in the second call’s Return value part of the frame.

Recursion and the Run-Time Stack – Cont.

[Run-time stack: main → power(5.6, 2) → power(5.6, 1) returns 5.6]


Summer 2007 CISC121 - Prof. McLeod 82

Recursion and the Run-Time Stack – Cont.

• The second call’s frame is popped off the stack, the value “5.6” is given to the first call’s frame, and execution resumes when 5.6 * 5.6 is calculated.

• The first call’s execution completes and the first call’s frame is popped off the stack providing the value “31.36” to the next frame on the stack.

[Run-time stack: main → power(5.6, 2) returns 31.36]


Summer 2007 CISC121 - Prof. McLeod 83

Recursion and the Run-Time Stack – Cont.

• Execution resumes in the main frame, where the value “31.36” is put into the variable z.

• See the factorial method, as well (when does the factorial calculation actually take place?).
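One way to answer that question is to instrument the factorial method so that it prints each multiplication (class name FactorialTrace is our own). The output shows that the multiplications happen only while frames are being popped off the run-time stack, not on the way down:

```java
public class FactorialTrace {
    // The multiplication happens only AFTER the recursive call returns,
    // i.e. while the activation frames are unwinding.
    public static int factorial(int n) {
        if (n == 0) return 1;
        int result = n * factorial(n - 1);
        System.out.println("multiplying " + n + " gives " + result);
        return result;
    }

    public static void main(String[] args) {
        factorial(3);
        // Output order shows the unwinding:
        // multiplying 1 gives 1
        // multiplying 2 gives 2
        // multiplying 3 gives 6
    }
}
```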

[Run-time stack: main, z = 31.36]


Summer 2007 CISC121 - Prof. McLeod 84

Recursion and the Run-Time Stack – Summary

• Execution pauses at any method call, and control is passed to that method.

• Information on the calling method is preserved in the run-time stack, and a new frame is created for the newly called method.

• As recursive calls continue, the frames continue to pile up, each one waiting to complete its execution.

• Finally, the last recursive call is made and the “anchor condition” or “stopping case” is encountered. The method completes without making another recursive call and may provide a Return value, for example.
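If the stopping case is never reached, the frames keep piling up until the JVM runs out of stack space. A minimal sketch (class and method names our own invention):

```java
public class StackLimit {
    // No reachable stopping case: every call pushes another
    // activation frame until the run-time stack is exhausted.
    public static int runaway(int n) {
        return runaway(n + 1);
    }

    public static void main(String[] args) {
        try {
            runaway(0);
        } catch (StackOverflowError e) {
            System.out.println("run-time stack exhausted");
        }
    }
}
```

This is why every recursive method needs an anchor condition that is guaranteed to be reached.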


Summer 2007 CISC121 - Prof. McLeod 85

Recursion and the Run-Time Stack – Summary, Cont.

• The frame for the last call, the “stopping case” is popped off the stack, with the Return value being passed down into the next frame.

• The frame below accepts the Return value, completes, gets popped off, and so on, until the only frame remaining is the frame that contains the original call to the recursive method.

• This frame accepts the value from the method, and continues its execution.


Summer 2007 CISC121 - Prof. McLeod 86

Another Example

• Remember that the activation frames can also hold code after the recursive call.

• See TestRecursion.java
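TestRecursion.java is not reproduced here, but the idea can be sketched with a method of our own: work placed after the recursive call runs only during the unwinding, so the characters below come out in reverse order.

```java
public class AfterTheCall {
    // The concatenation of s.charAt(0) happens AFTER the recursive
    // call returns, so the first character ends up last.
    public static String reversed(String s) {
        if (s.isEmpty()) return "";
        return reversed(s.substring(1)) + s.charAt(0);
    }

    public static void main(String[] args) {
        System.out.println(reversed("stack")); // prints "kcats"
    }
}
```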