Analysis of Algorithms
What is an algorithm?
The ideas behind computer programs. An algorithm stays the same no matter which kind of hardware it is running on or which programming language it is written in.
An algorithm solves a general, well-specified problem. It is specified by:
  describing the set of instances (input) it must work on
  describing the desired properties of the output
Important Properties of Algorithms
Correct: always returns the desired output for all legal instances of the problem.
Efficient: can be measured in terms of time and space. Time tends to be more important.
Expressing Algorithms
English description
Pseudocode
High-level programming language
(Going down this list, the expression becomes more precise; going up, it is more easily expressed.)
Pseudocode: a shorthand for specifying algorithms. It leaves out the implementation details but leaves in the essence of the algorithm.
Algorithm ArrayMax(A, n)
Input: an array A storing n >= 1 integers
Output: the maximum element in A

currentMax <- A[0]
for i <- 1 to n-1 do
    if currentMax < A[i] then currentMax <- A[i]
return currentMax
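As a quick sanity check, the pseudocode above translates directly into Python (the function name array_max is mine; in real code you would simply use the built-in max):

```python
def array_max(A):
    """Direct translation of the ArrayMax pseudocode: scan the list,
    keeping the largest element seen so far."""
    current_max = A[0]              # currentMax <- A[0]
    for i in range(1, len(A)):      # for i <- 1 to n-1
        if current_max < A[i]:
            current_max = A[i]
    return current_max
```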
Algorithms
Algorithms are simply lists of steps required to solve some particular problem.
They are designed as abstractions of processes carried out by computer programs
Examples include sorting, determining whether a student qualifies for financial aid, and Depth-First Search.
In some cases we have only one algorithm for a problem, or the problem is so straightforward that there is no need to consider anything other than the obvious approach.
Some other problems have many known algorithms; we obviously want to choose the "best" one.
Other problems have no known algorithm!
Analysis of Algorithms
Why analyze algorithms?
  evaluate algorithm performance
  compare different algorithms
Analyze what about them?
  running time, memory usage, solution quality
  worst-case and "typical" case
Computational complexity
  understanding the intrinsic difficulty of computational problems; classifying problems according to difficulty
  algorithms provide upper bounds; to show a problem is hard, one must show that any algorithm to solve it requires at least a given amount of resources
  transform problems to establish "equivalent" difficulty
What is the "best" Algorithm?
Traditionally we focused on two questions:
How fast does it run? In the early days this was measured by timing an implementation of the algorithm. It was common to hear (and read papers!) about a "new" SuperDuper Sort that could sort a list of 1 million integers in 17 seconds, whereas Crap Sort requires 43 seconds.
How much memory does it require?
There are other considerations, such as how "complicated" the algorithm is and how well it is documented. For now, we will leave these software-engineering considerations aside.
One question remains: based on the statement above about SuperDuper Sort, can we say that it is better than Crap Sort? Can the machine used influence the results?
Analysis of Algorithms
Programs depend on the operating system, machine, compiler/interpreter used, etc.
Analysis of algorithms compares algorithms, not programs.
It is based on the premise that the longer the algorithm takes, the longer its implementation will run: sorting 1 million items ought to take longer than sorting 1000.
But if we are comparing algorithms (not yet implemented), how can we express their performance? How can we "measure" the performance of an algorithm?
Analysis of Algorithms
What we want, though, is an expression that can be applied to any computer. This is only possible by stating the efficiency in terms of some critical operations. These operations depend on the problem: we could, for instance, say that for sorting algorithms it is the number of times two elements are compared.
In general we do analysis of algorithms using the RAM model (Random Access Machine):
  instructions are executed one after the other; there is no concurrency
  basic operations take the same time (constant time)
  we normally say that each line (step) in the algorithm takes time 1 (one)
The RAM Model
Random Access Machine (not R.A. Memory): an idealized notion of how the computer works.
Each "simple" operation (+, -, =, if) takes exactly 1 step.
Each memory access takes exactly 1 step.
Loops and method calls are not simple operations; they depend upon the size of the data and the contents of the method.
Measure the run time of an algorithm by counting the number of steps.
Random Access Machine
A Random Access Machine (RAM) consists of:
  a fixed program
  an unbounded memory
  a read-only input tape
  a write-only output tape
Each memory register can hold an arbitrary integer (*).
Each tape cell can hold a single symbol from a finite alphabet.
[Diagram: the program reads symbols from the input tape, operates on the memory registers, and writes symbols to the output tape.]
Instruction set:
  x <- y
  x <- y op z, where op is one of {+, -, *, div, mod}
  goto label
  if y cmp z goto label, where cmp is one of {<, <=, =, !=, >, >=}
  x <- input
  output <- y
  halt
Addressing modes: x may be a direct or indirect reference; y and z may be constants, direct, or indirect references.
Sample Program
The following program reads a decimal number (a string of digit symbols ended by '%') and writes its remainder mod 7 to the output.

    r0 <- input
    r1 <- 0
10: if r0 = '%' goto 20
    r0 <- r0 - '0'
    r1 <- r1 * 10
    r1 <- r1 + r0
    r0 <- input
    goto 10
20: r2 <- r1 mod 7
    r2 <- r2 + '0'
    output <- r2
    halt

Example of indirection: *r2 <- *r3 + 5 means the value in the location pointed to by r3 is added to 5 and the result is written to the location pointed to by r2.
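The same computation in Python may make the register program easier to follow (the '%' sentinel and the digit arithmetic come from the RAM code above; the function name is mine):

```python
def remainder_mod_7(tape):
    """Mimic the RAM sample program: consume digit symbols from the
    input tape until the '%' sentinel, accumulating the decimal value,
    then emit (value mod 7) as a digit symbol."""
    r1 = 0                                    # r1 <- 0
    i = 0
    while tape[i] != '%':                     # label 10: loop until sentinel
        r1 = r1 * 10 + (ord(tape[i]) - ord('0'))
        i += 1                                # r0 <- input (next symbol)
    return chr(r1 % 7 + ord('0'))             # label 20: output remainder
```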
Primitive Operations
Assigning a value to a variable
Calling a method
Performing an arithmetic operation
Comparing two numbers
Indexing into an array
Following an object reference
Returning from a method
Counting Primitive Operations
Algorithm ArrayMax(A, n)
Input: an array A storing n integers
Output: the maximum element in A

currentMax <- A[0]                    -- 2 steps, plus 1 to initialize i
for i <- 1 to n-1 do                  -- 2 steps each time (compare i to n, inc i), n-1 times
    if currentMax < A[i]              -- 2 steps
        then currentMax <- A[i]       -- 2 steps; how often this runs depends on the order the numbers appear in A[]
return currentMax                     -- 1 step

Total: between 4(n-1) and 6(n-1) steps in the loop.
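The bound can be checked by instrumenting the code. The per-line charges below follow the accounting used above (one plausible convention, not the only one): an ascending array triggers the assignment on every iteration (6 steps per pass), while a descending array never does (4 steps per pass).

```python
def array_max_steps(A):
    """Count RAM-style steps for ArrayMax, charging each line as on
    the slide: 2 for the initial assignment, 1 to initialize i,
    2 per loop test, 2 per comparison, 2 per assignment, 1 to return."""
    steps = 2 + 1                  # currentMax <- A[0], initialize i
    current_max = A[0]
    for i in range(1, len(A)):
        steps += 2                 # compare i to n, increment i
        steps += 2                 # currentMax < A[i]
        if current_max < A[i]:
            current_max = A[i]
            steps += 2             # currentMax <- A[i]
    return steps + 1               # return currentMax
```

For n = 4, a descending input costs 4 + 4(n-1) = 16 steps and an ascending input costs 4 + 6(n-1) = 22, matching the stated bounds.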
Analysis of Algorithms
But you could be asking: if each line takes constant time, won't the whole algorithm (any algorithm) take constant time?
Wrong! Although some algorithms may take constant time, the majority vary their number of steps based on the size of the instance we're trying to solve.
Therefore the efficiency of an algorithm is normally stated as a function of the problem size. We generally use the variable n to represent the problem size.
From an implementation we could find out that our SuperDuper Sort takes 0.6n^2 + 0.3n + 0.45 seconds on a Pentium 3. Plug in a value for n and you have how long it takes.
Number of steps
for i = 1 .. N do
    a = a + 2
    i = i + 1
end do

for i = 1 .. N do
    a = a + 2
    i = i + 2
end do

It is easy to see that most algorithms vary their number of steps.
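One way to read the loops above is as while loops in which the body's own `i = i + step` is the only increment; under that (assumed) reading, the first loop's body runs N times and the second's only N/2 times:

```python
def loop_iterations(N, step):
    """Count body executions of: i = 1; while i <= N: a += 2; i += step.
    One reading of the slide's pseudocode, where the increment written
    in the body replaces the for-loop's own increment."""
    i, a, iterations = 1, 0, 0
    while i <= N:
        a = a + 2
        i = i + step
        iterations += 1
    return iterations
```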
Algorithm Complexity
Worst-case complexity: the function defined by the maximum number of steps taken on any instance of size n.
Best-case complexity: the function defined by the minimum number of steps taken on any instance of size n.
Average-case complexity: the function defined by the average number of steps taken on any instance of size n.
Best, Worst, and Average Case Complexity
[Graph: number of steps versus N (input size), with the worst-case complexity curve on top, average-case complexity in the middle, and best-case complexity on the bottom.]
Doing the Analysis
It's hard to estimate the running time exactly:
  the best case depends on the input
  the average case is difficult to compute
So we usually focus on worst-case analysis:
  easier to compute
  usually close to the actual running time
Strategy: try to find upper and lower bounds of the worst-case function.
[Graph: the actual function sandwiched between an upper bound and a lower bound.]
Analysis of Algorithms
But what we've just said is not yet independent of the machine: remember, the formula for SuperDuper Sort is valid for a Pentium 3.
We need to identify the most important aspect of the function that represents the running time of an algorithm.
Which one is the "best": f(n) = 10000000n or g(n) = n^2 + n?
Asymptotic Analysis
Asymptotic analysis describes the relative efficiency of an algorithm as n gets very large.
In the example it is easy to see that, for very large n, g(n) grows faster than f(n). Take for instance the value n = 20000000.
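A quick numeric check of this claim, using the two functions from the slide (the names f, g, and the helper are mine):

```python
def f(n):
    return 10_000_000 * n    # huge constant factor, but linear

def g(n):
    return n ** 2 + n        # tiny constants, but quadratic

def g_beats_f(n):
    """True once the quadratic has overtaken the linear function,
    which happens just below n = 10**7."""
    return g(n) > f(n)
```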
Remember that the goal here is to compare algorithms. In practice, if you're writing small programs, asymptotic analysis may not be that important:
  when you're dealing with small input sizes, most algorithms will do
  when the input size is very large, things change
Asymptotic Performance
In this course, we care most about asymptotic performance: how does the algorithm behave as the problem size gets very large?
  Running time
  Memory/storage requirements
  Bandwidth/power requirements/logic gates/etc.
Asymptotic Notation
By now you should have an intuitive feel for asymptotic (big-O) notation: what does O(n) running time mean? O(n^2)? O(n lg n)? How does asymptotic running time relate to asymptotic memory usage?
Our first task is to define this notation more formally and completely.
Constants versus n
We usually focus on big-oh. It is important to understand the difference between constants and n.
A constant has a fixed value; it doesn't change, and it doesn't really matter what that value is.
n reflects the size of the problem, so n can get really, really big. This is why we talk about the time as a function of n.
In the ArrayMax example, we don't really need to pay attention to 4(n-1) versus 6(n-1): they are both order n.
Problems that have large n
Put a list of all contributors to the 2000 presidential campaigns into alphabetical order.
Run a photoshop-style filter across all the pixels of a high-resolution image.
Others?
Problems that have large n
Checking mailboxes: the mail delivery person needs to visit all of them. What is the best order to go in? This is easy if n is small (say, 10); otherwise it is hard!
With n mailboxes in the city: what is the minimum number of paths required to be taken? The maximum number of paths between pairs of mailboxes?
Constraining Large Problems
Number of ways there are to fly roundtrip from the Bay Area to Washington DC.
Here n is the number of available flights on a given day throughout the country.
But we have to add lots of constraints too:
  Choose SFO, OAK, or SJ
  Choose BWI, Dulles, or National
  Choose airline
  Direct, one stop, two stops?
  Connect through Dallas or Denver or Chicago or LAX or ...
  How long must be allowed for layovers?
  Which combos are cheapest? What about open-jaw?
If you try all possible combinations, it will take a very long time to run!!
A running time is O(g(n)) if there exist constants n0 > 0 and c > 0 such that for all problem sizes n > n0, the running time for a problem of size n is at most c*g(n). In other words, c*g(n) is an upper bound on the running time for sufficiently large n.
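The definition can be sanity-checked numerically for concrete witnesses c and n0 (a finite check over a range of n, not a proof; the helper name is mine):

```python
def witnesses_big_o(f, g, c, n0, n_max=10_000):
    """Check f(n) <= c*g(n) for every n0 < n <= n_max, i.e. that
    (c, n0) look like valid witnesses for f(n) = O(g(n))."""
    return all(f(n) <= c * g(n) for n in range(n0 + 1, n_max + 1))
```

For example, n^2 + n is O(n^2) with witnesses c = 2, n0 = 1, while n^2 is not O(n): any fixed c fails once n grows past c.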
The Crossover Point
One function starts out faster for small values of n, but for n > n0, the other function is always faster.
Higher-order Term
Looking at a running-time table, we can see that in an efficiency function we are interested in the term of highest order. If we have a function f(n) = n^3 + n^2, then for n = 100000 the running time of the algorithm is 31.7 years + 2.8 hours. It's clear that a couple of hours does not make much difference if the program is to run for 31.7 years!
In the case above we say that f(n) is O(n^3), meaning that f(n) is of the order n^3.
The so-called big-O notation disregards any constant multiplying the term of highest order and any term of smaller order: f(n) = 10000000000000n^3 is O(n^3).
It's crucial that you understand how to identify the most significant term in a formula.
A More Formal Definition of big-O
For a given function g(n), we say that the set of functions satisfying the assertion below are O(g(n)).
[Graph: f(n) versus n, with f(n) staying below c*g(n) for all n >= n0.]
f(n) = O(g(n))
More formally
Let f(n) and g(n) be functions mapping nonnegative integers to real numbers.
f(n) is O(g(n)) if there exist positive constants n0 and c such that for all n >= n0, f(n) <= c*g(n).
Other ways to say this: f(n) is order g(n); f(n) is big-Oh of g(n); f(n) is Oh of g(n).
Function Pecking Order
In increasing order (for large n): log(n), n, n^2, n^5, 2^n

log(n)   n      n^2       n^5          2^n
1        2      4         32           4
2        4      16        1024         16
3        8      64        32768        256
4        16     256       1048576      65536
5        32     1024      33554432     4.29E+09
6        64     4096      1.07E+09     1.84E+19
7        128    16384     3.44E+10     3.4E+38
8        256    65536     1.1E+12      1.16E+77
9        512    262144    3.52E+13     1.3E+154
10       1024   1048576   1.13E+15     #NUM! (overflow in the spreadsheet)
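The table rows are easy to regenerate, and they show that the ordering only holds eventually: at n = 16, for instance, 2^n is still smaller than n^5 (the function name below is mine):

```python
import math

def growth_row(n):
    """One row of the growth table: (log2 n, n, n^2, n^5, 2^n)."""
    return (math.log2(n), n, n ** 2, n ** 5, 2 ** n)
```

At n = 32 the five columns are already in strictly increasing order; at n = 16 the last two are still swapped.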
Plot them!
[Two plots of the curves log(n), n, n^2, n^5, 2^n over the rows above: one with both x and y on linear scales, and one with the y axis converted to a log scale.]
(that jump for large n happens because the last number is out of range)
Notice how much bigger 2^n is than n^k
This is why exponential growth is BAD BAD BAD!!
A Simple Comparison
Let's assume that you have 3 algorithms to sort a list:
  f(n) = n log2 n
  g(n) = n^2
  h(n) = n^3
Let's also assume that each step takes 1 microsecond (10^-6 seconds).
Most of the algorithms discussed here will be given in terms of common functions: polynomials, logarithms, exponentials, and products of these functions.

n        n log n   n^2        n^3
10       33.2      100        1000
100      664       10000      1 sec
1000     9966      1 sec      16 min
100000   1.7 sec   2.8 hours  31.7 years
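The table entries follow from the one-microsecond-per-step assumption; for example, sorting 100000 items with an n^2 algorithm costs 10^10 steps:

```python
def seconds(steps, step_time=1e-6):
    """Wall-clock estimate under the slide's assumption of one
    microsecond per basic step."""
    return steps * step_time
```

seconds(100_000 ** 2) is about 10^4 seconds, i.e. roughly 2.8 hours, and seconds(100_000 ** 3) is about 10^9 seconds, roughly 31.7 years.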
Common Functions
Constant: very fast. Some hash table algorithms can look up one item from a table of n items in an average time which is constant (independent of the table size).
Logarithmic: also very fast. Typical of many algorithms that use (binary) trees.
Linear time: typical of fast algorithms on a single-processor computer, e.g. when all of the input of size n has to be read.
n log n: typical of the best sorting algorithms. Considered a good solution.
Polynomial: when a problem of size n can be solved in time n^k where k is a constant. Small exponents (k <= 3) are OK.
Common Functions
Exponential: those that use time k^n where k is a constant. Algorithms that grow at this rate are suitable only for small problems. Unfortunately, the best algorithms known for many problems use exponential time. Much of the work on developing algorithms today is focused on these problems, because they take a huge amount of time to execute (even for reasonably small input sizes).
There is a large variation in the size of various exponential functions (compare 2^(0.0001n) and 2^n), but for large n the functions all become huge.
Algorithm Complexity
Analogy: shoe shopping.
Best, average, worst case:
  Best case: drive right up to the door, don't have to walk anywhere. You can buy those Manolo Blahnik stilettos.
  Average case: walking around campus. Tevas.
  Worst case: hiking in the desert for weeks. Need boots that can accommodate hot, swollen feet.
Upper vs. lower bound: assume you wear size 8.25 (in the worst case).
  Perfect fit: custom-made shoes.
  Upper bound: fit as closely as possible, with a well-known function (a standard size) that is just bigger than the actual foot size. Get a size 8 1/2.
  Lower bound: fit as closely as possible, with a well-known function (a standard size) that is just smaller than the actual foot size. Get a size 8.
More Plots

log n   n      n log n
1       2      2
2       4      8
3       8      24
4       16     64
5       32     160
6       64     384
7       128    896
8       256    2048
9       512    4608
10      1024   10240
11      2048   22528
12      4096   49152

[Two plots of log n, n, and n log n over these rows: one with a linear y axis, one with a logarithmic y axis.]
Let’s Count Some Beer A well-known “song”
“100 bottles of beer on the wall, 100 bottles of beer; you take one down, pass it around, 99 bottles of beer on the wall.”
“99 bottles of beer on the wall, 99 bottles of beer; you take one down, pass it around, 98 bottles of beer on the wall.”
… “1 bottle of beer on the wall, 1 bottle of beer, you take it
down, pass it around, no bottles of beer on the wall.” HALT.
Let’s change the song to “N bottles of beer on the wall”. The number of bottles of beer passed around is Order what?
Let’s Count Some Ants
Another song: The ants go marching 1 by 1 The ants go marching 2 by 2 The ants go marching 3 by 3
How many ants are in the lead over the first n waves? 1 + 2 + 3 + ... + n
Does this remind you of anything?
1 + 2 + ... + n = n(n+1)/2, which is O(n^2)
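Both songs can be tallied directly to confirm the orders (the function names are mine): the beer song passes one bottle per verse, order n, while the ants total is the triangular number n(n+1)/2, order n^2.

```python
def beer_bottles_passed(n):
    """One bottle passed per verse, n verses: order n."""
    return n

def ants_in_lead(n):
    """1 + 2 + ... + n, summed the long way."""
    return sum(range(1, n + 1))

def ants_closed_form(n):
    """The same total via n(n+1)/2, which is O(n^2)."""
    return n * (n + 1) // 2
```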
Graph it! Let's plot beer(n) versus ants(n).
[Plot: total # items versus n for n = 1 to 12, comparing series labeled Gifts, Beer Bottles, and Ants; the quadratic ants total pulls away from the linear beer count.]
ArrayMax, revisited
Algorithm ArrayMax(A, n)
Input: an array A storing n integers
Output: the maximum element in A

currentMax <- A[0]                    -- 2 steps, plus 1 to initialize i
for i <- 1 to n-1 do                  -- 2 steps each time (compare i to n, inc i), n-1 times
    if currentMax < A[i]              -- 2 steps
        then currentMax <- A[i]       -- 2 steps; how often this runs depends on the order the numbers appear in A[]
return currentMax                     -- 1 step

Total: between 4(n-1) and 6(n-1) steps in the loop.
ArrayMax, revisited
Algorithm ArrayMax(A, n)
Input: an array A storing n integers
Output: the maximum element in A

currentMax <- A[0]
for i <- 1 to n-1 do                  -- n-1 times
    if currentMax < A[i] then currentMax <- A[i]
return currentMax

All that really matters is that we go through the loop order n times.
Worst Case and Best Case
If we return to our original question of "how fast does a program run?", we can see that this question is not enough.
Inputs vary in the way they are organized, and this can influence the number of critical operations performed. Suppose that we are searching for an element in an ordered list:
  if the target key is the first in the list, our function takes constant time
  if the target key is not in the list, our function takes O(n), where n is the size of the list
The examples above are referred to as best-case analysis and worst-case analysis.
Which is the really relevant case? Worst case is more important, because it gives us a bound on how long the function might have to run.
Average Case
In some situations neither the best-case nor the worst-case analysis expresses the performance of an algorithm well.
Average-case analysis can be used if necessary.
Still, average-case analysis is uncommon because:
  it may be cumbersome to do an average-case analysis of non-trivial algorithms
  in most cases the "order" of the average analysis is the same as that of the worst case
Comparing Two Functions
There are several ways we can find out whether one function is order of another.
The standard way is to use the definition of the big-O notation. Is 10n + 23 = O(n)?
We need 10n + 23 <= c*n for all n >= n0.
Take c = 11 and n0 = 23: then 10n + 23 <= 11n is equivalent to 23 <= n, which holds for all n >= 23.
So 10n + 23 is O(n).
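The witnesses c = 11 and n0 = 23 can be spot-checked over a range of n (a finite check, not a proof; the helper name is mine):

```python
def within_bound(n, c=11, n0=23):
    """Does 10n + 23 <= c*n hold at this n, for the witnesses
    c = 11, n0 = 23?  (n0 marks where the bound starts holding.)"""
    return 10 * n + 23 <= c * n
```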
Comparing Running Times
Analysis Example: Phonebook
Given: a physical phone book, organized in alphabetical order, and a name you want to look up. Consider an algorithm in which you search through the book sequentially, from first page to last.
What is the order of:
  the best-case running time?
  the worst-case running time?
  the average-case running time?
What is a better algorithm? And what is the worst-case running time for that algorithm?
Analysis Example (Phonebook)
This better algorithm is called Binary Search. What is its running time?
First you look in the middle of n elements. Then you look in the middle of n/2 = (1/2)*n elements. Then you look in the middle of (1/2)*(1/2)*n elements... Continue until there is only 1 element left.
Say you did this m times: (1/2)*(1/2)*...*(1/2)*n. Then the number of repetitions is the smallest integer m such that
    (1/2^m)*n <= 1
Analyzing Binary Search
In the worst case, the number of repetitions is the smallest integer m such that
    (1/2^m)*n <= 1
We can rewrite this as follows:
    (1/2^m)*n <= 1
    n <= 2^m          (multiply both sides by 2^m)
    log2 n <= m       (take the log of both sides)
Since m is the worst-case number of repetitions, the algorithm is O(log n).
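A sketch of binary search instrumented to count probes, to check the bound empirically (returning the probe count alongside the index is my addition):

```python
def binary_search(A, target):
    """Binary search on a sorted list. Returns (index or None, probes);
    the probe count should stay within about log2(n) + 1."""
    lo, hi, probes = 0, len(A) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if A[mid] == target:
            return mid, probes
        elif A[mid] < target:
            lo = mid + 1        # discard the left half
        else:
            hi = mid - 1        # discard the right half
    return None, probes
```

On a sorted list of 1024 elements, even a failed search uses at most 11 probes, in line with log2(1024) = 10.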
Analysis Example
"Prefix averages": you want this mapping from an array of numbers to an array of averages of the preceding numbers (who knows why; not my example):
5 10 15 20 25 30
5/1 15/2 30/3 50/4 75/5 105/6
There are two straightforward algorithms: one is easy but wasteful; the other is more efficient, but requires insight into the problem.
Analysis Example
In the wasteful algorithm, for each position i in A you look at the values of all the elements that came before. What is the number of positions in the largest part? When i = n, you look at n positions; when i = n-1, you look at n-1 positions; when i = n-2, you look at n-2 positions; ...; when i = 2, you look at 2 positions; when i = 1, you look at 1 position.
Analysis Example
A useful tool: store partial information in a variable! This uses space to save time. The key is to keep the running sum in a variable rather than recomputing it (and not to divide the sum variable itself). This eliminates one for loop, which is always a good thing to do.
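Both algorithms, as I read the slides, can be sketched as follows: the wasteful version recomputes each prefix sum from scratch (order n^2), while the efficient one carries the running sum in a variable (order n).

```python
def prefix_averages_slow(A):
    """O(n^2): recompute the prefix sum from scratch at every i."""
    return [sum(A[:i + 1]) / (i + 1) for i in range(len(A))]

def prefix_averages_fast(A):
    """O(n): keep the running sum s in a variable; never divide s
    itself, only the value stored into the output."""
    out, s = [], 0
    for i, x in enumerate(A):
        s += x
        out.append(s / (i + 1))
    return out
```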
Summary: Analysis of Algorithms
A method for determining, in an abstract way, the asymptotic running time of an algorithm; here asymptotic means "as n gets very large."
Useful for comparing algorithms.
Useful also for determining tractability: that is, a way to determine whether the problem is intractable (not solvable in a reasonable amount of time) or not. Exponential-time algorithms are usually intractable.
We'll revisit these ideas throughout the rest of the course.