Analysis of Algorithms
What is an algorithm?
The ideas behind computer programs. An algorithm stays the same no matter which kind of hardware it is running on or which programming language it is written in.
An algorithm solves a general, well-specified problem. It is specified by:
  describing the set of instances (input) it must work on
  describing the desired properties of the output
Important Properties of Algorithms
Correct: always returns the desired output for all legal instances of the problem.
Efficient: can be measured in terms of time and space. Time tends to be more important.
Expressing Algorithms
English description
Pseudocode
High-level programming language
(Going down this list, the expression becomes more precise; going up, it is more easily expressed.)
Pseudocode: a shorthand for specifying algorithms. It leaves out the implementation details but leaves in the essence of the algorithm.
Algorithm ArrayMax(A, n)
Input: an array A storing n >= 1 integers
Output: the maximum element in A

currentMax <- A[0]
for i <- 1 to n-1 do
    if currentMax < A[i] then currentMax <- A[i]
return currentMax
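As a quick sanity check, the pseudocode above translates directly into Python (the function name array_max is mine; in real code you would simply use the built-in max):

```python
def array_max(A):
    """Direct translation of the ArrayMax pseudocode: scan the list,
    keeping the largest element seen so far."""
    current_max = A[0]              # currentMax <- A[0]
    for i in range(1, len(A)):      # for i <- 1 to n-1
        if current_max < A[i]:
            current_max = A[i]
    return current_max
```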
Algorithms
Algorithms are simply lists of steps required to solve some particular problem.
They are designed as abstractions of processes carried out by computer programs
Examples include sorting, determining whether a student qualifies for financial aid, and Depth-First Search.
In some cases we have only one algorithm for a problem, or the problem is so straightforward that there is no need to consider anything other than the obvious approach.
Some other problems have many known algorithms; we obviously want to choose the "best" one.
Other problems have no known algorithm!
Analysis of Algorithms
Why analyze algorithms?
  evaluate algorithm performance
  compare different algorithms
Analyze what about them?
  running time, memory usage, solution quality
  worst-case and "typical" case
Computational complexity
  understanding the intrinsic difficulty of computational problems; classifying problems according to difficulty
  algorithms provide upper bounds; to show a problem is hard, one must show that any algorithm to solve it requires at least a given amount of resources
  transform problems to establish "equivalent" difficulty
What is the "best" Algorithm?
Traditionally we focused on two questions:
How fast does it run? In the early days this was measured by timing an implementation of the algorithm. It was common to hear (and read papers!) about a "new" SuperDuper Sort that could sort a list of 1 million integers in 17 seconds, whereas Crap Sort requires 43 seconds.
How much memory does it require?
There are other considerations, such as how "complicated" the algorithm is and how well it is documented. For now, we will leave these software-engineering considerations aside.
One question remains: based on the statement above about SuperDuper Sort, can we say that it is better than Crap Sort? Can the machine used influence the results?
Analysis of Algorithms
Programs depend on the operating system, machine, compiler/interpreter used, etc.
Analysis of algorithms compares algorithms, not programs.
It is based on the premise that the longer the algorithm takes, the longer its implementation will run: sorting 1 million items ought to take longer than sorting 1000.
But if we are comparing algorithms (not yet implemented), how can we express their performance? How can we "measure" the performance of an algorithm?
Analysis of Algorithms
What we want, though, is an expression that can be applied to any computer. This is only possible by stating the efficiency in terms of some critical operations. These operations depend on the problem: we could, for instance, say that for sorting algorithms it is the number of times two elements are compared.
In general we do analysis of algorithms using the RAM model (Random Access Machine):
  instructions are executed one after the other; there is no concurrency
  basic operations take the same time (constant time)
  we normally say that each line (step) in the algorithm takes time 1 (one)
The RAM Model
Random Access Machine (not R.A. Memory): an idealized notion of how the computer works.
Each "simple" operation (+, -, =, if) takes exactly 1 step.
Each memory access takes exactly 1 step.
Loops and method calls are not simple operations; they depend upon the size of the data and the contents of the method.
Measure the run time of an algorithm by counting the number of steps.
Random Access Machine
A Random Access Machine (RAM) consists of:
  a fixed program
  an unbounded memory
  a read-only input tape
  a write-only output tape
Each memory register can hold an arbitrary integer (*).
Each tape cell can hold a single symbol from a finite alphabet.
[Diagram: the program reads symbols from the input tape, operates on the memory registers, and writes symbols to the output tape.]
Instruction set:
  x <- y
  x <- y op z, where op is one of {+, -, *, div, mod}
  goto label
  if y cmp z goto label, where cmp is one of {<, <=, =, !=, >, >=}
  x <- input
  output <- y
  halt
Addressing modes: x may be a direct or indirect reference; y and z may be constants, direct, or indirect references.
Sample Program
The following program reads a decimal number (a string of digit symbols ended by '%') and writes its remainder mod 7 to the output.

    r0 <- input
    r1 <- 0
10: if r0 = '%' goto 20
    r0 <- r0 - '0'
    r1 <- r1 * 10
    r1 <- r1 + r0
    r0 <- input
    goto 10
20: r2 <- r1 mod 7
    r2 <- r2 + '0'
    output <- r2
    halt

Example of indirection: *r2 <- *r3 + 5 means the value in the location pointed to by r3 is added to 5 and the result is written to the location pointed to by r2.
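The same computation in Python may make the register program easier to follow (the '%' sentinel and the digit arithmetic come from the RAM code above; the function name is mine):

```python
def remainder_mod_7(tape):
    """Mimic the RAM sample program: consume digit symbols from the
    input tape until the '%' sentinel, accumulating the decimal value,
    then emit (value mod 7) as a digit symbol."""
    r1 = 0                                    # r1 <- 0
    i = 0
    while tape[i] != '%':                     # label 10: loop until sentinel
        r1 = r1 * 10 + (ord(tape[i]) - ord('0'))
        i += 1                                # r0 <- input (next symbol)
    return chr(r1 % 7 + ord('0'))             # label 20: output remainder
```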
Primitive Operations
Assigning a value to a variable
Calling a method
Performing an arithmetic operation
Comparing two numbers
Indexing into an array
Following an object reference
Returning from a method
Counting Primitive Operations
Algorithm ArrayMax(A, n)
Input: an array A storing n integers
Output: the maximum element in A

currentMax <- A[0]                    -- 2 steps, plus 1 to initialize i
for i <- 1 to n-1 do                  -- 2 steps each time (compare i to n, inc i), n-1 times
    if currentMax < A[i]              -- 2 steps
        then currentMax <- A[i]       -- 2 steps; how often this runs depends on the order the numbers appear in A[]
return currentMax                     -- 1 step

Total: between 4(n-1) and 6(n-1) steps in the loop.
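The bound can be checked by instrumenting the code. The per-line charges below follow the accounting used above (one plausible convention, not the only one): an ascending array triggers the assignment on every iteration (6 steps per pass), while a descending array never does (4 steps per pass).

```python
def array_max_steps(A):
    """Count RAM-style steps for ArrayMax, charging each line as on
    the slide: 2 for the initial assignment, 1 to initialize i,
    2 per loop test, 2 per comparison, 2 per assignment, 1 to return."""
    steps = 2 + 1                  # currentMax <- A[0], initialize i
    current_max = A[0]
    for i in range(1, len(A)):
        steps += 2                 # compare i to n, increment i
        steps += 2                 # currentMax < A[i]
        if current_max < A[i]:
            current_max = A[i]
            steps += 2             # currentMax <- A[i]
    return steps + 1               # return currentMax
```

For n = 4, a descending input costs 4 + 4(n-1) = 16 steps and an ascending input costs 4 + 6(n-1) = 22, matching the stated bounds.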
Analysis of Algorithms
But you could be asking: if each line takes constant time, won't the whole algorithm (any algorithm) take constant time?
Wrong! Although some algorithms may take constant time, the majority vary their number of steps based on the size of the instance we're trying to solve.
Therefore the efficiency of an algorithm is normally stated as a function of the problem size. We generally use the variable n to represent the problem size.
From an implementation we could find out that our SuperDuper Sort takes 0.6n^2 + 0.3n + 0.45 seconds on a Pentium 3. Plug in a value for n and you have how long it takes.
Number of steps
for i = 1 .. N do
    a = a + 2
    i = i + 1
end do

for i = 1 .. N do
    a = a + 2
    i = i + 2
end do

It is easy to see that most algorithms vary their number of steps.
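One way to read the loops above is as while loops in which the body's own `i = i + step` is the only increment; under that (assumed) reading, the first loop's body runs N times and the second's only N/2 times:

```python
def loop_iterations(N, step):
    """Count body executions of: i = 1; while i <= N: a += 2; i += step.
    One reading of the slide's pseudocode, where the increment written
    in the body replaces the for-loop's own increment."""
    i, a, iterations = 1, 0, 0
    while i <= N:
        a = a + 2
        i = i + step
        iterations += 1
    return iterations
```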
Algorithm Complexity
Worst-case complexity: the function defined by the maximum number of steps taken on any instance of size n.
Best-case complexity: the function defined by the minimum number of steps taken on any instance of size n.
Average-case complexity: the function defined by the average number of steps taken on any instance of size n.
Best, Worst, and Average Case Complexity
[Graph: number of steps versus N (input size), with the worst-case complexity curve on top, average-case complexity in the middle, and best-case complexity on the bottom.]
Doing the Analysis
It's hard to estimate the running time exactly:
  the best case depends on the input
  the average case is difficult to compute
So we usually focus on worst-case analysis:
  easier to compute
  usually close to the actual running time
Strategy: try to find upper and lower bounds of the worst-case function.
[Graph: the actual function sandwiched between an upper bound and a lower bound.]
Analysis of Algorithms
But what we've just said is not yet independent of the machine: remember, the formula for SuperDuper Sort is valid for a Pentium 3.
We need to identify the most important aspect of the function that represents the running time of an algorithm.
Which one is the "best": f(n) = 10000000n or g(n) = n^2 + n?
Asymptotic Analysis
Asymptotic analysis describes the relative efficiency of an algorithm as n gets very large.
In the example it is easy to see that, for very large n, g(n) grows faster than f(n). Take for instance the value n = 20000000.
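A quick numeric check of this claim, using the two functions from the slide (the names f, g, and the helper are mine):

```python
def f(n):
    return 10_000_000 * n    # huge constant factor, but linear

def g(n):
    return n ** 2 + n        # tiny constants, but quadratic

def g_beats_f(n):
    """True once the quadratic has overtaken the linear function,
    which happens just below n = 10**7."""
    return g(n) > f(n)
```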
Remember that the goal here is to compare algorithms. In practice, if you're writing small programs, asymptotic analysis may not be that important:
  when you're dealing with small input sizes, most algorithms will do
  when the input size is very large, things change
Asymptotic Performance
In this course, we care most about asymptotic performance: how does the algorithm behave as the problem size gets very large?
  Running time
  Memory/storage requirements
  Bandwidth/power requirements/logic gates/etc.
Asymptotic Notation
By now you should have an intuitive feel for asymptotic (big-O) notation: what does O(n) running time mean? O(n^2)? O(n lg n)? How does asymptotic running time relate to asymptotic memory usage?
Our first task is to define this notation more formally and completely.
Constants versus n
We usually focus on big-oh. It is important to understand the difference between constants and n.
A constant has a fixed value; it doesn't change, and it doesn't really matter what that value is.
n reflects the size of the problem, so n can get really, really big. This is why we talk about the time as a function of n.
In the ArrayMax example, we don't really need to pay attention to 4(n-1) versus 6(n-1): they are both order n.
Problems that have large n
Put a list of all contributors to the 2000 presidential campaigns into alphabetical order.
Run a photoshop-style filter across all the pixels of a high-resolution image.
Others?
Problems that have large n
Checking mailboxes: the mail delivery person needs to visit all of them. What is the best order to go in? This is easy if n is small (say, 10); otherwise it is hard!
With n mailboxes in the city: what is the minimum number of paths required to be taken? The maximum number of paths between pairs of mailboxes?
Constraining Large Problems
Number of ways there are to fly roundtrip from the Bay Area to Washington DC.
Here n is the number of available flights on a given day throughout the country.
But we have to add lots of constraints too:
  Choose SFO, OAK, or SJ
  Choose BWI, Dulles, or National
  Choose airline
  Direct, one stop, two stops?
  Connect through Dallas or Denver or Chicago or LAX or ...
  How long must be allowed for layovers?
  Which combos are cheapest? What about open-jaw?
If you try all possible combinations, it will take a very long time to run!!
A running time is O(g(n)) if there exist constants n0 > 0 and c > 0 such that for all problem sizes n > n0, the running time for a problem of size n is at most c*g(n). In other words, c*g(n) is an upper bound on the running time for sufficiently large n.
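The definition can be sanity-checked numerically for concrete witnesses c and n0 (a finite check over a range of n, not a proof; the helper name is mine):

```python
def witnesses_big_o(f, g, c, n0, n_max=10_000):
    """Check f(n) <= c*g(n) for every n0 < n <= n_max, i.e. that
    (c, n0) look like valid witnesses for f(n) = O(g(n))."""
    return all(f(n) <= c * g(n) for n in range(n0 + 1, n_max + 1))
```

For example, n^2 + n is O(n^2) with witnesses c = 2, n0 = 1, while n^2 is not O(n): any fixed c fails once n grows past c.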
The Crossover Point
One function starts out faster for small values of n, but for n > n0, the other function is always faster.
Higher-order Term
Looking at a running-time table, we can see that in an efficiency function we are interested in the term of highest order. If we have a function f(n) = n^3 + n^2, then for n = 100000 the running time of the algorithm is 31.7 years + 2.8 hours. It's clear that a couple of hours does not make much difference if the program is to run for 31.7 years!
In the case above we say that f(n) is O(n^3), meaning that f(n) is of the order n^3.
The so-called big-O notation disregards any constant multiplying the term of highest order and any term of smaller order: f(n) = 10000000000000n^3 is O(n^3).
It's crucial that you understand how to identify the most significant term in a formula.
A More Formal Definition of big-O
For a given function g(n), we say that the set of functions satisfying the assertion below are O(g(n)).
[Graph: f(n) versus n, with f(n) staying below c*g(n) for all n >= n0.]
f(n) = O(g(n))
More formally
Let f(n) and g(n) be functions mapping nonnegative integers to real numbers.
f(n) is O(g(n)) if there exist positive constants n0 and c such that for all n >= n0, f(n) <= c*g(n).
Other ways to say this: f(n) is order g(n); f(n) is big-Oh of g(n); f(n) is Oh of g(n).
Function Pecking Order
In increasing order (for large n): log(n), n, n^2, n^5, 2^n

log(n)   n      n^2       n^5          2^n
1        2      4         32           4
2        4      16        1024         16
3        8      64        32768        256
4        16     256       1048576      65536
5        32     1024      33554432     4.29E+09
6        64     4096      1.07E+09     1.84E+19
7        128    16384     3.44E+10     3.4E+38
8        256    65536     1.1E+12      1.16E+77
9        512    262144    3.52E+13     1.3E+154
10       1024   1048576   1.13E+15     #NUM! (overflow in the spreadsheet)
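The table rows are easy to regenerate, and they show that the ordering only holds eventually: at n = 16, for instance, 2^n is still smaller than n^5 (the function name below is mine):

```python
import math

def growth_row(n):
    """One row of the growth table: (log2 n, n, n^2, n^5, 2^n)."""
    return (math.log2(n), n, n ** 2, n ** 5, 2 ** n)
```

At n = 32 the five columns are already in strictly increasing order; at n = 16 the last two are still swapped.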
Plot them!
[Two plots of the curves log(n), n, n^2, n^5, 2^n over the rows above: one with both x and y on linear scales, and one with the y axis converted to a log scale.]
(that jump for large n happens because the last number is out of range)
Notice how much bigger 2^n is than n^k
This is why exponential growth is BAD BAD BAD!!
A Simple Comparison
Let's assume that you have 3 algorithms to sort a list:
  f(n) = n log2 n
  g(n) = n^2
  h(n) = n^3
Let's also assume that each step takes 1 microsecond (10^-6 seconds).
Most of the algorithms discussed here will be given in terms of common functions: polynomials, logarithms, exponentials, and products of these functions.

n        n log n   n^2        n^3
10       33.2      100        1000
100      664       10000      1 sec
1000     9966      1 sec      16 min
100000   1.7 sec   2.8 hours  31.7 years
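The table entries follow from the one-microsecond-per-step assumption; for example, sorting 100000 items with an n^2 algorithm costs 10^10 steps:

```python
def seconds(steps, step_time=1e-6):
    """Wall-clock estimate under the slide's assumption of one
    microsecond per basic step."""
    return steps * step_time
```

seconds(100_000 ** 2) is about 10^4 seconds, i.e. roughly 2.8 hours, and seconds(100_000 ** 3) is about 10^9 seconds, roughly 31.7 years.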
Common Functions
Constant: very fast. Some hash table algorithms can look up one item from a table of n items in an average time which is constant (independent of the table size).
Logarithmic: also very fast. Typical of many algorithms that use (binary) trees.
Linear time: typical of fast algorithms on a single-processor computer, e.g. when all of the input of size n has to be read.
n log n: typical of the best sorting algorithms. Considered a good solution.
Polynomial: when a problem of size n can be solved in time n^k where k is a constant. Small exponents (k <= 3) are OK.
Common Functions
Exponential: those that use time k^n where k is a constant. Algorithms that grow at this rate are suitable only for small problems. Unfortunately, the best algorithms known for many problems use exponential time. Much of the work on developing algorithms today is focused on these problems, because they take a huge amount of time to execute (even for reasonably small input sizes).
There is a large variation in the size of various exponential functions (compare 2^(0.0001n) and 2^n), but for large n the functions all become huge.
Algorithm Complexity
Analogy: shoe shopping.
Best, average, worst case:
  Best case: drive right up to the door, don't have to walk anywhere. You can buy those Manolo Blahnik stilettos.
  Average case: walking around campus. Tevas.
  Worst case: hiking in the desert for weeks. Need boots that can accommodate hot, swollen feet.
Upper vs. lower bound: assume you wear size 8.25 (in the worst case).
  Perfect fit: custom-made shoes.
  Upper bound: fit as closely as possible, with a well-known function (a standard size) that is just bigger than the actual foot size. Get a size 8 1/2.
  Lower bound: fit as closely as possible, with a well-known function (a standard size) that is just smaller than the actual foot size. Get a size 8.
More Plots

log n   n      n log n
1       2      2
2       4      8
3       8      24
4       16     64
5       32     160
6       64     384
7       128    896
8       256    2048
9       512    4608
10      1024   10240
11      2048   22528
12      4096   49152

[Two plots of log n, n, and n log n over these rows: one with a linear y axis, one with a logarithmic y axis.]
Let’s Count Some Beer A well-known “song”
“100 bottles of beer on the wall, 100 bottles of beer; you take one down, pass it around, 99 bottles of beer on the wall.”
“99 bottles of beer on the wall, 99 bottles of beer; you take one down, pass it around, 98 bottles of beer on the wall.”
… “1 bottle of beer on the wall, 1 bottle of beer, you take it
down, pass it around, no bottles of beer on the wall.” HALT.
Let’s change the song to “N bottles of beer on the wall”. The number of bottles of beer passed around is Order what?
Let’s Count Some Ants
Another song: The ants go marching 1 by 1 The ants go marching 2 by 2 The ants go marching 3 by 3
How many ants are in the lead over the first n waves? 1 + 2 + 3 + ... + n
Does this remind you of anything?
1 + 2 + ... + n = n(n+1)/2, which is O(n^2)
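Both songs can be tallied directly to confirm the orders (the function names are mine): the beer song passes one bottle per verse, order n, while the ants total is the triangular number n(n+1)/2, order n^2.

```python
def beer_bottles_passed(n):
    """One bottle passed per verse, n verses: order n."""
    return n

def ants_in_lead(n):
    """1 + 2 + ... + n, summed the long way."""
    return sum(range(1, n + 1))

def ants_closed_form(n):
    """The same total via n(n+1)/2, which is O(n^2)."""
    return n * (n + 1) // 2
```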
Graph it! Let's plot beer(n) versus ants(n).
[Plot: total # items versus n for n = 1 to 12, comparing series labeled Gifts, Beer Bottles, and Ants; the quadratic ants total pulls away from the linear beer count.]
ArrayMax, revisited
Algorithm ArrayMax(A, n)
Input: an array A storing n integers
Output: the maximum element in A

currentMax <- A[0]                    -- 2 steps, plus 1 to initialize i
for i <- 1 to n-1 do                  -- 2 steps each time (compare i to n, inc i), n-1 times
    if currentMax < A[i]              -- 2 steps
        then currentMax <- A[i]       -- 2 steps; how often this runs depends on the order the numbers appear in A[]
return currentMax                     -- 1 step

Total: between 4(n-1) and 6(n-1) steps in the loop.
ArrayMax, revisited
Algorithm ArrayMax(A, n)
Input: an array A storing n integers
Output: the maximum element in A

currentMax <- A[0]
for i <- 1 to n-1 do                  -- n-1 times
    if currentMax < A[i] then currentMax <- A[i]
return currentMax

All that really matters is that we go through the loop order n times.
Worst Case and Best Case
If we return to our original question of "how fast does a program run?", we can see that this question is not enough.
Inputs vary in the way they are organized, and this can influence the number of critical operations performed. Suppose that we are searching for an element in an ordered list:
  if the target key is the first in the list, our function takes constant time
  if the target key is not in the list, our function takes O(n), where n is the size of the list
The examples above are referred to as best-case analysis and worst-case analysis.
Which is the really relevant case? Worst case is more important, because it gives us a bound on how long the function might have to run.
Average Case
In some situations neither the best-case nor the worst-case analysis expresses the performance of an algorithm well.
Average-case analysis can be used if necessary.
Still, average-case analysis is uncommon because:
  it may be cumbersome to do an average-case analysis of non-trivial algorithms
  in most cases the "order" of the average analysis is the same as that of the worst case
Comparing Two Functions
There are several ways we can find out whether one function is order of another.
The standard way is to use the definition of the big-O notation. Is 10n + 23 = O(n)?
We need 10n + 23 <= c*n for all n >= n0.
Take c = 11 and n0 = 23: then 10n + 23 <= 11n is equivalent to 23 <= n, which holds for all n >= 23.
So 10n + 23 is O(n).
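The witnesses c = 11 and n0 = 23 can be spot-checked over a range of n (a finite check, not a proof; the helper name is mine):

```python
def within_bound(n, c=11, n0=23):
    """Does 10n + 23 <= c*n hold at this n, for the witnesses
    c = 11, n0 = 23?  (n0 marks where the bound starts holding.)"""
    return 10 * n + 23 <= c * n
```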
Comparing Running Times
Analysis Example: Phonebook
Given: a physical phone book, organized in alphabetical order, and a name you want to look up. Consider an algorithm in which you search through the book sequentially, from first page to last.
What is the order of:
  the best-case running time?
  the worst-case running time?
  the average-case running time?
What is a better algorithm? And what is the worst-case running time for that algorithm?
Analysis Example (Phonebook)
This better algorithm is called Binary Search. What is its running time?
First you look in the middle of n elements. Then you look in the middle of n/2 = (1/2)*n elements. Then you look in the middle of (1/2)*(1/2)*n elements... Continue until there is only 1 element left.
Say you did this m times: (1/2)*(1/2)*...*(1/2)*n. Then the number of repetitions is the smallest integer m such that
    (1/2^m)*n <= 1
Analyzing Binary Search
In the worst case, the number of repetitions is the smallest integer m such that
    (1/2^m)*n <= 1
We can rewrite this as follows:
    (1/2^m)*n <= 1
    n <= 2^m          (multiply both sides by 2^m)
    log2 n <= m       (take the log of both sides)
Since m is the worst-case number of repetitions, the algorithm is O(log n).
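A sketch of binary search instrumented to count probes, to check the bound empirically (returning the probe count alongside the index is my addition):

```python
def binary_search(A, target):
    """Binary search on a sorted list. Returns (index or None, probes);
    the probe count should stay within about log2(n) + 1."""
    lo, hi, probes = 0, len(A) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if A[mid] == target:
            return mid, probes
        elif A[mid] < target:
            lo = mid + 1        # discard the left half
        else:
            hi = mid - 1        # discard the right half
    return None, probes
```

On a sorted list of 1024 elements, even a failed search uses at most 11 probes, in line with log2(1024) = 10.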
Analysis Example
"Prefix averages": you want this mapping from an array of numbers to an array of averages of the preceding numbers (who knows why; not my example):
5 10 15 20 25 30
5/1 15/2 30/3 50/4 75/5 105/6
There are two straightforward algorithms: one is easy but wasteful; the other is more efficient, but requires insight into the problem.
Analysis Example
In the wasteful algorithm, for each position i in A you look at the values of all the elements that came before. What is the number of positions in the largest part? When i = n, you look at n positions; when i = n-1, you look at n-1 positions; when i = n-2, you look at n-2 positions; ...; when i = 2, you look at 2 positions; when i = 1, you look at 1 position.
Analysis Example
A useful tool: store partial information in a variable! This uses space to save time. The key is to keep the running sum in a variable rather than recomputing it (and not to divide the sum variable itself). This eliminates one for loop, which is always a good thing to do.
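Both algorithms, as I read the slides, can be sketched as follows: the wasteful version recomputes each prefix sum from scratch (order n^2), while the efficient one carries the running sum in a variable (order n).

```python
def prefix_averages_slow(A):
    """O(n^2): recompute the prefix sum from scratch at every i."""
    return [sum(A[:i + 1]) / (i + 1) for i in range(len(A))]

def prefix_averages_fast(A):
    """O(n): keep the running sum s in a variable; never divide s
    itself, only the value stored into the output."""
    out, s = [], 0
    for i, x in enumerate(A):
        s += x
        out.append(s / (i + 1))
    return out
```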
Summary: Analysis of Algorithms
A method for determining, in an abstract way, the asymptotic running time of an algorithm; here asymptotic means "as n gets very large."
Useful for comparing algorithms.
Useful also for determining tractability: that is, a way to determine whether the problem is intractable (not solvable in a reasonable amount of time) or not. Exponential-time algorithms are usually intractable.
We'll revisit these ideas throughout the rest of the course.