CS 2133: Data Structures Mathematics Review and Asymptotic Notation


Arithmetic Series Review

1 + 2 + 3 + . . . + n = ?

Sn = a + (a+d) + (a+2d) + (a+3d) + . . . + (a+(n-1)d)
Sn = (a+(n-1)d) + (a+(n-2)d) + (a+(n-3)d) + . . . + a     (the same sum, written in reverse)

2Sn = [2a+(n-1)d] + [2a+(n-1)d] + [2a+(n-1)d] + . . . + [2a+(n-1)d]     (n terms)
Sn = n/2 [2a + (n-1)d]

Consequently, 1 + 2 + 3 + … + n = n(n+1)/2.

Problems

Find the sum of the following

1 + 3 + 5 + . . . + 121 = ?

The first 50 terms of -3 + 3 + 9 + 15 + …

1 + 3/2 + 2 + 5/2 + . . . + 25 = ?
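As a quick sanity check of the closed form, here is a small Python sketch (added; the helper name arith_sum is ours, not from the course), applied to the first problem:

    # Sum of the first n terms of the arithmetic series a + (a+d) + (a+2d) + ...
    def arith_sum(a, d, n):
        return n * (2 * a + (n - 1) * d) / 2

    # 1 + 3 + 5 + ... + 121 has a = 1, d = 2, and 121 = 1 + (n-1)*2, so n = 61
    assert arith_sum(1, 2, 61) == sum(range(1, 122, 2))   # both give 3721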

Geometric Series Review

1 + 2 + 4 + 8 + . . . + 2^n

1 + 1/2 + 1/4 + . . . + 2^(-n)

In general: the sum from i = 0 to n of a·r^i equals a(r^(n+1) - 1)/(r - 1)

Theorem:

Sn = a + ar + ar^2 + . . . + ar^n
rSn = ar + ar^2 + . . . + ar^n + ar^(n+1)

Sn - rSn = a - ar^(n+1)

Sn = a(r^(n+1) - 1)/(r - 1)     (r ≠ 1)

What about the case where -1< r < 1 ?
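A small sanity check of the closed form in Python (added; geom_sum is our own helper name). The last line also hints at the -1 < r < 1 case, where r^(n+1) shrinks toward 0:

    # Sum a + ar + ar^2 + ... + ar^n  (n+1 terms, r != 1)
    def geom_sum(a, r, n):
        return a * (r ** (n + 1) - 1) / (r - 1)

    assert geom_sum(1, 2, 10) == 2 ** 11 - 1      # 1 + 2 + 4 + ... + 2^10
    # For -1 < r < 1, r^(n+1) -> 0, so Sn approaches a / (1 - r):
    print(geom_sum(3, 0.75, 500))                 # ~ 12.0 = 3 / (1 - 3/4)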

Geometric Problems

Evaluate the sum from i = 0 to 8 of (1/2)^i = ?

What is the sum of 3 + 9/4 + 27/16 + . . . ?

1/2 - 1/4 + 1/8 - 1/16 + . . .

Harmonic Series

Hn = 1 + 1/2 + 1/3 + 1/4 + . . . + 1/n

Hn ≈ ln n + 0.577, where 0.577… is Euler's constant.
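A numeric illustration (added, not from the slides) that Hn - ln n settles near Euler's constant:

    import math

    def H(n):
        # Harmonic number Hn = 1 + 1/2 + ... + 1/n
        return sum(1.0 / k for k in range(1, n + 1))

    print(H(10 ** 6) - math.log(10 ** 6))   # ~ 0.5772, Euler's constant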

Just an Interesting Question

What is the optimal base to use in the representation of numbers n?

Example: with base x we have a row of slots _ _ _ _ _ _ _ _ _; representing n takes about logx n + 1 slots, each holding one of x possible values.

We minimize the total cost (c is the cost per value per slot)

f(x) = c · x · (logx n + 1) ≈ c · x · (log2 n / log2 x)

Since x/ln x is minimized at x = e ≈ 2.718, the best integer base by this measure is 3.
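A quick numeric check (added; uses the cost model as reconstructed above):

    import math

    n = 10 ** 6
    cost = {x: x * math.log(n) / math.log(x) for x in range(2, 11)}
    print(min(cost, key=cost.get))   # -> 3, the integer closest to e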

Logarithm Review

ln = loge is called the natural logarithm; lg = log2 is called the binary logarithm.

How many bits are required to represent the number n in binary?

floor(log2 n) + 1
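In Python, int.bit_length() returns exactly this count, which gives a quick (added) check:

    import math

    for n in (1, 5, 255, 256, 10 ** 9):
        assert n.bit_length() == math.floor(math.log2(n)) + 1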

Logarithm Rules

The logarithm to the base b of x, denoted logb x, is defined to be that number y such that

b^y = x

logb x > 0 if x > 1;  logb x = 0 if x = 1;  logb x < 0 if 0 < x < 1

logb(x1*x2) = logb x1 + logb x2

logb(x1/x2) = logb x1 - logb x2

logb x^c = c logb x

Additional Rules

For all real a > 0, b > 0, c > 0, and n:

logb a = logc a / logc b

logb (1/a) = - logb a

logb a = 1 / loga b

a^(logb n) = n^(logb a)
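These rules are easy to spot-check numerically; a small Python sketch (added, with arbitrary sample values):

    import math

    a, b, c, n, x1, x2 = 3.0, 2.0, 10.0, 50.0, 7.0, 5.0
    log = math.log   # log(x, base)
    assert math.isclose(log(x1 * x2, b), log(x1, b) + log(x2, b))
    assert math.isclose(log(a, b), log(a, c) / log(b, c))   # change of base
    assert math.isclose(log(a, b), 1 / log(b, a))           # reciprocal rule
    assert math.isclose(a ** log(n, b), n ** log(a, b))     # a^(logb n) = n^(logb a)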

Asymptotic Performance

In this course, we care most about the asymptotic performance of an algorithm: how does its running time and memory requirement behave as the problem size gets very large?

Coming up: the asymptotic performance of two search algorithms, and a formal introduction to asymptotic notation.

Input Size

Time and space complexity are generally functions of the input size (e.g., for sorting or multiplication). How we characterize input size depends on the problem:

Sorting: number of input items
Multiplication: total number of bits
Graph algorithms: number of nodes and edges
Etc.

Running Time

Running time is the number of primitive steps that are executed. Except for the time of executing a function call, most statements require roughly the same amount of time:

y = m * x + b
c = 5 / 9 * (t - 32)
z = f(x) + g(y)

We can be more exact if need be.

Analysis

Worst case: provides an upper bound on running time, an absolute guarantee.

Average case: provides the expected running time. Very useful, but treat with care: what is “average”? Random (equally likely) inputs? Real-life inputs?

An Example: Insertion Sort

InsertionSort(A, n) {
  for i = 2 to n {
    key = A[i]
    j = i - 1
    while (j > 0) and (A[j] > key) {
      A[j+1] = A[j]
      j = j - 1
    }
    A[j+1] = key
  }
}
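The pseudocode translates directly into runnable Python (added here for reference; note the shift to 0-based indexing):

    def insertion_sort(A):
        # Insert A[i] into the already-sorted prefix A[0..i-1]
        for i in range(1, len(A)):
            key = A[i]
            j = i - 1
            while j >= 0 and A[j] > key:
                A[j + 1] = A[j]   # shift larger elements one slot right
                j -= 1
            A[j + 1] = key
        return A

    print(insertion_sort([5, 2, 4, 6, 1, 3]))   # [1, 2, 3, 4, 5, 6]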

Insertion Sort

Two questions about InsertionSort:

What is the precondition for the outer loop?

How many times will the inner loop execute?

Insertion Sort

Statement                               Effort
InsertionSort(A, n) {
  for i = 2 to n {                      c1·n
    key = A[i]                          c2·(n-1)
    j = i - 1                           c3·(n-1)
    while (j > 0) and (A[j] > key) {    c4·T
      A[j+1] = A[j]                     c5·(T-(n-1))
      j = j - 1                         c6·(T-(n-1))
    }                                   0
    A[j+1] = key                        c7·(n-1)
  }                                     0
}

T = t2 + t3 + … + tn, where ti is the number of while-condition evaluations in the ith iteration of the for loop.

Analyzing Insertion Sort

T(n) = c1·n + c2·(n-1) + c3·(n-1) + c4·T + c5·(T - (n-1)) + c6·(T - (n-1)) + c7·(n-1)
     = c8·T + c9·n + c10

What can T be?

Best case: the inner loop body is never executed, so ti = 1 and T(n) is a linear function of n.

Worst case: the inner loop body is executed for all previous elements, so ti = i and T(n) is a quadratic function of n: T = 1 + 2 + 3 + . . . + (n-1) + n = n(n+1)/2.
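To see the best/worst gap concretely, here is a small added Python experiment (while_tests is our own helper) that counts while-condition evaluations on sorted versus reverse-sorted input:

    def while_tests(A):
        A = list(A)
        tests = 0
        for i in range(1, len(A)):
            key, j = A[i], i - 1
            tests += 1                   # the evaluation that ends or enters the loop
            while j >= 0 and A[j] > key:
                A[j + 1] = A[j]
                j -= 1
                tests += 1               # one evaluation per shift
            A[j + 1] = key
        return tests

    print(while_tests(range(100)))           # sorted: 99 tests (linear)
    print(while_tests(range(100, 0, -1)))    # reversed: 5049 tests (~ n^2 / 2)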

Analysis

Simplifications: ignore the actual and abstract statement costs. Order of growth is the interesting measure: the highest-order term is what counts. Remember, we are doing asymptotic analysis: as the input size grows larger, it is the high-order term that dominates.

Upper Bound Notation

We say InsertionSort’s run time is O(n^2). Properly we should say its run time is in O(n^2). Read O as “Big-O” (you’ll also hear it as “order”).

In general, a function f(n) is O(g(n)) if there exist positive constants c and n0 such that f(n) ≤ c·g(n) for all n ≥ n0.

Formally: O(g(n)) = { f(n) : there exist positive constants c and n0 such that f(n) ≤ c·g(n) for all n ≥ n0 }

Big O example

Show, using the definition, that 5n + 4 ∈ O(n). Here g(n) = n. First we must find a c and an n0, then show that f(n) ≤ c·g(n) for every n ≥ n0.

Clearly 5n + 4 ≤ 6n whenever n ≥ 4.

Hence c = 6 and n0 = 4 satisfy the requirements.
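A brute-force confirmation of the inequality over a finite range (added):

    # 5n + 4 <= 6n for all n >= 4 (checked up to 10^5)
    assert all(5 * n + 4 <= 6 * n for n in range(4, 10 ** 5))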

Insertion Sort Is O(n^2)

Proof: Suppose the runtime is an^2 + bn + c. If any of a, b, and c are less than 0, replace the constant with its absolute value. Then

an^2 + bn + c ≤ (a + b + c)n^2 + (a + b + c)n + (a + b + c) ≤ 3(a + b + c)n^2 for n ≥ 1

Let c' = 3(a + b + c) and let n0 = 1.

Question: Is InsertionSort O(n^3)? Is InsertionSort O(n)?

Big O Fact

A polynomial of degree k is O(n^k). Proof:

Suppose f(n) = bk·n^k + bk-1·n^(k-1) + … + b1·n + b0

Let ai = |bi|. Then for n ≥ 1,

f(n) ≤ ak·n^k + ak-1·n^(k-1) + … + a1·n + a0 ≤ n^k·(ak + ak-1 + … + a1 + a0) = c·n^k

Lower Bound Notation

We say InsertionSort’s run time is Ω(n). In general, a function f(n) is Ω(g(n)) if there exist positive constants c and n0 such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n0.

Proof: Suppose the run time is an + b. Assume a and b are positive (what if b is negative?). Then an ≤ an + b.

Asymptotic Tight Bound

A function f(n) is Θ(g(n)) if there exist positive constants c1, c2, and n0 such that

c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0

Theorem: f(n) is Θ(g(n)) iff f(n) is both O(g(n)) and Ω(g(n)). Proof: someday.

Notation

Θ(g) is the set of all functions f such that there exist positive constants c1, c2, and n0 such that

0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for every n > n0

[Figure: f(n) sandwiched between c1·g(n) below and c2·g(n) above.]

Growth Rate Theorems

1. The power n^α is in O(n^β) iff α ≤ β (with α, β > 0), and n^α is in o(n^β) iff α < β

2. logb n ∈ o(n^α) for any b and any α > 0

3. n^α ∈ o(c^n) for any α > 0 and c > 1

4. loga n ∈ O(logb n) for any a and b

5. c^n ∈ O(d^n) iff c ≤ d, and c^n ∈ o(d^n) iff c < d

6. Any constant function f(n) = c is in O(1)

Big O Relationships

1. o(f) ⊂ O(f)

2. If f ∈ o(g) then O(f) ⊂ o(g)

3. If f ∈ O(g) then o(f) ⊂ o(g)

4. If f ∈ O(g) then f(n) + g(n) ∈ O(g)

5. If f ∈ O(f') and g ∈ O(g') then f(n)·g(n) ∈ O(f'(n)·g'(n))

Theorem: log(n!) ∈ Θ(n log n)

Case 1: n log n ∈ O(log(n!))

log(n!) = log(n·(n-1)·(n-2)···3·2·1)
        = log(n·(n-1)·(n-2)···(n/2)·(n/2 - 1)···2·1)
        ≥ log((n/2)·(n/2)···(n/2)·1·1···1)     (shrink the first n/2 factors to n/2 and the rest to 1)
        = log((n/2)^(n/2)) = (n/2)·log(n/2), which is Ω(n log n)

Case 2: log(n!) ∈ O(n log n)

log(n!) = log n + log(n-1) + log(n-2) + . . . + log 2 + log 1
        < log n + log n + log n + . . . + log n
        = n log n
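A numeric illustration (added, not from the slides): log2(n!) can be computed via math.lgamma, and its ratio to n·log2 n stays pinned between constants, as the theorem predicts:

    import math

    for n in (10, 100, 1000, 10 ** 5):
        log2_fact = math.lgamma(n + 1) / math.log(2)   # log2(n!)
        print(n, log2_fact / (n * math.log2(n)))       # climbs toward 1, stays above ~0.5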

The Little-o Theorem: If log(f) ∈ o(log(g)) and g(n) → ∞ as n → ∞, then f ∈ o(g).

Note the above theorem does not apply to big O: log(n^2) ∈ O(log n), but n^2 ∉ O(n).

Application: Show that 2^n ∈ o(n^n). Taking logs, log(2^n) = n·log2 2 = n and log(n^n) = n·log2 n. Hence

lim (n→∞) log(2^n)/log(n^n) = lim (n→∞) n/(n·log2 n) = lim (n→∞) 1/log2 n = 0

so log(2^n) ∈ o(log(n^n)), which implies 2^n ∈ o(n^n).
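The same vanishing ratio shows up numerically (added sketch):

    import math

    # ratio of logs from the proof: log(2^n)/log(n^n) = 1/log2(n) -> 0
    for n in (10, 100, 1000, 10 ** 6):
        print(n, 1 / math.log2(n))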

Theorem: lg n ∈ o(n)

lim (n→∞) (lg n)/n = lim (n→∞) (ln n)/(ln 2 · n) = (1/ln 2) · lim (n→∞) (ln n)/n
                   = (1/ln 2) · lim (n→∞) (1/n)/1 = 0     (by L'Hôpital's rule)

Practical Complexity

[Figures: f(n) = log n, n, n log n, n^2, n^3, and 2^n plotted for n = 1 to 20, redrawn over successive slides with the y-axis capped at 250, 500, 1000, and 5000.]
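To regenerate the values behind those plots, a short Python sketch (added; base-2 logarithms assumed):

    import math

    print(f"{'n':>3} {'log n':>7} {'n log n':>9} {'n^2':>6} {'n^3':>8} {'2^n':>9}")
    for n in range(1, 21):
        print(f"{n:>3} {math.log2(n):>7.2f} {n * math.log2(n):>9.2f} "
              f"{n ** 2:>6} {n ** 3:>8} {2 ** n:>9}")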

Other Asymptotic Notations

A function f(n) is o(g(n)) if for every positive constant c there exists an n0 such that

f(n) < c·g(n) for all n ≥ n0

A function f(n) is ω(g(n)) if for every positive constant c there exists an n0 such that

c·g(n) < f(n) for all n ≥ n0

Intuitively: o() is like <, O() is like ≤, ω() is like >, Ω() is like ≥, Θ() is like =

Comparing functions

Definition: The function f is said to dominate g if f(n)/g(n) increases without bound as n increases without bound; i.e., for any c > 0 there exists n0 > 0 such that f(n) > c·g(n) for every n > n0. Equivalently,

lim (n→∞) f(n)/g(n) = ∞

Little o Complexity

o(g) is the set of all functions that are dominated by g, i.e., the set of all f such that for every c > 0 there exists nc > 0 such that f(n) ≤ c·g(n) for every n > nc.

Up Next

Solving recurrences: the substitution method and the Master Theorem.
