COMP 2402/2002 Abstract Data Types and Algorithms

Prof: Eduardo Mesa
Office: HP 5347
Email: [email protected]
Office hours: Tuesday and Thursday 4:30pm – 5:30pm

Web Site: http://people.scs.carleton.ca/~eamesaba/

and also WebCT

Textbook: Open Data Structures (in Java). The PDF can be downloaded from the website

http://people.scs.carleton.ca/~eamesaba/

TAs

Andrew Trenholm
Tawfic Abdul-Fatah

• No office hours for TAs
• Prompt answer to questions in:
− WebCT discussion
− Carleton Computer Science Society forum

Evaluation

Assignments: 30% (3 assignments, 10% each)
Midterm: 30% (2 midterms, 15% each)
Final Exam: 40%
Active Participation: 5% (bonus)

Assignments

• 1 Theory assignment
− Must be handed in first thing in class.
− All pages must be stapled.

• 2 Programming Assignments
− Must be uploaded on WebCT
− Make a folder (<Student ID>_<First Name>_<Last Name>_<Assignment #>)
− Put all the source files in the folder
− Add a text file with your ID and full name
− Zip the folder

No late assignment will be accepted.

End of the Introduction

Beginning of the Lecture

Data Type

A data storage format that can contain a specific type or range of values, characterized by a set of operations that satisfy a set of specific properties.

Example: Data Type Int

Range of values: $-2^{31}$ to $2^{31} - 1$

Operations: +, -, *, /

Properties:
• Symmetry (commutativity): a+b = b+a, a*b = b*a
• Associativity: a+(b+c) = (a+b)+c
• Definition: for a/b, b must not be 0

Data Structures

How to organize data to be able to perform operations on those data efficiently.

A variable is the simplest data structure.

Example: Electronic Phone Book

Contains different DATA:
- Names
- phone numbers
- addresses

Need to perform certain OPERATIONS:
- add
- delete
- look for a phone number
- look for an address

How to organize the data so as to optimize the efficiency of the operations?

Candidate structures: List, Binary Search Tree, Dictionary

Example: Finding the best route for an email message in a network

Contains DATA:
- Network + Traffic

Need to perform certain OPERATIONS:
- Find the best route

[Figure: network graph with numeric weights on its links]

How to represent the data: Adjacency Matrix or Adjacency List
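The two representations can be sketched in Java as follows. This is only an illustration: the 4-node network, its weights, and the names `GraphRepresentations`, `buildMatrix` and `buildLists` are assumptions, not taken from the slide figure. The matrix uses space proportional to the square of the number of nodes and answers "is there a link u-v?" in one lookup; the lists use space proportional to nodes plus links and make it cheap to enumerate the neighbours of one node.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical 4-node network; the link weights are made up for illustration.
class GraphRepresentations {
    static final int N = 4;

    // Each entry is {u, v, weight} for an undirected link u-v.
    static final int[][] EDGES = { {0, 1, 12}, {1, 2, 8}, {2, 3, 10}, {0, 3, 43} };

    // Adjacency matrix: matrix[u][v] holds the weight of link u-v, or 0 if absent.
    static int[][] buildMatrix(int[][] edges) {
        int[][] matrix = new int[N][N];
        for (int[] e : edges) {
            matrix[e[0]][e[1]] = e[2];
            matrix[e[1]][e[0]] = e[2]; // undirected: store both directions
        }
        return matrix;
    }

    // Adjacency list: lists.get(u) holds {neighbour, weight} pairs for node u.
    static List<List<int[]>> buildLists(int[][] edges) {
        List<List<int[]>> lists = new ArrayList<>();
        for (int i = 0; i < N; i++) {
            lists.add(new ArrayList<>());
        }
        for (int[] e : edges) {
            lists.get(e[0]).add(new int[] {e[1], e[2]});
            lists.get(e[1]).add(new int[] {e[0], e[2]});
        }
        return lists;
    }
}
```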

Abstract Data Type (ADT) (interfaces)

Define what operations can be done.

Implementations (algorithms)

Define how each operation is performed.

The same ADT can have several different implementations.

Example: ADT (Shelf) with the operation GetBook ( Book b )
- Lucy: Binary Search
- Peter: Random Search
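A rough Java sketch of one ADT with two implementations. The interface plays the role of the Shelf ADT; the signature `getBook(String title)` returning a boolean is an assumption made to keep the example short, and the second implementation uses a plain sequential scan as a stand-in for the "random search" on the slide.

```java
import java.util.Arrays;

// The ADT: says WHAT can be done, not how.
interface Shelf {
    boolean getBook(String title);
}

// Implementation 1: keeps the titles sorted and uses binary search.
class SortedShelf implements Shelf {
    private final String[] titles;

    SortedShelf(String[] titles) {
        this.titles = titles.clone();
        Arrays.sort(this.titles);
    }

    public boolean getBook(String title) {
        return Arrays.binarySearch(titles, title) >= 0;
    }
}

// Implementation 2: checks the books one by one in shelf order.
class UnsortedShelf implements Shelf {
    private final String[] titles;

    UnsortedShelf(String[] titles) {
        this.titles = titles.clone();
    }

    public boolean getBook(String title) {
        for (String t : titles) {
            if (t.equals(title)) {
                return true;
            }
        }
        return false;
    }
}
```

Code written against `Shelf` works with either class; only the running time of `getBook` changes.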

Data Structures

To perform the operations efficiently we need to:

• Identify your data
• Identify the operations you need to perform (and how often each operation is performed)
• Define efficiency
• Choose the best structure for your data

Analysis of Algorithms

Input → Algorithm → Output

An algorithm is a step-by-step procedure for solving a problem in a finite amount of time.

Analyze an algorithm = determine its efficiency

Efficiency?
• Running time …
• Memory …
• Quality of the result …
• Simplicity …

Generally, improving the efficiency in one of these aspects diminishes the efficiency in the others.

Running time

[Figure: running time versus input size (1000 to 4000), with best case, average case and worst case curves]

The running time depends on the input size.

It also depends on the input data: different inputs can have different running times.

Running Time of an algorithm

• Average case time is often difficult to determine.

• We focus on the worst case running time.
– Easier to analyze
– Crucial to applications such as games, finance and robotics

[Figure: running time versus input size, with best case, average case and worst case curves]

Example ….

If x is odd, return x.
If x is even, compute the sum S of the first x integers and return S.

Input 15 → output 15
Input 4 → output 10
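A minimal Java sketch of this odd/even example, assuming x ≥ 0 (the names `OddOrSum` and `compute` are illustrative). For an odd x the method does a constant amount of work; for an even x the loop runs x times, so two inputs of nearly the same size can have very different running times.

```java
class OddOrSum {
    // Returns x when x is odd; otherwise returns 1 + 2 + ... + x.
    // Assumes x >= 0.
    static long compute(int x) {
        if (x % 2 != 0) {
            return x;              // odd: one test, one return
        }
        long s = 0;
        for (int i = 1; i <= x; i++) {
            s += i;                // even: x additions
        }
        return s;
    }
}
```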

Measuring the Running Time

• How should we measure the running time of an algorithm?

• Approach 1: Experimental Study

[Figure: measured running time t (ms) plotted against input size n]
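An experimental study can be sketched in Java as below. The measured routine `sumTo` and the class name `TimingExperiment` are placeholders chosen for illustration; `System.nanoTime()` is the standard JDK call for elapsed-time measurement. A single measurement like this is noisy, so in practice each input size is usually run several times.

```java
class TimingExperiment {
    // The work being measured: sums the integers 1..n with a simple loop.
    static long sumTo(int n) {
        long s = 0;
        for (int i = 1; i <= n; i++) {
            s += i;
        }
        return s;
    }

    // Returns the elapsed time, in nanoseconds, of one call to sumTo(n).
    static long timeOnce(int n) {
        long start = System.nanoTime();
        long result = sumTo(n);
        long elapsed = System.nanoTime() - start;
        if (result < 0) {
            // Never true for valid n; the check also keeps the result in use.
            throw new IllegalStateException("unexpected overflow");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        for (int n = 1000; n <= 4000; n += 1000) {
            System.out.println("n = " + n + ", t = " + timeOnce(n) + " ns");
        }
    }
}
```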

Beyond Experimental Studies

• Experimental studies have several limitations:
– need to implement the algorithm
– limited set of inputs
– results depend on the hardware and software environments

Theoretical Analysis

• We need a general methodology that:
– Uses a high-level description of the algorithm (independent of implementation).
– Characterizes running time as a function of the input size.
– Takes into account all possible inputs.
– Is independent of the hardware and software environment.

Analysis of Algorithms

• Primitive Operations: low-level computations that are independent of the programming language and can be identified in pseudocode.

• Examples:
– calling a method and returning from a method
– arithmetic operations (e.g. addition)
– comparing two numbers, etc.

By inspecting the pseudocode, we can count the number of primitive operations executed by an algorithm.

Example:

Algorithm arrayMax(A, n):
Input: An array A storing n integers.
Output: The maximum element in A.

currentMax ← A[0]
for i ← 1 to n − 1 do
    if currentMax < A[i] then
        currentMax ← A[i]
return currentMax

Trace on A = 5 13 4 7 6 2 3 8 1 2: currentMax starts at 5 and becomes 13 when i = 1.

What are the primitive operations to count?
• Comparisons
• Assignments to currentMax

Worst case (A = 5 7 8 10 11 12 14 16 17 20, increasing order):
• 1 assignment before the loop
• n−1 comparisons
• n−1 assignments inside the loop

In the best case? (A = 15 1 12 3 9 7 6 4 2 2, maximum in first position):
• 1 assignment before the loop
• n−1 comparisons
• 0 assignments inside the loop

Summarizing:

Worst Case: n−1 comparisons, n assignments

Best Case: n−1 comparisons, 1 assignment
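The counts can be checked with a small Java version of arrayMax that tallies the two primitive operations as it runs. The counter fields and the class name `ArrayMaxCount` are additions for illustration and are not part of the pseudocode.

```java
class ArrayMaxCount {
    static int comparisons;   // executions of "currentMax < A[i]"
    static int assignments;   // assignments to currentMax

    // Returns the maximum of a non-empty array and records the counts.
    static int arrayMax(int[] a) {
        comparisons = 0;
        assignments = 0;
        int currentMax = a[0];
        assignments++;
        for (int i = 1; i < a.length; i++) {
            comparisons++;
            if (currentMax < a[i]) {
                currentMax = a[i];
                assignments++;
            }
        }
        return currentMax;
    }
}
```

On the increasing array of n = 10 elements this records 9 comparisons and 10 assignments; on the array whose maximum comes first it records 9 comparisons and 1 assignment.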

Computing the exact number of primitive operations can be difficult.

Instead, we compare the asymptotic behaviour of the running time as the size of the input grows.

Big-Oh

– given two functions f(n) and g(n), we say that f(n) is O(g(n)) if and only if there are positive constants c and $n_0$ such that

$f(n) \le c\,g(n)$ for all $n \ge n_0$

(upper bound)

[Figure: f(n) stays below c • g(n) for every n to the right of $n_0$]

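Example: What does c g(n) mean?

With g(n) = n, we get 2 g(n) = 2n and 3 g(n) = 3n: multiplying by c scales g(n) by a constant factor.

[Figure: plots of g(n) = n, 2 g(n) = 2n and 3 g(n) = 3n against n]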

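Graphical example …

f(n) = 2n + 1 and $g(n) = n^2$: f(n) is $O(n^2)$, with c = 1 and $n_0 \approx 2.5$.

[Figure: f(n) = 2n + 1 falls below $g(n) = n^2$ once n passes $n_0$]

But also f(n) is O(n). Take g(n) = n:
• g(n) = n stays below f(n) = 2n + 1
• 2 g(n) = 2n still stays below f(n) = 2n + 1
• 3 g(n) = 3n is at least f(n) = 2n + 1 for all n ≥ 1

So f(n) is O(n), with c = 3 and $n_0 = 1$.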

On the other hand…

$n^2$ is not O(n) because there are no constants c and $n_0$ such that $n^2 \le c\,n$ for all $n \ge n_0$

(no matter how large a c is chosen, there is an n big enough (n > c) that $n^2 > c\,n$).

[Figure: $n^2$ eventually rises above each of the lines n, 2n, 3n and 4n]

Formal definition of big-Oh:

$O(g(n)) = \{ f(n) : \text{there exist positive constants } c \text{ and } n_0 \text{ such that } f(n) \le c\,g(n) \text{ for all } n \ge n_0 \}$

Notice: O(g(n)) is a set of functions
• When we say f(n) = O(g(n)) we really mean f(n) ∈ O(g(n))

Example:

Prove that $f(n) = 60n^2 + 5n + 1$ is $O(n^2)$

We must find a constant c and a constant $n_0$ such that:

$60n^2 + 5n + 1 \le c\,n^2$ for all $n \ge n_0$

$5n \le 5n^2$ for all $n \ge 1$
$1 \le n^2$ for all $n \ge 1$

$f(n) \le 60n^2 + 5n^2 + n^2$ for all $n \ge 1$

$f(n) \le 66n^2$ for all $n \ge 1$

With c = 66 and $n_0 = 1$, $f(n) = O(n^2)$

Example:

Prove $f(n) = 5n \log_2 n + 8n - 200 = O(n \log_2 n)$

$5n \log_2 n + 8n - 200 \le 5n \log_2 n + 8n$

$\le 5n \log_2 n + 8n \log_2 n$ for $n \ge 2$ (since $\log_2 n \ge 1$)

$\le 13n \log_2 n$

$f(n) \le 13n \log_2 n$ for all $n \ge 2$

$f(n) \in O(n \log_2 n)$ [ c = 13, $n_0 = 2$ ]
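The constants found in the two proofs (c = 66, n0 = 1 for 60n^2 + 5n + 1, and c = 13, n0 = 2 for 5n log2 n + 8n - 200) can be sanity-checked numerically. The Java sketch below only tests a finite range, so it illustrates the inequalities rather than proving them; the class and method names are illustrative.

```java
class BigOhCheck {
    static double log2(double x) {
        return Math.log(x) / Math.log(2);
    }

    // True if 60n^2 + 5n + 1 <= 66n^2 for every n in [1, limit].
    static boolean quadraticBoundHolds(int limit) {
        for (long n = 1; n <= limit; n++) {
            if (60 * n * n + 5 * n + 1 > 66 * n * n) {
                return false;
            }
        }
        return true;
    }

    // True if 5n log2 n + 8n - 200 <= 13n log2 n for every n in [2, limit].
    static boolean nLogNBoundHolds(int limit) {
        for (long n = 2; n <= limit; n++) {
            double f = 5 * n * log2(n) + 8 * n - 200;
            if (f > 13 * n * log2(n)) {
                return false;
            }
        }
        return true;
    }
}
```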

Some common relations

• $O(n^{c_1}) \subset O(n^{c_2})$ for any $c_1 < c_2$
• For any constants a, b > 0 and c > 1: $O(a) \subset O(\log n) \subset O(n^b) \subset O(c^n)$

We can multiply these to learn about other functions. Multiplying by n gives:
$O(an) = O(n) \subset O(n \log n) \subset O(n^{1+b}) \subset O(n c^n)$

Example: $O(n^{1/5}) \subset O(n^{1/5} \log n)$

These relations make simplification faster:
• $2 \log_2 n + 2 = O(\log n)$
• $n + 2 = O(n)$
• $2n + 15n^{1/2} = O(n)$

Theorem: If g(n) is O(f(n)), then for any constant c > 0, g(n) is also O(c f(n))

Theorem: O(f(n) + g(n)) = O(max(f(n), g(n)))

Ex 1:

$2n^3 + 3n^2 = O(\max(2n^3, 3n^2)) = O(2n^3) = O(n^3)$

Ex 2:

$n^2 + 3 \log n - 7 = O(\max(n^2, 3 \log n - 7)) = O(n^2)$

Simple Big Oh Rule:

Drop lower order terms and constant factors

$7n - 3$ is $O(n)$

$8n^2 \log n + 5n^2 + n$ is $O(n^2 \log n)$

$12n^3 + 5000n^2 + 2n^4$ is $O(n^4)$

Other Big Oh Rules:

• Use the smallest possible class of functions
– Say “2n is O(n)” instead of “2n is $O(n^2)$”

• Use the simplest expression of the class
– Say “3n + 5 is O(n)” instead of “3n + 5 is O(3n)”

Asymptotic Notation (terminology)

• Special classes of algorithms:
constant: O(1)
logarithmic: O(log n)
linear: O(n)
quadratic: $O(n^2)$
cubic: $O(n^3)$
polynomial: $O(n^k)$, k > 0
exponential: $O(a^n)$, a > 1

Example of Asymptotic Analysis

An algorithm for computing prefix averages

The i-th prefix average of an array X is the average of the first (i + 1) elements of X:

$A[i] = \frac{X[0] + X[1] + \dots + X[i]}{i + 1}$

X: 5 13 4 8 6 2 3 8 1 2
A: 5 9 7.3 7.5 … … … … … …


Algorithm prefixAverages1(X, n)
Input: array X of n integers
Output: array A of prefix averages of X

A ← new array of n integers
for i ← 0 to n − 1 do
    s ← X[0]
    for j ← 1 to i do
        s ← s + X[j]
    A[i] ← s / (i + 1)
return A

Trace on X = 5 13 4 8 6 2 3 8 1 2 (indices j of the elements summed for each i, and A so far):

i = 0: j = 0; A = 5
i = 1: j = 0,1; A = 5 9
i = 2: j = 0,1,2; A = 5 9 7.3
i = 3: j = 0,1,2,3; A = 5 9 7.3 7.5


Algorithm prefixAverages1(X, n)
Input: array X of n integers
Output: array A of prefix averages of X

Statement                          #operations
A ← new array of n integers        n
for i ← 0 to n − 1 do              n
    s ← X[0]                       n
    for j ← 1 to i do              1 + 2 + … + (n − 1)
        s ← s + X[j]               1 + 2 + … + (n − 1)
    A[i] ← s / (i + 1)             n
return A                           1

• The running time of prefixAverages1 is O(1 + 2 + … + n)

• The sum of the first n integers is n(n + 1) / 2
– There is a simple visual proof of this fact

[Figure: visual proof of the sum, drawn as a staircase of columns of heights 1 to n]

Thus, algorithm prefixAverages1 runs in time O(n(n + 1) / 2), which is $O(n^2)$

TO REMEMBER

$1 + 2 + \dots + n = \frac{n(n+1)}{2}$

Another Example: A better algorithm for computing prefix averages

Algorithm prefixAverages2(X):
Input: An n-element array X of numbers.
Output: An n-element array A of numbers such that A[i] is the average of elements X[0], ... , X[i].

Let A be an array of n numbers.
s ← 0
for i ← 0 to n − 1 do
    s ← s + X[i]
    A[i] ← s / (i + 1)
return array A

Trace on X = 5 13 4 8 6 2 3 8 1 2 (value of s before X[i] is added, and A so far):

i = 0: s = 0; A = 5
i = 1: s = 5; A = 5 9
i = 2: s = 18; A = 5 9 7.3
i = 3: s = 22; A = 5 9 7.3 7.5

Operation counts for prefixAverages2:

Statement                          # operations
s ← 0                              1
for i ← 0 to n − 1 do              n
    s ← s + X[i]                   n
    A[i] ← s / (i + 1)             n
return array A                     1

O(n) time
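Both prefix-average algorithms can be written in Java and compared by counting how often the running sum is updated. This is a sketch under two assumptions that are not in the pseudocode: the output array is a `double[]` so that fractional averages such as 7.5 are kept, and the `additions` counter and class name `PrefixAverages` are added only for illustration.

```java
class PrefixAverages {
    static long additions; // counts updates of the running sum s

    // Quadratic version: recomputes each prefix sum from scratch.
    static double[] prefixAverages1(int[] x) {
        additions = 0;
        double[] a = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            double s = x[0];
            for (int j = 1; j <= i; j++) {
                s += x[j];
                additions++;
            }
            a[i] = s / (i + 1);
        }
        return a;
    }

    // Linear version: carries the running sum from one prefix to the next.
    static double[] prefixAverages2(int[] x) {
        additions = 0;
        double[] a = new double[x.length];
        double s = 0;
        for (int i = 0; i < x.length; i++) {
            s += x[i];
            additions++;
            a[i] = s / (i + 1);
        }
        return a;
    }
}
```

For the 10-element array used in the traces, the first version performs 1 + 2 + … + 9 = 45 additions inside its inner loop, while the second performs 10.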

big-Omega (lower bound)

f(n) is $\Omega(g(n))$ if there exist c > 0 and $n_0 > 0$ such that

$f(n) \ge c \cdot g(n)$ for all $n \ge n_0$

(thus, f(n) is $\Omega(g(n))$ iff g(n) is O(f(n)))

[Figure: f(n) stays above c • g(n) for every n to the right of $n_0$]

big-Theta

g(n) is $\Theta(f(n))$

<===>

g(n) ∈ O(f(n))

AND

f(n) ∈ O(g(n))

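Example:

We have seen that $f(n) = 60n^2 + 5n + 1$ is $O(n^2)$

but $60n^2 + 5n + 1 \ge 60n^2$ for $n \ge 1$

So, with c = 60 and $n_0 = 1$:

$f(n) \ge c \cdot n^2$ for all $n \ge 1$

f(n) is $\Omega(n^2)$, and since it is also $O(n^2)$, f(n) is $\Theta(n^2)$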

Intuition for Asymptotic Notation

Big-Oh
– f(n) is O(g(n)) if f(n) is asymptotically less than or equal to g(n)

big-Omega
– f(n) is $\Omega(g(n))$ if f(n) is asymptotically greater than or equal to g(n)

big-Theta
– f(n) is $\Theta(g(n))$ if f(n) is asymptotically equal to g(n)

Math You Need to Review Logarithms and Exponents

properties of logarithms:
$\log_b(xy) = \log_b x + \log_b y$
$\log_b(x/y) = \log_b x - \log_b y$
$\log_b x^a = a \log_b x$
$\log_{b^a} x = (1/a) \log_b x$
$\log_b a = \log_x a / \log_x b$

Natural logarithm: $\ln k = \int_1^k (1/x)\,dx$
$e = \lim_{n \to \infty} (1 + 1/n)^n \approx 2.71828$

properties of exponentials:
$a^{(b+c)} = a^b a^c$
$a^{bc} = (a^b)^c$
$a^b / a^c = a^{(b-c)}$
$b = a^{\log_a b}$
$b^c = a^{c \log_a b}$

More Math to Review

• Floor: $\lfloor x \rfloor$ = the largest integer ≤ x

• Ceiling: $\lceil x \rceil$ = the smallest integer ≥ x

• Summations:
– Arithmetic progression
– Geometric progression

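More Math to Review: Arithmetic Progression

$S = \sum_{i=0}^{n} d\,i = 0 + d + 2d + \dots + nd$

$S = nd + (n-1)d + (n-2)d + \dots + 0$ (the same sum in reverse order)

Adding the two lines term by term:

$2S = nd + nd + nd + \dots + nd = (n+1)\,nd$

$S = \frac{d}{2}\,n(n+1)$

for d = 1: $S = \frac{1}{2}\,n(n+1)$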

More Math to Review: Geometric Progression

$S = \sum_{i=0}^{n} r^i = 1 + r + r^2 + \dots + r^n$

$rS = r + r^2 + \dots + r^n + r^{n+1}$

$rS - S = (r + r^2 + \dots + r^n + r^{n+1}) - (1 + r + r^2 + \dots + r^n)$

$rS - S = (r-1)S = r^{n+1} - 1$

$S = (r^{n+1} - 1)/(r - 1)$ (for r ≠ 1)

If r = 2, $S = 2^{n+1} - 1$
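The closed forms for the arithmetic progression, d n(n+1)/2, and the geometric progression, (r^(n+1) - 1)/(r - 1), can be compared against term-by-term sums with a short Java check. The class and method names are illustrative, the values are assumed small enough to fit in a `long`, and the geometric closed form assumes r ≠ 1.

```java
class SummationCheck {
    // 0 + d + 2d + ... + nd, added term by term.
    static long arithmeticSum(long d, int n) {
        long s = 0;
        for (int i = 0; i <= n; i++) {
            s += d * i;
        }
        return s;
    }

    // Closed form d * n * (n + 1) / 2.
    static long arithmeticClosedForm(long d, int n) {
        return d * n * (n + 1) / 2;
    }

    // 1 + r + r^2 + ... + r^n, added term by term.
    static long geometricSum(long r, int n) {
        long s = 0;
        long power = 1;
        for (int i = 0; i <= n; i++) {
            s += power;
            power *= r;
        }
        return s;
    }

    // Closed form (r^(n+1) - 1) / (r - 1); requires r != 1.
    static long geometricClosedForm(long r, int n) {
        long power = 1;
        for (int i = 0; i <= n; i++) {
            power *= r; // after the loop, power = r^(n+1)
        }
        return (power - 1) / (r - 1);
    }
}
```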

Math You Need to Review Randomization and Probability

Expected value
$E[X] = \sum_{x \in U} x \cdot \Pr\{X = x\}$

Properties (linearity of expectation)
$E[X + Y] = E[X] + E[Y]$

$E\left[\sum_{i=1}^{k} X_i\right] = \sum_{i=1}^{k} E[X_i]$
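A small Java illustration of the definition and of linearity, using fair six-sided dice as an assumed example (dice do not appear in the slides, and the names are illustrative). The expected value of one die is 3.5; summing over all 36 equally likely outcomes of two dice gives 7, which matches E[X] + E[Y].

```java
class ExpectedValue {
    // E[X] = sum over outcomes x of x * Pr{X = x}.
    static double expectation(double[] values, double[] probabilities) {
        double e = 0;
        for (int i = 0; i < values.length; i++) {
            e += values[i] * probabilities[i];
        }
        return e;
    }

    // E[X + Y] for two independent fair dice, computed directly
    // over the 36 equally likely outcomes.
    static double expectedSumOfTwoDice() {
        double e = 0;
        for (int x = 1; x <= 6; x++) {
            for (int y = 1; y <= 6; y++) {
                e += (x + y) / 36.0;
            }
        }
        return e;
    }
}
```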
