
MELJUN CORTES Algorithm Lecture


01/08/13

ALGORITHMS

An algorithm is any well-defined computational procedure that takes some value, or set of values, as input and produces some value as output.

An algorithm is thus a sequence of computational steps that transform the input into output.



In addition, every algorithm must satisfy the following criteria:

Input: Zero or more quantities are externally supplied.

Output: At least one quantity is produced.

Definiteness: Each instruction must be clear and unambiguous.

Finiteness: An algorithm must terminate after a finite number of steps.

Effectiveness: Every instruction must be basic enough that it can be carried out.
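The five criteria can be seen in even the smallest algorithm. The sketch below (an illustration, not from the lecture) annotates Euclid's gcd algorithm with each criterion:

```python
def gcd(a, b):
    """Euclid's algorithm, annotated with the five criteria.

    Input: two externally supplied quantities a and b.
    Output: at least one quantity, the greatest common divisor.
    Definiteness: each step below is clear and unambiguous.
    Finiteness: b strictly decreases toward 0, so the loop terminates.
    Effectiveness: every instruction is a basic arithmetic step.
    """
    while b != 0:          # finiteness: b shrinks every iteration
        a, b = b, a % b    # definiteness: one unambiguous step
    return a               # output: exactly one quantity produced

print(gcd(48, 18))  # → 6
```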


Analyzing Algorithms

Analyzing an algorithm means predicting the resources that the algorithm requires.

Occasionally resources such as memory, communication bandwidth, or computer hardware are of primary concern, but most often it is computational time we want to measure.

In other words, algorithm analysis is the theoretical study of computer-program performance and resource usage.


Analysis of algorithms

What's more important than performance?
• modularity
• correctness
• maintainability
• functionality
• robustness
• user-friendliness
• programmer time
• simplicity
• extensibility
• reliability


Why study algorithms and performance?

Algorithms help us to understand scalability.

Performance often draws the line between what is feasible and what is impossible.

Algorithmic mathematics provides a language for talking about program behavior.

The lessons of program performance generalize to other computing resources.


Running time & Input size

The running time of an algorithm on a particular input is the number of primitive operations or “steps” executed.

The best notion of "input size" depends on the problem being studied. For example:

In sorting or computing discrete Fourier transforms, the most natural measure is the number of items in the input: for example, the array size n for sorting.

On the other hand, in multiplying two integers, the best measure is the total number of bits needed to represent the input in ordinary binary notation.
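As a quick illustration of the two measures (the concrete values are illustrative only):

```python
# Two natural measures of input size:
items = [5, 2, 4, 7, 1, 3, 2, 6]
n_sort = len(items)        # sorting: number of items, n = 8
x = 1000
n_mult = x.bit_length()    # multiplication: bits in binary notation,
                           # 10 here since 2**9 <= 1000 < 2**10
print(n_sort, n_mult)  # → 8 10
```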


• The running time depends on the input: an already sorted sequence is easier to sort.

• Parameterize the running time by the size of the input, since short sequences are easier to sort than long ones.

• Generally, we seek upper bounds on the running time, because everybody likes a guarantee.


Complexity of Algorithm

The complexity of an algorithm M is the function f(n) which gives the running time and/or storage space requirement of the algorithm in terms of the size n of the input data. Frequently, the storage space required by an algorithm is simply a multiple of the data size n. Accordingly, unless otherwise stated, the term "complexity" shall refer to the running time of an algorithm.


Cases for complexity function

Worst case: the maximum value of f(n) for any possible input.

Average case: the expected value of f(n).

Best case: sometimes we also consider the minimum possible value of f(n), called the best case.


Asymptotic notation

Theta notation (Θ)
Big oh notation (O)
Small oh notation (o)
Omega notation (Ω)
Little omega notation (ω)


Theta notation (Θ)

For a given function g(n), we denote by Θ(g(n)) the set of functions

Θ(g(n)) = {f(n): there exist positive constants c1, c2, and n0 such that 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0}


[Figure: f(n) = Θ(g(n)). The curve f(n) lies between c1·g(n) below and c2·g(n) above for all n ≥ n0.]
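The definition can be checked numerically for a concrete pair of functions. In this hypothetical sketch, f(n) = 3n² + 2n is shown to be Θ(n²) with one valid choice of witness constants, c1 = 3, c2 = 4, n0 = 2:

```python
# Hedged example: f(n) = 3n^2 + 2n is Θ(n^2).
# f(n) <= 4n^2 holds exactly when 2n <= n^2, i.e. n >= 2.
def f(n): return 3 * n * n + 2 * n
def g(n): return n * n

c1, c2, n0 = 3, 4, 2
ok = all(0 <= c1 * g(n) <= f(n) <= c2 * g(n) for n in range(n0, 1000))
print(ok)  # → True
```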


Big oh notation(O)

The Θ-notation asymptotically bounds a function from above and below. When we have only an asymptotic upper bound, we use O-notation.

For a given function g(n), we denote by O(g(n)) the set of functions

O(g(n)) = {f(n): there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n0}


[Figure: f(n) = O(g(n)). The curve f(n) lies on or below c·g(n) for all n ≥ n0.]


Omega notation(Ω)

Just as O-notation provides an asymptotic upper bound on a function, Ω-notation provides an asymptotic lower bound.

For a given function g(n), we denote by Ω(g(n)) the set of functions

Ω(g(n)) = {f(n): there exist positive constants c and n0 such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n0}


[Figure: f(n) = Ω(g(n)). The curve f(n) lies on or above c·g(n) for all n ≥ n0.]


Small oh notation(o)

The asymptotic upper bound provided by O-notation may or may not be asymptotically tight. We use o-notation to denote an upper bound that is not asymptotically tight.

We formally define o(g(n)) as the set

o(g(n)) = {f(n): for any constant c > 0, there exists a constant n0 > 0 such that 0 ≤ f(n) < c·g(n) for all n ≥ n0}
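Note that unlike O, the little-o bound must hold for every constant c > 0, with n0 allowed to depend on c. A hypothetical numeric check that 2n = o(n²):

```python
# 2n < c*n^2 holds exactly when n > 2/c, so any n0 above 2/c works;
# the key point is that n0 depends on the chosen c.
def holds_eventually(c, n_max=10_000):
    n0 = int(2 / c) + 2            # safely above 2/c
    return all(0 <= 2 * n < c * n * n for n in range(n0, n_max))

print(all(holds_eventually(c) for c in (1.0, 0.1, 0.01)))  # → True
```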


Little omega notation (ω)

By analogy, ω-notation is to Ω-notation as o-notation is to O-notation.

We use ω-notation to denote a lower bound that is not asymptotically tight.

It can be defined by: f(n) ∈ ω(g(n)) if and only if g(n) ∈ o(f(n)). Formally, we define ω(g(n)) as the set

ω(g(n)) = {f(n): for any positive constant c > 0, there exists a constant n0 > 0 such that 0 ≤ c·g(n) < f(n) for all n ≥ n0}


Comparison of functions
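One way to compare growth rates is simply to tabulate the common running-time functions for a few input sizes (an illustrative sketch):

```python
import math

# Growth of common running-time functions; 2^n quickly dwarfs the rest.
funcs = {
    "log n": lambda n: math.log2(n),
    "n": lambda n: n,
    "n log n": lambda n: n * math.log2(n),
    "n^2": lambda n: n ** 2,
    "2^n": lambda n: 2 ** n,
}
for n in (10, 20, 30):
    row = {name: round(fn(n)) for name, fn in funcs.items()}
    print(n, row)
```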


Reflexivity

f(n) = Θ(f(n))

f(n) = O(f(n))

f(n) = Ω(f(n))


Symmetry

f(n) = Θ(g(n)) iff g(n) = Θ(f(n))


Transitivity

f(n) = Θ(g(n)) and g(n) = Θ(h(n)) imply f(n) = Θ(h(n))

f(n) = O(g(n)) and g(n) = O(h(n)) imply f(n) = O(h(n))

f(n) = Ω(g(n)) and g(n) = Ω(h(n)) imply f(n) = Ω(h(n))

f(n) = o(g(n)) and g(n) = o(h(n)) imply f(n) = o(h(n))

f(n) = ω(g(n)) and g(n) = ω(h(n)) imply f(n) = ω(h(n))


Transpose symmetry

f(n) =O(g(n)) iff g(n) = Ω(f(n))

f(n) =o(g(n)) iff g(n) = ω(f(n))


INSERTION SORT

INSERTION-SORT (A)                                          cost   times
1  for j ← 2 to length[A]                                   c1     n
2      do key ← A[j]                                        c2     n−1
3         ▷ Insert A[j] into the sorted sequence A[1..j−1]  0      n−1
4         i ← j−1                                           c4     n−1
5         while i > 0 and A[i] > key                        c5     Σ_{j=2..n} tj
6             do A[i+1] ← A[i]                              c6     Σ_{j=2..n} (tj − 1)
7                i ← i−1                                    c7     Σ_{j=2..n} (tj − 1)
8      A[i+1] ← key                                         c8     n−1
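The pseudocode translates directly into Python (0-indexed rather than 1-indexed; a sketch for illustration):

```python
def insertion_sort(A):
    """Insertion sort, following the pseudocode line by line."""
    for j in range(1, len(A)):        # line 1: for j <- 2 to length[A]
        key = A[j]                    # line 2
        i = j - 1                     # line 4
        while i >= 0 and A[i] > key:  # line 5 (i > 0 in 1-indexed terms)
            A[i + 1] = A[i]           # line 6: shift the element right
            i = i - 1                 # line 7
        A[i + 1] = key                # line 8
    return A

print(insertion_sort([5, 2, 4, 6, 1, 3]))  # → [1, 2, 3, 4, 5, 6]
```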


Running time of insertion sort

T(n) = c1·n + c2(n−1) + c4(n−1) + c5·Σ_{j=2..n} tj + c6·Σ_{j=2..n} (tj − 1) + c7·Σ_{j=2..n} (tj − 1) + c8(n−1)


Best case

In insertion sort the best case occurs if the array is already sorted: then in line 5, when i has its initial value of j−1, we find A[i] ≤ key immediately. Thus tj = 1 for all j = 2, 3, …, n,

and the best-case running time is

T(n) = c1·n + c2(n−1) + c4(n−1) + c5(n−1) + c8(n−1) = (c1 + c2 + c4 + c5 + c8)n − (c2 + c4 + c5 + c8)


This running time can be expressed as an + b for constants a and b that depend on the statement costs ci;

it is thus a linear function of n.


Worst case

If the array is in reverse sorted order, then we must compare each element A[j] with each element in the entire sorted subarray A[1..j−1], and so tj = j for j = 2, 3, …, n. Then

Σ_{j=2..n} tj = Σ_{j=2..n} j = n(n+1)/2 − 1

Σ_{j=2..n} (tj − 1) = Σ_{j=2..n} (j − 1) = n(n−1)/2
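The sum Σ_{j=2..n} (tj − 1) counts exactly the element shifts in line 6 of the pseudocode, which can be verified by instrumenting the sort (an illustrative sketch):

```python
def count_shifts(A):
    """Insertion sort that counts executions of line 6 (element shifts)."""
    A = list(A)
    shifts = 0
    for j in range(1, len(A)):
        key = A[j]
        i = j - 1
        while i >= 0 and A[i] > key:
            A[i + 1] = A[i]
            shifts += 1            # one execution of line 6
            i -= 1
        A[i + 1] = key
    return shifts

n = 10
print(count_shifts(range(n, 0, -1)))  # reverse-sorted: n(n-1)/2 = 45
print(count_shifts(range(1, n + 1)))  # already sorted: 0
```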


T(n) = c1·n + c2(n−1) + c4(n−1) + c5(n(n+1)/2 − 1) + c6(n(n−1)/2) + c7(n(n−1)/2) + c8(n−1)

= (c5/2 + c6/2 + c7/2)n² + (c1 + c2 + c4 + c5/2 − c6/2 − c7/2 + c8)n − (c2 + c4 + c5 + c8)

This worst-case running time can be expressed as an² + bn + c.


Worst case and average case

The average case is often as bad as the worst case.

On average, half the elements in A[1..j−1] are less than A[j] and half are greater, so on average we check half the subarray and tj ≈ j/2.

If we work out the resulting average-case running time, it turns out to be a quadratic function of the input size, just like the worst-case running time.


Example of insertion sort


Order of growth

we use some simplifying abstraction to ease our analysis of INSERTION SORT procedure .

first we ignored the actual cost of each

statement using the constant ci to represent these cost .

We really need the worst case running time is an+bn+c for some constant a , b,

and c that depend upon the cost ci .

2


We thus ignored not only the actual statement costs but also the abstract costs ci.

We shall now make one more simplifying abstraction: what really interests us is the rate of growth, or order of growth, of the running time. We therefore consider only the leading term of a formula (e.g., an²). We ignore the lower-order terms and the constant term

because they are insignificant for large n. We also ignore the constant coefficient of the leading term. Thus we write, for example, that insertion sort has a worst-case

running time of Θ(n²).
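To see why only the leading term matters, one can tabulate the share of an² in an² + bn + c for hypothetical constants, here with deliberately large lower-order coefficients:

```python
# Hypothetical worst-case cost an^2 + bn + c; the leading term's share
# of the total approaches 1 as n grows, even with b and c much larger than a.
a, b, c = 1, 100, 1000

for n in (10, 100, 10_000):
    total = a * n**2 + b * n + c
    leading = a * n**2
    print(n, round(leading / total, 3))
```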


Designing Algorithms

There are many ways to design algorithms; here we mainly use two methods.

Incremental approach: Insertion sort uses the incremental approach: having

sorted the subarray A[1..j−1], we insert the single element A[j] into its proper place, yielding the sorted subarray A[1..j].

Divide-and-conquer approach: Many useful algorithms are recursive in

structure. These algorithms typically follow a divide-and-conquer approach: they break the problem into several subproblems that are similar to the original problem but smaller in size, solve the subproblems recursively, and combine these solutions to create a solution to the original problem.


Divide and conquer approach

The divide-and-conquer paradigm involves three steps at each level of recursion:

Divide: divide the problem into a number of subproblems.

Conquer: conquer the subproblems by solving them recursively. If the subproblem sizes are small enough, however, just solve the subproblems in a straightforward manner.

Combine: combine the solutions of the subproblems into the solution for the original problem.

The merge sort algorithm follows the divide-and-conquer paradigm.
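The three steps map directly onto code. A sketch of merge sort in Python (illustrative, using the array <5, 2, 4, 7, 1, 3, 2, 6> from the lecture):

```python
def merge_sort(A):
    """Merge sort, following the three divide-and-conquer steps."""
    if len(A) <= 1:                  # small enough: solve directly
        return A
    mid = len(A) // 2                # Divide: split into two halves
    left = merge_sort(A[:mid])       # Conquer: sort each half recursively
    right = merge_sort(A[mid:])
    return merge(left, right)        # Combine: merge the sorted halves

def merge(left, right):
    """Combine two sorted lists into one sorted list."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])             # append whichever half remains
    out.extend(right[j:])
    return out

print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))  # → [1, 2, 2, 3, 4, 5, 6, 7]
```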


Divide and conquer approach in merge sort

[Figure: operation of merge sort on the array A = <5, 2, 4, 7, 1, 3, 2, 6>, showing the merging steps from the initial sequence to the sorted sequence <1, 2, 2, 3, 4, 5, 6, 7>.]


Analyzing divide and conquer algorithm

When an algorithm contains a recursive call to itself, its running time can often be described by a recurrence equation, which describes the overall running time on a problem in terms of the running time on smaller inputs.

We can use mathematical tools to solve the recurrence equation and provide bounds on the performance of the algorithm.


Let T(n) be the running time on a problem of size n.

If the problem size is small enough, say n ≤ c for some constant c, the straightforward solution takes constant time, Θ(1).

Suppose our division of the problem yields a subproblems, each of which is 1/b the size of the original; we then get a recurrence for T(n).

For merge sort, both a and b are 2.


If we take D(n) time to divide the problem into subproblems and C(n) time to combine the solutions to the subproblems into the solution to the original problem, we get the recurrence

T(n) = { Θ(1)                      if n ≤ c
       { aT(n/b) + D(n) + C(n)     otherwise

Methods for solving recurrences are given in the next slides.
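The recurrence can also be evaluated directly. A hypothetical sketch for merge sort's case (a = b = 2, D(n) + C(n) = n, base case T(1) = 1), compared against n·log2(n) + n:

```python
import math

def T(n):
    """Evaluate T(n) = 2 T(n/2) + n with T(1) = 1, for n a power of two."""
    if n <= 1:
        return 1                  # Θ(1) base case
    return 2 * T(n // 2) + n      # a = b = 2, divide/combine cost n

# For powers of two, T(n) equals n*log2(n) + n exactly.
for n in (2, 8, 64):
    print(n, T(n), int(n * math.log2(n) + n))
```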