
Dr. Naveed Riaz

Advanced Design and Analysis of Algorithms (Overview)

LECTURE: 1


About Me: Education

2001, MCS, Bahria Institute / Peshawar University

2005, M.S. (Software Engineering), NUST

2008, Ph.D. (Software Engineering), Graz University of Technology, Austria

Research Interests

Model-based and Qualitative Reasoning

Theorem Proving

Verification and Validation

Test Pattern Generation


About Me: Work Experience

Permanent Faculty Member, COMSATS IIT Islamabad.

Scientific Officer, PAEC

Visiting Faculty Member, COMSATS, AIOU

Research Assistant, Graz University of Technology, Austria

Senior Scientist, PAEC

Offshore Researcher, Graz University of Technology.

Contact [email protected]

0331-5260536


Course Outline

Introduction to Algorithms and Review of Data Structures

Introduction to Algorithm Analysis

Greedy Algorithms

Sorting Algorithms

Graph and Tree algorithms

Divide and Conquer Algorithms

Recurrences

Heaps

Hashing

String Matching

NP Completeness

Approximation Algorithms


Pragmatics

Pre-requisites: Data Structures and Programming (preferably Java or C++)

Textbook: "Introduction to Algorithms" by Cormen, Leiserson, Rivest and Stein

Class Notes

Assignments: mostly written homework, roughly every two weeks

Quizzes: roughly once every two weeks

One Programming Project


Introduction to Algorithms

A computational problem is a specification of the desired input-output relationship. Example: given any two numbers x and y, find the sum of x and y.

An instance of a problem is all the inputs needed to compute a solution to the problem. Example: what is the sum of 10 and 29?

An algorithm is a well-defined computational procedure that transforms inputs into outputs, achieving the desired input-output relationship.

A correct algorithm halts with the correct output for every input instance. We can then say that the algorithm solves the problem.
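For instance, an algorithm for the sum problem above can be written as a trivial C++ sketch (the function name sum is purely illustrative):

#include <iostream>

// An algorithm for the problem "given two numbers x and y, find their sum":
// a well-defined procedure that transforms any input instance into the desired output.
int sum(int x, int y) {
    return x + y;
}

int main() {
    std::cout << sum(10, 29) << '\n';   // the instance from the slide; prints 39
    return 0;
}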


What is this Course all about?

Solving Computational Problems:

1. Create data structures & algorithms to solve problems.

2. Prove algorithms work. Buggy algorithms are worthless!

3. Examine properties of algorithms: simplicity, running time, space needed, and so on.

Too often, programmers try to solve problems using brute-force techniques and end up with slow, complicated code. A few hours of abstract thought devoted to algorithm design could have sped up the solution substantially and simplified it.


Does Running Time Matter?

There is often a small critical portion of the software, which may involve only a few lines of code, but where the great majority of computational time is spent: 80% of the execution time takes place in 20% of the code.

These sections need to be written in the most efficient manner possible.

Normally, an inefficient algorithm is designed first, and then attempts are made to fine-tune its performance by applying clever coding tricks or by implementing it on the most expensive and fastest machines around to boost performance as much as possible.


Does Running Time Matter? Boeing 777 Project

Virtual Reality System

Large groups of Programmers and best super-computer

System Unacceptably slow

New Programmer Hired

Changed inner loops in programs

System ran twice as fast on his desktop computer as on the supercomputer

Diagnosis Engine, TUGraz

Developed by TU Vienna

Input (A Hardware Design), Generated Conflict sets


Does Running Time Matter?

Generated conflict sets for a single test case of circuit s510.v, using a P-4 (2.8 GHz), in 18 hours.

Joerg Weber and I changed the way conflict sets were accessed.

Using the same test case and computer, the results were produced in 6 minutes and 11 seconds (about 180 times faster).


Analyzing Algorithms

Simplicity: informal; easy to understand, easy to change, etc.

Time efficiency: as a function of its input size, how long does it take?

Space efficiency: as a function of its input size, how much additional space does it use?

Running time: depends on the number of primitive operations (additions, multiplications, comparisons) used to solve the problem, and on the problem instance.


Three Cases of Analysis

Best Case: constraints on the input, other than size, resulting in the fastest possible running time. (Searching for an element in an array?)

Worst Case: constraints on the input, other than size, resulting in the slowest possible running time. (Searching for an element in an array? Searching for an element in a sorted array using binary search?)

Average Case: average running time over every possible type of input (usually involves probabilities of different types of input). (Searching for an element in an array?)
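As a concrete illustration, a simple linear search in C++ (an illustrative sketch; the name linearSearch is not from the slides) exhibits all three cases:

#include <vector>

// Linear search: return the index of key in A, or -1 if it is absent.
// Best case: key sits at A[0], so a single comparison suffices.
// Worst case: key is at the last cell or absent, so all n cells are examined.
// Average case: if the key is equally likely to be anywhere, about n/2 cells are examined.
int linearSearch(const std::vector<int>& A, int key) {
    for (int i = 0; i < static_cast<int>(A.size()); ++i)
        if (A[i] == key) return i;
    return -1;
}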


In this Course We would like to compare efficiencies of different

algorithms for the same problem, instead of different programs or implementations. This removes dependency on machines and programming skill.

It becomes meaningless to measure absolute time since we do not have a particular machine in mind. Instead, we measure the number of steps. We call this the time complexity or running time and denote it by T(n).

We would like to estimate how T(n) varies with the input size n.


Review of Basic Data Structures

Data: plural of datum; raw facts, e.g. "C T A A", 9045-5

Information: the processed form of data, e.g. "A CAT", my bank account #

Structure: the way a thing is organized

Data Structures: ways of organizing data

Array: a finite list of similar data elements, e.g. int A[10] gives A[0], A[1], A[2], ..., A[9]; a matrix is a two-dimensional array


Operations: Traversing, Searching, Inserting, Deleting, Sorting, Merging


Basic Data Structures

Queues

FIFO system

Deletions take place at one end and insertions at the other end

Examples: people waiting in a queue; print jobs on a network printer

Stacks

LIFO system

Insertions and deletions at only one end

Examples: a stack of dishes; converting expressions between infix and postfix form
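A small illustrative C++ sketch (not from the slides) contrasting the two disciplines:

#include <iostream>
#include <queue>
#include <stack>

int main() {
    std::queue<int> q;               // FIFO: insert at one end (back), delete at the other (front)
    std::stack<int> s;               // LIFO: insert and delete at the same end (the top)
    for (int x : {1, 2, 3}) { q.push(x); s.push(x); }

    std::cout << q.front() << '\n';  // 1: the element that has waited longest leaves first
    std::cout << s.top() << '\n';    // 3: the most recently inserted element leaves first
    return 0;
}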


Basic Data Structures

Linked Lists:

Dynamic data structure

Elements are chained together

Each node has an information field and a next-address field

Trees: hierarchical relationships

Examples: family trees, tables of contents

Graphs: consist of a set of vertices and connecting edges

Suitable for data sets where the individual elements are interconnected in complex ways
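As an illustration (not from the slides), a singly linked list node and a traversal might look like this in C++:

#include <iostream>

// A singly linked list node: an information field plus the address of the next node.
struct Node {
    int info;
    Node* next;   // nullptr marks the end of the chain
};

// Traversal: visit each element once by following the next pointers.
void printList(const Node* head) {
    for (const Node* p = head; p != nullptr; p = p->next)
        std::cout << p->info << ' ';
    std::cout << '\n';
}

int main() {
    Node c{3, nullptr}, b{2, &c}, a{1, &b};   // the chain 1 -> 2 -> 3
    printList(&a);                            // prints: 1 2 3
    return 0;
}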


Preliminaries

Floor and ceiling functions

Mod and modulo

Integer part INT(x) and the absolute value function |x|

Summations: a_1 + a_2 + ... + a_n = Σ_{i=1}^{n} a_i

Example: Σ_{j=2}^{5} j^2 = 4 + 9 + 16 + 25 = 54

Σ_{i=1}^{n} i = 1 + 2 + ... + n = n(n+1)/2

Factorial: n! = n(n-1)(n-2)...1, e.g. 4! = 4(3)(2)(1) = 24; or simply n! = n(n-1)!


Preliminaries

Permutations: arrangements of elements in different orders

Example: abc, acb, bac, bca, cab, cba

Exponents:

a^m = (a)(a)...(a), m times

a^(-m) = 1 / a^m

a^(m/n) = (a^m)^(1/n) = (a^(1/n))^m, i.e. the n-th root of a^m

Logarithms: the exponent to which b must be raised to obtain x, i.e. y = log_b x means b^y = x

log_2 8 = 3

log_10 100 = 2

We will use log base 2 (log_2), unless otherwise specified


Rate of Growth of Functions

The criteria for comparing algorithms: space and time.

If n is the size of the input and f(n) is the complexity, which one is better:

log n, n, n log n, n^2, n^3?

8n^3 + 576n^2 + 832n - 248 = O(n^3)

Subalgorithms: (1) functions, (2) always return


Dr. Naveed Riaz

Advanced Design and Analysis of Algorithms (2-D Maxima, Summations, Analyzing programs with loops)

LECTURE: 2


2-Dimensional Maxima Problem

The emphasis in this course will be on the design of efficient algorithms, and hence we will measure algorithms in terms of the amount of computational resources they require.

Car-purchasing problem: you want a fast car, but cost matters as well (difficult to decide). One thing is sure: you definitely do NOT want to consider a car if there is another car that is both faster and cheaper.

2-D Maxima: given a set of points P = {p1, p2, ..., pn} in 2-space, each represented by its x and y integer coordinates, output the set of maximal points of P, that is, those points pi such that pi is not dominated by any other point of P.


2-Dimensional Maxima Problem

Let a point p in 2-dimensional space be given by its integer coordinates, p = (p.x, p.y).

A point p is said to be dominated by a point q if p.x <= q.x and p.y <= q.y.

Given a set of n points P = {p1, p2, ..., pn} in 2-space, a point is said to be maximal if it is not dominated by any other point in P.

Now write the algorithm.


2-Dimensional Maxima Problem

Maxima(int n, Point P[1..n]) {      // output the maximal points of P[1..n]
  for i = 1 to n {
    maximal = true;                 // P[i] is maximal by default
    for j = 1 to n {
      if (i != j and P[i].x <= P[j].x and P[i].y <= P[j].y) {
        maximal = false;            // P[i] is dominated by P[j]
        break;
      }
    }
    if (maximal) output P[i];       // no one dominates P[i]...PRINT
  }
}
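For reference, a directly compilable C++ version of the same brute-force algorithm (the names Point and maxima are illustrative; the pseudocode above is the slide's own form):

#include <iostream>
#include <vector>

struct Point { int x, y; };

// Print every maximal point of P: a point that no other point of P dominates
// in both coordinates. Same O(n^2) brute-force strategy as the pseudocode above.
void maxima(const std::vector<Point>& P) {
    int n = static_cast<int>(P.size());
    for (int i = 0; i < n; ++i) {
        bool maximal = true;                        // P[i] is maximal by default
        for (int j = 0; j < n; ++j) {
            if (i != j && P[i].x <= P[j].x && P[i].y <= P[j].y) {
                maximal = false;                    // P[i] is dominated by P[j]
                break;
            }
        }
        if (maximal) std::cout << '(' << P[i].x << ',' << P[i].y << ")\n";
    }
}

int main() {
    maxima({{2, 3}, {5, 1}, {4, 4}, {1, 5}});       // prints (5,1), (4,4) and (1,5)
    return 0;
}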


2-Dimensional Maxima Problem

Running time?

We go through the outer loop n times.

For each pass through the outer loop, we go through the inner loop n times as well.

The condition in the if-statement makes four accesses to P.

The output statement makes two accesses (to P[i].x and P[i].y) for each point that is output.

In the worst case every point is maximal, so these two accesses are made on each pass through the outer loop.
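Putting those counts into a summation (4 accesses on each of the n inner-loop iterations, plus 2 accesses for the output on each outer-loop iteration) gives, in the worst case,

T(n) = Σ_{i=1}^{n} ( 2 + Σ_{j=1}^{n} 4 ) = Σ_{i=1}^{n} (4n + 2) = 4n^2 + 2n,

which is the total T(n) = 4n^2 + 2n that the asymptotics lecture below starts from.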


Summations

In the last example we were most interested in the growth rate for large values of n; we do not care about constant factors (they depend on the machine).

Now consider another example.


Summations

for i = 1 to n { // assume that n is the input size

...

for j = 1 to 2*i {

...

k = j;

while (k >= 0) {

...

k = k - 1;

}

}

}


Summations

Let I(), M(), T() be the running times for (one full execution of) the inner loop, the middle loop, and the entire program. We work from the inside out.

The number of passes through the inner loop depends on j. It is executed for k = j, j-1, j-2, ..., 0, so I(j) = j + 1.

Now consider the middle loop. Its running time is determined by i: M(i) = Σ_{j=1}^{2i} I(j) = Σ_{j=1}^{2i} (j + 1).

And we know that Σ_{j=1}^{m} j = m(m+1)/2, so M(i) = 2i(2i+1)/2 + 2i = 2i^2 + 3i.


Summations

We get M(i) = 2i^2 + 3i.

Now for the outermost loop we have T(n) = Σ_{i=1}^{n} M(i) = Σ_{i=1}^{n} (2i^2 + 3i), or T(n) = 2 Σ_{i=1}^{n} i^2 + 3 Σ_{i=1}^{n} i.

For n >= 0, Σ_{i=1}^{n} i^2 = (2n^3 + 3n^2 + n)/6.

So T(n) = (2n^3 + 3n^2 + n)/3 + 3n(n+1)/2 = (4n^3 + 15n^2 + 11n)/6, or simply Θ(n^3).


Solving Summations

Do we have to memorize everything? No!

Use crude bounds?

Replace every term in the summation with a simple upper bound.

Works pretty well with relatively slowly growing functions.

It does not give good bounds with faster-growing functions like the exponential 2^i.

Approximate using integrals?

Integration and summation are closely related.

Integration is a continuous form of summation.
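For example (a worked illustration): replacing every term of Σ_{i=1}^{n} i by the largest term gives the crude bound Σ_{i=1}^{n} i <= Σ_{i=1}^{n} n = n^2, which is good enough to conclude O(n^2); approximating the same sum by an integral gives Σ_{i=1}^{n} i ≈ ∫_0^n x dx = n^2/2, much closer to the exact value n(n+1)/2. By contrast, for Σ_{i=1}^{n} 2^i = 2^(n+1) - 2, the crude bound Σ_{i=1}^{n} 2^n = n 2^n overshoots by a factor of roughly n/2, which is why crude bounds are unreliable for fast-growing terms.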


Solving Summations

Using constructive induction?

A nice method whenever we can guess the general form of the summation, but are not sure about the constant factors.

Consider Σ_{i=0}^{n} i^2. We guess that the solution has the form Σ_{i=0}^{n} i^2 = an^3 + bn^2 + cn + d.


Solving Summations

We do not know what a, b, c and d are. We solve it in two steps: first we take a basis case, and then we apply the induction step.

Basis case (n = 0): Σ_{i=0}^{0} i^2 = 0, and the formula gives a(0)^3 + b(0)^2 + c(0) + d = d, so d = 0.

Induction step: assume the formula holds for n - 1. Then

Σ_{i=0}^{n} i^2 = Σ_{i=0}^{n-1} i^2 + n^2 = a(n-1)^3 + b(n-1)^2 + c(n-1) + d + n^2
               = an^3 + (b - 3a + 1)n^2 + (c + 3a - 2b)n + (d - a + b - c).


Solving Summations

Comparing this with an^3 + bn^2 + cn + d term by term, we get the constraints

a = a,  b = b - 3a + 1,  c = c + 3a - 2b,  d = d - a + b - c.

We already know that d = 0 from the basis case. From the second constraint above we can cancel b from both sides, implying that a = 1/3. Combining this with the third constraint we have b = 1/2. Finally, from the last constraint we have c = -a + b = 1/6.

This gives the final formula: Σ_{i=0}^{n} i^2 = n^3/3 + n^2/2 + n/6 = (2n^3 + 3n^2 + n)/6.


Dr. Naveed Riaz

Advanced Design and Analysis of Algorithms (Asymptotics, Divide and Conquer)

LECTURE: 3


Asymptotics

Asymptotic analysis is a method of describing limiting behavior, i.e. asymptotic analysis refers to solving problems approximately.

Θ notation: given any function g(n), we define Θ(g(n)) to be a set of functions that are asymptotically equivalent to g(n), or, put formally:

Θ(g(n)) = { f(n) | there exist positive constants c1, c2, and n0 such that 0 <= c1 g(n) <= f(n) <= c2 g(n) for all n >= n0 }.

T(n) = Θ(n^2) or T(n) ∈ Θ(n^2)? Since Θ(n^2) is a set of functions, membership (∈) is the technically correct notation, although the '=' form is commonly written.

For 2-D Maxima we had T(n) = 4n^2 + 2n; now take g(n) = n^2.

For Θ notation, we must argue that there exist constants c1, c2, and n0 such that 0 <= c1 n^2 <= (4n^2 + 2n) <= c2 n^2 for all n >= n0.

The constraint 0 <= c1 n^2 is no problem, since we will always be dealing with positive n and positive constants.

Now consider c1 n^2 <= 4n^2 + 2n (take c1 = 4):

4n^2 <= 4n^2 + 2n for all n >= 0.


Asymptotics

4n^2 + 2n <= c2 n^2? Take c2 = 6:

4n^2 + 2n <= 6n^2, i.e. 4n^2 + 2n <= 4n^2 + 2n^2, for all n >= 1.

We had two constraints, n >= 1 and n >= 0, so we combine them into one: n >= 1.

Thus 4n^2 + 2n ∈ Θ(n^2).

NOW WE WILL GO INTO THE DETAILS of Θ.

"f(n) ∈ Θ(g(n))" means f(n) and g(n) are asymptotically equivalent. This means that they have essentially the same growth rates for large n.

4n^2, (8n^2 + 2n - 3), (n^2/5 - 10 log n), and n(n - 3) are all asymptotically equivalent.

The freedom in choosing c1 and c2 is essentially saying "the constants do not matter, because you may pick c1 and c2 however you like to satisfy these conditions."

Similarly, n0 can be made as big a constant as we like.


Asymptotics

Consider another example: f(n) = 8n^2 + 2n - 3. It is easily Θ(n^2); now we will prove it.

We need to show two things: first, that f(n) grows asymptotically at least as fast as n^2, and second, that f(n) grows no faster asymptotically than n^2.

Lower bound: recall the definition of Θ notation; we need f(n) >= c1 n^2 for all n >= n0.

f(n) = 8n^2 + 2n - 3 >= 8n^2 - 3          (since 2n >= 0)
     = 7n^2 + (n^2 - 3) >= 7n^2           (provided n^2 - 3 >= 0)

Thus set c1 = 7 and n >= sqrt(3).

Upper bound: we need f(n) <= c2 n^2.

f(n) = 8n^2 + 2n - 3 <= 8n^2 + 2n <= 8n^2 + 2n^2 = 10n^2   (since 2n <= 2n^2 for n >= 1)

Thus c2 = 10 and n >= 1, and we are done (c1 = 7, c2 = 10, n0 = sqrt(3)).


Asymptotics

Difficult, isn't it? Much better to just throw away the constants and the lower-order terms of n, right? Yes, but this is how we can prove it.

Why does the f(n) in our example not belong to Θ(n)? What if we make c2 very large? For the upper bound we would need 8n^2 + 2n - 3 <= c2 n; dividing both sides by n gives 8n + 2 - 3/n <= c2, and the left-hand side goes to infinity as n grows, so it eventually exceeds any constant c2.

Similarly for c1: why does the f(n) in our example not belong to Θ(n^3)? For the lower bound we would need c1 n^3 <= 8n^2 + 2n - 3; dividing both sides by n^3 gives c1 <= 8/n + 2/n^2 - 3/n^3, and the right-hand side goes to 0 as n grows, so c1 would have to be 0. c1 cannot be 0, so this is FALSE.


Asymptotics

Big-O and Ω notations:

Θ notation provides both upper and lower bounds.

Big-O notation provides only an upper bound.

Ω provides only an asymptotic lower bound.

O(g(n)) = { f(n) | there exist positive constants c and n0 such that 0 <= f(n) <= c g(n) for all n >= n0 }

Ω(g(n)) = { f(n) | there exist positive constants c and n0 such that 0 <= c g(n) <= f(n) for all n >= n0 }

f(n) ∈ Θ(g(n)) if and only if f(n) ∈ O(g(n)) and f(n) ∈ Ω(g(n)).

f(n) ∈ O(g(n)) means that f(n) grows asymptotically at the same rate as, or slower than, g(n).

f(n) ∈ Ω(g(n)) means that f(n) grows asymptotically at the same rate as, or faster than, g(n).


Asymptotics

THUS, f(n) = 3n^2 + 4n ∈ Θ(n^2), but it is not in Θ(n) or Θ(n^3). Also, f(n) ∈ O(n^2) and O(n^3) but not O(n). Finally, f(n) ∈ Ω(n^2) and Ω(n) but not Ω(n^3).

For Θ notation: if lim_{n→∞} f(n)/g(n) = c for some constant c with 0 < c < ∞, then f(n) ∈ Θ(g(n)).

For O notation: if lim_{n→∞} f(n)/g(n) = c for some finite constant c >= 0, then f(n) ∈ O(g(n)).

For Ω notation: if lim_{n→∞} f(n)/g(n) = c for some constant c > 0 (possibly c = ∞), then f(n) ∈ Ω(g(n)).

For a polynomial function: let f(n) = 2n^4 - 5n^3 - 2n^2 + 4n - 7. Then f(n) ∈ Θ(n^4):

lim_{n→∞} f(n)/n^4 = lim_{n→∞} (2 - 5/n - 2/n^2 + 4/n^3 - 7/n^4) = 2 - 0 - 0 + 0 - 0 = 2.

THUS, f(n) ∈ Θ(n^4).


Asymptotics

Θ(1): constant time; you can't beat it!

Θ(log n): this is typically the speed at which the most efficient data structures operate for a single access (e.g., inserting a key into a balanced binary tree). It is also the time to find an object in a sorted list of length n by binary search.

Θ(n): this is about the fastest that an algorithm can run, given that you need Ω(n) time just to read in all the data.

Θ(n log n): this is the running time of the best sorting algorithms. Since many problems require sorting the inputs, this is still considered quite efficient.

Θ(n^2), Θ(n^3), ...: polynomial time. These running times are acceptable either when the exponent is small or when the data size is not too large (e.g. n <= 1000).


Asymptotics

Θ(2^n), Θ(3^n), ...: exponential time. This is only acceptable when either (1) you know that your inputs will be of very small size (e.g. n <= 50), or (2) you know that this is a worst-case running time that will rarely occur in practical instances. In case (2), it would be a good idea to try to get a more accurate average-case analysis.

Θ(n!), Θ(n^n): acceptable only for really small inputs (e.g. n <= 20).


Divide and Conquer

Roman politicians: divide your enemies (by getting them to distrust each other) and then conquer them piece by piece.

Algorithm design: take a problem on a large input, break the input into smaller pieces (recursively), solve the problem on each of the small pieces, and then combine the piecewise solutions into a global solution.

Three steps: Divide, Conquer, Combine.

Analyzing the running times of recursive programs is rather tricky (we will study recurrences).


Merge Sort

Divide: split A down the middle into two subsequences, each of size roughly n/2.

Conquer: sort each subsequence (by calling MergeSort recursively on each).

Combine: merge the two sorted subsequences into a single sorted list.

The dividing process ends when we have split the subsequences down to a single item.

The key operation, where all the work is done, is the combine stage, which merges two sorted lists into a single sorted list.


Merge Sort
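The pseudocode figure for this slide is not reproduced here; as a stand-in, a C++ sketch of the standard algorithm, using the same MergeSort(A, p, r) and Merge() interface as the analysis that follows (an illustrative reconstruction, not necessarily the slide's exact code):

#include <iostream>
#include <vector>

// Combine step: merge the two sorted runs A[p..q] and A[q+1..r] into one sorted run A[p..r].
void merge(std::vector<int>& A, int p, int q, int r) {
    std::vector<int> B(A.begin() + p, A.begin() + r + 1);  // scratch copy of A[p..r]
    int i = 0, j = q - p + 1, k = p;                       // i scans the left run, j the right run
    while (i <= q - p && j <= r - p)
        A[k++] = (B[i] <= B[j]) ? B[i++] : B[j++];
    while (i <= q - p) A[k++] = B[i++];                    // leftovers of the left run
    while (j <= r - p) A[k++] = B[j++];                    // leftovers of the right run
}

// Sort A[p..r]: divide in the middle, conquer each half recursively, combine by merging.
void mergeSort(std::vector<int>& A, int p, int r) {
    if (p >= r) return;               // a run of length <= 1 is already sorted
    int q = (p + r) / 2;              // q = floor((p + r) / 2)
    mergeSort(A, p, q);               // left subarray: ceiling(n/2) elements
    mergeSort(A, q + 1, r);           // right subarray: floor(n/2) elements
    merge(A, p, q, r);                // merging takes time proportional to n = r - p + 1
}

int main() {
    std::vector<int> A{5, 2, 4, 7, 1, 3, 2, 6};
    mergeSort(A, 0, static_cast<int>(A.size()) - 1);
    for (int x : A) std::cout << x << ' ';                 // prints 1 2 2 3 4 5 6 7
    std::cout << '\n';
    return 0;
}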


Merge Sort

Tricks for improving:

1. Use two arrays A and B. During the recursion, copy A to B; we save the last copying of data.

2. Rather than dividing down to size 1, divide down to some other size like 20 and then use a brute-force (Θ(n^2)) algorithm, which usually performs OK for small n. We then avoid the deepest recursion; remember that 20^2 is a constant.

Analysis: first consider the procedure Merge(). There are four independent loops, so its running time is n + n + n + n = 4n, or simply Θ(n).

In MergeSort(), if we call MergeSort with a list containing a single element, then the running time is a constant.

When we call MergeSort with a list of length n > 1, e.g. MergeSort(A, p, r) where r - p + 1 = n, the algorithm first computes q = floor((p + r)/2); the left subarray then has ceiling(n/2) elements and the right subarray has floor(n/2) elements, so:


Merge Sort

The right subarray has floor(n/2) elements. So to sort the left subarray we need T(ceiling(n/2)), and for the right subarray we need T(floor(n/2)). Merging the two sorted lists takes n time.

So T(n) = 1 if n = 1, and T(n) = T(ceiling(n/2)) + T(floor(n/2)) + n otherwise.


Dr. Naveed Riaz

Advanced Design and Analysis of Algorithms (Recurrences)

LECTURE: 4


Recurrences

Recurrences and divide and conquer: recall that the basic steps in a divide-and-conquer solution are (1) divide the problem into a small number of subproblems, (2) solve each subproblem recursively, and (3) combine the solutions to the subproblems into a global solution.

We also described MergeSort, a sorting algorithm based on divide and conquer.

It is important to develop mathematical techniques for solving recurrences, either exactly or asymptotically.

We now introduce the notion of a recurrence, that is, a recursively defined function.

Today we will discuss a number of techniques for solving recurrences.


MergeSort Recurrence

For MergeSort we found out:

T(1) = 1 (by the basis.)

T(2) = T(1) + T(1) + 2 = 1 + 1 + 2 = 4

T(3) = T(2) + T(1) + 3 = 4 + 1 + 3 = 8

T(4) = T(2) + T(2) + 4 = 4 + 4 + 4 = 12

…………..

T(8) = T(4) + T(4)+8 = 12+12+8 = 32

………………

T(16) = T(8) + T(8) + 16 = 32 + 32 + 16 = 80

……………..

T(32) = T(16) + T(16) + 32 = 80 + 80 + 32 = 192.


MergeSort Recurrence

Since the recurrence divides by 2 each time, let's consider powers of 2, since the function will behave most regularly for these values.

The new pattern is more interesting:

T(1)/1 = 1    T(8)/8 = 4
T(2)/2 = 2    T(16)/16 = 5
T(4)/4 = 3    T(32)/32 = 6

This suggests that for powers of 2, T(n)/n = (lg n) + 1, or equivalently, T(n) = (n lg n) + n, or simply Θ(n log n).

Eliminating floors and ceilings, we work with T(n) = 1 if n = 1, and T(n) = 2T(n/2) + n otherwise.


MergeSort Recurrence (Induction Method)

Proof by induction:

Because n is limited to powers of 2, we cannot do the usual n to n + 1 proof (if n is a power of 2, n + 1 generally is not).

Claim: for all n >= 1, n a power of 2, T(n) = (n lg n) + n.

Proof: Basis case (n = 1): in this case T(1) = 1 by definition, and the formula gives 1 lg 1 + 1 = 1, which matches.

Induction step: let n > 1, and assume that the formula T(n') = (n' lg n') + n' holds whenever n' < n. We want to prove the formula holds for n itself. To do this, we need to express T(n) in terms of smaller values.

We apply the definition: T(n) = 2T(n/2) + n.

Now, n/2 < n, so we can apply the induction hypothesis, yielding T(n/2) = (n/2) lg(n/2) + (n/2).


MergeSort Recurrence

Substituting this value, we get

T(n) = 2((n/2) lg(n/2) + (n/2)) + n
     = (n lg(n/2) + n) + n
     = n(lg n - lg 2) + 2n
     = (n lg n - n) + 2n
     = n lg n + n.   (Hence proved.)

The above method of "guessing" a solution works fine as long as the recurrence is simple enough that we can come up with a good guess.

The following method, when it works, allows you to convert a recurrence into a summation.

By and large, summations are easier to solve than recurrences (and if nothing else, you can usually approximate them by integrals).


Recurrence (Iteration Method)

Now we will discuss the iteration method: convert recurrences to summations.

We start expanding out the definition until we see a pattern developing.

T(n) = 2T(n/2) + n. This has a recursive term, T(n/2), inside it. Expanding repeatedly we get

T(n) = 2T(n/2) + n                                      // initial expansion, k = 1
     = 2(2T(n/4) + n/2) + n = 4T(n/4) + n + n           // substituting T(n/2), k = 2
     = 4(2T(n/8) + n/4) + n + n = 8T(n/8) + n + n + n   // k = 3
     = 8(2T(n/16) + n/8) + n + n + n                    // k = 4
     = 16T(n/16) + n + n + n + n
     = ...


Recurrence (Iteration Method)

We can see that a pattern is developing:

T(n) = 2^k T(n/2^k) + (n + n + ... + n)   (k times)
     = 2^k T(n/2^k) + kn.

Now we need to get rid of the T() on the right-hand side. We know that T(1) = 1. Thus, let us select k to be a value which forces n/2^k = 1. This means that n = 2^k, implying that k = lg n. Substituting, we get

T(n) = 2^(lg n) T(n/2^(lg n)) + (lg n) n
     = 2^(lg n) T(1) + n lg n = 2^(lg n) + n lg n = n + n lg n.


Recurrence (Iteration Method)

Now consider a more difficult example. We'll make the simplifying assumption here that n is a power of 4.


Recurrence (Iteration Method)

As before, we have the recursive term T(n/4^k) still floating around.

To get rid of it, we recall that we know the value of T(1), and so we set n/4^k = 1, implying that 4^k = n, that is, k = log_4 n.


Recurrence (Iteration Method)

Applying the formula for the geometric series: for x != 1,

Σ_{i=0}^{k-1} x^i = (x^k - 1)/(x - 1).

Substituting the ratio of this recurrence's series for x, simplifying the resulting term, and plugging it back into the expanded recurrence gives the closed form.


Recurrence (Iteration Method)

So the final result (at last!) is:
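For concreteness, here is one worked instance of this pattern; the divisor 4, the use of T(1) = 1, and the geometric series all match the discussion above, but the specific recurrence T(n) = 3T(n/4) + n is an assumed example, not necessarily the one the slides used.

T(n) = 3T(n/4) + n
     = 3(3T(n/16) + n/4) + n = 9T(n/16) + (3/4)n + n
     = 9(3T(n/64) + n/16) + (3/4)n + n = 27T(n/64) + (9/16)n + (3/4)n + n
     = ...
     = 3^k T(n/4^k) + n Σ_{i=0}^{k-1} (3/4)^i.

Setting k = log_4 n (so that n/4^k = 1) and applying the geometric series formula with x = 3/4:

Σ_{i=0}^{k-1} (3/4)^i = ((3/4)^(log_4 n) - 1) / ((3/4) - 1) = 4 (1 - (3/4)^(log_4 n)).

Using 3^(log_4 n) = n^(log_4 3) and (3/4)^(log_4 n) = n^(log_4 3)/n, plugging back gives

T(n) = n^(log_4 3) T(1) + 4n (1 - n^(log_4 3)/n) = 4n - 3 n^(log_4 3) ∈ Θ(n),

since log_4 3 ≈ 0.79 < 1.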


Recurrence (Visualising Recurrences)

A nice way to visualize what is going on in iteration is to describe the recurrence in terms of a tree, where each expansion of the recurrence takes us one level deeper in the tree.

For MergeSort we had T(n) = 2T(n/2) + n. Visualizing this as a tree: the root does n work and has two children, each a recurrence on a subproblem of size n/2, and so on down the tree.


Recurrence (Visualising Recurrences)


Recurrence (Iteration Method)

For the recurrence T(n) = 3T(n/2) + n^2 (revisited with the Master Theorem below), the work at a node for a subproblem of size m is m^2.

For the top level (or 0th level) the work is n^2.

At level 1 we have three nodes whose work is (n/2)^2 each, for a total of 3(n/2)^2. This can be written as n^2 (3/4).

At level 2 the work is 9(n/4)^2, which can be written as n^2 (9/16).

In general, it is easy to extrapolate to see that at level i we have 3^i nodes, each involving (n/2^i)^2 work, for a total of 3^i (n/2^i)^2 = n^2 (3/4)^i.

This leads to the summation T(n) = n^2 Σ_i (3/4)^i, where we have not yet determined the upper limit of the sum (we have not said where the tree bottoms out).


Recurrence (Iteration Method)

If all we wanted was an asymptotic expression, then we would essentially be done at this point.

The summation is a geometric series, and the base (3/4) is less than 1. This means that the series converges to some nonzero constant, so T(n) ∈ Θ(n^2).

But let's go for a more specific result.

The recursion bottoms out when we get down to single items. Since the sizes of the inputs are cut in half at each level, it is not hard to see that the final level is level lg n. So:
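Completing that summation for this tree (a reconstruction, using the T(n) = 3T(n/2) + n^2 recurrence revisited with the Master Theorem below): the levels run from 0 to lg n, so

T(n) = n^2 Σ_{i=0}^{lg n} (3/4)^i <= n^2 Σ_{i=0}^{∞} (3/4)^i = n^2 · 1/(1 - 3/4) = 4n^2,

while the level-0 term alone already gives T(n) >= n^2. Hence T(n) ∈ Θ(n^2), matching the Master Theorem result below.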


Recurrence (Iteration Method)


Recurrence (Master Theorem)

We have already seen that in divide and conquer, the same general type of recurrence keeps popping up: we break a problem into a subproblems, where each subproblem is roughly a factor of 1/b of the original problem size, and the time it takes to do the splitting and combining on an input of size n is Θ(n^k).

In MergeSort, a = 2, b = 2, and k = 1.

If we are only interested in asymptotic notation, we can always come up with a general solution.

Theorem (Simplified Master Theorem): Let a >= 1, b > 1 be constants and let T(n) be the recurrence T(n) = aT(n/b) + n^k, defined for n >= 0. Then

Case 1: if a > b^k, then T(n) ∈ Θ(n^(log_b a));
Case 2: if a = b^k, then T(n) ∈ Θ(n^k log n);
Case 3: if a < b^k, then T(n) ∈ Θ(n^k).


Recurrence (Master Theorem)

The basis case, T(1), can be any constant value.

Using this version of the Master Theorem we can see that in the MergeSort recurrence a = 2, b = 2, and k = 1. Thus a = b^k (2 = 2^1), and so Case 2 applies. From this we have T(n) ∈ Θ(n log n).

In the recurrence above, T(n) = 3T(n/2) + n^2, we have a = 3, b = 2 and k = 2. We have a < b^k (3 < 2^2) in this case, and so Case 3 applies. From this we have T(n) ∈ Θ(n^2).


Recurrence (Master Theorem)

Finally, consider the recurrence T(n) = 4T(n/3) + n. Here a = 4, b = 3 and k = 1, so a > b^k (4 > 3^1) and Case 1 applies, giving T(n) ∈ Θ(n^(log_3 4)).

There are recurrences which cannot be put into this form, for example ones whose non-recursive term is n log n rather than a power n^k (such as T(n) = 2T(n/2) + n log n). Although iteration works fine and gives Θ(n log^2 n), the simplified Master Theorem does not apply.