prog-assign-3-explained (1).pdf

  • 8/11/2019 prog-assign-3-explained (1).pdf

    Data Structure & Algorithms

    CS210A,ESO207A,ESO211

    Semester I, 2012-13, CSE, IIT Kanpur

    Programming Assignment III

    Deadline : 11:55 PM on 19th October.

    Some guidelines for evaluating the efficiency of an algorithm experimentally:

    1. This programming assignment consists of two problems. The problems involve algorithms for multiplication of two integers and sorting.

    2. The aim of this assignment is to analyze the experimental behavior, especially the efficiency, of various algorithms for a given problem.

    3. You will have to design modules to generate random input instances for each of these problems and test your algorithms on these inputs.

    4. You will have to bring a one- or two-page report for each of these problems at the time of the demo/viva.

    5. You must be able to demonstrate that the running time of an algorithm on an input during the demo is nearly the same as the running time stated in your report (plots).

    6. Verifying whether one has implemented an algorithm correctly is a very important task. What steps will you take to do this verification? Instead of the theoretical approach (loop invariants and induction), there are sometimes simple engineering methods which are very effective for this purpose. Here is a sketch of one such method. Suppose there are two algorithms A and B for solving a given problem. Let A be much simpler to implement than B, and let the implementation of B be very complex due to the large number of procedures it employs. Furthermore, suppose the ideas underlying A and B are quite different. In such cases, a very effective (though not 100% foolproof) way to verify the correctness of the implementation of B is to check whether the outputs of algorithms A and B match on a large sample of input instances. If A is a much simpler algorithm, it is highly unlikely that one would make a mistake in implementing A. Therefore, it is extremely unlikely that the implementation of B is incorrect if both A and B give the same output for a large sample of inputs.

    7. Ideally, one should run an algorithm for input sizes from a sufficiently large range. Furthermore, for a given input size n, one should run the algorithm on many instances of size n. Sometimes, just computing the average value of the time taken by the algorithm is not sufficient; one should also try to find out the deviation in the running time, especially if the running time of the algorithm varies across different instances of the same size (for example, quick sort).

    8. Instead of plotting running time T(n) versus input size n, would it be better to plot log(T(n)) versus log n?

    9. In order to avoid the variation of the running time due to external sources (the operating system), one might try counting the number of basic operations executed during the algorithm. For example, during sorting, comparison is one basic operation; so while comparing two sorting algorithms, one may focus on counting the number of comparisons. This will also help you determine the exact value of the constant in the big O notation.

    10. While implementing an algorithm, one must strive for the most efficient implementation. Otherwise, the hidden constant in the big O notation may become so large that one fails to notice the efficiency of the algorithm in practice.
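Guidelines 6 to 8 can be sketched as a small experimental harness. This is only a minimal sketch under stated assumptions, not a prescribed solution: the helper names (`cross_check`, `time_stats`, `loglog_slope`, `random_list`) are hypothetical, Python's built-in `sorted` plays the role of the simple algorithm A, and a hand-written insertion sort stands in for the implementation B being verified.

```python
import math
import random
import statistics
import time


def cross_check(simple, complex_, make_input, trials=1000, seed=0):
    """Guideline 6: compare a trusted simple implementation with a complex one.

    Returns the first input on which the outputs differ, or None if all agree.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        instance = make_input(rng)
        # Pass copies so that an in-place algorithm cannot affect the other run.
        if simple(list(instance)) != complex_(list(instance)):
            return instance
    return None


def time_stats(algorithm, make_input, repeats=20, seed=0):
    """Guideline 7: mean and standard deviation of the running time in seconds."""
    rng = random.Random(seed)
    samples = []
    for _ in range(repeats):
        instance = make_input(rng)
        start = time.perf_counter()
        algorithm(instance)
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.stdev(samples)


def loglog_slope(sizes, costs):
    """Guideline 8: least-squares slope of log(cost) against log(size).

    If cost is roughly c * n**k, the slope estimates the exponent k.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in costs]
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def insertion_sort(values):
    """Stand-in for 'algorithm B' in this demo (an assumption, not the assignment's B)."""
    result = list(values)
    for i in range(1, len(result)):
        key = result[i]
        j = i - 1
        while j >= 0 and result[j] > key:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = key
    return result


def random_list(rng, n=50):
    """Random input instance: n integers drawn uniformly from 0..999."""
    return [rng.randrange(1000) for _ in range(n)]
```

A call such as `cross_check(sorted, insertion_sort, random_list)` returns `None` when no mismatch is found, and otherwise returns a failing input that can be used for debugging. On a log-log plot a running time of the form c * n^k appears as a straight line of slope k, which is why `loglog_slope` can serve as an estimate of the exponent.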

    1 Divide and Conquer algorithm for multiplying integers

    Let x and y be any two binary numbers consisting of n bits each. The trivial algorithm to multiply these numbers takes O(n^2) time. We discussed a divide and conquer based algorithm which takes O(n^{log_2 3}) = O(n^{1.59}) time. You have to implement these two algorithms and compare their performance experimentally. More specifically, you have to achieve the following objectives.

    • Verify whether the experimentally observed running time of each algorithm matches its theoretical time complexity. In particular, try to estimate experimentally the value of the exponent of n for each algorithm. How close are your estimates to the theoretical values (n^2 and n^{1.59})?

    • For the implementation of each of these algorithms, find the largest value of n such that two numbers of n bits can be multiplied in 1 minute. The larger the value of n for which you get the result within 1 minute, the better the implementation of the algorithm. Do the same for 2 minutes, 4 minutes, and 10 minutes.

    • What useful lessons do you learn from this experimentation? You may state them briefly in the report, but you should be able to provide a detailed explanation and reasoning for these lessons to the TA during the viva.
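As an illustration of the first objective, the sketch below counts single-bit products (the basic operation, in the spirit of guideline 9) for the trivial algorithm and for a three-multiplication divide and conquer scheme, commonly known as Karatsuba's algorithm, which is presumably the one discussed in the lectures. Several things here are assumptions made for the sketch: the function names are hypothetical, n is taken to be a power of two, the middle term is computed with the subtractive variant so that every recursive call stays within n/2 bits, and the additions and shifts use Python's built-in integers. For that last reason the wall-clock time of this sketch would not reflect bit-level cost, and only the operation counts are meaningful.

```python
import math
import random


def schoolbook(x, y, n, counter):
    """Trivial multiplication: one single-bit product for every pair of bits."""
    result = 0
    for i in range(n):
        for j in range(n):
            counter[0] += 1
            result += (((x >> i) & 1) & ((y >> j) & 1)) << (i + j)
    return result


def karatsuba(x, y, n, counter):
    """Multiply non-negative x, y of at most n bits, where n is a power of two.

    counter[0] is incremented once per single-bit product.
    """
    if n == 1:
        counter[0] += 1
        return x & y
    half = n // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask
    y_hi, y_lo = y >> half, y & mask
    hi = karatsuba(x_hi, y_hi, half, counter)
    lo = karatsuba(x_lo, y_lo, half, counter)
    # Subtractive variant: |x_hi - x_lo| and |y_hi - y_lo| still fit in half bits.
    dx, dy = x_hi - x_lo, y_hi - y_lo
    cross = karatsuba(abs(dx), abs(dy), half, counter)
    if (dx < 0) != (dy < 0):
        cross = -cross
    # x_hi*y_lo + x_lo*y_hi = hi + lo - (x_hi - x_lo)*(y_hi - y_lo)
    middle = hi + lo - cross
    return (hi << n) + (middle << half) + lo


def count_products(multiply, n, seed=0):
    """Single-bit products used by `multiply` on one random pair of n-bit numbers."""
    rng = random.Random(seed)
    x, y = rng.getrandbits(n), rng.getrandbits(n)
    counter = [0]
    # Cross-check against Python's built-in multiplication (guideline 6).
    assert multiply(x, y, n, counter) == x * y
    return counter[0]


def estimated_exponent(multiply, n):
    """Exponent k estimated from the count ratio when n doubles: log2(T(2n)/T(n))."""
    return math.log2(count_products(multiply, 2 * n) / count_products(multiply, n))
```

Since doubling n multiplies a cost of the form c * n^k by 2^k, the base-2 logarithm of the ratio of counts at 2n and n gives an estimate of k. The same doubling idea can be applied to measured running times, although those will be noisier than operation counts.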

    2 Quick sort: the most preferred sorting algorithm in practice!

    We all know that the worst case time complexity of quick sort is O(n^2). But quick sort is still the most preferred sorting algorithm in practice. It outperforms your favorite merge sort (and even heap sort) algorithm in real life applications. Also note that, unlike merge sort which requires O(n) extra space, quick sort partitions the array in place and requires only O(1) extra space apart from its recursion stack. We shall prove in some lecture that the average running time of quick sort is O(n log n). In particular, the average number of comparisons during quick sort is 2n log_e n + O(n). Here the average number of comparisons (or average running time) means the number of comparisons performed during quick sort averaged over all n! permutations of the input. Interestingly, what makes quick sort practically the most efficient sorting algorithm is not just its average O(n log n) running time. In fact, as the input size increases, the variation in the running time of quick sort reduces drastically. Let me explain this in a bit more detail. Let n_0 be some positive integer and c > 1. Suppose that on executing the quick sort algorithm on many random instances of size n_0, the number of comparisons exceeds c times the average number of comparisons for 10% of these instances. Then, if you double the value of n_0, the fraction of instances for which the number of comparisons exceeds c times the average number of comparisons would be much smaller than 10% (in fact, if n is sufficiently large, it may be less than 0.01%). This behavior is captured informally by saying that the running time of quick sort gets more and more concentrated around its average running time as the value of n increases. In other words, the behavior of quick sort becomes closer to that of a deterministic (instead of just average) O(n log n) time algorithm as the value of n increases. Since one is concerned about the time complexity for large values of n, this property of quick sort makes it practically the most efficient algorithm. In order to get a theoretical explanation of this wonderful property of quick sort, take a course on randomized algorithms in the future.
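The concentration property described above can be explored with a sketch along the following lines. It is one possible setup under stated assumptions, not the required design: the function names are hypothetical, the pivot is the last element of each subarray (Lomuto partition), an explicit stack replaces recursion, the inputs are uniformly random permutations, and the average is the sample mean over the trials rather than the exact average over all n! permutations.

```python
import random
import statistics


def quicksort_comparisons(values):
    """Quick sort a copy of `values`; return (sorted list, number of comparisons)."""
    a = list(values)
    comparisons = 0
    stack = [(0, len(a) - 1)]  # explicit stack of (lo, hi) subarrays still to sort
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        pivot = a[hi]  # assumption: last element as pivot
        i = lo
        for j in range(lo, hi):
            comparisons += 1  # one comparison of an element with the pivot
            if a[j] < pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi] = a[hi], a[i]
        stack.append((lo, i - 1))
        stack.append((i + 1, hi))
    return a, comparisons


def tail_fraction(n, c, trials=2000, seed=0):
    """Fraction of random permutations of size n that need more than
    c times the sample-mean number of comparisons."""
    rng = random.Random(seed)
    counts = []
    for _ in range(trials):
        perm = list(range(n))
        rng.shuffle(perm)
        counts.append(quicksort_comparisons(perm)[1])
    mean = statistics.mean(counts)
    return sum(1 for k in counts if k > c * mean) / trials
```

Evaluating `tail_fraction(n, c)` for a fixed c (say 1.2) and for n = n_0, 2n_0, 4n_0, and so on gives the sequence of fractions whose decrease the paragraph above predicts. The number of trials limits how small a fraction can be resolved, so very small tail probabilities need correspondingly many trials.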

    You have to implement quick sort and merge sort and compare their performance experimentally. For quick sort, you have to demonstrate experimentally that its running time becomes more and more concentrated around its O(n log n) average as the value of n increases.

    Last note: Though the experimental guidelines and the objectives have been provided above in great detail, there are still a few points which you will have to think over in order to make these experiments really successful. These missing points require no extraordinary abilities. They require a patient and attentive mind and an enthusiastic spirit to do this assignment. I sincerely hope that many of you will surprise me with your innovative analytical and engineering skills during the demo. I would personally take the demo of a sample of students.
