Rossella Lau Lecture 6, DCO20105, Semester A,2005-6
DCO20105 Data structures and algorithms
Lecture 6: Algorithms and performance analysis
Algorithms, Recursion, Performance analysis, Inline code expansion, Big-O notation
-- By Rossella Lau
Algorithms
General features
Specific input and output descriptions
Clear, simple, straightforward process steps
Step-wise refinement easily translated to a computer program
• Pseudo code or structured English, which is quite similar to a programming language but has no syntax restrictions
• E.g., the resize() in Lecture 1 (slide 10)
Usually, no data structure is specified; the algorithm must terminate, be consistent, and produce correct output efficiently
An example: merge()
// s1 and s2 must be in order
merge(Stream s1, Stream s2, Stream out) {
    while (s1 || s2) {
        if (!s1)            // s1 ends
            copy the rest of s2 to out
        else if (!s2)       // s2 ends
            copy the rest of s1 to out
        else if (s1.front() < s2.front()) {
            out.push_back(s1.front()); s1.pop();
        } else {
            out.push_back(s2.front()); s2.pop();
        }
    }
}
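The merge() pseudo code can be sketched in C++ roughly as follows. This is only an illustration: it assumes the streams are std::queue<int> objects and collects the output in a std::vector<int>; the name mergeSorted is hypothetical and does not come from the lecture.

```cpp
#include <queue>
#include <vector>

// Hypothetical sketch: merge two ascending queues into one ascending vector.
// The queues are passed by value, so the caller's queues are left untouched.
std::vector<int> mergeSorted(std::queue<int> s1, std::queue<int> s2) {
    std::vector<int> out;
    while (!s1.empty() || !s2.empty()) {
        // Take from s1 when s2 has ended, or when both have items
        // and s1's front is the smaller one; otherwise take from s2.
        bool takeFirst = s2.empty() ||
                         (!s1.empty() && s1.front() < s2.front());
        std::queue<int>& src = takeFirst ? s1 : s2;
        out.push_back(src.front());
        src.pop();
    }
    return out;
}
```

Instead of a separate "copy the rest" step, the sketch lets the same loop drain whichever stream is left once the other has ended.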
Recursive Algorithms
An algorithm that makes use of itself as part of the solution; e.g.,
Factorial: n's factorial is the product of all integers between 1 and n. The definition can be written as follows:
    n! = 1                              if n == 0
    n! = n * (n-1) * (n-2) * … * 1      if n > 0
Obviously, the second line can be rewritten as:
    n! = n * (n-1)!                     if n > 0
By using the second definition, we may evaluate 5! as
    5! = 5 * 4!
    4! = 4 * 3!
    3! = 3 * 2!
    2! = 2 * 1!
    1! = 1 * 0!
    0! = 1
Recursive Process
From the evaluation of n's factorial according to the recursive definition, we may see
1. Each evaluation is reduced to a simpler case
2. The reduction is continued until a direct definition is given
3. Each result is substituted back into the previous reduction
    5! = 5 * 24 = 120
    4! = 4 * 6 = 24
    3! = 3 * 2 = 6
    2! = 2 * 1 = 2
    1! = 1 * 1 = 1
    0! = 1
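The recursive definition of the factorial maps almost directly onto code. The following is a minimal sketch; the function name and the choice of unsigned long long (which holds results only up to 20!) are assumptions made for illustration.

```cpp
// Recursive factorial following n! = n * (n-1)!, with 0! = 1 as the
// direct definition that stops the reduction.
// Assumes n <= 20 so that the result fits in unsigned long long.
unsigned long long factorial(unsigned int n) {
    if (n == 0) return 1;          // direct definition
    return n * factorial(n - 1);   // reduce to a simpler case
}
```

Each call waits for the result of the simpler case and then multiplies it by n, which is the substitution step listed above.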
Recursive definition
Reconsider the data structure of a linked list:
    template <class T>
    class Node {
        T item;
        Node<T> *next;
    };
A definition that defines an object in terms of itself is called a recursive definition
A recursive data structure may carry recursive algorithms
Printing a linked list: print the item; print the rest of the list
The Tower of Hanoi Problem
Problem statement (Ford: 3-7)
Given three pegs, A, B, and C and a number of disks of differing diameters. Initially, all the disks are placed on peg A. The problem is to move all disks from peg A to peg C with the following constraints:
1. Disks placed on a peg should follow an order: a larger disk is always below a smaller disk.
2. Only the top disk on any peg may be moved to any other peg
The solution for the problem
The solution (Ford: prg3_3.cpp)
From the solution, it can be seen how the elegance of the recursive approach contributes to problem solving
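For readers without Ford's prg3_3.cpp at hand, the classic recursive solution can be sketched as follows. This is not Ford's program: the function name, the Move type, and the choice to record moves in a vector rather than print them are all assumptions made for illustration.

```cpp
#include <utility>
#include <vector>

// A move is recorded as (source peg, destination peg).
using Move = std::pair<char, char>;

// Move n disks from peg `from` to peg `to`, using `via` as the spare peg.
void hanoi(int n, char from, char to, char via, std::vector<Move>& moves) {
    if (n == 0) return;                     // no disk: nothing to do
    hanoi(n - 1, from, via, to, moves);     // clear the n-1 smaller disks
    moves.push_back(Move(from, to));        // move the largest disk
    hanoi(n - 1, via, to, from, moves);     // put the smaller disks back on top
}
```

For n disks the sketch records 2^n - 1 moves, e.g., 7 moves for 3 disks.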
More recursive functions I
int aFunction(int n, int m) {
    if (!n) return m;
    return aFunction(n - 1, m + 1);
}

int bFunction(int n, int m) {
    if (!n) return 0;
    return bFunction(n - 1, m) + m;
}

int f(int n) {
    if (!n) return 0;
    if (!(n & 1)) return f(n / 2);
    return f(n / 2) + 1;
}
What are aFunction() and bFunction()?
What is the value of f(10)?
More recursive functions II
How would you compare the three solutions for the same simple problem?
int funny(int a, int b) {
    if (a == 1) return b;
    return a & 1 ? funny(a >> 1, b << 1) + b
                 : funny(a >> 1, b << 1);
}
The underlying algorithm of recursion
All recursive algorithms can be rewritten as non-recursive algorithms
Because a function call uses the concept of a stack, non-recursive algorithms can make use of a stack to rewrite the recursive algorithms
Some recursive algorithms don't even need a stack to be rewritten, e.g., the factorial function
Recursive algorithms which must use a stack to be rewritten as non-recursive algorithms are called natural recursive functions, e.g., the Hanoi problem
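As an illustration of a recursive algorithm that needs no stack when rewritten, the factorial can be computed with a plain loop. The function name and the unsigned long long result type (valid up to 20!) are assumptions of this sketch.

```cpp
// Non-recursive factorial: a single accumulator replaces the chain of calls.
// Assumes n <= 20 so that the result fits in unsigned long long.
unsigned long long factorialIterative(unsigned int n) {
    unsigned long long result = 1;          // 0! = 1
    for (unsigned int i = 2; i <= n; ++i)
        result *= i;                        // build up 2 * 3 * ... * n
    return result;
}
```

No stack is needed here because nothing has to be remembered between steps except the running product.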
Efficiency of recursion
A recursive algorithm can always be converted to a non-recursive algorithm
The recursive approach is usually not as efficient as the non-recursive version since additional operations and space are required for function calls
However, sometimes a recursive solution is the most natural and logical way of solving a problem
Conflict between machine efficiency and development efficiency
Usually, if an algorithm is a natural recursive solution, it is not worth a programmer's time to construct a non-recursive solution
However, for recursive algorithms that are used frequently, it may be worthwhile to rewrite them
Performance Analysis
To see if an algorithm is efficient, we measure
• computer execution time (usually more critical)
• memory required (for some cases)
For execution efficiency, complexity is measured by:
• details of an algorithm: number of operations, cost of operations, functions used, etc.
• scalability: how well the algorithm executes when the problem size (the size of data being processed) is increased
Execution time measurement
Use system time to measure program execution time; e.g., the code section in TimeSearch.cpp(v1.0):
    startTime = clock();
    linearSearch(forSearch, SIZE, TARGET);
    endTime = clock();
The difference between startTime and endTime is the execution time
Use system time from the command line to measure program execution time:
    date
    program
    date
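The clock() pattern can be wrapped in a small helper, sketched below. This is not the code from TimeSearch.cpp: linearSearch here is a stand-in written for the example, and timeLinearSearch and its repeats parameter are hypothetical. Repeating the call helps when a single search is too fast for the clock's resolution.

```cpp
#include <cstddef>
#include <ctime>
#include <vector>

// Stand-in linear search: returns the index of target, or -1 if absent.
int linearSearch(const std::vector<int>& data, int target) {
    for (std::size_t i = 0; i < data.size(); ++i)
        if (data[i] == target) return static_cast<int>(i);
    return -1;
}

// Returns the CPU time in seconds used by `repeats` searches.
double timeLinearSearch(const std::vector<int>& data, int target, int repeats) {
    volatile int sink = 0;                  // keeps the calls from being optimized away
    std::clock_t startTime = std::clock();
    for (int r = 0; r < repeats; ++r)
        sink = linearSearch(data, target);
    std::clock_t endTime = std::clock();
    return static_cast<double>(endTime - startTime) / CLOCKS_PER_SEC;
}
```

Dividing the clock difference by CLOCKS_PER_SEC converts clock ticks to seconds; no I/O is performed between the two clock() calls.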
Notes on measuring execution time
Measurement should focus on the algorithm and avoid I/O inside
I/O may cause other system functions to be executed which are not needed every time; e.g., paging
I/O may cause waiting for user input
Usually, the system also runs other applications at the same time
Avoid other applications/programs which require a lot of I/O time running at the same time
Avoid other applications/programs which demand CPU time
Factors of efficiency
Number of operations
Cost of different operations, e.g.,
• +/- is much simpler than * and /
• a constant value is more efficient than a variable's value
• branch statements are more expensive than sequence statements
A function call is more expensive than executing the same statements in line
A system function is usually faster than a user-defined function
Contradiction of execution and development efficiency
Literal values in a statement may be faster than using an identifier but an identifier is more meaningful and maintainable
A function call causes system overhead (saving the program pointer and passing parameters) but is more meaningful and maintainable
System solutions: preprocessor, optimizer, and new features in C++
• Preprocessor: macro substitution can stand in for a value or a function
• Optimizer: some optimization processes reduce the problem
Optimizer
Many compilers have an optimization phase, called an optimizer, which transforms the code into an internal form to make the program more efficient
Popular optimizations:
• common expression substitution
• function inline expansion
  • Instead of doing a call, the function's code is expanded in place of the call to reduce overhead
  • Recursive functions cannot, in general, be fully inline expanded
• redundant code removal
• arithmetic expression optimization
Manual function inline expansion
Other than using an optimizer to perform function inline expansion, C++ provides a new feature to allow a programmer to specify a function which should be inline expanded
E.g., the output of the measurement messages can be rewritten as displayTime() in TimeSearch.cpp(v2.0) to reduce repeated bulky code (for better maintainability)
displayTime() can be further rewritten as the following for better execution efficiency, avoiding the overhead generated by the call:
inline void displayTime(……)
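To show the inline keyword on a complete function, here is a small sketch. It is not the displayTime() from TimeSearch.cpp, whose parameters are not shown here; elapsedSeconds is a hypothetical helper invented for the example.

```cpp
#include <ctime>

// Hypothetical helper: converts a pair of clock() readings to seconds.
// `inline` asks the compiler to expand the body at each call site;
// the compiler may still decide to generate an ordinary call.
inline double elapsedSeconds(std::clock_t startTime, std::clock_t endTime) {
    return static_cast<double>(endTime - startTime) / CLOCKS_PER_SEC;
}
```

The call site still reads like a normal function call, so the code keeps its maintainability while giving the compiler the chance to remove the call overhead.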
Scalability
An algorithm which is efficient for one problem size may not be efficient for a large problem size
E.g., the Traveling Salesman Problem: find the shortest path for a salesman to visit each of n destinations. A brute-force solution involves checking on the order of (n-1)! paths. When n is 100, this is already an astronomical value!
To see if a program/function is scalable, big-O notation is used
Big O-notation
Count each line of code as one unit of execution time. If the computation time for a problem of size n is, e.g., f(n) = 7n^2 + 2n + 8, we simply denote it as O(n^2).
Formal definition: f(n) is O(g(n)) if there exist positive numbers a and b such that f(n) <= a*g(n) holds for all n >= b
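As a worked check of the definition, one possible choice of the constants for f(n) = 7n^2 + 2n + 8 and g(n) = n^2 is a = 18 and b = 1 (other choices work equally well):

```latex
\begin{align*}
f(n) &= 7n^2 + 2n + 8 \\
     &\le 7n^2 + 2n^2 + 8n^2 && \text{for all } n \ge 1 \\
     &= 17n^2 \\
     &< 18n^2 .
\end{align*}
```

The step in the second line uses n <= n^2 and 1 <= n^2 for n >= 1, so f(n) is O(n^2).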
Examples of Big-O analysis
Ford’s exercises: 3.22
• n + 5
• n^2 + 6n + 7
• n^(1/3) + n^(1/2) + 7
• (n^3 + n^2) / (n + 1)
Ford’s exercises: 3.25

bool g(int a[], int n, int k) {
    int i;
    for (i = 0; i < n; i++)
        if (a[i] == k)
            return true;
    return false;
}

void h(int a[], int b[], int n) {
    int i, j;
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            a[i] += b[j];
}
Asymptotic Analysis and Big-O notation
To allow for Big-O measuring, the following asymptotic analysis functions are used for categorizing algorithms:
• O(1) -- constant time
• O(log(n)) -- logarithmic time
• O(n) -- linear time
• O(nlog(n)) -- n-log-n time
• O(n^2) -- quadratic time
• O(n^3) -- cubic time
• O(2^n) -- exponential time
More examples of Big-O notation
Factorial
Linear search (TimeSearch.cpp)
Binary search (TimeSearch.cpp)
Tower of Hanoi
Interpretation of Big O-notation
It is a simplified complexity measurement notation
It is a simple way to see the relationship between the growth of the execution time and the growth of the problem size
It keeps only the fastest-growing term of f(n), the execution time as a function of the problem size, and ignores all coefficients and constants (treating them as 1)
Typical meanings of Big O-notation
Algorithms with O(1) are ideal
Algorithms with O(f(n)) are near ideal when f(n) < n
Algorithms with O(f(n)) are acceptable when f(n) < n^c
Algorithms with O(f(n)) are impractical for large n when f(n) > c^n; the known algorithms for NP-complete and NP-hard problems typically fall into this class
c is a small constant (greater than 1)
Typical Growth of Various f(n) with n.
Table 1.2 in Smith’s reference
f(n)      n=3     n=10     n=30     n=100     n=300        n=1000
lg n      1.6     3.3      4.9      6.6       8.2          10
n         3       10       30       100       300          1000
n lg n    4.8     33       147      664       2469         9966
n^2       9       100      900      10000     90000        10^6
n^3       27      1000     27000    10^6      2.7*10^7     10^9
2^n       8       1024     10^9     10^30     10^90        10^300
10^n      1000    10^10    10^30    10^100    10^300       10^1000
Performance evaluation
Linear search: O(n)
• A measure of t(na = 10,000) = 1.5 seconds
• t(nb = 5,120,000) = t(512 na) = (512 na / na) * t(na) = 512 * 1.5 = 768 seconds
Binary search: O(log2 n)
• A measure of t(na = 100,000) = 0.0001 second
• t(nb = 100,000,000) = t(1000 na) = (log 100,000,000 / log 100,000) * t(na) = 1.6 * 0.0001, about 0.0002 second
Tower of Hanoi: O(2^n)
• A measure of t(na = 10) = 0.10 second
• Evaluation: t(nb = 15): 2^15 / 2^10 = 32, so 32 * 0.1 = 3.2 seconds
• t(nc = 20): 2^20 / 2^10 = 1024, so 1024 * 0.1 = about 102.4 seconds
• An actual run: t(10) = 0.10 sec, t(15) = 2.8 sec, t(20) = 126.78 sec
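The three extrapolations follow one pattern: scale the measured time by the ratio g(nb) / g(na) of the growth function. A minimal sketch, with hypothetical function names chosen for this example:

```cpp
#include <cmath>

// Each function predicts t(nB) from a measured t(nA) = tA,
// assuming the running time is proportional to the growth function.

// O(n): time scales with nB / nA.
double predictLinear(double tA, double nA, double nB) {
    return tA * (nB / nA);
}

// O(log n): time scales with log(nB) / log(nA); the base cancels out.
double predictLogarithmic(double tA, double nA, double nB) {
    return tA * (std::log(nB) / std::log(nA));
}

// O(2^n): time scales with 2^nB / 2^nA = 2^(nB - nA).
double predictExponential(double tA, double nA, double nB) {
    return tA * std::pow(2.0, nB - nA);
}
```

With the measurements above, predictLinear(1.5, 10000, 5120000) gives 768 seconds and predictExponential(0.1, 10, 20) gives 102.4 seconds; the gap between the latter and the actual 126.78 seconds shows that such predictions are estimates only.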
Big-O for some basic functions
Vector
• push_back(), searchBinary()
• push_front(), insert(), searchLinear()
Linked List
• push_back(), push_front(), Search() (only linear search)
• insert()
Big-O for bookShop
Big-O notation for inserting n items with option 4 (add/modify)
More on performance evaluation
Execution time: O(n^2)
If t(na) = a, then t(10 na) = 10^2 * a = 100a
template <class T>
void List<T>::printTail() {
    for (Node<T> *ptr = head; ptr; ptr = ptr->next) {
        cout << ptr->item;
        if (ptr->next)
            for (Node<T> *curr = ptr->next; curr; curr = curr->next)
                cout << " " << curr->item;
        cout << endl;
    }
}
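To see where the quadratic growth of printTail() comes from, the number of items it writes can be counted without building a list. The helper below is hypothetical and mirrors only the loop structure, under the assumption that the list has n nodes.

```cpp
// Hypothetical helper: counts how many items printTail() would write
// for a list of n nodes, using the same nested-loop shape.
long long countPrintTailOutputs(long long n) {
    long long outputs = 0;
    for (long long i = 0; i < n; ++i) {         // outer loop: one line per node
        ++outputs;                              // the node's own item
        for (long long j = i + 1; j < n; ++j)   // inner loop: every node after it
            ++outputs;
    }
    return outputs;
}
```

The count is n(n+1)/2, e.g., 55 for n = 10 and 5050 for n = 100, so a tenfold increase in n approaches a hundredfold increase in work as n grows.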
Summary
Algorithms should clearly specify how a solution solves a problem
Recursive algorithms exhibit elegant solutions
The solution for Tower of Hanoi is a typical natural recursive algorithm
Program efficiency is usually measured by execution time and memory space
Inline expansion is one of the solutions to solve the development and efficiency contradiction
Big-O notation is a popular method to measure and evaluate the performance of an algorithm
Reference
Ford: 3.3-4, 3.6-7
Data Structures: Form and Function by Harry Smith, Harcourt Brace Jovanovich, 1987
STL online references http://www.sgi.com/tech/stl http://www.cppreference.com/
Example programs: TimeSearch.cpp(v2.0), Ford: prg3_3.cpp
-- END --