Vishnu Kotrajaras, PhD. 1
Data Structures
Introduction
Why study data structures?
Can understand more code.
Can choose a correct data structure for any task.
Example, storing 5 numbers
Linked list: 1 -> 2 -> 3 -> 4 -> 5
Tree (Binary Search Tree): root 2, with left child 1 and right child 4; 4 has children 3 and 5.
Choosing how to store
Heap: root 5, with children 4 and 3; 4 has children 2 and 1.
If we want to always retrieve a maximum value, heap is the best for that.
Estimating the program speed
Big O
$T(N) = O(f(N))$ if $T(N) \le c \cdot f(N)$ for all $N \ge N_0$, where $c$ and $N_0$ are constants.
This tells us how the program's running time grows.
BIG O example
If $T(N) = 339N$ and $f(N) = N^2$:
Let $N_0 = 339$ and $c = 1$.
Then $339N \le 1 \cdot N^2$ for all $N \ge N_0$, that is, $T(N) \le c \cdot f(N)$.
Therefore $T(N) = O(f(N)) = O(N^2)$.
-> There are other possible answers. If we let f(N) = 340N, we will have
T(N) = 339N <= 1*(340N) = c*f(N) -> This also fits the definition.
Therefore T(N) = O(340N) is also correct.
BIG O example (cont.)
Therefore T(N) = O(N) is also correct, because a constant factor such as 340 can be absorbed into c.
Which one should we use as an answer? Normally, we choose the smallest one. Therefore O(N) is our answer.
How does it connect to a program's speed? Please read on.
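The definition can be spot-checked numerically. The following is a minimal sketch, assuming a finite range of N is enough for illustration (it is not a proof); the class name `BigOCheck`, the method `holdsUpTo`, and the upper limit are hypothetical names chosen here, not part of the slides.

```java
import java.util.function.LongUnaryOperator;

// Hypothetical helper: tests T(N) <= c * f(N) for every N in [n0, limit].
class BigOCheck {
    static boolean holdsUpTo(LongUnaryOperator t, LongUnaryOperator f,
                             long c, long n0, long limit) {
        for (long n = n0; n <= limit; n++) {
            if (t.applyAsLong(n) > c * f.applyAsLong(n)) {
                return false;   // found an N that violates the bound
            }
        }
        return true;
    }

    public static void main(String[] args) {
        LongUnaryOperator t = n -> 339 * n;                              // T(N) = 339N
        System.out.println(holdsUpTo(t, n -> n * n, 1, 339, 100_000));   // f(N) = N^2, N0 = 339
        System.out.println(holdsUpTo(t, n -> 340 * n, 1, 1, 100_000));   // f(N) = 340N, N0 = 1
        System.out.println(holdsUpTo(t, n -> n * n, 1, 1, 100_000));     // N0 = 1 is too small for N^2
    }
}
```

The third call shows why N0 matters: with c = 1, the bound 339N <= N^2 only starts to hold at N = 339.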
Find the speed of the following code
int sigmaOfSquare(int n) {          // calculate the sum of i*i for i = 1..n
    int tempSum;                    // 1 unit (declare only)
    tempSum = 0;                    // 1 unit (assignment)
    for (int i = 1; i <= n; i++)    // 1 unit (int i = 1), n+1 units (i <= n), n units (i++)
        tempSum += i * i;           // multiply, add, and assignment: n times each, so 3n units
    return tempSum;                 // 1 unit (return)
}
Total time is 1 + 1 + (1 + (n+1) + n) + 3n + 1 = 5n+5 units.
But counting in this much detail is impractical.
It is better to use an approximation of the running time. That is Big O.
From the example, the time can be estimated from the loop (the other running times become insignificant).
The loop is performed n times. Therefore, Big O = O(n).
The detailed time is 5n+5, which matches O(n) -> (5n+5 <= 6n for n >= 5).
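The unit count can be reproduced by instrumenting the loop. This is a rough sketch that assumes the cost model used above (one unit per declaration, assignment, test, increment, multiply, add, and return); `StepCount` and `unitsForSigmaOfSquare` are illustrative names introduced here.

```java
// Hypothetical instrumented version of sigmaOfSquare that returns the unit count.
class StepCount {
    static long unitsForSigmaOfSquare(int n) {
        long units = 0;
        units += 1;                 // int tempSum;
        units += 1;                 // tempSum = 0;
        units += 1;                 // int i = 1
        int tempSum = 0;
        for (int i = 1; ; i++) {
            units += 1;             // one test of i <= n (n+1 tests in total)
            if (i > n) {
                break;
            }
            tempSum += i * i;
            units += 3;             // multiply, add, assignment
            units += 1;             // i++
        }
        units += 1;                 // return tempSum;
        return units;
    }

    public static void main(String[] args) {
        for (int n : new int[] { 1, 10, 100 }) {
            System.out.println("n = " + n + ": counted " + unitsForSigmaOfSquare(n)
                    + ", formula 5n+5 = " + (5 * n + 5));
        }
    }
}
```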
Finding BIG O from various loops
For loop -> Its Big O is the number of repetitions (times the cost of the loop body).
Nested loop
for (i = 1; i <= n; i++)          // n times
    for (j = 1; j <= n; j++)      // n times
        statements;
Big O is O(n^2).
Finding BIG O from various loops (cont.)
Here is the Big O for nested loops:
If T1(N) = O(f(N)) and T2(N) = O(g(N)), then T1(N) * T2(N) = O(f(N) * g(N)).
From the last page -> f(n) = g(n) = n. Therefore they multiply to O(n^2).
Finding BIG O from various loops (cont2.)
Consecutive statements
for (i = 0; i <= n; i++)              // O(n)
    statement1;
for (j = 0; j <= n; j++)              // O(n^2)
    for (k = 0; k <= n; k++)
        statement2;
The answer is their max. -> O(n^2)
Finding BIG O from various loops (cont3.)
Big O rule for consecutive statements:
If T1(N) = O(f(N)) and T2(N) = O(g(N)), then T1(N) + T2(N) = O(max(f(N), g(N))).
From the last page -> f(n) = n, g(n) = n^2.
The answer is therefore O(n^2).
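Counting how often each statement runs makes the max rule concrete. The sketch below assumes the same loop bounds as the consecutive-statements example (0 to n inclusive); `LoopCount` and `counts` are names made up for this illustration.

```java
// Hypothetical counter for the consecutive-statements example.
class LoopCount {
    // Returns { executions of statement1, executions of statement2 }.
    static long[] counts(int n) {
        long first = 0;
        long second = 0;
        for (int i = 0; i <= n; i++) {
            first++;                      // statement1 runs n+1 times
        }
        for (int j = 0; j <= n; j++) {
            for (int k = 0; k <= n; k++) {
                second++;                 // statement2 runs (n+1)^2 times
            }
        }
        return new long[] { first, second };
    }

    public static void main(String[] args) {
        for (int n : new int[] { 10, 100, 1000 }) {
            long[] c = counts(n);
            System.out.println("n = " + n + ": " + c[0] + " + " + c[1]
                    + " = " + (c[0] + c[1]));
        }
    }
}
```

As n grows, the second count dominates the total, which is why the sum is O(n^2).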
Finding BIG O from various loops (cont4.)
Conditional statement
if (condition)
    statement1;     // O(f(n))
else
    statement2;     // O(g(n))
Use the max -> O(max(f(n), g(n))), plus the time to evaluate the condition.
Finding BIG O from recursion
int mymethod(int n) {
    if (n == 1) {
        return 1;
    } else {
        return 2 * mymethod(n - 1) + 1;
    }
}
The method is called n times (for n >= 1) and each call does constant work, so big O = O(n).
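A call counter confirms the number of invocations. This is a small sketch; the `calls` field and the class name `RecursionCount` are additions for illustration only.

```java
// Hypothetical wrapper around mymethod that counts how many calls are made.
class RecursionCount {
    static int calls = 0;

    static int mymethod(int n) {
        calls++;                              // one more invocation
        if (n == 1) {
            return 1;
        } else {
            return 2 * mymethod(n - 1) + 1;   // exactly one recursive call
        }
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            calls = 0;
            int result = mymethod(n);
            System.out.println("mymethod(" + n + ") = " + result + ", calls = " + calls);
        }
    }
}
```

Because each call makes only one recursive call on n - 1, the number of calls grows linearly even though the returned value (2^n - 1) grows exponentially.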
Maximum Subsequence Sum, choosing the best Big O
Maximum Subsequence Sum is:
For integers $A_1, A_2, \ldots, A_n$, the maximum subsequence sum is the maximum value of $\sum_{k=i}^{j} A_k$. The subsequence must be consecutive; it is the consecutive run that gives the highest total.
Example: -2, 11, -6, 16, -5, 7
The sum of 11, -6, 16 is 21. But the max sequence is 11, -6, 16, -5, 7 -> the sum is 23.
23 is the maximum subsequence sum.
Solving max sub sum: 1st method
int maxSubSum01(int[] a) {
    int maxSum = 0;
    for (int i = 0; i < a.length; i++) {          // first index
        for (int j = i; j < a.length; j++) {      // last index
            int theSum = 0;
            for (int k = i; k <= j; k++) {        // sum from first to last
                theSum += a[k];
            }
            if (theSum > maxSum) {                // choose to store max value
                maxSum = theSum;
            }
        }
    }
    return maxSum;
}
Solving max sub sum: 1st method (cont.)
This first method has big O = O(n^3).
Not good enough. Too many redundant calculations. If we have added elements from index 0 to 2, when we add elements from index 0 to 3, we should not start the addition from scratch.
Solving max sub sum: 2nd method
int maxSubSum02(int[] a) {
    int maxSum = 0;
    for (int i = 0; i < a.length; i++) {          // starting position
        int theSum = 0;
        for (int j = i; j < a.length; j++) {
            theSum += a[j];                       // do the addition from the starting position
            if (theSum > maxSum) {
                maxSum = theSum;                  // collect the result
            }
        }
    }
    return maxSum;
}
BIG O = O(n^2)
Solving max sub sum: 2nd method (cont.)
Array: -2 11 -6 4
when i=0, j=0: theSum = -2. maxSum = 0.
when i=0, j=1: theSum = -2 + 11 = 9. maxSum becomes 9.
when i=0, j=2: theSum = 9 + (-6) = 3. maxSum is still 9.
when i=0, j=3: theSum = 3 + 4 = 7. maxSum is still 9.
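To see how much work the 2nd method saves, both methods can be run with a counter on the addition step. The sketch below is an illustration only: the class `MaxSubSumCompare`, the method names `cubic` and `quadratic`, and the `additions` counter are assumptions introduced here, and `Math.max` stands in for the explicit `if` used in the slides.

```java
// Hypothetical side-by-side comparison of the 1st and 2nd methods.
class MaxSubSumCompare {
    static long additions;   // counts executions of "theSum += ..."

    // 1st method: recompute every sum from scratch.
    static int cubic(int[] a) {
        int maxSum = 0;
        for (int i = 0; i < a.length; i++) {
            for (int j = i; j < a.length; j++) {
                int theSum = 0;
                for (int k = i; k <= j; k++) {
                    theSum += a[k];
                    additions++;
                }
                maxSum = Math.max(maxSum, theSum);
            }
        }
        return maxSum;
    }

    // 2nd method: extend the running sum from each starting position.
    static int quadratic(int[] a) {
        int maxSum = 0;
        for (int i = 0; i < a.length; i++) {
            int theSum = 0;
            for (int j = i; j < a.length; j++) {
                theSum += a[j];
                additions++;
                maxSum = Math.max(maxSum, theSum);
            }
        }
        return maxSum;
    }

    public static void main(String[] args) {
        int[] a = { -2, 11, -6, 16, -5, 7 };
        additions = 0;
        int r1 = cubic(a);
        long c1 = additions;
        additions = 0;
        int r2 = quadratic(a);
        long c2 = additions;
        System.out.println("1st method: " + r1 + " with " + c1 + " additions");
        System.out.println("2nd method: " + r2 + " with " + c2 + " additions");
    }
}
```

For the six-element example both methods return 23, but the 1st method performs 56 additions and the 2nd only 21.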
Solving max sub sum: 3rd method
Use divide and conquer. The result sequence may be in:
The left half of the array, or
The right half, or
Across the boundary between the left half and the right half (the sequence contains the last element of the left half and the first element of the right half).
Solving max sub sum: 3rd method (cont.)
Left half: 1 -2 7 -6        Right half: 2 8 -5 4
Max sub sum on the left side is 7.
Max sub sum on the right side is 10.
Max sub sum on the left that includes its last element (-6) is 1. Max sub sum on the right that includes its first element (2) is 10.
Max sub sum that spans the left side and the right side is therefore 1 + 10 = 11 (this is the final answer).
Solving max sub sum: 3rd method (cont 2.)
int maxSumDivideConquer(int[] array, int leftindex, int rightindex) {   // T(n)
    // assume that the array can be divided evenly.
    if (leftindex == rightindex) {    // base case
        if (array[leftindex] > 0)
            return array[leftindex];
        else
            return 0;                 // min value of maxSubSum
    }
    int centerindex = (leftindex + rightindex) / 2;
    int maxsumleft = maxSumDivideConquer(array, leftindex, centerindex);         // T(n/2)
    int maxsumright = maxSumDivideConquer(array, centerindex + 1, rightindex);   // T(n/2)
Solving max sub sum: 3rd method (cont 3.)
    int maxlefthalfSum = 0, lefthalfSum = 0;
    // max sum - from the last element of the left side to the first element.
    for (int i = centerindex; i >= leftindex; i--) {    // O(n/2)
        lefthalfSum = lefthalfSum + array[i];
        if (lefthalfSum > maxlefthalfSum) {
            maxlefthalfSum = lefthalfSum;
        }
    }
Solving max sub sum: 3rd method (cont 4.)
    int maxrighthalfSum = 0, righthalfSum = 0;
    // max sum - from the first element of the right side to the last element.
    for (int i = centerindex + 1; i <= rightindex; i++) {    // O(n/2)
        righthalfSum = righthalfSum + array[i];
        if (righthalfSum > maxrighthalfSum) {
            maxrighthalfSum = righthalfSum;
        }
    }
Solving max sub sum: 3rd method (cont 5.)
    // finally, find max of the three.
    return max3(maxsumleft, maxsumright,
                maxlefthalfSum + maxrighthalfSum);   // this part takes constant time; we can ignore it
}
Therefore the total time is T(n) = 2T(n/2) + 2O(n/2)
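The fragments spread over the preceding slides can be assembled into one method and run on the eight-element example. This is a compact sketch under two assumptions: the array is non-empty, and `Math.max` is used in place of the `max3` helper, whose definition the slides do not show. The class and variable names are shortened for this illustration.

```java
// Hypothetical assembled version of the divide-and-conquer method.
class DivideConquerDemo {
    static int maxSum(int[] a, int left, int right) {
        if (left == right) {                       // base case: one element
            return Math.max(a[left], 0);
        }
        int center = (left + right) / 2;
        int maxLeft = maxSum(a, left, center);         // best sum inside the left half
        int maxRight = maxSum(a, center + 1, right);   // best sum inside the right half

        int bestToLeft = 0;
        int sum = 0;
        for (int i = center; i >= left; i--) {     // best sum ending at the center
            sum += a[i];
            bestToLeft = Math.max(bestToLeft, sum);
        }
        int bestToRight = 0;
        sum = 0;
        for (int i = center + 1; i <= right; i++) { // best sum starting after the center
            sum += a[i];
            bestToRight = Math.max(bestToRight, sum);
        }
        return Math.max(Math.max(maxLeft, maxRight), bestToLeft + bestToRight);
    }

    public static void main(String[] args) {
        int[] a = { 1, -2, 7, -6, 2, 8, -5, 4 };
        System.out.println(maxSum(a, 0, a.length - 1));   // the example above: 11
    }
}
```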
Solving max sub sum: 3rd method (cont 6.)
We find the total BIG O:
$T(n) = 2T(n/2) + 2 \cdot O(n/2) = 2T(n/2) + O(n) = 2T(n/2) + cn$
(an O(n) term is at most c*n according to the definition)
Divide everything by n, we get:
$\frac{T(n)}{n} = \frac{T(n/2)}{n/2} + c$   (1)
Solving max sub sum: 3rd method (cont 7.)
We can create a series of equations:
$\frac{T(n/2)}{n/2} = \frac{T(n/4)}{n/4} + c$   (2)
$\frac{T(n/4)}{n/4} = \frac{T(n/8)}{n/8} + c$   (3)
...
$\frac{T(2)}{2} = \frac{T(1)}{1} + c$   (X)
Solving max sub sum: 3rd method (cont 8.)
Do (1) + (2) + (3) + ... + (X), we get:
$\frac{T(n)}{n} = \frac{T(1)}{1} + c \cdot \log_2 n$
The terms that appear on both the left and right hand sides cancel each other out, and c is added $\log_2 n$ times.
Multiply both sides by n, we get:
$T(n) = n \cdot T(1) + c \cdot n \cdot \log_2 n$
Because T(1) is constant, we can conclude that Big O = O(n log n).
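The closed form can be checked against the recurrence directly. The sketch below assumes c = 1, T(1) = 1, and n a power of two, so that T(n) = 2T(n/2) + n should equal n + n*log2(n); the class `Recurrence` and its method names are illustrative.

```java
// Hypothetical numeric check of T(n) = 2T(n/2) + c*n with c = 1 and T(1) = 1.
class Recurrence {
    // Evaluates the recurrence directly (n must be a power of two).
    static long t(long n) {
        return n == 1 ? 1 : 2 * t(n / 2) + n;
    }

    // Closed form n*T(1) + c*n*log2(n) for the same constants.
    static long closedForm(long n) {
        long log2 = 63 - Long.numberOfLeadingZeros(n);   // exact for powers of two
        return n + n * log2;
    }

    public static void main(String[] args) {
        for (long n = 1; n <= 1024; n *= 2) {
            System.out.println("n = " + n + ": recurrence " + t(n)
                    + ", closed form " + closedForm(n));
        }
    }
}
```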
Solving max sub sum: 4th method
We improve on the 2nd method, with two points to note:
First, the first element of any maximum subsequence sum cannot be a negative value. For example: 3, -5, 1, 4, 7, -4
-5 cannot be the first element of our result. It can only make the total smaller. Starting one position later always gives a larger sum.
Solving max sub sum: 4th method (cont.)
Second, any subsequence whose sum is negative cannot be the beginning of the max sub sum.
Suppose we are in the middle of the loop. Let i be the index of the first element of a subsequence and j be the index of the last element of that subsequence.
Let the last element, a[j], be the one that makes the sum of this subsequence negative.
Let p be any index between i+1 and j.
Example: 3 4 1 -3 -9 1 5 (i marks the start of the subsequence, j marks -9, and p lies in between)
Solving max sub sum: 4th method (cont 2.)
The next step of this loop -> increment j by one.
•If the new a[j] is negative, the sum only gets smaller, so we will not get a better max sub sum. The max sub sum value will not change.
•If the new a[j] is positive, a[i]+…+a[j] will be greater than a[i]+…+a[j-1]. However, because a[i]+…+a[j-1] is negative, the new sum cannot even match a[j] alone.
•Therefore if we have a negative subsequence, we should not keep moving j from the same i. We should move i instead.
Solving max sub sum: 4th method (cont 3.)
Should we increment i by only one, or by more?
From our assumption, a[j] is the element that makes a[i]+…+a[j] negative, so a[i]+…+a[p-1] is not negative for any p between i+1 and j. Therefore, moving the start from i to p only removes a non-negative amount: a[p]+…+a[j] is no larger than a[i]+…+a[j], so it is negative too.
If we want to get a larger max sub sum, we must start our subsequence from position j+1. Therefore i should be incremented to j+1.
Example: 3 4 1 -3 -9 1 5 (i marks the start of the subsequence, j marks -9, and p lies in between)
Solving max sub sum: 4th method (cont 4.)
int maxsubsumOptimum(int[] array) {
    int maxSum = 0, theSum = 0;
    for (int j = 0; j < array.length; j++) {
        theSum = theSum + array[j];
        if (theSum > maxSum) {
            maxSum = theSum;
        } else if (theSum < 0) {    // if array[j] makes the sequence negative,
            theSum = 0;             // start again from position j+1.
        }
    }
    return maxSum;
}
There is only one loop over the array -> BIG O = O(n)
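The 4th method can be cross-checked against the 2nd method on the arrays used in these slides. This is a sketch for illustration; `LinearMaxSubSum`, `linear`, and `quadratic` are names chosen here, and agreement on a few arrays is evidence rather than proof.

```java
// Hypothetical cross-check of the linear method against the quadratic one.
class LinearMaxSubSum {
    // 4th method: one pass, restart whenever the running sum turns negative.
    static int linear(int[] array) {
        int maxSum = 0;
        int theSum = 0;
        for (int j = 0; j < array.length; j++) {
            theSum += array[j];
            if (theSum > maxSum) {
                maxSum = theSum;
            } else if (theSum < 0) {
                theSum = 0;              // start again from position j+1
            }
        }
        return maxSum;
    }

    // 2nd method, used here only as a reference answer.
    static int quadratic(int[] array) {
        int maxSum = 0;
        for (int i = 0; i < array.length; i++) {
            int theSum = 0;
            for (int j = i; j < array.length; j++) {
                theSum += array[j];
                maxSum = Math.max(maxSum, theSum);
            }
        }
        return maxSum;
    }

    public static void main(String[] args) {
        int[][] inputs = {
            { -2, 11, -6, 16, -5, 7 },
            { 3, 4, 1, -3, -9, 1, 5 },
            { 1, -2, 7, -6, 2, 8, -5, 4 }
        };
        for (int[] input : inputs) {
            System.out.println(linear(input) + " (reference " + quadratic(input) + ")");
        }
    }
}
```

On the second array the running sum reaches 8, drops to -4 at the element -9, and is reset to 0 there, which is the move of i to j+1 described above.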
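Logarithm in big O
If an algorithm spends constant time (O(1)) to cut the problem size by a constant fraction (usually half) and then continues with only one of the parts, it has big O = O(log n).
The 3rd method of the maximum subsequence sum problem also halves its input, but it processes both halves and does O(n) extra work at each level, which is why it is O(n log n) rather than O(log n).
Usually, we make an assumption that all data is already in the system. Otherwise, reading the data in will take O(n).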
Example: O(log n)
Finding 5 in a sorted array.
If we start from the first array member and check every element, it takes O(n) to find a number.
But we know that the array is sorted:
So we can look at the middle of the array and continue in either the left or the right half, depending on the value of that middle element.
We keep searching by looking at the middle element of the subarray we are left with, and so on.
This is called -> Binary Search.
int binarySearch(int[] a, int x) {
    int left = 0, right = a.length - 1;
    while (left <= right) {
        int mid = (left + right) / 2;
        if (a[mid] < x) {
            left = mid + 1;      // x can only be in the right half
        } else if (a[mid] > x) {
            right = mid - 1;     // x can only be in the left half
        } else {
            return mid;
        }
    }
    return -1;    // reaching this point means -> not found.
}
Big O = O(log2 n)
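Counting the probes shows the logarithmic behaviour on a small input. The sketch below adds a `probes` counter to the search; that counter, the class name `BinarySearchDemo`, and the sample array are assumptions made for this illustration.

```java
// Hypothetical binary search with a probe counter.
class BinarySearchDemo {
    static int probes;   // number of middle elements examined by the last search

    static int binarySearch(int[] a, int x) {
        probes = 0;
        int left = 0;
        int right = a.length - 1;
        while (left <= right) {
            int mid = (left + right) / 2;
            probes++;
            if (a[mid] < x) {
                left = mid + 1;
            } else if (a[mid] > x) {
                right = mid - 1;
            } else {
                return mid;
            }
        }
        return -1;   // not found
    }

    public static void main(String[] args) {
        int[] sorted = { 1, 3, 5, 7, 9, 11, 13, 15 };
        System.out.println("index of 5: " + binarySearch(sorted, 5) + ", probes: " + probes);
        System.out.println("index of 4: " + binarySearch(sorted, 4) + ", probes: " + probes);
    }
}
```

With eight elements, both the successful search for 5 and the failed search for 4 finish after 3 probes, compared with up to 8 for a scan from the first element.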
Example: O(log n) (cont.)
Greatest common divisor
long gcd(long m, long n) {
    while (n != 0) {
        long rem = m % n;
        m = n;
        n = rem;
    }
    return m;
}
The rate at which the remainder shrinks tells us the Big O. In this program, the remainder decreases without any obvious pattern.
How do we find big O?
Big O of gcd
We use the following property: if M > N, then M mod N < M/2.
Proof:
If N <= M/2: the remainder from M mod N must be less than N, so it must also be less than M/2.
If N > M/2: M divided by N gives quotient 1 and remainder M - N. Because N > M/2, the remainder M - N is less than M/2.
If we look at the code for gcd:
The remainder from the xth loop will be used as m of the (x+2)th loop.
Therefore the remainder from the (x+2)th loop must be less than half the remainder from the xth loop.
Meaning -> after every 2 iterations, the remainder is reduced by half or more, so the loop runs at most about 2 log2 n times and big O = O(log n).
Example: gcd(2564, 1988)
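The remainders for this example can be printed step by step. This is a sketch that assumes the same loop as the gcd method shown earlier; the class `GcdTrace`, the `iterations` counter, and the printed format are illustrative additions.

```java
// Hypothetical tracing version of gcd that prints each remainder.
class GcdTrace {
    static int iterations;   // loop iterations used by the last call

    static long gcd(long m, long n) {
        iterations = 0;
        while (n != 0) {
            long rem = m % n;
            System.out.println(m + " mod " + n + " = " + rem);
            m = n;
            n = rem;
            iterations++;
        }
        return m;
    }

    public static void main(String[] args) {
        long result = gcd(2564, 1988);
        System.out.println("gcd = " + result + " after " + iterations + " iterations");
    }
}
```

The remainders are 576, 260, 56, 36, 20, 16, 4, 0, so the answer is 4 after 8 iterations. Each remainder is less than half of the one two steps before it (for example 56 < 576/2 and 20 < 56/2).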
Example: O(log n) (cont 2.)
Calculate x^n by divide and conquer.
long power(long x, int n) {
    if (n == 0)
        return 1;
    if (isEven(n))                          // isEven(n) is true when n % 2 == 0
        return power(x * x, n / 2);
    else
        return power(x * x, n / 2) * x;
}
Big O = O(log2 n)
The original problem is divided by half in each method call.
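A call counter shows how few calls are needed. In this sketch `n % 2 == 0` replaces the `isEven` helper, whose body is not given in the slides; the class `PowerDemo` and the `calls` counter are assumptions for illustration, and overflow of `long` is not handled.

```java
// Hypothetical runnable version of power with a call counter.
class PowerDemo {
    static int calls;   // number of invocations since the last reset

    static long power(long x, int n) {
        calls++;
        if (n == 0) {
            return 1;
        }
        if (n % 2 == 0) {
            return power(x * x, n / 2);        // x^n = (x^2)^(n/2)
        } else {
            return power(x * x, n / 2) * x;    // x^n = (x^2)^(n/2) * x for odd n
        }
    }

    public static void main(String[] args) {
        calls = 0;
        System.out.println("2^10 = " + power(2, 10) + ", calls = " + calls);
        calls = 0;
        System.out.println("3^5 = " + power(3, 5) + ", calls = " + calls);
    }
}
```

For n = 10 the exponent goes 10, 5, 2, 1, 0, so only 5 calls are made instead of 10 multiplications in a simple loop.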
O(log n) definition
log^k n = O(n) when k is constant.
This tells us that a logarithmic function has a small growth rate.
f(n) = log_a n has its big O = O(log_b n), where a and b are numbers greater than 1.
Any two logarithmic functions have the same growth rate.
Any two logarithmic functions have the same growth rate: a proof
Let $x = \log_a n$ and $y = \log_b n$. Then
$a^x = n$, $b^y = n$
$a^x = b^y$
$\ln a^x = \ln b^y$
$x \ln a = y \ln b$
$\log_a n \cdot \ln a = \log_b n \cdot \ln b$
$\log_a n = \frac{\ln b}{\ln a} \cdot \log_b n = c \cdot \log_b n$
$\log_a n = O(\log_b n)$
Runtime - small (top) to large (bottom)
c
log n
log^k n
n
n log n
n^2
n^3
2^n
Definitions other than big O
Big Omega (Ω)
T(N) = Ω(g(N)) if there exist constants C and N0 such that T(N) >= C*g(N), where N >= N0.
From the definition, if f(N) = Ω(N^2), then f(N) = Ω(N) = Ω(N^(1/2)).
We should choose the most realistic answer.
Definitions other than big O (CONT.)
Big Theta (Θ)
T(N) = Θ(h(N)) if T(N) = O(h(N)) and T(N) = Ω(h(N)).
There exist c1, c2, N0 that make c1*h(N) <= T(N) <= c2*h(N), where N >= N0.
Definitions other than big O (CONT 2.)
Small o
T(N) = o(p(N)) if T(N) = O(p(N)) but T(N) ≠ Θ(p(N)).
Notes from the definitions
T(N) = O(f(N)) has the same meaning as f(N) = Ω(T(N)).
We can say f(N) is an "upper bound" of T(N), and T(N) is a "lower bound" of f(N).
f(N) = N^2 and g(N) = 2N^2 have the same Big O and Big Ω. That is, f(N) = Θ(g(N)).
f(N) = N^2 can have several Big O -> (O(N^3), O(N^4)) but the best value is O(N^2).
We can use f(N) = Θ(N^2) to tell that this value is the best big O.
Thus, we have the latest definition:
If T(N) is a polynomial of degree k, then T(N) = Θ(N^k).
From here, if T(N) = 5N^4 + 4N^3 + N, we know that T(N) = Θ(N^4).
Best case, Worst case, Average case
worst case = the maximum running time possible.
best case = the minimum running time possible.
average case?
For each input, see how long the program runs.
average case running time = total time from every input divided by the number of inputs.
Average case
The average case definition is based on an assumption that each input has an equal chance of occurrence.
If we do not want the assumption, we must take the probability of each input into account.
Average case = $\sum_i$ (prob. of input_i * unit time when using input_i)
Example: Finding Average case
Let's say we want to find x in an array of size n.
Best case: find x in the first array slot.
Worst case: x is in the array's last slot, or x is not in the array at all.
Average case:
Assume each array slot has an equal chance of having x inside.
Therefore, the chance of x being in a given slot is 1/n.
Example: Finding Average case (cont.)
Average case running time = 1/n * (steps used when finding x in the first slot) + 1/n * (steps used when finding x in the second slot) + ... + 1/n * (steps used when finding x in the last slot, or not finding x at all)
= (1 + 2 + ... + n) / n = (n+1)/2 = O(n) = big O of worst case
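The (n+1)/2 figure can be reproduced by searching for every element once and averaging the step counts. This sketch assumes x is always present and equally likely to be in any slot; `AverageCaseSearch` and its method names are hypothetical.

```java
// Hypothetical measurement of the average number of steps of a linear search.
class AverageCaseSearch {
    // Number of slots examined before x is found (or a.length if it is absent).
    static int stepsToFind(int[] a, int x) {
        int steps = 0;
        for (int value : a) {
            steps++;
            if (value == x) {
                break;
            }
        }
        return steps;
    }

    // Average over all n equally likely positions of x.
    static double averageSteps(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = i;                    // distinct values 0..n-1
        }
        long total = 0;
        for (int target = 0; target < n; target++) {
            total += stepsToFind(a, target);
        }
        return (double) total / n;
    }

    public static void main(String[] args) {
        for (int n : new int[] { 9, 100, 1001 }) {
            System.out.println("n = " + n + ": average " + averageSteps(n)
                    + ", (n+1)/2 = " + (n + 1) / 2.0);
        }
    }
}
```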