Summer 2007 CISC121 - Prof. McLeod 1
CISC121 – Lecture 12
• Last time:
  – Efficient recursive and non-recursive sorts.
  – Analyzing the complexity of recursive methods.
• Today:
  – Last lecture!
You Will Need To:
• Look at exercise 5 and assignment 5.
• If you wish to replace any of your assignment marks you can submit the “makeup” assignment on the day of the final exam.
Final Exam…
• On the 16th, unless other arrangements have been made.
• Exam topics are listed off the course web site.
• Exambank has 18 exams from 1996 to 2006.
• I will not provide full exam solutions, but will be happy to discuss individual exam problems.
• Ana or Krista can supply extra tutoring if needed. Ask them if you want an exam prep tutorial.
Today
• Analyze the complexity of mergesort.
• Number representation and roundoff error.
• Movie!
• “Satisfaction” survey
Mergesort – “aMergeSort” Code
• Code for sorting arrays:

public static void aMergeSort (int[] A) {
    aMergeSort(A, 0, A.length-1);
} // end aMergeSort

public static void aMergeSort (int[] A, int first, int last) {
    if (first < last) {
        int mid = (first + last) / 2;
        aMergeSort(A, first, mid);
        aMergeSort(A, mid + 1, last);
        aMerge(A, first, last);
    } // end if
} // end aMergeSort recursive
Mergesort – “aMerge” Code

private static void aMerge (int[] A, int first, int last) {
    int mid = (first + last) / 2;
    int i1 = 0, i2 = first, i3 = mid + 1;
    int[] temp = new int[last - first + 1];
    while (i2 <= mid && i3 <= last) {
        if (A[i2] < A[i3]) {
            temp[i1] = A[i2];
            i2++;
        }
        else {
            temp[i1] = A[i3];
            i3++;
        }
        i1++;
    } // end while
Mergesort – “aMerge” Code - Cont.

    while (i2 <= mid) {
        temp[i1] = A[i2];
        i2++;
        i1++;
    } // end while
    while (i3 <= last) {
        temp[i1] = A[i3];
        i3++;
        i1++;
    } // end while
    i1 = 0;
    i2 = first;
    while (i2 <= last) {
        A[i2] = temp[i1];
        i1++;
        i2++;
    } // end while
} // end aMerge
Complexity of Mergesort
• Consider the aMergeSort code shown above:
• Suppose that the entire method takes t(n) time, where n is A.length. We want to know the big O notation for t(n).
• There are no loops in aMergeSort, just some constant time operations, the two recursive calls and the call to aMerge.

$$t(n) = a + t\left(\frac{n}{2}\right) + t\left(\frac{n}{2}\right) + t_{merge}(n)$$
Complexity of Mergesort - Cont.
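• What is the time function for aMerge?
• There is some O(1) stuff and four loops that are O(n):

$$t_{merge}(n) = a + cn$$

• So, combining the constant terms into a:

$$t(n) = a + t\left(\frac{n}{2}\right) + t\left(\frac{n}{2}\right) + cn$$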
Complexity of Mergesort - Cont.
• So far, we have not made any mention of the state of the data. Does it make any difference if the data is in reverse order (worst case), random order (average case) or in order already (best case)?
• Express t(n) in a recursive expression:
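
$$t(n) = \begin{cases} b & \text{if } n \le 1 \\ a + t\left(\frac{n}{2}\right) + t\left(\frac{n}{2}\right) + cn & \text{otherwise} \end{cases}$$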
Complexity of Mergesort - Cont.
• Assume that n is a power of 2:

$$t(n) = \begin{cases} b & \text{if } n \le 1 \\ a + 2t\left(\frac{n}{2}\right) + cn & \text{otherwise} \end{cases}$$

• (It is easy enough to show that the proof still holds when n is not a power of two - but I’m not going to do that here).
Complexity of Mergesort - Cont.
• Substitute n/2 for n, to get t(n/2):
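
$$t\left(\frac{n}{2}\right) = a + 2t\left(\frac{n}{2^2}\right) + c\frac{n}{2}$$

or,

$$t(n) = a + 2\left(a + 2t\left(\frac{n}{2^2}\right) + c\frac{n}{2}\right) + cn$$

$$t(n) = 3a + 2^2 t\left(\frac{n}{2^2}\right) + 2cn \qquad (i = 2)$$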
Complexity of Mergesort - Cont.
• Do the next unrolling, which substitutes n/2^2 for n:

$$t(n) = 7a + 2^3 t\left(\frac{n}{2^3}\right) + 3cn \qquad (i = 3)$$

• So, after i unrollings:

$$t(n) = (2^i - 1)a + 2^i t\left(\frac{n}{2^i}\right) + icn$$
Complexity of Mergesort - Cont.
• This recursion stops when the anchor case, n ≤ 1, is encountered. This will occur when:

$$\frac{n}{2^i} = 1, \text{ or when } 2^i = n, \text{ or } i = \log(n)$$

• Substituting this back in the equation on the previous slide:
Complexity of Mergesort - Cont.
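• At the anchor case:

$$t(n) = (n - 1)a + n\,t\left(\frac{n}{n}\right) + cn\log(n)$$

or,

$$t(n) = (n - 1)a + n\,t(1) + cn\log(n)$$

or,

$$t(n) = (n - 1)a + nb + cn\log(n) = jn + k + cn\log(n)$$

where j = a + b and k = -a are constants.

• Now the equation can be simplified to yield the big O notation, which indicates that t(n) is O(nlog(n)).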
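The closed form from the unrolling can be checked numerically. The following is a minimal sketch, assuming n is a power of 2 and the logarithm is base 2; the class name, method names and the constants a, b and c are illustrative choices, not values from the lecture.

```java
class MergesortRecurrence {

    // t(n) = b when n <= 1, otherwise a + 2 t(n/2) + c n (n is a power of 2).
    static long recurrence(long n, long a, long b, long c) {
        if (n <= 1) {
            return b;
        }
        return a + 2 * recurrence(n / 2, a, b, c) + c * n;
    }

    // Closed form from the unrolling: (n - 1) a + n b + c n log2(n).
    static long closedForm(long n, long a, long b, long c) {
        // For a power of 2, the number of trailing zero bits is exactly log2(n).
        long log2n = Long.numberOfTrailingZeros(n);
        return (n - 1) * a + n * b + c * n * log2n;
    }

    public static void main(String[] args) {
        long a = 3, b = 2, c = 5; // arbitrary illustrative constants
        for (long n = 1; n <= 1024; n *= 2) {
            System.out.println(n + ": " + recurrence(n, a, b, c)
                    + " " + closedForm(n, a, b, c));
        }
    }
}
```

Both columns should agree for every power of 2, and the c n log2(n) term dominates as n grows.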
public static void quickSort (int[] A, int first, int last) {
    int lower = first + 1;
    int upper = last;
    swap(A, first, (first+last)/2);
    int pivot = A[first];
    while (lower <= upper) {
        while (A[lower] < pivot)
            lower++;
        while (A[upper] > pivot)
            upper--;
        if (lower < upper)
            swap(A, lower++, upper--);
        else
            lower++;
    }
    swap(A, upper, first);
    if (first < upper - 1)
        quickSort(A, first, upper-1);
    if (upper + 1 < last)
        quickSort(A, upper+1, last);
} // end quickSort(subarrays)
Complexity of Quicksort
• The worst case is when a near-median value is not chosen – the pivot value is always a maximum or a minimum value. Now the algorithm is O(n^2).
• However, if the pivot values are always near the median value of the arrays, the algorithm is O(nlog(n)) – which is the best case. (See the derivation of this complexity for merge sort).
• The average case also turns out to be O(nlog(n)).
Number Representation
• Binary, or “base 2”, is the natural representation of numbers for a computer.
• As a transition, hexadecimal (or “hex”, base 16) numbers are also used.
• Octal (base 8) numbers are used to a lesser degree.
• Decimal (base 10) numbers are *not* naturally represented in computers.
Number Representation - Cont.
• In base 2 (digits either 0 or 1):
r=2, a binary number: (110101.11)_2 =
1×2^5 + 1×2^4 + 0×2^3 + 1×2^2 + 0×2^1 + 1×2^0 + 1×2^-1 + 1×2^-2
= 53.75 (in base 10)
“r” is the “radix” or the base of the number
Number Representation - Cont.
• Octal Numbers: a base-8 system with 8 digits: 0, 1, 2, 3, 4, 5, 6 and 7:
• For example:
(127.4)_8 = 1×8^2 + 2×8^1 + 7×8^0 + 4×8^-1 = 87.5
Number Representation - Cont.
• Hexadecimal Numbers: a base-16 system with 16 digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F:
• For example:
(B65F)_16 = 11×16^3 + 6×16^2 + 5×16^1 + 15×16^0 = 46687.
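The three positional series above can be evaluated by one small routine. This is a sketch only: the class and method names are made up for illustration, and it assumes the input contains only valid digits for the given base and at most one radix point.

```java
class ToDecimal {

    // Evaluates a digit string in base r as a base 10 value:
    // whole digits weigh r^k, fractional digits weigh r^-1, r^-2, ...
    static double toDecimal(String digits, int r) {
        int point = digits.indexOf('.');
        String whole = (point < 0) ? digits : digits.substring(0, point);
        String frac = (point < 0) ? "" : digits.substring(point + 1);

        double value = 0;
        for (char ch : whole.toCharArray()) {
            value = value * r + Character.digit(ch, r);
        }
        double weight = 1.0 / r;
        for (char ch : frac.toCharArray()) {
            value += Character.digit(ch, r) * weight;
            weight /= r;
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(toDecimal("110101.11", 2)); // 53.75
        System.out.println(toDecimal("127.4", 8));     // 87.5
        System.out.println(toDecimal("B65F", 16));     // 46687.0
    }
}
```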
Number Representation - Cont.
• The above series show how you can convert from binary, octal or hex to decimal.
• How do you convert from decimal to one of the other bases?
• Integral part: divide by r repeatedly and keep the remainders.
• Fractional part: multiply by r repeatedly and keep the carries.
• “r” is the base - either 2, 8 or 16.
Number Representation - Cont.
• For example, convert (625.76)_10 to binary. Start with the integral part, 625:

Divisor (r)   Dividend         Remainder
2             625
2             312 (quotient)   1   <- least significant digit
2             156              0
2             78               0
2             39               0
2             19               1
2             9                1
2             4                1
2             2                0
2             1                0
              0                1   <- most significant digit

• So, (625)_10 is (1001110001)_2
Number Representation - Cont.
• For the (0.76)_10 part:

Multiplier (r)   Multiplicand     Carry
2                0.76
2                1.52 (product)   1
2                1.04             1
2                0.08             0
2                0.16             0
2                0.32             0
2                0.64             0
2                1.28             1
2                0.56             0
2                1.12             1
...

• So, (0.76)_10 is (0.1100001010)_2
• (625.76)_10 is: (1001110001.1100001010)_2
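The two tables can be reproduced with a short program. This is a sketch under stated assumptions: the names are illustrative, and the fractional part is passed as an exact fraction (76/100) rather than as a double, so that the conversion itself is not disturbed by roundoff.

```java
class BaseConversion {

    // Integral part: divide by r repeatedly; each remainder is the next
    // digit, starting from the least significant one.
    static String integralToBase(int value, int r) {
        if (value == 0) {
            return "0";
        }
        StringBuilder digits = new StringBuilder();
        while (value > 0) {
            digits.insert(0, Character.forDigit(value % r, r));
            value /= r;
        }
        return digits.toString().toUpperCase();
    }

    // Fractional part numerator/denominator: multiply by r repeatedly;
    // each carry is the next digit, starting from the most significant one.
    static String fractionToBase(long numerator, long denominator, int r, int places) {
        StringBuilder digits = new StringBuilder();
        for (int i = 0; i < places; i++) {
            numerator *= r;
            digits.append(Character.forDigit((int) (numerator / denominator), r));
            numerator %= denominator;
        }
        return digits.toString().toUpperCase();
    }

    public static void main(String[] args) {
        // Prints 1001110001.1100001010
        System.out.println(integralToBase(625, 2) + "." + fractionToBase(76, 100, 2, 10));
    }
}
```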
Number Representation - Cont.
• Converting between binary, octal and hex is much easier - done by “grouping” the numbers:
• For example:
(010110001101011.111100000110)_2 = (?)_8
 010 110 001 101 011 . 111 100 000 110
(  2   6   1   5   3 .   7   4   0   6)_8
Number Representation - Cont.
• Another example:
(2C6B.F06)_16 = (?)_2
(   2    C    6    B .    F    0    6)_16
(0010 1100 0110 1011 . 1111 0000 0110)_2
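Grouping is mechanical enough to sketch in a few lines. The helper names below are hypothetical; the code assumes the bit string has already been padded to a multiple of 3 bits, as in the octal example, and it handles the digits on each side of the radix point separately.

```java
class GroupingDemo {

    // Binary to octal: every group of 3 bits becomes one octal digit.
    static String binaryToOctalDigits(String bits) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < bits.length(); i += 3) {
            out.append(Integer.parseInt(bits.substring(i, i + 3), 2));
        }
        return out.toString();
    }

    // Hex to binary: every hex digit becomes a group of 4 bits.
    static String hexToBinaryDigits(String hex) {
        StringBuilder out = new StringBuilder();
        for (char ch : hex.toCharArray()) {
            String group = Integer.toBinaryString(Character.digit(ch, 16));
            out.append("0000".substring(group.length())).append(group);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints 26153.7406
        System.out.println(binaryToOctalDigits("010110001101011") + "."
                + binaryToOctalDigits("111100000110"));
        // Prints 0010110001101011.111100000110
        System.out.println(hexToBinaryDigits("2C6B") + "." + hexToBinaryDigits("F06"));
    }
}
```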
From Before: Integer Primitive Types in Java
• For byte, from -128 to 127, inclusive (1 byte).
• For short, from -32768 to 32767, inclusive (2 bytes).
• For int, from -2147483648 to 2147483647, inclusive (4 bytes).
• For long, from -9223372036854775808 to 9223372036854775807, inclusive (8 bytes).
• A “byte” is 8 bits, where a “bit” is either 1 or 0.
Storage of Integers
• An “un-signed” 8 digit binary number can range from 00000000 to 11111111
• 00000000 is 0 in base 10.
• 11111111 is 1×2^0 + 1×2^1 + 1×2^2 + … + 1×2^7 = 255, base 10.
Storage of Integers - Cont.
• So, how can a negative binary number be stored?
• One way is to use the Two’s Complement system of storage.
• Make the most significant bit a negative number:
• So, the lowest “signed” binary 8 digit number is now: 10000000, which is -1×2^7, or -128 base 10.
Storage of Integers - Cont.
• Two’s Complement System:
binary base 10
10000000 -128
10000001 -127
11111111 -1
00000000 0
00000001 1
01111111 127
Storage of Integers - Cont.
• For example, the binary number
10010101 is
1×2^0 + 1×2^2 + 1×2^4 - 1×2^7
= 1 + 4 + 16 - 128
= -107 base 10
• Now you can see how the primitive integer type, byte, ranges from -128 to 127.
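The weighting rule can be written out directly. This is an illustrative sketch (the class and method names are not from the lecture) that gives the most significant bit the weight -2^7 and compares the result with Java's own byte cast.

```java
class TwosComplementDemo {

    // Interprets an 8 character bit string as a two's complement value:
    // the leftmost bit weighs -2^7, the others weigh +2^6 down to +2^0.
    static int fromTwosComplement(String bits) {
        int value = 0;
        for (int i = 0; i < 8; i++) {
            if (bits.charAt(i) == '1') {
                int weight = 1 << (7 - i);
                value += (i == 0) ? -weight : weight;
            }
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(fromTwosComplement("10010101")); // -107
        System.out.println((byte) 0b10010101);              // -107
        System.out.println(fromTwosComplement("10000000")); // -128
        System.out.println(fromTwosComplement("01111111")); // 127
    }
}
```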
Storage of Integers - Cont.
• Suppose we wish to add 1 to the largest byte value: 01111111+00000001
• This would be equivalent to adding 1 to 127 in base 10 - the result would normally be 128.
• In base 2, using two’s complement, the result of the addition is 10000000, which is -128 in base 10!
• So integer numbers wrap around, in the case of overflow - no warning is given in Java!
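The wrap-around is easy to demonstrate. The sketch below uses an illustrative helper, `overflows`, built on `Math.addExact` (available since Java 8), which throws an `ArithmeticException` instead of silently wrapping.

```java
class OverflowDemo {

    // Returns true if x + y does not fit in an int.
    static boolean overflows(int x, int y) {
        try {
            Math.addExact(x, y);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        byte b = Byte.MAX_VALUE;   // 01111111
        b++;                       // wraps to 10000000
        System.out.println(b);     // -128

        int i = Integer.MAX_VALUE;
        i++;                       // wraps silently, no warning
        System.out.println(i);     // -2147483648

        System.out.println(overflows(Integer.MAX_VALUE, 1)); // true
    }
}
```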
Storage of Integers - Cont.
• An int is stored in 4 bytes using “two’s complement”.
• An int ranges from:
10000000 00000000 00000000 00000000
to
01111111 11111111 11111111 11111111
or -2147483648 to 2147483647 in base 10
Real Primitive Types
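• For float (4 bytes), roughly ±1.2 × 10^-38 to ±3.4 × 10^38 at full precision, to about 7 significant digits. (Float.MIN_VALUE, the smallest positive value, is about 1.4 × 10^-45.)
• For double (8 bytes), roughly ±2.2 × 10^-308 to ±1.7 × 10^308 at full precision, to about 15 significant digits. (Double.MIN_VALUE is about 4.9 × 10^-324.)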
Storage of Real Numbers
• The system used to store real numbers in Java complies with the IEEE standard number 754.
• Like an int, a float is stored in 4 bytes or 32 bits.
• These bits consist of 1 sign bit, 8 bits for the exponent and 23 stored bits for the mantissa (24 bits of precision, counting an implied leading 1):
0      00000000   00000000000000000000000
sign   exponent   mantissa
Storage of Real Numbers - Cont.
• So a value is stored as:
value = ± mantissa × 2^exponent
• The exponent for a normalized float can range from -126 to 127, so the scale factor runs from 2^-126 to 2^127, which is about 10^-38 to 10^38.
• The magnitude of a normalized float mantissa lies between 1.0 (inclusive) and 2.0 (exclusive), and will have about 7 significant digits when converted to base 10.
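The fields of a stored float can be inspected with `Float.floatToIntBits`. This sketch assumes the IEEE 754 single precision layout (1 sign bit, 8 exponent bits stored with a bias of 127, 23 stored mantissa bits); the class and method names are illustrative.

```java
class FloatBits {

    // Sign bit: 0 for positive, 1 for negative.
    static int sign(float f) {
        return Float.floatToIntBits(f) >>> 31;
    }

    // Exponent field with the bias of 127 removed.
    static int unbiasedExponent(float f) {
        return ((Float.floatToIntBits(f) >>> 23) & 0xFF) - 127;
    }

    // The 23 stored mantissa bits (the leading 1 is implied, not stored).
    static int storedMantissa(float f) {
        return Float.floatToIntBits(f) & 0x7FFFFF;
    }

    public static void main(String[] args) {
        float f = 0.1f;
        System.out.println(sign(f));                                // 0
        System.out.println(unbiasedExponent(f));                    // -4
        System.out.println(Integer.toBinaryString(storedMantissa(f)));
    }
}
```

For 0.1f the exponent is -4, so the stored value is a mantissa of about 1.6 times 2^-4.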
Storage of Real Numbers - Cont.
• The double type is stored using 8 bytes or 64 bits - 1 sign bit, 11 bits for the exponent, and 52 stored bits for the mantissa (53 bits of precision).
• The exponent gives scale factors between 2^-1022 and 2^1023, which is about 10^-308 and 10^308.
• The mantissa allows for the storage of about 16 significant digits in base 10.
• (Double.MAX_VALUE is: 1.7976931348623157E308)
Storage of Real Numbers - Cont.
• See the following web site for more info:
http://grouper.ieee.org/groups/754/
• Or:
http://en.wikipedia.org/wiki/IEEE_floating-point_standard
Storage of Real Numbers - Cont.
• So, a real number can only occupy a finite amount of storage in memory.
• This effect is very important for two kinds of numbers:
  – Numbers like 0.1 that can be written exactly in base 10, but cannot be stored exactly in base 2.
  – Real numbers (like π or e) that have an infinite number of digits in their “real” representation can only be stored in a finite number of digits in memory.
• And, we will see that it has an effect on the accuracy of mathematical operations.
Roundoff Error
• Consider 0.1:
(0.1)10 = (0.0 0011 0011 0011 0011 0011…)2
• What happens to the part of a real number that cannot be stored?
• It is lost - the number is either truncated or rounded (Java rounds to the nearest representable value).
• The “lost part” is called the Roundoff Error.
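To see what is actually stored for 0.1, the exact binary value can be printed in decimal with `BigDecimal`. This is a small sketch; the class and method names are illustrative.

```java
import java.math.BigDecimal;

class StoredTenth {

    // The BigDecimal(double) constructor converts the stored binary value
    // exactly, so the string shows every digit of what is in memory.
    static String exactValue(double d) {
        return new BigDecimal(d).toPlainString();
    }

    public static void main(String[] args) {
        // double: 0.1000000000000000055511151231257827021181583404541015625
        System.out.println(exactValue(0.1));
        // float:  0.100000001490116119384765625
        System.out.println(exactValue(0.1f));
    }
}
```

Both stored values are slightly larger than 0.1; the difference from 0.1 is the roundoff error for that type.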
Storage of “Real” or “Floating-Point” Numbers - Cont.
• Compute:

$$\sum_{i=1}^{10000} 0.1$$

• And, compare to 1000.

float sum = 0;
for (int i = 0; i < 10000; i++)
    sum += 0.1;
System.out.println(sum);
Storage of “Real” or “Floating-Point” Numbers - Cont.
• Prints a value of 999.9029 to the screen.
• If sum is declared to be a double then the value: 1000.0000000001588 is printed to the screen.
• So, the individual roundoff errors have piled up to contribute to a cumulative error in this calculation.
• As expected, the roundoff error is smaller for a double than for a float.
Roundoff Error – Cont.
• This error is referred to in two different ways:
• The absolute error:
absolute error = |x - x_approx|
• The relative error:
relative error = (absolute error) / |x|
Roundoff Error - Cont.
• So for the calculation of 1000 as shown above, the errors are:

Type     Absolute    Relative
float    0.0971      9.71E-5
double   1.588E-10   1.588E-13

• The relative error on the sum is the absolute error divided by the true value, 1000.
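The summation and both error measures can be computed in one program. This is a sketch with illustrative names; the exact digits printed depend on the accumulation described above, so no output values are asserted here beyond what the slides report.

```java
class RoundoffSum {

    // Adds 0.1 to a float accumulator n times, as in the loop shown earlier.
    static float floatSum(int n) {
        float sum = 0;
        for (int i = 0; i < n; i++) {
            sum += 0.1;
        }
        return sum;
    }

    // The same loop with a double accumulator.
    static double doubleSum(int n) {
        double sum = 0;
        for (int i = 0; i < n; i++) {
            sum += 0.1;
        }
        return sum;
    }

    static double absoluteError(double x, double approx) {
        return Math.abs(x - approx);
    }

    static double relativeError(double x, double approx) {
        return absoluteError(x, approx) / Math.abs(x);
    }

    public static void main(String[] args) {
        double exact = 1000.0;
        System.out.println("float:  " + absoluteError(exact, floatSum(10000))
                + " " + relativeError(exact, floatSum(10000)));
        System.out.println("double: " + absoluteError(exact, doubleSum(10000))
                + " " + relativeError(exact, doubleSum(10000)));
    }
}
```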
The Effects of Roundoff Error
• Roundoff error can have an effect on any arithmetic operation carried out involving real numbers.
• For example, consider subtracting two numbers that are very close together:
• Use the function

$$f(x) = 1 - \cos(x)$$

for example. As x approaches zero, cos(x) approaches 1.
The Effects of Roundoff Error
• Using double variables, and a value of x of 1.0E-12, f(x) evaluates to 0.0.
• But, it can be shown that the function f(x) can also be represented by f’(x):

$$f(x) = f'(x) = \frac{\sin^2(x)}{1 + \cos(x)}$$

• For x = 1.0E-12, f’(x) evaluates to 5.0E-25.
• The f’(x) function is less susceptible to roundoff error.
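The two forms of the function can be compared side by side. A minimal sketch with illustrative method names: `direct` subtracts two nearly equal numbers, while `rearranged` uses the sin^2(x) / (1 + cos(x)) form.

```java
class CancellationDemo {

    // f(x) = 1 - cos(x), evaluated directly.
    static double direct(double x) {
        return 1.0 - Math.cos(x);
    }

    // The equivalent form sin^2(x) / (1 + cos(x)), which avoids
    // subtracting two numbers that are very close together.
    static double rearranged(double x) {
        double s = Math.sin(x);
        return s * s / (1.0 + Math.cos(x));
    }

    public static void main(String[] args) {
        double x = 1.0E-12;
        System.out.println(direct(x));
        System.out.println(rearranged(x));
    }
}
```

For small x the true value is close to x^2 / 2, which is 5.0E-25 here.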
The Effects of Roundoff Error - Cont.
• Another example. Consider the smallest root of the polynomial: ax^2 + bx + c = 0:

$$x_1 = \frac{-b + \sqrt{b^2 - 4ac}}{2a}$$

• What happens when ac is small, compared to b?
• It is known that for the two roots, x1 and x2:

$$x_1 x_2 = \frac{c}{a}$$
The Effects of Roundoff Error - Cont.
• Which leads to an equation for the root which is not as susceptible to roundoff error:

$$x_1 = \frac{-2c}{b + \sqrt{b^2 - 4ac}}$$

• This equation approaches –c/b instead of zero when ac << b^2.
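The difference between the two root formulas shows up clearly with a large b. The sketch below assumes b > 0 and real roots; the coefficient values are chosen for illustration only.

```java
class QuadraticRoots {

    // Textbook formula: subtracts two nearly equal numbers when ac << b^2.
    static double naiveSmallRoot(double a, double b, double c) {
        return (-b + Math.sqrt(b * b - 4 * a * c)) / (2 * a);
    }

    // Rearranged using x1 * x2 = c / a: no cancellation in the denominator.
    static double stableSmallRoot(double a, double b, double c) {
        return -2 * c / (b + Math.sqrt(b * b - 4 * a * c));
    }

    public static void main(String[] args) {
        double a = 1.0, b = 1.0E8, c = 1.0; // illustrative values, ac << b^2
        System.out.println(naiveSmallRoot(a, b, c));
        System.out.println(stableSmallRoot(a, b, c));
        System.out.println(-c / b); // the value the root should approach
    }
}
```

With these values the small root is very close to -1.0E-8. The rearranged formula reproduces it, while the textbook formula loses most of its significant digits.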
The Effects of Roundoff Error - Cont.
• The examples above show what can happen when two numbers that are very close are subtracted.
• Remember that this effect is a direct result of these numbers being stored with finite accuracy in memory.
The Effects of Roundoff Error - Cont.
• A similar effect occurs when an attempt is made to add a comparatively small number to a large number:
boolean aVal = ((1.0E10 + 1.0E-20) == 1.0E10);
System.out.println(aVal);
• Prints out true to the screen.
• Since 1.0E-20 is just too small to affect any of the bit values used to store 1.0E10. The small number would have to be about 1.0E-6 or larger to affect the large number.
• So, keep this behaviour in mind when designing expressions!
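The threshold can be found with `Math.ulp`, which returns the spacing between a double and the next larger representable value. This is a sketch; `isAbsorbed` is an illustrative helper name.

```java
class AbsorptionDemo {

    // True if adding small to large leaves large unchanged.
    static boolean isAbsorbed(double large, double small) {
        return (large + small) == large;
    }

    public static void main(String[] args) {
        System.out.println(isAbsorbed(1.0E10, 1.0E-20)); // true
        // Spacing of doubles near 1.0E10: 2^-19, about 1.9E-6.
        System.out.println(Math.ulp(1.0E10));
        System.out.println(isAbsorbed(1.0E10, 1.0E-5));  // false
    }
}
```

A number much smaller than half of that spacing is simply lost when it is added.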
The Effect on Summations
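• Taylor Series are used to approximate many functions. For example:

$$\ln(1 + x) = \sum_{i=1}^{\infty} (-1)^{i+1} \frac{x^i}{i}$$

• For ln(2):

$$\ln(2) = \sum_{i=1}^{\infty} \frac{(-1)^{i+1}}{i} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \dots$$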
The Effect on Summations – Cont.
• Since we cannot loop to infinity, how many terms would be sufficient?
• Since the sum is stored in a finite memory space, at some point the terms to be added will be much smaller than the sum itself.
• If the sum is stored in a float, which has about 7 significant digits, a term of about 1×10^-8 would not be significant. So, i would be about 10^8 - that’s a lot of iterations!
The Effect on Summations - Cont.
• On testing using a float, it took 33554433 iterations and 25540 msec to compute! (sum no longer changing, value = 0.6931375)
• Math.log(2) = 0.6931471805599453
• So, roundoff error had a significant effect and the summation did not even provide the correct value. A float could only provide about 5 correct significant digits, tops.
• For double, about 10^15 iterations would be required! (I didn’t try this one…)
• So, this series does not converge quickly, and roundoff error has a strong effect on the answer!
The Effect on Summations - Cont.
• Here is another way to compute natural logs:

$$\ln\left(\frac{1 + x}{1 - x}\right) = 2 \sum_{i=0}^{\infty} \frac{1}{2i + 1} x^{2i+1}$$

• Using x = 1/3 will provide ln(2).
The Effect on Summations - Cont.
• For float, this took 8 iterations and <1 msec (value = 0.6931472).
• Math.log(2) = 0.6931471805599453
• For double, it took 17 iterations, <1 msec to give the value = 0.6931471805599451
• Using the Windows calculator ln(2) = 0.69314718055994530941723212145818 (!!)
• So, the use of the 17 iterations still introduced a slight roundoff error.
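Both series can be tried in a few lines. This is a sketch with illustrative names: `alternating` sums a fixed number of terms of 1 - 1/2 + 1/3 - ..., and `fast` sums the series with x = 1/3 until an added term no longer changes the sum. The exact iteration counts and last digits may differ from the figures reported above, depending on how the terms are accumulated.

```java
class Ln2Series {

    // First `terms` terms of the alternating series for ln(2).
    static double alternating(int terms) {
        double sum = 0;
        for (int i = 1; i <= terms; i++) {
            double term = 1.0 / i;
            sum += (i % 2 == 1) ? term : -term;
        }
        return sum;
    }

    // 2 * sum of x^(2i+1) / (2i+1) with x = 1/3, stopping when the
    // next term is too small to change the stored sum.
    static double fast() {
        double x = 1.0 / 3.0;
        double power = x; // holds x^(2i+1)
        double sum = 0;
        for (int i = 0; ; i++) {
            double next = sum + 2 * power / (2 * i + 1);
            if (next == sum) {
                return sum;
            }
            sum = next;
            power *= x * x;
        }
    }

    public static void main(String[] args) {
        System.out.println(alternating(1000)); // still wrong in the 4th digit
        System.out.println(fast());
        System.out.println(Math.log(2));
    }
}
```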
Numeric Calculations
• Error is introduced into a calculation through two sources (assuming the formulae are correct!):
  – The inherent error in the numbers used in the calculation.
  – Error resulting from roundoff error.
• Often the inherent error dominates the roundoff error.
• But, watch for conditions of slow convergence or ill-conditioned matrices, where roundoff error can accumulate or be amplified and end up swamping the inherent error.
Numeric Calculations - Cont.
• Once a number is calculated, it is very important to be able to estimate the error using both sources, if necessary.
• The error must be known in order that the number produced by your program can be reported in a valid manner.
• This is a non-trivial topic in numeric calculation that we will not discuss in this course.
Real World Roundoff Error Disasters
• Ariane 5 Launch
• Patriot Missiles
• See the movie!
Patriot Missile Problem
• The Patriot’s tracking system used a 24 bit number to keep track of the number of tenths of a second passed since the tracking system was turned “on”.
• 0.1 in binary is 0.0001100110011001100110011001100....
• If you only have 24 bits to store this number then the error is 0.0000000000000000000000011001100…. or 0.000000095 in base 10
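The size of that error can be reproduced with a little arithmetic. This is a sketch under an assumption: 0.1 is chopped after 23 binary places, which matches the error bit pattern shown above. The names and the `hoursOn` input are hypothetical, used only to show how the per-tick error accumulates.

```java
class PatriotError {

    // Assumption: 23 bits after the binary point, the rest chopped off.
    static final int FRACTION_BITS = 23;

    // The value of 0.1 after chopping to FRACTION_BITS binary places.
    static double storedTenth() {
        long scaled = (long) (0.1 * (1L << FRACTION_BITS)); // truncates
        return (double) scaled / (1L << FRACTION_BITS);
    }

    // Error made on every tenth of a second that is counted.
    static double perTickError() {
        return 0.1 - storedTenth();
    }

    // Accumulated clock drift in seconds after hoursOn hours of uptime.
    static double driftSeconds(double hoursOn) {
        double ticks = hoursOn * 3600 * 10; // ten ticks per second
        return ticks * perTickError();
    }

    public static void main(String[] args) {
        System.out.println(perTickError());      // about 9.5E-8
        System.out.println(driftSeconds(100.0)); // drift after 100 hours
    }
}
```

The per-tick error is tiny, but it is added on every tick, so the drift grows linearly with the time the system has been on.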