Summer 2007 CISC121 - Prof. McLeod

CISC121 – Lecture 12

• Last time:
– Efficient recursive and non-recursive sorts.
– Analyzing the complexity of recursive methods.

• Today:
– Last lecture!


You Will Need To:

• Look at exercise 5 and assignment 5.

• If you wish to replace any of your assignment marks you can submit the “makeup” assignment on the day of the final exam.


Final Exam…

• On the 16th, unless other arrangements have been made.

• Exam topics are listed off the course web site.

• Exambank has 18 exams from 1996 to 2006.

• I will not provide full exam solutions, but will be happy to discuss individual exam problems.

• Ana or Krista can supply extra tutoring if needed. Ask them if you want an exam prep tutorial.


Today

• Analyze the complexity of mergesort.

• Number representation and roundoff error.

• Movie!

• “Satisfaction” survey


Mergesort – “aMergeSort” Code

• Code for sorting arrays:

public static void aMergeSort (int[] A) {
    aMergeSort(A, 0, A.length-1);
} // end aMergeSort

public static void aMergeSort (int[] A, int first, int last) {
    if (first < last) {
        int mid = (first + last) / 2;
        aMergeSort(A, first, mid);
        aMergeSort(A, mid + 1, last);
        aMerge(A, first, last);
    } // end if
} // end aMergeSort recursive


Mergesort – “aMerge” Code

private static void aMerge (int[] A, int first, int last) {
    int mid = (first + last) / 2;
    int i1 = 0, i2 = first, i3 = mid + 1;
    int[] temp = new int[last - first + 1];

    // merge the two sorted halves into temp
    while (i2 <= mid && i3 <= last) {
        if (A[i2] < A[i3]) {
            temp[i1] = A[i2];
            i2++;
        } else {
            temp[i1] = A[i3];
            i3++;
        }
        i1++;
    } // end while

    // copy whatever is left of the first half
    while (i2 <= mid) {
        temp[i1] = A[i2];
        i2++;
        i1++;
    } // end while

    // copy whatever is left of the second half
    while (i3 <= last) {
        temp[i1] = A[i3];
        i3++;
        i1++;
    } // end while

    // copy the merged values back into A
    i1 = 0;
    i2 = first;
    while (i2 <= last) {
        A[i2] = temp[i1];
        i1++;
        i2++;
    } // end while
} // end aMerge


Complexity of Mergesort

• Consider the aMergeSort code shown above:

• Suppose that the entire method takes t(n) time, where n is A.length. We want to know the big O notation for t(n).

• There are no loops in aMergeSort, just some constant time operations, the two recursive calls and the call to aMerge:

$$t(n) = a + t\left(\left\lceil \frac{n}{2} \right\rceil\right) + t\left(\left\lfloor \frac{n}{2} \right\rfloor\right) + t_{merge}(n)$$


Complexity of Mergesort - Cont.

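• What is the time function for aMerge?

• There is some O(1) stuff and four loops that are O(n):

$$t_{merge}(n) = a + cn$$

• So,

$$t(n) = a + t\left(\left\lceil \frac{n}{2} \right\rceil\right) + t\left(\left\lfloor \frac{n}{2} \right\rfloor\right) + a + cn$$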



• So far, we have not made any mention of the state of the data. Does it make any difference if the data is in reverse order (worst case), random order (average case) or in order already (best case)?

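• Express t(n) in a recursive expression (the constant terms are combined into a single constant, still written a):

$$t(n) = \begin{cases} b & \text{if } n \le 1 \\ a + t\left(\left\lceil \frac{n}{2} \right\rceil\right) + t\left(\left\lfloor \frac{n}{2} \right\rfloor\right) + cn & \text{otherwise} \end{cases}$$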



• Assume that n is a power of 2:

$$t(n) = \begin{cases} b & \text{if } n \le 1 \\ a + 2t\left(\frac{n}{2}\right) + cn & \text{otherwise} \end{cases}$$

• (It is easy enough to show that the proof still holds when n is not a power of two - but I’m not going to do that here).



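• Substitute n/2 for n, to get t(n/2):

$$t\left(\frac{n}{2}\right) = a + 2t\left(\frac{n}{2^2}\right) + c\frac{n}{2}$$

or,

$$t(n) = a + 2\left(a + 2t\left(\frac{n}{2^2}\right) + c\frac{n}{2}\right) + cn$$

$$t(n) = 3a + 2^2\,t\left(\frac{n}{2^2}\right) + 2cn \qquad (i = 2)$$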



• Do the next unrolling, which will be for n/2^2:

$$t(n) = 7a + 2^3\,t\left(\frac{n}{2^3}\right) + 3cn \qquad (i = 3)$$

• So, after i unrollings:

$$t(n) = (2^i - 1)a + 2^i\,t\left(\frac{n}{2^i}\right) + icn$$



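• This recursion stops when the anchor case, n ≤ 1, is encountered. This will occur when:

$$\frac{n}{2^i} = 1, \quad \text{or when} \quad 2^i = n, \quad \text{or} \quad i = \log_2 n$$

• Substituting this back in the equation on the previous slide:

$$t(n) = (n - 1)a + n\,t\left(\frac{n}{n}\right) + cn\log_2 n$$

or,

$$t(n) = (n - 1)a + n\,t(1) + cn\log_2 n$$

• At the anchor case, t(1) = b, so:

$$t(n) = (n - 1)a + nb + cn\log_2 n = jn + k + cn\log_2 n$$

with j = a + b and k = -a.

• Now the equation can be simplified to yield the big O notation, which indicates that t(n) is O(nlog(n)).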
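The n log(n) result can be checked by counting work rather than timing it. The sketch below is not the course code: `MergeSortCost` and its `copies` counter are names made up for this illustration, and the merge is a compact rewrite of aMerge. It counts every element written into a temporary array; when n is a power of 2 each level of recursion copies n elements and there are log2(n) levels, so the count should equal n·log2(n).

```java
import java.util.Random;

final class MergeSortCost {
    // Counts every element written into a temporary array by all merges.
    static long copies = 0;

    static void sort(int[] a, int first, int last) {
        if (first < last) {
            int mid = (first + last) / 2;
            sort(a, first, mid);
            sort(a, mid + 1, last);
            merge(a, first, mid, last);
        }
    }

    static void merge(int[] a, int first, int mid, int last) {
        int[] temp = new int[last - first + 1];
        int i1 = 0, i2 = first, i3 = mid + 1;
        while (i2 <= mid && i3 <= last) {
            temp[i1++] = (a[i2] < a[i3]) ? a[i2++] : a[i3++];
        }
        while (i2 <= mid) temp[i1++] = a[i2++];
        while (i3 <= last) temp[i1++] = a[i3++];
        copies += temp.length;
        System.arraycopy(temp, 0, a, first, temp.length);
    }

    // Returns the number of copies made while sorting n pseudo-random values.
    static long copiesFor(int n) {
        int[] a = new Random(1).ints(n, 0, 1000).toArray();
        copies = 0;
        sort(a, 0, n - 1);
        return copies;
    }

    public static void main(String[] args) {
        for (int n = 8; n <= 1024; n *= 2) {
            long log2 = Integer.numberOfTrailingZeros(n); // exact log2 for powers of 2
            System.out.println(n + ": copies = " + copiesFor(n)
                    + ", n*log2(n) = " + (n * log2));
        }
    }
}
```

The count does not depend on the order of the data, which matches the earlier question about best, average and worst cases for mergesort.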


public static void quickSort (int[] A, int first, int last) {
    int lower = first + 1;
    int upper = last;
    swap(A, first, (first+last)/2);   // move the middle element to the front as the pivot
    int pivot = A[first];
    while (lower <= upper) {
        // the bound check keeps lower inside the subarray when the pivot is its largest value
        while (lower <= upper && A[lower] < pivot)
            lower++;
        while (A[upper] > pivot)
            upper--;
        if (lower < upper)
            swap(A, lower++, upper--);
        else
            lower++;
    }
    swap(A, upper, first);            // put the pivot in its final position
    if (first < upper - 1)
        quickSort(A, first, upper-1);
    if (upper + 1 < last)
        quickSort(A, upper+1, last);
} // end quickSort(subarrays)


Complexity of Quicksort

• The worst case is when a near-median value is not chosen – the pivot value is always a maximum or a minimum value. Now the algorithm is O(n^2).

• However, if the pivot values are always near the median value of the arrays, the algorithm is O(nlog(n)) – which is the best case. (See the derivation of this complexity for merge sort).

• The average case also turns out to be O(nlog(n)).
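The gap between the two cases can be seen by counting comparisons. The sketch below is an illustration only and is not the quickSort code above: it assumes a simpler single-scan partition, and `QuickSortCost`, `useMiddle` and `comparisonsOnSorted` are hypothetical names. On data that is already in order, always taking the first element as the pivot picks the minimum every time, while taking the middle element picks the median every time.

```java
final class QuickSortCost {
    static long comparisons = 0;

    static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    // Single-scan partition; useMiddle selects the pivot choice being compared.
    static void sort(int[] a, int first, int last, boolean useMiddle) {
        if (first >= last) return;
        if (useMiddle) swap(a, first, (first + last) / 2);
        int pivot = a[first];
        int boundary = first;               // last index holding a value < pivot
        for (int j = first + 1; j <= last; j++) {
            comparisons++;
            if (a[j] < pivot) swap(a, ++boundary, j);
        }
        swap(a, first, boundary);           // pivot moves to its final position
        sort(a, first, boundary - 1, useMiddle);
        sort(a, boundary + 1, last, useMiddle);
    }

    // Sorts 0, 1, ..., n-1 (already in order) and returns the comparison count.
    static long comparisonsOnSorted(int n, boolean useMiddle) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = i;
        comparisons = 0;
        sort(a, 0, n - 1, useMiddle);
        return comparisons;
    }

    public static void main(String[] args) {
        int n = 1023;
        System.out.println("first-element pivot: " + comparisonsOnSorted(n, false));
        System.out.println("middle pivot:        " + comparisonsOnSorted(n, true));
    }
}
```

For n = 1023 the first-element pivot needs n(n-1)/2 = 522753 comparisons, while the middle pivot needs 8194, which is close to n·log2(n).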


Number Representation

• Binary numbers or “base 2” is a natural representation of numbers to a computer.

• As a transition, hexadecimal (or “hex”, base 16) numbers are also used.

• Octal (base 8) numbers are used to a lesser degree.

• Decimal (base 10) numbers are *not* naturally represented in computers.


Number Representation - Cont.

• In base 2 (digits either 0 or 1):

r=2, a binary number: (110101.11)_2 =

1×2^5 + 1×2^4 + 0×2^3 + 1×2^2 + 0×2^1 + 1×2^0 + 1×2^-1 + 1×2^-2

= 53.75 (in base 10)

“r” is the “radix” or the base of the number



• Octal Numbers: a base-8 system with 8 digits: 0, 1, 2, 3, 4, 5, 6 and 7:

• For example:

(127.4)_8 = 1×8^2 + 2×8^1 + 7×8^0 + 4×8^-1 = 87.5



• Hexadecimal Numbers: a base-16 system with 16 digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F:

• For example:

(B65F)_16 = 11×16^3 + 6×16^2 + 5×16^1 + 15×16^0 = 46687.
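The three expansions above all follow the same pattern, so they can be written as one routine. This is a minimal sketch; `RadixToDecimal` and `toDecimal` are names chosen for this example, and no input validation is attempted.

```java
final class RadixToDecimal {
    // Evaluates a digit string such as "127.4" in the given radix (2 to 16).
    static double toDecimal(String digits, int radix) {
        int point = digits.indexOf('.');
        String whole = (point < 0) ? digits : digits.substring(0, point);
        String frac = (point < 0) ? "" : digits.substring(point + 1);

        double value = 0.0;
        for (char ch : whole.toCharArray()) {
            // shift everything one place to the left, then add the new digit
            value = value * radix + Character.digit(ch, radix);
        }
        double weight = 1.0 / radix;        // r^-1, then r^-2, ...
        for (char ch : frac.toCharArray()) {
            value += Character.digit(ch, radix) * weight;
            weight /= radix;
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(toDecimal("110101.11", 2));
        System.out.println(toDecimal("127.4", 8));
        System.out.println(toDecimal("B65F", 16));
    }
}
```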



• The above series show how you can convert from binary, octal or hex to decimal.

• How do you convert from decimal to one of the other bases?

• Integral part: divide by r and keep the remainder.

• Fractional part: multiply by r and keep the carry.

• “r” is the base - either 2, 8 or 16.



• For example, convert (625.76)_10 to binary:

Divisor (r)   Dividend          Remainder
2             625
2             312 (quotient)    1   (least significant digit)
2             156               0
2             78                0
2             39                0
2             19                1
2             9                 1
2             4                 1
2             2                 0
2             1                 0
              0                 1   (most significant digit)

• So, (625)_10 is (1001110001)_2



• For the “(0.76)_10” part:

Multiplier (r)   Multiplicand       Carry
2                0.76
2                1.52 (product)     1
2                1.04               1
2                0.08               0
2                0.16               0
2                0.32               0
2                0.64               0
2                1.28               1
2                0.56               0
2                1.12               1
...

• So, (0.76)_10 is (0.1100001010...)_2

• 625.76 is: (1001110001.1100001010...)_2
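Both tables can be reproduced with a few lines of code. This is a sketch with made-up names (`DecimalToRadix`, `wholePart`, `fractionPart`); the fraction is passed as a numerator and denominator (76/100) so that the arithmetic is exact and is not itself affected by binary storage.

```java
final class DecimalToRadix {
    static final String DIGITS = "0123456789ABCDEF";

    // Integral part: divide by r repeatedly; remainders appear least significant first.
    static String wholePart(long n, int r) {
        if (n == 0) return "0";
        StringBuilder sb = new StringBuilder();
        while (n > 0) {
            sb.append(DIGITS.charAt((int) (n % r)));
            n /= r;
        }
        return sb.reverse().toString();
    }

    // Fractional part num/den (0 <= num < den): multiply by r repeatedly;
    // carries appear most significant first.
    static String fractionPart(long num, long den, int r, int places) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < places; i++) {
            num *= r;
            sb.append(DIGITS.charAt((int) (num / den)));   // the carry
            num %= den;                                     // what is left of the fraction
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(wholePart(625, 2) + "." + fractionPart(76, 100, 2, 10));
    }
}
```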



• Converting between binary, octal and hex is much easier - done by “grouping” the numbers:

• For example:

(010110001101011.111100000110)_2 = (?)_8

010 110 001 101 011 . 111 100 000 110

(2 6 1 5 3 . 7 4 0 6)_8



• Another example:

(2C6B.F06)_16 = (?)_2

(2 C 6 B . F 0 6)_16

(0010 1100 0110 1011 . 1111 0000 0110)_2
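Grouping is mechanical enough to code directly. In this sketch (the class and method names are invented for the example) groups of 3 bits give octal digits and groups of 4 bits give hex digits; bits to the left of the point are padded on the left and bits to the right are padded on the right, so that groups are always counted outward from the point.

```java
final class GroupBits {
    // Converts the bits on one side of the binary point into base 2^groupSize digits.
    static String regroup(String bits, int groupSize, boolean leftOfPoint) {
        StringBuilder padded = new StringBuilder(bits);
        while (padded.length() % groupSize != 0) {
            if (leftOfPoint) {
                padded.insert(0, '0');
            } else {
                padded.append('0');
            }
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < padded.length(); i += groupSize) {
            int value = Integer.parseInt(padded.substring(i, i + groupSize), 2);
            out.append(Character.toUpperCase(Character.forDigit(value, 16)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String whole = "010110001101011";
        String frac = "111100000110";
        System.out.println("octal: " + regroup(whole, 3, true) + "." + regroup(frac, 3, false));
        System.out.println("hex:   " + regroup(whole, 4, true) + "." + regroup(frac, 4, false));
    }
}
```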


From Before: Integer Primitive Types in Java

• For byte, from -128 to 127, inclusive (1 byte).

• For short, from -32768 to 32767, inclusive (2 bytes).

• For int, from -2147483648 to 2147483647, inclusive (4 bytes).

• For long, from -9223372036854775808 to 9223372036854775807, inclusive (8 bytes).

• A “byte” is 8 bits, where a “bit” is either 1 or 0.


Storage of Integers

• An “un-signed” 8 digit binary number can range from 00000000 to 11111111

• 00000000 is 0 in base 10.

• 11111111 is 1×2^0 + 1×2^1 + 1×2^2 + … + 1×2^7 = 255, base 10.


Storage of Integers - Cont.

• So, how can a negative binary number be stored?

• One way is to use the Two’s Complement system of storage.

• Make the most significant bit a negative number:

• So, the lowest “signed” binary 8 digit number is now: 10000000, which is -1×2^7, or -128 base 10.



• Two’s Complement System:

binary base 10

10000000 -128

10000001 -127

11111111 -1

00000000 0

00000001 1

01111111 127



• For example, the binary number 10010101 is

1×2^0 + 1×2^2 + 1×2^4 - 1×2^7

= 1 + 4 + 16 - 128

= -107 base 10

• Now you can see how the primitive integer type, byte, ranges from -128 to 127.
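The weighting rule can be checked against Java's own byte type. In this sketch `TwosComplementByte` and `value` are names made up for the example; the method adds up the bit weights exactly as in the calculation above, with the most significant bit counted as -2^7.

```java
final class TwosComplementByte {
    // Sums the bit weights of an 8-character bit string, with bit 7 weighted -2^7.
    static int value(String bits) {
        int total = 0;
        for (int pos = 0; pos < 8; pos++) {
            if (bits.charAt(7 - pos) == '1') {
                total += (pos == 7) ? -(1 << 7) : (1 << pos);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(value("10010101"));       // weights added by hand
        System.out.println((byte) 0b10010101);       // the same bit pattern stored in a byte
    }
}
```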



• Suppose we wish to add 1 to the largest byte value: 01111111+00000001

• This would be equivalent to adding 1 to 127 in base 10 - the result would normally be 128.

• In base 2, using two’s complement, the result of the addition is 10000000, which is -128 in base 10!

• So integer numbers wrap around, in the case of overflow - no warning is given in Java!
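A short sketch of the wrap-around, and of one way to detect it. The class and method names are invented for the example; `Math.addExact` is a standard library method (Java 8 and later) that throws an `ArithmeticException` instead of wrapping silently.

```java
final class OverflowDemo {
    // Adds 1 to a byte; the cast keeps only the low 8 bits, so 127 wraps to -128.
    static byte next(byte b) {
        return (byte) (b + 1);
    }

    // Reports whether x + y would overflow an int.
    static boolean addOverflows(int x, int y) {
        try {
            Math.addExact(x, y);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(next((byte) 127));
        System.out.println(Integer.MAX_VALUE + 1);            // wraps with no warning
        System.out.println(addOverflows(Integer.MAX_VALUE, 1));
    }
}
```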



• An int is stored in 4 bytes using “two’s complement”.

• An int ranges from:

10000000 00000000 00000000 00000000

to

01111111 11111111 11111111 11111111

or -2147483648 to 2147483647 in base 10


Real Primitive Types

• For float (4 bytes), roughly ±1.4 × 10^-45 to ±3.4 × 10^38 (full precision only above about 10^-38), to 7 significant digits.

• For double (8 bytes), roughly ±4.9 × 10^-324 to ±1.7 × 10^308 (full precision only above about 10^-308), to 15 significant digits.


Storage of Real Numbers

• The system used to store real numbers in Java complies with the IEEE standard number 754.

• Like an int, a float is stored in 4 bytes or 32 bits.

• These bits consist of 24 bits for the mantissa and 8 bits for the exponent:

00000000 00000000 00000000 00000000

mantissa exponent


Storage of Real Numbers - Cont.

• So a value is stored as:

value = mantissa × 2^exponent

• The exponent for a float can range from 2^-128 to 2^128, which is about 10^-38 to 10^38.

• The float mantissa must lie between -1.0 and 1.0 exclusive, and will have about 7 significant digits when converted to base 10.



• The double type is stored using 8 bytes or 64 bits - 53 bits for the mantissa, and 11 bits for the exponent.

• The exponent gives numbers between 2^-1024 and 2^1024, which is about 10^-308 and 10^308.

• The mantissa allows for the storage of about 16 significant digits in base 10.

• (Double.MAX_VALUE is: 1.7976931348623157E308)
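The stored bits of a float can be inspected with `Float.floatToIntBits`. The slides describe the storage in simplified form; the sketch below assumes the actual IEEE 754 single-precision layout of 1 sign bit, 8 exponent bits (stored with a bias of 127) and 23 fraction bits, with a leading 1 that is implied rather than stored. The class and method names are made up for the example.

```java
final class FloatBits {
    static int sign(float f) {
        return Float.floatToIntBits(f) >>> 31;
    }

    // Exponent with the bias of 127 removed (normalized values only).
    static int exponent(float f) {
        return ((Float.floatToIntBits(f) >>> 23) & 0xFF) - 127;
    }

    // The 23 stored fraction bits, without the implied leading 1.
    static int fraction(float f) {
        return Float.floatToIntBits(f) & 0x7FFFFF;
    }

    public static void main(String[] args) {
        float f = 0.1f;
        System.out.println("bits:     " + Integer.toBinaryString(Float.floatToIntBits(f)));
        System.out.println("sign:     " + sign(f));
        System.out.println("exponent: " + exponent(f));
        System.out.println("fraction: " + Integer.toHexString(fraction(f)));
    }
}
```

For 0.1f the exponent is -4, since 0.1 is stored as roughly 1.6 × 2^-4.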



• See the following web site for more info:

http://grouper.ieee.org/groups/754/

• Or:

http://en.wikipedia.org/wiki/IEEE_floating-point_standard



• So, a real number can only occupy a finite amount of storage in memory.

• This effect is very important for two kinds of numbers:
– Numbers like 0.1 that can be written exactly in base 10, but cannot be stored exactly in base 2.
– Real numbers (like π or e) that have an infinite number of digits in their “real” representation can only be stored in a finite number of digits in memory.

• And, we will see that it has an effect on the accuracy of mathematical operations.


Roundoff Error

• Consider 0.1:

(0.1)_10 = (0.0 0011 0011 0011 0011 0011…)_2

• What happens to the part of a real number that cannot be stored?

• It is lost - the number is either truncated or rounded (Java rounds to the nearest value that can be stored).

• The “lost part” is called the Roundoff Error.
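The value that is actually stored can be printed exactly, because the `BigDecimal(double)` constructor converts the stored binary value without any further rounding. A small sketch (the class and method names are invented for the example):

```java
import java.math.BigDecimal;

final class StoredTenth {
    // Exact decimal expansion of the double nearest to 0.1.
    static BigDecimal storedDouble() {
        return new BigDecimal(0.1);
    }

    // Exact decimal expansion of the float nearest to 0.1.
    static BigDecimal storedFloat() {
        return new BigDecimal(0.1f);
    }

    public static void main(String[] args) {
        System.out.println("float:  " + storedFloat());
        System.out.println("double: " + storedDouble());
    }
}
```

Both stored values come out slightly larger than 0.1; the difference from 0.1 is the roundoff error of the stored number.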


Storage of “Real” or “Floating-Point” Numbers - Cont.

• Compute:

$$\sum_{i=1}^{10000} 0.1$$

• And, compare to 1000.

float sum = 0;
for (int i = 0; i < 10000; i++)
    sum += 0.1;
System.out.println(sum);


Storage of “Real” or “Floating-Point” Numbers - Cont.

• Prints a value of 999.9029 to the screen.

• If sum is declared to be a double then the value: 1000.0000000001588 is printed to the screen.

• So, the individual roundoff errors have piled up to contribute to a cumulative error in this calculation.

• As expected, the roundoff error is smaller for a double than for a float.
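The experiment is easy to repeat for both types side by side. This is a sketch with made-up names (`TenthSum`, `floatSum`, `doubleSum`); the loop body is the same as in the fragment shown earlier.

```java
final class TenthSum {
    static float floatSum(int n) {
        float sum = 0;
        for (int i = 0; i < n; i++) {
            sum += 0.1;     // the compound assignment narrows the result back to float
        }
        return sum;
    }

    static double doubleSum(int n) {
        double sum = 0;
        for (int i = 0; i < n; i++) {
            sum += 0.1;
        }
        return sum;
    }

    public static void main(String[] args) {
        float f = floatSum(10000);
        double d = doubleSum(10000);
        System.out.println("float:  " + f + "  absolute error " + Math.abs(1000.0 - f));
        System.out.println("double: " + d + "  absolute error " + Math.abs(1000.0 - d));
    }
}
```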


Roundoff Error – Cont.

• This error is referred to in two different ways:

• The absolute error:

absolute error = |x - x_approx|

• The relative error:

relative error = (absolute error) / |x|


Roundoff Error - Cont.

• So for the calculation of 1000 as shown above, the errors are:

Type     Absolute    Relative
float    0.0971      9.71E-5
double   1.588E-10   1.588E-13

• The relative error is the absolute error divided by the exact value, 1000.


The Effects of Roundoff Error

• Roundoff error can have an effect on any arithmetic operation carried out involving real numbers.

• For example, consider subtracting two numbers that are very close together:

• Use the function

$$f(x) = 1 - \cos(x)$$

for example. As x approaches zero, cos(x) approaches 1.


The Effects of Roundoff Error

• Using double variables, and a value of x of 1.0E-12, f(x) evaluates to 0.0.

• But, it can be shown that the function f(x) can also be represented by f’(x) (just another way of writing f(x), not its derivative):

$$f'(x) = f(x) = \frac{\sin^2(x)}{1 + \cos(x)}$$

• For x = 1.0E-12, f’(x) evaluates to 5.0E-25.

• The f’(x) function is less susceptible to roundoff error.
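Both forms can be tried directly. In this sketch the method names `direct` and `rewritten` are made up for the example; the second form avoids subtracting two nearly equal numbers.

```java
final class Cancellation {
    // f(x) = 1 - cos(x): subtracts two nearly equal numbers when x is small.
    static double direct(double x) {
        return 1.0 - Math.cos(x);
    }

    // The same function written as sin^2(x) / (1 + cos(x)): no subtraction.
    static double rewritten(double x) {
        double s = Math.sin(x);
        return s * s / (1.0 + Math.cos(x));
    }

    public static void main(String[] args) {
        double x = 1.0E-12;
        System.out.println("1 - cos(x)              = " + direct(x));
        System.out.println("sin^2(x) / (1 + cos(x)) = " + rewritten(x));
    }
}
```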


The Effects of Roundoff Error - Cont.

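• Another example. Consider the smallest root of the polynomial: ax^2 + bx + c = 0:

$$x_1 = \frac{-b + \sqrt{b^2 - 4ac}}{2a}$$

• What happens when ac is small, compared to b? The square root is then very close to b, so the numerator subtracts two nearly equal numbers.

• It is known that for the two roots, x1 and x2:

$$x_1 x_2 = \frac{c}{a}$$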



• Which leads to an equation for the root which is not as susceptible to roundoff error:

$$x_1 = \frac{-2c}{b + \sqrt{b^2 - 4ac}}$$

• This equation approaches –c/b instead of zero when ac << b^2.
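A quick comparison of the two formulas, as a sketch. The coefficients a = 1, b = 1.0E8, c = 1 are chosen for this example (they are not from the slides) so that ac is tiny compared to b^2; the true small root is then very close to -c/b = -1.0E-8.

```java
final class SmallRoot {
    // Standard formula: subtracts two nearly equal numbers when ac << b^2 (b > 0).
    static double naive(double a, double b, double c) {
        return (-b + Math.sqrt(b * b - 4 * a * c)) / (2 * a);
    }

    // Rearranged formula using x1 * x2 = c / a: the denominator is a sum.
    static double rewritten(double a, double b, double c) {
        return -2 * c / (b + Math.sqrt(b * b - 4 * a * c));
    }

    public static void main(String[] args) {
        double a = 1, b = 1.0E8, c = 1;
        System.out.println("standard formula:   " + naive(a, b, c));
        System.out.println("rearranged formula: " + rewritten(a, b, c));
        System.out.println("-c/b:               " + (-c / b));
    }
}
```

With these coefficients the standard formula is off in its first significant digit, while the rearranged one agrees with -c/b.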



• The examples above show what can happen when two numbers that are very close are subtracted.

• Remember that this effect is a direct result of these numbers being stored with finite accuracy in memory.



• A similar effect occurs when an attempt is made to add a comparatively small number to a large number:

boolean aVal = ((1.0E10 + 1.0E-20) == 1.0E10);
System.out.println(aVal);

• Prints out true to the screen, since 1.0E-20 is just too small to affect any of the bit values used to store 1.0E10. The small number would have to be about 1.0E-6 or larger to affect the large number.

• So, keep this behaviour in mind when designing expressions!
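The threshold can be found with `Math.ulp`, a standard library method that returns the spacing between a double and the next larger one. An addend smaller than half of that spacing disappears completely. The class and method names in this sketch are made up for the example.

```java
final class Absorption {
    // True when adding small leaves large unchanged.
    static boolean absorbed(double large, double small) {
        return large + small == large;
    }

    public static void main(String[] args) {
        double large = 1.0E10;
        System.out.println("spacing of doubles near 1.0E10: " + Math.ulp(large));
        System.out.println("1.0E-20 absorbed? " + absorbed(large, 1.0E-20));
        System.out.println("1.0E-7  absorbed? " + absorbed(large, 1.0E-7));
        System.out.println("1.0E-6  absorbed? " + absorbed(large, 1.0E-6));
    }
}
```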


The Effect on Summations

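• Taylor Series are used to approximate many functions. For example:

$$\ln(1 + x) = \sum_{i=1}^{\infty} (-1)^{i+1} \frac{x^i}{i}$$

• For ln(2):

$$\ln(2) = \sum_{i=1}^{\infty} \frac{(-1)^{i+1}}{i} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \dots$$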


The Effect on Summations – Cont.

• Since we cannot loop to infinity, how many terms would be sufficient?

• Since the sum is stored in a finite memory space, at some point the terms to be added will be much smaller than the sum itself.

• If the sum is stored in a float, which has about 7 significant digits, a term of about 1×10^-8 would not be significant. So, i would be about 10^8 - that’s a lot of iterations!


The Effect on Summations - Cont.

• On testing using a float, it took 33554433 iterations and 25540 msec to compute! (sum no longer changing, value = 0.6931375)

• Math.log(2) = 0.6931471805599453

• So, roundoff error had a significant effect and the summation did not even provide the correct value. A float could only provide about 5 correct significant digits, tops.

• For double, about 10^15 iterations would be required! (I didn’t try this one…)

• So, this series does not converge quickly, and roundoff error has a strong effect on the answer!
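The iteration count reported above can be estimated without running the loop. The sum stops changing once the term 1/i shrinks to about half the spacing between adjacent floats near the sum, and `Math.ulp` gives that spacing. This is a rough sketch under that assumption, with names invented for the example.

```java
final class SeriesStall {
    // Index at which the term 1/i has shrunk to half the float spacing near sum.
    static long floatStallIndex(float sum) {
        return (long) Math.ceil(2.0 / Math.ulp(sum));
    }

    // The same estimate for a double accumulator.
    static double doubleStallIndex(double sum) {
        return 2.0 / Math.ulp(sum);
    }

    public static void main(String[] args) {
        System.out.println("float:  about " + floatStallIndex(0.6931375f) + " iterations");
        System.out.println("double: about " + doubleStallIndex(0.6931471805599453) + " iterations");
    }
}
```

For float the estimate is 2^25 = 33554432, in line with the 33554433 iterations observed. For double the same estimate is far beyond what can be run in practice.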



• Here is another way to compute natural logs:

$$\ln\left(\frac{1 + x}{1 - x}\right) = 2 \sum_{i=0}^{\infty} \frac{1}{2i + 1}\, x^{2i+1}$$

• Using x = 1/3 will provide ln(2).



• For float, this took 8 iterations and <1 msec (value = 0.6931472).

• Math.log(2) = 0.6931471805599453

• For double, it took 17 iterations, <1 msec to give the value = 0.6931471805599451

• Using the Windows calculator ln(2) = 0.69314718055994530941723212145818 (!!)

• So, the use of the 17 iterations still introduced a slight roundoff error.
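The two series can be compared in a few lines. This is a sketch only (it is not the program used for the timings above, and the names are made up): `lnRatio` sums the fast series until a term no longer changes the sum, and `slowLn2` sums a fixed number of terms of the alternating series for ln(2).

```java
final class FastLog {
    // ln((1+x)/(1-x)) = 2 * sum over i >= 0 of x^(2i+1) / (2i+1), for |x| < 1.
    static double lnRatio(double x) {
        double sum = 0.0;
        double power = x;                       // holds x^(2i+1)
        for (int i = 0; ; i++) {
            double next = sum + power / (2 * i + 1);
            if (next == sum) break;             // the term is too small to matter
            sum = next;
            power *= x * x;
        }
        return 2.0 * sum;
    }

    // Partial sum of 1 - 1/2 + 1/3 - 1/4 + ...
    static double slowLn2(int terms) {
        double sum = 0.0;
        for (int i = 1; i <= terms; i++) {
            sum += ((i % 2 == 1) ? 1.0 : -1.0) / i;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("fast series, x = 1/3:       " + lnRatio(1.0 / 3.0));
        System.out.println("slow series, 1000000 terms: " + slowLn2(1_000_000));
        System.out.println("Math.log(2):                " + Math.log(2));
    }
}
```

Even with a million terms the alternating series is only good to about six digits, while the second series reaches nearly full double precision in a handful of terms.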


Aside - Extended Precision in Windows


Numeric Calculations

• Error is introduced into a calculation through two sources (assuming the formulae are correct!):
– The inherent error in the numbers used in the calculation.
– Error resulting from roundoff error.

• Often the inherent error dominates the roundoff error.

• But, watch for conditions of slow convergence or ill-conditioned matrices, where roundoff error accumulates or is amplified and ends up swamping the inherent error.


Numeric Calculations - Cont.

• Once a number is calculated, it is very important to be able to estimate the error using both sources, if necessary.

• The error must be known in order that the number produced by your program can be reported in a valid manner.

• This is a non-trivial topic in numeric calculation that we will not discuss in this course.


Real World Roundoff Error Disasters

• Ariane 5 Launch

• Patriot Missiles

• See the movie!


Patriot Missile Problem

• The Patriot’s tracking system used a 24 bit number to keep track of the number of tenths of a second passed since the tracking system was turned “on”.

• 0.1 in binary is 0.0001100110011001100110011001100....

• If you only have 24 bits to store this number then the error is 0.0000000000000000000000011001100…. or 0.000000095 in base 10


Patriot Missile Problem, Cont.

• So over 100 hours of operation:

0.000000095×100×60×60×10=0.34 seconds out.

• A Scud is moving at about Mach 5 = 1,676 metres per second, so the error is 0.34×1,676 = 570 metres.
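The arithmetic on these two slides can be reproduced as follows. This is a sketch under a stated assumption: 0.1 is taken to be chopped (not rounded) after 23 binary places, which is the reading that reproduces the 0.000000095 figure above. The class and method names are made up for the example.

```java
final class PatriotDrift {
    // Error in each stored tenth of a second.
    // Assumption: 0.1 is chopped after 23 binary places.
    static double tickError() {
        double scale = 1 << 23;
        double chopped = Math.floor(0.1 * scale) / scale;
        return 0.1 - chopped;
    }

    // Accumulated clock error, in seconds, after the given number of hours.
    static double driftSeconds(double hours) {
        double ticks = hours * 60 * 60 * 10;    // tenths of a second
        return tickError() * ticks;
    }

    // Distance, in metres, travelled by the target during the clock error.
    static double missMetres(double hours, double metresPerSecond) {
        return driftSeconds(hours) * metresPerSecond;
    }

    public static void main(String[] args) {
        System.out.println("error per tick: " + tickError());
        System.out.println("drift after 100 hours: " + driftSeconds(100) + " s");
        System.out.println("distance at 1676 m/s: " + missMetres(100, 1676) + " m");
    }
}
```

Without the intermediate rounding used on the slide, the drift comes out a little above 0.34 seconds and the distance at roughly 575 metres.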
