1
COMS 161 Introduction to Computing
Title: Numeric Processing
Date: November 08, 2004
Lecture Number: 30
2
Announcements
3
Review
• Real numbers
  – Representation
  – Limitations
4
Outline
• Real numbers
  – Representation
  – Limitations
5
IEEE Standard 754
• Provides two floating point types
  – Single
    • 24 bits of significand precision
  – Double
    • 53 bits of significand precision
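The effect of the two precisions can be seen with a small Python sketch. It assumes Python's `float` is an IEEE 754 double (true on common platforms) and uses the standard `struct` module to round a value to single precision; the helper name `round_to_single` is made up for this illustration.

```python
import struct

def round_to_single(x):
    """Round a Python float (a double) to the nearest single, returned as a float."""
    # Packing as a 32-bit float and unpacking again discards the extra precision.
    return struct.unpack(">f", struct.pack(">f", x))[0]

# 24 bits of significand: integers up to 2**24 are exact in single precision,
# but 2**24 + 1 has no single representation and is rounded.
print(round_to_single(16777216.0))   # 2**24
print(round_to_single(16777217.0))   # 2**24 + 1, rounded to a neighbor

# 53 bits of significand: the same thing happens to doubles at 2**53.
print(float(2**53) + 1.0 == float(2**53))
```

The same rounding is what makes the "limitations" of real number storage visible: beyond the significand width, neighboring integers can no longer be told apart.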
6
Single Precision
• IEEE standard 754
  – Floating point number representation
  – 32-bit
s eeeeeeee fffffff ffffffffffffffff
  – s: (1) sign bit
    • 0 means positive, 1 means negative
field:  s    exponent   significand
bits:   31   30 … 23    22 … 0
7
Single Precision
s eeeeeeee fffffff ffffffffffffffff
  – e: (8) exponent bits [-126 … 127]
    • A bias of 127 is added to the exponent
  – f: (24) fractional part [23 bits + 1 implied bit]
    • Normalize the fractional part
    • 1 will always be on the left side of the binary point
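The three fields can be pulled out of a value with a few shifts and masks. The following is a minimal sketch using Python's standard `struct` module; the function name `decode_single` is an assumption made for this example, not part of any library.

```python
import struct

def decode_single(x):
    """Return (sign, biased exponent, 23 stored fraction bits) of x as a single."""
    # Pack x as a big-endian 32-bit float, then reread the same 4 bytes as an integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # bit 31
    exponent = (bits >> 23) & 0xFF    # bits 30 … 23, still biased by 127
    fraction = bits & 0x7FFFFF        # bits 22 … 0, implied leading 1 not stored
    return sign, exponent, fraction

s, e, f = decode_single(-2.5)         # -2.5 = -1.01 (binary) x 2^1
print(s, e - 127, format(f, "023b"))  # sign, unbiased exponent, fraction bits
```

Subtracting 127 from the stored exponent recovers the true exponent for normalized numbers.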
8
Special Single Cases
• Two zeros
  – Signed zero
  – e = 0, f = 0 (exponent and fractional bits are all 0)
  – (-1)^s x 0.0
• 0000 0000 0000 0000 0000 0000 0000 0000
  – 0x0000 0000 (+0)
• 1000 0000 0000 0000 0000 0000 0000 0000
  – 0x8000 0000 (-0)
9
Special Single Cases
• Positive infinity
  – +INF
  – s = 0, e = 255, f = 0 (fractional bits are all 0)
    • 0111 1111 1000 0000 0000 0000 0000 0000
    • 0x7f80 0000
• Negative infinity
  – -INF
  – s = 1, e = 255, f = 0 (fractional bits are all 0)
    • 1111 1111 1000 0000 0000 0000 0000 0000
    • 0xff80 0000
10
Special Single Cases
• Not-A-Number (NaN)
  – s = 0 | 1, e = 255, f != 0 (at least one fractional bit is NOT 0)
  – There are many representations for NaN
  – Here is one example
    • 0111 1111 1100 0000 0000 0000 0000 0000
    • 0x7fc0 0000
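The special bit patterns for zero, infinity and NaN can be checked directly. This is a small sketch using Python's standard `struct` and `math` modules; the helper names `single_hex` and `single_from_bits` are invented for the example.

```python
import math
import struct

def single_hex(x):
    """Hex bit pattern of x when stored as an IEEE 754 single."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return f"0x{bits:08x}"

def single_from_bits(bits):
    """Interpret a 32-bit integer as an IEEE 754 single."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

print(single_hex(0.0), single_hex(-0.0))             # the two zeros
print(single_hex(math.inf), single_hex(-math.inf))   # the two infinities

# NaN is decoded from the slide's example pattern rather than encoded,
# because the exact NaN bits a platform produces may vary.
print(math.isnan(single_from_bits(0x7FC00000)))
```

Note that `0.0 == -0.0` is true even though the two zeros have different bit patterns, and that a NaN never compares equal to anything, including itself.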
11
Special Single Cases
• Maximum single number
  – 0111 1111 0111 1111 1111 1111 1111 1111
  – 0x7f7f ffff
  – 3.40282347 x 10^38
• Minimum positive (normalized) single number
  – 0000 0000 1000 0000 0000 0000 0000 0000
  – 0x0080 0000
  – 1.17549435 x 10^-38
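Both limits follow from the field values: the maximum has the largest usable exponent (254 - 127 = 127) with all fraction bits set, and the minimum normalized number has stored exponent 1 with all fraction bits clear. A short sketch, with `single_from_bits` as a made-up helper name:

```python
import struct

def single_from_bits(bits):
    """Interpret a 32-bit integer as an IEEE 754 single."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

largest = single_from_bits(0x7F7FFFFF)
smallest_normal = single_from_bits(0x00800000)

# (2 - 2^-23) x 2^127: significand 1.111…1 with the top normal exponent.
print(largest, largest == (2 - 2**-23) * 2.0**127)
# 1.0 x 2^-126: only the implied 1 bit, with the bottom normal exponent.
print(smallest_normal, smallest_normal == 2.0**-126)
```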
• To represent larger numbers
12
Double Precision
• IEEE standard 754
  – Floating point number representation
  – 64-bit
s eeeeeeeeeee ffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
  – s: (1) sign bit
    • 0 means positive, 1 means negative
field:  s    exponent   significand
bits:   63   62 … 52    51 … 0 (upper word holds 51 … 32, lower word holds 31 … 0)
13
Double Precision
s eeeeeeeeeee ffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
  – e: (11) exponent bits [-1022 … 1023]
    • A bias of 1023 is added to the exponent
  – f: (53) fractional part [52 bits + 1 implied bit]
    • Normalize the fractional part
    • 1 will always be on the left side of the binary point
14
Real (Decimal) Number Storage
• Double precision floating point numbers
  – s: (1) sign bit
  – e: (11) exponent bits [-1022 … 1023]
  – f: (53) fractional part [52 bits + 1 implied bit]
Byte 0     Byte 1     Byte 2     Byte 3
seeeeeee   eeeeffff   ffffffff   ffffffff
Byte 4     Byte 5     Byte 6     Byte 7
ffffffff   ffffffff   ffffffff   ffffffff
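The byte layout above can be reproduced for any value. The sketch below uses Python's standard `struct` module and packs in big-endian order so that byte 0 is the one holding the sign bit, as in the diagram; on little-endian machines the bytes sit in memory in the opposite order. The name `decode_double` is chosen for this example only.

```python
import struct

def decode_double(x):
    """Return (raw 8 bytes, sign, biased exponent, 52 stored fraction bits)."""
    raw = struct.pack(">d", x)             # byte 0 first, as in the diagram
    (bits,) = struct.unpack(">Q", raw)     # same 8 bytes as one 64-bit integer
    sign = bits >> 63                      # bit 63
    exponent = (bits >> 52) & 0x7FF        # bits 62 … 52, biased by 1023
    fraction = bits & ((1 << 52) - 1)      # bits 51 … 0
    return raw, sign, exponent, fraction

raw, s, e, f = decode_double(-2.5)
print(" ".join(format(b, "08b") for b in raw))   # the eight bytes in binary
print(s, e - 1023, format(f, "052b"))
```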
15
Special Double Cases
• Two zeros
  – Signed zero
  – e = 0, f = 0 (exponent and fractional bits are all 0)
  – (-1)^s x 0.0
• 64 bits
• 0000 0000 0000 0000 0000 0000 0000 … 0000
  – 0x0000 0000 0000 0000 (+0)
• 1000 0000 0000 0000 0000 0000 0000 … 0000
  – 0x8000 0000 0000 0000 (-0)
16
Special Double Cases
• Positive infinity
  – +INF
  – s = 0, e = 2047, f = 0 (fractional bits are all 0)
    • 0111 1111 1111 0000 0000 0000 0000 … 0000
    • 0x7ff0 0000 0000 0000
• Negative infinity
  – -INF
  – s = 1, e = 2047, f = 0 (fractional bits are all 0)
    • 1111 1111 1111 0000 0000 0000 0000 … 0000
    • 0xfff0 0000 0000 0000
17
Special Double Cases
• Not-A-Number (NaN)
  – s = 0 | 1, e = 2047, f != 0 (at least one fractional bit is NOT 0)
  – There are many representations for NaN
  – Here is one example
    • 0111 1111 1111 1000 0000 0000 0000 … 0000
    • 0x7ff8 0000 0000 0000
18
Special Double Cases
• Maximum double number
  – 0111 1111 1110 1111 1111 1111 1111 … 1111
  – 0x7fef ffff ffff ffff
  – 1.7976931348623157 x 10^308
• Minimum positive (normalized) double number
  – 0000 0000 0001 0000 0000 0000 0000 … 0000
  – 0x0010 0000 0000 0000
  – 2.2250738585072014 x 10^-308
  – Don’t forget about the implied 1 bit!!
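The double precision special cases can be decoded from their hex patterns and compared with the values Python reports. This sketch assumes Python's `float` is an IEEE 754 double, which holds on common platforms; `double_from_hex` is a helper name invented here.

```python
import math
import struct
import sys

def double_from_hex(hex_digits):
    """Interpret 16 hex digits (most significant byte first) as an IEEE 754 double."""
    return struct.unpack(">d", bytes.fromhex(hex_digits))[0]

print(double_from_hex("0000000000000000"), double_from_hex("8000000000000000"))  # +0, -0
print(double_from_hex("7ff0000000000000"), double_from_hex("fff0000000000000"))  # +INF, -INF
print(math.isnan(double_from_hex("7ff8000000000000")))                           # one NaN

# The two limits match what the interpreter itself reports for its float type.
print(double_from_hex("7fefffffffffffff") == sys.float_info.max)
print(double_from_hex("0010000000000000") == sys.float_info.min)
```

The minimum comes out as 1.0 x 2^-1022 because of the implied 1 bit: every stored fraction bit is 0, yet the significand is 1.0, not 0.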
19
Decimal to Float Conversion
• Show -24.125_10 in IEEE single precision format
  – First, save the sign (negative, so 1) and convert to binary…
  – 24.125_10 = 11000.001_2 x 2^0
  – Normalize…
  – = 1.1000001_2 x 2^4
  – Strip the 1 off the mantissa and extend to form the significand
  – = .10000010000000000000000
  – Bias the exponent…
  – Exp + Bias = 4 + 127 = 131 = 10000011_2
20
Real (Decimal) Number Storage
Bit   31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
Value  1  1  0  0  0  0  0  1  1  1  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
• Hex value: 0xC1C10000
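The conversion steps (save the sign, normalize, strip the leading 1, bias the exponent) can be written out as a short Python sketch and checked against the standard `struct` module. The function `to_single_bits` is hypothetical and deliberately limited: it handles only finite, nonzero values in the normalized range, and it truncates extra fraction bits instead of rounding as real hardware does.

```python
import math
import struct

def to_single_bits(value):
    """Hand conversion of a normal, finite, nonzero value to single-precision bits."""
    if value == 0 or not math.isfinite(value):
        raise ValueError("only finite, nonzero values are handled in this sketch")
    sign = 1 if value < 0 else 0           # save the sign
    magnitude = abs(value)
    exponent = 0
    while magnitude >= 2.0:                # normalize to 1.xxx x 2^exponent
        magnitude /= 2.0
        exponent += 1
    while magnitude < 1.0:
        magnitude *= 2.0
        exponent -= 1
    if not -126 <= exponent <= 127:
        raise ValueError("outside the normalized single-precision range")
    fraction = int((magnitude - 1.0) * (1 << 23))   # strip the 1, keep 23 bits
    biased = exponent + 127                         # bias the exponent
    return (sign << 31) | (biased << 23) | fraction

bits = to_single_bits(-24.125)
print(f"0x{bits:08X}")
print(struct.pack(">f", -24.125).hex())    # the library's encoding, for comparison
```

For -24.125 both lines show the same eight hex digits, since 24.125 fits exactly in 23 fraction bits and no rounding is involved.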