Numbering Systems - III
Floating Point
• Real numbers are stored in scientific notation
• The components of the number are stored separately in binary
– The sign
– The exponent
– The fraction (called the mantissa or significand)
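As a rough illustration of "scientific notation in binary", the sketch below splits a nonzero finite number into a sign, a significand in the range [1, 2), and a power-of-two exponent. The function name `scientific_base2` is made up for this example, and it relies on Python's standard `math.frexp`, which returns a significand in [0.5, 1) that then has to be rescaled.

```python
import math

def scientific_base2(x):
    """Split a nonzero finite float into (sign, significand, exponent)
    so that x == (-1)**sign * significand * 2**exponent, with
    1 <= significand < 2. Hypothetical helper for illustration only."""
    if x == 0 or math.isinf(x) or math.isnan(x):
        raise ValueError("only nonzero finite values are handled in this sketch")
    sign = 1 if x < 0 else 0
    # frexp returns m in [0.5, 1) and e with abs(x) == m * 2**e
    m, e = math.frexp(abs(x))
    # Rescale so the significand lies in [1, 2)
    return sign, m * 2, e - 1

sign, significand, exponent = scientific_base2(-6.5)
print(sign, significand, exponent)
```

For -6.5 this gives a sign of 1, a significand of 1.625 and an exponent of 2, since 1.625 × 4 = 6.5.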
Floating Point
• There are an infinite number of real numbers
• Not all real numbers can be represented
• If a number cannot be represented exactly, it is rounded
• A number may be too large or too small to be represented
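The effects listed above can be observed directly. The sketch below assumes Python, whose `float` is a double-precision value on common platforms; `to_float32` is a helper invented here that rounds a value to single precision by packing it with the standard `struct` module.

```python
import math
import struct
from decimal import Decimal

def to_float32(x):
    """Round a Python float (double) to the nearest single-precision value.
    Hypothetical helper for illustration."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

# 0.1 has no exact binary representation, so the stored double is only close to it
exact_value_of_tenth = Decimal(0.1)
tenth_is_exact = exact_value_of_tenth == Decimal("0.1")

# Rounding to single precision moves the value again
tenth_as_single = to_float32(0.1)

# Too large: the product exceeds the largest double and becomes infinity
too_large = 1e308 * 10

# Too small: halving the smallest positive double rounds to zero
too_small = 5e-324 / 2

print(tenth_is_exact, tenth_as_single != 0.1, math.isinf(too_large), too_small)
```

The four printed values show, in turn, that 0.1 is not stored exactly, that single precision rounds it differently from double precision, that an overly large result overflows to infinity, and that an overly small one underflows to zero.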
Floating Point
• Single-precision – 32 bits
– 1 bit - sign
– 8 bits - exponent
– 23 bits - fraction
• Double-precision – 64 bits
– 1 bit - sign
– 11 bits - exponent
– 52 bits - fraction
• Note how the fraction grows far more than the exponent between single-precision and double-precision: 29 extra fraction bits (23 to 52) versus 3 extra exponent bits (8 to 11)
– Accuracy (precision) is more important than range
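To see what the field widths buy, the following sketch estimates the number of reliable decimal digits and the largest finite value from the exponent and fraction sizes alone. The function names are invented for this example, and the formulas assume the usual normalized layout with an implicit leading 1; the double-precision result is compared against Python's own `sys.float_info`.

```python
import math
import sys

def decimal_digits(fraction_bits):
    """Decimal digits that survive a round trip, estimated from the
    number of fraction bits (hypothetical helper)."""
    return math.floor(fraction_bits * math.log10(2))

def largest_finite(exponent_bits, fraction_bits):
    """Largest finite value: an all-ones fraction with the largest
    usable exponent (the all-ones exponent field is assumed reserved)."""
    max_exponent = 2 ** (exponent_bits - 1) - 1
    largest_significand = 2 - 2.0 ** -fraction_bits
    return largest_significand * 2.0 ** max_exponent

for name, e_bits, f_bits in [("single", 8, 23), ("double", 11, 52)]:
    print(name, decimal_digits(f_bits), largest_finite(e_bits, f_bits))

# Cross-check the double-precision figures against the running interpreter
matches_python = (largest_finite(11, 52) == sys.float_info.max
                  and decimal_digits(52) == sys.float_info.dig)
```

Under these assumptions single precision keeps about 6 decimal digits and double precision about 15, which is the gain in precision the slide points to.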
Floating Point
• The sign is stored as you would expect
– 1 = negative
– 0 = positive
• The other components are not stored exactly as you would expect
• The exponent is offset with a bias
– The bias is added to the exponent
– The bias is different for single- and double-precision
– 127 for single-precision
– 1023 for double-precision
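A minimal sketch of the biasing step follows. The helper names are invented for this example; the bias is computed as 2^(k-1) - 1 for a k-bit exponent field, which reproduces the 127 and 1023 quoted above.

```python
def bias(exponent_bits):
    """Bias for an exponent field of the given width (hypothetical helper)."""
    return 2 ** (exponent_bits - 1) - 1

def encode_exponent(true_exponent, exponent_bits):
    """Add the bias to get the value that is actually stored."""
    return true_exponent + bias(exponent_bits)

def decode_exponent(stored_exponent, exponent_bits):
    """Subtract the bias to recover the true exponent."""
    return stored_exponent - bias(exponent_bits)

print(bias(8), bias(11))
# A true exponent of -3 in single precision, shown as the 8 stored bits
print(format(encode_exponent(-3, 8), "08b"))
```

Because the bias is added before storage, negative exponents are stored as small unsigned numbers and no separate sign bit is needed for the exponent.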
Floating Point
• The fraction is also not what you would expect
• The integer part of the number is converted to binary like an unsigned number
• The part after the decimal point is converted by dividing it by 2^-X (that is, 0.5, 0.25, 0.125, ...), with X increasing from 1 and each remainder carried into the next division. The quotients, read in order, are the binary digits
• The left and right parts are concatenated together and the result is shifted into normalized form
– The amount of the shift is the exponent
• The first 1 is dropped from the fraction since it will always be there
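The conversion steps above can be sketched as follows. Both function names are invented for this example, and `normalize` only handles values of 1 or greater, where the shift is to the right; it is an illustration of the procedure, not a full encoder.

```python
def fraction_to_binary(frac, max_bits=23):
    """Convert a fractional part (0 <= frac < 1) to binary digits by
    dividing by 2^-1, 2^-2, ... and keeping each integer quotient."""
    bits = []
    for x in range(1, max_bits + 1):
        if frac == 0:
            break  # nothing left to convert
        power = 2.0 ** -x
        quotient = int(frac // power)  # 1 if this power of two fits, else 0
        bits.append(str(quotient))
        frac -= quotient * power       # the remainder feeds the next step
    return "".join(bits)

def normalize(value):
    """Return (exponent, stored_fraction) for a positive value >= 1.
    The leading 1 of the normalized form is dropped from the fraction."""
    if value < 1:
        raise ValueError("this sketch only handles values >= 1")
    integer_part = int(value)
    int_bits = format(integer_part, "b")          # unsigned binary
    frac_bits = fraction_to_binary(value - integer_part)
    exponent = len(int_bits) - 1                  # places the point is shifted
    stored_fraction = int_bits[1:] + frac_bits    # drop the implicit leading 1
    return exponent, stored_fraction

print(fraction_to_binary(0.625))
print(normalize(118.625))
```

For 118.625 the sketch yields an exponent of 6 and the stored fraction bits 110110101, before padding with zeros to the full fraction width.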
Floating Point
• Example: -118.625
• 118 = 1110110₂
• .625 = .101₂
• Result: 1110110.101
• Shift into normalized form
• 1.110110101 × 2^6
• Exponent: 6 + 127 = 133 = 10000101₂
Operation                  Quotient   Remainder
0.625 ÷ 0.5   (2^-1)       1          0.125
0.125 ÷ 0.25  (2^-2)       0          0.125
0.125 ÷ 0.125 (2^-3)       1          0
Sign   Exponent   Fraction
1      10000101   11011010100000000000000
Tool: http://www.h-schmidt.net/FloatApplet/IEEE754.html
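The worked example can also be checked without an online tool. The sketch below uses Python's standard `struct` module to pack a value as a big-endian single-precision float and then slices the 32 bits into the three fields; `float32_fields` is a name made up for this example.

```python
import struct

def float32_fields(value):
    """Return the (sign, exponent, fraction) bit strings of a value
    stored as a single-precision float. Hypothetical helper."""
    # Pack as a 32-bit float, then reinterpret the same bytes as an unsigned int
    (raw,) = struct.unpack(">I", struct.pack(">f", value))
    bits = format(raw, "032b")
    return bits[0], bits[1:9], bits[9:]

sign, exponent, fraction = float32_fields(-118.625)
print(sign, exponent, fraction)
# → 1 10000101 11011010100000000000000
```

The exponent field reads 133 as an unsigned integer, which is 6 plus the single-precision bias of 127.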