Basic data types and their representation
CS101 2012.1
Announcements
If biometric ID does not work, write your roll number and sign your name on a piece of paper
All lab batches should be stable now
Lab this week will continue with familiarization and small programs
No tutorials this week either
But we will begin posting homework on Moodle
• Will give you an impression of exam questions
• Will not be graded
• Will be discussed in tutorials

Layers of abstraction
Three layers
• Memory as an array of bytes
• Primitive data types: character, integer, float, double
• Collection types: arrays, matrices, lists, maps, strings
Implement fixed-size primitive types by mapping possible/supported values to bit patterns
Add collection types on top of primitive types to assist writing complicated programs
Collections usually change size and memory layout during program execution

Memory, values, variables
Unit of storage: bit (0/1)
• Because such computers are easier to implement by switching transistors off and on
A byte is 8 bits wide
• Values range from 00000000 to 11111111
• 2^8 = 256 possible bit configurations
• Can be interpreted as integers from 0 to 255 (“unsigned char”)
• Electronic and magnetic memory is allocated in units of bytes

Binary arithmetic
Byte value in binary: 00000000 (8 bits)
Corresponding decimal value = 0
• Written as 0dec to avoid confusion
In decimal, to increment a number, increment the units position; if that digit overflows, reset it to 0 and carry over to the next position, and so on
Same in binary
• Next few values are 00000001 (=1dec), 00000010 (=2dec), 00000011 (=3dec), 00000100 (=4dec), 00000101 (=5dec) etc.
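The counting sequence above can be checked with a small sketch. It relies on std::bitset from the standard library; the helper name toBinary is an assumption made for illustration and is not part of the course material.

```cpp
#include <bitset>
#include <string>

// Hypothetical helper: render one byte as its 8-character binary string.
std::string toBinary(unsigned char value) {
    return std::bitset<8>(value).to_string();
}
```

Calling toBinary on 0, 1, 2, ... in turn reproduces the bit patterns listed above, one increment at a time.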

Character (char)
To a first approximation, a character is the same as an 8-bit byte
• (More recently, multi-byte characters have been designed to support all the world’s languages)
The key difference is in how the byte is interpreted and processed (e.g., printed)
• E.g., 97 means 'a', 98 = 'b', 65 = 'A', 66 = 'B' etc.
C++ lets you compare characters using the corresponding integer
• Useful for sorting strings in dictionary order
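A minimal sketch of the character-to-integer correspondence, assuming an ASCII-compatible character set (true on most current systems, but not guaranteed by C++). The helper names codeOf and comesBefore are hypothetical.

```cpp
// Hypothetical helpers; both assume an ASCII-compatible character set.

// The integer code stored in the byte behind a character.
int codeOf(char c) { return static_cast<int>(c); }

// Characters compare by their integer codes.
bool comesBefore(char a, char b) { return a < b; }
```

Note that under ASCII every upper-case letter compares as smaller than every lower-case letter, so plain code comparison is only a first approximation to dictionary order.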

Hexadecimal notation (hex)
Byte (8 bits) consists of two “nibbles” (4 bits)
Nibble ranges between 0 and 15
Expressed in hexadecimal, 0 to 9, a to f
• a=10, b=11, c=12, d=13, e=14, f=15
So a byte is written as two hexadecimal digits, e.g. 0a or c5
Note that 23 hex is not 23 decimal!
• To make clear, written as 0x23
printf demo
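The slide's printf demo is not reproduced in the transcript. The following is one possible sketch of such a demo, not the original; it uses the standard "%02x" format, and the helper name toHex is an assumption.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: format one byte as two lower-case hexadecimal digits.
std::string toHex(unsigned char value) {
    char buffer[3];  // two hex digits plus the terminating null byte
    std::snprintf(buffer, sizeof buffer, "%02x", static_cast<unsigned>(value));
    return std::string(buffer);
}
```

In source code the 0x prefix marks a hexadecimal literal, so 0x23 denotes the decimal value 35, and toHex(0x23) gives back the digits "23".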

Fixed size integer types
“Short integers” (short) are 16 bits wide
• 65536 possible values
Standard integers (int) are 32 bits wide
• 4,294,967,296 possible values
• Adequate for most purposes except governments bailing out banks and airlines
A long long int is 64 bits wide
• We will sometimes call it long for brevity (as in Java)
Real numbers are represented using float and double (“double precision”) … later
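The widths above are what common desktop compilers use; the C++ standard itself only guarantees minimum widths. A hedged sketch for checking them on a given machine follows. The helpers bitsOf and largest are hypothetical names; the fixed-width types come from the standard header cstdint.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper: number of bits occupied by type T on this platform.
template <typename T>
constexpr std::size_t bitsOf() { return sizeof(T) * CHAR_BIT; }

// Hypothetical helper: largest value that type T can hold.
template <typename T>
constexpr T largest() { return std::numeric_limits<T>::max(); }
```

For example, bitsOf<short>() and bitsOf<int>() report what the local compiler chose, while std::int16_t and std::int32_t are exactly 16 and 32 bits wherever they exist.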

Two’s complement representation
Want to represent both positive and negative integers with a bit sequence (say 4 bits)
Trivial: use one bit for sign
• Wastes one configuration (plus and minus zero)
0000 (0) through 0111 (7) are non-negative
8 more values, so assign to -8 through -1
Binary  Decimal  Binary  Decimal
1000    -8       1100    -4
1001    -7       1101    -3
1010    -6       1110    -2
1011    -5       1111    -1

The wrap-around
Zero is one position to the right of center
• -1 = 1…1, 0 = 0…0
• Max = 01…1, Min = 10…0

Two’s complement, continued
One sudden “wrap-around” from 7 to -8
Works exactly the same for short, int and long int, with corresponding wrap, max, min values
Most programming systems will not detect if the wrap happens
If your program uses values near the edges, be careful in doing arithmetic and check the result!
Library packages exist to support arbitrarily large integers, but they are not as efficient as fixed-length integers
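The 4-bit table and the advice to check results near the edges can be sketched as follows. Both function names are hypothetical. The check is done before adding because overflowing a signed int is undefined behaviour in C++, so the wrap cannot be relied on after the fact.

```cpp
#include <limits>

// Interpret the low 4 bits of `bits` as a 4-bit two's complement number.
int decode4(unsigned bits) {
    bits &= 0xFu;  // keep only 4 bits
    // Patterns with the top bit set stand for the value minus 16.
    return (bits & 0x8u) ? static_cast<int>(bits) - 16 : static_cast<int>(bits);
}

// Report whether a + b would leave the range of int, without adding.
bool addWouldWrap(int a, int b) {
    if (b > 0) return a > std::numeric_limits<int>::max() - b;
    if (b < 0) return a < std::numeric_limits<int>::min() - b;
    return false;
}
```

With this sketch, decode4(0x7 + 1) shows the 4-bit wrap from 7 to -8, and addWouldWrap can guard an addition whose operands may be near the maximum or minimum.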

Real number representations
“Floating (decimal) point”
In decimal we write 0.314×10^11
• 0.314 is the mantissa, 11 is the exponent
• Mantissa has decimal point at beginning
Same approach in computers, with radix 2 instead of 10
In a float
• 1 sign, 8 exponent, 23 mantissa bits
In a double
• 1 sign, 11 exponent, 52 mantissa bits
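A sketch that splits a float into the three fields listed above. It assumes the IEEE 754 single-precision layout, which nearly all current hardware uses but which C++ does not mandate. In that layout the 8-bit exponent is stored with a bias of 127. The struct and function names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical container for the three bit fields of a 32-bit float.
struct FloatFields {
    std::uint32_t sign;      // 1 bit
    std::uint32_t exponent;  // 8 bits, stored with a bias of 127 (IEEE 754)
    std::uint32_t mantissa;  // 23 bits
};

FloatFields decompose(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // copy the raw bit pattern
    return FloatFields{bits >> 31, (bits >> 23) & 0xFFu, bits & 0x7FFFFFu};
}
```

For 1.0f the stored exponent field is 127 and the mantissa field is zero; negating a value flips only the sign field.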

Floating point numbers
Type    Bits to store  Magnitude of maximum value  Magnitude of minimum value
float   32             3.4×10^38                   1.4×10^-45
double  64             1.798×10^308                4.9×10^-324
Finite bits cannot represent all real values
• Gaps between numbers that can be represented
• Need care in writing expressions that combine values to avoid errors, minimize loss of precision
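The gaps between representable values can be measured directly. This sketch uses std::nextafter from the standard header cmath and assumes IEEE 754 floats; the helper name gapAbove is hypothetical.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: distance from x to the next larger representable float.
float gapAbove(float x) {
    return std::nextafter(x, std::numeric_limits<float>::infinity()) - x;
}
```

The gap grows with magnitude: it is tiny near 1 and already several whole units near 10^8, which is why adding a small number to a large one can have no effect.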

Some finite precision pitfalls
Some 32- and 64-bit patterns have been set aside to represent
• Positive and negative infinity
• Not a number or NaN (e.g. result of 0/0)
Most systems will detect overflow but not underflow
• float a = 3.3e38 / 0.01; correctly results in a being “inf”
• But 3.3e38 + 5 silently equals 3.3e38 (not enough bits in mantissa)
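The three cases above can be tried with a short sketch, assuming IEEE 754 floating point and default compiler settings (no fast-math options). The function names are hypothetical.

```cpp
#include <cmath>

// Dividing a huge float by a small one overflows to infinity.
float overflowingQuotient() {
    float big = 3.3e38f;
    float divisor = 0.01f;
    return big / divisor;
}

// Adding 5 to a huge float is lost: the mantissa has too few bits.
bool smallAddendIsLost() {
    float big = 3.3e38f;
    float sum = big + 5.0f;
    return sum == big;
}

// 0/0 in floating point yields NaN.
float zeroOverZero() {
    float zero = 0.0f;
    return zero / zero;
}
```

The results can be inspected with std::isinf and std::isnan from cmath; a NaN also compares unequal to itself.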

Operations on numeric types
All integers support +, -, *, /, % (remainder)
Even characters support + and -
• E.g., 'a' + 1 = 'b'; what is 'Z' + 1? (Try it)
Float and double support +, -, *, /
More complicated operations like log, exp, sine, etc. are implemented as functions
You can compare numbers using comparison operators <, <=, ==, >=, >, !=
• The result is a Boolean (0/1) value (next)
• cout << (5 > 7);
• cout << (4 != 3);
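A small sketch of character arithmetic and the remainder operator, again assuming an ASCII-compatible character set. The names nextChar and intRemainder are hypothetical.

```cpp
// Hypothetical helpers illustrating arithmetic on chars and ints.

// The character whose code is one more than that of c.
char nextChar(char c) { return static_cast<char>(c + 1); }

// Remainder of integer division.
int intRemainder(int a, int b) { return a % b; }
```

nextChar('a') yields 'b'; passing 'Z' answers the "Try it" question above. Comparisons such as (5 > 7) and (4 != 3) evaluate to false and true, which cout prints as 0 and 1 by default.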

Boolean values and operations
In C++, int can be reused as Boolean (0 = false, anything else is true)
Binary operator && (and)
Binary operator || (or)
Short-circuit evaluation
x y x || y
0 0 0
0 1 1
1 0 1
1 1 1
x y x && y
0 0 0
0 1 0
1 0 0
1 1 1
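Short-circuit evaluation means the right operand of && is evaluated only when the left operand is true (and the right operand of || only when the left is false). A sketch of a typical use, with a hypothetical function name:

```cpp
// Hypothetical example: the division is attempted only when the
// denominator is non-zero, because && stops at the first false operand.
bool ratioExceeds(int numerator, int denominator, int threshold) {
    return denominator != 0 && numerator / denominator > threshold;
}
```

Calling ratioExceeds(10, 0, 1) is safe: the left operand is false, so the division by zero is never evaluated.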

Not and ex(clusive) or
Unary operator ! (not)
There is no dedicated logical exor operator for single Booleans; the ^ operator works on bit vectors instead (next)
Input x Output !x
0 1
1 0
x y x ^ y
0 0 0
0 1 1
1 0 1
1 1 0

The bool type
Old C++ used int to store Boolean values
But ANSI standard C++ does offer a type called bool
• bool tval = true, fval = false;
• int ival = int(tval);
However, old bad habits still allowed
• if (37) { … }
• bool bval = 37;
Overall value unclear

Bit array manipulation
Fixed size integers are arrays of bits
C++ lets you do bitwise Boolean algebra
• a & b (and), a | b (or), a ^ b (exor), ~b (not)
10110110 & 10010101 = 10010100
10110110 ^ 10010101 = 00100011
10110110 | 10010101 = 10110111
~00100011 = 11011100
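The four worked examples above can be reproduced on 8-bit values. In hexadecimal, 10110110 is 0xB6 and 10010101 is 0x95. The function names are hypothetical; the casts are needed because C++ promotes small integers to int before applying bitwise operators.

```cpp
#include <cstdint>

// Hypothetical wrappers around the bitwise operators, restricted to one byte.
std::uint8_t andBytes(std::uint8_t a, std::uint8_t b) { return static_cast<std::uint8_t>(a & b); }
std::uint8_t orBytes(std::uint8_t a, std::uint8_t b)  { return static_cast<std::uint8_t>(a | b); }
std::uint8_t xorBytes(std::uint8_t a, std::uint8_t b) { return static_cast<std::uint8_t>(a ^ b); }

// ~ operates on the promoted int, so the cast keeps only the low 8 bits.
std::uint8_t notByte(std::uint8_t b) { return static_cast<std::uint8_t>(~b); }
```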

Bit shift operations
int c = 5; cout << (c << 2);
• Bits lost from the left (msb)
• Zero bits inserted from the right (lsb)
• Result is 20 (= 5 × 2^2)
• Cheap way to multiply by powers of two
c      = 00000000,00000000,00000000,00000101
c << 2 = 00000000,00000000,00000000,00010100

Right shift
c >> 2
• Bits discarded to the right (lsb)
If msb of c was 0, then 0 bits injected from left (msb)
• 5 >> 2 gives 1
If msb of c was 1 (c was negative) then 1 bits injected from left
• -5 >> 2 gives -2 (work it out)
• As 32-bit patterns of a signed int: 0xfffffffb >> 2 gives 0xfffffffe
Preserves sign of number
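A sketch of both shift directions follows. One caveat: right-shifting a negative signed int was implementation-defined before C++20, although common compilers inject 1 bits as described above and C++20 requires it. Also, a literal such as 0xfffffffb has an unsigned type in C++, so shifting the literal itself injects 0 bits. The function names are hypothetical.

```cpp
#include <cstdint>

// Multiply by 2^k with a left shift; assumes value >= 0 and no overflow.
int timesPowerOfTwo(int value, int k) { return value << k; }

// Right shift of a signed int: the sign bit is copied in from the left
// (guaranteed from C++20, the usual behaviour of compilers before that).
int shiftRightSigned(int value, int k) { return value >> k; }

// Right shift of an unsigned value: zero bits are always injected.
std::uint32_t shiftRightUnsigned(std::uint32_t value, int k) { return value >> k; }
```

So the same 32-bit pattern 0xfffffffb shifts to 0xfffffffe when held in a signed int (the value -5 becoming -2), but to 0x3ffffffe when held in an unsigned variable.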

Some applications of bit operations
Is an int x odd or even?
• int isOdd = (x & 1);
Remainder when divided by 8
• int remain = (x & 7);
• Faster than x % 8 (and equal to it for non-negative x)
How many one bits in a 32-bit int? Repeat 32 times:
• numOnes = numOnes + ((x & 0x80000000) != 0);
• x = x << 1;
In binary, the mask 0x80000000 looks like a one followed by 31 zeros
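The three tricks above can be written out as a runnable sketch. It uses unsigned 32-bit values for the counting loop so that the left shift is well defined; the function names are hypothetical.

```cpp
#include <cstdint>

// Lowest bit set means the number is odd.
bool isOdd(int x) { return (x & 1) != 0; }

// For unsigned x, the low three bits are the remainder modulo 8.
unsigned remainderMod8(unsigned x) { return x & 7u; }

// Count one bits by testing the most significant bit 32 times.
int countOnes(std::uint32_t x) {
    int numOnes = 0;
    for (int i = 0; i < 32; ++i) {
        numOnes += (x & 0x80000000u) != 0;  // adds 1 if the msb is set, else 0
        x <<= 1;                            // move the next bit into the msb
    }
    return numOnes;
}
```

The comparison with 0 matters: adding the masked value itself would add 0x80000000 rather than 1 for each set bit.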

Primitive variable declaration and literals
float fahrenheit;
• Uninitialized, may get garbage on read
float fahrenheit = 95;
const float fahrenheit = 9.52e14;
• Value will never change
• Scientific notation saves typing lots of zeros
int x = 3, y = x/2;
• Can initialize variables based on others already initialized

Why bother to declare
Variable names
• What if you type it incorrectly later?
• To initialize before any use
Types
• To check all assignments to the variable
• To interpret a bit sequence as intended in your program (e.g. float and int are both 32 bits)
There are languages that do not enforce variable name and type declarations
• Can be lazy, but generally a Bad Idea
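To see why the type is needed to interpret a bit sequence, the same 32 bits can be read once as an integer and once as a float. This sketch assumes a 32-bit IEEE 754 float; the helper name bitsAsFloat is hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret a 32-bit pattern as a float.
float bitsAsFloat(std::uint32_t bits) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes a 32-bit float");
    float result;
    std::memcpy(&result, &bits, sizeof result);  // copy raw bits, no conversion
    return result;
}
```

The pattern 0x3F800000 is the integer 1065353216 when read as an int, but the value 1.0 when read as a float.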

Type conversions
Some conversions are implicit
• short x = 20000; int y = x;
• int x = 40000; short y = x;
Others may result in overflow
• double x = 5e40; float y = 2*x;
Some are errors
• float x = (float) "hello world";
Implicit typing
• float x = 7/3;
• float x = 7/3.;

Polymorphic operators and literals
7/3 vs 7/3.
/ represents division for int, float, double
Which one is invoked depends on the (inferred) type of arguments
• 7/3: '7' → toInt, '3' → toInt, then intDiv, then toFloat
• 7/3.: '7' → toInt → intToFloat, '3.' → toFloat, then floatDiv
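The difference between the two divisions can be made explicit in a short sketch. The function names are hypothetical and mirror the intDiv and floatDiv labels above.

```cpp
// Both operands are int, so integer division runs first and the
// truncated result is converted to float afterwards.
float divideAsInts(int a, int b) { return static_cast<float>(a / b); }

// One operand is float, so a is converted to float and floating
// point division is used.
float divideAsFloats(int a, float b) { return a / b; }

// Widening a short to an int is implicit and never loses information.
int widenShort(short s) { return s; }
```

With these, divideAsInts(7, 3) yields exactly 2, while divideAsFloats(7, 3.0f) yields roughly 2.333.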

The string data type
When we said cout << "Hello world\n", the text "Hello world\n" was stored as an array of characters
• Bytes corresponding to H, e, …, \n, and finally a “null byte” or 00000000 (in binary) to mark the end of the string
A more modern and better way is to use the string data type
• string message("Hello world");

Common string operations
Get the number of characters in the string
• message.size() (calling a method on a string object)
Get the character at a specific position
• message.at(5) or message[5]
Get a substring of the given string
• message.substr(1, 3)
Index out of bound?
• Some operations throw exceptions
• Some silently truncate
• Some may return garbage
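A sketch of these operations on std::string, including two of the out-of-bound behaviours. The variable message follows the slides; the helper atThrows is a hypothetical name. In the standard library, at() throws std::out_of_range for a bad index, substr() silently shortens a count that runs past the end, and operator[] performs no bounds check at all.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

const std::string message("Hello world");

// Hypothetical helper: report whether text.at(index) throws.
bool atThrows(const std::string& text, std::size_t index) {
    try {
        (void)text.at(index);  // bounds-checked access
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

Here message.size() is 11, message.substr(1, 3) is "ell", and message.substr(6, 100) is truncated to "world". Reading message[100] is the "garbage" case and should be avoided.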

More string operations
Find the first (leftmost) or last (rightmost) occurrence of a character
• message.find_first_of('o')
• message.find_last_of('e')
Compare two strings (dictionary or lexicographic order)
• msg1.compare(msg2)
• Returns an integer
  • Negative if msg1 should appear before msg2
  • Zero if msg1 and msg2 are equal
  • Positive if msg1 should appear after msg2
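A sketch of searching and comparing with std::string. The wrapper names are hypothetical; find_first_of, find_last_of and compare are standard members. When the character does not occur, the find functions return the special value std::string::npos.

```cpp
#include <cstddef>
#include <string>

// Position of the leftmost occurrence of c, or std::string::npos.
std::size_t firstIndexOf(const std::string& text, char c) {
    return text.find_first_of(c);
}

// Position of the rightmost occurrence of c, or std::string::npos.
std::size_t lastIndexOf(const std::string& text, char c) {
    return text.find_last_of(c);
}

// Reduce compare() to -1, 0 or +1; only the sign of its result is specified.
int orderOf(const std::string& msg1, const std::string& msg2) {
    int result = msg1.compare(msg2);
    return (result > 0) - (result < 0);
}
```

Positions are counted from 0, so in "Hello world" the first 'o' is at position 4 and the last at position 7.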