Data Representation CS280 – 09/13/05

Binary (from a Hacker’s dictionary)

A base-2 numbering system with only two digits, 0 and 1, which is perfectly suited for electronic operations since it can be expressed by power states (on/off), voltage levels (high/low) or charge (positive/negative), but is less than ideal for humans, who find it awkward to say things like “It’s a Catch – 10110 situation,” “He’s the 11011010-pound gorilla,” and “That’s the 10110010011011001-dollar question.”

“There are only 10 kinds of people in the world…those that understand binary and those that do not.” (from the ACM CS t-shirt).

Looking at more data

Character representations

Data and Data Representation

So how is this data that we operate on stored in the computer?

Let’s start with numbers

Binary codes are Base 2.

We “think” and operate in Base 10.

What does this mean?

Counting

Base 10 has 10 digits to represent different numbers of things

Base 2 has only 2 digits available.

Base 10

0 1 2 3 4 5 6 7 8 9

then we run out of unique digits.

So we move to a positional system.

10 – means we have ten things – a 1 in the 10’s position and no more things in the 1’s position.

Binary counting

0 1 then we run out of digits

10

This represents the number 2. A 1 in the 2’s position and a 0 in the 1’s position.
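The same counting pattern can be printed for the first few values. This is a minimal sketch in Python; the language and the helper name `count_in_binary` are choices made for this illustration, not part of the course material.

```python
def count_in_binary(limit):
    """Return the binary strings for 0 .. limit-1, in counting order."""
    # format(n, "b") writes n using only the digits 0 and 1.
    return [format(n, "b") for n in range(limit)]


# Print the decimal count next to its binary form.
for n, bits in enumerate(count_in_binary(8)):
    print(n, bits)
```

Each time the rightmost positions are all 1's, the next count adds a new position on the left, just as 9 rolls over to 10 in base 10.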

Positional notation

10’s positions represented

1 * 10^0 = 1
1 * 10^1 = 10
1 * 10^2 = 100
1 * 10^3 = 1,000
1 * 10^4 = 10,000
1 * 10^5 = 100,000
1 * 10^6 = 1,000,000

2’s positions represented

1 * 2^0 = 1
1 * 2^1 = 2
1 * 2^2 = 4
1 * 2^3 = 8
1 * 2^4 = 16
1 * 2^5 = 32
1 * 2^6 = 64

How does the computer then store numbers?

Let’s say we want to represent the number 53 in binary.

53 (base 10) = 110101 (base 2)

Why?

See chart next page.

Converting from decimal to binary

Use chart

1 * 2^0 = 1
1 * 2^1 = 2
1 * 2^2 = 4
1 * 2^3 = 8
1 * 2^4 = 16
1 * 2^5 = 32
1 * 2^6 = 64
1 * 2^7 = 128
1 * 2^8 = 256

73 decimal

Must use 7 bits

xxxxxxx

73 - 64 = 9

1xxxxxx

32 and 16 are not used

100xxxx

9 - 8 = 1

1001xxx

4 and 2 are not used; the last digit is 1

1001001

What is the general subtraction algorithm to convert from a decimal number to binary?
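One possible answer to the question above, sketched in Python. The function name `to_binary_by_subtraction` is made up for this illustration. It assumes a non-negative whole number and walks down the chart of powers of 2, writing a 1 when the power can be subtracted and a 0 when it cannot.

```python
def to_binary_by_subtraction(n):
    """Convert a non-negative integer to a binary string by subtracting powers of 2."""
    if n == 0:
        return "0"
    # Find the largest power of 2 that still fits in n.
    power = 1
    while power * 2 <= n:
        power *= 2
    bits = ""
    # Walk down the chart: 1 if the power is used, 0 if it is not.
    while power >= 1:
        if power <= n:
            bits += "1"
            n -= power
        else:
            bits += "0"
        power //= 2
    return bits


print(to_binary_by_subtraction(53))  # 110101
```

The printed value matches the earlier slide, where 53 in base 10 is written as 110101 in binary.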

Converting binary to decimal

274 (decimal)

4 * 10^0 = 4
7 * 10^1 = 70
2 * 10^2 = 200

total 274

1011 (binary)

1 * 2^0 = 1
1 * 2^1 = 2
0 * 2^2 = 0
1 * 2^3 = 8

total 11
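The two worked totals follow the same rule: multiply each digit by its base raised to the digit's position, then add. Below is a minimal Python sketch of that rule. The name `to_decimal` is hypothetical, and it assumes digits are written with 0-9 and then A-Z for bases above 10.

```python
def to_decimal(digits, base):
    """Sum digit * base**position, with position 0 at the rightmost digit."""
    total = 0
    for position, digit in enumerate(reversed(digits)):
        value = int(digit, 36)  # "0"-"9" give 0-9, "A"-"Z" give 10-35
        if value >= base:
            raise ValueError(f"digit {digit!r} is not valid in base {base}")
        total += value * base ** position
    return total


print(to_decimal("274", 10))  # 274
print(to_decimal("1011", 2))  # 11
```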

What is the general algorithm for taking a number in base X and converting it to its base 2 equivalent?

Number representation

Numbers are represented by their corresponding binary representation.

We are disregarding sign.

We are disregarding floating point.

What about other kinds of data?

Think about the binary values as a kind of code.

The binary values represent codes

How many different values can be stored in 1 bit?

How many in 2 bits?

How many in 4 bits?

How many in a byte?

General form encoding

If you have x possible unique symbols, and y positions for any one of those symbols, then the number of unique codes is

x^y

For example, you have 2 dice, each of which has 6 different face values, so there are 36, or 6^2, possible unique codes.
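The rule can be tried on the earlier bit questions as well as on the dice. This is a small Python sketch; `unique_codes` is a name chosen here for illustration.

```python
def unique_codes(symbols, positions):
    """Number of distinct codes: symbols raised to the power of positions."""
    return symbols ** positions


# A bit has 2 symbols (0 and 1); try 1, 2, 4, and 8 positions.
for bits in (1, 2, 4, 8):
    print(bits, "bit(s):", unique_codes(2, bits), "values")

# Two dice, six faces each.
print("2 dice:", unique_codes(6, 2), "codes")
```

A byte (8 bits) therefore holds 256 different values, which is why one-byte codes run from 0 to 255.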

ASCII codes represent characters of data.

Each code uses 1 byte (8 bits).

Unicode extends the ASCII codes by another byte (in its original 16-bit form).

ASCII can form most of the characters used by “Western” languages, along with punctuation symbols.

Unicode allows for special symbols and symbols in other languages like Japanese, Chinese, and Arabic.

Figure 8.7. ASCII, The American Standard Code for Information Interchange (page 220)

Reading the chart

Left column is the left side of the byte (group of 8 bits) (another term is the high order)

Right column is the right side of the byte.

Value is the corresponding binary code.

Binary to hex

Hexadecimal (base 16) codes can be used to represent groups of 4 binary digits.

Hexadecimal counting:

0 1 2 3 4 5 6 7 8 9 A B C D E F

Hex   Decimal   Binary
A     10        1010
B     11        1011
C     12        1100
D     13        1101
E     14        1110
F     15        1111

So the letter Z can be abbreviated

0101 1010 in binary

5A in hex

Commonly, binary numbers are written in groups of 4 digits, with leading 0’s used as placeholders.

Hex numbers are shown as 2-digit groups with a space between each group.
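The grouping convention can be reproduced with a short Python sketch. The helper name `byte_to_binary_and_hex` is invented for this example, and it assumes the value fits in one byte (0 to 255).

```python
def byte_to_binary_and_hex(value):
    """Format a value 0-255 as two 4-bit groups and as a 2-digit hex number."""
    bits = format(value, "08b")  # 8 binary digits, padded with leading 0's
    grouped = bits[:4] + " " + bits[4:]
    return grouped, format(value, "02X")


# ord("Z") gives the character code for the letter Z.
print(byte_to_binary_and_hex(ord("Z")))  # ('0101 1010', '5A')
```

Each hex digit stands for exactly one group of 4 bits: 0101 is 5 and 1010 is A.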

Encoding – character string

Text or character strings are typically contiguously stored in memory.

Assume that each character takes up one byte of space. How many bytes would be required for a phone number? (We are using a slightly different example than the book.) Note the hyphen and spaces:

568 - 8771

568 - 8771 requires 10 bytes

Character   Binary      Hex   Decimal
5           0011 0101   35    53
6           0011 0110   36    54
8           0011 1000   38    56
(space)     0010 0000   20    32
-           0010 1101   2D    45
(space)     0010 0000   20    32
8           0011 1000   38    56
7           0011 0111   37    55
7           0011 0111   37    55
1           0011 0001   31    49
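The same byte-by-byte listing can be generated for any string of ASCII characters. The sketch below uses Python's built-in `ord`; the function name `ascii_rows` is hypothetical.

```python
def ascii_rows(text):
    """Return (character, binary, hex, decimal) for each character of an ASCII string."""
    rows = []
    for ch in text:
        code = ord(ch)  # numeric character code
        rows.append((ch, format(code, "08b"), format(code, "02X"), code))
    return rows


phone = "568 - 8771"
for ch, bits, hex_code, decimal in ascii_rows(phone):
    print(repr(ch), bits, hex_code, decimal)

# One byte per character, counting the spaces and the hyphen.
print(len(phone.encode("ascii")), "bytes")
```

The spaces and the hyphen are characters too, so each one takes up its own byte in memory.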

In class assignment

Using the chart on page 220, what is your first (or nick) name in ASCII binary codes?

Work with your partner.

Write the first name (spread out).

Write the binary code for each letter of your name based on the ASCII chart.

Convert at least one of those binary codes to the decimal (base 10) equivalent.

What about other kinds of data?

Chapter 11 material

Pixels

A pixel is like a dot. Your computer screen is composed of thousands of pixels.

How many?

Settings – Control Panel – Display – Settings

Screen area is the dimensions expressed in terms of pixels.

The higher the number, the better the resolution.

Each pixel

Has a color associated with it.

Colors are a combination of red, green, and blue light (RGB).

The intensity of each of the three colors defines how much of that color contributes to the overall color displayed.

Each of the three colors is associated with a 1-byte code. In one byte we can have values from 0 (no color) to 255 (full intensity).

Color

See example in Word document

Black is coded 0 0 0 (red green blue)

White is coded 255 255 255 (red green blue)

We will also use this feature when we code HTML colors.
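As a preview of how these three one-byte values appear in HTML, here is a rough Python sketch that writes each intensity as a 2-digit hex number. The function name `rgb_to_html` is made up for this example. The `#RRGGBB` form is the standard HTML/CSS hex color notation.

```python
def rgb_to_html(red, green, blue):
    """Build an HTML color string such as #FF0000 from three 0-255 intensities."""
    for value in (red, green, blue):
        if not 0 <= value <= 255:
            raise ValueError("each intensity must fit in one byte (0-255)")
    # Each intensity becomes two hex digits, in red-green-blue order.
    return "#{:02X}{:02X}{:02X}".format(red, green, blue)


print(rgb_to_html(0, 0, 0))        # #000000 (black)
print(rgb_to_html(255, 255, 255))  # #FFFFFF (white)
print(rgb_to_html(255, 0, 0))      # #FF0000 (full red, no green or blue)
```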

Sound

Analog: real world, infinitely continuous

Digital: representation, discrete

Sound is a continuous series of sound waves.

To digitize, we cannot capture every one of the infinitely many values that hit our ears.

But we can sample the values.

Figure 11.8. Sound wave. The horizontal axis is time; the vertical axis is sound pressure.

Figure 11.9. Two sampling rates; the rate on the right is twice as fast as that on the left.

Figure 11.11. (a) Three-bit precision for samples requires that the indicated reading be approximated as +10. (b) Adding another bit makes the sample twice as accurate.

Figure 11.10. Schematic for analog-to-digital and digital-to-analog conversion.

Sampling

While we lose some information in this process, it is usually negligible in terms of our ability to perceive the sounds.
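Sampling and limited precision can be imitated on a made-up signal. The Python sketch below uses a pure sine wave as a stand-in for a real sound. The function name, the sine-wave input, and the rounding rule are all assumptions of this illustration, not the method used by any particular sound card.

```python
import math


def sample_and_quantize(frequency_hz, sample_rate_hz, bits, duration_s):
    """Sample a sine wave and round each sample to one of 2**bits levels."""
    levels = 2 ** bits
    count = int(sample_rate_hz * duration_s)
    samples = []
    for i in range(count):
        t = i / sample_rate_hz                             # time of this sample
        value = math.sin(2 * math.pi * frequency_hz * t)   # "analog" value in [-1, 1]
        level = round((value + 1) / 2 * (levels - 1))      # integer code 0 .. levels-1
        samples.append(level)
    return samples


# A 1 Hz wave for 1 second: 8 samples per second, then 16 samples per second.
print(sample_and_quantize(1, 8, 3, 1))
print(sample_and_quantize(1, 16, 3, 1))
```

Doubling the sampling rate doubles the number of readings, and each extra bit doubles the number of levels a reading can be rounded to.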

But to produce sounds

Requires a large amount of data.

For example, at a 16-bit representation of each sample, it would take about 10 megabytes to reproduce 1 minute of a song.

Compression: remove the parts of the sound that we cannot hear (MP3 format).
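The 10-megabyte figure can be checked with a little arithmetic. The slide gives only the 16-bit size, so this Python sketch assumes the audio CD values for the rest: 44,100 samples per second and 2 channels (stereo). The function name `audio_bytes` is hypothetical.

```python
def audio_bytes(seconds, sample_rate_hz=44_100, bits_per_sample=16, channels=2):
    """Bytes of uncompressed audio; the defaults are the assumed CD values."""
    bytes_per_sample = bits_per_sample // 8
    return seconds * sample_rate_hz * bytes_per_sample * channels


one_minute = audio_bytes(60)
print(one_minute)               # 10584000
print(one_minute / 1_000_000)   # 10.584 (millions of bytes)
```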

Images

Images have the same problem.

If each image is made up of thousands of pixels, and each pixel requires 3 bytes of data, then each image is huge.
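To put a number on "huge", here is a minimal Python sketch. The 1024 by 768 size is only an example screen area chosen for this illustration, and `image_bytes` is a made-up name.

```python
def image_bytes(width, height, bytes_per_pixel=3):
    """Uncompressed size: one red, one green, and one blue byte per pixel."""
    return width * height * bytes_per_pixel


print(image_bytes(1024, 768))  # 2359296
```

That is more than 2 million bytes for a single uncompressed picture at the assumed size.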

JPEG format compresses the digital representation to remove the differences in hues of a picture that we cannot perceive.

Then we can compress by using run-length compression to code the remaining bits.

Run-length compression

If my bit pattern is:

00000000000000000000000011111111111111000000000000000011111111111111001001

We can code a value to indicate that we have: 24 0’s followed by 14 1’s followed by 16 0’s, etc.

When the pattern has many changing values, run-length coding will not save much space, but long runs of identical pixels can save a good deal of data space.
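Run-length coding is short enough to sketch in Python. The function names and the (symbol, count) pair format are choices made for this example, and real image formats store the counts differently. The test pattern is built from the first three runs described above rather than typed out by hand.

```python
from itertools import groupby


def rle_encode(bits):
    """Turn a string such as '0001' into runs such as [('0', 3), ('1', 1)]."""
    return [(symbol, len(list(run))) for symbol, run in groupby(bits)]


def rle_decode(runs):
    """Rebuild the original string from (symbol, count) pairs."""
    return "".join(symbol * count for symbol, count in runs)


pattern = "0" * 24 + "1" * 14 + "0" * 16
runs = rle_encode(pattern)
print(runs)                         # [('0', 24), ('1', 14), ('0', 16)]
print(rle_decode(runs) == pattern)  # True

# A pattern that changes at every position gains nothing: one run per bit.
print(len(rle_encode("01010101")))  # 8
```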

Lossy vs lossless conversion

Lossless: no loss of data in the conversion

Lossy: there is loss of data

Run-length coding is lossless. You can convert the original to a compressed form and recover it exactly.

Compression that removes some of the detail (things that we cannot perceive) is lossy. You cannot reproduce exactly the same sound/picture.
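A toy example shows why a lossy step cannot be undone. The Python sketch below throws away the low-order bits of some intensity values. This is not how MP3 or JPEG work internally; it is only a simple stand-in for "removing detail", and the name `drop_low_bits` is hypothetical.

```python
def drop_low_bits(values, bits_dropped):
    """Lossy step: zero out the low-order bits of each 0-255 value."""
    mask = 0xFF & ~((1 << bits_dropped) - 1)
    return [v & mask for v in values]


original = [200, 201, 202, 203]
reduced = drop_low_bits(original, 2)
print(reduced)              # [200, 200, 200, 200]
print(reduced == original)  # False
```

Four different inputs became the same output, so nothing can tell them apart afterward. The small differences are gone for good, which is what makes the conversion lossy.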