Numbers in Codes GCNU 1025 Numbers Save the Day

Coding

• Converting information into another form of representation (codes) based on a specific rule
• Encoding: information to symbols
• Decoding: symbols back to information

Binary Codes

• Two symbols are used to represent data
• Example: Morse code, ASCII code

Morse Codes

• On-off tones, lights, clicks, dots and dashes, etc.

Binary codes

• Language of computers: 0 and 1 (binary system)
• Codeword: a string of 0’s and 1’s representing a character
• ASCII code: American Standard Code for Information Interchange
• 128 characters, of which 33 are control characters
• Enables the use of same codewords in different machines
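A minimal sketch of ASCII codewords in Python, assuming each character is written as a 7-digit binary string (enough for the 128 ASCII characters). The function names are illustrative and do not come from the slides.

```python
def text_to_codewords(text: str) -> list[str]:
    """Return the 7-digit ASCII codeword of each character in `text`."""
    codewords = []
    for ch in text:
        if ord(ch) > 127:
            raise ValueError(f"{ch!r} is not an ASCII character")
        # format(..., "07b") writes the number in binary, padded to 7 digits
        codewords.append(format(ord(ch), "07b"))
    return codewords


def codewords_to_text(codewords: list[str]) -> str:
    """Decode a list of binary codewords back to text."""
    return "".join(chr(int(cw, 2)) for cw in codewords)


print(text_to_codewords("AI"))
print(codewords_to_text(["1000001", "1001001"]))
```

Because every machine uses the same table, the same codewords decode to the same text everywhere.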

Binary Codes: Play

http://www.binaryhexconverter.com/binary-to-ascii-text-converter

http://www.binaryhexconverter.com/ascii-text-to-binary-converter

Coding of Chinese characters (optional)

• Example: Chinese telegraph code (non-binary)
• 4-digit: 0000-9999
• Decoding (relatively easy for documentation): 3413 → 法
• Encoding (more difficult for documentation): 法 → 3413
• Four-corner method: a documentation (indexing) method for encoding

Announcement

• In-class Assignment #1 on Sep 19 (Friday)
• 10% of final score
• Coverage: up to Section 2.2
• Books, notes, other materials and discussions all allowed
• Help from instructor and teaching assistant
• Assignments submitted after class subject to penalty

Error-detection for binary codes

• Rule: every valid codeword has a special property

• Parity check: a validity check concerning the parity (i.e. being odd or even) of the number of 1’s in a codeword

• Example: 1001 sent as 1011
• Original number of 1’s: 2 (even)
• Number of 1’s in 1011: 3 (odd)
• 1 error leads to a change of parity of the number of 1’s
• Error detected if all valid codewords consist of an even number of 1’s
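The parity check above can be sketched in a few lines of Python, assuming the rule that every valid codeword has an even number of 1’s. The function name is illustrative.

```python
def has_even_parity(codeword: str) -> bool:
    """Return True if the codeword contains an even number of 1's."""
    return codeword.count("1") % 2 == 0


print(has_even_parity("1001"))  # valid codeword: two 1's
print(has_even_parity("1011"))  # one digit flipped: three 1's, error detected
```

A single flipped digit always changes the parity, so it is detected; two flipped digits restore the parity and slip through.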

Simple parity check

• Rule: the last digit of a codeword is a check digit appended to the original message (to be sent) so that the total number of 1’s in the codeword is even!
• Example: sending a message 1000001
• Check digit to be appended: 0
• Codeword for the message: 10000010 (total number of 1’s: 2)
• Example: sending a message 1001001
• Check digit to be appended: 1
• Codeword for the message: 10010011 (total number of 1’s: 4)
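As a small sketch of the rule above, the check digit is simply the number of 1’s in the message taken modulo 2. The function name `append_check_digit` is an illustrative choice.

```python
def append_check_digit(message: str) -> str:
    """Append a check digit so the codeword has an even number of 1's."""
    check = message.count("1") % 2  # 1 if the count is odd, 0 if it is even
    return message + str(check)


print(append_check_digit("1000001"))
print(append_check_digit("1001001"))
```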

Error-correction in codes

• Is it possible to detect AND correct an error (without re-transmission/data re-entry)?
• Error-correction by multiple entries
• Sending all messages 3 times regardless of existence/absence of errors
• Example: 1100 sent as 1100 1100 1100
• Error-correction power: a received message of 1100 1100 1000 can be automatically corrected to 1100 1100 1100 without further data re-entry
• High resources demand: tripling message length
• Error-correction by multiple parity check digits
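Error-correction by multiple entries can be sketched as a majority vote over the three copies, digit by digit. This is one reasonable way to do the correction, not necessarily the procedure intended in the course; the function names are illustrative.

```python
def encode_triple(message: str) -> str:
    """Send the message three times in a row."""
    return message * 3


def decode_triple(received: str) -> str:
    """Recover the message by majority vote across the three copies."""
    if len(received) % 3 != 0:
        raise ValueError("received length must be a multiple of 3")
    n = len(received) // 3
    copies = [received[i * n:(i + 1) * n] for i in range(3)]
    # For each digit position, keep whichever of "0"/"1" appears most often
    return "".join(max("01", key=column.count) for column in zip(*copies))


print(decode_triple("110011001000"))
```

The vote corrects any single wrong digit, at the cost of tripling the message length.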

Error-correction in codes

• Example: transmit 1001 by multiple parity check digits: 1001101

Error-correction in codes

• 1-error correction: if the message received is 1000101, how can we detect/correct the error? (assume at most 1 error)
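A hedged sketch of 1-error correction with three parity check digits. The slide's diagram is not reproduced in this text, so the assignment of message digits to check digits in `CHECKS` is an assumption, chosen only because it reproduces the codeword 1001101 for the message 1001. The function names are also illustrative.

```python
# ASSUMPTION: which message positions (0-based) each check digit covers.
# Each check digit makes the number of 1's in its group even.
CHECKS = [(0, 1, 2), (0, 1, 3), (1, 2, 3)]


def encode(message: str) -> str:
    """Append three parity check digits to a 4-digit message."""
    bits = [int(b) for b in message]
    checks = [sum(bits[i] for i in group) % 2 for group in CHECKS]
    return message + "".join(str(c) for c in checks)


def correct(received: str) -> str:
    """Fix at most one wrong digit in a 7-digit received word."""
    bits = [int(b) for b in received]
    # failed[j] is True when check digit j disagrees with its group
    failed = tuple(
        sum(bits[i] for i in group) % 2 != bits[4 + j]
        for j, group in enumerate(CHECKS)
    )
    if not any(failed):
        return received  # all checks pass: no error detected
    for pos in range(4):
        # A wrong message digit fails exactly the checks that cover it
        if tuple(pos in group for group in CHECKS) == failed:
            bits[pos] ^= 1
            break
    else:
        # Only one check failed: the check digit itself is wrong
        bits[4 + failed.index(True)] ^= 1
    return "".join(str(b) for b in bits)


print(encode("1001"))
print(correct("1000101"))
```

Under this assumed layout every message digit is covered by a different combination of checks, which is what lets the pattern of failed checks point to the wrong digit.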

Lengths of codes

• Basic question: how many digits do we need?
• How many digits are needed to encode 2 characters (e.g. A, B)?
• How many digits are needed to encode 4 characters?
• How many digits are needed to encode 26 characters (A-Z)?
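For a fixed-length binary code, n digits give 2^n different codewords, so the questions above ask for the smallest n with 2^n at least the number of characters. A short sketch (the function name is illustrative):

```python
def digits_needed(n_characters: int) -> int:
    """Smallest number of binary digits giving at least n_characters codewords."""
    if n_characters < 1:
        raise ValueError("need at least one character")
    # (n - 1).bit_length() is the smallest k with 2**k >= n;
    # use at least 1 digit even for a single character
    return max(1, (n_characters - 1).bit_length())


for n in (2, 4, 26):
    print(n, "characters:", digits_needed(n), "digits")
```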

Efficiency of data transmission

• Run-length encoding (RLE): reduce number of characters transmitted (data compression)
• Example: black-and-white documents
• Reduce length of duplicated characters
• Common for faxed documents and files containing runs
• Size increased if runs are absent
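A minimal sketch of run-length encoding, assuming a black-and-white scan line written as a string of "W" (white) and "B" (black) pixels. The pixel strings and function names are made up for illustration.

```python
from itertools import groupby


def rle_encode(pixels: str) -> list[tuple[str, int]]:
    """Replace each run of identical symbols by (symbol, run length)."""
    return [(symbol, len(list(run))) for symbol, run in groupby(pixels)]


def rle_decode(runs: list[tuple[str, int]]) -> str:
    """Expand (symbol, run length) pairs back into the original string."""
    return "".join(symbol * length for symbol, length in runs)


line = "W" * 6 + "B" * 3 + "W" * 4
print(rle_encode(line))    # three runs describe 13 pixels
print(rle_encode("WBWB"))  # no runs: four pairs for four pixels, so size grows
```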

Efficiency of data transmission

• Example: two different ways of encoding
• Type 1: fixed number of digits used
• Type 2: different numbers of digits used

Efficiency of data transmission

• Different coding methods
• Fixed length code: fixed number of digits used
• Variable length code: different numbers of digits used

Efficiency of data transmission

• Variable length code
• Shorter code for frequently used characters: efficiency enhanced
• Is there anything wrong with the following code?
• Is there anything wrong in encoding BIT? No!
• Is there anything wrong in decoding 0000001101? Yes! Possible multiple interpretations (BIT or FET)!
• Prefix property: no codeword can be a prefix of another codeword
• Uniquely decipherable code: code satisfying the prefix property

Efficiency of data transmission

• Variable length code
• Example: the code is not uniquely decipherable as the codeword of B is a prefix of the codeword of F (this code does not satisfy the prefix property)
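The prefix property can be checked mechanically. The sketch below uses two small hypothetical codes (the code table from the slides is not reproduced in this text) and lists every way a digit string can be decoded; the function names are illustrative.

```python
def has_prefix_property(code: dict[str, str]) -> bool:
    """True if no codeword is a prefix of another codeword."""
    words = list(code.values())
    return not any(
        i != j and b.startswith(a)
        for i, a in enumerate(words)
        for j, b in enumerate(words)
    )


def all_decodings(digits: str, code: dict[str, str]) -> list[str]:
    """List every message whose encoding is exactly `digits`.

    Assumes every codeword is a non-empty string.
    """
    if not digits:
        return [""]
    results = []
    for ch, word in code.items():
        if digits.startswith(word):
            rest = all_decodings(digits[len(word):], code)
            results += [ch + tail for tail in rest]
    return results


# HYPOTHETICAL codes, for illustration only
ambiguous = {"A": "0", "B": "01", "C": "1"}     # "0" is a prefix of "01"
prefix_free = {"A": "0", "B": "10", "C": "11"}

print(has_prefix_property(ambiguous), all_decodings("01", ambiguous))
print(has_prefix_property(prefix_free), all_decodings("010", prefix_free))
```

With the ambiguous code the string 01 decodes in two ways, which is the same kind of problem as the BIT/FET example.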

Efficiency of data transmission

• Variable length code
• Example: do these two codes satisfy the prefix property?

Efficiency of data transmission

• Variable length code
• Which of the two uniquely decipherable codes is more efficient?

Efficiency of data transmission

• Variable length code
• Which of the two uniquely decipherable codes has a shorter average length? Scheme 1!
• Example: DELETE THE FILE
• Scheme 1: 52 digits in total
• Scheme 2: 49 digits in total
• E is a very common (heavy) character
• Frequencies (weights) also important!
• Weighted average should be considered instead

Efficiency of data transmission

• Variable length code
• Weighted average code length
• Example:
• Message: DELETE THE FILE
• Weighted average code length:
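A sketch of the weighted average, taken here as the sum of (frequency x codeword length) divided by the total frequency. The frequencies are counted from the message itself; the codeword lengths in `code_lengths` are hypothetical, since the two schemes from the slides are not reproduced in this text, and counting the space as a character is also an assumption.

```python
from collections import Counter


def weighted_average_length(frequencies: dict[str, int],
                            code_lengths: dict[str, int]) -> float:
    """Average codeword length, weighting each character by its frequency."""
    total = sum(frequencies.values())
    digits = sum(frequencies[ch] * code_lengths[ch] for ch in frequencies)
    return digits / total


message = "DELETE THE FILE"
frequencies = Counter(message)  # E appears 5 times, the space 2 times, ...

# HYPOTHETICAL codeword lengths: the frequent E gets the shortest codeword
code_lengths = {"E": 2, " ": 3, "L": 3, "T": 3, "D": 4, "H": 4, "F": 4, "I": 4}

print(weighted_average_length(frequencies, code_lengths))
print(sum(code_lengths.values()) / len(code_lengths))  # plain average, for contrast
```

The plain average ignores how often each character occurs, which is why a code can have the shorter plain average and still use more digits for a real message.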

Efficiency of data transmission

• Variable length code
• Weighted average code length
• Choice of frequency tables:
• Choice #1: frequency table from specific message
• Choice #2: general frequency table for typical English passages

Efficiency of data transmission

• Variable length code
• Weighted average code length
• (Partial) example: Morse code

Classwork: Calculate the weighted average code length for the Morse codes, using the general frequency table

Answer: 2.544

Huffman code

• Aim: produce a code with the smallest weighted average code length for a given frequency table
• Basic principle: shorter codewords for more frequent characters
• Tool: a tree built from bottom to top with characters being the “leaves”

Huffman code

• Example: a code for 4 characters
• Step 1: combine the 2 with lowest probabilities
• Step 2: combine the 2 among “D”, “E” and “LT” with lowest probabilities
• Step 3: combine the 2 among “E” and “LTD” with lowest probabilities
• Step 4: assign “0” to the branch with the bigger probability and “1” to the branch with the smaller probability
• Step 5: read out the codewords from the top of the tree

Huffman code

• Example: a code for 4 characters
• Does the code constructed this way always satisfy the prefix property?
• If “11” is a codeword for D, is it possible for other codewords to begin with “11”? No (as the branch for D stops at “11”)!
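The tree-building steps can be sketched with a priority queue. The frequency table below (counts of D, E, L, T in "DELETE") is an assumption, since the table from the slides is not reproduced in this text, and ties are broken arbitrarily, so the exact codewords may differ from the ones on the slides.

```python
import heapq
from itertools import count


def huffman_code(frequencies: dict[str, int]) -> dict[str, str]:
    """Build a Huffman code by repeatedly combining the two lightest groups."""
    if len(frequencies) == 1:
        return {ch: "0" for ch in frequencies}  # single character: one digit
    tiebreak = count()  # unique counter so the heap never compares dicts
    heap = [(w, next(tiebreak), {ch: ""}) for ch, w in frequencies.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w_small, _, small = heapq.heappop(heap)  # lightest group
        w_big, _, big = heapq.heappop(heap)      # second lightest group
        # "0" goes on the heavier branch, "1" on the lighter branch
        merged = {ch: "0" + cw for ch, cw in big.items()}
        merged.update({ch: "1" + cw for ch, cw in small.items()})
        heapq.heappush(heap, (w_small + w_big, next(tiebreak), merged))
    return heap[0][2]


# ASSUMED frequency table: character counts in "DELETE"
frequencies = {"D": 1, "E": 3, "L": 1, "T": 1}
code = huffman_code(frequencies)
print(code)
print(sum(frequencies[ch] * len(code[ch]) for ch in code), "digits for DELETE")
```

Each character sits at a leaf, so no codeword can continue past another one, which is why the prefix property holds.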

Classwork: Constructing Huffman code

Huffman code: remarks

• Multiple possible Huffman codes for same frequency table
• Different number of layers possible
• Are the weighted average code lengths the same?
• Different Huffman codes for same frequency table have same weighted average code length
• Smallest weighted average code length guaranteed (proof out of scope)

Huffman codes: comparison

Arithmetic coding

• No one-to-one correspondence between characters and codewords (unlike Huffman code)
• Encode whole message into one number
• Example: “DELETE” encoded as 0.11633 (decimal number)
• Step 1: Divide the interval (0, 1) into portions
• Step 2: Choose (zoom into) portion of first character “D” and divide the portion according to the probabilities (as in Step 1)
• Step 3: Choose (zoom into) portion of second character “E” and divide the portion according to the probabilities
• Step 4: Keep choosing (zooming into) portions in correct order and dividing the chosen portion according to the probabilities
• Step 5: Choose the portion of “END” when the message ends
• Step 6: Choose any number within the range of “END” as the codeword for the message (e.g. 0.11633)

Arithmetic coding

• Example: decode 0.11633 with the frequency table
• Step 1: Divide into portions
• Step 2: Where is 0.11633? Zoom in!
• 0.11633 is in Section D: first character of message is “D”
• Step 3: Repeat Steps 1 and 2. Stop when it hits “END”!
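The zoom-in procedure for encoding and decoding can be sketched with exact fractions. The probability table `probs` is hypothetical (the table from the slides is not reproduced in this text), so the number produced here is not expected to equal 0.11633; the function names are illustrative.

```python
from fractions import Fraction


def build_portions(probs: dict[str, Fraction]) -> dict[str, tuple[Fraction, Fraction]]:
    """Divide the interval from 0 to 1 into one portion per symbol."""
    portions, low = {}, Fraction(0)
    for symbol, p in probs.items():
        portions[symbol] = (low, low + p)
        low += p
    return portions


def encode(message: str, probs: dict[str, Fraction], end: str = "END"):
    """Zoom into the portion of each character in turn, then into END."""
    portions = build_portions(probs)
    low, high = Fraction(0), Fraction(1)
    for symbol in list(message) + [end]:
        width = high - low
        s_low, s_high = portions[symbol]
        low, high = low + width * s_low, low + width * s_high
    return low, high  # any number in this final range encodes the message


def decode(x: Fraction, probs: dict[str, Fraction], end: str = "END") -> str:
    """Repeatedly find the portion containing x until END is reached."""
    portions = build_portions(probs)
    low, high = Fraction(0), Fraction(1)
    message = []
    while True:
        width = high - low
        for symbol, (s_low, s_high) in portions.items():
            if low + width * s_low <= x < low + width * s_high:
                break
        else:
            raise ValueError("x is outside the interval being divided")
        if symbol == end:
            return "".join(message)
        message.append(symbol)
        low, high = low + width * s_low, low + width * s_high


# HYPOTHETICAL probability table (must add up to 1)
probs = {"D": Fraction(1, 5), "E": Fraction(2, 5), "L": Fraction(1, 10),
         "T": Fraction(1, 10), "END": Fraction(1, 5)}
low, high = encode("DELETE", probs)
x = (low + high) / 2  # pick the midpoint of the final range
print(float(x), decode(x, probs))
```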

Units in daily life

• Examples of prefixes:
• Mega-pixel
• Nano-meter
• Giga-watt

SI prefixes

• International system of units
• Examples: km, mm, cm, mL
• Some common SI prefixes:

Units in data transmission

• SI prefixes commonly used for transmission speed
• Example: 100 Mbps
• Mbps: Mega-bit per second
• Mega (SI prefix): 1000 x 1000 = 1,000,000
• Bit: binary digit

Units in data transmission

• SI prefixes commonly used for transmission speed
• Example: 56 kbps
• kbps: kilo-bit per second
• kilo (SI prefix): 1000
• Bit: binary digit

Binary prefixes

• Different from SI prefixes: same letter, different meaning
• 1024 used instead of 1000
• Comparison:

Units in computer systems (file size)

• Binary prefixes used for file size/actual capacity in computer systems
• Example: file of size 10 MB
• MB: Mega-byte
• Mega (binary prefix): 1024 x 1024 = 1,048,576
• Byte: 8 bits

Units in telecommunication

• Example: How long does it take to send a 100 MB file with the speed of 100 Mbps?
• 100 MB = 100 x 1024 x 1024 x 8 bits
• 100 Mbps = 100 x 1000 x 1000 bits per second
• (Minimum) Time needed: (100 x 1024 x 1024 x 8) / (100 x 1000 x 1000) ≈ 8.39 seconds

Units in telecommunication

• Example: How long does it take to download a 4 MB song via a 56k modem?
• 4 MB: 4 x 1024 x 1024 x 8 bits
• 56k modem: 56 kbps transfer rate
• 56 kbps: 56 x 1000 bits per second
• (Minimum) Time needed for downloading: ~600 seconds
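The two transfer-time examples follow the same arithmetic: file size in bits (binary prefix, 1024) divided by speed in bits per second (SI prefix, 1000). A small sketch, with an illustrative function name:

```python
def transfer_seconds(file_megabytes: float, rate_bits_per_second: float) -> float:
    """Minimum time to send a file, ignoring any overhead."""
    file_bits = file_megabytes * 1024 * 1024 * 8  # binary mega, 8 bits per byte
    return file_bits / rate_bits_per_second


print(transfer_seconds(100, 100 * 1000 * 1000))  # 100 MB at 100 Mbps
print(transfer_seconds(4, 56 * 1000))            # 4 MB at 56 kbps
```

The first result is a little over 8 seconds rather than exactly 8 because the file size uses 1024 while the speed uses 1000.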

Classwork 10: telecommunication

Units in hard disk packaging

• Confusion in units:
• SI prefixes used in packaging of hard disks/flash drives
• Example: true capacity of a 4 GB flash drive
• 4 GB flash drive: 4 x 1000 x 1000 x 1000 = 4,000,000,000 bytes
• True capacity of disk/computer memory (e.g. RAM)/file size expressed by binary prefixes
• Example: size of a 4 GB file
• 4 GB file: 4 x 1024 x 1024 x 1024 = 4,294,967,296 bytes (a 4 GB flash drive does not have enough space for a 4 GB file!)
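The mismatch between the two meanings of "giga" can be checked directly. The constant and function names below are illustrative.

```python
SI_GIGA = 1000 ** 3      # giga as used on drive packaging
BINARY_GIGA = 1024 ** 3  # giga as used for file sizes


def fits(file_bytes: int, drive_bytes: int) -> bool:
    """True if the file is no larger than the drive's capacity."""
    return file_bytes <= drive_bytes


drive_bytes = 4 * SI_GIGA
file_bytes = 4 * BINARY_GIGA
print(drive_bytes, file_bytes, fits(file_bytes, drive_bytes))
print("shortfall in bytes:", file_bytes - drive_bytes)
```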

Numbers in Codes-End-