CIS 234: Character Codes Dr. Ralph D. Westfall April, 2011


Page 1: CIS 234: Character Codes Dr. Ralph D. Westfall April, 2011

CIS 234: Character Codes

Dr. Ralph D. Westfall
April, 2011

Page 2: CIS 234: Character Codes Dr. Ralph D. Westfall April, 2011

Problem 1 (other PowerPoint)
computers only understand binary coded data (zeros and ones)
  00000000, 11111111, 01010101
people like to count in decimal
  00000000 = 0, 11111111 = 255, 01010101 = 85
1st problem: it is extremely hard for people to work with binary data
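The binary-to-decimal pairs on this slide can be checked with a short Java sketch. The class and method names (`BinaryDemo`, `toDecimal`, `toBinary8`) are made up for illustration; `Integer.parseInt` and `Integer.toBinaryString` are standard library calls.

```java
public class BinaryDemo {
    // Parse a string of 0s and 1s as a base-2 number.
    static int toDecimal(String bits) {
        return Integer.parseInt(bits, 2);
    }

    // Format a value from 0 to 255 as exactly 8 binary digits.
    static String toBinary8(int value) {
        String bits = Integer.toBinaryString(value);
        // Pad on the left with spaces to width 8, then turn the spaces into zeros.
        return String.format("%8s", bits).replace(' ', '0');
    }

    public static void main(String[] args) {
        System.out.println(toDecimal("00000000")); // 0
        System.out.println(toDecimal("11111111")); // 255
        System.out.println(toDecimal("01010101")); // 85
        System.out.println(toBinary8(85));         // 01010101
    }
}
```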

Page 3: CIS 234: Character Codes Dr. Ralph D. Westfall April, 2011

Problems 2a and 2b
since computers only work with numbers, they need to use numbers to identify letters to print or show on screen
  e.g., 01000001 = 65 = A
people who don't read English also use computers
next problem: what kind of numbering should be used for different languages?

Page 4: CIS 234: Character Codes Dr. Ralph D. Westfall April, 2011

Problem 2 Solution
using binary data to display characters
  make up a "coding scheme" that assigns characters to numbers
  ASCII code: 7 bits (8 bits in extended versions), stored in 1 byte
  Unicode: originally 16 bits (2 bytes)
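A coding scheme is a two-way mapping between characters and numbers. The sketch below shows that mapping in Java using casts. `CodeDemo`, `codeOf`, and `charOf` are illustrative names, not part of the course material.

```java
public class CodeDemo {
    // Character -> its numeric code.
    static int codeOf(char c) {
        return (int) c;
    }

    // Numeric code -> the character it stands for.
    static char charOf(int code) {
        return (char) code;
    }

    public static void main(String[] args) {
        System.out.println(codeOf('A'));                         // 65
        System.out.println(charOf(65));                          // A
        // 65 fits in 7 binary digits, the size of original ASCII.
        System.out.println(Integer.toBinaryString(codeOf('A'))); // 1000001
    }
}
```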

Page 5: CIS 234: Character Codes Dr. Ralph D. Westfall April, 2011

ASCII Code
used for teletypes before computers
128 characters in original ASCII
0 to 31 (decimal) control the machine
  7 (BEL) rings bell
  8 (BS) backspace key
  10 (LF) line feed (go down 1 line)
  13 (CR) carriage return (to left of page)
  Java: '\n' = 10 and '\r' = 13; Windows ends each line with both together (2 bytes)
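As a quick check of the control codes listed here, the following sketch prints the numeric values behind Java's escape sequences. In Java, '\n' is the single character LF (10) and '\r' is CR (13); the two-character pair is the Windows line ending. `ControlCodes` and `code` are hypothetical names.

```java
public class ControlCodes {
    // Return the numeric code of a character.
    static int code(char c) {
        return c;
    }

    public static void main(String[] args) {
        System.out.println(code('\b'));      // 8  (BS)
        System.out.println(code('\n'));      // 10 (LF)
        System.out.println(code('\r'));      // 13 (CR)
        // CR followed by LF is two separate characters.
        System.out.println("\r\n".length()); // 2
    }
}
```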

Page 6: CIS 234: Character Codes Dr. Ralph D. Westfall April, 2011

ASCII Characters
A = 41 hex (65 decimal), Z = 5A hex (90 decimal)
a = 61 hex (97 decimal), z = 7A hex (122 decimal)
  see calculator (String or ASCII choices)
space character = 20 hex (32 decimal)
  see how space character code is used in browser Address textbox
; (semicolon) = 3B hex (59 decimal)
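The hex codes above can be reproduced in Java. The second helper assumes the browser behavior the slide points to is percent-encoding, where a space in an address typically appears as %20. Both helper names (`hexOf`, `percentEncode`) are invented for this sketch.

```java
public class HexCodes {
    // Hexadecimal code of a character, in upper case.
    static String hexOf(char c) {
        return Integer.toHexString(c).toUpperCase();
    }

    // A percent sign followed by the two-digit hex code.
    static String percentEncode(char c) {
        return String.format("%%%02X", (int) c);
    }

    public static void main(String[] args) {
        System.out.println(hexOf('A'));         // 41
        System.out.println(hexOf('z'));         // 7A
        System.out.println(hexOf(';'));         // 3B
        System.out.println(percentEncode(' ')); // %20
    }
}
```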

Page 7: CIS 234: Character Codes Dr. Ralph D. Westfall April, 2011

Printable ASCII Characters
[Image: table of printable ASCII characters, starting with (space)]
ASCII image is from Wikipedia

Page 8: CIS 234: Character Codes Dr. Ralph D. Westfall April, 2011

ASCII Numbers
codes are for characters on screen and do NOT equal the values of the characters
code numeric values can NOT be used in calculations without adjustments
  0 = 30 hex (ASCII 0 is really 48 decimal)
  9 = 39 hex (57 decimal)
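A small sketch of why the adjustment matters: adding two digit characters adds their codes, not their values. Subtracting the code of '0' (48) is one common adjustment. `DigitCodes` and `digitValue` are illustrative names.

```java
public class DigitCodes {
    // Convert a digit character '0'..'9' to its mathematical value.
    static int digitValue(char digit) {
        return digit - '0'; // same as digit - 48
    }

    public static void main(String[] args) {
        // Adds the codes 50 and 51, not the values 2 and 3.
        System.out.println('2' + '3');                         // 101
        // Adjust each code first, then add.
        System.out.println(digitValue('2') + digitValue('3')); // 5
    }
}
```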

Page 9: CIS 234: Character Codes Dr. Ralph D. Westfall April, 2011

Unicode
ASCII is a 7-8 bit encoding scheme
  128-256 character limit
Unicode started as a 16-bit scheme
  Uni comes from the word universal
  16 bits can code 65,536 characters (Unicode now defines more than that)
Java uses Unicode encoding so that it can be used for many different languages
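The "actually more" point can be seen from Java itself: a `char` holds 16 bits, and a character whose code is above FFFF hex takes two `char` values. The sketch below is only an illustration; the class and method names are made up, and 1F600 hex is just one example of a code above the 16-bit range.

```java
public class BeyondSixteenBits {
    // How many 16-bit char values Java needs for one Unicode code.
    static int charsNeeded(int codePoint) {
        return Character.charCount(codePoint);
    }

    public static void main(String[] args) {
        // Largest value a single 16-bit char can hold.
        System.out.println((int) Character.MAX_VALUE); // 65535
        System.out.println(charsNeeded(0x0041));       // 1
        System.out.println(charsNeeded(0x1F600));      // 2
    }
}
```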

Page 10: CIS 234: Character Codes Dr. Ralph D. Westfall April, 2011

Unicode - 2
Unicode characters for many languages
  Western alphabets: Latin (English), Greek, Cyrillic (Russian), etc.
    Unicode uses 00000000 + ASCII for English
    00000000 01000001 = A (65 decimal)
  Asian characters: CJK (Chinese, Japanese, Korean) has over 20,000 characters
many character systems require installing special fonts onto user's computer

Page 11: CIS 234: Character Codes Dr. Ralph D. Westfall April, 2011

Using Unicode in Java
char letter = 'A' ; // easiest way
char letter = '\u0041' ; // also = 'A'
char letter = '\u3220' ; // or '\u3280' ;
  // 1 Chinese character for 1
\ (backslash) = escape character
\u means Unicode (#s are in hexadecimal)
char sound = '\u0007' ; // BEL sounds speakers when "printed" to screen
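The declarations on this slide are alternatives, so they cannot all be named `letter` in one program. A runnable sketch with separate variable names might look like this. `UnicodeEscapes` and `hex4` are hypothetical names. Printing the character stored in `one` itself is left out because the result depends on the console's font and encoding.

```java
public class UnicodeEscapes {
    // Four-digit hexadecimal code of a character.
    static String hex4(char c) {
        return String.format("%04X", (int) c);
    }

    public static void main(String[] args) {
        char letter = '\u0041';            // hex 0041 is the code for A
        System.out.println(letter == 'A'); // true
        System.out.println(hex4('A'));     // 0041

        char one = '\u3220';               // Chinese numeral 1, shown in parentheses
        System.out.println((int) one);     // 12832 (3220 hex in decimal)
    }
}
```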

Page 12: CIS 234: Character Codes Dr. Ralph D. Westfall April, 2011

Review Questions
How many bits are there in ASCII code?
How many bits are there in Unicode?
True or False: All ASCII codes can be seen as characters on the screen
How many characters can be printed using ASCII? Using Unicode? (match 2)
  around 90, around 12,000, over 50,000

Page 13: CIS 234: Character Codes Dr. Ralph D. Westfall April, 2011

Review Questions - 2
Why was Unicode created to handle over 50,000 characters?
Give an example of what some non-printable ASCII character does on a computer or screen
How does Java code need to handle calculations on numeric characters entered on the screen by the user?

Page 14: CIS 234: Character Codes Dr. Ralph D. Westfall April, 2011

Review Questions - 3
Is a space a character?
What is the Chinese character for the number 1? 2? 3?
  this will NOT be on a test!
  see answers on next slide

Page 15: CIS 234: Character Codes Dr. Ralph D. Westfall April, 2011

Chinese Characters: 3, 2 and 1

Page 16: CIS 234: Character Codes Dr. Ralph D. Westfall April, 2011

Appendix
the following slides show how ASCII characters can be read from the keyboard and converted to values that can be used for mathematical calculations

Page 17: CIS 234: Character Codes Dr. Ralph D. Westfall April, 2011

Reading Characters in DOS
int iInit = System.in.read() ;
  gets numeric value of character it reads
  if character is A, iInit = 65 (decimal)
char cInit = (char) System.in.read() ;
  (char) "casts" (converts) numeric value to character type
System.out.println(iInit) ; // number
System.out.println(cInit) ; // character
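The statements on this slide can be tried as a complete program. In the sketch below, a `ByteArrayInputStream` holding "AB" stands in for the keyboard so that it runs without typing; passing `System.in` to the helper reads from the console instead. `ReadCharDemo` and `readCode` are illustrative names. `read()` can throw an `IOException`, which the helper rethrows as an unchecked exception.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class ReadCharDemo {
    // Read one byte and return its numeric code (-1 at end of input).
    static int readCode(InputStream in) {
        try {
            return in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for keyboard input; use System.in for real typing.
        InputStream in = new ByteArrayInputStream(
                "AB".getBytes(StandardCharsets.US_ASCII));

        int iInit = readCode(in);         // numeric value of first character
        char cInit = (char) readCode(in); // second character, cast to char

        System.out.println(iInit); // 65
        System.out.println(cInit); // B
    }
}
```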

Page 18: CIS 234: Character Codes Dr. Ralph D. Westfall April, 2011

Reading Characters in Java - 2
2 characters sent when hit Enter key
  CR (13) and then LF (10 decimal)
when accepting keyboard input from DOS window in Java, need to "absorb" both characters from Enter keystroke
  System.in.read(); System.in.read();
  reads characters, doesn't store (=) them
  program is now ready to read next input
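A sketch of the "absorb" step, assuming the input really does end each line with CR then LF, as in a DOS/Windows console. On other systems the Enter key may send only LF, in which case a single extra read would be enough. The simulated input and the names `SkipEnter`, `readCode`, and `firstCharOfEachLine` are assumptions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class SkipEnter {
    // Read one byte and return its numeric code (-1 at end of input).
    static int readCode(InputStream in) {
        try {
            return in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Read the first character of two lines that each end with CR LF.
    static String firstCharOfEachLine(InputStream in) {
        char first = (char) readCode(in);
        readCode(in); // absorb CR (13); result is not stored
        readCode(in); // absorb LF (10); result is not stored
        char second = (char) readCode(in);
        return "" + first + second;
    }

    public static void main(String[] args) {
        // Simulates typing A, Enter, B, Enter in a DOS window.
        InputStream in = new ByteArrayInputStream(
                "A\r\nB\r\n".getBytes(StandardCharsets.US_ASCII));
        System.out.println(firstCharOfEachLine(in)); // AB
    }
}
```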

Page 19: CIS 234: Character Codes Dr. Ralph D. Westfall April, 2011

Using Characters for Math
numbers (characters) read from keyboard have numeric values
need to convert character's decimal value to its mathematical value
  0 = 30 h (48 decimal), 9 = 39 h (57)
  math value = decimal value - 48
int quantity = System.in.read() - 48 ;
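The conversion can be tried end to end with simulated input. The sketch assumes the user types a single digit. A `ByteArrayInputStream` containing "7" stands in for the keyboard, and `CharMath` and `readDigit` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class CharMath {
    // Read one digit character and return its mathematical value.
    static int readDigit(InputStream in) {
        try {
            return in.read() - 48; // 48 is the code for the character 0
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the user typing 7; use System.in for real typing.
        InputStream in = new ByteArrayInputStream(
                "7".getBytes(StandardCharsets.US_ASCII));
        int quantity = readDigit(in);
        System.out.println(quantity);     // 7
        System.out.println(quantity * 2); // 14
    }
}
```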