1001₂ Things Every Self-Respecting Computer Scientist Should Know
Ethics and
but might not learn in CS101-CS390

David Evans
evans@cs.virginia.edu
http://www.cs.virginia.edu/evans/cs390
Why This Isn't a Research Pitch
- The students I want to work with are:
  - Resourceful enough to learn about my research by visiting my web page and reading papers
  - Smart enough to pick a thesis advisor by talking to current/recent students
- I only have one hour and there are more important things to tell you than about my own research
- I may go over; feel free to leave at any time
1001 Questions
0000 What is Computer Science?
0001 What problem did the first electronic programmable computer solve?
0010 Why was the first personal computer built?
0011 Is Computer Science a science, engineering or other?
0100 What are the world's most complex programs?
0101 How do Computer Scientists manage complexity?
0110 Who was the first object-oriented programmer?
0111 Who invented the Internet?
1000 Why should we say goodbye to "Hello World!"?
0. What is Computer Science?
Let AB and CD be the two given numbers not relatively prime. It is required to find the greatest common measure of AB and CD.
If now CD measures AB, since it also measures itself, then CD is a common measure of CD and AB. And it is manifest that it is also the greatest, for no greater number than CD measures CD.
Euclid's Elements, Book VII, Proposition 2 (300 BC)
The note on the inflected line is only difficult to you, because it is so easy. There is in fact nothing in it, but you think there must be some grand mystery hidden under that word inflected! Whenever from any point without a given line, you draw a line to any point in the given line, you have inflected a line upon a given line.
Ada Byron (age 19), letter to Annabella Acheson (explaining Euclid), 1834
What is the difference between Euclid and Ada?
"It depends on what your definition of 'is' is."
Bill Gates (at Microsoft's anti-trust trial)
Geometry vs. Computer Science
- Geometry (mathematics) is about declarative knowledge: "what is"
  - If now CD measures AB, since it also measures itself, then CD is a common measure of CD and AB
- Computer Science is about imperative knowledge: "how to"
- About computing, not computers
- An unnatural science
Computer Science
"How to" knowledge:
- Ways of describing imperative processes (computations)
- Ways of reasoning about (predicting) what imperative processes will do
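Euclid's proposition says what the greatest common measure is; a program says how to find it. A minimal Python sketch of the remainder-based form of Euclid's algorithm as it is commonly taught today (the function name `gcd` and the parameter names, borrowed from the proposition's AB and CD, are illustrative choices, not from the slides):

```python
def gcd(ab: int, cd: int) -> int:
    """Greatest common measure of two positive integers."""
    # Replace the pair with (smaller, remainder) until nothing is left over.
    # When cd measures ab exactly, the remainder is 0 and cd is the answer.
    while cd != 0:
        ab, cd = cd, ab % cd
    return ab
```

For example, `gcd(48, 18)` steps through (18, 12), (12, 6), (6, 0) and returns 6. The loop is the imperative process; the proposition is the reasoning that predicts what it will produce.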
1. What problem did the first electronic programmable computer solve?
Colossus
- First Programmable Computer
- Bletchley Park, 1943
- Designed by Tommy Flowers
- 10 Colossi in operation at end of WWII
- Destroyed in 1960, kept secret until 1970s
- (2 years before ENIAC calculating artillery tables)
Colossus Problem
- Decode Nazi high command messages from Lorenz Machine
- XOR encoding: $C_i = M_i \oplus K_i$
- Perfect cipher, if K is random and secret
Why perfectly secure?
For any given ciphertext, all plaintexts are equally possible.
Ciphertext: 0100111110101
Key:        1100000100110
Plaintext:  1000111010011 = CS10B
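The bit strings on this slide can be checked directly. A small Python sketch (the helper `xor_bits` and the second plaintext are made up for illustration) that decrypts the example and then shows why the ciphertext alone reveals nothing: for any plaintext you care to name, some key produces it.

```python
def xor_bits(a: str, b: str) -> str:
    """Bitwise XOR of two equal-length strings of '0'/'1' characters."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return "".join("1" if x != y else "0" for x, y in zip(a, b))

ciphertext = "0100111110101"
key = "1100000100110"
plaintext = xor_bits(ciphertext, key)  # decrypting is the same XOR as encrypting

# Pick any other plaintext of the same length (arbitrary, for illustration).
# The key that "explains" it is just ciphertext XOR that plaintext.
other_plaintext = "1111111111111"
other_key = xor_bits(ciphertext, other_plaintext)
```

Here `plaintext` comes out as `1000111010011`, matching the slide, and `xor_bits(ciphertext, other_key)` gives back `other_plaintext`. With a random, secret key, every candidate plaintext is equally consistent with what was intercepted.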
Breaking Lorenz
- Operator and receiver need same keys
- Generate key bits using rotor machine, start with same configuration
- One operator retransmitted a message (but abbreviated message header the second time!)
- Enough for Bletchley Park to figure out key and structure of machine that generated it!
- But still had to try all configurations
Colossus
- Read ciphertext and Lorenz wheel patterns from tapes
- Tried each alignment, calculated correlation with German
- Decoded messages (63M letters by 10 Colossus machines) that enabled Allies to know German troop locations to plan D-Day
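The "try each alignment, score the result" idea can be sketched as a toy in Python. Everything here is a simplifying assumption: one made-up wheel of 7 bytes instead of the Lorenz machine's twelve wheels and 5-bit teleprinter code, ASCII text, and a crude letter count standing in for Colossus's statistical tests. It illustrates the search strategy only, not the real attack.

```python
def keystream(wheel: bytes, start: int, length: int) -> bytes:
    """Key bytes read from a cyclic wheel pattern, beginning at position start."""
    return bytes(wheel[(start + i) % len(wheel)] for i in range(length))

def xor_bytes(data: bytes, key: bytes) -> bytes:
    return bytes(d ^ k for d, k in zip(data, key))

def score(candidate: bytes) -> int:
    """Crude stand-in for 'correlation with German': count ASCII letters and spaces."""
    return sum(1 for b in candidate if b == 32 or (b < 128 and chr(b).isalpha()))

def best_alignment(ciphertext: bytes, wheel: bytes) -> int:
    """Try every wheel start position and keep the one whose output looks most like text."""
    n = len(ciphertext)
    return max(range(len(wheel)),
               key=lambda s: score(xor_bytes(ciphertext, keystream(wheel, s, n))))

# Hypothetical wheel pattern and message, invented for this example.
wheel = bytes([0x9C, 0x3A, 0xE7, 0x51, 0x0F, 0xB2, 0x68])
message = b"WETTERBERICHT FUER MORGEN"
ciphertext = xor_bytes(message, keystream(wheel, 4, len(message)))

found = best_alignment(ciphertext, wheel)  # recovers start position 4
decoded = xor_bytes(ciphertext, keystream(wheel, found, len(ciphertext)))
```

The wheel pattern is known but its starting position is not, so the search space is small enough to try exhaustively. The real machines faced a vastly larger space, which is why the work needed special-purpose hardware.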
2. Why was the first personal computer built?
Apollo Guidance Computer, 1961-69
- 1 cubic foot, 70 pounds
- Why did they need to fit the guidance computer in the rocket?
- 4KB of read/write magnetic core memory
- 64KB of read-only memory
AGC History
- Needed all guidance to be on board in case Soviets jammed signals from Earth
- Design began in 1961
- Risky decision to use Integrated Circuits (invented in 1958)
- Building 4 prototypes used 60% of all ICs produced in the US in the early 60s!
- Spurred industry growth
3. Science, Engineering or Other?
Science?
- Understanding Nature through Observation
- About real things like bowling balls, black holes, antimatter, electrons, comets, etc.
- Math and Computer Science are about fake things like numbers, graphs, functions, lists, etc.
- Computer Science is a useful tool for doing real science, but not a real science
Engineering?
Engineering is design under constraint ... Engineering is synthetic - it strives to create what can be, but it is constrained by nature, by cost, by concerns of safety, reliability, environmental impact, manufacturability, maintainability and many other such 'ilities.' ...
Computing Power 1969-2002 (in Apollo Control Computer Units)
- Moore's Law: computing power doubles every 18 months!
- If Apollo Guidance Computer power is 1 inch, you have 5 miles! (1GB/4KB = 262144)
Chart labels: 300, 8388608, 2516582400 bits (300 MB to print poster); 2002: 4194304 (4 million!); Alpha Centauri: 26500000000000 (26.5 trillion), 110416666.666667
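Two of the numbers on this slide follow from short arithmetic, sketched here in Python. The function name `doublings` is an illustrative choice, and binary units (1 GB = 2^30 bytes, 1 KB = 2^10 bytes) are assumed because they reproduce the slide's 262144.

```python
def doublings(start_year: int, end_year: int, months_per_doubling: int = 18) -> float:
    """Number of Moore's Law doublings between two years."""
    return (end_year - start_year) * 12 / months_per_doubling

# 1969 to 2002 is 33 years, or 22 doublings at 18 months each.
growth_2002 = 2 ** doublings(1969, 2002)

# Memory ratio from the slide: 1 GB against the AGC's 4 KB of read/write memory.
memory_ratio = (1 * 2**30) // (4 * 2**10)
```

`growth_2002` is 4194304, the "4 million" figure, and `memory_ratio` is 262144.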
Constraints Computer Scientists Face
- Not like those for engineers:
  - Cost, weight, physics, etc.
  - If 8 Million times what NASA had in 1969 isn't enough for you, wait until 2006 and you will have 32 Million times
- More like those for Musicians and Poets:
  - Imagination and Creativity
  - Complexity of what we can understand
  - Cost of human effort
So, what is computer science?
- Science
  - No: it's about fake things like numbers, not about observing and understanding nature
- Engineering
  - No: we don't have to deal with engineering-type constraints
- Must be a Liberal Art!
The Liberal Arts
- Trivium (3 roads): language
  - Grammar
  - Rhetoric
  - Logic
- Quadrivium (4 roads): numbers
  - Arithmetic
  - Geometry
  - Music
  - Astronomy
Liberal Arts
Trivium
- Grammar: study of meaning in written expression
  - Yes, we need to understand meaning to describe computations
- Rhetoric: comprehension of verbal and written discourse
  - Interfaces between components, discourse between programs and users
- Logic: argumentative discourse for discovering truth
  - Logic for controlling and reasoning about computations
Quadrivium
- Arithmetic: understanding numbers
  - Yes
- Geometry: quantification of space
  - Yes (graphics)
- Music: number in time
  - Yes (read Gödel, Escher, Bach)
- Astronomy: laws of the planets and stars
  - Yes, read Neil DeGrasse Tyson's essay
4. What are the world's most complex programs?
Complex Programs
- Apollo Guidance Software: ~36K instructions
- F-22 Stealth Fighter Avionics Software: 1.5M lines of code (Ada)
- 5ESS (phone switching software): 18M lines
- Windows XP: ~50M lines (1 error per kloc ~ 50,000 bugs)
- Anything more complex?
Human Genome
- Produces 60 Trillion Cells (6 * 10^13)
- 50 Million die every second!
Today is the 50th anniversary of the most important scientific paper of the 20th century!
Molecular structure of Nucleic Acids, James Watson and Francis Crick. Letter to Nature, sent 2 April 1953 (2 pages)
How Big is the Make-a-Human Program?
- 3 Billion Base Pairs
- Each nucleotide is 2 bits (4 possibilities)
- 3B bases * 1 byte/4 pairs = 750 MB
- 1 CD ~ 650 MB
- Wal-Mart's database is 280 Terabytes
Encoding is Redundant
- DNA encodes proteins
- Every sequence of 3 base pairs codes for one of 20 amino acids (or stop codon)
- 21 possible meanings, but 4^3 = 64 possible values
- So, really only 750 MB * (21/64) ~ 246 MB
- Trillions of creatures, over millions of years, had to die to create this program!
Expressiveness of DNA
- Genetic code for 2 humans differs in only 2 million bases
- 4 million bits = 0.5 MB
- 1/3 of a floppy disk
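The genome size estimates on the last three slides can be reproduced with a few lines of Python. This is a sketch of the slides' own back-of-envelope method, with decimal megabytes (10^6 bytes) and a 1.44 MB floppy as assumptions; the variable names are illustrative.

```python
BASE_PAIRS = 3_000_000_000   # 3 billion base pairs
BITS_PER_BASE = 2            # 4 possible nucleotides
BYTES_PER_MB = 1_000_000     # decimal megabytes, an assumption

# Raw size: 2 bits per base, 8 bits per byte.
genome_mb = BASE_PAIRS * BITS_PER_BASE / 8 / BYTES_PER_MB

# Redundancy: 64 possible triples carry only 21 distinct meanings
# (20 amino acids plus stop), scaled the same way the slide does.
codon_values = 4 ** 3
meanings = 21
effective_mb = genome_mb * meanings / codon_values

# Difference between two humans: 2 million bases.
diff_mb = 2_000_000 * BITS_PER_BASE / 8 / BYTES_PER_MB
floppy_fraction = diff_mb / 1.44   # 1.44 MB floppy capacity is an assumption
```

`genome_mb` is 750.0, `effective_mb` is about 246, `diff_mb` is 0.5, and `floppy_fraction` is roughly one third.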
5. How do Computer Scientists manage complexity?
Abstraction
Adapted from Gerard Holzmann's FSE Slides
Abstraction in Computer Science
- Procedural Abstraction (CS101)
  - Abstract what to do from specific values to do it to
- Data Abstraction (CS201)
  - Abstrac