
Columbia University in the City of New York
Department of Computer Science
1214 Amsterdam Avenue
Mailcode: 0401
New York, NY 10027-7003


CS@CU NEWSLETTER OF THE DEPARTMENT OF COMPUTER SCIENCE AT COLUMBIA UNIVERSITY WWW.CS.COLUMBIA.EDU


Alfred V. Aho is Lawrence Gussman Professor of Computer Science and Vice Chair for Undergraduate Education in the Computer Science Department. He served as Chair of the department from 1995 to 1997 and in the spring of 2003. He was elected to the U.S. National Academy of Engineering for contributions to the fields of algorithms and programming tools.

Professor Aho received his B.A.Sc. in Engineering Physics from the University of Toronto and a Ph.D. in Electrical Engineering/Computer Science from Princeton University. Prior to his current position at Columbia, Professor Aho served in many capacities at the Computing Sciences Research Center at Bell Labs, such as Vice President, Director, department head, and member of technical staff. This is the lab that invented UNIX, C and C++. Al was also the General Manager of the Information Sciences and Technologies Research Laboratory at Bellcore (now Telcordia).

Professor Aho is the “A” in AWK, a widely used pattern-matching language (you can think of AWK as the initial pure version of perl). “W” is Peter Weinberger and “K” is Brian Kernighan. Al also wrote the initial versions of the string pattern-matching programs egrep and fgrep that first appeared on UNIX. His current research interests include quantum computing, programming languages, compilers, and algorithms.


CUCS Resources

You can subscribe to the CUCS news mailing list at http://lists.cs.columbia.edu/mailman/listinfo/cucs-news/

CUCS colloquium information is available at http://www.cs.columbia.edu/lectures

Visit the CUCS alumni portal at http://alum.cs.columbia.edu

At this year’s computer science department “Hello Meeting,” members of the faculty recalled the early days of the department when the same meeting could be held on the first floor of Mudd, in the small area that is now used for making sandwiches in the Carleton Lounge. Since then, the computer science department has grown both in size and prominence. The faculty now includes six members of the National Academies: Al Aho, Steven M. Bellovin, Zvi Galil, Ted Shortliffe, Joe Traub, and Vladimir Vapnik. Here is a brief overview of their achievements and contributions.

Columbia Computer Science Faculty in the National Academies

NEWSLETTER OF THE DEPARTMENT OF COMPUTER SCIENCE AT COLUMBIA UNIVERSITY VOL.3 NO.1 FALL 2006

Alfred V. Aho, Steven M. Bellovin, Zvi Galil, Edward H. Shortliffe, Joseph F. Traub, Vladimir Vapnik


Cover Story (continued)

Professor Aho has won the IEEE John von Neumann Medal and has received honorary doctorates from the Universities of Helsinki and Waterloo. He is a Member of the American Academy of Arts and Sciences and a Fellow of the American Association for the Advancement of Science, ACM, Bell Labs, and IEEE. He received the Great Teacher Award for 2003 from the Society of Columbia Graduates.

Steven M. Bellovin is a Professor of Computer Science. He was elected to the National Academy of Engineering for contributions to network applications and security.

Professor Bellovin joined the faculty in 2005 after many years at Bell Labs and AT&T Labs Research, where he was an AT&T Fellow. He was first exposed to computers in 10th grade at Stuyvesant High School and has been working in the field ever since. He received a BA degree from Columbia University, and an MS and PhD in Computer Science from the University of North Carolina at Chapel Hill. As a graduate student, he helped create Netnews; for this, he and his fellow contributors were awarded the 1995 Usenix Lifetime Achievement Award.

Bellovin is the co-author of the book “Firewalls and Internet Security: Repelling the Wily Hacker,” and holds several patents on cryptographic and network protocols. His current research interests include security and privacy issues in large scale and distributed networks. He is also interested in human factor aspects of security.

Professor Bellovin has served on many National Research Council study committees, including those on information systems trustworthiness, the privacy implications of authentication technologies, and cybersecurity research needs. He was also a member of the information technology subcommittee of an NRC study group on science versus terrorism. He was a member of the Internet Architecture Board from 1996-2002, and was co-director of the Security Area of the IETF from 2002 through 2004. He sits on the Department of Homeland Security’s Science and Technology Advisory Board.

Zvi Galil is the Dean of the School of Engineering and Applied Science. He is the Julian Clarence Levi Professor of Mathematical Methods and Computer Science, and served as Chairman of the Department of Computer Science from 1989 until July 1995. In 2004, Dean Galil was elected to the National Academy of Engineering for his contributions to the design and analysis of algorithms and for leadership in computer science and engineering.

Dean Galil began his studies in applied mathematics; he soon turned to the emerging field of computer science, which he saw as an extension of his interest in mathematics. His main research pursuits are in the design and analysis of algorithms, computational complexity, and cryptography. He is a world leader in the design and analysis of computer algorithms and has developed a number of techniques to improve their efficiency. Many of his algorithms remain the fastest at solving a particular kind of problem or use the smallest amount of memory to find a solution. He has collaborated with scientists in diverse fields, including biology, mathematics, and statistics, to devise novel ways to attack difficult problems.

Dean Galil has been a professor at Columbia University since 1982, having previously served at Tel Aviv University. He is the author of more than 200 research papers in refereed journals and is the editor of the book “Computational Algorithms on Strings.” Although being the Dean of Engineering and Applied Science has taken up most of his time for the last 12 years, he reports, with great mirth, that he still tries to find the most efficient way to solve problems.

Edward H. Shortliffe is Rolf A. Scholdager Professor and Chair of the Department of Biomedical Informatics at Columbia College of Physicians and Surgeons. He is an elected member of the Institute of Medicine of the National Academy of Sciences.

After receiving an A.B. in Applied Mathematics from Harvard College in 1970, he moved to Stanford University where he could pursue interests in medicine and computer science, and was awarded a Ph.D. in Medical Information Sciences in 1975 and an M.D. in 1976. Subsequently, he served as Professor of Medicine and of Computer Science at Stanford University. In January 2000 he assumed his new post at Columbia University, where he is also Deputy Vice President (Columbia University Medical Center); Senior Associate Dean for Strategic Information Resources (College of Physicians and Surgeons); Professor of Medicine; Professor of Computer Science; and Director of Medical Informatics Services for the New York Presbyterian Hospital.

During the early 1970s, Professor Shortliffe was principal developer of the medical expert system known as MYCIN. After a pause for internal medicine house-staff training at Massachusetts General Hospital and Stanford Hospital between 1976 and 1979, he joined the Stanford internal medicine faculty, where he served as Chief of General Internal Medicine and Associate Chair of Medicine for Primary Care while directing an active research program in clinical information systems and decision support. He spearheaded the formation of a Stanford graduate degree program in biomedical informatics and divided his time between clinical medicine and biomedical informatics research.

Professor Shortliffe continues to be closely involved with biomedical informatics graduate training where he works to create a new breed of health professional. His research interests include the broad range of issues related to integrated decision-support systems, their effective implementation, and the role of the Internet in health care.

Joseph F. Traub is the Edwin Howard Armstrong Professor of Computer Science. He came to Columbia in 1979 as founding chair of the Computer Science Department. From 1971 to 1979 he was Head of the Computer Science Department at Carnegie Mellon University. He served as founding Chairman of the Computer Science and Telecommunications Board (CSTB) of the National Academies from 1986 to 1992 and is again serving as Chair. CSTB deals with critical national issues in computing, communications, and public policy (see www.cstb.org). He has served as founding Editor-in-Chief of the Journal of Complexity since 1985.

He is the author, co-author, or editor of nine books and has published some one hundred and twenty papers. His latest book, Complexity and Information, co-authored with Professor Arthur G. Werschulz, was published by Cambridge University Press in 1998.

Traub and Professor Henryk Wozniakowski started the field of information-based complexity (IBC), which is now an active research area involving researchers from many countries. IBC studies optimal algorithms and computational complexity for the continuous problems which arise in physical science, mathematical finance, engineering and economics. A major focus of his current work is quantum computing.

He has received numerous honors including election to the National Academy of Engineering in 1985, the Emanuel R. Piore Medal from IEEE in 1991, selection by the Accademia Nazionale dei Lincei in Rome to present the 1993 Lezione Lincei, and the 1999 Mayor’s Award for Excellence in Science and Technology in New York City. In 2001, he received an honorary Doctorate of Science from the University of Central Florida.

Vladimir Vapnik is Professor of Computer Science at Columbia and Fellow at NEC Labs. He is currently at the Center for Computational Learning Systems. The National Academy of Engineering elected Professor Vapnik as a new member in 2006 for “insights into the fundamental complexities of learning and for inventing practical and widely applied machine-learning algorithms.”

Professor Vapnik obtained his Masters Degree in Mathematics in 1958 at Uzbek State University, Samarkand, USSR and his Ph.D. at the Institute of Control Sciences in the Academy of Sciences, Moscow, in 1964. From 1964 to 1990 he worked at the Institute, where he became Head of the Computer Science Research Department. He then joined AT&T Bell Laboratories, Holmdel, NJ, as a Consultant, and was appointed Professor of Computer Science and Statistics at Royal Holloway in 1995.

Professor Vapnik has taught and researched in computer science and theoretical and applied statistics for over thirty years. He has published six monographs and more than one hundred research papers. His major achievements have been the development of a general theory for minimizing the expected risk of predictors using empirical data, and a new type of learning machine (the Support Vector Machine) that possesses a high level of generalization ability. These techniques have been used to solve many pattern recognition and regression estimation problems and have been applied to the problems of dependency estimation, forecasting, and constructing intelligent machines. Professor Vapnik’s research combines mathematical analysis of the problem of empirical inference in complex (high dimensional) worlds with philosophical interpretation of such inferences based on the imperative which he formulated in 1995: “When solving a problem of interest, do not solve a more general problem as an intermediate step. Try to get the answer that you really need but not a more general one.”

These six National Academy members represent the growth and prominence of the Columbia Computer Science Department since its modest beginnings. Undergraduates and graduate students alike are fortunate to have such giant shoulders to stand on.

T.C. Chang Professor of Computer Science Shree Nayar.

Professor Shree Nayar was named as the recipient of the 2006 Great Teacher Award for the School of Engineering at Columbia University. The award is bestowed by the Society of Columbia Graduates. Its Board of Directors named Professor Nayar for this award because it feels that he exemplifies the greatest traditions of teaching at Columbia and has earned the recognition of his students and his peers as a dedicated and inspired undergraduate teacher and mentor.

As one of the Society’s Great Teachers, Shree joins the ranks of Columbia’s finest and most beloved professors, such as Mark Van Doren, Lionel Trilling, Mario Salvadori, Morton Friedman, Rene Testa, and others. The Great Teachers Award Dinner was held in Low Library on the evening of Thursday, October 19, 2006.

The Society was formed in 1909; it will soon be celebrating its 100-year anniversary. Throughout much of its existence, the Society’s principal mission has been to recognize great service to Columbia by its alumni and by its faculty.

Congratulations to Shree on this well-deserved recognition of his outstanding teaching!

Professor Shree Nayar Named as 2006 Great Teacher

Feature Article


Other News

New Computer Science Courses

Debbie Cook and Moti Yung offered COMS 6998-1, Practical Cryptography, in the Fall 2006 semester. The course focuses on practical aspects of cryptography to complement the topics covered in “Introduction to Cryptography” and in “Network Security.” The course consists of a mixture of lectures and student presentations of current research papers. Topics include the design of algorithms used in practice and cryptanalysis, hash functions, elliptic curves, electronic cash, threshold cryptography, forward security, key insulated security, and trusted computing. The course requires a semester-long project on a topic of the student’s choice. The projects are practical in nature and include implementations of attacks on or the statistical analysis of algorithms, and implementations of electronic cash and electronic voting systems.

Donna Dillenberger offered a new 6000 level graduate course, Advanced Computer Design, in the Fall 2006 semester. The goal of the course is to train students to design systems holistically, from

• The hardware level (chip level, I/O subsystems, packaging, cooling considerations), to

• The virtualization layer (hardware, software hypervisors, paravirtualization), to

• The operating system (advanced topics: workload management, recovery, availability, reliability, serviceability, metering), to

• Middleware scalability and clustering issues.

Students learn how tradeoffs in these layers lead to different optimizations for different types of workloads. The course uses real case studies in the design of mainframes, desktops, PDAs and set top boxes to illustrate architectural concepts. The final assignment is for students to write a paper on how they would design their own ideal system for a particular workload they are targeting, describing their ideal hardware, operating system, virtualization and I/O layers. At the end of class, students pitch their designs to a fictitious CEO (our class) and the class votes for which system to “bet” the company on.

Prabhakar Kudva offered CSEE 4340 (formerly listed as ELEN 4340) for the first time as a Computer Science department course in the Spring 2006 semester. This course is a hands-on, laboratory-based introduction to the design of digital systems, culminating in the design of a complete working computer. Students learn computer architecture and register-transfer-level specification, describe and simulate designs using VHDL, and implement a computer using FPGAs. This is a practical course, not a theoretical one. The goal is to teach the art and practice of digital system design by working on a real design; as such, most of the focus of the course is the laboratory and the project. The course includes a lot of hands-on experience with industrial design tools and techniques. Students make extensive use of the industry-standard language VHDL, as well as the Cadence tool suite, which is also widely used in industry, to develop their designs.

Professor Itsik Pe’er taught COMS 6998-3, Computational Human Genetics, in the Fall 2006 semester. This course is intended to introduce students of both computational and biomedical skill sets to current quantitative understanding of human genetics and prepare them for computational research in the field. Topics include: genetics of a single site, coalescence with recombination, history of humans, mapping rare mutations through linkage, mapping common variants through association, isolated and admixed populations, natural selection, copy number changes, model organisms, and genotyping technologies. The computational toolbox discussed includes parameter inference, likelihood analysis, hidden Markov and other graphical models, eigenvalue decompositions, and classification problems.

Professor Henning Schulzrinne offered COMS 4995-01, Special Topics in Computer Science: VoIP Security, in the Fall 2006 semester. This is a seminar and lab course on VoIP (voice-over-IP) and VoIP security. The course defines VoIP broadly to include real-time voice, video and instant messaging. Topics covered include

• basic VoIP technology: audio and video coding, RTP, SIP;

• IM and presence: proprietary systems, XMPP and SIMPLE;

• basic security issues: impersonation and authentication, privacy;

• VoIP spam (“spit”, “spim”) and prevention mechanisms;

• VoIP denial-of-service attacks;

• Skype;

• security for peer-to-peer VoIP; and

• emergency calling.

As part of the course, students

• learn about VoIP protocols and technology;

• install, test and measure a complete VoIP system;

• conduct a team project implementing, as open-source software, an aspect of VoIP; and

• prepare a survey talk on a topic related to VoIP.

Adjunct Professors Michael Theobald and Franjo Ivancic taught CSEE 6832, Formal Verification of Hardware and Software Systems, in the Fall 2006 semester. The course introduces the theory and practice of formal methods for the analysis of concurrent and embedded systems. The focus of the course is on model checking, which is an algorithmic approach to verification. Topics include temporal logics, Binary Decision Diagrams, and SAT solvers. Students also learn how to use the popular SPIN and SMV model checkers, and hear about practical experiences of applying formal verification to hardware and software projects in industry. Students may elect to do a research project that is geared towards their background.

Bernard Yee offered COMS 4995-2, Video Game Design and Development, in the Spring 2006 semester. The course covers the process of creating and developing video game designs; the theme is to allow creative vision and technical limitations (time, budget, team size, technology choices, etc.) to constrain and inform each other. Students learn to design and prototype game designs, acquire tools for the critical analysis of the gameplay experience, and apply these critical skills as they play, dissect, tune, and replay games in a rigorous iterative game development process. The course also touches upon related subjects including interactive narrative, character design, multiplayer dynamics, AI design, production processes, and the business of games. The approach uses both academic/lecture/discussion class time and practical, guided workshop sessions. Students are expected to work individually and collaborate as part of a small team. The theme is to replicate a real-world pre-production process.

A screenshot from a student project in COMS 4995-2, Video Game Design and Development.


New Faces

Mansoor Alicherry completed his B.S. in Computer Science at the National Institute of Technology, Calicut, India in 1997. He worked at Wipro from 1997 to 1998. He received his M.S. in Computer Science from the Indian Institute of Science (IISc), Bangalore in 2000, where he specialized in operating systems. He has been working at Lucent Technologies’ Bell Labs since 2000. His research interests are in network security and routing and design algorithms for optical and mobile networks. He is currently working with Professor Angelos Keromytis on network security.

Stuart Andrews began working with Professor Tony Jebara as a postdoc in the Fall of 2006. He holds B.Sc. and M.Sc. degrees from the University of Toronto and shortly expects to receive his Ph.D. from Brown University. His research focuses on semi-supervised and structured-output learning algorithms. At Columbia, he and Professor Jebara are developing algorithms that learn to predict properties of edges in a network of N entities, based on attributes of the entities as well as the global network structure. When applied to a social network, such as the kind maintained by students in a university setting, this algorithm learns how to assign in a balanced fashion a set of likely friends and/or mentors for the incoming class, from the friendship structure of the existing students. Other applications of these methods include network biology and e-commerce.

Hila Becker graduated from Stony Brook University with a B.S. in Computer Science and Applied Math and Statistics in May 2004. She got her M.S. in Computer Science from Columbia University in 2005 and started as a Ph.D. student in the spring of 2006. Hila’s main research interests include theoretical and applied Machine Learning and Information Retrieval. She is currently working in the Center for Computational Learning Systems with Dr. Marta Arias on online-learning techniques for prediction of electricity distribution feeder failures.

Fadi Biadsy graduated from Ben-Gurion University with a B.S. in Computer Science and Mathematics in July 2002. He worked at Dalet Digital Media Systems from 2001 to 2005 as a software architect and developer. He received his M.S. in computer science at Ben-Gurion University with highest honors in 2005; he explored the problem of online handwriting recognition for Arabic script using hidden Markov models. He started as a Ph.D. student in the Department of Computer Science at Columbia University in Spring of 2006. Fadi’s main research interests focus on spoken language processing, particularly identifying charismatic speech, Arabic dialect identification using prosody, and the acoustic/prosodic and intonational structure of the Arabic language. He is currently working with Professor Julia Hirschberg’s Spoken Language Processing group studying the acoustic/prosodic and lexical characteristics of Arabic and American English charismatic speech.

Malek Ben-Salem graduated from the University of Hanover, Germany with a B.S. and M.S. in Electrical Engineering in 1999 and 2001 respectively. She worked at IBM Corporation from 2001 to 2006, did an M.S. in computer science at Columbia from 2003 to 2005, and started as a Ph.D. student in Fall 2006. Malek’s main research interests are Computer Security and Machine Learning. She is currently working with Professor Stolfo’s group studying Insider Intrusion Detection.

Sasha P. Caskey graduated from Brandeis University with a BA in 1997. After graduation he joined The MITRE Corporation in Bedford, MA where he contributed to research and development in Collaboration Systems (CVS, JCS, SIMP), Natural Language Processing and Spoken Dialog Systems (DARPA Communicator). In 2000 he joined a startup, Speechworks Int., where he worked in product development and research on spoken dialog systems. He joined IBM T.J. Watson Research labs in 2003 and is a working first-year Ph.D. student under the supervision of Professor Kathy McKeown in the NLP group.

Oliver Cossairt graduated with a B.S. from Evergreen State College in June 2001. He did an M.S. in Media Arts and Sciences at MIT from 2001 to 2003. After graduating from MIT, he worked at Actuality Systems for three years before he began his PhD at Columbia University in the Fall of 2006. Oliver’s main research interest is in computational displays and cameras, specifically in three-dimensional displays and light fields. He is currently working with Shree Nayar in the Vision and Graphics Group.

Other News

Professor Dana Pe’er uses machine learning techniques to develop computational tools and models to understand how biological networks process signals and execute complex biological functions. Professor Pe’er is interested in understanding the design, function and evolution of molecular networks. Molecular networks regulate the basic processes of life; for example, a sequence of developmental decisions regulates the formation of an entire human body from a single zygote cell. Understanding the normal function of the molecular network will allow us to understand how its “dysfunction” leads to many diseases including cancer and autoimmunity.

Professor Dana Pe’er uses Bayesian networks and other statistical approaches to develop models that integrate heterogeneous types of biological data to uncover interactions between biological components and elucidate how these combine together at a systems level to compute and execute decisions. These models can represent stochastic nonlinear relationships among multiple interacting molecules and accommodate the noise inherent to biologically derived data. Additionally, the models are inherently flexible enough to integrate diverse data types and handle unobserved components. These network models provide the means to address questions such as how does dysregulation cause disease and where are the optimal points for corrective interference and cure.

Dana Pe’er received a B.Sc. in mathematics (1995), an M.Sc. in theoretical computer science (1999) and a Ph.D. for her work in machine learning approaches for reconstruction of molecular networks (2003), all from the Hebrew University of Jerusalem. Professor Pe’er’s graduate research pioneered the use of probabilistic graphical models (Bayesian Networks) for the reconstruction of molecular networks. Her influential adaptation of Bayesian networks has more than 600 citations and is taught in computational biology courses world-wide. To strengthen her cross training in biology, following her graduation, Dr. Pe’er joined the laboratory of Professor George Church in the Genetics department at Harvard Medical School. During her postdoctoral fellowship, she gained biological expertise, while continuing to do research of significant impact and influence on both the computational and biological sciences.

Professor Dana Pe’er has received numerous awards, most notably the Burroughs Wellcome Fund Career Award at the Interface of Science. Her work on reconstructing signaling networks in human cells was runner up for Science Magazine’s Breakthrough of the Year 2005. At Columbia University, Dana Pe’er intends to continue bridging the gap between Computer Science and Biology to elucidate fundamental understandings in biology and disease.

Professor Itsik Pe’er holds bachelors, masters, and doctoral degrees in computer science from Tel Aviv University. He has worked extensively in biomedical research laboratories at the Weizmann Institute and the Broad Institute of Harvard & MIT, where he conducted postdoctoral research. Itsik studies, develops, and applies novel computational methods in human genetics. How is it best to measure, describe and quantify differences between individual DNA sequences? How does sequence variation affect biological processes? How can we use it to understand and influence human disease? All these questions pose complex analytical challenges, with direct impact on medical research. Professor Itsik Pe’er is specifically interested in genetics of special populations that underwent bottleneck and admixture events, in the optimal implementation and analysis of whole genome association studies, and in the interplay between somatic and germline variation.

Computer Science Department Welcomes Professors Dana and Itsik Pe’er

Professor Dana Pe’er, Professor Itsik Pe’er

The Computer Science department is delighted to welcome Dana and Itsik Pe’er as Assistant Professors of Computer Science.


New Faces (continued)

Schulzrinne. He is currently investigating adaptive algorithms suitable for transient peer-to-peer communication networks.

Chris Murphyis working with Prof. GailKaiser in theProgrammingSystems Lab andis investigatingthe software

testing and quality assuranceof machine learning algorithms.He graduated from BostonUniversity with a BS in ComputerEngineering in 1995, and completed his MS in ComputerScience at Columbia this pastspring. Chris worked as a soft-ware architect and consultantfor six years; prior to studyingat Columbia he taught Englishessay writing for two years inSeoul, South Korea. Chris enjoysteaching and was the winnerof the Andrew P. KosoresowMemorial Award for Excellencein Teaching and Service in 2006.

Richard Neill holds a B.S. and M.S. in Electrical Engineering and has held a technology position at Cablevision Systems Corporation. He started his Ph.D. in Fall 2006 working in Prof. Carloni’s research group. Richard’s main research interests are in distributed embedded systems, parallel architectures, and grid computing.

Ohan Oda received his B.S. from the University of Wisconsin-Madison with a double degree in Computer Science and Computer Engineering in May 2005. He has worked at GE Healthcare for two summers as an intern. He started as an M.S. student in Computer Science at Columbia in Fall 2005, and moved to the Ph.D. program in Fall 2006. Ohan’s main research interest is Augmented Reality and Virtual Reality. He is currently working with Professor Steven Feiner’s group developing a framework that will be useful for creating 3D User Interface and Augmented Reality applications using an existing commercial game engine.

Kumiko Ono graduated from Ochanomizu University, Japan with a B.S. in Mathematics. She has worked as a network research engineer at NTT Labs for over ten years, and visited Columbia in 2005. She started as a Ph.D. student in Spring 2006. Her main research interests are VoIP networks and security. She is currently working with Professor Schulzrinne’s group.

Kristen Parton graduated from Stanford University with a B.S. in Computer Science in June 2003. She worked on Intelligent Tutoring Systems at Stottler Henke Associates, Inc. from 2003-2005, then on multilingual web search at Yahoo! in 2006. She started as a Ph.D. student in the Natural Language Processing group in Fall 2006, and will be working with Professor Kathy McKeown and Owen Rambow.

Mariana Raykova graduated from Bard College with a double B.A. in Computer Science and Mathematics in 2006. Her main research interests are in the areas of cryptography and security. She is working with Moti Yung on cryptographic hash functions.

Lily Bao Secora joined the Computer Science Department in April 2006 as the Ph.D. Program Administrator. She manages the day-to-day operations of the Ph.D. program. She has a Master of Education degree in Counseling and Personnel Services and has worked for five years in higher education. Her hobbies include rollerblading, tennis, volleyball, cooking, and traveling.

Wonsang Song graduated from Seoul National University with a B.S. in Computer Engineering in February 1997. He worked as a system engineer and programmer in Korea from 1997-2004, did an M.S. in Computer Science at Columbia from 2004-2006, and started as a Ph.D. student in Fall 2006. Wonsang’s main research interests are multimedia networking, location-based services and distributed systems. He is currently working on the NG911 project with Professor Henning Schulzrinne.

Hang Zhao graduated from the National University of Singapore with a B.S. in Computer Science in 2005. She joined the Computer Science Department at Columbia in Fall 2006 as a Ph.D. student. Hang’s main research interest is network security. She is currently working with Professor Steven Bellovin on the Policy Based Security Management project, collaborating with IBM Research and other universities from the U.S. and the U.K.

Kevin Egan graduated from Brown University in 2003, then worked on the software renderer at Rhythm and Hues Studios in Los Angeles for two years. He started as a Ph.D. student at Columbia in Fall 2006, and is working with Professor Ravi Ramamoorthi’s Computer Graphics group.

Ioannis Giannakakis graduated from the National Technical University of Athens (NTUA) with a Diploma in Electrical and Computer Engineering in July 2006. He started as a Ph.D. student at Columbia University in Fall 2006. Ioannis’s main research interests are Algorithms and Databases. He is currently working with Professor Kenneth Ross’s group in Database systems.

Dana Glasner graduated from Yeshiva University with a B.A. in Computer Science and Math in May 2006 and started as a Ph.D. student at Columbia in Fall 2006. Dana’s main research interests are Cryptography, Learning Theory, and Complexity Theory. She is currently working with Professor Tal Malkin and Professor Rocco Servedio.

Roni Goldenthal is a Computer Science Ph.D. student at the Hebrew University of Jerusalem, where his advisor is Professor Michel Bercovier. His research interests are geometric modeling and physical simulation. Currently he is part of the computer graphics group at Columbia, where he works with Professor Eitan Grinspun on a new algorithm for physical simulation of cloth.

David Harmon graduated from Wofford College with a B.S. in Computer Science and Mathematics in May 2005. After he was awarded a National Science Foundation Graduate Fellowship, he enrolled in the computer science graduate program at Columbia in the fall of 2005. He is currently working with Professor Eitan Grinspun as part of the Columbia Computer Graphics Group. His research interests include physical simulation, computer animation, and computer graphics.

Steve Henderson started his Ph.D. studies at Columbia in Fall 2006. He is working with Professor Feiner in the Computer Graphics and User Interface Lab. Steve joins Columbia from the faculty of the United States Military Academy at West Point, New York, where he served as an Assistant Professor of Systems Engineering. Steve is an officer in the US Army, and holds an MS in Systems Engineering from the University of Arizona, and a BS in Computer Science from the United States Military Academy. His research interests are mobile augmented reality, information visualization, intelligent systems, social networks, large-scale database design, and optimization.

Tie Hu is a postdoc in the Robotics Lab working with Professor Peter Allen. He received a Ph.D. in Mechanical Engineering from Drexel University in 2006, and worked as an automation engineer in China from 1999 to 2001. His research interests are medical robotics, haptics, soft tissue modeling, electro-mechanical system design and control, and instrumentation design. He is currently working on the project ‘In vivo imaging system for minimally invasive surgery’ in the Robotics Lab.

Bert C. Huang graduated from Brandeis University with a B.S. in Computer Science and a B.A. in Philosophy in 2004. He completed an M.S. at Columbia in February 2006. Bert’s research interests are in the field of Machine Learning. Currently, he is working in the Center for Computational Learning Systems and in the Machine Learning Laboratory.

Maritza Johnson graduated from the University of San Diego with a B.A. in Computer Science in May of 2005. Her research interests are user-centered design, human-computer interaction, and security. She is currently working with Professor Steven Bellovin on designing a method of evaluating a financial institution’s ability to authenticate itself to users in an online environment. In the past she has been active in the women-in-computer-science community and plans to continue doing so while at Columbia.

Aly A. Khan graduated from Carnegie Mellon with an M.S. in Computational Biology in May 2006. He completed his M.S. thesis under Professor Russell Schwartz on modeling a novel biological mechanism for splicing of very large introns in DNA. He was a Howard Hughes Medical Institute undergraduate fellow while working in the lab of Professor Chris Bailey-Kellogg at Purdue. He is interested in biological networks and is currently working with Dr. Christina Leslie and Professor Chris Wiggins, in collaboration with Dr. Chris Sander’s lab at Memorial Sloan-Kettering, on studying microRNA regulation.

Jong Yul Kim came to Columbia University from South Korea after graduating from Seoul National University with a B.S. in Computer Science and Engineering in February 2005. He finished his M.S. in Computer Science in May 2006, and started his Ph.D. program in Fall 2006. Since the summer of 2005, Jong Yul has been working with Professor Schulzrinne on the Next Generation 9-1-1 project.

Jae Woo Lee graduated from Columbia College with a B.A. in Physics in 1994. After working for a number of financial software/service companies, he co-founded MyRisk.com in 1999, which was acquired by DirectAdvice.com in 2001. He came back to Columbia in 2004, received an M.S. in computer science in 2006, and started as a Ph.D. student in Fall 2006 under Professor Henning


Professor Eitan Grinspun received one of three “best paper” awards at EUROGRAPHICS 2006 in early September. EUROGRAPHICS is the European cousin of SIGGRAPH and one of the most important computer graphics conferences. The paper, titled “Computing discrete shape operators on general meshes,” was co-authored by two NYU students, Yotam Gingold and Jason Reisman, and Prof. Denis Zorin (NYU). EUROGRAPHICS considered 246 submitted papers, and accepted 42.

Professor Eitan Grinspun was awarded an NSF grant together with Professors Karamcheti and Zorin at NYU. The three-year project will develop parallel architectures for interactive scientific computing, with a focus on engineering and biomedical design-space exploration. Specific points of focus include (1) higher-level user control of the overall study (as opposed to individual experiments); (2) reuse of data from prior experiments in carrying forward new computations; (3) dynamic management of system resources by relying on a tighter coupling between application and system software; and (4) software reuse based on common component architecture (CCA) compliance and standardization of a more permeable system-/solver-level interface. The architecture will be evaluated on real-world biomedical applications, with a specific focus on natural incorporation of existing simulation, solver, and domain-specific codes.

Computer Science Ph.D. student Alex Haubold won the Best Poster Award at this year’s IEEE International Conference on Multimedia and Expo (ICME ’06). This annual conference, one of the largest and most significant in the area of multimedia, featured over 270 posters. Alex, who is Professor John Kender’s student, won for his paper reporting on research he did as part of his IBM internship last summer: “Semantic Multimedia Retrieval using Lexical Query Expansion and Model-Based Reranking”.

Professor Julia Hirschberg, along with Martin Jansche (of the Columbia Center for Computational Learning Systems) and colleagues at UIUC, received an NSF grant to improve the tutoring of the rhythmic and intonational aspect of language (prosody) for those learning Chinese and English as a second language. Prosody is an integral part of human communication, but one that second-language learners rarely learn. Topic shifts, contrastive focus, and even simple question/statement distinctions cannot be recognized or produced in many languages without an understanding of their prosody. However, ‘translating’ between the prosody of one language and that of another is a little-studied phenomenon. This research addresses the ‘prosody translation’ problem for Mandarin Chinese and English second-language learners by identifying correspondences between prosodic phenomena in each language that convey similar meanings.

Professor Julia Hirschberg was featured in an article in the International Herald Tribune on text-to-speech technology.

Professors Julia Hirschberg and Jason Nieh, along with Professor Cliff Stein (joint with IEOR), received the prestigious IBM Faculty Award, reflecting their technical achievements and close interaction with IBM researchers.

Professor Tony Jebara received his third KDD grant, titled “Learning to Match People, Multimedia and Graphs via Permutation”. This proposal undertakes a novel research direction that explores matching and permutation within statistical learning. These research tools have applications in national security as a way to identify and match people from text and multimedia and discover links between them. More specifically, this proposal addresses the following key application areas:

– Matching authors: permutational clustering methods and permutationally invariant kernels are used to compute the likelihood that the same person wrote a given publication or text.
– Matching text and multimedia documents: permutational algorithms and permutationally invariant kernels to perform text, image and word matchings of descriptions of people to known individuals in a database.
– Matching social networks and graphs: social network matching tools from permutational algorithms which find a subnetwork in a larger network that has a desired topology.

Professors Gail Kaiser, Angelos Keromytis and Sal Stolfo won a National Science Foundation (NSF) four-year grant as part of the CyberTrust program to study collaborative self-healing systems. Collaborative Self-healing Systems (COSS) is a new paradigm for protecting software systems. Software monocultures are widely used applications that share common vulnerabilities. Hence, any attack that exploits one instance of a vulnerable application provides the means for widespread damage. The emerging concept of collaborative security, wherein independent but cooperative entities form a group to improve their individual security, provides the opportunity to exploit the homogeneity of a software monoculture for collective and mutual protection.

Professor John Kender, together with Shih-Fu Chang of Columbia EE and Tom Huang of UIUC, is a co-PI on a three-year grant involving ten senior researchers, headed by PI John Smith of IBM and funded out of the Disruptive Technologies Office as part of their Video Analysis and Content Extraction (VACE) program. The project pursues research on statistical modeling techniques that will characterize video contents in large semantic spaces, using open source international news broadcasts. It emphasizes cross-domain and cross-cultural scalability, faster-than-real-time performance, and the exploitation of the temporal evolutionary aspects of video contents. It will build a retrieval workbench with video mining, topic tracking, and cross-linking capabilities, along with other video understanding services.

Professor Angelos Keromytis chaired the security tracks for the 2007 World Wide Web conference (WWW) and for the 2007 International Conference on Distributed Computing Systems (ICDCS).

Ph.D. students Homin Lee and Andrew Wan received the Mark Fulk Best Student Paper award at the 19th Annual Conference on Learning Theory (COLT 2006), held in Pittsburgh, PA, in July. The award is for their paper titled “DNF are Teachable in the Average Case,” which is joint work with Professor Rocco Servedio. COLT is the top conference in computational learning theory, with more than 100 papers submitted per year for the last several years.

Ph.D. student Jackson Liscombe won one of three “best student paper” awards at the INTERSPEECH 2006 conference. The paper was co-authored by postdoctoral researcher Jennifer Venditti and Professor Julia Hirschberg. The paper was titled “Detecting Question-Bearing Turns in Spoken Tutorial Dialogues.” INTERSPEECH is the annual conference of the International Speech Communication Association (http://www.isca-speech.org), which has about 1500 members. The conference is held annually, this year in Pittsburgh. There are usually about 1100 attendees and approximately 1000 submissions. ISCA is one of the major speech science and technology organizations internationally.

Professors Vishal Misra (Computer Science), Dan Rubenstein (Electrical Engineering and CS) and Ed Coffman (EE and CS), together with Professor Predrag Jelenkovic and Professor Mor Harchol-Balter (of CMU) won a

Department News & Awards

Alp Atici (Math) received the E. M. Gold Award for ALT 2006 (the 17th International Conference on Algorithmic Learning Theory, held in Barcelona in October). This award is given to the best student paper at the conference; the award is for the paper “Learning Unions of $\omega(1)$-Dimensional Rectangles” which is co-authored by Professor Rocco Servedio.

Professors Steven Bellovin and Henning Schulzrinne will participate in the International Technology Alliance. The alliance will perform research in the four areas of network theory, security across system of systems, sensor information processing and delivery, and distributed coalition planning and decision making.

Professors Steven Bellovin and Sal Stolfo, together with Professor Sean Smith of Dartmouth, received a grant from ARO to organize and run a Workshop on Insider Threat Research. The workshop will be held in Washington, DC in June 2007.

Professor Steven Bellovin was interviewed by Robert Siegel on NPR’s popular program “All Things Considered” in May. Steven discussed the National Security Agency’s efforts to analyze the huge databases of domestic phone calls turned over to the NSA by phone companies.

Professor Steven Bellovin was interviewed for an NY1 TV article on security and terrorism.

Matthew Burnside, Jae Woo Lee, Christian Murphy and Joshua Reich were named “extraordinary TAs” for the Fall 2005 semester, based on the evaluation of students in their classes. Congratulations to our outstanding TAs!

Professor Luca Carloni, along with Professor Ken Shepard of the Electrical Engineering department, was awarded a three-year NSF grant to study the design of low-power scalable communication networks for multi-core systems-on-chip. The project is funded by the NSF Foundations of Computing Processes and Artifacts (CPA) Cluster; in 2005 the NSF CPA cluster received 532 proposals and funded approximately 10% of them.

Professor Luca Carloni has become a member of the MARCO-sponsored, multi-institution Gigascale System Research Center (GSRC) where he joins the “Core Design Technology for Complex Heterogeneous Systems” theme. The Microelectronics Advanced Research Corporation (MARCO) funds and operates university-based research centers in microelectronics technology. Its charter initiative, the Focus Center Research Program (FCRP), is designed to expand pre-competitive, cooperative, long-range applied microelectronics research at U.S. universities. Each Focus Center targets research in a particular area of expertise. The GSRC Focus Center brings together 41 faculty from 17 American universities to focus on pertinent problems the semiconductor industry faces in the next decade in the areas of system design, integration, test, and verification. Professor Carloni will work on the development of correct-by-construction methodologies for the integrated design and validation of robust gigascale systems. The ultimate goal of this project is to assist engineering teams coping with the complexity of gigascale system design by addressing two critical challenges: how to reduce the design productivity gap and how to handle the many sources of design variability introduced by nanometer technology processes.

Cristian Soviani, Olivier Tardieu, and Professor Stephen Edwards won a best paper award at DATE (Design Automation and Test in Europe), one of the major design automation conferences. The conference had a 28% acceptance rate; only two such awards were given out for the 800+ submissions they had this year.

Professor Stephen Edwards was awarded an NSF grant titled “SHIM: Developing Embedded Systems with Deterministic Concurrency”. The SHIM model of computation provides deterministic concurrency with reliable communication, simplifying validation because behavior is reproducible. Based on asynchronous concurrent processes that communicate through rendezvous channels, SHIM can handle control, multi- and variable-rate dataflow, and data-dependent decisions.

Members of the department’s computer graphics group were extremely successful at the SIGGRAPH 2006 conference, having 10 papers accepted, the most from any single institution in the last five years. SIGGRAPH is the most prestigious conference for computer graphics; the 2006 conference took place in Boston, Massachusetts in August 2006. A total of 86 papers were accepted from 474 submissions. Authors from Columbia University include Prof. Ravi Ramamoorthi, Prof. Shree Nayar, Prof. Eitan Grinspun, and Prof. Peter Belhumeur, along with their graduate students. More information about the Columbia Vision + Graphics Center can be found at http://www.cs.columbia.edu/cvgc/

The paper “To Search or to Crawl? Towards a Query Optimizer for Text-Centric Tasks,” by Panos Ipeirotis, Eugene Agichtein, Pranay Jain, and Professor Luis Gravano, received the “Best Paper” Award at the SIGMOD 2006 Conference. SIGMOD is one of the two premier database conferences and is highly selective; the acceptance rate for SIGMOD 2006 was lower than 15%.


After a busy year in classrooms and labs at Columbia, CUCS students branched out to a wide range of interesting jobs and internships over the summer. Here are brief snapshots of what a few computer science students did with their summers away from campus.

Student News

An image from the intelligent movie director, courtesy Institute for Creative Technologies.

“What I did with my summer vacation”

highly competitive NSF grant to study resource sharing and allocation on large server farms. A wide variety of systems, including web farms, virtual machines, multi-tasking OSes, GRID computing systems, and sensor networks improve their accessibility, availability, resilience and fairness by ‘sharing’ resources across the consumers they support. However, research that explores how to share resources generally derives point solutions, where different resource/consumer configurations require separately designed sharing mechanisms. This project seeks to develop and analyze Adaptive Sharing Mechanisms (ASMs) in which the mechanism used to share resources adapts dynamically to both the set of available resources and the current needs of the consumers, such that the system is truly autonomic. The grant extends over three years and is part of the NSF Computer Systems Research (CSR) program. Only approximately 10% of all grant applications were funded.

Professors Vishal Misra and Dan Rubenstein won a three-year NSF CyberTrust grant on network control plane security. Their proposal seeks to further development of a methodology for measuring the inherent security of the control plane component of existing and future routing protocols. The approach has a significant theoretical component: it involves looking at general classes of routing protocols and showing how they can be analyzed for their ability to monitor themselves.

Professor Shree Nayar’s work was profiled in an August 2006 IEEE Computer Magazine article (cover feature) on Computational Photography.

Professor Kenneth Ross was awarded an NSF grant to study the design of database systems software on modern multicore and multithreaded processors. This project, titled “Cache-Aware Database Systems on Modern Multithreading Processors”, studies how to best utilize the resources available in modern processors in the development of database system software. A primary objective is avoiding cache interference between threads in multithreaded and multi-core processors, so that performance scales well as the number of cores/threads increases. A variety of techniques are considered, including multi-threaded algorithm design, threads explicitly devoted to resource management, and scheduling algorithms that are aware of thread interference patterns. Simulations and implementations on real hardware are used to measure the effectiveness of each approach. Project-related information can be found at http://www.cs.columbia.edu/~kar/fastqueryproj.html. This project was one of only eleven funded in the Database Management Systems program in 2006 and lasts through August 2008.

Professor Henning Schulzrinne and Professor Ram Dantu (U. North Texas) have received a two-year grant from the National Science Foundation, as part of the CyberTrust program, to study methods to prevent unsolicited calls in VoIP systems. The research will focus on using trust paths to determine whether unknown callers are likely to be telemarketers or other spammers. Trust paths capture transitive trust in a friend-of-a-friend model, with trust established by having a person send email or call another person. Such trust paths are suitable for low-risk decisions, such as whether to accept an email or phone call, rather than high-risk decisions such as whether to loan money or reveal private information.
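The friend-of-a-friend idea can be illustrated with a small sketch. It is not the project's actual method: the contact graph, the names, and the `max_hops` cutoff are all hypothetical, and the only assumption taken from the description above is that trust is established when one person emails or calls another.

```python
from collections import deque

def trust_path_length(contacts, me, caller, max_hops=3):
    """Return the number of hops on the shortest trust path from me to
    caller, or None if no path exists within max_hops (a hypothetical cutoff).

    contacts maps each person to the set of people they have emailed or called.
    """
    if me == caller:
        return 0
    seen = {me}
    queue = deque([(me, 0)])
    while queue:  # breadth-first search finds the shortest path first
        person, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not extend paths beyond the cutoff
        for friend in contacts.get(person, ()):
            if friend == caller:
                return hops + 1
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, hops + 1))
    return None

# Hypothetical contact graph: alice has called bob, bob has emailed carol, etc.
contacts = {
    "alice": {"bob"},
    "bob": {"carol"},
    "carol": {"dave"},
}
```

Under this sketch, a call from someone with a short trust path might be accepted, while a caller with no path at all (a result of `None`) might be treated as a likely telemarketer.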

Professors Sal Stolfo and Angelos Keromytis were awarded a DARPA grant titled “Behavior-based Access Control and Communication in MANETs”. Through this grant, they will develop a new, behavior-based mechanism for authenticating and authorizing new nodes in wireless MANETs. Rather than only granting access to a network, or to services on a network, by means of an authenticated identity or a qualified role, they propose to require nodes to also exchange a model of their behavior to grant access and to assess the legitimacy of their subsequent communication. When a node requests access, it provides its pre-computed egress behavior model to another node that may grant it access to some service. The receiver compares the requestor’s egress model to its own ingress model to determine whether the new device conforms to its expected behavior. Access rights are thus granted or denied based upon the level of agreement between the two models, and the level of risk the recipient is willing to manage. The second use of the exchanged models is to validate active communication after access has been granted. As a result, MANET nodes will have greater confidence that a new node is not malicious; if an already admitted node starts misbehaving, other MANET nodes will quickly detect and evict it.

Professors Joseph Traub and Henryk Wozniakowski were awarded a three-year grant from the NSF titled “Quantum and Classical Complexity of Multivariate Problems.” This marks 36 years in which NSF has funded every proposal Professor Traub has submitted; this may be a record for the field of Computer Science. Professors Traub and Wozniakowski also received additional funds from DARPA’s QuIST (Quantum Information Science and Technology) even though that program has ended.

Department News & Awards (continued)

Marc Eaddy spent his summer as an intern at Microsoft Research in Redmond, WA. He worked with Manuel Fadrich on an important problem for Microsoft: porting software libraries, like .NET, to different platforms such as watches, cell phones, PDAs, and non-Windows operating systems. He developed automated pruning algorithms that tailor the library to the target platform by removing extraneous and unsupported features. In the process, he learned a lot about software dependencies, graph theory, and Open Classes. He took advantage of Seattle’s gorgeous summer weather (no joke!) to enjoy tennis, volleyball, Bill Gates’s lawn, and a cruise sponsored by MSR.

Ph.D. student David Elson spent the summer at the Institute for Creative Technologies, a center for virtual reality and computer simulation research at the University of Southern California. At ICT, David worked under Dr. Mark Riedl and Julia Kim, who are experts in his research area of computational narrative reasoning. As part of a larger project aiming to automate the creation of animated narrative cinema for case-based leadership training, David built a “robotic movie director” capable of realizing movie scripts as aesthetically coherent films by intelligently blocking virtual actors and placing cameras in a 3D world.

Joseph Kaptur (SEAS ‘08) spent his summer in Denver, Colorado. In addition to camping and hiking, he worked for a company named Decisioneering, which makes Crystal Ball, an add-in for Microsoft Excel that allows a user to turn any spreadsheet into a Monte Carlo simulation. Joseph spent most of his time there writing code that would allow any user of Microsoft Project (a scheduling tool for creating project plans, Gantt charts, and resource schedules) to use the power of Monte Carlo simulation to apply probability distributions to task durations, and be able to say precisely how likely a project is to be on time, two weeks late, etc. The tool he built is now in use at Fortune 500 companies and is being marketed to the Navy.
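For readers curious how Monte Carlo simulation answers "how likely is this project to be on time?", here is a minimal sketch. It does not reflect Crystal Ball or Microsoft Project internals; the task estimates, the triangular distribution, and the assumption that tasks run strictly one after another are all illustrative choices.

```python
import random

def simulate_project(tasks, trials=10_000, seed=0):
    """Return the sorted total durations (in days) of many simulated runs.

    tasks is a list of (optimistic, most_likely, pessimistic) estimates for
    tasks assumed to be performed one after another (a hypothetical plan).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    totals = []
    for _ in range(trials):
        # random.triangular takes its arguments as (low, high, mode)
        total = sum(rng.triangular(low, high, mode) for low, mode, high in tasks)
        totals.append(total)
    totals.sort()
    return totals

def probability_done_by(totals, deadline):
    """Fraction of simulated runs that finish on or before the deadline."""
    return sum(1 for t in totals if t <= deadline) / len(totals)

# Hypothetical three-task plan with an 18-day deadline.
tasks = [(3, 5, 10), (2, 4, 8), (5, 7, 14)]
totals = simulate_project(tasks)
on_time = probability_done_by(totals, 18)
within_two_weeks_late = probability_done_by(totals, 18 + 14)
```

Each trial draws one possible duration per task and adds them up; the share of trials that meet a deadline is the estimated probability of finishing by then.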


Regina Barzilay (Ph.D. ‘02), now an assistant professor of electrical engineering and computer science at Massachusetts Institute of Technology, was one of five recipients of the highly prestigious Microsoft Research New Faculty Fellowships. Because new faculty members are essential to the future of academic computing, Microsoft Research honors early-career professors who demonstrate the drive and creativity to develop original research while continually advancing the state of the art of computing. Regina is focusing her research on computational modeling of linguistic phenomena. She is exploring the ability of a computer to summarize information found in multiple documents that contain related information, such as news articles covering the same event. This will help readers find meaning in the ever-increasing body of information available today. Regina graduated from Columbia University in 2002, where she was advised by Professor Kathy McKeown.

Debra Cook (Ph.D. ‘06) finished her dissertation under Prof. Angelos Keromytis. She has taken a position with Bell Labs.

German Creamer (Ph.D. ‘06) finished his dissertation under Prof. Sal Stolfo. He is working in a management position at American Express Corporation.

Min-Yen Kan (SEAS B.S. ‘96, Ph.D. ‘02) married Alicia Ang in a civil ceremony at the Raffles Hotel in Singapore on December 23, 2005. Min is currently an assistant professor at the National University of Singapore, in the Department of Computer Science, researching digital libraries.

Rui Kuang (Ph.D. ‘06) finished his dissertation under Dr. Christina Leslie of the Center for Computational Learning Systems. He is now an Assistant Professor at the University of Minnesota.

Jonathan Liu (CC ‘04) writes: ‘Currently I am working at PricewaterhouseCoopers doing consulting work. My work is primarily accounting-based consulting and usually involves a technology aspect. Still keeping in touch with Jonathan Tse (CS, SEAS ‘04), Tim Soohoo (CS, SEAS ‘04), and Jeff Eng (CS, SEAS ‘04)!’

Blair MacIntyre (Ph.D. ‘99) was awarded tenure in the College of Computing at Georgia Tech.

Smaranda Muresan (Ph.D. ‘06) finished her dissertation on ‘Learning Constraint-based Grammars from Representative Examples.’ She was advised by Professors Judith Klavans and Kathy McKeown and Dr. Owen Rambow. Smaranda is now working as a postdoc with Phil Resnik at the University of Maryland.

Ani Nenkova (Ph.D. ‘05) started a tenure-track faculty position at the University of Pennsylvania.

Dragomir R. Radev (Ph.D. ‘99) has received tenure at the University of Michigan, where he is now an Associate Professor of Information and Computer Science and Engineering. He works on text mining with applications in areas like computer science, bioinformatics, and political science. He recently shared the 2006 Gosnell Prize for Excellence in Political Methodology; the prize is awarded for the best work in political methodology presented at any political science conference during the preceding year. He was elected as secretary of the ACL, the international organization for Computational Linguistics and Natural Language Processing, and will be the program chair for the upcoming US Olympiad in Computational Linguistics. He is on leave in the 2006-7 academic year and is spending some of that time at Columbia.

Jonathan Rosenberg (Ph.D. ‘01) was named a Cisco Fellow. The Cisco Fellow program was created to honor Cisco’s most distinguished engineers; the title of ‘Cisco Fellow’ is the company’s highest honor for engineering excellence. Only ten of Cisco’s nearly 40,000 employees worldwide have been named Cisco Fellows.

Ke Wang (Ph.D. ‘06) finished her dissertation under Prof. Sal Stolfo in 2006. She is working in the infrastructure and security group at Google on the West Coast.

Alumni News

Recent & Upcoming PhD Defenses

Debra Cook
Advisor: Angelos Keromytis
Elastic Block Ciphers

Abstract: Standard block ciphers are designed around one or a small number of block sizes. From both a practical and a theoretical perspective, the question of how to efficiently support a range of block sizes is of interest. In applications, the length of the data to be encrypted is often not a multiple of the supported block size. This results in the use of plaintext-padding schemes that impose computational and space overheads. Furthermore, a variable-length block cipher ideally provides a variable-length pseudorandom permutation and strong pseudorandom permutation, which are theoretical counterparts of practical block ciphers and correspond to ideal properties for a block cipher.

The focus of our research is the design and analysis of a method for creating variable-length block ciphers from existing fixed-length block ciphers. At the heart of the method, we introduce the concept of an elastic block cipher, which refers to stretching the supported block size of a block cipher to any length up to twice the original block size while incurring a computational workload that is proportional to the block size. We create a structure, referred to as the elastic network, that uses the round function from any existing block cipher in a manner that allows the properties of the round function to be maintained and results in the security of the elastic version of a block cipher being directly related to that of the original version. By forming a reduction between the elastic and original versions, we prove that the elastic version of a cipher is secure against round-key recovery attacks if the original cipher is secure against such attacks. We illustrate the method by creating elastic versions of four existing block ciphers. In addition, the elastic network provides a new primitive structure for use in symmetric-key cipher design. It allows for the creation of variable-length pseudorandom permutations and strong pseudorandom permutations in the range of b to 2b bits from round functions that are independently chosen pseudorandom permutations on b bits.
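The padding overhead that motivates this work can be illustrated with a little arithmetic. The Python sketch below is not taken from the dissertation: the 16-byte block size, the function names, and the idea of letting only the final block stretch are assumptions made for illustration. It compares the number of bytes processed when data is padded up to a multiple of the block size with the number processed when the last block may take any length from b up to 2b.

```python
def padded_length(n_bytes, block=16):
    """Bytes processed when data is padded up to the next multiple of `block`.

    Assumes a scheme that adds no padding when the data is already aligned.
    """
    return -(-n_bytes // block) * block  # ceiling division, then scale


def elastic_length(n_bytes, block=16):
    """Bytes processed if the final block can stretch (hypothetical usage).

    Assumption: data of at least one block is split into full blocks plus one
    final block whose length lies between `block` and 2 * `block`, so no
    padding is needed. Shorter data is still padded to one block.
    """
    return max(n_bytes, block)


def padding_overhead(n_bytes, block=16):
    """Extra bytes that padding adds relative to the stretched-block case."""
    return padded_length(n_bytes, block) - elastic_length(n_bytes, block)
```

For a 20-byte message and a 16-byte block, padding processes 32 bytes, while a single stretched block would process 20.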

German Creamer
Advisors: Sal Stolfo and Yoav Freund
Using Boosting for Automated Trading and Planning

Abstract: The problem: Much of finance theory is based on the efficient market hypothesis. According to this hypothesis, the prices of financial assets, such as stocks, incorporate all information that may affect their future performance. However, the translation of publicly available information into predictions of future performance is far from trivial. Making such predictions is the livelihood of stock traders, market analysts, and the like. Clearly, the efficient market hypothesis is only an approximation which ignores the cost of producing accurate predictions. Markets are becoming more efficient and more accessible because of the use of ever faster methods for communicating and analyzing financial data. Algorithms developed in machine learning can be used to automate parts of this translation process. In other words, we can now use machine learning algorithms to analyze vast amounts of information and compile it to predict the performance of companies, stocks, or even market analysts. In financial terms, we would say that such algorithms discover inefficiencies in the current market. These discoveries can be used to make a profit and, in turn, reduce the market inefficiencies or support strategic planning processes.

Relevance: Currently, the major stock exchanges such as NYSE and NASDAQ are transforming their markets into electronic financial markets. Players in these markets must process large amounts of information and make instantaneous investment decisions. Machine learning techniques help investors and corporations recognize new business opportunities or potential corporate problems in these markets. With time, these techniques help the financial market become better regulated and more stable. Also, corporations could save a significant amount of resources if they can automate certain corporate finance functions such as planning and trading.

Results: This dissertation offers a novel approach to using boosting as a predictive and interpretative tool for problems in finance. Furthermore, we demonstrate how boosting can support the automation of strategic planning and trading functions.

Many of the recent bankruptcy scandals in publicly held US companies such as Enron and WorldCom are inextricably linked to the conflict of interest between shareholders (principals) and managers (agents). We evaluate this conflict in the case of Latin American and US companies. In the first part of this dissertation, we use Adaboost to analyze the impact of corporate governance variables on performance. In this respect, we present an algorithm that calculates alternating decision trees (ADTs), ranks variables according to their level of importance, and generates representative ADTs. We develop a board Balanced Scorecard (BSC) based on these representative ADTs, which is part of the process to automate the planning functions.

In the second part of this dissertation we present three main algorithms to improve forecasting and automated trading. First, we introduce a link mining algorithm using a mixture of economic and social network indicators to forecast earnings surprises and cumulative abnormal return. Second, we propose a trading algorithm for short-term technical trading. The algorithm was tested in the context of the Penn-Lehman Automated Trading Project (PLAT) competition using Microsoft stock. The algorithm was profitable during the competition.


using existing protocols. We describe the architecture and implementation of our Session Initiation Protocol (SIP)-based enterprise Internet telephony architecture known as Columbia InterNet Extensible Multimedia Architecture (CINEMA). It consists of a SIP registration and proxy server, a multi-party conferencing server, a gateway for interworking SIP with ITU’s H.323, an interactive voice response system and a multimedia mail server. CINEMA provides a distributed interoperable architecture for collaboration using synchronous communications like multimedia conferencing, instant messaging, shared web-browsing, and asynchronous communications like discussion forum, shared files, voice and video mails. It allows seamless integration with various communication means like telephone, IP phone, web and electronic mail.

We present two techniques for providing scalability and reliability in SIP: server redundancy and a novel peer-to-peer architecture. For the former, we use DNS-based load sharing among multiple distributed servers that use backend SQL databases to maintain user records. Our two-stage architecture scales linearly with the number of servers. For the latter, we propose a peer-to-peer Internet telephony architecture that supports basic user registration and call setup as well as advanced services such as offline message delivery, voice mail and multi-party conferencing using SIP. It interworks with server-based SIP infrastructures.
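One way to picture DNS-based load sharing in a two-stage arrangement is the Python sketch below. It is an illustration under stated assumptions, not the CINEMA implementation: the host names are made up, no real DNS lookup is performed, the first stage is chosen with SRV-style priority and weight, and mapping users to second-stage clusters by hashing the user identifier is an assumption about how the stages could divide the work.

```python
import hashlib
import random

# Hypothetical SRV-style records: (priority, weight, host). Lower priority wins.
FIRST_STAGE = [
    (0, 60, "s1.example.com"),
    (0, 40, "s2.example.com"),
    (1, 100, "backup.example.com"),
]
# Hypothetical second-stage clusters, each with its own user database.
SECOND_STAGE = ["a.example.com", "b.example.com", "c.example.com"]


def pick_first_stage(records, rng):
    """Pick a first-stage server: lowest priority group, weighted at random."""
    lowest = min(priority for priority, _, _ in records)
    candidates = [(w, h) for p, w, h in records if p == lowest]
    hosts = [h for _, h in candidates]
    weights = [w for w, _ in candidates]
    return rng.choices(hosts, weights=weights, k=1)[0]


def pick_second_stage(user, clusters):
    """Map a user identifier to one cluster so its records live in one place."""
    digest = hashlib.sha1(user.lower().encode("utf-8")).digest()
    return clusters[int.from_bytes(digest[:4], "big") % len(clusters)]
```

Because the second-stage choice depends only on the user identifier, any first-stage server forwards a given user's requests to the same cluster, which is what lets capacity grow by adding servers at either stage.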

Alejandro Troccoli
Advisor: Peter Allen
New methods and tools for 3D-modeling of large scale outdoor scenes using range and color images

Abstract: Systems for the creation of photorealistic models using range scans and digital photographs are becoming increasingly popular in a wide range of fields, from reverse engineering to cultural heritage preservation. These systems employ a range finder to acquire the geometry information and a digital camera to measure color detail. But bringing together a set of range scans and color images to produce an accurate and usable model is still an area of research with many unsolved problems.

In this dissertation we present new tools and methods for creating digital models from range and color images, with emphasis on large-scale outdoor scenes. First, we address the problem of range and color image registration. In this area, we introduce a semi-automatic tool for range and color image registration that makes use of line-features to solve for the position and orientation of the digital camera. This allows us to efficiently register images of urban scenery. Secondly, we present a registration technique that uses the shadows cast by the sun as cues to find the correct camera pose, which we have successfully applied in the creation of a digital model of an archaeological excavation in Monte Polizzo, Sicily. We also address the problem of how to build seamless integrated texture maps from images that were taken under different illumination conditions. To achieve this we present two different solutions. The first one is to align all the images to the same illumination. For this, we have developed a technique that computes a relighting operator over the area of overlap of a pair of images, which we then use to relight the entire image. Our proposed method can handle images with shadows and can effectively remove the shadows from the image, if required. The second technique uses the ratio of two images to factor out the diffuse reflectance of an image from its illumination. We achieve this without any light measuring device. By computing the actual reflectance we remove from the images any effects of the illumination, which then allows us to create new renderings under novel illumination conditions.
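The ratio-image idea can be seen in a toy example. The Python sketch below is a simplification, not the method from the dissertation: it assumes a purely diffuse surface, two pixel-aligned grayscale images stored as nested lists, and made-up reflectance and illumination values. Under those assumptions each pixel is reflectance times illumination, so dividing one image by the other cancels the reflectance and leaves only the ratio of the two illuminations.

```python
def ratio_image(img_a, img_b, eps=1e-6):
    """Per-pixel ratio of two aligned images; `eps` guards against zero pixels."""
    return [[a / max(b, eps) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(img_a, img_b)]


def relight(img, ratio):
    """Apply a per-pixel illumination ratio to move an image to the other lighting."""
    return [[p * r for p, r in zip(row_p, row_r)]
            for row_p, row_r in zip(img, ratio)]


def shade(reflectance, light):
    """Toy diffuse image formation: pixel = reflectance * illumination."""
    return [[r * l for r, l in zip(row_r, row_l)]
            for row_r, row_l in zip(reflectance, light)]


# Made-up 2x2 scene: one reflectance map, two lighting conditions.
reflectance = [[0.2, 0.8], [0.5, 0.9]]
light_a = [[1.0, 1.0], [0.5, 0.5]]
light_b = [[0.5, 1.0], [1.0, 0.25]]
image_a = shade(reflectance, light_a)
image_b = shade(reflectance, light_b)
ratio = ratio_image(image_a, image_b)   # depends only on light_a / light_b
relit_b = relight(image_b, ratio)       # image_b moved to lighting A
```

In this toy scene `ratio` equals `light_a / light_b` at every pixel regardless of the reflectance values, and `relit_b` matches `image_a` up to floating-point error.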

Ke Wang
Advisor: Salvatore Stolfo
Network Payload-based Anomaly Detection and Content-based Alert Correlation

Abstract: Every computer on the Internet nowadays is a potential target for a new attack at any moment. The pervasive use of signature-based anti-virus scanners and misuse detection Intrusion Detection Systems has failed to provide adequate protection against a constant barrage of “zero-day” attacks. Such attacks may cause denial-of-service, system crashes, or information theft resulting in the loss of critical information. In this thesis, we consider the problem of detecting these “zero-day” intrusions quickly and accurately upon their very first appearance.

Most current Network Intrusion Detection Systems (NIDS) use simple features, like packet headers and derived statistics describing connections and sessions (packet rates, bytes transferred, etc.) to detect unusual events that indicate a system is likely under attack. These approaches, however, are blind to the content of the packet stream, and in particular, the packet content delivered to a service that contains the data and code that exploits the vulnerable application software. We conjecture that fast and efficient detectors that focus on network packet content anomaly detection will improve defenses and identify zero-day attacks far more accurately than approaches that consider only header information.

We therefore present two payload-based anomaly detectors, PAYL and Anagram, for intrusion detection. They are designed to detect attacks that are otherwise normal connections except that the packets carry bad (anomalous) content indicative of a new exploit. These payload-based anomaly sensors can augment other sensors and enrich the view of network traffic to detect malicious events. Both PAYL and Anagram create models of site-specific normal network application payload as n-grams in a fully automatic, unsupervised and very efficient fashion. PAYL computes, during a training phase, a profile of the byte (1-gram) frequency distribution and its standard deviation for the application payload flowing to a single host and port. PAYL produces a very fine-grained model that is conditioned on payload length. Anagram models high-order n-grams (n>1), which capture the sequential information between bytes. We experimentally demonstrate that both of these sensors are capable of detecting new attacks with high accuracy and low false positive rates. Furthermore, in order to detect the very early onset of a worm attack, we designed an ingress/egress correlation function that is built into the sensors to quickly identify the worms’ initial propagation. The sensors are also designed to generate robust signatures of validated malicious packet content. The technique does not depend upon the detection of probing or scanning behavior or the prevalence of common probe payload, so it is especially useful for the detection of slow and stealthy worms.

An often-cited weakness of anomaly detection systems is that they suffer from “mimicry attack”: clever adversaries may craft attacks that appear normal to an anomaly detector and hence will go unnoticed as a false negative. A mimicry attack against a site defended by a content-based anomaly detector can be executed by an attacker by sniffing the target site’s traffic flow, modeling the byte distributions of that flow, and blending their exploit with “normal” appearing byte padding. To defend against such attacks, we further propose the techniques of randomized modeling and randomized testing. Under randomized modeling/testing, each sensor will randomly partition the payload into several subsequences, each of which is modeled/tested separately, thus building a “model/test diversity”

Third, we present a multi-stock automated trading system that includes a machine learning algorithm that makes the prediction, a weighting algorithm that combines the experts, and a risk management layer that selects only the strongest prediction and avoids trading when there is a history of negative performance. This algorithm was tested with 100 randomly selected S&P 500 stocks. We find that even an efficient learning algorithm, such as boosting, still requires powerful control mechanisms in order to reduce unnecessary and unprofitable trades that increase transaction costs.
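For readers unfamiliar with boosting, here is a minimal Python sketch of AdaBoost with one-feature threshold rules ("decision stumps") as weak learners. It is a textbook-style illustration, not the dissertation's algorithms: alternating decision trees, the expert weighting scheme, and the risk management layer are not reproduced, and the toy feature values and labels are invented rather than financial data.

```python
import math


def stump_predict(row, feature, threshold, polarity):
    """Weak learner: vote `polarity` if the feature exceeds the threshold."""
    return polarity if row[feature] > threshold else -polarity


def train_stump(X, y, w):
    """Return (weighted error, feature, threshold, polarity) of the best stump."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, feature, threshold, polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, feature, threshold, polarity)
    return best


def adaboost(X, y, rounds=5):
    """Train a weighted committee of stumps; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, feature, threshold, polarity = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)      # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)    # vote strength of this stump
        model.append((alpha, feature, threshold, polarity))
        # Raise the weight of misclassified examples, lower the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, feature, threshold, polarity))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, row):
    """Sign of the weighted vote of all stumps."""
    score = sum(alpha * stump_predict(row, f, t, p) for alpha, f, t, p in model)
    return 1 if score >= 0 else -1


# Invented toy data: two indicator columns, label +1 or -1.
X = [[0.1, 5.0], [0.4, 3.0], [0.35, 4.0], [0.8, 1.0], [0.9, 2.0], [0.7, 0.5]]
y = [-1, -1, -1, 1, 1, 1]
model = adaboost(X, y, rounds=5)
```

Each round concentrates weight on the examples the committee still gets wrong, which is also why the per-stump `alpha` values give a rough ranking of which rules matter most.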

Rui Kuang
Advisor: Christina Leslie
Inferring Protein Structure with Discriminative Learning and Network Diffusion

Abstract: As the complete genomes of more and more species become available, the large scale study of protein structure is becoming increasingly important. Given the difficulty of experimental determination of protein structure with X-Ray crystallography or nuclear magnetic resonance (NMR), much effort over the past decade has been devoted to the development of new machine learning approaches for tackling the challenging problem of protein structure prediction. In this thesis, we focus on applying several advanced learning techniques, including kernel methods and network diffusion algorithms, to address four fundamental learning problems in protein structure inference:

• Classification of a new protein into its structural class. To detect subtle sequence similarities between remote homologs, we propose inexact matching string kernels and profile-based string kernels for use with the support vector machine (SVM). We also use the kernel-based SVM classifier to extract discriminative sequence motifs, which can correspond to meaningful structural features in the protein data.

• Protein database search for homology detection. We propose a new network diffusion algorithm, called MotifProp, based on a protein-motif network to detect more subtle sequence similarities than a pairwise comparison method such as PSI-BLAST. Furthermore, top-ranked motifs and motif-rich regions induced by the propagation are also helpful for discovering conserved structural components in remote homologs.

• Prediction of dihedral torsion angles of protein backbone. We apply kernel methods to accurately predict protein backbone torsion angles, which helps to substantially improve the modeling of local structures of protein sequence segments, especially the loop conformations, which do not form regular structures.

• Segmentation of multi-domain protein. We predict protein domain labels and boundaries simultaneously by solving an optimization problem, in which we use trained SVM domain recognizers to find the optimal segmentation of a protein giving the largest sum of classification scores.

Our work provides effective and efficient solutions to these four fundamental problems in protein structure inference, validated with large scale experiments. The proposed new techniques, achieving high prediction performance and capable of handling huge datasets, will become increasingly important in this post-genome era.
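To make "inexact matching string kernel" concrete, the Python sketch below computes a brute-force k-mer kernel that tolerates one mismatch per k-mer. It is a simplified illustration rather than the kernels developed in the thesis: the function names, the choice of k = 3, and the one-mismatch limit are assumptions, and no normalization or profile information is used.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def neighbors(kmer, alphabet):
    """All strings that differ from `kmer` in at most one position."""
    out = {kmer}
    for i, original in enumerate(kmer):
        for letter in alphabet:
            if letter != original:
                out.add(kmer[:i] + letter + kmer[i + 1:])
    return out


def feature_map(seq, k, alphabet, allow_mismatch):
    """Count, for every k-mer feature, how many k-mers of `seq` fall near it."""
    feats = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        hits = neighbors(kmer, alphabet) if allow_mismatch else {kmer}
        for feature in hits:
            feats[feature] += 1
    return feats


def string_kernel(s, t, k=3, alphabet=AMINO_ACIDS, allow_mismatch=True):
    """Inner product of the two sequences' k-mer feature vectors."""
    fs = feature_map(s, k, alphabet, allow_mismatch)
    ft = feature_map(t, k, alphabet, allow_mismatch)
    return sum(count * ft[feature] for feature, count in fs.items())
```

With exact matching, "ACD" and "ACE" share no 3-mer and the kernel is 0; allowing one mismatch gives 20, because every string of the form "AC?" lies within one substitution of both. A value like this can be handed to an SVM as a precomputed similarity.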

Smaranda Muresan
Advisors: Owen Rambow and Judith Klavans
Learning Constraint-based Grammars from Representative Examples

Abstract: Computationally efficient models for natural language understanding can have a wide variety of applications, from text mining and question answering to natural language interfaces to databases. Constraint-based grammar formalisms have been widely used for deep language understanding. Yet, one serious obstacle for their use in real world applications is that these formalisms have overlooked an important requirement: learnability. Currently, there is a poor match between these grammar formalisms and existing learning methods.

This dissertation defines a new type of constraint-based grammar, Lexicalized Well-Founded Grammars (LWFGs), which allow deep language understanding and are learnable. These grammars model both syntax and semantics and have constraints at the rule level for semantic composition and semantic interpretation. The interpretation constraints allow access to meaning during language processing. They establish links between linguistic expressions and the entities they refer to in the real world. We use an ontology-based interpretation, proposing a semantic representation that can be conceived as an ontology query language. This representation is sufficiently expressive to represent many aspects of language and yet sufficiently restrictive to support learning and tractable inferences.

In this thesis, we propose a new relational learning model for LWFG induction. The learner is presented with a small set of positive representative examples, which consist of utterances paired with their semantic representations. We have proved that the search space for grammar induction is a complete grammar lattice, which allows the construction and generalization of the hypotheses and guarantees the uniqueness of the solution, regardless of the order of learning. We have proved a learnability theorem and have provided polynomial algorithms for LWFG induction, proving their soundness. The learnability theorem extends significantly the class of problems learnable by Inductive Logic Programming methods.

In this dissertation, we have implemented a system that represents an experimental platform for all the theoretical algorithms. The system has the practical advantage of implementing sound grammar revision and grammar merging, which allow an incremental coverage of natural language fragments. We have provided qualitative evaluations that cover the following issues: coverage of diverse and complex linguistic phenomena; terminological knowledge acquisition from natural language definitions; and question answering, where the questions can be vague or precise, while the answer is always precise at the concept level.

Kundan Singh
Advisor: Henning Schulzrinne
Reliable, Scalable and Interoperable Internet Telephony

Abstract: The public switched telephone network (PSTN) provides ubiquitous availability and very high scalability of more than a million busy hour call attempts per switch. If large carriers are to adopt Internet telephony, then Internet telephony servers should offer at least similar quantifiable guarantees for scalability and reliability using metrics such as call setup latency, server call handling capacity, busy hour call arrivals, mean-time between failures and mean-time to recover. This thesis presents a reliable, scalable and interoperable Internet telephony architecture for user registration, call routing, conferencing and unified messaging using commodity hardware. The results extend beyond Internet telephony to encompass multimedia communication in general.

The architecture presented in this thesis deals with two aspects: at least PSTN-grade reliability and scalability of the Internet telephony servers, and interoperable Internet telephony services such as conferencing and voice mail


2006 Technical Report Series

The following technical reports were published through November 2006. All 2006 reports are available at http://www.cs.columbia.edu/research/publications

CUCS-001-06
Converting from Spherical to Parabolic Coordinates
Aner Ben-Artzi

CUCS-002-06
Binary-level Function Profiling for Intrusion Detection and Smart Error Virtualization
Michael Locasto, Angelos Keromytis

CUCS-003-06
W3Bcrypt: Encryption as a Stylesheet
Angelos Stavrou, Michael Locasto, Angelos D. Keromytis

CUCS-004-06
Grouped Distributed Queues: Distributed Queue, Proportional Share Multiprocessor Scheduling
Bogdan Caprita, Jason Nieh, Clifford Stein

CUCS-005-06
Theoretical Bounds on Control-Plane Self-Monitoring in Routing Protocols
Raj Kumar Rajendran, Vishal Misra, Dan Rubenstein

CUCS-007-06
Using an External DHT as a SIP Location Service
Kundan Singh, Henning Schulzrinne

CUCS-008-06
Rigid Formations with Leader-Follower Architecture
Tolga Eren, Walter Whiteley, Peter N. Belhumeur

CUCS-009-06
A Runtime Adaptation Framework for Native C and Bytecode Applications
Rean Griffith, Gail Kaiser

CUCS-010-06
Evaluating an Evaluation Method: The Pyramid Method Applied to 2003 Document Understanding Conference (DUC) Data
Rebecca Passonneau

CUCS-011-06
Passive Duplicate Address Detection for Dynamic Host Configuration Protocol (DHCP)
Andrea G. Forte, Sangho Shin, Henning Schulzrinne

CUCS-012-06
A Theory of Spherical Harmonic Identities for BRDF/Lighting Transfer and Image Consistency
Dhruv Mahajan, Ravi Ramamoorthi, Brian Curless

CUCS-013-06
Using Angle of Arrival (Bearing) Information in Network Localization
Tolga Eren, Walter Whiteley, Peter N. Belhumeur

CUCS-014-06
PalProtect: A Collaborative Security Approach to Comment Spam
Benny Wong, Michael Locasto, Angelos D. Keromytis

CUCS-015-06
Streak Seeding Automation Using Silicon Tools
Atanas Georgiev, Sergey Vorobiev, William Edstrom, Ting Song, Andrew Laine, John Hunt, Peter Allen

CUCS-016-06
Bloodhound: Searching Out Malicious Input in Network Flows for Automatic Repair Validation
Michael Locasto, Matthew Burnside, Angelos D. Keromytis

CUCS-017-06
Quantifying Application Behavior Space for Detection and Self-Healing
Michael Locasto, Angelos Stavrou, Gabriela G. Cretu, Angelos D. Keromytis, Salvatore J. Stolfo

CUCS-018-06
Seamless Layer-2 Handoff using Two Radios in IEEE 802.11 Wireless Networks
Sangho Shin, Andrea G. Forte, Henning Schulzrinne

CUCS-019-06
A Survey of Security Issues and Solutions in Presence
Vishal Kumar Singh, Henning Schulzrinne

CUCS-020-06
Anagram: A Content Anomaly Detector Resistant to Mimicry Attack
Ke Wang, Janak Parekh, Salvatore Stolfo

CUCS-021-06
A First Order Analysis of Lighting, Shading, and Shadows
Ravi Ramamoorthi, Dhruv Mahajan, Peter Belhumeur

CUCS-022-06
PBS: A Unified Priority-Based CPU Scheduler
Hanhua Feng, Vishal Misra, Dan Rubenstein

CUCS-023-06
SIMPLEstone - Benchmarking Presence Server Performance
Vishal Kumar Singh, Henning Schulzrinne

CUCS-024-06
Speculative Execution as an Operating System Service
Michael Locasto, Angelos Keromytis

CUCS-025-06
Exploiting Temporal Coherence for Pre-computation Based Rendering
Ryan Overbeck

CUCS-026-06
Privacy-Preserving Payload-Based Correlation for Accurate Malicious Traffic Detection
Janak Parekh, Ke Wang, Salvatore Stolfo

CUCS-027-06
Feasibility of Voice over IP on the Internet
Alex Sherman, Jason Nieh, Yoav Freund

CUCS-028-06
Practical Preference Relations for Large Data Sets
Kenneth Ross, Peter Stuckey, Amelie Marian

CUCS-029-06
Measurement and Evaluation of ENUM Server Performance
Charles Shen, Henning Schulzrinne

CUCS-030-06
Projection Volumetric Display using Passive Optical Scatterers
Shree K. Nayar, Vijay N. Anand

CUCS-031-06
Complexity and tractability of the heat equation
Arthur G. Werschulz

CUCS-032-06
Linear Approximation of Optimal Attempt Rate in Random Access Networks
Hoon Chang, Vishal Misra, Dan Rubenstein

CUCS-033-06
Throughput and Fairness in Random Access Networks
Hoon Chang, Vishal Misra, Dan Rubenstein

CUCS-034-06
A Framework for Quality Assurance of Machine Learning Applications
Christian Murphy, Gail Kaiser, Marta Arias

CUCS-035-06
Debugging Woven Code
Marc Eaddy, Alfred Aho, Weiping Hu, Paddy McDonald, Julian Burger

CUCS-037-06
Specifying Confluent Processes
Olivier Tardieu, Stephen A. Edwards

CUCS-038-06
MacShim: Compiling MATLAB to a Scheduling-Independent Concurrent Language
Neesha Subramaniam, Ohan Oda, Stephen A. Edwards

CUCS-039-06
A VoIP Privacy Mechanism and its Application in VoIP Peering for Voice Service Provider Topology and Identity Hiding
Charles Shen, Henning Schulzrinne


on each host that is unknown to the mimicry attacker. This raises the bar for the attackers as they have no means to know how and where to pad the exploit code to appear normal within each randomly computed partition, even if they have global knowledge of the target site’s content flow.

Finally, PAYL’s and Anagram’s speed and high detection rate make them valuable not only as stand-alone network-based sensors, but also as host-based data-flow classifiers in an instrumented, fault-tolerant host-based environment; this enables significant cost amortization and the possibility of a “symbiotic” feedback loop that can improve accuracy and reduce false positive rates over time.

Besides building stand-alone anomaly sensors, we also demonstrate a collaborative security strategy whereby different hosts may exchange payload alerts to increase the accuracy of the local sensor and reduce false positives. We propose and examine several new approaches to enable the sharing of suspicious payloads via privacy-preserving technologies. We detail the work we have done with PAYL and Anagram to support generalized payload correlation and signature generation without releasing identifiable payload data. The important principle demonstrated is that correlation of multiple alerts can identify true positives from the set of anomaly alerts, reducing incorrect decisions and producing accurate mitigation against zero-day attacks.

A new wave of cleverly crafted polymorphic attacks has substantially complicated the task of automatically generating “string-based” signatures to filter newly discovered zero-day attacks. Although the payload anomaly detection techniques we present are able to detect these attacks, correlating the individual packet content delivering distinct instances of the same polymorphic attack is shown to have limited value, requiring new approaches for generating robust signatures.
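The two modeling ideas in this abstract, a byte-frequency profile and a set of higher-order n-grams, can be sketched in a few lines of Python. This is a toy illustration and not the PAYL or Anagram code: the distance formula, the smoothing constant, the use of a plain set for n-grams, and the sample payloads are all assumptions, and the sketch leaves out conditioning on payload length and port as well as the randomized partitioning.

```python
import math
from collections import Counter


def byte_freq(payload):
    """Relative frequency of each of the 256 byte values in a payload."""
    counts = Counter(payload)
    n = len(payload) or 1
    return [counts.get(b, 0) / n for b in range(256)]


def train_profile(payloads):
    """Mean and standard deviation of byte frequencies over normal payloads."""
    freqs = [byte_freq(p) for p in payloads]
    k = len(freqs)
    mean = [sum(f[b] for f in freqs) / k for b in range(256)]
    std = [math.sqrt(sum((f[b] - mean[b]) ** 2 for f in freqs) / k)
           for b in range(256)]
    return mean, std


def profile_distance(payload, profile, smoothing=0.001):
    """Assumed scoring rule: deviation from the mean, scaled by the spread."""
    mean, std = profile
    f = byte_freq(payload)
    return sum(abs(f[b] - mean[b]) / (std[b] + smoothing) for b in range(256))


def train_ngram_model(payloads, n=3):
    """Set of all n-grams seen in normal traffic."""
    model = set()
    for p in payloads:
        model.update(p[i:i + n] for i in range(len(p) - n + 1))
    return model


def unseen_fraction(payload, model, n=3):
    """Share of a payload's n-grams that never appeared during training."""
    grams = [payload[i:i + n] for i in range(len(payload) - n + 1)]
    if not grams:
        return 0.0
    return sum(g not in model for g in grams) / len(grams)


# Invented sample traffic for illustration only.
normal = [b"GET /index.html HTTP/1.1",
          b"GET /about.html HTTP/1.1",
          b"GET /news.html HTTP/1.1"]
suspicious = b"\x90" * 20 + b"\xcc\xcc"
profile = train_profile(normal)
ngram_model = train_ngram_model(normal)
```

On this toy data the suspicious payload scores far higher than any training payload under the byte profile, and every one of its 3-grams is unseen, while a training payload has none unseen.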

Weibin Zhao
Advisor: Henning Schulzrinne
Towards Autonomic Computing: Service Discovery and Web Hotspot Rescue

Abstract: Autonomic computing is a vision that addresses the growing complexity of computing systems by enabling them to manage themselves without direct human intervention. This thesis studies two related problems, service discovery and web hotspot rescue, which can serve as a building block and a prototype for autonomic networking and distributed systems, respectively.

Service discovery allows end systems to discover desired services on networks automatically, eliminating administrative configuration. We made four enhancements to the Service Location Protocol (SLP): mesh enhancement, remote service discovery, preference filters, and global attributes. These enhancements improve SLP efficiency and scalability, and enable SLP to better support new and advanced discovery scenarios. The SLP mesh enhancement (mSLP), remote service discovery, and preference filters are now experimental RFCs (Request for Comments). We expect that similar techniques can be applied to other service discovery systems.

During the development of mSLP, we designed selective anti-entropy, a generic mechanism for high availability partial replication. Traditional anti-entropy only supports full replication. We enhanced it to support partial replication by allowing two replicas to selectively reconcile inconsistent data in a session.

Web hotspots are short-term dramatic load spikes. We developed DotSlash, a self-configuring and scalable rescue system for handling web hotspots effectively. DotSlash works autonomously. It uses service discovery to allocate resources dynamically from a server pool distributed globally, and uses adaptive overload control to automate the whole rescue process. As a comprehensive solution, DotSlash enables a web site to build an adaptive distributed web server system on the fly, replicate application programs dynamically, and set up distributed query result caching on demand. DotSlash relieves a spectrum of bottlenecks ranging from access network bandwidth to web servers, application servers, and database servers.

As part of DotSlash, we developed a prediction algorithm for estimating the upper bound of future web traffic volume, which is simple and effective for short-term bursty web traffic. This algorithm provides insight into characterizing traffic of web hotspots, and is useful for web server overload prevention.
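The difference between full and selective reconciliation can be shown with a toy Python sketch. It is an assumption-laden illustration, not the mSLP mechanism: replicas are plain dictionaries mapping a key to a (version, value) pair, the higher version wins, and the `wanted` predicate that selects which entries a session reconciles is a made-up stand-in for whatever selection rule the real protocol uses.

```python
def reconcile(replica_a, replica_b, wanted=lambda key: True):
    """One anti-entropy session between two replicas.

    Only keys accepted by `wanted` are compared; for those, both replicas end
    up holding the entry with the higher version. With the default predicate
    this degenerates to full replication.
    """
    for key in set(replica_a) | set(replica_b):
        if not wanted(key):
            continue  # outside the scope of this session: leave untouched
        entry_a = replica_a.get(key)
        entry_b = replica_b.get(key)
        if entry_a is None or (entry_b is not None and entry_b[0] > entry_a[0]):
            replica_a[key] = entry_b
        elif entry_b is None or entry_a[0] > entry_b[0]:
            replica_b[key] = entry_a


# Invented service registrations: key -> (version, value).
a = {"printer/1": (2, "room 450"), "scanner/1": (1, "room 460")}
b = {"printer/1": (1, "room 440"), "printer/2": (1, "room 470")}
reconcile(a, b, wanted=lambda key: key.startswith("printer/"))
```

After the session both replicas agree on every printer entry, while the scanner entry stays only on the replica that already had it.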


Columbia University in the City of New York
Department of Computer Science
1214 Amsterdam Avenue
Mailcode: 0401
New York, NY 10027-7003

ADDRESS SERVICE REQUESTED

CS@CU NEWSLETTER OF THE DEPARTMENT OF COMPUTER SCIENCE AT COLUMBIA UNIVERSITY WWW.CS.COLUMBIA.EDU

NON-PROFIT ORG.

U.S. POSTAGE

PAID

NEW YORK, NY

PERMIT 4332

CUCS Resources

You can subscribe to the CUCS news mailing list at http://lists.cs.columbia.edu/mailman/listinfo/cucs-news/

CUCS colloquium information is available at http://www.cs.columbia.edu/lectures

Visit the CUCS alumni portal at http://alum.cs.columbia.edu
