The Initiative in Innovative Computing at Harvard
Alyssa A. Goodman, IIC Director & Prof. of Astronomy
Agenda
What is IIC? (“Filling the Gap”)
Where did it come from? (A Story)
What have we done so far? (Startup Mode)
What are we about to do? (Projects, Hiring Plans)
What do we hope to do? (Long-term Goals)
Filling the “Gap” between Science and Computer Science
Scientific disciplines
Increasingly, core problems in science require computational solution
Typically hire/“home grow” computationalists, but often lack the expertise or funding to go beyond the immediate pressing need
Computer Science departments
Focused on finding elegant solutions to basic computer science challenges
Often see specific, “applied” problems as outside their interests
Where did IIC come from?
Short Version: Response to Harvard’s “expansion” in Science, and into Allston. See IIC Whitepaper (2004) & Task Force on Science & Technology report (2005) for more.
Long Version…
Computational challenges are common across scientific disciplines
How to:
Acquire, transmit, organize, and query new kinds of data?
Apply distributed computing resources to solve complex problems?
Derive meaningful insight from large datasets?
Share, integrate and analyze knowledge across geographically dispersed researchers?
Visually represent scientific results so as to maximize understanding?
Opportunity to collaborate and apply insights from one field to another
Workflow and WORKFLOW Examples (Astronomy | Public Health)
“Collect”: Telescope | Microscope, Stethoscope, Survey
COLLECT: “National Virtual Observatory”/COMPLETE | CDC Wonder
“Analyze”: Study the density structure of a star-forming glob of gas | Find a link between one factory’s chlorine runoff & disease
ANALYZE: Study the density structure of all star-forming gas in… | Study the toxic effects of chlorine runoff in the U.S.
“Collaborate”: Work with your student
COLLABORATE: Work with 20 people in 5 countries, in real-time
“Respond”: Write a paper for a Journal.
RESPOND: Write a paper, the quantitative results of which are shared globally, digitally.
Real World Workflow
e.g. Emergency Medicine in the Age of High-Speed Networks, Fast Processors, Mass Storage, and Miniature Devices
IIC/Harvard contact: Matt Welsh, DEAS
Continuum
“Pure” Discipline Science
(e.g. Galileo)
“Pure” Computer Science
(e.g. Turing)
“Computational Science”: Missing at Most Universities
Filling the “computational science” gap: IIC
Problem-driven approach… focusing effort on solving problems that will have greatest impact & educational value
Collaborative projects… combining disciplinary knowledge with computer science expertise
Interdisciplinary effort… to ensure that best practices are shared across fields and that new tools and methodologies will be broadly applicable
Links with industry… to draw on and learn from experience in applied computation
Institutional funding… to ensure effort is directed towards key needs and not driven solely by narrow priorities of funding agencies
Where are the optimal “IIC” problems?
[Figure: axes are Computer Science Payoff (Low to High) and Domain Science Payoff (Low to High); regions labelled “Never Mind”, Science Departments, and CS Departments. Annotation: What is the right shape for that boundary?]
IIC Research Branches (and Projects Draw upon >1)
Visualization (V): Physically meaningful combination of diverse data types.
Distributed Computing (DC): e-Science aspects of large collaborations. Sharing of data and computational resources and tools in real-time.
Databases/Provenance (DB/P): Management, and rapid retrieval, of data. “Research reproducibility”… where did the data come from? How?
Analysis & Simulations (AS): Development of efficient algorithms. Cross-disciplinary comparative tools (e.g. statistical).
Instrumentation (I): Improved data acquisition. Novel hardware approaches (e.g. GPUs, sensors).
Plus… Educational Programs that bring IIC Science to Harvard students, and to the public at large.
Education is central to IIC’s mission
At Harvard:
Undergraduate & graduate courses focused on “data-intensive science”
New graduate certificate program, within existing Ph.D. programs
Research opportunities at undergraduate, graduate, and postdoctoral levels
Beyond Harvard:
New museum, highlighting the kind of science done at the IIC
IIC’s First Activities (2005-)
Image & Meaning Collaboration
IIC Seminar Series at Harvard
Astronomical Medicine (IIC/CfA/HMS/MGH/BWH-SPL)
1st Call for Ideas (deadline was 3/15/06)
[Branch tags shown on this slide: V I; V DC DB/P AS I; V DC DB/P AS; V DC DB/P AS I]
“Image and Meaning”
“I-M” = Working group of scientists, computer scientists, graphic artists, writers, publishers, and designers, organized and led by Felice Frankel, now at IIC!
Goal: To increase both scientists’ understanding of their own data, and the public’s understanding of scientists’ findings, through graphical display.
Activities:
Large conferences at MIT in 2001 and Getty Center in 2005.
Smaller “IM2.x” local workshops throughout 2006-7, including @ IIC.
Upcoming IM/SIGGRAPH, in conjunction with SIGGRAPH 2007.
Online community to be hosted by IIC, beginning later this year. (Social Network model.)
Seminar Sampler (Fall 2005-Spring 2006)
Jim Reese: How to Build Google in Your Spare Time
Ian Foster: Service-Oriented Science
Volker Springel/Nick Holliman: Numerical Cosmology & 3D Viz
Tim Kaxiras: Multi-Scale Modeling
Anne Trefethen: UK e-Science
Carl Kesselman: Emergence of Cyberinfrastructure
Panel on CS & Visual Depiction (Frankel, Rheingans, Durand, Pfister)
Jim Hendler: Science & the Semantic Web
Mark Green: Building a Grid-enabled Gateway for Science & Engineering
Roy Williams: Virtual Observatory as a Model for Information Sharing
Andy van Dam/Anne Spalter: Digital Visual Literacy
Pete Eltgroth: Profiles in Supercomputing
Luc Moreau: Provenance
Curtis Wong: Interactive Media
Eric Klopfer: Games, Simulation & Learning
Jim Myers (yesterday!): Cyberenvironments
Phil Campbell: Future of e-Publishing
And more… Grid, Agile Methods, Array-based Databases, Bio & Neuro informatics, Clinical Applications in Autism Research, Astronomical Medicine…
Responses to 1st IIC Call for Ideas
Atomistic Modeling of Biomolecular Function
Multiscale Hemodynamics
Gene Pattern + The Virtual Data Center
Medical Treatment Outcomes Online
Enhanced Viz/Analysis Tools for Archaeo/Geo/Seismology
Spatial Ontology Mapping (Community-based)
Knowledge Ecology of Science (Peer-to-Peer Collaboration Networks)
Framework for Multimodal Studies in Genetics, Biology & the Mind
Connectional Analysis of Synaptic Circuitry in the Mammalian Nervous System (The “Connectome”)
LHC/LSST/MWA Consortium for Data-Intensive Science
A Portal for the National Virtual Observatory
Time-Series Research Collaborative
[On the slide, each response is tagged with the research branches it draws upon: V, DC, DB/P, AS, I.]
Building the Best (Startup) Program
[Figure: Project 1, Project 2, and Project 3 laid out across the five research branches: Visualization (V), Distributed Computing (DC), Databases/Provenance (DB/P), Analysis & Simulations (AS), Instrumentation (I).]
Now…
Atomistic Modeling of Biomolecular Function
Multiscale Hemodynamics
Gene Pattern + The Virtual Data Center
Medical Treatment Outcomes Online
Enhanced Viz/Analysis Tools for Archaeo/Geo/Seismology
Spatial Ontology Mapping (Community-based)
Knowledge Ecology of Science (Peer-to-Peer Collaboration Networks)
Framework for Multimodal Studies in Genetics, Biology & the Mind
Connectional Analysis of Synaptic Circuitry in the Mammalian Nervous System (The “Connectome”)
LHC/LSST/MWA Consortium for Data-Intensive Science
A Portal for the National Virtual Observatory
Time-Series Research Collaborative
[Status notes shown beside some of the responses: planning grant (two responses); looking into collab around 3D displays; ongoing discussion w/Harvard librarians; integrated into Data-Intensive Project.]
2006-7 Project Portfolio
The Connectome
+Astronomical Medicine
Computational Framework for Neuroinformatics and Genetics
Data-Intensive Science and High Capacity Scientific Databases
Genepattern and the Virtual Data Center (VDC)
National Virtual Observatory Portal
+Envisioning Science Program
“Astronomical Medicine”
Brigham & Women’s Hospital, Surgical Planning Lab
Massachusetts General Hospital, Martinos Center
Harvard-Smithsonian Center for Astrophysics
IIC
Present Team:
Alyssa Goodman (IIC & CfA, Co-I)
Michael Halle (IIC & BWH, Co-I)
Douglas Alan (IIC, Sen. Scientific S/W Engineer)
Michelle Borkin (IIC, Res. Assoc.)
Jens Kauffmann (IIC & CfA, postdoc)
Demo Movie
The “Connectome”: Wiring Diagram for a Complete Brain Circuit
(Connectional Analysis of Synaptic Circuitry in the Mammalian Nervous System)
3D images from electron-microscope images of serial sections (slices)
– Large-volume studies: up to 500 µm cubes
– High resolution: 5 nm in x-y; 50 nm in z (10^5 x 10^5 x 10^4 = 10^14 voxels)
– Large datasets: 10-100 TB
Potentially intractable computationally w/o a hierarchical approach
Start with the large, dominant pathways: The biggest wires and the biggest excitatory connections.
Use this as scaffolding to then solve other pathways: inhibition, lateral connections, feedback.
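The voxel count and dataset size quoted on this slide can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes a cube edge of 500 micrometres (the value that reproduces the quoted 10^5 x 10^5 x 10^4 grid) and one byte per voxel, neither of which is stated outright on the slide, and the function names are made up for the example.

```python
# Back-of-the-envelope check of the Connectome data volume.
# Assumptions (not stated on the slide): 500-micrometre cube edge, 1 byte per voxel.

EDGE_NM = 500_000        # assumed cube edge: 500 micrometres, in nanometres
VOXEL_XY_NM = 5          # in-plane resolution quoted on the slide
VOXEL_Z_NM = 50          # resolution in z quoted on the slide
BYTES_PER_VOXEL = 1      # assumed 8-bit greyscale


def voxel_grid(edge_nm, xy_nm, z_nm):
    """Return the number of voxels along x, y and z for a cubic volume."""
    nx = edge_nm // xy_nm
    ny = edge_nm // xy_nm
    nz = edge_nm // z_nm
    return nx, ny, nz


def dataset_terabytes(n_voxels, bytes_per_voxel):
    """Convert a voxel count to terabytes (1 TB taken as 10^12 bytes)."""
    return n_voxels * bytes_per_voxel / 1e12


nx, ny, nz = voxel_grid(EDGE_NM, VOXEL_XY_NM, VOXEL_Z_NM)
total_voxels = nx * ny * nz
size_tb = dataset_terabytes(total_voxels, BYTES_PER_VOXEL)
print(nx, ny, nz, total_voxels, size_tb)
```

Under these assumptions the full-resolution volume comes to 10^14 voxels, or about 100 TB, which sits at the upper end of the quoted 10-100 TB range.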
V
DC DB/P AS I
Virtual Observatory Portal
V
DC
Virtual Observatory Portal
Default values are shown in green
Data on: One object | One Region | A list of objects | A list of regions
I want (click all that apply): Spectra | Images | Catalogs
I want to: Use VO tools to browse data | Download data to local computer
Would you like help writing a script to do your query? Yes | No
Continue
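One way to picture what this form collects is as a small typed request object. The sketch below is hypothetical: the names `Target`, `Product`, `Action`, and `PortalRequest` are invented for illustration and are not part of any Virtual Observatory software, and the rule that at least one data product must be selected is an assumption read into "click all that apply".

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet


class Target(Enum):
    ONE_OBJECT = "One object"
    ONE_REGION = "One Region"
    OBJECT_LIST = "A list of objects"
    REGION_LIST = "A list of regions"


class Product(Enum):
    SPECTRA = "Spectra"
    IMAGES = "Images"
    CATALOGS = "Catalogs"


class Action(Enum):
    BROWSE = "Use VO tools to browse data"
    DOWNLOAD = "Download data to local computer"


@dataclass(frozen=True)
class PortalRequest:
    """Hypothetical container for the choices made on the portal form."""
    target: Target
    products: FrozenSet[Product]
    action: Action
    script_help: bool

    def __post_init__(self):
        # Assumed rule: "click all that apply" still needs one selection.
        if not self.products:
            raise ValueError("select at least one data product")

    def summary(self) -> str:
        """Render the selections as one human-readable line."""
        wanted = ", ".join(sorted(p.value for p in self.products))
        help_text = "yes" if self.script_help else "no"
        return (f"Data on: {self.target.value}; I want: {wanted}; "
                f"I want to: {self.action.value}; script help: {help_text}")


request = PortalRequest(
    target=Target.ONE_REGION,
    products=frozenset({Product.IMAGES, Product.CATALOGS}),
    action=Action.DOWNLOAD,
    script_help=True,
)
print(request.summary())
```

Keeping the choices as enumerations mirrors the fixed options on the form; the defaults "shown in green" are not recoverable from the slide text, so the sketch requires every field explicitly.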
A Computational Framework for Neuroinformatics and Genetics
Goal: Create an integrated framework for simultaneous analysis and reproducible retrieval of multimodal data in structural & functional brain imaging and genetics.
Collaboration Amongst Several HMS Hospitals & Departments:
Neurology
Radiology
Psychology
Molecular Genetics
IIC
A Computational Framework for Multimodal Studies in GENETICS, BIOLOGY, AND THE MIND
[Figure panels: Core Imaging Methodologies; Cortical Thickness, AD vs. Controls; Family history pedigree software toolkit; Histological Correlates of AD; Topology differences in cocaine addiction. Image labels: Patients, Normal Controls, Anterior, Posterior, Superior, Right Amygdala, Common.]
V
DC DB/P AS I
A Computational Framework for Multimodal Studies in GENETICS, BIOLOGY, AND THE MIND
[Figure: boxes labelled Lab 1 to Lab 5 and Computational Framework; caption: “An Entire Disease or Condition of the Brain”]
V
DC DB/P AS I
Data-Intensive Science
Collaboration Amongst:
Physics Department
DEAS
Harvard-Smithsonian Center for Astrophysics
IIC & Harvard CIO’s
CERN & NSF
Goal: Create powerful HPC / Grid capabilities in data-intensive science, advanced analytical algorithms in Astronomy & Physics, & advanced research in scientific VLDBMS
Data-Intensive Science
Of interest to ABCD:
Tier 2 Grid Node & Staff will come to Harvard
HPC Real-Time Computing capability for MWA project
Large, fast storage for Pan-STARRS Project
Seeds of a “Center” for Time-Series Analysis
Gene Pattern - Virtual Data Center
Collaboration Amongst:
The Broad Institute of Harvard & MIT
Harvard Institute for Quantitative Social Science (IQSS)
IIC
Goal: Integrate biomedical computational workflow engines with a statistical framework and canonical data repository originally developed for social science research
Multiscale Hemodynamics
Goal: Build an accurate multiscale simulation of hemodynamics to enable significant advances in fundamental knowledge of blood flow and treatment of related diseases.
Collaboration Amongst:
DEAS/Applied Mathematics
Physics
DEAS/Computer Science
HMS/Cardiology
DEAS/Chemical Engineering
IIC
The ‘Envisioning Science’ Program
Mission: To enable scientists, computer scientists, graphic designers, journalists, and editors to co-develop new methods of scientific communication and education focused on scientific images.
Collaboration Amongst:
Faculty of Arts and Sciences (Felice Frankel!!)
Harvard Medical School
MIT, NSF, Apple…
IIC
Ongoing:
Image and Meaning Conference series
Picturing to Learn program
IIC will evolve over three phases
Timing: Phase I (2005-08), Phase II (2008-10), Phase III (2011+)
IIC staffing level (combo of new faculty, senior scientists, admin staff): total ~25 to ~100
Number of projects: ~5-7 to ~15-18
Educational mission (new courses offered, outreach programs): new courses to museum
Other key milestones: evaluation schedule (internal, external committees)
Organizational Challenges
Result of “Allston” Science & Technology Task Force
IIC intended to be a “University” (not a single school) initiative
FAS (Faculty of Arts & Science) Constraints
Faculty Appointments
Non-Faculty Appointments
Startup Space
“Chicken-and-Egg” Problem with Recruiting
Good, but not certain, Funding Prospects
Role of DEAS Computer Science
Challenges for Discussion
• How will IIC research enable better collaboration amongst geographically distributed researchers?
• What are the best technologies for visualizing enormous data sets?
• How can "human-in-the-loop" software, where we admit that humans are better than computers at many (particularly graphical) tasks, best be created and used?
• Are "mashups," where many software packages are "mashed up" together, the way of the future, or is it reasonable to strive for "perfect" standalone software packages?
• If mashups ultimately prevail, what is the business model for developing them?
• How do we best appoint new IIC faculty at Harvard, given that their work often does not fit within existing departmental boundaries?
• How can IIC best partner with industry to accomplish goals of mutual interest, and which of those goals are paramount?
“Challenges”
[Figure: axes are Computer Science Payoff (Low to High) and Domain Science Payoff (Low to High); regions labelled “Never Mind”, Science Departments, and CS Departments, annotated with the questions below.]
Will CS/DEAS use slots for these people? How big is that overlap?
Will departments hire “computationalists” with regular slots? How big is this overlap?
How do we give senior non-faculty similar stature to faculty? (e.g. P.I. rights, job security)
IIC: Mission
The Initiative in Innovative Computing (IIC) will make Harvard a world leader in the innovative and creative use of computational resources to address forefront scientific problems.
We will focus on developing capabilities that are applicable to multiple disciplines, by undertaking specific, well-defined projects, thereby developing tools and approaches that can be generalized and shared.
We will foster the flow of ideas and inventions along the continuum from basic science to scientific computation to computational science to computer science.
We will train a next generation of creative and computationally capable scientists, build linkages to industry, and communicate with the public at large.
Sample Long Term Goal: “3D Data Desk”
Demo (using data from http://www.electoral-vote.com/2004/info/president.csv)
Perseus file