Advancing Life Sciences Research with High Performance Computing and Cyberinfrastructure
Ian Stokes-Rees
Harvard Medical School
SHOW - Making Biology Binary, June 2010
Dengue Virus Movie
animation, not simulation, informed by science
digizyme.com
Ian Stokes-Rees, NEBioGrid, Harvard Medical School June 23rd, 2010
Science Behind the Movie
Multi-scale
Data intensive
Dynamic
Models
Simulation
Analysis
Water channel through aquaporin tetramer in lipid bilayer
Tajkhorshid, E., Nollert, P., Jensen, M.O., Miercke, L.J., O'Connell, J., Stroud, R.M., and Schulten, K. (2002). Science 296, 525-530
Molecular Dynamics
Computationally intensive
Necessarily parallel
Nanosecond scale today
Millisecond to second tomorrow
Rapidly growing interest
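To give a sense of why moving from nanosecond to millisecond or second time scales is so demanding, the sketch below counts the integration steps involved. It assumes a 2 fs time step, a common choice in biomolecular molecular dynamics that is not stated on the slide; the function name `md_steps` is illustrative only.

```python
from fractions import Fraction

FEMTOSECOND = Fraction(1, 10**15)  # one femtosecond, in seconds

def md_steps(simulated_seconds, timestep=2 * FEMTOSECOND):
    """Number of integration steps needed to cover simulated_seconds.

    The 2 fs default time step is an assumption, not a figure from the talk.
    """
    return int(Fraction(simulated_seconds) / timestep)

for label, seconds in [("1 ns", Fraction(1, 10**9)),
                       ("1 ms", Fraction(1, 10**3)),
                       ("1 s", Fraction(1))]:
    print(f"{label}: {md_steps(seconds):.1e} steps")
```

Under that assumption, each factor of a thousand in simulated time is a factor of a thousand in steps, which is why parallel hardware matters here.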
48 cores, single system image
GPU computing: 200-800 stream processing cores per card
NextGen Sequencing
Collaborations and Communities
Tufts University School of Medicine
Boston Life Sciences
Universities
Hospitals
Pharmaceuticals
Research Institutes
Rice University: E. Nikonowicz, Y. Shamoo, Y.J. Tao
CalTech: P. Bjorkman, W. Clemons, G. Jensen, D. Rees
Stanford: A. Brunger, K. Garcia, T. Jardetzky
UCSF: JJ Miranda, Y. Cheng
UC Davis: H. Stahlberg
UCSD: T. Nakagawa, H. Viadiu
WesternU: M. Swairjo
U. Washington: T. Gonen
Washington U. School of Med.: T. Ellenberger, D. Fremont
Vanderbilt Center for Structural Biology: W. Chazin, B. Eichman, M. Egli, B. Lacy, M. Ohi, C. Sanders, B. Spiller, M. Stone, M. Waterman
Rosalind Franklin: D. Harrison
Harvard and Affiliates: N. Beglova, S. Blacklow, B. Chen, J. Chou, J. Clardy, M. Eck, B. Furie, R. Gaudet, M. Grant, S.C. Harrison, J. Hogle, D. Jeruzalmi, D. Kahne, T. Kirchhausen, A. Leschziner, K. Miller, A. Rao, T. Rapoport, M. Samso, P. Sliz, T. Springer, G. Verdine, G. Wagner, L. Walensky, S. Walker, T. Walz, J. Wang, S. Wong
NE-CAT
Cornell U.: R. Cerione, B. Crane, S. Ealick, M. Jin, A. Ke, R. Oswald, C. Parrish, H. Sondermann
Brandeis U.: N. Grigorieff
Tufts U.: K. Heldwein
UMass Medical: W. Royer
NIH: M. Mayer
U. Maryland: E. Toth
Yale U.: T. Boggon, D. Braddock, Y. Ha, E. Lolis, K. Reinisch, J. Schlessinger, F. Sigworth, F. Zhou
Columbia U.: Q. Fan
Rockefeller U.: R. MacKinnon
Thomas Jefferson: J. Williams
Not Pictured: University of Toronto: L. Howell, E. Pai, F. Sicheri; NHRI (Taiwan): G. Liou; Trinity College, Dublin: Amir Khan
If the particle physicists can use it...
Open Science Grid
opensciencegrid.org
Grid Computing
Federated and scalable
Secure
Standardized
Compute sharing & cycle scavenging
Dynamic formation of collaborations
Data sharing
Protein Structure Studies
Ian Stokes-Rees, http://sbgrid.org
Acknowledgements
Piotr Sliz: PI and SBGrid team leader
Ian Levesque: Systems Architect
Ben Eisenbraun: Software Curator
Peter Doherty: Grid Administrator
Caitlin Colgrove: Intern Software Engineer
Steve Jahl: System Administrator
Summary
Compute power increasingly affordable
New computational techniques
New hardware (multi-core, GPU)
Grid and cloud computing
Fast networking, cheap storage
Scientists developing necessary skills
Be in touch - [email protected]
Extras
Ian Stokes-Rees, SBGrid, Harvard Medical School October 13th, 2009
How to get a structural biologist using CI
Ease of use
No command line
X.509 (initial request, VOs, proxies, roles, etc.) is really complicated
Support infrastructure (mailing lists, tickets, phone, training)
Killer apps
They will use it if they see peers using it to advance scientific goals
They will use it if some novel workflows or workflow patterns are established
Data management is a big problem for everyone (see bonus, time permitting); we believe grid infrastructure could provide a solution
Security
Data needs to be secure ...
... but users still want to control sharing/access
Roadblocks
Reliability of underlying infrastructure and difficulty in debugging
Applications tied to GUIs, rudimentary interfaces
Security Challenges
Identity Management
Mixture of .htpasswd, PAM, X.509, and application-specific IDs
Complexity of X.509 (and associated paraphernalia) confuses users
Account creation, use, and management
Virtual Organization hierarchies and user-driven collaborations
Inheritance of rights/policies
How to allow users to easily create and manage groups
Merging security policies
Site/resource, VO, and user policies need to be merged
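As a rough illustration of what merging the three policy layers could look like, the sketch below grants a permission only when the site, VO, and user policies all grant it. This "most restrictive wins" rule, the dictionary layout, and the name `merge_policies` are assumptions made for the example; the slides do not say how the merge should work.

```python
def merge_policies(site, vo, user):
    """Grant a permission only if every layer grants it (set intersection).

    Each policy maps a principal name to a set of permission strings.
    The intersection rule is an illustrative assumption, not the
    approach described in the talk.
    """
    principals = set(site) & set(vo) & set(user)
    merged = {}
    for principal in principals:
        perms = site[principal] & vo[principal] & user[principal]
        if perms:  # drop principals left with no permissions
            merged[principal] = perms
    return merged

# Hypothetical example policies for two principals.
site = {"alice": {"read", "write", "exec"}, "bob": {"read"}}
vo = {"alice": {"read", "write"}, "bob": {"read", "write"}}
user = {"alice": {"read"}, "bob": set()}

print(merge_policies(site, vo, user))
```

Other merge rules (for example, letting the VO override the user layer) are equally plausible; the point is only that the three sources must be reduced to one effective policy.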
Encryption and Privacy Preservation
Generic mechanisms for encryption and key management
Preserving privacy of actions and data in federated grid environment
Security Work
Metadata system
Provide more generic pointers to ACLs and encryption keys
Extension of GACL system
Include non-X.509 ID tokens as policy principals
Allow GACL policies to apply to web framework objects (pyGACL)
Simple replicated key system for file encryption
Use of metadata framework to point to encryption key (and replicas)
Use GACL to control key access (regular file)
Libraries to automatically read/write encrypted files
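The replicated key idea above can be sketched as a lookup: a metadata record points to a key and its replicas, and an access list decides who may fetch it. Everything here is hypothetical: the record fields, the in-memory dictionaries standing in for key stores, and the plain set standing in for a GACL policy. No real GACL or pyGACL API is used, and no encryption is performed.

```python
class AccessDenied(Exception):
    """Raised when a principal is not allowed to read a key."""

def fetch_key(metadata, key_stores, path, principal):
    """Return the key for path from the first replica that holds it."""
    entry = metadata[path]
    # Stand-in for a GACL check on the key, which is stored as a regular file.
    if principal not in entry["key_acl"]:
        raise AccessDenied(f"{principal} may not read the key for {path}")
    # Try each replica in order until one has the key.
    for store_name in entry["key_replicas"]:
        key = key_stores.get(store_name, {}).get(entry["key_id"])
        if key is not None:
            return key
    raise LookupError(f"no replica holds key {entry['key_id']}")

# Hypothetical metadata record and key stores.
metadata = {
    "/data/run42.mtz": {
        "key_id": "k-001",
        "key_replicas": ["store-a", "store-b"],
        "key_acl": {"alice"},
    }
}
key_stores = {"store-a": {}, "store-b": {"k-001": b"secret-key-bytes"}}

print(fetch_key(metadata, key_stores, "/data/run42.mtz", "alice"))
```

A library that reads and writes encrypted files transparently would call something like `fetch_key` before handing the key to a real cipher.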
Future
VO hierarchies
Tools for user-driven ACL management
Tools for policy management (merging site, VO and user policies)