The impact of grid computing on UK research
R Perrott
Queen’s University
Belfast
On-demand creation of powerful virtual computing systems
The Grid: The Web on Steroids
Web: Uniform access to HTML documents
Grid: Flexible, high-perf access to all significant resources
Sensor nets
Data archives
Computers
Software catalogs
Colleagues
Why Now?
• The Internet as infrastructure
– Increasing bandwidth, advanced services
• Advances in storage capacity
– Terabyte for < $15,000
• Increased availability of compute resources
– Clusters, supercomputers, etc.
• Advances in application concepts
– Simulation-based design, advanced scientific instruments, collaborative engineering, ...
Grids
– computational grid
• provides the raw computing power, high-speed bandwidth interconnection and associated data storage
– information grid
• allows easily accessible connections to major sources of information and tools for its analysis and visualisation
– knowledge grid
• gives added value to the information and also provides intelligent guidance for decision-makers
Grid Architecture
Knowledge Grid
Information Grid
Communications
Data to Knowledge
Control
Computation & Data Grid
Application Layer
Middleware
Base Layer
Program suite that can search out data and services, to satisfy job requirements. User-level API and libraries.
Schedulers to launch the work in the right places. The authentication, authorisation, etc. to allow it to happen.
Local resources that form local contributions to the federated resource.
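The middleware roles described above (searching out resources that satisfy a job's requirements, checking authorisation, and launching the work in the right place) can be illustrated with a toy sketch. This is not taken from any real Grid toolkit: the `Resource` and `Job` classes, the site names, and the "smallest adequate resource" selection rule are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Resource:
    """A local contribution to the shared pool (hypothetical model)."""
    name: str
    cpus: int
    storage_tb: float
    authorised_users: frozenset


@dataclass(frozen=True)
class Job:
    """What a user asks for (hypothetical model)."""
    user: str
    cpus: int
    storage_tb: float


def find_candidates(job: Job, resources: List[Resource]) -> List[Resource]:
    """Return resources the user may use and that meet the job's needs."""
    return [
        r for r in resources
        if job.user in r.authorised_users      # authorisation check
        and r.cpus >= job.cpus                 # enough compute
        and r.storage_tb >= job.storage_tb     # enough storage
    ]


def schedule(job: Job, resources: List[Resource]) -> Optional[Resource]:
    """Pick the smallest adequate resource (an assumed policy), or None."""
    candidates = find_candidates(job, resources)
    if not candidates:
        return None
    return min(candidates, key=lambda r: (r.cpus, r.name))


# Made-up sites, purely for demonstration.
SITES = [
    Resource("site-a", 64, 2.0, frozenset({"alice"})),
    Resource("site-b", 256, 20.0, frozenset({"alice", "bob"})),
    Resource("site-c", 1024, 5.0, frozenset({"bob"})),
]

chosen = schedule(Job("alice", 128, 10.0), SITES)
print(chosen.name if chosen else "no suitable resource")
```

A real scheduler would also weigh load, data locality and cost; the single selection rule here only shows where such a policy would plug in.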
Software Suppliers
Application Users
UK Research Councils: approx. funding for 2000/01 (£M)
• Biotechnology and Biological Sciences Research Council (BBSRC): 200
• Engineering and Physical Sciences Research Council (EPSRC): 400
• Economic and Social Research Council (ESRC): 70
• Medical Research Council (MRC): 350
• Natural Environment Research Council (NERC): 225
• Particle Physics and Astronomy Research Council (PPARC): 200
• Council for the Central Laboratory of the Research Councils: 100
UK Grid Development Plan
1. Network of Grid Core Programme e-Science Centres
2. Development of Generic Grid Middleware
3. Grid Grand Challenge Project
4. Support for e-Science Projects
5. International Involvement
6. Grid Network Team
1. Grid Core Programme Centres
• National e-Science Centre to achieve international visibility
• National Centre will host international e-Science seminars ‘similar’ to Newton Institute
• Funding 8 Regional e-Science Centres to form coherent UK Grid
• DTI funding requires matching industrial involvement
• Good overlap with Particle Physics and AstroGrid Centres
Cambridge
Newcastle
Edinburgh
Oxford
Glasgow
Manchester
Cardiff
Southampton
London
Belfast
DL
RL
Hinxton
Centres will be Access Grid Nodes
• Access Grid will enable informal and formal group-to-group collaboration
• It enables:
– Distributed lectures and seminars
– Virtual meetings
– Complex distributed grid demos
• Will improve the user experience (“sense of presence”) - natural interactions (natural audio, big display)
Access Grid
2. Generic Grid Middleware
• Continuing dialogue with major industrial players
- IBM, Microsoft, Oracle, Sun, HP ..
- IBM Press Announcement August 2001
• Open Call for Proposals from July 2001 plus Centre industrial projects
• Funding Computer Science involvement in EU DataGrid Middleware Work Packages
3. Grid Interdisciplinary Research Centres Project
• 4 IT-centric IRCs funded
- DIRC : Dependability
- EQUATOR : HCI
- AKT : Knowledge Management
- Medical Informatics
• ‘Grand Challenge’ in Medical/Healthcare Informatics
- issues of security, privacy and trust
4. Support for e-Science Projects
• ‘Grid Starter Kit’ Version 1.0 available for distribution from July 2001
• Set up Grid Support Centre
• Training Courses
• National e-Science Centre Research Seminar Programme
5. International Involvement
• ‘GridNet’ at National Centre for UK participation in the Global Grid Forum
• Funding CERN and iVDGL ‘Grid Fellowships’
• Participation/Leadership in EU Grid Activities
- New FP5 Grid Projects (DataTag, GRIP, …)
• Establishing links with major US Centres – San Diego Supercomputer Center, NCSA
6. Grid Network Team
• Tasked with ensuring adequate end-to-end bandwidth for e-Science Projects
• Identify/fix network bottlenecks
• Identify network requirements of e-Science projects
• Funding traffic engineering project
• Upgrade SuperJANET4 connection to sites
Network Issues
• Upgrading SJ4 backbone from 2.5 Gbps to 10 Gbps
• Installing 2.5 Gbps link to GEANT pan-European network
• TransAtlantic bandwidth procurement
– 2.5 Gbps dedicated fibre
– Connections to Abilene and ESNet
• EU DataTAG project 2.5 Gbps link from CERN to Chicago
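To give a feel for what the link speeds above mean for e-Science data movement, the sketch below computes an idealised transfer time at 2.5 Gbps and at 10 Gbps. The 1 TB dataset size is a hypothetical example, not a figure from the slides, and the calculation assumes decimal units, a fully dedicated link and no protocol overhead, so real transfers would be slower.

```python
def transfer_seconds(size_bytes: float, link_gbps: float) -> float:
    """Ideal time to move size_bytes over a link of link_gbps (10^9 bits/s).

    Assumes the whole link is available and ignores protocol overhead.
    """
    bits = size_bytes * 8
    return bits / (link_gbps * 1e9)


DATASET_BYTES = 1e12  # hypothetical 1 TB dataset (decimal terabyte)

for gbps in (2.5, 10):
    seconds = transfer_seconds(DATASET_BYTES, gbps)
    print(f"{gbps:>4} Gbps: {seconds:.0f} s ({seconds / 60:.1f} min)")
```

Under these assumptions the fourfold increase in link speed cuts the ideal transfer time by the same factor.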
Early e-Science Demonstrators
Funded
• Dynamic Brain Atlas
• Biodiversity
• Chemical Structures
Under Development/Consideration
• Grid-Microscopy
• Robotic Astronomy
• Collaborative Visualisation
• Mouse Genes
• 3D Engineering Prototypes
• Medical Imaging/VR
Particle Physics and Astronomy Research Council (PPARC)
• GridPP (http://www.gridpp.ac.uk/)
• to develop the Grid technologies required to meet the LHC computing challenge
• collaboration with international grid developments in Europe and the US
Particle Physics and Astronomy Research Council (PPARC)
• ASTROGRID (http://www.astrogrid.ac.uk/)
• a ~£4M project aimed at building a data-grid for UK astronomy, which will form the UK contribution to a global Virtual Observatory
EPSRC Testbeds (1)
• DAME : Distributed Aircraft Maintenance Environment
• RealityGrid : closely couple high performance computing, high throughput experiment and visualization
• GEODISE : Grid Enabled Optimisation and DesIgn Search for Engineering
EPSRC Testbeds (2)
• CombiChem : combinatorial chemistry structure-property mapping
• MyGrid : personalised extensible environments for data-intensive experiments in biology
• Discovery Net : high throughput sensing
Distributed Aircraft Maintenance Environment
Jim Austin, University of York
Peter Dew, Leeds
Graham Hesketh, Rolls-Royce
In flight data
Airline
Maintenance Centre
Ground Station
Global Network
Internet, e-mail, pager
DS&S Engine Health Center
Data centre
Aims
• To build a generic grid test bed for distributed diagnostics on a global scale
• To demonstrate this on distributed aircraft maintenance
• Evaluate the effectiveness of grid for this task
• To deliver grid-enabled technologies that underpin the application
• To investigate performance issues
Computational Infrastructure
Leeds Local Grid
Onyx 3: 3D Interactive Graphics & Conferencing
Lab. Machines
teradata Cluster
Shared Mem.
White Rose Computational Grid (SAN)
York: Shared Memory
Sheffield: Dist. Memory
Running across YHMAN
SuperJANET
MyGrid
Personalised extensible environments for data-intensive experiments in biology
Professor Carole Goble, University of Manchester
Dr Alan Robinson, EBI
Consortium
• Scientific Team
– Biologists
– GSK, AZ, Merck KGaA, Manchester, EBI
• Technical Team
– Manchester, Southampton, Newcastle, Sheffield, EBI, Nottingham
– IBM, SUN
– GeneticXchange
– Network Inference, Epistemics Ltd
Comparative Functional Genomics
• Vast amounts of data & escalating
• Highly heterogeneous
– Data types
– Data forms
– Community
• Highly complex and inter-related
• Volatile
MyGrid e-Science Objectives
Revolutionise scientific practice in biology
• Straightforward discovery, interoperation, sharing
• Improving quality of both experiments and data
• Individual creativity & collaborative working
• Enabling genomic level bioinformatics
Cottage Industry to an Industrial Scale
On the shoulders of giants
We are not starting from scratch…
• Globus Starter Kit …
• Web Service initiatives …
• Our own environments …
• Integration platforms for bioinformatics …
• Standards e.g. OMG LSR, I3C …
• Experience with Open Source
Specific Outcomes
• E-Scientists
– Environment built on toolkits for service access, personalisation & community
– Gene function expression analysis
– Annotation workbench for the PRINTS pattern database
• Developers
– MyGrid-in-a-Box developers kit
– Re-purposing existing integration platforms
Discovery Net
• Yike Guo, John Darlington (Dept. of Computing)
• John Hassard (Depts. of Physics and Bioengineering)
• Bob Spence (Dept. of Electrical Engineering)
• Tony Cass (Department of Biochemistry)
• Sevket Durucan (T. H. Huxley School of Environment)
• Imperial College London
Discovery Net
AIM
• To design, develop and implement an infrastructure to support real time processing, interaction, integration, visualisation and mining of massive amounts of time critical data generated by high throughput devices.
The Consortium
• Industry Connection : 4 Spin-off companies + related companies (AstraZeneca, Pfizer, GSK, Cisco, IBM, HP, Fujitsu, Gene Logic, Applera, Evotec, International Power, Hydro Quebec, BP, British Energy, ….)
Industrial Contribution
• Hardware : sensors (photodiode arrays), systems (optics, mechanical systems, DSPs, FPGAs)
• Software (analysis packages, algorithms, data warehousing and mining systems)
• Intellectual Property: access to IP portfolio suite at no cost
• Data: raw and processed data from biotechnology, pharmacogenomic, remote sensing (GUSTO installations, satellite data from geo-hazard programmes) and renewable energy data (from remote tidal power systems)
High Throughput Sensing
Characteristics
Different Devices but same computational characteristics
• Data intensive & data dispersive
• Large scale, heterogeneous, distributed data
• Real-time data manipulation
Need to
• calibrate
• integrate
• analyse
GRID issues:
Data issues:
Information issues:
Discovery issues:
Distributed Devices
Distributed warehousing
Distributed Reference DBs
Distributed Users
Collaborative applications
Testbed Applications
HTS Applications
Large-scale Dynamic Real-time Decision support
Large-scale Dynamic System Knowledge Discovery
Bio Chip Applications
Protein-folding chips: SNP chips, Diff. Gene chips using LFII
Protein-based fluorescent micro arrays
Renewable energy Applications
Tidal Energy
Connections to other renewable initiatives (solar, biomass, fuel cells), & to CHP and baseload stations
Remote Sensing Applications
Air Sensing, GUSTO
Geological, geohazard analysis
1-100
10-100
>50000
Image Registration, Visualisation, Predictive Modelling, RT decisions
1-1000
10-1000
>10000
Data Quality, Visualisation, Structuring, Clustering, Distributed Dynamic Knowledge Management
Throughput(GB/s)
Size(petabytes)
Node Number
operations
1-10 1-10
>20000
Structuring, Mining, Optimisation, RT decisions
Large-scale urban air sensing applications
Each GUSTO air pollution system produces 1 kbit per second, or 10^10 bits per year. We expect to increase the number (from the present 2 systems) to over 20,000 over the next 3 years, to reach a total of 0.6 petabytes of data within the 3-year ramp-up.
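The per-system data rate quoted above can be turned into yearly volumes with a short back-of-envelope calculation. The sketch below assumes continuous operation over a 365-day year and reports results in both bits and bytes so they can be compared with the slide's order-of-magnitude figures; the function names are illustrative only.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: non-leap year, no downtime


def bits_per_year(rate_bits_per_s: float) -> float:
    """Bits produced in one year by a single sensor system."""
    return rate_bits_per_s * SECONDS_PER_YEAR


def fleet_bytes_per_year(n_systems: int, rate_bits_per_s: float) -> float:
    """Bytes produced in one year by n_systems identical systems."""
    return n_systems * bits_per_year(rate_bits_per_s) / 8


RATE = 1_000        # 1 kbit per second per system, as quoted
SYSTEMS = 20_000    # planned number of systems, as quoted

per_system_bits = bits_per_year(RATE)
fleet_bits = SYSTEMS * per_system_bits
fleet_bytes = fleet_bytes_per_year(SYSTEMS, RATE)

print(f"one system : {per_system_bits:.2e} bits/year")
print(f"full fleet : {fleet_bits:.2e} bits/year")
print(f"full fleet : {fleet_bytes / 1e12:.1f} TB/year")
```

The totals during the ramp-up would be lower than the full-fleet figure, since the number of systems grows over the three years.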
GUSTO
GUSTO
NO
simulant 6.7.2001
The useful information comes from time-resolved correlations among remote stations, and with other environmental data sets.
You are here
The IC Advantage
The IC infrastructure: microgrid for the testbed
ICPC Resource
+20 TB of disk storage
+25 TB of tape storage
3 Clusters
(> 1 Tera Flops)
Network upgrade
More than 12000 end devices
10 Mb/s – 1Gb/s to end devices
1 Gb/s between floors
10 Gb/s to backbone
10 Gb/s between backbone router matrix and wireless capability
2x1Gb/s to LMAN II
(10Gb/s scheduled 2004)
Access to disparate off-campus sites: IC hospitals, Wye College etc.
Core router switches
Building router switches
Floor switches
End devices
Core Fibre
Core to Building Fibre
Building Riser Fibre
Cat 5 floor wiring
London MAN / JANET
Proposed firewall
workstation cluster
storage
SMP
Central Computing Facilities
wireless
£3m SRIF funding
150 Gflops Processing
>100 GB Memory
5 TB of disk storage
Conclusions
• Good ‘buy-in’ from scientists and engineers
• Considerable industrial interest
• Reasonable ‘buy-in’ from good fraction of Computer Science community but not all
• Serious interest in Grids from IBM, HP, Oracle and Sun
• On paper UK now has most visible and focussed e-Science/Grid programme in Europe
Now have to deliver!
US Grid Projects/Proposals
• NASA Information Power Grid
• DOE Science Grid
• NSF National Virtual Observatory
• NSF GriPhyN
• DOE Particle Physics Data Grid
• NSF Distributed Terascale Facility
• DOE ASCI Grid
• DOE Earth Systems Grid
• DARPA CoABS Grid
• NEESGrid
• NSF BIRN
• NSF iVDGL
EU Grid Projects
• DataGrid (CERN, ..)
• EuroGrid (Unicore)
• DataTag (TTT…)
• Astrophysical Virtual Observatory
• GRIP (Globus/Unicore)
• GRIA (Industrial applications)
• GridLab (Cactus Toolkit)
• CrossGrid (Infrastructure Components)
• EGSO (Solar Physics)