The Earth System Grid (ESG)
Goals, Objectives and Strategies
DOE SciDAC ESG Project Review
Argonne National Laboratory, Illinois
May 8-9, 2003
May 8, 2003 Earth System Grid 2
Presentation Agenda
ESG Goals
ESG Objectives
ESG Strategies
Summary of Goals, Objectives, and Strategies
Part I
ESG
Goals
ESG: Problem Statement
Climate study is fundamentally multidisciplinary. As we strive to understand its complexity, researchers from different fields and different locations must become engaged in large multinational teams to tackle these “Grand Challenge” problems.
Need a software infrastructure to support this multidisciplinary “Virtual Organization” (VO)
  Community code (open/modular/shared simulation codes)
  Tools that support collaboration and data sharing
  Location-independent equal access to shared resources (data, visualization, supercomputers, experiments, whiteboard, etc.)
ESG: Goals
ESG is funded by the Scientific Discovery through Advanced Computing (SciDAC) program. Its goal is to make climate resources, particularly climate model data, easily accessible to the climate community.
Enabling researchers to understand and make effective use of very large, distributed climate datasets is critical. The broad strategy is to develop a collection of server-side capabilities that minimize the amount of data movement.
A “Collaboratory Pilot Project” – Built upon ESG-I, Globus Toolkit, and other DataGrid & Web technologies
Multiple interfaces to ESG will allow researchers to focus on science and not issues with data receipt, format, and data set manipulation.
ESG: Benefits
Improved overall climate research
Improved collaboration between national and international institutions, groups, and agencies
Climate modelers will have greater access to simulated and observed data for fine-tuning their models
Climate researchers will have the freedom to browse and diagnose model data for inter-comparison studies with weather data, without data-format restrictions
Part II
ESG
Objectives
ESG: Objectives
Assemble a team of computer scientists and domain scientists to collaborate on delivering grid technology in the service of key climate science questions.
Allow access to retrospective climate data (input and output) needed to enable a feedback mechanism to tie researchers directly back to quality control and diagnostics of models.
Allow researchers access to “format independent” climate and observational data for case-study & training.
In the U.S., climate simulation can be viewed as a systems problem; allow a team from multiple agencies and institutions to work together as a “Virtual Organization” (VO).
Part III
ESG
Strategies
ESG: Strategies
Move data a minimal amount; keep it close to its computational point of origin when possible
  Data access protocols, distributed analysis
When we must move data, do it fast and with a minimum amount of human intervention
  Storage Resource Management, fast networks
Keep track of what we have, particularly what’s on deep storage
  Metadata and Replica Catalogs
Harness a federation of sites, data portals
  Globus Toolkit -> The Earth System Grid -> The UltraDataGrid
Leverage existing software and projects
  Collaborate with other national and international groups and agencies with similar ESG interests and goals
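The "keep track of what we have" and "move data a minimal amount" strategies can be illustrated with a toy replica catalog. This is only a sketch under stated assumptions: the class names, the site names, the `hrm://` and host URLs, and the preference rule (local copy first, then disk before deep storage) are all hypothetical and are not taken from the ESG software.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    site: str               # hypothetical site name
    url: str                # hypothetical physical location of this copy
    on_deep_storage: bool   # True if the copy must first be staged from tape


@dataclass
class ReplicaCatalog:
    # Maps a logical file name to every known physical copy of that file.
    entries: dict = field(default_factory=dict)

    def register(self, logical_name: str, replica: Replica) -> None:
        self.entries.setdefault(logical_name, []).append(replica)

    def best_replica(self, logical_name: str, local_site: str) -> Replica:
        replicas = self.entries.get(logical_name, [])
        if not replicas:
            raise KeyError(f"no replica registered for {logical_name}")
        # Assumed preference: a copy at the local site beats a remote one,
        # and among equals a disk copy beats one on deep storage.
        return min(
            replicas,
            key=lambda r: (r.site != local_site, r.on_deep_storage),
        )


catalog = ReplicaCatalog()
catalog.register(
    "ccsm/run1/tas.nc",
    Replica("site-a", "hrm://site-a.example/archive/tas.nc", True),
)
catalog.register(
    "ccsm/run1/tas.nc",
    Replica("site-b", "gsiftp://site-b.example/disk/tas.nc", False),
)

# A user at site-a stays local; a user elsewhere gets the disk copy.
print(catalog.best_replica("ccsm/run1/tas.nc", "site-a").site)
print(catalog.best_replica("ccsm/run1/tas.nc", "site-c").site)
```

A real catalog would also weigh network distance and staging time; the two-part sort key here is just the simplest rule that reflects the slide's priorities.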
Storage/Data Management
[Diagram labels: Client (selection, control, monitoring); HRM; two Servers, each with an HRM and a Tera/Peta-scale Archive; tools for reliable staging, transport, and replication]
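To make "reliable staging, transport, and replication" with minimal human intervention concrete, here is a minimal sketch of a retry loop that verifies each transfer with a checksum. It assumes nothing about how HRM or the ESG tools actually work: `reliable_transfer`, `make_flaky_source`, the SHA-256 check, and the retry policy are all illustrative choices.

```python
import hashlib
import time


def checksum(data: bytes) -> str:
    # SHA-256 is an illustrative choice, not the ESG tools' actual method.
    return hashlib.sha256(data).hexdigest()


def reliable_transfer(fetch, expected_digest, max_attempts=3, delay_s=0.0):
    """Call fetch() until the returned bytes match expected_digest.

    Returns (data, attempts_used); raises RuntimeError if every attempt fails.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            data = fetch()
            if checksum(data) == expected_digest:
                return data, attempt
            last_error = ValueError("checksum mismatch")
        except OSError as exc:
            last_error = exc
        if attempt < max_attempts:
            # Linear backoff before the next try.
            time.sleep(delay_s * attempt)
    raise RuntimeError(
        f"transfer failed after {max_attempts} attempts"
    ) from last_error


def make_flaky_source(payload: bytes, failures: int):
    """Hypothetical data source that fails a fixed number of times first."""
    state = {"left": failures}

    def fetch() -> bytes:
        if state["left"] > 0:
            state["left"] -= 1
            raise OSError("simulated network drop")
        return payload

    return fetch


payload = b"climate model output"
data, attempts = reliable_transfer(
    make_flaky_source(payload, failures=2), checksum(payload)
)
print(attempts)
```

The point of the sketch is that failures are retried and verified automatically, so a person only gets involved when all attempts are exhausted.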
ESG: Distributed Data Access Protocols
[Diagram: three access paths]
Typical Application: Application -> netCDF lib -> Data (local)
Distributed Application: Application -> OPeNDAP Client -> OPeNDAP via HTTP -> OPeNDAP Server (DODS) -> Data (remote)
Gridded Application: Application -> ESG client -> OPeNDAP via Grid -> ESG Server (ESG + DODS) -> Big Data (remote)
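The three access paths can be sketched as a dispatch on where the data lives, plus a server-side subset request so that only the needed slice moves. This is a hedged illustration: the scheme-to-path mapping (in particular `gsiftp` for the Grid path), the function names, and the example URL are assumptions; only the `variable[start:stride:stop]` hyperslab form follows the DAP2 constraint-expression syntax used by OPeNDAP servers.

```python
from urllib.parse import urlparse


def access_path(location: str) -> str:
    """Pick an access path from the location's URL scheme (assumed mapping)."""
    scheme = urlparse(location).scheme
    if scheme in ("", "file"):
        return "netCDF library (local data)"
    if scheme in ("http", "https"):
        return "OPeNDAP client over HTTP (remote data)"
    if scheme == "gsiftp":
        # Assumption: Grid transport is identified by a GridFTP-style URL.
        return "ESG client over Grid transport (big remote data)"
    raise ValueError(f"unsupported scheme: {scheme!r}")


def dap_subset_url(base_url: str, variable: str, slices) -> str:
    """Build a DAP2-style constraint: variable[start:stride:stop] per dimension."""
    dims = "".join(
        f"[{start}:{stride}:{stop}]" for start, stride, stop in slices
    )
    return f"{base_url}?{variable}{dims}"


print(access_path("/data/tas.nc"))
print(access_path("http://server.example/dods/tas.nc"))
print(access_path("gsiftp://server.example/archive/tas.nc"))

# Ask the server for a slice instead of downloading the whole file.
print(dap_subset_url(
    "http://server.example/dods/tas.nc", "tas", [(0, 1, 11), (10, 1, 20)]
))
```

In all three cases the application code above the access layer stays the same, which is the point the diagram makes.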
ESG: Portal Client Layers
Thin Client: Slow interaction, but you know it’s going to work!
  Delivery: HTML to any web browser
  Users: No time investment
Slender Client: Faster interaction, but primary work on remote server.
  Delivery: NCL, Python modules, signed applications, tiny binaries
  Users: Some time investment in acquiring modules
Thick Clients: Portal merely a data broker between distributed resources and your helper application.
  Delivery: Standalone applications of any sort
  Users: More significant time investment to install helper application (e.g., CDAT, NCL)
ESG: Leveraging Off of Existing Software and Projects
OPeNDAP (DODS): Distributed Oceanographic Data System (Unidata)
  Integrations of Globus GridFTP, DODS data access
THREDDS: THematic Real‑time Environmental Distributed Data Services (Unidata)
LAS: Live Access Server (NOAA Pacific Marine Environmental Laboratory)
  Works with NCL, CDAT, Ferret, GrADS, …
CDAT: Climate Data Analysis Tools (PCMDI), includes CDMS: Climate Data Management System, VCDAT visualization
Community Data Portal project (NCAR)
NCL: NCAR Command Language
MFT: Multiple File Transfer (LBNL), includes HRM: Hierarchical Resource Manager
Globus Grid technology (ANL, ISI): GridFTP, CAS (Community Authorization Service), GRAM (Globus Resource Allocation Manager)
ESG: Collaborations & Relationships
Large Grid Projects under negotiation with ESG
CCSM Data Management Group
Other SciDAC Projects: Climate, Security & Policy for Group Collaboration, Scientific Data Management ISIC, & High-performance DataGrid
Earth Science Portal (ESP) Group
e-Science: NERC DataGrid, CLRC
Earth System Modeling Framework (ESMF)
NOAA Operational Model Archive and Distribution System (NOMADS)
Committee on Earth Observation Satellites (CEOS)
[Diagram labels: Grid Infrastructure; Grid Application Toolkit (Middleware): Brokers, Info, Schedule, Data, Monitor, Security; Remote Data Toolkit, Remote Calc. Toolkit, Remote Viz Toolkit, Generic Apps; Portals and Applications (User, Adm., Generic); U.S. Users, U.K. Users, CDAT Users, Ferret Users, Climate Community, Commercial Users, University Users, Community Outreach; Sponsors; Networks; ESG Grid, U.K. NERC DataGrid, CEOS Grid, Other Grids]
ESG: Collaboration Network
[Diagram labels: Data producers; Data consumers; ESG services: information, replica, metadata, community authorization; RCAS; Computational resources; Online storage systems; Grid and Network Infrastructure]
Part IV
Summary of Goals,
Objectives, and Strategies
ESG: Immediate Directions
Broaden usage of DataMover and refine
Build data catalogs with rich metadata
Release “real” ESG portal
  Search, browse, access
Alpha version of OPeNDAPg
  Test and evaluate with three client applications (ncview, CDAT, & NCL)
Move software and web portals into the hands of serious users, and get feedback!
Continue to collaborate with others
ESG: Future Directions
The Open Grid Services Architecture (OGSA), server-side analysis
Leverage the work of ESG to meet specific distributed database, data access, and data movement requirements of other DOE agencies
Leverage the work of the Earth Science Portal (ESP) to provide a universal and secure web-based data access portal for the broad-based climate data collections
Merge climate data analysis tools (at NCAR and PCMDI) into one product to provide a wide range of Grid-enabled data analysis tools and diagnostic methods to U.S. Government agencies
Closing Thoughts
Building an environment for the long term
  Difficult, expensive, and time-consuming
  But a worthwhile investment
Team-building is a critical process
  Collaboration technologies really help (e.g., Access Grid)
Managing all the collaborations is a challenge
  But extremely valuable
Good progress, real use cases