Earth System CoG and the Earth System Grid Federation: A Partnership for Improved Data Management and Project Coordination. NOAA ESRL Seminar, April 8, 2014


  • Slide 1
  • Earth System CoG and the Earth System Grid Federation: A Partnership for Improved Data Management and Project Coordination
      • NOAA ESRL Seminar, April 8, 2014, Boulder, CO
      • Sylvia Murphy (NOAA/CIRES) ([email protected]), Luca Cinquini (JPL/NOAA), Cecelia DeLuca (NOAA/CIRES), Allyn Treshansky (NOAA/CIRES)
  • Slide 2
  • Presentation Outline
      • ESGF-CoG Integration
      • Overview of ESGF
      • ESGF Architecture and Local Data Holdings
      • Overview of CoG
      • CoG Capabilities (Live Demo)
      • ESGF-CoG Integration Development Tasks
      • Upcoming Tutorials
  • Slide 3
  • ESGF-CoG Integration
      • ESGF is an international, federated data archive focused on climate projects.
      • CoG is a collaboration environment and hub to connect projects in the Earth sciences.
      • CoG is going to become the new front end for ESGF. This will give ESGF users and data managers a better interface in terms of:
          • Overall usability
          • Content management
          • Model Intercomparison Project (MIP) support
          • Multi-project support
          • Online collaboration tools
      • Reference: 3rd Annual Earth System Grid Federation and Ultrascale Visualization Climate Data Analysis Tools Face-to-Face Meeting Report December (http://aims-group.github.io/pdf/ESGF_UV-CDAT_Meeting_Report_March2014.pdf)
  • Slide 4
  • ESGF Overview
      • The Earth System Grid Federation (ESGF) is a multi-agency, international collaboration of people and institutions working together to build an open source software infrastructure for the management and analysis of Earth science data on a global scale.
      • The collaboration is led by PCMDI and includes institutions from several agencies in the U.S.A. (DOE, NASA, NOAA), Canada, Europe (IS-ENES-2), Australia, and Asia.
      • ESGF manages and serves a global archive of climate data including:
          • CMIP5 model output (basis of IPCC-AR5). Possibly the largest modeling effort in history: 40+ models, 25+ modeling centers, 17 countries, 2 PB of data
          • Obs4MIPs: selected observations from NASA and DOE, specially formatted for comparison with and evaluation of CMIP5 models
          • Ana4MIPs: reanalysis data, also formatted as CMIP5 model output
          • CORDEX: regional climate models, 2 PB of data
          • TAMIP: atmospheric model intercomparison
          • GeoMIP: geo-engineering model intercomparison
          • DCMIP: atmospheric dynamical core model intercomparison
      • WCRP has recommended use of the ESGF infrastructure for all future MIPs.
  • Slide 5
  • ESGF System Architecture
      • ESGF is a system of distributed and federated Nodes that interact dynamically through a Peer-To-Peer (P2P) protocol.
          • Distributed: data and metadata are published, stored, and served from multiple centers (Nodes).
          • Federated: Nodes interoperate because of the adoption of common services, protocols, and APIs, and the establishment of mutual trust relationships.
          • Dynamic: Nodes can join or leave the federation dynamically; global data and services change accordingly.
      • A client (browser or program) can start from any Node in the federation and discover, download, and analyze data from multiple locations as if they were stored in a single central archive.
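The distributed, federated, and dynamic properties described on this slide can be illustrated with a toy in-memory model. This is only a sketch: the `Node` and `Federation` classes, the node names, and the dataset records are hypothetical, and a plain dictionary stands in for the real P2P membership protocol.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One node holding its own local dataset records (hypothetical model)."""
    name: str
    datasets: list[dict] = field(default_factory=list)

    def search(self, **facets: str) -> list[dict]:
        # Keep records whose facet values equal every requested value.
        return [d for d in self.datasets
                if all(d.get(k) == v for k, v in facets.items())]


class Federation:
    """Toy registry standing in for the peer-to-peer membership protocol."""

    def __init__(self) -> None:
        self.nodes: dict[str, Node] = {}

    def join(self, node: Node) -> None:
        self.nodes[node.name] = node

    def leave(self, name: str) -> None:
        self.nodes.pop(name, None)

    def search(self, **facets: str) -> list[dict]:
        # Fan the query out to every current member and merge the hits,
        # tagging each hit with the node that serves it.
        hits = []
        for node in self.nodes.values():
            for record in node.search(**facets):
                hits.append({**record, "node": node.name})
        return hits


# Illustrative records only; the ids are made up.
fed = Federation()
fed.join(Node("esrl", [{"id": "reanalysis-1", "project": "Ana4MIPs"}]))
fed.join(Node("pcmdi", [{"id": "model-run-1", "project": "CMIP5"}]))

print(fed.search(project="CMIP5"))  # hit is tagged with node "pcmdi"
fed.leave("pcmdi")
print(fed.search(project="CMIP5"))  # no hits once that node has left
```

The client only ever talks to the `Federation` object, which mirrors the idea that a user can start from any Node and see holdings from all of them.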
  • Slide 6
  • ESGF Software Stack
      • Software components can be grouped into four areas of functionality, the Node "flavors":
          • Data Node: secure data publication and access
          • Index Node: metadata indexing and searching
          • Identity Provider: user authentication and group membership
          • Compute Node: analysis and visualization
      • The ESGF software stack is based on the integration of several applications and APIs:
          • Open source engines (Postgres, Tomcat, Solr)
          • Geo-spatial servers (THREDDS Data Server, Live Access Server)
          • Industry standards (OpenSSL, X.509, OpenID, REST)
          • Custom ESGF software
      • Node flavors can be installed in various combinations depending on site needs, or to achieve higher performance and scalability.
      • All ESGF software is open source (BSD License) and freely available on GitHub: https://github.com/ESGF
  • Slide 7
  • ESGF ESRL Node
      • NOAA/ESRL is hosting a full-featured ESGF Node: http://hydra.fsl.noaa.gov/
      • Node system administrator: Doug Ohlhorst (big thanks!)
      • Available data collections:
          • Ana4MIPs 20th Century Reanalysis (Gil Compo, Cathy Smith)
          • DCMIP-2012 (Atmospheric Dynamical Core Inter-Comparison workshop at NCAR, led by Christiane Jablonowski), including the NOAA FIM model
          • QED-2013 (Quantitative Evaluation of Downscaling workshop at NCAR, sponsored by the National Climate Projection and Prediction (NCPP) project)
      • The ESRL Node is part of the ESGF federation:
          • ESRL collections can be accessed and discovered from other ESGF sites.
          • Conversely, a user can start from the ESRL Node and find CMIP5 data throughout ESGF.
      • Figure: Vertical mesh layout from FIM test 5-1 (idealized tropical cyclone) conducted during DCMIP-2012.
  • Slide 8
  • Summary of ESGF Achievements
      • ESGF represents a significant step forward for the management of and access to climate data worldwide:
          • Established the first global, distributed database of petabytes (PB) of climate model output and observations
          • Data can be discovered through a federated faceted search or a RESTful API
          • Data download can be scripted and executed by programs
          • Users need to register only once and can then authenticate everywhere
          • The architecture is scalable (for increased model and instrument resolution and rates) and extensible (to other formats, repositories, and scientific domains)
      • ESGF has established an open source collaboration across agencies and international boundaries.
      • Image courtesy of NCAR/CGD
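As a rough illustration of discovery through the RESTful API, the sketch below only builds a search URL and sends nothing over the network. The `/esg-search/search` path and the `format`, `limit`, and `distrib` parameters reflect the ESGF search API as commonly documented, but treat them as assumptions to check against a node's own documentation. The helper name `build_search_url` and the `variable="tas"` constraint are illustrative, and the host is one of the 2014 nodes listed in this deck, so it may no longer be live.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_search_url(base_url: str, limit: int = 10, distrib: bool = True,
                     **facets: str) -> str:
    """Return a search URL for an ESGF index node (no request is sent)."""
    params = {
        "format": "application/solr+json",  # ask for JSON output
        "limit": limit,                      # maximum number of results
        "distrib": str(distrib).lower(),     # "true" = search the whole federation
        **facets,                            # facet constraints, e.g. project=CMIP5
    }
    return f"{base_url.rstrip('/')}/esg-search/search?{urlencode(params)}"


# Host taken from the node list in this deck; facet values are examples.
url = build_search_url("http://pcmdi9.llnl.gov", project="CMIP5", variable="tas")
print(url)
print(parse_qs(urlparse(url).query))  # decoded query parameters
```

Because the result is an ordinary URL, the same call can be made from a browser, a shell script, or any HTTP client, which is what makes scripted discovery possible.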
  • Slide 9
  • Overview of CoG
      • CoG is a collaboration environment and hub to connect projects in the Earth sciences.
          • It hosts software development projects, model intercomparison projects (MIPs), university short courses, and workshops.
          • It includes a configurable search of data on any ESGF data node.
          • It provides projects with a wiki and customizable navigation to wiki content. Projects, files, or pages can be made private.
          • It contains an ontology for the description and management of projects and provides a consolidated look at this content across a project network.
          • It contains a file server for documents and images.
          • It provides services for Earth system model metadata collection and display.
      • Some of the 74 projects hosted on CoG include:
          • NOAA's High Impact Weather Prediction Project (HIWPP)
          • Atmospheric Dynamical Core Model Intercomparison Project (DCMIP)
          • Reanalysis Data for CMIP5 (Ana4MIPs)
          • Observational Data for CMIP5 (Obs4MIPs)
          • National Unified Operational Prediction Capability (NUOPC)
          • National Climate Predictions and Projections Platform (NCPP)
          • Earth System Documentation (ES-DOC)
          • Earth System Prediction Capability (ESPC)
      • CoG Development Partners
  • Slide 10
  • Who's Using CoG
      • HIWPP (NOAA): https://www.earthsystemcog.org/projects/hiwpp/
      • NCPP (NOAA): https://www.earthsystemcog.org/projects/ncpp/
      • Ana4MIPs (NOAA): https://www.earthsystemcog.org/projects/ana4mips/
      • NUOPC (Navy, USAF, NOAA): https://www.earthsystemcog.org/projects/nuopc/
      • DCMIP-2012: https://www.earthsystemcog.org/projects/dcmip-2012/
  • Slide 11
  • Wiki and Collaboration Tools
      • https://www.earthsystemcog.org/projects/dcmip-2012/
      • The CoG layout is color-coded:
          • The right-hand side (dark yellow) is where services (data, news, project connectivity) are located.
          • The upper navigation bar (dark teal) contains links to project-level metadata.
          • On the left (light teal) is an auto-generated navigation system created when projects develop freeform content.
          • The central portion of the site is a wiki that allows projects to create their own content.
      • Figure: Screenshot of the CoG project workspace for the 2012 Dynamical Core Model Intercomparison (DCMIP) Workshop.
  • Slide 12
  • Customizable Data Services: Interfacing with ESGF
      • https://www.earthsystemcog.org/search/downscaling-2013/
      • The search widget can be turned on or off.
      • Search can be narrowed to any ESGF node and to any project (e.g. CMIP).
      • Search facets can be created, deleted, and grouped.
      • Help text can be added to the top of the search page.
      • Search results can be saved to a Data Cart associated with a user. Items in the Data Cart persist.
      • Search results can be:
          • Forwarded to the Live Access Server (LAS) for simple visualization.
          • Downloaded directly via a WGET script.
          • Associated with model metadata if it exists.
  • Slide 13
  • ESGF Search Customization
      • https://www.earthsystemcog.org/search/downscaling-2013/
  • Slide 14
  • Data Cart
      • Items in the Data Cart can be sent individually or collectively to LAS or WGET.
      • The Data Cart is associated with a user and not a project.
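The cart behavior described above (one cart per user rather than per project, with items sent one at a time or all together to LAS or WGET) might be modeled as follows. This is a hypothetical in-memory sketch, not CoG's implementation: the `DataCarts` class, its method names, and the dataset ids are invented for illustration, and `send` only describes a request instead of dispatching one.

```python
from __future__ import annotations

from collections import defaultdict

TARGETS = {"LAS", "WGET"}  # the two destinations named on the slide


class DataCarts:
    """In-memory carts keyed by user name, independent of any project."""

    def __init__(self) -> None:
        self._carts: defaultdict[str, list[str]] = defaultdict(list)

    def add(self, user: str, dataset_id: str) -> None:
        cart = self._carts[user]
        if dataset_id not in cart:  # keep each dataset once, in insertion order
            cart.append(dataset_id)

    def items(self, user: str) -> list[str]:
        return list(self._carts[user])

    def send(self, user: str, target: str,
             dataset_ids: list[str] | None = None) -> dict:
        """Describe a request for some or all cart items; nothing is dispatched."""
        if target not in TARGETS:
            raise ValueError(f"unknown target: {target}")
        cart = self._carts[user]
        chosen = cart if dataset_ids is None else [d for d in cart if d in dataset_ids]
        return {"user": user, "target": target, "datasets": list(chosen)}


carts = DataCarts()
carts.add("alice", "dataset-1")
carts.add("alice", "dataset-2")
print(carts.send("alice", "WGET"))                # all items collectively
print(carts.send("alice", "LAS", ["dataset-2"]))  # a single item
```

Keying the store by user alone is what lets the same cart follow a user from one project's search page to another.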
  • Slide 15
  • Show Metadata
  • Slide 16
  • Project Networks and the Project Browser
      • https://www.earthsystemcog.org/projects/nesii/
      • Projects in CoG are arranged in a hierarchy of Parents, Peers, and Children.
      • The Project Browser displays the network and allows for inter-project navigation.
      • Projects can be tagged with keywords and searched for by keyword.
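The parent, peer, and child relations and keyword tags could be represented along these lines. This is a minimal sketch under stated assumptions: the `Project` and `ProjectNetwork` classes, the project names, and the tags are hypothetical and say nothing about how CoG stores its network.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Project:
    name: str
    tags: set[str] = field(default_factory=set)
    parents: set[str] = field(default_factory=set)
    peers: set[str] = field(default_factory=set)
    children: set[str] = field(default_factory=set)


class ProjectNetwork:
    def __init__(self) -> None:
        self.projects: dict[str, Project] = {}

    def add(self, name: str, *tags: str) -> Project:
        # Create the project on first use, then merge in any new tags.
        project = self.projects.setdefault(name, Project(name))
        project.tags.update(tags)
        return project

    def link_parent(self, child: str, parent: str) -> None:
        # Record the relation on both sides so either project can navigate it.
        self.add(child).parents.add(parent)
        self.add(parent).children.add(child)

    def link_peers(self, a: str, b: str) -> None:
        self.add(a).peers.add(b)
        self.add(b).peers.add(a)

    def browse(self, name: str) -> dict[str, list[str]]:
        """Return what a project browser might list for one project."""
        p = self.projects[name]
        return {"parents": sorted(p.parents), "peers": sorted(p.peers),
                "children": sorted(p.children)}

    def find_by_tag(self, tag: str) -> list[str]:
        return sorted(n for n, p in self.projects.items() if tag in p.tags)


# Made-up projects and tags, for illustration only.
net = ProjectNetwork()
net.add("hub", "coordination")
net.add("mip-a", "intercomparison")
net.add("mip-b", "intercomparison")
net.link_parent("mip-a", "hub")
net.link_parent("mip-b", "hub")
net.link_peers("mip-a", "mip-b")

print(net.browse("mip-a"))
print(net.find_by_tag("intercomparison"))
```

Storing each relation symmetrically keeps navigation cheap in both directions, at the cost of having to update two records per link.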
  • Slide 17
  • CoG Schema
      • https://www.earthsystemcog.org/projects/cog/
      • The CoG schema contains classes to describe software development projects, short courses or meetings, and overall project coordination.
      • Projects select which metadata to display via a simple web form.
      • Project-level metadata is linked in standardized locations via the upper navigation bar.
  • Slide 18
  • Project-level Metadata Roll-up
      • https://www.earthsystemcog.org/projects/es-doc-models/aboutus/
      • Management of information is a major problem in projects that involve many sub-projects, partners, multiple leads, and many resources.
      • CoG acts as an index into the project information needed for coordination and collaboration, and gives the people responsible for overall coordination quick, consolidated views of that information.
      • This example shows the Partners feature, which allows projects to list their project partners and include a logo for each. Below the list for ES-DOC is a consolidated view of the partners of ES-DOC's peer projects.
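A roll-up of this kind amounts to showing a project's own list and then the merged, de-duplicated lists of its peers. The sketch below assumes two plain dictionaries as input; the function name `rollup_partners`, the project names, and the partner names are all hypothetical.

```python
from __future__ import annotations


def rollup_partners(project: str,
                    partners: dict[str, list[str]],
                    peers: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return the project's own partners plus a consolidated peer view."""
    own = sorted(partners.get(project, []))
    # Merge every peer's partner list into one de-duplicated, sorted list.
    consolidated = sorted({partner
                           for peer in peers.get(project, [])
                           for partner in partners.get(peer, [])})
    return {"own": own, "peers": consolidated}


# Made-up data, for illustration only.
partners = {
    "docs": ["Agency A", "Lab B"],
    "docs-tools": ["Lab B", "University C"],
    "docs-models": ["University C", "Institute D"],
}
peers = {"docs": ["docs-tools", "docs-models"]}

print(rollup_partners("docs", partners, peers))
```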
  • Slide 19
  • Resources
      • https://www.earthsystemcog.org/projects/es-doc-models/resources/
      • Resources are pointers to data, files, and URLs.
      • Resource folders can be created, moved, and deleted.
      • Projects can turn on a set of standardized Resource folders (e.g. Presentations, Minutes).
      • Data searches can be saved as a Resource.
      • Each Resource can have a private wiki-based notes page to facilitate discussions.
  • Slide 20
  • News
      • https://www.earthsystemcog.org/projects/climatetranslator/
      • News is a way to send announcements across a project network.
      • News is visible in the news widget on any targeted project.
      • News will be added to social media (Google+, Facebook, Twitter, RSS) in a future release.
  • Slide 21
  • Model Metadata Services
      • The CoG Team is partnering with the international Earth System Documentation (ES-DOC) project to develop and use an Earth System Model metadata entry and view capability.
      • The ES-DOC Viewer is a lightweight JavaScript plugin that will display any Common Information Model (CIM) record.
      • The ES-Questionnaire collects standardized CIM metadata through a highly customizable web form. The output is saved to a community CIM repository.
  • Slide 22
  • CoG-ESGF Future Work
      • Requirements are coming from HIWPP, CMIP6, the ESGF integration, and other projects.
      • CMIP6 will include a set of interconnected MIPs. Work is starting on the CMIP6 sites.
      • CoG is going to replace the ESGF web front end. Work should be completed by the end of summer 2014, with a production system in place by the end of the year.
      • CoG will be federated so that projects hosted on one CoG-ESGF instance will be visible on others.
      • CoG is being modified to conform to Federal and DOC requirements.
          • OpenID access will be added to CoG, which will improve the security of the site.
          • The local CoG URL will be changed to a .gov address.
  • Slide 23
  • Webinar/Tutorials
      • Fridays at 10am Mountain Time
      • Contact [email protected] for more information.
      • Other group or individual sessions are available on demand.
      • Scheduled sessions:
          • 11 Apr: HIWPP
          • 02 May: ESPC
  • Slide 24
  • Questions? [email protected]
      • CoG: https://earthsystemcog.org/
      • ESRL ESGF data node: http://hydra.fsl.noaa.gov/esgf-web-fe/
      • PCMDI ESGF data node: http://pcmdi9.llnl.gov/esgf-web-fe/
      • JPL ESGF data node: http://esg-datanode.jpl.nasa.gov/esgf-web-fe/