Leveraging Globus Services to Support Climate Model Data Access Through the Earth System Grid Federation (ESGF)

Brian Knosp (1), Luca Cinquini (1), Lukasz Lacinski (2), Rachana Ananthakrishnan (2), and Robert Ferraro (1)

(1) Jet Propulsion Laboratory, California Institute of Technology
(2) Globus at the University of Chicago



The Earth System Grid Federation

The Earth System Grid Federation (ESGF) is an international collaborative effort that focuses on supporting climate science. The federation comprises several geographically dispersed data nodes. The ESGF’s CoG system acts as a common gateway through which users search for and download data, regardless of where the data are hosted.

NASA JPL participates in the ESGF and hosts one of its data nodes

©2015 California Institute of Technology; Government Sponsorship Acknowledged


The NASA-JPL ESGF Node

• The NASA-JPL node most notably hosts obs4MIPs data that can be easily compared with CMIP5 climate model output

• Like other ESGF nodes, the NASA-JPL ESGF node offers multiple ways to download data:
– wget
– HTTP
– OPeNDAP
– LAS
– Globus
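Two of the methods listed above, plain HTTP and OPeNDAP, typically differ only in the URL form when a data node serves files through a THREDDS Data Server. The sketch below assumes that layout (`/thredds/fileServer/` for HTTP, `/thredds/dodsC/` for OPeNDAP); the function name, hostname, and dataset path are hypothetical and not taken from the NASA-JPL node.

```python
from urllib.parse import urlsplit, urlunsplit


def http_to_opendap(url):
    """Rewrite a THREDDS HTTP download URL into its OPeNDAP form.

    Assumes the conventional THREDDS service prefixes; a node
    configured differently would need other prefixes.
    """
    parts = urlsplit(url)
    if "/thredds/fileServer/" not in parts.path:
        raise ValueError("not a THREDDS fileServer URL")
    # Swap only the first occurrence of the service prefix.
    path = parts.path.replace("/thredds/fileServer/", "/thredds/dodsC/", 1)
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


# Hypothetical host and path, for illustration only.
example = "http://esg.example.org/thredds/fileServer/obs/file.nc"
print(http_to_opendap(example))
```

The HTTP form downloads the whole file, while the OPeNDAP form lets a client request only selected variables or subsets.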


Implementation Goals

• Looking for a way for users to quickly download data with fewer clicks

• Process had to be well documented and established

• Had to be a cross-platform solution

• Any third-party software had to be easy to install


Deciding on a Technology

• We evaluated several data download technologies and eventually decided to pursue Globus as the most feature-rich solution

• Globus downloads were already available in the ESGF, but were hampered by multiple authentication points (three or more) and manual endpoint activation


Enabling Globus Transfers

• The Globus integration had to be ported from the old ESGF UI (Java-based) to the new ESGF UI (CoG, Python-based)

• Had to re-install the esg#jpl server to:
– Install a new Globus Connect Server (jplesgnode#public) to support a shared endpoint
– Auto-activate endpoints
– Make ESGF CoG a trusted OAuth Globus client

• Now, users only need to log in twice:
– ESGF OpenID login
– Globus login (can persist on the website)


CoG-Globus Download Workflow


Globus Web Downloads

• Requires the user to create a Globus account in addition to their ESGF OpenID account

• User must download Globus Connect

• Downloading through the web is the preferred method for downloading files using Globus on the CoG


Globus Web Downloads


Globus CLI Downloads

• User must sign up for a Globus account

• User must download Globus Connect

• User must upload their SSH key

• This option is mainly meant for power users who prefer a script to a web interface

• Script is a Python wrapper around the Globus CLI

• Only uses Python packages that come with a standard Python install (no extra downloads)

• This option will also be used to enable large transfers between ESGF nodes (for example, to replicate data between climate centers)
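A wrapper of the kind described above could look roughly like the following. This is a minimal sketch, not the ESGF script itself: it assumes the hosted Globus CLI of that period, reached over SSH at cli.globusonline.org, with a `transfer` command that reads "source destination" pairs from standard input. The username, destination endpoint, file paths, and function names are hypothetical; only jplesgnode#public comes from the slides. It uses only the Python standard library, and by default it builds the request without sending it.

```python
import subprocess

# Assumption: address of the hosted Globus CLI, reached over SSH.
CLI_HOST = "cli.globusonline.org"


def build_transfer(user, source_ep, dest_ep, paths, dest_dir):
    """Return (argv, stdin_text) for one batch transfer request.

    Each stdin line pairs a source file with its destination, written
    as endpoint name followed directly by an absolute path.
    """
    argv = ["ssh", f"{user}@{CLI_HOST}", "transfer"]
    lines = []
    for path in paths:
        name = path.rsplit("/", 1)[-1]  # keep the file name only
        lines.append(f"{source_ep}{path} {dest_ep}{dest_dir.rstrip('/')}/{name}")
    return argv, "\n".join(lines) + "\n"


def submit(argv, stdin_text, dry_run=True):
    """Send the request; with dry_run=True nothing is executed."""
    if dry_run:
        return None
    # Requires a Globus account with an uploaded SSH public key.
    result = subprocess.run(argv, input=stdin_text, text=True,
                            capture_output=True, check=True)
    return result.stdout


# Hypothetical user, destination endpoint, and file paths.
argv, batch = build_transfer("alice", "jplesgnode#public", "alice#laptop",
                             ["/data/a.nc", "/data/b.nc"], "/downloads/")
submit(argv, batch)  # dry run: builds the request without contacting Globus
```

Keeping command construction separate from submission makes it easy to inspect the batch before any transfer is started, which matters for large node-to-node replication jobs.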


Globus CLI Downloads


Future Plans

• Use ESGF OpenID as an alternative to Globus ID
– This will eliminate the need for a CoG user to register a separate Globus username

• Once the ESGF data nodes have Globus Connect Servers on them, extending Globus downloads to all data nodes should be easier
