1
oceanobservatories.org Funding for the Ocean Observatories Initiative is provided by the National Science Foundation through a Cooperative Agreement with the Consortium for Ocean Leadership. The OOI Program Implementing Organizations are funded through sub- awards from the Consortium for Ocean Leadership. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. The Rutgers data management team has developed a series of command-line Python tools that streamline data acquisition in order to facilitate the QA/QC review process. Data Download Query uFrame for data availability Generate requests Data acquisition is made easy by a uFrame API (see Links section) written in Python. In the following example, we search uFrame for Endurance data and send requests for the Washington Offshore Profiler Mooring (CE09OSPM). OOI CI Data Acquisition Functions and Automated Python Modules Michael J. Smith 1 , M. Vardaro 1 , L. Belabbassi 1 , L. Garzio 1 , F. Knuth 1 , J. Kerfoot 1 , M.F. Crowley 1 , J.P. Fram 2 , S.M. Glenn 1 , O. Schofield 1 1 Rutgers, The State University of New Jersey, Department of Marine and Coastal Sciences, New Brunswick, NJ 2 Oregon State University, College of Earth, Ocean, and Atmospheric Sciences, Corvallis, OR Poster # OD14A-2396 Acknowledgements The OOI is funded by the National Science foundation and managed by the Consortium for Ocean Leadership. Images courtesy of University of Washington. Questions or Comments? email [email protected] OOI Official Website www.oceanobservatories.org Data Download Rutgers Data Management Team Workflow Cyberinfrastructure (CI) The OOINet system employs the uFrame Service-Oriented Architecture (SOA) software framework that processes the raw data. CI delivers the parsed and calculated data to the general public via RESTful HTTP application developed by Raytheon called ‘Stream Engine.’ The data is output into commonly used scientific data file formats such as JSON, NetCDF, and CSV. The Ocean Observatories Initiative (OOI), funded by the National Science Foundation, provides users with access to long-term datasets from a variety of oceanographic sensors. Arrays Cabled Array Coastal Pioneer Coastal Endurance Global Argentine Basin Global Irminger Sea Global Southern Ocean Global Station Papa OOI Overview Oregon Line 7 moorings 2 cabled benthic experiment packages (BEP) 6 underwater gliders Washington Line 6 moorings 6 gliders The Endurance Array is outfitted with a variety of instrument packages. Acoustic Doppler Current Profiler Conductivity, Temperature, and Depth Dissolved Oxygen • Fluorometer Bulk Meteorology Instrument Package More information can be found at: • oceanobservatories.org/instruments/ Data Visualization Uses free, open-source toolboxes like numpy and matplotlib Automatically plots all data that is available in the netCDF files • plotNC4.py Points to a local directory containing netCDF files. Plot all variables or a select few Python Pickle (.pkl) files available that contain important parameters. • Science • Engineering Specify plot tops: time series, profiles, and depth cross-sections Interval for plots (i.e. plot any combination of hourly, daily, weekly, monthly, yearly) • plotNC4_opendap.py Same as above, but points to an OPenDAP server catalog. Crawls through main catalog Select only data you want to plot via command prompt 1. search_frame_arrays.py Requests a list of all available platforms 2. search_uframe_instrument_deployments.py Requests a list of all available instruments/ deployments for a specific platform 3. build_instrument_deployment_requests.py Generate a CSV containing a series of urls that request specified dataset from uFrame 4. Two options for sending requests 1. Manually copy and paste link into a web browser 2. Programmatically requested (i.e. Python Requests module). 5. Request successful 1. Receive an outputURL link to the location where you can find the requested dataset. Please remember to save this link! 2. Data is computed on demand 3. Time between when the request is accepted and posted at the link you received will vary based upon the amount of data requested and other factors such as sampling interval. Links Data Team Toolbox: github.com/ooi-data-review uFrame software: github.com/oceanobservatories Rutgers CI: github.com/ooi-integration OPenDap Server: http://opendap-devel.ooi.rutgers.edu:8090/thredds/catalog/catalog.html All Python tools have been uploaded to Github repositories that are openly available to help facilitate OOI data access and visualization for all users Data Analysis uFrame Python API plot-nc-ooi Python toolbox PLEASE NOTE: The server used to request data is temporarily hidden due to ongoing load testing. Access to the data server will be available in the near future. All scripts have comprehensive help manuals. Just execute the script and type -h Data Visualization Generate plots automatically via OPenDAP Data Analysis Marine Implementing Organizations (MIO) Oregon State University University of Washington Woods Hole Oceanographic Institution Generate Requests Send Requests to Stream Engine Output Data Plot Data Analyze Plots Check Data vs. Expected Data Data gets moved to THREDDS Define the Issue Instrument Problem: Assign to MIO Software Problem: Assign to Raytheon Generate Redmine Ticket Assignee and Data Team Test Fix Push Fix to Production Server MIO/Raytheon/Rutgers Fixes Issue Rutgers Data Management Team Yes No Science User Finds Issue Endurance Raw Data OSU OMC Server Rutgers OMC Server WHOI OMC Server Pioneer & Global Raw Data Dataset Parsers Storage in uFrame (Cassandra) Stream Engine Data Product Algorithms Asset Management Deployment/ Calibration Events Data output (JSON or netCDF) Data Requests Science User uFrame Asset Records Integration Events Endurance Array CE09OSPM Cyberinfrastructure Rutgers, The State University of New Jersey The Endurance Array operated by Oregon State University in the Pacific Ocean consists of two separate survey lines off the coasts of Oregon and Washington. Cabled Raw Data Pittock Server The data team uses a suite of programming tools to analyze the data in addition to the plots Relational Database • This database is cross-checked with each file to ensure that uFrame is outputting the correct streams, parameters, and other important metadata. Provenance Data Interpretation The repository ‘read_provenance’ parses provenance in each netCDF into JSON Provenance information is file specific Deployment information Raw data files used Data product algorithms/parsers used

OOI CI Data Acquisition Functions and Automated …...• Generate a CSV containing a series of urls that request specified dataset from uFrame 4. Two options for sending requests

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: OOI CI Data Acquisition Functions and Automated …...• Generate a CSV containing a series of urls that request specified dataset from uFrame 4. Two options for sending requests

oceanobservatories.org Funding for the Ocean Observatories Initiative is provided by the National Science Foundation through a Cooperative Agreement with the Consortium for Ocean Leadership. The OOI Program Implementing Organizations are funded through sub-awards from the Consortium for Ocean Leadership. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

The Rutgers data management team has developed a series of command-line Python tools that streamline data acquisition in order to facilitate the QA/QC review process. • Data Download

• Query uFrame for data availability • Generate requests

Data acquisition is made easy by a uFrame API (see Links section) written in Python.

In the following example, we search uFrame for Endurance data and send requests for the Washington Offshore Profiler Mooring (CE09OSPM).

OOI CI Data Acquisition Functions and Automated Python ModulesMichael J. Smith1, M. Vardaro1, L. Belabbassi1, L. Garzio1, F. Knuth1,

J. Kerfoot1, M.F. Crowley1, J.P. Fram2, S.M. Glenn1, O. Schofield1 1Rutgers, The State University of New Jersey, Department of Marine and Coastal Sciences, New Brunswick, NJ

2Oregon State University, College of Earth, Ocean, and Atmospheric Sciences, Corvallis, ORPoster # OD14A-2396

AcknowledgementsThe OOI is funded by the National Science foundation and managed by the Consortium for Ocean Leadership. Images courtesy of University of Washington.

Questions or Comments?email [email protected]

OOI Official Websitewww.oceanobservatories.org

Data Download

Rutgers Data Management Team Workflow

Cyberinfrastructure (CI)The OOINet system employs the uFrame Service-Oriented Architecture (SOA) software framework that processes the raw data. CI delivers the parsed and calculated data to the general public via RESTful HTTP application developed by Raytheon called ‘Stream Engine.’ The data is output into commonly used scientific data file formats such as JSON, NetCDF, and CSV.

The Ocean Observatories Initiative (OOI), funded by the National Science Foundation, provides users with access to long-term datasets from a variety of oceanographic sensors.

Arrays • Cabled Array • Coastal Pioneer • Coastal Endurance • Global Argentine Basin • Global Irminger Sea • Global Southern Ocean • Global Station Papa

OOI Overview

• Oregon Line • 7 moorings • 2 cabled benthic experiment packages

(BEP) • 6 underwater gliders

• Washington Line • 6 moorings • 6 gliders

The Endurance Array is outfitted with a variety of instrument packages. • Acoustic Doppler Current Profiler • Conductivity, Temperature, and Depth • Dissolved Oxygen • Fluorometer • Bulk Meteorology Instrument Package • More information can be found at:

• oceanobservatories.org/instruments/

Data Visualization• Uses free, open-source toolboxes like numpy and matplotlib • Automatically plots all data that is available in the netCDF files

• plotNC4.py • Points to a local directory containing netCDF files. • Plot all variables or a select few

• Python Pickle (.pkl) files available that contain important parameters. • Science • Engineering

• Specify plot tops: time series, profiles, and depth cross-sections • Interval for plots (i.e. plot any combination of hourly, daily, weekly, monthly, yearly)

• plotNC4_opendap.py • Same as above, but points to an OPenDAP server catalog. • Crawls through main catalog

• Select only data you want to plot via command prompt

1. search_frame_arrays.py • Requests a list of all available platforms

2. search_uframe_instrument_deployments.py • Requests a list of all available instruments/

deployments for a specific platform

3. build_instrument_deployment_requests.py • Generate a CSV containing a series of urls

that request specified dataset from uFrame

4. Two options for sending requests 1. Manually copy and paste link into a web browser 2. Programmatically requested (i.e. Python Requests module).

5. Request successful 1. Receive an outputURL link to the location where you can find the requested dataset.

Please remember to save this link! 2. Data is computed on demand 3. Time between when the request is accepted and posted at the link you received will vary

based upon the amount of data requested and other factors such as sampling interval.

Links

• Data Team Toolbox: github.com/ooi-data-review • uFrame software: github.com/oceanobservatories • Rutgers CI: github.com/ooi-integration • OPenDap Server: http://opendap-devel.ooi.rutgers.edu:8090/thredds/catalog/catalog.html

All Python tools have been uploaded to Github repositories that are openly available to help facilitate OOI data access and visualization for all users

Data Analysis

uFrame Python API

plot-nc-ooi Python toolbox

PLEASE NOTE: The server used to request data is temporarily hidden due to ongoing load testing. Access to the data server will be available in the near future.

All scripts have comprehensive help manuals. Just execute the script and type -h

• Data Visualization • Generate plots automatically via OPenDAP

• Data Analysis

Marine Implementing Organizations (MIO) • Oregon State University • University of Washington • Woods Hole Oceanographic Institution

Generate Requests

Send Requests to

Stream Engine

Output Data

Plot Data Analyze Plots Check Data vs. Expected Data

Data gets moved to THREDDS Define the Issue

Instrument Problem: Assign to MIO

Software Problem: Assign to Raytheon

Generate Redmine Ticket

Assignee and Data Team Test Fix

Push Fix to Production Server

MIO/Raytheon/Rutgers Fixes Issue

Rutgers Data Management Team

Yes No

Science User Finds Issue

Endurance Raw Data

OSU OMC Server

Rutgers OMC Server

WHOI OMC Server

Pioneer & Global Raw Data

Dataset Parsers

Storage in uFrame

(Cassandra)

Stream EngineData Product Algorithms

Asset Management

Deployment/Calibration Events

Data output (JSON or netCDF)

Data Requests Science User

uFrame

Asset Records

Integration Events

Endurance Array

CE09OSPM

Cyberinfrastructure • Rutgers, The State University of New Jersey

The Endurance Array operated by Oregon State University in the Pacific Ocean consists of two separate survey lines off the coasts of Oregon and Washington.

Cabled Raw Data

Pittock Server

The data team uses a suite of programming tools to analyze the data in addition to the plots • Relational Database

• This database is cross-checked with each file to ensure that uFrame is outputting the correct streams, parameters, and other important metadata.

• Provenance Data Interpretation • The repository ‘read_provenance’ parses provenance in each netCDF into JSON • Provenance information is file specific

• Deployment information • Raw data files used • Data product algorithms/parsers used