Upload
others
View
3
Download
0
Embed Size (px)
Citation preview
oceanobservatories.org Funding for the Ocean Observatories Initiative is provided by the National Science Foundation through a Cooperative Agreement with the Consortium for Ocean Leadership. The OOI Program Implementing Organizations are funded through sub-awards from the Consortium for Ocean Leadership. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
The Rutgers data management team has developed a series of command-line Python tools that streamline data acquisition in order to facilitate the QA/QC review process. • Data Download
• Query uFrame for data availability • Generate requests
Data acquisition is made easy by a uFrame API (see Links section) written in Python.
In the following example, we search uFrame for Endurance data and send requests for the Washington Offshore Profiler Mooring (CE09OSPM).
OOI CI Data Acquisition Functions and Automated Python ModulesMichael J. Smith1, M. Vardaro1, L. Belabbassi1, L. Garzio1, F. Knuth1,
J. Kerfoot1, M.F. Crowley1, J.P. Fram2, S.M. Glenn1, O. Schofield1 1Rutgers, The State University of New Jersey, Department of Marine and Coastal Sciences, New Brunswick, NJ
2Oregon State University, College of Earth, Ocean, and Atmospheric Sciences, Corvallis, ORPoster # OD14A-2396
AcknowledgementsThe OOI is funded by the National Science foundation and managed by the Consortium for Ocean Leadership. Images courtesy of University of Washington.
Questions or Comments?email [email protected]
OOI Official Websitewww.oceanobservatories.org
Data Download
Rutgers Data Management Team Workflow
Cyberinfrastructure (CI)The OOINet system employs the uFrame Service-Oriented Architecture (SOA) software framework that processes the raw data. CI delivers the parsed and calculated data to the general public via RESTful HTTP application developed by Raytheon called ‘Stream Engine.’ The data is output into commonly used scientific data file formats such as JSON, NetCDF, and CSV.
The Ocean Observatories Initiative (OOI), funded by the National Science Foundation, provides users with access to long-term datasets from a variety of oceanographic sensors.
Arrays • Cabled Array • Coastal Pioneer • Coastal Endurance • Global Argentine Basin • Global Irminger Sea • Global Southern Ocean • Global Station Papa
OOI Overview
• Oregon Line • 7 moorings • 2 cabled benthic experiment packages
(BEP) • 6 underwater gliders
• Washington Line • 6 moorings • 6 gliders
The Endurance Array is outfitted with a variety of instrument packages. • Acoustic Doppler Current Profiler • Conductivity, Temperature, and Depth • Dissolved Oxygen • Fluorometer • Bulk Meteorology Instrument Package • More information can be found at:
• oceanobservatories.org/instruments/
Data Visualization• Uses free, open-source toolboxes like numpy and matplotlib • Automatically plots all data that is available in the netCDF files
• plotNC4.py • Points to a local directory containing netCDF files. • Plot all variables or a select few
• Python Pickle (.pkl) files available that contain important parameters. • Science • Engineering
• Specify plot tops: time series, profiles, and depth cross-sections • Interval for plots (i.e. plot any combination of hourly, daily, weekly, monthly, yearly)
• plotNC4_opendap.py • Same as above, but points to an OPenDAP server catalog. • Crawls through main catalog
• Select only data you want to plot via command prompt
1. search_frame_arrays.py • Requests a list of all available platforms
2. search_uframe_instrument_deployments.py • Requests a list of all available instruments/
deployments for a specific platform
3. build_instrument_deployment_requests.py • Generate a CSV containing a series of urls
that request specified dataset from uFrame
4. Two options for sending requests 1. Manually copy and paste link into a web browser 2. Programmatically requested (i.e. Python Requests module).
5. Request successful 1. Receive an outputURL link to the location where you can find the requested dataset.
Please remember to save this link! 2. Data is computed on demand 3. Time between when the request is accepted and posted at the link you received will vary
based upon the amount of data requested and other factors such as sampling interval.
Links
• Data Team Toolbox: github.com/ooi-data-review • uFrame software: github.com/oceanobservatories • Rutgers CI: github.com/ooi-integration • OPenDap Server: http://opendap-devel.ooi.rutgers.edu:8090/thredds/catalog/catalog.html
All Python tools have been uploaded to Github repositories that are openly available to help facilitate OOI data access and visualization for all users
Data Analysis
uFrame Python API
plot-nc-ooi Python toolbox
PLEASE NOTE: The server used to request data is temporarily hidden due to ongoing load testing. Access to the data server will be available in the near future.
All scripts have comprehensive help manuals. Just execute the script and type -h
• Data Visualization • Generate plots automatically via OPenDAP
• Data Analysis
Marine Implementing Organizations (MIO) • Oregon State University • University of Washington • Woods Hole Oceanographic Institution
Generate Requests
Send Requests to
Stream Engine
Output Data
Plot Data Analyze Plots Check Data vs. Expected Data
Data gets moved to THREDDS Define the Issue
Instrument Problem: Assign to MIO
Software Problem: Assign to Raytheon
Generate Redmine Ticket
Assignee and Data Team Test Fix
Push Fix to Production Server
MIO/Raytheon/Rutgers Fixes Issue
Rutgers Data Management Team
Yes No
Science User Finds Issue
Endurance Raw Data
OSU OMC Server
Rutgers OMC Server
WHOI OMC Server
Pioneer & Global Raw Data
Dataset Parsers
Storage in uFrame
(Cassandra)
Stream EngineData Product Algorithms
Asset Management
Deployment/Calibration Events
Data output (JSON or netCDF)
Data Requests Science User
uFrame
Asset Records
Integration Events
Endurance Array
CE09OSPM
Cyberinfrastructure • Rutgers, The State University of New Jersey
The Endurance Array operated by Oregon State University in the Pacific Ocean consists of two separate survey lines off the coasts of Oregon and Washington.
Cabled Raw Data
Pittock Server
The data team uses a suite of programming tools to analyze the data in addition to the plots • Relational Database
• This database is cross-checked with each file to ensure that uFrame is outputting the correct streams, parameters, and other important metadata.
• Provenance Data Interpretation • The repository ‘read_provenance’ parses provenance in each netCDF into JSON • Provenance information is file specific
• Deployment information • Raw data files used • Data product algorithms/parsers used