CDIAC Data Support for SPRUCE and NGEE

Les A. Hook and Ranjeet Devarakonda
Environmental Sciences Division
Oak Ridge National Laboratory

CDIAC User Working Group Meeting
September 27-28, 2010

ORNL research was sponsored by the U.S. Department of Energy and performed at Oak Ridge National Laboratory (ORNL). ORNL is managed by UT-Battelle, LLC, for the U.S. Department of Energy under contract DE-AC05-00OR22725.



Overview

Data Support for SPRUCE

• Data Management Planning

• Goals outlined in the Science Plan

• Requirements identified in the Data Policy

• Actions and resources needed to meet requirements are in the Data Management Plan

• Implementation

• SPRUCE web site

• Resources and products accessible on the web site

Data Support for NGEE

• Data Management Planning

• Expect planning to be similar to SPRUCE

• NGEE Web Site

Shared Development Effort for Acquisition and Processing of Sensor Data


Science Plan for the Climate Change Response Scientific Focus Area

3.11 Data and informatics

Goals for Response SFA data management are to ensure the fidelity and accessibility of the SFA data, minimize the amount of time research personnel need to spend on data management activities while achieving high-quality data and metadata, and ensure that the data and metadata can be located and used by project personnel (initially) and the broader scientific community. The suite of activities that collectively comprise this component of the SFA will naturally evolve over the life of the SFA, and they will be done in collaboration with data management components of other Climate SFAs. Initial data management work will focus on defining the data collection and distribution requirements, identifying key leverage points across SFAs and other projects, ensuring that site characterization data are maintained, and resolving any critical informatics knowledge gaps identified in the requirements definition. As the experiments begin to collect high-resolution data, the data management activities will shift to ensuring that the experimental data are properly archived and distributed according to the SFA’s data access policy. Data from the Response SFA will be a combination of observational data recorded by researchers and data collected by automated equipment. Further details can be found in Annex C.

The data management component will leverage the expertise and tools in the Environmental Data Science and Systems (EDSS) group, particularly the Carbon Dioxide Information Analysis Center (CDIAC) and the Atmospheric Radiation Measurement (ARM) program archive, to ensure that both observational and automated data are robustly archived in relational data models with the necessary timestamp, spatial, temporal, and provenance metadata.

Goals for SPRUCE Data Management

• Ensure the fidelity and accessibility of SPRUCE data to the participants to facilitate all the pertinent science questions;
• Minimize the amount of time research personnel need to spend on data management activities while achieving high-quality data and metadata; and
• Ensure that the data and metadata can be located and used by project personnel (initially) and by the broader scientific community and public when appropriately quality-checked data are available.

Approach to Data Management Planning

• Provide a structured framework to capture the project-defined requirements.
• Provide data management guidance and best practices.
• It is the responsibility of the ORNL SPRUCE research group, the Task Leaders in particular, and Forest Service staff to reach a consensus about what needs to be controlled, to provide processing details, and to establish who is responsible for implementation. Accountability is key.

Planning Considerations

• The plan supports field sampling, measurements, monitoring, and analyses.
• Data management information collected pre-experiment will inform the final experimental data management processes.
• SPRUCE tasks are subject to change or modification, and experimental technology will evolve. The data management plan will have to be flexible and updated as needed, with version control.


Version 1.2 2010/05/10

SPRUCE Data Policy: Archiving, Sharing, and Fair-Use

The open sharing of all SPRUCE experiment data among researchers, the broader scientific community, and the public is critical to advancing the mission of DOE’s Program of Terrestrial Ecosystem Science.

SPRUCE is implementing an experimental platform for the long-term testing of the mechanisms controlling the vulnerability of organisms, ecosystems, and ecosystem functions to increases in temperature and exposure to elevated CO2 treatments within the northern peatland high-carbon ecosystem. All data collected at the SPRUCE facility, all results of any analysis or synthesis of information, and all model algorithms and codes developed in support of SPRUCE will be submitted to the SPRUCE Data Archive in a timely manner such that data will be available for use by SPRUCE researchers and, following publication, the public.

This policy is applicable to all SPRUCE participants including the SPRUCE Research Group at the Oak Ridge National Laboratory (ORNL), the U.S. Forest Service, cooperating independent researchers, and to the users of SPRUCE data products (see the Data Fair-Use Statement).

SPRUCE data policies are consistent with the sponsoring U.S. DOE Program for Terrestrial Ecosystem Science Data Policy and with the Memorandum of Understanding between the U.S. Forest Service and UT-Battelle.

Data Management Requirements are identified in the Data Policy


Data Archiving and Discovery

• Archive at Carbon Dioxide Information Analysis Center (CDIAC)

• Two levels of data accessibility.

• First is for sharing recently collected, derived, and processed data products among SPRUCE participants.

• Second is for access to mature data products by the broader scientific community and public.

• Public access will be concurrent with open literature or web site publication of SPRUCE results.

• Discovery facilitated through the compilation of descriptive companion metadata records and their inclusion in searchable metadata databases and clearinghouses.
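The two access levels described above could be modeled roughly as follows. This is only a sketch: the level labels, the `DataProduct` fields, and both function names are hypothetical and do not come from the Data Policy, and the case of non-participants who have been granted permission is not modeled.

```python
from dataclasses import dataclass

# Hypothetical labels; the Data Policy describes two levels but names no identifiers.
PROJECT = "project"  # recently collected, derived, and processed data shared among participants
PUBLIC = "public"    # mature data products for the broader community and public


@dataclass(frozen=True)
class DataProduct:
    name: str
    mature: bool     # assumed flag: product is considered mature
    published: bool  # assumed flag: results published in open literature or on the web site


def access_level(product: DataProduct) -> str:
    """Return the broadest access level the product qualifies for."""
    if product.mature and product.published:
        return PUBLIC
    return PROJECT


def can_read(product: DataProduct, is_participant: bool) -> bool:
    """Participants may read everything; others may read public products only."""
    return is_participant or access_level(product) == PUBLIC
```

Tying public access to a `published` flag mirrors the statement that public access is concurrent with publication of SPRUCE results.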

Data Policy, continued


Data Sharing

Timeliness of Data Availability

• Researchers will actively process, quality assure, and document environmental measurements, etc.

• Task Leaders will define a schedule for submitting data to the Archive for their given measurements.

Suggested guidelines for submitting data to the Archive for sharing among SPRUCE participants:

• Environmental measurements (automated instruments): 30 days after the completion of a month of measurements
• Annual surveys and seasonal measurement efforts: 120 days from the completion of the survey
• Laboratory analyses of vegetation nutrient concentrations: 60 days from completion of analyses

Suggested guidelines for submitting data to the Archive for public access:

• Environmental measurements (automated instruments): annual updates
• Annual surveys and seasonal measurement efforts: with publication of papers
• Laboratory analyses of vegetation nutrient concentrations: with publication of papers
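As an illustration, the suggested intervals for sharing among SPRUCE participants (30, 120, and 60 days) can be turned into due dates. The category keys and the function name below are hypothetical; only the day counts come from the guidelines, and Task Leaders may define a different schedule.

```python
from datetime import date, timedelta

# Day counts from the suggested guidelines for sharing among SPRUCE participants.
# The dictionary keys are hypothetical category names.
PROJECT_SHARING_DAYS = {
    "automated_environmental": 30,  # after completion of a month of measurements
    "survey_or_seasonal": 120,      # after completion of the survey
    "lab_nutrient_analysis": 60,    # after completion of analyses
}


def project_due_date(category: str, completed: date) -> date:
    """Latest suggested date to submit data to the Archive for project sharing."""
    return completed + timedelta(days=PROJECT_SHARING_DAYS[category])


if __name__ == "__main__":
    # A month of automated measurements completed on 2010-09-30.
    print(project_due_date("automated_environmental", date(2010, 9, 30)))
```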

Quality Assurance of Data

• Task Leader will define the quality assurance checks to be performed prior to data sharing among SPRUCE participants (Quality Level 1) and prior to public access (Quality Level 2).

Suggested guidelines for defining data Quality Levels: Level 1 and Level 2

Data Policy, continued


Data Fair-Use Statement

The SPRUCE data provided on the public archive are freely available and were furnished by the SPRUCE Research Group at ORNL, U.S. Forest Service, and cooperating independent researchers who encourage their use.

Please inform SPRUCE scientist(s) of your use of the archived data and of any publications.

Check the Archive frequently to ensure that you are using the latest version of the data.

Please acknowledge (1) data products as a citation as provided in the data archive documentation, (2) web site information downloads as a bibliographic web citation, or (3) general SPRUCE information as an acknowledgment or personal communication if no other citation form is applicable.

When publishing original analyses and results using these data, please acknowledge the agency or organization that supported the collection of the original data.

Please include these terms as publication keywords as applicable: SPRUCE Experiment, ORNL, U.S. DOE Office of Science, Marcell Experimental Forest, Northern Research Station, U.S. Forest Service.

Please provide an electronic reprint of your independent work to the SPRUCE Project so that all publications can be tracked by CDIAC.

Disclaimer of Liability

Data Policy, continued


Data and Metadata Reporting

• Reporting Sampling and Measurement Dates and Times
• Identifying Descriptive Field Variables, Biological Measurements, Chemical and Physical Variables
• Reporting Units for Chemical, Physical, and Descriptive Variables
• Reporting Values below Detection Limits
• Reporting Missing Data
• Reporting Uncertainty Estimates
• Reporting Conventions for Meteorological Data, and Temperature and Pressure Conditions
• Assigning Project-Specific Data Quality Flags

Organization

• Data Policy
• Data Flow
• Project Name Information
• Identifying Measurement and Sampling Sites

Data Processing

• Data Entry, Transfer, and Transformation
• Managing Hardcopy Format Project Records
• Managing Electronic Format Project Records
• Names and Reporting Formats for Data Files
• Scripted Programs for Processing and Analysis
• Quality Level of Data

Data Documentation and Archiving

• Planning to Archive Data for Public Release
• Creating Archive Documentation
• Providing Metadata to Searchable Indexes and Clearinghouses
• Assigning Descriptive Data Set Titles

Data Systems Management

• Day-to-Day Operation of Data Management Systems
• Data Management System and Software Configuration Control Guidelines
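The reporting topics above (values below detection limits, missing data, quality flags) lend themselves to a small example. The slides do not state the actual SPRUCE conventions, so the missing-value code, the flag names, and the choice to report the detection limit itself are all assumptions made purely for illustration.

```python
import math
from typing import Optional, Tuple

# All of these codes are hypothetical; the slides name the topics, not the conventions.
MISSING = -9999.0
FLAG_OK = "ok"
FLAG_BELOW_DL = "bdl"
FLAG_MISSING = "missing"


def report_value(raw: Optional[float], detection_limit: float) -> Tuple[float, str]:
    """Return (reported value, quality flag) for one measurement."""
    if raw is None or math.isnan(raw):
        return MISSING, FLAG_MISSING
    if raw < detection_limit:
        # Assumed convention: report the detection limit and flag the value.
        return detection_limit, FLAG_BELOW_DL
    return raw, FLAG_OK
```

Keeping the flag alongside the value lets a later step filter or annotate records without guessing why a number looks unusual.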

Actions and resources needed to meet requirements are in the Data Management Plan


SPRUCE Data Flow

Compiled by Les Hook, 2010/05/10

[Data flow diagram with four columns: Sources, Processing/QA Frequency, Destination, Access. The recoverable labels are listed below; the links between individual boxes are not recoverable from the text.]

Sources

• Task: Environmental Measurements (Automated Instruments)
• Task R2: Plant growth phenology and NPP (Periodic Observations)
• Task R3: Community composition (Periodic Observations)
• Task R4: Plant Physiology (Periodic Observations)
• Task R5: Biogeochemical cycling responses (Periodic Observations)
• Task R6: Modeling of terrestrial ecosystem responses to temperature and CO2 (Inputs and Outputs ?)
• Existing/Historical Data: MEF, NADP, Remote Sensing; Ground penetrating radar assessments; Additional links to existing data ?
• Supplemental Information: Photos, Videos, Additional ?

Processing/QA Frequency

• 30-60 days after collection
• 120 days after survey, 60 days after sample analyses
• Selected data uploaded; periodic updates with new data and products
• Timing ?

Destination

• SPRUCE Data Archive (CDIAC)
• Project Data Sharing
• Public Data Sharing, with publication or per schedule

Access

• SPRUCE Web Site: Project and Public Access to Data and Resources
• Project Data Access: 100% open for Project Team; permission needed by others
• Project Resources: Common reference sources; Metadata Content Editor
• Public Data Archive: 100% open to Public; Data and Metadata Search; Relational Database (e.g., FACE) ?


SPRUCE web site: http://mnspruce.ornl.gov


NGEE web site: http://ngee.ornl.gov


Shared Development Effort for Acquisition and Processing of Sensor Data

SPRUCE Sensors and data loggers

Acquisition and evaluation software

Independent processing steps

Next for SPRUCE and NGEE

• Number of sensors 25X

• Need advanced automated processing, displays, and alarms

• Web accessible

• Other needs?
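One simple form the automated processing and alarms listed above might take is a range check on each incoming reading. The sketch below is not tied to any Campbell Scientific software or to the actual SPRUCE sensors; the sensor names, limits, and function name are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical sensor names and acceptable ranges, for illustration only.
LIMITS: Dict[str, Tuple[float, float]] = {
    "air_temp_C": (-50.0, 50.0),
    "battery_V": (11.0, 15.0),
}


def check_alarms(readings: Dict[str, float]) -> List[str]:
    """Return one alarm message per reading that falls outside its configured range."""
    alarms = []
    for sensor, value in sorted(readings.items()):
        # Sensors without configured limits are never flagged.
        low, high = LIMITS.get(sensor, (float("-inf"), float("inf")))
        if not low <= value <= high:
            alarms.append(f"{sensor}={value} outside [{low}, {high}]")
    return alarms
```

With a 25X increase in sensor count, a table-driven check like this keeps the limits in one place rather than scattered through processing scripts.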


Shared Development Effort for Acquisition and Processing of Sensor Data

Next Steps:

• Purchasing Campbell Scientific (CS) software with more capabilities.

• Meeting with CS Technical Representative for planning guidance.

• Making connections with ORNL CS power users.

• Learning from SPRUCE and NGEE prototypes.

• Starting to look beyond acquisition and processing to analysis.


Additional Data Flow Diagrams

• Overview of Task Inputs and Resources

• S1 Bog Vegetation Survey Task


Overview of Task Inputs and Resources

Compiled by Les Hook, 2010/05/10

[Data flow diagram. The recoverable labels are listed below; the links between individual boxes are not recoverable from the text.]

Task-Specific Inputs

• Task EM
• Task R2
• Task R3
• Task R4
• Task R5
• Task R6
• Existing/Historical Data
• Supplemental Information

Task Information

• Task Description
• ID Measurements
• Field Sampling & Measurement Description
• Laboratory Analysis Description
• Data Processing
• Archive Schedule
• QA Level Defined
• Task Metadata
• Task Data

Resources

• Data Policy
• Data Flow
• SPRUCE Web Site: Project Access to Data and Resources
• Project Resources, common references: SPRUCE Task Description template; SPRUCE Variable Name template; SPRUCE Project Names template; Site Information template; Data Collection Guides
• Project Data Archive: 100% open for Project Team; permission needed by others

Destination

• SPRUCE Data Archive (CDIAC)
• Project Data Sharing


S1 Bog Vegetation Survey Task >>> Data Management Planning

Compiled by Les Hook, 2010/04/30, updated 2010/09/20

[Data flow diagram. The recoverable labels are listed below; the links between individual boxes are not recoverable from the text.]

Task-Specific Inputs

• Forest Service: Survey Plot Coordinates
• Project Master List of Site Information

Organization

• Data Policy
• Data Flow
• Project Name Information
• Identifying Measurement and Sampling Sites

Data and Metadata Reporting

• Reporting Sampling and Measurement Dates and Times
• Identifying Descriptive Field Variables, Biological Measurements, Chemical and Physical Variables
• Reporting Units for Chemical, Physical, and Descriptive Variables
• Reporting Values below Detection Limits
• Reporting Missing Data
• Reporting Uncertainty Estimates
• Reporting Conventions for Meteorological Data, and Temperature and Pressure Conditions
• Assigning Project-Specific Data Quality Flags

Data Processing

• Data Entry, Transfer, and Transformation
• Managing Hardcopy Format Project Records
• Managing Electronic Format Project Records
• Names and Reporting Formats for Data Files
• Scripted Programs for Processing and Analysis
• Quality Level of Data

Data Documentation and Archiving

• Planning to Archive Data for Public Release
• Creating Archive Documentation
• Providing Metadata to Searchable Indexes and Clearinghouses
• Assigning Descriptive Data Set Titles

Data Systems Management

• Day-to-Day Operation of Data Management Systems
• Data Management System and Software Configuration Control Guidelines

Task Metadata

• Task Description
• Field Sampling & Measurement Description
• Laboratory Analysis Description
• QA Level Defined
• Archive Schedule

Cross-references shown in the diagram

• See DCG – Site Information
• See DCG – Hardcopy Forms
• See DCG – Task Plan

Data and Metadata Compilation

Destination and Access

• SPRUCE Data Archive (CDIAC)
• Project Data Sharing
• SPRUCE Web Site: Project and Public Access to Data and Resources