Research Data Infrastructure for Geochemistry (DFG Roundtable)

Page 1: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Research Data Infrastructure for Geochemistry

iedadata.org

Page 2: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Investment

IEDA 2016-2021: Operation of a Multi-Disciplinary Data Facility for the Earth Science Community
• Invited renewal proposal after IEDA review in 2014/15
• Next 5 years of operating IEDA
• $14.4 million

Page 3: Research Data Infrastructure for Geochemistry (DFG Roundtable)

IEDA Data Systems for Geochemistry

Page 4: Research Data Infrastructure for Geochemistry (DFG Roundtable)

IEDA / EarthChem

• Community driven
  - Community governance
  - Community engagement & training
• Standards compliant (accredited ‘trustworthiness’)
  - Follow data curation standards
  - QA/QC procedures
  - Unique, persistent identification of data
  - Persistent access of data holdings
  - Operational procedures (risk management, IP, etc.)
• Demonstrated impact on science

Page 5: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Scientific Justification

• enable new data intensive science, new cross-disciplinary studies, and new kinds of collaborations.
• expand opportunities for scientists, educators, and the public to participate in science.
• maximize the return on national research investments.
• ensure reproducible science: permit verification of research results.
• contribute to new science initiatives.

“Data collections provide more than an increase in the efficiency and accuracy of research: they enable new research opportunities.” “Long-lived Digital Data Collections: Enabling Research and Education in the 21st Century” (NSB Report, September 2005)

Page 6: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Science from EarthChem Data Systems

Page 7: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Gale et al.

Page 8: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Gale et al.

Page 9: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Data Policies

December 11, 2013

Agencies

Societies

Journals

May 9, 2013

February 22, 2013

Page 10: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Data Policies

December 11, 2013

Page 11: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Concern: Reproducibility

“The field sciences (e.g., geology, ecology, and archaeology), where each study is temporally (and often spatially) unique, provide exemplars for the importance of preserving data and samples for further analysis.”

Page 12: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Data Policies:

December 11, 2013

Page 13: Research Data Infrastructure for Geochemistry (DFG Roundtable)

COPDESS: Coalition for Publishing Data in the Earth & Space Sciences

“Connecting Earth Science publishers and Data Facilities to help translate the aspirations of open, available, and useful data from policy into practice.”

Page 14: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Data: Publishers’ Perspective

• Many have had supplements for some time.
  - Difficult to deal with, costly
  - PDFs mostly (not searchable, poorly indexed, variable quality)
• Require authors to comply with data availability policy; policing
• Little guidance on community standards
• Want to use and promote repositories, but these are not well integrated, with a few exceptions
• Worried about repository funding and stability

Slide courtesy of Brooks Hanson, AGU Director for Publications

Page 15: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Statement of Commitment (COPDESS.org)

reaffirm and ensure adherence to our existing journal and publishing policies…regarding data sharing and archiving...

Signed by ~50 publishers & data facilities

“Earth and space science data should, to the greatest extent possible, be stored in appropriate domain repositories that ... follow leading practices, and can provide additional data services.”

Released 15 January. Article in Eos.org: https://eos.org/agu-news/committing-publishing-data-earth-space-sciences

Page 16: Research Data Infrastructure for Geochemistry (DFG Roundtable)

https://copdessdirectory.osf.io/

To be integrated with re3data.org

Page 17: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Domain-specific Data Facilities

Science Community

Domain specific Data facility

Libraries Archives

CI, Computer Science

Publishers, editors

Discipline-specific data services
• Context & provenance metadata
• Semantics
• Workflows

Funding Agencies

Data Facilities

Registries

Data curation services

CI development

Page 18: Research Data Infrastructure for Geochemistry (DFG Roundtable)

findable

identification, persistence

accessible

protection, protocols

context, provenance

re-usable

harmonized, machine-readable

interoperable

BIG DATA

Adding Value

small data

1/6/16 ESIP Winter 2016: "Unleashing the BIG in Small Data"

Generic Repositories

Data Curation Standards

Community Data Collections

Domain-specific Data Standards

Page 19: Research Data Infrastructure for Geochemistry (DFG Roundtable)

findable

identification, persistence

accessible

protection, protocols

context, provenance

re-usable

harmonized, machine-readable

interoperable

BIG DATA

Generic Repositories Community Data Collections

Domain Repositories

Adding Value

small data

Page 20: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Unleashing the BIG in small Research Data

Kerstin Lehnert
Lamont-Doherty Earth Observatory of Columbia University
Palisades, NY 10964

http://bigdata-madesimple.com/hey-big-data-dont-forget-your-little-data-cousin/

Page 21: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Small Data: Pieces of a Puzzle …

Page 22: Research Data Infrastructure for Geochemistry (DFG Roundtable)

… that build a picture

Page 23: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Small Data, Big Science: Example 1

“Understanding where the dust that's in the atmosphere and oceans comes from can help scientists estimate its impact on earth's climate system.”

Bess Koffman, Michael Kaplan, Steven Goldstein, Gisela Winckler (LDEO), Natalie Mahowald (Cornell)

http://blogs.ei.columbia.edu/2014/03/13/did-new-zealand-dust-influence-the-last-ice-age/

Science question: Did New Zealand Dust Influence the Last Ice Age?

Page 24: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Small Data - Big Effort or What it takes to generate a few kilobytes of data

Page 25: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Small Data, Big Science: Example 2

Science question: Do convergent margin volcanoes really represent continental crust?

“As it is crucial to understand the extent and origin of the compositional difference between central Aleutian lavas and plutons through time and space, this project will map and sample plutonic rocks exposed on the central Aleutians and their coeval volcanic host rocks.”

http://www.nsf.gov/discoveries/disc_summ.jsp?cntn_id=135851&org=NSF

Page 26: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Small Data - Big Effort or What it takes to generate a few kilobytes of data

• 4 scientists (3 institutions) traveling to Alaska
• 5 weeks on remote islands
• a boat (with crew)
• a helicopter

Anticipated Data:
• ~250 samples
• ~200 major element analyses
• ~150 trace element analyses
• 50 U/Pb zircon geochronology
• 30 Ar-Ar ages
• 80 Sr, Nd, Hf and Pb isotope analyses

Page 27: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Page 28: Research Data Infrastructure for Geochemistry (DFG Roundtable)

EarthChem Data Systems

[Diagram labels: EarthChem Library; PetDB, SedDB; EarthChem Portal; Data Publication & Preservation; Data Mining & Analysis; Investigators; Metadata Catalog; Data & Metadata; External Systems; EarthChem Data Managers]

Page 29: Research Data Infrastructure for Geochemistry (DFG Roundtable)

EarthChem Library

Data Types:
- Analytical datasets
- Experimental datasets
- Macros/tools
- Data compilations (syntheses)
- Images
- Data reports

Page 30: Research Data Infrastructure for Geochemistry (DFG Roundtable)

DOI to allow proper citation of data

Link to publications

Link to funding source

Page 31: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Accessible in the EarthChem Library

Page 32: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Editors Roundtable Recommendations

• Data need to be available in useful format
  - Complete disclosure of data
  - Data in tabular (usable!) format, no .PDF or .jpg
  - No ratios
• Sample metadata
  - Locations
  - Unique sample identifiers
  - Object classifications
• Analytical metadata
  - Method
  - Lab
  - Data quality & reproducibility (reference material measurements)
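
As a rough illustration of how the recommendations above might be checked automatically, the following minimal sketch tests a tabular (CSV) submission for sample and analytical metadata columns and for unique sample identifiers. The column names and the one-row-per-sample layout are hypothetical assumptions made for this example; they are not taken from an actual EarthChem template.

```python
import csv
import io

# Hypothetical column names, chosen only for this sketch.
REQUIRED_SAMPLE_FIELDS = ["sample_id", "latitude", "longitude", "material"]
REQUIRED_ANALYTICAL_FIELDS = ["method", "laboratory", "reference_material"]


def check_table(text):
    """Return a list of problems found in a CSV table given as text."""
    reader = csv.DictReader(io.StringIO(text))
    columns = reader.fieldnames or []
    problems = []

    # Every required metadata column must be present in the header.
    for name in REQUIRED_SAMPLE_FIELDS + REQUIRED_ANALYTICAL_FIELDS:
        if name not in columns:
            problems.append(f"missing column: {name}")

    # Sample identifiers must be filled in and unique
    # (assumes one row per sample; the header is line 1).
    if "sample_id" in columns:
        seen = set()
        for line_no, row in enumerate(reader, start=2):
            sample_id = (row.get("sample_id") or "").strip()
            if not sample_id:
                problems.append(f"line {line_no}: empty sample_id")
            elif sample_id in seen:
                problems.append(f"line {line_no}: duplicate sample_id {sample_id}")
            else:
                seen.add(sample_id)
    return problems
```

Calling `check_table` on the text of a submitted table returns an empty list when all of the assumed columns are present and every identifier is unique; otherwise each entry names one problem that a curator or author could fix before publication.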

Page 33: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Data Templates

LPSC 2015 Workshop: Restoration and Synthesis of Planetary Geochemical Data

Page 34: Research Data Infrastructure for Geochemistry (DFG Roundtable)

EarthChem Data Templates

Page 35: Research Data Infrastructure for Geochemistry (DFG Roundtable)
Page 36: Research Data Infrastructure for Geochemistry (DFG Roundtable)

NEW!

Page 37: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Data Standards: Why?

Re-usability of data

Reproducibility of science

Integration/interoperability of data

Page 38: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Open Geospatial Consortium (OGC): Observations & Measurements

Observation Result

Feature of Interest

Sampling Sampling Feature

Observation

“Observations commonly involve sampling of an ultimate feature of interest. This International Standard defines a common set of sampling feature types classified primarily by topological dimension, as well as samples for ex-situ observations.” (OGC O&M 2.0.0 / ISO19156; editor: Simon Cox)

e.g. Station, Transect, Section, Specimen
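
The relationship between an observation, its sampling feature, and the ultimate feature of interest can be sketched with a few small classes. This is a heavily simplified illustration and not the O&M schema itself: the class layout, attribute names, and all example values are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class FeatureOfInterest:
    """The ultimate feature the science is about (hypothetical, simplified)."""
    name: str


@dataclass
class SamplingFeature:
    """A station, transect, section, or specimen that samples a feature."""
    identifier: str
    kind: str  # e.g. "Station", "Transect", "Section", "Specimen"
    sampled_feature: FeatureOfInterest


@dataclass
class Observation:
    """One measured value made on a sampling feature."""
    sampling_feature: SamplingFeature
    observed_property: str
    result: float
    unit: str

    def ultimate_feature(self) -> FeatureOfInterest:
        # Follow the link from the sample back to the feature it represents.
        return self.sampling_feature.sampled_feature


# Made-up example values, for illustration only.
ridge = FeatureOfInterest("example ridge segment")
specimen = SamplingFeature("SPEC-001", "Specimen", ridge)
observation = Observation(specimen, "MgO", 8.1, "wt%")
```

Keeping the specimen as its own object, separate from both the result and the feature of interest, is what lets several analyses of the same sample be tied back to one identifier.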

Page 39: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Observation Data Model v2

ODM2 Team: J S Horsburgh, A K Aufdenkampe, L Hsu, A Jones, K Lehnert, E Mayorga, L Song, D Tarboton, I Zaslavsky

Horsburgh et al., Environmental Modelling & Software, Volume 79, 2016.

Page 40: Research Data Infrastructure for Geochemistry (DFG Roundtable)

PetDB

Page 41: Research Data Infrastructure for Geochemistry (DFG Roundtable)

PetDB Data Mining: Search & Filter

Filter by method or concentration

Page 42: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Page 43: Research Data Infrastructure for Geochemistry (DFG Roundtable)

EarthChem Collaborations

• External EC Portal contributors: GEOROC, USGS, MetPetDB, GANSEKI
• Critical Zone Observatories
• DiamondDB (funded by Sloan Foundation/DCO)
• DECADE Portal (funded by Sloan Foundation/DCO)
  - Collaboration with Global Volcanism Program & MAGA database (C. Cardellini)
• Layered Intrusions Database
  - J. van Tongeren (student engagement project)
• MoonDB (funded by NASA 2015-2017)
  - Johnson Space Center, C. Neal,

Page 44: Research Data Infrastructure for Geochemistry (DFG Roundtable)

IEDA Data Rescue Initiative

• Data Rescue Mini-awards ($7,000)
  - J. Delano (SUNY Albany), A. Saal, E. Hauri: Apollo samples
  - J. Gill (UCSC, retired):
  - P. Janney (UCT): UCT Mantle Xenolith Collection
  - M. Rhodes (U Mass): Hawaiian Drilling project
  - T. Fischer (UNM): Russian Volcanic Gas Data
• International Data Rescue Award in the Geosciences
  - Sponsored by Elsevier Research Data division
  - Awarded 2013 (at AGU FM) and 2015 (at EGU GA)
  - Competition for 2016 starting soon
• Special Issue of GeoResJ on Data Rescue (volume 6, 2015)

Page 45: Research Data Infrastructure for Geochemistry (DFG Roundtable)

EarthChem Portal

Page 46: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Data Analysis

Page 47: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Page 48: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Data Analysis

Page 49: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Interoperability with LEPR (M. Ghiorso)

Page 50: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Results at LEPR

Page 51: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Data Analysis

Page 52: Research Data Infrastructure for Geochemistry (DFG Roundtable)

Page 53: Research Data Infrastructure for Geochemistry (DFG Roundtable)

EarthCube

• Advances coordination, collaboration, and integration
  - Community governance
  - Integrative Activities
• Fosters new data communities
  - Research Coordination Networks
• Develops and adapts new technologies to structure, transform, integrate, document, harmonize data & metadata
  - Building Blocks