BIS TDWG Conference 29 October 2014, Jönköping, Sweden Publishing sample-based data using Darwin Core Archives Éamonn Ó Tuama, Markus Döring, Kyle Braak, Tim Robertson, Olaf Bánki Global Biodiversity Information Facility (GBIF)


BIS TDWG Conference, 29 October 2014, Jönköping, Sweden

Publishing sample-based data using Darwin Core Archives

Éamonn Ó Tuama, Markus Döring, Kyle Braak, Tim Robertson, Olaf Bánki

Global Biodiversity Information Facility (GBIF)


Why do this?

• Long perceived need by GBIF to enable publishing of abundance (sample) data;

• Requirement with the EU Project EU BON (http://eubon.eu);

• Meeting the needs of the GEO Biodiversity Observation Network (GEO BON).


Sample-based data

• Output of monitoring programmes;

• Quantitative, calibrated;

• Using standard protocols;

• Repeatable, comparable.

Detect changes and trends in populations


Constraints

• Be available for testing in 2015

• Build on existing widely used standards: Darwin Core

• Work within the existing tools ecosystem: IPT

• … while acknowledging the promise of ontologies (BCO, OBOE …)


Caveat

Aim: demonstrate one way data can be exposed to maximize discoverability and reuse. Not in scope: establishing how data should be captured or modelled.


A use case

Enabling the flow of sample-based data in support of GEO BON Essential Biodiversity Variables (EBVs).


Essential Biodiversity Variables

Intermediate layer between raw data and indicators

GEO BON has identified six EBV classes

A measurement required for study, reporting and management of biodiversity change


EBV Class: Species populations


Building on the Darwin Core vocabulary


Darwin Core – a glossary of terms

taxonRank, higherClassification, taxonConceptID, collectionCode, geodeticDatum, specificEpithet, coordinatePosition

collectionCode: The name, acronym, coden, or initialism identifying the collection or data set from which the record was derived. Examples: "Mammals", "Hildebrandt", "eBird".


7 essential terms for encoding sample data

1. eventID
2. projectID (new)
3. samplingProtocol
4. sampleSize (new)
5. sampleSizeUnit (new)
6. quantity (new)
7. quantityType (new)


New terms required

eventID: an identifier for the set of information associated with an Event; may be a global unique identifier or an identifier specific to the data set.

projectID: an identifier for a project with which the data is associated; use to link related data sets, e.g., a monitoring series; may be a global unique identifier or an identifier specific to the series.


New terms required

sampleSize: a numeric value for the time duration, length, area or volume involved in the sampling.

sampleSizeUnit: the unit of measurement used for sampling, e.g., minute, hour, day, metre, metre^2, metre^3.

Examples: 2 hour; 3 m2; 17 km; 1 litre
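As a rough illustration of how a sampleSize / sampleSizeUnit pair might be checked before publishing, here is a minimal Python sketch. The `ALLOWED_UNITS` set, the `SampleEffort` class and the `parse_sample_effort` function are hypothetical names introduced for this example. The unit list is only the handful of units named in the definition, not the actual controlled vocabulary, and the rule that sampleSize must be positive is an assumption.

```python
from dataclasses import dataclass

# Hypothetical subset of permitted units, copied from the examples in the
# sampleSizeUnit definition; the real controlled vocabulary may differ.
ALLOWED_UNITS = {"minute", "hour", "day", "metre", "metre^2", "metre^3"}


@dataclass(frozen=True)
class SampleEffort:
    """A sampleSize / sampleSizeUnit pair describing one sampling event."""
    sample_size: float
    sample_size_unit: str


def parse_sample_effort(size: str, unit: str) -> SampleEffort:
    """Validate raw text values and return them as a typed pair."""
    value = float(size)  # raises ValueError if size is not numeric
    if value <= 0:  # assumption: a sample size must be positive
        raise ValueError(f"sampleSize must be positive, got {value}")
    if unit not in ALLOWED_UNITS:
        raise ValueError(f"sampleSizeUnit {unit!r} is not in the allowed list")
    return SampleEffort(value, unit)


effort = parse_sample_effort("2", "hour")
print(effort.sample_size, effort.sample_size_unit)
```

Keeping the number and the unit in separate fields, as the two terms do, lets the unit be validated against a list independently of the numeric value.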


Unit of measurement vocabulary


Unit of measurement vocabulary, used in the IPT as a controlled list for sampleSizeUnit:

http://rs.gbif.org/sandbox/vocabulary/gbif/unit_of_measurement.xml


New terms required

quantity: the number or enumeration value of the entity or category being quantified in the sample. As such it is paired with quantityType.

quantityType: the entity being referred to by quantity, e.g., individuals, a percentage (e.g., species, biomass, biovolume), a scale type.

Examples: 14 Individuals; r BraunBlanquetScale; 0.4 %Species; 31 %Biomass
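Because quantity can be either a number or an enumeration value (such as the code "r" on a Braun-Blanquet scale), one possible way to hold the pair in code is to keep quantity as text and convert it only when it is numeric. The sketch below is an assumption about how a consumer might do this; the `Quantity` class and its `numeric_value` method are hypothetical names, and the example values are the ones shown on the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Quantity:
    """A quantity / quantityType pair for one taxon in one sample."""
    quantity: str        # kept as text: may be a number or a scale code
    quantity_type: str   # says what the quantity refers to

    def numeric_value(self) -> Optional[float]:
        """Return the quantity as a float, or None for enumeration values."""
        try:
            return float(self.quantity)
        except ValueError:
            return None


examples = [
    Quantity("14", "Individuals"),
    Quantity("r", "BraunBlanquetScale"),
    Quantity("0.4", "%Species"),
    Quantity("31", "%Biomass"),
]

for item in examples:
    print(item.quantity_type, item.numeric_value())
```

The point of the pairing is that a value such as 14 or "r" cannot be interpreted without its quantityType.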


Publishing sample data using the IPT


http://www.gbif.org/ipt


Event Core

• An event core is the logical way of organising a sampling event;

• Related environmental measurements can be included in an extension;

• Vegetation plot data (coverages) can be included separately from “occurrences”.


Darwin Core Archive (DwC Archive) components

• Event core

• Occurrence extension

• Measurement-or-fact extension

• Relevé extension

• meta.xml

• EML.xml

• …

http://rs.tdwg.org/dwc/terms/guides/text/index.htm


Placing the terms in a Darwin Core Archive

Event Core (Event, Location, Geological Context): eventID, projectID (n), samplingProtocol, sampleSize (n), sampleSizeUnit (n)

Occurrence Extension (Occurrence, Taxon, Identification): eventID, quantity (n), quantityType (n)

(n) = proposed new term

For term definitions, see http://links.gbif.org/ipt-sample-data-primer
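To make the placement of terms more concrete, the following Python sketch generates a meta.xml descriptor with an Event core and an Occurrence extension linked through the first column. It is illustrative only and rests on several assumptions: the file names `event.txt`, `occurrence.txt` and `eml.xml` are invented, the rowType URIs are assumed to follow the standard Darwin Core term namespace, and the `PROPOSED` namespace is a placeholder, since the URIs of the proposed (n) terms are not given here.

```python
import xml.etree.ElementTree as ET

DWC = "http://rs.tdwg.org/dwc/terms/"
# Hypothetical placeholder namespace for the proposed (n) terms.
PROPOSED = "http://example.org/proposed-terms/"

EVENT_TERMS = [
    DWC + "eventID",
    PROPOSED + "projectID",
    DWC + "samplingProtocol",
    PROPOSED + "sampleSize",
    PROPOSED + "sampleSizeUnit",
]
OCCURRENCE_TERMS = [
    DWC + "eventID",
    PROPOSED + "quantity",
    PROPOSED + "quantityType",
]


def describe_file(tag, row_type, filename, id_tag, terms):
    """Build a <core> or <extension> element describing one data file."""
    table = ET.Element(tag, {
        "rowType": row_type,
        "encoding": "UTF-8",
        "fieldsTerminatedBy": ",",
        "ignoreHeaderLines": "1",
    })
    files = ET.SubElement(table, "files")
    ET.SubElement(files, "location").text = filename
    # Column 0 holds eventID: <id> in the core, <coreid> in the extension.
    ET.SubElement(table, id_tag, {"index": "0"})
    for index, term in enumerate(terms):
        ET.SubElement(table, "field", {"index": str(index), "term": term})
    return table


def build_meta_xml() -> str:
    """Return the archive descriptor as an XML string."""
    archive = ET.Element("archive", {
        "xmlns": "http://rs.tdwg.org/dwc/text/",
        "metadata": "eml.xml",
    })
    archive.append(
        describe_file("core", DWC + "Event", "event.txt", "id", EVENT_TERMS))
    archive.append(
        describe_file("extension", DWC + "Occurrence", "occurrence.txt",
                      "coreid", OCCURRENCE_TERMS))
    return ET.tostring(archive, encoding="unicode")


print(build_meta_xml())
```

In practice the IPT writes this descriptor itself; the sketch only shows which terms sit in which file and how eventID ties the two together.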


Event core:

eventID  projectID  samplingProtocol  sampleSize  sampleSizeUnit  eventDate   location                decimalLatitude  decimalLongitude  …
C_1428   RM065      AQEM              1.25        m2              1963-03-01  Kinzig O3 Rothenbergen  48.1333          11.5667           …
C_1538   RM065      AQEM              1.25        m2              1975-01-21  Kinzig W1 Bulau         -34.6033         -58.3817          …

Occurrence extension:

eventID  scientificName     quantity  quantityType  …
C_1428   Baetis rhodani     14        individuals   …
C_1428   Ephemera danica    15        individuals   …
C_1428   Gyraulus albus     2         individuals   …
C_1538   Serratella ignita  318       individuals   …

A sampling event uses a particular samplingProtocol with sampleSize and sampleSizeUnit, etc., and can record one or more taxa, each of which has a measurement (quantity and quantityType) associated with it.
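The link between the two tables can be sketched in Python: each occurrence row is joined to its event through eventID, and the quantity is divided by the sampleSize to give a per-unit figure. The rows are the example values from the slide, written here as comma-separated text. The function names and the idea of deriving a density are assumptions made for illustration, not part of the proposal.

```python
import csv
import io

EVENT_CORE = """\
eventID,projectID,samplingProtocol,sampleSize,sampleSizeUnit,eventDate
C_1428,RM065,AQEM,1.25,m2,1963-03-01
C_1538,RM065,AQEM,1.25,m2,1975-01-21
"""

OCCURRENCE_EXT = """\
eventID,scientificName,quantity,quantityType
C_1428,Baetis rhodani,14,individuals
C_1428,Ephemera danica,15,individuals
C_1428,Gyraulus albus,2,individuals
C_1538,Serratella ignita,318,individuals
"""


def read_rows(text):
    """Parse delimited text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def densities(events, occurrences):
    """Join occurrences to events on eventID and compute quantity per unit."""
    events_by_id = {event["eventID"]: event for event in events}
    result = []
    for occ in occurrences:
        event = events_by_id[occ["eventID"]]  # KeyError if the link is broken
        density = float(occ["quantity"]) / float(event["sampleSize"])
        label = f'{occ["quantityType"]} per {event["sampleSizeUnit"]}'
        result.append((occ["eventID"], occ["scientificName"], density, label))
    return result


for row in densities(read_rows(EVENT_CORE), read_rows(OCCURRENCE_EXT)):
    print(row)
```

This only works for numeric quantities; enumeration values such as scale codes would need separate handling.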

Event core: http://rs.gbif.org/sandbox/core/dwc_event.xml

Occurrence extension: http://rs.gbif.org/sandbox/extension/event_occurrence.xml


Adapting the IPT

http://eubon-ipt.gbif.org

Now with Event Core


This project has received funding from the European Union’s Seventh Programme for research, technological development and demonstration under grant agreement No 308454.

Acknowledgement

EU BON and GEO BON partners, TDWG mailing list contributors and GBIF sample data workshop participants informed this work and are gratefully acknowledged.


Thank you

GBIF Secretariat
Universitetsparken 15
DK-2100 Copenhagen Ø
Denmark

www.gbif.org

E-mail: [email protected]
Tel: +45 3532 1470
Fax: +45 3532 1480