Using Biodiversity Data from the NBN Database for Research, by Paula Lightfoot, NBN Trust Data Access Officer


Introduction to the NBN Database

1. Overview of available data

2. Finding and accessing data

3. Evaluating data quality

4. Using and referencing data

http://data.nbn.org.uk

Summary of Available Data

• 91 million georeferenced taxon occurrence records.

• 27 habitat datasets and 44 site boundary datasets to provide context and act as filters.

• 856 datasets from 150 data providers.

• Standard data format.

• Standard taxonomy from UK Species Inventory.

Data Providers

• A large proportion of data comes from skilled amateur naturalists.

• Data collated taxonomically and/or geographically.

• Some structured surveys, much ad hoc recording.

Figure: Records in the NBN Database by data provider type, January 2014 (n = 91,206,588).

Geographic Coverage and Sampling Effort:

• Recorder effort and data mobilisation are not evenly spread across the British Isles.

• New NBN Gateway (v.5) extends coverage to include the Channel Islands and offshore data.

• The National Biodiversity Data Centre is the repository for Republic of Ireland (ROI) data.

Sampling Effort

Collembola Recording Scheme

10,633 records of 336 species over 200 years

BTO Second Atlas of Breeding Birds in Britain and Ireland: 1988-1991

1,465,400 records of 272 species over 4 years
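The contrast between these two datasets can be reduced to a rough recording rate. The sketch below is only a crude average: the function name is illustrative, and the calculation ignores how unevenly records are spread across species and years.

```python
def records_per_species_year(records: int, species: int, years: int) -> float:
    """Average number of records per species per year of recording."""
    return records / (species * years)


# Figures quoted on this slide for the two datasets.
collembola = records_per_species_year(10_633, 336, 200)
bird_atlas = records_per_species_year(1_465_400, 272, 4)

print(f"Collembola Recording Scheme: {collembola:.2f} records/species/year")
print(f"BTO Breeding Bird Atlas:     {bird_atlas:.0f} records/species/year")
```

On these figures the springtail scheme averages about 0.16 records per species per year, against roughly 1,347 for the bird atlas, so an apparent gap in a distribution map may reflect recording effort rather than true absence.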

Figure: Orchesella villosa (a springtail), http://tombio.myspecies.info/

Taxonomic Coverage

Figure: Taxonomic breakdown of records in the NBN Database at January 2014 (n = 91,269,685).

Currency of Data

Figure: Number of records in the NBN Database by year of record, January 2014 (n = 89,091,428; 98% of total).

Data Attributes

Standard attributes in NBN Exchange Format:

• Required (what? where? when?): unique record key, taxon, date, date type, coordinates / grid reference / polygon, projection, precision.

• Optional: survey key, sample key, absent, sensitive, site key, site name, recorder, determiner.

Other attributes are not (yet) standardised across datasets, e.g. abundance, life stage, sex, verification status, record type, depth. These have neither standard fields nor standard units / vocabularies.
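A simple completeness check makes the required attributes concrete. This is a minimal sketch: the field names are hypothetical stand-ins for the required attributes, the example record is made up, and the real NBN Exchange Format column headings may differ.

```python
# Hypothetical field names standing in for the required attributes;
# the real NBN Exchange Format column headings may differ.
REQUIRED_FIELDS = (
    "record_key", "taxon", "date", "date_type",
    "location",  # coordinates, grid reference or polygon
    "projection", "precision",
)


def missing_required(record: dict) -> list[str]:
    """Return the required fields that are absent or blank in a record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Made-up record for illustration only; not taken from the NBN Database.
example = {
    "record_key": "demo-0001",
    "taxon": "Orchesella villosa",
    "date": "2013-06-01",
    "date_type": "D",
    "location": "SE123456",
    "projection": "OSGB36",
    "precision": "100",
}

print(missing_required(example))  # []
print(missing_required({"taxon": "Orchesella villosa"}))
```

Non-standard attributes such as abundance or life stage are deliberately left out of the check, because their names and units vary between datasets.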

Absence Data

• Zero abundance (T/F) is a standard attribute.

• Absence records are displayed on the NBN Gateway Interactive Map.

• The NBN Database currently holds 30,625 absence records across 26 datasets (Jan 2014).

Figure: 10km Interactive Map of Sargassum muticum.
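Because zero abundance is carried as a T/F flag, absence records can be separated from presences before analysis. The sketch below assumes a column called `ZeroAbundance` holding "T" or "F"; that column name and the example rows are illustrative, not taken from a real download.

```python
def split_presence_absence(rows, flag_column="ZeroAbundance"):
    """Split records into (presences, absences) using a T/F flag column."""
    presences, absences = [], []
    for row in rows:
        flag = str(row.get(flag_column, "")).strip().upper()
        # Rows with a missing or unrecognised flag are treated as presences.
        (absences if flag in {"T", "TRUE"} else presences).append(row)
    return presences, absences


# Made-up rows for illustration only.
rows = [
    {"taxon": "Sargassum muticum", "ZeroAbundance": "F"},
    {"taxon": "Sargassum muticum", "ZeroAbundance": "T"},
]
presences, absences = split_presence_absence(rows)
print(len(presences), len(absences))  # 1 1
```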

Effort-based Data

• The NBN Database holds some effort-based datasets (e.g. BTO Breeding Bird Survey, Shorewatch, Shore Thing, UK Butterfly Monitoring Scheme)

• The effort-based methodology should be described in the metadata.

• Effort data may be stored as attributes of the species observation e.g. number of observers, timespan of observation period.

• Effort data is not stored in a way that enables ‘per unit effort’ analysis. NBN Exchange Format is a flat file, not relational tables.

Data Resolution

• The finest resolution currently available is 100m squares.

• Data providers can blur resolution of the ‘public’ version of the records to 1km, 2km or 10km, while granting full access to select users.

• ‘Full access’ includes recorder and determiner names and attributes (where available).
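Blurring a public record to 1km, 2km or 10km can be pictured as coarsening its grid reference. The sketch below handles two-letter British National Grid references only and uses the conventional DINTY lettering for 2km tetrads. It illustrates the idea and is an assumption about, not a description of, how the NBN Gateway performs the blurring.

```python
import re

# Tetrad (2km) letters in the DINTY scheme: A-Z without O, running south to
# north within each column, with columns running west to east.
TETRAD_LETTERS = "ABCDEFGHIJKLMNPQRSTUVWXYZ"


def blur_grid_ref(grid_ref: str, resolution_m: int) -> str:
    """Coarsen a grid reference such as 'SE123456' to 1km, 2km or 10km."""
    match = re.fullmatch(r"([A-Z]{2})([0-9]+)", grid_ref.replace(" ", "").upper())
    if not match or len(match.group(2)) % 2 or len(match.group(2)) < 4:
        raise ValueError(f"need a two-letter grid reference at 1km or finer: {grid_ref!r}")
    letters, digits = match.groups()
    half = len(digits) // 2
    easting, northing = digits[:half], digits[half:]
    if resolution_m == 10_000:
        return f"{letters}{easting[0]}{northing[0]}"
    if resolution_m == 2_000:
        # The second digit of each half gives the 1km position in the 10km square.
        index = (int(easting[1]) // 2) * 5 + int(northing[1]) // 2
        return f"{letters}{easting[0]}{northing[0]}{TETRAD_LETTERS[index]}"
    if resolution_m == 1_000:
        return f"{letters}{easting[:2]}{northing[:2]}"
    raise ValueError("resolution_m must be 1000, 2000 or 10000")


print(blur_grid_ref("SE123456", 1_000))   # SE1245
print(blur_grid_ref("SE123456", 2_000))   # SE14H
print(blur_grid_ref("SE123456", 10_000))  # SE14
```

Coarsening is one-way: a record downloaded at 10km cannot be sharpened again, which is why better access has to be requested from the data provider.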

Figure: Access resolution of all records in the NBN Database (n = 91,206,588).

Figure: Access resolution of records of designated taxa in the NBN Database (n = 20,548,842).

Access resolution of vascular plant records in the NBN Database (n = 25,998,531)

Access resolution of dragonfly and damselfly records in the NBN Database (n = 1,486,554)

Exploring Data

NBN Gateway Interactive Map – create and query layers of species, habitats and site boundaries

Evaluating Data Quality and Accessibility

• Publicly accessible records have gone through quality control processes, e.g. checks by local and national experts.

• Some have also been checked using NBN Record Cleaner (http://www.nbn.org.uk/record-cleaner.aspx), based on:

    • Spatial distribution rules

    • Temporal rules: flight period or first/last year recorded

    • Identification difficulty / rarity / taxonomic uncertainty

• NBN Record Cleaner rules have been created by experts at national recording schemes for over 18,000 species including 77% of conservation priority species (NERC Act 2006).

• Nevertheless, erroneous records do occur. Always read the metadata.
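A temporal rule of the kind NBN Record Cleaner applies can be sketched as a pair of range checks. The class, function and rule values below are hypothetical and do not reproduce Record Cleaner's own rule format. For simplicity the sketch assumes the expected period does not wrap around the end of the year.

```python
from dataclasses import dataclass


@dataclass
class TemporalRule:
    """Hypothetical rule: expected months plus the known range of years."""
    first_month: int
    last_month: int
    first_year: int
    last_year: int


def temporal_flags(rule: TemporalRule, year: int, month: int) -> list[str]:
    """Return reasons a record deserves a second look (empty if none)."""
    flags = []
    if not rule.first_month <= month <= rule.last_month:
        flags.append("outside expected period")
    if not rule.first_year <= year <= rule.last_year:
        flags.append("outside known year range")
    return flags


# Made-up rule values for illustration only.
rule = TemporalRule(first_month=5, last_month=9, first_year=1900, last_year=2014)
print(temporal_flags(rule, year=2010, month=7))   # []
print(temporal_flags(rule, year=2010, month=12))  # ['outside expected period']
```

A flag is a prompt for expert review rather than proof of an error; an unusual date may be a genuine and interesting record.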

Evaluating Data Quality and Accessibility

Read the dataset’s metadata:

Requesting better access to data

For one-off use:

• Request access as an individual.

• Apply taxonomic / geographic / date and dataset filters to request access to the records you need across multiple datasets.

For repeated use (strongly recommended!):

• Register your organisation on the NBN Gateway (quick and free!).

• Apply as an organisation for access to all datasets and permission to use data for research purposes.

• Make colleagues and students members of the organisation.

Over 200 organisations have user accounts on the NBN Gateway, around 80% of which also share their own data.

Accessing and Using Data

Downloading data from the NBN Gateway

Who you are (individual / organisation)

Why you are downloading the data (dropdown list and free text description)

Include sensitive records

You will need to have been granted access to these records before downloading data

Geographic filter

10km square

Site boundary

‘Within’ or ‘overlapping and within’
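The difference between the ‘within’ and ‘overlapping and within’ options can be sketched with rectangles. This is one reading of the two options, not the Gateway's implementation. Real site boundaries are polygons, so the `Box` type and function names here are simplifying assumptions.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle in metres."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def is_within(square: Box, site: Box) -> bool:
    """True if the record's grid square lies wholly inside the site."""
    return (square.min_x >= site.min_x and square.min_y >= site.min_y
            and square.max_x <= site.max_x and square.max_y <= site.max_y)


def overlaps(square: Box, site: Box) -> bool:
    """True if the record's grid square shares any area with the site."""
    return (square.min_x < site.max_x and square.max_x > site.min_x
            and square.min_y < site.max_y and square.max_y > site.min_y)


def matches(square: Box, site: Box, mode: str = "within") -> bool:
    if mode == "within":
        return is_within(square, site)
    if mode == "overlapping and within":
        return overlaps(square, site)
    raise ValueError(f"unknown mode: {mode!r}")


site = Box(0, 0, 1000, 1000)
straddling = Box(900, 900, 1900, 1900)  # a 1km square crossing the boundary
print(matches(straddling, site, "within"))                  # False
print(matches(straddling, site, "overlapping and within"))  # True
```

The choice matters most for coarse records: a 10km square rarely fits wholly inside a small site, so ‘within’ will exclude most blurred records.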

Taxonomic filter

Taxon (up to Order)

Taxon reporting category (e.g. terrestrial mammals)

Designation

User-defined list

User-defined lists: e.g. species as proxy indicators of climate change, habitat condition, ecosystem services etc. Must be supplied and maintained (with metadata) by a named organisation. Must be relevant for repeated use, not just one-off use.

Year Range

e.g. restrict to recent records only

Dataset filter

You may wish to exclude some datasets e.g.

If they have not granted permission

If the metadata shows they are not suitable for your purpose

Download

Zip file containing:

Observations (CSV file)

Metadata (TXT file)

Download date, time and filters used (TXT file)
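A download of this shape can be read with the Python standard library alone. The sketch below assumes the archive holds a single observations CSV in UTF-8. File names inside the zip are not specified here, so the code looks for the first `.csv` entry rather than a fixed name.

```python
import csv
import io
import zipfile


def read_observations(zip_source) -> list[dict]:
    """Load every row of the first CSV file found in a download archive.

    zip_source may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        csv_names = [n for n in archive.namelist() if n.lower().endswith(".csv")]
        if not csv_names:
            raise ValueError("no CSV file found in archive")
        with archive.open(csv_names[0]) as raw:
            # The encoding is an assumption; utf-8-sig also strips a leading BOM.
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            return list(csv.DictReader(text))
```

Keep the accompanying TXT files with the CSV: the metadata and the record of filters used are what make the analysis repeatable and the data citable.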

Limitations:

• Filters are not ‘multi-select’. For data on 2 species at 5 sites, you have to do 10 downloads.

• You have to use a taxonomic, geographic or dataset filter – you can’t download everything!
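Because filters are not multi-select, the number of downloads is the product of the number of species and the number of sites. A short sketch for planning the batch, using placeholder species and site names:

```python
from itertools import product


def plan_downloads(species: list[str], sites: list[str]) -> list[tuple[str, str]]:
    """List every (species, site) pair that needs its own download."""
    return list(product(species, sites))


# Placeholder names for illustration only.
species = ["species A", "species B"]
sites = [f"site {i}" for i in range(1, 6)]

downloads = plan_downloads(species, sites)
print(len(downloads))  # 10
```

For larger batches, a custom download or the REST API is likely to be less laborious than repeating the form by hand.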

Accessing and Using Data

Accessing data via the NBN REST API

• REST API available to view and download

• Full documentation available by end March

• rNBN tool for release this year

https://data.nbn.org.uk/Documentation/Web_Services/Web_Services-REST/

Custom downloads from the NBN Database

• Filter by user-defined species list (one-off use)

• Filter by user-defined polygon

• ESRI shapefile download format

Custom downloads and REST API downloads are logged and reported to data providers, the same as downloads from the NBN Gateway.

NBN Gateway Terms and Conditions

• Require written permission for research use

• Require the data provider(s) to be acknowledged

• Require the recorder to be acknowledged if appropriate and possible

• Require a waiver statement to be included

• Require OS Map images to be acknowledged

https://data.nbn.org.uk/Terms

Referencing Data

Guidance on referencing data is available on the NBN Website

• DOIs are not currently generated from the NBN Database

• This is being considered, but the data access controls and the fact that data may be withdrawn by data providers pose a challenge.

Links and References

Data providers who contributed to maps used in this presentation:

10km interactive map of Sargassum muticum: https://data.nbn.org.uk/Taxa/NBNSYS0000188809

Collembola Recording Scheme dataset: https://data.nbn.org.uk/Datasets/GA000566

BTO Breeding Bird Atlas 1988-1991: https://data.nbn.org.uk/Datasets/GA000147

National Biodiversity Network: www.nbn.org.uk

NBN Gateway: http://data.nbn.org.uk

NBN Record Cleaner: http://www.nbn.org.uk/Tools-Resources/Recording-Resources/NBN-Record-Cleaner.aspx

Guidance on referencing data from the NBN Database: http://www.nbn.org.uk/Use-Data/Using-Maps-or-Data/Using-and-referencing-data-from-the-Gateway.aspx

GBIF: www.gbif.org

NERC guidance on DOIs: http://www.nerc.ac.uk/research/sites/data/doi.asp

Guide to the NBN Exchange Format on YouTube: http://www.youtube.com/watch?v=2WfOjQOaVFI#t=24