PESI

Pan-European Species-directories Infrastructure

European GBIF nodes Meeting — Paris, 4 April 2011

Walter Berendsohn

(based on presentation by Yde de Jong)

PESI project

• Aim: Creating a Pan-European taxonomic backbone (an All-Species Checklist of Europe)

• Based on the three existing large European databases:

– Fauna Europaea: European animal species (terrestrial & freshwater)

– European Register of Marine Species: European marine species

– Euro+Med PlantBase: European plant species (terrestrial & freshwater)

• + European Fungi (Index Fungorum) & Algae (AlgaeBase)

• 2.64 million Euro, 40 partners, May 2008 - April 2011

PESI in the context of other EU projects

[Figure: EU projects across (FP 4), (FP 5), (FP 6), (FP 7); themes: Digitisation & Infrastructures, Integration & e-Science, Networks of Excellence]

Sustaining Pan-European checklists - issues

• Sustaining the expert networks

• Sustaining the database systems

• Sustaining the maintenance (updating) tools & functions

• Sustaining the interoperability (e.g. Global Names Architecture)

• Sustaining the data dissemination (web portal)

• Sustaining the data verification (quality control / validation)

• Sustaining the implementation of European taxonomic standards

[Diagram labels: SMEBD; Host Institutions; EDIT Platform for Cybertaxonomy; Focal Networks; VLIZ / Hosts (LifeWatch?); *4Life projects (LifeWatch?)]

[Diagram: PESI structure. Labels: Management & Coordination; Infrastructural Networks; Community Networks]

Communities: Zoological, Botanical, Marine, Mycological, Phycological

Components: Expert networks; Focal point networks; Authority files & Standards; Data e-Infrastructure; e-Services

PESI WP2 — Expert Networks

Expert Networks

Society for the Management of Electronic Biodiversity Data (SMEBD)

http://www.smebd.eu

PESI Focal Point Networks

Major tasks of PESI National Focal Points

• Cross-validation of pan-European lists with local species lists

– TaxonMatch Tool

• Provide metadata on local expertise (experts, resources, etc.)

– Focal Points Expertise database

PESI Validation Tools: Taxon Match Tool - 1

http://www.eu-nomen.eu/portal/taxamatch.php

TAXAMATCH fuzzy matching algorithm by Tony Rees; PHP/MySQL port of TAXAMATCH by Michael Giddens; Scientific Names Parser by Dmitry Mozzherin

1. Exact match test

2. Phonetic match test

3. Custom Modified Damerau-Levenshtein Distance (MDLD)

4. Modified n-gram comparison of author names and dates, including known abbreviations
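The four tests above can be illustrated with a minimal Python sketch. It is not the TAXAMATCH implementation: the phonetic rules, the plain optimal-string-alignment distance (standing in for the custom MDLD), the Dice bigram score for author strings, and the `max_distance` threshold are all simplifying assumptions chosen for illustration.

```python
import re


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace (a stand-in for real name parsing)."""
    return re.sub(r"\s+", " ", name.strip().lower())


def phonetic_key(name: str) -> str:
    """Rough phonetic key. Assumption: these rules are illustrative only,
    not the rule set used by TAXAMATCH."""
    key = normalize(name)
    for src, dst in (("ae", "e"), ("oe", "e"), ("ph", "f"), ("y", "i"), ("k", "c")):
        key = key.replace(src, dst)
    # Collapse runs of a repeated character, e.g. "ll" -> "l".
    return re.sub(r"(.)\1+", r"\1", key)


def damerau_levenshtein(a: str, b: str) -> int:
    """Optimal string alignment distance: insertions, deletions,
    substitutions and adjacent transpositions each cost 1."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


def bigram_similarity(a: str, b: str) -> float:
    """Dice coefficient on character bigrams, e.g. for author strings."""
    a, b = normalize(a), normalize(b)
    ga = {a[i:i + 2] for i in range(len(a) - 1)}
    gb = {b[i:i + 2] for i in range(len(b) - 1)}
    if not ga or not gb:
        return 1.0 if a == b else 0.0
    return 2 * len(ga & gb) / (len(ga) + len(gb))


def match_name(query: str, reference: str, max_distance: int = 2) -> str:
    """Run the tests in order and report the first one that succeeds."""
    q, r = normalize(query), normalize(reference)
    if q == r:
        return "exact"
    if phonetic_key(q) == phonetic_key(r):
        return "phonetic"
    if damerau_levenshtein(q, r) <= max_distance:
        return "near"
    return "none"
```

Ordering the tests from cheapest to most expensive means the edit-distance computation only runs for names that fail the exact and phonetic comparisons.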

PESI validation tools: Taxon Match Tool - 2

Mapping between two taxon name lists (exact and fuzzy)
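As a rough illustration of mapping a local list against a reference list, the sketch below tries an exact comparison first and falls back to a fuzzy one. It uses Python's standard `difflib` rather than TAXAMATCH, and the function name `map_name_lists` and the `cutoff` value are assumptions made for this example.

```python
from difflib import get_close_matches


def map_name_lists(local_names, reference_names, cutoff=0.85):
    """Map each local name to a reference name: exact first, then fuzzy.

    Returns a list of (local_name, matched_reference_or_None, match_type).
    """
    # Index the reference list by a normalized key, keeping the original spelling.
    reference = {name.strip().lower(): name for name in reference_names}
    keys = list(reference)
    mapping = []
    for name in local_names:
        key = name.strip().lower()
        if key in reference:
            mapping.append((name, reference[key], "exact"))
            continue
        # difflib ranks candidates by SequenceMatcher ratio (0..1).
        close = get_close_matches(key, keys, n=1, cutoff=cutoff)
        if close:
            mapping.append((name, reference[close[0]], "fuzzy"))
        else:
            mapping.append((name, None, "unmatched"))
    return mapping
```

Names reported as "fuzzy" or "unmatched" are the ones a national focal point would typically want to review by hand.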

PESI validation tools: Taxon Match Tool - 3

Excel file export

Taxonomic standards & authority files

• What are the properties of a Taxonomic Backbone?

– connecting different uses of the same name for multiple classifications

– persistent name-name relationships of species names

• Optimise data sharing / interoperability

– Linked Data mark-up

– persistent identifiers: globally unique IDs (GUIDs, LSIDs)

– DarwinCore Archive format for transport

• Standardised ontologies / vocabularies

– consensus classification

– consensus distribution and occurrence scheme
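To make the identifier and transport points concrete, here is a small sketch that builds LSID-style identifiers and writes taxon records using Darwin Core term names as column headers. The authority `example.org`, the namespace `taxname`, and the object IDs are placeholders, not PESI identifiers; a complete DarwinCore Archive would also need a `meta.xml` descriptor, which is omitted here.

```python
import csv
import io

# Darwin Core terms used as column headers for a taxon core file.
DWC_FIELDS = [
    "taxonID",
    "scientificName",
    "scientificNameAuthorship",
    "taxonRank",
    "taxonomicStatus",
    "acceptedNameUsageID",
]


def make_lsid(authority: str, namespace: str, object_id: str) -> str:
    """Build an LSID of the form urn:lsid:<authority>:<namespace>:<object>."""
    return f"urn:lsid:{authority}:{namespace}:{object_id}"


def taxon_rows_to_csv(rows) -> str:
    """Serialize taxon records (dicts keyed by DWC_FIELDS) to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=DWC_FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical identifiers; the synonym points at the accepted name's ID,
# which is one way to express a persistent name-name relationship.
accepted_id = make_lsid("example.org", "taxname", "1001")
synonym_id = make_lsid("example.org", "taxname", "1002")

EXAMPLE_ROWS = [
    {"taxonID": accepted_id, "scientificName": "Cyanistes caeruleus",
     "scientificNameAuthorship": "(Linnaeus, 1758)", "taxonRank": "species",
     "taxonomicStatus": "accepted", "acceptedNameUsageID": accepted_id},
    {"taxonID": synonym_id, "scientificName": "Parus caeruleus",
     "scientificNameAuthorship": "Linnaeus, 1758", "taxonRank": "species",
     "taxonomicStatus": "synonym", "acceptedNameUsageID": accepted_id},
]

if __name__ == "__main__":
    print(taxon_rows_to_csv(EXAMPLE_ROWS))
```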

Conceptual integration

PESI consensus distribution and occurrence scheme

Gazetteer: http://www.vliz.be/vmdcdata/vlimar/

Taxonomic information e-infrastructure

EDIT Platform dataflow in PESI domain

© Walter Berendsohn

PESI Phyco-Myco databases

Quality control mechanisms

Inconsistency checks used in the merging process
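The slides do not list the individual checks, so the following is only a hypothetical sketch of the kind of inconsistency test that can be run when merging name records from several sources. The record fields (`id`, `status`, `accepted_id`) and the three rules are assumptions for illustration, not the PESI rule set.

```python
def find_inconsistencies(records):
    """Return (record_id, problem) tuples for a list of merged name records.

    Each record is a dict with 'id', 'status' ('accepted' or 'synonym')
    and an optional 'accepted_id' pointing at the accepted name's record.
    """
    known_ids = {record["id"] for record in records}
    seen = set()
    problems = []
    for record in records:
        rid = record["id"]
        # Rule 1: identifiers must be unique after merging.
        if rid in seen:
            problems.append((rid, "duplicate id"))
        seen.add(rid)

        accepted = record.get("accepted_id")
        if record["status"] == "synonym":
            # Rule 2: every synonym must resolve to an existing record.
            if accepted is None:
                problems.append((rid, "synonym without accepted name"))
            elif accepted not in known_ids:
                problems.append((rid, "synonym points to a missing record"))
        elif record["status"] == "accepted" and accepted not in (None, rid):
            # Rule 3: an accepted name should not point at a different record.
            problems.append((rid, "accepted name points to another record"))
    return problems
```

Returning a list of problems rather than raising on the first one lets the whole merged data set be reviewed in a single pass.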

PESI Data Warehouse Model

PESI Data Warehouse - statistics

> 440,000 taxon names

~ 210,000 valid species names

e-Services for users & dissemination

PESI project website

http://www.eu-nomen.eu/pesi

PESI dataportal

http://www.eu-nomen.eu/portal

Linking to Global Names Architecture

Acknowledgement (PESI SC & management)

Nihat Aktaç

Ward Appeltans

Walter Berendsohn

Phillip Boegh

Louis Boumans

Thierry Bourgoin

Mark Costello

Charles Hussey

Roger Hyam

Yde de Jong

Julia Kouwenberg

David Ouvrard

Henrik Pedersen
