PESI Pan-European Species-directories Infrastructure European GBIF nodes Meeting — Paris, 4 April 2011 Walter Berendsohn (based on presentation by Yde de Jong)


Page 1

PESI

Pan-European Species-directories Infrastructure

European GBIF nodes Meeting — Paris, 4 April 2011

Walter Berendsohn

(based on presentation by Yde de Jong)

Page 2

PESI project

• Aim: creating a Pan-European taxonomic backbone (an All-Species Checklist of Europe)
• Based on the three existing large European databases:
– Fauna Europaea: European animal species (terrestrial & freshwater)
– European Register of Marine Species: European marine species
– Euro+Med PlantBase: European plant species (terrestrial & freshwater)
• Plus European Fungi (Index Fungorum) & Algae (AlgaeBase)
• 2.64 million euro, 40 partners, May 2008 - April 2011

Page 3

PESI in the context of other EU projects

[Timeline figure: EU Framework Programmes FP 4, FP 5, FP 6, FP 7; phase labels: Digitisation & Infrastructures, Integration & e-Science, Networks of Excellence]

Page 4

Sustaining Pan-European checklists - issues

• Sustaining the expert networks
• Sustaining the database systems
• Sustaining the maintenance (updating) tools & functions
• Sustaining the interoperability (e.g. Global Names Architecture)
• Sustaining the data dissemination (web portal)
• Sustaining the data verification (quality control / validation)
• Sustaining the implementation of European taxonomic standards

[Diagram labels: SMEBD; Host Institutions; EDIT Platform for Cybertaxonomy; Focal Networks; VLIZ / Hosts (LifeWatch?); *4Life projects (LifeWatch?)]

Page 5

[Diagram: PESI project structure. Management & Coordination; Infrastructural Networks (Expert networks, Focal point networks, Authority files & Standards, Data e-Infrastructure, e-Services); Community Networks (Zoological, Botanical, Marine, Mycological, Phycological)]

PESI WP2 — Expert Networks

Page 6

Expert Networks

Society for the Management of Electronic Biodiversity Data (SMEBD)

http://www.smebd.eu

Page 7

[Diagram: PESI project structure (Management & Coordination, Infrastructural Networks, Community Networks)]

PESI Focal Points Networks

Page 8

PESI Focal Point Networks

Page 9

Major tasks of PESI National Focal Points

• Cross-validation of pan-European lists with local species lists
– TaxonMatch Tool
• Provide metadata on local expertise (experts, resources, etc.)
– Focal Points Expertise database

Page 10

PESI Validation Tools: Taxon Match Tool - 1

http://www.eu-nomen.eu/portal/taxamatch.php

TAXAMATCH fuzzy matching algorithm by Tony Rees; PHP/MySQL port of TAXAMATCH by Michael Giddens; Scientific Names Parser by Dmitry Mozzherin

Page 11

PESI validation tools: Taxon Match Tool - 2

Mapping between two taxon name lists (exact and fuzzy)

1. Exact match test
2. Phonetic match test
3. Custom Modified Damerau-Levenshtein Distance (MDLD)
4. Modified n-gram comparison of author names and dates, including known abbreviations
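The four tests listed on this slide can be read as a cascade: try the cheapest test first and fall through to fuzzier ones. The sketch below is a rough illustration only, not the TAXAMATCH implementation. The phonetic key is a simplified stand-in, the edit distance is the standard restricted Damerau-Levenshtein (optimal string alignment) distance rather than the custom MDLD, and the author comparison is a plain character-bigram Dice score with no abbreviation handling. All function names, the threshold, and the sample names are assumptions made for illustration.

```python
def phonetic_key(name: str) -> str:
    """Simplified phonetic normalisation (a stand-in, not the TAXAMATCH rules)."""
    s = name.lower()
    for old, new in (("ae", "e"), ("oe", "e"), ("ph", "f"), ("y", "i"), ("k", "c")):
        s = s.replace(old, new)
    collapsed = []
    for ch in s:
        if not collapsed or collapsed[-1] != ch:  # drop doubled characters
            collapsed.append(ch)
    return "".join(collapsed)


def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein distance (adjacent transposition costs 1)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


def bigram_similarity(a: str, b: str) -> float:
    """Dice score over character bigrams, e.g. for author strings (step 4)."""
    def grams(s: str) -> set:
        s = s.lower()
        return {s[i:i + 2] for i in range(len(s) - 1)}
    ga, gb = grams(a), grams(b)
    if not ga or not gb:
        return 0.0
    return 2 * len(ga & gb) / (len(ga) + len(gb))


def match_name(query: str, reference: list, max_distance: int = 2) -> tuple:
    """Run steps 1-3 in order; return (match_type, matched_name)."""
    if query in reference:                                   # 1. exact match
        return "exact", query
    key = phonetic_key(query)
    for candidate in reference:                              # 2. phonetic match
        if phonetic_key(candidate) == key:
            return "phonetic", candidate
    best = min(reference,                                    # 3. edit distance
               key=lambda c: osa_distance(query.lower(), c.lower()),
               default=None)
    if best is not None and osa_distance(query.lower(), best.lower()) <= max_distance:
        return "fuzzy", best
    return "none", None


if __name__ == "__main__":
    reference = ["Mytilus edulis", "Parus major"]  # illustrative reference list
    for q in ["Parus major", "Mitillus edulis", "Parus majro", "Quercus robur"]:
        print(q, "->", match_name(q, reference))
```

In a workflow like the one described for National Focal Points, the local species list would supply the queries and the pan-European list the reference; `bigram_similarity` could then be applied to the author strings of candidate pairs to confirm or reject a fuzzy hit.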

Page 12

PESI validation tools: Taxon Match Tool - 3

Excel file export

Page 13

[Diagram: PESI project structure (Management & Coordination, Infrastructural Networks, Community Networks)]

Taxonomic standards & authority files

Page 14

Conceptual integration

• What are the properties of a Taxonomic Backbone?
– connecting different uses of the same name for multiple classifications
– persistent name-name relationships of species names
• Optimise data sharing / interoperability
– Linked Data mark-up
– persistent identifiers: globally unique IDs (GUIDs, LSIDs)
– Darwin Core Archive format for transport
• Standardised ontologies / vocabularies
– consensus classification
– consensus distribution and occurrence scheme
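As one illustration of persistent identifiers and name-name relationships, the sketch below builds LSID-style identifiers (`urn:lsid:authority:namespace:object`) and writes two taxon rows using Darwin Core term names such as `taxonID`, `scientificName` and `acceptedNameUsageID`. The authority `example.org`, the namespace `taxname`, the object IDs and the helper names are assumptions for illustration, not PESI identifiers, and the two rows are sample data. A real Darwin Core Archive would also bundle a `meta.xml` descriptor with the data file in a zip; only the taxon table is sketched here.

```python
import csv
import io

# Darwin Core term names used as column headers for the taxon table.
FIELDS = ["taxonID", "scientificName", "scientificNameAuthorship",
          "taxonRank", "taxonomicStatus", "acceptedNameUsageID"]


def make_lsid(authority: str, namespace: str, object_id: str) -> str:
    """Build an LSID string from its three mandatory components."""
    return f"urn:lsid:{authority}:{namespace}:{object_id}"


def parse_lsid(lsid: str) -> tuple:
    """Split an LSID into (authority, namespace, object_id)."""
    parts = lsid.split(":")
    if len(parts) < 5 or [p.lower() for p in parts[:2]] != ["urn", "lsid"]:
        raise ValueError(f"not an LSID: {lsid!r}")
    return parts[2], parts[3], parts[4]


# Hypothetical identifiers; example.org is a placeholder authority.
ACCEPTED_ID = make_lsid("example.org", "taxname", "1")
SYNONYM_ID = make_lsid("example.org", "taxname", "2")

ROWS = [
    {"taxonID": ACCEPTED_ID, "scientificName": "Cyanistes caeruleus",
     "scientificNameAuthorship": "(Linnaeus, 1758)", "taxonRank": "species",
     "taxonomicStatus": "accepted", "acceptedNameUsageID": ACCEPTED_ID},
    {"taxonID": SYNONYM_ID, "scientificName": "Parus caeruleus",
     "scientificNameAuthorship": "Linnaeus, 1758", "taxonRank": "species",
     "taxonomicStatus": "synonym", "acceptedNameUsageID": ACCEPTED_ID},
]


def to_csv(rows: list) -> str:
    """Serialise taxon rows as CSV text with a Darwin Core header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def accepted_name(rows: list, taxon_id: str) -> str:
    """Follow acceptedNameUsageID from any name to its accepted name."""
    by_id = {row["taxonID"]: row for row in rows}
    return by_id[by_id[taxon_id]["acceptedNameUsageID"]]["scientificName"]


if __name__ == "__main__":
    print(to_csv(ROWS))
    print(accepted_name(ROWS, SYNONYM_ID))
```

Because the synonym row points to the accepted name by identifier rather than by name string, the relationship survives spelling corrections or reformatting of either name.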

Page 15

PESI consensus distribution and occurrence scheme

Gazetteer: http://www.vliz.be/vmdcdata/vlimar/

Page 16

[Diagram: PESI project structure (Management & Coordination, Infrastructural Networks, Community Networks)]

Taxonomic information e-infrastructure

Page 17

EDIT Platform dataflow in PESI domain

© Walter Berendsohn

PESI Phyco-Myco databases

Page 18

Quality control mechanisms

Inconsistency checks used in the merging process

Page 19

PESI Data Warehouse Model

• > 440,000 taxon names
• ~ 210,000 valid species names

Page 20

PESI Data Warehouse - statistics

Page 21

[Diagram: PESI project structure (Management & Coordination, Infrastructural Networks, Community Networks)]

e-Services for users & dissemination

Page 22

PESI project website

http://www.eu-nomen.eu/pesi

Page 23

PESI data portal

http://www.eu-nomen.eu/portal

Page 24

Linking to Global Names Architecture

Page 25

Acknowledgement (PESI SC & management)

Nihat Aktaç

Ward Appeltans

Walter Berendsohn

Phillip Boegh

Louis Boumans

Thierry Bourgoin

Mark Costello

Charles Hussey

Roger Hyam

Yde de Jong

Julia Kouwenberg

David Ouvrad

Henrik Pedersen