speciesLink A System for integrating distributed primary biodiversity data Vanderlei Perez Canhos Centro de Referência em Informação Ambiental, CrIA

Page 1

speciesLink

A System for integrating distributed primary biodiversity data

Vanderlei Perez Canhos

Centro de Referência em Informação Ambiental, CrIA

Page 2

Overview

• CRIA

• SinBiota and The Species Analyst

• speciesLink

• Type of collections involved

• Number of records

• Technical features

• Future plans

Page 3

CrIA

Reference Center on Environmental Information

http://www.cria.org.br

Focus on Biodiversity Informatics

• Open source software

• Standards and protocols

• Systems interoperability

• Partnerships

Page 10

http://speciesanalyst.net/

Location of participant collections: mainly United States

Taxonomic groups: several taxa

Protocol: Z39.50 (migration to DiGIR in progress)

Number of records: ~50,000,000

Page 11

Importance of data sharing

Paris

KU – Natural History Museum

British Museum

Field Museum

Page 12

speciesLink: Distributed Information System for Biological Collections

The main goal of speciesLink was to build a distributed system integrating several biological collections and making their primary data available on the Internet.

http://splink.cria.org.br

Page 13

São Paulo State Collections

Geographic distribution of the participant collections – phase I

• fish: 3

• herbaria: 4

• microorganisms: 3

• mites: 2

• inventories: SinBiota

Page 14

Number of Records

Group                    Available    Existing
Herbaria                 72,000       740,000
Microorganisms           1,000        2,700
Mites                    18,000       22,000
Fish                     70,000       123,000
Inventories (species)    38,000       38,000
Total                    ~200,000     ~1,000,000
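
The overall figures (~200,000 of ~1,000,000) can be checked against the per-group numbers. The short Python sketch below sums the counts given on this slide and derives the share of existing records that is already available; the names `RECORDS`, `totals` and `share_available` are illustrative and not part of speciesLink.

```python
# Record counts per group as given on the slide: (available, existing).
RECORDS = {
    "Herbaria": (72_000, 740_000),
    "Microorganisms": (1_000, 2_700),
    "Mites": (18_000, 22_000),
    "Fish": (70_000, 123_000),
    "Inventories (species)": (38_000, 38_000),
}


def totals(records):
    """Return (total available, total existing) over all groups."""
    available = sum(a for a, _ in records.values())
    existing = sum(e for _, e in records.values())
    return available, existing


def share_available(available, existing):
    """Fraction of the existing records that is already available."""
    return available / existing


if __name__ == "__main__":
    total_available, total_existing = totals(RECORDS)
    print(f"available: {total_available:,} of {total_existing:,}")
    for group, (a, e) in RECORDS.items():
        print(f"{group}: {share_available(a, e):.1%}")
```

The exact sums are 199,000 available and 925,700 existing records, which is consistent with the rounded totals on the slide.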

Page 15

Collection Management Software

Collection          Available    Existing

Microbial Collections
CBMAI               110          700
IBSBF               929          2,000

Observational Data
SinBiota            38,109       38,109

Botanical Collections
ESA                 730          80,000
SP                  11,280       350,000
IAC                 25,245       45,000
SPF                 21,828       133,500
UEC                 12,860       130,000

Zoological Collections
ACARISJRP           5,382        7,000
ACARIESALQ          12,392       15,000
DSZSJRP (fish)      5,714        23,000
LIRP (fish)         4,314        30,000
MZUSP (fish)        60,000       110,000

Page 16

Support to collections

• Providing basic equipment and network infrastructure

• Helping to choose a management system, when needed

• Helping to train and to import data, when needed

Page 17

Protocol and Content Schema

• DiGIR protocol (Distributed Generic Information Retrieval)

– Potential to be globally accepted

• DiGIR software (Java Portal & PHP Provider)

– Collaborative development

• DarwinCore v.2

– Covers the basic content elements (taxonomic identification, location and date of collecting event)
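
To make the three groups of basic content elements concrete, here is a minimal Python sketch of a specimen record. The field names follow DarwinCore v.2 element names as best recalled and should be checked against the published schema; the `SpecimenRecord` class, the helper function and all values are hypothetical and are not taken from speciesLink or the DiGIR software.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class SpecimenRecord:
    # Identification of the record within its collection.
    InstitutionCode: str
    CollectionCode: str
    CatalogNumber: str
    # Taxonomic identification.
    ScientificName: str
    # Location of the collecting event.
    Country: Optional[str] = None
    StateProvince: Optional[str] = None
    Locality: Optional[str] = None
    # Date of the collecting event.
    YearCollected: Optional[int] = None
    MonthCollected: Optional[int] = None
    DayCollected: Optional[int] = None


def missing_basic_elements(record):
    """List the names of the location and date fields that are still empty."""
    return [name for name, value in asdict(record).items() if value is None]


if __name__ == "__main__":
    # Hypothetical record; every value here is made up for illustration.
    rec = SpecimenRecord(
        InstitutionCode="EXAMPLE-INST",
        CollectionCode="EXAMPLE-COLL",
        CatalogNumber="0001",
        ScientificName="Examplea fictitia",
        Country="Brazil",
        StateProvince="São Paulo",
        YearCollected=2003,
    )
    print(missing_basic_elements(rec))
```

A check like `missing_basic_elements` is one simple way a provider could report which records lack location or date information before they are published.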

Page 18

Simple Search Interface

Page 19

System’s Architecture

[Diagram. Components shown:

• speciesLink site (presentation layer), Perl, DiGIR Portal (Java)

• Collection A: collection management system, SQL, data, PHP provider

• Collections B and C: collection management system, SQL, data repository, SOAP client

• Regional server: SOAP server, SQL, Postgres, PHP provider

• Connection types: fast and stable connectivity; slow or unstable connectivity]

Page 20

Network Design

[Diagram: four regional servers]

Page 22

Data Migration Client

• Platform independent (Java)

• Connects to any database accessible via JDBC (simple text files are also supported)

• Complete control over data

• Low traffic

• Possibility to filter sensitive data using a regular expression
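
The last point can be illustrated with a minimal Python sketch. The actual client is written in Java and its filtering interface is not shown on the slide, so the function `filter_sensitive`, the field names and the pattern below are assumptions made only to show the idea: records whose chosen field matches a regular expression are held back before the data leaves the collection.

```python
import re


def filter_sensitive(records, field, pattern):
    """Drop the records whose `field` value matches the regular expression."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [r for r in records if not regex.search(str(r.get(field, "")))]


if __name__ == "__main__":
    # Hypothetical records; field names and values are made up.
    records = [
        {"CatalogNumber": "1", "Notes": "common species"},
        {"CatalogNumber": "2", "Notes": "ENDANGERED, do not publish locality"},
        {"CatalogNumber": "3", "Notes": ""},
    ]
    public = filter_sensitive(records, "Notes", r"endangered|restricted")
    print([r["CatalogNumber"] for r in public])
```

Filtering on the collection's side before transfer fits the "complete control over data" point: the curator decides which records are exported.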

Page 24

Regional server

Features

• Perl / PostgreSQL combination

• Can hold data from several collections

• Interpretation rules can be applied to specific data

[Diagram: SOAP server (Perl), SQL, Postgres, PHP provider]

Page 25

Query Result (brief)

Page 26

speciesLink – phase II

Page 27

>35 collections available

Page 30

Future plans

• Mapping tools

• Data cleaning tools

• Modelling framework

Page 31

Infrastructure for Species Distribution Modelling

[Diagram. Components shown:

• Specimens: DiGIR Portal, BioCASE Portal

• Environmental layers: precipitation, vegetation, temperature

• Modelling algorithms: Bioclim, Neural Net, GARP

• ACME]

Page 32

Acknowledgements (phase I)

Instituto de Botânica

Universidade Estadual de Campinas

Universidade de São Paulo

Instituto Agronômico de Campinas

Instituto Biológico

Universidade Estadual Paulista

Escola Superior de Agricultura “Luiz de Queiroz”

Page 33

Fellowships

• Visiting researchers

– Andrew Townsend Peterson (3 months)

– Arthur Chapman (1 year)

• Postdoctoral researcher

– Ingrid Koch

• Technical training (6 TT fellowships)

Page 34

Summing up

• Achieved proof of concept

• Data is already available

• Low cost for connecting new collections

• Triggered a movement within the collections to improve data quality and to increase the amount of available information

• Adoption of standards and protocols

• International partnerships: DiGIR, modelling framework

• Interoperability with similar initiatives

Page 35

Thank you!

http://splink.cria.org.br
