
EURISCO demo installations of IPT, at GBIF EU Nodes meeting in Alicante (11 March 2010)


DESCRIPTION

Regional GBIF Nodes meeting of Europe, March 2010. Presentation of current activities from the NordGen Node: implementations of the GBIF IPT toolkit for genebanks in Europe, and an upgrade for selected genebanks from the BioCASE publishing toolkit to the IPT. This is the first step of a larger implementation scheduled to start in 2011 as part of the EuroGeneBank application, pending an EU funding decision. NordGen IPT EURISCO


Page 1: EURISCO demo installations of IPT, at GBIF EU Nodes meeting in Alicante (11 March 2010)

GBIF IPT installations for EURISCO

GBIF Tools and Darwin Core extension for germplasm

Cartoon by Sasha Kopf (Creative Commons)

European GBIF Nodes Meeting 2010, March 10th-12th, Alicante, Spain

Dag Endresen, Nordic Genetic Resources Center, NordGen

Page 2

Topics for this session

GBIF IPT installation for EURISCO

• Overview of the project

• Darwin Core extension for germplasm

• GBIF informatics tools

• Integrated Publishing Toolkit (IPT)

• IPT installations for EURISCO

• Possible PGR network model

Page 3

Darwin Core extension for Germplasm (presented at TDWG 2009)

Opened up for use of new GBIF technology in the gene banking world

Proposal to implement GBIF technology as a test in the European gene banking community

Page 4

From the contract between NordGen and GBIF:

“... a feasibility study aimed at demonstrating the practical implementation of the GBIF decentralised architecture strategy and in particular in the context of the EURISCO Network.”

“... focused on the adoption of the IPT by selected gene banks in Europe, the publishing of richer content using the Darwin Core germplasm extension and the indexing of these published resources by the EURISCO platform.”

“... implemented in the context of EURISCO and therefore in close collaboration with the EURISCO Coordinator.”

Page 5

GBIF Informatics Suite

GBIF tools to empower decentralized thematic or regional networks

Darwin Core extension for germplasm makes these tools usable for crop gene banks.

Page 6

Darwin Core

The purpose of DwC terms is to facilitate data sharing

• a well-defined standard core vocabulary

• a flexible framework to maximize re-usability

The Darwin Core can be extended by adding new terms to share additional information.

TDWG standard 2009

“The Darwin Core is primarily based on taxa, their occurrence in nature as documented by observations, specimens, and samples, and related information.”

http://rs.tdwg.org/dwc/

Page 7

DwC star schema model
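
The star schema can be pictured as one core record at the centre, with extension rows pointing back to it through the core record's identifier. The following is a minimal Python sketch of that idea only; the identifiers, values, and the `storageType` field are made up for illustration and are not taken from the slides.

```python
from collections import defaultdict

# Core occurrence records keyed by an identifier (all values are made up).
core = {
    "ACC-1": {"institutionCode": "EXAMPLE", "scientificName": "Hordeum vulgare"},
    "ACC-2": {"institutionCode": "EXAMPLE", "scientificName": "Triticum aestivum"},
}

# Extension rows: each row carries the id of the core record it belongs to.
# "storageType" is a hypothetical extension field used for illustration only.
extension_rows = [
    {"coreid": "ACC-1", "storageType": "seed collection"},
    {"coreid": "ACC-1", "storageType": "safety duplicate"},
    {"coreid": "ACC-2", "storageType": "seed collection"},
]


def attach_extensions(core_records, rows):
    """Group extension rows under the core record they point to."""
    grouped = defaultdict(list)
    for row in rows:
        if row["coreid"] in core_records:  # skip rows without a matching core record
            grouped[row["coreid"]].append(row)
    return {
        core_id: {"core": record, "extensions": grouped[core_id]}
        for core_id, record in core_records.items()
    }


star = attach_extensions(core, extension_rows)
```

One core record can carry any number of extension rows, which is what lets a genebank add germplasm-specific fields without changing the core Darwin Core record.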

Page 8

http://code.google.com/p/darwincore-germplasm

http://rs.nordgen.org/dwc

DwC extension for Germplasm

DwC Germplasm : DRAFT 0.1 : August 26, 2009

• “MCPD in Darwin Core”

• Maintained by gene banks worldwide

• Additional terms to describe germplasm samples

• Includes the new terms for crop trait experiments developed as part of the European EPGRIS3 project

• Includes a few additional terms for new international crop treaty regulations

Page 9

DwC Germplasm (1)

Page 10

DwC Germplasm (2)

Page 11

DwC Germplasm (3)

Page 12

DwC Germplasm (4)

Page 13

DwC Germplasm (5)

Page 14

DwC Germplasm (6)

GermplasmDistribution

Perhaps add new terms to facilitate the reporting of germplasm distribution for the ITPGRFA (International Treaty on Plant Genetic Resources for Food and Agriculture).

GermplasmManagement

The Millennium Seed Bank (Kew) has contributed feedback to the DwC-G modeling and proposed to include a number of seed management descriptors.

• Seed processing terms

• Seed cleaning

• Seed germination testing

ConservationStatus

Suggested by ENSCONET: threat status for populations in situ

Page 15

Mapping of DwC-G terms to the MCPD descriptors

Page 16

Mapping of DwC-G terms to the MCPD descriptors (continued)
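
As a rough illustration of what such a mapping involves, the Python sketch below renames a few MCPD fields to Darwin Core terms and converts MCPD coordinates to decimal degrees. The MCPD field names and the DwC terms exist in their respective standards, but the pairing shown here is an assumption for illustration and is not the mapping table from the slides.

```python
# Assumed pairing of a few MCPD descriptors with Darwin Core terms.
# This is an illustration, not the mapping table shown on the slides.
MCPD_TO_DWC = {
    "INSTCODE": "institutionCode",
    "ACCENUMB": "catalogNumber",
    "GENUS": "genus",
    "SPECIES": "specificEpithet",
    "LATITUDE": "decimalLatitude",
    "LONGITUDE": "decimalLongitude",
}


def mcpd_coordinate_to_decimal(value):
    """Convert an MCPD coordinate such as '103020S' or '0762510W' to decimal degrees.

    MCPD marks missing minutes or seconds with hyphens; they are treated as zero here.
    """
    hemisphere = value[-1].upper()
    digits = value[:-1].replace("-", "0")
    seconds = int(digits[-2:])
    minutes = int(digits[-4:-2])
    degrees = int(digits[:-4])  # 2 digits for latitude, 3 for longitude
    decimal = degrees + minutes / 60 + seconds / 3600
    return -decimal if hemisphere in ("S", "W") else decimal


def mcpd_to_dwc(record):
    """Rename mapped MCPD fields to DwC terms; unmapped fields are dropped."""
    converted = {}
    for mcpd_name, dwc_term in MCPD_TO_DWC.items():
        if mcpd_name in record:
            value = record[mcpd_name]
            if mcpd_name in ("LATITUDE", "LONGITUDE"):
                value = mcpd_coordinate_to_decimal(value)
            converted[dwc_term] = value
    return converted
```

For example, `mcpd_coordinate_to_decimal("103000S")` returns `-10.5`. A real mapping would also need to decide how to carry over the fields that this sketch drops.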

Page 17

MCPD -> ABCD 2.06 (2004)

• National Inventory Code

• Institute Code

• Accession Number

• Collecting Number

• Collecting Institute Code

• Genus

• Species

• Species Authority

• „Subtaxa“

• „Subtaxa“ Authority

• Common Crop Name

• Accession Name

• Acquisition Date

• Country of Origin

• Location of Collection Site

• Latitude of CS

• Longitude of CS

• Elevation of CS

• Collecting Date of Sample

• Breeding Institute Code

• Biological Status of Accession

• Ancestral Data

• Collecting/Acquisition Source

• Donor Institute Code

• Donor Accession Number

• Other Identification (Number) associated with the accession

• Location of Safety Duplicates

• Type of Germplasm Storage

• Remarks

• Decoded Collecting Institute

• Decoded Breeding Institute

• Decoded Donor Institute

• Decoded Safety Duplication Location

• Accession URL

Descriptors marked red did not match the earlier versions of ABCD.

ABCD was extended by a PGR section [W. Berendsohn, H. Knüpffer]

Helmut Knüpffer, IPK Gatersleben

Walter Berendsohn, BGBM

http://www.ecpgr.cgiar.org/epgris/Tech_papers/EURISCO_Descriptors.pdf

Page 18

Home: http://code.google.com/p/gbif-providertoolkit/

Primary developers: Markus Döring, Tim Robertson, John Wieczorek

Source code: Java

Released: 2009

DEMO at http://ipt.gbif.org/

Genebank Example at http://ipt.nordgen.org/ipt/

Page 19

Integrated Publishing Toolkit (IPT)

A tool in support of data publishers.

A simple and straightforward mechanism to share primary biodiversity data following the Darwin Core standard.

Open source, Java based web application.

Provides a local tool for data quality assessment.

Page 20

GBIF Integrated Publishing Toolkit (IPT)

- Java 1.5 or higher is required

- Apache Tomcat is recommended (1 GB RAM+)

- GBIF IPT is provided as a WAR archive (for easy deployment)

- GeoServer is included for web mapping (OGC compliant, WFS, WMS, etc.)

- H2 embedded Java database (with JDBC interface and web console)

- Hibernate (object-relational mapping)

Page 21

IPT Interfaces

REST XML TAPIR DwC Archive OGC (WFS, WMS, Web Mapping) EML (Ecological Metadata Language)

Page 22

Darwin Core Archive (DwC-A)

• DwC-A publishes DwC records, including extensions

• Simple text-based format

• Zipped single-file archive

Germplasm.txt

http://code.google.com/p/gbif-ecat/wiki/DwCArchive
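
To make the archive layout concrete, here is a minimal Python sketch that zips a core file, one extension file, and a `meta.xml` descriptor into a single archive. The `meta.xml` elements and attributes follow the Darwin Core text guidelines; the `http://example.org/germplasm/terms/` namespace, the `storageType` term, and the sample rows are hypothetical placeholders, not the published germplasm extension.

```python
import io
import zipfile

DWC = "http://rs.tdwg.org/dwc/terms/"
# Hypothetical namespace for the germplasm extension, used for illustration only.
GERMPLASM = "http://example.org/germplasm/terms/"

# Descriptor telling a consumer how the text files are laid out and linked.
META_XML = f"""<?xml version="1.0" encoding="UTF-8"?>
<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core encoding="UTF-8" fieldsTerminatedBy="\\t" linesTerminatedBy="\\n"
        ignoreHeaderLines="1" rowType="{DWC}Occurrence">
    <files><location>occurrence.txt</location></files>
    <id index="0"/>
    <field index="1" term="{DWC}institutionCode"/>
    <field index="2" term="{DWC}scientificName"/>
  </core>
  <extension encoding="UTF-8" fieldsTerminatedBy="\\t" linesTerminatedBy="\\n"
             ignoreHeaderLines="1" rowType="{GERMPLASM}Germplasm">
    <files><location>Germplasm.txt</location></files>
    <coreid index="0"/>
    <field index="1" term="{GERMPLASM}storageType"/>
  </extension>
</archive>
"""


def to_tsv(header, rows):
    """Render a header and rows as tab-delimited text with one header line."""
    lines = ["\t".join(header)] + ["\t".join(row) for row in rows]
    return "\n".join(lines) + "\n"


def build_archive(core_rows, germplasm_rows):
    """Return the bytes of a zipped archive with a core file and one extension file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("meta.xml", META_XML)
        archive.writestr(
            "occurrence.txt",
            to_tsv(["id", "institutionCode", "scientificName"], core_rows),
        )
        archive.writestr("Germplasm.txt", to_tsv(["coreid", "storageType"], germplasm_rows))
    return buffer.getvalue()


archive_bytes = build_archive(
    [["ACC-1", "EXAMPLE", "Hordeum vulgare"]],
    [["ACC-1", "seed collection"]],
)
```

The first column of `Germplasm.txt` repeats the core record's id, which is how the extension rows are tied back to the core file.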

Page 23

Alternatives:

• TAPIR (2004 ->)

• DiGIR (PHP, 2001-2006)

• TapirLink (PHP, 2007 ->)

• BioCASE (Python, 2001-2008)

• PyWrapper3 (2006-2008)

• EURISCO (tab-delimited, 2003)

• ICIS (Java, 1996 ->)

• BioMOBY (Perl, 2001 ->)

IPT service from NordGen at http://ipt.nordgen.org/ipt/

Page 24

• Embeds its own database

• Multilingual

• Has a user management feature based on roles, which allows for multiple data managers to share a common instance

• Manages multiple data sources

• Several upload options: relational database management systems or data files

• Public web interface allows for data browsing and full text search

• Customised detail pages

Page 25

GBIF IPT

GBIF IPT implements the Darwin Core standard and provides an interface for building extensions to the core Darwin Core terms.

The draft germplasm extension is one example of how to extend the Darwin Core terms for the GBIF IPT.

Page 26

The IPT user interface includes the germplasm extension

Page 27

XML interface includes the germplasm extension

Page 28

Addresses the need of Node managers to aggregate indexes of published primary biodiversity data.

Aims to ease the complexity of heterogeneous networks of data publishers, by shielding the end-user from the complexities of the different protocols.

The Harvesting and Indexing Toolkit (HIT)
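
The idea of shielding the end user from protocol differences can be pictured as one harvesting interface with a protocol-specific adapter behind it. The Python sketch below illustrates that pattern only; it is not the HIT's code or API, the class names are hypothetical, the XML layout is a made-up stand-in for a protocol response, and no network calls are made.

```python
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod


class Harvester(ABC):
    """Common interface: every adapter yields records as plain dicts."""

    @abstractmethod
    def records(self):
        ...


class TabDelimitedHarvester(Harvester):
    """Hypothetical adapter for tab-delimited text with one header line."""

    def __init__(self, text):
        self.text = text

    def records(self):
        lines = self.text.strip().split("\n")
        header = lines[0].split("\t")
        for line in lines[1:]:
            yield dict(zip(header, line.split("\t")))


class XmlHarvester(Harvester):
    """Hypothetical adapter for an XML response with <record .../> elements."""

    def __init__(self, xml_text):
        self.xml_text = xml_text

    def records(self):
        for element in ET.fromstring(self.xml_text).iter("record"):
            yield dict(element.attrib)


def build_index(harvesters):
    """Aggregate records from all sources into one index keyed by record id."""
    index = {}
    for harvester in harvesters:
        for record in harvester.records():
            index[record["id"]] = record
    return index


index = build_index([
    TabDelimitedHarvester("id\tscientificName\nA1\tHordeum vulgare\n"),
    XmlHarvester('<records><record id="B2" scientificName="Avena sativa"/></records>'),
])
```

The code that builds the index never needs to know which protocol a record came from, which is the simplification the slide describes.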

Page 29

Global Biodiversity Resources Discovery System (GBRDS)

A Yellow Pages reference of biodiversity resources.

The IPT and HIT instances installed in the course of this project will be registered in the GBRDS.

Any biodiversity organisation should be able to register its resources and services in the GBRDS and contribute to the discovery services.

Page 30

Objectives of the European genebank project

Evaluate the GBIF decentralized architecture.

Upgrade the Integrated Publishing Toolkit (IPT) with the genebank extension and develop associated documentation.

Install and test the IPT in various genebanks in Europe that, as far as possible, are also EURISCO/ECPGR partners.

Test the registration of IPT installations through the GBIF Global Biodiversity Resources Discovery System (GBRDS).

Test the Harvesting and Indexing Toolkit (HIT) installation for the EURISCO platform.

Install an IPT instance on the EURISCO platform and synchronize it with the GBIF central index.

Project runs until 20 December 2010.

Page 31

IPT deployment in Europe

NordGen in Sweden covering 5 countries (Denmark, Sweden, Finland, Norway and Iceland)

EURISCO / Bioversity-HQ (Italy)

Bioversity-Montpellier (France)

IPK Gatersleben (Germany)

WUR CGN (The Netherlands)

CRI (Czech Republic)

VIR (Russia)

Balkan countries (Albania, Bosnia, Croatia, Macedonia, Serbia, Romania)

Baltic countries (Estonia, Latvia, Lithuania)

Page 32

Possible PGR Network model

The gene bank dataset is shared from the holding gene bank.

The National Inventory (NI) endorses all national gene banks (and eventually individual accessions) for EURISCO.

ECPGR Crop databases can access passport data from EURISCO and additional crop specific data from the genebank IPT interface.

Standard data sharing tools ensure that the genebank dataset is available to other relevant decentralized thematic, regional or global networks.

Page 33

Potential of GBIF technology

Using GBIF technology (and contributing to its development), the PGR community can easily establish specific PGR networks without duplicating GBIF's work.

The compatibility of data standards between PGR and biodiversity collections made it possible to integrate the worldwide germplasm collections into the biodiversity community (TDWG, GBIF).

http://data.gbif.org/datasets/network/2

Page 34

Special thanks to:

• GBIF, Global Biodiversity Information Facility http://www.gbif.org

• TDWG, Biodiversity Information Standards http://www.tdwg.org

• BioCASE, The Biological Collection Access Service for Europe. http://www.biocase.org

• Bioversity International http://www.bioversityinternational.org

Things can happen in a band, or any type of collaboration, that would not otherwise happen. (Jim Coleman, Musician)