Microbial resources data standards and WDCM MDS

Outline

International data standards relevant to microbial resources information

OECD best practice guidelines and CABRI

MINE

ABCD and Darwin Core

MIGS/MIMS/MIMARKS

StrainInfo and MCL

How to incorporate these standards in WDCM dataset design

WDCM minimum datasets and recommended datasets

Standards Architecture

Biodiversity Information Standards (TDWG) principles

Biodiversity data will be modelled as a graph of identifiable objects

Objects are defined by an ontology: understandable by humans and computers

Requires globally unique identifiers to link objects across the network

Requires a transport protocol to ‘wrap’ the biodiversity data for transport: TAPIR
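
A minimal Python sketch of the graph model described above, under stated assumptions: the `BioObject` and `ObjectGraph` classes, the relation name `identifiedAs`, the sample values, and the use of `urn:uuid:` identifiers are all illustrative and are not taken from any TDWG specification. It only shows how globally unique identifiers let objects of different ontology classes reference each other.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class BioObject:
    """One identifiable object in the graph (hypothetical model)."""
    kind: str          # ontology class, e.g. "Strain" or "TaxonName"
    properties: dict   # descriptive data attached to the object
    # Assumption: a random UUID stands in for whatever GUID scheme is used.
    guid: str = field(default_factory=lambda: f"urn:uuid:{uuid.uuid4()}")
    links: dict = field(default_factory=dict)  # relation name -> target GUID


class ObjectGraph:
    """Stores objects by GUID and resolves links between them."""

    def __init__(self):
        self._objects = {}

    def add(self, obj):
        self._objects[obj.guid] = obj
        return obj.guid

    def link(self, source_guid, relation, target_guid):
        # Both ends must already be known to the graph.
        if source_guid not in self._objects or target_guid not in self._objects:
            raise KeyError("unknown GUID")
        self._objects[source_guid].links[relation] = target_guid

    def resolve(self, guid, relation):
        # Follow one link and return the object it points to.
        return self._objects[self._objects[guid].links[relation]]


graph = ObjectGraph()
name_id = graph.add(BioObject("TaxonName", {"scientificName": "Bacillus subtilis"}))
strain_id = graph.add(BioObject("Strain", {"strainNumber": "EXAMPLE 0001"}))
graph.link(strain_id, "identifiedAs", name_id)
resolved = graph.resolve(strain_id, "identifiedAs")
```

The strain object never copies the name; it holds only the identifier of the name object, which is what allows the two to live in different systems.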

OECD best practice guidelines

Organism type              MDS (RDS)
Filamentous fungi          10 (6)
Yeasts                     10 (6)
Microalgae                 8 (3)
Bacteria                   11 (6)
Cyanobacteria              11 (5)
Archaea                    11 (5)
Protozoa                   11 (5)
Plasmids                   13 (17)
Phages                     11 (1)
Viruses                    12
cDNA and gDNA libraries    6

Common Access to Biological Resources and Information (CABRI)

Partner collections include BCCM, CABI, CBS, CRBIP, DSMZ, ICLC, NCCB and NCIMB

28 catalogues

Minimum datasets, Recommended datasets, Full datasets

The Minimum and Recommended datasets conform to the OECD best practice guidelines

Full Datasets

Substrate

Genotype

Pathogenicity

Enzyme Production

Metabolite Production

Remarks

Price Code

Full Datasets

Sexual state

Pathogenicity

Enzyme Production

Metabolite Production

Catalogue entry

Remarks

Price Code

Plasmids

Microbial Information Network Europe (MINE)

Microbial Information Network Europe (MINE) is being constructed by a number of major microbial culture collections in countries of the European Community, with the support of the Biotechnology Action Programme (BAP) of the Commission of the European Community.

Species records

Strain records

Synonym records

Alternative morphonym records

Microbial Information Network Europe (MINE)

Minimum datasets of 30 fields

Full datasets: 99 fields, grouped in 12 blocks:

1. internal administration

2. name

3. strain administration

4. status

5. environment and history

6. biological interactions

7. sexuality

8. properties (cytology, biomolecular data)

9. genotype and genetics

10. growth conditions

11. chemistry and enzymes

12. practical applications

Biodiversity Information Standards (TDWG), previously the Taxonomic Databases Working Group, is an international not-for-profit group that develops standards and protocols for sharing biological data…

TDWG Groups

Biological descriptions

Geospatial

Global identifiers

Imaging

Invasive species

Literature

Observations and specimens

TDWG Access Protocol for Information Retrieval (TAPIR)

Taxon names and concepts

Technical Architecture

Access to Biological Collections Data Standard (ABCD)

The ABCD Schema was developed within the BioCASE project (Biological Collections Access Service for Europe)

A standard for access to and exchange of data about specimens and observations, including living and preserved specimens.

ABCD is much more complex than Darwin Core, containing more than 1300 fields.

ABCD elements can be mapped to Darwin Core elements so that data can be shared between systems.
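
The mapping idea can be sketched in Python as a lookup table. This is a minimal sketch under stated assumptions: the ABCD paths are abbreviated and only a handful are listed, the pairing with Darwin Core terms is illustrative rather than an official crosswalk, and `Unit/CultureMedium` is a hypothetical path used only to show what happens to an element with no target term.

```python
# Illustrative subset only: abbreviated ABCD paths -> Darwin Core terms.
# Assumption: this pairing is for demonstration, not an official crosswalk.
ABCD_TO_DWC = {
    "Unit/UnitID": "catalogNumber",
    "Unit/SourceInstitutionID": "institutionCode",
    "Unit/Gathering/Country/Name": "country",
    "Unit/Identifications/Identification/Result/TaxonIdentified/"
    "ScientificName/FullScientificNameString": "scientificName",
}


def abcd_to_dwc(abcd_record):
    """Split a flat ABCD record into mapped Darwin Core terms and leftovers."""
    mapped, unmapped = {}, {}
    for path, value in abcd_record.items():
        term = ABCD_TO_DWC.get(path)
        if term is None:
            unmapped[path] = value   # no Darwin Core counterpart in the table
        else:
            mapped[term] = value
    return mapped, unmapped


abcd_record = {
    "Unit/UnitID": "EXAMPLE 0001",
    "Unit/SourceInstitutionID": "EXAMPLE-COLLECTION",
    "Unit/Gathering/Country/Name": "Japan",
    "Unit/CultureMedium": "nutrient agar",  # hypothetical path, no mapping
}
dwc_record, leftovers = abcd_to_dwc(abcd_record)
```

Returning the unmapped elements separately, rather than dropping them, makes it visible which parts of a richer ABCD record would be lost in a Darwin Core exchange.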

Darwin Core

The Darwin Core is a standard designed to facilitate the exchange of information about the geographic occurrence of species and the existence of specimens in collections.

It includes 184 terms.

Widely used in global and regional projects such as GBIF

It has no fields specific to cultures, such as Restrictions, Toxicity, Identification, Deposition and Isolation data, Conditions for growth, Storage Methods, Race, Mutant, Serovar

XML schema of Darwin Core

Genomic Standards Consortium (GSC)

… towards richer descriptions of our collection of genomes, metagenomes and marker genes … to promote mechanisms for standardizing the description of (meta)genomes, including the exchange and integration of (meta)genomic data.

MIGS/MIMS/MIMARKS

The Minimum Information about a (Meta)Genome Sequence (MIGS/MIMS) specification describes genomic and metagenomic sequences.

MIGS/MIMS has been extended and adapted for describing environmental sequences: MIMARKS (minimum information about a marker gene sequence)

MCL

10 classes, nearly 100 fields:

1. Culture

2. Strain

3. Sample

4. Isolation

5. Medium

6. Publication

7. Deposit

8. CatalogDescription

9. BRC

10. StrainInfo

Standards for Journals and Publications

How to incorporate these standards in WDCM datasets design

Taxonomic Info

Strain Info

Environment and history

Properties/Phenotypic info

Sequence and genomic info

Reference

WFCC Global Catalogue of Microorganisms

Taxon Concept Schema (TCS)

Darwin Core, MINE, MIMARKS

GenBank Schema

MIGS/MIMS

Pubmed

Endnote

WDCM minimum datasets and recommended datasets

ATCC JCM NBRC CBS DSMZ BCC …

Strain number, Name, Organism type, Date of deposition, History, Isolated from, Geographic origin, Condition for growth, Other collection numbers, Application, Reference

WDCM minimal datasets
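
A minimal Python sketch of how a catalogue record might be checked against the minimal dataset fields listed above. The snake_case keys, the `missing_minimum_fields` helper, the sample record, and the assumption that every listed field must be non-empty are illustrative choices, not part of a published WDCM schema.

```python
# Field names follow the list on the slide; the snake_case keys are an assumption.
WDCM_MINIMUM_FIELDS = (
    "strain_number",
    "name",
    "organism_type",
    "date_of_deposition",
    "history",
    "isolated_from",
    "geographic_origin",
    "condition_for_growth",
    "other_collection_numbers",
    "application",
    "reference",
)


def missing_minimum_fields(record):
    """Return the minimal-dataset fields that are absent or blank in a record."""
    missing = []
    for name in WDCM_MINIMUM_FIELDS:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Hypothetical record with only part of the minimal dataset filled in.
record = {
    "strain_number": "EXAMPLE 0001",
    "name": "Bacillus subtilis",
    "organism_type": "Bacteria",
    "isolated_from": "soil",
    "geographic_origin": "   ",   # blank values count as missing
}
missing = missing_minimum_fields(record)
```

A check of this kind could be run when a collection submits its catalogue, so that incomplete records are reported back rather than silently accepted.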

Indexing System

Isolation source

Original Location

Application

WDCM recommended datasets

Environment package

Application package

Sequence information package

Biochemical and Physiological package

Searching by

Isolation source: human-related, soil, water, …

Application and products: enzyme, biofuel, …
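
A minimal Python sketch of searching strain records by isolation source and by application. The record layout, the `search_strains` helper, and the three sample strains are hypothetical; the category values are the ones named on the slide.

```python
# Hypothetical strain records indexed by isolation source and application.
strains = [
    {"strain_number": "EXAMPLE 0001", "isolation_source": "soil",
     "applications": ["enzyme"]},
    {"strain_number": "EXAMPLE 0002", "isolation_source": "water",
     "applications": ["biofuel"]},
    {"strain_number": "EXAMPLE 0003", "isolation_source": "soil",
     "applications": ["biofuel", "enzyme"]},
]


def search_strains(records, isolation_source=None, application=None):
    """Return strain numbers matching every criterion that is given."""
    hits = []
    for rec in records:
        if isolation_source is not None:
            if rec["isolation_source"].lower() != isolation_source.lower():
                continue
        if application is not None:
            known = [a.lower() for a in rec["applications"]]
            if application.lower() not in known:
                continue
        hits.append(rec["strain_number"])
    return hits


from_soil = search_strains(strains, isolation_source="soil")
for_biofuel = search_strains(strains, application="biofuel")
soil_biofuel = search_strains(strains, isolation_source="soil", application="biofuel")
```

Matching on a controlled list of category values, rather than on free text, is what makes the two criteria combinable in this way.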

OECD Guidelines, Darwin Core, ABCD code, MCL

JSCC, ABRCN, EBRCN, CABRI, …

WDCM experts working group

Extremophile type: high temperature, pH

Geographic characteristics: hot spring, salt lake

OECD JCM NBRC DSMZ CBS ATCC STRAININFO ABRCN JSCC WDCM

Accession number √ √ √ √ √ √ √ √ Strain number
Other collection numbers √ √ √ √ √ √ √ √
Name √ √ √ √ √ √ √ √ Genus Name Species_epithet
√ √ Date of deposition
Organism type √ √ √ √ √ √ √
Restrictions √ √ √
Status √ √ √ √ √
History of deposit √ √ √ √ √ √
Condition for growth √ √ √ √ √ √ Condition for growth: Temperature/medium
Form of supply √ √ shipped √
Geographic origin √ √ √ √ √ √ Misapplied names
Isolated from √ √ √ √ √ √ √ Mutant
Literature √ √ √ √ √ √ √ Sexual state
Race
Production Application Application Application Application Synonym Application
Biochemistry/Physiology sequence Patents Designations sequence
Cell wall Synonymous Name Deposited by Depositor sample info Depositor
Fatty acid Rehydration Fluid Biosafety Level medium info
Quinone Biosafety Level depositor info
G+C content Mating Type
Phylogeny Genetic Marker
Plant Quarantine No.
Animal Quarantine No.
Herbarium No.
Restriction

Implementation of data standards in WDCM data management system

EML

Recommended