Susanna-Assunta Sansone, Associate Director, Oxford e-Research Centre,
University of Oxford, UK
dx.doi.org/10.6084/m9.figshare.4055496.v1
@biosharing
bioCADDIE – DATS and CDEs Workshop, Bethesda, 8 May 2017
Formats Terminologies Guidelines
Common Data Elements
Types of content standards
Content standards: descriptors essential for interpretation, verification, reproducibility, sharing etc. of datasets
Minimum information reporting requirements, checklists
o Report the same core, essential information
o e.g. MIAME guidelines
Controlled vocabularies, taxonomies, thesauri, ontologies etc.
o Unambiguous identification and definition of concepts
o e.g. Gene Ontology
Conceptual models, schemas, exchange formats etc.
o Define the structure and interrelation of information, and the transmission format
o e.g. FASTA
de jure: standard organizations
de facto: grass-roots groups
Nanotechnology Working Group
Formats Terminologies Guidelines
Community-driven efforts, just a few examples
Formats Terminologies Guidelines
224
115
500+
Guidelines: MIAME, MIRIAM, MIQAS, MIX, MIGEN, ARRIVE, MIAPE, MIASE, MIQE, MISFISHIE, REMARK, CONSORT, …
Formats: SRAxml, SOFT, FASTA, DICOM, MzML, SBRML, SEDML, GELML, ISA, CML, MITAB, …
Terminologies: AAO, CHEBI, OBI, PATO, ENVO, MOD, BTO, IDO, TEDDY, PRO, XAO, DO, VO, …
Content standards in numbers
A web-based, curated and searchable portal that monitors the development and
evolution of standards, their use in databases and the adoption of both in data
policies, to inform and educate the user community
Data policies by funders, journals and other organizations
Content standards
Formats Terminologies Guidelines
Map this complex and evolving landscape
Databases
Using indicators to describe ‘status’
Ready for use, implementation, or recommendation
In development
Status uncertain
Deprecated as subsumed or superseded
All records are manually curated in-house and verified by the community behind each resource
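The four status indicators above could be modelled as a small enumeration so that tools can filter records by readiness. This is only a minimal sketch: the `Status` class, the `recommendable` helper and the placeholder record names are assumptions made for illustration and are not part of the portal's actual data model or API.

```python
from enum import Enum


class Status(Enum):
    """The four status indicators listed on the slide."""
    READY = "Ready for use, implementation, or recommendation"
    IN_DEVELOPMENT = "In development"
    UNCERTAIN = "Status uncertain"
    DEPRECATED = "Deprecated as subsumed or superseded"


def recommendable(records):
    """Return the names of records whose status is READY, in input order."""
    return [name for name, status in records if status is Status.READY]


# Hypothetical records: placeholder names, not real registry entries.
records = [
    ("standard-A", Status.READY),
    ("standard-B", Status.IN_DEVELOPMENT),
    ("standard-C", Status.DEPRECATED),
    ("standard-D", Status.READY),
]

print(recommendable(records))
```

A data policy check could use such a filter to recommend only standards flagged as ready, while still listing the others for information.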
Understanding how standards are used
Guideline
Formats
Terminology
Technologically-delineated views of the world
Biologically-delineated views of the world
Generic features (‘common core’)
o description of source biomaterial
o experimental design components
Technologies: Arrays, Scanning, Columns, Gels, MS, FTIR, NMR
Technology domains: transcriptomics, proteomics, metabolomics
Biology domains: plant biology, epidemiology, microbiology
Duplications & lack of interoperability among standards
Hard to use them in combination, e.g. to represent:
o Proteomics-based gut microbiota profiling
o Proteomics- and metabolomics-based gut microbiota profiling
Enhancing modularization
bsg-000174
biosharing:ReportingGuideline
bsg-000161
MINSEQE
MIMARKS
sample information
sample identifier
taxonomy identifier
sequence read
geo location
High-level information about the metadata standards
Representations of the standards elements
Template elements for
el-000001
el-000002
el-000003
provenance: MINSEQE
provenance: MINSEQE and MIMARKS
provenance: MIMARKS
• Serve machine-readable content metadata standards, providing provenance for their elements
• Inform the creation of metadata templates, rendering standards invisible to the researchers
Modularize and combine
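The idea of modular elements with provenance can be sketched as follows. This is only an illustration: the `Element` class and both helper functions are assumptions, as is the pairing of each element identifier with a label and with a provenance; the slide lists these items but does not state how they map to each other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    identifier: str        # element identifier, e.g. "el-000001"
    label: str             # human-readable name (pairing with the id is assumed)
    provenance: frozenset  # names of the standards that require this element


# Hypothetical registry: the id/label/provenance pairings are assumptions.
ELEMENTS = [
    Element("el-000001", "sequence read", frozenset({"MINSEQE"})),
    Element("el-000002", "sample identifier", frozenset({"MINSEQE", "MIMARKS"})),
    Element("el-000003", "geo location", frozenset({"MIMARKS"})),
]


def build_template(standards):
    """Collect every element required by at least one of the given standards."""
    wanted = set(standards)
    return [e for e in ELEMENTS if e.provenance & wanted]


def shared_elements(first, second):
    """Return the elements that both standards have in common."""
    return [e for e in ELEMENTS if {first, second} <= e.provenance]


# A combined template lists each element once, with its provenance attached.
for element in build_template(["MINSEQE", "MIMARKS"]):
    print(element.identifier, element.label, sorted(element.provenance))
```

Because each element carries its provenance, a template built from several standards contains a shared element only once, and the researcher filling in the template never needs to know which standard asked for it.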
Standard developing groups:
Journals, publishers:
Cross-links, data exchange:
Societies and organisations:
Institutional RDM services:
Projects, programmes: