A World-Wide Network for Metabolomics Data Exchange
Christoph Steinbeck
European Bioinformatics Institute (EMBL-EBI)
The European Molecular Biology Laboratory (EMBL)
A basic research institute funded by public research monies from 20 member states.
European Bioinformatics Institute (EBI)
Genes, genomes & variation
Literature & ontologies: Europe PubMed Central, Gene Ontology, Experimental Factor Ontology
Molecular structures: Protein Data Bank in Europe, Electron Microscopy Data Bank
European Nucleotide Archive, 1000 Genomes
Gene, protein & metabolite expression
Protein sequences, families & motifs
Chemical biology
Reactions, interactions & pathways
Systems
Ensembl, Ensembl Genomes
European Genome-phenome Archive, Metagenomics portal
Reaction times following external change
• Genetics (decades, centuries…)
• Epigenetics (days, months, years, …)
• Gene Expression (hours)
• Metabolism (seconds)
> 100,000 patient samples / year
> Several petabytes / year
=> Exabytes of human data at moderate scale-up
What do the EBI databases do?
Labs around the world send us their data and we…
Archive it
Classify it
Share it with other data providers
Analyse it
…provide tools to help researchers use it
A collaborative enterprise
MetaboLights
http://www.ebi.ac.uk/metabolights
open-access, cross-species, cross-application, long-term supported
Salek, R.M., Haug, K. and Steinbeck, C. (2013) Dissemination of metabolomics results: role of MetaboLights and COSMOS. Gigascience, 2:8.
MetaboLights Database
Experimental Repository
Reference Layer
Chemistry Spectroscopy Biology
Analysis Tools
Primary Literature
Primary data and Meta-Data, Spectra, Protocols, Synopses, ...
Data growth in EBI data repositories
3-month doubling time
for Metabolomics
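To make the doubling time concrete, the following minimal sketch computes the growth factor it implies, assuming the 3-month doubling simply continues unchanged (an extrapolation the slide does not claim). The function name is hypothetical.

```python
# Growth implied by a fixed doubling time.
# ASSUMPTION: pure exponential growth with a constant 3-month doubling time.
def growth_factor(months_elapsed, doubling_time_months=3.0):
    """Factor by which the data volume grows after `months_elapsed` months."""
    return 2.0 ** (months_elapsed / doubling_time_months)

for months in (3, 6, 12, 24):
    print(f"after {months:2d} months: x{growth_factor(months):.0f}")
```

With a 3-month doubling time, one year corresponds to four doublings, a 16-fold increase, and two years to a 256-fold increase.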
MetaboLights is now the recommended repository for the Nature journals,
EMBO journal, PLOS journals, Metabolomics Journal and others
Sansone,… Steinbeck et al. (2012) Toward interoperable bioscience data.
Nature Genetics, 44, 121–126.
Controlled Vocabularies
Ontologies
Minimum Information Standards
COSMOS: COordination of Standards in MetabolOmicS
European FP7 coordination action coordinated by us at
EMBL-EBI, Hinxton, Cambridge
• Create missing standards & formats
• Define workflows for dissemination
• Create world-wide data network
MetabolomeXchange 2014
• Global network for exchange and discoverability of metabolomics data
• Includes study as well as reference data
• 8.7 million eukaryotic species on Earth (± 1.3 million)
• 1.2 million species identified and classified
• 3,000 to 4,000 complete species genomes sequenced
What about completed metabolomes?
Species Metabolomes and How Little We Know
[Figure: chart on a logarithmic axis from 1 to 100,000,000 comparing Metabolites in Human, Metabolites in Microbes, Compounds in ChEBI, Metabolites in HMDB, Metabolites in Plants, Compounds in ChEMBL and Compounds in PubChem. Values labelled on the chart: 80,000; 200,000; 2,000,000.]
There are known knowns; there are things we know we know. We also know there are known unknowns; that is to say, we know there are some things we do not know. But there are also unknown unknowns – the ones we don’t know we don’t know.
—United States Secretary of Defense,
Donald Rumsfeld
Building upon extensive genomics research, we argue that the time is now right to focus intensively on model organism metabolomes. We propose a grand challenge for metabolomics studies of model organisms: to identify and map all metabolites onto metabolic pathways, to develop quantitative metabolic models for model organisms, and to relate organism metabolic pathways within the context of evolutionary metabolomics, i.e., phylometabolomics. These efforts should focus on a series of established model organisms in microbial, animal and plant research.
Metabolites. 2016 Feb 15;6(1)
A Case for Deep Metabolome Annotation
Help build species metabolomes
• Submit your metabolomics study to MetaboLights
• Submit data publications (e.g. to Scientific Data)
• Be highly cited :)
• 500 million people in the European Union
• Full genomes (soon for less than $1,000 per person)
• Urine/blood metabolome < 20 euros per patient
> 100,000 patient samples / year
> Several petabytes / year
=> Exabytes of human data at moderate scale-up
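The scale-up claim can be checked with back-of-the-envelope arithmetic. The sketch below is only illustrative: the slide says "several" petabytes per year, so the value of 3 PB is an assumption, as are the function names and the use of decimal units.

```python
# Rough arithmetic behind "100,000 samples/year -> petabytes -> exabytes".
# ASSUMPTION: "several petabytes" is taken as 3 PB; the slide gives no exact figure.
PETABYTE = 10**15  # decimal units
EXABYTE = 10**18

def bytes_per_sample(bytes_per_year, samples_per_year):
    """Average data volume attributable to one patient sample."""
    return bytes_per_year / samples_per_year

def samples_per_year_for(target_bytes_per_year, per_sample_bytes):
    """Number of samples per year needed to reach a target yearly volume."""
    return target_bytes_per_year / per_sample_bytes

ASSUMED_PETABYTES_PER_YEAR = 3  # hypothetical stand-in for "several"
per_sample = bytes_per_sample(ASSUMED_PETABYTES_PER_YEAR * PETABYTE, 100_000)
samples_for_one_exabyte = samples_per_year_for(EXABYTE, per_sample)

print(f"Per sample: {per_sample / 10**9:.0f} GB")
print(f"Samples/year for 1 EB: {samples_for_one_exabyte:,.0f}")
```

Under that assumption each sample accounts for about 30 GB, and one exabyte per year corresponds to roughly 33 million samples, which is under 7% of the 500 million people mentioned above if each contributed a single sample.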
Large Scale Computing with Medical Metabolomics Data
• EBI lead
• H2020
• 3 years
• 13 partners
• 8 million €
• 830 person-months
• Kick-off 9/15
• H2020 e-infra
Computer-Assisted Structure Elucidation (CASE)
Steinbeck C (2004) Recent developments in automated structure elucidation of natural products. Nat. Prod. Rep. 21, 512–518.
Finding the unknown
Limits to Growth
• Deterministic methods suffer from combinatorial explosion
• Prospective use of spectroscopic input information may make them error-intolerant
[Figure: number of constitutional isomers and calculation time plotted against the number of heavy atoms]
C13H16O3 (16 Heavy Atoms)
> 2,000,000,000 Constitutional Isomers
C10H16 (10 Heavy Atoms)
24938 Constitutional Isomers
C30H48O2 (32 Heavy Atoms)
>> 10^12 Constitutional Isomers
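The heavy-atom counts quoted with each formula can be reproduced by summing every non-hydrogen atom. The sketch below is a minimal helper for that check; it assumes simple formulas without parentheses, charges or isotope labels, and the function name is hypothetical. It does not enumerate isomers.

```python
import re

# Count heavy (non-hydrogen) atoms in a simple molecular formula.
# ASSUMPTION: flat formulas such as "C13H16O3"; no brackets, charges or isotopes.
def heavy_atom_count(formula):
    """Return the number of non-hydrogen atoms in `formula`."""
    total = 0
    # Each match is an element symbol followed by an optional count.
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        atoms = int(count) if count else 1
        if element != "H":
            total += atoms
    return total

for formula in ("C10H16", "C13H16O3", "C30H48O2"):
    print(formula, heavy_atom_count(formula))
```

For the three formulas on the slide this gives 10, 16 and 32 heavy atoms, matching the figures quoted alongside the isomer counts.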