Julio E. Peironcely @peyron
Juliopeironcely.com
PhD student at Leiden University and TNO
Understanding and Classifying Metabolite Space and Metabolite-Likeness, PLoS One (in press)
Metabolomics
The quantitative and qualitative analysis of all metabolites in samples of cells, body fluids, tissues, etc.
Metabolomics
[Figure: metabolomics workflow]
Stages: Biological question → Experimental design → Sampling → Sample preparation → Data acquisition → Data pre-processing → Data analysis → Biological interpretation
Intermediate products: Protocol, Samples, Raw data, List of peaks/biomolecules, Relevant biomolecules/connectivities & Models, Metabolites
Elemental Composition → Structure Generation → Molecules → Metabolite Likeness → Metabolites
Metabolite-likeness
Data sets: HMDB 8K, ZINC 21M
Representation: Atom Counts, Physicochemical desc., MDL Public Keys, FCFP_4, ECFP_4
Classification: Support Vector Machines (SVM), Random Forest (RF), Naïve Bayes (NB)
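The simplest of the five representations, atom counts, can be illustrated with a minimal sketch. It assumes each molecule is available as a plain molecular formula string; the slides do not say how the counts were computed, and in practice they would come from the standardized structure. The names `ELEMENTS`, `atom_counts` and `to_vector` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical element order for the fixed-length vector; not taken from the slides.
ELEMENTS = ["C", "H", "N", "O", "P", "S"]


def atom_counts(formula):
    """Count atoms per element in a simple formula such as 'C6H12O6'.

    Assumes no parentheses, charges, or isotope labels.
    """
    counts = Counter()
    # Each match is an element symbol followed by an optional count.
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return counts


def to_vector(formula):
    """Turn the counts into a fixed-length descriptor vector."""
    counts = atom_counts(formula)
    return [counts.get(element, 0) for element in ELEMENTS]


glucose = to_vector("C6H12O6")
```

A vector like this is what a classifier would receive for each molecule; the richer representations (physicochemical descriptors, MDL Public Keys, FCFP_4, ECFP_4) play the same role with more features.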
Metabolite-likeness
Data sets: HMDB 8K, ZINC 21M
Pipeline: Standardization → Diversity Selection → Training Set 532 + 532 and Test Set 6.4K + 6.4K → 5-fold CV → SVM, RF, BC → Metabolite likeness
3 classifiers × 5 descriptions
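The evaluation grid and the 5-fold split can be sketched as follows. This is only an illustration under stated assumptions: "532 + 532" is read here as 532 HMDB plus 532 ZINC molecules, the folds are stratified by class, and the fixed seed and the helper name `stratified_folds` are hypothetical. The slides do not describe how the folds were actually built.

```python
import itertools
import random

CLASSIFIERS = ["SVM", "RF", "BC"]
DESCRIPTIONS = ["Atom Counts", "Physicochemical desc.", "MDL Public Keys",
                "FCFP_4", "ECFP_4"]

# 3 classifiers x 5 descriptions = 15 candidate models.
MODELS = list(itertools.product(CLASSIFIERS, DESCRIPTIONS))


def stratified_folds(labels, k=5, seed=0):
    """Assign each sample index to one of k folds, keeping the class balance."""
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        indices = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin into the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# Balanced training set: 532 molecules of each class (assumed reading of the slide).
labels = [1] * 532 + [0] * 532
folds = stratified_folds(labels)

splits = []
for held_out in range(len(folds)):
    validation = folds[held_out]
    training = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
    # A real run would fit each of the 15 models on `training`
    # and score it on `validation` at this point.
    splits.append((training, validation))
```

Each molecule is used for validation exactly once, so every one of the 15 classifier and description combinations gets five performance estimates from the same training set.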
Metabolite-likeness
Model                         Sensitivity   Specificity   AUC
Best: RF + MDL Public Keys    99.84%        87.52%        99.20%
Bad: BC + P_desc              42.51%        86.56%        61.57%
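The slides do not define the three metrics, so the sketch below assumes the standard definitions: sensitivity as the fraction of true metabolites recognized, specificity as the fraction of non-metabolites rejected, and AUC as the probability that a randomly chosen metabolite receives a higher score than a randomly chosen non-metabolite. The counts and scores in the example are made up for illustration and are not results from the study.

```python
def sensitivity(tp, fn):
    """True positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: TN / (TN + FP)."""
    return tn / (tn + fp)


def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison; ties count as half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Made-up toy values, only to exercise the functions.
toy_sensitivity = sensitivity(tp=9, fn=1)
toy_specificity = specificity(tn=8, fp=2)
toy_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.4])
```

Under these definitions, a model can keep a reasonable specificity while missing most metabolites, which is the pattern in the BC + P_desc row: specificity stays near 87% while sensitivity drops to about 43%.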
Metabolite-likeness, external validation
HMDB, ChEMBL, DrugBank → Standardization → Random Selection → External validation set → Metabolite likeness
Conclusions
Prediction is good; interpretation is not
Useful in different fields
Local models needed