Understanding and classifying metabolite space and metabolite likeness

  • Julio E. Peironcely @peyron

    Juliopeironcely.com

    PhD student at Leiden University and TNO

    Understanding and Classifying Metabolite Space and Metabolite-Likeness, PLoS One (in press)

  • Metabolomics

    the quantitative and qualitative analysis of all metabolites in

    samples of cells, body fluids, tissues, etc.

  • Metabolomics

    [Figure: metabolomics workflow]

    Workflow stages: Biological question, Experimental design, Sampling, Sample preparation, Data acquisition, Data pre-processing, Data analysis, Biological interpretation

    Other labels in the figure: Protocol, Samples, Raw data, List of peaks/biomolecules, Relevant biomolecules/connectivities & Models, Metabolites

  • What do metabolites look like?

  • HMDB 8K

    ZINC 21M

  • Metabolites vs. non-metabolites

    Properties compared: water solubility, molecular weight (MW), number of C atoms, structural complexity, polar surface area (PSA)

  • PCA

  • Not so different

  • Decision Tree

  • Lots of candidate structures

  • Elemental Composition

    Structure Generation

    Molecules

  • We are looking for metabolites

  • Elemental Composition

    Structure Generation

    Molecules

    Metabolite Likeness

    Metabolites
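
The last step of this pipeline can be read as a filter: each generated candidate receives a metabolite-likeness score, and only candidates above a cut-off are kept. The sketch below illustrates that idea only. The `score` callable, the 0.5 threshold and the toy scores are hypothetical placeholders, not values from the talk.

```python
from typing import Callable, Iterable, List, Tuple


def filter_by_likeness(
    candidates: Iterable[str],
    score: Callable[[str], float],
    threshold: float = 0.5,
) -> List[Tuple[str, float]]:
    """Keep candidates whose score is >= threshold, best score first."""
    scored = [(c, score(c)) for c in candidates]
    kept = [(c, s) for c, s in scored if s >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Toy stand-in for a trained classifier: a lookup table of made-up scores.
toy_scores = {"cand_A": 0.91, "cand_B": 0.12, "cand_C": 0.67}
ranked = filter_by_likeness(toy_scores, toy_scores.get, threshold=0.5)
print(ranked)
```

In practice the `score` function would wrap whichever trained model is used, and the threshold would be chosen from validation data.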

  • Metabolite-likeness

    Data sets: HMDB 8K, ZINC 21M

    Representation: Atom Counts, Physicochemical desc., MDL Public Keys, FCFP_4, ECFP_4

    Classification: Support Vector Machines (SVM), Random Forest (RF), Naïve Bayes (NB)
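
To make the "representation + classification" pairing concrete, here is a minimal sketch of one such pair: binary fingerprint bits fed to a Naïve Bayes classifier. The Bernoulli variant with add-one smoothing is an assumption made for illustration, the 4-bit fingerprints are invented, and none of this is the implementation used in the study.

```python
import math
from typing import Dict, Sequence


def train_bernoulli_nb(X: Sequence[Sequence[int]], y: Sequence[int]) -> Dict:
    """Fit per-class bit probabilities with Laplace (add-one) smoothing."""
    n_bits = len(X[0])
    model: Dict = {"log_prior": {}, "p_bit": {}}
    for label in sorted(set(y)):
        rows = [x for x, lab in zip(X, y) if lab == label]
        model["log_prior"][label] = math.log(len(rows) / len(X))
        model["p_bit"][label] = [
            (sum(r[j] for r in rows) + 1) / (len(rows) + 2) for j in range(n_bits)
        ]
    return model


def predict(model: Dict, x: Sequence[int]) -> int:
    """Return the class with the highest log-posterior for fingerprint x."""
    best_label, best_score = None, -math.inf
    for label, log_prior in model["log_prior"].items():
        score = log_prior
        for bit, p in zip(x, model["p_bit"][label]):
            score += math.log(p if bit else 1.0 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented 4-bit fingerprints: 1 = metabolite, 0 = non-metabolite.
X_train = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 1], [0, 1, 1, 1]]
y_train = [1, 1, 0, 0]
nb = train_bernoulli_nb(X_train, y_train)
print(predict(nb, [1, 1, 0, 0]), predict(nb, [0, 0, 1, 1]))
```

Swapping the representation (for example key-based versus circular fingerprints) only changes how `X` is built; the classifier code stays the same.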

  • Metabolite-likeness

    HMDB 8K

    ZINC 21M

    Standardization

    Diversity Selection

    Representations: Atom Counts, Physicochemical desc., MDL Public Keys, FCFP_4, ECFP_4
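
The slide does not say which diversity-selection algorithm was used. As one common option, the sketch below shows greedy MaxMin picking over Tanimoto distances between fingerprint bit sets. Both the choice of MaxMin and the toy fingerprints are assumptions for illustration.

```python
from typing import FrozenSet, List, Sequence


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto similarity between two sets of 'on' fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def maxmin_pick(fps: Sequence[FrozenSet[int]], n_pick: int, first: int = 0) -> List[int]:
    """Greedy MaxMin: repeatedly add the item farthest from the picked set.

    Assumes fps is non-empty; returns indices into fps.
    """
    picked = [first]
    remaining = set(range(len(fps))) - {first}
    # Distance from each item to its nearest already-picked item.
    min_dist = [1.0 - tanimoto(fp, fps[first]) for fp in fps]
    while remaining and len(picked) < n_pick:
        nxt = max(sorted(remaining), key=lambda i: min_dist[i])
        picked.append(nxt)
        remaining.discard(nxt)
        for i in remaining:
            min_dist[i] = min(min_dist[i], 1.0 - tanimoto(fps[i], fps[nxt]))
    return picked


# Invented fingerprints, each given as the set of bits that are switched on.
fps = [
    frozenset({1, 2, 3}),
    frozenset({1, 2, 4}),
    frozenset({7, 8, 9}),
    frozenset({1, 2, 3, 4}),
]
print(maxmin_pick(fps, 2))
```

For libraries of millions of compounds a naive loop like this is too slow; it is meant only to show the selection criterion.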

  • Metabolite-likeness

    HMDB 8K, ZINC 21M

    Standardization

    Diversity Selection

    Training Set 532 + 532

    Test Set 6.4K + 6.4K

    5-fold CV

    Classifiers: SVM, RF, BC

    Representations: Atom Counts, Physicochemical desc., MDL Public Keys, FCFP_4, ECFP_4
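
A 5-fold cross-validation over a balanced 532 + 532 set can be sketched as below. Stratifying by class and the fixed random seed are assumptions, since the slides do not say how the folds were drawn.

```python
import random
from typing import Dict, Iterator, List, Sequence, Tuple


def stratified_kfold(
    labels: Sequence[int], k: int = 5, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, valid_idx) pairs, keeping class balance in each fold."""
    rng = random.Random(seed)
    by_class: Dict[int, List[int]] = {}
    for idx, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(idx)
    folds: List[List[int]] = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for pos, idx in enumerate(idxs):
            folds[pos % k].append(idx)  # deal indices round-robin
    for i in range(k):
        valid = sorted(folds[i])
        train = sorted(j for f in folds[:i] + folds[i + 1:] for j in f)
        yield train, valid


# 532 metabolites (label 1) and 532 non-metabolites (label 0).
labels = [1] * 532 + [0] * 532
splits = list(stratified_kfold(labels, k=5))
for train_idx, valid_idx in splits:
    print(len(train_idx), len(valid_idx), sum(labels[i] for i in valid_idx))
```

Each model is trained on four folds and evaluated on the fifth, so every training compound is used for validation exactly once.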

  • Metabolite-likeness

    3 classifiers (SVM, RF, BC) × 5 representations

  • Metabolite-likeness

    Model                                           Sensitivity   Specificity   AUC
    Best: RF + MDL Public Keys                      99.84%        87.52%        99.20%
    Bad: BC + P_desc (physicochemical descriptors)  42.51%        86.56%        61.57%
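
For reference, the three reported metrics can be computed as in the sketch below. Treating "metabolite" as the positive class, the 0.5 decision threshold and the toy labels and scores are assumptions, not data from the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented example: 1 = metabolite (positive class), 0 = non-metabolite.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(round(sens, 3), round(spec, 3), round(auc, 3))
```

Sensitivity and specificity depend on the chosen threshold, while AUC summarizes ranking quality over all thresholds, which is why the two kinds of numbers need not move together.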

  • Metabolite-likeness, external validation

    Sources: HMDB, ChEMBL, DrugBank

    Standardization

    Random Selection

    External validation set

    Metabolite likeness

  • Metabolite-likeness, external validation

  • Met-likeness + structure generation (methylhistamine) 260K

    46% 71%

  • Met-likeness + structure generation (malic acid) 8K

    100%

    57% 77%

  • Conclusions

    Prediction is good, interpretation is not

    Useful in different fields

    Local models needed

  • Acknowledgements

    TNO Quality of Life: Leon Coulier

    HMP, University of Alberta: David Wishart, Ying (Edison) Dong

    Leiden University: Theo Reijmers, Thomas Hankemeier

    University of Cambridge: Andreas Bender