Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

A presentation from the in silico drug discovery conference, Dec 3rd 2014, RTP, NC. It describes an analysis of the NIH probes, machine learning, and predictions.

Page 1: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

Page 2: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

SaaS

Easy to use

Used by Academia, Industry, Biotech

Private

Selective collaboration

100s of published datasets

Page 3: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

Copyright © 2013 All Rights Reserved Collaborative Drug Discovery

MM4TB: 25 organizations

New

Old

Neuroscience

Kinetoplastid Drug Development Consortium

Page 4: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

NIH spent a decade funding HTS efforts as part of the MLSCN and MLPCN

By 2010, $576.6M in funding

Various definitions of a probe

Potency, selectivity, solubility and availability

Little has been done to learn from this work

Page 5: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes
Page 6: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

Lajiness et al.: 13 chemists assessed 22,000 compounds (2000 each) for drug- or lead-likeness. They were not consistent in rejecting undesirable compounds (J Med Chem 2004, 47: 4891-6)

Hack et al.: 145 chemists asked to fill holes in a screening library (J Chem Inf Model 2011, 51: 3275-86)

Kutchukian et al.: medicinal chemists surveyed on selecting fragments for a lead; lack of consensus in compound selection (PLOS ONE 2012, 7: e48476)

Since the rule of 5 there has been a considerable focus on more rules: ALERTS, PAINS, QED, BadApple, etc.

Page 7: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

But do we really need a crowd? Could 1 medicinal chemist be enough?

> 40 years' experience

Page 8: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

Chris Lipinski scored the original 64 compounds; he was close to the median

Found more probes since 2009

Has now scored more than 300 NIH probes for desirability

Extensive due diligence

▪ Based on literature (public/private)

▪ Chemical Reactivity

Page 9: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

79% of 322 probes are desirable

Page 10: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

ML010 (CID 17757274)

valsartan (CID 60846)

CAS 1164083-19-5, US20120040982 (CID 57498937)

ML160 (CID 824820)

Representing molecules of different classes from public and commercial databases

Page 11: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

Properties from CDD

Properties from Discovery Studio

Higher molecular weight, rotatable bond count, and heavy atom count are associated with desirable probes

Page 12: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

Yellow - desirable

Blue - undesirable

Yellow – chemical probes

Blue - Microsource spectrum compounds

Page 13: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

Desirable probes less likely to be filtered by PAINS or BadApple as promiscuous than those scored as undesirable.

(Fisher's exact test, p<0.0001 for PAINS and p=0.04 for BadApple).
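A comparison of this kind reduces to a 2x2 contingency table (desirable or undesirable versus flagged or not flagged by a filter). The following is a minimal sketch of a two-sided Fisher's exact test using only the Python standard library. The probe counts in the usage lines are hypothetical placeholders chosen for illustration, not the counts behind the p-values above.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows could be desirable / undesirable probes and columns
    flagged / not flagged by a filter (an assumed layout).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table that is no more likely
    # than the observed one (small tolerance for float rounding).
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: 20 of 254 desirable probes flagged,
# 25 of 68 undesirable probes flagged.
p_value = fisher_exact_two_sided(20, 234, 25, 43)
print(f"p = {p_value:.2e}")
```

A small p-value here would indicate that the flag rate differs between the two groups by more than chance would explain.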

Page 14: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

322 NIH MLP probes clustered into 44 groups using ECFP_6 fingerprints, with a Tanimoto similarity threshold of >0.11 for cluster membership.

Blue - desirable

Red - undesirable

Circle area is proportional to cluster size, and singletons are represented as a dot.
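To make the clustering step concrete, here is a minimal sketch of Tanimoto similarity on sparse integer fingerprints, followed by a simple leader-style clustering. The slide does not name the clustering algorithm, so the leader approach is an assumption chosen for brevity, and the fingerprints in the usage lines are made-up integer sets rather than real ECFP_6 output.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of ints."""
    if not fp_a and not fp_b:
        return 0.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def leader_cluster(fingerprints, threshold):
    """Assign each fingerprint to the first cluster whose leader is
    more similar than the threshold; otherwise start a new cluster.

    Returns a list of clusters, each a list of input indices.
    """
    clusters = []  # pairs of (leader fingerprint, member indices)
    for index, fp in enumerate(fingerprints):
        for leader, members in clusters:
            if tanimoto(fp, leader) > threshold:
                members.append(index)
                break
        else:
            clusters.append((fp, [index]))
    return [members for _, members in clusters]


# Hypothetical fingerprints (sets of hashed feature integers)
example_fps = [{1, 2, 3, 4}, {1, 2, 3, 5}, {10, 11, 12}, {10, 11, 13}, {99}]
for members in leader_cluster(example_fps, threshold=0.11):
    print(len(members), members)
```

Cluster sizes from a routine like this are what would drive the circle areas in a plot of the kind described, with single-member clusters drawn as dots.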

Page 15: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

Drug discovery is repetitive and there are 1000s of diseases

Drug discovery is high risk

Do we need robots or just smarter programs that discover the ideas we test?

Page 16: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

What would happen if we could model Chris’s decisions?

Potential for other non-medicinal chemists to benefit

Streamline scoring compounds, save time

NIH probes

Page 17: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

FCFP_6 descriptors + 8 simple descriptors

Bayesian models validated by leaving out 50% of the data, repeated 100 times

5-fold cross-validation for the n307 models
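As an illustration of how a Bayesian model over fingerprint features can score a compound, the sketch below uses one commonly described Laplacian-corrected naive Bayes estimator. It is an assumption that this resembles the models on the slide: it uses fingerprint features only (no additional simple descriptors), and the training data in the usage lines is invented.

```python
from collections import Counter
from math import log


def train_laplacian_bayes(fingerprints, labels):
    """Learn one weight per fingerprint feature.

    fingerprints: list of sets of ints; labels: list of 0/1 (1 = desirable).
    Weight = log((A + 1) / (T * base_rate + 1)), where T is the number of
    training compounds containing the feature and A the number of those
    that are labelled 1.
    """
    base_rate = sum(labels) / len(labels)
    seen = Counter()
    seen_positive = Counter()
    for fp, label in zip(fingerprints, labels):
        for feature in fp:
            seen[feature] += 1
            if label:
                seen_positive[feature] += 1
    return {
        feature: log((seen_positive[feature] + 1) / (seen[feature] * base_rate + 1))
        for feature in seen
    }


def bayes_score(weights, fingerprint):
    """Sum the weights of known features; unseen features contribute 0."""
    return sum(weights.get(feature, 0.0) for feature in fingerprint)


# Hypothetical training set: features 1-3 occur in desirable compounds,
# features 4-6 in undesirable ones.
train_fps = [{1, 2}, {1, 3}, {4, 5}, {4, 6}]
train_labels = [1, 1, 0, 0]
model = train_laplacian_bayes(train_fps, train_labels)
print(bayes_score(model, {1, 2}), bayes_score(model, {4, 5}))
```

Higher scores indicate features seen more often than expected in the desirable class. A leave-out validation would repeat the training on a random half of the data and score the held-out half.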

Page 18: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes
Page 19: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

• The colors on the heat map correspond to the value of the indicated metric for each probe, listed vertically.

• The scale was normalized internally, with green corresponding to the optimal condition within each metric.

Page 20: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes
Page 21: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

MODELS RESIDE IN PAPERS, NOT ACCESSIBLE… THIS IS UNDESIRABLE

How do we share them? How do we use them?

Page 22: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

Open Extended Connectivity Fingerprints

ECFP_6 FCFP_6

Collected, deduplicated, hashed

Sparse integers

• Invented for Pipeline Pilot: public method, proprietary details

• Often used with Bayesian models: many published papers

• Built a new implementation: open source, Java, CDK

– stable: fingerprints don't change with each new toolkit release

– well defined: easy to document precise steps

– easy to port: already migrated to iOS (Objective-C) for TB Mobile app

• Provides core basis feature for CDD open source model service
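The "collected, deduplicated, hashed, sparse integers" idea can be sketched with a toy circular fingerprint. This is not the CDK implementation described above: the atom invariants (element symbol and degree only), the CRC32 hash, and the input format are all simplifying assumptions, and the structural-duplicate removal used by real ECFP is omitted.

```python
import zlib


def _stable_hash(values):
    # Deterministic 32-bit hash, so results do not vary between runs
    return zlib.crc32(repr(values).encode("utf-8"))


def toy_circular_fingerprint(atoms, bonds, diameter=6):
    """Toy circular fingerprint as a sorted list of unique integers.

    atoms: list of element symbols, e.g. ["C", "C", "O"]
    bonds: list of (atom_index, atom_index, bond_order) tuples
    diameter: 6 gives three growth iterations (radius 3)
    """
    neighbors = {i: [] for i in range(len(atoms))}
    for i, j, order in bonds:
        neighbors[i].append((order, j))
        neighbors[j].append((order, i))

    # Iteration 0: identifier from simple per-atom properties
    ids = [_stable_hash((symbol, len(neighbors[i]))) for i, symbol in enumerate(atoms)]
    collected = set(ids)

    for iteration in range(1, diameter // 2 + 1):
        updated = []
        for i in range(len(atoms)):
            # Sorting makes the result independent of atom numbering
            environment = tuple(sorted((order, ids[j]) for order, j in neighbors[i]))
            updated.append(_stable_hash((iteration, ids[i], environment)))
        ids = updated
        collected.update(ids)  # the set deduplicates identifiers

    return sorted(collected)


# Ethanol heavy atoms, written in two different atom orders
ethanol = toy_circular_fingerprint(["C", "C", "O"], [(0, 1, 1), (1, 2, 1)])
ethanol_reordered = toy_circular_fingerprint(["O", "C", "C"], [(0, 1, 1), (1, 2, 1)])
print(ethanol == ethanol_reordered)
```

Because each identifier depends only on an atom's own state and the sorted states of its neighbors, the same molecule yields the same integers regardless of how its atoms are numbered, which is the property that makes such fingerprints stable and easy to document.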

Page 23: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

Data + One Click =

Uses Bayesian algorithm and FCFP_6 fingerprints

Page 24: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

Rebuilt the n307 model in CDD Models

3-fold cross-validation

ROC = 0.69
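For readers unfamiliar with the metric, the ROC value is the probability that a randomly chosen desirable compound receives a higher model score than a randomly chosen undesirable one. A minimal sketch of that calculation follows. The scores and labels in the usage lines are invented, and in a cross-validation setting the scores would come from held-out folds.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: model scores; labels: 0/1 with 1 = positive class.
    Ties between a positive and a negative count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical held-out scores and expert labels
example_scores = [0.9, 0.8, 0.4, 0.3]
example_labels = [1, 0, 1, 0]
print(roc_auc(example_scores, example_labels))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, so a value such as 0.69 sits between the two.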

Page 25: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

http://goo.gl/PVkQeo

Making the data more accessible as we are drowning in molecules

[Figure: chart with axis "log database size (millions)", ticks from -1 to 3.5]

Page 26: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

Ligand efficiency higher in undesirable compounds

Bayesian model preferable in classifying desirable compounds vs other molecule quality metrics

Model could improve probe selection, score libraries, prior to more extensive due diligence

Probes could be scored by additional chemists depending on needs, e.g. bias toward CNS, anticancer, etc.

CNS

Anticancer

NIH probes

Page 27: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

Complexities in finding the NIH MLP probes in PubChem

Identifier and structure searches in CAS SciFinder™ reveal an extreme disclosure

The parallel worlds of commercial and public database disclosure do not completely intersect

Integration and intersections of databases and the need for bioassay ontology adoption

Public Commercial

Page 28: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

Need more collaboration or openness in terms of availability of chemistry and biology data.

Increased communication between the various databases that are both public and proprietary

Major hurdles exist to prevent this from happening - too much commercial value to proprietary databases

Clearly CAS and the other commercial vendors have to take notice

Page 29: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

We acknowledge that the Bayesian model software within CDD was developed with support from Award Number 9R44TR000942-02 “Biocomputation across distributed private datasets to enhance drug discovery” from the NCATS.

SE gratefully acknowledges Biovia (formerly Accelrys) for providing Discovery Studio.

SE thanks Jeremy Yang for the link to BadApple

Page 30: Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes

Litterman NK, Lipinski CA, Bunin BA, Ekins S. Computational Prediction and Validation of an Expert's Evaluation of Chemical Probes. J Chem Inf Model. 2014 Oct 27;54(10):2996-3004. doi: 10.1021/ci500445u. Epub 2014 Oct 7.

Christopher A. Lipinski, Nadia Litterman, Christopher Southan, Antony J. Williams, Alex M. Clark and Sean Ekins. The parallel worlds of public and commercial bioactive chemistry data. J Med Chem. Epub 2014 Nov 21.