PhD Candidate: Gian Marco Ghiandoni

[email protected]

Reaction Class Recommendation Models in de novo Drug Design

Reaction Class Recommendation Models - UK-QSAR Group · ukqsar.org/wp-content/uploads/2019/01/Talk-Ghiandoni... · 2019-05-13



Outline

1. Reaction vector-based de novo design

2. Issues related to reaction vector-based de novo design

3. Recommender systems

4. Recommender implementation, screening, and validation


Molecular De Novo Drug Design

Hartenfeller, M. & Schneider, G., 2011. Enabling future drug discovery by de novo design. Wiley Interdisciplinary Reviews: Computational Molecular Science, 1(5), pp. 742-759

1) Scoring Functions

2) Construction Methods

3) Search Strategies

Diagram labels: De novo Design, Chemical Structure, Biological Activity


Molecular De Novo Drug Design

Hartenfeller, M. & Schneider, G., 2011. Enabling future drug discovery by de novo design. Wiley Interdisciplinary Reviews: Computational Molecular Science, 1(5), pp. 742-759

1) Scoring Functions

2) Construction Methods

Generative Models (AI): use joint distributions p(x, y) to generate new data with characteristics similar to the training data

Reaction-driven Algorithms: use examples of chemical reactions as references to rationally combine fragments with each other

3) Search Strategies


Database Construction Process

USPD Grants 1976-2016: 1.8 million reactions of pharmaceutical interest

Patent ID: US03931156-1976

Atom-pair description of the reaction components

Computation of the difference vector and reaction classification:

Cl(1,0,0)-2(1)-C(3,2,1) = +1

N(1,0,0)-2(1)-C(3,2,1) = -1

“Functional Conversion (Amino to Chloro)”

Database encoding of vectors, synthetic references (IDs), and reaction classes

USPD Reaction Vector Database: 93,000 unique reaction vectors organised in 336 classes
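The difference-vector step can be sketched as follows. This is a minimal illustration, not the database code itself: it assumes each reaction component has already been described as a list of atom-pair strings. The function name `reaction_vector` and the unchanged C-C atom pair are hypothetical; only the Cl and N atom pairs come from the slide.

```python
from collections import Counter


def reaction_vector(reactant_pairs, product_pairs):
    """Return the non-zero atom-pair count differences (product minus reactant)."""
    diff = Counter(product_pairs)
    diff.subtract(Counter(reactant_pairs))
    # Atom pairs that are unchanged by the reaction cancel out and are dropped.
    return {pair: count for pair, count in diff.items() if count != 0}


# Toy amino-to-chloro conversion; the C-C pair is an invented unchanged pair.
reactant = ["N(1,0,0)-2(1)-C(3,2,1)", "C(3,2,1)-2(1)-C(3,2,1)"]
product = ["Cl(1,0,0)-2(1)-C(3,2,1)", "C(3,2,1)-2(1)-C(3,2,1)"]

vector = reaction_vector(reactant, product)
print(vector)
```

Under these assumptions, the pair gained by the product carries +1 and the pair lost from the reactant carries -1, matching the two entries shown for the "Functional Conversion (Amino to Chloro)" example.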


Reaction Vector-based De Novo Design

Patel, H. et al., 2009. Knowledge-based approach to de Novo design using reaction vectors. Journal of Chemical Information and Modeling, 49(5), pp. 1163-1184

Diagram: SM (starting material), R (reagent), RV (reaction vector), P (product)

40,000 reagents

93,000 classified reactions

1,500 potentially accessible products

Synthetic references are also provided to speed up the preparation of the candidates


Reaction Vector-based De Novo Design

Ghiandoni, G. et al., 2018. Fingerprint-based Recommendation Models in Reaction-driven Drug Design [Poster], UK QSAR Autumn Fall conference, Oxford (UK), 26th September 2018.

Diagram: SM, R, RV, P

40,000 reagents

93,000 classified reactions

1,500 potentially accessible products

PC Analysis: More accessible / Less accessible


Reaction Vector-based De Novo Design

Ghiandoni, G. et al., 2018. Fingerprint-based Recommendation Models in Reaction-driven Drug Design [Poster], UK QSAR Autumn Fall conference, Oxford (UK), 26th September 2018.

Diagram: SM, R, RV, P, Recommender

40,000 reagents

93,000 classified reactions

3,500 recommended reactions

700 recommended products

PC Analysis: More accessible / Less accessible



Recommender Systems

Ricci, F. et al., 2011. Recommender Systems Handbook. Springer US, Springer-Science Business Media, LLC, pp. XXX-842

Recommendation System: trained on past decisions to provide future suggestions

Input Vector: structured data format describing a given entry

List of Suggestions: can be used to prioritise certain decisions


Recommender Systems

Ghiandoni, G. et al., 2018. Fingerprint-based Recommendation Models in Reaction-driven Drug Design [Poster], UK QSAR Autumn Fall conference, Oxford (UK), 26th September 2018.

Recommendation System: trained on past decisions to provide future suggestions

Input Vector: structured data format describing a given entry

List of Suggestions: can be used to prioritise certain decisions

(1) Molecule

(2) Fingerprint

(3) Multi-label Algorithm (MODEL)

(4) Recommendations

Example recommendations: [C-N Bond Formation (Amination), C-N Bond Formation (N-arylation), Functional Elimination (Defluorination)]
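The molecule, fingerprint, model, recommendations flow can be illustrated with a deliberately simple stand-in. The sketch below is not the multi-label algorithm used in the talk: it swaps in a nearest-neighbour vote over Tanimoto similarity, and it assumes fingerprints are given as sets of on-bit indices. The function names, the toy fingerprints, and their class assignments are all hypothetical.

```python
from collections import Counter


def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def recommend(query_fp, training, k=2, top_n=3):
    """Rank reaction classes by votes from the k training entries most similar to the query."""
    neighbours = sorted(
        training, key=lambda item: tanimoto(query_fp, item[0]), reverse=True
    )[:k]
    votes = Counter()
    for _fingerprint, classes in neighbours:
        votes.update(classes)
    return [label for label, _count in votes.most_common(top_n)]


# Toy training data: (fingerprint on-bits, reaction classes). All values are invented.
training = [
    (frozenset({1, 4, 7}), {"C-N Bond Formation (Amination)",
                            "C-N Bond Formation (N-arylation)"}),
    (frozenset({1, 4, 9}), {"C-N Bond Formation (Amination)"}),
    (frozenset({2, 3, 8}), {"Functional Elimination (Defluorination)"}),
]

recs = recommend(frozenset({1, 4, 7, 9}), training)
print(recs)
```

The output is a ranked list of class labels for one input fingerprint, which is the shape of suggestion the recommender passes on to the design workflow.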


Why reaction classes rather than reactions?

Structural Novelty: Applying groups of reactions rather than specific examples preserves the chances of finding novel molecules

Database Interchangeability: Class recommendations can be applied to any compatible database, thus extending the versatility of the recommender

Data Requirements and Model Size: Using classes improves model generalisation, thus reducing the amount of data required to train an effective model


Label Discrimination Types

Ghiandoni, G. et al., 2018. Augmenting De novo Drug Design using Reaction Classification [Poster], UK QSAR and MGMS Structure-Activity Relationships conference, Cardiff (UK), 11th - 12th April 2018

Labels can be decomposed in order to include more subclasses within the same suggestions

Hierarchical labelling system:

4-layer: “C-C Bond Formation (Coupling) (Suzuki) (Bromo)”

3-layer: “C-C Bond Formation (Coupling) (Suzuki)”

2-layer: “C-C Bond Formation (Coupling)”

1-layer: “C-C Bond Formation”
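A label at any depth can be derived from the most specific one by keeping only its leading layers. The sketch below assumes labels follow the pattern shown above, a base name followed by parenthesised sub-levels separated by spaces; the function name `truncate_label` is hypothetical.

```python
import re


def truncate_label(label, layers):
    """Keep the first `layers` levels of a hierarchical reaction class label."""
    # Split only on whitespace that precedes an opening parenthesis, so spaces
    # inside the base name or inside a parenthesised level are left alone.
    parts = re.split(r"\s+(?=\()", label)
    return " ".join(parts[:layers])


full = "C-C Bond Formation (Coupling) (Suzuki) (Bromo)"
for depth in (4, 3, 2, 1):
    print(f"{depth}-layer: {truncate_label(full, depth)}")
```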


Reaction Vector-based De Novo Design

Ghiandoni, G. et al., 2018. Fingerprint-based Recommendation Models in Reaction-driven Drug Design [Poster], UK QSAR Autumn Fall conference, Oxford (UK), 26th September 2018.

Diagram: SM, R, RV, P, Recommender

40,000 reagents

93,000 classified reactions

3,500 recommended reactions

(1) Molecule

(2) Fingerprint

(3) Multi-label Algorithm (MODEL)

(4) Recommendations

Example recommendations: [C-N Bond Formation (Amination), C-N Bond Formation (N-arylation), Functional Elimination (Defluorination)]


Data Mining for Reaction Class Recommendation

Ghiandoni, G. et al., 2018. Fingerprint-based Recommendation Models in Reaction-driven Drug Design [Poster], UK QSAR Autumn Fall conference, Oxford (UK), 26th September 2018.

Pivoting

“Similar molecules have similar reactivity”

Identical descriptions are grouped together and associated with multiple reaction classes

Dataset shape and content depend on the encoding system as well
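The pivoting step can be sketched as a group-by over the molecular description. This is a minimal illustration, assuming each record pairs a hashable fingerprint (here a tuple of bits) with one reaction class; the function name `pivot` and the toy records are hypothetical.

```python
from collections import defaultdict


def pivot(records):
    """Group (fingerprint, reaction_class) records into one multi-label row per fingerprint."""
    grouped = defaultdict(set)
    for fingerprint, reaction_class in records:
        grouped[fingerprint].add(reaction_class)
    # Sort the classes so that each row has a deterministic label list.
    return {fp: sorted(classes) for fp, classes in grouped.items()}


# Toy records: two starting materials share the same description.
records = [
    ((1, 0, 1, 1), "C-N Bond Formation (Amination)"),
    ((1, 0, 1, 1), "C-N Bond Formation (N-arylation)"),
    ((0, 1, 0, 0), "Functional Elimination (Defluorination)"),
    ((1, 0, 1, 1), "C-N Bond Formation (Amination)"),
]

dataset = pivot(records)
print(len(records), "records ->", len(dataset), "rows")
```

A coarser encoding maps more molecules onto the same description, so the same records would collapse into fewer rows with more labels each, which is why the dataset shape depends on the encoding.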


Model Screening


Components:

• 1,056,836 starting materials and classes (US patents)

• 2 levels of class discrimination (generic and specific)

• 22 molecular descriptions (e.g. fingerprints)

• 6 multi-label approaches / 2 classifiers

Procedure:

• Data Split: Training 80% / Test 20%

• Metrics: Recall, Precision, F1-score, Hamming Loss, 0/1 Loss

534 models were screened in total
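For reference, the five metrics can be computed as below for label sets. This is a sketch under stated assumptions: it uses micro-averaging for recall, precision, and F1, which the slides do not specify, and the function name and toy labels are hypothetical.

```python
def multilabel_metrics(y_true, y_pred, n_labels):
    """Micro-averaged precision/recall/F1 plus Hamming and 0/1 loss for label sets."""
    tp = fp = fn = mismatches = 0
    for true, pred in zip(y_true, y_pred):
        tp += len(true & pred)      # labels correctly suggested
        fp += len(pred - true)      # labels suggested but not observed
        fn += len(true - pred)      # labels observed but not suggested
        mismatches += true != pred  # 0/1 loss counts any imperfect label set
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    n_samples = len(y_true)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # Fraction of all (sample, label) decisions that are wrong.
        "hamming_loss": (fp + fn) / (n_samples * n_labels),
        "zero_one_loss": mismatches / n_samples,
    }


# Toy example with four possible labels: a, b, c, d.
y_true = [{"a", "b"}, {"c"}]
y_pred = [{"a"}, {"c", "d"}]
scores = multilabel_metrics(y_true, y_pred, n_labels=4)
print(scores)
```

The 0/1 loss is much stricter than the Hamming loss: here both predictions contain a single wrong decision, so the Hamming loss is low while the 0/1 loss is at its maximum.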


Model Screening

Figure panels: (1) Label-type, (2) Approach, (3) Classifier


Design Validation

Diagram: SM, R, RV, P

26 starting fragments

24,000 reagents

12,000 classified reactions

Recommender: 3-layer CC-RF MACCS

PC: Control library

P1: 1-layer suggestions library

P2: 2-layer suggestions library

P3: 3-layer suggestions library


Design Validation

Synthetic Accessibility Estimations

ID  Pipeline  Applicable Reactions  Unique Products (InChI Keys)  Enumeration Time (s)  Mean RSynth [1]  Mean SAscore [2]
P3  3-layer   85 (0.24)             43,952 (0.31)                 381.22 (0.38)         0.57 (1.10)      1.69 (0.94)
P2  2-layer   94 (0.26)             49,988 (0.35)                 412.47 (0.42)         0.56 (1.08)      1.78 (0.99)
P1  1-layer   170 (0.48)            73,741 (0.52)                 585.03 (0.59)         0.54 (1.04)      1.83 (1.02)
PC  Control   357 (1.00)            141,834 (1.00)                991.65 (1.00)         0.52 (1.00)      1.80 (1.00)

1: RSynth values were computed using the software MOE by Chemical Computing Group ULC (2019)

2: SAscore values were computed using the RDKit implementation of the method by Ertl & Schuffenhauer (2009)
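The values in parentheses appear to be ratios relative to the control pipeline (PC). The short check below reproduces them for P3 from the figures in the table; the dictionary keys and the function name are hypothetical, and the interpretation as a ratio to control is an assumption that the arithmetic happens to support.

```python
# Raw values copied from the table for the control (PC) and the 3-layer pipeline (P3).
control = {"reactions": 357, "products": 141834, "time_s": 991.65,
           "rsynth": 0.52, "sascore": 1.80}
p3 = {"reactions": 85, "products": 43952, "time_s": 381.22,
      "rsynth": 0.57, "sascore": 1.69}


def ratios_to_control(row, reference):
    """Divide each value by the control value and round to two decimals."""
    return {key: round(row[key] / reference[key], 2) for key in row}


p3_ratios = ratios_to_control(p3, control)
print(p3_ratios)
```

Read this way, the 3-layer pipeline applies about a quarter of the reactions and takes under 40% of the enumeration time of the control, while mean RSynth rises by about 10%.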


Design Validation

Synthetic Accessibility Estimations

ID  Pipeline  Mean RSynth [1]  Mean SAscore [2]
P3  3-layer   0.57 (1.10)      1.69 (0.94)
P2  2-layer   0.56 (1.08)      1.78 (0.99)
P1  1-layer   0.54 (1.04)      1.83 (1.02)
PC  Control   0.52 (1.00)      1.80 (1.00)

1: RSynth values were computed using the software MOE by Chemical Computing Group ULC (2019)

2: SAscore values were computed using the RDKit implementation of the method by Ertl & Schuffenhauer (2009)


Conclusions

• Reaction vector design generates structures that are more likely to be synthetically accessible

• Machine learning recommender systems can be used to support the current design framework by:

  • Further increasing the synthetic accessibility of the products

  • Reducing the global enumeration time


Acknowledgements

Sheffield Chemoinformatics Research Group

Prof. Val Gillet

James Webster

Dr. Antonio de la Vega de León

Evotec UK Research Informatics Team

Dr. Mike Bodkin

Dr. James Wallace

Dr. Dimitar Hristozov