1
A vast virtual chemical space
powered by ChemAxon’s Markush Search Platform
and its utility in molecular design
Zhengwei Peng
09/26/2014, 2014 ChemAxon UGM, Boston
2
Outline
• What are virtual chemical spaces?
• How to search them?
• How to construct them?
• A pilot VL powered by the extended Markush Search
Engine and accessible by users via a web tool
• A wish list
• Summary
3
Chemical spaces: Patent Markush and
CombiChem libraries
[Figure: structures with R-group positions R1-R3 and R1-R5]
Peng, DDT, 2013
4
Previous work on vast virtual chemical spaces and related similarity search methods
Z. Peng, (2013) Very large virtual compound spaces: construction, storage, and utility in drug
discovery. Drug Discovery Today: Technologies, http://dx.doi.org/10.1016/j.ddtec.2013.01.004.
5
ChemAxon-Merck collaboration (2014) to extend the patent
Markush search engine to handle CombiChem virtual libraries
Hu et al, “LEAP into the Pfizer Global
Virtual Library (PGVL) Space: Creation
of Readily Synthesizable Design Ideas
Automatically”, J.Z. Zhou (ed.),
Chemical Library Design, Methods in
Molecular Biology 685 (2011)
[Figure: library structure with R-group positions R1, R2, R3]
Extended ChemAxon Markush search engine
>10^12
• EXACT search
• Sub Structure search
• + Similarity search (LEAP2 of Pfizer)
Advantages:
• a single search engine which supports all three search types
commonly used by chemists
• tighter integration to boost performance
• better support and continued improvements
6
LEAP2: a similarity search method
Levitra
Step 1: rank R-group fragments based on the Super Similarity score between Basis Products and the query molecule.
Step 2: focus on top-ranking fragments and enumerate a smaller subset of products.
Step 3: perform the standard similarity comparison between the query molecule and the enumerated products. Return search hits.
0 false positive rate, low false negative rate. Tremendous speed-up.
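The three steps above can be sketched in a few lines of Python. This is only an illustration of the rank, enumerate, and compare pattern, not the LEAP2 implementation: molecules are modeled as plain sets of feature labels, plain Tanimoto similarity stands in for the Super Similarity score (whose definition is not given on the slide), and the names `tanimoto`, `leap_like_search`, `top_k`, and `threshold` are all hypothetical.

```python
from itertools import product


def tanimoto(a, b):
    """Tanimoto similarity between two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def leap_like_search(query, core, rgroups, top_k=2, threshold=0.5):
    """Toy three-step search over a combinatorial library.

    rgroups is a list with one dict per R position, mapping a
    fragment id to that fragment's feature set (an assumed data model).
    """
    # Step 1: rank fragments at each position by the similarity between
    # a "basis product" (core + that fragment only) and the query.
    # ASSUMPTION: plain Tanimoto replaces the Super Similarity score.
    shortlisted = []
    for fragments in rgroups:
        scored = sorted(
            ((tanimoto(query, core | feats), frag_id)
             for frag_id, feats in fragments.items()),
            reverse=True,
        )
        shortlisted.append([frag_id for _, frag_id in scored[:top_k]])

    # Step 2: enumerate only products built from top-ranking fragments.
    hits = []
    for combo in product(*shortlisted):
        features = set(core)
        for position, frag_id in enumerate(combo):
            features |= rgroups[position][frag_id]
        # Step 3: standard similarity comparison on the full product.
        score = tanimoto(query, features)
        if score >= threshold:
            hits.append((score, combo))
    return sorted(hits, reverse=True)


core = {"amide"}
rgroups = [
    {"A1": {"phenyl"}, "A2": {"pyridyl"}, "A3": {"methyl"}},
    {"B1": {"sulfonyl"}, "B2": {"ethoxy"}, "B3": {"nitro"}},
]
query = {"amide", "phenyl", "sulfonyl"}
hits = leap_like_search(query, core, rgroups, top_k=1, threshold=0.5)
```

Because every returned hit has passed the full similarity comparison in Step 3, the sketch cannot return false positives; products whose fragments were pruned in Step 1 can be missed, which is where false negatives come from.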
7
Deployed as a service
Extended ChemAxon Markush Search Engine
• REST-ful API
• PLP connector
• Web App
• PLP protocols
• Rich GUI App
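As an illustration of how a client might talk to such a service, the sketch below builds (but does not send) an HTTP request for one of the three search types. The slides do not describe the actual API, so the endpoint path `/search`, the JSON field names, the example URL, and the function `build_search_request` are all hypothetical.

```python
import json
from urllib.request import Request

SEARCH_TYPES = {"exact", "substructure", "similarity"}


def build_search_request(base_url, library_id, query_smiles,
                         search_type="similarity"):
    """Build a POST request for a HYPOTHETICAL VL search endpoint.

    The path and payload layout are assumptions for illustration only;
    they are not the real service's API.
    """
    if search_type not in SEARCH_TYPES:
        raise ValueError(f"unknown search type: {search_type!r}")
    payload = {
        "library": library_id,   # which virtual library to search
        "query": query_smiles,   # query structure as a SMILES string
        "type": search_type,     # exact, substructure, or similarity
    }
    return Request(
        url=f"{base_url.rstrip('/')}/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_search_request(
    "https://vl.example.org/api/", "VL-001", "c1ccccc1", "substructure"
)
```

Keeping request construction separate from sending makes the same payload reusable from a web app, a Pipeline Pilot protocol, or a desktop client.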
8
Deployed as a service
Web tool
Other mol. design tools
Chem. Informatics Platform & Services
ChemAxon Markush Search engine
Virtual Libraries (~10^10 to 10^18)
Rxns & building blocks
9
Workflow used to construct & update VL content
Hu et al, Pfizer Global Virtual Library (PGVL): A chemistry design tool powered by
experimentally validated parallel synthesis information, ACS Comb. Sci., 2012, 14, 579-589.
The workflow is implemented using Biovia’s Pipeline Pilot
10
Chemistry knowledge/rules captured for re-use
[Figure: example reaction A + B → Product with its captured rules: A Component and B Component definitions, Component Exclusion, V Inclusion, V Exclusion, V Clipping, Core (R1, R2), and General Exclusion ([F,Cl,Br,I])]
Hu et al, Pfizer Global Virtual Library (PGVL): A chemistry design tool powered by experimentally validated
parallel synthesis information, ACS Comb. Sci., 2012, 14, 579-589.
11
Construction of a sample VL dataset
Hartenfeller et al (Novartis), A collection of robust organic synthesis reactions for in silico molecular design, J. Chem. Inf. Model. 2011, 51, 3093-3098
58 Rxns in combination with ACD reaction building blocks
a sample VL of 10^10 to 10^11 compounds
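To see how a modest number of reactions and building blocks reaches such sizes, the nominal size of a combinatorial virtual library is the sum, over reactions, of the product of the building-block counts for each reactant slot. The sketch below works this out; the reaction names and counts are made up for illustration and are not the figures behind the sample VL.

```python
from math import prod


def virtual_library_size(reactions):
    """Nominal product count: sum over reactions of the product of
    building-block counts per reactant slot.

    Duplicate products formed by different routes are not removed,
    so this is an upper bound on the number of unique compounds.
    """
    return sum(prod(counts) for counts in reactions.values())


# HYPOTHETICAL building-block counts per reactant slot.
reactions = {
    "amide_coupling": (20_000, 50_000),        # acids x amines
    "suzuki_coupling": (5_000, 8_000),         # halides x boronic acids
    "three_component": (1_000, 1_000, 1_000),  # three reactant slots
}

size = virtual_library_size(reactions)
print(f"nominal library size: {size:.2e} products")
```

Two-component reactions scale with the square of the building-block pool and three-component reactions with its cube, which is why a few dozen reactions are enough to span a space far too large to enumerate and store explicitly.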
12
A pilot web tool to showcase search capabilities
Similarity search time:
• depends on query structure and server load
• ranging from 20 s to 60 s
13
Sample search result
Hartenfeller et al. A Collection of Robust
Organic Synthesis Reactions for in Silico
Molecular Design, J. Chem. Inf. Model,
2011, 51, 3093-3098.
Returned info on a search hit:
• Library ID
• Rxn ID
• Building block IDs
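A search hit carrying these three pieces of information could be represented as a small record. The sketch below is an assumed data model, not the tool's actual output format; the class `SearchHit`, its field names, the helper `group_by_reaction`, and the example IDs are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SearchHit:
    """One search hit, with the three fields listed on the slide."""
    library_id: str
    rxn_id: str
    building_block_ids: Tuple[str, ...]


def group_by_reaction(hits: List[SearchHit]) -> Dict[str, List[SearchHit]]:
    """Group hits by reaction so each group can seed one library design."""
    grouped: Dict[str, List[SearchHit]] = defaultdict(list)
    for hit in hits:
        grouped[hit.rxn_id].append(hit)
    return dict(grouped)


# HYPOTHETICAL identifiers, for illustration only.
hits = [
    SearchHit("VL-001", "RXN-12", ("BB-100", "BB-205")),
    SearchHit("VL-001", "RXN-12", ("BB-101", "BB-205")),
    SearchHit("VL-001", "RXN-30", ("BB-400", "BB-512")),
]
by_rxn = group_by_reaction(hits)
```

Since each hit names its reaction and building blocks, the product never needs to be stored as a full structure; it can be regenerated on demand from those IDs.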
14
Utilities of VLs in molecular design
1) Hit expansion
Given a library hit, retrieve the Rxn used and associated suitable reactants, then initiate hit follow-up library design and synthesis.
2) Lead hopping
Given a lead structure, retrieve similar compounds in the VL (with Rxns & corresponding reactants), then initiate multiple sets of lead hopping library design and synthesis.
3) Idea generation
Virtual Rxns, reactants, Markush cores, and R-group fragments can also be added into the VL for general design idea generation and evaluation.
Expected benefits:
• Speed: enable a faster and more cost-effective design/synthesis/test cycle
• Scale: leverage the captured Rxn knowledge by all molecular designers
• Compete: competitive edge (size of VL, diversity in captured Rxns, proprietary reaction building blocks) at project teams’ fingertips
15
Potential areas of applications
• Allow drug discovery project teams to perform in-depth analysis to
facilitate prioritization of proposed scaffolds and/or proposed reactions
• DNA-encoded libraries: enable structure searches and library (as well as
library idea) evaluation and comparisons
• Automatically notify project chemists of possibilities & alternatives…
• ….
16
A wish list
• Performance enhancement
• faster library loading
• faster SSS and Exact searching
• High availability and scalability
• many users, many VLs, & many CPUs → 10 sec per search
• …an embarrassingly parallelizable search problem…
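The search is embarrassingly parallel because each virtual library can be searched independently and the per-library hit lists simply merged. The sketch below shows that pattern on toy data: molecules are sets of feature labels, plain Tanimoto similarity is used, and all names (`search_one_library`, `parallel_search`, the library contents) are hypothetical. Threads keep the example short; a real deployment would more likely spread the work across processes or servers.

```python
from concurrent.futures import ThreadPoolExecutor


def tanimoto(a, b):
    """Tanimoto similarity between two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def search_one_library(name, products, query, threshold):
    """Search a single library; needs no data from any other library."""
    return [
        (score, name, product_id)
        for product_id, features in products.items()
        if (score := tanimoto(query, features)) >= threshold
    ]


def parallel_search(libraries, query, threshold=0.5, max_workers=4):
    """Fan the query out over all libraries, then merge the hit lists."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(search_one_library, name, products, query, threshold)
            for name, products in libraries.items()
        ]
        hits = [hit for future in futures for hit in future.result()]
    return sorted(hits, reverse=True)


# Toy libraries (assumed data model: product id -> feature set).
libraries = {
    "VL1": {"p1": {"a", "b"}, "p2": {"c"}},
    "VL2": {"p3": {"a", "b", "c"}},
}
hits = parallel_search(libraries, query={"a", "b"}, threshold=0.5)
```

Because the workers share no state, adding CPUs or servers shortens the wall-clock time per search roughly in proportion, up to the cost of merging results.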
Opportunity as a community effort with mutual benefits
• many pharma companies are using virtual libraries…
• adoption of the DNA-encoded library production for hit
generation is growing
• with pooled resources, more could be accomplished
An extension framework for customer-built similarity search
methods
• …shape-based, pharmacophore-based, etc.
• …let (enable via the Platform) a hundred flowers bloom…
17
Summary
• The extended Markush Search Engine can now support Exact, SSS,
and Similarity searches into vast virtual spaces spanned by
combinatorial chemical reactions
• Good performance has been seen based on searches against a pilot
VL containing ~10^10 to 10^12 compounds
• There is still room for further innovation and enhancements
• leverage the platform
• A great opportunity for community engagement
18
Acknowledgements
• ChemAxon collaborators
• Tim Dudgeon, Andras Volford, Peter Borbas, and Doug Drake
• Merck colleagues
• The VL Search team: Chris Culberson, Brad Feuston, Sookhee Ha, Scott Harrison, Gopal Parthasarathy, and Bob Sheridan
• MRL chemistry groups: Milana Maletic, Zhi-Cai Shi, Graham Smith, and Nunzio Sciammetta
• MRL-IT: Chris Brofft and Gang Huang
• Management support: Chris Waller and Frank Brown