3D ligand-based virtual screening in the cloud
(Blaze training)
Cresset European User Group Meeting – Workshops
June 2016
© Cresset
Files for this workshop
> The files used in this workshop are available for download on request
> Please send an email to [email protected] stating the name of the workshop whose files you want
Comparing structurally disparate molecules
PDB:2ogz PDB:3g0g
Bioisosteres
Bioisosteric groups
Effective ligand-based virtual screening
Virtual screening with Blaze
> Search a database for new structures
> Uses a Linux CPU or GPU cluster
> Software, service or rental (SaaS)
> Diverse new structures
> Complementary to other techniques
> New uses for existing drugs
Complementarity of Blaze
Search Blaze
1. Standard method
2. Shape only setting
3. Compare results
[Figure labels: Shape hits, Standard hits]
Where does it perform poorly?
> Start from ‘Ala-Glu-Asp-Phe-Gly-Trp’
> Not enough 3D ligand info
> Start from empty protein
> Search for new metal-binding warhead
> Not chemically intelligent
> Search for covalent inhibitors
> All known actives have MW > 1000
> Conformation space too big
About Blaze
> Full virtual screening system
> Compound and collection management
> User and project permissions
> Integrates with standard queueing systems (e.g., SGE or LSF)
> Search history and archiving
> Choice of interface
> Web browser
> Command line
> RESTful API (aka ‘REST API’ or ‘web service’)
> Access from Pipeline Pilot, KNIME etc
> Access from Cresset’s Forge
Blaze operation
Search Retrieve
Search molecules
Search molecule choice
> Field points describe potential binding
> Choose the smallest (in field terms), most active compound (ligand efficiency)
> More active = more interactions
> More efficient = fewer extraneous groups
> In field space, charged groups are BIG – e.g.
> CO2⁻ >>>>>> iPr in field space
> Remove extraneous groups (e.g., solubilisers)
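The selection rule above can be approximated numerically. The following is a minimal sketch, not Cresset's method: it assumes heavy-atom count as a crude stand-in for "size in field terms" (field size is not computed here) and uses the common approximation LE ≈ 1.37 × pIC50 / heavy atoms, in kcal/mol per heavy atom near room temperature. The compound names and values are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str          # hypothetical compound label
    pic50: float       # -log10(IC50 in mol/L)
    heavy_atoms: int   # non-hydrogen atom count (proxy for size; an assumption)


def ligand_efficiency(c: Candidate) -> float:
    """Approximate LE in kcal/mol per heavy atom (1.37 ~ RT*ln(10) near 300 K)."""
    return 1.37 * c.pic50 / c.heavy_atoms


def pick_reference(candidates: list[Candidate]) -> Candidate:
    """Return the candidate with the highest ligand efficiency."""
    return max(candidates, key=ligand_efficiency)


# Invented example values, for illustration only.
candidates = [
    Candidate("big_potent", pic50=9.0, heavy_atoms=40),
    Candidate("small_active", pic50=8.0, heavy_atoms=22),
    Candidate("small_weak", pic50=5.0, heavy_atoms=18),
]
best = pick_reference(candidates)
print(best.name, round(ligand_efficiency(best), 2))
```

In this toy set the smaller, slightly less potent compound wins over the larger, more potent one, which is the trade-off the slide describes. A charged group would count for far more in field space than its heavy-atom count suggests, so this proxy should be read with care.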
Multiple references are good for:
> Alignment (pose) prediction
  > Extra references to add information
    > Electrostatic
    > Shape
> Scaffold hopping (Spark)
  > Optimising a less active compound by scoring against a highly active one
> Virtual screening?
  > Requires molecules to look like both references
  > Theoretically should be good
  > Practically not always true
  > Separate searches with data fusion perform better (?)
We prefer multiple searches to multiple references
Data fusion for multiple searches
> We use ‘combine on rank’
> E.g., run 3 searches and take the top 1,000 from each search
> We don’t recommend combining on score
> Score is local not global
> Value is dependent on size of search query
> Currently investigating the use of Z-scores and other approaches
> Find ways to evaluate ‘quality’ of scores
> Identify ‘frequent hitters’
> Score normalisation
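As an illustration of "combine on rank", here is a minimal sketch that takes the top N hits from each search and merges them by rank only, never comparing scores across searches. How Blaze itself handles a molecule found by more than one search is not stated on the slide; keeping its best rank is an assumption, as are the function name and the toy molecule IDs.

```python
def combine_on_rank(searches: dict[str, list[str]], top_n: int) -> list[tuple[str, int, str]]:
    """Fuse ranked hit lists by rank.

    searches maps a search name to molecule IDs ordered best-first.
    Returns (molecule_id, best_rank, search_name) rows sorted by best rank.
    """
    best: dict[str, tuple[int, str]] = {}
    for search_name, ranked_ids in searches.items():
        for rank, mol_id in enumerate(ranked_ids[:top_n], start=1):
            # ASSUMPTION: a molecule found by several searches keeps its
            # best (lowest) rank. Scores are deliberately ignored.
            if mol_id not in best or rank < best[mol_id][0]:
                best[mol_id] = (rank, search_name)
    fused = [(mol_id, rank, name) for mol_id, (rank, name) in best.items()]
    # Sort by rank, then ID, so ties are broken deterministically.
    return sorted(fused, key=lambda row: (row[1], row[0]))


# Toy ranked lists (best first); IDs are placeholders.
searches = {
    "query_A": ["m1", "m2", "m3", "m4"],
    "query_B": ["m3", "m5", "m1", "m6"],
    "query_C": ["m7", "m2", "m8", "m9"],
}
fused = combine_on_rank(searches, top_n=3)
for mol_id, rank, name in fused:
    print(mol_id, rank, name)
```

Because only ranks are used, the result does not depend on the size of each search query, which is the reason given above for avoiding raw scores.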
Database molecules
Numbers
> 10M molecules is standard
> Uses compute cluster: scales to at least 500 CPU cores, tens of thousands of GPU cores
> Pre-populated conformations
> Molecules in collections split by heavy atom ranges
> Duplicates across collections get filtered during searching
> GPU or CPU
Other issues
> Tautomers
  > Enumerate outside of Cresset and load as separate molecules
> Protomers
  > Enumerate externally
  or
  > Let Cresset choose
    > Rules based on pH 7
    > Good for amines, carboxylates
    > Less good for e.g. aminopyridines
> Flat chirality
  > Fully explored
  > Up to 3 flat centres
  > Speed penalty
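The speed penalty for flat chirality follows from simple counting: each unspecified centre doubles the number of stereoisomers to explore. The sketch below only illustrates that count. It assumes two independent configurations (R/S) per centre, ignores symmetry and meso forms, and raises an error beyond three centres purely as an illustrative choice; what Blaze does in that case is not stated here. The atom indices are hypothetical.

```python
from itertools import product


def enumerate_flat_centres(flat_centres: list[int], max_centres: int = 3) -> list[dict[int, str]]:
    """List every R/S assignment for the given unspecified stereocentres.

    flat_centres holds atom indices (hypothetical labels). The default limit
    of 3 mirrors the 'up to 3 flat centres' note on the slide.
    """
    if len(flat_centres) > max_centres:
        # ASSUMPTION: rejecting the molecule is this sketch's choice only.
        raise ValueError(
            f"{len(flat_centres)} flat centres exceeds limit of {max_centres}"
        )
    return [
        dict(zip(flat_centres, labels))
        for labels in product("RS", repeat=len(flat_centres))
    ]


isomers = enumerate_flat_centres([4, 9, 12])
print(len(isomers))  # 2**3 assignments for three flat centres
print(isomers[0])
```

With three flat centres a single database entry expands to eight stereoisomers, each needing its own conformations and alignments.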
Searching
> 4 levels
  > FieldPrint
    > Nasty, very fast
  > Clique (Fast mode)
    > Alignment by matching field points, single-point true score
    > Fast
    > Good
  > Simplex
    > Alignment by matching field points, optimised score
    > Protein excluded volumes
    > Slower – best poses and scores
  > Filter
Blaze: Workflow
1. New Search
2. Upload Molecule
3. Check Conversion
4. Setup Experiment
   > Name
   > Collections
   > Refinements
     > clique 50%+
     > simplex 10%+
5. Get results
6. Repeat or perfect
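To get a feel for what the refinement percentages mean in compute terms, the following sketch estimates how many molecules reach each stage. It rests on a loud assumption: each percentage is applied to the ranked output of the previous stage. The slides do not say whether Blaze applies it that way or to the original collection, and the function name and starting count are invented.

```python
def funnel_sizes(n_molecules: int, stages: list[tuple[str, int]]) -> list[tuple[str, int]]:
    """Return how many molecules enter each refinement stage.

    stages is a list of (name, percent_of_previous_stage_kept).
    ASSUMPTION: each percentage applies to the previous stage's output.
    """
    sizes = []
    remaining = n_molecules
    for name, percent in stages:
        if not 0 < percent <= 100:
            raise ValueError(f"percent for {name} must be in 1..100")
        # Integer ceiling division avoids floating-point rounding surprises.
        remaining = (remaining * percent + 99) // 100
        sizes.append((name, remaining))
    return sizes


# Hypothetical collection size; 50% and 10% are the minimums from the slide.
sizes = funnel_sizes(100_000, [("clique", 50), ("simplex", 10)])
for name, n in sizes:
    print(f"{name}: {n} molecules")
```

Under this reading only one molecule in twenty reaches the slowest stage, which is why the cheap early levels matter for throughput.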
Let’s go
> Searching ChEMBL fragment-like molecules
> Upload a search molecule
> SDF format (Mol2 also)
> Check the search molecule in Forge
> Run the search
> Marvel at the enrichment!
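Before uploading a search molecule in SDF format, it can help to confirm that the file holds the records you expect. The sketch below relies only on the SDF convention that each record ends with a `$$$$` line; it does not validate the molblock and is not part of Blaze. The placeholder text is not a real structure, and the function names are hypothetical.

```python
def count_sdf_records(sdf_text: str) -> int:
    """Count records in SDF text; each record ends with a '$$$$' line."""
    return sum(1 for line in sdf_text.splitlines() if line.strip() == "$$$$")


def record_titles(sdf_text: str) -> list[str]:
    """Return the first line (the title line) of each record."""
    titles = []
    at_record_start = True
    for line in sdf_text.splitlines():
        if at_record_start:
            titles.append(line.strip())
            at_record_start = False
        if line.strip() == "$$$$":
            at_record_start = True
    return titles


# Placeholder record: the molblock body is deliberately omitted.
placeholder = "query_1\n  (molblock omitted)\nM  END\n$$$$\n"
print(count_sdf_records(placeholder), record_titles(placeholder))
```

To check a real file, read it with `open(path).read()` and pass the text to the same functions.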
Connect to Blaze
Cresset Demo: http://blaze.cresset-group.com/blaze/ui/
Username: sign up at http://blaze.cresset-group.com
User Preferences
> Email when results available
> Turn on/off automatic help
Every page has context-sensitive help
Start a new search
Choose New Search
Upload the search molecule
Choose Browse
Choose Blaze/A2C_blaze_training.sdf
Choose Submit
Field addition and constraints
Download with field points
Need a viewer that displays field points
Use to:
> Check file conversion
> Find field points to constrain
Open search molecule in Forge
Finding field points to constrain
> Click on the molecule
> Press Shift-i
Molecule should be displayed with the index of the field point
> Press Shift-f
Field points should have a number next to them (the size)
> These are needed to identify a field point to constrain
Moving on…
Simple Search Page – 3 sections, part 1
Unique Name: A2C_<Uname>
Choose “EUGM2016”
Simple Search Page – 3 sections, part 2
“Collections”: choose “Chembl20_filtered”
Heavy atom ranges: choose “11-20 atoms”
Simple Search Page – 3 sections, part 3
Clique refinement: 50-100% (always!)
Simplex refinement: 10% is good
Field constraint: find the ID, enter the size
Protein: used in Simplex refinement, excluded volume only
Submit the search
> And wait ......
> Any Questions?
Download pre-calculated results
> View results
> Select search: EUGM_SEARCH_1
> You should be taken to the parent result page
Caution!
> These are the worst results to download!
> The links at the top show all refinements of this parent
> We want the last (rightmost) refinement, which should be green
> Click on ‘Simplex’
Download results 3D
> Check that you have a ‘Download Overlays’ button
NO? Not on the right page
YES? Download the top 100 results as an SDF file
Always use SDF files
> Open in Forge
Opening results in Forge
Result assessment
> Some good results
> But many lack the positive charge
Filter results to give only positive compounds
> From the initial search
> From the completed results
Filters
Must contain any one of these
AND
Must contain any one of these
AND
Must NOT contain any one of these
AND
Must obey each one of these
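The four filter groups combine as a plain boolean expression. The sketch below shows that logic only; it is not the Blaze filter implementation. It assumes molecules are represented as dictionaries of precomputed properties, that an empty group imposes no restriction, and that `positive` and `has_nitro` are hypothetical property names (the first loosely modelled on the ‘positive’ filter used in this workshop).

```python
from typing import Callable

Predicate = Callable[[dict], bool]


def passes_filters(mol: dict,
                   any_of_a: list[Predicate],
                   any_of_b: list[Predicate],
                   none_of: list[Predicate],
                   all_of: list[Predicate]) -> bool:
    """Combine the four filter groups with AND, as laid out on the slide.

    ASSUMPTION: an empty group imposes no restriction.
    """
    return (
        (not any_of_a or any(p(mol) for p in any_of_a))      # must contain any one
        and (not any_of_b or any(p(mol) for p in any_of_b))  # must contain any one
        and not any(p(mol) for p in none_of)                 # must NOT contain any
        and all(p(mol) for p in all_of)                      # must obey each one
    )


# Toy molecules with hypothetical precomputed properties.
mols = [
    {"id": "a", "positive": 1, "has_nitro": False},
    {"id": "b", "positive": 0, "has_nitro": False},
    {"id": "c", "positive": 1, "has_nitro": True},
]
kept = [
    m["id"] for m in mols
    if passes_filters(
        m,
        any_of_a=[],
        any_of_b=[],
        none_of=[lambda m: m["has_nitro"]],
        all_of=[lambda m: m["positive"] == 1],
    )
]
print(kept)
```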
Positive filter
Unique Name
‘positive’
= 1
Filter early > Filter later
Filter early: ~10,700 mols
Filter late: ~3,300 mols
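One way to see why filtering early returns more molecules is to compare the order of the two operations on a toy ranked list. This sketch assumes that "filter early" applies the filter before the top fraction is kept for the next stage, while "filter late" applies it to the already truncated results. The data and function names are invented and do not reproduce the workshop numbers.

```python
from typing import Callable


def filter_early(ranked: list[dict], keep: int, predicate: Callable[[dict], bool]) -> list[dict]:
    """Filter first, then keep the top `keep` of what survives."""
    return [m for m in ranked if predicate(m)][:keep]


def filter_late(ranked: list[dict], keep: int, predicate: Callable[[dict], bool]) -> list[dict]:
    """Keep the top `keep` first, then filter that subset."""
    return [m for m in ranked[:keep] if predicate(m)]


def is_positive(mol: dict) -> bool:
    return mol["positive"] == 1


# Toy ranked list (best first): every third molecule carries a positive charge.
ranked = [{"id": i, "positive": int(i % 3 == 0)} for i in range(1, 31)]

early = filter_early(ranked, keep=10, predicate=is_positive)
late = filter_late(ranked, keep=10, predicate=is_positive)
print(len(early), len(late))
```

Filtering early fills every retained slot with a molecule that passes the filter, so no refinement time is spent on compounds that would be discarded afterwards.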
Download results
Download top 100 results
Download Results 1D
> The quickest download is the 1D text file of results
cressetgroup
Questions welcomed
Example files available from
Contact us for our tailored training courses