Advanced Computational Drug Design
Phillip Cruz, Ph.D.
November 19, 2015
OFFICE OF CYBER INFRASTRUCTURE AND COMPUTATIONAL BIOLOGY
NATIONAL INSTITUTE OF ALLERGY AND INFECTIOUS DISEASES
Bioinformatics and Computational Biosciences Branch (BCBB)
• Biostatistics, phylogenetics, microarray analysis, structural biology, NextGen sequencing, protein-protein interaction networks, programming
Outline
De Novo Ligand Design
Structure File Formats
Docking- Practical Aspects
• Evaluate Results
• Protein and Ligand Preparation
Docking hands-on exercise (AutoDock Vina)
Some Docking programs available at NIH
AutoDock Vina
• Free and open source, multiplatform
• Interface from Chimera
Glide
• Part of Schrödinger Maestro suite
• Available at NIH via Molecular Modeling Interest Group (http://mmignet.nih.gov)
• Requires Linux computer
Drug Design Methods
Ligand-based and structure-based approaches (diagram): pharmacophore modeling; QSAR; 2D and 3D search; docking; active site features; De Novo Design
De Novo Design
Generate ideas based on multiple design criteria you choose
Design new candidates that mimic the shape and pharmacophore features of your lead structures
Elaborate fragments in the context of a protein binding site for fragment based drug design
Choose to preserve either scaffolds or R-groups during design (scaffold hopping or lead hopping)
Chemical structure mutation operators ensure that drug-like structures are suggested
De Novo Ligand design- Workflow
De Novo Design Software
LUDI
• One of the first examples
• Accelrys
RACHEL
• Automatic and Guided mode
• Certara
Muse Invent
• Multi-criteria optimization
• Includes synthesis guidance
• Certara
Common Structure File Formats
Proteins
• PDB (Protein Data Bank)
• mol2 (SYBYL)
Ligands (small molecules)
• SDF (structure-data file)
• mol2
• SMILES (2D information only)
Proteins and Ligands
• PDB (but issues with ligands)
• mol2
Common File Formats- PDB
• 4-character identifier from PDB, e.g. 1mbn
• PDB web site: rcsb.org
• ATOM records- common amino acids and nucleotides only
• HETATM records- all other atoms
• CONECT records- bonds between HETATMs; they don’t include bond order, so PDB is not for general use with ligands
File excerpt:
ATOM 1 N ILE J 11 5.804 123.968 147.434 1.00 94.01 N
ATOM 2 CA ILE J 11 5.791 123.831 145.944 1.00 93.94 C
ATOM 3 C ILE J 11 7.198 123.695 145.333 1.00 92.32 C
ATOM 4 O ILE J 11 8.169 124.255 145.843 1.00 93.52 O
...
TER 6327 ASP J 431
HETATM 6328 O HOH J2001 4.852 121.472 146.292 1.00 50.45 O
HETATM 6329 H1 HOH J2001 4.622 120.611 146.642 1.00 0.00 H
...
CONECT 6329 6328
CONECT 6330 6328
...
END
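PDB ATOM and HETATM records are fixed-width, which is why chain J and residue number 2001 run together as "J2001" in the excerpt. A minimal sketch of column-based parsing follows; the `Atom` tuple and `parse_atom_line` helper are illustrative names, not part of any library, and only a subset of the fields is read.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    record: str
    serial: int
    name: str
    res_name: str
    chain: str
    res_seq: int
    x: float
    y: float
    z: float
    element: str


def parse_atom_line(line: str) -> Atom:
    """Parse one ATOM/HETATM line by fixed columns (comments give 1-based columns)."""
    return Atom(
        record=line[0:6].strip(),       # columns 1-6
        serial=int(line[6:11]),         # columns 7-11
        name=line[12:16].strip(),       # columns 13-16
        res_name=line[17:20].strip(),   # columns 18-20
        chain=line[21],                 # column 22
        res_seq=int(line[22:26]),       # columns 23-26
        x=float(line[30:38]),           # columns 31-38
        y=float(line[38:46]),           # columns 39-46
        z=float(line[46:54]),           # columns 47-54
        element=line[76:78].strip(),    # columns 77-78
    )


# The water oxygen from the excerpt, rebuilt with explicit field widths
# so that every value sits in its proper column.
water = (
    "HETATM" + f"{6328:>5}" + " " + f"{' O':<4}" + " " + "HOH" + " " + "J"
    + f"{2001:>4}" + " " * 4
    + f"{4.852:8.3f}{121.472:8.3f}{146.292:8.3f}{1.00:6.2f}{50.45:6.2f}"
    + " " * 10 + " O"
)
atom = parse_atom_line(water)
print(atom.chain, atom.res_seq, atom.x)
```

Splitting the same line on whitespace would return "J2001" as a single token, so slicing by column is the safer choice for this format.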
Common File Formats- mol2
• Includes bond order, so can be used for ligands
File excerpt:
@<TRIPOS>MOLECULE
3ZWZ.pdb
2892 2732 564 0 0
PROTEIN
NO_CHARGES
@<TRIPOS>ATOM
1 N -15.6500 14.3770 5.0450 N.4 1 ASN 0.0000
2 CA -15.2850 13.0110 5.5660 C.3 1 ASN 0.0000
3 C -15.8880 12.7820 6.9380 C.2 1 ASN 0.0000
…
@<TRIPOS>BOND
1 328 1568 1
2 866 1109 2
…
@<TRIPOS>SUBSTRUCTURE
1 ASN 2 RESIDUE 4 A ASN 1 ROOT
2 PRO 10 RESIDUE 4 A PRO 2
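Because mol2 is organized into @<TRIPOS> sections, the bond orders can be read with a few lines of code. The sketch below is illustrative only: the function names are made up for this example, the sample text is the visible part of the excerpt, and real files deserve a dedicated parser.

```python
from collections import defaultdict


def read_mol2_sections(text):
    """Group the non-blank lines of a mol2 file by @<TRIPOS> section name."""
    sections = defaultdict(list)
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("@<TRIPOS>"):
            current = line[len("@<TRIPOS>"):]
        elif line and current is not None:
            sections[current].append(line)
    return sections


def bond_orders(sections):
    """Return (atom_i, atom_j, bond_type) tuples from the BOND section."""
    bonds = []
    for line in sections.get("BOND", []):
        _bond_id, a, b, bond_type = line.split()[:4]
        # Bond type stays a string: mol2 also uses labels such as "ar" and "am".
        bonds.append((int(a), int(b), bond_type))
    return bonds


example = """@<TRIPOS>MOLECULE
3ZWZ.pdb
2892 2732 564 0 0
PROTEIN
NO_CHARGES

@<TRIPOS>ATOM
1 N -15.6500 14.3770 5.0450 N.4 1 ASN 0.0000
2 CA -15.2850 13.0110 5.5660 C.3 1 ASN 0.0000
3 C -15.8880 12.7820 6.9380 C.2 1 ASN 0.0000
@<TRIPOS>BOND
1 328 1568 1
2 866 1109 2
"""
sections = read_mol2_sections(example)
bonds = bond_orders(sections)
print(bonds)
```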
Common File Formats- sdf
• Can have arbitrary data fields
• No residue information- not good for proteins
• Z-coordinate zero: 2D structure
File excerpt:
GDP
-OEchem
43 45 0 0 1 0 0 0 0 0999 V2000
330.4117 176.7213 0.0000 C 0 0 1 0 0 0 0 0 0 0 0 0
343.8844 163.2455 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
320.9156 147.4924 0.0000 C 0 0 2 0 0 0 0 0 0 0 0 0
...
1 2 1 0 0 0 0
1 3 1 0 0 0 0
3 4 1 0 0 0 0
...
M END
> <ENERGY>
-12.385
$$$$
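The arbitrary data fields are what make SDF convenient for carrying docking scores alongside structures. A minimal sketch of reading them follows; the function names are illustrative, and the sample record is a hypothetical zero-atom molecule that reuses only the name and ENERGY field from the excerpt.

```python
def split_sdf(text):
    """Split SDF text into records; each record ends with a '$$$$' line."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            records.append(current)
            current = []
        else:
            current.append(line)
    return records


def data_fields(record):
    """Collect '> <NAME>' data items from one record into a dict."""
    fields, name = {}, None
    for line in record:
        if line.startswith(">"):
            # Header looks like "> <ENERGY>"; keep the text between < and >.
            name = line[line.index("<") + 1 : line.rindex(">")]
            fields[name] = []
        elif name is not None:
            if line.strip():
                fields[name].append(line.strip())
            else:
                name = None  # a blank line ends the data item
    return {key: "\n".join(values) for key, values in fields.items()}


# Hypothetical record: atom and bond blocks omitted, counts line set to zero.
example = """GDP


  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <ENERGY>
-12.385

$$$$
"""
records = split_sdf(example)
fields = data_fields(records[0])
print(records[0][0], fields)
```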
Common File Formats- SMILES
• Reminiscent of chemical formula
• 2D only
• Don’t use for proteins
File excerpt:
CN(C)C1=CC=C(C=C1)NS(=O)(=O)C2=C(C(=C(C(=C2F)F)F)F)F
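To see how close a SMILES string is to a chemical formula, the heavy atoms can be tallied directly from the text. This is a rough sketch under stated assumptions: it handles only unbracketed organic-subset atoms, ignores implicit hydrogens, and is not a substitute for a cheminformatics toolkit.

```python
import re
from collections import Counter

# Organic-subset atoms only: two-letter halogens first, then
# aliphatic (uppercase) and aromatic (lowercase) symbols.
ATOM = re.compile(r"Cl|Br|[BCNOPSFI]|[bcnops]")


def heavy_atom_counts(smiles):
    """Count heavy atoms per element in a SMILES string without bracket atoms."""
    if "[" in smiles:
        raise ValueError("bracket atoms are not handled by this sketch")
    return Counter(m.group(0).capitalize() for m in ATOM.finditer(smiles))


smiles = "CN(C)C1=CC=C(C=C1)NS(=O)(=O)C2=C(C(=C(C(=C2F)F)F)F)F"
counts = heavy_atom_counts(smiles)
print(dict(counts))
```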
Two free virtual databases
ZINC
• 21 million commercially available compounds
• http://zinc.docking.org
PubChem
• 100 million compounds
• 3D representations for many compounds
• http://pubchem.ncbi.nlm.nih.gov/search/search.cgi#
Web sketchers that understand SMILES
PubChem
• http://pubchem.ncbi.nlm.nih.gov/edit2/index.html
JME/JSME Molecular Editor (property calculations)• http://www.molinspiration.com/cgi-bin/properties
ChemAxon Marvin (includes property calculations)• http://www.chemaxon.com/marvin/sketch/index.jsp
Structure-Based Drug Design
“docking”
Define active site features
Use active site features to query database, fitting compounds to active site features
Calculate energy of binding interaction
Take top hits (lowest energy) and cluster
Pass results to chemist
Evaluate Docking Results- ROC plot
ROC (Receiver Operating Characteristic)
Requires two sets of docking scores, from known actives (positives) and decoys (negatives), pooled and ordered from best to worst
Y-axis: fraction of true positives out of total actual positives (true positive rate, or sensitivity)
• First y-value is 1/P, where P is the total number of positives
X-axis: fraction of false positives out of total actual negatives (false positive rate, or 1 - specificity)
• First x-value is FP/N, where FP is the number of decoys with a better score than the first (best) active, and N is the total number of negatives
Evaluate Results- ROC plot
Actives: 12.0 11.5 11.1 10.9 10.8 10.6 10.5 10.4 10.3 10.2 9.9 9.5 9.4
Decoys: 11.0 10.0 10.0 9.0 9.0 9.0 8.5 ……
Sort by score: 12.0 11.5 11.1 11.0 10.9 10.8 10.6 10.5 10.4 10.3 10.2 10.0 10.0 9.9 9.5 9.4 9.0 9.0 9.0 8.5 ……
Plot labels: perfect predictivity, random predictivity, AUC (Area Under Curve)
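The ROC construction can be sketched in a few lines. The example below uses only the scores visible on the slide (the decoy list there is truncated, so the resulting AUC is illustrative only) and assumes that a higher score is better, as the sort order on the slide implies. Function names are made up for this sketch.

```python
def roc_points(actives, decoys):
    """Walk the pooled scores from best to worst and return (FPR, TPR) points."""
    p, n = len(actives), len(decoys)
    pooled = [(s, True) for s in actives] + [(s, False) for s in decoys]
    pooled.sort(key=lambda item: item[0], reverse=True)  # higher score = better
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pooled):
        score = pooled[i][0]
        # Compounds with tied scores are consumed together as one step.
        while i < len(pooled) and pooled[i][0] == score:
            if pooled[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n, tp / p))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


actives = [12.0, 11.5, 11.1, 10.9, 10.8, 10.6, 10.5, 10.4, 10.3, 10.2, 9.9, 9.5, 9.4]
decoys = [11.0, 10.0, 10.0, 9.0, 9.0, 9.0, 8.5]  # visible values only
points = roc_points(actives, decoys)
print(points[1], round(auc(points), 3))
```

For programs that report binding energies, where lower is better, the scores would need to be negated (or the sort reversed) before building the curve. An AUC of 1.0 corresponds to perfect predictivity and 0.5 to random predictivity.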
Preparation for docking
Protein
• Add hydrogens (or not!)
• Treat chain terminal groups
• Sidechain torsions
• Remove ligand
Ligands
• Add hydrogens
• Ionization state (pH)
• Stereoisomers
• Tautomers
Find out what is necessary/important for your docking program and docking goals!
Figure labels: ionization, stereoisomers, tautomers (“Multiplex”)
AutoDock Vina
• A suite of automated docking tools
• Free
• Cross-platform
• Open source
• Available on Biowulf
• Docks ligands up to 2048 atoms
• Dr. Oleg Trott, Scripps
Chemical complementarity docking
AutoDock Vina
1. Protein and ligand start separated by a distance
2. Ligand torsions moved (flexible); translations and rotations (rigid)
3. Energy of interaction evaluated at each step
4. Ligand settles into active site
Identify Active Site to Guide Docking
Indicate center of box and dimensions
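One common way to pick the box is to enclose a known bound ligand and add some margin. The sketch below computes the center and dimensions from a list of coordinates and writes them as AutoDock Vina configuration lines. The coordinates, the 5 Å padding, and the file names are hypothetical values chosen for illustration.

```python
def search_box(coords, padding=5.0):
    """Return (center, size) of an axis-aligned box around coords.

    padding is added on every side of the box; 5.0 is an assumed margin.
    """
    xs, ys, zs = zip(*coords)
    center = tuple((min(v) + max(v)) / 2.0 for v in (xs, ys, zs))
    size = tuple((max(v) - min(v)) + 2.0 * padding for v in (xs, ys, zs))
    return center, size


def vina_config(receptor, ligand, center, size):
    """Format the box as 'key = value' lines for a Vina configuration file."""
    lines = [f"receptor = {receptor}", f"ligand = {ligand}"]
    for axis, value in zip("xyz", center):
        lines.append(f"center_{axis} = {value:.3f}")
    for axis, value in zip("xyz", size):
        lines.append(f"size_{axis} = {value:.3f}")
    return "\n".join(lines) + "\n"


# Hypothetical ligand coordinates (two corner atoms are enough to show the idea).
ligand_coords = [(0.0, 0.0, 0.0), (4.0, 2.0, 6.0)]
center, size = search_box(ligand_coords)
config = vina_config("receptor.pdbqt", "ligand.pdbqt", center, size)
print(config)
```

A box that is too small can exclude valid poses, while a very large one makes the search slower and less reliable, so the padding is worth checking visually in Chimera.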
AutoDock Vina hands-on
Goal: Set up, run, and analyze the docking of DAC to the Glucocorticoid Receptor using the AutoDock Vina interface in the Chimera program. (See separate handout)
Two parts to docking
Define active site features
Use active site features to query database, fitting compounds to active site features
Calculate energy of binding interaction
Take top hits and cluster
Pass results to chemist
1. Search method
2. Scoring method
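The split between search and scoring can be illustrated with a toy example. The sketch below is not AutoDock Vina's algorithm or scoring function: the "score" is simply the summed squared distance between ligand atoms and hypothetical target positions, and the "search" is a greedy random walk over rigid translations.

```python
import math
import random


def score(ligand, site):
    """Toy scoring method: summed squared distance to paired site points (lower is better)."""
    return sum(math.dist(a, b) ** 2 for a, b in zip(ligand, site))


def translate(ligand, shift):
    """Rigidly translate every ligand atom by the same shift vector."""
    return [tuple(c + d for c, d in zip(atom, shift)) for atom in ligand]


def random_search(ligand, site, steps=2000, step_size=0.5, seed=0):
    """Toy search method: try random translations and keep only improving moves."""
    rng = random.Random(seed)
    best, best_score = ligand, score(ligand, site)
    for _ in range(steps):
        shift = tuple(rng.uniform(-step_size, step_size) for _ in range(3))
        trial = translate(best, shift)
        trial_score = score(trial, site)
        if trial_score < best_score:
            best, best_score = trial, trial_score
    return best, best_score


# Hypothetical site points; the ligand starts 10 units away along each axis.
site = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
start = translate(site, (10.0, 10.0, 10.0))
start_score = score(start, site)
pose, final_score = random_search(start, site)
print(round(start_score, 1), round(final_score, 3))
```

Real docking programs also sample rotations and ligand torsions and use far richer scoring functions, but the same division of labor applies: the search proposes poses and the scoring method ranks them.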
Recent Review Article
Sliwoski et al. (2014) Computational Methods in Drug Discovery. Pharmacol Rev 66, 334-395
Take Away Messages
• Devise strategy based on specific goals and known information: structure based, ligand based, or both
• Use appropriate file type for your structures
• Understand what input/preparation is needed by the program
• Understand limitations and evaluate quality of results
• Communicate with medicinal chemist