Softwares for Molecular Docking
Lokesh P. Tripathi, NCBS
17 December 2007
Molecular Docking
• Attempt to predict the structure of an intermolecular complex between two or more molecules
– Receptor-ligand (or drug)
– Enzyme-substrate
– Protein-DNA (or RNA)
– Protein-protein
Brief History of Docking
• Crick (1953) suggested that complementarity in helical coils could be modelled as knobs fitting into holes
• DOCK (Kuntz, 1982) pioneered the field of molecular docking
• GRID (Goodford, 1985) also became part of many subsequent software packages
General considerations
• Molecular representations
– Abstract or atoms
– Fixed or flexible
• Juxtaposition of molecules
– Interactive or automated
– Search algorithm to create conformations
• Evaluating complementarity (ranking)
– Scoring function
– Force field energy functions
Search Algorithms
• Potentially several ways of putting two molecules together; possibilities increase exponentially with size of molecules involved
• Attempt to locate the most stable state in the energy landscape
• Broadly two types: 1) full solution space search; 2) guided search through solution space
Search Algorithms
• Random
– Genetic algorithms
– Monte Carlo methods
– Tabu search
• Systematic
– Fragment-based methods
– Point complementary methods
– Distance geometry methods
– Database
• Simulation
– Molecular dynamics
– Energy minimisation
• Multiple-method algorithms
Docking Software
• Virtual screening: AutoDock, DOCK, FlexX/E, SLIDE, Surflex, ICM, GOLD
• De novo design: LUDI, GRID, MCSS, SMoG, GrowMol, SPROUT
Random methods
• Sample the conformation space by making a single change to a ligand or to a population of ligands
• Alteration performed at each step and accepted or rejected based on a predetermined probability function
• Include Monte Carlo (MC) methods; Genetic Algorithm (GA) methods; Tabu search methods
Monte Carlo methods
• Use a simple energy function
• Make random moves and accept or reject each one based on the Boltzmann probability function
• More efficient in stepping over energy barriers, allowing more complete searches of conformation space
• PRODOCK, MC-DOCK, ICM, DockVision, QXP, GLIDE; too slow for extensive flexible docking
Figure: global energy minimum conformers generated by the Monte Carlo method
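The accept/reject rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method of any program listed: the "conformation" is a single torsion-like variable, and `toy_energy`, `metropolis_search` and all parameter values are hypothetical names and numbers chosen for the example.

```python
import math
import random


def toy_energy(x):
    # Hypothetical double-well energy: minima near x = -1 and x = +1.
    # The 0.3 * x tilt makes the well near x = -1 the global minimum.
    return (x * x - 1.0) ** 2 + 0.3 * x


def metropolis_search(energy, x0, n_steps=5000, step=0.5, kT=1.0, seed=0):
    """Metropolis Monte Carlo search over one variable (illustrative only)."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step, step)  # random move
        e_new = energy(x_new)
        delta = e_new - e
        # Downhill moves are always accepted; uphill moves are accepted
        # with Boltzmann probability exp(-delta / kT).
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


if __name__ == "__main__":
    # Start in the shallower well; occasional uphill moves let the walk
    # cross the barrier at x = 0 and find the deeper well.
    print(metropolis_search(toy_energy, x0=1.0))
```

Accepting some uphill moves is what lets the search step over energy barriers; setting `kT` close to zero would turn it into a plain downhill search that stays in the starting well.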
Genetic Algorithm methods
• Apply ideas of genetics and evolution in docking
• Start with an initial population of random ligand conformers with respect to the protein, each defined by a set of variables called genes
• Genetic operators (mutations, crossovers) applied to sample conformation space till optimal population is derived
• AUTODOCK, GOLD, DIVALI, DARWIN; too slow for extensive flexible docking
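A genetic algorithm of this kind can be sketched as follows. This is a minimal illustration and not the implementation used by any of the programs above: the genes are plain floats standing in for ligand variables, and `toy_energy`, `TARGET`, `genetic_search` and the parameter values are hypothetical.

```python
import random

TARGET = [1.0, -2.0, 0.5]  # hypothetical "ideal" gene values for the toy energy


def toy_energy(genes):
    # Hypothetical energy: squared distance of the genes from TARGET.
    return sum((g - t) ** 2 for g, t in zip(genes, TARGET))


def genetic_search(energy, n_genes, pop_size=40, n_gen=60,
                   mut_rate=0.2, mut_sigma=0.3, bounds=(-3.0, 3.0), seed=0):
    """Simple generational GA: tournament selection, crossover, mutation."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Initial population of random individuals, each a list of genes.
    pop = [[rng.uniform(lo, hi) for _ in range(n_genes)]
           for _ in range(pop_size)]

    def tournament():
        a, b = rng.sample(pop, 2)
        return a if energy(a) < energy(b) else b

    for _ in range(n_gen):
        children = [min(pop, key=energy)[:]]  # elitism: keep the best as-is
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            # One-point crossover (with a single gene the child copies p2).
            cut = rng.randrange(1, n_genes) if n_genes > 1 else 0
            child = p1[:cut] + p2[cut:]
            # Gaussian mutation applied gene by gene, clipped to the bounds.
            for i in range(n_genes):
                if rng.random() < mut_rate:
                    child[i] = min(hi, max(lo, child[i] + rng.gauss(0.0, mut_sigma)))
            children.append(child)
        pop = children
    return min(pop, key=energy)


if __name__ == "__main__":
    best = genetic_search(toy_energy, n_genes=3)
    print(best, toy_energy(best))
```

In a docking setting the genes would encode ligand position, orientation and torsion angles, and the energy would come from a scoring function rather than a toy quadratic.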
Autodock
• Suite of automated docking tools
• Designed to predict how small molecules (ligands or drug candidates) bind to a receptor; uses the AMBER force field
• Three constituent programs, plus a GUI
– AutoTors: defines torsions in the ligand
– AutoGrid: calculates grids
– AutoDock: docking tool
– AutoDockTools (ADT): GUI to facilitate the above and other modules accompanying AutoDock
Autodock Lamarckian GA
• LGA encompasses a “genotypic” and “phenotypic” phase i.e. genetic operations and energy function to be optimised
• Energy minimisation is performed after “genotypic” changes, and the resulting “phenotypic” changes are mapped back onto the “genes” (by changing ligand coordinates)
• Most efficient and reliable of random methods
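The Lamarckian idea, where the result of local minimisation is written back into the genes, can be sketched as below. This is an illustration under assumptions, not AutoDock's code: the hill-climbing `local_minimise` is a simple stand-in for a real local search, and every name and parameter here is hypothetical.

```python
import random


def local_minimise(energy, genes, rng, n_iter=50, step=0.1):
    """Hill-climbing stand-in for the local search ("phenotypic" phase)."""
    best, best_e = genes[:], energy(genes)
    for _ in range(n_iter):
        trial = [g + rng.uniform(-step, step) for g in best]
        e = energy(trial)
        if e < best_e:
            best, best_e = trial, e
    return best, best_e


def lamarckian_generation(energy, population, rng, mut_sigma=0.5):
    """One generation: genetic operators, then local search written back."""
    ranked = sorted(population, key=energy)
    parents = ranked[: max(2, len(ranked) // 2)]
    next_pop = [ranked[0][:]]  # keep the current best unchanged
    while len(next_pop) < len(population):
        p1, p2 = rng.sample(parents, 2)
        # "Genotypic" phase: uniform crossover plus one Gaussian mutation.
        child = [rng.choice(pair) for pair in zip(p1, p2)]
        i = rng.randrange(len(child))
        child[i] += rng.gauss(0.0, mut_sigma)
        # "Phenotypic" phase: minimise, then store the refined values as
        # the individual's genes (the Lamarckian write-back).
        refined, _ = local_minimise(energy, child, rng)
        next_pop.append(refined)
    return next_pop


if __name__ == "__main__":
    rng = random.Random(1)
    energy = lambda g: sum(x * x for x in g)  # hypothetical toy energy
    pop = [[rng.uniform(-3, 3) for _ in range(2)] for _ in range(10)]
    for _ in range(10):
        pop = lamarckian_generation(energy, pop, rng)
    print(min(energy(p) for p in pop))
```

A non-Lamarckian variant would use the minimised energy only for ranking and append `child` instead of `refined`.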
Autodock Grid maps
• Pre-calculated
• Grid for each atom type (e.g. C, H, O, N)
• Consists of 3D lattice of regularly spaced points, surrounding and centered on region of interest in the macromolecule
• Typical spacing is 0.375 Å
• Probe atom placed at each grid point and energy calculated
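The pre-calculation can be sketched as follows: a probe atom is placed at every lattice point, its interaction energy with the receptor atoms is stored, and later lookups replace the pairwise sum. This is a minimal sketch, not AutoGrid itself. The 12-6 Lennard-Jones form, the parameters `eps` and `rmin`, the nearest-point lookup and all function names are assumptions made for the example.

```python
import math


def build_grid(receptor_atoms, center, n_points, spacing=0.375,
               eps=0.15, rmin=4.0):
    """Store a 12-6 Lennard-Jones probe energy at each lattice point.

    receptor_atoms: list of (x, y, z) tuples; eps and rmin are hypothetical
    well-depth and optimal-distance parameters for one probe atom type.
    """
    half = (n_points - 1) / 2.0
    origin = tuple(c - half * spacing for c in center)
    grid = {}
    for i in range(n_points):
        for j in range(n_points):
            for k in range(n_points):
                p = (origin[0] + i * spacing,
                     origin[1] + j * spacing,
                     origin[2] + k * spacing)
                e = 0.0
                for atom in receptor_atoms:
                    r = max(math.dist(p, atom), 0.5)  # clamp very short distances
                    s6 = (rmin / r) ** 6
                    e += eps * (s6 * s6 - 2.0 * s6)   # minimum of -eps at r = rmin
                grid[(i, j, k)] = e
    return origin, grid


def lookup(origin, grid, n_points, spacing, point):
    """Energy at the lattice point nearest to `point` (no interpolation)."""
    idx = tuple(min(n_points - 1, max(0, round((x - o) / spacing)))
                for x, o in zip(point, origin))
    return grid[idx]


if __name__ == "__main__":
    origin, grid = build_grid([(0.0, 0.0, 0.0)], center=(4.0, 0.0, 0.0), n_points=5)
    print(lookup(origin, grid, 5, 0.375, (4.0, 0.0, 0.0)))
```

One such map would be built per ligand atom type, so scoring a pose costs one lookup per ligand atom instead of a sum over all receptor atoms.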
GOLD
• Genetic Optimisation for Ligand Docking; uses multiple subpopulations of ligands
• Force-field based scoring function, includes three terms: H-bonding term, intermolecular dispersion potential, intramolecular potential
• 71% success in identifying experimental binding mode in 100 protein complexes
Tabu Search methods
• Impose restrictions preventing searches from repeating already explored conformations
• New conformation is compared to the previous ones based on RMSD values which determine acceptance
• PRO-LEADS
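The RMSD-based restriction can be sketched as below. This is a minimal illustration, not the PRO-LEADS algorithm: a conformation is a flat list of coordinates, the tabu list holds recently visited conformations, and a trial is rejected when its RMSD to any of them is below a threshold unless it beats the best energy found so far. All names and parameter values are hypothetical.

```python
import math
import random


def rmsd(a, b):
    """Root-mean-square deviation between two equal-length coordinate lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def tabu_search(energy, start, n_iter=200, n_trials=20, step=0.3,
                tabu_rmsd=0.1, tabu_size=25, seed=0):
    rng = random.Random(seed)
    current = list(start)
    best, best_e = current[:], energy(current)
    tabu = [current[:]]  # recently visited conformations
    for _ in range(n_iter):
        trials = [[x + rng.uniform(-step, step) for x in current]
                  for _ in range(n_trials)]
        trials.sort(key=energy)
        chosen = None
        for t in trials:
            is_tabu = any(rmsd(t, old) < tabu_rmsd for old in tabu)
            # A tabu move is still allowed if it improves on the best so far.
            if not is_tabu or energy(t) < best_e:
                chosen = t
                break
        if chosen is None:
            continue  # every trial was tabu; draw new trials
        current = chosen
        tabu.append(current[:])
        if len(tabu) > tabu_size:
            tabu.pop(0)  # forget the oldest entry
        e = energy(current)
        if e < best_e:
            best, best_e = current[:], e
    return best, best_e


if __name__ == "__main__":
    energy = lambda c: sum(x * x for x in c)  # hypothetical toy energy
    print(tabu_search(energy, [2.0, -2.0]))
```

Because the lowest-energy non-tabu trial is accepted even when it is worse than the current conformation, the search keeps moving away from regions it has already explored.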
Systematic Search methods
• Attempt to explore all degrees of freedom in a molecule
• Can be divided into three types: conformational search methods, fragmentation methods, and database methods
Conformational Search methods
• Brute force or shotgun methods of docking
• All rotatable bonds in the ligand are rotated through 360° in fixed increments until all possible combinations have been generated and evaluated
• Number of structures generated increases exponentially with the number of rotatable bonds (combinatorial explosion)
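The growth is easy to quantify: with a fixed angular increment, each rotatable bond has 360/increment settings, so n bonds give (360/increment)^n conformers. The sketch below enumerates them; the function names and the toy torsion energy are hypothetical and only illustrate the counting.

```python
import itertools
import math


def count_conformers(n_bonds, increment_deg):
    """Number of torsion combinations for a fixed angular increment."""
    return (360 // increment_deg) ** n_bonds


def systematic_conformers(n_bonds, increment_deg):
    """Lazily yield every combination of torsion angles, in degrees."""
    angles = range(0, 360, increment_deg)
    return itertools.product(angles, repeat=n_bonds)


def best_conformer(energy, n_bonds, increment_deg):
    """Brute force: evaluate every combination and keep the lowest energy."""
    return min(systematic_conformers(n_bonds, increment_deg), key=energy)


def toy_torsion_energy(torsions):
    # Hypothetical energy with a single minimum at 180 degrees per bond.
    return sum(1.0 + math.cos(math.radians(t)) for t in torsions)


if __name__ == "__main__":
    print(count_conformers(3, 120))    # 3 settings per bond, 3 bonds
    print(count_conformers(10, 30))    # 12 settings per bond, 10 bonds
    print(best_conformer(toy_torsion_energy, 3, 60))
```

Three bonds at 120° increments give only 27 conformers, while ten bonds at 30° increments already give 12^10, roughly 6.2 × 10^10.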
Fragmentation Search methods
• Incrementally grow ligand into the active site, by docking several fragments into the active site followed by covalent-linking to recreate the initial ligand
• Rigid core-fragment of the ligand is docked first followed by addition of flexible regions
• DOCK, FlexX, LUDI, ADAM, Hammerhead
DOCK Methodology
FlexX
• Base fragment is picked and docked using the “pose-clustering” algorithm
• Clustering algorithm is implemented to merge similar ligand transformations into active site
• Flexible fragments are added incrementally using MIMUMBA and evaluated using overlap function, followed by energy calculations till the ligand is completely built
• Final evaluation through Böhm’s scoring function that includes H-bonds, ionic, aromatic and lipophilic terms
Database methods
• Tackle combinatorial explosion by using libraries of pregenerated conformations to deal with ligand flexibility
• FLOG generates and docks conformational libraries called Flexibases using distance geometry
• EUDOC uses conformational searches of ligands to generate different structures, which are placed into receptor active-site followed by energy evaluation
Scoring
• Essential to rank the ligand conformations determined by the search algorithms
• Scoring function must be able to distinguish between true binding modes and others
• Speed and accuracy are most desirable
• Three major classes: force-field based; empirical; knowledge-based
Force-field based Scoring
• Quantify the sum of two energies: the receptor-ligand interaction energy and the internal energy of the ligand
• Consist of van der Waals (Lennard-Jones potential) + electrostatic energy terms (Coulombic function)
• Do not include solvation and entropic terms
• GoldScore, G-SCORE, D-SCORE, AMBER, CHARMM, GROMOS
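The interaction-energy part of such a score can be sketched as a sum over receptor-ligand atom pairs of a Lennard-Jones term and a Coulomb term. This is a simplified sketch, not the functional form of any program listed: a single hypothetical `eps`/`rmin` pair is used for all atom types, the dielectric is constant, and the ligand internal energy (which would be summed the same way over non-bonded ligand atom pairs) is left out.

```python
import math

COULOMB_K = 332.06  # kcal·Å/(mol·e²): converts q1*q2/r to kcal/mol


def pair_energy(r, q1, q2, eps=0.15, rmin=4.0, dielectric=1.0):
    """12-6 Lennard-Jones plus Coulomb energy for one atom pair (kcal/mol)."""
    s6 = (rmin / r) ** 6
    lennard_jones = eps * (s6 * s6 - 2.0 * s6)   # minimum of -eps at r = rmin
    coulomb = COULOMB_K * q1 * q2 / (dielectric * r)
    return lennard_jones + coulomb


def interaction_score(receptor, ligand):
    """Sum pair energies over all receptor-ligand atom pairs.

    Each atom is a tuple (x, y, z, partial_charge).
    """
    total = 0.0
    for rx in receptor:
        for lx in ligand:
            r = math.dist(rx[:3], lx[:3])
            total += pair_energy(r, rx[3], lx[3])
    return total


if __name__ == "__main__":
    receptor = [(0.0, 0.0, 0.0, -1.0)]
    ligand = [(4.0, 0.0, 0.0, 1.0)]
    print(interaction_score(receptor, ligand))
```

Nothing in this sum accounts for desolvation or entropy, which is the limitation the slide points out.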
Empirical Scoring
• Designed to reproduce experimental data; binding energy can be approximated by a sum of individual uncorrelated terms
• Experimentally determined binding energies used to quantify individual terms
• Easy computation, but non-versatile due to dependence on experimental datasets
• ChemScore, Böhm’s scoring function, F-Score, X-Score
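The "sum of individual uncorrelated terms" amounts to a weighted sum of simple counts and areas taken from the complex. The sketch below shows that structure only. The weights in `WEIGHTS` are placeholders invented for illustration, not the published coefficients of any scoring function, and the feature names are hypothetical; in practice the weights would be fitted by regression against experimentally determined binding energies.

```python
# Hypothetical weights (arbitrary energy units), for illustration only.
WEIGHTS = {
    "const": 5.0,    # constant offset
    "hbond": -4.0,   # per hydrogen bond
    "ionic": -8.0,   # per ionic interaction
    "lipo": -0.2,    # per Å² of lipophilic contact surface
    "rotor": 1.5,    # penalty per frozen rotatable bond
}


def empirical_score(features, weights=WEIGHTS):
    """Estimate binding energy as a weighted sum of independent terms."""
    score = weights["const"]
    for name, value in features.items():
        score += weights[name] * value
    return score


if __name__ == "__main__":
    pose = {"hbond": 2, "ionic": 1, "lipo": 100.0, "rotor": 3}
    print(empirical_score(pose))
```

Because the weights come from a training set of complexes, the score is only as transferable as that dataset.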
Knowledge-based Scoring
• Statistically derived principles that aim to replicate experimentally determined structures
• Employ simple interactions to screen large databases
• Dependent on information available in preexisting datasets
• DrugScore, SMoG score, Potential of Mean Force (PMF)
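A common way to turn structural statistics into a potential is the inverse Boltzmann relation, w(r) = -kT ln(g(r)), where g(r) is the ratio of how often an atom-pair distance is observed in known complexes to how often it occurs in a reference state. The sketch below assumes pre-binned counts. The bin layout, the pseudocount and the function names are hypothetical and do not reproduce any published potential.

```python
import math

KT = 0.592  # kcal/mol at roughly 298 K


def pmf_from_counts(observed, reference, kT=KT, pseudo=1e-6):
    """Per-bin potential w = -kT * ln(observed frequency / reference frequency)."""
    total_obs = sum(observed)
    total_ref = sum(reference)
    potential = []
    for obs, ref in zip(observed, reference):
        # The small pseudocount avoids log(0) for empty bins.
        g = (obs / total_obs + pseudo) / (ref / total_ref + pseudo)
        potential.append(-kT * math.log(g))
    return potential


def score_pose(distances, potential, bin_width):
    """Sum the potential over the atom-pair distances of one pose."""
    total = 0.0
    for r in distances:
        idx = int(r / bin_width)
        if idx < len(potential):  # pairs beyond the last bin contribute nothing
            total += potential[idx]
    return total


if __name__ == "__main__":
    pot = pmf_from_counts([10, 40, 30, 20], [25, 25, 25, 25])
    print(pot)
    print(score_pose([0.5, 1.5], pot, bin_width=1.0))
```

Distances seen more often than in the reference state get a negative (favourable) value, and rarely seen distances get a positive one.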
Consensus Scoring
• Combines information from different scoring schemes to compensate for individual limitations
• Correlation of individual scoring systems may be a problem
• X-SCORE combines scoring functions from PMF, ChemScore and FlexX
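One simple way to combine scoring schemes is rank averaging: each function ranks the ligands separately and the mean rank decides the final order, which sidesteps the different units and scales of the raw scores. The sketch below illustrates that one strategy only and is not the scheme of any particular program. It assumes lower scores are better, and the function name and example values are hypothetical.

```python
def consensus_rank(score_tables):
    """Rank ligands by their mean rank across several scoring functions.

    score_tables maps a scoring-function name to {ligand: score}, with
    lower scores meaning better predicted binding.
    """
    ligands = list(next(iter(score_tables.values())))
    totals = {lig: 0.0 for lig in ligands}
    for scores in score_tables.values():
        ordered = sorted(ligands, key=lambda lig: scores[lig])
        for rank, lig in enumerate(ordered, start=1):
            totals[lig] += rank
    n = len(score_tables)
    return sorted((totals[lig] / n, lig) for lig in ligands)


if __name__ == "__main__":
    tables = {
        "A": {"x": -9.0, "y": -7.0, "z": -5.0},
        "B": {"x": -30.0, "y": -40.0, "z": -10.0},
        "C": {"x": -2.0, "y": -1.0, "z": -3.0},
    }
    for mean_rank, ligand in consensus_rank(tables):
        print(ligand, round(mean_rank, 2))
```

If the individual functions are strongly correlated they tend to make the same mistakes, so averaging their ranks adds little; that is the correlation problem noted above.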
Protein-protein Docking
• Prediction of protein complex structure given individual components’ structures
• Huge number of degrees of freedom; docking largely performed as rigid body docking
• Z-DOCK, a Fast Fourier Transform-based rigid body docking program, is one of the most accurate programs as rated in Critical Assessment of Predicted Interactions (CAPRI)
Docking- strengths and limitations
• Most available software packages can “predict” known protein-bound conformations to within 1.5-2 Å, with a 70-80% success rate
• Scoring functions are the major limiting factor, owing to their simplifications and assumptions
• Other limitations: solvation effects, quality of crystallographic data
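Success rates like these are typically computed by comparing each docked pose with the crystallographic ligand position. The sketch below assumes the accuracy figure is a root-mean-square deviation (RMSD) over matched atoms, with both poses already in the receptor's coordinate frame so that no superposition is applied. The function names and the 2 Å default cutoff are assumptions for the example.

```python
import math


def pose_rmsd(predicted, reference):
    """RMSD between two poses given as equal-length lists of (x, y, z)."""
    squared = sum((p - r) ** 2
                  for atom_p, atom_r in zip(predicted, reference)
                  for p, r in zip(atom_p, atom_r))
    return math.sqrt(squared / len(reference))


def success_rate(pairs, cutoff=2.0):
    """Fraction of (predicted, reference) pairs with RMSD within the cutoff."""
    hits = sum(1 for pred, ref in pairs if pose_rmsd(pred, ref) <= cutoff)
    return hits / len(pairs)


if __name__ == "__main__":
    reference = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
    good_pose = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]   # RMSD of 1 Å
    bad_pose = [(3.0, 0.0, 0.0), (0.0, 3.0, 0.0)]    # RMSD of 3 Å
    print(pose_rmsd(good_pose, reference))
    print(success_rate([(good_pose, reference), (bad_pose, reference)]))
```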
Comparing docking software is difficult
• Several studies compare docking programs but conclusions of general applicability are not evident
• Minor differences in methodology can have significant impact on success rates of various docking programs
• Cole et al., 2005 PROTEINS 60, 325-332 provide a list of recommendations in assessing docking programs
Figure: representation of docking software in citations
Figure: docking software, citations per year
Challenges
• Predicting structures of multi-domain, multi-subunit protein complexes
• Prediction and specificity in protein-nucleic acid interactions
• Protein docking with backbone flexibility