Computer-aided drug design platform using PyMOL

User Guide

Markus Lill Department of Medicinal Chemistry and Molecular Pharmacology

College of Pharmacy Purdue University

575 Stadium Mall Drive West Lafayette, IN 47907 Email: [email protected]

Table of Contents

Installation
  Download and Installation
  Amber plugin
    Prerequisites
      Molecular mechanics program Amber and AmberTools
      File format conversion: Open Babel
      Protein preparation: Reduce
      Free energy calculation: SIE
      Free energy calculation: MM/PBSA
  AutoDock Vina plugin
    Prerequisites
      Docking program AutoDock Vina and MGLTools
  Slide plugin
  QSAR plugin
Setup
  Setup
  Username
  Amber plugin
    Cluster settings for Amber simulations
  AutoDock Vina plugin
  Slide plugin
  QSAR plugin
Plugin: amber_Linux.py
  Preparation of protein-ligand complex
    Optimize hydrogen-bond network
    Analysis of water molecules
    Ligand and protein preparation for molecular mechanics simulations
      Separating ligand and protein
      Preparing ligand
      Preparing protein
  Molecular mechanics simulations
    Energy minimization (interactive)
    Molecular dynamics (MD) simulations of protein-ligand complex (background)
      Start simulation
      Monitor and analyze simulation
      B-factor analysis
      RMSD analysis
      Energy analysis
    Molecular dynamics (MD) simulations of protein alone (background)
      Start simulation
      Monitor and analyze simulation
      B-factor analysis
      RMSD analysis
      Energy analysis
  Free energy calculations
    Solvent interaction energy (SIE) analysis
Plugin: AutoDockVina.py
  Generate ligand library
  Prepare and run docking
    Static docking
    Docking with flexible side chains
    Ensemble docking

Chapter 1: Installation

In this chapter we will describe the installation process of the provided plugins.

Download and Installation

The software package currently consists of two Python plugins (two more will follow soon) and a settings file that records the locations of external software and the databases used. The plugins act as an interface between PyMOL, the GUI used to view, prepare and analyze protein-ligand complexes, and the molecular modeling software that performs computations and simulations on those complexes. External software must therefore be installed in addition to the plugins. The current version of the plugins is designed for Linux.

Before downloading and installing our scripts, please download PyMOL from http://pymol.org and install it on your computer. The location of the PyMOL installation is referred to as $PyMOL in the following.

The file Settings_Linux.txt needs to be downloaded and copied to your home directory.

Amber plugin

The plugin to perform molecular mechanics simulations using Amber can be downloaded here:

amber_Linux.py

It needs to be located in $PyMOL/modules/pmg_tk/startup/ where $PyMOL is the top-directory containing your local PyMOL installation.
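The copy step can also be scripted. The following is a minimal sketch, assuming the plugin file has already been downloaded; the function name install_plugin and its arguments are illustrative and not part of the plugin package.

```python
import shutil
from pathlib import Path


def install_plugin(plugin_file, pymol_root):
    """Copy a plugin file into the PyMOL startup directory.

    pymol_root is the top directory of the local PyMOL installation
    (called $PyMOL in this guide). Both arguments are supplied by the
    caller; nothing here is detected automatically.
    """
    startup_dir = Path(pymol_root) / "modules" / "pmg_tk" / "startup"
    if not startup_dir.is_dir():
        raise FileNotFoundError(f"startup directory not found: {startup_dir}")
    target = startup_dir / Path(plugin_file).name
    shutil.copy(plugin_file, target)
    return target
```

A call such as install_plugin("amber_Linux.py", "/path/to/pymol") returns the path of the installed copy; the path shown is a placeholder for your own installation.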

The file AMBER_library.tar.gz needs to be downloaded (default location: /usr/local) and extracted using

tar -zxf AMBER_library.tar.gz

Prerequisites

To utilize the plugin, the following programs have to be installed:

Molecular mechanics program Amber and AmberTools

Our plugin has been tested with AmberTools 1.2 and Amber10 (http://ambermd.org).

First, install AmberTools using the default location /usr/local/amber10. Please set the environment variable AMBERHOME=/usr/local/amber10 (we added the variable to our .bashrc file, as bash is our default shell).

Install Amber to the same location (default: /usr/local/amber10).
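Whether the variable is visible to programs started from your shell can be checked with a few lines of Python. This is a minimal sketch; the helper name check_amberhome is illustrative and not part of the plugins.

```python
import os
from pathlib import Path


def check_amberhome(environ=os.environ):
    """Return the AMBERHOME directory if it is set and exists, else raise."""
    amberhome = environ.get("AMBERHOME")
    if not amberhome:
        raise EnvironmentError("AMBERHOME is not set")
    path = Path(amberhome)
    if not path.is_dir():
        raise EnvironmentError(f"AMBERHOME does not exist: {path}")
    return path
```

If the function raises, the export line is probably missing from the shell start-up file, or the shell has not been restarted since it was added.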

File format conversion: Open Babel

To convert between different file formats our plugins use Open Babel (http://openbabel.org/wiki/Get_Open_Babel).

Our plugins have been tested with the source package of Open Babel 2.3.1.

Protein preparation: Reduce

To add hydrogen atoms to the protein and optimize the protein’s hydrogen-bond network we make use of the program reduce. Please download and install Reduce from http://kinemage.biochem.duke.edu/software/reduce.php. The executable has to be named reduce.

Our plugin has been tested with reduce 3.14. We downloaded the src distribution to /usr/local/reduce, untarred the file and ran make in the directory reduce.3.14.080821.src. The executable will be located in the subdirectory reduce_src with the default name reduce.

Free energy calculation: SIE

Free energy analysis based on an AMBER protein-ligand MD trajectory can optionally be performed using Solvent Interaction Energy analysis with sietraj (http://www2.bri.nrc.ca/ccb/pub/sietraj_main.php).

Free energy calculation: MM/PBSA

PBSA contributions to the free energy of binding may be calculated during protein-ligand complex optimization (entropy to follow). The following changes to the standard MM/PBSA installation of Amber 10 are required:

• Organic molecules, such as the ligand in our tutorial example, can contain elements such as F, Cl, Br and I. Atomic radii need to be added to the file $AMBERHOME/src/mm_pbsa_calceneent.pm in the subroutine generate_pqr:

AutoDock Vina plugin

The plugin to perform docking using AutoDock Vina can be downloaded here:

AutoDockVina.py

It needs to be located in $PyMOL/modules/pmg_tk/startup/ where $PyMOL is the top-directory containing your local PyMOL installation.

Prerequisites

To utilize the plugin, the following programs have to be installed:

Docking program AutoDock Vina and MGLTools

Our plugin has been tested with AutoDock Vina 1.1.2 (http://vina.scripps.edu) and MGLTools 1.5.4 (http://mgltools.scripps.edu/downloads).

Slide plugin

We are currently adjusting the script to the current version of Slide. An updated version will be available as soon as possible.

QSAR plugin

... will follow.

Chapter 2: Setup

In this chapter we will describe the process of correctly setting up the plugins.

Setup

Before using the plugins, the paths to the molecular modeling programs need to be set up correctly. This setup can be performed from within any of the individual plugins.

To start the setup, please start PyMOL. New menus named Amber and AutoDock Vina will appear.

Select the menu item Modify Settings_Linux.txt file from either of the two new menus.

Specify the following settings and the paths for the following plugins (Slide and QSAR plugin settings can currently be ignored):

Username

First, press the button Reset to defaults and default settings will be assigned to all variables. As part of this process, the username should have been automatically detected.

Amber plugin

Please specify the path to AMBER_library containing co-factors, the $AMBERHOME variable containing the Amber installation, and the paths where the executables for the programs sietraj and reduce are installed (changepdb and changecrd are currently not used and can be ignored).

Cluster settings for Amber simulations

Setup and initiation of MD simulations on external clusters using PBS is implemented in the Amber plugin.

Please specify up to ten clusters, each with its full name or IP address, the maximum number of cores that may be allocated per MD simulation, the SSH port (default: 22), the queue name and home directory, and the path where Amber is installed on the cluster. The PBS scripts are set up for the conventions used at Purdue University (changes to the code might be required for other cluster setups).

AutoDock Vina plugin

Please specify the path to AUTODOCK_library containing co-factors, the directory of the AutoDock Tools installation, and the path where the AutoDock Vina executable is installed.

Slide plugin

... will follow.

QSAR plugin

... will follow.

Chapter 3: Plugin: amber_Linux.py

In this chapter, we will discuss how the PyMOL plugin amber_Linux.py can be used to perform energy minimization, molecular dynamics (MD) simulations and free energy calculations using Amber and other molecular modeling programs.

We will demonstrate the use of amber_Linux.py on the complex of the enzyme thrombin with an inhibitor (PDB code: 1MU6). We will first download and set up the protein-ligand system, then perform energy minimization and MD simulations, and finally calculate free energies using SIE.

Preparation of protein-ligand complex

In this section, we will introduce the concepts to prepare a protein-ligand complex for subsequent molecular modeling studies.

Optimize hydrogen-bond network

To download the PDB file 1MU6, please open PyMOL and go to menu Plugin → PDB Loader Service. Enter 1MU6 or 1mu6 and press OK.

In X-ray structures, the side-chain conformations of Asn, Gln and His and the protonation and tautomer states of His cannot be determined unambiguously from the electron density alone, because the groups/atoms N and CH, as well as NH2 and O, have similar electron density. Thus, side-chain conformations, protonation states and tautomer states of these residues have to be selected based on an analysis of hydrogen-bond networks (including waters). This plugin uses Reduce, which performs the hydrogen-bond network analysis, automatically flips Asn, Gln and His side chains if necessary, and assigns a protonation state to each His residue.

Run Reduce to update the Asn, Gln and His states automatically:

Amber → Use Reduce to optimize ASN,GLN,HIS conformations/protonation states

Select the protein for which the hydrogen-bond network should be optimized: In this tutorial only 1mu6 is available to select.

After the analysis and assignment is performed, a window appears that displays the assigned histidine states and the His, Asn or Gln sidechains that have been flipped:

(HIE = ε-protonated, HID = δ-protonated, HIP = doubly-protonated)
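The three residue names encode which ring nitrogen carries a hydrogen. A minimal sketch of that mapping follows, assuming the usual Amber hydrogen atom names HD1 (on the δ nitrogen) and HE2 (on the ε nitrogen); the helper histidine_state is illustrative and is not a function of the plugin.

```python
def histidine_state(atom_names):
    """Classify a histidine from the ring hydrogens among its atom names."""
    has_delta = "HD1" in atom_names    # hydrogen on the delta nitrogen
    has_epsilon = "HE2" in atom_names  # hydrogen on the epsilon nitrogen
    if has_delta and has_epsilon:
        return "HIP"  # doubly protonated
    if has_delta:
        return "HID"  # delta-protonated
    if has_epsilon:
        return "HIE"  # epsilon-protonated
    raise ValueError("no ring hydrogen found; hydrogens may not have been added")
```

This can be handy for cross-checking a prepared structure against the states listed in the dialog.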

Manual modifications can be made if necessary:

• Rotation of Asn, Gln, His: Choose Mouse Mode: 3-Button Editing [lower right window] by clicking on 3-Button Viewing; pick the bond to use as axis of rotation [Ctrl-Right mouse button]; change the torsion value [Ctrl-Left mouse button and move the mouse up/down]. Important: click on the half of the bond next to the group you want to rotate.

• Modify histidine state: Select the HIS residue first [it is now named HIE, HID or HIP], then use Amber → Modify HIS state and choose the specific state.

Save current state as Tutorial_Amber_1.pse [use File → Save Session As …].

Analysis of water molecules

Water molecules play a critical role in mediating interactions between protein residues (filling cavities), at protein-protein contacts and between protein and ligand. They can, however, also be artifacts of the X-ray experiment:

• Common buffer constituents such as sodium ions, ammonium ions and water molecules are isoelectronic and can only be distinguished based on their local environment.

• If the resolution is not high, features in the electron density might be water or noise. Often water is added simply to artificially reduce the difference between observed and calculated structure-factor amplitudes.

It is important to include water molecules that are stabilizing the protein itself or the interaction between protein and ligand. Other waters however should be removed from the simulation, e.g. for the following reasons:

• Implicit solvation simulations: Surface waters might become unstable and diffuse away from the protein.

• Docking: Water molecules that mediate interactions to ligand A might prevent binding of structurally diverse ligand B.

The plugin can be used to select and potentially remove water molecules based on simple criteria:

• location: surface or cavity

• mediating ligand-protein interactions or not

• 0, 1 or 2+ hydrogen bonds

Use Amber → Remove waters, select protein, and change criteria for number of hydrogen bonds:

Under default settings, water molecules are selected if they are not engaged in hydrogen bonds with the protein or have fewer than two hydrogen bonds with the protein or other water molecules (see Figure above).
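The default criterion can be written as a small filter. This is a sketch of one reading of the rule above, not the plugin's own code: the Water record, its field names and the hydrogen-bond counts are assumed to be supplied by the caller.

```python
from dataclasses import dataclass


@dataclass
class Water:
    resi: int               # residue number of the water molecule
    hbonds_to_protein: int  # hydrogen bonds to protein atoms
    hbonds_to_water: int    # hydrogen bonds to other water molecules


def select_waters(waters, min_total_hbonds=2):
    """Return residue numbers of waters matching the default criterion.

    A water is selected if it has no hydrogen bond to the protein, or
    if its total number of hydrogen bonds (protein plus water) is
    below min_total_hbonds.
    """
    selected = []
    for water in waters:
        total = water.hbonds_to_protein + water.hbonds_to_water
        if water.hbonds_to_protein == 0 or total < min_total_hbonds:
            selected.append(water.resi)
    return selected
```

Raising or lowering min_total_hbonds corresponds to changing the hydrogen-bond criterion in the dialog.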

The selected water molecules are stored in the selection (waters) and highlighted as selection in the PyMOL Viewer. Water molecules can be manually added to or removed from the selection.

To remove the waters, select (waters) → A → remove atoms.

Save current state as Tutorial_Amber_2.pse.

You can also repeat the process after removing the first layer of water molecules.

Ligand and protein preparation for molecular mechanics simulations

Separating ligand and protein

Select ligand CDA in the PyMOL Viewer and extract ligand by selecting

(sele) → A → extract object.

Rename ligand object obj01 to ligand_1mu6 by selecting obj01 → A → rename object and type ligand_1mu6 as new name.

Preparing ligand

Next, please check hybridization states and add hydrogen atoms to ligand:

Select Builder [smaller window] and check single/double/triple bonds. Bond character can be modified by choosing “|”, “||” or “|||” and clicking on bond that should be modified (not necessary for ligand in our tutorial).

To add hydrogen atoms, select any ligand atom and click on Add H [smaller window].

Finally, analyze protein-ligand contacts to adjust hydrogen positions and protonation states (not necessary for ligand in tutorial):

• Modify hydrogen positions by rotating bond manually: Choose Mouse Mode: 3-Button Editing [lower right window] by clicking on 3-Button Viewing; pick the bond to use as axis of rotation [Ctrl-Right mouse button]; change the torsion value [Ctrl-Left mouse button and move the mouse up/down]. Important: click on the half of the bond next to the group you want to rotate.

• Change protonation state of ligand if necessary by selecting atom (in editing mode) and select Charge state [smaller window: “+1”, “0”, “-1”].

Save current state as Tutorial_Amber_3.pse.

Preparing protein

The tutorial example 1mu6 contains the modified amino acid TYS. In order to run simulations with such residues, the force field files for Amber have to be manually modified. (We refer to the Amber manual for details). Alternatively, non-standard residues and co-factors such as heme can be added to the AMBER_library directory (location specified by library_dir):

Force field files for Amber (.frcmod and .lib file) need to be manually generated and added to AMBER_library (e.g. using existing force field parameters or utilizing the GAFF force field in combination with antechamber, parmchk and tleap [programs are part of the AmberTools suite]). A line for each non-standard residue/co-factor needs to be added to cofactors.txt containing the 3-letter code in the PDB file and the name of the .frcmod/.lib file (without extension):

Bonds between the non-standard residue and other residues can be specified in the bonding.txt file: Each possible bond is specified in one line with residue and atom name for one residue and corresponding names for the other residue. Bonds are only formed if the distance between specified atoms is smaller than the cutoff value defined in the fifth column [in Å]:
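To make the five-column layout concrete, the sketch below parses one such line and applies the distance test. It assumes whitespace-separated columns in the order residue 1, atom 1, residue 2, atom 2, cutoff; the example entry in the usage note is hypothetical and not taken from the distributed bonding.txt.

```python
import math


def parse_bond_rule(line):
    """Split one rule line into two (residue, atom) pairs and a cutoff in Å."""
    res1, atom1, res2, atom2, cutoff = line.split()
    return (res1, atom1), (res2, atom2), float(cutoff)


def bond_is_formed(coord1, coord2, cutoff):
    """Return True if the two atoms are closer than the cutoff (in Å)."""
    return math.dist(coord1, coord2) < cutoff
```

For a hypothetical rule "CYS SG HEM FE 2.6", a bond would be formed between a cysteine SG atom and a heme FE atom only when they are less than 2.6 Å apart.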

In this tutorial, we will just remove residues 355-365 containing the non-standard residue TYS: Select residues 355-365 in the PyMOL Viewer and delete all atoms by

(sele) → A → remove atoms.

Next, hydrogen atoms are added to the protein:

Amber → Prepare protein

Finally, analyze flexible hydrogen atoms in the protein, in particular at the interface between protein and ligand: Identify possible Ser, Thr, Tyr residues and hydroxyl groups, amines (tetrahedral) and rotate bonds if necessary.

Save current state as Tutorial_Amber_4.pse.

Molecular mechanics simulations

Energy minimization (interactive)

Energy minimization drives a system into the nearest local energy minimum (not the global energy minimum). Structures downloaded from the PDB are not necessarily minimized; in particular, added hydrogen atoms (e.g. on water molecules) are not.
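The difference between a local and the global minimum can be seen on a toy example. The sketch below runs plain gradient descent on a one-dimensional double-well function; both the function and the minimizer are illustrative stand-ins and have nothing to do with the Amber force field or the algorithm used by the plugin.

```python
def energy(x):
    """Toy double-well energy with one shallow and one deep minimum."""
    return x**4 - 3 * x**2 + x


def gradient(x):
    """Analytical derivative of the toy energy."""
    return 4 * x**3 - 6 * x + 1


def minimize(x, steps=100, step_size=0.01):
    """Follow the negative gradient for a fixed number of steps."""
    for _ in range(steps):
        x -= step_size * gradient(x)
    return x


if __name__ == "__main__":
    for start in (1.0, -1.0):
        end = minimize(start)
        print(f"start {start:+.1f} -> x = {end:+.3f}, E = {energy(end):+.3f}")
```

Started on the right-hand side, the minimizer stays in the shallow well even though a deeper one exists on the left; the starting structure decides which minimum is reached.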

Start with Tutorial_Amber_4.pse.

Minimize structure using 100 steps:

a. Start process with Amber → Minimize (AMBER, interactive)

b. Pick protein and ligand:

c. Choose minimization settings:

i. Restraints:

Different restraints are applicable to the protein (a, protein is not considered in minimization; b, protein and water molecules are fully restrained; c, protein and water atoms beyond a zone with a certain radius around the ligand are restrained; d, all atoms are unrestrained).

ii. Ligand, partial charges:

The net charge of the ligand and the type of charge calculation need to be specified (a, Gasteiger charges; b, semi-empirical AM1-BCC).

iii. Number of minimization steps:

iv. Solvation settings:

In the interactive minimization process, only implicit solvation is currently available, using either the GB model of Onufriev, Bashford, and Case (OBC) or a simple distance-dependent dielectric.

v. Energy calculation:

By default, only the potential energy of the optimized protein-ligand complex is output. PB/SA or SIE energies will be computed if the corresponding buttons are checked. The MM/PBSA program from Amber10 and/or SIE need to be properly installed and adjusted as described in Chapter 1.

Press OK and the minimization will start. Using the settings above, the simulation will take a few minutes. The protein-ligand structure will be automatically updated and the output energies will be displayed in a new window:

The window lists the individual energy contributions for the complex (COMPLEX), the protein (RECEPTOR), the ligand (LIGAND) and the difference between those entities (DELTA, the protein-ligand interaction energy): ELE = electrostatic energy, VDW = van der Waals energy, INT = internal energy, GAS = ELE + VDW + INT, PBSUR = nonpolar contribution to the solvation free energy, PBCAL = electrostatic contribution to the solvation free energy, PBSOL = sum of nonpolar and polar contributions to the solvation free energy, PBELE = sum of the electrostatic solvation free energy and the MM electrostatic energy, PBTOT = final estimated binding free energy calculated from the terms above (all energies are in kcal/mol).
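The relations between these terms can be checked by hand or with a short script. The sketch below recomputes the derived terms from the five primary ones. It assumes PBTOT = PBSOL + GAS, which is the usual MM/PBSA convention but is not spelled out in this guide, and DELTA = COMPLEX - RECEPTOR - LIGAND; the function names are illustrative.

```python
def derived_terms(ele, vdw, internal, pbsur, pbcal):
    """Compute the derived MM/PBSA terms from the primary ones (kcal/mol)."""
    gas = ele + vdw + internal  # gas-phase MM energy
    pbsol = pbsur + pbcal       # total solvation free energy
    pbele = pbcal + ele         # electrostatic solvation plus MM electrostatics
    pbtot = pbsol + gas         # assumed definition of the final estimate
    return {"GAS": gas, "PBSOL": pbsol, "PBELE": pbele, "PBTOT": pbtot}


def delta(complex_terms, receptor_terms, ligand_terms):
    """Term-by-term difference: complex minus receptor minus ligand."""
    return {
        key: complex_terms[key] - receptor_terms[key] - ligand_terms[key]
        for key in complex_terms
    }
```

Comparing the recomputed values with the numbers in the output window is a quick way to confirm which columns have been read correctly.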

Save current state as Tutorial_Amber_5.pse.

Molecular dynamics (MD) simulations of protein-ligand complex (background)

Start simulation

Based on the already prepared and minimized structure of 1mu6 (Tutorial_Amber_5.pse) we want to run molecular dynamics (MD) simulations and analyze the dynamics of the protein-ligand complex:

a. Load Tutorial_Amber_5.pse into PyMOL (this is the minimized X-ray structure)

b. Set up MD simulation: Amber → Molecular Mechanics for protein-ligand complex (AMBER, background)

c. Select protein and ligand:

d. Choose directory to store output:

e. Specify settings for MD simulation:

i. Restraints:

Different restraints are applicable to the protein (a, protein is not considered in minimization; b, protein and water molecules are fully restrained; c, protein and water atoms beyond a zone with a certain radius around the ligand are restrained; d, all atoms are unrestrained).

ii. Ligand, partial charges:

The net charge of the ligand and the type of charge calculation need to be specified (a, Gasteiger charges; b, semi-empirical AM1-BCC).

iii. Number of time steps:

The MD simulations are by default divided into four sections: energy minimization, water equilibration with protein and ligand restrained, equilibration of the whole system, and production run. Individual sections can be turned off by deselecting the corresponding checkboxes (only bottom-up deselecting is possible). The frequency of energy and coordinate output is specified in the last column.

(Note: The current settings represent only a very short simulation. Standard MD simulations are typically on the order of several ns.)
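To judge how many time steps a production run of realistic length needs, the conversion between step count and simulated time can be sketched as follows. The 2 fs time step is an assumption (a common choice when bonds to hydrogen are constrained); the time step actually used by the plugin is not stated here.

```python
FS_PER_NS = 1_000_000  # femtoseconds per nanosecond


def steps_for_length(length_ns, timestep_fs=2.0):
    """Number of MD steps needed to cover length_ns nanoseconds."""
    return round(length_ns * FS_PER_NS / timestep_fs)


def length_in_ns(n_steps, timestep_fs=2.0):
    """Simulated time in nanoseconds covered by n_steps MD steps."""
    return n_steps * timestep_fs / FS_PER_NS
```

Under this assumption, a 5 ns production run corresponds to 2.5 million steps, whereas a few thousand steps cover only a few picoseconds.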

iv. Solvation:

Three different solvation methods are currently incorporated: implicit solvation using GB/OBC, a solvation cap with a user-defined radius, and a solvation box for periodic boundary simulations with a user-defined minimum distance between any protein or ligand atom and the boundary.

v. Resources:

The user can define the computer or compute cluster on which the simulation should be performed. Resources other than the local computer need to be defined during the setup stage (see chapter 2).

Please make sure that you can ssh and scp to the server without a password prompt (set up public keys). Otherwise, just prepare the files for the background job, copy the data to the server and start the jobs manually. In this case, the result files also need to be copied manually to the local computer.
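Password-free access can be tested before a job is submitted. The sketch below builds an ssh call with the OpenSSH options BatchMode=yes (fail instead of prompting for a password) and ConnectTimeout; the helper names are illustrative and the host name in the usage note is a placeholder.

```python
import subprocess


def ssh_check_command(host, port=22):
    """Build an ssh command that fails instead of asking for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",       # never prompt for a password
        "-o", "ConnectTimeout=10",   # give up after ten seconds
        "-p", str(port),
        host,
        "true",                      # harmless remote command
    ]


def passwordless_ssh_works(host, port=22):
    """Return True if the remote command ran without any prompt."""
    result = subprocess.run(ssh_check_command(host, port), capture_output=True)
    return result.returncode == 0
```

Calling passwordless_ssh_works("cluster.example.edu") returns False when key-based login has not been set up for that host.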

f. Amber will now generate ligand charges, generate the protein topology and add water molecules and ions to neutralize the system. The interface script will do this automatically and will also generate input files containing run settings. It will finally (after a few minutes) start the job in the background.

Monitor and analyze simulation

You can check the progress of the simulation and import the results of already finished portions (minimization, position restraint, equilibration and/or production run) with

Amber → Monitor/Read molecular mechanics results

Select the MD_test_client subdirectory and open Monitor.prg in the dialog that appears:

The dialog appearing next will display the progress of the MD simulation. If you initiate the monitor process after a short period of time following the start of the simulation, only a small fraction of the MD process will be completed:

Just press Cancel and repeat the monitoring process after a few minutes. You can also check whether a sander process is still running, for example by typing top in a console.

If the simulation has finished you can select the portion you want to visually inspect in PyMOL or just transfer the output from server to client:

If the results haven’t been transferred yet, this process might take several minutes depending on connection speed (if run on an external server/cluster) and MD output size. [For the current settings, the process should be completed in several seconds if the simulation was performed on the local computer.]

If you select to visually inspect all four portions of the MD, four new objects will appear in PyMOL:

min_fin structure after minimization

pr_traj trajectory of position restraint run

equ_traj trajectory of equilibration run

md_traj trajectory of production run

You can control the animation with the lower right bar:

or display an overlay of snapshots with

Movie → Show All States

B-factor analysis

Start with a fresh PyMOL session.

To display the computed b-factors open md_bfact.pdb from the MD_test_client folder and select

Amber → Color by b-factor

and choose md_bfact from the subsequent menu.

The following color coding was chosen:

Blue → green → red for low → medium → high b-factors.

The MD simulation in the tutorial was chosen to be very short. Thus, no significant fluctuations can be observed (most atoms are colored in blue). You can repeat the MD simulation with much longer simulation length and perform the b-factor analysis again.

You can also display b-factors for the original X-ray structure. Load the original X-ray structure 1mu6 with

Plugin → PDB Loader Service

and type 1mu6.

Select Amber → Color by b-factor

and choose 1MU6 from the subsequent menu.

RMSD analysis

Import md_rmsd_backbone.dat (folder: MD_test_client) into OpenOffice/Excel or any other plotting program and plot rmsd vs. simulation time. The rmsd is calculated with respect to the starting structure.
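If no spreadsheet program is at hand, the file can also be inspected with a few lines of Python. The sketch below assumes that the file holds two whitespace-separated columns (simulation time and RMSD) and that lines starting with # are comments; this layout is an assumption and should be checked against the actual file.

```python
def read_two_columns(path):
    """Read (time, value) pairs from a whitespace-separated text file."""
    times, values = [], []
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) < 2 or fields[0].startswith("#"):
                continue  # skip blank, short and comment lines
            times.append(float(fields[0]))
            values.append(float(fields[1]))
    return times, values


def summarize(values):
    """Return mean and maximum of a non-empty list of numbers."""
    return {"mean": sum(values) / len(values), "max": max(values)}
```

The same reader can be reused for other two-column output files, and the returned lists can be passed to any plotting library.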

Energy analysis

Import md_energies.dat (folder: MD_test_client) into OpenOffice/Excel and plot total energy, potential energy and kinetic energy vs. simulation time.

Molecular dynamics (MD) simulations of protein alone (background)

Start simulation

Based on the already prepared and minimized structure of 1mu6 (Tutorial_Amber_5.pse) we want to run molecular dynamics (MD) simulations of the protein alone:

a. Load Tutorial_Amber_5.pse into PyMOL (this is the minimized X-ray structure)

b. Set up MD simulation: Amber → Molecular Mechanics for protein (AMBER, background)

c. Select protein and ligand:

d. Choose directory to store output:

e. Specify settings for MD simulation:

i. Restraints:

Different restraints are applicable to the protein: a, protein and water atoms beyond a zone with a certain radius around a user-defined coordinate are restrained (the coordinate has to be defined under Solvation::Solvation cap::Center); b, all atoms are unrestrained.

ii. Number of time steps:

The MD simulations are by default divided into four sections: energy minimization, water equilibration with the protein restrained, equilibration of the whole system, and production run. Individual sections can be turned off by deselecting the corresponding checkboxes (only bottom-up deselecting is possible). The frequency of energy and coordinate output is specified in the last column.

(Note: The current settings represent only a very short simulation. Standard MD simulations are typically on the order of several ns.)

iii. Solvation:

Three different solvation methods are currently incorporated: implicit solvation using GB/OBC, a solvation cap with a user-defined radius and center coordinates, and a solvation box for periodic boundary simulations with a user-defined minimum distance between any protein atom and the boundary.

iv. Resources:

The user can define the computer or compute cluster on which the simulation should be performed. Resources other than the local computer need to be defined during the setup stage (see chapter 2).

Please make sure that you can ssh and scp to the server without a password prompt (set up public keys). Otherwise, just prepare the files for the background job, copy the data to the server and start the jobs manually. In this case, the result files also need to be copied manually to the local computer.

f. Amber will now generate the protein topology and add water molecules and ions to neutralize the system. The interface script will do this automatically and will also generate input files containing run settings. It will finally (after a few minutes) start the job in the background.

Monitor and analyze simulation

You can check the progress of the simulation and import the results of already finished portions (minimization, position restraint, equilibration and/or production run) with

Amber → Monitor/Read molecular mechanics results

Select the MD_prot_test_client subdirectory and open Monitor.prg in the dialog that appears.

The dialog appearing next will display the progress of the MD simulation. If you initiate the monitor process after a short period of time following the start of the simulation, only a small fraction of the MD process will be completed:

Just press Cancel and repeat the monitoring process after a few minutes. You can also check whether a sander process is still running, for example by typing top in a console.

If the simulation has finished you can select the portion you want to visually inspect in PyMOL or just transfer the output from server to client:

If the results haven’t been transferred yet, this process might take several minutes depending on connection speed (if run on an external server/cluster) and MD output size. [For the current settings, the process should be completed in several seconds if the simulation was performed on the local computer.]

If you select to visually inspect all four portions of the MD, four new objects will appear in PyMOL:

min_fin structure after minimization

pr_traj trajectory of position restraint run

equ_traj trajectory of equilibration run

md_traj trajectory of production run

The animation can be controlled with the lower right bar:

or an overlay of snapshots can be generated with

Movie → Show All States

B-factor analysis

Start with fresh PyMOL session.

To display the computed b-factors open md_bfact.pdb from the MD_prot_test_client folder and select

Amber → Color by b-factor

and choose md_bfact from the subsequent menu.

The following color coding was chosen:

Blue → green → red for low → medium → high b-factors.

The MD simulation in the tutorial was chosen to be very short. Thus, no significant fluctuations can be observed (most atoms are colored in blue). The MD simulation can be repeated with much longer simulation length and the b-factor analysis can be performed again.
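To inspect the b-factors outside PyMOL, they can be read directly from the PDB file, where they occupy columns 61-66 of each ATOM/HETATM record. The sketch below is an illustration only: the function names are hypothetical, and the linear blue → green → red ramp is an assumption about how such a color coding could be computed, not the plugin's actual code.

```python
def read_bfactors(pdb_text):
    """Return the b-factor of every ATOM/HETATM record (PDB columns 61-66)."""
    values = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 66:
            values.append(float(line[60:66]))
    return values


def normalize(values):
    """Scale values linearly to the range 0..1 (all zeros if they are equal)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def blue_green_red(t):
    """Map t in 0..1 to an RGB tuple: blue (low), green (medium), red (high)."""
    if t <= 0.5:
        s = t / 0.5
        return (0.0, s, 1.0 - s)   # blue fades into green
    s = (t - 0.5) / 0.5
    return (s, 1.0 - s, 0.0)       # green fades into red
```

Applied to the contents of md_bfact.pdb, normalize(read_bfactors(text)) gives one value per atom; for the short tutorial run most of them would be expected to fall near the blue end.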

B-factors for the original X-ray structure can also be displayed. Load the original X-ray structure 1mu6 with

Plugin → PDB Loader Service

and type 1mu6.

Select Amber → Color by b-factor

and choose 1MU6 from the subsequent menu.

RMSD analysis

Import md_rmsd_backbone.dat (folder: MD_prot_test_client) into OpenOffice/Excel or any other plotting program and plot RMSD vs. simulation time. The RMSD is calculated with respect to the starting structure.
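If you prefer a script over a spreadsheet, the file can be read with a few lines of Python. This is a sketch only: the layout of md_rmsd_backbone.dat is not specified in this guide, so the code assumes two whitespace-separated columns (time, RMSD) and skips any line that does not start with two numbers. The function names are hypothetical.

```python
def read_two_columns(text):
    """Parse (time, value) pairs, skipping comments and non-numeric lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            rows.append((float(fields[0]), float(fields[1])))
        except ValueError:
            continue  # header or comment line
    return rows


def summarize(rows):
    """Return the mean value, the maximum value and the time of the maximum."""
    values = [value for _, value in rows]
    time_of_max, peak = max(rows, key=lambda row: row[1])
    return {
        "mean": sum(values) / len(values),
        "max": peak,
        "time_of_max": time_of_max,
    }
```

The list returned by read_two_columns can be passed to any plotting library, with the first column on the x-axis and the second on the y-axis.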

Energy analysis

Import md_energies.dat (folder: MD_prot_test_client) into OpenOffice/Excel and plot total energy, potential energy and kinetic energy vs. simulation time.

Free energy calculations

Solvent interaction energy (SIE) analysis

In the previous section, we performed a short MD simulation on the already prepared and minimized structure of 1mu6:

Project subdirectory: MD_test_client

To calculate free energies of binding using the SIE approach,

a. Open PyMOL and select Amber → Calculate free energy (SIE, AMBER only).

b. Select Monitor.prg file in folder MD_test_client.

c. Specify settings for SIE calculation:

The production trajectory of the short simulation contains 10 frames. You can select the first and last frames as well as the interval (frequency) of frames that should be considered in the calculation of the free energy of binding (SIE will use the average over all energies computed for the selected frames).
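The effect of the first frame, last frame and interval settings can be illustrated with a short sketch. The function names are hypothetical, frames are assumed to be numbered from 1, and the energy values in the usage note are placeholders, not SIE results.

```python
def select_frames(first, last, interval):
    """Return the 1-based frame numbers first, first+interval, ... up to last."""
    if first < 1 or last < first or interval < 1:
        raise ValueError("need 1 <= first <= last and interval >= 1")
    return list(range(first, last + 1, interval))


def average_over_frames(energies, frames):
    """Average per-frame energies over the selected 1-based frame numbers."""
    picked = [energies[frame - 1] for frame in frames]
    return sum(picked) / len(picked)
```

With the 10-frame production trajectory, select_frames(1, 10, 3) picks frames 1, 4, 7 and 10, and the reported value is the mean of the energies computed for these four frames only.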

The calculations will take between 5 and 10 minutes. A new window will then appear, displaying the free energies of binding for this ligand-protein complex as calculated with SIE:

Please note the individual energy components (experimental affinity: -11.5 kcal/mol)

You can also open the underlying file independently from PyMOL in an editor: sie_ave.out

In addition a text file is generated (sie.txt) that contains the individual free energy components of SIE for all snapshots.

Chapter 4. Plugin: AutoDockVina.py

In this chapter, we will discuss how the PyMOL plugin AutoDockVina.py can be used to perform docking calculations using AutoDock Vina.

We will demonstrate the use of AutoDockVina.py for docking several inhibitors to the enzyme thrombin (PDB-code: 1MU6). We will first generate a small ligand library, download and setup the protein system, and then perform docking calculations.

Generate ligand library

The general procedure to generate a ligand library is to generate ligands as individual PyMOL objects. Alternatively, ligands can be imported as individual files (e.g. as pdb or mol2 files). Make sure that the topology (including bond orders) and hydration state are correctly assigned to each ligand. Please note the net charge of each ligand.

We will start with a series of pre-generated thrombin inhibitors in mol2 format:

Download the file Ligands.tar.gz from our, and store it on your local computer. Open a terminal, change into the subdirectory containing Ligands.tar.gz. Then, unzip and un-tar the file Ligands.tar.gz by typing:

tar -zxf Ligands.tar.gz

This will produce a folder Ligands in the subdirectory which contains the individual pdb files. Change into the Ligands subdirectory, and then open the pdb files in PyMOL by typing:

pymol *.pdb
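The unpacking step can also be done from Python, for example on systems without a tar command. This is a minimal sketch using the standard tarfile module; the function name is hypothetical, and the archive should only be unpacked if it comes from a trusted source.

```python
import tarfile
from pathlib import Path


def unpack_ligands(archive, destination="."):
    """Extract a .tar.gz archive and return the pdb files found below destination."""
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(destination)
    return sorted(Path(destination).rglob("*.pdb"))
```

Calling unpack_ligands("Ligands.tar.gz") in the download directory returns the paths of the extracted pdb files, which can then be opened in PyMOL as shown above.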

Set up the ligand library for AutoDock Vina using the menu item:

AutoDockVina → Export ligand library

Specify ligand name and net charge of ligand:

Specify library name:

A ligand library will be generated under $autodock_dir (default: $HOME/AUTODOCK_library).

Prepare and run docking

Static docking

In this tutorial, we will use the PDB structure 1mu6 as template for docking thrombin inhibitors.

In general, the original PDB structure needs to be modified by separating ligand from the protein. This is best done by generating two distinct objects in PyMOL. In this tutorial, we will use the minimized structure from the 1mu6 protein-ligand complex (Tutorial_Amber_5.pse).

Restart PyMOL and open Tutorial_Amber_5.pse. Choose

AutoDockVina → Prepare system and start AutoDock

from the menu bar. The following dialog will appear:

Define Project subdirectory: project_xray_ligands

Define Ligand library by pressing button Search and Import and selecting Thrombin_xray_ligands:

At this stage all ligands that have been exported by a user will show up.

Select the protein template for docking: You can use a prepared protein file that is currently not imported into PyMOL, you can perform docking to an ensemble of protein structures (either using an NMR-type PDB file or an Amber trajectory file; see section: Ensemble docking), or use the currently imported protein template. In this tutorial, we will use the last option.

In this tutorial, we already have prepared the protein structure (including assignment of protonation states). PyMOL will reassign the protonation states if you check “Let autodock change protonation states”.

Finally, parameters for the docking calculations can be set:

a. Search volume:

The search volume should cover the binding pocket of thrombin. Two options are suggested:

i. Display only x-ray ligand (or several x-ray ligands if applicable) and manually define box to cover space of ligand plus some extra space. Use the sliders to define center and box size in every direction (see figure above).

ii. Select the x-ray ligand (or several ligands) in the PyMOL Viewer (the selection (sele) has to show up as object/selection in the right column of the PyMOL Viewer). Press the button Determine new box coordinates, and the center of the box will be shifted to the center of mass of all atoms in the selection (sele). [This can also be used by selecting several amino acid residues in the binding site, without use of ligand data.]

Press button Determine new box dimensions and the box will be rescaled to fit all atoms in the selection (sele) plus a user-defined extra space in every direction using Radius around selection (has to be set before pressing button).
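The two button actions can be pictured with a small geometric sketch. It is not the plugin's code: the function names are hypothetical, the unweighted mean of the coordinates stands in for the center of mass (atomic masses are ignored), and the box is assumed to stay centered on that point while being enlarged until every atom plus the extra space fits.

```python
def box_center(coords):
    """Unweighted mean position of the selected atoms (masses are ignored)."""
    n = len(coords)
    return tuple(sum(atom[i] for atom in coords) / n for i in range(3))


def box_size(coords, center, padding):
    """Edge lengths of a box around center that holds all atoms plus padding."""
    return tuple(
        2.0 * (max(abs(atom[i] - center[i]) for atom in coords) + padding)
        for i in range(3)
    )
```

For three atoms at (0, 0, 0), (2, 0, 0) and (4, 6, 0) and an extra space of 1.0 Å, this gives a center of (2, 2, 0) and box dimensions of 6 × 10 × 2 Å.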

The search volume is displayed as a yellow box in the PyMOL Viewer and is updated continuously:

b. Flexible residues:

AutoDock Vina can be run with a static protein structure (no flexible residues) or with flexible side chains of selected residues. The different selection mechanisms for flexible residues are discussed in the next subchapter, Docking with flexible side chains.

Please select No flexible residues for the current tutorial:

c. Output options:

The first line defines the maximum number of predicted binding poses that will be stored. The second line adds the condition that a pose is only stored if its score differs from that of the lowest-score pose by less than the specified energy difference.

Please note that the current AutoDock Vina version allows a maximum number of 20 poses to be stored.
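In an AutoDock Vina configuration file, these two settings correspond to the options num_modes and energy_range. The sketch below writes such a fragment; the function name is hypothetical, the upper limit of 20 is taken from the note above, and whether the plugin writes its configuration in exactly this form is an assumption.

```python
def vina_output_options(num_modes, energy_range):
    """Return the output-related lines of a Vina configuration file."""
    if not 1 <= num_modes <= 20:
        raise ValueError("num_modes must be between 1 and 20")
    if energy_range <= 0:
        raise ValueError("energy_range must be positive (kcal/mol)")
    return f"num_modes = {num_modes}\nenergy_range = {energy_range}\n"
```

For example, vina_output_options(9, 3.0) requests at most 9 poses whose scores lie within 3.0 kcal/mol of the best pose.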

The docking calculation will start in the background. Docking per ligand on a single core takes about 1-5 minutes on current architecture. You can check whether the calculation has been completed and import the results with AutoDockVina → Import Results. Select Monitor.aut from the folder project_xray_ligands_client.

If docking hasn’t completed, the following message will appear:

If docking has completed, the following message will appear:

Before pressing OK, make sure that all objects or the full session are saved if you want to keep the data for future use. All existing objects will be deleted and the docking solutions will be read into PyMOL. The different states of an object in PyMOL display the ligand configurations. You can use the same buttons and menu items as for the MD simulation analysis to toggle between the states or show all states at the same time.

If you would like to add the protein template, use File → Open and select preparedProtein.pdb from the folder project_xray_ligands_client/frames:

Also, a dialog will show the predicted affinities for each ligand:

If you press the Show details button, the individual scores for each pose of the selected ligand will be displayed.

Docking with flexible side chains

Docking can also be performed with flexible side chains. The torsions of the side chains are treated as additional degrees of freedom throughout the posing stage.

The preparation of the docking calculations is identical to that described in the previous section, until the flexible residues have to be defined:

The flexible residues are defined by a selection (flexible_residues) in the PyMOL Viewer. This selection can be initialized by selecting all residues inside the search volume (All residues fully contained in search volume) or inside a sphere around a selection of atoms (sele), e.g. the co-crystallized ligand (Residues partially contained within sphere around any atom of selection). The selection can then be manually modified by deselecting and selecting new residues in the PyMOL Viewer. Please note that the number of flexible residues should be small, as the computation time increases significantly with the number of additional degrees of freedom. For example, in our tutorial, if all residues within a 4.0 Å zone are considered flexible, the docking time will increase from 1-5 minutes for the static structure to about 2-3 hours.

In this tutorial, we use all residues within a 3.0 Å zone around the ligands, which should be defined in the selection (sele). Docking will take about 15-30 minutes.
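The sphere-based selection can be sketched as a simple distance test: a residue is picked as soon as any of its atoms lies within the cutoff of any ligand atom. This is an illustration under stated assumptions, not the plugin's implementation; the function name and the residue labels in the usage note are hypothetical.

```python
import math


def residues_near(protein_atoms, ligand_coords, cutoff=3.0):
    """Return residues with at least one atom within cutoff of any ligand atom.

    protein_atoms is an iterable of (residue_id, (x, y, z)) pairs.
    """
    near = set()
    for residue_id, xyz in protein_atoms:
        if residue_id in near:
            continue  # residue already selected through another atom
        if any(math.dist(xyz, ligand_atom) <= cutoff for ligand_atom in ligand_coords):
            near.add(residue_id)
    return sorted(near)
```

Given a residue "A:189" with one atom 1.0 Å from the ligand and another atom 4.0 Å away, the residue is selected at a cutoff of 3.0 Å because one atom is sufficient. A larger cutoff can only add residues, which is why the 4.0 Å zone is so much more expensive than the 3.0 Å zone.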

Importing the docking poses will also display the flexible residues in their predicted conformation:

Ensemble docking

In the previous section, we discussed the inclusion of side-chain flexibility into docking. This approach doesn’t include backbone flexibility, and the efficiency decreases significantly if too many residues are considered flexible. In this section, we will present another approach to incorporate protein flexibility into docking, known as ensemble docking: An MD trajectory (typically from the ligand-free state of the protein) is used to extract an ensemble of protein structures that serve as alternative templates for docking. Thus, each ligand will be docked into all members of the protein structure ensemble, and the computation time scales linearly with the size of the ensemble.

Here, we will use the MD simulation performed on thrombin (see Chapter 3) to generate alternative templates for docking thrombin inhibitors.

Restart PyMOL and load the original PDB structure 1mu6 again. Turn on Display → Sequence. We will only use the structure later for the identification of the search volume. Choose

AutoDockVina → Prepare system and start AutoDock

from the menu bar. The following dialog will appear:

Define Project subdirectory: project_xray_ligands_EPS

Define Ligand library by pressing button Search and Import and selecting Thrombin_xray_ligands:

Select Trajectory from Amber and specify the paths for the topology and trajectory files: MD_prot_test_client/prot.top and MD_prot_test_client/md.trj (see figure above).

You can also perform ensemble docking using an NMR-type PDB file.

When defining the search volume, you will recognize that Amber has shifted the coordinates of the protein with respect to the original PDB coordinates:

In the PyMOL Viewer select

1MU6 → align → to molecule → preparedProtein

and it will align the x-ray structure (including ligand) to the first snapshot of the MD trajectory (Don’t align the first snapshot onto the x-ray structure as this doesn’t change the coordinates of the full trajectory). Now, select the ligand CDA in PyMOL Viewer (selection (sele) will appear on the object bar on the right).

Now you can set the docking parameters including search volume as described under section “Static docking”.

The final dialog allows setting parameters for the subsequent clustering of the MD trajectory:

The MD trajectory contains ten frames that are structurally quite similar to each other (due to the short simulation length). Thus, a small initial Maximum RMSD between center of cluster and other members of the cluster of 0.25 Å is chosen for the QT clustering algorithm. As the size of the MD trajectory is small and the docking calculation should not take too long in this tutorial, we aim for a small number of clusters (= templates for ensemble docking). We set Aim for number of clusters to 3. The user can also specify that a cluster must have a certain minimum number of members (Minimum number of frames in cluster) to remove protein conformations that are rarely visited. Throughout the clustering stage, the maximum RMSD between the center of a cluster and its other members is increased or decreased to generate the targeted number of clusters, which is allowed to vary between 0.5× and 1.5× of the Aim for number of clusters. The RMSD between frames is calculated using only the residues within the search volume.
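The adaptive clustering described above can be sketched as follows. This is a simplified, center-based reading of QT clustering and not the plugin's code: the function names, the step size and the iteration limit are assumptions, and the input is a precomputed symmetric matrix of pairwise RMSD values between frames.

```python
def qt_cluster(dist, threshold, min_members=1):
    """Greedy clustering: repeatedly keep the center that gathers most frames."""
    unassigned = set(range(len(dist)))
    clusters = []
    while unassigned:
        best_center, best_members = None, []
        for center in sorted(unassigned):
            members = [j for j in sorted(unassigned) if dist[center][j] <= threshold]
            if len(members) > len(best_members):
                best_center, best_members = center, members
        unassigned -= set(best_members)
        if len(best_members) >= min_members:  # drop rarely visited conformations
            clusters.append((best_center, best_members))
    return clusters


def cluster_to_target(dist, target, threshold=0.25, step=0.05,
                      min_members=1, max_iter=100):
    """Adjust the RMSD threshold until 0.5*target <= n_clusters <= 1.5*target."""
    clusters = qt_cluster(dist, threshold, min_members)
    for _ in range(max_iter):
        if 0.5 * target <= len(clusters) <= 1.5 * target:
            break
        if len(clusters) > 1.5 * target:
            threshold += step                        # too many clusters: loosen
        else:
            threshold = max(0.0, threshold - step)   # too few clusters: tighten
        clusters = qt_cluster(dist, threshold, min_members)
    return threshold, clusters
```

Each returned cluster is a pair of the center frame index and the list of member frame indices; the center frames would serve as the templates for ensemble docking.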

Please note that the current settings are selected to accommodate the short length of the MD simulation. For more realistic simulation lengths of several nanoseconds, the default settings should be more appropriate.

The clustering process and the finally selected protein structures are displayed in the PyMOL Viewer:

The docking calculation will start in the background. Docking per ligand on a single core (with 4 templates) takes about 4-20 minutes on current architecture.

Importing the docking results will open a new dialog that allows you to cluster similar binding modes (resulting from different templates):

Poses will be clustered into one solution if their pairwise RMSD is lower than the Maximum RMSD between center of cluster and other members. Only clusters that contain at least a Minimum number of poses in cluster will be displayed as solutions. The import can be repeated with different settings, as the raw data of the docking calculations will not be modified. The different cluster centers are loaded as different objects into the PyMOL Viewer. Also, a dialog will show the predicted affinities for each ligand:

If you press the Show details button, the individual scores for each pose of the selected ligand will be displayed. In addition, histograms showing the frequency of observed scores for each cluster or for all clusters can be displayed by pressing the Show histogram buttons.