Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
COMPUTATIONAL APPROACHES TO BIOLOGICAL AND
SUPRAMOLECULAR SYSTEMS
COMPUTATIONAL APPROACHES TO PROTONATION AND
DEPROTONATION REACTIONS FOR BIOLOGICAL
MACROMOLECULES AND SUPRAMOLECULAR COMPLEXES
By
AHMED MOHAMMED
A Thesis
Submitted to the School of Graduate Studies
In Partial Fulfillment of the Requirements for the Degree
Master of Science
McMaster University
©Copyright by Ahmed Mohammed, September 2013
ii
Master of Science (2013) McMaster University
(Chemistry & Chemical Biology) Hamilton, Ontario, Canada
TITLE: Computational Approaches to Protonation and Deprotonation
Reactions for Biological Macromolecules and Supramolecular
Complexes
AUTHOR: Ahmed Mohammed M.Sc. Physical Chemistry, Inha University
B.Sc. Chemistry, Assiut University
SUPERVISOR: Professor Paul W. Ayers
NUMBER OF PAGES: x, 74
iii
ABSTRACT
Understanding and predicting chemical phenomena is the main goal of
computational chemistry. In this thesis I present my work on applying computational
approaches to study chemical processes in biological and supramolecular systems.
pH-responsive molecular tweezers have been proposed as an approach for
targeting drug-delivery to tumors, which tend to have a lower pH than normal cells. In
chapter 2 I present a computational study I performed on a pH-responsive molecular
tweezer using ab initio quantum chemistry in the gas phase and molecular dynamics
simulations in solution. The binding free energy in solution was calculated using Steered
Molecular Dynamics. We observe, in atomistic detail, the pH-induced conformational
switch of the tweezer and the resulting release of the drug molecule. Even when the
tweezer opens, the drug molecule remains near a hydrophobic arm of the molecular
tweezer. Drug release cannot occur, it seems, unless the tweezer is a hydrophobic
environment with low pH.
The protonation state of amino acid residues in proteins depends on their
respective pKa values. Computational methods are particularly important for estimating
the pKa values of buried and active site residues, where experimental data is scarce. In
chapter 3 I used the cluster model approach to predict the pKa of some challenging
protein residues and for which methods based on the numerical solution of the Poisson-
iv
Boltzmann equation and empirical approaches fail. The ionizable residue and its close
environment were treated quantum mechanically, while the rest of the protein was
replaced by a uniform dielectric continuum. The approach was found to overestimate the
electrostatic interaction leading to predicting lower pKa values.
v
ACKNOWLEDGEMENTS
First I would like express my deepest gratitude for my advisor, Professor Paul W.
Ayers for giving me the opportunity to work on my master’s in his group. I’m grateful for
his support, encouragement, and patience during a difficult time when I first came here.
He was always there when I needed him, and he inspires his students by his suggestions,
knowledge, and productivity.
I would like to thank Dr. Bain and Dr. Dumont who were my committee. Thanks
to their help, encouragement, and constructive suggestions.
Many thanks go to the Ayers group. I would like to thank Dr. Steven Burger for
his help with the research project. I would also like to thank Dr. Peter Limacher for his
help and Guidance. I would also like to thank all the previous and current members of the
Ayers group for their help and support: Rogelio Cuevas-Saavedra, Sandra Rabi, Carlos
Cardenas, Lourdes Romero, Alfredo Guevara, James Anderson, Ivan Vinogradov,
Debajit Chakraborty, Paul Johnson, Farnaz Heidarzadeh, Annie Liu, Cristina González,
Eleonora Echegaray, Mathew Chan, Katharina Boguslawski, Pawel Tecmer, Pavel
Kulikov. Thanks to all of them for being such good friends.
I would like to thank McMaster University and the department of chemistry and
chemical biology for me the opportunity to work here. Thanks to the Natural Sciences
and Engineering Research Council (NSERC) and McMaster University for financial
support.
vi
Last but not least, I would like to express my deepest gratitude for my family.
Thank you for your love and support.
vii
Table of Contents
Abstract ……………………………………………………………..……….………….……..… iii
Acknowledgements……………………….…………………………………………….……... …v
List of Figures………….………..…………………………………………………………….…..ix
List of Tables…………………………………………………………………………................... x
1 Background ............................................................................................................................ 1
1.1 Introduction ...................................................................................................................... 1
1.2 The Schrödinger Equation ............................................................................................... 3
1.3 Hartree-Fock Method ....................................................................................................... 5
1.4 Density Functional Theory ............................................................................................... 8
1.5 Molecular Mechanics ..................................................................................................... 11
1.6 Molecular Dynamics ...................................................................................................... 14
1.7 Steered Molecular Dynamics ......................................................................................... 16
1.8 Free Energy Calculations ............................................................................................... 17
1.9 pKa calculations ............................................................................................................. 20
1.10 References ...................................................................................................................... 23
2 Drug Release by pH-Responsive Molecular Tweezers: Atomistic Details from
Molecular Modelling .................................................................................................................... 27
2.1 Introduction .................................................................................................................... 27
2.2 Computational Details .................................................................................................... 30
2.2.1 Quantum Mechanics Calculations on the Switching Unit ..................................... 30
2.2.2 Quantum Chemistry Calculations of the Binding Energy in the Gas Phase .......... 30
2.2.3 Molecular Dynamics (MD) Simulations With Implicit Solvent ............................ 31
2.2.4 Steered Molecular Dynamics Simulations With Explicit Solvent ......................... 31
2.3 Results and Discussions ................................................................................................. 32
2.3.1 Relative stability of the switching unit conformers ............................................... 32
2.3.2 Relative Stability of Tweezer Conformations In the Gas Phase ............................ 36
viii
2.3.3 Gas-Phase Binding Energy .................................................................................... 41
2.3.4 Stable Conformations of the Tweezer in Implicit Water Solvent .......................... 41
2.3.5 Drug Release from the Tweezer ............................................................................. 42
2.4 Conclusions .................................................................................................................... 46
2.5 References ...................................................................................................................... 48
3 Failures of Embedded Cluster Models for pKa Shifts Dominated by Electrostatic
Effects……………………………………… ………………………………………………….. 53
3.1 Introduction .................................................................................................................... 53
3.2 Computational Details .................................................................................................... 56
3.2.1 pKa calculations...................................................................................................... 56
3.2.2 Free energy calculations ......................................................................................... 57
3.2.3 Protein model construction .................................................................................... 57
3.3 Results ............................................................................................................................ 58
3.3.1 Aspartate 75 ........................................................................................................... 58
3.3.2 Aspartate 21 ........................................................................................................... 60
3.4 Discussion ...................................................................................................................... 61
3.5 Conclusions .................................................................................................................... 63
3.6 References ...................................................................................................................... 65
4 Conclusions ........................................................................................................................... 70
ix
List of Figures
Figure 2.1: Different conformers of the protonated and deprotonated switching unit ortho-py. ... 33
Figure 2.2: Different conformers of the protonated tweezer .......................................................... 37
Figure 2.3: The half-open UW-conformer of the protonated tweezer. .......................................... 38
Figure 2.4: Two snapshots from molecular dynamics simulation using explicit solvent model for
the protonated tweezer ................................................................................................................... 43
Figure 2.5: Two snapshots from steered molecular dynamics simulation using explicit solvent
model for the protonated tweezer with the drug molecule. ............................................................ 45
Figure 3.1: Cluster model for Asp75 of bacterial barnase. ............................................................ 59
Figure 3.2: Cluster model for Asp21of fungal beta-cryptogein. .................................................... 61
x
List of Tables
Table 2.1: Calculated and experimental 13
C chemical shifts for the switching unit ortho-py in the
U shape........................................................................................................................................... 34
Table 2.2: Calculated and experimental 1H chemical shift changes upon the protonation of the
switching unit ortho-py. ................................................................................................................. 35
Table 2.3: Calculated 13
C chemical shifts for the fully-open (W-shaped) and the half-open (UW-
shaped) conformations of the protonated tweezer. ........................................................................ 39
Table 2.4: Calculated 1H chemical shifts for the fully-open (W-shaped) and the half-open (UW-
shaped) conformations of the protonated tweezer. ........................................................................ 40
Table 3.1: Comparison between experimental and several pKa prediction methods. .................... 60
END
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
1
Chapter 1
1 Background
1.1 Introduction
Computational chemists strive to understand, quantify, and predict chemical
phenomena by using efficient computer programs, based on fundamental principles from
theoretical chemistry, to simulate atoms, molecules, and solids. Computational
approaches allow us to characterize chemical phenomena in atomistic detail, and are
therefore often complementary to experimental techniques.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
2
In the last few decades, the types of systems that can be approached with
computational chemistry method have rapidly increased in size and complexity, allowing
computational chemists to enter new fields of chemistry, including materials’ chemistry
and (of primary relevance for this thesis) biological chemistry. The increased scope of
computational chemistry is primarily due to two factors: the development of improved
theoretical tools (mathematical methods, computer algorithms, and software) and the
exponential growth in available computing power. Nowadays, computational chemistry is
routinely used to predict molecular structure, energy profiles for chemical reactions,
dipole and higher multipole moments, electronic charge distribution, vibrational
frequencies, correlations between molecular structures and properties, molecular spectra,
electronic and thermal properties of materials, and drug binding affinities.[1] [2] [3] This
thesis presents my work using computational chemistry methods to simulate proton
abstraction and addition reactions in a biological settings. Designing efficient and reliable
computational approaches for simulating proton abstraction and addition reactions is
important for drug delivery (chapter 2) and enzymology and drug design (chapter 3). To
perform these simulations, a broad range of techniques were needed, ranging from
quantum mechanical calculations of molecular electronic structure, to molecular
dynamics simulations of thermodynamic properties, to continuum electrostatics
computations to estimate the effects a molecule’s surroundings on its (de)protonation
energy.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
3
Chapter two is a computational study of a pH sensitive molecular tweezer that is
designed to release a chemotherapy drug in tumors, which are more acidic than the
surrounding tissue. The goal of this study is to study pH sensitive molecular tweezers at
atomic resolution and to assess whether it is a promising new drug delivery vehicle.
Chapter 3 tests a cluster-model for determining the pKa of amino acid residues in
proteins. The pKa of amino acid residues is determinative of protein structure, dynamics,
and function and must be correctly assigned before numerical simulations of proteins can
commence. The results of the cluster model are disappointing when electrostatic
interactions between amino acids are important, suggesting that explicit simulations of
electrostatic interactions and desolvation effects are needed to accurately determine
protein pKa values. Chapter 4 summarizes the results of the thesis.
1.2 The Schrödinger Equation
Molecules’ structure, function, and dynamics are determined by the shape of the
molecular potential energy surface (PES) which, at the most fundamental level, is
determined by solving the Schrödinger equation. Solving the Schrödinger equation
provides the quantum-mechanical description of the motion of nuclei and electrons. The
time-independent Schrödinger equation for a molecule can be written in the form
1 2 1 2 1 2 1 2ˆ ( , ,...., , , ,...., ) ( , ,...., , , ,...., ) ,N n N nH R R R r r r E R R R r r r (1.1)
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
4
where Ĥ is the Hamiltonian operator (the sum of kinetic energy and potential energy
operators for the system), Ψ is the wavefunction for nuclei and electrons in the molecule,
RI is the position for nucleus I and ri is the position of electron i, and E is molecular
energy associated with the wavefunction. The molecular Hamiltonian in atomic units for
a molecule with n electrons and N nuclei can be written as
221 1 1ˆ ,
2 2
N n n N N nI JI I
i
I i i I I J i jI iI IJ ij
Z ZZH
m r r r
(1.2)
where mI and ZI are the mass and atomic numbers of the Ith
atom. rij, rIJ, and riI are
electron-electron, nucleus-nucleus, and electron-nucleus distances. The Schrödinger
equation can be solved exactly only for one-electron atoms. For molecular systems and
many-electron atoms, approximations must be used.
The Born-Oppenheimer Approximation is one of the most popular
approximations; [4] with the Born-Oppenheimer approximation, the Schrödinger
equation can be solved exactly for some one-electron molecules (and not just one-
electron atoms). In addition, the Born-Oppenheimer approximation, or one of several
similar approximations, is needed in order to define a molecular potential energy surface,
which is one of the most fundamental concepts in chemistry.[5]
The Born-Oppenheimer approximation is based on the realization that nuclei are
much heavier than electrons, and therefore move much more slowly. It is reasonable to
assume, then, that electrons respond to the nuclear movement instantaneously and that the
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
5
nuclei do not move on the timescale of electronic motion. Therefore electrons can be
considered moving while nuclei are fixed; this is the essence of the Born-Oppenheimer
approximation.
The Born-Oppenheimer approximation enables us to separate the total molecular
wavefunction into electronic and nuclear wavefunctions. The nuclear kinetic and
potential energy terms do not affect the electronic Hamiltonian, which can be written as
21 1ˆ .
2
n n N nI
e i
i i I i jiI ij
ZH
r r
(1.3)
The eigenvalue of the electronic Hamiltonian is the electronic energy; adding the
electronic energy to the nucleus-nucleus repulsion energy gives the potential energy
surface. Most of theoretical chemistry is concerned with finding various approximations
to the potential energy surface, and most of computational chemistry involves computing
the potential-energy surface and using it to elucidate chemical phenomena.
1.3 Hartree-Fock Method
The electronic Schrödinger equation cannot be solved for many-electron systems,
which motivates further approximations for the molecular wavefunction. One of the
earliest and most fundamental approximations is the Hartree-Fock method. The Hartree-
Fock method can be considered an approach that approximates the Hamiltonian by
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
6
replacing the problematic electron-electron repulsion potential with an nonlocal, but one-
electron, effective potential operator. The resulting Hartree-Fock equations can be solved
rather easily for systems with hundreds, and occasionally even thousands, of electrons.[6]
In the Hartree-Fock approximation, the electronic wavefunction is a single Slater
determinant containing the n occupied spin-orbitals.
To be more specific, in Hartree-Fock theory, the electronic Hamiltonian is
replaced by the Fock operator,
/2
1
ˆ ˆ ˆ ˆ1 1 2 1 1 ,n
core
j j j
j
F H J K
(1.4)
where ˆ 1jF
is the one-electron Fock operator, comprising the one-electron core
Hamiltonian ˆ 1coreH ,
2
1
1
1ˆ 1 ,2
core A
A A
ZH
r (1.5)
where ˆ 1jJ is the Coulomb operator, and ˆ 1jK is the exchange operator.
Solving the one-electron Schrödinger equation with the one-electron Fock
operator as a Hamiltonian gives one-electron wavefunctions for the system called the
Hartree-Fock spin-orbitals,
ˆ 1 1 1 .i i iF (1.6)
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
7
The Slater determinant of the occupied spin-orbitals is the many-electron wavefunction in
the Hartree-Fock Approximation.
The major deficiency of the Hartree-Fock approximation is the absence of
electron correlation: in Hartree-Fock theory, electrons move in an effective one-body
potential, and the detailed interelectronic motions (induced by the two-electron repulsion
term in the exact Hamiltonian) is neglected. There are many post-Hartree-Fock methods
for including electron correlation. One of the most affordable of these is Møller-Plesset
perturbation theory.[7] [8] [9]
In perturbation theory the Hamiltonian is written as
0 1ˆ ˆ ˆ ,H H H (1.7)
where 0H is the model Hamiltonian and λ is an arbitrary parameter. And the perturbation
expansion of wavefunctions and energies are
0 (1) 2 (2) ..... ,i i i i (1.8)
0 (1) 2 (2) ..... .i i i i (1.9)
Inserting these equations into the Schrodinger equation and equating terms, order-by-
order, in λ gives the perturbative equations. This allows us to approximate the
wavefunction and energies of a difficult-to-solve Hamiltonian (like the true electronic
Hamiltonian) from an easier-to-solve model Hamiltonian (like the Hartree-Fock
Hamiltonian). Møller-Plesset perturbation theory is perturbation theory where the model
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
8
Hamiltonian is the Hartree-Fock Hamiltonian. Like all post Hartree-Fock methods,
Møller-Plesset perturbation theory is computationally demanding, and cannot be applied
to large molecules.
1.4 Density Functional Theory
Density functional theory (DFT) is a method that approximates the effects of
electron correlation (like post Hartree-Fock methods) with a computational cost that is
similar to, and sometimes even less than, the Hartree-Fock approximation. It is widely
used to study organic reactions, biological processes, and materials. DFT is based on the
Hohenberg-Kohn theorem, which indicates that the ground-state electron density fully
specifies all molecular properties, including the exact electronic energy, molecular
potential energy surface, and many-electron wavefunction.[10] [11] [12] The electron
density is a simpler mathematical object than the many-electron wavefunction because it
is a function of only three variables: the three Cartesian coordinates x, y, and z. The
electronic density is also a simpler conceptual object, since it (unlike the many-electron
wavefunction) is a physical observable.
In DFT, the electronic energy is therefore written as a functional of the electron
density,
,elecE r E (1.10)
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
9
where Eelec is the exact electronic energy. The energy functional can be written as a sum
of two terms
,extE r V r r dr F r (1.11)
where the first term represents the interactions of electrons with an external potential
Vext(r) typically due to nuclei. The second term is the sum of the kinetic energy of
electrons and electron-electron interactions. The ground-state energy is determined by
variationally minimizing the energy with respect to the electron density.
In DFT, the problem of solving the electronic Schrödinger equation is replaced by
the problem of approximately determining the Hohenberg-Kohn function, F [ρ(r)], for
which no explicit mathematical equation is known. Kohn and Sham [13] suggested that
approximating F[ρ(r)] as the sum of three terms
,KE H XCF r E r E r E r (1.12)
where KEE r is the kinetic energy of a system of non-interacting electrons that has
the same density as the real system, HE r is classical self-repulsion energy of the
electron density with itself, and XCE r contains everything else, including the
potential energy contributions from exchange and electron correlation, as well as the
difference between the kinetic energy of the real and non-interacting systems. Only the
last term—the exchange-correlation energy functional—is unknown. Most of the current
research in DFT is related to designing improved approximations to EXC.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
10
The Kohn-Sham expression for the energy for a system of N nuclei and n
electrons is
21 2
1 2
1 1 2
1
1
2 2
.
n
i i XC
i
NA
A A
r rE r r r dr drdr E r
r r
Zr dr
r R
(1.13)
Kohn and Sham also showed that the density can be obtained from a set of one-
electron orthonormal orbitals:
2
1
,n
i
i
r r
(1.14)
which are the solutions of the one-electron Kohn-Sham equations
2
212 1 1 1
1 1 12
.2
NA
XC i i i
A A
rZdr V r r r
r r
(1.15)
Solving the Kohn-Sham equations is similar in cost to solving the Hartree-Fock
equations, but density-functional theory (approximately) includes the effects of electron
correlation.
DFT methods generally offer a good combination of accuracy and computational
cost. They are widely used for large systems. Unfortunately, there are hundreds of
approximate exchange-correlation functionals in the literature, and making a good choice
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
11
is not always easy. Given a good functional (which can be chosen by validation with
experiment or by calibration with a high-accuracy ―benchmark‖ quantum chemistry
method), DFT is a reliable and robust computational tool. One of the biggest advantages
of DFT is its ability to calculate analytical gradient of molecular energy with respect to
the positions of the nuclei. That allows the molecular geometry to be optimized
efficiently. Moreover, by using identities from linear algebra, conventional DFT
calculations can be performed with computational costs that grow linearly with the size of
the system. This allows one to perform DFT calculations on very large (even hundreds of
thousands of atoms) systems.
1.5 Molecular Mechanics
The disadvantage of quantum chemistry methods is two-fold: the number of
particles in the system is equal to the number of electrons plus the number of atomic
nuclei, and the electrons must be treated quantum-mechanically. Chemists often do not
think of molecules in terms of electrons, preferring to think of atoms linked by bonds—
the venerable ball-and-spring model of molecular structure. The ball-and-spring model is
the basis for a more computationally efficient model called molecular mechanics, where
electrons are not treated explicitly, and atoms are treated as classical particles that interact
using classical potentials. Molecular mechanics allows one to treat systems that are orders
of magnitude larger than the systems that can be treated by quantum mechanics.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
12
Molecular mechanics is especially important in condensed phases and for
macromolecules, where the number of atoms is large and the movement (not just the
optimal position) of atoms is important. Molecular mechanics techniques allow one to
routinely treat thousands of atoms, and even million-atom simulations are possible (using
supercomputers).[14] [15]
The essential approximations in molecular mechanics are represented by the force
field, which is an expression for the molecular potential energy as a function of the
nuclear positions. Different types of force fields are appropriate for different types of
problems, and several popular commercial force fields are available for molecular
modelling. Different force fields have different forms and values for parameters, but most
force fields have the same general form, based on chemical intuition:
.
bonded non bonded
bonded bond angle dihedral
non bonded van derWaals electrostatic
V V V
V V V V
V V V
(1.16)
The force field clearly represents a simplified model of interactions within a chemical
system. Usually the bond and angle terms are modeled by harmonic oscillators, and
torsion contribution are modeled by periodic functions. Van der Waals terms are usually
modeled using the empirical Lennard-Jones potential.[16] Coulomb interactions between
fixed atomic point charges are used to model electrostatic terms. In this study, we used
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
13
the Generalized Amber Force Field (GAFF); [37] its potential energy function has the
form
2 2
0 0
12 60 01
,
1 1 0
1 1 11 cos(
2 2 2
2 4
N
bond angle dihedral van derWaals electrostatic
b n
bonds angles torsions
N Nij ij i j
i j
j i j ij ij ij
V r V V V V V
k b b k V n
r r q q
r r r
,
(1.17)
where V is the potential energy, a function of positions (r) of N particles. Kb is the force
constant for stretching and b0 is the equilibrium bond length. Kθ is the force constant of
bending and θ0 is the equilibrium bond angle. Vn is the torsional barrier height and n is
periodicity, which is the number of maxima in a full rotation, is the value for torsional
angle, and γ is the phase angle. εi,j is Lennard-Jones potential well-depth and r0
ij is the
equilibrium distance between atoms i and j. qi and qj are partial charges of atoms i and j;
ε0 is the dielectric constant. The values of these parameters depend on the atom, atom
type, and its environment. There are standard techniques for parameterizing force fields,
but parameterizing force fields is truly an art: no parameterization can be trusted without
validation against more accurate calculations or experimental results.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
14
1.6 Molecular Dynamics
In 1913, Einstein and Stern suggested that molecules, and quantum-mechanical
oscillators in general, have a residual energy at the absolute zero and called it the zero-
point energy.[18] This means that although molecules (and, indeed, all systems with
bound ground states) have a structure with minimum potential energy, they rarely adopt
that structure all the time: instead they fluctuate around it. That is, even in their ground
state, small systems undergo structural fluctuations. Macroscopic properties of the system
(such as work, free energy, and entropy) can be measured from the microscopic
properties of its particles. Such systems should be modeled as an ensemble that represents
all possible configurations of the system.
Molecular dynamics (MD) is a computational approach to statistical mechanics.
Equilibrium and dynamic properties of systems, which cannot be calculated analytically,
can be estimated using MD. In MD simulations, successive configurations of a system of
interacting particles are produced by integrating Newton’s laws of motion. The time
evolution of positions and velocities of the particles of the system can be known.
As the MD simulation progresses, statistical data are collected. The statistical
mechanical average for the position vector of a particle A can be calculated from the
relation
1
1 ,
n
A A i
i
r r tn t
(1.18)
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
15
where A ir t is the position of atom A at time ti. Similarly, the total kinetic energy of the
atoms is calculated as the sum of the kinetic energies of individual particles
2
1 1
1.
2
n N
k j j i
i j
E m tn t
v (1.19)
Although the basic idea of MD simulations is simple, running an MD simulation
is challenging. Setting the initial conditions is not trivial and trajectories are sensitive to
initial condition. Choosing the suitable numerical integrator for Newton’s laws of motion
can be challenging. Sometimes including high frequency mode is necessary, which
requires using a very small timestep (Δt = 1 fs or less) which increases the computational
cost needed to simulate a system for a long enough time to converge the time-averages
one uses to compute observable properties.[19] There are standard techniques for
avoiding most of these problems, but MD simulations for large systems requires insights
from the computational chemist.
A more fundamental problem with MD simulations comes from the ergodic
hypothesis. The ergodic hypothesis states that over a long period of time all accessible
microstates are equiprobable; if the ergodic hypothesis is not true, then the time average
cannot be used to calculate the ensemble average, and the entire idea of molecular
dynamics is invalid. It is not known whether the ergodic hypothesis is actually
mathematically true for chemical and biological systems treated with conventional
molecular-mechanics force fields, but chemical systems can exhibit nonergodic behavior
during a (short) MD simulation because the full phase space (position and momentum
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
16
space, together) may not be explored due to thermodynamic bottlenecks that separate
different regions of phase space. Large systems are particularly problematic, since they
diffuse slowly.
1.7 Steered Molecular Dynamics
Adequately sampling phase space is essential for computing thermodynamic
properties with MD, but adequate sampling is increasingly difficult for large systems.
This is especially true when different thermodynamic states are separated by barriers on
the potential energy surface that are much higher than the typical kinetic energies of the
atoms in the system. Steered molecular dynamics (SMD) is a relatively new approach in
which time-dependent external forces are applied to the system to pull it along a certain
degree of freedom, thereby forcing the system to sample new regions of phase space. The
response of the system to the pulling-force is analyzed, and can be used to gain
qualitative insight and to compute quantitative properties (like the free-energy difference
between the starting and ending states). Like conventional MD simulations, SMD
simulations reveal transitions between molecular structures in atomic detail. Unlike
conventional MD simulations, SMD simulations can model processes that occur on
timescales of milliseconds or longer like the dissociation and association of
macromolecules. In biology SMD is used to simulate biological processes such as
binding and dissociation of substrates and proteins, conformational changes of protein,
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
17
and DNA stretching. SMD is also ideal for studying the response of molecules to
external forces, so it is used to study mechanical unfolding and stretching for proteins and
other polymers, as well as the mechanical DNA-unzipping.
One of the biggest advantages of SMD simulations is that it allows one to
simulate systems that are not in thermodynamic equilibrium. Jarzynski showed that the
free energy difference between two states can be calculated from the exponential
averages of irreversible work.[20] Through time series analysis methods, the potentials
of mean force can be calculated from multiple SMD simulations. SMD simulations are
particularly powerful when only qualitative results are required, rather than quantitative
ones. For those cases, such as finding pathways for binding, SMD is a useful tool even
when it has a large error bar. Another advantage of SMD simulations is that they mimic
the atomic force microscopy, so direct comparison with experiment is available.[21] [22]
1.8 Free Energy Calculations
One of the primary objectives in computational chemistry is to compute free-
energy differences. This allows one to ascertain the relative stability of different
molecules and conformations, leading to a quantitative understanding of chemical
kinetics and equilibrium. Of primary importance in this thesis are free-energy differences
of biologically relevant molecules like proteins (chapter 3) and supramolecular drug-
delivery vehicles (chapter 2). In biochemistry, free-energy differences are necessary for
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
18
computing ligand binding affinity (e.g., drug affinity), for pKa prediction (proton
affinity), and to characterize protein and DNA (un)folding.
In chemistry, it is usually the Gibbs free energy, G, that is of interest. The Gibbs
free energy is the thermodynamic potential of isothermal-isobaric ensemble (NPT), that is
the ensemble in which the number of particles N, the system pressure P, and temperature
T are fixed. The Gibbs free energy is defined as
,G H TS (1.20)
where H is the enthalpy and S is the entropy. When the volume, instead of the pressure, is
constant one is in the canonical NVT ensemble, and the free energy is the Helmholtz free
energy,
,
A H TS PV
U TS
(1.21)
where U is the internal energy of the system.
Depending on how one designs the MD simulation, either G or A can be
determined straightforwardly. Specifically, the free energy can be computed from the
partition function. For example, the Helmholtz free energy can be written as
( , , ) ln ( , , ) ,BA N V T k T Z N V T (1.22)
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
19
where kB is Boltzmann constant and Z(N,V,T) is the canonical partition function, which is
calculated from the relation
( , , ) ,B
E
k TZ N V T e
(1.23)
where the summation is done over all the points in phase space υ. Another way to
calculate the partition function is to integrate over the classical phase space. The partition
function can be written as
,
( , , ) exp ,
N N
N N
B
E p rZ N V T dp dr
k T
(1.24)
where rN is the position and p
N is the momentum. In conventional MD approaches, the
phase-space averages are simplified by integrating with respect to the momentum degrees
of freedom analytically; the remaining integration over conformational space is
approximated by a time-average (assuming the validity of the ergodic hypothesis).
Sampling can be performed using MD or Monte Carlo simulations. Because the
ensembles are (very) incompletely sampled by practical simulation techniques, accurate
estimation of partition function and absolute free energy is impossible. Fortunately, for
most applications were are only interested in the free energy difference between two
states, and free-energy differences between states that are not too dissimilar can be
estimated with reasonable precision. Traditionally, free energy differences have been
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
20
calculated using thermodynamic integration or umbrella sampling. In this thesis, we will
use the more modern method of steered molecular dynamics to calculate the free energy
differences.
1.9 pKa calculations
The free energy difference between an acid (or a base) in its protonated and its
deprotonated state is so important and fundamental to chemistry and biochemistry that it
has inspired a unique nomenclature and associated numerical and formal computational
techniques. The pKa of an acid is defined as
10pK log ,a aK (1.25)
where the acid-dissociation equilibrium constant, Ka, is defined as
dissociation ln .aG RT K (1.26)
The pKa value of a protein residue determines the protonation state of that residue
at a certain pH. If the pKa of the residue is less than seven, the residue tends to be
deprotonated in aqueous solution at standard temperature and pressure; if the pKa of the
residue is greater than seven, the residue tends to be protonated. Whether a residue is
protonated or not is important for protein function and stability.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
21
In the past few decades, many computational approaches have been developed to
predict protein residues’ pKa values. Most of these approaches fall into one of three
categories. The first approach is based on solving the linearized Poisson-Boltzmann
equation (LPBE) using finite difference methods.[23] [24] The protein is assigned a low
dielectric constant and is surrounded by a solvent with high dielectric constant, usually
water. Distribution of counterions in the solvent is described by Boltzmann distribution.
Fixed atomic charges are assigned to protein atoms and the electrostatic potential of
protein and solvent are calculated. The drawback of these methods is that it is formally
invalid to use a concept from macroscopic continuum electrostatics (the dielectric
constant) to compute a property that based on atomic details like (de)protonation.
The second approach is based on statistical fitting of an empirical model, rather
than equations derived from classical physics. These methods have the advantage of
being extremely fast and the disadvantage of lacking physical basis. The pKa value of the
protein residue is calculated by adding environmental perturbations to the pKa value of
the model residue
,model ,a a apK pK pK (1.27)
where apk is the sum of all perturbations from environmental factors: charge-charge
interactions, hydrogen bonding, and desolvation effects.[27]
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
22
The third approach is based on calculating the difference in free energy for the
protonated and non-protonated states using either QM methods or MM methods.[28] [29]
pKa values can be directly computed from the free energy difference through the relation
dissociation .
2.3026a
GpK
RT
(1.28)
The disadvantage of these methods is their computational expense. Rigorous methods
based on long-timescale molecular dynamics are computationally impractical because
each protein has many ionizable residues, suggesting that one should consider only a
portion of the enzyme, i.e., a molecular cluster embedded in the protein environment.
Chapter 3 studies a cluster model for computing protein pKa’s and compares it to cheaper
and more conventional approaches based on the linearized Poisson-Boltzmann equation
and statistical models.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
23
1.10 References
[1] Leach, A. R. Molecular Modelling: Principles and Applications, Pearson
Education Limited, second edition, 2001.
[2] Cramer, C. J. Essentials of Computational Chemistry: Theories and Models,
John Wiley & Sons Inc., Hoboken, NJ, second edition, 2004.
[3] Schlick, T. Molecular Modeling and Simulation: An Interdisciplinary Guide,
Springer: New York, second edition, 2002.
[4] Born, V. M.; Oppenheimer, R. ―Zur quantentheorie der molekeln." [On the
Quantum Theory of Molecules]. Annalen der Physik 1927, 389, 457–484.
[5] Cramer, C. J. Potential Energy Surfaces, in Essentials of Computational
Chemistry: Theories and Models, John Wiley & Sons Inc., Hoboken, NJ, second
edition, 2004; pp 6-10.
[6] Szabo, A.; Ostlund, N. S. The Hartree-Fock Approximation, in Modern
Quantum Chemistry: Introduction to Advanced Electronic Structure Theory.
Mineola, New York: Dover Publishing: 1996: pp 108-230
[7] Møller, C.; Plesset, M. S. "Note on an Approximation Treatment for Many-
Electron Systems." Phys. Rev. 1934, 46, 618–622.
[8] Head-Gordon, M.; Pople, J. A.; Frisch, M. J. "MP2 energy evaluation by direct
methods." Chem. Phys. Lett. 1988, 153, 503–506.
[9] Cramer, C. J. Perturbation Theory, in Essentials of Computational Chemistry:
Theories and Models. John Wiley & Sons Inc., Hoboken, NJ, second edition,
2004; pp 216-224.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
24
[10] Ayers, P. W.; Yang, W. Density Functional Theory, in P. Bultinck, H. de
Winter, W. Langenaeker, J.P. Tollenaere (Eds.) Computational Medicinal
Chemistry for Drug Discovery, Dekker, New York, 2003, pp. 571-616.
[11] Hinchliffe, A. Density Functional Theory, in Modelling Molecular Structures,
John Wiley & Sons Ltd, Baffins Lane, Chichester, West Sussex, second edition,
2001; pp 218-229.
[12] Hohenberg, P.; Kohn, W. "Inhomogeneous Electron Gas." Phys. Rev. 1964,136,
B864-B871.
[13] Kohn, W.; Sham, L. J. "Self-Consistent Equations Including Exchange and
Correlation Effects." Phys. Rev. 1965, 140, A1133- A1138.
[14] Leach, A. R. Empirical Force Field Models: Molecular Mechanics, in Molecular
Modelling: Principles and Applications, Pearson Education Limited, second
edition, 2001; pp 165-252.
[15] Cramer, C. J. Molecular Mechanics, in Essentials of Computational Chemistry:
Theories and Models, John Wiley & Sons Inc., Hoboken, NJ, second edition,
2004; pp 17-68.
[16] Lennard-Jones, J. E. "On the Determination of Molecular Fields" Proc. R. Soc.
Lond. A 1924, 106, 463–477
[17] Wang, J.; Wolf, R. M.; Caldwell, J. W.; Kollman, P. A.; Case, D. A.
―Development and testing of a general amber force field.‖ J. Comput. Chem.
2005, 26, 114.
[18] Einstein, A.; Stern, O. Einige ―Argumente fur die Annahme einer molekularen
Agitation beim absoluten Nullpunkt.‖ Ann. Phys. 1913, 345, 551-560.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
25
[19] Schlick, T. Molecular Dynamics: Basics, in Molecular Modeling and
Simulation: An Interdisciplinary Guide, Springer: New York, second edition,
2002; pp 425-461.
[20] Jarzynski, C. ―Nonequilibrium equality for free energy differences.‖ Phys. Rev.
Lett. 1997, 78, 2690-2693.
[21] Isralewitz, B.; Baudry, J.; Gullingsrud, J.; Kosztin, D.; Schulten. K. ―Steered
molecular dynamics investigations of protein function.‖ J. Mol. Graph. Model.
2001, 19, 13–25.
[22] Isralewitz, B.; Gao, M.; Schulten, K. ―Steered molecular dynamics and
mechanical functions of proteins.‖ Curr. Opin. Struct. Biol. 2001, 11, 224–230.
[23] Davis, M. E.; McCammon, J. A. ―Electrostatics in biomolecular structure and
dynamics.‖ Chem. Rev. 1990, 90, 509–521.
[24] Gilson, M. K.; Rashin, A.; Fine, R.; Honig, B. ―On the calculation of
electrostatic interactions in proteins.‖ J. Mol. Biol. 1985, 184, 503–516.
[25] You, T. J.; Bashford, D. ―Conformation and hydrogen ion titration of proteins: a
continuum electrostatic model with conformational flexibility.‖ Biophys. J.
1995, 69, 1721–1733.
[26] Gilson, M. K.; Honig, B. H. ―Electrostatic fields in the active sites of
lysozymes.‖ Nature 1987, 330, 84–86.
[27] Li, H.; Robertson, A. D.; Jensen, J. H. "Very Fast Empirical Prediction and
Interpretation of Protein pKa Values." Proteins 2005, 61, 704–721.
[28] Simonson, T.; Carlsson, J.; Case, A. D. ―Proton binding to proteins: pK(a)
calculations with explicit and implicit solvent models.‖ J. Am. Chem. Soc. 2004,
126, 4167–4180.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
26
[29] Li, H.; Robertson, A. D.; Jensen, J. H. ―The determinants of carboxyl pKa values
in turkey ovomucoid third domain.‖ Proteins 2004, 55, 689-704.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
27
Chapter 2
2 Drug Release by pH-Responsive Molecular Tweezers:
Atomistic Details from Molecular Modelling
2.1 Introduction
Dynamic molecular tweezers [1] [2] are two-armed molecules that switch from
the closed (arms together) to open (arms apart) form when stimulated by an external
trigger like light or a change in ion concentration.[3] These nanomechanical devices can
hold guest molecules between their arms using non-covalent interactions and then release
the molecules when triggered by environmental cues. This leads to a host of applications
related to molecular recognition and sensing.[4] Our specific interest is in studying
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
28
whether molecular tweezers that are sensitive to hydronium ion concentration (pH) can
be used as vehicles for drug delivery.
Drug targeting research is motivated by the observation that if a drug could be
carried to the specific pathogen or type of cell that is implicated in a disease, then the
effective dose of the drug could be reduced, alleviating side effects.[5] This is especially
important when the drug is toxic, as is typically the case in chemotherapy.
Supramolecular systems with chemical labels have been studied as drug carriers, and
some success has been achieved bringing drugs to specific parts of the human body.
However, the drug release at the desired site still represents a challenge.[6] The concept
is simple: a stimulus from the drug carrier’s environment will trigger a change in the
carrier’s molecular structure, releasing a drug. For example, a molecular tweezer with an
acid-sensitive linkage will cleave in a mildly acidic environment like an endosome or a
tumor.[7] [8] Unfortunately these systems are often either too labile or too stable to offer
adequate control over the release rate of the substrate.[9] There are indications that
supramolecular frameworks that bind and release drugs using noncovalent interactions
will have reasonable affinities and response properties.[4] [10] The molecular tweezers
we study were designed based on this principle.
Anne Petitjean’s group recently designed a pH-responsive molecular tweezer that
can release an aromatic substrate when it switches from the closed (U-shaped) to the open
(W-shaped) conformation.[11] The change in conformation was confirmed by NMR
experiments and emission fluorescence suggests that in the closed U-shape conformation,
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
29
the two naphthalene arms of the tweezer can bind an aromatic substrate using
noncovalent π-π stacking interactions. When a change in pH triggers a change to the open
W-shape conformation, the two arms are far from each other and the substrate can be
released into solution.
In this study, we use quantum mechanical (QM) methods along with molecular
dynamics (MD) simulations to investigate the switching ability of this tweezer. Quantum
mechanics provides efficient tools to understand the intramolecular interactions within
the tweezer and the intermolecular interaction between the substrate and the tweezer. MD
simulations provide more insight into the mechanism and dynamics of molecular
switching and the substrate release process.
A few previous studies of molecular tweezers have been reported, mostly using
quantum chemistry methods or molecular mechanics approaches with continuum
solvation.[12] - [22] The results of these studies indicate that dispersion-corrected
density functionals are required to obtain sensible results because the host-guest binding
interaction in most molecular tweezer complexes, including those considered here, is
usually dispersion and, specifically, π-π stacking.[16] - [22] Several of the more recent
studies indicate that the B97D functional,[23] [24] which we will use in this work, gives
good results,[20] [22] comparable to the most modern -D3 dispersion models.[25] We
will also use molecular dynamics simulations with both continuum solvation and explicit
solvation; to our knowledge explicit solvent molecules have only been considered in one
previous calculation on molecular tweezers.[18]
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
30
2.2 Computational Details
2.2.1 Quantum Mechanics Calculations on the Switching Unit
All quantum chemistry calculations were performed using the Gaussian 09
program, [26] using Density Functional Theory (DFT).[27] [28] For the different
conformers of the switching unit, the methoxyphenylpyridinemethoxyphenyl triad, the
B3LYP functional [29] was used with the 6-31G(d) standard basis set.[30] In
experiment, conformational changes in the tweezer are usually deduced from NMR
spectra. We simulated NMR chemical shifts using the functional B97-2 [31] with the 6-
311++G(2d,p) basis set, using geometries obtained by the same method.
2.2.2 Quantum Chemistry Calculations of the Binding Energy in the Gas Phase
For the tweezer including the naphthalene arms, dispersion interactions have to be
included, so we optimized the geometry using the B97D functional with the
TZVP/TZVPFIT basis set.[32] The counterpoise correction was used to reduce the basis
set superposition error.[33] The binding energy was computed using the formula
Eb = Etwz-sub – [Etwz + Esub] , (2.1)
where Eb is the energy of binding, Etwz-sub is the energy of the system with the substrate
inside the tweezer, Etwz is the energy of the tweezer alone, and Esub is the energy of the
substrate alone.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
31
2.2.3 Molecular Dynamics (MD) Simulations With Implicit Solvent
All molecular mechanics calculations were performed using the Amber10
software.[34] [35] Atomic charges for the molecular mechanics force field were
calculated at the HF/6-31G(d) level of theory using Restrained Electrostatic Potential
(RESP) fitting.[36]
Minimizations were carried out by the Amber module Sander to relax the system
prior to MD simulations. The QM optimized geometry is initially optimized by 250 steps
with the steepest descent algorithm, followed by a 250 steps of minimization by the
conjugate gradient algorithm with no restraints. No cutoff was used and periodic
boundary conditions were turned off.
Molecular dynamics simulations were performed under isothermal/isobaric (NPT)
conditions with the Parameterized Generalized Amber Force Field (GAFF) on the
minimized geometries.[37] The Langevin thermostat was used to maintain the
temperature at 300 K. The generalized Born implicit solvation model was used.
2.2.4 Steered Molecular Dynamics Simulations With Explicit Solvent
We used the steered molecular dynamics technique to calculate the free energy of
binding, ΔG, using Jarzynski relationship.[37] [38] The free energy difference between
the two equilibrium states is related to the work W done on the system by the inequality
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
32
ΔG ≤ W (2.2)
And the binding free energy is calculated from the relation
ΔG = <W> - σw2
/ 2kBT, (2.3)
where σw2
is the variance of work, kB is Boltzmann’s constant, and T is the absolute
temperature.
2.3 Results and Discussions
2.3.1 Relative stability of the switching unit conformers
We started our study by optimizing the geometry of the switching unit (see
Figure 2.1) at the B3LYP/6-31G(d,p) level of theory. When the pyridine derivative is
deprotonated, the closed U-shaped conformer of the tweezer is 5.6 kcal/mol more stable
than the fully open W-shaped conformer. Conversely, when the nitrogen atom is
protonated, the U-shaped conformation becomes 4.9 kcal/mol less stable than the W-
shaped conformer. Converting from the U-conformer to the W-conformer requires
crossing a modest barrier of 2.8 kcal/mol, indicating that after the nitrogen atom is
protonated in the U-conformer, the W-conformer is thermally accessible.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
33
Figure 2.1: Different conformers of the protonated and deprotonated switching unit
ortho-py.
The experimental evidence for the conformational rearrangement of the tweezer is
based on NMR studies of the switching unit. The calculated and observed 13
C chemical
shifts are reported in Table 2.1 for the ortho derivative of the U-conformer. The
agreement between the computed and measured data supports the experimental
assignment of the closed U-conformer to the deprotonated tweezer. Table 2.2 shows the
calculated and the observed difference in 1H chemical shifts between the deprotonated
(U-shaped) and protonated (W-shaped) units. Only one hydrogen atom has the opposite
order, but the difference is within the expected error of the DFT method.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
34
Table 2.1: Calculated and experimental 13
C chemical shifts for the switching unit ortho-
py in the U shape. Calculations were done at the B972/6-311++G(2d,p) level of theory.
13C chemical shifts (ppm)
Atoma Calc. Exp.
b Error
CH3 53.69 56.48 -2.79
C3Py 124.00 123.95 0.05
Co 135.05 132.34 2.71
C4Py 135.25 136.02 -0.77
C-O 158.69 157.92 0.77
Cm2 121.06 121.89 -0.83
Cm1 108.64 112.19 -3.55
Cp 130.40 130.51 -0.11
C-C 132.13 130.39 1.74
C-C 158.21 156.24 1.97
a See Figure 2.1 for atom labels
bExperimental data were obtained from Ref. [11]
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
35
Table 2.2: Calculated and experimental 1H chemical shift changes upon the protonation
of the switching unit ortho-py. The deprotonated unit is in the U shape and the protonated
one is in the W shape. Calculations were done at the B972/6-311++G(2d,p) level of
theory.
Change in 1H chemical shifts (ppm)
Atoma Calc. Exp.
b
o-H3py -0.03 0.5
o-H4py 0.79 0.8
o-Hp 0.61 0.3
a See Figure 2.1 for atom labels
bExperimental data were obtained from Ref. [11]
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
36
2.3.2 Relative Stability of Tweezer Conformations In the Gas Phase
Further evidence for the U-shaped conformation of the deprotonated tweezer
comes from the experimental X-ray structure for the full tweezer.[11] Starting from the
X-ray structure, we optimized the tweezer’s geometry at the B97D/TZVP/TZVPFIT level
of theory. The optimized geometry was close to the starting X-ray structure. As expected,
the U-conformer was preferred. Key stabilizing interactions in the U-conformer are (1)
the hydrogen bonds between the nitrogen atom of the pyridine ring and the hydrogen
atoms of the anisole (methoxybenzene) rings, (2) hydrogen bonds between the oxygen
atoms of the anisole rings and the facing hydrogens of the pyridine ring, and (3) the π-π
stacking interaction between the two naphthalene rings that form the arms of the tweezer.
In the W conformation, the π-π stacking interaction and all hydrogen bonds are lost.
Moreover, the electrostatic repulsion resulting from the partial negative charges on the
nitrogen and the two anisole oxygen atoms, which are towards each other in the W
conformer, make the W shape unfavorable for the non-protonated tweezer.
Since we did not have a reference X-ray structure for the protonated tweezer, we
performed a two-dimensional potential energy surface (PES) scan over the dihedral
angles that connected the pyridine to the two anisole moieties; we then optimized the
stable conformers identified from the PES scan, obtaining the structures in Figure 2.2.
Optimization of the minima revealed that the UW-conformer is most stable, with the U-
conformer .8 kcal/mol higher in energy. Surprisingly, the least-stable conformer was the
W-conformer, which was 1 kcal/mol above the UW conformer. These energies are within
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
37
the error of DFT methods, but the results seem plausible when the structures are studied.
The W-conformer is unstable because it brings the two anisole oxygen atoms into close
proximity. The U conformer features a steric and electrostatic clash between the proton
on the pyridine ring and hydrogen atoms on the anisole ring, which forces one of the arms
of the tweezer to twist. The UW features a hydrogen bond to stabilize the pyridinium ion.
Figure 2.2: Different conformers of the protonated tweezer optimized at the
B97D/TZVP/TZVPFIT level of theory.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
38
To see if the fully-open and the half-open conformations of the protonated
tweezer have a different chemical shifts, we calculated the 13
C chemical shifts for both
the W- and UW-conformers. Results are shown in Table 2.3. The values are similar
which makes the observed shifts inconclusive for distinguishing between the two
conformations. In Table 2.4 the 1H chemical shifts for both conformations are close for
most of the atoms and support our results. In the experimental study, the loss of mutual
shielding of naphthalene rings was interpreted as switching to the W-shape. However,
this behaviour is also expected from the UW-shape, since upon rotation of a single arm of
the tweezer, atoms of both arms will be deshielded. Based on our calculations, we believe
that the experimental spectrum was misassigned, and that the most stable conformer of
the protonated switching unit and of the protonated tweezer is not the open W-form, but
the half-open UW-form.
Figure 2.3: The half-open UW-conformer of the protonated tweezer.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
39
Table 2.3: Calculated 13
C chemical shifts for the fully-open (W-shaped) and the half-open
(UW-shaped) conformations of the protonated tweezer. Shifts were calculated at the
B97D/6-311++G(2d,p) level of theory. Geometries were obtained at
B97D/TZVP/TZVPFIT. For the UW-conformer, we show shifts for the arm that did not
switch.
13
C chemical shifts (ppm)
Atoma W UW Diff.
4py 139.51 141.16 1.65
3py 119.52 122.24 2.72
o 127.93 128.59 0.66
m 108.28 112.03 3.75
p 138.88 136.33 2.56
n1 125.12 127.03 1.92
n2 124.62 126.00 1.38
n3 129.40 128.72 0.68
n4 128.47 129.00 0.53
n5 125.75 124.81 0.94
n6 125.29 125.32 0.03
n7 122.36 122.62 0.25
a See Figure 2.3 for atom labels.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
40
Table 2.4: Calculated 1H chemical shifts for the fully-open (W-shaped) and the half-open
(UW-shaped) conformations of the protonated tweezer. Shifts were calculated at the
B97D/6-311++G(2d,p) level of theory. Geometries were obtained at
B97D/TZVP/TZVPFIT. For the UW-conformer, we show shifts for the arm that did not
switch.
1H chemical shifts (ppm)
Atoma W UW Diff.
4py 8.16 8.15 0.01
3py 7.97 8.42 0.45
o 7.99 7.92 0.07
m 7.09 7.02 0.07
p 7.97 7.83 0.14
n1 7.49 7.58 0.10
n2 7.68 7.67 0.01
n3 8.11 8.05 0.06
n4 8.01 8.08 0.06
n5 7.66 7.76 0.10
n6 7.64 7.80 0.15
n7 7.90 7.86 0.03
a See Figure 2.3 for atom labels.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
41
2.3.3 Gas-Phase Binding Energy
Our calculations support the experimentalists’ assertion that this molecular
tweezer is flexible and pH-sensitive. These calculations do not assess, however, whether
the tweezer can bind a drug molecule. To assess this, we considered the interaction
energy between the tweezer and quinizarin in the gas phase. Quinizarin is an aromatic
molecule (it will have favorable π-stacking interactions with the naphthalene rings) that is
an anticancer drug. It was used in the experimental study.[11] The gas-phase binding
energy of quinizarin was -22.0 kcal/mol for the protonated tweezer and -19.8 kcal/mol for
the deprotonated tweezer.
2.3.4 Stable Conformations of the Tweezer in Implicit Water Solvent
To make a final assessment of whether the tweezer can work for drug delivery, we
need to perform calculations in solution. We did this by parameterizing the Amber force
field for the tweezer, as described in the computational methodology section. The gas-
phase binding energy of quinizarin to the deprotonated tweezer, computed with the force
field, was -22.9 kcal/mol and the geometry was similar to the one computed using DFT.
Further support for the parameterization arose from the fact that the U-conformer of the
deprotonated tweezer was 3.5 kcal/mol more stable than the W-conformer, which is also
consistent with the trend from the DFT calculations, where the U-conformer is more
stable by 8.5 kcal/mol.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
42
After validating the force field model, we performed calculations on the U- and
W-conformations of the deprotonated tweezer with implicit (generalized Born) water
solvation. Within 50 ns, the W-conformer converted to the U-conformer. The simulation
that started with the U-conformer was stable, with the U-conformer persisting until the
end of the simulation. We also performed simulations on the protonated tweezer, also
starting from the U-conformer and W-conformer. Within 50 ns, both the U- and W-
conformers converted to the UW-conformer that our gas-phase calculations had predicted
to be most stable. This is in contradiction to the NMR data from the experimental study,
which was interpreted as supporting the W-conformer. As explained in the previous
subsection, we are skeptical of that spectral assignment, and believe the half-open
conformation is more stable. Our data does strongly support the experimental assertion
that the (de)protonation of the pyridine ring induces a conformation shift in the tweezer.
2.3.5 Drug Release from the Tweezer
To test the ability of the tweezer to release the substrate, an MD simulation in
explicit solvent was performed on the U-conformer of the protonated tweezer with the
drug molecule inside. Figure 2.4 shows that after 35ns, the tweezer has switched to the
UW-conformer and the substrate is exposed to the solvent. The substrate remains close to
the tweezer, however, because the hydrophobic quinizarin molecule prefers to be near the
hydrophobic naphthalene arms, rather than completely exposed to solvent. This
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
43
arrangement is favorable because separation of the hydrophobic molecules in water
disrupts the hydrogen bonding between the polar molecules.[39] [40]
Figure 2.4: Two snapshots from molecular dynamics simulations using explicit solvent
model for the protonated tweezer with the drug molecule inside. (a) The starting
geometry for the U shaped conformer of the protonated tweezer with the quinizarin
molecule between the two arms. (b) A snapshot after 35 ns shows the drug molecule
released into the solution and the tweezer switched to the UW-conformation.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
44
Finally, we studied the dissociation energy of the drug molecule from the
protonated tweezer, over a timescale of 5 ns, using steered molecular dynamics in explicit
water. Representative starting and ending structures for the simulations are given in
Figure 2.5. The mean amount of work required to pull the drug molecule out of the
tweezer was 9.5 kcal/mol. The simulation was repeated 12 times and the average of the
work was taken. The binding free energy was calculated according to equation (2.3) as
-8.3 kcal/mol. As can be seen from Figure 2.5, a large part of this work is associated with
pulling the quinizarin molecule away from the hydrophobic tweezer arms and into the
aqueous environment. The difference in solvation free energy for quinizarin in aqueous
environment and a hydrophobic environment (modeled as n-hexane to mimic the
hydrophobic part of a lipid membrane) is 3.8 kcal/mol; the difference in solvation free
energy between water and benzene is 5 kcal/mol. This suggests that if quinizarin were
released from the tweezer into a hydrophobic environment, the binding free energy would
reduce to around 4 kcal/mol, which is clearly thermodynamically feasible.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
45
Figure 2.5: Two snapshots from steered molecular dynamics simulations using explicit
solvent model for the protonated tweezer with the drug molecule inside. (a) The starting
geometry for the U shaped conformer of the protonated tweezer with the quinizarin
molecule between the two arms. (b) A snapshot after 5 ns shows the drug molecule
released into the solution.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
46
2.4 Conclusions
We have performed a systematic computational study of a new pH-responsive
molecular tweezer of potential use for delivering hydrophobic chemotherapy drugs to
tumors. Our results support experimental evidence that the tweezer has potential as a drug
carrier, and provide a new insight: the conformation of the protonated tweezer, which is
the conformation that would release drug molecules in the acidic environment of a tumor
or endosome, is the half-open UW-conformation rather than the fully-open W-
conformation. The main reason for this preference is probably the unfavorable
electrostatic clashes between two anisole oxygen atoms in the W conformation. The UW-
conformation is fully capable of releasing a drug molecule, however, as is apparent from
the molecular-dynamics simulations of the noncovalent complex between the tweezer and
the cancer drug quinizarin.
Our calculations support several key features of the tweezer: (1) it is flexible with
small barriers to conformational changes; (2) the preferred conformation is dependent on
the protonation state of the pyridine ring that occupies the hinge of the tweezer; (3) the
tweezer is able to hold an aromatic substrate between its two arms with noncovalent
interactions. Molecular dynamics simulations allowed us to observe the dynamics of the
pH-induced conformational switching and the drug-release process.
The binding free energy from steered molecular dynamics was -8.3 kcal/mol,
which is higher than expected. This is probably due to the reluctance of the hydrophobic
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
47
drug molecule to relinquish its contact with the hydrophobic arm of the tweezer. The
binding energy in a physiological environment might be much less because there are
hydrophobic environments in the cell in which the quinizarin molecule would be much
more stable; in such environments the binding energy of the tweezer is probably closer to
4 kcal/mol. It is also worth noting that the barrier to conformational change, the pH at
which the tweezer opens, and the drug-binding energy can all be tuned by adding electron
withdrawing/donating substituents to the tweezer. The methodical computational study
we have performed here provides a protocol by which new, substituted, tweezer
molecules could be computationally tested for desirable properties, providing guidance
and insight that can reduce the number of tweezers that need to be synthesized and
experimentally tested.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
48
2.5 References
[1] Steed, J. W.; Awood, J. L. Supramolecular chemistry, Wiley, Chichester, 2000.
[2] (a) Chen, C. W.; Whitlock, H. W. J. J. Am. Chem. Soc. 1978, 100, 4921–
4922. (b) Zimmerman, S. C. Top. Curr. Chem. 1993, 165, 71–102. (c) Klärner,
F. G.; Kahlert, B. Acc. Chem. Res. 2003, 36, 919–932. (d) Harmata, M. Acc.
Chem. Res. 2004, 37, 862–873.
[3] (a) Petitjean, A.; Khoury, R. G.; Kyritsakas, N.; Lehn, J.-M. J. Am. Chem. Soc.
2004, 126, 6637–6647. (b) Landge, S. M.; Aprahamian, I. J. Am. Chem. Soc.
2009, 131, 18269–18271. (c) Skibinski, M.; Gόmez, R.; Lork, E.; Azov, V.
Tetrahedron. 2009, 65, 10348–10354. (d) Legouin, B.; Uriac, P.; Tomasi, S.;
Toupet, L.; Bondon, A.; van de Weghe, P. Org. Lett. 2009, 11, 745–748.
[4] (a) Müller, B. K.; Reuter, A.; Simmel, F. C.; Lamb, D. C. Nano Lett. 2006, 6,
2814–2820. (b) Petitjean, A.; Lehn, J.-M. Inorg. Chim. Acta. 2007, 360, 849–
856. (c) Anslyn, E. V. J. Org. Chem. 2007, 72, 687–699. (d) Phillips, M. D.;
Fyles, T. M.; Barwell, N. P.; James, T. D. Chem. Commun. 2009, 43, 6557–
6559. (e) Leblond, J.; Petitjean, A. Chemphyschem. 2011, 12, 1043-1051.
[5] Rajendran, L.; Knolker, H.-J.; Simons, K. Nat. Rev. Drug Discovery 2010, 9,
29–42.
[6] Duncan, R. Nat. Rev. Cancer 2006, 6, 688–701.
[7] Guo, X.; Szoka, F. C. Acc. Chem. Res. 2003, 36, 335–341.
[8] (a) Ulbrich, K.; Subr, V. Adv. Drug Delivery Rev. 2004, 56, 1023–1050. (b)
Oh, K. T.; Yin, H.; Lee, E. S.; Bae, Y. H. J. Mater. Chem. 2007, 17, 3987–
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
49
4001. (c) Yessine, M.-A.; Leroux, J.-C. Adv. Drug Delivery Rev. 2004, 56,
999–1021.
[9] (a) Gullotti, E.; Yeo, Y. Mol. Pharmaceut. 2009, 6, 1041–1051. (b)
Drummond, D. C.; Daleke, D. L. Chem. Phys. Lipids 1995, 75, 27–41. (c)
Walker, G. F.; Fella, C.; Pelisek, J.; Fahrmeir, J.; Boeckle, S.; Ogris, M.;
Wagner, E. Mol. Ther. 2005, 11, 418–425.
[10] (a) Schneider, H.-J. Angew. Chem., Int. Ed. 2009, 48, 3924–3977. (b) Liu, F.;
Urban, M. W. Prog. Polym. Sci. 2010, 35, 3–23.
[11] Leblond, J.; Gao, H.; Petitjean, A.; Leroux, J.-C. J. Am. Chem. Soc. 2010, 132,
8544–8545.
[12] Klarner, F. G.; Panitzky, J.; Preda, D.; Scott, L.T. J. Mol. Model. 2000, 6, 318-
327.
[13] Brown, S. P.; Schaller, T.; Seelbach, U. P.; Koziol, F.; Ochsenfeld, C.; Klarner,
F. G.; Spiess, H. W. Angewandte Chemie-International Edition 2001, 40, 717-
720.
[14] Klarner, F.G.; Kahlert, B. Acc. Chem. Res. 2003, 36, 919-932.
[15] Chang, C. E.; Gilson, M. K. J. Am. Chem. Soc. 2004, 126, 13156-13164.
[16] Parac, M.; Etinski, M.; Peric, M.; Grimme, S. J. Chem. Theory Comp. 2005, 1,
1110-1118.
[17] Parchansky, V.; Matejka, P.; Dolensky, B.; Havlik, M.; Bour, P. J. Mol. Struct.
2009, 934, 117-122.
[18] Kessler, J.; Jakubek, M.; Dolensky, B.; Bour, P. J. Comput. Chem. 2012, 33,
2310-2317.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
50
[19] Risthaus, T.; Grimme, S. J. Chem. Theory Comp. 2013, 9, 1580-1591.
[20] Graton, J.; Le Questel, J. Y. ; Legouin, B.; Uriac, P.; van de Weghe, P.;
Jacquemin, D. Chem. Phys. Lett. 2012, 522, 11-16.
[21] Grimme, S. Chemistry-a European Journal 2012, 18, 9955-9964.
[22] Graton, J.; Legouin, B.; Besseau, F.; Uriac, P.; Le Questel, J. Y.; van de
Weghe, P.; Jacquemin, D. J. Phys. Chem. C 2012, 116, 23067-23074.
[23] Becke, A. D. J. Chem. Phys. 1997, 107, 8554-8560.
[24] Grimme, S. J. Comp. Chem. 2006, 27, 1787-1799.
[25] Grimme, S.; Antony, J.; Ehrlich, S.; Krieg, H. J. Chem. Phys. 2010, 132,
154104.
[26] Frisch, M. J.; Trucks , G. W.; Schlegel, H. B. et al, Gaussian 09,
Revision C.01, Gaussian, Inc., Wallingford CT, 2009.
[27] Cohen, A. J.; Mori-Sanchez, P.; Yang, W. T. Chem. Rev. 2012, 112, 289-320.
[28] Ayers, P. W.; Yang, W. Density Functional Theory, in: Bultinck, P.; de Winter,
H.; Langenaeker, W.; Tollenaere, J.P. (Eds.) Computational Medicinal
Chemistry for Drug Discovery, Dekker, New York, 2003, pp. 571-616.
[29] (a) Becke, A. D. J. Chem. Phys. 1993, 98, 5648-5652. (b) Lee, C.; Yang, W.;
Parr, R. G. Phys. Rev. B 1988, 37, 785-789. (c) Vosko, S. H.; Wilk, L.; Nusair,
M. Can. J. Phys. 1980, 58, 1200-1211. (d) Stephens, P. J.; Devlin, F. J.;
Chabalowski, C. F.; Frisch, M. J. J. Phys. Chem. 1994, 98, 11623-11627.
[30] (a) Ditchfield, R.; Hehre, W. J.; Pople, J. A. J. Chem. Phys. 1971, 54, 724. (b)
Hehre, W. J.; Ditchfield, R.; Pople, J. A. J. Chem. Phys. 1972, 56, 2257. (c)
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
51
Hariharan, P. C.; Pople, J. A. Theor. Chem. Acc. 1973, 28, 213. (d) Frisch, M.
J.; Pople, J. A.; Binkley, J. S. J. Chem. Phys. 1984, 80, 3265.
[31] Wilson, P. J.; Bradley, T. J.; Tozer, D. J. J. Chem. Phys. 2001, 115, 9233-9242.
[32] (a) Schaefer, A.; Horn, H.; Ahlrichs, R. J. Chem. Phys. 1992, 97, 2571-2577.
(b) Schaefer, A.; Huber, C.; Ahlrichs, R. J. Chem. Phys. 1994, 100, 5829-5835.
(c) Eichkorn, K.; Treutler, O.; Ohm, H.; Haser, M.; Ahlrichs, R. Chem. Phys.
Lett., 1995, 240, 283-289. (d) Eichkorn, K.; Weigend, F.; Treutler, O.;
Ahlrichs, R. Theor. Chem. Acc. 1997, 97, 119-124.
[33] (a) Boys, S. F.; Bernardi, F. Mol. Phys. 1970, 19, 553. (b) Simon, S.; Duran,
M.; Dannenberg, J. J. J. Chem. Phys. 1996, 105, 11024-11031.
[34] Case, D. A.; Darden, T. A.; Cheatham, III, T. E. et al. AMBER 10, University
of California, San Francisco, 2008.
[35] (a) Pearlman, D. A.; Case, D. A.; Caldwell, J. W.; Ross, W. S.; Cheatham, III,
T. E.; DeBolt, S.; Ferguson, D.; Seibel, G.; Kollman, P. Phys. Commun. 1995,
91, 1-41. (b) Case, D. A.; Cheatham, T.; Darden, T.; Gohlke, H.; Luo, R.;
Merz, K. M., Jr., Onufriev, A.; Simmerling, C.; Wang, B.; Woods, R. J.
Computat. Chem. 2005, 26, 1668-168.
[36] Bayly, C. I.; Cieplak, P.; Cornell, W.; Kollman, P. A. J. Phys. Chem. 1993, 97,
10269-10280.
[37] Wang, J; Wolf, R. M.; Caldwell, J. W.; Kollman, P. A.; Case, D. A. J. Comput.
Chem. 2004, 25, 1157-1174.
[38] Jarzynski, C. Phys. Rev. Lett. 1997, 78, 2690.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
52
[39] Pratt, L. R.; Chandler, D. J. Chem. Phys. 1980, 73, 3434-3441.
[40] Chandler, D. Nature 2005, 437, 640-647.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
53
Chapter 3
3 Failures of Embedded Cluster Models for pKa Shifts
Dominated by Electrostatic Effects
3.1 Introduction
Accurate prediction of protein residues’ pKa values, which determine the
protonation state of ionizable residues at a given pH, is very important since these
residues are involved in intraprotein, protein-solvent, and protein-ligand interactions.
Consequently, the protonation state of a protein makes significant contributions to protein
stability, solubility, folding, binding ability and catalytic activity.[1] [2] Therefore it is
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
54
impossible to build a computational model for a protein, or even to obtain a qualitative
understanding of protein structure and function, without first determining the protonation
state of the protein. While the protonation state can be determined empirically, in
principle, experimental measurements (mostly using NMR) are difficult for large
proteins, buried residues, and active-site residues. This has motivated the development of
computational methods to predict pKa values for protein residues.[1] [3] - [16]
Unfortunately, these methods are often unreliable for buried residues.
The most commonly used methods for protein pKa prediction are based either on
the numerical solution of Linearized Poisson-Boltzmann equation (LPBE) or on fast
empirical methods built from quantitative structure-property relationships (QSPR). (The
main challenge for developing empirical methods is that the specific properties that
determine protein pKas are still not clear.[17] [18] ) For computational expediency,
methods based on both of these approaches usually model the protein with a molecular
mechanical force field, immersed in a dielectric continuum. The dielectric constant of the
solvent is set to 80, and for the protein a value between four and twenty is used. The shift
in pKa for the residue is calculated by comparing the electrostatic energy of its protonated
and non-protonated forms. Then the shift is added to a model pKa value. The structure of
the protein is assumed to not undergo any change upon (de)protonation, which is often a
poor assumption, especially for buried residues.[19] [20] Methods based on these
traditional approaches often fail to predict the pKa of buried residues and those which
have large pKa shift.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
55
Typical molecular mechanics models for electrostatic energies are limited in
accuracy by the assumption that atoms of a certain type always have the same
charge.[1] [21] - [23] This unrealistic simplification of electrostatic interactions can be
overcome using flexible-charge molecular mechanics models [24] - [30] or, ideal,
quantum mechanics (QM).[31] However, due to its computational cost, QM methods
cannot be applied to the whole protein. The cluster approach—wherein the ionizable
residue and its immediate environment are taken out of the protein and treated quantum
mechanically—hurdles this obstacle. Since it was introduced two decades ago, the cluster
approach has been successfully used to model enzyme reactions and the number of
applications for it is growing.[32] - [34]
The cluster approach has been used by Li and coworkers to determine the pKa of
protein residues.[35] In their approach, the part of the protein that includes the ionizable
residue and its immediate environment is treated by QM, while the rest of the protein is
neglected. The solvent is described by polarized continuum solvation model. The method
successfully predicted the pKa of five residues in the turkey ovomucoid third domain
(OMTKY3). The approach was also applied successfully to predict pKa values for small
organic molecules [36] - [40] and protein residues near metal atoms. However all the
residues were on surface of the enzyme, where the interactions are dominated by
hydrogen bonding.
We wondered whether the cluster method would work when electrostatic and
empirical pKa prediction methods fail. To this end, we have tested the cluster method for
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
56
two aspartate residues—one of which is buried in the protein interior, and one of which is
solvent-exposed—in which the pKa of the carboxyl group is strongly affected by nearby
charged residue(s).
3.2 Computational Details
3.2.1 pKa calculations
We followed the method developed by Li and coworkers to calculate the pKa for a
protein residue.[24] Using the reaction,
2 5 2 5HA C H COO A C H COOHG
aq aq aq aq , (3.1)
the pKa value of a carboxyl group in the protein, HA, can be determined from the
equation
apK 4.87 1.36 ,G (3.2)
where 4.87 is the experimental pKa value for propionic acid at 298 K and 1.36 is 2.303RT
at T = 298 K. ∆G is the change in the standard free energy of reaction (3.1) in kcal/mol.
To compare the accuracy of the quantum mechanical cluster model to more conventional
approaches, we used two continuum electrostatic methods, the web-based version of
KARLSBERG+ [41] and the MCCE code.[42] - [44] To represent the empirical
methods, we used the popular PROPKA program.[45]
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
57
3.2.2 Free energy calculations
All quantum mechanics calculations were performed with the Gaussian 09
program.[46] Solvation free energies calculations were performed with the Gaussian 03
program.[47]
The free energy of a molecule is given by
ele sol .G E G (3.3)
The electronic energies, Eele, were computed at the MP2/6-31+G(2d,p) level, using
structures that were optimized at RHF/6-31+G(d) level of theory. The solvation free
energies, Gsol, were calculated by the polarizable continuum model at the IEF-
PCM/RHF/6-31+G(d) level of theory for the same geometries, using the keywords
RADII=UAHF, RET=100, TSNUM=240, and SCFVAC. To account for conformational
flexibility in the cluster, all possible conformers of the acid were considered and the final
free energy was computed by Boltzmann weighting all conformers within 1 kcal/mol of
the most stable conformer. I.e.,
conformers
ln exp .iG RT G RT
(3.4)
3.2.3 Protein model construction
For each model, the coordinates for the atoms were taken from the PDB files
downloaded from the PDB database. The files were stripped of metal atoms, cofactors,
inhibitors, and solvent molecules. Atomic coordinates for the model for Asp75 was taken
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
58
from pdb code: 1a2p bacterial barnase.[48] Atomic coordinates for the model for Asp21
were taken from pdb code: 1beo, which is fungal beta-cryptogein.[49] Hydrogen atoms
were added using the web-based program molprobity software.[50] Where the residue
was truncated, additional hydrogen atoms were added manually to satisfy the valence.
The protein backbone was fixed and all the hydrogen atoms were optimized at HF/3-
21G(d) level of theory; then the side chain (CH2COO– or CH2COOH) was optimized.
One oxygen atom of the carboxyl group was fixed. The positions of the neighboring
protons of OH groups were also minimized at the same level of theory.
3.3 Results
3.3.1 Aspartate 75
Figure 3.1 shows the cluster model for Asp75. The model includes the immediate
chemical environment of the ionizable residue. Three conformers were found for the
protonated form. There are two hydrogen bonds between two hydrogens of Arg83 and the
two oxygens of the carboxyl group of Asp75. A pKa of -4.63 pH units was obtained by
the method developed by Li and coworkers, which is much lower than the experimental
value (error = -7.4.) Negative results were also obtained by PROPKA and
KARLSBERG+, see Table 3.1. Only the MCCE code gave a reasonable prediction, 3.6.
One possible explanation is that these methods overestimate the stabilization from the
charge-charge interaction between the carboxylate group of Asp75 and two nearby
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
59
positively-charged amides: Arg83 and Arg87. These interactions favor the ionized (non-
protonated) form, leading to a lower pKa value. Another possible explanation, suggested
by Li and coworkers, is that the position of the Arg83 in the crystal structure is different
from its position in solution. To address this possibility, the geometries of Arg83 were
optimized at the same level, and the pKa was recalculated. This refined calculation did
not improve the results. The solvation free energy was calculated for different values of
dielectric constants, 4 and 20, but this did not improve the results either.
Figure 3.1: Model compound for Asp75 of bacterial barnase. The model contains the
ionizable group and its immediate chemical environment
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
60
Table 3.1: Comparison between experimental and several pKa prediction methods.
PDB Code Residue Exp. pKa Current study PROPKAa MCCE
a KARLSBERG+
1a2p Asp75 3.1 -4.6 -1.3 3.6 -8.2
1beo Asp21 2.5 -3.7 1.4 3.6 0.3
a Results were obtained from Ref. [64]
3.3.2 Aspartate 21
The model for Asp21 is represented in Figure 3.2. The pKa shift is primarily due
to electrostatic interactions between the carboxyl group and the positively-charged amide
group of Lys62. A pKa of -3.72 pH units was obtained by the method developed by Li
and coworkers, see Table 3.1. Lower pKa values were obtained by the rest of the methods
(except for MCCE), which suggests that the explanation for these poor results is similar
to the explanation we deduced for Asp75: nearby positively-charged residues cause these
methods to overstabilize the ionized form of Asp21.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
61
Figure 3.2: Model compound for Asp21of fungal beta-cryptogein. The model contains the
ionizable group and its immediate chemical environment
3.4 Discussion
The calculated pKa values for both aspartate residues we considered were far
lower than the experimentally measured values for most of the methods. We attribute this
to overestimation of Coulombic interactions between nearby positively-charged residues
and the carboxyl group, causing the models to overestimate the stabilization of
carboxylate anion. Not even the trends are right: the experimental pKa value of Asp75 is
higher than that of Asp21, but the methods predict the opposite order. This is consistent
with our hypothesis: Asp75 has two nearby positively-charged Arg residues, while
Asp21 has only one nearby positively-charged Lys residue.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
62
Asp75 is a buried residue. Prediction of pKa for buried residues is extremely
challenging, partly because desolvation is a key determinant of pKa. The hydrophobic
environment stabilizes the neutral form of Asp75, which increase the pKa of the residue.
Desolvation effects were found to raise the pKa shifts by 4-6 pH units.[51] - [54] A study
by Laurents et al. reported that present-day continuum models underestimate the
contribution of desolvation effect.[55] The inability of continuum solvation models to
adequately capture desolvation effects helps explain why Asp75 is erroneously predicted
to be more acidic than Asp21 by the computational approaches we consider.
Although Asp21 is a surface residue, the cluster method still underestimates its
pKa value. Again, the likely culprit is the implicit solvation model, the polarized
continuum model, which probably fails to effectively screen the charges, giving too large
a shift in pKa values. This observation is consistent with the well-known, but typically
ignored, observation that the dielectric screening—a concept from macroscopic
electrostatics—is an inappropriate model for short-range electrostatic interactions in the
protein environment.[56] - [61] Also, it is known that pKa shifts are affected by slight
changes in protein structure, including those induced by crystal-packing
forces.[60] [62] [63] Especially for buried residues, minimization of the protein using a
method like QM/MM is necessary to obtain the right structure.
Our results indicate that one needs to develop better continuum models for
describing electrostatic interactions in proteins. In addition, explicit water molecules need
to be included, as also suggested by a benchmark study.[64] The response of the
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
63
neighboring protein residues, due to protein flexibility, to (de)protonation also needs to
be included. There may also be other important effects. Indeed, the failure of most of
these methods can be attributed to our lack of a deep understanding of what properties of
proteins affect the pKa and the absence of quantitative models for how these properties
affect the pKa.
3.5 Conclusions
The goal of this work was to use a cluster model, similar to those that are widely
used to model enzymatic chemical reactions, to predict the pKa values of amino acid
residues in proteins. Results from the cluster model were compared to those from more
traditional (and less computationally demanding approaches) including those based on
the numerical solution of the Poisson-Boltzmann equation (MCCE and KARLSBERG+)
and those based on empirical fitting (PROPKA). We tested the approaches on two
aspartate residues whose pKa-shift is induced by nearby positively-charged residues. One
of the residues, Asp75, was buried. The other, Asp21, was solvent-exposed.
The cluster approach predicted pKa values that were far too low. This contradicts
earlier findings from Li and coworkers, where the cluster model gave excellent results.
We attribute the inadequacy of the cluster model for our cases to the overestimation of
the stabilization effect due to nearby positively charged residues and the underestimation
of the hydrophobic effect by the continuum solvation model. It is also possible that
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
64
(presumably slight) differences in the structure of the protein in solution, compared to the
crystal structure, accounts for part of the discrepancy. Finally, the change in protein
geometry upon (de)protonation may be important.
One obvious solution is to increase the size of the cluster model and (ideally)
introduce explicit solvent molecules. This is not practical because the number of
conformers grows exponentially as the size of the cluster model increases, causing a
proportionate increase in computational cost. Approaches based on QM/MM method,
where the protein environment is modeled with a molecular mechanics force field, are
more promising. Such models would provide a better description of the hydrophobic
environment of the ionizable residue, but the computational cost is much higher.
QM/MM approaches are not yet routinely practical, but they could be used for difficult
residues in important proteins.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
65
3.6 References
[1] Warshel, A.; Sharma, P. K.; Kato, M.; Parson, W. W. Biochimica Et
Biophysica Acta-Proteins and Proteomics 2006, 1764, 1647-1676.
[2] Schlick, T. Molecular modeling and simulation: An interdisciplinary guide,
Springer, New York, 2002.
[3] Burger, S. K.; Ayers, P. W. J. Comput. Chem. 2011, 32, 2140-2148.
[4] Burger, S. K.; Ayers, P. W. Proteins 2011, 79, 2044-2052.
[5] Fogolari, F.; Brigo, A.; Molinari, H. J. Mol. Recognit. 2002, 15, 377-392.
[6] Bashford, D.; Karplus, M. Biochemistry 1990, 29, 10219–10225.
[7] Yang, A. S.; Gunner, M. R.; Sampogna, R.; Sharp, K; Honig, B. Proteins
1993, 15, 252–265.
[8] Antosiewicz, J.; McCammon, J. A.; Gilson, M. K. J. Mol. Biol. 1994, 238,
415–436.
[9] Antosiewicz, J.; Briggs, J. M.; Elcock, A. H.; Gilson, M. K.; McCammon, J.
A. J. Comput. Chem. 1996, 17, 1633–1644.
[10] Sham, Y. Y.; Chu, Z. T.; Warshel, A. J. Phys. Chem. B 1997, 101, 4458–
4472.
[11] Delbuono, G. S.; Figueirido, F. E.; Levy, R. M. Proteins 1994, 20, 85–97.
[12] Warshel, A.; Sussman, F.; King, G. Biochemistry 1986, 25, 8368–8372.
[13] Kollman, P. Chem. Rev. 1993, 93, 2395–2417.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
66
[14] Mehler, E. L.; Guarnieri, F. A. Biophys. J. 1999, 77, 3–22.
[15] Sandberg, L.; Edholm, O. Proteins 1999, 36, 474–483.
[16] Matthew J. B.; Gurd, F. R. N. Methods Enzymol. 1986, 130, 413–436.
[17] Forsyth W. R.; Antosiewiez, J. M.; Robertson, A. D. Proteins 2002, 48,
388–403.
[18] Edgcomb, S. P.; Murphy, K. P. Proteins 2002, 49, 1–6.
[19] Swails, J. M.; Roitberg, A. E. J. Chem. Theory Comp. 2012, 8, 4393-4404.
[20] Meng, Y. L.; Roitberg, A. E. J. Chem. Theory Comp. 2010, 6, 1401-1412.
[21] Ji, C. G.; Mei, Y.; Zhang, J. Z. H. Biophys. J. 2008, 95, 1080-1088.
[22] Tong, Y.; Ji, C. G.; Mei, Y.; Zhang, J. Z. J. Am. Chem. Soc. 2009, 131,
8636-8641.
[23] Lee, L. P.; Cole, D. J.; Skylaris, C.-K.; Jorgensen, W. L.; Payne, M. C. J.
Chem. Theory Comp. 2013, 9, 2981-2991.
[24] Mortier, W. J. 1987, 66, 125-143.
[25] Mortier, W. J.; Ghosh, S. K.; Shankar, S. J. Am. Chem. Soc. 1986, 108,
4315-4320.
[26] Mortier, W. J.; Vangenechten, K.; Gasteiger, J. J. Am. Chem. Soc. 1985,
107, 829-835.
[27] Chelli, R.; Procacci, P.; Righini, R.; Califano, S. J. Chem. Phys. 1999, 111,
8569-8575.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
67
[28] Nistor, R. A.; Polihronov, J. G.; Muser, M. H.; Mosey, N. J. J. Chem. Phys.
2006, 125.
[29] Verstraelen, T.; Ayers, P. W.; Van Speybroeck, V.; Waroquier, M. J. Chem.
Phys. 2013, 138.
[30] van Duin, A. C. T.; Dasgupta, S.; Lorant, F.; Goddard, W. A. J. Phys.
Chem. A 2001, 105, 9396-9409.
[31] Helgaker, T.; Jørgensen, P.; Olsen, J. Modern electronic structure theory,
Wiley, Chichester, 2000.
[32] Siegbahn, P. E. M.; Himo, F. J. Biol. Inorg. Chem. 2009, 14, 643-651.
[33] Siegbahn, P. E. M.; Borowski, T. Acc. Chem. Res. 2006, 39, 729-738.
[34] Siegbahn, P. E. M.; Blomberg, M. R. A. Chem. Rev. 2000, 100, 421-437.
[35] Li, H.; Robertson, A. D.; Jensen, J. H. Proteins 2004, 55, 689-704.
[36] Lim, C.; Bashford, D.; Karplus, M. J. Phys. Chem. 1991, 95, 5610–5620.
[37] Chen, J. L.; Noodleman, L.; Case, D. A.; Bashford, D. J. Phys. Chem. 1994,
98, 11059–11068.
[38] Richardson, W. H.; Peng, C.; Bashford, D.; Noodleman, L.; Case, D. A. Int.
J. Quantum Chem. 1997, 61, 207–217.
[39] Topol, I. A.; Tawa, G. J.; Burt, S. K.; Rashin, A. A. J. Phys. Chem. A 1997,
101, 10075–10081.
[40] Li. H.; Hains, A. W.; Everts, J. E.; Robertson, A. D.; Jensen, J. H. J. Phys.
Chem. B 2002, 106, 3486–3494.
[41] Kieseritzky, G.; Knapp, E. W. Proteins 2008, 71, 1335-1348.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
68
[42] Song, Y.; Mao, J.; Gunner, M. R. J. Comp. Chem. 2009, 30, 2231-2247.
[43] Georgescu, R. E.; Alexov, E. G.; Gunner, M. R. Biophys. J. 2002, 83, 1731-
1748
[44] Alexov, E.; Gunner, M. R. Biophys. J. 1997, 74, 2075-2093.
[45] Li, H.; Robertson, A. D.; Jensen, J. H. Proteins, 2005, 61, 704-721.
[46] Frisch, M. J.; Trucks, G. W.; Schlegel, H. B. et al, Gaussian 09, Revision
C.01, Gaussian, Inc., Wallingford CT, 2009.
[47] Frisch, M. J.; Trucks, G. W.; Schlegel, H. B. et al, Gaussian 03, Revision
D.01, Gaussian, Inc., Wallingford CT, 2004.
[48] Oliveberg, M.; Arcus, V. L.; Fersht, A. R. Biochemistry 1995, 34, 9424–
9433.
[49] Gooley, P. R.; Keniry, M. A.; Dimitrov, R. A.; Marsh, D. E.; Gayler, K. R.;
Grant, B. R. J. Biol. NMR 1998, 12, 523–534.
[50] Davis, I. W. et al. Nucleic Acids Research 2007, 35, W375-W383.
[51] Mehler E. L.; Fuxreiter, M.; Garcia-Moreno, B. Biophys. J. 2002, 82, 1748.
[52] Mehler, E. L.; Fuxreiter, M.; Simon, I.; Garcia-Moreno, E. B. Proteins 2002,
48, 283–292.
[53] Dwyer, J. J.; Gittis, A. G.; Karp, D. A.; Lattman, E. E.; Spencer, D. S.;
Stites, W. E.; Garcia-Moreno. B. Biophys. J. 2000, 79, 1610–1620.
[54] Lambeir, A. M.; Backmann, J.; Ruiz-Sanz, J.; Filimonov, V.; Nielsen, J. E.;
Kursula, I.; Norledge, B. V.; Wierenga, R. K. Eur. J. Biochem. 2000, 267,
2516–2524.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
69
[55] Laurents D. V.; Huyghues-Despointes, B. M. P.; Bruix, M; Thurlkill, R. L.;
Schell, D.; Newsom, S.; Grimsley, G. R.; Shaw, K. L.; Trevino, S.; Rico,
M.; Briggs, J. M., Antosiewicz, J. M.; Scholtz, J. M.; Pace, C. N. J. Mol.
Biol. 2003, 325, 1077–1092.
[56] Berkowitz, M. L.; Bostick, D. L.; Pandit, S. Chem. Rev. 2006, 106, 1527-
1539.
[57] Mehler, E. L.; Guarnieri, F. Biophys. J. 1999, 77, 3–22.
[58] Sandberg, L.; Edholm, O. A. Proteins 1999, 36, 474–483.
[59] Matthew, J. B.; Gurd, F. R. N. Methods Enzymol. 1986, 130, 413–436.
[60] Nielsen, J. E.; Vriend, G. Proteins 2001, 43, 403–412.
[61] Schutz, C. N.; Warshel, A. Proteins 2001, 44, 400–417.
[62] Nielsen J. E.; McCammon, J. A. Protein Sci. 2003, 12, 313–326.
[63] Georgescu, R. E.; Alexov, E. G.; Gunner, M. R. Biophys. J. 2002, 83, 1731–
1748.
[64] Stanton, C. L.; Houk, K. N. J. Chem. Theory Comp. 2008, 4, 951-966
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
70
Chapter 4
4 Conclusions
This thesis develops and tests computational models for protonation and
deprotonation reactions in biological and supramolecular systems.
Chapter 2 of the thesis presents a systematic study of a supramolecular structure
based on a molecular tweezer that has been proposed as a drug carrier. The goal of our
study was to elucidate the dynamics of interaction within the tweezer and between the
tweezer and substrate.
Our study supports the experimental evidence that the molecule is pH-responsive.
We showed that the molecule is flexible with a small barrier to rotation. We also showed
that the shape of the molecule depends on the protonation state of its pyridine ring. Upon
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
71
protonation the molecule switches from the closed (U-shaped) to the open (UW-shaped)
conformation. Quantum mechanics calculations showed that the molecule is capable of
holding an aromatic substrate between its two arms through noncovalent π-π stacking
interactions. Molecular dynamics simulations allowed us to observe at atomistic details
the dynamics of shape-switching and substrate-releasing processes. The binding free
energy of quinizarin, an anticancer drug, to the tweezer was calculated by steered
molecular dynamics. The binding free energy is moderate, demonstrating that the
molecule has a potential as a drug carrier to tumoral tissues, which tend to have lower pH
than healthy cells. The flexibility of the molecule, the response to pH change, and the
binding free energy are all tunable by introducing electron withdrawing/donating groups
to the tweezer. The computational approaches we established for this molecule can be
used to assess the effects of adding substituents to the tweezer, thereby streamlining the
drug development process.
Chapter 3 studies the cluster model approach for predicting protein residues pKa
values. Predicting the pKa values for buried and active site amino acid residues in
proteins is important since these residues are involved in intraprotein, protein-solvent,
and protein-ligand interactions. Therefore, protein conformation and stability depends on
its protonation state. Computational models for the protein pKa’s are especially important
for buried amino acids, as experimental data for pKa values of buried protein residues is
scarce.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
72
The two residues that were chosen for this study are particularly challenging, and
traditional methods, based on numerical solution of the Poisson-Boltzmann equation and
empirical fitting, fail. We treated the residue and its immediate chemical environment
using quantum mechanics, while the rest of the protein was replaced by a continuum
dielectric. This method had been previously applied successfully to predict pKa values of
surface residues of turkey ovomucoid third domain. However, for the residues we
studied, the cluster model predicts pKa values much lower than those observed
experimentally. The inadequacy of the cluster model can be attributed to the
overestimation of the stabilizing effect of electrostatic interactions and the
underestimation of the hydrophobic effect by the continuum solvation model.
The study highlights the importance of protein flexibility in the simulation.
Including the bulk of the protein is important for good results, which is possible through
the hybrid QM/MM methods, since increasing the size of the cluster is computationally
too demanding. The determinants of pKa shifts in protein need more investigation. The
study also, along with other studies, indicates the need for developing better (de)solvation
models.
In addition to the work reported in this Master’s thesis, I have worked to
calculate nonlinear optical properties of molecules using the finite field (FF) method.
These properties are important for the development of optical devices, studying long
range interactions, and characterizing solvation and the interactions between molecules
and ionic reagents.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
73
FF methods are straightforward, easy-to-implement techniques for calculating
electric response properties. They are cheaper than other methods (e.g., response theory
and the sum over states method). However, one of the major drawbacks of FF methods is
the dependence of results on the chosen field strength. In particular, choosing a field
strength that is too small gives inaccurate results due to round-off errors; choosing a field
strength that is too large gives inaccurate results because higher-order responses become
dominant.
We showed that for a large data set of 120 molecules, for the second
hyperpolarizability, choosing a field sequence smaller than two gives better results than
using a value of 2, as universally used in the literature. We also showed that Richardson
extrapolation is superior to polynomial fitting for refining the calculated response
properties and only two steps of refinements were necessary to reach the maximum
precision. The results of this study have already published in the Journal of
Computational Chemistry.
An open question so far is, how an optimal field strength should be chosen for a
particular molecule. We developed a simple protocol to predict the optimal field strength
to calculate the second hyperpolarizabilities. The protocol depends only on the maximum
nuclear distance within the molecule and was applied successfully for a wide range of
molecules.
M.Sc. Thesis - Ahmed Mohammed McMaster University - Department of Chemistry
74
Our findings are currently being extended to include lower derivatives, i.e. dipole
polarizabilities and the first hyperpolarizabilities. We will also extend the approach to
different levels of theory.
In summary, this thesis focused on studying protonation reactions of biochemical
relevance, both in terms of drug delivery (chapter 2) and protein structure and function
(chapter 3). Chapter 3 points out limitations in the cluster model for protein pKa’s (i.e.,
don’t use that method for buried residues) and Chapter 2 elucidates the binding
mechanism of a pH-sensitive molecular tweezer.