ACTA UNIVERSITATIS UPSALIENSIS UPPSALA 2020 Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1885 Computational Modeling of the Structure, Function and Dynamics of Biomolecular Systems YASHRAJ KULKARNI ISSN 1651-6214 ISBN 978-91-513-0828-9 urn:nbn:se:uu:diva-398169




Dissertation presented at Uppsala University to be publicly examined in B21, BMC, Husargatan 3, Uppsala, Wednesday, 5 February 2020 at 13:15 for the degree of Doctor of Philosophy. The examination will be conducted in English. Faculty examiner: Prof. Dr. Markus Meuwly (University of Basel, Department of Chemistry).

Abstract

Kulkarni, Y. 2020. Computational Modeling of the Structure, Function and Dynamics of Biomolecular Systems. Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1885. 72 pp. Uppsala: Acta Universitatis Upsaliensis. ISBN 978-91-513-0828-9.

Proteins are a structurally diverse and functionally versatile class of biomolecules. They perform a variety of life-sustaining biological processes with utmost efficiency. A profound understanding of protein function requires knowledge of its structure. Experimentally determined protein structures can serve as a starting point for computer simulations in order to study their dynamic behavior at a molecular level. In this thesis, computational methods have been used to understand structure-function relationships in two classes of proteins: intrinsically disordered proteins (IDP) and enzymes.

Misfolding and subsequent aggregation of the amyloid beta (Aβ) peptide, an IDP, is associated with the progression of Alzheimer’s disease. Besides enriching our understanding of structural dynamics, computational studies on a medically relevant IDP such as Aβ can potentially guide therapeutic development. In the present work, binding interactions of the monomeric form of this peptide with biologically relevant molecular species such as divalent metal ions (Zn²⁺, Cu²⁺, Mn²⁺) and amphiphilic surfactants were characterized using long timescale molecular dynamics (MD) simulations. Among the metal ions, while Zn²⁺ and Cu²⁺ maintained coordination to a well-defined binding site in Aβ, Mn²⁺-binding was observed to be comparatively weak and transient. Surfactants with charged headgroups displayed strong binding interaction with Aβ. Complemented by biophysical experiments, these studies provided a multifaceted perspective of Aβ interactions with the partner molecules.

Triosephosphate isomerase (TIM), a highly evolved and catalytically proficient enzyme, was studied using empirical valence bond (EVB) calculations to obtain deeper insights into the catalytic reaction mechanism. Multiple structural features of TIM such as the flexible loop and preorganized active site residues were investigated for their role in enzyme catalysis. The effect of substrate binding was also studied using truncated substrates. Finally, using enhanced sampling methods, dynamic behavior of the catalytically important loop 6 was characterized. The importance of structural stability and flexibility on protein function was illustrated by the work presented in this thesis, thus furthering our scientific understanding of proteins at a molecular level.

Keywords: Molecular Dynamics, Empirical Valence Bond, Enzyme Catalysis, Amyloid Beta, Aβ, Triosephosphate Isomerase, TIM, Computational Biochemistry, Computational Enzymology

Yashraj Kulkarni, Department of Chemistry - BMC, Biochemistry, Box 576, Uppsala University, SE-75123 Uppsala, Sweden.

© Yashraj Kulkarni 2020

ISSN 1651-6214
ISBN 978-91-513-0828-9
urn:nbn:se:uu:diva-398169 (http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-398169)


Dedicated to my dear Ajja.


List of Papers

This thesis is based on the following papers, which are referred to in the text by their Roman numerals.

I Wallin, C., Kulkarni, Y. S., Abelein, A., Jarvet, J., Liao, Q., Strodel, B., Olsson, L., Luo, J., Abrahams, J. P., Sholts, S. B., Roos, P. M., Kamerlin, S. C. L., Gräslund, A., Wärmländer, S. K. T. S. (2016) Characterization of Mn(II) ion binding to the amyloid-beta peptide in Alzheimer’s disease. J. Trace Elem. Med. Biol., 38:183-193

II Österlund, N., Kulkarni, Y. S., Misiaszek, A. D., Wallin, C., Krüger, D. M., Liao, Q., Rad, F. M., Jarvet, J., Strodel, B., Wärmländer, S. K. T. S., Ilag, L. L., Kamerlin, S. C. L., Gräslund, A. (2018) Amyloid-β Peptide Interactions with Amphiphilic Surfactants: Electrostatic and Hydrophobic Effects. ACS Chem. Neurosci., 9(7):1680-1692

III Kulkarni, Y. S., Liao, Q., Petrović, D., Krüger, D. M., Strodel, B., Amyes, T. L., Richard, J. P., Kamerlin, S. C. L. (2017) Enzyme Architecture: Modeling the Operation of a Hydrophobic Clamp in Catalysis by Triosephosphate Isomerase. J. Am. Chem. Soc., 139(30):10514-10525

IV Kulkarni, Y. S., Amyes, T. L., Richard, J. P., Kamerlin, S. C. L. (2019) Uncovering the Role of Key Active-Site Side Chains in Catalysis: An Extended Brønsted Relationship for Substrate Deprotonation Catalyzed by Wild-Type and Variants of Triosephosphate Isomerase. J. Am. Chem. Soc., 141(40):16139-16150

V Kulkarni, Y. S., Liao, Q., Byléhn, F., Amyes, T. L., Richard, J. P., Kamerlin, S. C. L. (2018) Role of Ligand-Driven Conformational Changes in Enzyme Catalysis: Modeling the Reactivity of the Catalytic Cage of Triosephosphate Isomerase. J. Am. Chem. Soc., 140(11):3854-3857


VI Liao, Q., Kulkarni, Y., Sengupta, U., Petrović, D., Mulholland, A. J., van der Kamp, M. W., Strodel, B., Kamerlin, S. C. L. (2018) Loop Motion in Triosephosphate Isomerase Is Not a Simple Open and Shut Case. J. Am. Chem. Soc., 140(46):15889-15903

Reprints were made with permission from the respective publishers.


Additional Publications

VII Parkash, V., Kulkarni, Y., ter Beek, J., Shcherbakova, P. V., Kamerlin, S. C. L., Johansson, E. (2019) Structural consequence of the most frequently recurring cancer-associated substitution in DNA polymerase ε. Nat. Commun., 10:373

VIII Kulkarni, Y., Kamerlin, S. C. L. (2019) Computational physical organic chemistry using the empirical valence bond approach. Adv. Phys. Org. Chem., 53:69

IX Amrein, B., Steffen-Munsberg, F., Szeler, I., Purg, M., Kulkarni, Y., Kamerlin, S. C. L. (2017) CADEE: Computer Aided Directed Evolution of Enzymes. IUCrJ, 4:50


Contribution report

Contributions to the articles that are part of this thesis are listed here.

Paper I: Worked on the computational aspect of the study. Performed part of the simulations and analysis.

Paper II: Worked on the computational aspect of the study. Performed part of the simulations and analysis.

Paper III: Performed force field parameterization of the system, and part of the simulations and analysis.

Paper IV: Performed all of the simulations and analysis.

Paper V: Designed the simulations. Performed part of the simulations and analysis.

Paper VI: Performed the EVB simulations and relevant analysis.


Contents

1. Introduction

2. Structure-Function Relationships in Proteins
    2.1 Intrinsically Disordered Proteins
        2.1.1 Biological Function of IDPs
        2.1.2 Involvement of IDPs in Diseases
        2.1.3 Amyloid β Peptide
    2.2 Enzymes
        2.2.1 Enzyme Kinetics
        2.2.2 Enzyme Catalysis
        2.2.3 Structural and Dynamic Effects on Enzyme Function
        2.2.4 Triosephosphate Isomerase

3. Computational Approaches
    3.1 Quantum Mechanics
        3.1.1 Ab initio methods
        3.1.2 Density Functional Theory
    3.2 Molecular Mechanics
        3.2.1 Enhanced Sampling Methods
        3.2.2 Calculation of Free Energies
    3.3 Multiscale Modeling
        3.3.1 The QM/MM Approach
        3.3.2 The Empirical Valence Bond Approach

4. Conformational Dynamics of Aβ peptide
    4.1 Binding Interactions with Metal Ions (Paper I)
    4.2 Binding Interactions with Surfactants (Paper II)
    4.3 Perspectives

5. Computational Modeling of Enzyme Catalysis
    5.1 Modeling the Hydrophobic Clamp of TIM (Paper III)
    5.2 Role of Key Active Site Residues in TIM (Paper IV)
    5.3 Effect of Dianion Binding on Catalysis (Paper V)
    5.4 Dynamics of Loop 6 in TIM (Paper VI)
    5.5 Perspectives

6. Conclusions and Future Perspectives

7. Sammanfattning på svenska

Acknowledgments

References


Abbreviations

Aβ Amyloid-β Peptide

AD Alzheimer’s Disease

BE-METAD Biased Exchange Metadynamics

DDM Dodecyl β–D-maltoside

DHAP Dihydroxyacetone Phosphate

EM Electron Microscopy

EVB Empirical Valence Bond

FEP Free Energy Perturbation

GA Glycolaldehyde

GAP D-glyceraldehyde 3-phosphate

HREX Hamiltonian Replica Exchange

IDP Intrinsically Disordered Protein

LFER Linear Free Energy Relationships

MD Molecular Dynamics

MM Molecular Mechanics

NMR Nuclear Magnetic Resonance

PDB Protein Data Bank

QM Quantum Mechanics

SDS Sodium Dodecyl Sulfate

TIM Triosephosphate Isomerase

TST Transition State Theory

US Umbrella Sampling


1. Introduction

Biomolecules form the chemical basis of all life processes. Evolution has gradually driven the organization and specialization of these molecules, rendering them self-sustainable in the process. Biomolecules are mainly classified as carbohydrates, lipids, nucleic acids, and proteins, and can exist in simple (e.g., monomeric), or complex (e.g., oligomeric, polymeric) forms. They are also observed to display a trend in which simple building blocks compound to form large complex structures that perform distinct functional roles.¹

Proteins are a structurally diverse and functionally versatile class of biomolecules. As enzymes, they catalyze biochemical reactions; as structural proteins, they render structural integrity to cell membranes by providing support to lipid molecules; as motility proteins, they facilitate cellular or organismal movement; and as membrane proteins, they function as receptors or signal transducers, and also as ion-transport channels.¹ The list is not exhaustive, as proteins are involved in many more vital life processes. The diversity in protein function is achieved mainly by virtue of the broad range of three-dimensional structures formed by polypeptides. Each of the twenty naturally occurring amino acids, which function as monomeric units of polypeptides, has distinct physical and chemical properties that allow proteins to fold into specific structures.² Research efforts addressing the question of protein folding have employed experimental and computational methods³–⁵ to study the features of a protein sequence that direct it to fold into its preferred or “correct” structure. A related and equally important question is how the structural features of a protein contribute to its efficient functioning. Studying protein structures at atomic resolution is essential for understanding structure-function relationships of proteins. X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, and cryo-electron microscopy (cryo-EM) are some of the most popular techniques used today to generate three-dimensional structural models of biomolecules, ranging from short peptides to viral particles. These models have provided valuable insights into biomolecular structure and function.


X-ray crystallography is a widely used technique for determining high-resolution structures of proteins. This method involves the exposure of protein crystals to incident X-ray beams, resulting in diffraction patterns. The diffraction data are then used to calculate the electron density, and subsequently determine atomic coordinates.⁶ Structures of enzymes with or without bound ligands can be obtained using this method. NMR spectroscopy is used to obtain an ensemble of structures from a solution of isotope-labeled peptides. This technique is generally used for determining structures of short polypeptides or fragments of large protein complexes.⁷ Cryo-EM, on the other hand, which uses protein samples prepared at cryogenic temperatures, is generally used for structure determination of larger protein complexes such as multimeric structures.⁸

Structural models provide a static snapshot of a protein at an atomic resolution. Computer simulations, which model the dynamics of a protein structure using physics-based theories, have been of great help in providing deeper insights into protein function using structural models.⁹,¹⁰ These methods, when used in conjunction with experimental methods, provide a holistic view of the system under study, thereby allowing the determination of the role of structural rigidity and flexibility in protein function.

The work presented in this thesis deals with: (A) observing the structural variations in proteins caused by changes in their external environment or inherent characteristics, and (B) studying the effect of these structural and dynamic changes on the desired function of the proteins. Chapter 2 contains a description of the diversity in protein structure and function in two types of proteins: intrinsically disordered proteins (IDPs) and enzymes. While IDPs are known to change their structural forms at different physicochemical conditions, enzymes have a distinct three-dimensional structure that might undergo localized conformational changes as part of its catalytic function. The common factor between both these systems is that structural perturbations in a protein can affect its function. The amyloid beta (Aβ) peptide, an IDP, and triosephosphate isomerase (TIM), a highly catalytically proficient enzyme, have been chosen as model systems for the investigations.

Chapter 3 contains a detailed description and discussion of the various computational methods used in my work. Chapters 4 and 5 provide a detailed summary of my research on the Aβ peptide and TIM respectively, where I have used long timescale molecular dynamics (MD) simulations to study conformational dynamics of the Aβ peptide, and empirical valence bond (EVB) calculations to model enzyme catalysis in TIM. Chapter 6 summarizes key takeaways from the research showcased in this thesis. It also contains my reflections on the scientific learnings from the studies, and comments on the future prospects of these areas of research.


2. Structure-Function Relationships in Proteins

Proteins are an indispensable chemical component of every biological process. As mentioned before, the structural and functional versatility of proteins is borne out of the chemical diversity of their constituent amino acids. Hence, knowing the structure of a protein is key to determining its function. This paradigm¹¹ had its foundations laid when the first three-dimensional structure of myoglobin was solved by Kendrew et al. in 1958.¹² Ever since, there has been a tremendous growth in the number of protein structures generated and deposited in the Protein Data Bank (PDB).¹³ This wealth of structural data has been instrumental in enhancing our understanding of functional aspects of proteins such as enzymes or cell surface receptors. Structure-based drug discovery has enabled high-throughput screening of drug candidates,¹⁴ and has resulted in a number of effective drugs targeting fatal diseases such as HIV,¹⁵–¹⁷ tuberculosis,¹⁸ and cancer.¹⁹ Significant advances have been made in the field of enzymology too, where clarity on the mechanistic aspects of enzyme-catalyzed reactions could be achieved as a result of structural analysis of the catalytic active site.²⁰–²³

On the other end of the spectrum, recent evidence resulting largely from studies of disordered proteins or disordered regions of folded proteins has challenged this paradigm,²⁴,²⁵ attributing functional capability and diversity of proteins to disorder in their structures.²⁶ Peptides involved in signaling pathways are intrinsically disordered, allowing them to function in multiple pathways.²⁷,²⁸ Flexible loops in highly ordered globular proteins such as enzymes are known to play a major role in their catalytic function.²⁸–³⁰

We notice that both order and disorder in a protein structure are crucial for its desired function. This acts as a motivation to perform detailed studies on not just the static structure of a protein, but also on its dynamic behavior. Computational approaches to simulate structural dynamics and study its effect on protein function thus become instrumental in furthering our understanding and lending clarity to the structure-function paradigm.

In my effort to use computational approaches to better our understanding of protein structural dynamics and its effect on protein function, I have chosen two contrasting classes of proteins, an intrinsically disordered protein (IDP) and an enzyme, as systems of investigation. In the first part of this chapter, I will discuss the salient features associated with IDPs and their implications for living organisms, and give a brief account of their structural properties. The second part of this chapter will feature a detailed overview of enzyme catalysis and its key concepts that are relevant to the research work presented in the thesis.

2.1 Intrinsically Disordered Proteins

Proteins that do not exist in a single, well-defined equilibrium structure but instead as dynamic conformational ensembles are regarded as intrinsically disordered proteins (IDPs).³¹ These proteins are said to be disordered as they do not have a definite tertiary structure and attain multiple secondary structural forms, resulting in a dearth of structural order.³² This disorderliness in an IDP, though, is defined by the sequence and composition of its constituent amino acids, and hence the word intrinsic is used to refer to this innate characteristic of the protein.²⁵,²⁸

IDPs can extend their primary amino acid sequence into a random coil, but can also collapse to form a molten globule.³³–³⁵ The protein sequences of IDPs and structured/ordered proteins have marked differences in their amino acid composition, aromaticity, hydrophobicity, and charge, to name a few. Noticeably, IDPs lack bulky hydrophobic and aromatic residues that are crucial in forming and stabilizing the hydrophobic cores of most folded globular proteins. On the other hand, they are rich in polar amino acids and also structure-breaking amino acids such as glycine and proline.³²,³⁴,³⁶,³⁷ A combination of these two factors grants these proteins their conformational instability, thereby making them functionally versatile.


2.1.1 Biological Function of IDPs

Owing to their structural plasticity, IDPs and intrinsically disordered regions of folded proteins possess the ability to carry out multiple biological functions. More specifically, they are involved in various signaling and regulatory pathways, via interactions with other proteins, nucleic acids, and ligands. Some notable examples of their functions include regulation of transcription and translation, protein phosphorylation, ribosome assembly, and chaperone-assisted protein folding.²⁸,³²

IDPs also undergo folding upon binding to a suitable partner, in a process called coupled folding and binding. The entropic cost to fold a disordered protein is paid by the binding enthalpy generated in the process. This phenomenon is utilized by a number of protein–nucleic-acid and protein-protein interactions that are part of transcription and translation. Complexes resulting from such interactions tend to have high specificity and low affinity, enabling targeted association during process initiation and easier dissociation during process termination.²⁸,³²
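The energetic argument behind coupled folding and binding can be written out with the standard Gibbs relation. The block below is only an illustrative sketch: the split of the binding entropy into a conformational term and a remainder, and the symbols ΔS_conf, ΔS_other and c°, are assumptions introduced here for clarity and are not taken from the cited studies.

```latex
\begin{align}
  % Gibbs free energy of binding at temperature T
  \Delta G_{\mathrm{bind}} &= \Delta H_{\mathrm{bind}} - T\,\Delta S_{\mathrm{bind}}, \\
  % assumed decomposition: folding of the disordered chain lowers its conformational entropy
  \Delta S_{\mathrm{bind}} &= \Delta S_{\mathrm{conf}} + \Delta S_{\mathrm{other}},
  \qquad \Delta S_{\mathrm{conf}} < 0, \\
  % relation to the dissociation constant (standard-state concentration c^\circ)
  K_{\mathrm{d}} &= c^{\circ}\,\exp\!\left(\frac{\Delta G^{\circ}_{\mathrm{bind}}}{RT}\right).
\end{align}
```

Read this way, the unfavorable term −TΔS_conf offsets part of a favorable ΔH_bind, so ΔG_bind is less negative and K_d is larger than it would be for a pre-folded partner making the same contacts. This is one way to rationalize how an extensive, specific interface can coexist with comparatively low affinity.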

2.1.2 Involvement of IDPs in Diseases

Proper protein function is key to maintenance of life, and protein dysfunction can result in deleterious health conditions. While structural variability confers functional versatility in IDPs, it also holds the potential to trigger protein misfolding, which can further snowball into protein aggregation resulting in the formation of fibrils. Misfolding can be caused by both intrinsic (e.g. point mutations) and extrinsic factors (e.g. interactions with other proteins or small molecules), acting independently or in complex association with one another.³²

Protein misfolding and aggregation is associated with the pathogenesis of some of the most prominent neurodegenerative diseases such as Alzheimer’s disease (AD), Parkinson’s disease (PD), Huntington’s disease, and prion diseases.³⁸ The mechanism usually involves proteins converting from their soluble, functional states into highly ordered, filamentous protein aggregates. These aggregates, known as amyloid fibrils, tend to accumulate in various organs and tissues.³² The defining structural feature of a fibril is a core cross-β-sheet structure formed by β-strands of individual protein units running perpendicular to the fibril’s axis (Figure 1).³²,³⁹


Figure 1. An example structure of the cross-β-sheet core of an amyloid fibril shown in views (A) perpendicular to the fibril’s axis, and (B) along the fibril’s axis. (C) shows a side view of the structure illustrating the stacking of individual β-sheet structures over one another. The structure shown here is of a protofilament formed by a fragment of the protein transthyretin (PDB ID: 2M5N).¹³,⁴⁰ Figure generated using PyMOL.⁴¹

The increase in propensity for an IDP to undergo aggregation is attributed to stabilization of a partially folded conformation, which triggers oligomerization and fibrillation by enabling a monomer to make electrostatic, hydrogen bond, and hydrophobic interactions with other monomers. The transformation of a disordered structure into a partially folded conformation, caused by mutations in the protein sequence or changes in the environment, thus acts as a prerequisite for fibrillation.³²,⁴² Therefore, in order to elucidate these structural transformations, it is essential to perform studies on the structure and dynamics of IDPs under varying physiological conditions.

2.1.3 Amyloid β Peptide

Alzheimer’s disease (AD) is a neurodegenerative disorder clinically characterized by dementia, cognitive and behavioral impairment, social and occupational dysfunction, and eventual death.⁴³ The amyloid-β (Aβ) peptide, according to the amyloid hypothesis,⁴⁴,⁴⁵ is considered to be the main cause of this disease. Misfolding of the Aβ peptide triggers its aggregation, resulting in the formation of senile plaques.⁴⁶ The formation of these plaques in the brains of patients affected by AD is considered to be associated with the progression of the disease.⁴³,⁴⁶,⁴⁷ As the Aβ peptide is a key component of the plaques and hence an important indicator of AD, it is highly useful to study its structural properties in order to understand its role in AD pathology at a molecular level.


The Aβ peptide is a soluble disordered peptide of length ranging from 39 to 43 amino acid residues. It is produced by proteolytic cleavage of a transmembrane protein called amyloid precursor protein (APP), catalyzed by the enzymes β-secretase and γ-secretase.⁴⁸–⁵⁰ Faulty metabolism and an imbalance between the production and clearance of this peptide, resulting in aggregation and fibril formation, is associated with AD pathology.

Due to the disordered nature of Aβ peptide, methods such as NMR spectroscopy and MD simulations are preferred for its structural characterization. Models of functionally relevant peptide fragments, as well as the two most common variants, Aβ(1-40) and Aβ(1-42), have been derived using NMR methods and studied for structural changes induced by variation in their physical and chemical environments.⁵⁰–⁵³ Although Aβ(1-40) is found more abundantly than Aβ(1-42), the latter is known to possess greater propensity for oligomerization and eventual aggregation.⁵⁴ For this reason, both these variants are studied extensively for their structural dynamics and aggregation pathways.

Figure 2. Primary sequence of the Aβ(1-40) peptide. Charged amino acid residues are individually indicated in bold and colored in red (negatively charged), deep blue (positively charged) and light blue (histidines). The background color provides a general indication of the hydrophilic (light orange) and hydrophobic (light yellow) regions of the peptide.

Aβ(1-40) is structurally characterized as being composed of a hydrophilic N-terminal region, and two hydrophobic segments forming the central and C-terminal regions (Figure 2). In Aβ(1-42), the C-terminus is extended further by two hydrophobic residues, isoleucine and alanine, which reduces C-terminal flexibility by forming a β-hairpin at residues 38-41.⁵⁵ This could explain why Aβ(1-42) has greater tendency to aggregate. However, both the variants, Aβ(1-40) and Aβ(1-42), show multiple discrete structural transitions between α-helical, β-sheet, and random coil forms.⁵⁵,⁵⁶ It is therefore important to study the structural dynamics of Aβ peptide as a monomer and identify the conformations that qualify as a misfolded state bearing potential to undergo oligomerization.
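The charge pattern described for Figure 2 can be tabulated directly from the sequence. The Python sketch below is illustrative only: the one-letter sequence is the commonly reported human Aβ(1-40) sequence, entered here from general knowledge rather than from the figure, and should be checked against Figure 2; the function names and the crude charge model (Asp/Glu as −1, Lys/Arg as +1, histidines and termini ignored) are assumptions made for this example.

```python
# Commonly reported human Abeta(1-40) sequence (assumption: verify against Figure 2).
ABETA40 = "DAEFRHDSGYEVHHQKLVFFAEDVGSNKGAIIGLMVGGVV"
# Abeta(1-42) extends the C-terminus by isoleucine (I) and alanine (A).
ABETA42 = ABETA40 + "IA"

# Residue classes matching the color coding described in the Figure 2 caption.
RESIDUE_CLASSES = {
    "negative": set("DE"),   # aspartate, glutamate
    "positive": set("KR"),   # lysine, arginine
    "histidine": set("H"),   # listed separately, as in the figure
}


def charged_positions(sequence):
    """Return 1-based positions of each residue class in the sequence."""
    return {
        name: [i for i, aa in enumerate(sequence, start=1) if aa in members]
        for name, members in RESIDUE_CLASSES.items()
    }


def crude_net_charge(sequence):
    """Very rough net charge: -1 per D/E, +1 per K/R; His and termini ignored."""
    positions = charged_positions(sequence)
    return len(positions["positive"]) - len(positions["negative"])


if __name__ == "__main__":
    for label, seq in (("Abeta(1-40)", ABETA40), ("Abeta(1-42)", ABETA42)):
        print(label, len(seq), charged_positions(seq), crude_net_charge(seq))
```

With this sequence, all charged residues and histidines fall within the first 28 positions, which is consistent with the description of a hydrophilic N-terminal region followed by more hydrophobic central and C-terminal segments.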


2.2 Enzymes

Enzymes are biological catalysts capable of significantly accelerating the rates of biologically relevant chemical reactions, by a factor of up to 10¹⁹ in some cases.⁵⁷,⁵⁸ In a living organism, they are ubiquitous and influence every important life process, such as glucose metabolism, replication of genetic material, and protein synthesis. The chemistry responsible for sustaining life involves biochemical transformations that cannot occur at physiologically meaningful timescales without the help of enzymes. With a large number of reactions simultaneously taking place in a cell, enzymes, in addition to their catalytic efficiency, exhibit remarkable specificity towards their respective substrates.¹ This combination of the efficiency and specificity of enzymes, in addition to their biodegradable nature, makes them ideal candidates for use in industrial applications in today’s world where sustainability is of paramount importance.

Barring a few catalytic RNA molecules, the majority of naturally occurring enzymes are proteins having a distinct three-dimensional structure called a fold. This structural fold, defined by the composition and sequence of amino acids in the protein, is critical for the enzyme to function efficiently. The structure-function paradigm discussed earlier in this chapter stems from the observations made in enzymes that hint at a direct relationship between their structure and biochemical function. However, a similar structural fold can exhibit multiple different protein functionalities (protein ‘moonlighting’), whereas functionally related enzymes might also possess different structures.⁵⁹,⁶⁰

As already stated in a broader context, both order and disorder are necessary for efficient protein function. Investigating the role of enzyme structure and dynamics on its function using computational models is useful for understanding the origins of enzyme catalysis from a different perspective. The ability to analyze enzymatic reactions at an atomic level empowers us to perform mechanistic studies and derive information about thermodynamic quantities. The following sections will discuss the basics of enzyme catalysis and describe how quantities determined by biochemical experiments can be viewed in conjunction with those obtained from theoretical simulations.


2.2.1 Enzyme Kinetics

Enzyme catalysis, in simple terms, refers to an increase in the rate of a chemical reaction facilitated by an enzyme. Chemical kinetics deals with the study of the rates of chemical reactions, and attempts to establish a relationship between the rate of a reaction and the concentrations of the reactants. These relationships can be described by rate equations or rate laws. They carry vital information regarding the reaction mechanisms and corresponding kinetic rate constants, often acting as a bridge between experimental and computational enzymology. The utility of these rate laws is enhanced when they are complemented by other analyses from the physical organic chemistry toolkit, such as linear free energy relationships (LFER) and kinetic isotope effects (KIE). The Michaelis-Menten relationship can be applied to analyze the kinetics of an enzyme-catalyzed reaction.

Michaelis-Menten Relationship

A simple enzyme-catalyzed reaction involving the conversion of one substrate molecule to a product molecule, without any intermediate states, can be written as

$$\mathrm{E} + \mathrm{S} \underset{k_{-1}}{\overset{k_{1}}{\rightleftharpoons}} \mathrm{ES} \xrightarrow{k_{2}} \mathrm{E} + \mathrm{P}$$

where E, S, ES and P represent the enzyme, the substrate, the enzyme-substrate complex, and the product, respectively. k1, k2 and k-1 denote the rate constants for the respective reaction steps. While the formation of the ES complex, also known as the Michaelis complex, is reversible and in equilibrium with substrate release from the enzyme, the step leading to product formation is assumed to be irreversible. Thus, the rate of product formation and the change in concentration of the ES complex are given by the following relationships:

$$\frac{d[\mathrm{P}]}{dt} = k_{2}[\mathrm{ES}] \tag{2.1}$$

$$\frac{d[\mathrm{ES}]}{dt} = k_{1}[\mathrm{E}][\mathrm{S}] - k_{-1}[\mathrm{ES}] - k_{2}[\mathrm{ES}] \tag{2.2}$$

According to the steady state approximation, the ES complex is formed and consumed at equal rates, so that [ES] remains constant over time. This results in the following expression:

$$0 = k_{1}[\mathrm{E}][\mathrm{S}] - k_{-1}[\mathrm{ES}] - k_{2}[\mathrm{ES}] \tag{2.3}$$


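Separating the concentration terms and the rate terms, we get

$$\frac{k_{-1} + k_{2}}{k_{1}} = \frac{[\mathrm{E}][\mathrm{S}]}{[\mathrm{ES}]} \tag{2.4}$$

The quantity on the left-hand side of equation (2.4) is defined as the Michaelis constant, KM.

$$K_{M} = \frac{k_{-1} + k_{2}}{k_{1}} \tag{2.5}$$

The initial concentration of the enzyme, [E0], can be written as the sum of the concentrations of the free enzyme [E] and the enzyme-substrate complex [ES]:

$$[\mathrm{E}_{0}] = [\mathrm{E}] + [\mathrm{ES}] \tag{2.6}$$

Substituting KM from equation (2.5) and [E] = [E0] − [ES] from equation (2.6) into equation (2.4), and rearranging the terms, results in the following expression for [ES]:

$$[\mathrm{ES}] = \frac{[\mathrm{E}_{0}][\mathrm{S}]}{K_{M} + [\mathrm{S}]} \tag{2.7}$$

Substituting equation (2.7) in equation (2.1), we obtain an expression for the rate of the reaction, v, in terms of the initial concentrations of the enzyme and substrate:

$$v = \frac{d[\mathrm{P}]}{dt} = \frac{k_{2}[\mathrm{E}_{0}][\mathrm{S}]}{K_{M} + [\mathrm{S}]} \tag{2.8}$$

The maximum velocity of an enzymatic reaction, Vmax, is given by:

$$V_{max} = k_{2}[\mathrm{E}_{0}] \tag{2.9}$$

Replacing k2[E0] in equation (2.8) with Vmax from equation (2.9), we obtain the final form of the Michaelis-Menten equation,

$$v = \frac{V_{max}[\mathrm{S}]}{K_{M} + [\mathrm{S}]} \tag{2.10}$$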
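As an illustration of how equations (2.5), (2.9) and (2.10) fit together, the following minimal sketch evaluates the steady-state rate from the microscopic rate constants. The function name, argument names, and the numerical values in the usage note are assumptions chosen for illustration only; they are not taken from this thesis.

```python
import math


def michaelis_menten_rate(s, e0, k1, k_minus1, k2):
    """Steady-state rate for E + S <=> ES -> E + P.

    s        : substrate concentration [S] (M)
    e0       : total enzyme concentration [E0] (M)
    k1       : association rate constant (M^-1 s^-1)
    k_minus1 : dissociation rate constant (s^-1)
    k2       : rate constant of the chemical step (s^-1)

    Returns the tuple (v, km, vmax).
    """
    km = (k_minus1 + k2) / k1   # Michaelis constant, equation (2.5)
    vmax = k2 * e0              # maximum velocity, equation (2.9)
    v = vmax * s / (km + s)     # Michaelis-Menten equation (2.10)
    return v, km, vmax
```

With hypothetical values k1 = 1e6 M^-1 s^-1, k-1 = k2 = 50 s^-1 and [E0] = 1 µM, calling the function at [S] = KM returns a rate equal to half of Vmax, which is the defining property of the Michaelis constant.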

The Michaelis-Menten relationship, represented in a schematic form in Figure 3, can be used to obtain key kinetic parameters that serve as a guide to evaluate the catalytic efficiency of an enzyme. These parameters are kcat, KM, and kcat/KM.

kcat is a first-order rate constant that indicates the rate of conversion of an enzyme-substrate complex to free enzyme and product. Although for practical purposes it is considered as the rate constant for the chemical step, i.e. k2, it is theoretically the lower limit of the rate constant of the rate-limiting step post ES formation. At saturating levels of the substrate, k2 becomes equivalent to kcat. This kinetic constant is of great importance in computational enzymology, as it can be used to obtain the free energy of activation with the help of transition state theory (TST), thus providing a bridge between the kinetic world of experiment and the thermodynamic world of simulation.

Figure 3. Schematic representation of the Michaelis-Menten relationship. The profile for reaction rate versus substrate concentration applies to an enzyme-catalyzed reaction exhibiting steady state kinetics. The variables shown in the figure are described in the text.

KM is the concentration of the substrate at which the rate of the enzyme-catalyzed reaction is half of the maximum rate. Known as the Michaelis constant, it is used as an indicator of the binding affinity of the enzyme for a particular substrate. Here too, however, it is important to note that this interpretation is valid only when the reaction is two-step and the rate of substrate dissociation far exceeds the rate of product formation (k-1 >> k2), which is rarely the case for enzyme-catalyzed reactions.61,62

kcat/KM is usually considered to be the catalytic efficiency of an enzyme towards a certain substrate. As it reflects both the binding and catalytic events, it functions as an apparent second-order rate constant, indicating how substrate binding affects the catalytic rate. In cases where the catalytic rate far exceeds the rate of substrate dissociation, the reaction rate is said to have reached the diffusion limit.

Computational enzymology deals with the calculation of thermodynamic parameters wherein a single enzyme-substrate system is studied in isolation. This is in contrast to experiments, where the kinetic parameters obtained for the enzymatic reaction in solution are macroscopic in nature. In order to utilize data obtained from experiments for validating computational models, it is essential to make a connection between chemical kinetics and thermodynamics.

Linking Kinetics with Thermodynamics

The Arrhenius equation63 describes the temperature dependence of the rate of enzyme-catalyzed reactions,

$$k = A\,e^{-E_{a}/RT} \tag{2.11}$$

where k is the observed rate constant, A is the reaction-specific pre-factor, R is the universal gas constant, T is the temperature in Kelvin, and Ea is the activation energy of the reaction. It implies an increase in the rate of the reaction with an increase in temperature.

Figure 4. Schematic representation of a free energy profile for the conversion of substrate S to product P. The profile shown in black corresponds to the uncatalyzed reaction in solution (in this case, water, indicated by “w” in the subscripts). The profile shown in green corresponds to the enzyme-catalyzed reaction.

The Eyring-Polanyi equation64,65 is similar in form, relating the rate constant of the reaction to its Gibbs free energy of activation,

$$k = \frac{k_{B}T}{h}\,e^{-\Delta G^{\ddagger}/RT} \tag{2.12}$$

where k is the rate constant of the reaction at temperature T, kB is the Boltzmann constant, h is the Planck constant, and ΔG‡ is the Gibbs activation free energy. This relationship allows us to obtain thermodynamic parameters such as the Gibbs free energy from experiments, and compare them with theoretically evaluated reaction energetics.

According to Transition State Theory (TST), a chemical reaction proceeds from the reactant state to the product state via a metastable transition state existing as a hypersurface in phase space. The transition state is understood to be in quasi-equilibrium with the reactant state, and every crossing of the barrier is assumed to lead to the product state.66 The shortcomings of TST, however, include not considering re-crossing events (formation of reactants even after crossing the barrier) and quantum tunneling.67 The free energy profiles of an uncatalyzed reaction and its corresponding enzyme-catalyzed reaction are schematically represented in Figure 4.
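The conversion between a measured rate constant and an activation free energy can be sketched as follows. This is a minimal illustration of equation (2.12) only: the function names and the default temperature of 298.15 K are assumptions, and no correction for re-crossing or tunneling is included.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34    # Planck constant, J*s
R = 8.314462618e-3    # universal gas constant, kJ/(mol*K)


def activation_free_energy(k, temperature=298.15):
    """Activation free energy (kJ/mol) from a rate constant k (s^-1)."""
    prefactor = K_B * temperature / H          # k_B*T/h, in s^-1
    return -R * temperature * math.log(k / prefactor)


def rate_constant(delta_g, temperature=298.15):
    """Rate constant (s^-1) from an activation free energy (kJ/mol)."""
    prefactor = K_B * temperature / H
    return prefactor * math.exp(-delta_g / (R * temperature))
```

Under these assumptions, a kcat of 1 s^-1 at 298.15 K corresponds to an activation free energy of roughly 73 kJ/mol, and each tenfold increase in the rate constant lowers the barrier by RT ln 10, about 5.7 kJ/mol.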

2.2.2 Enzyme Catalysis

Enzymes are biological catalysts that function to increase the rate of a chemical reaction by lowering the free energy barrier between the reactant and transition states without undergoing any permanent change themselves.68 An enzyme initially forms a stable complex with the reactant, followed by the chemical step that results in product formation. The final step involves product release, and this cycle is repeated for the next reactant molecule. The catalytic effect of the enzyme on the reaction is on display over the entire process from substrate binding to product release. Elucidating the way in which catalysis occurs is an important area of study to understand the origins of enzyme catalysis.69

One of the earliest models proposed to explain enzyme catalysis was Fischer’s “Lock and Key” hypothesis, which was based on shape complementarity between the active site of the enzyme and the substrate.70 This was later followed by Koshland’s induced fit model,71 which suggested that the substrate molecule induces conformational changes in the enzyme to accommodate itself in the active site. However, studies showing tight binding between enzyme and transition state analogues72 indicated that there are other factors influencing enzyme catalysis.

Transition State Stabilization

The theory of transition state stabilization was first propounded by Pauling, who proposed that the active site of an enzyme binds to the transition state with greater affinity than to the substrate.73 In this case, the free energy of the transition state in complex with the enzyme is lower than that of the transition state for an uncatalyzed reaction in solution. This is made possible by a favorable active site environment in which suitable polar and non-polar residues are precisely arranged in three-dimensional space to stabilize the transition state relative to the reactant and product states. The theory has since garnered support from the concepts of electrostatic stabilization74 and active site preorganization,75,76 which state that the stabilizing charges in an active site are preorganized for preferential binding to the transition state and do not have to reorient themselves during the reaction. More recently, these theories have been validated by experimental and computational studies.81–87

Ground State Destabilization

The active site of an enzyme can also function to destabilize the bound substrate, thus lowering the free energy difference between the reactant state (or ground state) and the transition state. Ground state destabilization implies that the free energy of the transition state is not affected by the enzyme, and catalysis occurs solely by raising the free energy of the ground state in relation to the transition state. Though this theory has been proposed to work in some cases,84,85 newer studies have shown that ground state destabilization is not as significant as transition state stabilization.86–89

2.2.3 Structural and Dynamic Effects on Enzyme Function

The effect of protein structure, and of the order and disorder therein, on its function has already been emphasized in this chapter. It is no different for enzymes, which contain distinct structural scaffolds that help in properly orienting relevant amino acids at the active site. Some disordered regions in enzymes, such as loops, play a hugely important role in regulating the catalytic activity. The growth of structural and computational biology in recent decades has resulted in large amounts of data indicating that dynamic effects leading to conformational changes in an enzyme are critical for catalysis.

The significance of dynamic effects (correlated motions90 and promoting vibrations,91 to name a few) for enzyme function has been shown in numerous computational studies.92–95 My work on modeling the enzyme-catalyzed reaction in triosephosphate isomerase (TIM) has also yielded interesting insights with regard to the effect of active site architecture (Papers III and IV) and loop motion (Paper VI) on catalysis.


2.2.4 Triosephosphate Isomerase

Triosephosphate isomerase (TIM) is an enzyme that reversibly catalyzes the conversion of dihydroxyacetone phosphate (DHAP) to D-glyceraldehyde-3-phosphate (GAP) (Figure 5).96–99 It is a glycolytic enzyme which performs the important function of regulating the cellular concentrations of GAP and DHAP, as these molecules link the glycolytic pathway to other metabolic pathways such as the pentose phosphate pathway, gluconeogenesis, and lipid metabolism.1

Figure 5. Reaction scheme showing the reversible interconversion of DHAP and GAP catalyzed by TIM.

Structurally, TIM is a homodimer consisting of two identical subunits or monomers about 250 residues in length (248 residues in yeast TIM100) (Figure 6A). A notable feature of its structure is the TIM barrel fold, which is the most common structural fold observed in nature, found in approximately 10% of all known proteins.101–103 It comprises an eightfold repeating pattern of β-strands and α-helices, also represented as the (βα)8-fold. The β-strands and α-helices are linked by flexible loops, referred to as loop 1, loop 2, and so on until loop 8.104 The motion of loops 6 and 7 has significant relevance to the catalytic function of TIM.97 Various conformational states of these loops in ligand-bound and “apo” enzymes have been studied with the help of structural models.100,105–107

Closure of TIM loop 6 upon substrate binding forms non-covalent stabilizing interactions between the enzyme and the phosphodianion moiety of the substrate.108 This is a classic example of ligand-driven conformational change,109 a feature also observed in other enzymes such as glycerol-3-phosphate dehydrogenase (GPDH)110,111 and orotidine 5´-monophosphate decarboxylase (OMPDC)112. Structural analysis has also helped to elucidate the active site architecture of TIM, with key amino acid residues such as E165, H95, and K12 (residue numbering from yeast TIM100; the same numbering will be used throughout this thesis) surrounding the bound substrate and creating a favorable environment for the chemical reaction to occur (Figure 6B).96,97,108,113

Figure 6. (A) Cartoon representation of the dimeric form of yeast TIM (PDB ID: 1NEY).13,100 (B) Close-up view of the active site of TIM showing key amino acid residues (grey color, stick representation) and the bound substrate DHAP (yellow color, ball-and-stick representation). Figure generated using PyMOL.41

TIM catalyzes the isomerization reaction $10^{9}$ times faster than the uncatalyzed reaction in solution.97,114–117 The catalytic proficiency of this enzyme is a result of the optimal arrangement of active site residues for transition state stabilization, and of the dynamic properties of a flexible loop that is involved in reorganization of the active site.97,113 Thus, we observe both structural order and disorder at play in TIM for efficient catalysis of the isomerization reaction. Chapter 5 will provide a more detailed overview of this enzyme system with greater relevance to my investigations on TIM-catalyzed reactions using computational approaches.
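To put this rate enhancement on an energy scale, it can be translated into a reduction of the activation free energy using the Eyring-Polanyi relationship of Section 2.2.1. The short estimate below is only a sketch: it assumes that the catalyzed and uncatalyzed reactions share the same pre-exponential factor, and it takes T = 298.15 K as an illustrative temperature.

$$\Delta\Delta G^{\ddagger} = \Delta G^{\ddagger}_{w} - \Delta G^{\ddagger}_{enz} = RT \ln\frac{k_{enz}}{k_{w}} = (8.314 \times 10^{-3}\ \mathrm{kJ\,mol^{-1}\,K^{-1}})(298.15\ \mathrm{K}) \ln(10^{9}) \approx 51\ \mathrm{kJ\,mol^{-1}} \approx 12.3\ \mathrm{kcal\,mol^{-1}}$$

Here the subscripts “w” and “enz” denote the reaction in water and in the enzyme, respectively. Under these assumptions, each order of magnitude in rate corresponds to about 1.4 kcal/mol of barrier reduction.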


3. Computational Approaches

The development of new theories in modern physics and chemistry around the early 20th century heralded a new age in scientific research that prompted the use of theoretical calculations to study natural phenomena. In the early stages of these fields, calculations were performed manually in order to derive equations supporting new theories such as quantum mechanics.118 However, the advent of powerful computers over the last few decades has catalyzed the maturation of this field.

The study of sub-atomic particles, atoms, and molecules pervades various disciplines of science. All known matter is composed of these very components arranged in various permutations and combinations. This fundamental property of matter makes theoretical chemistry a very powerful tool to study any reasonably sized molecular system using mathematical models at relevant timescales. The field of computational biochemistry has made and continues to make good use of theoretical models in order to study the structural and dynamic properties of biomolecular systems. Furthermore, it also enables the investigation of their function at an atomic level.

This chapter contains a brief overview of the computational approaches that are most commonly used on biomolecular systems. A more detailed description of the underlying principles is included here for those methods that have been utilized in the research work presented in this thesis.

3.1 Quantum Mechanics

Quantum mechanics (QM) is a theory that can be used to provide a physically meaningful description of chemical systems, particularly at the sub-atomic level. One of the fundamentals of QM is the description of a system in the form of a wave function (Ψ). The Schrödinger equation (Equation 3.1) can be solved to obtain information about the current state of a system.

$$\hat{H}\Psi = E\Psi \tag{3.1}$$

Here, $\hat{H}$ is the Hamiltonian operator that acts upon the wave function Ψ, resulting in Ψ multiplied by a scalar quantity E, which is the energy associated with the wave function. It is often difficult to know Ψ for non-trivial systems, making it extremely difficult to solve equation (3.1) analytically. Hence, in order to solve the equation numerically, approximations of certain aspects of the theory are made such that they do not significantly influence the final solution.119 The most commonly used approximation is the Born-Oppenheimer (BO) approximation,120 which treats the atomic nuclei as stationary particles relative to the fast-moving electrons, thereby decoupling their motions.

3.1.1 Ab initio methods

The Latin term ab initio translates to “from first principles”. Ab initio methods, thus, make use of the basic principles of quantum mechanics to determine the state of a system. The BO approximation of the Schrödinger equation is usually a starting point for these methods. Further approximations are introduced to simplify the process of solving the Schrödinger equation for non-trivial systems without greatly compromising accuracy. The Hartree-Fock (HF) method,121 for example, is a classic ab initio method that introduces an approximation by neglecting the phenomenon of electron correlation and instead treats each electron with an effective average potential. Post-HF methods include Møller-Plesset perturbation theory (MPn),122 configuration interaction (CI),123 and coupled cluster theory (CC).124 These methods treat electron correlation using different kinds of approximations that improve the accuracy of solutions relative to HF, but are computationally more expensive.

3.1.2 Density Functional Theory

Density Functional Theory (DFT) is used to solve for energies of molecular systems by calculating the electron density.125–127 It serves as a computationally less intensive option in comparison to ab initio methods that make use of the electron wave function. In this theory, the ground state energy of a system is calculated as a sum of three components: (A) the attraction between nuclei and electrons, depending on the electron densities, (B) the kinetic energy of electrons, and (C) the interaction energy between electrons. The challenge in DFT lies in successfully arriving at a functional that accurately describes the electron kinetic energy and electron interaction energy. This is done by carrying out parametrization of the functional to reproduce experimentally determined properties of known compounds, thus making DFT an empirically corrected QM approach.


We now begin to observe that introducing approximations to fundamental theories has many advantages. Firstly, it allows us to develop algorithms and mathematical models that are less intensive to compute. This saves time as well as valuable computational resources. Secondly, the results from these approximate methods can be used for practical purposes without unduly compromising accuracy, as they do not deviate significantly from those of the exact method.

The QM-based methods discussed so far are useful for modeling chemical reactivity, as they can describe bond cleavage and formation. However, due to their high computational cost, these approaches are limited in the number of atoms that can be studied with acceptable computational efficiency. Therefore, if the objective of a simulation is to model the motion or dynamics of a protein or an enzyme system in solvent, we need to opt for a more approximate method.

3.2 Molecular Mechanics

Molecular mechanics (MM) makes use of a classical mechanical framework to describe a molecular system. Approximations are introduced in order to describe interatomic interactions, and Newtonian equations of motion are implemented for modeling dynamic behavior. In this framework, atoms are treated as balls or whole particles, whereas the bonds connecting them are treated as springs. Consequently, subatomic particles such as protons, neutrons and electrons are not explicitly considered. However, information about the mass and charge (partial charge, when considered as part of a molecule) of an atom is included in a key function called a force field. A force field is central to any MM-based method, and consists of a number of individual functions and parameters that describe various bonded and non-bonded interactions that are possible between intra- or intermolecular atoms. The standard form of a force field is given by:128

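$$U = \sum_{bonds} \frac{1}{2} k_{b}\,(b - b_{0})^{2} + \sum_{angles} \frac{1}{2} k_{\phi}\,(\phi - \phi_{0})^{2} + \sum_{dihedrals} k_{\psi}\,[1 + \cos(n\psi - \delta)] + \sum_{impropers} \frac{1}{2} k_{\theta}\,(\theta - \theta_{0})^{2} + \sum_{i<j} \left[ \frac{q_{i}q_{j}}{4\pi\varepsilon_{0}r_{ij}} + \frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}} \right] \tag{3.2}$$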


Here, the terms kb, kϕ, kψ and kθ represent the force constants for the potentials describing the bonds, angles, proper dihedrals, and improper torsions, respectively. Coulomb’s law is used to calculate the electrostatic interactions between two atoms. The values qi and qj represent the partial atomic charges for the two atoms i and j. Aij and Bij are the repulsive and attractive terms, respectively, of the Lennard-Jones potential that is used to describe van der Waals interactions between the atoms i and j. Evaluating equation (3.2) gives the overall potential energy of the system.

The parametrization of these functional terms is done by fitting to data determined by experiments and/or by performing high-level QM calculations on the molecular fragments of interest.129 At present, many force fields have been developed that are used to simulate the dynamics of biomolecular systems. Some prominent ones include CHARMM36m,130 AMBER-ff14SB,131 and OPLS-AA/m,132 the latter two of these having been extensively used in the investigations presented as part of this thesis. The individual functions and parameters are modified differently in each of these force fields, but they largely draw inspiration from equation (3.2).

Since the MM force field is a relatively simpler function to evaluate than the ones we encounter in QM-based methods, it can be employed for simulating large biomolecular systems comprising thousands of atoms. In order to study changes in atomic positions and variations in molecular conformations, the potential energy of a system needs to be calculated repeatedly in a deterministic fashion. The outcome of this exercise is molecular dynamics (MD). This involves the integration of Newton’s equations of motion, thereby allowing the system to propagate in time. A trajectory resulting from an MD simulation contains information about the spatial coordinates of all atoms included in the system. This information is vital for analyzing conformational transitions in local or global regions of a biomolecular system, and for identifying energetically stable structural conformations. The major disadvantage of MM in comparison to QM is its inability to describe the bond breaking and bond formation events of a chemical reaction, although reactive force fields such as ReaxFF133 or specialized approaches such as the empirical valence bond (EVB) approach134,135 have been developed that permit changes in bonding patterns to be modeled also in an MM framework.
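The idea of propagating a system in time by repeatedly evaluating forces can be illustrated with a minimal sketch. The example below integrates Newton’s equations of motion for a single harmonic bond in one dimension using the velocity Verlet scheme, which is one commonly used integrator and is chosen here only for illustration. All names, the reduced units, and the parameter values are assumptions, not settings used in this thesis.

```python
K_BOND = 1.0   # harmonic force constant (reduced units, hypothetical)
B0 = 1.0       # equilibrium bond length (reduced units, hypothetical)


def bond_energy(b):
    """Harmonic bond term of a force field: 1/2 * k_b * (b - b0)^2."""
    return 0.5 * K_BOND * (b - B0) ** 2


def bond_force(b):
    """Force along the bond coordinate: minus the derivative of the energy."""
    return -K_BOND * (b - B0)


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Propagate one coordinate in time; returns (trajectory, final velocity)."""
    trajectory = [x]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append(x)
    return trajectory, v
```

Starting from a stretched bond, for example `velocity_verlet(1.2, 0.0, bond_force, 1.0, 0.01, 1000)`, the coordinate oscillates around the equilibrium length, and the sum of kinetic and potential energy stays very nearly constant, which is a useful sanity check for any integrator and time step.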


3.2.1 Enhanced Sampling Methods

Exploration of the potential energy surface of a system using conformational sampling remains a daunting task for MM-based methods. A single MD trajectory allows for a deterministic exploration of the various energy minima for a system, propagating from a solitary starting point. A complete exploration is theoretically possible if the trajectory is allowed to continue indefinitely, but in practice it is difficult to obtain a comprehensive scan of the energy landscape from a single long-timescale simulation. A usual workaround to this problem is to run multiple simulations with non-identical starting points (initial velocities) that propel the system in different directions in the quest to sample the different energy minima accessible to the system. This manner of improving the sampling of the conformational space is fairly common, as demonstrated in all the papers included in this thesis. However, over time, there has been a push towards the development and implementation of innovative methods into MM in an effort to maximize conformational sampling of the system at minimal computational cost.136 Some of these methods, such as Hamiltonian Replica Exchange molecular dynamics (HREX-MD) and Biased Exchange Metadynamics (BE-METAD), have been used in Paper VI to extensively sample the conformational diversity of loop 6 in TIM.

Enhanced sampling methods can be broadly classified into two groups based on whether they require a biasing collective variable (CV) or not. In a general sense, a CV is a function of the coordinates of the system which is able to explicitly describe the transition of a system from a known initial state to a known final state.137 This is particularly advantageous if there is prior knowledge about the initial and final states of the system following, for example, a conformational change. However, the effectiveness of a CV-dependent enhanced sampling method, such as metadynamics, relies heavily on the quality of the chosen CV. On the other hand, methods that are not constrained by CVs, such as replica exchange MD, can explore the potential energy surface of a molecular system without being biased towards sampling a pre-defined conformational change.

Replica Exchange Molecular Dynamics

In replica exchange MD (REMD), enhancement of the conformational sampling of the system is achieved by running multiple replicas of simulations at different temperatures in parallel. An exchange of system configurations is done at a certain frequency between pairs of replicas, after which the exchanged replicas are simulated at new temperatures. This aids in the convergence of various replicas towards a common energy minimum, and also allows for greater sampling of the conformational landscape of the system. It also helps in preventing systems from getting trapped in conformations corresponding to local energy minima.138,139
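The text above does not spell out how an exchange is accepted. In temperature REMD, a swap between two replicas is commonly accepted with a Metropolis probability that depends on their potential energies and temperatures; the sketch below assumes that standard criterion. The function names, the choice of kcal/mol units, and the example numbers are hypothetical.

```python
import math
import random

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def exchange_probability(u_i, u_j, t_i, t_j):
    """Metropolis probability of swapping the configurations of two replicas.

    u_i, u_j : potential energies (kcal/mol) of the replicas currently
               running at temperatures t_i and t_j (K).
    """
    beta_i = 1.0 / (R_KCAL * t_i)
    beta_j = 1.0 / (R_KCAL * t_j)
    delta = (beta_i - beta_j) * (u_i - u_j)
    # Accept with probability min(1, exp(delta)).
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_exchange(u_i, u_j, t_i, t_j, rng=random):
    """Return True if the swap is accepted in this attempt."""
    return rng.random() < exchange_probability(u_i, u_j, t_i, t_j)
```

With this criterion, a swap is always accepted when the colder replica holds the higher-energy configuration, and is accepted only occasionally otherwise. This is why neighbouring temperatures are usually spaced closely enough for their energy distributions to overlap.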

Metadynamics

Metadynamics, a CV-based enhanced sampling approach, makes use of a history-dependent bias potential acting on the CV to accelerate the sampling of rare events in a simulation. This is usually done by flattening the free energy landscape of the CV by populating it with Gaussians, thereby allowing for better sampling of all conformations along the CV space.137,138 The simulations are run until the bias potential converges to the negative of the real free energy of the CV. In systems exhibiting complicated motions, the sampling efficiency can be further improved by using a combination of metadynamics and replica exchange methods.137,138
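The history-dependent bias described above can be sketched for a single one-dimensional CV as a growing sum of Gaussians centred on previously visited CV values. The class below is an illustrative toy, not the implementation used in this thesis; the class name, the fixed Gaussian height and width, and the absence of tempering are all assumptions.

```python
import math


class MetadynamicsBias:
    """History-dependent bias built from Gaussians along one CV."""

    def __init__(self, height, width):
        self.height = height   # Gaussian height (energy units)
        self.width = width     # Gaussian width (CV units)
        self.centers = []      # CV values visited at deposition times

    def deposit(self, s):
        """Add a Gaussian centred at the current CV value s."""
        self.centers.append(s)

    def energy(self, s):
        """Bias potential V(s): sum of all deposited Gaussians."""
        return sum(
            self.height * math.exp(-((s - c) ** 2) / (2.0 * self.width ** 2))
            for c in self.centers
        )

    def force(self, s):
        """Bias force -dV/ds, which pushes the CV away from visited values."""
        return sum(
            self.height * (s - c) / self.width ** 2
            * math.exp(-((s - c) ** 2) / (2.0 * self.width ** 2))
            for c in self.centers
        )
```

In a simulation, `deposit` would be called at regular intervals with the instantaneous CV value, and `force` would be added to the physical force along the CV, so that regions already visited become progressively less favourable.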

3.2.2 Calculation of Free Energies

The free energy of a system is a fundamental concept in physical chemistry that can be used to describe a process in thermodynamic terms. In various computational approaches, calculation or estimation of the free energy of a process (for example, a chemical reaction) is of paramount importance. Two of the most commonly used methods for free energy calculations, free energy perturbation (FEP) and umbrella sampling (US), are discussed for a two-state system in this section.

Free Energy Perturbation

The Zwanzig equation140 can be used to evaluate the free energy difference between the two states 1 and 2 of a system by calculating the ensemble average of the exponential of the potential energy difference sampled at state 1.

$$\Delta G = -RT \ln \left\langle \exp\left(-\frac{U_{2} - U_{1}}{RT}\right) \right\rangle_{1} \tag{3.3}$$

Here, ΔG is the free energy difference, R is the universal gas constant, T is the temperature in Kelvin, and U1 and U2 are the potential energies corresponding to states 1 and 2, respectively. This relation is, however, accurate only if the two states overlap significantly with respect to the conformational spaces they represent. To overcome this problem, a number of intermediate states are introduced between states 1 and 2 in order to gradually perturb the system.


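A new potential is created as a linear combination of the two potentials, with λ being the coupling parameter whose values range from 0 to 1.

$$U_{\lambda} = (1 - \lambda)\,U_{1} + \lambda\,U_{2} \tag{3.4}$$

The free energy of the system going from state 1 to state 2 can then be calculated as a sum over all intermediate steps,

$$\Delta G_{1 \rightarrow 2} = -RT \sum_{m=0}^{n-1} \ln \left\langle \exp\left(-\frac{U_{m+1} - U_{m}}{RT}\right) \right\rangle_{m} \tag{3.5}$$

where Um is the potential of equation (3.4) evaluated at the m-th of n + 1 values of λ (with λ = 0 for m = 0 and λ = 1 for m = n), and each average is taken over configurations sampled on Um.

This approach is referred to as free energy perturbation (FEP), and is highly instrumental in studying enzymatic chemical reactions, especially when the initial and final states are known beforehand.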
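The exponential averaging that underlies FEP can be sketched in a few lines. The example assumes that the potential energy differences between neighbouring λ windows have already been collected from simulation frames; the function names, the kcal/mol units, and the default temperature are illustrative assumptions.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def zwanzig(delta_u, temperature=298.15):
    """Free energy difference from sampled energy differences (kcal/mol).

    delta_u : sequence of U_target - U_sampled values, one per frame,
              all evaluated on configurations of the sampled state.
    """
    rt = R_KCAL * temperature
    exponents = [-du / rt for du in delta_u]
    # Log-sum-exp trick: factor out the largest exponent for stability.
    largest = max(exponents)
    mean = sum(math.exp(x - largest) for x in exponents) / len(exponents)
    return -rt * (largest + math.log(mean))


def fep_total(windows, temperature=298.15):
    """Sum the Zwanzig estimates over consecutive lambda windows."""
    return sum(zwanzig(w, temperature) for w in windows)
```

Because the average is exponential, frames with low energy differences dominate the estimate, so the result is never larger than the plain arithmetic mean of the sampled differences. This sensitivity to rarely sampled frames is one reason why several closely spaced windows are preferred over a single large perturbation.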

Umbrella Sampling

While the FEP methodology entails the generation of intermediate states for calculation of the free energy, it also presents a challenge of being able to sufficiently sample rare events associated with or corresponding to these intermediate states. These could include, for example, crossing of the energetic barrier to identify the transition state in a chemical reaction. The umbrella sampling (US) approach allows for ample sampling of the system near the intermediate states. This is done by adding a biasing potential W to the sampling potential U1 for the purpose of improving the sampling, resulting in an effective potential Utotal.

$$U_{total} = U_1 + W \qquad (3.6)$$

Evaluation of the free energy from US calculations is eventually done by removing the energy contribution associated with the biasing potential from the total energy.
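Removing the contribution of the bias amounts to reweighting each biased configuration by exp(+W/RT). The sketch below illustrates this on an assumed toy system: a hypothetical one-dimensional harmonic potential U1 with a harmonic umbrella W, so that the biased ensemble can be sampled exactly. The function names and parameter values are illustrative and not taken from any simulation package.

```python
import math
import random

RT = 0.596  # kcal/mol, roughly 300 K (assumed for illustration)

def bias_w(x, k_w, x0):
    """Harmonic umbrella (biasing) potential W(x)."""
    return 0.5 * k_w * (x - x0) ** 2

def sample_biased(k, k_w, x0, n, seed=0, rt=RT):
    """Exact samples from U_total = 0.5 * k * x**2 + W(x), which is again a Gaussian."""
    rng = random.Random(seed)
    k_tot = k + k_w
    mean = k_w * x0 / k_tot
    sigma = math.sqrt(rt / k_tot)
    return [rng.gauss(mean, sigma) for _ in range(n)]

def unbiased_average(values, xs, k_w, x0, rt=RT):
    """Remove the bias: <A> = <A * exp(+W/RT)>_biased / <exp(+W/RT)>_biased."""
    weights = [math.exp(bias_w(x, k_w, x0) / rt) for x in xs]
    return sum(a * w for a, w in zip(values, weights)) / sum(weights)
```

For example, with `xs = sample_biased(2.0, 0.5, 0.5, 50000)`, the reweighted mean of x² returned by `unbiased_average([x * x for x in xs], xs, 0.5, 0.5)` should approach the unbiased value RT/k. In practical applications, data from many overlapping umbrella windows are combined, which this single-window sketch does not attempt.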

3.3 Multiscale Modeling

Methods based on the principles of quantum mechanics are capable of describing chemical reactions that involve the breaking and forming of bonds. However, the high precision of these calculations comes at significant computational cost, such that their computational efficiency decreases dramatically with increasing system size. This compels users to truncate the size of the system to include only the reactive species for a fast and accurate description of reaction energetics, thereby losing information regarding the effect of surrounding atoms on the reacting atoms.141,142 Molecular mechanics, on the other hand, oversimplifies the model of an atom by treating it as a whole particle with pre-defined bonding patterns. This rules out the possibility of modeling chemical reactions using conventional MM-based methods (with the exception of specialist approaches such as those mentioned in the introduction to Section 3.2). A hybrid method called QM/MM (quantum mechanics / molecular mechanics) was developed to solve this issue by amalgamating scalability, efficiency, and theoretical rigor.

3.3.1 The QM/MM Approach

In a multiscale model,143,144 the system being studied is partitioned into two regions: the reactive region and the surrounding region. The reactive region (also termed the QM region) is a small group of atoms that are actively involved in the reactive step, and are therefore treated at the more accurate QM level of theory. The remainder of the system, i.e. the surrounding region (also termed the classical region), is treated at the more approximate MM level of theory. The potential energy of the system (Utotal) is, hence, a sum of the potential energies of the classical (UMM) and QM (UQM) regions, as well as a coupling (UQM/MM) between the two regions.

$$U_{total} = U_{MM} + U_{QM} + U_{QM/MM} \qquad (3.7)$$

The QM/MM approach thus extracts the best of both QM- and MM-based methods, by modeling chemical reactivity as accurately as possible and also by describing the neighborhood of the reactive center.

3.3.2 The Empirical Valence Bond Approach

The empirical valence bond (EVB) approach was developed by Warshel and Weiss in 1980 as a simple and reliable approach to study solvation effects in enzyme-catalyzed chemical reactions.134,135 While the QM/MM approach manages to improve system scalability and optimization of computational power for accurate description of enzymatic reactions, calculation of energies for the QM region can still prove to be significantly time-consuming and computationally expensive. EVB presents itself as a semi-empirical approach that, in a QM/MM-like framework, makes use of classical force fields and empirically fitted parameters to describe the chemistry in an enzymatic system, thereby allowing for fast and extensive conformational sampling of the reactive region.

The EVB approach defines a chemical reaction as a mixture of independent valence bond states that are coupled by reaction-specific parameters. These diabatic states correspond to the reactant and product states of a reaction, and can also represent any reaction-specific intermediate states. Each of these diabatic states is represented by a parabolic function, which in turn is described using classical force fields. The diabatic states are combined with the help of the EVB mapping parameters Hij and α (for descriptions, see below) to produce an adiabatic free energy surface. This is schematically represented in Figure 7 for a simple two-state reaction.

Figure 7. Schematic representation of the EVB ground state adiabatic free energy surface (εg, black) for a simple 2-state reaction (equation 3.12). The EVB diabatic parabolas for the two states are represented as εrs for the reactant state (red) and εps for the product state (blue). The reaction coordinate can be geometric in nature, but can also be the energy gap between the two diabatic states. The off-diagonal term, H12, describes the coupling between the diabatic states εrs and εps. See main text for detailed description of all parameters. Figure adapted from ref. 145 with permission from Elsevier.

As discussed in Section 3.2, molecular mechanics uses a harmonic approximation to describe bond stretch and hence cannot model chemical events. The EVB approach employs the Morse potential to specifically represent the energy of breaking and forming bonds in a reaction. This ensures that calculation of reaction energetics is not performed at a significant cost to computing power, as it would be with standard QM/MM approaches.
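The difference between the two bond descriptions can be made concrete with a short numerical comparison. In the sketch below the dissociation energy, width parameter and equilibrium distance are hypothetical values chosen only for illustration. The harmonic force constant is set to k = 2·D·a², so that both potentials have the same curvature at the minimum.

```python
import math

def harmonic(r, k, r0):
    """Harmonic bond stretch: the energy rises without limit, so the bond can never break."""
    return 0.5 * k * (r - r0) ** 2

def morse(r, d_e, a, r0):
    """Morse potential: the energy levels off at the dissociation energy d_e."""
    return d_e * (1.0 - math.exp(-a * (r - r0))) ** 2

# Hypothetical parameters (kcal/mol, 1/Angstrom, Angstrom), not taken from any force field.
D_E, A, R0 = 100.0, 2.0, 1.1
K = 2.0 * D_E * A ** 2   # matches the Morse curvature at r0

if __name__ == "__main__":
    for r in (1.1, 1.2, 1.5, 2.0, 3.0, 5.0):
        print(f"r = {r:4.1f}  harmonic = {harmonic(r, K, R0):9.2f}  morse = {morse(r, D_E, A, R0):7.2f}")
```

Close to the equilibrium distance the two curves are nearly identical, whereas at large separation the Morse energy approaches the dissociation limit while the harmonic energy keeps growing.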

Within an EVB framework, the energy of a diabatic state is expressed mathematically as:

$$\varepsilon_i = H_{ii} = U^{i}_{intra}(R, Q) + U^{i}_{inter}(R, Q, r, q) + U_{solvent}(r, q) + \alpha^{i}_{gas} \qquad (3.8)$$

where R and Q denote the atomic coordinates and charges of the reacting atoms, and r and q denote the atomic coordinates and charges of all the remaining surrounding atoms, which include non-reacting protein atoms as well as solvent molecules. The Uintra term represents the intramolecular interactions of the solute system, Uinter represents the interaction between atoms of the solute and the surrounding solvent, and Usolvent is the energy associated with the solvent. The term $\alpha^{i}_{gas}$ denotes the gas-phase energy, which is a constant value when all the fragments (reacting atoms, solute, and solvent) are considered to be at infinite separation. This can be used to adjust the parabolas with respect to each other for a reference reaction, the free energies of which are known from experiments or QM calculations. The off-diagonal term Hij is the coupling parameter that combines the diabatic states to obtain an adiabatic energy surface, and is represented by a simple exponential function:

$$H_{ij} = A\, e^{-\mu |\Delta R|} \qquad (3.9)$$

where A is a constant, μ is the exponential decay parameter, and ΔR is the distance between atoms of the reacting species. Using the expressions for Hii (representing the diabatic states) and Hij from equations (3.8) and (3.9), the Hamiltonian matrix can be constructed as

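$$H_{EVB} = \begin{pmatrix} \varepsilon_1 & H_{12} \\ H_{21} & \varepsilon_2 \end{pmatrix} \qquad (3.10)$$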

where ε1 and ε2 represent the energies of the valence bond states. The adiabatic ground state energy Eg can be calculated by using this Hamiltonian.

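$$\begin{vmatrix} \varepsilon_1 - E_g & H_{12} \\ H_{21} & \varepsilon_2 - E_g \end{vmatrix} = 0 \qquad (3.11)$$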

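Solving this eigenvalue problem yields two solutions, of which the lower eigenvalue corresponds to the adiabatic ground state energy Eg and the higher eigenvalue to the excited state.146 With H12 = H21, the ground state energy is

$$E_g = \frac{1}{2}\left[(\varepsilon_1 + \varepsilon_2) - \sqrt{(\varepsilon_1 - \varepsilon_2)^2 + 4 \cdot H_{12}^2}\right] \qquad (3.12)$$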
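The mixing of two diabatic states into an adiabatic ground state can be sketched numerically. The example below assumes two hypothetical parabolic diabatic states along a one-dimensional coordinate, a constant off-diagonal coupling H12 = H21, and the lower root of the 2×2 secular equation. All parameter values are illustrative and are not fitted EVB parameters.

```python
import math

def diabatic_energies(x, k=100.0, x_rs=-0.5, x_ps=0.5, alpha=-5.0):
    """Two parabolic diabatic states; alpha shifts the product state relative to the reactant state."""
    eps1 = 0.5 * k * (x - x_rs) ** 2
    eps2 = 0.5 * k * (x - x_ps) ** 2 + alpha
    return eps1, eps2

def ground_state(eps1, eps2, h12):
    """Lower eigenvalue of the 2x2 EVB Hamiltonian [[eps1, h12], [h12, eps2]]."""
    return 0.5 * ((eps1 + eps2) - math.sqrt((eps1 - eps2) ** 2 + 4.0 * h12 ** 2))

def profile(h12, n=201, x_min=-1.0, x_max=1.0):
    """Adiabatic ground state energy on a regular grid of the coordinate x."""
    xs = [x_min + i * (x_max - x_min) / (n - 1) for i in range(n)]
    return xs, [ground_state(*diabatic_energies(x), h12) for x in xs]
```

Where the two parabolas cross (ε1 = ε2), the ground state lies exactly |H12| below the crossing energy. Away from the crossing it approaches the lower of the two diabatic curves, which is the behavior depicted schematically in Figure 7.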

The activation free energy, ΔG‡, can be calculated by gradually changing the system from one state to another using a “mapping potential”, Uλ, along the coordinate of λ.

$$U_\lambda = (1 - \lambda)\, \varepsilon_1 + \lambda\, \varepsilon_2 \qquad (3.13)$$

By incrementally varying the value of λ, the free energy can be obtained using the FEP/US method as outlined in Section 3.2.2. The free energy difference between each λ window is evaluated as

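$$\Delta G(\lambda_m \to \lambda_{m+1}) = -RT \cdot \ln \left\langle e^{-(U_{\lambda_{m+1}} - U_{\lambda_m})/RT} \right\rangle_{\lambda_m} \qquad (3.14)$$

The energy gap, Ugap, is used as a generalized reaction coordinate to construct the free energy profile in EVB. It is the difference in the energies of the two diabatic states for a given set of coordinates r.

$$U_{gap}(r) = \varepsilon_1(r) - \varepsilon_2(r) \qquad (3.15)$$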

EVB calculations provide data on the free energies of a catalyzed reaction, and comparing free energy profiles of different enzyme variants gives a good account of the effect of mutations on the catalytic step. In addition to this, analysis of energy contributions coming from various parts of the system (the reacting atoms, surrounding protein atoms, and solvent molecules) can provide greater insights into the catalytic mechanism. For example, the linear response approximation (LRA)147 can be used to calculate differences in interaction free energies between any two states. These energies can be decomposed into the contributions of specific amino acids or water molecules to the reacting species.

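$$\Delta G_{LRA} = \frac{1}{2} \cdot \left( \left\langle U_2 - U_1 \right\rangle_1 + \left\langle U_2 - U_1 \right\rangle_2 \right) \qquad (3.16)$$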
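In practice, the LRA estimate is the mean of two ensemble averages of the same interaction-energy difference, one collected on each state. A minimal sketch is given below, assuming that the differences U2 − U1 have already been extracted per frame. The function names and the group labels in the usage note are hypothetical, and the numbers are made up purely to exercise the code.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)

def lra_free_energy(gap_on_state1, gap_on_state2):
    """LRA estimate: 0.5 * (<U2 - U1>_1 + <U2 - U1>_2).

    gap_on_state1: values of U2 - U1 from frames sampled on state 1
    gap_on_state2: values of U2 - U1 from frames sampled on state 2
    """
    return 0.5 * (mean(gap_on_state1) + mean(gap_on_state2))

def lra_per_group(gaps_state1, gaps_state2):
    """Apply the LRA to each group (e.g. one residue or water molecule) separately.

    Both arguments map a group label to its list of interaction-energy differences.
    """
    return {label: lra_free_energy(gaps_state1[label], gaps_state2[label])
            for label in gaps_state1}
```

For instance, `lra_per_group({"residue_A": [3.0, 5.0]}, {"residue_A": [-1.0, 1.0]})` returns a contribution of 2.0 for the placeholder group `residue_A`. Applying the same average group by group is what allows the total to be broken down into contributions from individual residues or water molecules.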

The work showcased in this thesis has used the EVB approach to provide a microscopic rationale for the catalytic mechanism of enzyme-catalyzed reactions in TIM, especially with regard to the analyses involving linear free energy relationships (LFER), which correlate the computed values of activation free energies and thermodynamic free energies. Chapter 5 will expand on the utility of this analysis with respect to the work done in Papers III and IV.

4. Conformational Dynamics of Aβ peptide

The amyloid-β peptide (Aβ) is an intrinsically disordered protein of length ranging from 39 to 43 amino acid residues.148 The close association of this peptide with the pathology of Alzheimer’s disease (AD), a neurodegenerative disorder, has garnered tremendous interest among various disciplines of the scientific community. The Aβ peptide, owing to lack of a distinct stable structural form, is prone to aggregation into oligomers and extremely stable fibril-like structures.47,149 From a medical perspective, it is of utmost value to understand how Aβ aggregation is related to AD progression. It is still unclear whether Aβ misfolding and aggregation is a cause, an effect, or a combination of both for disease development.43

Structural biophysicists and biochemists are particularly interested in the process of misfolding and aggregation, the various structural forms that Aβ attains during its conformational transitions, the kinetic and thermodynamic parameters associated with these events, and the effect of external influences.51,150,151 Computational approaches such as MD simulations and other enhanced sampling methods can be useful in not only visualizing the structural changes in the Aβ peptide as monomers or oligomers, but also in studying the influence of external factors that are both physical (temperature) and chemical (small molecules, metal ions, surfactants, lipids) in nature.47,51

In my research investigations on the Aβ peptide, I have used computational approaches to: (A) characterize the interactions between Aβ and divalent metal ions, and (B) study the structural implications of Aβ binding to amphiphilic surfactants. These studies were carried out in collaboration with experimental biophysicists, who used a range of techniques to obtain information about binding affinity, secondary structure, aggregation kinetics, and images of aggregates. Long timescale MD simulations were vital for configurational sampling of Aβ peptide bound to metal ions and surfactant micelles, and provided greater support to the findings by complementing the experimental results well.

4.1 Binding Interactions with Metal Ions (Paper I)

The possibility of metals playing a role in the pathogenesis of AD has been evidenced by numerous clinical and biochemical studies.152–155 The fact that a monomeric Aβ peptide bears a net negative charge of around -3 at physiological pH warranted an understanding of its binding interactions with positively charged small molecules such as polyamines,156 and with metal ions.157,158

The study reported in Paper I148 characterized binding interactions of the Aβ peptide with divalent metal ions using biophysical experiments as well as computer simulations, with my contribution being to the latter. Specifically, zinc (Zn2+), copper (Cu2+), and manganese (Mn2+) ions were chosen for the study. The binding affinity of the Aβ peptide towards Zn2+ and Cu2+ ions was known to be in the micromolar to nanomolar range,159,160 indicating a probable well-defined binding site in the Aβ peptide that is composed of three histidines (H6, H13, H14) and the N-terminal aspartate (D1).161–164 Beyond this, the binding of Mn2+ ions to Aβ peptide had not been extensively studied in the literature. Manganese is known to exist in multiple oxidation states and plays a key role in various biological processes.165–167 Thus, this work was significant as it was aimed at characterizing the binding site for all three metal ions, and also at studying the effect of Mn2+-binding on Aβ aggregation.

As mentioned in Section 2.1.3, two isoforms of the Aβ peptide are commonly studied: Aβ(1-40) and Aβ(1-42). In this study, Aβ(1-40) was used for various biophysical experiments, whereas a simpler model comprising the hydrophilic N-terminal region Aβ(1-16) was used in MD simulations. This truncated region of the peptide was chosen due to its likelihood to contain a specific binding site for metal ions in the form of polar amino acid residues (Figure 2). Additionally, it also meant a smaller system size for the simulations, which ensured longer sampling times with lower expenditure of computational resources. This strategy was drafted keeping in mind the primary objective of the study, which was to characterize the binding site of metal ions, and not to sample the configurational space of the Aβ(1-40) peptide.

Long timescale MD simulations of Aβ(1-16) in complex with metal ions (Zn2+, Cu2+, Mn2+) in a 1:1 stoichiometry resulted in extensive configurational sampling. Among all the metal ions, binding of Zn2+ ion to Aβ(1-16) was the most stable, with H6, H13, H14 and E11 forming the coordination center (Figure 8A). Cu2+ too was fairly stable and had a coordination center similar to that of Zn2+, except for the dissociation of H13 (Figure 8B). Contrary to both Zn2+ and Cu2+, Mn2+ was observed to interact with the residues D1, E3, D7 and E11 with occasional changes in binding modes (Figure 8C). These theoretical observations were found to be in good agreement with experimental observations, which showed that while binding of Zn2+ and Cu2+ to Aβ peptide is stable and at a well-defined coordination center, Mn2+-binding is relatively weak and transient in nature. It was further concluded that Mn2+-binding might not have an influence on Aβ aggregation.

Figure 8. Aβ(1-16) peptide in complex with (A) Zn2+ (orange), (B) Cu2+ (green), and (C) Mn2+ (purple) ions. Shown here are representative structures of the top-ranked clusters obtained in each case upon performing clustering analysis of MD simulation trajectories. The starting conformation in each case is the divalent metal coordinated by H6, E11, H13 and H14. Distances indicated in the figure correspond to average values over all trajectory frames for each system. Figure adapted and reprinted from ref. 148 with permission from Elsevier.

Binding of Aβ to metal ions can influence initiation or acceleration of peptide aggregation by trapping the monomeric peptide in a partially folded conformation. Although this computational study could not explore the effect of metal ion binding on Aβ aggregation, it was able to confirm the composition and stability of the binding site for Zn2+ and Cu2+ ions. It also provided first-hand information about the characteristics of Mn2+-binding to Aβ peptide.

4.2 Binding Interactions with Surfactants (Paper II)

The Aβ peptide is largely composed of hydrophobic amino acid residues, except for the N-terminal region, which is relatively hydrophilic as a result of a higher concentration of polar and charged amino acid residues (Figure 2). Likewise, lipid molecules that form vesicles and membranes are amphiphilic, meaning that they are made up of a long hydrophobic tail and a polar headgroup which could carry a positive, negative or neutral charge. The similarity of this differential in charge, polarity and hydrophobicity across a monomeric Aβ peptide and a lipid molecule can potentially lead to binding interactions between their respective hydrophilic and hydrophobic regions. Understanding the nature of interactions between the Aβ peptide and lipid membranes is therefore of significant interest.

The goal of this study, as reported in Paper II,168 was to characterize the binding interactions between the Aβ(1-40) peptide and micelles of amphiphilic surfactants. These surfactants share molecular similarities with lipid molecules, and their micellar forms function as biomembrane mimetics. As was the case in the previous study (Paper I)148 concerning metal ion interactions, a combination of biophysical experiments and computer simulations was used in this work as well. However, this study used the full-length Aβ(1-40) peptide for MD simulations, as both the hydrophilic and hydrophobic binding interactions had to be characterized. Surfactants differing in physical and chemical properties such as charge on the headgroup, length of the aliphatic chain, micellar size, and aggregation number were selected. For the computational study, a non-ionic and a negatively charged surfactant, dodecyl β-D-maltoside (DDM) and sodium dodecyl sulfate (SDS) respectively, were chosen to analyze the effect of electrostatics and micellar surface area on Aβ binding.

An NMR study showed that Aβ attains an α-helical structure, particularly in the core and C-terminal regions, when bound to SDS micelles, with the N-terminal region remaining relatively disordered.169 Hence, two different configurations of the Aβ(1-40) peptide were studied: one in which the peptide was a random coil and bound to the micellar surface, and the other in which the α-helical portion of the peptide was buried into the micelle.

MD simulations showed that the random coil peptide remained bound to the micellar surface with its hydrophobic region interacting with shallow hydrophobic grooves on the micelle. This was observed in the cases of both DDM and SDS micelles (Figure 9A and 9B). The α-helical peptide, on the other hand, displayed peculiar behavior during the simulations, the most noticeable one being the formation of a kink in the α-helix at residues N27 and K28. This feature was consistent across both micelle systems (Figure 9C and 9D). However, the defining difference between the interactions of Aβ(1-40) with DDM and SDS micelles was the binding affinity, which was computationally estimated. The DDM-bound peptide moved towards the micellar surface, with the N-terminal region showing weak interactions with the non-ionic headgroup of DDM. On the other hand, the peptide was able to tightly bind to the SDS micelle, with its hydrophilic and disordered N-terminal region making strong electrostatic contacts with the negatively charged headgroups and the α-helical part binding to deep hydrophobic grooves on the micelle.

Figure 9. Top-ranked cluster representations from MD simulation trajectories of (A) a disordered Aβ(1-40) peptide in complex with a DDM micelle, (B) a disordered Aβ(1-40) peptide in complex with an SDS micelle, (C) an α-helical Aβ(1-40) peptide in complex with a DDM micelle, and (D) an α-helical Aβ(1-40) peptide in complex with an SDS micelle. Cartoon representation of the peptide is shown in purple, with the hydrophobic regions highlighted in yellow. The micelles are shown in transparent surface representation, with carbon atoms shown in white, oxygen atoms shown in red, and sulfur atoms shown in yellow. Reprinted (adapted) with permission from Ref. 168. Copyright 2018 American Chemical Society.

The study thus underlined the importance of electrostatic and hydrophobic interactions between the Aβ peptide and surfactants. Computational methods were able to elucidate the structure and dynamics of these binding partners, and in the process support relevant data obtained from experiments. The scope of these results could be extended also to lipid molecules of cell membranes. Membrane surfaces formed by negatively charged groups could potentially act as strong anchor points for free Aβ peptides and act as surface catalysts for Aβ aggregation. The findings of this study can be of potential use for AD research.

4.3 Perspectives

Scientific studies on Alzheimer’s disease and the Aβ peptide are being carried out globally by a large number of research groups. Each incremental step towards understanding Aβ aggregation and its association with AD pathology can potentially go a long way towards reaching the ultimate objective of this research, which is to develop a cure or preventive therapy for this neurodegenerative disease. The collaborative work showcased in this chapter is an effort to answer the questions pertaining to the nature of Aβ binding to physiologically relevant partner molecules. In these investigations, computational methods in the form of conventional MD simulations have played a key role in putting the pieces of the jigsaw puzzle together with experimental observations, and furthering our understanding of the structure and dynamics of IDPs.

5. Computational Modeling of Enzyme Catalysis

Scientific understanding of the origins of enzyme catalysis is an endeavor that presents challenges of various kinds for different classes of enzymes. While some enzymes make use of external cofactors170 and metal ions171 to achieve catalysis, there are other enzymes that are able to efficiently utilize their structural scaffold to catalyze a chemical reaction.96 Triosephosphate isomerase (TIM), an enzyme belonging to the latter category, has been the subject of numerous mechanistic studies mainly due to its excellent catalytic proficiency.99,113,116,172,173 There exists a wealth of biochemical and structural knowledge96,97 about this enzyme, making it an ideal model for performing computational investigations to get more insights into its catalytic origins.

Figure 10. Reaction scheme of the mechanism of TIM-catalyzed reversible isomerization of D-glyceraldehyde 3-phosphate (GAP) to dihydroxyacetone phosphate (DHAP) via enediolate intermediates. Reprinted (adapted) with permission from Ref. 174. Copyright 2019 American Chemical Society.

The reaction catalyzed by TIM is the reversible conversion of dihydroxyacetone phosphate (DHAP) to D-glyceraldehyde 3-phosphate (GAP), via enzyme-bound enediolate intermediates (Figure 10). Although the reaction scheme showcases the mechanism as a series of simple proton transfer steps, the actual events occurring in the enzyme are much more complex and involve a proper orchestration of the ensemble of amino acid residues that form the active site. The work described in this chapter is predominantly dedicated to computational modeling of TIM-catalyzed substrate deprotonation using the empirical valence bond (EVB) method.

5.1 Modeling the Hydrophobic Clamp of TIM (Paper III)

The most noteworthy structural feature of TIM that is crucial for ensuring efficient catalysis is the large conformational change of “loop 6” to close over the active site.97 Loop 6 is important for two reasons. Firstly, it bears a bulky non-polar residue in the form of I170 which desolvates the active site upon closure of loop 6. This, in conjunction with the action of another bulky non-polar residue, L230, located on the other side of the active site, forms a hydrophobic cage around the general base E165.175 Secondly, the main chain nitrogen atom of G171 makes a hydrogen bond interaction with the phosphodianion oxygen of the substrate, thus anchoring the substrate to the closed loop (Figure 11A).

Figure 11. Structure of the active site in yeast TIM (PDB ID: 1NEY)13,100 in Michaelis complex with DHAP (green). Main chain atoms of the protein are shown in cartoon representation (blue). E165, G171 and S211 are shown in grey, whereas the hydrophobic residues I170 and L230 are shown in yellow. (A) Side view of the active site illustrating hydrogen bond interactions between substrate phosphodianion and main chain nitrogens of G171 and S211. (B) Top view of the active site illustrating the hydrophobic clamp formed by the sidechains of I170 and L230 around E165 and the substrate. Figure generated using PyMOL.41

Deprotonation of the carbon acid of substrates DHAP and GAP by a general base in the form of E165 is the rate-limiting step in TIM-catalyzed isomerization.176 The ability of E165 to abstract a proton from the substrate is influenced by structural changes around the active site of the enzyme. One of these is the closure of loop 6 that brings the bulky hydrophobic sidechain of I170 in close proximity to E165, desolvates the active site, and increases the basicity of E165. In the catalytically competent conformation of the active site, I170 and L230 flank the general base and bound substrate on both sides, thus resulting in a hydrophobic clamp (Figure 11B).

The effect of this hydrophobic clamp on the rate-limiting deprotonation step was studied using EVB calculations by calculating the free energy barriers in wild-type TIM, as well as the mutants I170A, L230A, and the double mutant I170A/L230A.177 The extent of catalytic effect in the wild-type enzyme on substrate deprotonation was also calculated by comparing the ΔG‡ and ΔG0 values with those for the uncatalyzed reaction in solution. Kinetic data available from biochemical experiments on these variants of TbbTIM175 were used to validate the EVB model. The calculations successfully reproduced experimentally determined ΔG‡ values for deprotonation of both DHAP and GAP in all four enzyme variants (Table 1).

Table 1. Activation (ΔG‡) and Gibbs free energies (ΔG0) for TIM-catalyzed deprotonation of DHAP and GAP in wild-type (WT) and mutant variants of TIM. All energies are shown in kcal·mol⁻¹.

| Substrate | Catalyst | ΔG‡calc | ΔG0calc | ΔG‡exp (a) | ΔG‡calc - ΔG‡exp | # of waters (b) |
|---|---|---|---|---|---|---|
| DHAP | CH3CH2CO2⁻ in water | 25.2 ± 0.2 | 18.9 ± 0.2 | - | - | - |
| DHAP | WT-TIM | 14.5 ± 1.4 | 5.6 ± 1.8 | 14.1 | 0.4 | 1.2 |
| DHAP | I170A | 16.3 ± 1.5 | 7.6 ± 1.4 | 15.8 | 0.5 | 2.0 |
| DHAP | L230A | 16.7 ± 0.8 | 8.6 ± 0.8 | 16.6 | 0.1 | 2.2 |
| DHAP | I170A/L230A | 18.5 ± 1.0 | 11.0 ± 1.3 | 17.4 | 1.1 | 2.6 |
| GAP | CH3CH2CO2⁻ in water | 24.1 ± 0.2 | 16.1 ± 0.2 | - | - | - |
| GAP | WT-TIM | 12.9 ± 0.8 | 2.5 ± 0.9 | 12.9 | 0.0 | 2.1 |
| GAP | I170A | 16.2 ± 1.7 | 5.7 ± 1.9 | 16.0 | 0.2 | 2.8 |
| GAP | L230A | 14.9 ± 0.8 | 3.1 ± 1.0 | 14.2 | 0.7 | 2.2 |
| GAP | I170A/L230A | 16.5 ± 1.4 | 5.4 ± 1.8 | 16.3 | 0.2 | 3.0 |

(a) Activation barriers calculated from experimentally determined kinetic parameters175,178,179 using Transition State Theory (TST).
(b) Average number of water molecules within 4 Å of either of the E165 carboxylate oxygen atoms at the transition state.
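The conversion between a rate constant and an activation free energy via transition state theory can be written in a few lines. The sketch below uses the Eyring form ΔG‡ = −RT ln(k·h / (kB·T)) with a transmission coefficient of 1. The default temperature of 298.15 K is an assumption made here for illustration and is not stated in the source, so the conversion should be repeated at the temperature of the actual kinetic measurements.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34    # Planck constant, J*s
R = 1.987204e-3       # gas constant, kcal/(mol*K)

def barrier_from_rate(k, temperature=298.15):
    """Activation free energy (kcal/mol) from a first-order rate constant k (1/s)."""
    return -R * temperature * math.log(k * H / (K_B * temperature))

def rate_from_barrier(dg, temperature=298.15):
    """First-order rate constant (1/s) from an activation free energy (kcal/mol)."""
    return (K_B * temperature / H) * math.exp(-dg / (R * temperature))
```

At this temperature, RT·ln(10) is roughly 1.36 kcal·mol⁻¹, so a tenfold change in the rate constant corresponds to a change in the barrier of that size. This provides a useful scale for judging the differences between calculated and experimental barriers in Table 1.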

The alkyl ammonium side chain of K12 (Figure 12) has a dominant contribution towards stabilization of the negative charge developed at the deprotonated carbon of the substrate. However, the electrostatic contribution from this residue and other key active site residues did not vary significantly between the enzyme variants to cause the observed changes in the free energies. Further analysis of data extracted from the EVB simulations revealed a direct correlation between calculated values of ΔG‡ and the number of water molecules in the vicinity of the carboxylate group of E165 across all TIM variants (Table 1). This finding reinforced our hypothesis regarding the role of the hydrophobic clamp to increase the basicity of E165. The residues I170 and L230 thus function as an important building block to facilitate optimal stabilizing interactions between the active site residues and the transition state.
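The solvation measure reported in Table 1 (water molecules within 4 Å of either carboxylate oxygen) is a simple distance count averaged over trajectory frames. A minimal sketch follows, assuming that the coordinates of the two carboxylate oxygens and of the water oxygens have already been extracted for each frame as (x, y, z) tuples in Å. The data layout and function names are hypothetical, and using the water oxygen as the reference atom is an assumption.

```python
import math

def waters_near_carboxylate(carboxylate_oxygens, water_oxygens, cutoff=4.0):
    """Count waters whose oxygen lies within the cutoff of either carboxylate oxygen.

    A water close to both carboxylate oxygens is counted only once.
    """
    count = 0
    for water in water_oxygens:
        if any(math.dist(water, oxygen) <= cutoff for oxygen in carboxylate_oxygens):
            count += 1
    return count

def average_count(frames, cutoff=4.0):
    """Average the count over frames; each frame is (carboxylate_oxygens, water_oxygens)."""
    counts = [waters_near_carboxylate(carb, waters, cutoff) for carb, waters in frames]
    return sum(counts) / len(counts)
```

Averaging over frames is what produces non-integer values such as those listed in the last column of Table 1.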

5.2 Role of Key Active Site Residues in TIM (Paper IV) To further our understanding of the influence of other key amino acid residues in the active site, the EVB setup validated in the previous study (Paper III)177 was subjected to model catalysis in some more mutants of TIM.174 K12, with its positively charged sidechain, is extremely important for electrostatic stabi-lization of the negative charge developing at the deprotonated carbon of the enediolate intermediate.180–182 It is strategically placed at such a location that it also offers stabilization to the negative charge of the phosphodianion moiety of the substrate. K12 is also ion-paired with a second-shell residue E97. This interaction is key to maintaining the correct spatial position of K12. In addi-tion to these two residues, the role of P166 was also investigated (Figure 12). P166, located adjacent to the catalytic general base E165, plays an important role in positioning the side chain of E165 into the active site for performing the proton transfer steps. This involves a combination of conformational changes in loops 6 and 7 that trigger a steric clash between the side chain of P166 and the carbonyl oxygen of G209 on loop 7. The effects of K12 and E97 on TIM-catalyzed deprotonation of DHAP and GAP were studied using the single mutants K12G, E97A, E97D, E97Q, and the double mutant K12G/E97A. These mutations were chosen based on exist-ing experimental data which could be used to further validate the EVB model. The K12G substitution was found to significantly affect electrostatic stabili-zation of the transition state, which was reflected in the increases in calculated activation barriers for substrate deprotonation in the K12G mutants as com-pared to wild-type TIM (Figure 13). The E97A, E97D, and E97Q substitutions had little to no direct influence on the stabilization of the transition state, and the double substitution of K12G/E97A had a similar effect as K12G on the calculated activation free energy. 
This confirmed that K12 contributes directly towards the stabilization of the anionic enediolate intermediate, and E97 is involved in the preorganization of the K12 side chain, but does not influence transition state stabilization.

Figure 12. Structure of the active site in yeast TIM (PDB ID: 1NEY) [13,100] in Michaelis complex with DHAP. Shown here are sidechains of key active-site residues: K12, E97, E165, P166, I170, and L230. Reprinted (adapted) with permission from Ref. 174. Copyright 2019 American Chemical Society.

The P166A substitution provided an exciting opportunity to model the catalytic effect of this critical residue using EVB. Structural data for this mutant in TbbTIM [183] showed that the E165 side chain had adopted a “swung out” conformation rather than the catalytically competent “swung in” conformation observed in yTIM [100] (Figure 14A). This feature was captured in the EVB calculations as well (Figure 14B), and the effect of the perturbed donor-acceptor distances between E165 and the substrate on the deprotonation step was visible as increases of 2.2 and 2.7 kcal·mol⁻¹ in the activation barriers for DHAP and GAP, respectively (Figure 13).

In addition, the mutants studied in Paper III [177] – I170A, L230A, and I170A/L230A – were revisited and modeled using EVB. The calculated values of ΔG‡ for all enzyme variants examined in this study were in excellent agreement with experimental data (Figure 13). The highlight of this study was the extended Brønsted relationship for TIM-catalyzed substrate deprotonation, which showed a strong linear correlation between the computed values of activation free energies and thermodynamic free energies for wild-type TIM and nine other mutants (β = 0.73 for DHAP, β = 0.74 for GAP) (Figure 15).

Figure 13. Activation free energies for TIM-catalyzed deprotonation of (A) DHAP and (B) GAP, determined by EVB calculations (blue) and derived from experimental kinetic parameters (tan). Reprinted (adapted) with permission from Ref. 174. Copyright 2019 American Chemical Society.
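The experimental values in Figure 13 are described as derived from kinetic parameters. A minimal sketch of such a conversion is given below, assuming the Eyring equation of transition state theory with a transmission coefficient of 1; the thesis does not spell out the formula at this point, so this is an assumption. The function names, the temperature of 298.15 K, and the example rate constant are illustrative and are not taken from Paper IV.

```python
import math

# Physical constants.
K_B = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34    # Planck constant, J*s
R = 1.987204259e-3    # gas constant, kcal/(mol*K)


def barrier_from_rate(k_cat: float, temperature: float = 298.15) -> float:
    """Eyring equation: activation free energy (kcal/mol) from a rate constant (1/s)."""
    if k_cat <= 0:
        raise ValueError("rate constant must be positive")
    return -R * temperature * math.log(k_cat * H / (K_B * temperature))


def rate_ratio_from_barrier_change(ddg: float, temperature: float = 298.15) -> float:
    """Factor by which the rate drops when the barrier rises by ddg (kcal/mol)."""
    return math.exp(ddg / (R * temperature))


if __name__ == "__main__":
    # Hypothetical rate constant, chosen only to illustrate the conversion.
    print(f"{barrier_from_rate(1000.0):.1f} kcal/mol")
    # Barrier increases quoted in the text for the P166A variant.
    for ddg in (2.2, 2.7):
        ratio = rate_ratio_from_barrier_change(ddg)
        print(f"+{ddg} kcal/mol -> about {ratio:.0f}-fold slower")
```

Under these assumptions, barrier increases of 2.2 and 2.7 kcal·mol⁻¹ correspond to rate reductions of roughly 40-fold and 95-fold at 298.15 K.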

Figure 14. Structure overlays of wild-type and P166A TIM illustrating the “swung in” and “swung out” conformations of the E165 sidechain. (A) Overlay of X-ray structures of ligand-bound wild-type yeast TIM (PDB ID: 1NEY, dark green, E165 “swung in”) and ligand-bound P166A TbbTIM (PDB ID: 2J27, tan, E165 “swung out”). (B) Overlay of top-ranked cluster representative structures from EVB simulations at the Michaelis complex for wild-type (light green, E165 “swung in”) and P166A TIM (pink, E165 “swung out”). (C) Overlay of all structures shown in panels (A) and (B). Reprinted (adapted) with permission from Ref. 174. Copyright 2019 American Chemical Society.

Figure 15. Linear free energy relationships between calculated values of ΔG‡ and ΔG⁰ for TIM-catalyzed deprotonation of DHAP and GAP. Black circles: uncatalyzed reaction, red circles: single mutations of polar sidechains (K12G and E97A), green circles: single mutations of nonpolar sidechains (P166A, I170A, and L230A), orange circle: I170A/L230A. Reprinted (adapted) with permission from Ref. 174. Copyright 2019 American Chemical Society.
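The Brønsted coefficient β quoted above can be read as the slope of a straight-line fit of ΔG‡ against ΔG⁰ across the enzyme variants. The sketch below shows such a fit with ordinary least squares, which is an assumption about the fitting procedure. The function name and the data points are hypothetical: the points are constructed to lie on a line of slope 0.75 and are not the computed values from Paper IV.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally sized samples with at least two points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Hypothetical reaction free energies and barriers (kcal/mol), illustration only.
    dg0 = [2.0, 4.0, 6.0, 8.0]
    dg_act = [11.5, 13.0, 14.5, 16.0]
    beta, intercept, r2 = fit_line(dg0, dg_act)
    print(f"beta = {beta:.2f}, intercept = {intercept:.1f}, R^2 = {r2:.3f}")
```

A slope between 0 and 1 means that only part of any change in the reaction free energy shows up in the barrier.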

This computational study thus underlined the contribution of non-polar residues (P166, I170, L230) towards the generation of a catalytic cage, and the role of electrostatic preorganization of polar active site side chains (K12) in transition state stabilization.

5.3 Effect of Dianion Binding on Catalysis (Paper V)

While the chemical step in TIM-catalyzed isomerization occurs at one end of the substrate molecule, the phosphodianion group at the other end does not actively participate in the chemical reaction. It is, however, essential for anchoring the substrate to the enzyme by forming hydrogen bonds with residues on loops 6 and 7. The effect of the substrate phosphodianion on catalysis in TIM has been experimentally investigated by using the substrate fragments glycolaldehyde (GA) and phosphite dianion (HPi) (Figure 16) [184,185]. However, determining its direct effect on the chemical step required a computational study in which the substrate deprotonation step could be modeled for GA with and without the HPi fragment.

Figure 16. Chemical structures of the substrate fragments glycolaldehyde (GA), phosphite dianion (HPi), and the full substrate GAP.

The EVB approach was employed to model the energetics for the deprotonation of substrate fragments GA and GA•HPi in wild-type TIM. The calculated activation barriers of 15.0 ± 2.4 and 15.5 ± 3.5 kcal·mol⁻¹ for GA and GA•HPi, respectively, showed a small increase of ≤2.6 kcal·mol⁻¹ in comparison to the corresponding value for the full substrate GAP [113,172], which was attributed to an entropic effect resulting from increased conformational flexibility of the substrate fragments in the active site of TIM. The increase in calculated ΔG‡ could not fully account for the intrinsic binding energy of the dianionic phosphoryl group. This study [186] concluded that the dianion binding energy used to drive a conformational change in the enzyme is fully expressed at the Michaelis complex, thereby corroborating the role of ligand-gated conformational change in enzyme catalysis.

5.4 Dynamics of Loop 6 in TIM (Paper VI)

Closure of loop 6 is important for effective substrate binding and generation of a catalytically competent hydrophobic active site environment. Many earlier studies based on NMR data [107,187,188], structural models of TIM in open-loop and closed-loop conformations [97,100,105–107], and computational studies employing simplified models or insufficient simulation timescales [189] had led to the description of loop 6 dynamics as a simple two-state rigid-body motion [190]. On the contrary, conformational sampling from EVB calculations in Papers III to V [174,177,186] revealed that loop 6 undergoes large conformational changes and takes on multiple different conformations in its open form. To resolve this discrepancy and comprehensively characterize the conformational transitions of loop 6, a detailed computational study was performed using long-timescale conventional MD simulations and an assortment of enhanced sampling methods [191].

Crystal structures of unliganded and ligand-bound TIM with loop 6 in open and closed states were chosen as starting points for conventional MD simulations for extensive conformational sampling of loop 6. In all cases, the loop opened up substantially within the first 10-20 ns of simulation time, whereas loop closure was too slow to be captured within reasonable simulation timescales using unbiased MD simulations. Enhanced sampling approaches such as Hamiltonian Replica Exchange MD (HREX-MD) and Biased Exchange Metadynamics (BE-METAD) were employed to speed up the sampling rate and capture the closed state of loop 6. Altogether, it was observed that the conformational landscape for the “open” state of loop 6 was much larger than that for the closed state (Figure 17). It was also observed that loop 6 existed in various open conformations which do not qualify as being catalytically competent. Markov state models (MSMs), constructed using data obtained from conventional MD simulations, revealed swift transitions of loop 6 between various open conformations.
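The central quantity of a Markov state model is a matrix of transition probabilities between discrete conformational states, estimated at a chosen lag time. The sketch below illustrates only this counting step on a toy state sequence. It is not the workflow used in Paper VI: the function name, the three-state labeling, and the trajectory are hypothetical, and a production MSM would additionally involve feature selection, clustering, lag-time validation, and typically a reversible estimator.

```python
from collections import Counter


def transition_matrix(dtraj, n_states, lag=1):
    """Row-normalized transition probabilities estimated at the given lag.

    dtraj is a sequence of integer state labels in the range [0, n_states).
    """
    if lag < 1 or lag >= len(dtraj):
        raise ValueError("lag must be at least 1 and shorter than the trajectory")
    # Count every observed jump from the state at time t to the state at t + lag.
    counts = Counter(zip(dtraj[:-lag], dtraj[lag:]))
    matrix = []
    for i in range(n_states):
        row_total = sum(counts[(i, j)] for j in range(n_states))
        if row_total == 0:
            # State i never appears as a starting point; leave its row empty.
            matrix.append([0.0] * n_states)
        else:
            matrix.append([counts[(i, j)] / row_total for j in range(n_states)])
    return matrix


if __name__ == "__main__":
    # Toy trajectory over three hypothetical loop states (labels 0, 1, 2).
    toy = [0, 0, 1, 1, 0, 2, 2, 1, 0, 0]
    for row in transition_matrix(toy, n_states=3):
        print([round(p, 3) for p in row])
```

Large off-diagonal entries between two states indicate frequent interconversion at the chosen lag time, which is the sense in which an MSM can reveal swift transitions between open conformations.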

Figure 17. Principal component analysis of conventional MD simulations on (A) apo and (B) DHAP-bound structures of wild-type TIM. The principal components are based on the distances between Cα atoms of loops 6 and 7. Shown here are free energy surfaces indicating the conformational space explored by these loops. PC1 effectively describes the opening and closing motion of loop 6. The open and closed states of the loops derived from the respective crystal structures for apo-TIM (PDB ID: 1YPI [13,106], ▼) and DHAP-TIM (PDB ID: 1NEY [13,100], ▲) are projected here. Reprinted (adapted) with permission from Ref. 191. Copyright 2018 American Chemical Society.
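Free energy surfaces of this kind are commonly obtained by Boltzmann inversion of the sampled population along the chosen coordinates, F = −RT ln(p/p_max). The sketch below applies this to a one-dimensional histogram, for example of projections onto PC1. It is an illustration under stated assumptions: the thesis does not detail how the surfaces in Figure 17 were computed, the figure itself is two-dimensional, and the function name, bin settings, temperature, and sample values are hypothetical.

```python
import math

R = 1.987204259e-3  # gas constant, kcal/(mol*K)


def free_energy_profile(samples, n_bins, lo, hi, temperature=300.0):
    """Boltzmann-invert a 1D histogram: F_i = -RT ln(p_i / p_max), in kcal/mol.

    Bins that were never visited are assigned math.inf.
    """
    if n_bins < 1 or hi <= lo:
        raise ValueError("need at least one bin and hi > lo")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for value in samples:
        if lo <= value <= hi:
            # Clamp so that value == hi falls into the last bin.
            index = min(int((value - lo) / width), n_bins - 1)
            counts[index] += 1
    peak = max(counts)
    if peak == 0:
        raise ValueError("no samples fall inside [lo, hi]")
    rt = R * temperature
    return [(-rt * math.log(c / peak)) if c else math.inf for c in counts]


if __name__ == "__main__":
    # Hypothetical PC1 projections: eight frames in one basin, two in another.
    projections = [0.1] * 8 + [0.6] * 2
    print(free_energy_profile(projections, n_bins=2, lo=0.0, hi=1.0))
```

The most populated bin defines the zero of free energy, so a basin visited four times less often lies RT ln 4 higher.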

The link between loop dynamics and catalysis in TIM was also explored by performing EVB calculations to model the deprotonation of DHAP at various conformational states of loop 6. A significant falloff in catalysis was observed when loop 6 conformations deviated from the catalytically competent closed state. These open-loop conformations sampled from BE-METAD had the substrate and key active site sidechains such as I170 displaced from their original positions, resulting in disruption of key enzyme-substrate interactions and flooding of the active site with water molecules.

This study provided conclusive evidence, for the first time, that the motion of loop 6 in TIM is highly complex and dynamic, and is not a simple two-state rigid-body motion as previously assumed in the literature. Closure of TIM loop 6 into a catalytically competent conformation was observed to be a rare event at simulation timescales, with its odds only slightly improving in the presence of a substrate or ligand. Simulating the dynamics of flexible enzyme loops is therefore challenging, and demands strategic use of computational approaches and resources.

5.5 Perspectives

Understanding the origins of enzyme catalysis has been the holy grail of biochemical investigations ever since Fischer proposed the “Lock and Key” hypothesis [70]. Improvements in experimental techniques over the past century have guided significant advancements in the elucidation of catalytic mechanisms in enzymes. The past few decades have witnessed computational approaches gain traction in the field of enzymology, targeted at characterizing the role of dynamics in enzyme catalysis. The work presented in this chapter is built on the foundations laid by pioneering experimental, structural, and computational studies on TIM as a model enzyme. The enzyme uses preorganized active site residues for transition state stabilization, but also utilizes the motion of a highly flexible loop to prepare the active site for efficient catalysis. Each of these aspects – the hydrophobic clamp generated by loop 6, the preorganization of key active site residues, the role of substrate phosphodianion in catalysis, and the nature of loop motion in TIM – has been examined using computational approaches. This work has complemented experimental data and provided newer insights into a long-studied system, thus underscoring the importance of collaborative research in 21st-century science.

6. Conclusions and Future Perspectives

Proteins exist in diverse structural forms and show tremendous functional versatility by performing a variety of life-sustaining biological processes with utmost efficiency. Biochemical and structural studies have provided valuable insights towards determining the link between protein structure and function. More recently, developments in computational infrastructure have catalyzed the implementation of theoretical models for accurate description of molecular systems. Collaboration among the fields of experimental biochemistry, biophysics, structural biology, and computational biochemistry has been instrumental in characterizing structure-function relationships in proteins, as exemplified by the work showcased in this thesis.

Knowledge of the three-dimensional structure of a protein is crucial in determining its biological effect. Computational investigations included in this thesis have been targeted at modeling the dynamics of two protein systems – IDPs and enzymes – that lie on opposite ends of the flexibility spectrum. The Aβ peptide, an IDP associated with the progression of Alzheimer’s disease, was studied for global structural perturbations resulting from its interactions with biologically relevant partner molecules such as metal ions and amphiphilic surfactants. Complemented by data from biophysical experiments, these studies were able to definitively characterize the binding interactions.

Triosephosphate isomerase (TIM), a catalytically proficient enzyme existing in a well-defined structural fold, was the subject of the second set of studies aimed at uncovering the role of key active site features in enzyme catalysis. Despite being a globally ordered structure, TIM contains a locally disordered region in the form of the highly flexible loop 6. As described in Chapter 5, the disorder of this loop is key to generating a catalytically competent active site. Preorganization of active site sidechains is an important function of the structural fold. Thus, it was observed that both structural order and disorder play equally important roles in the proper functioning of a protein. These studies provided deeper insights into enzyme catalysis and enriched our understanding of chemical reaction mechanisms in highly evolved enzymes.

Simulations of biomolecular systems function as a computational microscope for the study of their dynamic behavior and the resultant effect on biological function. A limiting factor, however, is the inability of current methods to sample events at biologically relevant timescales. Advancements in computer hardware and algorithms have shown encouraging signs of moving towards the ultimate objective of computational modeling. The growing use of machine learning methods and artificial intelligence augurs well for the field in terms of method development and derivation of biologically relevant trends from large amounts of simulation data. The synergistic use of experimental methods and computational modeling can potentially lead to a profound understanding of the structure-function relationship in biomolecules. It can also pave the way for applied scientific endeavors such as enzyme design, and development of therapeutics targeting IDPs or structured proteins.

7. Summary (Sammanfattning på svenska)

Proteins are biomolecules that consist of individual amino acids joined together in long chains. There is enormous variation in protein structure, and these structures give rise to different biological processes in the organism. Knowledge and understanding of the three-dimensional structure of a protein is crucial for determining its biological function. Models of protein structure can be produced using techniques such as X-ray crystallography and NMR spectroscopy. By then using these models as a reference point in computer simulations, and by applying physical theories to them, the dynamic behavior of proteins can be studied and so-called “molecular movies” can be created. These methods add a new dimension to the study of proteins at the molecular level.

My research focuses on the use of computational methods to model the dynamics of proteins in order to see the effect that dynamics has on their function. Order and disorder in a protein structure are crucial for determining its function. In this research, two classes of proteins belonging to opposite ends of the flexibility spectrum were therefore chosen for study: intrinsically disordered proteins (IDPs) and enzymes.

IDPs are functionally important peptides that are able to undergo global structural changes in order to interact with different target molecules. This intrinsic disorder can also lead to misfolding and subsequent aggregation of the peptide, which is associated with the pathogenesis of prominent neurodegenerative diseases. The amyloid beta (Aβ) peptide is an IDP associated with the development of Alzheimer’s disease. Computer simulations were used to study the structural changes in this peptide caused by its interaction with metal ions (Paper I) and amphiphilic surfactants (Paper II).

Enzymes are biological catalysts that accelerate the rates of biologically relevant chemical reactions by many orders of magnitude. Enzymes consist of well-ordered structural folds and are selective for catalyzing a specific chemical reaction. The combination of an enzyme’s efficiency and specificity, alongside its biodegradable nature, makes enzymes ideal candidates for industrial applications, not least because sustainability is of great importance today. The design and adaptation of new enzymes is possible only if we have a deep understanding of the origins of enzyme catalysis.

The second part of this thesis describes a collection of computational studies performed to model and study catalysis in the enzyme triosephosphate isomerase (TIM). TIM is a highly evolved, catalytically proficient enzyme that acts in glycolysis. Computational modeling provides an exciting opportunity to examine the molecular details of enzyme catalysis in TIM. The enzyme uses a combination of preorganized active site residues and the motion of a highly flexible loop (loop 6) to prepare the active site for efficient catalysis. Each of these aspects has been studied: the hydrophobic clamp generated by loop 6 (Paper III), the preorganization of key active site residues (Paper IV), the role of the substrate phosphodianion in catalysis (Paper V), and the nature of loop motion in TIM (Paper VI). This work illustrated the functional importance of structural stability and dynamics in an enzyme, providing new insights into catalysis in a model enzyme system.

The research presented in this thesis exemplifies a rewarding collaboration between the fields of experimental biochemistry, biophysics, structural biology, and computational biochemistry. In addition to elucidating the structural features of the medically relevant Aβ peptide, the computational modeling of chemistry in TIM has provided new insights into the origins of enzyme catalysis. These investigations have therefore contributed to characterizing structure-function relationships in proteins, thereby improving our scientific understanding of proteins at the molecular level.

Acknowledgments

The work presented in this thesis has resulted from four years of my PhD training. During this period, I have interacted professionally and personally with many people who have been of immense help and support to me. I would like to thank them (as many as I can) here.

First of all, I would like to thank my supervisor, Prof. Lynn Kamerlin. Thank you for accepting me as a Master's student in the group and supporting me to carry out a PhD under your supervision. Thank you for your constant support and guidance over the past five years, and for letting me attend many interesting conferences worldwide! The experience at Kamerlin lab has been memorable and worth cherishing, and I have learnt about many things both within and outside the scientific domain. A big thank you once again!

I would like to thank my co-supervisor, Prof. Sherry Mowbray, for her support and guidance, and for giving me fond memories of the Protein Engineering course, both as a student and as a lab teacher.

I have had the good fortune of working with extremely knowledgeable and collegial collaborators. I thank Prof. Astrid Gräslund for her support as a co-supervisor, and for all the help and guidance with the Aβ project. Thank you for also letting me do experiments in your lab. Sebastian, thank you for teaching me experiments (especially NMR) at DBB. Cecilia and Nicklas, thank you for your help in the lab and interesting discussions. I thank Prof. John Richard for a successful collaboration on the TIM project. Thank you also for your insights on enzyme catalysis in TIM. I have learnt a lot from our scientific and non-scientific conversations. I thank Prof. Dr. Birgit Strodel for the collaboration on both Aβ and TIM projects, and for helpful discussions. I would like to thank all the co-authors on each of the publications listed in this thesis. Thank you Prof. Erik Johansson and Vimal for giving me an opportunity to work on the DNA Polymerase project.

A big thank you to all former and current members of the Kamerlin lab. I owe a lot to my seniors in the group who patiently taught me how to run simulations and helped me in my projects. Fernanda, you were among the first people I interacted with in the group. Thank you for helping me understand what I was doing during my initial phase in the group. Alex, yo, thanks for all the fun and sorry I did not join the beer clubs that often! Beat, you were my first supervisor in the group. Thank you for letting me work in your projects and teaching me how to work in Linux. Also, thanks for testing CADEE on TIM! Paul, thank you for being there whenever I needed help with parametrization or troubleshooting my simulations. David, I had a great time working with you on the PON1 project. Miha, thanks a ton for making it much easier for me to work with EVB. It was a pleasure interacting with you whenever we did. Fabian, thank you for the “pretty-picture” settings for PyMOL. Dennis, thanks for your help with the micelle models. Anna, thanks for your feedback on various projects during the group meetings. Klaudia and Irek, thank you for organizing the board game nights at your place. Malin, thank you for helping me out with stuff whenever I needed. Andrey, unfortunately we could not interact much, but I wish you good luck for your PhD. Qinghua, thank you for all your help during the last three years. You were my first point of contact if I was stuck with any problem with the simulations. It was a pleasure sharing the office with you. I will miss the badminton and table tennis games. I wish you the best for the future. Dušan, thank you for contributing to the TIM projects, and for all the fun times. Anil, thank you so much for sharing my joys and sorrows, for the insightful scientific discussions, and for taking over the GPDH project. I will especially miss our discussions on cricket.
Adrian, thanks for lightening my mood with all the awesome fun, and for always being ready to go for a coffee whenever we asked you to join. Good luck with your projects! Marina, Jasmine, Michal, I am sorry we could not interact very often. Thank you for the discussions during the group meetings, fikas, and the cruise trip. I wish all the very best to you. Sebastian, thank you for sharing your knowledge on ASR. Cátia and Rita, it was nice interacting with you during fikas. All the best for your GTPase project. Thank you Bruno, Lucas, and Neli for all the interactions. Good Sir Rory, thank you for all the “sorries”! It was fun having you here over the spring. I wish you good luck with the projects you will work on after joining the group.

I want to thank all the project students who worked alongside me on my various projects. It was a pleasure to supervise all of them. Agata, thank you for staying motivated towards the project and running the Aβ simulations. I am happy to see you doing well, and wish you the best for your PhD. Fabian, thank you for your help with testing force fields on the TIM system. Good luck for your PhD. Nathalie, thank you for testing the various TIM mutants.

I would like to thank all my colleagues at ICM for the helpful discussions and lovely interactions. I thank Prof. Mikael Widersten and Prof. Helena Danielson for their scientific and non-scientific help and support at Kemi-BMC. I would also like to thank my colleagues here whom I interacted with frequently. Thilak, Eldar, Leandro, and Caroline, thank you for your company in the gym and for all the interesting discussions. Joana, thank you for your help with the course lab. Gina, thank you so much for your help in the wet lab. I wish you the best for your PhD.

My research work was funded by a fellowship award from the Sven and Lilly Lawski Foundation, as well as by the Swedish Research Council (VR). I thank both the funding agencies for their generous support. I also thank the Swedish National Infrastructure for Computing (SNIC) for the computational resources.

My life in Uppsala would have been incomplete without my friends outside work. A huge thank you to everyone whom I interacted with over the course of the past six years. Please forgive me if I miss mentioning your name. Firstly, I thank my closest friends who were like a family to me here, and made me feel at home away from home. Sandesh, I don’t know if I can thank you enough. You were my first “buddy” when I came to Uppsala and helped me with every little thing. Thank you for letting me share your apartment, for treating me with delicious meals, for teaching me how to cook, and for all the fun. The list is endless, and I am happy to have shared these years with you. All the best with finishing your PhD. Phani, thank you for enriching my life with your “thug”ness and always motivating me to publish papers. Please do not give up on the “My Group” idea! Good luck with your PhD.
Punya and Smruti, thank you for all the awesome get-togethers and for giving a homely feeling whenever I visited you. Malavika, thank you for maintaining the lunch group even though it reduced to just three people. Hope you finish your PhD soon! Just do it! Narayan, thank you for all the questions and cinnamons. Good luck for the future. Vivek, thank you for being an awesome gym partner and for helping me carry things when I moved to the new apartment. That makes me thank Hemanth as well, for all the help with everything and always being available. Oksana, it was nice interacting with you over lunches and fikas while you were here. Good luck for your PhD. Sangeetha (aka bouncer), thank you for all the fun times, bringing delicious food to 83B, and for accompanying me to the gym. Wish you the best for your PhD as well. Pruthvi, thank you for all our interactions, even though we did not meet that often during our PhD days. All the best for your future endeavors. Anitha, thank you for all the discussions on music and cricket, and for teaching me how to make a pizza from scratch. Loora, thank you for visiting India with Madle. I hope you travel there once again. Cecilia, ett stort tack till dig for helping me with the Swedish sammanfattning! Annika, tack så mycket for your help at such short notice! Ana, Sudarsan and Kristine, thank you for your company whenever you visited our apartment.

I also thank my friends from the cricket group for all the awesome games we played together. Those were some of my most favorite days here. Thank you, Srinivas, Sethu, Hari, and Prasad. I thank Prof. Suparna Sanyal and Prof. Biplab Sanyal for making me a part of Uppsala Indian Choir, and for all the lovely events and melodious evenings spent at Sanyal Bari. Thank you to all other members of UIC too.

I thank all my friends from RVCE, who stayed in contact with me for all these years. Raghu, thank you so much for coming to Scandinavia time and again and taking me on some of the most memorable road trips and hikes of my life. Thank you also for hosting me in Houston and for the amazing week with Sambit and Ashlesh. I shall always remember that trip. Srinidhi, thank you for hosting me in Oslo and all the fun times during all the trips. I also thank all my friends in India who gave me so much love every time I made a trip back home. It really meant a lot. Special thanks to Mr. Shanti Agarwal for his support and encouragement. I thank Mr. Sanjay, Ms. Regina, and everyone else at Ovobel for all their help with various things.
I would like to thank all my teachers, from my school days to the Bachelor's, Master's and PhD programs, who have played an important role in my scholarly development. Thank you!

Finally, I would like to thank my family. Aai and Baba, thank you for everything. This would not have been possible without your love and support over all these years. Thank you Mummy and Papa for your encouragement. I thank my extended family as well. In the end, I would like to thank my best friend, who also happens to be my dear wife, Divya, for all the love, for being a strong pillar of support, and for standing by me during the good and bad times. Thank you for inspiring me to do science the right way, and for teaching me to never give up in life. Best wishes for your PhD!



Acta Universitatis Upsaliensis
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1885

Editor: The Dean of the Faculty of Science and Technology

A doctoral dissertation from the Faculty of Science and Technology, Uppsala University, is usually a summary of a number of papers. A few copies of the complete dissertation are kept at major Swedish research libraries, while the summary alone is distributed internationally through the series Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology. (Prior to January, 2005, the series was published under the title “Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology”.)

Distribution: publications.uu.se
urn:nbn:se:uu:diva-398169
