Membrane Protein Structure and Dynamics || Modeling the Structural Communication in Supramolecular Complexes Involving GPCRs


Nagarajan Vaidehi and Judith Klein-Seetharaman (eds.), Membrane Protein Structure and Dynamics: Methods and Protocols, Methods in Molecular Biology, vol. 914, DOI 10.1007/978-1-62703-023-6_18, © Springer Science+Business Media, LLC 2012

Chapter 18

Modeling the Structural Communication in Supramolecular Complexes Involving GPCRs

Francesca Fanelli

Abstract

This article describes a computational strategy aimed at studying the structural communication in G-Protein Coupled Receptors (GPCRs) and G proteins. The strategy relies on comparative Molecular Dynamics (MD) simulations and analyses of wild-type (i.e., reference state) vs. mutated (i.e., perturbed state), or free (i.e., reference state) vs. bound (i.e., perturbed state) forms of a GPCR or a G protein. Bound forms of a GPCR include complexes with small ligands and/or receptor dimers/oligomers, whereas bound forms of heterotrimeric GDP-bound G proteins concern the complex with a GPCR. The computational strategy includes structure prediction of a receptor monomer (in the absence of a high-resolution structure), a receptor dimer/oligomer, and a receptor–G protein complex, which constitute the inputs of MD simulations. Finally, the analyses of the MD trajectories are instrumental in inferring the structural/dynamics differences between reference and perturbed states of a GPCR or a G protein. In this respect, focus will be put on the analysis of protein structure networks and communication paths.

Key words: GPCRs, Constitutively active mutants, Comparative modeling, Dimerization, Protein–protein docking, Molecular dynamics, Protein structure network

1. Introduction

Intramolecular and intermolecular communications inside a protein molecule and among the components of protein networks are at the basis of signal transduction and functioning mechanisms of membrane proteins, in particular, G Protein-Coupled Receptors (GPCRs) (reviewed in ref. 1). GPCRs regulate most aspects of cell activity by transmitting extracellular signals inside the cell (reviewed in refs. 2, 3). GPCRs share an up-and-down bundle of seven transmembrane helices connected by three intracellular (IL1, IL2, and IL3) and three extracellular (EL1, EL2, and EL3) loops, an extracellular N-terminus and an intracellular C-terminus. Upon activation by extracellular signals, the receptors activate the α-subunit in heterotrimeric guanine nucleotide binding proteins (G proteins) by catalyzing the exchange of bound GDP for GTP, i.e., they act as Guanine Nucleotide Exchange Factors (GEFs). Thus, GPCRs are allosteric proteins that transform extracellular signals into promotion of nucleotide exchange in intracellular G proteins. These receptors have regions of low and high flexibility that communicate with each other, even if distal. GPCRs exist as complex statistical conformation ensembles (4, 5). Their functional properties are related to the distribution of states within the native ensemble, which is differently affected by ligands, oligomeric state (i.e., monomer, homo- and heterodimers, and oligomers), interacting proteins, and amino acid mutations. Regulated protein–protein interactions are key features of many aspects of GPCR function and there is increasing evidence that these receptors act as part of multicomponent units comprising a variety of signaling and scaffolding molecules (3, 6).

Unraveling the GPCR functioning mechanisms at the atomic level is difficult because high-resolution structural information is presently limited to: (a) rhodopsin in its dark (inactive), constitutively active, and photoactivated states (7–11); (b) β2- and β1-adrenergic receptors (ARs) bound to agonists, partial agonists, inverse agonists, antagonists, a nanobody, and heterotrimeric Gs (12–20); (c) A2A adenosine receptor (A2AR) bound to an antagonist, a number of agonists, and an allosteric inverse agonist antibody (21–25); (d) squid rhodopsin bound to 11-cis retinal (26); (e) CXCR4 chemokine receptor bound to small molecule and cyclic peptide antagonists (27); (f) D3 dopamine receptor in complex with a D2/D3 selective antagonist (28); (g) the H1 histamine receptor (H1R) in complex with an antagonist (29); (h) the m2 and m3 muscarinic receptors bound to antagonists (30, 31); (i) the δ-, κ-, and µ-opioid receptors bound to antagonists (32–34); (j) the nociceptin/orphanin FQ receptor in complex with a peptide mimetic antagonist (35); and (k) a lipid GPCR in complex with a sphingolipid antagonist (36). In spite of the extraordinary advance in structure determination of GPCRs since 2007, there is still little knowledge of the likely architecture of GPCR dimers/oligomers, of the effects of ligand binding, site-directed mutagenesis, and dimerization on the intrinsic dynamics of the receptor, and of the receptor impact on the intrinsic dynamics of the G protein, which makes GPCRs act as GEFs.

This article describes a strategy addressing the allosteric structural communication involving GPCRs by integrating different computational approaches in a comparative framework. These approaches include building the structural model of the receptor, predicting the likely architecture of GPCR homo- and heterodimers/oligomers, predicting the ligand–receptor and the receptor–G protein complexes that constitute the functional unit, and simulating receptor and G protein dynamics in their native, mutated, free, and differently bound states. Ultimately, structural/dynamic differences between reference and perturbed states of a GPCR or G protein are inferred by comparing time-averaged features, essential motions, structure networks, and communication paths.

2. Materials

1. The MODELLER software is employed for comparative modeling of the initial structural model of a given receptor or G protein not yet solved at the atomic level of detail (http://www.salilab.org/modeller/ (37, 38)).

2. Protein quality checks and automatic adjustment of side chain rotamers are done through the “Protein Health” and “Protein Design” modules of the Quanta molecular graphics package (www.accelrys.com), which is not freely available. However, amino acid side chain rotamer libraries and software for side chain conformation prediction can be freely downloaded from Dunbrack’s Web site (http://dunbrack.fccc.edu/bbdep/index.php). Moreover, many of the freely available protein modeling programs (e.g., Chimera (http://www.cgl.ucsf.edu/chimera/), PyMOL (http://www.pymol.org/), and Bodil (http://users.abo.fi/bodil/)) allow modifications of amino acid side chains according to rotamer libraries. Freely available software for checking the stereochemical quality of a protein structure includes PROCHECK (http://www.ebi.ac.uk/thornton-srv/software/PROCHECK/) and WHATIF (http://swift.cmbi.ru.nl/whatif/).

3. Rigid-body protein–protein docking for predicting receptor quaternary structures or receptor–G protein complexes is carried out by means of the ZDOCK program (http://zlab.bu.edu/zdock/ (39, 40)).

4. Reorientation of a receptor structure according to the membrane topology can be done by means of the FiPD software, which serves essentially to analyze the outputs of the ZDOCK program (visit “Software” at http://www.csbl.unimore.it (41)).

5. MD simulations are carried out by means of the CHARMM software (http://www.charmm.org/html/info/intro.html (42)).

6. All MD analyses are done by means of the Wordom software, which is freely downloadable from http://wordom.sourceforge.net/index.html (43).

7. Cluster analysis of the docking solutions can be done by means of the FiPD software.

8. All drawings shown in Figs. 1–4 were done using the software PyMOL 1.1r1 (http://www.pymol.org/).


Fig. 1. Cytosolic ends of wild-type luteinizing hormone receptor (LHR) (inactive state, top) and of the D578(6.44)H constitutively active LHR mutant (bottom). The numbering in parentheses follows the arbitrary scheme by Ballesteros and Weinstein (58). Details on the building of the structural models, which are average minimized structures, are published elsewhere (46). The structures are seen from the cytosolic side in a direction perpendicular to the membrane surface. Helices 1, 2, 3, 4, 5, 6, 7, and 8 are, respectively, colored blue, orange, green, pink, yellow, cyan, violet, and red. IL1, IL2, and IL3 are, respectively, lime, gray, and magenta. Selected conserved amino acids, whose interactions mark the structural differences between inactive (reference) and active (perturbed) states, are shown in stick representation. Gray dots are the solvent accessible surface computed over R464(3.50), V467(3.53), T468(3.54), K566(6.32), I567(6.33), and K570(6.36). The relative SASA in wild-type and mutated LHR forms is 89 and 255 Å², respectively. This figure illustrates how selected interaction patterns and size–shape descriptors such as SASA can be used to mark structural differences between functionally different states.

Fig. 2. Cα-displacements of the Gq α-subunit in its free (left) state and in complex with the thromboxane A2 receptor (TP) (right) along the second eigenvector from a PCA run on a concatenated trajectory made of 12,000 frames. A number of conformations of the α-subunit were generated between the minimum and maximum projection on the selected eigenvector. These pictures refer to a work published elsewhere (45). The G protein α-, β-, and γ-subunits are, respectively, gray, cyan, and yellow, whereas TP is green. The GDP molecule is red. Red dots indicate the SAS of the nucleotide. This figure illustrates how a perturbation such as receptor binding induces changes in the essential motions of the G protein α-subunit. In this respect, comparative PCA may serve to highlight divergences in the intrinsic dynamics of a reference system (i.e., receptor-free α-subunit) and a perturbed system (i.e., receptor-bound α-subunit).

Fig. 3. 3D PSG concerning wild-type LHR (i.e., inactive state, left) and the D578(6.44)H constitutively active LHR mutant (right). These pictures result from a work published elsewhere (46). The spheres centered on the Cα-atoms concern node pairs in the PSG, which are linked in more than 50% of frames in a 10 ns trajectory. Nodes are colored according to their location (i.e., according to the different receptor regions). In this respect, helices 1, 2, 3, 4, 5, 6, 7, and 8 are, respectively, blue, orange, green, pink, yellow, cyan, violet, and red. IL1 and EL1 are lime, IL2 and EL2 are gray, and IL3 and EL3 are magenta. The diameter of the circle is proportional to the number of links made by the considered node, with the lowest value corresponding to one link. Link color refers to the frequency of the link. In this respect, cyan, green, yellow, orange, and red correspond, respectively, to the following frequency (F) ranges: 50 ≤ F ≤ 60%, 60 ≤ F ≤ 70%, 70 ≤ F ≤ 80%, 80 ≤ F ≤ 90%, and F ≥ 90%. This figure illustrates how a perturbation such as a point mutation may result in a different composition of the structure network that characterizes the wild-type form.

Fig. 4. Average minimized structures concerning the A2AR in its free (a) and antagonist-bound (i.e., bound to the antagonist ZM241385 (ZMA)) (b) monomeric forms, as well as in three different ZMA-bound dimeric forms (c–e). The architectures of the three different dimers are shown at the bottom of panels c–e. For the color coding of the different receptor portions, see the legend to Fig. 3. Pictures refer to results published elsewhere (47). The top images show the effects of ligand binding (panel b) and of dimerization (panels c–e) on the PSG characterizing the apo-form (panel a) (see the legend to Fig. 3 for a description of 3D PSG). This figure illustrates how PSN analysis can be used to infer differences in the intramolecular and intermolecular communication following ligand binding or receptor dimerization.

3. Methods

3.1. Modeling the GPCR Structure

1. The initial structure modeling of a given GPCR is achieved by comparative modeling with the MODELLER software, which is based on the satisfaction of spatial restraints (37, 38). The inputs to the program are restraints on the spatial structure of the amino acid sequence(s) and ligands to be modeled. The output is a 3D structure that satisfies these restraints as well as possible. Restraints can in principle be derived from a number of different sources. These include related protein structures (comparative modeling), NMR experiments (NMR refinement), rules of secondary structure packing (combinatorial modeling), secondary structure predictions, cross-linking experiments, fluorescence spectroscopy, image reconstruction in electron microscopy, site-directed mutagenesis, intuition, and residue–residue and atom–atom potentials of mean force. The restraints can operate on distances, angles, dihedral angles, pairs of dihedral angles, and some other spatial features defined by atoms or pseudo atoms. By default, MODELLER automatically derives the restraints only from the known related structures and their alignment with the target sequence. A 3D model is obtained by optimization of a molecular probability density function (PDF). The molecular PDF for comparative modeling is optimized with the variable target function procedure in Cartesian space, which employs methods of conjugate gradients and MD with simulated annealing. The quality of a comparative model depends on the quality of the sequence alignment and on the structural quality of the template(s). External restraints can be added by the user. Nonhomologous regions, such as loops, can be modeled following an ab initio approach (see Note 1) (38).

2. As for comparative modeling of GPCRs, selection of template structures from one or more of the GPCR crystal structures available so far should prioritize sequence similarity between target and template proteins. Fully automated modeling of an entire GPCR sequence is rarely feasible even for members of the same subfamily, and human intervention is necessary to modify the selected template, to add extra restraints (e.g., to elongate α-helices or add β-strands), to add patches (i.e., disulfide bridges), and to refine loops ab initio (i.e., by the “loopmodel” option). A number of different sequence alignments are worth probing. For each alignment, it is worth generating multiple models (i.e., more than 100 for each alignment) by randomizing all the Cartesian coordinates of standard residues in the initial model (i.e., “randomize.xyz” option). A high degree of model refinement within MODELLER is recommended. From each run, the ten models with the lowest values of the MODELLER objective function (i.e., the lowest degree of restraint violation) are selected.

3. The set of selected models is subjected to quality checks, which verify the correctness of the main chain conformation, leading to selection of one or more model(s) (see Note 2 and Subheading 2).

4. The side chain conformations of the selected model(s) are finally optimized by using rotamer libraries (see Note 3). We use the side chain modeling tool within the Quanta commercial software (www.accelrys.com), since it allows the use of three different backbone-independent and backbone-dependent rotamer libraries either on the whole protein or on a wide variety of amino acid selections. However, a number of rotamer libraries and software for side chain modeling are freely available and worth using (see Subheading 2).

5. Further refinement of such model(s) by taking average minimized structures from MD simulations is frequently required, especially when sequence identity between template and target is low (e.g., ~20% in the transmembrane domains and higher in the loops and in the N- and C-termini). Time-averaged structures (i.e., the structures obtained by averaging the Cartesian coordinates of the structural models over the frames that constitute the MD trajectory) are worth computing over the initial, central, and last frames, as well as over all frames of the MD trajectory. When the Cα-RMSD values of the trajectory frames are low, the different average structures will be quite similar. It is worth noting that for any average structure, energy minimization is mandatory to eliminate physically meaningless bond lengths, bond angles, and torsion angles.
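The averaging described in step 5 can be illustrated with a minimal Python sketch. It assumes that the trajectory has already been loaded as a list of frames, each a list of (x, y, z) Cα coordinates superimposed on a common reference. The function names and the split of the trajectory into thirds are illustrative choices, not part of MODELLER, CHARMM, or Wordom.

```python
import math

def average_structure(frames):
    """Average the Cartesian coordinates of each atom over a list of frames."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    return [
        tuple(sum(frame[a][k] for frame in frames) / n_frames for k in range(3))
        for a in range(n_atoms)
    ]

def rmsd(coords_a, coords_b):
    """RMSD between two pre-superimposed coordinate sets (no fitting is done here)."""
    squared = sum(
        (p[k] - q[k]) ** 2 for p, q in zip(coords_a, coords_b) for k in range(3)
    )
    return math.sqrt(squared / len(coords_a))

def window_averages(frames):
    """Averages over the initial, central, and last thirds and over all frames."""
    third = len(frames) // 3
    return {
        "initial": average_structure(frames[:third]),
        "central": average_structure(frames[third:2 * third]),
        "last": average_structure(frames[2 * third:]),
        "all": average_structure(frames),
    }

# Toy trajectory (hypothetical data): 2 atoms, 6 frames.
# Atom 0 drifts along x by 1 Å per frame, atom 1 does not move.
toy = [[(float(t), 0.0, 0.0), (1.0, 1.0, 1.0)] for t in range(6)]
averages = window_averages(toy)
drift = rmsd(averages["initial"], averages["last"])
print(f"RMSD between initial and last averages: {drift:.3f} Å")
```

A small RMSD between the window averages indicates that the different averages are similar, as noted above. Any average structure obtained this way still requires energy minimization before further use.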

3.2. Prediction of GPCR Dimer/Oligomer Architecture

1. Prediction of likely architectures of GPCR homo- and heterodimers follows a computational approach developed for quaternary structure predictions of transmembrane (TM) α-helical proteins (41). The approach consists of rigid-body docking using a version of the ZDOCK program devoid of desolvation as a component of the docking score (39). It does not employ symmetry constraints either for improving sampling or in the filtering step. Furthermore, there are no size limitations for the systems under study, which are not limited to the TM regions but include the loop regions as well. The only requirements of this approach are the structural model of the monomer and the knowledge of a set of Cα-atoms that lie at the two lipid/water interfaces, defining two parallel planes (see Note 4). In the target monomer, these two planes must be parallel to the xy plane and, hence, perpendicular to the z-axis for the membrane topology filter to work properly (see Note 5). If this is the case, no reorientation is needed; otherwise, the monomer must be reoriented, which can be done through the FiPD software (visit “Software” at http://www.csbl.unimore.it).

The structural models subjected to docking simulations can be either crystal structures or minimized averages from MD simulations on selected models from comparative modeling.

2. In the case of homodimerization, two identical copies of the structural receptor model are docked together, i.e., one monomer is used as a fixed protein (target) and the other as a mobile protein (probe). For predicting heterodimers, the structural model of receptor A is taken as a target, whereas the structural model of receptor B is taken as a probe and/or vice versa. A rotational sampling interval of 6° is recommended (i.e., dense sampling, “-D” option), and the best 4,000 solutions are retained and ranked according to the ZDOCK score (i.e., “-N 4000” setup).

3. The best docked solutions by shape complementarity, as selected by the docking program, are then filtered using the “membrane topology” filter of the FiPD software (41), which discards all the solutions that violate the membrane topology requirements (see Note 5). Specifically, the membrane topology filter discards all the solutions characterized by a deviation angle from the original z-axis (i.e., tilt angle) and a displacement of the geometrical center along the z-axis (i.e., z-offset) above defined threshold values. In the case of GPCRs, tilt angle and z-offset thresholds of 0.4 rad (“-a 0.4” setup) and 6.0 Å (“-o 6.0” setup) are worth using (see Note 6). Following such filtering, discarded solutions generally constitute more than 94% of the solutions selected by the docking program.

4. The filtered solutions from each run are merged with the target protein, leading to an equivalent number of dimers that are clustered using a Cα-RMSD threshold of 3.0 Å for each pair of superimposed dimers. All the amino acid residues in the dimer are included in Cα-RMSD calculations. Cluster analysis is based on a Quality Threshold-like clustering algorithm (44) implemented both in the FiPD and Wordom software (41, 43) (see Note 7). Since the filtering cutoffs of the membrane topology parameters are intentionally quite permissive (see Note 6), inspection of the cluster centers (i.e., the solutions with the highest number of neighbors in each cluster) often serves as a final filter to discard remaining false positives, leading to a reduction of the reliable solutions to about 1% of the total 4,000 solutions. The best scored docking solution(s) from the most populated and reliable cluster(s) is (are) finally chosen.

5. If necessary, adjustment of the torsion angles of interface amino acid side chains involved in steric clashes is done by using rotamer libraries (see Note 3).

6. Predictions of higher order oligomers can be done either by selecting solutions from reliable clusters (i.e., which can coexist in a multimeric complex), or by running a new docking by using the predicted dimer as a target and the monomer as a probe. Filtering and cluster analysis then follow the same procedure as that employed for predicting dimers.
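The membrane topology filter used in step 3 can be sketched in a few lines of Python. This is not the FiPD implementation. It assumes that each docking solution is available as a translation along z (in Å) and an Rx rotation angle (in rad), as described in Note 5, and that a solution is retained only when both the tilt angle and the z-offset are within their thresholds. The dictionary keys and the function name are hypothetical.

```python
import math

TILT_MAX = 0.4     # rad, the "-a 0.4" threshold quoted in step 3
OFFSET_MAX = 6.0   # Å, the "-o 6.0" threshold quoted in step 3

def passes_membrane_topology(solution, tilt_max=TILT_MAX, offset_max=OFFSET_MAX):
    """Return True if the probe stays close to its original membrane orientation.

    Assumption: for an RzRxRz rotation, the angle between the rotated and the
    original z-axis depends only on the Rx component; acos(cos(rx)) folds any
    angle into the 0..pi range.
    """
    tilt = math.acos(math.cos(solution["rx"]))
    z_offset = abs(solution["tz"])
    return tilt <= tilt_max and z_offset <= offset_max

# Hypothetical docking solutions (identifier, Rx angle in rad, z-translation in Å).
solutions = [
    {"id": 1, "rx": 0.1, "tz": 2.0},                  # kept
    {"id": 2, "rx": 0.5, "tz": 1.0},                  # tilt too large
    {"id": 3, "rx": 0.2, "tz": -7.5},                 # z-offset too large
    {"id": 4, "rx": 2 * math.pi - 0.3, "tz": -5.0},   # kept: tilt folds to 0.3 rad
]
kept = [s["id"] for s in solutions if passes_membrane_topology(s)]
print("Solutions retained by the filter:", kept)
```

Both thresholds are exposed as parameters because, as stated in Note 6, they are deliberately permissive and may need adjusting for other systems.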

3.3. Prediction of Receptor–G Protein Complex Structures

1. Prediction of the structure of the protein complex between a given GPCR and the cognate G protein is achieved through docking simulations using the ZDOCK version that calculates desolvation penalty, pairwise shape complementarity, and electrostatics to score the docked complexes. The receptor is used as a target, whereas the heterotrimeric G protein is the probe. To improve sampling efficiency, only the cytosolic portions of the receptor are taken into account in docking simulations (i.e., by “blocking” atoms that cannot be part of the interface). Dense sampling and final retrieval of the best 4,000 solutions are done.

2. To filter the most reliable solutions among the 4,000 best scored docked GPCR–G protein complexes, a distance cutoff of 20 Å is used between the Cα-atom of the fully conserved arginine of the E/DRY motif of the receptor and the Cα-atom of the last amino acid of the G protein α-subunit. Since such distance-based filtering is intentionally quite permissive to avoid unwanted elimination of reliable solutions, the filtered solutions are then subjected to cluster analysis by using the QT-like algorithm with a Cα-RMSD cutoff of 4.0 Å. Finally, visual inspection of the cluster centers serves to discard the remaining solutions that violate the expected membrane topology of the G protein (see Note 8). The best scored solution(s) from the most populated and reliable clusters (i.e., those that comply with the expected membrane topology of the G protein) are finally selected and subjected to side chain rotamer optimization (see Note 3 and Subheading 2).
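The distance-based filter of step 2 can be sketched as follows. This is a minimal illustration, assuming that the Cα coordinates of the E/DRY arginine (fixed, since the receptor is the target) and of the last residue of the α-subunit in each docked pose have already been extracted, and that a higher docking score is better. All names and coordinates are hypothetical.

```python
import math

CUTOFF = 20.0  # Å, the distance cutoff quoted in step 2

def filter_by_distance(arg_ca, poses, cutoff=CUTOFF):
    """Keep poses whose alpha-subunit C-terminal CA lies within the cutoff
    of the arginine CA, then sort them by decreasing docking score."""
    kept = [p for p in poses if math.dist(arg_ca, p["cterm_ca"]) <= cutoff]
    return sorted(kept, key=lambda p: p["score"], reverse=True)

# Hypothetical coordinates (Å) and scores.
arg_ca = (0.0, 0.0, -15.0)
poses = [
    {"name": "A", "score": 1500.0, "cterm_ca": (3.0, 4.0, -25.0)},   # ~11.2 Å
    {"name": "B", "score": 1800.0, "cterm_ca": (0.0, 0.0, -40.0)},   # 25.0 Å
    {"name": "C", "score": 1600.0, "cterm_ca": (10.0, 0.0, -15.0)},  # 10.0 Å
]
selected = [p["name"] for p in filter_by_distance(arg_ca, poses)]
print("Poses passing the distance filter:", selected)
```

Pose B has the best score but is discarded by the distance criterion, which is why the filter is applied before cluster analysis and visual inspection.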

3.4. MD Simulations

1. MD simulations can be used to study the allosteric communication (i.e., intramolecular and intermolecular) involving GPCRs. Comparative analyses of MD trajectories of wild-type (i.e., reference state) vs. mutated (i.e., perturbed state) or free (i.e., reference state) vs. bound (i.e., perturbed state) forms of a GPCR or a G protein (1, 45–47) provide such information. Bound forms of a GPCR include complexes with small ligands and receptor dimers/oligomers. The allosteric communication can hence be inferred by comparing the essential motions as well as selected time-dependent features of reference and perturbed states. In this context, time-dependent features may include structure networks and shortest communication paths.

2. Strategies to reduce the system’s degrees of freedom and detect meaningful motions are worth using. They include the employment of implicit water/membrane models like the GBSW (48) implemented in the CHARMM molecular simulation software (42). Such a solvation model is used with the all-hydrogen parameter set PARAM22/CMAP. The surface tension coefficient (representing the nonpolar solvation energy) is set to 0.03 kcal/(mol Å²) and the membrane thickness centered at Z = 0 is set to 30.0 Å with a membrane smoothing length of 5.0 Å (w_m = 2.5 Å). Additional ways to reduce the degrees of freedom may include the use of intra-helical restraints between the backbone oxygen atom of residue i and the backbone nitrogen atom of residue i + 4, except for prolines. Noncanonical α-helix conformations are excluded from the intra-backbone restraints as well. The scaling factor of such restraints is 10 and the force constant at 300 K is 10 kcal/mol Å. The structural model must have an appropriate membrane topology, i.e., the Cα-atoms predicted to be at the membrane/water interface must lie in the xy plane. Prior to MD simulations, the potential energy of the simulation system should be minimized using steepest descent followed by Adopted Basis Newton-Raphson (ABNR) minimization, until the root mean square gradient is less than 0.001 kcal/mol Å.

3. As for the setup of equilibrium MD simulations, the bond lengths involving the hydrogen atoms are restrained by the SHAKE algorithm, allowing for an integration time step of 0.002 ps. The system is slowly heated to 300 K with 7.5 K increments every 2.5 ps over 100 ps, by randomly assigning velocities from a Gaussian distribution. After heating, the system is allowed to equilibrate for a system-dependent time period. Equilibration is followed by the MD production phase, which consists of constant temperature (300 K) simulations on the nanosecond/submicrosecond time scale (see Note 9).
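The heating protocol in step 3 can be checked with a short arithmetic sketch. It assumes that heating starts from 0 K and that the temperature is raised at the end of each 2.5 ps interval. Neither assumption is stated in the text, and the variable names are illustrative rather than CHARMM keywords.

```python
TIME_STEP_PS = 0.002      # integration time step allowed by SHAKE
HEAT_TOTAL_PS = 100.0     # total length of the heating phase
HEAT_INTERVAL_PS = 2.5    # interval between temperature increments
HEAT_INCREMENT_K = 7.5    # temperature increment per interval

# Number of temperature increments and of MD steps between two increments.
n_increments = round(HEAT_TOTAL_PS / HEAT_INTERVAL_PS)
steps_per_increment = round(HEAT_INTERVAL_PS / TIME_STEP_PS)
total_heating_steps = n_increments * steps_per_increment

# Temperature reached at the end of the heating phase.
final_temperature = n_increments * HEAT_INCREMENT_K

# (time in ps, target temperature in K) after each increment.
schedule = [
    ((i + 1) * HEAT_INTERVAL_PS, (i + 1) * HEAT_INCREMENT_K)
    for i in range(n_increments)
]

print(f"{n_increments} increments of {HEAT_INCREMENT_K} K, "
      f"{steps_per_increment} steps each, {total_heating_steps} steps in total")
print(f"Final temperature: {final_temperature} K")
```

With these values, 40 increments of 7.5 K reach exactly the 300 K production temperature, which confirms that the quoted increment, interval, and heating length are mutually consistent.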

3.5. Comparative MD Analyses

1. The structural differences between reference and perturbed states of a GPCR or a G protein can be inferred by comparing properties either computed on the average structures or plotted as time series (i.e., property vs. time plots). The selected structural properties must be the same for all the compared systems (see Note 9).

Useful time-averaged properties include inter-residue main chain and/or side chain distances accounting for relative motions of selected segments of the protein, and/or formation or breakage of networks of interactions between conserved residues (Fig. 1), and/or changes in the solvent accessible surface area (SASA) of selected receptor segments in response to the initial perturbation (i.e., small molecule/protein binding or point mutation). In particular, as also demonstrated by the crystal structures of dark and photoactivated (Meta II) rhodopsin, as well as constitutively active opsin (7–9), in GPCRs of the rhodopsin family the SASA index computed over a set of amino acids that sometimes includes the fully conserved arginine of the E/DRY motif and selected surrounding amino acids (see Note 10) can be used as an indicator of the presence or absence of activation (Fig. 1). Other time averages may include the Cα-RMSD (computable by the “RMSD” module of Wordom (43)), accounting for differences in the average backbone flexibility between inactive and active states. Furthermore, comparative analyses of the Cα-atom Root Mean Square Fluctuations (Cα-RMSFs, computable by the “RMSF” module of Wordom (43)) of reference and perturbed states can be used as well. The Cα-RMSF measures the deviation between the position of each Cα-atom in each trajectory frame and its average position over the whole trajectory. Comparisons of the Cα-RMSF plots provide information on the flexibility of selected regions of the protein in response to a perturbation occurring at a distal site. This is the case of comparative Cα-RMSF analysis of free and receptor-bound forms of the G protein α-subunit (45).

2. Other comparative analyses done on equilibrium MD trajectories include the Principal Component Analysis (PCA) on the Cα-atoms (i.e., the “PCA” module in Wordom (43)). This is a strategy to isolate and identify low frequency, high amplitude movements in the dynamics, thus separating meaningful concerted motions (i.e., essential motions) from noise and high frequency oscillations (49) (see Note 11). Comparing the essential motions of reference and perturbed states of a GPCR or a G protein provides an understanding of the intrinsic dynamics associated with the functioning mechanism of the considered protein (Fig. 2).

3. The structural differences between reference and perturbed states of a GPCR or a G protein resulting from MD simulations can also be expressed as (a) changes in the parameters that define the structure network of the reference protein and (b) changes in the dynamic distribution of the shortest communication paths connecting two amino acids. This is achieved through the Protein Structure Network (PSN) analysis. The PSN analysis implemented in the Wordom software (43) is a product of graph theory applied to protein structures (50). A graph is defined by a set of points (nodes) and connections (edges) between them. In a protein structure graph (PSG), each amino acid is represented as a node and these nodes are connected by edges based on the strength of noncovalent interactions between nodes (I_ij) (51, 52) (see Note 12). I_ij values are calculated for all node pairs, excluding i ± n, where n is a given neighbor cutoff of 3 (“--PROXIMITY 3” setup in the “PSN” module). An interaction strength cutoff I_min is then chosen and any residue pair ij for which I_ij ≥ I_min is considered to be interacting and hence is connected in the PSG. Selection of the proper I_min for the system under study requires calculation of a number of PSGs over an I_min range (e.g., from 0 to 5 by 0.1 steps: “--INTMIN 0.0:5.0:0.1” setup in the “PSN” module). In a given PSG, the residues making zero edges are termed orphans and those that make four or more edges are referred to as hubs at that particular I_min. Node interconnectivity is finally used to highlight cluster-forming nodes, where a cluster is a set of connected amino acids in a graph (see Note 13). Cluster size, i.e., the number of nodes constituting a cluster, varies as a function of the I_min, and the size of the largest cluster is used to calculate the I_critic value. The latter is defined as the I_min at which the size of the largest cluster is half the size of the largest cluster at I_min = 0.0%. At I_min = I_critic, weak node interactions are discarded, emphasizing the effects of stronger interactions on PSN properties. Thus, the first I_min value higher than the I_critic, which results from an I_min scanning from 0 to 5 by 0.1 steps, is recommended. Comparative PSN analyses allow searching for differences in the network parameters (e.g., identity and number of nodes, hubs, links connecting nodes, and links connecting hubs) between reference and perturbed GPCR (Figs. 3 and 4) or G protein. Such differences are linked to intramolecular or intermolecular communication mechanisms.

4. Following calculation of the PSN-based connectivities (by the “PSN” module) and of correlated Cα-atom motions (i.e., by using the Linear Mutual Information (LMI) method (“--LMI” option in the “CORR” module of Wordom)), for each frame, the procedure to search for the shortest path(s) between each residue pair (i.e., by the “PSNPATH” module) consists of: (a) searching for the shortest path(s) between each selected amino acid pair based upon the PSN connectivities, and (b) selecting the shortest path(s) that contain(s) at least one residue correlated (i.e., with an LMI cross-correlation ≥ 0.3: “--CUTOFF 0.3” setup in the PSN module) with either one of the two extremities (i.e., the first and last amino acids in the path). The path search implemented in Wordom relies on Dijkstra’s algorithm (52). Once the shortest paths have been found, the path frequencies, i.e., the number of frames containing the selected path divided by the total number of frames in the trajectory, are calculated. All those paths characterized by frequency values above given thresholds, which depend on the simulated system, are then subjected to cluster analysis by means of the QT algorithm, according to a similarity score (53, 54) (see Note 14). Meaningful paths can then be selected as the centers of the most populated clusters. In addition, coarse path representations can be used to compare the structural communication in reference and perturbed systems (54). These coarse paths consist of the representation of the most recurrent nodes and links either in the entire set of paths (above a frequency threshold) or among the members of the most populated clusters (54). Thus, in a comparative framework, path analysis on MD trajectories represents a further way to highlight differences in the structural communication between reference and perturbed GPCR or G protein (46, 47).
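The graph operations described in steps 3 and 4 can be illustrated with a minimal Python sketch. It is not the Wordom implementation. It assumes that the interaction strengths I_ij of each frame are already available as a dictionary keyed by residue pairs, it takes I_critic as the first scanned I_min at which the largest cluster is at most half its size at I_min = 0, and it omits the LMI correlation filter of step 4(b). All function names and toy values are hypothetical.

```python
import heapq

def build_graph(strengths, i_min):
    """Connect residue pairs whose interaction strength is >= i_min."""
    graph = {}
    for (i, j), value in strengths.items():
        if value >= i_min:
            graph.setdefault(i, set()).add(j)
            graph.setdefault(j, set()).add(i)
    return graph

def largest_cluster_size(graph):
    """Number of nodes in the largest connected component."""
    seen, best = set(), 0
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for nxt in graph[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        best = max(best, size)
    return best

def find_i_critic(strengths, i_values):
    """First scanned I_min at which the largest cluster has shrunk to half or less."""
    reference = largest_cluster_size(build_graph(strengths, 0.0))
    for i_min in i_values:
        if largest_cluster_size(build_graph(strengths, i_min)) <= reference / 2:
            return i_min
    return None

def shortest_path(graph, source, target):
    """Dijkstra's algorithm with unit edge weights; returns one shortest path or None."""
    queue, visited = [(0, [source])], set()
    while queue:
        dist, path = heapq.heappop(queue)
        node = path[-1]
        if node == target:
            return tuple(path)
        if node in visited:
            continue
        visited.add(node)
        for nxt in sorted(graph.get(node, ())):
            if nxt not in visited:
                heapq.heappush(queue, (dist + 1, path + [nxt]))
    return None

def path_frequencies(frame_strengths, i_min, source, target):
    """Fraction of frames in which each shortest path between two residues occurs."""
    counts = {}
    for strengths in frame_strengths:
        path = shortest_path(build_graph(strengths, i_min), source, target)
        if path is not None:
            counts[path] = counts.get(path, 0) + 1
    return {p: c / len(frame_strengths) for p, c in counts.items()}

# Toy data: interaction strengths (%) among residues 1-5 in two kinds of frames.
frame_a = {(1, 2): 4.0, (2, 3): 3.5, (3, 5): 3.0, (1, 4): 2.0, (4, 5): 1.0}
frame_b = {(1, 2): 4.0, (2, 3): 1.0, (3, 5): 3.0, (1, 4): 3.0, (4, 5): 3.2}
i_values = [round(0.1 * k, 1) for k in range(51)]   # 0.0 to 5.0 by 0.1 steps

i_critic = find_i_critic(frame_a, i_values)
frequencies = path_frequencies([frame_a, frame_a, frame_a, frame_b], 2.5, 1, 5)
print("I_critic for frame A:", i_critic)
print("Path frequencies between residues 1 and 5:", frequencies)
```

In the toy trajectory the path 1-2-3-5 is found in three of four frames and the path 1-4-5 in the remaining one, so only the former would survive a frequency threshold of, say, 50%. The I_min of 2.5 used for the path search is an arbitrary illustrative value and is not derived from the I_critic scan.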

4. Notes

1. Ab initio modeled loops are less reliable than loop structures modeled by satisfying the spatial restraints derived from the sequence alignment of target and template proteins (i.e., by comparative modeling).

2. In some cases, it is worth selecting more than one model, i.e., models that show comparable quality check parameters but some significant structural differences (e.g., in loop conformations).

3. The Dunbrack and Karplus (55), Ponder and Richards (56), and Sutcliffe (57) rotamer libraries are used, and the conformation on which the three libraries agree and that causes no steric clashes is assigned automatically. In general, more than one receptor conformer results from this operation. Such conformers are all worth considering for further calculations.

4. Membrane topology predictions from a single predictor, such as PRODIV-TMHMM_0.91, or a consensus prediction from more than one source are worth using to identify the interface Cα-atoms lying in the xy plane.

5. For the membrane topology filter to work properly, the two docked structural models must have the appropriate orientation with respect to the putative membrane. This is because ZDOCK expresses its docking solutions in terms of an x, y, z-translation and an RzRxRz-rotation of the probe. If both target and probe are properly oriented, i.e., with the interface Cα-atoms lying in the xy plane, the translation along the z-axis can be considered as an offset out of the membrane, and the Rx component of the rotation as a deviation from the original orientation in the membrane. A wrong membrane topology of the docked proteins generates errors in the filtering stage.

6. Selection of quite permissive filtering cutoffs is necessary to avoid removal of reliable solutions as a consequence of possible small input deviations from the proper membrane topology.

7. The QT-like algorithm first calculates the Cα-RMSD for each superimposed pair of dimers/oligomers and then it computes the number of neighbors for each dimer/oligomer by using a threshold Cα-RMSD. The dimer/oligomer with the highest number of neighbors is considered as the center of the first cluster. All the neighbors of this configuration are removed from the ensemble of configurations to be counted only once. The center of the second cluster is then determined in the same way as for the first cluster, and this procedure is repeated until each structure is assigned to a cluster.

8. It is worth noting that the criteria for evaluating the correctness of the G protein membrane topology are quite rough. Indeed, membrane topologies were considered acceptable if the main axis of the N-terminal helix of the α-subunit was almost parallel and close enough to the membrane surface to allow the post-translational hydrophobic modifications of the α- and γ-subunits to insert into the membrane. In this respect, visual inspection serves to discard obvious deviations.

9. Final results must be representative of a number of MD simulations differing in the input structural models and/or in the input setup.


10. Selection of the amino acids that contribute to SASA must be tailored to the system under consideration and should favor the maximal difference in SASA between functionally different states (i.e., inactive and active).

11. A covariance matrix is constructed by using the Cartesian coordinates of the Cα-atoms as variable set and the trajectory frames as data set. PCA is carried out on both single and concatenated Cα-trajectories of the molecular systems under comparison. According to the Essential Dynamics analysis protocol, the diagonalization of the covariance matrix produces a set of eigenvector and eigenvalue pairs, which indicate, respectively, directions and amplitudes of motions. Eigenvectors characterized by high eigenvalues describe motions with great atomic displacements. The motions along the most significant eigenvectors can be obtained by projecting each frame of the original trajectories over a number of eigenvectors, which describe the essential subspace of the system.

12. The strength of interaction between residues i and j ($I_{ij}$) is evaluated as a percentage given by the following equation:

$$I_{ij} = \frac{n_{ij}}{\sqrt{N_i N_j}} \times 100$$

where $I_{ij}$ is the percentage interaction between residues i and j; $n_{ij}$ is the number of atom–atom pairs between the side chains of residues i and j within a distance cutoff (4.5 Å); $N_i$ and $N_j$ are normalization factors for residue types i and j, which take into account the differences in size of the side chains of the residue types and their propensity to make the maximum number of contacts with other amino acid residues in protein structures.

13. In the node clustering procedure, nodes are iteratively assigned to a cluster if they can establish a link with at least one node in that cluster. A node that cannot be linked to any existing cluster initiates a new cluster, and so on until the node list is exhausted.

14. Path clustering relies on a similarity score ($S_{a,b}$) between paths a and b, computed according to the following equation:

$$S_{a,b} = 0.15\left(\frac{2C_N}{N_a + N_b}\right) + 0.4\left(\frac{2\,\mathrm{Max}(C_P)}{N_a + N_b}\right) + 0.45\left(\frac{2C_L}{L_a + L_b}\right)$$

where (a) $C_N$ is the number of common nodes in both paths; (b) $N_a$ and $N_b$ are the number of nodes in paths a and b, respectively; (c) $\mathrm{Max}(C_P)$ is the greatest number of nodes at the same position in the path as obtained by sliding the nodes of path a over the nodes of path b by one position at a time and then inverting the two paths (i.e., sliding path b over path a); (d) $C_L$ is the number of common links in both paths (i.e., those links connecting pairs of identical nodes); and (e) $L_a$ and $L_b$ are the number of links in paths a and b, respectively. The similarity score ranges from 0, for two totally different paths, to 1, for two identical paths.

The similarity score is also used to compute the cluster centers. A cluster center is, indeed, the path with the highest average S among all the paths in the cluster, i.e., the path with the highest number of neighbors in the cluster.
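As an illustration of Note 14, the following Python sketch computes the similarity score, groups paths with a QT-like procedure modeled on the one described in Note 7, and picks the cluster center as the path with the highest average S. It is a sketch under stated assumptions rather than the Wordom code: all function names are hypothetical, links are treated as undirected, Max(C_P) is obtained by scanning every relative offset of the two paths, and using a similarity threshold (instead of a Cα-RMSD threshold) to define neighbors is an assumption of this example.

```python
def max_common_positions(a, b):
    """Greatest number of identical nodes at the same position over all shifts."""
    best = 0
    # Positive and negative offsets cover sliding a over b and b over a.
    for offset in range(-(len(b) - 1), len(a)):
        matches = sum(
            1
            for i, node in enumerate(a)
            if 0 <= i - offset < len(b) and b[i - offset] == node
        )
        best = max(best, matches)
    return best


def links(path):
    """Set of undirected links between consecutive nodes of a path."""
    return {frozenset(pair) for pair in zip(path, path[1:])}


def similarity(a, b):
    """Similarity score S between two paths given as tuples of nodes."""
    c_n = len(set(a) & set(b))               # common nodes
    c_p = max_common_positions(a, b)         # Max(C_P)
    c_l = len(links(a) & links(b))           # common links
    n_sum = len(a) + len(b)
    l_sum = (len(a) - 1) + (len(b) - 1)
    score = 0.15 * (2 * c_n / n_sum) + 0.4 * (2 * c_p / n_sum)
    if l_sum:                                # guard against single-node paths
        score += 0.45 * (2 * c_l / l_sum)
    return score


def qt_like_clusters(paths, threshold):
    """Group paths: the path with most neighbors seeds each new cluster."""
    remaining = list(paths)
    clusters = []
    while remaining:
        neighbors = {
            p: [q for q in remaining if similarity(p, q) >= threshold]
            for p in remaining
        }
        seed = max(remaining, key=lambda p: len(neighbors[p]))
        members = neighbors[seed]
        clusters.append(members)
        # Members are removed so that each path is counted only once.
        remaining = [p for p in remaining if p not in members]
    return clusters


def cluster_center(members):
    """Path with the highest average similarity to the cluster members."""
    return max(
        members,
        key=lambda p: sum(similarity(p, q) for q in members) / len(members),
    )


if __name__ == "__main__":
    paths = [(1, 2, 3, 4), (1, 2, 5, 4), (1, 2, 3, 6, 4), (7, 8, 9)]
    for cluster in qt_like_clusters(paths, threshold=0.5):
        print(cluster_center(cluster), cluster)
```

The threshold of 0.5 in the usage lines is arbitrary; in practice it would be chosen for the system at hand.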

Acknowledgment

This study was supported by Telethon-Italy grants no. S00068TELU and S00068TELC.

Michele Seeber, Angelo Felline, Francesco Raimondi, and Daniele Casciari deserve acknowledgment for their valuable contribution to method development.

References

1. Fanelli F, De Benedetti PG (2011) Update 1 of: computational modeling approaches to structure-function analysis of G Protein-coupled receptors. Chem Rev 111:438–535

2. Lefkowitz RJ (2000) The superfamily of heptahelical receptors. Nat Cell Biol 2:133–136

3. Pierce KL, Premont RT, Lefkowitz RJ (2002) Seven-transmembrane receptors. Nat Rev Mol Cell Biol 3:639–650

4. Onaran HO, Scheer A, Cotecchia S, Costa T (2000) In: Kenakin T, Angus J (eds) Handbook of experimental pharmacology, vol 148, Springer, Heidelberg, pp 217–280

5. Kenakin T (2002) Efficacy at G-protein-coupled receptors. Nat Rev Drug Discov 1:103–110

6. Brady AE, Limbird LE (2002) G protein-coupled receptor interacting proteins: emerging roles in localization and signal transduction. Cell Signal 14:297–309

7. Palczewski K (2006) G protein-coupled receptor rhodopsin. Annu Rev Biochem 75:743–767

8. Choe HW, Kim YJ, Park JH, Morizumi T, Pai EF, Krauss N, Hofmann KP, Scheerer P, Ernst OP (2011) Crystal structure of metarhodopsin II. Nature 471:651–655

9. Park JH, Scheerer P, Hofmann KP, Choe HW, Ernst OP (2008) Crystal structure of the ligand-free G-protein-coupled receptor opsin. Nature 454:183–187

10. Scheerer P, Park JH, Hildebrand PW, Kim YJ, Krauss N, Choe HW, Hofmann KP, Ernst OP (2008) Crystal structure of opsin in its G-protein-interacting conformation. Nature 455:497–502

11. Standfuss J, Edwards PC, D’Antona A, Fransen M, Xie G, Oprian DD, Schertler GF (2011) The structural basis of agonist-induced activation in constitutively active rhodopsin. Nature 471:656–660

12. Rasmussen SG, Choi HJ, Rosenbaum DM, Kobilka TS, Thian FS, Edwards PC, Burghammer M, Ratnala VR, Sanishvili R, Fischetti RF, Schertler GF, Weis WI, Kobilka BK (2007) Crystal structure of the human beta(2) adrenergic G-protein-coupled receptor. Nature 450:383–387

13. Cherezov V, Rosenbaum DM, Hanson MA, Rasmussen SG, Thian FS, Kobilka TS, Choi HJ, Kuhn P, Weis WI, Kobilka BK, Stevens RC (2007) High-resolution crystal structure of an engineered human beta2-adrenergic G protein-coupled receptor. Science 318:1258–1265

14. Warne T, Serrano-Vega MJ, Baker JG, Moukhametzianov R, Edwards PC, Henderson R, Leslie AG, Tate CG, Schertler GF (2008) Structure of a beta1-adrenergic G-protein-coupled receptor. Nature 454:486–491

15. Hanson MA, Cherezov V, Griffith MT, Roth CB, Jaakola VP, Chien EY, Velasquez J, Kuhn P, Stevens RC (2008) A specific cholesterol binding site is established by the 2.8 Å structure of the human beta2-adrenergic receptor. Structure 16:897–905

16. Wacker D, Fenalti G, Brown MA, Katritch V, Abagyan R, Cherezov V, Stevens RC (2010) Conserved binding mode of human beta(2) adrenergic receptor inverse agonists and antagonist revealed by X-ray crystallography. J Am Chem Soc 132:11443–11445

17. Rasmussen SG, Choi HJ, Fung JJ, Pardon E, Casarosa P, Chae PS, Devree BT, Rosenbaum DM, Thian FS, Kobilka TS, Schnapp A, Konetzki I, Sunahara RK, Gellman SH, Pautsch A, Steyaert J, Weis WI, Kobilka BK (2011) Structure of a nanobody-stabilized active state of the beta(2) adrenoceptor. Nature 469:175–180

18. Rosenbaum DM, Zhang C, Lyons JA, Holl R, Aragao D, Arlow DH, Rasmussen SG, Choi HJ, Devree BT, Sunahara RK, Chae PS, Gellman SH, Dror RO, Shaw DE, Weis WI, Caffrey M, Gmeiner P, Kobilka BK (2011) Structure and function of an irreversible agonist-beta(2) adrenoceptor complex. Nature 469:236–240

19. Warne T, Moukhametzianov R, Baker JG, Nehme R, Edwards PC, Leslie AG, Schertler GF, Tate CG (2011) The structural basis for agonist and partial agonist action on a beta(1)-adrenergic receptor. Nature 469:241–244

20. Rasmussen SG, Devree BT, Zou Y, Kruse AC, Chung KY, Kobilka TS, Thian FS, Chae PS, Pardon E, Calinski D, Mathiesen JM, Shah ST, Lyons JA, Caffrey M, Gellman SH, Steyaert J, Skiniotis G, Weis WI, Sunahara RK, Kobilka BK (2011) Crystal structure of the beta(2) adrenergic receptor-Gs protein complex. Nature 477:549–555

21. Jaakola VP, Griffith MT, Hanson MA, Cherezov V, Chien EY, Lane JR, Ijzerman AP, Stevens RC (2008) The 2.6 angstrom crystal structure of a human A2A adenosine receptor bound to an antagonist. Science 322:1211–1217

22. Xu F, Wu H, Katritch V, Han GW, Jacobson KA, Gao ZG, Cherezov V, Stevens RC (2011) Structure of an agonist-bound human A2A adenosine receptor. Science 332:322–327

23. Lebon G, Warne T, Edwards PC, Bennett K, Langmead CJ, Leslie AG, Tate CG (2011) Agonist-bound adenosine A2A receptor structures reveal common features of GPCR activation. Nature 474:521–525

24. Dore AS, Robertson N, Errey JC, Ng I, Hollenstein K, Tehan B, Hurrell E, Bennett K, Congreve M, Magnani F, Tate CG, Weir M, Marshall FH (2011) Structure of the adenosine A(2A) receptor in complex with ZM241385 and the xanthines XAC and caffeine. Structure 19:1283–1293

25. Hino T, Arakawa T, Iwanari H, Yurugi-Kobayashi T, Ikeda-Suno C, Nakada-Nakura Y, Kusano-Arai O, Weyand S, Shimamura T, Nomura N, Cameron AD, Kobayashi T, Hamakubo T, Iwata S, Murata T (2012) G-protein-coupled receptor inactivation by an allosteric inverse-agonist antibody. Nature 482:237–240

26. Murakami M, Kouyama T (2008) Crystal structure of squid rhodopsin. Nature 453:363–367

27. Wu B, Chien EY, Mol CD, Fenalti G, Liu W, Katritch V, Abagyan R, Brooun A, Wells P, Bi FC, Hamel DJ, Kuhn P, Handel TM, Cherezov V, Stevens RC (2010) Structures of the CXCR4 chemokine GPCR with small-molecule and cyclic peptide antagonists. Science 330:1066–1071

28. Chien EY, Liu W, Zhao Q, Katritch V, Han GW, Hanson MA, Shi L, Newman AH, Javitch JA, Cherezov V, Stevens RC (2010) Structure of the human dopamine D3 receptor in complex with a D2/D3 selective antagonist. Science 330:1091–1095

29. Shimamura T, Shiroishi M, Weyand S, Tsujimoto H, Winter G, Katritch V, Abagyan R, Cherezov V, Liu W, Han GW, Kobayashi T, Stevens RC, Iwata S (2011) Structure of the human histamine H1 receptor complex with doxepin. Nature 475:65–70

30. Haga K, Kruse AC, Asada H, Yurugi-Kobayashi T, Shiroishi M, Zhang C, Weis WI, Okada T, Kobilka BK, Haga T, Kobayashi T (2012) Structure of the human M2 muscarinic acetylcholine receptor bound to an antagonist. Nature 482:547–551

31. Kruse AC, Hu J, Pan AC, Arlow DH, Rosenbaum DM, Rosemond E, Green HF, Liu T, Chae PS, Dror RO, Shaw DE, Weis WI, Wess J, Kobilka BK (2012) Structure and dynamics of the M3 muscarinic acetylcholine receptor. Nature 482:552–556

32. Granier S, Manglik A, Kruse AC, Kobilka TS, Thian FS, Weis WI, Kobilka BK (2012) Structure of the delta-opioid receptor bound to naltrindole. Nature 485:400–404

33. Wu H, Wacker D, Mileni M, Katritch V, Han GW, Vardy E, Liu W, Thompson AA, Huang XP, Carroll FI, Mascarella SW, Westkaemper RB, Mosier PD, Roth BL, Cherezov V, Stevens RC (2012) Structure of the human kappa-opioid receptor in complex with JDTic. Nature 485:327–332

34. Manglik A, Kruse AC, Kobilka TS, Thian FS, Mathiesen JM, Sunahara RK, Pardo L, Weis WI, Kobilka BK, Granier S (2012) Crystal structure of the μ-opioid receptor bound to a morphinan antagonist. Nature 485:321–326

35. Thompson AA, Liu W, Chun E, Katritch V, Wu H, Vardy E, Huang XP, Trapella C, Guerrini R, Calo G, Roth BL, Cherezov V, Stevens RC (2012) Structure of the nociceptin/orphanin FQ receptor in complex with a peptide mimetic. Nature 485:395–399

36. Hanson MA, Roth CB, Jo E, Griffith MT, Scott FL, Reinhart G, Desale H, Clemons B, Cahalan SM, Schuerer SC, Sanna MG, Han GW, Kuhn P, Rosen H, Stevens RC (2012) Crystal structure of a lipid G protein-coupled receptor. Science 335:851–855

37. Sali A, Blundell TL (1993) Comparative protein modelling by satisfaction of spatial restraints. J Mol Biol 234:779–815

38. Fiser A, Do RK, Sali A (2000) Modeling of loops in protein structures. Protein Sci 9:1753–1773

39. Chen R, Li L, Weng Z (2003) ZDOCK: an initial-stage protein-docking algorithm. Proteins 52:80–87

40. Chen R, Weng Z (2003) A novel shape complementarity scoring function for protein-protein docking. Proteins 51:397–408

41. Casciari D, Seeber M, Fanelli F (2006) Quaternary structure predictions of transmembrane proteins starting from the monomer: a docking-based approach. BMC Bioinformatics 7:340

42. Brooks BR, Bruccoleri RE, Olafson BD, States DJ, Swaminathan S, Karplus M (1983) Charmm: a program for macromolecular energy, minimization and dynamics calculations. J Comput Chem 4:187–217

43. Seeber M, Felline A, Raimondi F, Muff S, Friedman R, Rao F, Caflisch A, Fanelli F (2011) Wordom: a user-friendly program for the analysis of molecular structures, trajectories, and free energy surfaces. J Comput Chem 32:1183–1194

44. Heyer LJ, Kruglyak S, Yooseph S (1999) Exploring expression data: identification and analysis of coexpressed genes. Genome Res 9:1106–1115

45. Raimondi F, Seeber M, De Benedetti PG, Fanelli F (2008) Mechanisms of inter- and intramolecular communication in GPCRs and G proteins. J Am Chem Soc 130:4310–4325

46. Angelova K, Felline A, Lee M, Patel M, Puett D, Fanelli F (2011) Conserved amino acids participate in the structure networks deputed to intramolecular communication in the lutropin receptor. Cell Mol Life Sci 68:1227–1239

47. Fanelli F, Felline A (2011) Dimerization and ligand binding affect the structure network of A(2A) adenosine receptor. Biochim Biophys Acta 1808:1256–1266

48. Im W, Feig M, Brooks CL 3rd (2003) An implicit membrane generalized born theory for the study of structure, stability, and interactions of membrane proteins. Biophys J 85:2900–2918

49. Amadei A, Linssen AB, Berendsen HJ (1993) Essential dynamics of proteins. Proteins 17:412–425

50. Vishveshwara S, Brinda KV, Kannan N (2002) Protein structure: insights from graph theory. J Theor Comput Chem 1:187–211

51. Vishveshwara S, Ghosh A, Hansia P (2009) Intra and inter-molecular communications through protein structure network. Curr Protein Pept Sci 10:146–160

52. Dijkstra EW (1959) A note on two problems in connexion with graphs. Numer Math 1:269–271

53. Heyer LJ, Kruglyak S, Yooseph S (1999) Exploring expression data: identification and analysis of coexpressed genes. Genome Res 9:1106–1115

54. Raimondi F, Felline A, Portella G, Orozco M, Fanelli F (2012) Light on the structural communication in Ras GTPases. J Biomol Struct Dyn (in press)

55. Dunbrack RL Jr, Karplus M (1993) Backbone-dependent rotamer library for proteins. Application to side-chain prediction. J Mol Biol 230:543–574

56. Ponder JW, Richards FM (1987) Tertiary tem-plates for proteins. Use of packing criteria in the enumeration of allowed sequences for different structural classes. J Mol Biol 193:775–791

57. Sutcliffe MJ, Hayes FR, Blundell TL (1987) Knowledge based modelling of homologous proteins, Part II: Rules for the conformations of substituted sidechains. Protein Eng 1:385–392

58. Ballesteros JA, Weinstein H (1995) Integrated methods for the construction of three-dimensional models and computational probing of structure-function relations in G protein-coupled receptors. Methods Neurosci 25:366–428
