7
Denaturant-dependent folding of GFP Govardhan Reddy a , Zhenxing Liu b , and D. Thirumalai a,1 a Biophysics Program, Institute for Physical Science and Technology and Department of Chemistry and Biochemistry, University of Maryland, College Park, MD 20742; and b Department of Physics, Beijing Normal University, Beijing 100875, China Edited by Peter G. Wolynes, Rice University, Houston, TX, and approved June 12, 2012 (received for review March 31, 2012) We use molecular simulations using a coarse-grained model to map the folding landscape of Green Fluorescent Protein (GFP), which is extensively used as a marker in cell biology and biotechnology. Thermal and Guanidinium chloride (GdmCl) induced unfolding of a variant of GFP, without the chromophore, occurs in an apparent two-state manner. The calculated midpoint of the equilibrium folding in GdmCl, taken into account using the Molecular Transfer Model (MTM), is in excellent agreement with the experiments. The melting temperatures decrease linearly as the concentrations of GdmCl and urea are increased. The structural features of rarely populated equilibrium intermediates, visible only in free energy profiles projected along a few order parameters, are remarkably similar to those identified in a number of ensemble experiments in GFP with the chromophore. The excellent agreement between simulations and experiments show that the equilibrium intermedi- ates are stabilized by the chromophore. Folding kinetics, upon tem- perature quench, show that GFP first collapses and populates an ensemble of compact structures. Despite the seeming simplicity of the equilibrium folding, flux to the native state flows through multiple channels and can be described by the kinetic partitioning mechanism. Detailed analysis of the folding trajectories show that both equilibrium and several kinetic intermediates, including misfolded structures, are sampled during folding. Interestingly, the intermediates characterized in the simulations coincide with those identified in single molecule pulling experiments. Our predictions, amenable to experimental tests, show that MTM is a practical way to simulate the effect of denaturants on the folding of large proteins. complex folding landscape of GFP equilibrium and kinetic pathways multiple folding routes self-organized polymer model multicanonical simulation O ur understanding of the folding mechanisms of small single domain proteins has advanced significantly over the last twenty years thanks to theoretical developments (17), simula- tions (813), and advances in experimental methods (1419). Indeed, it can be justly stated that simulations of structure-based models are remarkably successful in predicting the folding mechanisms in the presence (2022) and absence of denaturants (2325). More refined questions, such as the relationship be- tween pathway diversity and the symmetry of the underlying na- tive structures (26), transition path times for crossing free energy barriers (27), the nature of the transition state ensemble (2831) have also been addressed using theory and experiments. In con- trast, much less is known about how large proteins with number, N, of amino acids exceeding 200 with complex β-sheet fold. Folding mechanisms of Green Fluorescent Protein (GFP), with predominantly β-sheet native structures stabilized by con- tacts between residues that are well separated along the se- quence, have been extensively investigated (3240). However, the details of the folding kinetics including the structures in the states that are visited along the folding routes are unexplored. It has been difficult to describe quantitatively the folding of GFP be- cause of the extraordinarily long equilibration times, which has made it non-trivial to obtain consistent values of even thermody- namic quantities using different experimental probes. For exam- ple, the number of intermediates and their stabilities, and the associated denaturant-dependent properties (m-values) are not unambiguously known for many of the variants of GFP. Because of the presence of a visual chromophore, that is shielded from water by the barrel structure (Fig. 1A), GFP has been used as a marker in a number of biotechnology and biophysical applica- tions ranging from protein-protein interactions to gene expres- sion (41). Consequently, it is important to map in detail the folding landscapes of FPs (32), which could aid in the develop- ment of new GFP tools for use in in vivo localization studies. The folded structure of GFP has eleven β-strands that are closed to form the characteristic barrel structure (Figs. 1A and 1B). The chromophore, surrounded by the barrel and the capping helices, causes hysteresis (denaturant-induced equili- brium folding and unfolding do not coincide even on time scales that exceed hours) in the folding of a number of variants of GFP (34, 35). Mutants that inhibit chromophore formation fold with- out hysteresis, and could form the basis of describing the de novo folding of the complex barrel structure. Here, we use simulations of a coarse-grained (CG) Self-Orga- nized Polymer model (42) with side chains (SOP-SC) (21) of GFP, with N ¼ 230. Our simulations are performed for a chromophore less variant of GFP, Citrine, which differs from the wild type se- quence by point mutations at four locations, S65G/V68L/Q69M/ S72A/T203Y (36). Thermal denaturation and unfolding induced by Guanidinium chloride (GdmCl) (simulated using the Molecu- lar Transfer Model (MTM)) (20) show that equilibrium folding can be approximately described by a two-state model. Folding trajectories probing equilibrium fluctuations and free energy pro- files point to the presence of intermediates. We show, in agree- ment with experiments (33, 38), using a variety of probes of the structures (distributions of the radius of gyration, R g , solvent accessible surface area, and structural overlap function, χ) that in the intermediate, the N-terminus is folded whereas the C-ter- minus region comprising of strands β 7 to β 10 are flexible and dis- ordered. Upon initiating folding the time-dependent decrease in hR g ðtÞi, obtained using Brownian dynamics simulations (21, 43), shows that compaction occurs in at least two major stages. In accord with the predictions of the kinetic partitioning mechan- ism(KPM) (44) Citrine folds by at least four distinct routes. In the dominant pathway, folding commences from preformed local β-sheet structures and occurs in an almost all-or-none manner. Stable local structures associate by diffusion-collision mechanism (45) resulting in the folded state. Besides providing testable pre- dictions, such as the linear decrease in the melting temperature as a function of GdmCl and urea concentrations, our work shows that the complexity of GFP folding can be fairly accurately cap- tured using the SOP-SC model in conjunction with MTM. Author contributions: G.R. and D.T. designed research; G.R., Z.L., and D.T. performed research; G.R. and D.T. contributed new reagents/analytic tools; G.R. and D.T. analyzed data; and G.R. and D.T. wrote the paper. The authors declare no conflict of interest. This article is a PNAS Direct Submission. 1 To whom correspondence should be addressed. E-mail: [email protected]. This article contains supporting information online at www.pnas.org/lookup/suppl/ doi:10.1073/pnas.1201808109/-/DCSupplemental. 1783217838 PNAS October 30, 2012 vol. 109 no. 44 www.pnas.org/cgi/doi/10.1073/pnas.1201808109

Denaturant-dependent folding of GFPbiotheory.umd.edu/PDF/Reddy_PNAS_2012-2.pdf · 2012. 11. 13. · Denaturant-dependent folding of GFP Govardhan Reddya, Zhenxing Liub, and D. Thirumalaia,1

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Denaturant-dependent folding of GFPbiotheory.umd.edu/PDF/Reddy_PNAS_2012-2.pdf · 2012. 11. 13. · Denaturant-dependent folding of GFP Govardhan Reddya, Zhenxing Liub, and D. Thirumalaia,1

Denaturant-dependent folding of GFPGovardhan Reddya, Zhenxing Liub, and D. Thirumalaia,1

aBiophysics Program, Institute for Physical Science and Technology and Department of Chemistry and Biochemistry, University of Maryland, College Park,MD 20742; and bDepartment of Physics, Beijing Normal University, Beijing 100875, China

Edited by Peter G. Wolynes, Rice University, Houston, TX, and approved June 12, 2012 (received for review March 31, 2012)

Weusemolecular simulations using a coarse-grainedmodel to mapthe folding landscape of Green Fluorescent Protein (GFP), which isextensively used as a marker in cell biology and biotechnology.Thermal and Guanidinium chloride (GdmCl) induced unfolding ofa variant of GFP, without the chromophore, occurs in an apparenttwo-state manner. The calculated midpoint of the equilibriumfolding in GdmCl, taken into account using the Molecular TransferModel (MTM), is in excellent agreement with the experiments. Themelting temperatures decrease linearly as the concentrations ofGdmCl and urea are increased. The structural features of rarelypopulated equilibrium intermediates, visible only in free energyprofiles projected along a few order parameters, are remarkablysimilar to those identified in a number of ensemble experimentsin GFP with the chromophore. The excellent agreement betweensimulations and experiments show that the equilibrium intermedi-ates are stabilized by the chromophore. Folding kinetics, upon tem-perature quench, show that GFP first collapses and populates anensemble of compact structures. Despite the seeming simplicityof the equilibrium folding, flux to the native state flows throughmultiple channels and can be described by the kinetic partitioningmechanism. Detailed analysis of the folding trajectories showthat both equilibrium and several kinetic intermediates, includingmisfolded structures, are sampled during folding. Interestingly, theintermediates characterized in the simulations coincide with thoseidentified in single molecule pulling experiments. Our predictions,amenable to experimental tests, show that MTM is a practicalway to simulate the effect of denaturants on the folding of largeproteins.

complex folding landscape of GFP ∣ equilibrium and kinetic pathways ∣multiple folding routes ∣ self-organized polymer model ∣ multicanonicalsimulation

Our understanding of the folding mechanisms of small singledomain proteins has advanced significantly over the last

twenty years thanks to theoretical developments (1–7), simula-tions (8–13), and advances in experimental methods (14–19).Indeed, it can be justly stated that simulations of structure-basedmodels are remarkably successful in predicting the foldingmechanisms in the presence (20–22) and absence of denaturants(23–25). More refined questions, such as the relationship be-tween pathway diversity and the symmetry of the underlying na-tive structures (26), transition path times for crossing free energybarriers (27), the nature of the transition state ensemble (28–31)have also been addressed using theory and experiments. In con-trast, much less is known about how large proteins with number,N, of amino acids exceeding ≈200 with complex β-sheet fold.

Folding mechanisms of Green Fluorescent Protein (GFP),with predominantly β-sheet native structures stabilized by con-tacts between residues that are well separated along the se-quence, have been extensively investigated (32–40). However, thedetails of the folding kinetics including the structures in the statesthat are visited along the folding routes are unexplored. It hasbeen difficult to describe quantitatively the folding of GFP be-cause of the extraordinarily long equilibration times, which hasmade it non-trivial to obtain consistent values of even thermody-namic quantities using different experimental probes. For exam-ple, the number of intermediates and their stabilities, and the

associated denaturant-dependent properties (m-values) are notunambiguously known for many of the variants of GFP. Becauseof the presence of a visual chromophore, that is shielded fromwater by the barrel structure (Fig. 1A), GFP has been used asa marker in a number of biotechnology and biophysical applica-tions ranging from protein-protein interactions to gene expres-sion (41). Consequently, it is important to map in detail thefolding landscapes of FPs (32), which could aid in the develop-ment of new GFP tools for use in in vivo localization studies.

The folded structure of GFP has eleven β-strands that areclosed to form the characteristic barrel structure (Figs. 1Aand 1B). The chromophore, surrounded by the barrel and thecapping helices, causes hysteresis (denaturant-induced equili-brium folding and unfolding do not coincide even on time scalesthat exceed hours) in the folding of a number of variants of GFP(34, 35). Mutants that inhibit chromophore formation fold with-out hysteresis, and could form the basis of describing the de novofolding of the complex barrel structure.

Here, we use simulations of a coarse-grained (CG) Self-Orga-nized Polymer model (42) with side chains (SOP-SC) (21) of GFP,withN ¼ 230. Our simulations are performed for a chromophoreless variant of GFP, Citrine, which differs from the wild type se-quence by point mutations at four locations, S65G/V68L/Q69M/S72A/T203Y (36). Thermal denaturation and unfolding inducedby Guanidinium chloride (GdmCl) (simulated using the Molecu-lar Transfer Model (MTM)) (20) show that equilibrium foldingcan be approximately described by a two-state model. Foldingtrajectories probing equilibrium fluctuations and free energy pro-files point to the presence of intermediates. We show, in agree-ment with experiments (33, 38), using a variety of probes of thestructures (distributions of the radius of gyration, Rg, solventaccessible surface area, and structural overlap function, χ) thatin the intermediate, the N-terminus is folded whereas the C-ter-minus region comprising of strands β7 to β10 are flexible and dis-ordered. Upon initiating folding the time-dependent decrease inhRgðtÞi, obtained using Brownian dynamics simulations (21, 43),shows that compaction occurs in at least two major stages. Inaccord with the predictions of the kinetic partitioning mechan-ism(KPM) (44) Citrine folds by at least four distinct routes. Inthe dominant pathway, folding commences from preformed localβ-sheet structures and occurs in an almost all-or-none manner.Stable local structures associate by diffusion-collision mechanism(45) resulting in the folded state. Besides providing testable pre-dictions, such as the linear decrease in the melting temperature asa function of GdmCl and urea concentrations, our work showsthat the complexity of GFP folding can be fairly accurately cap-tured using the SOP-SC model in conjunction with MTM.

Author contributions: G.R. and D.T. designed research; G.R., Z.L., and D.T. performedresearch; G.R. and D.T. contributed new reagents/analytic tools; G.R. and D.T. analyzeddata; and G.R. and D.T. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.1To whom correspondence should be addressed. E-mail: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1201808109/-/DCSupplemental.

17832–17838 ∣ PNAS ∣ October 30, 2012 ∣ vol. 109 ∣ no. 44 www.pnas.org/cgi/doi/10.1073/pnas.1201808109

Page 2: Denaturant-dependent folding of GFPbiotheory.umd.edu/PDF/Reddy_PNAS_2012-2.pdf · 2012. 11. 13. · Denaturant-dependent folding of GFP Govardhan Reddya, Zhenxing Liub, and D. Thirumalaia,1

ResultsThermal Denaturation: We used multicanonical (MC) moleculardynamics simulations (46, 47) so that thermodynamic propertiesof Citrine can be computed accurately (see SI Text, and Movie S1for details). Dependence of the total energy hEðTÞi as a functionof temperature, T, and the associated heat capacityCvðTÞð¼ hE2i−hEi2

kBT 2 Þ both indicate that folding occurs coopera-tively (inset in Fig. 1C). The melting temperature, correspondingto the peak in CvðTÞ in the absence of denaturants is,Tm ¼ 379 K (see right inset in Fig. 1C). The structural overlapfunction, χ, is used to distinguish between the Native Basin ofAttraction (NBA), equilibrium intermediates (IEQL), and UBA,the unfolded basin of attraction (Methods). The fraction of mo-lecules in the Native Basin of Attraction, fNBAðTÞ, as a functionof T shows that the folding transition is cooperative (Fig. 1C)suggesting that a two-state description is adequate. The valueof Tm computed using PðTmÞ ¼ 0.5 yields Tm ¼ 379 K, whichcoincides with the peak in the heat capacity (Fig. 1C). The Tmvalue obtained in our simulations are in very good agreementwith calorimetry measurements (48, 49), thus validating theSOP-SC model.

The two state nature of thermal unfolding obtained fromFig. 1C hides the presence of equilibrium intermediates, whichare evident in the MC folding trajectory (see Fig. S1 in the SIText). The time dependent changes in E and χ in Fig. S1 showthat at least two equilibrium states (hereafter referred collectivelyas IEQL) are populated. The dips in the plots of free energy pro-files, FðEÞ as a function of E, above and below Tm correspond topopulation of intermediates with χ values in the range specified inMethods (Fig. 1D). The inset in Fig. 1C shows that f INTðTÞ is verysmall, which explains the apparent two-state folding of Citrine in

the absence of the chromophore (see also (34)). The finding thatIEQL are populated during the folding process (see also Fig. 2)provides indirect support to the analysis used in experimental stu-dies, which found that a three-state description provides a quan-titative fit to denaturant unfolding of Citrine in the presence ofchromophore. Thus, the intermediate is, in all likelihood, stabi-lized by the chromophore.

Denaturant Induced Unfolding: To make a direct comparison withensemble experiments (33, 34) we simulated the effects of Gua-nidinium chloride (GdmCl) using the MTM (see Methods and SIText for details). Following our previous work (21), we choose asimulation temperature, Ts, at which the free energy differencesbetween the NBA and the unfolded state, ΔGNUð¼ GNðTsÞ−GUðTsÞÞ calculated from simulations agree with the measuredvalue of ΔGE

NU ¼ −16.5 kcal∕mole for R96A mutant (34) at de-naturant concentration, ½C� ¼ 0. We chose R96A as a referenceGFP for fixing Ts because it exhibits a clear equilibrium two-statetransition (34). Since the absolute values of the effective interac-tion energies in CG models cannot be determined, we used Ts asan adjustable parameter to get accurate estimate of ΔGNU . Thechoice of Ts, which is the only parameter that is fixed in our si-mulations, amounts to choosing the overall free energy scale (21).The dependence of fNBA½C� and fUBA½C� on ½C� at Ts ¼ 368.2 Kalso shows that Citrine folds and unfolds reversibly in an apparenttwo state manner(Fig. 2A). However, the right inset in Fig. 2Ashows signs for the presence of an intermediate.

The midpoint of the folding transition f ð½Cm�Þ ¼ 0.5 yieldsCm ¼ 1.3 M. The simulated equilibrium titration curves can befit using a two state model from which the dependence ofΔGNUð½C�Þ on ½C� can be calculated using ΔGNUð½C�Þ ¼−RTs lnð fNBAð½C�Þ

1−fNBAð½C�ÞÞ. The linear fit, ΔGNUð½C�Þ ¼ GANUð½0�Þþ

Fig. 1. GFP structure andthermal folding. (A) Ribbondiagram of Citrine (Yellowversion of GFP, PDB ID:1HUY). Residues marked inorange, green and mauveare Lys3, Asn198, andAsn212, respectively. (B)Splay representation of GFP.The N-terminus β-strandsare represented in blue,the kinked helix at the cen-ter of the β-strand barrel isin green, the three β-strandsin the center, which form lo-cal contacts are in silver, andthe C-terminus β-strands arein red. (C) Fraction of GFP inthe NBA, INT, and UBA as afunction of temperature, T ,are shown in triangles, dia-monds, and circles respec-tively. The left inset showsf INTðTÞ as a function of T .The right inset shows en-ergy, E, and the heat capa-city, Cv as a function of T .(D) Free energy of GFP asa function of E for tempera-tures above and belowthe melting temperature(T ¼ 379 K). The structureof one of the intermediatesis shown in splay represen-tation.

Reddy et al. PNAS ∣ October 30, 2012 ∣ vol. 109 ∣ no. 44 ∣ 17833

CHEM

ISTR

YBIOPH

YSICSAND

COMPU

TATIONALBIOLO

GY

SPEC

IALFEAT

URE

Page 3: Denaturant-dependent folding of GFPbiotheory.umd.edu/PDF/Reddy_PNAS_2012-2.pdf · 2012. 11. 13. · Denaturant-dependent folding of GFP Govardhan Reddya, Zhenxing Liub, and D. Thirumalaia,1

mGdmCl½C�, yields the apparent stability at ½C� ¼ 0, GANUð½0�Þ≈

−10.8 kcal∕mole and mGdmCl ≈ 11.3 kcal∕mole:M. We attributethe difference between GA

NUð½0�Þ and GENUð½0�Þ to well estab-

lished uncertainties in the measured transfer energies at lowGdmCl concentration. For comparison we also show the experi-mental results on R96A. The left inset in Fig. 2A, which shows thedependence of ΔGNUð½C�Þ on ½C� using the experimental data onR96A yields mR96A

GdmCl ≈ 14.1 kcal∕mole:M and Cm ≈ 1.3 M. Them and Cm values obtained from simulations are in reasonableagreement with experiments.

To ensure that the source of discrepancy between GANUð½0�Þ

andGENUð½0�Þ is due to the problems associated with transfer free

energy measurements involving GdmCl, we calculated the depen-dence of ΔGNUð½C�Þ on ½C� for urea induced unfolding using theMTM theory (20). The left inset in Fig. 2A shows thatΔGNUð½C�Þalso decreases linearly as urea ½C� increases. The calculatedvalue of GNUð½0�Þ ≈ −16.4 kcal∕mole is in excellent agreementwith experiment (34). The MTM based theory predicts thatmurea ≈ 5.7 kcal∕mole:M, which is considerably smaller thanmGdmCl. The predictions for urea induced unfolding of Citrinecan be tested experimentally.

The apparent all or none transition in Fig. 2A gives no indica-tion of the presence of IEQL (see also Fig. 1C). However, the freeenergy profiles, FðRgÞ, as a function of Rg (Fig. 2B) shows thatthere are two high energy intermediates at all values of ½C�. As isthe case in thermal denaturation, the probability of observingthese states is extremely small even though the signature of theirpresence is evident in the folding trajectories (Fig. S1 in SI Text).The free energy profiles projected along E and Rg both are simi-lar implying that these states are structurally similar.

The heat capacity curves at various values of GdmCl concen-trations show that the peaks corresponding to Tmð½C�Þ decreasesas ½C� increases (Fig. 2C). The decrease in Tmð½C�Þ is linear forboth GdmCl and urea (Fig. 2D). The variations in Tmð½C�Þ can befit using Tmð½C�Þ ¼ Tmð½0�Þ − αi½C�, where αurea ¼ 3.85 K∕Mand αGdmCl ¼ 5.7 K∕M. Taken together, the simulations showthat the global equilibrium folding of Citrine, without the chro-mophore, induced by temperature or denaturants is cooperative.

Structural Characteristics of IEQL and Unfolded States: From the mul-ticanonical trajectories (Fig. S1) and the free energy profiles(Fig. 1D and Fig. 2B) we infer that there are two equilibrium in-termediates. In one of the intermediates, interactions involvingpredominantly the C-terminal strands (β7 − β10) are disrupted.Almost all of the contacts in the C-terminal strands are brokenin the second intermediate (Fig. 3A). In both the the intermedi-ates, the N-terminal β strands are formed and undergo small fluc-tuations (Fig. 3A). Interestingly, H/D exchange experiments showthat these regions have the largest exchange rates suggestingthat in IEQL the C-terminal region is flexible (33), whereas theN-terminal strands are more-or-less intact. Thus, the structuralattributes of the equilibrium intermediates in our simulationsare similar to those identified in experiments.

To compare with the SAXS experiments, we also calculated thepair distance distribution PðrÞ for the folded, IEQL, and the un-folded states by varying the GdmCl concentration (Fig. 3B). Atlow ½C�, we find that PðrÞ is peaked around 20 Å. The meanhRgið¼ ½∫ r2PðrÞdr∕2�1∕2Þ ≈20 Å compares well with the valueobtained using Rg ¼ ½ 1

2N 2 ∑i ∑jð ~ri − ~rjÞ2�1∕2 where ~ri and ~rj arethe positions of the ith and jth beads respectively from theX-ray structure. The PðrÞ for IEQL is broader with a shoulderaround 50 Å, which corresponds to structures with complete dis-ruption of the C-terminal structure. The value of hRI

g i forIEQL ≈ 43.8 Å. The PðrÞ for the unfolded state at high GdmClconcentration is broad with a tail that extends to ≈200 Å. Thecalculated value of hRgi for the unfolded state is hRU

g i ≈ 61 Å.The features found in PðrÞ for the three states are in broad agree-ment with SAXS experiments (38) on a different variant of GFPin which unfolding is triggered by decreasing pH. However, thevalues of hRI

g i and hRUg i obtained from simulations are consider-

ably higher than the estimates from SAXS at pH ¼ 4 andpH ¼ 2.2, respectively. It is unclear if the greater compactionof IEQL and the unfolded state at acidic pH compared to simula-tions is due to the different conditions (pH versus temperature),the restricted range of scattering vectors (38) or due to the shortcomings of the SOP-SC model. The distribution PðRgÞ of Rg

shows that as [C] increases GFP samples extended conformationswith Rg exceeding 80 ° (Fig. 3C).

C D

A B

Fig. 2. Denaturant effects on GFP: (A) Dependence offαð½C�Þ (α ¼ NBA, INT, and UBA) as a function of GdmCl con-centration. Blue triangles, black diamond, and red circlesare used for fNBAð½C�Þ, f INTð½C�Þ, and fUBAð½C�Þ, respectively.The experimental data for fraction of protein in UBA as afunction of ½C� is shown in green squares. The right insetshows the low probability of populating the intermediatespecies. The left inset shows the linear variation of free en-ergy difference between the NBA and UBA, ΔGNU , as afunction of denaturant concentrations. The data in redand green are for GdmCl and Urea, respectively. (B) FreeEnergy profiles, FðRgÞ, of GFP as a function of Rg showthe population of a thermodynamic intermediate corre-sponding to the shaded green area. (C) Heat capacity asa function of GdmCl. (D) Variations in the melting tempera-ture, corresponding to the peak in CvðTÞ, as a function ofdenaturant concentrations. The upper curve is for urea andthe lower one corresponds to GdmCl.

17834 ∣ www.pnas.org/cgi/doi/10.1073/pnas.1201808109 Reddy et al.

Page 4: Denaturant-dependent folding of GFPbiotheory.umd.edu/PDF/Reddy_PNAS_2012-2.pdf · 2012. 11. 13. · Denaturant-dependent folding of GFP Govardhan Reddya, Zhenxing Liub, and D. Thirumalaia,1

Collapse and Folding Kinetics:We used Brownian dynamics simula-tions to generate 80 folding trajectories starting from an initialensemble of equilibrated structures at a high temperature (TH >420 K) and reducing the temperature to T ¼ 300 K to initiatefolding. From the distribution of first passage times, PFPðsÞ,we calculated the probability, PUðtÞ ¼ ∫ t

0PFPðsÞds, that Citrinehas not folded at time, t (inset in Fig. 4A). The folding time ob-tained from the exponential decay of PUðtÞ ≈ expð−t∕τFÞ isτF ≈ 5 ms. Although there is a wide range for τF reported inexperiments the smallest value for τF for Citrine from recentexperiments is ≈1 s (36). Thus, our estimate of τF is at least ahundred fold less than experimental values, which could arise

from neglect of solvent and the coarse-grained representationof the proteins.

The time-dependent changes in hRgðtÞi during the foldingprocess shows that collapse occurs in two stages (Fig. 4A). Thedecay of hRgðtÞi can be fit using hRgðtÞi ¼ a1 expð−t∕τ1Þþa2 expð−t∕τ2Þ, where a1 ¼ ð23.4� 0.2Þ Å, a2 ¼ ð22.5� 0.2Þ Å,τ1 ¼ 3.6 ms and τ2 ¼ 43.47 ms (Fig. 4A). The decrease in thevalue of RgðtÞ in each stage is roughly the same (≈22 Å). How-ever, there is a separation in the time scales in the two stagesimplying that distinctly different ensembles of structures aresampled in two stages. The structures reached at the end of thefirst stage correspond to minimum energy compact structures(50) that guide the folding of Citrine. The predicted multistagecompaction of GFP, similar to that found in other proteins (51),can be tested using time-resolved SAXS experiments.

Parallel Pathways and Kinetic Intermediates: By analyzing the 80folding trajectories using E (seeMethods) as the progress variablefor the folding reaction (Fig. S2) we find that Citrine folds alongfour pathways. For reasons explained below, we classify the struc-tures populated in three of the pathways as kinetic intermediatesand the intermediate in the fourth has most of the hallmarks ofIEQL. Representative trajectories one from each pathway, dis-playing energy as a function of t, show that in each pathway fold-ing occurs in stages (Fig. 4B). The nature of structures and theirlifetimes vary greatly depending on the pathway. In the dominantpathway KIN1, through which ≈50% of the flux to the native stateis channeled, Citrine folds in nearly a two state manner (Fig. 4B).In this pathway, β-strands at the N-terminus, the core of theprotein, and the C-terminal strands form rapidly in a single step.The folding units in the long lived meta-stable states collide andconsolidate (see Movie S1 in the SI Text) as envisioned in the dif-fusion-collision model (45).

In the second kinetic pathway, KIN2, representing ≈16% ofthe trajectories, folding occurs through an intermediate in whichtertiary contacts between strands β3 and β11 and the interfacebetween β1 and β6 are not formed (Fig. 4C). The structural prop-erties of the intermediate in KIN2, such as the distributions of Rgand χ, differ from IEQL, and hence we conclude that it is observedonly during the process of folding. Similarly, in the third kineticpathway, KIN3, through which about 10% of the flux to the nativestate flows, the intermediates shown in Fig. 4D are observed. Theintermediate shown in Fig. 4D, has a kinked helix interacting withthe N-terminal β-sheets. The splay diagram shows further thatlong-range interactions involving β3 and β11 as well as β1 and β6are disrupted.

Equilibrium Folding Intermediate: In about 15% of the trajectoriesan intermediate whose structural characteristics almost coincidewith IEQL is populated. The folding trajectory (Fig. 4B) and the½χ; Rg� plot in Fig. 4D show that the intermediate has a long life-time. There are four lines for evidence which show that the ki-netic and equilibrium intermediates are structurally similar.First, the hRgi obtained using the conformations sampled withfolded N-terminal strands (the C-terminal strands are fluid-like)at equilibrium is ∼14 Å whereas those obtained from kinetic tra-jectories is ∼13 Å. Second, the typical conformations sampled atequilibrium and during the folding process are similar (comparethe structures in Fig 3A). In both the structures the strands β1through β6 are formed whereas the C-terminal is more flexible.Third, the values of χ for the conformations corresponding to theplateau in E as a function of t (purple line in Fig. 4B) is in therange 0.7 < χ ≤ 0.895, which coincides with the estimates ob-tained using equilibrium trajectories (see also the shaded greenregion in Fig. 2B). Fourth, we calculated the distribution, PðΔRÞ,of ΔR ¼ ðAU−AIÞ

ðAU−AFÞ, whereAU ,AI andAN are the solvent accessiblesurface area (SASA) for the unfolded, IEQL, and the native states

0.5

0.4

0.3

0.2

0.1

0.0

P(R

g)

80604020Rg(Å)

1.7M 1.3M 0M

50x10-3

40

30

20

10

0

P(R

g)

908070605040

Rg(Å)

30x10-3

20

10

0

P(r

)

200150100500 r (Å)

16

12

8

4

0

P(

R)

0.90.80.70.60.50.4

R

A

B

C

Fig. 3. (A) Distribution of PðΔRÞ where ΔR is the relative accessible surfacearea (see text for details) in one of the kinetic intermediates with attributessimilar to that found under equilibrium. The typical structure sampled atequilibrium is on the left whereas the structure on the right represents a con-formation obtained in the kinetic simulation in the EQL pathway. (B) Distri-bution, PðrÞ, as a function of r, the distance between a pair of sites for NBA(triangles), INT (diamonds), and UBA (circles) computed using conformationssampled at equilibrium with ½C� ¼ 0. (C) Distribution of Rg as a function ofGdmCl concentration. The midpoint of the transition is Cm ¼ 1.3 M. The insetshows PðRgÞ at large Rg.

Reddy et al. PNAS ∣ October 30, 2012 ∣ vol. 109 ∣ no. 44 ∣ 17835

CHEM

ISTR

YBIOPH

YSICSAND

COMPU

TATIONALBIOLO

GY

SPEC

IALFEAT

URE

Page 5: Denaturant-dependent folding of GFPbiotheory.umd.edu/PDF/Reddy_PNAS_2012-2.pdf · 2012. 11. 13. · Denaturant-dependent folding of GFP Govardhan Reddya, Zhenxing Liub, and D. Thirumalaia,1

respectively using the conformations sampled during the kineticsimulations (plateau region in the purple curve in Fig. 4B). FromPðΔRÞ in Fig. 3A we find that the mean values hΔRi ≈ 0.62, whichcompares well with the experimental value hΔRi ¼ 0.59 esti-mated for the equilibrium intermediate in the Citrine folding(36). Thus, we surmise that the structures sampled along theEQL pathway (Fig. 4B) are similar to those observed under equi-librium conditions.

Relation Between Collapse, Secondary Structure Formation and Fold-ing: It has been shown that for efficient folding, collapse and fold-ing occur nearly simultaneously (52), where as for larger complexproteins collapse precedes by folding. In Fig. 4C and 4D we showthe conformations sampled in all the pathways. Each conforma-tion is represented by χ and Rg. We conclude from the results inFigs. 4C and 4D that in all the four pathways Citrine undergoescompaction (reduction in Rg) before folding (decrease in χ).These findings show that the search for the NBA occurs amongthe ensemble of minimum energy compact structures (50). To ex-plore the link between secondary structure formation and col-lapse we plot the fraction, f ss, of secondary structure acquiredas the polypeptide chain collapses. Fig. S3 shows that in all thepathways the value of f ss is relatively small even after substantialcollapse, which also implies that collapse generates a fluid-likeglobule. Consolidation of structure occurs only after reductionin chain dimension.

Topological Traps: In 6 out the 80 folding trajectories topologicaltraps (44) give rise to Citrine misfolding. In five of the trajec-tories, Citrine is kinetically trapped in a native-like structure inwhich the loop connecting β9 and β10 (Fig. 5B) is in an incorrectposition, thus disrupting the interactions between β-strands 4 and9. In the other misfolded structure (35), the helix is outside thebarrel and hence does not pack in the center as observed in thenative structure.

DiscussionIntermediates in GFP Folding: Compelling evidence for postulatingthe presence of equilibrium intermediates is the non-coincidenceof free energies extracted using fluorescence and NMR measure-ments even though visually the decrease in fluorescence as a func-tion of GdmCl suggests that a two-state description is sufficient(33). More recently, single molecule fluorescence experimentshave presented clear evidence for three state equilibrium foldingof Citrine with the chromophore (36). The presence of highenergy equilibrium intermediates in R96A is only evident in freeenergy profiles computed as a function of energy (Fig. 1D), radiusof gyration (Fig. 2B), and the fraction of native contacts(Fig. S1B). The structural characteristics of IEQL support thefinding based on equilibrium H/D exchange experiments showingorder in the N-terminus and flexibility in the C-terminal regionspanning β7 to β10 (33). It is worth pointing out that one of theintermediates populated in single molecule pulling experimentscorresponds to unfolding in the C-terminus domain with β1through β6 remaining intact (53). Thus, both experiments andsimulations point to the presence of IEQL. The observation thatIEQL in our simulations is a high energy intermediate occurringwith negligible probability whereas the ones characterized inexperiments are stable suggests that the intermediates in GFPfolding are stabilized by the chromophore.

Comparing Simulation and Experimental Folding Pathways: We havefound evidence for kinetic intermediates that direct folding to thenative state. The range of kinetic intermediates observed in oursimulations have previously been identified in single moleculepulling experiments and simulations (39, 40, 42, 53). When GFPis stretched by applying mechanical force, f , between residues 3and 212 (Fig. 1), GFP unfolds along two routes. In the first one,unfolding is in an all-or-none process where as in the second path-way unfolding occurs by populating an intermediate (53). Theincrease in the contour length upon unfolding GFP from thefolded to the intermediate state is consistent with formation of

0.9

0.8

0.7

0.6

0.5

0.4

0.3

706050403020

Rg(Å)

C

-2

-1

0

1

E (

kca

l/mo

le.re

sidue)

86420 t (ms)

KIN1 KIN2 KIN3 EQL

B1.0

0.8

0.6

0.4

0.2

Pu(t

)1086420

t (ms)

45

40

35

30

25

20

<R

g(t

)> Å

1086420 t (ms)

A

0.9

0.8

0.7

0.6

0.5

0.4

0.3

6050403020

Rg(Å)

D

Fig. 4. GFP folding kinetics. (A) Time (t)dependent decrease in hRgðtÞi. The solidline in green is a two exponential fit tothe data. The inset shows the fraction oftrajectories unfolded, PuðtÞ as function oft, with the solid black line being an expo-nential fit. (B) Total energy E as a functionof t for folding along four pathways. A re-presentative trajectory for each pathwayis shown. (C) Dynamic profiles generatedusing the folding trajectories in terms ofχ and Rg for KIN1 and KIN2 in red andblack respectively. The ribbon diagramof the kinetic intermediate in KIN2 showsthat β1 − β3 sheets are not packed ontothe rest of the GFP barrel structure. (D)Plot of χ as function of Rg for foldingpathways KIN3 and EQL are in greenand blue, respectively. Interactions invol-ving the helix and β1 − β3 in the first inter-mediate sampled in KIN3 with the restof the barrel are absent. The second inter-mediate in KIN3 is similar to the one ob-served for KIN2. In the intermediatepopulated in the EQL pathway, the C-terminal β-sheets do not interact withthe ordered N-terminal strands β1 − β6.

17836 ∣ www.pnas.org/cgi/doi/10.1073/pnas.1201808109 Reddy et al.

Page 6: Denaturant-dependent folding of GFPbiotheory.umd.edu/PDF/Reddy_PNAS_2012-2.pdf · 2012. 11. 13. · Denaturant-dependent folding of GFP Govardhan Reddya, Zhenxing Liub, and D. Thirumalaia,1

a structure in which strands (1–6) are intact and the rest are un-folded. Our simulations and previous ensemble experiments (33,37, 38) have identified such an intermediate as occurring both inequilibrium and during the folding process.

When mechanical force is applied (53) to the residues 3 and198, GFP unfolds by populating a different intermediate in whichthe 3 N-terminus β-strands (β1, β2, β3) have ruptured away fromthe rest of the barrel. In addition, when f is applied to the ends ofthe molecule the dominant unfolding pathway also occursthrough such an intermediate (40, 42). We showed (42) that uponinitiating folding by reducing the mechanical force from a highvalue to f ¼ 0, a long-lived metastable intermediate similar tothat observed in KIN2 and KIN3 is populated before the barrelstructure forms. We find here that in the process of refolding,upon temperature quench, the intermediate with loss of interac-tions between strands (1–3) and rest of the barrel is populated inpathways KIN2 and KIN3.

Single molecule fluorescence experiments have also shownthat Citrine unfolds by parallel pathways. In one of the pathways,the folded state is reached directly, where as an intermediate witha low FRET is populated in the other (36). It is likely, that theselow FRET states are an ensemble of conformations containing amixture of equilibrium and kinetic intermediates. Taken togetherit is clear that folding GFP must occur through a number of dis-tinct intermediates, which become visible only by using different

techniques. By combining all the results from simulations and ex-periments, we propose a model (Fig. 5B) for GFP folding thatinvolves a complex network of connected states through whichthe flux to the native flows. The resulting multiple pathways in-clude equilibrium and kinetic intermediates as well as misfoldedstructures. Detailed comparison between our simulations and ex-periments show that a number of different experimental probesare needed to quantitatively map the folding landscape of GFP,and presumably other proteins with complex topology.

Collapse and Folding: Since it was first demonstrated theoreticallythat folding and collapse transitions are intimately linked for pro-teins that reach the native state efficiently (54, 55) there has beenconsiderable interest in validating this concept. For wild typeGFP it is unmistakable that folding is preceded by populatingcompact intermediate species. Both SAXS experiments (38) aswell our simulations show that a compact state, with hRgi inter-mediate between the values in the folded and unfolded states, ispresent and is likely to be on-pathway to the NBA. The kineticfolding trajectories clearly show (Fig. 4 C and D) that GFP firstcollapses before the formation of secondary and tertiary interac-tions. Our finding is strikingly similar to the order in which struc-ture is acquired in monellin with β-sheet topology (51) as well asin GFP upon force quench (42).

MethodsSOP-Side Chain (SOP-SC) Model: In the SOP-SC model, which uses residue-de-pendent interaction between SCs (56), each residue is represented by twointeraction centers one for the backbone atoms and the other correspondsto the side chain atoms. The SOP-SC model is constructed using the crystalstructure of Citrine, an improved yellow version of GFP (57), with the ProteinData Bank PDB ID: 1HUY. Citrine has 5 mutations compared to the wild typeGFP (S65G, V68L, Q69M, S72A, T203Y). The residues labeled 0 and 1A in 1HUYare deleted. In Citrine the chromophore, labeled residue 66 in 1HUY, involvesresidues GLY65, TYR66, and GLY67. In our simulations the chromophore isdisabled by eliminating the chemical reaction involving the three chromo-phore residues. The functional form of SOP-SC model and the parametersof the CG force field are given in Table S1 in the SI Text.

Molecular Transfer Model (MTM): The energy of transferring a specific proteinconformation from water to the denaturant solution at concentration ½C� iswritten as ΔGtrð½C�Þ ¼ ∑2N

i¼1 δgtr;ið½C�Þðαi∕αGly−i−GlyÞ where the summation in-cludes both backbone and side chain atoms, δgtr;ið½C�Þ is the experimentallymeasured transfer free energy of i, αi is the solvent accessible surface area(SASA) of the bead i, αGly−i−Gly is the solvent accessible surface area of thebead in the tripeptide Gly − i − Gly. The radii of amino acid backbone andsidechain are given in Table S2. The transfer free energies δgtr;ið½C�Þ forthe backbone and side chains, taken from experiments (20, 22, 58), are listedin table S3 in ref. (21). The values for αGly−i−Gly are listed in table S4 in ref. (21).The thermodynamic properties of a protein at ½C� ≠ 0 are obtained using theprocedure described earlier (20, 22).

Data Analysis: We use the structural overlap function (59) χ ¼ 2ðM 2−5Mþ6Þ

∑M−3i¼1 ∑M

j¼iþ3 Θðδ − jrij − r 0ij jÞ to distinguish between the various populatedstates. Here, M ¼ 2N ¼ 460 is the number of interaction centers in theSOP-SC representation of GFP, rij is the distance between the beads i and jwith r 0ij being the corresponding distance in the folded state, Θ is the Heavi-side step function, and δ ¼ 2 Å. The conformation with 0 < χ ≤ 0.7 arefolded, an intermediate state has 0.7 < χ ≤ 0.895, and conformations withχ > 0.895 are unfolded (Fig. S1 in the SI Text).

ACKNOWLEDGMENTS. We are pleased to acknowledge useful discussions withSophie Jackson. This work was supported by a grant from the NationalScience Foundation through grant CHE 09-14033. ZL acknowledges financialsupport from the National Natural Science Foundation of China under thegrant no. 11104015.

1. Thirumalai D, Hyeon C (2005) RNA and protein folding: Common themes and varia-

tions. Biochemistry 44:4957–4970.

2. Onuchic J, LutheySchulten Z, Wolynes P (1997) Theory of protein folding: The energy

landscape perspective. Annu Rev Phys Chem 48:545–600.

3. Shakhnovich E (2006) Protein folding thermodynamics and dynamics: Where physics,

chemistry, and biology meet. Chem Rev 106:1559–1588.

4. Dill KA, Ozkan SB, Shell MS, Weikl TR (2008) The protein folding problem. Annu Rev

Biophys 37:289–316.

Fig. 5. Topological traps and model for GFP folding. (A) The wild typeGFP structure in blue is superimposed onto the misfolded structure in red.Misfolding of the loop connecting β9 and β10 causes a topological trap.The arrows show the correct (folded) and incorrect (misfolded) conformationof the loop. (B) In the second misfolded structure the barrel involving the 11β-strands form with no contact between the central α-helix and rest of thestructure. (C) Folding landscape and network of connected states based onsimulations. The flux through the four channels and the routes from the UBAto the NBA are indicated by arrows. Because topological traps (structures in(A) and (B)) are dead ends in the folding process they do not reach the NBAon the time scale of our simulations. Thus, the total flux to the NBA from UBAis less than 100%. Representative structures sampled along the four path-ways as well as an unfolded conformation are shown.

Reddy et al. PNAS ∣ October 30, 2012 ∣ vol. 109 ∣ no. 44 ∣ 17837

CHEM

ISTR

YBIOPH

YSICSAND

COMPU

TATIONALBIOLO

GY

SPEC

IALFEAT

URE

Page 7: Denaturant-dependent folding of GFPbiotheory.umd.edu/PDF/Reddy_PNAS_2012-2.pdf · 2012. 11. 13. · Denaturant-dependent folding of GFP Govardhan Reddya, Zhenxing Liub, and D. Thirumalaia,1

5. Munoz V, EatonW (1999) A simplemodel for calculating the kinetics of protein foldingfrom three-dimensional structures. Proc Natl Acad Sci USA 96:11311–11316.

6. Kubelka J, Henry ER, Cellmer T, Hofrichter J, Eaton WA (2008) Chemical, physical,and theoretical kinetics of an ultrafast folding protein. Proc Natl Acad Sci USA105:18655–18662.

7. Thirumalai D, O’Brien EP, Morrison G, Hyeon C (2010) Theoretical Perspectives onProtein Folding. Annu Rev Biophys 39:159–183.

8. Fersht A, Daggett V (2002) Protein folding and unfolding at atomic resolution. Cell108:573–582.

9. Shaw DE, et al. (2010) Atomic-Level Characterization of the Structural Dynamics ofProteins. Science 330:341–346.

10. Voelz VA, Singh VR, Wedemeyer WJ, Lapidus LJ, Pande VS (2010) Unfolded-statedynamics and structureof protein L characterized by simulation and experiment.J Am Chem Soc 132:4702–4709.

11. Hyeon C, Thirumalai D (2011) Capturing the essence of folding and functions of bio-molecules using coarse-grained models. Nat Commun 2:487.

12. Whitford PC, et al. (2009) An all-atom structure-based potential for proteins: Bridgingminimal models with all-atom empirical forcefields. Proteins 75:430–441.

13. Zhang Z, Chan HS (2010) Competition between native topology and nonnative inter-actions in simple and complex folding kinetics of natural and designed proteins. ProcNatl Acad Sci USA 107:2920–2925.

14. Schuler B, EatonWA (2008) Protein folding studied by single-molecule FRET. Curr OpinStruct Biol 18:16–26.

15. Nickson AA, Clarke J (2010) What lessons can be learned from studying the folding ofhomologous proteins? Methods 52:38–50.

16. Borgia A, Williams PM, Clarke J (2008) Single-molecule studies of protein folding.Annu Rev Biochem 77:101–125.

17. Bartlett AI, Radford SE (2009) An expanding arsenal of experimental methodsyields an explosion of insights into protein folding mechanisms. Nat Struct Mol Biol16:582–588.

18. Shank EA, Cecconi C, Dill JW, Marqusee S, Bustamante C (2010) The folding coopera-tivity of a protein is controlled by its chain topology. Nature 465:637–U134.

19. Gebhardt JCM, Bornschloegla T, Rief M (2010) Full distance-resolved folding energylandscape of one single protein molecule. Proc Natl Acad Sci USA 107:2013–2018.

20. O’Brien E, Ziv G, Haran G, Brooks B, Thirumalai D (2008) Effects of denaturants andosmolytes on proteins are accurately predicted by the molecular transfer model. ProcNatl Acad Sci USA 105:13403–13408.

21. Liu Z, Reddy G, O’Brien EP, Thirumalai D (2011) Collapse kinetics and chevron plotsfrom simulations of denaturant-dependent folding of globular proteins. Proc NatlAcad Sci USA 108:7787–7792.

22. O’Brien E, Brooks B, Thirumalai D (2009) Molecular origin of constant m-values,denatured state collapse, and residue-dependent transitionmidpoints in globular pro-teins. Biochemistry 48:3743–3754.

23. Oliveberg M, Wolynes PG (2005) The experimental survey of protein-folding energylandscapes. Q Rev Biophys 38:245–288.

24. Klimov D, Thirumalai D (2000) Native topology determines force-induced unfoldingpathways in globular proteins. Proc Natl Acad Sci USA 97:7254–7259.

25. Fernandez-Escamilla A, et al. (2004) Solvation in protein folding analysis: Combinationof theoretical and experimental approaches. Proc Natl Acad Sci USA 101:2834–2839.

26. Klimov DK, Thirumalai D (2005) Symmetric connectivity of secondary structure ele-ments enhances the diversity of folding pathways. J Mol Biol 353:1171–1186.

27. Chung HS, Louis JM, Eaton WA (2009) Experimental determination of upper boundfor transition path times in protein folding from single-molecule photon-by-photontrajectories. Proc Natl Acad Sci USA 106:11837–11844.

28. Klimov D, Thirumalai D (2002) Stiffness of the distal loop restricts the structural het-erogeneity of the transition state ensemble in SH3 domains. J Mol Biol 317:721–737.

29. Ding F, Guo W, Dokholyan N, Shakhnovich E, Shea J (2005) Reconstruction of the src-SH3 protein domain transition state ensemble using multiscale molecular dynamicssimulations. J Mol Biol 350:1035–1050.

30. Onuchic J, Socci N, LutheySchulten Z, Wolynes P (1996) Protein folding funnels: Thenature of the transition state ensemble. Folding Des 1:441–450.

31. Guo Z, Thirumalai D (1997) The Nucleation-collapse mechanism in protein folding:Evidence for the non-uniqueness of the folding nucleus. Folding Des 2:277–341.

32. Hsu STD, Blaser G, Jackson SE (2009) The folding, stability and conformationaldynamics of beta-barrel fluorescent proteins. Chem Soc Rev 38:2951–2965.

33. Huang J, Craggs T, Christodoulou J, Jackson S (2007) Stable intermediate states andhigh energy barriers in the unfolding of GFP. J Mol Biol 370:356–371.

34. Andrews B, Schoenfish A, Roy M, Waldo G, Jennings P (2007) The rough energy land-scapeof Superfolder GFP is linked to chromophore. J Mol Biol 373:476–490.

35. Andrews B, Gosavi S, Finke J, Onuchic J, Jennings P (2008) The dual-basin landscape inGFP folding. Proc Natl Acad Sci USA 105:12283–12288.

36. Orte A, Craggs T, White S, Jackson S, Klenerman D (2008) Evidence of an intermediateand parallel pathways in protein unfolding from single-molecule fluorscence. J AmChem Soc 130:7898–7907.

37. Enoki S, Saeki K, Maki K, Kuwajima K (2004) Acid denaturation and refolding of GreenFluorscent Protein. Biochemistry 43:14238–14248.

38. Enoki S, et al. (2006) The equilibrium unfolding intermediate observed at pH 4 and itsrelationship with the kinetic folding intermediates in green fluorescent protein. J MolBiol 361:969–982.

39. Dietz H, Rief M (2006) Exploring the energy landscape of GFP by single-moleculemechanical experiments. Proc Natl Acad Sci USA 101:16192–16197.

40. Mickler M, et al. (2007) Revealing the bifurcation in the unfolding pathways of GFPby using single-molecule experiments and simulations. Proc Natl Acad Sci USA104:20268–20273.

41. Tsien R (1998) The green fluorescent protein. Annu Rev Biochem 67:509–544.42. Hyeon C, Dima R, Thirumalai D (2006) Pathways and kinetic barriers in mechanical

unfolding and refolding of RNA and proteins. Structure 14:1633–1645.43. Veitshans T, Klimov D, Thirumalai D (1996) Protein folding kinetics: timescales, path-

ways and energy landscapes in terms of sequence-dependent properties. Folding Des2:1–22.

44. Guo Z, Thirumalai D (1995) Kinetics of protein-folding—nucleation mechanism, timescales, and pathways. Biopolymers 36:83–102.

45. Karplus M, Weaver D (1994) Protein-folding dynamics—the diffusion-collision modeland experimental-data. Protein Sci 3:650–668.

46. Okamoto Y, Hansmann U (1995) Thermodynamics of helix-coil transitions studied bymulticanonical algorithms. J Phys Chem 99:11276–11287.

47. Hansmann U, Okamoto Y, Eisenmenger F (1996) Molecular dynamics, Langevinand hybrid Monte Carlo simulations in a multicanonical ensemble. Chem Phys Lett259:321–330.

48. Nagy A, Malnasi-Csizmadia A, Somogyi B, Lorinczy D (2004) Thermal stability of che-mically denatured green fluorescent protein (GFP)—A preliminary study. ThermochimActa 410:161–163.

49. Melnik TN, Povarnitsyna TV, Glukhov AS, Uversky VN, Melnik BS (2011) Sequentialmelting of two hydrophobic clusters within the green fluorescent protein GFP-cycle3.Biochemistry 50:7735–7744.

50. Camacho C, Thirumalai D (1993) Minimum energy compact structures of randomsequences of heteropolymers. Phys Rev Lett 71:2505–2508.

51. Kimura T, et al. (2005) Specific collapse followed by slow hydrogen-bond formationof beta-sheet in the folding of single-chain monellin. Proc Natl Acad Sci USA102:2748–2753.

52. Klimov D, Thirumalai D (1996) Factors governing the foldability of proteins. Proteins26:411–441.

53. Bertz M, Kunfermann A, Rief M (2008) Navigating the folding energy landscape ofgreen fluorescent protein. Angew Chem Int 47:8192–8195.

54. Guo ZY, Thirumalai D, Honeycutt JD (1992) Folding kinetics of proteins—A modelstudy. J Chem Phys 97:525–535.

55. Camacho CJ, Thirumalai D (1993) Kinetics and thermodynamics of folding in modelproteins. Proc Natl Acad Sci USA 90:6369–6372.

56. Betancourt M, Thirumalai D (1999) Pair potentials for protein folding: choice of refer-ence states and sensitivity of predicted native states to variations in the interactionschemes. Protein Sci 8:361–369.

57. Griesbeck O, Baird G, Campbell R, Zacharias D, Tsien R (2001) Reducing the environ-mental sensitivity of yellow fluorescent protein. mechanism and applications. J BiolChem 276:29188–29194.

58. Auton M, Bolen D (2004) Additive transfer free energies of the peptide backbone unitthat are independent of the model compound and the choice of concentration scale.Biochemistry 43:1329–1342.

59. Guo Z, Thirumalai D (1996) Kinetics and thermodynamics of folding of a de Novodesigned four-helix bundle protein. J Mol Biol 263:323–343.

17838 ∣ www.pnas.org/cgi/doi/10.1073/pnas.1201808109 Reddy et al.