

doi:10.1006/jmbi.2002.5447 available online at http://www.idealibrary.com on J. Mol. Biol. (2002) 317, 493–506

RNA Simulations: Probing Hairpin Unfolding and the Dynamics of a GNRA Tetraloop

Eric J. Sorin¹, Mark A. Engelhardt², Daniel Herschlag¹,² and Vijay S. Pande¹,³,⁴,⁵*

¹Department of Chemistry, ²Department of Biochemistry, ³Department of Biophysics, ⁴Department of Structural Biology and ⁵Stanford Synchrotron Radiation Laboratory, Stanford University, Stanford, CA 94305-5080

Simulations of an RNA hairpin containing a GNRA tetraloop were conducted to allow the characterization of its secondary structure formation and dynamics. Ten 10 ns trajectories of the folded hairpin 5′-GGGC[GCAA]GCCU-3′ were generated using stochastic dynamics and the GB/SA implicit solvent model at 300 K. Overall, we find the stem to be a very stable subunit of this molecule, whereas multiple loop conformations and transitions between them were observed. These trajectories strongly suggest that extension of the C6 base away from the loop occurs cooperatively with an N-type → S-type sugar pucker conversion in that residue and that similar pucker transitions are necessary to stabilize other looped-out bases. In addition, a short-lived conformer with an extended fourth loop residue (A8) lacking this stabilizing 2′-endo pucker mode was observed. Results of thermal perturbation at 400 K support this model of loop dynamics. Unfolding trajectories were produced using this same methodology at temperatures of 500 to 700 K. The observed unfolding events display three-state behavior kinetically (including native, globular, and unfolded populations) and, based on these observations, we propose a folding mechanism that consists of three distinct events: (i) collapse of the random unfolded structure and sampling of the globular state; (ii) passage into the folded region of configurational space as stem base-pairs form and gain helicity; and (iii) attainment of proper loop geometry and organization of loop pairing and stacking interactions. These results are considered in the context of current experimental knowledge of this and similar nucleic acid hairpins.

© 2002 Elsevier Science Ltd.

Keywords: RNA folding; GCAA; implicit solvent; stochastic dynamics; sugar pucker

*Corresponding author

E-mail address of the corresponding author: [email protected]

Abbreviations used: RMSD, root-mean-squared deviation; GB, generalized Born; PB, Poisson-Boltzmann; SA, surface area; COM, center-of-mass; NC, native contact; F, folded/native configuration; G, compact globular unfolded conformation; U, extended unfolded conformation; N, any ribonucleotide; R, ribonucleotide with a purine base, adenine or guanine; L2/L4, the 2nd/4th nucleotide position in a loop structure.

0022-2836/02/040493–14 $35.00/0

Introduction

Like proteins, RNA molecules adopt specific secondary and tertiary structure in order to act as biocatalysts.1 Unlike proteins, RNA molecules are thought to form stable secondary structures early in the folding process. To accomplish the 180° backbone inversions necessary for secondary structure formation between nearby hairpin strands, a variety of turn and loop structures exists. Among these, tetraloops are commonplace, with the GNRA, CUUG, and UNCG sequences comprising more than 70 % of known tetramer loops while accounting for less than 4 % of possible tetraloop sequences.2,3 Structural and phylogenetic comparisons indicate that these tetraloops are frequently used in the assembly of tertiary structure from secondary structural units.4-8 Understanding how tetraloop hairpins form and behave is therefore a basic component of understanding the formation and dynamics of larger RNA systems.

GNRA tetraloops, common in ribosomal and signal-recognition particle RNAs, are known to form tertiary contacts in larger RNA structures2,8,9 and to serve as protein binding recognition sites.10,11 Of these, GCAA and GAAA loops are most prevalent, and share similar structural characteristics.12 For instance, the stacked third and fourth loop A residues in such tetraloops interact with specific minor groove receptor sites and can engage in loop-to-receptor base stacking within larger RNA structures.7,13-16

Still, little is known about either the dynamics taking place in these loop regions or the mechanism of formation of nucleic acid hairpins in general. To investigate these dynamics, we have studied a 12-mer RNA tetraloop hairpin previously characterized by NMR.12 The sequence 5′-GGGC[GCAA]GCCU-3′ is native to Escherichia coli 16 S ribosomal RNA with an additional terminal base-pair. Structural features of this hairpin in solution include an often-frayed terminal G·U cis wobble pair, the expected A-form helical stem conformation, and a favorable C4·G9 closing base-pair at the stem-loop interface.3,17 Early molecular dynamics simulations of this species by Zichi18 suggested that the stem and loop regions of this hairpin act as distinct structural subunits. Also, the experimentally derived thermodynamic properties for this hairpin at 300 K are well known: the melting temperatures for this fragment are ~68 °C and ~71 °C in the presence of low (~5 mM) and moderate (~100 mM) sodium ion concentrations, respectively,5,19 and folding of the hairpin at 300 K occurs with ΔG = −3.9 kcal/mol, ΔH = −32.7 kcal/mol, and ΔS = −96.0 cal/(K mol).
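The quoted thermodynamic values can be checked against one another with a short worked calculation. This is only a sketch: it assumes the standard relation between free energy, enthalpy, and entropy, a unimolecular two-state transition (so that the folding free energy vanishes at the melting temperature), and temperature-independent ΔH and ΔS. None of these assumptions is stated explicitly in the text.

```latex
\begin{align*}
\Delta G &= \Delta H - T\,\Delta S \\
         &= -32.7~\text{kcal/mol} - (300~\text{K})\,(-0.0960~\text{kcal/(K mol)}) \\
         &= -32.7 + 28.8 = -3.9~\text{kcal/mol}, \\[4pt]
T_m &\approx \frac{\Delta H}{\Delta S}
     = \frac{-32.7~\text{kcal/mol}}{-0.0960~\text{kcal/(K mol)}}
     \approx 340.6~\text{K} \approx 67.5~^{\circ}\text{C}.
\end{align*}
```

Under these assumptions the three folding parameters are mutually consistent, and the estimated melting temperature falls close to the reported low-salt value of about 68 °C.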

We report multiple atomistic Langevin dynamics simulations using implicit solvation energy calculations to model loop dynamics specific to the GCAA tetraloop on the ten-nanosecond timescale. The results of these simulations provide experimentally unattainable insight into conformational changes within the tetraloop and offer a model of the folding of the tetraloop hairpin molecule as a whole.

Unfortunately, standard atomistic dynamic simulations fall short of the micro- to millisecond timescales often necessary to simulate folding events of even the smallest biomolecules, such as nucleic acid and protein secondary structural units.20 We have therefore employed high temperature unfolding simulations to provide potential insights into pathways between the native and denatured states of this system and to assess dynamics within the hairpin stem. Thermal unfolding has served as a powerful tool for investigating dynamics involving large conformational changes on computationally tractable timescales.20-24 It has been previously suggested that the unfolding process for proteins can in general reflect the main attributes of the folding event,20-24 and it has been shown that construction of folding pathways, including transition state ensembles, is possible using high temperature unfolding.21 A recent report of the direct, atomistic folding of a small protein in silico25 has shown that previously published thermal unfolding trajectories21 served as a good tool for predicting the folding rate and mechanism. Furthermore, by maintaining low-temperature properties of the continuum solvent model employed we increase the likelihood that the observed dynamics are representative of low temperature unfolding and folding dynamics.

† We define a native contact (NC) to exist between two stacked or paired bases if their center-of-mass separation is within 1.5 Å of that seen in NMR work; pairs with larger separations are considered broken, solvent-exposed base-pairs.
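The native-contact definition in the footnote can be illustrated with a minimal Python sketch. The function names, the dictionary layout, and the coordinates are hypothetical and are not taken from the authors' analysis code. The sketch also reads "within 1.5 Å" as a one-sided test (a contact is broken only when the bases move farther apart than the reference), which is an interpretation rather than something the text states.

```python
import math


def center_of_mass(coords, masses):
    """Mass-weighted mean position of one base; coords are (x, y, z) tuples in Å."""
    total = sum(masses)
    return tuple(
        sum(m * c[axis] for m, c in zip(masses, coords)) / total
        for axis in range(3)
    )


def count_native_contacts(com_by_residue, reference_separations, tolerance=1.5):
    """Count contacts whose base COM separation exceeds the reference by at most `tolerance` Å.

    com_by_residue: dict mapping a residue label to its base COM (x, y, z).
    reference_separations: dict mapping (res_i, res_j) to the reference COM distance in Å.
    Assumption: only separations larger than the reference can break a contact.
    """
    count = 0
    for (res_i, res_j), reference in reference_separations.items():
        distance = math.dist(com_by_residue[res_i], com_by_residue[res_j])
        if distance - reference <= tolerance:
            count += 1
    return count


# Hypothetical single frame: C6-A7 is still stacked, A7-A8 has separated.
frame = {"C6": (0.0, 0.0, 0.0), "A7": (4.0, 0.0, 0.0), "A8": (10.0, 0.0, 0.0)}
reference = {("C6", "A7"): 3.8, ("A7", "A8"): 3.9}
print(count_native_contacts(frame, reference))
```

Applying such a count to every saved frame would give a native-contact trace of the kind plotted against time for each trajectory.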

Results and Discussion

We first introduce general structural features of the GCAA tetraloop hairpin, then discuss dynamics within the loop, and finish with a description of thermal unfolding and its relevance to the folding pathway.

Structural overview and stability

The primary tetraloop hairpin structures observed in the 300 K ensemble are introduced in Figure 1. Stick diagrams illustrate the prevalent loop conformations observed, while schematic diagrams show base positions and interactions including the fairly static stem structure. Blue bars between bases represent the native contacts (NC) considered in our analysis.† Color-coding in the stick diagrams follows that in the schematics. Shown are (a) the dominant "closed-loop" conformer that displays a well-aligned C6-A7-A8 stacking triplet, (b) the less favored "open-loop" structure in which the C6 residue has left the loop to become fully solvated, and (c) the far less stable "A8-extended" loop conformer, which retains C6-A7 stacking.

A comparison of the backbone dihedral angles from the ensemble of ten NMR structures with the relaxed starting structure and the average over 100 ns of simulation time at 300 K is shown in Table 1 and Scheme 1. As the loop in this hairpin undergoes significant conformational fluctuations below the melting temperature, the table includes dihedral angles for a central stem base-pair (G3·C10), the closing base-pair (C4·G9), and the mismatch G5·A8 pair that comprises the loop-stem interface. The greatest deviations from the range suggested by the refined NMR structures are shaded. While a small amount of loop distortion is observed, this results from both our relatively short sampling time and the fluctuations observed in this region at 300 K. Overall, the stem retains its structure: only the G9 α-torsion is significantly outside of the NMR range, a result of slight distortion in the A8 backbone region (Table 1 and data not shown).


Figure 1. The three GCAA tetraloop hairpin conformations observed at 300 K. Taken directly from our simulations, stick diagrams (top) illustrate the prevalent loop conformations observed. Schematic diagrams (bottom) show base and phosphate group positions and interactions. Blue bars between bases represent the native contacts (NC) considered in our analysis and, for each diagram, the four loop nucleotides are color coded as follows: G5 in yellow, C6 in blue, A7 in green, and A8 in red. All residues have 3′-endo pucker modes unless specified and U12 undergoes frequent fraying motions, as noted by broken lines. (a) The dominant closed-loop conformer displays the well-aligned A7-A8 stacking pair. The C6 base is only slightly out of plane from the double-A stack, and maintains a 3′-endo pucker. Base-pairing is also occurring between the G5 and A8 bases (yellow-red alignment). (b) In the less common open-loop structure the C6 residue has left the loop to become fully solvated. C6 has a 2′-endo sugar pucker only in this conformer, as calculated from NMR data.12 (c) The far less stable A8-extended loop conformer conserves C6-A7 stacking.

Table 1. Comparison of NMR and simulation dihedral angles in the stem and stem-loop interface

Backbone torsion angle | Central stem: G3, C10 | Stem closing: C4, G9 | Loop mismatch: G5, A8

All dihedral angles are in degrees and were calculated from 5′ to 3′ according to Scheme 1. The top entries give the mean and standard deviation in the ten refined NMR structures. Below these are the starting structure values following 1 ns relaxation in implicit solvent. The third entry is the 100 ns average and standard deviation for each torsion. Values that show the greatest deviation from the refined NMR structural range are shaded.


Scheme 1.


Tetraloop dynamics

Kinetic plots for the ten 300 K trajectories are presented in Figure 2, with each run spanning 10 ns. The number of native contacts NC and the radius of gyration Rg are shown in Figure 2(a). The loop and stem root-mean-squared atomic deviations (RMSDs) are plotted in Figure 2(b), showing the stability of the stem during all conformational changes in the loop subunit. The stem RMSD, which includes the fraying terminal base-pair, rarely reaches 1.5 Å. Figure 2(c) shows the change in base center-of-mass (COM) separation from the starting conformation (ΔD_NC) between native contact pairs. Loop contacts (top panel) show considerable structural variation, as discussed below. Sugar puckers for each of the 12 residues are shown in Figure 2(d), with residues 6 and 12 labeled for clarity, as no other nucleotides within the 12-mer are observed to undergo pucker conversion at 300 K.
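The text does not say how the per-residue pucker modes were assigned. One common approach, sketched here purely as an assumption, is the Altona-Sundaralingam pseudorotation phase angle P computed from the five endocyclic ribose torsions ν0 to ν4, with the northern half of the pseudorotation wheel taken as N-type (3′-endo-like) and the southern half as S-type (2′-endo-like). The function names are illustrative.

```python
import math


def pseudorotation_phase(nu0, nu1, nu2, nu3, nu4):
    """Altona-Sundaralingam phase angle P in degrees, mapped to [0, 360).

    Torsions (degrees): nu0 = C4'-O4'-C1'-C2', nu1 = O4'-C1'-C2'-C3',
    nu2 = C1'-C2'-C3'-C4', nu3 = C2'-C3'-C4'-O4', nu4 = C3'-C4'-O4'-C1'.
    """
    numerator = (nu4 + nu1) - (nu3 + nu0)
    denominator = 2.0 * nu2 * (
        math.sin(math.radians(36.0)) + math.sin(math.radians(72.0))
    )
    # atan2 selects the correct quadrant, so no separate nu2 < 0 correction is needed.
    return math.degrees(math.atan2(numerator, denominator)) % 360.0


def pucker_type(phase_deg):
    """Return 'N' for the northern half of the wheel (3'-endo-like), otherwise 'S'."""
    return "N" if (phase_deg < 90.0 or phase_deg >= 270.0) else "S"


def ideal_torsions(phase_deg, amplitude=38.0):
    """Idealized torsions nu_j = amplitude * cos(P + 144 * (j - 2)), for illustration only."""
    return [
        amplitude * math.cos(math.radians(phase_deg + 144.0 * (j - 2)))
        for j in range(5)
    ]


for label, phase in (("C3'-endo-like", 18.0), ("C2'-endo-like", 162.0)):
    recovered = pseudorotation_phase(*ideal_torsions(phase))
    print(label, round(recovered, 1), pucker_type(recovered))
```

Classifying each residue in each frame this way would yield a two-color pucker map per trajectory; the 38° amplitude used for the idealized torsions is a typical textbook value, not one taken from the simulations.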

C6 extensions and ribose pucker

Frequent transitions between closed-loop and open-loop conformations in the low temperature trajectories are observed. The stable closed-loop conformer is the dominant species in nearly all runs at 300 K. In runs 3, 7, 8 and 10 there is a loss of the C6-A7 contact, which defines the open-loop conformer. This conformer is seen in two of the ten refined NMR structures determined by Jucker et al.,12 consistent with the closed-loop structure being more stable.

All runs in which the open-loop conformer persists include simultaneous transitions to the 2′-endo pucker mode (Figure 2(d)).† At numerous points in these trajectories, an open-loop conformer fails to persist and quickly returns to the closed-loop conformation (runs 1, 4, 6, and late in run 7). These failures appear in conjunction with a stable 3′-endo pucker that never undergoes conversion into the 2′-endo mode.

† The timescale of observed pucker conformational shifts within our simulations agrees well with a previous computational study26 that discusses similar 3′-endo ↔ 2′-endo transitions in the UUCG tetraloop. This correlation is particularly noteworthy when considering that an explicit solvent model is used in that study.

The correlation between the occurrence of 2′-endo conformations in C6 and structures with a solvent-exposed C6 residue was statistically evaluated. In both of the open-loop NMR structures proposed by Jucker et al.12 the base COM separation between C6 and A7 is over 5 Å greater than in corresponding closed-loop structures. Using this change in base separation, ΔD_C6-A7 ≥ 5 Å, to define C6 as being completely looped-out results in a perfect correlation (R² = 1.0, n = 53) between these two structural features. Using significantly smaller base separation cutoffs of 4 Å and 3 Å continues to result in high correlations of 0.99 and 0.93, respectively. This suggests that the open-loop conformer requires a conversion to the 2′-endo mode prior to reaching the fully looped-out structure as suggested by two in ten refined NMR structures.12
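A correlation of this kind can be sketched as the squared Pearson correlation between two binary indicators per frame: whether C6 is looped out at a given separation cutoff, and whether its sugar is 2′-endo. The text does not specify the exact statistic or how the 53 data points were selected, so the choice of statistic, the function names, and the toy data below are all assumptions.

```python
def squared_correlation(xs, ys):
    """Squared Pearson correlation (R^2) between two equal-length numeric sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a feature never varies")
    return sxy * sxy / (sxx * syy)


def looped_out_vs_pucker(delta_d, is_2endo, cutoff=5.0):
    """R^2 between 'C6-A7 separation increased by >= cutoff Å' and 'C6 is 2'-endo'."""
    looped_out = [1 if d >= cutoff else 0 for d in delta_d]
    pucker = [1 if flag else 0 for flag in is_2endo]
    return squared_correlation(looped_out, pucker)


# Toy frames (hypothetical): separation change in Å and the C6 pucker state.
delta_d = [0.2, 0.5, 5.4, 6.1, 0.1, 5.2]
is_2endo = [False, False, True, True, False, True]
print(looped_out_vs_pucker(delta_d, is_2endo, cutoff=5.0))
```

Repeating the call with smaller cutoffs shows how the agreement between the two indicators changes as the looped-out definition is relaxed.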

When considering only structures in which the base separation is ΔD_C6-A7 < 1 Å from the starting structure, less than 3 % of closed-loop structures have a 2′-endo pucker and less than 1 % of structures with a base separation equal to or less than the starting structure are in 2′-endo conformations. These rare and extremely brief conversions to 2′-endo modes in closed-loop structures are not surprising in light of random solvent forces imparted upon loop nucleotides and the small energy barriers for sugar pucker conversion.27

The above correlations suggest a highly cooperative relationship between 3′-endo ↔ 2′-endo and closed-loop ↔ open-loop transitions in the GNRA tetraloop. Within our 300 K ensemble, [closed-loop]C6 3′-endo structures comprise approximately 87 % of the collected structures (3483 in 4000, spanning 100 ns), consistent with the NMR results.12

These cooperative base extension/pucker mode transitions may be mechanically driven (Scheme 2). The 2′-endo pucker modes expand the loop backbone,27 and from simple geometric considerations it is reasonable to assume that these 2′-endo puckers also allow extension of the base away from the loop. Without this change in sugar pucker or more dramatic changes throughout the backbone, an extension of the base would require substantial deviation from tetrahedral (Td) geometry around the C1′ carbon (Scheme 2).

Interestingly, Menger et al.28 briefly considered the possibility that pucker conversions may be essential for transitions between loop conformers. Their discussion focused on the difference in timescales between the pucker conversions previously reported26,29 (1 to 10 ns, as observed herein) and the loop transitions that they monitor in microseconds. This timescale difference is most easily reconciled by considering the nature of the transitions being studied: simple stacking changes occur on sub-microsecond timescales,28 below the limits of detection in their study.

Figure 2. Characterization of the 300 K ensemble. (a) The number of native contacts (blue) and the radius of gyration (red). (b) Loop and stem RMSD values are shown in blue and red, respectively. (c) Changes in contact distances throughout the trajectories (ΔD_NC), where blue represents closest contact distance and yellow represents farthest. The values in these panels are scaled between the minimum and maximum values for each set of contacts and serve for comparative purposes only. Because of this scaling, identical colors in different panels cannot be assumed to represent the same absolute change in distance. The top panel shows the three loop contacts, and the bottom shows the four stem contacts. (d) The ribose pucker for each residue, with blue and yellow specifying the 3′-endo and 2′-endo pucker modes, respectively.

A recent Raman spectroscopy study by Leulliot et al.30 offers insight into our results. In our trajectories, all loop and stem G and C residues remained exclusively or predominantly 3′-endo and no glycosidic anti ↔ syn conformational changes were observed; this is in agreement with their results. They suggest, based on IR spectra, that at least one loop residue adopts a 2′-endo conformation and speculate that this may occur at A8, in disagreement with the refined NMR structures12 and the mechanism of [closed-loop]C6 3′-endo ↔ [open-loop]C6 2′-endo transitions observed in our 300 K data. A definitive determination of loop nucleotide pucker modes will offer insight into all of the methods previously used to examine this GCAA structural feature.

An important aspect of the Leulliot study is the 2′-endo shoulder seen in the UUCG tetraloop spectrum, which is assigned to the second loop U residue. This residue is in an extended, open-loop conformation analogous to the C6-extended open-loop structure we observe (Figure 1(b)). This supports our suggestion that transitions to looped-out base conformations may generally occur cooperatively with transitions to 2′-endo pucker modes.

Scheme 2.

Figure 3. Changes in hydrogen bonding pattern during A8 extensions observed in run 2. (a) Changes in loop contact distances are shown to allow direct comparison to loop rearrangement events. (b) Interatomic distances between A7 N7 and its possible bonding partners: G5 2′-H (blue) and G5 2′-OH (red). (c) Hydrogen bonding distances for possible contacts formed between the A7pA8 pro-Sp phosphoryl oxygen and both the nearest G5 amino proton (black trace) and the N1 hydrogen in that same base (green trace). (d) The two structures shown occurred at 2.5 ns (closed-loop) and 6.9 ns (A8-extended), respectively, and follow the coloring scheme in Figure 1. Hydrogen bond colors correspond to the traces in (c): the black dotted line indicates a contact between the pro-Sp phosphoryl oxygen (red) and the amine proton of G5; the green dotted line indicates contact between this same phosphoryl oxygen and the N1 proton of G5 (green-tipped).

Within the family of GNRA loops, sequence-specific tertiary contact behavior is observed. For such loop structures, the position of highest sequence variability, position L2, determines receptor subtype preference,31 and changes in secondary structure induced by tertiary structure formation have been reported.16,32 We speculate that the spatial variability at L2, the only location of significant conformational change within the loop in our simulations at 300 K, serves to facilitate such structural adaptations and interactions.

A8 extensions and hydrogen bonding

In addition to the C6 extensions described above, residue A8 also leaves the loop briefly, breaking contacts with both G5 and A7 simultaneously (signaled by increased A7-A8 and G5·A8 contact distances in Figure 2(c)). Run 2 in Figure 2 represents the longest series of fluctuations in and out of this conformational state, with very brief occurrences in runs 6, 8, 9, and 10. If we define the A8-extended ensemble to include all structures in which both of these contacts are broken (ΔD_NC > 1.5 Å), only ~5 % of the 300 K conformations meet this criterion. Moreover, only 0.4 % of the structures are included if a conservative contact distance increase of 4 Å is assumed. This is consistent with the inability to observe this conformer by NMR.12 The ribose pucker of the A8 residue remains in a 3′-endo mode throughout all such transitions, and the C6-A7 stacking interaction at the top of the loop is conserved in this rare conformer (Figure 1(c)).
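The A8-extended criterion described above (both A8 contacts broken at the same time) can be expressed as a simple per-frame test. The following Python sketch uses hypothetical function and variable names and toy distance changes; it is not the authors' analysis code.

```python
def a8_extended_fraction(delta_a7_a8, delta_g5_a8, cutoff=1.5):
    """Fraction of frames in which both A8 contacts have lengthened by more than `cutoff` Å.

    delta_a7_a8, delta_g5_a8: per-frame changes in base COM separation (Å)
    relative to the starting structure, for the A7-A8 and G5-A8 contacts.
    """
    if len(delta_a7_a8) != len(delta_g5_a8) or not delta_a7_a8:
        raise ValueError("inputs must be non-empty and of equal length")
    extended = sum(
        1
        for stack, pair in zip(delta_a7_a8, delta_g5_a8)
        if stack > cutoff and pair > cutoff
    )
    return extended / len(delta_a7_a8)


# Toy data for four frames (hypothetical values in Å).
d_a7_a8 = [0.1, 2.0, 4.5, 0.3]
d_g5_a8 = [0.2, 1.8, 4.2, 2.0]
print(a8_extended_fraction(d_a7_a8, d_g5_a8))              # both contacts > 1.5 Å
print(a8_extended_fraction(d_a7_a8, d_g5_a8, cutoff=4.0))  # conservative 4 Å criterion
```

Requiring both contacts to be broken in the same frame is what separates this conformer from frames in which only the stacking or only the pairing interaction fluctuates.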

As proposed by Jucker et al.,12 changes in hydrogen bonding pattern within the loop during conformational shifts are observed in our trajectories. An example of the dynamic nature of loop hydrogen bonding is shown in Figure 3, which details the fluctuations seen in run 2. Interatomic distances between the N7 of residue A7 (A7 N7) and its possible bonding partners are shown (Figure 3(b)). A hydrogen bond between the 2′-H of G5 (G5 2′-H) and A7 N7 was not previously described. However, in both our simulations and the NMR models12 this proton directly attached to the 2′-carbon (Scheme 1) is near N7 of A7. Hydrogen bonding involving 2′-hydrogen atoms has been previously proposed based on analysis of crystal structures and prior simulations.33 In our simulations, however, the average A7 N7 to G5 2′-H separation of over 3 Å suggests that no hydrogen bonding activity is occurring.

Based on their refined NMR structures, Jucker et al.12 have speculated that a hydrogen bond may be possible between the 2′-OH of G5 (G5 2′-OH) and A7 N7, and we observe a significant decrease in the average spacing between these potential bonding partners during both A8 extensions (Figure 3(b)) and C6 extensions (data not shown). Nevertheless, the mean separation (and fluctuation) for this pair in our most static closed-loop trajectory (run 1) is 3.27(±1.05) Å, exceeding the accepted hydrogen bond range. This result is consistent with the thermodynamic results reported by Santa Lucia et al.,19 who found that H substitution at the 2′-OH position in G5 results in a ΔΔG° of 0.3 kcal/mol. Although a direct comparison of the results regarding the proposed A7 N7 to G5 2′-OH hydrogen bond is not possible, the simulations suggest that loss of this contact would not greatly destabilize the hairpin. Still, while the 2′-OH of G5 may not form a hydrogen bond in the most stable structure, it is interesting to consider that a hydrogen bond involving this 2′-OH could stabilize conformational excursions.

Figure 4. Characterization of the 400 K ensemble. Five of our ten 400 K trajectories are shown (following the format of Figure 2). (a) The number of native contacts (blue) and the radius of gyration (red). (b) Loop and stem RMSD values are shown in blue and red, respectively. (c) The change in contact distances throughout the trajectories (ΔD_NC), where blue represents closest contact distance and yellow represents farthest. The top panel shows the three loop contacts, with stem contacts in the lower panel. (d) Pucker modes for these five trajectories, with blue representing a 3′-endo (N-type) pucker and yellow representing a 2′-endo (S-type) mode. For clarity, the sixth and twelfth ribose puckers have been numbered to the left of the figure.

Figure 3(c) shows the hydrogen bonding distances for possible contacts formed between the A7pA8 pro-Sp oxygen (i.e. the phosphoryl oxygen pointing toward the center of the loop) and both a proton of the amino group and the N1 proton of G5. Within the 300 K trajectories, many native closed-loop structures include bifurcated hydrogen bonding with nearly identical distances of 2.0-2.2 Å between these two donor-acceptor pairs. Interestingly, bifurcated mismatches at the stem-loop interface of tRNA anticodon hairpins have been previously reported for C·A base-pairs with very different overall structure.34 Figure 3(d) shows the change in relative positioning of the G5 and A8 residues in going from the closed-loop conformer to the A8-extended conformation.

As the A8 residue extends, the C6-A7 stacking couple drops partially into the newly created void and the loop backbone flattens out significantly (Figure 1(c)). This allows only the amino proton to maintain contact with the phosphate oxygen, removing an interaction within the loop. This base extension and loss of the contact between the phosphoryl oxygen and the N1 proton, however, is accompanied by a decreased bond distance between the phosphate oxygen and the amino proton. This is in qualitative agreement with a previous report that substitution of an H in place of the G5 amino group, which removes the A7pA8 phosphate oxygen to G5 amino proton hydrogen bond from the loop, results in a ΔΔG° of ~0.7 kcal/mol.19 In contrast, similar substitution in the UUCG loop results in a ΔΔG° of ~1.3 kcal/mol. Removing the hydrogen bond to the amino proton could strengthen the bond to the N1 proton, which could potentially account for the small effect they report.

Menger et al. have proposed that a 5′-stacked loop structure exists in equilibrium on the microsecond timescale.28 The A8-extended conformation seen herein could be a precursor to such 5′-stacked structures. As a stabilized A8-extended conformer has not been witnessed on the 100 ns timescale, this could rationalize the "relatively low rate of conformational transition" reported in their temperature-jump experiments.28 Future experimental studies including 2-aminopurine substitution at the L2 position might provide a test of this model, as a dramatic decrease in fluorescence would be expected from stacking between G5 and A7.

At the L4 position within this tetraloop, A8 acts as part of the interface between the rigid stem and more flexible loop region, and we postulate that the favored C4·G9 closing base-pair imposes rigidity at the A8 position that the ribose ring cannot easily overcome on the nanosecond timescale at low temperatures. This is supported by the relatively small torsion angle fluctuations shown in Table 1 for the closing base-pair. This, in concert with the partial loss of hydrogen bonding described above, could explain the relatively short-lived nature of this loop conformer.

Thermal perturbation and a model of loop dynamics

As all but one of the 400 K trajectories maintained stable or globular stem structures over the course of 10 ns, the sugar puckers and base orientations throughout the ten trajectories at this higher temperature have been examined for comparison to the 300 K ensemble. As in the 300 K data, no anti ↔ syn glycosidic transitions were witnessed. Five of the ten 400 K trajectories are shown in Figure 4. The results of these simulations offer insight into, and support for, the arguments made above based on our 300 K ensemble. As predicted by the 300 K runs, all closed-loop ↔ open-loop transitions occur cooperatively with C6 3′-endo ↔ 2′-endo pucker conversions (early in run 5, late in run 7, and most of run 8). Furthermore, closed-loop ↔ A8-extended transitions are seen at this higher temperature that clearly correlate with simultaneous 3′-endo to 2′-endo pucker conversions in the A8 ribose (runs 2 and 3; also runs 4 and 9, not shown).

This thermal perturbation has allowed us to examine an ensemble in which looped-out bases become much more prevalent. However, at some point in most of the 400 K trajectories the loop backbone loses native-like geometry and once the loop opens fully to solvent a variety of pucker and base orientations is easily accessible. Considering only the initial appearance of these A8-extended structures prior to loop dissociation, we again see an absolute correlation with the 2′-endo pucker, meaning that the initial loop rearrangements to A8-extended conformers all adopted 2′-endo pucker modes during the transition.

An extension of residue A7 away from the loop late in run 8 is also observed and, as is the case for other loop residues, a 3′-endo ↔ 2′-endo conversion occurs in tandem. Run 8 featured one of the more slowly degenerating hairpin structures, with the stem RMSD remaining under ≈3 Å, thus allowing this comparison late in the simulation. This again supports the observation that a 2′-endo pucker mode is essential in stabilizing extended, solvent-exposed loop bases.
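The pucker assignments used above (3′-endo, N-type; 2′-endo, S-type) are conventionally read from the pseudorotation phase angle P of the ribose ring. The following is a minimal sketch of such an assignment and is not the analysis code used in this study. It assumes the standard Altona-Sundaralingam definition of P from the five endocyclic torsions ν0 to ν4; the function names, the 0-36° and 144-180° windows, and the idealized torsions used as input are illustrative assumptions.

```python
import math


def pseudorotation_phase(nu0, nu1, nu2, nu3, nu4):
    """Return the pseudorotation phase angle P in degrees, in [0, 360).

    Uses tan P = ((nu4 + nu1) - (nu3 + nu0)) / (2 nu2 (sin 36 + sin 72)),
    with atan2 selecting the correct quadrant when nu2 is negative.
    """
    numerator = (nu4 + nu1) - (nu3 + nu0)
    denominator = 2.0 * nu2 * (
        math.sin(math.radians(36.0)) + math.sin(math.radians(72.0))
    )
    return math.degrees(math.atan2(numerator, denominator)) % 360.0


def pucker_class(phase_deg):
    """Assign 'N' (3'-endo region), 'S' (2'-endo region), or 'other'.

    The windows below are an assumed, commonly quoted convention.
    """
    if 0.0 <= phase_deg <= 36.0:
        return "N"
    if 144.0 <= phase_deg <= 180.0:
        return "S"
    return "other"


def ideal_torsions(phase_deg, nu_max=38.0):
    """Idealized torsions nu_j = nu_max * cos(P + 144 * (j - 2)).

    Only used here to build illustrative input; nu_max is an assumed amplitude.
    """
    return [
        nu_max * math.cos(math.radians(phase_deg + 144.0 * (j - 2)))
        for j in range(5)
    ]


# Illustrative input: one N-type and one S-type ring.
n_type_phase = pseudorotation_phase(*ideal_torsions(18.0))
s_type_phase = pseudorotation_phase(*ideal_torsions(162.0))
```

Applied frame by frame to a loop residue, a classifier of this kind would let the N-type to S-type conversions be compared in time with the base extension events described above.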

Based on the above results, we propose that an equilibrium of at least three loop structures exists in the GCAA loop on sub-microsecond timescales (see Figure 1). Our model consists of a [closed-loop, C6 3′-endo] ↔ [open-loop, C6 2′-endo] transition and a less frequent [closed-loop, A8 3′-endo] ↔ [A8-extended, A8 3′-endo] transition, with the A8-extended conformer existing only for bursts that are short on current experimental timescales. This model will serve as both a starting point and an element of comparison for modeling other GNRA tetraloops, as well as the previously noted UNCG loops, which are similar in structure.6

Thermal unfolding dynamics

To enhance our sampling of configurations of the hairpin, we have examined unfolding at a variety of temperatures. Although it is experimentally observed that folding of this hairpin takes place as a two-state event,17,30 a globular intermediate (described below) that occurs in transitions between the folded and unfolded regions of configurational space was detected.† The relative conformational probabilities observed in the ensemble of 500 K unfolding trajectories, as a function of the total hairpin RMSD and the number of native contacts (NC) within each given structure, are shown in Figure 5.‡ In the context of hairpin folding and unfolding dynamics, three probability maxima are easily identified, and are labeled as the unfolded (U), globular intermediate (G), and folded (F) states of the tetraloop hairpin: F consists of structures with high NC and low RMSD; G consists of conformations with RMSD values between ≈3 Å and ≈6 Å and two to four native contacts; U consists of structures with fewer than two native contacts and RMSD values greater than 6 Å. Because the 400 K trajectories did not unfold completely and the higher temperature trajectories unfolded rapidly, the 500 K data set best represents an ensemble of unfolding events.

† Two-state kinetics found experimentally means that there is a single rate-limiting step, but does not imply that there are not additional metastable states along the folding pathway.

‡ As these trajectories have not crossed all relevant free energy barriers numerous times, this plot does not represent equilibrium conformational probabilities.

Figure 5. Observed relative conformational probabilities at 500 K. The unfolded (U), globular (G), and folded (F) states are labeled on the probability surface created by the RMSD and NC degrees of freedom. Data from ten 500 K trajectories are included in the plot, giving a total of 4330 structures over a total sampling time of 65 ns.

A total of 50 trajectories were collected at temperatures from 300 to 700 K, as outlined in Figure 6, which shows the number of native contacts and the simplified trajectory representations throughout each trajectory. The globular state was sampled very briefly in only four of our 300 K trajectories, with this classification being the result of terminal fraying and the A8 extensions described above. Of the 400 K ensemble, only one brief period of complete unfolding was witnessed. At 500 K full unfolding was observed in nearly all of our simulations.

Figure 6. A summary of the 50 simulations analyzed. All ten runs at each of the five temperatures studied are shown (run number versus time in ns for each temperature block). The 300 K trajectories are shown in the top row (a), with T increasing in 100 K increments down each row to 700 K (e). Left: the number of native contacts for each trajectory ranging from yellow (NC = 7) to white (NC = 0). Right: simplified trajectory representations based on the three-state model shown in Figure 5.

Figure 7. Characterization of high temperature unfolding trajectories. Shown in descending order is one representative run from the 500 K (run 4, see also Figure 8), 600 K (run 8), and 700 K (run 1) unfolding ensembles. (a) The number of native contacts (blue) and radius of gyration (red). (b) The loop (blue) and stem (red) RMSD values are equally scaled to show relative changes between simulations.
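The three-state labeling behind the simplified trajectory representations can be sketched as follows. This is an illustration under stated assumptions, not the procedure used to produce Figure 6. The G and U windows follow the text; for F, "high NC" is taken as NC ≥ 5 (the folded criterion used for the kinetic fits) and "low RMSD" is assumed to mean below 3 Å. Frames falling outside all three windows are assumed to inherit the previous label. The function names are hypothetical.

```python
def classify_state(rmsd, nc):
    """Label one frame as 'F', 'G', 'U', or None if it fits no window.

    rmsd: total hairpin RMSD in angstroms; nc: number of native contacts.
    """
    if nc >= 5 and rmsd < 3.0:           # folded: high NC, low RMSD (assumed cutoffs)
        return "F"
    if 3.0 <= rmsd <= 6.0 and 2 <= nc <= 4:  # globular intermediate
        return "G"
    if nc < 2 and rmsd > 6.0:            # extended unfolded
        return "U"
    return None


def simplify_trajectory(frames):
    """Map a sequence of (rmsd, nc) frames to state labels.

    Unassigned frames carry over the most recent label (an assumption).
    """
    labels = []
    last = None
    for rmsd, nc in frames:
        state = classify_state(rmsd, nc)
        if state is None:
            state = last
        labels.append(state)
        last = state
    return labels


# Illustrative frames only; these are not data from the simulations.
example_labels = simplify_trajectory(
    [(1.5, 7), (2.8, 5), (3.5, 4), (4.0, 5), (7.2, 1)]
)
```

The carry-over rule avoids spurious state flicker at window boundaries; a stricter treatment could instead report unassigned frames separately.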

Folding pathway from simulated unfolding

Figure 7 details one trajectory from each of the data sets at 500 K (run 4), 600 K (run 8), and 700 K (run 1). The globular intermediate is detectable in nearly all of the 500 K trajectories and most of the 600 K trajectories, whereas the 700 K trajectories appear much more continuous with respect to all degrees of freedom monitored, suggesting that at higher temperatures the barrier separating the U and G states is not present to any appreciable extent. Run 4 of the 500 K series (first row in Figure 7, see also Figure 8) shows a very discrete jump to a globular conformation that persists until near the end of the run when the fully unfolded state is reached. A similar but less discrete event is observed in the 600 K run, with both trajectories shown manifesting fairly constant RMSD values during this period, regardless of the lifetime of the intermediate.

Analysis of these 30 trajectories has led to a clear unfolding scheme: the general unfolding pathway of the GCAA tetraloop hairpin witnessed in this study is shown in Figure 8. From the folded state (Figure 8(a)), the loss of terminal and loop interactions and flattening of the loop backbone occur first. The mismatched terminal base-pair exhibits some fraying at equilibrium and the slight instability inherent to this pair makes it one of the first structural locations to degenerate. As described above, the tetraloop is itself relatively flexible and undergoes two significant, if brief, conformational changes on the nanosecond timescale at lower temperatures. This flexibility contributes to the early loss of stacking and/or base-pairing interactions in the loop. Initial contact loss occurs both at the terminal base-pair and in the loop region, suggesting that both ends of this hairpin are roughly equally susceptible to perturbation. Helicity is maintained only near the center of the hairpin in this initial step, a result of the remaining stem base-pair contacts (Figure 8(b)). These central base-pair contacts then break while the stem backbone flattens out, resulting in a mostly planar, symmetrically bent structure (Figure 8(c)). Fraying of the terminal ends and loop region is highly variable between trajectories. Finally, the structure relaxes slowly (relative to the initial stages of unfolding) and bases begin to randomly sample various solvated orientations (Figure 8(d)). The mostly linear strands of the hairpin then begin diffusing away from each other.

Figure 8. Observed unfolding pathway. The pathway followed by run 4 in the 500 K series (as outlined quantitatively in Figure 7, top row) is shown. The phosphate-ribose backbones of the stem are displayed as black sticks for clarity, with the loop backbone and bases colored according to the same scheme used in Figure 1 (from 5′ to 3′: G5 in red, C6 in blue, A7 in green, and A8 in red). The lower panel of images shows each structure rotated ≈90° to display helicity (or lack thereof) within each structure.

If we assume that the folding of this fragment is the reverse of the unfolding trajectories,24 the folding mechanism is straightforward. Initial collapse of the unfolded structures would precede sampling of the globular state and would take place rather quickly, stabilized by random stacking and base-pairing interactions. Next, from these globular intermediates, the hairpin crosses over the barrier into the correctly folded region of configurational space, which could only occur after the "core" of the hairpin (i.e. the central base-pairs of the stem) had properly formed. Upon formation of the central base-pairs, stem helicity would be partially complete. Proper stacking and base pairing in the loop in tandem with finalization of A-form helicity would constitute the end of the event.

Kinetics of unfolding and folding

At each temperature in the 400 to 700 K range, the fraction of the population still folded (NC ≥ 5) was fit to a single exponential function of simulation time (not shown). Only data between the starting point (t_T = 0) and the point at which all of the simulations at a given temperature had entered the globular state (t_400 K = 10 ns; t_500 K = 5 ns; t_600 K = 1 ns; t_700 K = 0.25 ns) were included in these fits. The resulting temperature-dependent unfolding rates k_u(T) are shown in Figure 9. From an Arrhenius fit (Figure 9, inset), the unfolding rate constant at 300 K was approximated as k_u(300 K) ≈ 0.1 μs⁻¹.

To approximate upper and lower bounds on this unfolding rate and to estimate the folding rate at 300 K, the extreme linear fits produced by the small data set were calculated (Figure 9, inset: broken lines). This results in lower and upper bounds on k_u(300 K) of ≈0.03 μs⁻¹ and ≈1 μs⁻¹. From the known free energy change of folding for this hairpin, an equilibrium constant of K_eq ≈ 700 requires a difference between the folding and unfolding rates of approximately three orders of magnitude. The resulting bounds on the folding rate k_f(300 K) based on these simulations are then on the order of 30 μs⁻¹ and 1000 μs⁻¹. Considering the estimated rate of base-pair formation in nucleic acid helices35 of ≈4 μs⁻¹, a reasonable approximation of the folding of this small hairpin might be on the 10 μs timescale, as predicted for slightly larger DNA hairpins.36,37 This is slower than our lower bound, and the difference may be accounted for by: the absence of explicit counterions in simulation, since they would likely slow unfolding and thus lower the predicted rate of folding; a temperature dependence of unfolding that is not single-exponential in form, as is assumed above; the use of the coarse-grained NC reaction coordinate to predict rates; or the lack of sufficient data points in our small sample to accurately predict the 300 K unfolding rate. With these possible sources of error in mind, our apparent lack of accuracy is understandable, and we speculate that the calculated rates would be in much better agreement with experiment were such factors accounted for.

Figure 9. Approximation of the unfolding rate at 300 K. The unfolding rate k_u at each temperature is plotted against inverse temperature and fit to a single exponential. Only the 700 K rate approximation had an error beyond the point size shown. Inset: approximation of the temperature-dependent unfolding rate of the 12-mer hairpin by a linear fitting (continuous line) of the natural logarithm of the rate constant versus inverse temperature for the 400 to 700 K series. The error inherent to each data point falls within the employed point size. Broken lines represent the upper and lower bounds on our approximation and the arrow indicates the range of approximate extrapolated values at 300 K.
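The rate analysis in this section has three steps: a single-exponential fit of the folded fraction at each temperature, an Arrhenius (ln k versus 1/T) extrapolation to 300 K, and conversion of the unfolding rate into a folding rate through the equilibrium constant. The sketch below illustrates these steps with NumPy. It is not the fitting code used here: the per-temperature rates plotted in Figure 9 are not tabulated in the text, so the demonstration uses synthetic rates generated from arbitrary Arrhenius parameters, and all function names are hypothetical.

```python
import numpy as np


def fit_unfolding_rate(times, folded_fraction):
    """Fit f(t) = exp(-k t) and return k.

    Least squares on ln f constrained through the origin, since every
    trajectory starts folded (f = 1 at t = 0). Requires f > 0.
    """
    t = np.asarray(times, dtype=float)
    y = np.log(np.asarray(folded_fraction, dtype=float))
    return float(-np.sum(t * y) / np.sum(t * t))


def arrhenius_extrapolate(temps, rates, target_temp):
    """Fit ln k linearly against 1/T and return the rate at target_temp."""
    inv_t = 1.0 / np.asarray(temps, dtype=float)
    ln_k = np.log(np.asarray(rates, dtype=float))
    slope, intercept = np.polyfit(inv_t, ln_k, 1)
    return float(np.exp(intercept + slope / target_temp))


def folding_rate(k_unfold, k_eq):
    """Two-state relation K_eq = k_f / k_u, solved for k_f."""
    return k_eq * k_unfold


# Synthetic decay with a known rate, to exercise the exponential fit.
demo_times = np.linspace(0.5, 5.0, 10)
demo_rate = fit_unfolding_rate(demo_times, np.exp(-0.8 * demo_times))

# Synthetic Arrhenius data (arbitrary parameters, NOT the values in Figure 9).
temps = np.array([400.0, 500.0, 600.0, 700.0])
synthetic_rates = 1.0e3 * np.exp(-3000.0 / temps)
k_u_300 = arrhenius_extrapolate(temps, synthetic_rates, 300.0)

# Bounds quoted in the text: k_u(300 K) of about 0.03 and 1 per microsecond,
# with K_eq of about 700. The exact products are 21 and 700 per microsecond,
# which the text rounds to "on the order of" 30 and 1000.
k_f_bounds = (folding_rate(0.03, 700.0), folding_rate(1.0, 700.0))
```

With only four temperatures, the extrapolated rate is sensitive to each point, which is one reason the text reports bounds from the extreme linear fits rather than a single value.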

Conclusion

We have used stochastic dynamics simulations to sample a total of 100 ns of GCAA tetraloop hairpin configurations at 300 K. These trajectories have led to the proposal of an equilibrium of three loop conformations consistent with current experimentally determined structural properties: closed-loop structures (C6 stacked above the 3′ strand with A7 and A8) are dominant, open-loop conformers (C6 extended into solvent) make up ≈15-20% of the ensemble, and transient A8-extended conformations occur rarely and lack stability. In addition, the second loop nucleotide (C6 in GCAA) undergoes cooperative 3′-endo ↔ 2′-endo sugar pucker conversions and closed-loop ↔ open-loop transitions consistent with the NMR-derived conformers.12 An equivalent simulated ensemble at 400 K supports this model of cooperativity between transitions to solvent-exposed base conformers and transitions to S-type pucker conformations, which is seen in other loop nucleotides (both A7 and A8). Experimental data on UUCG tetraloops, which are similar in structure, are consistent with this model.30

With regard to the folding dynamics of nucleic acid hairpins, we have analyzed 30 trajectories which span a temperature range from 500 to 700 K and observed a single, generic "ends-first" unfolding mechanism which occurs in three recognizable steps: (a) terminal base-pair and loop contacts are lost; (b) the breaking of central base-pairs occurs cooperatively with a loss of stem helicity; and (c) random orientation/extension of the disconnected strands is gained via diffusion. From the observed unfolding pathway, a glimpse of the folding event has been obtained. We propose a folding pathway that includes (a) collapse of the unfolded 12-mer and sampling of the globular state, (b) a passage into the folded region of configurational space as the central base-pairs of the stem form properly and gain helicity, and (c) pairing of the terminal base pair and proper stacking and base-pairing in the loop in tandem with achievement of full A-form helicity. This model of hairpin formation is in good agreement with a recently reported statistical mechanical model of DNA hairpin folding.36

The stability of this RNA hairpin in our simulations and the agreement observed between our trajectories and experimental data on this and other relevant systems lend credence to this method of RNA simulation. However, addressing the inclusion of an explicitly ionic solvent and the accuracy of the rate calculations should be a future focus of atomistic RNA simulation. In particular, the timescales investigated here suggest that it may be possible to conduct (a) atomistic studies of the changes in secondary structure upon tertiary contact formation, (b) simulations of the direct folding of this and similar sequences in silico, and (c) studies of the equilibrium thermodynamic properties of loop structures at longer (μs) timescales.

Materials and Methods

All simulations discussed herein employed the Amber95/AmberN force field for nucleic acids38 and the GB/SA continuum solvent model.39 For solvent energy and force calculations a surface area prefactor of 7.0 cal/(mol Å²) and a probe sphere radius of 1.4 Å for surface area calculations were used, with the solvent polarization and surface area derivatives (forces) being calculated at every dynamic timestep. As simulation studies do not require electroneutrality, our calculations included 389 atoms with no explicit water molecules, ions, or periodic boundaries. We assume that our ion-free environment can approximate the low sodium behavior of this fragment, noting that previous molecular dynamics simulations of the tRNAAsp anticodon hairpin have strongly suggested that sodium ions lack the binding specificity seen in inorganic ions such as ammonium.40 A 2.0 fs integration step was used and the Rattle algorithm41 was employed to constrain only H-X covalent bonds (where X represents heavy atoms). No electrostatic cutoffs or tapers were used.

The production runs presented herein were started from the same configuration, which was the result of a 1 ns molecular dynamics relaxation period performed on the first (closed-loop) model of the most recent NMR work (PDB ID no. 1ZIH; ref. 12). This relaxation run employed Beeman integration (zero solvent-induced friction) at 300 K. The resulting relaxed structure had an RMSD = 1.62 (±0.16) Å from the ensemble of ten NMR models and maintained all structural and interactive characteristics of the refined NMR structures.

Production runs were conducted on the SGI Origin 2000 supercomputer at Stanford, CA using Allen's stochastic integrator42 within the Tinker 3.8† package. To represent the physics of real solvents, the force on a Newtonian particle can be considered as the sum of random thermal fluctuations (R) and viscous drag (with friction coefficient γ), resulting in the three-dimensional, many-particle Langevin equation:

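$$ m\,\ddot{\vec{r}}(i) + \gamma\,\dot{\vec{r}}(i) + \vec{R}(i) = 0. \qquad (1) $$

Numerical integration of this differential equation results in the following equations of motion, with a time interval δt:

$$ \vec{r}(t_{i+1}) = \vec{r}(t_i) + c_1\,\delta t\,\dot{\vec{r}}(t_i) + c_2\,\delta t^2\,\ddot{\vec{r}}(t_i) + \vec{R}_r, \qquad (2) $$

$$ \dot{\vec{r}}(t_{i+1}) = c_0\,\dot{\vec{r}}(t_i) + c_1\,\delta t\,\ddot{\vec{r}}(t_i) + \vec{R}_v. \qquad (3) $$

Here R_r and R_v are randomly chosen from a temperature-dependent bivariate Gaussian distribution at each timestep, and the integration coefficients are given by:

$$ c_0 = e^{-\gamma\,\delta t}, \quad c_1 = \frac{1 - c_0}{\gamma\,\delta t}, \quad \text{and} \quad c_2 = \frac{1 - c_1}{\gamma\,\delta t}. \qquad (4) $$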

Assuming that the proper friction coefficient is used to characterize the desired solvent, the addition of a stochastic force term in this algorithm does not induce deviations from the simulation times expected of analogous molecular dynamics simulations. Although early simulations43-45 of aqueous systems included friction coefficient (γ) values over the wide range of 50-200 ps⁻¹, proper modeling of solvent effects requires consideration of the relaxation time of the solvent as given by the velocity autocorrelation function:

$$ \langle \dot{r}(t)\,\dot{r}(0) \rangle \propto e^{-\gamma t}. \qquad (5) $$

For water, the accepted decorrelation time of ≈10 fs gives γ ≈ 100 ps⁻¹, which is in agreement with a recent review of implicit solvent models by Cramer and Truhlar.46 Here we have used γ = 91 ps⁻¹, which has previously yielded quantitatively correct rates in simulations of (protein) β-hairpin folding.25 These production runs thus included both energetic (GB/SA) and viscous (Langevin) representations of the aqueous environment.
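To make the integration coefficients concrete, the sketch below evaluates c0, c1, and c2 for the friction coefficient and timestep quoted in this section (γ = 91 ps⁻¹, δt = 2.0 fs) and applies one position and velocity update for a single coordinate. It is an illustration, not the Tinker implementation: the function names are hypothetical, and the random displacements R_r and R_v are passed in by the caller rather than drawn from the bivariate Gaussian, whose variances are not given in the text.

```python
import math


def allen_coefficients(gamma, dt):
    """Return (c0, c1, c2) for friction gamma and timestep dt.

    gamma and dt must use reciprocal units (e.g. ps^-1 and ps) so that
    gamma * dt is dimensionless.
    """
    x = gamma * dt
    c0 = math.exp(-x)
    c1 = (1.0 - c0) / x
    c2 = (1.0 - c1) / x
    return c0, c1, c2


def stochastic_step(r, v, a, dt, gamma, rand_r=0.0, rand_v=0.0):
    """One update of position r and velocity v for a single coordinate.

    a is the acceleration at the current step; rand_r and rand_v stand in
    for the random terms R_r and R_v and default to zero here.
    """
    c0, c1, c2 = allen_coefficients(gamma, dt)
    r_new = r + c1 * dt * v + c2 * dt * dt * a + rand_r
    v_new = c0 * v + c1 * dt * a + rand_v
    return r_new, v_new


# Values quoted in the text: gamma = 91 ps^-1, dt = 2.0 fs = 0.002 ps.
C0, C1, C2 = allen_coefficients(gamma=91.0, dt=0.002)

# In the low-friction limit the coefficients approach 1, 1, and 1/2,
# i.e. the update reduces to a plain Taylor expansion in dt.
LOW_FRICTION = allen_coefficients(gamma=1.0e-3, dt=1.0)
```

With γδt = 0.182, roughly 17% of a particle's velocity is damped away per step in the absence of forces, which shows how strongly the viscous term couples to the dynamics at this friction value.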

† http://www.dasher.wustl.edu/tinker/

‡ http://www.biochem.ucl.ac.uk/~martin/swreg.html

Ten trajectories were collected at each of five temperatures and frames were saved every 25 ps at 300 K, 20 ps at 400 K, 15 ps at 500 K, and 10 ps for both 600 K and 700 K. The final data set consists of: 10 ns for each 300 K run, 10 ns for each 400 K run, 6.5 ns for each 500 K run, 2.5 ns for each 600 K run, and 1.5 ns for each 700 K run, giving composite simulation times of 100 ns at 300 K and over 0.3 μs in all. Each of these 50 trajectories was started with a unique random number seed to initiate atomic velocities according to the relevant Boltzmann distribution and was therefore started from a unique location in phase space P(r^N, p^N).
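The composite sampling times follow directly from the run lengths and saving intervals listed above, as the short bookkeeping sketch below shows. The names are hypothetical, and the frame count assumes that the starting configuration is not counted and that a partial final interval is dropped. Under that assumption the 500 K ensemble yields 4330 saved structures over 65 ns, matching the Figure 5 caption.

```python
RUNS_PER_TEMPERATURE = 10

# temperature (K): (length of each run in ns, frame-saving interval in ps)
PROTOCOL = {
    300: (10.0, 25.0),
    400: (10.0, 20.0),
    500: (6.5, 15.0),
    600: (2.5, 10.0),
    700: (1.5, 10.0),
}


def composite_time_ns(temperature):
    """Total sampling time at one temperature, summed over all runs."""
    run_ns, _ = PROTOCOL[temperature]
    return RUNS_PER_TEMPERATURE * run_ns


def saved_frames(temperature):
    """Saved structures at one temperature (complete intervals only)."""
    run_ns, interval_ps = PROTOCOL[temperature]
    frames_per_run = int((run_ns * 1000.0) // interval_ps)
    return RUNS_PER_TEMPERATURE * frames_per_run


# Total over all five temperatures, in ns.
total_time_ns = sum(composite_time_ns(t) for t in PROTOCOL)
```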

Several parameters were calculated for each trajectory. The total, loop, and stem (including terminal base-pair) RMSD values were calculated using the academic distribution ProFit.‡ The radius of gyration R_g was calculated using only heavy atoms and included the terminal base-pair. Total potential GB energies were calculated using the analyze routine in Tinker 3.8 and the analogous Poisson-Boltzmann based energy was calculated using analyze with an interface to the ZAP algorithm,47 with both employing a solute inner dielectric of 1. Base center of mass assignments included only the six-membered ring in each nucleotide for symmetry, and we defined a native contact as existing if the distance between two COMs was less than the starting distance (native COM gap) plus 1.5 Å. As is demonstrated in the schematics shown in Figure 1, we considered seven native contacts (shown in blue between the relative bases) in the folded closed-loop structure and six in the native open-loop structure (note that fraying of the terminal pair can remove a NC from the total periodically). By definition, any contact that is "broken" allows passage of water molecules between the relevant bases in the broken pair.
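The native contact criterion above can be written as a short NumPy sketch. This is an illustration rather than the analysis code used here: the function names and data layout are hypothetical, and the residue pairs and distances in the example are placeholders, since the actual list of seven native contacts is defined graphically in Figure 1.

```python
import numpy as np

CONTACT_TOLERANCE = 1.5  # angstroms added to the native COM gap, as in the text


def ring_center_of_mass(coords, masses):
    """Center of mass of the six-membered ring atoms of one base.

    coords: (n_atoms, 3) array of positions; masses: (n_atoms,) array.
    """
    coords = np.asarray(coords, dtype=float)
    masses = np.asarray(masses, dtype=float)
    return (masses[:, None] * coords).sum(axis=0) / masses.sum()


def count_native_contacts(coms, native_distances, tolerance=CONTACT_TOLERANCE):
    """Count contacts whose COM separation is below native gap + tolerance.

    coms: dict mapping residue label to its ring COM (length-3 array).
    native_distances: dict mapping (res_i, res_j) to the COM distance in
    the starting structure, in angstroms.
    """
    count = 0
    for (res_i, res_j), native_gap in native_distances.items():
        separation = np.linalg.norm(coms[res_i] - coms[res_j])
        if separation < native_gap + tolerance:
            count += 1
    return count


# Placeholder geometry for two hypothetical contacts (not simulation data).
example_coms = {
    "C4": np.array([0.0, 0.0, 0.0]),
    "G9": np.array([5.0, 0.0, 0.0]),
    "G5": np.array([0.0, 10.0, 0.0]),
    "A8": np.array([0.0, 10.0, 9.0]),
}
example_native = {("C4", "G9"): 4.0, ("G5", "A8"): 6.0}
example_nc = count_native_contacts(example_coms, example_native)
```

In the example the first pair is separated by 5.0 Å against a cutoff of 5.5 Å and counts as intact, while the second is separated by 9.0 Å against a cutoff of 7.5 Å and counts as broken.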

An assessment of the application of the GB/SA solvation model to GCAA tetraloop hairpin simulations was conducted prior to collection of production runs. A fit of the GB energies to the expectedly more accurate PB energies39 was calculated for a preliminary data set consisting of five trajectories comprising a total of 1662 individual configurations, with one run at each of the temperatures employed here. The resulting PB to GB linear regression returned an R² value of 0.997. As has been discussed elsewhere,48-52 it would appear that the relative accuracy of the GB model makes this an acceptable approximation of the solvation energy term for DNA and RNA secondary structure simulations and allows for successful simulation of small nucleic acid molecules in the absence of an explicit solvent model.
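A comparison of this kind reduces to an ordinary least-squares line and its coefficient of determination. The sketch below shows the calculation with NumPy on synthetic energies. The arrays are generated for illustration only and are not the 1662 configurations analyzed here; the function name is hypothetical, and treating the PB energy as the independent variable is an assumption.

```python
import numpy as np


def linear_fit_r2(x, y):
    """Least-squares line y = slope * x + intercept and its R^2 value."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(slope), float(intercept), float(1.0 - ss_res / ss_tot)


# Synthetic stand-ins for PB and GB energies (arbitrary units and values).
rng = np.random.default_rng(0)
pb_energy = rng.uniform(-3000.0, -2000.0, size=200)
gb_energy = 0.98 * pb_energy + 15.0 + rng.normal(0.0, 10.0, size=200)

slope, intercept, r_squared = linear_fit_r2(pb_energy, gb_energy)
```

A high R² indicates that GB tracks PB across configurations up to a linear rescaling; it does not by itself show that the absolute energies agree.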

Acknowledgments

We thank Stefan Larson, Bojan Zagrovic, Rhiju Das, and the rest of the Pande and Herschlag groups for their helpful discussions regarding this manuscript, and offer appreciation to our referees for their valuable insight. This work was supported in part by ACS PRF grant 36028-AC4 and CPIMA seed funds. D.H. acknowledges NIH grant (GM49243). M.A.E. is a NSERC postgraduate fellow.


References

1. Doherty, E. & Doudna, J. (2001). Ribozyme structures and mechanisms. Annu. Rev. Biophys. Biomol. Struct. 30, 457-475.

2. Uhlenbeck, O. C. (1990). Nucleic-acid structure-tetraloops and RNA folding. Nature, 346, 613-614.

3. Woese, C. R., Winker, S. & Gutell, R. R. (1990). Architecture of ribosomal RNA: constraints on the sequence of "tetra-loops". Proc. Natl Acad. Sci. USA, 87, 8467-8471.

4. Costa, M. & Michel, F. (1995). Frequent use of the same tertiary motif by self-folding RNAs. EMBO J. 14, 1276-1285.

5. Brion, P. & Westhof, E. (1997). Hierarchy and dynamics of RNA folding. Annu. Rev. Biophys. Biomol. Struct. 26, 113-137.

6. Moore, P. B. (1999). Structural motifs in RNA. Annu. Rev. Biochem. 68, 287-300.

7. Cate, J. H., Gooding, A. R., Podell, E., Zhou, K., Golden, B. L., Kundrot, C. E. et al. (1996). Crystal structure of a group I ribozyme domain: principles of RNA packing. Science, 273, 1678-1685.

8. Ferré-D'Amaré, A. R. & Doudna, J. A. (1999). RNA folds: insights from recent crystal structures. Annu. Rev. Biophys. Biomol. Struct. 28, 57-73.

9. Pley, H. W., Flaherty, K. M. & McKay, D. B. (1994). Model for an RNA tertiary interaction from the structure of an intermolecular complex between a GAAA tetraloop and an RNA helix. Nature, 372, 111-113.

10. Endo, Y., Chan, Y. L., Lin, A., Tsurugi, K. & Wool, I. G. (1988). The cytotoxins α-sarcin and ricin retain their specificity when tested on a synthetic oligoribonucleotide (35-mer) that mimics a region of 28S ribosomal ribonucleic acid. J. Biol. Chem. 263, 7917-7920.

11. Gluck, A., Endo, Y. & Wool, I. G. (1992). Ribosomal-RNA identity elements for ricin A-chain recognition and catalysis-analysis with tetraloop mutants. J. Mol. Biol. 226, 411-424.

12. Jucker, F. M., Heus, H. A., Yip, P. F., Moors, E. H. M. & Pardi, A. (1996). A network of heterogeneous hydrogen bonds in GNRA tetraloops. J. Mol. Biol. 264, 968-980.

13. Jaeger, L., Michel, F. & Westhof, E. (1994). Involvement of a GNRA tetraloop in long-range tertiary interactions. J. Mol. Biol. 236, 1271-1276.

14. Murphy, F. L. & Cech, T. R. (1994). GAAA tetraloop and conserved bulge stabilize tertiary structure of a group I intron domain. J. Mol. Biol. 236, 49-63.

15. Doherty, E. A., Batey, R. T., Masquida, B. & Doudna, J. A. (2001). A universal mode of helix packing in RNA. Nature Struct. Biol. 8, 339-343.

16. Costa, M. & Michel, F. (1997). Rules for RNA recognition of GNRA tetraloops deduced by in vitro selection: comparison with in vivo evolution. EMBO J. 16, 3289-3302.

17. Heus, H. A. & Pardi, A. (1991). Structural features that give rise to the unusual stability of RNA hairpins containing GNRA loops. Science, 253, 191-194.

18. Zichi, D. A. (1995). Molecular dynamics of RNA with the OPLS force field: aqueous simulation of a hairpin containing a tetranucleotide loop. J. Am. Chem. Soc. 117, 2957-2969.

19. SantaLucia, J., Jr, Kierzek, R. & Turner, D. H. (1992). Context dependence of hydrogen bond free energy revealed by substitutions in an RNA hairpin. Science, 256, 217-219.

20. Brooks, C. L., III (1998). Simulations of protein folding and unfolding. Curr. Opin. Struct. Biol. 8, 222-226.

21. Pande, V. S. & Rokhsar, D. S. (1999). Molecular dynamics simulations of unfolding and refolding of a β-hairpin fragment of protein G. Proc. Natl Acad. Sci. USA, 96, 9062-9067.

22. Daggett, V. & Levitt, M. (1993). Protein unfolding pathways explored through molecular dynamics simulations. J. Mol. Biol. 232, 600-619.

23. Lazaridis, T. & Karplus, M. (1997). "New view" of protein folding reconciled with the old through multiple unfolding simulations. Science, 278, 1928-1931.

24. Dinner, A. & Karplus, M. (1999). Is protein unfolding the reverse of protein folding? A lattice simulation analysis. J. Mol. Biol. 292, 403-419.

25. Zagrovic, B., Sorin, E. J. & Pande, V. S. (2001). β-hairpin folding simulations in atomistic detail using an implicit solvent model. J. Mol. Biol. 313, 151-169.

26. Nowakowski, J., Miller, J. L., Kollman, P. A. & Tinoco, I., Jr (1996). Time evolution of NMR proton chemical shifts of an RNA hairpin during a molecular dynamics simulation. J. Am. Chem. Soc. 118, 12812-12820.

27. Saenger, W. (1984). Principles of Nucleic Acid Structure, Springer-Verlag, New York.

28. Menger, M., Eckstein, F. & Porschke, D. (2000). Dynamics of the RNA hairpin GNRA tetraloop. Biochemistry, 39, 4500-4507.

29. Hermann, T., Auffinger, P. & Westhof, E. (1998). Molecular dynamics investigations of hammerhead ribozyme RNA. Eur. Biophys. J. 27, 153-165.

30. Leulliot, N., Baumruk, V., Abdelkafi, M., Turpin, P., Namane, A., Gouyette, C. et al. (1999). Unusual nucleotide conformations in GNRA and UNCG type tetraloop hairpins: evidence from Raman markers assignments. Nucl. Acids Res. 27, 1398-1404.

31. Thirumalai, D. (1998). Native secondary structure formation in RNA may be a slave to tertiary folding. Proc. Natl Acad. Sci. USA, 95, 11506-11508.

32. Wu, M. & Tinoco, I., Jr (1998). RNA folding causes secondary structure rearrangement. Proc. Natl Acad. Sci. USA, 95, 11555-11560.

33. Auffinger, P. & Westhof, E. (1997). Rules governing the orientation of the 2′-hydroxyl group in RNA. J. Mol. Biol. 274, 54-63.

34. Auffinger, P. & Westhof, E. (1999). Singly and bifurcated hydrogen-bonded base-pairs in tRNA anticodon hairpins and ribozymes. J. Mol. Biol. 292, 467-483.

35. Cohen, R. J. & Crothers, D. M. (1971). Rate of unwinding small DNA. J. Mol. Biol. 61, 525-542.

36. Ansari, A., Kuznetsov, S. V. & Shen, Y. (2001). Configurational diffusion down a folding funnel describes the dynamics of DNA hairpins. Proc. Natl Acad. Sci. USA, 98, 7773-7776.

37. Shen, Y., Kuznetsov, S. V. & Ansari, A. (2001). Loop dependence of the dynamics of DNA hairpins. J. Phys. Chem. B, 105, 12202-12211.

38. Cornell, W. D., Cieplak, P., Bayly, C. I., Gould, I. R., Merz, K. M., Jr et al. (1995). A second generation force field for the simulation of proteins, nucleic acids, and organic molecules. J. Am. Chem. Soc. 117, 5179-5197.

39. Qiu, D., Shenkin, P. S., Hollinger, F. P. & Still, W. C. (1997). The GB/SA continuum model for solvation. A fast analytical method for the calculation of approximate Born radii. J. Phys. Chem. 101, 3005-3014.


40. Auffinger, P. & Westhof, E. (1998). Simulations of the molecular dynamics of nucleic acids. Curr. Opin. Struct. Biol. 8, 227-236.

41. Andersen, H. C. (1983). Rattle: a "velocity" version of the Shake algorithm for molecular dynamics calculations. J. Comput. Phys. 52, 24-34.

42. Allen, M. P. (1980). Brownian dynamics simulation of a chemical reaction in solution. Mol. Phys. 40, 1073-1087.

43. Brooks, C. L., III & Karplus, M. (1983). Deformable stochastic boundaries in molecular dynamics. J. Chem. Phys. 79, 6312-6325.

44. Brünger, A., Brooks, C. L., III & Karplus, M. (1984). Stochastic boundary conditions for molecular dynamics simulations of ST2 water. Chem. Phys. Letters, 105, 495-500.

45. Brünger, A., Brooks, C. L., III & Karplus, M. (1985). Active site dynamics of ribonuclease. Proc. Natl Acad. Sci. USA, 82, 8458-8462.

46. Cramer, C. J. & Truhlar, D. G. (1999). Implicit solvation models: equilibria, structure, spectra, and dynamics. Chem. Rev. 99, 2161-2200.

47. Grant, J. A., Pickup, B. T. & Nicholls, A. (2001). A smooth permittivity function for Poisson-Boltzmann solvation methods. J. Comput. Chem. 22, 608-640.

48. Zacharias, M. (2000). Simulation of the structure and dynamics of nonhelical RNA motifs. Curr. Opin. Struct. Biol. 10, 311-317.

49. Zacharias, M. (2001). Conformational analysis of DNA-trinucleotide-hairpin-loop structures using a continuum solvent model. Biophys. J. 80, 2350-2363.

50. Srinivasan, J., Miller, J., Kollman, P. A. & Case, D. A. (1998). Continuum solvent studies of the stability of RNA hairpin loops and helices. J. Biomol. Struct. Dynam. 16, 671-682.

51. Williams, D. J. & Hall, K. B. (1999). Unrestrained stochastic dynamics simulations of the UUCG tetraloop using an implicit solvation model. Biophys. J. 76, 3192-3205.

52. Williams, D. J. & Hall, K. B. (2000). Experimental and computational studies of the G[UUCG]C RNA tetraloop. J. Mol. Biol. 297, 1045-1061.

Edited by J. Doudna

(Received 18 September 2001; received in revised form 17 January 2002; accepted 18 January 2002)

http://www.academicpress.com/jmb

Supplementary Material comprising ≈35 unpublished trajectories is available on IDEAL