A METHOD FOR IDENTIFYING RIBOSOME PAUSE SITES IN

MESSENGER RNA THROUGH NEW SEQUENCING TECHNOLOGY

By

R. HUGH F. BENDER

A Thesis Submitted to the Graduate Faculty of

WAKE FOREST UNIVERSITY

in Partial Fulfillment of the Requirements

for the Degree of

MASTER OF SCIENCE

in the Department of Biology

August 2009

Winston-Salem, North Carolina

Approved By:

James F. Curran, Ph.D., Advisor

Examining Committee:

Brian W. Tague, Ph.D., Chairman

Jacquelyn S. Fetrow, Ph.D.

David A. Ornelles, Ph.D.

ACKNOWLEDGEMENTS

I would first like to thank my advisor, Dr. Jim Curran, for the opportunity to

undertake a truly ground-breaking project. Although this study did not evolve as either

of us anticipated, we have both learned much about high-throughput sequencing and the

challenges inherent in preparing a robust set of RNA fragments for this process. Dr.

Curran has shown a great deal of patience throughout the development of this project and

has demonstrated a willingness to expand my knowledge base of laboratory techniques.

My experiences in the Curran Lab have enhanced my abilities and given me new

opportunities that were not available two years ago.

I also want to thank Dr. David Ornelles, whose collaboration was invaluable in

completing this thesis. His generosity in allowing me to use his lab space and equipment

saved countless hours of time and effort. His willingness to advise me on molecular

problems was also extremely helpful in troubleshooting problems as Dr. Curran and I

developed our protocol. I must also thank the graduate students of the Ornelles lab:

Roberta Turner, Megan Spurgeon, and Gena Nichols. Roberta cultured our HeLa cells

during the first half of this thesis and I am extremely grateful for her willingness to carry

out such a time-consuming task on my behalf. Megan and Gena were very helpful in my

adjustment to graduate school and assisted me in tackling a number of issues that sprang

up while I was working in the Ornelles Lab.

Also, I would like to thank the other two members of my committee, Drs. Jacque

Fetrow and Brian Tague. Dr. Fetrow deserves credit for initially starting me down my

graduate school path by giving me the chance to complete a bioinformatics Honors thesis

during my undergraduate senior year—a thesis which led directly to the opportunity for

this Master’s thesis. Her emphasis on advising during that project has greatly assisted my

work on this thesis. I am indebted to Dr. Fetrow for providing the opportunity to take on

such a project, and I am grateful for her willingness to participate on my committee for

this thesis. Dr. Tague has served on both my Honors thesis and Master’s thesis

committees, and has been a dedicated advisor in both capacities. As director of the

Biology graduate program, he has always been willing to help, especially during my

adjustment to graduate school, and my planning for Ph.D. work after Wake Forest.

I would be remiss to ignore the contributions of those who have supported me

throughout my work: my parents, my fellow graduate students and post-docs in the

Biology department, and my friends outside the program. My parents, Alison Frost and

Rick Bender, have instilled in me a desire to constantly challenge myself—a gift which

has motivated me throughout my college and graduate endeavors. During this Master’s

thesis, they have always been willing to listen and advise me on the next step to take

when I became especially frustrated with this project. There are too many of my fellow

graduate students and post-docs to name here, but they have all played an important role

in supporting me and helping me work through the challenges of this project over the last

two years. Many of them have been instrumental in providing ideas and insights for

troubleshooting this protocol. My friends who were not directly affiliated with this

program have also provided immeasurable contributions during my work on this project

and I am grateful to them for their willingness to listen during the past two years.

TABLE OF CONTENTS

LIST OF FIGURES

LIST OF ABBREVIATIONS

ABSTRACT

INTRODUCTION

Translational Control Mechanisms

Footprinting Determines Ribosome Location During Translation

High-Throughput Sequencing of Ribosome Footprints

MATERIALS AND METHODS

Isolating Ribosome-Protected mRNA Fragments

Preparing Fragments for Sequencing: Bender & Curran Method

Preparing Fragments for Sequencing: Ingolia Method

RESULTS

Preparation of Ribosome Footprints

Polyadenylation

First-Strand Synthesis

Tailing

Ligation

Results from the Ingolia Method

DISCUSSION

Problems with the Ingolia Method

Problems with the Bender & Curran Method

Strategies for Enhancing the Ingolia Method

Practical Application of these Methods

LITERATURE CITED

FIGURES

SCHOLASTIC VITA

LIST OF FIGURES

Figure 1. Illumina method for preparing small RNAs for sequencing

Figure 2. Bender & Curran method for preparing small RNAs for sequencing

Figure 3. Ingolia method for preparing small RNAs for sequencing

Figure 4. Ribonuclease digestion of ribosomal RNAs

Figure 5. Isolation of ribosome-protected mRNA fragments

Figure 6. Damage to digested DNA over time

Figure 7. Simultaneous dephosphorylation and polyadenylation reactions

Figure 8. The decline of poly-A polymerase activity over time

Figure 9. Comparison of Ambion and NEB poly-A polymerase

Figure 10. Pulse-chase assay of polyadenylation

Figure 11. Factors affecting polyadenylation

Figure 12. Additional factors affecting polyadenylation

Figure 13. The effect of dNTP concentration during first-strand synthesis

Figure 14. First-strand synthesis using several oligo dT sequences

Figure 15. First-strand synthesis using Ingolia’s method

Figure 16. First-strand synthesis using Bender & Curran method

Figure 17. DNA retention following binding to streptavidin magnetic beads

Figure 18. DNA retention following removal of RNA

Figure 19. Comparison of dTTP and dCTP tailing reactions

Figure 20. Factors affecting dCTP tailing

Figure 21. Optimizing the dCTP tailing reaction

Figure 22. Quantification of tailing efficiency

Figure 23. Techniques for ligating adapter sequences & fragments of interest

Figure 24. Increasing ligation efficiency by augmenting oligo concentration

Figure 25. Varying dG overhang length to enhance ligation efficiency

Figure 26. Evidence for successful ligation using Bender & Curran method

Figure 27. Comparison of signal shifts for non-ligated and ligated samples

Figure 28. Retention of cDNA after removal of gel purification steps

Figure 29. Circularization of control oligos using CircLigase

Figure 30. Circularization as evidenced by PCR amplification

LIST OF ABBREVIATIONS

aa-tRNA – amino-acylated tRNA

cDNA – complementary DNA

IRES – internal ribosome entry site

miRNA – micro RNA

mRNA – messenger RNA

PCR – polymerase chain reaction

rRNA – ribosomal RNA

SD – Shine-Dalgarno sequence

tRNA – transfer RNA

UTR – untranslated region

R. Hugh F. Bender

A METHOD FOR IDENTIFYING RIBOSOME PAUSE SITES IN

MESSENGER RNA THROUGH NEW SEQUENCING TECHNOLOGY

Thesis under the direction of James F. Curran, Ph.D., Professor of Biology.

Ribosome frameshifting, pausing, and termination events have a profound effect

on the accuracy and reliability of protein synthesis. A variety of translational control

mechanisms combine to ensure the fidelity of this process. Some mechanisms, such as

aminoacyl-tRNA selection and codon:anticodon interactions, are well characterized.

However, more complex mechanisms, such as mRNA secondary structures, require a

new approach for further study. We suggest that these mechanisms may be studied on a

global scale by mapping the distribution of ribosomes along a select group of mRNAs. A

combination of techniques for ribosome-protected mRNA fragment isolation (ribosome

footprinting) and high-throughput sequencing—a method for simultaneously sequencing

up to 40 million fragments—provides the tools for studying global translational control

mechanisms. We plan to sequence a set of ribosome footprints using Illumina’s Genome

Analyzer, which requires the presence of a unique adapter sequence on each end of

these fragments. Conventional methods for adding these adapter sequences to RNA

fragments result in significant material loss. Here we propose an improved method for

isolating and converting small RNA fragments to a sequencing-ready form without the

use of gel purification or sample amplification (PCR) steps. We also critique a similar

method published by Ingolia, et al., (2009).

INTRODUCTION

Translation is often thought of as a uniform process. Ribosomes move smoothly

across messenger RNAs (mRNA) and produce a single peptide chain which folds to

become a functional protein. In reality, the process is more complex. Factors such as

mRNA sequence and secondary structures can influence ribosome movement along the

mRNA and lead to programmed ribosomal frameshifting, pausing, and termination events.

These events can have a profound effect on the production and functionality of proteins.

Regulation of these events is critical to preserving the fidelity of RNA-to-polypeptide

translation, yet our understanding of these control mechanisms and their effects on

translation remains limited.

Previous studies aimed at understanding gene expression have utilized

microarrays to quantify the relative populations of individual mRNAs in the cell. These

studies are useful as a first step for understanding protein production in the cell, but they

fall short in observing the entire synthesis process. Translational control mechanisms

affect the binding and movement of ribosomes along these mRNAs and are not readily

discovered by observing the cellular transcriptome alone. The advent of high-throughput

sequencing techniques has opened the door for in-depth studies of translational control

mechanisms. We propose that a combination of ribosome footprinting and high-

throughput sequencing techniques can reveal important information about ribosomal

pause sites and potential translational control regions. This study discusses methods we

have developed for isolating small, ribosome-protected mRNA fragments (the ribosome

footprint) and for preparing these fragments, through a series of molecular manipulations,

for high-throughput sequencing. Additionally, this study compares our method with

another recently published procedure also designed for small RNA conversion.
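
To make the intended analysis concrete, the following minimal sketch counts how many footprints cover each nucleotide of a transcript and flags positions whose coverage is well above average as candidate pause sites. The footprint coordinates, the function names, and the threshold of twice the mean coverage are assumptions chosen for illustration; they are not part of the protocols described in this thesis.

```python
def footprint_coverage(footprints, transcript_length):
    """Count the footprints covering each position of a transcript.

    footprints: (start, end) pairs, 0-based and end-exclusive (assumed format).
    """
    coverage = [0] * transcript_length
    for start, end in footprints:
        for pos in range(max(start, 0), min(end, transcript_length)):
            coverage[pos] += 1
    return coverage


def candidate_pause_sites(coverage, fold=2.0):
    """Return positions whose coverage exceeds `fold` times the mean coverage."""
    mean = sum(coverage) / len(coverage)
    return [pos for pos, depth in enumerate(coverage) if depth > fold * mean]


# Hypothetical 30-nt footprints mapped to a 150-nt transcript.
footprints = [(0, 30), (40, 70), (45, 75), (50, 80), (100, 130)]
coverage = footprint_coverage(footprints, 150)
pause_sites = candidate_pause_sites(coverage)
```

In this toy example the three overlapping footprints pile up over positions 50 to 69, which is the kind of pattern that ribosome stacking behind a pause site would be expected to produce.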

Translational Control Mechanisms

The control mechanisms present in cells that regulate protein synthesis are varied

and complex. These controls range from regulating the selection of the correct amino

acids to physically disrupting the movement of ribosomes across the message. An

understanding of the individual control mechanisms is critical to understanding their

potential interactions from a global standpoint.

Translation Initiation: Binding the Ribosome

Prokaryotic and eukaryotic systems vary in the mechanisms by which translation

is initiated. Several mechanisms exist in both systems to increase the affinity of the

ribosomal translation complex for the start codon of the mRNA.

Prokaryotes primarily use a high-affinity sequence to bind the small ribosomal

subunit to the region of mRNA just upstream of the start (AUG) codon. The Shine-

Dalgarno (SD) region has a consensus sequence (5’-GGAGGU-3’) complementary to a

six-nucleotide sequence at the 3’ end of the 16S ribosomal RNA (rRNA) sequence, the

interaction of which is likely to facilitate binding of the small subunit to the mRNA

(Shine and Dalgarno, 1974). Interestingly, the formation of secondary structure may

make this sequence unavailable for ribosome binding in some instances, providing a

potential regulatory mechanism for translation. More recent evidence suggests the SD

region may form a duplex with the region of the mRNA just 5’ of the start codon and this

is what binds the small subunit with the assistance of protein S2 (Yusupova et al., 2006).

Mutations to the last four nucleotides of the SD region decrease the level of

expression, whereas modifying the first two nucleotides enhances expression (Park et al.,

2007). To form the rest of the translation complex, ribosomal protein S1 increases the

affinity of the new small ribosomal subunit:mRNA complex for the large, 50S subunit of

the ribosome (Hartz et al., 1991). Without the S1 protein, translation does not occur.

Lastly, the completed ribosome machinery is stabilized by the addition of

codon:anticodon interactions following addition of the fMet-tRNA to the P-site to begin

translation (Yusupova et al., 2006). The SD region also facilitates the beginning of

ribosome translocation, as will be discussed later in the context of ribosome movement

(Uemura et al., 2007).
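
The complementarity between the SD consensus and the 3’ end of the 16S rRNA can be illustrated with a short sketch. It derives the sequence that pairs with the consensus given above and searches an mRNA for exact copies of the consensus shortly upstream of each AUG. The 15-nucleotide search window, the function names, and the example sequence are assumptions made for illustration; real SD regions vary in sequence and spacing, and this sketch considers only exact matches.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
SD_CONSENSUS = "GGAGGU"  # consensus quoted in the text, written 5' to 3'


def reverse_complement(rna):
    """Return the Watson-Crick reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_sd_sites(mrna, max_distance=15):
    """Return (sd_start, aug_start) pairs where the SD consensus lies
    entirely within `max_distance` nucleotides upstream of an AUG.

    max_distance is an assumed window, not a value taken from the text.
    """
    sites = []
    aug = mrna.find("AUG")
    while aug != -1:
        window_start = max(0, aug - max_distance)
        sd = mrna.find(SD_CONSENSUS, window_start, aug)
        if sd != -1:
            sites.append((sd, aug))
        aug = mrna.find("AUG", aug + 1)
    return sites


anti_sd = reverse_complement(SD_CONSENSUS)    # sequence that pairs with the SD
sites = find_sd_sites("AAGGAGGUAAUACAUGGCU")  # hypothetical mRNA fragment
```

Here `anti_sd` is the six-nucleotide sequence, written 5’ to 3’, that would pair with the consensus, and `sites` pairs the position of the SD match with the position of the AUG that follows it.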

Eukaryotic translation initiation is more complex and varied compared to

prokaryotic initiation. A 5’ terminal cap structure, in conjunction with nucleotides just

upstream of the AUG start codon, is critical to successful ribosome binding. Cap

Binding Protein I (CBP I) forms a complex (CBP II) with two other proteins at the cap

structure to allow for binding of the small ribosomal subunit (40S) to the 5’ cap (Shatkin,

1985). Translation does not occur in the absence of the cap or the CBP II complex. Once

bound to the mRNA, the small ribosomal subunit will scan downstream of the cap to find

the start codon. There are typically multiple AUG codons upstream of the correct start

codon, which presents a potential problem for the ribosome in identifying the correct start

codon. The affinity of the sequence surrounding each AUG codon for the 40S subunit

determines whether ribosomes bind consistently at each site (Kozak, 1984). A eukaryotic

translation initiation sequence (consensus: 5’-CC(A/G)CCAUGG-3’, in which AUG is the

start codon) is therefore critical for signaling the ribosome to stop at the correct

AUG. A purine three bases upstream of the start codon is especially important, as

mutations in positions -1, -2, and +4 (relative to A within the start codon, designated ‘0’)

become significant if this purine is replaced with a pyrimidine (Kozak, 1986b). Additionally,

GCC motifs in the -6 and -9 positions upstream of the start codon in mammalian systems

appear to further enhance the signal for the 40S subunit to pause at the correct AUG

codon (Kozak, 1987). Once the 40S subunit has bound the AUG codon, the large

ribosomal subunit binds to form the translation complex. This structure is stabilized by

Met-tRNA binding followed by translation initiation (Kozak, 1992).
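
The role of sequence context in start codon selection can be sketched as a simple scan. For every AUG in a transcript, the code below checks the two features of the consensus emphasized above: a purine three bases upstream of the A and a G immediately following the AUG. The two-point score, the function names, and the example sequence are illustrative assumptions and do not reproduce Kozak's measurements.

```python
def kozak_context(mrna, aug_index):
    """Report whether an AUG has a purine three bases upstream of its A
    and a G immediately after the codon."""
    has_purine = aug_index >= 3 and mrna[aug_index - 3] in "AG"
    has_g = aug_index + 3 < len(mrna) and mrna[aug_index + 3] == "G"
    return has_purine, has_g


def rank_start_codons(mrna):
    """Return (aug_index, score) for every AUG, where the score (0 to 2)
    counts how many of the two context features are present."""
    results = []
    idx = mrna.find("AUG")
    while idx != -1:
        has_purine, has_g = kozak_context(mrna, idx)
        results.append((idx, int(has_purine) + int(has_g)))
        idx = mrna.find("AUG", idx + 1)
    return results


# Hypothetical leader: a weak upstream AUG followed by an AUG in full context.
ranked = rank_start_codons("UUUAUGCCCGCCACCAUGGCG")
```

In this example the upstream AUG has neither feature and scores 0, while the downstream AUG matches both and scores 2, mirroring the idea that a scanning subunit is more likely to stop at the second site.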

Viruses are known to shut off host translation mechanisms by removing the cap-

binding capabilities of cells. This allows increased expression of virus messages without

the interference of cellular mRNAs. However, by deactivating cap recognition, viruses

must have another mechanism by which to bind the 40S ribosomal subunit (Sonenberg

and Hinnebusch, 2007). In place of the 5’ cap, viral mRNAs have a sequence upstream

of the start codon which forms a secondary structure known as the Internal Ribosome

Entry Site (IRES). When confronted with the IRES or other similar secondary structure,

ribosomes bind this structure, unfold it, then feed the mRNA through the appropriate

ribosomal tunnel in preparation for translation (Marzi et al., 2007). Interestingly,

ribosomes contain a platform that is suited for binding a variety of additional structures

including poly-A and poly-U regions, hairpins, and pseudoknots. Evidence suggests that

ribosome binding is not dependent solely on the secondary structure of the IRES and that

there is actually significant structural and sequence variation in these regions (Xia and

Holcik, 2009). It is possible other mechanisms exist by which translation initiation

factors may bind to the mRNA, but further studies must be conducted to determine if this

is actually the case.

Selection of Aminoacylated-tRNAs

Following initiation, a variety of other factors affect the speed with which

ribosomes move along the mRNA. Some of the best understood factors are those which

are sequence specific. Not all codons are translated at the same rate, a phenomenon due

in large part to variations in the concentrations of aminoacylated-transfer RNAs (aa-

tRNA) in the cell (Curran and Yarus, 1989). During translation, ribosomes translocate to

the next codon following entry and binding of an aa-tRNA which matches the codon.

However, at “hungry” or slowly translated codons—those codons where the correct aa-

tRNA is not readily available—the ribosome must pause until the correct aa-tRNA is

bound. This disparity in transfer RNA (tRNA) concentration results in non-uniform

movement of ribosomes along the mRNA and likely provides a mechanism by which

cells can globally control translation by altering the concentrations of one or more tRNAs.
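
The link between aa-tRNA availability and ribosome pausing can be sketched with a toy model. The code below assumes, purely for illustration, that the time a ribosome spends at a codon is inversely proportional to the abundance of the matching aa-tRNA, and it flags codons whose dwell time is well above the average for the message. The abundance values, the inverse-proportionality assumption, and the two-fold threshold are hypothetical and are not measurements from this study.

```python
def relative_dwell_times(codons, trna_abundance):
    """Estimate a relative dwell time per codon as 1 / aa-tRNA abundance
    (an assumed, simplified relationship)."""
    return [1.0 / trna_abundance[codon] for codon in codons]


def hungry_codon_positions(codons, trna_abundance, fold=2.0):
    """Return indices of codons whose dwell time exceeds `fold` times the mean."""
    dwell = relative_dwell_times(codons, trna_abundance)
    mean = sum(dwell) / len(dwell)
    return [i for i, t in enumerate(dwell) if t > fold * mean]


# Hypothetical relative aa-tRNA abundances; AGG is given a rare tRNA.
abundance = {"GCU": 1.0, "CUG": 1.0, "AGG": 0.1}
message = ["GCU", "CUG", "AGG", "GCU", "CUG"]
slow_positions = hungry_codon_positions(message, abundance)
```

With these made-up numbers only the third codon is flagged. Raising the abundance of its tRNA in the dictionary removes the pause, which is the kind of global control suggested above.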

Starvation mediated by tRNA concentration not only causes ribosome pausing but

also facilitates ribosome bypassing and forward/reverse frameshifting around slowly

translating codons (Liao et al., 2008; Lindsley et al., 2003). Interestingly, ribosome

bypassing—a phenomenon in which the ribosome skips over a portion of the mRNA—

sometimes occurs when another in-frame codon is positioned downstream of the slow

codon. This provides a mechanism which preserves amino acid order in the peptide

sequence despite the presence of a slow codon and the physical discontinuity of translation.

The downstream codon is often synonymous with the slow codon, allowing the same amino

acid to be integrated using a more readily available aa-tRNA.

Codon read speed appears to be a critical factor in producing functional proteins.

Studies replacing slow codons with faster, synonymous codons found translation greatly

increased, as expected, but also noted severely decreased enzyme activity (Komar et al.,

1999). Evidence suggests ribosome pausing at these and other translational control

regions may be useful for nascent polypeptide folding (Purvis et al., 1987). Further, the

mechanisms by which nascent polypeptide folding occurs may explain why some proteins

are not able to re-fold after inactivation in the absence of the ribosomal machinery. Fast

and slow codons have been correlated with the formation of specific secondary protein

structures (Thanaraj and Argos, 1996a). In general, faster codons have been tied to the

formation of alpha helices, while slower regions match with beta sheet regions. Coil

regions in the protein also appear to be correlated with slowly translated regions,

although evidence for this association is not as clear cut as that for beta sheets. Fast

codons also typically code for amino acids which readily form alpha helices, a

structure which may be formed via interactions with the peptide exit channel of the

ribosome. Slowly translating regions also correlate with protein segments at the ends or

regions linking two protein domains (Thanaraj and Argos, 1996b). The ribosome pause

at these slow codons occurs after the domain portion of the peptide has been synthesized,

providing further evidence for a connection between ribosome pausing and nascent

polypeptide folding.

Codon:Anticodon Interactions

The selection of the correct aa-tRNA is important in preserving nucleotide-to-

peptide code translation. This selection is believed to have a low error rate of one

incorrect aa-tRNA for every 1,000 correct aa-tRNAs integrated during translation

(Wintermeyer et al., 2004). The molecular codon:anticodon interactions within the

ribosome play a critical role in distinguishing between correct and incorrect aa-tRNAs

and may also function in facilitating ribosomal frameshifting events. An analysis of

codon:anticodon interactions reveals that stability of this complex alone is not specific

enough to distinguish between aa-tRNAs (Lim and Curran, 2001). While stability plays a

role, the breakage and reformation of a small number of hydrogen and ionic bonds during

codon:anticodon complex formation appears to have the biggest effect on distinguishing

between correct and incorrect aa-tRNAs.
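
The practical meaning of an error rate of roughly one in 1,000 can be illustrated with a short calculation. The sketch below assumes that misincorporation events are independent and occur at the same rate at every codon, which is a simplification, and uses a hypothetical protein length of 300 residues.

```python
def error_free_fraction(length, error_rate=1e-3):
    """Probability that every residue of a protein of `length` amino acids
    is incorporated correctly, assuming independent errors at a fixed rate."""
    return (1.0 - error_rate) ** length


def expected_errors(length, error_rate=1e-3):
    """Expected number of misincorporated residues per protein."""
    return length * error_rate


# Hypothetical 300-residue protein at the error rate quoted in the text.
fraction_correct = error_free_fraction(300)
mean_errors = expected_errors(300)
```

Under these assumptions roughly three-quarters of the copies of a 300-residue protein would be free of misincorporated residues, with an average of 0.3 errors per copy.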

Studies of the E. coli release factor 2 (RF2) programmed frameshift site have

helped to characterize the sequences necessary for ribosomal frameshifting. In

prokaryotes, frameshifting is often accompanied by a slowly translated UGA codon and

may be enhanced by the presence of an SD sequence (Schwartz and Curran, 1997).

Codon:anticodon base pairing within specific tRNA binding sites within the ribosome

may further affect frameshifting. As discussed previously, the limited availability of select

aa-tRNAs in the ribosomal A-site facilitates frameshifting at slow or hungry codons (Liao et

al., 2008; Lindsley et al., 2003). The ribosomal P-site appears to have the greatest effect

on the occurrence of frameshifting events. In particular, frameshifting appears to be

enhanced when the frameshifted P-site codon:anticodon interaction is more stable than

the in-frame interaction (Curran, 1993). Although it was long thought to have no

role in translation, recent evidence suggests that the empty tRNA present in the

E-site has a direct effect on frameshifting (Liao et al., 2008; Sanders and Curran, 2007).

Stronger codon:anticodon interactions in this site reduce the rate of frameshifting, while

weaker interactions actually facilitate frameshifting events.

The Secondary Structure of Messenger RNA

Perhaps the least understood of the translational control mechanisms described

here is the role of mRNA secondary structure in protein synthesis. However, this is one

of the most intriguing mechanisms and one we hope to discover ample evidence for in

future studies.

In the context of ribosome binding during translation initiation, a large body of

evidence suggests the importance of secondary structure surrounding the initiating AUG

codon. A discrepancy has been observed between prokaryotic and eukaryotic ribosome

interactions with mRNA secondary structure (Zagorska et al., 1982). In the presence of

either prokaryotic or eukaryotic mRNAs, prokaryotic ribosomes appeared unable to

recognize and initiate translation on transcripts with secondary structure. In contrast,

eukaryotic ribosomes appear to function equally efficiently in the presence or absence of

secondary structure. Further studies have determined that eukaryotic ribosomes are, in

fact, sensitive to secondary structures such as hairpins, although this effect is relative to

the proximity of the hairpin to the initiation codon (Kozak, 1986a; Kozak, 1989).

Hairpin structures 5’ to the start codon appear to inhibit translation, and more stable

hairpins have a greater effect than weaker hairpins. Interestingly, vertebrate mRNAs

have been found to contain GC-rich regions which form stable secondary structures

(Kozak, 1991). The mRNAs encoding a variety of regulatory proteins, such as oncoproteins,

growth factors, and transcription factors, appear to have the highest occurrence of such

structures, which regulate their expression. It has also been suggested that these structures

may facilitate non-AUG translation initiation, a phenomenon noted in both vertebrate and

viral mRNAs.

Secondary structure has also been found to occur in the coding region of mRNAs

and appears to have a direct effect on the movement of ribosomes. Wolin and Walter

(1988) identified four ribosome pause sites along a transcript by isolating the positions of

individual ribosomes. Translation initiation and termination appear to be accompanied

by ribosome pausing as might be expected due to the decreased rate of ribosome binding

and disengaging relative to the speed of the ribosome during translation. However, two

pause regions identified in the coding region were correlated with weak secondary

structures in both regions, rather than the position of slow codons. Importantly, this

study also noted ribosome stacking at these pause sites, suggesting potential regulatory

effects for upstream ribosomes at these sites. As with the initiation sequence, hairpin

stability may have a direct effect on the ability of a ribosome to translate a region of

mRNA (Baim et al., 1985). A mutant hairpin with increased stability relative to the wild-

type hairpin structure was observed to correlate with an 80% reduction in protein

synthesis. Ribosomal frameshifting in a retroviral system is often correlated with a stem-

loop secondary structure just downstream (3’) of the frameshift site, in conjunction with a

slippery sequence upstream of the stem-loop (Jacks et al., 1988). It has been suggested that

secondary structures encourage pushing or pulling of the ribosome during frameshifting

in addition to physically stopping the ribosome, a process which might facilitate

rearrangement of the codon:anticodon interactions in the P- and A- sites of the ribosome.

Taken together, these studies suggest a substantial, albeit largely unknown, role for

secondary structure within the coding portion of the mRNA.
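
As a rough illustration of how candidate hairpins might be located in a coding sequence, the sketch below searches for short inverted repeats: a stretch of nucleotides followed, after a loop of limited length, by its reverse complement. The stem length, the loop range, the function names, and the example sequence are assumptions for illustration. The search considers only perfect Watson-Crick stems and makes no estimate of thermodynamic stability, which a dedicated RNA folding program would be needed to assess.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the Watson-Crick reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_stem_loops(rna, stem=4, min_loop=3, max_loop=8):
    """Return (left_start, right_start) pairs for perfect inverted repeats
    of length `stem` separated by a loop of min_loop..max_loop nucleotides."""
    hits = []
    for i in range(len(rna) - 2 * stem - min_loop + 1):
        target = reverse_complement(rna[i:i + stem])
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            if rna[j:j + stem] == target:
                hits.append((i, j))
    return hits


# Hypothetical sequence: a GGGC stem, an AAAA loop, and the pairing GCCC arm.
hairpins = find_stem_loops("GGGCAAAAGCCC")
```

Each reported pair gives the start positions of the two arms of a possible stem, so these positions could be compared against observed ribosome pause sites.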

An alternative form of secondary structure to hairpins, pseudoknots are also found

in the coding regions of mRNAs. These structures are believed to play a role in ribosome

pausing and frameshifting (Alam, et al., 1999). Pseudoknots appear to cause the

ribosome to pause directly over the frameshift site, although heelprinting studies suggest

there may be other factors which facilitate frameshifting beyond the pseudoknot itself (Tu,

et al., 1992). Slippery sequences, which facilitate forward or backward sliding of the

ribosome during frameshifting, have been found to accompany some pseudoknots a short

space upstream of the pseudoknot (Alam, et al., 1999). An in-depth study of ribosome-

pseudoknot interactions suggests the ribosome may move until the pseudoknot is at the

entrance of the mRNA tunnel of the ribosome (Namy, et al., 2006). The pseudoknot and

ribosome then appear to interact directly, limiting the movement of the ribosome after

pausing. As the ribosome attempts to translocate, codon:anticodon interactions within

the ribosome appear to be disrupted, providing a mechanism by which frameshifting may

occur. However, it is important to note that the authors of this study suggest a different

mechanism may be in place with hairpin structures, as the ribosome may be more capable

of unwinding these structures. The unwinding capabilities of ribosomes, as discussed in

the next section, appear to come into play with these structures (Wen et al., 2008).

Ribosome pausing at pseudoknot structures, as with hairpins, depends greatly on

the stability of the structure. In fact, Namy, et al. (2006) note that the stability of the

pseudoknot may be greatly dependent on the interaction between the ribosomal helicase

center and the pseudoknot. Multiple studies have also noted that events, such as

increased temperature and mutations to key nucleotides in the pseudoknot structure,

affect pseudoknot stability and are associated with decreased rates of frameshifting

(Alam, et al., 1999; Somogyi, et al., 1993). Shen and Tinoco (1995) have identified an

extremely stable pseudoknot structure in the mouse mammary tumor virus, which varies

significantly from those previously identified. This hints at the presence of a wide

variety of pseudoknot structures which have not been previously identified and indirectly

suggests that pseudoknots may be uniquely specialized in their structures to regulate

translation. Replacement of Shen and Tinoco’s pseudoknot structure with other known

pseudoknot sequences resulted in significant decreases in frameshifting. This provides

evidence for the variety of translational regulatory roles assumed by pseudoknots,

although more extensive experimentation is required. It is unclear if ribosomes stack

behind pseudoknots as they do behind hairpins, but based on the variety of pseudoknot

regulatory capabilities noted above, it is possible that the degree of ribosome stacking

may also depend on the stability of the pseudoknot.

Ribosome Movement Along mRNA

Based on the evidence described in the previous sections, translocation of the

ribosome itself is dependent on both sequence and structural elements within the mRNA.

Actual translocation events are believed to occur sporadically (Wintermeyer et al., 2004).

Alterations to the binding affinity of codon:anticodon interactions in the P-site following

peptide bond formation cause this to be a thermodynamically random process of

movement (Takyar et al., 2005). Prior to peptide bond formation within the ribosome,

the force necessary for translocation is 26.5 piconewtons (pN) yet this force is

dramatically decreased with peptide bond formation to 12.7 pN (Uemura et al., 2007). In

prokaryotes, the presence of the SD sequence both upstream of the start codon and within

the coding region itself appears to have an especially direct effect on the movement of the

ribosome, perhaps due to the P-site interactions. Recent evidence suggests this sequence

may cause translation to pause or even terminate in some locations, a phenomenon that

disappears when these sequences are replaced with synonymous codons (Wen et al.,

2008). The force for translocation was also found to decrease with several modified SD

regions, providing evidence for a mechanism which may “kick-off” the ribosome to

begin translation (Uemura et al., 2007). It is possible these mechanisms may combine to

pause translation or cause ribosomes to drop off of the transcript as evidenced by Wen, et

al., (2008).

As discussed previously, secondary mRNA structures have a profound effect on

the movement of ribosomes and affect the folding of the nascent polypeptide (Hardesty et

al., 1999; Purvis et al., 1987). The mechanisms by which the ribosome unwinds these

secondary structures are complex and vary by the actual structure to be unfolded. In most

cases, an RNA helicase is believed to be associated with the ribosome which assists in

unfolding secondary structures during translation. However, recent evidence suggests

that the ribosome itself may have some independent helicase activity (Takyar et al., 2005).

This helicase enzyme site appears to be centered in the mRNA entrance tunnel

approximately 11 nucleotides downstream from the P-site codon and utilizes the S3 and

S4 proteins in this tunnel, which may clamp the mRNA during the unwinding process.

The location of this helicase site appears to agree with evidence for frameshifting and

pseudoknot unwinding as discussed by Alam, et al. (1999).

Takyar, et al. (2005) suggested in their study that this helicase activity may allow

ribosomes to unwind hairpin structures. A recent, extensive study on the movement of

single ribosomes along mRNAs confirmed this hypothesis (Wen et al., 2008).

Ribosomes are capable of unwinding relatively large hairpin structures (60 and 274

bases) with no assistance from RNA helicase or other proteins outside of the ribosome

itself. As a ribosome unfolded the hairpin during translation, the length of the mRNA

fragment was found to extend. This extension occurred by a distance proportional to two

codons, suggesting that the ribosome itself unwinds the hairpin one codon at a time (the

codon and its complement must unwind simultaneously, accounting for the two codon

extension). With a larger hairpin, the same unwinding capability was found, but

translation was found to pause at some codons. Translation could be rescued by applying

force to the ends of the mRNA, suggesting the helicase activity inherent to the ribosome

is not sufficient for unwinding all secondary structures a ribosome might encounter.

Nevertheless, this study reveals important details about the ability of ribosomes to

unwind hairpin structures.

Pseudoknots are structurally more complex, presenting different challenges for

translation than the hairpin. Although a similar study to that of Wen, et al., has yet to be

conducted for pseudoknot structures, mechanical unwinding studies can give an idea of

the mechanics needed to unfold these structures. The stabilities of hairpin and

pseudoknot structures are similar, although the unwinding kinetics differ greatly (Green

et al., 2008). While hairpins are unwound by a tensile force perpendicular to the structure,

pseudoknots experience a shearing force during mechanical unfolding. Therefore,

hairpins are unwound at a faster kinetic rate than pseudoknots. Interestingly, pseudoknot

unwinding is independent of the force applied, while increased force increases the rate at

which hairpins are unwound. Despite the similar stabilities of the structures, the minimal

unwinding force is greater for pseudoknots (~50pN) as compared to hairpins (~20pN)

(Chen et al., 2007). These studies suggest the processes for unwinding these secondary

structures are fundamentally different for the ribosome, and it will be important to study the

mechanics of ribosomal unwinding of pseudoknots in the future. For now, there is

substantial evidence for the effects of hairpins and pseudoknots on protein production and

it will be interesting to determine the effects of these structures on a global scale.

Footprinting Determines the Location of Ribosomes During Translation

Obtaining the location of individual ribosomes is the first step in identifying

translational control regions in a set of mRNAs. Ribosome location has been determined

in previous studies using ribosome footprinting techniques, designated toeprinting or

heelprinting depending on the method employed. Ribosome toeprinting involves

hybridizing a primer to a downstream portion of RNA and using reverse transcriptase to

synthesize a cDNA fragment up to the 3’ (toe) portion of the stopped ribosome

(Brimacombe, 1991). Synthesized fragments can then be compared to an unmodified

RNA using gel electrophoresis to give the relative position of ribosomes on the mRNA.

However, this method can yield false positives as secondary structures within the mRNA

may not allow for complete synthesis of cDNA up to a ribosome. Ribosome heelprinting

takes the toeprinting concept a step further and uses a ribonuclease to digest exposed

mRNA up to the portion of mRNA protected by the ribosome (Somogyi et al., 1993;

Wolin and Walter, 1988). These protected fragments are then purified and hybridized to

a complementary DNA oligo. A primer is bound downstream of this fragment and DNA

is synthesized from the primer to the hybridized fragment. The lengths of synthesized

DNA fragments are compared to the unmodified DNA template using gel electrophoresis

and can again be used to determine the relative positions of individual ribosomes, this

time up to the 5’ (heel) portion of the ribosome. Ribosome heelprinting can also be used

to accurately determine the size of the ribosome footprint, which has been placed

between 24 and 32 nucleotides in length by Wolin and Walter (1988).

By isolating ribosome footprints in a manner similar to that used during ribosome

heelprinting, we can collect a fragment pool directly, without subsequent hybridization,

synthesis, and electrophoresis steps. Our method incorporates a procedure for isolating

polyribosomes from HeLa cells and digesting exposed portions of mRNA to yield

ribosome footprint fragments for sequencing.

High-Throughput Sequencing of Ribosome Footprints

Upon collection, ribosome footprint fragments can be used to identify the location

of individual ribosomes. High-throughput sequencing allows us to efficiently sequence

the high number of collected fragments in a single sequencing run. Mapping these

sequences along mRNA sequences can yield the position of individual ribosomes and a

global ribosome distribution map for a population of gene transcripts. Identification of

regions of ribosome accumulation should indicate translational control regions.
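
The mapping step described above can be illustrated with a minimal sketch. It assumes exact, ungapped matches of each footprint read to a single reference transcript; the function name map_footprints, the toy transcript, and the toy reads are hypothetical and stand in for a real alignment program and real sequencing data.

```python
from collections import Counter


def map_footprints(transcript, reads):
    """Map reads to one transcript by exact string matching.

    Returns (coverage, starts): the read depth at every nucleotide and a
    Counter of 0-based 5' start positions. Unmatched reads are ignored.
    """
    coverage = [0] * len(transcript)
    starts = Counter()
    for read in reads:
        pos = transcript.find(read)
        while pos != -1:
            starts[pos] += 1
            for i in range(pos, pos + len(read)):
                coverage[i] += 1
            pos = transcript.find(read, pos + 1)
    return coverage, starts


# Toy data (hypothetical). Short reads keep the example readable;
# real footprints are roughly 24-32 nucleotides long.
transcript = "AUGGCUAAAGGUCCGUUAGCAUCGGAUCCUAAGUGCUGA"
reads = [transcript[3:15], transcript[3:15], transcript[20:32]]

coverage, starts = map_footprints(transcript, reads)
peak = max(range(len(coverage)), key=coverage.__getitem__)
print("start counts:", dict(starts))
print("first position of maximal coverage:", peak)
```

In this toy case the region covered by two reads would be the candidate site of ribosome accumulation; with real data, read depth would need to be normalized to transcript abundance before drawing such a conclusion.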

Traditional Sanger sequencing techniques have proven especially valuable for

sequencing large quantities of longer (700 base pair) DNA fragments—a practical tool

for whole genome sequencing applications. However, novel high-throughput sequencing

technology has the capacity to read up to 40 million different sequences in a single run at

a greatly reduced cost per run. Given a cost of 0.1 cents per base, it would cost

approximately $25 million to sequence a mammalian genome with Sanger sequencing

(Hudson, 2008). High-throughput sequencing offers a drastically reduced cost. By

comparison, 454’s pyrosequencing method has a capacity of 100 million bases per run at

an approximate cost of 0.005 cents/base. Illumina’s sequencing-by-synthesis method

decreases this cost even further to approximately 0.002 cents/base while reading 10-fold

more bases (1 billion) per sequencing run compared with pyrosequencing. To sequence

the same mammalian genome noted above, it would cost approximately $1.25 million

using pyrosequencing or $500,000 using Illumina sequencing, although whole-genome

sequencing using these methods is impractical at this time. The increased capacity and

reduction in costs provide the opportunity to ask more advanced questions about

transcriptome and microRNA (miRNA) populations—applications which were

previously impossible using Sanger sequencing.
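
The three cost estimates quoted above are mutually consistent if each assumes the same total number of sequenced bases. The sketch below back-calculates that total from the Sanger figure (an inference on our part, not a number stated by Hudson) and then recomputes the other two estimates from the per-base costs. The variable and function names are illustrative.

```python
# Per-base costs quoted in the text, in cents per base.
COST_CENTS_PER_BASE = {
    "Sanger": 0.1,
    "454 pyrosequencing": 0.005,
    "Illumina": 0.002,
}
SANGER_GENOME_COST_USD = 25_000_000  # quoted cost of one mammalian genome

# Total bases implied by the Sanger estimate (assumption: the same total
# applies to all three platforms).
total_bases = SANGER_GENOME_COST_USD / (COST_CENTS_PER_BASE["Sanger"] / 100)


def genome_cost_usd(platform):
    """Cost in US dollars to sequence total_bases on the given platform."""
    return total_bases * COST_CENTS_PER_BASE[platform] / 100


print(f"implied total bases: {total_bases:.3g}")
for platform in COST_CENTS_PER_BASE:
    print(f"{platform}: ${genome_cost_usd(platform):,.0f}")
```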

Sequencing-by-synthesis, a method for tracking the addition of individual

nucleotides during primer-based polymerase extension, was first made commercially

available by 454 Life Sciences in the form of pyrosequencing. Using this method,

sequencing is carried out by converting the pyrophosphate released during nucleotide

addition to ATP using ATP sulfurylase (Ronaghi et al., 1998). This ATP then provides

energy for luciferase to give off a light signal. An important development in this method

is the addition of apyrase which degrades free nucleotides prior to the addition of the next

base (Ronaghi, 2001). The use of this enzyme allows for bases to be added sequentially

without additional wash steps, thereby decreasing the number of steps and time for each

sequencing run. It is important to consider that homopolymeric regions of up to three or

four residues in a row can be sequenced using this method as the light given off is

directly proportional to the number of bases incorporated. However, homopolymeric

regions of greater length are not easily detected (Hudson, 2008). For this reason, we have

explored other methods for our high-throughput sequencing needs.
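
The homopolymer limitation follows from the way pyrosequencing signals are read, which can be sketched as an idealized flowgram. The sketch assumes a noise-free signal of exactly one light unit per incorporated base and a repeating T, A, C, G nucleotide flow order; both are simplifying assumptions, and the function name flowgram is illustrative.

```python
def flowgram(sequence, flow_order="TACG"):
    """Return the ideal light signal for each nucleotide flow.

    sequence is the strand being synthesized, 5' to 3'. Each flow adds one
    nucleotide species; the signal equals the number of consecutive
    matching bases incorporated during that flow.
    """
    if set(sequence) - set(flow_order):
        raise ValueError("sequence contains bases absent from the flow order")
    signals = []
    pos = 0
    while pos < len(sequence):
        for base in flow_order:
            run = 0
            while pos < len(sequence) and sequence[pos] == base:
                run += 1
                pos += 1
            signals.append((base, run))
            if pos == len(sequence):
                break
    return signals


for base, signal in flowgram("TTAGGGGGC"):
    print(base, signal)
```

The run of five G residues produces a single flow with a signal of 5. In practice this must be distinguished from signals of 4 or 6 against measurement noise, which is why longer homopolymers are hard to call.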

A method designed to compete with 454’s technology contains several key

modifications to avoid the same homopolymeric sequence detection problems described

above. This method, now used in Illumina’s Genome Analyzer, incorporates unique

fluorescently-tagged bases with a reversible-terminator in place of the 3’ hydroxyl group

(Bentley et al., 2008). The combination of these additions allows all four bases to be

added simultaneously without the risks of over- or misincorporation, thereby allowing for

simultaneous sequencing of all fragments bound to a flow cell at once. Unfortunately,

the enzymatic steps used to remove the fluorophore and reversible terminator limit the

sequence read-length to around 35 bases, as compared to 100-400 bases with

pyrosequencing (Hudson, 2008). While this would pose problems for some applications,

the size of ribosome footprints in our application, as determined by Wolin and Walter

(1988), is within these limits. Given the improvements in sequencing fidelity and the

decreased cost per run, we have selected Illumina sequencing for our ribosome

footprinting experiments.

Illumina sequencing requires fragments of interest to be bound to a flow cell and

amplified in situ using cluster amplification. This amplification creates a series of

clusters arranged across the flow cell, with each cluster representing one sequence to be

read. To accomplish these steps, Illumina requires the presence of a unique adapter

sequence on each end of the sequencing-ready fragment. One adapter, the PCR adapter,

is utilized as a primer during the cluster amplification steps prior to sequencing. The

second adapter, the Sequencing adapter, first hybridizes the original fragment to the flow

cell prior to amplification then also acts as a primer for sequencing after the amplification

steps. Given the limitations in read length discussed above, the position of the

Sequencing adapter must be as near to the sequence of interest as possible to limit

sequencing of bases outside the sequence of interest. Illumina has developed a kit for

preparing RNA samples for sequencing which ligates adapters directly to the RNA before

cDNA synthesis (Figure 1). However, this method requires a substantial amount of

starting RNA material since multiple purification steps throughout the process decrease

the sequencing pool significantly. This method also requires the use of PCR

amplification to complete the synthesis of full-length, sequencing-ready fragments, a

technique which is not desirable in our application.

Given the constraints for preparing fragments for Illumina sequencing and the

problems noted above with their fragment preparation kit, we have designed a method

which minimizes purification and amplification steps in order to retain a large, and

therefore, representative pool of fragments throughout the process of fragment processing

(Figure 2). Gel purification techniques eliminate a large portion of the sequencing pool

as only a small portion is actually recovered after gel elution. PCR has also been

minimized in our protocol as any amplification has the potential to alter the

representation of individual fragments in our pool. Some have argued that stopping

amplification in the linear range—the stage prior to complete depletion of PCR reagents

and, therefore, uneven fragment amplification—will avoid this problem. However, we

feel an accurate data set is best obtained entirely without or with minimal use of PCR

amplification steps. Another group has also sought to improve on Illumina’s method by

eliminating some of the steps which lead to data loss (Figure 3) (Ingolia et al., 2009).

Their method is fundamentally simpler than ours but still contains several gel purification

steps and a final PCR amplification step. The efficiency of both methods will be

compared in this paper. In addition, we will provide evidence for the viability of both

methods.

Summary

High-throughput sequencing technology has greatly expanded the questions we

can explore beyond traditional sequencing technology. When combined with established

ribosome footprinting techniques, this provides a powerful tool for studying translational

control mechanisms. Although mechanisms such as translation initiation and

codon:anticodon binding have been studied at the level of individual ribosomes, it will be

useful to observe ribosome positions in a more global context to reveal details of complex

translational control mechanisms, such as secondary structures and the effects these have

on ribosome movement. While high-throughput sequencing provides new opportunities

for translation exploration, this technology also presents new obstacles for efficiently

preparing libraries of ribosome footprints for sequencing. Given the inefficiency of

fragment preparation using Illumina’s kit, two methods have been designed to improve

the yield of sequencing-ready fragments. Ingolia, et al., have established a simple

method for attaching the required adapter sequences to small RNAs, but we have

developed a method which aims to go a step further. The Bender & Curran method

discussed here seeks to convert a maximum number of ribosome footprints to a

sequencing-ready DNA form with minimal use of gel purification and PCR amplification

steps in order to preserve the fidelity of the original data set. This paper will discuss and

compare the advantages and disadvantages of each method.

MATERIALS AND METHODS

Isolating Ribosome-Protected mRNA Fragments

Sucrose gradients were prepared the night prior to the footprinting procedure in

RNase-free SW41 ultracentrifuge tubes. A 10-42% linear gradient was prepared by

sequentially underlaying 2.2 mL of 10%, 18%, 26%, 34%, and 42% sucrose solutions in

high salt buffer (HSB: 0.5M NaCl, 50mM MgCl2, and 10mM Tris-Cl pH 7.4). Gradients

were covered with plastic wrap and left at 4°C for a minimum of 15 hours.

A 50mL sample of HeLa cells (5 x 10^5 cells/mL) was collected the next morning

in a 50mL Falcon tube supplemented with 0.1mg/mL cycloheximide. These cells were

pelleted on a bench-top centrifuge for 5 minutes at 1,000 rpm. The pellet was

resuspended in 1mL Hank’s buffered salt solution (Invitrogen) supplemented with

0.1mg/mL cycloheximide and this solution was transferred to a microcentrifuge tube.

The cells were pelleted again in a microcentrifuge for 5 minutes at 400 x g at 4°C. After

draining the supernatant, the pellet was resuspended in 0.2mL reticulocyte standard

buffer (RSB – 10mM NaCl, 3mM MgCl2, 10mM Tris-Cl pH 7.4). Cells were allowed to

swell on ice for at least 10 minutes. A solution of 0.2mL RSB and 2x Magik detergent

(1% sodium deoxycholate and 2% Tween 40) was added while vortexing the sample.

The sample was incubated a further 5-10 minutes on ice. After vortexing briefly, the cell

nuclei were pelleted in a microcentrifuge for 10 minutes at 2,000 x g at 4°C. The

supernatant was removed and added atop the 10-42% sucrose gradients prepared

previously. Samples were next centrifuged using an SW41 rotor in a Beckman LC3B

ultracentrifuge at 37,000rpm for 1 hour 40 minutes at 4°C. Using a Pharmacia FRAC-

100 fractionator, the sucrose gradient was collected from the bottom-up in 0.7mL

fractions. Fractions containing the polyribosome portion of the gradient, as determined

by A260 absorbance, were combined in a new, RNase-free SW41 ultracentrifuge tube.

This solution was diluted with 6mL of HSB.

To isolate the ribosome footprint fragments, the fractionated and diluted solutions

were centrifuged using the same ultracentrifuge at 37,000rpm for 75 minutes at 4°C.

Each tube was carefully drained of supernatant and the pellet was resuspended in 200µL

of RSB. To digest the exposed mRNA (the portion not protected by the ribosome), a digest

solution (17U micrococcal nuclease (New England Biolabs), 5mM CaCl2, and RSB

solution to a total volume of 200µL) was added to the pelleted polyribosome solution and

digestion was allowed to occur for 20 minutes at 20°C. The reaction was stopped by the

addition of 10mM EGTA.

It should be noted here that given the significant time required to isolate RNA

from HeLa cells, as discussed above, it was more economical to create a synthetic RNA

oligo (5’-AGCUGGGAUGAUCAGUCAGGAUCGUCCAUG) to test further molecular

manipulation steps. RNA concentrations following micrococcal nuclease digestion were

found to be approximately 150ng/µL prior to molecular manipulations. Therefore, our

RNA stock solution (967ng/µL) was diluted 6-fold prior to all subsequent manipulations

to yield a starting RNA concentration equivalent to our standard cellular yields.
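
The dilution above, and the conversion from mass concentration to molar amount needed for later reactions, can be sketched as follows. The molecular weight uses a common rule-of-thumb approximation for single-stranded RNA (320.5 g/mol per nucleotide plus 159 g/mol); this approximation and the variable names are assumptions introduced for illustration.

```python
STOCK_NG_PER_UL = 967.0   # synthetic RNA oligo stock concentration
DILUTION_FOLD = 6         # dilution applied before all manipulations
OLIGO = "AGCUGGGAUGAUCAGUCAGGAUCGUCCAUG"

working_ng_per_ul = STOCK_NG_PER_UL / DILUTION_FOLD

# Approximate molecular weight of single-stranded RNA in g/mol (assumption).
mol_weight = len(OLIGO) * 320.5 + 159.0

# ng divided by g/mol gives nmol; multiply by 1000 for pmol.
pmol_per_ul = working_ng_per_ul / mol_weight * 1000

target_pmol = 50  # approximate RNA input to the polyadenylation reaction
volume_ul = target_pmol / pmol_per_ul

print(f"working stock: {working_ng_per_ul:.0f} ng/uL")
print(f"oligo length: {len(OLIGO)} nt, approx. {pmol_per_ul:.1f} pmol/uL")
print(f"volume for {target_pmol} pmol: {volume_ul:.1f} uL")
```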

Preparing Fragments for Sequencing: Bender & Curran Method

cDNA Synthesis

Following digestion, the RNA concentration of our sample was quantified using a

Nanodrop spectrophotometer. A polyadenylation solution was made up, containing

approximately 50pmol RNA, 1x poly-(A) polymerase buffer, 2.5U E. coli poly-(A)

polymerase (NEB), 1mM ATP, 3U T4 polynucleotide kinase (NEB), 20U RNaseOUT

(Invitrogen), 1mg/mL Bovine Serum Albumin (BSA) (NEB), and DEPC water to a final

volume of 10µL. This solution was incubated at 37°C for 1 hour.

A special primer was designed for reverse transcription in the form of HB006 (5’-

/5Biosg/(T)45GGATCCTTTTTTTTN), where /5Biosg/ represents a biotin moiety on the

5’ end of the sequence and N indicates the incorporation of a random nucleotide. Fifteen

microliters of hydrophilic streptavidin magnetic beads (NEB) were combined with 3µL

HB006 oligo, 12µL DEPC water, and 60µL streptavidin bead wash/binding buffer (0.5M

NaCl, 20mM Tris-HCl pH 7.5, 1mM EDTA) and were allowed to sit on the bench top

during the polyadenylation reaction (approximately 45 minutes). A magnet was applied

to the sides of each tube and the supernatant was removed leaving behind only oligo dT-

bound beads. These beads were resuspended in 3.5µL of polyadenylated RNA from the

previous reaction and 3µL of DEPC water and allowed to anneal to the RNA poly-A tail

at room temperature for 5-10 minutes. The cDNA reaction mixture was completed by

adding 1X M-MuLV reverse transcriptase reaction buffer, 20U M-MuLV reverse

transcriptase (NEB), 100µM dNTPs, and 20U of RNaseOUT. This solution was

incubated at 37°C for 1 hour. Subsequently, a magnet was applied and the supernatant

was removed from the beads.
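
The layout of the HB006 primer can be made explicit with a short sketch that expands the "(T)45" shorthand and locates the BamHI recognition site. The biotin moiety is not modelled, the cut position assumes the standard BamHI cleavage pattern (G^GATCC) on double-stranded DNA, and the variable names are illustrative.

```python
BAMHI_SITE = "GGATCC"            # BamHI cuts between G and GATCC
HB006_3P = "GGATCCTTTTTTTTN"     # 3' portion of HB006 as written in the text
hb006 = "T" * 45 + HB006_3P      # "(T)45" expanded; 5' biotin not modelled

site = hb006.find(BAMHI_SITE)

# After second-strand synthesis and BamHI digestion, the biotinylated
# spacer stays on the bead and the cDNA is released with the remainder.
bead_bound = hb006[: site + 1]
released_5p_end = hb006[site + 1:]

print("BamHI site starts at position", site + 1, "of the primer")
print("retained on bead:", len(bead_bound), "nt")
print("5' end of released strand:", released_5p_end)
```

Under these assumptions the 45-residue T spacer serves only to hold the primer away from the bead surface, and the released cDNA strand begins with GATCC followed by the short oligo-dT stretch.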

In order to remove the RNA, the beads were resuspended in a digest solution

containing 1x RNase H buffer, 5U RNase H (NEB), and 0.1x TE to a volume of 10µL.

This mixture was incubated for 1 hour at 37°C. The supernatant was removed after

applying a magnet to the tube.

Sequencing Adapter Ligations

First-strand cDNA fragments underwent 3’ tailing by preparing a 10µL tailing

solution with 1x terminal transferase buffer, 250µM CoCl2, 4mM dCTP, and 40U

terminal transferase (NEB). This solution was incubated for 1 hour at 37°C. A magnet

was applied to this tube and the supernatant was removed from the beads.

In preparation for ligation of the PCR adapter, two oligos, 3µL HB013 (5’ –

pGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGGGCGCGCCTGC) where p

represents a 5’ phosphate, and 3µL HB015 (5’ –

GTCAGGCGCGCCCAAGCAGAAGACGGCATACGAGCTCTTCCGATCGGGG), were incubated for 5 minutes at 65°C

then chilled for 1 minute on ice. The beads from the previous reaction were resuspended

in this mixture with a ligation solution containing 1x T4 DNA ligase buffer, 400U T4

DNA ligase, 1mM ATP, 0.5mM spermidine, 0.1µL 10mg/mL BSA, and sterile

water to a final volume of 10µL. Ligation was carried out using a thermal cycler which

cycled incubation temperature between 14°C and 22°C. This cycling incubated the

sample at 14°C for 5 minutes, ramped to 16°C over 30 seconds and remained there for 5

minutes, ramped to 18°C over 30 seconds and remained there for 5 minutes, ramped to

20°C over 30 seconds and remained there for 5 minutes, ramped to 22°C over 30 seconds

and remained there for 5 minutes before returning to 14°C over the course of 5 minutes.

This cycle was repeated 34 times. A magnet was applied and the ligation solution

supernatant was removed.
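
The total duration of the temperature-cycled ligation follows from the program described above and can be tallied with a short sketch. It assumes that "repeated 34 times" means 34 passes through the program in total; if the program was run once and then repeated 34 more times, N_CYCLES would be 35. The variable names are illustrative.

```python
HOLD_TEMPS_C = [14, 16, 18, 20, 22]  # incubation temperatures per cycle
HOLD_MIN = 5.0       # minutes held at each temperature
RAMP_UP_MIN = 0.5    # 30-second ramp between consecutive holds
RETURN_MIN = 5.0     # ramp from 22 C back down to 14 C
N_CYCLES = 34        # assumption: 34 passes in total

cycle_min = (
    len(HOLD_TEMPS_C) * HOLD_MIN
    + (len(HOLD_TEMPS_C) - 1) * RAMP_UP_MIN
    + RETURN_MIN
)
total_min = N_CYCLES * cycle_min
hours, minutes = divmod(total_min, 60)

print(f"one cycle: {cycle_min:.0f} min")
print(f"{N_CYCLES} cycles: {hours:.0f} h {minutes:.0f} min")
```

Under this reading the ligation is effectively an overnight incubation.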

Following ligation, second-strand DNA was synthesized using the primer ligated

in the previous step. A 10µL synthesis solution containing 1x NEB buffer 2, 200µM

each dNTP, and 2.5U DNA polymerase I (Klenow fragment) (NEB) was used to

resuspend the beads. This solution was incubated at room temperature for 1 hour. A

magnet was again applied and the supernatant was removed from the beads.
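
The relationship between the two PCR adapter oligos from the ligation step can be checked directly from their sequences. The sketch below is a reading of the oligo design rather than an experimental validation: it finds the longest stretch of HB015 that is complementary to HB013 and reports whatever remains single-stranded at the 3' end of HB015. The helper function names are illustrative.

```python
HB013 = "GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGGGCGCGCCTGC"  # 5' phosphate omitted
HB015 = "GTCAGGCGCGCCCAAGCAGAAGACGGCATACGAGCTCTTCCGATCGGGG"

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def longest_common_substring(a, b):
    """Return the longest contiguous substring shared by a and b."""
    best = ""
    for i in range(len(a)):
        for j in range(len(b)):
            k = 0
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            if k > len(best):
                best = a[i:i + k]
    return best


# Region of HB015 that can base-pair with HB013.
duplex = longest_common_substring(HB015, reverse_complement(HB013))
overhang_3p = HB015[HB015.index(duplex) + len(duplex):]

print("paired region:", len(duplex), "nt")
print("3' overhang of HB015:", overhang_3p)
```

If the annealed pair carries a 3' G overhang, that overhang would be able to pair with the dC tail added by terminal transferase, placing the 5' phosphate of HB013 next to the cDNA 3' end for ligation and leaving HB015 in position to prime second-strand synthesis.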

Synthesis of the second strand completes the BamHI site at the 5’ end of the

cDNA strand, allowing us to use this for ligation of the Sequencing adapter. Beads from

the previous reaction were incubated with a 10µL reaction mixture containing 1x NEB

buffer 3 and 10U BamHI (NEB). This solution was incubated for 1 ½ hours at 25°C. A

magnet was applied to the tube and the supernatant was transferred to a new

microcentrifuge tube while the beads were discarded.

The second adapter, the Sequencing adapter, was prepared for ligation by

combining 2µL of HB009 (5’ - /5Biosg/ATCAGCGGCCGCACACTCTTTCCCTACACGACGCTCTTCCGATCTA),

where /5Biosg/ represents a 5’ biotin moiety, and 2µL of HB010

(5’ – pGATCTAGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTGCGGCCGCTGAT),

where p represents a 5’ phosphate. These oligos were allowed to anneal

while simultaneously binding to new magnetic beads by combining with 15µL

hydrophilic streptavidin magnetic beads (NEB), 12µL sterile water, 60µL of streptavidin

magnetic bead wash/binding buffer and allowing this solution to sit for at least 5 minutes

at room temperature. A magnet was then applied to remove the supernatant. The beads

were next resuspended in the supernatant from the BamHI digest described above, in

combination with 10U BglII (NEB), 2U BamHI (NEB), 1mM ATP, 0.5µL NEB buffer 3,

0.1µL 10mg/mL BSA, 600U T4 DNA ligase (NEB), and water to a final volume of

15µL. This solution was incubated for 1 ½ hours at 25°C. Both BamHI and BglII were

used in this reaction mixture to enhance production of the desired product which was not

susceptible to digestion with either enzyme. Self-ligated products are digested by either

of the two restriction enzymes.
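
The selection logic described above can be illustrated with a short sketch. It relies on the fact that BamHI (G^GATCC) and BglII (A^GATCT) leave the same 5' GATC overhang, and it takes the bases flanking the adapter overhang from the HB009 and HB010 sequences given above. The function names are illustrative, and the model considers only the six bases spanning each ligation junction.

```python
BAMHI = "GGATCC"
BGLII = "AGATCT"
OVERHANG = "GATC"

HB009 = "ATCAGCGGCCGCACACTCTTTCCCTACACGACGCTCTTCCGATCTA"      # biotin omitted
HB010 = "GATCTAGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTGCGGCCGCTGAT"  # phosphate omitted

# Bases flanking the overhang at each kind of end.
cdna_left, cdna_right = BAMHI[0], BAMHI[-1]   # BamHI-cut cDNA end: G ... C
adapter_left = HB009[-1]                      # 3'-terminal base of HB009
adapter_right = HB010[len(OVERHANG)]          # first base after the overhang


def junction(left_base, right_base):
    """Six-base sequence spanning a ligated GATC junction."""
    return left_base + OVERHANG + right_base


def cut_by(site):
    """Names of the enzymes whose recognition site matches the junction."""
    return [name for name, rec in (("BamHI", BAMHI), ("BglII", BGLII)) if rec == site]


products = {
    "cDNA + cDNA": junction(cdna_left, cdna_right),
    "adapter + adapter": junction(adapter_left, adapter_right),
    "cDNA + adapter": junction(cdna_left, adapter_right),
    "adapter + cDNA": junction(adapter_left, cdna_right),
}
for name, site in products.items():
    print(f"{name}: {site} cut by {cut_by(site) or 'neither enzyme'}")
```

The last two entries are the same junction read from opposite strands. Because both recognition sites are palindromic, checking one strand of each junction is sufficient.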

Size Fractionation

As a final step in purifying our samples, a thick, 10% acrylamide-urea denaturing

gel was prepared and samples were allowed to run for 1 hour at 250 volts. The gel was

stained with SYBR Gold (Invitrogen) according to the manufacturer’s instructions and

exposed to UV light. The respective sample band, a streak of a minimum of 140

nucleotides in size, was excised and removed to a new microcentrifuge tube. The gel was

crushed and DNA was eluted by rolling samples at 37°C overnight in 200µL of gel

elution buffer (300 mM NaOAc pH 5.5 and 1 mM EDTA). DNA was collected the

following morning by using a Spin-X column (Corning) to remove gel debris and

precipitating DNA from the solution with GlycoBlue as a coprecipitant by standard

methods. The DNA was pelleted in a microcentrifuge for 10 minutes at 14,000rpm and

the supernatant was removed. The pellet was resuspended in 10µL TE solution in

preparation for sequencing.

Preparing Fragments for Sequencing: Ingolia Method

RNA Size Selection

Following isolation of ribosome footprints, fragments of the correct size

(approximately 28 nucleotides) were selected for using gel purification. Fragments were

dephosphorylated in a 10µL reaction with 1x T4 polynucleotide kinase buffer (without

ATP), 10U SUPERase-In, and 10U T4 polynucleotide kinase (NEB). Samples were

incubated for 1 hour at 37°C, followed by incubation for 10 minutes at 75°C to heat

inactivate the enzyme.

Products from this reaction, along with a synthetic 28-base RNA oligonucleotide (5’ –

AUGUACACGGAGUCGACCCGCAACGCGA) to use as a reference for gel excision,

were mixed with 2x Novex TBE-Urea sample prep buffer (Invitrogen) and briefly

denatured. The samples were next loaded on a Novex denaturing 15% polyacrylamide

TBE-urea gel (Invitrogen) and run according to manufacturer’s instructions. SYBR Gold

(Invitrogen) was used to stain the gel. The 28 nucleotide region of the sample was

excised and crushed in a microcentrifuge tube. These gel fragments were soaked

overnight in gel elution buffer (300mM NaOAc pH 5.5, 1mM EDTA, 0.1U/µL

SUPERase-In). Gel fragments were removed using a Spin-X column (Corning) followed

by precipitation with GlycoBlue as a coprecipitant by standard methods.

cDNA Synthesis

Gel-purified RNA was resuspended in 10mM Tris pH 8.0 and quantified using the

BioAnalyzer Small RNA assay (Agilent). Approximately 10-20 pmoles of RNA was

denatured prior to preparing a 6.6µL poly-(A) tailing reaction solution with 1x poly-(A)

polymerase buffer, ATP in 40-50:1 molar ratio to RNA, 0.75U/µL SUPERase-In, and 3U

E. coli poly-(A) polymerase (NEB). This solution was incubated for 30 minutes at 37°C.

Reverse transcription was carried out using oNTI223 (HB018)

(5’ – pGATCGTCGGACTGTAGAACTCT/idSp/CAAGCAGAAGACGGCATACGATTTTTTTTTTTTTTTTTTTTVN),

where p represents 5’ phosphorylation, /idSp/ represents an abasic dSpacer

furan, N indicates incorporation of a random nucleotide, and V indicates incorporation of

A, C, or G bases. A 5µL sample of polyadenylated RNA was combined with 570nmol

Tris pH 8.0, 8.2nmol each dNTP, 50pmol oNTI223 primer, and water to a final reaction

volume of 14.25µL. This mixture was heated for 3 minutes at 75°C then placed on ice

for 1 minute. To carry out first-strand synthesis, 10U SUPERase-In, 82nmol DTT, and

164U SuperScript III (Invitrogen) were added to this solution. This reverse transcription

solution was incubated for 30 minutes at 48°C.

To remove RNA, 1.8µL 1M NaOH was added to the solution and incubated for

20 minutes at 98°C. The reaction mixture was returned to neutral pH by adding 1.8µL

1M HCl. The cDNA fragments were isolated using gel purification techniques as described

in the previous section on a 10% polyacrylamide TBE-urea gel. This gel was run with a

91 nucleotide oligo to identify the correct portion of the gel to excise. Excised bands

were crushed and DNA was recovered using gel elution buffer as described previously.

Circularization of cDNA

First-strand cDNA was circularized by resuspending DNA in a 5µL reaction

mixture containing 1x CircLigase buffer, 50µM ATP, 2.5mM MnCl2, and 0.5 µL

CircLigase (Epicentre). This mixture was incubated for 1 hour at 60°C followed by 10

minutes at 80°C to heat inactivate CircLigase.

This cDNA was then relinearized by adding 6.25µL relinearization solution

(50mM KCl and 1mM DTT) and 12.5U APE 1 (NEB) and incubating for 1 hour at 37°C.

This relinearized DNA was purified using a Novex 10% polyacrylamide TBE-urea gel

(Invitrogen). The relinearized form of the DNA moves faster through the gel than the

circularized form, providing a mechanism by which to purify for relinearized DNA. This

DNA was recovered in gel elution buffer using the methods described previously.

Preparation of Sequencing Fragments

Relinearized DNA from the previous steps was PCR-amplified using the Phusion

High-Fidelity PCR kit (NEB) according to manufacturer’s instructions. The primers

oNTI200 (HB019) (5’-CAAGCAGAAGACGGCATA) and oNTI201 (HB020)

(5’-AATGATACGGCGACCACCGACAGGTTCAGAGTTCTACAGTCCGACG) were

used to create fully functioning adapters on either end of the cDNA. PCR was carried out

with an initial 30 second denaturation step at 98°C, followed by 8-14 cycles of 10

seconds at 98°C, 10 seconds annealing at 60°C, and 5 seconds extension at 72°C.

Amplification was stopped while PCR was still in the linear range to avoid distortions in

the representation of fragments relative to one another. Several samples with different

amplification cycles were compared using a non-denaturing 8% polyacrylamide TBE gel

to determine the optimal number of cycles to amplify the desired fragments.
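
The fragment structure implied by the reverse transcription, circularization, relinearization, and PCR steps can be traced with a short sketch. It is a simplified model under stated assumptions: the abasic site is represented by a placeholder character and is simply dropped at relinearization, the 28-nucleotide marker RNA stands in for a footprint, and the function names are illustrative.

```python
ARM_5P = "GATCGTCGGACTGTAGAACTCT"   # oNTI223, 5' of the abasic site
ARM_3P = "CAAGCAGAAGACGGCATACGA"    # oNTI223, 3' of the abasic site
POLY_T = "T" * 20                   # oligo-dT stretch of oNTI223
ABASIC = "X"                        # placeholder for the /idSp/ spacer
ONTI200 = "CAAGCAGAAGACGGCATA"
ONTI201 = "AATGATACGGCGACCACCGACAGGTTCAGAGTTCTACAGTCCGACG"

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def first_strand(footprint_rna):
    """cDNA primed by oNTI223 on a polyadenylated footprint (5' to 3')."""
    insert = reverse_complement(footprint_rna.replace("U", "T"))
    return ARM_5P + ABASIC + ARM_3P + POLY_T + insert


def relinearize(cdna):
    """Join the cDNA ends into a circle, then open it at the abasic site."""
    five_prime_part, three_prime_part = cdna.split(ABASIC)
    return three_prime_part + five_prime_part


footprint = "AUGUACACGGAGUCGACCCGCAACGCGA"  # stand-in footprint (marker RNA)
linear = relinearize(first_strand(footprint))

# oNTI200 matches the 5' end of the relinearized strand, so it primes on
# the complementary strand; the 3' end of oNTI201 primes on this strand.
anneal_len = max(
    k for k in range(1, len(ONTI201) + 1)
    if reverse_complement(linear).startswith(ONTI201[-k:])
)
print("relinearized length:", len(linear), "nt")
print("starts with oNTI200:", linear.startswith(ONTI200))
print("oNTI201 3' bases annealing to the fragment:", anneal_len)
```

In this model the footprint-derived insert ends up flanked by the two primer arms, which is what allows a single primer pair to amplify every fragment in the pool.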

RESULTS

This study has extensively explored the development of the Bender & Curran

method for preparing RNA fragments for high-throughput sequencing. Therefore, this

section explores the results obtained during these experiments and their implications for

the process as a whole. Given the breadth of our studies, our results are highly applicable

to the method recently published by Ingolia, et al. (2009). We will compare these two

methods here and in our discussion.

The majority of the results described here are qualitative in nature since consistent

and accurate methods for quantitatively assaying reaction products have not been

developed at this time. However, the majority of the data discussed in this section and

shown in our representative figures were observed on multiple occasions and are, we

believe, accurately reflective of our results.

Preparation of Ribosome Footprints

In order to eliminate portions of mRNA which are not protected by ribosomes, we

incorporated a ribonuclease digestion step. It is critical that this enzyme 1) thoroughly

digest the mRNA in a sequence-independent manner and 2) limit digestion of

rRNA. This study compared digestion of mRNA and rRNA using RNase A, RNase T1,

and micrococcal nuclease enzymes. RNase A specifically cleaves 3’ of uridine and

cytosine residues while RNase T1 cleaves after guanine residues (Blackburn and Moore,

1982; Takahashi and Moore, 1982). Combining these ribonucleases in a single reaction

is a common practice for extensively digesting all host cell RNA prior to DNA treatments.

As such, these enzymes are more difficult to control without the addition of an RNase

inhibiting solution at the end of the reaction. Micrococcal nuclease exhibits preference

for adenine- and uridine-rich regions in the mRNA, but digestion with this enzyme is

easily halted when Ca2+ is removed from the solution with the addition of excess EGTA,

presenting a viable method for digestion control not afforded by the other enzymes

(Cuatrecasas et al., 1967). Our studies indicate micrococcal nuclease as the best

candidate for preventing non-specific digestion of the ribosome while completely

digesting exposed mRNA (Figure 4). We have repeatedly isolated streaks of RNA

approximately 30+ nucleotides in length as may be expected given Wolin and Walter’s

(1988) estimation of ribosome footprint size (Figure 5). The sequence specificity of

micrococcal nuclease may be responsible for the presence of fragments larger than the

expected 30 nucleotide size. Although micrococcal nuclease digestion can be stopped

with the addition of EGTA, we have seen evidence of continued digestion several weeks

after footprint collection, despite deep freezing of the samples (Figure 6). Therefore, we

recommend that, at a minimum, fragments be isolated and converted to cDNA in a single

day. As with any procedure, the reliability of this method is enhanced

when all steps are completed without intermediate freezing and storage steps.

While we chose to use micrococcal nuclease, Ingolia, et al., have utilized RNase I

which appears more adept at meeting our RNase objectives noted above. RNase I

preferentially digests single-stranded RNAs after all four bases and is irreversibly

inactivated with 0.1% SDS or phenol extraction (Spahr and Hollingworth, 1961).

However, it is important to note here that Ingolia, et al., identified 84% of their fragments

as originating from the ribosome. Although we have no data to directly compare, it

would be useful to study damage to rRNA with RNase I relative to micrococcal nuclease.

It may be possible to incorporate subtractive hybridization, as suggested in Ingolia’s

paper, to eliminate a majority of rRNA fragments and increase the number of fragments

of interest in the sequencing pool.

The digestion of mRNA with micrococcal nuclease leaves RNA fragments with

3’ phosphate and 5’ hydroxyl ends. As this will inhibit any subsequent polyadenylation

reactions, it is necessary to incorporate a dephosphorylation step to render the 3’ end

amenable to polyadenylation. Both polynucleotide kinase and calf intestinal phosphatase

were tested in conjunction with poly-A polymerase to allow both reactions to occur

simultaneously (Figure 7). Repeated tests have determined that polynucleotide kinase

can be introduced to the polyadenylation reaction mixture with minimal side effects,

provided this solution is supplemented with 1mM ATP. It is important that

polynucleotide kinase buffer is not included in this solution as we have evidence this may

be detrimental to polyadenylation (data not shown).

Polyadenylation

Polyadenylation was first incorporated into this procedure to prime cDNA

synthesis across the vast majority of RNA fragments, thereby minimizing the possibility

of lost data during this step. We have utilized three E. coli poly-A polymerases from

different manufacturers with mixed results throughout our trials. Invitrogen’s Ncode

miRNA First-Strand cDNA Synthesis Kit was used to polyadenylate fragments in

preparation for first-strand synthesis. The kit poly-A polymerase exhibited suitable

polyadenylation of our RNA molecules when the kit was fresh, although the activity of

this enzyme decreased dramatically within a month of arrival in our lab (Figure 8). A

poly-A polymerase from Ambion exhibited a longer life span but a comparable decrease

in activity was noted approximately two months after enzyme arrival as compared with

fresh poly-A polymerase from New England Biolabs (Figure 9A). We have used the

same aliquot of the NEB enzyme since its arrival in the lab and have noted no decrease in

activity through three months of work (Figure 9B).

Given that these enzymes are provided from similar biological sources

independent of their manufacturer, it is reasonable to conclude poly-A polymerase is an

inherently unstable enzyme. Our results suggest that this instability may result in a

shorter shelf life and a limited activity window under reaction conditions. As determined

through a radiolabel pulse-chase time course assay, poly-A polymerase appears to be

most active in the first five minutes under standard reaction conditions (Figure 10). As

this suggests fragility of the enzyme during the reaction, we have enhanced activity with

the addition of 0.1mg/mL BSA, as indicated by the appearance of a streak higher on the

gel (Figure 11). The addition of this reagent alone enhances the reaction more effectively

than increasing enzyme concentrations, altering the buffer solution, or decreasing the

reaction temperature to a more permissive range. Given that they are produced from a

similar source as the NEB poly-A polymerase, it is reasonable to conclude that Invitrogen

and Ambion poly-A polymerase activity may also be enhanced by the addition of

0.1mg/mL BSA.

The unreliability of this reaction has prompted a closer investigation into other

factors which alter the consistency of this reaction. Increasing ATP concentration or

incubation time does not enhance polyadenylation activity. However, we have noted

that samples which are not heat inactivated after polyadenylation exhibit a higher degree

of tailing (Figure 12). It is possible the heat inactivation step is destructive to the newly

synthesized poly-A tail or the poly-A polymerase remains active well after the normal

incubation period. While we do not have a definitive explanation for the trend noted here,

the elimination of the heat inactivation step does not appear to have any detrimental

effects on the downstream cDNA synthesis step.

First-Strand Synthesis

The efficiency of the first-strand synthesis reaction has varied with our enzyme

supplier in a manner similar to poly-A polymerase. As with poly-A polymerase, we have

maximized the efficiency of the standard reagents in this reaction.

During early trials with Invitrogen’s first-strand synthesis kit, the concentration of

dNTPs was compared in 10-fold serial dilutions to determine the optimal concentration

for use in this reaction (Figure 13). High concentrations (10mM) of dNTPs do not allow

significant first-strand synthesis to occur while a 10-fold dilution showed successful

synthesis. A lower concentration of dNTPs (100µM) also yielded satisfactory first-strand

DNA synthesis. Given these results, our reactions have included 100µM dNTPs for all

subsequent assays and methods although it is possible that 1mM dNTP concentrations

may give equally or slightly more complete synthesis results.

Through the course of our studies, we have compared several oligo dT molecules

for priming first-strand DNA synthesis (Figure 14). A standard 47-mer oligo dT,

provided in Invitrogen’s first strand synthesis kit, is suitable for reliably hybridizing to

the RNA poly-A tail and priming synthesis. We have subsequently added an adapter

sequence to the 5’ portion of the oligo dT, incorporated either an EcoRI or a BamHI

digest site eight dT residues from the 3’ end of the oligo dT, and added

a 5’ biotin moiety to facilitate manipulations using streptavidin magnetic beads. First-

strand synthesis has occurred reliably using each of these primers, suggesting there is a

high degree of flexibility in oligo dT design. Although each of our custom oligos

contained a full 47 dT residues, Ingolia, et al., have demonstrated that primer

hybridization may occur effectively with just 20 dT residues before the adapter

sequence(s). We have noted some difficulties with first-strand synthesis in the Ingolia

method (Figure 15), although sub-optimal polyadenylation likely plays a role in this

problem. We have tested our methods for polyadenylation and first-strand synthesis with

the Ingolia oligo dT and have noted complete first-strand synthesis (Figure 16).

Therefore, the oligo dT itself does not appear to be responsible for the lack of first-strand

synthesis using the Ingolia method.

The efficiency of this and all downstream reactions can be greatly enhanced

through the use of streptavidin magnetic beads bound to the 5’ end of the cDNA via a

streptavidin-biotin interaction. The number of manipulations required for ligating both

adapter sequences to the cDNA requires careful planning to avoid enzyme and buffer

interactions. This problem may be controlled or eliminated through the use of

streptavidin magnetic beads. Binding a streptavidin magnetic bead to a biotin molecule

on the 5’ end of the oligo dT prior to first-strand synthesis is optimal for minimizing loss

of material at this step. Since biotinylation is not a completely efficient process, there

will be a portion of oligo dTs which will not have a biotin molecule but are still able to

prime first-strand synthesis. We can eliminate these non-biotinylated sequences by

binding the beads to the oligo dT in a solution independent of the RNA. We have found

it practical to set up the bead binding solution just after setting up the polyadenylation

reaction. The supernatant can be removed from this solution just prior to the introduction

of RNA to eliminate all non-biotinylated oligos.

Our method initially used the standard streptavidin beads available from NEB but

these beads gave inconsistent results, often resulting in lost first-strand samples during

the course of our experiments (Figure 17A). NEB has recently developed hydrophilic

streptavidin beads which, in our trials, have exhibited a greater affinity for biotin and

increased consistency in DNA retention. We have compared these two beads directly in

a manner designed to replicate experimental procedures, but this comparison did not

reproduce the difference (Figure 17B). It is possible another element of the procedure is interfering with

the streptavidin-biotin interaction on the standard beads, leading to the loss of material.

According to NEB, this difference has been noted by other groups working on similar

applications although there is no definitive explanation for this discrepancy at this time (J.

McFarland, Personal Communication, 27 March 2009). We have noted no loss of

material during our trials with the hydrophilic beads so it is our recommendation that

these beads be used in all future trials to ensure consistency during all manipulations.

As a final step in cDNA synthesis, our downstream manipulations require the

removal of RNA from the RNA:DNA hybrids. Our initial methods used Tris-Cl pH 10 to

raise the pH of the solution and remove the RNA. This pH was chosen to be just enough

to remove the RNA while simultaneously preventing undue damage to our DNA. Our

reaction solution was returned to the initial pH by replacing the basic supernatant with

reaction buffer at neutral pH. Ingolia, et al., incubate their DNA with a strong base,

followed by a strong acid to return the solution to neutral pH. Due to the increased risk

of DNA damage, we adopted RNase H as an alternative to acid-base treatments to avoid

damage to the streptavidin beads or the DNA. We have compared these methods using

labeled cDNA on streptavidin magnetic beads subjected to each treatment (Figure 18). A

substantial loss of material using the strong acid-base method employed by Ingolia leads

us to believe this treatment permanently impairs streptavidin-biotin interactions, leading

to the loss of material in subsequent steps. It is also reasonable to conclude that this

treatment is equally detrimental to the DNA itself, which may account for some of the

problems we experienced in downstream reactions of the Ingolia method. While there is

substantially less risk of losing material with a treatment of Tris-Cl pH 10, we believe the

safest method is one which uses RNase H to remove RNA.

Tailing

This study has explored a number of methods for ligating the PCR and

Sequencing adapters to our fragments of interest, as described in the next section.

Throughout this study, synthesis of 3’ homopolymer tails using terminal transferase has

presented a viable option for preparing our cDNA fragments for ligation reactions or

second-strand DNA synthesis reactions. As such, tailing has become a critical step in our

conversion of RNA fragments to sequencing-ready DNA. Therefore, we have spent a

great deal of time developing this reaction to operate at maximum efficiency.

Several protocols in this study have synthesized short, poly-dT and poly-dC tails

on the 3’ end of the cDNA oligo. We initially developed a method for tailing these oligos

with dT residues, but ligation to this overhang was not as efficient as expected. The

incorporation of a longer tailing reaction to synthesize a hybridization site for a second-

strand synthesis primer proved to be efficient, although the method as a whole was not

viable. The synthesis of a short, poly-dC tail presented perhaps the best option for

creating a ligation binding site on the cDNA but this reaction has proven to be more

difficult. A comparison of poly-dT and poly-dC tailing showed dramatically higher

tailing efficiency using dTTP (Figure 19). As might be expected, the concentration of the

incorporating nucleotide also has a direct impact on the length of the tail synthesized by

terminal transferase. Using the information from this experiment, we altered and

compared several reaction factors, including enzyme concentration, dCTP concentration,

and incubation time to enhance our dC tailing reaction (Figure 20).

In an effort to minimize the loss of material at this step, this protocol was further

optimized to convert the vast majority of cDNA fragments to a tailed form containing at

least ten dC residues. A comparison of methods including doubling enzyme

concentration, dCTP concentration, and/or incubation time over the previous values

revealed a yield of greater than 90% tailing efficiency when each of these factors was

doubled (Figures 21 & 22). We believe this to be a very efficient reaction in its current

form as tailing yield in all subsequent trials has been noted above 90%. Although, in

many cases, this method will yield a longer-than-necessary tail, this appears to be the

most efficient method for tailing the maximum number of individual fragments to prevent

material loss. Additionally, the distance between the sequence of interest and the PCR

adapter sequence (the length of the tail) is insignificant as sequencing will be initiated at

the opposite end of the fragment.

Ligation

We have tested a variety of methods for ligating the PCR adapter to the 3’ end of

our cDNA. Ligation is widely considered to be an inefficient reaction and our demands

for ligating a single-stranded molecule to either single- or double-stranded oligos proved

to be especially problematic. In total, we assayed seven methods for ligation and have

met with moderate success in the latest version of this protocol (Figure 23). We have

tested ligations of single-stranded PCR adapter sequences to cDNA or directly to RNA

using either RNA or DNA ligase (Methods 1 and 3). Other methods have incorporated

anchor portions of the adapter complement to stabilize interactions between double-

stranded oligos and the single-stranded cDNA or RNA (Methods 2, 4, 5, 6, and 7). Trials

with the crowding agent polyethylene glycol (PEG) revealed no noticeable induction of

ligation in any of the methods tested. Lastly, we have tested a poly-A primer for

synthesizing the second DNA strand in preparation for a blunt end, double-strand to

double-strand DNA ligation. However, this poly-dA also bound to the initial oligo dT

and blocked synthesis near the 5’ end of the cDNA. The use of polymerases with high

displacement activity, such as phi29 (NEB), did not remedy this situation. Without

complete synthesis, we were unable to determine the extent of blunting prior to

attempting a ligation reaction.

While PEG was unable to enhance our reactions as anticipated, our protocol has

incorporated other factors aimed at enhancing the efficiency of this reaction. These

enhancements have been primarily used in assays of Method 7 as this yielded the most

promising results. The use of multiple guanine residue overhangs presents some

structural problems as these can form G-quadruplex structures. Additionally, the strength

of G-C interactions is greater than A-T binding and thereby decreases flexibility should

the ends bind in a manner where the ligating ends are not flush. To enhance correct

hybridization of our fragments, a thermal cycler was used to oscillate reaction

temperatures around the optimal enzyme temperature up to and past the theoretical

melting point of the G-C interactions of interest. This cycling allows for ligation oligos

to hop on and off one another until the two oligos are bridged in a position so as to allow

ligation to occur. Identifying an optimal cycling routine has been difficult but we believe

the key is to chill the solution to 12°C or 13°C at the lower end of the cycle to allow for

full hybridization of the bridging portion of the adapter oligo complement. High room

temperatures in our laboratory have prevented our cycler from consistently reaching these

temperatures which may contribute to the variations within our results.

Initial results indicating successful ligation were followed by experiments in

which limited, if any, ligation occurred. Given our experience with poly-A polymerase,

several factors were added to enhance the ligation reaction. These included ATP to

enhance enzyme activity, BSA to enhance the stability of DNA ligase over the long

reaction times required for our method, and spermidine to enhance and stabilize

hybridization of the overhang portion of our oligo just prior to ligation. Although our

current information suggests these reagents do not inhibit the ligation reaction, we have

no direct evidence that these reagents enhance our reaction. In another effort to increase

ligation efficiency, we increased the concentration of our adapter and complement oligos

(Figure 24). These results hint that some ligation may occur with higher oligo

concentrations, as indicated by some streaking between 150-200 nts in length, but more

extreme variations in concentration should be tested to determine the validity of this

result. Based on these data, we have incorporated increased oligo concentrations, ATP,

BSA, and spermidine as additional factors in our ligation reaction in the interest of

optimizing conditions for ligation to occur.

Despite the difficulties associated with troubleshooting the ligation reaction, our

experiments have yielded some promising, albeit inconsistent, results. A comparison of

adapter complement oligos with dG residue overhangs of varying length and using a

thermal cycling protocol exhibited some ligation (Figure 25). The entire first-strand band

did not shift upwards, indicating that the reaction was not optimized for converting a high

number of our fragments to a ligated form. It is possible that this shift represents that of

the tailing reaction (not assayed here), although we believe the appearance of distinct

bands approximately 50 nucleotides greater in length is indicative of

successful ligation. Furthermore, there is some evidence for second-strand DNA

synthesis, but these signals are faint, again suggesting only a small portion of our

fragments were converted to the ligated form. As we have not witnessed these results

consistently, it is important to note that this gel suggests successful ligation but it is not a

definitive demonstration of such.

In other experiments we have noted more evidence of successful ligation, as

indicated by an upward shift of tailed sample following ligation (Figure 26). We have

compared the signal output in the two lanes to determine if the shift is indicative of an

increase in DNA length (Figure 27). Using a logarithmic scale to correlate the distance

traveled on the gel with the increase in fragment length following ligation, we have noted

that the difference in distance traveled between the two samples accounts for an increase

in length of approximately 30 nucleotides. Accounting for uneven movement of

fragments through the gel using the initial peaks at the right of this figure gives an error

of approximately four bases. Applying this correction, we estimate that the shift noted

in Figure 26 corresponds to an increase in length of approximately 26 nucleotides. However, the oligo

we are ligating is 45 nucleotides in length, suggesting ligation may be occurring but it is

uneven at best. This measurement is ambiguous due to the difficulty in measuring the

exact position of material on the gel so other methods should be used to quantify ligation

efficiency in the future. Furthermore, these results have not been noted consistently

suggesting other factors may be affecting the efficiency of this reaction. In an effort to

better assay ligation, we have used second-strand DNA synthesis and PCR steps (using

primers for the ligated adapter and the oligo dT) with inconsistent results. We

recommend using these methods, in addition to gel assays, to determine the efficiency of

ligation in future experiments.
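
The logarithmic relationship used above to convert migration distance into fragment length can be sketched in a few lines of Python. This is only an illustration of the general semi-log calibration approach, not the analysis used to produce Figure 27: the ladder sizes, migration distances, and band positions below are hypothetical placeholders, and the function names are illustrative.

```python
import math


def fit_semilog(ladder):
    """Least-squares fit of log10(length) = intercept + slope * distance.

    `ladder` is a list of (migration_distance, length_in_nt) pairs taken
    from size standards run on the same gel.
    """
    xs = [distance for distance, _ in ladder]
    ys = [math.log10(length) for _, length in ladder]
    n = len(ladder)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def estimate_length(distance, intercept, slope):
    """Convert a migration distance into an estimated length in nucleotides."""
    return 10 ** (intercept + slope * distance)


# Hypothetical size standards: (distance migrated in mm, length in nt).
ladder = [(10.0, 200), (30.0, 100), (50.0, 50)]
intercept, slope = fit_semilog(ladder)

# Hypothetical peak positions for the tailed and ligated lanes.
tailed_length = estimate_length(35.0, intercept, slope)
ligated_length = estimate_length(31.0, intercept, slope)
print(f"estimated length increase: {ligated_length - tailed_length:.1f} nt")
```

Because the calibration is logarithmic, a small error in measuring band position translates into a larger error in estimated length for longer fragments, which is one reason a gel-independent assay of ligation remains desirable.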

Results from the Ingolia Method

Following its publication in April 2009, we have tested the Ingolia method

alongside our own in an effort to compare the efficiencies of each. While we have

developed a complex method aimed at minimizing steps which may lose or skew the data

set, Ingolia incorporates a simpler approach which utilizes gel purification and PCR

techniques to obtain a pure and adequately sized fragment pool.

Initial results using the complete Ingolia method showed diminished

polyadenylation efficiency and little to no first-strand synthesis (Figure 15). However,

using our protocol for polyadenylation and first-strand synthesis with Ingolia’s oligo dT,

optimal products from each reaction were obtained (Figure 16). Removing gel

purification and precipitation steps yielded the presence of labeled first-strand throughout

our procedure (Figure 28). Despite the presence of intact cDNA in all samples, there is

no evidence for circularization, as would be indicated by an upward shift in the first-

strand synthesis band. However, we have noted this shift when using control oligos

under similar reaction conditions (Figure 29). The Ingolia method uses Epicentre’s

CircLigase, a specialized single-stranded DNA ligase with enhanced intramolecular

ligation activity (Epicentre, 2009). This enzyme is known to have some sequence

specificity. If the oligo of interest contains a 3’ dC residue, Epicentre has noted that

ligation will not occur, although we have not encountered these same problems using a

control oligo (“HB007” in Figure 29). Additionally, Epicentre informed us that lower

concentrations of ATP may be favorable as this reagent can inhibit the CircLigase

enzyme when present in high concentrations. Lastly, we suspect the strong acid-base

treatment used by Ingolia, et al., may cause unnecessary damage to the DNA template

that inhibits circularization. We have therefore substituted this treatment with an hour-

long incubation with RNase H in a manner similar to our protocol and noted faint

evidence of successful circularization, as assayed using PCR (Figure 30). However, this

result is unreliable so it remains possible that another factor is inhibiting circularization.

This will need to be investigated in future studies.

DISCUSSION

The Ingolia and Bender & Curran methods have been designed to enhance

Illumina’s protocol for converting small RNAs to a sequencing-ready DNA form. These

protocols and the evidence for and against their viability have been discussed in the previous

sections. Given the number of manipulation steps required by each of these three

methods, it is critical to minimize the loss of material throughout the fragment conversion

process. Lost material has the potential to non-selectively skew the data set, a problem

which could have a large impact on applications where fragment representation is critical

to the objective of the study, such as ours. We will spend much of this discussion

addressing further opportunities for enhancing both methods in addition to touching on

some future applications using these methods.

Problems with the Ingolia Method

In the process of testing the Ingolia method, we have noted several problems

which decreased the yield of sequencing-ready fragments. Firstly, the Ingolia method

uses three gel purifications to 1) purify ribosome footprint RNA, 2) purify for

relinearized RNA, and 3) purify a final, PCR-amplified form with full-length adapter

sequences at either end. Using the crush and soak method found in this protocol, gel

purification can have a relatively low yield, in the range of <30-90% of the original DNA,

depending on the fragment length (Sambrook and Russell, 2001). Assuming an average

DNA recovery rate of 50% from each purification, the three purification steps used in this

method account for a yield of 12.5% of the original material. In combination with the

inefficiencies inherent in enzymatic reactions, this could further cut the final yield to well

below 10% of the original material. We have already noted evidence of reaction

inefficiency when using the Ingolia method for polyadenylation and first strand synthesis.

While there are limits to how much enzymatic reactions can be augmented, other steps

during which material is regularly lost should be minimized or eliminated entirely.
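
The yield arithmetic above can be reproduced with a short Python sketch that multiplies the fractional recovery of each step. The 50% recovery per gel purification is the assumption stated in the text. The enzymatic efficiencies of 90% are hypothetical values chosen only to illustrate how the losses compound, and the function name is illustrative.

```python
def cumulative_yield(step_yields):
    """Multiply per-step fractional recoveries to get the overall recovery."""
    total = 1.0
    for step_yield in step_yields:
        if not 0.0 <= step_yield <= 1.0:
            raise ValueError("each yield must be a fraction between 0 and 1")
        total *= step_yield
    return total


# Three gel purifications, each assumed to recover 50% of the material.
gel_steps = [0.5, 0.5, 0.5]
gel_only = cumulative_yield(gel_steps)

# Hypothetical enzymatic steps, each assumed to be 90% efficient.
enzyme_steps = [0.9, 0.9, 0.9]
overall = cumulative_yield(gel_steps + enzyme_steps)

print(f"gel purifications alone: {gel_only:.1%}")
print(f"with enzymatic losses:   {overall:.1%}")
```

Under these assumptions the gel purifications alone leave 12.5% of the starting material, and even modest enzymatic inefficiencies bring the total below 10%.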

Secondly, the circularization reaction has proved to be especially problematic in

this method for reasons we have not yet determined. As noted in the results section, we

have successfully tested circularization of several oligos using CircLigase but have not

noted the same results with our cDNA. Originally, the size of the oligo was considered to

be part of the issue, but it should be noted that the kit is designed to work with any oligos

greater than 15 nucleotides in length (Epicentre, 2009). Therefore, we do not believe the

size of the oligo itself is a limiting factor in the reaction. Communication with Epicentre

regarding this reaction has given several insights into enhancing the reaction. The

enzyme itself can be adenylated if ATP concentration is too high, blocking the enzyme

active site and decreasing its ligation activity. Upon the recommendation of the

Epicentre technical staff, we tested a 10-fold lower concentration of ATP in the reaction

but no circularization was found to occur then either. CircLigase is also known to have

some sequence specificity. In particular, Epicentre has determined that 3’ ends with a

cytosine residue do not ligate and thymidine and cytosine residues at the 5’ end exhibit

decreased ligation efficiency. Interestingly, we have not found 3’ cytosine to prevent

circularization during control assays (Figure 29). As an alternative to CircLigase, T4

RNA ligase was also tested but circularization of the DNA did not occur here either.
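
The end-sequence preferences reported to us by Epicentre can be written as a simple screening check for candidate oligos. The Python sketch below encodes only the two rules stated above (a 3’ cytosine is reported not to ligate; a 5’ thymidine or cytosine is reported to ligate less efficiently). The function name and example sequences are hypothetical, and, as our HB007 control shows, a flagged oligo may still circularize in practice.

```python
def circligase_end_flags(oligo):
    """Return warnings for terminal residues reported to impair CircLigase.

    `oligo` is a DNA sequence written 5' to 3'.
    """
    seq = oligo.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("oligo must be a non-empty string of A, C, G, T")
    flags = []
    if seq[-1] == "C":
        flags.append("3' C: reported not to ligate")
    if seq[0] in "TC":
        flags.append("5' T or C: reported to reduce ligation efficiency")
    return flags


# Hypothetical example sequences, written 5' to 3'.
for candidate in ("GATTACAG", "CATTACAC"):
    print(candidate, circligase_end_flags(candidate) or "no flags")
```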

Circularization itself may provide another point at which fragments are lost using

the Ingolia method. In our control trials, circularization occurred in an incomplete

manner as indicated by the presence of both circular and linear fragments in Figure 29

following incubation with CircLigase. The Ingolia method requires a purification step

here which will eliminate any fragments which have not been circularized. While it is

important to purify for fragments containing adapter sequences on each side of the

fragment, this step likely results in substantial and non-specific loss of material.

Enhancing the efficiency of the CircLigase reaction may help to minimize this loss.

Thirdly, the difficulty in circularizing the DNA led to questions about the stability

of the oligo dT outlined in the Ingolia method. It is possible that instability of the abasic

furan leads to early disintegration of the oligo prior to circularization. Although we have

yet to find a molecular method for stabilizing or shielding this structure from prior

manipulations, Ingolia et al., have developed an alternative method which eliminates the

abasic furan from the oligo dT entirely (N.T. Ingolia, Personal Communication, 8 June

2009). As this is privileged information, the details of this alternative sequence will not

be included here, but it can be said that two internal spacers have been added in place of

the abasic furan to increase the flexibility of the DNA during circularization. These

spacers block rolling circle DNA replication, thereby eliminating the need to relinearize

the DNA prior to the PCR step. We have not independently tested this oligo and,

therefore, have no evidence for its enhancement of the circularization reaction at this time.

Lastly, the PCR steps used in Ingolia’s procedure 1) allow for synthesis of full-

length adapter sequences at either end of the fragment and 2) amplify the amount of DNA

fragments to that required for sequencing. In order to avoid biases in their data set,

Ingolia, et al., have terminated PCR in the “linear” phase of amplification. At this stage,

reagents are becoming consumed and the reaction efficiency begins to decrease, leading

to uneven fragment amplification. While PCR steps may ultimately be required in our

procedure as well, we have avoided the process up to this point since this provides ample

potential for altering the distribution of fragments which could be misleading during data

analysis.
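
The concern about uneven amplification can be illustrated with a simple model in which each fragment is copied with its own per-cycle efficiency. This Python sketch is an assumption-laden illustration rather than a description of the Ingolia PCR conditions: the efficiencies and cycle number are hypothetical, and the model ignores the decline in efficiency as reagents are consumed.

```python
def amplified_ratio(eff_a, eff_b, cycles, start_ratio=1.0):
    """Ratio of fragment A to fragment B after PCR.

    Each cycle multiplies the copy number of a fragment by (1 + efficiency),
    where efficiency is a fraction between 0 and 1.
    """
    return start_ratio * ((1.0 + eff_a) / (1.0 + eff_b)) ** cycles


# Hypothetical fragments that start at equal abundance but amplify
# with per-cycle efficiencies of 95% and 85%.
ratio = amplified_ratio(0.95, 0.85, cycles=20)
print(f"A:B after 20 cycles: {ratio:.2f}")
```

In this example two fragments that begin at equal abundance differ nearly threefold after 20 cycles, which shows how a small per-cycle difference could distort the apparent distribution of pause sites.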

Problems with the Bender & Curran Method

We have explored many of the weaknesses of our method in the results section

already, but there are some elements worth exploring more in depth here. We have

already worked to maximize the efficiency of enzymatic reactions by using magnetic

beads to exchange entire reaction solutions with minimal loss of DNA. Our reactions

have demonstrated higher efficiency as a result, as demonstrated by nearly complete

shifts of bands during the polyadenylation and tailing reactions.

Despite these advances in our procedure, the ligation of our PCR adapter has been

the foremost obstacle to successful execution of our method. The inherent inefficiencies

of ligation reactions and the challenges of attaching a single-stranded oligo to either a

single-stranded or double-stranded PCR adapter suggest this step represents a significant

bottleneck in preserving the fidelity of our data set. Our colleagues have suggested an

alternative method of employing random hexamers to prime second-strand DNA

synthesis prior to ligation of a PCR adapter (a blunt-end double-stranded DNA reaction).

However, we have rejected this method as small RNA fragments are not amenable to this

method and a required size selection step after synthesis would further deplete our

fragment pool.

We cannot definitively eliminate all of the ligation methods outlined in Figure 23

as entirely ineffective, although evidence supporting the viability of these methods has

not been forthcoming. As we have seen some evidence for ligation using our latest

method, we believe this to be the best protocol currently available for attaching our PCR

adapter in preparation for sequencing. However, the efficiency and reliability of this

method must be enhanced for our method to be a viable alternative to previously

established protocols.

In order to further enhance our method, we suggest several changes to be

considered. Firstly, although we believe 100µM dNTP concentration to be sufficient to

yield satisfactory cDNA synthesis, it is possible more complete synthesis may occur with

dNTP concentrations up to 1mM. It is important to note that we have not seen noticeable

decreases in efficiency in this or any downstream steps, but the results in Figure 13

demand that dNTP concentration be considered again when further optimizing this

method. Secondly, using even higher oligo concentrations should be considered as a

method to increase ligation efficiency. Upon Integrated DNA Technologies’ (IDT)

recommendations, we have prepared most oligo stock solutions to 100µM concentration,

although this has limited our ability to drastically vary the final oligo concentration in our

ligation solution. In the future, we would suggest preparing stock solutions to 1mM

concentration to give a greater range over which to test the effect of oligo concentrations

on ligation. There are a number of other changes we have made during development of

this method, which have been discussed elsewhere in this paper. These should be

considered carefully when further modifying this procedure.
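
The limit that stock concentration places on the ligation reaction follows from a simple dilution calculation, sketched below in Python. The 100µM and 1mM stock concentrations are those discussed above. The reaction volume and the volume available for the oligo are hypothetical values chosen for illustration, and the function name is illustrative.

```python
def final_concentration_uM(stock_uM, stock_volume_uL, total_volume_uL):
    """Final concentration after diluting a stock into a reaction (C1*V1 = C2*V2)."""
    if stock_volume_uL > total_volume_uL:
        raise ValueError("stock volume cannot exceed the total reaction volume")
    return stock_uM * stock_volume_uL / total_volume_uL


# Hypothetical 20 uL ligation with at most 2 uL available for the oligo.
reaction_volume_uL = 20.0
oligo_volume_uL = 2.0

max_from_100uM = final_concentration_uM(100.0, oligo_volume_uL, reaction_volume_uL)
max_from_1mM = final_concentration_uM(1000.0, oligo_volume_uL, reaction_volume_uL)

print(f"100 uM stock: up to {max_from_100uM:.0f} uM final")
print(f"1 mM stock:   up to {max_from_1mM:.0f} uM final")
```

With these volumes a 1mM stock raises the highest testable oligo concentration tenfold without increasing the volume added to the reaction.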

Strategies for Enhancing the Ingolia Method

During troubleshooting of our own method, we have discovered several methods

which we feel might enhance the Ingolia method. Importantly, we have previously noted

increased polyadenylation and first-strand DNA synthesis efficiencies using our methods,

but several other steps may be taken to enhance the acquired data set further.

The primary issue at hand is the lack of successful circularization using the

CircLigase kit. Given the sequence specificity of this enzyme, particularly at the 3’ end,

we suggest tailing these oligos briefly with terminal transferase and dTTP since this

nucleotide is readily incorporated and a 3’ thymidine end is most favorable for the

circularization reaction.

In order to confront the issue of material loss during gel purification steps, we

suggest incorporating a biotin moiety on the oligo dT in a manner similar to that used in

our protocol. We have found this decreases the likelihood of losing DNA during

intermediate steps while simultaneously allowing us to exchange the DNA into an optimal

reaction solution between steps. Additional steps must be taken to incorporate this

procedure into the Ingolia method since, unlike our method, the biotin moiety cannot be

incorporated on the end portion of the oligo. We have discussed with IDT the possibility

of incorporating an internal biotin tag next to the spacer portions of the revised, flexible

oligo dT discussed previously. This is a custom oligo design project which time

constraints did not allow us to explore fully. It is recommended that future efforts with

the Ingolia method test this oligo dT with an internal biotin.

The incorporation of this oligo dT should eliminate the need to carry out PCR

amplification strictly for the purpose of amplifying fragments. On average, our protocol

begins with approximately five-fold more RNA than the Ingolia method and it is believed

to retain a higher amount of this starting material throughout the procedure, yielding an

amount which should be sufficient for sequencing. Since PCR is also used to synthesize

the remaining lengths of the adapter sequences in the Ingolia method, we would

recommend incorporating complete adapter sequences when ordering the oligo dT. It is

possible the incorporation of complete sequences may have an effect on first-strand

synthesis and circularization reaction efficiencies so this should be considered when

assaying these steps. The elimination of the relinearization step in the revised protocol

using this oligo dT will require another step to give sequencing-ready DNA fragments.

The single-stranded, linear DNA fragments can be synthesized by hybridizing an oligo

primer complementary to the 3’ adapter and adding DNA polymerase (Klenow fragment)

with nucleotides to synthesize full-length, single-stranded fragments.

Practical Applications of these Methods

During the development of this method, several applications have become

obvious for testing the Bender & Curran or modified Ingolia methods for small RNA

high-throughput sequencing preparation.

Initially, this project sought to use human adenovirus as a tool for testing the

viability of our method to explore translational control regions. Human adenovirus was a

viable candidate for this study given its ability to create a small pool of well-represented

gene transcripts within the host cell in a short period of time. The genome of adenovirus

consists of approximately 36,000 base pairs divided into approximately 40 early and late

transcription genes produced by alternative splicing. Adenovirus has a high rate of growth

in HeLa cells, making this system ideal for studying viral gene transcription in a short

period (5-8 hours) following infection (Shenk, 2001). Cells are generally infected with

more than ten plaque-forming units per cell, ensuring simultaneous infection of a large

population of cells. A high concentration of viral mRNA relative to host mRNA

results during infection (Kozak, 1992), which we propose will lead to easier isolation of

adequate amounts of viral mRNA as the data subset for a study of translational control

mechanisms.
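The reasoning behind infecting at more than ten plaque-forming units per cell can be illustrated with a Poisson model of infection. This model is a common textbook assumption and is not something we measured; it treats the number of infectious particles reaching each cell as Poisson-distributed with a mean equal to the multiplicity of infection (MOI).

```python
import math


def fraction_infected(moi):
    """Expected fraction of cells receiving at least one infectious particle.

    Assumes particles are distributed over cells according to a Poisson
    distribution with mean `moi` (an idealization, not a measured result).
    """
    if moi < 0:
        raise ValueError("MOI must be non-negative")
    # P(at least one particle) = 1 - P(zero particles) = 1 - exp(-moi)
    return 1.0 - math.exp(-moi)


for moi in (1, 3, 10):
    print(f"MOI {moi}: {fraction_infected(moi):.5f} of cells infected")
```

Under this assumption, only a small fraction of a percent of cells would remain uninfected at an MOI of ten, which is consistent with the near-simultaneous infection described above.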

The presence of these control mechanisms in adenovirus gene transcripts has not

been well characterized at this point. A variety of viral adaptations have arisen to counter

host cell defenses, including ribosome shunting at late stages of infection to bypass host

cell defense mechanisms aimed at preventing translation of viral mRNA (Weitzman and

Ornelles, 2005). Additionally, non-coding virus-associated (VA) RNAs form secondary

structures which function in shutting down host defenses by inactivating proteins

designed to interfere with dsRNA while simultaneously preventing shut-down of

translation (Kozak, 1992; Mathews and Shenk, 1991). This suggests that some mRNA

secondary structures are present in viral gene transcripts and these may play a significant

role in controlling mRNA translation post-infection. This presents a viable model for an

initial study of translational control mechanisms using the Bender & Curran method

outlined here.

Our method has often employed kits designed for microRNA (miRNA) isolation

and transcription during development. Although none of these kits are incorporated in

the method as it stands now, the procedure has been tailored specifically for small RNAs

of 20-30 nucleotides in length. Therefore, we believe our method is ideally suited for

applications requiring the isolation and sequencing of miRNAs. MicroRNAs, small RNA

fragments approximately 22 nucleotides in length, operate as translational regulators

primarily by triggering the destabilization of mRNAs in a cell. These fragments typically

hybridize to a complementary sequence within the 3’ untranslated region (UTR) of the

mRNA and trigger the formation of an RNA-induced silencing complex (RISC) which

cleaves the mRNA (Sullivan and Ganem, 2005). Some evidence indicates miRNAs may

also bind to the 5’ UTR and upregulate translation, although the exact mechanisms of this

are still unknown. The complementarity of the miRNA:mRNA complex determines the

degree to which the RISC is formed and cleavage is triggered. Some studies show a

relatively mild binding specificity, with miRNAs typically binding 7 or 8-mer matching

regions (Baek et al., 2008). While miRNA-based translational regulation has

only a mild effect on the synthesis of individual proteins, it is believed miRNAs provide a

mechanism by which to globally regulate translation (Selbach et al., 2008). MicroRNAs

are not limited to cellular systems as some DNA viruses have been found to code for

miRNAs as well (Sullivan and Ganem, 2005). Interestingly, adenovirus is also believed

to code for miRNAs, presenting a potential alternative to our previous application using

the same system.
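The 7 or 8-mer matching described above can be sketched computationally as a search for sites in a 3’ UTR that are complementary to the 5’ portion of a miRNA. The example below assumes the commonly used convention that the match begins at nucleotide 2 of the miRNA; this convention and both sequences are illustrative assumptions rather than data from this study.

```python
# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence."""
    return rna.translate(COMPLEMENT)[::-1]


def seed_sites(mirna, utr, seed_len=7):
    """Return 0-based UTR positions complementary to the miRNA seed.

    The seed is taken as miRNA nucleotides 2 through (1 + seed_len),
    counted from the 5' end (an assumed convention).
    """
    seed = mirna[1:1 + seed_len]
    site = reverse_complement(seed)
    positions = []
    start = utr.find(site)
    while start != -1:
        positions.append(start)
        start = utr.find(site, start + 1)  # allow overlapping sites
    return positions


# Hypothetical sequences, made up for illustration only.
toy_mirna = "UACGGAUCCAGUACGUUAGCAA"
toy_utr = "AAAGAUCCGUAAACCC"
print(seed_sites(toy_mirna, toy_utr))
```

A search of this kind finds only perfect seed complementarity; it does not model the degree of pairing outside the seed, which the text above notes also influences cleavage.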

In contrast to the applications described in the previous paragraphs, a broader

impact application may be to observe the effects of mutations in select mRNAs which are

known to be involved in a pathogenic state. Mutations within the mRNA sequence can

alter protein synthesis by altering the binding or movement of ribosomes along a

particular mRNA and may cause the misfolding of the protein product from this

transcript (Scheper et al., 2007). Additionally, mutations may affect the synthesis of

translation machinery, such as tRNAs or translation initiation factors. As our current

study is primarily concerned with translational regulation and its effect on protein

synthesis, diseases pertinent to this area of translation are discussed here. While a

number of disease mutations have been identified in the coding region of mRNA, others

have also been identified in the 5’ UTR. Cataract syndrome causes increased synthesis of

ferritin, an iron storage protein critical to regulating free iron in the cell. The mechanism

for this appears to be an altered stem-loop structure in the 5’ UTR which upregulates

ferritin production (Scheper et al., 2007). In line with Kozak’s studies on translation

initiation, the loss or gain of an upstream regulatory AUG codon causes the disease states

seen in thrombocythaemia (excessive production of thrombopoietin) or melanoma

(caused by decreased translational efficiency of the tumor suppressor p16), respectively

(Liu et al., 1999; Scheper et al., 2007). A single nucleotide mutation in the IRES of the

c-myc gene enhances the ability of proteins to bind the structure (Paulin et al., 1998;

Scheper et al., 2007). This mutation is often found in patients with multiple myeloma

who demonstrate translational upregulation and increased translational efficiency

presumably due to the increased affinity of proteins for the mutated IRES during

translation initiation. This is only a small sampling of diseases linked to mRNA

mutations which demonstrate altered translational regulation but this represents a

significant area of future research. These diseases are clear evidence for the important

role of translational regulation in protein synthesis. The methods outlined in this paper

are uniquely positioned to explore the role of altered translational regulation on these

disease states.
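As a simple illustration of the upstream AUG changes mentioned above, the sketch below scans a 5’ UTR for AUG triplets and compares a reference sequence with a variant. It assumes a substitution mutation, so that both sequences have the same length and positions are directly comparable. The sequences are hypothetical and are not taken from the thrombopoietin or p16 transcripts.

```python
def upstream_augs(utr5):
    """Return 0-based positions of every AUG triplet in a 5' UTR sequence."""
    seq = utr5.upper().replace("T", "U")  # accept DNA or RNA notation
    return [i for i in range(len(seq) - 2) if seq[i:i + 3] == "AUG"]


def compare_upstream_augs(reference, variant):
    """Report AUG positions gained or lost in the variant.

    Assumes equal-length sequences (substitutions only).
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be the same length")
    ref_sites = set(upstream_augs(reference))
    var_sites = set(upstream_augs(variant))
    return {
        "gained": sorted(var_sites - ref_sites),
        "lost": sorted(ref_sites - var_sites),
    }


# Hypothetical 5' UTR fragment with a single C-to-U substitution.
print(compare_upstream_augs("GGCACGGCC", "GGCAUGGCC"))
```

A scan of this kind only flags candidate sites; whether a gained or lost AUG alters translation would still depend on its sequence context and reading frame.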

LITERATURE CITED

Baek, D., J. Villen, C. Shin, F.D. Camargo, S.P. Gygi, and D.P. Bartel. 2008. The impact of microRNAs on protein output. Nature. 455:64-71.

Baim, S.B., D.F. Pietras, D.C. Eustice, and F. Sherman. 1985. A mutation allowing an mRNA secondary structure diminishes translation of Saccharomyces cerevisiae iso-1-cytochrome c. Mol Cell Biol. 5:1839-46.

Bentley, D.R., S. Balasubramanian, H.P. Swerdlow, G.P. Smith, J. Milton, C.G. Brown, K.P. Hall, D.J. Evers, C.L. Barnes, H.R. Bignell, J.M. Boutell, J. Bryant, R.J. Carter, R. Keira Cheetham, A.J. Cox, D.J. Ellis, M.R. Flatbush, N.A. Gormley, S.J. Humphray, L.J. Irving, M.S. Karbelashvili, S.M. Kirk, H. Li, X. Liu, K.S. Maisinger, L.J. Murray, B. Obradovic, T. Ost, M.L. Parkinson, M.R. Pratt, I.M. Rasolonjatovo, M.T. Reed, R. Rigatti, C. Rodighiero, M.T. Ross, A. Sabot, S.V. Sankar, A. Scally, G.P. Schroth, M.E. Smith, V.P. Smith, A. Spiridou, P.E. Torrance, S.S. Tzonev, E.H. Vermaas, K. Walter, X. Wu, L. Zhang, M.D. Alam, C. Anastasi, I.C. Aniebo, D.M. Bailey, I.R. Bancarz, S. Banerjee, S.G. Barbour, P.A. Baybayan, V.A. Benoit, K.F. Benson, C. Bevis, P.J. Black, A. Boodhun, J.S. Brennan, J.A. Bridgham, R.C. Brown, A.A. Brown, D.H. Buermann, A.A. Bundu, J.C. Burrows, N.P. Carter, N. Castillo, E.C.M. Chiara, S. Chang, R. Neil Cooley, N.R. Crake, O.O. Dada, K.D. Diakoumakos, B. Dominguez-Fernandez, D.J. Earnshaw, U.C. Egbujor, D.W. Elmore, S.S. Etchin, M.R. Ewan, M. Fedurco, L.J. Fraser, K.V. Fuentes Fajardo, W. Scott Furey, D. George, K.J. Gietzen, C.P. Goddard, G.S. Golda, P.A. Granieri, D.E. Green, D.L. Gustafson, N.F. Hansen, K. Harnish, C.D. Haudenschild, N.I. Heyer, M.M. Hims, J.T. Ho, A.M. Horgan, et al. 2008. Accurate whole human genome sequencing using reversible terminator chemistry. Nature. 456:53-9.

Blackburn, P., and S. Moore. 1982. The Enzymes. Vol. 15, Part B. P. Boyer, editor. Academic Press, Orlando, FL. 317-433.

Brimacombe, R. 1991. RNA-protein interactions in the Escherichia coli ribosome. Biochimie. 73:927-36.

Chen, G., J.D. Wen, and I. Tinoco, Jr. 2007. Single-molecule mechanical unfolding and folding of a pseudoknot in human telomerase RNA. Rna. 13:2175-88.

Cuatrecasas, P., S. Fuchs, and C.B. Anfinsen. 1967. Catalytic properties and specificity of the extracellular nuclease of Staphylococcus aureus. J Biol Chem. 242:1541-7.

Curran, J.F. 1993. Analysis of effects of tRNA:message stability on frameshift frequency at the Escherichia coli RF2 programmed frameshift site. Nucleic Acids Res. 21:1837-43.

Curran, J.F., and M. Yarus. 1989. Rates of aminoacyl-tRNA selection at 29 sense codons in vivo. J Mol Biol. 209:65-77.

Epicentre. 2009. CircLigase ssDNA Ligase Product Literature. Epicentre Biotechnologies. Lit. #222, 1-4.

Green, L., C.H. Kim, C. Bustamante, and I. Tinoco, Jr. 2008. Characterization of the mechanical unfolding of RNA pseudoknots. J Mol Biol. 375:511-28.

Hardesty, B., T. Tsalkova, and G. Kramer. 1999. Co-translational folding. Curr Opin Struct Biol. 9:111-4.

Hartz, D., D.S. McPheeters, L. Green, and L. Gold. 1991. Detection of Escherichia coli ribosome binding at translation initiation sites in the absence of tRNA. J Mol Biol. 218:99-105.

Hudson, M.E. 2008. Sequencing breakthroughs for genomic ecology and evolutionary biology. Molecular Ecology Resources. 8:3-17.

Ingolia, N.T., S. Ghaemmaghami, J.R. Newman, and J.S. Weissman. 2009. Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science. 324:218-23.

Jacks, T., H.D. Madhani, F.R. Masiarz, and H.E. Varmus. 1988. Signals for ribosomal frameshifting in the Rous sarcoma virus gag-pol region. Cell. 55:447-58.

Komar, A.A., T. Lesnik, and C. Reiss. 1999. Synonymous codon substitutions affect ribosome traffic and protein folding during in vitro translation. FEBS Lett. 462:387-91.

Kozak, M. 1984. Compilation and analysis of sequences upstream from the translational start site in eukaryotic mRNAs. Nucleic Acids Res. 12:857-72.

Kozak, M. 1986a. Influences of mRNA secondary structure on initiation by eukaryotic ribosomes. Proc Natl Acad Sci U S A. 83:2850-4.

Kozak, M. 1986b. Point mutations define a sequence flanking the AUG initiator codon that modulates translation by eukaryotic ribosomes. Cell. 44:283-92.

Kozak, M. 1987. At least six nucleotides preceding the AUG initiator codon enhance translation in mammalian cells. J Mol Biol. 196:947-50.

Kozak, M. 1989. Circumstances and mechanisms of inhibition of translation by secondary structure in eucaryotic mRNAs. Mol Cell Biol. 9:5134-42.

Kozak, M. 1991. An analysis of vertebrate mRNA sequences: intimations of translational control. J Cell Biol. 115:887-903.

Kozak, M. 1992. Regulation of translation in eukaryotic systems. Annu Rev Cell Biol. 8:197-225.

Liao, P.Y., P. Gupta, A.N. Petrov, J.D. Dinman, and K.H. Lee. 2008. A new kinetic model reveals the synergistic effect of E-, P- and A-sites on +1 ribosomal frameshifting. Nucleic Acids Res.

Lim, V.I., and J.F. Curran. 2001. Analysis of codon:anticodon interactions within the ribosome provides new insights into codon reading and the genetic code structure. Rna. 7:942-57.

Lindsley, D., J. Gallant, and G. Guarneros. 2003. Ribosome bypassing elicited by tRNA depletion. Mol Microbiol. 48:1267-74.

Liu, L., D. Dilworth, L. Gao, J. Monzon, A. Summers, N. Lassam, and D. Hogg. 1999. Mutation of the CDKN2A 5' UTR creates an aberrant initiation codon and predisposes to melanoma. Nat Genet. 21:128-32.

Marzi, S., A.G. Myasnikov, A. Serganov, C. Ehresmann, P. Romby, M. Yusupov, and B.P. Klaholz. 2007. Structured mRNAs regulate translation initiation by binding to the platform of the ribosome. Cell. 130:1019-31.

Mathews, M.B., and T. Shenk. 1991. Adenovirus virus-associated RNA and translation control. J Virol. 65:5657-62.

Park, Y.S., S.W. Seo, S. Hwang, H.S. Chu, J.H. Ahn, T.W. Kim, D.M. Kim, and G.Y. Jung. 2007. Design of 5'-untranslated region variants for tunable expression in Escherichia coli. Biochem Biophys Res Commun. 356:136-41.

Paulin, F.E., S.A. Chappell, and A.E. Willis. 1998. A single nucleotide change in the c-myc internal ribosome entry segment leads to enhanced binding of a group of protein factors. Nucleic Acids Res. 26:3097-103.

Purvis, I.J., A.J. Bettany, T.C. Santiago, J.R. Coggins, K. Duncan, R. Eason, and A.J. Brown. 1987. The efficiency of folding of some proteins is increased by controlled rates of translation in vivo. A hypothesis. J Mol Biol. 193:413-7.

Ronaghi, M. 2001. Pyrosequencing sheds light on DNA sequencing. Genome Res. 11:3-11.

Ronaghi, M., M. Uhlen, and P. Nyren. 1998. A sequencing method based on real-time pyrophosphate. Science. 281:363, 365.

Sambrook, J., and D.W. Russell. 2001. Gel Electrophoresis of DNA and Pulsed-field Agarose Gel Electrophoresis. In Molecular Cloning: A Laboratory Manual. Vol. 1. Cold Spring Harbor Laboratory Press, Cold Spring Harbor.

Sanders, C.L., and J.F. Curran. 2007. Genetic analysis of the E site during RF2 programmed frameshifting. Rna. 13:1483-91.

Scheper, G.C., M.S. van der Knaap, and C.G. Proud. 2007. Translation matters: protein synthesis defects in inherited disease. Nat Rev Genet. 8:711-23.

Schwartz, R., and J.F. Curran. 1997. Analyses of frameshifting at UUU-pyrimidine sites. Nucleic Acids Res. 25:2005-11.

Selbach, M., B. Schwanhausser, N. Thierfelder, Z. Fang, R. Khanin, and N. Rajewsky. 2008. Widespread changes in protein synthesis induced by microRNAs. Nature. 455:58-63.

Shatkin, A.J. 1985. mRNA cap binding proteins: essential factors for initiating translation. Cell. 40:223-4.

Shen, L.X., and I. Tinoco, Jr. 1995. The structure of an RNA pseudoknot that causes efficient frameshifting in mouse mammary tumor virus. J Mol Biol. 247:963-78.

Shenk, T.E. 2001. Adenoviridae: The Viruses and Their Replication. In Fundamental Virology. D.M. Knipe, editor. Lippincott, Williams & Wilkins, Philadelphia.

Shine, J., and L. Dalgarno. 1974. The 3'-terminal sequence of Escherichia coli 16S ribosomal RNA: complementarity to nonsense triplets and ribosome binding sites. Proc Natl Acad Sci U S A. 71:1342-6.

Somogyi, P., A.J. Jenner, I. Brierley, and S.C. Inglis. 1993. Ribosomal pausing during translation of an RNA pseudoknot. Mol Cell Biol. 13:6931-40.

Sonenberg, N., and A.G. Hinnebusch. 2007. New modes of translational control in development, behavior, and disease. Mol Cell. 28:721-9.

Spahr, P., and B. Hollingworth. 1961. Purification and mechanism of action of ribonuclease from Escherichia coli ribosomes. Journal of Biological Chemistry. 236:823-831.

Sullivan, C.S., and D. Ganem. 2005. MicroRNAs and viral infection. Mol Cell. 20:3-7.

Takahashi, K., and S. Moore. 1982. The Enzymes. Vol. 15, Part B. P. Boyer, editor. Academic Press, Orlando, FL. 435-468.

Takyar, S., R.P. Hickerson, and H.F. Noller. 2005. mRNA helicase activity of the ribosome. Cell. 120:49-58.

Thanaraj, T.A., and P. Argos. 1996a. Protein secondary structural types are differentially coded on messenger RNA. Protein Sci. 5:1973-83.

Thanaraj, T.A., and P. Argos. 1996b. Ribosome-mediated translational pause and protein domain organization. Protein Sci. 5:1594-612.

Uemura, S., M. Dorywalska, T.H. Lee, H.D. Kim, J.D. Puglisi, and S. Chu. 2007. Peptide bond formation destabilizes Shine-Dalgarno interaction on the ribosome. Nature. 446:454-7.

Weitzman, M.D., and D.A. Ornelles. 2005. Inactivating intracellular antiviral responses during adenovirus infection. Oncogene. 24:7686-96.

Wen, J.D., L. Lancaster, C. Hodges, A.C. Zeri, S.H. Yoshimura, H.F. Noller, C. Bustamante, and I. Tinoco. 2008. Following translation by single ribosomes one codon at a time. Nature. 452:598-603.

Wintermeyer, W., F. Peske, M. Beringer, K.B. Gromadski, A. Savelsbergh, and M.V. Rodnina. 2004. Mechanisms of elongation on the ribosome: dynamics of a macromolecular machine. Biochem Soc Trans. 32:733-7.

Wolin, S.L., and P. Walter. 1988. Ribosome pausing and stacking during translation of a eukaryotic mRNA. Embo J. 7:3559-69.

Xia, X., and M. Holcik. 2009. Strong eukaryotic IRESs have weak secondary structure. PLoS ONE. 4:e4136.

Yusupova, G., L. Jenner, B. Rees, D. Moras, and M. Yusupov. 2006. Structural basis for messenger RNA movement on the ribosome. Nature. 444:391-4.

Zagorska, L., J. Chroboczek, S. Klita, and P. Szafranski. 1982. Effect of secondary structure of messenger ribonucleic acid on the formation of initiation complexes with prokaryotic and eukaryotic ribosomes. Eur J Biochem. 122:265-9.

FIGURES

Figure 1. Illumina method for preparing small RNAs for sequencing. Red bars represent RNA and blue bars represent DNA, while synthesis reactions are indicated by arrows. The Sequencing adapter sequence is indicated by pink (RNA) or turquoise (DNA) bars while the PCR adapter sequence is indicated by dark red (RNA) or light blue (DNA). This method is described in detail in literature available on Illumina’s website (http://www.illumina.com).

Figure 2. Bender & Curran method for preparing small RNAs for sequencing. Red bars represent RNA and blue bars represent DNA, while synthesis reactions are indicated by arrows. The PCR adapter sequence is indicated by a lighter blue region to the left while the Sequencing adapter is indicated by a turquoise region to the right. Orange circles represent magnetic beads bound to the 5’ end of the cDNA through biotin-streptavidin interactions. The BamHI site used in Step 5 is encoded in the oligo dT (not shown) and becomes a functional site upon completion of second-strand synthesis. Refer to the text for details of this procedure.

Figure 3. Ingolia method for preparing small RNAs for sequencing. Red bars represent RNA and blue bars represent DNA, while synthesis reactions are indicated by arrows. The PCR adapter sequence is indicated by a lighter blue bar while the Sequencing adapter is indicated by a turquoise bar. The star (*) represents an abasic furan which is easily broken by treatment with APE I enzyme to relinearize the DNA. Refer to the text for details of this procedure.

Figure 4. Ribonuclease digestion of ribosomal RNAs. A comparison of damage to the two large ribosomal RNAs from digests using RNases A, T1, and micrococcal nuclease. Intact ribosomes are indicated by the appearance of distinct double bands while streaks indicate digestion damage. Control groups (C) appear at the left portion of each gel followed by five 10-fold enzyme dilution samples from high (H) to low (L) concentrations. All samples were run for 8 minutes on a thick, 1% agarose gel then stained with ethidium bromide.

Figure 5. Isolation of ribosome-protected mRNA fragments. RNA fragments phosphorylated on the 5’ end with gamma-labeled 32P-ATP following digestion with micrococcal nuclease as described in the text. Fragments are indicated by streaks and appear slightly larger than the 30 nt length expected for ribosome-protected mRNA fragments. All samples were assayed on a thin, 10% polyacrylamide denaturing gel for 1 hour at 250 volts.

Figure 6. Damage to digested DNA over time. Double bands indicate the presence of the two large ribosomal RNAs. Original sample was digested followed by enzyme inactivation then frozen at -80°C for approximately three weeks while new samples were collected just prior to this assay. Original sample appears largely digested while new digest samples show distinct double banding. Samples were run for 8 minutes on a thick, 1% agarose gel to separate the ribosomal RNAs then stained with ethidium bromide.

Figure 7. Simultaneous dephosphorylation and polyadenylation reactions. A comparison of polyadenylation reactions without dephosphorylation reagents (“Poly-A”) or with either polynucleotide kinase (“PNK”) or calf intestinal phosphatase (“CIP”). The 5’ end of the RNA was phosphorylated with gamma-labeled 32P-ATP in “RNA Only” lane while all other lanes were labeled with a 5 minute pulse of alpha-labeled 32P-ATP, followed by a 55 minute chase with 1mM unlabeled ATP. Initial RNA is 30 nts in length. Streaks appearing near the top of the gel (>200 nts) indicate samples undergoing effective polyadenylation. All samples were assayed on a thin, 10% polyacrylamide denaturing gel for 1 hour at 250 volts.

Figure 8. The decline of poly-A polymerase activity over time. Activity comparison of three aliquots of E. coli poly-A polymerases supplied in Invitrogen’s miRNA first-strand synthesis kit arriving in our lab on the date indicated. Activity was found to be highest in the newest kit while kits one month (7/18/08) and two months (6/8/08) old showed no polyadenylation activity. “RNA only” sample was phosphorylated with gamma-labeled 32P-ATP while poly-A polymerase activity was assayed via a 5 minute pulse of alpha-labeled 32P-ATP followed by a 55 minute chase with 1mM unlabeled ATP. All samples were assayed on a thin, 10% polyacrylamide denaturing gel for 1 hour at 250 volts.

Figure 9. Comparison of Ambion and NEB poly-A polymerases. Fresh NEB poly-A polymerase was found to yield higher efficiency polyadenylation activity over a longer lifetime than Ambion poly-A polymerase. (A) Comparison of 2 month old Ambion poly-A polymerase (PAP) and fresh NEB PAP on 3/17/09. (B) Polyadenylation with NEB PAP still yields good results two months later (5/27/09). “RNA only” sample was phosphorylated with gamma-labeled 32P-ATP while poly-A polymerase activity was assayed via a 5 minute pulse of alpha-labeled 32P-ATP followed by a 55 minute chase with 1mM unlabeled ATP. All samples were assayed on a thin, 10% polyacrylamide denaturing gel for 1 hour at 250 volts.

Figure 10. Pulse-chase assay of polyadenylation. Poly-A polymerase is most active in the first five minutes of the reaction. “RNA only” sample was phosphorylated with gamma-labeled 32P-ATP while poly-A polymerase activity was assayed via a pulse of alpha-labeled 32P-ATP chased by 1mM unlabeled ATP. Pulse times varied as indicated and total incubation time was one hour. All samples were assayed on a thin, 10% polyacrylamide denaturing gel for 1 hour at 250 volts.

Figure 11. Factors affecting polyadenylation. This assay tested Ambion’s poly-A polymerase (1) with the addition of BSA, (2) with additional enzyme added midway through the reaction, (3) with NEB’s poly-A polymerase reaction buffer, (4) with decreased incubation temperature, and after a separate, heat-inactivated polynucleotide kinase (PNK) reaction. “RNA” sample was phosphorylated with gamma-labeled 32P-ATP while poly-A polymerase activity was assayed via a 5 minute pulse of alpha-labeled 32P-ATP followed by a 55 minute chase with 1mM unlabeled ATP. The standard reaction consisted of all factors identified in the text but not 0.1 mg/mL BSA. All samples were assayed on a thin, 10% polyacrylamide denaturing gel for 1 hour at 250 volts.

Figure 12. Additional factors affecting polyadenylation. Other factors were also tested using Ambion’s poly-A polymerase, such as increasing ATP concentration (1.5x), doubling the initial enzyme concentration, increasing incubation time (1.5x), and removing the heat inactivation of poly-A polymerase step. “RNA Only” sample was phosphorylated with gamma-labeled 32P-ATP while poly-A polymerase activity was assayed via a 5 minute pulse of alpha-labeled 32P-ATP followed by a 55 minute chase with 1mM unlabeled ATP. All samples were assayed on a thin, 10% polyacrylamide denaturing gel for 1 hour at 250 volts.

Figure 13. The effect of dNTP concentration during first-strand synthesis. dNTP concentration during first-strand synthesis was found to have a direct effect on the product synthesized. “RNA Only” sample was phosphorylated with gamma-labeled 32P-ATP while poly-A polymerase activity was assayed via a 5 minute pulse of alpha-labeled 32P-ATP followed by a 55 minute chase with 1mM unlabeled ATP. Additionally, first-strand synthesis was assayed with the indicated concentration of dDTPs (containing all dNTPs except dCTP) supplemented with alpha-labeled 32P-dCTP without a chase step. All samples were assayed on a thin, 10% polyacrylamide denaturing gel for 1 hour at 250 volts.

Figure 14. First-strand synthesis using several oligo dT sequences. (A) Diagram of oligo dT sequences used in this experiment. Dark blue represents oligo dT segments where light blue represents the PCR adapter sequence and the turquoise bar represents the Sequencing adapter. Orange circles represent streptavidin magnetic beads bound to oligos with one of two restriction digest sites (as indicated). In the last sequence, the star is an abasic furan. (B) First-strand synthesis was assayed with 100µM dDTPs (containing all dNTPs except dCTP) supplemented with alpha-labeled 32P-dCTP during synthesis without a chase step. All samples were assayed on a thin, 10% polyacrylamide denaturing gel for 1 hour at 250 volts.

Figure 15. First-strand synthesis using Ingolia’s method. First-strand synthesis appears absent and polyadenylation is greatly inhibited using Ingolia’s method. Poly-A polymerase activity was assayed using a 5 minute pulse of alpha-labeled 32P-ATP followed by a 55 minute chase with 1mM unlabeled ATP. Additionally, first-strand synthesis was assayed with 100µM dDTPs (containing all dNTPs except dCTP) supplemented with alpha-labeled 32P-dCTP without a chase step. All samples were assayed on a thin, 10% polyacrylamide denaturing gel for 1 hour at 250 volts.

Figure 16. First-strand synthesis using Bender & Curran method. Polyadenylation appears faint but to the appropriate length. First-strand synthesis occurs using the Ingolia primer with an enhanced first-strand synthesis reaction and cDNA is successfully precipitated with GlycoBlue (“1st Precipiation,” refer to the text for details). Poly-A polymerase activity was assayed via a 5 minute pulse of alpha-labeled 32P-ATP followed by a 55 minute chase with 100µM unlabeled ATP. First-strand synthesis was assayed with 100µM dDTPs (containing all dNTPs except dCTP) supplemented with alpha-labeled 32P-dCTP without a chase step. All samples were assayed on a thin, 10% polyacrylamide denaturing gel for 1 hour at 250 volts.

Figure 17. DNA retention following binding to streptavidin magnetic beads. (A) First strand synthesis assays using NEB standard or hydrophilic beads. Lost material is noted when standard beads are used. (B) A second first-strand assay with subsequent washes noted the retention of DNA on beads during synthesis and wash steps. Beads were bound by standard methods as described in the text. Washed beads were washed three times with 0.1x TE prior to gel analysis. Poly-A polymerase activity was assayed via a 5 minute pulse of alpha-labeled 32P-ATP followed by a 55 minute chase with 1mM unlabeled ATP. First-strand synthesis was assayed with 100µM dDTPs (containing all dNTPs except dCTP) supplemented with alpha-labeled 32P-dCTP without a chase step. All samples were assayed on a thin, 10% polyacrylamide denaturing gel for 1 hour at 250 volts.

Figure 18. DNA retention following removal of RNA. Poly-A polymerase activity was assayed via a 5 minute pulse of alpha-labeled 32P-ATP followed by a 55 minute chase with 1mM unlabeled ATP while first-strand synthesis was assayed with 100µM dDTPs (containing all dNTPs except dCTP) supplemented with alpha-labeled 32P-dCTP without a chase step. All samples were bound to hydrophilic streptavidin magnetic beads using HB006 oligo dT by the method described in the text. 1M NaOH/1M HCl sample was subjected to treatments described in the text for Ingolia’s method and beads were allowed to re-bind DNA for 10 minutes prior to removal of supernatant. Tris-Cl pH 10 sample was treated with an excess of this solution and incubated at 65°C for 20 minutes, then beads were allowed to re-bind DNA for 10 minutes prior to removal of supernatant from the beads. RNase H digestion was carried out using the Bender & Curran method described in the text. All samples were assayed on a thin, 10% polyacrylamide denaturing gel for 1 hour at 250 volts.

Figure 19. Comparison of dTTP and dCTP tailing reactions. Tailing with dTTP is more efficient than dCTP tailing under standard tailing conditions. HB002 oligos were phosphorylated with gamma-labeled 32P-ATP prior to tailing with cold nucleotides (dCTP or dTTP). “HB002 Alone” sample was labeled in the same manner but did not undergo tailing. All samples were assayed on a thin, 10% polyacrylamide denaturing gel for 1 hour at 250 volts.

Figure 20. Factors affecting dCTP tailing. Nucleotide (dCTP) concentration, enzyme (terminal transferase) concentration, and incubation times were varied to determine optimal conditions for near complete tailing of cDNA samples. HB002 oligos were phosphorylated with gamma-labeled 32P-ATP prior to the tailing reaction. “HB002 only” sample was labeled in the same manner but did not undergo tailing. All samples were assayed on a thin, 10% polyacrylamide denaturing gel for 1 hour at 250 volts.

Figure 21. Optimizing the dCTP tailing reaction. Nucleotide (dCTP) concentration, terminal transferase (TT) concentration, and incubation times were tested at standard and 2x standard conditions, as determined in Figure 20, to determine the optimal conditions for near complete tailing of cDNA samples. Complete tailing is indicated by the disappearance of the cDNA band. Poly-A polymerase activity was assayed via a 5 minute pulse of alpha-labeled 32P-ATP followed by a 55 minute chase with 100µM unlabeled ATP while first-strand synthesis was assayed with 100µM dDTPs (containing all dNTPs except dCTP) supplemented with alpha-labeled 32P-dCTP without a chase step. Tailing reactions were assayed by tailing first-strand labeled DNA. All samples were assayed on a thin, 10% polyacrylamide denaturing gel for 1 hour at 250 volts.


[Figure 22 graphic. Panel A: gel image. Panel B: bar chart; y-axis: Percentage of Total Signal in Column (0% to 100%); x-axis samples: 1st Str., Std Rxn, 2x dCTP, 2x Incu., 2x dCTP and 2x Incu., 2x Enz., 2x Enz. and 2x dCTP, 2x Enz. and 2x Incu., 2x All; legend: Tailed 1st Strand, 1st Strand.]

Figure 22. Quantification of tailing efficiency. (A) Original gel with volume markers used to determine the signal present in each portion of a column. The upper (tailed fragment) and lower (first-strand) portions of each column were summed to give the total column signal, and the signal from each product was expressed as a percentage of that total. (B) Comparison of the percent signal in the first-strand and tailed portions of the gel for each sample.
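The percent-signal calculation described in this caption can be sketched as follows. This is a minimal illustration, assuming each column yields one volume reading for the upper (tailed) portion and one for the lower (first-strand) portion; the function name `percent_signal` and the numeric readings are hypothetical and are not data from the gel.

```python
def percent_signal(tailed, first_strand):
    """Return (percent tailed, percent first-strand) for one gel column."""
    total = tailed + first_strand
    if total <= 0:
        raise ValueError("column has no signal")
    return 100.0 * tailed / total, 100.0 * first_strand / total


# Hypothetical volume readings (arbitrary units), NOT values measured in this study.
columns = {
    "Std Rxn": (600.0, 400.0),
    "2x All": (900.0, 100.0),
}

for name, (tailed, first_strand) in columns.items():
    pct_tailed, pct_first = percent_signal(tailed, first_strand)
    print(f"{name}: {pct_tailed:.1f}% tailed, {pct_first:.1f}% first-strand")
```

Normalizing to the column total makes lanes comparable even when different amounts of labeled material were loaded.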


Figure 23. Techniques for ligating adapter sequences & fragments of interest. Red bars represent RNA while blue bars represent DNA. Lighter blue bars represent the PCR adapter sequence which is ligated either to RNA or DNA in single- or double-stranded form.


Figure 24. Increasing ligation efficiency by augmenting oligo concentration. Ligations were tested at 1/3x, 1x, and 3x standard oligo conditions. Poly-A polymerase activity was assayed via a 5 minute pulse of alpha-labeled 32P-ATP followed by a 55 minute chase with 100µM unlabeled ATP while first-strand synthesis was assayed with 100µM dDTPs (containing all dNTPs except dCTP) supplemented with alpha-labeled 32P-dCTP without a chase step. Tailing reaction was assayed by tailing first-strand labeled DNA. To assay ligation, adapter sequence oligos were phosphorylated with gamma-labeled 32P-ATP prior to the ligation reaction. All samples were assayed on a thin, 10% polyacrylamide denaturing gel for 1 hour at 250 volts.


Figure 25. Varying dG overhang length to enhance ligation efficiency. PCR adapter sequences hybridized to complementary sequences with an overhang of the indicated number of dG residues were ligated to radio-labeled cDNAs with a poly-dC tail. Poly-A polymerase activity was assayed via a 5 minute pulse of alpha-labeled 32P-ATP followed by a chase with 100µM unlabeled ATP while first-strand synthesis was assayed with 100µM dDTPs (containing all dNTPs except dCTP) supplemented with alpha-labeled 32P-dCTP without a chase step. Second-strand synthesis was assayed using 200µM dDTPs supplemented with alpha-labeled 32P-dCTP without a chase step. All samples were assayed on a thin, 10% polyacrylamide denaturing gel for 1 hour at 250 volts.


Figure 26. Evidence for successful ligation using the Bender & Curran method. The shift of the tailing streak suggests that the ligated fragments increased in size relative to the tailed fragments. Poly-A polymerase activity was assayed via a 5 minute pulse of alpha-labeled 32P-ATP followed by a chase with 100µM unlabeled ATP while first-strand synthesis was assayed with 100µM dDTPs (containing all dNTPs except dCTP) supplemented with alpha-labeled 32P-dCTP without a chase step. Labeled first-strand DNA was tailed with dCTP and then ligated to a PCR adapter sequence with a four dG residue overhang. Samples were assayed on a thin, 10% polyacrylamide denaturing gel for 1 hour at 250 volts.


[Figure 27 graphic. Line plot; x-axis: Relative Distance on Gel (Top --> Bottom), 0 to 0.968; y-axis: Radioactive Counts, 0 to 9000; legend: No Ligation, Ligation.]

Figure 27. Comparison of signal shifts for non-ligated and ligated samples. The highest signal peak (at left) indicates tailed (red) or ligated (blue) fragments while secondary peaks (at right) indicate signal from labeled first-strand alone. The shift between the ligated and unligated signal peaks was calculated to be approximately 30 nucleotides. Data were collected using BioRad’s phosphorimaging software.
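The comparison of peak positions in two lane traces can be sketched as follows. This is a minimal illustration, assuming each trace is a list of radioactive counts sampled at the same relative gel positions; the function name `main_peak_position` and the trace values are hypothetical and are not the data collected for this figure. Converting the positional shift into nucleotides would additionally require a size calibration for the gel, which this sketch does not attempt.

```python
def main_peak_position(positions, counts):
    """Return the relative gel position at which the trace has its highest count."""
    if len(positions) != len(counts) or not counts:
        raise ValueError("positions and counts must be non-empty and equal in length")
    best = max(range(len(counts)), key=lambda i: counts[i])
    return positions[best]


# Hypothetical traces (relative distance from the top of the gel, counts);
# these are illustrative values, NOT measurements from this study.
positions = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
no_ligation = [100, 300, 900, 8000, 1200, 400]
ligation = [100, 400, 7500, 1000, 900, 300]

# A positive shift means the ligated peak lies nearer the top of the gel,
# as expected for larger fragments, which migrate more slowly.
shift = main_peak_position(positions, no_ligation) - main_peak_position(positions, ligation)
print(f"Peak shift in relative gel distance: {shift:.3f}")
```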


Figure 28. Retention of cDNA after removal of gel purification steps. Each sample was subjected to all steps shown to its left, then assayed after the step indicated. Only one precipitation step was used following RNA removal but prior to circularization. Poly-A polymerase activity was assayed via a 5 minute pulse of alpha-labeled 32P-ATP followed by a chase with 100µM unlabeled ATP while first-strand synthesis was assayed with 100µM dDTPs (containing all dNTPs except dCTP) supplemented with alpha-labeled 32P-dCTP without a chase step. Samples were assayed on a thin, 10% polyacrylamide denaturing gel for 1 hour at 250 volts.


Figure 29. Circularization of control oligos using CircLigase. Oligos shown are before and after the ligation reaction. All oligos demonstrated the expected partial upward shift in the original band, indicating circularization occurred. The HB010 oligo is a phosphorylated 50-mer while the HB007 oligo is a phosphorylated 45-mer containing a dC residue at the 3’ end. All circularization reactions were carried out according to Epicentre’s instructions. Samples were assayed on a thick, 14% polyacrylamide denaturing gel for 1 ½ hours at 250 volts and stained with SYBR Gold prior to imaging.


Figure 30. Circularization as evidenced by successful PCR amplification. Circularization was tested at two concentrations of ATP, then assayed using Ingolia’s PCR primers. PCR with these primers will only work if successful circularization has occurred. Faint product bands from PCR amplification are highlighted (circle) although it is conceivable these may be hybridized primer products (dimers). Samples were assayed on a thick, 14% polyacrylamide denaturing gel for 1 ½ hours at 250 volts and stained with SYBR Gold prior to imaging.


SCHOLASTIC VITA

R. Hugh F. Bender

BORN: February 5, 1985, Denver, CO

UNDERGRADUATE STUDY:

Wake Forest University, Winston-Salem, NC. B.S., Biology with Honors, 2007.

GRADUATE STUDY:

Wake Forest University, Winston-Salem, NC. M.S., Cell and Molecular Biology, 2009.

SCHOLASTIC AND PROFESSIONAL EXPERIENCE:

Undergraduate Researcher, Wake Forest University, 2006-2007.
Graduate Teaching Assistant, Wake Forest University, 2008-2009.
Graduate Researcher, Wake Forest University, 2007-2009.

HONORS AND AWARDS:

Honors in Biology, 2007.

PROFESSIONAL SOCIETIES:

Beta Beta Beta Biological Honor Society, 2004-2007.
Delta Phi Alpha German Honor Society, 2006-2007.

PUBLICATIONS:

Bender RHF, Ornelles DA, Curran JF. In prep. A method for identifying ribosome pause sites in messenger RNA through new sequencing technology.

Lons RB, Saldana SJ, Bender RHF, Daker R, Turkett Jr. W, Fetrow JS. In prep. FurBall: targeted interaction network exploration.
