Structural and functional characterization of proteins involved in

Research Collection

Doctoral Thesis

Structural and functional characterization of proteins involved inRNA-mediated gene silencing

Author(s): Lingel, Andreas

Publication Date: 2006

Permanent Link: https://doi.org/10.3929/ethz-a-005164730

Rights / License: In Copyright - Non-Commercial Use Permitted

This page was generated automatically upon download from the ETH Zurich Research Collection. For moreinformation please consult the Terms of use.

ETH Library

https://doi.org/10.3929/ethz-a-005164730

http://rightsstatements.org/page/InC-NC/1.0/

https://www.research-collection.ethz.ch

https://www.research-collection.ethz.ch/terms-of-use

DISS. ETH NO. 16441

Structural and functional characterization of proteins involved in RNA-mediated gene silencing

A dissertation submitted to the

SWISS FEDERAL INSTITUTE OF TECHNOLOGY ZURICH

for the degree of Doctor of Sciences

presented by

ANDREAS LINGEL Dipl. Natw. ETH

born 08.02.1976 citizen of Germany

accepted on the recommendation of

Prof. Dr. Ulrike Kutay Prof. Dr. Frédéric Allain

Dr. Elisa Izaurralde

2006

Previous page: In Greek mythology, the Argonauts are a group of sailors, named after their ship, the Argo. Under the leadership of Jason, they start an expedition to find and bring back the Golden Fleece, a treasury guarded by a dragon and located in Colchis on the Black Sea. With the help of the gods, they are able to finish their mission successfully and return to Greece. The illustration shows a detail of an Athenian red-figure clay vase, dated about 425-375 BC. It shows the Argonauts on board their ship Argo, with the NMR ensemble of the Drosophila Argonaute2 PAZ domain in the foreground. The vase is located in the Museo Jatta in Ruvo, Italy.

This thesis describes work carried out in the laboratory of Dr. Elisa Izaurralde at the European Molecular Biology Laboratory (EMBL) in Heidelberg, Germany under the supervision of Prof. Dr. Ulrike Kutay (Institute of Biochemistry at the Swiss Federal Institute of Technology Zurich). The work was completed between May 2002 and December 2005 and was supported by an EMBL predoctoral fellowship. The following publications are presented in this thesis: 1. Lingel, A., Simon, B., Izaurralde, E., and Sattler, M. (2003) Structure and nucleic acid

binding of the Drosophila Argonaute2 PAZ domain. Nature, 426, 465-469. 2. Lingel, A., Simon, B., Izaurralde, E., and Sattler, M. (2004) Nucleic acid 3’-end

recognition by the Argonaute2 PAZ domain. Nat Struct Mol Biol, 11, 576-577. 3. Lingel, A., Simon, B., Izaurralde, E., and Sattler, M. (2004) NMR assignment of the

Drosophila Argonaute2 PAZ domain. J Biomol NMR, 29, 421-432. 4. Lingel, A., Simon, B., Izaurralde, E., and Sattler, M. (2005) The structure of the FHV

B2 protein, a viral suppressor of RNAi, reveals a novel mode of dsRNA recognition. Embo Rep, 6, 1149-1155.

In addition, a perspective and a review are presented: 5. Lingel, A., and Izaurralde, E. (2004) RNAi: Finding the elusive endonuclease. RNA, 10,

1675-1679. 6. Lingel, A., and Sattler, M. (2005) Novel modes of protein-RNA recognition in the

RNAi pathway. Curr Opin Struct Biol, 15, 107-115.

Acknowledgements First of all, I want to thank my supervisor Dr. Elisa Izaurralde for giving me the opportunity to work in her lab. I would like to thank her for excellent scientific guidance and all her enthusiasm, for the great help and for the constant support throughout the last years. This was all invaluable for the success of this Ph.D. thesis. Furthermore, she enabled my long-term collaboration with the laboratory of biomolecular NMR at EMBL. Especially, I want to thank Dr. Michael Sattler for the possibility to do this collaborative work with his lab. He and Dr. Bernd Simon had essential contributions to my projects and taught me NMR spectroscopy. Without them, this work would not have been possible. I am very grateful to Prof. Dr. Ulrike Kutay for the supervision of this thesis at ETH Zurich. Furthermore, I thank Prof. Dr. Frédéric Allain for being my ‘co-referent’ and Prof. Dr. Nikolaus Amrhein for his help and for chairing the examination committee at ETH Zurich. I would like to thank the members of my thesis advisory committee at EMBL for their interest in my work, for fruitful discussions and for their helpful advice. In addition to Dr. Elisa Izaurralde and Dr. Michael Sattler, these were Dr. Andreas Ladurner, and Prof. Dr. Ulrike Kutay. A special “thank you” goes to Michaela Rode and Gunter Stier. Both of them provided excellent technical support and very useful advice throughout the time of my thesis. Next, I want to thank the past and present members of the lab for being nice colleagues and friends, and for enjoyable times inside and outside of the lab. These are: Andrea Herold, David Gatfield, Leonie Unterholzner, Daniel Forler, Kerstin Gari, Jan Rehwinkel, Maria Polycarpou-Schwarz, Tamás Orban, Pavel Natalin, Isabelle Behm-Ansmant, Valerie Hilgers, Schu-Fee Yang, Ana Sofia Bregieiro Eulalio, Suheda Erener, and Foteini Christodoulou. In addition, I would like to thank the members of the NMR lab: Fabian Filipp, Christian Edlich, Cameron Mackereth, Anna Oddone, Lorenzo Corsini, Ana Messias, Malgorzata Duszczyk, Frank Gabel, and Alexander Gasch. I also want to thank other people at EMBL who contributed to my work in one way or another. These are Elena Conti, Sébastien Fribourg, Atlanta Cook, Jérome Basquin, all members of the photolab and of other service units. I am very thankful to Elisa, Sufe, Jan, Pavel, Michael and Bernd for reading this thesis and for helpful comments. Mein ganz besonderer Dank gilt meiner Familie. Meine Eltern und meine Schwester Barbara haben mich durchgehend unterstützt und haben mir immer wieder Kraft gegeben. Ohne sie wäre das alles nicht möglich gewesen. Vielen Dank!

Table of Contents

Table of Contents Zusammenfassung .................................................................................................................... 3 Summary ................................................................................................................................... 5 1. Introduction ...................................................................................................................... 7

1.1 Regulation of gene expression by RNA silencing ..................................................... 7 1.1.1 A brief history of the discovery of RNA silencing ............................................ 7 1.1.2 RNA interference ............................................................................................... 9 1.1.3 miRNA pathway............................................................................................... 13 1.1.4 RNA-mediated transcriptional gene silencing ................................................. 16

1.2 Argonaute proteins ................................................................................................... 18

1.2.1 Argonaute proteins: a conserved protein family with diverse functions.......... 18 1.2.2 The Drosophila Ago1 and Ago2 proteins ........................................................ 21 1.2.3 Drosophila and human Argonaute2 act as Slicer............................................. 22 1.2.4 Structures of archaeal Argonaute and Piwi proteins ........................................ 23

1.3 Viral suppression of RNA silencing......................................................................... 30

1.4 NMR spectroscopy: a tool to understand the function of biomolecules .................. 36

1.4.1 The basic principle of nuclear magnetic resonance ......................................... 36 1.4.2 The chemical shift ............................................................................................ 39 1.4.3 The basic NMR experiment ............................................................................. 40 1.4.4 The interaction of spins through chemical bonds: scalar coupling .................. 41 1.4.5 Resonance assignment in proteins.................................................................... 43 1.4.6 The interaction of spins through space: dipolar coupling and residual dipolar coupling .................................................................................. 45 1.4.7 The nuclear Overhauser effect ......................................................................... 47 1.4.8 Relaxation and dynamics ................................................................................. 48 1.4.9 Structure determination with NMR derived constraints .................................. 52 1.4.10 Interaction studies by NMR spectroscopy ....................................................... 54

2. Results ............................................................................................................................. 57

2.1 Structure and nucleic acid binding of the Drosophila Argonaute2 PAZ domain .... 58

2.2 Nucleic acid 3’-end recognition by the Argonaute2 PAZ domain........................... 71

2.3 NMR assignment of the Drosophila Argonaute2 PAZ domain............................... 88

2.4 The structure of the FHV B2 protein, a viral suppressor of RNAi, reveals a novel mode of dsRNA recognition ........................................................... 91

1

Table of Contents

3. Discussion ...................................................................................................................... 103

3.1 The PAZ domain represents a novel nucleic acid binding fold ............................. 103

3.2 The PAZ domain as a specificity providing module in the RNAi pathway........... 110

3.3 The FHV B2 protein binds dsRNA in a novel mode ............................................. 113

3.4 Viral suppressors of RNAi act at different levels in the RNAi pathway ............... 117 4. Additional Publications................................................................................................ 119

4.1 RNAi: Finding the elusive endonuclease ............................................................... 120

4.2 Novel modes of protein-RNA recognition in the RNAi pathway.......................... 126 5. Abbreviations................................................................................................................ 136 6. References ..................................................................................................................... 138 7. Curriculum vitae .......................................................................................................... 148

2

Zusammenfassung

Zusammenfassung Um den komplexen Prozess der Genexpression regulieren zu können, hat die Zelle ein breites

Repertoire an Kontrollmechanismen entwickelt. RNA-Interferenz (RNAi) ist ein solcher

Mechanismus, der bei Anwesenheit von doppelsträngiger RNA in den Zellen die

Genexpression herunterreguliert. Die doppelsträngige RNA wird dabei von Dicer, einer

Ribonuklease der Klasse III, in „small interfering RNAs (siRNAs)“ gespalten. Diese sind 21-

23 Nukleotide lang und bilden einen ca. 19 Basenpaare umfassenden Duplex, der auf beiden

Seiten am 3’-Ende einen Überhang von zwei Nukleotiden aufweist. In einem zweiten Schritt

werden die siRNAs in den multimeren „RNA-induced silencing complex (RISC)“ überführt,

in dem sie mit der Hilfe eines Argonaute-Proteins die Aufgabe haben, eine komplementäre

Boten-RNA zu erkennen und zu spalten. Die Mitglieder der konservierten Argonaute-

Proteinfamilie zeichnen sich durch die Anwesenheit von zwei charakteristischen Domänen,

einer zentralen PAZ-Domäne und einer C-terminalen Piwi-Domäne, aus.

Zu Beginn dieser Arbeit war bereits bekannt, dass die PAZ-Domäne nur in Dicer- und

Argonaute-Proteinen vorkommt. Darüber hinaus konnte die Interaktion zwischen Mitgliedern

beider Proteinfamilien gezeigt werden und deshalb wurde vorgeschlagen, dass die PAZ-

Domäne als Protein-Protein-Bindungsdomäne den direkten Kontakt zwischen den beiden

Proteinen vermittelt.

Der Hauptgesichtspunkt dieser Arbeit war die Charakterisierung der Funktion der PAZ-

Domäne. In einem ersten Schritt habe ich mit Hilfe der Kernspinresonanz (NMR)-

Spektroskopie die dreidimensionale Struktur der PAZ-Domäne des Argonaute2-Proteins von

Drosophila melanogaster gelöst (Kapitel 2.1). Die Domäne hat eine bisher unbekannte

Proteinfaltung, die aus einem zentralen, fünfsträngigen β-Faltblatt besteht, das auf der einen

Seite von zwei α-Helices und auf der anderen von einem konservierten β-Schleife/α-Helix-

Modul flankiert wird. Basierend auf der Struktur konnte ich durch weitere Experimente

zeigen, dass die PAZ-Domäne in vitro an Nukleinsäuren bindet; ein Ergebnis, das im Bezug

auf die vorgeschlagene Funktion erstaunlich war. Eine genauere Analyse ergab, dass die

PAZ-Domäne dabei sowohl an einzelsträngige RNA als auch an doppelsträngige RNA, die

einen zwei Nukleotide langen Überhang am 3’-Ende hat, bindet. Dies deutete bereits auf eine

Spezifität der PAZ-Domäne für einzelsträngige Nukleinsäuren hin. Konservierte

Aminosäuren in der Bindungstasche ließen darauf schließen, dass alle PAZ-Domänen

3

Zusammenfassung

Nukleinsäuren binden können, was ich tatsächlich für mehrere PAZ-Domänen von

Argonaute-Proteinen von Drosophila und des Menschen zeigen konnte.

Um die beobachtete Bindungsspezifität der PAZ-Domäne besser verstehen zu können, habe

ich im weiteren Verlauf der Untersuchungen die Strukturen der PAZ-Domäne des

Argonaute2-Proteins im Komplex mit DNA- und RNA-Oligonukleotiden gelöst (Kapitel 2.2).

Dadurch wurde deutlich, dass nur die zwei letzten Nukleotide am 3’-Ende gebunden werden

und dass die 3’-OH Gruppe des letzten Nukleotids tief innerhalb der Bindungstasche sitzt.

Das gleiche Resultat ergab die Charakterisierung der Bindung eines weiteren RNA-Liganden,

der wie siRNAs einen doppelsträngigen Teil und einen Überhang am 3’-Ende besaß. Dieses

Ergebnis ermöglichte eine Erklärung der beobachteten Bindungsspezifität der PAZ-Domäne

und die Definition der molekulare Funktion als die der 3’-Erkennung. Somit kann man

schlussfolgern, dass die PAZ-Domäne zur spezifischen Erkennung von siRNAs, die von

Dicer erzeugt wurden, beitragen kann und deren Einbau in den RNAi-Effektorkomplex

unterstützt.

RNAi ist ein Mechanismus, mit dessen Hilfe sich Zellen gegen virale Infektionen schützen

können und der bei der Anwesenheit fremder, doppelsträngiger RNA aktiviert wird. Im

zweiten Teil dieser Arbeit habe ich die Struktur und die Funktion eines viralen Proteins, von

dem gezeigt wurde, dass es den herunterregulierenden Effekt von RNAi in Insektenzellen, in

Pflanzen und in Caenorhabditis elegans außer Kraft setzen kann, charakterisiert. Eine

Vielzahl dieser viralen Faktoren ist beschrieben worden, allerdings ist bisher sehr wenig über

deren molekularen Wirkungsmechanismus bekannt. Das untersuchte Protein war das B2-

Protein des Flock House Virus (FHV), und meine Arbeit resultierte in der Aufklärung dessen

dreidimensionaler Struktur (Kapitel 2.4). Das B2-Protein bildet ein elongiertes Dimer, in dem

jedes Monomer aus drei α-Helices aufgebaut ist. Durch NMR-Experimente zur Untersuchung

der Bindung einer siRNA an das B2-Protein konnte ich zeigen, dass auf der einen Seite des

Dimers zwei α-Helices - je eine von einem Monomer - eine Interaktionsfläche mit der RNA

bilden. Die Anordnung von basischen Seitenketten legt nahe, dass die RNA überwiegend

durch Kontakte zum Phosphodiester-Ribose-Rückgrat gebunden wird, wodurch die fehlende

Sequenzspezifität erklärt wird. Da sowohl siRNAs als auch lange doppelsträngige RNAs von

B2 gebunden werden, läßt sich die Inhibition von RNAi auf zweierlei Weise erklären: zum

einen kann das B2-Protein von FHV an seine eigene, virale RNA binden und dadurch eine

Spaltung durch Dicer verhindern, und zum anderen kann B2 an siRNAs binden und so deren

Einbau in den RISC-Komplex unterbinden.

4

Summary

Summary RNA interference (RNAi) is an evolutionary conserved mechanism that regulates gene

expression in response to the presence of double-stranded RNA (dsRNA) in the cell. The

dsRNA is first cleaved by the ribonuclease (RNase) III-type enzyme Dicer into 21-23

nucleotide (nt) small interfering RNA duplexes (siRNAs) with 2 nt 3’-overhangs. In a

subsequent step, the siRNAs are incorporated into a multimeric RNA-induced silencing

complex (RISC), where they guide the selection of a complementary mRNA. Argonaute

protein family members are core components of RISCs and characterized by two conserved

domains, namely the central PAZ domain and the C-terminal Piwi domain. The Piwi domain

has endonucleolytic activity and mediates cleavage of the target mRNA in a region with

sequence complementarity to the siRNA guide. The expression of the targeted gene is thereby

strongly down-regulated.

At the time when this study was initiated, the PAZ domain had been described to be present

exclusively in two protein families, namely the Dicer and Argonaute protein family. As

members of these families had been shown to interact with each other, it was proposed that

the PAZ domain might serve as a protein-protein interaction domain, mediating physical

contact between Dicer and Argonaute proteins.

The main focus of the work described in this thesis was to reveal the function of the PAZ

domain on a molecular level. To achieve this, I first solved the three dimensional nuclear

magnetic resonance (NMR) structure of the free Drosophila melanogaster Ago2 PAZ domain

(chapter 2.1). This showed that the PAZ domain adopts a novel fold composed of a central

five-stranded β-barrel flanked by two α-helices and a conserved β-hairpin/α-helix insertion.

Based on the structure, I performed further experiments that showed that the PAZ domain

binds nucleic acids in vitro, which was an unexpected result. I could demonstrate that the

PAZ domain binds single-stranded RNAs and double-stranded RNAs with 2 nt single-

stranded 3’-overhangs, but has reduced affinity for blunt-ended double-stranded RNA

molecules. This pointed towards a specificity of the PAZ domain for single-stranded nucleic

acids. The conservation of residues located in the binding site suggested a similar function for

all PAZ domains. Indeed, I could show nucleic acid binding for multiple PAZ domains of

Drosophila and human Argonaute proteins.

5

Summary

Having established the function of the PAZ domain in nucleic acid binding, the next goal was

to understand the apparent specificity for single-stranded nucleic acids. For this, I defined

short oligonucleotides as effective ligands of the PAZ domain and determined two complex

structures of the Drosophila Ago2 PAZ domain bound to RNA and DNA oligomers,

respectively (chapter 2.2). This revealed that only the two 3’-terminal nucleotides are bound

by the PAZ domain, whereas the other nucleotides showed no stable interaction. A very

similar binding mode was observed when I studied the binding of an siRNA mimic, which

had a double-stranded stem and a 2 nt 3’-overhang. These observations provided an

explanation for the characterized specificity of the PAZ domain and demonstrated that the

PAZ domain is a 3’-end recognition domain. Together, my results indicate that the PAZ

domain contributes to the recognition and incorporation of Dicer-processed siRNAs into

RISC and thus serves as a specificity providing module in the RNAi pathway.

In the second part of my thesis, I studied the structure and function of a viral suppressor of

RNA interference. RNAi is thought to originate from an ancient endogenous defense

mechanism against viral and other heterologous dsRNAs. It had been shown before that many

viruses have evolved proteins that suppress RNA silencing and thereby counteract the cellular

response to the presence of double-stranded viral RNA. Although a large number of viral

suppressors had been identified, little is known about their molecular mechanisms.

One of these previously described suppressors of RNAi is the Flock House virus (FHV) B2

protein, which was shown to inhibit an anti-viral response in insect cells, plants and

Caenorhabditis elegans. To understand the molecular basis of its suppression activity, I

determined the solution structure of the FHV B2 protein (chapter 2.4). This study

demonstrated that B2 is an elongated dimer in solution, with each monomer composed of

three α-helices. NMR experiments demonstrated binding of the B2 dimer to a double-

stranded siRNA with high affinity. The binding site was mapped to a surface which is

comprised of one complete side of the dimer and composed of two anti-parallel helices (one

of each monomer). The distribution of positively charged side chains located in these helices

suggested that the dsRNA is contacted mainly via the sugar phosphate backbone, which

explains the lack of sequence specificity. This binding mode represents a novel way of

dsRNA recognition. As both siRNAs and long dsRNAs are bound, a dual mode of

suppression of RNAi is suggested: the B2 dimer could prevent the incorporation of viral RNA

into RISC by inhibition of Dicer-mediated cleavage of viral dsRNA and by binding directly to

siRNAs.

6

Introduction

7

1. Introduction

1.1 Regulation of gene expression by RNA silencing The basic flow of information in gene expression is from DNA to RNA to protein. This is also

known as the ‘central dogma of molecular biology’ that was established by Francis Crick in

the year 1958 (Crick, 1958). In order to ensure the right correlation between genes and the

amount of protein being produced, the expression of genes has to be tightly controlled by

complex regulatory mechanisms in both temporal and spatial manner. Recently, several post-

transcriptional mechanisms that regulate gene expression on the level of the messenger RNA

(mRNA) were discovered to be mediated by RNA. All these pathways, now commonly

referred to as RNA silencing pathways, change mRNA levels and/or influence the level of the

corresponding protein in response to the presence of dsRNA.

1.1.1 A brief history of the discovery of RNA silencing Before RNA silencing was discovered in animals, two phenomena of double-stranded RNA-

induced gene silencing had been already known for some years in plant systems, namely

virus-induced gene silencing (VIGS) and co-suppression. In the latter case it was observed

that after introducing exogenous transgenes into petunias with the goal of altering

pigmentation, the flowers did not deepen flower colour as expected. Instead, the flowers

showed variegated pigmentation, with some lacking pigment completely (Jorgensen, 1990).

This phenomenon, then called co-suppression, indicated that not only the transgenes

themselves were inactive, but also that the added DNA sequences somehow affected the

expression of the endogenous loci. Also, several laboratories found that plants responded to

RNA viruses by targeting viral RNAs for destruction (Ruiz et al., 1998 and chapter 1.3).

Nowadays, it is established that both complex transgene arrays and replicating RNA viruses

generate dsRNA, which triggers silencing as biological response.

A related phenomenon in animals was first discovered in the nematode worm Caenorhabditis

elegans as a response to double-stranded RNA, which resulted in sequence-specific gene

silencing. It was found that in an antisense RNA approach to inhibit gene expression, sense

RNA, that was used as a control, was as effective as antisense RNA for specific silencing of

the target gene (Guo & Kemphues, 1995). In a subsequent study, Fire et al. reasoned that the

Introduction

8

in vitro transcribed RNA preparations used by Guo et al. in their antisense studies were not

purely single-stranded RNA and that dsRNA in these preparations might be the key trigger of

silencing (Fire et al., 1998). Their breakthrough was to test the synergy of sense and antisense

RNAs. They could show that the dsRNA mixture was at least 10-fold more potent as a

silencing trigger than the sense or antisense RNAs alone. This phenomenon was named RNA

interference (RNAi), distinguishing it mechanistically from classical antisense-mediated

suppression. Silencing by dsRNAs had a number of remarkable properties: RNAi could be

induced by injection of dsRNA into the C. elegans gonad or by introduction of dsRNA

through either soaking of worms in dsRNA solution or feeding them with bacteria engineered

to express it (Timmons & Fire, 1998). Genetic and biochemical studies have now confirmed

that RNAi, co-suppression and virus-induced gene silencing share mechanistic similarities,

and that the biological pathways underlying dsRNA-induced gene silencing exist in many, if

not most, eukaryotic organisms (Hannon, 2002; Denli & Hannon, 2003; Meister & Tuschl,

2004; Novina & Sharp, 2004).

The initial observations in C. elegans were consistent with dsRNA-induced gene silencing

operating at the post-transcriptional level. Exposure to dsRNAs resulted in loss of the

complementary mRNAs, and promoter and intronic sequences were largely ineffective as

silencing triggers (Fire et al., 1998). A post-transcriptional mode was also consistent with data

from plant systems in which exposure to dsRNA, for example in the form of an RNA virus,

triggered depletion of mRNA sequences without an apparent effect on the rate of transcription

(Jones et al., 2001). Indeed, viral transcripts themselves were targeted, despite the fact that

they were synthesized cytoplasmically by transcription of RNA genomes (Ruiz et al., 1998).

These studies led to the notion that RNAi induces degradation of complementary mRNAs. In parallel, another mechanism was discovered that operates mainly at the level of protein

synthesis and is now known as miRNA-mediated gene silencing. One of the fist observations

was that lin-4, a gene known to control the timing of C. elegans larval development, does not

code for a protein but instead produces a pair of small RNAs (Lee et al., 1993). It was noticed

that these lin-4 RNAs had antisense complementarity to multiple sites in the 3’-UTR of the

lin-14 gene (Lee et al., 1993; Wightman et al., 1993), which were shown to be important for

regulation of lin-14 by lin-4. The fact that this regulation substantially reduced the amount of

Lin-14 protein without a noticeable change in lin-14 mRNA levels, supported a model in

which the lin-4 RNAs pair to the lin-14 3’-UTR to specify translational repression of the lin-

14 message. The shorter lin-4 RNA is now recognized as the founding member of an

abundant class of tiny regulatory RNAs called microRNAs (miRNAs). The importance of

Introduction

9

miRNA-directed gene regulation became evident as more miRNAs and their regulatory

targets were discovered (reviewed by Bartel, 2004; He & Hannon, 2004; Kim, 2005). The list

of biological processes regulated by miRNAs is already long and is expected to grow in the

future.

A third mechanism that affects gene expression in response to the presence of dsRNA acts on

the transcriptional level, i.e. on the chromatin structure (for review, see Bayne & Allshire,

2005). The initial observation of dsRNA-induced changes of chromatin was DNA

methylation of endogenous sequences that shared homology with viroids in infected plants

(Wassenegger et al., 1994). This observation was later followed by the finding that dsRNA

sharing sequence homology with promoter regions was able to induce gene silencing which

correlated with de novo methylation of promoter sequences (Mette et al., 2000). Later,

studies in fission yeast and flies showed that RNA-directed transcriptional silencing is not

confined to plants (see chapter 1.1.4).

After this brief historical overview, the different types of RNA-mediated gene silencing

mechanisms will be described in more detail in the following three subchapters.

1.1.2 RNA interference

In the course of RNA interference, a longer dsRNA precursor is cleaved into smaller RNAs

that act as guides for the silencing machinery. As a result, the expression of the corresponding

gene is down-regulated due to the degradation of the complementary mRNA (see Figure

1.1A).

The presence of these small RNAs was described for the first time in a study on transgene-

and virus-induced post-transcriptional gene silencing in plants. Hamilton et al. observed the

formation of discrete, small RNAs of around 25 nt that were complementary to the target of

silencing (Hamilton & Baulcombe, 1999). This result initiated biochemical studies to reveal

the relationship between the dsRNA trigger and the resulting small RNAs. For this, Tuschl et

al. developed a cell free Drosophila extract system that recapitulates many of the features of

RNAi (Tuschl et al., 1999). They observed sequence-specific mRNA degradation that is

promoted by dsRNA. Using the same system, it was shown that the dsRNA silencing trigger

is processed to RNA segments of 21-23 nt in length (Zamore et al., 2000). The authors termed

these RNAs small interfering RNAs (siRNAs). Simultaneously, it was found that extracts of

Drosophila cells that were transfected with dsRNA contain a nuclease activity that

Introduction

10

specifically degrades mRNAs homologous to this dsRNA (Hammond et al., 2000). The

siRNAs were shown to be associated with this nuclease, named RNA-induced silencing

complex (RISC, Figure 1.1A). Analysis of the chemical structure of siRNAs showed that they

were double-stranded and contained 5’-phosphorylated termini and 2 nt 3’-overhangs

(Elbashir et al., 2001c). This anatomy pointed towards a role of an RNase III ribonuclease in

generating siRNAs, because this family of nucleases displays specificity for dsRNAs and

generates such termini. The identity of the nuclease that cleaves the dsRNA into siRNAs in

the initiation step was revealed by a biochemical approach to be indeed a member of the

RNase III family of enzymes and was termed Dicer (Bernstein et al., 2001). The family of

Dicer enzymes is evolutionarily conserved, and proteins from many organisms besides

Drosophila, including Arabidopsis, tobacco, C. elegans, mammals and Neurospora have all

been shown to recognize and process dsRNA into siRNAs of a characteristic size for the

relevant species (Bernstein et al., 2001; Ketting et al., 2001). In general, RNase III enzymes

exhibit specificity for dsRNA and are involved in RNA metabolism in organisms ranging

from phages to animals (reviewed by Nicholson, 2003).

Figure 1.1 siRNA- and miRNA-mediated gene silencing pathways (A) Pathways of RNA silencing originating from dsRNA or endogenous hairpin pre-miRNAs. Current models of the structural and topological features of the protein–RNA complexes involved in RNAi are shown. The color coding of the different domains is the same as in B. Red arrows indicate endonucleolytic cleavage. RISC and miRNP (micro ribonucleoprotein particle) are the effector complexes of the siRNA and miRNA pathways, respectively. (B) Typical domain structure of Dicer and Argonaute proteins. Dicer comprises a DEXH helicase, a dsRBD, a PAZ domain and a domain of unknown function (DUF283), in addition to two RNase III domains. Argonaute proteins share a PAZ domain and a Piwi domain. This figure is taken from the review that is reprinted in chapter 4.2 (Lingel & Sattler, 2005).

A B

Introduction

11

Three structural classes of RNase III proteins have been described. The first class is

represented by Escherichia coli RNase III, the second by Drosha and the third by Dicer. The

first class comprises the simplest RNase III proteins, each of which contains one catalytic

endonuclease domain (RNase III domain) and a dsRNA binding domain (dsRBD). Members

of the second class of RNase III proteins comprise Drosha and homologs and contain two

RNase III domains, a dsRBD, and a long N-terminal segment, which is believed to be

involved in protein-protein interaction. Drosha was shown to have a function in miRNA

maturation (see next subchapter). The third class of RNase III enzymes, comprised of Dicer-

like proteins, are ~200 kDa multidomain proteins. Typically, Dicers contain an N-terminal

DEXH-box RNA helicase domain, a domain of unknown function (DUF283), a PAZ domain,

two RNase III domains, and a dsRBD (Figure 1.1B).

Insight into the molecular details of the RNase III activity of Dicer has been derived from the

crystal structure of the endonuclease domain of the RNase III homolog from Aquifex aeolicus

(Blaszczyk et al., 2001). This structure, together with biochemical studies, has established that

the catalytically active enzyme in prokaryotes comprises a homodimer of two RNase III

domains. As the Drosha and Dicer enzymes each contain two RNase III domains (RNase IIIa

and RNase IIIb), it was proposed that they could dimerize intra-molecularly to form an

enzymatically active unit that resembles the RNase III homodimer of A. aeolicus. For Dicer, it

was shown that the RNase IIIa and RNase IIIb domains of monomeric human Dicer do indeed

constitute the active enzyme that contains only a single center for the processing of dsRNA

(Zhang et al., 2004). This single processing center comprises two catalytic sites (one from

each RNase III domain) and cleaves both strands of the dsRNA substrate at one location,

leaving a 3’-overhang of 2 nt.

Several organisms contain more than one Dicer gene, with each Dicer preferentially

processing dsRNAs originating from a specific source. Drosophila has two paralogues: Dicer-

1 preferentially processes miRNA precursors (Lee et al., 2004b), and Dicer-2 is required for

long dsRNA processing (Pham et al., 2004). Dicer-2 is stably associated with the dsRBD-

containing protein R2D2 (Liu et al., 2003).

After the generation of siRNAs, the ribonucleoprotein particles (RNPs) are subsequently

rearranged into the RISC, which was introduced above as being the effector complex of

RNAi. For each siRNA duplex, only one strand, the so-called guide strand, is present in the

active, mature RISC, whereas the other strand, named the passenger strand, is destroyed. The

single-stranded siRNA in RISC guides the sequence-specific degradation of complementary

or near-complementary target mRNAs. The mRNA is cleaved at a position 10 nt from the 5’-

Introduction

12

end of the guiding siRNA (Elbashir et al., 2001b; Nykanen et al., 2001; Martinez et al., 2002).

Subsequently, the 5’-mRNA fragment generated by RISC cleavage is rapidly degraded from

its 3’-end by the exosome, whereas the 3’-fragment is degraded from its 5’-end by XRN1

(Orban & Izaurralde, 2005).

Like for Dicer, the identification of RISC components started with the analyses of the

associated activity in extracts of Drosophila embryos and cultured S2 cells. The first subunit

of RISC to be identified was the siRNA, which recognizes the substrate through Watson-

Crick base pairing (Hammond et al., 2000). Nykanen et al. showed that RISC is formed in

embryo extracts as a precursor complex of ~ 250 kDa which becomes activated upon addition

of ATP to form a ~100 kDa complex that can cleave substrate mRNAs (Nykanen et al., 2001),

whereas RISC from Drosophila S2 cells was purified as a ~500 kDa ribonucleoprotein

(Hammond et al., 2000; Hammond et al., 2001). The first protein component to be identified

as part of RISC was Argonaute2 (Hammond et al., 2001), a homolog of C. elegans Rde-1.

Rde-1 had previously been shown to be required for RNAi in C. elegans (Tabara et al., 1999)

and belongs to the Argonaute family of proteins (see chapter 1.2). In analogy to Argonaute2

in Drosophila, the human homologs eIF2C1 and eIF2C2 were isolated from HeLa cells in an

attempt to purify RISC activity (Martinez et al., 2002). In two subsequent studies on the

composition of RISC, three putative RNA binding proteins, the Drosophila homolog of the

fragile X mental retardation protein (FMRP), dFXR, VIG (Vasa intronic gene) and a

Drosophila homolog of the p68 RNA helicase (Dmp68), were shown to associate with a

tagged version of Ago2 (Caudy et al., 2002; Ishizuka et al., 2002). Tudor-SN, a 100 kDa

protein with 4 N-terminal Staphylococcus nuclease (SN) domains and a hybrid tudor/SN

domain, was another factor demonstrated to be bound to RISC (Caudy et al., 2003). It was

speculated that this protein might be part of the nucleolytic activity of RISC, but later studies

revealed that the actual endonuclease in RISC is a Mg2+ dependent nuclease (Martinez &

Tuschl, 2004; Schwarz et al., 2004), which ruled out Tudor-SN as it is Ca2+ dependent. A set

of structural and biochemical studies showed then that Ago2 itself is the protein responsible

for the endonucleolytic activity that cleaves the mRNA in RISC, now referred to as Slicer (see

Figure 1.1A and chapter 1.2). In addition to cleaving the target mRNA, it was recently

demonstrated that Ago2 also cleaves the passenger strand (Matranga et al., 2005; Rand et al.,

2005), thereby liberating the single-stranded guide strand. These results indicate that Ago2

receives directly the double-stranded siRNA duplex from the RISC assembling machinery

instead of binding a single-stranded siRNA after its separation by a helicase, as it was

proposed before (Tomari et al., 2004a).

Introduction

13

The possibility of specific down-regulation of target genes by RNAi provides a novel and

now frequently used tool to study gene function. In contrast to other functional inhibition

approaches like the use of antibodies or small molecule antagonists, RNAi has much higher

specificity and a less restricted applicability. For example, the transfection of mammalian

cells with siRNAs results in the potent, long-lasting post-transcriptional silencing of the

specific target genes (Caplen et al., 2001; Elbashir et al., 2001a). Beyond their value for

validation of gene function, siRNAs also hold great potential as gene-specific therapeutic

agents.

1.1.3 miRNA pathway

MicroRNAs (miRNAs) are single-stranded RNAs of 19-25 nucleotides in length and are

generated from endogenous hairpin-shaped transcripts by the same RNase III-type enzyme

Dicer (Figure 1.1A, Ambros et al., 2003; Bartel, 2004). miRNAs function as guide molecules

in post-transcriptional gene silencing by base pairing with target mRNAs, which leads to

mRNA cleavage or translational repression (Figure 1.1A). miRNAs have been shown to play

key roles in diverse regulatory pathways, including control of developmental timing,

haematopoietic cell differentiation, apoptosis, cell proliferation and organ development

(reviewed by Ambros, 2004; Bartel, 2004; He & Hannon, 2004; Kim, 2005).

miRNA genes can be clustered in the genome as it is observed for Drosophila (Lagos-

Quintana et al., 2001). In contrast to that, the majority of worm and human miRNA genes are

isolated and not clustered (Lim et al., 2003a; Lim et al., 2003b). Transcription of miRNA

genes is mediated by RNA polymerase II, and the transcripts, called primary miRNAs (pri-

miRNAs) were shown to contain both cap-structures and poly(A)-tails (Cai et al., 2004; Lee

et al., 2004a). The pri-miRNAs are usually several kilobases long and contain one or several

local hairpin structures. The stem loop structure is cleaved by the nuclear RNase III Drosha to

release the precursor miRNA (pre-miRNA) (Lee et al., 2003, Figure 1.1). The hairpin

precursors are usually ~60-80 nt in animals, but the lengths are more variable in plants. The

conserved family of Drosha ribonucleases was introduced in the previous chapter. Drosha

forms a large complex of ~500 kDa in D. melanogaster (Denli et al., 2004) or ~650 kDa in

humans (Gregory et al., 2004; Han et al., 2004). In this complex, called microprocessor,

Drosha interacts with its cofactor, the two dsRBDs containing DiGeorge syndrome critical

region gene 8 (DGCR8) protein in humans, which is also known as Pasha in D. melanogaster

Introduction

14

and C. elegans (Denli et al., 2004; Gregory et al., 2004; Han et al., 2004; Landthaler et al.,

2004). The tertiary structure of pri-miRNAs seems to be the primary determinant for substrate

specificity. Apparently, the Drosha complex can measure the length of the stem, because the

cleavage site is located approximately two helical turns (~22 nucleotides) from the terminal

loop (Zeng et al., 2005). When Drosha excises the hairpin miRNA precursor, a 5’-phosphate

and a 2 nt 3’-overhang remain at the base of the stem (Lee et al., 2003)

After being processed by Drosha, the pre-miRNA is exported out of the nucleus into the

cytoplasm by the nuclear export factor Exportin5 (Exp5) (Yi et al., 2003; Bohnsack et al.,

2004; Lund et al., 2004). Exp5 is a member of the RanGTP-binding transport receptors and

was initially identified as a factor exporting the eukaryotic elongation factor 1A together with

tRNAs (Calado et al., 2002). It was shown that the interaction between Exp5 and the RNA is

direct and involves contacts to the tRNA or miRNA double-stranded stem (Lund et al., 2004).

Following their export from the nucleus, pre-miRNAs are subsequently processed into ~22

nucleotide, usually imperfect miRNA duplexes by the cytoplasmic RNase III Dicer (Grishok

et al., 2001; Hutvagner et al., 2001; Knight & Bass, 2001). In Drosophila, the pre-miRNA

hairpins are cleaved by Dicer-1, which was shown to act in concert with the double-stranded

RNA binding domain protein Loquacious (Forstemann et al., 2005; Saito et al., 2005). As a

result of being cleaved in two subsequent steps by an RNase III enzyme, both ends of the

miRNA duplex have 2 nt 3’-end overhangs and 5’-phosphates. Like in the course of RNAi,

the double-stranded miRNA is incorporated into an Argonaute protein containing complex,

which is referred to as miRNP (Figure 1.1A). For example, the first described human miRNP

showed that miRNAs are present in a complex with the Argonaute eIF2C2, together with the

helicase Gemin3, and Gemin4 (Mourelatos et al., 2002).

MicroRNA loaded miRNPs can direct the downregulation of gene expression by either of two

post-transcriptional mechanisms: mRNA cleavage or translational repression. According to

the current model, the choice of the post-transcriptional mechanisms is not determined by

whether the small silencing RNA originated as an siRNA or a miRNA, but by the degree of

complementarity with the target. Once incorporated into a cytoplasmic RISC, the miRNA will

specify cleavage, if the mRNA has sufficient complementarity to the miRNA. However, it

will repress productive translation if the mRNA does not have sufficient complementarity to

be cleaved but does have a suitable constellation of miRNA complementary sites (Doench et

al., 2003; Zeng et al., 2003). Another determinant is the nature of the Argonaute protein that

is present in the particular miRNP, as it was demonstrated that not all Argonaute proteins are

competent for substrate cleavage (see chapter 1.2). In plants, miRNAs base pair with

Introduction

15

messenger RNA targets by precise or nearly precise complementarity and direct cleavage and

destruction of the target mRNA (Rhoades et al., 2002). When a miRNA guides cleavage, the

cut is at precisely the same site as that seen for siRNA-guided cleavage, i.e., between the

nucleotides pairing to residues 10 and 11 of the miRNA (Llave et al., 2002). In contrast, most

animal miRNAs are imprecisely complementary to their mRNA targets, with targets sites

being generally in the 3’-UTRs of the target mRNA. It was shown that not the complete

length of the miRNA is equally important for this interaction. Positions 2-7 form the critical

“seed”, and base pairing in this region is most important for target recognition. This fact is

also employed for the prediction of miRNA target sites (Enright et al., 2003; Stark et al.,

2003; John et al., 2004; Brennecke et al., 2005). Initially, it was believed that animal miRNAs

act only by inhibiting protein synthesis without affecting mRNA levels. More recently, it was

established that a miRNA can also act to guide the miRNP for cleavage of the cognate mRNA

(Hutvagner & Zamore, 2002), resulting in a change of mRNA levels (Lim et al., 2005). Thus,

it seems that miRNA-guided translational regulation and targeted mRNA degradation are

used simultaneously as natural regulatory mechanisms.

Recent observations concerning the localization of human Argonaute proteins provide a

possible and attractive explanation of the apparent repression of protein synthesis mediated by

miRNAs. It was shown that miRNA loaded Argonaute proteins and their target mRNAs

localize to mammalian processing bodies (P-bodies) (Liu et al., 2005b; Sen & Blau, 2005),

which were found previously to contain untranslated mRNAs and to serve as sites of mRNA

degradation (Ingelfinger et al., 2002; Eystathioy et al., 2003; Sheth & Parker, 2003). Such

localization could potentially explain in part or completely the translational repression and

degradation. Interestingly, one of the marker proteins of P-bodies, GW182 (Eystathioy et al.,

2002; Ingelfinger et al., 2002; Sheth & Parker, 2003), was found in a subsequent study to be

required for miRNA-mediated gene silencing (Rehwinkel et al., 2005). This emphasized the

role of P-body components in the miRNA pathway. Now, Argonaute proteins were shown to

interact physically with GW182 and it was further demonstrated that the structural P-body

integrity is crucial for miRNA-mediated silencing (Liu et al., 2005a).

Introduction

16

1.1.4 RNA-mediated transcriptional gene silencing

As outlined in chapter 1.1.1, RNA silencing mechanisms have been shown to be involved in

heterochromatin formation and transcriptional gene silencing (TGS). RNA-mediated

heterochromatin formation appears to be a natural epigenetic gene regulation mechanism.

This mechanism is believed to be active in most eukaryotes in response to environmental

changes and controls heritable changes in gene expression that are not caused by mutations

(for reviews, see Baulcombe, 2004; Lippman & Martienssen, 2004; Matzke & Birchler, 2005;

Wassenegger, 2005).

Most heterochromatin is found near centromeres and telomeres, and consists of tandem and

satellite repeats, which are sometimes interrupted by transposable elements. The expression of

heterochromatin is silenced by conserved epigenetic modifications of histones (methylation,

acetylation, phosphorylation and ubiquitination) and DNA (methylation) (Jaenisch & Bird,

2003). DNA methylation is absent, or nearly absent, in yeast, flies and nematodes, but a link

between DNA methylation and histone methylation is well established in fungi apart from

yeast, and in animals and plants.

The main fraction of endogenous siRNAs in Arabidopsis corresponds to transposons and

repeats whose histones and DNA are heavily methylated. The initiator dsRNA may be

produced by bidirectional transcription or transcription of inverted repeats. Another source of

dsRNA can be the activity of an RNA-dependent RNA polymerase (RdRP), which makes

dsRNA from a single-stranded RNA template. RNA-dependent RNA polymerases are found

in many organisms including C. elegans and plants, where they were shown to be required for

the amplification and systemic spreading of RNA-mediated silencing (Dalmay et al., 2000;

Sijen et al., 2001). However, RdRP genes have not been identified in insects and mammals.

RNA-directed DNA methylation (RdDM) was the first RNA-guided epigenetic modification

of the genome to be discovered. Originally detected in viroid-infected tobacco plants

(Wassenegger et al., 1994), RdDM was subsequently shown to require a dsRNA that is

processed into small RNAs of 21-24 nt, therefore reinforcing a link with RNAi (Mette et al.,

2000). dsRNAs which contain sequences that are homologous to promoter regions can then

trigger promoter methylation and transcriptional gene silencing. The Arabidopsis Dicer-like 3

protein and the Argonaute proteins Ago4 and Ago1 were shown to be involved in de novo

methylation or maintenance of methylation (Hamilton et al., 2002; Zilberman et al., 2003),

establishing the role of RNAi components in transcriptional gene silencing. The existence of

RdDM in other organisms is not clarified yet.

Introduction

17

In Schizosaccharomyces pombe, heterochromatin consists of simple transposon-derived

tandem arrays, which surround the central-core centromeric region of each chromosome. The

derepression of the centromeric outer-transposon repeats in mutants deficient for components

of the RNAi machinery led to the proposal that small RNAs function as guides to target the

chromatin modifications that are typical of heterochromatin (Volpe et al., 2002; , reviewed by

Grewal & Rice, 2004). Small interfering RNAs (siRNAs) are generated from centromeric

dsRNAs by the RNase III-type endonuclease Dicer. It was demonstrated later that a nuclear

effector complex known as RITS (RNA-induced initiation of transcriptional gene silencing)

complex is involved in targeting chromatin modifications. RITS contains siRNAs that

originate from heterochromatic regions, such as centromeres, and three identified proteins:

Ago1, Chp1 (a centromere-associated chromodomain protein), and Tas3 (a serine-rich protein

that is specific to fission yeast) (Verdel et al., 2004). Binding of the complex enables

recruitment of chromatin modifying proteins. Argonaute proteins were also shown to be

involved in heterochromatin formation in D. melanogaster (see next chapter).

Introduction

18

1.2 Argonaute proteins As described in the previous chapter, Argonaute proteins constitute core components of RNA-

mediated gene silencing mechanisms. They form a highly conserved family of proteins with

members found to be involved in various biological pathways ranging from RNAi to

development and stem cell determination and maintenance (reviewed by Carmell et al., 2002).

In the following subchapters, the functional diversity of Argonaute proteins will be introduced

briefly, followed by a more detailed description of the Drosophila Argonaute1 and

Argonaute2 proteins, with an emphasis on the role of Argonaute2 as Slicer. At the end, recent

structural insights into archaeal Argonaute and Piwi proteins will be provided. As the

structural and functional characterization of the PAZ domain was a main objective of this

thesis, this domain will only be introduced briefly without giving further details (for this, see

chapters 2.1, 2.2 and the discussion).

1.2.1 Argonaute proteins: a conserved protein family with diverse functions

The founding member of the Argonaute protein family is the Argonaute1 protein of

Arabidopsis thaliana. It was identified during an examination of ethyl methanesulfonate

(EMS) mutagenized A. thaliana populations for abnormal leaf morphology (Bohmert et al.,

1998). In this study, it was found that AtAgo1 mutation results in pleiotropical effects on

general plant architecture. Because of very narrow rosette leaves that looked similar to squids,

the authors called the identified genetic locus argonaute.

Argonaute family members constitute ~100 kDa highly basic proteins that contain two

signature domains, namely the PAZ and the Piwi domain (Figure 1.1B). Conservation of

amino acids in the Piwi domain of Argonaute proteins was first mentioned by Cox et al. when

analyzing the Drosophila Piwi (P-element induced wimpy testis) protein (Cox et al., 1998).

They showed that the central and C-terminal region of these proteins are well conserved and

defined a ~40 amino acid long, highly conserved C-terminal region, which they called the

Piwi box. In a subsequent bioinformatics study it was found that the Piwi box is part of a

larger ~300 amino acid domain which was termed Piwi domain (Cerutti et al., 2000). The

authors also demonstrated that the Piwi domain is not restricted to eukaryotes, but is also

Introduction

19

found in prokaryotes. In this initial study, no function could be proposed for the conserved

Piwi domain on the basis of sequence homology.

A central region of the Drosophila Piwi protein was identified to show a high level of

homology with other Argonaute proteins and with a region found in Dicer proteins (Cerutti et

al., 2000; Bernstein et al., 2001). Cerutti et al. named this novel, ~120 amino acid region PAZ

domain, after three proteins containing this domain: Piwi, Argonaute and Zwille/Pinhead. The

alignment indicated that PAZ domains can be grouped in two subfamilies, namely the

Argonaute and the Dicer family. The PAZ domains of the latter one are characterized by an

approximately 30 residue long conserved insert (see Figure S1 in chapter 2.1). Importantly,

the Dicer and Argonaute protein families are the only ones containing this domain. Based on

this fact and on co-immunoprecipitation experiments showing that both Drosophila Ago1 and

Ago2 proteins interact with Dicer (Hammond et al., 2001; Caudy et al., 2002), the function of

the PAZ domain was proposed to be protein-protein interaction, potentially mediating both

homo- and hetero-dimerization (Cerutti et al., 2000). As the PAZ domain was described to be

present only in eukaryotes, it was initially thought that also Argonaute proteins exist only in

eukaryotes. However, recent structural studies on Piwi domain containing proteins of

archaebacteria revealed that Argonaute proteins are also present in prokaryotes.

At the time of this initial bioinformatics analysis of eukaryotic Argonaute protein domains, no

structural information was available for any of the family members or of fragments of them.

Now, after intensive structural studies on the PAZ domain and on Argonaute proteins of

archaebacteria and biochemical investigations into eukaryotic Argonaute proteins, the

molecular function of both conserved domains is established and additional conserved

domains have been identified and annotated, namely the N- and Mid domain (see below).

In addition to the initial report on the developmental phenotype caused by a mutation in the

ago1 gene of A. thaliana, Argonaute family genes have been isolated from several organisms

in genetic screens for mutants that are deficient in RNAi and related phenomena, including

post-transcriptional gene silencing in plants and quelling in fungi. These include the Qde2

protein of Neurospora (Cogoni & Macino, 1997) and the Caenorhabditis elegans Rde-1

protein, of which mutants are strongly resistant to RNAi but are developmentally normal

(Tabara et al., 1999). In contrast, following the discovery of its pleiotropic effect on plant

architecture, Arabidopsis Ago1 was also shown to be necessary for post-transcriptional gene

silencing of transgenes as well as for co-suppression of transgenes and their corresponding

homologous endogenous genes (Fagard et al., 2000). Thereby, it was established that

Introduction

20

Argonaute proteins have functions in both development and RNA-mediated gene silencing.

This participation in two classes of biological functions is reflected by the fact that Argonaute

proteins can be separated according to their sequences into two subclasses: those that

resemble Arabidopsis Ago1, and those that more closely resemble Drosophila Piwi (see

Figure 1.2). Proteins of the Piwi subfamily are believed to be involved primarily in

developmental processes, whereas the Argonaute subfamily constitutes members that are

mainly implicated in RNA silencing mechanisms.

Figure 1.2 Phylogenetic tree of Argonaute proteins. The Ago subfamily is indicated in red, the Piwi subfamily in blue, orphans in black. The tree is based on an alignment done with ClustalW. Bootstrap percentages are indicated at each fork where the percentage is not 100. The tree shows that the Ago and Piwi branches are completely separated. Adapted from Carmell et al, 2002.

The diversity of functions of Argonaute proteins in development and RNA silencing can be

demonstrated by looking at the Drosophila Argonaute proteins. Drosophila contains four

characterized Argonaute proteins, namely Ago1, Ago2, Piwi and Aubergine, plus one

predicted from genomic DNA (Ago3). The first two (and the putative Ago3) belong to the

Argonaute subfamily and the latter two are included in the Piwi subfamily. DmAgo1 and

DmAgo2 function mainly in RNA silencing and will be discussed below. The piwi gene

encodes a nucleoplasmic protein known to be required for self-renewal of stem cells in the

male and female germline and the regulation of their division (Cox et al., 1998; Cox et al.,

2000). Similar to Piwi, Aubergine (also called Sting) is expressed embryonically in the

Introduction

21

presumptive gonad and is characterized by mutations that affect germline development

(Wilson et al., 1996; Schmidt et al., 1999). Aubergine is also responsible for acting post-

transcriptionally to maintain silencing of the X-linked repetitive Stellate locus that is

necessary for male fertility. The Stellate locus is silenced through a homology-dependent

mechanism mediated by an RNA transcribed from paralogous repeats called Suppressor of

Stellate, Su(Ste) (Aravin et al., 2001). Sense and antisense transcripts of Su(Ste) can be

detected and it was suggested that they form double-stranded RNA. Mutations in the

aubergine gene impair silencing by eliminating the short Su(Ste) RNA (Aravin et al., 2004).

In another study, Piwi and Aubergine were also shown to have a role in heterochromatin

silencing and HP1 localization in Drosophila (Pal-Bhadra et al., 2004). The authors

demonstrated loss of silencing as a result of mutations in piwi, aubergine, or spindle-E

(homeless), using tandem mini-white arrays and white transgenes in heterochromatin.

1.2.2 The Drosophila Ago1 and Ago2 proteins

DmArgonaute1 was the first identified Drosophila homologue of Arabidopsis Ago1, isolated

by a genetic approach to search for regulators of Wingless signal transduction. DmAgo1

maternal and zygotic mutant embryos showed developmental defects, with malformation of

the nervous system being the most prominent (Kataoka et al., 2001). Later, Ago1 was shown

to be required for efficient RNAi in Drosophila embryos, with its particular function lying

downstream of Dicer (Williams & Rubin, 2002).

Argonaute2 was identified by a biochemical approach to purify the RNAi effector nuclease

from cultured Drosophila cells (Hammond et al., 2001). After several steps of purification,

the remaining active fraction contained a ribonucleoprotein complex of around 500 kDa. The

identity of Argonaute2 was then revealed by mass spectrometry sequencing. It was also

shown that down-regulation of Ago2 by RNAi could suppress the silencing of a reporter,

indicating that Ago2 plays an important role in the silencing pathway. Soon after that, the

structure of the Ago2 PAZ domain was solved by NMR spectroscopy (see chapter 2.1),

providing the first molecular insight into the function of Argonaute proteins. At the same

time, the solution structure of the DmAgo1 PAZ domain was determined (Yan et al., 2003),

and later, the DmAgo2 PAZ domain was also solved by x-ray crystallography (Song et al.,

2003). Subsequent structural studies on the interaction of PAZ domains with nucleic acids

revealed the basis for RNA binding and specificity of the PAZ domain (chapter 2.2, and Ma et

Introduction

22

al., 2004), establishing a detailed characterization of the molecular function of this conserved

domain.

In an attempt to distinguish the functions performed by Ago1- and Ago2-associated RISCs in

Drosophila, Okamura et al. did a detailed study of RNA silencing pathways in embryos

lacking Ago1 or Ago2, complemented with experiments in cultured cells (Okamura et al.,

2004). They showed that Ago2 is an essential component for the siRNA-directed RNA

interference response and is required for the unwinding of the siRNA duplex and in

consequence for the assembly of the siRNA into RISC in Drosophila embryos. However,

Drosophila embryos lacking Ago2, which were siRNA-directed RNAi-defective, were still

capable of miRNA-directed target RNA cleavage. In contrast, Ago1, which is dispensable for

siRNA-directed target RNA cleavage, was shown to be required for mature miRNA

production that results in miRNA-directed RNA cleavage. This pointed already into the

direction that Ago2 might be very closely associated with the mRNA cleaving activity in

RISC and that Ago1 is more implicated in miRNA-mediated pathways.

1.2.3 Drosophila and human Argonaute2 act as Slicer

To figure out the relationship between DmAgo2 and the endonucleolytic activity, a stringent

biochemical purification of RISC activity from Drosophila Schneider cells was performed,

and Ago2 was found the be the only protein component present in the functional RISC that

was purified to homogeneity (Rand et al., 2004).

These results on DmAgo2 were in agreement with simultaneously published key studies

which presented compelling evidence that human Argonaute2 (HsAgo2) is the previously

unidentified endonuclease in human cells (Liu et al., 2004; Meister et al., 2004; Song et al.,

2004). Human Ago2 (also known as eIF2C2) has been shown before to be part of purified

RISC activity in HeLa cells, together with single-stranded siRNAs (Martinez et al., 2002) and

to be associated with miRNA containing ribonucleoprotein complexes (Mourelatos et al.,

2002). To get further insight into the specific functions of human Argonaute proteins in

RNAi, Liu et al. investigated their roles concerning RISC activity (Liu et al., 2004). By

transfection and immunoprecipitation experiments followed by an assay for siRNA-guided

cleavage of a complementary synthetic mRNA, they found that only HsAgo2-associated

RISC was able to catalyze the cleavage. This was true even though all Argonaute proteins

bound the co-transfected siRNA. The authors could also reconstitute RISC-mediated cleavage

Introduction

23

activity in vitro by adding an siRNA to immunopurified tagged HsAgo2, which was isolated

from cells that were not co-transfected with siRNAs. Mutational analysis, based on the

accompanying study describing the crystal structure of an Argonaute protein from the archaea

Pyrococcus furiosus (Song et al., 2004), provided strong evidence that the cleavage activity

was an intrinsic feature of Ago2, making it unlikely that the endonucleolytic activity was just

contributed by a co-purified Ago2-associated factor. Similar results providing additional

support for a role of HsAgo2 in cleavage were obtained by Tuschl and coworkers. They

demonstrated that purified Ago2 complexes, but not Ago1, Ago3 or Ago4 complexes, had

RISC activity (Meister et al., 2004).

The final biochemical evidence that Ago2 constitutes the Slicer activity was provided by a

recent reconstitution experiment (Rivas et al., 2005). The authors could show that human

Argonaute2, expressed recombinantly in E. coli and purified to homogeneity, can combine

with a small interfering RNA to form a minimal RISC that accurately cleaves substrate

RNAs.

1.2.4 Structures of archaeal Argonaute and Piwi proteins As described above, first structural insights into the molecular function of Argonaute proteins

were obtained from studies on the PAZ domain, which could be expressed recombinantly and

yielded soluble protein. Attempts of many laboratories to express full length eukaryotic

Argonaute proteins or other fragments besides the PAZ domain in amounts necessary for

structure determination were not successful up to this day. Being aware of these difficulties,

several labs turned successfully to prokaryotic homologs with unknown function (reviewed

by Hall, 2005).

The structure of the Argonaute protein of Pyrococcus furiosus and the molecular basis

of Slicer activity

To gain insight into the possible function of the larger signature domain of Argonaute

proteins, namely the Piwi domain, Song et al. performed a structural analysis of a

multidomain protein from the archaebacterium Pyrococcus furiosus (Song et al., 2004). It has

been predicted by sequence homology that this protein contains a Piwi domain. Remarkably,

the crystal structure of the protein revealed that it also comprises an unanticipated PAZ

domain (Figure 1.3). Thus, containing a Piwi and a PAZ domain, the protein represents a

bona fide Argonaute protein (now known as PfAgo), indicating that this protein family is not

Introduction

24

restricted to eukaryotes. PfAgo consists of four distinct domains held together by an

interdomain connector: an N-terminal domain, a PAZ domain, a middle (Mid) domain

structurally related to the Lac repressor (Friedman et al., 1995), and a C-terminal Piwi domain

(Figure 1.3). The N-terminal, middle and Piwi domains form a crescent-shaped structure, with

the Piwi domain sitting in the center of the crescent (Figure 1.3B). The N-terminal domain

forms a “stalk” that holds the PAZ domain on top of the concave face of the crescent, placing

it in opposition to the Piwi domain. Overall, the structure is quite compact, considering the

length of the polypeptide chain (i.e. 770 residues).

Figure 1.3 Crystal structure of the Pyrococcus furiosusArgonaute protein. (A) Schematic diagram of the domain organisation. Domain borders are indicated by the respective amino acid numbers. (B) Ribbon representation of PfAgo showing the N-terminal domain (gray), the PAZ domain (blue), the middle domain (red) and the Piwi domain (green). Putative active site residues in the Piwi domain and conserved residues in the PAZ domain are drawn in stick representation (see also Figure 1.4 and Figure 3.3). This figure and all other figures displaying molecular structures were prepared with MOLMOL (Koradi et al., 1996).

B

A

Despite very low sequence identity (<10%), the PAZ-like domain of PfAgo protein

superimposes well with the known structures of eukaryotic PAZ domains. The similarities

and differences will be presented in the discussion (see chapter 3.1).

The structure of the full length PfAgo also provided the first structural view on a Piwi

domain. The Piwi domain of PfAgo shows a striking structural similarity and conserved

secondary structure topology with the family of RNase H enzymes. This class of enzymes is

known to cleave the RNA strand of an RNA:DNA hybrid (Yang & Steitz, 1995). The RNase

H fold consists of a five-stranded mixed β-sheet surrounded by α-helices on both sides (see

Figure 1.4).

Introduction

25

Figure 1.4 The Piwi domain of PfAgo resembles the catalytic domain of RNase H endonucleases. (A) Crystal structure of RNase H1 from E. coli, showing side chains of the catalytic DDE triad residues and a bound Mg2+ ion. Conserved and unconserved secondary structure elements are colored green and yellow, respectively. (B) The Piwi domain in the crystal structure of PfAgo. Side chains of putative residues of a catalytic DDE triad and Arg627 are shown. Structural elements are colored as in (A). This figure is taken from the review reprinted in chapter 4.2 (Lingel & Sattler, 2005).

B A

The active site of RNase H enzymes is comprised of a so-called DDE motif, three acidic

amino acids with the side chain carboxylates positioned to catalyze the cleavage reaction. The

reaction is known to require the presence of divalent cations, such as Mg2+ or Mn2+ (Steitz &

Steitz, 1993; Kanaya & Ikehara, 1995; Chapados et al., 2001). The reaction mechanism

differs from most ribonucleases but resembles deoxyribonucleases, leaving 3’-OH and 5’-

phosphate termini (Wintersberger, 1990). Notably, Slicer activity results in the same products

(Martinez & Tuschl, 2004; Schwarz et al., 2004). In the Piwi domain of the PfAgo protein,

two of the three potentially catalytic carboxylates (D558 and D628) are in equivalent

secondary structure positions as those found in the RNase H fold (Figure 1.4). Mutagenesis of

the equivalent residues in HsAgo2 to alanine eliminated mRNA cleavage activity, as

described above (Liu et al., 2004). Song et al. suggested that the active site may be completed

by a third carboxylate from a conserved glutamate (E635), even though this residue is located

in a different sequence position in bacterial proteins (e.g. E48 in E. coli RNase H1). The

identity of the third carboxylate was revealed in a subsequent study on PfAgo, which made

use of PfAgo crystals that were soaked in with Mn2+ solution. The Mn2+ ion can replace the

Introduction

26

Mg2+ expected to be coordinated at the active site but is easier to detect crystallographically.

This experiment indicated that the third coordinating side chain is H745 (Figure 1.5). In

addition, the equivalent residue in an Argonaute protein of Aquifex aeolicus (AaAgo), D683,

was coordinated by a Ca2+ ion along with the two other conserved acidic residues in the

crystal structure of this protein (Yuan et al., 2005, see below). Mutation of the corresponding

residue in HsAgo2, H807, also eliminated mRNA cleavage activity, confirming its

importance for catalytic activity (Rivas et al., 2005).

Notably, not all Ago proteins are active as endonucleases, and the identification of the active

site residues of this family of proteins explains why some Ago proteins are not cleavage

competent. For example, in cleavage incompetent HsAgo1, the catalytic histidine is replaced

by arginine and mutation of the equivalent histidine in cleavage competent HsAgo2 to

arginine inactivates it (Liu et al., 2004; Rivas et al., 2005). Similarly, in cleavage incompetent

HsAgo 4, one of the catalytic aspartates is replaced by glycine.

Figure 1.5 Superimposition of the active sites of PfAgo (blue) and Bh-RNase HC (green). The RNA strand from the DNA:RNA hybrid in the Bh-RNase HC structure is shown in pink. Water molecules from the PfAgo (red) and Bh-RNase HC (dark pink) are shown as spheres. Gray dashed lines indicatecoordination of metals A and B in Bh-RNase HC. Adapted from Hall, 2005.

Additional insight into the catalytic mechanism of Ago proteins was provided by the recent

crystal structures of the RNase H of Bacillus halodurans (Bh-RNase HC) with its DNA:RNA

hybrid substrate (Nowotny et al., 2005). The authors found that the DNA:RNA hybrid binds

at the active site of Bh-RNase HC via two divalent cations. One is equivalent to the Mn2+

coordinated by two aspartates and the histidine in Ago proteins. This metal is also coordinated

by two water molecules, one that is positioned ~3 Å from the scissile phosphate group and is

predicted to be the nucleophile (metal A in Figure 1.5). Equivalent water molecules are found

coordinating the Mn2+ ion in the PfAgo structure. A third water molecule is coordinated to the

Introduction

27

Mn2+ and this corresponds to a non-bridging oxygen atom in the phosphate group preceding

the scissile phosphate.

A second divalent cation (metal B in Figure 1.5) is coordinated by an aspartate that

coordinates both metals, a glutamate, an asparagine, a water molecule, and the RNA. It is

more difficult to predict where this second metal binds in the PfAgo structure. Two possible

metal ligands are E635 and R627, although they are more distant from the first metal ion than

the ligands for the second metal in the Bh-RNase HC structure (Nowotny et al., 2005). The

two-metal ion catalytic mechanism is likely to be conserved among RNase H family enzymes,

but identification of the ligands for the second metal ion site may require a crystal structure of

an Ago protein with RNA bound at the active site.

The structures of the Piwi protein of Archaeoglobus fulgidus and the Argonaute protein

of Aquifex aeolicus: insights into 5’-end recognition

The crystal structure of a simpler archaeal, Piwi domain containing protein from

Archaeoglobus fulgidus was solved by Parker et al. They described first the structure of the

free protein together with in vitro binding data (Parker et al., 2004). Because of the lack of a

PAZ domain, the protein is not considered to be an Argonaute protein and was called AfPiwi.

In comparison to PfAgo, AfPiwi lacks also the N-terminal domain and thereby represents an

N-terminally truncated version of the PfAgo. The structure confirmed the similarity of the

Piwi domain fold with the RNase H fold and showed the presence of carboxylate side chains

in the right position in a putative active site. Most importantly, the structure revealed that the

side chains of the C-terminal residues contribute to the hydrophobic core of the protein.

Interestingly, the C-terminal four residues of Ago and Piwi proteins are highly conserved as

aliphatic and aromatic residues. Moreover, the C-terminal carboxylate group projects onto the

molecular surface, lying at the interface between the middle and the Piwi domain. There, the

carboxylate is bound to a well-ordered metal ion, which is coordinated also by other side

chains of highly conserved amino acids. Binding data revealed that AfPiwi forms a distinct

complex with an siRNA-like duplex. A mutation in the conserved region at the C-terminus

reduced the binding affinity significantly, suggesting an important role for this region in RNA

binding.

This was confirmed by two subsequent structural studies of the AfPiwi protein in complex

with a double-stranded RNA solved simultaneously (Ma et al., 2005; Parker et al., 2005).

Both structures revealed the molecular basis of RNA binding and of 5’-end recognition. For

several reasons, the 5’-ends of siRNAs and miRNAs are important for mRNA target

Introduction

28

recognition and definition of the site of RNA cleavage. The main reasons are that the

positions 2-7 of miRNAs are the most critical region for target recognition (see chapter 1.1.3)

and that a phosphate group at the 5’-end of the siRNA strand that forms the guide for mRNA

target recognition has been shown to be required for efficient RNAi and for proper cleavage

site selection (Nykanen et al., 2001; Schwarz et al., 2002). The crystal structures of AfPiwi

with siRNA-like duplexes could provide a structural perspective on the importance of the 5’-

end. In both structures, the 5’-nucleotide of the guide siRNA is unpaired and bound in a

pocket made up primarily of residues in the middle domain. The first base, U1 or A1, forms a

stacking interaction with Y123, a conserved aromatic position in Ago proteins (Figure 1.6).

The unpairing of the first base explains its relative unimportance for specifying miRNA

targets. The structures contain a highly conserved metal binding site that anchors the 5’-

nucleotide of the guide RNA. The 5’-phosphate group of the guide siRNA is bound directly

by side chains of four residues that are invariant in Ago proteins (Y123, K127, Q137, and

K163) and by the main chain nitrogen of F138 (Figure 1.6).

Figure 1.6 5’-phosphate binding pocket of AfPiwi. A divalent metal ion (M) is coordinated by the 5’-phosphate group (5’ P), Q159, the terminal carboxylate of L427, the third phosphate group of the guide RNA (yellow) and a water molecule (wat). Black dashed lines: metal coordination, red dashed lines: hydrogen bonds. Adapted from Hall, 2005.

The phosphate group is also bound by a divalent cation that is coordinated by Q159, the C-

terminal carboxylate, a water molecule, and the third phosphate group of the guide RNA,

confirming the previously observed importance of the C-terminus (see above). Mutation of

the corresponding amino acids that contact the 5’-phosphate in human Ago2 resulted in

attenuated mRNA cleaving activity, suggesting that 5’-end binding is a conserved function

Introduction

29

(Ma et al., 2005). Thus, the binding of the 5’-phosphate group observed in the AfPiwi

structures appears to represent a good model for the RNAi effector complex.

As mentioned above, a crystal structure of a full length Argonaute protein from Aquifex

aeolicus (AaAgo) in complex with a single-stranded rU8 oligonucleotide was determined

recently and reported together with binding and cleavage studies (Yuan et al., 2005).

Although the rU8 was required for successful crystallization, it is disordered and therefore not

visible in the electron density map. The structure confirms the domain organisation of the

PfAgo protein. It shows a bilobal architecture, with one lobe comprising the N-terminal and

PAZ domain and the other one comprising the middle and Piwi domain. The PAZ domain

observed in this structure superimposes with the structures of eukaryotic PAZ domains, but is

most similar to the PAZ domain of PfAgo (see chapter 3.1 for discussion). The Piwi core fold

of AaAgo is closely related to the RNase HII domain (Lai et al., 2000). A deep basic surface

pocket is lined by highly conserved residues, including an aromatic residue, a patch of basic

residues and polar residues from the Mid domain, supplemented by the C-terminal

carboxylate and conserved aromatic residue, which are contributed to the Mid-Piwi interface

by the Piwi domain.

This highly conserved basic pocket is localized to the same region that was identified in the

crystal structures of AfPiwi-RNA complexes to anchor the phosphorylated 5’-end of the guide

RNA. This implies that this pocket in AaAgo is also likely to be the site for 5’-end recognition

of the guide strand in all Argonaute proteins. In addition to describing the crystal structure,

the authors found that AaAgo could direct the cleavage of the RNA target when pre-incubated

with DNA or RNA, showing for the first time a ribonuclease activity for an archaeal

Argonaute protein. Whether this demonstration of DNA-directed cleavage of RNA is

reflecting the function of prokaryotic Argonaute proteins in vivo remains to be established. Up

to now, no regulatory mechanism similar to eukaryotic RNA silencing was reported for

prokaryotes.

Introduction

30

1.3 Viral suppression of RNA silencing RNA interference is a post-transcriptional gene regulation pathway (see chapter 1.1) that is

thought to have originated from an ancient cellular defense mechanism against viral and other

endogenous double-stranded RNAs in the cell, like transposons with dsRNA intermediates

(Plasterk, 2002). Early studies in plants and the concomitant discovery of plant viral

suppressors of RNA silencing already pointed towards a role of RNAi in antiviral defense.

Most plant viruses have positive-sense, single-stranded RNA genomes (Hull, 2002). During

their replication cycle, there are dsRNA intermediates present which are strong inducers of

RNA silencing and are believed to be the precursors of viral siRNAs (Ahlquist, 2002). Upon

incorporation of these viral-derived siRNAs into RISC, the viral RNA is specifically targeted

for degradation (Baulcombe, 1999; Waterhouse et al., 1999), allowing the plant to recover

from viral infection. Thus, it is not surprising that many plant viruses encode suppressors of

RNA silencing (reviewed by Li & Ding, 2001; Voinnet, 2001; Ding et al., 2004; Roth et al.,

2004; Voinnet, 2005), which can serve as an example of co-evolution. The expression of

these suppressors is believed to be a means for viruses to overcome the host antiviral

silencing defense.

The first viral suppressors of RNA silencing reported were the helper component proteinase

HC-Pro and the 2b protein encoded by potyviruses and cucumoviruses, respectively

(Anandalakshmi et al., 1998; Brigneti et al., 1998; Kasschau & Carrington, 1998; Li et al.,

1999), which were both identified as pathogenicity determinants of synergistic viral diseases

(Ding et al., 1996; Pruss et al., 1997). This provided a rationale for identifying novel viral-

encoded suppressors, and the re-investigation of other factors of this type revealed that several

effectively inhibit RNA-mediated silencing (Voinnet et al., 1999).

For the identification of viral suppressors, a transient expression assay using Agrobacterium

co-infiltration is used, as it provides a rapid and easy test of suppressor activity (Llave et al.,

2000; Voinnet et al., 2000). Agrobacterium tumefaciens is a commonly used bacterial

pathogen of plants. In the assay, two different strains are used: one strain is used to induce

RNA silencing of a reporter gene (usually green fluorescent protein (GFP)), and another strain

is used to express the candidate suppressor. Briefly, the strategy is to co-infiltrate mixtures of

the two bacterial strains into a plant leaf (usually tobacco leafs) and then examine the

infiltrated patch over time for silencing of the reporter. Most of the known suppressors were

identified by this simple assay (Table 1.1).

Introduction

31

Viral family Virus Suppressors Other functions

Positive-strand RNA viruses in plants Carmovirus Turnip Crinkle virus p38 Coat protein

Cucumovirus Cucumber Mosaic virus Tomato Aspermy virus 2b Host-specific movement

Closterovirus Beet Yellows virus Citrus Tristeza virus

p21 p20 p23 CP

Replication enhancer Replication enhancer Nucleic acid binding Coat protein

Comovirus Cowpea Mosaic virus S protein Small coat protein

Hordeivirus Barley Yellow Mosaic virus γb Replication enhancer, movement, seed transmission, pathogenicity determinant

Pecluvirus Peanut Clump virus p15 Movement

Polerovirus Beet Western Yellows virus Cucurbit Aphid-borne Yellows virus P0 Pathogenicity determinant

Potexvirus Potato virus X p25 Movement

Potyvirus Potato virus Y Tobacco Etch virus Turnip Yellow virus

HC-Pro

Movement, polyprotein processing, aphid transmission, pathogenicity determinant

Sobernovirus Rice Yellow Mottle virus P1 Movement, pathogenicity determinant

Tombusvirus Tomato Bushy Stunt virus Cymbidium Ringspot virus Carnation Italian Singspot virus

p19 Movement, pathogenicity determinant

Tobamovirus Tobacco Mosaic virus Tomato Mosaic virus p30 Replication

Tymovirus Turnip Yellow Mosaic virus p69 Movement, pathogenicity determinant

Negative-strand RNA viruses in plants Tospovirus Tomato Spotted Wilt virus NSs Pathogenicity determinant Tenuivirus Rice Hoja Blanca virus NS3 Unknown DNA viruses in plants

Geminivirus African Cassava Mosaic virus Tomato Yellow Leaf Curl virus Mungbean Yellow Mosaic virus

AC2 C2 C2

Transcriptional activator protein (TrAP)

Positive-strand RNA viruses in animals

Nodavirus Flock House virus Nodamura virus B2 Plaque formation

Negative-strand RNA viruses in animals

Orthomyxovirus Influenza virus A* NS1 Poly(A) binding, inhibitor of mRNA export, PKR inhibitor

DNA viruses in animals Adenovirus Adenovirus VA1 RNA PKR inhibitor Poxvirus Vaccinia virus* E3L PKR inhibitor

Table 1.1 RNA silencing suppressors encoded by plant, insect and vertebrate viruses. * These have only been demonstrated in heterologous systems (insects and plants). HC-Pro, helper component proteinase; PKR, RNA-dependent protein kinase. Adapted from Voinnet et al., 2005.

Many unrelated viral proteins have evolved silencing suppressor activities in addition to their

other functions, contributing to the remarkable diversity of these factors, which have now

Introduction

32

been identified for almost all types of plant viruses (see Table 1.1). It would not be surprising

if all plant viruses are found to encode at least one suppressor of RNA silencing, provided that

each of the viral proteins is assayed for possible suppression of both intra- and inter-cellular

(systemic) RNA silencing.

Silencing suppression has also been documented in insect cells and was discovered through

the related expression strategies and functional similarities of the Cucumber Mosaic virus

(CMV) suppressor 2b and the B2 protein of Flock House virus (FHV) (Table 1.1, Li et al.,

2002). FHV is an animal virus and belongs to the Nodaviridae family of non-enveloped

icosahedral viruses. FHV infects mainly insect cells, but replicates also in yeast, plant and

mammalian cells. All members of the Nodaviridae family have a bipartite genome comprised

of a positive-sense, single-stranded RNA (Johnson et al., 2001). The first genome segment of

FHV, RNA1, encodes the entire viral contribution to the viral RNA-dependent RNA

polymerase and replicates autonomously in the absence of RNA2, which depends on RNA1

for replication (Ball & Johnson, 1999). During replication, each RNA1 synthesizes a

subgenomic RNA3 that corresponds to the 3’-end of RNA1 and encodes two open reading

frames (ORFs, B1 and B2, see Figure 1.7).

Figure 1.7 Schematic representation of the nodavirus genome organization showing the two segments of the bipartite RNA genome (RNA 1 and 2) and the subgenomic RNA3. ORFs are shown in white and gray boxes.

Deletion of the 106 amino acids encoding B2 ORF from FHV results in a drastic loss of virus

accumulation in Drosophila S2 cells, which can be rescued by decreasing the cellular content

of Ago2 (Li et al., 2002). Therefore, B2 suppresses the effect of the Ago2-dependent silencing

response that normally restricts FHV accumulation. A subsequent study showed complete

replication of the FHV RNA genome in C. elegans (Lu et al., 2005). Similarly to the Ago2

dependence in Drosophila cells, replication of FHV in C. elegans triggered potent antiviral

silencing that required the Argonaute protein Rde-1, which is essential for RNAi mediated by

small interfering RNAs (siRNAs) but not by microRNAs. C. elegans was capable of rapid

virus clearance in the absence of FHV B2 protein but not when B2 was present. Thus, B2 acts

as a broad-spectrum RNAi inhibitor.

Introduction

33

The B2 protein of a related Nodavirus family member, Nodamura virus (NoV), which is

infectious for insect and mammalian hosts, has been shown to block RNAi in mammalian

cells (Sullivan & Ganem, 2005). Both the FHV and NoV B2 proteins bind to siRNAs and

longer dsRNAs, and therefore it was proposed that they may sequester siRNAs and prevent

the processing of long dsRNAs into siRNAs by the host RNaseIII-like enzyme Dicer (Lu et

al., 2005; Sullivan & Ganem, 2005).

Silencing suppression clearly plays a role in establishing virus infections as most of these

suppressors were previously shown to be essential for systemic virus spread in their hosts.

The molecular mechanism is unknown for most of the identified suppressors, with the

exception of p19 and p21, which will be introduced below. Structural and biochemical studies

have been initiated to reveal the molecular function of these interesting proteins, including

FHV B2 (see chapter 2.4).

Size selective recognition of siRNAs by the tombusvirus p19 protein

The linear, positive-sense ssRNA genome of tombusviruses encodes a 19 kDa protein that

was shown to be a suppressor of RNA silencing (Voinnet et al., 1999; Qiu et al., 2002; Qu &

Morris, 2002; Silhavy et al., 2002) and to bind RNA silencing-generated and synthetic 21 nt

siRNAs in vitro under stoichiometric conditions (Silhavy et al., 2002). Therefore, it was

suggested that p19 binds siRNAs and sequesters them, thereby preventing their incorporation

into RISC. This was confirmed by crystal structures of the p19 protein bound to siRNAs

(Vargason et al., 2003; Ye et al., 2003), which will be described in more detail.

The p19 homodimer bound to siRNAs revealed a C-terminal domain comprised of a four-

stranded β-sheet flanked on one side by three α-helices (Figure 1.8). Two N-terminal helices

are docked onto the C-terminal domain by salt bridges. The secondary structure topology of

the C-terminal domain corresponds to a circular permutation of the ribosomal L1 protein,

which binds dsRNA as a monomer with its β-sheet surface (Nikulin et al., 2003).

Homodimeric p19 forms an extended concave anti-parallel β-sheet surface, which mediates

multiple interactions with the sugar phosphate backbone of the double-stranded helical region

of the siRNA (Figure 1.8). Recognition of the siRNA stem by the β-sheet surface is distinct

from the mode of binding of dsRBDs, in which mainly loop regions interact with the stem of

the RNA duplex (see chapter 3.3). The p19 homodimer binds the siRNA with very high

affinity (KD ~0.17 nM) (Vargason et al., 2003). The siRNA recognition is sequence-

unspecific and employs hydrogen bonds, electrostatic contacts (involving phosphate and 2’-

Introduction

34

hydroxyl groups), and aromatic-base stacking interactions. A number of conserved serine and

threonine residues mediate key interactions with 2’-OH groups in the minor groove of the

RNA duplex, with the axis of the RNA helix being bent ~40° towards the protein.

Figure 1.8 Crystal structure of the p19 homodimer bound to a 21 nt siRNA (PDB code 1RPU). The two N-terminal helices of p19, which mediate key interactions with the terminal base pairs and contain two important tryptophan residues, are colored orange. The inset shows essential interactions that contribute to the size-selective RNA binding by the p19 homodimer; these interactions occur symmetrically at both ends of the 19 bp RNA stem. This figure is taken from the review reprinted in chapter 4.2 (Lingel & Sattler, 2005).

Size selectivity for an RNA helix of 19 bp is defined by symmetric interactions of the highly

conserved residue W42 and the more variable residue W39 with the 5’- and 3’-terminal bases,

respectively, at both ends of the RNA duplex region of a 21 nt siRNA (Figure 1.8). Consistent

with this, p19 does not interact with single-stranded nucleic acids or dsDNA, and has reduced

binding affinity for siRNAs with longer or shorter double-stranded regions (Vargason et al.,

2003; Ye et al., 2003). In one of the structures, the 2 nt single-stranded 3’-overhang makes

few interactions with p19 (Vargason et al., 2003), whereas in the other structure no electron

density is observed for the two nucleotides at the 3’-end (Ye et al., 2003). This suggests that

recognition of the 2 nt overhang is not essential, a proposal that is consistent with the

comparable binding affinities of p19 for a 19 bp RNA duplex with and without a 2 nt single-

stranded 3’-overhang (Vargason et al., 2003; Ye et al., 2003). By contrast, a hydrogen bond

between the indole amide of the conserved W42 and the 5’-phosphate seems to be important

for binding, since the affinity of p19 for an siRNA lacking a 5’-phosphate is reduced 23-fold

(Vargason et al., 2003). The major determinant of siRNA specificity is thus size selective

recognition of the 19 bp duplex - a binding mode that is unprecedented and provides a

fascinating example of structure-specific recognition of the key mediators of RNAi.

Introduction

35

General nucleic acid binding of the Beet yellow virus p21 protein

Recently, the crystal structure of a second suppressor of RNA silencing was solved by Ye et

al. (Ye & Patel, 2005), namely of the p21 protein of Beet Yellow virus (BYV). This virus is a

positive-sense, single-stranded RNA virus that comprises a large 15.5 kb genome and is a

member of the genus Closterovirus. The p21 protein, one of nine proteins encoded in the

BYV genome, was the only one that suppressed the silencing of green fluorescence protein

(GFP) induced by GFP dsRNA in the transitive agrobacterium infiltration experiment (Reed

et al., 2003). Additional biochemical studies showed that p21 binds siRNA duplexes both in

vitro and in vivo, but not single-stranded miRNAs (Chapman et al., 2004). It was also shown

that p21 has no strict binding specificity for siRNA duplexes.

Ye et al. have determined the crystal structure of p21 in the free state. The structure shows

that p21 forms an octameric ring of previously unknown topology, in which individual p21

monomers adopt an all α-helical structure (Figure 1.9).

Figure 1.9 Ribbon representation of the entire p21 octamer. Neighboring monomers are alternatively colored with green and red. The dots indicate disordered regions. Adapted from Ye & Patel, 2005.

Interestingly, the eight p21 monomer subunits associate into the closed ring structure through

two types of inter-monomer interactions, namely a head-to-head and a tail-to-tail

arrangement. Electrostatic potential analysis revealed that the ring surface displays a highly

polarized distribution of charges: negative charges dominate the outer surface of the ring,

while positive charges dominate the inner surface of the ring. Therefore, the inner surface was

proposed to bind RNA via electrostatic interactions, but mutations of conserved residues in

this putative binding site gave no conclusive results. Electromobility shift analysis revealed

that p21 is a general nucleic acid binding protein. Thus, p21 is very different from p19 in

terms of structure and RNA binding. The mechanism of RNAi suppression by p21 is therefore

not revealed by the structure, because it can interact with all kinds of nucleic acids. However,

it has been shown that transgenically expressed p21 can co-immunoprecipitate siRNAs in vivo

(Chapman et al., 2004).

Introduction

36

1.4 NMR spectroscopy: a tool to understand the function of biomolecules

a

ery important tool in the repertoire of physical methods to study biomolecules (Wüthrich,

986). It is employed mainly to determine the three dimensional structure of both proteins and

tic resonance

agnetic resonance in condensed phase was first detected by Bloch and Purcell in

asis for observation of nuclear

agnetism by NMR spectroscopy is the nuclear spin angular momentum I (representing a

Over the last 20 years, nuclear magnetic resonance (NMR) spectroscopy has developed into

v

1

nucleic acids at atomic resolution, to study their dynamics in solution and to characterize

molecular interactions (Krishna & Berliner, 2003). Unlike in x-ray crystallography, the

investigated sample is usually in liquid phase and not in solid phase, which makes it possible

to study the structural and dynamical parameters in a close to physiological environment. The

obtained results are thus essentially complementary to crystallographic investigations. The

main disadvantage of NMR spectroscopy is the size limit concerning the biomolecule of

interest. This limitation is slowly pushed towards larger molecules but further improvement

still remains one of the major tasks in method development in modern NMR spectroscopy.

In the course of this Ph.D. thesis, three main application of biomolecular NMR were

employed: solution structure determination, characterization of binding events and molecular

recognition, and analysis of dynamics. All three topics will be introduced in the following

chapters, in addition to some basic concepts.

1.4.1 The basic principle of nuclear magne Nuclear m

1946 (Bloch et al., 1946; Purcell et al., 1946). The physical b

m

vector, indicated in bold), a purely quantum mechanic property that has no analog in classical

mechanics. The magnitude of I is given by equation

( )[ ]122 +=⋅= IIhIII

in which I is the angular momentum quantum number. The value of I is specified by the

physical nature of the nucleus. Table 1.2 (adapted from et al., 1996) gives an

overview of angular momentum quantum numbers I, the gyromagnetic ratios γ (see below)

Cavanagh

and on the natural isotopic abundance of the respective nuclei that are commonly used in

NMR spectroscopy of biomolecules. Most commonly, only nuclei with I=1/2 are employed

Introduction

37

Nucleus I γ (T·s)-1 Natural abundance (%) 1H 1/2 2.6752 · 108 99.98 2H 13

1 4.107 · 107

1

--

100.00 23 100.00

0.02 C

41/2 6.728 · 107 1.11

N 5

1 1.934 · 107

7 99.64 1 N 1/2

5/2 2.712 · 10

70.36

17O 3.628 · 108

0.04 19F 1/2 2.5181 · 10

7Na 3/2 7.080 · 10831P 1/2 1.0841 · 10 100.00

by biomolecular NMR, which includes the nuclei 1H, 13C, 15N and 31P. Because of the low

natural abundance of 13C and 15N (see Table 1.2), proteins are usually uniformly labelled to b

Table Properties selected nuclei. The angular monumbe the gyrom ic ratio tural isotopic abundance for nuclei of particular importance in biomolecular NMR spectroscopy are shown.

1.2 of mentum quantum r I, agnet γ, and the na

nt of the nuclear spin angular momentum is

e

enriched in these isotopes.

The value of the z compone

mIz h=

in which m is the magnetic quantum number with m = (-I, -I+1, …, I-1, I). Thus, Iz has 2I+1

possible values. The orientation of the spin angular momentum vector in space is quantized,

is collinear with the vector representing the nuclear spin angular momentum vector

because the magnitude of the vector is constant and the z component has a set of discrete

values.

Nuclei that have a non-zero spin angular momentum also possess a nuclear magnetic moment

μ, which

and is proportionally linked to it:

Iγ=μ with a z component of mIzz hγ=γ=μ

with γ being the gyromagn istic for a give nucleus (see Table

1.2). Because the angular momentum is a quantized property, so is the nuclear magnetic

etic ratio, a constant character n

moment.

Introduction

38

In an external magnetic field B, the spin states of the nucleus have energies given by

IBB γ−=−= μE

in which B is the magnetic field vector. The magnetic pole moment μ aligns with the main

axis of the field and becomes quantized with energies proportional to their projection onto B.

is directed along the z

di

In a high field NMR spectrometer, the static external magnetic field B0

axis of the laboratory coordinate system resulting in an energy term:

00zm BmBIE γ−=γ−= h

The projection of the angular momentum of the nuclei onto of the laboratory frame

results in 2I+1 equally spaced energy levels, which are known as the Zeeman levels (Figure

At equilibrium energy states are unequally populated because lower energy

rientations of the magnetic dipole vector are more probable. The relative population of the

the z axis

1.10). For the I=1/2 nuclei usually observed in biomolecular NMR, this results in two distinct

energy levels which are commonly termed α and β, corresponding to spin states with m=+1/2

and m=-1/2, respectively.

Figure 1.10 The two precessional cones for a I=1/2 nucleus in a magnetic field B

cshowing typi al spin vectors occupying the two Zeeman states. mI=+1/2 results in the α state, whereas mI=-1/2 results in the βstate.

, the different

o

states is given by the Boltzmann distribution

TkBeE

NNα

Δ−

β

=

The bulk magnetic moment M of a macroscopic sample is given by the vector sum of the

corresponding quantities of μ for individual nuclei. The small population excess in the lower

energy levels gives rise to a bulk magnetization of the sample parallel to the static magnetic

field. This macroscopic magnetization is referred to as the longitudinal magnetization.

Introduction

39

Transitions between Zeeman levels can be stimulated by applying electromagnetic radiation,

with the selection rule for magnetic dipole transitions being Δm= ±1. The energy required to

excite a transition between the m and (m+1) state (α and β state) is proportional to the static

magnetic field B0 (see Figure 1.11).

Figure 1.11 A magnetic field B0 acting on a nucleus with I=1/2 results in two different spin states which are referred to as Zeeman levels, or α and β states. The energy difference ΔE between the states is dependent on the gyromagnetic ratio of the particular nucleus and on the field strength of B0.

The corresponding frequency is given by

00 Bω γ= or ππν 2/B2/ 000 γ=ω=

This frequency ω0 is also called t frequency and can be imagined, in analogy to

classical mechanics, as the frequency of precession of the magnetic dipole moment around the

0, a given nucleus experiences a local magnetic field

local that depends on the chemical environment. Thus, the resonance frequency of a spin is

he Larmor

main axis of the static magnetic field. The sensitivity of NMR spectroscopy depends on the

population differences between Zeeman states. Because the difference is very marginal, in the

order of 1 in 105 for 1H spins in an 11.7 Tesla magnetic field, NMR is an insensitive

spectroscopic technique compared to other methods such as visible or ultraviolet

spectroscopy. As the difference is directly correlated with the magnitude of the static

magnetic field, it is an ongoing effort to build more powerful magnets, especially for

biomolecular NMR.

1.4.2 The chemical shift

In addition to the static magnetic field B

B

governed by the effective field Beff

local0eff BBB +=

The local magnetic contribution is mainly constituted by the electronic environment of a

given nucleus, e.g. high electron densities result in large shielding of the B0 field. Thus, the

bond structure and polarity of the surrounding atoms of a spin give rise to slightly different

Introduction

40

frequencies of precession, which therefore directly correspond to different local chemical

environments. This phenomenon is called chemical shift δ. In practice, chemical shifts are

measured in parts per million (ppm) relative to a reference resonance signal of a standard

molecule (which is water in biomolecular NMR):

6

0

ref 10×ω

Ω−Ω=δ

in which Ω and Ωref are the offset frequencies of the signal of interest and the reference signal,

respectively, and ω0 is the Larmor frequency. Differences in chemical shifts of individual

nuclei are responsible for the dispersion that is observed in NMR spectra.

ple NMR experiment, a radio frequency with the associated temporary external

an axis perpendicular to the main axis of the static

agnetic field B0. This causes the equilibrium longitudinal magnetization Mz,eq to precess

1.4.3 The basic NMR experiment

In a sim

magnetic field B1 is applied along

m

around the main axis of B1. If the radio frequency irradiation is set as a pulse that makes the

magnetization to rotate by 90 degree, the magnetization is left in the xy plane, giving rise to a

so-called transverse magnetization (see Figure 1.12).

FT

Figure 1.12 In the static external magnetic field B0, the macroscopic magnetization is oriented along the z axis and has an equilibrium value of Mz,eq (green arrow). A 90 degree radio frequency pulse along the y axis causes the magnetization to rotate around this axis resulting in a transverse magnetization Mxy in the xy plane. Precession in the xy plane is recorded by a detector. Relaxation phenomena cause the signal to decay exponentially resulting in the detection of a free induction decay (FID). This information is transferred into a spectrum by Fourier transformation (FT).

Introduction

41

As a result of the 90 degree pulse, zero magnetization is left along the z axis. Under the

fluence of the static magnetic field, the magnetization is precessing in the xy plane with

ifferent radial frequencies for individual spins (because of their different chemical shift, see

s through chemical bonds: scalar coupling

ins that are close to each other can influence each other, and these interactions are

employed by NMR spectroscopy to obtain information about the chemical structure of the

ig

in

d

above). The rotation of the magnetization is recorded by a detector in the xy plane. Because of

relaxation phenomena (see chapter 1.4.8), the transverse magnetization is lost by time which

results in an exponential decay of the detected signal, which is therefore called a free

induction decay (FID). The intensity of the FID is detected over a certain time range. This

data represents the so called time domain that is transferred into a spectrum representing the

signal as a function of frequency by Fourier transformation. In case the pulse is double in

time, the magnetization is flipped by 180 degree, resulting in a magnetization with a value of

–Mz,eq. Thus, by tuning the time and power of a radio frequency pulse, the magnetization can

be changed by a desired angle, which is used extensively in pulse programs implemented in

NMR spectroscopy.

1.4.4 The interaction of spin Sp

invest ated molecule. For example, spin-spin interactions or couplings mediated by electrons

that form chemical bonds between nuclei are called scalar couplings. The strength of the

interaction is measured by the scalar coupling constant, nJIS, in which n designates the number

of covalent bonds separating the two nuclei, I and S. The magnitude is usually expressed in

Hertz (Hz) and is independent of the strength of the external field B0. Values are large (~10-

140 Hz) for nuclei separated by a single bond and become smaller (~0-10 Hz) for two- and

three-bond couplings. Typical values observed in proteins are shown in Figure 1.13.

Figure 1.13 Typical values for one- and two-bond scalar couplings (1J and 2J) in polypeptides are shown above the bond that mediates the spin interaction. Numbers are in red and given in Hertz (Hz). C’ is the carbonyl (CO) carbon.

Introduction

42

The scalar coupling modifies the energy levels of the involved spin systems and modifies

thereby the NMR spectrum. A coupled two-sp s

four energy levels corresponding to the possib

orientation of spins (up-up and down-down, β

and down-up, βα, αβ). An illustration of th

Transitions between the states occur according to the selection rule Δm = ±1 (where m is the

agnetic quantum number of each state) and result in two signals at resonance frequencies of

For J-coupling to be active, one component of the coupled spin systems needs to contain

transverse magnetization. When transfer of magnetization from one nucleus to another is

desired, scalar coupling is allowed to evolve, thus J-couplings serve as a fundamental

in system constituted of spins I and S display

le combination of spin states, these are parallel

β, αα) and anti-parallel orientations (up-down

e energy levels is shown in Figure 1.14.

m

the I and S states. Introducing a scalar coupling between I and S with the value of JIS results in

the modulation of the energy levels and each of the previously observed signals is now split

into two with a distance corresponding to the magnitude of the J-coupling (Figure 1.14).

Figure 1.14 Energy level diagram of an IS spin system. The four possible energy states (horizontal lines with numbers 1-4) of an uncoupled IS spin system are indicated on the left, whereas the same J-coupled spin system is shown on the right. Allowed transitions between them obeying the selection rule Δm=±1 are indicated by arrows and result in the corresponding spectra which are outlined below. The I and S spins and the correlated resonance lines are shown in black and red, respectively. Identical J-couplings and therefore line splittings are observed for both signals, as energy levels change by the same magnitude in opposite directions.

Introduction

43

mechanism to transfer magnetization between spins in NMR spectroscopy. In case coupling

of spins is not wanted, the spins of a coupled heteronuclear spin system can be inverted by

applying a 180 degree pulse on one of the components (decoupling).

1.4.5 Resonance assignment in proteins In order to be able to correlate the information in the NMR spectrum with a specific atomic

position in the molecule, the resonance frequency or chemical shift of each atom has to be

assigned. The sequence specific assignment of NMR resonances requires many different

experiments taking days of spectrometer time. The resonance assignment of the backbone

atoms of a protein provides already information about the secondary structure and is a

necessity to do interaction studies (see chapter 1.4.10) and to analyze the dynamics of the

protein (chapter 1.4.8). Nevertheless, backbone assignment does not reveal much of

information about the three dimensional structure of a protein and the complete assignment is

a necessary prerequisite to achieve this goal.

Assignment of proteins requires in general uniformly isotopically labelled protein samples.

This is because of the large amounts of protons in proteins that give rise to very crowded

spectra. For that reason, the signals are correlated with the resonances of heteronuclei, namely 15N and 13C, and thereby dispersed over additional frequency axes. As the natural abundance

of these nuclei is very low (see Table 1.2), protein samples have to be enriched for these

advantage of increased

solution, this type of experiments also provide better signal to noise ratio. The strategy is

usually to assign the atoms of the protein backbone first, and then the atoms of the amino acid

shifts of the Cα atoms are often not dispersed enough for unambiguous assignment, the Cβ

isotopes. This is achieved by recombinant expression of the protein in bacteria grown in

isotope enriched media which results in the uniform incorporation of 15N or 15N/13C. The

following description is based on a 15N/13C-labelled protein sample.

Resonance assignment is most commonly based on triple-resonance experiments that

correlate different nuclei by the transfer of magnetization via the bonds connecting them, i.e.

via J-couplings (Sattler et al., 1999) (see chapter 1.4.4). Besides the

re

side chains.

Backbone assignment

A basic set of experiments is outlined below to assign the resonances of the backbone atoms

and to obtain connectivities between individual amino acids. Due to the fact that the chemical

Introduction

44

resonance is normally also employed to obtain reliable assignments. A sufficient set consists

of four experiments: the HNCA-, HN(CO)CA-, CBCANH-, and CBCA(CO)NH-experiments

(Figure 1.15).

on atoms of the previous amino acid (i-1) with a given amide group. In contrast,

xperiments without this step reveal the chemical shifts of both the previous and the same

mparison of peaks can be used to get connections between residues

hain protons and transferred to the directly bound carbon nuclei. There, it is

llowed to propagate or mix along the whole of the aliphatic spin system via a transfer termed

TOCSY (total correlation spectroscopy), redirected to the amide nitrogen spin system via the

carbonyl carbon (CO) and eventually transferred from there to the HN protons for detection.

These experiments are referred to as (H)CC(CO)NH and H(CC)(CO)NH for correlation of the

aliphatic carbon resonances and the side chain protons of the previous amino acid (i-1) to the

backbone amide of residue i, respectively (see Figure 1.16).

Figure 1.15 Summary of NMR experiments for the resonance assignments of protein backbone atoms. Correlated spin systems are boxed for the respective experiments, spin systems which are employed to transfer magnetization without chemical shift evolution are outlined in gray and indicated in brackets in the respective pulse sequence names.

All of them rely on the correlation of the Cα and/or Cβ atoms with the amide nitrogen and

proton of either the same amino acid (i) and/or the previous amino acid in the polypeptide

sequence (i-1). The brackets represent thereby a step in the experiment at which the chemical

shift of the specified nucleus is used only for the transfer of magnetization but not evolved.

Experiments involving a transfer over the backbone carbonyl carbon (CO) therefore correlate

only the carb

e

residue. Therefore, a co

and to assign the resonances of the HN, N, Cα and Cβ atoms.

Side chain assignment

In a second step, triple-resonance experiments are employed to assign carbon and proton

chemical shifts of individual side chains. In general, magnetization is initially created on

aliphatic side c

a

Introduction

45

nucleus affects the magnetic

ifically, the z component of the dipolar field of

e frequency of nucleus Q by an amount that depends on

Figure 1.16 Summary of NMR experiments for the resonance assignments of protein side chain atoms. Correlated spin systems are boxed for the respective experiments, spin systems which are employed to transfer magnetization without chemical shift evolution are outlined in gray and indicated in brackets in the respective pulse sequence names.

In addition, a HCCH-TOCSY correlates all aliphatic proton resonances with all carbon spins

systems within the same amino acid residue.

1.4.6 The interaction of spins through space: dipolar coupling and residual dipolar coupling

A second type of interaction between spins is the direct dipole-dipole interaction which is

referred to as dipolar coupling. The magnetic field caused by one

field at the site of the other nucleus. More spec

nucleus P will change the resonanc

the internuclear distance and on the orientation of the internuclear vector relative to the static

field B0. This results in a splitting of resonances with a separation in frequency by

( ) 2/1cos3DD 2max −θ= PQPQ

where θ is the angle between the internuclear vector and B0 and DPQmax is the doublet splitting

that applies for the case where θ is zero. Thus, the dipolar splitting DPQ provides direct

information of the internuclear vector. Dipolar coupling splittings are very large in solid phase

NMR, but because of the fast reorientation and isotropic orientation in an isotropic solution,

dipolar couplings are usually averaged to zero and therefore not visible in liquid state NMR

pectroscopy. Nevertheless, information about the orientation of bonds can now be even

btained from samples in solution because of the availability of tunable, weak alignment,

which results in residual dipolar coupling.

s

o

Introduction

46

Residual dipolar coupling

Recently, NMR methods have been developed which allow the extraction of structural

restraints characterizing long-range order. Traditionally, structure determination was based on

distance restraints derived from quantification of NOESY spectra (see chapter 1.4.7) as well

as torsion angle restraints from measurement of 3J-couplings (chapter 1.4.4 and 1.4.9). A key

limitation inherent to this approach concerns the strictly local nature of these parameters,

since they solely define distances and angles between atoms close in space within the

tructure.

ne-bond vectors relative to the molecular

usceptibility tensor, have been introduced to derive angular restraints and are now employed

in NMR based structure calculations (Tolman et al., 1995; Tjandra & Bax, 1997; Tjandra et

lso

r as they restrain all bond orientations relative to a common

at

tunable degrees of molecular alignment can be achieved by placing the molecule under

investigation into a dilute liquid crystalline media. The alignment relies on the steric

interaction between the highly aligned medium and the biomacromolecules, which induces a

s

In particular, residual dipolar couplings (RDCs), which contain information about the

orientation of the internuclear, usually o

s

al., 1997; reviewed in Bax, 2003). Besides constraining local geometry, dipolar couplings a

have a global ordering characte

frame, independent of their individual localization in the structure.

As described before, the source of this structural information is the direct dipole-dipole

interaction between two nuclei, which is normally averaged to zero in solution. In the case of

weak, partial anisotropic alignment, however, a certain residual orientation of the dipole-

dipole vector in the magnetic field remains. Therefore, the major breakthrough with respect to

the potential use of dipolar coupling derived structural restraints was the demonstration th

weak alignment of the biomolecule itself. In this case, not all orientations of the molecule are

equally likely to occur and a residual dipolar coupling can be observed.

The most commonly measured residual dipolar coupling in structure determination of proteins

is the coupling between the 15N nucleus and a proton (Figure 1.17).

Figure 1.17 Magnetic dipole-dipole coupling, illustrated for a 15N-1H spin pair. 15N and 1H moments are aligned parallel (or anti-parallel) to the static magnetic field B0. The total magnetic field in the B0 direction at the 15N position can increase or decrease relative to B0, depending on the orientation of the 15N-1H vector and the spin state of the proton (parallel or anti-parallel to B0). Adapted from Bax, 2003.

Introduction

47

The measurement of 15N-1H RDCs provides information about the orientation of the bond

vector of amide groups in the protein backbone (Figure 1.17). Similar RDC values therefore

indicate similar orientation of the bond vectors, as observed for example for amino acids

located in helices. Tjandra et al. developed a procedure to incorporate these experimental

dipolar couplings into the structure calculation (Tjandra & Bax, 1997). This resulted in an

substantial improvement of the percentage of backbone angles in the most favoured region of

the Ramachandran plot and in considerably higher cross-validation statistics. Thus, RDC

sonance is irradiated in a

Although the dipolar interaction is averaged to zero for molecules in solution, the interaction

of spins results in relaxation, which is a second order perturbation. The transitions include W0

and W2 transitions, which involve the simultaneous change of both spins and thus give rise to

an NOE. These transitions are forbidden in the conventional sense that they cannot be directly

measurements are a valuable tool in structure validation and refinement and were employed in

the structure determinations presented in this thesis (see results).

1.4.7 The nuclear Overhauser effect

In general, the nuclear Overhauser enhancement or nuclear Overhauser effect (NOE) is the

fractional change in intensity of one NMR line when another re

double irradiation experiment. The NOE has its origin in population changes as a result of a

particular form of relaxation (see next chapter), namely dipole-dipole cross-relaxation

(Neuhaus & Williamson, 2000). This is caused by the dipolar interactions between spins,

where the magnetic moment of a particular spin induces a magnetic field at the position of the

interacting spin that is close in space (see previous chapter).

The coupling gives rise to modified energy levels shown in Figure 1.18 for a two-spin system

IS.

Figure 1.18 Energspin system (IS) w

y diagram of a two ith dipolar coupling.

n.

The NOE is a result of W0 and W2transitions which occur via spin-spin cross-relaxatio

Introduction

48

excited by a radio frequency or cause directly detectable NMR signals, but they are not

forbidden in the context of relaxation.

The NOE is so important for NMR spectroscopy of biomolecules because it is manifested by

a through space interaction and the magnitude of it directly correlates with the distance

between the interacting nuclei. The NOE is therefore used to derive proton-proton distance

restraints for the structure determination by NMR. The intensity of an NOE is measured using

NOE spectroscopy (NOESY) experiments. The cross-peak intensity is approximately

proportional to the inverse sixth power if the inter-nuclear distance and can be observed for

distances up to 6 Å.

1.4.8 Relaxation and dynamics

The macroscopic longitudinal and transverse magnetizations were introduced in chapter 1.4.1

and 1.4.3, respectively. The longitudinal magnetization (Mz) is aligned along the static field

axis of B0 with a certain equilibrium value. The transverse magnetization (Mxy) is

perpendicular to B0, around which it precesses at the Larmor frequency ω0 and it has an

equilibrium value of zero. After any disturbance caused by the action of a radio frequency

irradiation, two processes must occur for equilibrium to be restored: the longitudinal

magnetization must return to its original value and the transverse magnetization must decay to

nuclear spin relaxation is a consequence of coupling of the spin system to the

ations in the magnetic field.

urement of T1 re

Longitudinal relaxation necessarily involves transition

magnetization is a result of difference in populations of spin levels and β (chapter 1.4.1).

The longitudinal relaxation time is also called spin-lattice relaxation time or simply T1. The

lattice means basically the surrounding that includes a set of energy levels that is

given NMR transition, there will be always a possible change within the lattice that involves

zero. The first process is called longitudinal relaxation and the second transverse relaxation.

Any

surroundings and is induced by fluctu

Longitudinal relaxation and meas laxation times

s between spin states as the longitudinal

α

term

generally provided by the various translations, rotations and internal motions of the molecules

that constitute the sample. Because of the very large number of degrees of freedom that the

lattice possesses, this set is effectively a continuum. The lattice modifies the local magnetic

fields at the locations of nuclei and thereby couples the lattice and the spin system. For any

Introduction

49

the same quantity of energy. Thus, exchange of energy between the spin systems and the

lattice brings the spin system in thermal equilibrium with the lattice and the distribution of

spins returns to the equilibrium population.

The T1 relaxation time describes the longitudinal relaxation and is measured by the inversion

recovery experiment (Figure 1.19).

Following an inversion by applying an 180 degree pulse, the value of Mz equals –Mz,eq. Then,

the decay of Mz to its equilibrium value will follow the simple expression

Figure 1.19 The longitudinal relaxation time T1 is measured by the inversion recovery experiment. At equilibrium, the macroscopic magnetization (green arrow) is oriented along the z axis and has a value of Mz,eq. At the beginning of the experiment, it is inverted by a 180 degree pulse along y into the –z direction. After a time τ, the magnetization has decayed due to longitudinal relaxation. A 90 degree pulse along the y axis flips the magnetization into the xy plane, where it is detected as an FID.

⎥⎤

⎢⎡

−=ττ−

T1eqz,z 21M)(M e

⎦⎣

n in the xy plane that is After a delay τ, a 90 degree pulse is applied to generate magnetizatio

recorded as an FID. The delay is varied incrementally to obtain a set of data to fit the T1 time.

Transversal relaxation and measurement of T2 relaxation times

In contrast to T1, the transverse relaxation is a result of loss and dephasing of individual

contributions to the macroscopic transverse magnetization. This is due to transitions, but also

due to any other process that perturbs the individual Larmor frequencies like loss of phase

coherence. The transverse relaxation time is also known as spin-spin relaxation time or T2.

Introduction

50

As the line width of a resonance signal is related to T2 by equation

T21

⋅=

π 1/2νΔ

( 2/1νΔ is the full line width at half height of the line), a shorter transverse relaxation time T2

gives rise to a broader line width. Due to the fact that T2 is inversely correlated to the

molecular tumbling rate (see below and Figure 1.21), this is a major obstacle for NMR

spectroscopy of large molecules.

As explained, T2 is the relaxation rate of Mxy and T2 times are generally measured by

employing a Hahn spin-echo experiment, which is using the principle of symmetry reversal

(Figure 1.20).

magnetization in the xy

ed

s an FID. The size of this echo will only be affected by the spin-spin relaxation processes. To

obtain a whole data set, the delay τ is incrementally changed. The Carr-Purcell-Meiboom-Gill

Figure 1.20 The transversal relaxation time T2 is determined by the Hahn spin-echo experiment. At equilibrium, the macroscopic magnetization (green arrow) is oriented along the z axis and has a value of Mz,eq. A first 90 degree pulse along the y axis flips the magnetization into the xy plane. Spins are allowed to evolve for a delay time τ. The spins are inverted by a 180 degree pulse along the x axis, followed by a second delay time τ. The echo is then recorded as an FID. The T2 time can be extracted from the magnitude of the echo, because all contributions of spin-lattice relaxation (T1) are cancelled out due to the symmetrical concept of the experiment.

The Hahn echo is constituted first by a 90 degree pulse that flips the

plane. During the first τ delay, the magnetization evolves according to its chemical shift and

field inhomogeneity. Then a 180 degree pulse is applied, which inverts the magnetization.

Following this inversion pulse, another τ delay is applied. During this delay, the

magnetization refocuses. This will give rise to an echo after the 2 τ delays which is record

a

Introduction

51

(CPMG) sequence, which is derived from the Hahn spin-echo sequence, can be also used to

measure T2 relaxation times. This sequence is equipped with a "built-in" procedure to self-

correct pulse accuracy error.

modulated by molecular motion. These interactions involve

ainly dipolar coupling (see chapter 1.4.6), and chemical shift anisotropy (CSA). Other terms

used by T2. The m

us, relaxation rates are affected by intra-molecular mobility in flexible

structures and correlated with the overall rotational tumbling of the molecule in solution.

Small and medium sized molecules have a fast tumbling rate which is reflected by a small

correlation time τc. The rotational correlation time τc represents the time that it takes for a

molecule to reorient, which is much shorter for small molecules compared to large molecules,

like biomolecules. For molecules having a small τc, T2 is normally equal to T1 (Figure 1.21).

For biomolecules that are usually larger in size resulting in a slower tumbling rate and a larger

correlation time τc, T2 is shorter than T1 (Figure 1.21).

Relaxation and dynamics in biomolecules

Longitudinal and transverse relaxation rates are intimately linked to motion and the

interactions between the spins and the surroundings. This is due to the fact that relaxation is a

result of spin interactions that are

m

are responsible for additional line broadening ca ost important is chemical

exchange, which describes the interconversion of two or more states (conformations) into

each other. Th

Adobe System

s

100

10-1

10-2

101

10-3

103

102

10-12 10-11 10-10 10-9 10-8 10-7

τc [s]

T1&T2 [s]

Figure 1.21 Logarithmic plots of the T1 and T2 times versus the correlation time τcfor dipolar relaxation. The curves were computed for proton resonance frequencies νo of 400, 600 and 800 MHz.

Introduction

52

Because the slow tumbling limit applies for typical biomacromolecules with a molecular

weight above ~6 kDa, T2 is inversely proportional to the correlation time τc, which itself is

proportional to the molecular weight. Thus, by measuring T2 relaxation times (see above) it is

ossible to estimate the molecular weight of a macromolecule in solution.

otion, they are an optimal probe to characterize

be

Structure determination with NMR derived constraints

p

As the relaxation rates are directly linked to m

the dynamical properties of a molecule. For proteins, the aforementioned measurements are

usually set up as two dimensional experiments, correlating 15N nuclei with a directly bound

proton. Thus, in proteins, relaxation rates are determined for the backbone amide nitrogens.

Most commonly, the obtained data is represented in a plot of the ratio of the T1 and T2 times

of the amide nitrogens versus the position of the corresponding residue in the amino acid

sequence. If the observed value differs significantly from the average T1/T2 ratio, it indicates

regions of internal motion at time scales above or below the correlation time τc of the overall

tumbling. Thus, information about the rotational diffusion tensor of a molecule can

extracted from relaxation data. Values of T1/T2 will differ significantly within one molecule

if the shape of the molecule can not be approximated by a sphere, resulting in anisotropic

diffusion. Measurements of dynamical properties were obtained of the investigated proteins in

this thesis (chapter 2.1 and 2.4).

1.4.9

As described in the previous chapters, a variety of structural restraints can be extracted from

NMR experiments. Considering the input data for structure calculations, the most important

ones are short proton-proton distances derived from NOESY experiments (Table 1.3). With

increasing size of the biomolecule, overlap of cross-peaks becomes a problem for the

unambiguous assignment of distances. This overlap can be reduced by the introduction of

additional frequency dimensions, and therefore the NOESY spectrum is in practice usually

acquired as three dimensional 13C- and 15N-edited NOESY experiments. The distances rij can

be calculated from the NOE intensities Iij by equation

6/1

ij

refrefij I

Irr ⎟

⎟⎠

⎞⎜⎜⎝

⎛=

using an appropriate reference for distance calibration. In practice, the experimental distance

is imprecise because of peak integration errors, peak overlap, spin diffusion and internal

Introduction

53

dynamics. For this reason, the distance is defined to be between a lower and an upper bound.

Usually, the lower bound separation of a pair of protons is set to the sum of the van der Waals

radii.

In addition to NOEs, structural information can be obtained by analyzing 3J-coupling

constants. As was first described by Karplus (Karplus, 1959), the magnitude of a 3J-scalar

coupling constant is a function of the dihedral angle θ, formed by the three covalent bonds

CBA +θ+θ= coscosJ 23

The constants A, B, and C depend on the particular nuclei involved in the covalent bonds.

Most commonly, the coupling constant between the amide proton HN and the Cα bound

proton Hα is analyzed (Table 1.3). The value of the coupling constant provides a direct

correlation with the backbone dihedral angles Φ and Ψ.

NMR observable Information

NOEs (15N- and 13C-resolved NOESY) Interatomic distances (<6 Å)

J-couplings (3JHN,Hα) Dihedral angles (backbone angles Φ and Ψ)

RDCs (HN-N) Bond projection angles (backbone HN-N)

Solvent exchange (HN) Hydrogen bonds

Structure calculation

the heavy water and can therefore be

observed in a 15N,1H-correlation experiment. The information that a specific proton is located

in a regular secondary structure is then translated into distances between this proton and

another atom, based on the previously determined secondary structure (Table 1.3).

Restrained molecular dynamics Simulated annealing

3D structure ensemble

Another set of restraints is generally obtained from experimentally observed H/D exchange

rates. To detect slowly exchanging amide protons which might be part of H-bonds, the protein

is first lyophilized and then redissolved in 100% D2O. Protected amide protons do not or only

slowly exchange with the surrounding deuterium of

Table 1.3 Structure determination by NMR spectroscopy. NMR observable parameters (left) are translated into structural restraints (right), which are employed in calculations resulting in a 3D structure ensemble.

As described in chapter 1.4.6, more recent NMR observables that are implemented in

structure calculations are residual dipolar couplings. The most commonly used are the HN-N

Introduction

54

RDCs. They provide information about the directionality of the vector of the bond between

the amide nitrogen and the amide proton.

The experimental information has to be extended by chemical knowledge about bond length,

al.,

1998) and ARIA (Linge et al., 2001). The structure calculation is basically an iterative search

for an energy minimum of a given data set. The result is a fa ily of structures, also called the

ean square

eviation of the position of selected atoms in the individual structures from a mean structure.

Another set of quality checks concerns the accuracy of the structures. This includes the

analysis of backbone dihedral angle values in the Ramachandran plot and of the so-called Q-

factor for a set of RDC c

For the structural analysis of complexes betw of

NO t is employed. Usually, the d and

the ide or a nucleic a as an unlabelled compound in

saturating amounts. The edited-filtered NOESY experiment allows then to either select or

sup o-nucleus y provide

NOE intensities between protons bound to 13C nuclei an C (unlabelled)

nuc ces between protons of the protein and the ligand

can as constraints in structure ca ted complex

ee chapter 2.2).

bond angles and van der Waals radii of the molecule.

The obtained experimental restraints together with the fixed chemical restraints are then

applied in a simulated annealing protocol, e.g. by using the programs CNS (Brünger et

m

NMR ensemble, with the individual solutions fulfilling the input restraints (Table 1.3). The

final ensemble of NMR structures is then refined in a shell of water molecules (Linge et al.,

2003). The quality of the structure and of the input restraints can then be estimated by the

similarity of these structures (precision), which is normally expressed as the root m

d

onstraints.

een a polypeptide and a ligand, a special type

ESY experimen protein under investigation is fully labelle

ligand, may it be a pept cid, is added to it

press magnetization evolution of heter

lei. Thus, short inter-molecular distan

be identified and used

bou rebnd protons and can the

d protons bound to 12

lculations of the investiga

(s

1.4.10 Interaction studies by NMR spectroscopy

NMR spectroscopy is widely used to study the interaction between molecules (reviewed by

Zuiderweg, 2002). Compared to other techniques that report interactions between

biomolecules like electromobility shift assays, Biacore experiments or calorimetry and

fluorescence based assays, NMR based studies have the main advantage that not only binding

is detected but also structural details of the binding process become available. This is because

Introduction

55

the chemical shift can be used as a very sensitive probe that is reporting changes in its

physico-chemical environment.

Chemical shift perturbation mapping

Chemical shift perturbations can be analyzed in principal for all atoms of the biomolecule that

were previously assigned (see chapter 1.4.5). In practice, one is employing most frequently 15N,1H-correlation spectra (15N,1H-HSQC experiments) to detect amides in the protein

backbone that are located in or in close vicinity to the binding site. For this, the spectrum of

the free and the bound form of the biomolecule under investigation are measured and

compared by superimposition. Observation of chemical shift perturbations upon addition of a

ligand reveals atoms in which close spatial proximity the other molecule binds. An example

showing binding of ssRNA to the Drosophila Argonaute2 PAZ domain is shown in Figure

1.22 (see also chapter 2.1).

Figure 1.22 A superimposition of

15N,1H-HSQC spectra for the Ago2 PAZ domain in the absence (blue) and presence (red) of a six fold molar excess of a 21 nt ssRNA.Green arrows indicate the direction of peak movements. The insert shows additional spectra at 0.5-, 1-, 1.5- and 3-fold molar excess of RNA. Significant chemical shifts changes were observed for labelled residues. This example is taken from the publication presented in chapter 2.1 (Lingel et al., 2003).

Chemical shift perturbation measurements just yield the locations of the interfaces on the

individual binding partners. It is then still unknown how the partners interact on an atom-to-

atom basis. This can be achieved by observing inter-molecular NOEs and employing them in

calculations of complex structures (see previous chapter and chapter 2.2).

Introduction

56

Titrations with NMR

In addition to the mapping of the interface, titrations can be carried out to estimate the

ffinity, stoichiometry and the kinetics of the binding. How the chemical shifts of the labelled

rmined by the kinetics of the interaction.

ound protein (see chapter 2.4). During the titration, the ‘free set’

will disappear and will be replaced by the ‘bound set’. This regime is referred to as slow

chemical exchange and is observed usually for high affinity interactions. In slow exchange

one does not automatically know to which new location the resonance has moved, and

therefore a quantitative analysis of chemical shift changes is difficult unless the bound state is

assigned. However, the binding constant can still be quantitated by measuring the intensities

of the disappearing and/or appearing peaks as a function of the ligand concentration.

In the intermediate chemical exchange case, the frequencies of the changing resonances

become poorly defined and extensive kinetic broadening sets in, which might result in the

complete disappearance of resonance lines from the NMR spectrum.

The range of binding constants that can be determined in a quantitative way by NMR is set by

the concentration of the interaction partner that is observed by the NMR experiment. As a

rule, quantitative measures can be obtained for KD values within an order of magnitude of the

concentration of the studied species.

A practical application of interaction studies by NMR is the identification of new drugs, as the

design of drugs initially relies on the detection of binding activity of substances to target

a

protein change during the titration is dete

If the complex dissociation is very fast, there is only a single set of resonances observable

with chemical shifts being the fractionally weighted average of the free and bound chemical

shifts (Figure 1.22 and chapter 2.1). This is referred to as fast chemical exchange and is often

observed for weaker interactions. By titration with increasing amounts of ligand and by

following the peak position in the acquired spectra for each of the changing signals, a data set

is obtained which correlates the amount of ligand with the fractional change of the NMR

signal. This data can be fit to obtain the dissociation constant.

If the complex dissociation is very slow, one observes one set of resonances for the free

protein and one set for the b

proteins. Binding of low molecular weight compounds to large receptor molecules is a

therefore prominent example of the implementation of NMR in applied research. This concept

of identification and characterization of binding compounds is commonly known as structure

activity relationship (SAR) (Shuker et al., 1996).

Results

57

2. Results The major part of the results I obtained during my thesis was published in the following

papers, listed in chronological order:

Lingel, A., Simon, B., Izaurralde, E., and Sattler, M. (2003) Structure and nucleic acid binding of the Drosophila Argonaute2 PAZ domain. Nature, 426, 465-469. Lingel, A., Simon, B., Izaurralde, E., and Sattler, M. (2004) Nby the Argonaute2 PAZ domain. Nat Struct Mol Biol, 11, 576-

ucleic acid 3’-end recognition 577.

Lingel, A., Simon, B., Izaurralde, E., and Sattler, M. (2004) NMR assignment of the Drosophila Argonaute2 PAZ domain. J Biomol NMR, 29, 421-432. Lingel, A., Simon, B., Izaurralde, E., and Sattler, M. (2005) The structure of the FHV B2 protein, a viral suppressor of RNAi, reveals a novel mode of dsRNA recognition. Embo Rep, 6, 1149-1155. The following part is divided into four subchapters, one for each publication. Each of them

contains a summary of the results and a statement about my contribution, followed by a

reprint of the original article. The statements of contributions were confirmed by Dr. Elisa

Izaurralde and Dr. Michael Sattler.

Results

58

2.1 Structure and nucleic acid binding of the Drosophila Argonaute2 PAZ domain

he solution structure of the Drosophila Argonaute2 PAZ domain was solved. The construct

n was expressed

combinantly in E.coli und purified to homogeneity. NMR experiments showed that the

A refined NMR structure

as determined from experimentally determined short distances within the protein (NOEs),

easurement of relaxation times and heteronuclear 1H-15N NOE experiments showed that

redefines the domain borders.

n analysis of the structure revealed that the domain is composed of two structural modules, a

-stranded β-barrel, that is covered on one side by two N-terminal helices, and on the other

eic acid binding. As a consequence, the possibility of

ucleic acid binding was explored. Using NMR spectroscopy, binding of the PAZ domain to a

single-stranded 21 nucleotide RNA was detected. The binding site was mapped by chemical

shift perturbation analysis and located to the cleft mentioned above. Mutation of conserved

residues in the binding site of the PAZ domain confirmed the binding site und identified

critical residues for binding, as their mutation strongly reduced the affinity for RNA. Based

on a titration experiment, the dissociation constant for the interaction of the Ago2 PAZ

domain with a 21 nt RNA was determined to be around 20 µM. In addition to NMR titrations,

the binding of nucleic acids was also shown by UV-crosslinking assays, in which the purified

protein could be efficiently crosslinked to ssRNA. Detection of binding by electromobility

shift assays (EMSA) was not successful. To determine the relative affinities of different

Summary of results T

comprised 143 residues and was chosen based on an alignment that initially defined the

boundaries of this conserved domain (Cerutti et al., 2000). The protei

re

purified protein is properly folded. A complete set of spectra was recorded and the resonances

f the polypeptide backbone and side chains atoms were assigned. o

w

HN-N residual dipolar couplings, backbone dihedral angles and hydrogen bonds.

M

20 amino acids at the C-terminus are unstructured in solution. This information together with

the determined secondary structure elements enabled it to make a structure based sequence

lignment that a

A

5

side by a conserved module comprising a β-hairpin and a short α-helix. Inbetween these two

modules, a cleft aligned by basic and hydrophobic residues is formed. The central β-barrel is a

topological variation of the oligonucleotide/oligosaccharide (OB) fold, which is often

implicated in single-stranded nucl

n

Results

59

nucleic acid ligands, competition assays were performed with detection of binding by the

go2 PAZ domain, the DNA even with slightly higher affinity. Furthermore, a double-

tranded RNA with a 19 bp duplex region and a 2 nt 3’-overhang on each side of the duplex

s shown to bind, whereas the 19 bp duplex without the overhangs

howed a reduced affinity. This suggested that the 3’-end of nucleic acids could be critical for

imate my contribution to the manuscript to be

crosslinking experiment. It was found that not only ssRNA, but also ssDNA binds to the

A

s

(an siRNA duplex) wa

s

binding to the PAZ domain. All binding data were confirmed always by both methods, NMR

spectroscopy and the UV-crosslinking assay.

Contribution My contribution to the published data was to clone, express und purify the protein necessary

for all the NMR measurements. In addition, I did part of the NMR data acquisition and

processing. Analysis of the spectra to obtain data for the structure calculation and the

structure calculation itself was done in majority by me. I also did the NMR titration

experiments with RNA. I expressed the PAZ domain mutants as glutathione S-transferase

(GST) fusion proteins and purified them. I also contributed to the writing of the manuscript

and the preparation of figures. Overall, I est

around 50%.

Results

60

Results

61

Results

62

Results

63

Results

64

Results

65

Results

66

Results

67

Results

68

Results

69

Results

70

Results

71

2.2 Nucleic acid 3’-end recognition by the Argonaute2 PAZ domain Summary of results After solving the structure of the free Drosophila Argonaute2 PAZ domain, one important

goal was to get further insight into the interaction between the PAZ domain and nucleic acids.

Therefore, the solution structures of the domain bound to single-stranded RNA and DNA

oligomer ligands were determined. Based on the definition of the exact domain borders of the

PAZ domain in the previous study (chapter 2.1), a construct that is 20 residues shorter than

the construct investigated in the structure determination of the free protein was used. Using

the previously established competition crosslinking assay, a 5 nucleotide long oligomer could

be defined as a ligand and was used in NMR structure determination. Two complete sets of

spectra were measured to determine the complex structures with two ligands, a DNA

oligomer with the sequence CTCAC and the corresponding CUCAC RNA oligomer. As

before, the structures were calculated based on experimentally determined short distances

within the protein (NOEs), HN-N residual dipolar couplings, backbone dihedral angles and on

hydrogen bonds. Inter-molecular short distances between the ligand and the protein revealed

that only the two 3’-terminal nucleotides are bound by the PAZ domain. The interaction

surface of the protein mapped essentially to the region that was identified before as the

binding site for nucleic acids. In both complexes, the interaction is based mainly on

hydrophobic contacts of aromatic and nonpolar residues of the protein with the bases and

sugar backbone of the two 3’-terminal nucleotides. The 3’-OH group of the last nucleotide is

buried deeply inside of the protein, making it impossible to extend the polynucleotide chain

into the 3’-direction. These results enabled the definition of the PAZ domain as a 3’-end

recognition domain.

Contribution I prepared all samples of the Drosophila Argonaute2 PAZ domain needed for the

determination of the two complex structures. Data acquisition and processing was done in part

by me. I did all assignments of the DNA and RNA oligomers and of the protein in its bound

state and analyzed the spectra to get short distances within the protein and between the PAZ

domain and the nucleic acid ligands. I was involved in writing the manuscript and prepared all

figures. My contribution to the publication is around 70%.

Results

72

Results

73

Results

74

Results

75

Results

76

Results

77

Results

78

Results

79

Results

80

Results

81

Results

82

Results

83

Results

84

Results

85

Results

86

Results

87

Results

88

2.3 NMR assignment of the Drosophila Argonaute2 PAZ domain Summary of results In the process of the structure determination of the Drosophila Argonaute2 PAZ domain,

chemical shifts of the protein backbone and side chain atoms were assigned. Triple-resonance

NMR experiments were performed to get essentially complete sequence-specific assignments.

To allow public access to the obtained results, the chemical shifts were deposited in the

BioMagResBank (BMRB) and an assignment note was published.

Contribution My contribution to this publication was to do the major part of chemical shift assignment of

the Drosophila Argonaute2 PAZ domain. Furthermore, I wrote the manuscript and prepared

the figure. The overall contribution is estimated to be 80%.

Results

89

Results

90

Results

91

2.4 The structure of the FHV B2 protein, a viral suppressor of RNAi, reveals a novel mode of dsRNA recognition

Summary of results The structure of the Flock House virus B2 protein was solved by NMR spectroscopy. The full

length protein comprises 106 amino acids and was expressed recombinantly in E.coli. NMR

experiments indicated that a portion of the full length protein is not well defined. Based on

secondary structure predictions and NMR spectra, C-terminally deleted versions of the protein

were made. The shortest folded construct comprised 72 amino acids, lacked flexible residues

and was therefore chosen for further studies. Gel filtration of the purified protein and the line

width of the NMR signals suggested that both the full length and the B2 (1-72) construct are

dimers in solution.

Based on triple-resonance NMR experiments, chemical shift assignment was performed for

the polypeptide backbone and side chains atoms to almost completeness. The structure of the

B2 (1-72) dimer was determined by structure calculations based on experimentally

determined short distances in the protein (NOEs), residual dipolar couplings (RDCs), dihedral

angles and hydrogen bonds.

The NMR structure shows that B2 (1-72) is a symmetric, anti-parallel dimer in solution. The

dimeric conformation is confirmed by NMR relaxation experiments. One monomer is

comprised of three α-helices, which are arranged in a triangular manner and thereby the N-

and C-termini are placed in close proximity. Helix α1 and α2 of one monomer are anti-

parallel and the interaction between them is stabilized by hydrophobic residues.

The B2 dimer is formed by a head-to-tail interaction of the two monomers, with the interface

being stabilized mainly by hydrophobic side chains of residues in helix α1 and α2. In

addition, the dimer is held together by three electrostatic interactions.

To characterize the interaction of B2 with RNA and to map the binding surface, a double-

stranded siRNA was added to the B2 dimer and 15N,1H-correlation spectra were recorded.

Slow exchange in the NMR timescale between the free and the RNA-bound form of the

protein was observed, which indicates that B2 binds the siRNA with a high affinity in the

nanomolar range. Chemical shift perturbations were located mainly at helix α2, with some

additional changes also in the small helix α3. Helix α2 contains a number of positively

charged residues with one of them being arginine 54, a residue previously identified to be

Results

92

critical for dsRNA binding. The spacing of positively charged side chains indicates that B2

. This

in-line with the reported sequence-independent binding specificity of B2. Based on a

tration experiment, the stoichiometry of the dsRNA-B2 dimer complex could be determined

the observed length of the binding site, this suggests that the siRNA

ulations themselves. Furthermore, I performed the characterization of the

contacts the dsRNA primarily by polar interactions with the sugar phosphate backbone

is

ti

to be 1:1. Together with

binds with its helix axis oriented in parallel to the α2 helix axis.

Contribution I cloned, expressed and purified all constructs of the B2 protein and prepared all NMR

samples. I also did part of the data acquisition and processing. I did the complete chemical

shift assignment, the analysis of the spectra to get constraints for the structure calculations

and the structure calc

dsRNA binding to the B2 protein by NMR. I was involved in writing the manuscript and

prepared all the figures. I estimate my overall contribution to this publication to be 80%.

Results

93

Results

94

Results

95

Results

96

Results

97

Results

98

Results

99

Results

100

Results

101

Results

102

Discussion

3. Discussion

.1 The PAZ domain represents a novel nucleic acid binding fold

he PAZ domain is one of the two signature domains of Argonaute proteins and was initially

entified by a bioinformatics approach which searched for conserved sequence blocks in

roteins involved in RNA interference and cell differentiation (Cerutti et al., 2000). It was

und that the PAZ domain is exclusively present in only two protein families: the Argonaute

rotein family, both of them being known to constitute core components of the

initiation and effector step of RNA silencing, respectively (Figure 1.1). In the course of RNAi

or miRNA-mediated gene silencing, the small RNAs generated by Dicer are transferred into

RISC to perform their function in the effector complex. A direct transfer seems likely, as

Dicer was shown to interact with Argonaute proteins in several studies (Hammond et al.,

2001; Caudy et al., 2002). This was the reason for the proposal that the PAZ domain might act

as a protein-protein interaction domain, mediating physical contact between Dicer and

Argonaute proteins.

Structure of the PAZ domain

The structure of Drosophila Argonaute2 PAZ domain, presented in chapter 2.1 of this thesis,

provided the first molecular insight into the function of this conserved protein domain. The

structure revealed that the domain has a bimodular composition, with a cleft formed

inbetween the two modules. The core of the PAZ domain is formed by a central five-stranded

β-barrel that resembles structures of known β-barrels, such as the Sm fold and the

oligosaccharide/oligonucleotide binding (OB) fold (Figure 3.1, dark blue), which were both

shown to have a function in nucleic acid binding. The Sm fold is shared by members of the

LSm protein family and is known to form hexameric or heptameric rings that bind to single-

stranded RNA (reviewed by Khusial et al., 2005). It constitutes a closed barrel comprising 5

anti-parallel β-strands with an α-helix on one side (Figure 3.1B). RNA binding is found not

for the monomer alone, but only when the protein is part of a multimeric ring. The binding

site is located in the lumen of the ring with one nucleotide binding to each subunit (Thore et

al., 2003) and is formed of the loops between strands β2 and β3 and between β4 and β5 (see

Figure 3.1B, black arrows). Members of the closely related OB fold superfamily consist of

two 3-stranded anti-parallel β-sheets, where strand β1 is shared by both sheets (Figure 3.1C,

3

T

id

p

fo

and the Dicer p

103

Discussion

104

Theobald et al., 2003). The observed variability in length among OB fold domains is due to

ngth of the loops between the well-conserved elements of secondary

tructure. The nucleic acid binding subfamily of OB fold proteins is the largest within the OB

nding. OB

differences in the le

s

superfamily, and members of this subfamily are involved in ssRNA or ssDNA bi

folds tend to use a common ligand binding interface centered on β-strands 2 and 3 (Figure

3.1C, arrow).

The OB-like barrel of the DmAgo2 PAZ domain is covered on one side by two N-terminal

helices and on the other side by an inserted β-hairpin/α-helix module (Figure 3.1A, light

blue), which is around 35 amino acids long and contains several highly conserved residues.

One of them is a phenylalanine (F72) in strand β4, which is invariant in PAZ domains.

Inbetween the central β-barrel and the conserved module, a hydrophobic cleft can be

identified. This domain architecture of a β-barrel covered by a β-hairpin/α-helix module was

not observed in previously solved structures and thus represents a novel domain fold.

The structure of the DmAgo2 PAZ domain reported in chapter 2.1 was confirmed by a crystal

structure of an MBP-DmAgo2 PAZ fusion protein that was solved by Song et al. shortly

afterwards (Song et al., 2003). The two structures determined by different methods are

Figure 3.1 Comparison of β-barrel protein folds. The strands of the central β-barrel are colored in dark blue in all panels. (A) The DmAgo2 PAZ domain, with the conserved insertion shown in light blue. (B) Sm fold of the human Sm D2 protein (PDB code 1B34, chain B) (Kambach et al., 1999). The black arrows indicate the common Sm fold binding site. (C) OB fold of the E. coli Rho transcription termination factor (PDB code 1A8V) (Bogden et al., 1999). The black arrow indicates the common OB fold nucleic acid binding site on β2 and β3.

A B C

Discussion

105

essentially identical. The same is true for the PAZ domain of DmAgo1, that was also solved

as a solution structure and published simultaneously (Yan et al., 2003). Both structures show

the same secondary structure and domain architecture.

Function of the PAZ domain in nucleic acid 3’-end recognition

Unlike previously suggested, the PAZ domain was found to bind nucleic acids in vitro.

inding of ssRNA, ssDNA, and dsRNA with 3’-end overhangs to the DmAgo2 PAZ domain

bserved low affinity for blunt-ended dsRNA and double-stranded RNA with 5’-overhangs.

human Ago1 PAZ domain and for the PAZ domains of all four Argonaute proteins of

B

could be detected by chemical shift perturbation analysis and crosslink experiments (chapter

2.1). Using NMR, the binding site for a single-stranded siRNA could be mapped to be in the

hydrophobic cleft between the β-barrel and the β-hairpin/α-helix module. Furthermore, two

critical residues for nucleic acid binding, namely F50 and F72, which are located on opposite

faces of the clamp in the PAZ domain, could be identified by mutational analysis. Notably,

the two mutant proteins were still folded as confirmed by NMR spectroscopy, but showed no

chemical shift changes upon addition of ssRNA and were not able to form a photo-induced

product in the UV-crosslinking assay. Competition experiments revealed that ssRNA and

ssDNA are bound with similar affinities by the DmAgo2 PAZ domain. Different siRNAs

competed equally well in the assay, indicating that the RNA is bound with little sequence

preference. The observation that blunt-ended dsRNA molecules were bound with a reduced

binding affinity to the PAZ domain pointed towards a specificity for ssRNA.

These results were in-line with UV-crosslinking experiments that were reported together with

the crystal structure of the DmAgo2 PAZ domain (Song et al., 2003). The authors showed

binding of a single-stranded siRNA and of double-stranded siRNAs with 3’-overhangs, but

o

This suggested a role of a single-stranded 3’-end in nucleic acid binding.

As the identified region of interaction includes highly conserved residues, it was suggested

that also other PAZ domains function in nucleic acid binding (chapter 2.1). Indeed, in the

simultaneously published study of the Drosophila Argonaute1 PAZ domain, Yan et al.

showed binding of a small, single-stranded RNA oligonucleotide (Yan et al., 2003). The

authors also reported a reduced affinity for DNA, showing that the DmAgo1 PAZ domain

discriminates between RNA and DNA, whereas the Ago2 PAZ domain does not (chapter 2.1

and 2.2). Furthermore, RNA binding was shown by NMR for the PAZ domains of human

Ago1, Ago2 and Piwi proteins (Yan et al., 2003).

In agreement with that, binding of nucleic acids was detected by the crosslinking assay for the

Discussion

106

Drosophila (A. Lingel and E. Izaurralde, unpublished). In these experiments, PAZ domains

showed a clear preference for RNA over DNA, except for the DmAgo2 PAZ domain (see

above). If this difference in specificity is employed in vivo in different RISCs is presently not

nown.

o2-RNA (Y44)

ting that the 3’-end

nd makes only a few contacts with the PAZ

k

The molecular basis of nucleic acid recognition has been revealed by NMR solution structures

of the DmAgo2 PAZ domain bound to single-stranded RNA and DNA oligonucleotides and

by the characterization of the binding of an siRNA mimic to the PAZ domain, as described in

chapter 2.2 of this Ph.D. thesis. In both complex structures, only the two 3’-terminal

nucleotides are bound at similar binding sites, which are located in the previously identified

hydrophobic cleft of the PAZ domain (Figure 3.2A). This established the molecular function

of the PAZ domain as 3’-end recognition. Simultaneously, the crystal structure of the HsAgo1

PAZ domain in complex with an siRNA mimic was published (Ma et al., 2004) and

confirmed these results. Like in the structures of the DmAgo2 PAZ complexes, the 2 nt 3’-

overhang of the siRNA-like ligand is bound in the same cleft (Figure 3.2B). Recognition of

the 3’-terminal nucleotide involves a stacking interaction with a conserved aromatic residue

(F292 and F72 in HsAgo1 and DmAgo2, respectively) of the β-hairpin/α-helix module. The

phosphate groups of the two 3’-terminal nucleotides interact with tyrosine and histidine

residues in the HsAgo1-RNA (Y309, Y314, H269 and Y277) and DmAg

complexes. It is conceivable that some of these residues (such as Y314 and H269, which are

not conserved in DmAgo2) contribute to the selectivity of most PAZ domains for RNA over

DNA. This may also be linked to the different conformations of the sugar phosphate

backbone in the complexes of DmAgo2 with RNA and DNA (chapter 2.2).

Very few contacts involve the sugar of the 3’-terminal nucleotide, indica

of the nucleic acid is specified mainly by steric exclusion. In the complexes of the DmAgo2

PAZ domain with single-stranded nucleic acids, the penultimate nucleotide interacts with

conserved residues Y57 and K107. Notably, this binding pocket for the penultimate

nucleotide corresponds to the location of a single uridine nucleotide bound to the DmAgo1

PAZ domain (Yan et al., 2003), suggesting that it is a high-affinity site. By contrast, in the

complex of HsAgo1 with an siRNA mimic, the base of the penultimate nucleotide is in a

different more exposed location (Figure 3.2B, Ma et al., 2004). It acts as a spacer connecting

the 3’-terminal nucleotide with the dsRNA stem a

domain. This observation indicates that there is conformational plasticity with respect to the

location of the penultimate nucleotide. The mode of recognition of this nucleotide may

Discussion

107

depend on whether the upstream nucleic acid adopts a single-stranded or double-stranded

conformation. In both studies, the double-stranded stem of the different siRNA mimics

extends along the β3 strand of the central β barrel of the PAZ domain (chapter 2.2 and Ma et

al., 2004).

The phosphodiester backbone of the RNA strand which is bound via its 3’-end mediates

numerous contacts to positively charged arginine and lysine side chains (Ma et al., 2004).

These additional contacts depend on the presence of the dsRNA stem and contribute

substantially to the binding affinity, as the extension of the 3’-end single-stranded overhang

from two to ten uridines reduces the interaction 50-fold (corresponding to a change in the

dissociation constant, KD, from ~2 nM to 100 nM). Notably, the complementary RNA strand,

including its 5’-end, does not seem to be recognized by the PAZ domain (Figure 3.2B).

Figure 3.2 Recognition of the 3’-end of nucleic acids by the PAZ domain. The central β-barrel and the insertion module are shown in dark and light blue, respectively. Amino acid side chains that are involved in nucleic acid recognition (see text) are indicated. (A)DmAgo2 PAZ bound to a single-stranded 5 nt RNA (PDB code 1T2R). Only the two 3’-terminal nucleotides that interact with the protein are shown. (B) HsAgo1 PAZ bound to

an siRNA mimic (PDB code 1SI3). Only one half of the protein-RNA dimer iunit of the crystal structure is shown. (C) The OB fold domain of S. pombe Pot

n the asymmetric 1 (SpPot1) bound

to telomeric DNA (PDB code 1QZG) (Lei et al., 2003).

A B

C

Discussion

108

Taken together, the structural studies on the Drosophila Argonaute2 PAZ domain show that

the PAZ domain binds the two 3’-terminal nucleotides of single-stranded nucleic acids by

burying them in a hydrophobic cleft between the characteristic β-hairpin/α-helix module and

a central β-barrel.

ucture

lements suggests that the overall fold of Dicer PAZ domains is similar to that of Argonaute

The mode of recognition of the two terminal nucleotides is unique and distinct from that used

by other single-stranded nucleic acid binding domains, like the Sm or the OB fold (Theobald

et al., 2003; Messias & Sattler, 2004). In strong contrast to OB and Sm fold proteins, where

the solvent-exposed surface of the β-barrel is used for nucleic acid recognition (Figure 3.1),

the major part of the interaction surface of the PAZ domain involves the inner part of the

central β-barrel near β7. An even more significant difference is the presence of the highly

conserved β-hairpin/α-helix module in the PAZ domain, which is absent in the OB fold

(Figure 3.1). The burial of the 3’-end by the monomeric PAZ domain also differs from 3’-end

recognition by OB fold domains, such as the recognition of telomeric DNA by the OB fold of

Schizosaccharomyces pombe Pot1 (Figure 3.2C, Lei et al., 2003). SpPot1 engages in several

base- and DNA-specific contacts, including intra-DNA interactions, consistent with sequence-

specific recognition and its preference for DNA over RNA. The 3’-end of the single-stranded

telomeric DNA is exposed and recognized by specific contacts between its 3’-terminal

hydroxyl group and the side chains of arginine and threonine residues (Figure 3.2C).

Thus, the PAZ domain represents not only a novel nucleic acid binding fold, also the mode of

selectively binding the 3’-end of nucleic acids was not precedenced.

Diversity of PAZ domains

In eukaryotes, only Dicer and Argonaute proteins contain the PAZ domain. Unlike for the

PAZ domain of Argonaute proteins, little is known about the Dicer family PAZ domains.

Sequences of Dicer PAZ domains align well with the sequences of Argonaute proteins, except

for a stretch of approximately 30 amino acids in Dicer PAZ domains that are inserted in the

loop between the β6 and β7 strand of Argonaute PAZ domains (for sequence alignments, see

Supplementary Figure 1 in chapter 2.1 and Supplementary Figure 2 in chapter 2.2). This

insertion is in close proximity to the nucleic acid binding site and is rich in arginine and lysine

residues. This suggests that it might be involved in nucleic acid binding, but it could modulate

the function of the PAZ domain. Conservation of hydrophobic residues in secondary str

e

Discussion

109

proteins. A proposed model for the function of Dicer PAZ domains will be discussed in the

next chapter.

Initially, the PAZ domain was believed to occur only in eukaryotes, as there were no

conserved sequences identified in prokaryotes (Cerutti et al., 2000). Later, crystal structures

interactions with conserved aromatic residues. Residues Y190 and Y216 of PfAgo are

of archaeal proteins with identified Piwi domains revealed that the PAZ domain is not

restricted to eukaryotes. First, Song et al. solved the structure of an Argonaute protein of

Pyrococcus furiosus (Song et al., 2004) and found a PAZ domain which can be superimposed

with eukaryotic PAZ domains, despite very low sequence identity. However, the structural

similarity is not complete, and there are structural elements unique to the PfAgo PAZ domain.

The main difference is that in the PfAgo protein, the β-hairpin/α-helix module is replaced by

two α-helices (Figure 3.3). In addition, some loop conformations are different. Despite these

variations, amino acids reported to be important for binding to the 3’-end of nucleic acids (see

chapter 2.2), including those in the β-hairpin/α-helix module, are conserved. Their side chains

are located in similar spatial positions in the Pyrococcus protein, although some of these

residues are at different positions in the protein sequence and protrude from different

secondary structural elements (Figure 3.3).

As described above, key features of nucleic acid recognition by the PAZ domain involve base

B

Figure 3.3 Comparison of the Drosophila Argonaute2 PAZ domain (DmAgo2, A) and the PAZ domain of the Argonaute protein from Pyrococcus furiosus (PfAgo, B). Structures are shown as ribbon diagrams. The β-hairpin/α-helix module is shown in light blue. This module contains two α-helices in the PAZ domain of PfAgo. Side chains of residues shown to contact nucleic acids inthe DmAgo2 PAZ domain are shown and labeled. Residues in equivalent positions in the PfAgo protein are indicated.

A

Discussion

110

equivalent to Y57 and Y84 of the DmAgo2 PAZ domain (Figure 3.3). The position of residue

F72 of the DmAgo2 PAZ domain, which was identified to be critical for nucleic acid

providing module in the RNAi

recognition, is occupied by another aromatic residue, W213, in PfAgo. On the basis of these

similarities, Song et al. suggested that the PAZ domain of PfAgo could also bind to single-

stranded 3’-ends of RNAs.

A PAZ domain is also found in the crystal structure of the Argonaute protein of another

archaea, Aquifex aeolicus (Yuan et al., 2005). Similarly to the PAZ domain of the Argonaute

protein of Pyrococcus furiosus, the β-hairpin/α-helix module found in eukaryotic PAZ

domains is replaced by two α-helices. Also like in the PfAgo PAZ domain, conserved

residues occupy similar spatial positions with their counterparts in other PAZ domains (Figure

3.3), suggesting a conserved function in 3’-end binding. However, nucleic acid binding to

archaeal PAZ domains has not been demonstrated yet and thus the function of this class of

PAZ domains remains to be established.

3.2 The PAZ domain as a specificitypathway

The structural studies on eukaryotic Argonaute PAZ domains revealed that the PAZ domain

provides a unique platform for the recognition of the two 3’-terminal nucleotides in single-

stranded nucleic acids. A key feature of the binding is the burial of the terminal 3’-OH group.

This binding mode provides new insights into the biological role of this domain within Dicer

and Argonaute proteins, which constitute the cores of RNA silencing complexes.

In the initiation step of RNA-mediated gene silencing, the RNase III enzyme Dicer generates

small double-stranded RNAs. Based on the established function of the Argonaute PAZ

domain, a model of PAZ function in Dicer was proposed by Filipowicz and co-workers in a

recent study addressing the function of human Dicer (Zhang et al., 2004). The authors suggest

that the characteristic size (~20 bp) of the siRNAs resulting from Dicer cleavage may be

linked to 3’-end recognition of the dsRNA substrate by the Dicer PAZ domain. In this model,

the PAZ domain recognizes the 2 nt 3’-overhang generated by Drosha in the miRNA

pathway, or the last cut made by Dicer in the RNAi pathway. PAZ and RNase III domains of

Dicer are spaced such that when the 3’-overhang is bound, they act as a ruler to position the

cleavage site exactly 22 nt away from the 3’-end. This model is consistent with the known

preference of Dicer for free ends of dsRNAs (Zhang et al., 2002). This results in Dicer

Discussion

111

cleavage at the ends of dsRNA and pre-miRNAs, shortening the substrate consecutively by

siRNA-sized segments, and not cutting internally, which would lead to the generation of

multiple larger fragments (Zhang et al., 2002; Zhang et al., 2004; Vermeulen et al., 2005). In-

line with this model, both the PAZ and dsRBD domains are required for full Dicer activity.

Intact Dicer can slowly process blunt-ended dsRNA, however deletion of the dsRBD makes

cleavage of long dsRNA, the Dicer-2-R2D2-siRNA

rnary complex is an assembly intermediate in the formation of RISC (Pham et al., 2004;

Tomari et al., 2004b) that is proposed to recruit Ago2 directly to the siRNA. This interaction

uplex is

hereb d from the initiator complex into the RISC effector complex, as it was

Dicer more dependent on substrates with 3’-overhangs (Zhang et al., 2004). This suggests that

the dsRBD contributes to non-specific binding of incoming substrates and potentially allows

Dicer to cleave long dsRNA substrates processively. As no experimental structural data is

available for the PAZ domain of Dicer proteins, the exact details of the catalytic mechanism

and the role of the PAZ domain must await the structure of a Dicer-dsRNA complex.

In Drosophila, Dicer-2 is stably associated with the dsRBD containing protein R2D2 (Liu et

al., 2003), and they are together required for cleavage of dsRNA and for RISC assembly (Lee

et al., 2004b; Pham et al., 2004). After

te

might be mediated directly by Ago2 and Dicer-2 (Tahbaz et al., 2004). The siRNA d

y transferret

demonstrated recently (Matranga et al., 2005). The results on the PAZ domain of Argonaute

proteins presented in this thesis show that it specifically recognizes the 2 nt 3’-overhang of

siRNAs. This suggests that the PAZ domain is involved in receiving the siRNA from Dicer

and to facilitate its incorporation into RISC.

The role of the PAZ domain in the subsequent catalytic cycle of RISC can be envisioned as

follows: The PAZ domain of Ago2 binds the 3’-end of the strand that is considered the guide

strand of the siRNA duplex. At the same time, the 5’-end of the guide strand is docked in the

phosphate binding pocket of the Ago2 Piwi domain (see chapter 1.2.4). The decision which

strand of the siRNA duplex functions as guide strand is determined by the orientation of the

siRNA in the RISC loading complex. The passenger strand would then be cleaved by Ago2

and the resulting fragments released (Matranga et al., 2005; Rand et al., 2005). This release

might be facilitated by an ATP-dependent cofactor, like the release of the products of target

mRNA cleavage which was shown to be stimulated by ATP (Haley & Zamore, 2004). The

resulting RISC contains Ago2 with the single-stranded guide RNA bound. This ‘mature’

RISC provides the sequence-specific binding site for the complementary mRNA. The mRNA

can then base pair with the guide strand and is cleaved by the endonucleolytic activity

Discussion

112

constituted by the Piwi domain of the Argonaute protein (see chapter 1.2.3). The reaction

cycle is completed with the release of the cleavage products.

It is not established yet that the PAZ domain interacts with the 3’-end of double-stranded or

single-stranded siRNAs in vivo. Experiments testing this are possible and will be done

certainly in the future. A strong indication that the PAZ domain is important for RNAi

function in vivo is provided by results of a genetic screen in C. elegans (Tabara et al., 1999).

Worms deficient in RNAi were isolated in which only an absolutely conserved glutamate in

the PAZ domain of Rde-1 is mutated to lysine. The structure of the Drosophila Ago2 PAZ

domain suggests that this mutation destabilizes the fold, which most probably also disrupts

the structural integrity of the PAZ domain in vivo.

Despite the diversity of functions attributed to the different Argonaute-associated complexes

(see chapter 1.2.1), a common feature of these complexes is the use of non-coding RNAs to

select their targets in a sequence-specific manner. These non-coding RNAs (siRNAs or

microRNAs) have the specific features of RNAs processed by the RNase III enzyme Dicer in

common, i.e. a 19-22 nucleotide double-stranded RNA body with two nucleotide 3’-

overhangs and 5’-monophosphate groups. This raises the question of how siRNAs or

miRNAs are discriminated from other cellular RNAs and specifically incorporated into

RISCs. With the binding specificity of the PAZ domain of Argonaute proteins, it is likely that

it provides a contribution to the specific recognition of the 2 nt 3’-overhang feature of

siRNAs. It would then act as a selectivity filter in RISCs, allowing only small RNAs

processed by Dicer to enter the effector complex and excluding all other RNAs that are

present in the cell. Without such a filter, all double-stranded RNA that is present in a cell

could enter the RNAi pathway. This selectivity is very important, as the incorporation of

unwanted dsRNA would lead to the down-regulation of the expression of complementary

genes.

Discussion

113

3.3 The FHV B2 protein binds dsRNA in a novel

mode

as previously shown to act as a potent

ppressor of RNAi in insect cells, C. elegans and plants (Li et al., 2002; Lu et al., 2005). The

e distance over the minor groove of A-form dsRNA. These observations, together

ith the determined 1:1 binding stoichiometry and the fact that the complete α2/α2’ surface

dsRNA suggests that the dsRNA binds parallel to the α2 helix

axis. The length of the RNA binding surface is about 45 Å, which corresponds to an

approximately 17 base pair dsRNA. This proposal was confirmed by a crystal structure of a

B2 dimer bound to a palindromic 18 bp RNA duplex that was published simultaneously

(Chao et al., 2005). In this protein-RNA complex, the α2/α2’ helices of the B2 dimer are

indeed oriented almost parallel to the helical axis of the A-form RNA (the angle between the

To counteract the cellular response to the presence of dsRNA, viruses have evolved proteins

that can act as suppressors of RNA interference. The detailed mechanism of suppression is

unknown for most of the identified proteins, with the exception of the p19 protein of

tombusviruses (see chapter 1.3). p19 was shown to bind specifically to siRNAs and thereby

blocks their incorporation into RISC (Vargason et al., 2003; Ye et al., 2003).

As part of this thesis, the solution structure of the B2 protein of Flock House virus was

determined and is described in chapter 2.4. B2 w

su

NMR structure shows that B2 is an anti-parallel, symmetric dimer in solution with each

monomer composed of three α-helices. Searches in databases of known structures did not

result in significant hits, indicating that this dimer fold is novel. Binding of dsRNA was

detected by chemical shift perturbation analysis and the binding site could be mapped to an

elongated surface formed by the two α2 helices of the dimer (α2/α2’). Notably, the helices

are highly positively charged, as many lysine and arginine residues are located on this surface.

Arginine 54, which has been shown to be crucial for dsRNA binding (Lu et al., 2005), is

located at the centre of this binding interface, which confirmed the importance of residues in

the identified binding surface. The exposed arginine and lysine residues suggest that

recognition of dsRNA is mediated by non-sequence-specific electrostatic contacts with the

sugar phosphate backbone of the dsRNA stem. Interestingly, the spacing of positively charged

side chains corresponds to known distances in regular A-form dsRNA. R54 and K47 are

separated with a spacing of approximately 10 Å (Figure 3.4A), which is approximately the

distance of phosphates across the major groove. Furthermore, R36 and K62 are positioned to

bridge th

w

provides the binding site for

Discussion

114

B2 dimer axis and the axis of the dsRNA is ~10°) and B2-RNA interaction is made up almost

gure 3.4B). entirely of contacts with the phosphate backbone and ribose sugars (Fi

As suggested by the solution structure of B2, K47 and R54 of both α2 and α2’ helices contact

the phosphate backbone in the major groove, whereas R36 and K62 contact the phosphates

over the minor groove. Both the monomer and the dimer folds of the B2 dimer in complex

with dsRNA are very similar to the solution structure of the free B2 protein (backbone

coordinate r.m.s.d. 1.2/1.3 Å for monomer/dimer). A slightly different orientation of the α3

helices is potentially linked to electrostatic contacts of residues in the preceding linker with

the RNA. In addition, the helix dipole moment of the small helix α3 is oriented towards the

RNA and the partial positive charge could be also used to interact with the phosphate

backbone.

Figure 3.4 Comparison of the solution structure of the free B2 dimer (A) with the crystal structure of the B2 dimer in complex with a palindromic 18 base pair dsRNA (B). The two monomers of the B2 dimer are shown in dark und light blue, respectively. Lysine and arginine residues in the RNA binding surface (helices α2 and α2’) mediating contacts to the phosphate backbone are shown in both structures and labelled in A. Distances between these residues and the total length of the RNA binding site are indicated (see also text). The major groove (MG) and minor grooves (mg) in the dsRNA are indicated.

A B

Discussion

115

The structure of the B2 dimer reveals a novel mode of sequence-independent dsRNA

recognition, a feature that is shared with the double-stranded RNA binding domain (dsRBD,

ia coli. The Rop protein forms a four-helical bundle and binds an RNA-kissing

op structure. Several models of the Rop-RNA complex have been built on the basis of

also called double-stranded RNA binding motif, dsRBM). The dsRBD comprises about 70-75

amino acids and adopts an αβββα fold in which the two α-helices pack against one face of

the 3-stranded β-sheet. It uses residues in both the α-helical and β-strand loop regions for

RNA binding (Figure 3.5, Ryter & Schultz, 1998; Stefl et al., 2005). The dsRBD it found in

many RNA binding proteins with various functions, including pre-mRNA editing, RNA

transport, and RNA-mediating silencing. In the dsRBD, the α-helix 1, the N-terminal part of

α-helix 2, and the loop between β-strands β1 and β2 recognize the shape of the dsRNA stem.

Like the B2 dimer, the dsRBD discriminates between dsRNA and dsDNA but has no

sequence-specificity. Both the dsRBD and the B2 dimer recognize two minor grooves and the

intervening major groove along one face of the dsRNA, but the recognition is performed by

completely different protein architectures and backbone contacts (Figure 3.4 and Figure 3.5).

Thus, B2 and the canonical dsRBD represent distinct solutions to the problem of sequence-

independent dsRNA recognition.

Figure 3.5 Recognition of dsRNA by the dsRBD. The dsRBD of the Xenopus laevis RNA-binding protein A (Xlrbpa2) is shown in complex with dsRNA (Ryter & Schultz, 1998). The protein interacts with two successive minor grooves and across the intervening major groove on one face of the dsRNA. Recognition of the shape of the dsRNA stem is achieved by resdues in helices α1 and α2 and by residues in the loop connecting β1 and β2. Side chains that interact with the RNA are shown.

Notably, the topology of the four-helical bundle formed by α1, α2, α1’and α2’ in the B2

dimer resembles the small dimeric Repressor of primer (Rop) protein (Figure 3.6B, Banner et

al., 1987), which was shown to be important for the regulation of ColE1 plasmid copy number

in Escherich

lo

Discussion

116

mutagenesis and NMR chemical shift mapping (Predki et al., 1995). The predicted binding

interface with kissing RNA hairpins comprises an anti-parallel pair of the first helix (α1) of

each monomer, somewhat resembling the B2 dimeric RNA binding interface. But in contrast

to B2, in which all four helices are almost parallel to each other, Rop forms a canonical left-

handed four-helical bundle, as reflected in the rather large r.m.s.d. of the atomic backbone

coordinates (3.7 Å) between the B2 and Rop dimers (Figure 3.6A,B).

BA C

Similarly, the N-terminal RNA binding domain of the Influenza A virus NS1 protein is a

symmetric homodimer formed by two three-helix monomers (Figure 3.6C, Chien et al., 1997;

Liu et al., 1997). It corresponds to the first 73 amino acids of the NS1 protein and possesses

all of the dsRNA binding activity of the full length protein (Qian et al., 1995). Site-directed

mutagenesis experiments indicated that the side chains of positively charged amino acids in

Figure 3.6 Small, dimeric and α-helical proteins that bind dsRNA. The two monomers are shown in each panel in dark and light blue, respectively. Side chains of amino acids that were shown to be important for dsRNA binding are shown, placing the binding site for dsRNA on the right side.(A) Solution structure of the B2 dimer as described in chapter 2.4. (B) Crystal structure of the Rop protein of E. coli (Banner et al., 1987). (C) Crystal structure of the dsRNA binding domain of the Influenza A virus non structural protein NS1 (Liu et al., 1997).

e second α-helix (α2) are the only amino acid side chains that are required for the dsRNA th

binding activity of the intact dimeric protein (Figure 3.6C).

Despite these similarities, the structure of B2 reveals clearly a novel fold and an

unprecedenced way of interaction with dsRNA.

Discussion

117

3.4 Viral suppressors of RNAi act at different levels in the RNAi pathway The B2 protein of FHV was the first identified suppressor of RNA silencing of an animal

virus (Li et al., 2002), after the presence of suppressors of RNAi was already established for

plant viruses. The fact that FHV suppresses RNA silencing during replication in plant, insect

tep of RNAi (Lu et al., 2005). B2

affinity. Interestingly, inhibition of Dicer

leavage of long dsRNA after addition of B2 was indeed demonstrated very recently in vitro

(Chao et al., 2005; Lu et al., 2005). This is supported by recent studies on the B2 protein of

and worm hosts suggested that B2 has evolved a host-independent evasion mechanism (Li et

al., 2002; Lu et al., 2005). The solution structure of the FHV B2 protein dimer and the novel

dsRNA binding mode of contacting the sugar phosphate backbone to enable sequence-

independent recognition of dsRNA provide an explanation of the suppressor activity on a

molecular level. These results are consistent with a suppression of RNAi by B2 at two

different stages: first, it can bind the viral dsRNA that is generated during viral replication,

thereby protecting it from being cleaved by Dicer. Secondly, in the case viral siRNAs are

generated, they can be bound by B2 blocking their incorporation into RISC (Figure 3.7).

Figure 3.7 Suppression of RNAi by the FHV B2 protein. B2 can act at two stages of the RNAi pathway: by binding viral dsRNA it can block the initiation step(Dicer cleavage), preventing the generation of siRNAs. Like p19, it can also bind to siRNAs and thereby taking them out of the pathway.

This model is in-line with genetic analysis in C. elegans showing that B2 is active in both

wild-type worms and worms that are deficient of the Argonaute protein Rde-1, which

indicated that B2 might act upstream of the RISC-mediated s

has been demonstrated to bind siRNAs in vitro, but unlike p19 of tombusviruses, B2 can also

bind to long dsRNA, even with much higher

c

Discussion

118

Nodamura virus (NoV, Sullivan & Ganem, 2005). The two B2 proteins share only little

,

lmost all of the residues that contact the RNA are conserved between these proteins. Sullivan

es a

econd mechanism to protect the viral RNA from being cleared from the cell. The structure of

sequence similarity, but probably adopt similar structures (Johnson et al., 2001). Interestingly

a

et al. showed that NoV B2 is able to inhibit processing of long dsRNA into siRNAs in virus

infected mammalian cells (Sullivan & Ganem, 2005). Furthermore, cellular NoV B2 was

found to be associated with small ~20 nt RNAs which derived from a longer precursor. In

addition to the inhibition of Dicer in vivo, the authors showed the binding of B2 to dsRNA

and blocking of Dicer in vitro.

In conclusion, the B2 protein represents the first example of a dual mode of suppression of

RNAi: it is not only targeting siRNAs as it was demonstrated for p19, but also the long

precursor RNA that becomes thereby protected from Dicer cleavage. The protection of the

viral dsRNA might be not complete, and a certain fraction of the replication intermediates

might be processed anyway. Then, the binding of the generated siRNAs by B2 provid

s

the B2 dimer and the characteristics of dsRNA binding described in chapter 2.4 of this thesis

provide an explanation for that on a molecular level.

The list of identified suppressors is long (Table 1.1) and future studies on the molecular

function of these proteins might reveal novel modes of RNAi inhibition that are unknown to

date. Suppressors of RNA silencing may provide very useful tools to study RNA silencing

mechanisms, e.g. to identify endogenous targets of siRNA or miRNAs.

Additional Publications

119

4. Additional Publications

In addition to the publications containing primary data, I was involved in writing one

perspective and one review during the time of my Ph.D. thesis:

Perspective Lingel, A., and Izaurralde, E. (2004) RNAi: Finding the elusive endonuclease. RNA, 10, 1675-1679.

Review

Lingel, A., and Sattler, M. (2005) Novel modes of protein-RNA recognition in the RNAi pathway. Curr Opin Struct Biol, 15, 107-115.

For each of the manuscripts, a summary of the discussed topic is given, followed by the

reprint of the publication.


120

4.1 RNAi: Finding the elusive endonuclease

ummary

e Dicer containing complex

to smaller RNAs, called siRNAs. These are transferred and incorporated into a second

omplex, which contains an Argonaute protein as key player and name giving component.

called the effector complex, as it targets an mRNA that is

plex was termed Slicer, according to the activity that cleaves RNA in the first

tep, Dicer. The molecular identity of Slicer remained unknown despite intensive studies until

Then, Song et al. solved the crystal structure of an Argonaute protein of the archaebakterium

adopts an RNase H fold. This class of enzymes is known to cleave the RNA strand of an

cture, Liu et al. did mutations in the putative active site

f human Argonaute proteins and showed that the Argonaute2-associated complex is not

rgonaute2 protein itself is Slicer came from results of Tuschl and

coworkers (Meister et al., 2004). They showed that of different Argonaute-associated

plexes purified from human cells, only the Ago2 complex was cleavage competent.

This perspective summarizes the results of these three papers which led to the identification of

Ago2 as the molecular identity of Slicer.

S As described in the introduction, the process of RNAi can be divided into two steps. Briefly,

in the first step, dsRNA is cleaved by the RNase III-type enzym

in

c

This complex is also

complementary to the siRNA for degradation. The endonucleolytic activity of this Argonaute-

associated com

s

the middle of the year 2004.

Pyrococcus furiosus (Song et al., 2004). The key result of their study is that the Piwi domain

RNA:DNA hybrid. Based on this stru

o

longer capable of cleaving a target mRNA (Liu et al., 2004). Additonal evidence for the

hypothesis that the human A

com


121


122


123


124


125


126

4.2 Novel modes of protein-RNA recognition in the RNAi pathway

Summary Over the last couple of years, an increasing number of structures of proteins or protein

domains involved in RNAi were determined. These structures include PAZ domains, either in

their free form or in complex with nucleic acid ligands, the structure of the Argonaute protein

of the archaea Pyrococcus furiosus, structures of RNase III-type enzymes and crystal

structures of the p19 protein, a viral suppressor of RNA silencing, in complex with an siRNA.

In this review, the novel structures are described und the modes of interaction of the proteins

with RNA are discussed.


127


128


129


130


131


132


133


134


135

Abbreviations

136

5. Abbreviations Aa Aquifex aeolicus Af Archaeoglobus fulgidus At Arabidosis thaliana ATP Adenosin triphosphate BMRB BioMagResBank bp Base pair BYV Beet Yellow virus cDNA Complementary DNA CMV Cucumber Mosaic virus CSA Chemical shift anisotropy DGCR8 DiGeorge syndrome critical region gene 8 Dm Drosophila melanogaster DNA Deoxyribonucleic acid ds Double-stranded DUF Domain of unknown function Ec Escherichia coli EDTA Ethylenediaminetetraacetic acid EMS Ethyl methanesulfonate EMSA Electro mobility shift assay FHV Flock House virus FID Free induction decay FMRP Fragile X mental retardation protein FT Fourier transformation GST Glutathione S-transferase GTP Guanosin triphosphate Hs Homo sapiens HSQC Heteronuclear single quantum correlation IPTG Isopropyl-β-D-thiogalactopyranoside kDa Kilo Dalton miRNA MicroRNA mRNA Messenger RNA NMR Nuclear magnetic resonance NOE Nuclear overhauser effect / enhancement NOESY NOE spectroscopy NoV Nodamura virus NPC Nuclear pore complex nt Nucleotide

Abbreviations

137

OB Oligosaccharide/oligonucleotide en reading frame

AGE Polyacryl-amid gel electrophoresis

R

oupling methylation

polymerase omplex transcriptional silencing

ncing

d gene silencing

ORF OpPPAZPf

Piwi, Argonaute, Zwille Pyrococcus furiosus RNA-dependent protein kinase PK

RBD RBM

RNA binding domain RNA binding motif

RDC Residual dipolar cRdDMRdRP

RNA-directed DNA RNA-dependent RNA RNA-induced silencingRISC cRNA-induced initiation oRITS fRoot mean square deviationRMSD

RNA

Ribonucleic acid RNAiRNase

RNA interference Ribonulease

RNP Rop

Ribonucleoprotein Repressor of primer

S2 Drosophila Schneider cell line 2 Structure activity relationship SAR

SDS Sodium dodecyl sulfate siRNA Small interfering RNA Sm Stephanie Smith SN Staphylococcus nuclease

Schizosaccharomyces pombe Sp ss Single-stranded TEV Tobacco Etch virus TGS Transcriptional gene silTOCSYUV

Total correlation spectroscopy Ultra-violet

VIG Vasa intronic gene VIGS Virus-induce

References

138

6. References

ist P. 2002. RNA-d s, viruses, and RNA silencing. Science 296:1270-1273.

s V. 2004. The fun re 431:350-355. s V, Bartel B, Bar gton JC, Chen X, Dreyfuss G, Eddy SR, Griffiths-Jones S, Marshall M, Matzke M, Ruvkun G, Tuschl T. 2003. A uniform system for microRNA annotation. Rna 9

kshmi R, Pruss , Smith TH, Vance VB. 1998. A viral suppressor of gen ci U S A 95:13079-13084. AA, Klenov MS , Gvozdev VA. 2004. Dissection of a natural RNA sil Mol Cell Biol 24:6742-6750.

A, Naumova N zovsky YM, Gvozdev VA. 2001. Double-stranded RNA-m genomic tandem repeats and transposable elements in the D. melanogaster :1017-1027. Johnson KL. 19 etics of nodaviruses. Adv Virus Res 53:229-244.

r DW, Kokkinidis 1987. Structure of the ColE1 rop protein at 1.7 A resolution. J Mol

DP. 2004. MicroR on. Cell 116:281-297. mbe D. 2004. RNA 63. mbe DC. 1999. Fa Curr Opin Plant Biol 2:109-113. 003. Weak alignment offers new NMR opportunities to study protein structure and dynamics.

Protein Sci 12:1-e EH, Allshire RC. 2 tional gene silencing in mammals. Trends Genet

21:370-373. nstein E, Caudy AA, annon GJ. 2001. Role for a bidentate ribonuclease in the

initiation step of re 409:363-366. zyk J, Tropea JE M, Waugh DS, Court DL, Ji X. 2001.

Crystallographic II suggest a mechanism for double-stranded A cleavage. S

F, Hansen WW, Pa . Phys Rev 69:127. n CE, Fass D, Ber , Berger JM. 1999. The structural basis for terminator

recognition by th ctor. Mol Cell 3:487-493. Bohmert K, Camus I, Bellini C, Bouchez D, Caboche M, Benning C. 1998. AGO1 defines a novel

locus of Arabidopsis controlling leaf development. EMBO J 17:170-180. Bohnsack MT, Czaplinski K, Gorlich D. 2004. Exportin 5 is a RanGTP-dependent dsRNA-binding

protein that mediates nuclear export of pre-miRNAs. Rna 10:185-191. Brennecke J, Stark A, Russell RB, Cohen SM. 2005. Principles of microRNA-target recognition. PLoS

Biol 3:e85. Brigneti G, Voinnet O, Li WX, Ji LH, Ding SW, Baulcombe DC. 1998. Viral pathogenicity

determinants are suppressors of transgene silencing in Nicotiana benthamiana. Embo J 17:6739-6746.

quAhl ependent RNA polymerase

Ambro ctions of animal microRNAs. NatuAmbro tel DP, Burge CB, Carrin

:277-279. Anandala GJ, Ge X, Marathe R, Mallory AC

e silencing in plants. Proc Natl Acad SAravin , Vagin VV, Bantignies F, Cavalli G

encing process in the Drosophila melanogaster germ line.

Aravin A M, Tulin AV, Vagin VV, Roediated silencing of germline. Curr Biol 11

Ball LA, 99. Reverse genBanne M, Tsernoglou D.

Biol 196:657-675. Bartel NAs: genomics, biogenesis, mechanism, and functi

Nature 431 56-3Baulco silencing in plants. :3Baulco st forward genetics based on virus-induced gene silencing.

Bax A. 216.

Bayn 005. RNA-directed transcrip

Ber Hammond SM, HRNA interference. Natu

Blaszc , Bubunenko M, Routzahn Kand modeling studies of RNase I

RN tructure (Camb) 9:1225-1236. Bloch ckard M. 1946Bogde gman N, Nichols MD

e Rho transcription termination fa

References

139

Brünger AT, Adams PD, Clore GM, DeLano WL, Gros P, Grosse-Kunstleve RW, Jiang JS, Nilges M, Pannu NS, Read RJ, Rice LM, Simonson T, Warren GL. 1998. y & NMR system: A new software suite for macromolecular structure

determination. Acta Crystallogr D Biol Crystallogr 54:905-921.

diated nuclear export of

v 16:2733-

Caudy asterk RH. 2003. A micrococcal nuclease homologue in RNAi effector

Caudy :2491-2496.

y: Academic

Cerutti ins in gene silencing and cell differentiation proteins: the

iamson JR. 2005. Dual modes of

/nsmb1005.

rimer recognition and removal during DNA replication. J Mol Biol

Chapma evsky AI, Gopinath K, Dolja VV, Carrington JC. 2004. Viral RNA silencing

Cogoniing in Neurospora crassa. Proc Natl Acad Sci

Dalmay RNA-dependent RNA

Denli Acessor complex. Nature 432:231-235.

Kuszewski J, Crystallograph

Cai X, Hagedorn CH, Cullen BR. 2004. Human microRNAs are processed from capped, polyadenylated transcripts that can also function as mRNAs. Rna 10:1957-1966.

Calado A, Treichel N, Muller EC, Otto A, Kutay U. 2002. Exportin-5-meeukaryotic elongation factor 1A and tRNA. Embo J 21:6216-6224.

Caplen NJ, Parrish S, Imani F, Fire A, Morgan RA. 2001. Specific inhibition of gene expression by small double-stranded RNAs in invertebrate and vertebrate systems. Proc Natl Acad Sci U S A 98:9742-9747.

Carmell MA, Xuan Z, Zhang MQ, Hannon GJ. 2002. The Argonaute family: tentacles that reach into RNAi, developmental control, stem cell maintenance, and tumorigenesis. Genes De2742.

AA, Ketting RF, Hammond SM, Denli AM, Bathoorn AM, Tops BB, Silva JM, Myers MM, Hannon GJ, Plcomplexes. Nature 425:411-414.

AA, Myers M, Hannon GJ, Hammond SM. 2002. Fragile X-related protein and VIG associate with the RNA interference machinery. Genes Dev 16

Cavanagh J, Fairbrother WJ, Palmer III AG, Skelton NJ. 1996. Protein NMR SpectroscopPress. L, Mian N, Bateman A. 2000. Domanovel PAZ domain and redefinition of the Piwi domain. Trends Biochem Sci 25:481-482.

Chao JA, Lee JH, Chapados BR, Debler EW, Schneemann A, WillRNA-silencing suppression by Flock House virus protein B2. Nat Struct Mol Biol:doi:10.1038

Chapados BR, Chai Q, Hosfield DJ, Qiu J, Shen B, Tainer JA. 2001. Structural biochemistry of a type 2 RNase H: RNA p307:541-556. n EJ, Prokhnsuppressors inhibit the microRNA pathway at an intermediate step. Genes Dev 18:1179-1186.

Chien CY, Tejero R, Huang Y, Zimmerman DE, Rios CB, Krug RM, Montelione GT. 1997. A novel RNA-binding motif in influenza A virus non-structural protein 1. Nat Struct Biol 4:891-895.

C, Macino G. 1997. Isolation of quelling-defective (qde) mutants impaired in posttranscriptional transgene-induced gene silencU S A 94:10233-10238.

Cox DN, Chao A, Baker J, Chang L, Qiao D, Lin H. 1998. A novel class of evolutionarily conserved genes defined by piwi are essential for stem cell self-renewal. Genes Dev 12:3715-3727.

Cox DN, Chao A, Lin H. 2000. piwi encodes a nucleoplasmic factor whose activity modulates the number and division rate of germline stem cells. Development 127:503-514.

Crick FH. 1958. On protein synthesis. Symp Soc Exp Biol 12:138-163. T, Hamilton A, Rudd S, Angell S, Baulcombe DC. 2000. An polymerase gene in Arabidopsis is required for posttranscriptional gene silencing mediated by a transgene but not by a virus. Cell 101:543-553.

Denli AM, Hannon GJ. 2003. RNAi: an ever-growing puzzle. Trends Biochem Sci 28:196-201. M, Tops BB, Plasterk RH, Ketting RF, Hannon GJ. 2004. Processing of primary microRNAs by the Micropro

References

140

Ding SW, Li H, Lu R, Li F, Li WX. 2004. RNA silencing: a conserved antiviral immunity of plants and animals. Virus Res 102:109-115.

, Shi BJ, Li WX, Symons RH. 1996. An interspecies hybrid RNA virus is significantly more virulent than either parental virus. Proc Natl Acad Sci U S A 93:7

Ding SW470-7474.

Eystathi

dies. Rna 9:1171-1173.

NA

Fire A, d RNA in Caenorhabditis elegans. Nature 391:806-811.

ble-stranded RNA-binding domain protein. PLoS Biol 3:e236.

11-620. complex. Nat Struct Mol Biol

Hall TMNA in

Doench JG, Petersen CP, Sharp PA. 2003. siRNAs can function as miRNAs. Genes Dev 17:438-442. Elbashir SM, Harborth J, Lendeckel W, Yalcin A, Weber K, Tuschl T. 2001a. Duplexes of 21-

nucleotide RNAs mediate RNA interference in cultured mammalian cells. Nature 411:494-498.

Elbashir SM, Lendeckel W, Tuschl T. 2001b. RNA interference is mediated by 21- and 22-nucleotide RNAs. Genes Dev 15:188-200.

Elbashir SM, Martinez J, Patkaniowska A, Lendeckel W, Tuschl T. 2001c. Functional anatomy of siRNAs for mediating efficient RNAi in Drosophila melanogaster embryo lysate. EMBO J 20:6877-6888.

Enright AJ, John B, Gaul U, Tuschl T, Sander C, Marks DS. 2003. MicroRNA targets in Drosophila. Genome Biol 5:R1. oy T, Chan EK, Tenenbaum SA, Keene JD, Griffith K, Fritzler MJ. 2002. A phosphorylated cytoplasmic autoantigen, GW182, associates with a unique population of human mRNAs within novel cytoplasmic speckles. Mol Biol Cell 13:1338-1351.

Eystathioy T, Jakymiw A, Chan EK, Seraphin B, Cougot N, Fritzler MJ. 2003. The GW182 protein colocalizes with mRNA degradation associated proteins hDcp1 and hLSm4 in cytoplasmic GW bo

Fagard M, Boutet S, Morel JB, Bellini C, Vaucheret H. 2000. AGO1, QDE-2, and RDE-1 are related proteins required for post-transcriptional gene silencing in plants, quelling in fungi, and Rinterference in animals. Proc Natl Acad Sci U S A 97:11650-11654. Xu S, Montgomery MK, Kostas SA, Driver SE, Mello CC. 1998. Potent and specific genetic interference by double-strande

Forstemann K, Tomari Y, Du T, Vagin VV, Denli AM, Bratu DP, Klattenhoff C, Theurkauf WE, Zamore PD. 2005. Normal microRNA maturation and germ-line stem cell maintenance requires Loquacious, a dou

Friedman AM, Fischmann TO, Steitz TA. 1995. Crystal structure of lac repressor core tetramer and its implications for DNA looping. Science 268:1721-1727.

Gregory RI, Yan KP, Amuthan G, Chendrimada T, Doratotaj B, Cooch N, Shiekhattar R. 2004. The Microprocessor complex mediates the genesis of microRNAs. Nature 432:235-240.

Grewal SI, Rice JC. 2004. Regulation of heterochromatin by histone methylation and small RNAs. Curr Opin Cell Biol 16:230-238.

Grishok A, Pasquinelli AE, Conte D, Li N, Parrish S, Ha I, Baillie DL, Fire A, Ruvkun G, Mello CC. 2001. Genes and mechanisms related to RNA interference regulate expression of the small temporal RNAs that control C. elegans developmental timing. Cell 106:23-34.

Guo S, Kemphues KJ. 1995. par-1, a gene required for establishing polarity in C. elegans embryos, encodes a putative Ser/Thr kinase that is asymmetrically distributed. Cell 81:6

Haley B, Zamore PD. 2004. Kinetic analysis of the RNAi enzyme 11:599-606. . 2005. Structure and function of argonaute proteins. Structure (Camb) 13:1403-1408.

Hamilton A, Voinnet O, Chappell L, Baulcombe D. 2002. Two classes of short interfering RRNA silencing. Embo J 21:4671-4679.

Hamilton AJ, Baulcombe DC. 1999. A species of small antisense RNA in posttranscriptional gene silencing in plants. Science 286:950-952.

References

141

Hammond SM, Bernstein E, Beach D, Hannon GJ. 2000. An RNA-directed nuclease mediates post-transcriptional gene silencing in Drosophila cells. Nature 404:293-296.

HannonHe L, Hannon GJ. 2004. MicroRNAs: small RNAs with a big role in gene regulation. Nat Rev Genet

Hull R. San Diego, CA: Academic Press.

-838.

John B, MicroRNA targets. PLoS

Johnson larger

Jones L can

96:375-387.

Kasscha sion of

Kataoka Y, Takeichi M, Uemura T. 2001. Developmental roles and molecular characterization of a .

NA involved in developmental timing in C. elegans.

Hammond SM, Boettcher S, Caudy AA, Kobayashi R, Hannon GJ. 2001. Argonaute2, a link between genetic and biochemical analyses of RNAi. Science 293:1146-1150.

Han J, Lee Y, Yeom KH, Kim YK, Jin H, Kim VN. 2004. The Drosha-DGCR8 complex in primarymicroRNA processing. Genes Dev 18:3016-3027. GJ. 2002. RNA interference. Nature 418:244-251.

5:522-531. 2002. Matthews' Plant Virology.

Hutvagner G, McLachlan J, Pasquinelli AE, Balint E, Tuschl T, Zamore PD. 2001. A cellular function for the RNA-interference enzyme Dicer in the maturation of the let-7 small temporal RNA. Science 293:834

Hutvagner G, Zamore PD. 2002. A microRNA in a multiple-turnover RNAi enzyme complex. Science 297:2056-2060.

Ingelfinger D, Arndt-Jovin DJ, Luhrmann R, Achsel T. 2002. The human LSm1-7 proteins colocalize with the mRNA-degrading enzymes Dcp1/2 and Xrnl in distinct cytoplasmic foci. Rna 8:1489-1501.

Ishizuka A, Siomi MC, Siomi H. 2002. A Drosophila fragile X protein interacts with components of RNAi and ribosomal proteins. Genes Dev 16:2497-2508.

Jaenisch R, Bird A. 2003. Epigenetic regulation of gene expression: how the genome integrates intrinsic and environmental signals. Nat Genet 33 Suppl:245-254. Enright AJ, Aravin A, Tuschl T, Sander C, Marks DS. 2004. HumanBiol 2:e363. KN, Johnson KL, Dasgupta R, Gratsch T, Ball LA. 2001. Comparisons among thegenome segments of six nodaviruses and their encoded RNA replicases. J Gen Virol 82:1855-1866. , Ratcliff F, Baulcombe DC. 2001. RNA-directed transcriptional gene silencing in plantsbe inherited independently of the RNA trigger and requires Met1 for maintenance. Curr Biol 11:747-757.

Jorgensen R. 1990. Altered gene expression in plants due to trans interactions between homologous genes. Trends Biotechnol 8:340-344.

Kambach C, Walke S, Young R, Avis JM, de la Fortelle E, Raker VA, Luhrmann R, Li J, Nagai K. 1999. Crystal structures of two Sm protein complexes and their implications for the assembly of the spliceosomal snRNPs. Cell

Kanaya S, Ikehara M. 1995. Functions and structures of ribonuclease H enzymes. Subcell Biochem 24:377-422.

Karplus M. 1959. J Phys Chem 30:11-15. u KD, Carrington JC. 1998. A counterdefensive strategy of plant viruses: suppresposttranscriptional gene silencing. Cell 95:461-470.

Drosophila homologue of Arabidopsis Argonaute1, the founder of a novel gene superfamilyGenes Cells 6:313-325.

Ketting RF, Fischer SE, Bernstein E, Sijen T, Hannon GJ, Plasterk RH. 2001. Dicer functions in RNA interference and in synthesis of small RGenes Dev 15:2654-2659.

References

142

Khusial P, Plaag R, Zieve GW. 2005. LSm proteins form heptameric rings that bind to RNA via repeating motifs. Trends Biochem Sci 30:522-528.

Kim VN. 2005. MicroRNA biogenesis: coordinated cropping and dicing. Nat Rev Mol Cell Biol 6:376-385.

Knight SW, Bass BL. 2001. A role for the RNase III enzyme DCR-1 in RNA interference and germ line development in Caenorhabditis elegans. Science 293:2269-2271.

a program for display and analysis of

Krishna NR, Berliner LJ. 2003. Protein NMR for the Millenium. New York: Kluwer

Lai L, Kim R, Kim SH. 2000. Crystal structure of archaeal RNase HII: a

Landthaler M, Yalcin A, Tuschl T. 2004. The human DiGeorge syndrome critical region gene 8 and

Lee RCisense complementarity to lin-14. Cell 75:843-854.

ing. Nature 425:415-419.

icer-1 and Dicer-2 in the siRNA/miRNA silencing pathways. Cell 117:69-81.

Li H, L ession of RNA silencing by an animal virus.

Li HW,l suppressor of the plant gene silencing defence mechanism. Embo J 18:2683-

Li WX, NA silencing. Curr Opin Biotechnol 12:150-154.

s. Nature 433:769-773. hakim A, Yekta S, Rhoades MW, Burge CB, Bartel DP.

Linge J nment of ambiguous nuclear overhauser

Linge J 50:496-506.

Koradi R, Billeter M, Wüthrich K. 1996. MOLMOL: macromolecular structures. J Mol Graph 14:51-55.

Academic/Plenum Publishers. Lagos-Quintana M, Rauhut R, Lendeckel W, Tuschl T. 2001. Identification of novel genes coding for

small expressed RNAs. Science 294:853-858. Yokota H, Hung LW, homologue of human major RNase H. Structure Fold Des 8:897-904.

Its D. melanogaster homolog are required for miRNA biogenesis. Curr Biol 14:2162-2167. , Feinbaum RL, Ambros V. 1993. The C. elegans heterochronic gene lin-4 encodes small RNAs with ant

Lee Y, Ahn C, Han J, Choi H, Kim J, Yim J, Lee J, Provost P, Radmark O, Kim S, Kim VN. 2003. The nuclear RNase III Drosha initiates microRNA process

Lee Y, Kim M, Han J, Yeom KH, Lee S, Baek SH, Kim VN. 2004a. MicroRNA genes are transcribed by RNA polymerase II. Embo J 23:4051-4060.

Lee YS, Nakahara K, Pham JW, Kim K, He Z, Sontheimer EJ, Carthew RW. 2004b. Distinct roles for Drosophila D

Lei M, Podell ER, Baumann P, Cech TR. 2003. DNA self-recognition in the structure of Pot1 bound to telomeric single-stranded DNA. Nature 426:198-203. i WX, Ding SW. 2002. Induction and supprScience 296:1319-1321. Lucy AP, Guo HS, Li WX, Ji LH, Wong SM, Ding SW. 1999. Strong host resistance targeted against a vira2691. Ding SW. 2001. Viral suppressors of R

Lim LP, Glasner ME, Yekta S, Burge CB, Bartel DP. 2003a. Vertebrate microRNA genes. Science 299:1540.

Lim LP, Lau NC, Garrett-Engele P, Grimson A, Schelter JM, Castle J, Bartel DP, Linsley PS, Johnson JM. 2005. Microarray analysis shows that some microRNAs downregulate large numbers of target mRNA

Lim LP, Lau NC, Weinstein EG, Abdel2003b. The microRNAs of Caenorhabditis elegans. Genes Dev 17:991-1008. P, O'Donoghue SI, Nilges M. 2001. Automated assigeffects with ARIA. Methods Enzymol 339:71-90.

P, Williams MA, Spronk CA, Bonvin AM, Nilges M. 2003. Refinement of protein structures in explicit solvent. Proteins

Lingel A, Sattler M. 2005. Novel modes of protein-RNA recognition in the RNAi pathway. Curr Opin Struct Biol 15:107-115.

References

143

Lingel A, Simon B, Izaurralde E, Sattler M. 2003. Structure and nucleic-acid binding of the Drosophila Argonaute 2 PAZ domain. Nature 426:465-469.

.

NAi. Science 305:1437-

Liu J, L erman HM. 1997. Crystal structure of the

Liu J, Rivas FV, Wohlschlegel J, Yates JR, Parker R, Hannon GJ. 2005a. A role for the P-body

Liu J, V . 2005b. MicroRNA-dependent localization of

Liu Q, 2, a bridge between the

Llave C ne

by the A. fulgidus Piwi protein. Nature 434:666-670.

Dev

Meister orsett Y, Teng G, Tuschl T. 2004. Human Argonaute2

Meister Nature 431:343-

Messias -stranded RNA recognition. Acc Chem Res

Mette M tzke MA, Matzke AJ. 2000. Transcriptional silencing

Mourelatos Z, Dostie J, Paushkin S, Sharma A, Charroux B, Abel L, Rappsilber J, Mann M, Dreyfuss G. 2002. miRNPs: a novel class of ribonucleoproteins containing numerous microRNAs. Genes Dev 16:720-728.

Lippman Z, Martienssen R. 2004. The role of RNA interference in heterochromatic silencing. Nature 431:364-370

Liu J, Carmell MA, Rivas FV, Marsden CG, Thomson JM, Song JJ, Hammond SM, Joshua-Tor L, Hannon GJ. 2004. Argonaute2 is the catalytic engine of mammalian R1441. ynch PA, Chien CY, Montelione GT, Krug RM, Bunique RNA-binding domain of the influenza virus NS1 protein. Nat Struct Biol 4:896-899.

component GW182 in microRNA function. Nat Cell Biol. alencia-Sanchez MA, Hannon GJ, Parker R

targeted mRNAs to mammalian P-bodies. Nat Cell Biol 7:719-723. Rand TA, Kalidas S, Du F, Kim HE, Smith DP, Wang X. 2003. R2Dinitiation and effector steps of the Drosophila RNAi pathway. Science 301:1921-1925. , Kasschau KD, Carrington JC. 2000. Virus-encoded suppressor of posttranscriptional gesilencing targets a maintenance step in the silencing pathway. Proc Natl Acad Sci U S A 97:13401-13406.

Llave C, Xie Z, Kasschau KD, Carrington JC. 2002. Cleavage of Scarecrow-like mRNA targets directed by a class of Arabidopsis miRNA. Science 297:2053-2056.

Lu R, Maduro M, Li F, Li HW, Broitman-Maduro G, Li WX, Ding SW. 2005. Animal virus replication and RNAi-mediated antiviral silencing in Caenorhabditis elegans. Nature 436:1040-1043.

Lund E, Guttinger S, Calado A, Dahlberg JE, Kutay U. 2004. Nuclear export of microRNA precursors. Science 303:95-98.

Ma JB, Ye K, Patel DJ. 2004. Structural basis for overhang-specific small interfering RNA recognition by the PAZ domain. Nature 429:318-322.

Ma JB, Yuan YR, Meister G, Pei Y, Tuschl T, Patel DJ. 2005. Structural basis for 5'-end-specific recognition of guide RNA

Martinez J, Patkaniowska A, Urlaub H, Luhrmann R, Tuschl T. 2002. Single-stranded antisense siRNAs guide target RNA cleavage in RNAi. Cell 110:563-574.

Martinez J, Tuschl T. 2004. RISC is a 5' phosphomonoester-producing RNA endonuclease. Genes18:975-980.

Matranga C, Tomari Y, Shin C, Bartel DP, Zamore PD. 2005. Passenger-Strand Cleavage Facilitates Assembly of siRNA into Ago2-Containing RNAi Enzyme Complexes. Cell 123:607-620.

Matzke MA, Birchler JA. 2005. RNAi-mediated pathways in the nucleus. Nat Rev Genet 6:24-35. G, Landthaler M, Patkaniowska A, DMediates RNA Cleavage Targeted by miRNAs and siRNAs. Mol Cell 15:185-197. G, Tuschl T. 2004. Mechanisms of gene silencing by double-stranded RNA. 349. AC, Sattler M. 2004. Structural basis of single37:279-287. F, Aufsatz W, van der Winden J, Ma

and promoter methylation triggered by double-stranded RNA. Embo J 19:5194-5201.

References

144

Neuhaus D, Williamson MP. 2000. The Nuclear Overhauser Effect in Structural and Conformational Analysis: Wiley-VCH.

Nicholson AW. 2003. The ribonuclease III family: forms and functions in RNA maturation, decay, and gene silencing. Cold Spring Harbor, New York: Cold Spring Harbor Press.

rance in the ribosome. Nat Struct Biol 10:104-108.

Nowotn n

Nykane all interfering RNA structure in the

Okamu gonaute proteins in small

Orban T mplex,

Pal-Bhailencing and HP1 localization in Drosophila are dependent on the RNAi

Parker suggests mechanisms for

Parker Jplex. Nature 434:663-666.

NAs during RNAi in Drosophila. Cell 117:83-94.

ene

ated by the coat protein and maintained by p19 through suppression of gene silencing.

Rand TA, Ginalski K, Grishin NV, Wang X. 2004. Biochemical identification of Argonaute 2 as the

Nikulin A, Eliseikina I, Tishchenko S, Nevskaya N, Davydova N, Platonova O, Piendl W, Selmer M, Liljas A, Drygin D, Zimmermann R, Garber M, Nikonov S. 2003. Structure of the L1 protube

Novina CD, Sharp PA. 2004. The RNAi revolution. Nature 430:161-164. y M, Gaidamakov SA, Crouch RJ, Yang W. 2005. Crystal structures of RNase H bound to aRNA/DNA hybrid: substrate specificity and metal-dependent catalysis. Cell 121:1005-1016. n A, Haley B, Zamore PD. 2001. ATP requirements and smRNA interference pathway. Cell 107:309-321.

ra K, Ishizuka A, Siomi H, Siomi MC. 2004. Distinct roles for ArRNA-directed RNA cleavage pathways. Genes Dev 18:1655-1666. I, Izaurralde E. 2005. Decay of mRNAs targeted by RISC requires XRN1, the Ski coand the exosome. Rna 11:459-469. dra M, Leibovitch BA, Gandhi SG, Rao M, Bhadra U, Birchler JA, Elgin SC. 2004. Heterochromatic smachinery. Science 303:669-672.

JS, Roe SM, Barford D. 2004. Crystal structure of a PIWI protein siRNA recognition and slicer activity. Embo J 23:4727-4737. S, Roe SM, Barford D. 2005. Structural insights into mRNA recognition from a PIWI domain-siRNA guide com

Pham JW, Pellino JL, Lee YS, Carthew RW, Sontheimer EJ. 2004. A Dicer-2-dependent 80s complex cleaves targeted mR

Plasterk RH. 2002. RNA silencing: the genome's immune system. Science 296:1263-1265. Predki PF, Nayak LM, Gottlieb MB, Regan L. 1995. Dissecting RNA-protein interactions: RNA-RNA

recognition by Rop. Cell 80:41-50. Pruss G, Ge X, Shi XM, Carrington JC, Bowman Vance V. 1997. Plant viral synergism: the potyviral

genome encodes a broad-range pathogenicity enhancer that transactivates replication of heterologous viruses. Plant Cell 9:859-868.

Purcell EM, Torrey HC, Pound RV. 1946. Phys Rev 69:37-38. Qian XY, Chien CY, Lu Y, Montelione GT, Krug RM. 1995. An amino-terminal polypeptide

fragment of the influenza virus NS1 protein possesses specific RNA-binding activity and largely helical backbone structure. Rna 1:948-956.

Qiu W, Park JW, Scholthof HB. 2002. Tombusvirus P19-mediated suppression of virus-induced gsilencing is controlled by genetic and dosage features that influence pathogenicity. Mol Plant Microbe Interact 15:269-280.

Qu F, Morris TJ. 2002. Efficient infection of Nicotiana benthamiana by Tomato bushy stunt virus is facilitMol Plant Microbe Interact 15:193-202.

sole protein required for RNA-induced silencing complex activity. Proc Natl Acad Sci U S A 101:14385-14389.

Rand TA, Petersen S, Du F, Wang X. 2005. Argonaute2 Cleaves the Anti-Guide Strand of siRNA during RISC Activation. Cell 123:621-629.

References

145

Reed JC, Kasschau KD, Prokhnevsky AI, Gopinath K, Pogue GP, Carrington JC, Dolja VV. 2003. Suppressor of RNA silencing encoded by Beet yellows virus. Virology 306:203-209.

1:1640-1647.

Rivas F -Tor L. 2005. Purified Argonaute2 l 12:340-349.

05-7513.

ells. PLoS Biol 3:e235.

and meiotic drive in the male

Schwar 2. Evidence that siRNAs function as guides, not

Schwar complex is a Mg2+-

Shuker 6. Discovering high-affinity ligands for proteins:

Silhavy vazza M, Burgyan J. 2002. A viral protein

Song JJ an J, Smith SK, Martienssen RA, Hannon GJ, Joshua-Tor L.

Song J L. 2004. Crystal Structure of Argonaute and Its

Stark A

6:33-38.

Rehwinkel J, Behm-Ansmant I, Gatfield D, Izaurralde E. 2005. A crucial role for GW182 and the DCP1:DCP2 decapping complex in miRNA-mediated gene silencing. Rna 1

Rhoades MW, Reinhart BJ, Lim LP, Burge CB, Bartel B, Bartel DP. 2002. Prediction of plant microRNA targets. Cell 110:513-520. V, Tolia NH, Song JJ, Aragon JP, Liu J, Hannon GJ, Joshuaand an siRNA form recombinant human RISC. Nat Struct Mol Bio

Roth BM, Pruss GJ, Vance VB. 2004. Plant viral suppressors of RNA silencing. Virus Res 102:97-108.

Ruiz MT, Voinnet O, Baulcombe DC. 1998. Initiation and maintenance of virus-induced gene silencing. Plant Cell 10:937-946.

Ryter JM, Schultz SC. 1998. Molecular basis of double-stranded RNA-protein interactions: structure of a dsRNA-binding domain complexed with dsRNA. Embo J 17:75

Saito K, Ishizuka A, Siomi H, Siomi MC. 2005. Processing of pre-microRNAs by the Dicer-1-Loquacious complex in Drosophila c

Sattler M, Schleucher J, Griesinger C. 1999. Heteronuclear multidimensional NMR experiments for the structure determination of proteins in solution employing pulsed field gradients. Prog NMR Spectrosc 34:93-158.

Schmidt A, Palumbo G, Bozzetti MP, Tritto P, Pimpinelli S, Schafer U. 1999. Genetic and molecular characterization of sting, a gene involved in crystal formationgerm line of Drosophila melanogaster. Genetics 151:749-760.

z DS, Hutvagner G, Haley B, Zamore PD. 200primers, in the Drosophila and human RNAi pathways. Mol Cell 10:537-548.

z DS, Tomari Y, Zamore PD. 2004. The RNA-induced silencing dependent endonuclease. Curr Biol 14:787-791.

Sen GL, Blau HM. 2005. Argonaute 2/RISC resides in sites of mammalian mRNA decay known as cytoplasmic bodies. Nat Cell Biol 7:633-636.

Sheth U, Parker R. 2003. Decapping and decay of messenger RNA occur in cytoplasmic processing bodies. Science 300:805-808. SB, Hajduk PJ, Meadows RP, Fesik SW. 199SAR by NMR. Science 274:1531-1534.

Sijen T, Fleenor J, Simmer F, Thijssen KL, Parrish S, Timmons L, Plasterk RH, Fire A. 2001. On the role of RNA amplification in dsRNA-triggered gene silencing. Cell 107:465-476. D, Molnar A, Lucioli A, Szittya G, Hornyik C, Tasuppresses RNA silencing and binds silencing-generated, 21- to 25-nucleotide double-stranded RNAs. Embo J 21:3070-3080. , Liu J, Tolia NH, Schneiderm2003. The crystal structure of the Argonaute2 PAZ domain reveals an RNA binding motif in RNAi effector complexes. Nat Struct Biol 10:1026-1032.

J, Smith SK, Hannon GJ, Joshua-Tor Implications for RISC Slicer Activity. Science. , Brennecke J, Russell RB, Cohen SM. 2003. Identification of Drosophila MicroRNA targets. PLoS Biol 1:E60.

Stefl R, Skrisovska L, Allain FH. 2005. RNA sequence- and shape-dependent recognition by proteins in the ribonucleoprotein particle. EMBO Rep

References

146

Steitz TA, Steitz JA. 1993. A general two-metal-ion mechanism for catalytic RNA. Proc Natl Acad Sci U S A 90:6498-6502.

Sullivan CS, Ganem D. 2005. A virus-encoded inhibitor that blocks RNA interference in mammalian cells. J Virol 79:7371-7379.

Tabara H, Sarkissian M, Kelly WG, Fleenor J, Grishok A, Timmons L, Fire A, Mello CC. 1999. The rde-1 gene, RNA interference, and transposon silencing in C. elegans. Cell 99:123-132.

Rep 5:189-

Theobald DL, Mitton-Fry RM, Wuttke DS. 2003. Nucleic Acid Recognition by OB-Fold Proteins.

Thore S ck D. 2003. Crystal structures of the Pyrococcus abyssi Sm

Tjandra es in biomolecules by NMR in a

Tjandra the dependence of heteronuclear relaxation times on rotational

Tolman in solution. Proc Natl Acad Sci

Tomari uf WE, Zamore

Voinnet O. 2001. RNA silencing as a plant immune system against viruses. Trends Genet 17:449-459.

Voinne protein prevents spread of the gene

Voinne ion of gene silencing: a general strategy used

Volpe all IM, Teng G, Grewal SI, Martienssen RA. 2002. Regulation of

Tahbaz N, Kolb FA, Zhang H, Jaronczyk K, Filipowicz W, Hobman TC. 2004. Characterization of the interactions between mammalian PAZ PIWI domain proteins and Dicer. EMBO194.

Annu Rev Biophys Biomol Struct 32:115-133. , Mayer C, Sauter C, Weeks S, Sucore and its complex with RNA. Common features of RNA binding in archaea and eukarya. J Biol Chem 278:1239-1247.

Timmons L, Fire A. 1998. Specific interference by ingested dsRNA. Nature 395:854. N, Bax A. 1997. Direct measurement of distances and angldilute liquid crystalline medium. Science 278:1111-1114. N, Garrett DS, Gronenborn AM, Bax A, Clore GM. 1997. Defining long range order in NMR structure determination fromdiffusion anisotropy. Nature Struct Biol 4:443-449. JR, Flanagan JM, Kennedy MA, Prestegard JH. 1995. Nuclear magnetic dipole interactions in field-oriented proteins: information for structure determinationU S A 92:9279-9283.

Y, Du T, Haley B, Schwarz DS, Bennett R, Cook HA, Koppetsch BS, TheurkaPD. 2004a. RISC assembly defects in the Drosophila RNAi mutant armitage. Cell 116:831-841.

Tomari Y, Matranga C, Haley B, Martinez N, Zamore PD. 2004b. A protein sensor for siRNA asymmetry. Science 306:1377-1380.

Tuschl T, Zamore PD, Lehmann R, Bartel DP, Sharp PA. 1999. Targeted mRNA degradation by double-stranded RNA in vitro. Genes Dev 13:3191-3197.

Vargason JM, Szittya G, Burgyan J, Tanaka Hall TM. 2003. Size selective recognition of siRNA by an RNA silencing suppressor. Cell 115:799-811.

Verdel A, Jia S, Gerber S, Sugiyama T, Gygi S, Grewal SI, Moazed D. 2004. RNAi-mediated targeting of heterochromatin by the RITS complex. Science 303:672-676.

Vermeulen A, Behlen L, Reynolds A, Wolfson A, Marshall WS, Karpilow J, Khvorova A. 2005. The contributions of dsRNA structure to Dicer specificity and efficiency. Rna 11:674-682.

Voinnet O. 2005. Induction and suppression of RNA silencing: insights from viral infections. Nat Rev Genet 6:206-220.

t O, Lederer C, Baulcombe DC. 2000. A viral movement silencing signal in Nicotiana benthamiana. Cell 103:157-167.

t O, Pinto YM, Baulcombe DC. 1999. Suppressby diverse DNA and RNA viruses of plants. Proc Natl Acad Sci U S A 96:14147-14152. TA, Kidner C, Hheterochromatic silencing and histone H3 lysine-9 methylation by RNAi. Science 297:1833-1837.

References

147

Wassenegger M. 2005. The role of the RNAi machinery in heterochromatin formation. Cell 122:13-16.

Wassenegger M, Heimes S, Riedel L, Sanger HL. 1994. RNA-directed de novo methylation of genomic sequences in plants. Cell 76:567-576.

William , Rubin GM. 2002. ARGONAUTE1 is required for efficient RNA interference in

Wilson enhances oskar translation in the Drosophila

Winters

Ye K, Patel DJ. 2005. RNA Silenc

ort hairpin RNAs. Genes Dev 17:3011-3016.

d bacterial RNase III. Cell 118:57-68.

16-719.

Waterhouse PM, Smith NA, Wang MB. 1999. Virus resistance and gene silencing: killing the messenger. Trends Plant Sci 4:452-457.

Wightman B, Ha I, Ruvkun G. 1993. Posttranscriptional regulation of the heterochronic gene lin-14 by lin-4 mediates temporal pattern formation in C. elegans. Cell 75:855-862. s RWDrosophila embryos. Proc Natl Acad Sci U S A 99:6889-6894. JE, Connell JE, Macdonald PM. 1996. aubergine ovary. Development 122:1631-1639. berger U. 1990. Ribonucleases H of retroviral and cellular origin. Pharmacol Ther 48:259-280.

Wüthrich K. 1986. NMR of proteins and nucleic acids: Wiley-Interscience. Yan KS, Yan S, Farooq A, Han A, Zeng L, Zhou MM. 2003. Structure and conserved RNA binding of

the PAZ domain. Nature 426:468-474. Yang W, Steitz TA. 1995. Recombining the structures of HIV integrase, RuvC and RNase H.

Structure 3:131-134. Ye K, Malinina L, Patel DJ. 2003. Recognition of small interfering RNA by a viral suppressor of RNA

silencing. Nature 426:874-878. ing Suppressor p21 of Beet Yellows Virus Forms an RNA Binding

Octameric Ring Structure. Structure (Camb) 13:1375-1384. Yi R, Qin Y, Macara IG, Cullen BR. 2003. Exportin-5 mediates the nuclear export of pre-microRNAs

and shYuan YR, Pei Y, Ma JB, Kuryavyi V, Zhadina M, Meister G, Chen HY, Dauter Z, Tuschl T, Patel DJ.

2005. Crystal structure of A. aeolicus argonaute, a site-specific DNA-guided endoribonuclease, provides insights into RISC-mediated mRNA cleavage. Mol Cell 19:405-419.

Zamore PD, Tuschl T, Sharp PA, Bartel DP. 2000. RNAi: double-stranded RNA directs the ATP-dependent cleavage of mRNA at 21 to 23 nucleotide intervals. Cell 101:25-33.

Zeng Y, Yi R, Cullen BR. 2003. MicroRNAs and small interfering RNAs can inhibit mRNA expression by similar mechanisms. Proc Natl Acad Sci U S A 100:9779-9784.

Zeng Y, Yi R, Cullen BR. 2005. Recognition and cleavage of primary microRNA precursors by the nuclear processing enzyme Drosha. Embo J 24:138-148.

Zhang H, Kolb FA, Brondani V, Billy E, Filipowicz W. 2002. Human Dicer preferentially cleavesdsRNAs at their termini without a requirement for ATP. EMBO J 21:5875-5885.

Zhang H, Kolb FA, Jaskiewicz L, Westhof E, Filipowicz W. 2004. Single processing center models for human Dicer an

Zilberman D, Cao X, Jacobsen SE. 2003. ARGONAUTE4 control of locus-specific siRNA accumulation and DNA and histone methylation. Science 299:7

Zuiderweg ER. 2002. Mapping protein-protein interactions in solution by NMR spectroscopy. Biochemistry 41:1-7.

Curriculum vitae

148

7. Curriculum vitae

Name

Place of

Educa

Since 0 European Molecular Biology Laboratory (EMBL) Heidelberg, Germany and , Switzerland

r at ETH Zurich: Prof. Dr. Ulrike Kutay

10/1999

10/2000

rgency service

Personal details

Andreas Lingel Date of birth February 8th, 1976

birth Marbach am Neckar, Germany Marital status Single Nationality German

tion

5/2002 Swiss Federal Institute of Technology (ETH) Zurich International Ph.D. Programme at EMBL Heidelberg Ph.D. research project under the supervision of Dr. Elisa Izaurralde: “Structural

and functional characterization of proteins involved in RNA-mediated gene silencing” Superviso

11/2001-02/2002 Rockefeller University, New York City, New York, USA Laboratory of Cell Biology, Prof. Dr. Günter Blobel

Research internship, topic: “The Role of POM121 and gp210 on Nuclear Pore Complex (NPC) Structure and Assembly”

-11/2001 ETH Zurich, Switzerland Diploma in biology (biochemistry), passed with distinction, grade: 5.93

-04/2001 University of Zurich, Switzerland Department of Biochemistry, laboratory of Prof. Dr. Andreas Plückthun Diploma thesis, entitled “Stability Enhancement of Proteins by Intein mediated Cyclization”, grade: 6, with honors

04/2000-07/2000 ETH Zurich, Switzerland Institute of Biochemistry, laboratory of Prof. Dr. Ari Helenius Semester work, topic: “Quality Control of Newly Synthesized Proteins in the Endoplasmatic Reticulum”

04/1997-09/1999 Eberhard-Karls-University Tübingen, Germany Vordiplom in biochemistry, passed with distinction, grade: “sehr gut”

01/1996-01/1997 German Red Cross, District Ludwigsburg, Germany Civil service as an ambulance man in the eme

06/1995 Friedrich-Schiller-Gymnasium in Marbach am Neckar, Germany Abitur, grade: 1.4, passed with distinction

Curriculum vitae

149

ublications P

1. Lingel, A., Simon, B., Izaurralde, E., and Sattler, M. (2005) The structure of the FHV B2 protein, a viral suppressor of RNAi, reveals a novel mode of dsRNA recognition.

Rep

l, A

Embo , 6, 1149-1155.

2. Ling .e , and Sattler, M. (2005) Novel modes of protein-RNA recognition in the urr Opin Struct Biol, 15, 107-115.

3. Lingel, A.

RNAi pathway. C

, and Izaurralde, E. (2004) RNAi: Finding the elusive endonuclease. RNA, 675-1679.

l, A.

10, 1

4. Linge , Simon, B., Izaurralde, E., and Sattler, M. (2004) NMR assignment of the Drosophil

. Lingel, A.

a Argonaute2 PAZ domain. J Biomol NMR, 29, 421-432.

5 , Simon, B., Izaurralde, E., and Sattler, M. (2004) Nucleic acid 3’-end n by the Argonaute2 PAZ domain. Nat Struct Mol Biol, 11, 576-577. recognitio

. Lingel, A.6 , Simon, B., Izaurralde, E., and Sattler, M. (2Drosophila Argonaute2 PAZ domain.

003) Structure and nucleic acid binding of the Nature, 426, 465-469.

. Iwai, H., L7 ingel, A., Plückthun, A. (2001) Cyclic green fluorescent protein produced in vivo usChem, 276

Conferences

He EMBO Co

Poster pr recognition by the Argonaute2 PAZ

. Keystone Keystone

Poster presentation: “Structural Basis for 3’-end recognition of nucleic acids by the

3. Max Planctein complexes”, 17.-20.

SeptembPoster pre nucleic acid binding fo

4. Cold Spring Harbor Laboratory, New York, USA Cold Spring Harbor Conference on “Eukaryotic mRNA Processing”, 20.-24. August 2003 Oral presentation: “Structure and nucleic acid binding of the Drosophila Argonaute 2 PAZ domain”

ing an artificially split PI-PfuI intein from Pyrococcus furiosus. J Biol , 16548-16554.

1 idelberg, Germany . EMBL nference “Structures in Biology”, 10.-13. November 2004

esentation: “Nucleic acid 3’-enddomain”

2 Resort, Colorado, USA Symposium on “siRNAs and miRNAs”, 14.-19. April 2004

Drosophila Argonaute2 PAZ domain”

k Institut for Biophysical Chemistry, Göttingen, Germany Symposium “Structure, function and dynamics of RNA-pro

er 2003 sentation: “The Argonaute 2 PAZ domain adopts a novel ld”

Documents

Structural and functional characterization of proteins involved in