Topological Methods for RNA Pseudoknots Nicole A. Larsen Georgia Institute of Technology Department of Mathematics Math 4803 – 04/21/2008


Topological Methods for RNA Pseudoknots

Nicole A. Larsen

Georgia Institute of Technology

Department of Mathematics

Math 4803 – 04/21/2008


Overview

• Introduction to Pseudoknots
• Topological Representation and Classification
• Thermodynamic Calculations
• Conclusions and Open Problems


Pseudoknots

• RNA secondary structures with “crossing” base pairs

• Prevalent in nature
– Telomerase
– Viruses such as Hepatitis C, SARS Coronavirus, and even several strains of HIV

Coronavirus


The Trouble with Pseudoknots

• Cannot be represented as a plane tree
• Current energy calculation methods do not hold
• About the only thing we can do is use recursive methods

Single Hairpin Pseudoknot


Representing Pseudoknots


Topological Genus

• For a surface in 3-space: g = 0 for a sphere, g = 1 for a single-holed torus, g = 2 for a double-holed torus, ..., g = n for an n-holed torus.
• The genus of an RNA structure is defined by Bon et al. to be the minimum g such that the disk diagram can be drawn on a surface of genus g with no crossing arcs.


Calculating Genus

$g = \frac{P - L}{2}$

where $P$ is the number of arcs in the diagram and $L$ is the number of loops.
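The genus of a disk diagram can also be computed mechanically from a list of base pairs. The sketch below counts the boundary cycles F of the diagram and applies Euler's formula for a one-vertex ribbon graph, V - E + F = 2 - 2g with V = 1 and E = P. It assumes the relation g = (P - L)/2 with L read as F - 1 (the cycle around the outside of the backbone circle is not counted); that reading, the function name `genus`, and the `(i, j)` input format are assumptions made for illustration, not taken from the slides.

```python
def genus(pairs):
    """Genus of a disk (chord) diagram given base pairs as (i, j) index tuples."""
    # Unpaired bases do not change the topology, so keep only paired positions
    # and relabel them 0..n-1 in backbone order.
    positions = sorted(p for pair in pairs for p in pair)
    if len(set(positions)) != len(positions):
        raise ValueError("each base may appear in at most one pair")
    n = len(positions)
    if n == 0:
        return 0
    rank = {pos: k for k, pos in enumerate(positions)}
    partner = [0] * n
    for i, j in pairs:
        partner[rank[i]] = rank[j]
        partner[rank[j]] = rank[i]

    # Trace boundary cycles: cross the arc at the current endpoint, then step
    # to the next endpoint along the (circularly closed) backbone.
    seen = [False] * n
    faces = 0
    for start in range(n):
        if seen[start]:
            continue
        faces += 1
        k = start
        while not seen[k]:
            seen[k] = True
            k = (partner[k] + 1) % n

    # Euler characteristic with one vertex and P edges: 1 - P + F = 2 - 2g.
    return (len(pairs) + 1 - faces) // 2


print(genus([(0, 3), (1, 2)]))          # two nested pairs
print(genus([(0, 2), (1, 3)]))          # two crossing pairs (H-pseudoknot)
print(genus([(0, 2), (1, 4), (3, 5)]))  # kissing-hairpin pattern
```

Because only the relative order of paired positions matters, the same function works on raw sequence indices with unpaired bases in between.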


Properties of Genus

• Pseudoknot-free structures have genus 0.
• Stacked base pairs do not contribute to genus.
• For concatenated structures, the genus is the sum of the genera of the two substructures.
• For nested structures, the genus is the sum of the genera of the two substructures.
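These properties are consistent with the counting relation $g = (P - L)/2$, as the short worked check below suggests. It is a sketch under two assumptions that are not stated on the slide: that a stacked pair adds exactly one arc and closes exactly one new loop, and that when two substructures are concatenated or nested their arcs and loops simply add (with the loop along the outside of the backbone not counted in $L$). Subscripts 1 and 2 label the two substructures.

```latex
\begin{align*}
\text{one extra stacked pair:}\quad
  & P \to P + 1,\; L \to L + 1
  \;\Rightarrow\; g = \frac{(P + 1) - (L + 1)}{2} = \frac{P - L}{2}, \\
\text{concatenation or nesting:}\quad
  & P = P_1 + P_2,\; L = L_1 + L_2
  \;\Rightarrow\; g = \frac{P_1 - L_1}{2} + \frac{P_2 - L_2}{2} = g_1 + g_2 .
\end{align*}
```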


RNA Structures with Genus 1


Classification Results

• There are 4 primitive pseudoknots of genus 1
• Pseudobase: contains 246 pseudoknots
– 238 were H-pseudoknots or nested H-pseudoknots
– Only 1 had genus > 1
• Worldwide Protein Data Bank (wwPDB)
– Even very long RNA structures (~2000 bases) have low genus (< 18)
– Primitive pseudoknots have genus 1 or 2
• Expected genus for random RNA sequences ~ length/4


Classification Results

(Left) Genus as a function of length of the RNA structure. (Right) A histogram of the genus of primitive RNA structures found in the wwPDB (Bon et al.)


What good is it, anyway?

• Genus gives us a way to measure the “complexity” of a pseudoknot

• If we can determine a relationship between topological genus and energy then we can use a minimum free energy approach for prediction


Thermodynamics and Quantum Matrix Field Theory

• RNA disk diagrams ↔ Feynman diagrams

Feynman diagrams representing the Lamb shift – Nothing to do with RNA at all!


Partition Function

• Thermodynamic partition function:

$Z = \sum_{D} e^{-E(D)/k_B T}$

where the sum ranges over all possible Feynman diagrams $D$ for a given RNA sequence, $E(D)$ is the energy of diagram $D$, and $k_B T$ is the thermal energy

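To make the role of the partition function concrete, the standard Boltzmann relations can be written out. This is a sketch assuming the usual form with thermal energy $k_B T$, which is an assumption rather than something stated on the slide; $p(D)$ is the probability of diagram $D$ and $g(D)$ is its genus.

```latex
\begin{align*}
Z &= \sum_{D} e^{-E(D)/k_B T}, \\
p(D) &= \frac{e^{-E(D)/k_B T}}{Z}, \\
\langle g \rangle &= \sum_{D} g(D)\, p(D).
\end{align*}
```

As $T \to 0$ the sum is dominated by the diagram of lowest energy, which is the connection to minimum free energy prediction.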


Results

• Vernizzi and Orland use a Monte Carlo method to generate RNA structures weighted by the partition function:

$Z = \sum_{D} e^{-\left(E(D) + \mu\, g(D)\right)/k_B T}$

• Here $\mu$ is a “topological potential energy” and $g$ is the genus. By adjusting $\mu$ you can allow RNA structures of any genus, or restrict to small-genus structures. This is useful for rapidly exploring energy regions to find minimum-energy structures.
• When $\mu$ goes to infinity (pseudoknot-free, PKF), results agree with mfold predictions.
• g/L ~ 0.23 for random sequences
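The effect of a genus penalty can be illustrated on a tiny ensemble. The sketch below is not the Monte Carlo sampler of Vernizzi and Orland. It enumerates a hypothetical list of (energy, genus) pairs exactly and weights each by exp(-(E + mu*g)/kT), where `mu` stands for the topological potential. The weight form, the toy energies, and the names `genus_weighted_probabilities` and `mean_genus` are all assumptions made for illustration.

```python
import math


def genus_weighted_probabilities(structures, mu, kT=1.0):
    """Probabilities proportional to exp(-(E + mu * g) / kT).

    structures: list of (energy, genus) tuples (hypothetical toy data).
    """
    costs = [(energy + mu * g) / kT for energy, g in structures]
    lowest = min(costs)  # shift by the minimum for numerical stability
    weights = [math.exp(-(c - lowest)) for c in costs]
    z = sum(weights)
    return [w / z for w in weights]


def mean_genus(structures, mu, kT=1.0):
    """Ensemble-average genus under the genus-penalized weights."""
    probs = genus_weighted_probabilities(structures, mu, kT)
    return sum(p * g for p, (_, g) in zip(probs, structures))


# Toy ensemble: higher-genus structures are given slightly lower energy.
toy = [(-3.0, 0), (-4.0, 1), (-4.5, 2)]

for mu in (0.0, 1.0, 5.0, 1000.0):
    print(mu, round(mean_genus(toy, mu), 4))
```

In this toy ensemble the average genus falls as `mu` grows, and for very large `mu` only the genus-0 structure carries weight, which mirrors the pseudoknot-free limit.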


Modeling with a Cubic Lattice

• Infinitely flexible polymer sequence
• Given by a self-avoiding random walk on a cubic lattice
• Each base lies on a vertex of the lattice
• Bases only bond with neighboring bases, modeled by “spin vectors”
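The lattice setup above can be sketched in a few lines. The code grows a self-avoiding walk on the cubic lattice by picking a random free neighbor at each step and restarting on a dead end, then lists bases that sit on neighboring vertices without being adjacent along the chain. This simple growth scheme does not sample walks uniformly, and the “spin vectors” of the model are not represented; the names `self_avoiding_walk` and `contact_pairs` are hypothetical.

```python
import random

# The six unit steps of the cubic lattice.
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def self_avoiding_walk(n_bases, rng, max_tries=10000):
    """Return n_bases distinct lattice sites, consecutive ones one step apart."""
    if n_bases < 1:
        raise ValueError("n_bases must be at least 1")
    for _ in range(max_tries):
        path = [(0, 0, 0)]
        occupied = {(0, 0, 0)}
        while len(path) < n_bases:
            x, y, z = path[-1]
            free = [(x + dx, y + dy, z + dz) for dx, dy, dz in STEPS
                    if (x + dx, y + dy, z + dz) not in occupied]
            if not free:
                break  # dead end: discard this attempt and restart
            nxt = rng.choice(free)
            path.append(nxt)
            occupied.add(nxt)
        if len(path) == n_bases:
            return path
    raise RuntimeError("no self-avoiding walk found within max_tries")


def contact_pairs(path):
    """Pairs (i, j), j > i + 1, of bases on neighboring lattice vertices."""
    index_at = {site: i for i, site in enumerate(path)}
    pairs = []
    for i, (x, y, z) in enumerate(path):
        for dx, dy, dz in STEPS:
            j = index_at.get((x + dx, y + dy, z + dz))
            if j is not None and j > i + 1:
                pairs.append((i, j))
    return pairs


walk = self_avoiding_walk(30, random.Random(0))
print(len(walk), len(contact_pairs(walk)))
```

The contact pairs are only candidates for bonding; choosing which of them actually pair, and computing the genus of the result, is left outside this sketch.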


Results


Average genus per unit energy


Results

Average genus per unit length for the low-energy phase (left) and the high-energy phase (right)

$\langle g/L \rangle = 0.141 \pm 0.003$ for low energy and $\langle g/L \rangle = (585 \pm 8) \times 10^{-6}$ for high energy


Conclusions

• Topological genus provides a nice, relatively easy classification scheme for pseudoknots

• Thermodynamic predictions based on genus agree with observations and with predictions given by mfold

• Low-genus structures are more likely to be found in nature.


Open Questions

• Create an algorithm for predicting secondary structures that may have pseudoknots
– Pillsbury, Orland, and Zee: steepest-descent method that takes $O(L^6)$ just to calculate the partition function, much less optimal structures!
• Experimental measurement and cataloging of low-genus structures
• How does genus depend on temperature?
• Can genus be used to predict asymptotic behavior of very long sequences?
• Incorporation of higher-order considerations such as entropy


References

• Key Sources

– Bon, Michael, Graziano Vernizzi, Henri Orland, & A. Zee. “Topological Classification of RNA Structures.” ArXiv Quantitative Biology e-prints (2006): arXiv:q-bio/0607032v1.

– Orland, Henri, & A. Zee. “RNA Folding and Large N Matrix Theory.” Nucl.Phys. B620 (2002): 456-476.

– Vernizzi, Graziano, and Henri Orland. “Large-N Random Matrices for RNA Folding.” Acta Physica Polonica B 36(2005): 2821-2827.

– Vernizzi, Graziano, Paulo Ribeca, Henri Orland, & A. Zee. “Topology of Pseudoknotted Homopolymers.” Physical Review E 73(2006).

• Mathematics Sources (found in MathSciNet)

– Karp, Richard M. “Mathematical Challenges from Genomics and Molecular Biology.” Notices of the AMS 49(2002): 544-553.

– Pillsbury, M., J. A. Taylor, H. Orland, & A. Zee. “An Algorithm for RNA Pseudoknots.” ArXiv Condensed Matter e-prints (2005): arXiv:cond-mat/0310505.

– Rivas, Elena, and Sean R. Eddy. “A Dynamic Programming Algorithm for RNA Structure Prediction Including Pseudoknots.” Journal of Molecular Biology, Vol. 285 No 5 (5 February 1999), pp 2053-2068.

– Vernizzi, Graziano, Henri Orland, & A. Zee. “Enumeration of RNA Structures by Matrix Models.” Phys Rev Lett. 94(2006).

– Zee, A. “Random Matrix Theory and RNA Folding.” Acta Physica Polonica B 36(2005): 2829-2836.

• Biology Sources (found in PubMed)

– Brierley, Ian, Simon Pennell, and Robert J. C. Gilbert. “Viral RNA Pseudoknots: Versatile Motifs in Gene Expression and Replication.” Nature Reviews Microbiology 5(2007): 598-610.

– Chen, Jiunn-Liang, and Carol W. Greider. “Functional Analysis of the Pseudoknot Structure in Human Telomerase RNA.” Proceedings of the National Academy of Sciences 102(2005): 8080-8085.

– Maugh, Thomas H. “RNA Viruses: The Age of Innocence Ends.” Science, New Series, Vol. 183, No. 4130. (Mar. 22, 1974), pp. 1181-1185.

– Tu, Chialing, Tzy-Hwa Tzeng, and Jeremy A. Bruenn. “Ribosomal Movement Impeded at a Pseudoknot Required for Frameshifting.” Proceedings of the National Academy of Sciences of the United States of America, Vol. 89, No. 18. (Sep. 15, 1992), pp. 8636-8640.

• Other Sources

– Rong, Yongwu. “Feynman diagrams, RNA folding, and the transition polynomial.” IMA Annual Program Year Workshop: RNA in Biology, Bioengineering and Nanotechnology. October 29-November 2, 2007.

– Staple, D. W., & S. E. Butcher. “Pseudoknots: RNA Structures with Diverse Functions.” PLoS Biology 3(6) (2005): e213. doi:10.1371/journal.pbio.0030213.


THE END