Applications of Knot Theory to DNA

Teresa Rothaar Math 4096 Final Report

Using Knots to Topologically Model DNA

Introduction

Chemists became interested in knot theory in the 19th Century, when Lord Kelvin

hypothesized that atoms existed as knots in a substance called ether, which supposedly

permeated all of space; different knots, Kelvin thought, corresponded to different atoms.

(Kelvin’s friend, mathematical physicist Peter Guthrie Tait, produced the first modern

knot tables.) After Kelvin’s theory was proved false, scientists lost interest in knots,

although mathematicians continued to study them.

Mathematical applications in DNA research emerged in the 1950s, when Crick

and Watson produced the now familiar double helix model of DNA. Even at this time,

Crick and Watson noted that some mechanism must exist to deal with the tangles that

would inevitably occur from this structure. The discovery of DNA knotting 30 years later

reignited interest in knot theory among biologists and biochemists. Knotting is involved in

many of the biological processes of DNA, including the action of enzymes called

topoisomerases, which wind and unwind DNA so that critical processes such as

replication can occur.

Definitions/Basic Examples

What is DNA?

Deoxyribonucleic acid (DNA) is a nucleic acid that contains the instructions

required to construct other cellular components, such as proteins and RNA molecules;

hence it is often referred to as the “blueprint of life.” DNA itself does not act on other

molecules; it is acted upon by enzymes which control replication and other DNA

processes.

The familiar “twisted ladder” structure of DNA consists of two long strands made

of sugars and phosphate groups. Attached to each sugar is one of four types of bases:

adenine (A), cytosine (C), guanine (G) or thymine (T). Each type of base on one strand

bonds with just one type on the other strand; A bonds only with T and C bonds only with

G. These base pairs form the “rungs” of the ladder, and the length of DNA is measured

by counting the number of base pairs. Human DNA contains approximately 3 billion

base pairs, while the bacterium E. coli has approximately 4.6 million.
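
The pairing rule described above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `complement_strand` and the sample sequence are hypothetical, and the sketch ignores the direction in which each strand is read.

```python
# Base pairing as stated in the text: A bonds only with T, C bonds only with G.
PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement_strand(strand: str) -> str:
    """Return, position by position, the bases that pair with `strand`."""
    return "".join(PAIRING[base] for base in strand.upper())

sample = "ATGCCGTA"                    # hypothetical 8-base fragment
partner = complement_strand(sample)
print(partner)                         # TACGGCAT
print(len(sample), "base pairs")       # DNA length is counted in base pairs
```

Applying the function twice returns the original sequence, which reflects the fact that each base has exactly one partner.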

DNA can be linear or circular. Most bacteria and viruses have circular DNA,

while human DNA is linear.

Supercoiling

Human DNA is extremely long and tightly packed into cell nuclei. Imagine

stuffing 200 km of fishing line into a basketball without neatly winding it. This packing

induces a state called supercoiling. When DNA is in a “relaxed” state, each strand circles

the axis of the DNA helix about once every 10.4 base pairs. If the DNA is twisted further,

the strands become more tightly wound and the helix coils upon itself, like an old-fashioned

telephone handset cord that has been twisted around itself.
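
As a rough arithmetic sketch of the 10.4 figure: dividing the length of a molecule in base pairs by 10.4 estimates how many helical turns it has when relaxed. The constant name, the function name and the 5,200-base-pair plasmid below are hypothetical and chosen only for illustration.

```python
BP_PER_TURN = 10.4   # relaxed helical repeat quoted in the text

def relaxed_turns(base_pairs: int) -> float:
    """Estimate the number of helical turns in relaxed DNA of the given length."""
    return base_pairs / BP_PER_TURN

plasmid_bp = 5200    # hypothetical circular plasmid
print(round(relaxed_turns(plasmid_bp), 1))   # 500.0
```

A molecule with more or fewer turns than this estimate is wound more tightly or more loosely than its relaxed state.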

If all of the crossings of the coils are negative, the DNA is negatively supercoiled;

likewise, it is positively supercoiled if all of the crossings are positive.

Meanwhile, in order for enzymes to act on DNA and perform critical processes

like replication, the DNA must be unpacked and the supercoils relaxed; a family of

enzymes called the topoisomerases performs this function.

DNA as a Belt or Ribbon: Lk = Tw + Wr

The Lk = Tw + Wr (Linking No. = Twist + Writhe) formula, well known in the field

of differential geometry as a way to relate space curves, is arguably even better known

to molecular biologists. We will first look at this formula from a conceptual point of view,

using a belt to model twist and writhe.

Imagine the two edges of a belt are strands and that the very center of the belt is

its axis in space (imagine drawing a straight line down the exact center of the belt, from

end to end). With the belt unbuckled, hold on to one end while twisting the other (180

degrees for a half-twist, 360 degrees for a full twist, etc.). Twist describes how the two

edges of the belt (the “strands”) wind around each other in space, specifically, how

tightly the two edges of the belt twist around the belt’s imaginary axis. Writhe measures

how the center line of the belt winds around in space; in other words, the amount of

supercoiling. If the belt is buckled without being untwisted, then relaxed, the twist is

converted into writhe. Moving the belt converts writhe to twist and vice versa.

Thus, while linking number is a topological invariant, writhe and twist are not; as

a result, calculating them is much more complex than calculating linking number. In

addition, any change in twist must be exactly balanced by a change in writhe, and vice

versa, so that the linking number remains unchanged.

Calculating Lk, Tw & Wr

Linking number is calculated exactly as it is in knot theory: ½ the sum of all the

+1 and -1 crossings of the two backbone components of the DNA strand, as they cross

each other while winding around the helix.
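
The counting rule for linking number can be written out as a short sketch. The function name and the two sample crossing lists are hypothetical; the only rule taken from the text is "half the sum of the +1 and -1 crossings between the two strands."

```python
def linking_number(crossing_signs):
    """Half the sum of the signed crossings between two closed strands."""
    for sign in crossing_signs:
        if sign not in (1, -1):
            raise ValueError("each crossing must be +1 or -1")
    return sum(crossing_signs) / 2

# Two strands that cross each other twice, both times positively.
print(linking_number([+1, +1]))    # 1.0

# Six negative crossings between the two backbone strands.
print(linking_number([-1] * 6))    # -3.0
```

Only crossings between the two different strands are counted here; crossings of a strand with itself do not enter the linking number.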

The calculation of twist depends on whether the axis is flat in the plane. If the

axis is flat in the plane, without crossing itself, twist is calculated as simply ½ the sum of

the +1 and -1’s of the crossings between the axis and a particular one of the two strands

bounding the axis. These crossings reflect the helical pitch of the DNA, meaning the

number of base pairs per complete revolution. However, this flat case is a textbook example

that is not seen in reality. When the axis is not flat in the plane, which is the usual case in

the real world, calculating twist is much more complex; in this case, twist is the integral of

the incremental twist of the belt about the axis, integrated as the axis is traversed once.

Mathematically, twist is calculated using Frenet framing, as the total torsion of the

curve γ(s) (the curve being one of the two backbone strands of DNA), measured in full turns:

$$ Tw = \frac{1}{2\pi} \oint \tau \, ds, \qquad \tau = (T \times N) \cdot \frac{dN}{ds} $$

where T is the unit vector tangent to γ(s), N is the derivative of T with respect to the

arc length s of γ(s), divided by its length, and τ is the torsion, or the measure of how

nonplanar the curve is.

Writhe is calculated using signed crossover numbers. For any particular

projection of the axis, the signed crossover number is the sum of all the +1 and -1

crossings where the axis crosses itself. Because writhe is not a topological invariant,

we must calculate the average value of the signed crossover number over every

possible projection of the axis. Because the axis remains fixed in space, “every

possible projection” is defined as the planar pictures we would see if we were to view

the fixed axis from all possible vantage points on a unit sphere surrounding it in space,

as illustrated below:

Mathematically, we take the integral of the signed crossover numbers, integrating

over all vantage points on a unit sphere, then divide it by 4π (the surface area of a unit

sphere):

$$ Wr = \frac{1}{4\pi} \int_{S^2} (\text{signed crossover number}) \, dA $$
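
Averaging over every vantage point is awkward to do by hand, so the sketch below swaps in a different but equivalent formulation: the Gauss double integral over pairs of points on the axis, which is a standard way to express the same quantity. It is a crude numerical approximation for a closed polygon, not a production method; the function names and both test curves are hypothetical.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def writhe_gauss(points):
    """Approximate the writhe of a closed polygon with a midpoint-rule Gauss double sum."""
    n = len(points)
    # Edge vectors (the last edge closes the curve) and edge midpoints.
    seg = [tuple(points[(i + 1) % n][k] - points[i][k] for k in range(3)) for i in range(n)]
    mid = [tuple(points[i][k] + 0.5 * seg[i][k] for k in range(3)) for i in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            r = tuple(mid[i][k] - mid[j][k] for k in range(3))
            dist = math.sqrt(sum(c * c for c in r))
            c = cross(seg[i], seg[j])
            total += sum(c[k] * r[k] for k in range(3)) / dist ** 3
    return total / (4 * math.pi)

def figure_eight(n, height):
    """Closed figure-eight curve; `height` lifts one branch over the other at the crossing."""
    ts = [2 * math.pi * k / n for k in range(n)]
    return [(math.cos(t), math.sin(2 * t), height * math.sin(t)) for t in ts]

circle = [(math.cos(2 * math.pi * k / 60), math.sin(2 * math.pi * k / 60), 0.0)
          for k in range(60)]

print(writhe_gauss(circle))                   # a flat circle has no writhe
print(writhe_gauss(figure_eight(120, 0.2)))   # one crossing: nonzero writhe
print(writhe_gauss(figure_eight(120, -0.2)))  # mirror image: opposite sign
```

The flat circle illustrates an axis with no supercoiling, while the two figure-eight curves show that reversing a crossing reverses the sign of the writhe.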

As an example, consider the figure below:


Part (a) illustrates the familiar concept of positive and negative crossings from

knot theory. Part (b) illustrates supercoiling being condensed into writhe; in this example,

Wr ≈ -3 and Tw ≈ 0. In part (c), the ends of the DNA are pulled apart, but not twisted,

and the absolute value of Wr decreases while that of Tw increases. In part (d), the ends of the

DNA have been pulled apart and twisted, so that the supercoiling has been completely

converted into twist: the DNA helix no longer coils in three dimensions, and Tw = -3

while Wr = 0.
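
The bookkeeping in this example can be checked with a short sketch. The values for parts (b) and (d) are the ones quoted above; the intermediate values for part (c) are hypothetical, chosen only so that the sum stays fixed. The names `states` and `linking_number_of` are likewise illustrative.

```python
# Twist and writhe for the three supercoiled states in the example.
states = {
    "b": {"Tw": 0.0, "Wr": -3.0},    # all supercoiling carried as writhe
    "c": {"Tw": -1.5, "Wr": -1.5},   # hypothetical intermediate split
    "d": {"Tw": -3.0, "Wr": 0.0},    # all supercoiling carried as twist
}

def linking_number_of(state):
    """Lk = Tw + Wr for one state of the molecule."""
    return state["Tw"] + state["Wr"]

for name, state in states.items():
    print(name, linking_number_of(state))   # each line shows -3.0
```

Because no strand is cut between the states, the linking number is the same in all three; only its split between twist and writhe changes.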

Topoisomerases

Topoisomerases are isomerase enzymes which modify the topology of DNA to

unknot, unlink and maintain proper supercoiling, thus making possible the crucial

processes of transcription, recombination and replication. Specifically, they cut a strand

of DNA, allow another segment of DNA to pass through the break, then reseal it. There

are two main types of topoisomerases: type I (which change the linking number of the

DNA in increments of 1) and type II (which change it in increments of 2).
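
The two step sizes can be illustrated with a small sketch. The function name, the `direction` parameter and the starting linking number of -6 are hypothetical; only the increments of 1 and 2 come from the text.

```python
# Change in linking number per enzyme action, as stated in the text.
STEP = {"I": 1, "II": 2}

def apply_topoisomerase(lk: int, kind: str, direction: int) -> int:
    """Return the linking number after one action of a type I or type II enzyme."""
    if direction not in (1, -1):
        raise ValueError("direction must be +1 or -1")
    return lk + direction * STEP[kind]

lk = -6                                   # hypothetical negatively supercoiled molecule
lk = apply_topoisomerase(lk, "II", +1)
print(lk)                                 # -4
lk = apply_topoisomerase(lk, "I", +1)
print(lk)                                 # -3
```

Starting from an even linking number, type II actions alone can reach only even values, whereas type I actions can reach any integer.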

Main Results

How the Topoisomerases Regulate Supercoiling, Knotting and Linking

Type I topoisomerases can break only a single backbone strand, and thus only

operate on single-stranded DNA or double-stranded DNA which contains a nick (a

broken phosphodiester bond between the sugars of two consecutive bases on one

strand). The sole function of type I topoisomerases is to regulate supercoiling. By

breaking one backbone strand, letting the other strand pass through it, then resealing

the break, the enzyme converts a change in twist into a change in writhe (supercoiling). Type I

topoisomerases are needed for DNA replication to occur. During replication, the DNA

helix is unzipped, and supercoils can build up in advance of the unzipped region. If the

supercoiling becomes too tight, the DNA molecule can break. Type I topoisomerases

release supercoiling as needed to prevent this. Type I topoisomerases have been

found in all studied organisms, and their absence causes cell death.

Type II topoisomerases break both backbone strands and thus operate on

double-stranded DNA. Similar to what happens with a type I, both backbone strands

are cut, another unbroken DNA helix is passed through them, and the cuts are resealed.

While they can add and remove supercoils, the primary purpose of type II

topoisomerases is to change DNA knot or link type. Further, they preferentially unknot

and unlink DNA; this is called topological simplification. While scientists agree that type

II topoisomerases do their work in an extremely efficient manner, exactly how they

achieve this level of efficiency is still debated. Among other functions, type II

topoisomerases play a crucial role in DNA replication. At the end of the replication

process, the daughter chromosomes must be completely disentangled (unlinked) before mitosis

(division of the chromosomes in a cell nucleus) can occur; if the daughter chromosomes are not

unlinked, the cell cannot divide and dies. Because of their involvement in cell

replication, many chemotherapy and antibiotic drugs target type II topoisomerases; the

drugs work by preventing cancer/bacterial cells from reproducing, thus killing existing

disease cells and preventing them from producing new ones.

Applications

Unfortunately, scientists have no way of observing the action of topoisomerases

directly; they can view only the beginning and end results. Thus, while it is understood

what the topoisomerases do, how, exactly, they perform the functions of unknotting and

unlinking is still a mystery. Because the topoisomerases are involved in knotting and

unknotting, the unknotting number can be used to understand topoisomerase action. If

the unknotting number of a particular DNA molecule is known, biochemists can

accurately estimate how long it will take for a topoisomerase to unknot it. Further,

because both type I and II topoisomerases can change only one crossing at a

time, understanding the unknotting action of type II topoisomerases is directly related to

the goal of classifying all knots with unknotting number 1.
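
A small lookup sketch makes the connection concrete. The unknotting numbers below are standard tabulated values for a few small knots, written in the usual "crossings_index" notation; the dictionary and function names are hypothetical, and the count of strand passages assumes, as stated above, that each enzyme action changes exactly one crossing.

```python
# Unknotting numbers of a few small knots (standard knot-table values).
UNKNOTTING_NUMBER = {"3_1": 1, "4_1": 1, "5_1": 2, "5_2": 1, "7_1": 3}

def one_step_unknottable(table):
    """Knots that a single crossing change can turn into the unknot."""
    return sorted(knot for knot, u in table.items() if u == 1)

def min_strand_passages(knot):
    """Fewest crossing changes needed to unknot `knot`, one per enzyme action."""
    return UNKNOTTING_NUMBER[knot]

print(one_step_unknottable(UNKNOTTING_NUMBER))   # ['3_1', '4_1', '5_2']
print(min_strand_passages("7_1"))                # 3
```

The unknotting number gives only a lower bound on the work an enzyme must do, since nothing forces each strand passage to be a simplifying one.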

In addition to unknotting numbers, crossing changes and the concept of knot

distance (the minimum number of crossing changes needed to convert one knot to

another) play a role in understanding topoisomerase activity. Knot distances have

been tabulated for the rational knots, some non-rational knots, and composites of

rational knots up to 13 crossings, but there are gaps in the tabulations. Research to fill

in the gaps and improve the lower bounds of knot distances is ongoing.

Although current technology does not allow direct observation of topoisomerase

activity, scientists can use electrophoresis to separate molecules of different shapes

and weights. Although DNA molecules of the same length all have the same weight, the number of crossings

a knotted DNA molecule has affects how quickly it will travel through a gel; the more

knotted or supercoiled the molecule, the more compact it is, and thus the more quickly it will travel.

Future Directions & Open Problems

There are additional topoisomerases beyond the two types discussed in this

paper. It is believed that topoisomerase III regulates recombination, while

topoisomerase IV regulates the process by which newly replicated chromosomes

segregate from one another. At least one study indicates that topoisomerase IV

unknots the DNA of E. coli.

In addition to their crucial role in the action of topoisomerases, knots and links

also impact the function of two other critical enzymes, the recombinases and

transposases. The role of knots in a process called site-specific recombination—which

deletes, inverts or inserts a DNA segment and reshuffles the genetic sequence—is the

subject of extensive collaborative research between biochemists and mathematicians.

Mathematician Dorothy Buck, for example, has shown that site-specific recombination

produces only knots from a certain family. It is hoped that understanding of the specific

knots involved in site-specific recombination will lead to treatments for viral infections

and genetic disorders.


References

1. Darcy, I. K., Sumners, D. W., Applications of Topology to DNA, in Knot Theory, Banach Center Publications, Volume 42, 1998.

2. Buck, D. (2009) DNA Topology. In: Buck, D. and Flapan, E. (eds.) Proceedings of Symposia in Applied Mathematics, Volume 66: Applications of Knot Theory. American Mathematical Society. pp 47-82.

3. Collins, J., DNA or knot DNA? That is the question. (Slides from talk/presentation.) PG Colloquium, 1 February 2007.

4. Tompkins, J., Modeling DNA with Knot Theory: An Introduction. Undergraduate research project, University of Texas at Tyler, Summer 2005.

5. Adams, C., The Knot Book, New York: W.H. Freeman and Company, 1994.

6. Deibler, R.W., Rahmati, S., Zechiedrich, L., Topoisomerase IV, alone, unknots DNA in E. coli, Genes & Dev. 2001. 15: 748-761.

7. Buck, D. & Flapan, E. (2007) A topological characterization of knots and links arising from site-specific recombination. Journal of Molecular Biology, 374 (5), 1186–1199. DOI: 10.1016/j.jmb.2007.10.016

8. Cozzarelli, N.R., Cost, G.J., Nöllmann, M., Viard, T., and Stray, J.E., “Giant proteins that move DNA: bullies of the genomic playground,” Nature Reviews Molecular Cell Biology 7, 580-588 (August 2006).

9. Moon, H., Darcy, I., Polynomial invariants, knot distances and topoisomerase action (poster), Advanced School and Conference on Knot Theory and its Applications to Physics and Biology, May 11-29, 2009, International Centre for Theoretical Physics, Trieste, Italy.

10. MedicineNet.com entry for Topoisomerase: http://www.medterms.com/script/main/art.asp?articlekey=32631.

11. Wikipedia entry for Peter Guthrie Tait: http://en.wikipedia.org/wiki/Peter_Guthrie_Tait.

12. Wikipedia entry for DNA: http://en.wikipedia.org/wiki/Dna.

13. Wikipedia entry for Topoisomerase: http://en.wikipedia.org/wiki/Topoisomerase.

14. Wikipedia entry for Frenet-Serret Formulas: http://en.wikipedia.org/wiki/Frenet%E2%80%93Serret_formulas
