Biochemistry, computing in biology


Page 1: Biochemistry, computing in biology 1Introduction 2Theoretical background Biochemistry/molecular biology 3Theoretical background computer science 4History

Biochemistry, computing in biology

Page 2

1 Introduction

2 Theoretical background Biochemistry/molecular biology

3 Theoretical background computer science

4 History of the field

5 Splicing systems

6 P systems

7 Hairpins

8 Detection techniques

9 Micro technology introduction

10 Microchips and fluidics

11 Self assembly

12 Regulatory networks

13 Molecular motors

14 DNA nanowires

15 Protein computers

16 DNA computing - summary

17 Presentation of essay and discussion

Course outline

Page 3

Recombination

Page 4

Recombination and crossover

Page 5

Recombination and crossover

Page 6

If no exchange of genes (i.e. phenotypic markers) occurs, the recombination event cannot be detected

Recombination and crossover

Page 7

Recombination and crossover

Page 8

Introduction to ciliates

Page 9

Literature

Genome Gymnastics: Unique Modes of DNA Evolution and Processing in Ciliates. David M. Prescott, Nature Reviews Genetics

Computational power of gene rearrangement. Lila Kari and Laura Landweber, DIMACS series in discrete mathematics and theoretical computer science

Page 10

Very ancient (~2 × 10^9 years ago)

Very rich group (~10,000 genetically different organisms)

Very important from the evolutionary point of view

The ciliate

Page 11

DNA molecules in the micronucleus are very long (hundreds of kbp)

DNA molecules in the macronucleus are gene-sized and short (average ~2000 bp)

The ciliate

Page 12

The ciliate

Page 13

Baldauf et al. 2000. Science 290:972.

The ciliate tree

Page 14

Images (with scale bars): Urostyla grandis (bar: 50 µm), Holosticha kessleri (bar: 100 µm), Uroleptus sp. (bar: 100 µm), Eschaneustyla sp. (bar: 25 µm)

Scrambled genes found: S. lemnae, O. trifallax, O. nova

Page 15

The ciliate

Page 16

The ciliate

Page 17

DAPI staining of the ciliate

Page 18

Nuclei

Micronucleus: the small nucleus containing a single copy of the genome that is used for sexual reproduction

Macronucleus: the large nucleus that carries up to several hundred copies of the genome and controls metabolism and asexual reproduction

Page 19

Prescott, 2000

Macronucleus

Micronucleus

Cutting, splicing, elimination, reordering, and amplification of DNA

Lifecycle of a ciliate

Page 20

The ciliate, meiosis

Page 21

Cell Pairing

Meiosis and Nuclear Exchange

Nuclear Fusion and Duplication of the Zygotic Nucleus

Macronuclear Development and Nuclear Degeneration

MIC

MAC

Modified from Larry Klobutcher & Carolyn Jahn, Annual Review of Microbiology, 2002

Polytenization

Chromatid breakage

De novo telomere formation

The ciliate, reproduction

Page 22

Computing in ciliates

Page 23

Astounding feats of ‘DNA computing’ are routine in this ‘simple’ single-celled organism, a protozoan. In the initial micronucleus, the DNA is ‘junky’ and scrambled, but…

…in the macronucleus it reassembles itself in the proper sequence by means of computer-like acrobatics (unscrambling, throwing out genetic ‘junk’)

The ciliate

Page 24

IES: internal eliminated segments

MDS: macronuclear destined sequences

MAC

MIC

Telomere Pointers

The complexity of spirotrich biology

Page 25

Splicing

Page 26

Fractioned genes

Page 27

Intervening non-coding DNA regions (IES: internal eliminated segments) interrupt protein-coding sequences (MDS: macronuclear destined sequences)

IESs are removed during macronuclear development

MDSs are unscrambled

Prescott, 2000

The complexity of gene scrambling

Page 28

Actin I

DNA polymerase

Landweber et al., 2000

Hogan et al., 2001

α-TBP

Prescott et al., 1998

Oxytricha nova

Scrambled genes: α-TBP, actin I, DNA pol α

Page 29

Prescott et al, 1998

Degree of scrambling in α-TBP

Page 30

Hogan et al, 2001

Unscrambling of actin I

Page 31

Landweber et al, 2000

Degree of scrambling in DNA pol α

Page 32

DNA folding and recombination: DNA pol α

Page 33

DNA folding and recombination

Page 34

DNA pol α: Hairpin loop

Prescott, 2000

DNA folding and recombination: DNA pol α

Page 35

Prescott et al, 1998

Recombination: α-TBP

Page 36

(i) Isolate the micronuclear and macronuclear forms of the α-TBP gene

(ii) Compare the micronuclear and macronuclear gene structures (MDSs and IESs) to determine whether the gene is scrambled

(iii) Compare homologous MDSs and scrambling patterns in various stichotrich species (earlier diverging species vs later diverging species)

(iv) Trace a parsimonious evolutionary scrambling pathway

Tracing evolutionary scrambling

Page 37

Uroleptus sp.

Oxytrichidae and Paraurostyla weissei

Comparisons of scrambling complexity

Page 38

Oxytricha trifallax

Oxytricha nova

Stylonychia mytilus

Uroleptus sp.

Paraurostyla weissei

100

100

100

The evolution of recombination

Page 39

P. weissei Uroleptus sp.

Holosticha sp.

O. trifallax

O. nova

S. mytilus

Evolutionary scrambling pathway

Page 40

Formal theory

Page 41

Ciliate computing

The process of gene unscrambling in hypotrichous ciliates represents one of nature’s ingenious solutions to the computational problem of gene assembly.

With some essential genes fragmented in as many as 50 pieces, these organisms rely on a set of sequence and structural clues to detangle their coding regions.

For example, pointer sequences present at the junctions between coding and non-coding sequences permit reassembly of the functional copy. As the process of gene unscrambling appears to follow a precise algorithm or set of algorithms, the question remains: what is the actual problem being solved?

Page 42

Genomic copies of some protein-coding genes are obscured by intervening non-protein-coding DNA sequence elements (internally eliminated sequences, IES)

Protein-coding sequences (macronuclear destined sequences, MDS) are present in a permuted order, and must be rearranged.

The problem in the cell

Page 43

By clever structural alignment…, the cell decides which sequences are IES and MDS, as well as which are guides.

After this decision, the process is simply sorting, O(n).

The decision process is unknown, but it amounts to finding the correct path. This is the most costly step.

Assumption
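The "simply sorting, O(n)" step can be sketched in a few lines of Python. This is only an illustration under strong assumptions: the hard decision (which segments are IESs, which are MDSs, and where each MDS belongs) is taken as already made, and the `Segment` record, its field names and the toy sequences are hypothetical, not taken from the slides. Pointer overlaps between neighbouring MDSs are ignored.

```python
from typing import List, NamedTuple, Optional


class Segment(NamedTuple):
    """One piece of a micronuclear gene (hypothetical record for this sketch)."""
    seq: str                   # DNA sequence of the segment
    mds_index: Optional[int]   # 1-based position in the gene; None marks an IES
    inverted: bool = False     # True if the MDS is stored reverse-complemented


_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def assemble(micronuclear: List[Segment]) -> str:
    """Drop IESs and place every MDS directly into its slot (one pass, O(n))."""
    mds = [s for s in micronuclear if s.mds_index is not None]
    slots: List[Optional[str]] = [None] * len(mds)
    for s in mds:
        if not 1 <= s.mds_index <= len(mds) or slots[s.mds_index - 1] is not None:
            raise ValueError(f"bad or duplicate MDS index: {s.mds_index}")
        slots[s.mds_index - 1] = reverse_complement(s.seq) if s.inverted else s.seq
    return "".join(slots)


# Toy scrambled gene: order MDS2, MDS1, MDS3 (inverted), separated by IESs.
scrambled = [
    Segment("AAA", None),
    Segment("GGC", 2),
    Segment("TT", None),
    Segment("ATG", 1),
    Segment("C", None),
    Segment("TTA", 3, inverted=True),
]
print(assemble(scrambled))  # ATGGGCTAA
```

Because every MDS already carries its target index, no comparison sort is needed; each piece is written straight into its slot, which is why this step is linear. All of the real cost sits in producing the labels.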

Page 44

Either there is some as yet undiscovered “oracle” mechanism within the cell, or the cell simulates non-determinism.

The former solution lacks biological credibility, and the latter implies an exponential explosion in time and space.

What we want is a deterministic algorithm for applying the inter- and intramolecular recombination operations to descramble an arbitrary gene.

Ciliate computing

Page 45

The first proposed step in gene unscrambling (alignment or combinatorial pattern matching) may involve searches through several possible matches, via either intramolecular or intermolecular strand associations.

This part could be similar to Adleman’s (1994) DNA solution of a directed Hamiltonian path problem.

Ciliate computing

Page 46

The second step (homologous recombination at aligned repeats) involves the choice of whether to retain the coding or the non-coding segment between each pair of recombination junctions.

This decision process could even be equivalent to solving an n-bit instance of a satisfiability problem, where n is the number of scrambled segments.

Ciliate computing

Page 47

We use our knowledge of the first step to develop a model for the guided homologous recombinations and prove that such a model has the computational power of a Turing machine, the accepted formal model of computation. This indicates that, in principle, these unicellular organisms may have the capacity to perform at least any computation carried out by an electronic computer.

Ciliate computing

Page 48

Assume the cell simply reconstructs the genes by matching up pointers.

Just one problem: pointer sequences are not unique. In fact, they may have multiplicities greater than 13.

The proposed solution to this was that the cell would simply try every possible combination of pointers until it found the right two.

Ciliate computing, the naïve model
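A minimal Python sketch of this naïve model follows. It assumes a hypothetical representation in which each MDS is a triple (incoming pointer, body, outgoing pointer), with `<` and `>` standing in for the telomeric start and end; none of these names come from the slides. The search simply tries every ordering and keeps those whose pointers chain correctly.

```python
from itertools import permutations
from typing import List, Tuple

# (incoming pointer, body, outgoing pointer); a hypothetical encoding for this sketch
MDS = Tuple[str, str, str]


def consistent_orderings(fragments: List[MDS],
                         start: str = "<",
                         end: str = ">") -> List[Tuple[MDS, ...]]:
    """Brute force: return every ordering in which adjacent pointers match."""
    if not fragments:
        return []
    found = []
    for order in permutations(fragments):          # n! candidate orderings
        if order[0][0] != start or order[-1][2] != end:
            continue
        # the outgoing pointer of each MDS must equal the incoming pointer of the next
        if all(a[2] == b[0] for a, b in zip(order, order[1:])):
            found.append(order)
    return found


# Unique pointers: exactly one assembly is consistent.
unique = [("p2", "B", "p3"), ("<", "A", "p2"), ("p3", "C", ">")]
# The same pointer "x" used at every junction: the assembly becomes ambiguous.
repeated = [("<", "A", "x"), ("x", "B", "x"), ("x", "D", "x"), ("x", "C", ">")]

for name, frags in (("unique", unique), ("repeated", repeated)):
    genes = ["".join(m[1] for m in order) for order in consistent_orderings(frags)]
    print(name, sorted(genes))
```

With unique pointers the search returns a single gene, but once a pointer repeats, several orderings pass the check, so pointer matching alone cannot identify the right one. The factorial number of candidates is also what makes this model implausible for a gene in 50 pieces.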

Page 49

Relies on short repeat sequences to act as guides in homologous recombination events

Splints are analogous to the edges in Adleman’s experiment

One example represents the solution of a 50-city Hamiltonian path problem (50 pieces reordered)

How the cell computes

Page 50

Guided recombination system

uxwxv ⇒ uxv + •wx

Formal model

Page 51

Context necessary for a recombination between repeats x

(p, x, q) ~ (p’, x, q’)

Formal model

Page 52

Formal Language Model

Where u = u’p, w = qw’ = w’’p’, v = q’v’

Intramolecular recombination. The guide is x. The circular strand •wx is deleted from the original.

Intermolecular recombination. Strand exchange.

This is a universal Turing machine (proven by Tom Head)

uxwxv ⇒ uxv + •wx

Formal model, splicing operation
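The two recombination operations can be sketched on plain Python strings. This is an illustrative reading of the formal model, not the original formulation: strands are treated as strings, the circular product is returned as an ordinary string, and the function names, the explicit positions `i` and `j` of the repeat, and the optional `context` tuple (p, q, p’, q’) are assumptions made for the example. The intermolecular rule is taken to be uxv + u’xv’ ⇒ uxv’ + u’xv.

```python
from typing import Optional, Tuple


def intramolecular(strand: str, x: str, i: int, j: int,
                   context: Optional[Tuple[str, str, str, str]] = None
                   ) -> Tuple[str, str]:
    """uxwxv => (uxv, circular wx), with the guide x at positions i < j.

    If context = (p, q, p2, q2) is given, the recombination is allowed only
    when u ends with p, w starts with q, w ends with p2 and v starts with q2.
    """
    n = len(x)
    if not (0 <= i and i + n <= j
            and strand[i:i + n] == x and strand[j:j + n] == x):
        raise ValueError("x must occur at both positions without overlap")
    u, w, v = strand[:i], strand[i + n:j], strand[j + n:]
    if context is not None:
        p, q, p2, q2 = context
        if not (u.endswith(p) and w.startswith(q)
                and w.endswith(p2) and v.startswith(q2)):
            raise ValueError("contexts do not permit recombination")
    return u + x + v, w + x   # linear product, excised circle (read from w)


def intermolecular(s1: str, s2: str, x: str, i: int, j: int) -> Tuple[str, str]:
    """uxv + u'xv' => (uxv', u'xv): strand exchange at the shared repeat x."""
    n = len(x)
    if s1[i:i + n] != x or s2[j:j + n] != x:
        raise ValueError("x must occur at position i of s1 and j of s2")
    u, v = s1[:i], s1[i + n:]
    u2, v2 = s2[:j], s2[j + n:]
    return u + x + v2, u2 + x + v


linear, circle = intramolecular("AAGTCCCGTTT", "GT", 2, 7,
                                context=("A", "C", "C", "T"))
print(linear, circle)                                    # AAGTTT CCCGT
print(intermolecular("AAGTCC", "TTGTGG", "GT", 2, 2))    # ('AAGTGG', 'TTGTCC')
```

In the first call u = AA, w = CCC and v = TT, so the segment between the two copies of the guide is cut out together with one copy of x, while the other copy stays in the linear product.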

Page 53

Formal model, splicing operation

Page 54

Gene unscrambling algorithm

Page 55

Ciliate computing

Page 56

Micronucleus: cell mating

Macronucleus: RNA transcripts (expression)

Micro: I0 M1 I1 M2 I2 M3 … Ik Mk Ik+1

M = P1 N P2

Macro: permutation of (possibly rotated) M1, …, Mk; I0, …, Ik+1 are removed

Gene assembly in ciliates

Page 57

Molecular operators

Page 58

Molecular operators

Page 59

Molecular operators

Page 60

Molecular operators

Page 61

Molecular operators

Page 62

Molecular operators

Page 63

Molecular operators

Page 64

Molecular operators

Page 65

Molecular operators

Page 66

Molecular operators

Page 67

Molecular operators

Page 68

Molecular operators

Page 69

Molecular operators

Page 70

The pointer sequences must be in spatial proximity during unscrambling

Topology must be faithfully reproduced somehow

Pointers

Page 71

Recombination event attaches the Minor Locus to the end of the Major Locus

Relocation of a locus