DNA Computing on Surfaces

Anne Condon, Computer Science, UBC
Robert Corn, Chemistry, U. Wisconsin
Max Lagally, Materials Science, U. Wisconsin
Lloyd Smith, Chemistry, U. Wisconsin

Goals

• Encode information in DNA strands

• Compute on many strands in parallel: chemical manipulations = logical operations

(Adleman, Science 266:1994)

“…the number of operations per second … would exceed that of current supercomputers by a thousandfold … remarkable energy efficiency … information density a dramatic improvement over existing storage media”

Len Adleman, Science 266:1994

“for certain intrinsically complex problems…where existing electronic computers are very inefficient and where massively parallel searches can be organized to take advantage of the operations that molecular biology currently provides, molecular computation might compete with electronic computation in the near term”

Outline

• Background

  • What is computation? What is DNA?

  • DNA computation

    • in the biotech industry

    • in the solution of combinatorial problems

  • Research on DNA computation

• DNA Computing on Surfaces

  • Models

  • Experiments

• Conclusions

What is Computation? (very simple view)

• Input: string over finite alphabet

• Process: determine if input satisfies some property

• Output: yes or no

Satisfy a Property: Binary Inputs

• Set the output of a circuit to 1

[Figure: circuit of and, or, and not gates with binary values on its input and output wires]

Satisfy a Property: Non-binary Inputs

• Set the output of a generalized circuit to a given value

[Figure: generalized circuit with values from {A, C, G, T} on its input and output wires]

Simple Parallel Computation

• Input: set of strings

• Process: independently for each input, determine if it satisfies a common circuit

• Output: indicate whether there exists an input satisfying the circuit
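The input/process/output pattern above can be sketched in a few lines of Python. This is only an illustration: the three-input circuit and the names `circuit` and `exists_satisfying` are assumptions made for the example, not taken from the slides.

```python
from itertools import product

def circuit(bits):
    """Hypothetical example circuit: (b0 AND b1) OR (NOT b2)."""
    b0, b1, b2 = bits
    return bool((b0 and b1) or (not b2))

def exists_satisfying(inputs, predicate):
    """Output step: True iff at least one input satisfies the common circuit."""
    # Each input is tested independently of the others, which is what
    # makes the process step trivially parallel.
    return any(predicate(x) for x in inputs)

# Input step: the full set of 3-bit strings.
all_inputs = list(product((0, 1), repeat=3))
print(exists_satisfying(all_inputs, circuit))
```

In the DNA setting the loop over inputs disappears: every strand is tested at the same time by one chemical operation.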

What is DNA?

“DNA Computation:” Affymetrix Arrays

• Input: strings over {A,C,G,T} (represented as the corresponding single-stranded DNA). Photolithography is used to synthesize and array DNA strands on a planar surface

• Process: e.g. for each input, test if it approximately matches a given string (i.e. hybridizes to Watson-Crick complement of given string)

• Output: fluorescence detection
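As a rough software analogue of the array test, the sketch below treats hybridization as "equal to the reverse complement, up to a few mismatches". That is a strong simplification of the real chemistry, and the function names, the example sequences, and the `max_mismatches` threshold are all assumptions made for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(strand):
    """Watson-Crick complement, read in the opposite direction."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strands must have equal length")
    return sum(x != y for x, y in zip(a, b))

def hybridizes(array_strand, sample, max_mismatches=1):
    """Crude stand-in for hybridization of a sample to an arrayed strand."""
    return hamming(array_strand, reverse_complement(sample)) <= max_mismatches

given = "ACGTAC"                      # the string we want to match
sample = reverse_complement(given)    # what is washed over the array
array = ["ACGTAC", "ACGTAA", "TTTTTT"]
print([s for s in array if hybridizes(s, sample)])
```

The strands that "light up" are those within one mismatch of the given string, which corresponds to the fluorescence readout on the array.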

Adleman’s Hamiltonian Path Experiment

• Input: generate random paths

• Process:

  • select paths from S to T

  • select paths with 7 nodes

  • select paths entering all nodes at least once

• Output: “yes” iff path remains

[Figure: directed graph on nodes S, 1, 2, 3, 4, 5, T]
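The generate-and-filter structure of the experiment can be imitated in Python. The edge set below is hypothetical (the figure's actual edges are not recoverable from the slide text), and the random-walk generator is only a stand-in for random ligation of edge strands.

```python
import random

# Hypothetical directed graph on the slide's seven node labels.
EDGES = {
    "S": ["1", "2"], "1": ["2", "3"], "2": ["3", "4"],
    "3": ["4", "5"], "4": ["5", "1"], "5": ["T", "2"], "T": [],
}
NODES = list(EDGES)

def random_path(rng, max_len=10):
    """Random walk from a random node, following edges of the graph."""
    node = rng.choice(NODES)
    path = [node]
    for _ in range(rng.randint(0, max_len - 1)):
        if not EDGES[node]:
            break
        node = rng.choice(EDGES[node])
        path.append(node)
    return path

def hamiltonian_paths(paths):
    """Apply the three selection steps in the order given on the slide."""
    paths = [p for p in paths if p[0] == "S" and p[-1] == "T"]
    paths = [p for p in paths if len(p) == len(NODES)]
    paths = [p for p in paths if set(p) == set(NODES)]
    return paths

rng = random.Random(0)  # seeded only so that runs are repeatable
pool = [random_path(rng) for _ in range(20000)]
survivors = hamiltonian_paths(pool)
print("yes" if survivors else "no")
```

Each list comprehension plays the role of one selection step in the test tube; the answer is "yes" exactly when something is left after the last step.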

Generate Random Paths

• Associate DNA strands with nodes and edges

• Join edge strands in test tube to form double-stranded “paths” (hybridization, ligation)

• Wash to form single-stranded paths


Adleman’s Experiment: Select Paths That Enter Node 2

• Attach strand associated with node 2 to beads and introduce to test tube

• The paths that enter node 2 hybridize to strands on the beads

• Remove beads; wash and detach desired paths

Biomolecular Computation Research

• “Classical” DNA/RNA computation (e.g. search-and-prune)

• O(1)-biostep computation (e.g. self-assembly of 3-D DNA molecules)

• Splicing-based computation

• Non-computational applications (e.g. exquisite detection, DNA2DNA computation, DNA nanotechnology, DNA tags)

DNA Computing on Surfaces

• Advantages over “solution phase” chemistry:

  • Facile purification steps

  • Reduced interference between strands

  • Easily automated

• Disadvantages:

  • Loss of information density (2D)

  • Lower surface hybridization efficiency

  • Slower surface enzyme kinetics

DNA Surface Model: Input

DNA strands representing the set {0,1}^n are synthesized and subsequently immobilized on a surface in a non-addressed fashion

Encoding of Binary Information in DNA Strands

A strand is comprised of words. Each word is a short DNA strand (16mer) representing one or more bits.

[Figure: a strand (ACCT...) divided into words; column labels Word and Bit, with entries 1, 2, 3, 4]

DNA Word Design Problem

• Requirements of a “DNA code”:

  – Success in specific hybridization between a DNA code word and its Watson-Crick complement

  – Few false positive signals

• Virtually all designs enforce combinatorial constraints on the code words

• Applications:

  – Information storage, retrieval for DNA computing

  – Molecular bar codes for chemical libraries

What combinatorial constraints are placed on DNA Codes?

• Hamming: distance between two code words should be large

• Reverse complement: distance between a word and the reverse complement of another word should be large

• Also: frame shift, distinct sub-words, forbidden sub-words, …
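The first two constraints are easy to check mechanically. The sketch below is a minimal checker under the assumption that "large" means "at least some threshold d"; the function names and the tiny 4-mer example codes are invented for illustration and are far shorter than the 16mer words used in the surface model.

```python
from itertools import permutations

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(word):
    return "".join(COMPLEMENT[base] for base in reversed(word))

def hamming(a, b):
    """Number of positions at which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

def satisfies_constraints(words, d):
    """Check the Hamming and reverse-complement constraints at distance d."""
    for u, v in permutations(words, 2):   # all ordered pairs of distinct words
        if hamming(u, v) < d:
            return False                  # two code words are too similar
        if hamming(u, reverse_complement(v)) < d:
            return False                  # u could hybridize to v itself
    return True

print(satisfies_constraints(["AACC", "CCAA"], 4))
print(satisfies_constraints(["AACC", "GGTT"], 1))
```

The second code fails even though its two words differ in every position: GGTT is exactly the reverse complement of AACC, so the two words would bind to each other.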

Work on DNA code design

• Seeman (1990): de novo design of sequences for nucleic acid structural engineering

• Brenner (1997): sorting polynucleotides using DNA tags

• Shoemaker et al. (1996): analysis of yeast deletion mutants using a parallel molecular bar-coding strategy

• Many other examples in DNA computing

Word Design Example

DNA Surface Model: Process

• MARK strands in which bit j = 0 (or 1): hybridize with Watson-Crick complements of word containing bit j, followed by polymerization

• DESTROY unmarked strands: exonuclease degradation

• UNMARK strands: wash in distilled water
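At the logical level, the three operations act on a set of bit strings, each carrying a marked/unmarked flag. The class below is a sketch of that abstraction only: the name `Surface`, the use of tuples for strands, and 0-based bit indices are assumptions, and none of the chemistry (hybridization, polymerization, exonuclease) is modeled.

```python
from itertools import product

class Surface:
    """Toy model: every n-bit strand is present once, initially unmarked."""

    def __init__(self, n):
        self.strands = {bits: False for bits in product((0, 1), repeat=n)}

    def mark(self, j, value):
        """MARK every strand whose bit j equals value."""
        for strand in self.strands:
            if strand[j] == value:
                self.strands[strand] = True

    def destroy(self):
        """DESTROY every strand that is currently unmarked."""
        self.strands = {s: m for s, m in self.strands.items() if m}

    def unmark(self):
        """UNMARK all remaining strands."""
        for strand in self.strands:
            self.strands[strand] = False

surface = Surface(2)
surface.mark(0, 1)     # mark strands with bit 0 equal to 1
surface.destroy()      # strands with bit 0 equal to 0 are removed
surface.unmark()
print(sorted(surface.strands))
```

Several MARK operations before a single DESTROY act as a logical OR, since a strand survives if any one of them marked it.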

DNA Surface Model: Output

• Detect remaining strands (if any) by detaching strands from surface and amplifying using PCR (polymerase chain reaction).

Computational Power of DNA Surface Model

Theorem: Any CNFSAT formula of size m can be computed using O(m) mark, unmark and destroy operations.

Theorem: Any circuit of size m can be computed using O(m) mark, unmark, destroy, and append operations.
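The idea behind the first theorem can be illustrated by processing a CNF formula one clause at a time: one MARK per literal, then one DESTROY and one UNMARK. The sketch below counts operations under the assumption that the size m is the number of literal occurrences; the encoding of literals as (bit index, required value) pairs and the name `solve_cnf` are choices made for this example. It covers only the CNF case and does not model the append operation of the second theorem.

```python
from itertools import product

def solve_cnf(clauses, n):
    """Return (surviving assignments, number of surface operations).

    clauses: list of clauses, each a list of (bit index, required value).
    """
    surviving = set(product((0, 1), repeat=n))
    ops = 0
    for clause in clauses:
        marked = set()
        for j, value in clause:                 # one MARK per literal
            marked |= {s for s in surviving if s[j] == value}
            ops += 1
        surviving = marked                      # DESTROY unmarked strands
        ops += 1
        ops += 1                                # UNMARK before the next clause
    return surviving, ops

# (x0 OR NOT x1) AND (x1 OR x2)
formula = [[(0, 1), (1, 0)], [(1, 1), (2, 1)]]
survivors, ops = solve_cnf(formula, 3)
print(sorted(survivors), ops)
```

With m literal occurrences and at most m clauses, the loop uses at most m MARKs plus 2m DESTROY/UNMARK steps, which is where the O(m) bound comes from in this reading.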

Surface DNA Computation: the Satisfiability Problem

• Input: 16 strands

• Process:

  MARK if bit z = 1; MARK if bit w = 1; MARK if bit y = 0; DESTROY; UNMARK

  MARK if bit w = 0; MARK if bit y = 0; DESTROY; UNMARK

• Output: exactly those strands that satisfy the circuit remain on the surface.

[Figure: circuit of and, or, and not gates over inputs w, x, y, z]
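The operation sequence listed on this slide can be traced on all 16 strands. The sketch below uses only the MARK conditions that survive in the slide text; the original circuit figure may have contained more than these two rounds, and variable x does not appear in any of the listed operations. The names `VARS`, `ROUNDS`, and `run` are illustrative.

```python
from itertools import product

VARS = ("w", "x", "y", "z")
strands = [dict(zip(VARS, bits)) for bits in product((0, 1), repeat=4)]

# One inner list per MARK ... DESTROY UNMARK round, as listed on the slide.
ROUNDS = [
    [("z", 1), ("w", 1), ("y", 0)],
    [("w", 0), ("y", 0)],
]

def run(strands, rounds):
    """Return the surviving strands and the strand count after each round."""
    counts = [len(strands)]
    for marks in rounds:
        # A strand is marked if any listed bit has the listed value;
        # DESTROY keeps only marked strands. UNMARK is implicit here
        # because marks are recomputed from scratch in every round.
        strands = [s for s in strands if any(s[v] == b for v, b in marks)]
        counts.append(len(strands))
    return strands, counts

survivors, counts = run(strands, ROUNDS)
print(counts)  # [16, 14, 10]
```

Under these two rounds, a strand survives if y = 0, or if w = 0 and z = 1; x is unconstrained, which gives 8 + 2 = 10 survivors.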

DNA Computing on Surfaces: Experiments

Students: Tony Frutos, Susan Gillmor, Zhen Guo, Qinghua Liu, Andy Thiel, Liman Wang

MARK Operation: 4-Base Mismatch Word Design

Repeated MARK, DESTROY, UNMARK Operations

Append (DNA Ligase)

A. Hybridize with Cb

B. Hybridize with Cab, Wb

C. Ligate; Wash; Hybridize with Cb.

Two-Word Mark and Destroy

A. Mark C1a, C1b, C2b

B. Ligate; Melt single words

C. Destroy; Unmark; Mark C1a, C1b, C2b.

Surface Attachment Chemistry

Word Readout Strategy

•PCR amplify words remaining on surface

•Detect PCR products on single word readout arrays

4-Variable SAT Demo

• Synthesize; Attach

• Mark

• Destroy

• Unmark

• Readout

Cycle

Conclusions

• DNA computing has expanded the notion of what is computation

• Solid-phase chemistry is a promising approach to DNA computing

• DNA computing will require greatly improved DNA surface attachment chemistries and control of chemical and enzymatic processes

• New research problems in combinatorics, complexity theory and algorithms

Open Problem: DNA Strand Engineering

Given a DNA strand, there are polynomial-time algorithms that predict the secondary structure of the strand.

Inverse Problem: find an efficient algorithm that, given a desired secondary structure, generates a strand with that structure.
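To give a feel for the forward (prediction) direction, the sketch below implements a Nussinov-style dynamic program that maximizes the number of Watson-Crick base pairs in a nested (pseudoknot-free) structure in O(n^3) time. It is a deliberately simplified stand-in: practical predictors minimize free energy rather than count pairs, and the slides do not say which algorithms the authors had in mind. The `min_loop` parameter (minimum number of unpaired bases inside a hairpin) is an assumption of this example.

```python
PAIRS = {("A", "T"), ("T", "A"), ("C", "G"), ("G", "C")}

def max_base_pairs(strand, min_loop=3):
    """Maximum number of nested Watson-Crick pairs in a single strand."""
    n = len(strand)
    if n == 0:
        return 0
    # best[i][j] = max pairs within strand[i..j]; short spans stay 0.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]              # case: base j is unpaired
            for k in range(i, j - min_loop):    # case: base j pairs with k
                if (strand[k], strand[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    inner = best[k + 1][j - 1]
                    score = max(score, left + 1 + inner)
            best[i][j] = score
    return best[0][n - 1]

print(max_base_pairs("GGGAAACCC"))  # 3: a hairpin with a three-base loop
```

The inverse problem asks for the opposite direction: given the hairpin shape, find a sequence such as GGGAAACCC that folds into it.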