42
Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication -- Leonardo da Vinci

Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Embed Size (px)

Citation preview

Page 1: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Biosimplicity: Engineering Simple Life

Tom KnightMIT Computer Science andArtificial Intelligence Laboratory

Simplicity is the ultimate sophistication

-- Leonardo da Vinci

Page 2: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Measuring Complexity

• Number of components• Size of description• Size of functional

model

Have you ever thought... about whatever man builds, that all of man'sindustrial efforts, all his calculations and computations, all thenights spent over working draughts and blueprints, invariablyculminate in the production of a thing whose sole and guidingprinciple is the ultimate principle of simplicity?In any thing at all, perfection is finally attained, not when there isno longer anything to add, but when there is no longer anything totake away. -- Antoine de Sainte Exupery, in "Wind, Sand, and Stars"

Page 3: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Engineered Simple Organisms• modular• understood• malleable• low complexity

• Start with a simple existing organism• Remove structure until failure• Rationalize the infrastructure• Learn new biology along the way

The chassis and power supply for our computing

Page 4: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Some history…• Confusion over what PPLO/Mycoplasma were

“The Microbe of pleuorpneumonia” Nocard 1896

• 1932 isolation of “PPLO.” Koch postulates.• 1958 Klieneberger-Nobel identifies them as free

living bacterial species• Morowitz 1962 SciAm: “the smallest living cell”• 1980 Gilbert effort to sequence M. capricolum• 1982 Morowitz “complete understanding of life”• 1996 Fraser et al. M. genitalium sequence• 1999 Hutchison et al. Minimal genome set for

M. genitalium

Page 5: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Relative Complexity

100 1K 10K 100K 1M 10M 100M 1G 10G 100G

Gen

e

Plas

mid

Myc

opla

sma

geni

taliu

m (5

80 k

B)

Mes

opla

sma

floru

m (7

93 k

B)

E. C

oli (

4.6

MB

)

S. C

erev

isia

e (1

2 M

B)

Hum

an (3

.3 G

B)

Lilly

(12

GB

)

T7 P

hage

(36

kB)

Log Genome Size, base pairs

Alive

Autotroph

Page 6: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

From Giovannoni et al. 2005

Page 7: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Choosing an organism• Safe

BSL-1 organism -- insect commensal

• Un-regulatedNot a crop plant or domesticated animal

pathogen

• Fast growing40 minute doubling timevs. six hours for M. genitalium

• Convenient to work withFacultative anaerobe

• Small genome• Known sequence• Complete annotation

Page 8: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

The Mollicute Bibliome

Complete collection of mycoplasma related papers:

• 6,418 and counting

• All books and book chapters also

• Endnote & Refworks

• Downloaded .pdfs for articles > 1995

• Scanned articles and books, OCR with Abbyy Finereader

• Plans for a Google appliance search engine

• Collaboration for “shallow semantic” understanding

• people.csail.mit.edu/tk/mfpapers/ user=meso, pass=meso

Page 9: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication
Page 10: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Mesoplasma florum

1 u

Page 11: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Tomographic TEM

CourtesyJensen Lab,Caltech

Page 12: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Culture Medium 1161

• Beef Heart infusion• 4% Sucrose• Fresh yeast culture broth• 20% horse serum• Penicillin• Phenol red

Page 13: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Defined mediumSodium phosphateKClMagnesium sulfate

GlycerolSpermine

Nicotinic acidThiamineRiboflavinPyridoxamineThioctic acidCoenzyme A

Amino acids (minus asp, glu)GuanineUracilThymineAdenine

Glucose

BSAPalmitic acidOleic acid

Page 14: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Synthesis vs. Import

• Mycoplasma import virtually all small biochemical molecules

• Each import is done with a specific membrane protein – some are capable of importing a class

• Complexity is reduced if the import is simpler than the synthesis

• Example of the opposite:Glutamine Glutamic acidAsparagine Aspartic acid

Page 15: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Sequencing the genome• First sequenced small portions of the

genome to test that we had the correct speciesCompared the results to Genbank entriesSequenced PTS system gene, identical to

reported sequenceSequenced 16S rRNA (unreported)Discovered identical to Mesoplasma

entomophilum 16S rRNA sequence – probably the same species Genbank entry

• Measured genome size with pulse field gels• Sequenced 12% of genome to see what we

were up against

Page 16: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

PFGE of Mesoplasmagenomic DNA

1.2% agarose 9C 6V/cm Ramped 90 - 120 sec 48 hours

yyeast marker

lambda marker

meMesoplasma entomophilum

mfMesoplasma florum

mlMesoplasma lactucae

Page 17: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

I-CeuI digestion

• Special restriction enzymes cut only at 23S rRNA sites

5´ ..TAACTATAACGGTC CTAA^GGTAGCGA..3´

3´ ..ATTGATATTGCCAG^GATT CCATCGCT..5´

Calculate the number of rRNA sites in the genome from the number of cut fragments

Page 18: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

I-CeuI digests

Page 19: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

rRNA sequences

• Degenerate primers

• The two rRNA sequences are different:

Page 20: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication
Page 21: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Library creation• Randomly cut genomic DNA with EcoRI• Shotgun cloned into pUC18 vector• Sequenced the inserts (0.1 – 8 Kb)

• Sheared the genomic DNA with a needle• End repaired• Cloned into defective lambda phage vector• Packaged vector into phage heads• Infected E. coli cells with phage

~ 40 Kb inserts

Page 22: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

What we learned from partial sequence

• Almost all “old friends”• Little or no extra junk• Inter-gene sequences small (-4 to 30 bp)• Little transcriptional control• High AT vs. GC content – 27% average GC

20% in typical genes, even less in control regions

40% in rRNA regions

Page 23: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Origin of Replication

Page 24: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Whitehead Agreement

• Whitehead agreed in January 2002 to sequence the organism

• Estimated to take about two hours of time on their sequencers“Sure, we can do it Tom, but what do we do with

the rest of the day after the coffee break?”

• “How many other organisms like this are there?”300

• “Why don’t we sequence them all?”• Good draft available November 2002• Gaps closed July 2003 -- final November 2003

Page 25: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Gap closure• 9 gaps remained• Long range PCR • Primer walking• One difficult sequence

Poly A region 16-17 bp longSequencing stutteredReprime with aaaaaaaaaaaaaaaag

• Repeat region 186 bp in surface lipoproteinGive up on accurate sequence, PCR for length

• Final assembly verification

Page 26: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Genome characteristics

• 793281 base pairs• 26.52% G + C• 682 protein coding regions

UGA for tryptophanNo CGG codon or corresponding tRNAClassic circular genomeoriC, terminator region, gene orientation

• 39 stable RNAs29 tRNAs2x 16S, 23S, 5SRNAse-P, tmRNA, SRP

Page 27: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Standard Motifs• -10 present usually very highly conserved

Often preceded by a “TG” 1-2 bp upstream

• Seldom a conserved -35 region• RBS is standard Shine-Dalgarno• Alternate RBS matches complementary

region of 16S rRNA UAACAACAU (Loechel 91)• Standard stem-loop terminators with loop

TTAA6-8 poly T tail in forward direction

• dnaA box TTATCCACA• Four ribo-box sequences (Thi, Ile, Val,

Guanine)

Page 28: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication
Page 29: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Understand the metabolism

• Identify major metabolic pathways by finding critical genes coding for known enzymes

• Predict necessary enzymes which may not have been found

• Evaluate the list of unknown function genes for candidates

• Build the major metabolic pathway map of the organism

• Consider elimination of entire pathways

Page 30: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Mfl214, Mfl187

Mfl516, Mfl527, Mfl187

Mfl500 Mfl669Mfl009, Mfl033,Mfl318, Mfl312

Mfl666, Mfl667, Mfl668

Mfl023, Mfl024,Mfl025, Mfl026

ribose ABC transporter

glucosesucrose trehalose xylose

unknownfructose

sn-glycerol-3-phosphate ABC transporter

Mfl254, Mfl180, Mfl514, Mfl174, Mfl644, Mfl200, Mfl504, Mfl578, Mfl577, Mfl502, Mfl120, Mfl468, Mfl175, Mfl259Mfl039, Mfl040, Mfl041, Mfl042, Mfl043, Mfl044, Mfl596, Mfl281

Glycolysis

Mfl497 Mfl515, Mfl526 Mfl499 Mfl317?, Mfl313? ?

Mfl181

beta-glucoside

Mfl009, Mfl011, Mfl012, Mfl425, Mfl615, Mfl034, Mfl617, Mfl430, Mfl313?

PTS II SystemMfl519, Mfl565

chitin degradation

Mfl223, Mfl640, Mfl642, Mfl105, Mfl349

Pentose-Phosphate Pathway

glyceraldehyde-3-phosphate

Mfl619, Mfl431, Mfl426

Mfl074, Mfl075, Mfl276, Mfl665, Mfl463, Mfl144, Mfl342, Mfl343, Mfl170, Mfl195, Mfl372

Mfl419, Mfl676, Mfl635, Mfl119, Mfl107, Mfl679, Mfl306, Mfl648,Mfl143, Mfl466, Mfl198, Mfl556, Mfl385

Mfl076, Mfl121, Mfl639, Mfl528, Mfl530, Mfl529, Mfl547, Mfl375

Purine/Pyrimidine Salvage

glucose-6-phosphate

ribose-5-phosphate

Mfl413, Mfl658

xanthine/uracilpermease

DNA RNAMfl027, Mfl369

competence/DNA transport

DNA Polymerase

degradation

RNA Polymerase

Mfl047, Mfl048, Mfl475

Mfl237

protein translocation complex (Sec)

protein secretion (ftsY)

srpRNA, Mfl479

Signal Recognition Particle (SRP) Ribosome

Export

Mfl182, Mfl183, Mfl184

Mfl509, Mfl510, Mfl511

Mfl652Mfl557

Mfl605Mfl019

Mfl094, Mfl095, Mfl096, Mfl097,

Mfl098

Mfl015

spermidine/putrescineABC transporter

unknown amino acidABC transporterglutamine

ABC transporter

oligopeptide ABC transporter

arginine/ornithineantiporter lysine

APC transporteralanine/Na+ symporter

glutamate/Na+symporter

Mfl016, Mfl664

putrescine/ornithineAPC transporter

23sRNA, 16sRNA, 5sRNA,

Mfl122, Mfl149, Mfl624, Mfl148, Mfl136, Mfl284, Mfl542, Mfl132, Mfl082,Mfl127, Mfl561, Mfl368.1, Mfl362.1, Mfl129, Mfl586, Mfl140, Mfl080,

Mfl623, Mfl137, Mfl492, Mfl406

Mfl608, Mfl602, Mfl609, Mfl493, Mfl133, Mfl141, Mfl130, Mfl151, Mfl139, Mfl539, Mfl126, Mfl190, Mfl441, Mfl128, Mfl125, Mfl134, Mfl439, Mfl227,

Mfl131, Mfl123, Mfl638, Mfl396, Mfl089, Mfl380, Mfl682.1, Mfl189, Mfl147, Mfl124, Mfl135, Mfl138, Mfl601, Mfl083, Mfl294, Mfl440?

proteins

degradation

Mfl418, Mfl404, Mfl241, Mfl287, Mfl659, Mfl263, Mfl402, Mfl484, Mfl494, Mfl210, tmRNA

tRNA aminoacylation

ribosomal RNA transfer RNA

messenger RNA

Mfl029, Mfl412, Mfl540, Mfl014, Mfl196,Mfl156, Mfl282, Mfl387, Mfl682, Mfl673, Mfl077, rnpRNA

Mfl563, Mfl548, Mfl088, Mfl258, Mfl329, Mfl374, Mfl541, Mfl005, Mfl647, Mfl231, Mfl209

Mfl613, Mfl554, Mfl480, Mfl087, Mfl651, Mfl268, Mfl366, Mfl389, Mfl490, Mfl030, Mfl036, Mfl399, Mfl398, Mfl589,

Mfl017, Mfl476, Mfl177, Mfl192, Mfl587, Mfl355

Mfl086, Mfl162, Mfl163, Mfl161

amino acids

Amino Acid Transport

intraconversion?

Mfl590, Mfl591

Lipid SynthesisMfl230, Mfl382, Mfl286, Mfl663, Mfl465, Mfl626

fatty acid/lipid transporter

Identified Metabolic Pathways in

Mesoplasma florum

Mfl384, Mfl593,Mfl046, Mfl052

L-lactate,acetate

Mfl099, Mfl474,Mfl315, Mfl325,Mfl482

cardiolipin/phospholipids

membrane synthesis

x22

Mfl444, Mfl446, Mfl451

variable surface lipoproteins

hypotheticallipoproteins

phospholipid membrane

Mfl063, Mfl065, Mfl038,Mfl388

Mfl186 formate/nitratetransporter

Mfl060, Mfl167, Mfl383, Mfl250

Formyl-THF Synthesis

THF?

x57hypothetical transmembrane proteins

met-tRNA formylationMfl409, Mfl569

Mfl152, Mfl153, Mfl154

Mfl233, Mfl234, Mfl235

Mfl571, Mfl572

Mfl356, Mfl496, Mfl217

Mfl064, Mfl178Nfl289, Mfl037, Mfl653, Mfl193

Mfl109, Mfl110, Mfl111, Mfl112, Mfl113, Mfl114,

Mfl115, Mfl116

ATP Synthase Complex

ATP ADP

phosphate ABC transporter

phosphonate ABC transporter

metal ion transporter

Mfl583, Mfl288, Mfl002, Mfl678, Mfl675, Mfl582,

Mfl055, Mfl328

Mfl150, Mfl598, Mfl597, Mfl270, Mfl649

acetyl-CoA

cobalt ABC transporter

Mfl165, Mfl166

K+, Na+transporter

Mfl378

malate transporter?

Mfl340, Mfl373, Mfl521, Mfl588

Pyridine Nucleotide Cycling

NAD+

Electron Carrier Pathways

NADHNADPH

NADP

Flavin Synthesis

riboflavin?

FMN, FADMfl283, Mfl334

Mfl193

Mfl057, Mfl068, Mfl142,Mfl090,

Mfl275

Mfl347, Mfl558

G. Fournier02/23/04

x13+

unknown substrate transporters

PRPP

niacin?

Page 31: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

How Simple is this?• Missing cell wall, outer membrane• Missing TCA cycle• Missing amino acid synthesis• Missing fatty acid synthesis• One sigma factor• Small number of dna binding proteins• One insertion sequence, probably not

active• One restriction system (Sau3AI-like)• CTG/CAG methylation (function?)• Evidence for shared protein function

MDH/LDH (Pollack 97 Crit rev microbiol 23:269)

Page 32: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Minimal is not always simple• Shared function of parts• Overlapped genes• Tradeoff of import vs. synthesis

• Example:Television set designShared deflection coil, high voltage power

supply, isolated filament supplyThree functions, a single circuit, a difficult

engineering, modeling, debugging, and repair task

• how many genes have multiple functions

Page 33: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

DNA Methylation

• Bisulfite conversion of genomic DNA• Sequencing of converted DNA to identify

methylated C positions

• Results: GATC sequences, as expected• Unexpected: CAG and CTG sequences

Page 34: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Current work• Array experiments

Close species Transcriptional units, pseudo-genes TRASH

• Protein species by LC/LC/MS/MS• Elimination of the restriction system• Plasmid system

pBG7AU based

• Recombination system Positive/negative selection

• Yeast chromosome transfer• Genome edits to reduce size• Genome edits to modularize• Genome edits to eliminate complexity• Use as a construction chassis

Page 35: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Reconstruct the Genome• Use recombination techniques to edit the

genome• Eliminate unnecessary genes• Remove overlaps• Standardize promoters, ribosomal binding

sites• Identify transcriptional and translational

regulators• Recode proteins to use a reduced portion of

the coding spaceThe code is 4 billion years old,it’s time for a rewrite

Page 36: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

YAC mutagenesis

• Bring up YAC technology Spheroplasts, old YAC plasmid sequencing

• Triple transform with MF chromosome and pRML1 (Spencer 92) pRML2 Genome inactive except for ARS, telomeres, selection

markers

• Use yeast recombination systems for genome editing

• Isolate and recircularize YAC to form a new genome• Lipid encapsulate genome into vesicles• Fuse vesicles with genome-killed wild type cells

Page 37: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Proteome• Collaboration with Steve Tannenbaum / Yingwu

Wang• 2-D gels + MS spot ID• LC/LC/MS/MS ID of trypsin digests

Page 38: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Riboswitch Analysis

• Collaboration with Ron Breaker / Adam Roth

• Discovery of unique riboswitches specific for GTP rather than dGTP

• Found in no other sequenced genomes

• Analysis of close relatives under way

Page 39: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Engineer plasmids• No known plasmids for this class of

organism• Renaudin has made plasmids using the

chromosomal OriC as the replicative elementLartigue 03

• M. mycoides has pADB201 and pBG7AU rolling circle plasmids similar to pE194 (1082 bp)

• We know the antibiotic sensitivities and have working resistance genes

Page 40: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Kit Part the genome

• Make Biobrick parts from each gene, tRNA, promoter, other part-like genome element

• Attempt to develop techniques for recombining parts into coherent modules

• Develop techniques for assembling and modeling the resulting structures

Page 41: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Thanks to…• Harold Morowitz• Greg Fournier• Gail Gasparich• Bob Whitcomb• Eric Lander• Bruce Birren• Nicole Stange-

Thomann• George Church• Roger Brent• Grant Jensen• Yingwu Wang

• Nick Papadakis• Ron Weiss• Drew Endy• Randy Rettberg• Austin Che• Reshma Shetty• MIT Synthetic biology

working group• DARPA, NTT, NSF,

Microsoft

Page 42: Biosimplicity: Engineering Simple Life Tom Knight MIT Computer Science and Artificial Intelligence Laboratory Simplicity is the ultimate sophistication

Synthetic Biology

• An alternative to understanding complexity is to remove it

• This complements rather than replaces standard approaches

• Engineering synthetic constructs will be easierEnabling quicker more facile experimentsEnabling deeper understanding of the basic

mechanismsEnabling applications in nanotechnology,

medicine and agricultureSimplicity is the ultimate sophistication

-- Leonardo da Vinci