45
Systems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics DNA sequencing technology DNA sequencing technology Genome sequencing technology Genome sequencing technology Current statistics Current statistics Basics of genome sequence analysis Basics of genome sequence analysis

Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Systems MicrobiologyMonday Oct 23 - Ch 15 -Brock

Genomics •• DNA sequencing technologyDNA sequencing technology•• Genome sequencing technologyGenome sequencing technology•• Current statisticsCurrent statistics•• Basics of genome sequence analysisBasics of genome sequence analysis

Page 2: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Human genome sequence analysis reached a first stage of completion in the summer of 2000 summer by the New York Times:

Scanned magazine and newspaper articles about J. Craig Venter and Francis Collins removed due to copyright restrictions.

Page 3: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

The Race to Sequence

Cartoon image of J. Craig Venter and Francis Collins racing to the finishremoved due to copyright restrictions.

• Celera - J. Craig Venter

• NHGRI - Francis Collins

• G5 – Sanger Centre – Whitehead Institute

– JGI Walnut Creek

– Washington University – Baylor

Page 4: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Human Genome Project • HGP drove innovation in biotechnology • 2 major technological benefits

– stimulated development of high throughput methods -- the assembly line (or dis-assembly line) meets biological research

– reliance on computational tools for data mining andvisualization of biological information

Page 5: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

DNA Sequencing

DNA sequencing = determining the nucleotide sequence of DNA.

Developed by Frederick Sanger in the 1970s.

First whole DNA genome sequenced was a virus, PhiX174 in 1977 - 5386 bp.

First demo of Sanger Photographs removed due to copyright restrictions.

dideoxynucleotide nucleotidesequencing technique.

1980: Walter Gilbert (Biol. Labs) & Frederick Sanger (MRC Labs)

Page 6: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Manual Dideoxy DNA sequencing-How it works:

1. DNA template is denatured to single strands.

2. DNA primer (with 3’ end near sequence of interest) is annealed to the template DNA and extended with DNA polymerase.

3. Four reactions are set up, each containing:

1. DNA template 2. Primer annealed to template DNA 3. DNA polymerase 4. dNTPS (dATP, dTTP, dCTP, and dGTP)

4. Next, a different radio-labeled dideoxynucleotide (ddATP, ddTTP, ddCTP, or ddGTP) is added to each of the four reaction tubes at 1/100th the concentration of normal dNTPs.

5. ddNTPs possess a 3’-H instead of 3’-OH, compete in the reaction with normal dNTPS, and produce no phosphodiester bond.

6. Whenever the radio-labeled ddNTPs are incorporated in the chain, DNA synthesis terminates.

7. Each of the four reaction mixtures produces a population of DNA molecules with DNA chains terminating at all possible positions.

Page 7: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

OH

Manual Dideoxy DNA sequencing-How it works (cont.):

8. Extension products in each of the four reaction mixutes also end with a different radio-labeled ddNTP (depending on the base).

9. Next, each reaction mixture is electrophoresed in a separate lane (4 lanes) at high voltage on a polyacrylamide gel.

10. Pattern of bands in each of the four lanes is visualized on X-ray film.

11. Location of “bands” in each of the four lanes indicate the size of the fragment terminating with a respective radio-labeled ddNTP.

12. DNA sequence is deduced from the pattern of bands in the 4 lanes.

C5’ O

O P

O

C O-

3’H

Page 8: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

,

OH

template Sanger sequencing works by primer extension using

DNA polymerase and the 4 deoxynucleotide triphosphates

PLUS one dideoxynucleotide per sequencing rxn.

Newly synthesized DNA

5’ O O

P O- O

C

3’ HX 3’ dideoxynuclotide

primer

Page 9: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Images of polyacrylamide gels removed due to copyright restrictions.

Vigilant et al. 1989PNAS 86:9350-9354

Page 10: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

(polyarcrylamide gel) Radio-labeled ddNTPs (4 rxns)

Sequence (5’ to 3’)

G G A T A T A A C C C C T G T

Long products

Short products

Page 11: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Automated Dye-Terminator DNA Sequencing:

1. Dideoxy DNA sequencing was time consuming, radioactive, and throughput was low, typically ~300 bp per run.

2. Automated DNA sequencing employs the same general procedure, but uses ddNTPs labeled with fluorescent dyes.

3. Combine 4 dyes in one reaction tube and electrophores in one lane on a polyacrylamide gel or capillary containing polyacrylamide.

4. UV laser detects dyes and reads the sequence.

5. Sequence data is displayed as colored peaks (chromatograms) that correspond to the position of each nucleotide in the sequence.

6. Throughput is high, up to 1,200 bp per reaction and 96 reactions every 3 hours with capillary sequencers.

7. Most automated DNA sequencers can load robotically and operate around the clock for weeks with minimal labor.

Page 12: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Applied Biosystems PRISM 377(Gel, 34-96 lanes) Applied Biosystems PRISM 3700

(Capillary, 96 capillaries)

Photographs of equipment removed due to copyright restrictions.

Applied Biosystems PRISM 3100 (Capillary, 16 capillaries)

Page 13: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

“virtual autorad” - real-time DNA sequence output from ABI 377

1. Trace files (dye signals) are analyzed and bases called to create chromatograms.

2. Chromatograms from opposite strands are reconciled with software to create double-stranded sequence data.

Page 14: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Table and graph showing the growth of GenBank removed due to copyright restrictions.

Page 15: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Fraser, Eisen, and Salzman Shotgun genome sequencingNature 406, 799-803 (17 August 2000)

Images removed due to copyright restrictions.

Page 16: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Small insert cloning/seqing

Image removed due to copyright restrictions. See Figure 15-1a in Madigan, Michael, and John Martinko. Brock Biology of Microorganisms. 11th ed. Upper Saddle River, NJ: Pearson Prentice Hall, 2006. ISBN: 0131443291.

Page 17: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Blue white selection of insert containing clones based on lacZ (beta galactosidase) activity

Image removed due to copyright restrictions. See Figure 15-1bc in Madigan, Michael, and John Martinko. Brock Biology of Microorganisms. 11th ed. Upper Saddle River, NJ: Pearson Prentice Hall, 2006. ISBN: 0131443291.

Page 18: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Fluorescent Sanger dideoxy seqing rxns,{& capillary electrophoresis !

(typically, shotgun seqing done to 8X min.)

Reads

Contigs

Genome

Figure by MIT OCW.

Page 19: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Shotgun sequencing: A Bottom Up, Random Strategy

The key to the strategy is shotgun cloning of random (not mapped) fragments of a defined length (2kb or 10kb) so that scaffolds can be assembled from terminal sequences of each cloned insert.

STSGenomeMapped Scaffolds:

Scaffold:

Contig:

Read pair (mates)

ConsensusReads (of several haplotypes)

Gap (mean and standard deviation known)

SNPsBAC Fragments

Figure by MIT OCW.

Page 20: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

DNA Sequence Assembly

Computer screenshot showing DNA sequencing removed due to copyright restrictions.

Page 21: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Large insert cloning/seqing

Image removed due to copyright restrictions. See Figure 15-2 in Madigan, Michael, and John Martinko. Brock Biology of Microorganisms. 11th ed. Upper Saddle River, NJ: Pearson Prentice Hall, 2006. ISBN: 0131443291.

Page 22: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

BAC sequencing: A Top down Up Strategy Using Maps

BAC = Bacterial Artificial Chromosome (F-Factor based)

Genomic DNA

BAC library

Organized mapped large clone contigs

BAC to be sequenced

Shotgun sequenceAssembly

Shotgun clones

...ACCGTAAATGGGCTGATCATGCTTAAA

...ACCGTAAATGGGCTGATCATGCTTAAACCCTGTGCATCCTACTG...

TGATCATGCTTAAACCCTGTGCATCCTACTG..

Hierarchial shotgun sequencing

Figure by MIT OCW.

Page 23: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Image removed due to copyright restrictions.

The NIH consortium strategy was dependent on multiple clone overlaps

Page 24: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Brief technical outline for DNA sequence production

Page 25: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

High throughput DNA sequencing is entirely automated from colonies to computers Approach made possible by automating massively parallel processes on an industrial scale

Photograph of an industrial scale DNA sequencing lab removed due to copyright restrictions.

Page 26: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Sequence Production Pipeline

_ Pick & archive library clones _ Template production _ dideoxy terminator reactions _ Post-reaction processing _ Capillary sequencing _ Sequence data analysis - bioinformatics

Page 27: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Making the Genome Bite Size

Image removed due to copyright restrictions.

Page 28: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Making a DNA fragment library

LacZ gene producesblue pigment

Sheared fragments

The Lacz gene isdisrupted by theinsert

Hybidization + DNA ligase

A couple of thousand insertsare created from each originallarge sequence. Each color is adifferent sheared fragment

Some of theseplasmids do nottake in a fragment

Gene for antibioticresistance

RecombinantDNA

pUC18

Figure by MIT OCW.

Page 29: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Plating a DNA Fragment Library

Plating photographs removed due to copyright restrictions.

Page 30: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Picking & Archiving Clones • QPixII from Genetix, Ltd (UK) • 4500 colonies per hour • 300,000 colonies; 67 hrs; 384-well plates = 781

Various laboratory photographs removed due to copyright restrictions.

Page 31: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

454Corporation

The first commercial,massively parallel,

DNA sequencing technology

Images removed due to copyright restrictions.

Page 32: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Deposit DNA Beads into PicoTiterPlate™

Images of DNA beads on PicoTiterPlates removed due to copyright restrictions.

Page 33: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Sequencing Instrument

Images removed due to copyright restrictions.

Page 34: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

454 DNA sequencingtechnology

– 300,000 DNA fragments sequenced in parallel– Average read length 100 - 110 nt – 30 million nt per run – One 454 run = 4 hours – $5,000 reagent cost

Page 35: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Company

Sequencing

Format Read length Throughput

Amersham Capillary electrophoresis 550-1,000 bases 2.8 megabases/day(384-capillary instrument)Biosciences

(Uppsala, Sweden)

Applied Biosystems(Foster City, CA)

Capillary electrophoresis 500-950 bases (depending on column length)

2 megabases/day(production-scale system)

Lynx Therapeutics(Hayward, CA)

Massively parallelbead arrays

20 bases 8 megabases/day

Pyrosequencing(Uppsala, Sweden)

Real-time sequencingby synthesis

40-50bases/well(96- or 384-well plate)

110-691 kilobases/day

Resequencing

454 Life Sciences(Branford, CT)

Massively parallelmicrofluidic, solid-surfacesequencing

100 bases currently 60 megabases/day

Affymetrix(Santa Clara, CA)

High-density microarrays 30 kilobases/array currently;500 kilobases/array projected

3 megabases/day

Perlegen(Mountain View, CA)

Whole-wafer Affymetrixmicroarray

15 megabase/wafer 300 megabases/day

Sequencing and resequencing technologies currently availableFigure by MIT OCW.

Page 36: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

http://www.genomesonline.org/Gold_statistics.html

Courtesy of Genomes Online. Used with permission.

Page 37: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

http://www.genomesonline.org/Gold_statistics.html

Courtesy of Genomes Online. Used with permission

Page 38: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

http://www.genomesonline.org/Gold_statistics.html

Courtesy of Genomes Online. Used with permission._____________

Page 39: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Prote

obac

teria

25 (6

2)

Spiro

chet

es 2

(2)

Ther

mus

/Dei

noco

ccus

1 (1

)B

CF

grou

p (5

)

Green sulfur (1)

Low G+C gram positive 16 (3 2)

Cyanobacteria 2 (5)

Acidobacterium (1)

Chlamydia 5 (2)

Planctomycetes (1)

Actino

bacter

ia 3 (

12)

Green n

on-su

lfur (

1)

Ther

mot

ogal

es 1

Aqu

ilica

les 1

Pyro

bacu

lum

1H

yper

ther

mus

(1)

Aer

opyr

um 1

Sulfo

lobus

2 (2

)

Methanopyrus 1

Methanobacterium 1

Archaeoglobus 1

Thermoplasma 2 (1)Methanosarcina 2 (1)Halophiles 1 (1)

Methanococcus 1 (1)

Thermococcus 3

Fuso

bact

eria

TM7

WS6

OP1

1

TM6

Marine group A

Fibrobacter

Flexistipes

SynergistesOP8

Termite group 1

NitrospiraOS-K

Vernicomicrobia

OP3

OP10

WS1

OP9

Dic

tyog

lom

usC

opro

chem

obac

ter

Ther

mod

esul

foba

cter

ium

OP5

pJP 78

pJP 27

Methanospirillum

planktonic mesophiles

Methanothermus

pSL5

0

pJP4

1 Ther

mob

lum

pJP80

pSL 123

pSL 17pSL 22

pSL 4

pSL 78

pSL 12

Cenarchaeum

planktonic Crenarchaeotes

EUCARYA

KORARCHAEOTA

EURYARCHAEOTA 12(4)

CRENARCHAEOTA 4 (3)

ARCHAEA 16 (7)

BACTERIA 56 (125)

Figure by MIT OCW.

Page 40: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Courtesy of Genomes Online. Used with permission.

Page 41: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

http://www.genomesonline.org/Gold_statistics.html

Courtesy of Genomes Online. Used with permission.

Page 42: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

http://www.genomesonline.org/Gold_statistics.html

Courtesy of Genomes Online. Used with permission.

Page 43: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Tables removed due to copyright restrictions. See Tables 15-1 and 15-3 in Madigan, Michael, and John Martinko. Brock Biology of Microorganisms. 11th ed. Upper Saddle River, NJ: Pearson Prentice Hall, 2006. ISBN: 0131443291.

Page 44: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Genome Streamlining in a Cosmopolitan Oceanic BacteriumGiovanonni et al, Science 309:1242 (2005)

10.0

5.0

1.0

0.5

0.1100 500 1000 5000 10000

Nanoarchaeum equitansMycoplasma genitalium Wigglesworthia glossinidia

Rickettsia conoriiPelagibacter ubique

Ehrlichia ruminantiumBartonella quintanaThermoplasma acidophilum

Bartonella henselaeCoxiella burnetii

Prochlorococcus marinus MED4Prochlorococcus marinus SS120

Prochlorococcus marinus MIT9313

Synechococcus sp.WH8102

Silicibacter pomeroyi

Rhodopirellula baltica

Streptomyces coelicolor

Mesoplasma florum

Free-livingHost-associated

Obligate symbionts/parasitesPelagibacter ubique

Number of protein encoding genes

Gen

ome

size

(Mbp

)

Figure by MIT OCW.

Page 45: Monday Oct 23 - Ch 15 -Brock - MIT OpenCourseWareSystems Microbiology Monday Oct 23 - Ch 15 -Brock Genomics • DNA sequencing technologyDNA sequencing technology • Genome sequencing

Science 314:267(Oct 13, 2006)

00 1 2 3 4 5 6 7 8 9

10

20

30

40

50

60

70

Chromosomal size (Mb)

GC

con

tent

(%)

Carsonella

Carsonella

50 µm

Figure by MIT OCW.