Personal Genome Project - Harvard University
arep.med.harvard.edu/gmc/ppt/07Sep_JH.pdf

Page 1

George Church

Personal Genome Project

Thanks to: Applied Biosystems, Helicos, Roche 454, Illumina, CGI, IBS, Affymetrix, Enzymatics

PGP Volunteers & Donors!

Page 2

What about government requiring testing of babies for intelligence genes?

PKU: Phenylketonuria (phenylalanine hydroxylase deficiency)
Tested in nearly all 4M newborns per year
1 in 15,000 births

Close to 100% heritable (& 100% environmental)
Nutritional preventative

Page 3

Is anonymity in genomics realistic? http://arep.med.harvard.edu/PGP/Anon.htm

(10) Re-identification after “de-identification” using other public data. A Group Insurance Commission list of birth date, gender, and zip code was sufficient to re-identify medical records of Governor Weld & family via voter-registration records (1998).
(9) Hacking. A hacker gained access to confidential medical info at the U. Washington Medical Center -- 4000 files (names, conditions, etc., 2000).
(8) Combination of surnames from genotype with geographical info. An anonymous sperm donor was traced on the internet in 2005 by his 15-year-old son, who used his own Y chromosome genealogy to access surname relations.
(7) Inferring phenotype from genotype. Markers for eye, skin, and hair color, height, weight, racial features, dysmorphologies, etc. are known & the list is growing.
(6) Self-identification. An example of this at Celera undermined confidence in the investigators. Kennedy D. Science. 2002 297:1237. Not wicked, perhaps, but tacky.
(5) A tiny amount of DNA data in the public domain with a name leverages the rest. This would allow the vast amount of DNA data in the HapMap (or other study) to be identified. This can happen, for example, in court cases even if the suspect is acquitted.
(4) Laptop theft. 26 million Veterans' medical records, SSN & disabilities stolen Jun 2006.
(3) Unauthorized access to DNA-bearing samples (e.g. hair, dandruff, hand-prints, etc.)
(2) Identification by phenotype. If CT or MR imaging data is part of a study, one could reconstruct a person’s appearance. Even blood chemistry can be identifying in some cases.
(1) Government subpoena. False positive IDs can be very disruptive.
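Item (10) describes a linkage attack: a "de-identified" dataset and a public one share quasi-identifiers (birth date, gender, zip code), so joining on those fields can attach names to records. A minimal sketch in Python follows. All records and field names are invented for illustration and are not taken from any real dataset.

```python
# Toy linkage attack on quasi-identifiers.
# Every record and field name below is made up for illustration.
QUASI_IDENTIFIERS = ("birth_date", "gender", "zip_code")

deidentified_medical = [
    {"birth_date": "1950-01-01", "gender": "M", "zip_code": "00001", "diagnosis": "condition A"},
    {"birth_date": "1962-03-15", "gender": "F", "zip_code": "00002", "diagnosis": "condition B"},
]

voter_registration = [
    {"name": "Person One", "birth_date": "1950-01-01", "gender": "M", "zip_code": "00001"},
    {"name": "Person Two", "birth_date": "1971-07-04", "gender": "F", "zip_code": "00002"},
]


def quasi_key(record):
    """Tuple of quasi-identifier values used as the join key."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(medical_records, public_records):
    """Return (name, diagnosis) pairs where the join key matches exactly one name."""
    names_by_key = {}
    for rec in public_records:
        names_by_key.setdefault(quasi_key(rec), []).append(rec["name"])

    matches = []
    for rec in medical_records:
        candidates = names_by_key.get(quasi_key(rec), [])
        if len(candidates) == 1:  # a unique match re-identifies the record
            matches.append((candidates[0], rec["diagnosis"]))
    return matches


print(reidentify(deidentified_medical, voter_registration))
```

The sketch only reports unique matches; in practice the risk grows with how few people share a given combination of the three fields.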

Page 4

Personal Genome Project (PGP) -- ELSI
• Submitted May’03 to NIH, $10M approved Mar 2004 (technology)
• HMS IRB Human Subjects protocol approved Aug 2005. (Possibly unique in including identifiable traits)
• Highly-informed individuals consenting to potentially non-anonymous genomes & extensive phenotypes (medical records, imaging, omics). Scaling to 100K volunteers: http://pgen.us
• Cell lines in Coriell NIGMS Repository (B-cells, keratinocytes, fibroblasts)

Church GM (2005) The Personal Genome Project. Nature Molecular Systems Biology doi:10.1038/msb4100040
Kohane IS, Altman RB (2005) Health-information altruists--a potentially critical resource. N Engl J Med. 353:2074-7.
McGuire AL, Gibbs RA (2006) Genetics. No longer de-identified. Science. 312:370-1.

Page 5

PGP: comprehensive traits, [diseases], (treatments)

Hair: Baldness [alopecia] (minoxidil)
Eyes: [Near/Far-sightedness] (glasses)
Iris color [ARMD] (glasses)
Face: [Developmental syndromes, Wrinkles] (Botox)
Brain: ADHD (Ritalin); Depression (Prozac); Headache (analgesics)
Sleep & Circadian (caffeine, amphetamine, modafinil)
Motion sickness (Dramamine, and Scopolamine)
Ears: Sensitivity (hearing aids)
Nose: Shape [breathing disorders] (CPAP)
Lip: [Cleft palate] (surgery); [Hirsutism] (calcium thioglycolate)
Mouth: Halitosis, throat exams; aerosols [airborne pathogens]
Digestion [reflux, gas, ulcer] (antibiotics, antacids, PPIs)
Back: Strain sensitivity [IDD] (analgesics)
Skin: Perspiration, Body odor, Pheromones (deodorants)
Surface texture [psoriasis] (topicals, photo-treatments)
Immune components [acne] (topical antibiotics)
Skin color [vitamin D & sunburn] (supplements, SPF cream)
Hands: Dermatoglyphics [syndromes], [Arthritis] (corticosteroids)
Internal sensors: Proprioceptor, Repetitive stress (NSAIAs)
Body: Height [Marfan] [short stature] (hGH)
Weight [anorexia] [obesity] (Orlistat, Phentermine, Sibutramine)
Allergies (antihistamines, cortisone, epinephrine, theophylline)
Metabolic polymorphisms (vitamins, minerals, insulin)
Feet: Plantar fasciitis (orthotic shoes)
Athlete’s foot (miconazole, itraconazole, terbinafine, salicylate)

1933

Page 6

Status quo “de-identification” problems & potential solutions
1) Less integrated, holistic, comprehensive
2) Less enabling of system-wide medicine
3) Subjects not informed enough to “opt-out”
4) Life-impacting info can’t be shared with subjects
5) False sense of anonymity (see http://pgen.us)

In contrast, PGP IRB emphasizes genetics education to enable
1) Active subject participation, informed opt-out
2) Choice of research-only -OR- open public database
3) Genomics, EMRs, traits-questionnaire linked
4) Scalable to millions of research subjects, leveraging inexpensive geno/phenotypes
5) Possibly self-funding. Early adopters pay for less financially able (but well-informed) set

Page 7

What if no treatment exists?

Huntington's Chorea: Nancy Wexler’s family

Adrenoleukodystrophy: Augusto Odone’s son

Doug Melton’s son, Sam, has diabetes

Inspire personal health activism.

Parkinson’s Disease: Michael J. Fox

Cancer, substance abuse: Betty Ford

Schizophrenia: Jim Watson’s son Rufus

Page 8

Siebold 2004 “Crystal structure of HLA-DQ0602 that protects against type 1 diabetes and confers strong susceptibility to narcolepsy”

Causative alleles: cell/animal/drug models, physiological/anatomical mechanisms,

MAO-A, FoxP2

Page 9

Polony sequencing process
open source software, hardware, wetware

Shendure, Porreca et al., Science 309: 1728-32

Page 10

Multiplex Cyclic Sequencing by Synthesis (Next-gen, polonies on glass or beads)

[Figure labels: G, A, C, T]

Polymerase -or- Ligase

Illumina, IBS

AB-SOLiD, CGI

Shendure, Porreca, et al. 2005 Science
Mitra, et al. 2003 Analyt. Biochem.
1999 NAR

Page 11

Sequencing by Ligation (SBL) with fluorescent combinatorial 9-mers

ACUCAUC…
(3’)…TAGAGT????????????????TGAGTAG…(5’)

5'PO4

5’-Cy5-nnnnAnnnn-3’
5’-Cy3-nnnnGnnnn-3’
5’-TR-nnnnCnnnn-3’
5’-Cy3+Cy5-nnnnTnnnn-3’

Excitation (nm)   Emission (nm)
647               700
555               605
572               630
555               700

Shendure, Porreca, et al. (2005) Science 309:1728
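The four probe pools encode the queried base by dye: Cy5 for A, Cy3 for G, TR for C, and Cy3+Cy5 for T. A minimal base-calling sketch follows. It assumes, hypothetically, that each bead yields one background-corrected intensity per dye channel and that the brightest channel wins; the input format and the `min_ratio` threshold are illustrative choices, since the slide does not describe the image-analysis step.

```python
# Dye-to-base mapping taken from the probe list on the slide.
DYE_TO_BASE = {"Cy5": "A", "Cy3": "G", "TR": "C", "Cy3+Cy5": "T"}


def call_base(intensities, min_ratio=1.5):
    """Return the base for the brightest channel, or 'N' if the call is ambiguous.

    intensities: dict mapping dye label -> intensity (hypothetical input format).
    min_ratio:   assumed threshold; the brightest channel must exceed the
                 runner-up by at least this factor.
    """
    ranked = sorted(intensities.items(), key=lambda item: item[1], reverse=True)
    (best_dye, best), (_, second) = ranked[0], ranked[1]
    if best <= 0 or best < min_ratio * second:
        return "N"
    return DYE_TO_BASE[best_dye]


def call_read(per_cycle_intensities):
    """Call one base per ligation cycle and join them into a read string."""
    return "".join(call_base(cycle) for cycle in per_cycle_intensities)
```

A read is then just the concatenation of per-cycle calls, for example `call_read([cycle1, cycle2, ...])` with one intensity dict per cycle.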

Page 12

Polony Sequencing Equipment
HMS/AB/APG

HPLC autosampler (96 wells)

syringe pump

microscope with xyz controls

flow-cell

temperature control

Page 13

2nd-generation sequencing

AB-SOLiD: $550K

36 flow-cells * 28 bp * 60M beads = 60 Gbp / 30-180 h run
36 * 28 * 2000 * 1 Mpix * 4 colors * 2 bytes = 16 Terabytes / run

Harvard-model-F07: $106K incl. computer. $14K support. Open-source software, hardware, wetware. Reduce reagent volume & per vol cost 100X each.

E07 (Nikon), F07

Porreca, Terry
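The two per-run figures on this slide are straight products, which can be checked with a short script. The factor 2000 is not explained on the slide; it is treated here as image frames per flow-cell per cycle, which is an assumption, and the variable names are illustrative.

```python
# Sequence yield per run: flow-cells * read length * beads per flow-cell.
FLOW_CELLS = 36
READ_LENGTH_BP = 28
BEADS_PER_FLOW_CELL = 60_000_000

bases_per_run = FLOW_CELLS * READ_LENGTH_BP * BEADS_PER_FLOW_CELL

# Raw image data per run: 36 * 28 * 2000 * 1 Mpix * 4 colors * 2 bytes.
FRAMES = 2000                 # assumed meaning: frames per flow-cell per cycle
PIXELS_PER_FRAME = 1_000_000  # 1 Mpix
COLORS = 4
BYTES_PER_PIXEL = 2

bytes_per_run = (FLOW_CELLS * READ_LENGTH_BP * FRAMES
                 * PIXELS_PER_FRAME * COLORS * BYTES_PER_PIXEL)

print(f"{bases_per_run / 1e9:.2f} Gbp per run")   # slide rounds to 60 Gbp
print(f"{bytes_per_run / 1e12:.2f} TB per run")   # slide rounds to 16 Terabytes
```

Both products land within about 1% of the rounded values quoted on the slide.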

Page 14

Reducing costs of 2nd generation genome sequencing in an open-access model

(1) 5X reduction in equipment: F07: $106K

(2) >20X reads per run: 60M beads in each of 36 flow-cells = 60 Gbp per run (cf. 1G to 4G) (>= 3X coverage minimum: 3E-7 error rate)

(3) Kit costs are inflated 50X (relative to standard enzymes)

(4) Enzyme costs (e.g. Taq Pol) are inflated 100X.

(5) Flow-cell volume reduced 15-fold to improve flow; this also reduces reagent use! (Another 70-fold reduction in progress.)
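Items (1) and (5) can be sanity-checked with simple ratios. The sketch below uses the $550K AB-SOLiD price quoted earlier in the deck and assumes the two flow-cell volume reductions multiply, which the slide does not state explicitly.

```python
# Equipment cost ratio, using prices quoted in the deck.
SOLID_PRICE_USD = 550_000   # AB-SOLiD
F07_PRICE_USD = 106_000     # Harvard model F07, incl. computer

equipment_fold = SOLID_PRICE_USD / F07_PRICE_USD

# Flow-cell volume reductions from item (5).
VOLUME_FOLD_DONE = 15       # already achieved
VOLUME_FOLD_PLANNED = 70    # "in progress"

# Assumption: the planned reduction applies on top of the achieved one.
combined_volume_fold = VOLUME_FOLD_DONE * VOLUME_FOLD_PLANNED

print(f"equipment: about {equipment_fold:.1f}X cheaper")
print(f"flow-cell volume: {combined_volume_fold}X smaller if both reductions hold")
```

The equipment ratio works out to a little over 5X, consistent with item (1).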

Page 15

Selective genome sequencing

Shendure, et al. Science 309(5741):1728-32.
Nilsson et al. (2006) Trends Biotechnol 24:83.

Red=Synthetic; Yellow=genome/cDNA

How do we optimize >100K 100mers?

7 ways to capture alleles from genomic or cDNA
1. In vitro paired-tag library (for rearrangements)
2. Gapfill
3. Cleave & ligate
4. Hybridize-select
5. Allelic-RNA-ratio
6. Mb region primers
7. Dilution haplotype

Zhang, Chou, Shendure, Li, Leproust, Dahl, Davis, Nilsson, Church

Page 16

10 Mbp of oligos / $300 chip

8K    Atactic/Xeotron/Invitrogen   Photo-Generated Acid
12K   Combimatrix                  Electrolytic
44K   Agilent                      Ink-jet standard reagents
380K  Nimblegen/GA                 Photolabile 5' protection

Tian et al. Nature. 432:1050
Carr & Jacobson 2004 NAR
Smith & Modrich 1997 PNAS

~1000X lower oligo costs

Amplify pools of 50mers using flanking universal PCR primers & 3 paths to 10X error correction

Page 17

Circle Capture Oligos from Chips

Page 18

Circle Capture

3, 5: duplicate controls (no genome)

2, 4: duplicate capture experiments

Expected: 140-271 bp

Page 19

Why low error rates?

Goal of genotyping & resequencing: Discovery of variants
E.g. cancer somatic mutations ~1E-6 (or lab evolved cells)

Estimated # of false positives by consensus error rate:

Consensus error rate   Source                    Human Exons   Full Genome
1E-4                   Bermuda/Hapmap            600           600,000
4E-5                   454 Nature ‘05            240           240,000
3E-7                   Polony-SbL Science ‘05    18            1,800

False negative genotype reduction by allelic-ratio-RNA & haplotypes
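The false-positive estimates on this slide are, to a first approximation, the consensus error rate multiplied by the number of bases examined. A minimal sketch follows. The two target sizes are not stated on the slide; they are inferred here from the 1E-4 row (600 / 1E-4 and 600,000 / 1E-4) and should be read as assumptions.

```python
# Assumed target sizes, inferred from the 1E-4 row of the slide's table.
# They are not stated on the slide itself.
EXON_BASES = 6_000_000
GENOME_BASES = 6_000_000_000

PLATFORMS = [
    ("Bermuda/Hapmap", 1e-4),
    ("454 Nature '05", 4e-5),
    ("Polony-SbL Science '05", 3e-7),
]


def expected_false_positives(error_rate, bases):
    """Expected number of wrong consensus calls over `bases` positions."""
    return error_rate * bases


for name, rate in PLATFORMS:
    exons = expected_false_positives(rate, EXON_BASES)
    genome = expected_false_positives(rate, GENOME_BASES)
    print(f"{name:24s} {rate:.0e}  exons={exons:,.1f}  genome={genome:,.0f}")
```

Under these assumed sizes, five of the six entries reproduce the slide's figures. The 3E-7 exon entry comes out near 1.8, whereas the slide lists 18, so either the assumed exon size or that entry may be off by a factor of ten.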

Page 20

Monitoring resistance to BCR-ABL-kinase inhibitors with polonies during CML patient therapy
Nardi, Raz, Chao, Wu, Stone, Cortes, Deininger, Church, Zhu, Daley (submitted)

E255K

T315I

M244V

Page 21

Rearrangements detected using polony paired end reads
Shendure et al. Science Sep 2005

Deletion   Insertion   Inversion (rare in this clonal population)
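Paired end reads reveal rearrangements because the two tags of a pair should map to the reference at a predictable distance and relative orientation. The sketch below is a generic classifier under stated assumptions, not the method from the cited paper: it assumes a library in which concordant tags map to opposite strands, and the span thresholds are made-up placeholders. Some paired-tag constructions expect a different orientation, in which case the strand test would be inverted.

```python
def classify_pair(pos1, strand1, pos2, strand2, min_span=800, max_span=1200):
    """Classify one mapped read pair relative to the reference.

    pos1, pos2:       reference coordinates of the two tags.
    strand1, strand2: '+' or '-'.
    min_span, max_span: assumed range of spans for a concordant pair
                        (placeholder values, not from the source).
    """
    if strand1 == strand2:
        # Unexpected relative orientation suggests an inverted segment.
        return "inversion"
    span = abs(pos2 - pos1)
    if span > max_span:
        # Tags map too far apart: the sample lacks sequence the reference has.
        return "deletion"
    if span < min_span:
        # Tags map too close together: the sample carries extra sequence.
        return "insertion"
    return "concordant"


pairs = [
    (1000, "+", 2000, "-"),
    (1000, "+", 5000, "-"),
    (1000, "+", 1300, "-"),
    (1000, "+", 2000, "+"),
]
for pair in pairs:
    print(pair, classify_pair(*pair))
```

A real caller would require several independent pairs supporting the same event before reporting it, since single discordant pairs can arise from mapping errors or chimeric library molecules.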