Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification. Sharon Chao Wootton, PhD, Senior Bioinformatics Scientist, Human Identification, Life Technologies

Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

Page 1: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

Sharon Chao Wootton, PhD
Senior Bioinformatics Scientist
Human Identification
Life Technologies

Page 2: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

07/02/12

Agenda

• Overview of PGM™ applications
• SNPs in human identification
• PGM™ technology
• Development of a SNP panel
• Future plans

Page 3: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

07/03/12

Applications on the PGM

• STR genotype
• SNP genotype
• mtDNA haplotype
• Y-STR genotype
• Microbial forensics

Page 4: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

Applications on the PGM

• STR genotype
• SNP genotype
  • Identity SNPs: high heterozygosity, low population heterogeneity
  • Lineage-informative SNPs: haplotype markers for kinship analysis; mitochondrial genome or control region; Y-chr SNPs; mini haplogroups
  • Ancestry-informative SNPs: high population heterogeneity
  • Phenotypic SNPs: hair, eye, skin color
• mtDNA haplotype
• Y-STR genotype
• Microbial forensics

Page 5: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

PGM™ Applications - SNP Genotyping

• Abundant in the human genome (~9 million)
• 90% of human genetic variation comes from SNPs
• SNPs occur about every 300 bp, in coding and non-coding regions
• Most SNPs are bi-allelic
• Low mutation rate (10^-8 to 10^-9 per locus per generation)
• Small amplicon size

Page 6: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

PGM™ Applications - SNP Genotyping

• Missing person identification
• Paternity
• DVI (disaster victim identification)
• Ancestral haplotyping
• Molecular “phenotype”

Page 7: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

Current SNP Technologies

• Allele discrimination methods
  • Sequencing
  • Primer extension
  • Ligation
  • Hybridization
  • Enzymatic cleavage

[Figure: assay examples SNaPshot®, oligo ligation assay (OLA), and TaqMan® Assay; plot labels Homozygote 1, Heterozygote, Homozygote 2.]

Page 8: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

Current SNP Technologies

• Limitation on number of SNPs and samples run simultaneously

Page 9: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

PGM™ for SNP Genotyping

• Allows combination of a large number of SNPs in one multiplex
  • Simultaneous sequencing of autosomal, Y-chr, X-chr, and mitochondrial SNPs possible
• Barcode up to 96 individuals and sequence on one chip
• Output is sequence, not an indicator
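A minimal sketch of how barcoded reads from several individuals could be separated after a pooled run. The barcode sequences, sample names, and the `demultiplex` helper are hypothetical illustrations, not a vendor barcode set or the instrument software's method. The sketch assumes each read begins with its barcode and that only exact matches are accepted.

```python
from collections import defaultdict

# Hypothetical barcodes for illustration only (not a vendor barcode set).
BARCODES = {
    "ACGTAC": "individual_1",
    "TGCATG": "individual_2",
}


def demultiplex(reads, barcodes=BARCODES):
    """Group reads by the sample whose barcode prefixes the read.

    Assumes all barcodes share one length and match exactly.
    Reads with an unknown prefix are kept whole under "unassigned".
    """
    barcode_len = len(next(iter(barcodes)))
    by_sample = defaultdict(list)
    for read in reads:
        prefix, insert = read[:barcode_len], read[barcode_len:]
        sample = barcodes.get(prefix)
        if sample is None:
            by_sample["unassigned"].append(read)
        else:
            # Store only the insert, with the barcode trimmed off.
            by_sample[sample].append(insert)
    return dict(by_sample)
```

Because the output is sequence rather than an indicator signal, the same reads that carry the barcode also carry the genotype, so sample assignment and allele calling can both be done from one file.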

Page 10: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

PGM™ Instrument

Page 11: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

The Ion Torrent PGM™ Instrument System

• Sequencing instrument
• One Touch™ instruments: emulsion PCR and enrichment
• Semiconductor chip
• Sequencing chemistry: natural nucleotides, natural enzymes
• Sample prep: libraries, clonal beads
• Torrent Server

Categories: instruments, reagents, data analysis

Page 12: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

Ion chip scalability

• Ion 314: >150 Mb
• Ion 316: >850 Mb
• Ion 318: >1400 Mb

Page 13: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

Leveraging Semiconductor Technology

• Wafer: semiconductor manufacturing
• Chip: semiconductor packaging
• Chip cross section: semiconductor design

Page 14: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

Sequence detection by pH

Rothberg J.M. et al. Nature doi:10.1038/nature10242

[Figure: sensor cross section. Labels: dNTP, H+, ∆pH, ∆Q, ∆V, sensing layer, sensor plate, bulk, drain, source, silicon substrate, to column receiver.]

Page 15: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

Data Output is an Ionogram

• Must be read “up-and-down” along with “left-to-right”
• Height of bar indicates how many nucleotides were incorporated during a flow
• “Negative” or “zero” flows indicate no nucleotide incorporation
  • These observations are omitted when converting to nucleotide space

[Figure: ionogram showing the key sequence followed by the read. Sequence: …AATCTTCTGAATTTCTGCAA…, with homopolymer bars marked (TTT), (AA), (AA).]
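The conversion from flow space to nucleotide space described above can be sketched as follows. This is an illustration under stated assumptions, not the instrument's base caller: the repeating flow order "TACG" is assumed for the example, the per-flow counts are taken as already rounded to whole numbers, and the function name `flows_to_bases` is hypothetical.

```python
def flows_to_bases(flow_order, counts):
    """Convert per-flow incorporation counts into a base sequence.

    flow_order: nucleotides flowed in a repeating cycle (assumed here).
    counts: number of bases incorporated in each successive flow.
    A count of 0 is a "zero" flow and contributes nothing to the sequence.
    """
    if any(n < 0 for n in counts):
        raise ValueError("incorporation counts must be non-negative")
    bases = []
    for i, n in enumerate(counts):
        nucleotide = flow_order[i % len(flow_order)]
        # The bar height n becomes a homopolymer run of length n.
        bases.append(nucleotide * n)
    return "".join(bases)


# Flows T,A,C,G,T,A,C,G,T with bar heights 0,2,0,0,1,0,1,0,2
print(flows_to_bases("TACG", [0, 2, 0, 0, 1, 0, 1, 0, 2]))
```

In this example the two-high bars correspond to the homopolymers AA and TT, and the five zero flows drop out of the resulting sequence.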

Page 16: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

PGM™ Applications

• STR genotype
• SNP genotype
• mtDNA haplotype
• Y-STR genotype
• Microbial forensics

Page 17: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

PGM™ Applications

• STR genotype
• SNP genotype
  • Identity SNPs: high heterozygosity, low population heterogeneity
  • Lineage-informative SNPs: haplotype markers for kinship analysis; mitochondrial genome or control region; Y-chr SNPs; mini haplogroups
  • Ancestry-informative SNPs: high population heterogeneity
  • Phenotypic SNPs: hair, eye, skin color
• mtDNA haplotype
• Y-STR genotype
• Microbial forensics

Page 18: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

HID SNP Panel v0.1

• Based on published SNPs with high heterozygosity and low Fst
• Genotype match probabilities of 10^-31 to 10^-35

136 SNPs:
• 33 Y-SNPs
• 70 Ken Kidd SNPs
• 36 SNPforID SNPlex SNPs
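To show why high heterozygosity matters for match probability, here is a minimal sketch of the standard calculation for bi-allelic SNPs. It assumes Hardy-Weinberg proportions and independent loci; the function names and the example allele frequencies are illustrative and are not taken from the panel.

```python
from math import prod


def heterozygosity(p):
    """Expected heterozygosity 2pq for a bi-allelic SNP with allele frequency p."""
    return 2 * p * (1 - p)


def genotype_match_probability(p):
    """Chance that two unrelated people share a genotype at one SNP.

    Sums the squared frequencies of the three genotypes (AA, AB, BB)
    under Hardy-Weinberg proportions.
    """
    q = 1 - p
    return (p * p) ** 2 + (2 * p * q) ** 2 + (q * q) ** 2


def panel_match_probability(allele_freqs):
    """Multiply per-SNP match probabilities, assuming independent loci."""
    return prod(genotype_match_probability(p) for p in allele_freqs)


# Two illustrative SNPs, both at the most informative frequency p = 0.5
print(genotype_match_probability(0.5))
print(panel_match_probability([0.5, 0.5]))
```

A single SNP contributes at best a factor of 0.375 (at p = 0.5), which is why a panel needs many high-heterozygosity loci to reach very small match probabilities.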

Page 19: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

Population data

Kidd Lab’s ALFRED (the ALlele FREquency Database)

[Figure: ALFRED allele frequencies by population for rs7704770.]

Page 20: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

SNP Panel Development - AmpliSeq™

Workflow: Customize Panel, Construct Library, Prepare Template, Run Sequence, Analyze Data

• AmpliSeq™ custom panel
  • Up to 6,144 primer pairs
  • 10 ng DNA input
  • Up to 200 bp targets
• Multiplex clonal bead amplification: generic
• Multiplex sequencing on single chip: generic
• SNP Genotype plugin: analysis custom to SNP panel

Page 21: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

Library Preparation

Short amplicon method (target amplicons 75 - 200 bp):
• Genomic DNA
• PCR with FWD and REV primers, pool amplicons, end-repair
• Short amplicon pool, 75 - 200 bp

Long amplicon method (target amplicons > 200 bp):
• Genomic DNA
• PCR with FWD and REV primers, pool amplicons, fragment with Ion Shear™ Kit
• Fragmented long amplicon pool, 50 - 500 bp

Both methods:
• Adaptor ligation (P1, IA) OR barcode adaptors ligation (P1, IA-BCx)
• Nick-translation and PCR
• Final barcoded library: IA, BCx, target amplicon, P1

Page 23: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

Data analysis

Page 24: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

SNP genotype calling

[Figure: read alignments at rs891700 with coverage track, aligned reads (A, T, C, G), and reference; the site is called Het G/A.]
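A minimal sketch of how a genotype such as "Het G/A" could be derived from the reads covering one SNP. This is not the SNP Genotype plugin's algorithm: the `call_genotype` function, the minimum depth of 20 reads, and the 20% second-allele fraction are hypothetical values chosen only for illustration.

```python
def call_genotype(counts, min_depth=20, min_fraction=0.2):
    """Call a bi-allelic genotype from per-base read counts at one position.

    counts: dict mapping base ("A", "C", "G", "T") to number of reads.
    Returns "X/Y" (most-covered allele first), or None if depth is too low.
    Thresholds are illustrative assumptions, not validated settings.
    """
    full = {base: counts.get(base, 0) for base in "ACGT"}
    depth = sum(full.values())
    if depth < min_depth:
        return None  # not enough coverage to make a call
    ranked = sorted(full.items(), key=lambda item: item[1], reverse=True)
    (first, _), (second, second_reads) = ranked[0], ranked[1]
    if second_reads / depth >= min_fraction:
        return f"{first}/{second}"  # heterozygote
    return f"{first}/{first}"  # homozygote


print(call_genotype({"A": 48, "C": 0, "G": 52, "T": 1}))
```

Reads supporting a third base, like the single T read in the example, fall below the fraction threshold and are treated as noise.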

Page 25: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

Coverage needed to overcome undercalling

[Figure: per-sample bar charts for barcodes BC6, BC7, BC8, BC9, and BC10; a 100x label appears for each sample. Left panels: reads per allele (y-axis: depth of coverage for each allele; x-axis: SNPs by rs ID; series T, G, C, A). Right panels: reads per allele normalized (y-axis: allele allocation, 0% to 100%; x-axis: SNPs by rs ID; series T, G, C, A).]

Page 26: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

Higher depth of coverage precludes undercalling

[Figure: per-sample bar charts for samples LT01, LT03, LT04, LT05, and LT06; a 100x label appears for each sample. Left panels: depth of coverage for each allele (x-axis: SNPs by rs ID; series T, G, C, A). Right panels: allele allocation, 0% to 100% (x-axis: SNPs by rs ID; series T, G, C, A).]
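The link between depth and undercalling can be illustrated with a simple binomial model. This sketch reads "undercalling" as a true heterozygote being reported as a homozygote because too few reads support its second allele; that reading, the 20% calling threshold, and the 50/50 allele balance are assumptions for illustration, not values from the slides.

```python
from math import comb


def het_undercall_probability(depth, min_fraction=0.2, allele_balance=0.5):
    """Probability that a true heterozygote looks homozygous at a given depth.

    Models the reads of one allele as Binomial(depth, allele_balance) and
    sums the outcomes where the less-covered allele has a read fraction
    below min_fraction. All parameters are illustrative assumptions.
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    total = 0.0
    for k in range(depth + 1):
        minor_reads = min(k, depth - k)
        if minor_reads / depth < min_fraction:
            total += (
                comb(depth, k)
                * allele_balance ** k
                * (1 - allele_balance) ** (depth - k)
            )
    return total


for depth in (10, 30, 100):
    print(depth, het_undercall_probability(depth))
```

Under these assumptions the probability falls steeply as depth increases, which is consistent with the slide's point that higher coverage protects against missed heterozygotes.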

Page 27: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

Amplicon coverage

Y-SNPs

[Figure: two panels, female individual and male individual. x-axis: SNPs sorted by depth of coverage; y-axis: depth of coverage.]

Page 29: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

HID-SNP Genotyper plugin

[Figure: plugin output showing allele coverage and alleles represented, with SNPs by rs ID.]

Page 30: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

• 314 chip: ~32 individuals
• 316 chip: >96 individuals

12,762 124x
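A rough sketch of the arithmetic behind how many individuals fit on one chip. It assumes reads are spread evenly over amplicons and individuals, which real runs only approximate. The usable-read counts in the example are hypothetical inputs, not chip specifications, and the function names are illustrative.

```python
def reads_per_individual(n_amplicons, target_coverage):
    """Reads needed for one individual at a uniform target coverage."""
    return n_amplicons * target_coverage


def individuals_per_chip(usable_reads, n_amplicons, target_coverage, max_barcodes=None):
    """Individuals that fit on a chip, optionally capped by available barcodes.

    usable_reads is a hypothetical input; it depends on chip type and run quality.
    """
    needed = reads_per_individual(n_amplicons, target_coverage)
    count = usable_reads // needed
    if max_barcodes is not None:
        count = min(count, max_barcodes)
    return count


# 136 SNPs at 100x needs 13,600 reads per individual under these assumptions.
print(reads_per_individual(136, 100))
print(individuals_per_chip(500_000, 136, 100))
print(individuals_per_chip(2_000_000, 136, 100, max_barcodes=96))
```

Uneven amplicon coverage means the weakest SNP sets the practical limit, so a real design would leave headroom above the uniform estimate.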

Page 31: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

HID Ion Community Homepage

Page 32: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

Summary

• SNPs are valuable identifiers when small-amplicon PCR-based detection is necessary
• Not intended to supplant STRs for forensic typing
• Next-generation sequencing technologies allow highly multiplexed analysis of both SNPs and individuals
• Identity SNP panel on PGM™ using well-characterized polymorphisms

Page 33: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

Future Plans

• Potential external collaborations for research applications on the PGM
• Ancestral and phenotypic SNP panel
• Mini haplogroups, Y and mito haplotyping panel
• Microbial forensics

Page 34: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

Acknowledgments

• Robert Lagacé
• Lori Hennessy
• Reina Langit
• Joe Chang
• Chien-wei Chang
• Narasimhan Rajagopalan

Page 35: Applications of Personal Genome Machine (PGM™) in SNP-based Human Identification

Thank You

© 2012 Life Technologies Corporation. All rights reserved. The trademarks mentioned herein are the property of Life Technologies Corporation or their respective owners.

Refer to product page on the Life Technologies website for Limited Use License.

The content provided herein may relate to products that have not been officially released and is subject to change without notice.