HORIZON DIAGNOSTICS
Molecular QC: Interpreting your Bioinformatics Pipeline 25th June 2015
Dr. Danielle Folkard and Dr. Alessandro Riccombeni
2
What is the impact of assay failure in your laboratory and how do you monitor for it?
Research Use Only
3
External Quality Assessment
[Figure: bar chart "EGFR Genotyping Errors, External Quality Assessment 2014". x-axis: EGFR Sample Tested (T790M & L858R, E746_A750del, Wild type, G719S); y-axis: Percentage of Incorrect Results (0 to 40).]
European Molecular Quality Network (EMQN)
4
Clinical Application of Next Generation Sequencing
Using just one sample, one workflow can test for mutation status across multiple genes
8
Introduction
Four decades, three generations
[Figure: sequencing technology timeline, 1976 to 2016. Legend: first generation, second generation (NGS), third generation.]
1976 Maxam-Gilbert
1977 Sanger, φX174 genome
1983 PCR
1987 First automated sequencer (ABI 370)
1990 HGP starts (3B $), Pyrosequencing
1998 Solexa
2000 454 LS Corp.
2003 First human genome sequence
2005 Roche 454
2006 GA, SOLiD
2008 RNA-Seq, Helicos
2009 PacBio, ON, CG
2011 Ion Torrent, MiSeq
2012 Helicos ends
2014 HiSeqX
2016 454 ends
9
Next-Generation Sequencing for Clinical Bioinformatics
• NGS revolutionised our access to genomic information
• 2nd generation technology allows WGS for less than 1000 GBP
• However, a number of challenges exist
  • Data creation
  • Data analysis/processing
  • Data (clinical) interpretation
10
FFPE and NGS
• Why FFPE?
• FFPE has been used for a number of NGS applications
  • Tuononen, Spencer (targeted resequencing)
  • Fanelli, Gu (ChIP-seq, RRBS)
  • Weng, Meng (RNA-Seq, miRNA)
• What happens with fixatives?
  • Need to counteract protein-DNA interactions
  • Additional effects from tissue preparation, paraffin embedding, cross-linking, chemical modification of tissues
• Lower DNA yield, DNA degradation, smaller fragments
• How does FFPE affect NGS pipelines?
Spencer et al. 2013
11
FFPE and NGS
• Schweiger 2009, comparison of FFPE and Fresh Frozen (FF) tissues for Illumina sequencing:
  • Fixation time does not significantly affect the quality of sequencing data from FFPE
  • Lower mappability
  • Higher mutation rate
  • Lower fraction of known SNVs
• Van Allen 2014: good correlation between FFPE and FF samples
Van Allen 2014
12
FFPE and NGS
• Hedegaard 2014: 3 months storage resulted in less efficient DNA extraction
  • High fragmentation: loss of material
  • Decrease in library complexity
  • High increase in PCR duplicates, 60-85% for FFPE vs. 30% for FF
• C > U deamination is a common cause of artifacts
  • U-tolerant polymerase didn’t help
  • Pattern, T <> C, A <> G transition
• The fraction of mapped reads decreases with storage time
  • Increase in partial mappings
  • Increase in gapped mappings
Hedegaard et al. 2014
Wong et al. 2014
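The deamination pattern above can be screened for in a call set. The following is a minimal sketch, not the method used in the cited studies: it labels each single-base substitution as a deamination-like transition (C>T, or G>A on the opposite strand), another transition, or a transversion. The function names and the example calls are hypothetical and invented for illustration.

```python
from collections import Counter

# Purine/pyrimidine classes used to tell transitions from transversions.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

# Substitutions expected from cytosine deamination, as seen on either strand.
DEAMINATION_LIKE = {("C", "T"), ("G", "A")}


def classify_snv(ref: str, alt: str) -> str:
    """Label a single-base substitution by its substitution class."""
    ref, alt = ref.upper(), alt.upper()
    if (ref, alt) in DEAMINATION_LIKE:
        return "deamination-like transition"
    same_class = ({ref, alt} <= PURINES) or ({ref, alt} <= PYRIMIDINES)
    return "other transition" if same_class else "transversion"


def summarise(snvs):
    """Count substitution classes for an iterable of (ref, alt) pairs."""
    return Counter(classify_snv(ref, alt) for ref, alt in snvs)


# Hypothetical calls, invented for illustration only.
example_calls = [("C", "T"), ("G", "A"), ("T", "C"), ("A", "C"), ("C", "T")]
counts = summarise(example_calls)
```

An excess of deamination-like transitions in an FFPE sample relative to a matched fresh frozen sample or a reference standard would be one hint that artifacts are present; the class label alone cannot tell a true variant from an artifact.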
13
FFPE and NGS
[Figure: read alignment diagram with panels "New mismatches: artifact variants", "Lost mappings" and "Partial mappings".]
Artifacts include:
• SNVs
• Larger indels
• CNV
14
FFPE: Conclusions
• FFPE artifacts increase with storage time
• Artifacts reduce the statistical power of your variant calling analysis
• Molecular reference standards help filter out bad mappings and spurious variants
• Molecular reference standards can be included as additional samples in a joint variant calling pipeline
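One way to use a reference standard for this purpose is to compare the pipeline's calls on the standard against its characterized variants. The sketch below assumes variants are represented as (chromosome, position, ref, alt) tuples and that both sets cover the same regions; the function name and all example coordinates are hypothetical.

```python
def evaluate_calls(called, expected):
    """Compare called variants with the variants expected in a reference standard.

    Both arguments are iterables of hashable variant keys, for example
    (chrom, pos, ref, alt) tuples.
    """
    called, expected = set(called), set(expected)
    true_pos = called & expected       # expected variants that were found
    false_pos = called - expected      # calls not supported by the standard
    false_neg = expected - called      # expected variants that were missed
    sensitivity = len(true_pos) / len(expected) if expected else float("nan")
    precision = len(true_pos) / len(called) if called else float("nan")
    return {
        "true_positives": true_pos,
        "false_positives": false_pos,
        "false_negatives": false_neg,
        "sensitivity": sensitivity,
        "precision": precision,
    }


# Hypothetical variant keys, invented for illustration only.
expected_variants = [("chr1", 100, "C", "T"), ("chr2", 200, "G", "A"), ("chr3", 300, "A", "G")]
called_variants = [("chr1", 100, "C", "T"), ("chr2", 200, "G", "A"), ("chr4", 400, "C", "T")]

report = evaluate_calls(called_variants, expected_variants)
```

The false positives from such a comparison are candidates for spurious, artifact-driven calls, although this only holds within the regions where the standard has been characterized.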
15
Upcoming Webinar
Title:
Understanding and Controlling for Sample and Platform Biases in NGS Assays
Date:
Wednesday 22nd July 2015
Time:
4:00pm BST, 11:00am EST
Register now: www.horizondx.com/upcomingwebinar
16
Genome in a Bottle
Infrastructure for performance assessment of NGS
No widely accepted set of metrics to characterize the fidelity of variant calls from NGS
GIAB is developing standards to provide well characterized human genomes as Reference Materials
Tools and standardized methods to use these RMs
17
Horizon Diagnostics: Ashkenazim Trio FFPE Reference Standards
• GM24385 – Ashkenazim PGP Son
  • Coriell: NA24385
  • NIST: HG002
  • PGP: huAA53E0
• GM24149 – Ashkenazim PGP Father
  • Coriell: NA24149
  • NIST: HG003
  • PGP: hu6E4515
• GM24143 – Ashkenazim PGP Mother
  • Coriell: NA24143
  • NIST: HG004
  • PGP: hu8E87A9
18
Horizon Diagnostics: Ashkenazim Trio FFPE Reference Standards
• GM24385 – Ashkenazim PGP Son
• GM24149 – Ashkenazim PGP Father
• GM24143 – Ashkenazim PGP Mother
• Complete Genomics:
  • Small variants (SNPs & Indels)
  • Copy Number Variants
  • Structural Variants
  • Mobile Element Insertions
• National Institute of Standards and Technology (NIST):
  • Illumina HiSeq, BWA + GATK 1.6: SNPs, Indels, large SVs, CNVs
  • Illumina Mate Pair 6kb Insert: mappings
  • PacBio: raw data
  • Ion Torrent Exome: variants + mappings
  • BioNano: raw data + assemblies
  • Moleculo: mappings for Son and Father
• SNPs and Indels shared by 2+ technologies:
  • Complete Genomics, proprietary pipeline
  • Illumina HiSeq, BWA + GATK 1.6
  • TMLT Ion Proton, TMAP + TVC
19
NIST preliminary analysis of Ashkenazim Trio
Sample   Run   SNP & Indels   SV   CNV   Genomic VCF
Son      1     x              x    x     x
Son      2     x              -    x     x
Father   1     x              x    x     x
Father   3     x              x    x     x
Mother   1     x              x    x     x
Mother   2     x              x    x     x
• GM24385 – Ashkenazim PGP Son
• GM24149 – Ashkenazim PGP Father
• GM24143 – Ashkenazim PGP Mother
• National Institute of Standards and Technology (NIST):
  • Illumina HiSeq, BWA + GATK 1.6: SNPs, Indels, large SVs, CNVs
20
NIST preliminary analysis of Ashkenazim Trio
• GM24385 – Ashkenazim PGP Son
• GM24149 – Ashkenazim PGP Father
• GM24143 – Ashkenazim PGP Mother
• National Institute of Standards and Technology (NIST):
  • Illumina HiSeq, BWA + GATK 1.6: SNPs, Indels, large SVs, CNVs
Sample   Run      SNP & Indels   SV   CNV   Genomic VCF   UCSC BED tracks
Son      1        x              x    x     x             x
Son      2        x              -    x     x             x
Father   1        x              x    x     x             x
Father   3        x              x    x     x             x
Mother   1        x              x    x     x             x
Mother   2        x              x    x     x             x
Trio     Merged   x              x    x     x             x
21
Merged Ashkenazim Variants
• GM24385 – Ashkenazim PGP Son
• GM24149 – Ashkenazim PGP Father
• GM24143 – Ashkenazim PGP Mother
• National Institute of Standards and Technology (NIST):
  • Illumina HiSeq, BWA + GATK 1.6: SNPs, Indels, large SVs, CNVs
Sample   Run      SNP & Indels   SV      CNV
Son      1        5637374        14785   381
Son      2        5618495        0       358
Father   1        5575725        17091   377
Father   3        5598533        17569   348
Mother   1        5709480        16851   385
Mother   2        5690410        17488   356
Trio     Total    33830017       83784   2205
Trio     Merged   8423146        53151   1100
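The "Trio Total" row is the column-wise sum of the six per-run rows, and the "Trio Merged" row counts each distinct variant once. As a small sanity check, the sketch below recomputes the totals from the per-run counts on this slide and expresses the merged counts as a fraction of the totals. The variable names are illustrative; the numbers are copied from the table.

```python
# Per-run counts from the slide: (SNP & Indels, SV, CNV).
runs = {
    ("Son", 1): (5637374, 14785, 381),
    ("Son", 2): (5618495, 0, 358),
    ("Father", 1): (5575725, 17091, 377),
    ("Father", 3): (5598533, 17569, 348),
    ("Mother", 1): (5709480, 16851, 385),
    ("Mother", 2): (5690410, 17488, 356),
}

# Merged counts reported on the slide.
trio_merged = (8423146, 53151, 1100)

# Column-wise sums across all six runs.
trio_total = tuple(sum(column) for column in zip(*runs.values()))

# Fraction of the summed calls that remain after merging, per variant class.
merged_fraction = tuple(m / t for m, t in zip(trio_merged, trio_total))

for label, total, frac in zip(("SNP & Indels", "SV", "CNV"), trio_total, merged_fraction):
    print(f"{label}: total={total}, merged/total={frac:.3f}")
```

The recomputed totals match the "Trio Total" row, and roughly a quarter of the summed SNP and indel calls remain after merging, as expected when related individuals and repeat runs share most of their variants.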
22
Filtered, Merged Ashkenazim Variants
• Horizon Diagnostics:
  • Annotation: snpEff + SnpSift (dbNSFP)
    • COSMIC ID
    • dbSNP
    • HGVS AA change
    • HGVS codon change
    • NCBI ClinVar (clinical significance)
    • SIFT score (prob. damaging variant)
    • phastCons 1000 score (site conservation)
    • 1000 genomes p1 Allele Freq. (non-syn.)

                    SNP & Indels   SV       CNV
Merged Trio         8423146        53151    1100
Filtered variants   32532          53151    1100
Mixed variants      1162           1352     0
HIGH impact         5169           1607     41
MOD. impact         68028          265      0
Ann. Effects        73236          175708   3220
• GM24385 – Ashkenazim PGP Son
• GM24149 – Ashkenazim PGP Father
• GM24143 – Ashkenazim PGP Mother
• National Institute of Standards and Technology (NIST):
  • Illumina HiSeq, BWA + GATK 1.6: SNPs, Indels, large SVs, CNVs
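Filtering by predicted impact can be sketched as follows. This is an illustration under assumptions, not the filter used to produce the table: it assumes the VCF INFO column carries snpEff's pipe-delimited `ANN` field, with the impact (HIGH, MODERATE, LOW or MODIFIER) as the third sub-field of each annotation; older snpEff releases write an `EFF` field with a different layout. The function names and the example INFO strings are hypothetical.

```python
# Ordering of snpEff impact categories, from least to most severe.
IMPACT_RANK = {"MODIFIER": 0, "LOW": 1, "MODERATE": 2, "HIGH": 3}


def worst_impact(info: str):
    """Return the most severe impact in an INFO string, or None if no ANN field."""
    for field in info.split(";"):
        if not field.startswith("ANN="):
            continue
        impacts = []
        # One variant can carry several comma-separated annotations.
        for annotation in field[len("ANN="):].split(","):
            parts = annotation.split("|")
            if len(parts) > 2 and parts[2] in IMPACT_RANK:
                impacts.append(parts[2])
        if impacts:
            return max(impacts, key=IMPACT_RANK.__getitem__)
    return None


def keep_variant(info: str, min_impact: str = "MODERATE") -> bool:
    """True if the variant has at least one annotation at or above min_impact."""
    impact = worst_impact(info)
    return impact is not None and IMPACT_RANK[impact] >= IMPACT_RANK[min_impact]


# Hypothetical INFO strings, invented for illustration only.
info_missense = "DP=100;ANN=T|missense_variant|MODERATE|GENE1|,T|upstream_gene_variant|MODIFIER|GENE2|"
info_upstream = "DP=80;ANN=A|upstream_gene_variant|MODIFIER|GENE3|"
info_plain = "DP=50"

selected = [i for i in (info_missense, info_upstream, info_plain) if keep_variant(i)]
```

Because a variant can carry several annotations, the number of annotated effects can exceed the number of variants.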
24
GIAB: Conclusions
• Genome In A Bottle Reference Standards are invaluable for validating variant calling analysis
• NIST and its collaborators shared datasets created with most NGS technologies
• Horizon Diagnostics shared annotated, merged variant calls from NIST for the Ashkenazim Trio
• ~35K variants are predicted to have high or moderate impact within the Trio
• GM24385 (Ashkenazim Son) includes 352 small variants with high/moderate impact that are absent from Father and Mother
• Filtered, annotated variants are available for download on horizondx.com
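Variants present in the Son but absent from both parents can be found with a set difference once each sample's calls are keyed consistently. The sketch below assumes (chromosome, position, ref, alt) tuples as keys; the function name and the example variants are hypothetical and do not come from the Trio data.

```python
def absent_in_parents(son, father, mother):
    """Return variants called in the son but in neither parent."""
    return set(son) - set(father) - set(mother)


# Hypothetical variant keys, invented for illustration only.
son_calls = {("chr1", 100, "C", "T"), ("chr2", 200, "G", "A"), ("chr5", 500, "T", "C")}
father_calls = {("chr1", 100, "C", "T")}
mother_calls = {("chr2", 200, "G", "A"), ("chr7", 700, "A", "G")}

son_only = absent_in_parents(son_calls, father_calls, mother_calls)
```

Such variants are candidates only: they may be de novo, but they can equally reflect a false positive in the child or a missed call in a parent, which is why replicate runs and reference standards matter.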
26
“I would like to validate my NGS workflow. What is the application of your different Q-Seq products?”
27
How to Test the Robustness and Sensitivity of your Workflow and Assay
[Figure: reference standards arranged by Sample Complexity and Sample Features]
• Structural Multiplex: DNA
• Quantitative Multiplex: FFPE, DNA and Formalin-Compromised DNA
• Genome In A Bottle: FFPE
• Gene-Specific Multiplex: DNA and FFPE
• Tru-Q: DNA
30
Quantitative Multiplex Reference Standard as Formalin-Compromised DNA
Characterized fragmentation levels, DNA quantification, and defined allelic frequency
*These products are part of our early access program. It is the responsibility of the individual laboratory to determine expected results specific to its assay.
[Figure: Genomic DNA Tapescreen assay, fragment size in bp. Lane 1: Ladder. Lanes 2, 4: HD-C749 Reference Standard. Lanes 3, 5: HD-C751 Reference Standard.]
32
“I would like to assess my bioinformatics pipeline for detection of SNVs, Structural Variants and CNVs”
33
Variant Type     Mutation               Expected Fractional Abundance (%) or CNV
SNV High GC      GNA11 Q209L            5.6
SNV High GC      AKT1 E17K              5.6
SNV Low GC       KRAS G13D              5.6
SNV Low GC       Pi3Ka E545K            5.6
Long Insertion   EGFR V769 ins          5.6
Long Deletion    EGFR (delE746-A750)    5.3
Fusion           ROS1 translocation     5.6
Fusion           RET translocation      5.6
CNV              MET amplification      4.5 x amplification
CNV              MYC amplification      9.5 x amplification
SNP              EGFR_G719S             5.3
Short Deletion   MET_p.V237fs           4.8*
SNV High GC      NOTCH1_p.P668S         5.0
Short Deletion   FLT3_p.S985fs          5.6
Short Deletion   BRCA2_p.A1689fs        5.6
Short Deletion   FBXW7_p.G667fs         5.6
* %AF lower due to MET amplification
Structural Multiplex Reference Standard
*This product is part of our early access program. It is the responsibility of the individual laboratory to determine expected results specific to its assay.
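Expected fractional abundances such as 5.6% translate into a detection probability once a read depth and a calling threshold are fixed. The sketch below uses a simple binomial sampling model, which is an assumption: it treats reads as independent draws and ignores sequencing error, mapping loss and PCR duplicates. The function name, depths and threshold are hypothetical.

```python
from math import comb


def detection_probability(depth: int, allele_fraction: float, min_alt_reads: int) -> float:
    """Probability of observing at least min_alt_reads variant reads at a site.

    Binomial model: each of `depth` reads carries the variant independently
    with probability `allele_fraction`.
    """
    p_below_threshold = sum(
        comb(depth, k) * allele_fraction**k * (1 - allele_fraction) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_below_threshold


# Hypothetical depths and threshold, with the 5.6% abundance from the table.
for depth in (50, 100, 500):
    p = detection_probability(depth, allele_fraction=0.056, min_alt_reads=5)
    print(f"depth={depth}: P(at least 5 variant reads) = {p:.3f}")
```

Comparing the observed allele frequencies of the standard's variants with these expectations is one way to estimate the depth at which a workflow starts to miss low-frequency variants.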
34
Routinely monitor the performance of your workflows and assays with independent external controls
What extraction and quantification methods are you using?
What is the limit of detection of your workflow?
Is the impact of formalin treatment interesting to you?
What is the impact of assay failure in your laboratory and how do you monitor for it?
35
References
Slide 13: http://www.genome.gov/sequencingcosts/
Tuononen 2013, http://www.ncbi.nlm.nih.gov/pubmed/23362162
Spencer 2013, http://www.ncbi.nlm.nih.gov/pubmed/23810758
Fanelli 2010, http://www.ncbi.nlm.nih.gov/pubmed/21106756
Fanelli 2011, http://www.ncbi.nlm.nih.gov/pubmed/22082985
Gu 2011, http://www.ncbi.nlm.nih.gov/pubmed/?term=Preparation+of+reduced+representation+bisulfite+sequencing+libraries+for+genome-scale+DNA+methylation+profiling.
Gu 2010, http://www.ncbi.nlm.nih.gov/pubmed/20062050
Weng 2010, http://www.ncbi.nlm.nih.gov/pubmed/20593407
Meng 2013, http://www.ncbi.nlm.nih.gov/pubmed/?term=meng+2013+comparison+of+microrna+deep+sequencing
Slide 15: Schweiger 2009, http://www.ncbi.nlm.nih.gov/pubmed/?term=schweiger%5BAuthor%5D+AND+2009%5BDate+-+Publication%5D+ffpe
Van Allen 2014, http://www.ncbi.nlm.nih.gov/pubmed/24836576
Slide 16: Hedegaard 2014, http://www.ncbi.nlm.nih.gov/pubmed/24878701
Wong 2014, http://www.ncbi.nlm.nih.gov/pubmed/24885028
Slide 27: BWA: Li 2010, http://www.ncbi.nlm.nih.gov/pubmed/20080505
GATK: McKenna 2010, http://www.ncbi.nlm.nih.gov/pubmed/20644199
snpEff: Cingolani 2012, http://www.ncbi.nlm.nih.gov/pubmed/?term=22728672
SnpSift: Cingolani 2012, http://www.ncbi.nlm.nih.gov/pubmed/22435069
dbNSFP: Liu 2013, http://www.ncbi.nlm.nih.gov/pubmed/23843252