GATK assessment follow up Jianying Li


Page 1: GATK assessment follow up

GATK assessment follow up

Jianying Li

Page 2: GATK assessment follow up

Samples tested

SAMPLE      Gender  ETHNICITY  Genome ref  AVERAGE_COVERAGE  dbsnp
mchd002a2   Female  Female     Build 36    30.77             dbsnp_132
dukscz0106  Male    Male       Build 36    28.93             dbsnp_132
als9c2      Female  white      Build 37    41.53             v4.0
na12878     Female  1KG        Build 36/7  40                dbsnp_132

Page 3: GATK assessment follow up

Analytical parameter set up

• Aligner: BWA 0.5.9
• Samtools: 0.1.12a
• GATK: 1.0.5083
• Covariates: readgroup*, Qraw, dinucleotide, machine cycle
• SVA version: 1.1

Page 4: GATK assessment follow up

Reported quality score (als9c2, build 37)

[Figure panels: Before recalibration | After recalibration]

15% more bases with Q32 and above
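For readers less familiar with reported quality scores: a Phred score Q corresponds to an error probability of 10^(-Q/10), so Q32 is roughly a 0.06% chance that the base call is wrong. The sketch below shows one way the share of bases at or above a threshold could be compared before and after recalibration. The histograms are made-up illustrative numbers, not the als9c2 data, and the function names are hypothetical.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to a base-call error probability."""
    return 10 ** (-q / 10)


def fraction_at_or_above(hist, threshold):
    """Fraction of bases whose quality is >= threshold.

    hist maps quality score -> number of bases with that score.
    """
    total = sum(hist.values())
    if total == 0:
        raise ValueError("empty quality histogram")
    return sum(n for q, n in hist.items() if q >= threshold) / total


# Hypothetical histograms (illustrative only, not taken from the slides).
before = {20: 300, 30: 400, 32: 200, 40: 100}
after = {20: 250, 30: 300, 32: 300, 40: 150}

if __name__ == "__main__":
    print("P(error) at Q32:", phred_to_error_prob(32))
    print("before, Q>=32:", fraction_at_or_above(before, 32))
    print("after,  Q>=32:", fraction_at_or_above(after, 32))
```

Whether the slide's 15% is an absolute or a relative increase is not stated; the same function supports either reading.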

Page 5: GATK assessment follow up

Computational complexity

Processed BAM files
SNVs & Indel files: var_flt_vcf.snp, var_flt_vcf.indel
Metrics: SNV count, Hom/het ratio, Indel count, Ti/Tv ratio, overlap with dbSNP, concordance, Het/tot ratio on X
Run times shown: ~ 4 days, ~ 6 hours

Page 6: GATK assessment follow up

Increased variant calls (als9c2)

Metrics        Raw        GATK-recal
All SNVs       3,668,875  3,724,498
Filtered SNVs  3,508,465  3,570,758
INDELs         521,506    521,680

[Bar chart: All SNVs, Filtered SNVs, INDELs; series als9c2 (raw) and als9c2 (GATK); y-axis 0 to 4,000,000]

Page 7: GATK assessment follow up

Ratio (%) of variant calls: GATK against raw

[Bar chart: All SNVs, Filtered SNVs, INDELs; series als9c2, dukscz0106, mchd002A2; y-axis 97.50 to 102.00]
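The plotted ratio is presumably the GATK count divided by the raw count, times 100; the slide does not define it, so treat that as an assumption. A minimal sketch using the als9c2 counts from the previous slide (counts for the other two samples are not given in the deck):

```python
def ratio_percent(gatk_count, raw_count):
    """GATK call count as a percentage of the raw call count."""
    if raw_count == 0:
        raise ValueError("raw count must be non-zero")
    return 100.0 * gatk_count / raw_count


# als9c2 counts from the "Increased variant calls" slide: (raw, GATK-recal).
als9c2 = {
    "All SNVs": (3668875, 3724498),
    "Filtered SNVs": (3508465, 3570758),
    "INDELs": (521506, 521680),
}

if __name__ == "__main__":
    for metric, (raw, gatk) in als9c2.items():
        print(f"{metric}: {ratio_percent(gatk, raw):.2f}%")
```

Under this definition, values above 100 mean the recalibrated pipeline called more variants than the raw pipeline.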

Page 8: GATK assessment follow up

Ti/Tv ratio (??)

Sample      Raw     GATK
mchd002a2   2.1226  2.1343
als9c2      2.1548  2.1286
dukscz0106  2.1408  2.1287
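As a reminder of what the ratio measures: transitions are purine-to-purine (A<->G) or pyrimidine-to-pyrimidine (C<->T) substitutions, and every other single-base substitution is a transversion. The sketch below computes Ti/Tv from a list of (ref, alt) pairs. The input is toy data, and the function names are illustrative rather than part of SVA, Samtools, or GATK.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def is_transition(ref, alt):
    """True for A<->G or C<->T substitutions."""
    pair = {ref.upper(), alt.upper()}
    return len(pair) == 2 and (pair <= PURINES or pair <= PYRIMIDINES)


def titv_ratio(snvs):
    """Ti/Tv ratio for an iterable of (ref, alt) single-base substitutions."""
    ti = tv = 0
    for ref, alt in snvs:
        ref, alt = ref.upper(), alt.upper()
        if ref not in BASES or alt not in BASES or ref == alt:
            continue  # skip non-SNV or ambiguous records
        if is_transition(ref, alt):
            ti += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions; ratio undefined")
    return ti / tv


if __name__ == "__main__":
    toy = [("A", "G"), ("C", "T"), ("A", "C"), ("G", "A")]
    print(titv_ratio(toy))  # 3 transitions, 1 transversion
```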

Page 9: GATK assessment follow up

Reduced overlap to dbsnp

Overlap with dbSNP, autosomes (%)

Sample           Samtools 0.1.12  GATK
na12878 (1000G)  95.27            running..
mchd002a2        89.32            89.06
als9c2           93.2             93.05
dukscz0106       83.13            82.2
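The slides do not say how the overlap was computed. One plausible definition, assumed here, is the share of called autosomal SNV positions that also appear in dbSNP. A minimal sketch with toy positions; the function name and data are hypothetical.

```python
AUTOSOMES = {str(i) for i in range(1, 23)}


def dbsnp_overlap_percent(called, known):
    """Percent of called autosomal sites that are also present in `known`.

    Both arguments are iterables of (chromosome, position) tuples.
    """
    called_auto = {site for site in called if site[0] in AUTOSOMES}
    if not called_auto:
        raise ValueError("no autosomal calls")
    return 100.0 * len(called_auto & set(known)) / len(called_auto)


if __name__ == "__main__":
    # Toy data: four autosomal calls (three known) plus one call on X.
    called = [("1", 1000), ("1", 2000), ("2", 1500), ("3", 42), ("X", 5)]
    known = {("1", 1000), ("2", 1500), ("3", 42), ("4", 7), ("X", 5)}
    print(dbsnp_overlap_percent(called, known))
```

Because the denominator is the call set, a pipeline that adds calls absent from dbSNP lowers this percentage even if it loses no known sites.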

Page 10: GATK assessment follow up

Increased concordance (SVA)

dukscz0106, build 36

Part     Pipeline  Matched  Not matched  Match%
Part I   gatk      294629   3018         98.9860
         raw       293658   3039         98.9757
Part II  gatk      280499   6629         97.6913
         raw       280504   7591         97.3651
Overall  gatk      575128   9647         98.3503
         raw       574162   10630        98.1823

[Bar chart: match% for Part I, Part II, and Overall; series gatk and raw; y-axis 0.965 to 0.995]
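The match% values on this slide are consistent with matched / (matched + not matched) x 100. A minimal sketch that reproduces them from the dukscz0106 counts; the function name is illustrative.

```python
def match_percent(matched, not_matched):
    """Concordance as a percentage of all compared sites."""
    total = matched + not_matched
    if total == 0:
        raise ValueError("no sites compared")
    return 100.0 * matched / total


# dukscz0106 counts from the slide: (matched, not matched).
dukscz0106 = {
    ("Part I", "gatk"): (294629, 3018),
    ("Part I", "raw"): (293658, 3039),
    ("Part II", "gatk"): (280499, 6629),
    ("Part II", "raw"): (280504, 7591),
    ("Overall", "gatk"): (575128, 9647),
    ("Overall", "raw"): (574162, 10630),
}

if __name__ == "__main__":
    for (part, pipeline), (m, n) in dukscz0106.items():
        print(f"{part:8s} {pipeline:5s} {match_percent(m, n):.4f}")
```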

Page 11: GATK assessment follow up

Increased concordance (SVA)

als9c2, build 37

Part     Pipeline  Matched  Not matched  Match%
Part I   gatk      296497   1297         99.5645
         raw       294755   1842         99.3790
Part II  gatk      281392   3882         98.6392
         raw       281402   5365         98.1291
Overall  gatk      577889   5179         99.1118
         raw       576157   7207         98.7646

[Bar chart: match% for Part I, Part II, and Overall; series als9c2-gatk and als9c2-raw; y-axis 0.97 to 1]

Page 12: GATK assessment follow up

Decreased concordance (??)

mchd002A2, build 36

Part     Pipeline  Matched  Not matched  Match%
Part I   gatk      299325   1026         99.6584
         raw       300054   984          99.6731
Part II  gatk      281392   3882         98.6392
         raw       280139   3675         98.7051
Overall  gatk      280494   3265         98.8494
         raw       580193   4659         99.2034

[Bar chart: match% for Part I, Part II, and Overall; series mchd002A2-gatk and mchd002A2-raw; y-axis 0.982 to 0.998]
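A quick additivity check can help when reading a concordance table like this one. On the two preceding concordance slides the overall counts equal Part I plus Part II, so the sketch below assumes the same should hold here; the slides do not state this explicitly. It uses the mchd002A2 values as they appear on this slide, and the function name is illustrative.

```python
def overall_is_sum(part1, part2, overall):
    """True if overall (matched, not matched) equals part1 + part2."""
    expected = (part1[0] + part2[0], part1[1] + part2[1])
    return expected == tuple(overall)


# mchd002A2 values as shown on the slide: (matched, not matched).
gatk = {"Part I": (299325, 1026), "Part II": (281392, 3882), "Overall": (280494, 3265)}
raw = {"Part I": (300054, 984), "Part II": (280139, 3675), "Overall": (580193, 4659)}

if __name__ == "__main__":
    for name, table in (("gatk", gatk), ("raw", raw)):
        ok = overall_is_sum(table["Part I"], table["Part II"], table["Overall"])
        print(f"{name}: overall equals Part I + Part II -> {ok}")
```

With the values as shown, the raw rows pass the check and the gatk rows do not, so the gatk figures for this sample may be worth re-verifying before reading much into the apparent decrease.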