

Confirmative study to verify the accuracy of our NGS based workflow and software for high-resolution KIR genotyping

Bianca Heyn¹, Jill A. Hollenbach², Wesley Marin², Ravi Dandekar², Paul J. Norman³, Annett Heidl¹, Alexander Schmidt¹,⁴, Jürgen Sauter⁴, Vinzenz Lange¹

¹ DKMS Life Science Lab, Blasewitzer Str. 43, 01307 Dresden, Germany
² University of California, San Francisco School of Medicine, San Francisco, CA, USA
³ University of Colorado School of Medicine, Denver, CO, USA
⁴ DKMS, Kressbach 1, 72072 Tübingen, Germany

Introduction

Human killer-cell immunoglobulin-like receptor (KIR) genes play an important role in the immune system and have been reported to affect the outcome of hematopoietic stem cell transplantation (HSCT). KIR specificity and affinity are influenced by multiple facets of allelic polymorphism. High-resolution KIR genotyping might therefore contribute to improving donor selection and the success of HSCT.

As previously reported, we established an approach based on next-generation sequencing (NGS) to generate high-resolution KIR genotyping data. However, unlike for HLA genotyping, external proficiency testing services are not yet available for comprehensive high-resolution KIR genotyping. Therefore, we exchanged 366 samples that had previously been analyzed comprehensively for KIR based on NGS capture sequencing data and the PING pipeline.

Because comprehensive high-resolution KIR genotyping is still in its early days and the IPD-KIR database covers only a limited part of KIR diversity, comparisons with results obtained using independent approaches are essential to spot systematic biases or shortcomings.

DKMS Life Science Lab

Blasewitzer Str. 43 * 01307 Dresden * www.dkms-lab.de

KIR Workflow and Algorithm

Results and Conclusion

DKMS Life Science Lab generates high-resolution KIR genotyping data by next-generation sequencing of KIR exons 3, 4, 5, 7, 8 and 9. Since this workflow was implemented in October 2016, we have genotyped more than a million stem cell donor registry samples for KIR at high resolution.

High-resolution KIR genotyping poses particular challenges because gene content is variable and many KIR genes are highly homologous. Therefore, we developed an algorithm to estimate exon sequence copy numbers based on the sequence-specific read counts. The estimated copy numbers play a pivotal role in determining gene copy numbers and serve as an important quality marker.
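The poster does not describe the algorithm itself, so the following is only a minimal sketch of how exon copy numbers could be estimated from sequence-specific read counts. It assumes, purely for illustration, that counts are normalized against the same exon of a reference gene present in a known number of copies; the function names, the reference gene choice, the tolerance and the example counts are all hypothetical and are not taken from neXtype.

```python
from statistics import median


def estimate_exon_copy_numbers(read_counts, reference_gene, reference_copies=2):
    """Estimate per-exon copy numbers from sequence-specific read counts.

    read_counts maps gene -> {exon number: read count}. Each count is divided
    by the count for the same exon of a reference gene that is ASSUMED to be
    present in `reference_copies` copies, then scaled by that copy number.
    """
    ref = read_counts[reference_gene]
    estimates = {}
    for gene, exons in read_counts.items():
        estimates[gene] = {
            exon: reference_copies * count / ref[exon]
            for exon, count in exons.items()
            if ref.get(exon, 0) > 0  # skip exons without reference reads
        }
    return estimates


def call_gene_copy_number(exon_estimates, tolerance=0.3):
    """Round the median exon estimate to a gene copy number.

    The boolean flags whether every exon lies within `tolerance` of the
    called value, which is one way to use the estimates as a quality marker.
    """
    values = list(exon_estimates.values())
    copies = round(median(values))
    consistent = all(abs(value - copies) <= tolerance for value in values)
    return copies, consistent


# Hypothetical read counts for three exons of four genes in one sample.
example_counts = {
    "3DL3": {3: 1000, 4: 800, 5: 1200},  # assumed two-copy reference
    "2DL1": {3: 520, 4: 390, 5: 610},
    "2DS1": {3: 0, 4: 0, 5: 0},
    "3DL1": {3: 1010, 4: 790, 5: 1500},
}
example_estimates = estimate_exon_copy_numbers(example_counts, "3DL3")
example_calls = {
    gene: call_gene_copy_number(exons)
    for gene, exons in example_estimates.items()
}
```

In this made-up sample, the 2DL1 exons agree on one copy, 2DS1 is absent, and 3DL1 is called at two copies but flagged because one exon deviates from the others, which is the kind of inconsistency a quality marker would surface.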

KIR gene   Concordant      Allele agreement,          Discordant    No comparison
           results         unclear gene copy number   results       possible
2DL1       347 (94.8%)     2 (0.5%)                   9 (2.5%)      8 (2.2%)
2DL2       363 (99.2%)     -                          3 (0.8%)      -
2DL3       213 (100.0%)    -                          -             -
2DL4       350 (95.6%)     1 (0.3%)                   7 (1.9%)      8 (2.2%)
2DL5       355 (98.9%)     1 (0.3%)                   2 (0.6%)      1 (0.3%)
2DP1       356 (97.3%)     1 (0.3%)                   8 (2.2%)      1 (0.3%)
2DS1       361 (99.7%)     -                          †             †
2DS2       360 (98.9%)     -                          3 (0.8%)      1 (0.3%)
2DS3       363 (99.7%)     -                          †             †
2DS4       365 (99.7%)     -                          †             †
2DS4N      362 (99.2%)     -                          3 (0.8%)      -
2DS5       362 (100.0%)    -                          -             -
3DL1       359 (98.4%)     2 (0.5%)                   3 (0.8%)      1 (0.3%)
3DL2       358 (98.1%)     -                          5 (1.4%)      2 (0.5%)
3DL3       356 (97.3%)     -                          8 (2.2%)      2 (0.5%)
3DS1       363 (99.7%)     -                          †             †
total      5593 (98.5%)    7 (0.1%)                   53 (0.9%)     26 (0.5%)

† One sample (0.3%) belongs in one of these two columns; the extracted layout does not show which. The column totals imply two of the four marked genes in each column.

Table 1. Comparison of the genotyping results between UCSF and DKMS. Number of samples and percentages.
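The percentages in the total row follow directly from the counts. The short sketch below recomputes them from the four column totals and checks that the per-gene concordant counts add up to the reported total; the function and variable names are illustrative only.

```python
# Column totals taken from the "total" row of Table 1.
totals = {
    "concordant": 5593,
    "allele agreement, unclear gene copy number": 7,
    "discordant": 53,
    "no comparison possible": 26,
}

# Per-gene concordant counts from Table 1, in row order (2DL1 ... 3DS1).
concordant_per_gene = [
    347, 363, 213, 350, 355, 356, 361, 360,
    363, 365, 362, 362, 359, 358, 356, 363,
]


def category_percentages(counts):
    """Return each category's share of all comparisons, in percent."""
    n = sum(counts.values())
    return {name: round(100 * count / n, 1) for name, count in counts.items()}


percentages = category_percentages(totals)
total_comparisons = sum(totals.values())
concordant_sum = sum(concordant_per_gene)
```

The same function applied to a single row, for example the four 2DL1 counts 347, 2, 9 and 8, yields that row's percentages.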

For this confirmative study, we exchanged 366 samples that had previously been analyzed comprehensively for KIR based on NGS capture sequencing data and the PING pipeline. We analyzed the samples with our NGS workflow and performed high-resolution KIR genotyping using our analysis software neXtype: 99.3% of all present genes were called at high resolution. In 196 cases, we identified sequences not yet represented in the IPD-KIR database (Figure 2). When the neXtype genotyping data were compared with the corresponding PING results, 98.5% of the genotype calls at the individual KIR gene level were concordant. In another 0.1% of the results, we found allele agreement with unclear gene copy numbers. After detailed reanalysis of the discordant results, we identified 0.9% of the results as discrepant (Figure 3). For 0.5%, no comparison was possible because of low sample quality or insufficient reference data (Table 1).
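As an illustration of how a per-gene comparison could be sorted into the four categories of Table 1, the sketch below represents each call as a tuple of allele labels, one per gene copy, with None when no result is available. The category rules, the function names and the allele labels are assumptions made for this example; they are not the criteria actually applied to the neXtype and PING results.

```python
from collections import Counter


def classify_comparison(call_a, call_b):
    """Assign one (sample, gene) comparison to a category.

    A call is a tuple of allele labels, one per gene copy; an empty tuple
    means the gene is absent and None means no usable result.
    """
    if call_a is None or call_b is None:
        return "no comparison possible"
    if Counter(call_a) == Counter(call_b):
        return "concordant"  # same alleles in the same copy numbers
    if set(call_a) == set(call_b):
        return "allele agreement, unclear gene copy number"
    return "discordant"


def summarize(calls_a, calls_b):
    """Tally categories over every (sample, gene) key seen in either set."""
    tally = Counter()
    for key in calls_a.keys() | calls_b.keys():
        tally[classify_comparison(calls_a.get(key), calls_b.get(key))] += 1
    return tally


# Hypothetical calls; the allele labels are placeholders.
calls_lab_a = {
    ("S1", "2DL1"): ("*001", "*002"),
    ("S1", "3DL1"): ("*001", "*001"),
    ("S2", "2DL1"): ("*003",),
    ("S2", "3DL1"): None,
}
calls_lab_b = {
    ("S1", "2DL1"): ("*002", "*001"),
    ("S1", "3DL1"): ("*001",),
    ("S2", "2DL1"): ("*004",),
    ("S2", "3DL1"): ("*005",),
}
summary = summarize(calls_lab_a, calls_lab_b)
```

Comparing allele multisets rather than ordered tuples keeps the result independent of the order in which each pipeline reports the two copies.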

This study demonstrates the maturity of the current approaches and underscores that comprehensive high-resolution KIR genotyping is possible in a high-throughput setting.

Figure 1. neXtype high-resolution KIR genotyping interface. Data and results of one sample. Absent genes are shown in grey, detected genes in green. Displayed details include read counts and predicted copy numbers for each exon.

Figure 2. Distribution of novel alleles found with neXtype among the 15 KIR genes.

Figure 3. Proportion of concordant/discordant allele-level KIR genotyping results obtained with neXtype and PING.