6000 Shoreline Court, Suite 300, South San Francisco, CA 94080 USA
T 844.464.LUNG (844.464.5864) [email protected]
©2016 Veracyte, Inc. All rights reserved. The Veracyte name and logo are registered trademarks of Veracyte, Inc. All other logos and names are marks of their respective owners in the United States and other countries. PC069.1.1605
Bridging the Gap Between HRCT and Final IPF Diagnosis: Classifying the Usual Interstitial Pneumonia Pattern in Transbronchial Biopsies using Machine Learning on High Dimensional Genomic Data
Kennedy, G.C., Pankratz, D., Choi, Y., Colby, T., Myers, J., Brown, K., Flaherty, K., Lynch, D., Steele, M., Imtiaz, U., Fedorowicz, G., Barth, N., Walsh, P.S., and Huang, J.
Presented at the 2016 ATS International Conference, May 13-18, 2016, San Francisco, California
Testing           AUC    Sens   Spec
SVM               0.83   0.86   0.67
GLMNet            0.84   0.93   0.58
Gradient Boost    0.78   0.95   0.37
Ensemble          0.83   0.95   0.48
RATIONALE
Idiopathic pulmonary fibrosis (IPF) is a chronic, progressive, fibrotic lung disease of unknown cause. Diagnosis requires the presence of a usual interstitial pneumonia (UIP) pattern on pathology or radiology. When high resolution CT (HRCT) results are inconclusive, surgical lung biopsy (SLB) is frequently recommended. Our goal is to develop a molecular test that distinguishes UIP from non-UIP pathology patterns in less invasive transbronchial biopsies (TBB).
METHODS
We collected up to 5 TBBs per patient from 94 patients (Table 1). We used a centralized expert review process to determine histopathology truth labels at both the patient and lung lobe level. Using whole-exome NGS transcriptional data on 211 TBB samples from 59 patients, we trained four different classification models to differentiate UIP (n=128) from non-UIP (n=83) in TBB samples using lobe level pathology truth. We evaluated sample-level classification performance by leave-one-patient-out (LOPO) cross-validation, and selected the best performing model (GLMnet: logistic regression with LASSO penalty using 49 genes) for validation on an independent test set of 35 patients. We used in silico mixture modeling to explore the limit of detection of UIP in mixtures of a UIP and non-UIP sample. We compared classifier sensitivity to radiology in 31 test set patients with available HRCT diagnoses.
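As a minimal sketch of the leave-one-patient-out scheme described above, the snippet below holds out every sample from one patient at a time, so that no patient contributes samples to both the training and the held-out fold. The function name `leave_one_patient_out` and the toy patient IDs are hypothetical illustrations, not taken from the poster.

```python
from collections import defaultdict


def leave_one_patient_out(patient_ids):
    """Yield (patient, train_idx, test_idx), holding out all samples of one patient."""
    by_patient = defaultdict(list)
    for idx, pid in enumerate(patient_ids):
        by_patient[pid].append(idx)
    for pid, test_idx in by_patient.items():
        held_out = set(test_idx)
        train_idx = [i for i in range(len(patient_ids)) if i not in held_out]
        yield pid, train_idx, test_idx


# Hypothetical toy data: 6 TBB samples drawn from 3 patients.
patient_ids = ["P1", "P1", "P2", "P2", "P2", "P3"]
splits = list(leave_one_patient_out(patient_ids))
for pid, train_idx, test_idx in splits:
    print(pid, "train:", train_idx, "test:", test_idx)
```

Grouping by patient rather than by sample matters here because several TBBs come from the same patient; a plain leave-one-sample-out split would let closely related samples appear on both sides of the split.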
RESULTS
The best performing classifier (GLMnet) achieves ROC AUCs of 0.89 in cross-validation (Figure 1) and 0.84 on the independent test set (sensitivity of 93%, specificity of 58%) (Figure 2) when all samples are independently scored. All 20 test set patients with UIP pathology in any lobe had a majority of TBB samples correctly classified as UIP (5 false negative samples out of 80) (Figure 2B). Simulated mixtures of a UIP sample with a non-UIP sample suggest the molecular test can identify UIP when it represents at least 58% of sample content (Figure 3). By radiology, UIP was confidently identified in only 5 of 19 patients with UIP pathology (26% sensitivity) (Table 2). UIP was detected by the molecular classifier in a majority of TBB samples from each of the 14 patients where UIP was not conclusively identified by radiology (Figure 2B).
CONCLUSION
Molecular classification of TBB samples into UIP and non-UIP is largely concordant with histopathology, and accurately identifies UIP in substantially more samples than HRCT alone. This molecular test could improve the detection of UIP using less invasive bronchoscopy samples, potentially reducing the need for SLB and improving the diagnostic accuracy of IPF.
Veracyte, Inc., South San Francisco, CA
FIGURE 1. Classifier Training Performance

Training (LOPO)   AUC    Sens   Spec
SVM               0.85   0.90   0.46
GLMNet            0.89   0.90   0.64
Gradient Boost    0.76   0.90   0.30
Ensemble          0.86   0.90   0.45

A. ROC-AUC curves with classifier decision boundaries (dashed lines) shown for each model. Decision boundaries were selected to achieve 90% sensitivity in cross-validation for all models.
B. GLMNet classification scores (y-axis) for individual TBB samples in the training set, grouped by patient. TBBs are marked by lobe of origin (shape) and by sample-level pathology diagnosis (color). Pathology diagnosis (lower x-axis) and radiology diagnosis (upper x-axis) at the patient level are also listed.

FIGURE 2. Classifier Validation Performance

A. ROC-AUC curves with classifier decision boundaries (dashed lines) shown for each model. Decision boundaries were predetermined on the training set and applied to the validation set.
B. GLMNet classification scores (y-axis) for individual TBB samples in the validation set, grouped by patient. TBBs are marked by lobe of origin (shape) and by sample-level pathology diagnosis (color). Pathology diagnosis (lower x-axis) and radiology diagnosis (upper x-axis) at the patient level are also listed.
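The decision boundaries in Figure 1A are described as chosen to reach 90% sensitivity in cross-validation and then applied unchanged to the validation set in Figure 2A. The sketch below shows one plausible way to pick such a cutoff from held-out scores; it assumes that a higher score means "more UIP-like" and that a sample is called UIP when its score is at or above the cutoff. The function names and the toy scores are hypothetical and are not the poster's data.

```python
import math


def threshold_for_sensitivity(scores, labels, target=0.90):
    """Highest cutoff at which at least `target` of positives score >= cutoff."""
    pos = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not pos:
        raise ValueError("need at least one positive sample")
    # Number of positives that must be called positive (small epsilon guards
    # against floating-point round-up in target * len(pos)).
    k = math.ceil(target * len(pos) - 1e-9)
    return pos[k - 1]


def sens_spec(scores, labels, cutoff):
    """Sensitivity and specificity when calling score >= cutoff positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical cross-validation scores: 10 UIP (label 1) and 5 non-UIP (label 0).
scores = [0.95, 0.9, 0.85, 0.8, 0.75, 0.7, 0.65, 0.6, 0.55, 0.2,
          0.1, 0.3, 0.5, 0.6, 0.7]
labels = [1] * 10 + [0] * 5

cutoff = threshold_for_sensitivity(scores, labels, target=0.90)
sens, spec = sens_spec(scores, labels, cutoff)
print(cutoff, sens, spec)
```

Fixing the cutoff on training data and reusing it on the test set, as the Figure 2A caption describes, keeps the validation estimate free of any tuning on the validation samples.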
Disclosure of Commercial Support and Relevant Financial Interests: Veracyte, Inc. provided all commercial support for the research presented in this study. Giulia Kennedy, Daniel Pankratz, Yoonha Choi, Urooj Imtiaz, Grazyna Fedorowicz, Neil Barth, Sean Walsh, and Jing Huang are employees and shareholders of Veracyte, Inc.
TABLE 1. Patients and Pathologies Used to Train and Validate the UIP TBB Classifier

                                Training Samples        Validation Samples
                                (59 Total Patients)     (35 Total Patients)
Lobe-level Histopathology Dx    Non-UIP    UIP          Non-UIP    UIP
Acute Lung Injury               2 (1)      .            .          .
Bronchiolitis                   5 (1)      .            5 (1)      .
Chronic bronchiolitis           5 (1)      .            .          .
Respiratory bronchiolitis       11 (4)     .            7 (2)      .
Cellular NSIP                   2 (1)      .            .          .
Favor NSIP                      3 (1)      .            3 (1)      .
NSIP                            9 (3)      .            3 (2)      .
Emphysema                       5 (1)      .            6 (2)      .
Favor HP                        5 (1)      .            .          .
Hypersensitivity Pneumonitis    8 (2)      .            5 (1)      .
Organizing pneumonia            9 (3)      .            13 (4)     .
Eosinophilic pneumonia          .          .            2 (1)      .
Pneumocystis pneumonia          3 (1)      .            .          .
Pulmonary hypertension          .          .            2 (1)      .
SRIF                            3 (1)      .            2 (1)      .
Sarcoidosis                     8 (4)      .            4 (1)      .
Lung Cancer                     2 (1)      .            .          .
Conferral-disagree              3 (2)      .            .          .
Classic UIP                     .          40 (13)      .          22 (8)
Difficult UIP                   .          26 (8)       .          11 (3)
Favor UIP                       .          20 (7)       .          19 (5)
UIP                             .          42 (15)      .          28 (10)
Total TBB Samples               83         128          52         80
TABLE 2. Patient-level Pathology Diagnoses for Patients with Available Radiology Diagnoses

                       Pathology Dx (Patient Level)
Radiology (HRCT) Dx    Non-UIP    UIP    Total
Non-UIP                12         14     26
UIP                    0          5      5
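The 26% radiology sensitivity quoted in the Results can be reproduced from the Table 2 counts. The short calculation below treats patient-level pathology as the truth and the HRCT call as the test; the specificity figure it prints is simply what the same counts imply and is not a number stated on the poster.

```python
# Counts copied from Table 2, keyed as (radiology call, pathology truth).
table2 = {
    ("Non-UIP", "Non-UIP"): 12,
    ("Non-UIP", "UIP"): 14,
    ("UIP", "Non-UIP"): 0,
    ("UIP", "UIP"): 5,
}

tp = table2[("UIP", "UIP")]          # UIP by pathology, called UIP by HRCT
fn = table2[("Non-UIP", "UIP")]      # UIP by pathology, not called UIP by HRCT
tn = table2[("Non-UIP", "Non-UIP")]  # non-UIP by both
fp = table2[("UIP", "Non-UIP")]      # non-UIP by pathology, called UIP by HRCT

sensitivity = tp / (tp + fn)  # 5 / 19
specificity = tn / (tn + fp)  # 12 / 12

print(f"HRCT sensitivity: {sensitivity:.0%}")
print(f"HRCT specificity: {specificity:.0%}")
```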
FIGURE 3. In Silico Mixture Modeling of a UIP Sample with a Non-UIP Sample from Different Patients
GLMNet classification scores (y-axis) are shown for a range of mixture proportions (x-axis), with technical variability included in the simulation. The vertical dashed line indicates the mixture proportion which intersects the decision boundary (horizontal dashed line).
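The mixture experiment in Figure 3 can be illustrated with a simplified sketch. It assumes that the mixed expression profile is a proportion-weighted average of the two samples and that the classifier score is linear in expression; the poster does not state either detail, and the technical variability it mentions is omitted here. All names, weights, and expression values are hypothetical.

```python
def mix(uip_expr, non_uip_expr, p):
    """Proportion-weighted average of two expression vectors (p = UIP fraction)."""
    return [p * u + (1 - p) * n for u, n in zip(uip_expr, non_uip_expr)]


def linear_score(expr, weights, intercept):
    """Linear classifier score: intercept plus the weighted sum of expression."""
    return intercept + sum(w * x for w, x in zip(weights, expr))


def crossing_proportion(uip_expr, non_uip_expr, weights, intercept,
                        boundary, steps=100):
    """Smallest grid proportion whose mixture score reaches the boundary, else None."""
    for i in range(steps + 1):
        p = i / steps
        mixed = mix(uip_expr, non_uip_expr, p)
        if linear_score(mixed, weights, intercept) >= boundary:
            return p
    return None


# Hypothetical two-gene example (not the poster's 49-gene model).
weights, intercept = [1.0, -0.5], 0.0
uip_expr = [4.0, 1.0]      # scores 3.5 when pure
non_uip_expr = [1.0, 3.0]  # scores -0.5 when pure
boundary = 1.5

p_star = crossing_proportion(uip_expr, non_uip_expr, weights, intercept, boundary)
print("UIP proportion at the decision boundary:", p_star)
```

In this toy case the score rises linearly with the UIP fraction, so the crossing point is where the line meets the horizontal decision boundary, which is the quantity the vertical dashed line marks in Figure 3.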