

    Development and Validation of a Gene Signature Classifier for Consensus Molecular Subtyping of Colorectal Carcinoma in a CLIA-Certified Setting

    Jeffrey S. Morris1, Rajayalakshmi Luthra2, Yusha Liu3, Dzifa Duose2, Wonyul Lee4, Neelima Reddy2,

    Justin Windham5, Huiqin Chen4, Zhimin Tong2, Baili Zhang2, Wei Wei6, Manyam Ganiraju7, Bradley

    Broom7, Hector Alvarez2, Alicia Mejia2, Omkara Veeranki2, Mark Routbort2, Van Morris8, Michael J.

    Overman8, David Menter8, Riham Katkhuda9, Ignacio I. Wistuba2, Jennifer S. Davis10, Scott Kopetz8*,

    Dipen M. Maru2*

    1 Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania School of Medicine

    2 Division of Pathology and Laboratory Medicine, The University of Texas MD Anderson Cancer Center

    3 Department of Biostatistics, University of Chicago School of Medicine

    4 Department of Biostatistics, The University of Texas MD Anderson Cancer Center

    5 NanoString Technologies Inc.

    6 Cleveland Clinic Foundation

    7 Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center

    8 Department of Gastrointestinal Medical Oncology, The University of Texas MD Anderson Cancer Center

    9 Department of Pathology, University of Chicago Medical Center

    10 Department of Epidemiology, The University of Texas MD Anderson Cancer Center

    * Contributed equally as co-senior authors of this article

    Corresponding Author:

    Dipen Maru, MD

    Professor

    Departments of Anatomic Pathology and Translational Molecular Pathology

    The University of Texas MD Anderson Cancer Center

    Phone: 713 792 2678

    Email: [email protected]

    Downloaded from clincancerres.aacrjournals.org on July 5, 2021. © 2020 American Association for Cancer Research.

    Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on October 27, 2020; DOI: 10.1158/1078-0432.CCR-20-2403


    Abstract:

    Purpose: Consensus molecular subtyping (CMS) of colorectal cancer (CRC) has the potential to reshape the CRC landscape. We developed and validated an assay that is applicable to formalin-fixed paraffin-embedded (FFPE) samples of CRC and implemented the assay in a CLIA-certified laboratory. Experimental design: We performed an in silico experiment to build an optimal CMS classifier using a training set of 1329 samples from 12 studies and a validation set of 1329 samples from 14 studies. We constructed an assay based on NanoString CodeSets for the top 472 genes, performed analyses on paired flash frozen (FF)/FFPE samples from 175 CRCs to adapt the classifier to FFPE using a subset of genes found to be concordant between FF and FFPE, tested the classifier's reproducibility and repeatability, and validated it in a CLIA-certified laboratory. We assessed the prognostic significance of CMS in 345 patients pooled across 3 clinical trials. Results: The best classifier was a Weighted Support Vector Machine with high accuracy across platforms and gene lists (>0.95), and the 472-gene model outperformed existing classifiers. We constructed subsets of 99 and 200 genes with high FF/FFPE concordance and adapted an FFPE-based classifier that had strong classification accuracy (>80%) relative to “gold standard” CMS. The classifier was reproducible across sample types and RNA quality, and demonstrated poor prognosis for CMS1 and CMS3 and good prognosis for CMS2 in metastatic CRC (p


    Introduction:

    Colorectal cancer (CRC) is the third most common cancer and a leading cause of cancer death worldwide. Several papers have introduced CRC molecular subtyping systems, each partitioning colorectal cancer into three to six subtypes (1-7). The translational value of these works was limited by their relatively small sample sizes and by the lack of consensus regarding which of the six subtyping systems best captured tumor heterogeneity and had superior utility as a predictive and/or prognostic marker. In this context, Guinney et al. (8) assembled the Colorectal Cancer Subtyping Consortium (CRCSC), which sought to identify consensus molecular subtypes (CMS) by assembling a database of gene expression measurements from 4,151 CRC patients from a collection of 18 international studies, applying each of the six subtyping systems to each of these samples, and then using a network analysis to identify consensus clusters. The four consensus subtypes were identified primarily based on the biologic characteristics of colorectal cancer. However, findings by Guinney et al. and subsequent studies have demonstrated prognostic and predictive value of CMS in colorectal cancer (9-17).

    To fully realize these potential benefits of CMS, it is necessary to have a robust, reliable single-sample classifier to discern a CRC patient’s CMS from the tumor tissue. As part of an international consortium (8), we previously presented a Random Forest classifier that included 5973 genes and a “single sample” classifier based on a nearest centroid predictor applied using 693 genes, built primarily using microarrays designed for use with flash frozen (FF) samples. Efforts are needed to build a more parsimonious single-sample classifier using fewer genes that can be reliably run on RNA extracted from formalin-fixed paraffin-embedded (FFPE) samples. In this paper, we introduce an FFPE-based CMS classifier using the NanoString platform that has strong accuracy for predicting the CMS in CRC samples, including those from the CRCSC study. This gene classifier was discovered and validated in silico using the CRCSC data sets and subsequently optimized based on the degree of correlation across tissue types (FFPE vs. FF samples) and platform types (NanoString vs. Affymetrix). Subsequently, we validated this FFPE tissue-based gene classifier in a CLIA-certified molecular diagnostic laboratory and demonstrated the prognostic significance of CMS in CRC.

    Materials and Methods:

    Development and Validation of CMS Classifier on CRCSC: We performed in silico development and validation of the CMS classifier using the samples and data sets that were part of the “consensus” set in CRCSC, meaning that they had so-called “gold standard” CMS status, defined based on agreement among the six different subtyping systems, against which we could assess the classification accuracy of our CMS classifier. Details of the discovery and validation approach and of the data sets used, including tissue type, total number of samples, total number of consensus samples, and preprocessing method, are shown in Figure 1A, Supplementary Table 1, and Supplementary Methods 1 and 2.

    Classification Modeling Strategies: The classification modeling strategies we considered included Linear Discriminant Analysis (18,19), Quadratic Discriminant Analysis (20,21), K-Nearest Neighbor (22,23), Random Forest (8,24), Rotation Forest (25,26), Weighted Support Vector Machine (wSVM) (27,28), Distance-Weighted Discrimination (DWD) (29,30), and Ensemble Methods comprising voting schemes across these classifiers (Supplementary Methods 3). We split the training data set V1 into four subsets containing 332, 332, 332, and 333 samples, respectively, for use in the four-fold cross-validation model building strategy. For each modeling strategy, we applied quantile normalization (30) and fit the model to each of the four ¾ subsets, optimizing tuning parameters using nested cross-validation, and assessed accuracy for predicting the gold standard CMS on the left-out ¼ subset. The tuning parameters that showed the best accuracy were selected as the optimal parameters for each subset. We summarized the predictive accuracy of each modeling strategy as a function of the number of genes, allowing us to assess both which modeling strategy appears to be best and the minimum number of genes needed for accurate CMS classification. After choosing the best modeling strategy, we computed the classification accuracy in the validation data set V2 as well as in the various subsets mentioned above, and again summarized the results as a function of the number of genes in the model.

    Gene Ranking Strategy: We designed a boosting procedure based on multi-class AdaBoost (31) to order the genes (see Supplementary Methods 4). It amounts to a forward stage-wise additive selection in which samples are re-weighted at each step so that the next best gene focuses more on samples misclassified in previous steps, resulting in a list of genes ranked in descending order of classification importance. By using the same reduced gene sets for each classification method, we were able to compare directly which method performs better, to find the minimum gene set size yielding good classification performance, and to compare the various methods fairly at any desired model size.

    wSVM classifier: Our results revealed that the best performing classifier was the wSVM. The user calls the wSVM function with an N by P matrix of expression values for P genes for each of N samples, with the column names as Entrez IDs, and the function quantile normalizes the data and applies the wSVM to obtain class predictions. After pairwise coupling, for sample $i$ we obtain probabilities of each CMS, $\pi_{ij}$, such that $\sum_{j=1}^{4} \pi_{ij} = 1$, with $\pi_i = \max_j \{\pi_{ij}\}$ indicating the highest CMS class probability for that sample, which we consider a measure of CMS classification confidence. We have two possible rules to classify a sample into a CMS group based on these measures:

    1. Most Likely CMS: Classify sample $i$ into the most likely CMS, $\{j : \pi_{ij} = \max_j(\pi_{ij})\}$, regardless of classification confidence $\pi_i$.

    2. Most Likely CMS with a Confidence Threshold: Classify sample $i$ into the most likely CMS as long as the classification confidence $\pi_i$ is above some threshold (e.g. 0.50 or higher), and otherwise consider it indeterminate {choose CMS $j$: $\pi_{ij} = \max_j(\pi_{ij})$ if $\pi_i$ exceeds the threshold, otherwise CMS indeterminate}. Indeterminate samples are heterogeneous tumors containing characteristics of multiple CMS, so could also be called “mixed CMS”, as done for 13% of total samples by Guinney et al. (8).
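The two rules above can be sketched in a few lines of Python. This is only an illustration, not the authors' R implementation: the function name `classify_cms`, the label tuple, and the strict `>` comparison against the threshold are assumptions made for the example.

```python
from typing import Optional, Sequence

# Label order is an assumption for this sketch: index j maps to CMS(j+1).
CMS_LABELS = ("CMS1", "CMS2", "CMS3", "CMS4")


def classify_cms(probs: Sequence[float], threshold: Optional[float] = None) -> str:
    """Apply Rule 1 (threshold=None) or Rule 2 (threshold given) to one sample.

    probs holds the four class probabilities for the sample and must sum to 1.
    """
    if len(probs) != len(CMS_LABELS):
        raise ValueError("expected one probability per CMS class")
    if abs(sum(probs) - 1.0) > 1e-6:
        raise ValueError("class probabilities must sum to 1")

    confidence = max(probs)  # classification confidence of the sample
    most_likely = CMS_LABELS[list(probs).index(confidence)]

    # Rule 2: only report the call when confidence exceeds the threshold.
    if threshold is not None and not confidence > threshold:
        return "indeterminate"
    return most_likely


if __name__ == "__main__":
    sample = [0.40, 0.30, 0.20, 0.10]
    print(classify_cms(sample))                  # Rule 1
    print(classify_cms(sample, threshold=0.50))  # Rule 2
```

With Rule 2, raising the threshold trades a larger share of indeterminate samples for higher confidence in the calls that are reported.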

    Generation of Gene Signature Classifier in CRC Samples:

    Summary of the Approach for Development of NanoString Classifier: We used a novel strategy to port the CMS classifier designed for the Affymetrix platform on FF samples over to the NanoString/FFPE setting. The strategy efficiently utilizes the vast information available to us in the CRCSC data sets and overcomes inconsistencies in mRNA quality between FF and FFPE samples (Figure 1B and Supplementary Figure 1).

    Affymetrix 133-2 Plus2.0 and NanoString CodeSets Based Gene Expression Assays: Flash frozen and FFPE tumor samples from 175 randomly selected patients (95 men and 80 women) with stage I-IV colon cancer were included in the first phase of the study to build an FFPE classifier. Seventy-two of the 175 samples had been included in the CRCSC; therefore, the “gold standard” CMS based on the Affymetrix expression array was known, and only the NanoString assay was run on RNA extracted from FF and FFPE tissue samples of these tumors. Additional FF and FFPE samples from 103 tumors were identified from our institutional biorepository. The “gold standard” CMS was not known for these samples; therefore, Affymetrix 133-2 Plus2.0 was run on the FF samples to identify the CMS as the “gold standard”. Subsequently, the NanoString assay was run on RNA extracted from FF and FFPE samples from these tumors. All samples were derived from primary colon/rectum resection specimens without preoperative tumor-targeted therapy. The clinicopathologic features of the patient population are shown in Table 1 and Supplementary Methods 5.

    Briefly, tumor areas with higher than 60% tumor cellularity from gland-forming tumors and higher than 20% tumor cellularity from signet ring or mucinous tumors were manually macro-dissected from FF or FFPE tissue sections to enrich for the demarcated tumor area. In more than two-thirds of samples, superficial and deep (invasive border) areas of the tumor were included in the macro-dissection. RNA was extracted using Qiagen’s AllPrep DNA/RNA kits (QIAGEN, Netherlands) per the manufacturer’s instructions. Each sample was quantitated using the Qubit fluorometer, with yields ranging from 0.9 µg to 26 µg. Each sample was run on the Agilent Bioanalyzer (Agilent Technologies, Santa Clara, CA) to determine the RNA Integrity Number (RIN) for the FF samples and the DV200 value for the FFPE samples. The FFPE samples included in the development of the single-sample classifier had 18% to 79% of the RNA having greater than 200 intact nucleotides. Gene expression analysis with Affymetrix 133-2 Plus2.0 was performed as described previously (Supplementary Methods 6).

    We designed a custom set of NanoString CodeSets (NanoString Technologies, Seattle, WA) with 472 signature probes, selected from the top 500 genes from the boosting procedure, and 28 reference probes. The NanoString CodeSet for each gene was chosen to be the genomic region that was most highly correlated with the fRMA gene-level expression summary from the Affymetrix 133-2 Plus2.0 probe set (Affymetrix, Santa Clara, CA). Genes with at least 0.70 correlation between the CodeSet and the gene-level summary were included in the customized CodeSets. The 28 reference CodeSets were selected from the reference CodeSets on the NanoString PanCan array plusR, selecting genes with evidence of no difference across CMS in our preliminary data (32-36). The assay was performed as per the NanoString guidelines (Supplementary Methods 7) with 10 patient samples, a positive control, and a negative control on each cartridge. Raw data from the nSolver software were transferred to the bioinformatics group, where the custom CMS classifier algorithm, in the form of an R script, was used to determine the CMS to which each sample belonged.

    Validation at the Research Molecular Diagnostic Laboratory: The NanoString CodeSets were technically validated by running two samples with “gold standard” CMS known from the CRCSC data on three different lots of CodeSets. The old and new lots of CodeSets were run together in the same run, and accuracy in identifying the CMS by these CodeSets was assessed by linear regression. We tested repeatability of the CMS assay across 4 different runs by the same technician in 12 samples, reproducibility with a different technician in 12 samples, and reproducibility with different input RNA quantities (50-500 ng) for 6 samples, using the same CodeSets on the same nCounter used for prior experiments. We also tested reproducibility of CMS between colonoscopy biopsies and surgically resected primary CRC by running the customized CodeSets on matched biopsy and resection samples, using the same CodeSets, nCounter, and laboratory personnel.

    Assessing Performance of CMS Classifier Assay in a CLIA-Certified Laboratory:

    The NanoString assay with the top 200 genes (CRC CMS-200) and the top 99 genes (CRC CMS-100) was further validated at our CLIA-certified Molecular Diagnostic Laboratory (MDL) to apply this assay as an integral biomarker for a phase II clinical trial (NCT03436563) assessing the safety and efficacy of the dual TGF-β trap/anti-PD-L1 molecule M7824 (EMD-Serono) in CMS4 subtype CRC. Thirty-five tumor samples from stage II/III primary colon cancer, previously used for validation at the research molecular diagnostic laboratory, were used to validate the assay across 10 runs for a total of 120 reactions. All 35 samples were included in the CRCSC study, so the gold standard CMS was known for these samples; the laboratory technician was blinded to the gold standard CMS. Input for the assay was 250 ng of total RNA extracted from FFPE tumor tissue with 20% or higher tumor cellularity. Accuracy, analytical sensitivity, and analytical specificity were assessed by comparing calls from the MDL CMS panel with “gold standard” CRCSC Affymetrix calls. Reproducibility was assessed across the original run and at least 3 additional repeat runs without re-extraction of RNA. Repeat runs were also performed with re-extracted RNA and by 2 technicians.
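As a rough illustration of how assay calls can be scored against gold standard calls, the sketch below computes overall 4-class accuracy together with sensitivity and specificity for one subtype versus the rest. Treating CMS4 as the target class is an assumption motivated by the CMS4-directed trial; the text does not spell out how analytical sensitivity and specificity were computed, and the function name and toy labels are hypothetical.

```python
from typing import Dict, Sequence


def score_calls(gold: Sequence[str], called: Sequence[str],
                target: str = "CMS4") -> Dict[str, float]:
    """Compare assay calls with gold standard calls.

    Returns 4-class accuracy plus one-vs-rest sensitivity and specificity
    for the target subtype (assumed here to be CMS4).
    """
    if len(gold) != len(called) or not gold:
        raise ValueError("gold and called must be non-empty and equally long")

    tp = fp = tn = fn = 0
    for g, c in zip(gold, called):
        if g == target and c == target:
            tp += 1
        elif g == target:
            fn += 1
        elif c == target:
            fp += 1
        else:
            tn += 1

    correct = sum(g == c for g, c in zip(gold, called))
    return {
        "accuracy": correct / len(gold),
        # NaN marks a metric that is undefined for this toy input.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


if __name__ == "__main__":
    gold = ["CMS4", "CMS4", "CMS2", "CMS1", "CMS3", "CMS2"]
    called = ["CMS4", "CMS2", "CMS2", "CMS1", "CMS4", "CMS2"]
    print(score_calls(gold, called))
```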

    Assessing Performance of CMS Classifier as a Prognostic Marker in Stage IV Colorectal Cancer:

    Patients with a CMS determination from the NanoString-based gene expression score were pooled from three separate sources: two clinical trials, NCT03436563 (n=91) and a phase II trial assessing trametinib and durvalumab in microsatellite stable colorectal cancer (NCT03428126, n=19), and the Assessment of Targeted Therapies Against Colorectal Cancer (ATTACC, NCT01196130) Screening Protocol (n=235) (37). The ATTACC samples and the samples from the trametinib/durvalumab study were characterized by the CRC CMS-100 assay at the research molecular diagnostic laboratory, while samples from patients enrolled in the M7824 clinical trial were characterized by CRC CMS-200 performed at the CLIA-compliant molecular diagnostic laboratory. Median overall survival was calculated from the date of stage IV diagnosis to death or the date of last follow-up, at which point patients were censored. Survival patterns were visualized with Kaplan-Meier survival curves and compared using the log-rank test. Graphs were generated using IBM SPSS Statistics 24.
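For readers unfamiliar with the survival summary used here, the following is a small self-contained sketch of the Kaplan-Meier product-limit estimate and the median survival read from it. It is not the SPSS procedure used in the study; the function names and the toy follow-up data are hypothetical, and the log-rank comparison is not implemented.

```python
from typing import List, Optional, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Product-limit estimate of the survival function.

    times  : follow-up time per patient (e.g. months from stage IV diagnosis)
    events : 1 if death was observed at that time, 0 if censored
    Returns (time, survival probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients leave the risk set
    return curve


def median_survival(curve: Sequence[Tuple[float, float]]) -> Optional[float]:
    """First time at which estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


if __name__ == "__main__":
    months = [5, 8, 12, 12, 20, 25]
    died = [1, 0, 1, 1, 0, 1]
    km = kaplan_meier(months, died)
    print(km, median_survival(km))
```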

    The study was approved by the institutional review board (IRB) with informed consent from each subject or each subject's guardian for the clinical trial samples. The work on samples from subjects not enrolled in the clinical trials was approved by the IRB with a waiver of informed consent. The study was conducted as per the ethical guidelines of the U.S. Common Rule.

    Results:

    Performance of CRCSC Classifier on CRCSC Data Sets:

    We selected the wSVM model because it had the best performance in the training data V1 using 4-fold cross-validation (Supplementary Figure 2, Supplementary Table 2, and Supplementary Material 8). The 4-group classification accuracy of the wSVM model on the validation data set V2 was 0.955 for the full model (5973 genes) and remained high for models involving fewer genes, with 4-group classification accuracies of 0.959, 0.932, and 0.898 for models with 500, 75, and 20 genes, respectively. The performance of the wSVM classifier for the out-of-sample subset (V2o), the RNAseq subset (TCGA), and the Affymetrix subset (V2a) was comparable to the overall validation performance (V2), suggesting that the classifier is robust to platform and has good out-of-sample performance, relatively evenly across CMS (Supplementary Tables 3-7, Supplementary Material 9). We chose a wSVM classifier with 472 genes to move forward with further validation. This classifier yielded an overall 96.3% classification accuracy in the Affymetrix subset V2a, with accuracies of 0.966, 0.967, 0.932, and 0.971 for CMS1, CMS2, CMS3, and CMS4, respectively. The CMS structure was highly consistent between the training and validation data sets (heat map in Supplementary Figure 3). Further comparison of our classifier with the classifiers described by Guinney et al. is shown in Supplementary Table 8. The performance of the 472-gene CRCSC classifier based on a single Affymetrix probe gene set and the classifier performance by classification confidence are described in Supplementary Material 10 and 11 and Supplementary Figure 4. Supplementary Table 9 shows all of the misclassified samples along with the corresponding wSVM class probabilities ($\pi_{ij}$) for each CMS, the classification confidence ($\pi_i$), an indication of whether the sample could be considered a “CMS mixture” (i.e. $\pi_{ij} > 0.20$ for multiple CMS), and whether the “gold standard” was a part of that mixture. From this, we see that most of the “misclassified samples” had lower classification confidence $\pi_i$, and many had evidence of being CMS mixtures, with the “gold standard” CMS being a component of the mixtures.
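The mixture criterion described above (class probability above 0.20 for more than one CMS) can be expressed as a short check. The sketch below is illustrative only; the function name, the returned dictionary, and the example probabilities are assumptions, not values from Supplementary Table 9.

```python
from typing import Dict, Sequence

# Label order is an assumption for this sketch: index j maps to CMS(j+1).
CMS_LABELS = ("CMS1", "CMS2", "CMS3", "CMS4")


def describe_mixture(probs: Sequence[float], gold: str,
                     cutoff: float = 0.20) -> Dict[str, object]:
    """Flag a sample as a CMS mixture and check the gold standard against it.

    A sample counts as a mixture when more than one class probability
    exceeds the cutoff (0.20 in the text).
    """
    if len(probs) != len(CMS_LABELS):
        raise ValueError("expected one probability per CMS class")
    components = [label for label, p in zip(CMS_LABELS, probs) if p > cutoff]
    is_mixture = len(components) > 1
    return {
        "components": components,
        "is_mixture": is_mixture,
        "gold_in_mixture": is_mixture and gold in components,
    }


if __name__ == "__main__":
    # Hypothetical misclassified sample: called CMS2, gold standard CMS4.
    print(describe_mixture([0.05, 0.45, 0.10, 0.40], gold="CMS4"))
```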

    NanoString CRCSC Classifier Optimization Based on Correlation between FF and FFPE Tumor Samples:

    The sample-specific correlations of FF and FFPE measurements were very high for most samples (Figure 2A, Supplementary Table 10A, and Supplementary Material 12); the small number of samples with low correlations tended to have poorer RNA quality for their FF samples (p=0.0077) but not for their FFPE samples (p=0.28, Figures 2B and 2C). A histogram of the gene-specific correlation of FF and FFPE measurements for each of the 472 classifier genes is shown in Figure 2D and summarized in Supplementary Table 10B. This histogram demonstrates the high level of variability across genes in the concordance of paired FF/FFPE gene expression measurements, and the consistency of the gene-specific concordances across batches (Supplementary Figure 5) suggests that this concordance is a consistent characteristic of the gene/probe set and not a random technical factor. This motivated us to select a subset of genes showing high FF/FFPE concordance for use in our FFPE classifier, choosing the top 100 genes in terms of FF/FFPE correlation for the CMS-100 classifier and the top 200 genes for CMS-200.
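The gene selection step described above can be sketched as ranking genes by the correlation of their paired FF and FFPE measurements and keeping the top k. The example below assumes Pearson correlation and a simple dictionary layout (gene to expression values in matched sample order); neither detail is specified in the text, and the gene names and values are made up.

```python
import math
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant expression carries no concordance information
    return sxy / math.sqrt(sxx * syy)


def rank_genes_by_concordance(
    ff: Dict[str, Sequence[float]],
    ffpe: Dict[str, Sequence[float]],
    top_k: int,
) -> Tuple[List[str], Dict[str, float]]:
    """Return the top_k genes by FF/FFPE correlation and all gene scores."""
    scores = {g: pearson(ff[g], ffpe[g]) for g in ff if g in ffpe}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k], scores


if __name__ == "__main__":
    ff = {"GENE_A": [1, 2, 3, 4], "GENE_B": [1, 2, 3, 4], "GENE_C": [1, 2, 3, 4]}
    ffpe = {"GENE_A": [1.1, 2.0, 2.9, 4.2], "GENE_B": [4, 3, 2, 1],
            "GENE_C": [2, 1, 4, 3]}
    print(rank_genes_by_concordance(ff, ffpe, top_k=2))
```

Calling the function with `top_k=100` or `top_k=200` on the 472 classifier genes would mirror the CMS-100 and CMS-200 selections in spirit.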

    NanoString FFPE Classifier Performance:

    Figure 3 shows the classification accuracy of CMS-100 on FFPE and FF samples, CMS-472 on FF samples, and the Affy FF-100 and Affy FF-472 based on the Affymetrix validation data V2a, with accuracy split out by confidence threshold and proportion of unclassified samples. The CRC CMS-100 model applied to FFPE samples had a 4-group accuracy of 0.80, with 0.81 for CRCSC samples and 0.78 for non-CRCSC samples. For samples classified with high confidence ($\pi_i > 0.80$ or $0.90$), the performance was better, with 4-group accuracy of 0.86 and 0.89, respectively. For FF samples, the CMS-100 had a 4-group classification accuracy of 0.80 (0.74 for the CRCSC samples and 0.88 for non-CRCSC samples), and 4-class accuracy of 0.87 and 0.92 for samples classified with high confidence at thresholds of 0.80 and 0.90, respectively (Supplementary Material 13). These results were comparable to those of CMS-472, the 472-gene classifier on FF samples, and not much worse than CMS-100 in an idealized non-clinical setting based on batch-corrected Affymetrix data from the CRCSC studies.
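The trade-off reported above, accuracy among confidently classified samples versus the proportion left unclassified, can be computed as in the following sketch. The function name, the strict `>` comparison, and the toy inputs are assumptions for illustration and do not reproduce the study data.

```python
from typing import Optional, Sequence, Tuple


def accuracy_at_threshold(
    gold: Sequence[str],
    called: Sequence[str],
    confidence: Sequence[float],
    threshold: float,
) -> Tuple[Optional[float], float]:
    """Accuracy among samples whose confidence exceeds the threshold.

    Returns (accuracy, proportion_unclassified). Accuracy is None when no
    sample passes the threshold.
    """
    n = len(gold)
    if n == 0 or n != len(called) or n != len(confidence):
        raise ValueError("inputs must be non-empty and equally long")

    kept = [(g, c) for g, c, p in zip(gold, called, confidence) if p > threshold]
    unclassified = (n - len(kept)) / n
    if not kept:
        return None, unclassified
    correct = sum(g == c for g, c in kept)
    return correct / len(kept), unclassified


if __name__ == "__main__":
    gold = ["CMS1", "CMS2", "CMS3", "CMS4", "CMS2"]
    called = ["CMS1", "CMS2", "CMS4", "CMS4", "CMS3"]
    conf = [0.95, 0.85, 0.55, 0.91, 0.60]
    for thr in (0.0, 0.80, 0.90):
        print(thr, accuracy_at_threshold(gold, called, conf, thr))
```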

    Supplementary Figures 6A and 6B plot the 4-class accuracy vs. confidence level $\pi_i$ for FFPE and FF samples, demonstrating that samples classified with high confidence were more likely to be accurately classified. Supplementary Figures 6C and 6D plot the 4-class accuracy vs. RNA quality, defined by %200nt (FFPE) or RIN (FF), demonstrating that there is little if any association of CMS accuracy with RNA quality, which suggests that the performance of the classifier is robust to RNA quality in this study. One gene out of the 100 was mistakenly left off an order of the NanoString CodeSets for some of the validation study, so the corresponding CMS-100 classifier that was validated has 99 genes. We confirmed that the performance of the 100-gene and 99-gene classifiers was concordant.


    The CRC CMS-100 assay with 99 genes was 100% reproducible in predicting a CMS across different runs (12 samples, 48 runs), between two laboratory personnel (12 samples), and with different RNA input concentrations (n=6). The reproducibility between biopsy and resection was 91%, with 15 of 17 patients having the same CMS between matched biopsy and resection specimens (Supplementary Table 11). All biopsy samples (12 from the left colon and 5 from the right colon) were procured from the same tumor as the surgically resected specimens. Tissue sections for RNA extraction were derived from FFPE blocks that were generated for clinical use. The two cases with discrepant CMS between biopsy and resection were sporadic CRC without any known predisposing condition or preoperative tumor-targeted therapy. To determine the impact of tumor location and histopathologic features on the reproducibility of CMS, another pathologist reviewed hematoxylin and eosin stained sections of the primary tumor from the surgical resections of those included in the assessment of inter-run reproducibility, inter-technician reproducibility, reproducibility across different RNA concentrations, and reproducibility between biopsy and resection (n=30). In 19 samples both superficial and deep areas of the tumor were macro-dissected; 4 samples had only the superficial area and 7 had only the deep area of the tumor macro-dissected. Five tumors were poorly differentiated, one of them with mucinous histology, and 25 tumors were moderately differentiated, including one with mucinous histology. Given the high reproducibility across runs, technicians, and RNA concentrations, we did not observe any difference in CMS call among samples with different areas of macro-dissection or histologic parameters. We also did not observe a significant difference in the probability of a CMS in the context of histologic parameters. However, the two (of 17) samples that showed discrepancy for CMS between biopsy and surgical resection had only the deep area of the tumor macro-dissected from the resection specimens. We also did not find histologic features unique to the 7 samples that were discrepant for the CMS between the research laboratory and the CLIA-certified laboratory.

    Performance of CRC CMS-200 in CLIA-certified Molecular Diagnostic Laboratory:

    On initial run, 32/35 samples were accurately assigned the CMS as compared to the gold standard

    based on “most likely CMS”, i.e. with confidence threshold of i>0.50. Three misclassified samples,

    with 0.50 and 0.57 “most likely CMS” probability on the initial run had “most likely CMS” probability in

    the borderline range (≥0.43 & 0.57 is used, then in all three samples in all runs had CMS as per the gold

    standard, and the assay had 100% analytical sensitivity and analytical specificity. Inter-run reproducibility was assessed from 3 separate extractions from 4 unique patient samples, for a total of 12 cases. These 12 cases were run across 3 separate NanoString runs by 2 technologists. There was 100% concordance for the CMS classification among all 3 runs, with an average standard deviation of ± 0.002 for the “most likely CMS” probability. The inter-technologist reproducibility was 100% for the CMS classification, with an average standard deviation of ± 0.002 for the CMS probability. Intra-run reproducibility was assessed among 4 samples run in triplicate on a single NanoString run. There was 100% concordance for the CMS classification among all 3 replicates, with an average standard deviation of ± 0.012 for the CMS probability. Comparing the CMS-100 (99 genes) assay with the 200-gene assay demonstrated 97% reproducibility, with only 1 of 35 samples showing discordant CMS. The list of CMS100 (99 genes) and CMS200 test genes and 16 housekeeping genes is shown in Supplementary Table-12. These reproducibility and repeatability findings were deemed to meet the standard of a CLIA-certified assay to determine CMS4 vs. other CMS for FFPE tumor samples


    from patients enrolled in the clinical trial targeting patients with CMS4 colorectal cancer, as described

    in the methods.
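    The two calling rules referred to in this section ("most likely CMS" versus a confidence threshold with a "mixed CMS" fallback) and the concordance figures can be illustrated with a minimal sketch. The function names, the example probabilities, and the threshold values are illustrative assumptions and are not the laboratory's software.

```python
from typing import Dict, List, Optional

CMS_LABELS = ("CMS1", "CMS2", "CMS3", "CMS4")


def call_cms(probs: Dict[str, float], alpha: Optional[float] = None) -> str:
    """Return the CMS with the highest probability.

    If alpha is given, the call is kept only when the top probability
    exceeds alpha; otherwise the sample is labelled "Mixed CMS".
    """
    best = max(CMS_LABELS, key=lambda label: probs[label])
    if alpha is not None and probs[best] <= alpha:
        return "Mixed CMS"
    return best


def concordance(calls: List[str], reference: List[str]) -> float:
    """Fraction of samples whose call matches the reference label."""
    if len(calls) != len(reference):
        raise ValueError("calls and reference must have the same length")
    matches = sum(c == r for c, r in zip(calls, reference))
    return matches / len(reference)


# Hypothetical class probabilities for one sample (they sum to 1).
sample = {"CMS1": 0.05, "CMS2": 0.30, "CMS3": 0.10, "CMS4": 0.55}
print(call_cms(sample))              # no threshold: most likely CMS
print(call_cms(sample, alpha=0.57))  # stricter threshold
# Hypothetical labels reproducing a 32-of-35 agreement.
print(round(concordance(["CMS4"] * 32 + ["CMS2"] * 3, ["CMS4"] * 35), 3))
```

    With a threshold, a borderline sample is reported as mixed rather than forced into a single subtype, which is the trade-off between call rate and accuracy discussed above.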

    KRAS-BRAF Mutational Status and Prognostic Relevance of CMS by the NanoString CMS Classifier:

    To confirm the expected biologic performance of the assay, we surveyed a set of mCRC patients enrolled in clinical trials and the ATTACC Protocol (Table-2). A higher frequency of KRAS mutation was observed in CMS3 (66%) and CMS4 (50%) samples. BRAF mutations were identified only in CMS1 (50%), CMS4 (8%), and mixed (12%) subtype samples (Figure-4A). We did not find a significant difference in any of the clinicopathologic and molecular characteristics between samples that were classified as mixed and those that were classified into a single CMS. Using the CMS-100 (99 genes) classifier, we were able to identify significant differences in overall survival by CMS, consistent with prior studies (9,14). Specifically, patients with a CMS2 tumor had the best survival, with a median of 46.1 months from stage IV diagnosis (95% CI: 36.6, 58.1), whereas patients with a CMS1 or CMS3 tumor had the poorest survival after a stage IV diagnosis, with median survival times of 23.2 (95% CI: 19.3, 59.2) and 21.4 (95% CI: 15.8, 34.6) months, respectively. Patients with a CMS4 tumor had a survival pattern between that of CMS2 and that of CMS1 or CMS3, with a median survival time of 35.3 months (95% CI: 32.2, 40.0) (Figure-4B).
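    Median survival times of this kind are commonly read off a Kaplan-Meier curve. This section does not state the software or estimator used, so the following is only a sketch under that assumption, with hypothetical follow-up data and without the confidence interval calculation.

```python
from typing import List, Optional, Tuple


def kaplan_meier(times: List[float], events: List[bool]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events[i] is True for a death and False for a censored observation.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects that share the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += int(data[i][1])
            removed += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


def median_survival(times: List[float], events: List[bool]) -> Optional[float]:
    """First time at which the survival estimate is 0.5 or lower, if any."""
    for t, s in kaplan_meier(times, events):
        if s <= 0.5:
            return t
    return None


# Hypothetical follow-up in months; False marks a censored patient.
months = [5, 10, 15, 20, 25, 30]
died = [True, False, True, True, False, True]
print(median_survival(months, died))
```

    Censored patients leave the risk set without lowering the curve, which is why the median can be undefined (returned here as None) when fewer than half of the patients at risk have died.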

    Discussion:

    Consensus molecular subtyping has great potential to reshape the landscape of CRC treatment and contribute to the development of new precision therapeutic approaches. However, to realize this potential, transforming the CMS, which is based on network analysis of multiple gene expression datasets, into a clinical test requires an assay that is reproducible across platforms and tissue types, has high classification accuracy, and is able to generate a CMS call in a single-sample setting. We achieved this objective using a three-step approach for building the classifier: (1) in silico testing of various classification strategies, considering various gene list sizes, on CRCSC data generated on the Affymetrix platform, which determined that wSVM is the optimal system; (2) a gene reduction exercise to select genes with the best concordance across gene expression profiling platforms and tissue types, to ensure optimal performance in FFPE samples; and (3) updating the wSVM using the CRCSC training set based on this reduced number of genes. We then applied this classifier to measurements from the NanoString assay on FFPE samples after transforming these values onto the scale of the Affymetrix data on FF samples that dominated the CRCSC training set.
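    The scale transformation mentioned above can be pictured as a rank-based quantile mapping. The sketch below is a generic version of that idea and is not necessarily the exact procedure used in the study; the function name and the toy values are assumptions for illustration.

```python
from typing import List


def quantile_map(values: List[float], reference: List[float]) -> List[float]:
    """Map `values` onto the empirical distribution of `reference`.

    Each value is replaced by the reference quantile at the same relative
    rank, so the ordering of `values` is preserved while the output lies on
    the scale of `reference`. Tied values receive adjacent ranks in input
    order (no tie averaging in this sketch).
    """
    if not values or not reference:
        raise ValueError("both inputs must be non-empty")
    ref_sorted = sorted(reference)
    n, m = len(values), len(ref_sorted)
    order = sorted(range(n), key=lambda i: values[i])
    out = [0.0] * n
    for rank, idx in enumerate(order):
        # Relative rank in [0, 1]; a single value maps to the middle.
        q = rank / (n - 1) if n > 1 else 0.5
        pos = q * (m - 1)
        lo = int(pos)
        hi = min(lo + 1, m - 1)
        frac = pos - lo
        # Linear interpolation between neighbouring reference quantiles.
        out[idx] = ref_sorted[lo] * (1 - frac) + ref_sorted[hi] * frac
    return out


# Hypothetical raw counts for three genes and a reference on a log-like scale.
ffpe_counts = [10.0, 200.0, 50.0]
reference_scale = [4.0, 6.0, 8.0]
print(quantile_map(ffpe_counts, reference_scale))
```

    Because only ranks are carried over, a classifier trained on the reference platform sees inputs with the distribution it was trained on, at the cost of discarding absolute differences in the source measurements.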

    Rather than just choosing a classification strategy in an ad hoc fashion, we used a systematic, rigorous strategy to rank the genes based on their classification value and to compare a large number of classification strategies for a wide range of model sizes. This allowed us to find out which strategy performed best, the wSVM, and to determine how parsimonious a classifier could be without sacrificing substantial classification accuracy. Given that the classification literature clearly demonstrates that no one classification strategy is optimal for all data sets, the consideration of multiple approaches is important when building classification signatures. Moreover, findings from the FDA's Microarray Quality Control (MAQC) project have shown that even slight differences in the statistical analysis led to discrepancies in biological interpretation (38). High accuracy in predicting CMS by nearly all statistical


    methods gives credence to the utility of our CMS assay in accurately classifying a colon cancer into one of the CMS.

    The wSVM classifier we built by applying this strategy to training data, consisting largely of batch-corrected Affymetrix gene expression measurements from FF samples, performed exceptionally well in the CRCSC validation data. Our custom design of CodeSets best recapitulating the signal in our training data, and our strategy of starting with more genes than necessary and then narrowing to a subset with evidence of high FF/FFPE correlation, further mitigated the influence of FFPE on classification performance. Our quantile normalization strategy was sufficient to obtain reasonable performance for a small (99- or 200-gene) FFPE NanoString classifier. This strategy allowed us to efficiently utilize our data resources, using the large body of data on FF samples to train the classifier and collecting a smaller set of paired FF/FFPE samples to identify genes with high FF/FFPE concordance and to map the FFPE NanoString expression values to the scale of the FF Affymetrix expression values, leading to our novel strategy for building the classifier. The consistency of gene-specific FF/FFPE concordance across batches provides strong support for this strategy.

    The high concordance we observed in CMS for CRC samples between a research molecular testing laboratory and a CLIA-certified clinical molecular diagnostic laboratory indicates robust performance of the assay. High inter-laboratory reproducibility is likely due to similarities in the pre-analytical and analytical processes between the two laboratories. Another reason for our high inter-laboratory reproducibility is the use of NanoString nCounter technology, which utilizes non-amplified nucleic acids without any reverse transcription step and is applicable to multiple samples. Ragulan et al. (39) demonstrated high classification accuracy and reproducibility of NanoString-based subtype classification between FF and FFPE tissue samples of colorectal cancer. In spite of significant differences in the CRC classes and validation approach, that study and ours indicate that NanoString is a reliable platform to develop and validate gene expression-based signatures using FFPE samples of CRC.

    Guinney et al. (8) found that approximately 87% of CRC tumors classified cleanly into a single CMS, but approximately 13% were “mixed CMS”, not outliers or a fifth subtype but heterogeneous samples containing characteristics of multiple CMS. We also found similar proportions of “mixed CMS” samples in our analyses. Clinically, patients with mixed CMS tumors could be treated in multiple ways. One option would be to include any “mixed CMS” sample with a high enough probability of CMSx as a potential candidate for any targeted therapy that has been validated as a precision therapeutic for CMSx (x=1, 2, 3, or 4), which of course would require prospective validation before clinical application.

    There is increasing evidence of the prognostic and predictive utility of CMS. Lenz et al. (9), using a NanoString-based assay in a large cohort of patients with metastatic or advanced colorectal cancer enrolled in the CALGB/SWOG 80405 phase III clinical trial, demonstrated a significant difference in overall survival by CMS, with a median survival of 40 months in CMS2 vs. 15 months in CMS1. The NanoString assay used for CALGB/SWOG 80405 and the one used in our study differed significantly. Lenz et al. developed a customized NanoString-based gene panel derived from some of the large data sets with published gold standard CMS labels, including The Cancer Genome Atlas and other studies (5,13). Only genes that are common to these three data sets and those assessed in the CALGB/SWOG 80405 panel were used, whereas the genes included in our NanoString-based assay were all derived from the CRCSC database. Similar prognostic trends were observed by other groups, including in patients enrolled in the FIRE3 study comparing cetuximab vs. bevacizumab with FOLFIRI in metastatic colorectal cancer and by Mooi et al. (14). As research-only classifiers, these methodologies are not designed for application to individual patients or suitable for use


    in prospective patient assignment. In contrast, our classifier as deployed in a clinical lab is suitable for classifying individual patients with the rigor needed for guiding clinical management.

    Our CLIA-validated assay has potential as an integral, integrated, or exploratory biomarker. Hypotheses being explored include focused immunotherapy in CMS1, which represents a subgroup with evidence of higher immune infiltrates and activated T-cells. CMS2 represents a group with the best overall survival from EGFR inhibition in a retrospective assessment of the CALGB/SWOG 80405 trial, while CMS1 benefited from VEGF inhibition (PMID 31042420). CMS4 has an active stromal signature, and an immune-modulating strategy has been proposed. For example, in the clinical trial (NCT03436563) assessing the safety and efficacy of the dual TGF-β trap/anti-PD-L1 molecule M7824 (EMD-Serono), the CMS assay was used as an integral biomarker to select patients with CMS4. Expanded efforts in this trial or other ongoing trials could identify other CMS in which the efficacy of M7824 or another drug can be assessed based on its mechanism of action and CMS biology. As an integrated assay, all patients can be prospectively tested to identify CMS. The interim analysis then looks at all comers and, if negative, looks at CMS-specific subgroups, with a plan to continue the second half of the randomized study. Finally, as an exploratory biomarker, a retrospective analysis can be done with the high-quality CLIA assay to look for a CMS signal and also to minimize the risk of inconsistent assays when designing the follow-up study. To support the goal of dissemination of a robust CMS classifier for retrospective or prospective utilization, the NanoString CodeSets and supporting bioinformatics information can be found at http://qcsrlshinypro.mdanderson.edu/CMSclia/.

    Unavailability of matched samples prevented us from assessing CMS accuracy between primary and metastatic tissues in our study. Fontana et al. (40), using publicly available data from the Khambata-Ford dataset (41), demonstrated no significant difference in CMS distribution between localized and metastatic disease. Assessing the impact of sample site on CMS classification is necessary to determine host organ influence and metastasis-associated evolution of gene expression in CRC.

    In summary, we have developed and validated a CRC-CMS assay using FFPE samples and demonstrated its prognostic utility. This CLIA-validated assay provides a foundation to expand its utility to assess prognosis in a standard-of-care setting and to explore the assay as a predictor of response to therapy in clinical trials.

    Acknowledgments: This work is funded by the National Cancer Institute through Assay Validation For High Quality Markers For NCI-Supported Clinical Trials (UH2CA207101) and the MD Anderson Cancer Center SPORE in Gastrointestinal Cancer (P50 CA221707). Part of this research was performed in MD Anderson’s core facilities, which are supported in part by the National Institutes of Health through Cancer Center Support Grant CA016672. Part of the validation work in the clinical lab was funded as part of a clinical trial (NCT03436563) funded by EMD-Serono. We thank Kim-Anh Vu in MD Anderson’s Department of Anatomic Pathology for helping with the figures.


    References:

    1. Cancer Genome Atlas Network. Comprehensive molecular characterization of human colon and rectal cancer. Nature 2012;487(7407):330-7 doi 10.1038/nature11252.

    2. Perez-Villamil B, Romera-Lopez A, Hernandez-Prieto S, Lopez-Campos G, Calles A, Lopez-Asenjo JA, et al. Colon cancer molecular subtypes identified by expression profiling and associated to stroma, mucinous type and different clinical behavior. BMC Cancer 2012;12:260 doi 10.1186/1471-2407-12-260.

    3. Schlicker A, Beran G, Chresta CM, McWalter G, Pritchard A, Weston S, et al. Subtypes of primary colorectal tumors correlate with response to targeted treatment in colorectal cell lines. BMC Med Genomics 2012;5:66 doi 10.1186/1755-8794-5-66.

    4. Sadanandam A, Lyssiotis CA, Homicsko K, Collisson EA, Gibb WJ, Wullschleger S, et al. A colorectal cancer classification system that associates cellular phenotype and responses to therapy. Nat Med 2013;19(5):619-25 doi 10.1038/nm.3175.

    5. Marisa L, de Reynies A, Duval A, Selves J, Gaub MP, Vescovo L, et al. Gene expression classification of colon cancer into molecular subtypes: characterization, validation, and prognostic value. PLoS Med 2013;10(5):e1001453 doi 10.1371/journal.pmed.1001453.

    6. De Sousa E Melo F, Wang X, Jansen M, Fessler E, Trinh A, de Rooij LP, et al. Poor-prognosis colon cancer is defined by a molecularly distinct subtype and develops from serrated precursor lesions. Nat Med 2013;19(5):614-8 doi 10.1038/nm.3174.

    7. Budinska E, Popovici V, Tejpar S, D'Ario G, Lapique N, Sikora KO, et al. Gene expression patterns unveil a new level of molecular heterogeneity in colorectal cancer. J Pathol 2013;231(1):63-76 doi 10.1002/path.4212.

    8. Guinney J, Dienstmann R, Wang X, de Reynies A, Schlicker A, Soneson C, et al. The consensus molecular subtypes of colorectal cancer. Nat Med 2015;21(11):1350-6 doi 10.1038/nm.3967.

    9. Lenz HJ, Ou FS, Venook AP, Hochster HS, Niedzwiecki D, Goldberg RM, et al. Impact of Consensus Molecular Subtype on Survival in Patients With Metastatic Colorectal Cancer: Results From CALGB/SWOG 80405 (Alliance). Journal of clinical oncology : official journal of the American Society of Clinical Oncology 2019;37(22):1876-85 doi 10.1200/JCO.18.02258.

    10. Kwon Y, Park M, Jang M, Yun S, Kim WK, Kim S, et al. Prognosis of stage III colorectal carcinomas with FOLFOX adjuvant chemotherapy can be predicted by molecular subtype. Oncotarget 2017;8(24):39367-81 doi 10.18632/oncotarget.17023.

    11. Song N, Pogue-Geile KL, Gavin PG, Yothers G, Kim SR, Johnson NL, et al. Clinical Outcome From Oxaliplatin Treatment in Stage II/III Colon Cancer According to Intrinsic Subtypes: Secondary Analysis of NSABP C-07/NRG Oncology Randomized Clinical Trial. JAMA oncology 2016;2(9):1162-9 doi 10.1001/jamaoncol.2016.2314.

    12. Stintzing S, Wirapati P, Lenz HJ, et al. Consensus molecular subgroups (CMS) of colorectal cancer (CRC) and first-line efficacy of FOLFIRI plus cetuximab or bevacizumab in the FIRE3 (AIO KRK-0306) trial. J Clin Oncol 2017;35(Suppl 15):3510.

    13. Van Cutsem E, Cervantes A, Adam R, Sobrero A, Van Krieken JH, Aderka D, et al. ESMO consensus guidelines for the management of patients with metastatic colorectal cancer. Annals of oncology : official journal of the European Society for Medical Oncology 2016;27(8):1386-422 doi 10.1093/annonc/mdw235.

    14. Mooi JK, Wirapati P, Asher R, Lee CK, Savas P, Price TJ, et al. The prognostic impact of consensus molecular subtypes (CMS) and its predictive effects for bevacizumab benefit in metastatic colorectal cancer: molecular analysis of the AGITG MAX clinical trial. Annals of oncology : official journal of the European Society for Medical Oncology 2018;29(11):2240-6 doi 10.1093/annonc/mdy410.

    15. Becht E, de Reynies A, Giraldo NA, Pilati C, Buttard B, Lacroix L, et al. Immune and Stromal Classification of Colorectal Cancer Is Associated with Molecular Subtypes and Relevant for Precision Immunotherapy. Clinical cancer research : an official journal of the American Association for Cancer Research 2016;22(16):4057-66 doi 10.1158/1078-0432.CCR-15-2879.

    16. Lal N, White BS, Goussous G, Pickles O, Mason MJ, Beggs AD, et al. KRAS Mutation and Consensus Molecular Subtypes 2 and 3 Are Independently Associated with Reduced Immune Infiltration and Reactivity in Colorectal Cancer. Clinical cancer research : an official journal of the American Association for Cancer Research 2018;24(1):224-33 doi 10.1158/1078-0432.CCR-17-1090.


    17. Okita A, Takahashi S, Ouchi K, Inoue M, Watanabe M, Endo M, et al. Consensus molecular subtypes classification of colorectal cancer as a predictive factor for chemotherapeutic efficacy against metastatic colorectal cancer. Oncotarget 2018;9(27):18698-711 doi 10.18632/oncotarget.24617.

    18. Huang D, Quan Y, He M, Zhou B. Comparison of linear discriminant analysis methods for the classification of cancer based on gene expression data. J Exp Clin Cancer Res 2009;28:149 doi 10.1186/1756-9966-28-149.

    19. Guo Y, Hastie T, Tibshirani R. Regularized linear discriminant analysis and its application in microarrays. Biostatistics 2007;8(1):86-100 doi 10.1093/biostatistics/kxj035.

    20. Arevalillo JM, Navarro H. A new method for identifying bivariate differential expression in high dimensional microarray data using quadratic discriminant analysis. BMC Bioinformatics 2011;12 Suppl 12:S6 doi 10.1186/1471-2105-12-S12-S6.

    21. Hastie T, Tibshirani R. Efficient quadratic regularization for expression arrays. Biostatistics 2004;5(3):329-40 doi 10.1093/biostatistics/5.3.329.

    22. Ayyad SM, Saleh AI, Labib LM. Gene expression cancer classification using modified K-Nearest Neighbors technique. Biosystems 2019;176:41-51 doi 10.1016/j.biosystems.2018.12.009.

    23. Kumar MA, Ewoldt RH, Zukoski CF. Intrinsic nonlinearities in the mechanics of hard sphere suspensions. Soft Matter 2016;12(36):7655-62 doi 10.1039/c6sm01310d.

    24. Huynh-Thu VA, Geurts P. Unsupervised Gene Network Inference with Decision Trees and Random Forests. Methods Mol Biol 2019;1883:195-215 doi 10.1007/978-1-4939-8882-2_8.

    25. Stiglic G, Rodriguez JJ, Kokol P. Rotation of random forests for genomic and proteomic classification problems. Adv Exp Med Biol 2011;696:211-21 doi 10.1007/978-1-4419-7046-6_21.

    26. Rodriguez JJ, Kuncheva LI, Alonso CJ. Rotation forest: A new classifier ensemble method. IEEE Trans Pattern Anal Mach Intell 2006;28(10):1619-30 doi 10.1109/TPAMI.2006.211.

    27. Chan WH, Mohamad MS, Deris S, Zaki N, Kasim S, Omatu S, et al. Identification of informative genes and pathways using an improved penalized support vector machine with a weighting scheme. Comput Biol Med 2016;77:102-15 doi 10.1016/j.compbiomed.2016.08.004.

    28. Abdi MJ, Hosseini SM, Rezghi M. A novel weighted support vector machine based on particle swarm optimization for gene selection and tumor classification. Comput Math Methods Med 2012;2012:320698 doi 10.1155/2012/320698.

    29. Huang H, Lu X, Liu Y, Haaland P, Marron JS. R/DWD: distance-weighted discrimination for classification, visualization and batch adjustment. Bioinformatics 2012;28(8):1182-3 doi 10.1093/bioinformatics/bts096.

    30. Franks JM, Cai G, Whitfield ML. Feature specific quantile normalization enables cross-platform classification of molecular subtypes using gene expression data. Bioinformatics 2018;34(11):1868-74 doi 10.1093/bioinformatics/bty026.

    31. Zhu J, Zou H, Rosset S, Hastie T. Multi-class AdaBoost. Stat Interface 2009;2:349-60.

    32. Chen DT, Davis-Yadley AH, Huang PY, Husain K, Centeno BA, Permuth-Wey J, et al. Prognostic Fifteen-Gene Signature for Early Stage Pancreatic Ductal Adenocarcinoma. PLoS One 2015;10(8):e0133562 doi 10.1371/journal.pone.0133562.

    33. Ligibel JA, Cirrincione CT, Liu M, Citron M, Ingle JN, Gradishar W, et al. Body Mass Index, PAM50 Subtype, and Outcomes in Node-Positive Breast Cancer: CALGB 9741 (Alliance). J Natl Cancer Inst 2015;107(9) doi 10.1093/jnci/djv179.

    34. Prat A, Galvan P, Jimenez B, Buckingham W, Jeiranian HA, Schaper C, et al. Prediction of Response to Neoadjuvant Chemotherapy Using Core Needle Biopsy Samples with the Prosigna Assay. Clinical cancer research : an official journal of the American Association for Cancer Research 2016;22(3):560-6 doi 10.1158/1078-0432.CCR-15-0630.

    35. Veldman-Jones MH, Lai Z, Wappett M, Harbron CG, Barrett JC, Harrington EA, et al. Reproducible, Quantitative, and Flexible Molecular Subtyping of Clinical DLBCL Samples Using the NanoString nCounter System. Clinical cancer research : an official journal of the American Association for Cancer Research 2015;21(10):2367-78 doi 10.1158/1078-0432.CCR-14-0357.

    36. Wallden B, Storhoff J, Nielsen T, Dowidar N, Schaper C, Ferree S, et al. Development and verification of the PAM50-based Prosigna breast cancer gene signature assay. BMC Med Genomics 2015;8:54 doi 10.1186/s12920-015-0129-6.


    37. Overman MJ, Morris V, Kee B, Fogelman D, Xiao L, Eng C, et al. Utility of a molecular prescreening program in advanced colorectal cancer for enrollment on biomarker-selected clinical trials. Annals of oncology : official journal of the European Society for Medical Oncology 2016;27(6):1068-74 doi 10.1093/annonc/mdw073.

    38. Goodsaid FM, Amur S, Aubrecht J, Burczynski ME, Carl K, Catalano J, et al. Voluntary exploratory data submissions to the US FDA and the EMA: experience and impact. Nature reviews Drug discovery 2010;9(6):435-45 doi 10.1038/nrd3116.

    39. Ragulan C, Eason K, Fontana E, Nyamundanda G, Tarazona N, Patil Y, et al. Analytical Validation of Multiplex Biomarker Assay to Stratify Colorectal Cancer into Molecular Subtypes. Scientific reports 2019;9(1):7665 doi 10.1038/s41598-019-43492-0.

    40. Fontana E, Eason K, Cervantes A, Salazar R, Sadanandam A. Context matters-consensus molecular subtypes of colorectal cancer as biomarkers for clinical trials. Annals of oncology : official journal of the European Society for Medical Oncology 2019;30(4):520-7 doi 10.1093/annonc/mdz052.

    41. Khambata-Ford S, Garrett CR, Meropol NJ, Basik M, Harbison CT, Wu S, et al. Expression of epiregulin and amphiregulin and K-ras mutation status predict disease control in metastatic colorectal cancer patients treated with cetuximab. Journal of clinical oncology : official journal of the American Society of Clinical Oncology 2007;25(22):3230-7 doi 10.1200/JCO.2006.10.5437.


    Table-1: Clinicopathologic features of patients and samples included in the assay development and validation in the research molecular diagnostic laboratory

    Characteristics                                               n
    Age
      50 years                                                    136
    Gender
      Male                                                        95
      Female                                                      80
    Tumor location
      Right colon                                                 75
      Left and sigmoid colon                                      92
      Rectum                                                      5
      Multiple primary tumors                                     3
    pT stage
      pT1                                                         0
      pT2                                                         16
      pT3                                                         133
      pT4                                                         18
      pT4a                                                        3
      pT4b                                                        1
    pN stage
      pN0                                                         76
      pN1                                                         55
      pN1a                                                        2
      pN1b                                                        5
      pN1c                                                        0
      pN2                                                         27
      pN2a                                                        4
      pN2b                                                        2
      pN3                                                         0
      pNX                                                         4
    pM stage
      pM0                                                         165
      pM1                                                         5
      pMX                                                         5
    Grade
      Low (well or moderately differentiated)                     145
      High (poorly differentiated)                                30
    Time between date of surgery and gene expression analysis
      <5 years                                                    20
      5-10 years                                                  95
      >10 years                                                   60
    Samples
      Matched flash frozen and formalin fixed paraffin embedded   149
      Only formalin fixed paraffin embedded                       12
      Only flash frozen                                           4


    Table-2: Patient characteristics of the cohorts utilized to correlate CMS and overall survival

    Characteristic                           N = 345, n (%)
    Mean Age at Initial Diagnosis (SD)       50.9 (11.5)
    Mean Age at Stage IV Diagnosis (SD)      51.5 (11.5)
    Sex
      Male                                   187 (54.2)
      Female                                 158 (45.8)
    Race/Ethnicity
      Non-Hispanic (NH) White                258 (74.8)
      NH African American                    32 (9.3)
      Hispanic                               27 (7.8)
      NH Asian                               22 (6.4)
      Other/Unknown                          6 (1.7)
    Stage at Initial Diagnosis
      I                                      5 (1.4)
      II                                     24 (7.0)
      III                                    110 (31.9)
      IV                                     204 (59.1)
      NA                                     2 (0.6)
    KRAS mutation status
      wild type                              107 (31.0)
      canonical mutation                     147 (42.6)
      NA                                     91 (26.4)
    NRAS mutation status
      wild type                              236 (68.4)
      canonical mutation                     17 (4.9)
      non-canonical mutation                 1 (0.3)
      NA                                     91 (26.4)
    BRAF mutation status
      wild type                              229 (66.4)
      V600                                   20 (5.8)
      other mutation                         5 (1.4)
      NA                                     91 (26.4)
    MSI status
      MSS                                    177 (51.3)
      NA                                     168 (48.7)
    Consensus Molecular Subtype
      1, Immune                              12 (3.5)
      2, Canonical                           117 (33.9)
      3, Metabolic                           21 (6.1)
      4, Mesenchymal                         161 (46.7)
      Mixed                                  34 (9.9)


    Figure Legends:

    Figure-1: Flowchart showing the approach to development and validation of the CMS classifier on colorectal cancer subtyping consortium (CRCSC) data (1A) and development of the NanoString classifier on colorectal cancer samples (1B).

    Figure-2: Sample-wise and gene-wise correlation of paired FF/FFPE samples. Figure-2A: Histogram of sample-wise Spearman correlation of paired FF/FFPE values across all 472 CMS genes on the NanoString assay, with the threshold of 0.75 marked with a red vertical line. Figure-2B and 2C: Association of sample-wise FF/FFPE correlation with RNA quality: scatterplot of sample-wise Spearman correlation of FF/FFPE vs. RNA quality of FF samples (2B, based on RIN) and FFPE samples (2C, based on % with 200nt). Figure-2D: Histogram of gene-wise Spearman correlation of paired FF/FFPE values based on samples with sample-wise correlation > 0.75, with thresholds to determine the top 100 and top 200 genes indicated by red and blue vertical lines, respectively.

    Figure-3: Bar chart and table showing 4-class accuracy of CMS classifiers, along with the number (proportion) of samples classified to each CMS. We assess accuracy for the classifier with the top 100 genes in terms of FF/FFPE correlation for FFPE and FF, computed based on NanoString measurements for FFPE and FF in the current study (Nano FFPE-100, Nano FF-100) and based on Affymetrix measurements for FF in the Affy CRCSC validation data set (V2a, Affy FF-100), and for the full 472-gene classifier applied to FF samples run on the NanoString platform for the current study (Nano FF-472) and FF samples run on Affymetrix in the Affymetrix CRCSC validation data set (Affy FF-472). Performance is summarized overall and for subsets of samples with high classification confidence (α>0.50, 0.80 or 0.90).

    Figure-4: Distribution of KRAS and BRAF mutations across CMS (4A) and correlation of CMS with overall survival (4B) in stage IV colorectal cancer.


    Figure 1A (flowchart text)

    CRCSC databases (n=18); Agilent datasets excluded; CRCSC databases (n=14)

    Discovery. V1: Discovery set (n=1329 samples from 12 studies)
    • Apply four-fold cross validation model building for each classification model
    • Apply quantile normalization
    • Obtain best possible accuracy by tuning parameters with each model
    • Apply multiclass Adaboost to rank genes for each model

    Test accuracy of classification model: predictive accuracy of each model as a function of number of genes
    • Gene ranking strategy based on its classification from 5973 to 5 genes
    • Compare the classification model with same gene list

    Validation. V2: Validation set (n=1329 samples from 14 studies)
    • V2a: Validation subset with data on Affymetrix U133Plus2.0 (n=929 samples)
    • V2ap: Validation subset with Affymetrix U133Plus2.0 arrays fRMA probe-level data (n=929 samples)
    • Validation subset with TCGA RNAseq data (n=189 samples)
    • V2o: Out of sample validation subset containing GSE2109 and GSE17536 (n=383 samples)

    Test accuracy of each model and gene ranking strategy on validation cohorts

    Select the best performing model with least number of genes: weighted support vector machine (wSVM) model

    Further optimization & tuning of the parameters


  • Sample-wise correlation (ρ) between NanoString flash frozen (FF) value and NanoString formalin-fixed paraffin-embedded (FFPE) value across 472 genes (n=175 samples)

    Samples with ρ≥0.75 (n=142 samples)

    Gene-wise correlation (r) between FF and FFPE values to identify sets of genes with top 100 (CMS100) and top 200 (CMS200) correlations

    Train new model on reduced gene sets (CMS100/CMS200) on CRCSC V1 and obtain new wSVM classifiers Ψ100 and Ψ200

    Determine accuracy using Affymetrix data on CRCSC V2

    Pick gene set (100 or 200 genes)

    NanoString expression for each FFPE sample for the chosen gene set (n=158)

    Quantile normalized expression on NanoString FFPE sample for the chosen gene set
    • Quantile normalized to the scale of Affymetrix value of matched FF sample for the chosen gene set

    Apply Ψ to NanoString data normalized to the scale of Affymetrix data

    Compute probability of each CMS, pj
    • Most likely CMS: Choose CMS j with highest pj
    OR
    • Apply confidence threshold: Choose CMS j with highest pj if greater than confidence threshold α, otherwise classify as “Mixed CMS”

    CMS for each sample

    Figure 1B
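
The final calling rule in the flowchart (most likely CMS, or a confidence threshold α with a "Mixed CMS" fallback) can be sketched as below. The function name `call_cms`, the dictionary input, and the example probabilities are hypothetical; only the rule itself comes from the figure.

```python
def call_cms(probabilities, alpha=None):
    """Return the CMS call for one sample.

    probabilities: dict mapping subtype name to its probability p_j.
    alpha: optional confidence threshold; None means "most likely CMS".
    """
    # Subtype with the highest probability p_j.
    best = max(probabilities, key=probabilities.get)
    # With a threshold, keep the call only if p_j is greater than alpha.
    if alpha is not None and not probabilities[best] > alpha:
        return "Mixed CMS"
    return best


# Hypothetical probabilities for a single sample.
example = {"CMS1": 0.10, "CMS2": 0.62, "CMS3": 0.08, "CMS4": 0.20}
calls = {
    "no threshold": call_cms(example),
    "alpha 0.50": call_cms(example, alpha=0.50),
    "alpha 0.80": call_cms(example, alpha=0.80),
}
```

For this sample the call is CMS2 with no threshold or with α=0.50, and "Mixed CMS" with α=0.80, because the top probability of 0.62 does not exceed 0.80.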

  • Figure 2 (panels A-D). Axis labels: Frequency; Sample-wise FF/FFPE correlation; Gene-wise FF/FFPE correlation (top 100 genes); FF RIN; FFPE %200nt. Reported p-values: 0.0077 and 0.280.
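
The two correlation summaries named in these panels (sample-wise and gene-wise FF/FFPE correlation) feed a filtering step that could be sketched as follows. This is an illustration under stated assumptions: Pearson correlation is used for both quantities because the type of correlation is not specified here, the gene-wise correlation is computed only over retained samples, and the names `pearson` and `select_concordant` and the toy values are hypothetical. The 0.75 cutoff and the top-100 default follow the values shown in Figure 1B.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 when it is undefined."""
    n = len(x)
    if n < 2:
        return 0.0
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def select_concordant(ff, ffpe, genes, min_sample_corr=0.75, top_k=100):
    """ff and ffpe are matched lists of per-sample expression vectors.

    Returns the indices of retained samples and the top_k gene names
    ranked by gene-wise FF/FFPE correlation across retained samples.
    """
    # Step 1: keep sample pairs whose correlation across genes is high enough.
    kept = [i for i in range(len(ff))
            if pearson(ff[i], ffpe[i]) >= min_sample_corr]

    # Step 2: correlate each gene's FF and FFPE values across kept samples.
    gene_corr = {
        name: pearson([ff[i][g] for i in kept], [ffpe[i][g] for i in kept])
        for g, name in enumerate(genes)
    }
    ranked = sorted(genes, key=lambda name: gene_corr[name], reverse=True)
    return kept, ranked[:top_k]


# Hypothetical toy data: four matched samples, three genes.
toy_ff = [[1.0, 2.0, 3.0], [2.0, 3.0, 5.0], [3.0, 5.0, 4.0], [4.0, 1.0, 2.0]]
toy_ffpe = [[1.1, 2.1, 2.9], [2.2, 2.9, 5.1], [2.9, 5.2, 4.5], [1.0, 4.0, 2.0]]
kept_samples, top_genes = select_concordant(
    toy_ff, toy_ffpe, ["g1", "g2", "g3"], top_k=2)
```

In the toy data the fourth sample pair is discordant and is dropped before the gene-wise ranking is computed.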

  • Figure 3 (panels A and B). 4-class accuracy and distribution of predicted CMS, shown as count (proportion), by confidence threshold α. Table labels (order as extracted, not matched to the tables below): Nano FF-100, Nano FFPE-100, Affy FF-472*, Nano FF-472, Affy FF-100*.

    Threshold      4-class accuracy   CMS1         CMS2         CMS3         CMS4         Unclassified
    No threshold   0.80               38 (0.24)    88 (0.56)    9 (0.06)     23 (0.15)    0
    α>0.50         0.81               36 (0.23)    85 (0.54)    8 (0.05)     22 (0.14)    5 (0.03)
    α>0.80         0.86               29 (0.19)    65 (0.42)    8 (0.05)     11 (0.07)    43 (0.28)
    α>0.90         0.90               25 (0.16)    51 (0.33)    7 (0.04)     7 (0.04)     66 (0.42)

    Threshold      4-class accuracy   CMS1         CMS2         CMS3         CMS4         Unclassified
    No threshold   0.80               34 (0.22)    72 (0.47)    31 (0.20)    16 (0.10)    0
    α>0.50         0.82               34 (0.22)    71 (0.46)    28 (0.18)    19 (0.12)    4 (0.03)
    α>0.80         0.87               29 (0.19)    64 (0.42)    19 (0.13)    9 (0.06)     30 (0.20)
    α>0.90         0.92               20 (0.13)    51 (0.34)    16 (0.11)    5 (0.03)     59 (0.39)

    Threshold      4-class accuracy   CMS1         CMS2         CMS3         CMS4         Unclassified
    No threshold   0.81               40 (0.26)    64 (0.42)    20 (0.13)    29 (0.19)    0
    α>0.50         0.82               38 (0.25)    60 (0.40)    19 (0.13)    27 (0.18)    7 (0.05)
    α>0.80         0.85               34 (0.22)    51 (0.33)    16 (0.10)    23 (0.15)    29 (0.19)
    α>0.90         0.88               30 (0.20)    42 (0.28)    12 (0.08)    20 (0.13)    48 (0.32)

    Threshold      4-class accuracy   CMS1         CMS2         CMS3         CMS4         Unclassified
    No threshold   0.89               222 (0.17)   597 (0.45)   176 (0.13)   334 (0.25)   0
    α>0.50         0.90               218 (0.16)   590 (0.44)   172 (0.13)   329 (0.25)   20 (0.02)
    α>0.80         0.96               162 (0.12)   479 (0.36)   120 (0.09)   247 (0.19)   320 (0.24)
    α>0.90         0.98               131 (0.10)   418 (0.31)   103 (0.08)   199 (0.15)   478 (0.36)

    Threshold      4-class accuracy   CMS1         CMS2         CMS3         CMS4         Unclassified
    No threshold   0.95               232 (0.18)   572 (0.43)   173 (0.13)   352 (0.27)   0
    α>0.50         0.95               231 (0.17)   569 (0.43)   170 (0.13)   348 (0.26)   11 (0.01)
    α>0.80         0.98               190 (0.14)   530 (0.40)   146 (0.11)   313 (0.24)   150 (0.11)
    α>0.90         0.99               169 (0.13)   498 (0.38)   123 (0.09)   288 (0.22)   251 (0.19)

    Bar chart: 4-Group Accuracy Assessment of the CMS Classifier. Y-axis: Four Group-Accuracy (0.70 to 1.00). Groups: Nano FFPE-100, Nano FF-100, Nano FF-472, Affy FF-100, Affy FF-472. Series: No threshold, α>0.50, α>0.80, α>0.90.
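
The trade-off visible in these tables, where accuracy rises with α while more samples go unclassified, can be illustrated with a short sketch. It assumes that accuracy is computed only over samples whose top probability exceeds α, which the extracted figure text does not state explicitly. The function name `accuracy_at_threshold` and the toy samples are hypothetical and do not reproduce any value from the tables.

```python
def accuracy_at_threshold(samples, alpha=None):
    """samples: list of (true_label, {subtype: probability}) pairs.

    Returns (accuracy among classified samples, fraction unclassified).
    Assumption: unclassified samples are excluded from the accuracy.
    """
    correct = 0
    classified = 0
    for truth, probs in samples:
        best = max(probs, key=probs.get)
        if alpha is not None and probs[best] <= alpha:
            continue  # below the confidence threshold: left unclassified
        classified += 1
        if best == truth:
            correct += 1
    unclassified_fraction = 1 - classified / len(samples)
    accuracy = correct / classified if classified else float("nan")
    return accuracy, unclassified_fraction


# Hypothetical toy samples: (true subtype, predicted probabilities).
toy_samples = [
    ("CMS1", {"CMS1": 0.95, "CMS2": 0.05}),
    ("CMS2", {"CMS1": 0.30, "CMS2": 0.70}),
    ("CMS4", {"CMS2": 0.55, "CMS4": 0.45}),
    ("CMS3", {"CMS3": 0.85, "CMS4": 0.15}),
]
summary = {a: accuracy_at_threshold(toy_samples, a) for a in (None, 0.50, 0.80)}
```

On the toy samples, raising α from no threshold to 0.80 removes the one misclassified sample along with one correct but low-confidence call, so accuracy goes up while half of the samples are left unclassified.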

  • Jeffrey S Morris, Rajyalakshmi Luthra, Yusha Liu, et al. Development and Validation of a Gene Signature Classifier for Consensus Molecular Subtyping of Colorectal Carcinoma in a CLIA-Certified Setting. Clin Cancer Res. Published OnlineFirst October 27, 2020.

    Updated version: Access the most recent version of this article at doi:10.1158/1078-0432.CCR-20-2403

    Supplementary Material: Access the most recent supplemental material at http://clincancerres.aacrjournals.org/content/suppl/2020/10/27/1078-0432.CCR-20-2403.DC1
