Tumor Biology and Immunology
Transcriptomic Differences between Primary Colorectal Adenocarcinomas and Distant Metastases Reveal Metastatic Colorectal Cancer Subtypes
Yasmin Kamal1, Stephanie L. Schmit2, Hannah J. Hoehn2, Christopher I. Amos1,3, and H. Robert Frost1
Abstract
Approximately 20% of patients with colorectal adenocarcinoma present with metastases at the time of diagnosis, and therapies that specifically target these metastases are lacking. We present a novel approach for investigating transcriptomic differences between primary colorectal adenocarcinoma and distant metastases, which may help to identify primary tumors with high risk for future dissemination and to inform the development of metastasis-targeted therapies. To effectively compare the transcriptomes of primary colorectal adenocarcinoma and metastatic lesions at both the gene and pathway levels, we eliminated tissue specificity of the "host" organs where tumors are located and adjusted for confounders such as exposure to chemotherapy and radiation, and identified that metastases were characterized by reduced epithelial–mesenchymal transition (EMT) but increased MYC target and DNA-repair pathway activities. FBN2 and MMP3 were the most differentially expressed genes between primary tumors and metastases. The two subtypes of colorectal adenocarcinoma metastases that were identified, EMT inflammatory and proliferative, were distinct from the consensus molecular subtype (CMS) 3, suggesting subtype exclusivity. In summary, this study highlights transcriptomic differences between primary tumors and colorectal adenocarcinoma metastases and delineates pathways that are activated in metastases that could be targeted in colorectal adenocarcinoma patients with metastatic disease.
Significance: These findings identify a colorectal adenocarcinoma metastasis-specific gene-expression signature that is free from potentially confounding background signals coming from treatment exposure and the normal host tissue that the metastasis is now situated within.
Introduction
Roughly 20% of individuals with colorectal adenocarcinoma present with metastatic disease at the time of diagnosis, and colorectal adenocarcinoma is a leading cause of cancer mortality (1, 2). In colorectal adenocarcinoma, the liver (70%) is the most common site of metastasis, followed by the lung (32%–47%; ref. 3). Although colorectal adenocarcinoma metastases are aggressively treated with some combination of chemotherapy, curative-intent surgical resection (4), biologics such as epidermal growth factor receptor (EGFR) inhibitors (5), and immunotherapy (for a subgroup of patients with mismatch-repair deficiency; ref. 6), metastasis-targeted therapies are severely lacking. Therefore, understanding the defining features of metastatic tumor cells in distal organs is valuable for the development of targeted drugs and individualized therapies for patients with metastatic disease.
One approach for characterizing the biology of metastatic lesions has been to compare primary tumors and metastatic lesions of the same cancer type (7). However, this is limited by the need for biopsies of normal host organ tissue where metastases are located such that the transcriptomic profiles of metastases can be normalized (7, 8). One survey of primary versus metastatic sites found that expression studies of metastases obtain signatures that partially reflect the host tissue but carry additional signatures (9). This finding highlights the need to consider the metastatic site during analyses. Approaches comparing primary tumors and metastatic lesions also often fail to address the role of treatment exposure in altering tumor transcriptomic profiles (9, 10). This is particularly relevant for metastases, as biospecimens of metastases obtained in the clinical setting are usually drawn from patients heavily treated with chemotherapy and/or radiation prior to surgical resection (4, 11). To identify metastasis-specific features free from potentially confounding signals, we developed a novel approach for comparing primary tumors and metastases that takes normal host tissue expression, anatomic origin of the tumors, and treatment exposure status of tumors into consideration, as all three of these factors can substantially influence the detection of metastasis-specific gene-expression patterns. This analytical approach allows for the determination of defining features of lung and liver metastases of colorectal adenocarcinoma, while avoiding the need to always obtain normal host tissues from patients with metastatic disease for the purposes of normalizing tumor gene expression. Last, it allows for the identification of unique subtypes of colorectal adenocarcinoma metastases that are independent of the site of metastasis.
1Department of Biomedical Data Sciences, Geisel School of Medicine at Dartmouth, Hanover, New Hampshire. 2Department of Cancer Epidemiology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, Florida. 3Dan L. Duncan Comprehensive Cancer Center at Baylor College of Medicine, Houston, Texas.
Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).
Y. Kamal and S.L. Schmit contributed equally to this article.
Corresponding Authors: H. Robert Frost, Dartmouth College, HB 7936, Hanover, NH 03755. Phone: 603-667-1884; E-mail: rob.frost@dartmouth.edu; and Christopher I. Amos, Institute for Clinical and Translational Research, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030; E-mail: chris.amos@bcm.edu
Cancer Res 2019;79:4227–41
doi: 10.1158/0008-5472.CAN-18-3945
©2019 American Association for Cancer Research.
Published OnlineFirst June 25, 2019; DOI: 10.1158/0008-5472.CAN-18-3945
Materials and Methods
Primary and metastatic colorectal cancer samples
Gene expression from human colorectal cancer tissues and corresponding clinical data were analyzed. All participants provided written informed consent for data and tissue collection as part of the following protocols at Moffitt Cancer Center (MCC) and Consortium sites: Total Cancer Care (12), Lifetime Cancer Screening, General Banking, Pre-HIPAA Biobanking, or Clinical Collection. All tissue and data analyzed for this project were utilized under the approval of the Advarra Institutional Review Board, which ensures that research is conducted in accordance with recognized ethical guidelines (MCC# 19066/Pro00023353), under the HHS regulations 45 CFR part 46 for human subjects protections, specifically under Subpart A (US Common Rule) as authorized by 45 CFR 46.110. All subjects were ≥18 years of age and free of psychiatric incapacity or dementia. Residual tissue collected as part of routine clinical care was assayed using the Rosetta/Merck Human RSTA Custom Affymetrix 2.0 microarray platform and a single standard operating procedure. For the purposes of this study, only tissues collected via surgical resection, as opposed to biopsy, were analyzed. Microsatellite instability (MSI) status was determined for a subset of patients (n = 71) using the Bethesda panel (13) markers (BAT25, BAT26, NR21, NR24, and NR27).
We established two distinct datasets from Total Cancer Care (TCC) samples. The discovery cohort (MCC dataset), consisting of 517 human colorectal cancer samples from 502 distinct patients, included 333 primary lesions and 184 lung and liver metastases of colorectal adenocarcinoma. All samples in the MCC dataset were collected at the MCC hospital location in Tampa, FL. Wherever possible, histology, clinical information, and MSI status for the MCC dataset were verified through examination of electronic medical records. In addition, we examined 618 human colorectal cancer samples from 618 distinct patients, including 545 primary lesions and 73 lung and liver metastases of colorectal adenocarcinoma, in the validation cohort (Consortium dataset). All samples in the Consortium dataset were obtained from non-MCC TCC regional consortium site partner institutions. All transcriptomic and clinical data have been deposited in the Gene Expression Omnibus, curated by the National Center for Biotechnology Information (accession number: GSE131418).
Microarray expression normalization and principal component analysis
Microarray gene-expression data passing multiple internal quality control filters and curated by the MCC Shared Resources Bioinformatics and Biostatistics Core were obtained for all TCC samples. Microarray chips were normalized using iterative rank-order normalization (IRON; ref. 14) and log2 transformation. In addition, we excluded all probes that mapped to multiple genes. Probe set expression was converted to gene-level expression by selecting the probe with the maximum expression for a given gene.
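The probe-to-gene step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it does not implement IRON, it assumes intensities are already normalized, it interprets "maximum expression" as the highest mean log2 value across samples, and all probe and gene names are hypothetical.

```python
import math
from collections import defaultdict


def collapse_probes_to_genes(probe_expr, probe_to_genes):
    """Collapse probe-level intensities to one log2 profile per gene.

    probe_expr: dict of probe id -> list of normalized intensities (one per sample).
    probe_to_genes: dict of probe id -> list of gene symbols the probe maps to.
    """
    candidates = defaultdict(list)
    for probe, values in probe_expr.items():
        genes = probe_to_genes.get(probe, [])
        if len(genes) != 1:
            # Drop probes mapping to several genes (and, as an assumption,
            # probes with no annotation at all).
            continue
        candidates[genes[0]].append([math.log2(v) for v in values])

    # Assumption: "maximum expression" means the highest mean across samples.
    return {
        gene: max(profiles, key=lambda p: sum(p) / len(p))
        for gene, profiles in candidates.items()
    }


# Hypothetical toy input: four probes measured in two samples.
toy_expr = {"p1": [4.0, 8.0], "p2": [16.0, 32.0], "p3": [2.0, 2.0], "p4": [64.0, 64.0]}
toy_map = {"p1": ["GENE_A"], "p2": ["GENE_A"], "p3": ["GENE_B"], "p4": ["GENE_B", "GENE_C"]}
gene_level = collapse_probes_to_genes(toy_expr, toy_map)
```

In the toy input, probe p4 is discarded because it maps to two genes, and GENE_A is represented by p2, the probe with the higher mean.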
In addition, we performed principal component analysis (PCA) within the MCC and the Consortium datasets using all genes captured on the microarray platform. One MCC sample was found to be an outlier (>3 standard deviations away from the mean of PC1) and thus was subsequently removed from the study (Supplementary Fig. S1). Furthermore, we did not note any significant batch effects between the MCC and Consortium datasets (Supplementary Fig. S2).
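The outlier rule can be illustrated with a short Python sketch. It assumes the PC1 scores have already been computed elsewhere and uses the sample standard deviation, which the text does not specify; the scores are made-up values.

```python
import statistics


def pc1_outliers(pc1_scores, n_sd=3.0):
    """Return indices of samples more than n_sd standard deviations from the PC1 mean."""
    mean = statistics.mean(pc1_scores)
    sd = statistics.stdev(pc1_scores)  # sample SD (an assumption)
    return [i for i, score in enumerate(pc1_scores) if abs(score - mean) > n_sd * sd]


# Hypothetical PC1 scores: twenty unremarkable samples and one extreme sample.
toy_scores = [-1.0, 1.0] * 10 + [50.0]
flagged = pc1_outliers(toy_scores)
```

Here only the last sample (index 20) lies beyond three standard deviations and would be removed.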
Anatomic origin determination for the MCC dataset
Anatomic origin for primary tumors was determined as follows: tumors located in the cecum, ascending colon, hepatic flexure, and transverse colon were classified as proximal, while tumors located in the splenic flexure, descending colon, sigmoid colon, and rectum were classified as distal. Anatomic origin of metastases was classified in the same manner as for primary tumors, with the exception of 7 patients for whom anatomic origin was not clearly designated. Specifically, for 6 of these 7 individuals, anatomic origin of the metastatic tumor was obtained from patient-reported history of site of primary tumor resection or site of hemicolectomy (left or right side) noted in the electronic medical record. Anatomic origin could not be determined for 1 patient in the MCC dataset after medical record examination.
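The site-to-origin mapping above amounts to a lookup table. A minimal Python sketch follows; the function name and the handling of unrecognized site labels are assumptions made for illustration.

```python
PROXIMAL_SITES = {"cecum", "ascending colon", "hepatic flexure", "transverse colon"}
DISTAL_SITES = {"splenic flexure", "descending colon", "sigmoid colon", "rectum"}


def classify_anatomic_origin(site):
    """Map a recorded tumor site to 'proximal' or 'distal'; None if undetermined."""
    key = site.strip().lower()
    if key in PROXIMAL_SITES:
        return "proximal"
    if key in DISTAL_SITES:
        return "distal"
    return None  # assumption: unknown labels are left for manual record review
```

For example, `classify_anatomic_origin("Hepatic flexure")` returns `"proximal"`, while a label outside both sets returns `None`.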
Tumor treatment exposure determination for the MCC dataset
Tumor treatment exposure status was determined by assessing the history of chemotherapy and/or radiation treatment within two years prior to surgical resection of the tumor sample. If there was no history of chemotherapy or radiation exposure prior to surgical resection of the tumor sample, the sample was considered treatment naïve (i.e., resection occurred "pre" treatment). If any history of chemotherapy or radiation within two years prior to surgical resection of the tumor sample in question was noted in the medical records, the tumor sample was considered treatment exposed (i.e., resection occurred "post" treatment).
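As a rough illustration of this rule, the Python sketch below labels a sample from a resection date and a list of therapy dates. Two choices are assumptions not stated in the text: two years is approximated as 730 days, and therapy given more than two years before resection is treated as outside the window, so the sample counts as naïve.

```python
from datetime import date, timedelta

WINDOW = timedelta(days=730)  # two years, approximated (an assumption)


def treatment_status(resection_date, therapy_dates):
    """Return 'exposed' if any chemotherapy/radiation fell within the window before resection."""
    for therapy_date in therapy_dates:
        gap = resection_date - therapy_date
        if timedelta(0) < gap <= WINDOW:
            return "exposed"
    return "naive"


# Hypothetical dates for illustration only.
status_recent = treatment_status(date(2015, 6, 1), [date(2014, 12, 1)])
status_none = treatment_status(date(2015, 6, 1), [])
```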
Anatomic origin classifier for Consortium samples
As clinical data on the Consortium dataset were limited, we imputed missing tumor anatomic origin by developing gene-expression and logistic regression-based anatomic origin classifiers, which categorize tumors as originating from either the proximal colon or the distal colon/rectum. These classifiers were developed based on naïve differential gene-expression analysis comparing proximally and distally originating tumors in the MCC dataset. A gene was determined to be differentially expressed if the fold change was >1.5 and the P value <0.05. The classifiers were developed separately for primary tumors and metastases. Anatomic origin classification of primary tumors was based on the expression of the PRAC1 and HOXC6 genes (AUCmax = 0.93), while the classification of metastases was based on the expression of the PRAC1, HOXC6, OGN, and MUC12 genes (AUCmax = 0.85). We imputed missing anatomic origin for 20 primary tumors and 13 metastases of colorectal adenocarcinoma in the Consortium dataset. Gene-expression differences between proximally and distally originating primary colorectal adenocarcinoma tumors were used for classifying the anatomic origin of metastases. PRAC1 and MUC12 were found to be differentially expressed between proximally and distally originating primary tumors and metastases (Supplementary Figs. S3 and S4). The four genes (OGN, MUC12, PRAC1, and HOXC6) used to classify anatomic origin in the Consortium dataset were eliminated from all subsequent Consortium analyses.
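The differential-expression threshold can be written as a small predicate. The sketch below assumes log2-scale group means and a P value computed elsewhere (the text does not name the test), and it treats the fold-change cutoff as applying in either direction, which is also an assumption.

```python
def is_differentially_expressed(mean_log2_a, mean_log2_b, p_value,
                                min_fold=1.5, alpha=0.05):
    """Apply the fold change > 1.5 and P < 0.05 rule to one gene."""
    # On the log2 scale, the fold change is 2 raised to the absolute mean difference.
    fold_change = 2 ** abs(mean_log2_a - mean_log2_b)
    return fold_change > min_fold and p_value < alpha
```

A gene with log2 means of 8.0 and 7.0 (a 2-fold difference) and P = 0.01 passes; the same gene with P = 0.2, or with means of 8.0 and 7.5 (about 1.41-fold), does not.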
Treatment classification for Consortium samples
Similar to the development of an anatomic origin classifier, we developed gene-expression logistic regression-based classifiers to determine the treatment exposure status of primary tumors and metastases. The classifiers categorize tumors as either treatment exposed or treatment naïve. The treatment classifier for primary Consortium samples was developed based on naïve differential gene-expression analysis comparing primary pre- and post-treatment tumors in the MCC dataset. A gene was determined to be differentially expressed if the fold change was >1.5 and the P value <0.05. The MCC dataset was split into a training and a test set, and the optimal combination of differentially expressed genes that maximized the AUC (AUCmax = 0.815) in the test set was incorporated into the logistic regression model and served as the basis for the primary tumor treatment classifier. The eight genes used to classify primary tumors are SCRG1, HBB, GREM2, SCN7A, CHRDL2, HSPB6, CXCL12, and PLP1.
We were unable to identify a set of genes meeting the differential expression threshold criteria when comparing pre- and post-treated metastases in the MCC dataset, and thus used gene-expression differences between pre- and post-treatment primary tumors to develop the treatment classifier for metastases (Supplementary Figs. S5 and S6). The treatment classifier for metastases (AUCmax = 0.722) was based on the expression of 11 genes: SYNPO2, GREM2, ADH1B, HBB, C7, PLN, SFRP1, AGTR1, CHRDL2, MAMDC2, and MYH11. For all subsequent Consortium analyses, we excluded the 16 genes that were used to develop the treatment classifiers.
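The AUC used to compare candidate gene combinations can be computed directly from classifier scores. The Python sketch below uses the pairwise (rank-based) definition of the AUC; it is not the authors' code, and the scores are invented for illustration.

```python
def auc(scores_positive, scores_negative):
    """Probability that a positive sample outscores a negative one (ties count half)."""
    wins = 0.0
    for pos in scores_positive:
        for neg in scores_negative:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


# Hypothetical predicted probabilities of "treatment exposed" in a test set.
toy_auc = auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])
```

In this toy case 8 of the 9 exposed/naïve pairs are ordered correctly, so the AUC is 8/9. A model selection loop would evaluate this quantity on the test set for each candidate gene combination and keep the combination with the largest value.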
Tissue-specific gene-level and pathway-level weights
Based on the bioinformatics approach developed by Frost (15) to address the tissue specificity of genes and gene sets, we computed lung, liver, colon, and rectum tissue-specific weights for individual genes and gene sets (pathways) in the Molecular Signatures Database (MSigDB Version 6.0; ref. 16). Tissue-sensitive analyses were performed by eliminating genes and pathways exhibiting tissue specificity for lung, liver, colon, or rectum tissues. The filtering criteria for gene-level and pathway-level tissue specificity are as follows: all genes with >2-fold increase in tissue-specific expression and all pathways with a tissue-specific weight >10 are labeled as tissue specific. Additional information on the development of gene and pathway-level tissue weights has been described previously (15).
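The filtering criteria can be sketched as follows in Python. The sketch assumes the per-tissue fold increases and pathway weights are already available as dictionaries; it does not compute the weights themselves, and all names and values are hypothetical.

```python
def is_tissue_specific(values_by_tissue, cutoff):
    """True if the value for any host tissue exceeds the cutoff."""
    return any(value > cutoff for value in values_by_tissue.values())


def tissue_sensitive_subset(gene_fold, pathway_weight,
                            gene_cutoff=2.0, pathway_cutoff=10.0):
    """Keep genes and pathways that are not tissue specific for any host tissue."""
    genes = sorted(g for g, v in gene_fold.items()
                   if not is_tissue_specific(v, gene_cutoff))
    pathways = sorted(p for p, v in pathway_weight.items()
                      if not is_tissue_specific(v, pathway_cutoff))
    return genes, pathways


# Hypothetical fold increases (genes) and weights (pathways) per normal tissue.
toy_gene_fold = {
    "GENE_1": {"lung": 1.1, "liver": 8.0, "colon": 0.9, "rectum": 1.0},
    "GENE_2": {"lung": 1.2, "liver": 1.0, "colon": 1.5, "rectum": 1.4},
}
toy_pathway_weight = {
    "PATHWAY_X": {"lung": 2.0, "liver": 35.0, "colon": 1.0, "rectum": 1.0},
    "PATHWAY_Y": {"lung": 3.0, "liver": 4.0, "colon": 2.0, "rectum": 2.5},
}
kept_genes, kept_pathways = tissue_sensitive_subset(toy_gene_fold, toy_pathway_weight)
```

GENE_1 and PATHWAY_X are removed because of their liver values; the tissue-agnostic setting corresponds to skipping this filter entirely.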
Linear models for microarray data analysis and CAMERA application
CAMERA (17) application considered pathways from the MSigDB Hallmark (18) and C2 (CGP and CP) collections. Tissue-agnostic linear models for microarray data analysis (LIMMA; ref. 19) and CAMERA application to determine differentially expressed genes and gene sets between primary tumors and metastases did not consider tissue-specific gene or pathway expression. All tissue-agnostic analyses adjust for age, sex, tumor treatment exposure status, and anatomic origin. Tissue-sensitive LIMMA and CAMERA application to determine differential expression of genes and pathways between primary tumors and metastases account for tissue-specific expression by eliminating all genes and pathways exhibiting tissue specificity based on the filtering criteria highlighted above. Tissue-sensitive analyses also adjust for age, sex, anatomic origin, and treatment exposure status (unless explicitly indicated otherwise) when comparing primary tumors and metastases. In addition, previous studies have highlighted genetic and transcriptomic differences between colorectal adenocarcinoma tumors arising from the proximal colon and the distal colon/rectum (20). Therefore, if tumor anatomic origin and treatment exposure status were missing for samples in the Consortium dataset, the imputed missing data from gene-expression–based classifiers were used as inputs for LIMMA and CAMERA (Supplementary Figs. S4 and S6). MSI status for a subset of MCC tumor samples (n = 71) was available. Microsatellite instable tumors were typically MSI-high and were classified as such, while tumors labeled MSI-low and microsatellite stable (MSS) were classified as MSS. For this small cohort, LIMMA and CAMERA analyses adjusted for MSI status. Lastly, we also used CAMERA to determine pathways differentially expressed between treatment-naïve and treatment-exposed tumors while adjusting for age, sex, anatomic origin, and tumor type (primary tumor or metastasis of colorectal adenocarcinoma).
Determination of the M1 and M2 clusters in the MCC and Consortium datasets
To discover subtypes of metastases, we performed unsuper-vised hierarchical clustering using the top 500 differentiallyexpressed genes between primary tumors and metastases in eachdataset. Based on our clustering analysis, two main clusters ofmetastases were identified in each dataset, and differential path-way expression between these two clusters of metastases wasdetermined using CAMERA. To ensure that the M1 and M2clusters in each dataset exhibit similar underlying biology, wedeveloped an M1/M2 classifier trained on the MCC dataset andtested on the Consortium dataset. Cluster membership wasdefined by the hierarchical clustering of the top 500 differentiallyexpressed genes between primary tumors and metastases in eachdataset. Inputs for the logistic regression-based classifier includedthe covariates age, sex, tumor anatomic origin, and treatmentexposure status as well as single-sample gene set enrichmentscores (21) for pathways found to be differentially expressedbetween the M1 and M2 clusters in the MCC dataset.
Examining adaptations to distal sites of metastases observed in primary tumors
We examined adaptations to distal sites of metastases in primary tumors of patients who went on to develop either lung (n = 18) or liver (n = 48) metastases as their initial distal metastasis. Tumors from patients who developed both lung and liver metastases at once or whose initial distal site of metastasis was not the liver or lung were excluded. Using CAMERA, we determined differential pathway expression. Next, we assessed if these differential pathways exhibited high tissue specificity for normal lung or liver tissues (tissue weight >10). Enrichment of pathways with high tissue-specific weights was considered to be an adaptation to the lung and liver observed in the primary colorectal adenocarcinoma tumors.
CMS classification and logistic regression
CMS is a transcriptome-based classification of colorectal adenocarcinomas with prognostic value (22). Tumors were classified into CMS groups 1–4 or CMS_NA using the single sample predictor method as previously described (22). We performed logistic regression to determine the association of CMS with primary tumors or metastases of colorectal adenocarcinoma while adjusting for age, sex, anatomic origin, and tumor treatment exposure status. For both the MCC and Consortium datasets, we excluded CMS3 from the logistic regression models as metastases were never classified as CMS3. Inclusion of CMS3 in the regression model would have resulted in complete separation, so that a finite maximum likelihood estimate would not exist and the beta coefficient for the predictor would be inflated.
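A simple pre-check for this problem is to look for levels of a categorical predictor in which only one outcome class occurs. The Python sketch below does this for hypothetical data; strictly speaking, a single such level produces quasi-complete separation, and the check ignores the other covariates in the model.

```python
def separated_levels(levels, outcomes):
    """Return predictor levels for which only one outcome class is observed."""
    classes_seen = {}
    for level, outcome in zip(levels, outcomes):
        classes_seen.setdefault(level, set()).add(outcome)
    return sorted(level for level, seen in classes_seen.items() if len(seen) == 1)


# Hypothetical data: outcome 1 = metastasis, 0 = primary tumor.
toy_cms = ["CMS1", "CMS1", "CMS2", "CMS2", "CMS3", "CMS3", "CMS4", "CMS4"]
toy_is_metastasis = [0, 1, 0, 1, 0, 0, 0, 1]
problem_levels = separated_levels(toy_cms, toy_is_metastasis)
```

In the toy data, CMS3 contains only primary tumors, so its log-odds coefficient has no finite maximum likelihood estimate and the level would be dropped before fitting.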
Statistical analysis
All statistical analyses were performed in R (Version 3.5). Differences between groups were assessed using the Wilcoxon rank-sum test. Spearman rank correlation was used to determine the degree of overlap in pathway enrichment comparisons between datasets. We set the false discovery rate (FDR) to 0.1 to identify associations in expression analyses.
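To make the FDR threshold concrete, the sketch below applies the Benjamini-Hochberg adjustment in Python. The text does not state which FDR procedure was used, so Benjamini-Hochberg is an assumption here, and the P values are invented.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical P values for four pathways.
toy_fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
significant = [i for i, q in enumerate(toy_fdr) if q < 0.1]
```

With the toy values, the first three pathways have adjusted values below 0.1 and would be reported.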
Results
Characteristics of TCC participants
Clinical and gene-expression data corresponding to human colorectal adenocarcinoma surgical resection biospecimens were obtained through the MCC TCC Protocol (12), including participating TCC Consortium sites. Data were separated into a discovery cohort of MCC samples (n = 517) and a validation cohort of non-MCC TCC Consortium samples (n = 618). Both datasets primarily consist of unmatched tumor samples, with n = 15 matched primary tumor and metastatic lesion samples from the same individual in the MCC dataset. Therefore, given the small number of paired samples, all statistical analyses were performed ignoring paired status. Clinical characteristics and tumor sample inclusion/exclusion criteria for the discovery cohort, hereafter referred to as the MCC dataset, and the validation cohort, hereafter referred to as the Consortium dataset, are described in Table 1 and Fig. 1. In the MCC dataset, lung and liver metastases of colorectal adenocarcinoma were more likely to originate from the distal colon or rectum (Wilcoxon rank-sum test; P = 0.00073) and were more likely to have been exposed to chemotherapy and/or radiation treatment prior to surgery (Wilcoxon rank-sum test; P < 0.0001). Interestingly, even within metastases, we observed gene-expression differences between metastases originating from the proximal or distal colon/rectum. Therefore, we adjusted for anatomic origin and treatment exposure status in all analyses comparing primary tumors and metastases of colorectal adenocarcinoma (Supplementary Figs. S3–S7).
Elimination of host tissue–specific gene expression
Lung and liver resection of colorectal adenocarcinoma metastases improves long-term survival (23–25). Surgical resection margins should be tumor-free to ensure removal of the entire tumor mass.
Table 1. Baseline characteristics for TCC MCC and Consortium participants in the primary and metastasis cohorts
MCC (n = 517); Consortium (n = 618)

| Characteristic | MCC primary (n = 333) | MCC metastases (n = 184) | Consortium primary (n = 545) | Consortium metastases (n = 73) |
|---|---|---|---|---|
| Age at diagnosis (years) | 63.64 | 59.30 | 67.94 | 58.09 |
| Race/ethnicity (%) | | | | |
| White | 303 (90.9%) | 157 (85.3%) | 470 (86.2%) | 62 (85.0%) |
| Black/African American | 15 (4.5%) | 10 (5.4%) | 36 (6.6%) | 8 (10.9%) |
| Other/unknown | 15 (4.5%) | 17 (9.2%) | 39 (7.2%) | 3 (4.1%) |
| Gender (%) | | | | |
| Male | 183 (54.9%) | 106 (57.6%) | 265 (48.6%) | 47 (64.4%) |
| Female | 150 (45.0%) | 78 (42.4%) | 280 (51.4%) | 26 (35.6%) |
| Treatment status (%) | | | | |
| Pretreatment | 235 (70.6%) | 56 (30.4%) | 448 (82.2%)(a) | 10 (13.7%)(a) |
| Post-treatment | 98 (29.4%) | 128 (69.6%) | 97 (17.8%)(a) | 63 (86.3%)(a) |
| Chemotherapy only | 28 (8.4%) | 100 (54.3%) | | |
| Chemotherapy and radiation | 63 (18.9%) | 27 (14.7%) | | |
| Radiation only | 7 (2.1%) | 1 (0.5%) | | |
| Anatomic origin (%) | | | | |
| Proximal colon | 129 (38.7%) | 44 (23.9%) | 284 (52.1%)(a) | 19 (26.0%)(a) |
| Distal colon/rectum | 204 (61.3%) | 139 (75.5%) | 261 (47.9%)(a) | 54 (74.0%)(a) |
| MSI status (%) | | | | |
| MSI-high | 3 (0.9%) | 0 (0%) | | |
| MSS: MSI-low | 5 (1.5%) | 3 (1.6%) | | |
| MSS: MSS | 37 (11.1%) | 23 (12.5%) | | |
| Unknown | 288 (86.4%) | 158 (85.9%) | | |
| Primary tumor stage (%) | | | | |
| Stage 1 | 56 (16.8%) | | 0 (0%) | |
| Stage 2 | 105 (31.5%) | | 7 (1.3%) | |
| Stage 3 | 100 (30.0%) | | 21 (3.8%) | |
| Stage 4 | 72 (21.7%) | | 27 (5.0%) | |
| Unknown | | | 490 (89.9%) | |
| Site of metastasis (%) | | | | |
| Liver | | 141 (76.6%) | | 56 (76.7%) |
| Lung | | 43 (23.4%) | | 17 (23.3%) |

NOTE: Treatment status refers to the exposure of primary tumors and metastases of colorectal adenocarcinoma to chemotherapy and/or radiation treatment prior to surgical resection of the tumor. MSI status was determined using PCR for five MSI markers (BAT25, BAT26, NR21, NR24, and NR27). Only 15 samples in the MCC dataset were paired samples with primary tumors and metastases originating from the same patient.
(a) If anatomic origin and treatment exposure status for tumors in the Consortium dataset were not available, the missing data were imputed using gene-expression–based classifiers. Anatomic origin was imputed for 20 primary tumors and 13 metastases of colorectal adenocarcinoma in the Consortium dataset. Treatment was imputed for 336 primary tumors and 67 metastases of colorectal adenocarcinoma in the Consortium dataset. Values that include imputed data are marked (a).
Therefore, some remnants of normal tissue will inevitably be found in resected biospecimens. This makes effective comparative transcriptomic analysis of primary tumors and metastatic lesions challenging, as expression differences between the normal primary and metastasis host tissue sites can overshadow the true differences between primary tumors and metastases. To address this issue, we generated tissue-specific gene and pathway-level weights, as previously described (15), for all normal host tissues of interest (colon, rectum, lung, and liver). The use of discretized tissue-specific weights eliminates the need to profile each normal host tissue sample adjacent to the metastatic colorectal adenocarcinoma lesions for the purposes of normalizing the tumor transcriptomic data. Using these weights, we performed tissue-sensitive analyses that only included genes where the expression levels in the normal tissue were below a specific threshold (described further in Materials and Methods). This resulted in the elimination of genes exhibiting high tissue specificity. In addition, we performed tissue-agnostic analyses, which ignore the tissue specificity of genes and pathways. For the tissue-sensitive analysis, we eliminated all genes and pathways exhibiting tissue-specific activity in lung, liver, colon, or rectum (Supplementary Table S1); the tissue-agnostic analysis included all genes and pathways irrespective of tissue specificity.
As a visual confirmation of this approach, we applied t-distributed stochastic neighbor embedding (tSNE; ref. 26) to the MCC and Consortium gene-expression datasets in the tissue-agnostic (i.e., without elimination of tissue-specific genes) and tissue-sensitive (i.e., with elimination of tissue-specific genes) settings (Fig. 2). In the tissue-agnostic setting, samples clustered based on the site (colon/rectum, lung, or liver) of tumor resection (Hotelling T² test comparing tSNE clusters of liver and lung metastases; MCC: P < 1 × 10⁻¹⁶). Conversely, in the tissue-sensitive setting, we observed separation of primary tumors and metastases but no longer observed sample clustering based on the site of metastatic tumor resection (Hotelling T² test comparing tSNE clusters of liver and lung metastases; MCC: P = 0.0491), such that lung and liver metastases are integrated across clusters (Fig. 2). To evaluate the potential influence of tumor purity, we inferred tumor purity for all samples in both datasets using the ESTIMATE algorithm (27), which uses gene-expression signatures to infer fractions of stromal, immune, and cancer cells from a mixture. We did not find significant differences in tumor purity between samples drawn from primary tumors and metastases of colorectal adenocarcinoma (MCC Wilcoxon rank-sum test, P = 0.7087; Consortium Wilcoxon rank-sum test, P = 0.695; Supplementary Fig. S7). This indicates that tumor purity is not the main driver of differences between primaries and metastases, as both are likely capturing similar quantities of normal host tissue during tumor resection.
Figure 1.
CONSORT flow diagram detailing inclusion and exclusion criteria for primary and metastatic colorectal adenocarcinoma samples. All possible colorectal adenocarcinoma samples with available gene-expression data originating from the large bowel, rectum, or anus were considered for this study. Colorectal adenocarcinoma metastases were restricted to those found in the liver or lung. All tumor samples were restricted to one sample per patient with the exception of 15 patients with matching colorectal adenocarcinoma primary and lung or liver metastases.
We examined differential expression of genes at the pathway level using pathways in the Hallmark (18) and C2 collections of the Molecular Signatures Database (MSigDB Version 6.0; ref. 16). Differential pathway analyses were performed, adjusting for age, sex, treatment exposure status, and anatomic site of origin, between colorectal adenocarcinoma primary and lung metastases as well as between colorectal adenocarcinoma primary and liver metastases in the tissue-sensitive and tissue-agnostic settings (Table 2; Supplementary Tables S2 and S3). Materials and Methods and Supplementary Materials describe additional details about the pathway-level analyses. Spearman rank correlation was used to assess if similar pathways were enriched when comparing colorectal adenocarcinoma primaries with lung metastases and when comparing colorectal adenocarcinoma primaries with liver metastases. In the tissue-agnostic setting, the rank correlations (r) observed for the Hallmark and C2 gene set collections in the MCC dataset were r_Hallmark = 0.19 and r_C2 = 0.1, respectively, while in the tissue-sensitive setting they were r_Hallmark = 0.61 and r_C2 = 0.25. Convergence of pathway enrichment results in the tissue-sensitive but not tissue-agnostic settings was replicated in the Consortium dataset (Table 2). In the tissue-agnostic setting, liver-specific pathways, such as bile acid production (FDR_MCC = 1.77 × 10⁻⁸) and xenobiotic metabolism (FDR_MCC = 2.39 × 10⁻¹³), were enriched in liver metastases compared with primary tumors. However, in the tissue-sensitive setting, cancer-related pathways, such as MYC targets (FDR_MCC = 4.74 × 10⁻⁴, FDR_Consortium = 8.59 × 10⁻⁹), were enriched in both liver and lung metastases as compared with primaries in both datasets (Tables 2 and 3; Supplementary Tables S2 and S3). These results highlight the role of host organ tissue gene expression when comparing primaries and metastases. Elimination of tissue-specific gene expression of the host organs allowed us to perform a meta-analysis of liver and lung metastases to determine defining features of metastases of colorectal adenocarcinoma after also accounting for tumor anatomic origin and tumor treatment exposure status.
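The comparison of enrichment results between the lung and liver analyses relies on Spearman rank correlation, which can be sketched in plain Python. Which per-pathway statistic was ranked is not stated in the text; the sketch simply assumes one numeric enrichment score per pathway in each comparison, and the scores are invented.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical enrichment scores for five pathways in two comparisons.
toy_rho = spearman([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
```

A value near 1 indicates that the two comparisons rank pathways similarly, which is the sense in which the tissue-sensitive results "converge".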
The role of tumor treatment exposure in comparing colorectal adenocarcinoma primaries and metastases
Given the unbalanced distribution of treatment exposure between colorectal adenocarcinoma primary tumors and metastatic lesion biospecimens that underwent gene-expression profiling, we aimed to examine the role of treatment as a potential confounder when comparing the transcriptomic patterns of primaries and metastases. In the tissue-sensitive setting, we found that pathways such as angiogenesis and hypoxia were enriched in metastases of colorectal adenocarcinoma compared with primary tumors when treatment status is ignored (Table 3; Supplementary Table S4). Importantly, angiogenesis and hypoxia were also enriched in treatment-exposed tumors relative to treatment-naïve tumors, indicating that their apparent enrichment in metastases is due to confounding by treatment status (Table 3; Supplementary Tables S4 and S5). Supporting the role of treatment status as a confounder, angiogenesis and hypoxia are no longer enriched in metastases when the pathway analysis adjusts for treatment exposure status. In order to determine features of chemotherapy and/or radiation treatment-exposed metastases of colorectal adenocarcinoma, we compared treatment-naïve (n = 56) and treatment-exposed (n = 128) metastases in the MCC dataset to one another. Treatment-exposed metastases shared characteristics with treatment-exposed primaries, such as increased epithelial–mesenchymal transition (EMT; HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION; FDR_MCC = 4.81 × 10⁻⁴), angiogenesis (HALLMARK_ANGIOGENESIS; FDR_MCC = 6.21 × 10⁻²), and hypoxia (HALLMARK_HYPOXIA; FDR_MCC = 6.36 × 10⁻²), whereas treatment-naïve metastases exhibited increased MYC_TARGETS_V2
Figure 2.
tSNE visualizations of MCC and Consortium primary colorectal adenocarcinoma and lung and liver metastases of colorectal adenocarcinoma in the tissue-agnostic and tissue-sensitive analysis settings. tSNE visualizations were generated using the first 50 principal components based on tumor gene-expression data. In the tissue-agnostic setting, all possible genes were considered for PCA and subsequent tSNE visualization. In the tissue-sensitive setting, genes exhibiting tissue specificity, defined as a 2-fold expression increase of a given gene in normal lung, liver, colon, or rectum tissues, were excluded.
Kamal et al.
Cancer Res; 79(16) August 15, 2019 Cancer Research4232
on March 8, 2021. © 2019 American Association for Cancer Research. cancerres.aacrjournals.org Downloaded from
Published OnlineFirst June 25, 2019; DOI: 10.1158/0008-5472.CAN-18-3945
Table 2. Differences between primary tumors and metastases of colorectal adenocarcinomas with and without consideration of tissue-specific pathway expression

Tissue agnostic. Left: MCC primary CRC vs. liver metastases. Right: MCC primary CRC vs. lung metastases.
C | Pathway (liver) | FDR | Direction | Pathway (lung) | FDR | Direction
H | HALLMARK_XENOBIOTIC_METABOLISM | 2.39 × 10^-13 | Up | HALLMARK_MYC_TARGETS_V2 | 1.40 × 10^-03 | Up
H | HALLMARK_COAGULATION | 1.77 × 10^-08 | Up | HALLMARK_MYC_TARGETS_V1 | 4.76 × 10^-03 | Up
H | HALLMARK_BILE_ACID_METABOLISM | 2.66 × 10^-05 | Up | HALLMARK_DNA_REPAIR | 2.82 × 10^-02 | Up
H | HALLMARK_MYC_TARGETS_V2 | 1.69 × 10^-04 | Up | HALLMARK_E2F_TARGETS | 9.31 × 10^-02 | Up
H | HALLMARK_MYC_TARGETS_V1 | 4.83 × 10^-03 | Up | HALLMARK_TNFA_SIGNALING_VIA_NFKB | 1.98 × 10^-10 | Down
H | HALLMARK_FATTY_ACID_METABOLISM | 1.45 × 10^-02 | Up | HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION | 9.12 × 10^-18 | Down
C2 | LIVER_SPECIFIC_GENES | 4.83 × 10^-84 | Up | LUNG_CANCER_DIFFERENTIATION_MARKERS | 1.23 × 10^-22 | Up
C2 | LIVER | 1.48 × 10^-67 | Up | COLON_AND_RECTAL_CANCER_UP | 7.91 × 10^-07 | Up
C2 | LIVER_CANCER_SUBCLASS_PROLIFERATION_DN | 7.29 × 10^-42 | Up | BREAST_CANCER_20Q11_AMPLICON | 1.18 × 10^-06 | Up
C2 | LIVER_CANCER_SURVIVAL_UP | 2.19 × 10^-26 | Up | REACTOME_INFLUENZA_VIRAL_RNA_TRANSCRIPTION_AND_REPLICATION | 1.43 × 10^-06 | Up
C2 | BIOCARTA_INTRINSIC_PATHWAY | 2.77 × 10^-26 | Up | RICKMAN_HEAD_AND_NECK_CANCER_D | 1.67 × 10^-06 | Up
C2 | KEGG_COMPLEMENT_AND_COAGULATION_CASCADES | 2.43 × 10^-24 | Up | REACTOME_PEPTIDE_CHAIN_ELONGATION | 2.32 × 10^-06 | Up

Tissue sensitive. Left: MCC primary CRC vs. liver metastases. Right: MCC primary CRC vs. lung metastases.
C | Pathway (liver) | FDR | Direction | Pathway (lung) | FDR | Direction
H | HALLMARK_MYC_TARGETS_V2 | 4.74 × 10^-04 | Up | HALLMARK_MYC_TARGETS_V2 | 9.25 × 10^-04 | Up
H | HALLMARK_MTORC1_SIGNALING | 7.7 × 10^-02 | Up | HALLMARK_DNA_REPAIR | 7.79 × 10^-02 | Up
H | HALLMARK_DNA_REPAIR | 8.44 × 10^-02 | Up | HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION | 1.28 × 10^-07 | Down
H | HALLMARK_GLYCOLYSIS | 9.32 × 10^-02 | Up | HALLMARK_UV_RESPONSE_DN | 4.61 × 10^-04 | Down
H | HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION | 5.38 × 10^-04 | Down | HALLMARK_MYOGENESIS | 2.61 × 10^-03 | Down
H | HALLMARK_UV_RESPONSE_DN | 4.14 × 10^-03 | Down | HALLMARK_PANCREAS_BETA_CELLS | 1.50 × 10^-02 | Down
C2 | SEMENZA_HIF1_TARGETS | 1.01 × 10^-04 | Up | BREAST_CANCER_20Q11_AMPLICON | 1.84 × 10^-06 | Up
C2 | REACTOME_INFLUENZA_VIRAL_RNA_TRANSCRIPTION_AND_REPLICATION | 8.90 × 10^-04 | Up | REACTOME_INFLUENZA_VIRAL_RNA_TRANSCRIPTION_AND_REPLICATION | 2.00 × 10^-06 | Up
C2 | REACTOME_NONSENSE_MEDIATED_DECAY_ENHANCED_BY_THE_EXON_JUNCTION_COMPLEX | 1.27 × 10^-03 | Up | REACTOME_PEPTIDE_CHAIN_ELONGATION | 3.09 × 10^-06 | Up
C2 | BREAST_CANCER_20Q11_AMPLICON | 1.28 × 10^-03 | Up | KEGG_RIBOSOME | 4.51 × 10^-06 | Up
C2 | KEGG_RIBOSOME | 1.72 × 10^-03 | Up | REACTOME_3_UTR_MEDIATED_TRANSLATIONAL_REGULATION | 5.21 × 10^-06 | Up
C2 | REACTOME_PEPTIDE_CHAIN_ELONGATION | 1.89 × 10^-03 | Up | REACTOME_NONSENSE_MEDIATED_DECAY_ENHANCED_BY_THE_EXON_JUNCTION_COMPLEX | 5.21 × 10^-06 | Up

NOTE: Pathway enrichment differences are displayed between primary colorectal adenocarcinoma and liver colorectal adenocarcinoma metastases and primary colorectal adenocarcinoma and lung colorectal adenocarcinoma metastases in the MCC cohort in the tissue-agnostic and tissue-sensitive settings. Displayed are the top five pathways found in each analysis. Overlapping enrichment results between the analyses of primary colorectal adenocarcinoma vs. liver metastases and primary colorectal adenocarcinoma vs. lung metastases are highlighted in blue. Only filtered pathways enriched with an FDR < 0.1 are shown. All pathways examined are from the MSigDB database. C, collection in the MSigDB database; H, Hallmark collection; C2, curated (CGP and CP) collection. Direction is chosen in reference to pathways enriched in colorectal adenocarcinoma metastases, such that Up = enriched in colorectal adenocarcinoma metastases, while Down = enriched in primary colorectal adenocarcinoma. Filtered analyses were performed after removal of pathways exhibiting tissue specificity for liver, lung, colon, and rectum tissues in each collection examined. All analyses adjust for age, sex, treatment status, and anatomic origin. For the Consortium dataset, missing treatment status and anatomic origin were determined using gene expression–based classifiers.
Abbreviation: CRC, colorectal adenocarcinoma.
Table 3. The role of chemotherapy and radiation exposure when comparing primary tumors and metastases of colorectal adenocarcinoma

MCC dataset. (a) Primary CRC vs. lung and liver CRC metastases, adjusted for treatment status. (b) Primary CRC vs. lung and liver CRC metastases, not adjusted for treatment status. (c) Pre vs. post treatment, adjusted for primary vs. metastatic tumor status.
C | Pathway (a) | FDR (a) | Pathway (b) | FDR (b) | Pathway (c) | FDR (c) | Direction
H | HALLMARK_MYC_TARGETS_V2 | 9.25 × 10^-04 | HALLMARK_ANGIOGENESIS | 6.69 × 10^-02 | HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION | 3.30 × 10^-18 | Up
H | HALLMARK_DNA_REPAIR | 7.78 × 10^-02 | HALLMARK_HYPOXIA | 6.69 × 10^-02 | HALLMARK_MYOGENESIS | 4.28 × 10^-12 | Up
H | HALLMARK_GLYCOLYSIS | 7.78 × 10^-02 | HALLMARK_GLYCOLYSIS | 1.75 × 10^-01 | HALLMARK_HYPOXIA | 9.15 × 10^-08 | Up
H | HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION | 9.25 × 10^-04 | HALLMARK_PANCREAS_BETA_CELLS | 6.69 × 10^-02 | HALLMARK_MYC_TARGETS_V2 | 4.60 × 10^-18 | Down
H | HALLMARK_UV_RESPONSE_DN | 4.66 × 10^-03 | HALLMARK_UV_RESPONSE_DN | 1.59 × 10^-01 | HALLMARK_DNA_REPAIR | 1.36 × 10^-03 | Down
H | HALLMARK_PANCREAS_BETA_CELLS | 2.13 × 10^-02 | HALLMARK_PROTEIN_SECRETION | 1.76 × 10^-01 | HALLMARK_MTORC1_SIGNALING | 2.00 × 10^-03 | Down

Consortium dataset. (a) Primary CRC vs. lung and liver CRC metastases, adjusted for treatment status. (b) Primary CRC vs. lung and liver CRC metastases, not adjusted for treatment status. (c) Pre vs. post treatment, adjusted for primary vs. metastatic tumor status.
C | Pathway (a) | FDR (a) | Pathway (b) | FDR (b) | Pathway (c) | FDR (c) | Direction
H | HALLMARK_MYC_TARGETS_V2 | 8.59 × 10^-09 | HALLMARK_ANGIOGENESIS | 2.17 × 10^-03 | HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION | 1.22 × 10^-24 | Up
H | HALLMARK_MTORC1_SIGNALING | 6.38 × 10^-04 | HALLMARK_HYPOXIA | 2.86 × 10^-03 | HALLMARK_MYOGENESIS | 1.45 × 10^-12 | Up
H | HALLMARK_GLYCOLYSIS | 5.75 × 10^-03 | HALLMARK_PEROXISOME | 5.54 × 10^-02 | HALLMARK_UV_RESPONSE_DN | 7.82 × 10^-10 | Up
H | HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION | 2.53 × 10^-14 | HALLMARK_PANCREAS_BETA_CELLS | 1.96 × 10^-01 | HALLMARK_MYC_TARGETS_V2 | 8.97 × 10^-19 | Down
H | HALLMARK_MYOGENESIS | 1.46 × 10^-06 | HALLMARK_ESTROGEN_RESPONSE_LATE | 6.72 × 10^-01 | HALLMARK_MTORC1_SIGNALING | 4.64 × 10^-06 | Down
H | HALLMARK_UV_RESPONSE_DN | 4.93 × 10^-06 | HALLMARK_SPERMATOGENESIS | 6.72 × 10^-01 | HALLMARK_DNA_REPAIR | 1.68 × 10^-04 | Down

NOTE: We evaluated pathway enrichment differences between primaries and metastases while adjusting for tumor treatment status and while ignoring tumor treatment status. In addition, we examined differences between treatment-naïve and treatment-exposed tumors in both datasets. Here we display the top three pathways enriched in metastases and primary colorectal adenocarcinoma tumors in the treatment-adjusted and treatment-naïve analyses, as well as the top three pathways enriched in naïve and treatment-exposed tumors. Pathways overlapping between the MCC and Consortium analyses are highlighted in yellow. All pathways examined are from the MSigDB gene set collections. Top three upregulated and top three downregulated pathways from each MSigDB collection are shown. C, collection in the MSigDB database; H, Hallmark collection. Direction is chosen in reference to pathways enriched in colorectal adenocarcinoma metastases (Up) or in treatment-exposed samples (Up). The down direction refers to pathways enriched in either primary colorectal adenocarcinoma tumors or treatment-naïve tumors depending on the analysis. All analyses contrasting primary colorectal adenocarcinoma and metastases of colorectal adenocarcinoma were performed after filtering for pathways exhibiting tissue specificity for colon, rectum, lung, or liver tissues, and after adjusting for age, sex, anatomic origin, and treatment status (unless indicated otherwise). When contrasting pathway enrichment between different treatment groups, we adjusted for tumor type (primary colorectal adenocarcinoma vs. lung/liver colorectal adenocarcinoma metastases). For Consortium cases, missing data on treatment status and anatomic origin were imputed using gene expression–based classifiers.
Abbreviation: CRC, colorectal adenocarcinoma.
(FDRMCC = 3.59 × 10^-07) and proliferative activity (REACTOME_S_PHASE; FDRMCC = 2.76 × 10^-03). These findings (Table 3; Supplementary Table S6) not only highlighted the role of treatment exposure in altering the transcriptomic landscapes of both colorectal adenocarcinoma primary tumors and metastases, but also demonstrated the importance of considering treatment exposure as a covariate when comparing gene-expression patterns of primary tumors and metastatic lesions.
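The confounding effect described in this section can be illustrated with a toy calculation. The sketch below uses hypothetical pathway scores in which treatment raises the score and metastatic status has no effect, with treatment exposure distributed unevenly between primaries and metastases. It compares a crude difference with a treatment-stratified difference. Stratification is used here only because it is easy to follow; the study itself adjusts for treatment as a covariate in its models, and none of the numbers below come from the study.

```python
from statistics import mean

# Hypothetical samples: (tumor_type, treated, pathway_score).
# Treatment adds 1.0 to the score; metastatic status adds nothing.
samples = (
    [("primary", False, 1.0)] * 8 + [("primary", True, 2.0)] * 2
    + [("metastasis", False, 1.0)] * 3 + [("metastasis", True, 2.0)] * 7
)


def crude_difference(samples):
    """Metastasis-minus-primary mean score, ignoring treatment status."""
    met = [score for tumor, _, score in samples if tumor == "metastasis"]
    pri = [score for tumor, _, score in samples if tumor == "primary"]
    return mean(met) - mean(pri)


def stratified_difference(samples):
    """Metastasis-minus-primary difference computed within each treatment
    stratum, then averaged with weights proportional to stratum size."""
    weighted_sum = 0.0
    total = 0
    for treated in (False, True):
        stratum = [(tumor, score) for tumor, tr, score in samples if tr == treated]
        met = [score for tumor, score in stratum if tumor == "metastasis"]
        pri = [score for tumor, score in stratum if tumor == "primary"]
        if met and pri:
            weighted_sum += (mean(met) - mean(pri)) * len(stratum)
            total += len(stratum)
    return weighted_sum / total


print(crude_difference(samples), stratified_difference(samples))
```

The crude comparison shows an apparent increase in metastases that disappears once samples are compared within the same treatment stratum, which mirrors the behavior reported for angiogenesis and hypoxia.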
Key gene and pathway enrichment differences between colorectal adenocarcinoma primaries and metastases
We aimed to discover transcriptomic signatures of metastatic lesions after consideration of host tissue expression, treatment exposure status, tumor anatomic origin, age, and sex. We examined pathway and gene enrichment differences between colorectal adenocarcinoma metastases in the lung and liver and the colorectal adenocarcinoma primary tumors (Supplementary Tables S7 and S8) in the MCC discovery and Consortium validation datasets.
Examination of the Hallmark collection showed that colorectal adenocarcinoma metastases exhibited increased MYC signaling (FDRMCC = 4.74 × 10^-04), DNA repair (FDRMCC = 8.44 × 10^-02), and glycolysis (FDRMCC = 7.78 × 10^-02) activity (Table 3; Supplementary Table S7). Examination of the C2 collection provided additional support for enhanced MYC signaling in metastases of colorectal adenocarcinoma, based on the numerous transcription, translation, and ribosomal pathways found to be enriched in metastases (Supplementary Table S7). Metabolic machinery was also altered in metastases, which showed increased gluconeogenesis (MOOTHA_GLUCONEOGENESIS; FDRMCC = 6.53 × 10^-03 and REACTOME_GLUCONEOGENESIS; FDRMCC = 6.52 × 10^-03) activity, likely as a result of MYC upregulation. In addition, hypoxia-inducible factor (HIF) targets (SEMENZA_HIF1_TARGETS; FDRMCC = 9.87 × 10^-05; ref. 28) and hypoxia targets of VHL (WACKER_HYPOXIA_TARGETS_OF_VHL; FDRMCC = 1.18 × 10^-02; ref. 29) were also enriched in metastatic lesions, albeit to a lesser degree than MYC and the metabolic changes associated with it.
MSI status could be a potential confounder when assessing differences between primary tumors and colorectal adenocarcinoma metastases, as MSI-high tumors are typically diagnosed at a less advanced stage (30). Therefore, we replicated our findings in a smaller subset of samples in the MCC dataset for which MSI status was available and adjusted for in the tissue-sensitive setting (Supplementary Table S9). We found the PECE_MAMMARY_STEM_CELL_UP (FDRMCC = 6.84 × 10^-03; ref. 31) and BENPORATH_ES_CORE_NINE (FDRMCC = 6.69 × 10^-02; ref. 32) gene sets, which are potential cancer stem cell pathways, to be enriched in lung and liver metastases of colorectal adenocarcinoma. We observed almost no overlap between the genes defining these stem cell signatures and genes defining EMT activity (Fig. 3). As such, our findings suggest EMT and cancer stemness are not necessarily coupled, as EMT signatures were more prevalent in colorectal adenocarcinoma primaries while cancer stemness activity was enriched in metastases. Furthermore, we showed that metastases are enriched in expression of cancer stem cell genes even after adjusting for treatment. This suggested that cancer stem cells likely exist in all metastases, but that chemotherapy and radiation treatment exposure may select for them. Similarly, hypoxia and angiogenesis activity are likely enriched in all metastases, but treatment exposure again appears to enhance the activation of these pathways. Viral replication and transcription pathways were also found in metastases. However, these pathways overlapped highly with multiple global cellular transcription and translation pathways, whose enrichment is likely due to the enhanced MYC signaling observed in metastases (Supplementary Table S10), and therefore are not indicative of distinct viral activity. Similarly, the hallmark myogenesis pathway was enriched in primary tumors, as it shares many mesenchymal phenotype genes with the hallmark EMT pathway (Supplementary Table S11).
The most significantly differentially expressed genes (Fig. 3; Supplementary Table S8) between primary tumors and metastases of colorectal adenocarcinoma are related to EMT. FBN2 (fold change = 7.6; FDRMCC = 1.12 × 10^-99), MMP3 (fold change = 38.2; FDRMCC = 1.64 × 10^-91), and FGF10 (fold change = 5.7; FDRMCC = 8.74 × 10^-67) are all either known stimulators or markers of EMT (33, 34) and were highly elevated in primary tumors compared with metastases (Fig. 3). These findings were supported by our pathway-level results, which showed that EMT (HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION; FDRMCC = 1.16 × 10^-22) is highly upregulated in primary colorectal adenocarcinomas. In comparison, genes significantly elevated in metastases (Fig. 3; Supplementary Table S8) include GATA4 (fold change = 3.9; FDRMCC = 2.09 × 10^-36), CLDN10 (fold change = 3.9; FDRMCC = 3.92 × 10^-33), and SYT12 (fold change = 3.4; FDRMCC = 1.12 × 10^-54). GATA4 is thought to mark fully differentiated epithelial cells, and its expression is often silenced in colorectal adenocarcinoma, as forced expression of GATA4 results in impaired colorectal adenocarcinoma cell line proliferation and migration (35). Increased expression of GATA4 in metastases supported our pathway-level results, which showed decreased EMT activity in metastases. Similarly, CLDN10 codes for a claudin protein. Claudin proteins are integral components of tight junctions, and their expression has been associated with recurrence of primary hepatocellular carcinoma (36). Lastly, SYT12 is involved in regulating calcium-independent secretion in nonneuronal cells, and it has been previously linked with unfavorable prognosis in pancreatic cancer (37).
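To relate the fold changes quoted above to the log2 scale used on the volcano plots in Fig. 3, the following sketch converts a fold change into a signed log2 fold change. It assumes that each reported fold change is the ratio of the higher group over the lower group, and it follows the sign convention of the Fig. 3 caption (negative for genes higher in primaries, positive for genes higher in metastases). The function name is hypothetical.

```python
import math


def signed_log2_fc(fold_change, higher_in):
    """Convert a fold change (higher group over lower group, assumed > 0)
    into a signed log2 fold change: positive when the gene is higher in
    metastases, negative when it is higher in primary tumors."""
    if fold_change <= 0:
        raise ValueError("fold change must be positive")
    if higher_in not in ("primary", "metastasis"):
        raise ValueError("higher_in must be 'primary' or 'metastasis'")
    magnitude = math.log2(fold_change)
    return magnitude if higher_in == "metastasis" else -magnitude


# Fold changes as quoted in the text, with the group in which each gene is elevated.
reported = {
    "FBN2": (7.6, "primary"),
    "MMP3": (38.2, "primary"),
    "FGF10": (5.7, "primary"),
    "GATA4": (3.9, "metastasis"),
    "SYT12": (3.4, "metastasis"),
}

for gene, (fold_change, group) in reported.items():
    print(f"{gene}: log2 FC = {signed_log2_fc(fold_change, group):+.2f}")
```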
Based on the top 500 differentially expressed genes between primary tumors and metastases of colorectal adenocarcinoma (Supplementary Table S8), we performed hierarchical clustering of all colorectal adenocarcinoma tumor samples in both datasets in order to determine the degree of transcriptomic similarity between clusters of primaries and metastases. We hypothesized a strong degree of separation between primary tumors and metastases. However, despite having generated five main clusters from the top 500 differentially expressed genes, we observed several primary colorectal adenocarcinoma tumors embedded within the clusters of metastases. Surprisingly, we also noticed that metastases only appeared in two main clusters, hereafter called M1 and M2 (Fig. 3; Table 4), in both datasets. We aimed to confirm that the M1 and M2 clusters in the MCC and Consortium datasets were defined by similar underlying biology. M1/M2 cluster membership in each dataset was defined by the top 500 differentially expressed genes between primary tumors and metastases in an adjusted regression analysis in each dataset. Therefore, we developed a classifier trained on the MCC dataset that predicted the cluster membership of the Consortium metastases (Fig. 3E) based on pathway-level expression differences between the MCC M1 and M2 clusters (Table 4). Similar to the differential gene-expression analysis comparing primaries and metastases, inputs for the M1/M2 classifier also adjusted for age, sex, anatomic origin, and treatment exposure status. Based on the strong performance of our classifier (AUC = 0.905), we believe the M1 and M2 clusters in both
Figure 3.
Heat map visualization and volcano plots showing the top differentially expressed genes between primary tumors and metastases of colorectal adenocarcinomas in the MCC and Consortium datasets. A and B, Of the top 500 differentially expressed genes between primary tumors and metastases while adjusting for clinical variables, the top 25 genes are shown. Hierarchical clustering was performed, which revealed two main clusters of metastases in both datasets. For Consortium, we excluded from the differential expression analysis the 21 genes that were used to develop the anatomic origin and treatment status classifiers. In both the MCC and Consortium datasets, both clusters of metastases, named MCC-M1 or Consortium-M1 and MCC-M2 or Consortium-M2, include liver and lung metastases. Treatment status of each tumor is also denoted. C and D, Volcano plots displaying the top 500 differentially expressed genes. The x-axis shows the log2-fold change (FC), and the y-axis displays the −log10 of the P values, where all P values are <0.001. The most differentially expressed genes in primaries (negative log2 FC) and metastases (positive log2 FC) are highlighted in blue and red, respectively. E, An ROC curve showing the performance of the M1/M2 classifier predicting M1 and M2 status in the Consortium dataset based on MCC M1/M2 cluster membership is shown. F, Overlap between stem cell gene sets and the EMT gene set is highlighted.
Table 4. Differences in clusters of metastases in the MCC and Consortium datasets

Hallmark & C2.CP.Reactome pathways
C | Pathway | MCC dataset: Rank | MCC dataset: FDR | Consortium dataset: Rank | Consortium dataset: FDR | Direction
H | HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION | 1 | 1.16 × 10^-22 | 1 | 9.47 × 10^-22 | M1
H | HALLMARK_ALLOGRAFT_REJECTION | 2 | 1.78 × 10^-19 | 2 | 1.05 × 10^-19 | M1
H | HALLMARK_MYOGENESIS | 3 | 9.41 × 10^-16 | 7 | 4.93 × 10^-03 | M1
H | HALLMARK_E2F_TARGETS | 1 | 1.16 × 10^-22 | 4 | 9.54 × 10^-05 | M2
H | HALLMARK_MYC_TARGETS_V1 | 2 | 1.78 × 10^-19 | 3 | 4.69 × 10^-05 | M2
H | HALLMARK_MYC_TARGETS_V2 | 3 | 9.41 × 10^-16 | 1 | 5.61 × 10^-15 | M2
C2.CP.Reactome | REACTOME_IMMUNOREGULATORY_INTERACTIONS_BETWEEN_A_LYMPHOID_AND_A_NON_LYMPHOID_CELL | 1 | 1.03 × 10^-11 | 3 | 5.40 × 10^-12 | M1
C2.CP.Reactome | REACTOME_COLLAGEN_FORMATION | 2 | 1.83 × 10^-10 | 1 | 8.19 × 10^-13 | M1
C2.CP.Reactome | REACTOME_EXTRACELLULAR_MATRIX_ORGANIZATION | 3 | 1.83 × 10^-10 | 2 | 8.56 × 10^-13 | M1
C2.CP.Reactome | REACTOME_CHONDROITIN_SULFATE_DERMATAN_SULFATE_METABOLISM | 4 | 1.96 × 10^-09 | 25 | 9.86 × 10^-04 | M1
C2.CP.Reactome | REACTOME_GLYCOSAMINOGLYCAN_METABOLISM | 5 | 4.85 × 10^-09 | 18 | 3.18 × 10^-04 | M1
C2.CP.Reactome | REACTOME_INTEGRIN_CELL_SURFACE_INTERACTIONS | 6 | 7.08 × 10^-09 | 7 | 5.60 × 10^-07 | M1
C2.CP.Reactome | REACTOME_GENERATION_OF_SECOND_MESSENGER_MOLECULES | 7 | 1.76 × 10^-08 | 4 | 4.68 × 10^-09 | M1
C2.CP.Reactome | REACTOME_PHOSPHORYLATION_OF_CD3_AND_TCR_ZETA_CHAINS | 8 | 5.95 × 10^-08 | 9 | 3.03 × 10^-06 | M1
C2.CP.Reactome | REACTOME_PD1_SIGNALING | 9 | 6.04 × 10^-08 | 5 | 5.99 × 10^-08 | M1
C2.CP.Reactome | REACTOME_TRANSLOCATION_OF_ZAP_70_TO_IMMUNOLOGICAL_SYNAPSE | 10 | 6.10 × 10^-08 | 8 | 6.00 × 10^-07 | M1
C2.CP.Reactome | REACTOME_PLATELET_ACTIVATION_SIGNALING_AND_AGGREGATION | 11 | 1.68 × 10^-07 | 31 | 1.26 × 10^-03 | M1
C2.CP.Reactome | REACTOME_CHONDROITIN_SULFATE_BIOSYNTHESIS | 12 | 2.50 × 10^-07 | 38 | 2.45 × 10^-03 | M1
C2.CP.Reactome | REACTOME_DNA_REPLICATION | 1 | 1.03 × 10^-11 | 21 | 1.72 × 10^-02 | M2
C2.CP.Reactome | REACTOME_MITOTIC_M_M_G1_PHASES | 2 | 3.59 × 10^-11 | 25 | 2.03 × 10^-02 | M2
C2.CP.Reactome | REACTOME_G2_M_CHECKPOINTS | 3 | 2.48 × 10^-10 | 11 | 1.01 × 10^-02 | M2
C2.CP.Reactome | REACTOME_ACTIVATION_OF_THE_PRE_REPLICATIVE_COMPLEX | 4 | 3.94 × 10^-10 | 17 | 1.46 × 10^-02 | M2
C2.CP.Reactome | REACTOME_S_PHASE | 5 | 5.51 × 10^-10 | 71 | 6.76 × 10^-02 | M2
C2.CP.Reactome | REACTOME_ACTIVATION_OF_ATR_IN_RESPONSE_TO_REPLICATION_STRESS | 6 | 8.18 × 10^-10 | 20 | 1.66 × 10^-02 | M2
C2.CP.Reactome | REACTOME_DNA_STRAND_ELONGATION | 7 | 1.39 × 10^-09 | 63 | 5.91 × 10^-02 | M2
C2.CP.Reactome | REACTOME_TELOMERE_MAINTENANCE | 8 | 1.39 × 10^-09 | 12 | 1.01 × 10^-02 | M2
C2.CP.Reactome | REACTOME_DEPOSITION_OF_NEW_CENPA_CONTAINING_NUCLEOSOMES_AT_THE_CENTROMERE | 9 | 3.80 × 10^-09 | 31 | 2.81 × 10^-02 | M2
C2.CP.Reactome | REACTOME_SYNTHESIS_OF_DNA | 10 | 1.58 × 10^-09 | 72 | 6.76 × 10^-02 | M2
C2.CP.Reactome | REACTOME_CHROMOSOME_MAINTENANCE | 11 | 1.80 × 10^-09 | 13 | 1.04 × 10^-02 | M2
C2.CP.Reactome | REACTOME_G1_S_TRANSITION | 12 | 1.96 × 10^-09 | 52 | 4.79 × 10^-02 | M2

NOTE: The top 500 differentially expressed genes between primary colorectal adenocarcinoma and colorectal adenocarcinoma metastases in both datasets were used to perform hierarchical clustering of primary tumors and metastases. Two main clusters of colorectal adenocarcinoma metastases were found in each dataset, hereafter referred to as M1 and M2. The M1 and M2 clusters in each dataset were compared with each other to determine differences in pathway activity. Shown are the top pathways enriched in the M1 and M2 clusters from both datasets (FDR < 0.1). Spearman rank correlation shows similar overlap in pathway enrichment between the M1 and M2 clusters of the two datasets (H r = 0.66; C2 r = 0.56). All pathways examined are from the MSigDB gene set collections. C, collection in the MSigDB database; H, Hallmark collection; C2, Reactome collection. Pathway enrichment differences were determined after removal of pathways exhibiting tissue specificity for liver, lung, colon, and rectum tissues in each collection examined. In addition, all pathway enrichment analyses adjust for age, sex, treatment status, and anatomic origin. For the Consortium dataset, missing data on treatment exposure status and anatomic origin were imputed using gene expression–based classifiers.
datasets have similar biology. Therefore, we further investigated the pathway-level differences between the M1 and M2 clusters of metastases observed in both datasets.
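For readers less familiar with the AUC used to summarize classifier performance, the sketch below computes it as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative sample, with ties counted as one half. The labels and scores are hypothetical and unrelated to the M1/M2 classifier; only the definition of the metric is illustrated.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve computed by pairwise comparison:
    the fraction of (positive, negative) pairs in which the positive
    sample has the higher score, counting ties as 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical example: 1 = one cluster, 0 = the other, with predicted scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.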
Subtypes of colorectal adenocarcinoma metastases in the lung and liver
We compared enrichment of pathways in the MSigDB hallmark and C2.CP.REACTOME collections between the M1 and M2 clusters of metastases found in both the MCC and Consortium datasets. We restricted the analysis to these two collections both to assist with interpretation of results and to avoid redundant enrichment of functional pathways (results for the complete C2 collection can be found in Supplementary Table S12). Tumors in the M2 cluster primarily exhibited a proliferative phenotype with increased MYC target activity (HALLMARK_MYC_TARGETS_V1; FDRMCC = 1.78 × 10^-19; HALLMARK_MYC_TARGETS_V2; FDRMCC = 9.41 × 10^-16) and E2F target activity (HALLMARK_E2F_TARGETS; FDRMCC = 1.16 × 10^-22). Tumors in the MCC M1 cluster primarily exhibited an inflammatory and immune-escape phenotype (Table 4; Supplementary Table S12). Notably, pathway enrichment differences showed not only an innate immune response (REACTOME_INNATE_IMMUNE_SYSTEM; FDRMCC = 4.91 × 10^-04) in the MCC M1 cluster, but also a very strong adaptive immune response (REACTOME_ADAPTIVE_IMMUNITY; FDRMCC = 2.53 × 10^-03), defined by T-cell infiltration (Table 4; Supplementary Table S12) in M1 metastases, which is likely blunted by the tumor through expression of immune checkpoints such as PD1 (REACTOME_PD1_SIGNALING; FDRMCC = 6.04 × 10^-08). We examined the degree of overlap from pathway enrichment results when comparing M1 and M2 clusters in both the MCC and Consortium datasets using Spearman rank correlation, where rHallmark = 0.66 and rC2.CP.Reactome = 0.56. These results further highlighted the robustness of the M1 and M2 metastatic clusters found in each dataset. Of note, the immune-related differences found between the M1 and M2 clusters in the MCC dataset best predicted cluster membership of Consortium metastases (Fig. 3E). Overall, our results suggested there are two main types of colorectal adenocarcinoma metastases to the lung and liver: those that can be considered immune "hot" tumors and exhibit an inflammatory phenotype, and those that can be considered immune "cold" tumors that are not characterized by inflammation, but rather by canonical MYC and E2F signaling with a proliferative signature.
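The Spearman rank correlation used above to compare pathway enrichment results across datasets can be sketched as follows. The code ranks each list (using average ranks for ties) and then computes the Pearson correlation of the ranks. The per-pathway enrichment statistics are hypothetical, and which statistic was actually correlated in the study is not specified here, so the inputs should be read as placeholders.

```python
import math


def average_ranks(values):
    """Rank values from smallest (rank 1) to largest, assigning tied
    values the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical enrichment statistics for the same five pathways in two datasets.
mcc_stats = [5.0, 3.0, 1.0, -2.0, -4.0]
consortium_stats = [4.0, 1.5, 2.0, -3.0, -1.0]
rho = spearman(mcc_stats, consortium_stats)
print(round(rho, 2))
```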
Consensus molecular subtype classification of colorectal lung and liver metastases
Due to the observation of canonical oncogenic signaling in colorectal adenocarcinoma metastases, we assessed whether colorectal adenocarcinoma metastases are enriched for a particular consensus molecular subtype (CMS). CMS is one of the most robust gene-expression–based colorectal adenocarcinoma classification systems with known prognostic implications (22). There are four main CMS groups, CMS1-CMS4, as well as the CMS_NA group, which consists of tumor samples that cannot be classified as CMS1-CMS4. Samples in the CMS_NA group are thought to contain properties of multiple CMS groups and, as such, CMS_NA is not considered to be a distinct CMS (22). We implemented the CMS classifier (22) in order to determine if metastases of colorectal adenocarcinoma exhibit a propensity to be in a specific CMS group. In both the MCC and Consortium datasets, metastases were never classified as CMS3, suggesting subtype exclusivity. In addition, implementation of logistic regression models (Supplementary Fig. S8), which adjust for age, sex, anatomic origin, and treatment exposure status of the tumor, showed that the odds of a tumor being a metastasis rather than a primary tumor are 2.3 times as high among CMS2 tumors as among CMS4 tumors (MCC dataset: odds ratio 2.30, 95% confidence interval 1.40–3.82, P < 0.001). Furthermore, CMS2 (MCC: 36.4%, Consortium: 38.3%) and CMS4 (MCC: 44.0%, Consortium: 45.2%) appeared to be the dominant subtypes found in colorectal adenocarcinoma metastases compared with primary tumors. In addition, 86.6% and 85.7% of metastases in the M1 clusters were CMS4 and 63.4% and 51.9% of metastases in the M2 clusters were CMS2 in the MCC (Fisher exact test; P < 2.2 × 10^-16) and Consortium (Fisher exact test; P = 2.5 × 10^-05) datasets, respectively. Logistic regression modeling applied to the MCC dataset further supported the association between M1 metastases and the CMS4 group and M2 metastases and the CMS2 group (Supplementary Table S13).
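To make the odds ratio interpretation concrete, the sketch below computes an unadjusted odds ratio and a Wald 95% confidence interval from a 2 × 2 table. This is a simplification: the odds ratio reported above comes from logistic regression models adjusted for age, sex, anatomic origin, and treatment exposure, whereas this example has no covariates. The counts are hypothetical and chosen only to give a round result.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: metastases in group 1    b: primaries in group 1
    c: metastases in group 2    d: primaries in group 2
    Returns (odds_ratio, lower, upper); z = 1.96 gives a 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: group 1 = CMS2 tumors, group 2 = CMS4 tumors.
or_value, ci_low, ci_high = odds_ratio_wald(a=40, b=60, c=25, d=75)
print(f"OR = {or_value:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```

An interval that excludes 1 indicates that the odds of being a metastasis differ between the two groups at the chosen confidence level.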
Adaptations to distal tissue sites observed within primary colorectal adenocarcinoma tumors
As metastases are likely to have adapted to the microenvironments of their sites of metastasis prior to dissemination (38), we aimed to determine if these adaptations could already be detected in primary tumors of patients who later go on to develop lung or liver metastases. We compared primary colorectal adenocarcinoma tumors from patients who developed lung metastases (n = 18) to primary colorectal adenocarcinoma tumors from patients who developed liver metastases (n = 48) while adjusting for age, sex, stage 4 disease status, tumor anatomic origin, and treatment exposure status. Specifically, we looked for pathways with high lung or liver-specific tissue weights when comparing primary colorectal adenocarcinoma tumors to one another (Supplementary Table S14). In primary tumors from patients who went on to develop liver metastases, we found lipid digestion (REACTOME_LIPID_DIGESTION_MOBILIZATION_AND_TRANSPORT; FDRMCC = 6.96 × 10^-02), lipid transport (REACTOME_CHYLOMICRON_MEDIATED_LIPID_TRANSPORT; FDRMCC = 7.28 × 10^-02), and adipogenesis (STEGER_ADIPOGENESIS_UP; FDRMCC = 6.28 × 10^-03) pathways to be enriched. In primary tumors from patients who went on to develop lung metastases, we found enrichment of interferon alpha response pathways (HALLMARK_INTERFERON_ALPHA_RESPONSE; FDRMCC = 7.17 × 10^-03 and MOSERLE_IFNA_RESPONSE; FDRMCC = 1.21 × 10^-03), which are known to modulate lung inflammation.
Discussion
A tissue-sensitive approach for determining features of metastases
Our tissue-sensitive approach allows the comparison of metas-tases located in different host tissue sites without the need forsampling and transcriptomic profiling of normal host tissues. Forthe comparative analyses of primary tumors and metastases, atissue-sensitive approach supports pooling of metastatic cancersfrom multiple tissue sites, which both improves statistical powerand helps elucidate the common phenotype of metastases from asingle primary cancer type (Table 2). In addition, the use of gene-based classifiers to impute tumor anatomic origin and treatmentexposure status allows future researchers to adjust for thesevariables in their analyses when clinical data are limited forretrospective studies.
Role of treatment exposure in comparing colorectal adenocarcinoma primary and metastatic tumors
We highlighted the role of chemotherapy and/or radiation exposure prior to tumor surgical resection as a confounder when comparing transcriptomic profiles of primary colorectal adenocarcinomas and metastatic lesions (Table 3). As a key example of the confounding effect of treatment status, hypoxia and angiogenesis activity were found to be enriched in treatment-exposed tumors and in metastases when treatment exposure status was not appropriately considered (Supplementary Tables S4 and S5). With appropriate adjustments for treatment exposure, hypoxia and angiogenesis genes were no longer differentially expressed between metastases and primaries. Similarly, we showed that treatment-exposed tumors were enriched in EMT, and once treatment exposure status was taken into consideration, we found that primary tumors are more likely to exhibit EMT enrichment than metastases of colorectal adenocarcinoma. Our findings, especially on GATA4 enrichment in metastases, align with previous studies that suggest that metastases undergo mesenchymal–epithelial transition (MET) to establish themselves at distal sites and that metastatic tumor cells with a MET phenotype are more likely to exhibit rapid proliferation compared with cells with an EMT phenotype (39, 40), which typically divide slowly. In addition, neoadjuvant chemotherapy has been strongly associated with a mesenchymal phenotype in both primary tumors and metastases and is therefore known to affect colorectal adenocarcinoma CMS classification (10, 41), which corroborates our findings on the role of treatment when comparing primary tumors and metastases of colorectal adenocarcinoma.
Characteristic features of metastases of colorectal adenocarcinoma
Compared with colorectal adenocarcinoma primary tumors, lung and liver metastases tended to be more differentiated with reduced EMT activity. Metastases exhibited a shift toward glycolysis and were also enriched in MYC target pathways and the downstream effects of MYC such as increased proliferation and global upregulation of transcription and translational cellular machinery (42). HIF targets were also found to be enriched; indeed, oncogenic MYC is known to collaborate with HIF to induce metabolic alterations such as increased glycolysis (Warburg effect; refs. 43, 44). In particular, HIF1α expression is required for MYC-induced proliferation and anchorage-independent growth.
Previous studies in melanoma suggest genetic stability appears to be necessary for the development of metastases (45). The observation of activated DNA-repair pathways in metastases suggests that a similar metastatic program may be at play in colorectal adenocarcinoma. These results are corroborated by our CMS analysis, where genetically unstable subtypes such as CMS1 and CMS3 are almost nonexistent among metastases of colorectal adenocarcinoma (22). CMS classification results show that metastases are more likely than primary tumors to be CMS2, with CMS4 as the reference. Although CMS4 has previously been associated with advanced stages (III and IV) of disease, CMS2 was not previously associated with advanced disease. CMS2 is characterized by epithelial differentiation and strong upregulation of MYC and WNT signaling (22). This is supported by our gene and pathway enrichment results, which showed metastases exhibit lower EMT activity, were more likely to be differentiated compared with primary tumors, and were enriched in MYC signaling and its downstream proliferative pathways. Furthermore, the lack of CMS3 metastases in both the MCC and Consortium datasets was particularly intriguing and raises the question of whether the metabolic and genomic features of CMS3 are incompatible with metastasis. Future studies using paired primaries and metastases are warranted to address this question.
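The statement that metastases are more likely than primaries to be CMS2 relative to CMS4 is an odds-ratio comparison. The sketch below shows how such a ratio and an approximate 95% interval can be computed from a 2x2 table. It is an unadjusted illustration only: the counts are hypothetical and are not taken from the study, the function name is an assumption, and the study's own estimates may come from an adjusted model.

```python
import math


def odds_ratio_with_ci(met_cms2, met_cms4, prim_cms2, prim_cms4, z=1.96):
    """Odds ratio of CMS2 (vs. CMS4) in metastases relative to primaries, with a Wald interval."""
    odds_ratio = (met_cms2 * prim_cms4) / (met_cms4 * prim_cms2)
    # Standard error of log(OR) from the four cell counts (Woolf's method).
    se_log_or = math.sqrt(1 / met_cms2 + 1 / met_cms4 + 1 / prim_cms2 + 1 / prim_cms4)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only; they are not taken from the study.
HYPOTHETICAL_COUNTS = {"met_cms2": 40, "met_cms4": 20, "prim_cms2": 100, "prim_cms4": 100}

if __name__ == "__main__":
    estimate, lower, upper = odds_ratio_with_ci(**HYPOTHETICAL_COUNTS)
    print(f"OR = {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An odds ratio above 1 with an interval that excludes 1 would indicate that CMS2 is over-represented among metastases relative to CMS4.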
Many metastases are also thought to contain cancer stem cells, which can drive drug resistance and are often associated with poor prognosis and survival. We found potential cancer stem cell activity was upregulated in metastases of colorectal adenocarcinoma, suggesting that cancer stem cell features exist in metastases independent of treatment exposure and EMT activity. Our results also suggest that cancer stem cell–like features are not exclusively found in EMT-high tumors as primary colorectal adenocarcinoma tumors exhibited higher EMT activity compared with metastases, but cancer stem cell gene sets were found to be enriched in metastases, which exhibited lower EMT activity. This agrees with our gene-level results showing increased GATA4 expression in metastases, which are thought to undergo MET at distal organs and therefore should exhibit epithelial features compared with primary tumors.
Identification of two main phenotypes of colorectal adenocarcinoma metastases
We identified two main groups of metastases based on transcriptomic features. When comparing the two groups of metastases to one another, we found the first group (M1) was characterized primarily by inflammation featuring adaptive immune system responses, immune evasion pathways (e.g., PD1 signaling; refs. 46, 47), and lymphocytic cell-mediated immunity (Table 4; Supplementary Table S12). The second group (M2) was characterized by cell proliferation and MYC signaling (Table 4; Supplementary Table S12). Moreover, the enrichment of EMT activity found in both the M1 cluster and post-treatment metastases and the enrichment of MYC activity in pretreatment metastases and the M2 cluster (Table 4) suggest these metastatic phenotypes may be influenced by treatment exposure. Nevertheless, the M1 cluster exhibits very strong activation of inflammatory and immune response pathways, and this immune phenotype appears to be the defining feature of the M1 clusters in both datasets. This immune phenotype was not observed in post-treatment metastases. Therefore, it is not clear if treatment exposure can help drive metastases to specific phenotypes. However, our results are consistent with previous research in humans and mouse models, which has suggested metastases fall into two main subtypes: those characterized by EMT and inflammation signatures and those characterized by proliferation (48, 49). Recent work in melanoma (50) has shown "cold" metastases, which do not respond to immunotherapy and which are enriched in a T-cell exclusion program, are characterized by MYC signaling and E2F targets. As our work has potentially characterized two phenotypes of metastases, one of which is also characterized by MYC and E2F proliferation signaling, we believe these metastatic phenotypes can inform immunotherapy treatment decisions for colorectal adenocarcinoma as well.
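To make the idea of two transcriptomic groups concrete, the sketch below splits samples into two clusters from a pair of pathway activity scores. It is an illustration under stated assumptions rather than the clustering procedure used in the study: the two-score representation (an immune score and a proliferation score), the example values, and the choice of a plain two-cluster k-means are all hypothetical.

```python
from statistics import mean


def squared_distance(a, b):
    """Squared Euclidean distance between two score vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def two_means(points, n_iterations=20):
    """Split points into two clusters with Lloyd's algorithm (k = 2)."""
    # Deterministic start: the first point and the point farthest from it.
    centroids = [points[0], max(points, key=lambda p: squared_distance(p, points[0]))]
    labels = [0] * len(points)
    for _ in range(n_iterations):
        # Assign each sample to its nearest centroid.
        labels = [min((0, 1), key=lambda k: squared_distance(p, centroids[k])) for p in points]
        new_centroids = []
        for k in (0, 1):
            members = [p for p, label in zip(points, labels) if label == k]
            # Keep the old centroid if a cluster ends up empty.
            new_centroids.append(tuple(mean(dim) for dim in zip(*members)) if members else centroids[k])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical (immune score, proliferation score) pairs, one per metastasis.
EXAMPLE_SCORES = [(2.1, 0.3), (1.8, 0.5), (2.4, 0.2), (0.4, 1.9), (0.2, 2.2), (0.5, 1.7)]

if __name__ == "__main__":
    labels, centroids = two_means(EXAMPLE_SCORES)
    print(labels)
    print(centroids)
```

With these example values, the first three samples form an immune-high cluster (analogous to M1) and the last three form a proliferation-high cluster (analogous to M2). Real expression data would need many more features, normalization, and a check of cluster stability across datasets.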
Limitations and future directions
Though we describe a novel method for transcriptomic comparative analyses between primary tumors and metastases of colorectal adenocarcinoma, this study should be considered within the context of its limitations. While this study compares primary colorectal adenocarcinoma tumors to liver and lung colorectal adenocarcinoma metastases, it does so with a limited set of matched (n = 15) primary and metastatic tumor samples from the same individuals. Moreover, our analysis comparing primary tumors from patients who go on to develop lung or liver metastases suggests some adaptations to the distal site of metastasis can already be observed in the primary tumor. These adaptations may be lost when implementing our tissue-sensitive adjustments for comparing primary tumors and metastases, and this could potentially produce false-negative results. Future studies should examine larger cohorts of matched samples, where available, to explore features of metastatic progression and drug resistance. In addition, although this study captured and appropriately adjusted for chemotherapy and radiation exposure, due to data limitations, it did not capture specific features of treatment, such as radiation dose, duration of treatment, and drug class of the chemotherapeutic agents. The use of classifiers to impute anatomic origin and treatment status, though imperfect and requiring significant methodological improvement, allows future research on metastases with limited clinical information to be performed while appropriately adjusting for potential confounders. With regard to the M1/M2 clusters, we show replicability across datasets at the pathway level but acknowledge the limitations of M1/M2 cluster replicability across datasets at the gene level due to underlying differences between the clinical characteristics of each dataset, as these differences inform gene-level M1/M2 cluster membership. Lastly, the prognostic differences between subtypes of metastases should be carefully explored. In particular, the EMT inflammatory group of metastases can be better characterized to understand immune-escape mechanisms used by metastases.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Authors' Contributions
Conception and design: Y. Kamal, S.L. Schmit, C.I. Amos, H.R. Frost
Development of methodology: Y. Kamal, C.I. Amos, H.R. Frost
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): Y. Kamal, S.L. Schmit
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): Y. Kamal, S.L. Schmit, C.I. Amos, H.R. Frost
Writing, review, and/or revision of the manuscript: Y. Kamal, S.L. Schmit, H.J. Hoehn, C.I. Amos, H.R. Frost
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): H.J. Hoehn
Study supervision: S.L. Schmit, H.J. Hoehn, C.I. Amos, H.R. Frost
Acknowledgments
The authors are grateful for the financial support from research grants 5T32LM012204-03 NIH-NLM, 1K01LM012426 NIH-NLM, and NCI Cancer Center Support Grant 5P30 CA023108-37 to the Norris Cotton Cancer Center. This work was supported in part by Moffitt's Total Cancer Care Initiative, the Collaborative Data Services Core and the Biostatistics and Bioinformatics Shared Resource at the H. Lee Moffitt Cancer Center and Research Institute, an NCI-designated Comprehensive Cancer Center, under grant number P30-CA076292. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH or the H. Lee Moffitt Cancer Center and Research Institute. We would also like to thank Drs. Eric A. Welsh and Michael J. Schell for their assistance in the acquisition, curation, and cleaning of the datasets used in this article. Partial support for this research was provided to support Dr. C.I. Amos's efforts through Cancer Prevention Research Institute of Texas (CPRIT) grant RR170048 and NIH/NCI grant U01CA196386. Dr. C.I. Amos is a CPRIT Research Scholar.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Received December 15, 2018; revised April 11, 2019; accepted June 20, 2019; published first June 25, 2019.
References
1. Chambers AF, Groom AC, MacDonald IC. Dissemination and growth of cancer cells in metastatic sites. Nat Rev Cancer 2002;2:563–72.
2. van der Geest LGM, Lam-Boer J, Koopman M, Verhoef C, Elferink MAG, de Wilt JHW. Nationwide trends in incidence, treatment and survival of colorectal cancer patients with synchronous metastases. Clin Exp Metastasis 2015;32:457–65.
3. Riihimäki M, Hemminki A, Sundquist J, Hemminki K. Patterns of metastasis in colon and rectal cancer. Sci Rep 2016;6:29765.
4. Glynne-Jones R, Wyrwicz L, Tiret E, Brown G, Rödel C, Cervantes A, et al. Rectal cancer: ESMO Clinical Practice Guidelines for diagnosis, treatment and follow-up†. Ann Oncol 2017;28:iv22–iv40.
5. Chan DLH, Segelov E, Wong RS, Smith A, Herbertson RA, Li BT, et al. Epidermal growth factor receptor (EGFR) inhibitors for metastatic colorectal cancer. Cochrane Database Syst Rev 2017;6:CD007047.
6. Le DT, Uram JN, Wang H, Bartlett BR, Kemberling H, Eyring AD, et al. PD-1 blockade in tumors with mismatch-repair deficiency. N Engl J Med 2015;372:2509–20.
7. Vignot S, Lefebvre C, Frampton GM, Meurice G, Yelensky R, Palmer G, et al. Comparative analysis of primary tumour and matched metastases in colorectal cancer patients: evaluation of concordance between genomic and transcriptional profiles. Eur J Cancer 2015;51:791–9.
8. Wang S, Zhang C, Zhang Z, Qian W, Sun Y, Ji B, et al. Transcriptome analysis in primary colorectal cancer tissues from patients with and without liver metastases using next-generation sequencing. Cancer Med 2017;6:1976–87.
9. Hartung F, Wang Y, Aronow B, Weber GF. A core program of gene expression characterizes cancer metastases. Oncotarget 2017;8:102161–75.
10. Trumpi K, Ubink I, Trinh A, Djafarihamedani M, Jongen JM, Govaert KM, et al. Neoadjuvant chemotherapy affects molecular classification of colorectal tumors. Oncogenesis 2017;6:e357.
11. Van Cutsem E, Cervantes A, Nordlinger B, Arnold D, ESMO Guidelines Working Group. Metastatic colorectal cancer: ESMO Clinical Practice Guidelines for diagnosis, treatment and follow-up. Ann Oncol 2014;25:iii1–iii9.
12. Fenstermacher DA, Wenham RM, Rollison DE, Dalton WS. Implementing personalized medicine in a cancer center. Cancer J 2011;17:528–36.
13. Rodriguez-Bigas MA, Boland CR, Hamilton SR, Henson DE, Jass JR, Khan PM, et al. A National Cancer Institute Workshop on Hereditary Nonpolyposis Colorectal Cancer Syndrome: meeting highlights and Bethesda guidelines. J Natl Cancer Inst 1997;89:1758–62.
14. Welsh EA, Eschrich SA, Berglund AE, Fenstermacher DA. Iterative rank-order normalization of gene expression microarray data. BMC Bioinformatics 2013;14:153.
15. Frost HR. Computation and application of tissue-specific gene set weights. Bioinformatics 2018;34:2957–64.
16. Liberzon A, Subramanian A, Pinchback R, Thorvaldsdottir H, Tamayo P, Mesirov JP. Molecular signatures database (MSigDB) 3.0. Bioinformatics 2011;27:1739–40.
17. Wu D, Smyth GK. Camera: a competitive gene set test accounting for inter-gene correlation. Nucleic Acids Res 2012;40:e133.
18. Liberzon A, Birger C, Thorvaldsdóttir H, Ghandi M, Mesirov JP, Tamayo P, et al. The molecular signatures database hallmark gene set collection. Cell Syst 2015;1:417–25.
19. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res 2015;43:e47.
20. Bufill JA. Colorectal cancer: evidence for distinct genetic categories based on proximal or distal tumor location. Ann Intern Med 1990;113:779–88.
21. Barbie DA, Tamayo P, Boehm JS, Kim SY, Moody SE, Dunn IF, et al. Systematic RNA interference reveals that oncogenic KRAS-driven cancers require TBK1. Nature 2009;462:108–12.
22. Guinney J, Dienstmann R, Wang X, de Reyniès A, Schlicker A, Soneson C, et al. The consensus molecular subtypes of colorectal cancer. Nat Med 2015;21:1350–6.
23. Valderrama-Treviño AI, Barrera-Mera B, Ceballos-Villalva JC, Montalvo-Javé EE. Hepatic metastasis from colorectal cancer. Euroasian J Hepatogastroenterol 2017;7:166–75.
24. McCormack PM, Burt ME, Bains MS, Martini N, Rusch VW, Ginsberg RJ. Lung resection for colorectal metastases. 10-year results. Arch Surg 1992;127:1403–6.
25. Shah SA, Haddad R, Al-Sukhni W, Kim RD, Greig PD, Grant DR, et al. Surgical resection of hepatic and pulmonary metastases from colorectal carcinoma. J Am Coll Surg 2006;202:468–75.
26. van der Maaten L, Hinton G. Visualizing Data using t-SNE. J Mach Learn Res 2008;9:2579–605.
27. Yoshihara K, Shahmoradgoli M, Martínez E, Vegesna R, Kim H, Torres-Garcia W, et al. Inferring tumour purity and stromal and immune cell admixture from expression data. Nat Commun 2013;4:2612.
28. Semenza GL. Hypoxia-inducible factor 1: oxygen homeostasis and disease pathophysiology. Trends Mol Med 2001;7:345–50.
29. Wacker I, Sachs M, Knaup K, Wiesener M, Weiske J, Huber O, et al. Key role for activin B in cellular transformation after loss of the von Hippel-Lindau tumor suppressor. Mol Cell Biol 2009;29:1707–18.
30. Benatti P, Gafà R, Barana D, Marino M, Scarselli A, Pedroni M, et al. Microsatellite instability and colorectal cancer prognosis. Clin Cancer Res 2005;11:8332–40.
31. Pece S, Tosoni D, Confalonieri S, Mazzarol G, Vecchi M, Ronzoni S, et al. Biological and molecular heterogeneity of breast cancers correlates with their cancer stem cell content. Cell 2010;140:62–73.
32. Ben-Porath I, Thomson MW, Carey VJ, Ge R, Bell GW, Regev A, et al. An embryonic stem cell-like gene expression signature in poorly differentiated aggressive human tumors. Nat Genet 2008;40:499–507.
33. Abolhassani A, Riazi GH, Azizi E, Amanpour S, Muhammadnejad S, Haddadi M, et al. FGF10: type III epithelial mesenchymal transition and invasion in breast cancer cell lines. J Cancer 2014;5:537–47.
34. Chen QK, Lee K, Radisky DC, Nelson CM. Extracellular matrix proteins regulate epithelial–mesenchymal transition in mammary epithelial cells. Differentiation 2013;86:126–32.
35. Zheng R, Blobel GA. GATA transcription factors and cancer. Genes Cancer 2010;1:1178–88.
36. Cheung ST, Leung KL, Ip YC, Chen X, Fong DY, Ng IO, et al. Claudin-10 expression level is associated with recurrence of primary hepatocellular carcinoma. Clin Cancer Res 2005;11:551–6.
37. Uhlen M, Zhang C, Lee S, Sjöstedt E, Fagerberg L, Bidkhori G, et al. A pathology atlas of the human cancer transcriptome. Science 2017;357. pii: eaan2507.
38. Cunningham JJ, Brown JS, Vincent TL, Gatenby RA. Divergent and convergent evolution in metastases suggest treatment strategies based on specific metastatic sites. Evol Med Public Heal 2015;2015:76–87.
39. Tsai JH, Donaher JL, Murphy DA, Chau S, Yang J. Spatiotemporal regulation of epithelial-mesenchymal transition is essential for squamous cell carcinoma metastasis. Cancer Cell 2012;22:725–36.
40. del Pozo Martin Y, Park D, Ramachandran A, Ombrato L, Calvo F, Chakravarty P, et al. Mesenchymal cancer cell-stroma crosstalk promotes niche activation, epithelial reversion, and metastatic colonization. Cell Rep 2015;13:2456–69.
41. Lee HH, Bellat V, Law B. Chemotherapy induces adaptive drug resistance and metastatic potentials via phenotypic CXCR4-expressing cell state transition in ovarian cancer. PLoS One 2017;12:e0171044.
42. Stine ZE, Walton ZE, Altman BJ, Hsieh AL, Dang CV. MYC, metabolism, and cancer. Cancer Discov 2015;5:1024–39.
43. Doe MR, Ascano JM, Kaur M, Cole MD. Myc posttranscriptionally induces HIF1 protein and target gene expression in normal and cancer cells. Cancer Res 2012;72:949–57.
44. Podar K, Anderson KC. A therapeutic role for targeting c-Myc/Hif-1-dependent signaling pathways. Cell Cycle 2010;9:1722–8.
45. Kauffmann A, Rosselli F, Lazar V, Winnepenninckx V, Mansuet-Lupo A, Dessen P, et al. High expression of DNA repair pathways is associated with metastasis in melanoma patients. Oncogene 2008;27:565–73.
46. Keir ME, Butte MJ, Freeman GJ, Sharpe AH. PD-1 and its ligands in tolerance and immunity. Annu Rev Immunol 2008;26:677–704.
47. Fife BT, Bluestone JA. Control of peripheral T-cell tolerance and autoimmunity via the CTLA-4 and PD-1 pathways. Immunol Rev 2008;224:166–82.
48. Robinson R, Wu Y-M, Lonigro J, Vats P, Cobain R, Everett J, et al. Integrative clinical genomics of metastatic cancer. Nature 2017;548:297–303.
49. Bakhoum SF, Ngo B, Laughney AM, Cavallo J-A, Murphy CJ, Ly P, et al. Chromosomal instability drives metastasis through a cytosolic DNA response. Nature 2018;553:467–72.
50. Jerby-Arnon L, Shah P, Cuoco MS, Rodman C, Su M-J, Melms JC, et al. A cancer cell program promotes T-cell exclusion and resistance to checkpoint blockade. Cell 2018;175:984–97.
Yasmin Kamal, Stephanie L. Schmit, Hannah J. Hoehn, et al. Transcriptomic Differences between Primary Colorectal Adenocarcinomas and Distant Metastases Reveal Metastatic Colorectal Cancer Subtypes. Cancer Res 2019;79:4227-4241. Published OnlineFirst June 25, 2019.
Updated version: access the most recent version of this article at doi:10.1158/0008-5472.CAN-18-3945
Supplementary material: access the most recent supplemental material at http://cancerres.aacrjournals.org/content/suppl/2019/06/25/0008-5472.CAN-18-3945.DC1