35
1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan 1,2 , Joseph Cursons 2,3,4 , Soroor Hediyeh-Zadeh 2 , Erik W. Thompson 1,5,6 , Melissa J. Davis* 2,7 1. The University of Melbourne Department of Surgery, St. Vincent’s Hospital, Parkville, VIC 3010, AUSTRALIA. 2. Division of Bioinformatics, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3010, AUSTRALIA 3. Systems Biology Laboratory, Melbourne School of Engineering, The University of Melbourne, Parkville, VIC 3010, AUSTRALIA. 4. ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, Melbourne School of Engineering, The University of Melbourne, Parkville, VIC 3010, AUSTRALIA. 5. Institute of Health and Biomedical Innovation and School of Biomedical Sciences, Queensland University of Technology, Kelvin Grove, QLD 4059, AUSTRALIA 6. Translational Research Institute, Wooloongabba, QLD, 4102, AUSTRALIA 7. Department of Biochemistry and Molecular Biology, Faculty of Medicine, Dentistry and Health, University of Melbourne, Parkville, VIC 3010, AUSTRALIA. * Corresponding Author(s): Melissa J. Davis – [email protected], T: +61393452597, F: +61393470852 Author emails: [email protected] [email protected] [email protected] [email protected] [email protected] Financial support: MF: Melbourne International Fee Remission Scholarship (MIFRS) and Melbourne International Research Scholarship (MIRS). JC: Australian Research Council Centre of Excellence in Convergent Bio-Nano Science and Technology (project number CE140100036). EWT: National Breast Cancer Foundation from the EMPathy Breast Cancer Network (CG-10-04), a National Collaborative Research Program. MJD: National Breast Cancer Foundation (NBCF-ECF-043-14). This manuscript includes about 5600 words, six figures, three tables, and three supplementary files including Supplementary Methods, Supplementary Results, and Supplementary Tables. Competing interests: The authors declare no competing interests. Availability of data and material: Data and information supporting the conclusions of this study are included in the paper and the additional files. The digital archive reproducing our results is also available at https://github.com/DavisLaboratory/mforoutan_tgfb_paper_2016 on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

1

A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4, Soroor Hediyeh-Zadeh2, Erik W. Thompson1,5,6, Melissa J. Davis*2,7

1. The University of Melbourne Department of Surgery, St. Vincent’s Hospital, Parkville, VIC 3010, AUSTRALIA.

2. Division of Bioinformatics, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3010, AUSTRALIA

3. Systems Biology Laboratory, Melbourne School of Engineering, The University of Melbourne, Parkville, VIC 3010, AUSTRALIA.

4. ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, Melbourne School of Engineering, The University of Melbourne, Parkville, VIC 3010, AUSTRALIA.

5. Institute of Health and Biomedical Innovation and School of Biomedical Sciences, Queensland University of Technology, Kelvin Grove, QLD 4059, AUSTRALIA

6. Translational Research Institute, Wooloongabba, QLD, 4102, AUSTRALIA 7. Department of Biochemistry and Molecular Biology, Faculty of Medicine, Dentistry and Health, University

of Melbourne, Parkville, VIC 3010, AUSTRALIA.

* Corresponding Author(s): Melissa J. Davis – [email protected], T: +61393452597, F: +61393470852

Author emails:

[email protected]

[email protected]

[email protected]

[email protected]

[email protected]

Financial support: MF: Melbourne International Fee Remission Scholarship (MIFRS) and Melbourne International Research Scholarship (MIRS). JC: Australian Research Council Centre of Excellence in Convergent Bio-Nano Science and Technology (project number CE140100036). EWT: National Breast Cancer Foundation from the EMPathy Breast Cancer Network (CG-10-04), a National Collaborative Research Program. MJD: National Breast Cancer Foundation (NBCF-ECF-043-14).

This manuscript includes about 5600 words, six figures, three tables, and three supplementary files including Supplementary Methods, Supplementary Results, and Supplementary Tables.

Competing interests: The authors declare no competing interests.

Availability of data and material: Data and information supporting the conclusions of this study are included in the paper and the additional files. The digital archive reproducing our results is also available at https://github.com/DavisLaboratory/mforoutan_tgfb_paper_2016

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 2: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

2

ABSTRACT:

Most cancer deaths are due to metastasis, and epithelial-to-mesenchymal transition (EMT) plays a central role

in driving cancer cell metastasis. EMT is induced by different stimuli, leading to different signaling patterns and

therapeutic responses. Transforming growth factor β (TGFβ) is one of the best-studied drivers of EMT, and

many drugs are available to target this signalling pathway. A comprehensive bioinformatics approach was

employed to derive a signature for TGFβ-induced EMT which can be used to score TGFβ-driven EMT in cells

and clinical specimens. Considering this signature in pan-cancer cell and tumour datasets, a number of cell

lines (including Basal B breast cancer and cancers of the central nervous system) show evidence for TGFβ-

driven EMT and carry a low mutational burden across the TGFβ signalling pathway. Further, significant

variation is observed in the response of high scoring cell lines to some common cancer drugs. Finally, this

signature was applied to pan-cancer data from The Cancer Genome Atlas to identify tumour types with

evidence of TGFβ-induced EMT. Tumour types with high scores showed significantly lower survival rates

compared to those with low scores, and also carry a lower mutational burden in the TGFβ pathway. The

current transcriptomic signature demonstrates reproducible results across independent cell line and cancer

datasets and identifies samples with strong mesenchymal phenotypes likely to be driven by TGFβ.

Implications: The TGFβ-induced EMT signature may be useful to identify patients with mesenchymal-like

tumours that could benefit from targeted therapeutics to inhibit pro-mesenchymal TGFβ signalling and disrupt

the metastatic cascade.

KEY WORDS:

Epithelial Mesenchymal Transition (EMT), TGFβ signalling pathway, TGFβ-induced EMT, pan-cancer analysis,

drug response

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 3: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

3

Introduction

Epithelial-to-mesenchymal transition (EMT) is a physiological process involved in development (1) and wound

healing (2) subverted during cancer metastasis, together with the reverse MET process (3). EMT is associated

with a migratory, invasive phenotype, and cancer stem cell (CSC) characteristics that aid metastasis, promote

drug resistance, and repress apoptosis (4). General EMT signatures have been derived using a variety of cell

lines and/or stimuli (5-7), with the aim of identifying a general molecular program underlying EMT. Our

previous work, however, suggests that EMT is a phenotypic endpoint that can be driven by different stimuli,

with important implications for drug responses (8).

The transforming growth factor β (TGFβ) family is a canonical driver of EMT (9). Mutations in the TGFβ

pathway can promote cell invasiveness and motility, and reduced TGFβ type II receptor (TGFΒR2) expression

has been reported in different cancer types (10). In various cancer cell lines active TGFβ signalling induces the

pro-mesenchymal transcription factors SNAI1 and SNAI2, which in turn suppress expression of E-cadherin

(CDH1), resulting in loss of cell-cell junctions (11). TGFβ controls important EMT markers through both

canonical and non-canonical signalling in different cancers (12). Note that EMT is not only regulated through

transcription, but is influenced and maintained through interconnected changes in epigenetics (13), micro-

RNAs (14), long non-coding RNAs (15) and protein synthesis (16,17).

In a clinical context, it is important to note that TGFβ can play a dual role in human cancer, promoting

metastasis or acting as a tumour suppressor (18). To some extent this parallels contrasting effects of TGFβ on

epithelial cells where it inhibits proliferation, and mesenchymal cells where it is stimulatory (19). Although

high tumour and circulatory levels of TGFβ are associated with poor prognosis across multiple cancers, TGFβ

levels alone are not sufficient to identify patients for TGFβ inhibitor therapies due to these suppressor effects

(20). Given the importance of TGFβ-induced EMT in tumour progression (21), we sought to identify a signature

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 4: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

4

which provides evidence of this specific phenotypic program, and thus may better identify patients where

cancer progression is driven by TGFβ. We have applied transcript meta-analysis methods, using data from a

range of cancer cell lines with TFGβ stimulation and verified EMT. To our knowledge this is the first time that

using comprehensive meta-analysis methods, a transcriptional signature of EMT specifically associated with

TGFβ has been obtained. It should be noted that this signature captures the change in gene expression caused

by TGFβ-induced EMT, and does not necessarily indicate further increases in TGFβ signalling.

We have assessed this signature using a wide range of cancer cell line and clinical tumour data. Specifically, we

applied single-sample gene set analysis methods to obtain TGFβ-EMT scores (TES) and identify cancer cell lines

and tumour samples with evidence of TGFβ-induced EMT. We illustrate a novel approach of comparing

information from multiple signatures, such as more general epithelial and mesenchymal signatures, to

examine specific molecular drivers of EMT across a large number of cell lines and clinical samples. Finally, we

demonstrate that TES is correlated with a differential response to certain drugs, and we show that higher TES

is associated with lower overall survival outcome in pan-cancer data. Our signature may be useful to identify

candidate patients for TGFβ-signalling inhibitor therapies (22), and the approach of gene set scoring is

particularly promising for the strategic design of personalised cancer treatments.

Methods

All computational and statistical analyses were performed using R (versions 3.1.1, 3.2.2 & 3.2.4) and

Bioconductor (version 3.0). Further details are given in the Supplementary methods.

Microarray data for the gene expression signature of TGFβ-induced EMT

Publically available microarray data were collected from Gene Expression Omnibus (GEO) using the NCBI

portal (http://www.ncbi.nlm.nih.gov/geo/), based upon the following criteria: (i) microarray experiments must

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 5: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

5

have been performed using Affymetrix, Agilent or Illumina platforms; (ii) data must be collected from human

cancer cell lines; (iii) there must be evidence of TGFβ-induced EMT through morphological or phenotypic

assays, and expression assays for EMT markers (e.g. VIM, CDH1 etc.); (iv) datasets must include biological

replicates.

Data quality was assessed using methods provided within the R/Bioconductor package Simpleaffy (23) for

Affymetrix platforms, and using unsupervised methods for other platforms to exclude all low quality samples

from the analysis. All ten datasets used in this study were of high quality.

Obtaining our TGFβ-EMT signature

To obtain our TGFβ-EMT signature, we applied two meta-analysis techniques. In the first technique, we

performed ‘Product of Rank’ (PR) metric which has been shown to have better performance characteristics

(biological association, stability and robustness) compared to the other available methods for identifying

genes that are differentially expressed in all of the datasets (24). In the second method, we integrated all the

ten datasets after RMA normalisation, and applied SVA (25) and ComBat (26) to estimate and remove batch

effects, and then performed differential expression analysis using limma.

Public cancer cell line and patient tumour data

Cell line data used in this study were derived from NCI-60 (27), CCLE (Cancer Cell Line Encyclopedia (28)) ,

COSMIC (29) pan-cancer cell lines, as well as the Neve et al. (2006) (30) and Heiser et al. (2012) (31) breast

cancer cell lines. We also used patients’ data from TCGA including breast cancer microarray data as well as the

pan-cancer RNASeq data. All of the datasets were scored against the signature, and COSMIC, NCI60 and Heiser

were further analysed for drug response assessment. Batch analysis of the large datasets is illustrated in

Supplementary Results, Figure S16-S27.

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 6: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

6

Single-sample scoring methods applied on the cell line and tumour data

Using the GSVA package in R/Bioconductor, we applied ssGSEA and GSVA methods in order to obtain theTES

for samples in each dataset. We additionally defined a simple single-sample scoring method in order to

examine whether the TES obtained by ssGSEA and GSVA is associated with batches in the large datasets.

Results

Data for TGFβ-induced EMT were identified across the literature and integrated

To identify a gene expression signature associated with TGFβ-induced EMT, we collected microarray data from

GEO (32) for ten studies that examined TGFβ-induced EMT across a range of epithelial cell lines (Table 1). Six

differentially expressed genes (DEGs) were shared across all studies (COL1A1, FN1, ADAM19, SERPINE1,

TAGLN, and PMEPA1), all of which are associated with EMT and TGFβ stimulation.

A gene signature for TGFβ-induced EMT was obtained using meta-analysis methods

When using list overlap to derive gene signatures the results often have low statistical power due to the small

number of samples within each study. Given that only six DEGs were identified across all data sets we next

applied two meta-analysis techniques and took the union of the resulting gene sets for our signature.

First, within each study, genes were ranked by their limma (33,34) t-statistic and the product of ranks (PR)

(24,35) was calculated and compared to a permuted null distribution (details in Methods). This approach

identified 186 up- and 82 down-regulated genes implicated in TGFβ-induced EMT (Figure 1A).

Next, we obtained DEGs by combining the ten data sets after RMA normalization. Although the merged data

had a larger number of samples, providing us with greater statistical power, we needed to minimise batch

effects between the different studies. The R package SVA (25) was used with the SVA and ComBat (26)

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 7: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

7

algorithms to estimate and remove batch effects within our integrated data. Using limma (33,34) with batch

corrected data identified 121 up- and 74 down-regulated genes (Supplementary Table S1).

Across both signatures, there were 165 common genes and 301 genes in total. The combined list was used as

our signature of TGFβ-induced EMT (108 down- and 193 up-regulated genes, Supplementary Table S1).

The TGFβ-induced EMT gene signature had stimulus-dependent overlap with reported general EMT

signatures

A number of transcript abundance signatures have been identified for EMT, including reports by Taube et al.

(2010; ngenes = 250) (5), Groger et al. (2012; ngenes = 131) (6), Tan et al. (2014; ngenes = 218) (7), Cursons et al.

(2015; ngenes = 206; Supplementary Table S2) (8), and Du et al. (2016; ngenes= 571) (36). Comparing our TGFβ-

induced EMT signature against these previous studies, only three genes were shared across all: vimentin

(VIM), keratin-15 (KRT15), and tetraspanin-1 (TSPAN1), however, comparing each pair of signatures, the

number of shared genes was higher (Supplementary Results, Figure S1).

Low gene overlap across all signatures may reflect differences in EMT stimuli, and experimental and

computational methods. Our signature was derived using ten datasets that specifically examined TGFβ-

induced EMT using multiple cancer cell lines (Table 1), and these data were subjected to consistent pre-

processing and normalisation methods to minimise artefacts. Conversely, Cursons et al. derived their signature

using EGF- or hypoxia-induced EMT within basal breast cancer cell lines (8). Taube et al. used a number of

different stimuli (including TGFβ) to induce EMT, however, they only used one cell line and thus their signature

may show bias towards the genetic/phenotypic background of HMLE cells (5). Groger et al. selected genes

differentially expressed in at least 10 (of 18) data sets (6), whereas our hypothesis setting identified genes

differentially expressed across all datasets. The signature recently reported by Du et al. (36) was obtained

using only two cancer cell lines, with EMT induced by TGFβ and Oncostatin-M [OSM], and their EMT signature

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 8: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

8

was defined simply by taking the intersection of DEGs between the two cell lines. Finally, Tan et al. (7) applied

an alternative approach – rather than studying stimulated cell lines, they use known EMT markers to classify

epithelial or mesenchymal cell lines and tumours, and then derived their signature.

Supporting our suggestion that stimulus-specific differences underlie the differences in the above signatures,

our TGFβ-EMT signature had the greatest overlap with Du et al. (using TGFβ and OSM) and the lowest overlap

with the EGF/hypoxia-induced EMT signature from Cursons et al (Figure 1B). In general, the Cursons et al.

signature shared the least genes with other studies (Supplementary Results, Figure S1).

There is evidence of active TGFβ signalling in CD44+ breast cancer cells and embryonic stem cells (37,38), and

examining overlap between our TGFβ-EMT signature and the metastatic CD44+ breast cancer signature

derived by Shipitsin et al. (2007, ngenes= 952) (37) identified 49 common genes (greater than expected, p-value

≤ 1e-5; Table S3). Similarly, Blick et al. (39) identified 55 ‘Basal B Discriminator’ genes that distinguish Basal B

cell lines from both Basal A and Luminal subgroup breast cancer cell lines (37), and which shared eight genes

with our TGFβ-EMT gene signature (greater than expected, p-value < 2e-5): ANK3, ELF3, ERBB3, FBN1, INHBB,

TGFB1I1, MYO5C, and RAB25. Given the aggressive nature of these breast cancer subgroups, this overlap

suggests that TGFβ signalling may play a role in mediating clinical disease progression. In support of this, up-

regulated genes from our signature showed significantly higher expression values in Basal B and triple

negative breast cancer cell line subgroups, while down-regulated genes showed significantly lower expression

(Supplementary Results, Figure S2).

A wide range of cancer cell lines show graded variations in scoring metrics derived from the signature of

TGFβ-induced EMT

To identify cancer cell lines with evidence of TGFβ-induced EMT, we created a relative ‘TGFβ-EMT Score’ (TES)

which quantified the degree of concordance with our signature. Using both the ssGSEA (40) and GSVA (41)

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 9: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

9

single sample scoring methods, scores from the 108 down- and 193 up-regulated gene sets in our signature

were calculated separately and then summed (Supplementary methods) to obtain summed-ssGSEA (Figure 2

& 3) and summed-GSVA (Supplementary Results, Figure S3 & S4) scores. These were calculated for the

experimental data used to derive our TES (Figure 2A), the Neve et al. (2006) breast cancer cell line data

(nCellLines = 54; Figure 2B) (30), the NCI-60 data (nCellLines = 59; Figure 2C) (27,42), and the Cancer Cell Line

Encyclopaedia data (CCLE, nCellLines = 1036; Figure 2D, all scores given in Table S7-S11) (43,44). A comparison

between GSVA and ssGSEA scores is provided in Figure S5 (Supplementary Results).

Samples with a high TES show strong concordance with our signature, and are thus predicted to have

undergone some form of TGFβ-induced EMT. As expected, TGFβ-treated samples used to derive the signature

showed higher TES compared to the control samples (Figure 2A). All high TES breast cancer cell lines within the

Neve data were from the very aggressive basal B subgroup (Figure 2B), which has CSC-like characteristics and a

mesenchymal phenotype (30,39). Across the NCI60 (Figure 2C) and CCLE (Figure 2D) data very consistent

results were obtained for each cancer type. When cell lines were ranked by their TES and classified as “high

TES” (top 10%), “medium TES” (middle 10%) and “low TES” (bottom 10%), however, there appeared to be

some enrichment for different tissues of origin. Within the CCLE data, most high TES cell lines are derived from

the central nervous system, and there are a number of skin and bone cancer cell lines (Figure 2D). Conversely,

the majority of the low TES-cancer cell lines are from the large intestine, followed by lung and breast cancer

cell lines (Figure 2D). Considering the medium TES group, most cell lines are derived from hematopoietic and

lymphoid tissues and lung cancers (Figure 2D). Lists of cell lines with very high, very low, and medium TGFβ-

EMT scores are given in Supplementary Table S4.

The Neve, NCI-60, and CCLE data shared some common cell lines, and the TES calculated across these

independent studies was very consistent (Figure 2E & F), suggesting that our TGFβ-EMT signature is stable and

robust across different experimental systems.

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 10: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

10

High TES cell lines also show a high mesenchymal score

To quantify changes across the landscape of epithelial and mesenchymal phenotypes, we used both ssGSEA

(Fig. 3A & B) and GSVA (Supplementary Results, Figure S4) to derive epithelial and mesenchymal scores (ES

and MS, respectively) from the corresponding Tan et al. (2014) gene signatures (7). For TGFβ-EMT data used to

derive our signature, ES provided reasonable separation of control and TGFβ-treated samples, while MS gave

good separation (Figure 3A & B). For CCLE samples grouped by low, medium and high TES we extracted the

TES, ES and MS (Figure 3C-E). Low TES cell lines tend to have lower mesenchymal and higher epithelial scores,

while samples with a high TES have high mesenchymal and low epithelial scores. Interestingly, samples with

medium TES show a greater range of epithelial and mesenchymal scores (Figure 3D). When all CCLE cell lines

are considered together, there is a strong negative correlation between TES and epithelial score (Figure 3F, ρ =

-0.83, p-value < 2e-16), and positive correlation between TES and mesenchymal score (Figure 3G, ρ = 0.80, p-

value < 2e-16). As noted above, a number of stimuli can drive EMT and this raises the possibility that cell lines

with a low/moderate TES but relatively high MS represent those with a different molecular etymology

inducing EMT.

We also scored previously identified signatures of general EMT (Supplementary Results, Figure S6). Across the

CCLE data there was a high correlation between TES and scores derived from the Du (ρ = 0.97), Groger (ρ =

0.95), or Taube EMT signatures (ρ = 0.91), all of which used TGFβ as one of their stimuli. It should be noted,

however, that there were a number of samples, particularly with lower TES where these metrics diverged

(Supplementary Results, Figure S6). Conversely, our TES had almost no association (ρ = 0.25) with scores from

the Cursons et al. signature which was derived using EGF and hypoxia stimulation. This appears to support the

earlier observation of stimulus-dependent similarities and differences between EMT signature list overlap.

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 11: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

11

Cell lines with high TES tend to carry a lower mutation load within TGFβ signalling elements

We extracted gene names for all components of the KEGG and Reactome TGFβ signalling pathways (nGenes =

98; Supplementary Table S5) and for CCLE cell lines with a high, medium and low TES, we looked at the

frequency of mutations within these genes (Table 2). The TGFβ mutational load for each group was calculated

as the ratio of total mutations in TGFβ pathway genes to the number of cell lines in that group.

The high TES group contains a lower overall relative mutational load within TGFβ signalling pathway genes

(Table 2), and although mutations were observed for TGFΒ2 and TGFΒR2, other important upstream

regulators such as TGFΒ1, TGFB3 and TGFΒR1 were not mutated within high TES cell lines (Table 2).

Furthermore, in high TES cell lines 69% of mutations within TGFβ pathway components are not predicted to

change protein primary sequence (Table 3, details in Table S6). Conversely, in low TES cell lines only 32% of

TGFβ pathway mutations are not predicted to effect protein sequence (Supplementary Table S6).

More than half of the TGFβ component mutations in the high TES group occur within the 3’ UTR (53%), while

missense mutations (28%) and 5’ UTR mutations (13%) are also relatively common (Table 3). For the low TES

group, however, a large proportion of TGFβ mutations are missense (42%), or 3’ UTR (21%) mutations, while

frame shift deletions are more frequent (16%) compared to the high TES group (Table 3). When the mutation

frequency in TGFβ signalling genes was normalised against the total number of mutations in each cell line

there was no difference between the high TES and low TES groups, implying that lower TES is associated with

higher genomic mutational load.

High TES group responds differently to certain drugs

An association between EMT and general drug resistance has been reported (45,46), thus, we explored drug

efficacy for the high TES group of cell lines, which also show a relatively high mesenchymal score (Figure 3E).

We examined NCI-60 (27,42) and COSMIC (29) pan-cancer drug data and breast cancer cell line drug data from

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 12: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

12

Heiser et al. (31), and found samples with a high TES showed large, drug-specific differences in their

association with resistance (e.g. NSC 150412) or sensitivity (e.g. NSC 674493; Figure 4A).

For each drug across each data set, we calculated the Spearman’s correlation between cell line TES and GI50

or IC50 (Figure 4B for NCI60 data, further information in Supplementary Results, Figure S9-S11). The pan-drug

analysis shows a wide range of correlations between TES and GI50 suggesting there was no general drug

resistance associated with TGFβ-induced EMT, and reinforcing the observation of drug-specific effects.

Associations in our results suggest that high TES cell lines may have resistance against EGFR and AKT

inhibitors. In particular, the COSMIC pan-cancer data identifies: Afatinib (BIB2992), Gefitinib, and AKT inhibitor

VIII; for COSMIC breast cancer cell lines: Lapatinib and A443654, GSK2141795; and for the Heiser data: Sigma

AKT1/2 inhibitor. Conversely, samples with high TES appeared to show sensitivity to a number of drugs

inhibiting cell proliferation, mitosis and DNA synthesis. Within the COSMIC pan-cancer data, this included

NVPBEZ235, Temsirolimus, BX795, GSK269962A, and Cisplatin; while the COSMIC breast cancer and Heiser

data suggested sensitivity to NSC663284, Docetaxel, PF3814735, GSK650394, AZ628, and OSI906.

This is consistent with our previous results (8) showing that different EMT stimuli lead to differences in drug

response, and the work of others demonstrating that EMT is not universally associated with drug resistance

(7). It must be stressed that these results capture ‘population’ level, observational associations between TES

and drug response, and cannot be considered to be predictive of response, however they highlight the

importance of stimulus-specific EMT signatures when interpreting general trends for drug response data.

A wide range of cancer samples show graded variations in scoring metrics derived from the TGFβ-induced

EMT signature

A number of drugs targeting TGFβ pathway elements are in development for clinical applications (22) thus we

examined whether our signature could identify clinical samples with evidence of TGFβ-induced EMT. We used

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 13: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

13

ssGSEA to score the TCGA pan-cancer data (Figure 5A) and breast cancer data (Supplementary Results, Figure

S12) against our TGFβ-EMT signature, the Tan et al. (2014) tumour epithelial and mesenchymal signatures (7),

as well as the Du (36), Groger (6), Taube (5) and Cursons (8) EMT signatures. For breast cancer samples where

corresponding microarray data were available (nTumours= 571) we also calculated the TES, ES and MS, and found

a high correlation (Supplementary Results, Figure S13) suggesting that our results are robust to artefacts

associated with different detection platforms.

The TCGA pan-cancer mRNA sequencing data includes 9,756 clinical samples covering 34 cancer types from 29

tissues. The tumour TES of individual cancer types (Figure 5A) show a very similar distribution to the

corresponding cell line data (Figure 2C & D) - brain, bone, and skin cancer samples tend to display a high TES,

haematopoietic cancer samples have an intermediate TES, while colorectal cancers have a low TES.

Segregating pan-cancer tumours with a high (top 10%; nTumours = 976), moderate (middle 10%) or low (bottom

10%) TES and examining the epithelial and mesenchymal scores (Figure 5B-D), a high TES was associated with

tumours that have high mesenchymal and low epithelial scores. Similar results were observed for the TCGA

breast cancer data (Supplementary Results, Figure S12), although there was less agreement between TES and

MS for the breast cancer samples, suggesting that TGFβ may not be a common driver of mesenchymal

behaviour across all breast cancers.

Analysis of the TCGA pan-cancer mutation data showed that samples with high TES have relatively fewer

mutations in TGFβ signalling elements compared to samples with low TES, with fewer patients having at least

one mutation in the TGFβ signalling elements (35% vs. 47%). These differences, while more modest, are none-

the-less consistent with the results of our cell line analysis, demonstrating that even taking intra-tumour

heterogeneity into account, the genomic observations translate from cell lines to patient samples.

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 14: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

14

Across all patients there were significant (p-value < 2e-16) correlations between TES and MS (Supplementary

Results, Figure S8) for breast cancer (ρ= 0.57) and pan-cancer data (ρ= 0.53), and between TES and ES for

breast cancer (ρ= -0.44) and pan-cancer (ρ=-0.73) data. Examining other EMT signatures across the TCGA pan-

cancer data, our TES was highly correlated with scores from the Du (ρ= 0.93), Groger (ρ= 0.80), and Taube (ρ=

0.66) signatures, but not correlated with the Cursons’ EMT score (ρ= 0.07; Supplementary Results, Figure S7),

which intriguingly shows two populations with separate TES but a range of overlapping Cursons’ EMT scores.

These results again support the notion that signatures derived using similar stimuli are highly associated, while

EMT signatures derived using alternative stimuli are largely independent, highlighting the need for stimulus-

specific EMT signatures.

Given that stromal cells often have a more mesenchymal phenotype and modulate TGFβ signaling (47), it is

possible that tumour TES is predominantly derived from stromal cells within the tissue biopsy. To examine

this, we calculated the correlation between TES and the consensus measurements of purity estimation (CPE)

(48) for TCGA pan-cancer samples (Figure 5E). There was a weak but significant negative correlation (ρ= -0.20,

p-value <2e-16) between TES and tumour purity across all cancer types suggesting that lower purity (i.e. more

stromal contamination) is associated with a similar or higher TES. Considering each cancer type separately,

weak negative correlations were also observed between TES and tumour purity for all cancer types except

glioblastoma (ρ=0.003) and adrenocortical carcinoma (ACC, ρ=0.21; Table S12). The strongest negative

correlation was seen in colorectal cancers (ρ= -0.55) (Figure 5F). This is particularly interesting as colorectal

cancers tend to have a very low TES, thus, despite the observation that this tumour type is the most likely to

show an inflated TES due to stromal contamination, we still see low TES overall. Conversely, we see no

correlation between tumour purity and TES in glioblastoma (ρ= 0.003, Figure 5G), despite this tumour type

being amongst those with the highest TES on average. Additionally, we calculated the TES for normal and

TGFβ-stimulated fibroblasts (Supplementary Results, Figure S15) and found that the TES of TGFβ-stimulated

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 15: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

15

fibroblasts was only slightly higher than untreated samples, perhaps reflecting the specificity of our signature

towards EMT. Collectively, these results demonstrate that even though tumour purity does show some

influence on TES, the influence is weak, particularly in cancer types with a very high TES.

The TES effectively separates pan-Cancer patient cohort with significant differences in survival rates

Tumours with a high TES (nTumours= 976) from the TCGA pan-cancer data were associated with a significantly

lower survival rate when compared to samples with low TES (nTumours= 976; Figure 6A, HR = 0.6, p-value = 2e-

09). To investigate if this effect can be attributed to different survival rates for tumours that could be

separated by their relative TES, we compared the overall survival rates of all cancer types. As shown in Figure

5A, skin and bone tumours have a higher TES; however, these tumour types also have longer median survival

time (Figure 6B), suggesting that tumour-dependent differences in survival may not confound the apparent

TES-mediated differences in survival time (Figure 6A).

Although we observed a highly significant difference in overall survival for patients with low TES versus high

TES tumours across the TCGA pan-cancer data, this difference was not statistically significant within all cancer

types (Supplementary Results, Figure S14). Highly significant, poor survival rates were associated with high-

versus-low TES samples in brain (p-value= 2e-5), kidney (p-value= 3e-4), and lining of body cavity cancers (p-

value= 0.003). For cancers of adrenal gland, thyroid gland, eye, and bladder, smaller but significant (p-value <

0.05) differences were seen between survival rates of the low and high TES groups (Supplementary Results,

Figure S14). For other cancer types no significant differences were observed in survival rates of the patients

with high versus low TES tumours, however, the trend was similar as the high TES groups tended to have a

worse survival outcome. It should be noted, however, that white blood cell (p-value= 0.04), and head and neck

(p-value= 0.15) cancers showed the opposite association, as low TES was associated with poor survival.

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 16: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

16

Discussion

We have derived a transcriptional signature for TGFβ driven EMT and show consistent scoring results across

large sets of cell line and patient data. This analysis highlights basal B breast cancer cell lines, and clinical brain,

bone, and skin tumours as malignancies with relatively high TGFβ signalling activity, while colorectal and

endometrium cancers tend to show lower scores. The association our TES shows with epithelial and

mesenchymal scores suggests that TGFβ signalling contributes to EMT in numerous cancer cell lines (Figure

3C-G), and tumours (Figure 5B-D), however, there are interesting discrepancies. It is tempting to speculate

that samples with a high mesenchymal/general EMT score and low TES represent malignancies where

alternative drivers of EMT are active, and conversely, samples with a high TES but low general EMT scores are

more likely to display evidence of TGFβ driven EMT (Supplementary Results). This illustrates the advantage of

scoring multiple signatures associated with molecular phenotypes to gain a deeper understanding of tumour

biology and guide the development and application of targeted therapies. In this context, we note that Tan et

al. (7) combine their epithelial and mesenchymal scores into a single metric; however, samples with a specific

TES can show a range of ES or MS values (Figure 3F & G; Figure 5B-D), suggesting that summation into a single

metric may hide meaningful variation. The range of correlations between our TES and general EMT signature

scores appeared to reflect the relative contribution of TGFβ stimulation in the study design (Figure S6 & S7,

Supplementary Results), highlighting the need for more specific signatures of EMT, such as the one we present

here, that are matched to their underlying molecular etymology.

Examining mutation frequencies in genes of TGFβ signalling components, samples with a higher TES tended to

contain fewer mutations, and these were localised to non-coding regions. None of the 3’UTR, 5’UTR, or intron

mutations were predicted to cause protein sequence alterations, although some may influence miRNA

binding, alternative splicing, or transcript stability (49-51). Given the much greater number of mutations

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 17: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

17

within TGFβ signalling pathway genes for the low TES group, it is likely that many of these mutations tend to

repress TGFβ signalling activity. These data also suggest that for cells within the high TES group the TGFβ

signalling pathway tends to be intact, implying that aberrant signalling is not driven by loss of function

mutations in inhibitory pathway components, and indicating that these cells may respond to targeted inhibitor

therapies. In agreement with this, Park et al. recently used a novel TGFBR1 inhibitor, IN-1130, to successfully

block TGFβ-induced EMT in MCF10A and MDA-MB-231 breast cancer cell lines (52) which both have a

relatively high TES (Supplementary Table S8 and S9).

The effects of EMT on drug efficacy are contentious. Recent reports suggest that EMT can drive general drug

resistance (46,53), however our analysis shows that high TES and low TES groups respond differentially only

for specific drugs, more consistent with the observations of Tan et al. (2014) which examined response across

a spectrum of epithelial and mesenchymal phenotypes (7). We note a number of specific differences,

particularly for cancer types such as bone and skin that tend to have a higher TES. For example, Tan et al.

reported that more mesenchymal melanoma cell lines were resistant to GSK650394 (SGK1/SGK2 inhibitor),

while those with a more epithelial phenotype were resistant to Docetaxel. Our results showed no correlation

between melanoma cell line TES and IC50 values for GSK650394 (ρ= 0.02) and Docetaxel (ρ= -0.05), but

instead suggested that high TES cell lines tend to show Parthenolide resistance, and sensitivity to CGP60474

(CDK1/CDK2 inhibitor). Further, we found that bone cancer cell lines with high TES tended to be Epothilone

(microtubule inhibitor) resistant and Erlotinib (EGFR inhibitor) sensitive, while Tan et al. found little correlation

for the efficacy of these drugs. These observations concur with our previous work that demonstrated drug

responses were strongly influenced by the nature of the stimulus that generated EMT, and highlights the need

for stimulus-specific signatures such the one we present here.

We believe our TES can be used to inform a number of therapeutic options. The correlative associations

indicating resistance for several drugs in high TES cell lines raises the possibility that combination therapies

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 18: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

18

with TGFβ signalling inhibitors may increase sensitivity for these drugs. Further, a number of studies have

shown that chemotherapy-induced TGFβ signalling can promote a CSC phenotype and/or drug resistance. In

this context, combination therapy using TGFβ inhibitors and chemotherapeutic agents has been examined in

cell line and xenograft models (54-57) where they show promise (56-58). Finally, for high TES samples the

application of TGFβ inhibitors alone may be a potential treatment strategy. In support of this, antisense

TGFβ1/2 oligonucleotides (e.g. AP12009) are efficacious in the clinical treatment of patients with aggressive

gliomas (59), which have a high TES.

When working with complex tissue samples such as those derived from tumour biopsies, there is likely to be

at least some degree of stromal contamination. We show that for TCGA pan-cancer samples, tumour purity

(CPE) and TES have a weak negative correlation (Figure 5E-G) suggesting that higher TES values may be

associated with an increased fraction of stromal cells. The strongest negative association between TES and

tumour purity was observed for colorectal cancers suggesting that they may be sensitive to confounding

effects from tumour purity. Intriguingly, however, these tumour samples had a relatively low TES, while brain

cancers (e.g. GBM and LGG) which tend to show higher TES values often had greater sample purity, suggesting

that stromal contamination is not a dominant effect. We note that TGFβ mediates stromal-tumour

interactions and can drive tumour microenvironment remodelling to promote progression at all stages of

carcinogenesis (60,61), which may also influence the TES; however these effects were not explored in detail.

The relationship between patient survival and EMT-related gene signatures are discordant across the

literature. These inconsistencies may reflect the role of phenotypic switching in disease progression, and the

need for MET in establishment and growth of metastasis (62,63); alternatively these results may have arisen

due to difficulties with taking a biopsy from the invasive front of a clinical tumour sample. For example, Taube

et al. (2010) found no correlation between their core EMT signature and breast cancer patient survival (5),

while Tan et al. (2014) showed a tumour-type dependent association between their EMT signature and

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 19: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

19

survival, and surprisingly better overall survival outcomes for mesenchymal breast cancer and malignant

melanoma (7). Conversely, Shipitsin et al. (2007) reported an association between poor distant metastasis free

survival and higher expression values for TGFβ signalling pathway elements that were up-regulated in CD44+

breast cancer samples for one of their datasets (37).

We show that tumours with a high TES are significantly associated with poor survival within the TCGA pan-

cancer data, but there were large differences between cancers. The association between TES and survival was

significant for brain, kidney, lining of body cavities, adrenal gland, thyroid gland and eye cancers (p-

value<0.05). For some cancer types, there was no significant association between TES and survival, although a

similar trend was observed, such that the high TES group tended to have reduced survival. Surprisingly, for

haematopoietic or head and neck cancers a lower TES was associated with reduced survival rates. This may

reflect the fact that haematopoietic cells are not epithelial-derived thus different molecular mechanisms may

be active, while a large proportion of the head and neck cancers are squamous cell carcinomas where TGFβ

signalling elements such as TGFBR2, SMAD4 and SMAD2 are strongly down-regulated (64). Consistent with our

observation, it has been shown that patients with head and neck squamous cell carcinoma (HNSCC) who do

not have phospho-SMAD2/3 show better survival compared to patients with active phospho-SMAD2/3 (65).

Given the increasing availability of experimental and clinical drugs targeting the TGFβ signalling pathway (22),

our TGFβ-EMT gene signature may be useful to identify clinical subsets of patients who will benefit from

targeted treatment to inhibit metastasis. We believe that our results illustrate several advantages for stimulus-

specific classification of EMT programs, in particular the ability to pair signatures may facilitate improved

molecular stratification of cancer subtypes for targeted therapies. The future of personalised therapeutics will

certainly benefit from further development of driver-specific signatures for clinically-relevant phenotypic

programs in order to better target therapies.

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 20: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

20

Acknowledgements: The authors would like to acknowledge the TCGA Research Network (http://cancergenome.nih.gov/) for generating the data used in this article. We thank Professor Greg Goodall, Associate Professor Frederic Hollande, and Dr. Hong-Jian Zhu for the valuable suggestions and feedback, and Professor Terry Speed for his advice in data analysis. MJD is supported the National Breast Cancer Foundation (ECF-14-043).

Authors' contributions: MJD, EWT, JC and MF designed the study. MF performed all the analysis and generated the results. MJD, EWT, JC and MF wrote the paper and interpreted the results. SH tested all code and generated digital archive for reproducibility of the analysis. All authors reviewed the manuscript prior to submission.

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 21: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

21

References 1. Hay ED. An overview of epithelio-mesenchymal transformation. Acta Anat (Basel) 1995;154(1):8-20. 2. Iwano M, Plieth D, Danoff TM, Xue C, Okada H, Neilson EG. Evidence that fibroblasts derive from

epithelium during tissue fibrosis. The Journal of clinical investigation 2002;110(3):341-50. 3. Diepenbruck M, Christofori G. Epithelial-mesenchymal transition (EMT) and metastasis: yes, no,

maybe? Current opinion in cell biology 2016;43:7-13. 4. Reka AK, Chen G, Jones RC, Amunugama R, Kim S, Karnovsky A, et al. Epithelial-mesenchymal

transition-associated secretory phenotype predicts survival in lung cancer patients. Carcinogenesis 2014;35(6):1292-300.

5. Taube JH, Herschkowitz JI, Komurov K, Zhou AY, Gupta S, Yang J, et al. Core epithelial-to-mesenchymal transition interactome gene-expression signature is associated with claudin-low and metaplastic breast cancer subtypes. Proceedings of the National Academy of Sciences of the United States of America 2010;107(35):15449-54.

6. Groger CJ, Grubinger M, Waldhor T, Vierlinger K, Mikulits W. Meta-analysis of gene expression signatures defining the epithelial to mesenchymal transition during cancer progression. PloS one 2012;7(12):e51136.

7. Tan TZ, Miow QH, Miki Y, Noda T, Mori S, Huang RY, et al. Epithelial-mesenchymal transition spectrum quantification and its efficacy in deciphering survival and drug responses of cancer patients. EMBO molecular medicine 2014;6(10):1279-93.

8. Cursons J, Leuchowius KJ, Waltham M, Tomaskovic-Crook E, Foroutan M, Bracken CP, et al. Stimulus-dependent differences in signalling regulate epithelial-mesenchymal plasticity and change the effects of drugs in breast cancer cell lines. Cell Commun Signal 2015;13(1):26.

9. Xu J, Lamouille S, Derynck R. TGF-beta-induced epithelial to mesenchymal transition. Cell Res 2009;19(2):156-72.

10. Myeroff LL, Parsons R, Kim SJ, Hedrick L, Cho KR, Orth K, et al. A transforming growth factor beta receptor type II gene mutation common in colon and gastric but rare in endometrial cancers with microsatellite instability. Cancer research 1995;55(23):5545-7.

11. Taylor MA, Parvani JG, Schiemann WP. The pathophysiology of epithelial-mesenchymal transition induced by transforming growth factor-beta in normal and malignant mammary epithelial cells. Journal of mammary gland biology and neoplasia 2010;15(2):169-90.

12. Wendt MK, Allington TM, Schiemann WP. Mechanisms of the epithelial-mesenchymal transition by TGF-beta. Future Oncol 2009;5(8):1145-68.

13. Chen ZY, Wang PW, Shieh DB, Chiu KY, Liou YM. Involvement of gelsolin in TGF-beta 1 induced epithelial to mesenchymal transition in breast cancer cells. Journal of biomedical science 2015;22:90.

14. Gregory PA, Bracken CP, Smith E, Bert AG, Wright JA, Roslan S, et al. An autocrine TGF-beta/ZEB/miR-200 signaling network regulates establishment and maintenance of epithelial-mesenchymal transition. Molecular biology of the cell 2011;22(10):1686-98.

15. Richards EJ, Zhang G, Li ZP, Permuth-Wey J, Challa S, Li Y, et al. Long non-coding RNAs (LncRNA) regulated by transforming growth factor (TGF) beta: LncRNA-hit-mediated TGFbeta-induced epithelial to mesenchymal transition in mammary epithelia. The Journal of biological chemistry 2015;290(11):6857-67.

16. Desnoyers G, Frost LD, Courteau L, Wall ML, Lewis SM. Decreased eIF3e Expression Can Mediate Epithelial-to-Mesenchymal Transition through Activation of the TGFbeta Signaling Pathway. Molecular cancer research : MCR 2015;13(10):1421-30.

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 22: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

22

17. Gillis LD, Lewis SM. Decreased eIF3e/Int6 expression causes epithelial-to-mesenchymal transition in breast epithelial cells. Oncogene 2013;32(31):3598-605.

18. Muraoka-Cook RS, Dumont N, Arteaga CL. Dual role of transforming growth factor beta in mammary tumorigenesis and metastatic progression. Clinical cancer research : an official journal of the American Association for Cancer Research 2005;11(2 Pt 2):937s-43s.

19. Freytag J, Wilkins-Port CE, Higgins CE, Higgins SP, Samarakoon R, Higgins PJ. PAI-1 mediates the TGF-beta1+EGF-induced "scatter" response in transformed human keratinocytes. The Journal of investigative dermatology 2010;130(9):2179-90.

20. Smith AL, Robin TP, Ford HL. Molecular pathways: targeting the TGF-beta pathway for cancer therapy. Clinical cancer research : an official journal of the American Association for Cancer Research 2012;18(17):4514-21.

21. Katz LH, Li Y, Chen JS, Munoz NM, Majumdar A, Chen J, et al. Targeting TGF-beta signaling in cancer. Expert opinion on therapeutic targets 2013;17(7):743-60.

22. Akhurst RJ, Hata A. Targeting the TGFbeta signalling pathway in disease. Nature reviews Drug discovery 2012;11(10):790-811.

23. Wilson CL, Miller CJ. Simpleaffy: a BioConductor package for Affymetrix Quality Control and data analysis. Bioinformatics (Oxford, England) 2005;21(18):3683-5.

24. Chang LC, Lin HM, Sibille E, Tseng GC. Meta-analysis methods for combining multiple expression profiles: comparisons, statistical characterization and an application guideline. BMC bioinformatics 2013;14:368.

25. Leek JT, Johnson WE, Parker HS, Jaffe AE, Storey JD. The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics (Oxford, England) 2012;28(6):882-3.

26. Johnson WE, Li C, Rabinovic A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics (Oxford, England) 2007;8(1):118-27.

27. Reinhold WC, Sunshine M, Liu H, Varma S, Kohn KW, Morris J, et al. CellMiner: a web-based suite of genomic and pharmacologic tools to explore transcript and drug patterns in the NCI-60 cell line set. Cancer research 2012;72(14):3499-511.

28. Barretina J, Caponigro G, Stransky N, Venkatesan K, Margolin AA, Kim S, et al. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 2012;483(7391):603-7.

29. Garnett MJ, Edelman EJ, Heidorn SJ, Greenman CD, Dastur A, Lau KW, et al. Systematic identification of genomic markers of drug sensitivity in cancer cells. Nature 2012;483(7391):570-5.

30. Neve RM, Chin K, Fridlyand J, Yeh J, Baehner FL, Fevr T, et al. A collection of breast cancer cell lines for the study of functionally distinct cancer subtypes. Cancer Cell 2006;10(6):515-27.

31. Heiser LM, Sadanandam A, Kuo WL, Benz SC, Goldstein TC, Ng S, et al. Subtype and pathway specific responses to anticancer compounds in breast cancer. Proceedings of the National Academy of Sciences of the United States of America 2012;109(8):2724-9.

32. Barrett T, Wilhite SE, Ledoux P, Evangelista C, Kim IF, Tomashevsky M, et al. NCBI GEO: archive for functional genomics data sets--update. Nucleic Acids Res 2013;41(Database issue):D991-5.

33. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res 2015.

34. Smyth GK. Limma: linear models for microarray data. Bioinformatics and computational biology solutions using R and Bioconductor: Springer; 2005. p 397-420.

35. Dreyfuss JM, Johnson MD, Park PJ. Meta-analysis of glioblastoma multiforme versus anaplastic astrocytoma identifies robust gene markers. Molecular cancer 2009;8:71.

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 23: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

23

36. Du L, Yamamoto S, Burnette BL, Huang D, Gao K, Jamshidi N, et al. Transcriptome profiling reveals novel gene expression signatures and regulating transcription factors of TGFbeta-induced epithelial-to-mesenchymal transition. Cancer medicine 2016;5(8):1962-72.

37. Shipitsin M, Campbell LL, Argani P, Weremowicz S, Bloushtain-Qimron N, Yao J, et al. Molecular definition of breast tumor heterogeneity. Cancer Cell 2007;11(3):259-73.

38. James D, Levine AJ, Besser D, Hemmati-Brivanlou A. TGFbeta/activin/nodal signaling is necessary for the maintenance of pluripotency in human embryonic stem cells. Development (Cambridge, England) 2005;132(6):1273-82.

39. Blick T, Hugo H, Widodo E, Waltham M, Pinto C, Mani SA, et al. Epithelial mesenchymal transition traits in human breast cancer cell lines parallel the CD44(hi/)CD24 (lo/-) stem cell phenotype in human breast cancer. Journal of mammary gland biology and neoplasia 2010;15(2):235-52.

40. Verhaak RG, Tamayo P, Yang JY, Hubbard D, Zhang H, Creighton CJ, et al. Prognostically relevant gene signatures of high-grade serous ovarian carcinoma. The Journal of clinical investigation 2013;123(1):517-25.

41. Hanzelmann S, Castelo R, Guinney J. GSVA: gene set variation analysis for microarray and RNA-seq data. BMC bioinformatics 2013;14:7.

42. Shankavaram UT, Varma S, Kane D, Sunshine M, Chary KK, Reinhold WC, et al. CellMiner: a relational database and query tool for the NCI-60 cancer cell lines. BMC genomics 2009;10:277.

43. Seashore-Ludlow B, Rees MG, Cheah JH, Cokol M, Price EV, Coletti ME, et al. Harnessing Connectivity in a Large-Scale Small-Molecule Sensitivity Dataset. Cancer discovery 2015;5(11):1210-23.

44. Basu A, Bodycombe NE, Cheah JH, Price EV, Liu K, Schaefer GI, et al. An interactive resource to identify cancer genetic and lineage dependencies targeted by small molecules. Cell 2013;154(5):1151-61.

45. Maheswaran S, Haber DA. Cell fate: Transition loses its invasive edge. Nature 2015;527(7579):452-3. 46. Shang Y, Cai X, Fan D. Roles of epithelial-mesenchymal transition in cancer drug resistance. Current

cancer drug targets 2013;13(9):915-29. 47. Quail DF, Joyce JA. Microenvironmental regulation of tumor progression and metastasis. Nature

medicine 2013;19(11):1423-37. 48. Aran D, Sirota M, Butte AJ. Systematic pan-cancer analysis of tumour purity. Nature communications

2015;6:8971. 49. Halvorsen M, Martin JS, Broadaway S, Laederach A. Disease-associated mutations that alter the RNA

structural ensemble. PLoS genetics 2010;6(8):e1001074. 50. Orom UA, Nielsen FC, Lund AH. MicroRNA-10a binds the 5'UTR of ribosomal protein mRNAs and

enhances their translation. Molecular cell 2008;30(4):460-71. 51. Palaniswamy R, Teglund S, Lauth M, Zaphiropoulos PG, Shimokawa T. Genetic variations regulate

alternative splicing in the 5' untranslated regions of the mouse glioma-associated oncogene 1, Gli1. BMC molecular biology 2010;11:32.

52. Park CY, Min KN, Son JY, Park SY, Nam JS, Kim DK, et al. An novel inhibitor of TGF-beta type I receptor, IN-1130, blocks breast cancer lung metastasis through inhibition of epithelial-mesenchymal transition. Cancer letters 2014;351(1):72-80.

53. Singh A, Settleman J. EMT, cancer stem cells and drug resistance: an emerging axis of evil in the war on cancer. Oncogene 2010;29(34):4741-51.

54. Teicher BA, Maehara Y, Kakeji Y, Ara G, Keyes SR, Wong J, et al. Reversal of in vivo drug resistance by the transforming growth factor-beta inhibitor decorin. International journal of cancer 1997;71(1):49-58.

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 24: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

24

55. Bandyopadhyay A, Wang L, Agyin J, Tang Y, Lin S, Yeh IT, et al. Doxorubicin in combination with a small TGFbeta inhibitor: a potential novel therapy for metastatic breast cancer in mouse models. PloS one 2010;5(4):e10365.

56. Kim YJ, Hwang JS, Hong YB, Bae I, Seong YS. Transforming growth factor beta receptor I inhibitor sensitizes drug-resistant pancreatic cancer cells to gemcitabine. Anticancer research 2012;32(3):799-806.

57. Park SY, Kim MJ, Park SA, Kim JS, Min KN, Kim DK, et al. Combinatorial TGF-beta attenuation with paclitaxel inhibits the epithelial-to-mesenchymal transition and breast cancer stem-like cells. Oncotarget 2015;6(35):37526-43.

58. Bhola NE, Balko JM, Dugger TC, Kuba MG, Sanchez V, Sanders M, et al. TGF-beta inhibition enhances chemotherapy action against triple-negative breast cancer. The Journal of clinical investigation 2013;123(3):1348-58.

59. Hau P, Jachimczak P, Schlingensiepen R, Schulmeyer F, Jauch T, Steinbrecher A, et al. Inhibition of TGF-beta2 with AP 12009 in recurrent malignant gliomas: from preclinical to phase I/II studies. Oligonucleotides 2007;17(2):201-12.

60. Neuzillet C, Tijeras-Raballand A, Cohen R, Cros J, Faivre S, Raymond E, et al. Targeting the TGFβ pathway for cancer therapy. Pharmacology & Therapeutics 2015;147:22-31.

61. Drabsch Y, ten Dijke P. TGF-beta signalling and its role in cancer progression and metastasis. Cancer metastasis reviews 2012;31(3-4):553-68.

62. Korpal M, Ell BJ, Buffa FM, Ibrahim T, Blanco MA, Celia-Terrassa T, et al. Direct targeting of Sec23a by miR-200s influences cancer cell secretome and promotes metastatic colonization. Nature medicine 2011;17(9):1101-8.

63. Haviv I, Thompson EW. Soiling the seed: microenvironment and epithelial mesenchymal plasticity. Cancer microenvironment : official journal of the International Cancer Microenvironment Society 2012;5(1):1-3.

64. White RA, Malkoski SP, Wang XJ. TGFbeta signaling in head and neck squamous cell carcinoma. Oncogene 2010;29(40):5437-46.

65. Xie W, Aisner S, Baredes S, Sreepada G, Shah R, Reiss M. Alterations of Smad expression and activation in defining 2 subtypes of human head and neck squamous cell carcinoma. Head & neck 2013;35(1):76-85.

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 25: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

25

Tables

Table 1. Microarray studies examining TGFβ-induced EMT. (GEO): Gene Expression Omnibus. In order to identify a signature for TGFβ-induced EMT that is not specific to a particular type of cancer, we use cell lines that represent a variety of cancers.

Cell line Tissue of cell line origin

Platform ReplicatesGEO accession number

Reference

HMLE Breast Affymetrix- HT_HG-U133A

3 GSE24202 Taube et al. (2010) (5)PMID: 20713713

MCF10A Breast Agilent-014850 Whole Human Genome

2 GSE28569 Deshiere et al. (2013)PMID: 22562247

HMEC-TR Breast Affymetrix- HG-U133_Plus_2

2 GSE28448 Hesling et al. (2011)

PMID: 21597466

A549 Lung Affymetrix- HG-U133_Plus_2

3 GSE17708 Keshamouni et al. (2010) PMID: 19118450

A549 Lung Affymetrix- HG-U133_Plus_2

3 GSE49644 Sun et al. (2014) PMID: 25379179

HCC827 Lung Affymetrix- HG-U133_Plus_2

3 GSE49644 Sun et al. (2014)PMID: 25379179

NCI-H358 Lung Affymetrix- HG-U133_Plus_2

3 GSE49644 Sun et al. (2014)PMID: 25379179

HK-2 Kidney Affymetrix- HG-U133A_2 3 GSE23338 Walsh et al. (2008)PMID: 25379179

Human proximal tubular cell line

Kidney Illumina- HumanWG-6 v3.0

3 GSE20247 Hills et al. (2010) PMID: 20197308

Panc-1 Pancreas Affymetrix- HG-U133_Plus_2

3 GSE23952 Maupin et al. (2010)PMID: 20885998

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 26: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

26

Table 2. Number of mutations for selected TGFβ pathway components within high, low and medium TES groups.

Absolute frequency of mutations within cell lines (number expected to cause protein sequence changes)

Selected genes from TGFβ pathway Low TES Med TES High TES

TGFΒ1 - 3 (3) -

TGFΒ2 26 (1) 25 (3) 16 (1)

TGFΒ3 6 (5) 2 (1) -

TGFΒR1 17 (16) 1 (0) -

TGFΒR2 24 (24) 6 (4) 1 (1)

SMAD2 6 (6) 2 (1) 1 (1)

SMAD3 5 (5) 2 (1)) 1 (0)

SMAD4 21 (20) - 1 (1)

SMAD7 10 (3) 2 (1) 1 (0)

RHOA 17 (15) 2 (2) 1 (1)

Relative frequency of TGFβ pathway mutations within cell line groups

Number of genes mutated in TGFβ pathway

40 33 28

Total mutations in TGFβ pathway genes 466 (316)

(n=104 CLs)

228 (268)

(n= 104 CLs)

135 (42)

(n= 104 CLs)

Average TGFβ pathway mutations per cell line

4.48 2.20 1.29

Total mutations 19012 10607 7626

TES= TGFβ-EMT Score; CLs= cell lines

Table 3. The proportion of TGFβ signalling mutations in each mutation class; data shown for CCLE mutations listed in Table 2.

Mutations 3’ UTR

5’ UTR

Frame Shift

Frame shift

In Frame

Intron Missense Nonsense Silent Splice Site

Splice Site

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 27: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

27

Del Ins Del Ins SNP

High TES 0.53 0.13 - - 0.01 0.03 0.28 0.01 0.01 - -

Med TES 0.33 0.10 0.05 0.02 0.01 0.04 0.37 0.04 0.02 0.004 0.02

Low TES 0.21 0.05 0.16 0.02 0.01 0.05 0.42 0.04 0.01 0.004 0.02

Del= deletion; Ins= insertion; SNP: single nucleotide polymorphism; UTR= un-translated region (28).

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 28: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

28

Figure legends

Figure 1. (A) Distribution of gene scores by product of ranks. The product of ranks (PR) value for each gene was calculated across all ten datasets studied and then log2 transformed (histogram). A permuted null distribution was calculated (density line; n=119x106 and a significance threshold was specified at 99.999% of the cumulative density (dashed line) to identify genes with a product of rank lower than expected (p-value < 5e-6). (B) Overlap of our TGFβ-EMT signature (Foroutan) with more general EMT signatures reported by Cursons et al. (8) and Du et al. (36).

Figure 2. Relative TGFβ-EMT Scores (TES) for cancer cell lines from different datasets. Up- and down-regulated sets from our 301 gene signature were scored and combined using a summed-ssGSEA approach. This was applied to (A) the integrated TGFβ-EMT dataset (Table 1), (B) Neve et al. (2006) cell lines as clustered, (C) NCI-60 cell lines, and (D) Cancer Cell Line Encyclopedia (CCLE) cell lines. Where cell lines are present in multiple datasets, we calculated the correlation between TES, for (E) CCLE and NCI-60 data, and (F) CCLE and Neve data, demonstrating that scores are highly correlated in independently measured data.

Figure 3. The (A) ‘epithelial score’ (ES) and (B) ‘mesenchymal score’ (MS) of the control and TGFβ-treated samples within the integrated TGFβ-EMT dataset (obtained by ssGSEA method). As shown, control cells (blue histograms) tend to have a high ES and low MS, while TGFβ-treated cells tend to have a low ES and high MS. (C-E) Comparison of TGFβ-EMT score (TES) to ES and MS in CCLE cell lines with (C) low TES, (D) medium TES, and (E) high TES. Samples with high TES show high MS and low ES, while samples with low TES tend to have low MS and high ES. (F) Correlation between TES and ES across all CCLE cell lines (Spearman’s ρ= -0.83), and (G) Correlation between TES and MS in all CCLE cell lines (Spearman’s ρ=0.8).

Figure 4. (A) The NCI-60 cell lines ordered by their relative TES, and log10(GI50) of the two drugs (scaled by Z-score). Higher values of log10(GI50) are associated with higher resistance of samples to that drug. (B) The volcano plot shows the Spearman’s correlation between TES and log10(GI50) along with the related p-values for the correlation. The left and the right plots are the compounds with the highest negative and positive correlations, respectively. A positive correlation is associated with higher resistance of high TES group to that drug, while this group is more sensitive to drugs with a negative correlation.

Figure 5. (A) Relative TGFβ-EMT scores (TES) for the TCGA pan-cancer data using the ssGSEA method. (B-D) Comparing TES, epithelial scores (ES) and mesenchymal scores (MS) in the TCGA pan-cancer data with tumours classified as (B) low TES, (C) medium TES, and (D) high TES. In general, samples with a high TES have a lower ES and higher MS, while samples with a low TES tend to have a higher ES and lower MS. (E-G) Relationship between TES and tumour purity (CPE: consensus measurements of purity estimations) in (E) TCGA pan-cancer data (nSamples= 7805), (F) colorectal cancer with the strongest negative correlation, and in (G) glioblastoma with the least correlation.

Figure 6. (A) Overall survival of pan-cancer patients with high versus low TES (nTumours= 976 in each group, p-value= 2e-9, Hazard Ratio= 0.6). (B) Overall survival rates of different cancer types with ten year censoring (ordered by median values for overall survival)

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 29: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

Figure 1

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 30: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

Figure 2

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 31: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

Figure 3

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 32: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

Figure 4

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 33: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

Figure 5

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 34: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

Figure 6

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313

Page 35: A Transcriptional Program for Detecting TGFbeta-induced EMT in … · 1 A Transcriptional Program for Detecting TGFβ-induced EMT in Cancer Momeneh Foroutan1,2, Joseph Cursons2,3,4,

Published OnlineFirst January 23, 2017.Mol Cancer Res   Momeneh Foroutan, Joseph Cursons, Soroor Hediyeh-Zadeh, et al.   in CancerA Transcriptional Program for Detecting TGFbeta-induced EMT

  Updated version

  10.1158/1541-7786.MCR-16-0313doi:

Access the most recent version of this article at:

  Material

Supplementary

  http://mcr.aacrjournals.org/content/suppl/2017/03/07/1541-7786.MCR-16-0313.DC1

Access the most recent supplemental material at:

  Manuscript

Authoredited. Author manuscripts have been peer reviewed and accepted for publication but have not yet been

   

   

   

  E-mail alerts related to this article or journal.Sign up to receive free email-alerts

  Subscriptions

Reprints and

  [email protected] at

To order reprints of this article or to subscribe to the journal, contact the AACR Publications

  Permissions

  Rightslink site. Click on "Request Permissions" which will take you to the Copyright Clearance Center's (CCC)

.http://mcr.aacrjournals.org/content/early/2017/01/21/1541-7786.MCR-16-0313To request permission to re-use all or part of this article, use this link

on June 13, 2020. © 2017 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on January 23, 2017; DOI: 10.1158/1541-7786.MCR-16-0313