Upload
others
View
4
Download
0
Embed Size (px)
Citation preview
Therapeutics, Targets, and Chemical Biology
High-Throughput Genomic Profiling of AdultSolid Tumors Reveals Novel Insights intoCancer PathogenesisRyan J. Hartmaier1, Lee A. Albacker1, Juliann Chmielecki1, Mark Bailey1, Jie He1,Michael E. Goldberg1, Shakti Ramkissoon1, James Suh1, Julia A. Elvin1, Samuel Chiacchia1,Garrett M. Frampton1, Jeffrey S. Ross1,2, Vincent Miller1, Philip J. Stephens1, andDoron Lipson1
Abstract
Genomic profiling is widely predicted to become a standard ofcare in clinical oncology, but more effective data sharing toaccelerate progress in precision medicine will be required. Here,we describe cancer-associated genomic profiles from 18,004unique adult cancers. The dataset was composed of 162 tumorsubtypes including multiple rare and uncommon tumors. Com-parison of alteration frequencies to The Cancer Genome Atlasidentified some differences and suggested an enrichment oftreatment-refractory samples in breast and lung cancer cohorts.To illustrate novelty within the dataset, we surveyed the genomiclandscape of rare diseases and identified an increased frequency ofNOTCH1 alterations in adenoid cystic carcinomas comparedwithprevious studies. Analysis of tumor suppressor gene patterns
revealed disease specificity for certain genes but broad inactiva-tion of others.We identifiedmultiple potentially druggable, noveland known kinase fusions in diseases beyond those in which theyare currently recognized. Analysis of variants of unknown signif-icance identified an enrichment of SMAD4 alterations in coloncancer and other rare alterations predicted to have functionalimpact. Analysis of established, clinically relevant alterationshighlighted the spectrum of molecular changes for which testingis currently recommended, as well as opportunities for expansionof indications for use of approved targeted therapies. Overall, thisdataset presents a new resource with which to investigate rarealterations and diseases, validate clinical relevance, and identifynovel therapeutic targets. Cancer Res; 77(9); 2464–75. �2017 AACR.
IntroductionLarge-scale genomic datasets, such as The Cancer Genome
Atlas (TCGA), have increased our understanding of the molec-ular mechanisms driving tumorigenesis (1–3). Public availabil-ity of this data has facilitated novel discoveries, validated rarefindings, and allowed for the incorporation of genomic featuresinto clinical trial design (4). As more cancer patients undergoclinical genomic profiling, sharing these data with the broaderresearch community is critical for accelerating precisionmedicine (4).
The value of genomic data is apparent in multiple cancerswhere molecular alterations define distinct clinical groups. Forexample, glioblastomas can be divided into four molecular sub-types, each with distinct survival and response rates to standard
therapies (5, 6). Insights from cancer genomics analyses have alsoled to the development and validation of targeted treatmentoptions againstmolecularlymatched alterations that can bemoreeffective and less toxic than traditional chemotherapeutic regi-mens. For example, in non–small cell lung cancer (NSCLC),alterations in 8 genes are associated with sensitivity to targetedinhibitors and genomic analyses of these targets is now recom-mended in treatment guidelines (7). In addition, broad genomicfeatures, such as total tumor mutational burden, have beenproposed as potential biomarkers of sensitivity for immunecheckpoints inhibitors (8). Genomic information can also informclinical trial design to better identify patients likely to respond totargeted inhibitors. Multi-arm umbrella trials such as NCI-MATCH (NCT02465060) and LUNG-MAP (NCT02154490) areusing genomic features to select appropriate treatment arms.Basket trials, selecting patients based on molecular alterationsversus tumor type have also been developed for alteration-specificinhibitors (9). Across cancers, patients enrolled in molecularlymatched clinical trials have demonstrated superior survival versusrandom trial enrollment (10).
The sharing of genomic data is critical to further our under-standing of molecular drivers and to develop effective thera-pies. Here, we present genomic profiles from 18,004 uniqueadult solid tumors that underwent targeted genomic profilingas part of routine clinical care. This collection represents a vastdiversity of tumor subtypes, including many rare diseasesnot profiled as part of large-scale efforts. High-level analysisidentified novel alterations in common diseases and confirmed
1Foundation Medicine, Cambridge, Massachusetts. 2Albany Medical College,Albany, New York.
Note: Supplementary data for this article are available at Cancer ResearchOnline (http://cancerres.aacrjournals.org/).
R.J. Hartmaier, L.A. Albacker, and J. Chmielecki contributed equally to this article.
Corresponding Authors: Ryan J. Hartmaier, Foundation Medicine, 150 Sec-ond Street, Cambridge, MA 02141. Phone: 857-242-7286. E-mail:[email protected]; and Doron Lipson,[email protected]
doi: 10.1158/0008-5472.CAN-16-2479
�2017 American Association for Cancer Research.
CancerResearch
Cancer Res; 77(9) May 1, 20172464
on July 11, 2020. © 2017 American Association for Cancer Research. cancerres.aacrjournals.org Downloaded from
Published OnlineFirst February 24, 2017; DOI: 10.1158/0008-5472.CAN-16-2479
the prevalence of alterations that were underrepresentedpreviously in small cohort studies. In addition, we highlightthe spectrum of clinically relevant alterations with establishedroles in determining drug sensitivity. By making this dataavailable to the broader research community, it is anticipatedthat this information will serve as a source of discovery andvalidation for projects aimed at improving cancer treatmentsand outcomes.
Materials and MethodsSample profiling
This study was reviewed and approved by the WesternInstitutional Review Board. Samples were submitted between2012–2014 to a Clinical Laboratory Improvement Amendment(CLIA)-certified, New York State-accredited, and CAP-accre-dited laboratory (Foundation Medicine) for hybrid capturefollowed by next-generation sequencing (NGS). The pathologicdiagnosis of each case was reviewed and tumor content deter-mined from hematoxylin and eosin (H&E)-stained slides. Allsamples that advanced to DNA extraction contained a mini-mum of 20% tumor cells. The majority of samples contain <50% tumor content (Supplementary Fig. S1A). DNA wasextracted from formalin fixed paraffin embedded 10-mm sec-tions. Adaptor-ligated DNA underwent hybrid capture for allcoding exons of 287 cancer-related genes plus select intronsfrom 19 genes frequently rearranged in cancer (SupplementaryTable S1). Captured libraries were sequenced to a median exoncoverage depth of > 600� using Illumina HiSeq sequencingtechnology (Supplementary Fig. S1B). To protect against inad-vertent reidentification, samples in ultra-rare disease types(< 5 samples) were removed. In cases where multiple samplesfrom the same individual were tested (determined via a SNP-genotype based approach), only one sample was included. Nofurther selection was performed and all samples meeting thesecriteria are included in this dataset.
Sequencing analysis and deposition of dataResultant sequences were analyzed for short variants [base
substitutions, insertions/deletions (indels)], copy number (CN)alterations (focal amplifications andhomozygous deletions), andselect gene fusions using the hg19 reference genome, as describedpreviously (11). As tumor samples were sequenced without acorresponding matching normal sample, additional custom fil-tering was applied to highlight cancer-relevant alterations andreduce noise from benign germline events (Supplementary Fig.S2; Supplementary Methods). Sequence analysis results are avail-able publicly in the Genomic Data Commons (GDC) portal(https://gdc.nci.nih.gov/).
Analysis of TCGA dataTCGA data (2016-01-28), analyzed via Firehose, was accessed
via the 'firehose_get' tool. See the Supplementary Methods foradditional details. Survival analysis was determined via log-ranktest.
Mutation hotspot callerWe performed a hotspot analysis of missense and nonsense
mutations to predict hotspot changes within a given gene. SeeSupplementary Methods for additional details.
ResultsOverview of genomic and clinical characteristics
The FoundationMedicine (FM)adult dataset consists of 18,004unique tumor samples that underwent genomic profiling as partof standard clinical care. Each sample was assigned a detaileddiagnosis; these detailed subtypeswere then grouped intobroaderdisease categories. In total, 16 broad disease categories werecreated with tumors from 162 unique disease subtypes (Fig. 1Aand B; Supplementary Figs. S3–S11). The most common diseasecategories are thoracic cancers (20.7%, 12 subtypes), gastrointes-tinal (GI) cancers (17.1%, 15 subtypes), breast cancers (14.3%, 8subtypes), gynecologic cancers (8.3%, 25 subtypes), and hepato-pancreato-biliary cancers (7.1%, 15 subtypes). The remaining32.5% of samples include genitourinary, CNS, neuroendocrine,head and neck, melanoma, and additional less common cancertypes. Samples for which detailed pathologic information wasunavailable were grouped into general "not otherwise specified"(nos) categories within each disease category. Among all thespecific and "nos" categories, the most common disease subtypeswere lung adenocarcinoma (13.0%), colon adenocarcinoma(9.6%), breast carcinoma (nos; 6.6%), breast invasive ductalcarcinoma (6.5%), and lung non–small cell carcinoma (nos;3.4%). Of the 162 detailed diagnostic subtypes, 63% were com-prised of 50 or fewer samples, including multiple rare diseases(Supplementary Table S2). All detailed subtypes had at least 5unique tumors. In addition, 13.0% of the diseases subtypescontained 200 samples or greater (range 203–2,345) allowingfor robust statistical analyses in common diseases. Genes fre-quently altered across the dataset included the TP53 (54%), KRAS(21%), CDKN2A (19%), PIK3CA (14%), and CDKN2B(12%; Fig. 1C).
Gender information was available from all but 9 samples andshowed a slight bias toward females (56.7%) versus males(43.3%; Fig. 1D). This bias can be explained in part by the largenumber of breast andGYN cancer sampleswithin the dataset (Fig.1A). The average age of patients at the time of genomic profilingwas 57.7 years (median: 59 years, range: 19–88 years; Fig. 1D).Patients 89 years old or older at the time of testing were excludedto comply with privacy guidelines. Information about diseaserecurrence and previous treatment histories was not available forthe cohort.
Comparison of alteration frequencies to known datasetsIn contrast to other large-scale genomic profiling efforts that
employed multiple technologies (i.e., DNA-seq, CN arrays, andRNA-seq) to analyze genomic alterations, samples within thiscohort were analyzed for all classes of genomic alterations on asingle uniform platform. Due to the clinical setting in which thesesamples were profiled, no selection criteria (e.g., fresh vs. archivaltissue, primary vs. metastatic tissues, or pre- vs. posttreatmentsamples) was applied. Therefore, we hypothesized that the spec-trum and frequency of genomic alterations would vary comparedwith those fromother large-scale profiling efforts (i.e., TCGA) thatutilized multiple technologies and applied stringent selectioncriteria. We analyzed these differences using TCGA samples forwhich both CN (GISTIC2) and mutation data (MutSig2CV)existed and for which disease subtypes could be mapped readilybetween the two datasets. Finally, we included only diseases forwhich at least 200 samples existed in both cohorts to avoidsampling bias. (Fig. 2; Supplementary Table S3).
Public Release of Adult Cancer Genomic Data
www.aacrjournals.org Cancer Res; 77(9) May 1, 2017 2465
on July 11, 2020. © 2017 American Association for Cancer Research. cancerres.aacrjournals.org Downloaded from
Published OnlineFirst February 24, 2017; DOI: 10.1158/0008-5472.CAN-16-2479
We first investigated the impact of methodologic differences.Differences in CN alteration frequencies were observed acrossmany tumors (Fig. 2, green bars). These discrepancies may beexplained in part by different technologies used to measure theseevents by TCGA (SNP arrays) and FM (NGS-based modeling),differences in sample input requirements (i.e., > 70% tumorcontent vs. � 20% tumor content), annotation thresholds (FMrequires focal segments for most genes and >6–8 copies depend-ing on tumor ploidy), or functional status (plotted FM dataexcludes variants of unknown significance, VUSes). However, CNdifferences appeared to be tumor type specific as fewdiscrepancieswere observed in breast and bladder cancer cohorts, whereasmultiple differences were observed in ovarian cancers, head andneck squamous cell carcinomas, and lung adenocarcinomas.These CN differences were not due to the high proportion ofmetastatic samples in the FM dataset as restricting the FM datasetto local disease produced similar results (Supplementary Figs. S12and S13). Although the low cellularity typical of FM samples
could potentially explain the lower rate of CN detection, the highsequencing depth applied allows for high CN sensitivity > 85%even at 20–30% cellularity (12). Thus, the differences in copynumber observed aremost likely due to differences in technology,annotation, and/or sample differences (i.e., differences in pro-portion of tumor subtypes).
We next investigated differences in mutation frequencies. Thelargest mutation frequency difference between the two datasetswas observed for LRP1B in lung adenocarcinomas (6.5% vs.40.9%). Large discrepancies for LRP1B are also seen in melano-mas and head and neck squamous cell carcinomas. These dis-parities are largely explained by the many VUSes in this gene(excluded from comparison in Fig. 2) that result from the highbackground mutation rate in these tumor types (SupplementaryFig. S14). In contrast, the higher TP53 alteration frequency inbreast and uterine/endometrial cancers could not be accountedfor by filtered VUSes or differences in the types of alterationsreported, as rates of missense, nonsense, splice, and frameshift
Thoracic cancers20.7%
GI Cancers17.1%
Breast cancers14.3%
Gyn cancers8.3%
Hep-Panc-Bil7.1%
Other carcinomas6.1%
GU Cancers5.6%
CNS Cancers5.4%
Neuroendocrine Cancers4.3%
H&N Cancers4.1%
Melanomas 3.2%
Endocrine cancers 1.3%
Other 2.5%
Lung adenocarcinoma 62.9%
Lung NSCLC 16.4%
Lung SCC 10.5%
Mesothelioma 4.9%
Thymus carcinoma (nos) 1.7%Other 3.5%
Colon adenocarcinoma (CRC)55.9%
GE Junction adeno12.4%
Rectum adeno 8.6%
Stomach adeno (nos)8.1%
Appendix Adeno 3.6%Stomach adeno diffuse Type 2.2%
Small Intestine adeno 2.0%Anus SCC 1.9%Duodenom adeno 1.7%Esophagus SCC 1.7%Other 2.0%
Breast carcinoma (nos)45.9%
Breast IDC45.5% Breast ILC 5.2%
Breast Metaplastic Carcinoma 2.0%
Other 1.3%
Age at testing
Num
ber o
f pat
ient
s0
100
200
300
400
500
600
700
19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59 61 63 65 67 69 71 73 75 77 79 81 83 85 87
Num
ber o
f pat
ient
s0
2,00
04,
000
6,00
08,
000
10,0
0012
,000
Male Female
Perc
ent s
ampl
es
010
2030
4050
60
SVCNREMult.
TP53KRAS
CDKN2A
PIK3C
A
CDKN2B APCMYC
PTEN
ARID1AEGFR
MCL1
ERBB2RB1
CCND1NF1
FGF19FGF3
FGF4BRAF
SMAD4
LRP1B
KMT2D
FGFR1
STK11
CCNE1
A B
C D
Figure 1.
Clinical and genomic characteristics of samples within the dataset. A, Samples were grouped into 16 broad disease categories describing their tissue of origin.Not shown on the graph (grouped into the other category for visualization) are four additional diseases (unclassified, 0.9%; nonmelanoma skin cancers, 0.8%;mesenchymal cancers, 0.6%; and germ cell tumors, 0.2%). B, In addition, each sample was assigned a detailed label that represented its pathologic diagnosis.Subtype distributions for the top three broad categories (thoracic, GI, and breast cancers) are depicted in the smaller surrounding charts. Thoracic cancersincluded samples from 12 disease subtypes (the top 5 are depicted in the small chart). GI cancers included 15 disease subtypes (the top 10 are depicted in thesmall chart). Breast cancers were comprised of samples from 8 subtypes (the top four are depicted in the small chart). See Supplementary Figs. S3–S11 andSupplementary Table S2 for further details about subtypes. C, Long tail distribution of alterations across the entire dataset. SV, short variants (includesmissense mutations and indels); CN, copy number alterations; RE, rearrangements; Mult, multiple events. D, Clinical characteristics of samples within the dataset,including age and gender distributions.
Hartmaier et al.
Cancer Res; 77(9) May 1, 2017 Cancer Research2466
on July 11, 2020. © 2017 American Association for Cancer Research. cancerres.aacrjournals.org Downloaded from
Published OnlineFirst February 24, 2017; DOI: 10.1158/0008-5472.CAN-16-2479
mutations were similar between the two cohorts (SupplementaryFig. S15A). Similarly, tissues sites suspected to yield lower qualityDNA (i.e., pleural fluid) did not show evidence of distinct muta-tion detection (Supplementary Fig. S15B). Furthermore, withinthe FM dataset, TP53 alterations are enriched in local disease,indicating the discrepancy is not due to the high proportion ofmetastatic samples (see Supplementary Table S4 for genes alteredat different rates in local vs. metastatic disease). Therefore, differ-ences in mutation frequencies may represent sample differencesbetween the datasets.
To explore biological differences between the dataset, weinvestigated specific molecular trends associated with advancedrefractory tumors. For example, EGFR short variants (point muta-tions and indels) were slightly more frequent within FM lungadenocarcinomas versus TCGA (20.3% vs. 14.5%, P ¼ 0.003).However, the EGFR T790M mutation, associated with acquiredresistance to targeted inhibitors, was observed at a much higherfrequency within FM samples compared with the treatment-na€�veTCGAdataset (4.1%vs. 1.2%,P¼5.3�10�7). A similar trendwasalso observed in breast cancers, where the higher frequency ofESR1 alterations in the FMdataset (9.5%vs. 3.7%,P¼2.8�10�9)suggested a selection for samples with acquired resistance toendocrine therapies. These data suggest an enrichment of treat-ment refractory breast and lung cancer samples within the FMcollection.
Collectively, these comparisons confirm differences betweenTCGA and FM. Methodologic differences between the datasetsare likely contributing a minor role to the observed discrepan-cies. However, the enrichment of resistance-associated altera-tions in breast and lung cancers suggests a biological differencebetween the tumors within our cohort and TCGA. Althoughmolecular signatures of therapeutic resistance are less wellunderstood in diseases not treated routinely with targetedtherapies, one can infer that these tumors are likely fromadvanced stage cancers.
Novel variants and disease enrichment patternsWe next applied an internally developed hotspot-calling algo-
rithm (see Supplementary Methods) and in silico prediction tools(MutationAssessor and PolyPhen-2; refs. 12, 13) to search fornovel functional variants of unknown significance (VUS). Usingthe hotspot caller, we identified several mutation hotspots inSMAD4, a tumor suppressor gene (TSG) previously implicated incolorectal cancers (14). The hotspots identified were mostlynonsense mutations, consistent with a tumor suppressor role(Fig. 3A). However, we did observe several hotspot missensemutations including known hotspots such as D351A/G/V/Y,R361C/H/S, P356H/L/R/S, D537A/E/G/H/V/Y (15, 16), and nov-el hotspots such as A118V, E330K/�, G419M/R/W, andD493H/N,W524C/G/R (Fig. 3A).We also observed hotspotmutations in therelated pathway components SMAD2 (T303 and S464) andTGFBR2 (R495 and R528; Fig. 3B and C). As the TGFb/SMADpathway is a tumor suppressor pathway, we analyzed hotspotmutations, truncations, and homozygous deletions (collectivelyidentifying TGFb/SMAD pathway altered samples). Alterations inthis pathway were most frequent in GI and pancreato-hepato-biliary cancers with 15.8% (487/3076) and 13.6% (175/1284) oftotal cases, respectively (Fig. 3D). Alterations in this pathwaywerenonoverlapping although significance couldnot be addresseddueto the small sample size (Fig. 3E). We did observe a significantcooccurrence with KRAS alterations (OR¼ 1.70, P¼ 9.6� 10�6)and a significant mutual exclusivity with TP53 alterations(OR ¼ 0.62, P ¼ 2.34 � 10�4; Fig. 3F). To demonstrate clinicalrelevance, we incorporated survival data from TCGA samplesharboring TGFb/SMAD pathway alterations (14). Colon adeno-carcinomas were separated into TGFb/SMAD pathway altered(n ¼ 75; hotspot mutation/truncation/homozygous deletion)versus TGFb/SMAD pathway unaltered (n ¼ 254) groups. TheTGFb/SMAD-altered group exhibited reduced progression freesurvival (PFS) that trended toward significance (P ¼ 0.06; logrank test; Fig. 3G).
Short variants (SV)FMI
TCGACopy number (CN) alterations Both
(SV+CN)
STAD
TP53
ARID1A
CDKN2ACDH1
KRAS
ERBB2
PIK3C
A
CDKN2BMYC
KMT2D
FGFR2
CCND1APC
PTEN
FBXW7
010
2030
4050
60 **
**
**** **
OVCA
TP53MYC
CCNE1
BRCA1NF1
KRAS
PIK3C
AMCL1
BRCA2SOX2
AKT2
KAT6ABRAF
MYCL
FGF12
020
4060
8010
0
**
**** **
HNSC
TP53
CDKN2A
CCND1
PIK3C
A
FGF19FGF4
FGF3
CDKN2B
NOTCH1SOX2
EGFR
KMT2DPTEN
LRP1B
NOTCH2
020
4060
80
**
****
GBM
CDKN2A
CDKN2BPTEN
EGFRTP53 NF1
CDK4
PIK3C
AID
H1ATRX
RB1
PDGFRAMDM2
PIK3R
1KIT
010
2030
4050
6070
**
******
**
COAD
APCTP53
KRAS
PIK3C
A
SMAD4
FBXW7MYC
BRAFPTEN
FLT3CDK8
ARID1AGNAS
AMER1
ZNF217
020
4060
8010
0
**
** **
**
BRCA
TP53
PIK3C
AMYC
CCND1
FGF19FGF4
FGF3
FGFR1MCL1
ZNF703
ERBB2PTEN
KAT6AESR1
GATA3
010
2030
4050
60
**
**
**
SKCM%
Indi
vidu
als
% In
divi
dual
s%
Indi
vidu
als
% In
divi
dual
s
% In
divi
dual
s
% In
divi
dual
s%
Indi
vidu
als
% In
divi
dual
s%
Indi
vidu
als
% In
divi
dual
s
CDKN2ABRAF
NRASTP53 NF1
CDKN2B
LRP1B
PTENARID
2
GRIN2A
MYC
ARID1A
DNMT3A
PIK3C
AKIT
010
2030
4050
**
**
**
**
**
BLCA
TP53
CDKN2A
CDKN2B
ARID1A
KMT2D
KDM6A RB1
PIK3C
A
FGFR3
ERBB2
CCND1
FGF19FGF3
FGF4MCL1
010
2030
4050
60
****
UCEC
TP53
PIK3C
APTEN
ARID1AKRAS
PIK3R
1
CTNNB1
PPP2R1A
KMT2DCTCF
JAK1
FBXW7BCOR
ERBB2
CCNE1
010
2030
4050
6070 **
**
** **
LUAD
TP53KRAS
EGFR
CDKN2A
STK11
CDKN2BMCL1
MYC
NFKBIA
ERBB2
MDM2
LRP1B NF1
RICTOR
NKX2-1*0
1020
3040
5060
** ****
**
Figure 2.
Comparison of alteration frequencies between TCGA and FM datasets. Frequencies of alterations in analogous tumor subtypes were compared between FM andTCGA datasets. FM data excludes VUSes. � , neighboring gene was used to determine CN status at this locus; �� , significant differences in alterationfrequencies (P < 0.05). BRCA, breast invasive carcinoma; BLCA, bladder urothelial carcinoma; OVCA, ovarian carcinoma; UCEC, uterine corpus endometrialcarcinoma; GBM, glioblastoma; COAD, colon adenocarcinoma; SKCM, skin cutaneous melanoma; STAD, stomach adenocarcinoma; LUAD, lung adenocarcinoma;and HNSC, head and neck squamous cell carcinoma.
Public Release of Adult Cancer Genomic Data
www.aacrjournals.org Cancer Res; 77(9) May 1, 2017 2467
on July 11, 2020. © 2017 American Association for Cancer Research. cancerres.aacrjournals.org Downloaded from
Published OnlineFirst February 24, 2017; DOI: 10.1158/0008-5472.CAN-16-2479
To capture rare alterations, we also analyzed the functionalimpact of VUS point mutations occurring in at least 5 samplesusing MutationAssessor and PolyPhen-2 tools (12, 13). A mergeof these outputs identified 23 unique pointmutations in 11 geneswith predicted functional impact across both algorithms (Sup-plementary Table S5). All mutations also reached significanceusing the hotspot caller described above. In addition to themutations described above, this VUS analysis identified multiplealterations in ERBB3 (T355I and T389I), BRIP1 (R762C andR251C), KEAP1 (G523V, R413C, and G419W), and SMARCA4(E882K, P913L, R973W, R1135Q, and R1192H) and intriguingvariants in PTEN (D24H), FLT1 (E432K), STK11 (P179R), LRP1B(G401E), ESR1 (A361V), and CDKN2A (G101V) predicted tohave functional impact. Interestingly, these variants tended tooccur in tumor types associated with alterations in that gene. Forexample, 3of 6FLT1E432Kalterationswere inmelanomas, 21/32of the various SMAD4 alterations were found GI and hepato-
billiary cancers, and 9 of 15 of the KEAP1 alterations were foundin thoracic cancers. The nonrandom distribution of these VUSesstrongly implicates them as functional alterations. Of note, asmatched germline DNA was not tested, we cannot completelydiscount the possibility that some of these alterations may berare germline alterations. However, somatic/germline statuswas predicted with a novel, internally developed algorithmthat assesses germline status based on allele frequency andtumor purity/ploidy (Sun and colleagues, in review 2016). Forexample, in a sample with moderate cellularity (�50%), incopy neutral, diploid regions, somatic alteration allele frequen-cies will be impacted by cellularity and have allele frequenciesapproximately 25%. In contrast, heterozygous germline var-iants will not be impacted by cellularity and will have allelefrequencies approximately 50%. These estimates suggest that>90% of the VUSes predicted to have a functional impact aresomatic events (data not shown).
TP53
KRAS
SMAD
TP53 Alt.KRAS Alt.Multiple TGFβ/SMAD Pathway Alts
TGFBR2 Alt.SMAD2 Alt.SMAD4 Alt. 0.0
0.2
0.4
0.6
0.8
1.0
Time (days)
PFS
Pro
babi
lity
A
B
F
G
0 100 200 300 400 500 552 aa
0
137
# M
utat
ions
SMAD4
R361C/H/S/fs*12
MH1 MH2
0 100 200 300 400 467 aa
0
16
# M
utat
ions
SMAD2S464*
MH1 MH2
0 100 200 300 400 500 567 aa
0
32
# M
utat
ions
TGFBR2R528C/H
ecTbetaR2 Pkinase_Tyr
TGFβ/SMAD Pathway-Unalt.TGFβ/SMAD Pathway-Alt.
C
Distribution of TGFβ/SMAD Pathway Alterations
% In
divi
dual
s
05
1015
20
Multiple Alts.TGFBR2 Alt.SMAD2 Alt.SMAD4 Alt.
GI
Hep-pa
nc-bi
l
Other C
a.
Thorac
ic
Non-m
elano
ma Skin
Germ C
ellH&N
Gyn GUBrea
st
Unclas
sified
Melano
mas
Neuroe
ndoc
rine
Endoc
rine
CNS
Mesen
chym
al
D
TGFBR2
SMAD2
SMAD4
TGFBR2 Alt.SMAD2 Alt.SMAD4 Alt.
E
Figure 3.
Novel variants in SMAD4 and SMADsignaling pathway components.A, Hotspot alterations within SMAD4identified by a hotspot analysis. Theseinclude both known hotspots (D351A/G/V/Y, R361C/H/S, P356H/L/R/S,D537A/E/G/H/V/Y), and novelhotspots (A118V, E330K/� , G419M/R/W, and D493H/N,W524C/G/R). B andC, Alterations within SMAD pathwaygenes SMAD2 and TGFRB2 were alsoanalyzed. D–F, Analysis of all hotspotand loss of function (truncation andhomozygous deletions) mutations inthese three genes (D) for incidenceacross disease groupings and(E) cooccurrence in colorectaladenocarcinoma. F, Cooccurrence ofTGFb/SMAD pathway alterations incolorectal adenocarcinoma withKRAS alterations and significantmutual exclusivity with TP53alterations is shown. G, TCGA colonadenocarcinomaswere separated intoTGFb/SMAD pathway altered (n ¼ 75hotspot mutations, truncatingmutations, and homozygousdeletions) versus TGFb/SMADpathway unaltered (n ¼ 254) groups.The TGFb/SMAD mutated groupexhibited reduced PFS that trendedtoward significance (P ¼ 0.06;log-rank test).
Hartmaier et al.
Cancer Res; 77(9) May 1, 2017 Cancer Research2468
on July 11, 2020. © 2017 American Association for Cancer Research. cancerres.aacrjournals.org Downloaded from
Published OnlineFirst February 24, 2017; DOI: 10.1158/0008-5472.CAN-16-2479
Genomic analysis of rare diseases uncovers higher frequenciesofNOTCH1 andBCOR alterations in adenoid cystic carcinomas
We investigated the genomics of rare tumors given thatmany ofthe subtypes within the FMdataset were excluded from large-scaleanalyses (i.e., TCGA) or profiled only as part of small cohorts.Interesting results were observed in adenoid cystic carcinomas(ACC)of the head andneck region (n¼156 total), includingheadand neck ACCs (n ¼ 78), salivary gland ACCs (n ¼ 57), trachealACCs (n¼7), andunknownprimaryACCs (n¼14). In agreementwith recent findings of 36 recurrent andmetastatic ACCs (17), themost frequent alterations occurred in NOTCH1 (23%; Fig. 4A).NOTCH1 alterations were clustered in the C-terminal PESTdomain of the protein (Fig. 4B), and were significantly enrichedwithin this disease (Supplementary Table S6). NOTCH1 PESTdomain alterations are weakly activating by themselves (18).However, they are synergistic with HD domain mutations in cisand can significantly increase the activity of this protein when thiscombination of alterations is present. Interestingly, 7 samplesharbored mutations in both the PEST and HD domains, suggest-ing a potential mechanism through which activity of this genecould be altered (Fig. 4B). Unfortunately, phasing of the muta-
tions was impossible to determine definitively due to the distancebetween mutations and the length of the sequencing reads.However, two tumors contained a third alteration (E794� andN390fs�243) toward the 50 end ofNOTCH1, likely disrupting oneallele and suggestive that the other two alterations are in cis.Collectively, other genomic studies have investigated 111 totalACC tumors, including 28 samples also represented within thisdataset, and reported NOTCH1 missense and nonsense altera-tions in 5–10% of samples (19–21). This analysis confirmsfindings from smaller studies that NOTCH1 is the most com-monly altered gene in ACCs at approximately 24% and extends itby providing multiple examples of cooccurring PEST and HDdomain alterations. Further work to evaluate the effect of thesealterations is warranted as multiple inhibitors of this protein arecurrently in clinical trials (22).
The second most common alterations in ACCs occurred in theTSG BCOR (17%; Fig. 4A). All variants are predicted to inactivatethis protein (Fig. 4C). These results agree with previous findingswhere 4/36 ACCs were found to have truncating mutations inBCOR. Together, these results establish BCOR inactivation as asignature event in ACC. BCOR alterations have been described in
Perc
ent s
ampl
es0
510
1520
25
SVCNRESV & CN
NOTCH1
BCOR
KDM6ATP53
CREBBPKIT
ARID1A
KMT2D
RUNX1
PIK3C
A
PDGFRAKDR
EP300SPEN
BAP1
PIK3R
1ATM
CDKN2A
SMARCB1APC
PTEN
SETD2
SMARCA4
CCND1MCL1
A
B
C0 400 800 1,200 1,600 2,000 2,400 2,555 aa
0
5
# M
utat
ions
NOTCH1
0
20
40
60
80
100
120
140
0 1 2 3
Num
ber o
f sam
ple s
Number of NOTCH1 mutations per sample
NOTCH1 mutations/sample
0 400 800 1,200 1,600 1,755 aa
0
5
# M
utat
ions
BCOR
BCOR Ank_2 PUFD
Missense In-frame Truncation
Figure 4.
NOTCH1 and BCOR variants inadenoid cystic carcinomas. A, A longtail plot of alterations within ACCsshows a high prevalence of alterationsin NOTCH1 and BCOR. B, Thedistribution of alterations withinNOTCH1 reveals a clustering ofinactivating mutations within theC-terminal PEST domain. Arcsrepresent mutation pairs cooccurringin the same tumor. Mutations intumorswith 3NOTCH1mutations (n¼2) are highlighted with arrowheadscolor coded by sample. Proteindomains are: EGF-like (green),calcium-binding EGF (red), hEGF-like(blue), LNR (yellow), NOD (purple),NODP (orange), Ank (fuchsia, darkred, cyan), and DUF (mustard). NODand NODP together represent thehomopolymerization domain andDUF is contained within the PESTdomain. C, The distribution ofalterations in BCOR revealsinactivating events across the gene.Diagrams were generated usingMutationMapper (see Materials andMethods).
Public Release of Adult Cancer Genomic Data
www.aacrjournals.org Cancer Res; 77(9) May 1, 2017 2469
on July 11, 2020. © 2017 American Association for Cancer Research. cancerres.aacrjournals.org Downloaded from
Published OnlineFirst February 24, 2017; DOI: 10.1158/0008-5472.CAN-16-2479
myelodysplastic syndromes where they are associated with poorprognosis and shorter overall survival (23). They have also beendescribed in multiple pediatric tumors where they are thought toplay a role in chromatin modification (24). Both BCOR andNOTCH1 alterations cooccurred with other events (Supplemen-tary Fig. S16).
Identification of novel fusion eventsWe investigated the spectrum of fusions involving 8 clinically
relevant kinases (ALK, BRAF, EGFR, NTRK1, PDGFRA, RAF1, RET,andROS1)with establisheddruggability. In total,we identified19novel fusions with structures similar to known oncogenic fusionproteins and multiple known fusions in diseases different fromthose in which they were reported originally.
Nine novel fusions involved the serine threonine kinases BRAFor RAF1 (Fig. 5). These included fusions with four novel fusionpartners in diseases known to be driven by these events, such asastrocytoma, melanoma, and prostatic acinar cell carcinoma (Fig.5A).RAF1 fusionswere also observed in 5 other disease types (Fig.5B). Interestingly, the PRKAR2A-RAF1 fusion was recurrent andobserved in both a lung adenocarcinoma and an unknownprimary melanoma. In addition, we also observed 7 novel tyro-sine kinase fusions involving ALK, RET, ROS1, and NTRK1 (Fig.5C) in non–small cell lung cancers and thyroid cancer. Novelfusions involvingALK,RET, ROS1, andNTRK1were alsoobservedin a colon adenocarcinoma, a GI neuroendocrine tumor, and auterine endometrial carcinoma (Fig. 5D).
Although kinase fusions can be hallmarks of certain cancers, weobserved known fusions outside of the diseases in which theywere identified originally. CLTC-ALK and STRN-ALK, have beenobserved in aggressive thyroid cancers and soft tissue malignan-cies (25, 26).Within the FMcohort, these fusionswere observed ina lung adenocarcinoma and an epithelioid peritoneal mesothe-lioma, respectively (Supplementary Fig. S17). Diagnosis of theepithelioid peritoneal mesothelioma is supported by IHC stain-ing (positive for cytokeratin 7, calretinin and vimentin; negativefor CEA, B72.3, and TTF1). While a subset of lung adenocarci-nomas is known to be driven byALK fusions,CLTC-ALK has yet to
be described in this disease. In contrast, kinase fusions have notbeen noted in mesothelioma; STRN-ALK represents a novel, yetrare (1/184, 0.5%), driver event in this disease. RET fusions inthyroid and lung cancers arewell characterized (27, 28).Here, twobreast cancers were found to harbor the oncogenic CCDC6-RETfusion (Supplementary Fig. S17). Two similar onocgenic RETfusions (NCOA4-RET and KIF5B-RET) were also identified in arare liver cholangiocarcinoma and an ovarian epithelial carcino-ma, respectively (Supplementary Fig. S17). BRAF fusions havebeen described inmelanoma, thyroid cancers, and pediatric braincancers (29). A single thyroid papillary carcinomawas foundwithaMAD1L1-BRAF fusion identified previously in melanoma (30).We also observed the known TMEM106B-ROS1 fusion (31) in aliver cholangiocarcinoma. Interestingly, GOPC(FIG)-ROS1 hasbeen reported in a glioblastoma cell line (32), in a lung adeno-carcinoma (33), and in rare biliary tract carcinomas (34).Here, weobserved this fusion in a small intestine adenocarcinoma andconfirm its presence in a human glioblastoma sample (Supple-mentary Fig. S17). Finally, we observed an imatinib sensitivehypereosinophilic syndrome fusion, FIP1L1-PDGFRA (35), in aglioblastoma (Supplementary Fig. S17).
Spectrum of known clinically relevant alterations acrossdiseases
Recent publications have suggested a broad spectrum of geno-mic changes in clinically relevant targets (29, 36–38), and expand-ed analyses arewarranted to identifymore patients predicted to besensitive or resistant to targeted therapies. Furthermore, the iden-tification of drug sensitivity and resistance biomarkers acrossmultiple indications suggests that targeted agents may havebroader utility beyond that for which they were approved orig-inally. We surveyed the spectrum of known clinically relevantgenomic changes to identify (i) the spectrum of these changes inindications for which testing is currently recommended and (ii)potential opportunities for broader utility of approved targetedagents.
In colorectal cancers (CRC), activating mutations in KRAS arepredictive of poor response to cetuximab. Alterations in hotspot
BRAF ex 7-18TPR ex 1-14Astrocytoma
RAF1 ex 6-17CUX1 ex 1-7Skin melanoma
RAF1 ex 8-17PRKAR2A ex 1-8Unknown primary melanoma
RAF1 ex 8-17INADL ex 1-31Prostate acinar cell carcinoma
Novel RAF fusions in known diseasesA
BRAF1 ex 8-17DHX9 ex 1-3CRC
RAF1 ex 8-17GOLGA4 ex 1-22Breast carcinoma (nos)
RAF1 ex 8-17PDZRN3 ex 1-7Pancreatic ductal adeno
RAF1 ex 6-17ALCAM ex 1-14Pancreatic neuroendocrine
RAF1 ex 8-17PRKAR2A ex 1-4Lung adenocarcinoma
Novel RAF fusions in novel diseases
ALK ex 20-29c2orf34 ex 1-3
RET ex 12-19 PICALM ex 1-19
ALK ex 20-29MPRIP ex 1-21
NTRK1 ex 11-17 GRIPAP1 ex 1-22
RET ex 12-19 TNIP2 ex 1-5
Novel tyrosine kinase fusions in lung adenocarcinoma/NSCLC
RET ex 12-19 SATB1 ex 1-7
Novel tyrosine kinase fusion in papillary thyroid carcinoma
Novel kinase fusions in novel diseases
ROS1 ex 35-43 MYO5C ex 1-30
C
DCRC ALK ex 20-29 CENPF ex 1-11
RET ex 12-19x GPHN ex 1-9 GI neuroendocrine
NTRK1 ex 8-17 LRRC71 ex 1 Uterine endometrial carcinoma
Figure 5.
Novel fusions in known and unknown diseases. A, Five novel BRAF and RAF1 (CRAF) fusions were identified in diseases where rearrangements involving thesegenes are known to be oncogenic. B, In addition, we also observed five novel RAF fusions in other diseases. C, Novel fusions involving ALK, RET, ROS1, andNTRK1were identified inNSCLCs and a papillary thyroid carcinoma. Kinase fusionswithin these diseases are known to play a tumorigenic role.D, Three novel tyrosinekinase fusions were also identified in other diseases.
Hartmaier et al.
Cancer Res; 77(9) May 1, 2017 Cancer Research2470
on July 11, 2020. © 2017 American Association for Cancer Research. cancerres.aacrjournals.org Downloaded from
Published OnlineFirst February 24, 2017; DOI: 10.1158/0008-5472.CAN-16-2479
exons 12 and 13 were observed in 908 of 1,986 (45.7%) CRCs.Recently, expanded guidelines adopted by American Society ofClinical Oncology (ASCO) and others recommend testing foralterations in exon 2 (codons 12 and 13), exon 3 (codons 59 and61), and exon 4 (codons 117 and 146) in both KRAS and NRAS(39). These extended RAS testing guidelines captured an addi-tional 188 CRC samples (9.5%) within our dataset. Beyond theseextended guidelines, we also observed activating KRAS andNRASamplifications and mutations at codons 14 and 22 in an addi-tional 29 CRC samples (1.5%; Fig. 6A; refs. 40, 41). These dataprovide robust estimates of RAS alterations in CRCs, utilizingboth ASCO guidelines and the current state of knowledge, whichis critical when considering EGFR targeted therapies.
As mutations in the MAPK pathway are known drivers ofmultiple GI cancers, we performed a survey of KRAS, NRAS,HRAS, and BRAF alterations across cancers of the GI andhepato-biliary tracts (Fig. 6B). Gastroesophageal junction adeno-carcinoma was unique in that it had a high proportion of KRASamplifications without a KRAS mutation (16.3%). Appendixadenocarcinomas had a similar rate of MAPK alterations(64.5%) as neighboring small intestine (56.6%) and CRCs(60.3%). MAPK alterations in cancers of the biliary tree(see Fig. 6B) included a high frequency in bile duct adenocarci-nomas (46.3%) and pancreatic cancers (89%) and a low frequen-cy in gallbladder adenocarcinomas (11.5%) and liver cholangio-carcinomas (29.8%).
We next investigated the spectrum of clinically relevant inser-tions and deletions (indels) in EGFR (EGFRvIII rearrangementswere also identified and are discussed in a subsequent section).The most prevalent EGFR exon 19 indel was the canonical E746-
A750 deletion, although the length of deletions at amino acids746 and 747 ranged from 3–7 amino acids (Fig. 6C). We alsoobserved rare deletions of 8 amino acids at positions 751 and752.In contrast, drug-resistant EGFR exon 20 insertions lacked a singledominant location and length, but the vast majority occurredbetween amino acids 769–774 and inserted 1–3 residues withinthis region (Fig. 6D). Analogous ERBB2 insertions primarilyconsisted of a 4 amino acid duplication between residues 772–775, however, these insertions ranged from 1–4 amino acidsbetween residues 772 and 780 (Fig. 6E). The diversity of thesemutations has implications for robust diagnostic detection anddeveloping drugs to target this heterogeneous set of insertions(37).
Current data suggest that ERBB2 amplifications, oncogenicpoint mutations, and activating insertions can confer clinicalsensitivity to ERBB2 targeted agents (42). We observed recurrentactivating ERBB2 alterations across 15 tumor types (Fig. 7A),including new trends such as ERBB2 amplifications in cervicalcancer and skin squamous cell carcinomas. ERBB2 point muta-tions have been described in cervical cancers (43), but amplifica-tions suggest an alternatemechanism throughwhich the gene canbe activated in this disease. Consistent with previous studies,ERBB2 activating mutations in cervical cancers did not cooccurwith copy number alterations (Supplementary Fig. S18). To ourknowledge, ERBB2 amplifications in skin squamous cell carcino-ma represent a novel therapeutic target in this disease.
Amplification of the CDK4/6 locus has been associated withresponse to the CDK4 inhibitor palbociclib in breast cancers andliposarcomas (44, 45). Within this cohort, we observed CDK4/6amplification across 43 tumor types (Fig. 7B). Novel findings
2
4
6
8
10
12
14
16
18
20
321 4Insertion length
321 4Insertion length
5321 4
Deletion length5 6 7 8
772773774775776
768769770771
744
745
746
747
748
749
750
751
752
767
763764765766
772
773
774
775
776
777
778
779
780
EGFR Exon 19 deletions EGFR Exon 20 insertions ERBB2 Insertions
00
15
30
45
60
75
90
105
120
135
150
8
16
24
32
40
48
56
64
72
0
C D E
% S
ampl
es0
2040
6080
100
MultipleBRAF AnyHRAS AnyNRAS AnyKRAS AmpKRAS SV/SV&Amp
Pancre
as
Ductal
Appen
dix
Rectum
Colon
Small
Intes
tine
Bile D
uct
Liver
GEJ
Stomac
h
Gallbla
dder
A B
KRAS ex 2
Extend
ed R
AS
Comple
te RAS
020
040
060
080
01,
000
1,20
0
# S
ampl
es
Figure 6.
Diversity of clinically relevantalterations across the dataset.A, Distribution of clinically relevantKRAS alterations in colorectaladenocarcinomas. B, Distributionpattern of all RAS alterations in GIcancers. C–E, Indel alterations in EGFR(C and D) and ERBB2/HER2 (E) canvary in length.Numbers on the left sideof the graphs correspond to the codonpositions; heatmaps display thenumber of samples.
Public Release of Adult Cancer Genomic Data
www.aacrjournals.org Cancer Res; 77(9) May 1, 2017 2471
on July 11, 2020. © 2017 American Association for Cancer Research. cancerres.aacrjournals.org Downloaded from
Published OnlineFirst February 24, 2017; DOI: 10.1158/0008-5472.CAN-16-2479
included amplification of these genes in gallbladder carcinomas(11.5%) and oligodendrogliomas (6.6%). Preclinical modelshave suggested that this event may contribute to oligodendro-glioma formation (46).
MET amplification was observed in glioblastoma samples at anappreciable frequency (2.1%; Fig. 7C). We explored whether thisfinding correlated with overexpression ofMETmRNA by incorpo-rating expressiondata fromTCGA glioblastoma samples. Althoughfew TCGA GBM cases exist withMET amplification, amplificationwas associatedwith increased expression (Supplementary Fig. S19).Interestingly, at least one case report has demonstrated clinicalsensitivity to the MET inhibitor, crizotinib, in this disease (47). Asimilar observation was confirmed in AKT1 E17K-mutant colorec-tal adenocarcinomas (Fig. 7D). While this mutation has beenobserved previously in this disease, frequencies varied from 0%–
8.2% (14, 48, 49). Our data confirm the rare occurrence of thisalteration in approximately 1.0% of routine CRC samples. Inter-estingly, in contrast to aprevious study (50),weobserveAKT1E17Kmutant colorectal samples to be enriched for KRAS alterations(P ¼ 0.02) but not BRAF alterations (Supplementary Fig. S20).
In addition to potentially targetable alterations that occur acrossdiseases,we alsoobserved strikingpatterns of disease specificity forcertain alterations. EGFRvIII rearrangements and extracellular acti-vating mutations were found almost exclusively in glioblastomaswhile activating indels in EGFR occurred almost exclusively inNSCLCs. Although we also saw an appreciable rate of EGFR indelsin unknown primary adenocarcinomas, it is likely that theserepresent NSCLC samples for which incomplete pathology infor-mationwas available (J. Ross, personal communication). A similartrend was also observed for ROS1 rearrangements. These events
were observed primarily in NSCLC and a small proportion ofglioblastomas. Both of these rates are consistent with publishedreports, and suggest that ROS1 fusions show tissue specificity.
While many oncogenic alterations cluster within diseases andare targetable directly, some inhibitors rely on the status of TSGs asbiomarkers of response. For example, deleterious alterations inBRCA1/2 are associated with sensitivity to PARP inhibitors (51),andmultiple trials require intact p53 (TP53) as enrollment criteria(NCT01760525, NCT02143635, NCT02264613). Therefore, weinvestigated patterns of TSG alterations. Unsupervised clusteringof alterations within a curated list of TSGs (Supplementary TableS7) identified unique patterns of inactivation across solid tumors(Supplementary Fig. S21). Genes including TP53 and CDKN2A/Bwere altered uniformly across multiple solid tumors. In contrast,other TSGs displayed disease-specific clustering, such asAPC altera-tions within GI cancers. Multiple novel disease-gene associationswere also present. For example, alterations in BCOR, NOTCH1,KDM6A, CREBBP, and KMT2D clustered primarily in ACCs. Col-lectively, thesedata suggest patterns ofTSG inactivation thatmaybedisease specific, similar to patterns for some oncogenes. Furtherresearch to understand this tissue selectivity is warranted.
DiscussionWe describe herein a dataset of 18,004 unique adult solid
tumors that underwent genomic profiling as part of routineclinical care. This collection represents "real world" specimensthat were not selected for any features prior to sequencing. Thedataset was composed of 162 disease subtypes, including manyrare and unusual tumors not included previously as part of larger
ERBB2
GEJ Ade
no.
Saliva
ry gla
nd ca
rc. (n
os)
Cervix
aden
o.
Gallbla
dder
aden
o.
Blad
der u
rothe
lial tr
asitio
nal c
ell ca
rc.
Duode
num ad
eno.
Breast
IDC
Uter
us en
dometr
ial pa
pillar
y sero
us ca
rc.
Stomac
h ade
no. (n
os)
Breast
carc.
(nos
)
Ovary
clear
cell c
arc.
Bile du
ct ad
eno.
Kidney
uroth
elial
carc.
Uterus
endo
metrial
aden
o (no
s)
Breast
ILC
% In
divi
dual
s
05
1015
2025
CDK4/6 amplification
GBM
GEJ Ade
no.
Gallbla
dder
aden
o.
Small in
tensti
ne ad
eno.
Adrena
l glan
d cort
ical c
arc.
Stomac
h ade
noca
rc. (n
os)
Oligod
endro
gliom
a
Lung
aden
o.
Saliva
ry gla
nd ca
rc. (n
os)
Breast
metapla
stic c
arc.
Lung
NSCLC
(nos
)
Skin m
elano
ma
Astroc
ytoma
Unkno
wn prim
ary ad
eno.
Unkno
wn prim
ary m
elano
ma
% In
divi
dual
s0
510
1520
25
MET
Lung
NSCLC
(nos
)
Stomac
h ade
no. d
iffuse
type
Stomac
h ade
no. (n
os)
GEJ Ade
no.
Lun
g larg
e cell
neuro
endo
crine
Lung
aden
o
Unkno
wn prim
ary ad
eno.
Liver
HCC
Liver
chola
ngioc
arcino
ma
Uter
us en
dometr
ial pa
pillar
y sero
us
Ovary
epith
elial
carc.
(nos
)
Lung
SCC
GBM
Unkno
wn prim
ary
Kidney
clea
r cell
carc.
% In
divi
dual
s0
24
68
10
AKT1 (E17K)
Breast
ILC
Uterus
endo
metrial
endo
metrioi
d ade
no
Uterus
endo
metrial
aden
o (no
s)
Breast
carc.
(nos
)
Breast
IDC
Unkno
wn prim
ary S
CC
Prostat
e acin
ar ad
eno.
HNSCC
Colon a
deno
. (CRC)
Blad
der u
rothe
lial tr
ansit
ional
cell c
arc
Stomac
h ade
no. (n
os)
Lung
SCC
Rectum
aden
o. (C
RC)
Lun
g small
cell u
ndiffe
rentia
ted ca
rc.
Unkno
wn prim
ary ca
rc.
% In
divi
dual
s
02
46
810
SV CN Both
A B
C D
Figure 7.
Spectrum of druggable alterationsacross cancers. Distribution ofalterations across the top 15 diseasetypes for ERBB2/HER2 (A),amplifications in CDK4/6 (B),amplifications in MET (C), andactivating mutations in AKT1 (D).
Hartmaier et al.
Cancer Res; 77(9) May 1, 2017 Cancer Research2472
on July 11, 2020. © 2017 American Association for Cancer Research. cancerres.aacrjournals.org Downloaded from
Published OnlineFirst February 24, 2017; DOI: 10.1158/0008-5472.CAN-16-2479
sequencing efforts. In addition, common tumors (e.g., breast,lung, and colon) are represented by thousands of samples,enabling robust statistical analyses as well as validation of rarevariants. Comparison of alteration frequencies to TCGA, wherepossible, identified some significant differences, mostly in CNfrequencies. Detailed examination suggests that both technical(i.e., platform, annotation) and sample differences underliebetween the discrepancies between the two datasets.
We also observed an enrichment of treatment refractory sam-ples in FM breast and lung cancer cohorts based on an increasedfrequency of alterations associated with acquired resistance totargeted therapies in these diseases. To exemplify novelty withinthe FM dataset, we surveyed the genomic landscape of rarediseases and identified NOTCH1 alterations in ACCs at a higherfrequency compared with previous studies. We also identifiedmultiple potentially druggable novel kinase fusions as well asknown fusions in diseases beyond those in which they arecurrently recognized. Analysis of VUSes identified a clinicallysignificant enrichment of SMAD4 alterations in colon cancer, aswell asmultiple other rare alterations predicted to have functionalimpact. A survey of clinically relevant alterations highlighted thespectrum ofmolecular changes for which testing is recommendedas well as opportunities for expansion of approved targetedtherapies. Clustering of alterations in TSGs revealed patterns ofdisease specificity for certain genes, but broad inactivation ofothers. This dataset is rich with discovery potential and presents anew resource in which to investigate rare alterations and diseases,validate clinical relevance, and identify novel therapeutic targets.
To our knowledge, this dataset represents the largest collectionof tumors to date profiled on a single uniform platform. The highunique sequencing coverage (>600�) across all targets enablesaccurate detection of all classes of genomic variants, even inimpure clinical specimens. Previous validation has optimizedthis assay for sensitive and specific detection for all classes ofvariants down to lowmutant allele frequencies (11). The sampleswithin this dataset lack sequencing of patient-matched normaltissue, but multiple steps have been taken to enrich for significantcancer-associated variants (see Supplementary Methods). Theseinclude inclusion of (i) all truncation events in TSGs, (ii) knownpathogenic germline events, and (iii) uncharacterized alterationsreported previously in cancer (Supplementary Fig. S2). To min-imize the number of benign germline variants, those variants notmeeting the criteria above were filtered through online databases(ExAc and 1000Genomes) to remove events recognized currentlyas benignpolymorphisms. Collectively, the uniformity of the dataand the stringent filtering to enrich for cancer-associated altera-tions facilitate comparisons and enhance the discovery potentialfor variants contributing to tumorigenesis.
The dataset can be used by basic researchers to identify novelfindings for validation and to validate previous observations,especially those involving rare diseases and rare variants. Ourpreliminary analyses exemplify how hypothesis-generating dis-coveries within this cohort can be integrated with existing data.For example, the identification of novel SMAD4 genomic altera-tions in colorectal adenocarcinomas was expanded using theTCGA cohort to investigate survival differences among patients.In addition, MET amplifications in glioblastoma were shown tocorrelate with mRNA overexpression of this target in TCGA, afinding thatmay have been unappreciated in the past due to smalldatasets and the rarity of the event. A data collection of this sizealso allows for pan-cancer analyses to better understand tissue-
specific patterns of alterations, such as TSG inactivation. Thesefindings can be used to plan thoughtful functional follow-upexperiments. This resource also has applicability to clinical oncol-ogy and drug development. We highlight the spectrum of clini-cally relevant molecular markers and show that a wide variety ofalterations exist in targets for which established routine clinicaltests exist. This resource can also be used to explore opportunitiesfor drug expansion with new or approved targeted agents forwhom biomarkers of therapeutic sensitivity or resistance areknown. For example, we highlight multiple alterations, includingMET amplification, ERBB2 amplification, and activating pointmutations, and amplification of CDK4/6, that occur across mul-tiple diseases. In contrast, we observed that ALK rearrangementsand EGFR activating alterations are confined to specific diseasesand very rarely observed outside of those tissues.
The lack of clinical data is a limitation of this dataset. Infor-mation about exposure to previous treatments, survival, andresponse rates was unavailable. As these samples were profiledon a clinical platform, andnot as part of a research study, genomicinformation is only available for those 295 genomic targetsdeemed to have clinical relevance today. While the role of geno-mic changes in cancer development and treatment response iswell studied, it is likely that other changes in methylation,expression, and noncoding DNA regions may have implicationsand would not be captured within this dataset.
The National Cancer Moonshot Initiative has emphasizedthat data sharing is essential to accelerate progress in oncology.Academic, private, and public sectors have an obligation topatients, researchers, and clinicians to share data, knowledge,and insight across the field. Large-scale sequencing projectshave profiled many common tumors but often lack robustsample numbers for rare diseases and variants. The publicavailability of large genomic datasets, such as the one describedherein, enables the broad use of this data across multipledisciplines, and is designed to remove barriers to progress. Theinsights gleaned from this data release will be instrumental inaccelerating research and development efforts for targetedagents and immunotherapies.
Disclosure of Potential Conflicts of InterestR.J. Hartmaier has received speakers' bureau honoraria from Bio-Rad Lab-
oratories and has ownership interest in Foundation Medicine. L. Albacker hasownership interest in FoundationMedicine. J. Chmielecki is a current employeeat Foundation Medicine. M. Bailey has ownership interest in FoundationMedicine, Inc. J. He is a senior manager at Foundation Medicine. M. Goldbergis a clinical data analyst andhas ownership interest in FoundationMedicine, Inc.J.A. Elvin has ownership interest in Foundation Medicine. G.M. Framptonreports receiving a commercial research grant and has ownership interest inFoundation Medicine. J.S. Ross is a medical director at Foundation Medicineand has ownership interest in Foundation Medicine. V. Miller is a chief medicalofficer at Foundation Medicine, Inc. P.J. Stephens is a chief scientific officer atand has ownership interest in Foundation Medicine. D. Lipson is a vicepresident and has ownership interest in Foundation Medicine. No potentialconflicts of interest were disclosed by the other authors.
Authors' ContributionsConception and design: R.J. Hartmaier, L.A. Albacker, J. Chmielecki, J. He,J.S. Ross, P.J. Stephens, D. LipsonDevelopment of methodology: R.J. Hartmaier, L.A. Albacker, J. He, M.E.Goldberg, G.M. Frampton, J.S. Ross, D. LipsonAcquisition of data (provided animals, acquired and managed patients,provided facilities, etc.): J.He, S. Ramkissoon, J. Suh, J.A. Elvin,G.M. Frampton,J.S. Ross
Public Release of Adult Cancer Genomic Data
www.aacrjournals.org Cancer Res; 77(9) May 1, 2017 2473
on July 11, 2020. © 2017 American Association for Cancer Research. cancerres.aacrjournals.org Downloaded from
Published OnlineFirst February 24, 2017; DOI: 10.1158/0008-5472.CAN-16-2479
Analysis and interpretation of data (e.g., statistical analysis, biostatistics,computational analysis):R. J.Hartmaier, L.A. Albacker, J. Chmielecki,M. Bailey,J. He, M.E. Goldberg, S. Ramkissoon, J.A. Elvin, S. Chiacchia, G.M. Frampton,J.S. Ross, V. Miller, D. LipsonWriting, review, and/or revision of the manuscript: R.J. Hartmaier, L.A.Albacker, J. Chmielecki, J. He, M.E. Goldberg, J. Suh, G.M. Frampton, J.S. Ross,V. Miller, P.J. Stephens, D. LipsonAdministrative, technical, or material support (i.e., reporting or organizingdata, constructing databases): R.J. Hartmaier, J. He, M.E. Goldberg, S. Ram-kissoon, J.A. Elvin, G.M. FramptonStudy supervision: J. Chmielecki, D. Lipson
AcknowledgmentsWe thank Susan Hager, Allen Nunnally, and Shannon Roberts (Foundation
Medicine) for their contributions to this project.The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby markedadvertisement in accordance with 18 U.S.C. Section 1734 solely to indicatethis fact.
Received September 13, 2016; revised October 24, 2016; accepted February16, 2017; published OnlineFirst February 24, 2017.
References1. Cancer Genome Atlas Network. Comprehensive molecular portraits of
human breast tumours. Nature 2012;490:61–70.2. The Cancer Genome Atlas Research Network. Comprehensive molecular
characterization of gastric adenocarcinoma. Nature 2014;513:202–9.3. The Cancer Genome Atlas Research Network. Comprehensive, integrative
genomic analysis of diffuse lower-grade gliomas. N Engl J Med 2015;372:2481–98.
4. Pereira S, Gibbs RA, McGuire AL. Open access data sharing in genomicresearch. Genes 2014;5:739–47.
5. Brennan CW, Verhaak RGW, McKenna A, Campos B, Noushmehr H,Salama SR, et al. The somatic genomic landscape of glioblastoma. Cell2013;155:462–77.
6. Verhaak RGW, Hoadley KA, Purdom E, Wang V, Qi Y, WilkersonMD, et al.Integrated genomic analysis identifies clinically relevant subtypes of glio-blastoma characterized by abnormalities in PDGFRA, IDH1, EGFR, andNF1. Cancer Cell 2010;17:98–110.
7. EttingerDS,WoodDE,Chair FredV, AisnerDL, Bauman J,Chirieac LR, et al.NCCN Guidelines Version 3.2017 Panel Members Non-Small Cell LungCancer-Farber/BrighamandWomen's CancerCenter. FortWashington, PA:National Comprehensive Cancer Network; 2016[cited 2016 Nov 21].Available from: https://www.nccn.org/professionals/physician_gls/pdf/nscl.pdf.
8. Rizvi NA, Hellmann MD, Snyder A, Kvistborg P, Makarov V, Havel JJ, et al.Mutational landscape determines sensitivity to PD-1 blockade in non-small cell lung cancer. Science 2015;348:124–8.
9. Catenacci DVT. Next-generation clinical trials: Novel strategies to addressthe challenge of tumor molecular heterogeneity. Mol Oncol 2015;9:967–96.
10. KrisMG, Johnson BE, Berry LD, KwiatkowskiDJ, Iafrate AJ,Wistuba II, et al.Using multiplexed assays of oncogenic drivers in lung cancers to selecttargeted drugs. JAMA 2014;311:1998–2006.
11. Frampton GM, Fichtenholtz A, Otto GA, Wang K, Downing SR, He J, et al.Development and validation of a clinical cancer genomic profiling testbased on massively parallel DNA sequencing. Nat Biotechnol 2013;31:1023–31.
12. Reva B, Antipin Y, Sander C. Predicting the functional impact of proteinmutations: application to cancer genomics. Nucleic Acids Res 2011;39:e118.
13. Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P,et al. A method and server for predicting damaging missense mutations.Nat Methods 2010;7:248–9.
14. The Cancer Genome Atlas Network. Comprehensive molecular charac-terization of human colon and rectal cancer. Nature 2012;487:330–7.
15. Chang MT, Asthana S, Gao SP, Lee BH, Chapman JS, Kandoth C, et al.Identifying recurrent mutations in cancer reveals widespread lineage diver-sity and mutational specificity. Nat Biotechnol 2016;34:155–63.
16. Fleming NI, Jorissen RN, Mouradov D, Christie M, Sakthianandeswaren A,Palmieri M, et al. SMAD2, SMAD3 and SMAD4 mutations in colorectalcancer. Cancer Res 2013;73:725–35.
17. Morris LGT, Chandramohan R, West L, Zehir A, Chakravarty D, Pfister DG,et al. The molecular landscape of recurrent and metastatic head and neckcancers. JAMA Oncol. 2016 Jul 21. [Epub ahead of print].
18. Weng AP, Ferrando AA, LeeW,Morris JP, Silverman LB, Sanchez-Irizarry C,et al. Activating mutations of NOTCH1 in human T cell acute lympho-blastic leukemia. Science 2004;306:269–71.
19. Ho AS, Kannan K, Roy DM, Morris LGT, Ganly I, Katabi N, et al. Themutational landscape of adenoid cystic carcinoma. Nat Genet2013;45:791–8.
20. Stephens PJ, Davies HR, Mitani Y, Van Loo P, Shlien A, Tarpey PS, et al.Whole exome sequencing of adenoid cystic carcinoma. J Clin Invest2013;123:2965–8.
21. Ross JS, Wang K, Rand JV, Sheehan CE, Jennings TA, Al-Rohil RN, et al.Comprehensive genomic profiling of relapsed and metastatic adenoidcystic carcinomas by next-generation sequencing reveals potential newroutes to targeted therapies. Am J Surg Pathol 2014;38:235–8.
22. Olsauskas-Kuprys R, Zlobin A, Osipo C. Gamma secretase inhibitors ofNotch signaling. Onco Targets Ther 2013;6:943–55.
23. Damm F, Chesnais V, Nagata Y, Yoshida K, Scourzic L, Okuno Y, et al.BCOR and BCORL1 mutations in myelodysplastic syndromes and relateddisorders. Blood 2013;122:3169–77.
24. Shern JF, Chen L, Chmielecki J, Wei JS, Patidar R, Rosenberg M, et al.Comprehensive genomic analysis of rhabdomyosarcoma reveals a land-scape of alterations affecting a common genetic axis in fusion-positive andfusion-negative tumors. Cancer Discov 2014;4:216–31.
25. Bridge JA, Kanamori M,Ma Z, Pickering D, Hill DA, Lydiatt W, et al. Fusionof the ALK gene to the clathrin heavy chain gene, CLTC, in inflammatorymyofibroblastic tumor. Am J Pathol 2001;159:411–5.
26. Kelly LM, Barila G, Liu P, Evdokimova VN, Trivedi S, Panebianco F, et al.Identification of the transforming STRN-ALK fusion as a potential thera-peutic target in the aggressive forms of thyroid cancer. Proc Natl Acad Sci US A 2014;111:4233–8.
27. Thomas GA, Bunnell H, Cook HA, Williams ED, Nerovnya A, CherstvoyED, et al. High prevalence of RET/PTC rearrangements in Ukrainian andBelarussian post-Chernobyl thyroid papillary carcinomas: a strong corre-lationbetweenRET/PTC3 and the solid-follicular variant. J Clin EndocrinolMetab 1999;84:4232–8.
28. FalchookGS,OrdonezNG,BastidaCC,StephensPJ,MillerVA,GaidoL,etal.Effect of the RET inhibitor vandetanib in a patient with RET fusion-positivemetastatic non-small-cell lung cancer. J Clin Oncol 2016;34:e141–4.
29. Ross JS, Wang K, Chmielecki J, Gay L, Johnson A, Chudnovsky J, et al. Thedistribution of BRAF gene fusions in solid tumors and response to targetedtherapy. Int J Cancer 2016;138:881–90.
30. Botton T, Yeh I, Nelson T, Vemula SS, Sparatta A, Garrido MC, et al.Recurrent BRAF kinase fusions inmelanocytic tumors offer an opportunityfor targeted therapy. Pigment Cell Melanoma Res 2013;26:845–51.
31. Ou SH, Chalmers ZR, Azada MC, Ross JS, Stephens PJ, Ali SM, et al.Identification of a novel TMEM106B-ROS1 fusion variant in lung adeno-carcinoma by comprehensive genomic profiling. Lung Cancer 2015;88:352–4.
32. Charest A, Lane K,McMahon K, Park J, Preisinger E, ConroyH, et al. Fusionof FIG to the receptor tyrosine kinase ROS in a glioblastoma with aninterstitial del(6)(q21q21). Genes Chromosom Cancer 2003;37:58–71.
33. Suehara Y, Arcila M,Wang L, Hasanovic A, AngD, Ito T, et al. Identificationof KIF5B-RET and GOPC-ROS1 fusions in lung adenocarcinomas througha comprehensive mRNA-based screen for tyrosine kinase fusions. ClinCancer Res 2012;18:6599–608.
34. Peraldo Neia C, Cavalloni G, Balsamo A, Venesio T, Napoli F, Sassi F, et al.Screening for the FIG-ROS1 fusion in biliary tract carcinomas by nestedPCR. Genes Chromosomes Cancer 2014;53:1033–40.
35. Gotlib J, Cools J, Malone JM, Schrier SL, Gilliland DG, Coutr�e SE. TheFIP1L1-PDGFRalpha fusion tyrosine kinase in hypereosinophilic
Hartmaier et al.
Cancer Res; 77(9) May 1, 2017 Cancer Research2474
on July 11, 2020. © 2017 American Association for Cancer Research. cancerres.aacrjournals.org Downloaded from
Published OnlineFirst February 24, 2017; DOI: 10.1158/0008-5472.CAN-16-2479
syndrome and chronic eosinophilic leukemia: implications for diagnosis,classification, and management. Blood 2004;103:2879–91.
36. Chmielecki J, Ross JS, Wang K, Frampton GM, Palmer GA, Ali SM, et al.Oncogenic alterations in ERBB2/HER2 represent potential therapeutictargets across tumors from diverse anatomic sites of origin. Oncologist2015;20:7–12.
37. Schrock AB, FramptonGM,HerndonD, Greenbowe JR,Wang K, LipsonD,et al. Comprehensive genomic profiling identifies frequent drug-sensitiveEGFR exon 19 deletions in NSCLC not identified by prior moleculartesting. Clin Cancer Res 2016;22:3281–5.
38. Ye K, Wang J, Jayasinghe R, Lameijer EW, McMichael JF, Ning J, et al.Systematic discovery of complex insertions and deletions in humancancers. Nat Med 2016;22:97–104.
39. Allegra CJ, Rumble RB, Hamilton SR, Mangu PB, Roach N, Hantel A, et al.Extended RAS gene mutation testing in metastatic colorectal carcinoma topredict response to anti-epidermal growth factor receptor monoclonalantibody therapy: American Society of Clinical Oncology ProvisionalClinical Opinion Update 2015. J Clin Oncol 2016;34:179–85.
40. Tsukuda K, Tanino M, Soga H, Shimizu N, Shimizu K. A novel activatingmutation of the K-ras gene in human primary colon adenocarcinoma.Biochem Biophys Res Commun 2000;278:653–8.
41. Gremer L, Merbitz-Zahradnik T, Dvorsky R, Cirstea IC, Kratz CP, ZenkerM,et al. Germline KRAS mutations cause aberrant biochemical and physicalproperties leading to developmental disorders. Hum Mutat 2011;32:33–43.
42. Tebbutt N, Pedersen MW, Johns TG. Targeting the ERBB family in cancer:couples therapy. Nat Rev Cancer 2013;13:663–73.
43. Ojesina AI, Lichtenstein L, Freeman SS, Pedamallu CS, Imaz-Rosshandler I,Pugh TJ, et al. Landscape of genomic alterations in cervical carcinomas.Nature 2014;506:371–5.
44. DicksonMA, TapWD, KeohanML, D'Angelo SP, GounderMM, AntonescuCR, et al. Phase II Trial of the CDK4 inhibitor PD0332991 in patients withadvanced CDK4-amplified well-differentiated or dedifferentiated liposar-coma. J Clin Oncol 2013;31:2024–8.
45. Cadoo KA, Gucalp A, Traina TA. Palbociclib: an evidence-based review ofits potential in the treatment of breast cancer. Breast Cancer 2014;6:123–33.
46. Ciznadija D, Liu Y, Pyonteck SM, Holland EC, Koff A. Cyclin D1 and Cdk4mediate development of neurologically destructive oligodendroglioma.Cancer Res 2011;71:6174–83.
47. Chi AS, Batchelor TT, Kwak EL, Clark JW,Wang DL,Wilner KD, et al. Rapidradiographic and clinical improvement after treatment of aMET-amplifiedrecurrent glioblastomawith amesenchymal-epithelial transition inhibitor.J Clin Oncol 2012;30:e30–3.
48. Kim MS, Jeong EG, Yoo NJ, Lee SH. Mutational analysis of oncogenic AKTE17Kmutation in common solid cancers and acute leukaemias. Br J Cancer2008;98:1533–5.
49. Malapelle U, Pisapia P, Sgariglia R, Vigliar E, Biglietto M, Carlomagno C,et al. Less frequently mutated genes in colorectal cancer: evidences fromnext-generation sequencing of 653 routine cases. J Clin Pathol 2016;69:767–71.
50. Hechtman JF, Sadowska J, Huse JT, Borsu L, Yaeger R, Shia J, et al. AKT1E17K in colorectal carcinoma is associatedwith BRAFV600Ebut notMSI-Hstatus: a clinicopathologic comparison to PIK3CA helical and kinasedomain mutants. Mol Cancer Res 2015;13:1003–8.
51. Ledermann J, Harter P, Gourley C, Friedlander M, Vergote I, Rustin G, et al.Olaparib maintenance therapy in patients with platinum-sensitiverelapsed serous ovarian cancer: a preplanned retrospective analysis ofoutcomes by BRCA status in a randomised phase 2 trial. Lancet Oncol2014;15:852–61.
www.aacrjournals.org Cancer Res; 77(9) May 1, 2017 2475
Public Release of Adult Cancer Genomic Data
on July 11, 2020. © 2017 American Association for Cancer Research. cancerres.aacrjournals.org Downloaded from
Published OnlineFirst February 24, 2017; DOI: 10.1158/0008-5472.CAN-16-2479
2017;77:2464-2475. Published OnlineFirst February 24, 2017.Cancer Res Ryan J. Hartmaier, Lee A. Albacker, Juliann Chmielecki, et al. Novel Insights into Cancer PathogenesisHigh-Throughput Genomic Profiling of Adult Solid Tumors Reveals
Updated version
10.1158/0008-5472.CAN-16-2479doi:
Access the most recent version of this article at:
Material
Supplementary
http://cancerres.aacrjournals.org/content/suppl/2017/02/24/0008-5472.CAN-16-2479.DC1
Access the most recent supplemental material at:
Cited articles
http://cancerres.aacrjournals.org/content/77/9/2464.full#ref-list-1
This article cites 49 articles, 17 of which you can access for free at:
Citing articles
http://cancerres.aacrjournals.org/content/77/9/2464.full#related-urls
This article has been cited by 7 HighWire-hosted articles. Access the articles at:
E-mail alerts related to this article or journal.Sign up to receive free email-alerts
Subscriptions
Reprints and
To order reprints of this article or to subscribe to the journal, contact the AACR Publications Department at
Permissions
Rightslink site. Click on "Request Permissions" which will take you to the Copyright Clearance Center's (CCC)
.http://cancerres.aacrjournals.org/content/77/9/2464To request permission to re-use all or part of this article, use this link
on July 11, 2020. © 2017 American Association for Cancer Research. cancerres.aacrjournals.org Downloaded from
Published OnlineFirst February 24, 2017; DOI: 10.1158/0008-5472.CAN-16-2479