CCR Focus

Mutational Signatures in Breast Cancer: The Problem at the DNA Level

Serena Nik-Zainal1,2 and Sandro Morganella1

Abstract

A breast cancer genome is a record of the historic mutagenic activity that has occurred throughout the development of the tumor. Indeed, every mutation may be informative. Although driver mutations were the main focus of cancer research for a long time, passenger mutational signatures, the imprints of DNA damage and DNA repair processes that have been operative during tumorigenesis, are also biologically illuminating. This review is a chronicle of how the concept of mutational signatures arose and brings the reader up-to-date on this field, particularly in breast cancer. Mutational signatures have now been advanced to include mutational processes that involve rearrangements, and novel cancer biological insights have been gained through studying these in great detail. Furthermore, there are efforts to take this field into the clinical sphere. If validated, mutational signatures could thus form an additional weapon in the arsenal of cancer precision diagnostics and therapeutic stratification in the modern war against cancer. Clin Cancer Res; 23(11); 2617–29. ©2017 AACR.

See all articles in this CCR Focus section, "Breast Cancer Research: From Base Pairs to Populations."

Introduction: Breast Cancer Genomics—Access All Areas

The central tenet of cancer research has for decades been the identification of somatic driver mutations that are causally implicated in tumorigenesis (1). Thus, a host of breast cancer driver events are now known (2), including copy number aberrations (3–8), such as the ERBB2 and CCND1 amplification loci and homozygous deletions of CDKN2A/B and PTEN, and high-frequency substitution and insertion/deletion (indel) driver mutations in cancer genes like TP53 (frequency 53%), PIK3CA (8%–26%), CDH1 (21%), AKT1 (8%), and GATA3 (4%; refs. 9–12). Separately, extensive germline exploration has led to documentation of rare, high-penetrance (BRCA1, BRCA2, TP53; refs. 13, 14), moderate-penetrance (PTEN, STK11, CDH1, ATM, CHEK2, BRIP1, PALB2; refs. 15–19), and common, low-penetrance risk alleles (20–24) for developing breast cancer (25). Essentially, enormous efforts have been placed on breast cancer classification based on somatic and germline mutation information, histopathologic markers, copy number, and expression profiles (9, 26, 27), all aimed at improving diagnostic, prognostic, and therapeutic stratification.

When massively parallel sequencing arrived in the late 2000s (28), the increase in the speed of sequencing was of orders of magnitude, permitting access to large swathes of the human genome not previously accessible at a reasonable cost. In a striking testament to this technology, five back-to-back breast cancer articles were published in 2012 (9–12), providing a thorough view of the molecular foundations of breast cancer and saturating driver discovery in coding sequences (29). Quite apart from the mere handful of driver mutations present in each tumor, modern sequencing technologies enabled us to access the many thousands of passenger mutations present in each cancer as well. Herein lies a significant realization: passenger mutations are not simply random manifestations or mutational debris; they represent the scars of biological processes that have gone awry during cancer development and are, therefore, a rich historical record of tumorigenesis (30).

Mutational Signatures: Making Sense of the Mayhem

The following model was previously proposed: At the point of a patient's cancer diagnosis, the set of somatic mutations revealed through sequencing of the tumor is the aggregate outcome of one or more mutational processes (30–32). Each process, defined by the mechanisms of DNA damage and DNA repair that constitute it, leaves a characteristic imprint or mutational signature on the cancer genome (Fig. 1). The final catalog of mutations is also determined by the intensity and duration of exposure to each mutational process (Fig. 1). Some may be weak or moderate in their intensity, whereas others may be very strong in their assertion. In addition, some exposures may be ongoing through the entire lifetime of the patient, even preceding the formation of the cancer, and some may commence late or become dominant later in tumorigenesis (Fig. 1). Furthermore, cancers comprise subclonal populations, which may be variably exposed to each mutational process (33, 34), promoting complexity of the final landscape of somatic mutations in a cancer genome.

1Wellcome Trust Sanger Institute, Hinxton Genome Campus, Cambridge, United Kingdom. 2East Anglian Medical Genetics Service, Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom.

Note: Supplementary data for this article are available at Clinical Cancer Research Online (http://clincancerres.aacrjournals.org/).

Corresponding Author: Serena Nik-Zainal, Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom. Phone: 0044-1223-834244; E-mail: [email protected]

doi: 10.1158/1078-0432.CCR-16-2810

©2017 American Association for Cancer Research.

Downloaded from clincancerres.aacrjournals.org on November 11, 2020.

Base Substitution Mutational Signatures in Breast Cancer

In 2012, the 183,016 substitutions present in 21 whole breast cancer genomes were used in a proof-of-principle exercise to demonstrate the existence of mutational signatures (30, 33). Critically, sequence context immediately 5′ and 3′ to each mutated base was taken into consideration in classifying each substitution. As there are six classes of base substitution and 16 possible sequence contexts for each mutated base (A, C, G, or T at the 5′ base and A, C, G, or T at the 3′ base), there are 96 possible mutated trinucleotides for each tumor. Various mathematical methods were explored and finally, nonnegative matrix factorization was used to extract five substitution signatures present in these tumors (signatures A–E, now known as signatures 1B, 2, 3, 8, and 13; refs. 30, 33; Fig. 2).
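The 96-channel classification described above can be illustrated with a short sketch. This is not the authors' code: the function names, the `A[C>T]G` label format, and the input shape (a reference trinucleotide plus the alternate base at its centre) are assumptions made for the example. Substitutions at a purine are reported on the opposite strand, which is how twelve possible base changes collapse into six classes.

```python
from itertools import product

BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
# Six substitution classes, each named by the pyrimidine of the mutated base pair.
SUBSTITUTION_CLASSES = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]


def build_channels():
    """Return the 96 channel labels, ordered by class, then 5' base, then 3' base."""
    return [
        f"{five}[{sub}]{three}"
        for sub in SUBSTITUTION_CLASSES
        for five, three in product(BASES, repeat=2)
    ]


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def classify(trinucleotide, alt):
    """Map a reference trinucleotide and the alternate centre base to a channel label."""
    ref = trinucleotide[1]
    if ref in "AG":
        # Purine reference: describe the same event on the opposite strand.
        trinucleotide = reverse_complement(trinucleotide)
        alt = COMPLEMENT[alt]
        ref = trinucleotide[1]
    return f"{trinucleotide[0]}[{ref}>{alt}]{trinucleotide[2]}"


def catalog(mutations):
    """Count mutations per channel; `mutations` yields (trinucleotide, alt) pairs."""
    counts = dict.fromkeys(build_channels(), 0)
    for trinucleotide, alt in mutations:
        counts[classify(trinucleotide, alt)] += 1
    return counts
```

A tumor's catalog is then a vector of 96 counts, and stacking such vectors across tumors gives the matrix that a factorization method takes as input.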

Subsequently, a methods article (35) and a landmark article (32) were published where this mathematical approach was applied across 30 cancer types involving 7,042 samples [507 whole-genome sequencing (WGS) and 6,535 whole-exome sequencing (WES)] and revealed 21 substitution signatures altogether (http://cancer.sanger.ac.uk/cosmic/signatures). The number of breast cancers available for analysis had increased considerably to 100 WGS and 800 WES tumors. Reassuringly, the same five substitution signatures that were recognized previously were consistently identified in this larger dataset (30, 32), reinforcing conviction in the concept of mutational signatures and in the methods applied to extract them.

[Figure 1 schematic: timeline with x-axis "Time (years)" from birth (0, 15, 30, 45, 60, 75). Panel labels: A, mutational process happening in all normal cells throughout life; B, mutational process due to occupational exposure; C, mutational process occurring as intermittent bursts; D, mutational process due to acquired DNA repair defect. Other labels: "Historic mutational process"; "Ongoing mutational process"; "More likely to appear as 'clonal' mutations"; "May appear as 'clonal' or 'subclonal' mutations"; "Mutations associated with any mutational signature could have been acquired prior to the cell becoming a malignant cell"; "Multiple mutational processes added together".]

Figure 1. Somatic mutational processes in human cancer. Each mutational process leaves a characteristic imprint, or mutational signature, on the cancer genome, comprising DNA damage and DNA repair components. The arrows indicate the duration and intensity of exposure to a specific mutational process. The amount of exposure to each mutational process could vary from one person to another. Mutational processes A, B, C, and D represent hypothetical mutational processes that have occurred through the lifetime of the developing tumor. A could represent a normal mutational process that happens in all our cells (including normal cells), hence it is occurring in a small amount throughout life. B could represent a mutational process caused by an environmental insult, such as an occupational exposure to a carcinogen. C could represent a mutational process which occurs in bursts through tumorigenesis such as intermittent exposure to a chemical or to an intermittent disease process. D could represent the acquisition of a defect in a gene involved in normal DNA repair. The final mutational portrait is a composite of all the mutational processes that have been active over the lifetime of the cancer patient. A different patient could have all of these mutational processes occurring in their tumor or could have some of the same mutational processes as well as other mutational processes present.

In a recent endeavor exploring 560 WGS breast tumors (36), the largest cohort of WGS cancers of a single tissue type to date, a total of 12 substitution signatures were identified from 3,479,652 mutations (Fig. 2A). This may superficially appear to be a substantial surge in signature discovery in breast tumors. On close inspection, many of the new signatures are relatively rare, present in few samples (36). Thus, in a similar paradigm to that of drivers, we have likely saturated the discovery of high-frequency, common mutational signatures in breast cancer. Sequencing further primary breast tumors is unlikely to yield new, major signatures. The increase in power possibly permits disambiguation of closely correlated signatures. Signatures 1 and 5, hitherto classified as signature 1B, were only just separated by this analysis. Many different algorithms are available today for mutation signature extraction (37–41); some may reveal 11 (with signature 1B) or 12 signatures (signatures 1 and 5) from this dataset, or with more relaxation of parameters, even 13 (Supplementary Data). Regardless, that five signatures were consistently seen when as few as 21 samples were studied reveals that these early signatures are robust and common, and report ubiquitously present mutagenic processes in breast cells.
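To make the role of the chosen number of signatures concrete, the following is a minimal sketch of nonnegative matrix factorization using standard multiplicative updates for squared error. It is not any of the published extraction tools cited above: the function names, the fixed iteration count, and the random initialization are assumptions, and real pipelines add steps such as resampling and model selection. The catalog matrix V (channels by samples) is approximated by W (channels by k signatures) times H (k by samples), and k is supplied by the analyst, which is one reason different settings can yield 11, 12, or 13 signatures.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def nmf(v, k, iterations=500, seed=0, eps=1e-9):
    """Factorise nonnegative V (channels x samples) into W (channels x k) and H (k x samples)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(n)]
    return w, h


def normalise_signatures(w, h):
    """Rescale so each signature (column of W) sums to 1; exposures in H absorb the scale."""
    k = len(h)
    totals = [sum(row[j] for row in w) for j in range(k)]
    w = [[row[j] / totals[j] for j in range(k)] for row in w]
    h = [[x * totals[i] for x in h[i]] for i in range(k)]
    return w, h
```

After normalization, each column of W can be read as the probability of a signature generating each channel, and each row of H as the number of mutations attributed to that signature in each sample.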

Of the 12 signatures now documented in breast cancer (ref. 36; Fig. 2A), signature 1B or signatures 1 and 5 are associated with age of diagnosis; signatures 2 and 13 are associated with the activity of the APOBEC cytidine deaminases; signature 3 is associated with BRCA1/BRCA2 deficiency; signature 8 appears to be increased in tumors with BRCA1/BRCA2 deficiency, although also present at lower levels in other tumors; signatures 6, 20, and 26 are associated with mismatch repair deficiency; and signatures 17, 18, and 30 are of unknown etiology (36). Of note, these mutational signatures do not appear to demonstrate specificity to breast cancer subtype whether classified by estrogen receptor (ER) status or other systems such as PAM50 or AIMS.

Most breast tumors have fewer than 20,000 substitutions in total (less than 6.5 mutations per Mb; Fig. 2B). Only a handful of samples have a very large number of mutations (up to 94,000 substitutions; Fig. 2B). Irrespective of mutation burden, the vast majority of samples comprise multiple mutational signatures (36). A subset of samples may be composed predominantly of specific signatures, and may even be overwhelmed by a very large number of mutations from these signatures and termed "hypermutators" (42). This trait is associated with certain mutational processes: signatures 2, 13, 6, 20, 26, and 17 in breast cancers (36). Indeed, some of these signatures (signatures 8, 13, and 17) appear to dominate later in breast tumorigenesis (43), observed latterly in cancer evolution (33, 36) and in metastatic disease (44). Perhaps, in time, these associations will be definitively verified as harbingers of poorer outcomes.
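The conversion between a substitution count and a per-megabase burden is simple arithmetic, sketched below. The genome size of 3,100 Mb is an assumption chosen for illustration (the article does not state the denominator it used), and the function name is hypothetical.

```python
# Assumed size of the sequenced genome in megabases; not stated in the article.
GENOME_SIZE_MB = 3100.0


def mutations_per_mb(substitutions, genome_size_mb=GENOME_SIZE_MB):
    """Convert a whole-genome substitution count into mutations per megabase."""
    return substitutions / genome_size_mb


typical = mutations_per_mb(20_000)  # upper bound quoted for most tumors
extreme = mutations_per_mb(94_000)  # largest burden quoted in the text
```

Under this assumption, 20,000 substitutions corresponds to slightly under 6.5 mutations per Mb, in line with the figure quoted above.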

It was previously observed that substitution signatures had particular relationships with classes of indels (30). Patients with germline BRCA1/BRCA2 mutations exhibited an excess of larger indels (>3 bp) with microhomology present at breakpoint junctions (30). Moreover, tumors with signature 6, 20, or 26, which are associated with mismatch repair deficiency, have a large number of indels at polynucleotide repeat tracts, consistent with a label of microsatellite instability in these cancers (32, 36). Thus, correlations are observable between substitution signatures and crude indel patterns.

Advancing the Frameworks of Mutational Signatures

Mutational processes in human somatic cells are not restricted to producing base substitutions. Indeed, DNA damage and DNA repair processes can generate patterns of indels and large-scale chromosomal aberrations or structural variation as well (31). Thus, the basic premise of mutational signatures was recently extended to structural variation in breast cancer (36).

Genomic instability is a broad concept that encompasses a wide range of chromosomal level abnormalities. Some tumors have a large number of rearrangements (several hundred) that are focused or "clustered" at specific loci reporting driver amplicons (e.g., CCND1, ERBB2) or are simply sites of chromothripsis (45), for example. In contrast, other tumors could have an equivalent number of rearrangements but have them widely distributed throughout the genome instead. Intuitively, different mutational processes are likely to underpin these disparate genomic outcomes (36).

Rearrangements were thus separated according to whether they were clustered or dispersed (Fig. 3A), and then by rearrangement class (tandem duplication, deletion, inversion, or translocation; Fig. 3B) and by size (36). Following this classification, we applied the same mathematical framework, as described previously, and extracted six rearrangement signatures (RS; ref. 36; Fig. 3C). This exercise of defining rearrangement signatures was not simply academic: unsupervised hierarchical clustering yielded seven major subgroups (groups A–G) that exhibited distinct associations with other genomic, histologic, gene expression, and clinical features (ref. 36; Fig. 4).
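The rearrangement classification can be sketched in the same spirit as the substitution channels. The size bins below are taken from the axis labels of Fig. 3C (1–10 kb, 10–100 kb, 100 kb–1 Mb, 1–10 Mb, >10 Mb); everything else is an assumption for illustration, including the function names, the label format, the treatment of events shorter than 1 kb, and the idea that the clustered/nonclustered call arrives as a precomputed flag.

```python
import bisect

SIZED_CLASSES = ("deletion", "tandem_duplication", "inversion")
SIZE_EDGES_BP = (10_000, 100_000, 1_000_000, 10_000_000)
SIZE_LABELS = ("1-10kb", "10-100kb", "100kb-1Mb", "1-10Mb", ">10Mb")


def size_bin(length_bp):
    """Return the size-bin label; events under 1 kb fall in the first bin (assumption)."""
    return SIZE_LABELS[bisect.bisect_right(SIZE_EDGES_BP, length_bp)]


def rearrangement_channel(sv_class, clustered, length_bp=None):
    """Map one rearrangement to a channel label such as 'nonclustered:deletion:1-10kb'."""
    prefix = "clustered" if clustered else "nonclustered"
    if sv_class == "translocation":
        # Translocations join different chromosomes, so they carry no size.
        return f"{prefix}:translocation"
    if sv_class not in SIZED_CLASSES:
        raise ValueError(f"unknown rearrangement class: {sv_class}")
    return f"{prefix}:{sv_class}:{size_bin(length_bp)}"


def all_channels():
    """Enumerate every channel: 2 x (3 classes x 5 size bins + translocation) = 32."""
    channels = []
    for prefix in ("clustered", "nonclustered"):
        for sv_class in SIZED_CLASSES:
            channels.extend(f"{prefix}:{sv_class}:{label}" for label in SIZE_LABELS)
        channels.append(f"{prefix}:translocation")
    return channels
```

Counting rearrangements per channel in each tumor gives a matrix analogous to the 96-channel substitution catalog, which can be passed to the same factorization framework.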

Three of the signatures are featured in homologous recombination (HR)–deficient tumors: RS1, dominated by long (>100 kb) tandem duplications, characterized many HR-deficient tumors but defined group F tumors associated with older age of diagnosis and poorer outcome in this small cohort; RS3, characterized by short (<10 kb) tandem duplications, was specific to BRCA1-mutant tumors (group D); whereas RS5, defined by deletions (<10 kb), was present in BRCA1- and BRCA2-deficient samples and typified group G BRCA2-mutated samples (36). Hence, we were able to differentiate BRCA1- from BRCA2-null tumors, as well as a BRCA-like (but different) cohort with distinct clinical features (36). These diverse groups would have simply been labeled as having "genomic instability" in the past and been indistinguishable (Fig. 4).

Of the remaining rearrangement signatures, RS2, characterized by large (>100 kb) nonclustered deletions, inversions, and interchromosomal translocations, defined group E ER-positive tumors (36). In contrast, RS4 and RS6 were both characterized by clustered rearrangements and were enriched in groups A, B, and C, which were of mixed ER status but frequently had large driver amplicons, for example, ERBB2 and CCND1 (36).

Remarkably, deep analysis of individual rearrangement signatures has unearthed a novel, if somewhat disturbing, biological insight. Very recently, 33 loci were identified as sites that are rearranged by long RS1 tandem duplications more frequently than expected in independent tumors from different patients, even if by only a single tandem duplication (46). Interestingly, these hotspots are enriched for breast cancer germline susceptibility loci, breast-specific super-enhancer regulatory elements, and oncogenes (46). These loci have high transcriptional activity in breast tissue and are susceptible to double-strand break (DSB) damage and, following DSB repair, to formation of rearrangements. Yet, not all classes of rearrangements are represented at these sites: only long RS1 tandem duplications are. It was hypothesized that long tandem duplications are more likely to effectively increase whole copies of these regulatory elements/genes, and that this could confer some degree of secondary selective pressure, even if incrementally (46). Indeed, corroborative transcriptomic evidence was observed to support this postulate, providing a devastating insight into this mutational process of HR deficiency: It may commence as a passenger mutational signature but, unwittingly, creates secondary driver events. RS1 is, therefore, a particularly deleterious genetic mechanism, an injurious mutational signature that perpetuates carcinogenesis (46).

This field of rearrangement signatures may only be in its infancy, but a number of deep messages are appearing, although clinical significance requires further evaluation. An exciting future awaits as the field matures and other tissue types are incorporated into these analyses.

Localized Mutational Signatures

The substitution signatures described thus far report mutagenesis distributed throughout the human genome. Intriguingly, localized mutagenesis has also been reported (30). By calculating an intermutation distance, or the distance from a substitution to the one immediately preceding it in the reference genome, we were able to appreciate focal substitution hypermutation (30). Although most mutations in a cancer genome would exhibit an intermutation distance of approximately 10^5 bp to approximately 10^6 bp, localized regions of hypermutation or "kataegis" presented as clusters of substitutions with shorter intermutation distances (defined as six or more substitutions with an average intermutation distance of <1,000 bp; refs. 30, 32, 36). These focal mutation showers had striking characteristics: an excess of cytosine mutations at a TpC sequence context and colocalization with a different class of mutation altogether, rearrangements.
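The working definition of kataegis given above (six or more substitutions with an average intermutation distance below 1,000 bp) can be turned into a simple scan. The sketch below is a greedy illustration only, not the segmentation method used in the cited studies; the function names are hypothetical, and it assumes the positions all lie on one chromosome.

```python
def intermutation_distances(positions):
    """Distance from each substitution to the one immediately preceding it."""
    ordered = sorted(positions)
    return [b - a for a, b in zip(ordered, ordered[1:])]


def kataegis_foci(positions, min_mutations=6, max_mean_distance=1000):
    """Greedy scan returning (start, end, n_mutations) for each candidate focus."""
    pos = sorted(positions)
    foci = []
    i = 0
    while i + min_mutations <= len(pos):
        j = i + min_mutations - 1
        # Mean intermutation distance across the smallest admissible window.
        if (pos[j] - pos[i]) / (j - i) >= max_mean_distance:
            i += 1
            continue
        # Extend the window while the mean distance stays under the threshold.
        while j + 1 < len(pos) and (pos[j + 1] - pos[i]) / (j + 1 - i) < max_mean_distance:
            j += 1
        foci.append((pos[i], pos[j], j - i + 1))
        i = j + 1
    return foci
```

Because the criterion is an average, a very dense window could in principle absorb a distant mutation; a production method would need a stricter boundary rule than this sketch applies.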

Kataegis mutations bear a strong resemblance to those of genome-wide signatures 2 and 13, which are associated with APOBEC enzymatic activity (47–49). APOBECs are a family of cytidine deaminases that evolved to restrict retroviruses and retrotransposon elements. APOBECs require single-stranded DNA (ssDNA) as a substrate for deamination of cytosine to uracil. Notably, experimental studies in yeast suggest that DSBs and end resection are a source of ssDNA required for APOBECs to generate kataegis (47). In contrast, alternative cellular processes such as replication or transcription have been hypothesized as a potential fount of ssDNA for APOBEC activity–generating signatures 2 and 13 (31). Thus, although APOBEC enzymes are involved in kataegis and in genome-wide signatures 2 and 13, they are believed to be mechanistically distinct mutational processes likely arising at different instances of cellular stress (Fig. 5).

Interestingly, an alternative form of kataegis was also rarely observed (0.9% of all kataegis foci identified in breast cancer; ref. 36). Also colocalizing with rearrangements, this version of kataegis exhibited a different base substitution pattern of T>G and T>C mutations predominantly at NTT and NTA sequences. The etiology of this form of kataegis is unknown.

Dynamic Cellular Processes and Mutational Signatures

The distribution of somatic mutations is uneven through cancer genomes, has been extensively studied, and has been found to be largely influenced by replication time domains and histone epigenetic marks (50, 51). Predicated on being able to probabilistically assign every mutation in a cancer to a mutational signature, similar analyses have now been performed as mutational signatures (36). Because mutational signatures are proxies for specific biological processes, the advantage of performing these analyses as mutational signatures is that one can interpret the influence of dynamic cellular events, such as replication, transcription, and nucleosome occupancy, on the associated biological processes (ref. 36; Table 1).

For example, one of the most noteworthy insights obtained from this analysis was the degree of asymmetry observed between replication strands for particular signatures. For approximately 100,000 mutations on the leading replicative strand, approximately 140,000 mutations were observed on the lagging strand specifically for APOBEC-related signatures 2 and 13 (36). This level of asymmetry implies that replication has a mechanistic role in the generation of signatures 2 and 13 (Fig. 5). APOBECs demand ssDNA as a deamination substrate, and replication is a perfect physiologic source of ssDNA. Indeed, in 2016, four other publications supported this observation through in vivo (52, 53) and in vitro (54, 55) studies. Replication strand asymmetry was also observed for signature 26 (36), one of the four mutational signatures associated with deficiency of mismatch repair. Had these analyses been performed on all mutations combined, the specific behaviors (Table 1) would not have been appreciable, the signal diluted by aggregation. Thus, these vignettes demonstrate the value of performing analyses as mutational signatures.
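The size of the strand asymmetry quoted above can be put into numbers with a short calculation. The sketch assumes, purely for illustration, a null hypothesis in which a mutation is equally likely to fall on either strand, and uses a normal approximation to the binomial; a real analysis would also account for how much sequence is replicated as leading or lagging strand. The function name is hypothetical.

```python
import math


def strand_asymmetry(leading, lagging):
    """Return (lagging/leading ratio, lagging fraction, z-score against a 50:50 split)."""
    total = leading + lagging
    ratio = lagging / leading
    lagging_fraction = lagging / total
    # Under the assumed null, lagging ~ Binomial(total, 0.5): mean total/2, variance total/4.
    z_score = (lagging - total / 2) / math.sqrt(total / 4)
    return ratio, lagging_fraction, z_score


ratio, lagging_fraction, z_score = strand_asymmetry(100_000, 140_000)
```

With the approximate counts from the text, the ratio is 1.4 and about 58% of these mutations sit on the lagging strand, an imbalance far larger than sampling noise would produce at this mutation count.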

Ultimately, a profound theme has crystallized. Different signatures exhibit different relationships with replication, transcription, and chromatin organization, fortifying how mutational signatures must be true biological phenomena and are not simply theoretical, mathematical constructs.

In the Midst of Chaos, Lies Opportunity

Some mutational signatures are a direct pathophysiologic readout of the abrogation of a DNA repair gene/pathway and could be used as a biomarker to report DNA repair deficiency in a tumor (31, 56). Somatic nullness of a single gene, such as BRCA1, however, does not simply produce one mutational signature; it produces a multitude of mutational patterns (36). On one hand, this complicates an already burdened mutational landscape. Conversely, this could be used to our advantage for potential clinical applications.

Very recently, a supervised Lasso logistic regression model was used to learn the multiple substitution, indel, and rearrangement mutational signatures that distinguish germline BRCA1/BRCA2–mutated cancers from sporadic tumors (57). Six mutational patterns were found to be discriminatory and were weighted to create a mutational signature–based predictor of BRCA1/BRCA2 deficiency called HRDetect (57).
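The general shape of such a predictor can be sketched as follows. This is not HRDetect itself: the feature names, weights, and intercept are invented placeholders, and the functions are hypothetical. The sketch shows two pieces of a Lasso logistic model: the soft-thresholding step that shrinks weak coefficients to exactly zero during fitting (which is how a small set of discriminatory patterns is selected), and the logistic function that turns a weighted sum of the retained patterns into a probability.

```python
import math

# Placeholder names and values for illustration; not the published HRDetect weights.
EXAMPLE_WEIGHTS = {
    "pattern_1": 2.0,
    "pattern_2": 1.5,
    "pattern_3": 1.0,
    "pattern_4": 0.8,
    "pattern_5": 0.6,
    "pattern_6": 0.4,
}
EXAMPLE_INTERCEPT = -3.0


def soft_threshold(value, penalty):
    """Shrink a coefficient toward zero; values within the penalty become exactly zero."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def deficiency_probability(features, weights=EXAMPLE_WEIGHTS, intercept=EXAMPLE_INTERCEPT):
    """Logistic score: weighted sum of (already standardised) features mapped to (0, 1)."""
    linear = intercept + sum(w * features.get(name, 0.0) for name, w in weights.items())
    return 1.0 / (1.0 + math.exp(-linear))
```

A tumor would be scored by supplying its standardised feature values, for example `deficiency_probability({"pattern_1": 1.2, "pattern_3": 0.7})`, and comparing the result with a chosen cutoff.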

Figure 2. Currently known extracted substitution mutational signatures in human breast cancers. A, Table of 12 mutational signatures extracted using nonnegative matrix factorization. Each signature is ordered by mutation class (C>A/G>T, C>G/G>C, C>T/G>A, T>A/A>T, T>C/A>G, T>G/A>C), taking immediate flanking sequence into account, resulting in 96 triplets. For each class, mutations are ordered by 5′ base (A, C, G, T) first, before 3′ base (A, C, G, T). Y-axis reports the probability of a signature generating each of the 96 triplets. Signature extraction was performed separately in 17 cancer types. The bars report the results of the extraction on the 560 breast cancers (37) using a widely available algorithm using simply default parameters (38), and the error bars demonstrate the variability (of the presumptive same signatures) between cancers of different tissue types. The table also contains the associated etiologies of each signature, the prevalence of these signatures in breast cancer, and whether the signature is also seen in other tumor types. HR, homologous recombination. B, Absolute numbers of mutations of each signature in each sample (top) and proportion of each signature in each sample (bottom). Panel B reprinted by permission from Macmillan Publishers Ltd.: Nature 534:47–54, copyright 2016.


[Figure 3 graphic. A, Circos plots (chromosomes 1–22, X, Y) of a tumor with many clustered rearrangements and a tumor with many nonclustered (or dispersed) rearrangements. B, Normal reference DNA from chr A and chr B rearranged into tandem duplication, deletion, inversion, and translocation. C, Rearrangement signatures 1–6, shown for clustered and nonclustered rearrangements; x-axis: Del, Tds, and Inv in size classes 1–10 kb, 10–100 kb, 100 kb–1 Mb, 1 Mb–10 Mb, and >10 Mb, plus Trans; y-axis: 0% to 60%.]


HRDetect outperforms customary copy number–based approaches (refs. 58–60; e.g., HRD index) for detecting BRCA1/BRCA2 deficiency and any individual signature on its own (HRDetect AUC = 0.98). This is unsurprising, as a predictor that hunts for some combination of many signatures would be more sensitive and specific than a predictor that is dependent on only a single signal (57). Thus, HRDetect works extraordinarily well even in situations of reduced mutation information secondary to low tumor cellularity, low sequencing depth (e.g., low-coverage WGS of ~10-fold rather than 30-fold), or increased noise (e.g., in cancer specimens that have artefactual genetic changes arising from formalin fixation; ref. 57). This observation could have immediate potential applications.
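For readers less familiar with the AUC statistic quoted above, the sketch below computes it directly from its definition: the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative sample, with ties counted as one half. The scores and labels are made-up values for illustration and are not taken from the HRDetect study; the function name is likewise hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    # Each positive/negative pair scores 1 if ranked correctly, 0.5 if tied.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# Made-up example: three deficient tumors (label 1) and four proficient (label 0).
labels = [1, 1, 1, 0, 0, 0, 0]
combined_scores = [0.9, 0.8, 0.6, 0.5, 0.3, 0.2, 0.1]  # hypothetical multi-signature score
single_scores = [0.9, 0.4, 0.6, 0.5, 0.3, 0.2, 0.1]    # hypothetical single-signature score

print(auc(combined_scores, labels), auc(single_scores, labels))
```

An AUC of 1.0 means every positive outranks every negative, whereas 0.5 corresponds to chance-level ranking.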

Of particular clinical importance, HRDetect revealed a larger proportion of patients with BRCA1/BRCA2 deficiency than expected, of up to 22%, that is, many more than the 3.9% of germline mutation carriers that were knowingly recruited to the study (57). More than half of these tumors would not have been detected as BRCA1/BRCA2 null using targeted sequencing of these genes alone. BRCA1/BRCA2–null tumors are selectively sensitive to compounds such as PARP inhibitors (61–64), which are currently theoretically reserved for approximately 1% to 5% of the germline mutation carriers. Profoundly, if in fact one in every five breast cancer patients has the equivalent of a BRCA1/BRCA2–null tumor, could they be similarly selectively sensitive to PARP inhibitors? This is unknown, and it is now necessary to embark on experiments and/or clinical trials to seek conclusive evidence. The message to the community is this: We need clinical trials of drugs like PARP inhibitors, which are not restricted to germline mutation carriers, and are applied to sporadic breast and possibly other tumors in the general populace.

Beyond that of driver mutations, mutational signatures could contribute a powerful, additional spoke in the wheel of cancer diagnostics and therapeutic stratification. There is likely to be scope for identifying other pathophysiologic processes with sensitivities to different therapies (e.g., replication stress with WEE1/ATR inhibitors or perhaps stratifying sensitivity to immunotherapies; refs. 65, 66). The academic abstraction of mutational signatures takes a step closer toward the clinic.

Critical Dissection of the Mutational Signatures Concept

Although it is a fast-paced and exciting field, the mutational signatures model does warrant critical scrutiny. No matter how sophisticated the analyses of in vivo mutagenesis, there are limitations to studying tumors: it is an uncontrolled and noisy system, and even the best clinical metadata collections will, at best, provide associations.

First, we acknowledge that the model requires validation

Experiments that show how different signatures can be generated by different exposures will contribute toward reinforcing the concept. The field of environmental mutagenesis (67–71) will argue that historic TP53 and HPRT reporter assays, and experiments exposing mouse embryonic fibroblasts to external exposures (34), such as ultraviolet light and tobacco carcinogens, already provide evidence that exogenous exposures generate mutation patterns that are similar to those observed in human cancers. However, there have been limited efforts to demonstrate similarly clear relationships for endogenous mutational processes. Perhaps, systematic surveys of mutational signatures of DNA-damaging agents and from abrogation of DNA repair genes will be required to truly convince the scientific community that mutational signatures observed in human cancers arise from both external and internal sources of DNA damage and DNA repair. Experimental evidence showing that the amount of exposure (whether to a chemical compound or endogenous exposure) is correlated with the degree of mutagenesis will also help to strengthen conviction in this model. The final demonstration of being able to turn on a signature (through gene knockout) and turn it off again (through reversing the mutation) could definitively authenticate this model.

Second, what is the mathematical rigor of this concept?

The principle of factorizing or reducing a complex, multidimensional dataset into simpler parts is not unusual. Multiple different mathematical methods have been developed for precisely this purpose (37–41). Although showing striking similarity, the results obtained through these different methods are not identical (Supplementary Data, Supplementary Figs. S1–S3). This has raised concerns regarding reproducibility.
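To make the factorization idea concrete, here is a minimal sketch of nonnegative matrix factorization using the classic multiplicative update rules for squared error. It is a toy under stated assumptions, not any of the published extraction frameworks cited above: the matrix sizes, the two "true" signatures, the iteration count, and all function names are hypothetical, and real pipelines add steps such as resampling and choosing the number of signatures. The catalog matrix V (channels × samples) is approximated by W (channels × signatures) times H (signatures × samples).

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def nmf(v, k, iterations=500, seed=0, eps=1e-9):
    """Factorize v (channels x samples) into w (channels x k) and h (k x samples)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n_rows)]
    h = [[rng.random() + eps for _ in range(n_cols)] for _ in range(k)]
    for _ in range(iterations):
        # Update exposures: H <- H * (W^T V) / (W^T W H), element-wise.
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_cols)]
             for i in range(k)]
        # Update signatures: W <- W * (V H^T) / (W H H^T), element-wise.
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n_rows)]
    return w, h

def frobenius_error(v, w, h):
    """Square root of the summed squared differences between v and w*h."""
    approx = matmul(w, h)
    return sum((x - y) ** 2 for rv, ra in zip(v, approx) for x, y in zip(rv, ra)) ** 0.5

# Toy catalog: 4 channels x 5 samples built from two hypothetical signatures.
true_w = [[0.7, 0.0], [0.3, 0.1], [0.0, 0.5], [0.0, 0.4]]
true_h = [[100, 20, 0, 50, 10], [0, 80, 120, 50, 5]]
v = matmul(true_w, true_h)

w, h = nmf(v, k=2)
print("reconstruction error:", frobenius_error(v, w, h))
```

Because the starting point is random and the problem is not convex, different seeds or different algorithms can return somewhat different factors for the same data, which is one source of the non-identical results mentioned above.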

There are signatures that are staunchly similar, for example, the signatures related to the activity of APOBEC enzymes (signatures 2 and 13), which are pervasive, robust across algorithms (Supplementary Data, Supplementary Fig. S3), and undisputed as mutational signatures in human tumors. Related signatures are admittedly sometimes less clearly distinguishable (Supplementary Note, Supplementary Fig. S3). Various post hoc processing methods are reported to be used to tease these apart, and these do result in differences in the final extracted signatures. For example, signatures 6, 20, and 26 (all related to mismatch repair deficiency) are historically more difficult to disentangle because of commonalities in their 96-element profile. That they are more challenging to disambiguate from one another does not mean that they do not exist, of course, and may even reflect biological interactions between them.

Figure 3. Extracting rearrangement mutational signatures in human breast cancers. A, Whole genome Circos plots were adapted from the RCircos package. Features depicted in Circos plots from outermost rings heading inwards: Karyotypic ideogram outermost. Base substitutions next, plotted as rainfall plots (log10 intermutation distance on radial axis, dot colors: blue = C>A; black = C>G; red = C>T; gray = T>A; green = T>C; pink = T>G). Ring with short green lines = insertions; ring with short red lines = deletions. Major copy number allele ring (green = gain), minor copy number allele ring (pink = loss); central lines represent rearrangements (green = tandem duplications; pink = deletions; blue = inversions; and gray = interchromosomal events). Note the difference in the nature of the distribution of rearrangements between the two tumors depicted. The whole genome profile on the left has >300 rearrangements that are clustered at distinct loci in specific chromosomes. In contrast, the >300 rearrangements present in the profile on the right-hand side are uniformly dispersed through the genome. The mutational processes underpinning the differing distributions in these two tumors are most likely to be different. Thus, separating rearrangements into whether they are clustered or dispersed represents a first step in the rearrangement classification. B, Types of rearrangements that can be ascertained easily. The hypothetical pieces of reference DNA from two different chromosomes on the left can be rearranged to form four main classes of rearrangements, as shown on the right. This is a second step in the classification of rearrangements prior to rearrangement signature extraction. The rearrangements are also divided by size before extraction. C, Six rearrangement signatures extracted using nonnegative matrix factorization. Probability of rearrangement element on y-axis. Rearrangement size on x-axis. Chr, chromosome; Del, deletion; Inv, inversion; Tds, tandem duplication; Trans, translocation.
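The three classification steps named in the Figure 3 legend (clustered versus dispersed, rearrangement type, and size) can be sketched as a simple labeling function. This is an illustrative assumption about how such categories might be encoded, not the authors' code: the function and constant names are hypothetical, the clustered flag is supplied by the caller rather than computed, events shorter than 1 kb are placed in the smallest bin, and bin boundaries are treated as inclusive upper limits.

```python
# Upper bound of each size class in base pairs, with its label.
SIZE_BINS = [
    (10_000, "1-10 kb"),
    (100_000, "10-100 kb"),
    (1_000_000, "100 kb-1 Mb"),
    (10_000_000, "1-10 Mb"),
]
REARRANGEMENT_TYPES = {"deletion", "tandem duplication", "inversion", "translocation"}

def classify_rearrangement(kind, size_bp=None, clustered=False):
    """Return (cluster label, type, size class) for one rearrangement."""
    if kind not in REARRANGEMENT_TYPES:
        raise ValueError(f"unknown rearrangement type: {kind}")
    cluster_label = "clustered" if clustered else "nonclustered"
    if kind == "translocation":
        # Interchromosomal events have no meaningful size.
        return (cluster_label, kind, None)
    if size_bp is None:
        raise ValueError("size_bp is required for intrachromosomal rearrangements")
    for upper_bound, label in SIZE_BINS:
        if size_bp <= upper_bound:
            return (cluster_label, kind, label)
    return (cluster_label, kind, ">10 Mb")

print(classify_rearrangement("tandem duplication", 250_000, clustered=True))
print(classify_rearrangement("translocation"))
```

Counting the resulting labels per tumor gives a catalog of rearrangement categories that can be factorized in the same way as the 96-channel substitution catalog.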

The assignment of the amount of each signature present in individual tumors is also a source of variation between algorithms. Invariably, these algorithms assign a small proportion of every signature to every sample examined. This is unlikely to be biologically true, so penalties may be, or have been, introduced to increase the "sparsity" of mutation assignments, resulting in variation in final signature contributions to individual samples. Thus, mutational signature extraction and the assignments of these signatures are currently not fully deterministic. Of course, a balance needs to be struck between precise signature analysis and not overfitting data through post hoc processing.
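One simple way to picture a sparsity step is sketched below: fit nonnegative exposures for all signatures, drop any signature contributing less than a chosen fraction of the total, and refit with the remainder. This is only one of several possible approaches and is not claimed to be what any published tool does; the 5% threshold, the three toy signatures, the catalog, and the function names are all hypothetical. The fit itself is nonnegative least squares solved by projected gradient descent.

```python
def fit_exposures(catalog, signatures, active, iterations=5000):
    """Nonnegative least squares by projected gradient, using only `active` signatures."""
    n_channels = len(catalog)
    # Step size 1/L, where L bounds the largest eigenvalue of S^T S by its trace.
    lipschitz = sum(x * x for j in active for x in signatures[j])
    exposures = {j: sum(catalog) / len(active) for j in active}
    for _ in range(iterations):
        fitted = [sum(exposures[j] * signatures[j][c] for j in active)
                  for c in range(n_channels)]
        residual = [f - obs for f, obs in zip(fitted, catalog)]
        for j in active:
            gradient = sum(r * s for r, s in zip(residual, signatures[j]))
            # Gradient step followed by projection onto the nonnegative orthant.
            exposures[j] = max(0.0, exposures[j] - gradient / lipschitz)
    return [exposures.get(j, 0.0) for j in range(len(signatures))]

def sparse_exposures(catalog, signatures, min_fraction=0.05):
    """Fit all signatures, drop those below `min_fraction` of the total, then refit."""
    all_indices = list(range(len(signatures)))
    first = fit_exposures(catalog, signatures, all_indices)
    total = sum(first)
    kept = [j for j in all_indices if total > 0 and first[j] / total >= min_fraction]
    if not kept:
        return first
    return fit_exposures(catalog, signatures, kept)

# Three hypothetical 4-channel signatures (each sums to 1) and one tumor catalog
# constructed as 100 x signature 0 + 50 x signature 1 + 4 x signature 2.
signatures = [
    [0.7, 0.3, 0.0, 0.0],
    [0.0, 0.1, 0.5, 0.4],
    [0.25, 0.25, 0.25, 0.25],
]
catalog = [71.0, 36.0, 26.0, 21.0]

dense = fit_exposures(catalog, signatures, [0, 1, 2])
sparse = sparse_exposures(catalog, signatures)
print("dense:", dense)
print("sparse:", sparse)
```

The choice of threshold changes the final contributions, which illustrates why assignments can differ between tools even when the signatures themselves agree.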

Another potential source of muddied results comes from pooling of data across tumor types. At a first approximation, increasing the size of a cohort would provide greater statistical power for analysis. However, mixing of tumor types, particularly if they are not of equivalent numbers (e.g., 500 breast cancers with 25 leukemias) and have differing mutational burdens, could result in signal dilution or interference. This can be difficult to disentangle; therefore, pooled analyses should be undertaken with a very clear declaration of methods, including what post hoc processing steps are used. In the community, it remains controversial whether pooled analyses should be used. Such analyses imply that we expect to extract mutational signatures that are identical across all tissues. This may be true for some signatures, but not for all. There is no reason to expect that genes involved in HR repair perform precisely the same functions at the same time in the cell cycle to the same degree, in breast tissue as well as in colonic tissue. Indeed, they almost certainly do not.

Even when studying a specific tissue type such as breast cancer, we acknowledge that there are genuine biological differences between cohorts of samples (see Supplementary Figs. S1–S3 for comparison of two cohorts). Rare signatures present in only 1% to 2% of tumors may not have been detected previously because they were simply not present in any prior dataset examined (Supplementary Figs. S4–S5). Thus, getting a different result such as a novel mutational signature in a new dataset may be a genuine new finding, provided, of course, that many of the canonical signatures are also detected.

[Figure 4 graphic. Whole genome Circos plots (chromosomes 1–22, X, Y) for six patients labeled group A, group B, group B, group D, group G, and group G. Middle panels, spanning groups A to G: BRCA1/BRCA2 status, ER status, substitution signatures (1, 2, 3, 5, 6, 8, 13, 17, 18, 20, 26, 30), rearrangement signatures (1–6), and indel patterns (Ins, Mh, Rep, None).]

Figure 4. The spectrum of signatures within 560 breast cancers and individual patient whole genome profiles. The panels in the middle represent, from top to bottom: BRCA1- or BRCA2-null samples (dark purple) versus what are believed to be non-BRCA1/BRCA2–mutated samples (light purple), ER status (black = positive; gray = negative), proportions of substitution signatures, rearrangement signatures, and indel patterns present in the 560 patients. Figure legends are provided at the top of the figure. Samples are ordered according to hierarchical clustering performed on rearrangement mutational signatures. Six whole genome profiles of individual patients are shown to demonstrate how individualized each cancer genome is per patient. Note the striking differences between the six patients, even within the same "group" (groups B and G). Group D is enriched with BRCA1-null tumors, group G is enriched with BRCA2-null tumors, and group F is enriched with tumors that are BRCA-like but different and are never genetically BRCA1 null. Ins, insertions; Mh, microhomology mediated; Rep, polynucleotide repeat-tract mediated.

Table 1. Summary of relationships between each mutational signature and various genomic features. The 20 mutational signatures are noted in the left-most column. This is followed by information on mutation classes, features that predominantly characterize each signature, and associated etiologies, if known. Relationships relating to transcriptional strands, replication time and strands, and chromatin organization are also noted.

Mutational signature | Mutation type | Predominant features of signature | Associated mutational process | Strand bias entries (transcriptional strand and/or replicative strand) | Replication time | Chromatin organization
1 | Sub | C>T at CpG | Deamination of methyl-cytosine (age associated) | Some bias | Enriched late |
5 | Sub | T>C | Uncertain (age associated) | Some bias; some bias | Enriched late | Slight enrichment at linker
2 | Sub | C>T at TpCpN | APOBEC related | Some bias; strong lagging strand bias | Enriched late |
13 | Sub | C>G at TpCpN | APOBEC related | Some bias; strong lagging strand bias | Flat |
6 | Sub | C>T (and C>A and T>C) | MMR deficient | Some bias | Flat |
20 | Sub | C>A (and C>T and T>C) | MMR deficient | Some bias | Enriched late |
26 | Sub | T>C | MMR deficient | Some bias; strong bias | Enriched late | Enriched at linker
3 | Sub | | HR deficient | Some bias; some bias | Enriched late |
8 | Sub | C>A | Amplified by HR deficiency? | Some bias | Enriched late |
18 | Sub | C>A | Uncertain | Some bias; some bias | Enriched late | Enriched at nucleosomes and periodic
17 | Sub | T>G | Uncertain | Some bias | Enriched late | Enriched at nucleosomes and periodic
30 | Sub | C>T | Uncertain | | Flat |
RS1 | Rearr | Large tandem duplications (>100 kb) | Uncertain type of HR deficiency? | NA; NA | Enriched early |
RS2 | Rearr | Dispersed translocations | | NA; NA | Enriched early |
RS3 | Rearr | Small tandem duplications (<10 kb) | HR deficiency (BRCA1) | NA; NA | Enriched early |
RS4 | Rearr | Clustered translocations | | NA; NA | Enriched early |
RS5 | Rearr | Deletions | HR deficient | NA; NA | Enriched early |
RS6 | Rearr | Other clustered rearrangements | | NA; NA | Enriched early |
Repeat-med | Indel | <3 bp indel at polynuc tract | MMR deficient | NA; NA | Enriched late | Enriched at linker and periodic
Microhom | Indel | ≥3 bp indel with MMEJ-junctions | HR deficient | NA; NA | Enriched late |

Abbreviations: MMEJ, microhomology-mediated end joining; NA, not available; polynuc, polynucleotide; Rearr, rearrangement; Sub, substitution. Reprinted by permission from Macmillan Publishers Ltd.: Nature Communications 7:11383, copyright 2016.

[Figure 5 graphic. A, APOBEC deaminates cytosine to uracil; UNG and base excision repair (APE1, POLB, LIG3/XRCC1) act on the lesion; possible outcomes are C>T transition, C>G transversion, and C>A transversion. B, Kataegis: characterized by an excess of cytosine mutations at a TpC context; localized distribution; enriched at rearrangements (sites of previous DSBs). Signatures 2 and 13: characterized by an excess of cytosine mutations at a TpC context; genome-wide distribution; enriched in early replication time domains; enriched on lagging replication strand.]

Figure 5. Mechanistic insights from mutagenesis: the APOBEC family of enzymes in genome-wide (signatures 2 and 13) and localized mutational signatures (kataegis). On the basis of the predominant cytosine mutagenesis at a TpC sequence context, the APOBEC family of enzymes has been implicated in causing both localized kataegis and genome-wide signatures 2 and 13. A, APOBECs cause DNA damage, particularly on ssDNA, by deaminating cytosine into uracil. Uracil-N-glycosylase (UNG) first removes uracil before other components of the Base Excision Repair pathway restore the damaged DNA to its original state. If DNA is uncorrected and enters replication as uracil or an abasic site, then the possibilities are of generating C>T transition or C>G and C>A transversion mutations. B, Although APOBECs are involved in both localized and genome-wide mutagenesis, there is mounting experimental and analytic evidence to support the hypotheses that these signatures arise by different mechanisms. Kataegis is believed to require a DSB to arise first, before end resection of the DSB leaves ssDNA exposed for APOBEC deamination (left). In contrast, APOBEC deamination that gives rise to signatures 2 and 13 requires long stretches of ssDNA that could occur during uncoupling of the leading and lagging replication strands (right).

It is very likely that mutational signatures in human tumors do indeed exist, but how analyses are performed could affect the results of a signature extraction. Therefore, for any given analysis, it is vital to report how it was performed with absolute clarity, and for reviewers to critically assess whether the method applied is appropriate to the biological question being asked. What is described in this review is what we have seen in breast cancers to date, although the possibility of change is there. Mathematical extractions of mutational signatures have their limitations and should not be considered as deterministic. Intertissue variation is expected (Supplementary Table S1). Perhaps one way of presenting data is that of an average signal with error bars indicative of the intertissue variation (Fig. 2A).
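The suggestion of presenting an average signal with error bars can be sketched as a per-channel summary across tissue-specific extractions. The use of the sample standard deviation as the error bar is an assumption, since the text does not state which statistic is plotted; the tissue names, the three-channel profiles, and the function name are likewise made up for illustration.

```python
import statistics

def summarize_across_tissues(profiles):
    """Return a (mean, standard deviation) pair per channel.

    `profiles` maps a tissue name to the probabilities of one presumptive
    signature as extracted in that tissue; all lists share the same length.
    """
    lengths = {len(profile) for profile in profiles.values()}
    if len(lengths) != 1:
        raise ValueError("all profiles must have the same number of channels")
    n_channels = lengths.pop()
    summary = []
    for channel in range(n_channels):
        values = [profile[channel] for profile in profiles.values()]
        mean = statistics.mean(values)
        # stdev needs at least two values; a single tissue has no spread.
        spread = statistics.stdev(values) if len(values) > 1 else 0.0
        summary.append((mean, spread))
    return summary

# Hypothetical three-channel versions of the "same" signature in three tissues.
profiles = {
    "breast": [0.5, 0.3, 0.2],
    "ovary": [0.4, 0.4, 0.2],
    "pancreas": [0.6, 0.2, 0.2],
}

summary = summarize_across_tissues(profiles)
for channel, (mean, spread) in enumerate(summary):
    print(channel, round(mean, 3), round(spread, 3))
```

Channels with a wide spread are those where the presumptive same signature differs most between tissues.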

Thus, there is variability in mathematical extraction of signatures depending on the algorithm used, on how data are used (whether analyzed as a pool of multiple tumor types or analyzed as separate tumor types), and even on whether the data are derived from whole genome or from exome sequencing experiments. How best to handle these issues remains uncertain and will likely be resolved in time.
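When signatures from two algorithms or two cohorts are compared, some numerical measure of similarity between profiles is needed. Cosine similarity is one commonly used choice and is shown here as an assumption, since the text does not name a metric; the four-channel profiles and the function name are hypothetical.

```python
import math

def cosine_similarity(p, q):
    """Cosine similarity between two equal-length nonnegative profiles."""
    if len(p) != len(q):
        raise ValueError("profiles must have the same number of channels")
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0 or norm_q == 0:
        raise ValueError("profiles must contain at least one non-zero entry")
    return dot / (norm_p * norm_q)

# Hypothetical profiles: the same signature from two algorithms, and an unrelated one.
algorithm_a = [0.50, 0.30, 0.20, 0.00]
algorithm_b = [0.45, 0.35, 0.20, 0.00]
unrelated = [0.00, 0.00, 0.10, 0.90]

print(cosine_similarity(algorithm_a, algorithm_b))
print(cosine_similarity(algorithm_a, unrelated))
```

Values close to 1 indicate near-identical profiles. Signatures with overlapping peaks can score highly against each other, which is consistent with the difficulty of separating related signatures discussed earlier.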

Future Directions

Today, we can demonstrate and quantify mutational signatures in breast and other cancers; we can gain novel biological insights and potentially exploit signature properties for clinical applications. As noted above, some thoughtfulness is still required in the interpretation of any cancer-based analysis, and experimental work remains the bastion for substantiating proposed etiologies or mechanisms underpinning mutational signatures.

Notwithstanding, we are able to thoroughly profile cancer genomes per patient (Fig. 4 shows six strikingly different whole genome profiles). Soberingly, for the approximately 700 WGS and approximately 1,500 WES breast cancers that have already been scrutinized, no two patients shared the same set of drivers or the same quantities of signatures (Fig. 4). Personalized genomics is, therefore, not an option for us to debate; it is a fact of life and a challenge we must embrace.

Applying comprehensive genomic approaches judiciously (72), particularly within the context of clinical trials, could prove to be most rewarding. If we had access to informative cohorts with outcome data available, this would indeed help to accelerate translation into the clinic (72, 73).

Such work should also go beyond resequencing primary cancers. Precursors of breast cancer such as ductal carcinoma in situ (DCIS) and metastatic lesions (66) should be targeted for similarly detailed levels of driver and mutational signature investigation. Likewise, tumors separated temporally and spatially in individual patients could provide useful perspectives on tumorigenesis. There also remains more to explore through integration with other modalities, such as expression (74) and methylation, and assessments of surrounding tissue microenvironment.

Last but not least, the insights on mutational signatures have only transpired because data generated through many sequencing studies, from many academic and clinical centers, have been shared with the wider community. Thus, any future sequencing endeavor, be it within a clinical trial or otherwise, should be committed to data sharing. This is because the opportunity to learn new things from data resources not just immediately, but subsequently, is huge, particularly if thorough genomic profiling is available.

Disclosure of Potential Conflicts of Interest

S. Nik-Zainal is listed as a co-inventor on multiple patent filings related to the application of mutational signatures that are owned by Genome Research Limited, and is a consultant/advisory board member for Artios Pharma Ltd. No potential conflicts of interest were disclosed by the other author.

Acknowledgments

The authors thank Esther Lips (NKI, Holland, the Netherlands), Shelley Hwang (Duke University, Durham, NC), and Alastair Thompson (MD Anderson Cancer Center, Houston, TX) for critical assessment of the manuscript. The authors also thank the ICGC Breast Cancer Working Group and the BASIS Consortium funded by the Seventh EU Programme for having the foresight to see the potential and conceive the idea of these extensive resequencing experiments in breast cancer.

Grant Support

S. Nik-Zainal was a Wellcome-Beit Fellow and personally funded by a Wellcome Trust Intermediate Clinical Research Grant (WT100183MA) at the start of writing this review, and subsequently funded by a CRUK Advanced Clinician Scientist Award (C60100/A23916). S. Morganella is funded by core funds from the Wellcome Trust Sanger Institute.

Received January 18, 2017; revised February 27, 2017; accepted April 7, 2017; published online June 1, 2017.

References

1. Stratton MR, Campbell PJ, Futreal PA. The cancer genome. Nature 2009;458:719–24.

2. Stephens PJ, Tarpey PS, Davies H, Van Loo P, Greenman C, Wedge DC, et al. The landscape of cancer genes and mutational processes in breast cancer. Nature 2012;486:400–4.

3. Ching HC, Naidu R, Seong MK, Har YC, Taib NA. Integrated analysis of copy number and loss of heterozygosity in primary breast carcinomas using high-density SNP array. Int J Oncol 2011;39:621–33.

4. Fang M, Toher J, Morgan M, Davison J, Tannenbaum S, Claffey K. Genomic differences between estrogen receptor (ER)-positive and ER-negative human breast carcinoma identified by single nucleotide polymorphism array comparative genome hybridization analysis. Cancer 2011;117:2024–34.

5. Hicks J, Krasnitz A, Lakshmi B, Navin NE, Riggs M, Leibu E, et al. Novel patterns of genome rearrangement and their association with survival in breast cancer. Genome Res 2006;16:1465–79.

6. Hicks J, Muthuswamy L, Krasnitz A, Navin N, Riggs M, Grubor V, et al. High-resolution ROMA CGH and FISH analysis of aneuploid and diploid breast tumors. Cold Spring Harb Symp Quant Biol 2005;70:51–63.

7. King CR, Kraus MH, Aaronson SA. Amplification of a novel v-erbB-related gene in a human mammary carcinoma. Science 1985;229:974–6.

8. Leary RJ, Lin JC, Cummins J, Boca S, Wood LD, Parsons DW, et al. Integrated analysis of homozygous deletions, focal amplifications, and sequence alterations in breast and colorectal cancers. Proc Natl Acad Sci U S A 2008;105:16224–9.

9. Curtis C, Shah SP, Chin SF, Turashvili G, Rueda OM, Dunning MJ, et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012;486:346–52.

10. Ellis MJ, Ding L, Shen D, Luo J, Suman VJ, Wallis JW, et al. Whole-genome analysis informs breast cancer response to aromatase inhibition. Nature 2012;486:353–60.


11. Shah SP, Roth A, Goya R, Oloumi A, Ha G, Zhao Y, et al. The clonal andmutational evolution spectrum of primary triple-negative breast cancers.Nature 2012;486:395–9.

12. Banerji S, Cibulskis K, Rangel-Escareno C, Brown KK, Carter SL, FrederickAM, et al. Sequence analysis of mutations and translocations across breastcancer subtypes. Nature 2012;486:405–9.

13. Miki Y, Swensen J, Shattuck-EidensD, Futreal PA,HarshmanK, Tavtigian S,et al. A strong candidate for the breast and ovarian cancer susceptibilitygene BRCA1. Science 1994;266:66–71.

14. Wooster R, Bignell G, Lancaster J, Swift S, Seal S, Mangion J, et al.Identification of the breast cancer susceptibility gene BRCA2. Nature1995;378:789–92.

15. ThompsonD,Duedal S, Kirner J,McGuffog L, Last J, ReimanA, et al. Cancerrisks and mortality in heterozygous ATM mutation carriers. J Natl CancerInst 2005;97:813–22.

16. Masciari S, LarssonN, Senz J, Boyd N, Kaurah P, Kandel MJ, et al. GermlineE-cadherin mutations in familial lobular breast cancer. J Med Genet2007;44:726–31.

17. Meijers-Heijboer H, van den Ouweland A, Klijn J, Wasielewski M, de SnooA, Oldenburg R, et al. Low-penetrance susceptibility to breast cancer due toCHEK2(�)1100delC in noncarriers of BRCA1 or BRCA2 mutations. NatGenet 2002;31:55–9.

18. Litman R, PengM, Jin Z, Zhang F, Zhang J, Powell S, et al. BACH1 is criticalfor homologous recombination and appears to be the Fanconi anemiagene product FANCJ. Cancer Cell 2005;8:255–65.

19. Chen J, LindblomP, LindblomA.A studyof the PTEN/MMAC1gene in 136breast cancer families. Hum Genet 1998;102:124–5.

20. Cox A, Dunning AM, Garcia-Closas M, Balasubramanian S, Reed MW,Pooley KA, et al. A common coding variant in CASP8 is associated withbreast cancer risk. Nat Genet 2007;39:352–8.

21. Easton DF, Pooley KA, Dunning AM, Pharoah PD, Thompson D, BallingerDG, et al. Genome-wide association study identifies novel breast cancersusceptibility loci. Nature 2007;447:1087–93.

22. Hunter DJ, Kraft P, Jacobs KB, Cox DG, Yeager M, Hankinson SE, et al. Agenome-wide association study identifies alleles in FGFR2 associated withrisk of sporadic postmenopausal breast cancer. Nat Genet 2007;39:870–4.

23. Stacey SN, Manolescu A, Sulem P, Rafnar T, Gudmundsson J, GudjonssonSA, et al. Common variants on chromosomes 2q35 and 16q12 confersusceptibility to estrogen receptor-positive breast cancer. Nat Genet2007;39:865–9.

24. Stacey SN, Manolescu A, Sulem P, Thorlacius S, Gudjonsson SA, JonssonGF, et al. Common variants on chromosome 5p12 confer susceptibility toestrogen receptor-positive breast cancer. Nat Genet 2008;40:703–6.

25. Thompson WD. Genetic epidemiology of breast cancer. Cancer 1994;74:279–87.

26. Bergamaschi A, Kim YH,Wang P, Sorlie T, Hernandez-Boussard T, LonningPE, et al. Distinct patterns of DNA copy number alteration are associatedwith different clinicopathological features and gene-expression subtypes ofbreast cancer. Genes Chromosomes Cancer 2006;45:1033–40.

27. Vincent-Salomon A, Lucchesi C, Gruel N, Raynal V, Pierron G, GoudefroyeR, et al. Integrated genomic and transcriptomic analysis of ductal carcino-ma in situ of the breast. Clin Cancer Res 2008;14:1956–65.

28. Bentley DR, Balasubramanian S, SwerdlowHP, Smith GP,Milton J, BrownCG, et al. Accurate whole human genome sequencing using reversibleterminator chemistry. Nature 2008;456:53–9.

29. Pereira B, Chin SF, Rueda OM, Vollan HK, Provenzano E, Bardwell HA,et al. The somatic mutation profiles of 2,433 breast cancers refines theirgenomic and transcriptomic landscapes. Nat Commun 2016;7:11479.

30. Nik-Zainal S, Alexandrov LB, Wedge DC, Van Loo P, Greenman CD, RaineK, et al. Mutational processes molding the genomes of 21 breast cancers.Cell 2012;149:979–93.

31. Helleday T, Eshtad S, Nik-Zainal S. Mechanisms underlying mutationalsignatures in human cancers. Nat Rev Genet 2014;15:585–98.

32. AlexandrovLB,Nik-Zainal S,WedgeDC,Aparicio SA, Behjati S, BiankinAV,et al. Signatures of mutational processes in human cancer. Nature2013;500:415–21.

33. Nik-Zainal S, Van Loo P, Wedge DC, Alexandrov LB, Greenman CD, LauKW, et al. The life history of 21 breast cancers. Cell 2012;149:994–1007.

34. Nik-Zainal S, Kucab JE, Morganella S, Glodzik D, Alexandrov LB, Arlt VM,et al. The genome as a record of environmental exposure. Mutagenesis2015;30:763–70.

35. Alexandrov LB, Nik-Zainal S, Wedge DC, Campbell PJ, Stratton MR.Deciphering signatures ofmutational processes operative in human cancer.Cell Rep 2013;3:246–59.

36. Nik-Zainal S, Davies H, Staaf J, Ramakrishna M, Glodzik D, Zou X, et al.Landscape of somatic mutations in 560 breast cancer whole-genomesequences. Nature 2016;534:47–54.

37. Kim J, Mouw KW, Polak P, Braunstein LZ, Kamburov A, Tiao G, et al.Somatic ERCC2mutations are associated with a distinct genomic signaturein urothelial tumors. Nat Genet 2016;48:600–6.

38. Gehring JS, Fischer B, Lawrence M, Huber W. SomaticSignatures: inferringmutational signatures from single-nucleotide variants. Bioinformatics2015;31:3673–5.

39. Fischer A, Illingworth CJ, Campbell PJ, Mustonen V. EMu: probabilistic inference of mutational processes and their localization in the cancer genome. Genome Biol 2013;14:R39.

40. Roberts ND, Wedge DC, Campbell PC. HDP. https://github.com/nicolaroberts/hdp; 2015.

41. Shiraishi Y, Tremmel G, Miyano S, Stephens M. A simple model-based approach to inferring and visualizing cancer mutation signatures. PLoS Genet 2015;11:e1005657.

42. Nik-Zainal S, Wedge DC, Alexandrov LB, Petljak M, Butler AP, Bolli N, et al. Association of a germline copy number polymorphism of APOBEC3A and APOBEC3B with burden of putative APOBEC-dependent mutations in breast cancer. Nat Genet 2014;46:487–91.

43. Swanton C, McGranahan N, Starrett GJ, Harris RS. APOBEC enzymes: mutagenic fuel for cancer evolution and heterogeneity. Cancer Discov 2015;5:704–12.

44. Lefebvre C, Bachelot T, Filleron T, Pedrero M, Campone M, Soria JC, et al. Mutational profile of metastatic breast cancers: a retrospective analysis. PLoS Med 2016;13:e1002201.

45. Stephens PJ, Greenman CD, Fu B, Yang F, Bignell GR, Mudie LJ, et al. Massive genomic rearrangement acquired in a single catastrophic event during cancer development. Cell 2011;144:27–40.

46. Glodzik D, Morganella S, Davies H, Simpson PT, Li Y, Zou X, et al. A somatic-mutational process recurrently duplicates germline susceptibility loci and tissue-specific super-enhancers in breast cancers. Nat Genet 2017;49:341–8.

47. Taylor BJ, Nik-Zainal S, Wu YL, Stebbings LA, Raine K, Campbell PJ, et al. DNA deaminases induce break-associated mutation showers with implication of APOBEC3B and 3A in breast cancer kataegis. eLife 2013;2:e00534.

48. Lada AG, Kliver SF, Dhar A, Polev DE, Masharsky AE, Rogozin IB, et al. Disruption of transcriptional coactivator sub1 leads to genome-wide re-distribution of clustered mutations induced by APOBEC in active yeast genes. PLoS Genet 2015;11:e1005217.

49. Walker BA, Wardell CP, Murison A, Boyle EM, Begum DB, Dahir NM, et al. APOBEC family mutational signatures are associated with poor prognosis translocations in multiple myeloma. Nat Commun 2015;6:6997.

50. Polak P, Karlic R, Koren A, Thurman R, Sandstrom R, Lawrence MS, et al. Cell-of-origin chromatin organization shapes the mutational landscape of cancer. Nature 2015;518:360–4.

51. Schuster-Bockler B, Lehner B. Chromatin organization is a major influence on regional mutation rates in human cancer cells. Nature 2012;488:504–7.

52. Haradhvala NJ, Polak P, Stojanov P, Covington KR, Shinbrot E, Hess JM, et al. Mutational strand asymmetries in cancer genomes reveal mechanisms of DNA damage and repair. Cell 2016;164:538–49.

53. Seplyarskiy VB, Soldatov RA, Popadin KY, Antonarakis SE, Bazykin GA, Nikolaev SI. APOBEC-induced mutations in human cancers are strongly enriched on the lagging DNA strand during replication. Genome Res 2016;26:174–82.

54. Hoopes JI, Cortez LM, Mertz TM, Malc EP, Mieczkowski PA, Roberts SA. APOBEC3A and APOBEC3B preferentially deaminate the lagging strand template during DNA replication. Cell Rep 2016;14:1273–82.

55. Kanu N, Cerone MA, Goh G, Zalmas LP, Bartkova J, Dietzen M, et al. DNA replication stress mediates APOBEC3 family mutagenesis in breast cancer. Genome Biol 2016;17:185.

56. Lord CJ, Ashworth A. BRCAness revisited. Nat Rev Cancer 2016;16:110–20.

57. Davies H, Glodzik D, Morganella S, Yates LR, Staaf J, Zou X, et al. HRDetect is a predictor of BRCA1 and BRCA2 deficiency based on mutational signatures. Nat Med 2017;23:517–25.

58. Joosse SA, van Beers EH, Tielen IH, Horlings H, Peterse JL, Hoogerbrugge N, et al. Prediction of BRCA1-association in hereditary non-BRCA1/2 breast carcinomas with array-CGH. Breast Cancer Res Treat 2009;116:479–89.

59. Vollebergh MA, Lips EH, Nederlof PM, Wessels LF, Schmidt MK, van Beers EH, et al. An aCGH classifier derived from BRCA1-mutated breast cancer and benefit of high-dose platinum-based chemotherapy in HER2-negative breast cancer patients. Ann Oncol 2011;22:1561–70.

60. Watkins JA, Irshad S, Grigoriadis A, Tutt AN. Genomic scars as biomarkers of homologous recombination deficiency and drug response in breast and ovarian cancers. Breast Cancer Res 2014;16:211.

61. Fong PC, Boss DS, Yap TA, Tutt A, Wu P, Mergui-Roelvink M, et al. Inhibition of poly(ADP-ribose) polymerase in tumors from BRCA mutation carriers. N Engl J Med 2009;361:123–34.

62. Prakash R, Zhang Y, Feng W, Jasin M. Homologous recombination and human health: the roles of BRCA1, BRCA2, and associated proteins. Cold Spring Harb Perspect Biol 2015;7:a016600.

63. Bryant HE, Schultz N, Thomas HD, Parker KM, Flower D, Lopez E, et al. Specific killing of BRCA2-deficient tumours with inhibitors of poly(ADP-ribose) polymerase. Nature 2005;434:913–7.

64. Farmer H, McCabe N, Lord CJ, Tutt AN, Johnson DA, Richardson TB, et al. Targeting the DNA repair defect in BRCA mutant cells as a therapeutic strategy. Nature 2005;434:917–21.

65. Vonderheide RH, Domchek SM, Clark AS. Immunotherapy for breast cancer: what are we missing? Clin Cancer Res 2017;23:2640–6.

66. Yates LR, Desmedt C. Translational genomics: practical applications of the genomic revolution in breast cancer. Clin Cancer Res 2017;23:2630–9.

67. Hainaut P, Pfeifer GP. Somatic TP53 mutations in the era of genome sequencing. Cold Spring Harb Perspect Med 2016;6. pii: a026179.

68. Besaratinia A, Kim SI, Hainaut P, Pfeifer GP. In vitro recapitulating of TP53 mutagenesis in hepatocellular carcinoma associated with dietary aflatoxin B1 exposure. Gastroenterology 2009;137:1127–37.

69. Hainaut P, Pfeifer GP. Patterns of p53 G→T transversions in lung cancers reflect the primary mutagenic signature of DNA-damage by tobacco smoke. Carcinogenesis 2001;22:367–74.

70. Bouaoun L, Sonkin D, Ardin M, Hollstein M, Byrnes G, Zavadil J, et al. TP53 variations in human cancers: new lessons from the IARC TP53 database and genomics data. Hum Mutat 2016;37:865–76.

71. Olivier M, Weninger A, Ardin M, Huskova H, Castells X, Vallee MP, et al. Modelling mutational landscapes of human cancers in vitro. Sci Rep 2014;4:4482.

72. Reeder-Hayes KE, Anderson BO. Breast cancer disparities at home and abroad: a review of the challenges and opportunities for system-level change. Clin Cancer Res 2017;23:2655–64.

73. Freedman RA, Partridge AH. Emerging data and current challenges for young, old, obese, or male patients with breast cancer. Clin Cancer Res 2017;23:2647–54.

74. Ferrari A, Vincent-Salomon A, Pivot X, Sertier AS, Thomas E, Tonon L, et al. A whole-genome sequence and transcriptome perspective on HER2-positive breast cancers. Nat Commun 2016;7:12222.
