Cell Reports, Volume 16
Supplemental Information
Direct Transcriptional Consequences
of Somatic Mutation in Breast Cancer
Adam Shlien, Keiran Raine, Fabio Fuligni, Roland Arnold, Serena Nik-Zainal, Serge Dronov, Lira Mamanova, Andrej Rosic, Young Seok Ju, Susanna L. Cooke, Manasa Ramakrishna, Elli Papaemmanuil, Helen R. Davies, Patrick S. Tarpey, Peter Van Loo, David C. Wedge, David R. Jones, Sancha Martin, John Marshall, Elizabeth Anderson, Claire Hardy, ICGC Breast Cancer Working Group, Oslo Breast Cancer Research Consortium, Violetta Barbashina, Samuel A.J.R. Aparicio, Torill Sauer, Øystein Garred, Anne Vincent-Salomon, Odette Mariani, Sandrine Boyault, Aquila Fatima, Anita Langerød, Åke Borg, Gilles Thomas, Andrea L. Richardson, Anne-Lise Børresen-Dale, Kornelia Polyak, Michael R. Stratton, and Peter J. Campbell

Supplementary Methods

Analysis of variant allele fraction differences between the transcriptome and genome in TCGA data
To validate our finding, we calculated differences in transcriptional output in TCGA’s
breast cancer cohort. Aligned BAM files for 980 breast cancer samples with both
RNA-Seq and exome sequencing were downloaded from CGHUB
(https://cghub.ucsc.edu/) using GeneTorrent. PCR duplicates for both exome and
transcriptome were removed using SAMtools. The position of somatic mutations, in
MAF file format, and gene expression values (using the RSEM method) were
obtained from https://tcga-data.nci.nih.gov/. Additional clinical covariates were
obtained from cBioPortal (http://www.cbioportal.org/). All putative mutations
were re-annotated using Annovar (release 2013Aug23) and all potential germline
variants were removed (present in NCBI dbSNP Human build 142). Finally, 70,071
exonic/splicing substitutions present in the 980 RNA-Seq and WES paired samples
were considered for further analysis. Mutations in the 5’ or 3’ UTRs were excluded.
Mutated loci were considered not expressed, and therefore excluded from this
analysis, if the total coverage was less than five reads, or the number of reads
supporting the mutated base was less than five reads.

These substitution mutations were evaluated in a number of ways, including by
measuring the proportion of reads reporting the mutation in the transcriptome
(variant allele fraction, or VAF) and subtracting from it the same measure in the
genome (i.e. VAFdiff = VAFtranscriptome - VAFgenome). We used linear regression to
model the relationship between the amount of ESR1 expressed by a tumor and the
VAFdiff of its mutations.
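
The expression filter and the VAFdiff calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Mutation` record and its field names are hypothetical, the two five-read thresholds are assumed to apply to the RNA-Seq counts, DNA coverage is assumed to be non-zero at every called mutation, and `fit_line` is a plain ordinary-least-squares fit standing in for whatever regression software was actually used.

```python
from dataclasses import dataclass

MIN_TOTAL_READS = 5    # minimum total coverage stated in the text
MIN_VARIANT_READS = 5  # minimum reads supporting the mutated base


@dataclass
class Mutation:
    # Hypothetical record layout; field names are illustrative only.
    rna_variant_reads: int
    rna_total_reads: int
    dna_variant_reads: int
    dna_total_reads: int


def is_expressed(m: Mutation) -> bool:
    """Apply the two read-count thresholds (assumed to refer to RNA-Seq reads)."""
    return (m.rna_total_reads >= MIN_TOTAL_READS
            and m.rna_variant_reads >= MIN_VARIANT_READS)


def vaf_difference(m: Mutation) -> float:
    """VAFdiff = VAF in the transcriptome minus VAF in the genome."""
    vaf_rna = m.rna_variant_reads / m.rna_total_reads
    vaf_dna = m.dna_variant_reads / m.dna_total_reads
    return vaf_rna - vaf_dna


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy usage: keep expressed mutations only, then compute VAFdiff for each.
mutations = [Mutation(10, 20, 5, 20), Mutation(2, 3, 10, 40)]
vaf_diffs = [vaf_difference(m) for m in mutations if is_expressed(m)]
```

In practice `xs` would hold ESR1 expression values and `ys` the corresponding VAFdiff values; whether the regression was run per mutation or per tumor is not stated in the text, so that choice is left to the caller.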

We classified TCGA breast cancers into known subtypes (Luminal B, Luminal A,
HER2-related and triple negative) by immunohistochemistry as per Blows et al.
(PLOS Med 2010).

Estimating the excess of rearrangements with maximum rank for aberrant transcription
We use a maximum likelihood approach to estimate the excess of rearrangements at
highest rank. Basically, we allow the ranks to be distributed as a multinomial
process with probabilities of rank ~ {π, π, ⋯, π, π+τ}, where we are interested in
estimating τ. It is straightforward to show that the maximum likelihood estimator
for τ is given by:

$$\hat{\tau} = \max\left(0,\ \frac{(n-1)\,r_n - \sum_{i=1}^{n-1} r_i}{(n-1)\sum_{i=1}^{n} r_i}\right)$$

where n is the number of samples (different ranks) and r_i is the number of
rearrangements garnering the i-th rank. Bootstrapping of the observed counts across
all possible ranks was used to estimate the 95% confidence intervals for the point
estimates.
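
As a minimal sketch of this estimator, the Python below computes the point estimate from a list of per-rank counts and a percentile bootstrap interval. The function names, the convention that the last list entry is the highest rank, and the multinomial resampling scheme are assumptions made for illustration; the text does not specify how the bootstrap was carried out.

```python
import random


def tau_hat(counts):
    """Point estimate of the excess probability at the highest rank.

    counts[i] is the number of rearrangements observed at rank i + 1;
    the last entry is assumed to be the highest rank.
    """
    n = len(counts)
    total = sum(counts)
    if n < 2 or total == 0:
        raise ValueError("need at least two ranks and one observation")
    top = counts[-1]
    rest = total - top
    return max(0.0, ((n - 1) * top - rest) / ((n - 1) * total))


def bootstrap_ci(counts, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling ranks with replacement
    in proportion to the observed counts (an assumed scheme)."""
    rng = random.Random(seed)
    n = len(counts)
    total = sum(counts)
    ranks = list(range(n))
    estimates = []
    for _ in range(n_boot):
        resampled = [0] * n
        for r in rng.choices(ranks, weights=counts, k=total):
            resampled[r] += 1
        estimates.append(tau_hat(resampled))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper
```

With equal counts at every rank the estimate is zero, and it grows towards one as the highest rank absorbs more of the observations.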

Detection of genomic rearrangements
Paired-end maps were generated using a new in-house algorithm that will be
published separately (J. Marshall et al., manuscript in preparation). Briefly,
discordantly mapped read pairs were filtered against BWA read pileup loci, repeat
features and mitochondrial sequences in GRCh37. Additionally, alternative mapping
locations were evaluated to assess whether both reads could be aligned to an
alternative location as a concordant pair. Remaining discordant read pairs were
clustered to generate a putative list of rearrangements with respect to the GRCh37
reference genome. Candidate rearrangements found in paired normal blood DNA
analyses, or previously confirmed by PCR to be germline in other studies, were
removed. These steps produced a paired-end map cleaned of the majority of the
artefacts resulting from BWA mapping and from putative germline variants.
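
The in-house algorithm is not described in detail here, so the following is only a generic illustration of the clustering step, not the authors' method. It groups discordant pairs that share chromosomes and strands and whose two ends both lie close to the previous pair in the cluster. The `DiscordantPair` type, the `max_gap` distance and the `min_support` cutoff are all hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class DiscordantPair(NamedTuple):
    # Hypothetical record for one discordantly mapped read pair.
    chrom1: str
    pos1: int
    strand1: str
    chrom2: str
    pos2: int
    strand2: str


def cluster_pairs(pairs, max_gap=500, min_support=2):
    """Greedy clustering of discordant pairs into candidate rearrangements.

    Pairs are grouped by chromosome and strand at both ends, sorted by the
    first-end position, and chained while both ends stay within max_gap of
    the previous pair. Clusters with fewer than min_support pairs are dropped.
    """
    groups = defaultdict(list)
    for p in pairs:
        groups[(p.chrom1, p.strand1, p.chrom2, p.strand2)].append(p)

    clusters = []
    for group in groups.values():
        group.sort(key=lambda p: p.pos1)
        current = [group[0]]
        for p in group[1:]:
            last = current[-1]
            if p.pos1 - last.pos1 <= max_gap and abs(p.pos2 - last.pos2) <= max_gap:
                current.append(p)
            else:
                clusters.append(current)
                current = [p]
        clusters.append(current)
    return [c for c in clusters if len(c) >= min_support]
```

A real pipeline would derive the gap from the library's insert-size distribution and would apply the germline and artefact filters described above before or after this step.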

In this manuscript we only report high-confidence rearrangements for which we
have successfully resolved the breakpoint. To find the breakpoints we first
determined the window surrounding rearrangements using the average and
maximum insert size of each BAM file. We then looked for reads where one end
mapped within this window and the other end was unmapped. Unmapped reads
were realigned to the genome (BLAT, using optimised parameters). Realigned reads
that accurately mapped within both windows of a rearrangement were grouped
together, and finally each putative breakpoint was evaluated by measuring the
distance between the breakpoint region and the breakpoint, and the coefficient of
variation of the breakpoint positions themselves (ideally, there is no variability at
the position).
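
The last check, the variability of the breakpoint position across realigned reads, can be sketched as follows. The function names are hypothetical, the input is assumed to be one breakpoint coordinate per supporting read, and the coefficient of variation is taken as the population standard deviation divided by the mean; the text does not state which variant was used or what cutoff was applied.

```python
from collections import Counter
from statistics import mean, pstdev


def breakpoint_cv(positions):
    """Coefficient of variation of the breakpoint coordinates reported by
    the realigned reads; 0.0 means every read agrees on the position."""
    if not positions:
        raise ValueError("no supporting reads")
    m = mean(positions)
    if m == 0:
        raise ValueError("mean position is zero; CV is undefined")
    return pstdev(positions) / m


def consensus_breakpoint(positions):
    """Most frequently reported coordinate among the supporting reads."""
    return Counter(positions).most_common(1)[0][0]
```

A breakpoint supported by reads that all report the same coordinate gives a coefficient of variation of zero, which is the ideal case mentioned above.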

Validation of changes to gene transcript structure
We validated our RNA-Seq results by using replicate RNAs, comparing the junctions
to existing datasets, using RNA pull-down sequencing, and by manual inspection.

Sample HCC1599 was run as a technical replicate. A new library was created,
sequenced and analysed using the same algorithms. One hundred percent of the
genomic rearrangements causing an exon skip with the highest rank were found to
lead to the same event in the replicate transcriptome (5/5). Of the three genomic
rearrangements involving two genes in the same orientation, which were previously
found to cause an expressed fusion, two caused the same event in the replicate
transcriptome and one was missed in the replicate. We compared the in-frame
fusions to those we previously reported (Stephens et al. Table 3). Of the 14 in-frame
fusions found in both analyses, which had previously been validated by RT-PCR, we
identified 10 expressed fusions in the new RNA-Seq data as well as many others
not reported in the previous data set.

Supplementary Figure Legends

Supplementary Figure 1. RNA Architect, a suite of algorithms for the analysis
of cancer RNA-Sequencing. Related to Figure 3.
(A) Overview of RNA Architect’s seed-and-extend and discordant pair algorithm.
(B) Statistics from a representative sample that has been run through this pipeline.
(C) All samples were sequenced at high depth, and there is no association between
coverage and percentage of expressed mutations.
(D) Similar levels of expressed mutation found in TCGA data.

Supplementary Figure 2. Estimating the proportion of reads derived from the
tumour and the stromal cells. Related to Figure 1.
(A) Comparison of variants from the active and inactive X chromosome.
(B) Observed fraction of reads reporting reference allele vs. the posterior
probability of the reference allele deriving from the active X chromosome. The
depth of colour reflects the level of expression.
(C) Estimated distribution and 95% posterior intervals for relative gene expression
in cancer versus stromal cells for ER+ and ER- breast cancers.

Supplementary Figure 3. Related to Figure 1; Figure 2.
(A) Increased expression of the mutated allele in ER- as compared to ER+ breast
cancer transcriptomes (plotted relative to the genome).
(B) Variant allele fraction in genome compared to the transcriptome, for all samples
including cell lines.
(C) Absence of negative selection in nonsense mutations. Comparison of expression
levels from the organoids of normal breast epithelium for genes mutated in the
cancer samples.

Supplementary Figure 4. A recurrent in-frame fusion between TRMT11 and
NCOA7 in two breast cancers. Related to Figure 3; Figure 4.
A tandem duplication on chromosome 6 joins the 5’ end of TRMT11 with the 3’ end
of NCOA7. In both samples the fusion is in-frame and highly expressed as shown by
the numerous junction reads (split reads) between TRMT11 exon 11 and NCOA7
exon 13 in sample PD4005a, and TRMT11 exon 6 and NCOA7 exon 7 in sample
HCC1954.

Supplementary Figure 5. Regions of local complexity in breast cancer sample
PD4103a. Related to Figure 7. One sample’s regions of complexity are shown as
pairs of Circos plots, for the genome and transcriptome. The genomic events one
would predict to be expressed are highlighted (blue arcs). The tumour does not
express all of these events, or multiple cis rearrangements have been amalgamated
and expressed as a single transcript that combines genes only indirectly linked to
one another.

Supplementary Figure 6. Compound event in the gene MLL3. Related to Figure 3.
(A) A tandem duplication in the genome within the footprint of MLL3, an established
breast cancer gene, results in a complex aberrant transcript involving the
reusage of exons and the activation of an alternative donor site. The reads from
TopHat support junctions between the canonical exon edges (red arcs) only,
whereas RNA Architect identifies the compound event (horizontal lines
represent split reads).
(B) Aberrant MLL3 transcripts. Shown are novel isoforms of MLL3 found in TCGA
breast cancers (n=980). Data were reanalysed and reprocessed using our
pipeline. We compared each putative aberrant junction in MLL3 to 1,277
normals from 30 tissue types and excluded anything found in these samples
(GTEx).

Supplementary Figure 7. Transcriptional output of ER-positive and
ER-negative breast cancers. Related to Figure 1.
(A) The expression of mutations differs within known molecular subgroups of
breast cancer. Samples were grouped using available clinical data
(Supplementary Methods) into known molecular subgroups. Plotted on the
Y-axis is the VAFdiff. The pie charts, shown for each subgroup, depict the
percentage of mutations expressed.
(B) Differences in expression of TP53 missense mutations between ER+ and ER-
breast cancers.
(C) Expression of commonly mutated genes in ER-negative and ER-positive cancers.

[Supplementary figure S1. Image not reproduced; recoverable labels follow.]
Panel A, flowchart labels in extracted order: Pre-filter normally mapped reads;
Seed-and-extend algorithm; Build index of known genes; Shatter unmapped reads,
align k-mers; Merge and extend k-mers, resolve breakpoints; Perform clean-up of
breakpoints; Annotate junction breakpoints; Re-map initially unmapped reads;
Discordant pair algorithm; Merge alignments; Annotate each end of read pair, look
for clusters of discordant pairs; Rank putative fusions by number of unique and
multi-mapped reads, consistency of mapping positions and consistency of discordant
read pairs’ orientation. Outputs: Exon skips; Exon reusages; Alternative donors and
acceptors; Early polyadenylation sites; Fusion genes.
Panel B, sample PD4107a:
  Sequencing depth (read pairs): RNA tumour 367,423,584; DNA tumour 829,476,568; DNA normal 658,077,733
  Exon skips: canonical 1944; with alt donor or acceptor 121
  Exon reusages: canonical 8; with alt donor or acceptor 1
  Alternative donors or acceptors: 160
  Early poly(A) sites: coding 13; UTR 224
  Fusions: sense 13; sense with alt donor or acceptor 4; antisense 16
Panel C: log10(Coverage + 1) and fraction expressed per sample, for cell lines
(HCC1143, HCC1187, HCC1395, HCC1599, HCC1937, HCC1954, HCC2157, HCC2218, HCC38)
and primary tumors (PD3851a, PD3904a, PD4005a, PD4006a, PD4085a, PD4086a, PD4088a,
PD4103a, PD4107a, PD4109a, PD4115a, PD4116a, PD4120a, PD4248a).
Panel D: log10(Coverage + 1) and fraction expressed for breast cancers (n=980).

[Supplementary figure S2. Image not reproduced; recoverable labels follow.]
Panel A: schematic of alleles on the active (Xa) and inactive (Xi) X chromosome and
their mRNA transcripts in cancer cells and stromal cells.
Panel B: x-axis, observed fraction of reads reporting reference allele; y-axis,
posterior probability of reference allele on Xa; colour scale, expression (high to low).
Panel C: x-axis, fraction of reads derived from tumour cells (0.0 to 1.0), for ER+ and
ER- breast cancers; samples PD3904a, PD4005a, PD4085a, PD4086a, PD4088a, PD4107a,
PD4115a, PD4116a, PD4248a; fitted distribution with 95% posterior intervals.

[Supplementary figure S3. Image not reproduced; recoverable labels follow.]
Panel A: y-axis, variant allele fraction in transcriptome relative to genome; samples
PD4005a, PD4006a, PD4086a, PD4107a, PD4109a, PD4248a, PD3851a, PD3904a, PD4085a,
PD4088a, PD4103a, PD4115a, PD4116a, PD4120a; points marked by ER status (negative
or positive).
Panel B: x-axis, variant allele fraction in genome; y-axis, variant allele fraction in
transcriptome; legend lists the 14 primary tumours above and the cell lines HCC1143,
HCC1187, HCC1395, HCC1599, HCC1937, HCC1954, HCC2157, HCC2218, HCC38.
Panel C: categories Silent, Missense, Nonsense; y-axis in percent; legend, expressed
in all organoids versus in none of the organoids.

[Supplementary figure S4. Image not reproduced; recoverable labels follow.]
PD4005a: in-frame fusion; junction reads in RNA-Seq between TRMT11 exon 11
(NM_001031712) and NCOA7 exon 13 (NM_001199620); chr6 intervals shown:
126,333,959-126,337,665 and 126,210,729-126,241,270.
HCC1954: in-frame fusion; junction reads in RNA-Seq between TRMT11 exon 6
(NM_001031712) and NCOA7 exon 7 (NM_001199620); chr6 intervals shown:
126,318,222-126,321,274 and 126,198,418-126,199,757.

[Supplementary figure S5. Image not reproduced; recoverable labels follow.]
Paired Circos plots (Genome, Transcriptome) for sample PD4103a. The gene and
intergenic-region labels around the plots are garbled in extraction and are omitted.
Legend: Fusion to antisense; Fusion; Alt donor / acceptor; Early polyA site; Exon
reusage; Exon skip; Complex rearrangement predicted to cause a fusion, exon skip or
reusage; Complex rearrangement; Genome arcs; Transcriptome arcs.

[Supplementary figure S6. Image not reproduced; recoverable labels follow.]
Panel A: MLL3 (KMT2C), 21 kb region (axis 151,874 kb to 151,894 kb); tracks: DNA
rearrangements (internal duplication); RNA junctions from TopHat and Architect
(exon reusage with cryptic donor).
Panel B: MLL3 (KMT2C), 304 kb region (axis labels 151,900 kb to 152,100 kb); track:
RNA junctions.

[Supplementary figure S7. Image not reproduced; recoverable labels follow.]
Panel A: y-axis, VAF difference; pie charts, expressed muts.; subgroups: Luminal B
(7,349 muts. in 101 tumors), Luminal A (16,652 muts. in 357 tumors), HER2-related
(1,034 muts. in 27 tumors), Triple negative (5,597 muts. in 86 tumors).
Panel B: TP53 missense mutations; y-axis, proportion of reads reporting mut.; DNA
versus RNA for ER+ (49 tumors) and ER- (42 tumors).
Panel C: x-axis, expression of mutated genes in ER+ cancers (log2(TPM)); y-axis,
expression of mutated genes in ER- cancers (log2(TPM)); labelled genes: APH1A, CDH1,
CXorf40A, DDX49, IGSF8, ITCH, PIK3CA, PSMA5, PSMA7, PTEN, TCTN2, TP53, YAP1.