Upload
others
View
0
Download
0
Embed Size (px)
Citation preview
Supplementary Information for:
A Big Bang model of human colorectal tumor growth Andrea Sottoriva1,6, Haeyoun Kang2,3, Zhicheng Ma1,6, Trevor A. Graham4,5, Matthew P. Salomon1, Junsong Zhao1, Paul Marjoram1, Kimberly Siegmund1, Michael F. Press2, Darryl Shibata2, Christina Curtis1, 6 Affiliations 1 Department of Preventive Medicine, Keck School of Medicine of the University of Southern California, Los Angeles, CA USA 2 Department of Pathology, Keck School of Medicine of the University of Southern California, Los Angeles, CA USA 3 Department of Pathology, CHA University, Seongnam-si, Gyeonggi-do, South Korea 4 Center for Evolution and Cancer, University of California, San Francisco, San Francisco, CA USA 5 Centre for Tumor Biology, Barts Cancer Institute, Queen Mary University of London, London, UK 6 Present addresses: Division of Molecular Pathology, Institute of Cancer Research, London, UK (A.S); Department of Medicine, Stanford University, Stanford, CA, USA (Z.M, C.C); Department of Genetics, Stanford University, Stanford, CA, USA (Z.M, C.C) Correspondence should be addressed to: Darryl Shibata ([email protected]) or Christina Curtis ([email protected])
Nature Genetics: doi:10.1038/ng.3214
2
Supplementary Figures
Supplementary Figure 1 (continued on next page).
Nature Genetics: doi:10.1038/ng.3214
3
Supplementary Figure 1. Integrated genomic analysis of ITH across multiple spatial scales. (a) Circos plot representation of copy number aberration (CNA) profiles (where concentric circles correspond to individual glands and bulk samples) reveal extensive heterogeneity at the copy number level in the carcinomas, but not
Nature Genetics: doi:10.1038/ng.3214
4
adenomas. Variegated and side-variegated CNAs were found in the majority of carcinomas (7/11), indicative of early mixing, followed by the scattering of sub-clones to distant tumor regions. (b) For all adenomas and a subset of carcinomas, WES was performed on bulk tumor samples, followed by targeted deep sequencing of a mutational panel in single glands. The allelic frequency of a subset of mutations is shown and color-coded according to side (right, left). Canonical drivers (APC, KRAS, or TP53) were commonly mutated and when present, were clonal within the tumor and thus used as positive controls. For tumor O, the canonical drivers were not mutated, and a clonal mutation in CTNND1 was used as a positive control. For diploid samples, the expected allelic frequencies are 0.5 or 1 (when loss of heterozygosity has occurred). For triploid samples, the frequencies are 0.33, 0.66 or 1 (N, O, T, U and the right side of K) and for quadraploid samples: 0.25, 0.5, 0.75 or 1. The mutational profiles strongly corroborate the CNA data and reveal variegation in all carcinomas, with glands from opposite sides harboring the same private mutation. (c) FISH analysis of the HER2 gene and corresponding chr17 centromere was performed on individual glands (n=3-6 per tumor side) and reveals extensive variability in copy number between physically adjacent cells, as summarized by Shannon Index of diversity. For each group, boxplots show the median, limited by the 25th (Q1) and 75th (Q3) percentiles, where whiskers represent the most extreme of the maximum or Q3+1.5(Q3−Q1) and the minimum or Q1−1.5(Q3−Q1), respectively. Importantly, ITH was uniformly high throughout the tumor, indicating the absence of recent clonal expansions, as postulated by the Big Bang model. The maximum possible heterogeneity corresponds to an index of 1.79 (99% of counts within [0,5], where max SI is ln(N) for N the number of possible count states, which is 6). (d) Molecular clock analysis based on neutral methylation tag sequencing corroborates the FISH data, with glands from different tumor regions exhibiting similar high levels of ITH throughout the tumor. Boxplots are formatted as in panel d. (e) The molecular clock analysis further highlights a complex hierarchy of mitotic sub-clones within glands, as demonstrated by the frequency distribution of intra-gland mitotic clones, where the five most common clones are shown.
Nature Genetics: doi:10.1038/ng.3214
5
Supplementary Figure 2. Reconstruction of tumor phylogenetic trees. Tumor phylogenetic trees were reconstructed based on individual gland genome-wide CNA profiles. Sub-clone mixing between right and left glands is apparent in some of the carcinomas, where glands from one side were more similar to other glands from the opposite side than to neighboring glands (indicated by red parentheses). The accurate reconstruction of tumor phylogenies for adenomas S, P and X and carcinoma W (which was microsatellite instable, MSI) was not possible due to minimal CNAs (these are included for completeness, but indicated by transparent shading).
K
L4
R4
R3
R2
R1
L1
L3
L2
Normal
●●●
●●
●●●
●S
R5
R4
R2
R1
L5
L4
L3
L1
L2
Normal
●●●●●●●●●
●P
L3
R4
L4
L2
R3
R2
R1
L1
Normal
●●
●●●●●●●
X
R1
R2
L4
R3
L1
L3
L2
Normal
●●
●●
●●●
●
M
R3
R2
L5
L2
R4
R1
R5
L3
L1
L4
Normal
●●
●●●●●●
●●
●N
L3
L4
L1
R4
R3
R1
R2
L5
L2
R5
Normal
●●●●●●●●●●
●O
L5
L3
L4
L2
L1
R4
R3
R2
R1
Normal
●●●●●●●●●
●T
R2
R4
R1
L4
L2
L3
L1
Normal
●●
●●●●●
●
U
L3
L4
R4
R1
R2
R3
L1
L2
Normal
●●●●●●●
●●
W
R3
R2
R1
L4
L3
L1
L2
Normal
●●●●●●●
●CA
R4
L1
R1
L5
L4
L2
R3
R2
L3
Normal
●●
●●
●●
●●
●●
CO
R1
R2
R4
R3
L1
L3
L2
Normal
●●●●●●●
●
G
L4
L5
L3
L2
L1
R4
R1
R3
R2
R5
Normal
●●●●●●●●
●●
●R
L4
L3
L5
L1
L2
R3
R4
R2
R1
R5
Normal
●●
●●
●●●●●●
●Z
R3
R2
R1
L3
L5
L4
L1
L2
Normal
●●●
●●●●●
●
Right and Leftglands mix
Nature Genetics: doi:10.1038/ng.3214
6
Supplementary Figure 3. Bulk tumor copy number profiles recapitulate single gland ITH. Individual tumor glands may exhibit distinct CNAs. When present, such heterogeneity was also noted in the bulk fragments collected from either the left or right side of the tumor, as illustrated here for representative carcinoma G.
G R
1
Chr 5
��
���
���
G R
2��
���
���
G R
3��
���
��0
G R
4��
���
���
G R
5��
���
���
G R
bul
k��
���
���
���
���
���
���
���
���
��0
0��
���
���
���
���
���
���
���
���
���
���
LRR BAF
Nature Genetics: doi:10.1038/ng.3214
7
Supplementary Figure 4. The impact of sampling bias on tumor ancestral tree reconstruction. CNA profiles from individual glands sampled from either the right (R#) or left (L#) side of the tumor were used to generate phylogenetic trees and were compared with trees generated using tumor-wide data taken from both the right and left side. As a result of extensive ITH, the reconstruction of phylogenies based on a single tumor region does not yield a sub-tree that is representative of the combined tumor-wide, but rather an erroneous phylogeny.
K (TXPRUïZLGH�
L4R4R3
R2R1
L1L3L2
Normal
●●●
●●
●●●
●.��5LJKW�VLGH�RQl\�
R1
R4
R3
R2
Normal
●●●●
●.��/HIW�VLGH�RQl\�
L4
L3
L1
L2
Normal
●●●●
●S (TXPRUïZLGH�
R5R4R2R1L5L4L3L1L2
Normal
●●●●●●●●● ●
6��5LJKW�VLGH�RQl\�
R5
R4
R1
R2
Normal
●●●●
●6��/HIW�VLGH�RQl\�
L5
L4
L3
L1
L2
Normal
●●●●●
●
P (TXPRUïZLGH�
L3R4
L4L2R3R2R1L1Normal
●●
●●●●●●●
3��/HIW�VLGH�RQl\�
L3
L4
L2
L1
Normal
●●
●●●
X (TXPRUïZLGH�
R1R2
L4R3
L1L3L2
Normal
●●
●●
●●●
●;��5LJKW�VLGH�RQl\�
R3
R1
R2
Normal
●
●
●
●;��/HIW�VLGH�RQl\�
L4
L1
L3
L2
Normal
●●●●
●
M (TXPRUïZLGH�
R3R2
L5L2R4R1R5L3
L1L4
Normal
●●●●●●●●
●●●0��5LJKW�VLGH�RQl\�
R5
R1
R4
R3
R2
Normal
●●●
●●●
0��/HIW�VLGH�RQl\�
L4
L3
L1
L5
L2
Normal
●●
●●
●●
N (TXPRUïZLGH�
L3L4L1R4R3R1R2L5L2R5
Normal
●●●●●●●●●●
●1��5LJKW�VLGH�RQl\�
R5
R4
R3
R1
R2
Normal
●●●●●
●1��/HIW�VLGH�RQl\�
L3
L4
L1
L5
L2
Normal
●●●●●
●
O (TXPRUïZLGH�
L5L3L4L2L1
R4R3R2
R1Normal
●●●●●●●●● ●
2��5LJKW�VLGH�RQl\�
R4
R3
R1
R2
Normal
●●●
●●
2��/HIW�VLGH�RQl\�
L1
L5
L3
L4
L2
Normal
●●●●●
●T (TXPRUïZLGH�
R2R4
R1L4L2L3L1
Normal
●●
●●●●●
●7��5LJKW�VLGH�RQl\�
R4
R1
R2
Normal
●
●
●
●7��/HIW�VLGH�RQl\�
L4
L3
L1
L2
Normal
●●●●
●
U (TXPRUïZLGH�
L3L4R4R1R2R3
L1L2
Normal
●●●●●●●
●●
8��5LJKW�VLGH�RQl\�
R4
R3
R1
R2
Normal
●●●●
●8��/HIW�VLGH�RQl\�
L3
L4
L1
L2
Normal
●●●
●●
W (TXPRUïZLGH�
R3R2R1L4L3L1L2
Normal
●●●●●●●
●:��5LJKW�VLGH�RQl\�
R3
R1
R2
Normal
●
●
●
●:��/HIW�VLGH�RQl\�
L4
L3
L1
L2
Normal
●●●●
●
CA (TXPRUïZLGH�
R4L1
R1L5
L4L2
R3R2
L3Normal
●● ●● ●● ●●●●
&$��5LJKW�VLGH�RQl\�
R4
R1
R3
R2
Normal
●●●●
●&$��/HIW�VLGH�RQl\�
L1
L4
L3
L5
L2
Normal
●●
●●●
●CO (TXPRUïZLGH�
R1R2R4R3
L1L3L2
Normal
●●●●●●●
●&2��5LJKW�VLGH�RQl\�
R4
R3
R1
R2
Normal
●●●●
●&2��/HIW�VLGH�RQl\�
L1
L3
L2
Normal
●
●
●
●
G (TXPRUïZLGH�
L4L5L3L2L1R4R1
R3R2
R5Normal
●●●●●●●●
●●●
*��5LJKW�VLGH�RQl\�
R4
R1
R5
R3
R2
Normal
●●
●●
●●
*��/HIW�VLGH�RQl\�
L4
L5
L3
L1
L2
Normal
●●●●●
●R (TXPRUïZLGH�
L4L3
L5L1
L2R3R4R2R1R5
Normal
●●●●
●●●●●●
●5��5LJKW�VLGH�RQl\�
R5
R3
R1
R4
R2
Normal
●●●●●
●5��/HIW�VLGH�RQl\�
L4
L3
L5
L1
L2
Normal
●●
●●
●●
Z (TXPRUïZLGH�
R3R2R1
L3L5L4L1L2
Normal
●●●
●●●●●
●=��5LJKW�VLGH�RQl\�
R3
R1
R2
Normal
●
●
●
●=��/HIW�VLGH�RQl\�
L3
L5
L4
L1
L2
Normal
●●●●●
●
Nature Genetics: doi:10.1038/ng.3214
8
adenoma K
carcinoma N
a b
d e carcinoma M
adenoma S adenoma P
carcinoma O
c
f
g h carcinoma T carcinoma U
adenoma X
carcinoma W
i
J
Alle
lic F
requ
ency
●●
●
●
●●●●
●
●
●
●●● ●● ●
●
●
●●
●
●
●●
●
● ●● ●● ●● ●
●●●
●
●
●
●●● ●● ●●●● ●● ●●
●●●●
●
●●● ●● ●
●●●●
●
●
●●
●
● ●● ●●
●
●
●
●
●
●●● ●
●●
●●
●
●
●●● ●● ●
●●
●●
●
●
●●● ●● ●●●● ●● ●
●●
●
●
●
●
●●● ●● ●
●
●
●
●
●
●●
●●
●
● ●● ●● ●● ●●● ●● ●
●
●
●
●
●
●●● ●●●
●
●●● ●● ●●●● ●● ●
●
●
●
●●●
●
●●
●●
●
●●● ●● ●●●● ●● ●
●
●●●
●
●
●
● ●●●
●
●●● ●● ●●●
●
● ●
●
● ●● ●● ●
●●●
●
●●
●●● ●● ●
●●
●●
●
● ●● ●● ●● ●●● ●● ●
●●●
●
●
●
●●●
●●
●
●●● ●● ●●●● ●● ●
●
●●●
●
●
●●
● ●
●●
●●● ●● ●●●
●●
●
●
●● ●● ●● ●●● ●● ●
●●
●●●
●
●●
●
● ●
●
● ●● ●● ●●●● ●● ●
●
●
●
●
●●
0.0
0.2
0.4
0.6
0.8
1.0
(Cnt
rl) A
PC,c
.236
5C>T
[chr
5,st
p]AB
LIM
3,c.
1108
C>T
[chr
5,ns
n]AD
CY8,
c.85
7C>T
[chr
8,ns
n]AN
KRD5
0,c.
2251
G>A
[chr
4,ns
n]CH
D7,c
.651
3C>T
[chr
8,sy
n]CH
ST12
,c.1
72A>
T [c
hr7,
nsn]
CIC,
c.41
6G>A
[chr
19,n
sn]
CILP
,c.2
71C>
A [c
hr15
,syn
]EM
C3,c
.440
G>A
[chr
3,ns
n]EN
THD2
,c.4
13C>
T [c
hr17
,nsn
]G
ALC,
c.84
7G>A
[chr
14,n
sn]
GAL
NTL6
,c.3
77G
>A [c
hr4,
nsn]
GLI
S1,c
.141
5C>T
[chr
1,ns
n]HE
LZ2,
c.21
01G
>C [c
hr20
,nsn
]IM
PG1,
c.15
19C>
T [c
hr6,
stp]
LCT,
c.11
07G
>T [c
hr2,
nsn]
MAP
3K1,
c.40
74G
>A [c
hr5,
syn]
MYB
L1,c
.130
1C>T
[chr
8,ns
n]O
R4X2
,c.1
92C>
A [c
hr11
,stp
]O
R7C2
,c.4
83G
>T [c
hr19
,nsn
]PL
CL1,
c.20
83C>
A [c
hr2,
nsn]
SDCC
AG8,
c.11
20C>
T [c
hr1,
stp]
SF1,
c.26
8G>T
[chr
11,s
tp]
SLC6
A17,
c.56
5G>A
[chr
1,ns
n]TM
EM20
0A,c
.167
G>A
[chr
6,ns
n]TM
EM25
,c.1
87G
>T [c
hr11
,nsn
]TN
C,c.
2072
G>A
[chr
9,ns
n]TR
PV6,
c.21
51C>
T [c
hr7,
syn]
ZNF3
41,c
.161
2T>C
[chr
20,n
sn]
ZNF6
41,c
.104
9G>A
[chr
12,n
sn]
●●
●
●
●●●●
●
●
●
●●● ●● ●
●
●
●●
●
●
●●
●
● ●● ●● ●● ●
●●●
●
●
●
●●● ●● ●●●● ●● ●●
●●●●
●
●●● ●● ●
●●●●
●
●
●●
●
● ●● ●●
●
●
●
●
●
●●● ●
●●
●●
●
●
●●● ●● ●
●●
●●
●
●
●●● ●● ●●●● ●● ●
●●
●
●
●
●
●●● ●● ●
●
●
●
●
●
●●
●●
●
● ●● ●● ●● ●●● ●● ●
●
●
●
●
●
●●● ●●●
●
●●● ●● ●●●● ●● ●
●
●
●
●●●
●
●●
●●
●
●●● ●● ●●●● ●● ●
●
●●●
●
●
●
● ●●●
●
●●● ●● ●●●
●
● ●
●
● ●● ●● ●
●●●
●
●●
●●● ●● ●
●●
●●
●
● ●● ●● ●● ●●● ●● ●
●●●
●
●
●
●●●
●●
●
●●● ●● ●●●● ●● ●
●
●●●
●
●
●●
● ●
●●
●●● ●● ●●●
●●
●
●
●● ●● ●● ●●● ●● ●
●●
●●●
●
●●
●
● ●
●
● ●● ●● ●●●● ●● ●
●
●
●
●
●● 1/2
1/3
2/3
●
●
Right Gland5LJKW�%XON�:KROHïexomeRight Bulk TargetedLeft Gland/HIW�%XON�:KROHïexomeLeft Bulk Targeted
Alle
lic F
requ
ency
●
●●●
●● ●●
●
●
●●● ● ●●● ●
●●●
●
●●● ●
●
●●
●●● ●
●●●
●
●●● ●
●●●
●
●●● ●
●●
●
●
●●● ●● ●● ● ●●● ●
●
●
●
●
0.0
0.2
0.4
0.6
0.8
1.0
(Cnt
rl) K
RAS,
c.34
G>A
[chr
12,n
sn]
ITPR
2,c.
877C
>T [c
hr12
,nsn
]
NEB,
c.12
278C
>T [c
hr2,
nsn]
NRXN
1,c.
3643
G>A
[chr
2,ns
n]
PCNX
,c.4
41T>
A [c
hr14
,syn
]
PIP4
K2B,
c.41
2C>T
[chr
17,n
sn]
RBP1
,c.2
77C>
T [c
hr3,
nsn]
SYNE
2,c.
1405
C>T
[chr
14,s
tp]
ZFP1
4,c.
489G
>C [c
hr19
,nsn
]
●
●●●
●● ●●
●
●
●●● ● ●●● ●
●●●
●
●●● ●
●
●●
●●● ●
●●●
●
●●● ●
●●●
●
●●● ●
●●
●
●
●●● ●● ●● ● ●●● ●
●
●
●
●
1/2
●
●
Right Gland5LJKW�%XON�:KROHïexomeRight Bulk TargetedLeft Gland/HIW�%XON�:KROHïexomeLeft Bulk Targeted
Alle
lic F
requ
ency
●●
●
●
●
●●
●● ●
●
●● ●
●●
●●
●●●
0.0
0.2
0.4
0.6
0.8
1.0
(Cnt
rl) A
PC,c
.241
3C>T
[chr
5,st
p]
APC,
c.42
22G
>T [c
hr5,
stp]
KRAS
,c.3
51A>
T [c
hr12
,nsn
]
●●
●
●
●
●●
●● ●
●
●● ●
●●
●●
●●● 1/2
●
●
Right Gland5LJKW�%XON�:KROHïexomeRight Bulk TargetedLeft Gland/HIW�%XON�:KROHïexomeLeft Bulk Targeted
Alle
lic F
requ
ency
●● ●
●●
●
●
●
●●
●●● ●● ●● ●● ●
●
●●
●
●
●
●● ●
●●
● ●● ●●● ●●●
●●● ●● ●
●
● ●● ●●● ●● ●● ●● ● ●●● ●● ●● ●● ● ●●● ●● ●● ●● ● ●●● ●● ●● ●● ●
●
●
●
●●
●●● ●● ●●● ●● ●● ●● ● ●●● ●● ●● ●● ● ●●
●●
● ●● ●● ● ●●● ●● ●● ●● ●0.0
0.2
0.4
0.6
0.8
1.0
(Cnt
rl) K
RAS,
c.35
G>A
[chr
12,n
sn]
AGTR
1,c.
243G
>T [c
hr3,
nsn]
APC,
c.23
09C>
G [c
hr5,
stp]
APC,
c.43
48C>
T [c
hr5,
stp]
ATP8
B2,c
.197
2C>T
[chr
1,ns
n]
EHM
T1,c
.286
6G>A
[chr
9,ns
n]
GFP
T2,c
.195
0C>T
[chr
5,sy
n]
ITIH
1,c.
1557
G>A
[chr
3,sy
n]
OLI
G3,
c.31
9C>T
[chr
6,ns
n]
OR4
S2,c
.869
C>A
[chr
11,n
sn]
PCDH
9,c.
3623
G>A
[chr
13,n
sn]
SLIT
RK5,
c.13
52G
>A [c
hr13
,nsn
]
WDR
61,c
.438
G>A
[chr
15,s
yn]
ZDHH
C1,c
.126
4C>T
[chr
16,s
tp]
●● ●
●●
●
●
●
●●
●●● ●● ●● ●● ●
●
●●
●
●
●
●● ●
●●
● ●● ●●● ●●●
●●● ●● ●
●
● ●● ●●● ●● ●● ●● ● ●●● ●● ●● ●● ● ●●● ●● ●● ●● ● ●●● ●● ●● ●● ●
●
●
●
●●
●●● ●● ●●● ●● ●● ●● ● ●●● ●● ●● ●● ● ●●
●●
● ●● ●● ● ●●● ●● ●● ●● ●
1/2
●
●
Right Gland5LJKW�%XON�:KROHïexomeRight Bulk TargetedLeft Gland/HIW�%XON�:KROHïexomeLeft Bulk Targeted
Alle
lic F
requ
ency
●
●
●●●●●
●
●
● ●●●
●
●●● ●●
●
●
●
●
●●●
●
●
●●● ●●
●
●
●
●
●
●●
●●
●●●● ●
●
●
●●
●●●
●
●
●●● ●●
●
●
●
●
●
●●●
●
●
●
●● ●
●
●
●
●
●
●
●●
●
●●● ●●
●
●
●
●
●
●
●
●
●
●●● ●●
●
●
●
●
●
●●
●●
●●● ●
●
●
●
●
●
●●●
●●● ●● ●●
●
●
●
●
●
●
●
●●● ●●
●
●● ●● ●● ●● ●●● ●●
●
●
●
●●●
●
●●
●●● ●●
●
●
●
●
●
●
●
●
●
●●● ●●
●
●
●
●
●
●
●
●
●
●●● ●●
●
●● ●● ●● ●● ●●● ●●
●
●
●
●
●
●●
● ●
●●● ●●
●
●
●
●
●
●●●●
●●● ●●
●
●
●
●
●
●●
●
●
●●● ●●
●
●● ●● ●● ●● ●●● ●●
●
●
●
●
●●●●
●●● ●●
●
●
●
●
●
●
●
●
●
●●● ●●
●
●
●
●
●
●
●
●
●
●●
●
●●● ●● ●● ●●●● ●●
●
●
●
●
●
●
●●
●
●●● ●
●
●
●
●
●
● ●●●
●●● ●
●
●
●
●
●●
●
●●
●●● ●●
●
●
●
●
●
●
●
● ●
●●● ●
●
● ●
●
●
●
●
0.0
0.2
0.4
0.6
0.8
1.0
(Cnt
rl) A
PC,c
.401
2C>T
[chr
5,st
p]AG
L,c.
4364
C>T
[chr
1,ns
n]AR
MC3
,c.1
294C
>T [c
hr10
,nsn
]CD
H1,c
.220
C>T
[chr
16,s
tp]
CYP2
C8,c
.1A>
G [c
hr10
,nsn
]DN
M3,
c.22
29C>
T [c
hr1,
syn]
EPHX
3,c.
954C
>T [c
hr19
,syn
]HO
OK1
,c.5
88G
>A [c
hr1,
syn]
HSDL
2,c.
58C>
T [c
hr9,
nsn]
IGF2
BP3,
c.15
02G
>C [c
hr7,
nsn]
MEF
V,c.
1319
G>A
[chr
16,n
sn]
OLF
ML3
,c.4
48C>
T [c
hr1,
stp]
PIK3
CA,c
.109
3G>A
[chr
3,ns
n]PO
LQ,c
.139
9C>T
[chr
3,st
p]PT
GER
3,c.
1170
C>A
[chr
1,sy
n]PT
PRK,
c.15
16A>
T [c
hr6,
stp]
PTPR
K,c.
1518
G>A
[chr
6,sy
n]RG
S7,c
.874
G>T
[chr
1,st
p]RN
F17,
c.16
29G
>A [c
hr13
,syn
]RP
1L1,
c.21
09A>
G [c
hr8,
syn]
RYR2
,c.7
206T
>C [c
hr1,
syn]
S1PR
1,c.
1050
C>T
[chr
1,sy
n]SO
X7,c
.305
G>A
[chr
8,ns
n]ST
X1B,
c.57
0G>A
[chr
16,s
yn]
TAS1
R2,c
.903
C>T
[chr
1,sy
n]TM
CC2,
c.12
72G
>A [c
hr1,
syn]
ULK3
,c.5
47G
>A [c
hr15
,nsn
]ZD
HHC2
4,c.
496C
>T [c
hr11
,stp
]
●
●
●●●●●
●
●
● ●●●
●
●●● ●●
●
●
●
●
●●●
●
●
●●● ●●
●
●
●
●
●
●●
●●
●●●● ●
●
●
●●
●●●
●
●
●●● ●●
●
●
●
●
●
●●●
●
●
●
●● ●
●
●
●
●
●
●
●●
●
●●● ●●
●
●
●
●
●
●
●
●
●
●●● ●●
●
●
●
●
●
●●
●●
●●● ●
●
●
●
●
●
●●●
●●● ●● ●●
●
●
●
●
●
●
●
●●● ●●
●
●● ●● ●● ●● ●●● ●●
●
●
●
●●●
●
●●
●●● ●●
●
●
●
●
●
●
●
●
●
●●● ●●
●
●
●
●
●
●
●
●
●
●●● ●●
●
●● ●● ●● ●● ●●● ●●
●
●
●
●
●
●●
● ●
●●● ●●
●
●
●
●
●
●●●●
●●● ●●
●
●
●
●
●
●●
●
●
●●● ●●
●
●● ●● ●● ●● ●●● ●●
●
●
●
●
●●●●
●●● ●●
●
●
●
●
●
●
●
●
●
●●● ●●
●
●
●
●
●
●
●
●
●
●●
●
●●● ●● ●● ●●●● ●●
●
●
●
●
●
●
●●
●
●●● ●
●
●
●
●
●
● ●●●
●●● ●
●
●
●
●
●●
●
●●
●●● ●●
●
●
●
●
●
●
●
● ●
●●● ●
●
● ●
●
●
●
●
1/2
●
●
Right Gland5LJKW�%XON�:KROHïexomeRight Bulk TargetedLeft Gland/HIW�%XON�:KROHïexomeLeft Bulk Targeted
Alle
lic F
requ
ency
●●
●● ●
●
●●
●
●●
●
●
●●●
●●
●●
●
●●●
●
●●
●
●
●
●
●●●
●
●●
●
●●● ●●●●
●●
●●
●●●
●
●
●
●●
●
●
●
●●
●
●
●●● ●●
● ●●
●●
●
●●
●
●
●●● ●●
●
●
●
●
●
●●
●
●●
●
●
●
●●
●●● ●● ●
●
●
● ●
●●●
●
●
●
●●
●
●
●●●●
●
●
●●
●
●
●●
●●●
●
●●
●
●
●●
●
●
●
●
●●
●
●
●●
●
●
● ●
●●
●
●
●●●
●
●
●
●●
●
●
●
●●
●●
●
●●
●
●
●●●●
●●
●●
●
●●●
●●●
●
●●
●
●
●●
●●●
●
●●
●
●
●●●
●
●
●
●●
●
●
●●●
●
●
●
●●
●
●●● ●● ●
●●
● ●
●●
●
●●
●●
●●●●
●
●
●●
●
●
●
●
●
●
●●
●●
●●
0.0
0.2
0.4
0.6
0.8
1.0
(Cnt
rl) T
P53,
c.52
7G>T
[chr
17,n
sn]
AK8,
c.82
7C>A
[chr
9,ns
n]BT
NL2,
c.96
1G>A
[chr
6,ns
n]CU
L7,c
.508
9G>C
[chr
6,ns
n]CU
L9,c
.870
A>C
[chr
6,sy
n]DL
X3,c
.16G
>T [c
hr17
,nsn
]G
PR6,
c.29
4G>A
[chr
6,sy
n]G
TF2F
1,c.
550G
>C [c
hr19
,nsn
]HU
NK,c
.631
G>A
[chr
21,n
sn]
KRAS
,c.3
5G>A
[chr
12,n
sn]
LRSA
M1,
c.19
39G
>A [c
hr9,
nsn]
MAM
L1,c
.295
1C>G
[chr
5,ns
n]NA
LCN,
c.11
14C>
T [c
hr13
,nsn
]O
R14C
36,c
.667
G>A
[chr
1,ns
n]O
R4C1
2,c.
666T
>G [c
hr11
,syn
]PC
DH17
,c.9
42C>
T [c
hr13
,syn
]PE
LP1,
c.40
8G>T
[chr
17,n
sn]
PREX
1,c.
2992
C>T
[chr
20,n
sn]
PVRL
3,c.
995T
>C [c
hr3,
nsn]
SETD
B1,c
.171
3T>G
[chr
1,sy
n]SI
GLE
C1,c
.500
6G>A
[chr
20,n
sn]
SLC5
A4,c
.167
G>C
[chr
22,n
sn]
TSPY
L5,c
.468
G>A
[chr
8,sy
n]ZB
TB16
,c.7
29G
>A [c
hr11
,syn
]ZB
TB4,
c.14
06C>
T [c
hr17
,nsn
]ZD
HHC5
,c.1
471C
>T [c
hr11
,nsn
]ZS
WIM
4,c.
1261
T>C
[chr
19,n
sn]
●●
●● ●
●
●●
●
●●
●
●
●●●
●●
●●
●
●●●
●
●●
●
●
●
●
●●●
●
●●
●
●●● ●●●●
●●
●●
●●●
●
●
●
●●
●
●
●
●●
●
●
●●● ●●
● ●●
●●
●
●●
●
●
●●● ●●
●
●
●
●
●
●●
●
●●
●
●
●
●●
●●● ●● ●
●
●
● ●
●●●
●
●
●
●●
●
●
●●●●
●
●
●●
●
●
●●
●●●
●
●●
●
●
●●
●
●
●
●
●●
●
●
●●
●
●
● ●
●●
●
●
●●●
●
●
●
●●
●
●
●
●●
●●
●
●●
●
●
●●●●
●●
●●
●
●●●
●●●
●
●●
●
●
●●
●●●
●
●●
●
●
●●●
●
●
●
●●
●
●
●●●
●
●
●
●●
●
●●● ●● ●
●●
● ●
●●
●
●●
●●
●●●●
●
●
●●
●
●
●
●
●
●
●●
●●
●●
1/2
1/3
2/3
1/4
3/4
●
●
Right Gland5LJKW�%XON�:KROHïexomeRight Bulk TargetedLeft Gland/HIW�%XON�:KROHïexomeLeft Bulk Targeted
Alle
lic F
requ
ency
●
●
●●●●
●
●●
●
●
●
●●
●
●● ●●●
●●
●●●
●
●● ●
●
●●
●
●●●
●
●● ●● ●● ●
●
●
●
●
●
●
●
●
●
●
●
●●● ●● ●
●
●
●
●
●
●●● ●● ●
●●
●
●●
●●●
●
●
●● ●●
●●
●
●
●
●
●●
●●
●
●●
0.0
0.2
0.4
0.6
0.8
1.0
(Cnt
rl) C
TNND
1,c.
729C
>T [c
hr11
,syn
]
ARHG
AP12
,c.1
149G
>A [c
hr10
,syn
]
DOCK
3,c.
5916
G>A
[chr
3,sy
n]
FPR1
,c.3
95G
>T [c
hr19
,nsn
]
HMOX
1,c.
695G
>A [c
hr22
,nsn
]
RPS6
KL1,
c.10
15G
>A [c
hr14
,nsn
]
RYR2
,c.1
2537
A>T
[chr
1,ns
n]
SLC2
2A5,
c.15
39C>
T [c
hr5,
syn]
SPAT
C1L,
c.11
9A>T
[chr
21,n
sn]
●
●
●●●●
●
●●
●
●
●
●●
●
●● ●●●
●●
●●●
●
●● ●
●
●●
●
●●●
●
●● ●● ●● ●
●
●
●
●
●
●
●
●
●
●
●
●●● ●● ●
●
●
●
●
●
●●● ●● ●
●●
●
●●
●●●
●
●
●● ●●
●●
●
●
●
●
●●
●●
●
●● 1/2
1/3
2/3
1/4
3/4
●
●
Right Gland5LJKW�%XON�:KROHïexomeRight Bulk TargetedLeft Gland/HIW�%XON�:KROHïexomeLeft Bulk Targeted
Alle
lic F
requ
ency
● ●
●
●● ● ●●●●
●●● ●●
●
●●
●●
●●● ●●
●
●●
●
●
●●● ●
●
●●● ●● ●●● ●●
●●●
●●
●●● ●●
●
●
●
●●
●●
●
●
●
●
●
●●●
●
●
●●●
●● ●● ●
●
●
●
●●● ●● ●
●●●●
●● ●● ●
●
●●● ●
●
●
●
●
●
●
●●
●
●
●● ●● ● ●●● ●●
●
●●●
●
●●● ●●
● ●●●●
●
●●
●
●
●● ●● ●0.0
0.2
0.4
0.6
0.8
1.0
(Cnt
rl) A
PC,c
.401
2C>T
[chr
5,st
p]
A2M
,c.1
081C
>T [c
hr12
,stp
]
ANKR
D11,
c.34
37C>
T [c
hr16
,nsn
]
CA11
,c.6
96C>
T [c
hr19
,syn
]
CHD9
,c.6
689C
>T [c
hr16
,nsn
]
CNTN
6,c.
1340
C>T
[chr
3,ns
n]
LRP1
B,c.
8185
G>A
[chr
2,ns
n]
MCA
M,c
.162
4A>G
[chr
11,n
sn]
NCAN
,c.3
842G
>A [c
hr19
,nsn
]
NIPB
L,c.
4202
T>A
[chr
5,ns
n]
OCE
L1,c
.336
G>C
[chr
19,n
sn]
PFKF
B3,c
.121
1C>G
[chr
10,n
sn]
PTPR
B,c.
1847
G>C
[chr
12,n
sn]
SEC2
4B,c
.252
4C>G
[chr
4,ns
n]
SNRN
P200
,c.3
455G
>A [c
hr2,
nsn]
● ●
●
●● ● ●●●●
●●● ●●
●
●●
●●
●●● ●●
●
●●
●
●
●●● ●
●
●●● ●● ●●● ●●
●●●
●●
●●● ●●
●
●
●
●●
●●
●
●
●
●
●
●●●
●
●
●●●
●● ●● ●
●
●
●
●●● ●● ●
●●●●
●● ●● ●
●
●●● ●
●
●
●
●
●
●
●●
●
●
●● ●● ● ●●● ●●
●
●●●
●
●●● ●●
● ●●●●
●
●●
●
●
●● ●● ●
1/2
1/3
2/3
1/4
3/4
●
●
Right Gland5LJKW�%XON�:KROHïexomeRight Bulk TargetedLeft Gland/HIW�%XON�:KROHïexomeLeft Bulk Targeted
Alle
lic F
requ
ency
●
●
●●●
●●
●●
●
●●● ●●
●
●●●
●
●●● ●●
●●
●
●●
●●● ●●
● ●●●
●
●●●● ●
●●● ●● ●●● ●●
● ●●
●●
0.0
0.2
0.4
0.6
0.8
1.0
(Cnt
rl) A
PC,c
.241
3C>T
[chr
5,st
p]
DLG
AP3,
c.11
94C>
G [c
hr1,
nsn]
EPHA
3,c.
2259
G>T
[chr
3,ns
n]
NKAI
N2,c
.115
G>A
[chr
6,ns
n]
NUDC
D1,c
.142
7A>G
[chr
8,ns
n]
RSPR
Y1,c
.631
G>A
[chr
16,n
sn]
●
●
●●●
●●
●●
●
●●● ●●
●
●●●
●
●●● ●●
●●
●
●●
●●● ●●
● ●●●
●
●●●● ●
●●● ●● ●●● ●●
● ●●
●●
1/2
1/3
2/3
1/4
3/4
●
●
Right Gland5LJKW�%XON�:KROHïexomeRight Bulk TargetedLeft Gland/HIW�%XON�:KROHïexomeLeft Bulk Targeted
Alle
lic F
requ
ency
●●●
●
●●
●●●●
●
●●●
●
●
●●●
●
●●●●
●
●
●●
●
●
0.0
0.2
0.4
0.6
0.8
1.0
(Cnt
rl) K
RAS,
c.18
2A>T
[chr
12,n
sn]
PIK3
CA,c
.163
7A>G
[chr
3,ns
n]
PIK3
CA,c
.314
1T>G
[chr
3,ns
n]
●●●
●
●●
●●●●
●
●●●
●
●
●●●
●
●●●●
●
●
●●
●
● 1/2
●
●
Right Gland5LJKW�%XON�:KROHïexomeRight Bulk TargetedLeft Gland/HIW�%XON�:KROHïexomeLeft Bulk Targeted
Nature Genetics: doi:10.1038/ng.3214
9
Supplementary Figure 5. Mutational profiling of single glands. The allelic frequencies of additional mutations based on single gland targeted deep sequencing for tumors K, S, X, M, N, O, T and U are presented here, and corroborate the findings reported in Supplementary Figure 1B. Note that the whole right side of adenoma K is triploid, hence the allelic frequencies are shifted to either 0.33 or 0.66. Sub-clone segregation is apparent in the adenomas, whereas variegation is typical of the carcinomas. Putative driver mutations found in all cells within a tumor (public) were used as a control. The mean depth of coverage per mutation was 626.58±20.2 (95% CI).
Nature Genetics: doi:10.1038/ng.3214
10
Supplementary Figure 6. Illustrative simulations show how variegation arises from early mixing in the primordial tumor followed by scattering due to growth, independent of a clone’s fitness. In order to illustrate how genomic variegation can arise from simple growth dynamics, we simulated a small 2D (50x50 gland lattice) tumor that recapitulates the patterns of the larger 3D model used for statistical inference. Tumor glands are tubular in shape, but their average diameter can be approximated to 200 µm, hence a 50x50 gland lattice represents ~1cm2 tumor. The simulation begins with a single gland surrounded by normal tissue (white, not modeled). We model the growth of the gland in a manner analogous to the 3D model where new space is allocated by pushing glands outwards, but we do not account for the within-gland structure. We then simulate the emergence of a new mutant gland (red) with no survival advantage (identical to all other glands) when the tumor is composed of 4 (a), 8 (b), 64 (c) and 1024 (d) glands. The new mutation is passed on to the gland progeny, defining the sub-clone shown in red. The results presented are representative of a large number of simulations, which show that a mutant clone introduced early can result in variegated patterns, whereas a mutant clone that arises at later time points tends to remain segregated. This simple illustration reflects our inference results, which demonstrate that variegated/side-variegated alterations occur early due to random mixing during growth in the primordial neoplasm in the absence of normal cell adhesion, followed by subsequent scattering during expansion. The same phenomenon occurs when the mutant clone harbors a considerable fitness advantage (20%) for a tumor composed of 4 (e), 8 (f), 64 (g), and 1024 (h) glands, indicating that the timing of a mutation is the critical factor in generating genetic variegation, not selection. These results suggest that in some tumors, abnormal cell mixing required for later invasion is expressed very early.
x
y
10
20
30
40
10 20 30 40
x
y
10
20
30
40
10 20 30 40
x
y
10
20
30
40
10 20 30 40
x
y
10
20
30
40
10 20 30 40
x
y
10
20
30
40
10 20 30 40
x
y
10
20
30
40
10 20 30 40
x
y10
20
30
40
10 20 30 40
x
y
10
20
30
40
10 20 30 40
a b c d
e f g h
Nature Genetics: doi:10.1038/ng.3214
11
Supplementary Figure 7. Schematic representation of tumor evolutionary dynamic statistical inference framework. (a) In order to infer the evolutionary dynamics of neoplastic growth, including the mutational timeline, sub-clone fitness changes (σ), and the mutation rate (µ) from single gland genomic data, we employed a 3-dimensional mathematical model of tumor growth and statistical inference framework based on Approximate Bayesian Computation (ABC). The output is a set of posterior probability estimates for the tumor characteristics of interest given the data and the model of reference. (b) Schematic representation of the 3-dimensional spatial model of tumor growth. The model assumes gland fission, and accounts for the acquisition of CNAs or point mutations (indicated by a lightening symbol) that result in clones with a different relative fitness. Each simulation is run with a parameter set θ=(σ,µ) and generates virtual gland profiles such that the simulated data can be compared with the observed data within the inference method.
Nature Genetics: doi:10.1038/ng.3214
12
Supplementary Figure 8. Inferred tumor fitness parameters and mutation rates based on the CNA data. The inferred posterior probability distributions (a) for changes in sub-clone fitness and mutation (CNA) rate are illustrated for all samples. The inference results indicate that the adenomas do not exhibit sub-clone fitness changes and have a mutation rate of 10-6. In contrast, the carcinomas were more variable, as N, T, U, W and G have mutation rates on the order 10-6, while the remainder exhibit an elevated mutation rate (10-5), and all apart from W (MSI+) show moderate to large fitness changes. (b) Glands within the same tumor and even the same side exhibit variation in fitness based on the Shannon index. This suggests that despite evidence for selection, selective sweeps that would homogenize the population are rare and as a result, sub-clones with different fitness levels coexist within the same region. Error bars correspond to the standard deviation.
a bFitness change Mutation rate Fitness change Mutation rate
0 0.2 0.6
K
Fre
quency
00.5
10<8
10<7
10<6
10<5
10<4
K
01
0 0.2 0.6
S
Fre
quency
00.5
10<8
10<7
10<6
10<5
10<4
S
01
0 0.2 0.6
P
Fre
quency
00.5
10<8
10<7
10<6
10<5
10<4
P0
1
0 0.2 0.6
X
Fre
quency
00.5
10<8
10<7
10<6
10<5
10<4
X
01
0 0.2 0.6
M
Fre
quency
00.5
10<8
10<7
10<6
10<5
10<4
M
01
0 0.2 0.6
N
Fre
quency
00.5
10<8
10<7
10<6
10<5
10<4
N
01
0 0.2 0.6
O
Fre
quency
00.5
10<8
10<7
10<6
10<5
10<4
O
01
0 0.2 0.6
TF
requency
00.5
10<8
10<7
10<6
10<5
10<4
T
01
0 0.2 0.6
U
Fre
quency
00.5
10<8
10<7
10<6
10<5
10<4
U
01
0 0.2 0.6
W
Fre
quency
00.5
10<8
10<7
10<6
10<5
10<4
W
01
0 0.2 0.6
CA
Fre
quency
00.5
10<8
10<7
10<6
10<5
10<4
CA
01
0 0.2 0.6
CO
Fre
quency
00.5
10<8
10<7
10<6
10<5
10<4
CO0
1
0 0.2 0.6
G
Fre
quency
00.5
10<8
10<7
10<6
10<5
10<4
G
01
0 0.2 0.6
R
Fre
quency
00.5
10<8
10<7
10<6
10<5
10<4
R
01
0 0.2 0.6
Z
Fre
quency
00.5
10<8
10<7
10<6
10<5
10<4
Z
01
K S P X M N O T U WC
AC
O G R Z
Right+Left glands
Shannon ind
ex
0.0
0.2
0.4
0.6
0.8
1.0
K S P X M N O T U WC
AC
O G R Z
Right glands
Shannon ind
ex
0.0
0.2
0.4
0.6
0.8
1.0
K S P X M N O T U WC
AC
O G R Z
Left glands
Shannon ind
ex
0.0
0.2
0.4
0.6
0.8
1.0
Nature Genetics: doi:10.1038/ng.3214
13
Supplementary Figure 9. Statistical inference on single gland mutational data corroborates findings at the CNA level. (a) The inferred posterior probability distributions for changes in sub-clone fitness and mutation rate based on mutational data are in agreement with the CNA-based inference. The variation in fitness between adjacent glands within the tumor (b) and the timeline during which different classes of alterations arise (c) are also congruent with the results based on the CNA data. Note that public mutations that occur after the transition to an established neoplasm were not found in the simulations selected by the inference framework. Error bars represent the standard deviation.
a
b
cFitness change Mutation rate Fitness change Mutation rateK S P X M N O T U W
Right+Left glands
Shannon ind
ex
0.0
0.2
0.4
0.6
0.8
1.0
K S P X M N O T U W
Right glands
Shannon ind
ex
0.0
0.2
0.4
0.6
0.8
1.0
K S P X M N O T U WLeft glands
Shannon ind
ex
0.0
0.2
0.4
0.6
0.8
1.0
105
107
109
1011
6LGHïVSHFLILF
Tumor size
Fre
quency
0.0
0.4
0.8
105
107
109
1011
6LGHïvariegated
Tumor size
Fre
quency
0.0
0.4
0.8
105
107
109
1011
Variegated
tumour size
Fre
quency
0.0
0.4
0.8
105
107
109
1011
Regional
Tumor size
Fre
quency
0.0
0.4
0.8
105
107
109
1011
Unique
Tumor size
Fre
quency
0.0
0.4
0.8
0 0.2 0.6
K
Fre
qu
en
cy
00
.5
10<8
10<7
10<6
10<5
10<4
K
01
0 0.2 0.6
S
Fre
qu
en
cy
00
.5
10<8
10<7
10<6
10<5
10<4
S
01
0 0.2 0.6
P
Fre
qu
en
cy
00
.5
10<8
10<7
10<6
10<5
10<4
P
01
0 0.2 0.6
X
Fre
qu
en
cy
00
.5
10<8
10<7
10<6
10<5
10<4
X
01
0 0.2 0.6
M
Fre
qu
en
cy
00
.5
10<8
10<7
10<6
10<5
10<4
M
01
0 0.2 0.6
N
Fre
qu
en
cy
00
.5
10<8
10<7
10<6
10<5
10<4
N
01
0 0.2 0.6
O
Fre
qu
en
cy
00
.5
10<8
10<7
10<6
10<5
10<4
O
01
0 0.2 0.6
T
Fre
qu
en
cy
00
.5
10<8
10<7
10<6
10<5
10<4
T
01
0 0.2 0.6
U
Fre
qu
en
cy
00
.5
10<8
10<7
10<6
10<5
10<4
U
01
0 0.2 0.6
W
Fre
qu
en
cy
00
.5
10<8
10<7
10<6
10<5
10<4
W
01
Nature Genetics: doi:10.1038/ng.3214
14
Supplementary Figure 10. Schematic of microenvironment-aware simulation of tumor growth. In order to model the contribution of local microenvironment to selection, microenvironmental niches that select for different genotypic characteristics were simulated. To achieve this, the lattice in which the tumor grows was divided into a grid of separate microenvironmental niches, each of which selects for a specific alteration in a random chromosome (indicated by colored circles) by inducing a higher survival probability in glands that harbor that aberration.
chr... chr... chr... chr...
chr... chr... chr... chr...
chr... chr...
chr... chr...
chr... chr...
chr... chr...
chr... chr... chr... chr...
chr... chr... chr... chr...
chr... chr...
chr... chr...
chr3chr1
chr13
chr8
chr19chr4
chr6chr21
Nature Genetics: doi:10.1038/ng.3214
15
Supplementary Figure 11. The mutation rate and mutational timeline do not change under the assumption of microenvironmental niche-driven growth. (a) The inferred posterior probability distributions for niche size
c dMutation rate Mutation rate
a bMutation rate Mutation rateNiche size Niche size
Niche sizeNiche size
5 20 150
K
Freq
uenc
y
00.
5
10<810<710<610<510<4
K
01
5 20 150
S
Freq
uenc
y
00.
5
10<810<710<610<510<4
S
01
5 20 150
P
Freq
uenc
y
00.
5
10<810<710<610<510<4
P
01
5 20 150
X
Freq
uenc
y
00.
5
10<810<710<610<510<4
X
01
5 20 150
M
Freq
uenc
y
00.
5
10<810<710<610<510<4
M
01
5 20 150
N
Freq
uenc
y
00.
510<810<710<610<510<4
N
01
5 20 150
O
Freq
uenc
y
00.
5
10<810<710<610<510<4
O
01
5 20 150
T
Freq
uenc
y
00.
5
10<810<710<610<510<4
T
01
5 20 150
U
Freq
uenc
y
00.
5
10<810<710<610<510<4
U
01
5 20 150
W
Freq
uenc
y
00.
5
10<810<710<610<510<4
W
01
5 20 150
CA
Freq
uenc
y
00.
5
10<810<710<610<510<4
CA
01
5 20 150
CO
Freq
uenc
y
00.
5
10<810<710<610<510<4
CO
01
5 20 150
G
Freq
uenc
y
00.
5
10<810<710<610<510<4
G
01
5 20 150
R
Freq
uenc
y
00.
5
10<810<710<610<510<4
R
01
5 20 150
Z
Freq
uenc
y
00.
5
10<810<710<610<510<4
Z
01
105 107 109 1011
Public
Tumor size
Freq
uenc
y0.
00.
40.
8
105 107 109 1011
6LGHïVSHFLILF
Tumor size
Freq
uenc
y0.
00.
40.
8
105 107 109 1011
6LGHïvariegated
Tumor size
Freq
uenc
y0.
00.
40.
8
105 107 109 1011
Variegated
Tumor size
Freq
uenc
y0.
00.
40.
8
105 107 109 1011
Regional
Tumor size
Freq
uenc
y0.
00.
40.
8105 107 109 1011
Unique
Tumor size
Freq
uenc
y0.
00.
40.
8
5 20 150
K
Freq
uenc
y
00.
5
10<810<710<610<510<4
K
01
5 20 150
S
Freq
uenc
y
00.
5
10<810<710<610<510<4
S
01
5 20 150
P
Freq
uenc
y
00.
5
10<810<710<610<510<4
P
01
5 20 150
X
Freq
uenc
y
00.
5
10<810<710<610<510<4
X
01
5 20 150
M
Freq
uenc
y
00.
5
10<810<710<610<510<4
M
01
5 20 150
N
Freq
uenc
y
00.
5
10<810<710<610<510<4
N
01
5 20 150
O
Freq
uenc
y
00.
5
10<810<710<610<510<4
O
01
5 20 150
T
Freq
uenc
y
00.
5
10<810<710<610<510<4
T
01
5 20 150
U
Freq
uenc
y
00.
5
10<810<710<610<510<4
U
01
5 20 150
W
Freq
uenc
y
00.
5
10<810<710<610<510<4
W
01
105 107 109 1011
6LGHïVSHFLILF
Tumor size
Freq
uenc
y0.
00.
40.
8
105 107 109 1011
6LGHïvariegated
Tumor size
Freq
uenc
y0.
00.
40.
8
105 107 109 1011
Variegated
Tumor size
Freq
uenc
y0.
00.
40.
8
105 107 109 1011
Regional
Tumor size
Freq
uenc
y0.
00.
40.
8
105 107 109 1011
Unique
Tumor size
Freq
uenc
y0.
00.
40.
8
Nature Genetics: doi:10.1038/ng.3214
16
indicate variable sizes for adenomas versus carcinomas. Despite the fact that microenvironments may differ within different regions of a tumor, the inferred CNA rates are equivalent to the microenvironment-free model. (b) Similarly, the inferred mutational timeline, shown here summarized across all samples, is in agreement with the microenvironment-free model. In particular, the parameter estimates are consistent across cases (as illustrated by the error bars), and indicate that public, as well as the majority of private alterations (side-variegated, variegated, and side-specific) occur early after the transition to an advanced tumor, whereas unique mutations occur late. (c) As in (a), similar results for the niche size and mutation rates were obtained independently using single gland targeted mutational profiles as input to the inference framework. (d) As in (b), the inferred mutational timeline is equivalent using single gland targeted mutational profiles. Public mutations occurring after the transition to an established neoplasm were not found in the simulations selected by the inference framework. Error bars represent the standard deviation.
Nature Genetics: doi:10.1038/ng.3214
17
Supplementary Figure 12. Depth of coverage for targeted deep sequencing. Single glands were sequenced to an average depth of coverage of 626.58±20.2 (95% CI), where boxplots are presented for individual samples (A, right; B, left) from each tumor. Boxplots show the median, limited by the 25th (Q1) and 75th (Q3) percentiles, where whiskers represent the most extreme of the maximum or Q3+1.5(Q3−Q1) and the minimum or Q1−1.5(Q3−Q1), respectively. The filled squares represent outliers.
De
pth
of
Co
ve
rag
e
Sample
KA
KB
MA
MB
NA
NB
OA
OB
PA
PB
SA
SB
TA
TB
UA
UB
WA
WB
XA
XB
200
400
600
800
1000
Nature Genetics: doi:10.1038/ng.3214
18
Supplementary Figure 13. Summary of variability in FISH copy number estimates between tumor and adjacent normal glands based on the Shannon Index. In order to determine whether the estimates of heterogeneity from the FISH data were biased as a result of using 5 µm thick sections (the diameter of colorectal tumor cells may be as large as 10 µm), we compared the number of aberrant events in tumor glands with the counts for adjacent normal colon tissue. As shown in the boxplots, estimates of ITH based on the Shannon Index differ significantly (t-test, p-value<0.01) between the tumor and adjacent normal. For each group, boxplots show the median, limited by the 25th (Q1) and 75th (Q3) percentiles, where whiskers represent the most extreme of the maximum or Q3+1.5(Q3−Q1) and the minimum or Q1−1.5(Q3−Q1), respectively. The open circles represent outliers.
Ch17
NORM
AL
Chr1
7TU
MO
R
Her2
NORM
AL
Her2
TUM
OR
0.0
0.5
1.0
1.5
2.0
Shan
non
Inde
x
●
●
Chr17 cent.Her2
p=0.002p=1.77e-6
Nature Genetics: doi:10.1038/ng.3214
19
Supplementary Figure 14. Validation of the statistical inference framework using synthetic data. To verify the accuracy of our inference framework we generated a synthetic dataset for 21 tumors with different parameters (σ, µ) using our spatial tumor growth model. The results illustrate that the correct parameter values (red circles) are recovered for the majority of cases, indicating the robustness of this approach.
Freq
uenc
yFr
eque
ncy
Freq
uenc
yFr
eque
ncy
Freq
uenc
y
Fitness change Fitness change Fitness changeMutation rate Mutation rate Mutation rate0
0.2
0.6
0.00.10.20.30.40.50.6
●
10<8 10<6 10<40.
00.
6●
0
0.2
0.6
0.00.10.20.30.40.50.6
●
10<8 10<6 10<4
0.0
0.6
●
0
0.2
0.6
0.00.10.20.30.40.50.6
●
10<8 10<6 10<4
0.0
0.6
●
0
0.2
0.6
0.00.10.20.30.40.50.6
●
10<8 10<6 10<4
0.0
0.6
●
0
0.2
0.6
0.00.10.20.30.40.50.6
●
10<8 10<6 10<4
0.0
0.6
●
0
0.2
0.6
0.00.10.20.30.40.50.6
●
10<8 10<6 10<4
0.0
0.6
●
0
0.2
0.6
0.00.10.20.30.40.50.6
●
10<8 10<6 10<4
0.0
0.6
●
0
0.2
0.6
0.00.10.20.30.40.50.6
●
10<8 10<6 10<4
0.0
0.6
●
0
0.2
0.6
0.00.10.20.30.40.50.6
●
10<8 10<6 10<4
0.0
0.6
●
0
0.2
0.6
0.00.10.20.30.40.50.6
●
10<8 10<6 10<4
0.0
0.6
●0
0.2
0.6
0.00.10.20.30.40.50.6
●
10<8 10<6 10<4
0.0
0.6
●
0
0.2
0.6
0.00.10.20.30.40.50.6
●
10<8 10<6 10<4
0.0
0.6
●
0
0.2
0.6
0.00.10.20.30.40.50.6
●
10<8 10<6 10<4
0.0
0.6
●
0
0.2
0.6
0.00.10.20.30.40.50.6
●
10<8 10<6 10<4
0.0
0.6
●
0
0.2
0.6
0.00.10.20.30.40.50.6
●
10<8 10<6 10<4
0.0
0.6
●
0
0.2
0.6
0.00.10.20.30.40.50.6
●
10<8 10<6 10<4
0.0
0.6
●
0
0.2
0.6
0.00.10.20.30.40.50.6
●
10<8 10<6 10<4
0.0
0.6
●
0
0.2
0.6
0.00.10.20.30.40.50.6
●
10<8 10<6 10<4
0.0
0.6
●
0
0.2
0.6
0.00.10.20.30.40.50.6
●
10<8 10<6 10<4
0.0
0.6
●
0
0.2
0.6
0.00.10.20.30.40.50.6
●
10<8 10<6 10<4
0.0
0.6
●
0
0.2
0.6
0.00.10.20.30.40.50.6
●
10<8 10<6 10<4
0.0
0.6
●
Nature Genetics: doi:10.1038/ng.3214
20
Supplementary Tables Supplementary Table 1. Summary of patient clinical information for colorectal adenoma and carcinoma samples. Here, MSI refers to microsatellite instability, whereas MSS indicates microsatellite stability.
Supplementary Table 2. Summary of the number of CNAs detected within each class per tumor.
Patient Age Gender Stage Size (cm) Type MSI statusK 50 M 1 6 Adenoma MSSP 71 M 0 3.5 Adenoma MSSS 66 M 0 6 Adenoma MSSX 69 F 0 2.5 Adenoma MSSM 76 F 2 3 Carcinoma MSSN 71 M 1 2.3 Carcinoma MSSO 41 M 3 9.5 Carcinoma MSST 85 M 3 5.7 Carcinoma MSSU 78 F 2 3.9 Carcinoma MSSW 51 M 1 3.4 Carcinoma MSICO 83 M 2 8 Carcinoma MSSCA 57 M 2 8.5 Carcinoma MSSG 61 M 3 5.3 Carcinoma MSSR 51 M 3 3 Carcinoma MSSZ 83 M 3 7.5 Carcinoma MSS
Patient Public Side-specific Side-variegated Variegated Regional UniqueAdenomas
K 5 3 0 0 0 2S 2 0 0 0 0 0P 0 0 0 0 0 3X 1 0 0 0 0 6
CarcinomasM 1 0 23 2 4 10N 14 0 2 0 1 2O 25 2 2 1 2 7T 25 3 0 0 0 9U 20 2 0 1 0 11W 6 0 0 0 0 0CA 0 0 1 0 1 5CO 12 6 2 0 1 5G 17 2 0 0 3 3R 18 4 3 0 14 5Z 25 16 0 0 2 2
Nature Genetics: doi:10.1038/ng.3214