Supplementary Information for: A Big Bang model of human colorectal tumor growth Andrea Sottoriva 1,6 , Haeyoun Kang 2,3 , Zhicheng Ma 1,6 , Trevor A. Graham 4,5 , Matthew P. Salomon 1 , Junsong Zhao 1 , Paul Marjoram 1 , Kimberly Siegmund 1 , Michael F. Press 2 , Darryl Shibata 2 , Christina Curtis 1, 6 Affiliations 1 Department of Preventive Medicine, Keck School of Medicine of the University of Southern California, Los Angeles, CA USA 2 Department of Pathology, Keck School of Medicine of the University of Southern California, Los Angeles, CA USA 3 Department of Pathology, CHA University, Seongnam-si, Gyeonggi-do, South Korea 4 Center for Evolution and Cancer, University of California, San Francisco, San Francisco, CA USA 5 Centre for Tumor Biology, Barts Cancer Institute, Queen Mary University of London, London, UK 6 Present addresses: Division of Molecular Pathology, Institute of Cancer Research, London, UK (A.S); Department of Medicine, Stanford University, Stanford, CA, USA (Z.M, C.C); Department of Genetics, Stanford University, Stanford, CA, USA (Z.M, C.C) Correspondence should be addressed to: Darryl Shibata ([email protected]) or Christina Curtis ([email protected]) Nature Genetics: doi:10.1038/ng.3214


Supplementary Figures



Supplementary Figure 1. Integrated genomic analysis of ITH across multiple spatial scales. (a) Circos plot representation of copy number aberration (CNA) profiles (where concentric circles correspond to individual glands and bulk samples) reveals extensive heterogeneity at the copy number level in the carcinomas, but not adenomas. Variegated and side-variegated CNAs were found in the majority of carcinomas (7/11), indicative of early mixing, followed by the scattering of sub-clones to distant tumor regions. (b) For all adenomas and a subset of carcinomas, WES was performed on bulk tumor samples, followed by targeted deep sequencing of a mutational panel in single glands. The allelic frequency of a subset of mutations is shown and color-coded according to side (right, left). Canonical drivers (APC, KRAS, or TP53) were commonly mutated and, when present, were clonal within the tumor and thus used as positive controls. For tumor O, the canonical drivers were not mutated, and a clonal mutation in CTNND1 was used as a positive control. For diploid samples, the expected allelic frequencies are 0.5 or 1 (when loss of heterozygosity has occurred). For triploid samples (N, O, T, U and the right side of K), the frequencies are 0.33, 0.66 or 1, and for tetraploid samples they are 0.25, 0.5, 0.75 or 1. The mutational profiles strongly corroborate the CNA data and reveal variegation in all carcinomas, with glands from opposite sides harboring the same private mutation. (c) FISH analysis of the HER2 gene and corresponding chr17 centromere was performed on individual glands (n=3-6 per tumor side) and reveals extensive variability in copy number between physically adjacent cells, as summarized by the Shannon index of diversity. For each group, boxplots show the median, limited by the 25th (Q1) and 75th (Q3) percentiles, where whiskers represent the most extreme of the maximum or Q3+1.5(Q3−Q1) and the minimum or Q1−1.5(Q3−Q1), respectively. Importantly, ITH was uniformly high throughout the tumor, indicating the absence of recent clonal expansions, as postulated by the Big Bang model. The maximum possible heterogeneity corresponds to an index of 1.79 (99% of counts fall within [0,5], and the maximum Shannon index is ln(N), where N=6 is the number of possible count states).
(d) Molecular clock analysis based on neutral methylation tag sequencing corroborates the FISH data, with glands from different tumor regions exhibiting similarly high levels of ITH throughout the tumor. Boxplots are formatted as in panel c. (e) The molecular clock analysis further highlights a complex hierarchy of mitotic sub-clones within glands, as demonstrated by the frequency distribution of intra-gland mitotic clones, where the five most common clones are shown.
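The Shannon index quoted in panel (c) can be illustrated with a short calculation. The sketch below is not the authors' code; it assumes the index is computed with the natural logarithm over the observed copy-number count states, and the function name and example counts are hypothetical.

```python
import math
from collections import Counter

def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln p_i) over the observed count states."""
    tally = Counter(counts)
    total = sum(tally.values())
    return -sum((n / total) * math.log(n / total) for n in tally.values())

# Hypothetical per-cell signal counts for one gland (not data from the study).
example_counts = [2, 2, 3, 4, 2, 5, 1, 0, 3, 2]
gland_index = shannon_index(example_counts)

# With six possible count states (0 to 5), the index is maximal when all
# states are equally frequent, giving ln(6), which rounds to 1.79.
max_index = math.log(6)
```

A gland in which every cell shows the same count has an index of 0, and a gland with all six states equally represented reaches the stated maximum.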

             

             


Supplementary Figure 2. Reconstruction of tumor phylogenetic trees. Tumor phylogenetic trees were reconstructed based on individual gland genome-wide CNA profiles. Sub-clone mixing between right and left glands is apparent in some of the carcinomas, where glands from one side were more similar to other glands from the opposite side than to neighboring glands (indicated by red parentheses). The accurate reconstruction of tumor phylogenies for adenomas S, P and X and carcinoma W (which was microsatellite instable, MSI) was not possible due to minimal CNAs (these are included for completeness, but indicated by transparent shading).

[Figure: phylogenetic trees for tumors K, S, P, X, M, N, O, T, U, W, CA, CO, G, R and Z. Tree tips are individual glands from the right (R1 to R5) and left (L1 to L5) sides plus a Normal reference. Legend: right and left glands mix.]


Supplementary Figure 3. Bulk tumor copy number profiles recapitulate single gland ITH. Individual tumor glands may exhibit distinct CNAs. When present, such heterogeneity was also noted in the bulk fragments collected from either the left or right side of the tumor, as illustrated here for representative carcinoma G.

[Figure: chromosome 5 LRR and BAF tracks for glands G R1 to G R5 and the G R bulk sample.]


Supplementary Figure 4. The impact of sampling bias on tumor ancestral tree reconstruction. CNA profiles from individual glands sampled from either the right (R#) or left (L#) side of the tumor were used to generate phylogenetic trees and were compared with trees generated using tumor-wide data taken from both the right and left sides. As a result of extensive ITH, reconstructing a phylogeny from a single tumor region does not yield a sub-tree that is representative of the combined tumor-wide tree, but rather an erroneous phylogeny.

[Figure: for each tumor (K, S, P, X, M, N, O, T, U, W, CA, CO, G, R and Z), trees reconstructed from tumor-wide glands and, where shown, from right-side-only and left-side-only glands. Tree tips are glands R1 to R5 and L1 to L5 plus a Normal reference.]


[Figure: panels a to j, titled adenoma K, adenoma S, adenoma P, adenoma X, carcinoma M, carcinoma N, carcinoma O, carcinoma T, carcinoma U and carcinoma W. Each panel plots allelic frequency (y axis, 0.0 to 1.0) per mutation (x axis, labeled by gene, coding change, chromosome and type: nsn, syn or stp), with a clonal control mutation (Cntrl) listed first and horizontal reference lines at the expected frequencies (1/2, and where applicable 1/3, 2/3, 1/4 and 3/4). Legend: Right Gland, Right Bulk Whole-exome, Right Bulk Targeted, Left Gland, Left Bulk Whole-exome, Left Bulk Targeted.]


Supplementary Figure 5. Mutational profiling of single glands. The allelic frequencies of additional mutations based on single gland targeted deep sequencing for tumors K, S, X, M, N, O, T and U are presented here, and corroborate the findings reported in Supplementary Figure 1B. Note that the whole right side of adenoma K is triploid, hence the allelic frequencies are shifted to either 0.33 or 0.66. Sub-clone segregation is apparent in the adenomas, whereas variegation is typical of the carcinomas. Putative driver mutations found in all cells within a tumor (public) were used as a control. The mean depth of coverage per mutation was 626.58±20.2 (95% CI).
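The shift in expected allelic frequencies with ploidy follows from simple counting: a mutation present on k of n copies of a locus is expected at frequency k/n. The sketch below illustrates this under the assumption of a pure tumor sample with no contaminating normal cells; the function name is hypothetical.

```python
from fractions import Fraction

def expected_allelic_frequencies(copy_number):
    """Expected frequencies when a mutation sits on 1..n of n copies of a locus."""
    return [Fraction(k, copy_number) for k in range(1, copy_number + 1)]

diploid = expected_allelic_frequencies(2)     # 1/2 and 1
triploid = expected_allelic_frequencies(3)    # 1/3, 2/3 and 1
tetraploid = expected_allelic_frequencies(4)  # 1/4, 1/2, 3/4 and 1
```

For the triploid right side of adenoma K, a heterozygous mutation is therefore expected near 0.33 or 0.66 rather than 0.5.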


Supplementary Figure 6. Illustrative simulations show how variegation arises from early mixing in the primordial tumor followed by scattering due to growth, independent of a clone’s fitness. To illustrate how genomic variegation can arise from simple growth dynamics, we simulated a small 2D tumor (50×50 gland lattice) that recapitulates the patterns of the larger 3D model used for statistical inference. Tumor glands are tubular in shape, but their average diameter can be approximated as 200 µm, hence a 50×50 gland lattice represents a tumor of ~1 cm². The simulation begins with a single gland surrounded by normal tissue (white, not modeled). We model the growth of the gland in a manner analogous to the 3D model, where new space is allocated by pushing glands outwards, but we do not account for the within-gland structure. We then simulate the emergence of a new mutant gland (red) with no survival advantage (identical to all other glands) when the tumor is composed of 4 (a), 8 (b), 64 (c) and 1024 (d) glands. The new mutation is passed on to the gland progeny, defining the sub-clone shown in red. The results presented are representative of a large number of simulations, which show that a mutant clone introduced early can result in variegated patterns, whereas a mutant clone that arises at later time points tends to remain segregated. This simple illustration reflects our inference results, which demonstrate that variegated/side-variegated alterations occur early due to random mixing during growth in the primordial neoplasm in the absence of normal cell adhesion, followed by subsequent scattering during expansion. The same phenomenon occurs when the mutant clone harbors a considerable fitness advantage (20%) and is introduced in a tumor composed of 4 (e), 8 (f), 64 (g), and 1024 (h) glands, indicating that the timing of a mutation, not selection, is the critical factor in generating genetic variegation.
These results suggest that in some tumors, the abnormal cell mixing required for later invasion is expressed very early.

[Figure: eight lattice snapshots, panels a to h, each with x and y axes in gland coordinates (ticks at 10, 20, 30 and 40). Top row: a, b, c, d. Bottom row: e, f, g, h.]


Supplementary Figure 7. Schematic representation of the statistical inference framework for tumor evolutionary dynamics. (a) In order to infer the evolutionary dynamics of neoplastic growth, including the mutational timeline, sub-clone fitness changes (σ), and the mutation rate (µ) from single gland genomic data, we employed a 3-dimensional mathematical model of tumor growth and a statistical inference framework based on Approximate Bayesian Computation (ABC). The output is a set of posterior probability estimates for the tumor characteristics of interest given the data and the model of reference. (b) Schematic representation of the 3-dimensional spatial model of tumor growth. The model assumes gland fission, and accounts for the acquisition of CNAs or point mutations (indicated by a lightning symbol) that result in clones with a different relative fitness. Each simulation is run with a parameter set θ=(σ,µ) and generates virtual gland profiles such that the simulated data can be compared with the observed data within the inference method.
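The general shape of an ABC scheme can be sketched as a rejection sampler: draw θ=(σ,µ) from a prior, simulate data, and keep θ when the simulated summary is close to the observed one. The code below is a generic illustration only. The prior ranges, the toy simulator, the summary statistic and the tolerance are all hypothetical stand-ins; the source does not specify them, and the actual framework uses the 3D tumor growth model in place of `toy_simulate`.

```python
import random

def abc_rejection(observed, simulate, sample_prior, distance, epsilon, n_draws, seed=0):
    """Keep the prior draws whose simulated summary lies within epsilon of the data."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = sample_prior(rng)
        if distance(simulate(theta, rng), observed) <= epsilon:
            accepted.append(theta)
    return accepted  # an empirical sample from the approximate posterior

def sample_prior(rng):
    """Hypothetical priors: sigma uniform on [0, 1], mu log-uniform on [1e-8, 1e-4]."""
    sigma = rng.uniform(0.0, 1.0)
    mu = 10 ** rng.uniform(-8.0, -4.0)
    return sigma, mu

def toy_simulate(theta, rng, n_glands=20, divisions=100_000):
    """Toy stand-in for the tumor model: number of glands carrying an alteration."""
    sigma, mu = theta
    p = min(1.0, mu * divisions * (1.0 + sigma))
    return sum(rng.random() < p for _ in range(n_glands))

def distance(simulated, observed):
    return abs(simulated - observed)

posterior_sample = abc_rejection(
    observed=2, simulate=toy_simulate, sample_prior=sample_prior,
    distance=distance, epsilon=1, n_draws=2000,
)
```

Histograms of the accepted σ and µ values play the role of the posterior distributions; shrinking `epsilon` tightens the approximation at the cost of fewer accepted draws.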


Supplementary Figure 8. Inferred tumor fitness parameters and mutation rates based on the CNA data. (a) The inferred posterior probability distributions for changes in sub-clone fitness and mutation (CNA) rate are illustrated for all samples. The inference results indicate that the adenomas do not exhibit sub-clone fitness changes and have a mutation rate of 10⁻⁶. In contrast, the carcinomas were more variable: N, T, U, W and G have mutation rates on the order of 10⁻⁶, while the remainder exhibit an elevated mutation rate (10⁻⁵), and all apart from W (MSI+) show moderate to large fitness changes. (b) Glands within the same tumor and even the same side exhibit variation in fitness based on the Shannon index. This suggests that despite evidence for selection, selective sweeps that would homogenize the population are rare and, as a result, sub-clones with different fitness levels coexist within the same region. Error bars correspond to the standard deviation.

[Figure: (a) for each tumor (K, S, P, X, M, N, O, T, U, W, CA, CO, G, R and Z), histograms of fitness change (x axis 0 to 0.6, y axis frequency 0 to 0.5) and mutation rate (x axis 10⁻⁸ to 10⁻⁴, y axis frequency 0 to 1). (b) Shannon index (0.0 to 1.0) per tumor for right+left glands, right glands and left glands.]


Supplementary Figure 9. Statistical inference on single gland mutational data corroborates findings at the CNA level. (a) The inferred posterior probability distributions for changes in sub-clone fitness and mutation rate based on mutational data are in agreement with the CNA-based inference. The variation in fitness between adjacent glands within the tumor (b) and the timeline during which different classes of alterations arise (c) are also congruent with the results based on the CNA data. Note that public mutations that occur after the transition to an established neoplasm were not found in the simulations selected by the inference framework. Error bars represent the standard deviation.

[Figure 9 panels: (a) per-tumor posterior frequency distributions of fitness change (0 to 0.6) and mutation rate (10^-8 to 10^-4) for tumors K, S, P, X, M, N, O, T, U and W; (b) Shannon index (0.0 to 1.0) per tumor for Right+Left glands, Right glands and Left glands; (c) frequency versus tumor size (10^5 to 10^11) for side-specific, side-variegated, variegated, regional and unique alterations.]


Supplementary Figure 10. Schematic of microenvironment-aware simulation of tumor growth. In order to model the contribution of local microenvironment to selection, microenvironmental niches that select for different genotypic characteristics were simulated. To achieve this, the lattice in which the tumor grows was divided into a grid of separate microenvironmental niches, each of which selects for a specific alteration in a random chromosome (indicated by colored circles) by inducing a higher survival probability in glands that harbor that aberration.
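The niche scheme described in this caption can be sketched in a few lines of Python. This is a minimal illustration and not the authors' simulation code: the function names, the use of 22 candidate chromosomes, and the `base` and `bonus` survival values are all assumptions made for the example.

```python
import random

def niche_index(x, y, niche_size, niches_per_row):
    """Map lattice coordinates (x, y) to the index of the square niche containing them."""
    return (y // niche_size) * niches_per_row + (x // niche_size)

def assign_niches(lattice_size, niche_size, n_chromosomes=22, seed=0):
    """Give each niche one randomly chosen chromosome whose alteration it favors.

    n_chromosomes=22 is an assumption for this sketch.
    """
    rng = random.Random(seed)
    per_row = -(-lattice_size // niche_size)  # ceiling division
    favored = [rng.randint(1, n_chromosomes) for _ in range(per_row * per_row)]
    return per_row, favored

def survival_probability(gland_alterations, x, y, niche_size, per_row, favored,
                         base=0.5, bonus=0.2):
    """Return a higher survival probability when the gland carries the alteration
    favored by its local niche. `base` and `bonus` are illustrative values only."""
    chrom = favored[niche_index(x, y, niche_size, per_row)]
    if chrom in gland_alterations:
        return min(1.0, base + bonus)
    return base
```

For example, `assign_niches(lattice_size=100, niche_size=20)` divides a 100 x 100 lattice into a 5 x 5 grid of niches, and a gland at (0, 0) receives the bonus only if its set of altered chromosomes contains the chromosome favored by niche 0.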

[Figure 10 schematic: the lattice is drawn as a grid of niches, each labeled with the chromosome it selects for (labels shown include chr1, chr3, chr4, chr6, chr8, chr13, chr19 and chr21).]


Supplementary Figure 11. The mutation rate and mutational timeline do not change under the assumption of microenvironmental niche-driven growth. (a) The inferred posterior probability distributions for niche size indicate variable sizes for adenomas versus carcinomas. Despite the fact that microenvironments may differ within different regions of a tumor, the inferred CNA rates are equivalent to the microenvironment-free model. (b) Similarly, the inferred mutational timeline, shown here summarized across all samples, is in agreement with the microenvironment-free model. In particular, the parameter estimates are consistent across cases (as illustrated by the error bars), and indicate that public, as well as the majority of private alterations (side-variegated, variegated, and side-specific) occur early after the transition to an advanced tumor, whereas unique mutations occur late. (c) As in (a), similar results for the niche size and mutation rates were obtained independently using single gland targeted mutational profiles as input to the inference framework. (d) As in (b), the inferred mutational timeline is equivalent using single gland targeted mutational profiles. Public mutations occurring after the transition to an established neoplasm were not found in the simulations selected by the inference framework. Error bars represent the standard deviation.

[Figure 11 panels: (a) per-tumor posterior frequency distributions of niche size (ticks 5, 20, 150) and mutation rate (10^-8 to 10^-4) for tumors K, S, P, X, M, N, O, T, U, W, CA, CO, G, R and Z; (b) frequency versus tumor size (10^5 to 10^11) for public, side-specific, side-variegated, variegated, regional and unique alterations; (c) as in (a) for tumors K, S, P, X, M, N, O, T, U and W; (d) as in (b) for side-specific, side-variegated, variegated, regional and unique alterations.]


Supplementary Figure 12. Depth of coverage for targeted deep sequencing. Single glands were sequenced to an average depth of coverage of 626.58±20.2 (95% CI), where boxplots are presented for individual samples (A, right; B, left) from each tumor. Boxplots show the median, limited by the 25th (Q1) and 75th (Q3) percentiles, where whiskers represent the most extreme of the maximum or Q3+1.5(Q3−Q1) and the minimum or Q1−1.5(Q3−Q1), respectively. The filled squares represent outliers.
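The boxplot convention stated in this caption (whiskers drawn at the most extreme data point within 1.5 interquartile ranges of the box, with points beyond that shown as outliers) can be made concrete with a short Python sketch. The function name and the choice of the "inclusive" quartile method are assumptions for illustration; plotting packages differ slightly in how they compute quartiles.

```python
import statistics

def boxplot_summary(values):
    """Compute the quantities drawn in a Tukey-style boxplot."""
    # Quartile method is an assumption; other conventions give slightly different Q1/Q3.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        # Whiskers stop at the most extreme observation inside the fences.
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        # Anything beyond the fences is plotted as an individual outlier.
        "outliers": sorted(v for v in values if v < lower_fence or v > upper_fence),
    }
```

Applied to a hypothetical set of per-gland depths such as `[600, 610, 620, 630, 640, 650, 660, 1000]`, the value 1000 lies above Q3+1.5(Q3−Q1) and is reported as an outlier, so the upper whisker ends at 660.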

[Figure 12 plot: boxplots of depth of coverage (y-axis, 200 to 1000) per sample (x-axis): KA, KB, MA, MB, NA, NB, OA, OB, PA, PB, SA, SB, TA, TB, UA, UB, WA, WB, XA, XB.]


Supplementary Figure 13. Summary of variability in FISH copy number estimates between tumor and adjacent normal glands based on the Shannon Index. In order to determine whether the estimates of heterogeneity from the FISH data were biased as a result of using 5 µm thick sections (the diameter of colorectal tumor cells may be as large as 10 µm), we compared the number of aberrant events in tumor glands with the counts for adjacent normal colon tissue. As shown in the boxplots, estimates of ITH based on the Shannon Index differ significantly (t-test, p-value<0.01) between the tumor and adjacent normal. For each group, boxplots show the median, limited by the 25th (Q1) and 75th (Q3) percentiles, where whiskers represent the most extreme of the maximum or Q3+1.5(Q3−Q1) and the minimum or Q1−1.5(Q3−Q1), respectively. The open circles represent outliers.
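For readers unfamiliar with the Shannon Index used here as a measure of ITH, the following Python sketch computes it from a list of per-cell copy number counts. It is only an illustration: the function name is hypothetical, and the use of the natural logarithm is an assumption, since the caption does not state the base.

```python
import math
from collections import Counter

def shannon_index(observations):
    """Shannon index H = -sum(p_i * ln p_i) over the distinct observed states.

    `observations` is a list of categorical values, e.g. the copy number
    counted in each cell of a gland. Natural log is assumed here.
    """
    counts = Counter(observations)
    total = sum(counts.values())
    return sum(-(c / total) * math.log(c / total) for c in counts.values())
```

A gland in which every cell shows the same copy number gives an index of 0, whereas a gland split evenly between two copy number states gives ln 2 (about 0.69); more states or a more even split increases the index.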

[Figure 13 plot: boxplots of the Shannon Index (y-axis, 0.0 to 2.0) for four groups: Chr17 NORMAL, Chr17 TUMOR, Her2 NORMAL and Her2 TUMOR; panel labels Chr17 cent. and Her2, annotated p=0.002 and p=1.77e-6.]


Supplementary Figure 14. Validation of the statistical inference framework using synthetic data. To verify the accuracy of our inference framework we generated a synthetic dataset for 21 tumors with different parameters (σ, µ) using our spatial tumor growth model. The results illustrate that the correct parameter values (red circles) are recovered for the majority of cases, indicating the robustness of this approach.
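The logic of such a validation (simulate data under known parameters, run the inference, and check that the known values are recovered) can be illustrated with a deliberately simple Python sketch. Everything below is an assumption made for illustration: the toy simulator stands in for the spatial tumor growth model, only a single rate parameter is inferred rather than the pair (σ, µ), and a generic rejection-sampling scheme is used, which should not be read as the authors' inference framework.

```python
import random

def toy_simulator(mu, n_divisions, rng):
    """Toy stand-in for a growth model: count divisions that acquire a mutation."""
    return sum(1 for _ in range(n_divisions) if rng.random() < mu)

def rejection_posterior(observed, prior_grid, n_divisions, n_sims, tolerance, rng):
    """Keep parameter draws whose simulated summary lies within `tolerance`
    of the observed summary (generic rejection sampling)."""
    accepted = []
    for _ in range(n_sims):
        mu = rng.choice(prior_grid)  # uniform prior over a small grid
        if abs(toy_simulator(mu, n_divisions, rng) - observed) <= tolerance:
            accepted.append(mu)
    return accepted

rng = random.Random(1)
prior_grid = [0.01, 0.05, 0.1, 0.2, 0.4]  # hypothetical candidate rates
true_mu = 0.2                             # known value used to make the synthetic data
observed = toy_simulator(true_mu, 500, rng)
accepted = rejection_posterior(observed, prior_grid, 500, 2000, 10, rng)
# The most frequently accepted value plays the role of the posterior mode.
mode = max(set(accepted), key=accepted.count)
```

If the inference behaves well, `mode` coincides with `true_mu`, which is the toy analogue of the true parameter values falling within the recovered posterior distributions in the figure.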

[Figure 14 panels: for each of the 21 synthetic tumors, posterior frequency distributions of fitness change (0 to 0.6) and mutation rate (10^-8 to 10^-4), with the true parameter values marked.]


Supplementary Tables

Supplementary Table 1. Summary of patient clinical information for colorectal adenoma and carcinoma samples. Here, MSI refers to microsatellite instability, whereas MSS indicates microsatellite stability.

Patient  Age  Gender  Stage  Size (cm)  Type       MSI status
K        50   M       1      6          Adenoma    MSS
P        71   M       0      3.5        Adenoma    MSS
S        66   M       0      6          Adenoma    MSS
X        69   F       0      2.5        Adenoma    MSS
M        76   F       2      3          Carcinoma  MSS
N        71   M       1      2.3        Carcinoma  MSS
O        41   M       3      9.5        Carcinoma  MSS
T        85   M       3      5.7        Carcinoma  MSS
U        78   F       2      3.9        Carcinoma  MSS
W        51   M       1      3.4        Carcinoma  MSI
CO       83   M       2      8          Carcinoma  MSS
CA       57   M       2      8.5        Carcinoma  MSS
G        61   M       3      5.3        Carcinoma  MSS
R        51   M       3      3          Carcinoma  MSS
Z        83   M       3      7.5        Carcinoma  MSS

Supplementary Table 2. Summary of the number of CNAs detected within each class per tumor.

Patient  Public  Side-specific  Side-variegated  Variegated  Regional  Unique
Adenomas
K        5       3              0                0           0         2
S        2       0              0                0           0         0
P        0       0              0                0           0         3
X        1       0              0                0           0         6
Carcinomas
M        1       0              23               2           4         10
N        14      0              2                0           1         2
O        25      2              2                1           2         7
T        25      3              0                0           0         9
U        20      2              0                1           0         11
W        6       0              0                0           0         0
CA       0       0              1                0           1         5
CO       12      6              2                0           1         5
G        17      2              0                0           3         3
R        18      4              3                0           14        5
Z        25      16             0                0           2         2
