
Di et al II supplemental figures V2 - media.nature.com · M00652 (NRF-1) , MA0291.1 (DAL82) , M01243 (MTF1) , M01192 (HDG9) , Zfp161_primary, M01579 (Tbs1) , M01069 (GZF1) 62 4.1e-025


[Supplementary Figure S1, panels a-d: pie charts. Slice labels with gene counts, one line per chart in extraction order:]

Chart 1: DNA repair [12]; Cellular response to DNA damage [86]; Chromatin modification [10]; Embryo development [52]; Cell cycle [56]; Cell adhesion [42]; Cell proliferation [78]; Cell death [76]; Developmental process [194]; Gene expression [163]

Chart 2: DNA repair [21]; Cellular response to DNA damage [86]; Chromatin modification [49]; Embryo development [79]; Cell cycle [101]; Cell adhesion [90]; Cell proliferation [129]; Cell death [134]; Gene expression [348]; Developmental process [361]

Chart 3: DNA repair [40]; Cellular response to DNA damage [86]; Chromatin modification [42]; Embryo development [75]; Cell cycle [100]; Cell adhesion [130]; Cell proliferation [137]; Cell death [178]; Developmental process [452]; Gene expression [338]

Chart 4: DNA repair [4]; Cellular response to DNA damage [86]; Chromatin modification [7]; Embryo development [10]; Cell cycle [21]; Cell adhesion [12]; Cell proliferation [16]; Cell death [22]; Developmental process [28]; Gene expression [44]

Supplementary Figure S1. (a-d) Distribution of CtBP bound regions in cancer-related gene ontologies. (a) Genes that have CtBP binding downstream of TSS, (b) Genes that have CtBP binding peaks at the distal upstream regions, (c) Genes that have CtBP binding peaks in their introns, and (d) Genes that have CtBP binding peaks at their exons. The GO terms are GO:0006281 DNA repair; GO:0006974 Cellular response to DNA damage; GO:0016568 Chromatin modification; GO:0009790 Embryo development; GO:0007049 Cell cycle; GO:0007155 Cell adhesion; GO:0008283 Cell proliferation; GO:0008219 Cell death; GO:0032502 Developmental process; and GO:0010467 Gene expression.


[Supplementary Figure S2 table, rows in extraction order; motif logos not reproduced. Each row: frequency; E-value; best-matched binding motifs.]

Row 1: 307; 8.3e-150; MA0346.1 (NHP6B), M01504 (Nhp6b), Srf_secondary, Tcfap2e_secondary, Mtf1_secondary, M00218 (MYB.Ph3), Tcf3_secondary, M01247 (Nanog), Foxa2_secondary, Zfp105_primary, M01566 (Sfl1), MA0377.1 (SFL1), M01012 (HNF3), MA0120.1 (id1), Elf3_secondary, M01324 (Octamer), M00092 (BR-C), M01092 (TCF), M01183 (BCL6), Pou3f3_3235.2

Row 2: 310; 5.1e-086; MA0062.2 (GABPA), M01258 (GABP), MA0076.1 (ELK4), Gabpa_primary, M00016 (E74A), M01163 (Elk-1), M01165 (Elk-1), M01078 (c-Ets-1), M01167 (SAP-1a), Ehf_primary, MA0062.1 (GABPA), M00108 (NRF-2/GABP), M00025 (Elk-1), M00771 (Ets), MA0026.1 (Eip74EF), M01208 (FLI1), M00341 (GABP), M00032 (c-Ets-1(p54)), M00224 (STAT1), M01203 (PU1)

Row 3: 248; 7.3e-055; M00196 (Sp1), M00932 (Sp1), M00931 (Sp1), M00933 (Sp1), M01273 (SP4), M00255 (GC), MA0079.2 (SP1), M01303 (SP1), M00982 (KROX), MA0443.1 (btd), Klf7_primary, Sp4_primary, Zfp410_secondary, M00008 (Sp1), Egr1_primary, M01068 (UF1H3BETA), Bcl6b_secondary, M01175 (CKROX), Smad3_secondary, M01219 (SP1:SP3)

Row 4: 75; 2.3e-048; Atf1_primary, Jundm2_primary, M00916 (CREB), M01187 (STF1), M00036 (v-Jun), MA0018.1 (CREB1), M00113 (CREB), M01186 (STF1), M00178 (CREB), MA0018.2 (CREB1), M00338 (ATF), MA0374.1 (RSC3), M00039 (CREB), M00917 (CREB), M00177 (CREB), M01586 (TGA2), M00179 (ATF2), M00041 (ATF2:c-Jun), M00691 (ATF1), M00801 (CREB)

Row 5: 124; 8.1e-045; M00652 (NRF-1), MA0291.1 (DAL82), M01243 (MTF1), M01192 (HDG9), Zfp161_primary, M01579 (Tbs1), M01069 (GZF1)

Row 6: 62; 4.1e-025; M00264 (Staf), M00262 (Staf), MA0088.1 (znf143), M01260 (STAT1), MA0137.2 (STAT1), M00492 (STAT1), M00457 (STAT5A), M00460 (STAT5A), M00224 (STAT1), M00459 (STAT5B), M01169 (Ikaros), M00493 (STAT5A)

Row 7: 51; 7.9e-013; M00793 (YY1), M01035 (YY1), M01057 (ERF2), M00069 (YY1), M00651 (NF-muE1), MA0095.1 (YY1), M00059 (YY1), M00733 (SMAD4), M01199 (RNF96), MA0394.1 (STP1)

Row 8: 103; 3.2e-002; M00171 (Adf-1), M00923 (Adf-1), M00695 (ETF), M00982 (KROX), M01057 (ERF2), MA0079.2 (SP1), M00933 (Sp1), M01273 (SP4), M01219 (SP1:SP3), MA0410.1 (UGA3), MA0123.1 (abi4), M00321 (Muscle), M00196 (Sp1), M00958 (ABI4), Egr1_primary, M00932 (Sp1), M01253 (CNOT3), M00915 (AP-2), M00807 (Egr), M01199 (RNF96)

Row 9: 82; 2.5e-010; M00027 (AbaA), Gabpa_secondary, M01172 (PU.1), M01266 (ELF1), M01219 (SP1:SP3), MA0149.1 (EWSR1-FLI1), MA0079.2 (SP1), M00971 (Ets), Sp4_primary, M00771 (Ets), M00658 (PU.1), Ascl2_secondary, M01122 (ZNF219), MA0068.1 (Pax4), M00339 (c-Ets-1), M00491 (MAZR), M00957 (PR), M01203 (PU1), M01118 (WT1), Irf3_secondary

Supplementary Figure S2. The top 9 consensus binding motifs enriched under CtBP peaks, identified from the top 1500 CtBP binding hotspots using MEME. Numbers indicate the frequency of the motifs within all the CtBP binding sequences, together with their significance. Significance is ranked according to E-values as labeled. The E-value of a motif is based on its log likelihood ratio, width, number of sites, the background letter frequencies, and the size of the training set. E-values are an estimate of the expected number of motifs with the given log likelihood ratio (or higher), and with the same width and site count, that one would find in a similarly sized set of random sequences. The consensus motifs were matched against TRANSFAC binding motifs using the TOMTOM program. The best-matched binding motifs are listed at right, indicating the potential co-occupancy of CtBP with those factors.


[Supplementary Figure S3, bottom panels: bar charts "MCF-7 CtBP" and "MDA-MB-231 CtBP". Y-axis: CtBP ChIP normalized to input. Legend: Ctrl, CtBP KD, IgG. Asterisks mark significance.]

Supplementary Figure S3. CtBP in both MCF-7 and MDA-MB-231 cells before and after CtBP gene depletion by RNAi. (Top) Western blot of CtBP depletion and over-expression (CtBP1-GFP) in MCF-7 and MDA-MB-231 cells. (Bottom) ChIP assay of CtBP at the genes indicated. Data shown are the average of 2 independent experiments and the error bars represent the standard deviation of the mean. P values were calculated by two-sided Student's t-test. A single asterisk represents p<0.05 and a double asterisk represents p<0.01. Intr 15 is a non-promoter region located in intron 15 of the BRCA1 gene locus, used as a negative control.

[Panel title: ChIP validation assay of selected 30 gene promoters (Fig. 1c). Top panels: Western blots probed with anti-CtBP and anti-Actin in MCF-7 and MDA-MB-231 cells; lanes Ctrl vs CtBP KD and Ctrl vs CtBP OE, with CtBP and CtBP-GFP bands labeled. Negative-control region on the ChIP x-axis: Intr 15.]

[Supplementary Figure S4 bar charts: panels "MCF-7 mRNA" and "MDA-MB-231 mRNA". Legend: Ctrl, CtBP KD. Asterisks mark significance.]

Supplementary Figure S4. Quantitative gene expression analysis of 30 selected CtBP target genes in MCF-7 and MDA-MB-231 cells following CtBP depletion by RNAi (CtBPKD). Here 53% and 80% of genes show significant up-regulation (P<0.05) and 16% and 0% of genes show significant down-regulation (P<0.05) in MCF-7 and MDA-MB-231 cells, respectively. Data shown are the average of 3 independent experiments and the error bars represent the standard deviation of the mean. P values were calculated by two-sided Student's t-test. A single asterisk represents p<0.05 and a double asterisk represents p<0.01.

[Y-axis label, both panels: mRNA (qRT-PCR) normalized to 18s rRNA.]

[Supplementary Figure S5 bar charts: panels "MCF-7 mRNA" and "MDA-MB-231 mRNA". Legend: Ctrl, CtBP OE. Asterisks mark significance.]

Supplementary Figure S5. Quantitative gene expression analysis of 30 selected CtBP target genes in MCF-7 and MDA-MB-231 cells after CtBP overexpression (CtBPOE). Here 16% and 0% of genes show significant up-regulation and 43% and 6% of genes show significant down-regulation in MCF-7 and MDA-MB-231 cells, respectively. The data shown are the average of 3 independent experiments and the error bars represent the standard deviation of the mean. P values were calculated by two-sided Student's t-test. A single asterisk represents p<0.05 and a double asterisk represents p<0.01.

[Y-axis label, both panels: mRNA (qRT-PCR) normalized to 18s rRNA.]

[Supplementary Figure S6 panels, all in MCF-7 cells. mRNA panels: Ctrl vs CtBP OE and Ctrl vs CtBP KD; CD24 and CD44 are among the labeled genes. ChIP panel "MCF-7 CtBP": CtBP ChIP normalized to input at upstream (UP), promoter (PRO) and downstream (DW) regions of BPNT1, PLS1, ELF3, PIK3R3, GADD45B, SIRT1, IGHMBP2, AGR2, MRPL13, DDIT3, WWP1, RASEF, RAD9A, VAV3, PRLR, ANAPC5, MAT2B and NDC80; legend: CtBP PRO, NSIgG PRO, CtBP UP/DW, NSIgG UP/DW. Asterisks mark significance.]

Supplementary Figure S6. Quantitative ChIP and gene expression validation analysis of an additional 41 CtBP target genes in MCF-7 cells overexpressing CtBP (CtBP OE) or depleted of CtBP (CtBP KD). (Top and middle panels) 14.6% of genes show significant up-regulation and 48.7% of genes show significant down-regulation with CtBP overexpression. 60.9% of genes show significant up-regulation and 12.1% of genes show significant down-regulation with CtBP depletion. (Bottom panel) Validation by CtBP ChIP assay of 18 CtBP target genes identified in CtBP ChIP-seq. Regions both upstream of the promoter (UP) and downstream of the promoter (DW) were selected as negative controls for each individual gene. The gene expression data shown at top and middle are the average of 3 independent experiments and the ChIP data shown are the average of 2 replicates. The error bars represent the standard deviation of the mean. P values were calculated by two-sided Student's t-test. A single asterisk represents p<0.05 and a double asterisk represents p<0.01.

[Y-axis label, mRNA panels: normalized to 18s rRNA.]

[Supplementary Figure S7, panels a-c: pie charts. Panel titles: "279 CtBP targets with significant expression change upon CtBP knockdown"; "179 CtBP targets up-regulated significantly upon CtBP knockdown"; "100 CtBP targets down-regulated significantly upon CtBP knockdown". Slice labels with gene counts, one line per chart in extraction order:]

Chart 1: DNA repair [9]; Cellular response to DNA damage [12]; Chromatin modification [8]; Embryo development [14]; Cell cycle [24]; Cell adhesion [16]; Cell proliferation [18]; Cell death [24]; Developmental process [57]; Gene expression [64]

Chart 2: DNA repair [4]; Cellular response to DNA damage [12]; Chromatin modification [6]; Embryo development [6]; Cell cycle [23]; Cell adhesion [6]; Cell proliferation [11]; Cell death [24]; Developmental process [34]; Gene expression [57]

Chart 3: DNA repair [8]; Cellular response to DNA damage [17]; Chromatin modification [10]; Embryo development [14]; Cell cycle [30]; Cell adhesion [12]; Cell proliferation [18]; Cell death [31]; Developmental process [81]; Gene expression [79]

Supplementary Figure S7. Distribution of cancer specific gene ontology classes in CtBP bound genes identified as differentially expressed after CtBP gene depletion via micro-array analysis. Shown are the gene ontology distribution of (a) 279 CtBP target genes that show significant (p<0.05) change in expression following CtBP gene depletion. (b) CtBP target genes that are significantly (p<0.05) up-regulated by CtBP depletion. (c) CtBP target genes that are significantly (p<0.05) down-regulated by CtBP depletion.


[Supplementary Figure S8 chart: Comet Assay. Y-axis: comet assay tail moment (0 to 500). Groups: Ctrl RNAi and BRCA1 RNAi under Ctrl, H2O2 and H2O2 recovery conditions. Marked comparison (*): P-value = 1.6 x 10^-5.]

Supplementary Figure S8. Analysis of change in tail moment in control and BRCA1 depleted cells following DNA damage and recovery.


[Supplementary Figure S9 bar charts: panels "MDA-MB-231 H3Ac" and "MDA-MB-231 H4Ac". Legend: Ctrl, CtBP KD. Asterisks mark significance.]

Supplementary Figure S9. ChIP assay of acetylated H3 and acetylated H4 at the selected 30 gene promoters in both MCF-7 cells and MDA-MB-231 cells. The data shown is the average of 2 independent experiments and the error bars represent the standard deviations. P values are calculated by 2-sided student’s t-test. Single asterisk represents p<0.05 and double asterisk represents p<0.01. Intr 15 is a non-promoter region located in intron 15 of the BRCA1 gene locus used as a negative control.

[Supplementary Figure S9 bar charts: panels "MCF-7 H3Ac" and "MCF-7 H4Ac". Y-axes, all four panels: H3Ac ChIP normalized to input, or H4Ac ChIP normalized to input. Legend: Ctrl, CtBP KD, IgG. Negative-control region on the x-axis: Intr 15. Asterisks mark significance.]

[Supplementary Figure S10 panels. (a) Relative ChIP H3 acetylation (normalized to input and total H3) at positions a-l along the BRCA1 locus (position scale in kb); legend: Lo Glucose, Hi Glucose; significance marks labeled p<.0004 and p<.0021. (b) mRNA fold change (normalized to 18s rRNA) in low glucose and high glucose for three gene groups: Genome Stability (ATR, ERCC5, FANCD2, RAD51C, XRCC5, FAN1, PALB2, MUS81, BRIP1), EMT (ANXA3, CLDN4, CST6, KRT18, NME1, ZNF165, CLDN9, GRHL2, PARD6B) and STEM/TIC (GATA3, FOXA1, HES1, CEBPB, AMOTL2, RARG, DKK1, THAP10, OVOL2). (c) Cell Cycle Analysis: MCF7 cells in High and Low Glucose Media (n=2); x-axis: G1, S, G2/M; y-axis: Relative %.]

Supplementary Figure S10. Carbohydrate restriction influences histone acetylation and expression of DNA repair genes and other drivers of breast cancer independent of cell cycle changes. (a) ChIP profile of relative histone 3 lysine acetylation after normalization to ChIP for histone 3. Asterisks represent the p-values as indicated. Error bars represent the standard errors from 5 independent biological replicates. (b) qRT-PCR analysis of gene expression in MCF-7 cells cultured in high glucose and low glucose medium. The genes are functionally subcategorized as genome stability, STEM/TIC and EMT related (see Figure 1c). * indicates p-value <0.05, ** indicates p-value <0.01. Error bars represent the standard errors from 3 biological replicates. (c) Cell cycle analysis of MCF-7 cells cultured in both high glucose medium and low glucose medium.


[Supplementary Figure S11a: per-gene expression plots across breast cancer subtypes (x-axis groups: Ba, H2, LA, LB, NL, CL). Citation shown in figure: (Creighton et al, 2010). Panel titles in extraction order: ANXA3_EMT, CDS1_EMT, CLDN4_EMT, CST6_EMT, KRT18_EMT, NME1_EMT, ZNF165_EMT, CLDN9_EMT, GRHL2_EMT, PARD6B_EMT, ATR_GS, ERCC5_GS, RAD51C_GS, XRCC5_GS, FAN1_GS, PALB2_GS, BRIP1_GS, GATA3_STEM/TIC, FOXA1_STEM/TIC, HES1_STEM/TIC, CEBPB_STEM/TIC, RARG_STEM/TIC, DKK1_STEM/TIC, AMOTL2_STEM/TIC, THAP10_STEM/TIC, OVOL2_STEM/TIC, FANCD2_GS, MUS81_GS. ANOVA p-values in extraction order (pairing with panels not preserved by extraction): 2.12e−16, 2.02e−41, 6.18e−35, 1.44e−07, 1.4e−06, 1.08e−11, 1.58e−08, 2.54e−28, 5.31e−13, 2.65e−08, 0.000768, 4.36e−07, 0.0129, 6.49e−08, 2.58e−16, 3.1e−81, 1.17e−140, 4.72e−15, 1.15e−08, 0.0494, 2.63e−13, 9.23e−53, 7.91e−10, 1.51e−15, 1.48e−08, 0.0761, 3.89e−49, 0.000178.]

Supplementary Figure S11. CtBP targeted gene lists differentiate aggressive molecular subtypes of breast cancer. (a) ANOVA profiling of individual genes from the EMT, genome stability and STEM/TIC gene list described in Figure 1c. P-values shown were calculated by comparing expression means across all subtypes.


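[Supplementary Figure S12, panels a-f: GSEA enrichment plots. Panel titles in extraction order: EGFR-Signaling; Basal vs Luminal Subtype; Hypoxia; Cisplatin; Ductal Ca vs Normal Lobular; DNA Strand Elongation. NES values in extraction order: 2.57, 2.30, 2.30, 2.35, 2.42, 2.51. FDR values in extraction order: <.0001, <.0009, <.005, <.004, <.006, <.0001.]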

Supplementary Figure S12. Genes changed by CtBP depletion influence multiple pathways in breast cancer tumor progression. (a-f) Gene Set Enrichment Analysis (GSEA) of the genes significantly (p<0.05) changed upon CtBP knockdown (Fig 2b). The top 6 hits are listed. The matched datasets are listed based on the NES, normalized enrichment score. FDR represents the false discovery rate. (g) Overlap between CtBP target genes from ChIP-seq analysis and genes upregulated in luminal vs basal-like (mesenchymal) breast cancers. The P-value was derived from the hypergeometric distribution based on 23660 RefSeq sequences.

[Panel g, Venn diagram: CtBP bound genes, 1823; genes upregulated in luminal vs mesenchymal breast cancer (Charafe-Jauffret et al 2006), 447; overlap, 83 (1740 and 364 genes unique to each set, respectively); P = 9.0E-37.]

[Supplementary Figure S13, panel "MCF-7 mRNA": bar chart. Legend: Ctrl, MTOB. Asterisks mark significance.]

Supplementary Figure S13. Selected 30 CtBP target gene expression in MCF-7 and MDA-MB-231 following MTOB treatment. There are 40% and 46% of genes that show significant up-regulation and 3% and 10% genes that show significant down-regulation in MCF-7 and MDA-MB-231 respectively. The data shown is the average of 3 independent experiments and the error bars represent the standard deviations. P values are calculated by student’s t-test. Single star represents p<0.05 and double star represents p<0.01.

[Supplementary Figure S13, panel "MDA-MB-231 mRNA": bar chart. Legend: Ctrl, MTOB. Y-axis, both panels: mRNA (qRT-PCR) normalized to 18s rRNA.]

[Supplementary Figure S14, panel "MCF-7 CtBP": bar chart. Legend: Ctrl, MTOB, IgG. Asterisks mark significance.]

Supplementary Figure S14. ChIP assay of CtBP at selected 30 gene promoters in both MCF-7 cells and MDA-MB-231 cells following MTOB treatment. The data shown is the average of 2 independent experiments and the error bars represent the standard deviations. P values are calculated by student’s t-test. Single star represents p<0.05 and double star represents p<0.01. Intr 15 is the negative non-promoter region located in intron 15 of BRCA1 gene locus.

[Supplementary Figure S14, panel "MDA-MB-231 CtBP": bar chart. Legend: Ctrl, MTOB, IgG. Y-axis, both panels: CtBP ChIP normalized to input. Negative-control region on the x-axis: Intr 15.]

[Supplementary Figure S15 panels, MCF-7 cells. Top: mRNA of WWP1, CDH17, BRCA1, SIRT1, SMAD3, RSF1, TOB2 and TFF1; legend: Ctrl, MTOB 1mM, MTOB 10mM. Bottom: CtBP ChIP at upstream (UP), promoter (PRO) and downstream (DW) regions of WWP1, SIRT1, RSF1, TOB2, BRCA1, SMAD3, CDH17 and TFF1; legend: Ctrl CtBP PRO, MTOB CtBP PRO, CtBP UP/DW, IgG UP/DW. Asterisks mark significance.]

Supplementary Figure S15. Top panel, expression of 8 additional CtBP regulated genes in response to MTOB. Bottom, ChIP assay of CtBP binding on 8 CtBP target gene promoters in response to MTOB. Both upstream of promoter (UP) and downstream of promoter (DW) were selected as negative control region for each individual gene. The gene expression data at top is the average of 3 independent experiments and the ChIP data shown at the bottom is the average of 2 replicates. The error bars represent the standard deviations. P values are calculated by a 2-sided student’s t-test. Single asterisk represents p<0.05 and double asterisk represents p<0.01.

[Y-axis labels: mRNA, qRT-PCR normalized to 18s rRNA (top); CtBP ChIP normalized to input (bottom).]

GO term     Gene Ontology category                     Total genes  CtBP bound  LOG10(p)
GO:0044260  cellular macromolecule metabolic process   5948         752         -31.8147
GO:0006396  RNA processing                             618          148         -28.2322
GO:0043170  macromolecule metabolic process            6533         791         -27.3157
GO:0010467  gene expression                            3913         533         -26.7427
GO:0090304  nucleic acid metabolic process             3938         530         -25.172
GO:0044237  cellular metabolic process                 7835         900         -24.9739
GO:0008152  metabolic process                          8810         976         -22.6913
GO:0044238  primary metabolic process                  7953         896         -21.2635
GO:0006807  nitrogen compound metabolic process        5043         623         -20.6194

Supplementary Table S1. Distribution of CtBP bound regions in cancer-related gene ontologies


Supplementary Methods

ChIP and ChIP-seq

All ChIP experiments were carried out as described (ref. 57). In brief, cells were cross-linked with 1% (w/v) formaldehyde for 5 min at room temperature, and the cross-linking was quenched with 0.125 M glycine for 15 min. The cells were then washed twice with PBS and collected. About 1 × 10^7 cross-linked cells were resuspended in 1 ml immunoprecipitation (IP) buffer (150 mM NaCl, 50 mM Tris-HCl, pH 7.5, 5 mM EDTA, 0.5% (v/v) NP-40, 1.0% (v/v) Triton X-100, and freshly added proteinase inhibitor cocktail) and sonicated for 13 × 20 s with 30 s breaks. The sonicated cells were centrifuged and the supernatant was used for immunoprecipitation. In most cases, the lysate from at least 2 million cells (up to 10 million) was incubated with each antibody overnight with rotation at 4 °C. Preblocked protein G beads were then added to the lysate and rotated for 2 h at 4 °C. The beads were washed with IP buffer supplemented with 500 mM NaCl, then with IP buffer, and then with TE buffer, pH 8.0. Finally, the precipitated DNA–protein complex was incubated overnight with SDS and proteinase K at 65 °C. The DNA was purified by phenol-chloroform extraction and ethanol precipitation in the presence of glycogen. The purified DNA was used either for real-time PCR (ChIP) or for library construction (ChIP-seq). The CtBP ChIP-seq data were deposited in NCBI GEO under accession GSE36546.

ChIP-seq data analysis

The 36-mer short-read tags were mapped to the human genome (UCSC HG19), and only uniquely mapping read positions were considered. CtBP binding hotspots were identified as follows. Enrichment of tags in a 250 bp target window relative to a 200 kb surrounding window (local background) is gauged by a model based on the binomial distribution. Each tag is extended to 150 bp along its strand and assigned a z-score (explained below) using the target window and the background window, both centered on the tag. An unthresholded hotspot is defined as a contiguous union of 250 bp windows whose z-scores are greater than 2. Once a hotspot is called, its z-score is given by the maximum z-score of its constituent tags.

z-score calculation: If n observed tags overlap the target window and N tags overlap the local background window, then the probability p that a tag in the background window overlaps the target window is given by the ratio [# of uniquely mappable tags for the 250 bp window] / [# of uniquely mappable tags for the 200 kb window]. Because not all 36-mers in a window can be aligned uniquely to the reference genome, p differs greatly from one genomic region to another. After this important step of adjusting for the differential mappability of short reads, the expected number of tags overlapping the target window is $\mu = Np$. The standard deviation of the expectation is $\sigma = \sqrt{Np(1-p)}$. The z-score for the observed number of tags in the target window is defined by $z = (n - \mu)/\sigma$.

Peak detection and input adjustment for ChIP-seq. The sequencing data from matching input samples are used in processing the ChIP data, as a measure of background signal, which might be significant. After normalizing the input data to match the number of tags in the ChIP data, the number of input tags is subtracted from the number of ChIP tags in the target window before calculating its z-score. In addition, ChIP hotspots were refined into 150 bp peaks using a peak-finding procedure. For each hotspot, a new z-score, 'peak z', was computed as for hotspot detection, but using a 150 bp window within the hotspot as the target window and the hotspot itself as the local background window. The peak z score was computed by scanning through the hotspot at base-pair resolution for all possible 150 bp windows. A putative peak is defined as a 150 bp window whose peak z score is above 30. If nearby putative peaks overlap within a hotspot, the 150 bp window with the highest peak z score is selected as the peak among the cluster of overlapping putative peaks. Therefore, a hotspot may have no peak or may have multiple peaks.

Repeat sequence filtering. In ChIP sequencing reads, artifacts are inevitably found and are typically concentrated in relatively small regions. These artifacts were removed by filtering out sequence reads that overlap satellites, LINEs (long interspersed nuclear elements), and STRs (short tandem repeats) after extension to 150 bp. Repeat-masking data were downloaded from the UCSC genome browser site (http://hgdownload.cse.ucsc.edu/goldenPath/mm9/database/simpleRepeat.txt.gz).

False discovery rate calculations. A final z-score threshold was imposed on hotspots for each dataset based on the FDR (false discovery rate). To calculate the FDR, uniquely mappable tags were randomly sampled to generate a random dataset with the same number of tags as the observed dataset. Hotspots are called for the random dataset and their z-scores are calculated. The FDR for the observed hotspots at a given z score threshold z0 is then estimated as [# of hotspots with z > z0 in the random data] / [# of hotspots with z > z0 in the observed data]. 'FDR 0% hotspots' are selected by setting a z score threshold that is just above the highest z score encountered in the random set.
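The window z-score and the FDR estimate described in this section can be sketched in a few lines of Python. This is a minimal illustration of the stated binomial model, not the authors' pipeline: the function names `window_z_score` and `estimate_fdr`, the handling of an empty observed set, and all example numbers are hypothetical, and input subtraction and mappability lookup are left out.

```python
import math


def window_z_score(n, N, p):
    """z-score for n tags in the target window, given N tags in the
    background window and per-tag overlap probability p (binomial model)."""
    mu = N * p                           # expected tags in the target window
    sigma = math.sqrt(N * p * (1 - p))   # binomial standard deviation
    return (n - mu) / sigma


def estimate_fdr(observed_z, random_z, z0):
    """FDR at threshold z0: hotspots with z > z0 in the random data divided
    by hotspots with z > z0 in the observed data."""
    n_obs = sum(1 for z in observed_z if z > z0)
    n_rand = sum(1 for z in random_z if z > z0)
    if n_obs == 0:
        return 0.0  # assumption: nothing called, so nothing can be false
    return n_rand / n_obs


# Hypothetical example: a 250 bp target in a 200 kb background with uniform
# mappability, so p is simply the ratio of the window sizes.
p = 250 / 200_000
z = window_z_score(n=12, N=2000, p=p)
fdr = estimate_fdr(observed_z=[5, 10, 40], random_z=[1, 2, 6], z0=4)
```

In a real analysis p would come from counts of uniquely mappable positions in each window rather than from the window sizes alone.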

Volcano plot of CtBP ChIP-seq binding profile

Read density for positions relative to transcription start sites (TSS). Shown is the total number rd* of read tags summed over all RefSeq-annotated TSS, normalized to the length L of the genome and the total number N of aligned reads (rd* = rd × L / N), such that rd* = 1 approximately corresponds to an un-enriched distribution of reads.
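As a small worked check of this normalization, the sketch below applies rd* = rd × L / N and confirms that a uniform read distribution gives a value of about 1. The function name `normalized_read_density` and the genome length and read count are hypothetical values chosen only for illustration.

```python
def normalized_read_density(rd, genome_length, total_reads):
    """Return rd* = rd * L / N for a raw per-position read density rd."""
    return rd * genome_length / total_reads


# Hypothetical numbers: with N reads spread uniformly over L positions,
# the raw density per position is N / L, so rd* should be close to 1.
L_genome = 3_000_000_000
N_reads = 20_000_000
uniform_rd = N_reads / L_genome
rd_star = normalized_read_density(uniform_rd, L_genome, N_reads)
```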

Statistical calculation of Venn diagrams

To determine whether the overlap between the CtBP target gene list and other referenced gene lists is statistically significant, a p value was calculated from the hypergeometric distribution, based on 37630 TSS in RefSeq (hg19, UCSC).
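A hypergeometric overlap test of this kind can be sketched in Python as follows. The source states only the population size (37630 TSS). The upper-tail convention P(X ≥ overlap), the function name, and the list sizes in the example are assumptions for illustration.

```python
from math import comb


def hypergeom_overlap_pvalue(population, list_a, list_b, overlap):
    """Upper-tail probability P(X >= overlap) of the hypergeometric distribution.

    population: total number of genes/TSS considered
    list_a:     size of the first gene list (e.g. CtBP targets)
    list_b:     size of the second (reference) gene list
    overlap:    number of genes shared by the two lists
    """
    upper = min(list_a, list_b)
    if not 0 <= overlap <= upper:
        raise ValueError("overlap must lie between 0 and the smaller list size")
    total = comb(population, list_b)
    tail = sum(
        comb(list_a, k) * comb(population - list_a, list_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Small case that can be checked by hand: (C(4,2)*C(6,1) + C(4,3)*C(6,0)) / C(10,3) = 40/120
p_small = hypergeom_overlap_pvalue(10, 4, 3, 2)

# Hypothetical list sizes against the 37630 TSS background stated in the text.
p_example = hypergeom_overlap_pvalue(37630, 1800, 500, 60)
```

Exact integer arithmetic with `math.comb` avoids the rounding problems that arise when very large binomial coefficients are computed in floating point.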

Comet assay

Comet assays were performed according to Olive et al. (58). Briefly, a single-cell suspension was prepared by enzymatic disaggregation. Cells were kept in ice-cold medium or phosphate-buffered saline to minimize cell aggregation and inhibit DNA repair. An appropriate number of cells was added to low-gelling-temperature agarose, and a thin layer of cells was allowed to adhere to the Gel-Bond™ film. The cells were exposed to neutral lysis buffer (2% sarkosyl, 0.5 M Na2EDTA, 0.5 mg/ml proteinase K, pH 8.0; equilibrated at 4 °C) overnight at 37 °C. The film was washed twice with electrophoresis buffer and submerged in electrophoresis buffer in a chamber for electrophoresis. The cells were stained with SYBR Green and images were obtained by fluorescence microscopy. The tail moment was used to quantify the extent of DNA damage. The tail moment is calculated by the following formula: Tail moment = tail length × percentage of tail DNA. Percentage of tail DNA = aT×iT/(aT×iT + aH×iH), where aT is the tail area, iT the average intensity of the tail, aH the head area and iH the average intensity of the head.


Comet Score™ was used to analyze the comet images.
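The tail moment formula given above can be written out as a short Python sketch. The function names and measurement values are hypothetical, and the sketch assumes the ratio aT×iT/(aT×iT + aH×iH) is multiplied by 100 to express it as a percentage. It does not reproduce the internals of the Comet Score software.

```python
def tail_dna_fraction(tail_area, tail_intensity, head_area, head_intensity):
    """Fraction of DNA in the tail: aT*iT / (aT*iT + aH*iH)."""
    tail_signal = tail_area * tail_intensity
    head_signal = head_area * head_intensity
    total_signal = tail_signal + head_signal
    if total_signal <= 0:
        raise ValueError("total comet signal must be positive")
    return tail_signal / total_signal


def tail_moment(tail_length, tail_area, tail_intensity, head_area, head_intensity):
    """Tail moment = tail length x percentage of tail DNA (percentage = fraction x 100, assumed)."""
    percent_tail_dna = 100.0 * tail_dna_fraction(
        tail_area, tail_intensity, head_area, head_intensity
    )
    return tail_length * percent_tail_dna


# Hypothetical measurements for one comet (arbitrary pixel and intensity units).
fraction = tail_dna_fraction(tail_area=200, tail_intensity=50, head_area=300, head_intensity=100)
moment = tail_moment(tail_length=40, tail_area=200, tail_intensity=50, head_area=300, head_intensity=100)
```

In this example the tail carries a quarter of the total signal, so the percentage of tail DNA is 25 and the tail moment is 40 × 25 = 1000 in the chosen units.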

Immunofluorescence staining of cells

Cells were grown on coverslips, washed twice with PBS, and fixed in 3.5% paraformaldehyde. After three washes with PBS, cells were permeabilized in PBS supplemented with 10% goat serum and 0.3% Triton X-100 for 15 min and washed twice with PBS. Cells were then treated with 3% H2O2 for 10 min and washed twice. Cells were blocked with blocking buffer (10% goat serum in 1× PBS) for 1 hr and incubated with primary antibody (1:50) in blocking buffer for 1.5 hr, followed by three washes with PBS. For γH2AX staining, the cells were incubated with Alexa Fluor® 488 Goat Anti-Mouse IgG for 1 hr. After three washes, coverslips were mounted using VECTASHIELD with DAPI. For CtBP1 and CtBP2 staining, the cells were incubated with EnVision+ System-HRP Labelled Polymer Anti-mouse (Dako, Carpinteria, CA, K4000) for 1 hr. After three washes, cells were incubated with Liquid DAB+ (Dako) for 3 min and washed twice again. The cells were then counterstained in hematoxylin (Vector) for 30 sec, twice. Cells were washed sequentially with water (twice, 5 min), 95% ethanol (2 min) and 100% ethanol (2 min). Finally, cells were dipped in xylene and mounted with Permount onto a slide for microscopy imaging.

Immunohistochemistry staining of tissues:

Formalin-fixed, paraffin-embedded tissues were deparaffinized by submerging the slides in xylene twice for 15 minutes each. Tissues were then rehydrated by immersing the slides sequentially in 100%, 95% and 70% ethanol, twice for 5 minutes each. The tissues were next rinsed by immersion in distilled H2O twice for 3 minutes each, followed by a further 5 minute rinse in distilled H2O. A Biocare Medical Decloaking Chamber was then filled with 500 mL of water. Tissue slides were placed in a plastic chamber and submerged in 100 mL of 1 mM EDTA, pH 8.0. The plastic chamber was placed into the decloaking chamber and heated to 98 °C for 4 minutes. Tissue slides were allowed to cool to 70 °C before removal from the decloaking chamber. Tissue slides were rinsed by submerging in 1X phosphate-buffered saline (PBS) twice for 5 minutes each. Next, tissue slides were immersed for 10 minutes in 30% hydrogen peroxide (Calbiochem) diluted with distilled H2O to a final concentration of 3%. Tissue slides were next rinsed in 1X PBS twice for 5 minutes each. Tissue slides were then placed in a humidity chamber and blocked with 500 µL of 10% normal goat serum (NGS) (Vector Laboratories) in 1X PBS for 30 minutes at room temperature. After the 30 minute incubation, the blocking solution was removed and approximately 200 µL of the appropriate primary antibody, either αCtBP1 or αCtBP2 (Calbiochem), at a 1:50 dilution in 1% NGS was added to the tissue and incubated in a humidity chamber for 1 hour at room temperature. Following the 1 hour incubation with primary antibody, slides were washed by submerging in 1X PBS twice for 5 minutes each. Next, 4-5 drops of secondary antibody conjugated with horseradish peroxidase (HRP) (Dako EnVision+ System-HRP Labelled Polymer Anti-Rabbit or Anti-mouse) were added to the tissues and incubated for 30 minutes at room temperature. Tissue slides were next washed by submerging in 1X PBS twice for 5 minutes each.
Four to five drops of DAB in buffer (Dako Liquid DAB+ Substrate Chromogen System) were next added to the tissues and incubated for 15 minutes at room temperature. After the incubation, the DAB was removed and tissue slides were immersed in tap water twice for 5 minutes each. The tissue slides were then immersed in hematoxylin (Vector Laboratories) for 20 seconds. Hematoxylin was then rinsed from the tissue slides by immersing the slides in tap water twice for 5 minutes each. Tissue slides were then washed by immersion in 95% ethanol for 2 minutes followed by a wash in 100% ethanol for 2 minutes. Tissue slides were then immersed in xylene for 1 minute and subsequently mounted with a coverslip using Permount and dried overnight.

Scanning and Analysis of Tissue Slides Method:

Immunohistochemically stained tissue slides were converted to digital slide images by scanning the slides on an Aperio ScanScope XT slide scanner. High-resolution digital slide images were then archived in Aperio’s digital pathology information management system, “Spectrum”. Digital slide images were analyzed using Aperio’s IHC Nuclear Image Analysis algorithm to assess the nuclear staining for CtBP1 in MCF7 cells and to quantify the staining intensity in individual cells. Values and colors are assigned to individual cells based on the intensity of nuclear staining: 0 and blue for no nuclear staining, 1+ and yellow for weak nuclear staining, 2+ and orange for moderate nuclear staining, and 3+ and red for intense nuclear staining.

Tumor gene expression analysis from breast cancer datasets:

Analysis of tumor gene expression from breast cancer datasets was performed using BRB ArrayTools Version: 4.1.0 - Beta_3 Release. Affymetrix data sets were downloaded from the NCBI Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo/). Expression data were loaded into BRB ArrayTools using the Data Import Wizard. U133A probe sets for the individual gene signatures were identified by using the Affymetrix NetAffx Analysis Center Batch Query tool (http://www.affymetrix.com/analysis/index.affx). Expression data were filtered to exclude any probe set that was not a component of the signatures tested, and to eliminate any probe set whose expression variation across the dataset was P>0.05. Kaplan-Meier analysis was performed using the Survival Risk Prediction tool, specifying two risk groups, with fitting to a Cox proportional hazard model with P value ≤ 0.05.

Statistical analysis

All the error bars represent the standard deviations of the mean from at least 3 independent biological replicates unless otherwise indicated. Comparisons between two groups were done using unpaired Student’s t test and P<0.05 was considered statistically significant.
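For illustration, the statistic behind an unpaired Student's t test can be computed with the Python standard library as sketched below. The function name and replicate values are hypothetical, the sketch assumes equal variances (pooled variance), and it returns only the t statistic and degrees of freedom. Converting these to a P value requires the t distribution, for example from a statistics package.

```python
from math import sqrt
from statistics import mean, stdev, variance


def unpaired_students_t(group_a, group_b):
    """Return (t statistic, degrees of freedom) for an equal-variance unpaired t test."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    df = n_a + n_b - 2
    # Pooled sample variance across the two groups.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    if pooled_var == 0:
        raise ValueError("pooled variance is zero; the t statistic is undefined")
    t_stat = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t_stat, df


# Hypothetical measurements from three biological replicates per group.
control = [1.0, 2.0, 3.0]
treated = [4.0, 5.0, 6.0]

t_stat, df = unpaired_students_t(control, treated)
sd_control = stdev(control)  # sample standard deviation, as would be shown by an error bar
```

For df = 4, the two-sided critical value at P = 0.05 is 2.776, so a t statistic of larger magnitude would be considered statistically significant under the criterion stated above.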

References

57. Nelson, J.D., Denisenko, O. & Bomsztyk, K. Protocol for the fast chromatin immunoprecipitation (ChIP) method. Nat. Protoc. 1, 179-185 (2006).

58. Olive, P.L. & Banath, J.P. The comet assay: a method to measure DNA damage in individual cells. Nat. Protoc. 1, 23-29 (2006).
