Methods
• The supplementary material of Su et al. (2005) contains the sequence downstream of each NtcA box and also the -10 box itself (where one was found). From this it was possible to determine how far into the downstream sequence the -10 box starts. Some of the listed genes had NtcA sites without -10 boxes. Note that, because of how the Su et al. data sets were constructed, -10 boxes upstream of NtcA sites are not considered.
• GSEA tests were performed on lists of genes, some of which are described below, considering only the 6 hour and 48 hour time points in MED4 and MIT9313. The Direction parameter of gsea.m was used to look for enrichment among repressed genes.
– Negative control: For the 6 hour time point, a GSEA test was run with the top 15 ranked NtcA sites for genes with orthologs and for genes without orthologs, ranked separately (see Methods in manuscript). These parameters are equivalent to those used for the 6 hour enrichment test among induced genes, except that the direction of the test was reversed so as to look for enrichment among repressed genes. This test is a negative control: because these genes were enriched among activated genes, they should not also be enriched among repressed genes. The remaining tests considered sets of genes with NtcA sites that lack -10 boxes, or with NtcA sites so close to a -10 box that NtcA binding might occlude it.
– A list was prepared of genes that were identified in Su et al. (2005) as having NtcA sites without downstream -10 boxes. These could be sites that block transcription initiated at upstream binding sites for RNA polymerase and its associated factors. The situation is diagrammed below. GSEA tests looking for enrichment among repressed genes were conducted using this list, considering the top 15 sites for genes listed as having orthologs and for genes listed as not having orthologs, ranked separately.
Methods (continued)
– A list was prepared of genes that were identified in Su et al. (2005) as having NtcA sites within 15 bp of a downstream -10 box. These could be sites that are so close to the -10 box that NtcA binding occludes it, thereby blocking transcription. The situation is diagrammed below. There are only a very small number of these sites, and all were used in the GSEA regardless of NtcA site rank: MED4 has 16 (9 for genes with orthologs and 7 for genes without) and MIT9313 has 8 (6 for genes with orthologs and 2 for genes without).
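[Diagram: NtcA site with no downstream -10 box. Labels: RNAP, NtcA, gene; XX marks the blocked transcription.]
[Diagram: NtcA site close to a downstream -10 box. Labels: NtcA, -10 box, RNAP, gene; XX marks the blocked transcription.]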
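The two gene lists described in the Methods can be sketched roughly as follows. This is a minimal illustration, not the procedure actually used: the `NtcASite` record, its field names, and the example genes are hypothetical. It assumes that the -10 box offset is the position at which the box starts within the downstream sequence, that "within 15 bp" means an offset below 15, and that the top 15 sites are taken among sites lacking a -10 box within each ortholog class.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NtcASite:
    gene: str
    rank: int            # hypothetical field: 1 = best-scoring NtcA site
    has_ortholog: bool
    downstream: str      # sequence downstream of the NtcA box
    box: Optional[str]   # the -10 box sequence, or None where none was found


def box_offset(site: NtcASite) -> Optional[int]:
    """Distance (bp) into the downstream sequence at which the -10 box starts."""
    if site.box is None:
        return None
    pos = site.downstream.find(site.box)
    return pos if pos >= 0 else None


def no_box_genes(sites: List[NtcASite], top_n: int = 15) -> List[str]:
    """Genes whose NtcA site has no downstream -10 box.

    Assumption: the top_n best-ranked sites are taken separately for
    genes with and without orthologs.
    """
    chosen: List[str] = []
    for flag in (True, False):
        group = [s for s in sites
                 if s.has_ortholog == flag and box_offset(s) is None]
        group.sort(key=lambda s: s.rank)
        chosen.extend(s.gene for s in group[:top_n])
    return chosen


def occluded_genes(sites: List[NtcASite], max_dist: int = 15) -> List[str]:
    """Genes whose -10 box starts fewer than max_dist bp downstream,
    regardless of NtcA site rank."""
    chosen: List[str] = []
    for s in sites:
        d = box_offset(s)
        if d is not None and d < max_dist:
            chosen.append(s.gene)
    return chosen


# Made-up example records, for illustration only.
sites = [
    NtcASite("geneA", 1, True, "ACGTTATAATGG", "TATAAT"),
    NtcASite("geneB", 2, True, "A" * 20 + "TATAAT", "TATAAT"),
    NtcASite("geneC", 3, False, "ACGTACGTACGT", None),
]
print(occluded_genes(sites))  # genes with a -10 box starting < 15 bp downstream
print(no_box_genes(sites))    # genes with no downstream -10 box
```

Ranking within each ortholog class separately keeps either class from crowding the other out of the list.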
Results
• All tests were negative for enrichment (P >> 0.05). The smallest P value was 0.6544 (MIT9313, 48 hour time point, NtcA sites within 15 bp of a -10 box).
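For readers unfamiliar with this kind of test, the sketch below shows one common form of a directional enrichment test: an unweighted, Kolmogorov-Smirnov-like running sum with a permutation P value. It is not the gsea.m implementation; the function names, the `direction` argument, and the toy expression values are assumptions made for illustration, and the score is assumed to be the maximum positive deviation of the running sum.

```python
import random


def enrichment_score(ranked_genes, gene_set):
    """Maximum positive deviation of an unweighted running sum."""
    members = set(gene_set) & set(ranked_genes)
    n, n_hit = len(ranked_genes), len(members)
    if n_hit == 0 or n_hit == n:
        return 0.0
    up, down = 1.0 / n_hit, 1.0 / (n - n_hit)
    running = best = 0.0
    for g in ranked_genes:
        running += up if g in members else -down
        best = max(best, running)
    return best


def gsea_test(expr, gene_set, direction="repressed", n_perm=1000, seed=0):
    """Return (ES, permutation P) for enrichment at the top of the ranking.

    expr maps gene -> expression change (e.g. a log ratio).
    direction="repressed" ranks the most negative changes first;
    direction="induced" ranks the most positive changes first.
    """
    ranked = sorted(expr, key=expr.get, reverse=(direction == "induced"))
    es = enrichment_score(ranked, gene_set)
    k = len(set(gene_set) & set(ranked))
    rng = random.Random(seed)
    # Null distribution: scores of random gene sets of the same size.
    null = [enrichment_score(ranked, rng.sample(ranked, k))
            for _ in range(n_perm)]
    p = sum(x >= es for x in null) / n_perm
    return es, p


# Toy data: g5 and g6 are the most induced genes.
expr = {"g1": -3.0, "g2": -2.0, "g3": -1.0, "g4": 1.0, "g5": 2.0, "g6": 3.0}
es_ind, p_ind = gsea_test(expr, {"g5", "g6"}, direction="induced")
es_rep, p_rep = gsea_test(expr, {"g5", "g6"}, direction="repressed")
print(es_ind, p_ind)
print(es_rep, p_rep)
```

The toy example mirrors the logic of the negative control: a gene set concentrated among induced genes scores highly when the ranking runs from most induced to least, and shows no enrichment when the direction is reversed.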
[Figure, negative control: MED4 T6 repression, top 15 NtcA sites for genes both with and without orthologs. Panels: score plot (running enrichment score vs. gene number) and null distribution. ES = 0.0225274, P = 0.9562, Pc = 0.481466, le = 50.]
[Figure, no -10 box, time = 6 hr: MED4 T6 repression, top 15 NtcA sites for genes without a -10 box, both with and without orthologs. Panels: score plot (running enrichment score vs. gene number) and null distribution. ES = 0.0201592, P = 0.9618, Pc = 0.456128, le = 1328.]
[Figure, no -10 box, time = 48 hr: MED4 T48 repression, top 15 NtcA sites for genes without a -10 box, both with and without orthologs. Panels: score plot (running enrichment score vs. gene number) and null distribution. ES = 0.0782916, P = 0.7426, Pc = 1.84667, le = 895.]
[Figure, -10 box occluded, time = 6 hr: MED4 T6 repression, all genes with NtcA sites < 15 bp from a -10 box. Panels: score plot (running enrichment score vs. gene number) and null distribution. ES = 0.0466228, P = 0.8666, Pc = 0.795495, le = 1621.]
[Figure, -10 box occluded, time = 48 hr: MED4 T48 repression, all genes with NtcA sites < 15 bp from a -10 box. Panels: score plot (running enrichment score vs. gene number) and null distribution. ES = 0.0486287, P = 0.9008, Pc = 0.5994, le = 71.]
[Figure, negative control: MIT9313 T6 repression, top 15 NtcA sites for genes both with and without orthologs. Panels: score plot (running enrichment score vs. gene number) and null distribution. ES = 0.00408905, P = 0.997, Pc = 0.301528, le = 2241.]
[Figure, no -10 box, time = 6 hr: MIT9313 T6 repression, top 15 NtcA sites for genes without a -10 box, both with and without orthologs. Panels: score plot (running enrichment score vs. gene number) and null distribution. ES = 0, P = 1, Pc = 0.293152, le = 2250.]
[Figure, no -10 box, time = 48 hr: MIT9313 T48 repression, top 15 NtcA sites for genes without a -10 box, both with and without orthologs. Panels: score plot (running enrichment score vs. gene number) and null distribution. ES = 0.0573243, P = 0.8262, Pc = 0.999778, le = 847.]
[Figure, -10 box occluded, time = 6 hr: MIT9313 T6 repression, all genes with NtcA sites < 15 bp from a -10 box. Panels: score plot (running enrichment score vs. gene number) and null distribution. ES = 0.00625559, P = 0.9924, Pc = 0.345553, le = 2236.]
[Figure, -10 box occluded, time = 48 hr: MIT9313 T48 repression, all genes with NtcA sites < 15 bp from a -10 box. Panel: score plot (running enrichment score vs. gene number). ES = 0.134464, P = 0.6544, Pc = 3.84253, le = 1134.]