What is the Best Way to Find the Binding Site for a Transcription Factor?
Dennis Shasha, Courant Institute, New York University
With Philip Benfey and Ken Birnbaum, Biology Department, New York University
Transcriptional Networks
•Specificity
•Modularity
[Figure: a transcriptional network at Time 1, Time 2 and Time 3, with induction and repression acting through cis-elements, e.g. AAAA.]
Genomic and Expression Data
[Figure panels: cis-regulatory regions of expressed genes; expression level vs. time (time points 1 to 7); clusters of co-expressed genes; over-represented motifs.]
[Figure: expression level vs. time for a transcription factor and its downstream genes, labeled "Low Correlation".]
Modularity
[Figure: early specification module and later expression module. From: Arnone & Davidson, Development, 1997]
[Figure: expression of Transcription Factor X vs. time (time points 1 to 10, expression axis 0 to 16).]
NAAAAAA.. ..TTTTTTN
[Figure: expression of Transcription Factor X vs. time (time points 1 to 10), shown next to the expression levels of genes with ACA?GTC in their promoters.]
Example: Time 5
Gene   Cis-regulatory region   Expression Level
A      ACA?GTC                 2
B      ACA?GTC                 4
C      ACA?GTC                 2
Composite expression of ACA?GTC at Time 5 = 8
[Figure: composite expression of genes with ACA?GTC in promoter vs. time.]
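The Time 5 example adds the expression levels of genes A, B and C (2 + 4 + 2 = 8). A minimal sketch of that sum, assuming expression is stored as one list of levels per gene; the function name `composite_expression` and gene D are hypothetical and not taken from the slides:

```python
from typing import Dict, List, Set


def composite_expression(
    expression: Dict[str, List[float]],
    genes_with_motif: Set[str],
) -> List[float]:
    """Sum, per time point, the expression of every gene whose promoter has the motif."""
    selected = [levels for gene, levels in expression.items() if gene in genes_with_motif]
    if not selected:
        return []
    # zip(*selected) walks the time points; each column holds one level per gene.
    return [sum(column) for column in zip(*selected)]


# Time 5 levels for A, B, C are from the slide; gene D is a hypothetical
# gene without the motif, included only to show that it is ignored.
time5 = {"A": [2.0], "B": [4.0], "C": [2.0], "D": [7.0]}
print(composite_expression(time5, {"A", "B", "C"}))  # [8.0]
```

With full time courses, the same call returns one composite value per time point, which is the profile plotted against the transcription factor.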
Cooperative Binding Model: AND
[Figure: TFs and binding sites in Cell Type A, Cell Type B and Cell Type C, with binding sites for X and Z.]
[Figure: TF expression (TF Z) by cell type (A, B, C), expression level 0 to 3.]
[Figure: target gene expression by cell type (A, B, C), expression level 0 to 3.]
Cooperative Binding Model: OR
TFs and binding sites
Transcription factors present at Time 1, Time 2 and Time 3: TFA, TFB, TFC
Promoters:
a..b..x
a..c..y
b..c..z
Where a, b and c are targets for TFA, TFB and TFC; x, y and z are not targets.
[Figure: each promoter is marked "on" at two of the three time points; e.g. a…b gives expression with TFA or TFB.]
[Figure: expression of TFA, TFB and TFC vs. time point (1 to 3).]
[Figure: cis-element "expression" of a, b and c vs. time point (1 to 3).]
Independent Binding Model: well handled by Bussemaker et al.
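One way to read the AND and OR models is as Boolean rules over the TFs present in a condition. The sketch below is an illustration under that assumption, not the authors' implementation; `promoter_on`, the `targets` mapping and the example conditions are hypothetical.

```python
from typing import Dict, Iterable, Set


def promoter_on(
    sites: Iterable[str],
    present_tfs: Set[str],
    targets: Dict[str, str],
    mode: str,
) -> bool:
    """Predict whether a promoter is on.

    sites: binding sites found in the promoter.
    targets: maps a site to the TF that binds it; sites missing from the
             mapping (like x, y, z on the slide) are not targets and are ignored.
    mode: "AND" needs every targeting TF present, "OR" needs at least one.
    """
    bound = [targets[site] in present_tfs for site in sites if site in targets]
    if not bound:
        return False
    if mode == "AND":
        return all(bound)
    if mode == "OR":
        return any(bound)
    raise ValueError(f"unknown mode: {mode!r}")


targets = {"a": "TFA", "b": "TFB", "c": "TFC"}
promoter = ["a", "b", "x"]  # the a..b..x promoter from the slide

print(promoter_on(promoter, {"TFA"}, targets, "OR"))          # True
print(promoter_on(promoter, {"TFA"}, targets, "AND"))         # False
print(promoter_on(promoter, {"TFA", "TFB"}, targets, "AND"))  # True
```

Real expression data are graded rather than on/off, so a practical version would need a rule for combining expression levels; the Boolean form is only meant to make the distinction between the two models concrete.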
Assumptions:
•TF RNA expression = TF protein level (caveat: protein movement)
•TFs are active where they are expressed (caveat: co-factors)
•Binding sites are within 2 kb of the initiation site
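The 2 kb assumption can be applied when deciding which genes carry a candidate motif. A minimal sketch, assuming that "?" in a motif such as ACA?GTC stands for any single base and that only the given strand is scanned; `motif_to_regex` and `promoter_has_motif` are hypothetical helper names:

```python
import re


def motif_to_regex(motif: str) -> re.Pattern:
    """Compile a motif into a regex, treating '?' as any one of A, C, G, T."""
    body = "".join("[ACGT]" if ch == "?" else re.escape(ch) for ch in motif.upper())
    return re.compile(body)


def promoter_has_motif(upstream: str, motif: str, window: int = 2000) -> bool:
    """Check for the motif in the last `window` bases of the upstream sequence.

    `upstream` is assumed to end at the base adjacent to the initiation site,
    so the slice keeps the part of the sequence closest to that site.
    """
    region = upstream.upper()[-window:]
    return motif_to_regex(motif).search(region) is not None


print(promoter_has_motif("TTACATGTCAA", "ACA?GTC"))           # True
print(promoter_has_motif("ACATGTC" + "T" * 2000, "ACA?GTC"))  # False: beyond 2 kb
```

Scanning the reverse complement as well would be a natural extension, since the slides do not say which strands were searched.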
Testing the Method
Yeast
1. Fully sequenced genome
2. Genome-wide mRNA expression profiles
300 knockout lines from Rosetta (Hughes et al. 2000)
2 datasets on yeast progressing through the cell cycle (Spellman et al. 1998; Cho et al. 1998)
Results for STE12
Rank  Correlation  Identified by Algorithm  Documented Consensus Binding Site*
1     0.93         ?GTCGCA
2     0.93         GTCGCA?
3     0.91         CGGT?TC
4     0.89         ?TGAAAC                  ATGAAA

Gene Regulated by STE12   STE12 Binding Site
Ty1                       TGAAACG
Ty2                       TGAAACG
YCL027                    GAAACA
YCL027                    GAAACG
YDR461                    TGAAACC
YFL026                    TGAAACA
YNL145                    ATGAAAC
YNL145                    ATGAAAC
*From SCPD (Zhang, Cold Spring Harbor)
Overall Results
TF      Rank  Correlation (r value)  Program Prediction  Documented Site
Knockout lines (Success Rate = 36%)
STE12   4     0.89                   ?TGAAAC             ATGAAA
BAS1    7     0.52                   ?GAGTCA             GAGTCA
PDR3    5     0.65                   CGCGG?T             TCCG(C,T)GGA
MET28   4     0.87                   ?GTCACG             TCACGTG
Cell cycle, high density arrays (Success Rate = 30%)
MET32   2     0.99                   CT?TGGC             AACTGTGG
GCN4    2     0.95                   CTGAC?C             TGANTN
Cell cycle, cDNA arrays (Success Rate = 11%)
PHO4    1     0.95                   CACG?G              CACGT(T,G)
Conclusions
•Technique: correlate transcription factor expression with cis-element expression. Can capture information that would be missed by gene expression correlation.
•Can handle cooperative (AND) and independent (OR) cases. Does less well for complex circuits.
•Future effort: manipulate promoters to eliminate false positives. (Information theory + experiments)
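The core technique, correlating transcription factor expression with cis-element expression, can be sketched as a ranking of candidate motifs by Pearson correlation. This is a hedged illustration: the function names, the placeholder motif labels and the toy profiles are hypothetical, and sorting by signed r (highest first) is an assumption about how the ranks in the results tables were assigned.

```python
import math
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either profile is constant."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length, at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def rank_motifs(
    tf_profile: Sequence[float],
    motif_profiles: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Return (motif, r) pairs sorted from highest to lowest correlation."""
    scored = [(motif, pearson(tf_profile, profile)) for motif, profile in motif_profiles.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy profiles, invented for illustration only.
tf_x = [1.0, 2.0, 3.0, 4.0]
composites = {
    "MOTIF_1": [2.0, 4.0, 6.0, 8.0],  # rises with the TF
    "MOTIF_2": [5.0, 5.0, 5.0, 5.0],  # flat
    "MOTIF_3": [4.0, 3.0, 2.0, 1.0],  # falls as the TF rises
}
for motif, r in rank_motifs(tf_x, composites):
    print(motif, round(r, 2))
```

Here each motif profile would be the composite expression of the genes carrying that motif. Ranking by the absolute value of r instead would also surface motifs whose genes are repressed by the factor.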