Modeling and analysis of
Gene Regulatory Networks
Hamid Bolouri
Division of Human Biology
Fred Hutchinson Cancer Research Center
http://labs.fhcrc.org/bolouri
Woods Hole, October 2011
Outline: (new approach proposed by Eric Davidson)
- Take a specific set of biological observations
- Explore how various computational approaches can help develop insights
• Papers
– Laslo et al, Cell 2006, 126:755–766
– Spooner et al, Immunity 2009, 31:576–586
– Cherry and Adler, J. Theoretical Biology 2000, 203:117-133
– Saka & Smith, BMC Developmental Biology 2007, 7:47.
– Chickarmane, Enver & Peterson PLoS Computational Biology 2009 5(1): e1000268.
• Further reading
– The Regulatory Genome (Eric Davidson, 2006)
– An Introduction to Systems Biology (Uri Alon, 2006)
– Computational Modeling of Gene Regulatory Networks – a Primer (Hamid Bolouri, 2008)
– R in Action (Robert Kabacoff, 2011)
Novershtern et al,
Lab Exercises: See handout for instructions.
Search for tag “MBL”
Stefan Materna & Paola Oliveri, Nature Protocols 3, 1876–1887 (2008)
Does network structure explain all observations?
Discovery
Analysis
Novershtern et al,
G1 G2 G3 G4 G5 G6 G7 G8 G9 G10
time1 0.082377 0.38766 0.61257 0.471963 -0.07442 -0.11739 0.51039 0.006912 0.011694 -0.14743
time2 0.710007 0.175795 0.035997 0.332428 0.499605 0.386174 0.171675 0.564456 0.500018 0.076234
time3 -1.0385 -0.83347 -0.92109 -0.81229 -1.35493 -1.01501 -0.74898 -0.6342 -0.69178 -0.9943
time4 -1.19125 -1.354 -0.73608 -1.03199 -1.15046 -0.81708 -1.22163 -0.88932 -0.41835 -0.5339
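# Read the expression matrix: first row = column names, first column = row names
myData <- as.matrix(read.table(inFile, header=TRUE, sep=",", row.names=1))
# Transpose when clustering by genes, so that the items to cluster are in rows
if (clusterBy == "genes") myData <- t(myData)
# Standardize each row: subtract the row mean, then divide by the row standard deviation
myData <- sweep(myData, 1, apply(myData, 1, mean), "-")
myData <- sweep(myData, 1, apply(myData, 1, sd), "/")
library(RColorBrewer)
library(gplots)
# Heatmap with a row dendrogram only; columns are drawn in reverse order
print(heatmap.2(myData[, dim(myData)[2]:1], col=brewer.pal(11, "RdYlGn"), trace="none", dendrogram="row", scale="column", Colv=FALSE, Rowv=TRUE, key=TRUE, margins=c(10, 10)))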
Search scripts & data for tag term “MBL” at CRdata.org
Observations
• PU.1 & CEBPa expression levels are both low in progenitors
• Progenitors express low levels of both macrophage- and neutrophil-associated genes
• PU.1 expression is required for macrophage specification
• CEBPa is required for neutrophil specification
• Ratio of PU.1/CEBPa determines cell fate
• In mature macrophages and neutrophils, PU.1 and CEBPa are both expressed at high levels & co-regulate cell-type-specific genes
• Multiple cytokines (G-CSF etc) act upstream of PU.1 and CEBPa
• PU.1 upregulates Egr2 and Nab2 expression (which co-regulate macrophage genes)
• Egr2 and Nab2 co-repress Gfi1 expression while Gfi1 represses Egr2 transcription
• Not included in model
– PU.1 is autoregulatory in macrophages
– CEBPa regulates PU.1 expression
PU.1 in myeloid cells, B cells and T cells
[Figure: map of PU.1 cis-regulatory modules (CRMs) at -14Kb, -10Kb, -9Kb and -1Kb, with labels GATA, GATA, PAX, Runx1, IKAROS and Enhancer. Bold = ChIP, light = EMSA.]
Sources: Hoogenkamp et al,; Zarnegar, Chen & Rothenberg,; Leddin et al, Blood 2011, 117(10):2827-2838.
CRM predictions: http://genomequebec.mcgill.ca/PReMod (Blanchette lab, McGill)
Additional CRMs at -8Kbp & within introns.
PU.1 exor GFi (Spooner ’09)
Logic simulation: need multiple value levels & memory
Chalk board discussion:
1. Do mutually repressing genes always result in 2 mutually excluded states?
2. How do the threshold parameters of a logic model relate to biochemistry?
http://gin.univ-mrs.fr/GINsim/
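The first chalk-board question can be explored with a very small Boolean sketch. The code below is only an illustration, not the GINsim model: it assumes two genes, each ON exactly when its repressor is OFF, and a synchronous update rule (both genes updated at once). The function names `step` and `trajectory` are hypothetical.

```python
from itertools import product

def step(state):
    """Synchronous update: each gene is ON only if its repressor is OFF."""
    x1, x2 = state
    return (int(not x2), int(not x1))

def trajectory(state, n_steps=4):
    """Return the list of states visited, starting from `state`."""
    path = [state]
    for _ in range(n_steps):
        state = step(state)
        path.append(state)
    return path

# States that map onto themselves are the steady states of the logic model.
fixed_points = [s for s in product((0, 1), repeat=2) if step(s) == s]

for start in product((0, 1), repeat=2):
    print(start, "->", trajectory(start))
print("fixed points:", fixed_points)
```

Under these assumptions the two mutually exclusive states are fixed points, but the states (0, 0) and (1, 1) alternate forever. With two-valued genes and no memory, mutual repression alone does not guarantee that the system settles, which is one way to see why the logic simulation needs multiple value levels and memory.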
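Using Ordinary Differential Equations (ODEs) to model dynamics
distance traveled = current position – starting position = speed x time
$$\text{speed} = \frac{(\text{position at time } t_2) - (\text{position at time } t_1)}{t_2 - t_1}$$
$$\text{speed} = \left.\frac{\text{position}_2 - \text{position}_1}{\text{time}_2 - \text{time}_1}\right|_{(\text{time}_2 - \text{time}_1) \to 0} = \frac{d(\text{position})}{d(\text{time})}$$
$$\text{acceleration} = \frac{d(\text{speed})}{d(\text{time})} = \frac{d^2(\text{position})}{d(\text{time})^2}$$
[Figure: speed plotted against time, with time points t1 to t5 marked]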
Chalk board discussion: Integration & differentiation
Analysis of dynamic network behavior
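Consider a species A that is produced at rate k1 and removed with rate constant k2: → A →
We can write
$$\frac{dA}{dt} = k_1 - k_2 \cdot A$$
Which has a simple analytic solution (assuming [A](t=0) = 0):
$$A(t) = \frac{k_1}{k_2} \cdot \left(1 - e^{-k_2 t}\right)$$
[Figure: [A] plotted against t. The initial slope is k1 and d[A]/dt > 0 while [A] rises. [A] approaches its maximum k1/k2, where d[A]/dt = 0.]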
Chalk board discussion: Stable and unstable steady states
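A minimal numerical sketch of the model dA/dt = k1 - k2.A, assuming forward-Euler integration with a small fixed step. The parameter values and the function names `simulate` and `analytic` are illustrative choices, not values from the lecture.

```python
import math

def simulate(k1, k2, a0=0.0, dt=0.001, t_end=10.0):
    """Forward-Euler integration of dA/dt = k1 - k2*A, returning A(t_end)."""
    a = a0
    for _ in range(int(round(t_end / dt))):
        a += dt * (k1 - k2 * a)
    return a

def analytic(k1, k2, t):
    """Closed-form solution for A(0) = 0."""
    return (k1 / k2) * (1.0 - math.exp(-k2 * t))

k1, k2 = 2.0, 0.5                       # hypothetical rate constants
numeric = simulate(k1, k2, t_end=10.0)
exact = analytic(k1, k2, 10.0)
print("Euler:", numeric, "analytic:", exact)

# Starting below or above k1/k2, A relaxes to the same value: a stable steady state.
from_below = simulate(k1, k2, a0=0.0, t_end=40.0)
from_above = simulate(k1, k2, a0=10.0, t_end=40.0)
print("steady state k1/k2 =", k1 / k2, from_below, from_above)
```

The two long runs illustrate what "stable" means here: trajectories that start on either side of k1/k2 converge to it.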
[Figure, built up over three slides: regulatory network linking PU.1, C/EBPa, Egr, NAB2 and GFi to macrophage genes and neutrophil genes. Laslo et al, Cell 2006, 126:755–766]
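$$\frac{d(\text{PU.1})}{dt} = \frac{e_p}{1 + \text{Gfi}^4} - \text{PU.1}$$
$$\frac{d(\text{Egr})}{dt} = \frac{\text{PU.1}}{1 + \text{PU.1}} \cdot \frac{1}{1 + \text{Gfi}^4} - \text{Egr}$$
$$\frac{d(\text{Gfi})}{dt} = \frac{\text{CEBP}\alpha}{1 + \text{CEBP}\alpha} \cdot \frac{1}{1 + \text{Egr}^4} - \text{Gfi}$$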
Steady states in feedback networks
[Figure: stable steady state 1, stable steady state 2, unstable steady state 1]
Chalk board:
- mono vs bistability
- stability, homeostasis: Mediocristan
- polarized/differentiated states: Extremistan
[Figure: developmental trajectory for macrophages. Laslo et al, Cell 126, 755–766, August 25, 2006]
Graph: a convenient graphing tool. Free at http://www.padowan.dk/graph/
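$$\frac{d(\text{PU.1})}{dt} = \frac{e_p}{1 + \text{Gfi}^4} - \text{PU.1}$$
$$\frac{d(\text{Egr})}{dt} = \frac{\text{PU.1}}{1 + \text{PU.1}} \cdot \frac{1}{1 + \text{Gfi}^4} - \text{Egr}$$
$$\frac{d(\text{Gfi})}{dt} = \frac{\text{CEBP}\alpha}{1 + \text{CEBP}\alpha} \cdot \frac{1}{1 + \text{Egr}^4} - \text{Gfi}$$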
What do the fractional terms imply?
Why are all Khalf = kd = 1?
Where do the production rate functions come from?
1. Do the fractions in the rates have biochemical meaning?
2. Why use (fraction1)*(fraction2) form for the rates?
- a side-trip into modeling the regulation of transcription
A simple 2-step ODE model of transcription and translation
[Diagram: NTPs are transcribed into mRNA at a rate set by Y; AAs are translated into Protein; mRNA and Protein are each degraded]
$$\frac{d[\text{mRNA}]}{dt} = k_t \cdot Y - k_{dm} \cdot \text{mRNA}$$
kt is the maximal rate of transcription
$$\frac{d[P]}{dt} = k_s \cdot \text{mRNA} - k_{dp} \cdot P$$
ks is the protein synthesis rate per unit of mRNA concentration
[Figure: Y, mRNA and Protein plotted against time]
At steady state:
Pss = (ks/kdp).(kt/kdm).Y
Pss ∝ Y (Y is usually set to the fractional saturation of the TF complex on its DNA binding site)
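The steady-state relation Pss = (ks/kdp).(kt/kdm).Y can be checked numerically. The sketch below integrates the 2-step model with forward Euler from zero mRNA and zero protein; all parameter values are hypothetical and the function name `simulate_two_step` is an assumption made for illustration.

```python
def simulate_two_step(Y, kt, kdm, ks, kdp, dt=0.01, t_end=200.0):
    """Forward-Euler integration of
         d[mRNA]/dt = kt*Y - kdm*mRNA
         d[P]/dt    = ks*mRNA - kdp*P
       starting from mRNA = P = 0. Returns (mRNA, P) at t_end."""
    m = p = 0.0
    for _ in range(int(round(t_end / dt))):
        dm = kt * Y - kdm * m
        dp = ks * m - kdp * p
        m += dt * dm
        p += dt * dp
    return m, p

params = dict(kt=2.0, kdm=0.5, ks=1.0, kdp=0.1)   # hypothetical rate constants
results = {}
for Y in (0.25, 0.5, 1.0):
    m_ss, p_ss = simulate_two_step(Y, **params)
    predicted = (params["ks"] / params["kdp"]) * (params["kt"] / params["kdm"]) * Y
    results[Y] = (p_ss, predicted)
    print("Y =", Y, "simulated Pss =", round(p_ss, 4), "predicted Pss =", predicted)
```

Doubling Y doubles the simulated protein level, which is the proportionality Pss ∝ Y stated above.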
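An even simpler 1-step ODE model of gene expression
[Diagram: Gene, with NTPs and AAs as inputs, producing Protein at a rate set by Y]
If we assume rapid mRNA equilibrium, then:
$$\frac{d[\text{mRNA}]}{dt} = k_t \cdot Y - k_{dm} \cdot \text{mRNA} = 0 \quad\Rightarrow\quad \text{mRNA}_{ss} = \frac{k_t}{k_{dm}} \cdot Y$$
$$\frac{d[P]}{dt} = k_s \cdot \text{mRNA}_{ss} - k_{dp} \cdot P = k_s \cdot \frac{k_t}{k_{dm}} \cdot Y - k_{dp} \cdot P$$
Let $k_g = k_s \cdot k_t / k_{dm}$ and write $k_d$ for $k_{dp}$, then
$$\frac{dP}{dt} = k_g \cdot Y - k_d \cdot P$$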
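Fractional DNA occupancy by one activating factor
[Diagram: activator (A) binds gene G; transcription produces mRNA and synthesis produces protein P; mRNA and P are each degraded]
$$Y = \text{Fractional DNA Occupancy} = \frac{[DNA{:}A]}{[DNA] + [DNA{:}A]} = \frac{K_A \cdot [DNA] \cdot [A]}{[DNA] + K_A \cdot [DNA] \cdot [A]}$$
$$Y = \frac{K_A \cdot A}{1 + K_A \cdot A} \quad \text{or equivalently:} \quad Y = \frac{A}{K_D + A}$$
as before:
$$\frac{d(\text{mRNA})}{dt} = k_t \cdot Y(A) - k_{dm} \cdot \text{mRNA}$$
$$\frac{dP}{dt} = k_s \cdot \text{mRNA} - k_{dp} \cdot P$$
[Figure: Y plotted against A for increasing KA]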
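As a short worked step, assuming $K_D$ denotes the dissociation constant $1/K_A$, the association and dissociation forms of the single-activator occupancy are the same function, and occupancy is half-maximal when the activator concentration equals $K_D$:

```latex
Y = \frac{K_A A}{1 + K_A A}
  = \frac{A}{\tfrac{1}{K_A} + A}
  = \frac{A}{K_D + A},
\qquad
Y\big|_{A = K_D} = \frac{K_D}{K_D + K_D} = \frac{1}{2}
```

So increasing $K_A$ (stronger binding) lowers $K_D$ and shifts the occupancy curve toward lower activator concentrations.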
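Transcriptional repression
[Diagram: repressor (R) binds gene G; transcription produces mRNA and synthesis produces protein P; mRNA and P are each degraded]
$$Y_R = \frac{K_{AR} \cdot R}{1 + K_{AR} \cdot R} \quad \text{(fraction of DNA occupied by R)}$$
(1-YR) = fraction of DNA not occupied by R:
$$Y_{notR} = \frac{[DNA]}{[DNA] + K_{AR} \cdot [DNA] \cdot [R]} = \frac{1}{1 + K_{AR} \cdot R}$$
$$\frac{d(\text{mRNA})}{dt} = k_t \cdot \frac{1}{1 + K_{AR} \cdot R} - k_{dm} \cdot \text{mRNA}$$
[Figure: fraction of DNA not occupied by R, plotted against R for increasing KAR]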
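DNA occupancy by 2 factors
[Diagram: DNA states D, DA, DB and DAB. D ⇌ DA with rate constants ka, k-a; D ⇌ DB with kb, k-b; DA ⇌ DAB with kab, k-ab; DB ⇌ DAB with kba, k-ba. Bound A and B drive production of mRNA and protein P.]
At equilibrium:
$$\frac{d(DA)}{dt} = 0, \quad \frac{d(DB)}{dt} = 0, \quad \frac{d(DAB)}{dt} = 0$$
$$\frac{d(DA)}{dt} = k_a [D][A] - k_{-a} [DA] = 0 \;\Rightarrow\; [DA] = \frac{k_a}{k_{-a}} [D][A] = K_A [D][A]$$
$$\text{likewise: } [DB] = \frac{k_b}{k_{-b}} [D][B] = K_B [D][B]$$
$$\frac{d(DAB)}{dt} = k_{ab} [DA][B] - k_{-ab} [DAB] = 0 \;\Rightarrow\; [DAB] = \frac{k_{ab}}{k_{-ab}} [DA][B] = K_A K_{AB} [D][A][B]$$
$$\frac{d(DAB)}{dt} = k_{ba} [DB][A] - k_{-ba} [DAB] = 0 \;\Rightarrow\; [DAB] = \frac{k_{ba}}{k_{-ba}} [DB][A] = K_B K_{BA} [D][A][B]$$
$$K_A K_{AB} [D][A][B] = K_B K_{BA} [D][A][B] \;\Rightarrow\; K_A K_{AB} = K_B K_{BA}$$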
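DNA occupancy by 2 factors
[Diagram: DNA states D, DA, DB and DAB with rate constants ka, k-a, kb, k-b, kab, k-ab, kba, k-ba, as on the previous slide]
In general: KA.KAB = KB.KBA = Kq.KA.KB
Where Kq = cooperativity factor
= 1 if A and B bind independently
$$\frac{d(DA)}{dt} = k_a [D][A] - k_{-a} [DA] \;\Rightarrow\; [DA] = \frac{k_a}{k_{-a}} [D][A] = K_A [D][A] \quad (\text{likewise: } [DB] = K_B [D][B])$$
$$\frac{d(DAB)}{dt} = k_{ab} [DA][B] - k_{-ab} [DAB] \;\Rightarrow\; [DAB] = \frac{k_{ab}}{k_{-ab}} [DA][B] = K_{AB} [DA][B] = K_A K_{AB} [D][A][B] = K_q K_A K_B [D][A][B]$$
$$\text{occupancy} = \frac{\text{sum of the steady state levels of activating states}}{\text{sum of the steady state levels of all DNA configurations}}$$
If A & B activate independently:
$$Y = \frac{DA + DB + DAB}{D + DA + DB + DAB} = \frac{K_A [A] + K_B [B] + K_A K_B K_q [A][B]}{1 + K_A [A] + K_B [B] + K_A K_B K_q [A][B]}$$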
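For 2 highly cooperative factors:
$$Y = \frac{K_A [A] \cdot K_B K_q [B]}{1 + K_A [A] \cdot K_B K_q [B]}$$
For 2 cooperative sites:
$$Y = \frac{K_q K_A^2 [A]^2}{1 + K_q K_A^2 [A]^2}$$
Let $K_{A2} = K_q K_A^2$, then:
$$Y = \frac{K_{A2} [A]^2}{1 + K_{A2} [A]^2} \quad \text{or equivalently:} \quad Y = \frac{[A]^2}{K_{dA2} + [A]^2}, \quad K_{dA2} = \frac{1}{K_{A2}}$$
(Likewise for a homodimer)
For 1 TF multimer:
$$Y = \frac{(K_A \cdot A)^N}{1 + (K_A \cdot A)^N} \quad \text{or} \quad Y = \frac{A^N}{K_{DA}^N + A^N}$$
[Figure: occupancy curves for n=1 and n=3]
For 1 repressor, 1 activator:
$$Y = \frac{K_A [A]}{1 + K_A [A] + K_R [R] + K_A K_R K_q [A][R]}$$
Chalk board: (1-occupancy) for a repressor multimer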
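The occupancy formulas can all be read as "weight of the active DNA states divided by the sum of the weights of all states". The sketch below encodes that reading for two factors, assuming the four states D, DA, DB and DAB with relative weights 1, KA.[A], KB.[B] and Kq.KA.[A].KB.[B]. The function names and numerical values are hypothetical.

```python
def occupancy_two_factors(a, b, KA, KB, Kq=1.0, active=("DA", "DB", "DAB")):
    """Equilibrium fraction of DNA found in the `active` states."""
    weights = {
        "D": 1.0,                       # naked DNA
        "DA": KA * a,                   # A bound alone
        "DB": KB * b,                   # B bound alone
        "DAB": Kq * KA * a * KB * b,    # both bound, with cooperativity Kq
    }
    return sum(weights[s] for s in active) / sum(weights.values())

def hill(a, KA, n=1):
    """Occupancy by one TF multimer: (KA*a)^n / (1 + (KA*a)^n)."""
    x = (KA * a) ** n
    return x / (1.0 + x)

def not_occupied_by_repressor(r, KR, n=1):
    """(1 - occupancy) for a repressor multimer: 1 / (1 + (KR*r)^n)."""
    return 1.0 - hill(r, KR, n)

a, b, KA, KB = 2.0, 3.0, 0.5, 1.0       # hypothetical concentrations and constants
either = occupancy_two_factors(a, b, KA, KB)                   # A or B bound
both = occupancy_two_factors(a, b, KA, KB, active=("DAB",))    # A and B bound
a_only = occupancy_two_factors(a, b, KA, KB, active=("DA",))   # 1 activator, 1 repressor (B as R)
print(either, both, a_only, not_occupied_by_repressor(2.0, 1.0, n=3))
```

With Kq = 1 the joint occupancy equals the product of the two single-factor occupancies, which is one way to see where the (fraction1)*(fraction2) form of a production rate comes from when factors bind independently.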
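Calculating promoter occupancy as a function of specific and non-specific DNA binding rates for two factors
Let $K_R = K_{equilibrium\_specific} / K_{equilibrium\_nonspecific}$, so that $K_{RA} = K_{equilibrium\_SA} / K_{equilibrium\_NA}$
Let $D_N$ = number of non-specific DNA binding sites for factors A & B
Let $D_S$ = number of specific DNA binding sites for factors A & B, then
$$K_{equilibrium\_NA} = \frac{[D_N A]}{[D_N] \cdot [A]}$$
$$K_{equilibrium\_NA} \approx \frac{1}{D_N}, \quad K_{equilibrium\_NB} \approx \frac{1}{D_N}, \quad K_{AB} = \frac{K_{RA} \cdot K_{RB} \cdot K_q}{D_N^2}$$
Let $Y_{AB}$ = joint occupancy by A & B of their specific binding sites, then
$Y_{AB}$ = proportion of DNA binding regions occupied by A & B jointly (proportion of all possible DNA configurations at steady state)
For cooperative binding by A and B:
$$Y_{AB} = \frac{\dfrac{A \cdot K_{RA} \cdot B \cdot K_{RB} \cdot K_q}{D_N^2}}{1 + \dfrac{A}{D_N} + \dfrac{B}{D_N} + \dfrac{A \cdot K_{RA}}{D_N} + \dfrac{B \cdot K_{RB}}{D_N} + \dfrac{A \cdot K_{RA} \cdot B \cdot K_{RB} \cdot K_q}{D_N^2}}$$
(note denominator includes terms for naked DNA and non-specifically bound DNA)
$$Y_{AB} = \frac{\dfrac{A \cdot K_{RA} \cdot B \cdot K_{RB} \cdot K_q}{D_N^2}}{\dfrac{D_N^2 + D_N \cdot A + D_N \cdot B + D_N \cdot A \cdot K_{RA} + D_N \cdot B \cdot K_{RB} + A \cdot K_{RA} \cdot B \cdot K_{RB} \cdot K_q}{D_N^2}}$$
cancelling $D_N^2$ from the top and bottom gives:
$$Y_{AB} = \frac{A \cdot K_{RA} \cdot B \cdot K_{RB} \cdot K_q}{D_N^2 + D_N \cdot A + D_N \cdot B + D_N \cdot A \cdot K_{RA} + D_N \cdot B \cdot K_{RB} + A \cdot K_{RA} \cdot B \cdot K_{RB} \cdot K_q}$$
For example, in embryonic cells of the sea urchin S. purpuratus, typical numbers are:
$D_N$ ~ 1.6E8 sites, nuclear volume ~ 4E-15 L, $D_N$ ~ 7E-2 M,
$K_{RA}$, $K_{RB}$ ~ $10^5$, A, B ~ 3000 molecules/nucleus ($1.7 \times 10^{-6}$ M), which gives $Y_{AB}$ ~ 50%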
Bolouri H & Davidson EH, PNAS, 5 August 2003, 100(16):9371-9376.
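The sea urchin example can be reproduced with a few lines of arithmetic. The sketch below evaluates the joint occupancy formula after cancelling DN^2, using the typical numbers quoted on the slide. It assumes Kq = 1 (the slide does not give a value for Kq), and the function name `joint_occupancy` is hypothetical.

```python
def joint_occupancy(A, B, K_RA, K_RB, D_N, K_q=1.0):
    """Joint occupancy Y_AB of the specific sites by A and B.
    A, B and D_N are molar concentrations; K_RA and K_RB are
    specific/non-specific equilibrium constant ratios."""
    joint = A * K_RA * B * K_RB * K_q
    denominator = (D_N ** 2 + D_N * A + D_N * B
                   + D_N * A * K_RA + D_N * B * K_RB + joint)
    return joint / denominator

AVOGADRO = 6.022e23
nuclear_volume_litres = 4e-15
# 1.6E8 non-specific sites per nucleus, converted to a molar concentration
D_N_molar = 1.6e8 / (AVOGADRO * nuclear_volume_litres)
print("D_N in M:", D_N_molar)

# Kq = 1 is an assumption made for this sketch
Y_AB = joint_occupancy(A=1.7e-6, B=1.7e-6, K_RA=1e5, K_RB=1e5, D_N=7e-2, K_q=1.0)
print("Y_AB:", Y_AB)
```

Under that assumption the result is close to the quoted 50%, and the site count converts to roughly the quoted 7E-2 M.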
Example model fit to sea urchin data – with added transcription initiation step
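Reducing the number of unknown parameters – a technique to simplify model exploration
$$\frac{dG_1}{dt} = k_{t1} \cdot \left(\frac{1}{1 + K_{A2}^{N_2} \cdot G_2^{N_2}}\right) - k_{d1} \cdot G_1$$
$$\frac{dG_2}{dt} = k_{t2} \cdot \left(\frac{1}{1 + K_{A1}^{N_1} \cdot G_1^{N_1}}\right) - k_{d2} \cdot G_2$$
rewrite as:
$$\frac{dG_1}{dt} = k_{t1} \cdot \left(\frac{1}{1 + \left(\dfrac{G_2}{K_{Diss2}}\right)^{N_2}}\right) - k_{d1} \cdot G_1$$
$$\frac{dG_2}{dt} = k_{t2} \cdot \left(\frac{1}{1 + \left(\dfrac{G_1}{K_{Diss1}}\right)^{N_1}}\right) - k_{d2} \cdot G_2$$
If we measure G in units of $K_{Diss}$, then:
$$\frac{dG_1}{dt} = k_{t1} \cdot \left(\frac{1}{1 + G_2^{N_2}}\right) - k_{d1} \cdot G_1$$
$$\frac{dG_2}{dt} = k_{t2} \cdot \left(\frac{1}{1 + G_1^{N_1}}\right) - k_{d2} \cdot G_2$$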
Mutual repression
$$\frac{dG_1}{dt} = k_{t1} \cdot \left(\frac{1}{1 + G_2^{N_2}}\right) - k_{d1} \cdot G_1$$
$$\frac{dG_2}{dt} = k_{t2} \cdot \left(\frac{1}{1 + G_1^{N_1}}\right) - k_{d2} \cdot G_2$$
[Figure: axes are gene 1 expression and gene 2 expression]
Mutual repression
[Figure: axes are kt1/kd1 and kt2/kd2, with curves for cooperativity factor = 2, 3 and 4]
For exploratory exercises, see the Lab Notes handout
Chalk board discussion: - change-direction arrows - how inputs set the state
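A minimal simulation sketch of the reduced mutual-repression model, dG1/dt = kt1/(1 + G2^N) - kd1.G1 and dG2/dt = kt2/(1 + G1^N) - kd2.G2, using forward Euler. The parameter values (kt = 5, kd = 1) and the function name `simulate_switch` are assumptions chosen for illustration; the handout exercises may use different settings.

```python
def simulate_switch(g1, g2, kt1=5.0, kt2=5.0, kd1=1.0, kd2=1.0,
                    n=2, dt=0.01, t_end=50.0):
    """Forward-Euler integration of two mutually repressing genes,
    with G measured in units of KDiss. Returns (G1, G2) at t_end."""
    for _ in range(int(round(t_end / dt))):
        dg1 = kt1 / (1.0 + g2 ** n) - kd1 * g1
        dg2 = kt2 / (1.0 + g1 ** n) - kd2 * g2
        g1 += dt * dg1
        g2 += dt * dg2
    return g1, g2

# Cooperativity factor 2: the end state depends on where the system starts.
state_a = simulate_switch(2.0, 0.1, n=2)
state_b = simulate_switch(0.1, 2.0, n=2)
# Cooperativity factor 1: both starting points end in the same state.
state_c = simulate_switch(2.0, 0.1, n=1)
state_d = simulate_switch(0.1, 2.0, n=1)
print(state_a, state_b, state_c, state_d)
```

With these settings, cooperativity factor 2 gives two mutually exclusive end states, while cooperativity factor 1 gives a single state with both genes at the same intermediate level. This bears on the earlier question of whether mutually repressing genes always produce two exclusive states.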
Controlling the state of a mutual repression switch with 2 independent activating inputs (k1, k2 = 5)
- Inputs=0: x1=off, x2=on
- Inputs=0: x1=on, x2=off
- Input1=0, Input2=0.5: x1=off, x2=on
- Input1=0.75, Input2=0.5: x1=high, x2=low
Cf. memory term in our earlier logic model. But what feature of the system creates the memory?
Hysteresis
Two autoregulatory positive feedback loops maintain PU.1
Steady states in feedback networks
gene G conceptual model:
$$\frac{dG}{dt} = \frac{k_t \cdot G}{K + G} - k_d \cdot G$$
[Figure: rate of production and rate of clearance plotted against G, with the stable steady state marked]
rate of clearance > rate of production ⇒ G will decrease over time
rate of production > rate of clearance ⇒ G will increase over time
At steady state, production rate = clearance rate
Steady states in feedback networks
gene G conceptual model:
$$\frac{dG}{dt} = \frac{k_t \cdot G^N}{K^N + G^N} - k_d \cdot G$$
[Figure: rate of production and rate of clearance plotted against G, marking stable steady state 1, unstable steady state 1 and stable steady state 2]
rate of clearance > rate of production ⇒ G will decrease over time
rate of production > rate of clearance ⇒ G will increase over time
At steady states, production rate = clearance rate
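The graphical argument (steady states are where the production and clearance curves cross) can be sketched numerically. The code below assumes the sigmoidal model dG/dt = kt.G^N/(K^N + G^N) - kd.G with hypothetical parameter values, finds the crossings by scanning for sign changes and bisecting, and labels a steady state stable when the net rate falls through zero as G increases. All function names are illustrative.

```python
def net_rate(g, kt=1.0, K=1.0, n=4, kd=0.4):
    """Rate of production minus rate of clearance."""
    return kt * g ** n / (K ** n + g ** n) - kd * g

def bisect(f, lo, hi, tol=1e-10):
    """Root of f in [lo, hi], assuming f(lo) and f(hi) differ in sign."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (f(lo) < 0) == (f(mid) < 0):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def steady_states(f, g_max=5.0, n_grid=500, h=1e-6):
    """Return [(G, 'stable' or 'unstable'), ...].
    G = 0 is always a steady state of this model (no production without G)."""
    states = [(0.0, "stable" if f(h) < 0 else "unstable")]
    grid = [g_max * (i + 1) / n_grid for i in range(n_grid)]
    for lo, hi in zip(grid, grid[1:]):
        if (f(lo) < 0) != (f(hi) < 0):
            root = bisect(f, lo, hi)
            slope = (f(root + h) - f(root - h)) / (2 * h)
            states.append((root, "stable" if slope < 0 else "unstable"))
    return states

states = steady_states(net_rate)
for g, kind in states:
    print(round(g, 3), kind)
```

For these parameter values the scan finds a stable state at G = 0, an unstable state in between, and a stable high state, matching the stable / unstable / stable pattern in the figure.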
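Two ways of providing input:
(1) Independent activating input to gene G: add a second occupancy term:
$$\frac{dG}{dt} = \frac{k_{t1} \cdot G^N}{K_G^N + G^N} + \frac{k_{t2} \cdot i^n}{K_i^n + i^n} - k_d \cdot G$$
(2) Input is (or acts on) the same protein as feedback: add to G in the occupancy term:
$$\frac{dG}{dt} = \frac{k_t \cdot (G + i)^N}{K^N + (G + i)^N} - k_d \cdot G$$
[Figure: rate of production and rate of clearance, showing the effect of the input]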
Auto-regulation: hysteresis & bistable lock-on switches
[Figures: mRNAss or Pss, with curves for increasing KDiss; rate plotted against G, with curves for reducing input]
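Hysteresis can be illustrated by sweeping the input up and then back down while letting G relax at each step. The sketch below assumes the second input scheme, dG/dt = kt.(G + i)^N/(K^N + (G + i)^N) - kd.G, with hypothetical parameter values; `relax` is an illustrative helper that integrates with forward Euler.

```python
def relax(g, inp, kt=1.0, K=1.0, n=4, kd=0.4, dt=0.05, t_end=200.0):
    """Integrate dG/dt = kt*(G+i)^n / (K^n + (G+i)^n) - kd*G
    from the current G until it is close to a steady state."""
    for _ in range(int(round(t_end / dt))):
        x = (g + inp) ** n
        g += dt * (kt * x / (K ** n + x) - kd * g)
    return g

inputs = [0.1 * i for i in range(11)]    # input levels 0.0, 0.1, ..., 1.0

g, up, down = 0.0, [], []
for inp in inputs:                       # slowly raise the input
    g = relax(g, inp)
    up.append(g)
for inp in reversed(inputs):             # then slowly lower it again
    g = relax(g, inp)
    down.append(g)
down.reverse()                           # align with `inputs`

for inp, g_up, g_down in zip(inputs, up, down):
    print(round(inp, 1), round(g_up, 3), round(g_down, 3))
```

With these values G stays low until the input passes a threshold, then jumps to the high state and remains there even after the input returns to zero. The same input value therefore maps to two different outputs depending on history, which is the lock-on behaviour named in the slide title.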
And:
[Figure: PU.1 at steady state plotted against Gata1 at steady state]
Where might the nonlinearities come from?
Additional feedback loops:
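Where might the nonlinearities come from?
Facilitation by ‘pioneer’ factors
Simple model: D ⇌ DT1 ⇌ DT1T2 (binding of T1, then of T2)
where D = DNA, T = transcription factor,
at steady state:
$$[DT_1] = \frac{[D] \cdot [T_1]}{K_D}, \quad [DT_1T_2] = \frac{[D] \cdot [T_1] \cdot [T_2]}{K_D \cdot K_q}$$
$$Y = \frac{[DT_1T_2]}{[D] + [DT_1] + [DT_1T_2]} = \frac{\dfrac{[T_1] \cdot [T_2]}{K_D \cdot K_q}}{1 + \dfrac{[T_1]}{K_D} + \dfrac{[T_1] \cdot [T_2]}{K_D \cdot K_q}}$$
[Figure: sigmoid DNA occupancy curve, Y plotted against T]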
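Where might the nonlinearities come from?
[Diagram: R is converted to RP in a step driven by S, and RP is converted back to R]
At steady state:
[Figure: RP plotted against S for different values of Km1/RT and Km2/RT; curves labelled 9.0 and 1.0]
If Km1/RT, Km2/RT ≪ 1, response is sigmoidal
Other cases:
[Diagrams: cycles involving R, RP and RPP, and r and rP, driven by S]
See also http://en.wikipedia.org/wiki/Goldbeter-Koshland_kinetics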
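The sigmoidal response of a modification cycle can be sketched with the Goldbeter-Koshland function. The code below assumes Michaelis-Menten kinetics for both steps and solves u.(1 - x)/(J + 1 - x) = v.x/(K + x) for x = RP/RT. The symbols are assumptions made for this sketch: u is the maximal rate of the S-driven step, v that of the reverse step, J = Km1/RT and K = Km2/RT.

```python
import math

def goldbeter_koshland(u, v, J, K):
    """Steady-state modified fraction x = RP/RT, i.e. the root in [0, 1] of
       u*(1 - x)/(J + 1 - x) = v*x/(K + x)."""
    b = v - u + v * J + u * K
    return 2.0 * u * K / (b + math.sqrt(b * b - 4.0 * (v - u) * u * K))

signal_levels = (0.5, 0.9, 1.0, 1.1, 2.0)   # forward rate u, with v fixed at 1
for label, km in (("Km/RT = 0.01", 0.01), ("Km/RT = 1", 1.0)):
    row = [round(goldbeter_koshland(u, 1.0, km, km), 3) for u in signal_levels]
    print(label, row)

steep = goldbeter_koshland(1.1, 1.0, 0.01, 0.01) - goldbeter_koshland(0.9, 1.0, 0.01, 0.01)
graded = goldbeter_koshland(1.1, 1.0, 1.0, 1.0) - goldbeter_koshland(0.9, 1.0, 1.0, 1.0)
print("change in RP/RT across u = 0.9..1.1:", round(steep, 3), "vs", round(graded, 3))
```

When both Km values are small relative to RT, a change of about 20% in the forward rate flips most of the protein from R to RP. When the Km values are comparable to RT, the same change produces only a small, graded shift.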
Portrait of the state space
[Figure: state space with axes A and B, showing the steady state locus of A and the steady state locus of B; a second panel plots against Time (A.U.)]
Simulation of 100X100 array of cells with autocrine signaling pathway.
- 2x2 patch of activated cells at time zero: activity dies out
- 10X10 patch of activated cells at time zero: activity restricted to one patch
- There is a minimum cluster-size requirement for activation (activity stabilizes or activity dies out)
- Randomly distributed 1% of cells activated at time zero: activity dies out
- Randomly distributed 10% of cells activated at time zero: activity spreads
[Figure: regulatory network linking Early activator, E2A, Ids, Ikaros, PU.1, Egr, NAB2 and GFi to macrophage genes and B cell genes. Spooner et al, Immunity 2009, 31:576–586]