Modeling and analysis of Gene Regulatory Networks

Hamid Bolouri

Division of Human Biology

Fred Hutchinson Cancer Research Center

http://labs.fhcrc.org/bolouri

Woods Hole, October 2011

Outline: (new approach proposed by Eric Davidson)

- Take a specific set of biological observations

- Explore how various computational approaches can help develop insights

• Papers

– Laslo et al, Cell 2006, 126:755–766

– Spooner et al, Immunity 2009, 31:576–586

– Cherry and Adler, J. Theoretical Biology 2000, 203:117-133

– Saka & Smith, BMC Developmental Biology 2007, 7:47.

– Chickarmane, Enver & Peterson PLoS Computational Biology 2009 5(1): e1000268.

• Further reading

– The Regulatory Genome (Eric Davidson, 2006)

– An Introduction to Systems Biology (Uri Alon, 2006)

– Computational Modeling of Gene Regulatory Networks – a Primer (Hamid Bolouri, 2008)

– R in Action (Robert Kabacoff, 2011)

Novershtern et al,

Lab Exercises: See handout for instructions.

Search for tag “MBL”

Stefan Materna & Paola Oliveri, Nature Protocols 3, 1876–1887 (2008)

Does network structure explain all observations?

Discovery

Analysis

Novershtern et al,

       G1        G2        G3        G4        G5        G6        G7        G8        G9        G10
time1  0.082377  0.38766   0.61257   0.471963  -0.07442  -0.11739  0.51039   0.006912  0.011694  -0.14743
time2  0.710007  0.175795  0.035997  0.332428  0.499605  0.386174  0.171675  0.564456  0.500018  0.076234
time3  -1.0385   -0.83347  -0.92109  -0.81229  -1.35493  -1.01501  -0.74898  -0.6342   -0.69178  -0.9943
time4  -1.19125  -1.354    -0.73608  -1.03199  -1.15046  -0.81708  -1.22163  -0.88932  -0.41835  -0.5339

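# inFile and clusterBy must be defined before this script runs
library(RColorBrewer)  # brewer.pal
library(gplots)        # heatmap.2
# Read a comma-separated table; the first column holds the row names
myData <- as.matrix(read.table(inFile, header=TRUE, sep=",", row.names=1))
# Transpose so that the items to be clustered are in rows
if (clusterBy == "genes") myData <- t(myData)
# Standardize each row: subtract the row mean, divide by the row standard deviation
myData <- sweep(myData, 1, apply(myData, 1, mean), "-")
myData <- sweep(myData, 1, apply(myData, 1, sd), "/")
# Draw the heatmap with columns in reverse order; cluster rows only
heatmap.2(myData[, dim(myData)[2]:1], col=brewer.pal(11, "RdYlGn"), trace="none",
          dendrogram="row", scale="column", Colv=FALSE, Rowv=TRUE, key=TRUE,
          margins=c(10, 10))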

Search scripts & data for tag term “MBL” at CRdata.org

Observations

• PU.1 & CEBPa expression levels are both low in progenitors

• Progenitors express low levels of both macrophage and neutrophil associated genes

• PU.1 expression is required for macrophage specification

• CEBPa is required for neutrophil specification

• Ratio of PU.1/CEBPa determines cell fate

• In mature macrophages and neutrophils, PU.1 and CEBPa are both expressed at high levels & co-regulate cell-type-specific genes

• Multiple cytokines (G-CSF etc) act upstream of PU.1 and CEBPa

• PU.1 upregulates Egr2 and Nab2 expression (which co-regulate macrophage genes)

• Egr2 and Nab2 co-repress Gfi1 expression while Gfi1 represses Egr2 transcription

• Not included in model

– PU.1 is autoregulatory in macrophages

– CEBPa regulates PU.1 expression

PU.1

Myeloid cells

B cells

T cells

Hoogenkamp et al,

http://genomequebec.mcgill.ca/PReMod Blanchette lab, McGill

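[Figure: map of transcription factor binding sites; labeled sites include GATA and PAX. Legend: bold = ChIP evidence, light = EMSA evidence]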

Leddin et al, Blood 2011, 117(10):2827-2838. Additional CRMs at -8Kbp & within introns.

[Figure: regulatory elements at -14Kb, -10Kb, -9Kb and -1Kb; labeled factors: Runx1, IKAROS]

Zarnegar, Chen & Rothenberg,

Enhancer

PU.1 XOR GFi (Spooner et al, 2009)

Logic simulation: need multiple value levels & memory

Chalk board discussion:

1. Do mutually repressing genes always result in 2 mutually excluded states?

2. How do the threshold parameters of a logic model relate to biochemistry?

http://gin.univ-mrs.fr/GINsim/
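As a starting point for chalk board question 1, here is a minimal sketch of a two-level (Boolean) logic model of two mutually repressing genes. It is an illustration only and not a GINsim model: the synchronous update rule and the function names `step` and `classify` are assumptions made for this example. Under these assumptions the two mutually exclusive states are fixed points, while the both-off and both-on states keep switching, which may be one reason a logic simulation needs more than two value levels and a memory term.

```python
from itertools import product

def step(state):
    """Synchronous Boolean update: each gene is ON only if its repressor is OFF."""
    x1, x2 = state
    return (not x2, not x1)

def classify():
    """Split the four possible states into fixed points and states that keep changing."""
    fixed, cycling = [], []
    for state in product([False, True], repeat=2):
        (fixed if step(state) == state else cycling).append(state)
    return fixed, cycling

fixed_points, cycling_states = classify()
print("fixed points:", fixed_points)
print("states that keep switching:", cycling_states)
```

With asynchronous updating (one gene at a time) the both-on and both-off states would fall into one of the two fixed points instead of oscillating, so the answer to question 1 depends on the update scheme as well as on the wiring.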

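distance traveled = current position – starting position = speed × time

speed = [(position at time t2) – (position at time t1)] / (t2 – t1)

$$\text{speed} = \left.\frac{\text{position}_2 - \text{position}_1}{\text{time}_2 - \text{time}_1}\right|_{(\text{time}_2 - \text{time}_1)\to 0} = \frac{d(\text{position})}{d(\text{time})}$$

$$\text{acceleration} = \frac{d(\text{speed})}{d(\text{time})} = \frac{d^2(\text{position})}{d(\text{time})^2}$$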

Using Ordinary Differential Equations (ODEs) to model dynamics

[Plot: speed versus time, with time points t1 to t5 marked]

Chalk board discussion: Integration & differentiation
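To make the link between differentiation and integration concrete, the sketch below computes speed from sampled positions by finite differences, then rebuilds the positions by summing speed × time-step. The sample values and the helper names `differentiate` and `integrate` are hypothetical and chosen only for illustration.

```python
def differentiate(times, values):
    """Finite-difference slope between consecutive samples."""
    return [(values[i + 1] - values[i]) / (times[i + 1] - times[i])
            for i in range(len(times) - 1)]

def integrate(times, rates, start):
    """Rebuild values by adding rate * time-step, beginning at `start`."""
    values = [start]
    for i, rate in enumerate(rates):
        values.append(values[-1] + rate * (times[i + 1] - times[i]))
    return values

times = [0.0, 1.0, 2.0, 3.0, 4.0]        # hypothetical sample times
positions = [0.0, 2.0, 6.0, 12.0, 20.0]  # hypothetical positions
speeds = differentiate(times, positions)
recovered = integrate(times, speeds, positions[0])
print(speeds)
print(recovered)
```

Integrating the finite-difference speeds recovers the original positions, which is the discrete counterpart of the statement that integration undoes differentiation.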

Analysis of dynamic network behavior

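Consider a species A produced at rate k1 and degraded at rate k2.A:

[Diagram: → A →, with k1 labeling production and k2 labeling degradation]

We can write

$$\frac{dA}{dt} = k_1 - k_2 \cdot A$$

Which has a simple analytic solution (assuming [A](t=0) = 0):

$$A(t) = \frac{k_1}{k_2}\left(1 - e^{-k_2 t}\right)$$

[Plot: [A] versus t. Initial slope = k1; d[A]/dt > 0 while [A] rises; d[A]/dt = 0 at the plateau, where [A]max = k1/k2]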
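The analytic solution A(t) = (k1/k2)(1 - exp(-k2 t)) of dA/dt = k1 - k2.A can be checked against a numerical integration. The sketch below uses a simple forward-Euler scheme; the rate constants and the step size are hypothetical values chosen only for illustration.

```python
import math

def simulate(k1, k2, t_end, dt=0.001):
    """Forward-Euler integration of dA/dt = k1 - k2*A from A(0) = 0."""
    a = 0.0
    for _ in range(round(t_end / dt)):
        a += dt * (k1 - k2 * a)
    return a

def analytic(k1, k2, t):
    """Closed-form solution A(t) = (k1/k2) * (1 - exp(-k2*t))."""
    return (k1 / k2) * (1.0 - math.exp(-k2 * t))

k1, k2 = 2.0, 0.5  # hypothetical rate constants
for t in (1.0, 5.0, 20.0):
    print(t, simulate(k1, k2, t), analytic(k1, k2, t))
```

Both columns approach k1/k2 at long times, and the initial rise has slope k1, matching the labels on the [A] versus t plot.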

Chalk board discussion: Stable and unstable steady states

[Network diagram, after Laslo et al, Cell 2006, 126:755–766. Nodes: GFi, PU.1, C/EBPa, Egr, NAB2, macrophage genes, neutrophil genes]

[Equations not recoverable: ODEs for d(Gfi)/dt, d(Egr)/dt and d(PU)/dt, with production rates written as products of fractional terms in CEBP, Egr, Gfi and PU]

Steady states in feedback networks

stable steady state 1

stable steady state 2

unstable steady state 1

Chalk board:

mono vs bistability

stability, homeostasis

Mediocristan

polarized/differentiated states,

Extremistan

Laslo et al, Cell 126, 755–766, August 25, 2006

developmental trajectory for macrophages

Graph: a convenient graphing tool. Free at http://www.padowan.dk/graph/

[Equations not recoverable: ODEs for d(Gfi)/dt, d(Egr)/dt and d(PU)/dt, with production rates written as products of fractional terms in CEBP, Egr, Gfi and PU]

What do the fractional terms imply?

Why are all Khalf=kd=1?

Where do the production rate functions come from?

1. Do the fractions in the rates have biochemical meaning?

2. Why use (fraction1)*(fraction2) form for the rates?

- a side-trip into modeling the regulation of transcription

A simple 2-step ODE model of transcription and translation

[Diagram: Y drives transcription of mRNA from NTPs; mRNA is translated into Protein from AAs; mRNA and Protein are each degraded]

$$\frac{d[mRNA]}{dt} = k_t \cdot Y - k_{dm} \cdot mRNA$$

kt is the maximal rate of transcription

$$\frac{d[P]}{dt} = k_s \cdot mRNA - k_{dp} \cdot P$$

ks is the protein synthesis rate per unit of mRNA concentration

[Plot: Y, mRNA and Protein versus time]

At steady-state:

Pss = (ks/kdp).(kt/kdm).Y

Pss ∝ Y (Y is usually set to the Fractional Saturation of TF complex on its DNA binding site)
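The steady-state relation Pss = (ks/kdp).(kt/kdm).Y can be checked by integrating the two-step model numerically. This is a minimal sketch with forward-Euler stepping; all parameter values are hypothetical and the function name `simulate_two_step` is introduced here only for illustration.

```python
def simulate_two_step(kt, kdm, ks, kdp, y, t_end=60.0, dt=0.01):
    """Forward-Euler integration, starting from mRNA = P = 0, of
         d[mRNA]/dt = kt*Y    - kdm*mRNA
         d[P]/dt    = ks*mRNA - kdp*P
    """
    mrna, protein = 0.0, 0.0
    for _ in range(round(t_end / dt)):
        d_mrna = kt * y - kdm * mrna
        d_protein = ks * mrna - kdp * protein
        mrna += dt * d_mrna
        protein += dt * d_protein
    return mrna, protein

kt, kdm, ks, kdp, y = 2.0, 1.0, 3.0, 0.5, 0.4  # hypothetical values
mrna_end, protein_end = simulate_two_step(kt, kdm, ks, kdp, y)
p_ss = (ks / kdp) * (kt / kdm) * y
print(protein_end, p_ss)
```

Doubling Y in this sketch doubles the final protein level, which is the Pss ∝ Y statement in numerical form.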

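An even simpler 1-step ODE model of gene expression

[Diagram: Y drives expression of Protein from Gene, consuming NTPs and AAs]

If we assume rapid mRNA equilibrium, then:

$$\frac{d[mRNA]}{dt} = k_t \cdot Y - k_{dm} \cdot mRNA = 0 \quad\Rightarrow\quad mRNA_{ss} = \frac{k_t}{k_{dm}} \cdot Y$$

$$\frac{d[P]}{dt} = k_s \cdot mRNA_{ss} - k_{dp} \cdot P = \frac{k_s \cdot k_t}{k_{dm}} \cdot Y - k_{dp} \cdot P$$

Let $k_g = k_s \cdot k_t / k_{dm}$ and write $k_{dg}$ for the protein degradation rate $k_{dp}$, then

$$\frac{dP}{dt} = k_g \cdot Y - k_{dg} \cdot P$$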

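Fractional DNA occupancy by one activating factor

[Diagram: Activator (A) binds gene G; transcription produces mRNA; synthesis produces protein P; mRNA and P are each degraded]

$$Y = \text{Fractional DNA Occupancy} = \frac{[DNA{:}A]}{[DNA] + [DNA{:}A]} = \frac{K_A \cdot [DNA] \cdot A}{[DNA] + K_A \cdot [DNA] \cdot A} = \frac{K_A \cdot A}{1 + K_A \cdot A}$$

or equivalently:

$$Y = \frac{A}{K_{DA} + A}, \qquad K_{DA} = \frac{1}{K_A}$$

as before:

$$\frac{d(mRNA)}{dt} = k_t \cdot Y(A) - k_{dm} \cdot mRNA, \qquad \frac{dP}{dt} = k_s \cdot mRNA - k_{dp} \cdot P$$

[Plot: Y versus A, for increasing KA]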

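Transcriptional repression

[Diagram: Repressor (R) binds gene G; transcription produces mRNA; synthesis produces protein P; mRNA and P are each degraded]

$$Y_R = \frac{K_{AR} \cdot R}{1 + K_{AR} \cdot R}$$

(fraction of DNA occupied by R)

(1-YR) = fraction of DNA not occupied by R:

$$Y_{not\,R} = \frac{[DNA]}{[DNA] + K_{AR} \cdot R \cdot [DNA]} = \frac{1}{1 + K_{AR} \cdot R}$$

$$\frac{d(mRNA)}{dt} = k_t \cdot \frac{1}{1 + K_{AR} \cdot R} - k_{dm} \cdot mRNA$$

[Plot: (1-YR) versus R, for increasing KAR]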
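The two single-factor occupancy expressions, Y = KA.A/(1 + KA.A) for an activator and 1 - YR = 1/(1 + KAR.R) for a repressor, are easy to tabulate. The sketch below does so for one hypothetical association constant; the function names are introduced here for illustration only.

```python
def occupancy_activator(a, k_a):
    """Fraction of DNA bound by activator A: K_A*A / (1 + K_A*A)."""
    return k_a * a / (1.0 + k_a * a)

def free_of_repressor(r, k_ar):
    """Fraction of DNA not bound by repressor R: 1 / (1 + K_AR*R)."""
    return 1.0 / (1.0 + k_ar * r)

k = 4.0  # hypothetical association constant
for conc in (0.0, 0.25, 1.0, 10.0):
    print(conc, occupancy_activator(conc, k), free_of_repressor(conc, k))
```

At a concentration of 1/K both functions equal one half, and for the same K and concentration they always sum to 1, because a site is either occupied or free.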

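DNA occupancy by 2 factors

[Diagram: binding scheme among DNA states D, DA, DB and DAB. D + A ⇌ DA (ka, k-a); D + B ⇌ DB (kb, k-b); DA + B ⇌ DAB (kab, k-ab); DB + A ⇌ DAB (kba, k-ba). The bound factors A and B regulate production of mRNA and protein P]

At equilibrium:

$$\frac{d(DA)}{dt} = 0, \qquad \frac{d(DB)}{dt} = 0, \qquad \frac{d(DAB)}{dt} = 0$$

$$\frac{d(DA)}{dt} = k_a [D][A] - k_{-a} [DA] = 0 \;\Rightarrow\; [DA] = \frac{k_a}{k_{-a}} [D][A] = K_A [D][A], \quad \text{likewise } [DB] = K_B [D][B]$$

$$\frac{d(DAB)}{dt} = k_{ab} [DA][B] - k_{-ab} [DAB] = 0 \;\Rightarrow\; [DAB] = \frac{k_{ab}}{k_{-ab}} [DA][B] = K_A K_{AB} [D][A][B]$$

$$\frac{d(DAB)}{dt} = k_{ba} [DB][A] - k_{-ba} [DAB] = 0 \;\Rightarrow\; [DAB] = \frac{k_{ba}}{k_{-ba}} [DB][A] = K_B K_{BA} [D][A][B]$$

$$K_A K_{AB} [D][A][B] = K_B K_{BA} [D][A][B] \;\Rightarrow\; K_A K_{AB} = K_B K_{BA}$$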

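DNA occupancy by 2 factors

[Diagram: binding scheme among DNA states D, DA, DB and DAB. D + A ⇌ DA (ka, k-a); D + B ⇌ DB (kb, k-b); DA + B ⇌ DAB (kab, k-ab); DB + A ⇌ DAB (kba, k-ba). The bound factors A and B regulate production of mRNA and protein P]

In general: KA.KAB = KB.KBA = Kq.KA.KB

Where Kq = cooperativity factor

= 1 if A and B bind independently

$$\frac{d(DA)}{dt} = k_a [D][A] - k_{-a} [DA] = 0 \;\Rightarrow\; [DA] = \frac{k_a}{k_{-a}} [D][A] = K_A [D][A] \quad (\text{likewise: } [DB] = K_B [D][B])$$

$$\frac{d(DAB)}{dt} = k_{ab} [DA][B] - k_{-ab} [DAB] = 0 \;\Rightarrow\; [DAB] = K_{AB} [DA][B] = K_A K_{AB} [D][A][B] = K_q K_A K_B [D][A][B]$$

occupancy = (steady-state level of activating states) / (sum of the steady-state levels of all DNA configurations)

If A & B activate independently:

$$Y = \frac{[DA] + [DB] + [DAB]}{[D] + [DA] + [DB] + [DAB]} = \frac{K_A [A] + K_B [B] + K_A K_B K_q [A][B]}{1 + K_A [A] + K_B [B] + K_A K_B K_q [A][B]}$$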

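For 1 TF multimer:

$$Y = \frac{(K_A \cdot A)^N}{1 + (K_A \cdot A)^N} \quad \text{or} \quad Y = \frac{A^N}{K_{DA}^N + A^N}$$

[Plot: Y versus A for n=1 and n=3]

For 2 highly cooperative factors:

$$Y = \frac{K_A [A] \cdot K_B [B] \cdot K_q}{1 + K_A [A] \cdot K_B [B] \cdot K_q}$$

For 2 cooperative sites (likewise for a homodimer):

$$Y = \frac{K_A^2 K_q [A]^2}{1 + K_A^2 K_q [A]^2}$$

Let $K^2 = K_A^2 K_q$, then:

$$Y = \frac{K^2 [A]^2}{1 + K^2 [A]^2}, \quad \text{or equivalently:} \quad Y = \frac{[A]^2}{K_{dA}^2 + [A]^2}, \quad K_{dA} = \frac{1}{K}$$

For 1 repressor, 1 activator:

$$Y = \frac{K_A [A]}{1 + K_A [A] + K_R [R] + K_A [A] \cdot K_R [R] \cdot K_q}$$

Chalk board: (1-occupancy) for a repressor multimer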
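The effect of the multimer exponent on the steepness of the occupancy curve Y = A^n/(K^n + A^n) can be quantified by asking how large a fold-change in A is needed to move Y from 10% to 90%. The sketch below computes this for n=1 and n=3; the value of K and the helper name `dose_for` are hypothetical and used only for illustration.

```python
def hill(a, k_d, n):
    """Occupancy by an n-mer: A^n / (K_d^n + A^n)."""
    return a ** n / (k_d ** n + a ** n)

def dose_for(y, k_d, n):
    """Invert the occupancy curve: the concentration A that gives occupancy y."""
    return k_d * (y / (1.0 - y)) ** (1.0 / n)

k_d = 1.0  # hypothetical half-saturation constant
folds = {}
for n in (1, 3):
    folds[n] = dose_for(0.9, k_d, n) / dose_for(0.1, k_d, n)
    print(n, folds[n])
```

The required fold-change is 81^(1/n): 81-fold for n=1 but only about 4.3-fold for n=3, which is why higher n gives a more switch-like response.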

Calculating promoter occupancy as a function of specific and non-specific DNA binding rates for two factors

Let $D_N$ = number of non-specific DNA binding sites for factors A & B

Let $D_S$ = number of specific DNA binding sites for factors A & B, then

$$K_{equilibrium\_SA} = \frac{[D_S A]}{[D_S] \cdot [A]}, \qquad K_{equilibrium\_NA} = \frac{[D_N A]}{[D_N] \cdot [A]}$$

$$K_{equilibrium\_NA} = \frac{1}{D_N}, \qquad K_{equilibrium\_NB} = \frac{1}{D_N}$$

Let

$$K_R = \frac{K_{equilibrium\_specific}}{K_{equilibrium\_nonspecific}}, \quad \text{so that} \quad K_{equilibrium\_SA} = K_{RA} \cdot K_{equilibrium\_NA} = \frac{K_{RA}}{D_N}$$

Let $Y_{AB}$ = joint occupancy by A & B of their specific binding sites, then

$Y_{AB}$ = proportion of DNA binding regions occupied by A & B jointly (proportion of all possible DNA configurations at steady state)

For cooperative binding by A and B:

$$Y_{AB} = \frac{\dfrac{K_{RA} A \cdot K_{RB} B \cdot K_q}{D_N^2}}{1 + \dfrac{A}{D_N} + \dfrac{B}{D_N} + \dfrac{K_{RA} A}{D_N} + \dfrac{K_{RB} B}{D_N} + \dfrac{K_{RA} A \cdot K_{RB} B \cdot K_q}{D_N^2}}$$

(note denominator includes terms for naked DNA and non-specifically bound DNA)

$$Y_{AB} = \frac{\dfrac{K_{RA} A \cdot K_{RB} B \cdot K_q}{D_N^2}}{\dfrac{D_N^2 + D_N A + D_N B + K_{RA} D_N A + K_{RB} D_N B + K_{RA} A \cdot K_{RB} B \cdot K_q}{D_N^2}}$$

cancelling $D_N^2$ from the top and bottom gives:

$$Y_{AB} = \frac{K_{RA} A \cdot K_{RB} B \cdot K_q}{D_N^2 + D_N A + D_N B + K_{RA} D_N A + K_{RB} D_N B + K_{RA} A \cdot K_{RB} B \cdot K_q}$$

For example, in embryonic cells of the sea urchin S. purpuratus, typical numbers are:

$D_N$ ~ 1.6E8 sites, nuclear volume ~ 4E-15 L, $D_N$ ~ 7E-2 M, $K_{RA}$, $K_{RB}$ ~ 10^5, A, B ~ 3000 molecules/nucleus (1.7x10^-6 M), which gives $Y_{AB}$ ~ 50%

Bolouri H & Davidson EH, PNAS, 5 August 2003, 100(16):9371-9376.
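The quoted sea urchin numbers can be plugged into the joint-occupancy expression Y_AB = K_RA.A.K_RB.B.K_q / (D_N² + D_N.A + D_N.B + K_RA.D_N.A + K_RB.D_N.B + K_RA.A.K_RB.B.K_q) as a quick sanity check. The sketch below does this with all concentrations in molar units. The slide does not give a value for K_q, so K_q = 1 (independent binding) is an assumption made here, and the function name `joint_occupancy` is introduced only for illustration.

```python
def joint_occupancy(a, b, k_ra, k_rb, k_q, d_n):
    """Joint occupancy Y_AB with non-specific binding; concentrations in mol/L."""
    joint = k_ra * a * k_rb * b * k_q
    denominator = (d_n ** 2 + d_n * a + d_n * b
                   + k_ra * d_n * a + k_rb * d_n * b + joint)
    return joint / denominator

y_ab = joint_occupancy(
    a=1.7e-6, b=1.7e-6,   # A, B ~ 3000 molecules/nucleus as quoted on the slide
    k_ra=1e5, k_rb=1e5,   # relative specificities K_RA, K_RB ~ 10^5
    k_q=1.0,              # assumption: no cooperativity
    d_n=7e-2,             # non-specific site concentration D_N ~ 7E-2 M
)
print(round(y_ab, 3))
```

Under these assumptions the result is close to one half, in line with the Y_AB ~ 50% quoted for these parameter values.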

Example model fit to sea urchin data – with added transcription initiation step

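Reducing the number of unknown parameters – a technique to simplify model exploration

$$\frac{dG_1}{dt} = k_{t1} \cdot \frac{1}{1 + K_{A2}^{N_2} G_2^{N_2}} - k_{d1} \cdot G_1, \qquad \frac{dG_2}{dt} = k_{t2} \cdot \frac{1}{1 + K_{A1}^{N_1} G_1^{N_1}} - k_{d2} \cdot G_2$$

rewrite as:

$$\frac{dG_1}{dt} = k_{t1} \cdot \frac{1}{1 + \left(\dfrac{G_2}{K_{Diss2}}\right)^{N_2}} - k_{d1} \cdot G_1, \qquad \frac{dG_2}{dt} = k_{t2} \cdot \frac{1}{1 + \left(\dfrac{G_1}{K_{Diss1}}\right)^{N_1}} - k_{d2} \cdot G_2$$

If we measure G in units of $K_{Diss}$, then

$$\frac{dG_1}{dt} = k_{t1} \cdot \frac{1}{1 + G_2^{N_2}} - k_{d1} \cdot G_1, \qquad \frac{dG_2}{dt} = k_{t2} \cdot \frac{1}{1 + G_1^{N_1}} - k_{d2} \cdot G_2$$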

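Mutual repression

$$\frac{dG_1}{dt} = k_{t1} \cdot \frac{1}{1 + G_2^{N_2}} - k_{d1} \cdot G_1, \qquad \frac{dG_2}{dt} = k_{t2} \cdot \frac{1}{1 + G_1^{N_1}} - k_{d2} \cdot G_2$$

[Plot: gene 1 expression (vertical axis) versus gene 2 expression (horizontal axis)]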

Mutual repression

[Plots: axes kt1/kd1 and kt2/kd2; panels for cooperativity factor = 2, 3 and 4]

For exploratory exercises, see the Lab Notes handout
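As a complement to the lab exercises, here is a minimal simulation sketch of the mutual repression model in its reduced form, dG1/dt = kt/(1 + G2^N) - kd.G1 and dG2/dt = kt/(1 + G1^N) - kd.G2. The parameter values (kt = 5, kd = 1, N = 2, identical for both genes), the forward-Euler integration and the function name `simulate_toggle` are assumptions made for this example and are not taken from the Lab Notes handout.

```python
def simulate_toggle(g1, g2, kt=5.0, kd=1.0, n=2, t_end=50.0, dt=0.01):
    """Forward-Euler integration of two mutually repressing genes
    (expression levels measured in units of K_Diss)."""
    for _ in range(round(t_end / dt)):
        d1 = kt / (1.0 + g2 ** n) - kd * g1
        d2 = kt / (1.0 + g1 ** n) - kd * g2
        g1 += dt * d1
        g2 += dt * d2
    return g1, g2

gene1_ahead = simulate_toggle(1.0, 0.9)  # gene 1 starts slightly higher
gene2_ahead = simulate_toggle(0.9, 1.0)  # gene 2 starts slightly higher
print(gene1_ahead)
print(gene2_ahead)
```

With these values a small initial advantage is amplified until one gene is high and the other low, so the final state records which gene started ahead. Lowering kt or setting n = 1 in this sketch is a quick way to explore when the two exclusive states disappear.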

Chalk board discussion: - change-direction arrows - how inputs set the state

x1=off,x2=on

Inputs=0

x1=on,x2=off

Inputs=0

Controlling the state of a mutual repression switch with 2 independent activating inputs

x1=off,x2=on

Input1=0, Input2=0.5

x1=high, x2=low

Input1=0.75, Input2=0.5

k1,k2=5

Cf. the memory term in our earlier logic model. But what feature of the system creates the memory?

Hysteresis

Two autoregulatory positive feedback loops maintain PU.1

Steady states in feedback networks

gene G

conceptual model:

$$\frac{dG}{dt} = k_t \cdot \frac{G}{K + G} - k_d \cdot G$$

[Plot: rate versus G, showing the rate of production and rate of clearance curves, which cross at a stable steady state]

rate of clearance > rate of production: G will decrease over time

rate of production > rate of clearance: G will increase over time

At steady state, production rate = clearance rate

Steady states in feedback networks

gene G

conceptual model:

$$\frac{dG}{dt} = k_t \cdot \frac{G^N}{K^N + G^N} - k_d \cdot G$$

[Plot: rate versus G (G axis 1 to 4, rate axis 0.25 to 1.25), showing the rate of production and rate of clearance curves, which cross at stable steady state 1, unstable steady state 1 and stable steady state 2]

rate of clearance > rate of production: G will decrease over time

rate of production > rate of clearance: G will increase over time

At steady states, production rate = clearance rate
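The two stable steady states and the unstable one between them can be located numerically for an autoregulated gene with dG/dt = kt.G^N/(K^N + G^N) - kd.G. In the sketch below the parameter values (kt = 2.5, K = 1, kd = 1, N = 2) are hypothetical and chosen so that the steady states fall at G = 0, 0.5 and 2; the function names are introduced here for illustration.

```python
def rate(g, kt=2.5, k=1.0, kd=1.0, n=2):
    """Net rate dG/dt = production - clearance for an autoregulated gene."""
    return kt * g ** n / (k ** n + g ** n) - kd * g

def settle(g, t_end=50.0, dt=0.01):
    """Forward-Euler integration from the initial level g until t_end."""
    for _ in range(round(t_end / dt)):
        g += dt * rate(g)
    return g

for start in (0.4, 0.6, 3.0):
    print(start, round(settle(start), 3))
```

Starting just below G = 0.5 the gene shuts off, while starting just above it the gene locks on at G = 2, so the unstable steady state acts as the threshold between the two stable ones.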

gene G

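Two ways of providing input:

(1) Independent activating input: add a second occupancy term:

$$\frac{dG}{dt} = k_{t1} \cdot \frac{G^N}{K_G^N + G^N} + k_{t2} \cdot \frac{i^n}{K_i^n + i^n} - k_d \cdot G$$

(2) Input is (or acts on) the same protein as feedback: add to G in the occupancy term:

$$\frac{dG}{dt} = k_t \cdot \frac{(G + i)^N}{K^N + (G + i)^N} - k_d \cdot G$$

[Plot: rate of production and rate of clearance versus G, with the effect of the input on the production curve]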

Auto-regulation: hysteresis & bistable lock-on switches

[Plot: mRNASS or PSS versus input]

[Plot: rate versus G, for increasing KDiss and for reducing input]

And:

[Plot: PU.1 at steady state versus Gata1 at steady state]
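Hysteresis in an autoregulated gene can be illustrated by slowly sweeping the input up and then back down. The sketch below uses the form in which the input i is added to G inside the occupancy term, dG/dt = kt.(G+i)^N/(K^N + (G+i)^N) - kd.G. The parameter values, the sweep range and the function names `settle` and `sweep` are assumptions made for this example.

```python
def settle(g, i, kt=2.5, k=1.0, kd=1.0, n=2, t_end=50.0, dt=0.01):
    """Let G relax at a fixed input i (forward-Euler integration)."""
    for _ in range(round(t_end / dt)):
        x = (g + i) ** n
        g += dt * (kt * x / (k ** n + x) - kd * g)
    return g

def sweep(inputs, g_start):
    """Step through the inputs in order, carrying G over from one step to the next."""
    g, trace = g_start, []
    for i in inputs:
        g = settle(g, i)
        trace.append((i, g))
    return trace

levels = [j / 10 for j in range(11)]   # input from 0.0 to 1.0
up = sweep(levels, 0.0)                # increasing input, starting with G off
down = sweep(levels[::-1], up[-1][1])  # decreasing input, starting from the on state
print(up[0], up[-1])
print(down[-1])
```

In this sketch G is off at zero input on the way up, but remains on when the input returns to zero on the way down. That difference between the two sweeps is the hysteresis, and the persistence of the on state is the lock-on behaviour.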

Where might the nonlinearities come from?

Additional feedback loops:

Where might the nonlinearities come from?

Facilitation by ‘pioneer’ factors

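Simple model:

D + T ⇌ DT, DT + T ⇌ DTT

where D=DNA, T= transcription factor,

at steady state:

$$[DT] = \frac{[D] \cdot [T]}{K_D}, \qquad [DTT] = \frac{[DT] \cdot [T]}{K_q K_D} = \frac{[D] \cdot [T]^2}{K_q K_D^2}$$

$$Y = \frac{[DTT]}{[D] + [DT] + [DTT]} = \frac{\dfrac{[T]^2}{K_q K_D^2}}{1 + \dfrac{[T]}{K_D} + \dfrac{[T]^2}{K_q K_D^2}}$$

sigmoid DNA occupancy curve:

[Plot: Y versus T]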

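At steady state:

[Diagram: signal S drives the interconversion R ⇌ RP]

If Km1/RT, Km2/RT ≪ 1, response is sigmoidal

[Plot panels labeled with other values of Km1/RT and Km2/RT; the values are not legible]

Other cases:

[Diagrams: further variants of the modification cycle (labels R, RP, RPP, S)]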

Where might the nonlinearities come from?

See also http://en.wikipedia.org/wiki/Goldbeter-Koshland_kinetics

steady state locus of A

steady state locus of B

A

B

Time (A.U.)

Portrait of the state space

Simulation of 100X100 array of cells with autocrine signaling pathway.

- 2x2 patch of activated cells at time zero: activity dies out

- 10X10 patch of activated cells at time zero: activity restricted to one patch

- There is a minimum cluster-size requirement for activation (activity stabilizes above it, dies out below it)

- Randomly distributed 1% of cells activated at time zero: activity dies out

- Randomly distributed 10% of cells activated at time zero: activity spreads

GFi

PU.1

Ikaros

Egr

NAB2

Macrophage genes B cell genes

Spooner et al, Immunity 2009, 31:576–586

Ids

E2A

Early activator