23. Lecture WS 2005/06
Bioinformatics III 1
V23 Stochastic simulations of cellular signalling
Traditional computational approach to chemical/biochemical kinetics:
(a) start with a set of coupled ODEs (reaction rate equations) that describe the
time-dependent concentration of chemical species,
(b) use some integrator to calculate the concentrations as a function of time given
the rate constants and a set of initial concentrations.
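Steps (a) and (b) can be sketched for a single reversible reaction X <-> Y + Z. The reaction, the rate constants, the initial concentrations and the fixed-step fourth-order Runge-Kutta integrator are illustrative assumptions; the lecture does not prescribe a particular integrator.

```python
def rate_equations(conc, k1, k2):
    """Right-hand side of the rate equations for X <-> Y + Z (mass action)."""
    x, y, z = conc
    net_flux = k1 * x - k2 * y * z  # forward minus reverse reaction rate
    return (-net_flux, net_flux, net_flux)


def integrate_rk4(conc, k1, k2, dt, n_steps):
    """Propagate concentrations with a fixed-step fourth-order Runge-Kutta scheme."""

    def shifted(c, slope, h):
        return tuple(ci + h * si for ci, si in zip(c, slope))

    trajectory = [tuple(conc)]
    for _ in range(n_steps):
        s1 = rate_equations(conc, k1, k2)
        s2 = rate_equations(shifted(conc, s1, dt / 2), k1, k2)
        s3 = rate_equations(shifted(conc, s2, dt / 2), k1, k2)
        s4 = rate_equations(shifted(conc, s3, dt), k1, k2)
        conc = tuple(
            c + dt / 6 * (a + 2 * b + 2 * d + e)
            for c, a, b, d, e in zip(conc, s1, s2, s3, s4)
        )
        trajectory.append(conc)
    return trajectory


# Hypothetical parameters: start with pure X and relax towards equilibrium.
trajectory = integrate_rk4((1.0, 0.0, 0.0), k1=1.0, k2=1.0, dt=0.01, n_steps=2000)
x_end, y_end, z_end = trajectory[-1]
```

The concentrations here are continuous real numbers, which is exactly the assumption that breaks down when only a handful of molecules are present.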
Successful applications : studies of yeast cell cycle, metabolic engineering,
whole-cell scale models of metabolic pathways (E-cell), ...
Major problem: cellular processes occur in very small volumes and frequently
involve very small numbers of molecules.
E.g. in gene expression processes a few TF molecules may interact with a single
gene regulatory region.
E. coli cells contain on average only 10 molecules of Lac repressor.
Include stochastic effects
(Consequence1) modeling of reactions as continuous fluxes of matter is no
longer correct.
(Consequence2) Significant stochastic fluctuations occur.
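A rough feeling for the size of these fluctuations can be obtained under the assumption (not made in the lecture) that the copy number of a species is Poisson distributed, so that its standard deviation is the square root of its mean. The function name below is hypothetical.

```python
import math


def relative_fluctuation(mean_copy_number):
    """Standard deviation divided by mean for a Poisson-distributed copy number."""
    return 1.0 / math.sqrt(mean_copy_number)


few = relative_fluctuation(10)      # e.g. a species present in about 10 copies
many = relative_fluctuation(10**6)  # e.g. an abundant metabolite
```

Under this assumption a species with 10 copies fluctuates by roughly 30 % of its mean, whereas a species with a million copies fluctuates by about 0.1 %.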
To study the stochastic effects in biochemical reactions stochastic formulations of
chemical kinetics and Monte Carlo computer simulations have been used.
Daniel Gillespie (J Comput Phys 22, 403 (1976); J Phys Chem 81, 2340 (1977))
introduced the exact Dynamic Monte Carlo (DMC) method that connects the
traditional chemical kinetics and stochastic approaches.
Assuming that the system is well mixed, the rate constants appearing in these two
methods are related.
Dynamic Monte Carlo
In the usual implementation of DMC for kinetic simulations, each reaction is
considered as an event and each event has an associated probability of occurring.
The probability P(Ei) that a certain chemical reaction Ei takes place in a given time
interval Δt is proportional to an effective rate constant k and to the number of
molecules of the chemical species that can take part in that event.
E.g. the probability of the first-order reaction
X → Y + Z
would be k1 NX, with NX: number of molecules of species X, and
k1: rate constant of the reaction.
Similarly, the probability of the reverse second-order reaction
Y + Z → X
would be k2 NY NZ.
Resat et al., J.Phys.Chem. B 105, 11026 (2001)
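The two examples above follow one pattern: rate constant times the molecule numbers of all reactants. A minimal sketch, in which the function name, the rate constants and the molecule numbers are hypothetical; it assumes that every reactant species appears only once in a reaction (a reaction such as X + X would need a different counting factor).

```python
def event_probability(rate_constant, reactants, counts):
    """Rate constant times the molecule numbers of all reactant species."""
    p = rate_constant
    for species in reactants:
        p *= counts[species]
    return p


counts = {"X": 50, "Y": 3, "Z": 4}  # hypothetical molecule numbers
p_forward = event_probability(0.2, ["X"], counts)        # k1 * N_X
p_reverse = event_probability(0.1, ["Y", "Z"], counts)   # k2 * N_Y * N_Z
```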
Dynamic Monte Carlo
As the method is a probabilistic approach based on "events", the "reactions" included in
the DMC simulations do not have to be solely chemical reactions.
Any process that can be associated with a probability can be included as an event
in the DMC simulations.
E.g. a substrate attaching to a solid surface can initiate a series of chemical
reactions.
One can split the modelling into the physical events of substrate arrival, of
attaching the substrate, followed by the chemical reaction steps.
Resat et al., J.Phys.Chem. B 105, 11026 (2001)
Basic outline of the direct method of Gillespie
(Step i) generate a list of the components/species and define the initial distribution
at time t = 0.
(Step ii) generate a list of possible events Ei (chemical reactions as well as
physical processes).
(Step iii) using the current component/species distribution, prepare a probability
table P(Ei) of all the events that can take place.
Compute the total probability
$P_{tot} = \sum_i P(E_i)$
where P(Ei) is the probability of event Ei.
(Step iv) Pick two random numbers r1 and r2 in [0, 1] to decide which event E will
occur next and how much time elapses between the most recent event and E.
Resat et al., J.Phys.Chem. B 105, 11026 (2001)
Basic outline of the direct method of Gillespie
Using the random number r1 and the probability table,
the event E is determined by finding the event that satisfies the relation
$\sum_{j=1}^{i-1} P(E_j) < r_1 P_{tot} \le \sum_{j=1}^{i} P(E_j)$
The second random number r2 is used to obtain the amount of time between the
reactions:
$\Delta t = \frac{1}{P_{tot}} \ln \frac{1}{r_2}$
As the total probability of the events changes in time, the time step between
successive events varies.
Steps (iii) and (iv) are repeated at each step of the simulation.
The necessary number of runs depends on the inherent noise of the system and on
the desired statistical accuracy.
Resat et al., J.Phys.Chem. B 105, 11026 (2001)
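Steps (i) to (iv) can be put together in a short sketch. The reaction list format, the function names and all numerical values are assumptions made for illustration; the selection rule and the time step follow the relations for r1 and r2 given above, with the time step written as -ln(r2)/Ptot.

```python
import math
import random


def pick_event(probs, r1):
    """Return the first index whose cumulative probability exceeds r1 * P_tot."""
    threshold = r1 * sum(probs)
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if threshold < cumulative:
            return i
    # Guard against floating-point round-off: last event that is possible at all.
    return max(i for i, p in enumerate(probs) if p > 0)


def gillespie_direct(counts, reactions, t_end, rng):
    """reactions: list of (rate constant, reactant species, product species)."""
    counts = dict(counts)                    # step i: initial distribution
    t = 0.0
    trajectory = [(t, dict(counts))]
    while t < t_end:
        probs = []                           # step iii: probability table
        for k, reactants, _ in reactions:
            p = k
            for s in reactants:
                p *= counts[s]
            probs.append(p)
        p_tot = sum(probs)
        if p_tot == 0.0:                     # no event can occur any more
            break
        r1 = rng.random()                    # step iv: two random numbers
        r2 = 1.0 - rng.random()              # lies in (0, 1], avoids log(0)
        chosen = pick_event(probs, r1)
        t += -math.log(r2) / p_tot
        _, reactants, products = reactions[chosen]
        for s in reactants:
            counts[s] -= 1
        for s in products:
            counts[s] += 1
        trajectory.append((t, dict(counts)))
    return trajectory


# Step ii: hypothetical event list for X -> Y + Z and Y + Z -> X.
reactions = [(1.0, ["X"], ["Y", "Z"]), (0.01, ["Y", "Z"], ["X"])]
run = gillespie_direct({"X": 100, "Y": 0, "Z": 0}, reactions, 5.0, random.Random(0))
```

Each call produces one stochastic trajectory; ensemble averages require repeating the run with different random seeds.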
Weighted Sampling
In the commonly used MC algorithm, the Markov chain is generated using
transition probabilities π(i → j) that are based on the physical probability
distribution:
$\pi(i \to j) = \frac{P_{i \to j}}{\sum_k P_{i \to k}}$
The ensemble average of any physical quantity θ is obtained by taking the
arithmetic average of all the n simulation runs:
$\langle \theta \rangle = \frac{1}{n} \sum_{i=1}^{n} \theta_i$
The individual averages θi could e.g. be time-averages over the simulation run.
This choice disfavors the transitions with low probabilities.
If the system characteristics depend on the events that happen less frequently,
then the common implementation of MC requires extremely lengthy simulations to
acquire enough statistical sampling.
Resat et al., J.Phys.Chem. B 105, 11026 (2001)
Weighted Sampling
This statistical sampling problem can be avoided if the probability distribution is
multiplied with a weight function that adjusts the sampling probability distribution
such that the low probability parts of the sampling space are visited more often.
In the case of weighted sampling, the Markov chain is generated by using the
modified probability distribution function
$\pi_w(i \to j) = \pi(i \to j) \, Y(i \to j)$
where Y is the biasing weight function.
Since the probability of the transition i → j is weighted with Y(i → j), the
ensemble average of a physical quantity θ is obtained by computing the
average of θ / Y.
Division of θ by Y effectively corrects for the bias introduced in the sampling
probability distribution.
Resat et al., J.Phys.Chem. B 105, 11026 (2001)
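The effect of the correction can be checked on a toy problem with only two outcomes. This is a static illustration and not a Markov chain simulation; the probabilities, the weight function Y and all names are hypothetical, and Y is chosen such that the weighted probabilities again sum to one.

```python
import random


def weighted_average(theta, pi, Y, n_samples, rng):
    """Sample from pi * Y and average theta / Y to undo the bias."""
    states = list(pi)
    pi_w = [pi[s] * Y[s] for s in states]   # biased sampling distribution
    samples = rng.choices(states, weights=pi_w, k=n_samples)
    return sum(theta[s] / Y[s] for s in samples) / n_samples


pi = {"common": 0.999, "rare": 0.001}        # physical probabilities
theta = {"common": 0.0, "rare": 1.0}         # quantity that depends on the rare outcome
Y = {"common": 0.5 / 0.999, "rare": 500.0}   # bias: both outcomes sampled equally often

estimate = weighted_average(theta, pi, Y, 20000, random.Random(0))
```

The exact average is 0.001. With unbiased sampling only about 20 of the 20000 samples would hit the rare outcome, whereas the biased sampling visits it about half of the time and the division by Y restores the correct average.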
Probability-Weighted DMC
Probability-weighted DMC incorporates weighted sampling into DMC.
Steps (iii) and (iv) of the DMC algorithm are replaced by
(Step iii') Using the current component/species distribution, prepare a probability
table P(Ei) of all the events Ei that can take place.
(Step iv') Define the weight factor scale and compute the inverse probability weight
table
$w(E) = 1 / Y(E)$
for all events.
Note that the stochastic simulations mentioned here use discrete numbers of
molecules, i.e. the species are produced and consumed as whole integer units.
Therefore, the weight table w(E) must contain only integer values.
Resat et al., J.Phys.Chem. B 105, 11026 (2001)
Probability-Weighted DMC
(Step v') Prepare the weighted probability table
$P_w(E_i) = P(E_i) / w(E_i)$
(Step vi') Compute the total probability by summing the weighted probabilities of all
individual events:
$P_{tot} = \sum_i P_w(E_i)$
(Step vii') Pick two random numbers r1, r2 in [0, 1].
Determine which event E occurs next as before using r1.
(Step viii') Propagate the time as before using r2.
The speed-up achieved by the PW-DMC algorithm stems from the fact that the
reactions with large probabilities are allowed to occur in "bundles".
Resat et al., J.Phys.Chem. B 105, 11026 (2001)
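The modified steps can be sketched as follows. Two points are assumptions of this sketch rather than statements from the lecture: a selected event with integer weight w is executed w times at once (one reading of "bundles"), and the weight of a reaction falls back to 1 whenever a reactant has fewer than w molecules left. The function names, bundle sizes and rate constants are hypothetical, and each species is assumed to appear at most once among the reactants of a reaction.

```python
import math
import random


def pick_event(probs, r1):
    """Return the first index whose cumulative probability exceeds r1 * P_tot."""
    threshold = r1 * sum(probs)
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if threshold < cumulative:
            return i
    # Guard against floating-point round-off: last event that is possible at all.
    return max(i for i, p in enumerate(probs) if p > 0)


def pw_dmc(counts, reactions, bundle_sizes, t_end, rng):
    """Probability-weighted DMC sketch; bundle_sizes holds one integer weight per reaction."""
    counts = dict(counts)
    t = 0.0
    n_decisions = 0
    while t < t_end:
        probs, weights = [], []
        for (k, reactants, _), bundle in zip(reactions, bundle_sizes):
            p = k                                    # step iii': P(E_i)
            for s in reactants:
                p *= counts[s]
            # step iv': integer weight, reduced to 1 if molecules run short
            w = bundle if all(counts[s] >= bundle for s in reactants) else 1
            probs.append(p)
            weights.append(w)
        weighted = [p / w for p, w in zip(probs, weights)]   # step v'
        p_tot = sum(weighted)                                # step vi'
        if p_tot == 0.0:
            break
        r1 = rng.random()                                    # step vii'
        r2 = 1.0 - rng.random()                              # lies in (0, 1]
        chosen = pick_event(weighted, r1)
        t += -math.log(r2) / p_tot                           # step viii'
        _, reactants, products = reactions[chosen]
        for s in reactants:                                  # execute the whole bundle
            counts[s] -= weights[chosen]
        for s in products:
            counts[s] += weights[chosen]
        n_decisions += 1
    return counts, t, n_decisions


reactions = [(1.0, ["X"], ["Y", "Z"]), (0.001, ["Y", "Z"], ["X"])]
final, t_final, n_decisions = pw_dmc(
    {"X": 1000, "Y": 0, "Z": 0}, reactions, [10, 1], 5.0, random.Random(1)
)
```

Because a bundled event is selected w times less often but converts w molecules each time, the mean conversion rate is unchanged while fewer random-number decisions are needed.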
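Comparison of DMC and PW-DMC
DMC is essentially a method to solve the master equation that rules how the
probabilities of the configurations are related to each other:
$\frac{dP_\alpha}{dt} = \sum_\beta \left[ W_{\beta\alpha} P_\beta - W_{\alpha\beta} P_\alpha \right]$
$W_{\alpha\beta}$: transition probability of going from configuration α to β
$P_\alpha$: probability of configuration α.
Using the master equation, the rate of change of the statistical average ⟨X⟩ of the
property X can be expressed as:
$\frac{d\langle X \rangle}{dt} = \sum_{\alpha,\beta} W_{\alpha\beta} P_\alpha \left( X_\beta - X_\alpha \right)$
In PW-DMC, this relation is rearranged using the weight factor w as
$\frac{d\langle X \rangle}{dt} = \sum_{\alpha,\beta} \frac{W_{\alpha\beta} P_\alpha}{w} \, w \left( X_\beta - X_\alpha \right)$
PW-DMC leaves the ensemble averages unchanged.
However, the fluctuations increase with w.
Resat et al., J.Phys.Chem. B 105, 11026 (2001)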
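The step from the master equation to the rate of change of the average can be spelled out. This is a sketch of the standard argument, not a quotation from Resat et al. It writes α and β for configurations, W_{αβ} for the transition probability from α to β, assumes ⟨X⟩ = Σ_α X_α P_α, and swaps the summation labels α and β in the gain term.

```latex
\begin{aligned}
\frac{d\langle X \rangle}{dt}
  &= \sum_\alpha X_\alpha \frac{dP_\alpha}{dt}
   = \sum_{\alpha,\beta} X_\alpha \left( W_{\beta\alpha} P_\beta - W_{\alpha\beta} P_\alpha \right) \\
  &= \sum_{\alpha,\beta} W_{\alpha\beta} P_\alpha X_\beta
   - \sum_{\alpha,\beta} W_{\alpha\beta} P_\alpha X_\alpha \\
  &= \sum_{\alpha,\beta} W_{\alpha\beta} P_\alpha \left( X_\beta - X_\alpha \right)
   = \sum_{\alpha,\beta} \frac{W_{\alpha\beta} P_\alpha}{w} \, w \left( X_\beta - X_\alpha \right)
\end{aligned}
```

The last equality only inserts w/w = 1: each transition is attempted w times less often but changes X by w times as much, so the mean is preserved while the individual jumps, and hence the fluctuations, become larger.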
Protein dynamics
time scale            process
10 fs = 10^-14 s      fastest bond vibrations,
                      duration of the catalytic step of a chemical reaction
1 ps = 10^-12 s       rotational correlation time of a water molecule,
                      frequency of ring flips of Tyr and Phe rings
< 1 ns = 10^-9 s      life-times of hydrogen bonds
1 ns – 1 µs           dynamics of protein loops, protein-protein association
1 µs – 1 ms           crossing of membrane?
1 ms – 1 s            protein folding/unfolding
protein diffusion: 10^-5 cm^2 s^-1
Time scales covered by various methods
method                 time scale      maximal system size
Molecular Dynamics     1 ns – 1 µs     100,000 atoms = (10 nm)^3
Brownian Dynamics      1 µs – 1 ms     100 rigid proteins = (100 nm)^3
Random Walk            1 µs – 1 ms
Diffusion equation     1 µs – 1 ms     cell subcompartments (1 – 10 µm)^3
(e.g. Virtual Cell)
Dynamic Monte Carlo    1 ns – 1 s      10,000 reactions
Network models         no time scale   no length scale, 10^6 nodes
Epidermal growth factor receptor signaling pathway
The EGFR signaling pathway is one of the most important pathways that regulate
growth, survival, proliferation, and differentiation in mammalian cells.
An international consortium has assembled a comprehensive pathway map including
- EGFR endocytosis followed by its degradation or recycling,
- small GTPase-mediated signal transduction such as the MAPK cascade, PIP
signaling, cell cycle, and GPCR-mediated EGFR transactivation via intracellular
Ca2+ signalling.
Map includes 211 reactions and 322 species taking part in reactions.
Species: 202 proteins, 3 ions, 21 simple molecules, 73 oligomers, 7 genes, 7 RNAs.
Proteins: 122 molecules including 10 ligands, 10 receptors, 61 enzymes (including 32 kinases), 3 ion
channels, 10 transcription factors, 6 G protein subunits, 22 adaptor proteins.
Reactions: 131 state transitions, 34 transportations, 32 associations, 11 dissociations, 2 truncations.
Oda et al. Mol.Syst.Biol. 1 (2005)
Architecture of signaling network: bow-tie structure
Oda et al. Mol.Syst.Biol. 1 (2005)
Network control
Several system controls define the overall behavior of the signaling network:
- 2 positive feedback loops
- Pyk2/c-Src activates ADAMs, which shed pro-HB-EGF so that the
amount of HB-EGF will be increased and enhance the signalling
- active PLCβ/γ produces DAG, which results in the cascading activation
of protein kinase C (PKC), phospholipase D, and PI5 kinase.
- 6 negative feedback loops
- inhibitory feed-forward paths
There are also a few positive and negative feedback loops that affect ErbB
pathway dynamics.
Oda et al. Mol.Syst.Biol. 1 (2005)
Process diagram
Oda et al. Mol.Syst.Biol. 1 (2005)
Modification and localization of proteins
Oda et al. Mol.Syst.Biol. 1 (2005)
Precise association states between EGFR and adaptors
Oda et al. Mol.Syst.Biol. 1 (2005)
Ellipsis in drawing association states of proteins using an ‘address’. (A) Precise association states between EGFR and adaptors. Three adaptor proteins, Shc, Grb2, and Gab1, bind to the activated EGFR via its autophosphorylated tyrosine residues. Shc binds to activated EGFR and is phosphorylated on its tyrosine 317. Grb2 binds to activated EGFR either directly or via Shc bound to activated EGFR. Gab1 also binds to activated EGFR either directly or via Grb2 bound to activated EGFR, and is phosphorylated on its tyrosine 446, 472, and 589.
Integrated PW-DMC Model of Epidermal Growth Factor Receptor Trafficking and Signal Transduction
Cells of living organisms sense their environment and respond to
environmental stimuli.
Cellular signaling mechanisms govern how information
from the environment is decoded, processed and transferred to the appropriate
locations within the cell.
Signaling through the receptor tyrosine kinase (RTK) family of receptors regulates
a wide range of biological phenomena, including cell proliferation and
differentiation.
Figure: diagram showing the compartments involved in receptor trafficking and the
receptor movement pathways within the cell.
Resat et al. Biophys Journal 85, 730 (2003)
Integrated Model of Epidermal Growth Factor Receptor Trafficking and Signal Transduction
Signaling pathways of various RTKs are reasonably well characterized.
Common features:
- receptor self-phosphorylation on tyrosine residues
- subsequent interaction with molecules containing SH2 or phosphotyrosine-binding
(PTB) domains.
The signal from the receptor is transmitted to downstream effector molecules
through a series of protein-protein interactions, such as the MAP kinase cascade.
Resat et al. Biophys Journal 85, 730 (2003)
Integrated Model of Epidermal Growth Factor Receptor Trafficking and Signal Transduction
The EGF receptor can be activated by the
binding of any one of a number of different
ligands.
Each ligand stimulates a somewhat different
spectrum of biological responses.
The effects of different ligands on EGFR
activity are quite similar at a biochemical level;
the mechanisms responsible for their
differential effects on cellular responses are
unknown.
After binding of any of its ligands, EGFR is
rapidly internalized by endocytosis.
Resat et al. Biophys Journal 85, 730 (2003)
Integrated Model of Epidermal Growth Factor Receptor Trafficking and Signal Transduction
Different EGFR ligands vary in their ability to bind to EGFR as a function of
receptor microenvironment such as intravesicular pH.
After endocytosis, receptor-ligand complexes pass through several different
compartments that vary in their intravesicular milieu.
Receptor movement among cellular compartments ("receptor trafficking") can
exert a significant effect on the activity of the complexes.
The different intracellular compartments also vary in their access to some of the
substrates of the EGFR kinase.
This coupled relationship between substrate access and ligand-dependent
activity in different endocytic compartments suggests that trafficking could
function to "decode" the information unique to each ligand.
Resat et al. Biophys Journal 85, 730 (2003)
3 functions of trafficking
(1) controlling the magnitude of the signal
(2) controlling the specificity of the response
(3) controlling the duration of the response.
Understanding the relative contribution of these 3 aspects for any given
combination of cells, conditions, and ligands is very difficult.
Therefore: use computational models!
Resat et al. Biophys Journal 85, 730 (2003)
Computational modelling of EGF receptor system
(1) trafficking and ligand-induced endocytosis
(2) signaling through Ras or MAP kinases
This work combines both aspects into a single model.
Most approaches to building computational kinetic models have severe
drawbacks when representing spatially heterogeneous processes on a cellular
scale.
Review: In the traditional approach, we
- formulate set of coupled ODEs (reaction rate equations) for the time-dependent
concentration of chemical species
- use integrator to propagate the concentrations as a function of time given the
rate constants and a set of initial concentrations.
Resat et al. Biophys Journal 85, 730 (2003)
Multiple time scale problem
In Dynamic Monte Carlo, reactions are considered events that occur with certain
probabilities over set intervals of time.
The event probabilities depend on the rate constant of the reaction and on the
number of molecules participating in the reaction.
In many interesting natural problems, the time scales of the events are spread
over a large spectrum.
Therefore it is very inefficient to treat all processes at the time scale of the fastest
individual reaction.
In the EGFR signaling network,
- receptor phosphorylation after ligand binding occurs almost instantaneously
- vesicle formation or sorting to lysosomes requires many minutes.
Resat et al. Biophys Journal 85, 730 (2003)
Solution to multiple time scale problem
Computing millions or billions of uncorrelated random numbers can become a
time-consuming process.
Resat et al. (2001) introduced Probability-Weighted DMC to speed up the
simulation by a factor of 20 – 100.
Different processes are tested at different intervals depending on their
probabilities:
for very unlikely processes the MC decision is computed only very infrequently.
Resat et al. Biophys Journal 85, 730 (2003)
Signal transduction model of EGF receptor signaling pathway
Resat et al. Biophys Journal 85, 730 (2003)
Species in the EGF receptor signaling model
Resat et al. Biophys Journal 85, 730 (2003)
Receptor and ligand group definitions
Resat et al. Biophys Journal 85, 730 (2003)
Rate constants of the ligand:receptor interactions
Resat et al. Biophys Journal 85, 730 (2003)
Early endosome inclusion coefficients
Resat et al. Biophys Journal 85, 730 (2003)
These are adjusted to yield the experimentally determined rates of
ligand-free and ligand-bound receptor internalization.
Time course of phosphorylated EGF receptors
(a) Total number of phosphorylated EGF
receptors in the cell. Curves represent the
number of activated receptors when the cell is
stimulated with different ligand doses at the
beginning. The y axis represents the number of
receptors in thousands.
(b) Ratio of the number of phosphorylated
receptors that are internalized to that of the
phosphorylated surface receptors.
(c) Ratio of the number of internalized
receptors to the number of surface receptors.
Curves are colored as:
[L] = 0.2 (magenta), 1 (blue), 2 (green), and 20
(red) nM.
Resat et al. Biophys Journal 85, 730 (2003)
Distribution of the receptors among cellular compartments
Resat et al. Biophys Journal 85, 730 (2003)
Stimulation of EGFR signaling pathway by different ligands
Comparison of the results when the EGFR
signaling pathway is stimulated with its ligands
EGF (red) and TGF-α (green).
(a) Total number of receptors in the cell as a
function of time after 20 nM ligand is added to the
system. Red diamond (EGF) and green square
(TGF-α) points show the experimental results.
(b) Distribution of the receptors between
intravesicular compartments and the cell
membrane.
(c) Distribution of the phosphorylated receptors
between intravesicular compartments and the cell
membrane. In the figures, y axes represent the
number of receptors in thousands.
Resat et al. Biophys Journal 85, 730 (2003)
Ratio of internal/surface receptors
The ratio of the In/Sur ratios when
the EGFR signaling pathway is
stimulated with its ligands EGF and
TGF-α at 20 nM ligand
concentration.
Comparison of computational (solid
lines) and experimental (points)
results.
Ratio of the ratios for the
phosphorylated (i.e., activated)
(blue), and total (phosphorylated +
unphosphorylated) number
(magenta) of receptors.
Resat et al. Biophys Journal 85, 730 (2003)
Summary
Large-scale simulations of the kinetics of biological signaling networks are
becoming feasible.
Here, the model consisted of hundreds of distinct compartments and ca. 13,000
reactions/events that occur on a wide spatial-temporal range.
The exact Dynamic Monte Carlo algorithm of Gillespie (1976/1977) was a
breakthrough for simulations of stochastic systems.
Problem: simulations can become very time-consuming, in particular if the
processes occur on different time scales.
Methods like the probability-weighted DMC are promising tools for studying
complex cellular systems using molecular quanta.
Many other variants of DMC have been and are being developed.