Richard Notebaart
Systems biology / Reconstruction and modeling of large biological networks
Seminar
• What is systems biology?
• How to reconstruct large biological networks/systems
• Methods to analyze large biological networks/systems
• Applying systems biology approaches to answer biological questions
What is systems biology:
• fashionable catchword?
• a real new (philosophical) concept?
• new discipline in biology?
• just biology?
• ...
Systems concept
• A system represents a set of components together with the relations connecting them to form a unity
• Defining a system divides reality into the system itself and its environment
• The number of interconnections within a system is larger than the number of connections with the environment
• Systems can include other systems as part of their construction
• allows complex systems to be put together from known simple ones (system of systems)
• concept of modularity!
Systems levels
Ecosystem
Multicellular organisms
Organs
Tissues
Cells
Pathways
Proteins/genes
• The behavior of a system depends on:
  • (Properties of the) components of the system
  • The interactions between the components
THUS: You cannot understand a system via pure reductionism (studying the components in isolation)
Systems theory
Systems biology
• New? NO and YES
• Systems theory and theoretical biology are old (publications of von Bertalanffy, 1933-1970)
• Experimental and computational possibilities are new
Omics-revolution shifts paradigm to large systems
- Integrative bioinformatics
- (Network) modeling
• Gene expression networks: based on microarray data and clustering of genes with similar expression values over different conditions (i.e. correlations).
• Protein-protein interaction networks: based on yeast two-hybrid approaches.
• Metabolic networks: network of interacting metabolites through biochemical reactions.
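A minimal sketch of the correlation idea behind gene expression networks: two genes are linked when their expression profiles correlate strongly across conditions. The gene names, expression values and the 0.9 threshold are made up for illustration; real analyses use many more conditions and dedicated clustering methods.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical expression values of three genes over five conditions
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],
}

THRESHOLD = 0.9  # assumed cut-off for drawing an edge

# An edge connects two genes whose profiles correlate above the threshold
edges = [
    (g1, g2)
    for g1, g2 in combinations(expression, 2)
    if pearson(expression[g1], expression[g2]) >= THRESHOLD
]
print(edges)
```

Only geneA and geneB rise together over the conditions, so they form the single edge of this toy network.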
Reconstruction of networks from ~omics for systems analysis
• Genome annotation allows for reconstruction:
  • If an annotated gene codes for an enzyme, it can (in most cases) be associated with a reaction
How to reconstruct metabolic networks?
genome
transcriptome
proteome
metabolome
Genome-scale network
Reconstructed genome-scale networks
Species                    #Reactions  #Genes  Reference
Escherichia coli           2077        1260    Feist AM. et al. (2007), Mol. Syst. Biol.
Saccharomyces cerevisiae   1175        708     Förster J. et al. (2003), Genome Res.
Bacillus subtilis          1020        844     Oh YK. et al. (2007), J. Biol. Chem.
Lactobacillus plantarum    643         721     Teusink B. et al. (2006), J. Biol. Chem.
Human                      3673        1865    Duarte NC. et al. (2007), PNAS
…>30
Data visualization via Gene-Protein-Reaction relations (formalized knowledge)
From network to model
The Modeling Ideal - A complete kinetic description
• Flux* through Rxn1 = f(pH, temp, concentration, regulators, …)
• Can model fluxes and concentrations over time
• Drawbacks
  • Lots of parameters
  • Measured in vitro (valid in vivo?)
  • Can be complex, ‘nasty’ equations
  • Nearly impossible to get all parameters at genome-scale
*Flux: measure of turnover rate of substrates through a reaction (mmol h^-1 gDW^-1)
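To make the idea of a kinetic description concrete, here is a minimal sketch for a single reaction S -> P. It assumes a Michaelis-Menten rate law, which is a common choice but not one named on the slide, and integrates it with simple Euler steps. The parameter values (vmax, km, s0) are made up; at genome scale, values like these would be needed for every reaction.

```python
def michaelis_menten(substrate, vmax, km):
    """Flux through one enzymatic reaction as a function of substrate level."""
    return vmax * substrate / (km + substrate)


def simulate(s0, vmax, km, dt=0.01, steps=1000):
    """Follow substrate S and product P over time with explicit Euler steps."""
    s, p = s0, 0.0
    trajectory = [(0.0, s, p)]
    for step in range(1, steps + 1):
        v = michaelis_menten(s, vmax, km)  # flux at the current concentration
        s -= v * dt                        # substrate is consumed
        p += v * dt                        # product is formed
        trajectory.append((step * dt, s, p))
    return trajectory


# Hypothetical parameters for illustration only
trajectory = simulate(s0=5.0, vmax=2.0, km=0.5)
t_end, s_end, p_end = trajectory[-1]
print(f"t={t_end:.1f}  S={s_end:.4f}  P={p_end:.4f}")
```

Even this one-reaction model needs two kinetic parameters plus an initial concentration, which illustrates why the drawbacks above become prohibitive for thousands of reactions.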
Theory vs. Genome-scale modeling
Theory
• Complete knowledge
• Solution is a single point
Genome-scale
• Incomplete knowledge
• Solution is a space
[Figure: two panels with axes Flux A, Flux B and Flux C, contrasting a single-point solution with a solution space]
For genome-scale networks there is no detailed kinetic description -> too many reactions involved!
Genome-scale modeling
• How to model genome-scale networks?
• We need:
  • A metabolic reaction network
  • Exchange reactions: link between environment and reaction network (systems boundary)
  • Constraints that limit network function:
    • Mass balancing (conservation) of metabolites in the system
    • Exchange fluxes with environment
    • ……
• Goal: prediction of growth and reaction fluxes
From network to constraint-based model
• A system represents a set of components together with the relations connecting them to form a whole unity
• Defining a system divides reality into the system itself and its environment
Mass balancing
Constraint-based modeling - Data structure
• Stoichiometric matrix S (mass balancing):
   1: metabolite produced in reaction
  -1: metabolite consumed by reaction
   0: metabolite not involved in reaction
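As a small illustration of this data structure, the sketch below builds S for a hypothetical three-reaction network (the reaction and metabolite names are invented) and multiplies it with a flux vector to see which metabolites are produced or consumed on balance.

```python
# Hypothetical toy network: each reaction maps metabolites to coefficients
reactions = {
    "R1": {"A": -1, "B": 1},  # A -> B
    "R2": {"B": -1, "C": 1},  # B -> C
    "R3": {"A": -1, "C": 1},  # A -> C
}

# Rows of S are metabolites, columns are reactions
metabolites = sorted({m for stoich in reactions.values() for m in stoich})
S = [[reactions[r].get(m, 0) for r in reactions] for m in metabolites]

for name, row in zip(metabolites, S):
    print(name, row)

# Net production of each metabolite for one flux vector (R1=1, R2=1, R3=0)
fluxes = [1, 1, 0]
balance = [sum(s * v for s, v in zip(row, fluxes)) for row in S]
print(dict(zip(metabolites, balance)))
```

With these fluxes B is balanced, A is consumed and C accumulates; without exchange reactions the toy network cannot be in steady state unless all fluxes are zero.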
Principles of Constraint-Based Analysis
• Steady-state assumption: for each metabolite in network, write a balance equation
[Figure: metabolite Xi produced by flux V1 and consumed by fluxes V2 and V3]
Flux balance on component Xi:
V1 = V2 + V3, i.e. V1 - V2 - V3 = 0
Matrix notation: S.v = 0
S = stoichiometric matrix (m x n)
v = metabolic reaction fluxes (n)
• Result is a system of m equations (number of metabolites) and n unknowns (fluxes)
Normally n > m, so the system is underdetermined: no unique solution!
What is underdetermined?
• Determined system (2 equations, 2 unknowns):
  X + Y = 2
  2X - Y = 1
• Solution: X = 1, Y = 1
• Underdetermined system (1 equation, 2 unknowns):
  X + Y = 2
• Infinitely many solutions!
In metabolism there are more fluxes (unknowns) than metabolites (equations)
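The two cases can be checked numerically. This sketch assumes NumPy is available; `degrees_of_freedom` is a helper name introduced here that counts how many unknowns remain free (number of unknowns minus the rank of the coefficient matrix).

```python
import numpy as np

# Determined system: X + Y = 2 and 2X - Y = 1
A = np.array([[1.0, 1.0], [2.0, -1.0]])
b = np.array([2.0, 1.0])
solution = np.linalg.solve(A, b)
print("X, Y =", solution)


def degrees_of_freedom(matrix):
    """Number of unknowns that stay free: columns minus matrix rank."""
    matrix = np.atleast_2d(matrix)
    return matrix.shape[1] - np.linalg.matrix_rank(matrix)


# Underdetermined system: only X + Y = 2
free_xy = degrees_of_freedom([[1.0, 1.0]])

# Flux balance on one metabolite: V1 - V2 - V3 = 0
free_flux = degrees_of_freedom([[1.0, -1.0, -1.0]])

print("free unknowns:", free_xy, free_flux)
```

One balance equation with three fluxes leaves two fluxes free, which is the situation in metabolic networks on a much larger scale.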
Impose constraints
[Figure: panels labeled "Unbounded Solution Space" and "Bounded Convex Subset", with axes Flux B and Flux C]
Constraints
(i) Stoichiometry (mass conservation)
(ii) Exchange fluxes (capacity)
(iii) …
Exchange reactions allow nutrients to be taken up from the environment with a certain maximum flux, e.g. -2 ≤ v_exchange ≤ 0
Interpretation of the convex cone
• Convex cone = flux cone = solution space
• A point in the cone: one allowable functional state (flux distribution) of the network given the constraints
Flux balance analysis (FBA)
Constraints set bounds on the solution space, but where in this space does the “real” solution lie?
FBA: find the flux distribution that maximizes an objective function (e.g. biomass flux), subject to S.v = 0 and αj ≤ vj ≤ βj
Thus, it is assumed that organisms have evolved for maximal growth -> efficiency!
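FBA is a linear program, so a toy version fits in a few lines. The sketch below assumes SciPy's `linprog` as the solver (the slides point to the COBRA toolbox instead) and uses an invented five-reaction network. The exchange reaction follows the sign convention used above: a negative flux means uptake, bounded here by -2 ≤ v ≤ 0.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network. Columns: EX_A (exchange of A), R1 (A -> B),
# R2 (A -> C), R3 (B -> C), BIO (C -> biomass). Rows: metabolites A, B, C.
REACTIONS = ["EX_A", "R1", "R2", "R3", "BIO"]
S = np.array([
    [-1, -1, -1,  0,  0],  # A
    [ 0,  1,  0, -1,  0],  # B
    [ 0,  0,  1,  1, -1],  # C
])

# Capacity constraints: uptake of A limited to 2, other fluxes irreversible
bounds = [(-2, 0), (0, None), (0, None), (0, None), (0, None)]

# linprog minimizes, so maximize the biomass flux by minimizing its negative
c = np.zeros(len(REACTIONS))
c[REACTIONS.index("BIO")] = -1.0

res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds, method="highs")
growth = -res.fun

print("maximal biomass flux:", growth)
print(dict(zip(REACTIONS, res.x)))
```

The optimal biomass flux equals the maximal uptake, but the split between R1/R3 and R2 is not unique: the solver returns one optimal flux distribution out of many.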
Prediction of microbial evolution by flux balance analysis (in E. coli)
Prediction of growth fails with flux balance analysis (in L. plantarum)
[Figure: simulated vs. experimental fluxes at D = 0.32 h-1]
Compounds: glucose, succinate, citric acid, EtOH, acetoin, acetate, formate, pyruvate, lactate
Simulation: -8.4, 0, -0.7, 3.4, 0, 15.2, 8.0, 0, 0
Experiment: -8.4, 0.9, -0.7, 0.45, 0, 1.9, 0.9, 0.1, 13.8
glucose -> pyruvate -> lactate (2 ATP/Glc) or acetate + formate + ethanol (2.5 ATP/Glc)
FBA predicts mixed acid fermentation with 40% too high biomass formation -> thus L. plantarum is not efficient!
Teusink B. et al. (2006), J. Biol. Chem.
Some other constraint-based methods
Robustness analysis: examining the effect of changing the flux through a reaction on the objective function (i.e. growth)
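A minimal sketch of robustness analysis on an invented toy network, assuming SciPy's `linprog` as the LP solver: the uptake flux is fixed at a series of values and the maximal biomass flux is recomputed each time.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network. Columns: EX_A (exchange of A, negative = uptake),
# R1 (A -> B), R2 (A -> C), R3 (B -> C), BIO (C -> biomass).
REACTIONS = ["EX_A", "R1", "R2", "R3", "BIO"]
S = np.array([
    [-1, -1, -1,  0,  0],  # A
    [ 0,  1,  0, -1,  0],  # B
    [ 0,  0,  1,  1, -1],  # C
])
BIO = REACTIONS.index("BIO")


def max_growth(uptake):
    """Maximal biomass flux when the exchange flux is fixed at -uptake."""
    bounds = [(-uptake, -uptake)] + [(0, None)] * 4
    c = np.zeros(len(REACTIONS))
    c[BIO] = -1.0  # linprog minimizes, so negate the objective
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds, method="highs")
    return res.x[BIO]


# Scan the uptake flux and record the optimal growth at each value
uptakes = [0.0, 0.5, 1.0, 1.5, 2.0]
profile = [(u, max_growth(u)) for u in uptakes]
for u, g in profile:
    print(f"uptake {u:.1f} -> growth {g:.2f}")
```

In this toy case growth is simply proportional to uptake; in genome-scale models such scans can reveal plateaus and breakpoints where other constraints become limiting.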
Some other constraint-based methods
Flux variability analysis: compute minimum and maximum flux values through each reaction without changing the optimal solution (i.e. maximum growth / phenotype)
FBA is performed to determine the optimal solution and is used as constraint.
Example of application: if one wants to change the optimal solution it is relevant to know which reactions have wide and narrow flux ranges
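The two-step procedure can be sketched on an invented toy network, again assuming SciPy's `linprog`. The cap on R2 and the small numerical slack on the growth constraint are assumptions made for this example.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network. Columns: EX_A (exchange of A, negative = uptake),
# R1 (A -> B), R2 (A -> C), R3 (B -> C), BIO (C -> biomass).
REACTIONS = ["EX_A", "R1", "R2", "R3", "BIO"]
S = np.array([
    [-1, -1, -1,  0,  0],  # A
    [ 0,  1,  0, -1,  0],  # B
    [ 0,  0,  1,  1, -1],  # C
])
bounds = [(-2, 0), (0, None), (0, 1), (0, None), (0, None)]  # R2 capped at 1
b_eq = np.zeros(S.shape[0])
BIO = REACTIONS.index("BIO")


def optimize(c, bnds):
    return linprog(c, A_eq=S, b_eq=b_eq, bounds=bnds, method="highs")


# Step 1: FBA gives the optimal growth (linprog minimizes, hence the -1)
objective = np.zeros(len(REACTIONS))
objective[BIO] = -1.0
optimum = optimize(objective, bounds).x[BIO]

# Step 2: keep growth at its optimum (with a tiny numerical slack) and
# minimize / maximize every flux in turn
fva_bounds = list(bounds)
fva_bounds[BIO] = (optimum - 1e-9, None)

flux_ranges = {}
for j, name in enumerate(REACTIONS):
    c = np.zeros(len(REACTIONS))
    c[j] = 1.0
    low = optimize(c, fva_bounds).fun
    high = -optimize(-c, fva_bounds).fun
    flux_ranges[name] = (low, high)

for name, (low, high) in flux_ranges.items():
    print(f"{name}: {low:.2f} .. {high:.2f}")
```

Here the uptake and biomass fluxes are pinned to a single value, while R1, R2 and R3 each keep a range of width 1 without affecting optimal growth.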
Available software – COBRA toolbox
Designed for MATLAB and freely available!
Flux coupling / correlations
• Genome-scale analysis to determine whether two fluxes (v1 and v2) are:
• Fully coupled: a non-zero flux of v1 implies a non-zero fixed flux for v2 (and vice versa)
• Directionally coupled: a non-zero flux v1 implies a non-zero flux for v2, but not necessarily the reverse
• Uncoupled: a non-zero flux v1 does not imply a non-zero flux for v2 (and vice versa)
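One simple way to test these definitions on a small network is sketched below, assuming SciPy's `linprog`. It is a simplification, not the published flux coupling algorithm. It assumes that all reactions are irreversible, that there are no capacity bounds and that no reaction is blocked: one flux is fixed at 1 and the other is minimized and maximized. The network is invented. The label "partially coupled" (mutual implication without a fixed ratio) is an extra case that the list above does not distinguish.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network: IN (-> M1), R1 (M1 -> M2), R2 (M1 -> M2), OUT (M2 ->)
REACTIONS = ["IN", "R1", "R2", "OUT"]
S = np.array([
    [1, -1, -1,  0],  # M1
    [0,  1,  1, -1],  # M2
])


def flux_range(fixed, target):
    """Min and max of flux `target` when flux `fixed` is set to 1."""
    n = S.shape[1]
    bounds = [(0, None)] * n
    bounds[fixed] = (1, 1)
    c = np.zeros(n)
    c[target] = 1.0
    kwargs = dict(A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds, method="highs")
    low = linprog(c, **kwargs)
    high = linprog(-c, **kwargs)
    # The minimization proves feasibility, so a failed maximization
    # is interpreted as an unbounded flux
    vmax = -high.fun if high.status == 0 else float("inf")
    return low.fun, vmax


def classify(name_i, name_j, tol=1e-7):
    i, j = REACTIONS.index(name_i), REACTIONS.index(name_j)
    j_min, j_max = flux_range(i, j)  # v_i = 1: can v_j be zero?
    i_min, _ = flux_range(j, i)      # v_j = 1: can v_i be zero?
    i_implies_j = j_min > tol
    j_implies_i = i_min > tol
    if i_implies_j and j_implies_i:
        return "fully coupled" if abs(j_max - j_min) <= tol else "partially coupled"
    if i_implies_j or j_implies_i:
        return "directionally coupled"
    return "uncoupled"


results = {pair: classify(*pair) for pair in [("IN", "OUT"), ("R1", "IN"), ("R1", "R2")]}
for pair, label in results.items():
    print(pair, "->", label)
```

Uptake and secretion always carry the same flux (fully coupled), R1 cannot run without IN but IN can run without R1 (directionally coupled), and the parallel routes R1 and R2 are uncoupled.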
Flux coupling / correlations
A and B: directionally
B and C: fully
C and D: uncoupled
Measured vs. in silico flux correlations
(p < 10^-14)
Emmerling M. et al. (2002), J Bacteriol.; Segre D. et al. (2002), PNAS
In silico and measured flux correlations are in agreement
Notebaart RA. et al. (2007), PLoS Comput Biol (in press)
Flux coupling for data analysis
• Does flux coupling relate to transcriptional co-regulation of genes?
Notebaart RA. et al. (2007), PLoS Comput Biol (in press)
Flux coupling for data analysis
Flux coupled genes in the E. coli metabolism are more likely lost or gained together over evolution
Coupling type          Event     #Events  OR* (95% c.i.)
Fully coupled          Transfer  59       64.6 (24.2–168.8)
Fully coupled          Loss      1,624    50.0 (41.8–59.6)
Directionally coupled  Transfer  78       60.3 (24.3–147.2)
Directionally coupled  Loss      2,833    9.6 (8.3–11.1)
*odds ratio (OR): how much more likely is an event X relative to event Y
Pal C. et al. (2005), Nature Genetics
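For readers unfamiliar with odds ratios, this sketch computes an OR and an approximate 95% confidence interval from a 2x2 table. The counts are invented and are not the data behind the table above. The interval uses the standard log-scale (Woolf) approximation, which is an assumption here, since the slide does not state how its intervals were obtained.

```python
from math import exp, log, sqrt


def odds_ratio(a, b, c, d):
    """OR for a 2x2 table: a/b = event yes/no in group 1, c/d in group 2."""
    return (a * d) / (b * c)


def woolf_interval(a, b, c, d, z=1.96):
    """Approximate 95% confidence interval on the log-odds scale."""
    log_or = log(odds_ratio(a, b, c, d))
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return exp(log_or - z * se), exp(log_or + z * se)


# Hypothetical counts: gene pairs lost together (yes/no),
# for coupled pairs (group 1) and uncoupled pairs (group 2)
coupled_yes, coupled_no = 20, 80
uncoupled_yes, uncoupled_no = 5, 95

ratio = odds_ratio(coupled_yes, coupled_no, uncoupled_yes, uncoupled_no)
lower, upper = woolf_interval(coupled_yes, coupled_no, uncoupled_yes, uncoupled_no)
print(f"OR = {ratio:.2f} (95% c.i. {lower:.2f}-{upper:.2f})")
```

An interval that excludes 1 indicates that the event is associated with the grouping, here with coupling.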
Gene dispensability in metabolism of yeast
Harrison R and Papp B. et al. (2007), Proc Natl Acad Sci USA
• Studies have shown that many metabolic genes are dispensable (80% of yeast genes appear not to be essential for growth)
• Main question: why are most genes dispensable?
• ‘Forces’ that explain dispensability:
  • The impact of gene deletions may depend on the environment (plasticity)
  • The presence of mutational robustness (compensatory mechanisms) -> alternative pathways
  • Or both…
• Objective: explore the interaction between the two forces.
Gene dispensability in metabolism
Harrison R and Papp B. et al. (2007), Proc Natl Acad Sci USA
• A ‘model’ of mutational robustness and environment:
  i) Simulate metabolism in different environments
  ii) Identify genes in alternative pathways by synthetic lethality
Gene dispensability – single gene deletion
Gene is essential when a deletion is lethal (i.e. no growth):
Delete the gene and apply FBA -> optimization equals zero -> gene is essential!
Harrison R and Papp B. et al. (2007), Proc Natl Acad Sci USA
Effect of environment and alternative pathways
BUT, single gene deletion does not supply direct information on alternative pathways and their role in gene dispensability
Method: Identify synthetic lethality between gene A and B:
i) Delete only gene A and apply FBA -> optimization unequal to zero -> gene is not essential
ii) Delete only gene B and apply FBA -> optimization unequal to zero -> gene is not essential
iii) Delete both gene A and B and apply FBA -> optimization equals zero -> either A or B must be present, thus there is an alternative pathway, which explains gene dispensability!
Harrison R and Papp B. et al. (2007), Proc Natl Acad Sci USA
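The deletion procedure can be sketched on a toy network, assuming SciPy's `linprog` for the FBA step. The network, the gene names and the gene-to-reaction mapping are invented. A deletion is simulated by forcing the fluxes of the gene's reactions to zero.

```python
from itertools import combinations

import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network. Columns: EX_A (exchange of A, negative = uptake),
# R1 (A -> B), R2 (A -> C), R3 (B -> C), BIO (C -> biomass).
REACTIONS = ["EX_A", "R1", "R2", "R3", "BIO"]
S = np.array([
    [-1, -1, -1,  0,  0],  # A
    [ 0,  1,  0, -1,  0],  # B
    [ 0,  0,  1,  1, -1],  # C
])
BOUNDS = [(-2, 0), (0, None), (0, None), (0, None), (0, None)]
GENE_TO_REACTIONS = {"g_upt": ["EX_A"], "g1": ["R1"], "g2": ["R2"], "g3": ["R3"]}
BIO = REACTIONS.index("BIO")
TOL = 1e-6  # growth below this value counts as "no growth"


def growth_after_deletion(deleted_genes):
    """Maximal biomass flux (FBA) with the reactions of deleted genes blocked."""
    bounds = list(BOUNDS)
    for gene in deleted_genes:
        for reaction in GENE_TO_REACTIONS[gene]:
            bounds[REACTIONS.index(reaction)] = (0, 0)
    c = np.zeros(len(REACTIONS))
    c[BIO] = -1.0  # linprog minimizes, so negate the objective
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds, method="highs")
    return res.x[BIO]


# Single deletions: zero growth means the gene is essential
essential = [g for g in GENE_TO_REACTIONS if growth_after_deletion([g]) <= TOL]
nonessential = [g for g in GENE_TO_REACTIONS if g not in essential]

# Double deletions of non-essential genes: zero growth means synthetic lethality
synthetic_lethal = [
    pair for pair in combinations(nonessential, 2)
    if growth_after_deletion(pair) <= TOL
]

print("essential:", essential)
print("synthetic lethal pairs:", synthetic_lethal)
```

Only the uptake gene is essential on its own. The genes of the two alternative routes from A to C are individually dispensable but lethal in combination, which is the signature of an alternative pathway.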
Effect of environment and alternative pathways
Alternative paths in all environments: 14.3%
Alternative paths (SL) in 1 or 2 environments: 50%
50% of genes in alternative pathways provide mutational robustness in only 1 or 2 environments; thus the environment plays an important role in gene dispensability!
Harrison R and Papp B. et al. (2007), Proc Natl Acad Sci USA
Summary / conclusions
• Systems biology: studying living cells/tissues/etc by exploring their components and their interactions
• Even without detailed knowledge of kinetics, genome-scale modeling is still possible
• Genome-scale modeling has been shown to be relevant for studying evolution and interpreting ~omics data
• Major challenge is to integrate knowledge of kinetics and genome-scale networks
Assignment
• Read the following article: Pal C., Papp B., Lercher MJ., Csermely P., Oliver SG. and Hurst LD. (2006), Chance and necessity in the evolution of minimal metabolic networks, Nature
• Write a report of 2-3 pages and include/consider at least the following points:
  • What is the main hypothesis and scientific question?
  • What do you think about the hypothesis? Will it have important implications?
  • Do the authors ask other scientific (sub)questions (related to the main question) and if so, what are they and was it necessary to address them?
  • What methods have been used? Explain them (in your own words!).
  • What are the major findings/results?
  • Summarize the conclusions and describe whether you agree with them based on the described results.