Model-based investigation of bacterial metabolism using gene essentiality data. PhD defense – Maxime Durot. PhD prepared in the Computational Systems Biology Group at Genoscope under the supervision of Vincent Schachter & Jean Weissenbach.

Model-based investigation of bacterial metabolism using gene essentiality data


Page 1: Model-based investigation of bacterial metabolism using gene essentiality data


Model-based investigation of bacterial metabolism using gene essentiality data.

PhD defense – Maxime Durot

PhD prepared in the

Computational Systems Biology Group at Genoscope

under the supervision of

Vincent Schachter & Jean Weissenbach

Page 2: Model-based investigation of bacterial metabolism using gene essentiality data

Maxime Durot – PhD defense – October 12, 2009

Motivation & goals of the thesis

Page 3: Model-based investigation of bacterial metabolism using gene essentiality data


Metabolism

[Picture: Roche Applied Science : http://www.expasy.org/tools/pathways/]

Page 4: Model-based investigation of bacterial metabolism using gene essentiality data


Information from two scales

genome metabolism phenotype

molecular scale cellular scale

Page 5: Model-based investigation of bacterial metabolism using gene essentiality data


Mutant phenotyping experiments

Wild-type bacterium

Genome

Gene

Knock-out mutant

Deleted gene

Wild-type growth phenotype

Mutant growth phenotype

Mutant phenotype:
- No growth = gene is essential in the tested environment
- Growth = gene is dispensable in the tested environment

Experiments are performed genome-wide for a growing number of organisms (Gerdes et al, Curr Opin Biotechnol 2006)

Page 6: Model-based investigation of bacterial metabolism using gene essentiality data


Confronting the two scales is complex

Page 7: Model-based investigation of bacterial metabolism using gene essentiality data


Modeling metabolism can help

(Stelling, Curr Opin Microbiol. 2004)

Page 8: Model-based investigation of bacterial metabolism using gene essentiality data

The constraint-based modeling framework

[Figure: toy metabolic network with external metabolites A(ext), B(ext), P(ext), internal metabolites A, B, C, D, P, and reactions R1 to R9]

Key concepts:
- variable of interest = reaction fluxes

Page 9: Model-based investigation of bacterial metabolism using gene essentiality data

The constraint-based modeling framework

Key concepts:
- variable of interest = reaction fluxes

[Figure: the toy network (A(ext), B(ext), P(ext), A, B, C, D, P) with an example flux value shown on each reaction]

Page 10: Model-based investigation of bacterial metabolism using gene essentiality data

The constraint-based modeling framework

Key concepts:
- variable of interest = reaction fluxes
- constraint-based approach: applying constraints to the model reduces the possible flux distributions

[Figure: the toy network (reactions R1 to R9) next to the space of admissible flux distributions, drawn on axes v1, v2, v3]

Page 11: Model-based investigation of bacterial metabolism using gene essentiality data

The constraint-based modeling framework

Key concepts:
- variable of interest = reaction fluxes
- constraint-based approach: applying constraints to the model reduces the possible flux distributions

Classical constraints:
- metabolism in steady state: metabolite concentrations remain constant
- some reactions are irreversible
- flux values are bounded by a maximal value

Applicable at genome scale

[Figure: the toy network (reactions R1 to R9) next to the space of admissible flux distributions]

Page 12: Model-based investigation of bacterial metabolism using gene essentiality data

The constraint-based modeling framework

Key concepts:
- variable of interest = reaction fluxes
- constraint-based approach: applying constraints to the model reduces the possible flux distributions
- explore the space of admissible flux distributions

Classical constraints:
- metabolism in steady state: metabolite concentrations remain constant
- some reactions are irreversible
- flux values are bounded by a maximal value

Applicable at genome scale

[Figure: the toy network (reactions R1 to R9) next to the space of admissible flux distributions]
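The classical constraints listed on this slide can be written compactly. The formulation below is the standard textbook one and is not taken from the slides; the symbols S (stoichiometric matrix), v (flux vector), I (set of irreversible reactions) and v_i^max (maximal flux value) are notation assumed here for illustration.

```latex
\begin{aligned}
S\,v &= 0 && \text{(steady state: metabolite concentrations remain constant)}\\
v_i &\ge 0 \quad \forall i \in I && \text{(irreversible reactions)}\\
|v_i| &\le v_i^{\max} \quad \forall i && \text{(bounded flux values)}
\end{aligned}
```

Under these assumptions, the admissible flux distributions are the vectors v that satisfy all three lines at once.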

Page 13: Model-based investigation of bacterial metabolism using gene essentiality data

Models and gene essentiality datasets

- Constraint-based models can predict growth phenotypes for genetic and environmental perturbations (Price et al, Nat Rev Microbiol 2004) (Durot et al, FEMS Microbiol Rev 2009)
- Gene essentiality datasets have been used to provide rough assessments of metabolic models (Covert et al, Nature 2004) (Joyce et al, J Bacteriol 2006)
  - Compute predictive accuracy for gene essentiality prediction
  - List of inconsistencies, used as a starting point for curation

Can gene essentiality datasets be used more systematically for metabolic model assessment & refinement?

Page 14: Model-based investigation of bacterial metabolism using gene essentiality data


Objectives of the thesis

1. Develop a framework for the refinement of metabolic models using gene essentiality data

Page 15: Model-based investigation of bacterial metabolism using gene essentiality data

Context: the Metabolic Thesaurus project

Experimental context:
- Reliable genome annotation (Barbe et al, Nucleic Acids Res 2004)
- Comprehensive knock-out mutant collection (de Berardinis et al, Mol Syst Biol 2008)
- Phenotyping capability: complete conditional essentiality datasets on several media (de Berardinis et al, Mol Syst Biol 2008)

Acinetobacter baylyi ADP1:
- γ-proteobacteria, Pseudomonadales group
- Nutritionally versatile, strictly aerobic
- Non-pathogenic
- Evidence of xenobiotic degradation capabilities

Page 16: Model-based investigation of bacterial metabolism using gene essentiality data

Objectives of the thesis

1. Develop a framework for the refinement of metabolic models using gene essentiality data

2. Application to Acinetobacter baylyi metabolism
   - reconstruct a global metabolic model from its genome annotation
   - assess and refine the model using mutant phenotypes
   - point out poorly understood metabolic events requiring further experimental investigation

Page 17: Model-based investigation of bacterial metabolism using gene essentiality data


Outline

A/ A formal framework for comparing predicted and experimental gene essentialities

B/ Reconstruction and refinement of A. baylyi metabolic model using mutant phenotypes

C/ Automated reasoning with metabolic models and essentiality data

Page 18: Model-based investigation of bacterial metabolism using gene essentiality data


A/ A formal framework for comparing predicted and experimental gene essentialities

Page 19: Model-based investigation of bacterial metabolism using gene essentiality data

Model refinement using experimental data

[Figure: iterative refinement loop. An initial metabolic reconstruction yields model predictions, which are compared with experimental results from (large-scale) experiments; model assessment & refinement produces an improved metabolic reconstruction. The loop is then repeated with (large-scale) experiments 2 as refinement step 2.]

Page 20: Model-based investigation of bacterial metabolism using gene essentiality data

Formal representation of a metabolic model

Model refinement using large-scale genetics data requires:
- Computer generation of variants of models
- Understanding the impact of model variations on phenotype predictions

Problem: constraint-based models appear to be complex mathematical objects

An appropriate representation of metabolic models is required to perform automated reasoning with essentiality data

Page 21: Model-based investigation of bacterial metabolism using gene essentiality data

Formal representation of a metabolic model

Boolean gene-reaction associations (GPR)

Boolean rules:
- r1: g1
- r2: g1 and g2

[Figure: gene, protein, complex, reaction diagram with genes g1, g2, proteins p1, p2, complex c1 and reactions r1, r2. In the model diagram, the GPR links the genetic background to the set of reactions fulfilling the modeling constraints]
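The Boolean rules on this slide (r1: g1; r2: g1 and g2) can be evaluated mechanically for any genetic background. The following is a minimal sketch, assuming a nested-tuple encoding of the rules; the encoding and the function names are illustrative choices, not something specified in the presentation.

```python
# Sketch: evaluate Boolean gene-reaction rules (GPR) for a genetic background.
# The nested-tuple rule encoding and all function names are assumptions.

def gpr_active(rule, present_genes):
    """Return True if the rule is satisfied by the set of present genes."""
    if isinstance(rule, str):            # a single gene
        return rule in present_genes
    op, *operands = rule                 # ("and", ...) or ("or", ...)
    results = [gpr_active(r, present_genes) for r in operands]
    return all(results) if op == "and" else any(results)

# The two rules shown on the slide.
GPR = {
    "r1": "g1",
    "r2": ("and", "g1", "g2"),
}

def active_reactions(gpr, all_genes, deleted=frozenset()):
    """Reactions that remain active once the deleted genes are removed."""
    present = set(all_genes) - set(deleted)
    return {r for r, rule in gpr.items() if gpr_active(rule, present)}

wild_type = active_reactions(GPR, {"g1", "g2"})
delta_g2 = active_reactions(GPR, {"g1", "g2"}, deleted={"g2"})
print(sorted(wild_type), sorted(delta_g2))
```

Because the rules contain only "and" and "or", deleting a gene can only shrink the set of active reactions, never enlarge it.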

Page 22: Model-based investigation of bacterial metabolism using gene essentiality data

Formal representation of a metabolic model

Model components:
- Boolean gene-reaction associations (GPR)
- Set of metabolic reactions (NETWORK)

[Figure: model diagram. Labels: genetic background, GPR, set of reactions fulfilling the modeling constraints, metabolites of the medium, producible metabolites]

Page 23: Model-based investigation of bacterial metabolism using gene essentiality data

Formal representation of a metabolic model

Model components:
- Boolean gene-reaction associations (GPR)
- Set of metabolic reactions (NETWORK)
- List of essential biomass precursors (BIOMASS)

[Figure: model diagram. Labels: genetic background, GPR, set of reactions fulfilling the modeling constraints, metabolites of the medium, producible metabolites, essential biomass precursors]

Page 24: Model-based investigation of bacterial metabolism using gene essentiality data

Predicting mutant phenotypes

[Figure: model diagram under a genetic perturbation. Labels: genetic background, GPR, reactions fulfilling the modeling constraints, metabolites of the medium, producible metabolites, essential biomass precursors]

- Gene deletion (genetic perturbation)
- Inactivated reactions (through the GPR)
- Reduction of producible metabolites space
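The chain on this slide (gene deletion, inactivated reactions, reduced set of producible metabolites) can be illustrated on a toy model. The sketch below is a deliberate simplification: producibility is computed by forward propagation from the medium rather than by solving the constraint-based flux model, and the network, genes and GPR shown are hypothetical.

```python
# Sketch: predict knock-out phenotypes on a hypothetical toy model.
# Simplification: producible metabolites are found by forward expansion
# from the medium, not by the constraint-based (flux) computation.

MEDIUM = {"A"}
BIOMASS = {"P"}                                   # essential biomass precursors
NETWORK = {                                       # name -> (substrates, products)
    "R1": ({"A"}, {"B"}),
    "R2": ({"B"}, {"P"}),
    "R3": ({"A"}, {"C"}),
}
# GPR as a list of alternative enzymes, each a tuple of required genes.
GPR = {"R1": [("g1",)], "R2": [("g2",), ("g3",)], "R3": [("g4",)]}
GENES = {"g1", "g2", "g3", "g4"}

def producible(medium, reactions):
    """Metabolites reachable from the medium through the given reactions."""
    found = set(medium)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions.values():
            if substrates <= found and not products <= found:
                found |= products
                changed = True
    return found

def predict_growth(deleted=()):
    """True if every biomass precursor is still producible after deletion."""
    present = GENES - set(deleted)
    active = {name: NETWORK[name] for name, enzymes in GPR.items()
              if any(set(e) <= present for e in enzymes)}
    return BIOMASS <= producible(MEDIUM, active)

phenotypes = {g: ("dispensable" if predict_growth([g]) else "essential")
              for g in sorted(GENES)}
print(phenotypes)
```

In this toy case only g1 is predicted essential: g2 and g3 back each other up as isozymes, and g4 controls a reaction that no biomass precursor depends on.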

Page 25: Model-based investigation of bacterial metabolism using gene essentiality data

Confronting model predictions with experiments

Comparison of predictions with experiments reveals inconsistencies

| Experiments \ Predictions | Essential | Dispensable |
|---|---|---|
| Essential | True Essential | False Dispensable |
| Dispensable | False Essential | True Dispensable |
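The four outcomes of the comparison table can be assigned programmatically once predictions and observations are available per gene. This is a small sketch with hypothetical gene names and values; only the four category names come from the slide.

```python
# Sketch: classify each gene by comparing predicted and observed essentiality.
# Gene names and values below are hypothetical.
from collections import Counter

def classify(predicted_essential, observed_essential):
    if predicted_essential and observed_essential:
        return "true essential"
    if predicted_essential:
        return "false essential"      # predicted essential, observed dispensable
    if observed_essential:
        return "false dispensable"    # predicted dispensable, observed essential
    return "true dispensable"

predicted = {"geneA": True, "geneB": True, "geneC": False, "geneD": False}
observed = {"geneA": True, "geneB": False, "geneC": True, "geneD": False}

labels = {g: classify(predicted[g], observed[g]) for g in predicted}
counts = Counter(labels.values())
inconsistencies = sorted(g for g, label in labels.items()
                         if label.startswith("false"))
print(counts, inconsistencies)
```

The two "false" categories are the inconsistencies that feed the refinement process.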

Page 26: Model-based investigation of bacterial metabolism using gene essentiality data

Classifying inconsistencies according to likely cause & correction type

Type of inconsistency: False essential
- GPR: decrease impact of gene deletion on reaction set
  - add an alternate enzyme
  - gene is a non-essential subunit of a complex
  - reaction may occur spontaneously
- NETWORK: augment reaction set
  - add an alternate pathway
- BIOMASS: reduce biomass requirements
  - remove a biomass precursor

Type of inconsistency: False dispensable
- GPR: increase impact of gene deletion on reaction set
  - remove an isozyme
  - form a complex instead of isozyme
  - gene has an additional essential role
- NETWORK: reduce reaction set
  - remove or block an alternate pathway
- BIOMASS: augment biomass requirements
  - add a biomass precursor

Page 27: Model-based investigation of bacterial metabolism using gene essentiality data


B/ Reconstruction and refinement of A. baylyi metabolic model using mutant phenotypes

Page 28: Model-based investigation of bacterial metabolism using gene essentiality data

A. baylyi model reconstruction

Two-step process:

1. Identify all metabolic reactions occurring in the cell

2. Adapt representation to modeling requirements

Page 29: Model-based investigation of bacterial metabolism using gene essentiality data


1/ Metabolic network reconstruction

Page 30: Model-based investigation of bacterial metabolism using gene essentiality data

2/ Adapt to modeling requirements

Specific developments made for A. baylyi model:
- Automated expansion of generic pathways
- Inference of enzyme complexes by homology to E. coli

Page 31: Model-based investigation of bacterial metabolism using gene essentiality data

Initial model reconstruction

[Figure: two pie charts over the categories central metabolism, amino acids synthesis, nucleotides synthesis, transport, lipid metabolism, degradation pathways, cofactor synthesis. Values of the first chart: 73, 116, 92, 148, 145, 115, 108. Values of the second chart: 70, 139, 88, 133, 141, 181, 107]

- 859 reactions using 697 metabolites, linked with 787 genes
- 109 metabolites that are exchangeable with the environment

Page 32: Model-based investigation of bacterial metabolism using gene essentiality data

Evidence supporting the enzymatic function of model genes

Page 33: Model-based investigation of bacterial metabolism using gene essentiality data

Experimental datasets

Dataset 1: growth phenotypes of wild-type strain on 190 carbon sources
- Growth on 45 carbon sources
- No growth on remaining 145 carbon sources

Dataset 2: genome-wide gene essentialities from A. baylyi mutant collection construction (de Berardinis et al, Mol Syst Biol 2008)
- Selection on succinate minimal medium
- [Figure: gene essentiality results]

Dataset 3: growth phenotypes of A. baylyi mutants on 8 defined environments
- 7 alternate C sources, 1 alternate N source
- Quantitative growth measure (OD)

[Figures: two histograms with a Frequency axis]

Page 34: Model-based investigation of bacterial metabolism using gene essentiality data

Iterative refinement of A. baylyi model

- Initial reconstruction from genome annotation, pathway databases, literature
- iAbaylyi v1
- Dataset 1: growth phenotypes of wild-type strain on 190 carbon sources (1 strain x 190 media)

Page 35: Model-based investigation of bacterial metabolism using gene essentiality data

Model refinement using dataset 1

| | iAbaylyi v1 | iAbaylyi v2 |
|---|---|---|
| overall prediction accuracy | 86% | 91% |
| correctly predicted carbon sources | 24 / 45 (53%) | 33 / 45 (73%) |
| correctly predicted non carbon sources | 140 / 145 (97%) | 140 / 145 (97%) |

Corrected inconsistencies: GPR 0, NETWORK 9, BIOMASS 0
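The overall prediction accuracy on this slide follows directly from the two per-class counts. A short sketch of that arithmetic, using only the dataset 1 numbers given above (the function name is an illustrative choice):

```python
# Sketch: overall accuracy = correct predictions / all tested media.
# Counts are the dataset 1 values shown on the slide.

def accuracy(correct_pos, total_pos, correct_neg, total_neg):
    return (correct_pos + correct_neg) / (total_pos + total_neg)

v1 = accuracy(24, 45, 140, 145)   # (24 + 140) / 190
v2 = accuracy(33, 45, 140, 145)   # (33 + 140) / 190

print(f"iAbaylyi v1: {v1:.0%}")
print(f"iAbaylyi v2: {v2:.0%}")
```

The gain between the two versions comes entirely from the 9 additional carbon sources that are correctly predicted, which matches the 9 NETWORK corrections.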

Page 36: Model-based investigation of bacterial metabolism using gene essentiality data

Iterative refinement of A. baylyi model

- Initial reconstruction from genome annotation, pathway databases, literature
- iAbaylyi v1. Model accuracy: 88% on dataset 1
- Dataset 1: growth phenotypes of wild-type strain on 190 carbon sources (1 strain x 190 media)
- iAbaylyi v2. Model accuracy: 91% on dataset 1

Page 37: Model-based investigation of bacterial metabolism using gene essentiality data

Iterative refinement of A. baylyi model

- Initial reconstruction from genome annotation, pathway databases, literature
- iAbaylyi v1. Model accuracy: 88% on dataset 1
- Dataset 1: growth phenotypes of wild-type strain on 190 carbon sources (1 strain x 190 media)
- iAbaylyi v2. Model accuracy: 91% on dataset 1
- Dataset 2: genome-wide gene essentialities from A. baylyi mutant collection construction (3093 strains x 1 medium)

Gene status excerpt: ACIAD0001 NA, ACIAD0002 Essential, ACIAD0003 Dispensable, ACIAD0004 Essential, ACIAD0005 Dispensable, ACIAD0006 Dispensable

Page 38: Model-based investigation of bacterial metabolism using gene essentiality data

Model refinement using dataset 2

| | iAbaylyi v2 | iAbaylyi v3 |
|---|---|---|
| overall prediction accuracy | 88% | 94% |
| correctly predicted essential genes | 187 / 251 (75%) | 217 / 251 (86%) |
| correctly predicted dispensable genes | 489 / 516 (95%) | 495 / 505 (98%) |

Corrected inconsistencies: GPR 26, NETWORK 11, BIOMASS 10

Page 39: Model-based investigation of bacterial metabolism using gene essentiality data

Iterative refinement of A. baylyi model

- Initial reconstruction from genome annotation, pathway databases, literature
- iAbaylyi v1. Model accuracy: 88% on dataset 1
- Dataset 1: growth phenotypes of wild-type strain on 190 carbon sources (1 strain x 190 media)
- iAbaylyi v2. Model accuracy: 91% on dataset 1, 88% on dataset 2
- Dataset 2: genome-wide gene essentialities from A. baylyi mutant collection construction (3093 strains x 1 medium)
- iAbaylyi v3. Model accuracy: 91% on dataset 1, 94% on dataset 2

Gene status excerpt: ACIAD0001 NA, ACIAD0002 Essential, ACIAD0003 Dispensable, ACIAD0004 Essential, ACIAD0005 Dispensable, ACIAD0006 Dispensable

Page 40: Model-based investigation of bacterial metabolism using gene essentiality data

Iterative refinement of A. baylyi model

- Initial reconstruction from genome annotation, pathway databases, literature
- iAbaylyi v1. Model accuracy: 88% on dataset 1
- Dataset 1: growth phenotypes of wild-type strain on 190 carbon sources (1 strain x 190 media)
- iAbaylyi v2. Model accuracy: 91% on dataset 1, 88% on dataset 2
- Dataset 2: genome-wide gene essentialities from A. baylyi mutant collection construction (3093 strains x 1 medium)
- iAbaylyi v3. Model accuracy: 91% on dataset 1, 94% on dataset 2
- Dataset 3: growth phenotypes of A. baylyi mutant collection on 8 minimal media (2350 strains x 8 media), quantitative growth measure

Gene status excerpt: ACIAD0001 NA, ACIAD0002 Essential, ACIAD0003 Dispensable, ACIAD0004 Essential, ACIAD0005 Dispensable, ACIAD0006 Dispensable

Page 41: Model-based investigation of bacterial metabolism using gene essentiality data

Model refinement using dataset 3

| | iAbaylyi v3 | iAbaylyi v4 |
|---|---|---|
| overall prediction accuracy | 93% | 94% |
| correctly predicted gene phenotypes with ≥ 1 essentiality | 16 / 36 (44%) | 18 / 36 (50%) |
| correctly predicted gene phenotypes with no essentiality | 406 / 419 (97%) | 408 / 416 (98%) |

Corrected inconsistencies: GPR 8, NETWORK 1, BIOMASS 0

Page 42: Model-based investigation of bacterial metabolism using gene essentiality data

Iterative refinement of A. baylyi model

- Initial reconstruction from genome annotation, pathway databases, literature
- iAbaylyi v1. Model accuracy: 88% on dataset 1
- Dataset 1: growth phenotypes of wild-type strain on 190 carbon sources (1 strain x 190 media)
- iAbaylyi v2. Model accuracy: 91% on dataset 1, 88% on dataset 2
- Dataset 2: genome-wide gene essentialities from A. baylyi mutant collection construction (3093 strains x 1 medium)
- iAbaylyi v3. Model accuracy: 91% on dataset 1, 94% on dataset 2, 93% on dataset 3
- Dataset 3: growth phenotypes of A. baylyi mutant collection on 8 minimal media (2350 strains x 8 media), quantitative growth measure
- iAbaylyi v4. Model accuracy: 91% on dataset 1, 94% on dataset 2, 94% on dataset 3

Gene status excerpt: ACIAD0001 NA, ACIAD0002 Essential, ACIAD0003 Dispensable, ACIAD0004 Essential, ACIAD0005 Dispensable, ACIAD0006 Dispensable

Page 43: Model-based investigation of bacterial metabolism using gene essentiality data

GPR correction example: ATP phosphoribosyltransferase

- ACIAD0661 (hisG) and ACIAD1257 (hisZ) were initially assigned as isozymes of the ATP phosphoribosyltransferase reaction.
- Observed essentiality of both genes suggests that both are necessary for the activity.
- Further examination of the literature confirms that both proteins form an enzymatic complex (Sissler et al, PNAS 1999)

[Figure: pathway PRPP → phosphoribosyl-ATP → histidine → protein, with the first reaction annotated ACIAD0661 OR ACIAD1257. Legend: essential gene or reaction, dispensable gene or reaction, biomass precursor]

Page 44: Model-based investigation of bacterial metabolism using gene essentiality data

GPR correction example: ATP phosphoribosyltransferase

[Figure: pathway PRPP → phosphoribosyl-ATP → histidine → protein, shown before correction (ACIAD0661 OR ACIAD1257) and after correction (ACIAD0661 AND ACIAD1257). Legend: essential gene or reaction, dispensable gene or reaction, biomass precursor]
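The effect of switching this rule from OR to AND can be checked by evaluating both versions for each single deletion. The sketch below assumes, as the pathway figure suggests, that the reaction is required for growth (histidine being a biomass precursor); the function names are illustrative.

```python
# Sketch: predicted essentiality of hisG and hisZ under both GPR versions.
# Assumption: losing the reaction is lethal, since histidine is a
# biomass precursor in the example.

GENES = {"ACIAD0661", "ACIAD1257"}   # hisG, hisZ

def initial_rule(present):           # isozymes: ACIAD0661 OR ACIAD1257
    return "ACIAD0661" in present or "ACIAD1257" in present

def corrected_rule(present):         # complex: ACIAD0661 AND ACIAD1257
    return "ACIAD0661" in present and "ACIAD1257" in present

def predicted_status(rule, deleted_gene):
    reaction_active = rule(GENES - {deleted_gene})
    return "dispensable" if reaction_active else "essential"

before = {g: predicted_status(initial_rule, g) for g in sorted(GENES)}
after = {g: predicted_status(corrected_rule, g) for g in sorted(GENES)}
print(before)
print(after)
```

With the OR rule both genes are predicted dispensable, which contradicts the observed essentiality (two false dispensable predictions); with the AND rule both predictions agree with the experiments.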

Page 45: Model-based investigation of bacterial metabolism using gene essentiality data

Network correction example

- ACIAD0822-0824 (gatABC) is annotated as an aspartyl/glutamyl-tRNA amidotransferase
- gatABC are essential: only way to produce asparagine.
- ACIAD1920 (glnS) catalyzes direct charging of glutamine on its tRNA
- Essentiality of ACIAD1920 suggests that the gatABC pathway is not effective for glutamine

[Figure: two routes to glutamine-tRNA(gln) → protein: direct charging of glutamine (ACIAD1920), and glutamate → glutamate-tRNA(gln) (ACIAD3371 OR ACIAD0272) followed by amidation (ACIAD0822 AND ACIAD0823 AND ACIAD0824). One route to asparagine-tRNA(asn) → protein: aspartate → aspartate-tRNA(asn) (ACIAD0609) followed by amidation (ACIAD0822 AND ACIAD0823 AND ACIAD0824). Legend: essential gene or reaction, dispensable gene or reaction, biomass precursor]

Page 46: Model-based investigation of bacterial metabolism using gene essentiality data

Network correction example

[Figure: the network before and after correction. Before: glutamine-tRNA(gln) is produced both by direct charging of glutamine (ACIAD1920) and from glutamate via glutamate-tRNA(gln) (ACIAD3371 OR ACIAD0272, then ACIAD0822 AND ACIAD0823 AND ACIAD0824); asparagine-tRNA(asn) is produced from aspartate via aspartate-tRNA(asn) (ACIAD0609, then ACIAD0822 AND ACIAD0823 AND ACIAD0824). After: the glutamate route is removed, so glutamine-tRNA(gln) is produced only by ACIAD1920; the aspartate route to asparagine-tRNA(asn) is unchanged. Legend: essential gene or reaction, dispensable gene or reaction, biomass precursor]

Page 47: Model-based investigation of bacterial metabolism using gene essentiality data


A. baylyi model refinement

Page 48: Model-based investigation of bacterial metabolism using gene essentiality data


Online prediction of mutant phenotypes

(Le Fèvre et al, Bioinformatics 2009)

Page 49: Model-based investigation of bacterial metabolism using gene essentiality data


C/ Automated reasoning with metabolic models and essentiality data

Page 50: Model-based investigation of bacterial metabolism using gene essentiality data

Automated reasoning on gene-reaction associations

- Use phenotypes as specifications for gene-reaction associations
- Assume NETWORK and BIOMASS parts of the model are correct
- For each inconsistency: search all GPRs compatible with experimental data

Page 51: Model-based investigation of bacterial metabolism using gene essentiality data

1/ Deduce impact scenarios from phenotypes

Equivalent view of gene-reaction associations: deletion impact
- Impact(deletion of {G1,…,Gn}) = {R1,…,Rp} inactivated

Key idea: phenotypes of reaction deletions can be predicted

Compatible deletion impacts must follow the rules:
- lethal gene deletions must impact an essential reaction set
- viable gene deletions must not impact any essential reaction set

Page 52: Model-based investigation of bacterial metabolism using gene essentiality data

1/ Deduce impact scenarios from phenotypes

- For each inconsistency, generate all possible impact scenarios
- Closed-world assumption: the set of genes potentially linked to a reaction is known

[Figure: reactions R1, R2 with their predicted essentialities and genes G1 to G4 with their observed essentialities, connected by predefined gene-reaction links. Legend: reaction, gene, predefined gene-reaction link, gene/reaction set is essential, gene/reaction set is dispensable]

Page 53: Model-based investigation of bacterial metabolism using gene essentiality data

1/ Deduce impact scenarios from phenotypes

[Figure: the same gene-reaction example (R1, R2, G1 to G4) with impact scenario 1 highlighted]

Page 54: Model-based investigation of bacterial metabolism using gene essentiality data

1/ Deduce impact scenarios from phenotypes

[Figure: the same gene-reaction example (R1, R2, G1 to G4) with impact scenario 2 highlighted]

Page 55: Model-based investigation of bacterial metabolism using gene essentiality data

1/ Deduce impact scenarios from phenotypes

[Figure: the same gene-reaction example (R1, R2, G1 to G4) with impact scenario 3 highlighted]

Page 56: Model-based investigation of bacterial metabolism using gene essentiality data

1/ Deduce impact scenarios from phenotypes

[Figure: the same gene-reaction example (R1, R2, G1 to G4) with impact scenario 4 highlighted]
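Generating impact scenarios under the closed-world assumption amounts to enumerating, for each gene, the subsets of its linked reactions that are compatible with its observed phenotype. The sketch below handles single-gene deletions only; the gene-reaction links, phenotypes and essential reaction sets are hypothetical values chosen for illustration, not the data behind the slide's figure.

```python
# Sketch: enumerate impact scenarios compatible with observed phenotypes.
# All data below are hypothetical; only single-gene deletions are handled.
from itertools import chain, combinations, product

def subsets(items):
    """All subsets of items, as frozensets."""
    items = sorted(items)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))]

def is_lethal(impact, minimal_essential_sets):
    """An impact is lethal if it inactivates a whole essential reaction set."""
    return any(s <= impact for s in minimal_essential_sets)

def impact_scenarios(links, essential_genes, minimal_essential_sets):
    """One scenario = one compatible deletion impact per gene."""
    genes = sorted(links)
    options = [
        [imp for imp in subsets(links[g])
         if is_lethal(imp, minimal_essential_sets) == (g in essential_genes)]
        for g in genes
    ]
    return [dict(zip(genes, combo)) for combo in product(*options)]

LINKS = {"G1": {"R1"}, "G2": {"R1"}, "G3": {"R1", "R2"}, "G4": {"R2"}}
ESSENTIAL_GENES = {"G3"}                    # observed gene essentialities
MINIMAL_ESSENTIAL_SETS = [frozenset({"R1"})]  # predicted reaction essentialities

scenarios = impact_scenarios(LINKS, ESSENTIAL_GENES, MINIMAL_ESSENTIAL_SETS)
for number, scenario in enumerate(scenarios, 1):
    print(number, {g: sorted(r) for g, r in scenario.items()})
```

The number of scenarios is the product of the per-gene option counts, which is why it can grow quickly when many genes are linked to many reactions.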

Page 57: Model-based investigation of bacterial metabolism using gene essentiality data

2/ Implement proposed impacts with GPR

- Choose an impact scenario
- For each reaction, find Boolean rules implementing the impacts
  - analogy to logic circuit design
- GPR specificity: no negation rule
  - monotonic increasing Boolean function (F(0,0) ≤ F(1,0) ≤ F(1,1))
  - constrains the possible implementations

Page 58: Model-based investigation of bacterial metabolism using gene essentiality data

2/ Implement proposed impacts with GPR

Specifications for R1 (scenario 1):
- G1 deletion does not impact R1
- G2 deletion does not impact R1
- G3 deletion does impact R1

[Figure: scenario 1 on the gene-reaction example (R1, R2, G1 to G4)]

Truth table for R1 (blank = not determined):

| G1 | G2 | G3 | GPR (from specifications) | GPR (after applying monotony) |
|---|---|---|---|---|
| 0 | 0 | 0 | | 0 |
| 1 | 0 | 0 | | 0 |
| 0 | 1 | 0 | | 0 |
| 1 | 1 | 0 | 0 | 0 |
| 0 | 0 | 1 | | |
| 1 | 0 | 1 | 1 | 1 |
| 0 | 1 | 1 | 1 | 1 |
| 1 | 1 | 1 | | 1 |

Page 59: Model-based investigation of bacterial metabolism using gene essentiality data

2/ Implement proposed impacts with GPR

Multiple solutions: generate all possible cases

| G1 | G2 | G3 | GPR (partial) | GPR = G3 | GPR = G3 and (G1 or G2) |
|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 0 | 0 | 0 | 0 | 0 |
| 0 | 1 | 0 | 0 | 0 | 0 |
| 1 | 1 | 0 | 0 | 0 | 0 |
| 0 | 0 | 1 | ? | 1 | 0 |
| 1 | 0 | 1 | 1 | 1 | 1 |
| 0 | 1 | 1 | 1 | 1 | 1 |
| 1 | 1 | 1 | 1 | 1 | 1 |
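The two solutions on this slide can be recovered by brute force: fill in the undetermined rows of the truth table in every possible way and keep the completions that are monotonic. This is a minimal sketch of that idea, not necessarily how the author's tool proceeds; the function name and the tuple encoding of rows (G1, G2, G3) are assumptions.

```python
# Sketch: enumerate monotonic completions of a partially specified truth table.
# Rows are tuples (G1, G2, G3) with 1 = gene present; values are 1 = active.
from itertools import product

def monotone_completions(n_genes, known):
    rows = list(product((0, 1), repeat=n_genes))
    unknown = [r for r in rows if r not in known]
    solutions = []
    for values in product((0, 1), repeat=len(unknown)):
        table = dict(known)
        table.update(zip(unknown, values))
        # Monotony: if a <= b componentwise, then F(a) <= F(b).
        if all(table[a] <= table[b] for a in rows for b in rows
               if all(x <= y for x, y in zip(a, b))):
            solutions.append(table)
    return solutions

# Specifications for R1 taken from the slides.
known = {
    (1, 1, 0): 0,   # G3 deletion does impact R1
    (1, 0, 1): 1,   # G2 deletion does not impact R1
    (0, 1, 1): 1,   # G1 deletion does not impact R1
}
solutions = monotone_completions(3, known)
print(len(solutions), sorted(s[(0, 0, 1)] for s in solutions))
```

The completions differ only on the row (0, 0, 1): the value 1 corresponds to GPR = G3, and the value 0 to GPR = G3 and (G1 or G2).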

Page 60: Model-based investigation of bacterial metabolism using gene essentiality data

2/ Implement proposed impacts with GPR

Multiple solutions:
- Generate all possible cases
- Choose closest behavior to the original GPR
- Propose experiment to fully determine the Boolean rule: {G1, G2} double deletion here (the undetermined row G1 = 0, G2 = 0, G3 = 1)

| G1 | G2 | G3 | GPR |
|---|---|---|---|
| 0 | 0 | 0 | 0 |
| 1 | 0 | 0 | 0 |
| 0 | 1 | 0 | 0 |
| 1 | 1 | 0 | 0 |
| 0 | 0 | 1 | ? |
| 1 | 0 | 1 | 1 |
| 0 | 1 | 1 | 1 |
| 1 | 1 | 1 | 1 |
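An experiment can settle the Boolean rule only if the candidate rules predict different outcomes for it. The sketch below lists the deletion sets on which the two candidate rules for R1 (GPR = G3 and GPR = G3 and (G1 or G2)) disagree; the function name and the lambda encoding of the rules are illustrative assumptions.

```python
# Sketch: find gene deletion sets that discriminate between candidate GPRs.
from itertools import combinations

GENES = ("G1", "G2", "G3")
CANDIDATES = {
    "G3": lambda present: "G3" in present,
    "G3 and (G1 or G2)": lambda present: "G3" in present
                         and ("G1" in present or "G2" in present),
}

def discriminating_deletions(genes, rules):
    """Deletion sets for which the candidate rules give different results."""
    found = []
    for size in range(1, len(genes) + 1):
        for deleted in combinations(genes, size):
            present = set(genes) - set(deleted)
            outcomes = {rule(present) for rule in rules.values()}
            if len(outcomes) > 1:
                found.append(set(deleted))
    return found

experiments = discriminating_deletions(GENES, CANDIDATES)
print(experiments)
```

Any deletion that removes G3 inactivates the reaction under both rules, so it cannot separate them; only a deletion that keeps G3 while removing both G1 and G2 is informative.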

Page 61: Model-based investigation of bacterial metabolism using gene essentiality data

Comparing AutoGPR proposals with expert interpretations

Comparison with manual corrections of A. baylyi model

| Type of expert interpretation | Inconsistencies | Inconsistencies having AutoGPR proposals | Inconsistencies corrected using AutoGPR proposal |
|---|---|---|---|
| Model corrected | | | |
| GPR | 34 | 24 (71%) | 22 (65%) |
| - activity simultaneously requiring all genes | 3 | 3 (100%) | 3 (100%) |
| - isozyme not functional | 22 | 19 (86%) | 19 (86%) |
| - gene associated to another essential reaction | 1 | 0 (0%) | 0 (0%) |
| - presence of an alternate enzyme | 6 | 1 (17%) | 0 (0%) |
| - spontaneously occurring reaction | 1 | 0 (0%) | 0 (0%) |
| - wrong complex subunit | 1 | 1 (100%) | 0 (0%) |
| NETWORK | 12 | 0 (0%) | 0 (0%) |
| BIOMASS | 10 | 0 (0%) | 0 (0%) |
| Model not corrected | | | |
| Validated explanation | 6 | 0 (0%) | 0 (0%) |
| Hypothetical explanation | 21 | 5 (24%) | 0 (0%) |
| No precise interpretation | 28 | 3 (11%) | 0 (0%) |

Page 62: Model-based investigation of bacterial metabolism using gene essentiality data

Comparing AutoGPR proposals with expert interpretations

Comparison for S. cerevisiae model:
- iND750 model predictions compared with gene essentiality data on 8 environments (Duarte et al, Genome Res 2004)
- Inconsistent predictions were manually interpreted (not corrected)

| Type of expert interpretation | Inconsistencies | Inconsistencies having AutoGPR proposals |
|---|---|---|
| GPR | 21 | 12 (60%) |
| NETWORK | 18 | 0 (0%) |
| BIOMASS | 41 | 1 (2%) |
| Other interpretation | 128 | 12 (9%) |
| No precise interpretation | 29 | 1 (3%) |

Page 63: Model-based investigation of bacterial metabolism using gene essentiality data

Number of generated proposals for A. baylyi

[Figure: histogram. x-axis: number of generated corrections (log scale, 1 to 1E+06); y-axis: number of cases (0 to 20)]

Page 64: Model-based investigation of bacterial metabolism using gene essentiality data


Reducing complexity

First, simply test the existence of GPR corrections

Impose similar reactions to have similar GPR

Page 65: Model-based investigation of bacterial metabolism using gene essentiality data

Examining corrections across environments

GPR corrections can contradict each other across environments

| Model | Environments | Fraction of cases where AutoGPR corrections are inconsistent across environments |
|---|---|---|
| A. baylyi | 8 minimal media | 3 / 8 (37%) |
| E. coli | 2 minimal media | 4 / 22 (18%) (Joyce et al, J Bacteriol 2006) |
| S. cerevisiae | 8 minimal and complex media | 21 / 22 (95%) (Duarte et al, Genome Res 2004) |

Possible interpretations:
- Inconsistencies between experimental conditions
- Error in NETWORK or BIOMASS model components
- GPR are not constant across environments
  - Conditional expression of genes
  - Regulatory interactions intervene

(Durot et al, BMC Syst Biol 2008)

Page 66: Model-based investigation of bacterial metabolism using gene essentiality data


Conclusion & perspectives

Page 67: Model-based investigation of bacterial metabolism using gene essentiality data


Main contributions

Reconstruction of a global metabolic model of A. baylyi

Development of a framework for interpreting inconsistent growth phenotype predictions

Systematic interpretation of A. baylyi mutant phenotypes using its metabolic model

Design of an automated method to reason on GPR corrections from gene essentialities

Page 68: Model-based investigation of bacterial metabolism using gene essentiality data

Perspectives

A. baylyi metabolic model
- Tool to integrate further experimental data
- RNA-seq, metabolomics on A. baylyi and mutants

Metabolic model reconstruction
- Automate the reconstruction process from genome annotation
- Systematically assess model correctness using high-throughput experimental data

=> Microme European project to be started

Page 69: Model-based investigation of bacterial metabolism using gene essentiality data

Acknowledgments

Supervisors
- Vincent Schachter & Jean Weissenbach

Computational Systems Biology group
- François Le Fèvre
- Gilles Vieira
- Richard Baran*
- Pierre-Yves Bourguignon*
- Serge Smidtas*
(* former members)

Acinetobacter baylyi annotation
- Claudine Médigue
- David Vallenet
- Valérie Barbe
- Georges Cohen
- Nuria Fonknechten
- Annett Kreimeyer

Metabolic Thesaurus experimental work
- Marcel Salanoubat
- Véronique de Berardinis
- Alain Perret
- Marielle Besnard
- Christophe Lechaplais
- Agnès Pinet

Page 70: Model-based investigation of bacterial metabolism using gene essentiality data


Discussion