60
Computational Systems Biology 1 Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis of biological networks Hongwu Ma Computational Systems Biology Group 10 February 2006

Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

1

Computational Systems Biology

Lectures 6 and 7: Reconstruction and structural analysis of

biological networks

Hongwu MaComputational Systems Biology Group

10 February 2006

Page 2: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

2

Simple models

1210321 XSSX EEE ⎯→⎯⎯→⎯⎯→⎯

An enzyme reaction Enzyme kinetics

A linear pathway

BA E⎯→⎯

Metabolic control analysis

Page 3: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

3

The reality

Glycolysis pathway The whole network

Page 4: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

4

Systems Biology comes of age

0

100

200

300

400

1997 1999 2001 2003 2005

SB institute in seattle

1st ICSB in Japan

Review paper by L hood

Special issues in Nature and Science

1996, Systems biology unit, Glaxo WellcomeMedicines Research Centre

No. of Pubm

edpapers

Page 5: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

5

The background

• Fully sequenced genomes for several hundreds organism

• High-throughput methods for measurement at whole system level (more data with less experiments)

• Various internet available databases make it easy to collect and integrate information at system level.

Page 6: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

6

But how to study SB

• Methods for constructing the system (network) from available information

• Methods for theoretical analysis of such complex systems

• How “system analysis” contribute to biology

Page 7: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

7

Reconstruction of metabolic networks

• Organism specific MN from its genome• High throughput reconstruction from

databases• High quality network: Human curation

Page 8: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

8

Metabolic network reconstruction

Genes

b0001b0002b0003……..

Enzymes

EC 1.1.1.3EC 1.1.1.6EC 1.1.1.17

………...

Reactions

D-Glucose + ATP = D-G6P + ADPD-G6P = D-F6P

………………………

Page 9: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

9

Reconstruction from genome

Database

Gene annotation

Network

CTGAGGTCGACTCTAGAGGATCCCCCTTCCAGATGTGTAAGTTA

TTTGAGTGAAATAGTGCCAGTTTAACCATAGTTCTAGTAAGCT

TTTGAGTGAAATAGTGCCAGTTTAACCATAGTTCTAGTAAGCTG

Genome sequence

ORF

Gene recognition

Sequence similarity search, Blast

ID Name start end functionEG11277 thrL 190 255 Regulatory leader peptide for thrABC operoEG10998 thrA 337 2799 Aspartokinase I-homoserine dehydrogenasEG10999 thrB 2801 3733 Homoserine kinase [EC 2.7.1.39]EG11000 thrC 3734 5020 Threonine synthase [EC 4.2.3.1]

Page 10: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

10

High throughput reconstruction from enzyme databases

http://www.genome.ad.jp/dbget-bin/www_bget?enzyme+1.1.1.81ENTRY EC 1.1.1.81 NAME Hydroxypyruvate reductase CLASS Oxidoreductases Acting on the CH-OH group of donors With NAD+ or NADP+ as acceptor SYSNAME D-Glycerate:NADP+ 2-oxidoreductase REACTION D-Glycerate + NAD+ or NADP+ = Hydroxypyruvate + NADH or NADPH PATHWAY PATH: MAP00260 Glycine, serine and threonine metabolism PATH: MAP00630 Glyoxylate and dicarboxylate metabolism GENES YPE: YPO2536 PAE: PA1499 RSO: RS03094(ttuD1) RS05749(ttuD2) MLO: mlr5146 SME: SMa1406(ttuD3) SMb20678(ttuD2) SMc04389(ttuD1) ATU: Atu3232 Atu5334

Page 11: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

11

Useful databases• KEGG: http://www.genome.ad.jp/kegg/• Biocyc: http://biocyc.org/• Expasy: http://www.expasy.ch• Brenda: http://www.brenda.uni-koeln.de/

Ability required: writing program to extract information from websites or downloadable text files.

Page 12: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

12

From Enzyme to reaction

For EC 5.3.1.9:

4 reactions in KEGG LIGAND database

2 reactions in WIT database

1 reaction in IUBMB and Expasy databases

1 reaction in BRENDA database

1 reaction in EcoCyc

An enzyme may have wide substrate catalysis activity. E.g. 1.1.1.2: an alcohol + NADP+ = an aldehyde + NADPH + H+

Page 13: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

13

KEGG LIGAND reaction databaseMore than 6000 reactions

………

ENTRY R02739

NAME alpha-D-Glucose 6-phosphate ketol-isomerase

DEFINITION alpha-D-Glucose 6-phosphate <=> beta-D-Glucose 6-phosphate

EQUATION C00668 <=> C01172

ENZYME 5.3.1.9 5.1.3.15

………

irreversibility

Page 14: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

14

E. Coli Metabolic network from KEGG

Reaction

Metabolite

638 Enzymes1081 reactions

Ma & Zeng (2003) Bioinformatics. 19, 270

Page 15: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

15

Problems in the high throughput methods

•Nonenzyme catalyzed reactions link

•Uncharacted enzymes like 1.-.-.-

•Unsequensed enzymes (gene for a known enzyme not identified, more than 40 in E. coli) link

•Wrongly annotated enzyme genes (the annotations in KEGG, Ecocyc and Expasy for E. coli are very different)

Page 16: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

16

Enzyme databases

Annotated genome

Enzymes

EC 1.1.1.3EC 1.1.1.6EC 1.1.1.17

………...

Reactions

R0004R0007R0100

………...Metabolic network

High quality network reconstruction

KEGGExpasyBrenda

Human curation

Page 17: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

17

Available high quality networks

E. coli: Ecocyc, Palsson’s group

Yeast: Palsson and Nielsen’s group

Helicobacter pylori

Haemophilus influenzae

Human?

link

Page 18: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

18

Structural analysis of metabolic networks

• Graph representation• Structure properties• Network decomposition• Software for network analysis

Page 19: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

19

Mathematical representation of MN

⎟⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜⎜

0010000110110100110100010

132313

6

PGPTPT

FDPPF

Glyceraldehyde 3-P

Fructose 1,6-P

Fructose 6-PH2O, Pi

1,3-P Glycerate

NAD, Pi

NADH

ATP

ADP

Glycerone P

r1 r2

r3

r4r5

T3P2

F6P

FDP

T3P1

⎟⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜⎜

−−−−

−−

−−

0111001010000000001100000000011101100000001100001100011

5

4

3

2

1

rrrrr

13PG

ATP

ADP

NAD

H

NAD

Pi

H

2O

Stoichiometric matrix Connectivity matrix

⎟⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜⎜

0110010100110100000100110

5

4

3

2

1

rrrrr

Page 20: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

20

Connectivity matrix to graph

r1 r5

r4

r2

r3

T3P2

F6P

13PG

FDP

T3P1

Metabolite graph

Reaction graph

⎟⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜⎜

0010000110110100110100010

132313

6

PGPTPT

FDPPF

Connectivity matrix

⎟⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜⎜

0110010100110100000100110

5

4

3

2

1

rrrrr

Page 21: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

21

Currency metabolites in graph analysis

Otherwise path length will be 2 instead of 9 (Jeong et al. 2000 Nature 407:651)

G l u c o s e

G l u c o s e 6 - P

G l y c e r a l d e h y d e 3 - P

F r u c t o s e 1 , 6 - P

F r u c t o s e 6 - P

2 - P G l y c e r a t e

P h o s p h o e n o l p y r u v a t e

A T P

A D P

H 2 O , P i

3 - P G l y c e r a t e

1 , 3 - P G l y c e r a t e

H 2 O

N A D , P i

N A D H

P y r u v a t e

A D P

A T P

A T PA D P

G l y c e r o n e P

A D PA T P

From glucose to pyruvate, ADP can not be used as a link.

Currency metabolites

Page 22: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

22

With or without currency metabolites

Metabolic network of S. pneumonia (616 reactions)

Page 23: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

23

Structural properties of large scale network

• Connection degree distribution• Path length (efficiency)• Node centrality (the most important nodes)• Connectivity of the network• Modules in the network• Network motifs

Page 24: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

24

Connection degree distribution

0. 001

0. 01

0. 1

1

1 10 100K

P(K)

out puti nput

γ−= akkP )(

P(k): Percentage of nodes with a connection degree k or not less than k (Cumulative distribution).

Power law degree distribution indicates a scale free network

The few highly connected nodes are called hubs in the network

Page 25: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

25

Hub metabolites

Glycerate-3-phosphate, D-Ribose-5-phosphate, Acetyl-CoA, Pyruvate, D-Xylulose 5-phosphateD-Fructose 6-phosphate, 5-Phospho-D-ribose 1-diphosphate, L-Glutamate, D-Glyceraldehyde 3-phosphate, L-Aspartate, Propanoyl-CoA, Malonyl-ACP, Succinate, Acetate, Isocitrate, Fumarate

Page 26: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

26

Average path lengthPath length: number of the steps in the shortest paths from one node to anotherAPL: average of the path lengths for all connected pairs of nodes

Breadth first searching method to find the shortest paths from one node to all other nodes

Page 27: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

27

Average path length in MNJeong et al. (2000) Nature: constant AL (about 3) for all the 43 organisms studied.

02468

1012

0 200 400 600 800 1000

Node number

Aver

age

path

leng

th

EukaryoteBacteriaArchaea

Structural differences among MNs of the three domains of organism

Ma & Zeng (2003) Bioinformatics. 19, 270

Page 28: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

28

Node Centrality

dyxdnxC

xyUy

1),(

1)( ,

=−

=∑ ≠∈

Closeness centrality of node x:

d(x,y) the path length between node x and node yU the set of all nodes

average path length between x and the other nodesd

The central nodes have short path lengths to other nodes in the network

Page 29: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

29

The most central metabolites in the metabolic network of E. coli

5.06G3P

5.03Acetaldehyde

4.98Acetate

4.932KD6PG

4.89Malate

4.76Actyl-CoA

4.44Pyruvate

Mean distanceMetabolite

nodes connectedhighly nodes centralmost ≠

Page 30: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

30

Network Centrality

)2)(1())(()32( *

−−

−−= ∑ ∈

nnxCCn

C Ux

n node number in the networkC(x) the closeness centrality for node xC* the highest value of node closeness centrality

circle network star network

C = 0 C = 1

Network closeness centralization index

Distribution of the node centrality in the network

Page 31: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

31

Average path length vs. network closeness centralization index

0

2

4

6

8

10

12

0 0.05 0.1 0.15 0.2 0.25

OCCI

ALG

EukaryoteBacteriaArchaea

C

Page 32: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

32

Network Connectivity

Degree distribution tell nothing about global connectivity

The right network can have short average path length though not connected at all

Page 33: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

33

Strongly or weakly connected components

Fully connected subsets depending on if direction is considered

Page 34: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

34

SC in metabolic network

72

15

5 3 1 2 10

10

20

30

40

50

60

70

80

2 3 4 5 6 7 8 9 10 11 274

Size of components

Num

ber

of c

ompo

nent

s

Num

ber

GSC: Giant strong component

One big SC and many small SCs

Page 35: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

35

Connectivity structure of MN

Giant strong component (GSC)metabolites fully converted and

convertible to each other

Substrate subset (93)converted to metabolite in GSC

Product subset (161)produced from metabolites in GSC

274 out of total 811 metabolites

Isolated subset (283)

Page 36: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

36

Bow-tie: a general structure of biological and physical networks

Giant Strong Component

disconnected components

OUTsubset

INsubset

• Metabolic network• Signal transduction• Web page• Material processing and other tech. systems

Page 37: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

37

The average path length of the GSC determines that of the whole network

0

4

8

12

0 2 4 6 8 10 12

ALW

ALG

Path length of whole network

Path

leng

th o

f GSC

Page 38: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

38

Other structure parameters

• Cluster coefficient• Betweeness centrality• Reachability• Clique and core

Page 39: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

39

Pajek, the right software• Written in C, very fast, many functions, 1M• Free and updated with new function frequently• good manual, theory introduction with how-to-do in

the software• Book available: Exploratory Social Network

Analysis with Pajek• http://vlado.fmf.uni-lj.si/pub/networks/pajek/ h

Page 40: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

40

Other tools

• Cytoscape http://www.cytoscape.org/• Ucinet

http://www.analytictech.com/ucinet.htm• Bioconductor and R• Matlab

Page 41: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

41

Network decomposition

• Identifying a somehow independent subnetwork (modules) from such complex networks for biological function analysis or developing more detailed kinetic model

Page 42: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

42

WHYUniversity of Edinburgh

CSE CMVM CHSS

S1 S2 S3 S1 S2 S3 S1 S2 S3

Page 43: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

43

A Method• Define a distance between two nodes: the shortest path

between node A and node B• Distance matrix a hierarchical classification tree

Page 44: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

44

Hierarchical decomposition of GSC

6

4

3

2

1

5

1

4

3

2

5

5

Page 45: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

45

Biological function of these modules

1. Aspartate metabolism, Thr, Lys synthesis2. Glutathione and THF metabolism 3. Glutamate metabolism, Arg, Pro synthesis4. galacterate, glycerate metabolism, Ser synthesis5. PP pathway, upper part of glycolysis pathway,

sugar, aminosugar and glycerol metabolism6. TCA cycle and glyoxylate cycle, pyruvate

metabolism, AcCoA, AcAcCoA metabolism

Modules identified by the decomposition method have true biological meaning

Page 46: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

46

Transcriptional regulation: biological basis

• Transcription unit: operon• Sigma factor• Transcription factor (TF)• Promotor• TF binding site• More complex in Eukaryotes

Transcription unit

RNA polymerase

Sigma factor

Promotor

TFs

Binding site Genes

Interactions between proteins and genes

Page 47: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

47

Reconstruction of MN and TRN

E. coli

P. aeruginosa

A1 B1

A2 B2

1.2.1.3

mA

mB

TF

+

+?

Are the orthologous genes in different organisms regulated in similar ways?

Page 48: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

48

Transcriptional regulatory networks

• Can not be reconstructed directly from gene annotation information

• data is mainly from database which collect information from literatures (for E. coli and B. subtilis)

• ChIP on chip technology for high throughput reconstruction (mainly yeast network)

Page 49: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

49

Available databases

• RegulonDB(http://www.cifn.unam.mx/Computational_Genomics/regulondb/)

• Ecocyc (ecocyc.org)• DBTBS (B. subtilis) (dbtbs.hgc.jp)• Prodoric Net (Prokaryote) (prodoric.tu-bs.de)• TRANSFAC (Eukaryotes)

(www.biobase.de/pages/products/transfac.html)

Page 50: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

50

E. coli regulatory network1278 genes 2724 interactions157 genes for TFs382 metabolic enzyme genes (Blue)

Ma et al. (2004) Nucleic Acid Research 32, 6643

Page 51: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

51

Connectivity analysisA giant weakly connected component with 1166 nodes

No strongly connected component exists, indicating no multiple gene regulatory loop exist, impling a somehow acyclic structure.

Page 52: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

52

Method to find regulatory hierarchy

Page 53: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

53

The nine-layer hierarchical structure

Average path length: 1.85

rpoE

fis

soxR

cspA

rpoS

phoB

crp

ihf

argP

cspE

rpoNdnaA

hnscytR

Like the organization of the university?

Page 54: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

54

Network motifSubgraphs that appear in the network at frequencies much higher than those found in randomized networks (Shen-Oorret al., Nature Genetics, 31:64).

Regarded as elementary building blocks of network

Milo et al. Network motifs: simple building blocks of complex networks.Science. 2002, 298(5594):824-7.

Page 55: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

55

Network motifs in regulatory network

Feed forward loops

712 in the network

Bi-fan motif

11935

Page 56: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

56

Motifs in the network

Blue: FFLs

Yellow: Bi-fan motifs

Green: Both

Page 57: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

57

Two kinds of FFLs

+

+

++

+

Coherent FFLs Incoherent FFL

Direct effect of the upper regulator A on the target gene C is consistent with its indirect effect through B.

Direct effect is inconsistent with the indirect effect.

C

BB

AA

C

Mangan and Alon, (2003) PNAS, 100:11980

Page 58: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

58

Motif distributionCoherent FFLs: 330

+

++

−−

−−

+−

+

++

+

+−

++type

number 265 28 7 30 119 9 8 16

examples

flhC-fliA-

fliGH

cpxR-csgD-csgA

fis-hns-cysG

fnr-narL-dcuB

crp-nagC-manX

YZ

arcA-betI-

betAB

ihf-flhD-nrfA

fnr-narL-

moeAB

+

Incoherent FFLs: 152

Page 59: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

59

Motifs often interact with each other

hns

gadW

gadA

gadX

crprpoS

gadA: glutamate decarboxylase (in aminobutarate pathway, a by path of TCA cycle), important in oxidative stress and acid stress response

Page 60: Computational Systems Biology Lectures 6 and 7: Reconstruction and structural analysis ... · 2006-03-01 · Computational Systems Biology 1 Computational Systems Biology Lectures

Computational Systems Biology

60

references

H.W. Ma and A.P. Zeng. (2003) Reconstruction of metabolic networks from genome data and analysis of their global structure for various organisms. Bioinformatics. 19(2)270-277. H.W. Ma and A.P. Zeng. (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics. 19(11)1423-1430. H.W. Ma, et al (2004) An extended transcriptional regulatory network of Escherichia coli and its structural analysis, Nucleic Acid Research, 32:6643Barabási and Oltvai (2004) Network Biology: Understandingthe cell‘s functional organization. Nature Rev. Genet. 5, 101-114.