Motif Mining from Gene Regulatory Networks
Based on the publications of Uri Alon’s group
…presented by Pavlos Pavlidis
Tartu University, December 2005
Gene Regulatory Networks
• From Wikipedia: A gene regulatory network is a collection of DNA segments in a cell which interact with each other and with other substances in the cell, thereby governing the rates at which genes in the network are transcribed into mRNA.
• From DOE: Gene regulatory networks (GRNs) are the on-off switches and rheostats…dynamically orchestrate the level of expression for each gene….
Why can networks regulate gene expression?
• U. Alon and his group stress the importance of the building blocks of the network.
• These building blocks are called motifs
Motifs
• They are also called n-node subgraphs of a directed graph
(The work has also been extended to undirected graphs)
• They are characterized by the number n of nodes and the relations between them (directed edges)
The 13 different 3-node subgraphs
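The count of 13 can be checked by brute force. The sketch below is only an illustration, not the group's method: it assumes that "different" means distinct up to relabeling of the nodes, that self-loops are excluded, and that the three nodes must be connected when edge direction is ignored. All function names are hypothetical.

```python
from itertools import permutations, product

# All ordered pairs (source, target) on three nodes, excluding self-loops.
PAIRS = [(i, j) for i in range(3) for j in range(3) if i != j]

def canonical(edges):
    """Label-independent form: the smallest relabeling of the edge set."""
    return min(
        tuple(sorted((p[a], p[b]) for a, b in edges))
        for p in permutations(range(3))
    )

def weakly_connected(edges):
    """True if all three nodes are joined when edge direction is ignored."""
    reached = {0}
    changed = True
    while changed:
        changed = False
        for a, b in edges:
            if (a in reached) != (b in reached):
                reached |= {a, b}
                changed = True
    return len(reached) == 3

def three_node_subgraph_classes():
    """Enumerate all 2**6 edge sets and keep one representative per class."""
    classes = set()
    for mask in product([0, 1], repeat=len(PAIRS)):
        edges = [pair for pair, keep in zip(PAIRS, mask) if keep]
        if weakly_connected(edges):
            classes.add(canonical(edges))
    return classes

print(len(three_node_subgraph_classes()))  # prints 13
```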
Feed Forward Loop: it rapidly regulates the production of Z
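As a rough illustration, feed-forward loops can be counted directly from an edge list. The sketch assumes the usual definition (X regulates Y, and both X and Y regulate Z); the node names X and Y and the function `count_ffl` are assumptions for this example. It counts every ordered triple that contains the pattern, whether or not the triple has extra edges, which may differ from how a dedicated motif finder counts.

```python
from itertools import permutations

def count_ffl(edges):
    """Count ordered triples (x, y, z) with edges x->y, x->z and y->z."""
    edge_set = set(edges)
    nodes = {node for edge in edge_set for node in edge}
    return sum(
        1
        for x, y, z in permutations(nodes, 3)
        if (x, y) in edge_set and (x, z) in edge_set and (y, z) in edge_set
    )

# Hypothetical toy network: one feed-forward loop plus an unrelated edge.
toy_edges = [("X", "Y"), ("X", "Z"), ("Y", "Z"), ("Z", "W")]
print(count_ffl(toy_edges))  # prints 1
```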
Which motifs are they interested in?
• Not the biologically significant ones
– They don’t know a priori whether a motif is biologically significant
• They can calculate statistical significance
– The probability P that a randomized network contains the same number or more instances of a particular motif must be smaller than a cutoff. Here the cutoff is 0.01.
Randomized Network
• A randomized network is not completely randomized.
It keeps some properties of the real network:
• The same number of nodes as in the real network
• For each node, the numbers of incoming and outgoing edges equal those in the real network.
[Figure: matrix M with rows Operon 1 … Operon 18 and columns Operon 1 … Operon 6 …]
Mij: 1 if the j operon produces a TF which regulates operon i
Row Operon 1: 0 0 1 0 0 0
Row Operon 2: 1 0 0 1 0 0
Example: operon 2 regulates operon 11
Representation of the network as a matrix M
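A minimal sketch of this representation, assuming the convention that entry M[i][j] is 1 when operon j produces a TF that regulates operon i. The function name and the use of 1-based operon numbers in the input are choices made for this example only.

```python
def adjacency_matrix(n_operons, regulations):
    """Build M from (regulator, target) pairs given as 1-based operon numbers.

    M[i][j] == 1 means operon j+1 regulates operon i+1 (0-based indices).
    """
    M = [[0] * n_operons for _ in range(n_operons)]
    for regulator, target in regulations:
        M[target - 1][regulator - 1] = 1
    return M

# Operon 3 regulates operon 1; operons 1 and 4 regulate operon 2;
# operon 2 regulates operon 11.
M = adjacency_matrix(18, [(3, 1), (1, 2), (4, 2), (2, 11)])
print(M[0][:6])  # prints [0, 0, 1, 0, 0, 0]
print(M[1][:6])  # prints [1, 0, 0, 1, 0, 0]
```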
Randomization: Randomly select two cells whose value is 1, e.g. A(1,3) and B(2,1).
If A’(1,1) and B’(2,3) are both 0, then swap: set A and B to 0 and A’ and B’ to 1
Goal: The randomized network must have the same column sums and row sums as the real network
Columns: The number of outgoing edges
Rows: The number of incoming edges
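The swap step could be sketched as follows. This is an illustration under stated assumptions, not the group's implementation: the function name, the `n_swaps` and `max_attempts` parameters, and the stopping rule are hypothetical, and the sketch gives the diagonal (self-regulation) no special treatment.

```python
import random

def swap_randomize(M, n_swaps, rng=None, max_attempts=None):
    """Return a copy of 0/1 matrix M after up to n_swaps successful swaps."""
    rng = rng or random.Random()
    if max_attempts is None:
        max_attempts = 100 * n_swaps
    R = [row[:] for row in M]
    ones = [(i, j) for i, row in enumerate(R) for j, v in enumerate(row) if v]
    done = attempts = 0
    while done < n_swaps and attempts < max_attempts and len(ones) >= 2:
        attempts += 1
        a, b = rng.sample(range(len(ones)), 2)
        (i1, j1), (i2, j2) = ones[a], ones[b]
        # A = (i1, j1), B = (i2, j2); A' = (i1, j2), B' = (i2, j1).
        if R[i1][j2] == 0 and R[i2][j1] == 0:
            R[i1][j1] = R[i2][j2] = 0
            R[i1][j2] = R[i2][j1] = 1
            ones[a], ones[b] = (i1, j2), (i2, j1)
            done += 1
    return R

def row_sums(M):
    return [sum(row) for row in M]

def column_sums(M):
    return [sum(col) for col in zip(*M)]

# Hypothetical 4-operon example.
real = [[0, 0, 1, 0],
        [1, 0, 0, 1],
        [0, 0, 0, 0],
        [0, 1, 0, 0]]
randomized = swap_randomize(real, n_swaps=10, rng=random.Random(0))
print(row_sums(randomized) == row_sums(real))        # prints True
print(column_sums(randomized) == column_sums(real))  # prints True
```

Each accepted swap moves one 1 within each of two rows and exchanges the two columns involved, so every row sum and column sum is unchanged no matter how many swaps succeed.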
One more requirement:
If we are looking for n-node subgraphs, then the number of (n-1)-node subgraphs must be the same in the real and randomized networks
This is done to avoid assigning high significance to a structure only because it includes a highly significant substructure.
Significance of a motif
• Three requirements
– P < 0.01. P was estimated (or bounded) by using 1000 randomized networks.
– The number of times it appears in the real network with distinct sets of nodes is at least U = 4.
– The number of appearances in the real network is significantly larger than in the randomized networks: Nreal - Nrand > 0.1 Nrand (Why??).
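The three requirements can be combined into one check. This is a hedged sketch: it assumes P is estimated as the fraction of randomized networks with at least as many appearances as the real network, and that Nrand is the mean count over the randomized networks, neither of which is spelled out above. The function and parameter names are hypothetical.

```python
def is_significant_motif(n_real, n_rand_counts, n_distinct,
                         p_cutoff=0.01, min_distinct=4, min_excess=0.1):
    """Apply the three requirements to one candidate subgraph.

    n_real        -- appearances in the real network
    n_rand_counts -- appearances in each randomized network (e.g. 1000 values)
    n_distinct    -- appearances in the real network on distinct sets of nodes
    """
    # Assumption: empirical estimate of P from the randomized ensemble.
    p = sum(1 for c in n_rand_counts if c >= n_real) / len(n_rand_counts)
    # Assumption: Nrand is taken as the ensemble mean.
    n_rand = sum(n_rand_counts) / len(n_rand_counts)
    return (
        p < p_cutoff
        and n_distinct >= min_distinct
        and n_real - n_rand > min_excess * n_rand
    )

# Made-up counts, for illustration only.
print(is_significant_motif(40, [7] * 1000, n_distinct=40))  # prints True
print(is_significant_motif(40, [7] * 1000, n_distinct=3))   # prints False
```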
What did they find?
• In biological systems such as E. coli or S. cerevisiae, only certain types of motifs are statistically significant.
• When they studied other systems, such as:
– Food webs: a database of seven ecosystem food webs
– Neuronal networks: the neural system of C. elegans
– WWW
OTHER KINDS OF MOTIFS WERE STATISTICALLY SIGNIFICANT
FFL (feed-forward loop)
SIM (single-input module)
DOR (dense overlapping regulons)
FFL
• Biological example
– The L-arabinose utilization system
– Crp is the general transcription factor and AraC the specific transcription factor.
The real model
FFL
• Coherent
• Incoherent
• Important for the speed of response
Software
mDraw: network visualization tool (mfinder and network motifs visualization tool embedded)