ARACNE: An Algorithm for the Reconstruction of Gene Regulatory Networks in a Mammalian Cellular Context
Adam A. Margolin, Ilya Nemenman, Katia Basso, Chris Wiggins, Gustavo Stolovitzky, Riccardo Dalla-Favera and Andrea Califano

Reverse engineering of regulatory networks in human B cells
Katia Basso, Adam Margolin, Gustavo Stolovitzky, Ulf Klein, Riccardo Dalla-Favera, and Andrea Califano
550.635 TOPICS IN BIOINFORMATICS
Instructor: Dr. Donald Geman
Student: Francisco Sánchez Vega
Baltimore, 28 November 2006
THE JOHNS HOPKINS UNIVERSITY
http://www.cis.jhu.edu/~sanchez/ARACNE_635_sanchez.ppt
Outline
• A very short introduction to MRFs
• The ARACNE approach
  • Theoretical framework
  • Technical implementation
• Experiments and results
  • Synthetic data
  • Human B cells (paper by Basso et al.)
• Discussion
What is a Markov Random Field?
• Intuitive idea: concept of Markov chain

X_t0 → X_t1 → … → X_(t-1) → X_t → X_(t+1) → …

P(X_t | X_t0, X_t1, …, X_(t-1)) = P(X_t | X_(t-1))
A very short introduction to MRFs
" The world is a huge Markov chain "
(Ronald A. Howard)
What is a Markov Random Field?
• A Markov random field is a model of the joint probability distribution over a set X of random variables.
• It generalizes Markov chains to a process indexed by the sites of a graph, satisfying a Markov property with respect to a neighborhood system.
A very short introduction to MRFs
MRF: a constructive definition
• Let X={X1,…,XN} be the set of variables (the stochastic process) whose JPD we want to model.
• Start by considering a set of sites or vertices V={V1,…,VN}.
• Define a neighborhood system: 𝒩 = {𝒩v, v ∈ V}

where

(1) 𝒩v ⊆ V
(2) v ∉ 𝒩v
(3) v ∈ 𝒩u ⇔ u ∈ 𝒩v
A very short introduction to MRFs
MRF: a constructive definition
• Example: “N times N” square lattice

V = {(i,j): 1 ≤ i,j ≤ N}    𝒩(i,j) = {(k,l) ∈ V: 1 ≤ (i-k)² + (l-j)² ≤ c}

c=1    c=2    c=8
A very short introduction to MRFs
MRF: a constructive definition
• At this point, we can build an undirected graph G=(V, E)
• Each vertex v ∈ V is associated with one of the random variables in X
• The set of edges E is given by the chosen neighborhood system.
c=1 c=2
A very short introduction to MRFs
MRF: a constructive definition
• Clique: induced complete subgraph
c=1
c=2
A very short introduction to MRFs
MRF: a constructive definition
• Imagine that each variable Xv in X={X1,…,XN} can take one of a finite number Lv of values:
  Xv ∈ {0, 1, …, Lv−1},  v ∈ V
• Define the configuration space S as
  S = ∏_{v∈V} {0, 1, …, Lv−1}
• We say that X is a MRF on (V, 𝒩) if
  (a) P(X=x) > 0, for all x ∈ S
  (b) P(Xv=xv | Xu=xu, u ≠ v) = P(Xv=xv | Xu=xu, u ∈ 𝒩v), for all v ∈ V
A very short introduction to MRFs
MRF: a constructive definition
A very short introduction to MRFs
c=1 c=2
• Let us now associate a specific function, that we will call potential, to each clique c ∈ C, so that:
  Φc(x) = Φc(x') whenever xv = x'v for all v ∈ c
• We say that a probability distribution p ∈ PS is a Gibbs distribution wrt (V, 𝒩) if it is of the form:
  p(x) = Z⁻¹ exp( Σ_{c∈C} Φc(x) )
where Z is the partition function:
  Z = Σ_{x∈S} exp( Σ_{c∈C} Φc(x) )
A very short introduction to MRFs
Correspondence Theorem (Hammersley-Clifford)

X is a MRF with respect to (V, 𝒩) if and only if

p(x)=P(X=x) is a Gibbs distribution with respect to (V, 𝒩)
(i.e. every given Markov Random Field has an associated Gibbs distribution and every given Gibbs distribution has an associated Markov Random Field).
A very short introduction to MRFs
Outline
• A very short introduction to MRFs
• The ARACNE approach
  • Theoretical framework
  • Technical implementation
• Experiments and results
  • Synthetic data
  • Human B cells (paper by Basso et al.)
• Discussion
Microarray data
Graphical model
Theoretical framework
The name of the game
• Need of a strategy to deal with uncertainty under small samples:
Maximum entropy models
• Philosophical basis:
  • Model all that is known and assume nothing about that which is unknown.
  • Given a certain dataset, choose a model which is consistent with it, but otherwise as “uniform” as possible.
Theoretical framework
MaxEnt: toy example (adapted from A. Berger)
• Paul, Mary, Jane, Bryan and Emily are five grad students who work in the same research lab.
• Let us try to model the probability of the discrete random variable X ≡ “first person to arrive at the lab in the morning”.
p(X=Paul)+p(X=Mary)+p(X=Jane)+p(X=Bryan)+p(X=Emily)=1
• If we do not know anything about them:
p(X=Paul) = 1/5
p(X=Mary) = 1/5
p(X=Jane) = 1/5
p(X=Bryan) = 1/5
p(X=Emily) = 1/5
The most “uniform” model is the one that maximizes the entropy
  H(p(X)) = −E[log p(X)]
Theoretical framework
MaxEnt: toy example (adapted from A. Berger)
• Imagine that we know that Mary or Jane are the first ones to arrive 30% of the time.
• In that case, we know:
p(X=Paul)+p(X=Mary)+p(X=Jane)+p(X=Bryan)+p(X=Emily)=1
p(X=Mary)+p(X=Jane)=3/10
• Again, we may want to choose the most “uniform” model, although this time we need to respect the new constraint:
p(X=Paul) = (7/10)(1/3) = 7/30
p(X=Mary) = (3/10)(1/2) = 3/20
p(X=Jane) = (3/10)(1/2) = 3/20
p(X=Bryan) = (7/10)(1/3) = 7/30
p(X=Emily) = (7/10)(1/3) = 7/30
Maximize H(P(X)) under the given constraints
Theoretical framework
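As a numerical sketch (not from the slides), the same answer can be recovered by exploiting the Gibbs form of the MaxEnt solution, p(x) ∝ exp(λ·φ(x)) with φ the indicator of {Mary, Jane}, and solving for the single Lagrange multiplier λ by bisection:

```python
import math

# MaxEnt model for the toy example: five outcomes, one constraint
# p(Mary) + p(Jane) = 3/10. The solution has the Gibbs form
# p(x) = exp(lam * phi(x)) / Z, with phi = 1 on {Mary, Jane}, 0 elsewhere.
people = ["Paul", "Mary", "Jane", "Bryan", "Emily"]
phi = {q: 1.0 if q in ("Mary", "Jane") else 0.0 for q in people}

def gibbs(lam):
    """Distribution p(x) = exp(lam * phi(x)) / Z."""
    w = {q: math.exp(lam * phi[q]) for q in people}
    Z = sum(w.values())
    return {q: w[q] / Z for q in people}

def constraint_gap(lam):
    # how far the model is from satisfying p(Mary) + p(Jane) = 0.3
    d = gibbs(lam)
    return d["Mary"] + d["Jane"] - 0.3

# The gap is monotone increasing in lam, so bisection finds the root.
lo, hi = -10.0, 10.0
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if constraint_gap(mid) < 0:
        lo = mid
    else:
        hi = mid
p = gibbs(0.5 * (lo + hi))
# matches the closed-form answer on the slide: 7/30 outside, 3/20 inside
```

The closed-form solution here is e^λ = 9/14, i.e. the constrained group is uniformly down-weighted and the rest uniformly up-weighted, exactly as computed by hand on the slide.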
MaxEnt extensions ~ constrained optimization problem

• S: countable set
• PS: set of prob. measures on S, PS = {p(x): x ∈ S}.
• Ω: set of K constraints for the problem; PΩ: subset of PS that satisfies all the constraints in Ω
• Let {φ1,…,φK} be a set of K functions S → R [feature functions]
• We choose these functions so that each given constraint can be expressed as:
  Σ_{x∈S} φm(x) p(x) = λm,  m = 1, …, K
• If x1,…,xM are samples from a r.v. X, then E[φm(X)] = λm can be estimated as
  λ̂m = (1/M) Σ_{i=1}^M φm(xi)

Theoretical framework
MaxEnt extensions ~ constrained optimization problem
• Ex. Let x be a sample from X ≡ “person to arrive first in the lab”

In this case, S = {Paul, Mary, Jane, Bryan, Emily}
Ω = {ω1}, with ω1 ≡ p(X=Mary)+p(X=Jane) = 3/10

• We can model ω1 as follows:

Define φ1(x) = 1 if x=Mary or x=Jane, 0 otherwise

So that λ1 = 3/10 and
  E[φ1(X)] = Σ_{x∈S} φ1(x) p(x) = 3/10

Theoretical framework
MaxEnt extensions ~ constrained optimization problem
Theoretical framework
Find p*(x) = arg max_{p∈PΩ} H(p(x))

where PΩ is the subset of PS that satisfies all the constraints in Ω:
  Σ_{x∈S} φm(x) p(x) = λm,  m = 1, …, K

Of course, since p(x) is a probability measure:
  0 ≤ p(x) ≤ 1,  Σ_{x∈S} p(x) = 1
• Use Lagrange multipliers:
  A(p; λ0,…,λK) = −Σ_{x∈S} p(x) log p(x) + Σ_{i=1}^K λi Σ_{x∈S} φi(x) p(x) + λ0 Σ_{x∈S} p(x)
  (the first term is H(p); the remaining terms enforce the constraints)

Theoretical framework
• This leads to a solution of the form:
  p*(x) = exp( Σ_{i=0}^K λi φi(x) ) / Σ_{x∈S} exp( Σ_{i=0}^K λi φi(x) ) = Z⁻¹ exp( Σ_{i=0}^K λi φi(x) )

But this is a Gibbs distribution, and therefore, by theorem, we can think in terms of the underlying Markov Random Field !!

- We can profit from previous knowledge of MRFs
- We have found the graphical model paradigm that we were looking for!

Theoretical framework
Microarray data
Graphical model
Theoretical framework
Technicalimplementation
The name of the game
• Consider the discrete, binary case.

Approximation of the interaction structure

First order constraints (one per gene, N in total):
  φi(x) = 1 if xi = 1, 0 if xi = 0
  λ̂i = (1/M) Σ_{k=1}^M φi(x(k)) = Ê[φi(x)],  i = 1, …, N

[Diagram: genes X1, …, X5 as nodes]
• Consider the discrete, binary case.

Approximation of the interaction structure

Second order constraints (4·C(N,2) in total):
  φi,j,[a,b](x) = 1 if xi = a and xj = b, 0 otherwise,  for (a,b) ∈ {0,1}²
  λ̂i,j,[a,b] = (1/M) Σ_{k=1}^M I(xi(k) = a, xj(k) = b)
• Consider the discrete, binary case.

Approximation of the interaction structure

For jth order: 2^j · C(N,j) constraints
• The higher the order, the more accurate our approximation will be…
… provided we observe enough data !!
(from The Elements of Statistical Learning, by Hastie et al.)
Approximation of the interaction structure
“…for M → ∞ (where M is sample set size) the complete form of the JPD is restored. In fact, M > 100 is generally sufficient to estimate 2-way marginals in genomics problems, while P(gi, gj, gk) requires about an order of magnitude more samples…”
(from Dr. Munos' lectures, Master MVA)
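The combinatorial growth in the number of constraints with the interaction order, which drives these sample-size requirements, can be tabulated directly (a sketch; the helper name is illustrative, and note that for j = 1 only N of the 2N constraints are independent once normalization is taken into account):

```python
from math import comb

def n_constraints(N, j):
    """Number of j-th order marginal constraints for N binary genes:
    choose which j genes interact, times the 2**j joint configurations."""
    return comb(N, j) * 2 ** j

# growth for a modest 20-gene network
counts = {j: n_constraints(20, j) for j in (1, 2, 3)}
# first order: 20*2 = 40, second order: 190*4 = 760, third order: 1140*8 = 9120
```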
• Therefore, the model they adopt is of the form:
  P({gi}) = (1/Z) exp( −Σ_{i=1}^N φi(gi) − Σ_{i,j} φij(gi, gj) )

• All genes for which φij = 0 are declared mutually non-interacting:
  • Some of these genes are statistically independent, i.e. P(gi, gj) ≈ P(gi)P(gj)
  • Some are genes that do not interact directly but are statistically dependent due to their interaction via other genes, i.e. P(gi, gj) ≠ P(gi)P(gj), but φij = 0

How can we extract this information ??
Approximation of the interaction structure
• Therefore, the model they adopt is of the form:
  P({gi}) = (1/Z) exp( −Σ_{i=1}^N φi(gi) − Σ_{i,j} φij(gi, gj) )

Approximation of the interaction structure

• The correspondence between φij = 0 and P(gi, gj) = P(gi)P(gj) is not perfect. Three cases arise:
  • Pairs of genes who interact through a third gene (φij = 0, yet statistically dependent)
  • Pairs of genes whose direct interaction is balanced out by a third gene (φij ≠ 0, yet apparently independent)
  • Pairs of genes for which the MI is a rightful indicator of dependency
• The mutual information between two random variables is a measure of the amount of information that one random variable contains about another.
• It is defined as the relative entropy (or Kullback-Leibler distance) between the joint distribution and the product of the marginals:
  I(X,Y) = D( p(x,y) || p(x)p(y) ) = Σ_{x,y} p(x,y) log [ p(x,y) / (p(x)p(y)) ]
• Alternative formulations:
  I(X,Y) = S(X) + S(Y) − S(X,Y) = S(X) − S(X|Y)
Mutual Information
Another toy example

X Y
1 0
1 1
0 0
0 0
0 1
1 1
1 0
0 1
0 1
0 1
P(X=0)=6/10; P(X=1)=4/10
P(Y=0)=4/10; P(Y=1)=6/10
P(X=0,Y=0)=2/10; P(X=0,Y=1)=4/10; P(X=1,Y=0)=2/10; P(X=1,Y=1)=2/10
I(X,Y) = Σ_{x,y} p(x,y) log [ p(x,y) / (p(x)p(y)) ]
       = 0.2·log(0.2/(0.6·0.4)) + 0.4·log(0.4/(0.6·0.6)) + 0.2·log(0.2/(0.4·0.4)) + 0.2·log(0.2/(0.4·0.6))
       ≈ 0.0138 nats
1
0
1
0
1
2
3
4
XY
0 10
1
2
3
4
5
6
7
8
Y0 1
0
1
2
3
4
5
6
7
8
X
P(X) P(Y)
P(X,Y)
Mutual Information estimation
The Gaussian kernel estimator:
  f̂(z) = (1/M) Σ_i (1/h²) G( |z − zi| / h )
Another toy example

Mutual Information estimation
  f̂(z) = (1/M) Σ_i (1/h²) G( |z − zi| / h )
  Î({xi},{yi}) = (1/M) Σ_i log [ f̂(xi, yi) / (f̂(xi) f̂(yi)) ]
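A pure-Python sketch of this kernel MI estimator (the helper names, the toy data and the bandwidth h = 0.25 are assumptions for illustration, not values from the papers):

```python
import math

def gauss1(u, h):
    # univariate Gaussian kernel of width h
    return math.exp(-u * u / (2 * h * h)) / (h * math.sqrt(2 * math.pi))

def f1(z, data, h):
    # univariate KDE: f(z) = (1/M) sum_i G((z - z_i)/h)
    return sum(gauss1(z - zi, h) for zi in data) / len(data)

def f2(x, y, xs, ys, h):
    # bivariate KDE with a product Gaussian kernel
    return sum(gauss1(x - xi, h) * gauss1(y - yi, h)
               for xi, yi in zip(xs, ys)) / len(xs)

def mi_kde(xs, ys, h=0.25):
    # I ≈ (1/M) sum_i log[ f(x_i, y_i) / (f(x_i) f(y_i)) ]
    m = len(xs)
    return sum(math.log(f2(xs[i], ys[i], xs, ys, h) /
                        (f1(xs[i], xs, h) * f1(ys[i], ys, h)))
               for i in range(m)) / m

xs = [0.1 * i for i in range(20)]                 # deterministic toy data
ys_dep = [x + 0.05 for x in xs]                   # strongly dependent with xs
ys_ind = [xs[(7 * i + 3) % 20] + 0.05 for i in range(20)]  # scrambled order
# the estimate should be clearly larger for the dependent pair
```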
[Kernel density estimates f(x), f(y) and f(x,y) for the toy example, shown alongside the empirical P(X,Y)]

Another toy example
Mutual Information estimation
[Effect of the kernel width on the density estimate: reference (h = 1), oversmoothed (h′ = 4h) and undersmoothed (h′′ = h/2)]

Another toy example
Mutual Information estimation
• Every pair of variables is copula-transformed:
  (X1, X2) → [ FX1(X1), FX2(X2) ],  where FX(x) = P(X ≤ x)
• By the Probability Integral Transform theorem, we know that FX(X) ~ U[0,1].
• Thus, the resulting variables range between 0 and 1, and their marginals are uniform.
• The transformation is one-to-one ⇒ H and MI are unaffected.
• It reduces the influence of arbitrary transformations from microarray data preprocessing.
• No need for position-dependent kernel widths (h).
Mutual Information estimation
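A minimal rank-based sketch of the copula transform, using the empirical CDF computed from the sample itself (the function name and the toy expression values are illustrative):

```python
def copula_transform(values):
    """Empirical-CDF (rank) transform: x -> F_X(x) = P(X <= x),
    estimated from the sample itself. Tied values share the same rank."""
    n = len(values)
    sorted_vals = sorted(values)
    # F(x) = (number of sample points <= x) / n
    return [sum(1 for s in sorted_vals if s <= v) / n for v in values]

expr = [5.2, 0.3, 9.9, 1.1, 4.4, 7.6, 2.8, 6.1]   # hypothetical expression values
u = copula_transform(expr)
# for distinct values the marginal is uniform on {1/8, 2/8, ..., 1},
# and the ordering of the samples is preserved
```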
• The choice of h is critical for the accuracy of the MI estimate, but not so important for estimation of MI ranks.
Mutual Information estimation
• At this point, we have a procedure to construct an undirected graph from our data.
• In an “ideal world” (infinite samples, assumptions hold), we would be done.
• Unfortunately, things are not so simple !
Technical implementation
• Finite random samples
• Possible higher-order interactions
For each pair:
• H0: the two genes are mutually independent (no edge)
• HA: the two genes interact with each other (edge)
• We reject the null hypothesis H0 (i.e. we “draw an edge”) when the MI between the two genes is large enough.
Need to choose a statistical threshold I0
Use hypothesis testing
Technical implementation
• Chosen approach: Random permutation analysis
  • Randomly shuffle gene expression values and labels

Choice of the statistical threshold

          Sample 1  Sample 2  Sample 3  …  Sample M
Gene 1    g11       g12       g13       …  g1M
Gene 2    g21       g22       g23       …  g2M
…         …         …         …         …  …
Gene N    gN1       gN2       gN3       …  gNM
• Chosen approach: Random permutation analysis
  • Randomly shuffle gene expression values and labels
• The resulting variables should be mutually independent, but their estimated MI need not be equal to zero.
• Each threshold I0 can then be assigned a p-value (the probability of getting an MI value greater than or equal to I0 just “by chance”).
• Thus, by fixing a p-value, we obtain the desired I0.
Choice of the statistical threshold
[Histogram of MI values between permuted (independent) gene pairs: number of pairs vs. MI value, with the threshold I0 marked on the tail]
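The permutation scheme can be sketched as follows (the number of permutations, the seed and α = 0.05 are illustrative choices, not the paper's; `discrete_mi` is a plug-in estimator over discretized values):

```python
import math
import random
from collections import Counter

def discrete_mi(xs, ys):
    """Plug-in MI (nats) between two discrete sequences."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum((c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def mi_threshold(xs, ys, n_perm=500, alpha=0.05, seed=0):
    """Null distribution of MI under independence, obtained by shuffling
    one variable; I0 is the empirical (1 - alpha) quantile."""
    rng = random.Random(seed)
    ys = list(ys)
    null = []
    for _ in range(n_perm):
        rng.shuffle(ys)                  # destroys any real dependence
        null.append(discrete_mi(xs, ys))
    null.sort()
    return null[int((1 - alpha) * n_perm) - 1]

xs = [i % 2 for i in range(40)]          # binary "gene 1"
ys = xs[:]                               # perfectly coupled "gene 2"
I0 = mi_threshold(xs, ys)
# the true MI (ln 2 ≈ 0.693 nats) should comfortably exceed the threshold
```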
• Chosen approach: Random permutation analysis
Parameter fitting: use a known fact from large deviation theory
  p-value = p(I ≥ I0 | H0) ≈ e^(−c·M·I0)
(the null-tail probability decays exponentially in the sample size M and the threshold I0; c is a fitted constant)

Choice of the statistical threshold
• If we have a real distribution of the type:
  P({gi}) = (1/Z) exp( −Σ_i φi(gi) − Σ_{i,j} φij(gi,gj) − Σ_{i,j,k} φijk(gi,gj,gk) )

ARACNe will get it wrong, because of our pairwise assumption

[Diagram: real network vs. ARACNe's output, where edges created by a genuine three-way potential (with all pairwise φ = 0) are missed]

The extension of the algorithm to higher order interactions is a possible object of future research.
Technical implementation
• If we have a real distribution of the type:
  P({gi}) = (1/Z) exp( −Σ_i φi(gi) − φij(gi,gj) − φjk(gj,gk) − φik(gi,gk) )
with φij ≠ 0 but Iij = 0,

ARACNE will not identify the edge between gi and gj

[Diagram: real network vs. ARACNe's output, with the gi–gj edge missing]

However, this situation is considered “biologically unrealistic”
Technical implementation
• If we have a real distribution of the type:
  P({gi}) = (1/Z) exp( −Σ_i φi(gi) − φjk(gj,gk) − φik(gi,gk) )
with φij = 0 but Iij ≠ 0,

As it is, ARACNE will put an edge between gi and gj

[Diagram: real network vs. ARACNe's output, with a spurious gi–gj edge added]

The algorithm can be improved using the Data Processing Inequality
Technical implementation
• The DPI is a well known theorem within the Information Theory community.

Let X, Y, Z be 3 random variables that form a Markov chain X–Y–Z. Then
  I(X;Y) ≥ I(X;Z)

Proof. I(X;Y,Z) = I(X;Z) + I(X;Y|Z) = I(X;Y) + I(X;Z|Y). X and Z are conditionally independent given Y ⇒ I(X;Z|Y) = 0. Thus, I(X;Y) = I(X;Z) + I(X;Y|Z) ≥ I(X;Z), since I(X;Y|Z) ≥ 0.

Similarly, we can prove I(Y;Z) ≥ I(X;Z), and therefore:
  I(X;Z) ≤ min[ I(X;Y), I(Y;Z) ]
The Data Processing Inequality
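The inequality can be verified numerically on a small binary chain (a sketch; the flip probabilities 0.1 and 0.2 defining the two channels are arbitrary choices):

```python
import math
from collections import Counter
from itertools import product

# Markov chain X - Y - Z: X is a fair coin, Y flips X with prob 0.1,
# Z flips Y with prob 0.2. Enumerate the exact joint distribution.
def joint_xyz():
    p = {}
    for x, fy, fz in product((0, 1), repeat=3):
        y = x ^ fy                        # fy = 1 means "Y flipped"
        z = y ^ fz                        # fz = 1 means "Z flipped"
        prob = 0.5 * (0.9 if fy == 0 else 0.1) * (0.8 if fz == 0 else 0.2)
        p[(x, y, z)] = p.get((x, y, z), 0.0) + prob
    return p

def mi(p2):
    """Exact MI (nats) from a joint distribution over pairs."""
    pa, pb = Counter(), Counter()
    for (a, b), q in p2.items():
        pa[a] += q
        pb[b] += q
    return sum(q * math.log(q / (pa[a] * pb[b]))
               for (a, b), q in p2.items() if q > 0)

p = joint_xyz()

def marginal(pair):
    out = {}
    for (x, y, z), q in p.items():
        key = (x, y) if pair == "xy" else ((y, z) if pair == "yz" else (x, z))
        out[key] = out.get(key, 0.0) + q
    return out

Ixy, Iyz, Ixz = mi(marginal("xy")), mi(marginal("yz")), mi(marginal("xz"))
# DPI: I(X;Z) <= min(I(X;Y), I(Y;Z))
```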
• A consequence of the DPI is that no transformation of Y, however clever, can increase the information that Y contains about X.

(consider the Markov chain of the form X–Y–[Z=g(Y)])

• From an intuitive point of view:

[Cartoon: a message relayed from Mike to Francisco to Jean-Paul degrades at each step]
The Data Processing Inequality
• Going back to our case of study:
  P({gi}) = (1/Z) exp( −Σ_i φi(gi) − Σ_{i,j} φij(gi,gj) )

[Diagram: triangle gi–gj–gk with pairwise MI values 0.3, 0.2 and 0.1; the weakest edge is removed]

So I assume that φij = 0, even though P(gi,gj) ≠ P(gi)P(gj)
The Data Processing Inequality
• But, what if the underlying network is truly a three-gene loop?

[Diagram: three-gene loop gi–gj–gk]

…Then ARACNE will break the loop at the weakest edge!
Philosophy: “An interaction is retained iff there exists no alternate path that provides a better explanation for the information exchange between the two genes”
Claim: In practice, looking at the TP vs. FP tradeoff, it pays to simplify
(known flaw of the algorithm)
The Data Processing Inequality
• Theorem 1. If MIs can be estimated with no errors, then ARACNE reconstructs the underlying interaction network exactly, provided this network is a tree and has only pairwise interactions.
• Theorem 2. The Chow-Liu (CL) maximum mutual information tree is a subnetwork of the network reconstructed by ARACNE.
• Theorem 3. Let πik be the set of nodes forming the shortest path in the network between nodes i and k. Then, if MIs can be estimated without errors, ARACNE reconstructs an interaction network without false positive edges, provided: (a) the network consists only of pairwise interactions, (b) for each j ∈ πik, Iij ≥ Iik. Further, ARACNE does not produce any false negatives, and the network reconstruction is exact iff (c) for each directly connected pair (ij) and for any other node k, we have Iij ≥ min(Ijk, Iik).
The Data Processing Inequality
• Proof of theorem 1.
(a) MIs can be estimated with no errors
(b) Network is a tree
(c) Only pairwise interactions
⇒ ARACNE reconstructs the true network without errors

- (c) ⇒ no problem with higher order interactions
- (a) ⇒ the blue area boundary is ok
- (b) ⇒ the red area is contained in the yellow area
- (a), (b) and the DPI ⇒ the yellow area is ok (every edge with φij = 0 is removed, and only edges with φij = 0 are removed)

The Data Processing Inequality
The Chow-Liu Maximum Entropy tree (1968)

• Method for approximating the JPD of a set of discrete variables using products of distributions involving no more than a pair of variables.

• The Chow-Liu method approximates a distribution P(x) by T(x), a tree structured MRF.

• They proved that minimizing the Kullback-Leibler distance between P and T amounts to maximizing the total mutual information over the edges of T.
The Data Processing Inequality
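The Chow-Liu construction therefore reduces to a maximum-weight spanning tree over pairwise MIs. A Kruskal-style sketch with a union-find (the MI values in the toy table are made up):

```python
def chow_liu_tree(mi):
    """Maximum-weight spanning tree over pairwise MI values
    (Kruskal with union-find), as in the Chow-Liu construction."""
    nodes = set()
    for i, j in mi:
        nodes.update((i, j))
    parent = {v: v for v in nodes}

    def find(v):
        # union-find root lookup with path halving
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    # greedily take the highest-MI edges that do not close a cycle
    for (i, j), w in sorted(mi.items(), key=lambda kv: -kv[1]):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
    return tree

# toy MI table over four genes (hypothetical numbers)
mi = {("a", "b"): 0.9, ("b", "c"): 0.7, ("a", "c"): 0.5,
      ("c", "d"): 0.4, ("b", "d"): 0.1}
tree = chow_liu_tree(mi)
```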
• In practice, the variance of the MI estimator may lead to the use of a tolerance ε, so that the DPI pruning condition becomes:
  remove edge (i,j) if Iij ≤ Iik (1 − ε)
The Data Processing Inequality
The final algorithm

1. Choose a threshold I0
2. Compute all pairwise MIs
3. Draw an edge when MI ≥ I0
4. Look at all the three-gene loops and prune the edge with the lowest MI

Technical implementation: summary
C(N,2) pairs × M samples; at most C(N,3) triplets

Overall complexity: O(N²M² + N³)

  f̂(z) = (1/M) Σ_i (1/h²) G( |z − zi| / h )
  Î({xi},{yi}) = (1/M) Σ_i log [ f̂(xi, yi) / (f̂(xi) f̂(yi)) ]
Technical implementation: summary
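The four steps above can be sketched end-to-end on a precomputed MI table (a simplified sketch, not the authors' implementation; names and numbers are illustrative, and the DPI check is applied to all triangles of the thresholded graph at once):

```python
def aracne(mi, i0, eps=0.0):
    """ARACNE pruning sketch on a precomputed MI dict {(i, j): value}:
    steps 1-3 keep edges with MI >= i0; step 4 applies the DPI
    (with tolerance eps) to every three-gene loop."""
    def get(a, b):
        return mi.get((a, b), mi.get((b, a), 0.0))

    edges = {e for e, v in mi.items() if v >= i0}     # thresholded graph
    nodes = {n for e in mi for n in e}
    keep = set(edges)
    for i, j in edges:
        for k in nodes - {i, j}:
            # is i-k-j an alternate path in the thresholded graph?
            if ((i, k) in edges or (k, i) in edges) and \
               ((k, j) in edges or (j, k) in edges):
                # DPI: the direct edge must beat the weaker leg of the path
                if get(i, j) < min(get(i, k), get(k, j)) * (1 - eps):
                    keep.discard((i, j))
    return keep

# triangle: the weakest edge (g1, g3) is explained away by the path via g2
mi = {("g1", "g2"): 0.30, ("g2", "g3"): 0.20, ("g1", "g3"): 0.10}
net = aracne(mi, i0=0.05)
```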
Outline
• A very short introduction to MRFs
• The ARACNE approach
  • Theoretical framework
  • Technical implementation
• Experiments and results
  • Synthetic data
  • Human B cells (paper by Basso et al.)
• Discussion
• Two different sets of experiments are presented:
(a) Reconstruction of synthetic networks
(b) Reconstruction of a human B lymphocyte genetic network from microarray data.

• ARACNE’s performance is compared to that of two competing methodologies:
  • Bayesian networks
  • Relevance networks
Experiments and results
• Model proposed by Mendes et al. as a platform for comparison of reverse engineering algorithms.
• Simplification of real biological networks…
• …but reasonably complex to model some aspects of transcriptional regulation.
“An algorithm that does not perform well on this model is unlikely to perform well in a more complex case”
Synthetic data: AGNs
• Transcriptional interactions are approximated by kinetic rate equations [shown as a figure in the original slides], where

  xi: level of expression of the i-th gene
  Ni: number of upstream inhibitors
  NA: number of activators
  Ij: concentration of the j-th upstream inhibitor
  Al: concentration of the l-th activator
Synthetic data: AGNs
• Transcriptional interactions are approximated by the same rate equations.

• In order to generate M samples, the parameters for gene i in imaginary microarray k are:
  ai,k = ai·θi,k,  bi,k = bi·ηi,k

where ai, bi are some original constant values of the parameters and θi,k, ηi,k are uniform random variables in [0,2].

This simulates the sampling of a population of distinct phenotypes at random time points, where the efficiency of biochemical reactions may also be distinct.
Synthetic data: AGNs
• Two possible network topologies are considered:
Erdös-Rényi
Each vertex of the graph is equally likely to be connected to any other vertex.
i.e. The presence of an edge between each possible pair of genes is modeled as a Bernoulli random variable with parameter p.
Synthetic data: AGNs
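The Erdös-Rényi topology described above can be sampled directly from its definition (a sketch; n, p and the seed are illustrative parameters):

```python
import random

def erdos_renyi(n, p, seed=0):
    """Each possible edge (i, j), i < j, is present independently with
    probability p: one Bernoulli(p) draw per pair of vertices."""
    rng = random.Random(seed)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < p]

edges = erdos_renyi(50, 0.1)
# expected edge count: p * C(50, 2) = 0.1 * 1225 = 122.5
```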
• Two possible network topologies are considered:
Scale-free

The distribution of the number of connections k associated with each vertex follows a power law:

p(k) ~ k^(−γ),  γ > 0

This motivates the appearance of large interaction hubs.
Synthetic data: AGNs
• Two possible network topologies are considered:
Erdös-Rényi Scale-free
Synthetic data: AGNs
• Two performance indicators:
• Recall: N_TP / (N_TP + N_FN)

(fraction of the true edges that are identified)

• Precision: N_TP / (N_TP + N_FP)

(fraction of the identified edges that are true)
Synthetic data: AGNs
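Both indicators are one-liners over edge sets (a sketch with made-up edge lists; edges are treated as undirected, so pair order is normalized first):

```python
def recall_precision(true_edges, found_edges):
    """Recall = N_TP / (N_TP + N_FN); Precision = N_TP / (N_TP + N_FP),
    computed on undirected edges (order-insensitive pairs)."""
    norm = lambda e: tuple(sorted(e))
    t = {norm(e) for e in true_edges}
    f = {norm(e) for e in found_edges}
    tp = len(t & f)                      # true positives
    recall = tp / len(t) if t else 0.0
    precision = tp / len(f) if f else 0.0
    return recall, precision

true_net = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
inferred = [("b", "a"), ("b", "c"), ("a", "c")]
r, p = recall_precision(true_net, inferred)   # r = 2/4, p = 2/3
```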
• ARACNE outperforms its competitors for sufficiently small choices of p-values.

• When the p-value is not small enough, MI values that are not statistically significant are accepted and the algorithm's performance crashes.
Synthetic data: AGNs
(Erdös-Rényi) (Scale-free)
• ARACNE’s performance depends on MI being high for directly interacting genes and decreasing rapidly with the interaction distance.
Synthetic data: AGNs
• Performance is stable in terms of kernel width as long as it is not too narrow.
Synthetic data: AGNs
• Better performance for Erdös-Rényi (fewer loops)
• High precision and substantial recall, even for small samples
Synthetic data: AGNs
• Results for the case of human B cells are described in detail in the second paper:

“Reverse engineering of regulatory networks in human B cells”
Basso et al., Nature Genetics, 37, 382-390, 2005
Real data: Human B cells
• Large gene expression datasets such as those derived from systematic perturbations to simple organisms (yeast) are not that easily obtained for mammalian cells.
• The more complex the organism, the more difficult it is to distinguish between physical and functional interactions.
Real data: Human B cells
• The authors assume that an equivalent dynamic richness can be obtained for a given type of human cell.

• They choose to work with 340 B lymphocyte expression profiles derived from normal, tumor-related and experimentally manipulated populations.

• They feed the data to ARACNE, which generates a regulatory network containing approx. 129,000 interactions.
Real data: Human B cells
• Interesting observation:
  • The results show a power-law tail in the relationship between the number of genes n and their number of interactions.
  • This is suggestive of a scale-free underlying network structure.
  • Important because the evidence of scale-free topology in higher-order eukaryotic cells is still scarce.
Real data: Human B cells
• They use Gene Ontology to analyze the biological processes affected by the most relevant (top 5%) hubs.

• Focus on the c-MYC proto-oncogene:
  • Because it is well characterized as a transcription factor, which helps validate the results.
  • Subnetwork with 2,063 genes (56 directly connected to MYC).

• The network is handicapped by certain limitations:
  • The edges are undirected
  • Some direct connections are due to intermediates which are not even represented on the microarray
  • Some direct relations are incorrectly removed by the DPI
Real data: Human B cells
• Some results of interest can be summarized as follows:
  • 29/56 (51.8%) of the first neighbors are presumed “correct” (either reported in the literature or ChIP-validated in the lab)
  • This is statistically significant wrt the expected 11% of background c-MYC targets among randomly selected genes.
  • Furthermore, c-MYC target genes are significantly more enriched among first neighbors than among second neighbors (51.8% vs 19.4%).
Real data: Human B cells
Outline
• A very short introduction to MRFs
• The ARACNE approach
  • Theoretical framework
  • Technical implementation
• Experiments and results
  • Synthetic data
  • Human B cells (paper by Basso et al.)
• Discussion
• ARACNE provides a provably exact network reconstruction under a controlled set of approximations.

• Some limitations:
  • ARACNE opens all three-gene loops along the weakest edge (i.e. false negatives for triplets of truly interacting genes)
  • Only statistical dependencies expressed as pairwise interaction potentials can be inferred (solution: extend the model to higher orders)
  • Edges are undirected (inevitable with non-temporal data?)
  • Some ambiguities concerning the interpretation of the inferred irreducible statistical dependencies.
  • Beware of loopy underlying topologies (better suited to tree-like networks)
Discussion
• ARACNE provides a provably exact network reconstruction under a controlled set of approximations.
• Some virtues:• Low computational complexity
• No need to discretize expression levels
• Does not rely on unrealistic network models or a priori assumptions (excuse-me ?? :-|)
• Avoidance of heuristic/stochastic search procedures
• High precision and recall on synthetic data
• Ability to infer genetic interactions on a genome-wide scale from gene-expression profiles of mammalian cells.
Discussion
• There is no “free lunch”: in order to effectively deal with the small sample regime some simplifying assumptions need to be made.
• Yet, it seems that this is the way to go: a “divide and conquer” paradigm. The big problem of inferring gene interaction networks should be broken down into smaller subproblems that can be addressed by relatively simple classes of models. Each model will then rely on strong assumptions, but such models should be able to deal with complex scenarios when carefully chosen and properly combined.
• No magic recipes, a long and winding road ahead… (remember Minsky?)
My point of view
“…A well-known anecdote relates how, sometime in 1966, the legendary Artificial Intelligence pioneer Marvin Minsky directed an undergraduate student to solve "the problem of computer vision" as a summer project. This anecdote is often resuscitated to illustrate how egregiously the difficulty of computational vision has been underestimated. Indeed, nearly forty years later, the discipline continues to confront numerous unsolved (and perhaps unsolvable) challenges, particularly with respect to high-level "image understanding" issues such as pattern recognition and feature recognition. Nevertheless, the intervening decades of research have yielded a great wealth of well-understood, low-level techniques that are able, under controlled circumstances, to extract meaningful information from a camera scene. These techniques are indeed elementary enough to be implemented by novice programmers at the undergraduate or even high-school level…”
(from “Computer Vision for Artists and Designers” by Golan Levin)
My point of view
Thanks for your attention