Computational Inference of Cellular Networks from

PowerPoint-Präsentationfrom Microarray Data
Page 33/3/2008| Achim Tresch
Correlation
close
to 1 or
-1 strong linear dependence Correlation close to 0 no or weak linear dependence
(Pearson) Correlation
Correlation
close
to 1 or
Correlation
close
to 1 or
Correlation
close
to 1 or
Correlation
close
to 1 or
Correlation
close
to 1 or
An expression profile is a collection of expression vectors { Xg
= (Xg,s
)s

samples
Genes }
Correlation graph: Genes are the vertices of the graph and an undirected edge (i, j) is drawn
if
(Pearson correlation, Spearman rank correlation, Kendall’s tau) between Xi
and Xj
is sufficiently different from zero.
Advantage: This representation of the marginal dependence structure is easy to interpret and can be estimated accurately even if
p > N the number of features (genes)
> the number of samples
Application: Stuart et al. (Science, 2003) build a graph from coexpression
across multiple organisms.
It is impossible to distinguish direct from indirect dependence Three reasons why X, Y , and Z may be highly correlated:
Problems of correlation based approaches
A strong correlation is not a strong evidence for regulatory dependence (lots of false positives). But a low correlation is a
strong evidence for no regulatory dependence. Possible remedies:
•search for correlations which cannot be explained by other variables. •measure effects of gene perturbations
In other words: Knowing Z, knowing Y is irrelevant for knowing X (and vice versa). Z “explains”
any observed dependence between X and Y .
Conditional Independence
0 5
10 15
0 5
10 15
Gaussian Graphical Models (GGM)
Gaussian Graphical Models (GGM)
If we assume that the common expression distribution of all genes follows a multivariate Gaussian distribution (which is of course ridiculous), conditional independence can be assessed as follows:
Full conditional relationships can only be accurately estimated if the number of samples N is relatively large compared to the number of variables p.
Graph from Basso et al (Nat Genet, 2005)
Thus, if p »
N, you can . . . •
and
Strimmer, Bioinformatics´05) •
resort to a simpler model that does not rely on full conditional independence
What if p » N?
Correlation
Graphs
GGMs
probability distributions.
pa(N) = {S,R}
spring 1/4
dry wet\),(
spring 1/4
dry wet\),(
{}J
a
a
a
a
=⋅==⋅==⋅=== ===== nNrRsSjJP
summer)P(J summer)J|offP(S summer)J|rainP(R rain)Roff,S|wetP(N
)wet,rain,off,summer(
(instead
for
metric
wíth
of random
acyclic
graphs
can
versa. However
there exist
due to
G2 G3 , G4 , G5
G2 G3 , G4 , G5
necessarily direct
not?
“non est ponenda pluritas sine necessitate” (pluralities ought not to be proposed without necessity)
G2G1 G3
Need
for
algorithm
for which there
•
a sign +1 or –1 according to
the direction of the regulatory effect. Remove a→b
if product of all signs
along the path a→... →b equals the sign of the edge a→b
[Tringe
“Weights“. Let every edge be weighted with a non-
negative number. Edges with low weights are meant to represent edges for which there is strong evidence for a direct regulatory interaction.
Remove a→b
if sum of the weights along the path a→... →b is smaller than the weight of the edge a→b
[Tresch
a x1 b
Finding non-necessary edges
1 1 0
1 1 0
observable
is
linked
to
exactly
one
action
expressedally differenti IS log R
s P
s P =
0 20
0 40
0 60
graphs, ~ 4seconds
Lo g
Li ke
lih oo
•
•
•
•
Nessy, nem: Implementation and estimation of the Nested Effects Model
Networks in R/Bioconductor
Holger Fröhlich
Slide Number 2
Slide Number 3
Slide Number 4
Slide Number 5
Slide Number 6
Slide Number 7
Slide Number 8
Slide Number 9
Slide Number 10
Slide Number 11
Slide Number 12
Slide Number 13
Slide Number 14
Slide Number 15
Slide Number 16
Slide Number 17
Slide Number 18
Slide Number 19
Slide Number 20
Slide Number 21
Slide Number 22
Slide Number 23
Slide Number 24
Slide Number 25
Slide Number 26
Slide Number 27
Slide Number 28
Slide Number 29
Slide Number 30
Slide Number 31
Slide Number 32
Slide Number 33
Slide Number 34
Slide Number 35
Slide Number 36
Slide Number 37
Slide Number 38
Slide Number 39
Slide Number 40
Slide Number 41
Slide Number 42
Slide Number 43
Slide Number 44
Slide Number 45
Slide Number 46
Slide Number 47
Slide Number 48
Slide Number 49
Slide Number 50
Slide Number 51

Documents

Computational Inference of Cellular Networks from