Likelihood Computations Using Value Abstraction
Application in Genetic Linkage Analysis
Nir Friedman (Hebrew University), Dan Geiger (Technion), Noam Lotner (Hebrew University)
Likelihood Computation
Given a Bayesian network <G, θ> and evidence e, compute P(e)
Sum over all possible values of the unobserved variables:
$P(e) = \sum_{v \in Val(X_1, \ldots, X_n)} P(e, v)$
[Figure: network with Bet1 and Die as parents of Win1]
e = { Win1 = true }
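As a rough illustration of the sum above, the following sketch enumerates every joint value of the unobserved variables in the Bet1/Die/Win1 example. The fair die, the 50/50 prior on Bet1 and the rule that a bet wins when it matches the die's parity are assumptions made for the example; the slides do not state the network's parameters.

```python
from itertools import product

# Hypothetical parameters (not given in the slides): fair die, 50/50 bet.
P_BET1 = {"odd": 0.5, "even": 0.5}
P_DIE = {v: 1 / 6 for v in range(1, 7)}


def p_win1(bet, die):
    """P(Win1 = true | Bet1 = bet, Die = die): assumed to be 1 when the bet matches the parity."""
    parity = "odd" if die % 2 else "even"
    return 1.0 if bet == parity else 0.0


def likelihood():
    """P(e) for e = {Win1 = true}: sum P(e, v) over all joint values v of (Bet1, Die)."""
    return sum(
        P_BET1[b] * P_DIE[d] * p_win1(b, d) for b, d in product(P_BET1, P_DIE)
    )


p_e = likelihood()
```

The loop visits all 12 joint values, which is the cost that value abstraction aims to reduce.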
The Basic Concept
P(e, Die=1) = P(e, Die=3) = P(e, Die=5) and P(e, Die=2) = P(e, Die=4) = P(e, Die=6)
[Figure: network with Bet1 and Die as parents of Win1]
Val(Bet1) = {odd, even}, e = { Win1 = true }
The exact value of Die need not be known to
calculate exact likelihood
Group values, calculate once for each group
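The grouping idea can be sketched as follows. The non-uniform prior on Bet1 is a hypothetical choice, made so that odd and even die values get different joint probabilities. For illustration only, the groups are found here by evaluating every value, which a real abstraction procedure would avoid.

```python
from collections import defaultdict

# Hypothetical parameters (not given in the slides).
P_BET1 = {"odd": 0.7, "even": 0.3}
P_DIE = {v: 1 / 6 for v in range(1, 7)}


def p_win1(bet, die):
    """P(Win1 = true | Bet1 = bet, Die = die): assumed to be 1 when the bet matches the parity."""
    parity = "odd" if die % 2 else "even"
    return 1.0 if bet == parity else 0.0


def p_e_and_die(die):
    """P(Win1 = true, Die = die), with Bet1 summed out."""
    return sum(P_BET1[b] * P_DIE[die] * p_win1(b, die) for b in P_BET1)


# Group die values that have the same joint probability with the evidence.
groups = defaultdict(list)
for v in P_DIE:
    groups[round(p_e_and_die(v), 12)].append(v)
blocks = sorted(groups.values())

# One evaluation per group, weighted by the group size.
p_e = sum(len(block) * p_e_and_die(block[0]) for block in blocks)
```

With these assumed numbers the two groups are the odd and the even die values, and two evaluations replace six.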
Value Abstraction
A partition of a variable's domain
Val(Die) = {1, 2, 3, 4, 5, 6}
Val(Die^a) = { {1,4}, {2,5,6}, {3} }
Define: $P(X^a = v^a) = \sum_{v \in v^a} P(X = v)$
[Figure: network with Bet1 and Die as parents of Win1]
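A minimal sketch of the definition: the probability of an abstract value is the sum of the probabilities of the original values it groups. The function name `abstract_distribution` and the fair-die prior are assumptions for the example.

```python
def abstract_distribution(p, partition):
    """Return P(X^a = v^a) as the sum of P(X = v) over the values v in each block v^a."""
    covered = [v for block in partition for v in block]
    if sorted(covered) != sorted(p):
        raise ValueError("partition must cover each value exactly once")
    return {frozenset(block): sum(p[v] for v in block) for block in partition}


p_die = {v: 1 / 6 for v in range(1, 7)}  # assumed fair die
p_die_a = abstract_distribution(p_die, [{1, 4}, {2, 5, 6}, {3}])
```

Blocks are stored as `frozenset` keys so that the abstract distribution can be indexed by the set of original values.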
Safe Value Abstraction
An abstraction is safe w.r.t. evidence e if, for all v, v' in the same abstract value v^a: $P(e \mid X = v) = P(e \mid X = v')$
Preserves likelihood information
Val(Die) = {1, 2, 3, 4, 5, 6}
Val(Die^a) = { {1,3,5}, {2,4,6} }
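One way to test a candidate partition is sketched below. It assumes that "safe" means all values in a block give the evidence the same conditional probability; that reading, the prior on Bet1 and the helper names are assumptions, not taken from the slides.

```python
import math

P_BET1 = {"odd": 0.7, "even": 0.3}  # hypothetical prior over bets


def p_e_given_die(die):
    """P(Win1 = true | Die = die): the probability that the bet matches the die's parity."""
    parity = "odd" if die % 2 else "even"
    return P_BET1[parity]


def is_safe(partition, p_e_given):
    """True if every block contains only values with the same P(e | X = v)."""
    for block in partition:
        values = [p_e_given(v) for v in block]
        if any(not math.isclose(x, values[0]) for x in values):
            return False
    return True


parity_is_safe = is_safe([{1, 3, 5}, {2, 4, 6}], p_e_given_die)
mixed_is_safe = is_safe([{1, 4}, {2, 5, 6}, {3}], p_e_given_die)
```

Under these assumptions the parity partition passes, while the partition { {1,4}, {2,5,6}, {3} } fails because it mixes odd and even values.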
[Figure: network with Bet1 and Die as parents of Win1, and Bet2 and Die as parents of Win2]
Val(Bet1) = { odd, even }, Val(Bet2) = { 1-2, 3-6 }
e = {Win1=true}: a safe abstraction for Val(Die) is { {1,3,5}, {2,4,6} }
e = {Win1=true, Win2=true}: need to refine
Refinement: { {1}, {2}, {3,5}, {4,6} }
Cautious Value Abstraction
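[Figure: network with Bet1 and Die as parents of Win1, and Bet2 and Die as parents of Win2]
Val(Bet1) = { odd, even }, Val(Bet2) = { 1-2, 3-6 }
Win1=true implies the partition { {1,3,5}, {2,4,6} } of Val(Die)
Win2=true implies the partition { {1,2}, {3-6} } of Val(Die)
Maximal abstraction - a tight refinement: { {1}, {2}, {3,5}, {4,6} }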
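The tight refinement of several partitions can be sketched as their coarsest common refinement: two values stay together only if every partition keeps them together. The function name `refine` is hypothetical.

```python
def refine(*partitions):
    """Coarsest common refinement of partitions over the same domain."""
    blocks = {}
    for v in set().union(*partitions[0]):
        # Identify v by the index of the block that contains it in each partition.
        key = tuple(
            next(i for i, block in enumerate(p) if v in block) for p in partitions
        )
        blocks.setdefault(key, set()).add(v)
    return sorted(sorted(block) for block in blocks.values())


by_win1 = [{1, 3, 5}, {2, 4, 6}]  # partition of Val(Die) implied by Win1 = true
by_win2 = [{1, 2}, {3, 4, 5, 6}]  # partition of Val(Die) implied by Win2 = true
tight = refine(by_win1, by_win2)
```

For the two partitions of Val(Die) in this example, the result is the four blocks {1}, {2}, {3,5}, {4,6}.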
Abstracting a Bayesian Net
An abstraction of X_i implies a partition of Pa_i's values
Abstract each variable after its children are abstracted, using a tight refinement of all partitions implied by the children
Output - G^a such that $P(e \mid G, \theta) = P(e \mid G^a, \theta^a)$
Initialization: abstract observed variables
For each variable:
1. Calculate maximal abstraction
2. Propagate to parents
Linear in # variables and network representation
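The propagation step might be read as follows: values of a parent are grouped when they give identical abstracted CPT entries for the child, for every value of the other parent. This is a sketch of that reading, not the authors' implementation. The deterministic win rules, the helper names, and representing an observed child by a single block holding its observed value are all assumptions.

```python
from itertools import product


def parity(die):
    return "odd" if die % 2 else "even"


def win1_cpt(win, die, bet1):
    """P(Win1 = win | Die = die, Bet1 = bet1): assumed deterministic."""
    return 1.0 if win == (bet1 == parity(die)) else 0.0


def win2_cpt(win, die, bet2):
    """P(Win2 = win | Die = die, Bet2 = bet2): the bet wins if the die falls in its range."""
    return 1.0 if win == ((bet2 == "1-2") == (die <= 2)) else 0.0


def implied_partition(cpt, child_blocks, parent_vals, other_vals):
    """Group values of one parent that give identical abstracted CPT entries."""
    groups = {}
    for v in parent_vals:
        signature = tuple(
            round(sum(cpt(x, v, o) for x in block), 12)
            for block, o in product(child_blocks, other_vals)
        )
        groups.setdefault(signature, []).append(v)
    return sorted(groups.values())


die_vals = range(1, 7)
by_win1 = implied_partition(win1_cpt, [[True]], die_vals, ["odd", "even"])
by_win2 = implied_partition(win2_cpt, [[True]], die_vals, ["1-2", "3-6"])
```

Each observed child yields one partition of Val(Die); the variable's own abstraction would then be a tight refinement of these partitions.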
The Application - Genetic Linkage Analysis
The goal - find the location of a disease (target) gene on a chromosome relative to some other (known) locations
[Figure: map of human chromosome 16 showing known loci]
The Probabilistic Model
Converting the pedigree to a Bayesian net
One locus:
[Figure: pedigree of individuals 1-6 and the corresponding Bayesian network]
Orange - genotype nodes
Blue - phenotype nodes
Red - selector nodes (these represent linkage)
The Probabilistic Model
More than 1 locus:
[Figure: the network for individuals 1-6 replicated at Locus #1 and Locus #2, with selector nodes s1 and s2 linked across loci by P(s2 | s1)]
Experimental Evaluation
90 pedigrees (5-200 individuals) from 10 studies
Total of 280 linkage analysis problems
Varied number of loci (1 to 4)
[Figure: two log-log scatter plots of abstracted vs. original size, one for network size and one for clique-tree size; points are marked by number of loci (1-4)]
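Abstracting Multiple Variables
[Figure: network with Bet1 and Die as parents of Win1, and Bet2 and Die as parents of Win2]
Abstracting each variable separately: Val(Die^a) = { {1,3,5}, {2,4,6} } = {odd, even}, Val(Bet1^a) = {odd, even}
Abstracting the pair (Die, Bet1) jointly: the 12 joint values (1,o), ..., (6,e) are grouped into {loss, win}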
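Clique-Tree Elimination
Cliques: C1 = {V,Y}, C2 = {X,V,Z}, C3 = {X,V,U}, C4 = {X,W}
$m_{13}(v) = \sum_{y} f_1(v, y)$
$m_{23}(x, v) = \sum_{z} f_2(x, v, z)$
$m_{34}(x) = \sum_{u,v} f_3(x, v, u)\, m_{13}(v)\, m_{23}(x, v)$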
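The message computations on this clique tree can be sketched as below, assuming that C1 = {V,Y} and C2 = {X,V,Z} each send a message to C3 = {X,V,U}, which then sends one to C4 = {X,W}. The binary domains and the potential tables are arbitrary choices for the example. The brute-force sum is included only to check that the messages compute the same quantity.

```python
from itertools import product

VALS = (0, 1)  # assumed binary domains for all variables


# Hypothetical clique potentials (arbitrary positive integers).
def f1(v, y):
    return 1 + v + 2 * y


def f2(x, v, z):
    return 1 + x * v + z


def f3(x, v, u):
    return 1 + x + v * u


def m13(v):
    """Message C1 -> C3: sum out Y."""
    return sum(f1(v, y) for y in VALS)


def m23(x, v):
    """Message C2 -> C3: sum out Z."""
    return sum(f2(x, v, z) for z in VALS)


def m34(x):
    """Message C3 -> C4: multiply incoming messages into f3, then sum out U and V."""
    return sum(f3(x, v, u) * m13(v) * m23(x, v) for u, v in product(VALS, VALS))


def brute_force(x):
    """Same quantity, computed by summing the full product over V, Y, Z, U."""
    return sum(
        f1(v, y) * f2(x, v, z) * f3(x, v, u)
        for v, y, z, u in product(VALS, repeat=4)
    )


messages_agree = all(m34(x) == brute_force(x) for x in VALS)
```

Integer potentials keep the comparison exact, so no floating-point tolerance is needed.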
Message-Specific Abstraction
Cliques: C1 = {V,Y}, C2 = {X,V,Z}, C3 = {X,V,U}, C4 = {X,W}
$m_{34}(x) = \sum_{u,v} f_3(x, v, u)\, m_{13}(v)\, m_{23}(x, v)$
Given safe abstractions for f3, m13, m23
- construct a safe abstraction for m34
Refinement corresponds to multiplication
Projection corresponds to summation
An abstraction is safe for message m if m(x) = m(x') whenever x and x' are in the same abstract value
Use dynamic programming to efficiently compute a safe abstraction for the whole tree
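The correspondence between refinement and multiplication, and between projection and summation, can be sketched as below. The projection rule used here (two x values are grouped when their pairs (x, v) fall in the same block for every v) is one sufficient construction chosen for the example, not necessarily the authors' definition. The factors `g` and `h` and all helper names are hypothetical.

```python
from itertools import product

XS, VS = range(4), (0, 1)
PAIRS = list(product(XS, VS))


def g(x, v):
    return x // 2 + v


def h(x, v):
    return 2 * v


def level_sets(fn, domain):
    """A safe partition for fn: group arguments that give the same value."""
    groups = {}
    for a in domain:
        groups.setdefault(fn(*a), set()).add(a)
    return list(groups.values())


def refine(p, q):
    """Common refinement of two partitions; safe for the product of the factors."""
    return [a & b for a in p for b in q if a & b]


def project(pair_partition, xs, vs):
    """Group x values whose pairs (x, v) fall in the same block for every v."""
    def block_of(pair):
        return next(i for i, b in enumerate(pair_partition) if pair in b)

    groups = {}
    for x in xs:
        groups.setdefault(tuple(block_of((x, v)) for v in vs), set()).add(x)
    return list(groups.values())


def m(x):
    """Message over X: multiply the factors, then sum out V."""
    return sum(g(x, v) * h(x, v) for v in VS)


joint = refine(level_sets(g, PAIRS), level_sets(h, PAIRS))
x_partition = project(joint, XS, VS)
is_safe = all(len({m(x) for x in block}) == 1 for block in x_partition)
```

In this example the projected partition groups x into {0,1} and {2,3}, and the message takes a single value on each block.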
Experimental Evaluation
How much more can we save?
[Figure: two log-log scatter plots against abstracted network size, marked by number of loci (1-4): abstracted clique-tree size, and clique-tree size ratio]
Total Reduction
[Figure: log-log plot of clique-size ratio (orig/abs) against problem size (#individuals x #genotypes)]
Summary
Safe abstraction w.r.t. specific evidence
An algorithm to reduce problem complexity
Linear in net representation
Independent of inference procedure
Motivated by VITESSE[ ]
Further reductions with inference procedure known
Caveats
As costly as inference
Cost is amortized when used for e.g. parameter estimation
Representation of abstractions