Structure Refinement in First Order Conditional Influence Language
Sriraam Natarajan, Weng-Keen Wong, Prasad Tadepalli
School of EECS, Oregon State University
Weighted Mean {
  If {task(t), doc(d), role(d,r,t)} then
    t.id, r.id Qinf (Mean) d.folder
  If {doc(s), doc(d), source(s,d)} then
    s.folder Qinf (Mean) d.folder
}
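The program above names two combining rules: Mean inside each rule and Weighted Mean across the two rules. The following is a minimal sketch of how such rules could combine distributions over d.folder, assuming that each rule instantiation yields its own distribution, that Mean averages them, and that Weighted Mean mixes the per-rule averages with normalized weights. The folder names, probabilities, and weights are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

Dist = Dict[str, float]  # folder name -> probability


def mean(dists: List[Dist]) -> Dist:
    """Average the distributions produced by the instantiations of one rule."""
    keys = dists[0].keys()
    return {k: sum(d[k] for d in dists) / len(dists) for k in keys}


def weighted_mean(rule_dists: List[Dist], weights: List[float]) -> Dist:
    """Mix one distribution per rule, using weights normalized to sum to 1."""
    total = sum(weights)
    keys = rule_dists[0].keys()
    return {
        k: sum(w * d[k] for w, d in zip(weights, rule_dists)) / total
        for k in keys
    }


# Hypothetical numbers: two (task, role) instantiations and two source documents.
task_rule = [{"A": 0.8, "B": 0.2}, {"A": 0.4, "B": 0.6}]
source_rule = [{"A": 0.1, "B": 0.9}, {"A": 0.3, "B": 0.7}]

# The weights stand in for W1 and W2; their values here are assumptions.
combined = weighted_mean([mean(task_rule), mean(source_rule)], weights=[0.75, 0.25])
print(combined)
```

Because both combining steps are convex combinations, the result is again a probability distribution over folders.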
[Figure: network with nodes t1.id, r1.id, t2.id, r2.id, s1.folder, s2.folder and d.folder; weights W1, W2]
“Unrolled” Network for Folder Prediction
First-order Conditional Influence Language (FOCIL)
[Figure: network over d.f with nodes t1.id, t1.cd, t1.la, r1.id, t2.id, t2.cd, t2.la, r2.id, s1.f, s1.la, s2.f, s2.la]
Prior Network
[Figure: network over d.f with nodes t1.id, r1.id, t2.id, r2.id, s1.f, s2.f]
Learned Network
Prior Program
Learned Program
• Conditional BIC score = -2 * CLL + d_m log N
• Different instantiations of the same rule share parameters
• Conditional Likelihood: EM – Maximize the joint likelihood
• CBIC score with penalty scaled down
• Greedy search with random restarts
Scoring metric
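Greedy search with random restarts can be sketched as hill climbing over attribute subsets. The version below is an illustration under stated assumptions, not the authors' procedure: a candidate structure is the set of attributes a program uses, a neighbour toggles one attribute, and lower scores are better. The function `toy_score` is a hypothetical stand-in for the CBIC score, and the attribute names are taken from the programs on this poster.

```python
import random
from typing import Callable, FrozenSet, List, Tuple

Structure = FrozenSet[str]


def hill_climb(start: Structure, attributes: List[str],
               score: Callable[[Structure], float]) -> Tuple[Structure, float]:
    """Move to the best single-attribute toggle until no neighbour improves."""
    current, current_score = start, score(start)
    while True:
        neighbours = [current ^ frozenset({a}) for a in attributes]
        best = min(neighbours, key=score)
        best_score = score(best)
        if best_score >= current_score:
            return current, current_score
        current, current_score = best, best_score


def search_with_restarts(attributes: List[str],
                         score: Callable[[Structure], float],
                         restarts: int = 10, seed: int = 0) -> Tuple[Structure, float]:
    """Run hill climbing from several random subsets and keep the best result."""
    rng = random.Random(seed)
    best, best_score = frozenset(), float("inf")
    for _ in range(restarts):
        start = frozenset(a for a in attributes if rng.random() < 0.5)
        candidate, candidate_score = hill_climb(start, attributes, score)
        if candidate_score < best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


ATTRIBUTES = ["t.id", "r.id", "t.creationDate", "t.lastAccessed",
              "s.folder", "s.lastAccessed"]
RELEVANT = frozenset({"t.id", "r.id", "s.folder"})  # assumption for the toy score


def toy_score(structure: Structure) -> float:
    """Hypothetical score: misfit for each missing relevant attribute plus a size penalty."""
    return 10.0 * len(RELEVANT - structure) + 1.0 * len(structure)


best, best_score = search_with_restarts(ATTRIBUTES, toy_score)
print(sorted(best), best_score)
```

With this toy score every start converges to the same subset; restarts matter when the real score has several local optima.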
Folder Prediction
Rank    Exhaustive - R   HC+RR - R   Exhaustive - I   HC+RR - I
1       349              354         312              311
2       107              98          128              130
3       22               26          26               26
4       15               12          20               23
5       6                4           3                4
6       0                0           1                1
7       1                4           2                0
8       0                2           1                2
9       0                0           0                0
10      0                0           0                0
11      0                0           2                3
Score   0.8299           0.8325      0.7926           0.7841
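The poster does not say how the Score row is computed. One reading, offered here only as an assumption, is a mean reciprocal rank: each document contributes 1/rank of its correct folder, averaged over all documents. The sketch below applies that reading to the Exhaustive - R column, where it reproduces the printed 0.8299. The other columns come out close to, but not exactly at, their printed values under this reading, so the metric actually used may differ in detail.

```python
from typing import List


def mean_reciprocal_rank(counts: List[int]) -> float:
    """counts[i] is the number of documents whose correct folder had rank i + 1."""
    total = sum(counts)
    return sum(c / rank for rank, c in enumerate(counts, start=1)) / total


# Exhaustive - R column from the table, ranks 1 through 11.
exhaustive_r = [349, 107, 22, 15, 6, 0, 1, 0, 0, 0, 0]
print(round(mean_reciprocal_rank(exhaustive_r), 4))  # prints 0.8299
```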
Synthetic Data Set
Irrelevant attributes
Relevant attributes
• Data is expensive – Exploit prior knowledge in structure search
• Derived the CBIC score for our setting
• Learned the “true” network in the synthetic dataset
• Folder dataset: Learned the best network with only relevant attributes
• Folder dataset with irrelevant attributes: CBIC_learned < CBIC_best
Conclusions
• Different scoring metrics
  • BDeu
  • Bias/Variance
• Choose the best combining rule that fits the data
• Structure refinement in large real-world domains
Future work
• What is the correct complexity penalty in the presence of multi-valued variables?
• Counting the # of parameters may not be the right solution
• What is the right scoring metric in relational setting for classification?
• Can the search space be intelligently pruned?
Issues
$$d_m = (l - 1) \prod_i N_{X_i}$$

$$\mathrm{CBICScore} = -2 \log P\left(Y \mid X^{1}_{1,1}, \ldots, X^{r}_{m,k}\right) + d_m \log N$$
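The conditional BIC score stated on this poster, -2 * CLL + d log N, can be evaluated with a few lines of code. The sketch below assumes that d is the usual parameter count of a conditional probability table, (child values - 1) times the product of the parent cardinalities, and that N is the number of training examples. The `penalty_scale` argument is a hypothetical knob standing in for "penalty scaled down"; the poster does not give the scaling factor. All numbers are made up for illustration.

```python
import math
from typing import Sequence


def cpt_parameter_count(child_values: int, parent_cardinalities: Sequence[int]) -> int:
    """Free parameters of a conditional probability table (assumed form of d)."""
    return (child_values - 1) * math.prod(parent_cardinalities)


def cbic_score(cll: float, d: int, n: int, penalty_scale: float = 1.0) -> float:
    """Conditional BIC: -2 * CLL + penalty_scale * d * log N (lower is better)."""
    return -2.0 * cll + penalty_scale * d * math.log(n)


# Hypothetical example: 3 folders, two parents with 2 and 4 values, 500 examples.
d = cpt_parameter_count(child_values=3, parent_cardinalities=[2, 4])
full = cbic_score(cll=-120.0, d=d, n=500)
scaled = cbic_score(cll=-120.0, d=d, n=500, penalty_scale=0.5)
print(d, full, scaled)
```

Scaling the penalty down lowers the cost of extra parameters, so larger structures are rejected less readily.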
Weighted Mean {
  If {task(t), doc(d), role(d,r,t)} then
    t.id, r.id Qinf (Mean) d.folder
  If {doc(s), doc(d), source(s,d)} then
    s.folder Qinf (Mean) d.folder
}
Weighted Mean {
  If {task(t), doc(d), role(d,r,t)} then
    t.id, r.id, t.creationDate, t.lastAccessed Qinf (Mean) d.folder
  If {doc(s), doc(d), source(s,d)} then
    s.folder, s.lastAccessed Qinf (Mean) d.folder
}