Learning Probabilistic Relational Models
Daphne Koller, Stanford University
Nir Friedman, Hebrew University
Lise Getoor, Stanford University
Avi Pfeffer, Stanford University
Learning from Relational Data
• Data sources
– relational and object-oriented databases
– frame-based knowledge bases
– World Wide Web
• Traditional approaches
– work well with flat representations
– fixed length attribute-value vectors
– assume IID samples
• Problem:
– must fix attributes in advance
– can represent only some limited set of structures
– IID assumption may not hold
Our Approach
• Probabilistic Relational Models (PRMs)
– rich representation language that models
  • relational dependencies
  • probabilistic dependencies
• Learning PRMs from data stored in relational databases
– parameter estimation
– model selection
Outline
• Motivation
• Probabilistic relational models
– Probabilistic Logic Programming [Poole, 1993]; [Ngo & Haddawy, 1994]
– Probabilistic object-oriented knowledge [Koller & Pfeffer, 1997; 1998]; [Koller, Levy & Pfeffer, 1997]
• Learning PRMs
• Experimental results
• Conclusions
Probabilistic Relational Models
• Combine advantages of predicate logic & BNs:
– natural domain modeling: objects, properties, relations;
– generalization over a variety of situations;
– compact, natural probability models.
• Integrate uncertainty with relational model:
– properties of domain entities can depend on properties of related entities;
– uncertainty over relational structure of domain.
Relational Schema
• Describes the types of objects and relations in the database
• Classes and their attributes:
– Student: Intelligence, Performance
– Registration: Grade, Satisfaction
– Course: Difficulty, Rating
– Professor: Popularity, Teaching-Ability, Stress-Level
• Relationships: Teach, In, Take
Example instance I
• Professor Prof. Gump: Popularity high, Teaching Ability medium, Stress-Level low
• Course Phil142: Difficulty low, Rating high
• Course Phil101: Difficulty low, Rating high
• Reg #5639: Grade A, Satisfaction 3 (the figure shows three registration objects with this same label)
• Student John Doe: Intelligence high, Performance average
• Student Jane Doe: Intelligence high, Performance average
What’s Uncertain?
• Relations
• Attribute Values
• Objects
[Figure: the example instance (Prof. Gump; courses Phil142 and Phil101; registrations; students John Doe and Jane Doe), together with Student Judy Dunn (Intelligence high, Performance high) and Student John Deer (Intelligence ???, Performance ???).]
Attribute Uncertainty
• Fixed skeleton
– set of objects in each class
– relations between them
• Uncertainty
– over assignments of values to attributes
[Figure: skeleton with Prof. Gump, courses Phil142 and Phil101, three registrations and Student Jane Doe. All professor, course and student attributes are shown as ???; two registrations show Grade A, Satisfaction 3 and the third shows Grade ???, Satisfaction ???.]
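PRM: Dependencies
[Figure: class-level dependency structure over Student (Intelligence, Performance), Reg (Grade, Satisfaction), Course (Difficulty, Rating) and Professor (Popularity, Teaching-Ability, Stress-Level).]
Local probability model for Reg.Grade:
$P(\text{Reg.Grade} \mid \text{Reg.In.Difficulty},\ \text{Reg.Taker.Intell})$
D, I    A     B     C
h, h    0.5   0.4   0.1
h, l    0.1   0.5   0.4
l, h    0.8   0.1   0.1
l, l    0.3   0.6   0.1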
PRM: Dependencies (cont.)
[Figure: the example instance (Prof. Gump; courses Phil142 and Phil101; students John Doe and Jane Doe), now with Student John Deer (Intelligence low, Performance average) and two registrations whose grade is unknown (Grade ?, Satisfaction 3). The table for Reg.Grade is shown twice, alongside the registrations with unknown grade.]
D, I    A     B     C
h, h    0.5   0.4   0.1
h, l    0.1   0.5   0.4
l, h    0.8   0.1   0.1
l, l    0.3   0.6   0.1
PRM: aggregate dependencies
[Figure: class-level dependency structure over Student, Reg, Course and Professor. Student Jane Doe (Intelligence high, Performance average) is linked to three registrations: Reg #5077 (Grade C, Satisfaction 2), Reg #5054 (Grade C, Satisfaction 1) and Reg #5639 (Grade A, Satisfaction 3).]
• Problem!!! Need CPTs of varying sizes
• Aggregating the grades with avg gives a table of fixed size:
avg    l     m     h
A      0.1   0.2   0.7
B      0.2   0.4   0.4
C      0.6   0.3   0.1
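PRM: aggregate dependencies
[Figure: class-level dependency structure over Student (Intelligence, Performance), Reg (Grade, Satisfaction), Course (Difficulty, Rating) and Professor (Popularity, Teaching-Ability, Stress-Level), with aggregate dependencies labeled avg, avg and count.]
• Aggregates: sum, min, max, avg, mode, count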
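The aggregation idea can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the numeric grade encoding, the rounding rule, the names `avg_grade` and `child_distribution`, and the numbers in `CPT_GIVEN_AVG` are all assumptions made for the example.

```python
from statistics import mean

# Assumed numeric encoding of letter grades (not taken from the slides).
GRADE_POINTS = {"A": 3, "B": 2, "C": 1}
POINTS_TO_GRADE = {points: grade for grade, points in GRADE_POINTS.items()}

# Illustrative CPT: P(child value | avg grade), with child values l, m, h.
CPT_GIVEN_AVG = {
    "A": {"l": 0.1, "m": 0.2, "h": 0.7},
    "B": {"l": 0.2, "m": 0.4, "h": 0.4},
    "C": {"l": 0.6, "m": 0.3, "h": 0.1},
}

def avg_grade(grades):
    """Collapse any number of letter grades into one letter (rounded mean)."""
    points = mean(GRADE_POINTS[g] for g in grades)
    return POINTS_TO_GRADE[int(points + 0.5)]  # round half up; stays in 1..3

def child_distribution(grades):
    """Select the fixed-size CPT row indexed by the aggregated parent value."""
    return CPT_GIVEN_AVG[avg_grade(grades)]
```

Whatever the number of registrations a student has, the parent of the child attribute is a single aggregated value, so one CPT with three rows is enough.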
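PRM: Summary
• A PRM specifies
– a probabilistic dependency structure S
  • a set of parents for each attribute X.A
– a set of local probability models θ
• Given a skeleton structure σ, a PRM specifies a probability distribution over instances I:
– over attribute values of all objects in σ
$$P(I \mid \sigma, S, \theta) = \prod_{X} \ \prod_{A \in \mathcal{A}(X)} \ \prod_{x \in \sigma(X)} P\big(I_{x.A} \mid I_{\text{parents}(x.A)}\big)$$
– the products range over the classes X, the attributes A of X, and the objects x of class X in σ
– I_{x.A} is the value of attribute A in object x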
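As a rough illustration of this factorization, the sketch below evaluates the log-probability of a tiny instance by summing one local term per object attribute. The skeleton, the object names and all probabilities are hypothetical, and only three attributes (Student.Intelligence, Course.Difficulty, Reg.Grade) are modeled to keep it short.

```python
import math

# Hypothetical skeleton plus attribute values (an instance I).
students = {"jane": {"Intelligence": "h"}, "john": {"Intelligence": "l"}}
courses = {"phil101": {"Difficulty": "l"}, "phil142": {"Difficulty": "h"}}
regs = [
    {"Taker": "jane", "In": "phil101", "Grade": "A"},
    {"Taker": "jane", "In": "phil142", "Grade": "B"},
    {"Taker": "john", "In": "phil101", "Grade": "B"},
]

# Illustrative local probability models (assumed numbers).
P_INTELLIGENCE = {"h": 0.5, "l": 0.5}
P_DIFFICULTY = {"h": 0.4, "l": 0.6}
# P(Reg.Grade | Reg.In.Difficulty, Reg.Taker.Intelligence), keyed by (D, I).
P_GRADE = {
    ("h", "h"): {"A": 0.5, "B": 0.4, "C": 0.1},
    ("h", "l"): {"A": 0.1, "B": 0.5, "C": 0.4},
    ("l", "h"): {"A": 0.8, "B": 0.1, "C": 0.1},
    ("l", "l"): {"A": 0.3, "B": 0.6, "C": 0.1},
}

def log_prob_instance(students, courses, regs):
    """Sum log P(x.A | parents(x.A)) over every object x and attribute A."""
    total = 0.0
    for s in students.values():
        total += math.log(P_INTELLIGENCE[s["Intelligence"]])
    for c in courses.values():
        total += math.log(P_DIFFICULTY[c["Difficulty"]])
    for r in regs:
        # Parents are reached by following the registration's relations.
        d = courses[r["In"]]["Difficulty"]
        i = students[r["Taker"]]["Intelligence"]
        total += math.log(P_GRADE[(d, i)][r["Grade"]])
    return total
```

The same CPT `P_GRADE` is reused for every registration object, which is what makes the class-level model compact.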
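Learning PRMs
• Input: relational schema and database (instance I)
• Tasks:
– parameter estimation
– structure selection
[Figure: a database instance I over Course, Student and Reg, and a model over the classes Course, Student and Reg.]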
Parameter estimation in PRMs
• Assume known dependency structure S
• Goal: estimate PRM parameters θ
– entries in local probability models, θ_{x.A | parents(x.A)}
• A parameterization θ is good if it is likely to generate the observed data, instance I.
• MLE Principle: choose θ so as to maximize l
$$l(\theta : I, S) = \log P(I \mid S, \theta)$$
• crucial property: decomposition
– separate terms for different X.A
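ML parameter estimation
[Figure: dependency structure over Student (Intelligence, Performance), Reg (Grade, Satisfaction) and Course (Difficulty, Rating).]
For $P(\text{Reg.Grade} \mid \text{Reg.In.Difficulty},\ \text{Reg.Taker.Intell})$:
$$\theta^{*}_{R.G=A \mid C.D=l,\ S.I=h} = \frac{N(R.G=A,\ C.D=l,\ S.I=h)}{N(C.D=l,\ S.I=h)}$$
• DB technology well-suited to the computation of sufficient statistics:
[Figure: the Course table, Reg table and Student table feed a Count over (C.Diff, R.Grade, S.Int), which yields the sufficient statistics.]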
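One way to picture the database computation is a join followed by a grouped count. The sketch below uses Python's standard `sqlite3` module with an in-memory database; the table layout, column names and rows are hypothetical and stand in for the Course, Reg and Student tables of the slide.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (id TEXT PRIMARY KEY, intelligence TEXT);
CREATE TABLE course  (id TEXT PRIMARY KEY, difficulty TEXT);
CREATE TABLE reg     (student_id TEXT, course_id TEXT, grade TEXT);
INSERT INTO student VALUES ('s1', 'h'), ('s2', 'h'), ('s3', 'l');
INSERT INTO course  VALUES ('c1', 'l'), ('c2', 'h');
INSERT INTO reg VALUES ('s1', 'c1', 'A'), ('s2', 'c1', 'A'), ('s2', 'c2', 'B'),
                       ('s3', 'c1', 'B'), ('s1', 'c2', 'A'), ('s3', 'c2', 'C');
""")

# Sufficient statistics N(C.D, S.I, R.G): one grouped count over the join.
rows = conn.execute("""
    SELECT c.difficulty, s.intelligence, r.grade, COUNT(*)
    FROM reg AS r
    JOIN student AS s ON r.student_id = s.id
    JOIN course  AS c ON r.course_id = c.id
    GROUP BY c.difficulty, s.intelligence, r.grade
""").fetchall()
counts = {(d, i, g): n for d, i, g, n in rows}

# N(C.D, S.I): marginalize the grade out of the joint counts.
parent_totals = defaultdict(int)
for (d, i, _g), n in counts.items():
    parent_totals[(d, i)] += n

# Maximum likelihood estimate: theta[g | d, i] = N(g, d, i) / N(d, i).
theta = {(g, d, i): n / parent_totals[(d, i)] for (d, i, g), n in counts.items()}
```

Grade values that never occur for a parent configuration simply have no row here, which corresponds to an estimate of zero; smoothing or a prior would be a separate choice.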
Model Selection
• Idea:
– define scoring function
– do local search over legal structures
• Key Components:
– scoring models
– legal models
– searching model space
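Scoring Models
• Bayesian approach:
$$\text{Score}(S : I) = \log P(S \mid I) = \log\big[P(I \mid S)\,P(S)\big] + \text{const}$$
– P(I | S): marginal likelihood
– P(S): prior
– the constant, −log P(I), does not depend on S
• closed form solution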
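The slide does not say which prior gives the closed form. A common choice in Bayesian network learning is a Dirichlet prior over each CPT row, and the sketch below assumes that choice with a single symmetric hyperparameter `alpha`. The function names and the input layout are hypothetical.

```python
from math import lgamma

def log_marginal_likelihood(counts, alpha=1.0):
    """Log marginal likelihood of one attribute's data under Dirichlet priors.

    counts maps each parent configuration to a list of child-value counts.
    Each row gets an independent symmetric Dirichlet(alpha, ..., alpha) prior
    (an assumption of this sketch).
    """
    total = 0.0
    for child_counts in counts.values():
        k = len(child_counts)
        n = sum(child_counts)
        total += lgamma(k * alpha) - lgamma(k * alpha + n)
        for n_k in child_counts:
            total += lgamma(alpha + n_k) - lgamma(alpha)
    return total

def score(counts_per_attribute, log_prior=0.0):
    """log P(I | S) + log P(S); the likelihood decomposes over attributes."""
    return log_prior + sum(
        log_marginal_likelihood(c) for c in counts_per_attribute.values()
    )
```

For a binary attribute with no parents and counts `[2, 0]`, `log_marginal_likelihood({(): [2, 0]})` equals log(1/3), matching the sequential prediction (1/2)(2/3) under a uniform prior.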
Legal Models
• Dependency ordering over attributes: y.b ≺ x.a if X.A depends on Y.B
[Figure: Paper.Accepted and Researcher.Reputation, linked by the author-of relation.]
• PRM defines a coherent probability model over skeleton σ if ≺ is acyclic
Guaranteeing Acyclicity
How do we guarantee that a PRM is acyclic for every skeleton?
• PRM dependency structure S induces a dependency graph: edge Y.B → X.A if X.A depends directly on Y.B
• Attribute stratification: dependency graph acyclic ⇒ ≺ acyclic for any σ
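The stratification test amounts to a cycle check on the class-level dependency graph. A minimal sketch using Kahn's topological sort follows; the function name and the edge encoding as (parent, child) string pairs are assumptions of this example.

```python
from collections import defaultdict, deque

def is_acyclic(edges):
    """Return True if the directed graph given by (parent, child) pairs has no cycle."""
    successors = defaultdict(list)
    in_degree = defaultdict(int)
    nodes = set()
    for parent, child in edges:
        successors[parent].append(child)
        in_degree[child] += 1
        nodes.update((parent, child))

    # Repeatedly remove nodes with no remaining incoming edges.
    queue = deque(n for n in nodes if in_degree[n] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for nxt in successors[node]:
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:
                queue.append(nxt)
    # Any node left over sits on, or downstream of, a cycle.
    return removed == len(nodes)

school = [
    ("Course.Difficulty", "Reg.Grade"),
    ("Student.Intelligence", "Reg.Grade"),
    ("Reg.Grade", "Reg.Satisfaction"),
]
genetics = [
    ("Person.M-chrom", "Person.M-chrom"),  # via a parent relation
    ("Person.M-chrom", "Person.B-type"),
]
```

Here `is_acyclic(school)` holds, while `is_acyclic(genetics)` fails because Person.M-chrom depends on itself at the class level, even though no individual person is their own ancestor.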
Limitation of stratification
[Figure: three Person objects, each with M-chromosome, P-chromosome and Blood-type, linked by Father and Mother relations. The class-level dependency graph over Person.M-chrom, Person.P-chrom and Person.B-type is marked ???.]
Guaranteed acyclic relations
[Figure: three Person objects, each with M-chromosome, P-chromosome and Blood-type, linked by Father and Mother relations.]
• Prior knowledge: the Father-of relation is acyclic
– dependence of Person.A on Person.Father.B cannot induce cycles
Guaranteeing acyclicity
• With guaranteed acyclic relations, some cycles in the dependency graph are guaranteed to be safe.
• We color the edges in the dependency graph
– yellow: within single object (X.B → X.A)
– green: via guaranteed acyclic (g.a.) relation (Y.B → X.A)
– red: via other relations (Y.B → X.A)
• A cycle is safe if
– it has a green edge
– it has no red edge
[Figure: colored dependency graph over Person.M-chrom, Person.P-chrom and Person.B-type.]
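The two conditions on cycles can be checked without enumerating cycles. Every cycle has a green edge exactly when deleting the green edges leaves an acyclic graph, and no cycle has a red edge exactly when no red edge's target can reach its source. The sketch below follows that reasoning; the function names, the (source, target, color) edge encoding and the example graph are assumptions, including treating both parent relations as guaranteed acyclic.

```python
def reachable(successors, start):
    """Nodes reachable from start by following at least one edge."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in successors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def all_cycles_safe(edges):
    """edges: list of (source, target, color), color in {'yellow', 'green', 'red'}."""
    full, without_green = {}, {}
    for src, dst, color in edges:
        full.setdefault(src, []).append(dst)
        if color != "green":
            without_green.setdefault(src, []).append(dst)

    # Condition 1: no red edge lies on any cycle.
    for src, dst, color in edges:
        if color == "red" and src in reachable(full, dst):
            return False
    # Condition 2: every cycle contains a green edge, i.e. the graph
    # with green edges removed has no cycle at all.
    return not any(n in reachable(without_green, n) for n in without_green)

# Hypothetical coloring for the genetics example.
genetics = [
    ("Person.M-chrom", "Person.M-chrom", "green"),
    ("Person.P-chrom", "Person.M-chrom", "green"),
    ("Person.M-chrom", "Person.P-chrom", "green"),
    ("Person.P-chrom", "Person.P-chrom", "green"),
    ("Person.M-chrom", "Person.B-type", "yellow"),
    ("Person.P-chrom", "Person.B-type", "yellow"),
]
```

The reachability scan is quadratic in the worst case, which is fine for class-level graphs with a handful of attributes; a strongly connected components pass would do the same job in linear time.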
Searching Model Space
[Figure: structures over Student, Course and Reg; candidate moves "Add C.A → C.B" and "Delete S.I → S.P", each labeled with its score.]
• Phase 0: consider only dependencies within a class
Phased structure search
[Figure: structures over Student, Course and Reg; candidate moves "Add C.A → R.B" and "Add S.I → R.C", each labeled with its score.]
• Phase 1: consider dependencies from “neighboring” classes, via schema relations
[Figure: structures over Student, Course and Reg; candidate moves "Add C.A → S.P" and "Add S.I → C.B", each labeled with its score.]
• Phase 2: consider dependencies from “further” classes, via relation chains
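A phased search of this kind can be sketched as greedy hill climbing in which each phase enlarges the pool of candidate edges. This is an assumption-laden outline rather than the authors' procedure: the function name, the `score` and `is_legal` callbacks, and the grouping of candidate edges by phase are all hypothetical.

```python
def phased_greedy_search(candidates_by_phase, score, is_legal=lambda s: True):
    """Greedy add/delete search; phase k unlocks the candidate edges of phase k.

    candidates_by_phase: list of lists of (parent, child) edges.
    score: function from a frozenset of edges to a number (higher is better).
    is_legal: function from a frozenset of edges to bool (e.g. an acyclicity test).
    """
    structure = frozenset()
    best = score(structure)
    allowed = []
    for phase_edges in candidates_by_phase:
        allowed.extend(phase_edges)
        improved = True
        while improved:
            improved = False
            # Neighbors: add one allowed edge or delete one existing edge.
            moves = [structure | {e} for e in allowed if e not in structure]
            moves += [structure - {e} for e in structure]
            best_move = None
            for candidate in moves:
                if not is_legal(candidate):
                    continue
                value = score(candidate)
                if value > best:
                    best, best_move = value, candidate
            if best_move is not None:
                structure, improved = best_move, True
    return structure, best

# Toy score: reward edges in a hypothetical target set, penalize the rest.
target = {("S.I", "S.P"), ("C.D", "R.G"), ("S.I", "R.G")}

def toy_score(edges):
    return len(edges & target) - len(edges - target)

phases = [
    [("S.I", "S.P"), ("S.P", "S.I")],  # phase 0: within a class
    [("C.D", "R.G"), ("S.I", "R.G")],  # phase 1: neighboring classes
]
result, value = phased_greedy_search(phases, toy_score)
```

Each accepted move strictly increases the score, so the loop terminates; in practice `score` would be the Bayesian score and `is_legal` the acyclicity test.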
Experimental Results: Movie Domain (real data)
• 11,000 movies, 7,000 actors
• Classes and attributes:
– Actor: Gender
– Appears: Role-type
– Movie: Process, Decade, Genre
source: http://www-db.stanford.edu/movies/doc.html
Genetics domain (synthetic data)
[Figure: three Person objects, each with M-chromosome, P-chromosome and Blood-type, linked by Father and Mother relations; a Blood-Test class with attributes Contaminated and Result.]
Experimental Results
[Figure: Score (y-axis, -32000 to -18000) versus Dataset Size (x-axis, 200 to 800); series: Median Likelihood, Gold Standard.]
Future directions
• Learning in complex real-world domains
– drug treatment regimes
– collaborative filtering
• Missing data
• Learning with structural uncertainty
• Discovery
– hidden variables
– causal structure
– class hierarchy
Conclusions
• PRMs natural extension of BNs:
– well-founded (probabilistic) semantics
– compact representation of complex models
• Powerful learning techniques
– builds on BN learning techniques
– can learn directly from relational data
• Parameter estimation
– efficient, effective exploitation of DB technology
• Structure identification
– builds on well understood theory
– major issues:
  • guaranteeing coherence
  • search heuristics