FLAIRS 2001 1
Graph-Based Concept Learning
Jesus Gonzalez, Lawrence Holder and Diane Cook
Department of Computer Science and Engineering
The University of Texas at Arlington

Outline
Relational concept learning
Graph-based concept learning
  Conceptual graphs and Galois lattice
  Graph-based discovery in Subdue
  SubdueCL
Empirical results
Conclusions

Relational Concept Learning
Inductive Logic Programming (ILP)
  FOIL
  Progol
First-order logic vs. graphs
  Expressiveness
  Interpretability
Conceptual graphs

Conceptual Graphs
Logic-based knowledge representation
[Figure: conceptual graph in which one Object has Shape Triangle, a second Object has Shape Square, and the first Object is On the second.]
shape(X,triangle), shape(Y,square), on(X,Y)

Conceptual Graphs
Graph ↔ Logic
PAC-learning CGs [Jappy & Nock]
  Size of CG class and generalization (projection) operator polynomial in
    Number of relations
    Number of concepts
    Number of labels

Galois Lattice
Each node consists of a description graph and set of subsumed examples
Begins with positive examples
Generalization operator
  Most specific generalization
  Union of example sets
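
A minimal sketch of this construction, under a strong simplifying assumption: each description is a set of labeled edges, so the most specific generalization of two descriptions is just their intersection. Real conceptual graphs need a graph-matching generalization operator. The names `generalize` and `build_lattice` and the four stacked-shape examples are illustrative.

```python
from itertools import combinations


def generalize(desc_a, desc_b):
    # Assumption: a description is a frozenset of (top, relation, bottom)
    # edges, so the most specific generalization is the shared edge set.
    return desc_a & desc_b


def build_lattice(examples):
    """Map each description to the set of example ids it subsumes."""
    nodes = {}
    # The lattice begins with one node per positive example.
    for ex_id, desc in examples.items():
        nodes[desc] = nodes.get(desc, frozenset()) | {ex_id}
    changed = True
    while changed:  # repeat until no pair of nodes yields anything new
        changed = False
        for (d1, e1), (d2, e2) in combinations(list(nodes.items()), 2):
            general = generalize(d1, d2)
            covered = e1 | e2  # union of the two example sets
            if not covered <= nodes.get(general, frozenset()):
                nodes[general] = nodes.get(general, frozenset()) | covered
                changed = True
    return nodes


examples = {
    1: frozenset({("triangle", "on", "square"), ("square", "on", "rectangle")}),
    2: frozenset({("triangle", "on", "square"), ("square", "on", "circle")}),
    3: frozenset({("triangle", "on", "circle"), ("circle", "on", "rectangle")}),
    4: frozenset({("circle", "on", "rectangle"), ("rectangle", "on", "triangle")}),
}
lattice = build_lattice(examples)
for desc, ids in sorted(lattice.items(), key=lambda kv: (len(kv[1]), sorted(kv[1]))):
    print(sorted(ids), sorted(desc))
```

With these four examples the sketch produces seven nodes: one per example, a node for [1, 2] (triangle on square), a node for [3, 4] (circle on rectangle), and an empty description covering [1, 2, 3, 4].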

Galois Lattice
[Figure: Galois lattice built from four examples of stacked shapes (triangle, square, circle, rectangle) joined by "on" relations. Nodes are labeled with the example sets they subsume: [1], [2], [3], [4]; [1, 2], [3, 4]; [1, 2, 3], [1, 2, 4], [1, 3, 4], [2, 3, 4]; and [1, 2, 3, 4].]

Galois Lattice
Galois lattice creation O(n^3 p)
  n examples
  p nodes in lattice
Tractable for poly-time generalization
GRAAL system

Graph-Based Discovery
Finding “interesting” and repetitive substructures (connected subgraphs) in data represented as a graph
[Figure: three panels. Input Database (vertices R1, C1, T1 to T4, S1 to S4); Substructure S1 in graph form (object, shape, triangle, square, on); Compressed Database (R1, C1 and four instances of S1).]

Graph-Based Discovery
“Interesting” defined according to the Minimum Description Length principle
  min over S of [DL(S) + DL(G|S)]
General-to-specific beam search through substructure space
Poly-time inexact graph match
Subdue system
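
The quantity DL(S) + DL(G|S) can be illustrated with a small sketch. Two loud assumptions: description length is approximated here by counting vertices and edges (Subdue uses a bit-level encoding), and the instances of the substructure are supplied by hand instead of being found by graph matching. The function names and the toy graph are hypothetical.

```python
def description_length(num_vertices, num_edges):
    # Crude stand-in (assumption): one unit per vertex and one per edge.
    return num_vertices + num_edges


def compress(vertices, edges, instances, name="S1"):
    """Replace each instance (a set of vertex ids) by a single vertex."""
    owner = {}
    for i, instance in enumerate(instances):
        for v in instance:
            owner[v] = f"{name}#{i}"
    new_vertices = {v: lab for v, lab in vertices.items() if v not in owner}
    for i in range(len(instances)):
        new_vertices[f"{name}#{i}"] = name
    new_edges = []
    for src, label, dst in edges:
        s, d = owner.get(src, src), owner.get(dst, dst)
        if s == d and src in owner:
            continue  # edge lies inside one instance, absorbed by S
        new_edges.append((s, label, d))
    return new_vertices, new_edges


def mdl_value(sub_vertices, sub_edges, vertices, edges, instances):
    """DL(S) + DL(G|S) under the counting approximation above."""
    comp_vertices, comp_edges = compress(vertices, edges, instances)
    return (description_length(len(sub_vertices), len(sub_edges))
            + description_length(len(comp_vertices), len(comp_edges)))


# Toy input: two "triangle on square" stacks, both resting on a rectangle.
vertices = {"t1": "triangle", "s1": "square", "t2": "triangle",
            "s2": "square", "r1": "rectangle"}
edges = [("t1", "on", "s1"), ("t2", "on", "s2"),
         ("s1", "on", "r1"), ("s2", "on", "r1")]
sub_vertices = {"a": "triangle", "b": "square"}
sub_edges = [("a", "on", "b")]
instances = [{"t1", "s1"}, {"t2", "s2"}]

print(description_length(len(vertices), len(edges)))  # uncompressed: 9
print(mdl_value(sub_vertices, sub_edges, vertices, edges, instances))  # 3 + 5 = 8
```

A search over candidate substructures would keep the candidate with the smallest value; the beam search itself is not shown here.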

Subdue System
Graph-based…
  Discovery
  Concept learning
  Hierarchical conceptual clustering
Background knowledge
Parallel/distributed capability
http://cygnus.uta.edu/subdue

Graph-Based Concept Learning
[Figure: example graph of three object vertices chained by on edges, with shape edges to triangle and square.]

Graph-Based Concept Learning
Extension to graph-based discovery
Input now a set of positive graphs and a set of negative graphs
Set-covering approach
  Iterate until all positive graphs and no negative graphs covered
Result is a substructure DNF
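
How a substructure DNF classifies a new graph can be sketched as follows. The sketch assumes a simplified notion of coverage: graphs are sets of labeled edges and a substructure covers a graph when its edges are a subset of the graph's edges. Actual coverage in a graph-based learner is a subgraph match, which this toy test only approximates. The names `covers` and `classify` are hypothetical.

```python
def covers(substructure, graph):
    # Assumption: edge-subset test stands in for a subgraph match.
    return substructure <= graph


def classify(dnf, graph):
    # A DNF is a disjunction: the graph is positive if any one of the
    # learned substructures covers it.
    return any(covers(sub, graph) for sub in dnf)


dnf = [
    frozenset({("triangle", "on", "square")}),
    frozenset({("triangle", "on", "circle")}),
]
stack_a = frozenset({("triangle", "on", "square"), ("square", "on", "rectangle")})
stack_b = frozenset({("square", "on", "triangle"), ("circle", "on", "rectangle")})

print(classify(dnf, stack_a))  # True: the first disjunct covers it
print(classify(dnf, stack_b))  # False: no disjunct covers it
```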

Graph-Based Concept Learning
Solution 1
  Find substructure compressing positive graphs, but not negative graphs
  Compress graphs and iterate until no further compression
Problem
  Compressing, instead of removing, partially-covered positive graphs leads to overly-specific hypotheses

Graph-Based Concept Learning
Solution 2
  Find substructure covering positive graphs, but not negative graphs
  Remove covered positive graphs and iterate until all covered
  Substructure value = 1 - Error
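
The set-covering loop of Solution 2 can be sketched as below. Several parts are assumptions made for illustration: graphs are sets of labeled edges, coverage is an edge-subset test, candidates are enumerated naively (one or two edges taken from a positive graph) where Subdue runs a beam search, and Error is taken to be the fraction of examples handled wrongly (positives not covered plus negatives covered).

```python
from itertools import combinations


def covers(substructure, graph):
    # Assumption: edge-subset test stands in for a subgraph match.
    return substructure <= graph


def value(sub, positives, negatives):
    # Assumed error: (positives missed + negatives covered) / all examples.
    missed = sum(not covers(sub, g) for g in positives)
    wrong = sum(covers(sub, g) for g in negatives)
    return 1 - (missed + wrong) / (len(positives) + len(negatives))


def candidates(graphs):
    # Hypothetical generator: every 1-edge and 2-edge subset of each graph,
    # in a deterministic order.
    seen = {}
    for graph in graphs:
        edge_list = sorted(graph)
        for size in (1, 2):
            for combo in combinations(edge_list, size):
                seen.setdefault(frozenset(combo), None)
    return list(seen)


def learn(positives, negatives):
    remaining, dnf = list(positives), []
    while remaining:
        consistent = [c for c in candidates(remaining)
                      if not any(covers(c, g) for g in negatives)]
        if not consistent:
            break  # nothing separates the remaining positives
        best = max(consistent, key=lambda c: value(c, remaining, negatives))
        dnf.append(best)
        # Remove (not compress) the positive graphs that are now covered.
        remaining = [g for g in remaining if not covers(best, g)]
    return dnf


positives = [
    frozenset({("triangle", "on", "square"), ("square", "on", "rectangle")}),
    frozenset({("triangle", "on", "square"), ("square", "on", "circle")}),
    frozenset({("triangle", "on", "circle"), ("circle", "on", "rectangle")}),
]
negatives = [
    frozenset({("square", "on", "triangle"), ("circle", "on", "rectangle")}),
    frozenset({("rectangle", "on", "circle"), ("square", "on", "rectangle")}),
]
for sub in learn(positives, negatives):
    print(sorted(sub))
```

On this toy data the first iteration picks "triangle on square" (value 0.8, covering two positives) and the second picks "triangle on circle" for the one remaining positive graph.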

Empirical Results
Comparison with ILP systems
Non-relational domains from UCI repository

           Golf    Vote    Diabetes  Credit  TicTacToe
FOIL       66.67   93.02   70.66     66.16   100.00
Progol     33.33   76.98   51.97     44.55   100.00
SubdueCL   66.67   94.88   64.21     71.52   100.00
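
Empirical Results
Comparison with ILP systems
Relational domains: Chess endgame
[Figure: a board position with white king (WK), white rook (WR) and black king (BK) on files and ranks 0 to 2, and its graph representation with vertices WKC, WKR, WRC, WRR, BKC and BKR joined by pos, adj, lt and eq edges.]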

Empirical Results: Chess
FOIL: 11 rules, 99.34%
Progol: 5 rules, 99.74%
SubdueCL: 7 rules, 99.74%
[Figure: learned substructures over the vertices WKC, WKR, BKC and BKR, joined by pos, adj and lt edges.]

Empirical Results
Relational domain: Cancer
SubdueCL: 62%
Progol: 64% [72%]
[Figure: compound graphs and learned substructures, with labels such as compound, atom, element, type, charge, contains, six_ring, in_group, halide10, ashby_alert, ames, cytogen_ca, has_group, amine, chromaberr and drosophila_slrl.]

Empirical Results
Relational domain: Web
Professor (+) vs. student (-) websites
Hyperlink structure and page content
[Figure: learned substructure with the labels page, word and box.]

Empirical Results
Relational domain: Web
Computer store (+) vs. professor (-) websites
Hyperlink structure only
[Figure: learned substructure of pages 1, 8, 17, 18, 19, 21 and 22 connected by link edges.]

Conclusions
Theoretical analysis of graph-based concept learning
  PAC-learning conceptual graphs
  Galois lattice
  Next step: Relax graph constraints
Empirical analysis
  Competitive with other relational concept learners (ILP)
  Next step: More relational domains
SubdueCL (http://cygnus.uta.edu/subdue)