Presentation at Mining Data Semantics in Heterogeneous Networks Workshop at KDD 2013
Using the Structure of DBpedia for Exploratory Search
Speaker: Samantha Lam
Supervisor: Conor Hayes
Motivating Work
DBpedia - heterogeneous graph
Background
Network Similarity: PathSim, NetClus, RankClus
Faceted Search: Facets for refining search
specific schema, (semi) supervised
→ good for search when user is familiar with query
→ ...but what about complete beginners?
→ Requires Exploratory Search – Unsupervised
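As a reminder of what a network similarity measure such as PathSim computes, here is a minimal sketch. It assumes a single symmetric meta-path and a precomputed commuting matrix M, where M[i][j] counts meta-path instances from object i to object j; the score is 2·M[x][y] / (M[x][x] + M[y][y]). The function names and the toy counts are illustrative only and do not come from the talk.

```python
def commuting_matrix(w):
    """Return M = W * W^T for a symmetric meta-path such as A-B-A.

    w[i][k] is the number of links from object i (type A) to object k (type B).
    """
    n = len(w)
    return [[sum(a * b for a, b in zip(w[i], w[j])) for j in range(n)]
            for i in range(n)]


def pathsim(m, x, y):
    """PathSim score between objects x and y under commuting matrix m."""
    denom = m[x][x] + m[y][y]
    if denom == 0:
        return 0.0  # neither object has any meta-path instance
    return 2 * m[x][y] / denom


# Made-up link counts: three objects of type A, two objects of type B.
links = [
    [2, 1],    # object 0
    [50, 20],  # object 1: same neighbours as object 0, far higher counts
    [2, 0],    # object 2: similar counts to object 0
]
M = commuting_matrix(links)
```

With these toy counts, object 0 scores higher against object 2 than against object 1, because PathSim normalises by each object's own self-path count rather than rewarding sheer link volume.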
Exploratory Search?
Given a query, how can results be organised in a manner that is ‘useful’, i.e. aids exploratory search?
E.g. suppose you hear a song on the radio...
Solution:
Classify results according to their contexts
Why? Reduces the need for in-depth reading and guides the user
Assumption
similarity ⊂ relatedness
Research Questions
1 Can we provide an effective graph-based framework that can aid exploratory search?
2 To do this, what are DBpedia’s graph structures with respect to its different datasets?
DBpedia graphs summary
Infobox properties
emergent, crowd-sourced
heterogeneous ‘types’
dense
Infobox ontology, SKOS/Wiki Category, YAGO
agreed rules
is-A structure
sparse, tree-like
Infobox good for → Relatedness
Ontology good for → Labelling similar items
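One simple way to make the "dense" versus "sparse, tree-like" contrast concrete is directed graph density, |E| / (n(n-1)). The sketch below uses two made-up four-node graphs; it is only an illustration of the measure and not the analysis reported in the paper, and the `density` helper and edge lists are hypothetical.

```python
def density(num_nodes, edges):
    """Directed graph density: distinct non-loop edges over n * (n - 1)."""
    if num_nodes < 2:
        return 0.0
    distinct = {(u, v) for u, v in edges if u != v}
    return len(distinct) / (num_nodes * (num_nodes - 1))


# Toy is-A hierarchy: a tree on n nodes always has n - 1 edges.
tree_edges = [("b", "a"), ("c", "a"), ("d", "b")]

# Toy property graph on the same four nodes, with links in many directions.
property_edges = [("a", "b"), ("b", "a"), ("a", "c"),
                  ("c", "d"), ("d", "a"), ("b", "d")]

tree_density = density(4, tree_edges)
property_density = density(4, property_edges)
```

For a tree the density is 1/n, so it shrinks as the hierarchy grows, which is one reading of "sparse, tree-like".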
Research Q1 Proposition
General Framework:
Sample Query & Results
Query: Lisa Hannigan
Two methods: Weighted (W) and Uniform (U); 6 clusters each
Cluster 1 (W, U) instruments
Top label: (W, U) Musical instruments
Cluster 2 (W) songs (U) album and songs
Top label: (W) Songs by artist (U) Albums by artist
Cluster 3 (W) albums (U) album, music genres and songs
Top label: (W) Albums by artist (U) Music subgenres by genre
Cluster 4 (W) mixed, (U) mixed
Top label: (W) Songs by artist (U) Missing people
Cluster 5 (W) mixed, (U) mixed
Top label: (W) Albums by artist (U) Towns and villages in the Republic of Ireland by county
Cluster 6 (W) musicians and bands, (U) musicians and bands
Top label: (W) Place of birth missing (living people) (U) Place of birth missing (living people)
Summary:
Weighted produced 4 coherent clusters out of 6, whereas Uniform (unweighted) produced only 2.
DBpedia Ontology labelling (see paper) provided broader labelling for messier clusters, e.g. top label was MusicalWork for mixed clusters
→ Categories better for more specific clusters.
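The slides do not say how a cluster's "top label" is chosen. One plausible reading, shown here purely as an assumption, is a majority vote: take the category (or ontology class) attached to the most resources in the cluster. The `top_label` function, the resource names, and the category assignments below are all hypothetical.

```python
from collections import Counter


def top_label(cluster, categories):
    """Return the label shared by the most resources in the cluster.

    cluster: iterable of resource names.
    categories: dict mapping a resource name to its list of labels.
    Ties are broken alphabetically so the result is deterministic.
    """
    counts = Counter(
        label
        for resource in cluster
        for label in set(categories.get(resource, ()))  # count once per resource
    )
    if not counts:
        return None
    best = max(counts.values())
    return min(label for label, n in counts.items() if n == best)


# Made-up category assignments for three resources.
toy_categories = {
    "Guitar": ["Musical instruments", "String instruments"],
    "Harmonium": ["Musical instruments", "Keyboard instruments"],
    "Ukulele": ["Musical instruments", "String instruments"],
}
label = top_label(["Guitar", "Harmonium", "Ukulele"], toy_categories)
```

Under this reading, a mixed cluster would tend to end up with whichever broad or incidental label its members happen to share, which fits the behaviour reported for the messier clusters.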
Ongoing Challenges
Evaluation
User Study:
- compare only Weighted versus Unweighted results, different labelling methods?
Comparison:
- possible to compare against other faceted methods?
- compare with plain list for recall?
Summary
Investigated graph structure of DBpedia datasets
Framework to utilise this finding in exploratory search, gave example results
Ongoing challenge, evaluation
Thanks for listening! Questions welcome!