DESCRIPTION
Keyword-based linked data information retrieval is an easy choice for general-purpose users, but implementing such an approach is a challenge because mere keywords do not hold semantics. Some studies have incorporated templates in an effort to bridge this gap, but most such approaches have proven ineffective because of inefficient template management. Because linked data can be presented in a structured format, we can assume that the data's internal statistics can be used to effectively influence template management. In this work, we explore the use of this influence for template creation, ranking, and scaling. Then, we demonstrate how our proposal for automatic linked data information retrieval can be used alongside familiar keyword-based information retrieval methods, and can also be incorporated alongside other techniques, such as ontology inclusion and sophisticated matching, to achieve increased levels of performance.
BoTLRet: A Template-based Linked Data Information Retrieval
Md-Mizanur Rahoman
July 11, 2013
Introduction Related Work Proposed Method More than two keywords Experiments Conclusion Future Work Publication
Outline
Introduction
  Linked data
  Linked data access
Related Work
  Problem in current state of the art
  Probable solution
BoTLRet: Proposed System
  BoTLRet for two-input-keywords query
  BoTLRet for more than two-input-keywords query
Experiment
  Experimental setup
  Comparison with other systems
Conclusion
Future Work & Publication
Md-Mizanur Rahoman | 2
Introduction
Linked data
  Represents knowledge using a simple technique: subject, predicate, object
  Can be presented as a graph-like structure
  Uses an expressive, SQL-like query language called SPARQL
  Stores 295 datasets and 31 billion RDF triples, as of Sep. 2011
[Figure: example linked data graph. Resources: res:Barack_Obama, res:Michelle_Obama, res:Hawaii, res:Shizuya_Hayashi, yago/resource:Shizuya_Hayashi. Predicates: onto:spouse, prop:placeOfBirth, prop:placeOfDeath, onto:areaTotal, owl:sameAs. Literal: 2.8311e+10.]
Data access over linked data
Linked data access
  Requires know-how of the underlying linked data technologies, such as schema information or query formation
  Imposes adaptation difficulty on general-purpose users
  Could be offered through familiar keyword-based retrieval, but this introduces implementation difficulty
    Mere keywords do not hold semantics
  Could be implemented by introducing templates for familiar keyword-based retrieval
    Templates hold special semantics
Related Work
GoRelations: An Intuitive Query System for DBpedia [Han, et al., 2011]
  A comparatively easy query formation technique, but the user still needs to learn how to form queries
Template-Based Question Answering Over RDF Data [Unger, et al., 2012]
  An NL-tool-based QA system where templates are constructed using NL tools (e.g., a parser)
  Suffers in entity inferencing, which is essential
Keyword-driven SPARQL Query Generation Leveraging Background Knowledge [Shekarpour, et al., 2011]
  A keyword-based linked data retrieval system, but the user still needs to know the dataset's underlying technologies
  Can only handle queries with at most two keywords
Problem & Solution
Current template-fitted linked data access
Problem
  Lacks guidelines on template construction
  Has a poor template ranking strategy
Solution
  Construct templates according to the linked data structure
  Rank templates using the dataset's internal statistics
Overview of Our Proposed System
Binary Progressive Template Paradigm over Linked Data Retrieval (BoTLRet)
  Assumes keywords are given in order
  Constructs keyword-fitted query templates for
    Two input keywords
    More than two input keywords
  Ranks all query templates and selects the best query template for each two adjacent keywords by a binary progressive approach
    Constructs query templates for each two adjacent input keywords
    Extends template construction progressively if there are more than two input keywords
  Constructs the final SPARQL query from the best query templates of all adjacent keyword pairs
BoTLRet for two-input-keywords query
[Figure: BoTLRet pipeline for a two-keyword query. Keywords k1, k2 -> Resource Manager -> related resources, resource types -> Template Constructor -> templates arranged in affinity matrix -> Best Template Selector -> best possible query template.]
BoTLRet for two-input-keywords query
Resource Manager
  Finds related resources by matching keywords against dataset resources
  Classifies each related resource as predicate or non-predicate according to the position in which it most frequently appears
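The classification step can be sketched as follows. This is a minimal illustration, assuming a resource counts as a predicate when it appears in the predicate position more often than in the subject and object positions combined. The toy triples, the function names, and this comparison rule are assumptions made for the example, not details taken from the slides.

```python
from collections import Counter

# Toy dataset (illustrative only): (subject, predicate, object) triples.
TRIPLES = [
    ("res:Michelle_Obama", "onto:spouse", "res:Barack_Obama"),
    ("res:Barack_Obama", "onto:spouse", "res:Michelle_Obama"),
    ("res:Barack_Obama", "prop:placeOfBirth", "res:Hawaii"),
]


def position_counts(resource, triples):
    """Count how often a resource appears in each triple position."""
    counts = Counter()
    for subject, predicate, obj in triples:
        if subject == resource:
            counts["subject"] += 1
        if predicate == resource:
            counts["predicate"] += 1
        if obj == resource:
            counts["object"] += 1
    return counts


def classify(resource, triples):
    """Assumed rule: predicate if the predicate position dominates."""
    counts = position_counts(resource, triples)
    non_predicate = counts["subject"] + counts["object"]
    return "predicate" if counts["predicate"] > non_predicate else "non-predicate"


print(classify("onto:spouse", TRIPLES))
print(classify("res:Barack_Obama", TRIPLES))
```

On this toy data, onto:spouse only ever appears between two resources, so it is classified as a predicate, while res:Barack_Obama only appears at the ends of triples.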
BoTLRet for two-input-keywords query
Template Constructor
  Finds a semantic structure (query template) for the resources related to two adjacent keywords, together with their resource types, in order to retrieve related triples
  Ranks all query templates by their appearance frequencies and by how closely they connect the related resources (depth level)
Query Template
Assume
  r1 is the keyword-related resource for keyword k1
  r2 is the keyword-related resource for keyword k2

Table: Query templates with depth level and SPARQL query (network diagrams with merge-point ?uri omitted)
Query Template                        Depth level   SPARQL Query
[‖?uri, r1, r2‖]                      1             SELECT ?uri WHERE {?uri r1 r2.}
...                                   ...           ...
[‖?uri, ?p1, r1‖, ‖?uri, ?p2, r2‖]    2             SELECT ?uri WHERE {?uri ?p1 r1. ?uri ?p2 r2.}
...                                   ...           ...
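The mapping from a query template to its SPARQL query can be sketched as below. This is a minimal illustration, assuming a template is represented as a list of (subject, predicate, object) patterns. The function name to_sparql and the constants are hypothetical, and r1 and r2 stand in for real resource identifiers.

```python
def to_sparql(template, answer="?uri"):
    """Render a list of triple patterns as a SPARQL SELECT query string."""
    patterns = " ".join(f"{s} {p} {o} ." for s, p, o in template)
    return f"SELECT {answer} WHERE {{ {patterns} }}"


# Depth level 1: both keyword resources sit in a single triple pattern.
DEPTH_1 = [("?uri", "r1", "r2")]

# Depth level 2: two triple patterns joined at the merge-point ?uri.
DEPTH_2 = [("?uri", "?p1", "r1"), ("?uri", "?p2", "r2")]

print(to_sparql(DEPTH_1))
print(to_sparql(DEPTH_2))
```

Keeping the template as data rather than as a query string makes it easy to rank, compare, and later merge templates before any query text is produced.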
BoTLRet for two-input-keywords query
Best Template Selector
  Selects the best query template from all ranked query templates
  Generates the SPARQL query for the best query template
BoTLRet for two-input-keywords query
Example
  keyword1 = spouse
  keyword2 = Barack Obama
  related resource1 = onto:spouse
  related resource2 = res:Barack_Obama
  resource type1 = predicate
  resource type2 = non-predicate
  qt1 = [‖?uri, onto:spouse, res:Barack_Obama‖]
  qt2 = [‖res:Barack_Obama, onto:spouse, ?uri‖]
  qt1 relates <res:Michelle_Obama, onto:spouse, res:Barack_Obama>
  qt2 relates <res:Barack_Obama, onto:spouse, res:Michelle_Obama>
[Figure: example linked data graph, repeated from the Linked data slide, showing res:Barack_Obama and res:Michelle_Obama connected by onto:spouse.]
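The example can be checked against a toy dataset with a small pattern matcher. This is a sketch only: the two triples are the ones the example says qt1 and qt2 relate to, and the evaluate function is a hypothetical stand-in for a real SPARQL engine.

```python
# The two triples named in the example, as (subject, predicate, object).
TRIPLES = [
    ("res:Michelle_Obama", "onto:spouse", "res:Barack_Obama"),
    ("res:Barack_Obama", "onto:spouse", "res:Michelle_Obama"),
]

QT1 = ("?uri", "onto:spouse", "res:Barack_Obama")
QT2 = ("res:Barack_Obama", "onto:spouse", "?uri")


def evaluate(pattern, triples, answer="?uri"):
    """Return the values bound to `answer` by every triple matching the pattern."""
    results = []
    for triple in triples:
        bindings = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable must bind to the same value everywhere it occurs.
                if bindings.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(bindings[answer])
    return results


print(evaluate(QT1, TRIPLES))
print(evaluate(QT2, TRIPLES))
```

Both templates retrieve res:Michelle_Obama here, but through different triples, which is why the two templates are kept as separate candidates for ranking.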
BoTLRet for more than two-input-keywords query
[Figure: BoTLRet pipeline for more than two keywords. Keywords k1, k2, ... kn -> Best Template Constructor -> best possible query templates, their related keywords and weights -> Comparator -> retained template, not retained keyword -> Refiner -> refined retained template, adjusted template -> Merger -> final merged query template.]
BoTLRet for more than two-input-keywords query
Best Template Constructor
  Constructs the best query template for each two adjacent keywords
BoTLRet for more than two-input-keywords query
Comparator
  Compares each two adjacent best query templates, then
    keeps one query template as the retained query template
    finds the one input keyword (called the not retained keyword) that is not covered by the retained query template
  Finds common input keywords that are associated with both “retained query templates” and “not retained keywords”
BoTLRet for more than two-input-keywords query
Refiner
  Discards the common input keywords found by the previous process
  Finds query templates (called adjusted templates) for the “not retained keywords” that were not discarded as common input keywords
  Accumulates the best “retained query templates” (called refined retained templates) and the “adjusted templates”
BoTLRet for more than two-input-keywords query
Merger
  Merges all “refined retained templates” and “adjusted templates”
BoTLRet for more than two-input-keywords query
Assume a process flow with 5 input keywords: k1, k2, k3, k4, and k5
[Figure: process flow for keywords k1 to k5 and best query templates qt1 to qt4, one per adjacent keyword pair. Recoverable labels: rrt1 = qt1, rrt2 = qt4, rnk1 = k3, adt1 from k3.]
Legend
  k: Keyword
  qt: Best Query Template
  rt: Retained Query Template
  nrk: Not Retained Keyword
  rrt: Refined retained template
  rnk: Refined not retained keyword
  adt: Adjusted template
  mgt: Merged template
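The first two stages of this flow can be sketched as follows. The slides do not state how the Comparator chooses which template to retain, so this sketch assumes it keeps the template with the higher weight. That rule, the weights, and all names below are hypothetical and only illustrate the data flow.

```python
from typing import NamedTuple, Tuple


class BestTemplate(NamedTuple):
    name: str
    keywords: Tuple[str, str]
    weight: float  # hypothetical ranking weight


def adjacent_pairs(keywords):
    """Pair each keyword with its right-hand neighbour: (k1, k2), (k2, k3), ..."""
    return list(zip(keywords, keywords[1:]))


def compare(left, right):
    """Assumed Comparator rule: retain the heavier of two adjacent templates.

    Returns the retained template and the keyword it does not cover.
    """
    retained, other = (left, right) if left.weight >= right.weight else (right, left)
    not_retained = [k for k in other.keywords if k not in retained.keywords]
    return retained, not_retained[0]


KEYWORDS = ["k1", "k2", "k3", "k4", "k5"]
WEIGHTS = [0.9, 0.4, 0.3, 0.8]  # made-up weights for qt1..qt4

TEMPLATES = [
    BestTemplate(f"qt{i}", pair, weight)
    for i, (pair, weight) in enumerate(zip(adjacent_pairs(KEYWORDS), WEIGHTS), start=1)
]

for left, right in zip(TEMPLATES, TEMPLATES[1:]):
    retained, keyword = compare(left, right)
    print(left.name, right.name, "->", retained.name, keyword)
```

With these made-up weights, comparing qt1 with qt2 retains qt1 and leaves k3 uncovered, and comparing qt3 with qt4 retains qt4 and again leaves k3 uncovered, which is consistent with the figure's adjusted template being built from k3.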
Experimental Setup
Input
  Ordered keywords to find the intended result
Experimental Data
  Question Answering over Linked Data (QALD) challenge data
  Consists of natural language questions
    DBPedia test case - 42 questions over the DBPedia dataset
    MusicBrainz test case - 46 questions over the MusicBrainz dataset
Construction of Keywords
  Consider the QALD question and the underlying dataset
  Example
    In which films directed by Garry Marshall was Julia Roberts starring?
    keywords: Film, starring, Julia Roberts, Director, Garry Marshall
Experimental Perspectives
Analyze BoTLRet for
Completeness vs computational cost effectiveness
  Construct another, exhaustive system (called MTS) that uses similar template construction but ignores resource classification
  Compare recall, precision, and F1 measure of MTS and BoTLRet w.r.t. their incurred computational cost
Competitiveness against other state-of-the-art systems
  Run the experimental data on 2 recent template-based linked data retrieval initiatives
  Compare recall, precision, and F1 measure against those systems
DBPedia test case - Completeness vs Computational Cost Effectiveness
Table: Performance analysis between MTS and BoTLRet
                 MTS                               BoTLRet
Keyword Group    Precision  Recall  F1 Measure     Precision  Recall  F1 Measure
2                0.935      0.963   0.941          0.898      0.926   0.904
3                0.559      0.705   0.595          0.559      0.705   0.595
4                0.333      0.333   0.333          0.333      0.333   0.333
5                1.000      1.000   1.000          1.000      1.000   1.000
Average          0.851      0.795   0.808          0.827      0.771   0.785
Table: Computational cost consumption by MTS and BoTLRet
               No. of templates used     Computational cost of
Case           MTS        BoTLRet        BoTLRet w.r.t. MTS
PR - NP        49         3              0.061 (= 3/49)
NP - NP        49         8              0.163 (= 8/49)
TOT            49         11             0.224 (= 11/49)
BoTLRet achieves performance very close to that of MTS at quite low computational cost
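The relative cost figures in the table above are simple template-count ratios, which can be reproduced as below. The variable names are illustrative. The counts are the ones reported in the table.

```python
MTS_TEMPLATES = 49  # templates used by MTS in every case

# Templates used by BoTLRet per case, as reported in the table.
BOTLRET_TEMPLATES = {"PR-NP": 3, "NP-NP": 8}

# Relative cost = BoTLRet template count / MTS template count.
RATIOS = {case: count / MTS_TEMPLATES for case, count in BOTLRET_TEMPLATES.items()}

TOTAL = sum(BOTLRET_TEMPLATES.values())
RATIOS["TOT"] = TOTAL / MTS_TEMPLATES

for case, ratio in RATIOS.items():
    print(f"{case}: {ratio:.3f}")
```

The total case is the sum of the two individual cases (3 + 8 = 11 templates), so BoTLRet uses roughly 22% of the templates that MTS uses.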
DBPedia test case - Competitiveness
Table: Performance comparison between GoRelations and BoTLRet
               Precision  Recall  F1 Measure
GoRelations    0.687      0.722   0.704
BoTLRet        0.793      0.825   0.801

Table: Performance comparison between KSQGLBK and BoTLRet
               Precision  Recall  F1 Measure
KSQGLBK        0.687      1.00    0.814
BoTLRet        0.854      0.917   0.867

BoTLRet performs comparatively well against other state-of-the-art template-based linked data information retrieval initiatives
MusicBrainz test case - Completeness vs Computational Cost Effectiveness
Table: Performance analysis between MTS and BoTLRet
                 MTS                               BoTLRet
Keyword Group    Precision  Recall  F1 Measure     Precision  Recall  F1 Measure
2                0.944      0.944   0.944          0.944      0.944   0.944
3                0.846      0.846   0.846          0.846      0.846   0.846
4                1.000      1.000   1.000          0.000      0.000   0.000
Average          0.909      0.909   0.909          0.864      0.864   0.864
Table: Computational cost consumption by MTS and BoTLRet
               No. of templates used     Computational cost of
Case           MTS        BoTLRet        BoTLRet w.r.t. MTS
PR - NP        49         3              0.061 (= 3/49)
NP - NP        49         8              0.163 (= 8/49)
TOT            49         11             0.224 (= 11/49)
BoTLRet achieves performance very close to that of MTS at quite low computational cost
Conclusion
BoTLRet
  Implements predefined templates for automatic incorporation of semantics in keyword-based linked data retrieval
  Presents concrete guidelines for template construction and template ranking
  Shows implementation results on real linked data, with some comparison
Future Work
Work 1: Inclusion of Temporal Semantics over Keyword-based Linked Data Retrieval
  Motivation - Temporal attributes hold concise semantics in any kind of search paradigm
  Brief Research Plan
    Start: March 2013
    Submission to Conference: August 2013
    Submission to Journal: October 2013
Work 2: Inclusion of Automatic Inferencing Framework to Extended-BoTLRet System over Multiple Linked Datasets Access
  Motivation - An inference engine will be able to produce more intuitive results
  Brief Research Plan
    Start: November 2013
    Submission to Conference: August 2014
    Submission to Journal: December 2014
Publication
Rahoman M., Ichise R.: An Automated Template Selection Framework for Keyword Query over Linked Data. In Proceedings of the 2nd Joint International Semantic Technology Conference, pages 175-190 (2012).
Rahoman M., Ichise R.: Inclusion of Temporal Semantics over Keyword-based Linked Data Retrieval. In Proceedings of the 27th Annual Conference of the Japanese Society for Artificial Intelligence (2013).
Extra Slides
Query Template
Table: Query templates qt(r1, r2) by template group (TG), with SPARQL query (network diagrams with merge-point ?uri omitted; the group label of the last four rows is not recoverable)
TG     Query Template qt(r1, r2)             SPARQL Query for qt(r1, r2)
TG1    [‖?uri, r1, r2‖]                      SELECT ?uri WHERE {?uri r1 r2.}
TG1    [‖r2, r1, ?uri‖]                      SELECT ?uri WHERE {r2 r1 ?uri.}
TG2    [‖r1, ?uri, r2‖]                      SELECT ?uri WHERE {r1 ?uri r2.}
TG2    [‖r2, ?uri, r1‖]                      SELECT ?uri WHERE {r2 ?uri r1.}
       [‖?uri, ?p1, r1‖, ‖?uri, ?p2, r2‖]    SELECT ?uri WHERE {?uri ?p1 r1. ?uri ?p2 r2.}
       [‖?uri, ?p1, r1‖, ‖r2, ?p2, ?uri‖]    SELECT ?uri WHERE {?uri ?p1 r1. r2 ?p2 ?uri.}
       [‖r1, ?p1, ?uri‖, ‖?uri, ?p2, r2‖]    SELECT ?uri WHERE {r1 ?p1 ?uri. ?uri ?p2 r2.}
       [‖r1, ?p1, ?uri‖, ‖r2, ?p2, ?uri‖]    SELECT ?uri WHERE {r1 ?p1 ?uri. r2 ?p2 ?uri.}
Calculation of appearance frequency for query template
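Frequency of query template: $\mathit{fqQT}(qt(r_1, r_2))$

Frequency of resource $r$, given query template $qt(r_1, r_2)$:

\[
\mathit{fqR}(r, qt(r_1, r_2)) =
\begin{cases}
\mathit{PF}_s(r) & \text{if } r \text{ is the subject in } qt(r_1, r_2) \\
\mathit{PF}_p(r) & \text{if } r \text{ is the predicate in } qt(r_1, r_2) \\
\mathit{PF}_o(r) & \text{if } r \text{ is the object in } qt(r_1, r_2)
\end{cases}
\]

The final appearance frequency $\mathit{flW}(qt(r_1, r_2))$ of query template $qt(r_1, r_2)$:

\[
\mathit{flW}(qt(r_1, r_2)) = \mathit{fqQT}(qt(r_1, r_2)) \cdot \mathit{fqR}(r_1, qt(r_1, r_2)) \cdot \mathit{fqR}(r_2, qt(r_1, r_2))
\]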
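The product above can be sketched in code. The slides do not define PFs, PFp, and PFo or how fqQT is computed, so this sketch assumes that a positional frequency is the share of a resource's appearances that fall in the given position, and it takes fqQT as a given number. The toy triples and all function names are illustrative assumptions.

```python
# Toy dataset (illustrative only): (subject, predicate, object) triples.
TRIPLES = [
    ("res:Michelle_Obama", "onto:spouse", "res:Barack_Obama"),
    ("res:Barack_Obama", "onto:spouse", "res:Michelle_Obama"),
    ("res:Barack_Obama", "prop:placeOfBirth", "res:Hawaii"),
]

POSITION_INDEX = {"subject": 0, "predicate": 1, "object": 2}


def positional_frequency(resource, position, triples):
    """Assumed PF: fraction of the resource's appearances at this position."""
    total = sum(triple.count(resource) for triple in triples)
    if total == 0:
        return 0.0
    index = POSITION_INDEX[position]
    at_position = sum(1 for triple in triples if triple[index] == resource)
    return at_position / total


def final_weight(template_frequency, resource_positions, triples):
    """flW = fqQT * fqR(r1) * fqR(r2), with fqR looked up by position."""
    weight = template_frequency
    for resource, position in resource_positions:
        weight *= positional_frequency(resource, position, triples)
    return weight


# qt1 = [?uri, onto:spouse, res:Barack_Obama]; fqQT is set to 1.0 for illustration.
QT1_WEIGHT = final_weight(
    1.0, [("onto:spouse", "predicate"), ("res:Barack_Obama", "object")], TRIPLES
)
# qt2 = [res:Barack_Obama, onto:spouse, ?uri]
QT2_WEIGHT = final_weight(
    1.0, [("onto:spouse", "predicate"), ("res:Barack_Obama", "subject")], TRIPLES
)
print(QT1_WEIGHT, QT2_WEIGHT)
```

Under these assumptions qt2 scores higher than qt1 on the toy data, because res:Barack_Obama appears more often as a subject than as an object.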
Adjusted Template
Construct modified query templates for each refined not retained keyword

Table: Modified query templates mqt(r1) by modified template group (MTG), with SPARQL query (network diagrams with merge-point omitted)
MTG     Modified Query Template    SPARQL Query for mqt(r1)
MTG1    [‖?uri, r1, ?o1‖]          SELECT ?uri WHERE {?uri r1 ?o1.}
MTG2    [‖r1, ?p1, ?uri‖]          SELECT ?uri WHERE {r1 ?p1 ?uri.}
MTG2    [‖?uri, ?p1, r1‖]          SELECT ?uri WHERE {?uri ?p1 r1.}

Find the adjusted template from the modified query templates for each refined not retained keyword
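The modified query templates for a single keyword resource can be generated as sketched below. The slide lists the groups MTG1 and MTG2 but does not say when each applies. This sketch assumes MTG1 is used for predicate resources and MTG2 for non-predicate resources, which follows from where r1 sits in each pattern but is not stated in the source. All names are hypothetical.

```python
def modified_templates(resource, resource_type):
    """Return single-pattern templates for one keyword-related resource.

    Assumption: MTG1 applies to predicates, MTG2 to non-predicates.
    """
    if resource_type == "predicate":
        # MTG1: the resource sits in the predicate position.
        return [("?uri", resource, "?o1")]
    # MTG2: the resource sits in the subject or the object position.
    return [(resource, "?p1", "?uri"), ("?uri", "?p1", resource)]


def to_sparql(pattern, answer="?uri"):
    """Render one triple pattern as a SPARQL SELECT query string."""
    subject, predicate, obj = pattern
    return f"SELECT {answer} WHERE {{ {subject} {predicate} {obj} . }}"


for pattern in modified_templates("onto:spouse", "predicate"):
    print(to_sparql(pattern))
for pattern in modified_templates("res:Hawaii", "non-predicate"):
    print(to_sparql(pattern))
```

Selecting the adjusted template among these candidates would then reuse the same ranking as for two-keyword templates, which is not shown here.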
Merger
Merge all refined retained and adjusted query templates
  Maintain keyword order in query template merging
  Find merge-points for each refined retained and adjusted query template
  Merge query templates at the merge-points

[Figure: Example. Panels (a), (b), and (c) show query templates (i), (ii), and (iii), built from ?uri, ?p1, ?o1 and the keyword resources rk1,1, rk2,1, rk3,1, and rk4,1, being merged (+) at their merge-points.]
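Merging can be sketched as below for the simplest case. This sketch assumes every template uses ?uri as its merge-point and that all other variables are local to their template, so they are renamed to avoid clashes. The figure suggests that other merge-points are possible, and those are not handled here. The function name and the rk placeholders are illustrative.

```python
def merge_templates(templates, merge_point="?uri"):
    """Concatenate triple patterns in keyword order, sharing only the merge-point.

    Variables other than the merge-point get a per-template suffix so that,
    for example, ?p1 from one template never collides with ?p1 from another.
    """
    merged = []
    for index, template in enumerate(templates, start=1):
        for pattern in template:
            renamed = tuple(
                f"{term}_{index}" if term.startswith("?") and term != merge_point else term
                for term in pattern
            )
            merged.append(renamed)
    return merged


# Templates listed in keyword order (placeholders stand in for real resources).
TEMPLATES = [
    [("?uri", "?p1", "rk1")],   # adjusted template for a non-predicate resource
    [("?uri", "rk2", "?o1")],   # adjusted template for a predicate resource
    [("?uri", "rk3", "rk4")],   # refined retained template
]

MERGED = merge_templates(TEMPLATES)
QUERY = "SELECT ?uri WHERE { " + " ".join(f"{s} {p} {o} ." for s, p, o in MERGED) + " }"
print(QUERY)
```

Because the templates are processed in list order, keeping the input list in keyword order preserves keyword order in the merged query.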