BoTLRet: A Template-based Linked Data Information Retrieval Md-Mizanur Rahoman July 11, 2013

DESCRIPTION

Keyword-based linked data information retrieval is an easy choice for general-purpose users, but implementing such an approach is a challenge because keywords alone do not carry semantics. Some studies have incorporated templates in an effort to bridge this gap, but most such approaches have proven ineffective because of inefficient template management. Because linked data can be presented in a structured format, we can assume that the data's internal statistics can be used to effectively influence template management. In this work, we explore the use of this influence for template creation, ranking, and scaling. Then, we demonstrate how our proposal for automatic linked data information retrieval can be used alongside familiar keyword-based information retrieval methods, and can also be incorporated alongside other techniques, such as ontology inclusion and sophisticated matching, to achieve increased levels of performance.


Introduction Related Work Proposed Method More than two keywords Experiments Conclusion Future Work Publication

Outline

Introduction
- Linked data
- Linked data access

Related Work
- Problem in current state of the art
- Probable solution

BoTLRet: Proposed System
- BoTLRet for two-input-keywords query
- BoTLRet for more than two-input-keywords query

Experiment
- Experimental setup
- Comparison with other systems

Conclusion

Future Work & Publication


Introduction

Linked data
- Represents knowledge with a simple technique: subject, predicate, object triples
- Can be presented as a graph-like structure
- Is queried with an expressive, SQL-like query language called SPARQL
- Holds 295 datasets and 31 billion RDF triples, as of Sep. 2011

[Figure: example linked data graph. Nodes include res:Barack_Obama, res:Michelle_Obama, res:Hawaii, res:Shizuya_Hayashi, yago/resource:Shizuya_Hayashi and the literal 2.8311e+10; edge labels include onto:spouse, prop:placeOfBirth, prop:placeOfDeath, onto:areaTotal and owl:sameAs.]
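The subject, predicate, object model can be illustrated with a small sketch. The Python below is only a toy stand-in for a real triple store and SPARQL engine: the function name `match` and the in-memory set `TRIPLES` are hypothetical, and the only facts used are the two spouse triples that the slides state explicitly.

```python
# Toy triple store: each fact is a (subject, predicate, object) tuple.
# Only the two spouse triples stated in the slides are included.
TRIPLES = {
    ("res:Barack_Obama", "onto:spouse", "res:Michelle_Obama"),
    ("res:Michelle_Obama", "onto:spouse", "res:Barack_Obama"),
}


def match(pattern, triples=TRIPLES):
    """Return variable bindings for one triple pattern.

    Terms starting with '?' are variables; every other term must match exactly.
    """
    results = []
    for triple in sorted(triples):
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(binding)
    return results


# Rough analogue of: SELECT ?uri WHERE { res:Barack_Obama onto:spouse ?uri . }
spouses = match(("res:Barack_Obama", "onto:spouse", "?uri"))
```

A real deployment would send the SPARQL query to an endpoint instead; the sketch only shows how a triple pattern with one variable selects matching facts.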


Data access over linked data

Linked data access
- Requires know-how of the technologies underlying linked data, such as schema information or query formation
- Is difficult for general-purpose users to adopt
- Could be offered through familiar keyword-based retrieval, but that is difficult to implement
  - Keywords alone do not hold semantics
- Could be implemented by introducing templates into familiar keyword-based retrieval
  - A template holds special semantics


Related Work

GoRelations: An Intuitive Query System for DBpedia [Han, et al., 2011]
- A comparatively easy query formation technique, but the user still needs to learn how to form queries

Template-Based Question Answering Over RDF Data [Unger, et al., 2012]
- An NL-tool-based QA system in which templates are constructed using NL tools (e.g., a parser)
- Suffers in entity inferencing, which is essential

Keyword-driven SPARQL Query Generation Leveraging Background Knowledge [Shekarpour, et al., 2011]
- A keyword-based linked data retrieval system, but the user still needs to know the technologies underlying the dataset
- Can only handle queries with at most two keywords


Problem & Solution

Current template-based linked data access

Problem
- Lacks guidelines for template construction
- Uses a poor template ranking strategy

Solution
- Construct templates according to the linked data structure
- Rank templates using the dataset's internal statistics


Overview of Our Proposed System

Binary Progressive Template Paradigm over Linked Data Retrieval (BoTLRet)
- Assumes keywords are given in order
- Constructs keyword-fitted query templates for
  - two input keywords
  - more than two input keywords
- Ranks all query templates and selects the best query template for each two adjacent keywords by a binary progressive approach
  - Constructs query templates for each two adjacent input keywords
  - Extends template construction progressively if there are more than two input keywords
- Constructs the final SPARQL query from the best query templates of all adjacent keyword pairs
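The pairing step of the binary progressive approach can be sketched as follows. This is a minimal illustration, not the system's code: the helper name `adjacent_pairs` is hypothetical, and the keyword list is the example given in the experimental setup.

```python
def adjacent_pairs(keywords):
    """Return each pair of neighbouring keywords, preserving the input order."""
    return list(zip(keywords, keywords[1:]))


keywords = ["Film", "starring", "Julia Roberts", "Director", "Garry Marshall"]
pairs = adjacent_pairs(keywords)
# Five ordered keywords give four adjacent pairs; one best query template
# would then be selected for each pair.
```

Because the input order determines which keywords become neighbours, reordering the keywords changes which templates are built.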


BoTLRet for two-input-keywords query

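[Figure: BoTLRet pipeline for two input keywords. Keywords k1, k2 enter the Resource Manager, which outputs related resources and resource types; the Template Constructor outputs templates arranged in an affinity matrix; the Best Template Selector outputs the best possible query template.]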


Resource Manager
- Finds related resources by matching keywords against dataset resources
- Classifies each related resource as predicate or non-predicate according to the position in which it most frequently appears
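The classification step can be illustrated with a short sketch. It assumes the dataset is available as an iterable of (subject, predicate, object) tuples; the function name `classify` and the tie-breaking rule are assumptions, since the slides do not specify them.

```python
from collections import Counter


def classify(resource, triples):
    """Label a resource 'predicate' or 'non-predicate' by where it occurs most often."""
    counts = Counter()
    for s, p, o in triples:
        if resource == p:
            counts["predicate"] += 1
        if resource in (s, o):
            counts["non-predicate"] += 1
    if not counts:
        return None  # the resource does not occur in the data
    # Assumption: ties go to 'non-predicate'; the slides do not say how ties are resolved.
    if counts["predicate"] > counts["non-predicate"]:
        return "predicate"
    return "non-predicate"


sample = [
    ("res:Barack_Obama", "onto:spouse", "res:Michelle_Obama"),
    ("res:Michelle_Obama", "onto:spouse", "res:Barack_Obama"),
]
spouse_type = classify("onto:spouse", sample)
obama_type = classify("res:Barack_Obama", sample)
```

On a real dataset the counts would more plausibly come from aggregate queries than from a full scan; the scan keeps the sketch self-contained.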


Template Constructor
- Finds a semantic structure (a query template) for the related resources of two adjacent keywords and their resource types, in order to retrieve the related triples
- Ranks all query templates by their appearance frequencies and by how closely they connect the related resources (depth level)


Query Template

Assume
- r1 is a keyword-related resource for keyword k1
- r2 is a keyword-related resource for keyword k2

Query Template                           Depth level   SPARQL Query
[‖?uri, r1, r2‖]                         1             SELECT ?uri WHERE {?uri r1 r2.}
...                                      ...           ...
[‖?uri, ?p1, r1‖, ‖?uri, ?p2, r2‖]       2             SELECT ?uri WHERE {?uri ?p1 r1. ?uri ?p2 r2.}
...                                      ...           ...

(The "Network with merge-point" column of the original table draws each template as a small graph over ?uri, r1 and r2.)
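The mapping from a query template to its SPARQL query is mechanical, as the following sketch suggests. It assumes a template is represented as a list of (subject, predicate, object) patterns; the function name `to_sparql` is hypothetical.

```python
def to_sparql(template, select="?uri"):
    """Render a query template (a list of triple patterns) as a SPARQL SELECT query."""
    patterns = " ".join(f"{s} {p} {o} ." for s, p, o in template)
    return f"SELECT {select} WHERE {{ {patterns} }}"


# The two templates shown on this slide, at depth levels 1 and 2.
depth1 = [("?uri", "r1", "r2")]
depth2 = [("?uri", "?p1", "r1"), ("?uri", "?p2", "r2")]

query1 = to_sparql(depth1)
query2 = to_sparql(depth2)
```

In practice r1 and r2 would be replaced by concrete resource IRIs before the query is sent to an endpoint.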


BoTLRet for two-input-keywords query

Best Template Selector
- Selects the best query template from all ranked query templates
- Generates the SPARQL query for the best query template


Example
- keyword1 = spouse
- keyword2 = Barack Obama
- related resource1 = onto:spouse
- related resource2 = res:Barack_Obama
- resource type1 = predicate
- resource type2 = non-predicate
- qt1 = [‖ ?uri, onto:spouse, res:Barack_Obama ‖]
- qt2 = [‖ res:Barack_Obama, onto:spouse, ?uri ‖]
- qt1 relates <res:Michelle_Obama, onto:spouse, res:Barack_Obama>
- qt2 relates <res:Barack_Obama, onto:spouse, res:Michelle_Obama>
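The example can be replayed in a few lines. The sketch below is illustrative only: `SHAPES`, `instantiate` and `fits` are hypothetical names, and the two shapes are simply the predicate-in-the-middle templates that the example uses.

```python
def instantiate(shape, r1, r2):
    """Fill the placeholders 'r1' and 'r2' of a template shape with concrete resources."""
    mapping = {"r1": r1, "r2": r2}
    return [tuple(mapping.get(term, term) for term in triple) for triple in shape]


def fits(pattern, triple):
    """True if the triple matches the pattern, treating '?'-terms as wildcards."""
    return all(t.startswith("?") or t == v for t, v in zip(pattern, triple))


# r1 is the predicate resource; ?uri sits on either side of it.
SHAPES = {
    "qt1": [("?uri", "r1", "r2")],
    "qt2": [("r2", "r1", "?uri")],
}
templates = {
    name: instantiate(shape, "onto:spouse", "res:Barack_Obama")
    for name, shape in SHAPES.items()
}

michelle_to_barack = ("res:Michelle_Obama", "onto:spouse", "res:Barack_Obama")
barack_to_michelle = ("res:Barack_Obama", "onto:spouse", "res:Michelle_Obama")
```

Each template matches exactly one of the two stated triples, which is why both directions have to be generated as candidates before ranking.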


BoTLRet for more than two-input-keywords query

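[Figure: BoTLRet pipeline for more than two input keywords. Keywords k1, k2, ... kn enter the Best Template Constructor, which outputs the best possible query templates, their related keywords and weights; the Comparator outputs retained templates and not retained keywords; the Refiner outputs refined retained templates and adjusted templates; the Merger outputs the final merged query template.]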


Best Template Constructor
- Constructs the best query template for each two adjacent keywords


Comparator
- Compares each two adjacent best query templates, then
  - keeps one query template as the retained query template
  - finds the one input keyword (called the not retained keyword) that the retained query template does not cover
- Finds common input keywords that are associated with both "retained query templates" and "not retained keywords"


Refiner
- Discards the common input keywords found in the previous step
- Finds query templates (called adjusted templates) for the "not retained keywords" that were not discarded as common input keywords
- Accumulates the best "retained query templates" (called refined retained templates) and the "adjusted templates"


Merger
- Merges all "refined retained templates" and "adjusted templates"


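Assume a process flow with 5 input keywords: k1, k2, k3, k4 and k5.

[Figure: process flow for keywords k1 to k5. Adjacent keyword pairs produce best query templates qt1 to qt4, from which retained templates (rt) and not retained keywords (nrk) are derived, followed by refined retained templates (rrt), a refined not retained keyword (rnk1 = k3), an adjusted template (adt1, from k3) and the merged template (mgt).]

Legend
- k: keyword
- qt: best query template
- rt: retained query template
- nrk: not retained keyword
- rrt: refined retained template
- rnk: refined not retained keyword
- adt: adjusted template
- mgt: merged template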


Experimental Setup

Input
- Ordered keywords that express the intended result

Experimental Data
- Question Answering over Linked Data (QALD) challenge data
- Consists of natural language questions
  - DBpedia test case: 42 questions over the DBpedia dataset
  - MusicBrainz test case: 46 questions over the MusicBrainz dataset

Construction of Keywords
- Considers the QALD question and the underlying dataset
- Example
  - Question: In which films directed by Garry Marshall was Julia Roberts starring?
  - Keywords: Film, starring, Julia Roberts, Director, Garry Marshall


Experimental Perspectives

Analyze BoTLRet for

Completeness vs. computational cost effectiveness
- Construct another, exhaustive system (called MTS) that uses similar template construction but ignores resource classification
- Compare the recall, precision and F1 measure of MTS and BoTLRet w.r.t. their incurred computational cost

Competitiveness against other state-of-the-art systems
- Run the experimental data over 2 recent template-based linked data retrieval initiatives
- Compare recall, precision and F1 measure against those systems


DBpedia test case - Completeness vs Computational Cost Effectiveness

Table: Performance analysis between MTS and BoTLRet

Keyword   MTS                               BoTLRet
Group     Precision  Recall  F1 Measure     Precision  Recall  F1 Measure
2         0.935      0.963   0.941          0.898      0.926   0.904
3         0.559      0.705   0.595          0.559      0.705   0.595
4         0.333      0.333   0.333          0.333      0.333   0.333
5         1.000      1.000   1.000          1.000      1.000   1.000
Average   0.851      0.795   0.808          0.827      0.771   0.785

Table: Computational cost consumption by MTS and BoTLRet

Case           No. of templates used    Computational cost of
               MTS       BoTLRet        BoTLRet w.r.t. MTS
PR − NP case   49        3              0.061 (= 3/49)
NP − NP case   49        8              0.163 (= 8/49)
TOT case       49        11             0.224 (= 11/49)

BoTLRet achieves performance very close to that of MTS at a much lower computational cost.
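The relative cost column is simply the number of templates BoTLRet uses divided by the 49 templates MTS uses. The short check below reproduces the three ratios; reading the TOT row as the sum of the other two cases is an interpretation of the table (3 + 8 = 11), and the variable names are hypothetical.

```python
MTS_TEMPLATES = 49  # templates used by the exhaustive MTS baseline in every case

botlret_templates = {"PR-NP": 3, "NP-NP": 8}
# Interpretation: the TOT row is the sum of the two cases (3 + 8 = 11).
botlret_templates["TOT"] = sum(botlret_templates.values())

relative_cost = {
    case: round(count / MTS_TEMPLATES, 3)
    for case, count in botlret_templates.items()
}
```

The total of 0.224 means BoTLRet uses fewer than a quarter of the templates that MTS does.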


DBpedia test case - Competitiveness

Table: Performance comparison between GoRelations and BoTLRet

System        Precision  Recall  F1 Measure
GoRelations   0.687      0.722   0.704
BoTLRet       0.793      0.825   0.801

Table: Performance comparison between KSQGLBK and BoTLRet

System        Precision  Recall  F1 Measure
KSQGLBK       0.687      1.00    0.814
BoTLRet       0.854      0.917   0.867

BoTLRet performs comparatively well against other state-of-the-art template-based linked data information retrieval initiatives.


MusicBrainz test case - Completeness vs Computational Cost Effectiveness

Table: Performance analysis between MTS and BoTLRet

Keyword   MTS                               BoTLRet
Group     Precision  Recall  F1 Measure     Precision  Recall  F1 Measure
2         0.944      0.944   0.944          0.944      0.944   0.944
3         0.846      0.846   0.846          0.846      0.846   0.846
4         1.000      1.000   1.000          0.000      0.000   0.000
Average   0.909      0.909   0.909          0.864      0.864   0.864

Table: Computational cost consumption by MTS and BoTLRet

Case           No. of templates used    Computational cost of
               MTS       BoTLRet        BoTLRet w.r.t. MTS
PR − NP case   49        3              0.061 (= 3/49)
NP − NP case   49        8              0.163 (= 8/49)
TOT case       49        11             0.224 (= 11/49)

BoTLRet achieves performance very close to that of MTS at a much lower computational cost.


Conclusion

BoTLRet
- Uses predefined templates to incorporate semantics automatically into keyword-based linked data retrieval
- Presents concrete guidelines for template construction and template ranking
- Shows results of an implementation over real linked data, with some comparisons


Future Work

Work 1: Inclusion of Temporal Semantics over Keyword-based Linked Data Retrieval
- Motivation: temporal attributes hold concise semantics in any kind of search paradigm
- Brief research plan

  Start            Submission to Conference   Submission to Journal
  March 2013       August 2013                October 2013

Work 2: Inclusion of Automatic Inferencing Framework to Extended-BoTLRet System over Multiple Linked Datasets Access
- Motivation: an inference engine will be able to produce more intuitive results
- Brief research plan

  Start            Submission to Conference   Submission to Journal
  November 2013    August 2014                December 2014


Publication

Rahoman M., Ichise R.: An Automated Template Selection Framework for Keyword Query over Linked Data. In Proceedings of the 2nd Joint International Semantic Technology Conference, pages 175-190 (2012).

Rahoman M., Ichise R.: Inclusion of Temporal Semantics over Keyword-based Linked Data Retrieval. In Proceedings of the 27th Annual Conference of the Japanese Society for Artificial Intelligence (2013).


Questions?

Md-Mizanur Rahoman, [email protected]


Extra Slides


Query Template

TG     Query Template qt(r1, r2)                 SPARQL Query for qt(r1, r2)
TG1    [‖?uri, r1, r2‖]                          SELECT ?uri WHERE {?uri r1 r2.}
       [‖r2, r1, ?uri‖]                          SELECT ?uri WHERE {r2 r1 ?uri.}
TG2    [‖r1, ?uri, r2‖]                          SELECT ?uri WHERE {r1 ?uri r2.}
       [‖r2, ?uri, r1‖]                          SELECT ?uri WHERE {r2 ?uri r1.}

       [‖?uri, ?p1, r1‖, ‖?uri, ?p2, r2‖]        SELECT ?uri WHERE {?uri ?p1 r1. ?uri ?p2 r2.}
       [‖?uri, ?p1, r1‖, ‖r2, ?p2, ?uri‖]        SELECT ?uri WHERE {?uri ?p1 r1. r2 ?p2 ?uri.}
       [‖r1, ?p1, ?uri‖, ‖?uri, ?p2, r2‖]        SELECT ?uri WHERE {r1 ?p1 ?uri. ?uri ?p2 r2.}
       [‖r1, ?p1, ?uri‖, ‖r2, ?p2, ?uri‖]        SELECT ?uri WHERE {r1 ?p1 ?uri. r2 ?p2 ?uri.}

(The "Network with merge-point" column of the original table draws each template as a small graph over ?uri, r1, r2 and, where present, ?p1 and ?p2.)


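Calculation of appearance frequency for query template

Frequency of query template: $\mathrm{fqQT}(qt(r_1, r_2))$

Frequency of resource $r$, given query template $qt(r_1, r_2)$:

\[
\mathrm{fqR}(r, qt(r_1, r_2)) =
\begin{cases}
\mathrm{PFs}(r) & \text{if } r \text{ is the subject in } qt(r_1, r_2) \\
\mathrm{PFp}(r) & \text{if } r \text{ is the predicate in } qt(r_1, r_2) \\
\mathrm{PFo}(r) & \text{if } r \text{ is the object in } qt(r_1, r_2)
\end{cases}
\]

The final appearance frequency $\mathrm{flW}(qt(r_1, r_2))$ of query template $qt(r_1, r_2)$:

\[
\mathrm{flW}(qt(r_1, r_2)) = \mathrm{fqQT}(qt(r_1, r_2)) \times \mathrm{fqR}(r_1, qt(r_1, r_2)) \times \mathrm{fqR}(r_2, qt(r_1, r_2))
\]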
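The product above can be sketched in code for the simplest case of a single-triple template. Everything beyond the formula itself is an assumption: the slides do not define how PFs, PFp and PFo are obtained, so the sketch takes them as precomputed lookup tables, and the numbers in the example are invented purely to show the arithmetic.

```python
def fq_r(resource, template, position_freq):
    """fqR: frequency of `resource` at the position it occupies in a one-triple template."""
    subject, predicate, obj = template
    if resource == subject:
        return position_freq["s"].get(resource, 0)  # PFs
    if resource == predicate:
        return position_freq["p"].get(resource, 0)  # PFp
    if resource == obj:
        return position_freq["o"].get(resource, 0)  # PFo
    raise ValueError(f"{resource} does not occur in {template}")


def fl_w(template, r1, r2, fq_qt, position_freq):
    """flW = fqQT(qt) * fqR(r1, qt) * fqR(r2, qt)."""
    return fq_qt * fq_r(r1, template, position_freq) * fq_r(r2, template, position_freq)


# Invented frequencies, for illustration only.
position_freq = {
    "s": {},
    "p": {"onto:spouse": 5},
    "o": {"res:Barack_Obama": 2},
}
qt1 = ("?uri", "onto:spouse", "res:Barack_Obama")
weight = fl_w(qt1, "onto:spouse", "res:Barack_Obama", fq_qt=3, position_freq=position_freq)
```

Because the three factors are multiplied, a resource that never appears in the position the template assigns to it drives the whole weight to zero.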


Adjusted Template

Construct modified query templates for each refined not retained keyword

Modified         Modified               SPARQL Query for mqt(r1)
Template Group   Query Template
MTG1             [‖?uri, r1, ?o1‖]      SELECT ?uri WHERE {?uri r1 ?o1.}
MTG2             [‖r1, ?p1, ?uri‖]      SELECT ?uri WHERE {r1 ?p1 ?uri.}
                 [‖?uri, ?p1, r1‖]      SELECT ?uri WHERE {?uri ?p1 r1.}

(The "Network with merge-point" column of the original table draws each modified template as a small graph.)

Find the adjusted template from the modified query templates for each refined not retained keyword


Merger

Merge all refined retained and adjusted query templates
- Maintain keyword order in query template merging
- Find merge-points for each refined retained and adjusted query template
- Merge query templates at the merge-points

[Figure: Example. Panels (a), (b) and (c) show query templates (i), (ii) and (iii), built over keyword-related resources with the variables ?uri, ?p1 and ?o1, being combined step by step at their merge-points.]
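One possible reading of the merge step is sketched below. It assumes that the merge-point is the shared variable ?uri, that templates are lists of (subject, predicate, object) patterns, and that all other variables are local to their own template; none of this is spelled out in the slides, and the names `merge_templates`, rA, rB, rC and rD are hypothetical.

```python
def merge_templates(templates, merge_point="?uri"):
    """Combine templates into one list of triple patterns joined on `merge_point`.

    Assumption: variables other than the merge-point are local to their template,
    so each gets a numeric suffix to avoid accidental joins between templates.
    """
    merged = []
    for index, template in enumerate(templates, start=1):
        for triple in template:
            merged.append(tuple(
                f"{term}_{index}" if term.startswith("?") and term != merge_point else term
                for term in triple
            ))
    return merged


# Templates are listed in keyword order, which the merged result preserves.
templates = [
    [("?uri", "?p1", "rA")],   # e.g. an adjusted template
    [("?uri", "rB", "?o1")],   # e.g. another adjusted template
    [("?uri", "rC", "rD")],    # e.g. a refined retained template
]
merged = merge_templates(templates)
```

The merged pattern list can then be rendered as a single SPARQL query whose triple patterns all constrain the same ?uri.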
