SPARTIQULATION: Verbalizing SPARQL queries. Basil Ell, Denny Vrandečić, Elena Simperl. International Workshop on Interacting with Linked Data, Extended Semantic Web Conference 2012, 28 May 2012. KIT, University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association, Institute of Applied Informatics and Formal Description Methods (AIFB), www.kit.edu

SPARTIQULATION - Verbalizing SPARQL queries

DESCRIPTION

This presentation was given at the International Workshop on Interacting with Linked Data (ILD 2012), co-located with the 9th Extended Semantic Web Conference 2012 in Heraklion, and accompanies the publication of the same title.

Much research has been done to combine the fields of databases and natural language processing. While many works focus on deriving a structured query from a given natural language question, the reverse problem of query verbalization, that is, translating a structured query into natural language, is less explored. In this work we describe our approach to verbalizing SPARQL queries in order to create natural language expressions that are readable and understandable by everyday users. These expressions are helpful for search engines that generate SPARQL queries from user-provided natural language questions or keywords: displaying a verbalization of the generated query lets the user check whether the question has been understood correctly. While our approach enables verbalization of only a subset of SPARQL 1.1, this subset applies to 90% of the 209 queries in our training set. These observations are based on a corpus of SPARQL queries consisting of datasets from the QALD-1 challenge and the ILD2012 challenge.

The publication is available at http://www.aifb.kit.edu/images/b/b7/VerbalizingSparqlQueries.pdf

Page 1: SPARTIQULATION - Verbalizing SPARQL queries

KIT – University of the State of Baden-Wuerttemberg and

National Research Center of the Helmholtz Association

Institute of Applied Informatics and Formal Description Methods (AIFB)

www.kit.edu

[Figure: graph of the example query. Nodes: ?uri, ?string, ?states, ?population, yago:AfricanCountries. Edge labels: dbo:capital, rdfs:label. Filters: <1000000, LANG=en. One branch is marked optional.]

SPARTIQULATION Verbalizing SPARQL queries

Basil Ell, Denny Vrandečić, Elena Simperl International Workshop on Interacting with Linked Data, Extended Semantic Web Conference 2012

28 May 2012

Page 2: SPARTIQULATION - Verbalizing SPARQL queries

Institute of Applied Informatics and Formal Description Methods 2 29.05.2012

MOTIVATION

Basil Ell – Verbalizing SPARQL queries

Page 6: SPARTIQULATION - Verbalizing SPARQL queries

Motivation

[Figure: diagram with the labels "SPARQL", "SPARQL" and "Text"]

[QALD 2011]

[Haase et al., 2009]

[Shekarpour et al., 2011]

Page 7: SPARTIQULATION - Verbalizing SPARQL queries

APPROACH

Page 12: SPARTIQULATION - Verbalizing SPARQL queries

The pipeline architecture

SPARQL -> Document Planner -> DP (document plan) -> Microplanner -> TS (text specification) -> Surface Realizer -> Text

Document Planner

1. Content determination: select the information to communicate
2. Document structuring: construct messages and decide on their ordering and structure

Microplanner

3. Lexicalization: decide which words to use in order to express the content
4. Referring expression generation: decide how to refer to an entity
5. Aggregation: map to linguistic structures

Surface Realizer

6. Linguistic realization: create natural language
7. Surface realization: add structure to text, such as HTML elements

[Reiter and Dale, 2000]

Page 16: SPARTIQULATION - Verbalizing SPARQL queries

Restrictions

SPARQL 1.0 SELECT
UNION and GROUP BY queries
"Disconnected" query graphs
Regular expressions etc.

Page 17: SPARTIQULATION - Verbalizing SPARQL queries

Example – SPARQL query

PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX yago: <http://dbpedia.org/class/yago/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/property/>
SELECT DISTINCT ?uri ?string
WHERE {
  ?states rdf:type yago:AfricanCountries .
  ?states dbo:capital ?uri .
  ?uri dbp:population ?population .
  FILTER ( ?population < 1000000 ) .
  OPTIONAL { ?uri rdfs:label ?string . FILTER (lang(?string) = 'en') }
}

Page 18: SPARTIQULATION - Verbalizing SPARQL queries

Example query – graph representation

[Figure: graph of the example query. Nodes: ?uri, ?string, ?states, ?population, yago:AfricanCountries. Edge labels: dbo:capital, rdfs:label. Filters: <1000000, LANG=en. One branch is marked optional. Legend: selected variable, variable, resource, filter.]
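The graph representation of the example query can be approximated in code. The following is a minimal sketch, not the authors' data model: the class names `Edge` and `QueryGraph` and their fields are hypothetical, and the triples, selected variables, and filters are copied from the example SPARQL query.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Edge:
    """One triple pattern: subject --predicate--> object (hypothetical model)."""
    subject: str
    predicate: str
    obj: str
    optional: bool = False  # True when the pattern sits inside OPTIONAL


@dataclass
class QueryGraph:
    selected: list[str]  # variables listed in the SELECT clause
    edges: list[Edge]
    filters: dict[str, list[str]] = field(default_factory=dict)  # variable -> filters

    def nodes(self) -> set[str]:
        return {e.subject for e in self.edges} | {e.obj for e in self.edges}

    def variables(self) -> set[str]:
        """Variable nodes are the ones whose name starts with '?'."""
        return {n for n in self.nodes() if n.startswith("?")}

    def resources(self) -> set[str]:
        """Everything that is not a variable is treated as a resource."""
        return self.nodes() - self.variables()


# The example query from the slides, written out as a graph.
example = QueryGraph(
    selected=["?uri", "?string"],
    edges=[
        Edge("?states", "rdf:type", "yago:AfricanCountries"),
        Edge("?states", "dbo:capital", "?uri"),
        Edge("?uri", "dbp:population", "?population"),
        Edge("?uri", "rdfs:label", "?string", optional=True),
    ],
    filters={"?population": ["< 1000000"], "?string": ["LANG=en"]},
)
```

The legend categories of the figure (selected variable, variable, resource, filter) map onto `selected`, `variables()`, `resources()`, and `filters` in this sketch.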

Page 19: SPARTIQULATION - Verbalizing SPARQL queries

Document structuring – 4 steps

1. Main entity identification
2. Graph transformation
3. Message creation
4. Create Document Plan

Page 22: SPARTIQULATION - Verbalizing SPARQL queries

Example – identify main entity

[Figure: graph of the example query, as in the graph representation slide]

Select a variable that is verbalized as subject. Candidates:

?string: "Labels if available of capitals of African countries ..."
Bad: subject is optional.

?population: "Population < 10^6 of capitals of African countries ..."
Bad: variable is not selected.

?states: "African countries having capitals that have populations < 10^6 ..."
Bad: variable is not selected.

?uri: "Capitals of African countries having population < 10^6 ..."
Good: label for main entity is requested.
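The verdicts on the four candidates suggest a selection rule. The sketch below is one possible reading, not the authors' algorithm: it assumes that a main entity must be a selected variable, must not occur only inside OPTIONAL, and is preferred when its label is requested. The function name `identify_main_entity` and the tuple layout are hypothetical.

```python
# Triple patterns of the example query: (subject, predicate, object, optional).
TRIPLES = [
    ("?states", "rdf:type", "yago:AfricanCountries", False),
    ("?states", "dbo:capital", "?uri", False),
    ("?uri", "dbp:population", "?population", False),
    ("?uri", "rdfs:label", "?string", True),
]
SELECTED = ["?uri", "?string"]


def identify_main_entity(selected, triples):
    """Pick the variable to verbalize as subject (assumed heuristic)."""

    def optional_only(var):
        # "Bad: subject is optional": the variable occurs only in OPTIONAL patterns.
        uses = [opt for s, _, o, opt in triples if var in (s, o)]
        return bool(uses) and all(uses)

    def label_requested(var):
        # "Good: label for main entity is requested": an rdfs:label edge
        # leads from the variable to another selected variable.
        return any(
            s == var and p == "rdfs:label" and o in selected
            for s, p, o, _ in triples
        )

    # "Bad: variable is not selected": only selected variables are candidates.
    candidates = [v for v in selected if not optional_only(v)]
    if not candidates:
        return None
    # Prefer a candidate whose label is requested, else take the first one.
    return next((v for v in candidates if label_requested(v)), candidates[0])


print(identify_main_entity(SELECTED, TRIPLES))  # prints ?uri
```

With the example query, `?string` is dropped because it is optional, `?population` and `?states` are never considered because they are not selected, and `?uri` remains.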

Page 26: SPARTIQULATION - Verbalizing SPARQL queries

Graph transformation

Idea: Reduce the set of message types to simplify verbalization
Main entity is transformed into root node
Reversal of some edges necessary

Page 28: SPARTIQULATION - Verbalizing SPARQL queries

Example – transformed graph

[Figure: the example query graph after transformation. Same nodes, filters, and legend as in the graph representation slide; the edge label dbo:capital now appears as dbo:capital-.]
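Rooting the graph at the main entity can be sketched as a breadth-first traversal. This is an illustrative sketch under assumptions, not the authors' implementation: the function name `root_at` is hypothetical, and reading the trailing "-" in `dbo:capital-` as the marker of a reversed edge is an interpretation of the figure.

```python
from collections import deque

# Triple patterns of the example query as (subject, predicate, object).
TRIPLES = [
    ("?states", "rdf:type", "yago:AfricanCountries"),
    ("?states", "dbo:capital", "?uri"),
    ("?uri", "dbp:population", "?population"),
    ("?uri", "rdfs:label", "?string"),
]


def root_at(main_entity, triples):
    """Orient all edges away from main_entity, marking reversed ones with '-'."""
    oriented = []
    used = set()              # indices of triples already oriented
    seen = {main_entity}
    queue = deque([main_entity])
    while queue:
        node = queue.popleft()
        for i, (s, p, o) in enumerate(triples):
            if i in used:
                continue
            if s == node:
                oriented.append((s, p, o))        # already points away from root
                nxt = o
            elif o == node:
                oriented.append((o, p + "-", s))  # reverse the edge
                nxt = s
            else:
                continue
            used.add(i)
            # Only continue through variables, so that a resource shared by
            # several patterns does not pull in unrelated edges.
            if nxt not in seen and nxt.startswith("?"):
                seen.add(nxt)
                queue.append(nxt)
    return oriented


for edge in root_at("?uri", TRIPLES):
    print(edge)
```

For the example query only `dbo:capital` has to be reversed, because it is the only edge that points towards `?uri`.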

Page 29: SPARTIQULATION - Verbalizing SPARQL queries

Message creation

Cut graph into independently verbalizable parts
Filters are stored in VAR messages
Messages (1-9) represent paths, message types are path classes

Page 34: SPARTIQULATION - Verbalizing SPARQL queries

Example – messages

[Figure: the transformed example graph, cut into messages]

(7) M(RV)*RtR
(3) M(RV)*RV
(5) M(RV)*RlV
+ 4 x (10) VAR
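The idea that message types are path classes can be illustrated by encoding each root-to-leaf path as a token string and matching it against a pattern. The sketch below covers only the three path classes named in the example. The reading of the symbols is an assumption (M = main entity, R = relation, V = variable, Rt = rdf:type, Rl = rdfs:label, and a final R = resource, written here as `Res` to keep it distinct); the helper names are hypothetical.

```python
import re

# Transformed example graph: edges point away from the root ?uri.
EDGES = [
    ("?uri", "dbo:capital-", "?states"),
    ("?states", "rdf:type", "yago:AfricanCountries"),
    ("?uri", "dbp:population", "?population"),
    ("?uri", "rdfs:label", "?string"),
]

# Slide notation -> pattern over space-separated tokens (assumed encoding).
PATH_CLASSES = {
    "M(RV)*RtR": re.compile(r"M( R V)* Rt Res"),
    "M(RV)*RlV": re.compile(r"M( R V)* Rl V"),
    "M(RV)*RV": re.compile(r"M( R V)* R V"),
}


def leaf_paths(root, edges):
    """Yield every root-to-leaf path as a list of (predicate, node) steps.

    Assumes the edges form a tree rooted at `root` (no cycles).
    """
    children = {}
    for s, p, o in edges:
        children.setdefault(s, []).append((p, o))

    def walk(node, prefix):
        if node not in children:
            yield prefix
            return
        for p, o in children[node]:
            yield from walk(o, prefix + [(p, o)])

    yield from walk(root, [])


def encode(path):
    """Turn a path into tokens such as 'M R V Rt Res'."""
    special = {"rdf:type": "Rt", "rdfs:label": "Rl"}
    tokens = ["M"]
    for predicate, node in path:
        tokens.append(special.get(predicate.rstrip("-"), "R"))
        tokens.append("V" if node.startswith("?") else "Res")
    return " ".join(tokens)


def classify(path):
    """Return the first path class whose pattern matches, else None."""
    code = encode(path)
    for name, pattern in PATH_CLASSES.items():
        if pattern.fullmatch(code):
            return name
    return None


for path in leaf_paths("?uri", EDGES):
    print(classify(path), path)
```

On the example graph this yields one message per branch below `?uri`, with the same three types that are listed for the example.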

Page 38: SPARTIQULATION - Verbalizing SPARQL queries

Document Plan

The document plan (DP) has three parts:

1. constraints: constraints for the main entity, e.g. its class, having population < 10^6
2. requests: requested information, e.g. its name
3. modifiers: e.g. LIMIT, ORDER BY ...
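A document plan with these three parts can be sketched as a small container that messages are sorted into. The rule used here is an assumption, not taken from the slides: a message counts as a request when its path ends in a selected variable other than the main entity, and as a constraint otherwise. The names `DocumentPlan` and `build_plan` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentPlan:
    """Three-part plan: constraints, requests, modifiers (hypothetical model)."""
    constraints: list = field(default_factory=list)
    requests: list = field(default_factory=list)
    modifiers: list = field(default_factory=list)


def build_plan(messages, selected, main_entity, modifiers=()):
    """Sort (message_type, end_node) pairs into a DocumentPlan.

    Assumed rule: a path ending in a selected variable that is not the
    main entity asks for information; everything else restricts it.
    """
    plan = DocumentPlan(modifiers=list(modifiers))
    for message in messages:
        _, end_node = message
        is_request = end_node in selected and end_node != main_entity
        (plan.requests if is_request else plan.constraints).append(message)
    return plan


# Messages of the example query, each with the node its path ends in.
MESSAGES = [
    ("M(RV)*RtR", "yago:AfricanCountries"),
    ("M(RV)*RV", "?population"),
    ("M(RV)*RlV", "?string"),
]

plan = build_plan(MESSAGES, selected=["?uri", "?string"], main_entity="?uri")
print(plan.constraints)
print(plan.requests)
```

For the example query, the class and population messages end up as constraints and the label message as a request; the query has no LIMIT or ORDER BY, so `modifiers` stays empty.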

Page 44: SPARTIQULATION - Verbalizing SPARQL queries

Example - verbalization

M(RV)*RtR (cons): ?uri, ?states, yago:AfricanCountries, dbo:capital-
Capitals of African countries

M(RV)*RV (cons): ?uri, ?population, <1000000
that are having populations that are less than 1000000

M(RV)*RlV (req): ?uri, ?string, rdfs:label, LANG=en, optional
and where available their English labels.
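The step from messages to text can be imitated with one template per message type. This is a rough sketch, not the authors' realizer: the templates, the naive pluralization, the label derivation from prefixed names, and the message dictionaries are all assumptions chosen so that the example sentence is reproduced.

```python
import re

# One template per message type (assumed wording, modelled on the example).
TEMPLATES = {
    "M(RV)*RtR": "{relation} of {cls}",
    "M(RV)*RV": "that are having {relation} that are {filter}",
    "M(RV)*RlV": "{optional}their {lang}{relation}",
}
COMPARATORS = {"<": "less than", ">": "greater than"}
LANGUAGES = {"en": "English"}


def words(term):
    """'yago:AfricanCountries' -> 'African countries', 'dbo:capital-' -> 'capital'."""
    local = term.rstrip("-").split(":", 1)[-1]
    parts = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", local).split()
    return " ".join([parts[0]] + [p.lower() for p in parts[1:]])


def plural(noun):
    """Naive pluralization: append 's' unless the noun already ends in 's'."""
    return noun if noun.endswith("s") else noun + "s"


def render(msg):
    """Fill the template that belongs to the message type."""
    slots = {"relation": plural(words(msg["relation"]))}
    if "cls" in msg:
        slots["cls"] = plural(words(msg["cls"]))
    if "filter" in msg:
        operator, value = msg["filter"]
        slots["filter"] = f"{COMPARATORS[operator]} {value}"
    slots["lang"] = LANGUAGES[msg["lang"]] + " " if "lang" in msg else ""
    slots["optional"] = "where available " if msg.get("optional") else ""
    return TEMPLATES[msg["type"]].format(**slots)


def verbalize(messages):
    """Constraints first, then requests joined with 'and'."""
    constraints = [render(m) for m in messages if m["role"] == "cons"]
    requests = [render(m) for m in messages if m["role"] == "req"]
    text = " ".join(constraints)
    if requests:
        text += " and " + " and ".join(requests)
    return text[0].upper() + text[1:] + "."


MESSAGES = [
    {"type": "M(RV)*RtR", "role": "cons",
     "relation": "dbo:capital-", "cls": "yago:AfricanCountries"},
    {"type": "M(RV)*RV", "role": "cons",
     "relation": "dbp:population", "filter": ("<", "1000000")},
    {"type": "M(RV)*RlV", "role": "req",
     "relation": "rdfs:label", "lang": "en", "optional": True},
]

print(verbalize(MESSAGES))
```

The summary lists exploitation of linguistic features of labels as future work; the crude `words` and `plural` helpers mark exactly the spot where such features would be needed.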

Page 45: SPARTIQULATION - Verbalizing SPARQL queries

SUMMARY AND FUTURE WORK

Page 47: SPARTIQULATION - Verbalizing SPARQL queries

Summary and Future Work

Summary:

Presented an approach for explaining SPARQL SELECT queries in natural language
Schema-agnostic

Directions for future work:

Tackle challenges in the two missing pipeline components
Exploitation of linguistic features of labels
Evaluation

Page 48: SPARTIQULATION - Verbalizing SPARQL queries

QUESTIONS?

http://km.aifb.kit.edu/projects/spartiqulator/

http://bit.ly/KGuDTL

The work presented here is supported by the European Union's 7th Framework Programme (FP7/2007-2013) under Grant Agreement 257790.

Page 49: SPARTIQULATION - Verbalizing SPARQL queries

REFERENCES

Page 50: SPARTIQULATION - Verbalizing SPARQL queries

References

S. Shekarpour, S. Auer, A.-C. Ngonga Ngomo, D. Gerber, S. Hellmann, and C. Stadler. Keyword-driven SPARQL Query Generation Leveraging Background Knowledge. In International Conference on Web Intelligence, 2011.

E. Reiter and R. Dale. Building Natural Language Generation Systems. Natural Language Processing. Cambridge University Press, 2000.

P. Haase, D. M. Herzig, M. Musen, and D. T. Tran. Semantic Wiki Search. In L. A. P. et al., editor, 6th Annual European Semantic Web Conference, ESWC2009, Heraklion, Crete, Greece, volume 5554 of LNCS, pages 445-460. Springer Verlag, June 2009.

QALD 2011: http://www.sc.cit-ec.uni-bielefeld.de/qald-1