
(Linked Data Interfaces and Querying track) "SUMMA: A Common API for Linked Data Entity Summaries" - Andreas Thalhammer and Steffen Stadtmüller


KIT – University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association

Institute of Applied Informatics and Formal Description Methods

www.kit.edu

SUMMA: A Common API for Linked Data Entity Summaries

Andreas Thalhammer and Steffen Stadtmüller

15th International Conference on Web Engineering (ICWE 2015), 25.06.2015, Rotterdam

Institute of Applied Informatics and Formal Description Methods

(AIFB)


Outline

1. Motivation

2. SUMMA API definition

3. Implementation

4. Evaluation

5. Related work

6. Conclusions

7. Future work

08.07.2015 Andreas Thalhammer and Steffen Stadtmüller SUMMA: A Common API for Linked Data Entity Summaries ICWE 2015


Motivation

RDF graphs make it possible to represent all available information about entities:


[Figure: RDF graph of two film entities.

Pulp Fiction: year 1994; runtime (minutes) 178; starring John Travolta, Uma Thurman, Bruce Willis; director Quentin Tarantino.

Kill Bill Vol. 1: year 2003; runtime (minutes) 112; one starring edge and one director edge to nodes already in the graph.]


Problems:

Untidy visualization of specific entities with a graph → Feature-based representation

Many entities are involved in more than 1000 relations → Entity summarization


Pulp Fiction

Year: 1994

Runtime (minutes): 178

Director: Quentin Tarantino

Starring: Bruce Willis, John Travolta, Uma Thurman, Samuel L. Jackson, Harvey Keitel, Tim Roth, Lawrence Bender, Amanda Plummer, Eric Stoltz, Peter Greene, Phil LaMarr, Julia Sweeney, Burr Steers


Motivation


(Source: http://bing.com)

(Source: http://bbc.co.uk)


http://rdf.freebase.com/ns/location.country.capital http://rdf.freebase.com/ns/m.0156q

http://rdf.freebase.com/ns/0345h


Motivation


Problem: Tight coupling

low interoperability

low compatibility

Solution: Decoupling

Separate servers and clients

Summarization mashups become possible

[Diagram: servers and clients decoupled through SUMMA]


Motivation

Quantitative evaluation:

Qualitative evaluation:

A/B-Testing:


[Diagram label: Comparator]


SUMMA API definition

Producing a summary of an entity

What is needed:

URI (of the entity e) – the entity needs to be identified

k (number) – an upper limit of facts related to e

What could be needed:

Multi-language support

Statement groups (e.g., biographical data)

Restriction to specific properties

Multi-hop search space


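[Figure: multi-hop search space. PF -starring-> _: (blank node); _: -actor-> JT; _: -role-> VV]

[Figure: summary with a second hop. Angela Merkel. Date of birth: July 17, 1954. Place of birth: Hamburg (1.73 million, 2013)]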


SUMMA API definition

The SUMMA API involves:

SUMMA Vocabulary

RESTful interaction


SUMMA API definition

SUMMA Vocabulary:


[Diagram: SUMMA vocabulary, centered on the class summa:Summary]

summa:entity → rdfs:Resource

summa:topK → xsd:positiveInteger

summa:language → xsd:String

summa:fixedProperty → rdf:Property

summa:maxHops → xsd:positiveInteger

summa:statement → rdf:Statement

summa:group → summa:SummaryGroup

summa:path


SUMMA API definition

Vocabulary:


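@prefix : <http://purl.org/voc/summa/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix dbpedia-owl: <http://dbpedia.org/ontology/> .

[ a :Summary ;
  :entity dbpedia:Barack_Obama ;
  :topK "2"^^xsd:positiveInteger ;
  :language "en" ;
  :maxHops "1"^^xsd:positiveInteger ;
  :fixedProperty dbpedia-owl:birthPlace
] .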
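A request description like the one above could be generated from plain parameters. The following is a minimal sketch, not part of the SUMMA reference implementations: the class `SummaryRequest` and its method `to_turtle` are hypothetical helpers, and the property name `fixedProperty` follows the vocabulary diagram of the deck.

```python
from dataclasses import dataclass
from typing import Optional

SUMMA_NS = "http://purl.org/voc/summa/"
XSD_NS = "http://www.w3.org/2001/XMLSchema#"


@dataclass
class SummaryRequest:
    """Parameters of a summary request (hypothetical helper, not part of SUMMA)."""

    entity: str                           # full URI of the entity to summarize
    top_k: int                            # upper limit of facts in the summary
    language: Optional[str] = None        # optional language tag, e.g. "en"
    max_hops: Optional[int] = None        # optional size of the multi-hop search space
    fixed_property: Optional[str] = None  # optional full URI of a property to restrict to

    def to_turtle(self) -> str:
        """Serialize the request as a Turtle blank node of type summa:Summary."""
        if self.top_k < 1:
            raise ValueError("topK must be a positive integer")
        lines = [
            "a summa:Summary",
            f"summa:entity <{self.entity}>",
            f'summa:topK "{self.top_k}"^^xsd:positiveInteger',
        ]
        # Optional parameters are only emitted when they are set.
        if self.language is not None:
            lines.append(f'summa:language "{self.language}"')
        if self.max_hops is not None:
            lines.append(f'summa:maxHops "{self.max_hops}"^^xsd:positiveInteger')
        if self.fixed_property is not None:
            lines.append(f"summa:fixedProperty <{self.fixed_property}>")
        body = " ;\n  ".join(lines)
        return (
            f"@prefix summa: <{SUMMA_NS}> .\n"
            f"@prefix xsd: <{XSD_NS}> .\n"
            f"[ {body}\n] .\n"
        )


request = SummaryRequest(
    entity="http://dbpedia.org/resource/Barack_Obama",
    top_k=2,
    language="en",
    max_hops=1,
    fixed_property="http://dbpedia.org/ontology/birthPlace",
)
turtle = request.to_turtle()
print(turtle)
```

Only `entity` and `top_k` are mandatory in this sketch, mirroring the split between "what is needed" and "what could be needed" in the API definition.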


SUMMA API definition

Interaction:


Client → Server:

POST [ a :Summary ; :entity dbpedia:Barack_Obama ; :topK 10 ] .

Server → Client:

201 CREATED
Location: http://example.com/summary?entity=dbpedia:Barack_Obama&topK=10
@prefix summa: <http://purl.org/voc/summa/> .
...

Client → Server:

GET http://example.com/summary?entity=dbpedia:Barack_Obama&topK=10

Server → Client:

200 OK
@prefix summa: <http://purl.org/voc/summa/> .
...
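The POST/GET exchange above can be traced without a network using a toy stand-in for the server. This is only a sketch under stated assumptions: the class `InMemorySummaryServer` and its methods are hypothetical, the response body is a placeholder, and a real SUMMA server would compute and return the actual summary.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


class InMemorySummaryServer:
    """Toy stand-in for a SUMMA server; illustration only, no network involved."""

    def __init__(self, base_url: str):
        self.base_url = base_url

    def post(self, entity: str, top_k: int):
        # POST: the server answers 201 and points to the summary resource.
        # safe=":" keeps the prefixed name readable, as in the slide's URL.
        query = urlencode({"entity": entity, "topK": top_k}, safe=":")
        return 201, {"Location": f"{self.base_url}?{query}"}

    def get(self, url: str):
        # GET: the summary URL carries all parameters in its query string.
        params = parse_qs(urlsplit(url).query)
        entity = params["entity"][0]
        top_k = params["topK"][0]
        body = (
            "@prefix summa: <http://purl.org/voc/summa/> .\n"
            f"# placeholder: summary of {entity} with topK={top_k}\n"
        )
        return 200, body


server = InMemorySummaryServer("http://example.com/summary")

# Step 1: POST the request description, receive the summary location.
post_status, headers = server.post("dbpedia:Barack_Obama", 10)

# Step 2: GET the returned location to retrieve the summary.
get_status, body = server.get(headers["Location"])

print(post_status, headers["Location"])
print(get_status)
print(body)
```

Because the location encodes every parameter, a client that already knows the URL pattern can bookmark or share a summary and fetch it again with a plain GET.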


SUMMA API definition

SUMMA Vocabulary:


<http://km.aifb.kit.edu/summaServer/sum?entity=http://dbpedia.org/resource/Barack_Obama&topK=2&maxHops=1&language=enhttp://dbpedia.org/ontology/birthPlace> a <http://purl.org/voc/summa/Summary> ;
....
summa:statement [ rdf:type rdf:Statement ;
    rdf:subject dbpedia:Barack_Obama ;
    rdf:predicate dbpedia-owl:birthPlace ;
    rdf:object dbpedia:Honolulu ;
    vrank:hasRank [ vrank:rankValue "5512.0"^^xsd:double ]
]
...
dbpedia:Honolulu rdfs:label "Honolulu"@en .
...
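A client might turn such a response into display lines by ordering the reified statements by their rank value and looking up labels. The sketch below works on already-parsed data rather than on Turtle. Its names (`RankedStatement`, `top_statements`, `render`) are hypothetical, the second statement and the label for `dbpedia-owl:birthPlace` are made up for illustration, and treating a higher rank value as more relevant is an assumption.

```python
from typing import Dict, List, NamedTuple


class RankedStatement(NamedTuple):
    """One reified statement of a summary together with its vRank rank value."""

    subject: str
    predicate: str
    obj: str
    rank_value: float


def top_statements(statements: List[RankedStatement], k: int) -> List[RankedStatement]:
    # Assumption: a higher rank value means a more relevant statement.
    return sorted(statements, key=lambda s: s.rank_value, reverse=True)[:k]


def render(statements: List[RankedStatement], labels: Dict[str, str], k: int) -> List[str]:
    # Fall back to the raw identifier when the response carries no label.
    return [
        f"{labels.get(s.predicate, s.predicate)}: {labels.get(s.obj, s.obj)}"
        for s in top_statements(statements, k)
    ]


statements = [
    # Taken from the response shown above.
    RankedStatement("dbpedia:Barack_Obama", "dbpedia-owl:birthPlace", "dbpedia:Honolulu", 5512.0),
    # Hypothetical second statement, only here to show the ordering.
    RankedStatement("dbpedia:Barack_Obama", "ex:hypotheticalProperty", "ex:HypotheticalObject", 12.5),
]
labels = {
    "dbpedia:Honolulu": "Honolulu",           # from the response above
    "dbpedia-owl:birthPlace": "birth place",  # hypothetical label
}

summary_lines = render(statements, labels, k=1)
print(summary_lines)
```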


Implementation


<script>
summa("http://dbpedia.org/resource/Marie_Curie", 10, "en", null, "summary1",
      "http://km.aifb.kit.edu/summa/summarum");
summa("http://dbpedia.org/resource/Marie_Curie", 10, "en", null, "summary2",
      "http://km.aifb.kit.edu/summaServer/sum");
...

https://github.com/athalhammer/summaServer

https://github.com/athalhammer/summaClient

http://people.aifb.kit.edu/ath/summaClient


Evaluation

Search Engines:

Google Knowledge Graph

Microsoft Bing Satori/Snapshots

Yahoo Knowledge

News Portals (Alexa Top 25 News sites):

Forbes

BBC News

Could the user interfaces be generated with data from the SUMMA API without changing their layout?


Evaluation

Features:

Property Restriction

Statement Groups

Multi-hop Search Space

Languages

Five entities:

Spain (country)

Dirk Nowitzki (person/athlete)

Ramones (band)

SAP (company/organization)

Inglourious Basterds (movie)


(Source: http://google.com)


Evaluation

Results:


Related work

RDF data access via middle layers:

Pubby [1]

The Linked Data API [2]

RDF content selection and ranking:

Fresnel - Display Vocabulary for RDF [3]

vRank Vocabulary [4]


Conclusions

Decouple user interface from actual entity summarization system by defining a common API.

Light-weight and extensible vocabulary and interaction mechanism.

Reference implementations and their source code are publicly available.

Evaluation demonstrates applicability in real-world scenarios.


Future work

Build adapters to Google Knowledge Graph, Microsoft Bing Satori/Snapshots, Yahoo Knowledge.

Implement a platform where SUMMA services can be registered and (re-)used.

Extend the vocabulary and interaction mechanism towards user context and personalization factors.


Questions?


@thalhamm


Bibliography

[1] http://wifo5-03.informatik.uni-mannheim.de/pubby/

[2] https://code.google.com/p/linked-data-api/

[3] Christian Bizer, Emmanuel Pietriga, David Karger, and Ryan Lee. Fresnel: A Browser-Independent Presentation Vocabulary for RDF. In Proc. of 5th International Semantic Web Conference, Athens, GA, USA, November 5-9, 2006, LNCS 4273, 2006.

[4] Antonio Roa-Valverde, Andreas Thalhammer, Ioan Toma, and Miguel-Angel Sicilia. Towards a formal model for sharing and reusing ranking computations. In Proc. of the 6th Intl. Workshop on Ranking in Databases, in conjunction with VLDB 2012, 2012.
