Upload
eswcsummerschool
View
82
Download
0
Embed Size (px)
DESCRIPTION
Citation preview
Querying Linked Data
Presented by: Barry Norton
Motivation: Music!
2
Visualiza6on Module
Metadata Streaming providers
Physical Wrapper
Downloads
Data acquisi6
on D2R Transf. LD Wrapper
Musical Content
Applica6
on
Analysis & Mining Module
LD Dataset
Access
LD Wrapper
RDF/ XML
Integrated Dataset
Interlinking Cleansing Vocabulary Mapping
SPARQL Endpoint
Publishing
RDFa
Other content
Agenda 1. Introduc+on to SPARQL
2. Querying Linked Data with SPARQL
3. SPARQL Algebra
4. Upda+ng Linked Data with SPARQL 1.1
5. SPARQL Protocol
6. Reasoning over Linked Data
3 EUCLID -‐ Querying Linked Data
INTRODUCTION TO SPARQL
EUCLID -‐ Querying Linked Data 4
SPARQL
EUCLID -‐ Querying Linked Data 5
Seman6c Web Stack Berners-‐Lee (2006)
* Protocol and RDF Query Language
Declara6ve query language
SPARQL
EUCLID -‐ Querying Linked Data 6
• SPARQL Query – Declara6ve query language for RDF data – h\p://www.w3.org/TR/rdf-‐sparql-‐query/
• SPARQL Algebra – Standard for communica6on between SPARQL services and clients – h\p://www.w3.org/2001/sw/DataAccess/rq23/rq24-‐algebra.html
• SPARQL Update – Declara6ve manipula6on language for RDF data – h\p://www.w3.org/TR/sparql11-‐update/
• SPARQL Protocol – Standard for communica6on between SPARQL services and clients – h\p://www.w3.org/TR/sparql11-‐protocol/
SPARQL Query 1.1
EUCLID -‐ Querying Linked Data 7
• SPARQL 1.0 only allows accessing the data (query) • SPARQL 1.1 introduces:
Query extensions
Updates
• Data management: Insert, Delete, Delete/Insert
• Graph management: Create, Load, Clear, Drop, Copy, Move, Add
• Aggregates, Subqueries, Nega6on, Expressions in the SELECT clause, Property paths, assignment, short form for CONSTRUCT, expanded set of func6ons and operators
Federa6on extension • Service, values, service variables (informa6ve) CH 5
SPARQL Basics
EUCLID -‐ Querying Linked Data 8
• RDF triple: Basic building block, of the form subject, predicate, object. Example:
• RDF triple paNern: Contains one or more variables. Examples:
• RDF quad paNern: Contains graph name: URI or variable. Examples:
dbpedia:The_Beatles foaf:name "The Beatles" .
dbpedia:The_Beatles foaf:made ?album. ?album mo:track ?track . ?album ?p ?o .
GRAPH <:g> {:s :p :o .} GRAPH ?g {dbpedia:The_Beatles foaf:name ?o.}
SPARQL Basics
EUCLID -‐ Querying Linked Data 9
• RDF graph: Set of RDF asser6ons, manipulated as a labeled directed graph.
• RDF data set: set of RDF triples. It is comprised of: • One default graph • Zero or more named graphs
• SPARQL protocol client: HTTP client that sends requests for SPARQL Protocol opera6ons (queries or updates)
• SPARQL protocol service: HTTP server that services requests for SPARQL Protocol opera6ons
• SPARQL endpoint: The URI at which a SPARQL Protocol service listens for requests from SPARQL clients
QUERYING LINKED DATA WITH SPARQL
EUCLID -‐ Querying Linked Data 10
SPARQL Query
EUCLID -‐ Querying Linked Data 11
Main idea: PaNern matching • Queries describe sub-‐graphs of the queried graph • Graph paNerns are RDF graphs specified in Turtle syntax, which contain variables (prefixed by either “?” or “$”)
• Sub-‐graphs that match the graph pa\erns yield a result
?album dbpedia: The_Beatles
foaf:made
SPARQL Query
EUCLID -‐ Querying Linked Data 12
?album dbpedia: The_Beatles
foaf:made
dbpedia: The_Beatles foaf:made
<h\p:// musicbrainz.org/record/...>
<h\p:// musicbrainz.org/record/...>
foaf:made Data:
Graph pa\ern: Results:
"Help!" "Let It Be"
dc:6tle dc:6tle
<h\p:// musicbrainz.org/record/...>
"Abbey Road"
dc:6tle
foaf:made
?album
<h\p://musicbrainz.org...>
<h\p://musicbrainz.org...>
<h\p://musicbrainz.org...>
SPARQL Query
EUCLID -‐ Querying Linked Data 13
?album
dbpedia: The_Beatles
dbpedia: The_Beatles foaf:made
<h\p:// musicbrainz.org/record/...>
<h\p:// musicbrainz.org/record/...>
foaf:made Data:
Graph pa\ern: Results:
"Help!" "Let It Be"
dc:6tle dc:6tle
<h\p:// musicbrainz.org/record/...>
"Abbey Road"
dc:6tle
foaf:made
?album
?+tle
<h\p://...> "Help!"
<h\p://...> "Abbey Road"
<h\p://...> "Let It Be" ?6tle
dc:6tle
SPARQL Query
EUCLID -‐ Querying Linked Data 14
?album
dbpedia: The_Beatles
dbpedia: The_Beatles foaf:made
<h\p:// musicbrainz.org/record/...>
<h\p:// musicbrainz.org
/track/...>
foaf:made Data:
Graph pa\ern: Results:
"Help!" "Help!"
dc:6tle dc:6tle
mo:track
a
mo:Record mo:Track
mo:Record
?album
<h\p://musicbrainz.org...>
SPARQL Query: Components
EUCLID -‐ Querying Linked Data 15
PREFIX dbpedia: <http://dbpedia.org/resource/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dc: <http://purl.org/dc/elements/1.1/> PREFIX mo: <http://purl.org/ontology/mo/> SELECT ?album FROM <http://musicbrainz.org/20130302> WHERE { dbpedia:The_Beatles foaf:made ?album . ?album a mo:Record ; dc:title ?title } ORDER BY ?title
Prologue: • Prefix defini6ons • Subtly different from Turtle syntax -‐ the final period is not used
SPARQL Query: Components
EUCLID -‐ Querying Linked Data 16
Query form: • ASK, SELECT, DESCRIBE or CONSTRUCT • SELECT retrieves variables and their bindings as a table
PREFIX dbpedia: <http://dbpedia.org/resource/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dc: <http://purl.org/dc/elements/1.1/> PREFIX mo: <http://purl.org/ontology/mo/> SELECT ?album FROM <http://musicbrainz.org/20130302> WHERE { dbpedia:The_Beatles foaf:made ?album . ?album a mo:Record ; dc:title ?title } ORDER BY ?title
SPARQL Query: Components
EUCLID -‐ Querying Linked Data 17
Data set specifica+on: • This clause is op6onal • FROM or FROM NAMED • Indicates the sources for the data against which to find matches
PREFIX dbpedia: <http://dbpedia.org/resource/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dc: <http://purl.org/dc/elements/1.1/> PREFIX mo: <http://purl.org/ontology/mo/> SELECT ?album FROM <http://musicbrainz.org/20130302> WHERE { dbpedia:The_Beatles foaf:made ?album . ?album a mo:Record ; dc:title ?title } ORDER BY ?title
SPARQL Query: Components
EUCLID -‐ Querying Linked Data 18
Query paNern: • Defines pa\erns to match against the data • Generalises Turtle with variables and keywords – N.B. final period op6onal
PREFIX dbpedia: <http://dbpedia.org/resource/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dc: <http://purl.org/dc/elements/1.1/> PREFIX mo: <http://purl.org/ontology/mo/> SELECT ?album FROM <http://musicbrainz.org/20130302> WHERE { dbpedia:The_Beatles foaf:made ?album . ?album a mo:Record ; dc:title ?title } ORDER BY ?title
Solu+on modifier: • Modify the result set • ORDER BY, LIMIT or OFFSET re-‐organise rows; • GROUP BY combines them
SPARQL Query: Components
EUCLID -‐ Querying Linked Data 19
PREFIX dbpedia: <http://dbpedia.org/resource/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dc: <http://purl.org/dc/elements/1.1/> PREFIX mo: <http://purl.org/ontology/mo/> SELECT ?album FROM <http://musicbrainz.org/20130302> WHERE { dbpedia:The_Beatles foaf:made ?album . ?album a mo:Record ; dc:title ?title } ORDER BY ?title
Query Forms
EUCLID -‐ Querying Linked Data 20
SPARQL supports different query forms:
• ASK tests whether or not a query pa\ern has a solu6on. Returns yes/no
• SELECT returns variables and their bindings directly
• CONSTRUCT returns a single RDF graph specified by a graph template
• DESCRIBE returns a single RDF graph containing RDF data about resource
Query Form: ASK
EUCLID -‐ Querying Linked Data 21
• Namespaces are added with the ‘PREFIX’ direc6ve • Statement pa\erns that make up the graph are specified between brackets (“{}”)
PREFIX dbpedia: <http://dbpedia.org/resource/> PREFIX dbpedia-‐ont: <http://dbpedia.org/ontology/> PREFIX mo: http://purl.org/ontology/mo/ ASK WHERE { dbpedia:The_Beatles mo:member dbpedia:Paul_McCartney.}
Is Paul McCartney member of ‘The Beatles’? Query: true
Results:
Is Elvis Presley member of ‘The Beatles’? Query: false
Results: PREFIX dbpedia: <http://dbpedia.org/resource/> PREFIX dbpedia-‐ont: <http://dbpedia.org/ontology/> PREFIX mo: http://purl.org/ontology/mo/ ASK WHERE { dbpedia:The_Beatles mo:member dbpedia:Elvis_Presley.}
Query Form: SELECT
EUCLID -‐ Querying Linked Data 22
• The solu6on modifier projec+on nominates which components of the matches should be returned
• “*” means all components should be returned
PREFIX dbpedia: <http://dbpedia.org/resource/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dc: <http://purl.org/dc/elements/1.1/> PREFIX mo: <http://purl.org/ontology/mo/> SELECT ?album_name ?track_title WHERE { dbpedia:The_Beatles foaf:made ?album . ?album dc:title ?album_name ; mo:track ?track . ?track dc:title ?track_title .}
Query: What albums and tracks did ‘The Beatles’ make?
Filter expressions • Different types of filters and func6ons may be used
Query Form: SELECT (2)
EUCLID -‐ Querying Linked Data 23
PREFIX dbpedia: <http://dbpedia.org/resource/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dc: <http://purl.org/dc/elements/1.1/> PREFIX mo: <http://purl.org/ontology/mo/> SELECT ?album_name ?track_title ?date ?duration WHERE { dbpedia:The_Beatles foaf:made ?album . ?album dc:title ?album_name ; mo:track ?track . ?track dc:title ?track_title ;
mo:duration ?duration; FILTER (?duration>300000 && ?duration<400000) }
Query: Filter: Comparison and logical operators
Retrieve the albums and tracks recorded by ‘The Beatles’, where the duraCon of the song is more than 300 secs. and no longer than 400 secs.
Query Form: SELECT (3)
EUCLID -‐ Querying Linked Data 24
Elimina+on of duplicates
PREFIX dbpedia: <http://dbpedia.org/resource/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dc: <http://purl.org/dc/elements/1.1/> PREFIX mo: <http://purl.org/ontology/mo/> SELECT MODIFIER ?album_name WHERE { dbpedia:The_Beatles foaf:made ?album . ?album dc:title ?album_name .
mo:track ?track1 . mo:track ?track2 . FILTER (?track1 != ?track2) }
Retrieve the name of the albums recorded by ‘The Beatles’ which have at least two different songs. Query:
?album
“Revolver”
“Sessions”
“Abbey Road”
?album
“Revolver”
“Revolver”
“Revolver”
“Sessions”
“Abbey Road”
“Abbey Road”
DISTINCT
Results:
REDUCED MODIFIER= MODIFIER=
Aggregates
• Calculate aggregate values: COUNT, SUM, MIN, MAX, AVG, GROUP_CONCAT and SAMPLE
• Built around the GROUP BY operator • Prune at group level (cf. FILTER) using HAVING
Query Form: SELECT (4)
EUCLID -‐ Querying Linked Data 25
PREFIX dbpedia: <http://dbpedia.org/resource/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX mo: <http://purl.org/ontology/mo/> SELECT ?album (SUM(?track_duration) AS ?album_duration) WHERE { dbpedia:The_Beatles foaf:made ?album . ?album mo:track ?track . ?track mo:duration ?track_duration . } GROUP BY ?album HAVING (SUM(?track_duration) > 3600000)
Retrieve the duraCon of the albums recorded by ‘The Beatles’. Query:
Query Form: DESCRIBE
EUCLID -‐ Querying Linked Data 26
Takes the resources within the solu6on, and provides informa6on about them as RDF statements. They can be iden6fied by:
• Specifying explicit IRIs
• Bindings of variables in the WHERE clause
PREFIX dbpedia: <http://dbpedia.org/resource/> PREFIX mo: <http://purl.org/ontology/mo/> DESCRIBE ?member WHERE { dbpedia:The_Beatles mo:member ?member .}
PREFIX dbpedia: <http://dbpedia.org/resource/> DESCRIBE dbpedia:Paul_McCartney
EUCLID -‐ Querying Linked Data 27
• CONSTRUCT WHERE: In order to query for a subgraph, without change, it is no longer necessary to repeat the graph pa\ern in the template
PREFIX dbpedia: <http://dbpedia.org/resource/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX mo: <http://purl.org/ontology/mo/> PREFIX dc: <http://purl.org/dc/elements/1.1/> CONSTRUCT WHERE { dbpedia:The_Beatles foaf:made ?album . ?album mo:track ?track .}
Example:
Query Form: CONSTRUCT
Query Form: CONSTRUCT (2a)
EUCLID -‐ Querying Linked Data 28
dbpedia: The_Beatles foaf:made
<h\p:// musicbrainz.org/record/...>
<h\p:// musicbrainz.org/record/...>
foaf:made Data:
Query: Result:
"Help!" "Let It Be"
dc:6tle dc:6tle
<h\p:// musicbrainz.org/record/...>
"Abbey Road"
dc:6tle
foaf:made
CONSTRUCT { ?album dc:creator dbpedia:The_Beatles .} WHERE { dbpedia:The_Beatles foaf:made ?album .}
dbpedia: The_Beatles
<h\p:// musicbrainz
…>
<h\p:// musicbrainz
…> <h\p://
musicbrainz…>
dc:creator
dc:creator
dc:creator
EUCLID -‐ Querying Linked Data 29
• Returns RDF statements created from variable bindings
• Template: graph pa\ern with variables from the query pa\ern
PREFIX dbpedia: <http://dbpedia.org/resource/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX mo: <http://purl.org/ontology/mo/> PREFIX dc: <http://purl.org/dc/elements/1.1/> CONSTRUCT { ?album dc:creator dbpedia:The_Beatles . ?track dc:creator dbpedia:The_Beatles .} WHERE { dbpedia:The_Beatles foaf:made ?album . ?album mo:track ?track .}
Create the dc:creator descripCons for albums and their tracks recorded by ‘The Beatles’. Query:
Query Form: CONSTRUCT (2b)
Query Form: CONSTRUCT (3)
EUCLID -‐ Querying Linked Data 30
Subsets of results • It is possible to combine the query with solu+on modifiers
(ORDER BY, LIMIT, OFFSET)
PREFIX dbpedia: <http://dbpedia.org/resource/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX mo: <http://purl.org/ontology/mo/> PREFIX dc: <http://purl.org/dc/elements/1.1/> CONSTRUCT { ?album dc:creator dbpedia:The_Beatles . ?track dc:creator dbpedia:The_Beatles .} WHERE { dbpedia:The_Beatles foaf:made ?album . ?album mo:track ?track ; dc:date ?date . } ORDER BY DESC(?date) LIMIT 10
Create the dc:creator descripCons for the 10 most recent albums and their tracks recorded by ‘The Beatles’. Query:
Union Graph PaNern • Allows the specifica6on of alterna6ves (disjunc6ons)
EUCLID -‐ Querying Linked Data 31
PREFIX dbpedia: <http://dbpedia.org/resource/> PREFIX foaf: http://xmlns.com/foaf/0.1/ PREFIX dc: <http://purl.org/dc/elements/1.1/> CONSTRUCT { ?album dc:creator dbpedia:The_Beatles . ?track dc:creator dbpedia:The_Beatles .} WHERE { dbpedia:The_Beatles foaf:made ?album . ?album dc:title ?album_name . {?album dbpedia-‐ont:recordedIn dbpedia:Abbey_Road_Studios .}
UNION {?album dbpedia-‐ont:recordedIn dbpedia:Trident_Studios .}}
Create the dc:creator descripCons for the albums recorded by ‘The Beatles’ in ‘Abbey Road Studios’ or ‘Trident Studios’ Query:
Query Form: CONSTRUCT (4)
Filter expressions • Different types of filters and func6ons may be used
EUCLID -‐ Querying Linked Data 32
PREFIX dbpedia: <http://dbpedia.org/resource/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dc: <http://purl.org/dc/elements/1.1/> CONSTRUCT {?album dc:creator dbpedia:The_Beatles .} WHERE { dbpedia:The_Beatles foaf:made ?album . ?album dc:title ?album_name ; FILTER (REGEX(?album_name, ".*love.*", i)) }
Query: Filter: Regular expressions over strings
Create the dc:creator descripCons of the albums recorded by ‘The Beatles’ whose Ctle contains the word ‘love .
Query Form: CONSTRUCT (5)
Type of func+on Func+on Result type
Func6onal Forms
bound IF COALESCE NOT EXISTS, EXISTS or, and RDFTerm-‐equal (=), sameTerm IN, NOT IN
xsd:boolean rdfTerm rdfTerm xsd:boolean xsd:boolean xsd:boolean boolean
Func6ons on RDF Terms
isIRI, isBlank, isLiteral, isNumeric str, lang, datatype IRI BNODE
xsd:boolean simple literal iri iri blank node
Func6ons on Numerics ABS, ROUND, CEIL, FLOOR RAND
numeric xsd:double
Filter expressions
EUCLID -‐ Querying Linked Data 33 Source:h\p://www.w3.org/TR/sparql11-‐query/#SparqlOps
Query Form: CONSTRUCT (6)
Type of func+on Func+on Result type
Func6ons on Strings STRLEN SUBSTR, UCASE, LCASE STRSTARTS, STRENDS, CONTAINS STRBEFORE, STRAFTER ENCODE_FOR_URI CONCAT langMatches REGEX REPLACE
xsd:integer string literal xsd:boolean literal simple literal string literal xsd:boolean xsd:boolean string literal
Func6ons on Dates and Times
now year, month, day, hours, minutes seconds 6mezone tz
xsd:dateTime xsd:integer xsd:decimal xsd:dayTimeDura6on simple literal
Filter expressions
EUCLID -‐ Querying Linked Data 34
Source:h\p://www.w3.org/TR/sparql11-‐query/#SparqlOps
Query Form: CONSTRUCT (7)
Op+onal Graph PaNern • OPTIONAL clause encloses the op6onal parts • If variables in the construct clause are not bound in the op6onal,
the triple pa\erns with these variables are not generated
EUCLID -‐ Querying Linked Data 35
PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dc: <http://purl.org/dc/elements/1.1/> PREFIX dbpedia-‐owl: <http://dbpedia.org/ontology/> CONSTRUCT { ?album dc:creator ?artist . ?picture dc:depicts ?artist . } WHERE { ?artist foaf:made ?album . OPTIONAL {?artist foaf:depiction ?picture .}}
Create the dc:creator and dc:depicts descripCons of arCsts. Query:
Query Form: CONSTRUCT (8)
Op+onal Graph PaNern
• Can test if variables are bound in filter expressions • Solu6ons that meet the OPTIONAL clause can be filtered out by
using the logical filter NOT (!)
EUCLID -‐ Querying Linked Data 36
PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dc: <http://purl.org/dc/elements/1.1/> PREFIX dbpedia-‐owl: <http://dbpedia.org/ontology/> CONSTRUCT { ?album dc:creator ?artist . } WHERE { ?artist foaf:made ?album . OPTIONAL {?artist dbpedia-‐owl:deathPlace ?place_of_death .} FILTER (!BOUND(?place_of_death)) }
Create the dc:creator descripCons of those arCsts who are not dead.
Query:
Query Form: CONSTRUCT (9)
NOT EXISTS {?artist dbpedia-‐owl:deathPlace ?place_of_death .}
Assigning Variables • The value of an expression can be added to a solu6on mapping by
binding a new variable (which can be further used and returned) • The BIND form allows to assign a value to a variable from a BGP
EUCLID -‐ Querying Linked Data 37
PREFIX dbpedia: <http://dbpedia.org/resource/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX mo: <http://purl.org/ontology/mo/> PREFIX dbpedia-‐ont: <http://dbpedia.org/ontology/> CONSTRUCT { ?track dbpedia-‐ont:runtime ?secs .} WHERE { dbpedia:The_Beatles foaf:made ?album . ?album mo:track ?track . ?track mo:duration ?duration . BIND((?duration/1000) AS ?secs) .}
Calculate the duraCon of the tracks from ms to s, and store the value using the dbpedia-‐ont:runCme property .
Query:
Query Form: CONSTRUCT (10)
Sub-‐queries and Aggregate Values
• To combine the CONSTRUCT query form with aggregate values, a sub-‐query should be created inside the WHERE clause
EUCLID -‐ Querying Linked Data 38
PREFIX dbpedia: <http://dbpedia.org/resource/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX mo: <http://purl.org/ontology/mo/> CONSTRUCT {?album music-‐ont:duration ?album_duration .} WHERE { SELECT ?album (SUM(?track_duration) AS ?album_duration) { dbpedia:The_Beatles foaf:made ?album . ?album mo:track ?track . ?track mo:duration ?track_duration . } GROUP BY ?album HAVING (SUM(?track_duration) > 3600000)}
Materialize the duraCon of the albums recorded by ‘The Beatles’. Query:
Query Form: CONSTRUCT (11)
UPDATING LINKED DATA WITH SPARQL 1.1
EUCLID -‐ Querying Linked Data 39
Data Management
EUCLID -‐ Querying Linked Data 40
SPARQL 1.1 provides data update opera6ons:
• INSERT data: adds some triples, given inline in the request, into a graph
• DELETE data: removes some triples, given inline in the request, if the respec6ve graphs contains those
• DELETE/INSERT data: uses in parallel INSERT and DELETE
CH 1
CH 1
Data Management (2)
EUCLID -‐ Querying Linked Data 41
INSERT data
PREFIX dc: <http://purl.org/dc/elements/1.1/> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
INSERT DATA { GRAPH { <http://musicbrainz.org/20130302> <http://musicbrainz.org/artist/b10bbbfc-‐cf9e-‐42e0-‐be17-‐e2c3e1d2600d>
foaf:made <http://musicbrainz.org/release/3a685770-‐7326-‐34fc-‐9f18-‐e5f5626f3dc5> , < http://musicbrainz.org/release/cb6f8798-‐d51e-‐4fa5-‐a4d1-‐2c0602bfe1b6 > .
<http://musicbrainz.org/release/3a685770-‐7326-‐34fc-‐9f18-‐e5f5626f3dc5> dc:title "Please Please Me". < http://musicbrainz.org/release/cb6f8798-‐d51e-‐4fa5-‐a4d1-‐2c0602bfe1b6 >
dc:title "Something New". } }
Insert the following albums recorded by The Beatles into the graph http://musicbrainz.org/20130209-‐004702
CH 1
Data Management (3)
EUCLID -‐ Querying Linked Data 42
DELETE data Delete all the information about the album Casualities of The Beatles.
PREFIX dc: <http://purl.org/dc/elements/1.1/> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
DELETE { ?album ?predicate ?object . } WHERE { <http://musicbrainz.org/artist/b10bbbfc-‐cf9e-‐42e0-‐be17-‐e2c3e1d2600d>
foaf:made ?album . ?album dc:title "Casualities". ?album ?predicate ?object .}
CH 1
Data Management (4)
EUCLID -‐ Querying Linked Data 43
DELETE/INSERT data
PREFIX dbpedia: <http://dbpedia.org/resource/> PREFIX db-‐ont: <http://dbpedia.org/ontology/> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
DELETE { dbpedia:The_Beatles db-‐ont:currentMember ?x . } INSERT { GRAPH <http://musicbrainz.org/20130209-‐004702> { dbpedia:The_Beatles db-‐ont:formerBandMember ?x .} } WHERE { dbpedia:The_Beatles db-‐ont:currentMember ?x . ?x foaf:name "Peter Best" .}
Delete the status of ‘Peter Best’ as current member of´The Beatles´, and insert his status as former member of the band.
Graph Management
EUCLID -‐ Querying Linked Data 44
SPARQL 1.1 provides graph update opera6ons:
• CREATE: creates an empty graph in the Graph Store
• LOAD: reads the content of a document into a graph in the Graph Store
• CLEAR: removes all triples in one or more graphs
• DROP: removes the graph from the Graph Store
• Other opera+ons: COPY, MOVE, ADD
Graph Management (2)
EUCLID -‐ Querying Linked Data 45
CREATE • Creates a new named graph • Can be used with the DEFAULT and ALL keywords
LOAD • An RDF graph can be loaded from a URL
• LOAD can be used with the SILENT keyword
CREATE GRAPH <http://musicbrainz.org/20130302>
LOAD <http://xmlns.com/foaf/spec/20100809.rdf>
LOAD <http://xmlns.com/foaf/spec/20100809.rdf> INTO <http://xmlns.com/foaf/0.1/>
Named graph
Graph Management (3)
EUCLID -‐ Querying Linked Data 46
CLEAR • Removes all triples in the graph (it is emp6ed but not deleted!) • The graph(s) can be specified with the following keywords:
DEFAULT, NAMED, ALL, GRAPH • Can be used with the SILENT keyword
DROP • The given graph is removed from the Graph Store, including its
content • Can be used with the DEFAULT and ALL keywords
CLEAR GRAPH <http://musicbrainz.org/20130302>
DROP GRAPH <http://musicbrainz.org/20130302>
Graph Management (4)
EUCLID -‐ Querying Linked Data 47
SPARQL 1.1 provides other graph management opera6ons:
• COPY … TO …
• MOVE … TO …
• ADD … TO …
COPY GRAPH <http://musicbrainz.org/20130302> TO GRAPH <http://musicbrainz.org/20130303>
MOVE GRAPH <http://musicbrainz.org/temp> TO GRAPH <http://musicbrainz.org/20130303>
ADD GRAPH <http://musicbrainz.org/20130302> TO GRAPH <http://musicbrainz.org/20130303>
SPARQL PROTOCOL FOR RDF
EUCLID -‐ Querying Linked Data 48
SPARQL 1.1. Protocol
EUCLID -‐ Querying Linked Data 49
• Consists of two opera+ons: query and update
• An opera6on defines: • The HTTP method (GET or POST) • The HTTP query string parameters • The message content included in the HTTP request body • The message content included in the HTTP response body
SPARQL Client SPARQL Endpoint Request
alt [no errors]!
[else]!
Success Response
Failure Response
EUCLID -‐ Querying Linked Data 50
• Sends a SPARQL query to a service and receives the results of the query
• The response is:
• May be invoked using HTTP GET or HTTP POST. The method POST with URL encoding is mostly used when the query string is too long
Query Operation
XML, JSON, CSV/TSV from a SELECT query RDF/XML, Turtle from a CONSTRUCT query
EUCLID -‐ Querying Linked Data 51
Example:
PREFIX dbpedia: <http://dbpedia.org/resource/> PREFIX dbpedia-‐ont: <http://dbpedia.org/ontology/> SELECT ?album WHERE { ?album dbpedia-‐ont:artist dbpedia:The_Beatles .} LIMIT 3
SPARQL Query:
Query Operation (2)
HTTP GET request:
Try this query! (Click on the hurl icon)
Update Operation
EUCLID -‐ Querying Linked Data 52
• Sends a SPARQL update request to a service
• Should be invoked using the HTTP PATCH/POST method
• The response consists of a HTTP response status code, which indicates success or failure of the opera6on
REASONING OVER LINKED DATA
EUCLID -‐ Querying Linked Data 53
Reasoning for Linked Data Integration
EUCLID -‐ Querying Linked Data 54
• Example: Integra6on of the MusicBrainz data set and the DBpedia data set
Integra+on Data set Data set
Reasoning for Linked Data Integration
EUCLID -‐ Querying Linked Data 55
mo:b10bbbfc-‐cf9e-‐42e0-‐be17-‐e2c3e1d2600d foaf:name The Beatles; mo:member mo:ba550d0e-‐adac-‐4864-‐b88b-‐407cab5e76af; mo:member mo:4d5447d7-‐c61c-‐4120-‐ba1b-‐d7f471d385b9; mo:member mo:42a8f507-‐8412-‐4611-‐854f-‐926571049fa0; mo:member mo:300c4c73-‐33ac-‐4255-‐9d57-‐4e32627f5e13.
dbpedia:The_Beatles dbpedia-‐ont:origin dbpedia:Liverpool; dbpedia-‐ont:genre dbpedia:Rock_music; foaf:depiction .
Integra+on Data set Data set
same
Reasoning for Linked Data Integration
EUCLID -‐ Querying Linked Data 56
mo:b10bbbfc-‐cf9e-‐42e0-‐be17-‐e2c3e1d2600d foaf:name The Beatles; mo:member mo:ba550d0e-‐adac-‐4864-‐b88b-‐407cab5e76af; mo:member mo:4d5447d7-‐c61c-‐4120-‐ba1b-‐d7f471d385b9; mo:member mo:42a8f507-‐8412-‐4611-‐854f-‐926571049fa0; mo:member mo:300c4c73-‐33ac-‐4255-‐9d57-‐4e32627f5e13.
dbpedia:The_Beatles dbpedia-‐ont:origin dbpedia:Liverpool; dbpedia-‐ont:genre dbpedia:Rock_music; foaf:depiction .
same
SELECT ?m ?g WHERE { dbpedia:The_Beatles dbpedia-‐ont:genre ?g; mo:member ?m.}
Query: ?m ?g mo:ba550d0e-‐adac-‐4864-‐b88b-‐407cab5e76af
dbpedia:Rock_music
mo:4d5447d7-‐c61c-‐4120-‐ba1b-‐d7f471d385b9
dbpedia:Rock_music
mo42a8f507-‐8412-‐4611-‐854f-‐926571049fa0;
dbpedia:Rock_music
mo300c4c73-‐33ac-‐4255-‐9d57-‐4e32627f5e13
dbpedia:Rock_music
Result set:
SPARQL 1.1: Entailment Regimes
EUCLID -‐ Querying Linked Data 57
• SPARQL 1.0 was defined only for simple entailment (pa\ern matching )
• SPARQL 1.1 is extended with entailment regimes other than simple entailment: – RDF entailment – RDFS entailment – D-‐Entailment – OWL RL entailment – OWL Full entailment – OWL 2 DL, EL, and QL entailment – RIF entailment
Source: h\p://www.w3.org/TR/rdf-‐mt/#RDFSRules
RDFS
EUCLID -‐ Querying Linked Data 58
Resource Descrip+on Framework Schema
Seman6c Web Stack Berners-‐Lee (2006)
Taxonomies and inferences
RDFS Entailment Regimes
EUCLID -‐ Querying Linked Data 59
• Contains 13 entailment rules denominated rdfsi for inference over RDFS defini6ons*: – rdfs:Literal (rdfs1, rdfs13) – rdfs:domain (rdfs2), rdfs:range (rdfs3)
– rdfs:Resource (rdfs4a, rdfs4, rdfs8) – rdfs:subPropertyOf (rdfs5, rdfs6, rdfs7, rdfs12) – rdfs:Class (rdfs8, rdfs10) – rdfs:subClassOf (rdfs9, rdfs10, rdfs11) – rdfs:ContainerMembershipProperty (rdfs12) – rdfs:Datatype (rdfs13)
* Source: h\p://www.w3.org/TR/rdf-‐mt/#RDFSRules
rdfs2 – rdfs:domain
EUCLID -‐ Querying Linked Data 60
dbpedia: The_Beatles
dbpedia: Paul_McCartney
SELECT ?x WHERE { ?x a mo:MusicGroup.}
mo:member rdfs:domain mo:MusicGroup .
?x ?x
dbpedia:The_Beatles …
mo:member
Schema: Query:
Result set: Result set with inference:
dbpedia: John_Lennon
dbpedia: George_Harrison
dbpedia: Ringo_Starr
mo:member mo:member mo:member
rdfs3 – rdfs:range
EUCLID -‐ Querying Linked Data 61
dbpedia: The_Beatles
dbpedia: Paul_McCartney
SELECT ?x WHERE { ?x a foaf:Agent.}
mo:member rdfs:range foaf:Agent .
?x ?x
dbpedia:Paul_McCartney
dbpedia:John_Lennon
dbpedia:Ringo_Starr
dbpedia:George_Harrison …
dbpedia-‐ont: bandMember
Schema: Query:
Result set: Result set with inference:
dbpedia: John_Lennon
dbpedia: George_Harrison
dbpedia: Ringo_Starr
dbpedia-‐ont: bandMember
dbpedia-‐ont: bandMember
dbpedia-‐ont: bandMember
rdfs7 – rdfs:subPropertyOf
EUCLID -‐ Querying Linked Data 62
dbpedia: Yesterday
dbpedia: Paul_McCartney
SELECT ?x WHERE { dbpedia:Yesterday mo:performer ?x.}
mo:singer rdfs:subPropertyOf mo:performer .
?x
dbpedia:John_Lennon
dbpedia:Ringo_Starr
dbpedia:George_Harrison
?x
dbpedia:John_Lennon
dbpedia:Ringo_Starr
dbpedia:George_Harrison
dbpedia:Paul_McCartney
mo:singer
Schema: Query:
Result set: Result set with inference:
dbpedia: John_Lennon
dbpedia: George_Harrison
dbpedia: Ringo_Starr
mo:performer mo:performer mo:performer
mo:performer
rdfs9 – rdfs:subClassOf
EUCLID -‐ Querying Linked Data 63
dbpedia: The_Beatles
SELECT ?x WHERE { ?x a mo:MusicArtist.}
mo:MusicGroup rdfs:subClassOf mo:MusicArtist .
?x ?x
dbpedia:The_Beatles …
Schema: Query:
Result set: Result set with inference:
mo: MusicAr6st
rdf:type
mo: MusicGroup
rdf:type
Inference from Schema
EUCLID -‐ Querying Linked Data 64
• Knowledge encoded in the schema leads to infer new facts
• This is also captured in the set of axioma+c triples, which provide basic meaning for all the vocabulary terms
mo:MusicGroup rdfs:subClassOf mo:MusicArtist . mo:MusicGroup a rdfs:Class . mo:MusicArtist a rdfs:Class .
Schema:
Inferred facts:
rdfs:subClassOf rdfs:domain rdfs:Class .
rdfs:subClassOf rdfs:range rdfs:Class .
RDFS: Lack of Consistency Check
EUCLID -‐ Querying Linked Data 65
• It is possible to infer facts that seem incorrect facts, but RDFS cannot prevent this:
Schema: mo:member rdfs:domain mo:MusicGroup ; rdfs:range foaf:Agent .
Exis6ng :PaulMcCartney a :SoloMusicArtist ; facts: :member :TheBeatles . Inferred :PaulMcCartney a :MusicGroup . facts: No contradic+on!:
The mis-‐modeling is not diagnosed
rdfs2
• We might wish further inferences, but these are beyond the entailment rules implemented by RDFS
RDFS: Inference Limitations
EUCLID -‐ Querying Linked Data 66
foaf:knows rdfs:domain foaf:Person ; rdfs:range foaf:Person .
foaf:made rdfs:domain foaf:Agent . :PaulMcCartney foaf:made :Yesterday ; foaf:knows :RingoStarr . :PaulMcCartney a foaf:Agent ; a foaf:Person . :RingoStarr a foaf:Person .
Schema:
Existing fact:
Inferred facts:
:Yesterday dc:creator :PaulMcCartney. :RingoStarr foaf:knows :PaulMcCartney .
These inferences require OWL!
NOT inferred:
Cannot model with RDFS that ‘x knows y’ implies ‘y knows x’
Cannot model with RDFS that if ‘x makes y’ implies that ‘the creator of y is x’
OWL
EUCLID -‐ Querying Linked Data 67
Web Ontology Language
Seman6c Web Stack Berners-‐Lee (2006)
Ontologies and inferences
Introduction to OWL
EUCLID -‐ Querying Linked Data 68
• Provides more ontological constructs and avoids some of the poten6al confusion in RDFS
• OWL 2 is divided into sub-‐languages denominated profiles: – OWL 2 EL: Limited to basic classifica6on, but with polynomial-‐6me reasoning
– OWL 2 QL: Designed to be translatable to rela6onal database querying
– OWL 2 RL: Designed to be efficiently implementable in rule-‐based systems
• Most triple stores concentrate on the use of RDFS with a subset of OWL features, called OWL-‐Horst or RDFS++
More restric6ve than OWL DL
OWL Properties
EUCLID -‐ Querying Linked Data 69
OWL dis6nguishes between two types of proper6es: • OWL ObjectProper+es: resources as values • OWL DatatypeProper+es: literals as values
:plays rdf:type owl:ObjectProperty;
rdfs:domain :Musician; rdfs:range :Instrument . :hasMembers rdf:type owl:DatatypeProperty; rdfs:domain :MusicGroup
rdfs:range xsd:int .
Property Axioms
EUCLID -‐ Querying Linked Data 70
• Property axioms include those from RDF Schema • OWL allows for property equivalence. Example: EquivalentObjectProperties(dbpedia-‐ont:bandMember mo:member) dbpedia-‐ont:bandMember owl:equivalentProperty mo:member.
≡
dbpedia: The_Beatles
dbpedia: Paul_McCartney
mo:member
dbpedia: John_Lennon
dbpedia: George_Harrison
dbpedia: Ringo_Starr
mo:member
mo:member
mo:member
SELECT ?x {dbpedia:The_Beatles dbpedia-‐ont:bandMember ?x.}
Query:
?x Result set:
?x dbpedia:Paul_McCartney
dbpedia:John_Lennon
dbpedia:Ringo_Starr
dbpedia:George_Harrison
Result set with inference:
Property Axioms
EUCLID -‐ Querying Linked Data 71
• Property axioms include those from RDF Schema • OWL allows for property equivalence. Example: EquivalentObjectProperties(dbpedia-‐ont:bandMember mo:member) dbpedia-‐ont:bandMember owl:equivalentProperty mo:member.
• OWL allows for property disjointness. Example: DisjointObjectProperty(dbpedia-‐ont:length mo:duration)
dbpedia-‐ont:length owl:propertyDisjointWith mo:duration.
• There is no standard for implemen6ng inconsistency reports under SPARQL
≡
≡
Property Axioms (2)
EUCLID -‐ Querying Linked Data 72
OWL allows the defini6on of property characteris6cs to infer new facts rela6ng to instances and their proper6es
• Symmetry
• Transi6vity • Inverse • Func6onal • Inverse Func6onal
Property Axioms: Symmetry
EUCLID -‐ Querying Linked Data 73
dbpedia: The_Beatles
dbpedia: Billy_Preston
dbpedia: Plas6c_Ono_
Band :associatedMusicalArtist a owl:SymmetricProperty .
?genre
dbpedia:Plastic_Ono_Band
?genre
dbpedia:Plastic_Ono_Band
dbpedia:Billy_Preston
:associatedMusicalArtist
Schema:
Result set: Result set with inference:
SELECT ?x WHERE { dbpedia:The_Beatles :associatedMusicalArtist ?x.}
Query:
:associatedMusicalArtist
Property Axioms: Transitivity
EUCLID -‐ Querying Linked Data 74
:Rock
:Heavy_ metal
:Black_ metal
:Punk_ rock SELECT ?genre WHERE {
:Rock :subgenre ?genre .}
:subgenre a owl:TransitiveProperty .
?genre
:Heavy_metal
:Punk_rock
?genre
:Heavy_metal
:Punk_rock
:Black_metal
:subgenre :subgenre
:subgenre :subgenre
Schema:
Query:
Result set: Result set with inference:
Property Axioms: Inverse
EUCLID -‐ Querying Linked Data 75
SELECT ?x WHERE { ?x mo:member_of dbpedia:The_Beatles .}
mo:member_of owl:inverseOf mo:member.
?x
dbpedia:John_Lennon
dbpedia:George_Harrison
?x
dbpedia:John_Lennon
dbpedia:George_Harrison
dbpedia:Paul_McCartney
dbpedia:Ringo_Starr
Schema:
Query:
Result set: Result set with inference:
dbpedia: The_Beatles
dbpedia: Paul_McCartney
mo:member_of
dbpedia: John_Lennon
dbpedia: George_Harrison
dbpedia: Ringo_Starr
mo:member
mo:member_of
mo:member
mo:member_of mo:member_of
Example: Every ar6st primarily plays only one musical instrument mo:primary_instrument rdf:type owl:FunctionalProperty . dbpedia:Jimi_Hendrix mo:primary_instrument dbpedia:Electric_Guitar. dbpedia:Jimi_Hendrix mo:primary_instrument dbpedia:E-‐Guitar. Conclusion dbpedia:Electric_Guitar owl:sameAs dbpedia:E-‐Guitar .
Property Axioms: Functional
EUCLID -‐ Querying Linked Data 76
It refers to a property that can have only one (unique) value for each instance
r2
mo:primary_
instrument
same
r1
mo:primary_ instrument
Example: Every recording has a unique ISRC (Interna6onal Standard Recording Code) mo:isrc rdf:type owl:InverseFunctionalProperty . mo:21047249-‐7b3f-‐4651-‐acca-‐246669c081fd mo:isrc "GBAYE6300412" . dbpedia:She_Loves_You mo:isrc "GBAYE6300412" . Conclusion mo:21047249-‐7b3f-‐4651-‐acca-‐246669c081fd
owl:sameAs :dbpedia:She_Loves_You .
Property Axioms: Inverse Functional
EUCLID -‐ Querying Linked Data 77
It is useful for specifying unique proper6es iden6fying an individual
r2
mo:isrc
mo:isrc same
r1
Individual Axioms
EUCLID -‐ Querying Linked Data 78
OWL Individuals represent instances of classes. They are related to their class by the rdf:type property
• We can state that two individuals are the same SameIndividual(<artist/ba550d0e-‐adac-‐4864-‐b88b-‐407cab5e76af#_> dbpedia:PaulMcCartney)�
<artist/ba550d0e-‐adac-‐4864-‐b88b-‐407cab5e76af#_> owl:sameAs dbpedia:PaulMcCartney .
• We can state that two individuals are different DifferentIndividuals(:TheBeatles_band :TheBeatles_TVseries)
:TheBeatles_band owl:differentFrom :TheBeatles_Tvseries .
≡
≡
Class Axioms
EUCLID -‐ Querying Linked Data 79
Axioms declare general statements about concepts which are used in logical inference (reasoning). Class axioms:
• Sub-‐class rela6onship (from RDF Schema)
• Equivalent rela6onship: classes have the same individuals EquivalentClass(:Musician :MusicArtist)
:Musician owl:equivalentClass :MusicArtist .
• Disjointness: classes have no shared individuals DisjointClasses(:SoloMusicArtist :MusicGroup) :SoloMusicArtist owl:disjointWith :MusicGroup . ≡
≡
Class Construction
EUCLID -‐ Querying Linked Data 80
• OWL classes are defined by the OWL term owl:Class
• OWL classes can be subclassed as in RDFS:
• OWL classes may be combined with class constructs to build new classes
Music Ar6st
Ar6st
:MusicArtist rdfs:subClassOf :Artist .
Class Construction (2)
EUCLID -‐ Querying Linked Data 81
These class constructs are available in OWL, not in RDFS
The class of female music ar6sts ObjectIntersectionOf(:Female :MusicArtist)� [a owl:Class; owl:intersectionOf(:Female :MusicArtist)]
The class of music ar6sts ObjectUnionOf(:SoloMusicArtist :MusicGroup) [a owl:Class; owl:unionOf(:SoloMusicArtist :MusicGroup)]
Everything that’s not instrumental music ObjectComplementOf(:InstrumentalMusic) [a owl:Class; owl:complementOf(:InstrumentalMusic)]
Female
Music Ar6st
Solo
Group
Instrumental
≡
≡
≡ NOTE: Anonymous classes!
Naming Class Constructions
EUCLID -‐ Querying Linked Data 82
• Direct naming can be achieved via owl:equivalentClass
• This construc6on provides necessary and sufficient condi6ons for class membership
• Class naming can be also achieved using rdfs:subClassOf, it provides a necessary but insufficient condi6on for class membership
Music Ar6st Solo
Group
EquivalentClass(:MusicArtist ObjectUnionOf(:SoloMusicArtist :MusicGroup)) :MusicArtist owl:equivalentClass [owl:unionOf (:SoloMusicArtist :MusicGroup)]
≡
Summary
EUCLID -‐ Querying Linked Data 83
• Basic concepts: triple pa\erns, graph pa\erns, SPARQL endpoint ...
• SPARQL Query: • Query forms: ASK, SELECT, DESCRIBE, CONSTRUCT
• Query paNerns: BGP, UNION, OPTIONAL, FILTER
• Sequence modifiers: DISTINCT, REDUCED, ORDER BY, LIMIT, OFFSET
• SPARQL 1.1 Update: • Data management: INSERT, DELETE; DELETE/INSERT
• Graph management: LOAD, CLEAR, CREATE, DROP, COPY/MOVE/ADD
• SPARQL Protocol: query opera6on, update opera6on
Que
rying Link
ed Data
In this chapter we studied:
Summary (2)
EUCLID -‐ Querying Linked Data 84
• Reasoning over Linked Data: • SPARQL 1.1 entailment regimes
• RDFS: entailment regimes, lacks of consistency check, inference limita6ons
• OWL: proper6es, property axioms (symmetry, transi6vity, inverse, func6onal, inverse func6onal), individual axioms, class axioms, class construc6ons, naming classes …
Rea
soning
ove
r Linke
d Data
In this chapter we studied:
For exercises, quiz and further material visit our website:
EUCLID -‐ Providing Linked Data 85
@euclid_project euclidproject euclidproject
http://www.euclid-‐project.eu
Other channels:
eBook Course
Acknowledgements
• Alexander Mikroyannidis • Alice Carpen6er • Andreas Harth • Andreas Wagner • Andriy Nikolov • Barry Norton • Daniel M. Herzig • Elena Simperl • Günter Ladwig • Inga Shamkhalov • Jacek Kopecky • John Domingue
• Juan Sequeda • Kalina Bontcheva • Maria Maleshkova • Maria-‐Esther Vidal • Maribel Acosta • Michael Meier • Ning Li • Paul Mulholland • Peter Haase • Richard Power • Steffen Stadtmüller
86
People who have contributed to crea6ng Euclid training content: