194
Mini-curso sobre Linked Data Oscar Corcho, Asunción Gómez Pérez ({ocorcho, asun}@fi.upm.es) Universidad Politécnica de Madrid Florianópolis, September 1st 2010 (3º OntoBras 2010) Credits: Raúl García Castro, Oscar Muñoz, Jose Angel Ramos Gargantilla, María del Carmen Suárez de Figueroa, Boris Villazón, Alex de León, Víctor Saquicela, Luis Vilches, Miguel Angel García, Manuel Salvadores, Juan Sequeda, Carlos Ruiz Moreno and many others Work distributed under the license Creative Commons Attribution-Noncommercial-Share Alike 3.0

Linked Data Tutorial (Florianópolis)

Embed Size (px)

DESCRIPTION

Slides from the tutorial given by Asunción Gómez-Pérez and Oscar Corcho at 3º OntoBras in Florianópolis (September 2010)

Citation preview

Page 1: Linked Data Tutorial (Florianópolis)

Mini-curso sobre Linked Data

Oscar Corcho, Asunción Gómez Pérez ({ocorcho, asun}@fi.upm.es)

Universidad Politécnica de Madrid

Florianópolis, September 1st 2010(3º OntoBras 2010)

Credits: Raúl García Castro, Oscar Muñoz, Jose Angel Ramos Gargantilla, María del Carmen Suárez de Figueroa, Boris Villazón, Alex de León, Víctor Saquicela, Luis Vilches, Miguel Angel García, Manuel Salvadores, Juan Sequeda, Carlos Ruiz Moreno and many others

Work distributed under the license Creative Commons Attribution-Noncommercial-Share Alike 3.0

Page 2: Linked Data Tutorial (Florianópolis)

2

Contents

• Introduction to Linked Data• Linked Data Foundations: RDF, RDF Schema,

SPARQL and OWL

Coffee break

• Linked Data publication- Methodological guidelines for Linked Data publication- RDB2RDF tools- Technical aspects of Linked Data publication

• Linked Data consumption

Page 3: Linked Data Tutorial (Florianópolis)

What is the Web of Linked Data?

• An extension of the current Web…- … where information and services

are given well-defined and explicitly represented meaning, …

- … so that it can be shared and used by humans and machines, ...

- ... better enabling them to work in cooperation

• How? - Promoting information exchange by

tagging web content with machine processable descriptions of its meaning.

- And technologies and infrastructure to do this

- And clear principles on how to publish data

data

Page 4: Linked Data Tutorial (Florianópolis)

What is Linked Data?

• Linked Data is a term used to describe a recommended best practice for exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs and RDF.

- Part of the Semantic Web- Exposing, sharing and connecting data- Technologies: URIs and RDF (although others are also

important)

Page 5: Linked Data Tutorial (Florianópolis)

5

The four principles (Tim Berners Lee, 2006)

1. Use URIs as names for things

2. Use HTTP URIs so that people can look up those names.

3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)

4. Include links to other URIs, so that they can discover more things.

• http://www.w3.org/DesignIssues/LinkedData.html

http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html

Page 6: Linked Data Tutorial (Florianópolis)

Linked Open Data evolution

2007

2008

2009

Page 7: Linked Data Tutorial (Florianópolis)

7

LOD Cloud May 2007

Figure from [4]

Facts:• Focal points:

• DBPedia: RDFized vesion of Wikipiedia; many ingoing and outgoing links

• Music-related datasets• Big datasets include FOAF, US Census data• Size approx. 1 billion triples, 250k links

Page 8: Linked Data Tutorial (Florianópolis)

8

LOD Cloud September 2008

Figure from [4]

Facts:• More than 35 datasets interlinked• Commercial players joined the cloud,

e.g., BBC• Companies began to publish and host

dataset, e.g. OpenLink, Talis, or Garlik.• Size approx. 2 billion triples, 3 million

links

Page 9: Linked Data Tutorial (Florianópolis)

9

LOD Cloud March 2009

Figure from [4]

Facts:• Big part from Linking Open Drug cloud and

the BIO2RDF project (bottom)• Notable new datasets: Freebase,

OpenCalais, ACM/IEEE• Size > 10 billion triples

Page 10: Linked Data Tutorial (Florianópolis)

LOD clouds

Page 11: Linked Data Tutorial (Florianópolis)

Why Linked Data?

• Basically, to move from a Web of documents to a Web of Data

• Let’s try an example:• Tell me which football players, born in the province of

Albacete, in Spain, have scored a goal in the World Cup final

• Disclaimer:- Sorry to use an example about football, but you have to

understand that for several years Spaniards will be talking about football a lot ;-)

Page 12: Linked Data Tutorial (Florianópolis)

Information search in the Web of documents

¿?

Page 13: Linked Data Tutorial (Florianópolis)

What we were actually looking for

Page 14: Linked Data Tutorial (Florianópolis)

It would be better to make a data query…

(football players from Albacete who played Eurocup 2008)

Page 15: Linked Data Tutorial (Florianópolis)

How should we publish data?

• Formats in which data is published nowadays…- XML- HTML- DBs- APIs- CSV- XLS- …

• However, main limitations from a Web of Data point of view- Difficult to integrate- Data is not linked to each other, as it happens with Web

documents.

Page 16: Linked Data Tutorial (Florianópolis)

Which format do we use then?

• RDF (Resource Description Framework)

- Data model- Based on triples: subject, predicate, object

• <Oscar> <vive en> <Madrid>• <Madrid> <es la capital de> <España>• <España> <es campeona de> <Mundial de Fútbol>• …

- Serialised in different formats• RDF/XML, RDFa, N3, Turtle, JSON…

Page 17: Linked Data Tutorial (Florianópolis)

URIs (Universal-Uniform Resource Identifer)

• Two types of identifiers can be used to identify Linked Data resources- URIRefs (Unique Resource Identifiers References)

• A URI and an optional Fragment Identifier separated from the URI by the hash symbol ‘#’

• http://www.ontology.org/people#Person• people:Person

- Plain URIs can also be used, as in FOAF:• http://xmlns.com/foaf/0.1/Person

17

Page 18: Linked Data Tutorial (Florianópolis)

18

How do we publish Linked Data?

1. Exposing Relational Databases or other similar formats into Linked Data- D2R- Triplify- R2O- NOR2O- Virtuoso- Ultrawrap- …

2. Using native RDF triplestores- Sesame- Jena- Owlim- Talis platform- …

3. Incorporating it in the form of RDFa in CMSs like Drupal

Page 19: Linked Data Tutorial (Florianópolis)

How do we consume Linked Data?

• Linked Data browsers- To explore things and datasets and to navigate between them.- Tabulator Browser (MIT, USA), Marbles (FU Berlin, DE),

OpenLink RDF Browser (OpenLink, UK), Zitgist RDF Browser (Zitgist, USA), Disco Hyperdata Browser (FU Berlin, DE), Fenfire (DERI, Ireland)

• Linked Data mashups- Sites that mash up (thus combine Linked data)- Revyu.com (KMI, UK), DBtune Slashfacet (Queen Mary, UK),

DBPedia Mobile (FU Berlin, DE), Semantic Web Pipes (DERI, Ireland)

• Search engines- To search for Linked Data.- Falcons (IWS, China), Sindice (DERI, Ireland), MicroSearch

(Yahoo, Spain), Watson (Open University, UK), SWSE (DERI, Ireland), Swoogle (UMBC, USA)

19Listing on this slide by T. Heath, M. Hausenblas, C. Bizer, R. Cyganiak, O. Hartig

Page 20: Linked Data Tutorial (Florianópolis)

Linked Data browsers (Disco)

Page 21: Linked Data Tutorial (Florianópolis)

Linked Data Mashup (LinkedGeoData)

© Migración de datos a la Web de los Datos - Enfoques, técnicas y herramientas Luis Manuel Vilches Blázquez

Page 22: Linked Data Tutorial (Florianópolis)

Linked Data Mashup (DBpedia Mobile)

© Migración de

datos a la Web de los Datos - Enfoqu

es, técnica

s y herramientas

Luis Manuel Vilches Blázqu

ez

http://wiki.dbpedia.org/DBpediaMobile

Page 23: Linked Data Tutorial (Florianópolis)

Linked Data Search Engines (Sindice and SIG.MA)

• Entity lookup service. Find a document that mentions a URI or a keyword.

Page 24: Linked Data Tutorial (Florianópolis)

Linked Data Search Engines (NYT)

• The New York Times: Alumni In The News- http://data.nytimes.com/schools/schools.html

Page 25: Linked Data Tutorial (Florianópolis)

Linked Data Search Engines (NYT)

• The New York Times: Source code is available

• … and is based on SPARQL queries

Page 27: Linked Data Tutorial (Florianópolis)

27

Open Government. USA and UK

TOP-DOWN

BOTTOM-UP

Page 28: Linked Data Tutorial (Florianópolis)

Linked Data Mashup (data.gov)

• Clean Air Status and Trends (CASTNET)- http://data-gov.tw.rpi.edu/demo/exhibit/demo-8-castnet.php

Page 29: Linked Data Tutorial (Florianópolis)

29

Linked Data in the UK

• Education- http://education.data.gov.uk/id/school/106661

• Parliament- http://parliament.psi.enakting.org/id/member/1227

• Maps- E.g., London:

http://data.ordnancesurvey.co.uk/id/7000000000041428- http://map.psi.enakting.org

• Transport- http://www.dft.gov.uk/naptan/

• SameAs service- http://www.sameas.org

• Challenges- http://gov.tso.co.uk/openup/sparql/gov-transport

Page 30: Linked Data Tutorial (Florianópolis)

Linked Data Mashup (data.gov.uk)

• Research Funding Explorer- http://bis.clients.talis.com/

Page 31: Linked Data Tutorial (Florianópolis)

31

Open Government Spain. Euskadi

Page 32: Linked Data Tutorial (Florianópolis)

32

Open Government Spain. Abredatos

Page 33: Linked Data Tutorial (Florianópolis)

33

Open Government Spain. Zaragoza

Page 34: Linked Data Tutorial (Florianópolis)

34

Open Government Spain. Asturias

Page 35: Linked Data Tutorial (Florianópolis)

Linked Data Mashup (Water quality)

• Water quality in Asturias’ beaches- http://datos.fundacionctic.org/sandbox/asturias/playas/

Page 36: Linked Data Tutorial (Florianópolis)

36

Contents

• Introduction to Linked Data• Linked Data Foundations: RDF, RDF Schema,

SPARQL and OWL

Coffee break

• Linked Data publication- Methodological guidelines for Linked Data publication- RDB2RDF tools- Technical aspects of Linked Data publication

• Linked Data consumption

Page 37: Linked Data Tutorial (Florianópolis)

37

Index

• Resource Description Framework (RDF)- RDF primitives- Reasoning with RDF

• RDF Schema- RDF Schema primitives- Reasoning with RDFS

• RDF(S) Management APIs• SPARQL• OWL

Page 38: Linked Data Tutorial (Florianópolis)

RDF: Resource Description Framework

• W3C recommendation• RDF is graphical formalism ( + XML syntax + semantics)

- For representing metadata- For describing the semantics of information in a machine-

accessible way- Resources are described in terms of properties and property

values using RDF statements- Statements are represented as triples, consisting of a subject,

predicate and object. [S, P, O]

oeg:Oscar oeg:Asun

oeg:Raul “http://www.fi.upm.es/”

person:hasHomePage

“Oscar Corcho García”person:hasName

person:hasColleague

person:hasColleague

38

Page 39: Linked Data Tutorial (Florianópolis)

RDF and URIs

• RDF uses URIRefs (Unique Resource Identifiers References) to identify resources- A URIRef consists of a URI and an optional Fragment Identifier

separated from the URI by the hash symbol ‘#’- Examples

• http://www.co-ode.org/people#hasColleague• coode:hasColleague

• A set of URIRefs is known as a vocabulary- E.g., the RDF Vocabulary

• The set of URIRefs used in describing the RDF concepts: rdf:Property, rdf:Resource, rdf:type, etc.

- The RDFS Vocabulary • The set of URIRefs used in describing the RDF Schema

language: rdfs:Class, rdfs:domain, etc.- The ‘Pizza Ontology’ Vocabulary

• pz:hasTopping, pz:Pizza, pz:VegetarianPizza, etc.

39

Page 40: Linked Data Tutorial (Florianópolis)

RDF Serialisations

• Normative- RDF/XML (www.w3.org/TR/rdf-syntax-grammar/)

• Alternative (for human consumption)- N3 (http://www.w3.org/DesignIssues/Notation3.html)- Turtle (http://www.dajobe.org/2004/01/turtle/)- TriX (http://www.w3.org/2004/03/trix/)- …

Important: the RDF serializations allow different syntactic variants. E.g., the order of RDF statements

has no meaning

40

Page 41: Linked Data Tutorial (Florianópolis)

RDF Serialisations. RDF/XML

<?xml version="1.0"?>

<rdf:RDF

xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"

xmlns:person="http://www.ontologies.org/ontologies/people#"

xmlns="http://www.oeg-upm.net/ontologies/people#"

xml:base="http://www.oeg-upm.net/ontologies/people">

<rdf:Property rdf:about="http://www.ontologies.org/ontologies/people#hasHomePage"/>

<rdf:Property rdf:about="http://www.ontologies.org/ontologies/people#hasColleague"/>

<rdf:Property rdf:about="http://www.ontologies.org/ontologies/people#hasName"/>

<rdf:Description rdf:about="#Raul"/>

<rdf:Description rdf:about="#Asun">

<person:hasColleague rdf:resource="#Raul"/>

<person:hasHomePage>http://www.fi.upm.es</person:hasHomePage>

</rdf:Description>

<rdf:Description rdf:about="#Oscar">

<person:hasColleague rdf:resource="#Asun"/>

<person:hasName>Oscar Corcho García</person:hasName>

</rdf:Description>

</rdf:RDF>

41

Page 42: Linked Data Tutorial (Florianópolis)

RDF Serialisations. N3

@base <http://www.oeg-upm.net/ontologies/people >

@prefix person: <http://www.ontologies.org/ontologies/people#>

:Asun person:hasColleague :Raul ;

person:hasHomePage “http://www.fi.upm.es/”.

:Oscar person:hasColleague :Asun ;

person:hasName “Óscar Corcho García”.

42

Page 43: Linked Data Tutorial (Florianópolis)

Exercise

• Objective• Get used to the different syntaxes of RDF

• Tasks• Take the text of an RDF file and create its corresponding graph• Take an RDF graph and create its corresponding RDF/XML and N3 files

43

Page 44: Linked Data Tutorial (Florianópolis)

Exercise 1.a. Create a graph from a file

• Open the file StickyNote_PureRDF.rdf

• Create the corresponding graph from it

• Compare your graph with those of your colleagues

44

Page 45: Linked Data Tutorial (Florianópolis)

45

Exercise 1.a. StickyNote_PureRDF.rdf

Page 46: Linked Data Tutorial (Florianópolis)

46

Exercise 1.b. Create files from a graph

• Transform the following graph into N3 syntax

Sensor029

Class01

Measurement8401

2010-06-12T12:00:1229

Computer101

User10A

Pedro

includes

includes

hasOwner

hasName

hasMeasurement

hasTemperature atTime

Page 47: Linked Data Tutorial (Florianópolis)

Blank nodes: structured property values

• Most real-world data involves structures that are more complicated than sets of RDF triple statements

• In RDF/XML, it is an <rdf:Description> node with no rdf:about• In N3, it is a resource identifier that starts with ‘_’

- E.g., “_:nodeX”

oeg:Oscar

city:BoadillaDelMonte Campus de Montegancedo s/n

address:hasStreetName

“Oscar Corcho García”person:hasName

address:city

person:hasPostalAddress

This intermediate URI does not need to have a name

47

Page 48: Linked Data Tutorial (Florianópolis)

Typed literals

• So far, all values have been presented as strings• XML Schema datatypes can be used to specify

values (objects in some RDF triple statements)

• In RDF/XML, this is expressed as:- <rdf:Description rdf:about=”#Oscar”>

<person:hasBirthDate

rdf:datatype="http://www.w3.org/2001/XMLSchema#date">1976-02-02 </person:hasBirthDate></rdf:Description>

• In N3, this is expressed as:- oeg:Oscar person:hasBirthDate ”1976-02-02”^^xsd:date .

oeg:Oscar 1976-02-02person:hasBirthDate

48

Page 49: Linked Data Tutorial (Florianópolis)

RDF Containers

• There is often the need to describe groups of things- A book was created by several authors- A lesson is taught by several persons- etc.

• RDF provides a container vocabulary- rdf:Bag A group of resources or literals, possibly including

duplicate members, where the order of members is not significant- rdf:Seq A group of resources or literals, possibly including

duplicate members, where the order of members is significant- rdf:Alt A group of resources or literals that are alternatives

(typically for a single value of a property)

oeg:Oscar

[email protected]” “[email protected]

rdf:_1

person:hasEmailAddress

rdf:_2

rdf:Seqrdf:type

49

Page 50: Linked Data Tutorial (Florianópolis)

RDF Reification

• RDF statements about other RDF statements- “Raúl believes that Oscar’s birthdate is on Feb 2nd, 1976 and that

his e-mail address is [email protected]

• RDF Reification- Allows expressing beliefs (and other modalities)- Allows expressing trust models, digital signatures, etc.- Allows expressing metadata about metadata

oeg:Raúl oeg:Oscar

[email protected]” 02/02/1976

person:hasEmailAddress

modal:believes

person:hasBirthDate

50

Page 51: Linked Data Tutorial (Florianópolis)

Main value of a structured value

• Sometimes one of the values of a structured value is the main one- The weight of an item is 2.4 kilograms - The most important value is 2.4, which is expressed with

rdf:value

• Scarcely used

product:Item1

2.4 units:kilogram

rdf:value

product:hasWeight

units:hasWeightUnit

51

Page 52: Linked Data Tutorial (Florianópolis)

52

Index

• Resource Description Framework (RDF)- RDF primitives- Reasoning with RDF

• RDF Schema- RDF Schema primitives- Reasoning with RDFS

• RDF(S) Management APIs• SPARQL• OWL

Page 53: Linked Data Tutorial (Florianópolis)

RDF inference. Graph matching techniques

• RDF inference is based on graph matching techniques

• Basically, the RDF inference process consists of the following steps:- Transform an RDF query into a template graph that has to

be matched against the RDF graph• It contains constant and variable nodes, and constant

and variable edges between nodes- Match against the RDF graph, taking into account constant

nodes and edges- Provide a solution for variable nodes and edges

53

Page 54: Linked Data Tutorial (Florianópolis)

RDF inference. Examples (I)

• Sample RDF graph

• Query: “Tell me who are the persons who have Asun as a colleague”

- Result: oeg:Oscar and oeg:Raúl

oeg:Oscar oeg:Asun

oeg:Raúl “http://www.fi.upm.es/”

person:hasHomePage

“Oscar Corcho García”person:hasName

person:hasColleague

person:hasColleague

? oeg:Asunperson:hasColleague

54

Page 55: Linked Data Tutorial (Florianópolis)

RDF inference. Examples (II)

• Query: “Tell me which are the relationships between Oscar and Asun”

- Result: oeg:hasColleague

• Query: “Tell me the homepage of Oscar colleagues”

- Result: “http://www.fi.upm.es/”

oeg:Oscar oeg:Asun?

oeg:Oscar

?

person:hasHomePage

person:hasColleague

55

Page 56: Linked Data Tutorial (Florianópolis)

RDF inference. Entailment rules

56

Page 57: Linked Data Tutorial (Florianópolis)

57

Index

• Resource Description Framework (RDF)- RDF primitives- Reasoning with RDF

• RDF Schema- RDF Schema primitives- Reasoning with RDFS

• RDF(S) Management APIs• SPARQL• OWL

Page 58: Linked Data Tutorial (Florianópolis)

RDFS: RDF Schema

• W3C Recommendation• RDF Schema extends RDF to enable talking about classes of

resources, and the properties to be used with them- Class definition: rdfs:Class, rdfs:subClassOf- Property definition: rdfs:subPropertyOf, rdfs:range, rdfs:domain- Other primitives: rdfs:comment, rdfs:label, rdfs:seeAlso, rdfs:isDefinedBy

• RDFS vocabulary adds constraints on models, e.g.:- x,y,z type(x,y) and subClassOf(y,z) type(x,z)

ex:Personex:Oscar

rdfs:subClassOf

ex:Animal

rdf:type

58

Page 59: Linked Data Tutorial (Florianópolis)

RDF(S) Serialisations. RDF/XML syntax

<?xml version="1.0"?>

<rdf:RDF

xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"

xmlns:person="http://www.ontologies.org/ontologies/people#"

xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"

xmlns="http://www.oeg-upm.net/ontologies/people#"

xml:base="http://www.oeg-upm.net/ontologies/people">

<rdfs:Class rdf:about="http://www.ontologies.org/ontologies/people#Professor">

<rdfs:subClassOf>

<rdfs:Class rdf:about="http://www.ontologies.org/ontologies/people#Person"/>

</rdfs:subClassOf>

</rdfs:Class>

<rdfs:Class rdf:about="http://www.ontologies.org/ontologies/people#Lecturer">

<rdfs:subClassOf rdf:resource="http://www.ontologies.org/ontologies/people#Person"/>

</rdfs:Class>

<rdfs:Class rdf:about="http://www.ontologies.org/ontologies/people#PhD">

<rdfs:subClassOf rdf:resource="http://www.ontologies.org/ontologies/people#Person"/>

</rdfs:Class>

59

Page 60: Linked Data Tutorial (Florianópolis)

RDF(S) Serialisations. RDF/XML syntax

<rdf:Property rdf:about="http://www.ontologies.org/ontologies/people#hasHomePage"/>

<rdf:Property rdf:about="http://www.ontologies.org/ontologies/people#hasColleague">

<rdfs:domain rdf:resource=" http://www.ontologies.org/ontologies/people#Person"/>

<rdfs:range rdf:resource=" http://www.ontologies.org/ontologies/people#Person"/>

</rdf:Property>

<rdf:Property rdf:about="http://www.ontologies.org/ontologies/people#hasName">

<rdfs:domain rdf:resource="http://www.w3.org/2002/07/owl#Thing"/>

</rdf:Property>

<person:PhD rdf:ID="Raul"/>

<person:Professor rdf:ID=“Asun">

<person:hasColleague rdf:resource="#Raul"/>

<person:hasHomePage>http://www.fi.upm.es</person:hasHomePage>

</person:Professor>

<person:Lecturer rdf:ID="Oscar">

<person:hasColleague rdf:resource="#Asun"/>

<person:hasName>Óscar Corcho García</person:hasName>

</person:Lecturer>

</rdf:RDF>

60

Page 61: Linked Data Tutorial (Florianópolis)

RDF(S) Serialisations. N3

@base <http://www.oeg-upm.net/ontologies/people >

@prefix person: <http://www.ontologies.org/ontologies/people#>

person:hasColleague a rdf:Property;

rdfs:domain person:Person;

rdfs:range person:Person.

person:Professor rdfs:subClassOf person:Person.

person:Lecturer rdfs:subClassOf person:Person.

person:PhD rdfs:subClassOf person:Person.

:Asun a person:Professor;

person:hasColleague :Raul ;

person:hasHomePage “http://www.fi.upm.es/”.

:Oscar a person:Lecturer;

person:hasColleague :Asun ;

person:hasName “Óscar Corcho García”.

:Raul a person:PhD.

a is equivalent to rdf:type

61

Page 62: Linked Data Tutorial (Florianópolis)

Flight

rdfs:Literal rdfs:Class

company-name singleFare

units:currencyQuantity

rdfs:range

rdfs:range

rdfs:domain

rdfs:domain

rdf:Type

departureDate

rdfs:domain

time:Date

rdfs:range

arrivalDate

rdfs:range

rdfs:domain

rdf:Propertyrdf:Type

rdf:Type rdf:Typerdf:Type

RDF

RDFS

IB-4321“Iberia”

500 euros

10/11/2005

singleFare departureDate

arrivalDate

company-namerdf:Type

rdf:Type

rdf:Type

rdf:Type

RDF(S) Example

62

Page 63: Linked Data Tutorial (Florianópolis)

Exercise

• Objective• Get used to the different syntaxes of RDF(S)

• Tasks• Take the text of an RDF(S) file and create its corresponding graph• Take an RDF(S) graph and create its corresponding RDF/XML and N3 files

63

Page 64: Linked Data Tutorial (Florianópolis)

Exercise 2.a. Create a graph from a file

• Open the files StickyNote.rdf and StickyNote.rdfs

• Create the corresponding graph from them

• Compare your graph with those of your colleagues

64

Page 65: Linked Data Tutorial (Florianópolis)

Exercise 2.a. StickyNote.rdf

65

Page 66: Linked Data Tutorial (Florianópolis)

Exercise 2.a. StickyNote.rdfs

66

Page 67: Linked Data Tutorial (Florianópolis)

67

Exercise 2.b. Create files from a graph

• Transform the following graph into N3 syntax

Sensor029

Class01

2010-06-12T12:00:1229

Computer101

User10A

Pedro

ObjectRoom PersonMeasurement

includes

includes

hasOwner

hasName

hasMeasurement

hasTemperature atTime

Page 68: Linked Data Tutorial (Florianópolis)

68

Index

• Resource Description Framework (RDF)- RDF primitives- Reasoning with RDF

• RDF Schema- RDF Schema primitives- Reasoning with RDFS

• RDF(S) Management APIs• SPARQL• OWL

Page 69: Linked Data Tutorial (Florianópolis)

RDF(S) inference. Entailment rules

69

Page 70: Linked Data Tutorial (Florianópolis)

RDF(S) inference. Additional inferences

70

Page 71: Linked Data Tutorial (Florianópolis)

RDF(S) limitations

• RDFS too weak to describe resources in sufficient detail- No localised range and domain constraints

• Can’t say that the range of hasChild is person when applied to persons and elephant when applied to elephants

- No existence/cardinality constraints• Can’t say that all instances of person have a mother that is also a

person, or that persons have exactly 2 parents- No boolean operators

• Can’t say or, not, etc.

- No transitive, inverse or symmetrical properties• Can’t say that isPartOf is a transitive property, that hasPart is the

inverse of isPartOf or that touches is symmetrical

• Difficult to provide reasoning support- No “native” reasoners for non-standard semantics- May be possible to reason via FOL axiomatisation

71

Page 72: Linked Data Tutorial (Florianópolis)

Exercise

• Objective• Understand the features of RDF(S) for implementing ontologies,

including its limitations• Tasks

• Given a scenario description, build a simple ontology in RDF Schema

72

Page 73: Linked Data Tutorial (Florianópolis)

Exercise 3. Domain description

• Un lugar puede ser un lugar de interés.• Los lugares de interés pueden ser lugares turísticos o establecimientos,

pero no las dos cosas a la vez.• Los lugares turísticos pueden ser palacios, iglesias, ermitas y

catedrales.• Los establecimientos pueden ser hoteles, hostales o albergues.• Un lugar está situado en una localidad, la cual a su vez puede ser una

villa, un pueblo o una ciudad.• Un lugar de interés tiene una dirección postal que incluye su calle y su

número.• Las localidades tienen un número de habitantes.• Las localidades se encuentran situadas en provincias.

• Covarrubias es un pueblo con 634 habitantes de la provincia de Burgos.• El restaurante “El Galo” está situado en Covarrubias, en la calle Mayor,

número 5.• Una de las iglesias de Covarrubias está en la calle de Santo Tomás.

73

Page 74: Linked Data Tutorial (Florianópolis)

Exercise 3. Sample resulting ontology

74

Page 75: Linked Data Tutorial (Florianópolis)

75

Index

• Resource Description Framework (RDF)- RDF primitives- Reasoning with RDF

• RDF Schema- RDF Schema primitives- Reasoning with RDFS

• RDF(S) Management APIs• SPARQL• OWL

Page 76: Linked Data Tutorial (Florianópolis)

Sample RDF APIs

• RDF libraries for different languages: - Java, Python, C, C++, C#, .Net, Javascript, Tcl/Tk, PHP, Lisp, Obj-C, Prolog,

Perl, Ruby, Haskell- List in http://esw.w3.org/topic/SemanticWebTools

• Usually related to a RDF repository

• Multilanguage:- Redland RDF Application Framework (C, Perl, PHP, Python and Ruby):

http://www.redland.opensource.ac.uk/

• Java:- Jena: http://jena.sourceforge.net/- Sesame: http://www.openrdf.org/

• PHP:- RAP - RDF API for PHP: http://www4.wiwiss.fu-berlin.de/bizer/rdfapi/

• Python:- RDFLib: http://rdflib.net/- Pyrple: http://infomesh.net/pyrple/

76

Page 77: Linked Data Tutorial (Florianópolis)

Jena

• Java framework for building Semantic Web applications

• Open source software from HP Labs• The Jena framework includes:

- A RDF API- An OWL API- Reading and writing RDF in RDF/XML, N3 and N-Triples- In-memory and persistent storage- A rule based inference engine- SPARQL query engine

77

Page 78: Linked Data Tutorial (Florianópolis)

Sesame

• A framework for storage, querying and inferencing of RDF and RDF Schema

• A Java Library for handling RDF• A Database Server for (remote) access to repositories of

RDF data• Highly expressive query and transformation languages

- SeRQL, SPARQL

• Various backends- Native Store- RDBMS (MySQL, Oracle 10, DB2, PostgreSQL)- main memory

• Reasoning support- RDF Schema reasoner- OWL DLP (OWLIM)- domain reasoning (custom rule engine)

78

Page 79: Linked Data Tutorial (Florianópolis)

Jena example. Graph creation

http://.../JohnSmith

John Smith

SmithJohn

vcard:Given vcard:Family

vcard:FN vcard:N

// some definitions String personURI = "http://somewhere/JohnSmith"; String givenName = "John"; String familyName = "Smith"; String fullName = givenName + " " + familyName; // create an empty Model Model model = ModelFactory.createDefaultModel(); // create the resource // and add the properties cascading style Resource johnSmith = model.createResource(personURI) .addProperty(VCARD.FN, fullName) .addProperty(VCARD.N, model.createResource() .addProperty(VCARD.Given, givenName) .addProperty(VCARD.Family, familyName));

79

Page 80: Linked Data Tutorial (Florianópolis)

Jena example. Read and write

// create an empty modelModel model = ModelFactory.createDefaultModel();

// use the FileManager to find the input fileInputStream in = FileManager.get().open( inputFileName );if (in == null) { throw new IllegalArgumentException("File not found");}

// read the RDF/XML filemodel.read(in, "");

// write it to standard outmodel.write(System.out);

<rdf:RDF

xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'

xmlns:vcard='http://www.w3.org/2001/vcard-rdf/3.0#'

>

<rdf:Description rdf:nodeID="A0">

<vcard:Family>Smith</vcard:Family>

<vcard:Given>John</vcard:Given>

</rdf:Description>

<rdf:Description rdf:about='http://somewhere/JohnSmith/'>

<vcard:FN>John Smith</vcard:FN>

<vcard:N rdf:nodeID="A0"/>

</rdf:Description>

...

</rdf:RDF>

80

Page 81: Linked Data Tutorial (Florianópolis)

Some RDF editors

• IsaViz- http://www.w3.org/2001/11/IsaViz/

• Morla- http://www.morlardf.net/

• RDFAuthor- http://rdfweb.org/people/damian/RDFAuthor/

• RdfGravity- http://semweb.salzburgresearch.at/apps/rdf-gravity/

• Rhodonite- http://rhodonite.angelite.nl/

81

Page 82: Linked Data Tutorial (Florianópolis)

Main References

• Brickley D, Guha RV (2004) RDF Vocabulary Description Language 1.0: RDF Schema. W3C Recommendation

http://www.w3.org/TR/PR-rdf-schema/• Lassila O, Swick R (1999) Resource Description

Framework (RDF) Model and Syntax Specification. W3C Recommendation

http://www.w3.org/TR/REC-rdf-syntax/• RDF validator:

http://www.w3.org/RDF/Validator/• RDF resources:

http://planetrdf.com/guide/

82

Page 83: Linked Data Tutorial (Florianópolis)

83

Index

• Resource Description Framework (RDF)- RDF primitives- Reasoning with RDF

• RDF Schema- RDF Schema primitives- Reasoning with RDFS

• RDF(S) Management APIs• SPARQL• OWL

Page 84: Linked Data Tutorial (Florianópolis)

84

RDF(S) query languages

• Languages developed to allow accessing datasets expressed in RDF(S) (and in some cases OWL)

• Supported by the most important language APIs- Jena (HP labs)- Sesame (Aduna)- Boca (IBM)- ...

• There are some differences wrt. languages like SQL, such as- Combination of different sources- Trust management- Open World Assumption

RelationalDB

Application

SQL queries

RDF(S)OWL

Application

SPARQL, RQL, etc., queries

Page 85: Linked Data Tutorial (Florianópolis)

85

Query types

• Selection and extraction- “Select all the essays, together with their authors and their authors’ names”- “Select everything that is related to the book ‘Bellum Civille’”

• Reduction: we specify what it should not be returned- “Select everything except for the ontological information and the book

translators”

• Restructuring: the original structure is changed in the final result- “Invert the relationship ‘author’ by ‘is author of’”

• Aggregation- “Return all the essays together with the mean number of authors per essay”

• Combination and inferences- “Combine the information of a book called ‘La guerra civil’ and whose

author is Julius Caesar with the book whose identifier is ‘Bellum Civille’”- “Select all the essays, together with its authors and author names”,

including also the instances of the subclasses of Essay- “Obtain the relationship ‘coauthor’ among persons who have written the

same book”

Page 86: Linked Data Tutorial (Florianópolis)

RDF(S) query language families

• SquishQL Family

- SquishQL- rdfDB Query Language- RDQL- BRQL- TriQL

• XPath, XSLT, XQuery

- XQuery for RDF- XsRQL- TreeHugger and RDFTwig- RDFT, Nexus Query

Language- RDFPath, Rpath and RXPath- Versa

• RQL Family

- RQL- SeRQL- eRQL

• Controlled natural language

- Metalog• Other

- Algae- iTQL- N3QL- PerlRDF Query Language- RDEVICE Deductive Language- RDFQBE- RDFQL- TRIPLE- WQLSPARQL

W3C Recommendation

15 January 2008

Triple database

Query structure

Description graphs Query

semantics

XML repository

Query syntax

86

Page 87: Linked Data Tutorial (Florianópolis)

SPARQL

• SPARQL Protocol and RDF Query Language• Supported by: Jena, Sesame, IBM Boca, etc.• Features

- It supports most of the aforementioned queries - It supports datatype reasoning (datatypes can be requested instead

of actual values)- The domain vocabulary and the knowledge representation

vocabulary are treated differently by the query interpreters- It allows making queries over properties with multiple values, over

multiple properties of a resource and over reifications- Queries can contain optional statements- Some implementations support aggregation queries

• Limitations- Neither set operations nor existential or universal quantifiers can be

included in the queries- It does not support recursive queries

87

Page 88: Linked Data Tutorial (Florianópolis)

SPARQL is also a protocol

• SPARQL is a Query Language …Find names and websites of contributors to PlanetRDF: PREFIX foaf: <http://xmlns.com/foaf/0.1/> SELECT ?name ?website FROM <http://planetrdf.com/bloggers.rdf> WHERE {

?person foaf:weblog ?website .?person foaf:name ?name . ?website a foaf:Document }

• ... and a Protocolhttp://.../qps?query-lang=http://www.w3.org/TR/rdf-sparql-query/ &graph-id=http://planetrdf.com/bloggers.rdf&query=PREFIXfoaf: <http://xmlns.com/foaf/0.1/...

• Services running SPARQL queries over a set of graphs • A transport protocol for invoking the service • Based on ideas from earlier protocol work such as Joseki • Describing the service with Web Service technologies

88

Page 89: Linked Data Tutorial (Florianópolis)

SPARQL Endpoints

• SPARQL protocol services- Enables users (human or other) to query a knowledge base using SPARQL- Results are typically returned in one or more machine-processable formats

• List of SPARQL Endpoints- http://esw.w3.org/topic/SparqlEndpoints

• Programmatic access using libraries:- ARC, RAP, Jena, Sesame, Javascript SPARQL, PySPARQL, etc.

• Examples:

Project Endpoint

DBpedia http://dbpedia.org/sparql

BBC Programmes and Music http://bbc.openlinksw.com/sparql/

data.gov http://semantic.data.gov/sparql

data.gov.uk http://data.gov.uk/sparql

Musicbrainz http://dbtune.org/musicbrainz/sparql

89

Page 90: Linked Data Tutorial (Florianópolis)

A simple SPARQL query

@prefix dc: <http://purl.org/dc/elements/1.1/> . @prefix : <http://example.org/book/> . :book1 dc:title "SPARQL Tutorial" .

title

"SPARQL Tutorial"

SELECT ?titleWHERE{ <http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> ?title .}

Data:

Query:

Query result:

• A pattern is matched against the RDF data • Each way a pattern can be matched yields a solution • The sequence of solutions is filtered by: Project, distinct, order, limit/offset • One of the result forms is applied: SELECT, CONSTRUCT, DESCRIBE, ASK

91

Page 91: Linked Data Tutorial (Florianópolis)

Graph patterns

• Basic Graph Patterns, where a set of triple patterns must match

• Group Graph Pattern, where a set of graph patterns must all match

• Optional Graph patterns, where additional patterns may extend the solution

• Alternative Graph Pattern, where two or more possible patterns are tried

• Patterns on Named Graphs, where patterns are matched against named graphs

92

Page 92: Linked Data Tutorial (Florianópolis)

@prefix foaf: <http://xmlns.com/foaf/0.1/> .

_:a foaf:name "Johnny Lee Outlaw" ._:a foaf:mbox <mailto:[email protected]> ._:b foaf:name "Peter Goodguy" ._:b foaf:mbox <mailto:[email protected]> ._:c foaf:mbox <mailto:[email protected]> .

PREFIX foaf: <http://xmlns.com/foaf/0.1/>SELECT ?name ?mboxWHERE { ?x foaf:name ?name . ?x foaf:mbox ?mbox }

name mbox

"Johnny Lee Outlaw" <mailto:[email protected]>

"Peter Goodguy" <mailto:[email protected]>

Multiple matches

93

Page 93: Linked Data Tutorial (Florianópolis)

@prefix dt: <http://example.org/datatype#> .@prefix ns: <http://example.org/ns#> .@prefix : <http://example.org/ns#> .@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

:x ns:p "cat"@en .:y ns:p "42"^^xsd:integer .:z ns:p "abc"^^dt:specialDatatype .

SELECT ?v WHERE { ?v ?p "cat" } v

SELECT ?v WHERE { ?v ?p "cat"@en }v

<http://example.org/ns#x>

SELECT ?v WHERE { ?v ?p 42 }v

<http://example.org/ns#y>

SELECT ?v WHERE { ?v ?p "abc"^^<http://example.org/datatype#specialDatatype> }

v

<http://example.org/ns#z>

Matching RDF literals

94

Page 94: Linked Data Tutorial (Florianópolis)

@prefix foaf: <http://xmlns.com/foaf/0.1/> .

_:a foaf:name "Alice" ._:b foaf:name "Bob" .

PREFIX foaf: <http://xmlns.com/foaf/0.1/>SELECT ?x ?nameWHERE { ?x foaf:name ?name }

x name

_:c "Alice"

_:d "Bob"

x name

_:r "Alice"

_:s "Bob"=

Blank node labels in query results

95

Page 95: Linked Data Tutorial (Florianópolis)

Group graph pattern

PREFIX foaf: <http://xmlns.com/foaf/0.1/>SELECT ?name ?mboxWHERE { { ?x foaf:name ?name . } { ?x foaf:mbox ?mbox . } }

SELECT ?xWHERE {}

PREFIX foaf: <http://xmlns.com/foaf/0.1/>SELECT ?name ?mboxWHERE { { ?x foaf:name ?name . } { ?x foaf:mbox ?mbox . FILTER regex(?name, "Smith")} }

96

Page 96: Linked Data Tutorial (Florianópolis)

Optional graph patterns

@prefix foaf: <http://xmlns.com/foaf/0.1/> .@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

_:a rdf:type foaf:Person ._:a foaf:name "Alice" ._:a foaf:mbox <mailto:[email protected]> ._:a foaf:mbox <mailto:[email protected]> .

_:b rdf:type foaf:Person ._:b foaf:name "Bob" .

PREFIX foaf: <http://xmlns.com/foaf/0.1/>SELECT ?name ?mboxWHERE { ?x foaf:name ?name . OPTIONAL { ?x foaf:mbox ?mbox } }

name mbox

"Alice" <mailto:[email protected]>

"Alice" <mailto:[email protected]>

“Bob"

97

Page 97: Linked Data Tutorial (Florianópolis)

Multiple optional graph patterns

@prefix foaf: <http://xmlns.com/foaf/0.1/> .

_:a foaf:name "Alice" ._:a foaf:homepage <http://work.example.org/alice/> .

_:b foaf:name "Bob" ._:b foaf:mbox <mailto:[email protected]> .

PREFIX foaf: <http://xmlns.com/foaf/0.1/>SELECT ?name ?mbox ?hpageWHERE { ?x foaf:name ?name . OPTIONAL { ?x foaf:mbox ?mbox } . OPTIONAL { ?x foaf:homepage ?hpage } }

name mbox hpage

"Alice" <http://work.example.org/alice/>

“Bob" <mailto:[email protected]>

98

Page 98: Linked Data Tutorial (Florianópolis)

Alternative graph patterns

@prefix dc10: <http://purl.org/dc/elements/1.0/> .@prefix dc11: <http://purl.org/dc/elements/1.1/> .

_:a dc10:title "SPARQL Query Language Tutorial" ._:a dc10:creator "Alice" ._:b dc11:title "SPARQL Protocol Tutorial" ._:b dc11:creator "Bob" ._:c dc10:title "SPARQL" ._:c dc11:title "SPARQL (updated)" .

PREFIX dc10: <http://purl.org/dc/elements/1.0/>PREFIX dc11: <http://purl.org/dc/elements/1.1/>SELECT ?titleWHERE { { ?book dc10:title ?title } UNION { ?book dc11:title ?title } }

title

"SPARQL Protocol Tutorial"

"SPARQL"

"SPARQL (updated)"

"SPARQL Query Language Tutorial"

SELECT ?x ?yWHERE { { ?book dc10:title ?x } UNION { ?book dc11:title ?y } }

SELECT ?title ?authorWHERE { { ?book dc10:title ?title . ?book dc10:creator ?author } UNION { ?book dc11:title ?title . ?book dc11:creator ?author }}

x y

"SPARQL (updated)"

"SPARQL Protocol Tutorial"

"SPARQL"

"SPARQL Query Language Tutorial"

author title

"Alice" "SPARQL Protocol Tutorial"

“Bob” "SPARQL Query Language Tutorial"

99

Page 99: Linked Data Tutorial (Florianópolis)

Patterns on named graphs

# Named graph: http://example.org/foaf/aliceFoaf@prefix foaf:<http://.../foaf/0.1/> .@prefix rdf:<http://.../1999/02/22-rdf-syntax-ns#> .@prefix rdfs:<http://.../2000/01/rdf-schema#> .

_:a foaf:name "Alice" ._:a foaf:mbox <mailto:[email protected]> ._:a foaf:knows _:b .

_:b foaf:name "Bob" ._:b foaf:mbox <mailto:[email protected]> ._:b foaf:nick "Bobby" ._:b rdfs:seeAlso <http://example.org/foaf/bobFoaf> .

<http://example.org/foaf/bobFoaf> rdf:type foaf:PersonalProfileDocument .

# Named graph: http://example.org/foaf/bobFoaf@prefix foaf:<http://.../foaf/0.1/> .@prefix rdf:<http://.../1999/02/22-rdf-syntax-ns#> .@prefix rdfs:<http://.../2000/01/rdf-schema#> .

_:z foaf:mbox <mailto:[email protected]> ._:z rdfs:seeAlso <http://example.org/foaf/bobFoaf> ._:z foaf:nick "Robert" .

<http://example.org/foaf/bobFoaf> rdf:type foaf:PersonalProfileDocument .

100

Page 100: Linked Data Tutorial (Florianópolis)

Patterns on named graphs II

PREFIX foaf: <http://xmlns.com/foaf/0.1/>PREFIX data: <http://example.org/foaf/>

SELECT ?nickFROM NAMED <http://example.org/foaf/aliceFoaf>FROM NAMED <http://example.org/foaf/bobFoaf>WHERE { GRAPH data:bobFoaf { ?x foaf:mbox <mailto:[email protected]> . ?x foaf:nick ?nick } }

nick

"Robert"

PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?src ?bobNickFROM NAMED <http://example.org/foaf/aliceFoaf>FROM NAMED <http://example.org/foaf/bobFoaf>WHERE { GRAPH ?src { ?x foaf:mbox <mailto:[email protected]> . ?x foaf:nick ?bobNick } }

src bobNick

<http://example.org/foaf/aliceFoaf>

"Bobby"

<http://example.org/foaf/bobFoaf> "Robert"

101

Page 101: Linked Data Tutorial (Florianópolis)

Restricting values

@prefix dc: <http://purl.org/dc/elements/1.1/> .@prefix : <http://example.org/book/> .@prefix ns: <http://example.org/ns#> .

:book1 dc:title "SPARQL Tutorial" .:book1 ns:price 42 .:book2 dc:title "The Semantic Web" .:book2 ns:price 23 .

PREFIX dc: <http://purl.org/dc/elements/1.1/>SELECT ?titleWHERE { ?x dc:title ?title FILTER regex(?title, "^SPARQL") }

title

"SPARQL Tutorial"

PREFIX dc: <http://purl.org/dc/elements/1.1/>SELECT ?titleWHERE { ?x dc:title ?title FILTER regex(?title, "web", "i" ) }

title

"The Semantic Web"

PREFIX dc: <http://purl.org/dc/elements/1.1/>PREFIX ns: <http://example.org/ns#>SELECT ?title ?priceWHERE { ?x ns:price ?price . FILTER (?price < 30.5) ?x dc:title ?title . }

title price

"The Semantic Web" 23

102

Page 102: Linked Data Tutorial (Florianópolis)

103

Value tests

• Based on XQuery 1.0 and XPath 2.0 Function and Operators

• XSD boolean, string, integer, decimal, float, double, dateTime

• Notation <, >, =, <=, >= and != for value comparisonApply to any type

• BOUND, isURI, isBLANK, isLITERAL • REGEX, LANG, DATATYPE, STR (lexical form) • Function call for casting and extensions functions

Page 103: Linked Data Tutorial (Florianópolis)

Solution sequences and modifiers

• Order modifier: put the solutions in order

• Projection modifier: choose certain variables

• Distinct modifier: ensure solutions in the sequence are unique

• Reduced modifier: permit elimination of some non-unique solutions

• Limit modifier: restrict the number of solutions

• Offset modifier: control where the solutions start from in the overall sequence of solutions

SELECT ?nameWHERE { ?x foaf:name ?name ; :empId ?emp }ORDER BY ?name DESC(?emp)

SELECT ?nameWHERE { ?x foaf:name ?name }

SELECT DISTINCT ?name WHERE { ?x foaf:name ?name }

SELECT REDUCED ?name WHERE { ?x foaf:name ?name }

SELECT ?name WHERE { ?x foaf:name ?name }ORDER BY ?nameLIMIT 5OFFSET 10

SELECT ?nameWHERE { ?x foaf:name ?name }LIMIT 20

104

Page 104: Linked Data Tutorial (Florianópolis)

SPARQL query forms

• SELECT- Returns all, or a subset of, the variables bound in a query

pattern match• CONSTRUCT

- Returns an RDF graph constructed by substituting variables in a set of triple templates

• ASK- Returns a boolean indicating whether a query pattern

matches or not• DESCRIBE

- Returns an RDF graph that describes the resources found

105

Page 105: Linked Data Tutorial (Florianópolis)

SPARQL query forms: SELECT

@prefix foaf: <http://xmlns.com/foaf/0.1/> .

_:a foaf:name "Alice" ._:a foaf:knows _:b ._:a foaf:knows _:c .

_:b foaf:name "Bob" .

_:c foaf:name "Clare" ._:c foaf:nick "CT" .

PREFIX foaf: <http://xmlns.com/foaf/0.1/>SELECT ?nameX ?nameY ?nickYWHERE { ?x foaf:knows ?y ; foaf:name ?nameX . ?y foaf:name ?nameY . OPTIONAL { ?y foaf:nick ?nickY } }

nameX nameY nickY

"Alice" "Bob"

"Alice" "Clare" "CT"

106

Page 106: Linked Data Tutorial (Florianópolis)

@prefix foaf: <http://xmlns.com/foaf/0.1/> .

_:a foaf:name "Alice" ._:a foaf:mbox <mailto:[email protected]> .

PREFIX foaf: <http://xmlns.com/foaf/0.1/>PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>

CONSTRUCT { <http://example.org/person#Alice> vcard:FN ?name }

WHERE { ?x foaf:name ?name }

@prefix vcard: <http://www.w3.org/2001/vcard-rdf/3.0#> .

<http://example.org/person#Alice> vcard:FN "Alice" .

Query result:

SPARQL query forms: CONSTRUCT

107

Page 107: Linked Data Tutorial (Florianópolis)

SPARQL query forms: ASK

@prefix foaf: <http://xmlns.com/foaf/0.1/> .

_:a foaf:name "Alice" ._:a foaf:homepage <http://work.example.org/alice/> .

_:b foaf:name "Bob" ._:b foaf:mbox <mailto:[email protected]> .

PREFIX foaf: <http://xmlns.com/foaf/0.1/>ASK { ?x foaf:name "Alice" }

yes

Query result:

108

Page 108: Linked Data Tutorial (Florianópolis)

PREFIX ent: <http://org.example.com/employees#>DESCRIBE ?x WHERE { ?x ent:employeeId "1234" }

@prefix foaf: <http://xmlns.com/foaf/0.1/> .@prefix vcard: <http://www.w3.org/2001/vcard-rdf/3.0> .@prefix exOrg: <http://org.example.com/employees#> .@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .@prefix owl: <http://www.w3.org/2002/07/owl#>

_:a exOrg:employeeId "1234" ; foaf:mbox_sha1sum "ABCD1234" ; vcard:N [ vcard:Family "Smith" ; vcard:Given "John" ] .

foaf:mbox_sha1sum rdf:type owl:InverseFunctionalProperty .

Query result:

SPARQL query forms: DESCRIBE

109

Page 109: Linked Data Tutorial (Florianópolis)

Main References

• Prud’hommeaux E, Seaborne A (2008) SPARQL Query Language for RDF. W3C Recommendation

http://www.w3.org/TR/rdf-sparql-query/• SPARQL validator:

http://www.sparql.org/validator.html• SPARQL implementations:

http://esw.w3.org/topic/SparqlImplementations• SPARQL Endpoints

http://esw.w3.org/topic/SparqlEndpoints• SPARQL in Dbpedia

http://dbpedia.org/sparql

110

Page 110: Linked Data Tutorial (Florianópolis)

111

Index

• Resource Description Framework (RDF)- RDF primitives- Reasoning with RDF

• RDF Schema- RDF Schema primitives- Reasoning with RDFS

• RDF(S) Management APIs• SPARQL• OWL

Page 111: Linked Data Tutorial (Florianópolis)

Description Logics

• A family of logic based Knowledge Representation formalisms- Descendants of semantic networks and KL-ONE- Describe domain in terms of concepts (classes), roles (relationships) and

individuals• Specific languages characterised by the constructors and axioms

used to assert knowledge about classes, roles and individuals.• Example: ALC (the least expressive language in DL that is

propositionally closed)• Constructors: boolean (and, or, not)• Role restrictions

• Distinguished by:- Model theoretic semantics

• Decidable fragments of FOL• Closely related to Propositional Modal & Dynamic Logics

- Provision of inference services• Sound and complete decision procedures for key problems• Implemented systems (highly optimised)

Page 112: Linked Data Tutorial (Florianópolis)

Structure of DL Ontologies

• A DL ontology can be divided into two parts:- Tbox (Terminological KB): a set of axioms that describe the

structure of a domain :• Doctor Person• Person Man Woman• HappyFather Man hasDescendant.(Doctor

hasDescendant.Doctor)- Abox (Assertional KB): a set of axioms that describe a

specific situation :• John HappyFather • hasDescendant (John, Mary)

Page 113: Linked Data Tutorial (Florianópolis)

Most common constructors in class definitions

• Intersection: C1 ... Cn Human Male• Union: C1 ... Cn Doctor Lawyer• Negation: C Male• Nominals: {x1} ... {xn} {john} ... {mary}• Universal restriction: P.C hasChild.Doctor• Existential restriction: P.C hasChild.Lawyer• Maximum cardinality: nP.C 3hasChild.Doctor• Minimum cardinality: nP.C 1hasChild.Male• Specific Value: P.{x} hasColleague.{Matthew}

• Nesting of constructors can be arbitrarily complex- Person hasChild.(Doctor hasChild.Doctor)

• Lots of redundancy- AB is equivalent to ( A B)- P.C is equivalent to P. C

Page 114: Linked Data Tutorial (Florianópolis)

OWL (1.0 and 1.1)

February 2004

Web Ontology Language

Built on top of RDF(S)

Three layers:- OWL Lite

- A small subset of primitives- Easier for frame-based tools to transition to

- OWL DL- Description logic- Decidable reasoning

- OWL Full- RDF extension, allows metaclasses

Several syntaxes:- Abstract syntax- Manchester syntax- RDF/XML

Page 115: Linked Data Tutorial (Florianópolis)

OWL 2 (I). New features

• October 2009

• New features- Syntactic sugar

• Disjoint union of classes- New expressivity

• Keys• Property chains• Richer datatypes, data ranges• Qualified cardinality restrictions• Asymmetric, reflexive, and disjoint properties• Enhanced annotation capabilities

• New syntax- OWL2 Manchester syntax

Page 116: Linked Data Tutorial (Florianópolis)

OWL 2 (II). Three new profiles

• OWL2 EL- Ontologies that define very large numbers of classes and/or properties, - Ontology consistency, class expression subsumption, and instance checking can

be decided in polynomial time.

• OWL2 QL- Sound and complete query answering is in LOGSPACE (more precisely, in AC0)

with respect to the size of the data (assertions),- Provides many of the main features necessary to express conceptual models

(UML class diagrams and ER diagrams). - It contains the intersection of RDFS and OWL 2 DL.

• OWL2 RL- Inspired by Description Logic Programs and pD*. - Syntactic subset of OWL 2 which is amenable to implementation using rule-

based technologies, and presenting a partial axiomatization of the OWL 2 RDF-Based Semantics in the form of first-order implications that can be used as the basis for such an implementation.

- Scalable reasoning without sacrificing too much expressive power. - Designed for

• OWL applications trading the full expressivity of the language for efficiency, • RDF(S) applications that need some added expressivity from OWL 2.

Page 117: Linked Data Tutorial (Florianópolis)

OWL: Most common constructors

Intersection: C1 ... Cn intersectionOf Human Male

Union: C1 ... Cn unionOf Doctor LawyerNegation: C complementOf Male

Nominals: {x1} ... {xn} oneOf {john} ... {mary}Universal restriction: P.C allValuesFrom hasChild.DoctorExistential restriction: P.C someValuesFrom hasChild.LawyerMaximum cardinality: nP[.C] maxCardinality (qualified or not) 3hasChild[.Doctor]Minimum cardinality: nP[.C] minCardinality (qualified or not) 1hasChild[.Male]Exact cardinality: =nP[.C] exactCardinality (qualified or not) =1hasMother[.Female]Specific Value: P.{x} hasValue hasColleague.{Matthew}Local reflexivity: -- hasSelf Narcisist Person hasSelf(loves)

Keys -- hasKey hasKey(Person, passportNumber, country)

Subclass C1 C2 subClassOf Human Animal Biped

Equivalence C1 C2 equivalentClass Man Human Male

Disjointness C1 C2 disjointWith, AllDisjointClasses Male Female DisjointUnion C C1 ... Cn and

Ci Cj forall i≠j disjointUnionOf Person DisjointUnionOf (Man, Woman)

Metaclasses and annotations on axioms are also valid in OWL2, and declarations of classes have to provided.

Full list available in reference specs and in the Quick Reference Guide: http://www.w3.org/2007/OWL/refcard

Page 118: Linked Data Tutorial (Florianópolis)

OWL: Most common constructors

Subproperty P1 P2 subPropertyOf hasDaughter hasChild

Equivalence P1 P2 equivalentProperty cost price

DisjointProperties P1 ... Pn disjointObjectProperties hasDaughter hasSon Inverse P1 P2- inverseOf hasChild hasParent-

Transitive P+ P TransitiveProperty ancestor+ ancestor

Functional 1P FunctionalProperty T 1hasMother

InverseFunctional 1P- InverseFunctionalProperty T 1hasPassportID-

Reflexive ReflexiveProperty

Irreflexive IrreflexiveProperty

Asymmetric AsymmetricProperty

Property chains P P1 o ... o Pn propertyChainAxiom hasUncle hasFather o hasBrother

Equivalence {x1} {x2} sameIndividualAs {oeg:OscarCorcho}{img:Oscar}

Different {x1} {x2} differentFrom, AllDifferent {john} {peter}

NegativePropertyAssertion NegativeDataPropertyAssertion {hasAge john 35}

NegativeObjectPropertyAssertion {hasChild john peter}

Besides, top and bottom object and datatype properties exist

Page 119: Linked Data Tutorial (Florianópolis)

Basic Inference Tasks

• Subsumption – check knowledge is correct (captures intuitions)- Does C subsume D w.r.t. ontology O? (in every model I of O, CI DI )

• Equivalence – check knowledge is minimally redundant (no unintended synonyms)- Is C equivalent to D w.r.t. O? (in every model I of O, CI = DI )

• Consistency – check knowledge is meaningful (classes can have instances)- Is C satisfiable w.r.t. O? (there exists some model I of O s.t. CI )

• Instantiation and querying- Is x an instance of C w.r.t. O? (in every model I of O, xI CI )- Is (x,y) an instance of R w.r.t. O? (in every model I of O, (xI,yI) RI )

• All reducible to KB satisfiability or concept satisfiability w.r.t. a KB

• Can be decided using highly optimised tableaux reasoners

Page 120: Linked Data Tutorial (Florianópolis)

Main References

Gómez-Pérez, A.; Fernández-López, M.; Corcho, O. Ontological Engineering. Springer Verlag. 2003

Capítulo 4: Ontology languages

W3C OWL Working Group (2009) OWL2 Web Ontology Language Document Overview. http://www.w3.org/TR/2009/REC-owl2-overview-20091027/ Dean M, Schreiber G (2004) OWL Web Ontology Language Reference. W3C Recommendation. http://www.w3.org/TR/owl-ref/

Baader F, McGuinness D, Nardi D, Patel-Schneider P (2003) The Description Logic Handbook: Theory, implementation and applications. Cambridge University Press, Cambridge, United Kingdom

Jena web site: http://jena.sourceforge.net/ Jena API: http://jena.sourceforge.net/tutorial/RDF_API/ Jena tutorials:http://www.ibm.com/developerworks/xml/library/j-jena/index.html

http://www.xml.com/pub/a/2001/05/23/jena.html

Pellet: http://clarkparsia.com/pellet RACER: http://www.racer-systems.com/ FaCT++: http://owl.man.ac.uk/factplusplus/ HermIT: http://hermit-reasoner.com/

Page 121: Linked Data Tutorial (Florianópolis)

122

Contents

• Introduction to Linked Data• Linked Data Foundations: RDF, RDF Schema,

SPARQL and OWL

Coffee break

• Linked Data publication- Methodological guidelines for Linked Data publication- RDB2RDF tools- Technical aspects of Linked Data publication

• Linked Data consumption

Page 122: Linked Data Tutorial (Florianópolis)

Methodological guidelines for Linked Data publication

• Motivation

• Related Work

• GeoLinkedData• Identification of the data sources• Vocabulary Development• Generation of the RDF data• Publication of the RDF data• Data cleansing• Linking the RDF data• Enable effective discovery

• Future Work

Page 123: Linked Data Tutorial (Florianópolis)

GeoLinkedData

• It is an open initiative whose aim is to enrich the Web of Data with Spanish geospatial data.

• This initiative has started off by publishing diverse information sources, such as National Geographic Institute of Spain (IGN-E) and National Statistics Institute (INE)

• http://geo.linkeddata.es

Page 124: Linked Data Tutorial (Florianópolis)

Motivation

» 99.171 % English» 0.019 % Spanish

Source:Billion Triples dataset at http://km.aifb.kit.edu/projects/btc-2010/Thanks to Aidan and Richard

The Web of Data is mainly for English speakers

Poor presence of Spanish

Page 125: Linked Data Tutorial (Florianópolis)

Related Work

Page 126: Linked Data Tutorial (Florianópolis)

127

Impact of Geo.linkeddata.es

• Número de tripletas en Español (July): 1.412.248 • Número de tripletas en Español (End august):

21.463.088

Asunción Gómez Pérez

Before geo.linkeddata.es

en 99,1712875

ja 0,463849377

fr 0,05447229

de 0,034225134

pl 0,02532934

it 0,021982542

es 0,019584648

After geo.linkeddata.es

en 94,18744941

es 5,044085342

ja 0,440538697

fr 0,051734793

de 0,032505155

pl 0,024056418

it 0,020877812

Page 127: Linked Data Tutorial (Florianópolis)

Process for Publishing Linked Data on the Web

Identificationof the data

sources

Vocabularydevelopment

Generationof the RDF Data

Publicationof the RDF data

Linking the RDF data

Data cleansing

Enable effective discovery

Page 128: Linked Data Tutorial (Florianópolis)

1. Identification and selection of the data sources

Identificationof the data

sources

Vocabularydevelopment

Generationof the RDF Data

Publicationof the RDF data

Linking the RDF data

Data cleansing

Enable effective discovery

Instituto GeográficoNacional

Instituto Nacionalde Estadística

Page 129: Linked Data Tutorial (Florianópolis)

130

1. Identification and selection of the data sources

• Instituto Geográfico Nacional (Geographic Spanish Institute)- Multilingual (Spanish,

Vasc, Gallician, Catalan)- Conceptualization

mistmatches- Granularity (scale

concept)- Textual information- Particularaties

• Longitude• latitude

• Instituto Nacional de Estadística (Statistic Spanish Institute)• Monolingual• Numerical information• Particularaties

• Geo (textual level)• Temporal

Asunción Gómez Pérez

Page 130: Linked Data Tutorial (Florianópolis)

1. Identification and selection of the data sources

• IGN-E

Page 131: Linked Data Tutorial (Florianópolis)

1. Identification and selection of the data sources

Industry Production Index

Province

Year

Page 132: Linked Data Tutorial (Florianópolis)

2. Vocabulary development

Identificationof the data

sources

Vocabularydevelopment

Generationof the RDF Data

Publicationof the RDF data

Linking the RDF data

Data cleansing

Enable effective discovery

http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/#whichvocabs

This

is no

t eno

ugh

Page 133: Linked Data Tutorial (Florianópolis)

2. Vocabulary development

• Features- Lightweight :

• Taxonomies and a few properties- Consensuated vocabularies

• To avoid the mapping problems- Multilingual

• Linked data are multilingual

• The NeOn methodology can help to - Re-enginer Non ontological resources into ontologies

• Pros: use domain terminology already consensuated by domain experts

- Withdraw in heavyweight ontologies those features that you don’t need

- Reuse existing vocabularies

134Asunción Gómez Pérez

Identificationof the data

sources

Vocabularydevelopment

Generationof the RDF Data

Publicationof the RDF data

Linking the RDF data

Data cleansing

Enable effective discovery

Page 134: Linked Data Tutorial (Florianópolis)

135

Knowledge Resources

Non Ontological Resource

Reuse

Non Ontological Resource

Reengineering

2

2

2

Non Ontological Resources

Thesauri

DictionariesGlossaries Lexicons

TaxonomiesClassification

Schemas

O. Localization

9

Ontology Support Activities: Knowledge Acquisition (Elicitation); Documentation; Configuration Management; Evaluation (V&V); Assessment

1,2,3,4,5,6,7,8, 9

Ontological Resource

Reengineering

4

4

4

O. Aligning

O. Merging

Alignments5

5

5

6

6

6

6

3

Ontological Resource

Reuse

3Ontological Resources

O. Repositories and Registries

FlogicRDF(S)

OWL

Ontology Design

Pattern Reuse

7

O. Design Patterns

Ontology Restructuring(Pruning, Extension,

Specialization, Modularization)

8

O. Specification O. Conceptualization O. ImplementationO. Formalization

1

RDF(S)

OWL

FlogicScheduling

Page 135: Linked Data Tutorial (Florianópolis)

Vocabulary development: Specification

• Content requirements: Identify the set of questions that the ontology should answer- Which one are the provinces in Spain?- Where are the beaches?- Where are the reservoirs?- Identify the production index in Madrid- Which one is the city with higher production index?- Give me Madrid latitude and altitude- ….

• Non-content requirements- The ontology must be in the four official Spanish languages

136Asunción Gómez Pérez

Page 136: Linked Data Tutorial (Florianópolis)

2. Lightweight Ontology Development

hasStatisticalData

on

Ontology

Specification

Legend

hydrOntology

4

FAO

FAO Geopolitical ontology

WGS84

4W3C Vocabulary

GML

4GML Specification

O. Statistics

SCOVO

O. Time

W3C Time

hasLat/Long

hasGeometry

hasLat/Long

hasGeometry

hasLocation/isLocated

Thesaurus

UNESCO

4EGM / ERM

GeoNames

scv:Dimension

scv:Itemscv:Dataset

WGS84 Geo Positioning:

an RDF vocabulary

hydrographical

phenomena (rivers, lakes,

etc.)

Ontology for OGC

Geography Markup

Language

Vocabulary for instants,

intervals, durations,

etc.

Names and international

code systems for territories

and groups

Following the INSPIRE (INfrastructure for SPatial InfoRmation in Europe) recommendation.hydrOntology,SCOVO, FAO Geopolitcal, WGS84, GML, and Time

Classes 33 33

Object Properties 44 44

Data Properties 318 318

reused

Page 137: Linked Data Tutorial (Florianópolis)

• Objetivos:

- INSPIRE intenta conseguir fuentes armonizadas de Información Geográfica para dar soporte a la formulación, implementación y evaluación de políticas comunitarias (Medio Ambiente, etc).

- Fuentes de Información Geográfica: Bases de datos de los Estados Miembros (UE) a nivel local, regional, nacional e internacional.

Contexto – Directiva INSPIRE

Luis Manuel Vilches Blázquez

Page 138: Linked Data Tutorial (Florianópolis)

INSPIRE - Anexos

Luis Manuel Vilches Blázquez

Page 139: Linked Data Tutorial (Florianópolis)

hydrOntology

• Existencia de gran diversidad de problemas (múltiples fuentes, heterogeneidad de contenido y estructuración, ambigüedad del lenguaje natural, etc.) en la información geográfica.

• Necesidad de un modelo compartido para solventar los problemas de armonización y estructuración de la información hidrográfica.

• hydrOntology es una ontología global de dominio desarrollada conforme a un acercamiento top-down.

- Recubrir la mayoría de los fenómenos representables cartográficamente asociados al dominio hidrográfico.

- Servir como marco de armonización entre los diferentes productores de información geo-espacial en el entorno nacional e internacional.

- Comenzar con los pasos necesarios para obtener una mejor organización y gestión de la información geográfica (hidrográfica).

Luis Manuel Vilches Blázquez

Page 140: Linked Data Tutorial (Florianópolis)

Fuentes

GEMET

Catálogos de fenómenos

BCN25

BCN200

EGM & ERM CC.AA.

Nomenclátor Geográfico Nacional

Tesauros y Bibliografía

WFD

Nomenclátor Conciso

Diccionarios yMonografías

FTT ADL Getty

Luis Manuel Vilches Blázquez

Page 141: Linked Data Tutorial (Florianópolis)

Criterios de estructuración

• Directiva Marco del Agua- Propuesta por Parlamento y Consejo de la UE- Lista de definiciones de fenómenos hidrográficos

• Proyecto SDIGER- Proyecto piloto INSPIRE- Dos cuencas, países e idiomas

• Criterios semánticos- Diccionarios geográficos- Diccionario de la Real Academia de la Lengua- WordNet- Wikipedia- Bibliografía de varias áreas de conocimiento

• Herencia: Estructuración actual de catálogos• Asesoramiento expertos en toponimia del IGN

Luis Manuel Vilches Blázquez

Page 142: Linked Data Tutorial (Florianópolis)

Modelización del dominio hidrográfico Nivel superior

Nivel inferior

Luis Manuel Vilches Blázquez

Page 143: Linked Data Tutorial (Florianópolis)

Implementación & Formalizacón

12

3

4

5

+150 conceptos (classes) , 47 tipos de relaciones (properties)

y 64 tipos de atributos (attribute types)

+ Pellet

Luis Manuel Vilches Blázquez

Page 144: Linked Data Tutorial (Florianópolis)

2. Vocabulary development: HydrOntology

145Asunción Gómez Pérez

Page 145: Linked Data Tutorial (Florianópolis)

3. Generation of RDF

• From the Data sources- Geographic information

(Databases)- Statistic information

(.xsl)- Geospatial information

• Different technologies for RDF generation- Reengineering patterns- R20 and ODEMapster- Annotation tools- Geometry generation

Identificationof the data

sources

Vocabularydevelopment

Generationof the RDF Data

Publicationof the RDF data

Linking the RDF data

Data cleansing

Enable effective discovery

Page 146: Linked Data Tutorial (Florianópolis)

3. Generation of the RDF Data

INE

NOR2O

ODEMapster

IGN

IGN

Geospatial column

Geometry2RDF

Page 147: Linked Data Tutorial (Florianópolis)

3. Generation of the RDF Data / instances

• NOR2O is a software library that implements the transformations proposed by the Patterns for Re-engineering Non-Ontological Resources (PR-NOR). Currently we have 16 PR-NORs.

• PR-NORs define a procedure that transforms a Non-Ontological Resource (NOR) components into ontology elements. http://ontologydesignpatterns.org/

NOR2O

· Classificationschemes

· Thesauri

· Lexicons

NOR2O

FAO Water classification· Classification scheme

· Path enumeration data model· Implemented in a database

Page 148: Linked Data Tutorial (Florianópolis)

Re-engineering Model for NORs

Ontology Forward

Engineering

Implementation

Formalization

Conceptua-

lization

Speci-

fication

Implementation

Design

Con-

ceptual

Transformation

Patterns for Re-engineeringNon-Ontological Resources

(PR-NOR)

Non-Ontological Resource Ontology

Requirements

NOR Reverse

Engineering

RDF(S)

Page 149: Linked Data Tutorial (Florianópolis)

PR-NOR library at the ODP PortalTechnological support

http://ontologydesignpatterns.org/wiki/Submissions:ReengineeringODPs

Page 150: Linked Data Tutorial (Florianópolis)

3. Generation of the RDF Data – NOR2O

Industry Production Index

Province

Year

NOR2O

Page 151: Linked Data Tutorial (Florianópolis)

hydrOntology & Bases de Datos

NGN1:25.000

multilingüe

© Migración de datos a la Web de los Datos - Enfoques, técnicas y herramientas

Luis Manuel Vilches Blázquez

Page 152: Linked Data Tutorial (Florianópolis)

3. Generation of the RDF Data – R2O & ODEMapster

• Creation of the R2O Mappings

Page 153: Linked Data Tutorial (Florianópolis)

3. Generation of the RDF Data – Geometry2RDF

Oracle STO UTIL package

SELECT TO_CHAR(SDO_UTIL.TO_GML311GEOMETRY(geometry)) AS Gml311Geometry

FROM "BCN200"."BCN200_0301L_RIO" cWHERE c.Etiqueta='Arroyo'

Page 154: Linked Data Tutorial (Florianópolis)

3. Generation of the RDF Data – Geometry2RDF

Page 155: Linked Data Tutorial (Florianópolis)

3. Generation of the RDF Data – Geometry2RDF

Page 156: Linked Data Tutorial (Florianópolis)

3. Generation of the RDF data – RDF graphs

• IGN INE

• So far• 7 RDF Named Graphs• 1.412.248 triples

BTN25 BCN200 IPI….

http://geo.linkeddata.es/dataset/IGN/BTN25 http://geo.linkeddata.es/dataset/IGN/BCN200 http://geo.linkeddata.es/dataset/INE/IPI

Page 157: Linked Data Tutorial (Florianópolis)

4. Publication of the RDF Data

SPARQL

Pubby

Linked DataHTML

Virtuoso 6.1.0

Pubby 0.3

Including ProvenanceSupport

Identificationof the data

sources

Vocabularydevelopment

Generationof the RDF Data

Publicationof the RDF data

Linking the RDF data

Data cleansing

Enable effective discovery

Page 158: Linked Data Tutorial (Florianópolis)

4. Publication of the RDF Data

Page 159: Linked Data Tutorial (Florianópolis)

4. Publication of the RDF Data - License

• License for GeoLinkedData- Creative Commons Attribution-ShareAlike 3.0 - GNU Free Documentation License

• Each dataset will have its own specific license, IGN, INE, etc.

Page 160: Linked Data Tutorial (Florianópolis)

5. Data cleansing

• Lack of documentation of the IGN datasets• Broken links: Spain, IGN resources• Lack of documentation of the ontology• Missing english and spanish labels• Building a spanish ontology and importing

some concepts of other ontology (in English):- Importing the English ontology. Add annotations

like a Spanish label to them.- Importing the English ontology, creating new

concepts and properties with a Spanish name and map those to the English equivalents.

- Re-declaring the terms of the English ontology that we need (using the same URI as in the English ontology), and adding a Spanish label.

- Creating your own class and properties that model the same things as the English ontology.

Identificationof the data

sources

Vocabularydevelopment

Generationof the RDF Data

Publicationof the RDF data

Linking the RDF data

Data cleansing

Enable effective discovery

Page 161: Linked Data Tutorial (Florianópolis)

5. Data cleansing

• URIs in Spanish• http://geo.linkeddata.es/ontology/Río

- RDF allows UTF-8 characters for URIs- But, Linked Data URIs has to be URLs as well- So, non ASCII-US characters have to be %code

• http://geo.linkeddata.es/ontology/R%C3%ADo

Page 162: Linked Data Tutorial (Florianópolis)

6. Linking of the RDF Data

• Silk - A Link Discovery Framework for the Web of Data

• First set of links: Provinces of Spain- 86% accuracy

GeoLinkedDataDBPedia Geonames

Identificationof the data

sources

Vocabularydevelopment

Generationof the RDF Data

Publicationof the RDF data

Linking the RDF data

Data cleansing

Enable effective discovery

Page 163: Linked Data Tutorial (Florianópolis)

6. Linking of the RDF Data

• http://geo.linkeddata.es/page/Provincia/Granada

164Asunción Gómez Pérez

Page 164: Linked Data Tutorial (Florianópolis)

7. Enable effective discovery

Identificationof the data

sources

Vocabularydevelopment

Generationof the RDF Data

Publicationof the RDF data

Linking the RDF data

Data cleansing

Enable effective discovery

Page 165: Linked Data Tutorial (Florianópolis)

DEMO

http://geo.linkeddata.es/

Page 166: Linked Data Tutorial (Florianópolis)

Provinces

Page 167: Linked Data Tutorial (Florianópolis)

Industry Production Index – Capital of Province

Page 168: Linked Data Tutorial (Florianópolis)

Rivers

Page 169: Linked Data Tutorial (Florianópolis)

Beaches

Page 170: Linked Data Tutorial (Florianópolis)

Future Work

• Generate more datasets from other domains, e.g. universities in Spain.

• Identify more links to DBPedia and Geonames.

• Cover complex geometrical information, i.e. not only Point and LineString-like data; we will also treat information representation through polygons.

Page 171: Linked Data Tutorial (Florianópolis)

172

Contents

• Introduction to Linked Data• Linked Data Foundations: RDF, RDF Schema,

SPARQL and OWL

Coffee break

• Linked Data publication- Methodological guidelines for Linked Data publication- RDB2RDF tools- Technical aspects of Linked Data publication

• Linked Data consumption

Page 172: Linked Data Tutorial (Florianópolis)

Ontology-based Access to DBs

1. Build a new ontology from 1 DB schema and 1 DB

2. Align the ontology built with approach 1 with a legacy ontology

3. Align an existing DB with a legacy ontology

a) Massive dump (semantic data warehouse)

b) Query-driven

4. Align an ontology network with n DB schemas and other data sources

a) Massive dump (semantic data warehouse)

b) Query-driven

new ontology

existing ontology

1 23

4

Page 173: Linked Data Tutorial (Florianópolis)

Ontology-based Access to Databases

BDR ModeloRelacional

Personal

Organización

Pregunta: Nombre de los profesores de la universidad

UPM

* Un profesor es una persona cuyo puesto es “docente”

* Una universidad es una organización de tipo “3”

Consulta: valores de la columna nombre de los registros de la tabla Personal para los que el valor de la columna puesto is

“docente” que estén relacionados con al menos un registro de la tabla Organización con el valor “3” en la columna tipo y “UPM” en

la columna nombre.

? Procesado de la consulta de acuerdo a la descripción formal

de correspondencia

Ontología

Profesor

Doctorando

Universidad

Procesador

Page 174: Linked Data Tutorial (Florianópolis)

Align data sources with legacy ontologies

Punto Europeo

PuntoGPS

PuntoAsiatico

=

Aeropuertos

f (Aeropuertos)

Ontología O2Ontología O2

Modelo Relacional M1Modelo Relacional M1

PuntoEuropeo

PuntoEspañol

Estación

CentroComunicaciones

Aeropuerto

=Aeropuerto

f (Aeropuertos)

Ontología O1Ontología O1

RC(O2,M1)RC(O1,M1)

Page 175: Linked Data Tutorial (Florianópolis)

R2O

• is a declarative language to specify mappings between relational data sources and ontologies.

RDB

Relational Model

Persons

Organization <xml>R2O Mapping

</xml>

Ontology

Professor

Student

University

Page 176: Linked Data Tutorial (Florianópolis)

Example: types of mappings needed

Attibute Direct Mapping

Attibute Mapping with transformation(Regular Expression)

Relation Mapping w. Transformation(Regular Expression)

Relation Mapping w. Transformation(Keyword search)

Page 177: Linked Data Tutorial (Florianópolis)

Population example (II)

The Operation element defines a transformation based on a regular

expression to be applied to the database column for extracting property values

Population example (II)

Page 178: Linked Data Tutorial (Florianópolis)

A view maps exactly one concept in the

ontology.

A subset of the columns in the view map a concept in

the ontology.

A subset (selection) of the records of a database view map a concept in the ontology.

A subset of the records of a database view map a concept in the onto. but the selection cannot be made using SQL.

A column in a database view maps directly an attribute or a relation.

A column in a database view maps an attribute or a relation after some transformation.

A set of columns in a database view map an attribute or a relation.

One or more concepts can be extracted from a single data field (not in 1NF).

For concepts...

For attributes...

R2O (Relational-to-Ontology) Language

Page 179: Linked Data Tutorial (Florianópolis)

R2O Basic Syntax

<conceptmap-def name="Customer"> <identified-by> Table key </identified-by>

<uri-as> operation </uri-as> <applies-if> condition </applies-if> <joins-via> expression </joins-via>

<documentation>description …</documentation> <described-by>attributes,relations</described-by>

</conceptmap-def>

<conceptmap-def name="Customer"> <identified-by> Table key </identified-by>

<uri-as> operation </uri-as> <applies-if> condition </applies-if> <joins-via> expression </joins-via>

<documentation>description …</documentation> <described-by>attributes,relations</described-by>

</conceptmap-def>

<attributemap-def name="http://esperonto/ff#Title"> <aftertransform>

<operation oper-id="constant"> <arg-restriction on-param="const-val">

<has-column>fsb_ajut.titol</has-column> </arg-restriction>

</operation> </aftertransform></attributemap-def>

<attributemap-def name="http://esperonto/ff#Title"> <aftertransform>

<operation oper-id="constant"> <arg-restriction on-param="const-val">

<has-column>fsb_ajut.titol</has-column> </arg-restriction>

</operation> </aftertransform></attributemap-def>

<relationmap-def name="http://esperonto/ff#isCandidateFor"> <to-concept name="http://esperonto/ff#FundOpp">

<joins-via> <operation oper-id=“equals">

<arg-restriction on-param="value1"> <has-column>fsb_ajut.id</has-column>

</arg-restriction> <arg-restriction on-param="value2">

<has-column>fsb_candidate.forFund</has-column> </arg-restriction>

</operation> </joins-via>

</relationmap-def>

<relationmap-def name="http://esperonto/ff#isCandidateFor"> <to-concept name="http://esperonto/ff#FundOpp">

<joins-via> <operation oper-id=“equals">

<arg-restriction on-param="value1"> <has-column>fsb_ajut.id</has-column>

</arg-restriction> <arg-restriction on-param="value2">

<has-column>fsb_candidate.forFund</has-column> </arg-restriction>

</operation> </joins-via>

</relationmap-def>

Page 180: Linked Data Tutorial (Florianópolis)

ODEMapster

generates RDF instances from relational instances based on the mapping description expressed in the R2O document

Page 181: Linked Data Tutorial (Florianópolis)

182

Mapping Design

• 3 Mapping Creation Steps

– Load Ontology– Load Database(s)– Create mapping

2 Usage Modes– Online mode (run

time query execution)

– Offline mode (materialized RDF

dump)

Page 182: Linked Data Tutorial (Florianópolis)

183

Contents

• Introduction to Linked Data• Linked Data Foundations: RDF, RDF Schema,

SPARQL and OWL

Coffee break

• Linked Data publication- Methodological guidelines for Linked Data publication- RDB2RDF tools- Technical aspects of Linked Data publication

• Linked Data consumption

Page 183: Linked Data Tutorial (Florianópolis)

Using an RDF repository

• It allows storing and accessing RDF data• For example, SESAME (http://www.openrdf.org/)• Download it from

http://www.openrdf.org/download.jsp- openrdf-sesame-2.3.0-sdk.zip- Deploy the .war in Tomcat (JDK and Tomcat needed)

• Create a repository at- http://localhost:8080/openrdf-sesame

• Check: - http://localhost:8080/openrdf-sesame/repositories/XXXX- http://localhost:8080/openrdf-sesame/repositories/XXX/state

ments

RDF repository(Sesame)

SPARQL

Page 184: Linked Data Tutorial (Florianópolis)

Linked Data frontend

• To expose data as Linked Data- Including content negotiation, etc.

• For example, Pubby- http://www4.wiwiss.fu-berlin.de/pubby/

• Installation- Use pubby-0.3.zip- Deploy the webapp folder (and rename)

in Tomcat- Modify config.n3- Restart tomcat- Check: http://localhost:8080/XXX/

RDF repository(Sesame)

LD Frontend(Pubby)

SPARQLquery

LDaccess

Page 185: Linked Data Tutorial (Florianópolis)

REST API

For example, RDF2Go- Java abstraction over RDF repositories- http://rdf2go.semweb4j.org/

RDF repository(Sesame)

REST API LD Frontend(Pubby)

RDF2Go

POSTfeedback

GETfeedback

SPARQLquery

LDaccess

Page 186: Linked Data Tutorial (Florianópolis)

Add SPARQL explorer

• For example, SNORQL (http://wiki.github.com/kurtjx/SNORQL/)

RDF repository(Sesame)

REST API LD Frontend(Pubby)

RDF2Go

POSTfeedback

GETfeedback

SPARQLquery

LDaccess

SPARQL explorer(SNORQL)

SPARQLVia Web

Page 187: Linked Data Tutorial (Florianópolis)

188

Validating Linked Data URLs

• http://validator.linkeddata.org/vapour

Page 188: Linked Data Tutorial (Florianópolis)

189

Contents

• Introduction to Linked Data• Linked Data Foundations: RDF, RDF Schema,

SPARQL and OWL

Coffee break

• Linked Data publication- Methodological guidelines for Linked Data publication- RDB2RDF tools- Technical aspects of Linked Data publication

• Linked Data consumption

Page 189: Linked Data Tutorial (Florianópolis)

RelFinder: finding relations in Linked Data

• E.g., relations between films- “Pulp Fiction”, “Kill Bill” y “Reservoir Dogs”

Page 190: Linked Data Tutorial (Florianópolis)

Exercise on data.gov.uk

- Public schools in London that contain the word “music”

Page 191: Linked Data Tutorial (Florianópolis)

Exercise: find information in DBPedia

http://dbpedia.org/resource/Darth_Vader)

(etc)

Find ficticious serial killers in DBPedia

Image by http://www.flickr.com/photos/bflv/

Page 192: Linked Data Tutorial (Florianópolis)

193

Designing URI sets for the Public Sector (UK)

• http://www.cabinetoffice.gov.uk/media/301253/puiblic_sector_uri.pdf

Page 193: Linked Data Tutorial (Florianópolis)

194

Asociación Española de Linked Data

Page 194: Linked Data Tutorial (Florianópolis)

Mini-curso sobre Linked Data

Oscar Corcho, Asunción Gómez Pérez ({ocorcho, asun}@fi.upm.es)

Universidad Politécnica de Madrid

Florianópolis, September 1st 2010(3º OntoBras 2010)

Credits: Raúl García Castro, Oscar Muñoz, Jose Angel Ramos Gargantilla, María del Carmen Suárez de Figueroa, Boris Villazón, Alex de León, Víctor Saquicela, Luis Vilches, Miguel Angel García, Manuel Salvadores, Juan Sequeda, Carlos Ruiz Moreno and many others

Work distributed under the license Creative Commons Attribution-Noncommercial-Share Alike 3.0