Ontologies & Linked Open Data A brief overview and some real-world applications João Rocha da Silva December 2013 [email protected]


DESCRIPTION

A brief presentation I made as an invited lecture.


Page 1: Ontologies & linked open data

Ontologies & Linked Open Data

A brief overview and some real-world applications

João Rocha da Silva

December 2013

[email protected]

Page 2: Ontologies & linked open data

Contents

• Ontologies: the importance of semantics in the data storage and querying layer

• Popular ontologies: DCTerms, FOAF

• The Semantic Web in practice: Linked Open Data in the Facebook API and in DBpedia

• Relational vs. graph: differences

• The SPARQL language: examples

• A non-relational database: OpenLink Virtuoso

Page 3: Ontologies & linked open data

The importance of semantics

Page 4: Ontologies & linked open data

The importance of semantics

• How does someone understand the meaning of the columns in a relational database?

• Reading a lot of documentation

• Hard to provide information to external systems

• Tailor-made web services required!

Page 5: Ontologies & linked open data

SAP (one of 78,826 tables and counting) source : http://scn.sap.com/thread/1743542

Page 6: Ontologies & linked open data

MediaWiki source http://upload.wikimedia.org/wikipedia/commons/thumb/4/42/MediaWiki_1.20_%2844edaa2%29_database_schema.svg/2500px-MediaWiki_1.20_%2844edaa2%29_database_schema.svg.png

Page 7: Ontologies & linked open data

MediaWiki source http://upload.wikimedia.org/wikipedia/commons/thumb/4/42/MediaWiki_1.20_%2844edaa2%29_database_schema.svg/2500px-MediaWiki_1.20_%2844edaa2%29_database_schema.svg.png

Now imagine we want to store images of different kinds, each with different attributes…

Page 8: Ontologies & linked open data

The importance of semantics

• Building a query over such a system is complex

• Requires knowledge of its intricate and subtle aspects

• Some columns even contain flags for business logic processing (o_O)

• Bad design decisions = “spaghetti code”

Page 9: Ontologies & linked open data

Relational vs. Ontology

Page 10: Ontologies & linked open data
Page 11: Ontologies & linked open data

SELECT employee.id AS employee_id,
       engineer.id AS engineer_id,
       manager.id AS manager_id,
       employee.name AS employee_name,
       employee.type AS employee_type,
       engineer.engineer_info AS engineer_engineer_info,
       manager.manager_data AS manager_manager_data
FROM employee
LEFT OUTER JOIN engineer ON employee.id = engineer.id
LEFT OUTER JOIN manager ON employee.id = manager.id
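The query above can be tried against a small in-memory database. The following is a minimal sketch, assuming a hypothetical three-table schema guessed from the column names in the query; the sample rows are made up, and an ORDER BY clause is added only to make the output order predictable.

```python
import sqlite3

# Hypothetical schema, inferred from the column names used in the query.
SCHEMA = """
CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, type TEXT);
CREATE TABLE engineer (id INTEGER PRIMARY KEY REFERENCES employee(id),
                       engineer_info TEXT);
CREATE TABLE manager  (id INTEGER PRIMARY KEY REFERENCES employee(id),
                       manager_data TEXT);
"""

QUERY = """
SELECT employee.id AS employee_id,
       engineer.id AS engineer_id,
       manager.id AS manager_id,
       employee.name AS employee_name,
       employee.type AS employee_type,
       engineer.engineer_info AS engineer_engineer_info,
       manager.manager_data AS manager_manager_data
FROM employee
LEFT OUTER JOIN engineer ON employee.id = engineer.id
LEFT OUTER JOIN manager ON employee.id = manager.id
ORDER BY employee.id
"""


def load_employees():
    """Create the tables, insert made-up rows, and run the join."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO employee VALUES (?, ?, ?)",
        [(1, "Alice", "engineer"), (2, "Bob", "manager")],
    )
    conn.execute("INSERT INTO engineer VALUES (1, 'compilers')")
    conn.execute("INSERT INTO manager VALUES (2, 'team A')")
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


rows = load_employees()
for row in rows:
    print(row)
```

Every row carries a column for every subtype, and the columns that do not apply come back as NULL (None in Python). Each additional employee subtype would mean one more table and one more join.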

Page 12: Ontologies & linked open data
Page 13: Ontologies & linked open data

Building the “U.Porto” Ontology

Page 14: Ontologies & linked open data

[Diagram: up, a hypothetical ontology for U.Porto]

Classes: foaf:Person, up:Student, up:PhDStudent, up:Faculty, org:Organization, up:Thesis, rdfs:Literal

Properties: rdfs:subClassOf, org:memberOf (see http://www.w3.org/TR/vocab-org/), up:thesis, dc:title

Page 15: Ontologies & linked open data

Representing a person

Page 16: Ontologies & linked open data

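[Diagram: the resource http://www.fe.up.pt/~pro11004 and its properties]

http://www.fe.up.pt/~pro11004   foaf:name      “João Rocha”

http://www.fe.up.pt/~pro11004   rdf:type       up:PhDStudent

http://www.fe.up.pt/~pro11004   org:memberOf   http://www.fe.up.pt/

See http://www.w3.org/TR/rdf-schema/ and http://www.foaf-project.org/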
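The description of a person can be written down as plain subject, predicate, object triples. The sketch below expands the prefixed names into full URIs and prints N-Triples-style lines. The rdf, foaf and org namespace URIs are the published ones; the up namespace URI is an assumption, since the slides never give it.

```python
# Namespace URIs for the prefixes used on the slides.
# The "up" namespace is hypothetical: its real URI is not given.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "org": "http://www.w3.org/ns/org#",
    "up": "http://example.org/up#",
}

PERSON = "http://www.fe.up.pt/~pro11004"


def expand(term):
    """Expand a prefixed name such as foaf:name into a full URI."""
    prefix, _, local = term.partition(":")
    return PREFIXES[prefix] + local


def person_triples():
    """The three statements about the person, as (subject, predicate, object)."""
    return [
        (PERSON, expand("foaf:name"), '"João Rocha"'),
        (PERSON, expand("rdf:type"), expand("up:PhDStudent")),
        (PERSON, expand("org:memberOf"), "http://www.fe.up.pt/"),
    ]


def to_ntriples(triples):
    """Serialize triples one per line; literals keep their quotes, URIs get <>."""
    lines = []
    for s, p, o in triples:
        obj = o if o.startswith('"') else f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


ntriples = to_ntriples(person_triples())
print(ntriples)
```

Adding a new attribute to the person is just one more tuple in the list; no schema has to change.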

Page 17: Ontologies & linked open data

Getting all the students

SELECT ?uri ?attribute ?value
FROM <http://myorganization.com/data>
WHERE {
  ?uri rdf:type up:Student .
  ?uri ?attribute ?value
}

• Will fetch all the students, regardless of their type

• Will also return their attributes (“database columns”)

• Different types of students will have different attributes
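What the two triple patterns do can be emulated over an in-memory list of triples. This is only an illustration of the matching logic, not how a SPARQL engine is implemented. The data is made up, the ex: prefix is hypothetical, and the sketch assumes every student carries an explicit up:Student type (or that inference supplies it).

```python
# Made-up triples written with prefixed names for brevity.
TRIPLES = [
    ("ex:ana", "rdf:type", "up:Student"),
    ("ex:ana", "foaf:name", "Ana"),
    ("ex:rui", "rdf:type", "up:Student"),
    ("ex:rui", "foaf:name", "Rui"),
    ("ex:rui", "up:thesis", "ex:thesis1"),
    ("ex:feup", "rdf:type", "org:Organization"),
]


def students_with_attributes(triples):
    """Emulate: ?uri rdf:type up:Student . ?uri ?attribute ?value"""
    # First pattern: bind ?uri to every subject typed as up:Student.
    student_uris = {
        s for s, p, o in triples if p == "rdf:type" and o == "up:Student"
    }
    # Second pattern: collect every (attribute, value) pair of those subjects.
    result = {uri: [] for uri in student_uris}
    for s, p, o in triples:
        if s in student_uris:
            result[s].append((p, o))
    return result


students = students_with_attributes(TRIPLES)
for uri in sorted(students):
    print(uri, students[uri])
```

Here ex:rui has a up:thesis attribute that ex:ana lacks, and neither the query nor the data layout had to anticipate that difference.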

Page 18: Ontologies & linked open data

How does the system know that a manager is also an employee?

Inference

http://docs.openlinksw.com/virtuoso/rdfsparqlrule.html

The inference engine recognizes certain properties and builds “virtual triples” in the background
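The idea of "virtual triples" can be sketched with two RDFS entailment rules applied until nothing new appears. This is a naive illustration, not how Virtuoso implements inference. The class hierarchy (PhDStudent under Student under Person) is one plausible reading of the U.Porto diagram, and ex:joao is a hypothetical resource.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def infer(triples):
    """Return only the new ("virtual") triples entailed by two RDFS rules."""
    asserted = set(triples)
    known = set(asserted)
    while True:
        new = set()
        for s, p, o in known:
            for s2, p2, o2 in known:
                # rdfs11: rdfs:subClassOf is transitive.
                if p == SUBCLASS and p2 == SUBCLASS and o == s2:
                    new.add((s, SUBCLASS, o2))
                # rdfs9: an instance of a class is also an instance
                # of each of its superclasses.
                if p == TYPE and p2 == SUBCLASS and o == s2:
                    new.add((s, TYPE, o2))
        if new <= known:
            # Fixpoint reached: nothing was derived that we did not know.
            return known - asserted
        known |= new


ASSERTED = [
    ("up:PhDStudent", SUBCLASS, "up:Student"),
    ("up:Student", SUBCLASS, "foaf:Person"),
    ("ex:joao", TYPE, "up:PhDStudent"),
]

virtual = infer(ASSERTED)
for triple in sorted(virtual):
    print(triple)
```

Only one type was asserted for ex:joao, yet a query for up:Student or foaf:Person would now match it. The same mechanism answers the manager and employee question, given a statement that the manager class is a subclass of the employee class.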

Page 19: Ontologies & linked open data

Inference is good

• Transitive properties (subclass of subclass…)

• Subclasses

• Multiple inheritance handling (Student + Researcher + ScholarshipHolder)

Saves coding time spent writing complex queries

Page 20: Ontologies & linked open data

Nothing comes for free

• NO referential integrity or foreign keys!

• Aggregation operators are slow

• Transactions are not supported in standard SPARQL

• (“SPARQL 1.1 Query/Update Services should be atomic but that they are not required to be atomic.”)

• Graph DBMS Solutions are in early stages (many bugs, many “beta”s, many mailing lists…)
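The first point in the list above can be made concrete with a small contrast. In this sketch, SQLite stands in for a relational database and a plain Python set stands in for a triple store; both are assumptions made for illustration, and the table, column and ex: names are hypothetical.

```python
import sqlite3


def relational_rejects_dangling_reference():
    """Return True if the database refuses a row pointing at a missing parent."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE organization (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE person (id INTEGER PRIMARY KEY, "
        "org_id INTEGER REFERENCES organization(id))"
    )
    try:
        # Organization 42 does not exist, so this insert should fail.
        conn.execute("INSERT INTO person VALUES (1, 42)")
        return False
    except sqlite3.IntegrityError:
        return True
    finally:
        conn.close()


def graph_accepts_dangling_reference():
    """Return True if a triple pointing at an undescribed resource is stored."""
    triples = set()
    # Nothing describes ex:org42, yet the triple is stored without complaint.
    triples.add(("ex:joao", "org:memberOf", "ex:org42"))
    described = {s for s, _, _ in triples}
    return "ex:org42" not in described and len(triples) == 1


print(relational_rejects_dangling_reference())
print(graph_accepts_dangling_reference())
```

In the graph case, checking that ex:org42 actually exists becomes the application's job.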

Page 21: Ontologies & linked open data

However

• Graph databases allow for flexible, intuitive representations of the data

• They handle billions of triples

• Restriction-based querying makes queries more high-level

Page 22: Ontologies & linked open data

Query examples

Page 23: Ontologies & linked open data

DBpedia

PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
PREFIX dbprop: <http://dbpedia.org/property/>

SELECT DISTINCT ?s ?almaMater
WHERE {
  ?s dbpedia-owl:almaMater ?almaMater .
  ?s dbprop:knownFor ?knownFor .
  FILTER regex(?knownFor, "Facebook", "i")
  ?s dbprop:occupation ?occupation .
  FILTER regex(?occupation, "CEO", "i")
}
LIMIT 100

“Find Facebook’s CEO and the university where he studied”

Try it at http://dbpedia.org/sparql
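Besides the web form, a SPARQL endpoint can be queried over HTTP: the SPARQL Protocol passes the query text in a `query` parameter. The sketch below only builds the request and does not send it; the placeholder query and the choice of JSON results are assumptions made for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://dbpedia.org/sparql"  # the endpoint named on the slide


def build_request(query, endpoint=ENDPOINT):
    """Build (but do not send) a SPARQL Protocol GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard JSON serialization of SELECT results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


# Placeholder query; any SELECT query text can be passed instead.
request = build_request("SELECT ?s WHERE { ?s ?p ?o } LIMIT 1")
print(request.full_url)

# To actually run it (requires network access):
#   import json, urllib.request
#   with urllib.request.urlopen(request) as response:
#       results = json.load(response)
```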

Page 24: Ontologies & linked open data

DBpedia

SELECT DISTINCT ?car ?manufacturer
WHERE {
  ?car rdf:type dbpedia-owl:Automobile .
  ?car dbpedia-owl:layout <http://dbpedia.org/resource/Front-engine,_rear-wheel-drive_layout> .
  ?car dbpedia-owl:productionStartYear ?startYear .
  FILTER ( ?startYear < "1990-01-01"^^xsd:date )
  FILTER ( ?startYear > "1980-01-01"^^xsd:date )
  ?car <http://dbpedia.org/ontology/manufacturer> ?manufacturer .
  {
    SELECT DISTINCT ?manufacturer
    WHERE {
      ?car dbpedia-owl:manufacturer ?manufacturer .
      ?manufacturer <http://dbpedia.org/property/location> ?location .
      FILTER regex(?location, "Japan", "i")
    }
  }
}
LIMIT 100

“Find all fun (aka rear-wheel-drive) cars from the eighties, made by Japanese manufacturers”

Try it at http://dbpedia.org/sparql
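When a SELECT query like this one is run programmatically, endpoints can return the bindings in the SPARQL 1.1 Query Results JSON Format. The sketch below turns such a document into plain rows. The sample document is hand-written and its URIs are placeholders, not real DBpedia results.

```python
import json

# Hand-written sample in the SPARQL 1.1 Query Results JSON Format.
# The URIs are placeholders, not real DBpedia results.
SAMPLE = """
{
  "head": {"vars": ["car", "manufacturer"]},
  "results": {"bindings": [
    {"car": {"type": "uri", "value": "http://example.org/car/1"},
     "manufacturer": {"type": "uri", "value": "http://example.org/maker/A"}},
    {"car": {"type": "uri", "value": "http://example.org/car/2"},
     "manufacturer": {"type": "uri", "value": "http://example.org/maker/A"}}
  ]}
}
"""


def rows_from_results(text):
    """Return one tuple per solution, ordered like the SELECT variables."""
    doc = json.loads(text)
    variables = doc["head"]["vars"]
    return [
        # A variable may be unbound in a solution; represent that as None.
        tuple(b[v]["value"] if v in b else None for v in variables)
        for b in doc["results"]["bindings"]
    ]


rows = rows_from_results(SAMPLE)
for car, manufacturer in rows:
    print(car, manufacturer)
```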

Page 25: Ontologies & linked open data

Custom query

• What do you want to know?

Page 26: Ontologies & linked open data

Virtuoso, a graph database

Page 27: Ontologies & linked open data

Conclusions

• Relational databases

Mature, robust, support transactions

Hard to model entities with dynamic attributes

Complex querying

• Graph Databases

Recent technology

Handle billions of triples

Higher-level querying, more abstract

Page 28: Ontologies & linked open data

João Rocha da Silva is an Informatics Engineering PhD student at the Faculty of Engineering of the University of Porto. He specializes in research data management, applying the latest Semantic Web technologies to the adequate preservation and discovery of research data assets.

He is experienced in many programming languages and frameworks (JavaScript with Node.js, PHP with MVC frameworks, Ruby on Rails, J2EE, etc.) running on the major operating systems (everyday Mac user). Regardless of language, he is a quick learner who adapts effectively to new technologies.

He is also an experienced freelance iOS developer with several apps published on the App Store, and a self-taught DIY mechanic with a special interest in classic cars, particularly his 1987 Toyota Corolla GT Twin Cam, also known as Hachi-Roku or AE86.

Research Data Management and Semantic Web Researcher, Web & iPhone Developer

João Rocha da Silva

[email protected]