
Hideaki Takeda / National Institute of Informatics

General Introduction for Semantic Web and Linked Open Data

Hideaki Takeda
National Institute of Informatics
takeda@nii.ac.jp

2012 INTERNATIONAL ASIAN SUMMER SCHOOL IN LINKED DATA IASLOD 2012, August 13-17, 2012, KAIST, Daejeon, Korea

Semantic Web and Linked Data
• Semantic Web
  – What is Semantic Web
  – How to realize Semantic Web
    • Metadata
    • RDF
    • RDFS
    • OWL
• Linked Data
  – What is Linked Data?
  – The State-of-the-Art of Linked Data
    • Linking Open Data (LOD)
  – How to use Linked Data
    • Linked Data Browser
    • Linked Data Search Engine
    • Linked Data Applications
  – How to use RDF
    • RDFa
    • SPARQL

Semantic Web


The Aim of The Semantic Web
• "The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation."
  The Semantic Web, Scientific American, May 2001, Tim Berners-Lee, James Hendler and Ora Lassila
• The Semantic Web is a vision: the idea of having data on the web defined and linked in a way that it can be used by machines not just for display purposes, but for automation, integration and reuse of data across various applications.
  http://www.w3.org/2001/sw/

Semantic Web
• Realization of various kinds of information exchange via the Web
  – Automation
  – Integration
  – Re-use of data

Next Generation Web?
• Evolution of Web
  – HTML: Web for Display
  – XML: Web with Syntax
  – ?? : Web with Semantics
• Why should we embed semantics into Web?
  – From: Web for humans
  – To: Web for humans and machines (cf. Web for machines)

A brief introduction of XML
• Limitation of HTML
  – Chaos by mixture of displaying and text structures
    • e.g.,
      – <h3></h3> should be used for “the third-level heading”, but is often used just for bigger fonts
      – <b></b> specifies “bold”, not “emphasis”
  – Fixed Structure
    • e.g.,
      – If you need <h7></h7>...
      – I need a structure just for my data

<h1>A list of lectures</h1>
<h2>Knowledge Sharing Systems</h2>
<h3>Lecturer: Hideaki Takeda</h3>
<h3>Wednesday 3rd</h3>

XML
• XML (eXtensible Markup Language)
  – Can define original tags
  – Represent logical structures of data
    • DTD
  – Do not include style information
    • XSLT

<lecturelist>
  <lecture>
    <title id="1234">Knowledge Sharing Systems</title>
    <lecturer>Hideaki Takeda</lecturer>
    <schedule>
      <week>Wednesday</week>
      <time>3rd</time>
    </schedule>
  </lecture>
  ...
</lecturelist>

Why is XML not sufficient?
• What are specified by “person” and “name”?
• Are “name” and “名前” the same?
• Is this description sufficient as a description for “person”?
• …
• In short, syntax alone cannot solve these problems

<person>
  <name>Hideaki Takeda</name>
  <age>20</age>
</person>

<個人>
  <名前>Hideaki Takeda</名前>
  <年齢>20</年齢>
</個人>
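The point of this slide can be tried out with a standard XML parser. The following is a minimal sketch, assuming Python's built-in xml.etree.ElementTree; the helper names find_name and normalize and the SAME_AS table are hypothetical and written only for this illustration. To the parser, <name> and <名前> are simply different tags, and the two documents only line up once someone supplies the correspondence by hand.

```python
import xml.etree.ElementTree as ET

ENGLISH = "<person><name>Hideaki Takeda</name><age>20</age></person>"
JAPANESE = "<個人><名前>Hideaki Takeda</名前><年齢>20</年齢></個人>"


def find_name(xml_text, tag="name"):
    """Return the text of the first child called `tag`, or None if absent."""
    element = ET.fromstring(xml_text).find(tag)
    return element.text if element is not None else None


# Hypothetical hand-written mapping: XML itself carries no such information.
SAME_AS = {"名前": "name", "年齢": "age"}


def normalize(xml_text):
    """Rewrite child tags through SAME_AS so both documents share one vocabulary."""
    root = ET.fromstring(xml_text)
    return {SAME_AS.get(child.tag, child.tag): child.text for child in root}


print(find_name(ENGLISH))    # the English document has a <name> child
print(find_name(JAPANESE))   # the Japanese one does not, as far as the parser knows
print(normalize(ENGLISH) == normalize(JAPANESE))
```

The SAME_AS table is exactly the kind of shared vocabulary that the syntax alone does not provide.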


Architecture for the Semantic Web

Tim Berners-Lee   http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/


How to describe “meaning”?
• Need to describe “information on information”
  – “Meaning of something” is a description (“meaning”) attached to a description (“something”) in computers
  – Metadata
    • Data about data
• Need an architecture for common understanding
  – Syntax (language or scheme)
  – Vocabulary (ontology)

Metadata
• What is metadata?
  – Data about data
  – What one can say about any information object
• What is described as metadata?
  – Content relates to what the object contains or is about, and is intrinsic to an information object.
  – Context indicates the who, what, why, where, how aspects associated with the object's creation and is extrinsic to an information object.
  – Structure relates to the formal set of associations within or among individual information objects and can be intrinsic or extrinsic.

Setting the Stage, Anne J. Gilliland-Swetland, in Introduction to Metadata: Pathways to Digital Information, Murtha Baca (ed.), Getty Information Institute.

Metadata
• Metadata to individual information objects
  – Bibliography, Dublin Core
• Metadata to part or structure of information objects
  – Drawings, RDF, RDFS, OWL

[Figure: a tractor annotated with metadata. Whole object: Type: tractor; Owner: Taro; Product year: 2002. Parts: Body; Wheel; Axis: Connect body to wheel.]

A Layer model for Semantic Web
• RDF (Resource Description Framework)
  – The most primitive model for metadata description
    • SVO model
    • Entity-Relation Model
    • Semantic net
• RDF Schema
  – Addition of “concept” to RDF
    • class-subclass, constraints
• OWL
  – More general concept description language
    • Logical consistency
    • Various class expressions
    • Various constraints
• DAML-S
  – Descriptions on processes

Tim Berners-Lee   http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/

RDF (Resource Description Framework)

• A framework to describe metadata
• Separation of model and syntax
• W3C Recommendation (2004)

RDF Model
• Element
  – Resource:
    • URI (Uniform Resource Identifier)
    • Literal (string)
      – Need not be located on the Web
  – Property:
    • Attribute when describing resources
    • URI or Literal just as Resource
  – Statement: triple of resource, property, and value

RDF model
• Statement
  – Creator of http://www-kasm.nii.ac.jp/~takeda is “Hideaki Takeda”
• Structure
  – Resource (subject): http://www-kasm.nii.ac.jp/~takeda
  – Property (predicate): Creator
  – Value (object): “Hideaki Takeda”

[Figure: http://www-kasm.nii.ac.jp/~takeda (Resource) --Creator (Property)--> “Hideaki Takeda” (Value)]

RDF model
• Creator of http://www-kasm.nii.ac.jp/~takeda is http://www.nii.ac.jp/staffid/123456 which has name “Hideaki Takeda” and email “[email protected]”.

[Figure: http://www-kasm.nii.ac.jp/~takeda --Creator--> http://www.nii.ac.jp/staffid/123456
         http://www.nii.ac.jp/staffid/123456 --name--> “Hideaki Takeda”
         http://www.nii.ac.jp/staffid/123456 --email--> [email protected]]

RDF model
• Creator of http://www-kasm.nii.ac.jp/~takeda has name “Hideaki Takeda” and email “[email protected]”.

[Figure: http://www-kasm.nii.ac.jp/~takeda --Creator--> (node without a URI)
         (node without a URI) --name--> “Hideaki Takeda”
         (node without a URI) --email--> [email protected]]
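The two graphs on these slides differ only in whether the creator has a URI of its own. The following is a minimal sketch of both as plain Python tuples, assuming nothing beyond the standard library; the helpers blank_node, describe_creator and objects are hypothetical names for this illustration, and the "_:b1" style label is just a local placeholder for the node without a URI. The email property is left out.

```python
import itertools

_counter = itertools.count(1)


def blank_node():
    """Return a fresh local label for a resource that has no URI."""
    return f"_:b{next(_counter)}"


PAGE = "http://www-kasm.nii.ac.jp/~takeda"


def describe_creator(page, name, creator_uri=None):
    """Build (subject, property, value) statements about the creator of a page."""
    creator = creator_uri if creator_uri is not None else blank_node()
    return [
        (page, "Creator", creator),
        (creator, "name", name),
    ]


def objects(triples, subject, prop):
    """Return every value stated for the given subject and property."""
    return [o for s, p, o in triples if s == subject and p == prop]


named = describe_creator(PAGE, "Hideaki Takeda",
                         "http://www.nii.ac.jp/staffid/123456")
anonymous = describe_creator(PAGE, "Hideaki Takeda")

# Follow the graph in two steps: page -> creator -> name.
creator = objects(anonymous, PAGE, "Creator")[0]
print(creator, objects(anonymous, creator, "name"))
```

Both lists answer the same two-step question; only the named version gives other data sets something to link to.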


RDF syntax
• Creator of http://www-kasm.nii.ac.jp/~takeda is “Hideaki Takeda”

[Figure: http://www-kasm.nii.ac.jp/~takeda (Resource) --Creator (Property)--> “Hideaki Takeda” (Value)]

<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://dublincore.org/2001/08/14/dces#">
  <rdf:Description rdf:about="http://www-kasm.nii.ac.jp/~takeda">
    <dc:Creator>Hideaki Takeda</dc:Creator>
  </rdf:Description>
</rdf:RDF>

<rdf:RDF>
  <rdf:Description rdf:about="http://www-kasm.nii.ac.jp/~takeda">
    <dc:Creator rdf:resource="Hideaki Takeda" />
  </rdf:Description>
</rdf:RDF>
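Because the model is separate from the syntax, the statement can be read back out of the RDF/XML. The following is a minimal sketch, assuming Python's built-in xml.etree.ElementTree and a sample document that writes the subject as rdf:about; the function name extract_triples is hypothetical, and the code covers only the flat rdf:Description pattern shown on the slide, not RDF/XML in general.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://dublincore.org/2001/08/14/dces#">
  <rdf:Description rdf:about="http://www-kasm.nii.ac.jp/~takeda">
    <dc:Creator>Hideaki Takeda</dc:Creator>
  </rdf:Description>
</rdf:RDF>"""


def extract_triples(rdf_xml):
    """Return (resource, property, value) for each property element found."""
    triples = []
    root = ET.fromstring(rdf_xml)
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree writes tags as "{namespace}local"; joining the two
            # parts gives the full property URI.
            namespace, local = prop.tag[1:].split("}", 1)
            resource = prop.get(f"{{{RDF_NS}}}resource")
            value = resource if resource is not None else (prop.text or "").strip()
            triples.append((subject, namespace + local, value))
    return triples


for triple in extract_triples(RDF_XML):
    print(triple)
```

The prefix dc: disappears in the result: what the statement carries is the full property URI, which is what lets different documents agree on the property.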


RDFS (RDF Schema)
• Stronger knowledge representation model
  – RDF: ER model, semantic net
  – RDF Schema: Frame model, object-oriented paradigm
    • Minimal definition
    • Property-centered approach
• RDFS is defined as extension of RDF
• RDFS gives definitions of RDF descriptions

RDFS
• Class Definition
  – rdfs:Resource
  – rdfs:Class
  – rdf:Property
  – rdfs:ConstraintProperty
  – rdfs:Literal
• Property Definition
  – rdf:type
  – rdfs:subClassOf
  – rdfs:subPropertyOf
  – rdfs:comment
  – rdfs:label
  – rdfs:seeAlso
  – rdfs:isDefinedBy
• ConstraintProperty Definition
  – rdfs:range
  – rdfs:domain

Resource Description Framework (RDF) Schema Specification 1.0
http://www.w3.org/TR/2000/CR-rdf-schema-20000327/

[Figure: RDFS Structure by RDF]

RDF Schema
• rdfs:Class
• rdfs:subClassOf
  – Detailed class
  – Multiple
  – Transitivity
• rdf:type
  – Indicate an instance of a class
• rdf:Property
  – Attribute
  – Range: only one
  – Domain: multiple (or)
  – No cardinality
• rdfs:subPropertyOf
  – Detailed property
  – Transitivity

RDF Schema

<rdf:RDF xml:lang="en"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdfs:Class rdf:ID="Person">
    <rdfs:comment>The class of people.</rdfs:comment>
    <rdfs:subClassOf rdf:resource="http://www.w3.org/2000/03/example/classes#Animal"/>
  </rdfs:Class>
  <rdf:Property rdf:ID="maritalStatus">
    <rdfs:range rdf:resource="#MaritalStatus"/>
    <rdfs:domain rdf:resource="#Person"/>
  </rdf:Property>
  <rdf:Property rdf:ID="ssn">
    <rdfs:comment>Social Security Number</rdfs:comment>
    <rdfs:range rdf:resource="http://www.w3.org/2000/03/example/classes#Integer"/>
    <rdfs:domain rdf:resource="#Person"/>
  </rdf:Property>
  <rdf:Property rdf:ID="age">
    <rdfs:range rdf:resource="http://www.w3.org/2000/03/example/classes#Integer"/>
    <rdfs:domain rdf:resource="#Person"/>
  </rdf:Property>
  <rdfs:Class rdf:ID="MaritalStatus"/>
  <MaritalStatus rdf:ID="Married"/>
  <MaritalStatus rdf:ID="Divorced"/>
  <MaritalStatus rdf:ID="Single"/>
  <MaritalStatus rdf:ID="Widowed"/>
</rdf:RDF>

[Figure: graph of the schema above. Person is a subclass (s) of Animal and carries the rdfs:comment “The class of people”. The properties ssn, age and maritalStatus have domain (d) Person; ssn and age have range (r) Integer; maritalStatus has range (r) MaritalStatus. ssn carries the rdfs:comment “Social Security Number”. Married, Divorced, Single and Widowed are instances (t) of MaritalStatus. Legend: t = rdf:type, s = rdfs:subClassOf, d = rdfs:domain, r = rdfs:range; separate shapes mark classes, class instances and properties.]

Resource Description Framework (RDF) Schema Specification 1.0
http://www.w3.org/TR/2000/CR-rdf-schema-20000327/
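The transitivity of rdfs:subClassOf is what lets a program conclude more than the schema states directly. The following is a minimal sketch in plain Python, assuming the Person/Animal relation from the example; the Student class, the instance "taro" and the function names are hypothetical additions for this illustration, and the traversal is only a toy stand-in for an RDFS reasoner.

```python
def superclasses(cls, sub_class_of):
    """Return every class reachable from cls by following subClassOf links."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in sub_class_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Person -> Animal comes from the schema; Student is a hypothetical extra class.
SUB_CLASS_OF = {
    "Student": {"Person"},
    "Person": {"Animal"},
}

# Hypothetical instance data: "taro" has rdf:type Student.
TYPES = {"taro": "Student"}


def inferred_types(instance):
    """Return the stated type of an instance plus all inherited ones."""
    direct = TYPES[instance]
    return {direct} | superclasses(direct, SUB_CLASS_OF)


print(sorted(superclasses("Student", SUB_CLASS_OF)))
print(sorted(inferred_types("taro")))
```

Only one type is stated for "taro"; the other two follow from the class hierarchy.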


OWL (Web Ontology Language)
• More general knowledge representation
• Based on Description Logics
• Features
  – Class
    • Necessary condition / necessary and sufficient condition
    • Class expression:
      – Constraint by property
        » Like slot definition of a class
        » Type constraint (all/some), cardinality, typed cardinality
      – Logical operation of classes: union, intersection, negation
  – Property
    • Multiple ranges and domains
    • Specifying meta-property
  – Import of definitions

Linked Data


Linked Data
• What is Linked Data?
• The State-of-the-Art of Linked Data
  – Linking Open Data (LOD)
• How to use Linked Data
  – Linked Data Browser
  – Linked Data Search Engine
  – Linked Data Applications
• How to use RDF
  – RDFa
  – SPARQL


Architecture for the Semantic Web

Tim Berners-Lee   http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/

The world of instances (Linked Data)

The world of classes (Ontologies)


Layers of Semantic Web
• Ontology
  – Descriptions on classes
  – RDFS, OWL
  – Challenges for ontology building
    • Ontology building is difficult by nature
      – Consistency, comprehensiveness, logicality
    • Alignment of ontologies is more difficult

Tim Berners-Lee   http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/

[Figure labels: Ontology = descriptions on classes; Linked Data = descriptions on instances]

Layers of Semantic Web
• Linked Data
  – Descriptions on instances (individuals)
  – RDF + (RDFS, OWL)
  – Pros for Linked Data
    • Easy to write (mainly fact description)
    • Easy to link (fact to fact link)
  – Cons for Linked Data
    • Difficult to describe complex structures
    • Still need for class description (-> ontology)

Tim Berners-Lee   http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/

[Figure labels: Ontology = descriptions on classes; Linked Data = descriptions on instances]

Linked Data
• Linked Data is “Web of Data”
  – Data published as RDF
  – Can be referred to from outside
• The four rules for Linked Data

Linked Data
• The four rules for Linked Data
  – Use URIs as names for things
    • Give a URI to every object in the world!
  – Use HTTP URIs so that people can look up those names.
    • Don’t use URN
  – When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
    • Provide machine-readable data for URI
  – Include links to other URIs, so that they can discover more things.
    • Make data linked together just like Web

Linked Data, TBL, http://www.w3.org/DesignIssues/LinkedData.html
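Rules 2 and 3 can be pictured from the client side: a program takes the HTTP URI of a thing and asks the server for a machine-readable description. The following is a minimal sketch, assuming Python's standard urllib and a server that honours the Accept header; the function name build_lookup_request, the chosen media types and the example URI http://dbpedia.org/resource/Tokyo are assumptions made for this illustration. The request is only constructed here, not sent.

```python
from urllib.parse import urlparse
from urllib.request import Request

# Ask for RDF serializations rather than an HTML page (an assumed preference order).
RDF_ACCEPT = "application/rdf+xml, text/turtle;q=0.9"


def build_lookup_request(uri):
    """Prepare an HTTP request that asks for RDF describing the named thing."""
    scheme = urlparse(uri).scheme
    if scheme not in ("http", "https"):
        # Rule 2: a name such as urn:ISBN:... cannot be looked up over HTTP.
        raise ValueError(f"not an HTTP URI, cannot be looked up: {uri}")
    return Request(uri, headers={"Accept": RDF_ACCEPT})


request = build_lookup_request("http://dbpedia.org/resource/Tokyo")
print(request.full_url, request.get_header("Accept"))
```

Passing the request to urllib.request.urlopen would perform the actual lookup; what comes back depends on the server.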


Linking Open Data (LOD)
• The project to collect published Linked Data
• Major Linked Data
• (Translated from the original resources)
  – DBpedia (Wikipedia): 270 Million Triples
  – Geonames: Geo names and their latitudes and longitudes, 93 Million Triples
  – MusicBrainz: Music
  – WordNet: Dictionary
  – DBLP bibliography: Bibliography for technical papers, 28 Million Triples
  – US Census Data: 1 Billion Triples
• (Crawling)
  – FOAF (Friend Of A Friend)
• (Wrapper)
  – Flickr Wrapper

LOD Cloud (Linking Open Data)

http://dbpedia.org/page/Tokyo


http://en.wikipedia.org/wiki/Tokyo


How to use Linked Data

[Figure labels: Things; Linked Data Browser; Linked Data Mashup; Linked Data Search Engine]


Linked Data Browser
• Browse linked data just as browsing web pages
  – Show RDF data
  – Prompt links to follow
• System/Service
  – Marbles
    • Display data by following links
  – Tabulator
    • Firefox plugin/online
    • Adding information in a single page
  – Sig.ma
    • Showing RDF resources which can be operated

Tabulator


Linked Data Search Engine
• Search RDF data with crawled data set
  – Swoogle
  – Sindice
  – Watson

http://sindice.com/


How to use Linked Data
• Semantic Data Mash-up Applications
  – SemaPlorer
    • http://btc.isweb.uni-koblenz.de/
  – DBpedia Mobile
    • http://wiki.dbpedia.org/DBpediaMobile
  – Bio2RDF
    • http://bio2rdf.org/

DBpedia Mobile


Bio2RDF
• Search LOD in bioscience
• Translate data into RDF if it is not already in RDF

Bio2RDF


RDFa
• Add extra structured content to the (X)HTML pages
  – Adds new (X)HTML/XML attributes
    • “RDF in attributes”
  – Programs can extract those and turn into RDF
  – Flexibility for using Literals and URI resources

Principles of RDFa
• RDF contents are defined through XML attributes (no elements)
• XML/HTML tree structure is used
• Various attributes are defined by RDFa
  – Some attributes (@href, @rel) are also reused
• The text content can be also reused

Examples

<div xmlns:dc="http://purl.org/dc/elements/1.1/">
  <h2 property="dc:title">The trouble with Bob</h2>
  <h3 property="dc:creator">Alice</h3>
  ...
</div>

Page URL: http://example.com/alice/posts/trouble_with_bob

In N3:

<http://example.com/alice/posts/trouble_with_bob>
  <http://purl.org/dc/elements/1.1/title> "The trouble with Bob" ;
  <http://purl.org/dc/elements/1.1/creator> "Alice" .
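How a program gets from the markup to the statements can be sketched for the simplest case. The following is a minimal illustration, assuming Python's built-in html.parser; the class name PropertyExtractor is hypothetical, and the code handles only @property on elements with plain text content, with the page URL as the subject. It is not a conforming RDFa processor (no @about, @rel, @typeof or prefix scoping).

```python
from html.parser import HTMLParser

PAGE_URL = "http://example.com/alice/posts/trouble_with_bob"

HTML = """<div xmlns:dc="http://purl.org/dc/elements/1.1/">
  <h2 property="dc:title">The trouble with Bob</h2>
  <h3 property="dc:creator">Alice</h3>
</div>"""


class PropertyExtractor(HTMLParser):
    """Collect (subject, property URI, literal) from @property attributes."""

    def __init__(self, subject):
        super().__init__()
        self.subject = subject
        self.prefixes = {}    # filled from xmlns:* attributes as they appear
        self.pending = None   # property URI waiting for its text content
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        for name, value in attributes.items():
            if name.startswith("xmlns:"):
                self.prefixes[name[len("xmlns:"):]] = value
        curie = attributes.get("property")
        if curie and ":" in curie:
            prefix, local = curie.split(":", 1)
            if prefix in self.prefixes:
                self.pending = self.prefixes[prefix] + local

    def handle_data(self, data):
        text = data.strip()
        if self.pending and text:
            self.triples.append((self.subject, self.pending, text))
            self.pending = None


extractor = PropertyExtractor(PAGE_URL)
extractor.feed(HTML)
extractor.close()
for triple in extractor.triples:
    print(triple)
```

The text shown to the human reader is reused as the literal value, so the title and author are written only once in the page.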

<div xmlns:dc="http://purl.org/dc/elements/1.1/">
  <div about="/alice/posts/trouble_with_bob">
    <h2 property="dc:title">The trouble with Bob</h2>
    <h3 property="dc:creator">Alice</h3>
    ...
  </div>
  <div about="/alice/posts/jos_barbecue">
    <h2 property="dc:title">Jo's Barbecue</h2>
    <h3 property="dc:creator">Eve</h3>
    ...
  </div>
  ...
</div>

<div about="/alice/posts/trouble_with_bob">
  <h2 property="dc:title">The trouble with Bob</h2>
  The trouble with Bob is that he takes much better photos than I do:
  <div about="http://example.com/bob/photos/sunset.jpg">
    <img src="http://example.com/bob/photos/sunset.jpg" />
    <span property="dc:title">Beautiful Sunset</span>
    by <span property="dc:creator">Bob</span>.
  </div>
</div>

<div typeof="foaf:Person" xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <p property="foaf:name">Alice Birpemswick</p>
  <p>Email: <a rel="foaf:mbox" href="mailto:[email protected]">[email protected]</a></p>
  <p>Phone: <a rel="foaf:phone" href="tel:+1-617-555-7332">+1 617.555.7332</a></p>
</div>

<div xmlns:foaf="http://xmlns.com/foaf/0.1/" about="#me" rel="foaf:knows">
  <ul>
    <li typeof="foaf:Person">
      <a property="foaf:name" rel="foaf:homepage" href="http://example.com/bob">Bob</a>
    </li>
    <li typeof="foaf:Person">
      <a property="foaf:name" rel="foaf:homepage" href="http://example.com/eve">Eve</a>
    </li>
    <li typeof="foaf:Person">
      <a property="foaf:name" rel="foaf:homepage" href="http://example.com/manu">Manu</a>
    </li>
  </ul>
</div>

Using RDFa
• RDF Validator
  – http://validator.w3.org/
• RDF Distiller
  – http://www.w3.org/2007/08/pyRdfa/

<http://example.org/john-d/> <http://xmlns.com/foaf/0.1/primaryTopic> <http://example.org/john-d/#me> .
<http://example.org/john-d/> <http://purl.org/dc/elements/1.1/creator> "Jonathan Doe"@en .
<http://example.org/john-d/#me> <http://xmlns.com/foaf/0.1/nick> "John D"@en .
<http://example.org/john-d/#me> <http://xmlns.com/foaf/0.1/interest> <http://www.neubauten.org/> .
<http://example.org/john-d/#me> <http://xmlns.com/foaf/0.1/interest> <urn:ISBN:0752820907> .
<urn:ISBN:0752820907> <http://purl.org/dc/elements/1.1/title> "Weaving the Web"@en .
<urn:ISBN:0752820907> <http://purl.org/dc/elements/1.1/creator> "Tim Berners-Lee"@en .

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"
    "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:foaf="http://xmlns.com/foaf/0.1/"
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      version="XHTML+RDFa 1.0" xml:lang="en">
  <head>
    <title>John's Home Page</title>
    <base href="http://example.org/john-d/" />
    <meta property="dc:creator" content="Jonathan Doe" />
    <link rel="foaf:primaryTopic" href="http://example.org/john-d/#me" />
  </head>
  <body about="http://example.org/john-d/#me">
    <h1>John's Home Page</h1>
    <p>My name is <span property="foaf:nick">John D</span> and I like
      <a href="http://www.neubauten.org/" rel="foaf:interest" xml:lang="de">Einstürzende Neubauten</a>.
    </p>
    <p>
      My <span rel="foaf:interest" resource="urn:ISBN:0752820907">favorite book is the inspiring
        <span about="urn:ISBN:0752820907">
          <cite property="dc:title">Weaving the Web</cite> by
          <span property="dc:creator">Tim Berners-Lee</span></span>
      </span>
    </p>
  </body>
</html>
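The triple listing above is simple enough to be read line by line once it has been extracted. The following is a minimal sketch, assuming Python's standard re module; the function name parse_ntriples and the tuple layout are hypothetical choices for this illustration, and the pattern covers only what appears in the listing (URI subjects and predicates, URI or plain literal objects with an optional language tag), not blank nodes, datatypes or escapes.

```python
import re

# subject, predicate, then either <uri> or "literal" with an optional @lang.
TRIPLE = re.compile(
    r'^<([^>]*)>\s+<([^>]*)>\s+(?:<([^>]*)>|"([^"]*)"(?:@([A-Za-z-]+))?)\s*\.$'
)

SAMPLE = """
<http://example.org/john-d/#me> <http://xmlns.com/foaf/0.1/nick> "John D"@en .
<http://example.org/john-d/#me> <http://xmlns.com/foaf/0.1/interest> <urn:ISBN:0752820907> .
<urn:ISBN:0752820907> <http://purl.org/dc/elements/1.1/title> "Weaving the Web"@en .
"""


def parse_ntriples(text):
    """Turn each non-empty line into (subject, predicate, object)."""
    triples = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = TRIPLE.match(line)
        if match is None:
            raise ValueError(f"unsupported statement: {line!r}")
        subject, predicate, uri, literal, lang = match.groups()
        obj = ("uri", uri) if uri is not None else ("literal", literal, lang)
        triples.append((subject, predicate, obj))
    return triples


triples = parse_ntriples(SAMPLE)

# Which things is <#me> interested in?
ME = "http://example.org/john-d/#me"
INTEREST = "http://xmlns.com/foaf/0.1/interest"
print([o for s, p, o in triples if s == ME and p == INTEREST])
```

Keeping URIs and literals apart in the result matters, because only a URI object can be followed to find more statements about it.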


Summary

• Linked Data is the practical application of Semantic Web
  – The bottom-up approach
  – Postpone the ontology issue
• A technological solution for data sharing
  – Data science
  – Open Government

