Linked (Open) Data VU Web Engineering / TU Wien May 27th 2013 Bernhard Haslhofer


DESCRIPTION

Guest lecture on open data, linked data, and the basics of linked open data, held at the Technical University of Vienna


Page 1: Linked (Open) Data

Linked (Open) Data

VU Web Engineering / TU Wien, May 27th 2013

Bernhard Haslhofer

Page 2: Linked (Open) Data

About me

• Since 03/2013: Postdoc @ University of Vienna
• Previously
  – Lecturer & Postdoc @ Cornell University, NY, USA
  – Univ. Ass. @ University of Vienna
  – …
  – WINF TU Wien 2003, INF TU Wien 2006

Page 3: Linked (Open) Data

About me

• Research Interests
  – Web information systems
  – Globally connected, Web-based data networks
    • Structured Web Data (Linked Data, schema.org, (FB) Open Graph Protocol, etc.)
    • Knowledge Graphs (e.g., DBpedia, Freebase)
    • Annotations / Semantic Tagging
    • Quality in Open Data Networks
    • …

Page 4: Linked (Open) Data

My teaching philosophy

• A course is a collaborative experience

• Instructor provides
  – Structure
  – Foundation for learning

• Students
  – Engage, contribute, challenge
  – Ask questions!
  – Think critically!
  – Disagree if appropriate!

Aren’t we beyond that?

Page 5: Linked (Open) Data

My plan for today…

• Linked (Open) Data ???
• Linked Data – Intro & Overview
• Linked Data - Technologies
• Recent Trends and Developments
• Questions / Discussion

Page 6: Linked (Open) Data

Open Data

“Open data is data that can be freely used, reused and redistributed by anyone - subject only, at most, to the requirement to attribute and sharealike.” (Open Data Handbook, 2012, Open Knowledge Foundation)

Page 7: Linked (Open) Data

“Open” Data Definition

• Availability and Access
  – Data must be available as a whole and at no more than a reasonable reproduction cost, preferably by downloading over the internet
  – Data must also be available in a convenient and modifiable form
• Reuse and Redistribution
  – Data must be provided under terms that permit reuse and redistribution, including the intermixing with other datasets
• Universal Participation
  – Everyone must be able to use, reuse and redistribute (no discrimination)
  – No ‘non-commercial’ restrictions

(http://opendefinition.org/okd/)

Page 8: Linked (Open) Data

Open Data Movement

Source: http://www.flickr.com/photos/jamescridland/613445810/sizes/l/in/photostream/

Page 9: Linked (Open) Data

Questions

• Why should the open data principles sound familiar to software engineers?
• Any known “open data” examples?

Page 10: Linked (Open) Data

Open Government Data Examples

Page 11: Linked (Open) Data

Open Government Data Examples

Page 12: Linked (Open) Data

Open Government Data Examples

Page 13: Linked (Open) Data

Open Government Data Examples

Page 14: Linked (Open) Data

Open Government Data Apps

Page 15: Linked (Open) Data

Open (Government) Data Apps

Page 16: Linked (Open) Data

Open Government Data in Journalism

Page 17: Linked (Open) Data

(Open) Data Journalism

Page 18: Linked (Open) Data

Open Data in Science

Page 19: Linked (Open) Data

Open Data in Science

Page 20: Linked (Open) Data

Linked Data

“A method of publishing structured data so that it can be interlinked and become more useful. It builds upon standard Web technologies such as HTTP, RDF and URIs, but rather than using them to serve web pages for human readers, it extends them to share information in a way that can be read automatically by computers. This enables data from different sources to be connected and queried.” [Bizer, Heath, Berners-Lee 2009]

Page 21: Linked (Open) Data

Linked Open Data

Open Data + Linked Data = Linked Open Data

Page 22: Linked (Open) Data

My plan for today…

• Linked (Open) Data ???
• Linked Data – Intro & Overview
• Linked Data - Technologies
• Recent Trends and Developments
• Questions / Discussion

Page 23: Linked (Open) Data

Linked Data context...

http://www.youtube.com/watch?v=5Cb3ik6zP2I

Page 24: Linked (Open) Data

Why Linked Data?

Page 25: Linked (Open) Data

Why Linked Data?

Page 26: Linked (Open) Data

Why Linked Data?

Page 27: Linked (Open) Data

Web Architecture

Page 28: Linked (Open) Data

Web Architecture

• A set of simple standards
  – Uniform global addressing (URI)
  – Uniform document encoding (HTML)
  – Uniform transportation (HTTP)
• Hyperlinks connecting documents
• Works pretty well for accessing and exchanging documents

Page 29: Linked (Open) Data

But sometimes we need to access the underlying structured data.

Page 30: Linked (Open) Data

Web Services and Web APIs

Source: http://www.blogperfume.com/new-27-circular-social-media-icons-in-3-sizes/

Page 31: Linked (Open) Data

Web Services and Web APIs

• Each Web API has a proprietary interface
• Data sources must be known in advance
• Information entities (papers, authors, subjects, etc.) are often not linked

Page 32: Linked (Open) Data

Social Networking Sites as Walled Gardens by David Simonds

Page 33: Linked (Open) Data

Linked Data Vision

• Publish and link structured data on the Web
• Create a single globally connected data space based on the Web Architecture

Page 34: Linked (Open) Data

Web of Linked Data

• A set of simple standards
  – Uniform global addressing (URI)
  – Uniform data model (RDF)
  – Uniform transportation (HTTP)
• RDF links connecting entities
• Forms a global data space and facilitates accessing and exchanging data

Page 35: Linked (Open) Data

What is Linked Data?

• A method to build a Web of Data
• Architectural style, set of standards

Page 36: Linked (Open) Data

Linking Open Data Project

• A W3C community project with the goal to extend the Web with a data commons by publishing various open data sets as RDF on the Web and by setting links between data items from different sources

Page 37: Linked (Open) Data
Page 38: Linked (Open) Data
Page 39: Linked (Open) Data
Page 40: Linked (Open) Data
Page 41: Linked (Open) Data
Page 42: Linked (Open) Data

~$ curl -I -H "Accept: text/turtle" http://dbpedia.org/resource/The_Shining_\(film\)
~$ curl -H "Accept: text/turtle" http://dbpedia.org/data/The_Shining_\(film\).ttl

~$ sudo apt-get install raptor2-utils   # Linux (Debian/Ubuntu); this package provides the rapper tool
~$ brew install raptor                  # Mac OS X
~$ rapper http://dbpedia.org/resource/The_Shining_\(film\)

Page 43: Linked (Open) Data
Page 44: Linked (Open) Data
Page 45: Linked (Open) Data
Page 46: Linked (Open) Data
Page 47: Linked (Open) Data
Page 48: Linked (Open) Data
Page 49: Linked (Open) Data
Page 50: Linked (Open) Data

My plan for today…

• Linked (Open) Data ???
• Linked Data – Intro & Overview
• Linked Data - Technologies
• Recent Trends and Developments
• Questions / Discussion

Page 51: Linked (Open) Data

Web / REST Basics - Recap

• Key Architectural Web Components
  – Identification: URI
  – Interaction: HTTP
  – Standardized Document Formats: HTML, XML, JSON, etc.

Page 52: Linked (Open) Data

Web / REST Basics - Recap

• URIs identify interesting things
  – documents on the Web
  – relevant aspects of a data set
  – phone numbers, Skype usernames, e-mail addresses
• HTTP URIs name and address resources in Web-based systems

Page 53: Linked (Open) Data

Web / REST Basics - Recap

• A resource can have several representations
• Representations can be in any format
  – HTML
  – XML
  – JSON
  – …

[Figure: the URI http://example.com/someURI identifies a resource with three representations: plain text (text/plain), HTML (text/html), and JSON (text/json)]

Page 54: Linked (Open) Data

Web / REST Basics - Recap

• We deal with resource representations
  – not the resources themselves (pass by value)
  – representations can be in any format (defined by media type)
• Each resource implements a standard uniform interface (HTTP)
  – a small set of verbs applied to a large set of nouns
  – verbs are universal and not invented on a per-application basis

[Figure: a client and a server exchange resource representations (e.g., JSON) through the uniform interface; on the server, logical resources map to physical resources]

Page 55: Linked (Open) Data

Web / REST Basics - Recap

• HTML, XHTML, ...: display information
• XML, JSON, ...: transport and store data

Page 56: Linked (Open) Data

Web / REST Basics - Recap

• Example Web Service operations:
  – Publish image on Flickr
  – Order a book at Amazon
  – Post a message on your friend’s Facebook wall
  – Update user photo on foursquare

[Figure: Application A and Application B communicate over the Web through an API]

Page 57: Linked (Open) Data

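RDF

• A data model for representing data on the Web
• Several statements (triples) form a graph

[Figure: example RDF graph with the following statements]

http://dbpedia.org/resource/The_Shining_(film)  rdfs:label  "The Shining (film)"
http://dbpedia.org/resource/The_Shining_(film)  rdfs:label  "闪灵 (电影)"
http://dbpedia.org/resource/The_Shining_(film)  rdf:type  http://dbpedia.org/ontology/Film
http://dbpedia.org/resource/The_Shining_(film)  dbpprop:starring  http://dbpedia.org/resource/Jack_Nicholson
http://dbpedia.org/resource/Jack_Nicholson  rdf:type  http://xmlns.com/foaf/0.1/Person
http://dbpedia.org/resource/Jack_Nicholson  dbpedia-owl:birthDate  "1937-04-22"
http://dbpedia.org/resource/Jack_Nicholson  foaf:name  "Jack Nicholson"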
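To make the triple idea concrete, here is a minimal Python sketch that holds a few of the statements from the slide as (subject, predicate, object) tuples and prints them in an N-Triples-like form. The `Literal` class and the helper names `term`, `to_ntriples` and `neighbours` are made up for this illustration; a real application would more likely use an RDF library.

```python
# Namespace URIs for the prefixes used on the slide.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
FOAF = "http://xmlns.com/foaf/0.1/"
DBR = "http://dbpedia.org/resource/"
DBO = "http://dbpedia.org/ontology/"
DBP = "http://dbpedia.org/property/"


class Literal(str):
    """Hypothetical marker type: a plain string value, as opposed to a URI."""


# Each statement is one (subject, predicate, object) triple.
TRIPLES = [
    (DBR + "The_Shining_(film)", RDFS + "label", Literal("The Shining (film)")),
    (DBR + "The_Shining_(film)", RDF + "type", DBO + "Film"),
    (DBR + "The_Shining_(film)", DBP + "starring", DBR + "Jack_Nicholson"),
    (DBR + "Jack_Nicholson", RDF + "type", FOAF + "Person"),
    (DBR + "Jack_Nicholson", FOAF + "name", Literal("Jack Nicholson")),
]


def term(value):
    """Format one term: "literal" in quotes, URI in angle brackets."""
    if isinstance(value, Literal):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'
    return "<" + value + ">"


def to_ntriples(triples):
    """Serialize triples one statement per line, each ending in ' .'."""
    return "\n".join(" ".join(term(t) for t in triple) + " ." for triple in triples)


def neighbours(triples, node):
    """Return the outgoing (predicate, object) edges of a node in the graph."""
    return [(p, o) for s, p, o in triples if s == node]


if __name__ == "__main__":
    print(to_ntriples(TRIPLES))
```

Because the object of one statement (Jack_Nicholson) is the subject of others, following `neighbours` from node to node walks the graph.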

Page 58: Linked (Open) Data

RDF/XML, N3, Turtle, etc.

• Data formats for RDF resource representations
• Used to transfer RDF data between apps

Page 59: Linked (Open) Data

RDFS

• A language for describing the syntax and semantics of schemas/vocabularies in a machine-understandable way

[Figure: http://dbpedia.org/ontology/Film  rdfs:subClassOf  http://dbpedia.org/ontology/Work]
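One thing a schema statement such as rdfs:subClassOf buys you is simple inference: anything typed as a Film is also a Work. The following Python sketch shows that idea on prefixed names written as plain strings. It is an illustration under those assumptions, not a full RDFS reasoner, and the function names are invented for this example.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Schema statement from the slide plus one instance statement.
TRIPLES = {
    ("dbpedia-owl:Film", SUBCLASS_OF, "dbpedia-owl:Work"),
    ("dbpedia:The_Shining_(film)", RDF_TYPE, "dbpedia-owl:Film"),
}


def superclasses(triples, cls):
    """Return every class reachable from cls via rdfs:subClassOf."""
    found = set()
    todo = [cls]
    while todo:
        current = todo.pop()
        for s, p, o in triples:
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                todo.append(o)
    return found


def infer_types(triples):
    """Add the rdf:type statements implied by the subclass hierarchy."""
    inferred = set(triples)
    for s, p, o in triples:
        if p == RDF_TYPE:
            for parent in superclasses(triples, o):
                inferred.add((s, RDF_TYPE, parent))
    return inferred


if __name__ == "__main__":
    for triple in sorted(infer_types(TRIPLES) - TRIPLES):
        print(triple)
```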

Page 60: Linked (Open) Data

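OWL

• A more expressive (formal) language for defining the syntax and semantics of schemas/vocabularies
• Solves RDFS shortcomings but introduces quite some complexity

[Figure: statements describing the property http://dbpedia.org/ontology/starring]

http://dbpedia.org/ontology/starring  rdf:type  http://www.w3.org/2002/07/owl#ObjectProperty
http://dbpedia.org/ontology/starring  rdfs:domain  http://dbpedia.org/ontology/Work
http://dbpedia.org/ontology/starring  rdfs:range  http://dbpedia.org/ontology/Person
http://dbpedia.org/ontology/starring  rdfs:label  "starring"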

Page 61: Linked (Open) Data

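SKOS

• A language for describing controlled vocabularies (taxonomies, thesauri, classification schemes)

[Figure: example SKOS statements]

http://dbpedia.org/resource/The_Shining_(film)  dcterms:subject  http://dbpedia.org/resource/Category:1980s_horror_films
http://dbpedia.org/resource/Category:1980s_horror_films  rdf:type  http://www.w3.org/2004/02/skos/core#Concept
http://dbpedia.org/resource/Category:1980s_horror_films  skos:broader  http://dbpedia.org/resource/Category:1980s_films
http://dbpedia.org/resource/Category:1980s_films  rdf:type  http://www.w3.org/2004/02/skos/core#Concept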

Page 62: Linked (Open) Data

SPARQL

• A query language and protocol for accessing RDF data on the Web

PREFIX dcterms: <http://purl.org/dc/terms/>

SELECT DISTINCT ?x
WHERE {
  ?x dcterms:subject
     <http://dbpedia.org/resource/Category:1980s_horror_films> .
}
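What the query above does can be sketched as pattern matching over triples: terms starting with "?" are variables, everything else must match exactly. The Python below is a rough, in-memory analogue of a single-pattern SELECT DISTINCT, not an implementation of SPARQL. The prefixed names are plain strings, the `example:Another_Film` entry is hypothetical sample data, and the function names are invented for this illustration.

```python
# Sample data; the example:Another_Film statement is made up for illustration.
TRIPLES = [
    ("dbpedia:The_Shining_(film)", "dcterms:subject", "category:1980s_horror_films"),
    ("dbpedia:The_Shining_(film)", "rdf:type", "dbpedia-owl:Film"),
    ("example:Another_Film", "dcterms:subject", "category:1980s_films"),
]


def is_variable(term):
    """Variables are written with a leading question mark, as in SPARQL."""
    return term.startswith("?")


def match(pattern, triples):
    """Yield one binding dict per triple that matches the pattern."""
    for triple in triples:
        binding = {}
        for want, have in zip(pattern, triple):
            if is_variable(want):
                if binding.setdefault(want, have) != have:
                    break  # same variable would need two different values
            elif want != have:
                break  # constant term does not match
        else:
            yield binding


def select_distinct(variable, pattern, triples):
    """Rough analogue of SELECT DISTINCT ?var WHERE { pattern }."""
    seen = []
    for binding in match(pattern, triples):
        value = binding[variable]
        if value not in seen:
            seen.append(value)
    return seen


if __name__ == "__main__":
    pattern = ("?x", "dcterms:subject", "category:1980s_horror_films")
    print(select_distinct("?x", pattern, TRIPLES))
```

A real SPARQL engine additionally joins several patterns, handles prefixes and data types, and is usually reached over HTTP via the SPARQL protocol.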

Page 63: Linked (Open) Data

Database Systems Analogy...

Purpose                    | Relational Database Management Systems (RDBMS) | Linked Data Technologies
Query                      | ?                                              | ?
Schema Definition Language | ?                                              | ?
Data Representation        | ?                                              | ?
Identifiers                | ?                                              | ?

Page 64: Linked (Open) Data

Database Systems Analogy...

Purpose                    | Relational Database Management Systems (RDBMS) | Linked Data Technologies
Query                      | SQL                                            | SPARQL
Schema Definition Language | SQL DDL                                        | RDFS / OWL
Data Representation        | Relational Model / Tables                      | RDF / Graph
Identifiers                | Primary Keys (numeric sequences)               | URI

Page 65: Linked (Open) Data

Publishing Linked Data

• Distinguish between non-information and information resources
• Sample non-information resource
  – http://dbpedia.org/resource/The_Shining_(film)
• Sample information resources
  – http://dbpedia.org/page/The_Shining_(film) - HTML
  – http://dbpedia.org/data/The_Shining_(film) - RDF

Page 66: Linked (Open) Data

Publishing Linked Data

GET http://dbpedia.org/resource/The_Shining_(film)
Accept: application/rdf+xml

303 See Other
Location: http://dbpedia.org/data/The_Shining_(film)

GET http://dbpedia.org/data/The_Shining_(film)
Accept: application/rdf+xml

200 OK
...
<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF ...

Page 67: Linked (Open) Data

Publishing Large RDF Datasets

• Run a servlet that implements the 303 publishing approach
  – for non-information resources
    • Parse the Accept header field
    • Redirect (303 See Other) to the corresponding information resource
• Generate the RDF serialization dynamically from the underlying data storage
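The decision logic behind the 303 approach can be sketched in a few lines. The slide speaks of a servlet; the Python below only illustrates the redirect decision, assuming the DBpedia-style path layout (/resource/, /page/, /data/) from the earlier slides. The function names and the set of RDF media types are assumptions for this example, and the Accept parsing is deliberately simplified.

```python
RESOURCE_PREFIX = "/resource/"

# Media types answered with RDF in this sketch; everything else gets HTML.
RDF_MEDIA_TYPES = {"application/rdf+xml", "text/turtle"}


def accepted_media_types(accept_header):
    """Parse an Accept header into media types, highest q-value first."""
    entries = []
    for position, part in enumerate(accept_header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        quality = 1.0
        for field in fields[1:]:
            name, _, value = field.partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:  # q=0 means "not acceptable"
            entries.append((-quality, position, media_type))
    return [media_type for _, _, media_type in sorted(entries)]


def respond(path, accept_header):
    """Return (status, headers) for a request to a non-information resource."""
    if not path.startswith(RESOURCE_PREFIX):
        return 404, {}
    name = path[len(RESOURCE_PREFIX):]
    for media_type in accepted_media_types(accept_header):
        if media_type in RDF_MEDIA_TYPES:
            return 303, {"Location": "/data/" + name}
        if media_type in ("text/html", "*/*"):
            break  # the client prefers a human-readable page
    return 303, {"Location": "/page/" + name}


if __name__ == "__main__":
    print(respond("/resource/The_Shining_(film)", "application/rdf+xml"))
    print(respond("/resource/The_Shining_(film)", "text/html"))
```

The redirect target is then served as an ordinary document (200 OK), with the RDF generated from the underlying data store.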

Page 68: Linked (Open) Data

My plan for today…

• Linked (Open) Data ???
• Linked Data – Intro & Overview
• Linked Data - Technologies
• Recent Trends and Developments
• Questions / Discussion

Page 69: Linked (Open) Data

Rich Snippets / Microdata

Page 70: Linked (Open) Data

Microdata (HTML5)

• A very young HTML5 proposal that extends Microformats and addresses their shortcomings
• Items are created within an itemscope
• Every item is assigned an arbitrary number of properties (itemprop) and relationships (itemref)
• Uses global identifiers for typing and naming items

Page 71: Linked (Open) Data

Microdata Example

<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Bernhard Haslhofer</span>,
  <span itemprop="nickname">behas</span>.
  <div itemprop="address"
       itemscope itemtype="http://schema.org/PostalAddress">
    <span itemprop="streetAddress">301 College Avenue</span>
    <span itemprop="addressLocality">Ithaca</span>
    <span itemprop="addressCountry">United States</span>
  </div>
</div>
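To show how a consumer might read such markup, here is a small Python sketch that pulls out the text-valued itemprop pairs with the standard library's `html.parser`. It is a simplification under stated assumptions: it ignores itemtype and itemref, flattens nested items, and does not follow the full Microdata extraction algorithm. The class and function names are invented for this illustration.

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}

SAMPLE = """
<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Bernhard Haslhofer</span>,
  <span itemprop="nickname">behas</span>.
  <div itemprop="address" itemscope itemtype="http://schema.org/PostalAddress">
    <span itemprop="streetAddress">301 College Avenue</span>
    <span itemprop="addressLocality">Ithaca</span>
    <span itemprop="addressCountry">United States</span>
  </div>
</div>
"""


class ItempropCollector(HTMLParser):
    """Collect (itemprop, text) pairs for properties that hold plain text."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._open = []  # one [itemprop_or_None, text_parts] entry per open element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return  # void elements have no end tag, so they never go on the stack
        attributes = dict(attrs)
        prop = attributes.get("itemprop")
        if "itemscope" in attributes:
            prop = None  # the value is a nested item, not text
        self._open.append([prop, []])

    def handle_data(self, data):
        if self._open and self._open[-1][0] is not None:
            self._open[-1][1].append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._open:
            return
        prop, parts = self._open.pop()
        if prop is not None:
            self.pairs.append((prop, "".join(parts).strip()))


def extract_itemprops(html):
    """Return the text-valued itemprop pairs in document order."""
    parser = ItempropCollector()
    parser.feed(html)
    parser.close()
    return parser.pairs


if __name__ == "__main__":
    for name, value in extract_itemprops(SAMPLE):
        print(name, "=", value)
```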

Page 72: Linked (Open) Data

Schema.org  

Page 73: Linked (Open) Data
Page 74: Linked (Open) Data

schema.org / Microdata example

<h1>Pirates of the Caribbean: On Stranger Tides (2011)</h1>
Jack Sparrow and Barbossa embark on a quest to find the elusive fountain
of youth, only to discover that Blackbeard and his daughter are after it too.

Director: Rob Marshall
Writers: Ted Elliott, Terry Rossio, and 7 more credits
Stars: Johnny Depp, Penelope Cruz, Ian McShane
8/10 stars from 200 users. Reviews: 50.

Page 75: Linked (Open) Data

schema.org / Microdata example

Page 76: Linked (Open) Data

schema.org

• Defines
  – a number of types (e.g., person), organized in an inheritance hierarchy
  – a number of properties (e.g., name)
• Extension mechanisms to extend the schemas
• OWL representation: http://schema.org/docs/schemaorg.owl
• http://schema.rdfs.org/index.html

Page 77: Linked (Open) Data

Open Graph Protocol

Page 78: Linked (Open) Data
Page 79: Linked (Open) Data

Page 80: Linked (Open) Data
Page 81: Linked (Open) Data
Page 82: Linked (Open) Data

Google Knowledge Graph

• Enables search for things (people, places) that Google knows about
• Rooted in public sources such as Freebase, Wikipedia, CIA World Factbook, etc.
  – augmented to 500M objects, 3.5B facts and relationships
• Next generation search (semantic index)

Page 83: Linked (Open) Data

Page 84: Linked (Open) Data

Page 85: Linked (Open) Data

Page 86: Linked (Open) Data

Page 87: Linked (Open) Data

Page 88: Linked (Open) Data

Readings

• Tom Heath and Christian Bizer (2011). Linked Data: Evolving the Web into a Global Data Space (1st edition). Synthesis Lectures on the Semantic Web: Theory and Technology, 1:1, 1-136. Morgan & Claypool.
• Jason Ronallo: HTML5 Microdata and Schema.org. http://journal.code4lib.org/articles/6400