Linked (Open) Data

Guest lecture on open data, linked data, and the basics of linked open data, held at the Technical University of Vienna.

VU Web Engineering / TU Wien, May 27th 2013

Bernhard Haslhofer

About me

• Since 03/2013: Postdoc @ University of Vienna
• Previously
  – Lecturer & Postdoc @ Cornell University, NY, USA
  – Univ. Ass. @ University of Vienna
  – …
  – WINF TU Wien 2003, INF TU Wien 2006

About me

• Research Interests
  – Web information systems
  – Globally connected, Web-based data networks
    • Structured Web Data (Linked Data, schema.org, (FB) Open Graph Protocol, etc.)
    • Knowledge Graphs (e.g., DBpedia, Freebase)
    • Annotations / Semantic Tagging
    • Quality in Open Data Networks
    • …

My teaching philosophy

• A course is a collaborative experience
• Instructor provides
  – Structure
  – Foundation for learning
• Students
  – Engage, contribute, challenge
  – Ask questions!
  – Think critically!
  – Disagree if appropriate!

Aren’t we beyond that?

My plan for today…

• Linked (Open) Data ???
• Linked Data – Intro & Overview
• Linked Data – Technologies
• Recent Trends and Developments
• Questions / Discussion

Open Data

“Open data is data that can be freely used, reused and redistributed by anyone - subject only, at most, to the requirement to attribute and sharealike.” (Open Data Handbook, 2012, Open Knowledge Foundation)

“Open” Data Definition

• Availability and Access
  – Data must be available as a whole and at no more than a reasonable reproduction cost, preferably by downloading over the internet
  – Data must also be available in a convenient and modifiable form
• Reuse and Redistribution
  – Data must be provided under terms that permit reuse and redistribution, including the intermixing with other datasets
• Universal Participation
  – Everyone must be able to use, reuse and redistribute (no discrimination)
  – No ‘non-commercial’ restrictions

(http://opendefinition.org/okd/)

Open Data Movement

Source: http://www.flickr.com/photos/jamescridland/613445810/sizes/l/in/photostream/

Questions

• Why should the open data principles sound familiar to software engineers?
• Any known “open data” examples?

Open Government Data Examples

Open Government Data Apps

Open (Government) Data Apps

Open Government Data in Journalism

(Open) Data Journalism

Open Data in Science

Linked Data

“A method of publishing structured data so that it can be interlinked and become more useful. It builds upon standard Web technologies such as HTTP, RDF and URIs, but rather than using them to serve web pages for human readers, it extends them to share information in a way that can be read automatically by computers. This enables data from different sources to be connected and queried.” [Bizer, Heath, Berners-Lee 2009]

Linked Open Data

Open Data + Linked Data = Linked Open Data

Linked Data context...

http://www.youtube.com/watch?v=5Cb3ik6zP2I

Why Linked Data?

Web Architecture

• A set of simple standards
  – Uniform global addressing (URI)
  – Uniform document encoding (HTML)
  – Uniform transportation (HTTP)
• Hyperlinks connecting documents
• Works pretty well for accessing and exchanging documents

But sometimes we need to access the underlying structured data.

Web Services and Web APIs

Source: http://www.blogperfume.com/new-27-circular-social-media-icons-in-3-sizes/

• Each Web API has a proprietary interface
• Data sources must be known in advance
• Information entities (papers, authors, subjects, etc.) are often not linked

Social Networking Sites as Walled Gardens by David Simonds

Linked Data Vision

• Publish and link structured data on the Web
• Create a single globally connected data space based on the Web Architecture

Web of Linked Data

• A set of simple standards
  – Uniform global addressing (URI)
  – Uniform data model (RDF)
  – Uniform transportation (HTTP)
• RDF links connecting entities
• Forms a global data space and facilitates accessing and exchanging data

What is Linked Data?

• A method to build a Web of Data
• Architectural style, set of standards

Linking Open Data Project

• A W3C community project with the goal to extend the Web with a data commons by publishing various open data sets as RDF on the Web and by setting links between data items from different sources

~$ curl -I -H "Accept: text/turtle" http://dbpedia.org/resource/The_Shining_\(film\)
~$ curl -H "Accept: text/turtle" http://dbpedia.org/data/The_Shining_\(film\).ttl

~$ sudo apt-get install raptor2-utils    # Linux (Debian/Ubuntu): the package that ships the rapper tool
~$ brew install raptor                   # Mac OS X
~$ rapper http://dbpedia.org/resource/The_Shining_\(film\)
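
The first curl call above can be sketched in Python with the standard library only. This is a minimal illustration, not part of the original slides: the helper name turtle_request is made up, and the request is only built, not sent, so the snippet runs without network access.

```python
from urllib.request import Request


def turtle_request(uri: str) -> Request:
    """Build (but do not send) a HEAD request that asks for Turtle,
    mirroring `curl -I -H "Accept: text/turtle" <uri>`."""
    return Request(uri, headers={"Accept": "text/turtle"}, method="HEAD")


req = turtle_request("http://dbpedia.org/resource/The_Shining_(film)")
print(req.get_method(), req.full_url)
print("Accept:", req.get_header("Accept"))
# To actually send it (needs network access): urllib.request.urlopen(req)
```

Note that no shell escaping of the parentheses is needed here, because the URI is passed as a Python string rather than through a shell.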

Web / REST Basics - Recap

• Key Architectural Web Components
  – Identification: URI
  – Interaction: HTTP
  – Standardized Document Formats: HTML, XML, JSON, etc.

Web / REST Basics - Recap

• URIs identify interesting things
  – documents on the Web
  – relevant aspects of a data set
  – phone numbers, Skype usernames, e-mail addresses
• HTTP URIs name and address resources in Web-based systems

Web / REST Basics - Recap

• A resource can have several representations
• Representations can be in any format
  – HTML
  – XML
  – JSON
  – …

[Figure: the resource identified by the URI http://example.com/someURI has several representations: plain text (text/plain), HTML (text/html), JSON (application/json)]
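
One way to picture "one resource, several representations" is a lookup keyed by media type. The sketch below is an assumption-laden illustration: the representation bodies and the negotiate helper are invented for this example, and quality values (q=...) in the Accept header are deliberately ignored.

```python
# Hypothetical representations of a single resource, keyed by media type.
REPRESENTATIONS = {
    "text/plain": "The Shining (film)",
    "text/html": "<h1>The Shining (film)</h1>",
    "application/json": '{"label": "The Shining (film)"}',
}


def negotiate(accept_header: str, default: str = "text/html") -> str:
    """Return the first media type from the Accept header that we can serve.

    Simplification: q-values are ignored, header order decides, and an
    unsatisfiable request falls back to the default instead of a 406.
    """
    for part in accept_header.split(","):
        media_type = part.split(";")[0].strip()
        if media_type in REPRESENTATIONS:
            return media_type
        if media_type == "*/*":
            return default
    return default


chosen = negotiate("application/json, text/html;q=0.8")
print(chosen, "->", REPRESENTATIONS[chosen])
```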

Web / REST Basics - Recap

• We deal with resource representations
  – not the resources themselves (pass by value)
  – representations can be in any format (defined by media type)
• Each resource implements a standard uniform interface (HTTP)
  – a small set of verbs applied to a large set of nouns
  – verbs are universal and not invented on a per-application basis

[Figure: client and server exchange resource representations (e.g., JSON) through the uniform interface; labels: Logical Resources, Physical Resources]
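
The "small set of verbs applied to a large set of nouns" idea can be sketched as a single function that handles the same few methods for any URI. The in-memory store and the handle function are hypothetical, and the status codes follow common HTTP usage rather than anything stated on the slide.

```python
def handle(store: dict, method: str, uri: str, body=None):
    """Apply one of a few universal verbs to an arbitrary resource URI."""
    if method == "GET":
        return (200, store[uri]) if uri in store else (404, None)
    if method == "PUT":
        status = 200 if uri in store else 201  # 201 Created for new resources
        store[uri] = body
        return (status, body)
    if method == "DELETE":
        if uri in store:
            del store[uri]
            return (204, None)
        return (404, None)
    return (405, None)  # verb not supported by this sketch


store = {}
print(handle(store, "PUT", "/films/the-shining", {"year": 1980}))
print(handle(store, "GET", "/films/the-shining"))
print(handle(store, "DELETE", "/films/the-shining"))
```

The same three verbs work for any noun (URI) added to the store; nothing is invented per application.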

Web / REST Basics - Recap

• HTML, XHTML, ...: display information
• XML, JSON, ...: transport and store data

Web / REST Basics - Recap

• Example Web Service operations:
  – Publish image on Flickr
  – Order a book at Amazon
  – Post a message on your friend’s Facebook wall
  – Update user photo on foursquare

[Figure: Application A and Application B communicate over the Web through an API]

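RDF

• A data model for representing data on the Web
• Several statements (triples) form a graph

Example graph, written as subject / predicate / object statements:

  <http://dbpedia.org/resource/The_Shining_(film)>   rdfs:label             "The Shining (film)"
  <http://dbpedia.org/resource/The_Shining_(film)>   rdfs:label             "闪灵 (电影)"
  <http://dbpedia.org/resource/The_Shining_(film)>   rdf:type               <http://dbpedia.org/ontology/Film>
  <http://dbpedia.org/resource/The_Shining_(film)>   dbpprop:starring       <http://dbpedia.org/resource/Jack_Nicholson>
  <http://dbpedia.org/resource/Jack_Nicholson>       rdf:type               <http://xmlns.com/foaf/0.1/Person>
  <http://dbpedia.org/resource/Jack_Nicholson>       dbpedia-owl:birthDate  1937-04-22
  <http://dbpedia.org/resource/Jack_Nicholson>       foaf:name              "Jack Nicholson"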
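
To make the triple/graph idea concrete, the example graph can be held as a plain Python set of (subject, predicate, object) tuples. This is only a sketch: real applications would typically use an RDF library, literals are simplified to plain strings, and the namespace IRIs behind the prefixes are the conventional bindings, assumed here because the slide shows only the prefixes.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
FOAF = "http://xmlns.com/foaf/0.1/"
DBR = "http://dbpedia.org/resource/"
DBO = "http://dbpedia.org/ontology/"   # assumed binding of dbpedia-owl:
DBP = "http://dbpedia.org/property/"   # assumed binding of dbpprop:

film = DBR + "The_Shining_(film)"
actor = DBR + "Jack_Nicholson"

# Each statement is one (subject, predicate, object) triple.
graph = {
    (film, RDFS + "label", "The Shining (film)"),
    (film, RDFS + "label", "闪灵 (电影)"),
    (film, RDF + "type", DBO + "Film"),
    (film, DBP + "starring", actor),
    (actor, RDF + "type", FOAF + "Person"),
    (actor, DBO + "birthDate", "1937-04-22"),
    (actor, FOAF + "name", "Jack Nicholson"),
}


def objects(graph, subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in graph if s == subject and p == predicate}


# Follow a link in the graph: film -> starring -> name.
star_names = {
    name
    for star in objects(graph, film, DBP + "starring")
    for name in objects(graph, star, FOAF + "name")
}
print(star_names)
```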

RDF/XML, N3, Turtle, etc.

• Data formats for RDF resource representations
• Used to transfer RDF data between apps
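
As an illustration of what a serialization format does, the sketch below writes triples in N-Triples, a line-based RDF syntax not listed on the slide and chosen here only because it is the simplest one to emit by hand. The Literal class and helper names are hypothetical, the "en" language tag is an assumption, and only basic string escaping is handled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Literal:
    """A plain literal with an optional language tag (hypothetical helper)."""
    value: str
    lang: Optional[str] = None


def term(t) -> str:
    """Render an IRI (given as str) or a Literal in N-Triples syntax."""
    if isinstance(t, Literal):
        escaped = (
            t.value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        )
        return f'"{escaped}"' + (f"@{t.lang}" if t.lang else "")
    return f"<{t}>"


def to_ntriples(triples) -> str:
    """One statement per line, each terminated by ' .'."""
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples)


film = "http://dbpedia.org/resource/The_Shining_(film)"
label = "http://www.w3.org/2000/01/rdf-schema#label"
doc = to_ntriples([
    (film, label, Literal("The Shining (film)", "en")),  # "en" is an assumption
    (film, "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
     "http://dbpedia.org/ontology/Film"),
])
print(doc)
```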

RDFS

• A language for describing the syntax and semantics of schemas/vocabularies in a machine-understandable way

  <http://dbpedia.org/ontology/Film>   rdfs:subClassOf   <http://dbpedia.org/ontology/Work>
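
What "machine-understandable" buys you can be sketched with rdfs:subClassOf: a program can conclude that every Film is also a Work. The snippet below is a toy inference over a dictionary, not a full RDFS reasoner; the second subclass link (Work to owl:Thing) is an assumption added only to show transitivity.

```python
DBO = "http://dbpedia.org/ontology/"
OWL_THING = "http://www.w3.org/2002/07/owl#Thing"

# class -> set of direct superclasses
SUBCLASS_OF = {
    DBO + "Film": {DBO + "Work"},   # from the slide
    DBO + "Work": {OWL_THING},      # assumption, for illustration only
}


def superclasses(cls, subclass_of):
    """All direct and indirect superclasses of cls (transitive closure)."""
    found, todo = set(), [cls]
    while todo:
        for parent in subclass_of.get(todo.pop(), ()):
            if parent not in found:
                found.add(parent)
                todo.append(parent)
    return found


def inferred_types(asserted_types, subclass_of):
    """Asserted types plus everything implied by rdfs:subClassOf."""
    result = set(asserted_types)
    for cls in asserted_types:
        result |= superclasses(cls, subclass_of)
    return result


print(sorted(inferred_types({DBO + "Film"}, SUBCLASS_OF)))
```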

OWL

• A more expressive (formal) language for defining the syntax and semantics of schemas/vocabularies
• Solves RDFS shortcomings but introduces quite some complexity

  <http://dbpedia.org/ontology/starring>   rdf:type      <http://www.w3.org/2002/07/owl#ObjectProperty>
  <http://dbpedia.org/ontology/starring>   rdfs:domain   <http://dbpedia.org/ontology/Work>
  <http://dbpedia.org/ontology/starring>   rdfs:range    <http://dbpedia.org/ontology/Person>
  <http://dbpedia.org/ontology/starring>   rdfs:label    "starring"
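
The domain and range statements for the starring property let a program derive types that were never stated explicitly. The following is a minimal sketch of that one entailment pattern, not an OWL reasoner; the input triple uses the ontology property http://dbpedia.org/ontology/starring as an assumption.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DBR = "http://dbpedia.org/resource/"
DBO = "http://dbpedia.org/ontology/"

DOMAIN = {DBO + "starring": DBO + "Work"}     # rdfs:domain
RANGE = {DBO + "starring": DBO + "Person"}    # rdfs:range


def infer_types(triples, domain, rng):
    """Derive rdf:type statements from rdfs:domain / rdfs:range declarations."""
    inferred = set()
    for s, p, o in triples:
        if p in domain:   # the subject of p must belong to p's domain
            inferred.add((s, RDF_TYPE, domain[p]))
        if p in rng:      # the object of p must belong to p's range
            inferred.add((o, RDF_TYPE, rng[p]))
    return inferred


data = [(DBR + "The_Shining_(film)", DBO + "starring", DBR + "Jack_Nicholson")]
for triple in sorted(infer_types(data, DOMAIN, RANGE)):
    print(triple)
```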

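SKOS

• A language for describing controlled vocabularies (taxonomies, thesauri, classification schemes)

  <http://dbpedia.org/resource/The_Shining_(film)>            dcterms:subject   <http://dbpedia.org/resource/Category:1980s_horror_films>
  <http://dbpedia.org/resource/Category:1980s_horror_films>   skos:broader      <http://dbpedia.org/resource/Category:1980s_films>
  <http://dbpedia.org/resource/Category:1980s_horror_films>   rdf:type          <http://www.w3.org/2004/02/skos/core#Concept>
  <http://dbpedia.org/resource/Category:1980s_films>          rdf:type          <http://www.w3.org/2004/02/skos/core#Concept>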

SPARQL

• A query language and protocol for accessing RDF data on the Web

PREFIX dcterms: <http://purl.org/dc/terms/>

SELECT DISTINCT ?x
WHERE {
  ?x dcterms:subject <http://dbpedia.org/resource/Category:1980s_horror_films> .
}
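
The "protocol" half of SPARQL means a query can be sent to an endpoint over plain HTTP, for instance as a GET request with the query text in a `query` parameter. The sketch below only builds such a URL and sends nothing; the endpoint http://dbpedia.org/sparql is DBpedia's public endpoint and is an assumption here, since the slide does not name one.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ENDPOINT = "http://dbpedia.org/sparql"  # assumption: not named on the slide

QUERY = """PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT DISTINCT ?x
WHERE {
  ?x dcterms:subject <http://dbpedia.org/resource/Category:1980s_horror_films> .
}"""


def sparql_get_url(endpoint: str, query: str) -> str:
    """URL for a SPARQL protocol query via HTTP GET (query in ?query=...)."""
    return endpoint + "?" + urlencode({"query": query})


url = sparql_get_url(ENDPOINT, QUERY)
print(url)

# Decoding the URL again yields the original query text.
assert parse_qs(urlsplit(url).query)["query"][0] == QUERY
```

When actually sending the request, a client would typically also set an Accept header such as application/sparql-results+json to choose the result format.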

Database Systems Analogy...

Purpose                      Relational Database Management Systems (RDBMS)   Linked Data Technologies
Query                        SQL                                              SPARQL
Schema Definition Language   SQL DDL                                          RDFS / OWL
Data Representation          Relational Model / Tables                        RDF / Graph
Identifiers                  Primary Keys (numeric sequences)                 URI
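
The analogy can be made tangible by turning one relational row into triples: the primary key becomes part of a URI, and each remaining column becomes a predicate. Everything in this sketch is hypothetical (the example.com base URI, the table and column names, and the URI pattern); it illustrates the mapping idea and is not a standard mapping language.

```python
def row_to_triples(base: str, table: str, key_column: str, row: dict):
    """Map one relational row to (subject, predicate, object) triples.

    The primary key value is minted into the subject URI; every other
    column becomes a predicate in a made-up schema namespace.
    """
    subject = f"{base}{table}/{row[key_column]}"
    return [
        (subject, f"{base}schema/{table}#{column}", value)
        for column, value in row.items()
        if column != key_column
    ]


row = {"id": 42, "title": "The Shining", "year": 1980}  # hypothetical row
triples = row_to_triples("http://example.com/", "film", "id", row)
for t in triples:
    print(t)
```

Unlike the numeric key 42, the resulting URI is meaningful outside the database that issued it, which is the point of the "Identifiers" row in the table.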

Publishing Linked Data

• Distinguish between non-information and information resources
• Sample non-information resource
  – http://dbpedia.org/resource/The_Shining_(film)
• Sample information resources
  – http://dbpedia.org/page/The_Shining_(film) - HTML
  – http://dbpedia.org/data/The_Shining_(film) - RDF

Publishing Linked Data

GET http://dbpedia.org/resource/The_Shining_(film)
Accept: application/rdf+xml

303 See Other
Location: http://dbpedia.org/data/The_Shining_(film)

GET http://dbpedia.org/data/The_Shining_(film)
Accept: application/rdf+xml

200 OK
...
<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF ...

Publishing Large RDF Datasets

• Run a servlet that implements the 303 publishing approach
  – for non-information resources
    • parse the Accept header field
    • redirect (303 See Other) to the corresponding information resource
• Generate the RDF serialization dynamically from the underlying data storage
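
The redirect logic of the 303 approach fits in a few lines. The sketch below uses Python rather than a servlet and returns a (status, headers) pair instead of talking to a real web framework; the /resource/, /page/ and /data/ path layout follows the DBpedia examples above, while the function name and the list of RDF media types are assumptions.

```python
RDF_MEDIA_TYPES = ("application/rdf+xml", "text/turtle")  # assumed list


def respond(path: str, accept: str):
    """Answer a request for a non-information resource with a 303 redirect.

    Clients asking for RDF are sent to the /data/ document, everyone else
    to the HTML /page/ document. Simplification: q-values are ignored.
    """
    prefix = "/resource/"
    if not path.startswith(prefix):
        return 404, {}
    name = path[len(prefix):]
    wants_rdf = any(media_type in accept for media_type in RDF_MEDIA_TYPES)
    target = "/data/" if wants_rdf else "/page/"
    return 303, {"Location": target + name}


print(respond("/resource/The_Shining_(film)", "application/rdf+xml"))
print(respond("/resource/The_Shining_(film)", "text/html"))
```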

Rich Snippets / Microdata

Microdata (HTML5)

• A very young HTML5 proposition that extends Microformats and addresses their shortcomings
• Items are created within an itemscope
• Every item is assigned an arbitrary number of properties (itemprop) and relationships (itemref)
• Uses global identifiers for typing and naming items

Microdata Example

<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Bernhard Haslhofer</span>,
  <span itemprop="nickname">behas</span>.
  <div itemprop="address" itemscope itemtype="http://schema.org/PostalAddress">
    <span itemprop="streetAddress">301 College Avenue</span>
    <span itemprop="addressLocality">Ithaca</span>
    <span itemprop="addressCountry">United States</span>
  </div>
</div>
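
To show how a consumer might read such markup, here is a small extractor built on Python's standard html.parser module. It is a sketch under stated assumptions, not a conforming Microdata parser: the MicrodataParser class is made up for this example, it ignores itemref, it assumes well-nested markup without void elements such as <br>, and it only collects text that sits directly inside an itemprop element.

```python
from html.parser import HTMLParser


class MicrodataParser(HTMLParser):
    """Collect itemscope/itemprop structures into nested dictionaries."""

    def __init__(self):
        super().__init__()
        self.items = []    # top-level items found in the document
        self._open = []    # one (kind, name) entry per currently open element
        self._scopes = []  # stack of items currently being filled
        self._text = []    # text pieces of the itemprop being read

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "itemscope" in a:
            item = {"type": a.get("itemtype"), "properties": {}}
            if "itemprop" in a and self._scopes:
                # Nested item: it is the value of a property of the parent.
                self._scopes[-1]["properties"][a["itemprop"]] = item
            else:
                self.items.append(item)
            self._scopes.append(item)
            self._open.append(("scope", None))
        elif "itemprop" in a and self._scopes:
            self._open.append(("prop", a["itemprop"]))
            self._text = []
        else:
            self._open.append((None, None))

    def handle_data(self, data):
        if self._open and self._open[-1][0] == "prop":
            self._text.append(data)

    def handle_endtag(self, tag):
        if not self._open:
            return
        kind, name = self._open.pop()
        if kind == "scope":
            self._scopes.pop()
        elif kind == "prop":
            self._scopes[-1]["properties"][name] = "".join(self._text).strip()


HTML = """
<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Bernhard Haslhofer</span>,
  <span itemprop="nickname">behas</span>.
  <div itemprop="address" itemscope itemtype="http://schema.org/PostalAddress">
    <span itemprop="streetAddress">301 College Avenue</span>
    <span itemprop="addressLocality">Ithaca</span>
    <span itemprop="addressCountry">United States</span>
  </div>
</div>
"""

parser = MicrodataParser()
parser.feed(HTML)
parser.close()
person = parser.items[0]
print(person["type"])
print(person["properties"]["name"])
print(person["properties"]["address"]["properties"]["addressLocality"])
```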

Schema.org

schema.org / Microdata example

<h1>Pirates of the Caribbean: On Stranger Tides (2011)</h1>
Jack Sparrow and Barbossa embark on a quest to find the elusive fountain of youth, only to discover that Blackbeard and his daughter are after it too.

Director: Rob Marshall
Writers: Ted Elliott, Terry Rossio, and 7 more credits
Stars: Johnny Depp, Penelope Cruz, Ian McShane
8/10 stars from 200 users. Reviews: 50.

schema.org

• Defines
  – a number of types (e.g., person), organized in an inheritance hierarchy
  – a number of properties (e.g., name)
• Extension mechanisms to extend the schemas
• OWL representation: http://schema.org/docs/schemaorg.owl
• http://schema.rdfs.org/index.html

Open Graph Protocol

Google Knowledge Graph

• Enables search for things (people, places) that Google knows about
• Rooted in public sources such as Freebase, Wikipedia, CIA World Factbook, etc.
  – augmented to 500M objects, 3.5B facts and relationships
• Next generation search (semantic index)

Readings

• Tom Heath and Christian Bizer (2011). Linked Data: Evolving the Web into a Global Data Space (1st edition). Synthesis Lectures on the Semantic Web: Theory and Technology, 1:1, 1-136. Morgan & Claypool.
• Jason Ronallo: HTML5 Microdata and Schema.org. http://journal.code4lib.org/articles/6400