Linked Data life cycles Dr. Michael Hausenblas, Linked Data Research Centre DERI, NUI Galway July 2011

DESCRIPTION

Existing data management approaches assume control over schema, data and data generation, which is not the case in open, decentralised environments such as the Web. Because of this lack of control, social processes are needed to generate 'ordo ab chao', and hence a new life cycle model is necessary. Based on our experience in Linked Data publishing and consumption over the past years, we have identified the involved parties and fundamental phases, which provide for a multitude of so-called Linked Data life cycles. If you want to hear me speak to the slides, check out the following videos on YouTube:

Part 1: http://www.youtube.com/watch?v=AFJSMKv5s3s

Part 2: http://www.youtube.com/watch?v=G6YJSZdXOsc

Part 3: http://www.youtube.com/watch?v=OagzNpDEPJg

Page 1: Linked data life cycles

Linked Data life cycles

Dr. Michael Hausenblas, Linked Data Research Centre

DERI, NUI Galway

July 2011

Page 2: Linked data life cycles

What is a dataspace?

• Heterogeneous data sources

• Distributed environment - proximity

• Find and consume data

• Update data

Page 3: Linked data life cycles

What is a DSSP and why does it matter?

• DSSP == Dataspace Support Platform

• Participants & relationships

• Services
  – Catalog & Browse
  – Search & Query
  – Index
  – Discovery

• Linked Data ecosystem is an open & standards-based real-world DSSP

Page 4: Linked data life cycles

Data management solutions

Based on [Franklin:SIGMOD05]

Page 5: Linked data life cycles

Linked Data principles*

1. Use URIs to identify the “things” in your data

2. Use HTTP URIs so people and machines can look them up (on the Web)

3. When a URI is looked up, return a description of the thing

4. Include links to related things

* http://www.w3.org/DesignIssues/LinkedData.html
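
The four principles can be illustrated with a toy model. The sketch below is not a real HTTP client: it assumes an in-memory dictionary `WEB` stands in for the Web, and every `example.org` URI, as well as the helper names `look_up` and `linked_things`, is hypothetical.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Toy stand-in for the Web: HTTP URI -> description returned on lookup.
# All example.org URIs are hypothetical.
WEB: Dict[str, List[Triple]] = {
    "http://example.org/id/galway": [
        ("http://example.org/id/galway", RDFS_LABEL, "Galway"),
        ("http://example.org/id/galway",
         "http://example.org/vocab/country",
         "http://example.org/id/ireland"),
    ],
    "http://example.org/id/ireland": [
        ("http://example.org/id/ireland", RDFS_LABEL, "Ireland"),
    ],
}


def is_http_uri(value: str) -> bool:
    """Principles 1 and 2: things are named with HTTP(S) URIs."""
    return value.startswith(("http://", "https://"))


def look_up(uri: str) -> List[Triple]:
    """Principle 3: looking up a URI returns a description of the thing."""
    if not is_http_uri(uri):
        raise ValueError(f"not an HTTP URI: {uri}")
    return WEB.get(uri, [])


def linked_things(uri: str) -> List[str]:
    """Principle 4: the description links to related things by their URIs."""
    return [o for s, _p, o in look_up(uri) if s == uri and is_http_uri(o)]


if __name__ == "__main__":
    for target in linked_things("http://example.org/id/galway"):
        print(target, look_up(target))
```

A consumer that starts at one URI can reach further descriptions simply by looking up the URIs it finds, which is the "follow your nose" behaviour the principles aim for.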

Page 6: Linked data life cycles

http://lod-cloud.net/

Linked Open Data cloud

Page 7: Linked data life cycles

triples distribution

links distribution

http://lod-cloud.net/state/

Linked Open Data cloud stats

Page 8: Linked data life cycles

The Challenge

• Classical data management approaches assume complete control over schema, data, and data generation

• The Web is distributed and open, and therefore lacks this control

• Requires a new model of life cycles

Page 9: Linked data life cycles

Linked Data life cycles

[Figure: life cycle diagram; labels: opendata.ie, LOD cloud, Neologism, DataCube, prefix.cc, Google Refine, RDB2RDF, VoID, DCAT, Sindice, CKAN, LATC 24/7, duke, Sig.ma, school explorer, data-gov.ie]

Page 10: Linked data life cycles

Linked Data life cycles: data awareness

Page 11: Linked data life cycles

‘database hugging disorder’

http://thinkquarterly.co.uk/01-data/a-data-state-of-mind/

Hans Rosling

Page 12: Linked data life cycles

TimBL’s 5-star plan for open data*

★ Make your data available on the Web under an open license

★★ Make it available as structured data (Excel sheet instead of image scan of a table)

★★★ Use a non-proprietary format (CSV file instead of an Excel sheet)

★★★★ Use Linked Data format (URIs to identify things, RDF to represent data)

★★★★★ Link your data to other people’s data to provide context

* http://lab.linkeddata.deri.ie/2010/star-scheme-by-example/
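
To make the scheme concrete, here is a minimal sketch of how a publication might be rated. It assumes the stars are cumulative (each star presupposes the ones before it); the `Publication` class, its field names, and `star_rating` are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Publication:
    """Hypothetical checklist, one flag per star on the slide."""
    on_web_open_license: bool     # star 1
    structured: bool              # star 2
    non_proprietary_format: bool  # star 3
    uses_uris_and_rdf: bool       # star 4
    links_to_other_data: bool     # star 5


def star_rating(pub: Publication) -> int:
    """Count stars in order, stopping at the first unmet criterion.

    Assumption: stars are cumulative, so a later criterion does not
    count unless every earlier one is met.
    """
    criteria = [
        pub.on_web_open_license,
        pub.structured,
        pub.non_proprietary_format,
        pub.uses_uris_and_rdf,
        pub.links_to_other_data,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


if __name__ == "__main__":
    excel_sheet = Publication(True, True, False, False, False)
    csv_file = Publication(True, True, True, False, False)
    print(star_rating(excel_sheet), star_rating(csv_file))
```

Under this reading, an openly licensed Excel sheet earns two stars and the same table as CSV earns three, matching the examples given in the star descriptions.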

Page 13: Linked data life cycles
Page 14: Linked data life cycles

Linked Data life cycles: modeling

Page 15: Linked data life cycles

http://linked-statistics.org/datacube/

Page 16: Linked data life cycles

http://vocab.deri.ie

http://neologism.deri.ie

Page 17: Linked data life cycles

http://schema.rdfs.org

Page 18: Linked data life cycles

Linked Data life cycles: publishing

Page 19: Linked data life cycles

Publishing

http://lab.linkeddata.deri.ie/2010/grefine-rdf-extension/

Page 20: Linked data life cycles

Linked Data life cycles: discovery

Page 21: Linked data life cycles

Discovery

• Model for dataset description: VoID vocabulary

• Users in industry and governments

• Published as W3C Note: http://www.w3.org/TR/void

• Significant uptake in research

Page 22: Linked data life cycles

Describing Datasets

• General dataset metadata

• Access metadata

• Structural metadata

• Describing linksets

• Deployment and discovery of voiD files
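
The kinds of metadata listed above can be sketched as a small VoID description. The following is an illustration, not a complete or validated voiD file: it builds Turtle text by hand, all `example.org` URIs and the numbers are made up, and the helper names `describe_dataset` and `describe_linkset` are hypothetical. Titles are assumed to contain no quote characters.

```python
from typing import Iterable

PREFIXES = (
    "@prefix void: <http://rdfs.org/ns/void#> .\n"
    "@prefix dcterms: <http://purl.org/dc/terms/> .\n"
)


def describe_dataset(uri: str, title: str, license_uri: str,
                     sparql_endpoint: str, triples: int,
                     vocabularies: Iterable[str]) -> str:
    """Return a Turtle description of one void:Dataset."""
    props = [
        "a void:Dataset",
        f'dcterms:title "{title}"',                  # general metadata
        f"dcterms:license <{license_uri}>",          # general metadata
        f"void:sparqlEndpoint <{sparql_endpoint}>",  # access metadata
        f"void:triples {triples}",                   # structural metadata
    ]
    props += [f"void:vocabulary <{v}>" for v in vocabularies]
    return f"<{uri}> " + " ;\n    ".join(props) + " .\n"


def describe_linkset(uri: str, subjects_target: str, objects_target: str,
                     link_predicate: str) -> str:
    """Return a Turtle description of a void:Linkset between two datasets."""
    props = [
        "a void:Linkset",
        f"void:subjectsTarget <{subjects_target}>",
        f"void:objectsTarget <{objects_target}>",
        f"void:linkPredicate <{link_predicate}>",
    ]
    return f"<{uri}> " + " ;\n    ".join(props) + " .\n"


# Hypothetical example data; none of these URIs or figures come from the slides.
example = (
    PREFIXES + "\n"
    + describe_dataset(
        "http://example.org/void#schools",
        "Example schools dataset",
        "http://example.org/licenses/open",
        "http://example.org/sparql",
        120000,
        ["http://purl.org/dc/terms/", "http://xmlns.com/foaf/0.1/"],
    )
    + "\n"
    + describe_linkset(
        "http://example.org/void#schools-to-places",
        "http://example.org/void#schools",
        "http://example.org/void#places",
        "http://www.w3.org/2002/07/owl#sameAs",
    )
)

if __name__ == "__main__":
    print(example)
```

In practice an RDF library would handle escaping and serialisation; plain string building is used here only to keep the sketch dependency-free.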

Page 23: Linked data life cycles

Linked Data life cycles: integration

Page 24: Linked data life cycles

Why go for the 5th star?

Central Contractor Registration (CCR)

Geonames

http://webofdata.wordpress.com/2011/05/22/why-we-link/

Page 25: Linked data life cycles

Pay-as-you-go integration

Fix Overall Data Integration Effort

http://latc-project.eu/

Page 26: Linked data life cycles

Linked Data life cycles: use cases

Page 27: Linked data life cycles

Use case: eGov Ireland

• Fingal County Council
  – Raising awareness re open data and demonstrating its value.
  – ODC2011 submission: http://planning-apps.opendata.ie

• Local Government Management Agency (former LGCSB)
  – Advancing access to Open Data for Local Authorities
  – LD pilot for Management Service Indicators across Local Authorities

• Central Statistics Office, dissemination group
  – Boot-strapping data-gov.ie with statistical data.
  – school explorer - pilot

• Enterprise Ireland: National Cross Industry Working Group on Open Data

Page 28: Linked data life cycles

School explorer

Page 29: Linked data life cycles

Linked Data life cycles

Page 30: Linked data life cycles

Challenges

• Schema mapping, matching, alignment [Hausenblas:DBKDA10]

• Write-enable the LD world [Berners-Lee:DERITR09]

• Authentication and authorisation in a distributed setup: http://www.w3.org/2005/Incubator/webid/

• REST-alignment of Linked Data [Wilde:WEWST09]

• Dataset dynamics [Umbrich:LDOW10]

Page 31: Linked data life cycles

References

[Franklin:SIGMOD05] M. J. Franklin, A. Y. Halevy, and D. Maier. From databases to dataspaces: a new abstraction for information management. SIGMOD Record, 34(4):27–33, 2005.

[Berners-Lee:DERITR09] T. Berners-Lee, R. Cyganiak, M. Hausenblas, J. Presbrey, O. Seneviratne, and O. Ureche. On Integration Issues of Site-Specific APIs into the Web of Data. DERI Technical Report, 2009.

[Hausenblas:DBKDA10] M. Hausenblas and M. Karnstedt. Understanding Linked Open Data as a Web-Scale Database. Second International Conference on Advances in Databases, Knowledge, and Data Applications, 2010.

[Wilde:WEWST09] E. Wilde and M. Hausenblas. RESTful SPARQL? You Name It! Aligning SPARQL with REST and Resource Orientation. Fourth Workshop on Emerging Web Services Technology Workshop at European Conference on Web Services, Eindhoven, The Netherlands, 2009.

[Umbrich:LDOW10] J. Umbrich, M. Hausenblas, A. Hogan, A. Polleres, and S. Decker. Towards Dataset Dynamics: Change Frequency of Linked Open Data Sources. Third International Workshop on Linked Data on the Web at 19th International World Wide Web Conference, Raleigh, North Carolina, USA, 2010.

Page 32: Linked data life cycles

See also ...

• The Linked Open Data cloud: http://lod-cloud.net

• Linked Data core specifications: http://linkeddata-specs.info

• Enabling cross-boundary access to data sources: http://enable-cors.org

• Linked Open Data 5-star deployment scheme: http://lab.linkeddata.deri.ie/2010/star-scheme-by-example/