Beyond the Visionaries Taking Linked Data Architecture to the Next Level Richard Cyganiak Semantic Web Conference, 7 March 2014

Page 1: Beyond the Visionaries - Keio University

Beyond the Visionaries Taking Linked Data Architecture to the Next Level Richard Cyganiak

Semantic Web Conference, 7 March 2014

Page 2: Beyond the Visionaries - Keio University

The Semantic Web, SciAm, 2001 Berners-Lee, Hendler, Lassila

“The Semantic Web will bring structure to the meaningful content of Web pages, creating an environment where software agents roaming from page to page can readily carry out sophisticated tasks for users. […] If properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.”

“The Semantic Web, in naming every concept simply by a URI, lets anyone express new concepts that they invent with minimal effort. Its unifying logical language will enable these concepts to be progressively linked into a universal Web. This structure will open up the knowledge and workings of humankind to meaningful analysis by software agents, providing a new class of tools by which we can live, work and learn together.”

Page 3: Beyond the Visionaries - Keio University

Uptake 2014?

[Figure: Linking Open Data cloud diagram, as of September 2011. Datasets shown include DBpedia, GeoNames, MusicBrainz, Freebase, UniProt and several data.gov.uk datasets. Legend (dataset domains): Media, Geographic, Publications, Government, Cross-domain, Life sciences, User-generated content.]

Page 4: Beyond the Visionaries - Keio University

But…

•  Uptake so far only in some areas
•  Still driven by “early adopters”
•  No killer app

Page 5: Beyond the Visionaries - Keio University

Technology Adoption Lifecycle

Source: http://www.biznology.com/2013/07/why-your-social-business-platform-doesnt-have-100-adoption/

Geoffrey A. Moore, Crossing the Chasm

Page 6: Beyond the Visionaries - Keio University

Early adopters want technology & performance

Early majority wants solutions & convenience

Page 7: Beyond the Visionaries - Keio University

The Web of Linked Data in engineering terms

•  A single global database
•  Anyone can query from anywhere
•  Decentralized; anyone with a webserver can play
•  Anyone can say anything about anything
•  Maybe let’s start simple: read-only, open data

Page 8: Beyond the Visionaries - Keio University

Architecture

Source: http://www.w3.org/2005/Talks/1107-iswc-tbl/

Page 9: Beyond the Visionaries - Keio University

Architecture: Publishers

Source: http://www.w3.org/2005/Talks/1107-iswc-tbl/

Page 10: Beyond the Visionaries - Keio University

Architecture

Source: http://www.w3.org/2001/sw/

Page 11: Beyond the Visionaries - Keio University

Why RDF?

•  Graph data model!
•  The real world doesn’t fit in tables.
•  Networks are everywhere.
•  Graphs merge easily; tables and trees don’t!
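
The claim that graphs merge easily can be illustrated with a minimal Python sketch. It assumes triples are modelled as plain (subject, predicate, object) tuples and ignores blank nodes; the example.org URIs and both sample sources are hypothetical. Under these assumptions, merging two graphs is just a set union, whereas merging two tables would first require aligning their columns.

```python
# Triples as (subject, predicate, object) tuples; all URIs are made up.
Triple = tuple[str, str, str]

source_a: set[Triple] = {
    ("http://example.org/alice", "http://example.org/name", "Alice"),
    ("http://example.org/alice", "http://example.org/knows", "http://example.org/bob"),
}
source_b: set[Triple] = {
    ("http://example.org/bob", "http://example.org/name", "Bob"),
    # Same statement as in source_a; the union keeps only one copy.
    ("http://example.org/alice", "http://example.org/name", "Alice"),
}


def merge(*graphs: set[Triple]) -> set[Triple]:
    """Merge graphs by set union; statements about the same URI line up automatically."""
    merged: set[Triple] = set()
    for graph in graphs:
        merged |= graph
    return merged


merged = merge(source_a, source_b)
print(len(merged))  # number of distinct statements after the merge
```

No schema alignment step is needed because every statement already carries its own global identifiers.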

Page 12: Beyond the Visionaries - Keio University

The “Early Majority” wants convenience

•  Most existing data is in Excel and relational (SQL) databases.

•  Developers love JSON and have trouble with RDF.

•  The most common import/export format by far is CSV.

•  There is little immediate benefit from publishing LOD.

Page 13: Beyond the Visionaries - Keio University

Data publishing with immediate use: Exhibit

Source: http://web.mit.edu/newsoffice/2011/data-visualization-loc.html

Page 14: Beyond the Visionaries - Keio University

Lessons from Exhibit

Good

•  Uses a popular format (JSON) for publishing
•  Immediate benefit to the publisher (attractive “data exhibit” that allows users to explore the data)

Bad

•  No “mashing up” of data from multiple sources
•  Data format is not designed for re-use of data
•  An Exhibit is a “data island”: no connections, no links

Page 15: Beyond the Visionaries - Keio University

In praise of CSV

The lowest common denominator

Page 16: Beyond the Visionaries - Keio University
Page 17: Beyond the Visionaries - Keio University

Why it’s great…

•  Dead simple
•  Edit with Excel or Google Spreadsheets
•  Clean up with OpenRefine
•  Import into any SQL database
•  Export from many business apps
•  Visualize/chart with Excel, etc.

Page 18: Beyond the Visionaries - Keio University

…and why it’s not so great

•  Nowhere to put metadata
•  Not self-descriptive
•  No way to address, identify or link records
•  No specification of character encoding
•  Many different variants and dialects (TSV, semicolons)
•  Lots of bad CSV out there
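
Two of these problems, the missing encoding declaration and the many dialects, can be shown with a short Python sketch using the standard library's `csv.Sniffer`. The function name `read_rows`, the fallback order (UTF-8, then Latin-1) and the sample data are assumptions made for illustration; a consumer of real-world CSV has to guess in a similar way.

```python
import csv
import io


def read_rows(raw: bytes) -> list[list[str]]:
    """Decode and parse CSV bytes whose encoding and delimiter are unknown."""
    # CSV carries no encoding declaration, so we have to guess:
    # try UTF-8 (with optional BOM) first, then fall back to Latin-1,
    # which accepts any byte sequence.
    try:
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        text = raw.decode("latin-1")
    # Guess the dialect from a sample, restricted to common delimiters.
    dialect = csv.Sniffer().sniff(text[:1024], delimiters=",;\t")
    return list(csv.reader(io.StringIO(text, newline=""), dialect))


# A semicolon-separated file saved as Latin-1, and a tab-separated one.
semicolon_latin1 = "name;city\nJosé;Zaragoza\n".encode("latin-1")
tab_separated = b"name\tcity\nAlice\tGalway\n"

print(read_rows(semicolon_latin1))
print(read_rows(tab_separated))
```

Both guesses are heuristics and can be wrong, which is exactly the weakness listed above.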

Page 19: Beyond the Visionaries - Keio University
Page 20: Beyond the Visionaries - Keio University

Architecture, costs & benefits

Source: http://www.w3.org/2005/Talks/1107-iswc-tbl/

Page 21: Beyond the Visionaries - Keio University

Architecture: RDF+CSV

+ CSV

CSV-to-RDF converter

CSV

Declarative CSV-to-RDF mappings
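
What a declarative CSV-to-RDF mapping might look like can be sketched in Python. This is not the format of any of the proposals on the following slides: the `MAPPING` structure, the function names and the example.org URIs are all hypothetical. The sketch only shows the idea that the mapping is data (a subject URI template plus one predicate per column) and the converter is generic.

```python
import csv
import io

# Hypothetical declarative mapping: a URI template for each row's subject,
# plus one predicate URI per CSV column. All URIs are made up.
MAPPING = {
    "subject": "http://example.org/city/{id}",
    "columns": {
        "name": "http://example.org/vocab/name",
        "country": "http://example.org/vocab/country",
    },
}


def escape(value: str) -> str:
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (value.replace("\\", "\\\\").replace('"', '\\"')
                 .replace("\n", "\\n").replace("\r", "\\r"))


def csv_to_ntriples(csv_text: str, mapping: dict) -> list[str]:
    """Convert each CSV row into N-Triples lines according to the mapping."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text, newline="")):
        # Fill the URI template from the row (no URL-encoding in this sketch).
        subject = mapping["subject"].format(**row)
        for column, predicate in mapping["columns"].items():
            value = row.get(column)
            if value:  # skip empty or missing cells
                triples.append(f'<{subject}> <{predicate}> "{escape(value)}" .')
    return triples


sample = "id,name,country\n1,Galway,Ireland\n2,Tokyo,Japan\n"
for line in csv_to_ntriples(sample, MAPPING):
    print(line)
```

The publisher keeps working with the CSV file; only the small mapping has to be written once per dataset.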

Page 22: Beyond the Visionaries - Keio University

“CSV on the Web” Working Group at W3C

https://www.w3.org/2013/csvw/

Page 23: Beyond the Visionaries - Keio University

Linked CSV (Jeni Tennison)

http://jenit.github.io/linked-csv/

Page 24: Beyond the Visionaries - Keio University

Tarql (DERI/Insight work)

https://github.com/cygri/tarql Using SPARQL as a CSV-to-RDF mapping language

Page 25: Beyond the Visionaries - Keio University

CSV-LD proposal (Gregg Kellogg)

•  Re-use JSON-LD’s contexts
•  Embed a context in the CSV file
•  Or link to the context file in the 2nd line

https://www.w3.org/2013/csvw/wiki/CSV-LD

Page 26: Beyond the Visionaries - Keio University

Clients and Servers

Page 27: Beyond the Visionaries - Keio University

Back in 1996…

•  On the early World Wide Web (WWW), there was a single dominant client

80% market share in 1996

Page 28: Beyond the Visionaries - Keio University

Clients for the Web of Data

•  Sometimes only SPARQL
•  Sometimes a Data Portal custom-built for a specific dataset and use case
•  Generic RDF browsers like Tabulator, Marbles, Disco, LinkSailor
•  Many very different ideas!


Page 29: Beyond the Visionaries - Keio University

We still don’t know what a client for the Web of Linked Data really is.

•  Data publishers don’t know how to test their data.

•  Data publishers don’t get immediate benefit from publishing.

•  Client development is splintered.
•  Improving the architecture is hard because we don’t know the use case.

Page 30: Beyond the Visionaries - Keio University

Architecture

Source: http://www.w3.org/2005/Talks/1107-iswc-tbl/

Page 31: Beyond the Visionaries - Keio University

Double Bus and Mashup Sites

Page 32: Beyond the Visionaries - Keio University

The generic client (“Netscape”) for the Web of Data could be a mash-up engine

Page 33: Beyond the Visionaries - Keio University

Features

•  Engine for data mash-up sites
•  Can work on arbitrary datasets (not hard-coded)
•  May cache data locally for performance
•  SPARQL over all mash-up data
•  Full-text search over all mash-up data
•  Plug-ins for advanced user interfaces
•  Many systems are already 75% there!
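
A toy version of such an engine can be sketched in Python to make the feature list concrete. Everything here is an assumption for illustration: the class name `MashupEngine`, the `ex:` identifiers and the sample data are made up, and a single triple-pattern match stands in for real SPARQL. The sketch covers three of the features: loading arbitrary datasets into a local cache, querying across all of them, and full-text search across all of them.

```python
from collections import defaultdict

Triple = tuple[str, str, str]


class MashupEngine:
    """Toy mash-up engine: merges triples from any number of sources into a local cache."""

    def __init__(self) -> None:
        self.triples: set[Triple] = set()  # local cache of all loaded data
        self.index: dict[str, set[Triple]] = defaultdict(set)  # word -> triples

    def load(self, triples) -> None:
        """Add triples from one source; nothing about the dataset is hard-coded."""
        for t in triples:
            if t not in self.triples:
                self.triples.add(t)
                # Index every whitespace-separated word of the object value.
                for word in t[2].lower().split():
                    self.index[word].add(t)

    def match(self, s=None, p=None, o=None) -> list[Triple]:
        """Single triple-pattern query; None is a wildcard (a stand-in for SPARQL)."""
        return sorted(t for t in self.triples
                      if (s is None or t[0] == s)
                      and (p is None or t[1] == p)
                      and (o is None or t[2] == o))

    def search(self, word: str) -> list[Triple]:
        """Full-text search over object values across all loaded sources."""
        return sorted(self.index.get(word.lower(), set()))


engine = MashupEngine()
# Two hypothetical sources describing overlapping things.
engine.load([("ex:galway", "ex:name", "Galway"),
             ("ex:galway", "ex:locatedIn", "ex:ireland")])
engine.load([("ex:ireland", "ex:name", "Ireland"),
             ("ex:galway", "ex:note", "harbour city")])

print(engine.match(s="ex:galway"))  # statements from both sources
print(engine.search("harbour"))
```

Because both sources use the same identifier for the same thing, a query over the merged cache returns statements from both without any source-specific code.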

Page 34: Beyond the Visionaries - Keio University

Technology Adoption Lifecycle

Source: http://www.biznology.com/2013/07/why-your-social-business-platform-doesnt-have-100-adoption/

Geoffrey A. Moore, Crossing the Chasm

Page 35: Beyond the Visionaries - Keio University

Summary

•  Embrace convenient, popular formats like CSV
•  The “early majority” will publish data if there is an immediate benefit
•  Focus on data integrators, not just data publishers
•  A generic mash-up engine could solve several architectural problems