
Web of Data - Introduction (english)


DESCRIPTION

Introduction to the web of data / linked data / RDF concepts. Application examples targeted at a scientific audience.


Page 1: Web of Data - Introduction (english)

Web of data

Thomas Francart, sparna.fr

This work can be freely reused and shared, including for commercial purposes, provided you cite the author (Thomas Francart) and you place your own work under the same licence. For more information, see the licence.

Credits: This work remixes elements from Fabien Gandon, Serge Garlatti and Pierre-Yves Vandenbussche

Page 2: Web of Data - Introduction (english)

The web for a human

Page 3: Web of Data - Introduction (english)


The Man Who Mistook His Wife for a Hat : And Other Clinical Tales by

In his most extraordinary book, "one of the great clinical writers of the 20th century" (The New York Times) recounts the case histories of patients lost in the bizarre, apparently inescapable world of neurological disorders. Oliver Sacks's The Man Who Mistook His Wife for a Hat tells the stories of individuals afflicted with fantastic perceptual and intellectual aberrations: patients who have lost their memories and with them the greater part of their pasts; who are no longer able to recognize people and common objects; who are stricken with violent tics and grimaces or who shout involuntary obscenities; whose limbs have become alien; who have been dismissed as retarded yet are gifted with uncanny artistic or mathematical talents.

If inconceivably strange, these brilliant tales remain, in Dr. Sacks's splendid and sympathetic telling, deeply human. They are studies of life struggling against incredible adversity, and they enable us to enter the world of the neurologically impaired, to imagine with our hearts what it must be to live and feel as they do. A great healer, Sacks never loses sight of medicine's ultimate responsibility: "the suffering, afflicted, fighting human subject."

Find other books in : Neurology Psychology

Search books by terms :

Our rating :

Oliver W. Sacks

Oliver Sacks

Page 4: Web of Data - Introduction (english)

The same web for a machine

Page 5: Web of Data - Introduction (english)

[Illustration: the book page from slide 3, shown as a machine reads it: an undifferentiated string of characters with no recognizable title, author, description or rating.]

Page 6: Web of Data - Introduction (english)

The web of data is an extension of the existing web that adds structured data for machines

Page 7: Web of Data - Introduction (english)

Chapter I: web of data to Structure and Identify

Page 8: Web of Data - Introduction (english)

Why structure content?

Page 9: Web of Data - Introduction (english)

To have smarter

information access

internally and/or

Page 10: Web of Data - Introduction (english)

Synonymy

Yacht ?

Boat ?

Ship ?

… in a bottle, a vial, a flask?

Page 11: Web of Data - Introduction (english)

Polysemy (English and French!)

Page 12: Web of Data - Introduction (english)

Multilingualism

Page 13: Web of Data - Introduction (english)

Search on the web: quick vegan pizza recipe

Assessing the relevance of the results and reusing them can be done only by… you.

What if I want to sort by cooking time? By calories?
What if I need to create an Excel spreadsheet of the recipes?

Page 14: Web of Data - Introduction (english)

Let’s structure descriptions with atomic information

subject verb complement

Page 15: Web of Data - Introduction (english)

More formal description

Tino’s pizza is a pizza recipe
Tino’s pizza has ingredient tomato
Tino’s pizza has ingredient mozzarella
Tino’s pizza has ingredient mushrooms
Tino’s pizza is in category easy
Tino’s pizza is prepared in 20 min
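The statements on this slide can be sketched as plain (subject, verb, complement) tuples. This is only an illustration of the idea of atomic statements, assuming simple strings for every element; the function name `describe` is made up for this example.

```python
from collections import defaultdict

# Each statement is one (subject, verb, complement) tuple.
TRIPLES = [
    ("Tino's pizza", "is a", "pizza recipe"),
    ("Tino's pizza", "has ingredient", "tomato"),
    ("Tino's pizza", "has ingredient", "mozzarella"),
    ("Tino's pizza", "has ingredient", "mushrooms"),
    ("Tino's pizza", "is in category", "easy"),
    ("Tino's pizza", "is prepared in", "20 min"),
]


def describe(subject, triples):
    """Group every complement stated about one subject by verb."""
    description = defaultdict(list)
    for s, verb, complement in triples:
        if s == subject:
            description[verb].append(complement)
    return dict(description)


print(describe("Tino's pizza", TRIPLES))
```

Because each statement is atomic, adding a new fact about the recipe is just appending one more tuple; nothing else has to change.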

Page 16: Web of Data - Introduction (english)

Yes, but… how can we be unambiguous in these descriptions?

« has ingredient », « contains », « a pour ingrédient »… ?

Page 17: Web of Data - Introduction (english)

By using a common interpretation of these descriptions, using shared vocabularies.

Also called ontologies, they give an unambiguous meaning to verbs, subject categories and complements.

Page 18: Web of Data - Introduction (english)

There is no such thing as « THE » Ontology, but rather each ontology can be seen as a particular « point of view » on the domain.

And ontologies can be aligned, shared and connected to make these « points of view » interoperable.

Page 19: Web of Data - Introduction (english)
Page 20: Web of Data - Introduction (english)

More formal description

ex:pizza23 rdf:type pizza recipe
ex:pizza23 food:hasIngredient tomato
ex:pizza23 food:hasIngredient mozzarella
ex:pizza23 food:hasIngredient mushroom
ex:pizza23 dc:subject myData:easy
ex:pizza23 schema:cookingTime 20 min
ex:pizza23 rdfs:label « Tino’s pizza »
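The prefixed names on this slide are shorthand for full URIs. A minimal sketch of that expansion, assuming a hand-written prefix table: the `rdf`, `rdfs`, `dc` and `schema` namespaces are the commonly published ones, while the `ex`, `food` and `myData` namespaces are placeholders invented for this example.

```python
# Prefix table. rdf, rdfs, dc and schema are the usual published namespaces;
# ex, food and myData are hypothetical namespaces made up for this sketch.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "dc": "http://purl.org/dc/terms/",
    "schema": "http://schema.org/",
    "ex": "http://example.org/recipes/",
    "food": "http://example.org/food#",
    "myData": "http://example.org/mydata/",
}


def expand(term):
    """Turn 'prefix:local' into a full URI; leave any other string untouched."""
    prefix, separator, local = term.partition(":")
    if separator and prefix in PREFIXES:
        return PREFIXES[prefix] + local
    return term


for term in ("ex:pizza23", "rdf:type", "dc:subject", "20 min"):
    print(term, "->", expand(term))
```

Plain values such as "20 min" pass through unchanged, since they are literals rather than identifiers.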

Page 21: Web of Data - Introduction (english)

How are these rich snippets generated?

Page 22: Web of Data - Introduction (english)

More formal question

?smthg rdf:type pizza recipe
?smthg schema:cookingTime < 20 min
?smthg dc:subject vegan
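The question on this slide amounts to finding every subject that satisfies three patterns at once. A rough in-memory sketch of that matching, assuming cooking times stored as minutes; the sample data and the helper names `subjects_where` and `quick_vegan_pizzas` are invented for the illustration, and a real deployment would typically use a SPARQL engine instead.

```python
# Hypothetical sample data: cooking times are plain integers (minutes).
TRIPLES = [
    ("ex:pizza23", "rdf:type", "pizza recipe"),
    ("ex:pizza23", "schema:cookingTime", 15),
    ("ex:pizza23", "dc:subject", "vegan"),
    ("ex:pizza42", "rdf:type", "pizza recipe"),
    ("ex:pizza42", "schema:cookingTime", 45),
    ("ex:pizza42", "dc:subject", "vegan"),
    ("ex:salad7", "rdf:type", "salad recipe"),
    ("ex:salad7", "schema:cookingTime", 5),
    ("ex:salad7", "dc:subject", "vegan"),
]


def subjects_where(triples, verb, test):
    """Subjects having at least one complement for `verb` that passes `test`."""
    return {s for s, v, c in triples if v == verb and test(c)}


def quick_vegan_pizzas(triples, max_minutes=20):
    """Intersect the three patterns of the question: type, time and subject."""
    is_pizza = subjects_where(triples, "rdf:type", lambda c: c == "pizza recipe")
    is_quick = subjects_where(triples, "schema:cookingTime", lambda c: c < max_minutes)
    is_vegan = subjects_where(triples, "dc:subject", lambda c: c == "vegan")
    return sorted(is_pizza & is_quick & is_vegan)


print(quick_vegan_pizzas(TRIPLES))
```

The shared variable ?smthg corresponds to the set intersection: only subjects matched by all three patterns remain.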

Page 23: Web of Data - Introduction (english)

Additional facets

Page 24: Web of Data - Introduction (english)

Custom search

Page 25: Web of Data - Introduction (english)

« Knowledge Graph »

Page 26: Web of Data - Introduction (english)

• Vocabulary to structure data in HTML pages
  – Made by and for the big search engines

• Started mid-2011 by Yahoo!, Bing and Google

• + Yandex (Russian)

• Working group led by Dan Brickley

• Relies on HTML5 (Microdata and RDFa)

Page 27: Web of Data - Introduction (english)

Thing

Page 28: Web of Data - Introduction (english)
Page 29: Web of Data - Introduction (english)

RDFa syntax

<div resource="/billets/probleme-platon" prefix="dc: http://purl.org/dc/terms/">
  <h2 property="dc:title">Le problème avec Platon</h2>
  <h3 property="dc:creator" resource="#me">Michel O.</h3>
</div>

<div class="sidebar" vocab="http://xmlns.com/foaf/0.1/" resource="#me" typeof="Person">
  <p>
    <span property="name">Michel O.</span>,
    Email: <a property="mbox" href="mailto:[email protected]">[email protected]</a>
  </p>
  <div>
    <ul>
      <li property="knows" typeof="Person">
        <a property="homepage" href="http://exemple.fr/platon">
          <span property="name">Platon</span>
        </a>
      </li>
    </ul>
  </div>
</div>

Page 30: Web of Data - Introduction (english)

Microdata syntax

<div itemscope itemtype="http://schema.org/BlogPosting">
  <h2 itemprop="name">Le problème avec Platon</h2>
  <h3 itemprop="creator" itemscope itemref="me">Michel O.</h3>
</div>

<div class="sidebar" id="me" itemscope itemtype="http://schema.org/Person">
  <p>
    <span itemprop="name">Michel O.</span>,
    Email: <a itemprop="email" href="mailto:[email protected]">[email protected]</a>
  </p>
  <div>
    <ul>
      <li itemprop="knows" itemscope itemtype="http://schema.org/Person">
        <a itemprop="url" href="http://exemple.fr/platon">
          <span itemprop="name">Platon</span>
        </a>
      </li>
    </ul>
  </div>
</div>
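To give a feel for how a machine reads such markup, here is a deliberately simplified sketch using Python's standard `html.parser`. It only collects `itemprop` names with their value (the `href` for links, the text otherwise) and ignores the nesting introduced by `itemscope`, so it is not a conformant Microdata parser. The class name `ItempropCollector` and the sample snippet are assumptions made for this example.

```python
from html.parser import HTMLParser


class ItempropCollector(HTMLParser):
    """Collect (itemprop, value) pairs: href for links, text content otherwise."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._pending = None  # itemprop name still waiting for its text

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        name = attributes.get("itemprop")
        if name is None:
            return
        if tag == "a" and "href" in attributes:
            self.pairs.append((name, attributes["href"]))
        else:
            self._pending = name

    def handle_data(self, data):
        text = data.strip()
        if self._pending and text:
            self.pairs.append((self._pending, text))
            self._pending = None


SAMPLE = """
<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Platon</span>
  <a itemprop="url" href="http://exemple.fr/platon">home page</a>
</div>
"""

collector = ItempropCollector()
collector.feed(SAMPLE)
collector.close()
print(collector.pairs)
```

The point is that the attributes, not the visible text layout, carry the structure: the same values could be styled in any way without changing what is extracted.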

Page 31: Web of Data - Introduction (english)

Which one should I choose?

RDFa Lite vs. Microdata

• Same number of attributes
• Same complexity
• 99% same expressivity
• Same support in schema.org

Page 32: Web of Data - Introduction (english)

Which one should I choose?

RDFa Lite vs. Microdata

• RDFa: compatible with RDF world (URIs, triples, parsers)
• RDFa: more stable, more widely deployed
• RDFa core: more possibilities
• Facebook does not support Microdata
• 99% of microdata markup encodes schema.org

Page 33: Web of Data - Introduction (english)

By what means do ontologies unambiguously identify subjects, verbs and complements?

Page 34: Web of Data - Introduction (english)

Using URIs

http://mydomain.org/mypath/myresource

Page 35: Web of Data - Introduction (english)

URL: identifies what exists on the web

http://mon.site.fr

URI: identifies, on the web, what exists

http://animaux.fr/mon-zebre

Fabien Gandon : http://fr.slideshare.net/fabien_gandon

Page 36: Web of Data - Introduction (english)

Good practice : on the web of data, every URI is also a URL

URL : phone number

URI : social security number

Page 37: Web of Data - Introduction (english)

IRI: Internationalized Resource Identifier

Unicode URIs
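An IRI can be mapped to a plain ASCII URI by percent-encoding its non-ASCII characters as UTF-8 bytes. The sketch below shows that mapping with the standard library, under the assumption that the host name is already ASCII (a non-ASCII host would need separate IDNA handling). The accented address is an illustrative variant of the zebra URI used earlier, not one taken from the slides.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def iri_to_uri(iri):
    """Percent-encode non-ASCII characters in the path, query and fragment."""
    scheme, netloc, path, query, fragment = urlsplit(iri)
    return urlunsplit((
        scheme,
        netloc,  # assumed ASCII here; non-ASCII hosts would need IDNA encoding
        quote(path, safe="/%"),
        quote(query, safe="=&%"),
        quote(fragment, safe="%"),
    ))


print(iri_to_uri("http://animaux.fr/mon-zèbre"))
```

An identifier that is already pure ASCII comes back unchanged, so the function can be applied to URIs and IRIs alike.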

Page 38: Web of Data - Introduction (english)

Chapter II: web of data to Publish

Page 39: Web of Data - Introduction (english)

Why use web of data standards to publish data?

Page 40: Web of Data - Introduction (english)

To share data with partners, applications, services…

Page 41: Web of Data - Introduction (english)

What is the simplest mode of communication ?

« peer to peer » « hub and spoke »

Page 42: Web of Data - Introduction (english)

Publishing data ? Is it Open Data then ?

http://5stardata.info

Open data

Data in the web

Linked data

Louvre is in Paris

Paris = http://fr.dbpedia.org/resource/Paris

Page 43: Web of Data - Introduction (english)

Open Data and web of data

★ Data accessible on the web (in any format, even PDF or JPG)

★★ Structured data (Excel file instead of JPG)

★★★ Non-proprietary format (CSV instead of Excel)

★★★★ Use URIs to identify resources inside the data

★★★★★ Link data to other data sources

http://5stardata.info/
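The five levels above can be read as a cumulative checklist. A small sketch of that reading, assuming each star is only earned when all the previous ones are; the function name `star_rating` and its boolean parameters are invented for this illustration.

```python
def star_rating(on_web, structured, open_format, uses_uris, linked):
    """Count stars, stopping at the first criterion that is not met."""
    stars = 0
    for criterion in (on_web, structured, open_format, uses_uris, linked):
        if not criterion:
            break
        stars += 1
    return stars


# A CSV file published on the web, without URIs or links to other sources.
print(star_rating(on_web=True, structured=True, open_format=True,
                  uses_uris=False, linked=False))
```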

Open Data

Linked data –

web of data

Page 44: Web of Data - Introduction (english)

Chapter III: web of data to Link

Page 45: Web of Data - Introduction (english)

Why link information?

Page 46: Web of Data - Introduction (english)

For example to be able to

integrate data from different sources in a single application.

Page 47: Web of Data - Introduction (english)

Taken from http://graphityhq.com

Page 48: Web of Data - Introduction (english)

Taken from http://graphityhq.com

Page 49: Web of Data - Introduction (english)

http://exemple.com/Elvis plays guitar

http://exemple.com/Elvis lives in Las Vegas

A data source can speak about the same « subject » as another data source

Page 50: Web of Data - Introduction (english)

A data source can use as « complement » a subject defined in another data source

http://data.insee.fr/Paris is in France

Elvis is in concert in http://data.insee.fr/Paris
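Because both sources name Elvis and Paris with the same URIs, their statements can be pooled and traversed together. The sketch below illustrates this with tuples; which source states which fact is an assumption made for the example, and the helper names `merge` and `follow` are hypothetical.

```python
# Two independent sources; the split of facts between them is illustrative.
SOURCE_A = [
    ("http://exemple.com/Elvis", "plays", "guitar"),
    ("http://exemple.com/Elvis", "is in concert in", "http://data.insee.fr/Paris"),
]
SOURCE_B = [
    ("http://exemple.com/Elvis", "lives in", "Las Vegas"),
    ("http://data.insee.fr/Paris", "is in", "France"),
]


def merge(*sources):
    """Union of all triples, without duplicates, keeping first-seen order."""
    seen, merged = set(), []
    for source in sources:
        for triple in source:
            if triple not in seen:
                seen.add(triple)
                merged.append(triple)
    return merged


def follow(triples, subject, first_verb, second_verb):
    """Two-hop lookup: subject --first_verb--> x --second_verb--> result."""
    results = []
    for s1, v1, middle in triples:
        if s1 == subject and v1 == first_verb:
            results += [c for s2, v2, c in triples
                        if s2 == middle and v2 == second_verb]
    return results


everything = merge(SOURCE_A, SOURCE_B)
# In which country is Elvis in concert? Neither source can answer alone.
print(follow(everything, "http://exemple.com/Elvis", "is in concert in", "is in"))
```

No mapping table is needed to join the two sources: the shared URI is the join key.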

Page 51: Web of Data - Introduction (english)

http://exemple.fr/meet

is a

property (linking 2 people)

Thomas

http://exemple.fr/meet

Oliver

A data source can use a « verb » defined in another data source

Page 52: Web of Data - Introduction (english)

From a web of

documents identified by URLs and interlinked by hypertext links…

Page 53: Web of Data - Introduction (english)

… to a web of data identified by URIs and interlinked using triples « subject verb complement »

Page 54: Web of Data - Introduction (english)
Page 55: Web of Data - Introduction (english)

and

Page 56: Web of Data - Introduction (english)

Extraction software

Cultural GPS

Collections access

teaching

accessibility

international

applications

Julien Cojan and Fabien Gandon: http://fr.slideshare.net/JulienCojan/dbpedia-cafein

dbpedia

wikipedia

Page 57: Web of Data - Introduction (english)

Julien Cojan and Fabien Gandon: http://fr.slideshare.net/JulienCojan/dbpedia-cafein

Page 58: Web of Data - Introduction (english)

Find a resource in DBpedia

1. Look up something in Wikipedia
   – « Jack Sparrow »

2. Note the URL of the Wikipedia page
   – http://en.wikipedia.org/wiki/Jack_Sparrow

3. Replace the beginning of the URL with « http://dbpedia.org/resource/ »
   – http://dbpedia.org/resource/Jack_Sparrow
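The substitution described on this slide can be sketched as a small string operation. It assumes the English Wikipedia prefix shown above; the function name `dbpedia_uri` is made up for the example, and no check is made that the resulting resource actually exists.

```python
WIKIPEDIA_PREFIX = "http://en.wikipedia.org/wiki/"
DBPEDIA_PREFIX = "http://dbpedia.org/resource/"


def dbpedia_uri(wikipedia_url):
    """Swap the Wikipedia page prefix for the DBpedia resource prefix."""
    if not wikipedia_url.startswith(WIKIPEDIA_PREFIX):
        raise ValueError(f"not an English Wikipedia page URL: {wikipedia_url}")
    return DBPEDIA_PREFIX + wikipedia_url[len(WIKIPEDIA_PREFIX):]


print(dbpedia_uri("http://en.wikipedia.org/wiki/Jack_Sparrow"))
```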

Page 59: Web of Data - Introduction (english)

Chapter IV: (Re-)use

Page 60: Web of Data - Introduction (english)

Web of data

Blablabla,blablablabla

He said all of that was already working, right ?

Image background taken from the bits blog: http://nurdcartoon.blogspot.com/

Page 61: Web of Data - Introduction (english)

Find what these three people have in common:
- Pierre Curie: French physicist
- Boutros Boutros Ghali: Egyptian diplomat
- Jackie Kennedy: JFK’s wife

Page 62: Web of Data - Introduction (english)

http://relfinder.dbpedia.org

Page 63: Web of Data - Introduction (english)

Allow researchers to

publish their data

http://www.nakala.fr

Page 64: Web of Data - Introduction (english)

for your data

1. Persistent identifiers
2. Persistent access to data file
3. Data archival
4. Metadata publishing
   1. URIs and content negotiation
   2. OAI-PMH
   3. SPARQL endpoint
5. In the future… linking (to DBPedia)?

Page 65: Web of Data - Introduction (english)
Page 66: Web of Data - Introduction (english)

1. Uploading / publishing

Page 67: Web of Data - Introduction (english)

2. Access

• Data (embeddable in another website)
  – http://www.nakala.fr/data/11280/1b2c0d4f

• Metadata
  – Human or machine version
    • http://www.nakala.fr/metadata/11280/1b2c0d4f
  – Human version
    • http://www.nakala.fr/page/data/11280/1b2c0d4f
  – Machine version
    • http://www.nakala.fr/data/data/11280/1b2c0d4f
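The single metadata address that serves either a human or a machine version suggests content negotiation: the client states which representation it wants in the HTTP `Accept` header. The sketch below only builds such a request without sending it. The media types chosen here (`application/rdf+xml` and `text/html`) are assumptions, since the slides do not say which formats the service returns, and `metadata_request` is a hypothetical helper name.

```python
from urllib.request import Request


def metadata_request(uri, machine=True):
    """Build (but do not send) a GET request asking for one representation."""
    # Assumed media types: RDF/XML for machines, HTML for humans.
    accept = "application/rdf+xml" if machine else "text/html"
    return Request(uri, headers={"Accept": accept})


request = metadata_request("http://www.nakala.fr/metadata/11280/1b2c0d4f")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual call; that step is left out so the example runs without network access.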

Page 68: Web of Data - Introduction (english)

3. Harvest or query

• OAI-PMH publishing (your data only)
  – https://www.nakala.fr/oai/11280/93ec8e76?verb=ListRecords&metadataPrefix=oai_dc

• SPARQL querying (all the data)
  – http://www.nakala.fr/sparql
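The harvesting address above is a base URL followed by two OAI-PMH query parameters. A minimal sketch of how such a URL can be assembled with the standard library; the helper name `oai_list_records_url` is invented for the example, and only the `ListRecords` verb and `oai_dc` prefix shown on the slide are used.

```python
from urllib.parse import urlencode


def oai_list_records_url(base_url, metadata_prefix="oai_dc"):
    """Append the ListRecords verb and the metadata prefix to an OAI base URL."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


print(oai_list_records_url("https://www.nakala.fr/oai/11280/93ec8e76"))
```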

Page 69: Web of Data - Introduction (english)

Share data to

connect scientists & enable research

discovery

http://vivoweb.org

Page 70: Web of Data - Introduction (english)

What is VIVO?

• A web portal that can be deployed in research institutions…
• … and can be fed with data about
  – Researchers
  – Labs
  – Publications
  – Events
  – And more…
• … and lets users search, navigate and edit that data…
• … and publishes the data back for others to reuse.

Page 71: Web of Data - Introduction (english)

What is VIVO?

• Example installations
  – Meta-VIVO: http://vivo.vivoweb.org
  – U. Florida: https://vivo.ufl.edu/
  – Bournemouth: http://staffprofiles.bournemouth.ac.uk/

• (find others at vivoweb.org)

Page 72: Web of Data - Introduction (english)

Visualizations

• http://vivo.cns.iu.edu/gallery.html

Page 73: Web of Data - Introduction (english)

vivosearch.org

• Search on data across multiple institutions
• Possible only because the data is shared!

Page 74: Web of Data - Introduction (english)

Interinstitutional collaboration dataviz

• http://xcite.hackerceo.org/VIVOviz/visualization.html

• Possible only because the data is shared…

• … and the data is talking about the same “thing” (here, the same publication)

Page 75: Web of Data - Introduction (english)

Using data from the web to enrich content

reading

http://labs.sparna.fr

http://dev.presek-i.com/onmt_demo/

Page 76: Web of Data - Introduction (english)
Page 77: Web of Data - Introduction (english)

Create mashups with data from the web

http://labs.antidot.net/museesdefrance

Page 78: Web of Data - Introduction (english)
Page 79: Web of Data - Introduction (english)

Use data from the web to power an API

http://seevl.net

Page 80: Web of Data - Introduction (english)

“The data seevl utilizes come from YouTube, Musicbrainz, Freebase, DBPedia, Google Plus, and Facebook, and other sources”.

Page 81: Web of Data - Introduction (english)

Publish

a library catalogue

http://data.bnf.fr

Page 82: Web of Data - Introduction (english)

http://www.rencontres-numeriques.org/2013/mediation/docs/rn2013-BNF-opendata.pptm

General catalogue (12 M)

Digitized collections (2.5 M)

BnF Archives & Manuscrits

Web pages for humans

Structured data for machines

Page 83: Web of Data - Introduction (english)

http://www.rencontres-numeriques.org/2013/mediation/docs/rn2013-BNF-opendata.pptm

data.bnf.fr (October 2013):
• 200 000 authors, 170 000 themes, 92 000 works
• Objective: all the BnF catalogues by the end of 2015?

data.bnf.fr:
• +70 000 unique visitors per month
• +80% from search engines
• 50-70% conversion to Gallica and catalogues

Page 84: Web of Data - Introduction (english)

Conclusion

Structuring
Identifying
Publishing
Linking
(Re-)using

Page 85: Web of Data - Introduction (english)

http://everywhereishere2009.blogspot.fr/2009/08/first-thoughts-designing-new-knowledge.html (awaiting the author’s permission)

Page 86: Web of Data - Introduction (english)

http://everywhereishere2009.blogspot.fr/2009/08/first-thoughts-designing-new-knowledge.html (awaiting the author’s permission)

Page 87: Web of Data - Introduction (english)

Thomas FRANCART
sparna.fr
Credits: Fabien Gandon, Serge Garlatti, Pierre-Yves Vandenbussche