
Copyright 2008 Digital Enterprise Research Institute. All rights reserved. www.deri.org

Digital Enterprise Research Institute www.deri.ie

[email protected]

The state of RDF in Drupal 7

DrupalCon Paris 2009

Stéphane “scor” Corlosquet


Presentation outline

• The current web
• The vision of the Semantic Web
• Semantic Web technologies
• Initiatives and projects
  ◦ Data portability
  ◦ Linking Open Data

The current web


Many web applications


Many information silos


* Source: Pidgin Technologies, www.pidgintech.com


Current Web

• web pages
  ◦ 20 billion public pages
  ◦ 900 billion deep web pages
  ◦ 62 links per page
  ◦ = 55 trillion links in the full web

http://www.kk.org/thetechnium/archives/2007/11/dimensions_of_t.php
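The link estimate on this slide is a simple product, and a minimal Python sketch makes the arithmetic explicit. The constant and function names are illustrative only. Multiplying the slide's own figures directly gives roughly 57 trillion, so the quoted 55 trillion is presumably a rounded value (an assumption; the source does not say how it was rounded).

```python
# Figures quoted on the slide (Kevin Kelly's 2007 estimates).
PUBLIC_PAGES = 20 * 10**9       # 20 billion public pages
DEEP_WEB_PAGES = 900 * 10**9    # 900 billion deep web pages
LINKS_PER_PAGE = 62             # average number of links per page


def estimate_total_links(public_pages, deep_pages, links_per_page):
    """Total page count multiplied by the average number of links per page."""
    return (public_pages + deep_pages) * links_per_page


total_links = estimate_total_links(PUBLIC_PAGES, DEEP_WEB_PAGES, LINKS_PER_PAGE)
print(f"{total_links / 10**12:.1f} trillion links")
```

The estimate assumes, as the source does, that deep web pages carry the same average number of links as public pages.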


Current Web

• web storage
  ◦ 246 exabytes of data (246 billion GB)
• traffic
  ◦ 8 terabytes / s
  ◦ 2 million emails / s

http://www.kk.org/thetechnium/archives/2007/11/dimensions_of_t.php


Current Web

• mostly text and links

[Screenshot: Google search results for "who is webchick?" (about 31,600 results), shown as a page of text snippets and links]


Excerpt shown on the slide (the screenshot starts mid-sentence):

... Machine (one billion from the one billion online PCs) as there are transistors in an Itanium chip. The Machine is a supercomputer where each "transistor" is a computer. A very rough estimate of the computing power of this Machine then is that it contains a billion times a billion, or one quintillion (10^18) transistors. Since only the newest servers have a billion processors, the figure is probably an order of magnitude smaller. When we add the transistors for cell phones, handhelds, it calculates out to about 170 quadrillion (10^17) transistors wired into the Machine.

There are about 100 billion neurons in the human brain. Today the Machine has 5 orders of magnitude more transistors than you have neurons in your head. And the Machine, unlike your brain, is doubling in power every couple of years at the minimum.

In 2003 alone a total of one quintillion transistors were produced, but not all of them are wired up into the Machine. Many transistors made their way into cameras, TVs, GPS units and the like, few of which are currently online. One day they will be. Every chip will eventually connect to the web in some fashion. That would mean we would be adding as many transistors to the Machine in a year as exist right now.

If the Machine has 100 quadrillion transistors, how fast is it running? If we include spam, there are 196 billion emails sent every day. That's 2.2 million per second, or 2 megahertz. Every year 1 trillion text messages are sent. That works out to 31,000 per second, or 31 kilohertz. Each day 14 billion instant messages are sent, at 162 kilohertz. The number of searches runs at 14 kilohertz. Links are clicked at the rate of 520,000 per second, or .5 megahertz.

There are 20 billion visible, searchable web pages and another 900 billion dark, unsearchable, or deep web pages (for instance pages behind passwords or the kind of dynamic page that Amazon will produce when you query it). The average number of links found on each searchable web page is 62. Assuming the same count for dynamic pages, that means there are 55 trillion links in the full web. We could think of each link as a synapse -- a potential connection waiting to be made. There are roughly between 100 billion and 100 trillion synapses in the human brain, which puts the Machine in the same neighborhood as our brains.

Kevin Kelly -- The Technium http://www.kk.org/thetechnium/archives/2007/11/dimensions_of_t.php
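The per-second rates quoted in the excerpt can be checked with a short calculation. The following Python sketch converts the daily and yearly totals from the excerpt into events per second; the helper name `per_second` and the 365-day year are assumptions made for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY  # assumption: non-leap year


def per_second(count, period_seconds):
    """Convert a total count over a period into an average rate per second."""
    return count / period_seconds


emails_per_s = per_second(196e9, SECONDS_PER_DAY)   # 196 billion emails a day
texts_per_s = per_second(1e12, SECONDS_PER_YEAR)    # 1 trillion texts a year
ims_per_s = per_second(14e9, SECONDS_PER_DAY)       # 14 billion IMs a day

for label, rate in [("emails", emails_per_s),
                    ("text messages", texts_per_s),
                    ("instant messages", ims_per_s)]:
    print(f"{label}: {rate:,.0f} per second")
```

The results land close to the excerpt's figures of about 2.2 million, 31,000 and 162,000 per second respectively.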


The vision of the Semantic Web


Giant Global Graph (2007)

• Transition
  ◦ WWW = content + links
  ◦ GGG = WWW + relationships + descriptions
• Universal medium for data, information and knowledge exchange

Tim Berners-Lee, http://dig.csail.mit.edu/breadcrumbs/node/215


The One machine

• The One machine (Kevin Kelly, 2007)
  ◦ 1.2 billion personal computers
  ◦ 27 million data servers
  ◦ 2.7 billion cell phones
  ◦ 80 million wireless PDAs
  ◦ 600 billion RFID tags in use

http://www.kk.org/thetechnium/archives/2007/11/dimensions_of_t.php


Evolution of the Web


The Key


http://www.flickr.com/photos/11437726@N08/2781739886/

Agree on standards

Open your data


Semantic Web technologies


Links

page1 -> user1

page1 -> book1

page1 -> page24

page1 -> Cats

• Let's give a meaning to the hyperlinks

page1 -hasAuthor-> user1

page1 -isPartOf--> book1

page1 -refersTo--> page24

page1 -isAbout---> Cats

triple: subject -property-> object
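To make the triple idea concrete, here is a minimal Python sketch that stores the four typed links from this slide as (subject, property, object) tuples and looks one of them up. The `Triple` type and the `objects_of` helper are illustrative names, not part of any RDF library.

```python
from collections import namedtuple

# A triple is just three parts: subject -property-> object.
Triple = namedtuple("Triple", ["subject", "property", "object"])

# The typed links from the slide.
triples = [
    Triple("page1", "hasAuthor", "user1"),
    Triple("page1", "isPartOf", "book1"),
    Triple("page1", "refersTo", "page24"),
    Triple("page1", "isAbout", "Cats"),
]


def objects_of(triples, subject, prop):
    """Return every object linked from `subject` through `prop`."""
    return [t.object for t in triples
            if t.subject == subject and t.property == prop]


print(objects_of(triples, "page1", "hasAuthor"))
```

With untyped links the only question one can ask is "what does page1 link to?"; with typed links the question can be narrowed to "who is the author of page1?".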


Graph Model - RDF


Resources on the Semantic Web

• Internet of Things
  ◦ URI: Uniform Resource Identifier
  ◦ http://dbpedia.org/resource/Apple
  ◦ http://dbpedia.org/resource/Apple_Inc
  ◦ http://dbpedia.org/resource/Apple_River
  ◦ http://dbpedia.org/resource/Apple_(band)
  ◦ http://dbpedia.org/resource/Apple_(album)
  ◦ URIs should be dereferenceable


RDF - Describe your data

• Various RDF formats
  ◦ RDF is not XML! XML is only one of the ways to write RDF data, i.e. it is a language/syntax
  ◦ RDF/XML
  ◦ N-Triples
  ◦ Turtle
  ◦ RDFa
• Shortcut notation for URIs: CURIE (Compact URI)
  ◦ prefix:id
    – example: foaf:knows, sioc:User, etc.
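A CURIE is resolved by replacing the prefix with the namespace URI it stands for. The sketch below shows this in Python for the two examples on the slide; the `expand_curie` function and the `PREFIXES` table are illustrative, while the FOAF and SIOC namespace URIs are the published ones.

```python
# Prefix table: maps each CURIE prefix to its namespace URI.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "sioc": "http://rdfs.org/sioc/ns#",
}


def expand_curie(curie, prefixes=PREFIXES):
    """Expand 'prefix:id' into a full URI using the prefix table."""
    prefix, separator, reference = curie.partition(":")
    if not separator or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return prefixes[prefix] + reference


print(expand_curie("foaf:knows"))
print(expand_curie("sioc:User"))
```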


RDF - Describe your data

• Various languages
  ◦ scor knows danbri (English)
  ◦ scor connaît danbri (French)
  ◦ scor danbri (drawing)
• One meaning in RDF
  ◦ scor foaf:knows danbri

[Figure: two graph edges, scor -foaf:knows-> walkah and scor -foaf:knows-> danbri]


RDF - Vocabularies

• Semantic links are categorized in vocabularies
  ◦ Dublin Core - DC
    – title, creator, description, date
  ◦ Friend of a Friend - FOAF
    – name, knows, homepage
  ◦ Description of a Project - DOAP
  ◦ Semantically Interlinked Online Communities - SIOC
  ◦ Simple Knowledge Organization System - SKOS

SPARQL - query the GGG data

• standardized in January 2008
• Example, return the capitals of all African countries:

    PREFIX abc: <http://example.com/exampleOntology#>
    SELECT ?capital ?country
    WHERE {
      ?x abc:cityname ?capital ;
         abc:isCapitalOf ?y .
      ?y abc:countryname ?country ;
         abc:isInContinent abc:Africa .
    }
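To show what the SPARQL example on this slide asks for, here is a rough Python sketch that evaluates the same four triple patterns against a tiny in-memory list of triples. It is not a SPARQL engine: the sample data, the `city:` and `country:` identifiers and the `solve` helper are all made up for illustration, and only plain pattern matching with shared variables is covered.

```python
# Hypothetical sample data (not from the slides).
TRIPLES = [
    ("city:Nairobi", "abc:cityname", "Nairobi"),
    ("city:Nairobi", "abc:isCapitalOf", "country:Kenya"),
    ("country:Kenya", "abc:countryname", "Kenya"),
    ("country:Kenya", "abc:isInContinent", "abc:Africa"),
    ("city:Paris", "abc:cityname", "Paris"),
    ("city:Paris", "abc:isCapitalOf", "country:France"),
    ("country:France", "abc:countryname", "France"),
    ("country:France", "abc:isInContinent", "abc:Europe"),
]

# The WHERE clause of the query, one tuple per triple pattern.
PATTERNS = [
    ("?x", "abc:cityname", "?capital"),
    ("?x", "abc:isCapitalOf", "?y"),
    ("?y", "abc:countryname", "?country"),
    ("?y", "abc:isInContinent", "abc:Africa"),
]


def match_pattern(pattern, triple, binding):
    """Return an extended copy of `binding` if `triple` fits, else None."""
    extended = dict(binding)
    for pattern_term, triple_term in zip(pattern, triple):
        if pattern_term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if extended.setdefault(pattern_term, triple_term) != triple_term:
                return None
        elif pattern_term != triple_term:
            return None
    return extended


def solve(patterns, triples):
    """Join all patterns, keeping only variable bindings that satisfy each."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


results = [(b["?capital"], b["?country"]) for b in solve(PATTERNS, TRIPLES)]
print(results)  # [('Nairobi', 'Kenya')]
```

The shared variables `?x` and `?y` are what join the patterns: Paris is dropped because its country is not bound to abc:Africa in the last pattern.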


Semantic Web practical applications and initiatives


Dataportability

• Merge my social networks between various sites
• Move information from one service to another

Local communities


* Source: Pidgin Technologies, www.pidgintech.com


Many isolated and disparate communities


* Source: Pidgin Technologies, www.pidgintech.com


(De-)centralized profile


http://www.johnbreslin.com/blog/


Decentralized profiles


http://www.johnbreslin.com/blog/


Linking Open Data project


http://richard.cyganiak.de/2007/10/lod/

Sindice - The Semantic Web index

[Screenshot: Sindice search results for the term "europe" (about 54.2 thousand results), listing DBpedia resources such as Category:Birds_of_Europe, Category:Europe, Europe_1, Category:Flora_of_Europe and Europe_(band), each with its triple count and size]

http://sindice.com/


RDF in Drupal


RDF in Drupal core

• RDFa only
  ◦ RDF serialization format recommended by W3C
  ◦ RDF in XHTML
  ◦ Yahoo! SearchMonkey and Google parse it
  ◦ no need to generate another output: human and machine readable document
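The point that one document serves both humans and machines can be illustrated with a small Python sketch that wraps ordinary HTML in RDFa attributes. `about`, `typeof` and `property` are real RDFa attributes; the helper functions, the `/node/1` path and the choice of `sioc:Item` and `dc:title` are assumptions for illustration, not Drupal's actual output. The prefix declarations a real page needs are left out.

```python
from html import escape


def rdfa_property(prop, text):
    """Wrap visible text in a span that also names the RDF property."""
    return f'<span property="{escape(prop)}">{escape(text)}</span>'


def rdfa_resource(about, rdf_type, inner_html):
    """Wrap content in a div identifying the described resource and its type."""
    return (f'<div about="{escape(about)}" typeof="{escape(rdf_type)}">'
            f'{inner_html}</div>')


# Hypothetical node: path, type and title are made up for this example.
markup = rdfa_resource("/node/1", "sioc:Item",
                       rdfa_property("dc:title", "Hello & welcome"))
print(markup)
```

A browser renders only the text "Hello & welcome", while an RDFa parser reads the same markup as a triple stating that /node/1 has that dc:title.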


DrupalCon DC RDFa video


Status of RDF in Drupal 7: architecture

• Semantics at the module level
  ◦ Modules can export data along with their semantics in the format they want
    – Core => RDFa
    – Contrib => RDF/XML, N-Triples and so on
  ◦ No duplicate definition of semantics.
  ◦ Built-in semantics can be altered.
  ◦ The theme layer no longer has to worry about the semantics; it simply outputs them along with the data.
  ◦ Better control over which namespaces are used on a given page, so that only those namespaces are included in the header of the HTML document.

Status of RDF in Drupal 7

• Architecture of the RDF API in core
    – hook_rdf_mapping(): allows modules to define their own RDF mappings
    – hook_rdf_mapping_alter(&$mapping): allows modules to override existing mappings
    – rdf_get_mapping($bundle): returns the mapping for the attributes of the given bundle as an associative array
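The define, alter, get flow of these three functions can be sketched in a language-neutral way. The Python below is only an analogy of the pattern, not Drupal code: the hook registries, the `article` bundle and the shape of the mapping dictionary are assumptions, since the slide does not show the actual array structure.

```python
def node_rdf_mapping():
    """Plays the role of one module's hook_rdf_mapping()."""
    # Assumed structure: bundle -> attribute -> list of RDF properties.
    return {"article": {"title": ["dc:title"], "created": ["dc:date"]}}


def custom_rdf_mapping_alter(mapping):
    """Plays the role of hook_rdf_mapping_alter(&$mapping): edits in place."""
    mapping["article"]["title"] = ["dc:title", "rdfs:label"]


# Hypothetical registries standing in for Drupal's hook discovery.
MAPPING_HOOKS = [node_rdf_mapping]
ALTER_HOOKS = [custom_rdf_mapping_alter]


def rdf_get_mapping(bundle):
    """Collect all mappings, let modules alter them, return one bundle's part."""
    mapping = {}
    for hook in MAPPING_HOOKS:
        mapping.update(hook())
    for alter in ALTER_HOOKS:
        alter(mapping)
    return mapping.get(bundle, {})


print(rdf_get_mapping("article"))
```

Keeping definition and alteration as separate steps is what lets one module ship default semantics while another overrides them without duplicating the definition.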


Status of RDF in Drupal 7

• hook_rdf_mapping()

Status of RDF in Drupal 7

• rendered HTML

Status of RDF in Drupal 7

• What’s already committed
  ◦ RDFa doctype

Status of RDF in Drupal 7

• What’s already committed
  ◦ Common RDF prefix definitions

Status of RDF in Drupal 7

• What’s pending
  ◦ The rest!
  ◦ 1 week for the API
  ◦ 6 weeks for testing (code slush)

Status of RDF in Drupal 7

• Theming layer
  ◦ Hardest part of the work
  ◦ Many tags are hardcoded in the tpl files
    – we want to avoid modifying these; themers should not have to care about RDFa
  ◦ Dilemma
    – centralize everything in the RDF module
    – distribute the RDF in all modules (and patch these modules)

Status of RDF in Drupal 7

[Diagram]
building block modules: page/block, node, field, user, comment, taxonomy
beneficiary modules: blog, forum, book, openid, profile, all contributed modules

Thank you

• Credits
  ◦ Frédéric Marand
  ◦ Florian Lorétan
  ◦ John Breslin
  ◦ John Morahan
  ◦ Mark Birbeck
  ◦ Rolf Guescini
  ◦ Benjamin Doherty
  ◦ Benjamin Melançon
  ◦ Stefan Freudenberg
  ◦ Peter Wolanin
  ◦ Barry Jaspan
  ◦ yched
  ◦ catch
  ◦ ...

Contribute

• IRC: #drupal-rdf
• List of issues to review at http://drupal.org/project/issues/search/drupal?issue_tags=RDF
• Talk to us
• Keynote tomorrow by Dan Brickley
• Code sprint on Saturday