Copyright 2008 Digital Enterprise Research Institute. All rights reserved. www.deri.org
Digital Enterprise Research Institute www.deri.ie
The state of RDF in Drupal 7
DrupalCon Paris 2009
Stéphane “scor” Corlosquet
Presentation outline
• The current web
• The vision of the Semantic Web
• Semantic Web technologies
• Initiatives and projects
  ◦ Data portability
  ◦ Linking Open Data
The current web
Many web applications
Many information silos
* Source: Pidgin Technologies, www.pidgintech.com
Current Web
• web pages
  ◦ 20 billion public pages
  ◦ 900 billion deep web pages
  ◦ 62 links per page
  ◦ = 55 trillion links in the full web
http://www.kk.org/thetechnium/archives/2007/11/dimensions_of_t.php
Current Web
• web storage
  ◦ 246 exabytes of data (246 billion GB)
• traffic
  ◦ 8 terabytes / s
  ◦ 2 million emails / s
http://www.kk.org/thetechnium/archives/2007/11/dimensions_of_t.php
Current Web
• mostly text and links
[Screenshot: Google search results for "who is webchick?", about 31,600 results, captured 30/08/2007]
[…] Machine (one billion from the one billion online PCs) as there are transistors in an Itanium chip. The Machine is a supercomputer where each "transistor" is a computer. A very rough estimate of the computing power of this Machine then is that it contains a billion times a billion, or one quintillion (10^18) transistors. Since only the newest servers have a billion processors, the figure is probably an order of magnitude smaller. When we add the transistors for cell phones and handhelds, it calculates out to about 170 quadrillion (10^17) transistors wired into the Machine.
There are about 100 billion neurons in the human brain. Today the Machine has 5 orders of magnitude more transistors than you have neurons in your head. And the Machine, unlike your brain, is doubling in power every couple of years at the minimum.
In 2003 alone a total of one quintillion transistors were produced, but not all of them are wired up into the Machine. Many transistors made their way into cameras, TVs, GPS units and the like, few of which are currently online. One day they will be. Every chip will eventually connect to the web in some fashion. That would mean we would be adding as many transistors to the Machine in a year as exist right now.
If the Machine has 100 quadrillion transistors, how fast is it running? If we include spam, there are 196 billion emails sent every day. That's 2.2 million per second, or 2 megahertz. Every year 1 trillion text messages are sent. That works out to 31,000 per second, or 31 kilohertz. Each day 14 billion instant messages are sent, at 162 kilohertz. The number of searches runs at 14 kilohertz. Links are clicked at the rate of 520,000 per second, or .5 megahertz.
There are 20 billion visible, searchable web pages and another 900 billion dark, unsearchable, or deep web pages (for instance pages behind passwords or the kind of dynamic page that Amazon will produce when you query it). The average number of links found on each searchable web page is 62. Assuming the same count for dynamic pages, that means there are 55 trillion links in the full web. We could think of each link as a synapse, a potential connection waiting to be made. There are roughly between 100 billion and 100 trillion synapses in the human brain, which puts the Machine in the same neighborhood as our brains.
Kevin Kelly, The Technium, http://www.kk.org/thetechnium/archives/2007/11/dimensions_of_t.php
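The per-second rates and the link count quoted above come from simple division and multiplication. A minimal Python sketch that redoes the arithmetic, using only the figures given in the excerpt (the variable names are illustrative):

```python
# Back-of-envelope check of the figures quoted from Kevin Kelly's article.
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY

emails_per_second = 196e9 / SECONDS_PER_DAY   # 196 billion emails per day
texts_per_second = 1e12 / SECONDS_PER_YEAR    # 1 trillion text messages per year
ims_per_second = 14e9 / SECONDS_PER_DAY       # 14 billion instant messages per day

# Links: (visible pages + deep web pages) * average links per page.
total_links = (20e9 + 900e9) * 62

print(f"emails/s: {emails_per_second:,.0f}")
print(f"texts/s:  {texts_per_second:,.0f}")
print(f"IMs/s:    {ims_per_second:,.0f}")
print(f"links:    {total_links:.3g}")
```

The rates agree with the quoted 2.2 million, 31,000 and 162,000 per second. The link product comes out at roughly 57 trillion, slightly above the quoted 55 trillion, which suggests the published inputs are rounded.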
The vision of the Semantic Web
Giant Global Graph (2007)
• Transition
  ◦ WWW = content+links
  ◦ GGG = WWW+relationships+descriptions
• Universal medium for data, information and knowledge exchange
Tim Berners-Lee, http://dig.csail.mit.edu/breadcrumbs/node/215
The One machine
• The One machine (Kevin Kelly, 2007)
  ◦ 1.2 billion personal computers
  ◦ 27 million data servers
  ◦ 2.7 billion cell phones
  ◦ 80 million wireless PDAs
  ◦ 600 billion RFID tags in use
http://www.kk.org/thetechnium/archives/2007/11/dimensions_of_t.php
Evolution of the Web
The Key
http://www.flickr.com/photos/11437726@N08/2781739886/
Agree on standards
Open your data
Semantic Web technologies
Links
page1 -> user1
page1 -> book1
page1 -> page24
page1 -> Cats
• Let's give a meaning to the hyperlinks
page1 -hasAuthor-> user1
page1 -isPartOf--> book1
page1 -refersTo--> page24
page1 -isAbout---> Cats
triple: subject -property-> object
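The subject, property, object pattern maps directly onto a three-field record. A minimal Python sketch, assuming the four labeled links from the slide as data (the `Triple` type and the `objects` helper are illustrative names, not part of any RDF library):

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    prop: str
    obj: str


# The four labeled links from the slide, one triple each.
triples = [
    Triple("page1", "hasAuthor", "user1"),
    Triple("page1", "isPartOf", "book1"),
    Triple("page1", "refersTo", "page24"),
    Triple("page1", "isAbout", "Cats"),
]


def objects(subject: str, prop: str) -> list[str]:
    """Return every object linked from `subject` through `prop`."""
    return [t.obj for t in triples if t.subject == subject and t.prop == prop]


print(objects("page1", "isAbout"))  # ['Cats']
```

Because every link carries its property, a program can ask what a page is about rather than only which pages it links to.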
Graph Model - RDF
Resources on the Semantic Web
• Internet of Things
  ◦ URI: Uniform Resource Identifier
  ◦ http://dbpedia.org/resource/Apple
  ◦ http://dbpedia.org/resource/Apple_Inc
  ◦ http://dbpedia.org/resource/Apple_River
  ◦ http://dbpedia.org/resource/Apple_(band)
  ◦ http://dbpedia.org/resource/Apple_(album)
  ◦ URIs should be dereferenceable
RDF - Describe your data
• Various RDF formats
  ◦ RDF is not XML! XML is only one of the syntaxes in which RDF data can be written
  ◦ RDF/XML
  ◦ N-Triples
  ◦ Turtle
  ◦ RDFa
• shortcut notation for URIs: CURIE (Compact URI)
  ◦ prefix:id
    – example: foaf:knows, sioc:User, etc.
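A CURIE is resolved by replacing the prefix with its namespace URI. A minimal Python sketch, assuming a small prefix table (the `PREFIXES` map and `expand_curie` function are illustrative names; the two namespace URIs are the published FOAF and SIOC ones):

```python
# Prefix-to-namespace map for the two vocabularies used in the example.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "sioc": "http://rdfs.org/sioc/ns#",
}


def expand_curie(curie: str) -> str:
    """Expand 'prefix:id' into a full URI using PREFIXES."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return PREFIXES[prefix] + local_id


print(expand_curie("foaf:knows"))  # http://xmlns.com/foaf/0.1/knows
print(expand_curie("sioc:User"))   # http://rdfs.org/sioc/ns#User
```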
RDF - Describe your data
• Various languages
  ◦ scor knows danbri (English)
  ◦ scor connaît danbri (French)
  ◦ scor danbri (drawing)
• One meaning in RDF
  ◦ scor foaf:knows danbri
[Diagram: scor -foaf:knows-> walkah; scor -foaf:knows-> danbri]
RDF - Vocabularies
• Semantic links are categorized in vocabularies
  ◦ Dublin Core - DC
    – title, creator, description, date
  ◦ Friend of a Friend - FOAF
    – name, knows, homepage
  ◦ Description of a Project - DOAP
  ◦ Semantically Interlinked Online Communities - SIOC
  ◦ Simple Knowledge Organization System - SKOS
SPARQL - query the GGG data
  ◦ standardized in January 2008
  ◦ Example, return the capital of all the African countries:
    PREFIX abc: <http://example.com/exampleOntology#>
    SELECT ?capital ?country
    WHERE {
      ?x abc:cityname ?capital ;
         abc:isCapitalOf ?y .
      ?y abc:countryname ?country ;
         abc:isInContinent abc:Africa .
    }
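The example query above joins four triple patterns through the shared variables ?x and ?y. A minimal Python sketch of that join over a tiny in-memory data set; the `ex:` resources and their values are invented for illustration, and this hand-written loop only mimics the result of the query, not how a SPARQL engine works:

```python
# Tiny in-memory data set; resources and values are invented for illustration.
TRIPLES = [
    ("ex:nairobi", "abc:cityname", "Nairobi"),
    ("ex:nairobi", "abc:isCapitalOf", "ex:kenya"),
    ("ex:kenya", "abc:countryname", "Kenya"),
    ("ex:kenya", "abc:isInContinent", "abc:Africa"),
    ("ex:paris", "abc:cityname", "Paris"),
    ("ex:paris", "abc:isCapitalOf", "ex:france"),
    ("ex:france", "abc:countryname", "France"),
    ("ex:france", "abc:isInContinent", "abc:Europe"),
]


def values(subject, prop):
    """All objects of triples matching (subject, prop, ?)."""
    return [o for s, p, o in TRIPLES if s == subject and p == prop]


def african_capitals():
    """Hand-written equivalent of the WHERE clause: join the four patterns."""
    results = []
    for x, p, y in TRIPLES:
        if p != "abc:isCapitalOf":
            continue
        if "abc:Africa" not in values(y, "abc:isInContinent"):
            continue
        for capital in values(x, "abc:cityname"):
            for country in values(y, "abc:countryname"):
                results.append((capital, country))
    return results


print(african_capitals())  # [('Nairobi', 'Kenya')]
```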
Semantic Web practical applications and initiatives
Data portability
• Merge my social networks between various sites
• Move information from one service to another
Local communities
* Source: Pidgin Technologies, www.pidgintech.com
Many isolated and disparate communities
* Source: Pidgin Technologies, www.pidgintech.com
(De-)centralized profile
http://www.johnbreslin.com/blog/
Decentralized profiles
http://www.johnbreslin.com/blog/
Linking Open Data project
http://richard.cyganiak.de/2007/10/lod/
Sindice - The Semantic Web index
[Screenshot: Sindice search results for the term "europe", listing DBpedia resources such as Category:Birds_of_Europe, Category:Europe, Europe_1, Category:Flora_of_Europe and Europe_(band) with their triple counts]
http://sindice.com/
RDF in Drupal
RDF in Drupal core
• RDFa only
  ◦ RDF serialization format recommended by W3C
  ◦ RDF in XHTML
  ◦ Yahoo! SearchMonkey and Google parse it
  ◦ no need to generate a separate output: the same document is both human and machine readable
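The point that one RDFa document serves both people and machines can be illustrated with a small Python sketch. The XHTML fragment is hypothetical and is not Drupal's actual output, and the `PropertyExtractor` class is a deliberately naive reader that only collects `property` attributes; a real RDFa parser also handles subjects, prefixes and nesting:

```python
from html.parser import HTMLParser

# Hypothetical XHTML fragment with RDFa attributes (illustrative markup only).
PAGE = (
    '<div about="/node/1" typeof="sioc:Item">'
    '<h2 property="dc:title">Hello RDFa</h2>'
    '<p>Written by <span property="dc:creator">scor</span>.</p>'
    '</div>'
)


class PropertyExtractor(HTMLParser):
    """Collect (property, text) pairs from elements carrying a `property` attribute."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._current = None  # property of the element being read, if any

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("property")
        if prop:
            self._current = prop

    def handle_data(self, data):
        if self._current:
            self.pairs.append((self._current, data))
            self._current = None


extractor = PropertyExtractor()
extractor.feed(PAGE)
extractor.close()
print(extractor.pairs)  # [('dc:title', 'Hello RDFa'), ('dc:creator', 'scor')]
```

A browser renders the same fragment as a heading and a byline, so no second, machine-only output is needed.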
DrupalCon DC RDFa video
Status of RDF in Drupal 7: architecture
• Semantics at the module level
  ◦ Modules can export data along with their semantics in the format they want
    – Core => RDFa
    – Contrib => RDF/XML, N-Triples and so on
  ◦ No duplicate definition of semantics.
  ◦ Built-in semantics can be altered.
  ◦ The theme layer no longer has to worry about the semantics; it simply outputs them along with the data.
  ◦ Better control over which namespaces are used on a given page, so that only those namespaces are declared in the header of the HTML document.
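The last point, declaring only the namespaces a page actually uses, can be sketched in a few lines of Python. The registry contents and the function names are assumptions made for illustration and do not reflect Drupal's implementation:

```python
# Site-wide prefix registry (illustrative; the namespace URIs are published ones).
ALL_PREFIXES = {
    "dc": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "sioc": "http://rdfs.org/sioc/ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}


def used_prefixes(curies):
    """Return the sorted registered prefixes that appear in `curies`."""
    found = {c.split(":", 1)[0] for c in curies if ":" in c}
    return sorted(found & set(ALL_PREFIXES))


def xmlns_attributes(curies):
    """Build xmlns declarations for only the prefixes a page actually uses."""
    return " ".join(
        f'xmlns:{p}="{ALL_PREFIXES[p]}"' for p in used_prefixes(curies)
    )


page_curies = ["sioc:Item", "dc:title", "dc:creator"]
print(xmlns_attributes(page_curies))
```

For this page only `dc` and `sioc` are declared; `foaf` and `skos` stay out of the document header.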
Status of RDF in Drupal 7
• Architecture of the RDF API in core
    – hook_rdf_mapping(): allows modules to define their own RDF mappings
    – hook_rdf_mapping_alter(&$mapping): allows modules to override existing mappings
    – rdf_get_mapping($bundle): returns the mapping for the attributes of the given bundle as an associative array
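The define, alter, get flow of these three functions can be modeled outside Drupal. The following Python sketch is a language-neutral illustration only: the module registry, the shape of the mapping and the example predicates are assumptions, and it is not the PHP API itself:

```python
def node_rdf_mapping():
    """Stand-in for a hook_rdf_mapping() implementation: define mappings per bundle."""
    return {"blog": {"title": ["dc:title"], "author": ["sioc:has_creator"]}}


def custom_rdf_mapping_alter(mapping):
    """Stand-in for hook_rdf_mapping_alter(): override an existing mapping in place."""
    mapping["blog"]["title"] = ["dc:title", "rdfs:label"]


# Each "module" is modeled as a dict of the hooks it implements (hypothetical registry).
MODULES = [
    {"rdf_mapping": node_rdf_mapping},
    {"rdf_mapping_alter": custom_rdf_mapping_alter},
]


def rdf_get_mapping(bundle):
    """Collect all defined mappings, let modules alter them, return one bundle's mapping."""
    mapping = {}
    for module in MODULES:
        define = module.get("rdf_mapping")
        if define:
            mapping.update(define())
    for module in MODULES:
        alter = module.get("rdf_mapping_alter")
        if alter:
            alter(mapping)
    return mapping.get(bundle, {})


print(rdf_get_mapping("blog"))
```

Running all definitions before any alteration is what lets a later module override built-in semantics without redefining them.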
Status of RDF in Drupal 7
• hook_rdf_mapping()
Status of RDF in Drupal 7
• rendered HTML
Status of RDF in Drupal 7
• What’s already committed
  ◦ RDFa doctype
Status of RDF in Drupal 7
• What’s already committed
  ◦ Common RDF prefix definitions
Status of RDF in Drupal 7
• What’s pending
  ◦ The rest!
  ◦ 1 week for the API
  ◦ 6 weeks for testing (code slush)
Status of RDF in Drupal 7
• Theming layer
  ◦ Hardest part of the work
  ◦ Many tags are hardcoded in the tpl files
    – we want to avoid modifying these; themers should not have to care about RDFa
  ◦ Dilemma
    – centralize everything in the RDF module
    – distribute the RDF in all modules (and patch these modules)
Status of RDF in Drupal 7
Building block modules: page/block, node, field, user, comment, taxonomy
Beneficiary modules: blog, forum, book, openid, profile, all contributed modules
Thank you
• Credits
  ◦ Frédéric Marand
  ◦ Florian Lorétan
  ◦ John Breslin
  ◦ John Morahan
  ◦ Mark Birbeck
  ◦ Rolf Guescini
  ◦ Benjamin Doherty
  ◦ Benjamin Melançon
  ◦ Stefan Freudenberg
  ◦ Peter Wolanin
  ◦ Barry Jaspan
  ◦ yched
  ◦ catch
  ◦ ...
Contribute
• IRC: #drupal-rdf
• list of issues to review at http://drupal.org/project/issues/search/drupal?issue_tags=RDF
• Talk to us
• Keynote tomorrow by Dan Brickley
• code sprint on Saturday