A centre of expertise in digital information management
www.ukoln.ac.uk www.bath.ac.uk
Linked Data and the Semantic Web -
What are they and should I care?
17th February 2010
MIMAS Discussion Forum
University of Manchester, UK
Adrian Stevenson
semantics is … devoted to the study of meaning … on the syntactic levels of words, phrases, sentences
http://en.wikipedia.org/wiki/Semantic
“The Semantic Web is a web of data, in some ways like a global database” [1]
“first step is putting data on the Web in a form that machines can naturally understand... This creates what I call a Semantic Web - a web of data that can be processed directly or indirectly by machines” [2]
1. http://www.w3.org/DesignIssues/Semantic.html
2. Tim Berners-Lee, Weaving the Web. Harper, San Francisco. 1999.
“The term Linked Data refers to a set of best practices for publishing and connecting structured data on the Web.”
“the Semantic Web is the goal or end result… Linked Data provides the means to reach that goal”
From ‘Linked Data: The Story So Far’ - Heath, Bizer and Berners-Lee 2009
The Web We’re Used To
• Made by humans for humans
• Primarily documents
• Machines not very welcome
• Data silos
Web of Linked Data
• In 2006 Tim Berners-Lee’s idea of ‘linked data’ took shape
• Designed for machines first
• It primarily links data about ‘things’, not documents
• …but it is for humans in the end
• But haven’t we been putting data on the web for years?
– In CSV, relational databases, XML etc?
• Well yes, but these approaches are not so easy to integrate
• Web 2.0 mashups work against a fixed set of data sources
• Linked Data applications operate on top of an unbound, global data space.
So what’s happening now?
• “Sir Tim Berners-Lee, the inventor of the world wide web, will help the British government to make its data more easily available online … I have asked Sir Tim Berners-Lee … to help us drive the opening up of access to Government data in the web” Prime Minister Gordon Brown, 10th June 2009
• "What you find if you deal with people in government departments is that they hug their database, hold it really close”. Tim Berners-Lee, 10th June 2009
• We shall see …
Data.gov.uk
Officially launched 21st January 2010
Data.gov.uk – search for ‘traffic’
Central Office of Information - http://coi.gov.uk/
BBC Music BETA
http://www.bbc.co.uk/music/developers
• Provides access to raw data (Excel spreadsheets, PDF files, and more)
• UK is adhering more closely to Berners-Lee’s Linked Data rules
http://www.readwriteweb.com/archives/cnet_partners_with_thomson_reuters_on_linked_data.php
http://open.blogs.nytimes.com/2009/06/26/nyt-to-release-thesaurus-and-enter-linked-data-cloud/
Graphs house prices over time - combines house price data with information from Yahoo! Placemaker, Nestoria and OpenStreetMap
Effect of congestion charge zones on increasing the number of bicycles and reducing the number of cars and taxis – from ITO World
http://itoworld.blogspot.com/
Postcode Paper - bus timetables, doctors’ surgeries, allotments
http://blog.newspaperclub.co.uk/2009/10/16/data-gov-uk-newspaper/
Owls Near You - http://owlsnearyou.com/
http://richard.cyganiak.de/2007/10/lod/
A little bit of the techy stuff
Linked Data is …
• A way of publishing data on the web that:
– Encourages reuse
– Reduces redundancy
– Maximises inter-connectedness
– Enables network effects
• So how is this achieved?
Presentational tagging – HTML
<h1>Agilitas Physiotherapy Centre</h1>
<p>Welcome to the Agilitas Physiotherapy Centre home page. Do you feel pain? Have you had an injury? Let our staff Lisa Davenport, our secretary Kelly Townsend, and Steve Matthews take care of your body and soul.</p>
<h2>Consultation hours</h2>
Mon 11am - 7pm<br/>
Tue 11am - 7pm<br/>
Wed 3pm - 7pm<br/>
Thu 11am - 7pm<br/>
Fri 11am - 3pm
<p>But note that we do not offer consultation during the weeks of the <a href=". . .">State Of Origin</a> games.</p>
Semantic tagging
<company>
<treatmentOffered>Physiotherapy</treatmentOffered>
<companyName>Agilitas Physiotherapy Centre</companyName>
<staff>
<therapist>Lisa Davenport</therapist><therapist>Steve Matthews</therapist>
<secretary>Kelly Townsend</secretary>
</staff>
</company>
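A minimal sketch in Python (standard library only) of what a machine gains from this kind of tagging: the staff roles can be read straight from the element names, with no guessing from free text. The variable names are illustrative and not part of the original slides.

```python
import xml.etree.ElementTree as ET

# The semantically tagged record from the slide, held as a string.
COMPANY_XML = """
<company>
  <treatmentOffered>Physiotherapy</treatmentOffered>
  <companyName>Agilitas Physiotherapy Centre</companyName>
  <staff>
    <therapist>Lisa Davenport</therapist>
    <therapist>Steve Matthews</therapist>
    <secretary>Kelly Townsend</secretary>
  </staff>
</company>
"""

company = ET.fromstring(COMPANY_XML)

# Element names carry the meaning, so no text scraping is needed.
company_name = company.findtext("companyName")
therapists = [t.text for t in company.findall("./staff/therapist")]
secretary = company.findtext("./staff/secretary")

print(company_name)
print(therapists)
print(secretary)
```

Doing the same against the presentational HTML version would mean parsing the welcome sentence to work out who is a therapist and who is the secretary.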
Tim BL’s Linked Data Design Issues
• Use URIs as names for things
• Use HTTP URIs so that people can look up those names.
• When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
• Include links to other URIs so that they can discover more things.
• From http://www.w3.org/DesignIssues/LinkedData.html
URIs and HTTP
• A “Uniform Resource Identifier (URI) provides a simple and extensible means for identifying a resource” – RFC 3986
• A URL is a type of URI
• HTTP URIs can be ‘de-referenced’
• HTTP URIs are used for “real world” things
– http://adrianstevenson.com/id/me
– http://dbpedia.org/page/Tim_Berners-Lee
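De-referencing can be sketched in Python as an ordinary HTTP GET that asks for RDF instead of HTML through the Accept header. This is only an illustration: the helper name `build_rdf_request` is hypothetical, and whether a given server honours the header is an assumption. The request is prepared but not sent, so the sketch runs without network access.

```python
from urllib.request import Request

def build_rdf_request(uri: str) -> Request:
    """Prepare (but do not send) an HTTP GET asking for RDF rather than HTML."""
    return Request(uri, headers={"Accept": "application/rdf+xml"})

# One of the example URIs from the slide.
request = build_rdf_request("http://adrianstevenson.com/id/me")

print(request.get_method(), request.full_url)
print(request.get_header("Accept"))

# To actually de-reference the URI (needs network access):
#   from urllib.request import urlopen
#   with urlopen(request) as response:
#       rdf_xml = response.read()
```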
RDF
• Resource Description Framework
– “a language for representing information about resources in the World Wide Web”
– “RDF can also be used to represent information about things that can be identified on the Web, even when they cannot be directly retrieved on the Web”
• Describes relations based on triples
– Subject-predicate-object
• http://www.w3.org/TR/REC-rdf-syntax/
http://www.jenitennison.com/blog/node/140
[Diagram: an example triple]
Subject: Heroes
Predicate: has a creator whose name is
Object: David Bowie
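The Heroes example can be written down as data. The sketch below, in Python, holds one triple and prints it in N-Triples form. It rests on assumptions: the album URI is a hypothetical example.org address, and the predicate is simplified to the Dublin Core `creator` property with the name as a plain literal.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str

# Hypothetical URI for the album; the predicate is the Dublin Core "creator" property.
HEROES = "http://example.org/album/heroes"
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"

triple = Triple(HEROES, DC_CREATOR, "David Bowie")

def to_ntriples(t: Triple) -> str:
    """Serialise one triple as an N-Triples line, treating the object as a plain literal."""
    escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{t.subject}> <{t.predicate}> "{escaped}" .'

print(to_ntriples(triple))
```

In real Linked Data the object would often be another URI (for the person) rather than a string, which is what lets one dataset link to another.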
Linked Data in Use
Publishing Linked Data
• RDFizers – convert data formats into RDF
• D2R Server – creates linked data from relational databases
• SparqPlug – Extracts linked data from HTML
• …. Many others
D2R Server publishes a Linked Data view of the database and allows clients to query the database via SPARQL
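As a rough illustration of querying via SPARQL, the Python sketch below builds the GET URL a client would send to a SPARQL endpoint, with the query text in the `query` parameter as the SPARQL protocol defines. The endpoint address is hypothetical; a real D2R Server deployment would have its own, and no request is actually sent here.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint address, for illustration only.
ENDPOINT = "http://example.org/sparql"

# A generic query: fetch any ten triples.
QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10
"""

def sparql_get_url(endpoint: str, query: str) -> str:
    """Build a SPARQL protocol GET URL: the query goes in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})

url = sparql_get_url(ENDPOINT, QUERY)
print(url)

# Round-trip check: decoding the URL gives back the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == QUERY.strip())
```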
Linked Data Applications
• Linked Data Browsers – navigate between data sources
– Disco
– Tabulator
– Marbles
• Linked Data Search Engines
– For humans – Falcons, SWSE
– For apps – Swoogle, Sindice
• Tracks provenance of data
• Merges data about the same thing from different sources
http://marbles.sourceforge.net/
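The idea of merging data about the same thing while keeping track of where each statement came from can be sketched in a few lines of Python. This is not how Marbles itself is implemented; the source addresses, the URI and the sample triples are all made up for illustration.

```python
from collections import defaultdict

# Two hypothetical sources describing the same thing, as (subject, predicate, object) triples.
SOURCE_A = "http://example.org/source-a"
SOURCE_B = "http://example.org/source-b"
THING = "http://example.org/id/heroes"

data = {
    SOURCE_A: [(THING, "title", "Heroes"), (THING, "creator", "David Bowie")],
    SOURCE_B: [(THING, "creator", "David Bowie"), (THING, "released", "1977")],
}

def merge_with_provenance(sources):
    """Merge triples from all sources, remembering which sources asserted each one."""
    merged = defaultdict(set)
    for source, triples in sources.items():
        for triple in triples:
            merged[triple].add(source)
    return merged

merged = merge_with_provenance(data)
for triple in sorted(merged):
    print(triple, "from", sorted(merged[triple]))
```

Because both sources use the same URI for the thing, their statements line up without any extra matching step, and the duplicate statement is stored once with two origins.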
• User can explore the underlying data structures
• Can search for objects, concepts or documents
http://iws.seu.edu.cn/services/falcons/
• Provides interface (API) that other linked data apps can use
• Rationale: new linked data apps shouldn’t need to implement their own infrastructure for crawling and indexing web of data
http://sindice.com/
http://sindice.com/search?q=jazz&qt=term
Some issues
• To RDF or not to RDF
• Usability
• Sustainability
• Provenance
• Licensing
• Reliability
Sustainability
• Ed Summers at the Library of Congress created http://lcsh.info
• Linked Data interface for LOC subject headings
• People started using it
Library of Congress Subject Headings
Data Licensing
• Uses Amazon Web Services but contravenes their terms and conditions
http://www4.wiwiss.fu-berlin.de/bizer/bookmashup/
Provenance
• OK if data ‘watermarked’
• But can often be a problem
• VoID can help (apparently!)
The Business Case
• Can we convince IT Managers, VC etc. it’s worth it?
– Realistic expectations
– “..the people sort of in charge of the kind of data thing knew so little about their data structures”
– “I’ve had a whole bunch of meetings to get one dataset, been fobbed off, and literally just never get anywhere”
Tom Steinberg, Director of MySociety (from Nodalities issue 8)
The Business Case
• What’s the payoff for O’Reilly, BBC etc. of using Linked Data?
• Why didn’t it work the first time?
– What’s different now?
• Need to work out what Linked Data does that other things don’t
• Prove a simple tangible benefit
http://www.chiefmartec.com/2010/01/7-business-models-for-linked-data.html
Universities and Colleges in the Giant Global Graph
• Session at CETIS Conference 2009
• Case for Linked Data / Semantic Web discussed
• Some cases:
– Freedom of Information
– Improves data quality
– Joining the party
http://wiki.cetis.ac.uk/Universities_and_Colleges_in_the_Giant_Global_Graph
http://wiki.cetis.ac.uk/Image:Conf2009_GGG_Group1B.jpg
Conclusion
• Some interesting recent developments and sense of momentum
• Central Gov’t interested
• … but still much to do if the semantic web and linked data are to really take hold
Questions?
• http://www.twitter.com/adrianstevenson
• [email protected]
CC Attribution
• Some sections of this presentation adapted from:
– An Introduction to Linked Data, by Tom Heath
– The Semantic Web – An Introduction by Owen Stephens
– Using Linked Data as a Learning Resource Recommendation System by Chris Clarke
• This presentation is available under Creative Commons Noncommercial-Share Alike