John Sheridan
2 March 2011
Linked Government Data
“With linked data, when you have some of it, you can find other, related, data.”
Tim Berners-Lee,
“Linked Data Design Issues” http://www.w3.org/DesignIssues/LinkedData.html
Henry Maudslay (1771–1831)
He also developed the first industrially practical screw-cutting lathe in 1800, allowing standardisation of screw thread sizes for the first time. This allowed the concept of interchangeability (an idea that was already taking hold) to be practically applied to nuts and bolts. Before this, all nuts and bolts had to be made as matching pairs only. This meant that when machines were disassembled, careful account had to be kept of the matching nuts and bolts ready for when reassembly took place.
http://en.wikipedia.org/wiki/Henry_Maudslay
Five stars
* make your stuff available on the Web (whatever format) under an open licence
** make it available as structured data (e.g., Excel instead of image scan of a table)
*** use non-proprietary formats (e.g., CSV instead of Excel)
**** use URIs to identify things, so that people can point at your stuff
***** link your data to other data to provide context
Three projects
• data.gov.uk
  o Supporting the transparency agenda with Linked Data
• legislation.gov.uk
  o First step towards a Linked Data Statute Book
• nationalarchives.gov.uk
  o Semantic Knowledge Base for the Web Archive
“The Government believes that we need to throw open the doors of public
bodies, to enable the public to hold politicians and public bodies to account.”
The Coalition Agreement.
“We will ensure that all data published by public bodies is published in an open
and standardised format, so that it can be used easily and with minimal cost by
third parties.”
The Coalition Agreement.
We are:
• developing standards for responsible publishing of key types of data (financial data, organisation data, aggregate statistics, location data)
• developing guidance, practices and tools that make it easy to publish data in Linked Data form, at low cost
• making it easy for people to consume data in a programmatic way (the Linked Data API as well as native Linked Data techniques such as the provision of SPARQL Endpoints)
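As a minimal sketch of what programmatic consumption through a SPARQL endpoint can look like, the snippet below builds a query URL in the style of the SPARQL Protocol, where the query text travels in a `query` parameter. The endpoint address is a hypothetical placeholder, not a real data.gov.uk service, and nothing is sent over the network.

```python
from urllib.parse import urlencode

# Hypothetical endpoint address, used only for illustration.
ENDPOINT = "http://example.org/sparql"

# A deliberately generic query: fetch any ten statements from the store.
QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10
""".strip()


def sparql_get_url(endpoint: str, query: str) -> str:
    """Build a GET URL that carries the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = sparql_get_url(ENDPOINT, QUERY)
print(url)
```

A client would fetch this URL with an `Accept` header naming the result format it wants.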
STANDARDS
     2008    2009    2010
A    1,345   1,456   2,301
B    2,112   3,543   2,111
C    2,345   2,987   2,455
D    6,342   6,256   6,123
E    7,435   7,432   8,102
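One way to think about publishing an aggregate table like this is to turn every cell into a separate observation, each described by its dimensions (here a row label and a year) and a measured value. The sketch below does that in plain Python. The field names `category`, `year` and `value` are assumptions made for illustration, since the slide does not say what rows A to E represent.

```python
from typing import List, NamedTuple


class Observation(NamedTuple):
    category: str  # row label from the table (meaning not given on the slide)
    year: int      # column heading
    value: int     # cell value with the thousands separator removed


TABLE = """\
2008 2009 2010
A 1,345 1,456 2,301
B 2,112 3,543 2,111
C 2,345 2,987 2,455
D 6,342 6,256 6,123
E 7,435 7,432 8,102
"""


def to_observations(text: str) -> List[Observation]:
    """Flatten a year-by-category table into one observation per cell."""
    lines = text.strip().splitlines()
    years = [int(y) for y in lines[0].split()]
    observations = []
    for line in lines[1:]:
        label, *cells = line.split()
        for year, cell in zip(years, cells):
            observations.append(Observation(label, year, int(cell.replace(",", ""))))
    return observations


observations = to_observations(TABLE)
print(len(observations), observations[0])
```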
Transaction  Date        Supplier           Amount
A-1263       09/09/2010  Spottiswoode & Co  £ 2,345
A-1264       09/09/2010  JSB & Sons         £ 2,111
A-1265       09/09/2010  BLG Ltd            £ 2,455
A-1266       09/09/2010  Spottiswoode & Co  £ 6,123
A-1267       09/09/2010  BLG Ltd            £ 8,102
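Before payment rows like these can be published as data, each one has to be read into typed fields. The following is a rough sketch of that step, assuming day-first UK dates and whole-pound amounts. The `Payment` type and its field names are hypothetical and not part of any published payments vocabulary.

```python
import re
from datetime import date, datetime
from typing import NamedTuple


class Payment(NamedTuple):
    transaction: str
    paid_on: date
    supplier: str
    amount_gbp: int


# Reference, DD/MM/YYYY date, supplier name (may contain spaces), amount in pounds.
ROW = re.compile(r"^(\S+)\s+(\d{2}/\d{2}/\d{4})\s+(.+?)\s+£\s*([\d,]+)$")


def parse_payment(line: str) -> Payment:
    """Parse one row of the payments table into typed fields."""
    match = ROW.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised payment row: {line!r}")
    ref, day, supplier, amount = match.groups()
    return Payment(
        ref,
        datetime.strptime(day, "%d/%m/%Y").date(),  # assumed day-first
        supplier,
        int(amount.replace(",", "")),
    )


payment = parse_payment("A-1263 09/09/2010 Spottiswoode & Co £ 2,345")
print(payment.paid_on.isoformat(), payment.amount_gbp)
```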
Director General
Director (Operations)
Director (Strategy)
Deputy Director (A)
Deputy Director (A)
Standards
• Re-use where we can, create where we must
• Small, high level, light weight vocabularies
  o Examples include datacube, organization, provenance
• Create local specialisations
  o Examples include payments, central-government
• Post hoc linking
DATA
http://reference.data.gov.uk/id/day/2011-01-13
http://reference.data.gov.uk/id/department/CO
http://transport.data.gov.uk/id/station/WAT
http://education.data.gov.uk/id/school/341451
http://location.data.gov.uk/id/3245677362123
http://www.legislation.gov.uk/id/ukpga/2009/12/section/2
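Several of the URIs above appear to share one shape: a sector subdomain of data.gov.uk, then `/id/`, then a type and a key. The sketch below mints URIs on that assumption. The pattern is inferred from the examples on this slide only, and it does not cover the location or legislation URIs, which are shaped differently.

```python
from datetime import date


def reference_uri(sector: str, kind: str, key: str) -> str:
    """Mint an identifier URI of the form http://{sector}.data.gov.uk/id/{kind}/{key}."""
    return f"http://{sector}.data.gov.uk/id/{kind}/{key}"


def day_uri(day: date) -> str:
    """Mint the identifier for a calendar day, keyed by its ISO 8601 date."""
    return reference_uri("reference", "day", day.isoformat())


print(day_uri(date(2011, 1, 13)))
print(reference_uri("transport", "station", "WAT"))
print(reference_uri("education", "school", "341451"))
```

Keeping the construction in one function means every dataset that refers to the same day, station or school ends up pointing at the same identifier.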
PRODUCTION
Gridworks (Google Refine)
Gridworks: map and export Linked Data
PUBLICATION
Linked Data API
• Open Standard
• Generic approach for creating APIs from Linked Data
• Sits on top of a Linked Data store
• Several implementations, most mature is Puelia
• Examples for education and transport
• Also, organisations, payments information
UNAMBIGUOUS DEFINITIONS
And wouldn’t it be cool if we had…
Legislation as data
• Three considerations for legislation as data
  o Typographic layout
  o Versioning / changes over time
  o Semantics
• Semantic representation using RDF and Linked Data
  o URIs for things
  o RDF data model
  o subject - property - object
• Requires granular URIs to name things
  o Identifier
  o Document
  o Representation
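The subject - property - object model can be illustrated with a single statement written out in N-Triples syntax. In this sketch the subject and object follow the legislation.gov.uk identifier pattern, and the property is Dublin Core's `isPartOf`. That property is chosen purely for illustration and is an assumption, not necessarily what legislation.gov.uk publishes.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str  # URI of the thing being described
    prop: str     # URI of the property (relationship)
    object: str   # URI of the related thing


def to_ntriples(triple: Triple) -> str:
    """Serialise a triple whose three terms are all URIs as one N-Triples line."""
    return f"<{triple.subject}> <{triple.prop}> <{triple.object}> ."


section_in_act = Triple(
    "http://www.legislation.gov.uk/id/ukpga/2010/32/section/12",
    "http://purl.org/dc/terms/isPartOf",  # illustrative property choice
    "http://www.legislation.gov.uk/id/ukpga/2010/32",
)
print(to_ntriples(section_in_act))
```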
“A” changes “B” when “C” says so
Diagram nodes: Academies Act 2010 Section 19 (2); Academies Act 2010 Section 12 (4); SI 2010/1937 Schedule 3; Charities Act 1993 Schedule 2 (ca); Secretary of State
Diagram relationships: Confers power; Makes; Commences; Inserts text into
Legislation URIs
• Identifier
  o http://www.legislation.gov.uk/id/{type}/{year}/{number}/section/{number}
  o eg http://www.legislation.gov.uk/id/ukpga/2010/32/section/12/4
• Document
  o http://www.legislation.gov.uk/{type}/{year}/{number}/section/{number}
  o eg http://www.legislation.gov.uk/ukpga/2010/32/section/12#section-12-4
• Representations
  o /data.xml
  o /data.xhtml
  o /data.pdf
  o /data.rdf
  o and for any list, /data.feed
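The three layers can be sketched as three small functions. This assumes the templates on the slide are filled in literally and that a representation suffix such as `/data.rdf` is appended to the document URI. The function names are hypothetical, and `/data.feed` is left out because the slide applies it to lists only.

```python
BASE = "http://www.legislation.gov.uk"

# Representation formats named on the slide (excluding the list-only feed).
FORMATS = ("xml", "xht", "pdf", "rdf")


def identifier_uri(type_: str, year: int, number: int, section: str) -> str:
    """URI that names the abstract section, independent of any document."""
    return f"{BASE}/id/{type_}/{year}/{number}/section/{section}"


def document_uri(type_: str, year: int, number: int, section: str) -> str:
    """URI of the document about that section (no /id/ segment)."""
    return f"{BASE}/{type_}/{year}/{number}/section/{section}"


def representation_uri(document: str, fmt: str) -> str:
    """URI of one concrete format of a document; the suffix placement is an assumption."""
    if fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt}")
    return f"{document}/data.{fmt}"


doc = document_uri("ukpga", 2010, 32, "12")
print(identifier_uri("ukpga", 2010, 32, "12/4"))
print(representation_uri(doc, "rdf"))
```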
Legislation URIs, time and extents
• Identifier
  o http://www.legislation.gov.uk/id/{type}/{year}/{number}/section/{number}
  o eg http://www.legislation.gov.uk/id/ukpga/2010/32/section/12/4
• Document versions
  o In force
    eg http://www.legislation.gov.uk/ukpga/2010/32/section/12
  o Prospective
    eg http://www.legislation.gov.uk/ukpga/2010/32/section/12/prospective
  o Point in time
    eg http://www.legislation.gov.uk/ukpga/2010/32/section/12/2010-12-01
  o Extents
    eg http://www.legislation.gov.uk/ukpga/2010/32/section/12/england
    eg http://www.legislation.gov.uk/ukpga/2010/32/section/12/scotland
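Judging by the examples, each version is addressed by adding a single path segment to the document URI: nothing for the version in force, `prospective`, an ISO date for a point in time, or an extent name. A minimal sketch under that assumption follows. The function name `versioned` is hypothetical, and combining several qualifiers is deliberately not attempted because the slide does not show it.

```python
from datetime import date
from typing import Union

SECTION_12 = "http://www.legislation.gov.uk/ukpga/2010/32/section/12"


def versioned(document: str, qualifier: Union[date, str, None] = None) -> str:
    """Address one version of a document.

    qualifier is None for the version in force, a date for a point in time,
    or a string such as "prospective", "england" or "scotland".
    """
    if qualifier is None:
        return document
    if isinstance(qualifier, date):
        return f"{document}/{qualifier.isoformat()}"
    return f"{document}/{qualifier}"


print(versioned(SECTION_12))
print(versioned(SECTION_12, "prospective"))
print(versioned(SECTION_12, date(2010, 12, 1)))
print(versioned(SECTION_12, "scotland"))
```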
Web Archive - Semantic Knowledge Base
• The National Archives operates the UK Government Website Archive
• Second most used web archive in the world
• Links to withdrawn documents are maintained, preserving a wide variety of information, from datasets to documents and press releases
• Web archives are notoriously difficult to search using standard search technology because of their size and number of duplicates
• Procured the Semantic Knowledge Base (SKB) through a competitive process
• Solution being delivered by a consortium (technologies from Ontotext, University of Sheffield)