SemWeb 4 Gov – opportunities and challenges Dr. Andrew Woolf Acting Assistant Director (Climate & Water IT Services), Bureau of Meteorology



Invited presentation (industry track) at the International Semantic Web Conference 2013, Sydney, Australia


Page 1: SemWeb 4 Gov – opportunities and challenges

SemWeb 4 Gov – opportunities and challenges

Dr. Andrew Woolf

Acting Assistant Director (Climate & Water IT Services), Bureau of Meteorology

Page 2: SemWeb 4 Gov – opportunities and challenges

Acknowledgements…

• Josh Bobruk, Robert Boczek, Karl Braganza, Sarah Callaghan, Shirley Crompton, Armin Haller, Colin Harpham, Mike Jackson, Bryan Lawrence, Laurent Lefort, Brian Matthews, Tim Osborn, Clinton Rakich, Will Rogers, Arif Shaon, Jeremy Tandy, Kerry Taylor, Blair Trewin

Page 3: SemWeb 4 Gov – opportunities and challenges

Outline

• Drivers

• Use case exemplars

• SemWeb as enabler

• A few projects

• Next steps

• Key lessons

Page 4: SemWeb 4 Gov – opportunities and challenges

Outline

• Drivers

• Use case exemplars

• SemWeb as enabler

• A few projects

• Next steps

• Key lessons

Page 5: SemWeb 4 Gov – opportunities and challenges

Information explosion

http://www.economist.com/node/21537922

Page 6: SemWeb 4 Gov – opportunities and challenges

Big data

http://strataconf.com/

Page 7: SemWeb 4 Gov – opportunities and challenges

Data journalism

http://www.guardian.co.uk/data

Page 8: SemWeb 4 Gov – opportunities and challenges

Data for development

http://data.worldbank.org/

Page 9: SemWeb 4 Gov – opportunities and challenges

Harvard Business Review (Oct 2012)

Page 10: SemWeb 4 Gov – opportunities and challenges

United States

My Administration will take appropriate action … to disclose information rapidly in forms that the public can readily find and use. Executive departments and agencies should harness new technologies to put information about their operations and decisions online and readily available to the public.

Memorandum on Transparency and Open Government, Federal Register Vol. 74, No. 15

Page 11: SemWeb 4 Gov – opportunities and challenges

United Kingdom

Our plans include:

•Radically opening up data and public information, releasing thousands of public data sets – including Ordnance Survey mapping data, real-time railway timetables, data underpinning NHS choices, and more detailed departmental spending data – and making them free for re-use.

Putting the Frontline First: smarter government

Cabinet Office, 7 December 2009

Page 12: SemWeb 4 Gov – opportunities and challenges

Europe

…open access is reaching the tipping point, with around 50% of scientific papers published in 2011 now available for free.

…open access will be mandatory for all scientific publications produced with funding from Horizon 2020, the EU's Research & Innovation funding programme for 2014-2020.

…Under Horizon 2020 … the Commission will also start a pilot on open access to data collected during publicly funded research…

European Commission - IP/13/786, Brussels, 21 August 2013

Page 13: SemWeb 4 Gov – opportunities and challenges

United Nations

We also call for a data revolution for sustainable development, with a new international initiative to improve the quality of statistics and information available to citizens.

A NEW GLOBAL PARTNERSHIP: ERADICATE POVERTY AND TRANSFORM ECONOMIES THROUGH SUSTAINABLE DEVELOPMENT

The Report of the High-Level Panel of Eminent Persons on the Post-2015 Development Agenda

United Nations, 30 May 2013

Page 14: SemWeb 4 Gov – opportunities and challenges

Australia

Principle 1: Open access to information - a default position

Principle 2: Engaging the community

Principle 3: Effective information governance

Principle 4: Robust information asset management

Principle 5: Discoverable and useable information

Principle 6: Clear reuse rights

Principle 7: Appropriate charging for access

Principle 8: Transparent enquiry and complaints processes

Principles on open public sector information

OAIC, May 2011

Page 15: SemWeb 4 Gov – opportunities and challenges

Australia

We will [establish] policies to:

• accelerate Government 2.0 efforts to engage online, make agencies transparent and provide expanded access to useful public sector data; …

The next wave of opportunities to improve the quality and effectiveness of government services are likely to be driven by access to (appropriately anonymized) public sector data sets and ‘big data’.

The Hon Andrew Robb AO MP, The Hon Malcolm Turnbull MP

Liberal Party of Australia, 2 Sep 2013

Page 16: SemWeb 4 Gov – opportunities and challenges

Government Open Data

“We commit to pro-actively provide high-value information, including raw data, in a timely manner, in formats that the public can easily locate, understand and use, and in formats that facilitate reuse.”

Page 17: SemWeb 4 Gov – opportunities and challenges

Outline

• Drivers

• Use case exemplars

• SemWeb as enabler

• A few projects

• Next steps

• Key lessons

Page 18: SemWeb 4 Gov – opportunities and challenges

Climategate

Page 19: SemWeb 4 Gov – opportunities and challenges

Independent reviews

“We … conclude that there is independent verification… of the results and conclusions of the Climate Research Unit at the University of East Anglia. ...We … consider that climate scientists should take steps to make available all the data used to generate their published work, including raw data”

House of Commons Science and Technology Committee (31 Mar 2010)

CRU should make available sufficient information, concurrent with any publications, to enable others to replicate their results. …It would benefit the global climate research community if a standardised way of defining station metadata and station data could be agreed, preferably through a standards body, or perhaps the WMO.

The Independent Climate Change E-mails Review, Sir Muir Russell (7 Jul 2010)

Page 20: SemWeb 4 Gov – opportunities and challenges

climate change

Page 21: SemWeb 4 Gov – opportunities and challenges

biodiversity loss

Page 22: SemWeb 4 Gov – opportunities and challenges

natural resource management

Page 23: SemWeb 4 Gov – opportunities and challenges

water availability and use

Page 24: SemWeb 4 Gov – opportunities and challenges

disaster management

Page 25: SemWeb 4 Gov – opportunities and challenges

National Plan for Environmental Information

http://www.bom.gov.au/environment/

“an environmental information system to support the delivery and discovery of priority environmental information”

Page 26: SemWeb 4 Gov – opportunities and challenges

Outline

• Drivers

• Use case exemplars

• SemWeb as enabler

• A few projects

• Next steps

• Key lessons

Page 27: SemWeb 4 Gov – opportunities and challenges

Sir Tim Berners-Lee on open data...

Page 28: SemWeb 4 Gov – opportunities and challenges

But HOW do we open data?

Page 29: SemWeb 4 Gov – opportunities and challenges

Sir Tim again…

Page 30: SemWeb 4 Gov – opportunities and challenges

Linked (Government) Data

Page 31: SemWeb 4 Gov – opportunities and challenges

Linked Data

1. Use URLs (Web addresses) as identifiers for objects

2. Publish them on the Web

3. Use Semantic Web standards to model the data

4. Link objects in your dataset to objects in other datasets

Linked Data Principles ‘5 stars’ maturity model
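The four principles above can be sketched in plain Python without any RDF library. This is only an illustration: the watercourse URI is the example used in the GeoTOD slides, while the `example.org` URI, the label text, and the choice of `rdfs:label` and `owl:sameAs` are assumptions made for the sketch.

```python
# Minimal sketch of the four Linked Data principles using plain tuples.
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# 1. A URL identifies the object (example URI from the GeoTOD slides).
THAMES = "http://ea.gov.uk/id/HY/Watercourse/Thames"
# Hypothetical URI in some other dataset, for illustration only.
OTHER = "http://example.org/rivers/thames"


def literal(text):
    """Mark a value as a plain string literal rather than a URI."""
    return ("literal", text)


triples = [
    # 3. The data is modelled as RDF statements using standard vocabularies.
    (THAMES, RDFS_LABEL, literal("River Thames")),
    # 4. The object is linked to an object in another dataset.
    (THAMES, OWL_SAME_AS, OTHER),
]


def to_ntriples(statements):
    """Serialise (subject, predicate, object) tuples as N-Triples lines."""
    lines = []
    for s, p, o in statements:
        if isinstance(o, tuple):
            escaped = o[1].replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        else:
            obj = f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


# 2. Publishing means serving a document like this from the Web.
print(to_ntriples(triples))
```

In practice an RDF toolkit would handle serialisation; the hand-rolled N-Triples writer here only keeps the sketch dependency-free.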

Page 32: SemWeb 4 Gov – opportunities and challenges

Outline

• Drivers

• Use case exemplars

• SemWeb as enabler

• A few projects

• Next steps

• Key lessons

Page 33: SemWeb 4 Gov – opportunities and challenges

Geospatial Transformation with OGSA-DAI (GeoTOD) [2010]

•Project aims:

– Exploit a high-profile outcome from the >£200M UK government-funded e-Science program

– Implement the UK Cabinet Office guidelines on ‘URI Sets for Location’

– Enable dynamic transformation of existing large spatial datasets

http://data.gov.uk/sites/default/files/Designing_URI_Sets_for_Location-V1.0_10.pdf

Page 34: SemWeb 4 Gov – opportunities and challenges

Designing Location URIs

‘Spatial Thing’ http://ea.gov.uk/id/HY/Watercourse/Thames

‘303 See other:’

‘web document’ http://ea.gov.uk/doc/HY/Watercourse/Thames

rdfs:seeAlso http://ea.gov.uk/so/HY/Watercourse/ea-UKrivers/e7w1

rdfs:seeAlso http://geotod/so/HY/Watercourse/stfc-strategi/4a97

ov:similarTo http://ceh.nerc/so/HY/Watercourse/nerc-hydrodb/thames-001

‘Spatial Object’ http://geotod/so/HY/Watercourse/stfc-strategi/4a97.rdf

‘content negotiation’

http://geotod/so/HY/Watercourse/stfc-strategi/4a97.html

http://geotod/so/HY/Watercourse/stfc-strategi/4a97.kml

http://geotod/so/HY/Watercourse/stfc-strategi/4a97.gml
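The URI pattern on this slide (an `/id/` URI for the real-world thing, a `/doc/` URI for the web document reached via ‘303 See other’, and `/so/` URIs with format suffixes chosen by content negotiation) can be sketched as a small resolver. This is one reading of the diagram, not the GeoTOD implementation: the function name, the media-type table, and the choice to answer negotiation with a redirect are all assumptions, and quality values in the Accept header are ignored.

```python
# Sketch of an id -> doc redirect plus format negotiation for spatial objects.
# The media-type-to-suffix table is an assumption for illustration.
SUFFIX_BY_MEDIA_TYPE = {
    "application/rdf+xml": ".rdf",
    "text/html": ".html",
    "application/vnd.google-earth.kml+xml": ".kml",
    "application/gml+xml": ".gml",
}


def resolve(path, accept="text/html"):
    """Return (status, location) for a request path and Accept header."""
    if path.startswith("/id/"):
        # A real-world thing cannot be downloaded, so answer 303 See Other
        # and point at the web document that describes it.
        return 303, "/doc/" + path[len("/id/"):]
    if path.startswith("/so/"):
        # Take the first listed media type we can serve (q-values ignored).
        for part in accept.split(","):
            media_type = part.split(";")[0].strip()
            suffix = SUFFIX_BY_MEDIA_TYPE.get(media_type)
            if suffix:
                return 303, path + suffix
        return 406, None  # nothing acceptable to offer
    return 200, path


print(resolve("/id/HY/Watercourse/Thames"))
print(resolve("/so/HY/Watercourse/stfc-strategi/4a97", "application/gml+xml"))
```

A GIS client asking for GML and a browser asking for HTML would thus be sent to different representations of the same spatial object.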

Page 35: SemWeb 4 Gov – opportunities and challenges

GeoTOD linked data framework

Page 36: SemWeb 4 Gov – opportunities and challenges

[Architecture diagram. Components: Web resources; Relational resources (RDB); Linked SO Store; OGSA-DAI service; Geotod-D2RQ; D2RQ Mapping File; GeoServer WS; OGSA-DAI SO store. Workflow labels: SPARQL query + output formatting; Generate RDB-specific SQL using mapping file; SQL; GML query.]

Page 37: SemWeb 4 Gov – opportunities and challenges

GeoTOD

• Challenges:

– Pragmatic interpretation of URI guidelines

– Mapping UML geospatial conceptual models to RDF

• GeoTOD demo:

– http://tiger.dl.ac.uk:8080/geotodls

• UML-to-RDF schema generator:

– http://tiger.dl.ac.uk:8080/rdfsgenerator

Page 38: SemWeb 4 Gov – opportunities and challenges

Advanced Climate Research Infrastructure for Data (ACRID) [2010]

•Project aims:

– Address Climategate concerns re publishing climate data

– Enable seamless link from research publication to data

– Include dataset provenance information

– Verify linked-data principles for this problem

https://www.uea.ac.uk/mac/comm/media/press/2010/july/climatedataproject

Page 39: SemWeb 4 Gov – opportunities and challenges

Linked-data for ACRID

Page 40: SemWeb 4 Gov – opportunities and challenges

DOI and OAI-ORE

•CrossRef DOIs (e.g. http://dx.doi.org/10.1126/science.1157784) are linked-data-enabled (as of April 2011) with conneg:

– RDF/XML, TTL, ATOM

•Open Archives Initiative Object Reuse and Exchange (OAI-ORE)

– description and exchange of aggregations of Web resources

– http://www.openarchives.org/ore/
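Content negotiation on a DOI, as described above, amounts to sending an `Accept` header naming an RDF media type. The sketch below only builds the request, using the DOI from the slide; the helper names are hypothetical, and whether the resolver still answers this URL as it did in 2011 is not verified here.

```python
import urllib.request

# Media types for two of the formats named on the slide.
MEDIA_TYPES = {
    "rdfxml": "application/rdf+xml",
    "turtle": "text/turtle",
}


def build_doi_request(doi_url, fmt="turtle"):
    """Build an HTTP request asking the DOI resolver for RDF metadata."""
    return urllib.request.Request(doi_url, headers={"Accept": MEDIA_TYPES[fmt]})


def fetch_metadata(doi_url, fmt="turtle"):
    """Fetch the metadata over the network (defined but not called here)."""
    with urllib.request.urlopen(build_doi_request(doi_url, fmt)) as response:
        return response.read().decode("utf-8")


request = build_doi_request("http://dx.doi.org/10.1126/science.1157784")
print(request.full_url, request.get_header("Accept"))
```

Calling `fetch_metadata` with the same URL would perform the actual lookup; it is left uncalled so the sketch runs offline.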

Page 41: SemWeb 4 Gov – opportunities and challenges

ACRID – dataset workflows / provenance

•Various choices

– Open Provenance Model

– Provenance Markup Language

– ISO 19156 (Observations and Measurements)

– Climate Science Modelling Language (CSML)

•Adopted CSML

– Observation measures a Property of a Feature-of-interest using a Procedure and generating a Result
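The observation pattern stated above maps naturally onto a small record type. The sketch below is only an illustration of the pattern, not CSML or ISO 19156 itself: the class, field names, and sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """An Observation measures a Property of a Feature-of-interest
    using a Procedure and generating a Result."""

    feature_of_interest: str
    observed_property: str
    procedure: str
    result: float
    unit: str

    def describe(self):
        return (
            f"{self.observed_property} of {self.feature_of_interest} "
            f"measured by {self.procedure}: {self.result} {self.unit}"
        )


# Made-up values, for illustration only.
obs = Observation(
    feature_of_interest="air at a hypothetical station",
    observed_property="daily maximum temperature",
    procedure="hypothetical thermometer reading",
    result=31.5,
    unit="degC",
)
print(obs.describe())
```

Keeping the procedure alongside the result is what lets a published dataset carry its own provenance.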

Page 42: SemWeb 4 Gov – opportunities and challenges

Australian Climate Observations Reference Network – Surface Air Temperature (ACORN-SAT) [2012]

• High-quality daily surface temperature (min/max) timeseries

• 112 stations

• Over 100 years of records

• Homogenised for

– Site relocations

– Instrument replacement

– Local changes

Page 43: SemWeb 4 Gov – opportunities and challenges

ACORN-SAT

• Project aims:

– Establish the first Australian Government linked data under data.gov.au

– Trial linked-data for large time-series observation dataset

– Gain experience in applying linked data to information sharing in support of the National Plan for Environmental Information

Page 44: SemWeb 4 Gov – opportunities and challenges

Station history: e.g. Darwin

[Timeline figure, axis 1910 to 2010]

– Phase 1 (Darwin PO): 1910 - 1942

– Phase 2 (Darwin AP1): 1941 - 2007

– Phase 3 (Darwin AP2): 2001 - 2011

[Other figure labels: Standalone, Pre, Post; station numbers 014016, 014015, 014040]

Page 45: SemWeb 4 Gov – opportunities and challenges

Semantic Sensor Network Ontology (W3C Incubator Group)

Page 46: SemWeb 4 Gov – opportunities and challenges

ACORN-SAT observations as a cube (W3C DataCube ontology)

Page 47: SemWeb 4 Gov – opportunities and challenges

Coupling SSN and Data Cube ontologies

Page 48: SemWeb 4 Gov – opportunities and challenges

Deployment

*.txt

Page 49: SemWeb 4 Gov – opportunities and challenges

Querying: SPARQL

PREFIX gn: <http://www.geonames.org/ontology#>
PREFIX acorn-sat: <http://lab.environment.data.gov.au/def/acorn/sat/>
SELECT ?station ?day ?month ?year ?temp
WHERE {
  ?obs acorn-sat:maxTemperature ?temp;
       acorn-sat:timeSeries ?tseries;
       acorn-sat:day ?day;
       acorn-sat:month ?month;
       acorn-sat:year ?year.
  ?tseries gn:name ?station.
  FILTER (?temp > 50).
}

station day month year temp

Forrest 13 01 1979 50.1

Oodnadatta 03 01 1960 50.3

Oodnadatta 02 01 1960 50.7

Albany 08 02 1933 51.2

New record Jan 2013: Seven consecutive days over 45°C
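A query like the one on this slide is typically sent to an endpoint using the SPARQL Protocol: the query text goes in a `query` parameter and the `Accept` header asks for JSON results. The sketch below builds such a request and flattens a JSON result set. The endpoint URL is hypothetical (the slides do not name one), as are the helper names; nothing is sent over the network.

```python
import urllib.parse
import urllib.request

# Hypothetical endpoint URL: the slides do not state where the data is served.
ENDPOINT = "http://lab.environment.data.gov.au/sparql"

QUERY = """
PREFIX gn: <http://www.geonames.org/ontology#>
PREFIX acorn-sat: <http://lab.environment.data.gov.au/def/acorn/sat/>
SELECT ?station ?day ?month ?year ?temp
WHERE {
  ?obs acorn-sat:maxTemperature ?temp;
       acorn-sat:timeSeries ?tseries;
       acorn-sat:day ?day;
       acorn-sat:month ?month;
       acorn-sat:year ?year.
  ?tseries gn:name ?station.
  FILTER (?temp > 50)
}
"""


def build_query_request(endpoint, query):
    """Build a SPARQL Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def bindings_to_rows(results, variables):
    """Flatten a SPARQL JSON result document into tuples of string values."""
    return [
        tuple(binding[name]["value"] for name in variables)
        for binding in results["results"]["bindings"]
    ]


request = build_query_request(ENDPOINT, QUERY)
print(request.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` and the decoded JSON to `bindings_to_rows` would yield rows shaped like the result table on the slide.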

Page 50: SemWeb 4 Gov – opportunities and challenges

Mashups

Page 51: SemWeb 4 Gov – opportunities and challenges

Mashups

Page 52: SemWeb 4 Gov – opportunities and challenges

Outline

• Drivers

• Use case exemplars

• SemWeb as enabler

• A few projects

• Next steps

• Key lessons

Page 53: SemWeb 4 Gov – opportunities and challenges

Australian Government Linked Data Working Group (AGLDWG)

•Terms of reference

– Develop technical guidelines and best practice on the use of ‘linked-data’ by AG agencies

– Inform the development of data.gov.au as a platform for publishing Commonwealth PSI

– Promote the benefits and encourage adoption of ‘linked-data’ for publishing Commonwealth PSI

– Where appropriate, undertake specific activities and coordinate projects in pursuit of these objectives

AGLDWG meeting with Sir Tim Berners-Lee, 31 Jan 2013 (Canberra)

Page 54: SemWeb 4 Gov – opportunities and challenges

LD-enabling CKAN

• CKAN used by a number of Government open data platforms (incl. UK, AU, US)

• Could it be made more LD-friendly?

– Add registry functionality (e.g. for simple term dictionaries)

– Support namespace-forwarding (e.g. for proxying many-to-many agency-to-subdomain mappings)

• UK Gov LD WG has a prototype already

– https://github.com/der/ukl-registry-poc/wiki

– http://www.slideshare.net/der42/ukgovld-registrywebinarv3

Page 55: SemWeb 4 Gov – opportunities and challenges

Outline

• Drivers

• Use case exemplars

• SemWeb as enabler

• A few projects

• Next steps

• Key lessons

Page 56: SemWeb 4 Gov – opportunities and challenges

1. Believe in SemWeb 4 Gov!

• Domain/agency-neutral data publishing mechanism

• Encourages information points-of-truth

• Assists ‘naturally’ with cross-agency data integration

• BUT:

– Need to demonstrate value (pilots, prototypes, etc.)

– Agencies will have security concerns

– Deployment behind government firewalls is difficult

Page 57: SemWeb 4 Gov – opportunities and challenges

2. Address agency needs

• Simple dictionary publishing would be a good start!

– http://test.wmocodes.info/

• Create robust guidance on simple things

– GeoRSS

– schema.org

– URI Rules

– Vocabulary management

– Effective use of CKAN

Page 58: SemWeb 4 Gov – opportunities and challenges

3. Be pragmatic

• At this stage of the adoption curve, it is more important to get something up than to establish complete semantics

• Agency people mostly won’t like sitting through multi-day ontology workshops!

Page 59: SemWeb 4 Gov – opportunities and challenges

4. Establish enabling infrastructure

• May be difficult to deploy own triplestores

• Encourage cloud solutions (public, government, research) e.g. NCI in Australia

• Build SemWeb into collaborations around data.gov(.xxx), e.g. AGLDWG, Cross Jurisdictional Open Government Data Working Group

Page 60: SemWeb 4 Gov – opportunities and challenges

5. ‘Geo’ as killer app for LD

• Much government data is spatially-enabled

• Huge value proposition in technology enabling linkage by location of: health, education, statistics, transport, environmental data, etc

• Note geospatial semantics standards work

– ISO 19150 (Geographic information – Ontology)

– OGC GeoSPARQL

Page 61: SemWeb 4 Gov – opportunities and challenges

6. Skill-up the ICT contractor pool

• Government uses contractors

• Need to build up a SemWeb ‘cottage industry’

– critical mass issue

• Research partners are essential, but also need industry partners

Page 62: SemWeb 4 Gov – opportunities and challenges

7. Engage with Gov

• Chat to your local friendly Gov IT geeks

• Be aware of stuff already happening

– e.g. in environmental information sharing: WaterML, GeoSciML, GWML, INSPIRE

Page 63: SemWeb 4 Gov – opportunities and challenges

Dr Andrew Woolf

Thank you…

Acknowledgements (again) to collaborators: Josh Bobruk, Robert Boczek, Karl Braganza, Sarah Callaghan, Shirley Crompton, Armin Haller, Colin Harpham, Mike Jackson, Bryan Lawrence, Laurent Lefort, Brian Matthews, Tim Osborn, Clinton Rakich, Will Rogers, Arif Shaon, Jeremy Tandy, Kerry Taylor, Blair Trewin