Dealing with Open Data at ISTAT
Giovanni A. Barbieri, Statistics Italy (Istat)
The Open Data Movement
Our proposal is simple: […] the federal government […] is to provide data that is easy for others to reuse, rather than to help citizens use the data in one particular way or another
Open infrastructures that enable citizens to make their own uses of the data
Reverse the current policy, which is to regard government websites themselves as the primary vehicle for the distribution of public data, and open infrastructures for sharing the data as a laudable but secondary objective [Robinson, Yu, Zeller and Felten 2008]
Crowdsourcing Government Transparency
Government information that is nominally publicly available is in fact difficult to access either because it is not online or, if it is online, because it is not available in useful and flexible formats [Brito 2008]
“Structured data”: an associated structured XML file would allow a user to sort the data by ascending or descending date, alphabetically by headline or author, by number of words, and in many other ways
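The sorting described above can be sketched in a few lines. This is only an illustration: the XML layout, the element names (`release`, `date`, `headline`, `author`, `words`) and the three records are hypothetical and do not come from any real government file.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical structured file: element names and records are made up
# purely to illustrate sorting on different fields.
XML = """
<releases>
  <release><date>2008-03-14</date><headline>Budget update</headline><author>Rossi</author><words>420</words></release>
  <release><date>2008-01-02</date><headline>Annual report</headline><author>Bianchi</author><words>1310</words></release>
  <release><date>2008-02-20</date><headline>Census notice</headline><author>Verdi</author><words>275</words></release>
</releases>
"""


def load_releases(xml_text):
    """Parse the XML into a list of dicts with typed fields."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for node in root.findall("release"):
        records.append({
            "date": date.fromisoformat(node.findtext("date")),
            "headline": node.findtext("headline"),
            "author": node.findtext("author"),
            "words": int(node.findtext("words")),
        })
    return records


releases = load_releases(XML)

# Because the fields are structured, any of them can serve as a sort key.
by_date_desc = sorted(releases, key=lambda r: r["date"], reverse=True)
by_headline = sorted(releases, key=lambda r: r["headline"])
by_words = sorted(releases, key=lambda r: r["words"])
```

The same records published only as a formatted web page would have to be scraped before any of these orderings became possible.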
Open Data Ecosystem
“Open data are adding a new dimension to big data analytics and giving rise to novel, data-driven innovations.” (McKinsey Global Institute Report, Oct. 2013)
From Citizens IBM BLOG
Wide Range of Open Data Consumers
Citizens who want to learn about the places they are in, e.g. with mobile apps showing location-specific features
Journalists who need access to data for up-to-date, well-informed reporting
Educators whose teaching is supported by access to data on different application domains
Official Statistics meets Open Data
Official Statistics can more easily reach such a wide range of users if conveyed through open data
Recent technological advances in the open data community enable new, advanced dissemination channels for Official Statistics
Reinforcing trust
Getting closer to users
Reaching new users
Giving information back
Improving metadata
Linked Open Data
Semantic Web Technological Standards
OWL
Knowledge Representation
Linked Open Data - LOD
Why is Linked Data an Opportunity?
Linked Data as a semantically rich paradigm for data representation
Rich enough for the strict requirements of Official Statistics
Formal and well-defined data structures, i.e. ontologies
Linked Data as an international standard (W3C)
Tools availability and independence
Beyond statistical users
RDF: Resource Description Framework (W3C) (subject-predicate-object)
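The subject-predicate-object model can be illustrated without any RDF library. The sketch below stores triples as plain Python tuples and matches them against a pattern; every URI under `http://example.org/` is a hypothetical placeholder, not an identifier from the Istat ontologies.

```python
# Each RDF statement is a (subject, predicate, object) triple.
# All URIs below are hypothetical placeholders for illustration only.
EX = "http://example.org/"

triples = [
    (EX + "municipality/A", EX + "prop/name", "Municipality A"),
    (EX + "municipality/A", EX + "prop/partOf", EX + "province/P"),
    (EX + "municipality/B", EX + "prop/name", "Municipality B"),
    (EX + "municipality/B", EX + "prop/partOf", EX + "province/P"),
]


def match(store, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# "Which subjects are part of province P?" expressed as a triple pattern.
in_province = [
    subject
    for subject, _, _ in match(triples, p=EX + "prop/partOf", o=EX + "province/P")
]
```

A triple pattern with wildcards is also the basic building block of a SPARQL query, which is why data published this way can be queried uniformly across sources.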
Istat’s Linked Open Data Portal - 1
Istat LOD Portal: http://datiopen.istat.it
English Version: http://datiopen.istat.it/index.php?language=eng
Platform for
• Selecting
• Navigating
• Searching
• Querying
• Visualizing Open Data
The platform allows
• Direct access to data via Web Services
• M2M solutions (e.g. GIS-LOD)
• Data conversion
• Export to productivity tools
• Visualization by means of external tools
Istat’s Linked Open Data Portal - 2
STEP 1 Give each class of users (human or not) the most appropriate way to use the data
STEP 2 Publish data in an open format, whatever the level of openness
STEP 3 Enrich the data with a semantic layer, regardless of the release on public web sites
Istat’s Linked Open Data Portal - 3
Steps to a «perfect» data portal
Istat’s Linked Open Data Portal - 4
[Figure: access modes arranged by freedom of access (Guided Access to Free Access) and type of interaction (Human basic, Human advanced, Machine to Machine). Modes shown: Navigation, Guided queries, Download, Web Service, Query REST on SPARQL EndPoint, Query via SPARQL EndPoint]
Interaction Modes
[Figure: human interaction modes arranged by freedom of access (Guided Access to Free Access) and user technical skills (Basic, Intermediate, Advanced). Modes shown: Navigation, Guided Queries, Download, Predefined Queries (set of simple and customizable queries), Free Queries (SPARQL queries), Free Query via SPARQL EndPoint]
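For the machine-to-machine and free-query modes, a client typically follows the W3C SPARQL 1.1 Protocol: the query text travels in a `query` parameter and results can be requested as SPARQL JSON. The sketch below only builds the request and parses a sample response, without contacting any server. The endpoint URL is a hypothetical placeholder, since the slides do not give the actual endpoint path of the Istat portal, and the sample response is made up.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint: replace with the real SPARQL endpoint of the portal.
ENDPOINT = "https://example.org/sparql"

# A generic query that works against any RDF store.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"


def build_request(endpoint, query):
    """Build a GET request as described by the SPARQL 1.1 Protocol."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def rows(results_json):
    """Flatten a SPARQL JSON result document into a list of plain dicts."""
    doc = json.loads(results_json)
    names = doc["head"]["vars"]
    return [
        {name: binding[name]["value"] for name in names if name in binding}
        for binding in doc["results"]["bindings"]
    ]


request = build_request(ENDPOINT, QUERY)

# Made-up response in the standard SPARQL JSON results layout.
SAMPLE = (
    '{"head": {"vars": ["s"]}, "results": {"bindings": '
    '[{"s": {"type": "uri", "value": "http://example.org/a"}}]}}'
)
parsed = rows(SAMPLE)
```

Sending the request would be a matter of passing `request` to `urllib.request.urlopen` and feeding the response body to `rows`.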
Use Case 1: Spatial Querying
An app that displays on a map some population indicators for the census sections nearest to given GPS coordinates
When accompanied by spatial information, LOD can be accessed through spatial queries
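The core of such an app is a nearest-neighbour lookup on coordinates. A minimal sketch, assuming each census section is reduced to a centroid and using the haversine great-circle distance; the section identifiers, coordinates and population figures are invented for illustration, and the portal itself may well resolve this with a spatial query on the server instead.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    earth_radius_km = 6371.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * earth_radius_km * asin(sqrt(a))


# Hypothetical census-section centroids: (id, latitude, longitude, population).
# All values are made up for illustration.
SECTIONS = [
    ("S1", 41.900, 12.500, 310),
    ("S2", 41.910, 12.480, 150),
    ("S3", 41.850, 12.600, 520),
    ("S4", 41.902, 12.497, 95),
]


def nearest_sections(lat, lon, sections, k=2):
    """Return the k sections whose centroids are closest to (lat, lon)."""
    ranked = sorted(sections, key=lambda s: haversine_km(lat, lon, s[1], s[2]))
    return ranked[:k]


# The GPS position of the user is also an assumed value.
nearest = nearest_sections(41.901, 12.498, SECTIONS, k=2)
```

Sorting the whole list is fine for a handful of sections; a real service covering a national census would rely on a spatial index.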
Use Case 2: Federated Querying - 1
Federated query on Istat and ISPRA, i.e. the query accesses both the Istat and ISPRA portals
With LOD, it is very easy to compare data coming from different sources (linked, for example, at territorial level)
Query on one portal
Results dynamically retrieved from both portals
ISPRA - The Italian National Institute for Environmental Protection and Research
Istat
Use Case 2: Federated Querying - 2
ISTAT data: Census Buildings
ISPRA data: Data on land use / soil consumption
Example Query: Municipality-level analysis of land use / soil consumption and number of buildings by period of construction
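In SPARQL 1.1, federation is expressed with the `SERVICE` keyword, which delegates part of a query to a second endpoint; the results are then combined on a shared variable such as the municipality. The sketch below shows the general shape of such a query and mimics the combination step locally. The endpoint URL, the `ex:` predicates and all data rows are hypothetical, and the slides do not state how the Istat and ISPRA portals actually implement the federation.

```python
# General shape of a SPARQL 1.1 federated query. The endpoint URL and the
# ex: predicates are hypothetical placeholders, not the real vocabularies.
FEDERATED_QUERY = """
PREFIX ex: <http://example.org/prop/>
SELECT ?municipality ?buildings ?landUse WHERE {
  ?municipality ex:buildings ?buildings .
  SERVICE <https://example.org/other-portal/sparql> {
    ?municipality ex:landUse ?landUse .
  }
}
"""

# Made-up result rows standing in for what each portal would return.
census_rows = [
    {"municipality": "M1", "buildings": 1200},
    {"municipality": "M2", "buildings": 340},
]
land_rows = [
    {"municipality": "M2", "land_use_pct": 7.5},
    {"municipality": "M3", "land_use_pct": 12.0},
]


def join_on(key, left, right):
    """Inner join of two lists of dicts on a shared key."""
    index = {row[key]: row for row in right}
    return [
        {**row, **index[row[key]]}
        for row in left
        if row[key] in index
    ]


# Only municipalities present in both sources survive the join.
combined = join_on("municipality", census_rows, land_rows)
```

The join works only because both sources identify the municipality in the same way, which is the practical meaning of data being linked at territorial level.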
Dynamically generated!
Use Case 3: Istat as Open Data Provider in SPOD
Social discussion (on the left) about a graphic representation of Census Data (on the right)
Dynamically generated!
SPOD: Social Platform for Open Data
Conclusions
A dissemination strategy based on open data puts the users of Official Statistics at the centre:
• Reaching them through different channels, e.g. apps and social media
• Making it easier for them to retrieve data, e.g. federated queries that make the distribution of data across different portals transparent
• Providing richer services to them, e.g. spatial querying and dynamic visualizations
Open data, and in particular Linked Open Data, have a leading role in data innovation for Official Statistics
• Macroscale vs microscale modeling
– Pseudo-Einstein (as simple as possible but not simpler)
– Von Neumann (agent-based modeling)
• Technological constraint enabling technology
• It widens the space of what is feasible:
– In production: our experience with SBS.Frame
– In analysis and research…
• A paradigm shift?
– Statistical mechanics vs agent-based modeling
– Just because you can doesn’t mean you should
• Back to open data
– From dissemination to release (“data liberation” at StatCan) to the development of information
– The regulators need to introduce new rules in line with the new scenarios (Don't think of an elephant!: know your values and frame the debate)
One More Thing
Thanks to all my colleagues in Istat contributing to the LOD Portal
Special thanks to Monica Scannapieco and Stefano De Francisci
Questions and clarifications: contact me at [email protected]
Acknowledgments and Thanks