OpenAIRE Interoperability Workshop, University of Minho, 7/8 February 2013 ENGAGE: An Infrastructure for Open, Linked Governmental Data Provision towards Research Communities and Citizens Nikos Houssos National Documentation Centre (EKT) / NHRF EuroCRIS


OpenAIRE Interoperability Workshop (8 Feb. 2013). ENGAGE: An Infrastructure for Open, Linked Governmental Data Provision towards Research Communities and Citizens – Nikos Houssos, National Documentation Centre (EKT)/euroCRIS


Page 1: ENGAGE: An Infrastructure for Open, Linked Governmental Data Provision... – Nikos Houssos

OpenAIRE Interoperability Workshop, University of Minho, 7/8 February 2013

ENGAGE: An Infrastructure for Open, Linked Governmental Data Provision towards Research Communities and Citizens

Nikos Houssos

National Documentation Centre (EKT) / NHRF

EuroCRIS

Page 2: ENGAGE: An Infrastructure for Open, Linked Governmental Data Provision... – Nikos Houssos

Agenda

• ENGAGE project overview
• ENGAGE interoperability aspects
• ENGAGE collaboration opportunities

Page 3: ENGAGE: An Infrastructure for Open, Linked Governmental Data Provision... – Nikos Houssos

ENGAGE Project Information

Acronym: ENGAGE
Title: An Infrastructure for Open, Linked Governmental Data Provision towards Research Communities and Citizens
Contract no: RI-283700
Project type: CP-CSA
Start date: 01/06/2011
Duration: 36 months
Partners: 9
Programme: Framework Programme 7 (2007-2013), Research Infrastructures
Website: http://www.engage-project.eu
Platform: http://www.engagedata.eu

Project participants

NTUA, GR (Coordinator)
TU-DELFT, NL
MIC-GR, GR
IBM-ISRAEL, IL
INTRASOFT, LU
STFC, UK
FhG-FOKUS, DE
AEGEAN, GR
EUROCRIS, NL

Page 4: ENGAGE: An Infrastructure for Open, Linked Governmental Data Provision... – Nikos Houssos

Public Sector Information

• Data produced by governmental organisations – typically referring to datasets
• Examples: geospatial, demographic, statistical, environmental, public safety, financial data
• Growing international movement: open access to PSI datasets in a way that facilitates reuse
• Opening up PSI datasets can potentially lead to substantial economic gains [1]

[1] Vickery, G. (2011): Review of recent studies on PSI re-use and related market developments.

Page 5: ENGAGE: An Infrastructure for Open, Linked Governmental Data Provision... – Nikos Houssos

Overview of ENGAGE objectives

• Development and use of a data infrastructure, incorporating distributed and diverse public sector information (PSI) resources
• Capable of supporting scientific collaboration and research, particularly for the Social Science and Humanities (SSH) scientific communities
• Empowering the deployment of open governmental data towards citizens

Simply put, ENGAGE is a door for researchers that leads them to the world of Open Government Data. Through the ENGAGE platform, researchers and citizens will be able to search, browse, download, visualise and submit diverse and distributed Public Sector datasets from EU countries.

Page 6: ENGAGE: An Infrastructure for Open, Linked Governmental Data Provision... – Nikos Houssos

ENGAGE Two-way Scenario

Forward path (from data collection to services):

• Public Sector Information Collection: Public Sector Organisations; Open data initiations
• Data Curation: Pre-processing; Anonymisation; Harmonisation; Annotation; Linking
• Archival: Cloud and Grid Infrastructure; Platform Independence and Interoperability
• Data Search and Retrieval: Open and intuitive access to the data collection; Context-specific search
• Advanced Data Services: Visualisation (inc. combined views); Context-specific formatting; Collaboration tools

Return path: Delivering Open Data Needs and guidelines to Public Sector Organisations

• Steps shown in the diagram: New Problems – new Challenges; Search Data Needs; New Service Definition for open data; Utilisation of existing Infrastructures; Needs for Governmental data Provision
• Actors shown in the diagram: Public Sector Organisations; ENGAGE and eInfrastructures; ENGAGE; Society; Policy; Research Communities; Policy makers

Page 7: ENGAGE: An Infrastructure for Open, Linked Governmental Data Provision... – Nikos Houssos

ENGAGE provides a single point of access to PSI sources as well as relevant tools in order to cover the needs of researchers and citizens.

ENGAGE traverses across distributed and diverse public sector information resources.

Public data sources (Unstructured / “Semi-structured”):

• Ministries / local public agencies websites
• Publicdata.eu
• National Statistical Offices

Page 8: ENGAGE: An Infrastructure for Open, Linked Governmental Data Provision... – Nikos Houssos

ENGAGE aims to embrace the Linked Data Paradigm while ensuring the quality and responsiveness of highly structured information models.

ENGAGE: not an isolated data silo but a vital part of the Global Data Space.

Page 9: ENGAGE: An Infrastructure for Open, Linked Governmental Data Provision... – Nikos Houssos

ENGAGE will enable EU Researchers / Citizens to:

• Discover and browse datasets across diverse and dispersed public sector information resources (local, National and European) in their own language.
• Upload curated, enhanced or extended versions of existing datasets, originally published by public agencies, in order to address various formats, standards and scientific purposes in a crowdsourcing manner.
• Acquire the datasets
• Visualize properly structured datasets in data tables, maps and charts

Additionally:

• Utilize ENGAGE Application Programming Interfaces (APIs) for searching and acquiring the datasets.
• Rate the quality of datasets on various dimensions
• Request additional datasets or information on existing datasets from the Public Agencies
• View usage statistics
• View publications and other material linked to datasets

Page 10: ENGAGE: An Infrastructure for Open, Linked Governmental Data Provision... – Nikos Houssos

Public Agencies will be able to:

• Utilize the ENGAGE infrastructure (interface and APIs) to publish governmental data
• Register and link their datasets within the ENGAGE infrastructure
• Receive feedback on the quality of their datasets
• Review the opinion or request of citizens and researchers
• View the applications, publications and other datasets uploaded by scientists that are linked to their original published datasets

Page 11: ENGAGE: An Infrastructure for Open, Linked Governmental Data Provision... – Nikos Houssos

ENGAGE Crowdsourcing

Moving from low structured, low value datasets to highly structured and / or derived datasets.

[Diagram labels: Public data sources (Unstructured / Semi-structured / Structured); JSON; Conversion; Data Enrichment; Metadata Enrichment; Cleansing; “Snapshots”; Discovery and Context Metadata; a scale running from "Low Re-Use Value / Quality structure / metadata" to "High Re-Use Value / Quality structure / metadata"]

Page 12: ENGAGE: An Infrastructure for Open, Linked Governmental Data Provision... – Nikos Houssos
Page 13: ENGAGE: An Infrastructure for Open, Linked Governmental Data Provision... – Nikos Houssos

ENGAGE 2.0

• An infrastructure that integrates original PSI data and derived / curated datasets created, maintained and extended by users (researchers, citizens, journalists, computer specialists) in a collaborative environment. A curation platform with focus on the SSH research communities.
• To be released Spring 2013
• The vision of the ENGAGE infrastructure is to extract, highlight and enhance the RE-USE value of PSI data.
  - HOW: Moving from low-structured, isolated, difficult to find PSI data to easy to link, easy to process datasets with rich, structured metadata

Page 14: ENGAGE: An Infrastructure for Open, Linked Governmental Data Provision... – Nikos Houssos

ENGAGE 2.0

• On top of ENGAGE basic functions (catalog, search, visualizations, API)

Researchers / Citizens / Journalists:

• Extend other datasets (official or already extended / derived datasets)
  - Conversions (e.g. HTML / PDF to XLS, PDF to RDF)
  - Data Cleansing (e.g. duplicate records, empty rows, errors)
  - Metadata Enrichment (missing metadata, Linked Data Enablers!)
  - Data Enrichment (enrich datasets with more information)
  - Snapshots of real-time data (e.g. Diavgeia_decisions_10_2012_to_12_2012.xls)
  - Mash-ups / Interlinking (e.g. combine election results with UV radiation levels!)
• View the version tree of official and derived datasets (clean solution: easy to understand and manage the contributions / versions)
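The data cleansing step mentioned on this slide (duplicate records, empty rows) can be illustrated with a minimal sketch. This is not ENGAGE platform code; the function name `cleanse_rows` and the sample figures are made up for illustration, and only the Python standard library is assumed.

```python
import csv
import io


def cleanse_rows(rows):
    """Drop empty rows and exact duplicate records, keeping first occurrences."""
    seen = set()
    cleaned = []
    for row in rows:
        # Trim whitespace so " Attica" and "Attica" count as the same value.
        normalised = tuple(cell.strip() for cell in row)
        if not any(normalised):
            continue  # blank line or a row of empty cells
        if normalised in seen:
            continue  # exact duplicate of an earlier record
        seen.add(normalised)
        cleaned.append(list(normalised))
    return cleaned


# Hypothetical sample data, invented for this example.
SAMPLE_CSV = """region,year,votes
Attica,2012,1000
Attica,2012,1000

Crete,2012,400
,,
"""

rows = list(csv.reader(io.StringIO(SAMPLE_CSV)))
cleaned_rows = cleanse_rows(rows)
```

Detecting errors inside cell values, the third cleansing example on the slide, depends on the dataset and is left out of this sketch.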

Page 15: ENGAGE: An Infrastructure for Open, Linked Governmental Data Provision... – Nikos Houssos

ENGAGE 2.0

Researchers / Citizens / Journalists:

• Data Requests
  - Looking for a dataset (e.g. I can’t find it elsewhere. Does it exist?)
  - Looking for a curation / conversion / enrichment (e.g. I am looking for the election results in Greece in XLS.)
  - Looking for data verification (e.g. Do you think this dataset is valid?)
  - Freedom of Information Requests
• Integration of tools
  - Google Refine
  - ScraperWiki
  - Visualizations

Page 16: ENGAGE: An Infrastructure for Open, Linked Governmental Data Provision... – Nikos Houssos

ENGAGE 2.0

Data Providers:

• Maintainers of Official Datasets
• Work as a group
• Bring the community which works on their data closer to them / direct communication
• See and take advantage of ENGAGE Data Curation Community work (e.g. cleansing, better formats)
• Easy to see / gather all the Applications that are based on their official datasets.
• See the impact of their datasets.
• Understand which datasets have RE-USE value for users.
• Community Help in the process of Digitalization and Opening of current or older Public Data (history dimension)

Page 17: ENGAGE: An Infrastructure for Open, Linked Governmental Data Provision... – Nikos Houssos

Rich, structured metadata to enable Linked Data

• Structure: Entities and semantic relationships instead of plain fields
  - Each entity has structured metadata, including a URI field
• Semantics: Each relationship has clear semantics
  - What is the relationship of organisation Y with data set X?
  - Creator, maintainer, commissioner, …
• Ability to dynamically include vocabularies into the system => linked data, reuse of existing vocabularies/ontologies
• CERIF (Entities and Semantic Layer) provides the required features for contextual metadata
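The idea of entities linked by relationships with explicit semantics, rather than plain fields, can be sketched as follows. This is a simplified illustration and not the CERIF schema itself; the class names, the `roles_of` helper and the `example.org` URIs are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    uri: str   # every entity carries a URI, which is what makes linking possible
    kind: str  # e.g. "Organisation" or "Dataset"
    name: str


@dataclass(frozen=True)
class Relationship:
    subject: Entity
    role: str  # the semantics of the link, e.g. "creator" or "maintainer"
    obj: Entity


def roles_of(org, dataset, relationships):
    """Answer: what is the relationship of organisation Y with data set X?"""
    return sorted(
        r.role for r in relationships if r.subject == org and r.obj == dataset
    )


# Hypothetical entities; the URIs are placeholders.
ministry = Entity("http://example.org/org/ministry", "Organisation", "Ministry Y")
elections = Entity("http://example.org/dataset/elections", "Dataset", "Data set X")

links = [
    Relationship(ministry, "creator", elections),
    Relationship(ministry, "maintainer", elections),
]

ministry_roles = roles_of(ministry, elections, links)
```

Because the role is a value rather than a fixed field name, new roles taken from an external vocabulary can be added without changing the structure.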

Page 18: ENGAGE: An Infrastructure for Open, Linked Governmental Data Provision... – Nikos Houssos

Rich contextual metadata is important

• Captures context, purpose, provenance, coverage, etc.
• Allows the user to:
  - Discover a dataset
  - Evaluate utility and re-use potential
  - Reuse it!
• Enables advanced services
  - Sophisticated search/discovery and navigation, mining, visualisation, reporting

11th International Conference on Current Research Information Systems (CRIS 2012), Prague, 6-9 June 2012

Page 19: ENGAGE: An Infrastructure for Open, Linked Governmental Data Provision... – Nikos Houssos

A 3-level metadata approach

• Level-1. Discovery metadata. Flat schemata (analogous to Dublin Core). Enables basic search by non-sophisticated users.
• Level-2. Usage metadata. A structured, semantically-rich model for contextual metadata. Enables advanced domain-independent services.
• Level-3. Domain metadata. Detailed domain-specific metadata. Allows advanced services provided by specialised tools.
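A minimal sketch of how one record could carry all three levels is shown below. The class `DatasetMetadata`, its field names and the sample values are assumptions for illustration only; the slides do not specify a concrete data structure.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class DatasetMetadata:
    # Level 1: flat key/value pairs, in the spirit of Dublin Core elements.
    discovery: Dict[str, str]
    # Level 2: structured contextual metadata (e.g. roles, provenance).
    usage: Dict[str, Any] = field(default_factory=dict)
    # Level 3: detailed domain-specific metadata for specialised tools.
    domain: Dict[str, Any] = field(default_factory=dict)


def matches_keyword(record, keyword):
    """Basic search that looks only at the flat level-1 fields."""
    needle = keyword.lower()
    return any(needle in value.lower() for value in record.discovery.values())


# Hypothetical record with made-up values.
record = DatasetMetadata(
    discovery={"title": "Election results 2012", "language": "el"},
    usage={"maintainer": {"uri": "http://example.org/org/ministry"}},
    domain={"electoral_system": "proportional"},
)

found = matches_keyword(record, "election")
```

Keeping level 1 flat means a simple keyword search works without any knowledge of the richer level-2 and level-3 structures.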


Page 20: ENGAGE: An Infrastructure for Open, Linked Governmental Data Provision... – Nikos Houssos

Metadata approach

Page 21: ENGAGE: An Infrastructure for Open, Linked Governmental Data Provision... – Nikos Houssos

Overview of architecture for PSI metadata

[Diagram labels: Data Source 1, Data Source 2, … Data Source N; Dublin Core; eGMS; CERIF; RDF / Linked Open Data; SPARQL interface; DCAT; CKAN]
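To make the step from flat metadata to RDF / Linked Open Data more concrete, here is a small sketch that serialises a flat record as N-Triples using DCAT and Dublin Core Terms properties. It is not the ENGAGE implementation: the field-to-property mapping, the function names and the `example.org` dataset URI are assumptions, while the namespace URIs are those of the published vocabularies.

```python
# Namespace URIs of the W3C DCAT, Dublin Core Terms and RDF vocabularies.
DCAT = "http://www.w3.org/ns/dcat#"
DCT = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Assumed mapping from flat field names to vocabulary properties.
PROPERTY_MAP = {
    "title": DCT + "title",
    "description": DCT + "description",
    "keyword": DCAT + "keyword",
}


def escape_literal(text):
    """Escape backslashes, double quotes and newlines for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_ntriples(dataset_uri, flat_record):
    """Return one N-Triples line per mapped field, plus a dcat:Dataset type line."""
    lines = [f"<{dataset_uri}> <{RDF_TYPE}> <{DCAT}Dataset> ."]
    for field_name in sorted(flat_record):
        prop = PROPERTY_MAP.get(field_name)
        if prop is None:
            continue  # fields without a mapping are skipped
        value = escape_literal(flat_record[field_name])
        lines.append(f'<{dataset_uri}> <{prop}> "{value}" .')
    return lines


# Hypothetical record; "internal_id" has no mapping and is therefore dropped.
record = {"title": "Election results 2012", "keyword": "elections", "internal_id": "42"}
triples = to_ntriples("http://example.org/dataset/elections-2012", record)
```

Triples of this kind could be loaded into a triple store and queried through a SPARQL interface, as the diagram suggests.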

Page 22: ENGAGE: An Infrastructure for Open, Linked Governmental Data Provision... – Nikos Houssos

Thank you