Supporting Researchers and Institutions in Exploiting Administrative Databases for Statistical Purposes: Istat’s Strategy G. D’Angiolini, P. De Salvo, A. Passacantilli Istituto Nazionale di Statistica - ITALY IAOS 2014 Conference



EXPLOITING ADMINISTRATIVE DATA: new requirements

THE TRADITIONAL SCENARIO

The data production processes of National Statistical Institutes (NSIs) exploit administrative data sources as input or auxiliary data sources

• NSIs analyze the administrative sources’ information content when needed

• NSIs evaluate the administrative data sources’ quality from the viewpoint of their own data production processes

1 Exploiting Administrative Databases for Statistical Purposes: Istat’s Strategy, G. D’Angiolini – IAOS 2014 Conference


THE TRADITIONAL SCENARIO: effects

NSIs analyze the administrative sources’ information content when needed

No organized effort has been made to document the available administrative data sources

NSIs evaluate the administrative sources’ quality from the viewpoint of their data production processes

We do not yet have real insight into the actual determinants of administrative data quality

We do not yet have a satisfactory methodological approach to the overall evaluation of administrative data quality


NEW TRENDS

OLAP-DW (Online Analytical Processing and Data Warehouse) Approach

Non-statistical organizations implement their own Decision Support Systems

These Decision Support Systems are in fact statistical information systems

Effects of the spread of the OLAP-DW Approach

Non-statistical organizations produce statistical information for their own use

Non-statistical organizations exploit and exchange administrative data for this purpose


NEW TRENDS

OLAP-DW Approach: it is adopted because ANY ORGANIZATION needs to study its own phenomena of interest and acquire ANY KIND OF DATA for this purpose

OPEN DATA

BIG DATA

New interest in data quality issues among computer science researchers


A NEW TASK for NSIs

NSIs are required to provide every actual and potential user of administrative data sources with NEW SERVICES

SUPPLYING DOCUMENTATION about the INFORMATION CONTENT and the QUALITY of the available administrative data sources

MAKING the available administrative data sources MORE EXPLOITABLE for statistical purposes, by modifying their content where possible


ISTAT’S STRATEGY: DOCUMENTING ADMINISTRATIVE DATA SOURCES

ACTIVITIES aimed at collecting and disseminating DOCUMENTATION about the INFORMATION CONTENT and the QUALITY of the available administrative data sources:

• Administrative data sources’ INVESTIGATIONS, for administrative data sources managed by central government institutions

• Administrative data sources’ SURVEYS, for administrative data sources managed by local government institutions


Administrative data sources’ INVESTIGATIONS

Istat’s experts and the data source’s experts jointly analyze and document each source, following a standard template for collecting information about its content and quality

The collected information is disseminated to any potential statistical user by means of a dedicated web-based metadata management system, called DARCAP (Documenting Public Administration Archives)

To support in-depth quality analyses of the most important administrative data sources, we are defining a new Quality Assessment Framework for Administrative Data Sources


Administrative data sources’ SURVEYS

Local government institutions often manage a large number of independent administrative databases covering several subject matters

The surveys of administrative data sources are therefore aimed mainly at enumerating the existing administrative data sources and classifying them by subject matter

Such information is disseminated to any potential statistical user by means of DARCAP (Documenting Public Administration Archives)


ISTAT’S STRATEGY: MODIFYING ADMINISTRATIVE DATA SOURCES

MAKING the available administrative data sources MORE EXPLOITABLE for statistical purposes entails adopting standard statistical definitions, classifications and data management conventions

To make this easier, Istat is launching an ACTIVITY of SUPERVISION ON CHANGES AND INNOVATION PROJECTS concerning the available administrative data sources and their related forms


SUPERVISION ON CHANGES AND INNOVATION PROJECTS

For the most important administrative data sources, the owner institution is required to notify Istat each time it plans any kind of change in the source’s information content

e.g. a periodic change in the income declaration forms, or a new data warehouse

On the basis of the notifications it receives, Istat may give feedback and issue appropriate recommendations

such as using official instead of non-official classifications, improving the identification code system, improving the quality control procedures

This activity is supported by DARCAP (Documenting Public Administration Archives)
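
The notification and feedback cycle described above can be pictured as a small record structure. The following is only a minimal sketch under stated assumptions: the class names, field names and the "pending"/"reviewed" status are hypothetical illustrations and are not taken from DARCAP, whose internal design is not described in these slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ChangeNotification:
    """A planned change to a source's information content (hypothetical structure)."""
    source_name: str   # illustrative label for the administrative data source
    owner: str         # institution that owns the source
    description: str   # what the owner plans to change
    recommendations: List[str] = field(default_factory=list)

    def recommend(self, text: str) -> None:
        """Attach one recommendation issued in response to the notification."""
        self.recommendations.append(text)

    @property
    def status(self) -> str:
        """'pending' until at least one recommendation has been recorded."""
        return "reviewed" if self.recommendations else "pending"


# Illustrative values only, echoing the examples given on the slide.
note = ChangeNotification(
    source_name="income declaration forms",
    owner="owner institution",
    description="periodic change in the forms",
)
status_before = note.status
note.recommend("use official instead of non-official classifications")
status_after = note.status
```

Keeping the recommendations on the same record as the notification mirrors the idea that both the notified change and the resulting feedback are documented together.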


ISTAT’S STRATEGY: ACTIVITIES AND TOOLS

Activities:

• Administrative data sources’ INVESTIGATIONS

• Administrative data sources’ SURVEYS

• SUPERVISION ON CHANGES AND INNOVATION PROJECTS concerning administrative data sources and forms

DARCAP (Documenting Public Administration ARchives) system:

• Disseminates the collected information about the administrative data sources’ CONTENT and QUALITY

• Supports the CHANGE NOTIFICATION activities

• Documents the content of the notified changes and Istat’s recommendations

Quality Assessment Framework for Administrative Data Sources:

• Organizes the collected information about QUALITY in a standard QUALITY ASSESSMENT FRAMEWORK

• Guides the quality evaluator in calculating appropriate numerical quality indicators


ISTAT’S STRATEGY: GOALS AND CHALLENGES

Making the available administrative data sources COMPARABLE and EXPLOITABLE for statistical purposes, without any particular use in mind

Producing STANDARD DOCUMENTATION of the administrative data sources’ information content and quality

Disseminating such STANDARD DOCUMENTATION to any potential user of administrative data sources


STANDARD DOCUMENTATION of the available administrative data sources’ information content requires specifying the ADMINISTRATIVE DATA SOURCE’S ONTOLOGY according to a suitable conceptual model

DARCAP documents the ADMINISTRATIVE DATA SOURCE’S ONTOLOGY

STANDARD DOCUMENTATION of the available administrative data sources’ quality requires specifying the administrative data source’s quality according to a structured QUALITY ASSESSMENT FRAMEWORK
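
As a rough illustration of what documenting a source's ontology could look like, the sketch below uses the item kinds the slides name elsewhere (observed populations, observed sets of events, observed characteristics). The class layout, the field names and the example source are assumptions made for illustration; they do not reproduce DARCAP's conceptual model, which these slides do not specify.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ObservedItem:
    """An observed population or an observed set of events (hypothetical structure)."""
    name: str
    kind: str  # "population" or "events"
    characteristics: List[str] = field(default_factory=list)


@dataclass
class SourceOntology:
    """Documentation of one administrative data source's information content."""
    source_name: str
    items: List[ObservedItem] = field(default_factory=list)

    def characteristics_of(self, item_name: str) -> List[str]:
        """Return the observed characteristics recorded for the named item."""
        for item in self.items:
            if item.name == item_name:
                return list(item.characteristics)
        return []


# Invented example content, for illustration only.
ontology = SourceOntology(
    source_name="example register",
    items=[
        ObservedItem("registered persons", "population", ["identifier", "municipality"]),
        ObservedItem("registrations", "events", ["registration date"]),
    ],
)
person_fields = ontology.characteristics_of("registered persons")
```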


Many QUALITY ASSESSMENT FRAMEWORKS for ADMINISTRATIVE DATA SOURCES exist, which propose quality indicators organized in a standard structure of quality dimensions

BUT

• The existing Administrative Data Quality Frameworks often specify quality indicators from the limited viewpoint of the NSIs’ data production processes

• We do not yet have real insight into the actual determinants of administrative data quality

• We do not yet have a satisfactory methodological approach to the overall evaluation of administrative data quality

This was not a problem in the traditional scenario, but IT IS A PROBLEM NOW


The administrative data sources’ quality is generally influenced by:

• systematic errors (which in surveys we can generally control)

• the data collection process, which is continuous over time

Moreover, quality often differs across the items in each data source’s ontology: observed populations, observed sets of events, observed characteristics

In order to define Istat’s Quality Assessment Framework for Administrative Data Sources, we are carrying out a careful analysis of possible errors which takes this context into account … and aims at leading the quality evaluator to the specification of an in-depth and customized quality assessment
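
To make the idea of item-level quality indicators concrete, here is a minimal sketch that computes one indicator separately for each observed characteristic. Completeness (the share of records with a non-missing value) is chosen here purely as a familiar example; the slides do not say which indicators Istat's framework specifies, and the function name and sample records are invented for illustration.

```python
from typing import Dict, Iterable, List, Optional

Record = Dict[str, Optional[str]]


def completeness(records: Iterable[Record], characteristics: List[str]) -> Dict[str, float]:
    """Share of records with a non-missing value, per observed characteristic."""
    rows = list(records)
    if not rows:
        # No records: report 0.0 rather than dividing by zero.
        return {name: 0.0 for name in characteristics}
    result: Dict[str, float] = {}
    for name in characteristics:
        # A value counts as missing when absent, None, or an empty string.
        filled = sum(1 for row in rows if row.get(name) not in (None, ""))
        result[name] = filled / len(rows)
    return result


# Toy records, invented for illustration only.
sample = [
    {"identifier": "A1", "municipality": "Roma"},
    {"identifier": "B2", "municipality": ""},
    {"identifier": None, "municipality": "Milano"},
    {"identifier": "D4", "municipality": "Torino"},
]
scores = completeness(sample, ["identifier", "municipality"])
```

Computing the indicator per characteristic, rather than once for the whole source, reflects the point that quality may differ from one ontology item to another.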


PRESENT AND FUTURE WORK

We have stored in the DARCAP system the results of DATA SOURCE INVESTIGATIONS on a first set of important administrative data sources owned by central government institutions

We have stored in the DARCAP system the results of a first SURVEY ON ADMINISTRATIVE DATA SOURCES owned by local government institutions

We plan to extend the investigation activity to more and more administrative data sources

We plan to launch the SUPERVISION ACTIVITY on the administrative data sources’ changes and innovation projects

We are specifying quality indicators on the basis of a careful analysis of the possible errors which may affect the administrative data sources
