SDMX and DDI Working Together
Marco Pellegrino, Denis Grofils
Eurostat
Technical Workshop 5-7 June 2013
Outline
1. Where are we (background)?
2. Where are we going (plans and projects)?
3. DDI-SDMX dialogue
4. Agenda items
Where are we?
Dramatic changes in the environment of official statistics producers (e.g. data deluge)
Modernization of statistical information systems seen as a question of survival for the official statistics sector
Standardization viewed as a key enabler for modernization
"Standards-based" industrialization of statistical production
Standardization in the ESS
The ESS has a long tradition in harmonising statistical products and regulating requirements within the different statistical domains
SDMX: a success story
Standardization allows cross-domain synergies by enabling sharing of data & software
ESS Vision
– Increase the level of integration of the ESS (vertical & horizontal)
– Maximize sharing & re-use
Standardisation in early stages of production: TO DO!
ESS Architecture, current situation
ESTAT
ESS VIPs CC – "Technical" CC – "Programme"
ADMIN Information models Communication
NAPS Communication Network Governance
ESS DWPRIX and TRIS Data warehouse Human resources
EsBRS Shared services Financial resources
SIMSTAT Legal framework
ICT Programme office
Common validation policy
ESS VIPs and Cross-cutting projects
ESS.VIP business and information principles
• Maximum reuse of existing process components and segments
• Metadata driven processes allowing adaptation/parameterisation and extension to other contexts
• New business process built as a sequence of modular process steps / services
• Information objects structured according to available information models and stored in corporate registries/repositories in view of reuse
• Adherence to industry and open standards as available
Two important projects (1/2)
ESS.VIP Cross-cutting Project on Information Models and Standards (IMS)
– To ensure that the European Statistical System (ESS) has access to a set of agreed-upon standards supporting the modernisation of statistical services
– To increase coherence between standards, while ensuring that these are consistent with best practices and recommendations from the international community of official statistics
– To define information models that can be used across the ESS to model structural metadata for different types of data, taking into account existing standards and on-going developments
– To provide support mechanisms (e.g., capacity-building and training) for the practical implementation of these standards and models
Two important projects (2/2)
UNECE Project Frameworks and Standards for Statistical Modernisation (FSFSM)
– To ensure that the international statistical community has access to the standards needed to support the modernisation of statistical production and services
– To increase coherence between these standards
– To provide support mechanisms for the practical implementation of these standards within national and international statistical organisations
– To ensure effective promotion and maintenance of the GSBPM and the GSIM, including the release of new versions as appropriate
SDMX-DDI dialogue
Launched in 2010 with 3 goals:
To avoid duplication of efforts and thus avoid confusion about which standards should be used for specific types of applications
To provide reassurance to the user communities of DDI and SDMX that the end-to-end statistical process can be managed, and that standards bodies are both considering the needs of users in this area
To provide specific technical guidance about the use cases and implementation of the standards for specific purposes
Endorsed by DDI Alliance and SDMX Sponsors / Secretariat (mandate TWG)
SDMX & DDI
SDMX: Statistical Data and Metadata eXchange
– Standard for the exchange of statistical data and metadata
– “the preferred standard for exchange and sharing of data and metadata in the global statistical community” (UN Statistical Commission, 2008)
– Widely used in the ESS
– Extended to support unit-level data
DDI: Data Documentation Initiative
– Standard for the documentation of data
– Initially focused on archiving micro-data in the area of social sciences
– Widely used in national data archives
– Extended to support the full life-cycle of data
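Generic Statistical Business Process Model
[Figure: GSBPM diagram with parts labelled DDI and SDMX]
GSBPM, DDI and SDMX: towards a complete system?
[Figure: GSBPM diagram with parts labelled DDI and SDMX]
GSBPM, DDI and SDMX: towards a complete picture?
[Figure: GSBPM diagram with parts labelled DDI and SDMX]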
Characterizing the Standards: DDI
DDI Lifecycle can provide a very detailed set of metadata, covering:
– The study or series of studies
– Many aspects of data collection, including surveys and processing of microdata
– The structure of data files, including hierarchical files and those with complex relationships
– The lifecycle events and archiving of data files and their metadata
– The tabulation and processing of data into tables (NCubes)
• Allows for a link between the microdata variables and the resulting aggregates
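The link between microdata variables and the resulting aggregates can be illustrated with a minimal sketch. It is not DDI itself: the records, variable names and the `tabulate` helper are hypothetical, and the "cube" is just a counter keyed by combinations of variable values.

```python
from collections import Counter

# Hypothetical unit-level records (microdata); variable names are illustrative.
microdata = [
    {"id": 1, "sex": "F", "region": "North"},
    {"id": 2, "sex": "M", "region": "North"},
    {"id": 3, "sex": "F", "region": "South"},
    {"id": 4, "sex": "F", "region": "North"},
]


def tabulate(records, variables):
    """Count records for each combination of the given microdata variables."""
    return Counter(tuple(record[v] for v in variables) for record in records)


# The tuple of variable names records which microdata variables
# became the dimensions of the aggregate table.
dimensions = ("sex", "region")
cube = tabulate(microdata, dimensions)

for cell, count in sorted(cube.items()):
    print(dict(zip(dimensions, cell)), count)
```

Keeping the `dimensions` tuple alongside the counts is what preserves the trace from each aggregate cell back to the variables it was built from.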
Characterizing the Standards: SDMX
• Describes the structure of aggregate/dimensional data (“structural metadata”)
• Provides formats for the dimensional data
• Provides a model of data reporting and dissemination
• Provides a way of describing and formatting stand-alone metadata sets (“reference metadata”)
• Provides standard registry interfaces, providing a catalogue of resources
• Provides guidelines for deploying standard web services for SDMX resources
• Provides a way of describing statistical processes
[Figure: SDMX components: data validation and editing; SDMX Registry; DSD and data set; MSD and metadata set; web services; process metadata]
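To make "structural metadata" concrete, here is a small sketch of what a data structure definition (DSD) does: it names the dimensions, constrains their codes, and so lets an observation be checked and addressed by a key. The `DataStructure` class is a hypothetical simplification, not the SDMX information model, and the dimension ids and codes are only illustrative.

```python
from dataclasses import dataclass


@dataclass
class DataStructure:
    """Simplified stand-in for an SDMX data structure definition (DSD)."""

    dimensions: dict  # dimension id -> set of allowed codes, in key order
    measure: str = "OBS_VALUE"

    def series_key(self, observation):
        """Validate an observation and return its dot-separated key."""
        for dim, codes in self.dimensions.items():
            if dim not in observation:
                raise ValueError(f"missing dimension {dim}")
            if observation[dim] not in codes:
                raise ValueError(f"invalid code {observation[dim]!r} for {dim}")
        return ".".join(observation[dim] for dim in self.dimensions)


# Illustrative structure: frequency, reference area, sex.
dsd = DataStructure(
    dimensions={
        "FREQ": {"A", "Q"},
        "REF_AREA": {"BE", "FR"},
        "SEX": {"F", "M", "T"},
    }
)

key = dsd.series_key({"FREQ": "A", "REF_AREA": "BE", "SEX": "F", "OBS_VALUE": 123})
print(key)
```

An observation with a code outside the declared lists is rejected, which is the kind of check a shared structure makes possible between a data provider and a data receiver.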
DDI and SDMX
DDI offers a very rich model for the documentation of micro-data
SDMX offers a highly integrated exchange platform for statistical outputs (IT architectures, tools, web services)
The combined use of both standards could allow a higher level of integration of the complete production process
But: The devil is in the detail!
Analysis of use cases
Set of relevant use cases where the two standards could be compared:
1. Survey data collection
2. Administrative and register data
3. Combined use of DDI and SDMX
4. Micro-data access and on-demand tabulation of micro-data
5. Metadata and quality reporting
The challenge
It's not about which flavor of XML we use (XML doesn’t really matter)
It’s about data and metadata!
– If I want to use DDI to describe my data, and you want to use SDMX, how can we ensure that we are getting the same data and metadata?
It's about the convergence of information models and the availability of an integrated IT environment
Combined DDI-SDMX approaches
Mixing the two standards within an implementation, allowing for the expression of the same metadata set in both standards, so that the information could be transformed from one format to the other.
Metadata stored and indexed in such a fashion that it can be expressed either as SDMX or DDI on an as-needed basis.
Metadata Repository and Registry project at ABS.
The actual format used for metadata storage may be neither SDMX nor DDI, so long as it can be expressed using both standards.
GSIM to be implemented through a combination of SDMX and DDI?
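The idea of storing metadata in a neutral form and expressing it as SDMX or DDI on demand can be sketched as follows. This is only an illustration under strong assumptions: the element names are simplified placeholders, not the actual SDMX-ML or DDI schemas, and the `variable` record and all function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Neutral, format-independent description of one coded variable (hypothetical).
variable = {
    "id": "SEX",
    "label": "Sex of respondent",
    "codes": {"F": "Female", "M": "Male"},
}


def to_sdmx_like(var):
    """Render the variable as a simplified, SDMX-style code list."""
    codelist = ET.Element("Codelist", id="CL_" + var["id"])
    ET.SubElement(codelist, "Name").text = var["label"]
    for code, label in var["codes"].items():
        code_el = ET.SubElement(codelist, "Code", id=code)
        ET.SubElement(code_el, "Name").text = label
    return codelist


def to_ddi_like(var):
    """Render the variable as a simplified, DDI-style variable description."""
    variable_el = ET.Element("Variable", id=var["id"])
    ET.SubElement(variable_el, "Label").text = var["label"]
    for code, label in var["codes"].items():
        category = ET.SubElement(variable_el, "Category")
        ET.SubElement(category, "Value").text = code
        ET.SubElement(category, "Label").text = label
    return variable_el


def codes_from_sdmx_like(element):
    return {c.get("id"): c.find("Name").text for c in element.findall("Code")}


def codes_from_ddi_like(element):
    return {
        c.find("Value").text: c.find("Label").text
        for c in element.findall("Category")
    }


sdmx_view = to_sdmx_like(variable)
ddi_view = to_ddi_like(variable)

# Both renderings should carry exactly the same codes and labels.
same = codes_from_sdmx_like(sdmx_view) == codes_from_ddi_like(ddi_view)
print(ET.tostring(sdmx_view, encoding="unicode"))
print(ET.tostring(ddi_view, encoding="unicode"))
print("same content:", same)
```

Reading the codes back from each rendering and comparing them is one simple way to check that both expressions carry the same metadata, which is the concern raised under "The challenge".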
Generic Statistical Information Model (GSIM)
[Figure: Common Generic Industrialised Statistics. Conceptual level: GSBPM (Business Concepts) and GSIM (Information Concepts). Practical level: Methods (Statistical HowTo) and Technology (Production HowTo). A second version of the diagram adds SDMX, DDI, ISO 11179, etc.]
[Figure: GSIM as the conceptual model; DDI, SDMX, geospatial standards and other relevant standards as implementation standards]
Agenda
1. Introduction
2. Business needs for DDI and SDMX
3. Technical overview of DDI and SDMX
4. Conceptual mapping between GSIM, DDI and SDMX
5. Use cases
– Statistical registers and administrative data
– Survey data management (combined use of DDI and SDMX)
– DDI and SDMX as tools for Metadata and Quality Reporting
6. Discussion: How to move forward
7. Conclusion