Eurostat – Directorate B: Corporate statistical and IT services
SDMX Basics Training – 2013
SDMX basics
Marco Pellegrino
Eurostat, Directorate B
Purpose of this training session
At the end of this session you will:
– Know the basics of the SDMX model
– Understand the techniques to identify the structure of data
– Identify the concepts in a simple data set
– Be able to develop simple data structure definitions using SDMX tools
– Be familiar with the main IT architecture and tools used by Eurostat for SDMX implementation projects
What is a standard?
According to ISO, a standard is a document, established by consensus and approved by a recognized body, that provides rules, guidelines or characteristics for common and repeated use, for activities or their results, aimed at the achievement of the optimum degree of order in a given context.
Lack of standardisation in data exchanges or across organisations
Different formats of data and metadata
– EDIFACT
– Structured files
– XML
– Paper form
Different places to store data and metadata
Different media
– File upload
– Web form
– Removable media
– Dial-up
– Paper
WHAT SDMX IS
This is what SDMX provides and enables
A model to describe statistical data and metadata
A standard for automated communication from machine to machine
A technology supporting standardised IT tools
In order to take advantage of all this:
Statisticians agree to use a common description for data and metadata
The data exchange process is then driven by the common description
Data descriptions are made available for everybody who wants to understand and reuse the data
From version 1.0 to version 2.1 to…?
September 2004: Version 1.0 (GESMES/TS)
ISO/TS 17369
November 2005: Version 2.0 (SDMX-EDI, SDMX-ML, SDMX Registry)
February 2008: SDMX accepted at UN level; recognised and supported as the preferred standard
April 2011: SDMX 2.1
All good standards change…
All standards change over time, and are released in a series of versions
Changes always have some impact on users
– Users are not expected to always use the latest version of a standard
– Standards organisations (like SDMX) have to provide support for several versions of the standard, all of which are in use
Change management
Danger (1): too much change may discourage adoption
Danger (2): not giving users the functionalities they want will discourage adoption
Need to find a balance
THE SDMX COMPONENTS
Technical Specifications
The SDMX Information Model
Guidelines to Harmonise Content
Content-oriented Guidelines (COG)
Tools
IT Architectures for data exchange
SDMX compliant tools
Models?
A model is a partial analogy of a system
René Magritte
“This is not a pipe”
The analogy between the model and the represented reality is partial.
The properties of the model are not identical to the properties of the reality.
I can’t smoke with this pipe!
The four meta-modelling levels
Real data (e.g. BOP, ESA)
Data model: concepts, codes, DSD
SDMX metamodel
A model represents a system and conforms to a metamodel
The Generic Statistical Business Process Model
Overall integration of methods and techniques
Design Build Collect Process Disseminate Use
DATA
Software Services
Administrator
DEFINITIONS
User
Information Model
The role of the Information Model
A user-level formal language to:
• express, agree and design information needs
• give specifications to reporting agents
• communicate with IT people
• drive the software (which doesn’t change)
• document the system
User autonomy
Flexible information system, evolving fast & cheaply
SDMX Information Model (“metamodel”)
Provides a way of modelling data, metadata and exchange processes
Identify/Describe
Dataset structure
Data
Structural metadata
Data Structure Definition (DSD)
Dimensions (ex: country, variable/topic, year)
Attributes (ex: unit of measure)
Metadata about an individual value, a time series or a group of time series
Code lists
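The building blocks named on this slide (dimensions, attributes, code lists, all gathered in a Data Structure Definition) can be sketched in a few lines of Python. This is only an illustration of the idea, not an SDMX tool or library: the class names, the identifiers (DEMO_DSD, CL_COUNTRY, CL_UNIT) and the codes are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CodeList:
    """A named set of allowed codes with human-readable labels."""
    id: str
    codes: dict  # code -> label


@dataclass
class Component:
    """A concept used in a structure, optionally restricted to a code list."""
    concept: str
    role: str  # "dimension", "attribute" or "measure"
    codelist: Optional[CodeList] = None


@dataclass
class DataStructureDefinition:
    """Hypothetical, simplified DSD: an identifier plus its components."""
    id: str
    components: list = field(default_factory=list)

    def dimensions(self):
        return [c.concept for c in self.components if c.role == "dimension"]

    def attributes(self):
        return [c.concept for c in self.components if c.role == "attribute"]


# Illustrative code lists (invented for this sketch).
CL_COUNTRY = CodeList("CL_COUNTRY", {"FR": "France", "IT": "Italy"})
CL_UNIT = CodeList("CL_UNIT", {"EUR": "Euro", "PERS": "Persons"})

# Dimensions and attribute follow the slide's examples:
# country, variable/topic, year; unit of measure.
dsd = DataStructureDefinition(
    "DEMO_DSD",
    [
        Component("COUNTRY", "dimension", CL_COUNTRY),
        Component("TOPIC", "dimension"),
        Component("YEAR", "dimension"),
        Component("UNIT_MEASURE", "attribute", CL_UNIT),
        Component("OBS_VALUE", "measure"),
    ],
)

print(dsd.dimensions())
print(dsd.attributes())
```

The point of the sketch is that the structure is itself data: software can read it to know which concepts identify a value and which only describe it.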
Describing the data exchange
Who? What? When? Where? How?
Data Structure Definition: Concept Usage
Topic (Dimension)
Time (Dimension)
Frequency (Dimension)
Country (Dimension)
Stock/Flow (Dimension)
Unit (Attribute)
Unit Multiplier (Attribute)
Observation (Measure)
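To see what the roles mean in practice, the sketch below splits one observation into its identifying key, its descriptive attributes and its value. It assumes that Topic, Time, Frequency, Country and Stock/Flow act as dimensions, that Unit and Unit Multiplier are attributes, and that Observation is the measure. The concept identifiers, the function name and the sample values are hypothetical and are not real statistics.

```python
# Assumed role of each concept (identifiers are illustrative).
ROLES = {
    "FREQ": "dimension",
    "COUNTRY": "dimension",
    "TOPIC": "dimension",
    "STOCK_FLOW": "dimension",
    "TIME": "dimension",
    "UNIT": "attribute",
    "UNIT_MULT": "attribute",
    "OBS_VALUE": "measure",
}


def split_observation(record, roles=ROLES):
    """Return (key, attributes, value) for one flat observation record."""
    # Dimensions, taken in the order they are declared, form the key.
    key = tuple(record[c] for c, r in roles.items() if r == "dimension")
    # Attributes add information but do not identify the observation.
    attrs = {c: record[c] for c, r in roles.items() if r == "attribute"}
    measures = [record[c] for c, r in roles.items() if r == "measure"]
    if len(measures) != 1:
        raise ValueError("expected exactly one measure")
    return key, attrs, measures[0]


# Made-up example record.
record = {
    "FREQ": "A",
    "COUNTRY": "FR",
    "TOPIC": "GDP",
    "STOCK_FLOW": "FLOW",
    "TIME": "2012",
    "UNIT": "EUR",
    "UNIT_MULT": "6",
    "OBS_VALUE": 123.4,
}

key, attrs, value = split_observation(record)
print(key, attrs, value)
```

Two records with the same key refer to the same observation, whatever their attributes say; that is the practical difference between a dimension and an attribute.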
Data Structure Definition: Defining Multi-dimensional Structures
• Comprises
– Concepts that identify the observation value (Dimensions)
– Concepts that add additional metadata about the observation value (Attributes)
– Concept that is the observation value (Measure)
– Any of these may be (Representation)
  • coded
  • text
  • date/time
  • number
  • etc.
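The representation of a concept decides which values are acceptable for it. A minimal sketch of such a check, covering the four representations listed above, could look as follows. The function name is hypothetical, and the date/time case assumes plain ISO dates (YYYY-MM-DD), which is a simplification made for this example.

```python
from datetime import datetime


def is_valid(value, representation, codes=None):
    """Check a value against one of the representations named on the slide."""
    if representation == "coded":
        # A coded value must belong to the associated code list.
        return codes is not None and value in codes
    if representation == "text":
        return isinstance(value, str)
    if representation == "date/time":
        # Simplifying assumption: only YYYY-MM-DD dates are accepted here.
        try:
            datetime.strptime(value, "%Y-%m-%d")
            return True
        except (TypeError, ValueError):
            return False
    if representation == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    raise ValueError(f"unknown representation: {representation}")


print(is_valid("FR", "coded", {"FR", "IT"}))
print(is_valid("2013-06-30", "date/time"))
print(is_valid("3.5", "number"))
```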
Domain 1
Cross-domain concepts and code lists
FREQ
REF_AREA
Domain 2
Set of used concepts
Cross-domain concepts
COMPARABILITY
Statistical subject-matter domains
Based on the UNECE Classification of International Statistical Activities
Content-Oriented guidelines
Cross-domain concepts and code lists
Statistical subject-matter domains
Metadata common vocabulary
Recommendations to harmonise implementations
Organisation 1 Organisation 2 Organisation 3
interoperability
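Why cross-domain concepts support comparability can be shown with a small sketch: if two domains both take the frequency and reference area concepts from the same cross-domain set, those are the concepts along which their data can be lined up. FREQ and REF_AREA are written here in identifier form; the domain-specific concepts (BOP_ITEM, ACTIVITY) and the function name are hypothetical.

```python
# Concepts used by two statistical domains (illustrative sets).
domain_1 = {"FREQ", "REF_AREA", "BOP_ITEM"}
domain_2 = {"FREQ", "REF_AREA", "ACTIVITY"}


def shared_concepts(*domains):
    """Concepts used by every domain: the basis for comparing their data."""
    return set.intersection(*domains)


common = shared_concepts(domain_1, domain_2)
print(sorted(common))
```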
Some benefits from SDMX standards
SDMX provides support for things that are essential to statisticians, but are often difficult for them to achieve
International standard for holding all of the elements involved in the statistical process together in a clear information model
Approach that maximises the amount of information on the statistical context that can be passed through to users, and the capacity of linking statistics from similar or different sources
Automation of processes: SDMX enables the development of common tools that can be used by all statistical organisations to improve their activities
Benefits from SDMX standards (2)
SDMX Reference Infrastructure
Statistical Organisation
Web services enable query, visualisation, and automated loading of data and metadata. SDMX tools allow querying a database, or a file system, for the creation of tables, charts, and graphs from the results of the query.
SDMX is also an advanced standard for data discovery using web-based services
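As an illustration of querying through web services, the sketch below builds a data query URL. It assumes a service that follows the SDMX 2.1 RESTful URL pattern (data/{dataflow}/{key}, with dimension values separated by dots, "+" between alternative values, an empty position as wildcard, and startPeriod/endPeriod parameters). The base URL, the dataflow identifier DEMO_FLOW and the function name are hypothetical; no request is actually sent.

```python
from urllib.parse import urlencode


def build_data_query(base_url, dataflow, key_parts, start=None, end=None):
    """Build a data query URL; key_parts holds one list of codes per dimension."""
    # "+" joins alternative codes; an empty list leaves the position open.
    key = ".".join("+".join(values) for values in key_parts)
    url = f"{base_url.rstrip('/')}/data/{dataflow}/{key}"
    params = {
        name: value
        for name, value in (("startPeriod", start), ("endPeriod", end))
        if value
    }
    return f"{url}?{urlencode(params)}" if params else url


# Annual data for France or Italy, any value for the third dimension.
url = build_data_query(
    "https://example.org/sdmx/rest",  # placeholder endpoint
    "DEMO_FLOW",
    [["A"], ["FR", "IT"], []],
    start="2010",
    end="2012",
)
print(url)
```

Because the key is derived from the dimensions of the structure, the same function works for any dataflow once its dimension order is known.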
SDMX describes the data and metadata exchange
Organisation scheme
Concepts
Codelists
Concept Schemes
DSD
Provision Agreement
by end of June
maintainer
SDMX Registry
Data Repository (Warehousing) Architecture
NSI
Eurostat
Pull Requestor
eDAMIS
Data Input
SDMX Registry
Intermediate storage
Verification / Conversion to SDMX
Received data in SDMX-ML
Loader
register
Warehouse storage
Eurobase
query
Dissemination
XSL for SDMX-ML
PULL
PUSH
The SDMX Hub
Data Providing Organizations
Data warehouse
SDMX-RI (web service)
Data collector Organizations
Data Hub
Users
SDMX messages
SDMX progress, 2011 to 2015
Standards Development: April 2011, SDMX 2.1 Technical Standards released @ sdmx.org
May 2011: SDMX Global Conference in Washington, D.C.
Next: 11-13 September 2013 (OECD, Paris)
Self-learning tutorials comprising video, textbook and self-test.
Governance: Creation of two SDMX Working Groups (Technical Working Group and Statistical Working Group)
Action Plan 2011 to 2015
How to know more about SDMX
http://sdmx.org/
http://epp.eurostat.ec.europa.eu/portal/page/portal/pgp_ess/news/ess_news_detail?id=112774074&pg_id=2417&cc=ESTAT_EUROSTAT
Training courses on SDMX
SDMX basics (for statisticians and IT staff)
Held at Eurostat. Aimed at people in charge of managing SDMX-based transmission and dissemination of data and metadata.
SDMX advanced course (for IT developers)
Held at Eurostat. Targeted at IT developers and offered in two versions:
– Java programmers
– .NET programmers
ESTP course on “Advanced technologies for data collection and transmission”
External
For more information
http://www.sdmx.org (SDMX web site)
https://webgate.ec.europa.eu/fpfis/mwikis/sdmx (Eurostat Info Space)
[email protected] (General info on SDMX)
[email protected] (Eurostat implementation projects)