2. An overview of SDMX (What is SDMX? Part I)
Edward Cook, Eurostat, Unit B5: “Central data and metadata services”
SDMX Basics course, 27-29 October 2015
Eurostat
What is SDMX?
• A model to describe statistical data and metadata, for which statisticians agree to use common descriptions and guidelines
• A standard for automated communication from machine to machine, driven by these common descriptors for all to reuse
• A technology supporting standardised IT tools, developed as wide-ranging open source software
Presentation of SDMX
• The SDMX Information model:
  What is the information model underlying the data and metadata exchange between the partners?
• Content-oriented guidelines:
  How to increase the interoperability and statistical harmonisation?
• IT Architecture for Data Exchange:
  How to exchange the data?
THE INFORMATION MODEL
The Information Model:
… is a representation of concepts, relationships, constraints, rules and operations.
… is a formal way to:
- express and design information needs
- communicate with IT people
- give specifications to reporting agents
- document the system
- drive the software
What things does SDMX need to model?
• Statistical data
  • Through descriptor concepts. These concepts can be further classified into dimensions, attributes and measures.
• Metadata
  • Structural metadata (such as concept names etc.)
  • Reference (or explanatory) metadata
• Data exchange processes
Modelling statistical data in SDMX
Key SDMX object concerning data:
Data Structure Definition (DSD)
• Identification of dimensions, attributes and measures
• Use of common codelists
• Integration into concept schemes
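To make the three component roles concrete, here is a minimal Python sketch of a DSD-like structure. It is not the SDMX information model itself: the class names (`Component`, `DataStructureDefinition`), the `validate` method, the identifier `DEMO_POP`, the shortened code sets and the observation value are all illustrative assumptions. It also simplifies by treating every attribute as optional.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    concept_id: str                   # concept taken from a concept scheme, e.g. "FREQ"
    role: str                         # "dimension", "attribute" or "measure"
    codes: frozenset = frozenset()    # allowed codes; empty means uncoded


@dataclass(frozen=True)
class DataStructureDefinition:
    dsd_id: str
    components: tuple

    def by_role(self, role):
        """Return the concept ids that play the given role, in DSD order."""
        return [c.concept_id for c in self.components if c.role == role]

    def validate(self, observation):
        """Return a list of problems; an empty list means the observation fits."""
        problems = []
        known = {c.concept_id for c in self.components}
        for name in observation:
            if name not in known:
                problems.append(f"unknown concept: {name}")
        for c in self.components:
            if c.concept_id not in observation:
                # Simplification (assumption): only attributes may be omitted.
                if c.role != "attribute":
                    problems.append(f"missing {c.role}: {c.concept_id}")
            elif c.codes and observation[c.concept_id] not in c.codes:
                problems.append(
                    f"invalid code for {c.concept_id}: {observation[c.concept_id]}"
                )
        return problems


# Hypothetical DSD for a small population data set (codes are shortened examples).
DEMO_DSD = DataStructureDefinition(
    dsd_id="DEMO_POP",
    components=(
        Component("FREQ", "dimension", frozenset({"A", "Q", "M"})),
        Component("REF_AREA", "dimension", frozenset({"DE", "FR", "IT"})),
        Component("SEX", "dimension", frozenset({"F", "M", "T"})),
        Component("TIME_PERIOD", "dimension"),
        Component("OBS_VALUE", "measure"),
        Component("OBS_STATUS", "attribute", frozenset({"A", "P", "E"})),
    ),
)

# Made-up observation, for illustration only.
observation = {"FREQ": "A", "REF_AREA": "DE", "SEX": "T",
               "TIME_PERIOD": "2014", "OBS_VALUE": 1000}

print(DEMO_DSD.by_role("dimension"))
print(DEMO_DSD.validate(observation))
print(DEMO_DSD.validate({**observation, "SEX": "X"}))
```

The dimensions identify an observation, the measure carries its value, and the attribute qualifies it. Sharing the code sets across DSDs is what makes data from different providers comparable.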
Modelling reference metadata in SDMX
So much descriptive information. It needs to be expressed in a common, standard way.
Quality descriptions
Process descriptions
Methodological descriptions
Administrative descriptions
The standard way is the Metadata Structure Definition (MSD)
A Metadata Structure Definition describes how metadata sets containing reference metadata are organised.
In particular, it defines:
- which metadata concepts are being exchanged;
- how these concepts relate to each other;
- how they are represented (free text or coded values);
- with which object types (agencies, data flows, data providers, subsets of data flows, or others) they are associated.
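The four things an MSD defines can be sketched in a few lines of Python. This is a simplified illustration, not the SDMX metadata model: the class names, the `check` method, the identifier `DEMO_MSD`, and the choice of concepts and codes are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataAttribute:
    concept_id: str                   # which metadata concept is exchanged
    parent: str = ""                  # relation to another concept ("" = top level)
    codes: frozenset = frozenset()    # coded representation; empty = free text


@dataclass(frozen=True)
class MetadataStructureDefinition:
    msd_id: str
    target_types: frozenset           # object types a report may be attached to
    attributes: tuple

    def children_of(self, concept_id):
        """Return the concepts nested directly under the given concept."""
        return [a.concept_id for a in self.attributes if a.parent == concept_id]

    def check(self, target_type, report):
        """Return a list of problems found in a metadata report."""
        problems = []
        if target_type not in self.target_types:
            problems.append(f"unsupported target type: {target_type}")
        declared = {a.concept_id: a for a in self.attributes}
        for concept_id, value in report.items():
            attribute = declared.get(concept_id)
            if attribute is None:
                problems.append(f"undeclared concept: {concept_id}")
            elif attribute.codes and value not in attribute.codes:
                problems.append(f"invalid code for {concept_id}: {value}")
        return problems


# Hypothetical MSD: concept names and codes are chosen for illustration only.
DEMO_MSD = MetadataStructureDefinition(
    msd_id="DEMO_MSD",
    target_types=frozenset({"dataflow", "data provider"}),
    attributes=(
        MetadataAttribute("CONTACT"),
        MetadataAttribute("CONTACT_ORGANISATION", parent="CONTACT"),
        MetadataAttribute("FREQ_DISS", codes=frozenset({"A", "Q", "M"})),
    ),
)

report = {"CONTACT_ORGANISATION": "Example statistical office", "FREQ_DISS": "Q"}
print(DEMO_MSD.children_of("CONTACT"))
print(DEMO_MSD.check("dataflow", report))
print(DEMO_MSD.check("agency", {**report, "FREQ_DISS": "weekly"}))
```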
Modelling reference metadata in SDMX
THE CONTENT-ORIENTED GUIDELINES
Content-oriented guidelines
• The content-oriented guidelines are a set of recommendations within the scope of the SDMX standard, intended to produce maximum interoperability.
• The SDMX standards:
- provide essential support to statisticians;
- maximise the amount of information passed through to users;
- allow automation of the process;
- allow web-service queries.
There are three main areas in the content-oriented guidelines:
1. Statistical subject-matter domains.
2. Cross-domain concepts (and code lists).
3. A Metadata Common Vocabulary.
Statistical subject-matter domains
• The statistical subject-matter domains form a high-level classification of statistical areas.
• They refer to statistical activities that have common characteristics with respect to variables, concepts and methodologies for data collection.
• Examples: price statistics, national accounts, environment statistics or education statistics.
• The classification is intended to cover the universe of official statistics.
Functions of the classification of statistical domains.
• A standard against which domain lists of national and international organisations can be mapped to facilitate the exchange of data and metadata.
• An identifier for registering and searching statistical data in SDMX registries.
• A navigation aid for the identification and organisation of corresponding domain groups.
Cross-domain concepts
• Cross-domain concepts are a list of statistical concepts related to statistical processes and data quality.
• The list is based on the concepts used by the contributing international organisations.
• The concepts can be used on the data side as well as on the metadata side.
Examples of cross-domain concepts
• A cross-domain concept may have a code list as its representation.
• This means that the concept might take a limited set of possible values enumerated in its corresponding code list.
• The code lists associated with cross-domain concepts are called cross-domain code lists.
• Code lists have a general description, a list of codes with their descriptions, and annotations that provide additional information on the codes.
• Examples of cross-domain concepts and their code lists:
  • FREQ and its associated code list CL_FREQ.
  • SEX and its associated code list CL_SEX.
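As an illustration of how a concept is tied to its code list, the sketch below models a code list with a description, codes, code descriptions and optional annotations. The classes `Code` and `CodeList`, the helper `is_valid`, and the annotation text are assumptions for this example. The codes shown are only a small subset chosen for illustration, so the officially published CL_FREQ and CL_SEX lists remain the reference.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Code:
    code_id: str
    description: str
    annotation: str = ""              # optional additional information


@dataclass
class CodeList:
    codelist_id: str
    description: str
    codes: dict = field(default_factory=dict)

    def add(self, code_id, description, annotation=""):
        """Register a code and return the list itself so calls can be chained."""
        self.codes[code_id] = Code(code_id, description, annotation)
        return self

    def __contains__(self, code_id):
        return code_id in self.codes

    def describe(self, code_id):
        return self.codes[code_id].description


# Illustrative subsets only, not the complete official lists.
cl_freq = (CodeList("CL_FREQ", "Frequency")
           .add("A", "Annual").add("Q", "Quarterly").add("M", "Monthly"))
cl_sex = (CodeList("CL_SEX", "Sex")
          .add("F", "Female").add("M", "Male")
          .add("T", "Total", annotation="Hypothetical note: sum of all categories"))

# Each cross-domain concept points at its cross-domain code list.
CONCEPT_CODELISTS = {"FREQ": cl_freq, "SEX": cl_sex}


def is_valid(concept_id, value):
    """Check that a value is one of the codes enumerated for the concept."""
    return value in CONCEPT_CODELISTS[concept_id]


print(is_valid("FREQ", "Q"), cl_freq.describe("Q"))
print(is_valid("SEX", "X"))
```

The same code "M" means "Monthly" in one list and "Male" in the other, which is why a value is only meaningful together with the concept it belongs to.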
Metadata Common Vocabulary
• The Metadata Common Vocabulary (MCV) is a vocabulary that recommends a common terminology to facilitate communication and understanding.
• The MCV is closely linked to the cross-domain concepts as it also contains all these concepts, stating their definitions and context descriptions.
Metadata Common Vocabulary
• The MCV covers a selected range of metadata concepts:
  • General metadata concepts.
  • Metadata terms describing statistical methodologies and data quality.
  • Terms referring specifically to data and metadata exchange.
Examples of Metadata Common Vocabulary
IT Architecture for data exchange
• Standard formats for the exchange of data and metadata:
  • SDMX-EDI
  • SDMX-ML
• Architectures for data exchange:
  • Push
  • Pull
  • Data hub
• SDMX tools
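The three exchange architectures differ mainly in who initiates the transfer and where the data are kept. The toy Python sketch below illustrates that difference under a common reading of the terms: in push mode the provider sends data to the collector, in pull mode the collector fetches data that the provider exposes, and a data hub forwards each user query to the providers without storing the data itself. The classes, method names and sample values are hypothetical and do not correspond to any SDMX tool or API.

```python
class Provider:
    """A data provider holding observations per dataflow."""

    def __init__(self, name, data):
        self.name = name
        self._data = dict(data)       # dataflow id -> list of observations

    def query(self, dataflow):
        """Entry point used in pull and hub modes: answer one request."""
        return list(self._data.get(dataflow, []))

    def push_to(self, collector, dataflow):
        """Push mode: the provider initiates the transfer."""
        collector.receive(self.name, dataflow, self.query(dataflow))


class Collector:
    """A collecting organisation that stores what it gets."""

    def __init__(self):
        self.store = {}

    def receive(self, provider_name, dataflow, observations):
        self.store[(provider_name, dataflow)] = observations

    def pull_from(self, provider, dataflow):
        """Pull mode: the collector initiates the transfer."""
        self.receive(provider.name, dataflow, provider.query(dataflow))


class DataHub:
    """Hub mode: keeps no data, forwards each user query to the providers."""

    def __init__(self, providers):
        self.providers = list(providers)

    def query(self, dataflow):
        return {p.name: p.query(dataflow) for p in self.providers}


# Made-up providers and values, for illustration only.
office_a = Provider("OFFICE_A", {"DEMO_POP": [10, 11]})
office_b = Provider("OFFICE_B", {"DEMO_POP": [20]})

collector = Collector()
office_a.push_to(collector, "DEMO_POP")      # push
collector.pull_from(office_b, "DEMO_POP")    # pull
hub = DataHub([office_a, office_b])          # hub

print(collector.store)
print(hub.query("DEMO_POP"))
```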
Push mode
Pull mode
Data Hub
SDMX tools
• Eurostat tools at our SDMX Info Space: https://webgate.ec.europa.eu/fpfis/mwikis/sdmx/index.php/Main_Page
  • SDMX Registry (a central repository for storing and sharing SDMX artefacts).
  • SDMX Data Structure Wizard (used to create, edit and test SDMX artefacts).
  • SDMX Converter (converts data files between SDMX formats and other file formats).
  • SDMX Reference Infrastructure (SDMX-RI) (a set of tools for connecting your IT systems to the SDMX world).
  • SDMX Mapping Assistant (mapping and transcoding of the contents of an existing database to SDMX data structures).
• Other tools available in the community (www.sdmxtools.org)
Questions?