SDMX Reference Architecture v1


DIRECTORATE B: STATISTICAL METHODOLOGIES AND TOOLS

UNIT B-5: STATISTICAL INFORMATION TECHNOLOGIES

SDMX REFERENCE ARCHITECTURE

FOR NATIONAL STATISTICAL INSTITUTES

(VERSION 1.4)

December 2009


Revised and approved by Francesco Rizzo

Updated: 3 December 2009 SDMX Reference Architecture v1.24 Page: 2/20


Table of Contents

1. Introduction
1.1. Purpose
1.2. Scope
1.3. Structure
1.4. References
2. Architecture
2.1. Overall view
2.2. Responsibilities
2.3. Sequence within the NSI Web Service
2.4. Business Process Diagrams
2.4.1. Production to dissemination flow
2.4.2. Pull Data from Web Service flow
2.5. Mapping Store Model


1. Introduction

1.1. Purpose

The purpose of this document is to present the SDMX Architecture for national statistical institutes (NSIs) as well as the responsibilities of the distinct modules.

1.2. Scope

This document contains the description/specification of the "SDMX reference architecture" to be used partially or as a whole by NSIs willing to participate in Eurostat’s pull data transmission. This architecture is not a strict specification but rather a guide or "best practices" document that may be used by NSIs or other potential data providers.

1.3. Structure

The structure of this document is described below:

- In section 1, introductory material is included, as well as references.

- In section 2, the SDMX reference architecture is presented. Descriptions of the participating modules, as well as business process diagrams are used to clarify responsibilities and interactions between modules.

Modelling of the architecture has been performed using UML CASE tool EA (Sparx Systems Enterprise Architect Version 6.5).

1.4. References

[1] SDMX Technical Standards, version 2.0 (November 2005 - SDMX Standards Version 2.0)

[2] Unified Modelling Language (UML – http://www.uml.org/)

[3] Business Process Modeling Notation (BPMN – http://www.bpmn.org/)

2. Architecture

In this section, the SDMX Reference Architecture will be described using a component diagram, in order to show all concerned components and the interaction/dependencies between them. Moreover, "Business Process Diagrams" (BPDs) will be utilized to clarify the possible flows within this architecture.


2.1. Overall view

The following component diagram depicts the overall view of the NSI SDMX Reference Dissemination Architecture:

Two areas, bordered by dashed lines, can be identified in this diagram. The left-hand area concerns the Data Consumer (i.e. Eurostat) and contains the modules that interact with a Data Producer (i.e. an NSI) in order to "pull" SDMX data. The right-hand area concerns the Data Producer (i.e. the NSI), and more specifically its dissemination environment, whose responsibility is to provide data to potential Data Consumers. The modules participating in this architecture are detailed in the following section.

2.2. Responsibilities

The responsibilities of the components are documented below:

Dissemination Database (DD)

This is the data warehouse (or database) of the NSI dissemination environment, which each NSI maintains in order to store data ready for publication/dissemination to potential Data Consumers (e.g. Eurostat). In some cases, the DD may consist of files, e.g. PC-Axis files.


Interface: The dissemination database can be accessed using SQL statements. Thus the "Mapping Assistant", the "Data Retriever" and the "Data Loader" modules use SQL statements to query for data (MA & DR), or to insert/update data (DL).

Note: If the DD consists of files, the interface is specific to the file type and format.

Mapping Store (MS)

This component represents the storage that is responsible for keeping the mappings between the SDMX and the native format (a file or a DB schema). It is a database maintained by the "Mapping Assistant" in order to provide these mappings to the "Data Retriever" and "Data Loader" modules. More on the database model of the Mapping Store can be found in section 2.5.

Interface: The "Mapping Store" database can be accessed using SQL statements. Thus the "Mapping Assistant", the "Data Retriever" and the "Data Loader" modules use SQL statements to query for mappings (DL & DR), or to insert/update mappings (MA).

SDMX Structure File

This component represents the SDMX-ML Data Structure Definition required by the "Mapping Assistant" module in order to map its components (i.e. Dimensions, Attributes, Measures) to the "Dissemination Database" structure. It is an XML document that describes an SDMX-ML Data Structure Definition according to the SDMX v2.0 specification [1].

Mapping Assistant (MA)

This component is responsible for creating/maintaining the mappings between an SDMX Data Structure Definition (DSD) and a "Dissemination Database". It maps the database schema (or file structure) from the DD to the SDMX DSD ("SDMX Structure File" artefact). It is a stand-alone desktop application with a graphical user interface through which the user can create the mappings between a DD and a Data Structure Definition (for example, the user will be able to map a column of a database table to a dimension of a DSD).

Interface: The MA communicates with the "Mapping Store" and the "Dissemination Database". In both cases the MA uses SQL statements in order to access the databases.

Web Service Provider

This component is responsible for exposing the Dataset using a Web Service interface that provides SDMX-ML messages. It concerns the dynamic pull scenario. It wraps the data retrieval functionality in a web service and co-ordinates how the building blocks are used in order to produce the response. It is also responsible for reading all the parameters from the configuration files and passing them to the other building blocks.

Interface: This component exposes the underlying data retrieval functionality using a SOAP interface. More specifically, its input is a SOAP message including an SDMX-ML Query message and its output is a SOAP message including either an SDMX-ML message or a SOAP Fault in case of an error.

Internal Data Model

This component is used for representing information internally within the NSI Reference Architecture. This data model (the "sdmx data model") depends on the implementation platform as well as on the way an NSI needs to represent the information stored in Queries, the produced data message and its corresponding metadata.


Class diagram "screenshotDiagram"

QueryBean

Attributes:
    ~ dataWhereBean: DataWhereBean
    ~ metaDataWhereBean: MetaDataWhereBean

Operations:
    + getDataWhereBean() : DataWhereBean
    + getMetaDataWhereBean() : MetaDataWhereBean
    + print(QueryBean) : void
    + setDataWhereBean(DataWhereBean) : void
    + setMetaDataWhereBean(MetaDataWhereBean) : void

structure::KeyFamilyBean (shown with MaintainableArtefactBean as its parent)

Attributes:
    ~ attributes: List = new ArrayList()
    + DATASET: int = 4 {readOnly}
    ~ datasetAttributes: List = new ArrayList()
    ~ dimensions: List = new ArrayList()
    + GROUP: int = 1 {readOnly}
    ~ groupAttributes: List = new ArrayList()
    ~ groups: List = new ArrayList()
    + OBSERVATION: int = 3 {readOnly}
    ~ observationAttributes: List = new ArrayList()
    ~ primaryMeasure: PrimaryMeasureBean
    + SERIES: int = 2 {readOnly}
    ~ seriesAttributes: List = new ArrayList()
    ~ timeDimension: TimeDimensionBean
    ~ xsMeasures: List = new ArrayList()

Operations:
    + addAttribute(AttributeBean) : void
    + addCrossSectionalMeasure(CrossSectionalMeasureBean) : void
    - addDatasetAttribute(AttributeBean) : void
    + addDimension(DimensionBean) : void
    + addGroup(GroupBean) : void
    - addGroupAttribute(AttributeBean) : void
    - addObservationAttribute(AttributeBean) : void
    + addPrimaryMeasure(PrimaryMeasureBean) : void
    - addSeriesAttribute(AttributeBean) : void
    + addTimeDimension(TimeDimensionBean) : void
    + getAttributeNames(int) : List
    + getAttributes() : List
    + getComponents() : List
    + getCrossSectionalMeasure(String) : CrossSectionalMeasureBean
    + getCrossSectionalMeasureNames() : List
    + getCrossSectionalMeasures() : List
    + getDatasetAttributes() : List
    + getDimension(String) : DimensionBean
    + getDimensionNames() : List
    + getDimensions() : List
    + getGroupAttribute(String) : AttributeBean
    + getGroupAttributes() : List
    + getGroupNames() : List
    + getGroups() : List
    + getObservationAttribute(String) : AttributeBean
    + getObservationAttributes() : List
    + getPrimaryMeasure() : PrimaryMeasureBean
    + getSeriesAttribute(String) : AttributeBean
    + getSeriesAttributes() : List
    + getTimeDimension() : TimeDimensionBean
    + toString() : String

data::DataMessage (shown with MessageBean as its parent)

Attributes:
    - dataSet: DataSet
    - properties: Properties

Operations:
    + DataMessage()
    + DataMessage(Properties)
    + getDataSet() : DataSet
    + getProperties() : Properties
    + loadProperties() : void
    + setDataSet(DataSet) : void
    + setProperties(Properties) : void

The classes of the internal data model that are used to exchange data between the building blocks are the QueryBean, the DataMessage and the KeyFamilyBean (which corresponds to the DSD).

This is an important component of the architecture, since most of the participating modules depend upon its availability in order to have clear interfaces between them.
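As a rough illustration of how these three classes act as the contract between modules, the following Python sketch mirrors a small subset of the members shown in the class diagram. The Python names (`KeyFamily`, `Query`, `DataMessage`, `constraints`, `observations`) and the default component ids `TIME_PERIOD` and `OBS_VALUE` are assumptions made for brevity; the actual beans are platform-specific (e.g. Java or .NET) and carry many more members.

```python
from dataclasses import dataclass, field


@dataclass
class KeyFamily:
    """Simplified stand-in for KeyFamilyBean (the DSD)."""
    dimensions: list[str] = field(default_factory=list)
    attributes: list[str] = field(default_factory=list)
    time_dimension: str = "TIME_PERIOD"   # assumed id, for illustration only
    primary_measure: str = "OBS_VALUE"    # assumed id, for illustration only

    def components(self) -> list[str]:
        # Similar in spirit to getComponents(): every component of the DSD.
        return [*self.dimensions, self.time_dimension,
                self.primary_measure, *self.attributes]


@dataclass
class Query:
    """Simplified stand-in for QueryBean: dimension id -> requested code."""
    constraints: dict[str, str] = field(default_factory=dict)


@dataclass
class DataMessage:
    """Simplified stand-in for DataMessage: a list of observation records."""
    observations: list[dict[str, str]] = field(default_factory=list)


dsd = KeyFamily(dimensions=["FREQ", "REF_AREA"], attributes=["OBS_STATUS"])
query = Query(constraints={"FREQ": "A", "REF_AREA": "IT"})
message = DataMessage(observations=[
    {"FREQ": "A", "REF_AREA": "IT", "TIME_PERIOD": "2008", "OBS_VALUE": "1.5"},
])
```

Because every module only sees these shared types, a parser, a retriever and a generator can be developed and replaced independently of each other.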


SDMX Query Parser

This component is responsible for getting the request from the "Web Service Provider" and populating the internal data model (the sdmx data model). It is essentially an XML parser that reads SDMX-ML Query messages.

Interface: The input of the SDMX Query Parser is an SDMX-ML Query message (an XML document) together with the SDMX Query XML schema as defined at http://sdmx.org/docs/2_0/SDMXQuery.xsd. The output is a query represented in the internal data model as a QueryBean object.

Data Retriever (DR)

This component is responsible for querying the "Dissemination Database" and getting the respective data. In order to provide those data using the internal data model representation, the "Data Retriever" depends on the mappings of the "Mapping Store". More specifically the "Data Retriever" requires mappings in order to translate the SDMX-ML Query to the appropriate SQL query to the database.


Interface: Access to the dissemination database is required in order to retrieve data. Access to the Mapping Store database is also required in order to retrieve the mappings, as well as to get the connection string to the "Dissemination Database". The connection string to the "Mapping Store" database is provided to the "Data Retriever" as an input parameter. Another input of the "Data Retriever" is the SDMX-ML Query represented in the internal data model as a QueryBean object. The output of the "Data Retriever" is the SDMX-ML Dataset represented in the internal data model as a DataMessage object.
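To make the translation step more concrete, here is a minimal Python sketch of how a set of dimension constraints could be turned into a parameterised SQL query with the help of mappings. The dictionaries `COLUMN_FOR` and `LOCAL_CODE_FOR`, the table `gdp` and all codes are hypothetical; in the architecture described here the mappings would be read from the "Mapping Store", and an in-memory SQLite database merely stands in for the "Dissemination Database".

```python
import sqlite3

# Hypothetical mappings as they might be read from the Mapping Store:
# SDMX component id -> column of the dissemination table.
COLUMN_FOR = {"FREQ": "frequency", "REF_AREA": "country",
              "TIME_PERIOD": "year", "OBS_VALUE": "value"}
# Hypothetical transcoding: (component, SDMX code) -> local code in the DD.
LOCAL_CODE_FOR = {("REF_AREA", "IT"): "ITA", ("FREQ", "A"): "annual"}


def build_sql(table: str, constraints: dict[str, str]) -> tuple[str, list[str]]:
    """Translate dimension constraints into a parameterised SELECT."""
    select = ", ".join(f"{col} AS {comp}" for comp, col in COLUMN_FOR.items())
    where, params = [], []
    for comp, code in constraints.items():
        where.append(f"{COLUMN_FOR[comp]} = ?")
        # Fall back to the SDMX code when no transcoding is defined.
        params.append(LOCAL_CODE_FOR.get((comp, code), code))
    sql = f"SELECT {select} FROM {table}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql, params


# A throwaway in-memory table standing in for the Dissemination Database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE gdp (frequency TEXT, country TEXT, year TEXT, value TEXT)")
db.executemany("INSERT INTO gdp VALUES (?, ?, ?, ?)",
               [("annual", "ITA", "2008", "1.5"), ("annual", "GRC", "2008", "2.1")])

sql, params = build_sql("gdp", {"FREQ": "A", "REF_AREA": "IT"})
rows = db.execute(sql, params).fetchall()
```

Note that the rows returned here still carry the local codes; a complete retriever would apply the transcoding in the opposite direction before filling the DataMessage.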

Data Loader (DL)

This component is responsible for loading new data from the NSI's production environment/database to the dissemination environment/database and updating the module "RSS Generator". The latter can be achieved using the appropriate mappings, in order to create a new feed entry for the newly stored data. In case the dataset is stored in a file, the "Data Loader" module uses the appropriate mappings from the "Mapping Store" and provides the respective dataset to the "SDMX-ML Data Generator" module in order to produce an SDMX-ML Dataset file. Another responsibility of this component would be to register these new data to an SDMX Registry.

Interface: The input of this module is data prepared for dissemination from an NSI. The format of the data depends on the internal NSI process, as well as the source of the data within the production environment. The output of this module, apart from the population of the dissemination database, is to provide a dataset, represented in the internal data model, using the appropriate mappings found in the "Mapping Store". Thus, access to the "Mapping Store" is essential in this case. Finally, in case a static scenario is concerned (i.e. no Web Service infrastructure), the aforementioned dataset should be passed on to the "SDMX-ML Data Generator" as well.

SDMX-ML Data Generator

This component is responsible for generating an SDMX-ML Dataset XML document upon receiving a dataset represented in the internal data model (the sdmx data model).


Interface: The input of the "SDMX-ML Data Generator" is a dataset represented in the internal data model as a DataMessage object. Additionally, some metadata are required in order to produce the dataset properly, i.e. the Data Structure Definition (DSD). Finally, the SDMX-ML Dataset message format in which the dataset should be written must be provided as an input. The output is an SDMX-ML Dataset (XML document).

Note: The implementation of this input dataset as well as the metadata may vary depending on the technology (e.g. .NET vs Java).
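The following Python sketch illustrates the general idea of turning observation records into an XML document, grouping observations into series by their dimension values. The element names `DataSet`, `Series` and `Obs`, the function `write_dataset` and the sample data are placeholders chosen for this illustration; a real generator would emit namespaced elements that follow the SDMX-ML schemas [1] and the DSD in use.

```python
import xml.etree.ElementTree as ET


def write_dataset(observations: list[dict[str, str]], dimensions: list[str]) -> str:
    """Serialise observations, grouping them into series by dimension values."""
    root = ET.Element("DataSet")  # placeholder element names, not the real schema
    series_by_key: dict[tuple[str, ...], ET.Element] = {}
    for obs in observations:
        key = tuple(obs[d] for d in dimensions)
        series = series_by_key.get(key)
        if series is None:
            # First observation of this series: create the series element.
            series = ET.SubElement(root, "Series", dict(zip(dimensions, key)))
            series_by_key[key] = series
        ET.SubElement(series, "Obs", {"TIME_PERIOD": obs["TIME_PERIOD"],
                                      "OBS_VALUE": obs["OBS_VALUE"]})
    return ET.tostring(root, encoding="unicode")


xml_text = write_dataset(
    [{"FREQ": "A", "REF_AREA": "IT", "TIME_PERIOD": "2007", "OBS_VALUE": "1.2"},
     {"FREQ": "A", "REF_AREA": "IT", "TIME_PERIOD": "2008", "OBS_VALUE": "1.5"}],
    dimensions=["FREQ", "REF_AREA"],
)
```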

RSS Generator

This module is responsible for generating a feed entry on the event of new data arriving from the "Data Loader". Thus, it updates the "Feed" artefact in order to be accessed by the "Pull Requestor".

Interface: The RSS Generator requires info regarding the newly available data in order to build the proper feed entry. Its input is an SDMX-ML message represented in the internal sdmx model. Its output is a new feed entry in the feed describing the available data of the dissemination environment.
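As an illustration, a feed entry could be appended to an RSS 2.0 channel roughly as follows. This is only a sketch: the function `add_feed_entry`, the dataflow id `DEMO_FLOW` and the URL are hypothetical, the channel metadata is abbreviated, and the document does not prescribe which fields an entry must carry.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def add_feed_entry(channel: ET.Element, dataflow_id: str, url: str,
                   published: datetime) -> ET.Element:
    """Append one <item> announcing newly available data to an RSS channel."""
    item = ET.SubElement(channel, "item")
    ET.SubElement(item, "title").text = f"New data for {dataflow_id}"
    ET.SubElement(item, "link").text = url
    ET.SubElement(item, "guid").text = f"{dataflow_id}@{published.isoformat()}"
    # RSS 2.0 expects RFC 822 style dates in pubDate.
    ET.SubElement(item, "pubDate").text = format_datetime(published)
    return item


rss = ET.Element("rss", version="2.0")
channel = ET.SubElement(rss, "channel")
ET.SubElement(channel, "title").text = "NSI dissemination feed"
add_feed_entry(channel, "DEMO_FLOW", "https://nsi.example/data/DEMO_FLOW.xml",
               datetime(2009, 12, 3, 12, 0, tzinfo=timezone.utc))
feed_xml = ET.tostring(rss, encoding="unicode")
```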

SDMX-ML Dataset File

This is an artefact generated by the "SDMX-ML Data Generator". It resides at the URL included in a feed entry, where the "Pull Requestor" can access and retrieve it. It concerns the static pull scenario. It is an XML document formatted according to the SDMX-ML syntax.

Feed

This is an XML document that describes the available data of the dissemination environment. It is created by the "RSS Generator" and read by the "Pull Requestor".

Pull Requestor

This module is responsible for checking the generated web feeds and determining whether they contain data available for retrieval. If so, it is responsible for retrieving the data. This can be done in two ways: either by querying a Web Service (WS) where the data are produced dynamically, or by accessing the URL (included in the web feed entries) where the data reside.
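The feed check can be sketched in Python as follows, assuming an RSS-style feed whose entries carry a `link` and a `pubDate`. The feed content, the URLs and the function `new_entries` are invented for this example, and the actual retrieval (Web Service call or download of the URL) is deliberately left out.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Invented sample feed; a real Pull Requestor would download it from the NSI.
FEED = """<rss version="2.0"><channel>
  <item><link>https://nsi.example/data/OLD.xml</link>
        <pubDate>Tue, 01 Dec 2009 09:00:00 +0000</pubDate></item>
  <item><link>https://nsi.example/data/NEW.xml</link>
        <pubDate>Thu, 03 Dec 2009 12:00:00 +0000</pubDate></item>
</channel></rss>"""


def new_entries(feed_xml: str, last_checked: datetime) -> list[str]:
    """Return the links of feed entries published after the last check."""
    links = []
    for item in ET.fromstring(feed_xml).iter("item"):
        published = parsedate_to_datetime(item.findtext("pubDate"))
        if published > last_checked:
            links.append(item.findtext("link"))
    return links


to_fetch = new_entries(FEED, datetime(2009, 12, 2, tzinfo=timezone.utc))
```

Remembering the time of the last successful check keeps the requestor from retrieving the same dataset twice.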

2.3. Sequence within the NSI Web Service

The following diagram presents a simple sequence of events that take place when a Web Service call is initiated by a potential Web Service client. The parameters exchanged between the components (according to the inputs/outputs described above) can be seen in this diagram.


Sequence diagram "NSI WS Sequence"

Participants: WS Client, :WebServiceProvider, :QueryParser, :DataRetriever, :SDMXDataGenerator

Messages, in the order shown:

1. getCompactData(SDMX-ML Query)

2. QueryParser(SDMX-ML XSD)

3. parse(SDMX-ML Query), returning :Query Internal Model

4. retrieveData(Mapping Store Connection, Query), returning :DataMessage Internal Model

5. getDSD(), returning :DSD Internal Model

6. write(DataMessage, DSD, targetFormat, responseStream), returning :SDMX-ML Dataset

7. Reply to the WS Client: :SDMX-ML Message
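The order of calls in this sequence can be sketched in Python as below. All classes are stubs written for this illustration and return canned values; only the method names follow the diagram. Placing `get_dsd` on the retriever is an assumption, since the extracted diagram does not show which component answers that call.

```python
class QueryParser:
    def parse(self, query_xml: str) -> dict:
        # Stub: a real parser would validate against the SDMX Query schema.
        return {"raw": query_xml}


class DataRetriever:
    def retrieve_data(self, mapping_store_connection: str, query: dict) -> list:
        # Stub: a real retriever would consult the Mapping Store and run SQL.
        return [{"TIME_PERIOD": "2008", "OBS_VALUE": "1.5"}]

    def get_dsd(self) -> dict:
        # Assumption: the DSD is obtained from the retriever.
        return {"id": "DEMO_DSD"}


class DataGenerator:
    def write(self, data: list, dsd: dict, target_format: str, out: list) -> None:
        # Stub: writes a one-line summary instead of a full SDMX-ML message.
        out.append(f"<{target_format} dsd='{dsd['id']}' obs='{len(data)}'/>")


class WebServiceProvider:
    def __init__(self, mapping_store_connection: str):
        self.conn = mapping_store_connection
        self.parser = QueryParser()
        self.retriever = DataRetriever()
        self.generator = DataGenerator()

    def get_compact_data(self, query_xml: str) -> str:
        query = self.parser.parse(query_xml)                      # parse
        data = self.retriever.retrieve_data(self.conn, query)     # retrieveData
        dsd = self.retriever.get_dsd()                            # getDSD
        response: list = []
        self.generator.write(data, dsd, "CompactData", response)  # write
        return response[0]


reply = WebServiceProvider("mapping-store-dsn").get_compact_data("<Query/>")
```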

2.4. Business Process Diagrams

In this section, some of the possible processes that take place within this architecture are described. Not all processes are included here, since in some cases the activities concerned are too few to form a "Business Process Diagram" (BPD).

2.4.1. Production to dissemination flow

The following BPD describes the activities that take place when new data arrive from the NSI’s production environment to the dissemination environment.


The following steps/activities are described in the BPD above:

- New data arrive from the NSI’s production environment (event: "New Data Produced").

- Three parallel paths are followed on the above event:

o New data are loaded in the dissemination database using the appropriate mappings from the "Mapping Store".

o A new feed entry is created in order to announce the existence of new data in the dissemination database.

o If the NSI utilizes a Web Service (supports the dynamic pull scenario) nothing else is required. If not, an SDMX-ML Dataset is built and published in the dissemination site of the NSI, in order to be fetched using the static pull scenario.
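The three paths listed above can be sketched as a single Python event handler. This is a simplification under stated assumptions: plain lists stand in for the dissemination database, the feed and the dissemination site, the file naming is invented, and the paths run one after the other here although the BPD shows them in parallel.

```python
def on_new_data_produced(records, dissemination_db, feed, has_web_service,
                         published_files):
    """Handle the 'New Data Produced' event for one batch of records."""
    # Path 1: load the records into the dissemination database.
    dissemination_db.extend(records)
    # Path 2: announce the new data through a feed entry.
    feed.append(f"{len(records)} new observation(s) available")
    # Path 3: only the static pull scenario needs a published dataset file.
    if not has_web_service:
        published_files.append(f"dataset-{len(published_files) + 1}.xml")


db, feed, files = [], [], []
on_new_data_produced([{"OBS_VALUE": "1.5"}], db, feed,
                     has_web_service=True, published_files=files)
on_new_data_produced([{"OBS_VALUE": "2.1"}], db, feed,
                     has_web_service=False, published_files=files)
```

Only the second call publishes a file, because the first one assumes a Web Service is available to serve the data dynamically.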

2.4.2. Pull Data from Web Service flow

The following BPD describes the process of responding to a Web Service request for data.


On the event of a Web Service Client request (e.g. a call from the Pull Requestor) the following processes take place within the SDMX Reference Architecture:

- The incoming request (SDMX-ML Query) is parsed and stored in the internal data model (i.e. sdmx_model).

- Using the information stored in this model and the mappings in the "Mapping Store", this SDMX-ML Query is translated into an SQL Query.

- Using this SQL Query, data are retrieved from the dissemination environment database.

- The result of the SQL Query, namely the recordset, is transformed into an SDMX-ML Dataset using the "Mapping Store".

- This SDMX-ML Dataset is returned by the Web Service as the response to the calling client.


2.5. Mapping Store Model

The following diagram depicts the schema of the Mapping Store database. Details on this schema may be found in the "Analysis & Design" document of the Mapping Assistant tool.


Class diagram "Schema"

Notation: * marks a mandatory (not null) column, PK a primary key column, FK a foreign key column and pfK a column that is both primary and foreign key.

DB_CONNECTION
    Columns: *PK CONNECTION_ID: bigint; * DB_NAME: varchar(50); * DB_TYPE: varchar(50); * NAME: varchar(50); OWNER: varchar(50); DB_PASSWORD: varchar(50); DB_PORT: smallint; DB_SERVER: varchar(100); DB_USER: varchar(50); FK CAT_ID: bigint; ADO_CONNECTION_STRING: varchar(512); JDBC_CONNECTION_STRING: varchar(512)
    Constraints: FK_CAT_ID(bigint); PK_CONNECTION(bigint)

DATASET
    Columns: *PK DS_ID: bigint; * NAME: varchar(50); QUERY: text; FK CONNECTION_ID: bigint; DESCRIPTION: varchar(1024); CUSTOM_QUERY: int; XML_QUERY: text
    Constraints: CAT_CON_ID(bigint); FK_CONNECTION_ID(bigint); PK_TABLE(bigint)

DATASET_COLUMN
    Columns: *PK COL_ID: bigint; * NAME: varchar(50); DESCRIPTION: varchar(1024); *FK DS_ID: bigint
    Constraints: FK_TABLE_ID(bigint); PK_COLUMN(bigint); TABLE_ID(bigint)

LOCAL_CODE
    Columns: *pfK LCD_ID: bigint; *FK COLUMN_ID: bigint
    Constraints: FK_LC_ID(bigint); FK_LOCAL_CODE_COLUMN(bigint); LC_ID(bigint); PK_LOCAL_CODE(bigint)

LOCALISED_STRING
    Columns: * LS_ID: bigint; TYPE: varchar(10); FK ITEM_ID: bigint; FK ART_ID: bigint; * LANGUAGE: varchar(50); * TEXT: varchar(300)
    Constraints: FK_LOCALISED_STRING_ARTEFACT(bigint); FK_LOCALISED_STRING_ITEM(bigint)

ITEM
    Columns: *PK ITEM_ID: bigint; ID: varchar(50)
    Constraints: PK_ITEM(bigint)

CONCEPT
    Columns: *pfK CON_ID: bigint; *FK CON_SCH_ID: bigint
    Constraints: CON_ID(bigint); CON_SCH_ID(bigint); FK_CON_SCH_ID(bigint); FK_CONCEPT_ID(); PK_CONCEPT(bigint)

DSD_CODE
    Columns: *pfK LCD_ID: bigint; *FK CL_ID: bigint
    Constraints: CLS_ID(bigint); DC_ID(bigint); FK_CLS_ID(bigint); FK_DSD_CODE_ID(bigint); PK_DSD_CODE(bigint)

CATEGORY
    Columns: *pfK CAT_ID: bigint; *FK CAT_SCH_ID: bigint; FK PARENT_CAT_ID: bigint
    Constraints: CAT_ID(bigint); CAT_SCH_ID(bigint); FK_CATEGORY_CATEGORY(bigint); FK_CATEGORY_ID(bigint); FK_CS_ID(bigint); PK_CATEGORY(bigint)

CONCEPT_SCHEME
    Columns: *pfK CON_SCH_ID: bigint
    Constraints: CON_SCH_ART_ID(bigint); FK_CS_ID(bigint); PK_CONCEPT_SCHEME(bigint)

ARTEFACT
    Columns: *PK ART_ID: bigint; * ID: varchar(50); * VERSION: varchar(50); * AGENCY: varchar(50)
    Constraints: PK_ARTEFACT(bigint)

CODELIST
    Columns: *pfK CL_ID: bigint
    Constraints: CL_ID(bigint); FK_CL_ID(bigint); PK_CODELIST(bigint)

CATEGORY_SCHEME
    Columns: *pfK CAT_SCH_ID: bigint
    Constraints: CAT_SCH_ART_ID(bigint); FK_CAT_SCH_ID(bigint); PK_CATEGORY_SCHEME(bigint)

TRANSCODING_RULE
    Columns: *PK TR_RULE_ID: bigint; *FK TR_ID: bigint
    Constraints: PK_TRANSCODING_RULE(bigint); FK_TRANSCOD_RULE_TRANSCOD(bigint)

DATAFLOW
    Columns: *pfK DF_ID: bigint; *FK DSD_ID: bigint; FK MAP_SET_ID: bigint
    Constraints: DFD_DSD_ID(bigint); FK_DATAFLOW_ARTEFACT(bigint); FK_DATAFLOW_MAPPING_SET(bigint); FK_DSD_ID(bigint); PK_DATAFLOW(bigint)

COMPONENT
    Columns: *PK COMP_ID: bigint; * TYPE: varchar(50); *FK DSD_ID: bigint; *FK CON_ID: bigint; FK CL_ID: bigint; IS_FREQ_DIM: int; IS_MEASURE_DIM: int; ATT_ASS_LEVEL: varchar(11); ATT_STATUS: varchar(11); ATT_IS_TIME_FORMAT: int; XS_ATTLEVEL_DS: int; XS_ATTLEVEL_GROUP: int; XS_ATTLEVEL_SECTION: int; XS_ATTLEVEL_OBS: int; XS_MEASURE_CODE: varchar(50)
    Constraints: FK_COMPONENT_CODELIST(bigint); FK_COMPONENT_CONCEPT(bigint); FK_COMPONENT_DSD(bigint); PK_COMPONENT(bigint)

MAPPING_SET
    Columns: *PK MAP_SET_ID: bigint; * ID: varchar(50); DESCRIPTION: varchar(1024); DS_ID: bigint; DF_ID: bigint
    Constraints: PK_MappingSet(bigint); UQ_MAPPING_SET_CODE(varchar)

COMPONENT_MAPPING
    Columns: *PK MAP_ID: bigint; *FK MAP_SET_ID: bigint; * TYPE: varchar(10); CONSTANT: varchar(50)
    Constraints: FK_MAPPING_MappingSet(bigint); PK_MAPPING(bigint)

TRANSCODING
    Columns: *PK TR_ID: bigint; EXPRESSION: varchar(150); *FK MAP_ID: bigint
    Constraints: FK_TRANSCODING_MAPPING(bigint); PK_TRANSCODING(bigint)

DSD
    Columns: *pfK DSD_ID: bigint
    Constraints: DSD_ART_ID(bigint); FK_DSD_ID(bigint); PK_DSD(bigint)

DATAFLOW_CATEGORY
    Columns: *pfK DF_ID: bigint; *pfK CAT_ID: bigint
    Constraints: DF_ID(bigint); FK_DATAFLOW_CATEGORY_CATEGORY(bigint); FK_DF_ID(bigint); PK_DATAFLOW_CATEGORY(bigint, bigint)

CONNECTION_CATEGORY
    Columns: *pfK CONNECTION_ID: bigint; *pfK CAT_ID: bigint
    Constraints: CON_CAT_ID(bigint); FK_CONNECTION_ID(bigint); PK_CONNECTION_CATEGORY(bigint, bigint); FK_CONNECTION_CATEG_CATEG(bigint)

TRANSCODING_CODELIST
    Columns: *PK MAP_ID: bigint; *pfK TR_ID: bigint; *pfK CL_ID: bigint
    Constraints: PK_TRANSCODING_CODELIST(bigint, bigint, bigint); FK_TRANSCOD_CLIST_CLIST(bigint); FK_TRANSCOD_CLIST_TRANSCOD(bigint)

TRANSCODING_RULE_DSD_CODE
    Columns: *pfK TR_RULE_ID: bigint; *pfK CD_ID: bigint
    Constraints: PK_TRANSCODING_RULE_DSD_CODE(bigint, bigint); FK_TRCODRULE_DSDCODE_DSDCODE(bigint); FK_TRCODRULE_DSDCODE_TRCODRULE(bigint)

TRANSCODING_RULE_LOCAL_CODE
    Columns: *pfK LCD_ID: bigint; *pfK TR_RULE_ID: bigint
    Constraints: PK_TRANSCODING_RULE_LOCAL_CODE(bigint, bigint); FK_TRCODRULE_LOCCODE_LOCCODE(bigint); FK_TRCODRULE_LOCCODE_TRCODRULE(bigint)

COM_COL_MAPPING_COMPONENT
    Columns: *pfK MAP_ID: bigint; *pfK COMP_ID: bigint
    Constraints: PK_COM_COL_MAPPING_COMPONENT(bigint, bigint); FK_COM_COL_MAPP_COMP_COMP(bigint); FK_COMCOLMAPPCOMP_COMPCOLMAPP(bigint)

COM_COL_MAPPING_COLUMN
    Columns: *pfK MAP_ID: bigint; *pfK COL_ID: bigint
    Constraints: PK_COM_COL_MAPPING_COLUMN(bigint, bigint); FK_COM_COL_MAPP_COLUMN_COLUMN(bigint); FK_COMCOLMAPP_COL_COMPCOLMAPP(bigint)

SEQUENCE_ID
    Columns: * LAST_ID: bigint

Relationships (foreign key end and multiplicity, primary key end and multiplicity, join condition), as labelled in the diagram:

CLS_ID (0..*) to PK_CODELIST (1): CL_ID = CL_ID
FK_DATAFLOW_ARTEFACT (0..*) to PK_ARTEFACT (1): DF_ID = ART_ID
DFD_DSD_ID (0..*) to PK_DSD (1): DSD_ID = DSD_ID
DSD_ART_ID (0..*) to PK_ARTEFACT (1): DSD_ID = ART_ID
CAT_SCH_ART_ID (0..*) to PK_ARTEFACT (1): CAT_SCH_ID = ART_ID
CL_ID (0..*) to PK_ARTEFACT (1): CL_ID = ART_ID
CON_SCH_ART_ID (0..*) to PK_ARTEFACT (1): CON_SCH_ID = ART_ID
FK_CATEGORY_CATEGORY (0..*) to PK_CATEGORY (1): PARENT_CAT_ID = CAT_ID
CAT_CON_ID (0..*) to PK_CONNECTION (1): CONNECTION_ID = CONNECTION_ID
CAT_SCH_ID (0..*) to PK_CATEGORY_SCHEME (1): CAT_SCH_ID = CAT_SCH_ID
FK_COMPONENT_CONCEPT (0..*) to PK_CONCEPT (1): CON_ID = CON_ID
DC_ID (0..*) to PK_ITEM (1): LCD_ID = ITEM_ID
CON_SCH_ID (0..*) to PK_CONCEPT_SCHEME (1): CON_SCH_ID = CON_SCH_ID
CON_ID (0..*) to PK_ITEM (1): CON_ID = ITEM_ID
FK_LOCALISED_STRING_ARTEFACT (0..*) to PK_ARTEFACT (1): ART_ID = ART_ID
FK_LOCALISED_STRING_ITEM (0..*) to PK_ITEM (1): ITEM_ID = ITEM_ID
LC_ID (0..*) to PK_ITEM (1): LCD_ID = ITEM_ID
FK_LOCAL_CODE_COLUMN (0..*) to PK_COLUMN (1): COLUMN_ID = COL_ID
TABLE_ID (0..*) to PK_TABLE (1): DS_ID = DS_ID
CAT_ID (0..*) to PK_ITEM (1): CAT_ID = ITEM_ID
DF_ID (0..*) to PK_DATAFLOW (1): DF_ID = DF_ID
FK_COMCOLMAPP_COL_COMPCOLMAPP (0..*) to PK_MAPPING (1): MAP_ID = MAP_ID
FK_TRCODRULE_LOCCODE_TRCODRULE (0..*) to PK_TRANSCODING_RULE (1): TR_RULE_ID = TR_RULE_ID
FK_TRCODRULE_LOCCODE_LOCCODE (0..*) to PK_LOCAL_CODE (1): LCD_ID = LCD_ID
FK_TRCODRULE_DSDCODE_TRCODRULE (0..*) to PK_TRANSCODING_RULE (1): TR_RULE_ID = TR_RULE_ID
FK_TRCODRULE_DSDCODE_DSDCODE (0..*) to PK_DSD_CODE (1): CD_ID = LCD_ID
FK_TRANSCOD_CLIST_CLIST (0..*) to PK_CODELIST (1): CL_ID = CL_ID
FK_TRANSCOD_CLIST_TRANSCOD (0..*) to PK_TRANSCODING (1): TR_ID = TR_ID
FK_CONNECTION_CATEG_CATEG (0..*) to PK_CATEGORY (1): CAT_ID = CAT_ID
FK_DATAFLOW_MAPPING_SET (0..*) to PK_MappingSet (1): MAP_SET_ID = MAP_SET_ID
FK_DATAFLOW_CATEGORY_CATEGORY (0..*) to PK_CATEGORY (1): CAT_ID = CAT_ID
FK_COMPONENT_DSD (1) to PK_DSD (0..*): DSD_ID = DSD_ID
FK_TRANSCOD_RULE_TRANSCOD (0..*) to PK_TRANSCODING (1): TR_ID = TR_ID
FK_TRANSCODING_MAPPING (0..1) to PK_MAPPING (1): MAP_ID = MAP_ID
FK_MAPPING_MappingSet (0..*) to PK_MappingSet (1): MAP_SET_ID = MAP_SET_ID
FK_COM_COL_MAPP_COMP_COMP (0..*) to PK_COMPONENT (1): COMP_ID = COMP_ID
FK_COMCOLMAPPCOMP_COMPCOLMAPP (0..*) to PK_MAPPING (1): MAP_ID = MAP_ID
FK_COMPONENT_CODELIST (0..*) to PK_CODELIST (1): CL_ID = CL_ID
FK_COM_COL_MAPP_COLUMN_COLUMN (0..*) to PK_COLUMN (1): COL_ID = COL_ID
CON_CAT_ID (0..*) to PK_CONNECTION (1): CONNECTION_ID = CONNECTION_ID

Figure 1: Mapping Store schema


Figure 1 shows the complete schema of the mapping store. The tables are coloured according to the following groups:

Pink: tables storing SDMX entities

Yellow: tables storing dataset information

Blue: tables storing mapping information

Green: tables storing transcoding information

White: tables that are not used in the current version of the related application

As the mapping store schema is complicated, this section also contains figures describing its individual parts.

Class diagram "SDMX Model diagram"

LOCALISED_STRING
    Columns: * LS_ID: bigint; TYPE: varchar(10); FK ITEM_ID: bigint; FK ART_ID: bigint; * LANGUAGE: varchar(50); * TEXT: varchar(300)
    Foreign keys: FK_LOCALISED_STRING_ARTEFACT(bigint); FK_LOCALISED_STRING_ITEM(bigint)

ITEM
    Columns: *PK ITEM_ID: bigint; ID: varchar(50)
    Primary key: PK_ITEM(bigint)

CONCEPT
    Columns: *pfK CON_ID: bigint; *FK CON_SCH_ID: bigint
    Foreign keys: CON_ID(bigint); CON_SCH_ID(bigint); FK_CON_SCH_ID(bigint); FK_CONCEPT_ID()
    Primary key: PK_CONCEPT(bigint)

DSD_CODE
    Columns: *pfK LCD_ID: bigint; *FK CL_ID: bigint
    Foreign keys: CLS_ID(bigint); DC_ID(bigint); FK_CLS_ID(bigint); FK_DSD_CODE_ID(bigint)
    Primary key: PK_DSD_CODE(bigint)

CATEGORY
    Columns: *pfK CAT_ID: bigint; *FK CAT_SCH_ID: bigint; FK PARENT_CAT_ID: bigint
    Foreign keys: CAT_ID(bigint); CAT_SCH_ID(bigint); FK_CATEGORY_CATEGORY(bigint); FK_CATEGORY_ID(bigint); FK_CS_ID(bigint)
    Primary key: PK_CATEGORY(bigint)

CONCEPT_SCHEME
    Columns: *pfK CON_SCH_ID: bigint
    Foreign keys: CON_SCH_ART_ID(bigint); FK_CS_ID(bigint)
    Primary key: PK_CONCEPT_SCHEME(bigint)

ARTEFACT
    Columns: *PK ART_ID: bigint; * ID: varchar(50); * VERSION: varchar(50); * AGENCY: varchar(50)
    Primary key: PK_ARTEFACT(bigint)

CODELIST
    Columns: *pfK CL_ID: bigint
    Foreign keys: CL_ID(bigint); FK_CL_ID(bigint)
    Primary key: PK_CODELIST(bigint)

CATEGORY_SCHEME
    Columns: *pfK CAT_SCH_ID: bigint
    Foreign keys: CAT_SCH_ART_ID(bigint); FK_CAT_SCH_ID(bigint)
    Primary key: PK_CATEGORY_SCHEME(bigint)

DATAFLOW
    Columns: *pfK DF_ID: bigint; *FK DSD_ID: bigint; FK MAP_SET_ID: bigint
    Foreign keys: DFD_DSD_ID(bigint); FK_DATAFLOW_ARTEFACT(bigint); FK_DATAFLOW_MAPPING_SET(bigint); FK_DSD_ID(bigint)
    Primary key: PK_DATAFLOW(bigint)

COMPONENT
    Columns: *PK COMP_ID: bigint; * TYPE: varchar(50); *FK DSD_ID: bigint; *FK CON_ID: bigint; FK CL_ID: bigint; IS_FREQ_DIM: int; IS_MEASURE_DIM: int; ATT_ASS_LEVEL: varchar(11); ATT_STATUS: varchar(11); ATT_IS_TIME_FORMAT: int; XS_ATTLEVEL_DS: int; XS_ATTLEVEL_GROUP: int; XS_ATTLEVEL_SECTION: int; XS_ATTLEVEL_OBS: int; XS_MEASURE_CODE: varchar(50)
    Foreign keys: FK_COMPONENT_CODELIST(bigint); FK_COMPONENT_CONCEPT(bigint); FK_COMPONENT_DSD(bigint)
    Primary key: PK_COMPONENT(bigint)

DSD
    Columns: *pfK DSD_ID: bigint
    Foreign keys: DSD_ART_ID(bigint); FK_DSD_ID(bigint)
    Primary key: PK_DSD(bigint)

DATAFLOW_CATEGORY
    Columns: *pfK DF_ID: bigint; *pfK CAT_ID: bigint
    Foreign keys: DF_ID(bigint); FK_DATAFLOW_CATEGORY_CATEGORY(bigint); FK_DF_ID(bigint)
    Primary key: PK_DATAFLOW_CATEGORY(bigint, bigint)

Relationships (foreign key end and multiplicity, primary key end and multiplicity, join condition), as labelled in the diagram:

CL_ID (0..*) to PK_ARTEFACT (1): CL_ID = ART_ID
FK_LOCALISED_STRING_ARTEFACT (0..*) to PK_ARTEFACT (1): ART_ID = ART_ID
CON_ID (0..*) to PK_ITEM (1): CON_ID = ITEM_ID
CON_SCH_ID (0..*) to PK_CONCEPT_SCHEME (1): CON_SCH_ID = CON_SCH_ID
DC_ID (0..*) to PK_ITEM (1): LCD_ID = ITEM_ID
CLS_ID (0..*) to PK_CODELIST (1): CL_ID = CL_ID
CAT_SCH_ID (0..*) to PK_CATEGORY_SCHEME (1): CAT_SCH_ID = CAT_SCH_ID
CAT_ID (0..*) to PK_ITEM (1): CAT_ID = ITEM_ID
FK_LOCALISED_STRING_ITEM (0..*) to PK_ITEM (1): ITEM_ID = ITEM_ID
CON_SCH_ART_ID (0..*) to PK_ARTEFACT (1): CON_SCH_ID = ART_ID
FK_DATAFLOW_CATEGORY_CATEGORY (0..*) to PK_CATEGORY (1): CAT_ID = CAT_ID
CAT_SCH_ART_ID (0..*) to PK_ARTEFACT (1): CAT_SCH_ID = ART_ID
DFD_DSD_ID (0..*) to PK_DSD (1): DSD_ID = DSD_ID
FK_DATAFLOW_ARTEFACT (0..*) to PK_ARTEFACT (1): DF_ID = ART_ID
FK_COMPONENT_DSD (1) to PK_DSD (0..*): DSD_ID = DSD_ID
FK_COMPONENT_CONCEPT (0..*) to PK_CONCEPT (1): CON_ID = CON_ID
FK_COMPONENT_CODELIST (0..*) to PK_CODELIST (1): CL_ID = CL_ID
DSD_ART_ID (0..*) to PK_ARTEFACT (1): DSD_ID = ART_ID
DF_ID (0..*) to PK_DATAFLOW (1): DF_ID = DF_ID
FK_CATEGORY_CATEGORY (0..*) to PK_CATEGORY (1): PARENT_CAT_ID = CAT_ID

Figure 2: Mapping Store schema - SDMX model tables


Figure 2 shows only the tables storing information of SDMX entities. The mapping store adopts the concept of inheritance in a similar manner to the information model of the SDMX standard. The two tables from which other tables inherit are ARTEFACT and ITEM.
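The inheritance pattern can be illustrated with a small Python and SQLite sketch: the "child" table CODELIST carries no descriptive columns of its own, and its key doubles as a foreign key to ARTEFACT (CL_ID = ART_ID), so the identification of a codelist is obtained by a join. Table and column names follow the schema, while the SQLite column types and the sample rows are assumptions made for this example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE ARTEFACT (ART_ID INTEGER PRIMARY KEY, ID TEXT NOT NULL,
                       VERSION TEXT NOT NULL, AGENCY TEXT NOT NULL);
-- CODELIST "inherits" from ARTEFACT: its key is also a foreign key.
CREATE TABLE CODELIST (CL_ID INTEGER PRIMARY KEY REFERENCES ARTEFACT(ART_ID));
INSERT INTO ARTEFACT VALUES (1, 'CL_FREQ', '1.0', 'DEMO_AGENCY');
INSERT INTO ARTEFACT VALUES (2, 'DEMO_DSD', '1.0', 'DEMO_AGENCY');
INSERT INTO CODELIST VALUES (1);
""")

# Only artefacts that are codelists survive the join.
codelists = db.execute("""
    SELECT a.ID, a.VERSION, a.AGENCY
    FROM CODELIST c JOIN ARTEFACT a ON c.CL_ID = a.ART_ID
""").fetchall()
```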

Class diagram "Dataset diagram"

DATASET
    Columns: *PK DS_ID: bigint; * NAME: varchar(50); QUERY: text; FK CONNECTION_ID: bigint; DESCRIPTION: varchar(1024); CUSTOM_QUERY: int; XML_QUERY: text
    Foreign keys: CAT_CON_ID(bigint); FK_CONNECTION_ID(bigint)
    Primary key: PK_TABLE(bigint)

DATASET_COLUMN
    Columns: *PK COL_ID: bigint; * NAME: varchar(50); DESCRIPTION: varchar(1024); *FK DS_ID: bigint
    Foreign keys: FK_TABLE_ID(bigint); TABLE_ID(bigint)
    Primary key: PK_COLUMN(bigint)

LOCAL_CODE
    Columns: *pfK LCD_ID: bigint; *FK COLUMN_ID: bigint
    Foreign keys: FK_LC_ID(bigint); FK_LOCAL_CODE_COLUMN(bigint); LC_ID(bigint)
    Primary key: PK_LOCAL_CODE(bigint)

Relationships (foreign key end and multiplicity, primary key end and multiplicity, join condition), as labelled in the diagram:

FK_LOCAL_CODE_COLUMN (0..*) to PK_COLUMN (1): COLUMN_ID = COL_ID
TABLE_ID (0..*) to PK_TABLE (1): DS_ID = DS_ID

Figure 3: Mapping Store schema - Dataset tables

Figure 3 shows information concerning the Datasets. Each dataset (table DATASET) may have multiple columns (table DATASET_COLUMN). In case the data in a column are coded differently than the corresponding SDMX codelist, the codes used are stored in the table LOCAL_CODE.
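A possible way to list the local codes registered for the columns of a dataset is sketched below in Python with SQLite. The joins follow the relationships labelled in the schema (DS_ID = DS_ID, COLUMN_ID = COL_ID), and reading the code value from ITEM.ID through LCD_ID = ITEM_ID is an interpretation of the full schema in Figure 1. The dataset name, column names and codes are invented sample data, and only a subset of the columns is created.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE ITEM (ITEM_ID INTEGER PRIMARY KEY, ID TEXT);
CREATE TABLE DATASET (DS_ID INTEGER PRIMARY KEY, NAME TEXT NOT NULL);
CREATE TABLE DATASET_COLUMN (COL_ID INTEGER PRIMARY KEY, NAME TEXT NOT NULL,
                             DS_ID INTEGER NOT NULL REFERENCES DATASET(DS_ID));
CREATE TABLE LOCAL_CODE (LCD_ID INTEGER PRIMARY KEY REFERENCES ITEM(ITEM_ID),
                         COLUMN_ID INTEGER NOT NULL
                             REFERENCES DATASET_COLUMN(COL_ID));
INSERT INTO DATASET VALUES (1, 'gdp');
INSERT INTO DATASET_COLUMN VALUES (10, 'country', 1), (11, 'year', 1);
INSERT INTO ITEM VALUES (100, 'ITA'), (101, 'GRC');
INSERT INTO LOCAL_CODE VALUES (100, 10), (101, 10);
""")

# Walk DATASET -> DATASET_COLUMN -> LOCAL_CODE -> ITEM to list local codes.
local_codes = db.execute("""
    SELECT c.NAME, i.ID
    FROM DATASET d
    JOIN DATASET_COLUMN c ON c.DS_ID = d.DS_ID
    JOIN LOCAL_CODE lc ON lc.COLUMN_ID = c.COL_ID
    JOIN ITEM i ON i.ITEM_ID = lc.LCD_ID
    WHERE d.NAME = 'gdp'
    ORDER BY i.ID
""").fetchall()
```

The column `year` has no local codes in this sample, so it does not appear in the result.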


class Mapping Set diagram

The diagram groups its tables into two areas, labelled Mapping and Transcoding.

DATASET_COLUMN
- «column»: *PK COL_ID: bigint; * NAME: varchar(50); DESCRIPTION: varchar(1024); *FK DS_ID: bigint
- «FK»: FK_TABLE_ID(bigint); TABLE_ID(bigint)
- «PK»: PK_COLUMN(bigint)

COMPONENT
- «column»: *PK COMP_ID: bigint; * TYPE: varchar(50); *FK DSD_ID: bigint; *FK CON_ID: bigint; FK CL_ID: bigint; IS_FREQ_DIM: int; IS_MEASURE_DIM: int; ATT_ASS_LEVEL: varchar(11); ATT_STATUS: varchar(11); ATT_IS_TIME_FORMAT: int; XS_ATTLEVEL_DS: int; XS_ATTLEVEL_GROUP: int; XS_ATTLEVEL_SECTION: int; XS_ATTLEVEL_OBS: int; XS_MEASURE_CODE: varchar(50)
- «FK»: FK_COMPONENT_CODELIST(bigint); FK_COMPONENT_CONCEPT(bigint); FK_COMPONENT_DSD(bigint)
- «PK»: PK_COMPONENT(bigint)

MAPPING_SET
- «column»: *PK MAP_SET_ID: bigint; * ID: varchar(50); DESCRIPTION: varchar(1024); DS_ID: bigint; DF_ID: bigint
- «PK»: PK_MappingSet(bigint)
- «unique»: UQ_MAPPING_SET_CODE(varchar)

COMPONENT_MAPPING
- «column»: *PK MAP_ID: bigint; *FK MAP_SET_ID: bigint; * TYPE: varchar(10); CONSTANT: varchar(50)
- «FK»: FK_MAPPING_MappingSet(bigint)
- «PK»: PK_MAPPING(bigint)

COM_COL_MAPPING_COMPONENT
- «column»: *pfK MAP_ID: bigint; *pfK COMP_ID: bigint
- «PK»: PK_COM_COL_MAPPING_COMPONENT(bigint, bigint)
- «FK»: FK_COM_COL_MAPP_COMP_COMP(bigint); FK_COMCOLMAPPCOMP_COMPCOLMAPP(bigint)

COM_COL_MAPPING_COLUMN
- «column»: *pfK MAP_ID: bigint; *pfK COL_ID: bigint
- «PK»: PK_COM_COL_MAPPING_COLUMN(bigint, bigint)
- «FK»: FK_COM_COL_MAPP_COLUMN_COLUMN(bigint); FK_COMCOLMAPP_COL_COMPCOLMAPP(bigint)

LOCAL_CODE
- «column»: *pfK LCD_ID: bigint; *FK COLUMN_ID: bigint
- «FK»: FK_LC_ID(bigint); FK_LOCAL_CODE_COLUMN(bigint); LC_ID(bigint)
- «PK»: PK_LOCAL_CODE(bigint)

DSD_CODE
- «column»: *pfK LCD_ID: bigint; *FK CL_ID: bigint
- «FK»: CLS_ID(bigint); DC_ID(bigint); FK_CLS_ID(bigint); FK_DSD_CODE_ID(bigint)
- «PK»: PK_DSD_CODE(bigint)

TRANSCODING_RULE
- «column»: *PK TR_RULE_ID: bigint; *FK TR_ID: bigint
- «PK»: PK_TRANSCODING_RULE(bigint)
- «FK»: FK_TRANSCOD_RULE_TRANSCOD(bigint)

TRANSCODING
- «column»: *PK TR_ID: bigint; EXPRESSION: varchar(150); *FK MAP_ID: bigint
- «FK»: FK_TRANSCODING_MAPPING(bigint)
- «PK»: PK_TRANSCODING(bigint)

TRANSCODING_CODELIST
- «column»: *PK MAP_ID: bigint; *pfK TR_ID: bigint; *pfK CL_ID: bigint
- «PK»: PK_TRANSCODING_CODELIST(bigint, bigint, bigint)
- «FK»: FK_TRANSCOD_CLIST_CLIST(bigint); FK_TRANSCOD_CLIST_TRANSCOD(bigint)

TRANSCODING_RULE_DSD_CODE
- «column»: *pfK TR_RULE_ID: bigint; *pfK CD_ID: bigint
- «PK»: PK_TRANSCODING_RULE_DSD_CODE(bigint, bigint)
- «FK»: FK_TRCODRULE_DSDCODE_DSDCODE(bigint); FK_TRCODRULE_DSDCODE_TRCODRULE(bigint)

TRANSCODING_RULE_LOCAL_CODE
- «column»: *pfK LCD_ID: bigint; *pfK TR_RULE_ID: bigint
- «PK»: PK_TRANSCODING_RULE_LOCAL_CODE(bigint, bigint)
- «FK»: FK_TRCODRULE_LOCCODE_LOCCODE(bigint); FK_TRCODRULE_LOCCODE_TRCODRULE(bigint)

CODELIST
- «column»: *pfK CL_ID: bigint
- «FK»: CL_ID(bigint); FK_CL_ID(bigint)
- «PK»: PK_CODELIST(bigint)

Relationships
- FK_TRANSCOD_RULE_TRANSCOD 0..* (TR_ID = TR_ID) «FK» to PK_TRANSCODING 1
- FK_MAPPING_MappingSet 0..* (MAP_SET_ID = MAP_SET_ID) «FK» to PK_MappingSet 1
- FK_COMCOLMAPPCOMP_COMPCOLMAPP 0..* (MAP_ID = MAP_ID) «FK» to PK_MAPPING 1
- FK_COM_COL_MAPP_COMP_COMP 0..* (COMP_ID = COMP_ID) «FK» to PK_COMPONENT 1
- FK_COMCOLMAPP_COL_COMPCOLMAPP 0..* (MAP_ID = MAP_ID) «FK» to PK_MAPPING 1
- FK_COM_COL_MAPP_COLUMN_COLUMN 0..* (COL_ID = COL_ID) «FK» to PK_COLUMN 1
- FK_COMPONENT_CODELIST 0..* (CL_ID = CL_ID) «FK» to PK_CODELIST 1
- CLS_ID 0..* (CL_ID = CL_ID) «FK» to PK_CODELIST 1
- FK_TRCODRULE_LOCCODE_TRCODRULE 0..* (TR_RULE_ID = TR_RULE_ID) «FK» to PK_TRANSCODING_RULE 1
- FK_TRANSCODING_MAPPING 0..1 (MAP_ID = MAP_ID) «FK» to PK_MAPPING 1
- FK_TRANSCOD_CLIST_TRANSCOD 0..* (TR_ID = TR_ID) «FK» to PK_TRANSCODING 1
- FK_TRANSCOD_CLIST_CLIST 0..* (CL_ID = CL_ID) «FK» to PK_CODELIST 1
- FK_TRCODRULE_DSDCODE_DSDCODE 0..* (CD_ID = LCD_ID) «FK» to PK_DSD_CODE 1
- FK_TRCODRULE_DSDCODE_TRCODRULE 0..* (TR_RULE_ID = TR_RULE_ID) «FK» to PK_TRANSCODING_RULE 1
- FK_TRCODRULE_LOCCODE_LOCCODE 0..* (LCD_ID = LCD_ID) «FK» to PK_LOCAL_CODE 1
- FK_LOCAL_CODE_COLUMN 0..* (COLUMN_ID = COL_ID) «FK» to PK_COLUMN 1

Figure 4: Mapping Store schema - Mapping and Transcoding

Figure 4 shows the tables concerning mapping and transcoding. The top-level entity is the mapping set (table MAPPING_SET), which holds all the information required to map (and transcode, if needed) the local data to the SDMX entities found in a DSD. A mapping set can have multiple mappings (table COMPONENT_MAPPING). Each mapping is essentially a many-to-one, one-to-many or one-to-one correspondence between DSD components (table COMPONENT) and dataset columns (table DATASET_COLUMN).
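As an illustration of how the two junction tables express these correspondences, the following sketch counts the components and columns attached to each mapping. It is only an assumption-laden example: the rows stand in for the contents of COM_COL_MAPPING_COMPONENT and COM_COL_MAPPING_COLUMN, all ID values are invented, and the function names are hypothetical.

```python
from collections import defaultdict

# Rows of the two junction tables as (MAP_ID, COMP_ID) and (MAP_ID, COL_ID).
# All ID values are invented sample data.
com_col_mapping_component = [(1, 501), (2, 502), (2, 503), (3, 504)]
com_col_mapping_column = [(1, 10), (2, 11), (3, 12), (3, 13)]


def group_by_mapping(rows):
    """Group the second element of each row under its MAP_ID."""
    grouped = defaultdict(list)
    for map_id, other_id in rows:
        grouped[map_id].append(other_id)
    return grouped


def mapping_shapes(component_rows, column_rows):
    """Return {MAP_ID: (number of components, number of columns)}."""
    components = group_by_mapping(component_rows)
    columns = group_by_mapping(column_rows)
    shapes = {}
    for map_id in sorted(set(components) | set(columns)):
        shapes[map_id] = (len(components[map_id]), len(columns[map_id]))
    return shapes


shapes = mapping_shapes(com_col_mapping_component, com_col_mapping_column)
for map_id, (n_comp, n_col) in shapes.items():
    print(f"mapping {map_id}: {n_comp} component(s), {n_col} column(s)")
```

With this sample data, mapping 1 links one component to one column, mapping 2 links two components to one column, and mapping 3 links one component to two columns.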

If transcoding is required for a mapping, it is stored in the table TRANSCODING. Each transcoding has a set of transcoding rules (table TRANSCODING_RULE). Each transcoding rule is essentially a many-to-one, one-to-many or one-to-one correspondence between a DSD code (table DSD_CODE) of a codelist (table CODELIST) and a local code (table LOCAL_CODE) used in a dataset column.
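The effect of a set of transcoding rules can be sketched as a lookup from local codes to DSD codes. This is a simplified illustration rather than the behaviour of any actual module: the rules are given directly as pairs instead of being read from TRANSCODING_RULE_LOCAL_CODE and TRANSCODING_RULE_DSD_CODE, the code values are invented, and the sketch handles only one-to-one and many-to-one rules (several local codes mapped to the same DSD code), rejecting a local code that is mapped to two different DSD codes.

```python
def build_lookup(rules):
    """Build a local code -> DSD code dictionary from (local, dsd) pairs."""
    lookup = {}
    for local_code, dsd_code in rules:
        # A local code pointing at two different DSD codes cannot be
        # resolved by a plain lookup, so this sketch rejects it.
        if lookup.get(local_code, dsd_code) != dsd_code:
            raise ValueError(f"ambiguous rule for local code {local_code!r}")
        lookup[local_code] = dsd_code
    return lookup


def transcode(values, lookup):
    """Replace each local code with its DSD code; unknown codes raise KeyError."""
    return [lookup[value] for value in values]


# Invented sample rules: '9' and 'X' are both transcoded to 'U' (many-to-one).
rules = [("1", "M"), ("2", "F"), ("9", "U"), ("X", "U")]
lookup = build_lookup(rules)
result = transcode(["1", "2", "X", "9"], lookup)
print(result)
```

Raising an error for unknown or ambiguous codes is a design choice of this sketch; an implementation might instead skip such observations or report them separately.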
