ec.europa.eu viewUse of SDMX to improve the Implementation of the National Strategy for the Development of Statistics (NSDS) in Sudan. By: Nuha. Mohamed . Elamin. Ahmed

Use of SDMX to improve the Implementation of the National Strategy for the Development of Statistics (NSDS) in Sudan

By:

Nuha Mohamed Elamin Ahmed

Senior Statistics Specialist

Director of Sectoral Statistics

Central Bureau of Statistics

Sudan

User perspective: what difference is SDMX making?

Use of SDMX to improve the Implementation of the National Strategy for the Development of Statistics (NSDS) in Sudan

Abstract:

Introduction to SDMX:

The SDMX initiative sets standards to facilitate the exchange of statistical data and metadata using modern information technology. Several versions of the technical specifications have been released since 2004. SDMX has also been published as an ISO International Standard (IS 17369).

In addition to these different versions, related technical specifications of a Validation and Transformation Language (VTL) have been released to implement a specific section of the SDMX Information Model.

National Strategy for the Development of Statistics (NSDS):

The NSDS has been developed as a framework for strengthening statistical capacity uniformly across the entire National Statistical System (NSS) in Sudan such that each of the sub-systems will be empowered to manage results and outcomes of development. It also will serve as an integrated framework within which sub-systems and different stakeholders generate, disseminate and use statistics that are credible and provide a sound basis for national planning and development by:

- Strengthening statistical production consistent with the Fundamental Principles of Official Statistics and based on International best practices;

- Improving coordination and promoting integration and collaboration among and between producers and users;

- Strengthening national capacity for producing and using statistics; and

- Ensuring long-term sustainability of the NSS through provision of adequate resources.

The NSDS is implemented under the guidance and supervision of the Central Bureau of Statistics (CBS) in Sudan and includes all the governmental and non-governmental agencies that use statistics.

Use of SDMX:

SDMX will support statistics as collected and used by governmental and international statistical organizations, and this model is also applicable to other organizational contexts involving statistical data and related metadata.

Many IT tools have been developed to support the use and implementation of SDMX. Most of these tools are of an open source nature, so that they can be used as components for building IT systems in statistical organizations. Examples of such tools are the SDMX Registry, the Data Structure Wizard (DSW) and the SDMX Reference Infrastructure (SDMX-RI).

The work processes of SDMX are fully transparent: public consultations are conducted when major revisions are envisaged.

Advantages of SDMX:

- Facilitate data and metadata exchange.
- Make efficient use of technologies and standards.
- Reduce reporting burden.
- Enhance availability of statistical data and metadata for the users.
- Data reporting = data dissemination = data sharing (data is reported only once and then shared widely using modern technologies).

Introduction:

SDMX, which stands for Statistical Data and Metadata eXchange, is an international initiative that aims at standardising and modernising (“industrialising”) the mechanisms and processes for the exchange of statistical data and metadata among international organisations and their member countries.

SDMX is sponsored by seven international organisations: the Bank for International Settlements (BIS), the European Central Bank (ECB), Eurostat (Statistical Office of the European Union), the International Monetary Fund (IMF), the Organisation for Economic Co-operation and Development (OECD), the United Nations Statistics Division (UNSD), and the World Bank.

These organisations are the main players at world and regional levels in the collection of official statistics in a large variety of domains (agriculture statistics, economic and financial statistics, social statistics, environment statistics etc.).

What is SDMX?

SDMX, which stands for Statistical Data and Metadata eXchange, is an ISO standard designed to describe statistical data and metadata, normalise their exchange, and improve their efficient sharing across statistical and similar organisations. It provides an integrated approach to facilitating statistical data and metadata exchange, enabling interoperable implementations within and between systems concerned with the exchange, reporting and dissemination of statistical data and their related meta-information.

It consists of:

- Technical standards (including the Information Model)
- Statistical guidelines
- An IT architecture and tools

But SDMX is not just a format for data exchange. Taken together, the technical standards, the statistical guidelines and the IT architecture and tools can support improved business processes for any statistical organisation as well as the harmonisation and standardisation of statistical metadata.

The first version of the SDMX technical standard (1.0) was finalised in 2004 and approved in 2005 by the International Organization for Standardization (ISO) as a Technical Specification (ISO/TS 17369: 2005 SDMX). Version 2.0 was approved in November 2005. Version 2.1 was issued in May 2011. In 2013, SDMX was published by the International Organization for Standardization (ISO) as International Standard (IS) 17369.

The Information Model which forms the core of SDMX has been developed to support statistics as collected and used by governmental and international statistical organisations, and this model is also applicable to other organisational contexts involving statistical data and related metadata.

The statistical guidelines aim at providing general statistical governance as well as common (“cross-domain”) concepts and code lists, a common classification of statistical domains and a common terminology. The first set of guidelines was published in January 2009.

Many IT tools have been developed to support the use and implementation of SDMX. Most of these tools are of an open source nature, so that they can be used as components for building IT systems in statistical organisations. Examples of such tools are the SDMX Registry, the Data Structure Wizard (DSW) and the SDMX Reference Infrastructure (SDMX-RI).

The work processes of SDMX are fully transparent: public consultations are conducted when major revisions are envisaged.

The first global SDMX data exchanges were implemented in 2013 by the seven sponsor organisations and covered National Accounts, Balance of Payments and Foreign Direct Investment. Many additional data exchanges are presently under development, both between international organisations and between international organisations and their constituencies.

SDMX Tutorials and related Material

The Business Case for SDMX

Experience has shown that it is essential that senior management drives SDMX. Consequently, special care should be taken to properly describe and explain the business case for SDMX, and focus should be placed on the main advantages of using SDMX, i.e.:

- facilitate data and metadata exchange
- make efficient use of technologies and standards
- reduce reporting burden
- enhance availability of statistical data and metadata for the users
- data reporting = data dissemination = data sharing (data is reported only once and then shared widely using modern technologies)

The business case for SDMX is also explained in detail in a document called “SDMX Starter Kit for National Statistical Agencies”. This document focuses on non-IT issues, in particular the linkage of SDMX implementation to broader corporate initiatives that are usually embedded in strategic plans, statistical master plans, etc. Experience has shown that such a linkage is essential if SDMX implementation is to gain the interest of senior management in statistical agencies. This is why prominence has been attached to developing a clear business case for SDMX.

     

SDMX in a Nutshell

The “SDMX in a Nutshell” poster provides a general picture of the SDMX initiative.

This poster presents, in a synthesised way, two types of information:

- The global framework in which SDMX operates. The document shows the interactions between SDMX and the three major statistical frameworks developed by the United Nations, namely the Generic Statistical Business Process Model (GSBPM), the Common Statistical Production Architecture (CSPA) and the General Statistical Information Model (GSIM).

- The general structure and organisation of SDMX. The SDMX governance structure has three levels: Sponsors Committee, SDMX Secretariat and SDMX working groups. The SDMX Sponsor Organisations are represented by the head of their statistical function in the SDMX Sponsors Committee, which is the ultimate decision-making body. The SDMX Secretariat, which comprises senior experts from the sponsoring organisations, provides executive support to the SDMX Sponsors Committee and is the interface between the sponsors and the working groups. The SDMX Technical Working Group (TWG) and the Statistical Working Group (SWG) report to the SDMX Secretariat. They maintain, improve or further develop the SDMX technical and statistical standards. The work processes of SDMX are fully transparent: public consultations are conducted when major revisions are envisaged.

SDMX Process Patterns:

SDMX identifies three basic process patterns regarding the exchange of statistical data and metadata. These can be described as follows:

1. Bilateral exchange: All aspects of the exchange process are agreed between counterparties, including the mechanism for exchange of data and metadata, the formats, the frequency or schedule, and the mode used for communications regarding the exchange. This is perhaps the most common process pattern.

2. Gateway exchange: Gateway exchanges are an organized set of bilateral exchanges, in which several data and metadata collecting organizations or individuals agree to exchange the collected information with each other in a single, known format, and according to a single, known process. This pattern has the effect of reducing the burden of managing multiple bilateral exchanges (in data and metadata collection) across the sharing organizations/individuals. This is also a very common process pattern in the statistical area, where communities of institutions agree on ways to gain efficiencies within the scope of their collective responsibilities.

3. Data-sharing exchange: Open, freely available data formats and process patterns are known and standard. Thus, any organization or individual can use any counterparty’s data and metadata (assuming they are permitted access to it). This model requires no bilateral agreement, but only requires that data and metadata providers and consumers adhere to the standards.

This document specifies the SDMX standards designed to facilitate exchanges based on any of these process patterns, and shows how SDMX offers advantages in all cases. It is possible to agree bilaterally to use a standard format (such as SDMX-EDI or SDMX-ML); it is possible for data senders in a gateway process to use a standard format for data exchange with each other, or with any data providers who agree to do so; it is possible to agree to use the full set of SDMX standards to support a common data-sharing process of exchange, whether based on an SDMX-conformant registry or some other architecture.

The standards specified here specifically support a data-sharing process based on the use of central registry services. Registry services provide visibility into the data and metadata existing within the community, and support the access and use of this data and metadata by providing a set of triggers for automated processing. The data or metadata itself is not stored in a central registry. These services merely provide a useful set of metadata about the data (and additional metadata) in a known location, so that users/applications can easily locate and obtain whatever data and/or metadata is registered. The use of standards for all data, metadata, and the registry services themselves is ubiquitous, permitting a high level of automation within a data-sharing community.
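The registry idea described above can be illustrated with a small sketch. It is not the SDMX Registry interface: the class names, the identifiers and the example.org URLs are all hypothetical. It only shows the two properties the text mentions, namely that the registry holds the location of the data rather than the data itself, and that a new registration can trigger automated processing for subscribers.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Registration:
    """Metadata about one data set: where it lives, not the data itself."""
    provider: str   # hypothetical provider identifier
    dataflow: str   # hypothetical dataflow identifier
    url: str        # location from which the data can be fetched


@dataclass
class Registry:
    """Toy central registry: stores registrations and notifies subscribers."""
    registrations: list = field(default_factory=list)
    subscribers: dict = field(default_factory=dict)  # dataflow -> callbacks

    def subscribe(self, dataflow, callback):
        self.subscribers.setdefault(dataflow, []).append(callback)

    def register(self, registration):
        self.registrations.append(registration)
        # Trigger automated processing for anyone watching this dataflow.
        for callback in self.subscribers.get(registration.dataflow, []):
            callback(registration)

    def find(self, dataflow):
        """Return every registration recorded for the given dataflow."""
        return [r for r in self.registrations if r.dataflow == dataflow]


registry = Registry()
notified = []
registry.subscribe("NATIONAL_ACCOUNTS", notified.append)
registry.register(
    Registration("CBS_SD", "NATIONAL_ACCOUNTS", "https://example.org/na.xml")
)
registry.register(
    Registration("CBS_SD", "TRADE", "https://example.org/trade.xml")
)
locations = [r.url for r in registry.find("NATIONAL_ACCOUNTS")]
```

A consumer would query the registry for a dataflow and then fetch the data from the returned locations, so the data stays with its provider.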

It should be pointed out that these different process models are not mutually exclusive: a single system capable of expressing data and metadata in SDMX-conformant formats could support all three scenarios. Different standards may be applicable to different processes (for example, many registry services interfaces are used only in a data-sharing scenario) but all have a common basis in a shared information model.

In addition to looking at collection and reporting, it is also important to consider the dissemination of data. Data and metadata, no matter how they are exchanged between counterparties in the process of their development and creation, are all eventually supplied to an end user of some type. Often, this is through specific applications inside of institutions. But more and more frequently, data and metadata are also published on websites in various formats. The dissemination of data and its accompanying metadata on the web is a focus of the SDMX standards. Standards for statistical data and metadata allow improvements in the publication of data: it becomes easier to process a standard format once the data is obtained, and the data and metadata are linked together, making the comprehension and further processing of the data easier.

In discussions of statistical data, there are many aspects of its dissemination which impact data quality: data discovery, ease of use, and timeliness. SDMX standards provide support for all of these aspects of data dissemination. Standard data formats promote ease of use, and provide links to relevant metadata. The concept of registry services means that data and metadata can more easily be discovered. Timeliness is improved throughout the data lifecycle by increases in efficiency, promoted through the availability of metadata and ease of use.

It is important to note that SDMX is primarily focused on the exchange and dissemination of statistical data and metadata. However, there may also be many uses for the standard model and formats specified here in the internal processing of data, which is not concerned with exchange between organizations and users. It is felt that a clear, standard formatting of data and metadata for the purposes of exchange and dissemination can also facilitate internal processing by organizations and users, but this is not the focus of the specification.

 National Strategy for the Development of Statistics (NSDS) in Sudan:

The NSDS has been developed as a framework for strengthening statistical capacity uniformly across the entire NSS such that each of the sub-systems will be empowered to manage results and outcomes of development. It also will serve as an integrated framework within which sub-systems and different stakeholders generate, disseminate and use statistics that are credible and provide a sound basis for national planning and development by:

- Strengthening statistical production consistent with the Fundamental Principles of Official Statistics and based on International best practices;

- Improving coordination and promoting integration and collaboration among and between producers and users;

- Strengthening national capacity for producing and using statistics; and

- Ensuring long-term sustainability of the NSS through provision of adequate resources.

A strategic approach has been adopted to make certain that all stakeholders are involved, following a participatory, consultative and all-inclusive process.

The Central Bureau of Statistics (CBS) of Sudan, with the support of UNDP Sudan (and additional inputs from one or two development partners), coordinated the design process across the National Statistical System (NSS). Sector and state strategies are component building blocks for the National Strategy. Because of their large number, the sector/state strategies are therefore being developed in phases.

The Process:

The preparation of the NSDS and its building blocks (sector/state strategies) is an important step towards the modernization of the CBS and the components of the NSS, and this process is as important as the strategy documents themselves. Hence it was imperative to establish a structure for the process, which also serves as a platform for strengthening and streamlining the NSS and its component sub-systems. Considerable time has therefore been taken to work out this structure, which now remains as a permanent one for managing and operating the NSS.

There were ten steps taken during the preparation of the strategies, namely:

1. Sensitization of key stakeholders;
2. Launching of the NSDS concept;
3. Preparation of the roadmap;
4. Visits to the stakeholders (MDAs, States etc.);
5. Establishing sector/state statistics committees;
6. Establishing the National Consultative Committee on Statistics (NCCS), an inter-agency committee;
7. Technical empowerment of the committees through technical workshops along with technical back-stopping and monitoring of the MDA/State committees;
8. Assessment of the current status of statistics;
9. Drafting the strategy documents; and
10. Stakeholder approval and finalization of the strategy documents.

Status of the NSS:

Composition: Conceptually, the NSS is made up of groups of data producers, users and suppliers/providers of basic information. It also includes Statistical Training Institutions/Centers and Research Institutions, Non-Governmental Organizations (NGOs), Civil Society Organizations (CSOs), the Media, and the Development Partners including Donors and Funders. The Central Bureau of Statistics (CBS) is at the centre, coordinating and facilitating the other members and all operations under one single legal framework. These components are expected to work as a team under a coordination arrangement. Some key national agencies (stakeholders) have been identified, along with some of their roles. These include the CBS (Coordinator), the Micro Economic Directorate of the Ministry of Finance and National Economy (MoFNE), the Poverty Unit (MoFNE), the National Population Council/General Secretariat, and the National Strategic Planning Council.

Desirable Characteristics of a Good NSS: The components of the NSS should have relatively uniform capacity for generation and use of statistics. The NSS should also operate with objectivity and impartiality and must produce relevant and timely statistics of high quality. All members should be well coordinated, working as a team and very much aware of their respective roles. The regulatory framework for statistics should similarly be adequate. Some of these characteristics must be present in an NSS, and the current ineffectiveness of the system is a reason for the inadequacy of statistical information.

Use of SDMX to improve the Statistical Work:

1. The statistical production process:

The ultimate aim of any official statistical production process is to compile correct and meaningful numeric information to be used by administrations, policy makers, researchers and the general public. At the national level there are usually several organisations concerned: national statistical institutes (NSIs), central banks (CBs) and possibly specialised administrations, e.g. for labour statistics. At the international level there are many organisations involved, not least the seven SDMX sponsors.

1.1. Stove pipes:

The compilation of statistics for a specific domain is usually considered, by those who do it, a highly specialised activity, requiring IT systems and processes customised to this set of statistical data. As a result the statistical production process in many organisations is separated according to the statistical domain, with each one working within its own “stove pipe”. This has often led to a lack of harmonisation within and also across organisations with respect to how data is organised, what metadata is provided with it to make it meaningful and, in particular, how it is exchanged between “stove pipes” or between organisations. Even within a statistical organisation it can be difficult to share, for example, IT applications across different subject matter domains, thus creating the potential for inefficiency and duplication of effort. For the end users this means that it is often difficult to use statistical information on different subjects or from different providers in an efficient way.

1.2. Statistical production process: a generalisation

The Joint UNECE / Eurostat / OECD Work Sessions on Statistical Metadata (METIS) have over the past years worked on a Generic Statistical Business Process Model (GSBPM) and the possible relationship with SDMX has been recognised: “The GSBPM should therefore be seen as a flexible tool to describe and define the set of business processes needed to produce official statistics. The use of this model can also be envisaged in other separate, but often related contexts such as harmonizing statistical computing infrastructures, facilitating the sharing of software components, in the Statistical Data and Metadata eXchange (SDMX) User Guide explaining the use of SDMX in a statistical organisation, and providing a framework for process quality assessment.”

Level 1 of the GSBPM has the following phases:

• Specify needs
• Design
• Build
• Collect
• Process
• Analyse
• Disseminate
• Archive
• Evaluate

It also contains eight over-arching statistical processes:

• Quality management
• Metadata management
• Statistical framework management
• Statistical programme management
• Knowledge management
• Data management
• Provider management
• Customer management

After briefly presenting the SDMX Framework, we will review the phases with respect to how SDMX could support them, in particular in the context of a generalised IT infrastructure based on SDMX principles.

2. The SDMX Framework:

The “Statistical Data and Metadata Exchange” Initiative (SDMX) aims to “develop and use more efficient processes for exchange and sharing of statistical data and metadata among international organisations and their member countries. To achieve this goal, SDMX provides standard formats for data and metadata, together with content guidelines and an IT architecture for exchange of data and metadata. Organisations are free to make use of whichever elements of SDMX are most appropriate in a given case.”

The SDMX framework consists of SDMX Technical Specifications and related tools:

• SDMX Information Model (SDMX IM)
• Two data exchange formats to exchange data and metadata (SDMX-EDI and SDMX-ML)
• SDMX Registry specification and web services guidelines
• SDMX tools

In addition, the SDMX framework provides Content-oriented Guidelines covering:

• Cross-Domain Concepts (definitions and recommended code lists)
• Metadata Common Vocabulary
• Subject Matter Domain list

The technical specifications concentrate on data exchange and data sharing processes. The SDMX Information Model (IM) covers the information (statistical data and metadata) that may be exchanged between statistical agencies and also the related process flows, such as information on data provisioning, e.g. which agency should report which data at what time. Obviously, information that is exchanged will also have to be stored in statistical systems.

The two data exchange formats can be used to actually “package” data and metadata into data exchange messages, which also cover structural information that helps a receiving application to interpret and (automatically) process the information.

Freely available SDMX tools have been developed with a view to being used as demonstration tools for teaching and learning about SDMX, e.g. building SDMX data and metadata structures or running exchange format transformations.

The SDMX Registry specification provides a central registry of available data and reference metadata and a repository for provisioning information, thus constituting the focal point for process automation.

The Content-oriented Guidelines focus on the harmonisation of specific concepts and terminology that are common to a large number of statistical domains. Such harmonisation complements the potential efficiency gains to be achieved when applying the SDMX technical specifications.
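Provisioning information of the kind mentioned above (which agency should report which data at what time) lends itself to automated monitoring. The following is a minimal sketch under stated assumptions: the `ProvisionAgreement` class, the provider names, the dataflow name and the dates are all hypothetical, and a real SDMX Registry models provision agreements in considerably more detail.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ProvisionAgreement:
    """Hypothetical record: who reports which dataflow, and by when."""
    provider: str
    dataflow: str
    due: date


def late_reporters(agreements, received, today):
    """Return agreements that are past due and have no recorded delivery."""
    return [
        a for a in agreements
        if a.due < today and (a.provider, a.dataflow) not in received
    ]


# Illustrative agreements; none of these names or dates come from the paper.
agreements = [
    ProvisionAgreement("MINISTRY_A", "LABOUR", date(2020, 3, 31)),
    ProvisionAgreement("MINISTRY_B", "LABOUR", date(2020, 3, 31)),
    ProvisionAgreement("MINISTRY_C", "LABOUR", date(2020, 6, 30)),
]
received = {("MINISTRY_A", "LABOUR")}  # deliveries recorded so far

overdue = late_reporters(agreements, received, today=date(2020, 4, 15))
for agreement in overdue:
    print(f"Reminder: {agreement.provider} is late for {agreement.dataflow}")
```

In this sketch only the second provider is flagged: the first has delivered and the third is not yet due.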

3. The statistical production process and SDMX:

Here we will look at the different steps of the statistical business process to identify how the application of SDMX technical standards and content-oriented guidelines can add efficiency. The key argument rests on the fact that the standards can be applied across statistical domains, leading to a standardisation of the related statistical processes. It should be possible to build a generic processing system that implements these standards, which in turn can then host the data from different statistical domains and their statistical processes.

3.1. Specify need and design:

These phases consist of defining all relevant concepts for the statistical activity: what data should be collected and from whom, what information is needed in addition to purely numeric “data”, how (often) the data will be collected, what quality controls need to be performed upon reception of the data, and what processing (aggregations, estimations) needs to be performed in order to arrive at the final statistical “product”. Who are the clients of this product and how will they get access to the statistical information? What additional information do they need, e.g. about the collection exercise and the processing, to correctly interpret the information they receive? Already this short and incomplete description shows that a data collection exercise “is not only about data”, but to a great extent about information about the data, i.e. “metadata”.

This is where the SDMX Information Model (SDMX IM) comes in: its key feature is the extensive model for data and metadata, so that the information “about the data” needed in the statistical process and by the final users can be represented. Metadata can be attached to the data: statistical data is “identified” (via statistical concepts, which might be partly chosen from the SDMX Cross-domain concepts) and further “qualified” by attributes, which can be “free text” information, e.g. about the collection methodology for a particular statistic. Metadata can also be attached to other artefacts of the information model, e.g. to an element of a code list.

Planning and developing the statistical task “using SDMX” means that the statistical expert responsible for this work needs to take decisions on how to structure the data to be collected, which statistical concepts should identify the data items and what additional attributes (carrying content or also processing information) should be defined. This is actually nothing new and has been done before. The new feature is that SDMX provides a generic model, which can be applied across statistical domains. The model provides a certain rigour and its application fosters the re-use of statistical concepts and code lists. An organisation’s first such application of the SDMX IM for a new statistical task will require a certain investment in understanding the model and how to use it. However, once the statistical experts gain experience, it will become easier. A key motivation to start applying the SDMX IM for other statistical collection tasks should be the fact that any information organised according to the SDMX IM can be exchanged using the same standard SDMX formats. An organisation’s processing environment that “understands SDMX” or is “SDMX conformant” will increase efficiency as more statistical tasks are integrated.

3.2. Manage metadata

At this point of the argument, it may already have become clear that “manage metadata” in the SDMX view is not a separate step of the statistical process, but an integral part of all steps. Metadata must accompany the data through the statistical process. It can even be argued that basing the statistical process on SDMX can actually make that process “metadata driven”, based on a generic SDMX conformant processing environment.
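The way data is “identified” by statistical concepts and “qualified” by attributes, as described in section 3.1, can be made concrete with a short sketch. This is a simplified illustration, not the SDMX Information Model itself: the `DataStructure` and `Observation` classes are hypothetical, the code lists are invented for the example, the observation value is made up, and joining codes with dots is only one common way of writing a key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataStructure:
    """Simplified stand-in for a data structure definition."""
    dimensions: tuple   # ordered concepts that identify a data item
    code_lists: dict    # dimension -> set of allowed codes
    attributes: tuple   # concepts that qualify a data item

    def series_key(self, **codes):
        """Build a key from one valid code per dimension, in order."""
        parts = []
        for dim in self.dimensions:
            code = codes[dim]
            if code not in self.code_lists[dim]:
                raise ValueError(f"{code!r} is not a valid code for {dim}")
            parts.append(code)
        return ".".join(parts)


@dataclass
class Observation:
    key: str          # identifies the series
    period: str       # time period of the observation
    value: float      # the numeric "data"
    attributes: dict  # metadata that travels with the data


# Illustrative structure; the code lists are assumptions for this example.
dsd = DataStructure(
    dimensions=("FREQ", "REF_AREA", "INDICATOR"),
    code_lists={
        "FREQ": {"A", "Q", "M"},
        "REF_AREA": {"SD"},
        "INDICATOR": {"CPI", "GDP"},
    },
    attributes=("OBS_STATUS", "COLLECTION_METHOD"),
)

key = dsd.series_key(FREQ="A", REF_AREA="SD", INDICATOR="CPI")
obs = Observation(
    key=key,
    period="2015",
    value=100.0,  # made-up figure, for illustration only
    attributes={"OBS_STATUS": "P", "COLLECTION_METHOD": "Household survey"},
)
```

Because the same structure class can describe any domain, a processing system written against it does not need to know in advance which statistics it will host.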

A key set of metadata used in SDMX are code lists for dimensions (identifying data) and attributes (qualifying data). Managing these centrally and re-using them to the greatest extent possible provides a natural path towards harmonisation and thus efficiency gains across the organisation as well as better information about the data for statistical analysts and users.

3.3. Build, Collect and Process:

For the data collection activity, SDMX technical standards are a natural choice, as they were developed for data exchange. If the data to be collected as part of a statistical task have been structured according to the SDMX IM, then organisations have the choice of two syntaxes for the actual data exchange messages: SDMX-EDI and SDMX-ML. Being able to use the same exchange format for different data collection activities creates economies of scale for this step of the statistical process. For a reporting agent the fact that the reporting of different data to possibly different statistical agencies can be done in the same reporting format will be an advantage. A statistical agency receiving data for different collection tasks and from different reporting agents in the same format will also achieve efficiency gains.

These arguments lead to the conclusion that it will be beneficial to have generic “SDMX conformant” processing systems that can digest any type of data collection modelled according to the SDMX IM. While SDMX may initially not have been intended to provide a data and metadata storage model, a growing number of organisations are using the SDMX IM to drive the development of their statistical IT systems.

We emphasised the importance of metadata for the statistical process, and SDMX allows us to exchange metadata together with the data, e.g. information from the reporting agent about the quality of a specific reported figure, its confidentiality or other attributes that would be important for the receiving organisation or the final end user.
The SDMX conformant system would need to be set up in such a way that it can carry this metadata through the actual processing. Applying an SDMX data structure to a collection exercise means that each data item is clearly identified by the set of dimensions that make up its identifier or key. This can help in defining (automated) checking routines for incoming data, taking advantage of, for example, an aggregate/component hierarchy on one of the defining dimensions: the hierarchy defines the list of items that need to be checked against the higher-level aggregate. The same holds for a dimension where “Credit”, “Debit” and “Net” are defined and reported: the relationship between these figures can be exploited for checking purposes.

Large data collection exercises with a large number of providers need to be automated. Via the IM and the Registry, SDMX provides support for such automation. It allows “provision agreements” to be defined between data providers and a collection agency; these specify which data set (based on a given data structure) should be reported by which providers and at what intervals (or on which specific dates). This can be used to monitor the actual delivery and also, for example, to create warning messages for late reporters.

3.4. Analyse: Statistical analysis as such is not a process that SDMX is concerned with. However, as a prerequisite to any analysis, statistical analysts require “domain intelligence”, which is basically a different word for “metadata”. SDMX has a strong focus on metadata and, if this metadata is carried through the statistical process into the final “data product” offered for analysis, then we can assume that rich metadata will enhance the quality of the analysis. Analysts who know more about the data they work with can make better judgments.
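The automated checking routines described under Build, Collect and Process can be sketched as follows. This is a minimal illustration in Python, not part of any SDMX tool: the function names, the item codes and the tolerance value are assumptions made for the example. It checks that the components listed in a hierarchy add up to the reported aggregate, and that Net equals Credit minus Debit.

```python
def check_aggregates(observations, hierarchy, tolerance=0.5):
    """Return (aggregate, reported, computed) for every aggregate that fails.

    observations: dict mapping an item code to its reported value
    hierarchy:    dict mapping an aggregate code to its component codes
    tolerance:    allowed absolute difference (assumed rounding margin)
    """
    failures = []
    for aggregate, components in hierarchy.items():
        # Skip checks that cannot be evaluated because a figure is missing.
        if aggregate not in observations:
            continue
        if any(code not in observations for code in components):
            continue
        computed = sum(observations[code] for code in components)
        reported = observations[aggregate]
        if abs(computed - reported) > tolerance:
            failures.append((aggregate, reported, computed))
    return failures


def net_is_consistent(credit, debit, net, tolerance=0.5):
    """Check the relationship Net = Credit - Debit within a tolerance."""
    return abs((credit - debit) - net) <= tolerance


# Hypothetical reported figures for one reference period.
reported = {"TOTAL": 100.0, "AGRICULTURE": 60.0, "INDUSTRY": 30.0}
item_hierarchy = {"TOTAL": ["AGRICULTURE", "INDUSTRY"]}

for aggregate, value, computed in check_aggregates(reported, item_hierarchy):
    print(f"{aggregate}: reported {value}, components sum to {computed}")
```

Because the hierarchy is held as metadata rather than written into the check itself, the same routine could in principle be applied to any data set whose structure defines such a hierarchy.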
The data and metadata structures for the data to be analysed should contain attributes into which “domain specific”, “time series specific” or even “observation specific” domain intelligence can be entered by the statistical analyst, if required. It is thus stored together with the data for future reference. The GSBPM also considers disclosure control as part of the “analyse” phase. As indicated above, the SDMX IM foresees the attachment of attributes at various levels, which includes attributes about the confidentiality status of the item (observation, time series, whole data set) as well as information about the status of an observation, eg whether it is estimated. Applying the same concepts and code lists for this type of “flagging” across the different statistical domains will again provide efficiency gains and add clarity about the data for users.
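The “flagging” idea can be illustrated with a short Python sketch. The code lists below are small illustrative subsets, and the class, the field names and the series key are assumptions made for this example rather than definitions taken from the SDMX standard. The point is that one shared pair of code lists validates the flags of every observation, whatever the statistical domain.

```python
from dataclasses import dataclass

# Illustrative code lists shared by all domains (assumed subsets).
OBS_STATUS = {"A": "Normal value", "E": "Estimated value", "P": "Provisional value"}
CONF_STATUS = {"F": "Free for publication", "C": "Confidential"}


@dataclass(frozen=True)
class Observation:
    series_key: str          # dot-separated dimension values identifying the series
    period: str              # time period of the observation
    value: float
    obs_status: str = "A"    # observation-level attribute
    conf_status: str = "F"   # observation-level attribute

    def __post_init__(self):
        # Reject flags that are not in the shared code lists.
        if self.obs_status not in OBS_STATUS:
            raise ValueError(f"unknown observation status: {self.obs_status}")
        if self.conf_status not in CONF_STATUS:
            raise ValueError(f"unknown confidentiality status: {self.conf_status}")


def publishable(observations):
    """Keep only observations flagged as free for publication."""
    return [obs for obs in observations if obs.conf_status == "F"]


# Hypothetical observations for one series.
sample = [
    Observation("A.SD.GDP", "2015", 81.4, obs_status="E"),
    Observation("A.SD.GDP", "2016", 82.9, conf_status="C"),
]

for obs in publishable(sample):
    print(obs.period, obs.value, OBS_STATUS[obs.obs_status])
```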

3.5. Disseminate: SDMX is on its “home turf” when it comes to the dissemination of statistical data and metadata. Dissemination comes in different flavours, from the (classic) provision of static tables (HTML, PDF or Excel) on the website of a statistical agency, through the exchange of large amounts of data between agencies via fixed file formats, using CDs, DVDs or telecommunication links, to offering end users an interactive user interface with navigation, query and download facilities. SDMX can facilitate all these means of dissemination. We start from the assumption that, at this late stage of the statistical process, the data and metadata to be disseminated are already modelled according to the SDMX Information Model, in an SDMX conformant database environment. This means that the data is accompanied by its metadata and that the data and metadata structures (together with the required code lists) are also stored in this system. Data extracted from this system in SDMX formats can be “rendered”, together with the accompanying metadata, as HTML, PDF or Excel files using automated transformations. Data and metadata extracted in SDMX formats can be disseminated directly on CD or DVD to other organisations.

The full strength of the SDMX approach becomes apparent when it comes to offering an interactive graphical user interface (GUI). The metadata that accompanies the data according to the SDMX IM is exactly the information required by any “navigation and search” user interface. Typical queries that users will express when searching for data are “Give me the nominal GDP for the Euro Area, US and Japan” or “Give me the daily and weekly average 3-month LIBOR rates for the Yen, Pound and US dollar”. Such queries are based on the SDMX data structures, ie the dimensions and attributes for the particular data sets into which this data is modelled.
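To show how such a user query maps onto the dimensions of a data structure, the following Python sketch builds a data query in the dot-separated key style used by SDMX REST web services, where dimension values appear in the order of the data structure, several values for one dimension are joined with “+”, and an empty position means “all values”. The dataflow name, the dimension names and the codes are hypothetical.

```python
def build_key(dimension_order, selection):
    """Build a dot-separated series key from the selected dimension values.

    dimension_order: dimension names in the order defined by the data structure
    selection:       dict mapping a dimension name to a list of selected codes;
                     dimensions left out are wildcarded (empty position)
    """
    unknown = set(selection) - set(dimension_order)
    if unknown:
        raise KeyError(f"not a dimension of this structure: {sorted(unknown)}")
    return ".".join("+".join(selection.get(dim, [])) for dim in dimension_order)


def data_query_path(dataflow, dimension_order, selection):
    """Return the resource path of a data query for the given dataflow."""
    return f"data/{dataflow}/{build_key(dimension_order, selection)}"


# "Give me the nominal GDP for the Euro Area, US and Japan"
dimensions = ("FREQ", "REF_AREA", "INDICATOR")   # hypothetical structure
wanted = {"REF_AREA": ["U2", "US", "JP"], "INDICATOR": ["GDP_NOM"]}

print(data_query_path("NATIONAL_ACCOUNTS", dimensions, wanted))
```

A generic search interface only needs the dimension order and the code lists from the data structure to offer these selections to the user; nothing in the function is specific to one data set.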
The data structure information, together with “constraint” information about which series are actually available in a given data set, is fully sufficient to build a generic GUI for searching SDMX conformant data sets, which could be implemented, for example, via an SDMX Registry. The SDMX Registry can actually be seen as the focal point for dissemination activities. The availability of a data set, in a database that can be queried or as a data file on a website, can be registered there. Users of that particular data can subscribe to the registry to receive a notification once that data set has been updated with new information.

3.6. Archive: When archiving data, it is important to archive them jointly with the explanatory metadata, as otherwise the data will be useless. Again SDMX, due to its focus on the combination of data and metadata, provides advantages. Data and metadata that have been structured according to a common model, the SDMX IM, are more easily archived than data with different structures. While SDMX does not specifically deal with the issue of archiving, it facilitates it, again due to its generic Information Model and the possibility of packaging the data and metadata into a common format based on the SDMX standards. Structures and data in SDMX formats (eg in SDMX-ML) are flat files which can be efficiently “zipped” and stored. Together they can be archived and “brought back”, ie loaded again into any statistical environment that “understands” SDMX: the data structure message is used to “interpret” the actual data and metadata message. The data structure message, which is usually a rather small file compared to the data files, contains all relevant information about the content of the (archived) data file, ie all information about the code lists used in the identifying dimensions.

A popular standard among data archives is the Data Documentation Initiative (DDI). It is worth pointing out that in recent versions of DDI there has been a conscious alignment between how SDMX and DDI describe aggregate data (unlike SDMX, DDI also focuses on microdata). Thus, it would be possible to transform SDMX data and metadata into DDI for the purposes of archiving.
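The practice of zipping a structure file together with its data file can be sketched with the Python standard library. This is only an illustration of the packaging idea: the function names and file names are assumptions, and the files are treated as opaque flat files without being parsed.

```python
import zipfile
from pathlib import Path


def archive_dataset(structure_file, data_file, archive_path):
    """Store the structure message and the data message in one zip archive."""
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Keeping both files together means the data can be interpreted later.
        archive.write(structure_file, arcname=Path(structure_file).name)
        archive.write(data_file, arcname=Path(data_file).name)
    return archive_path


def restore_dataset(archive_path, target_dir):
    """Extract the archived files and return their names in sorted order."""
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)
        return sorted(archive.namelist())
```

A call such as `archive_dataset("structure.xml", "data.xml", "dataset_2016.zip")`, with hypothetical file names, would produce a single archive that can later be restored and loaded into an SDMX-aware environment.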

References:

1. National Strategy for the Development of Statistics (NSDS) in Sudan Report.
2. SDMX and the statistical production process, Gabriele Becker.
3. Framework for SDMX Technical Standards, Version 2.1.