EUROPEAN COMMISSION EUROSTAT Directorate E: Sectoral and regional statistics Unit E-4: Regional statistics and geographical information Luxembourg, 16.02.2015 WP DOC NR D/GIS/105 WORKING PARTY MEETING GEOGRAPHICAL INFORMATION SYSTEMS FOR STATISTICS LUXEMBOURG BUILDING BECH – ROOM AMPERE ON 2 MARCH 2015 ROOM DOCUMENT REPORT FROM THE TASK FORCE ON THE INTEGRATION OF STATISTICAL AND GEOSPATIAL INFORMATION Working document for Item 5 on the Agenda Contact: Commission européenne, 2920 Luxembourg, LUXEMBOURG - Tel. +352 43011 http://epp.eurostat.ec.europa.eu


EUROPEAN COMMISSION
EUROSTAT

Directorate E: Sectoral and regional statistics
Unit E-4: Regional statistics and geographical information

Luxembourg, 16.02.2015

WP DOC NR D/GIS/105

WORKING PARTY MEETING

GEOGRAPHICAL INFORMATION SYSTEMS FOR STATISTICS

LUXEMBOURG

BUILDING BECH – ROOM AMPERE

ON 2 MARCH 2015

ROOM DOCUMENT

REPORT FROM THE TASK FORCE ON THE INTEGRATION OF STATISTICAL AND GEOSPATIAL INFORMATION

Working document for Item 5 on the Agenda

Contact:

Ekkehard Petri, telephone: +352 4301-36745, e-mail: [email protected]

Commission européenne, 2920 Luxembourg, LUXEMBOURG - Tel. +352 43011

http://epp.eurostat.ec.europa.eu


Report from the task force on the integration of statistical and geospatial information

Date: 20/02/2015

Version: 0.3

Authors: PETRI

Revised by:

Approved by:

Public:

Reference Number:


Document History

Version | Date | Comment | Modified Pages
0.1 | 12/02/15 | Document created by PETRI |
0.2 | 20/02/15 | Revision based on input from FI, BE, PL, SE, Eurostat | All


1 PURPOSE

This report presents the main results of the work of the Taskforce on the integration of statistical and geospatial information carried out between November 2013 and December 2014.

2 EXPECTED OUTCOME

The members of the GISCO working group are invited to:

Take note of this report and the work of the Taskforce on the integration of statistical and geospatial information;

Discuss the recommendations for each of the areas and indicate priorities for future action;

Comment on the proposal for a joint NMCA-NSI working group on geocoding the 2021 census;

Discuss the future position and mandate of the Taskforce with regard to UN-GGIM: Europe and taking into consideration the recommendations set out in this report.

3 PROBLEM STATEMENT

The ESS as a whole is confronted with requests to meet new user requirements. At the same time the ESS is asked to work more efficiently and redefine its business model and enterprise architecture (the Vision 2020 [1]). One focus of the Vision 2020 is on creating statistics from multiple sources.

It has been recognised by the ESS that the integration of statistical and geospatial information can contribute to this transformation process. However, the integration of statistical and geospatial information is not yet a fundamental building block of official statistics.

The Taskforce on the integration of statistics and geospatial information, for easier reference called the ‘Taskforce’, was established following a decision of the DIMESA in 2013. The mandate obtained from the DIMESA was to discuss the various aspects of information integration in the broadest sense and to develop recommendations for improved information integration within the ESS.

Additional initiatives at the European and international level calling for better integration supported this decision and provided additional information on the scope of the Taskforce:

The EFGS with its Prague initiative [2] in 2012 called for an ESS task force with the goal to prepare the 2021 census as a fully integrated operation, and to build an integrated information system for sustainable development.

The Committee of Experts on UN-GGIM in its third session in 2013 discussed the topic and “Encouraged national geospatial information authorities to reach out to their national statistical office counterparts to actively engage in a dialogue on better integration of statistical and geospatial information at the national level.” The Committee also highlighted “… the unique opportunity in time offered by the preparations of the forthcoming 2020 Round of Censuses” [3].

In October 2013 a UN-GGIM Expert Group on the topic of geospatial-statistical information integration was set up. Its Terms of Reference make reference to a global statistical geospatial framework [4] and the implementation of the 2021 round of censuses as main areas of activity.

[1] http://ec.europa.eu/eurostat/documents/10186/756730/ESS-Vision-2020.pdf/8d97506b-b802-439e-9ea4-303e905f4255

[2] https://circabc.europa.eu/d/a/workspace/SpacesStore/0a1f6588-7399-4930-bf4b-635a5b975027/D_GIS_103%20GISCO-2013-WP-Task%20Force%20EFGS%20Proposal.docx

[3] http://ggim.un.org/docs/meetings/3rd%20UNCE/GGIM_3%20final%20report%209%20Aug_FINAL.pdf

The ESS Committee in its 19th meeting in 2013 also discussed the topic and issued the following opinion [5]:

o “The ESSC expressed its support for the efforts to advance the integration of statistics and geospatial information and recognised the importance of the UN-GGIM initiative for the integration process.”

o “The newly created Task Force was supported and requested to develop a strategy for a harmonised approach to geo-referenced statistics within the ESS and to base the 2021 round of censuses on registers.”

Although all these initiatives did not go into much detail as to what the integration of statistical and geospatial information actually means, they provided the Taskforce with a solid framework:

The next census (2021) and access to administrative data sources should be the focus. The census, as an operation involving large parts of the public administration, has a high potential for driving change.

Integration is more than the combination of final information products (geospatial and statistical). The benefits of integration should be investigated and exploited during all stages of the statistical production process.

Geocoding of statistical and administrative data at unit record level is the most important condition for better statistical-geospatial information integration.

INSPIRE represents a solid technical and partially legal platform for achieving better data integration but needs to be fleshed out with content related aspects.

4 PROPOSED ACTION

The GISCO working group recommends that UN-GGIM: Europe create a working group on geocoding the 2021 census, composed ideally of an equal number of NSI and NMCA experts. Its mandate should be to formulate an action plan based on the recommendations in this report. Based on this action plan, European and national implementation projects could be launched. UN-GGIM: Europe working groups A and B might already deal with selected actions in part.

The project proposal should also build on the results of the GEOSTAT 1 and 2 projects and possibly the ELF [6] project.

A steering group composed of senior managers of European NSIs and NMCAs should be established and report to the ESS Committee and the UN-GGIM: Europe executive committee.

In addition the recommendations under each of the headings in section 6 need to be assessed for their cost-benefit and prioritised. This should then lead to the formulation of implementation projects, legal projects or organisational actions in Member States for the most promising actions. Funding for these projects will come at least partially from Eurostat.

5 RESOURCE IMPACT

At this stage it is too early to estimate the resource impact of the implementation of a comprehensive action plan for the integration of statistical and geospatial information.

The resource impact will vary substantially between countries. Obviously such a major undertaking will take time and its scope will first have to be defined at European and Member State level.

[4] http://ggim.un.org/docs/TOR_Statistical%20and%20Geospatial%20Information.pdf

[5] ESSC 2013/19/Final opinions

[6] http://www.elfproject.eu/

Eurostat is committed to continue its work with the same intensity and the current level of resources. A first, more concrete estimate of resources should be made once the requirements for a geocoded census 2021 have been established and corresponding European and national action plans have been drawn up.

6 TIMETABLE

This report will be discussed in the GISCO working group meeting in March 2015, and then presented to the DIMESA in June 2015. If applicable a subset of recommendations will be presented to the DIMESA for a decision on the priorities for further action.

In parallel, the results of the report should feed into the work of the UN-GGIM: Europe working groups A on core data and B on data integration as the contribution of the ESS to the process of better integration of geospatial and statistical information in Europe. This discussion will take place during the rest of 2015.

A report on the state of play should be presented to the ESS committee meeting in November 2015.

If the UN-GGIM: Europe task force on geocoding the census is established, an initial discussion of this task should be part of the draft report of UN-GGIM: Europe working group B, due in July 2015. This proposal should be further elaborated during the EFGS 2015 conference in November 2015 and then presented to working groups and management bodies of the ESS and NMCAs for approval in late 2015 or in the first half of 2016.


1 SUMMARY

Building on the discussion of national best practices and obstacles the Taskforce has drawn up the following list of drivers that could help to make the integration of statistical and geospatial information an inherent part of the statistical production process:

A point based geospatial reference framework for statistical and administrative data sources, based on address, building and dwelling registers;

Core geospatial data provided as open data;

In general terms free, open or at least low barrier access to geospatial information required for statistical purposes;

Coordinated implementation of INSPIRE. The implementation of INSPIRE should be seen as one of the most important vehicles to foster the cooperation between NSIs and NMCAs and thus a top priority for both NSIs and NMCAs. The INSPIRE implementation touches the technical implementation of data infrastructures but also the legal system, data policies and data licenses all of which are relevant for a better integration of statistical and geospatial information;

Geospatial data models meeting statistical requirements;

Strong legal frameworks for the integration of geocoding into the statistical production and for producing spatial statistics; within the ESS the production of spatial statistics should follow the Generic Statistical Business Process Model (GSBPM).

Within NSIs systematic collaboration with methodologists and statistical production units in the design of statistical production processes;

Intensive cooperation with NMCAs in joint projects.

The following priority areas promise to achieve the quickest results - these are also the areas where European harmonisation makes most sense:

Systematic direct or indirect geocoding of statistical and administrative data sources;

Definition of harmonised spatial statistics products, their data sources, and their production process. A limited number of products should be proposed, starting with population grids;

National INSPIRE implementation plans should be reviewed and possibly reoriented towards more data integration;

Design of projects demonstrating the usefulness of integration of statistical and geospatial information together with NMCAs with a special focus on the 2021 round of censuses;

Harmonisation of disclosure control practices;


2 INTRODUCTION

This report presents the main results of the work of the Taskforce on the integration of statistical and geospatial information carried out between November 2013 and December 2014.

It starts with an overview of the mandate and scope of the Taskforce (section 3). A definition of spatial statistics and data integration is proposed in section 4. The approach taken by the Taskforce is presented in section 5. Based on discussions in the meetings, inventories and questionnaires, recommendations on how to advance and deepen the integration of statistical and geospatial information are presented in section 6. This main part of the report focusses on data sources (section 6.1), data products (section 6.2), legal aspects (section 6.4), and operational aspects (sections 6.2 and 6.5).

The document concludes with considerations on the need for a more intensive cooperation between NSIs and NMCAs (section 6.7) and a priority list for the need for European harmonisation of the above aspects in section 7.

The findings and a description of the current state of affairs of spatial statistics are presented in Annex I. The ambition has been to provide, for each of the topics, as complete an overview as possible of national spatial statistics practices, the state of play on the integration of statistical and geospatial information in Member States, and the European perspective.

3 SCOPE OF THE TASKFORCE

In its initial meeting the Taskforce discussed its mandate within the framework set by the ESSC (see Problem statement). It was concluded that the task force should:

cover strategic and operational issues of data integration, with a focus on the census;

cooperate with the ESS task forces on the 2021 and beyond censuses;

work closely together with relevant UN-GGIM groups;

develop recommendations on priority areas and need for change;

The following aspects were considered essential in this sense and put on the agenda of one or several Taskforce meetings:

National inventories of geospatial data sources accessible or useful for statistical production;

Inventory of administrative data sources with georeference (direct or indirect) to be used for spatial statistics;

Access conditions to geospatial information for statistical purposes;

Spatial statistics products to be recommended as official statistics;

Opportunities and limitations resulting from INSPIRE;

Statistical disclosure control for spatial statistics;

Overview of national legal and institutional frameworks for the integration of statistics and geospatial information;

IT and production issues;

Integration of geographical information and GIS into the statistical production process;

Cooperation between NSIs and NMCAs;

Human resources, skills;

The focus of the Taskforce was on national best practices, missing elements in the national and European context, and recommendations for solving issues. The results should also form the basis for projects like GEOSTAT and ‘Merging statistical and geospatial information’ that work on the implementation of better information integration and promote European harmonisation. Another goal was to inform the work of UN-GGIM, both in Europe and at the global scale.

Not in scope were the actual technical integration, methods for spatial analysis, and data and modelling techniques.

4 DEFINITIONS

In this document, statistical products resulting from the integration or linkage of statistical and geospatial information will be called ‘spatial statistics’. Spatial statistics is geocoded to small (in most cases below NUTS3) administrative or non-administrative output areas. Cross-border output areas, e.g. flooding zones, are particularly important from a European perspective. The reference area of spatial statistics should match the perception of citizens, researchers, administrations and policy makers in their area of interest (‘In my neighbourhood’).

The spatial dimension is the main characteristic of spatial statistics. Spatial statistics is used to answer questions from a spatial perspective, e.g. What is close? How many within a distance of X? How many per surface area?
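The question ‘How many within a distance of X?’ can be illustrated with a minimal sketch. It assumes that unit records are already geocoded to projected coordinates in metres; the function name, the sample coordinates and the 500 m radius are hypothetical and serve only as an illustration.

```python
import math

def count_within(points, centre, radius_m):
    """Count points whose straight-line distance to centre is at most radius_m."""
    cx, cy = centre
    return sum(1 for x, y in points if math.hypot(x - cx, y - cy) <= radius_m)

# Hypothetical dwelling coordinates (easting, northing) in metres.
dwellings = [(0.0, 0.0), (300.0, 400.0), (600.0, 800.0), (2000.0, 0.0)]
service_point = (0.0, 0.0)

# Number of dwellings within 500 m of the service point.
n_near = count_within(dwellings, service_point, 500.0)
```

Straight-line distance is used here for simplicity; a distance along a transport network would need network data instead.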

Spatial statistics should be used to support administration, policy making and planning at all levels of government from local to national. This is why the scale of spatial statistics is so important and should be aligned with the scale of the task.

Integration of statistical and geospatial information means:

1) The process of geocoding statistical and administrative information (micro or aggregated) using spatial reference frameworks;

2) Exploitation of geospatial data sources for the calculation of new statistics;

3) Processing and manipulation of statistical information using spatial analysis techniques (distances, spatial selection, intersection, aggregation) with the purpose to select information or derive new information with a focus on their spatial characteristics;

4) Supporting a more efficient and flexible statistical production process with geospatial information e.g. for surveying and sampling, field operation;

5) Combination of statistical end products with geospatial information for dissemination (statistical mapping);

6) Improving the quality of existing statistical products.

The above list might not be comprehensive. All statistical phenomena that can be associated with a location are in principle relevant for the integration of statistical and geospatial information. Location in this context means the location of the individual observation at unit record level. In most cases the location will be a point with coordinates or a precise address. However, other spatial reference frameworks like lines or polygons are relevant as well if they represent real world objects with this geometry; an example is transport performance on a certain road segment.

A complete integration of statistical and geospatial information is achieved if location and statistics are just attributes of information objects. The complexity of data integration needs to be hidden from users, allowing them to join information layers together without further processing on their end.

5 APPROACH

The main working method of the Taskforce was to prepare inventories and situation reports on national practices as regards the various areas of the mandate. Building on these reports best practices should be described, common obstacles identified and recommendations in particular for further European harmonisation put forward.


In the first meeting the Taskforce sketched the situation of spatial statistics in European NSIs in terms of its legal, organisational, and technical aspects. Also, the integration of geospatial data into the business of NSI and the potential for a more intensive use of geospatial information for statistical purposes was investigated. The discussion was structured by the following questions:

1. Which geospatial data sources are necessary to improve the statistical production from a geospatial perspective?

2. Based on user needs which spatial statistics products should be produced at the European level?

3. Where in the statistical production process does the incorporation of geospatial information make most sense?

4. Which technical, legal, organisational and data policy obstacles exist in countries? How have other NSIs removed these obstacles? How can we improve the production of spatial statistics?

5. How can we use INSPIRE to improve data integration?

6. Where is European harmonisation most urgently needed?

In the second meeting the Taskforce focussed on the possibility for European harmonisation regarding data protection, essential geospatial data sources, and on the definition of spatial statistics products for which demand is expected. The 2021 census was also discussed as a geospatial-statistical project.

The third meeting focussed on further refining the list of suggested data sources and products, and on international cooperation.

6 RECOMMENDATIONS OF THE TASKFORCE

The starting point for each of the topics discussed in the Taskforce meetings was a set of questions that should lead to clear recommendations on how to improve the current level of integration of statistical and geospatial information. The following sections present the initial questions followed by the recommendations put forward by the Taskforce. The description of the current state of affairs on which these recommendations are based can be found in Annex I.

6.1 Data sources

• Depending on the products, to which data sources do NSIs need permanent and easy access?

• Can we define a minimum, common set of data sources for all ESS NSIs?

• Would access to as many different data sources as possible stimulate innovation?

• How should NSIs obtain access to geospatial data?

• How much influence can NSIs have on the definition of geospatial data sources?

Over the summer of 2014 the Taskforce conducted a comprehensive survey [7] among its members on essential geospatial data sources for spatial statistics. A wide range of obvious and sometimes less obvious data sources was proposed, from which the Taskforce selected a minimal common set of essential geospatial data sources (‘core’), on the basis of the following criteria:

Potential to create new products, e.g. population grids from geocoded population registers;

Improve the quality of existing products, e.g. improved land use statistics thanks to large scale geodata;

Make the production process more efficient or more flexible, e.g. register and survey based statistics referenced to the same address based reference framework.

[7] https://circabc.europa.eu/d/a/workspace/SpacesStore/60a48783-962e-45e8-b142-e11801b43b52/TF-GIM-2014-2-StatGeo-Questionnaire.xlsx

The Taskforce identified the following groups of data as essential for spatial statistics and the integration of statistical and geospatial information.

First category – Geospatial data used for directly geocoding data sources for statistics – the spatial reference framework.

These data sources should represent the minimal national spatial reference framework. They should be maintained by public administrations as part of their administrative tasks. They should be used to geocode directly or indirectly all public sector information at all levels of government, including statistics. They are not statistics though.

Topographic data

Detailed transport networks including public transport stops

Hydrographic network

Ortho-imagery

DEM

Administrative data sources

Administrative boundaries

Statistical regions

Census enumeration areas

Integrated geocoded address, building, dwelling register

Land parcels (agriculture and estate)

Cadastral maps

Other data

Postal code areas

All these data sources need to be integrated and fit for spatial analysis purposes, meaning that e.g. administrative boundaries should be integrated with other topographical information and different transport modes should have mode transition points.

All these datasets should be made available with high spatial resolution and harmonised scope. As a rule, geographical features for statistical purposes should be represented as a point unless their geometry matters for the statistical purpose. This means that buildings can be represented as a point, unless their dimension is needed, e.g. for land-use statistics. In addition, a set of generalisations needs to be created, harmonised at the European level.

Second category – Data sources for statistics that need to be geocoded for spatial statistics

Using the category 1 framework, other data sources may be directly or indirectly geocoded and this way can be used to produce spatial statistics:

Sample frame for surveys geocoded to the above address register;

Person register;

Workplace points;

Public services points;

Real Estate Tax registers associated to buildings;

Traffic information

Other types of public files (tax, registrations, …)
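Indirect geocoding as described above can be sketched as a simple key lookup. The example assumes that each record of a category 2 source carries an identifier pointing to the geocoded address register; all identifiers, field names and coordinates below are hypothetical.

```python
# Hypothetical address register: address ID -> (easting, northing) in metres.
address_register = {
    "A001": (4321000.0, 3210000.0),
    "A002": (4321500.0, 3210250.0),
}

# Hypothetical person register records that carry only an address ID.
person_register = [
    {"person_id": "P1", "address_id": "A001"},
    {"person_id": "P2", "address_id": "A002"},
    {"person_id": "P3", "address_id": "A999"},  # no match in the address register
]

def geocode_indirectly(records, register):
    """Attach coordinates via the address ID; collect records that fail to match."""
    geocoded, unmatched = [], []
    for rec in records:
        xy = register.get(rec["address_id"])
        if xy is None:
            unmatched.append(rec)
        else:
            geocoded.append({**rec, "x": xy[0], "y": xy[1]})
    return geocoded, unmatched

geocoded, unmatched = geocode_indirectly(person_register, address_register)
# Share of records that could be geocoded, a simple quality indicator.
match_rate = len(geocoded) / len(person_register)
```

Keeping the unmatched records separate, rather than dropping them, makes the quality of the link between the two registers visible.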


Third category – Thematic geospatial data

These thematic geospatial data can be used to directly create spatial statistics but also in combination with categories 1 and 2.

Land cover maps

Protected areas

Statistics referenced to functional areas (non-administrative or administrative);

For a complete range of spatial statistics products NSIs need access to all three categories of geospatial data sources. However as a first priority access to category 1 should be ensured. These sources are in most cases external to NSIs and therefore access to them needs to be ensured.

Big data

Big Data (e.g. positions of mobile phones) is a relatively new type of geospatial data. Research into its potential as a source for spatial statistics has only just started and needs to be developed first. First projects have been launched on traffic control data and mobile phone positions.

Further recommendations

Countries should agree on a single official administrative reference dataset per country with clear ownership, defined scales and attributes taking into account statistical requirements. Both administrative data sources and survey information should be geocoded to the same reference framework. In general terms all Member States should make the use of this spatial reference system mandatory for all public stakeholders at all levels of government and administration, for all public data and all administrative tasks.

In particular all countries should implement single georeferenced administrative address, buildings, and dwellings registers. These registers should follow harmonised European recommendations which are being developed in the GEOSTAT 2 project, expected to finish at the end of 2016. All NSIs are invited to contribute their requirements to the GEOSTAT 2 project.

This register should form the reference framework for geocoding all future censuses starting from the 2021 round.

As a second priority, a reference system for transport statistics composed of detailed transport networks should be established.

An important aspect of the spatial reference system is that it needs to be equipped with identifiers that are stable over time and can be used as unique keys to reference all relevant information to them. A concept for time series of this reference system needs to be developed. This includes aspects like referencing versions of data in a unique manner.
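One possible way to think about stable identifiers combined with versions is sketched below. It is an assumption made for illustration, not a concept proposed by the Taskforce: each object keeps the same identifier over time, and each version of its attributes carries a validity period. The identifier, dates and coordinates are hypothetical.

```python
from datetime import date

# Hypothetical version history: stable ID -> dated versions of the object's location.
history = {
    "B-1001": [
        {"valid_from": date(2011, 1, 1), "valid_to": date(2015, 12, 31), "x": 100.0, "y": 200.0},
        {"valid_from": date(2016, 1, 1), "valid_to": None, "x": 105.0, "y": 198.0},
    ],
}

def version_at(history, object_id, ref_date):
    """Return the version of object_id valid on ref_date, or None if there is none."""
    for version in history.get(object_id, []):
        ends = version["valid_to"]  # None marks the currently valid version
        if version["valid_from"] <= ref_date and (ends is None or ref_date <= ends):
            return version
    return None
```

With such a structure, statistics for a past reference date can be linked to the location that was valid at that date, while the identifier used as a key never changes.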

Access to the data forming this spatial reference system must be easy. The implementation of INSPIRE should improve the situation in this regard. NSIs should not have to re-create essential data sources that already exist merely because of access or quality problems. Ideally these data should be open data. Another recommended best practice is the use of national, central data pools.

Understanding the origin, production process and other aspects of the quality of geospatial data is essential for the statistical production process. Corresponding documentation standards are not part of INSPIRE and should be developed, e.g. using the UN-GGIM process.

The creation, maintenance, access conditions and use of this spatial reference framework may need a legal framework.

6.2 Spatial statistics products

• Which spatial statistics products should be produced as official European spatial statistics?

• Which product requirements do spatial statistics have?


In many national statistical systems and in the ESS spatial statistics products do not belong to the portfolio of Official Statistics that is normally regulated by law. As a result they are not covered by corresponding appropriations and NSIs often have to recover at least partly the cost for their production by offering them to the market. Thus defining spatial statistics as European statistics needs to be preceded by a careful assessment regarding the consequences for the current business models and organisations of those NSIs that create revenues from selling spatial statistics products. The G8 Open Data initiative may represent an opportunity here to seek funding for spatial statistics from government appropriations.

NSIs should consider a combined approach: mid-resolution products offered as Official Statistics or Open Data, and spatial statistics at very high resolution or as tailor-made services offered for sale. This means that currently only a limited core set of spatial statistics should be considered as candidates for official statistics:

• Population grids for a limited range of population topics;

• Day-and-night-time population grids;

• Delineations of functional, non-administrative areas, e.g. localities;

• Statistics by distances and by service areas;

• Statistics on travel-to-work;

• Transport statistics on transport performance (e.g. by segments);

In terms of output areas, equal-count approaches (output areas) and equal-area approaches (grid systems) should be developed in parallel and produced from the same microdata, following a best fit-for-purpose strategy.
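The equal-area approach can be sketched as follows: geocoded unit records are assigned to square grid cells and counted per cell. The sketch assumes projected coordinates in metres and a cell size in whole kilometres; the cell label pattern (size, then northing and easting of the lower-left corner in cell units) is an assumption modelled on commonly used European grid codes, and the sample points are hypothetical.

```python
from collections import Counter

def grid_cell_id(x, y, size_km=1):
    """Label the cell containing (x, y) by its lower-left corner, in cell-size units."""
    size_m = size_km * 1000
    col = int(x // size_m)  # easting index of the cell
    row = int(y // size_m)  # northing index of the cell
    return f"{size_km}kmN{row}E{col}"

def population_grid(points, size_km=1):
    """Count geocoded unit records per grid cell."""
    return Counter(grid_cell_id(x, y, size_km) for x, y in points)

# Hypothetical person locations (easting, northing) in metres.
persons = [(4695200.0, 2599700.0), (4695900.0, 2599100.0), (4696100.0, 2599500.0)]
grid = population_grid(persons)
```

Because the counts are derived from the same point-based microdata, an equal-count output area product could be produced from the same input by replacing only the assignment step.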

Where applicable, the product characteristics (e.g. update frequency) of the Official Statistics product that mandates the data collection should be the guideline for the related spatial statistics product.

NSIs should make available their spatial statistics not only as statistical tables but also in line with the INSPIRE requirements, i.e. transforming data into INSPIRE data models and offering spatial services.

The DCAT-AP metadata standard should be the starting point for the development of a metadata standard for discovery of spatial statistics in open data portals. It complements the INSPIRE and statistical metadata standards. European initiatives to extend DCAT-AP and make it interoperable both with INSPIRE and SDMX are currently on-going and the European Commission is involved in these projects. They represent an opportunity to achieve interoperability between geospatial and statistical metadata for the discovery use case.

6.3 Statistical confidentiality

• Is European harmonisation and agreement on a common set of principles (respect of totals, classifications, thresholds, non-sensitive variables …) necessary?

Ensuring statistical confidentiality is a particular challenge for spatial statistical products. Small output areas with few statistical records dramatically increase the risk of disclosure.

Data protection should not destroy the original spatial distribution of the microdata. As an example, the populated or non-populated status of any output area, in particular of grid cells, should be respected.

NSIs should attempt to draw up a list of sensitive and non-sensitive population topics and define the risk of disclosure and the desired protection level depending on this assessment.

In the case of suppression, the metadata must explain deviations from the actual data.


A study should be carried out on the advantages and disadvantages of record swapping versus suppression. This study should take into account how the data are used and how aware users are of possible usability limitations of protected data.
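The suppression principles above can be illustrated with a minimal sketch. It assumes a simple threshold rule; the threshold value, the function name `protect_counts` and the cell identifiers are hypothetical and are not taken from any NSI practice described in this report. The sketch keeps the populated or non-populated status of every cell and records each suppression so that it can be documented in the metadata.

```python
from typing import Dict

# Hypothetical threshold: counts from 1 to THRESHOLD - 1 are treated as disclosive.
THRESHOLD = 3


def protect_counts(cell_counts: Dict[str, int], threshold: int = THRESHOLD) -> Dict[str, dict]:
    """Suppress small counts while keeping the populated/non-populated status."""
    protected = {}
    for cell_id, count in cell_counts.items():
        populated = count > 0
        # Empty cells are never suppressed, so the spatial pattern is preserved.
        suppressed = populated and count < threshold
        protected[cell_id] = {
            "populated": populated,
            "count": None if suppressed else count,
            "suppressed": suppressed,  # flag to be reported in the metadata
        }
    return protected


# Illustrative input: three grid cells with made-up counts.
raw = {"cell_a": 0, "cell_b": 2, "cell_c": 57}
result = protect_counts(raw)
```

In this sketch a suppressed cell still appears as populated, with its count withheld. Record swapping, the alternative named above, would need a different routine and is not shown.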

6.4 Legal framework

• Is a legal act a condition for the production of spatial statistics?

• Do we need a European legal act for a spatial reference framework for statistics (input and output)?

• Are legal acts required to ensure access of NSIs to the required geospatial data sources?

• How can we use INSPIRE as a legal instrument to improve the situation for spatial statistics?

Member States should make it mandatory to geocode all administrative and statistical data sources, either directly or indirectly by means of an identifier linking to a geocoded data source. This should be part of their national legislation on public sector information and e-government.

Legal barriers that prevent NSIs from using geospatial information should be removed. NSIs should be obliged to use existing authoritative geospatial reference data for statistical purposes, to avoid double work and ambiguous information.

The spatial reference framework for geocoding administrative data sources must be in line with the future recommendation for a single address, building, and dwelling register (see section 6.1). From a technical perspective, the INSPIRE data definitions for addresses and buildings, with their legal status, should be considered as an opportunity to advance the harmonisation of address and building registers within countries, but also at the European level.

At the European level, the ESS should work on a legal proposal for a revised territorial reference framework for statistical output areas, expanding the NUTS to other output areas, e.g. grids. Statistical framework regulations, e.g. on social statistics, should avoid defining output areas and leave this to implementing regulations, which can then refer to the territorial framework regulation.

Eventually this territorial framework should also include the obligation to geocode statistical microdata to the spatial reference framework.

The ESS should increasingly aim at defining spatial statistics as Official Statistics, starting with 1 km² population grids as part of the post-2021 census regulation.

NSIs should actively participate in the creation of national data pools, as part of the INSPIRE implementation, both as users and as producers of geospatial information.

6.5 Production and IT

• What are the specific challenges in the production of spatial statistics compared to statistics based on the NUTS classification?

• To what extent are production processes documented from a spatial perspective, is there a need to work on a documentation standard for spatial statistics?

• Are there specific IT tools for the production of spatial statistics that might be shared between NSIs?

One of the key results of the Taskforce was that the integration of statistical and geospatial information should not be reduced to the integration of finished products on the user side. On the contrary, the production of spatial statistics should be recognised as an integral part of the full statistical production process. A location-oriented production process may not even result in a spatial statistics product but may instead aim to improve the production of a traditional official statistics product.


The production of spatial statistics should be incorporated into the regular statistical production process in line with the Generic Statistical Business Process Model (GSBPM). Designing a separate production process model for spatial statistics should be avoided. A special focus should be on the planning and design phase. As an example, the incorporation of geospatial data sources should be included in the design phase of a statistical product, and spatial statistics may then be just an additional output of this production process. This common framework and common language will facilitate the cooperation of spatial and statistical experts and the development of a common understanding of spatial statistics and their production specifics.

In the same way geospatial technologies and methods should be incorporated into the regular production process as early as useful from a production perspective. Statistical data warehouses should be developed supporting spatial ETL.
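One elementary spatial ETL step is the assignment of a geocoded record to a grid cell. The sketch below assumes that coordinates are already available in metres in a projected, equal-area reference system, and it builds a cell code from the lower-left corner of the cell, modelled on the style of codes used for European 1 km grids. The function name `grid_cell_code` and the exact code format are assumptions made for illustration.

```python
import math


def grid_cell_code(easting_m: float, northing_m: float, cell_size_m: int = 1000) -> str:
    """Return a code for the grid cell that contains the given point.

    The code is built from the lower-left corner of the cell, expressed in
    units of the cell size (assumed naming convention).
    """
    col = math.floor(easting_m / cell_size_m)
    row = math.floor(northing_m / cell_size_m)
    if cell_size_m % 1000 == 0:
        label = f"{cell_size_m // 1000}km"
    else:
        label = f"{cell_size_m}m"
    return f"{label}N{row}E{col}"


# Illustrative point with made-up projected coordinates in metres.
code_1km = grid_cell_code(4695400.0, 2599700.0)
code_100m = grid_cell_code(4695400.0, 2599700.0, cell_size_m=100)
```

Storing such a cell code next to each record in the data warehouse would allow later aggregation to grids without repeating the spatial operation.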

The sampling frame for surveys should be georeferenced in the same way as enumerations and administrative data sources. Knowing the location of statistical units in the sampling frame will allow the survey design to be optimised for the production of spatial statistics and may simplify the survey operation.

The ESS should aim at developing harmonised documentation standards for the production of each of the spatial statistics products. This should include recommended geospatial data sources with defined quality criteria.

At a later stage the ESS may propose projects similar to the current series of VIP projects for common GIS tools. Harmonisation of tools may help to promote spatial statistics among users and producers.

Data disclosure control is one of the key processing steps and therefore a harmonised definition of data disclosure has top priority (see 6.3).

6.6 Organisation, human resources and skills

• How to inform managers and statisticians of the benefits of GIS and the integration of statistics and geospatial information?

• How to organise GIS work in NSIs, is there a best approach?

The increased integration of geospatial information into the statistical production process should be supported by bringing geospatial experts into close contact with production teams. This may include organisational changes.

The cooperation between GIS experts and statistical officers should be increased and the spatial analysis use case brought more into the focus of NSIs. This requires special skills that need to be imparted to statistical experts in NSIs.

Managers and statistical officers should be made aware of the benefits of spatial information. The existing ESTP training program should be expanded to make sure that as many statisticians as possible in the ESS receive at least basic training on GIS and learn about the potential of spatial analysis for statistics. Staff responsible for the census should get priority.

Showcasing the potential of data integration and proposing concrete projects between statistical and GIS experts will help to raise awareness. An ESS spatial analysis knowledge base should be created containing success stories and best practices.

6.7 Cooperation with NMCAs and other stakeholders

• What forums for cooperation should be supported?

• What role should NSIs play in the UN-GGIM initiative?

There is a general feeling on both sides that the cooperation between NSIs on the one hand and other producers of geospatial information, mainly NMCAs, on the other could be improved. The lack of regular contacts, joint projects and working relationships between NSIs and NMCAs is often a major obstacle to better cooperation and data sharing. A more regular dialogue between the two types of organisation, at both national and European level, is the foundation for intensified cooperation between NMCAs and NSIs.

Senior management in NSIs and NMCAs have already confirmed the strategic importance of the integration of statistical and geospatial information, but this support now has to be transformed into operational projects. Mutual benefits for both NSIs and NMCAs should arise from this strategic partnership of the two most important providers of public authoritative information.

The establishment of UN-GGIM: Europe is a major success in this regard. For the first time there is now an official intergovernmental group in Europe dealing with the topic. In particular the creation of working group B (footnote 8) on data integration represents a very important step. Statistical and geospatial governmental experts have received the mandate from their senior management to focus on the topic from a holistic perspective.

Thanks to the UN-GGIM process it is expected that NMCAs will become much more involved. This stronger participation will also lead to a better consideration of European and NSI requirements for core geospatial data. To grasp this opportunity NSIs should also participate in the discussions on core data in working group A and formulate their requirements as regards object concepts, scale levels and range of attributes.

The support to sustainable development initiatives at the national, European (Europe 2020), and global (Sustainable Development Goals) level will increase the demand for new types of integrated data products. These products should be defined and prepared as part of the UN-GGIM process.

Several European projects have been or will be dealing with the integration of statistical and geospatial information. Currently the main ones are ELF (footnote 9) and GEOSTAT 2. However, the already launched ELF project is so far perceived by NSIs as mainly an INSPIRE- and NMCA-oriented project that does not focus on data integration. NSIs should be better informed of the benefits of ELF for their work.

From an organisational viewpoint, a streamlined organisational structure for Europe and within Member States for the coordination of European spatial statistics is necessary. This organisational structure should integrate with existing structures of the ESS and NMCAs at national and European level. A balanced representation of NSIs and NMCAs in these groups is vital. At the national level, the topic of data integration and the cooperation of NSIs and NMCAs as part of UN-GGIM: Europe should be put on the agenda of national high-level groups dealing with public sector information.

Another focus should be on a coordinated implementation of INSPIRE. The implementation of INSPIRE should be seen as one of the most important vehicles to foster the cooperation between NSIs and NMCAs. Since INSPIRE is a legal act that has to be implemented in all Member States, and since it covers many topics that concern NSIs and NMCAs alike (data themes, geoportals, data sharing), these two types of organisation have to cooperate in any case and discuss their responsibilities under INSPIRE. If not already in place, national task forces between NSIs and other information authorities who are users and producers of geospatial information should be established as part of the INSPIRE implementation process. This concerns the technical implementation, the legal implementation, data policies and data licenses, in short all topics that relate to the integration of statistical and geospatial information and are discussed in this report.

These national task forces and the GISCO working group should also be used to familiarise NMCAs with the statistical production process and the geospatial-statistical data integration process.

To avoid a proliferation of expert and working groups, the GISCO working group should be the platform for UN-GGIM: Europe to discuss, at the European level, the integration of information with a specific focus on statistical and geospatial information.

8 http://un-ggim-europe.org/content/wg-b-data-integration

9 http://www.elfproject.eu/


The EFGS should continue to play its role as a network of experts working on the integration of statistical and geospatial information. In this capacity it should be recognised by the ESS and UN-GGIM: Europe as an important stakeholder and observer.

The series of EFGS conferences must be continued and expanded to cover the integration of statistical and geospatial information even more. The programme should be made more attractive for NMCAs. In addition to organising the annual EFGS conference, the EFGS should act as an expert group for joint projects and formulate position papers.

In terms of concrete projects, NSIs and NMCAs, guided by UN-GGIM: Europe and the ESS, should work on one or several joint flagship projects. The main project will be the preparation and operation of the 2021 census as a fully integrated operation. The GISCO working group, together with the Eurostat census working group and related task forces, should be the technical reference groups for this project. In the future, projects like ELF should be planned as joint projects of NSIs and NMCAs.

As a result of national and European projects there are already several success stories that are worth telling. The UN-GGIM: Europe working group B has started collecting use cases showcasing the value of information integration. This body of knowledge will grow fast in the coming years and should be shared via the UN-GGIM website.

The maintenance of the EFGS website, or its extension to a website on the integration of statistical and geospatial information, should also be addressed. Coordination with UN-GGIM websites is needed.

The ultimate shared vision of NSIs and NMCAs should be the creation of a GGIM as an information system for evidence-based decision making with a focus on sustainable development. The exact definition of this GGIM is a major undertaking and should be carried out by NSIs and NMCAs as partners on an equal footing.

6.8 A shared project – Geocoding the census 2021

As mandated by the ESSC, the Census 2021 is the key project to advance the integration of statistical and geospatial information.

For the 2021 census the current strategy of the ESS is to keep the regulatory framework as it is. This means that population grids cannot be made mandatory outputs of the 2021 census, and a voluntary mechanism to provide data for the GEOSTAT 2021 grid, similar to the GEOSTAT 2011 process, will have to be adopted.

On request of the ESSC the 2021 census should be based on registers. These registers should be geocoded to the address, building and dwelling registers mentioned above.

The range of topics to be represented on population grids should be discussed and agreed between census experts and GIS experts in NSIs, taking into consideration user needs and data protection aspects.

UN-GGIM: Europe should be the forum for cooperation between NSIs and owners of essential geospatial information for the 2021 census. A joint workshop on geocoding the census should be organised among NSIs and NMCAs in 2015 or early 2016, and a task force should be established to draw up an action plan for national and European actions. The GISCO working group should also be involved.

The regulatory framework of the post-2021 census will probably continue to focus on output harmonisation. The current discussion in the ESS census task force makes it likely that population grids will become an official output of censuses after 2021. Future censuses will most likely make extensive use of administrative data sources, in line with the ESSC recommendations. The geocoding aspect of this increased use of administrative data sources needs to be coordinated with NMCAs as part of UN-GGIM: Europe, and the results of the GEOSTAT 2 project need to be taken into account.


7 PRIORITIES FOR EUROPEAN HARMONISATION

Many NSIs of the ESS have created a national system for the integration of statistical and geospatial information and have spatial statistics in their product portfolio. Often this has been a positive side-effect of the on-going implementation of INSPIRE.

However, despite much commonality in terms of concepts, processes and products, this area is poorly harmonised at the European level.

The purpose of harmonised European statistics is to enable users to make comparisons between different countries. Spatial statistics can enhance international comparison by adding the territorial, small-area and local dimension. This could allow, for example, cities and regions in Europe to analyse social and economic characteristics at this level and to compare their performance. Often regions face similar challenges and have similar conditions, but there are also marked differences that require specific approaches.

Official European statistics are normally harmonised in terms of outputs. In this sense the definition of core spatial statistical products is the area that requires most European harmonisation. This includes the harmonisation of data disclosure control practices, metadata for spatial statistics and a specific quality assessment framework.

In particular, the size of the output areas for spatial statistics should be harmonised and should allow the study of areas ranging from the individual block level in a city to the commuting zone of a cross-national metropolitan area. Hierarchical grid systems are one candidate for such a harmonised output system.
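The hierarchical property of grid systems can be illustrated with a short sketch: each small cell nests in exactly one larger cell, so counts can be aggregated without any spatial overlay. The sketch assumes cells identified by the column and row index of their lower-left corner in kilometres; the function names and the population figures are made up for illustration.

```python
from collections import Counter
from typing import Dict, Tuple

Cell = Tuple[int, int]  # (column, row) index of the lower-left corner, in km


def parent_cell(col_km: int, row_km: int, factor: int = 10) -> Cell:
    """Return the coarser cell (e.g. 10 km) that contains a 1 km cell."""
    return (col_km // factor, row_km // factor)


def aggregate(cells: Dict[Cell, int], factor: int = 10) -> Dict[Cell, int]:
    """Sum the counts of fine cells into their parent cells."""
    totals: Counter = Counter()
    for (col, row), population in cells.items():
        totals[parent_cell(col, row, factor)] += population
    return dict(totals)


# Illustrative 1 km cells with made-up population counts.
cells_1km = {(4695, 2599): 120, (4696, 2599): 30, (4701, 2599): 8}
cells_10km = aggregate(cells_1km)
```

The same aggregation can be repeated with a larger factor to move up the hierarchy, for example from 1 km cells to 100 km cells.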

However, as outlined above, the Taskforce believes that it is necessary to extend this harmonisation upstream in the production flow. In this sense the harmonisation should also cover several core geospatial data sources that are necessary for all NSIs to produce spatial statistics with the required quality. The most important of these is a point-based spatial reference framework for statistics (see 6.1).

The design of information systems supporting sustainable development is likely to require new types of information objects. European harmonisation may help remove obstacles related to different conceptions of “information” within the statistical community and the cartographic community. Rather than creating different objects for the same real-world phenomena, objects with geographical attributes describing the ‘Where’ and with statistical attributes describing the ‘What’ and ‘When’ are needed. In a next round of harmonisation, a common understanding of geospatial-temporal-statistical information, beyond a simplified division of information into maps or statistics, is needed. This includes the definition of appropriate scales for all objects and their generalisations.
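As a purely illustrative sketch of such an information object, the structure below combines a geographical attribute, statistical attributes and a reference date in one record. The class name, the field names and the example values are hypothetical and do not represent any agreed European data model.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Tuple


@dataclass
class IntegratedObject:
    """One real-world object described once, instead of separately in maps and tables."""

    object_id: str
    where: Tuple[float, float]   # geographical attribute: projected x, y in metres
    what: Dict[str, float]       # statistical attributes, e.g. counts or indicators
    when: date                   # reference date of the statistical attributes


# Illustrative instance with made-up values.
building = IntegratedObject(
    object_id="BLD-0001",
    where=(4695400.0, 2599700.0),
    what={"residents": 12, "dwellings": 5},
    when=date(2021, 1, 1),
)
```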

8 FUTURE ROLE OF THE TASKFORCE

When the DIMESA decided to establish this Taskforce, the only groups working in this area were the GISCO Working Group and in a more informal way the EFGS. UN-GGIM: Europe was only in the process of creation. Now, two years later UN-GGIM: Europe has been established, and a global UN-GGIM expert group and the UN-GGIM: Europe working group B on the topic of data integration have been created. Moreover an ESS taskforce on the census has been created that among other things discusses the geocoding of census information and the use of administrative data sources.

There are therefore concerns that too many groups deal with the topic of geospatial-statistical information integration and that a consolidation is necessary in order not to overstrain the resources of NSIs. There is also a feeling that setting up this Taskforce as an ESS task force is too restrictive, as the topic requires intensive cooperation with NMCAs. For administrative reasons, ESS taskforces face certain restrictions as regards the number of members.


Lastly, the GEOSTAT 2 project will now start working on a key topic of this Taskforce, the spatial reference framework, and will develop recommendations on its implementation. It might therefore be advisable to wait for the results of GEOSTAT 2 before continuing work on a strategy for geocoding statistics and for using administrative data sources more intensively.

The Taskforce therefore invites the GISCO working group and the DIMESA to discuss whether, for the moment, this Taskforce in its current setup should continue working in parallel to the UN-GGIM: Europe working groups.

One proposal could be to separate responsibilities and for instance merge temporarily this Taskforce with the UN-GGIM: Europe working group B on data integration. Working group B could focus on those priority actions identified above that mainly concern the role of NMCAs in geospatial information management and data integration (data sources, certain legal aspects) and the cooperation between NSIs and NMCAs.

This Taskforce could focus on the aspects of information integration that are more internal to NSIs such as products, statistical confidentiality and statistical production aspects. For capacity, content and planning reasons further meetings could be adjourned until the results of the GEOSTAT 2 project become available at the end of 2016, and clear directions from the ESS census task force and from the potential UN-GGIM: Europe working group on geocoding the census become available.

9 FURTHER REFERENCES AND DOCUMENT REPOSITORY

GEOSTAT 1A and GEOSTAT 1B final reports www.efgs.info

Conclusions of the High Level workshop 2012: https://circabc.europa.eu/d/a/workspace/SpacesStore/4f6c0a0e-1450-43c6-b7f4-3f15bc69636a/E_GIS_11%20Minutes%20_%20V2.doc

All documents of the task force are stored in a CIRCABC repository and are publicly available: https://circabc.europa.eu/w/browse/6370e174-afd4-4879-b326-74c875299ada

10 PARTICIPANTS

Members were the NSIs of AT, BE, FI, PL, PT, SE, UK. They were selected with the goal to have a good representation of the degree to which the integration process was advanced in countries.

Ingrid KAMINGER – Statistics Austria

Bruno KESTEMONT – Statistics Belgium

Marja TAMMILEHTO LUODE – Statistics Finland

Janusz DYGASZEWICZ – Statistics Poland

Ana SANTOS – Statistics Portugal

Jerker MOSTRÖM – Statistics Sweden

Ian COADY – Office for National Statistics UK

Gunter SCHÄFER – Eurostat

Ekkehard PETRI – Eurostat

11 GLOSSARY OF SELECTED ACRONYMS

NMCA National Mapping and Cadastral Authority

NSI National Statistical Institute

UN-GGIM United Nations Global Geospatial Information Management

DIMESA Directors of Environmental Statistics and Accounts


EFGS European Forum for Geography and Statistics

ESS European Statistical System

ESSC European Statistical System Committee

VIP Vision Implementation Project

ESTP European Statistical Training Program

ELF European Location Framework project


ANNEX I – SUMMARY OF THE STATE OF AFFAIRS OF SPATIAL STATISTICS

1 DATA SOURCES

Geospatial data sources for spatial statistics can be divided into two main categories:

• Geospatial data used to directly geocode data sources for statistics, i.e. the spatial reference framework. As an example, the address information in a person or business register should be linked to an address in the address register, which is fully geocoded. It is not recommended to geocode the person register directly.

• Data used for the production of spatial statistics. This includes the above reference framework but also other geospatial data sources.
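The indirect geocoding described for the first category can be sketched as a simple key lookup: the person register carries only an address identifier, and the coordinates come from the geocoded address register. The register contents, identifiers and the function name `geocode_indirectly` below are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical geocoded address register: address identifier -> (x, y) in metres.
address_register: Dict[str, Tuple[float, float]] = {
    "ADDR-001": (4695400.0, 2599700.0),
    "ADDR-002": (4696150.0, 2599120.0),
}

# Hypothetical person register holding only an address identifier per person.
person_register: List[dict] = [
    {"person_id": "P1", "address_id": "ADDR-001"},
    {"person_id": "P2", "address_id": "ADDR-001"},
    {"person_id": "P3", "address_id": "ADDR-999"},  # no match in the address register
]


def geocode_indirectly(persons: List[dict], addresses: Dict[str, Tuple[float, float]]):
    """Attach coordinates to persons via the address identifier; report unmatched records."""
    geocoded, unmatched = [], []
    for person in persons:
        coords = addresses.get(person["address_id"])
        if coords is None:
            unmatched.append(person["person_id"])
        else:
            geocoded.append({**person, "x": coords[0], "y": coords[1]})
    return geocoded, unmatched


geocoded, unmatched = geocode_indirectly(person_register, address_register)
```

Keeping coordinates only in the address register means that a corrected address location is picked up by every linked register at the next lookup. The list of unmatched records gives a simple indicator of linkage quality.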

Several datasets with similar scope and applications may exist in countries but are maintained by different organisations, often at different administrative levels and with different scales. For certain datasets national specifics prevent the adoption of general EU standards. As an example, the UK has an address register but not a building register, and will not be able to create one in the short term. Concepts of addresses and buildings and their coordinates may also vary across countries. However, INSPIRE helps to improve structural and often also semantic harmonisation. For instance, buildings and addresses, which are the most important geospatial reference objects for statistics, are also INSPIRE themes.

Most of these data are from sources external to NSIs and hence access to them is a key factor. As part of the implementation of INSPIRE, access conditions have been improving but obstacles remain. Typical restrictions or obstacles are high license fees, restrictive use conditions and complicated data acquisition. High-resolution aerial imagery and derived land cover products are most frequently subject to access restrictions that hamper their incorporation into a regular statistical production process.

Another limiting factor is that data models mainly meet mapping and cartographic requirements rather than statistical requirements. Update cycles of geospatial information that are not aligned with the update cycles of the statistical data with which they are to be integrated are also a major obstacle to data integration.

National data pools for all public sector geospatial information have proven to be extremely successful and beneficial for statistical purposes. They are also an important forum for feedback from NSIs to NMCAs on how geospatial data needs to be improved.

Experience shows that improved access to data, or access to new geospatial data sources, improves the flexibility of the statistical production process and as such results in improved quality of existing statistics. It also stimulates innovation in the form of new products based on combinations of previously unrelated data sources.

Documentation of the quality of geospatial information is vital to assess its fitness for purpose for spatial statistics with a defined product quality. A systematic use of geospatial information in statistical production requires a deeper understanding of the origin of the external geospatial data, the context in which they were created and for what purpose. However, INSPIRE does not define quality requirements and other standards are not available.

2 SPATIAL STATISTICS PRODUCTS

Spatial statistics products and services can be divided into standard products, tailor made products and products only for internal usage within NSIs.

In most cases spatial statistics are densities or counts of statistical units (persons, workplaces, houses) in the selected output area. Spatial output areas may be administrative or non-administrative. Typical output areas for spatial statistics products are the smallest administrative areas of a country, or even subareas down to the sub-LAU2 or block level in cities. Often only confidentiality concerns limit the spatial resolution. River catchment areas are a typical example of a non-administrative, often even cross-border, output area. Another important type of non-administrative spatial statistics is land-use and land cover information.

Statistical grids have been successfully introduced as standard products of NSIs and answer an increasing demand for high resolution information. They represent a good compromise between industrial production methods, data protection and demand for the most detailed information.

Most spatial statistics products have a related Official ‘mother’ Statistics with an administrative area as output area. In most cases, spatial statistics products could be created from the same microdata base as current Official Statistics, but with higher spatial resolution and sometimes with more attributes. As an example, the production system for population densities at NUTS 3 level, for 1 km² grids or for 100 m × 100 m grids from Census microdata is essentially the same.
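The point that the production system is essentially the same can be illustrated with a sketch in which one set of geocoded micro records is counted twice, once by an administrative code and once by grid cell. The records, the region code `XX001` and its area are made up for illustration, and the helper name `count_by` is hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, Hashable, List

# Hypothetical geocoded micro records: one row per person.
records: List[dict] = [
    {"nuts3": "XX001", "x": 4695400.0, "y": 2599700.0},
    {"nuts3": "XX001", "x": 4695900.0, "y": 2599100.0},
    {"nuts3": "XX001", "x": 4697200.0, "y": 2600300.0},
]


def count_by(rows: List[dict], key: Callable[[dict], Hashable]) -> Dict[Hashable, int]:
    """Count records per output area; only the key function differs between products."""
    return dict(Counter(key(row) for row in rows))


# Same microdata, two output area definitions.
by_region = count_by(records, lambda r: r["nuts3"])
by_cell = count_by(records, lambda r: (int(r["x"] // 1000), int(r["y"] // 1000)))

# Densities in persons per km²; the region area is an assumed value.
region_area_km2 = {"XX001": 150.0}
region_density = {code: n / region_area_km2[code] for code, n in by_region.items()}
cell_density = {cell: n / 1.0 for cell, n in by_cell.items()}  # each cell covers 1 km²
```

Only the key function changes between the two products. Disclosure control, which often limits the finer resolution, is not part of this sketch.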

Spatial statistics may be an end-product or be used in combination with other data to produce further indicators. As an example, population grids combined with road networks may be used to define catchment areas of public services. Spatial statistics might also be used to improve the production process by better organising the field operation of surveys.
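A simplified version of the catchment area example is sketched below. Note that it replaces road-network distance with straight-line distance between a service location and grid cell centres, which is a deliberate simplification; the coordinates, population figures and the function name `population_within` are hypothetical.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]  # projected x, y in metres


def population_within(cells: Dict[Point, int], service_xy: Point, radius_m: float) -> int:
    """Sum the population of grid cells whose centre lies within radius_m of the service.

    Straight-line distance is used here as a stand-in for network distance.
    """
    sx, sy = service_xy
    total = 0
    for (cx, cy), population in cells.items():
        if math.hypot(cx - sx, cy - sy) <= radius_m:
            total += population
    return total


# Illustrative 1 km cell centres with made-up population counts.
cell_centres = {
    (4695500.0, 2599500.0): 120,
    (4696500.0, 2599500.0): 30,
    (4701500.0, 2599500.0): 8,
}
served = population_within(cell_centres, (4695800.0, 2599500.0), 1000.0)
```

A production version would compute travel distance or travel time along the road network, which requires a routable network dataset.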

Spatial statistics are in most cases disseminated as tables. The dissemination as geospatial data in the strict sense is only emerging and mainly driven by the requirements from INSPIRE to share spatial data. INSPIRE and its annexes form a framework for defining spatial statistics products and the deadlines set out under INSPIRE will help NSIs to make progress in developing spatial statistics products. INSPIRE and the development of geoportals have already increased the visibility of and demand for spatial statistics. Geoportals are an increasingly important area for cooperation with NMCAs as spatial statistics often represents the most interesting content in geoportals.

NSIs will have to develop corresponding new dissemination strategies, including transforming spatial statistics into the INSPIRE spatial data models.

Map products or spatial services are generated from tables in a secondary step, using key relationships between statistics and spatial features. Web mapping applications are now a standard way of disseminating regional and spatial statistics.

Metadata on spatial statistics represent a hybrid form of metadata, covering spatial aspects, which are partially also covered by INSPIRE metadata, and statistical aspects. A joint metadata format, e.g. one integrating the SDMX and INSPIRE standards, is missing. The data quality template developed under the GEOSTAT 1B project (footnote 10) represents a first attempt to design a metadata standard for spatial statistics.

In many national statistical systems and in the ESS spatial statistics products do not belong to the portfolio of Official Statistics that is normally regulated by law. As a result they are not covered by corresponding appropriations and NSIs often have to recover at least partly the cost for their production by offering them to the market.

Detailed spatial statistics with a wide range of attributes and high spatial resolution are still a unique selling point of NSIs, and NSIs are able to create significant revenues from these products.

At the same time there is a trend to free and open data and linked data. Basic spatial statistics are now increasingly made available under open data license terms. As an example, core variables of population statistics on 1km² grids which used to be commercial data are now open data in Finland. This trend to open data has already increased the general demand for spatial statistics.

In terms of user needs, tailor-made spatial statistics products are often a great opportunity to satisfy customers and have the potential to increase the reputation of statistical offices. Those countries that have developed spatial statistics products observe a stable or even growing demand, and customers are generally satisfied. User requirements go in the direction of more data made available at high geographical resolution, an increased range of attributes, or a higher update frequency. However, NSIs also have concerns that their users are not ready for overly advanced spatial statistics products that require a high level of expertise to use properly. The user community for such advanced products is small, and a cost-benefit analysis for creating these complex products and offering them as open data needs to be carried out.

Users of spatial statistics come from other public administrations, research and businesses. The main demand for spatial statistics is for data on agriculture, land cover, and demography, and for the calculation of indicators. The main applications are regional planning, the definition of development zones, and the delineation of settlements. Another important use case is benchmarking with other countries at a regional to local scale. Comparisons between countries are typically best done with grid systems as output areas, while equal-count output areas have advantages with respect to data protection.

3 STATISTICAL CONFIDENTIALITY

Ensuring statistical confidentiality is a particular challenge for spatial statistical products. Small output areas with few statistical records dramatically increase the risk of disclosure. The Taskforce identifies statistical confidentiality as the biggest challenge for the dissemination of spatial statistics products. The discussion focussed on data protection for population statistics, but many of the principles also apply to other statistical units (businesses, farms).

For their spatial statistics products, NSIs follow the general legal requirements on disclosure control that apply to all statistics. Usually there is no specific regulation for the spatial dimension. This means that no individual may be identifiable within an output area.

Essentially two approaches to data protection were discussed:

Suppression and aggregation to threshold values at the level of the statistical product;

Record swapping in the microdata base.

The main weakness of suppression is that breakdowns may not add up to totals. In sparsely populated areas a large number of grid cells might be affected, in particular if cross-tabulations are to be built.
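The add-up problem can be made concrete with a small sketch. The threshold of 3, the cell names and the counts are assumptions chosen for illustration only. Leaving zero counts visible is also an assumption and not a rule taken from this report.

```python
THRESHOLD = 3  # hypothetical minimum count; real thresholds are set by each NSI

# Hypothetical counts per grid cell, broken down by age group.
cells = {
    "cell_A": {"0-14": 5, "15-64": 20, "65+": 2},
    "cell_B": {"0-14": 1, "15-64": 2, "65+": 0},
}


def suppress(breakdown, threshold):
    """Replace counts above zero but below the threshold with None."""
    return {
        group: (None if 0 < count < threshold else count)
        for group, count in breakdown.items()
    }


def published_sum(breakdown):
    """Sum of the values that remain visible after suppression."""
    return sum(value for value in breakdown.values() if value is not None)


for cell_id, breakdown in cells.items():
    total = sum(breakdown.values())
    protected = suppress(breakdown, THRESHOLD)
    print(cell_id, protected, "total:", total, "visible sum:", published_sum(protected))
```

For `cell_A` the visible parts sum to 25 while the true total is 27. The sketch deliberately ignores secondary suppression: publishing the total next to a single suppressed part would allow that part to be recomputed, so a real procedure needs further steps.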

Record swapping has the advantage of respecting totals in the target output area. It is criticised, however, for intentionally creating wrong microdata, which can harm the reputation of NSIs if detected. This risk is likely to materialise only in detailed studies on small output areas. For European products, which tend to be used for large area studies with many records, the risk is minimal.
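A deliberately simplified sketch of the idea follows. It is not the method of any particular NSI: the records, the random pairing and the function `swap_records` are assumptions for illustration, and real swapping procedures typically select records by risk and match them on key characteristics.

```python
import random
from collections import Counter

# Hypothetical microdata: one record per person with a grid cell and an attribute.
records = [
    {"id": 1, "cell": "A", "age_group": "65+"},
    {"id": 2, "cell": "A", "age_group": "15-64"},
    {"id": 3, "cell": "B", "age_group": "0-14"},
    {"id": 4, "cell": "B", "age_group": "15-64"},
    {"id": 5, "cell": "C", "age_group": "65+"},
    {"id": 6, "cell": "C", "age_group": "0-14"},
]


def swap_records(data, n_swaps, seed=0):
    """Return a copy in which pairs of records from different cells exchange cells."""
    rng = random.Random(seed)
    swapped = [dict(record) for record in data]
    done = 0
    attempts = 0
    # The attempt limit guards against endless loops, e.g. if all records share one cell.
    while done < n_swaps and attempts < 1000:
        attempts += 1
        first, second = rng.sample(swapped, 2)
        if first["cell"] != second["cell"]:
            first["cell"], second["cell"] = second["cell"], first["cell"]
            done += 1
    return swapped


protected = swap_records(records, n_swaps=2)

# Each cell keeps its number of persons, and the overall age distribution is unchanged.
cell_totals_before = Counter(record["cell"] for record in records)
cell_totals_after = Counter(record["cell"] for record in protected)
age_totals_before = Counter(record["age_group"] for record in records)
age_totals_after = Counter(record["age_group"] for record in protected)
```

Because records are exchanged one for one, the population total of every cell is preserved, while the attribute breakdown within an individual cell may change. This is the distortion discussed in the text.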

Record swapping can be done in various ways, e.g. for various output systems, with different impacts on the data and their sensible use. Hence standardisation of swapping techniques across countries may not be advisable. At this stage country-specific approaches seem to be more appropriate, with due consideration e.g. of the territorial breakdowns of administrative and non-administrative areas. Record swapping specifically for grid systems does not seem feasible. The lack of standardisation, on the other hand, may confuse users, who would have to understand and possibly control various data protection methods.

In principle the effect of record swapping on the distortion of the data can be controlled and minimised. The best compromise between data protection and distortion of the data is achieved if the actual use of the data is known. However, given the large effort needed for record swapping, different products with specific approaches do not seem feasible.

Some countries combine record swapping on the full microdata base with suppression, which then allows them to apply much lower suppression thresholds.

The actual technical implementation is in most cases based on an internal evaluation of the disclosure risk and the most suitable practice. The definition of suppression thresholds is based on an internal assessment, but European harmonisation would be welcome.

While record swapping was accepted by users in some countries, other NSIs are sceptical with regard to user acceptance or wrong use of data due to ignorance. They feel that properly communicating the concept of swapping to users is difficult, and that it cannot be ensured that all users would fully understand the advanced topic of swapping and use the data sensibly.

In countries with a longstanding tradition in grid statistics, suppression with its known shortcomings has been accepted by users, and any change has to be carefully considered in this regard.

In either case several countries made the distinction between sensitive topics (religion, country of origin, marital status) and non-sensitive topics (sex, age). While for some countries all topics are sensitive below a certain count, others are more relaxed and even show a grid cell with only one inhabitant and his/her gender. On detailed breakdowns and cross-tabulations most countries apply suppression or use classifications like age groups. In general terms there seems to be a trend towards a less strict definition of privacy for non-sensitive topics, and the list of non-sensitive topics is likely to grow in the coming years.
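The distinction between sensitive and non-sensitive topics can be expressed as a topic-dependent publication rule. The sketch below is purely illustrative: the threshold values 10 and 1 are assumptions, the topic list simply repeats the examples given above, and `may_publish` is a hypothetical helper.

```python
# Hypothetical thresholds: sensitive topics need larger counts before publication.
THRESHOLDS = {"sensitive": 10, "non_sensitive": 1}

# Topic classification taken from the examples in the text.
TOPIC_CLASS = {
    "religion": "sensitive",
    "country_of_origin": "sensitive",
    "marital_status": "sensitive",
    "sex": "non_sensitive",
    "age": "non_sensitive",
}


def may_publish(topic, count):
    """Decide whether a cell count for the given topic may be shown."""
    # Unknown topics are treated as sensitive, the cautious default.
    topic_class = TOPIC_CLASS.get(topic, "sensitive")
    return count >= THRESHOLDS[topic_class]
```

With these assumed values, `may_publish("sex", 1)` is true, matching the relaxed practice of showing a cell with a single inhabitant, whereas `may_publish("religion", 1)` is false.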

4 LEGAL FRAMEWORK

A large body of national and European legal acts defines the mandate of NSIs and Eurostat and the production of statistics in Member States and in the ESS. Official Statistics and the role of NSIs are normally regulated by legislation, in line with the Code of Practice for European Statistics. This legal framework normally also regulates the rights and obligations of NSIs and data owners in terms of providing access to data sources, and in particular to administrative data sources and registers, for statistical purposes. It also often includes an obligation to use existing data sources before creating new data.

In countries with a longstanding and rich portfolio of spatial statistics, this legal framework often gives NSIs almost unlimited access to all kinds of georeferenced administrative data-sources for statistical purposes, or even requires that administrative data shall be georeferenced for statistical purposes. However access may not be free of charge.

Voluntary, ad hoc arrangements (e.g. for single statistical data collection) with data providers are difficult to manage and usually make a sustainable, long term policy for spatial statistics very challenging or even impossible.

One could therefore argue that an essential precondition for the production of spatial statistics is unhindered access, under favourable conditions, to all suitable types of geospatial data and geocoded administrative data sources. Typically, the easier the access to geospatial data, the larger the number of spatial statistics products and the more advanced they are.

In most countries spatial statistics are not yet an officially recognised part of Official Statistics and hence do not have a legal basis. As a result the right to access geospatial information may not be covered by the above provisions. In general statistical legal acts are mostly output oriented and leave the choice of data sources to statistical offices.

The creation, maintenance and management of registers and administrative data sources are normally also laid down in legal acts. These data sources are usually not managed by NSIs, which obtain access to them as part of their general data access arrangements or through contracts. Therefore a legal framework for the management of these data will not be part of statistical regulations, and statistical requirements might not be sufficiently taken into account in these provisions.

At the EU level several statistical regulations exist with spatial statistics aspects, e.g. the farm structure survey/census requiring the provision of holding locations. However, these EU regulations typically focus on output harmonisation and do not aim at, e.g., harmonising the georeferencing method and the data sources used for their production.

With regard to spatial outputs, existing statistical regulations are also typically domain specific, e.g. the Farm Structure Survey regulation, and are often confined by the possibilities of the data sources, e.g. of the surveys. Only in exceptional cases do legal acts define LAU2 as the smallest output area of official statistics.

An explicit definition of output areas in statistical legal acts may also be problematic. The decision to include the output areas for census information in the census regulation, Regulation (EC) No 763/2008, and the decision not to revise this regulation for the 2021 census have made it impossible to make population grids, a more recent development at the European level, a mandatory output of the census. A better approach would have been to leave the definition of output areas to an implementing regulation, which provides for more flexibility with regard to new requirements. A cross-cutting legal framework similar to the NUTS classification is missing for smaller output areas.

INSPIRE provides a legal framework for making geospatial data available, not only for statistics, but as part of a wider movement. The obligation to share data without too many restrictions has already improved, and will continue to improve, the access of NSIs to important geospatial data sources. INSPIRE also makes it possible to obtain access to data from other countries under the same conditions as national stakeholders.

Conversely NSIs now often have the legal responsibility for the maintenance of certain INSPIRE themes, mainly from annex III e.g. statistical units and population distribution. This puts an obligation on NSIs to share these data with other users via national SDIs, from which NSIs may benefit in return.

5 PRODUCTION AND IT

One of the key results of the Taskforce is that integration of statistical and geospatial information should not be reduced to the integration of ready products on the side of the user. On the contrary the production of spatial statistics should be recognised as an integral part of the production process. A location oriented production process may not even result in a spatial statistics product but may have the goal to improve the production of a traditional official statistics product.

In an ideal world where all statistical microdata are associated with a matching spatial reference framework (see 6.1), the actual production of the spatial aggregates for spatial statistics products is not a complex task. The more additional geospatial and other data sources are available in an integrated manner, the more products, standard or tailor-made, an experienced data analyst will be able to create. The complexity of producing spatial statistics increases when this spatial framework is missing or is not accessible, and auxiliary data or workarounds have to be used to arrive at the same products.

Frequent production of spatial statistics (e.g. on an annual basis) requires an industrial production process. Many countries often are only able to produce spatial statistics in an ad-hoc manner, e.g. population grids from the Census 2011.

Geospatial data and statistical data are typically stored in separate databases during the production and linked by unique key relationships. Spatial data warehouses supporting spatial ETL are not very common yet. Currently most spatial queries have to be done outside the statistical production or dissemination databases.

The majority of statistical products are still traditional statistical tables. Hence a great deal of spatial statistics can be produced with normal database techniques by aggregation of statistical micro-data to various territorial breakdowns. This approach is more efficient in terms of IT resources than using genuine GIS techniques.
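The aggregation with normal database techniques mentioned here can be sketched with plain SQL, without any GIS software. The example uses an in-memory SQLite database and invented point coordinates. It assumes the coordinates are in metres in a projected system such as ETRS89-LAEA, and the cell code pattern `1kmN…E…` is modelled on codes commonly seen for European 1 km grids. Both are assumptions of this sketch rather than statements from the report.

```python
import sqlite3

# Hypothetical microdata: person id and projected coordinates in metres
# (assumed here to be ETRS89-LAEA; the values are invented).
persons = [
    (1, 4695200.0, 2599800.0),
    (2, 4695900.0, 2599100.0),
    (3, 4696100.0, 2599500.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER, x REAL, y REAL)")
conn.executemany("INSERT INTO person VALUES (?, ?, ?)", persons)

# Dividing by 1000 and truncating yields the lower-left corner of the 1 km cell
# in kilometres (valid for positive coordinates); GROUP BY counts persons per cell.
rows = conn.execute(
    """
    SELECT CAST(y / 1000 AS INTEGER) AS north_km,
           CAST(x / 1000 AS INTEGER) AS east_km,
           COUNT(*) AS population
    FROM person
    GROUP BY north_km, east_km
    ORDER BY north_km, east_km
    """
).fetchall()

# Build a cell code per grid cell; the code pattern is an assumption of this sketch.
grid_counts = {f"1kmN{north}E{east}": population for north, east, population in rows}
conn.close()
```

The same pattern works for any territorial breakdown stored as a key on the microdata record: only the grouping expression changes, which is why this approach needs fewer IT resources than genuine GIS processing.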

Specific GIS software is only required to manage the geographical representations of territorial breakdowns and to create spatial data formats, services and maps. Data transfer of geospatial information (e.g. coordinates of address register units) into the production systems is mostly done by file upload. Spatial services feeding spatial information into the production process are only emerging.

NSIs use various tools for the production and dissemination of spatial statistics, building on standard GIS and database software. In most cases these tools, which enhance the features of standard products, have been developed over years and are specific to the production process in the individual NSI.

6 ORGANISATION, HUMAN RESOURCES AND SKILLS

Traditionally GIS is associated with map making and its benefits for spatial analysis or data production are still often overlooked. This traditional role of GIS and the lack of awareness regarding its analytical and planning possibilities are quite often reflected in the organisation, where GIS experts sometimes work too far away from production teams. In particular, NSIs may neglect the potential of geospatial information for replacing expensive survey operations by more cost effective data collections using geocoded administrative data sources.

In reality spatial statistics production involves data acquisition, integration of the geospatial information with statistics, production and dissemination of spatial statistics products and requires a mix of statistical and geospatial expertise to ensure the quality and relevance of the products. In essence high quality spatial statistics require a permanent expert GIS team supporting all steps of the production process.

As regards production, there are different models for locating geospatial activities and GIS work within the organisation, and no clear success pattern can be identified. Either GIS experts are directly involved in the production of statistics within the various sectoral statistics teams (environment, demography), or a cross-cutting GIS centre of competence exists to support production, often located in the IT or methodological department.

GIS may also support users of statistics, both internal and external. Service provision to external users is often linked with the creation of spatial statistics on demand and represents an important source of income, e.g. tailor made small area statistics for businesses. Often the production of spatial statistics is not considered a core task and the geo-departments have to justify their existence, e.g. by creating revenues.

Often GIS experts cooperate with entirely different partners than the rest of the statistical offices. This concerns software, data sources and acquisition of expertise. This position outside the usual cooperation network of NSIs may make the organisation of this cooperation difficult and time consuming.

Normally the number of GIS experts per NSI is low, not more than 30, with the exception of those NSIs where GIS is located in regional offices and which are directly responsible for maintaining large geospatial data sets, e.g. address registers.

Given the low number of GIS experts, their daily business absorbs most of their time. Research, design and development of innovative products and production processes, as well as promotion activities showing the potential of spatial statistics, are therefore difficult to realise. This is considered a missed opportunity, as most GIS experts are convinced that spatial statistics has a huge potential for improving various statistical products and services. Statistical officers do not need to carry out all steps of analysis or production themselves, but they should be able to understand the concepts and imagine applications.

This missed opportunity might also be due to a lack of training and expertise in GIS among regular statisticians or even GIS experts. GIS and spatial analysis skills are now a scarce resource in most NSIs, and this limits the exploitation of spatial information for analysis. The organisation of GIS training is also difficult, as it does not belong to the regular training catalogue of an NSI or the ESS.

7 COOPERATION WITH NMCAS AND OTHER STAKEHOLDERS

Communication and coordination between NSIs and NMCAs are vital to avoid duplication of efforts and create synergies. Joint projects are an important catalyst for better cooperation.

Budget cuts may be seen as an opportunity to start more intensive cooperation and to avoid duplication of efforts.

The cooperation between NSIs and other providers of geospatial information or administrative data sources can take various forms. In most countries the maintenance of essential geospatial data and services (address register, building register, cadastre, topographic data) used for spatial statistics is managed outside the NSIs, often by NMCAs. This means that data exchange is the rule. Ideally this exchange is regulated in a legal act and/or based on long-term contracts, e.g. in national data pools. In these cases the cooperation works well. If a legal framework or another general contractual framework for cooperation between public authorities is missing, NSIs are often treated by the data owners like any other customer. This may make the acquisition of geo-data cumbersome and hinder the regular production of spatial statistics.

Within Member States, INSPIRE has started a process in which all geospatial stakeholders, including NSIs, cooperate systematically and regularly. As an example, geoportals are being opened by NMCAs to spatial statistics.

At the European level, until recently the GISCO working group has been the platform for cooperation. However this group deals mainly with technical issues, and strategic issues have not been discussed.

Regarding UN-GGIM: Europe, there is a concern among NSIs that for NMCAs the topic of data integration is elusive and of secondary interest. As a result NMCAs would focus too much on the work in Working Group A on core data, while NSIs will rather focus on Working Group B. There is also a feeling that NMCAs understand data integration from the product side, and not so much as an integration of data sources during production.

Several European projects have been or will be dealing with the integration of statistical and geospatial information. The main ones are ELF and GEOSTAT 2. The already launched ELF project is so far perceived mainly as an INSPIRE and NMCA oriented project, and does not focus on data integration. Involvement of NSIs in ELF is limited to non-core topics.
