
Giorgio Micheletti, IDC Italia

THE EUROPEAN DATA MARKET MONITORING TOOL

KEY FACTS & FIGURES, FIRST POLICY CONCLUSIONS, DATA LANDSCAPE AND QUANTIFIED STORIES

D2.9 Final Study Report


Prepared by:

Gabriella Cattaneo, Giorgio Micheletti, Mike Glennon, Carla La Croce (IDC) and Chrysoula Mitta (The Lisbon Council)

Internal identification

Contract number: N- 30-CE-0835309/00-96

EUROPEAN COMMISSION

Directorate-General for Communications Networks, Content and Technology

Directorate G - Data

Unit G1 — Data Policy and Innovation

Contact: [email protected]

European Commission B-1049 Brussels


LEGAL NOTICE

This document has been prepared for the European Commission; however, it reflects the views only of the authors, and the European Commission is not liable for any consequence stemming from the reuse of this publication. The Commission does not guarantee the accuracy of the data included in this study. More information on the European Union is available on the Internet (http://www.europa.eu).

PDF ISBN 978-92-76-19505-4 doi: 10.2759/72084 KK-01-20-355-EN-N

Manuscript completed in June 2020


Luxembourg: Publications Office of the European Union, 2020

© European Union, 2020

The reuse policy of European Commission documents is implemented by the Commission Decision 2011/833/EU of 12 December 2011 on the reuse of Commission documents (OJ L 330, 14.12.2011, p. 39). Except otherwise noted, the reuse of this document is authorised under a Creative Commons Attribution 4.0 International (CC-BY 4.0) licence (https://creativecommons.org/licenses/by/4.0/). This means that reuse is allowed provided appropriate credit is given and any changes are indicated.

For any use or reproduction of elements that are not owned by the European Union, permission may need to be sought directly from the respective rightholders.

EUROPE DIRECT is a service to help you find answers to your questions about the European Union

Freephone number (*): 00 800 6 7 8 9 10 11

(*) The information given is free, as are most calls (though some operators, phone boxes or hotels may charge you)


TABLE OF CONTENTS

ABSTRACT

RÉSUMÉ

EXECUTIVE SUMMARY

Quantifying the European Data Market – Key Facts & Figures

Describing the Data Market – The Quantified Stories

Mapping the Data Market – The Data Landscape and the Data Market Monitoring Tool

Acting Upon the Data Market – The Role of Policy

1. INTRODUCTION

1.1. Objectives

1.2. Methodological Approach

1.3. The Structure of this Report

2. QUANTIFYING THE DATA MARKET – KEY FACTS & FIGURES

2.1 Three future Development Paths: The Data Market at 2025

2.2 The Workforce Dimension: Data Professionals and Data Skills Gap

2.3 The Supply - Demand Dimension: The Data Companies

2.4 The Business and Economic Dimension: The Data Market and the Data Economy

2.5 The International Dimension - The Data Economy Beyond the EU – US, Brazil and Japan

3. DESCRIBING THE DATA MARKET – THE QUALI-QUANTITATIVE STORIES

3.1. Story 6-7 Health Data and Data-driven Innovation in the European Healthcare Industry

3.2. Story 8 - Accelerating the Impact of Data Commons

3.3. Story 9 – Scaling up data-driven innovation: European industry requirements and the role of European data spaces

4. MAPPING THE DATA MARKET – DATA LANDSCAPE AND DATA MARKET MONITORING TOOL

4.1 The EU Data Landscape

5. ACTING UPON THE DATA MARKET – THE ROLE OF POLICY

5.1 The Role of Policy and the Future of Europe’s Data Economy: The Three Scenarios

5.2 A change of pace in data policies

5.3 The EU Data Policy and the International Dimension

6. CONCLUSIONS

6.1 Quantifying the European Data Market – Key Facts & Figures

6.2 Describing the Data Market – The Quantified Stories

6.3 Mapping the Data Market – Data Landscape and Data Market Monitoring Tool

6.4 Acting Upon the Data Market – The Role of Policy

7. METHODOLOGICAL ANNEX

8. ESSENTIAL GLOSSARY – THE KEY INDICATORS


Abstract

This report presents a set of indicators measuring the number of data professionals, the value of the data market, the number of data supplier and data user companies and their revenues, and the overall impact of the data economy on EU GDP. All indicators are presented for the years 2018 through 2020 and forecast to 2025 according to three alternative potential scenarios: Baseline, High Growth and Challenge.

In particular:

• The total number of data professionals, their share of total employment in the EU and their intensity (i.e. their average number per company) have constantly increased throughout the period under consideration;

• Data companies, that is the organisations providing data (data suppliers) and those relying strongly on data (data users), have increased in number and share in the EU from 2018 to 2020 and are projected to continue their growth through 2025 under all three forecast scenarios;

• The value of the overall data market (i.e. the market where digital data is exchanged as products or services derived from raw data) as well as the value of the overall data economy (including the economic impacts generated by the data market) present the most dynamic picture and are expected to further increase up to 2025 under the three scenarios;

• The data skills gap, that is the gap emerging between the demand for and the supply of data workers, reveals a potential lack of supply of data skills in Europe across the period under consideration, particularly in the High Growth scenario;

• Finally, the report looks at the possible effects caused by the developments of the current Covid-19 pandemic. An additional post-Covid-impact scenario with estimates on the Data Market and the Data Economy in 2020 and in 2025 for the EU27 has been specifically developed and included.


Résumé

Ce rapport présente un ensemble d’indicateurs mesurant les professionnels des données, la valeur du marché des données, le nombre de sociétés fournisseurs et utilisateurs de données et leurs recettes, ainsi que l’incidence globale de l’économie des données sur le PIB de l’UE. Tous les indicateurs sont présentés pour les années 2018 à 2020 et offrent des prévisions jusqu’en 2025, explorant trois scénarios d’évolution potentiels : Scénarios de référence, de forte croissance et pessimiste.

En particulier:

• Le nombre total de professionnels des données, leur part dans l’emploi total dans l’UE et leur intensité (c’est-à-dire leur nombre moyen par entreprise) ont constamment augmenté tout au long de la période considérée ; Les sociétés de données, c’est-à-dire les entreprises fournissant des données (fournisseurs de données) et celles ayant une forte dépendance (utilisateurs de données) - ont augmenté en nombre et en part dans l’UE de 2018 à 2020 et devraient poursuivre leur croissance jusqu’en 2025 selon les trois scénarios de prévision ;

• La valeur du marché des données dans sa globalité (c.-à-d. le marché où les données numériques sont échangées en tant que produits ou services dérivés de données brutes) et la valeur de l'économie de la donnée dans sa globalité (y compris les incidences économiques générées par le marché des données) présentent la vision la plus dynamique et devraient continuer à augmenter jusqu’en 2025 selon les trois scénarios ;

• L’écart compétences-travailleurs dans le domaine des données - l’écart qui se dessine entre la demande et l’offre de travailleurs spécialisés dans les données - révèle un manque potentiel de compétences en données en Europe sur l’ensemble de la période considérée, s’agissant du scénario de forte croissance ;

• Enfin, le rapport examine les effets possibles causés par les développements de l’actuelle pandémie de Covid-19. Un scénario supplémentaire relatif à l’incidence du Covid, comprenant des estimations sur le marché des données et l’économie des données en 2020 et en 2025 pour l’UE27, a été spécifiquement développé et intégré.


EXECUTIVE SUMMARY

This is the Final Study Report (Deliverable D2.9) of the Update of the European Data Market Study (SMART 2016/0063), entrusted in 2016 to IDC and the Lisbon Council. The present document brings together the results and the activities carried out by the contractors under:

• The Final Report on Facts & Figures (D2.7) extending the measurement of the European Data Market Monitoring Tool by presenting data for the years 2018-2019 and forecasts to the year 2025 under three alternative scenarios;

• The Final Report on Policy Conclusions (D2.8) measuring the progress of European policies towards the objective of maximising the growth of the Data Economy as measured by the European Data Market Monitoring Tool;

• The key messages from the quantified stories (D3.6-7, D3.8 and D3.9) produced by the study team and focusing on the operational, organizational and/or economic benefits generated by the use of data-driven technologies with a special focus on Data Commons and Data-driven Innovation in the European Healthcare Industry;

• The Third Data Landscape Report (D4.3) providing an overview of the EU Data Landscape and offering an up-to-date zoom onto the database of data market companies in Europe.

Quantifying the European Data Market – Key Facts & Figures

The European Data Market Monitoring Tool

In line with the results presented in the original European Data Market study (SMART 2013/0063) in February 2017, in the First Report on Facts & Figures (D2.1) of February 2018, in the Second Report on Facts & Figures (D2.4) of March 2019 and in the Final Report on Facts & Figures (D2.7) of May 2020, the measured indicators are organised around a modular and flexible structure: the European Data Market Monitoring Tool. The updated European Data Market Monitoring Tool designed by IDC is shown in the Figure below.

The Updated EDM Monitoring Tool

The EU Data Market and Data Economy in 2019

The value of the Data Economy, which measures the overall impacts of the Data Market on the economy as a whole, exceeded the threshold of 400 Billion Euro in 2019 for the EU27 plus the United Kingdom [1], with a growth of 7.6% over the previous year. The positive trend in the growth of the Data Economy is confirmed by the Data Market value in 2019 for the EU27 plus the U.K., which is displaying a growth rate above the one exhibited by total IT spending, at 4.9% year-on-year, reaching 75 Billion Euro.

As far as supply and demand are concerned, data suppliers are estimated at more than 290,000 units in the EU27 plus the U.K. for 2019, exhibiting a year-on-year growth of 2.3%. Data users, by contrast, remained stable in 2019, amounting to nearly 716,000 units and registering a growth of 0.6% over the previous year. Following increasing growth rates over the prior four years, these figures confirm a consolidation of data companies in the EU. Revenues generated by data suppliers increased by 9% to reach almost 84 Billion Euro in the EU27 plus the U.K. The U.K. (still in the leading position), Germany, France and Italy show the highest shares of data revenues per country, together accounting for two thirds (66%) of data revenues in the European Union plus the U.K.

According to the latest estimates, the number of data professionals in the EU27 plus the U.K. reached 7.6 million in 2019, corresponding to 3.6% of the total workforce, with an increase of 5.5% over the previous year. However, the EDM Monitoring Tool continues to register an imbalance between the demand and the supply of data skills in Europe, as the estimated gap reached approximately 459,000 unfilled positions in the EU27 plus the U.K., corresponding to 5.7% of total demand. The data skills gap is forecast to persist in all the forecast scenarios as demand will continue to outpace supply.
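The relation between the skills gap and total demand can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes that total demand equals filled positions plus unfilled positions, and it takes 7.6 million data professionals as the filled supply for 2019 (both are assumptions of this sketch, not definitions taken from the study). The function name `gap_share` is hypothetical.

```python
# Skills-gap share: unfilled positions as a fraction of total demand.
# Assumption (not from the study): demand = filled supply + unfilled gap.


def gap_share(supply: float, gap: float) -> float:
    """Return the gap as a share of total demand (supply + gap)."""
    demand = supply + gap
    return gap / demand


# 2019, EU27 plus the U.K.
supply_2019 = 7_600_000  # data professionals, assumed reading of the figure
gap_2019 = 459_000       # unfilled positions quoted in the text

share_2019 = gap_share(supply_2019, gap_2019)

print(f"Implied total demand: {supply_2019 + gap_2019:,}")
print(f"Gap as share of demand: {share_2019:.1%}")
```

Under these assumptions the computed share rounds to the 5.7% of total demand quoted above, which suggests the quoted gap and workforce figures are mutually consistent.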

The EU Data Market and Data Economy in 2025

The Update of the European Data Market Study also produced key facts & figures for the year 2025 according to three alternative evolution paths of the European Data Market and Data Economy, driven by different macroeconomic and framework conditions. Based on IDC research carried out in March-April 2020, an additional post-Covid-impact scenario with estimates on the likely Data Market and Data Economy decline in 2020, and potential rebound and impacts on the 2025 scenarios for the EU27, has been developed (see the subchapter “Considerations on COVID-19 impact” below).

The 2025 scenarios are shaped by a combination of economic and social drivers, focused on the interaction of two main axes:

• the high or low pace of diffusion of data-driven innovation, driven by demand-supply dynamics, and its impact on economic growth;

• the social and economic data governance model enabling a fair and competitive economy, as indicated by the new European Data Strategy.

This analysis highlights the critical turning points to be faced in the next years by governments, businesses and social actors in the development of the European Data Economy. The combination of alternative social and economic trends results in the following scenarios:

• The Baseline scenario is characterised by a healthy growth of data innovation, a moderate concentration of power by dominant data owners with a data governance model protecting personal data rights, and an uneven but rather wide distribution of data innovation benefits in the society. This is considered the most likely scenario.

• The High Growth scenario is characterised by a high level of data innovation, low data power concentration, an open and transparent data governance model with high data sharing, and a wide distribution of the benefits of data innovation in the society;

• The Challenge scenario is characterised by a low level of data innovation, a moderate level of data power concentration due to digital markets fragmentation, and an uneven distribution of data innovation benefits in the society.

[1] Since Brexit is now definitive (as of May 2020), the authors provided an overview of data for EU27 plus the U.K. until 2019, and for the remaining months data are displayed for EU27.

The scenarios explore the drivers and framework conditions which may help to maximise the benefits of a balanced Data Economy and to avoid the risks of an unbalanced one, highlighting the consequences of policy actions.

In the Baseline scenario, the average EU27 GDP growth rate in the period 2020-2025 (+1.5%) will sustain investments in the digital economy and consumer willingness to spend. As a result, the Data Market is forecast to reach 82.5 billion Euro in the EU27, with a compound annual growth rate of 5.8%. The Data Economy will grow faster than the Data Market, thanks to a positive multiplier impact of data innovation on the economy, reaching a value of 550 billion Euro in the EU27, with a steep increase of its incidence on EU GDP from 2.8% in 2020 to 4% in 2025.

In the High Growth scenario at 2025, the EU27 GDP compound annual growth rate in the period 2020-2025 (+2.1%) will be 1.5 times higher than in the Challenge scenario and 40% higher than in the Baseline scenario. This will accelerate investments in the digital economy and consumer willingness to spend. In the European Union, public and private investments will accelerate in Artificial Intelligence, advanced robotics, automation as well as new skills. As a result, the Data Market is forecast to reach 107 billion Euro in the EU27, with a compound annual growth rate of 11.5% between 2020 and 2025. The Data Economy will grow faster than the Data Market, reaching a value of 827 billion Euro in the EU27, with an incidence on EU GDP of 5.9%, against the 4.0% of the Baseline scenario.

In the Challenge scenario, the EU GDP compound annual growth rate in the period 2020-2025 will be only 0.9%, substantially lower than in the other scenarios. As a result, in this scenario the Data Market is forecast to reach 72 billion Euro in the EU27 with a compound annual growth rate of 3% between 2020 and 2025. In the same context, the Data Economy will reach a value of 432 billion Euro in the EU27 with an incidence on GDP of 3.3%, compared to 4% in the Baseline scenario 2025.
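The three Data Market forecasts above can be cross-checked by discounting each 2025 value back to 2020 with its own compound annual growth rate (CAGR). The following is a minimal sketch, assuming five annual compounding periods between 2020 and 2025; the function name `implied_base` is hypothetical and the calculation is a plausibility check, not the study's forecasting method.

```python
# Back out the 2020 starting value implied by a 2025 forecast and its CAGR:
#   base = final / (1 + cagr) ** years
# Assumption: five annual compounding periods (2020 -> 2025).


def implied_base(final_value: float, cagr: float, years: int) -> float:
    """Return the starting value that grows to final_value at the given CAGR."""
    return final_value / (1 + cagr) ** years


YEARS = 5

# Scenario: (Data Market in 2025, billion Euro, EU27; CAGR 2020-2025)
scenarios = {
    "Baseline": (82.5, 0.058),
    "High Growth": (107.0, 0.115),
    "Challenge": (72.0, 0.03),
}

bases = {
    name: implied_base(value, rate, YEARS)
    for name, (value, rate) in scenarios.items()
}

for name, base in bases.items():
    print(f"{name}: implied 2020 Data Market of about {base:.1f} billion Euro")
```

Under this assumption all three scenarios point back to a 2020 starting value of roughly 62 billion Euro, which is what one would expect if they share a common pre-Covid starting point and differ only in the growth path.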

The number of data professionals will still increase to 8.4 million in the EU27 by 2025, adding 1.8 million positions in the period 2020-2025. We estimate a potential data skills gap of approximately 484,000 unfilled positions in the EU27 by 2025, corresponding to 5.7% of total demand, as demand will still grow faster than supply.

The EU Data Market and the International Indicators

Our latest measurement of the European Data Market Monitoring Tool reveals a substantially unchanged picture when comparing the EU indicators to those that have been developed for some of the key international partners of the EU. While confirming its vitality, the EU continues to lag behind the U.S. in terms of both size and growth of the Data Market. In 2019, the EU27 plus the U.K. generated a Data Market value approximately 2.5 times smaller than the one produced in the U.S. (72.3 billion Euro in the EU vs. almost 185 billion Euro in the U.S.).


Filling this gap would be essential to increase Europe’s competitiveness and for the future of work in the EU. To this aim, the new European Digital Strategy recently unveiled by the European Commission [2] designs a new, confident role for Europe as a global player.

A robust commitment on trade and investments on the international scene will also ensure a collaborative approach on several technology-related topics including data flows and the possibility to pool available and relevant high-quality data together. This approach, however, will have to be put in place while safeguarding Europe’s “technology sovereignty”, that is by making sure that Europe reduces its level of dependency on other parts of the globe for most of the crucial technologies and effectively protects the integrity and resilience of its data, networks and communication infrastructures.

Describing the Data Market – The Quantified Stories

Three stories were produced during the third and final round of measurement of the European Data Market Monitoring Tool [3]. These stories were the result of a mixed effort entailing both secondary and primary research. Extensive secondary research on available public sources, specialised press and academic literature was undertaken to obtain an actionable and up-to-date understanding of the operational, organizational and/or economic benefits generated by the use of data-driven technologies with a special focus on Data Commons and Data-driven Innovation in the European Healthcare Industry.

The first story (“Health Data and Data-driven Innovation in the European Healthcare Industry”) highlighted the benefits that data-driven technologies can deliver in uncovering unknown correlations, hidden patterns, and insights by examining large sets of data. By applying machine learning to Big Data, it becomes possible to study human genomes and find the correct treatment or drugs to treat cancer or other rare diseases.

To better understand how European companies are approaching and implementing the use of Data Commons, the second story (“Accelerating the Impact of Data Commons”) featured extensive desk research across a multitude of publicly available sources, and found a number of challenges to data pooling that can only be addressed by sophisticated and carefully designed governance mechanisms. Data pooling, to be effective, needs to be framed in a wider context of precompetitive collaboration between companies, and needs to have a compelling business case: in other words, it has to be demand-driven.

The third story looked at common data spaces from the viewpoint of those European industries that have already invested in data-driven innovation, have achieved measurable business benefits and are engaged in scaling-up these efforts. This provided an evidence-based and industry-specific view about the pragmatic requirements of Common European Data Spaces, with a focus on the requirements for data governance, access to data, access to infrastructures. The results showed that the path towards data spaces effectively supporting data sharing at ecosystem level will not be easy. The most relevant barrier found to scalability comes from the cost of cloud infrastructures and the dependency on a few global suppliers resulting in potential customer lock-in effects.

[2] https://ec.europa.eu/digital-single-market/en/content/european-digital-strategy

[3] In total, the study produced 9 stories: D3.1 Quarterly Stories – Story 1 “Opening Up Private Data for Public Interest”, November 2017; D3.2 Quarterly Stories – Story 2 “Opening Up Scientific Data for Innovation”, February 2018; D3.3 Quarterly Stories – Story 3 “Data Monetization”, October 2018; D3.4 Quarterly Stories – Story 4 “How Big Data is driving AI: Selected Examples of AI Applications across European Industries”, March 2019; D3.5 Quarterly Stories “AI paving the way for the Cognitive Revolution across European Utilities”, May 2019; stories D3.6-7, D3.8 and D3.9 are described above.


Mapping the Data Market – The Data Landscape and the Data Market Monitoring Tool

The Third EU Data Landscape Report (D4.3) provides an overview of the EU Data Landscape database revision as of January 2020. With a total of 1,556 companies and coverage of 42 countries (European Union-28, Belarus, Bosnia and Herzegovina, Georgia, Iceland, Israel, Kenya, Moldova, Norway, Russia, Serbia, Switzerland, Turkey, Ukraine and the United States), the database has grown by 9% with the addition of 131 new companies during 2019. Out of the new companies, 52 were identified as Key Data Landscape companies, offering a comprehensive overview of the most important data companies in Europe.

Acting Upon the Data Market – The Role of Policy

The new European Data strategy outlines the ambition for Europe to become a leading role model for a society empowered by data to make better decisions in business and the public sector and a global leader in the data-agile economy.

Considerations on COVID-19 impact

As the Final Report on Policy Conclusions (D2.8) was being finalised in February 2020, the COVID-19 pandemic started its rampage across the globe with unprecedented impacts on the European economy as well as on the technology market. While the EDM Monitoring Tool data and analysis until 2019 remain valid, our estimates for 2020 are now off the mark and the 2025 scenarios would need to be revised.

Based on IDC research carried out in March-April 2020, an additional post-Covid-impact scenario with estimates on the likely Data Market and Data Economy decline in 2020, and potential rebound and impacts on the 2025 scenarios for the EU27, has been developed. These estimates should be taken with caution because of the extremely high level of uncertainty about the current damages to the economy and the potential recovery paths.

According to our post-COVID scenario estimates, the European Data Market should decrease by 7.1% to 54 Billion Euro in 2020 (compared to 58 Billion Euro in 2019) and the Data Economy by 5.5% to 307 Billion Euro (compared to 325 Billion Euro in 2019). In our view, the powerful negative impact of the slow-down in 2020 will be followed by a rebound and a likely return to the growth path in the next years. Many of the powerful drivers of data-driven innovation are likely to prove resilient in the next years, particularly the willingness to invest in digital technologies in order to re-launch services and create new products to stimulate demand.

By 2025, the post-Covid Baseline scenario foresees strong growth rates resulting in a value of 80 Billion Euro for the European Data Market (compared to 82.5 Billion Euro in the pre-Covid scenario) and 516 Billion Euro for the Data Economy (compared to 550 Billion Euro in the pre-Covid scenario). However, the incidence of the European Data Economy on the EU27 GDP will slightly increase from 4% (pre-Covid scenario) to 4.04% (post-Covid scenario) because GDP is also affected by the recession. The Challenge and High Growth scenarios remain broadly valid, even though their degree of likelihood changes: the Challenge scenario is marginally more likely, while the High Growth scenario assumptions, based on hyper-growth thanks to technology investments, now seem quite remote.
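To make the "strong growth rates" of the post-Covid Baseline scenario concrete, the sketch below derives the year-on-year change for 2020 and the compound annual growth rate implied between 2020 and 2025 from the rounded values quoted above. It assumes five annual compounding periods; the function names `pct_change` and `cagr` are hypothetical, and because the inputs are rounded to whole billions the results can differ slightly from the percentages stated in the text.

```python
# Growth arithmetic on the rounded post-Covid figures (billion Euro, EU27).
# Assumption: five annual compounding periods between 2020 and 2025.


def pct_change(old: float, new: float) -> float:
    """Relative change from old to new (negative means a decline)."""
    return (new - old) / old


def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate taking start to end over the given years."""
    return (end / start) ** (1 / years) - 1


YEARS = 5

market = {"2019": 58.0, "2020": 54.0, "2025": 80.0}      # Data Market
economy = {"2019": 325.0, "2020": 307.0, "2025": 516.0}  # Data Economy

market_drop = pct_change(market["2019"], market["2020"])
economy_drop = pct_change(economy["2019"], economy["2020"])
market_cagr = cagr(market["2020"], market["2025"], YEARS)
economy_cagr = cagr(economy["2020"], economy["2025"], YEARS)

print(f"Data Market 2019-2020 change:  {market_drop:.1%}")
print(f"Data Economy 2019-2020 change: {economy_drop:.1%}")
print(f"Data Market implied CAGR 2020-2025:  {market_cagr:.1%}")
print(f"Data Economy implied CAGR 2020-2025: {economy_cagr:.1%}")
```

On these rounded inputs the implied rebound is around 8% per year for the Data Market and close to 11% per year for the Data Economy, both above the pre-Covid Baseline rate of 5.8% quoted earlier for the Data Market, since the scenario recovers most of the lost ground from a lower 2020 base.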

Europe’s Data Market and Data Economy Evolution: Policy and the Three Scenarios (Pre-COVID)

Today, as we look at the main driving trends for the next years, we notice that the role of policies has increased in relevance: as data-driven innovation has become widespread across all industry sectors and user constituencies, the scope of the regulations and framework conditions to be adapted has considerably grown. At the same time, the emergence of disruptive technologies such as AI has increased the need for policy intervention to manage emerging social, economic and ethical risks.

The Baseline scenario is positioned between the two extremes of a high and a low concentration of power and data control. The development of an effective regulatory framework of data governance, as foreseen by the Data Strategy, will enhance stakeholders’ willingness and capability to manage data sharing and improve data access and re-use.

As in the Baseline scenario, in the High Growth scenario European enterprises multiply the use of "digital co-workers" (using intelligent process automation and AR/VR to support or complement human workers), reducing repetitive tasks and improving productivity and security. Besides automation, enterprises engage in "augmentation" of human resources, providing technologies that enhance their physical and intellectual capabilities. In addition, initiatives to develop digital skills are successful: the Digital Europe Programme delivers a boost to the supply of advanced digital and data skills, the revised Digital Education act helps to improve digital learning, and the networks of Digital Innovation Hubs play their role in providing internships, training and experimental spaces for companies to learn about new technologies.

The Challenge scenario foresees a negative self-reinforcing circle, where less positive global economic conditions discourage investments and weaken global demand with a negative impact on European growth. In this context, the Digital Europe and Data strategies are not implemented successfully and fail to achieve many of their objectives. This may happen if a combination of insufficient investments and lack of collaboration at EU level leads to an uneven development of data infrastructures and digital resources.

The EU Data Policy and the International Dimension

The new European Digital Strategy recently unveiled by the European Commission appears to design a new, confident role for Europe as a global player. Recognising that the European model has proved to be an inspiration for many other partners worldwide, the strategy calls for the EU to strengthen its commitment to setting global standards for emerging technologies and to remain the most open region for trade and investment in the world. In terms of standards, in particular, the EU has paved the way for the setting of global standards for 5G and the IoT and is now committed to leading the standardisation process of a number of additional advanced and new generation technologies such as blockchain, quantum computing and supercomputing: all technologies that underpin data sharing and data usage and that, as a direct consequence, are directly linked to the further development of a well-functioning Data Economy.

This proactive international role in the standardisation process is accompanied by a robust commitment on trade and investments on the international scene, so as to ensure that a collaborative and inspiring European approach on several technology-related topics, including data flows and the possibility to pool available and relevant high-quality data together, is successfully implemented.

The European Data Market Monitoring Tool – Key Numbers 2019 for EU27

The European Data Market Monitoring Tool – Key Numbers 2019 for EU27 + U.K.

Source: EDM Monitoring Tool, IDC 2020

Source: EDM Monitoring Tool, IDC 2020

The European Data Market Monitoring Tool – Baseline Scenario 2025 for EU27

Source: EDM Monitoring Tool, IDC 2020

The European Data Market Monitoring Tool – High Growth Scenario 2025 for EU27

The European Data Market Monitoring Tool – Challenge Scenario 2025 for EU27

Baseline Scenario

High Growth Scenario

Challenge Scenario

Source: EDM Monitoring Tool, IDC 2020⁴

4 Unfortunately, we are unable to present post-COVID revised estimates for the Data Economy Challenge and High Growth scenarios. These scenarios rely on alternative assumptions on industry revenues, consumer consumption and GDP dynamics, and we do not feel able in the present uncertainty to elaborate such assumptions beyond the Baseline scenario. However, as discussed for the European Data Market, we believe that the forecast estimates remain broadly valid, with the Challenge scenario relatively more likely than before the COVID pandemic.

The European Data Market Monitoring Tool – Post-Covid Scenarios 2020-2025 for EU27 (€M)

The European Data Market Monitoring Tool – The International Indicators and the EU27

The European Data Market Monitoring Tool – The International Indicators and the EU27 + U.K.

1. Introduction

The European Data Market Study (SMART 2013/0063) was launched by the European Commission in 2013 to measure the progress, size and trends of the European Data Economy with the objective of supporting the Data Value Chain policy of the European Commission. The study designed, developed and implemented a European Data Market Monitoring Tool providing facts and figures on the size and trends of the EU Data Market and Data Economy in the form of a series of quantitative indicators. The study also covered qualitative and quantitative aspects of the European Data Economy in the form of quantified stories investigating elements of the Data Market that were not captured by the Monitoring Tool. Finally, the European Data Market Study included a data landscaping tool offering a continuously updated picture of data companies in Europe, as well as a series of webinars to disseminate the research results.

To continue gathering reliable and fact-based evidence on the EU Data Economy and measure the progress of the data-driven economy policies within the general framework of the Digital Single Market Strategy, the European Commission commissioned an update of the European Data Market (EDM) Study. The present document constitutes the Final Study Report (D2.9) of the Update of the European Data Market Study (SMART 2016/0063), entrusted in 2016 to IDC and the Lisbon Council. As a follow-up to the Second Interim Report (D2.6), this report brings together the research results and the activities carried out by the contractors under:

• The Final Report on Facts & Figures (D2.7) extending the measurement of the European Data Market Monitoring Tool by presenting data for the years 2018-2020 and forecasts to the year 2025 under three alternative scenarios;

• The Final Report on Policy Conclusions (D2.8) measuring the progress of European policies towards the objective of maximising the growth of the Data Economy as measured by the European Data Market Monitoring Tool;

• The key messages from the quantified stories (D3.6-7, D3.8 and D3.9) produced by the study team and focusing on the operational, organizational and/or economic benefits generated by the use of data-driven technologies, with a special focus on Data Commons and Data-driven Innovation in the European Healthcare Industry;

• The Third Data Landscape Report (January 2020 Review – D4.3) providing an overview of the EU Data Landscape and offering an up-to-date zoom into the database of data market companies in Europe.

1.1. Objectives

As in the previous study, the Update of the European Data Market Study (SMART 2016/0063) pursues three closely interrelated main objectives, which together make it possible to develop a complete and coherent picture of the European Data Market and Data Economy. They are as follows:

• Measuring the EDM indicators, providing facts and figures on all the key features of the European Data Market and Economy, regularly updated during the life of the project, building on the taxonomy and methodology approach previously developed and successfully implemented;

• Analysing relevant issues for the development of the data ecosystem, providing Data Market stories based on factual evidence, case studies and complementary data to the EDM indicators, following on the 12 stories already published by the previous study;

• Mapping and visualising the stakeholders populating the EU Data Market, building on the stakeholders’ landscape and community developed in the previous study, and leveraging the visibility achieved by the website www.datalandscape.eu.

1.2. Methodological Approach

The Indicators

As outlined in the Final Report on Facts & Figures (D2.7), the measurement of each of the indicators in this report is based on a sophisticated methodology that combines data collection, models, and desk research. Some initial assumptions are built on surveys completed during March 2015, which are supported by ongoing annual surveys. The 2015 survey included 8 Member States and 11 industries that aligned with Eurostat industry segmentation, and IDC’s ongoing annual surveys initially included 6 Member States and 20 industry segments. The final survey, conducted between July and September 2019, included a total of 13 countries.

The initial survey targeted potential data companies in two industries (ICT and Professional services), and data users in 11 industries. The annual update surveys target all business sectors and companies with more than 10 employees. The survey is balanced to represent the mix of industries and size bands for companies in the European Union. The initial survey and the ongoing cross-Europe survey are outlined in more detail in the survey section of the methodological annex.

The models used to represent expected market and company behaviour take inputs from macroeconomic indicators such as GDP and GDP growth, ICT spending, and employment. The main data sources used to compile the indicators are outlined in the table below.

Table 1: Main Data Sources by Indicator

| Data Source | Updated | Used in |
|---|---|---|
| Eurostat Business Demographic Statistics | Jan 2020 | Data Professionals, Data Companies, Data Users |
| Eurostat annual structural business statistics | Jan 2020 | Data Professionals, Data Companies, Data Users |
| Eurostat chain linked Volumes (GDP) | Jan 2020 | Data Market, Data Revenues |
| IDC Core IT Spending guide 2H2017 | Jun 2019 | Data Market, Data Revenues |
| IDC Worldwide Black Book v3.2 (standard edition) | Nov 2019 | Data Market, Data Professionals, Data Companies, Data Users, Data Revenues |
| IDC European Vertical Markets survey (2019) | Sep 2019 | Data Market |
| IMF World Economic Outlook (Oct 2019) | Jan 2020 | Data Market, Data Revenues, Data Economy |
| Consensus Forecasts – Consensus economics | Nov 2019 | Data Market, Data Revenues, Data Economy |
| IT Big Data and Analytics spending Guide 2H2018 | Nov 2019 | Data Market |
| ILOSTAT statistics and databases | Dec 2019 | Data Professionals |

Additional relevant sources leveraged for the measurement of the indicators were IDC Vertical Markets end user surveys and IDC Worldwide Black Book, whose results were used to confirm and adjust estimates, when necessary, of the number of companies that were data users and data suppliers.

The updated numbers of data user and data supplier companies were subsequently used to determine the updated results for the data companies’ revenues and were further combined with the above-mentioned sources to measure the indicators for Data Professionals and the Data Professionals’ Skills Gap for the years 2018 and 2019 and for the three 2025 scenarios.

Implications of Brexit on Indicators

The UK left the European Union on 31 January 2020, while this study was ongoing and the Final Report on Facts and Figures was being developed. Until then the UK was included as one of the Member States, although since June 2016 data have been provided for two distinct totals: one for the EU28 (including the UK) and another for the EU27 (excluding the UK). However, at the level of company size and industry, historical and forecast data are presented for the EU28 only⁵. Forecasts beyond 2020 account for the expected impact of the UK's exit from the EU, although little overall difference is anticipated, owing to the high growth of the Data Market when compared to the total IT market across Europe and the UK.

The Final Report on Policy Conclusions

As a follow-up to the First Report on Policy Conclusions (D2.3) and the Second Report on Policy Conclusions (D2.5), additional desk research and literature review were conducted to produce the Final Report on Policy Conclusions (D2.8) accompanying the quantitative results of the Final Report on Facts & Figures (D2.7). To better investigate the role of policies in shaping the current and future development of the European Data Economy, the study team leveraged a mix of IDC research and other sources, including recent research on the Covid-19 pandemic and its potential effects on the data market and the data economy. A select list of these sources is offered in Table 2 below:

Table 2: Main Sources

| Document | Year | Author(s) |
|---|---|---|
| A Union that strives for more – My agenda for Europe. Political guidelines for the next European Commission 2019-2024 | 2019 | Ursula von der Leyen |
| A European strategy for data | 2020 | European Commission |
| Shaping Europe’s Digital Future | 2020 | European Commission |
| The Data Economy | 2020 | The Economist |
| Opinion of the Data Ethics Commission | 2019 | Data Ethics Commission of the Federal Government, Germany |
| Report of the WHO-China Joint Mission on Coronavirus Disease 2019 (COVID-19) | 2020 | WHO |
| IDC Worldwide ICT spending forecast Post-Covid | 2020 | IDC |
| Preparing for COVID-19 Phase2: adopting contact tracing | 2020 | IDC |
| A policy framework for climate and energy in the period from 2020 to 2030 | 2014 (updated in 2018) | European Commission |
| IDC DX Executive Sentiment Survey | 2018 | IDC |
| IDC European Vertical Markets Survey, 2018–2019 | 2019 | IDC |
| European Union (Withdrawal Agreement) Act 2020 | 2020 | UK Government |
| IMD World Digital Competitiveness Ranking 2019 | 2020 | IMD |
| European Parliament resolution of 12 February 2020 on the proposed mandate for negotiations for a new partnership with the United Kingdom of Great Britain and Northern Ireland (2020/2557(RSP)) | 2020 | European Parliament |
| EU-UK Data Flows, Brexit and No-Deal: Adequacy or Disarray? | 2019 | UCL European Institute |
| Digital Economy Report 2019 - Value creation and capture: Implications for developing countries | 2019 | United Nations |
| Brazil Economic Outlook. First quarter of 2020 | 2020 | BBVA |
| U.S. Economy at a Glance | 2020 | Bureau of Economic Analysis – U.S. Department of Commerce |

5 Indeed, from the outset, our model did not allow data to be obtained at Member State level and industry level at the same time (i.e. “interlocked data” by Member State showing details at industry level for that Member State). As a result, data at industry level cannot be segmented at Member State level, thus the UK cannot be “isolated” and subtracted to obtain EU27 data.

The Quantified Stories

The quali-quantitative stories were the result of a mixed effort entailing both secondary and primary research activities. Extensive secondary research on available public sources, specialised press and academic literature was undertaken to obtain an actionable and up-to-date understanding of private and scientific data for public interest and innovation together with a comprehensive picture of the phenomenon in Europe and worldwide.

In parallel, primary research was conducted to collect empirical evidence and validate the information obtained through the main desk research activities. Among the organisations interviewed, the following played a prominent role:

The Third EU Data Landscape Report

The Third EU Data Landscape Report (D4.3) provides a detailed overview of the updated EU Data Landscape database and a zoom into the stakeholders and their positioning in the data economy environment. Following the January 2020 update reported in the Third EU Data Landscape report (D4.3), the EU Data Landscape database has been revised to capture the current trends and include the data collected in the period from January 2019 until December 2019.

The dataset has been significantly extended through desk research as well as through input received from stakeholders in the data economy, for instance via the www.datalandscape.eu website. The report relies on the crowdsourcing of knowledge through an open process, where stakeholders can directly suggest the companies to be included in the database. The mapping exercise sought to achieve a balanced and comprehensive coverage of the different geographies, the different typologies of companies (SMEs, large companies, research institutions etc.) and the different data sectors. In terms of geographical coverage, the mapping of the Data Landscape focuses on the EU Member States. However, companies from other European countries as well as countries outside Europe are also included in the database. The reviewing procedure consisted of the following steps:

• Control of the existing dataset;

• Extension of the dataset, taking into account coverage goals;

• Review of Key Data Landscape companies and identification of new ones.

Finally, among the 1556 companies, 311 have been identified as Key Data Landscape companies in line with a set of criteria adopted.

The European Data Market Monitoring Tool

In line with the results presented in the original European Data Market study (SMART 2013/0063) in February 2017, the First Report on Facts & Figures (D2.1) of February 2018, the Second Report on Facts & Figures (D2.4) of March 2019 and the Final Report on Facts & Figures (D2.7) of April 2020, the indicators presented in this report are organised around a modular and flexible structure – the European Data Market Monitoring Tool. The updated European Data Market Monitoring Tool designed by IDC is shown in the Figure below and its main components are further described in the following sections.

Figure 1: The Updated EDM Monitoring Tool

1.3. The Structure of this Report

The present report is built along the following sections:

• The first section – corresponding to Chapter 2 – summarises the results of the Final Report on Facts & Figures (D2.7) that was delivered and approved by the European Commission in April 2020.

• The second section – corresponding to Chapter 3 – provides additional qualitative and quantitative aspects on the European Data Market as obtained by the quantified stories (D3.6-7, D3.8 and D3.9) produced by the study team between November 2019 and June 2020.

• The third section – corresponding to Chapter 4 – presents an updated overview of the data landscape and interactive Data Market Monitoring Tool based on the January 2020 update reported in the Third Data Landscape Report (D4.3).

• The fourth section – corresponding to Chapter 5 – focuses on the policy conclusions delivered in the Final Report on Policy Conclusions (D2.8) of May 2020.

• The final section provides a set of concluding remarks drawing on all the different components (and deliverables) of the Update of the European Data Market study.

2. Quantifying the data market – key facts & figures

The key facts & figures stemming from the third round of measurement of the Update of the European Data Market Study (SMART 2016/0063), as reported in the Final Report on Facts & Figures, were obtained through the measurement of the following set of selected indicators:

Each indicator was measured at the level of the total EU27 plus the U.K. and of the EU27 (excluding the U.K.) and, when available and applicable, for all EU Member States; industry-specific and company-size views were also offered, with indicators provided by industry sector and company size band when possible. As in the European Data Market Study (SMART 2013/0063), a select number of indicators has been developed and updated for three non-European countries, namely Brazil, Japan and the United States.

The six key indicators measured by the EDM can be seen holistically along four main dimensions:

• The Workforce and Skills dimension - including the measurement of data professionals and their potential skill gap.

• The Supply and Demand dimension - incorporating the measurement of data supplier and data user companies and the revenues generated by data supplier companies.

• The Business and Economy dimension - comprising the size of the Data Market and the value of the Data Economy.

• The International context dimension - including a select number of indicators for Brazil, Japan and the U.S.

Figure 2: The four Dimensions of the Data Market’s Key Facts & Figures

Source: The European Data Market Monitoring Tool, IDC, 2019

2.1 Three future Development Paths: The Data Market at 2025

The key facts & figures obtained through the measurement of the above-listed indicators are presented for the years 2018 and 2019 as well as for the year 2025 according to three potential future scenarios of the European Data Market and Economy, driven by different macroeconomic and framework conditions. The scenarios at 2025 continue to take as a reference point the initial scenarios developed for the year 2020. While the 2020 scenarios were mainly differentiated by economic drivers (different demand-supply dynamics), the 2025 scenarios continue to be shaped by a combination of economic and social drivers, focused on the interaction of two main focal issues (or evolution paths):

• the high or low pace of diffusion of data-driven innovation, driven by demand-supply dynamics, and its impact on economic growth. This year we add to this perspective the pace of multiple innovation adoption, where data is at the core of a multiple technology environment powered by AI.

• the social and economic data governance model enabling a fair and competitive economy, as indicated by the new European Data Strategy.

At one extreme, we foresee a society where a few actors, such as leading online platforms, governments and large businesses, dominate the main data assets and are therefore able to capture a disproportionately high share of data innovation benefits, increasing social inequality (highly centralized model). The polar opposite of this scenario would be a society characterised by an open, transparent and participatory approach to data governance, where both citizens and organisations are able to control and extract value from their data. This would result in a wider social distribution of data innovation benefits, decreasing social inequality. Trustworthiness and respect for data ethics principles are other important characteristics of this ideal model.

This analysis highlights the critical turning points to be faced in the next years by governments, businesses and social actors in the development of the European Data Economy. The combination of alternative social and economic trends results in the following scenarios:

• The Baseline scenario is characterised by a healthy growth of data innovation, a moderate concentration of power by dominant data owners with a data governance model protecting personal data rights, and an uneven but rather wide distribution of data innovation benefits in the society. This is considered the most likely scenario.

• The High Growth scenario is characterised by a high level of data innovation, low data power concentration, an open and transparent data governance model with high data sharing, and a wide distribution of the benefits of data innovation in the society.

• The Challenge scenario is characterised by a low level of data innovation, a moderate level of data power concentration due to digital markets fragmentation, and an uneven distribution of data innovation benefits in the society.

The scenarios explore the drivers and framework conditions which may help to maximise the benefits of a balanced Data Economy and to avoid the risks of an unbalanced one, highlighting the consequences of policy actions.

2.2 The Workforce Dimension: Data Professionals and Data Skills Gap

Measuring the Data Professionals

Data professionals⁶ are workers who collect, store, manage, and/or analyse, interpret, and visualise data as their primary activity or as a relevant part of their activity. Data professionals must be proficient in the use of structured and unstructured data, should be able to work with large amounts of data and should be familiar with emerging database technologies.

Data Professionals in 2018 and 2019

According to the third round of measurements, data professionals are estimated at a total of 6.0 million in the EU27 and at 7.6 million in the EU27 plus the U.K. in 2019, thus marking a continuing increase in 2019 over the previous year (6.1% and 5.5% year-on-year respectively). When compared to the year 2019, 2020 would register a growth rate of 9.2% and 8.6% at the level of the EU27 and the EU27 plus the U.K. respectively. More interestingly, the employment share and the intensity share components of the data professionals’ indicator are also expected to improve in 2019 and 2020 if compared to our estimates in 2016 (the employment share is now estimated at 3.3% and 3.5% in 2019 and 2020 in the EU27 and at 3.6% and 3.8% for the same years in the EU27 plus the U.K.). As underlined in the Second Report on Facts & Figures (D2.4), this increase confirms the positive evolution of the workforce involved in data-related professions over the period under consideration.

6 The previous European Data Market Study (SMART 2013/0063) included an indicator measuring “Data Workers”, which was based on a similar, but slightly more restrictive definition. In this updated study we have decided to measure “Data Professionals”, that is workers with a wider range of data-related roles. Indeed, data professionals are not only data technicians, but also users who, based on sophisticated tools, take decisions about their business or activities after having analysed and interpreted available data.

Table 3: Data Professionals, 2016-2017-2018-2020 and Growth Rates

| N. | Region | Name | Description | 2016 | 2017 | 2018 | 2019 | 2020 | Growth 2019/2018 |
|---|---|---|---|---|---|---|---|---|---|
| 1.1 | EU27 | Number of data professionals | Total number of data professionals in EU (000s) | 4,875 | 5,260 | 5,688 | 6,033 | 6,588 | 6.1% |
| 1.1 | EU27+U.K. | Number of data professionals | Total number of data professionals in EU (000s) | 6,187 | 6,666 | 7,215 | 7,608 | 8,261 | 5.5% |
| 1.2 | EU27 | Employment share of data professionals | Share of data professionals on total employment in EU (%) | 2.8% | 3.0% | 3.2% | 3.3% | 3.5% | 3.4% |
| 1.2 | EU27+U.K. | Employment share of data professionals | Share of data professionals on total employment in EU (%) | 3.1% | 3.3% | 3.5% | 3.6% | 3.8% | 2.9% |
| 1.3 | EU27 | Intensity share of data professionals | Average number of data professionals per user company (units) | 9.6 | 10.2 | 10.7 | 11.3 | 12.1 | 5.4% |
| 1.3 | EU27+U.K. | Intensity share of data professionals | Average number of data professionals per user company (units) | 9.2 | 9.6 | 10.1 | 10.6 | 11.4 | 4.9% |

Source: European Data Market Monitoring Tool, IDC 2020
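
The year-on-year growth rates quoted for data professionals can be approximately recomputed from the published totals. The sketch below is only an illustration: the helper name `yoy_growth` and the dictionary layout are assumptions introduced here, not part of the Monitoring Tool, and because the published totals are rounded to thousands the recomputed rates may differ from the reported ones by about a tenth of a percentage point.

```python
# Totals copied from Table 3 (thousands of data professionals).
DATA_PROFESSIONALS = {
    "EU27": {2018: 5688, 2019: 6033, 2020: 6588},
    "EU27+U.K.": {2018: 7215, 2019: 7608, 2020: 8261},
}


def yoy_growth(series, year):
    """Percentage change between `year - 1` and `year` (illustrative helper)."""
    previous, current = series[year - 1], series[year]
    return (current / previous - 1) * 100


if __name__ == "__main__":
    for region, series in DATA_PROFESSIONALS.items():
        for year in (2019, 2020):
            print(f"{region} {year}/{year - 1}: {yoy_growth(series, year):.1f}%")
```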

Data Professionals at 2025

A steady progression of the number of data professionals continues to emerge from our 2020 estimates. The number of data professionals in both the EU27 and the EU27 plus the U.K. is forecast to grow significantly under all three scenarios out to 2025, as the use of data-driven innovation is expected to grow unabated even under the less economically favourable scenario. In particular, under the Baseline scenario, data professionals are expected to amount to 9.3 million in the EU27 and 11.3 million in the EU27 plus the U.K. by 2025, thus representing a solid growth rate of 7.2% and 6.5% per year respectively over the 2020-2025 period. In the Challenge and High Growth scenarios, data professionals would number more than 8.4 million and 10.8 million in the EU27 and 10.2 million and 13.1 million in the EU27 plus the U.K. respectively. Under all scenarios, the CAGR over the period 2018-2025 is consistent with, although higher than, the CAGR of the Data Market, thus confirming again the close relationship between the two variables.

Table 4: Data Professionals in 2025 - Total Number in EU27 and EU27 + U.K. and Growth Rates. Challenge, Baseline and High Growth Scenarios (Units, ‘000; %)

| N. | Region | Name | Description | 2025 Challenge | 2025 Baseline | 2025 High Growth | CAGR Challenge scenario | CAGR Baseline scenario | CAGR High Growth scenario |
|---|---|---|---|---|---|---|---|---|---|
| 1.1 | EU27 | Number of data professionals | Total number of data professionals in EU (000s) | 8,461 | 9,316 | 10,853 | 5.1% | 7.2% | 10.5% |
| 1.1 | EU27 + U.K. | Number of data professionals | Total number of data professionals in EU (000s) | 10,200 | 11,331 | 13,162 | 4.3% | 6.5% | 9.8% |

Source: European Data Market Monitoring Tool, IDC 2020
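
The compound annual growth rates (CAGR) in Table 4 can be approximately reproduced with the standard formula CAGR = (end / start)^(1 / years) - 1. The sketch below assumes that the base year is 2020 (using the 2020 totals from Table 3) and that the horizon is five years; this is an assumption inferred from the "2020-2025 period" mentioned in the text, not a documented detail of the Monitoring Tool, and the function and variable names are illustrative.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate, in percent, between two values."""
    return ((end_value / start_value) ** (1 / years) - 1) * 100


# 2020 totals from Table 3 and 2025 scenario totals from Table 4 (thousands).
BASE_2020 = {"EU27": 6588, "EU27+U.K.": 8261}
FORECAST_2025 = {
    "EU27": {"Challenge": 8461, "Baseline": 9316, "High Growth": 10853},
    "EU27+U.K.": {"Challenge": 10200, "Baseline": 11331, "High Growth": 13162},
}

if __name__ == "__main__":
    for region, scenarios in FORECAST_2025.items():
        for scenario, total_2025 in scenarios.items():
            # Assumption: 5 annual steps from 2020 to 2025.
            rate = cagr(BASE_2020[region], total_2025, years=5)
            print(f"{region} {scenario}: {rate:.1f}% per year")
```

Under this assumption the recomputed rates agree with the table to one decimal place, which suggests (without confirming) that the published CAGRs use 2020 as the base year.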

Measuring the Data Professionals Skills Gap

The Data Professionals Skills Gap indicator captures the potential gap between demand and supply of data skills in Europe, since the lack of skills may become a barrier to the development of the data industry and the rapid adoption of data-driven innovation. It is based on a model balancing the main sources of data skills (from the education system, re-training and other careers) with the estimated demand (by all data companies).

This indicator has highlighted an imbalance between demand and supply of data skills in Europe since the first measurement for the year 2014. In 2019 demand for data professionals continued to increase (+4.5%) and the estimated gap grew by 13%, reaching approximately 459,000 unfilled positions in the EU27 plus the U.K. (399,000 without the U.K.), corresponding to 5.7% of total demand (6.2% without the U.K., see table below). By 2020 we expect the gap to expand to 496,000 unfilled positions in the EU27 plus the U.K., corresponding to 6% of total demand (5.2% without the U.K., where slower growth is expected due to the impacts of Brexit). At any given moment in the labour market there is a natural number of vacancies, as well as a number of people looking for work: a vacancies ratio of around 5% of demand or less is considered manageable. From this point of view the data skills gap estimated for 2019 shows a lower level of stress in the market if compared with our previous estimates for 2017 and 2018. As in the first and second rounds of measurement of this indicator, the gap is expected to continue in 2020 under the three scenarios, but at a lower level than previously estimated.

The three forecast scenarios now portray a mixed data skills gap at the year 2025: while definitely on the increase under the Baseline and the High Growth scenarios (8.2% and 10.5% respectively in the EU27), in the Challenge scenario the gap is expected to exhibit a minor decrease due to a general slow-down of the data market and data economy dynamics under this less favourable development path.

The absolute size of the data skills gap is significant, potentially reaching 759,000 unfilled positions in 2025 in the EU27 Baseline scenario, and over 1.1 million in the EU27 High Growth scenario. In the Challenge scenario the data skills gap is forecast at 484,000 unfilled positions in 2025. This underlines the need for policy action to prevent and minimise the imbalance between data skills demand and supply in the coming years.

Table 5: Indicator 6 - Data Professionals Skills Gap in the EU, 2017-2018-2020 and 2025 - Three scenarios

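| N. | Name | Description | Region | Actual 2016 | Actual 2017 | Actual 2018 | Actual 2019 | Actual 2020 | Baseline Scenario 2025 | Baseline 2025/2019 CAGR | Challenge Scenario 2025 | Challenge 2025/2019 CAGR | High Growth Scenario 2025 | High Growth 2025/2019 CAGR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 6.1 | Data Professionals skills gap | Gap between demand and supply of data professionals (N, 000s) | EU27 | 343 | 395 | 321 | 399 | 341 | 759 | 11.3% | 484 | 3.3% | 1,138 | 19% |
| 6.1 | Data Professionals skills gap | Gap between demand and supply of data professionals (N, 000s) | EU27 + U.K. | 428 | 483 | 406 | 459 | 496 | 866 | | | | | |
| 6.2 | | Gap between demand and supply of data professionals (%) | EU27 | 6.2% | 6.7% | 5.2% | 6.2% | 5.2% | 8.2% | 11.2% | 5.7% | 2.1% | 10.5% | 19% |
| 6.2 | | Gap between demand and supply of data professionals (%) | EU27 + U.K. | 6.2% | 6.5% | 5.2% | 5.7% | 6.0% | 7.6% | | | | | |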

Source: European Data Market Monitoring Tool, IDC 2020
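
The text above treats a vacancy ratio of around 5% of demand or less as manageable. A minimal sketch of how the EU27 gap shares in Table 5 compare with that threshold is given below; the constant, the dictionary and the helper `excess_over_threshold` are illustrative names introduced here, and reading the 5% figure as a hard cut-off is a simplifying assumption.

```python
# Threshold taken from the text: a vacancy ratio of about 5% of demand
# or less is considered manageable (treated here as a hard cut-off).
MANAGEABLE_VACANCY_RATIO = 5.0  # percent

# EU27 skills gap as a share of total demand, copied from Table 5 (percent).
GAP_SHARE_EU27 = {
    "2019": 6.2,
    "2020": 5.2,
    "2025 Baseline": 8.2,
    "2025 Challenge": 5.7,
    "2025 High Growth": 10.5,
}


def excess_over_threshold(share, threshold=MANAGEABLE_VACANCY_RATIO):
    """Percentage points by which a gap share exceeds the threshold (0 if below)."""
    return round(max(0.0, share - threshold), 1)


if __name__ == "__main__":
    for label, share in GAP_SHARE_EU27.items():
        excess = excess_over_threshold(share)
        status = "manageable" if excess == 0 else f"{excess} points above threshold"
        print(f"{label}: {share}% ({status})")
```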

2.3 The Supply - Demand Dimension: The Data Companies

Measuring the Data Companies

Data companies are organisations that are directly involved in the production, delivery and/or usage of data in the form of digital products, services and technologies. They can be either data supplier or data user organisations:

• Data suppliers have as their main activity the production and delivery of digital data-related products, services, and technologies. They represent the supply side of the Data Market.

• Data users are organisations that generate, exploit, collect and analyse digital data intensively and use what they learn to improve their business. They represent the demand side of the Data Market.

Data Companies in 2018 and 2019

The number of data suppliers continues to grow at a faster pace than the number of data users in the longer term (out to 2025). Data suppliers are estimated at almost 149,000 in the EU27 and 290,000 units in the EU27 plus the U.K. for 2019, thus exhibiting a year-on-year growth of 2.4% and 2.3% respectively. Data users, instead, are projected to grow at 0.6% in 2019, amounting to nearly 535,000 in the EU27 and to nearly 716,000 units in the EU27 plus the U.K. If compared to the measurements carried out by the European Data Market Monitoring Tool over the period 2013-2015, these latest estimates show a picture of some consolidation of data companies in the EU, following increasing growth rates over the prior four years.

This consolidation is reflected in a slow but consistent increase in the share of data companies over the total number of companies in Europe. The share of data suppliers on total companies in the ICT and Professional services industries is estimated at 11.5% in the EU27 and 15.2% in the EU27 plus the U.K. for 2019, a slight improvement with respect to an adjusted 2018. The data users’ penetration rates (i.e. the share of data users on total companies in the EU) are also stable, with a fractional percentage point increase in 2019 in both the EU27 and the EU27 plus the U.K.


Table 6: Data Companies, 2016-2020 and Growth Rates

| N. | Name | Description | Market | 2016 | 2017 | 2018 | 2019 | 2020 | Growth 2019/2018 |
|---|---|---|---|---|---|---|---|---|---|
| 2.1 | Number of data suppliers | Total number of data suppliers measured as legal entities based in the EU (000s) | EU27 | 134,300 | 139,450 | 145,440 | 148,900 | 153,100 | 2.4% |
| 2.1 | Number of data suppliers | Total number of data suppliers measured as legal entities based in the EU (000s) | EU27+U.K. | 261,450 | 271,700 | 283,390 | 290,000 | 297,350 | 2.3% |
| 2.2 | Share of data suppliers | % share of data companies on total companies in the ICT and Professional services industries | EU27 | 10.9% | 11.3% | 11.4% | 11.5% | 11.7% | 1.6% |
| 2.2 | Share of data suppliers | % share of data companies on total companies in the ICT and Professional services industries | EU27+U.K. | 14.2% | 14.8% | 15.2% | 15.2% | 15.4% | 1.2% |
| 2.3 | Number of data users | Total number of data users in the EU, measured as legal entities based in one EU country | EU27 | 505,950 | 517,100 | 531,720 | 534,840 | 542,510 | 0.6% |
| 2.3 | Number of data users | Total number of data users in the EU, measured as legal entities based in one EU country | EU27+U.K. | 676,150 | 691,500 | 711,870 | 715,890 | 726,110 | 0.6% |
| 2.4 | Share of data users | % share of data users on total companies in the EU industry | EU27 | 5.7% | 5.8% | 5.9% | 5.9% | 6.0% | 0.2% |
| 2.4 | Share of data users | % share of data users on total companies in the EU industry | EU27+U.K. | 6.5% | 6.6% | 6.7% | 6.7% | 6.8% | 0.1% |

Source: European Data Market Monitoring Tool, IDC 2020
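The year-on-year growth rates quoted for 2019 can be reproduced from the 2018 and 2019 unit counts in Table 6. The following is a minimal sketch of that arithmetic; the function name `yoy_growth` and the dictionary `TABLE_6` are illustrative assumptions and are not part of the Monitoring Tool itself.

```python
# Sketch: recompute the 2019/2018 growth rates from the counts in Table 6.
# `yoy_growth` and `TABLE_6` are hypothetical names chosen for this example.

def yoy_growth(previous: float, current: float) -> float:
    """Return the growth from `previous` to `current` as a percentage."""
    return (current / previous - 1.0) * 100.0

# (2018 value, 2019 value) pairs copied from Table 6
TABLE_6 = {
    "Data suppliers, EU27": (145_440, 148_900),
    "Data suppliers, EU27+U.K.": (283_390, 290_000),
    "Data users, EU27": (531_720, 534_840),
    "Data users, EU27+U.K.": (711_870, 715_890),
}

for label, (v2018, v2019) in TABLE_6.items():
    print(f"{label}: {yoy_growth(v2018, v2019):.1f}%")
```

Rounded to one decimal place, the results agree with the growth column of the table for these four rows.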


Data Suppliers Forecasts at 2025

Our latest forecasts to 2025 point to continued growth in the number of data suppliers beyond 2020. The baseline growth to 2025 aligns with the growth forecast for 2020, reflecting the consolidation and stabilisation seen in the number of data suppliers. However, growth is higher among the larger data supplier companies, because investing as a data supplier requires resources that are not as readily available to smaller companies. Larger companies can afford individuals and departments whose sole purpose is to address the Data Market, while in smaller companies the development role often falls to individuals who have other responsibilities.

Table 7: Data Suppliers Forecast 2025 by Member State - Three Scenarios (Units; ‘000); CAGR 2025-2020 (%)

| Member State | 2025 Challenge | 2025 Baseline | 2025 High Growth | CAGR 2025/2020 Challenge Scenario (%) | CAGR 2025/2020 Baseline Scenario (%) | CAGR 2025/2020 High Growth Scenario (%) |
|---|---|---|---|---|---|---|
| EU27 | 163,130 | 173,410 | 193,170 | 1.3% | 2.5% | 4.8% |
| EU27 + U.K. | 317,230 | 334,360 | 384,020 | 1.3% | 2.4% | 5.2% |

Source: European Data Market Monitoring Tool, IDC 2020

Data Users Forecasts at 2025

In the baseline scenario, long-term growth in the number of data user companies is highest in data-intensive industries such as Professional services and Retail, and lowest in Education, Construction and Healthcare. The largest companies show the highest growth in adoption, as the Data Economy will be crucial to their success and competitive advantage: without a data-oriented approach to business and business decisions, these companies will not see the opportunities their competitors see and so will not grow at the same rate. However, these larger companies are a small share of the overall number of companies, so although their number will grow at a compound rate of 25.6% to 2025, compared with 0.9% for those in the smaller size band, they do not add significantly to the total number of data companies.

Table 8: Data Users Forecast 2025 by Member State - Three Scenarios (Units; ‘000); CAGR 2025-2020 (%)

Indicator 2 – Data Users – Forecast 2025

| Member State | 2025 Challenge | 2025 Baseline | 2025 High Growth | CAGR 2025/2020 Challenge Scenario (%) | CAGR 2025/2020 Baseline Scenario (%) | CAGR 2025/2020 High Growth Scenario (%) |
|---|---|---|---|---|---|---|
| EU27 | 562,280 | 582,750 | 626,630 | 0.7% | 1.4% | 2.9% |
| EU27 + U.K. | 753,380 | 779,150 | 845,330 | 0.7% | 1.4% | 3.1% |

Source: European Data Market Monitoring Tool, IDC 2020


Measuring Data Companies’ Revenues

Data companies’ revenues correspond to the aggregated value of all the data-related products and services generated by Europe-based data suppliers, including exports outside the EU.

Data Companies’ Revenues in 2018 and 2019

Revenues generated by data suppliers have registered a constant increase since 2013 according to our initial measurements and the Monitoring Tool. In 2019, in particular, revenues have increased by 9% to reach more than 64 Billion Euro in EU27 and 83 Billion Euro in EU27 plus the U.K. However, the share of the data suppliers’ revenues on the total companies’ revenues in the ICT and Professional services sectors dropped to 3.0% in EU27 and 3.1% in EU27 plus the U.K. in 2018 following a strong growth in ICT sales across the board in the year. A weakness in spending in the U.K. due to the economic uncertainty in this Member State pulled down overall revenues for data companies.

Table 9: Data Companies’ Revenues and Growth, 2016-2020 (€, Million; %)

Indicator 3 — Data Companies’ Revenues and Growth

| N. | Region | Name | Description | 2016 | 2017 | 2018 | 2019 | 2020 | Growth 2019/2018 |
|---|---|---|---|---|---|---|---|---|---|
| 3.1 | EU27 | Total revenues of data companies in the EU | Total revenues of the Data Suppliers calculated by Indicator 2 | 47,178 | 52,479 | 58,948 | 64,262 | 71,050 | 9.0% |
| 3.1 | EU27 + U.K. | Total revenues of data companies in the EU | Total revenues of the Data Suppliers calculated by Indicator 2 | 61,781 | 68,846 | 77,297 | 83,545 | 91,318 | 8.1% |
| 3.2 | EU27 | Share of data companies’ revenues | Ratio between Data Suppliers’ revenues and total companies’ revenues in sectors J and M | 3.0% | 3.2% | 3.4% | 3.0% | NA | 4.5% |
| 3.2 | EU27 + U.K. | Share of data companies’ revenues | Ratio between Data Suppliers’ revenues and total companies’ revenues in sectors J and M | 3.1% | 3.3% | 3.5% | 3.7% | NA | 4.6% |

Source: European Data Market Monitoring Tool, IDC 2020

Data Companies’ Revenues Forecasts at 2025

Data companies’ revenues within the EU grow faster than the IT market as these products and services become more mainstream. However, the four major contributing countries (U.K., Germany, France, Italy) lose share of the European Data Market between 2019 and 2025, as their share of EU revenues falls from 66% to 63%. Italy, currently the fourth largest data market in Europe, is displaced by the Netherlands by 2025, according to the Baseline scenario. This is not a failing of these countries but reflects the catching up of smaller companies as their revenues rise. The four leading countries have a greater share of larger companies while new entrants to the market are more likely to be smaller companies, which means a disproportionate growth for the smaller member states when compared to the larger ones. Over the period 2019-2025 data companies’ revenues rise by 7.0% annually, while total IT spending rises over the same period at 1.6%.

Table 10: Data Companies' Revenues Forecast – Total number in the EU27 and EU27 plus U.K. and Growth rates - Three Scenarios (€, Million)

Indicator 3 — Data Companies’ Revenues - Forecast 2025

| Region | 2025 Challenge | 2025 Baseline | 2025 High Growth | CAGR 2025/2020 Challenge Scenario (%) | CAGR 2025/2020 Baseline Scenario (%) | CAGR 2025/2020 High Growth Scenario (%) |
|---|---|---|---|---|---|---|
| EU27 | 80,148 | 98,623 | 136,350 | 2.4% | 6.8% | 13.9% |
| EU27 + U.K. | 103,796 | 127,976 | 181,795 | 2.6% | 7.0% | 14.8% |

Source: European Data Market Monitoring Tool, IDC 2020

2.4 The Business and Economic Dimension: The Data Market and the Data Economy

Measuring the Data Market

The Data Market is the marketplace where digital data is exchanged as “products” or “services” as a result of the elaboration of raw data.

The Data Market in 2018 and 2019

The value of the Data Market in 2019 for both EU27 and EU27 plus the U.K. continues to grow faster than total IT spending, at 4.9% year-on-year, and is expected to surpass the threshold of 60 billion Euro in 2020 in EU27 according to the Baseline scenario described in our previous study. This represents a constant and significant progression if we consider that the total value of the Data Market in EU27 was estimated at 42.6 Billion Euro in 2015 in our previous study and that our current estimates measure the Data Market at 55.5 Billion Euro in 2018.

Table 11: Data Market Value and Growth, 2016-2020 (€, Million; %)

Indicator 4 — Value and Growth of the Data Market

| N. | Market | Name | Description | 2016 | 2017 | 2018 | 2019 | 2020 | Growth 2019/2018 |
|---|---|---|---|---|---|---|---|---|---|
| 4.1 | EU27 | Value of the Data Market | Estimate of the overall value of the Data Market | 46,183 | 50,604 | 55,486 | 58,214 | 62,244 | 4.9% |
| 4.1 | EU27 + U.K. | Value of the Data Market | Estimate of the overall value of the Data Market | 59,496 | 65,286 | 71,787 | 75,274 | 80,253 | 4.9% |

Source: European Data Market Monitoring Tool, IDC 2020


The Data Market Forecasts at 2025

Our estimates of the Data Market value in 2025 under the High Growth scenario continue to show buoyant growth, with IT spending on Data Market tools almost doubling over the period from 2019 to 2025 for both EU27 and EU27 plus the U.K. This corresponds to a considerable CAGR for the period 2020-2025 of 11.5% in EU27 and 12.0% in EU27 plus the U.K. This is marginally down when compared with the previous publication, mostly as a result of a more buoyant 2020. Our new 2025 Baseline scenario shows that the Data Market will amount to more than 82 billion Euro in EU27, against 58.2 billion Euro in 2019 (a 5.8% CAGR 2020-2025), while under the Challenge scenario the Data Market will still represent 72.3 billion Euro, growing at a compound annual growth rate of 3.0% from 2020. The Data Market growth will therefore continue unabated to 2025, confirming the trend set out in 2013-2014 while elaborating our initial results of the European Data Market Study (SMART 2013/0063). These forecasts for 2025 are only marginally changed from the previous forecast, with the Challenge and High Growth scenarios down 0.6% and 0.7% respectively, while the Baseline scenario is down 0.7% when compared with the previous forecast for the EU27 plus the U.K. The impact of Brexit might be a little more negative than anticipated last year, but the potential for higher growth is confirmed.

Table 12: Data Market Forecast 2025 - Total number in the EU27 and EU27 plus U.K. and Growth rates - Three Scenarios (€, Million)

Indicator 4 — Data Market - Forecast 2025

| Region | 2025 Challenge | 2025 Baseline | 2025 High Growth | CAGR 2025/2020 Challenge Scenario (%) | CAGR 2025/2020 Baseline Scenario (%) | CAGR 2025/2020 High Growth Scenario (%) |
|---|---|---|---|---|---|---|
| EU27 | 72,329 | 82,564 | 107,139 | 3.0% | 5.8% | 11.5% |
| EU27 + U.K. | 93,056 | 105,638 | 141,507 | 3.0% | 5.7% | 12.0% |

Source: European Data Market Monitoring Tool, IDC 2020
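The compound annual growth rates in Table 12 follow from the 2020 value in Table 11 and the 2025 scenario values. A minimal sketch of the calculation for EU27 is given below, assuming the standard CAGR definition (end value over start value, raised to one over the number of years, minus one); the names `cagr`, `BASE_2020` and `SCENARIOS_2025` are illustrative only.

```python
# Sketch: recompute the EU27 CAGR 2025/2020 figures of Table 12.
# Assumes the standard CAGR formula; all names are hypothetical.

def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values, as a percentage."""
    return ((end_value / start_value) ** (1.0 / years) - 1.0) * 100.0

BASE_2020 = 62_244  # EU27 Data Market value in 2020 (Table 11, million Euro)

SCENARIOS_2025 = {  # EU27 Data Market value in 2025 (Table 12, million Euro)
    "Challenge": 72_329,
    "Baseline": 82_564,
    "High Growth": 107_139,
}

for name, value_2025 in SCENARIOS_2025.items():
    print(f"{name}: {cagr(BASE_2020, value_2025, 5):.1f}%")
```

Rounded to one decimal place, the three rates match the 3.0%, 5.8% and 11.5% reported for EU27.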

Measuring the Data Economy

The Data Economy measures the overall impacts of the Data Market on the economy as a whole. It involves the generation, collection, storage, processing, distribution, analysis, elaboration, delivery, and exploitation of data enabled by digital technologies.

The Data Economy includes the direct, indirect, and induced effects of the Data Market on the economy.

• The direct impacts are the initial and immediate effects generated by the data suppliers; they represent the activity potentially engendered by all businesses active in data production. The quantitative direct impacts are measured as the revenues from data products and services sold, i.e. the value of the Data Market.

• The indirect impacts are the economic activities generated along the company's supply chain by the data suppliers. There are two different types of indirect impacts: the backward indirect impacts and the forward indirect impacts.

• The induced impacts include the economic activity generated in the whole economy as a secondary effect.


The Data Economy in 2018 and 2019

The value of the Data Economy for the EU27 plus the U.K. has been estimated to exceed the threshold of 324 Billion Euro in 2019, overall confirming the estimates presented in the previous deliverable. The estimated share of total impacts on GDP in EU27 plus the U.K. is 2.6% in 2018 and is expected to grow to 2.8% in 2019.

Table 13: Data Economy Value and Growth, 2016-2020 and Impacts on GDP 2018-2019 (€, Million; %)

Source: European Data Market Monitoring Tool, IDC 2020

The Data Economy Forecasts at 2025

The new estimations of the Data Economy put its 2019 value for EU27 at nearly 325 Billion Euro, rising to more than 355 Billion Euro in 2020, a growth of 9.3%. The estimated CAGR for the period 2020/2025 in EU27 remains healthy, at 9.1%. The share of the Data Economy on GDP in the EU27 Baseline scenario at 2025 is 4.0%.

The CAGR 2020/2025 in EU27 for the High Growth scenario is 18.4%, which will take the Data Economy for EU27 beyond 827 Billion Euro, accounting for 5.9% of GDP at 2025. In the Challenge scenario the CAGR 2020/2025 for EU27 is 4%, less than half that of the Baseline, with the Data Economy being just above 430 Billion Euro and accounting for 3.3% of GDP at 2025.

Indicator 5 — Value and Growth of the Data Economy

| N. | Name | Description | 2016 | 2017 | 2018 | 2019 | 2020 | Growth 2019/2018 | Growth 2020/2019 | Impact on GDP 2018 | Impact on GDP 2019 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 5.2 | Value of the Data Economy EU27 | Value of direct, indirect and induced impacts on the economy | 238,699 | 267,986 | 301,637 | 324,858 | 355,396 | 7.7% | 9.3% | 2.4% | 2.6% |
| 5.2 | Value of the Data Economy EU27+U.K. | Value of direct, indirect and induced impacts on the economy | 299,989 | 336,602 | 377,871 | 406,468 | 443,925 | 7.6% | 9.2% | 2.6% | 2.8% |


Table 14: Data Economy Forecast in 2025 and Impacts on GDP according to the Three Scenarios (€, Million; %)

Source: European Data Market Monitoring Tool, IDC 2020

As in the previous study, this report provides a detailed insight into the Data Economy by type of impact: direct, indirect and induced. The pie chart below provides an overview of the distribution of the Data Economy by type of impact in 2025 for EU27 in the Baseline scenario. It is worth highlighting how the composition of impacts changes over time, from 2019 to 2025, in favour of induced impacts, thus revealing the effects of data access, data product and services exchange, and data value distribution in the economy. As the data economy matures, the impacts on the general economy (induced) become as relevant as those on the European industry (indirect impacts). The data industry remains the prime motor of this economy but its direct impacts as a share of total impacts decrease.

Indeed, induced impacts in 2025 account for a share of 42%, gaining 9 percentage points with respect to 2019. Indirect impacts in turn will lose around 4% of share, but will still account for a very high percentage (43%) in 2025. In 2019 the indirect impacts are still the most relevant, as highlighted in the previous publication (with forward impacts driving the effect); by 2025 induced impacts will have increased to a share similar to that of the indirect impacts.

Figure 3: Data Economy by Type of Impact, EU27, Baseline scenario 2025 (%)

Source: European Data Market Monitoring Tool, IDC 2020

[Pie chart, Baseline Scenario 2025, EU27: Direct Impacts 15%, Indirect Impacts 43%, Induced Impacts 42%]

| N. | Name | Description | 2025 Challenge Scenario | 2025 Baseline Scenario | 2025 High Growth Scenario | Impacts on GDP 2025 Challenge Scenario | Impacts on GDP 2025 Baseline Scenario | Impacts on GDP 2025 High Growth Scenario |
|---|---|---|---|---|---|---|---|---|
| 5.2 | Value of the Data Economy EU27 | Value of direct, indirect and induced impacts on the economy | 432,360 | 549,783 | 827,089 | 3.3% | 4.0% | 5.9% |
| 5.2 | Value of the Data Economy EU27+UK | Value of direct, indirect and induced impacts on the economy | 536,715 | 674,263 | 1,036,709 | 3.4% | 4.2% | 6.2% |
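Since the direct impacts are defined as the value of the Data Market, the 15% share of direct impacts shown in Figure 3 can be cross-checked against the Baseline 2025 values in Tables 12 and 14. The sketch below is a consistency check under that assumption only, not a description of how the Monitoring Tool derives the shares; the variable and function names are illustrative.

```python
# Sketch: share of direct impacts in the EU27 Data Economy, Baseline 2025.
# Assumes direct impacts equal the Data Market value, as the definition of
# direct impacts states. All names are hypothetical.

DATA_MARKET_2025 = 82_564    # EU27 Baseline, Table 12 (million Euro)
DATA_ECONOMY_2025 = 549_783  # EU27 Baseline, Table 14 (million Euro)

def share_of_total(part: float, total: float) -> float:
    """Return `part` as a percentage of `total`."""
    return part / total * 100.0

direct_share = share_of_total(DATA_MARKET_2025, DATA_ECONOMY_2025)
other_share = 100.0 - direct_share  # indirect plus induced impacts combined

print(f"Direct impacts: {direct_share:.0f}% of the Data Economy")
print(f"Indirect and induced impacts: {other_share:.0f}%")
```

The rounded direct share agrees with the 15% in the pie chart, and the remainder corresponds to the combined 43% and 42% of indirect and induced impacts.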


2.5 The International Dimension - The Data Economy Beyond the EU – US, Brazil and Japan

The U.S.

The growth in the number of data professionals in 2019 was in the middle of the three internationals but the country’s share of total employment was the lowest – and lower than in Europe – reflecting the strengthening of the U.S.A. economy and improvements in employment there. In comparison with the EU28, the US shows the lowest growth in Data Professionals employment share of Total Employment in 2019 (1.7% and 0.3% year-on-year growth in the number of data professionals and employment share in 2019 over 2018 respectively).

Data Professionals growth slowed marginally in 2019 when compared to 2018. The same applies to the data supplier companies’ indicators, with the highest increase of data suppliers in 2019 in the European Union, while among the internationals none grew less than the US in 2018 (2.3% for the EU28 vs. 1.0% in the USA, and 1.6% in Brazil and Japan). Revenues were strong for the USA, at 11.6% and 12.7% in 2018 and 2019, but in 2018 this was bettered by the EU28 at 12.2%, and Japan followed very closely behind at 10.5%. The USA showed stronger growth in 2019 though, with the highest growth among the internationals and the member states.

Table 15: USA Indicators - Overview 2016 – 2020


USA – Indicators’ Overview

| N. | Name | Metrics | 2016 | 2017 | 2018 | 2019 | 2020 | Growth ‘19/‘18 |
|---|---|---|---|---|---|---|---|---|
| 1.1 | Number of Data professionals | Total Number of Data professionals (Thousands) | 12,732 | 13,857 | 14,105 | 14,350 | 14,593 | 1.7% |
| 1.2 | Data professionals’ employment share | % of Data professionals on total employment | 8.42% | 9.04% | 9.06% | 9.08% | 9.11% | 0.3% |
| 2.1 | Number of Data Suppliers | Total number of data supplier companies (000s) | 289,556 | 303,552 | 309,263 | 312,215 | 316,190 | 1.0% |
| 3.1 | Revenues of Data Companies | Total revenues generated by companies specialized in the supply of data-related products and services (Million €) | € 129,173 | € 146,970 | € 163,993 | € 184,873 | € 211,349 | 12.7% |
| 4.1 | Value of the Data Market | Estimate of the overall value of the Data Market (Million €) | € 129,173 | € 146,970 | € 163,993 | € 184,873 | € 211,349 | 12.7% |
| 4.2 | Value of the Data Economy (Only Direct and Backward Indirect impacts) | Direct Impacts (Million €) | € 108,521 | € 146,966 | € 158,283 | € 178,450 | € 204,013 | 12.7% |
| 4.2 | Value of the Data Economy (Only Direct and Backward Indirect impacts) | Backward Indirect Impacts (Million €) | € 7,270 | € 7,860 | € 8,769 | € 9,500 | € 11,463 | 8.3% |
| 4.3 | Incidence of the Data Economy on GDP (Only direct and backward indirect impacts) | Ratio between value of the Data Economy and GDP (%) | 0.78% | 1.03% | 1.11% | 1.19% | 1.34% | 6.8% |

Source: European Data Market Monitoring Tool, IDC 2020


Brazil

Brazil’s economy showed little sign of recovery in 2019 but had improved by the end of the year to match the (weak) growth seen in 2018. Confidence in the economy improved, particularly in the last quarter of 2019. The exchange rate fell again, although only slightly, from 0.27 to 0.25 US dollars per Real. Investment remains weak and industrial output is falling too. Unemployment in the country is high, fluctuating between 12% and 13% with little sign of any downward trend. However, the data economy is improving, with stronger growth in data supplier revenues: up by 7.2% in 2019 compared with 5.4% in 2018.

Table 16: Brazil Indicators - Overview 2016 -2020

Brazil – Indicators’ Overview

| N. | Name | Metrics | 2016 | 2017 | 2018 | 2019 | 2020 | Growth rate 2019/2018 |
|---|---|---|---|---|---|---|---|---|
| 1.1 | Number of Data professionals | Total Number of Data professionals (Thousands) | 1,160 | 1,175 | 1,200 | 1,211 | 1,215 | 0.9% |
| 1.2 | Data professionals’ employment share | % of Data professionals on total employment | 1.81% | 1.84% | 1.86% | 1.88% | 1.89% | 1.2% |
| 2.1 | Number of Data Suppliers | Total number of data supplier companies (000s) | 35,979 | 36,906 | 37,605 | 38,192 | 38,477 | 1.6% |
| 3.1 | Revenues of Data Companies | Total revenues generated by companies specialized in the supply of data-related products and services (Million €) | € 6,049 | € 6,998 | € 7,373 | € 7,905 | € 8,374 | 7.2% |
| 4.1 | Value of the Data Market | Estimate of the overall value of the Data Market (Million €) | € 6,049 | € 6,998 | € 7,373 | € 7,905 | € 8,374 | 7.2% |
| 4.2 | Value of the Data Economy (Only Direct and Backward Indirect impacts) | Direct Impacts (Million €) | € 6,157 | € 6,996 | € 7,380 | € 7,986 | € 8,536 | 8.2% |
| 4.2 | Value of the Data Economy (Only Direct and Backward Indirect impacts) | Backward Indirect Impacts (Million €) | € 290 | € 335 | € 353 | € 374 | € 384 | 5.9% |
| 4.3 | Incidence of the Data Economy on GDP (Only direct and backward indirect impacts) | Ratio between value of the Data Economy and GDP (%) | 0.16% | 0.17% | 0.21% | 0.23% | 0.24% | 8.8% |

Source: European Data Market Monitoring Tool, IDC 2020

Japan

The indicators measuring the state of the Data Market and the Data Economy in Japan all showed growth in 2019, in some cases substantial growth. The number and employment share of data professionals grew the fastest among the internationals (2.87% and 2.85% in 2019), but the EU28 showed even higher growth. The number of Data Suppliers also showed the highest growth among the internationals, at 1.6% in 2019, but was again bettered by the EU28. Data revenues showed commensurate growth but did not reach the levels shown by the USA in 2019. The incidence of the data economy on the total economy showed only a small improvement in 2019, but the expectation is for a more notable improvement in 2020 (see table below).


Table 17: Japan Indicators - Overview 2016 - 2020

Japan – Indicators’ Overview

| N. | Name | Metrics | 2016 | 2017 | 2018 | 2019 | 2020 | Growth rate 2019/2018 |
|---|---|---|---|---|---|---|---|---|
| 1.1 | Number of Data professionals | Total Number of Data professionals (Thousands) | 3,740 | 4,045 | 4,118 | 4,236 | 4,324 | 2.9% |
| 1.2 | Data professionals’ employment share | % of Data professionals on total employment | 5.82% | 6.20% | 6.20% | 6.37% | 6.45% | 2.8% |
| 2.1 | Number of Data Suppliers | Total number of data supplier companies (000s) | 101,612 | 104,587 | 105,273 | 106,983 | 107,612 | 1.6% |
| 3.1 | Revenues of Data Companies | Total revenues generated by companies specialized in the supply of data-related products and services (Million €) | € 25,513 | € 26,720 | € 29,799 | € 32,929 | € 37,019 | 10.5% |
| 4.1 | Value of the Data Market | Estimate of the overall value of the Data Market (Million €) | € 25,513 | € 26,720 | € 29,799 | € 32,929 | € 37,019 | 10.5% |
| 4.2 | Value of the Data Economy (Only Direct and Backward Indirect impacts) | Direct Impacts (Million €) | € 27,394 | € 27,296 | € 30,074 | € 32,500 | € 37,287 | 8.1% |
| 4.2 | Value of the Data Economy (Only Direct and Backward Indirect impacts) | Backward Indirect Impacts (Million €) | € 1,189 | € 1,230 | € 1,330 | € 1,454 | € 1,689 | 9.3% |
| 4.3 | Incidence of the Data Economy on GDP (Only direct and backward indirect impacts) | Ratio between value of the Data Economy and GDP (%) | 0.93% | 0.96% | 1.08% | 1.09% | 1.25% | 0.8% |

Source: European Data Market Monitoring Tool, IDC 2020


International Overview and Comparison with the EU

In line with the results obtained by the previous round of measurement of the international indicators (European Data Market Study Update (SMART 2016/0063), D2.7 Final Report on Facts & Figures), the U.S. retained its leadership in the number of data professionals in 2019 and out to 2020 (see Figure below), with an estimate of nearly 14.5 million in 2019. However, annual growth in 2019, at 1.7%, is behind the European Union and Japan. Over the longer term, compound growth from 2016 to 2020 is 3.5%, higher only than Brazil. Brazil consistently shows the lowest annual growth in the number of data professionals and, unsurprisingly, the lowest compound growth too. Economic issues in the country reduce industrial and business growth, which is also reflected in the digital economy. Long term growth in Brazil is seen as a compound growth of 1.2% out to 2020. Japan is generally the best of the internationals, with compound growth of 3.7% out to 2020. However, none of the internationals comes close to the longer-term growth in the number of Data Professionals seen in the Member States: 7.5% CAGR out to 2020.
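The compound growth figures quoted for the three international partners can be derived from the 2016 and 2020 numbers of data professionals in Tables 15 to 17. The sketch below assumes the standard CAGR formula over the four years from 2016 to 2020; the names `cagr`, `PROFESSIONALS` and `rates` are illustrative and do not come from the study.

```python
# Sketch: CAGR 2016-2020 in the number of data professionals, using the
# 2016 and 2020 values from Tables 15-17. All names are hypothetical.

def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values, as a percentage."""
    return ((end_value / start_value) ** (1.0 / years) - 1.0) * 100.0

# (2016, 2020) number of data professionals, in thousands
PROFESSIONALS = {
    "U.S.": (12_732, 14_593),
    "Brazil": (1_160, 1_215),
    "Japan": (3_740, 4_324),
}

rates = {
    country: cagr(v2016, v2020, 4)
    for country, (v2016, v2020) in PROFESSIONALS.items()
}

# List the countries from fastest to slowest compound growth
for country, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{country}: {rate:.1f}%")
```

Rounded to one decimal place, the rates reproduce the 3.7%, 3.5% and 1.2% cited for Japan, the U.S. and Brazil, in that order.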

Figure 4: Number of Data Professionals by Country, 2016-2020, Growth 2019 (Units; ‘000, %)

Source: European Data Market Monitoring Tool, IDC 2020

Figure 5: Number of Data Suppliers in the U.S., Brazil, Japan and EU, 2016-2020, Growth 2019 (Units; ‘000, %)

Source: European Data Market Monitoring Tool, IDC 2020

The picture for Data Professionals is similar to the one related to Data Suppliers, with the US displaying moderate growth year-on-year in 2019 but stronger longer term growth over the period 2016-2020. However, the dominance of the USA in the number of Data Supplier Companies is under threat, with the EU27 plus the U.K. displaying solid year-on-year growth and featuring a total number of data suppliers very close to that of the U.S. The figure below shows the relative positions of Data Suppliers among the participating regions.

The value of the Data Market continues to increase at around 10% across all the three EU partners taken into consideration in this report, ahead of the number of data professionals and data companies. This is expected, as companies aim to increase revenue, so growth in the revenue of Data Suppliers should be well ahead of the growth in the number of Data Suppliers. The data market is a dynamic one: companies that cannot increase their revenue (through marketing and sales efforts, and through product development) are more likely to exit this market, as their business model will be built on improving profitability in the longer term through revenue growth. Demand continues to grow as data users better appreciate the value of digital transformation and are more able to adopt digital practices in their organisations. However, a large part of organisations' digital transformation activities tends to be directed towards cost reduction and efficiency improvements rather than fundamental changes in the way these organisations develop and conduct business.

The Data Market sees the U.S. easily retaining its lead position for the foreseeable future, with more than 184 billion Euro in size and a buoyant year-on-year growth of 12.7% in 2019 over the previous year. Among the internationals, the EU is the only regional market able to challenge the U.S. for dominance in this industry, but it is a considerable distance from the country: the data market in the EU28 is less than 40% of that in the US.

Figure 6: Value and Growth of the Data Market in the U.S., Brazil, Japan and EU, 2016-2020, Growth 2019 (Units; ‘000, %)

Source: European Data Market Monitoring Tool, IDC 2020


3. Describing the data market – the quali-quantitative stories

The Final Report on Facts & Figures (D2.7) and the Final Report on Policy Conclusions (D2.8) were accompanied by three quali-quantitative stories concentrating on the operational, organisational and/or economic benefits generated by the use of data-driven technologies, with a special focus on Health Data-driven Innovation and Data Commons. While the first story investigated the unexplored potential stemming from the use of Big Data and Analytics (BDA) in healthcare, the second and the third stories focused on the role of Data Commons and analysed in depth the benefits of data-driven innovation and the potential role and impacts of common data spaces in several leading sectors targeted by the recently unveiled European Strategy for Data (COM (2020) 66 final, 02/19/20).

3.1. Story 6-7 Health Data and Data-driven Innovation in the European Healthcare Industry

Combining two quali-quantitative stories (Story 6-7), the research on health data and data-driven innovation highlighted how a growing number of healthcare systems in Europe are embarking on long-term reforms to improve outcomes and foster innovation, with the ultimate goal of benefiting patients while ensuring the long-term sustainability of healthcare services provision. The research revealed an untapped potential: the majority of healthcare providers (59%) have not yet adopted a Digital Transformation roadmap, and only 6% have established a single roadmap for Digital Transformation and general business strategy [7].

This unexplored potential of the use of Big Data and Analytics (BDA) in healthcare is eliciting a new wave of interest in data-driven value creation, which, in the medium to long run, will make it possible to reward performance rather than just volume. According to the IDC DX Sentiment survey 2018 [8], over 60% of European healthcare providers reported that developing data management is a priority. Indeed, being able to analyse and use data for process automation and decision support, in a granular, accurate, safe and context-relevant way, is key to long-term competitiveness and sustainability.

AI is still in its infancy: only 30% of European healthcare providers are already using or testing the technology or have immediate plans for it, in addition to another 23% evaluating AI use cases [9]. The top three BDA use cases that European healthcare providers are working on are Clinical decision support (16%), Illness progression (15%) and Patient engagement (14%) [10]. Patient engagement is a key priority for almost 40% of European healthcare providers, particularly among countries that are experimenting with value-based reimbursement models structured around patient experience and outcomes, while the most significant AI use cases that European healthcare providers are working on are:

[7] Source: IDC DX Executive Sentiment Survey, May 2018 (n=66)

[8] Source: IDC DX Executive Sentiment Survey, May 2018 (n=66)

[9] The IDC European Vertical Markets Survey is an annual landmark study of IT solutions, investment priorities, and emerging technologies. In the 2018-2019 version, the sample covers over 77% of the European economy (in terms of GDP of the 40 countries). Respondents are distributed across Western Europe (UK, France, Germany, Italy, Spain, Netherlands, the Nordics) and Central & East Europe (Russia, Poland, and Czech Republic). The survey was conducted in the native language of each country, using either telephone interviewing (CATI) or web interviewing (CAWI) systems. Eligible respondents are individuals primarily involved in IT and/or business decisions at their companies and ranked director level or above. Results are analysed by vertical market and company size and represent a basis for a series of demand-side reports published by IDC.

[10] Data retrieved from IDC European Vertical Markets Survey, 2018–2019 (n= 290 [WE = 232, CEE = 58])


Personalization of clinical pathway (10%), Clinical decision support (9%) and Patient risk (9%) [11].

The most significant AI use cases that European healthcare providers are working on are:

• Personalization of clinical pathway. As PHM is transforming the traditional hospital-focused approach into value-based care that delivers more efficient outcomes using fewer resources, AI and machine learning algorithms can derive actionable insights from the untapped datasets essential for population health programs. AI can, for example, process real-time data to pinpoint specific demographics where health issues exist and target them precisely with ad hoc treatment programs.

• Clinical decision support. A machine learning system can provide a high level of clinical accuracy and coverage of a broad range of conditions. Symptoms can be entered in natural language and used to drive diagnoses and the level of care direction.

• Patient risk. AI can predict a patient's future health, such as the risk of contracting specific diseases, with better accuracy. AI systems can also predict the outcomes of hospital visits to prevent readmissions and shorten the time patients are kept in hospital.

• Compliance check. The healthcare industry is highly regulated, but maintaining compliance within evolving patchworks of national and regional regulations can be a strain on providers' limited resources. The adoption of AI in this area optimizes administrative procedures and frees experienced caregivers from routine tasks.

• Optimization of resource utilization. Machine learning and AI have the potential to provide front-line staff with the real-time insight needed to improve the speed and quality of the hundreds of decisions they make each day, and so improve the flow of patients through the various clinical services involved in delivering appropriate care.

Figure 7 Top three use cases for AI among European healthcare providers

Source: IDC European Vertical Markets Survey, 2018–2019 (n= 290 [WE = 232, CEE = 58])

We analyse four different case studies (see table below):

• The Secondary Use of Data in Finland - In the health and social welfare sector there has been extensive work between the public and private sectors to promote the secondary use of health and well-being data. This has led to the creation of a new ecosystem through a national development project, to the development of ground-breaking new legislation and to the establishment of a permit authority for the secondary use of health data. Interviews conducted with Jaana Sinipuro, Hannu Hämäläinen and Antti Kivelä (SITRA).

11 ibidem


• The Health Data Hub in France - In 2018, President Macron launched this initiative to establish a nationwide data platform. The project is now strongly pushed forward by the French Health Act "Ma Santé 2022", approved last July, which includes the creation of the "Espace Numérique de Santé" (ENS – e-health personal space) with the aim of establishing a more efficient and patient-oriented healthcare system by leveraging the power of data and artificial intelligence. The project's goal is to enrich and enhance the National Health Data System (SNDS - Système National des Données de Santé) by bringing the wider French heritage of health data together in one place, for open use by researchers, healthcare professionals, care institutions, start-ups, insurers, etc.

• The development of Data Policies in Portugal - In August 2019, the Portuguese Health Ministry made available the strategic document "From big Data to smart data: putting data to work for the public's health", which outlines the vision, key areas and principles for secondary use of data, advanced analytics and artificial intelligence to improve the health of the Portuguese population. Several initiatives and pilot projects are testing the capabilities of AI and providing evidence to support the development of new data management and governance policies.

• ARIA’s Health Data Warehouse and Business Intelligence Competency Center (Italy) – In July 2019, Regione Lombardia (Lombardy Region, Italy) established a new regional company called ARIA (Agency for Innovation and Procurement) from the merger of ARCA (the regional agency for procurement) and Lombardia Informatica (the regional in-house digital company). Within its mandate, ARIA has the specific aim of enhancing the value of all regional health data assets. Interviews conducted with Giuseppe Preziosi (ARIA).


Table columns: Country; Challenges/business needs; Key initiatives and application areas for secondary use of health data; Benefits (current or expected); Output to date

Finland

Challenges/business needs:
• Making use of the huge amount of health and social care data collected from different sources and registers
• Offering researchers and organizations looking to leverage health data for innovation a coherent and simple environment for accessing and using the data in a legally compliant way
• Maintaining public trust in the government's capability to manage data for the common good and with respect for individual rights

Key initiatives and application areas for secondary use of health data:
• Creation of a data access permit authority and the related organizational, infrastructural, and legal conditions
• Key application areas: innovation in clinical and public health research, as well as in private sector (pharma and life sciences) R&D to promote economic growth

Benefits (current or expected):
• Simple yet comprehensive permit procedure to get access to data for R&D purposes
• Strong knowledge base and experience in managing a complex ecosystem of stakeholders, sometimes with conflicting interests, and in enabling sustainable cooperation
• Establishment of a single authority that works as a one-stop shop for permits regarding several registers

Output to date:
Sitra launched 8 pre-production projects, whose lessons learned and outcomes have since been integrated into an action plan for the new permit authority once in operation.


France

Challenges/business needs:
• Fragmentation of the healthcare ecosystem
• The need for a secure governance framework enabling the ethical use of data for analysis and research
• The need to establish a regulated platform for data access and use, as well as for facilitating the interaction of multiple stakeholders (collecting, producing and using data)

Key initiatives and application areas for secondary use of health data:
• Establishment of a Health Data Hub (HDH) to manage a large number of data sources with harmonized rules for data access and use
• Key application areas include the development of RWE research, support to clinical trials, development of precision and personalized medicine, and predictive clinical decision support solutions

Benefits (current or expected):
• A harmonized standard of data access at national and international level
• Patients'/citizens' access to and control over the use of their data
• Accelerated innovation and personalized services

Output to date:
19 pilot projects have been launched, with focuses ranging from breast cancer to health surveillance, using predictive tools that provide insights able to drive decision making.

Portugal

Challenges/business needs:
• The need to transform the National Health Service into an intelligent, data-driven NHS to drive efficiency and better patient outcomes
• The need to define a framework for the secure collection, storage and reuse of health data

Key initiatives and application areas for secondary use of health data:
• Establishment of a national strategy for health data management and for secondary use
• Funding of AI-enabled research programs and projects aimed at the development of clinical decision support; personalization of clinical pathways; patient risk management; optimization of resource utilization

Benefits (current or expected):
Creation of
• A data infrastructure that supports population health management
• Predictive solution aimed at reducing skin cancer mortality
• Predictive solution aimed at optimizing the delivery of emergency services

Output to date:
Several initiatives and pilot projects are testing the capabilities of AI and providing evidence to support the development of new data management and governance policy.


Italy (Lombardy region)

Challenges/business needs:
• Need to integrate and use >10 years of collected health data in the regional Data Warehouse

Key initiatives and application areas for secondary use of health data:
• Development of infrastructural, organizational and policy framework for the use of and access to data
Predictive BDA models and Data health hub
• Patient risk
• Population risk management
• Illness progression and clinical decision support
• RWE-driven research

Benefits (current or expected):
• Enablement of new data-driven research streams and collaboration at national and international level
• Establishment of a PoC framework to predict cardiovascular risk with accuracy between 96.22% and 97.96%
• Launch of a pilot determining the clinical best next action for Alzheimer patients

Output to date:
Several initiatives to support the regional healthcare service in areas such as planning and management, cost rationalization, evaluation of the safety and efficiency of clinical pathways and integrated patient journeys, as well as health risk prevention.

In terms of the triggering challenges, security and data privacy concerns come first, due to the highly regulated environment of the healthcare industry. However, the common themes emerging from the case studies are threefold:

• The establishment of a regulatory framework for the use of and access to data. This involves multiple players in the public and private healthcare arena to establish partnerships and create a collaborative environment. It requires all stakeholders to agree on the value of data as a shared asset, and to actively promote initiatives where common standards and a one-stop-shop approach to data access are brought forward. This is the case of Portugal, where the Health Ministry authority is seeking to establish a new healthcare ecosystem based on a data-driven approach towards the delivery of healthcare services across the nation.

• The collection, processing and storage of, and access to, complex data coming from different structured (e.g. national health records) and unstructured (e.g. wearables) sources, along with semantic, geographical and time complexities. This is the case of ARIA and its collection of over 10 years of healthcare data stored in different locations, as well as Sitra's project, which is working on establishing a common framework and developing metadata descriptions. Additionally, the data collected require intelligent solutions, capabilities and skills to extract value from them and provide actionable areas for the deployment of information (i.e. population risk stratification, clinical decision support, personalization of clinical pathways, etc.).

• Maintaining trust and ensuring security. The high sensitivity of healthcare data requires a careful approach to identifying and enforcing regulatory frameworks and solutions that ensure information is treated in compliance with regional and country-specific policies. The adoption of GDPR strengthens and unifies data protection for individuals within the EU, regulating how data integration can happen safely. It gives individuals key control over the usage, processing and transfer of their personal data held by healthcare organizations. The transition to a VBHC model, in which care plans should be personalized and stakeholders should integrate their activities, must be underpinned by consistent information management governance that enables patient data integration across providers, processes and IT systems. GDPR is expected to provide a patient-centric ecosystem. In addition, authorities need to establish a high level of public trust in the ethical and secure use of healthcare and social population data for the public good. In Finland, for example, citizens are informed about how their data are used for secondary purposes outside primary care.

The case studies presented in this research highlight several benefits obtained by the organizations adopting AI/ML and BDA technologies:

• Easy and convenient access to intelligent solutions for clinicians and patients offers more opportunities to advance decision making and enhance clinical process efficiency at the point of care. Portugal's skin cancer screening solution is an example of how technology supports a collaborative clinical framework and enables the integration of information to serve population health management.

• More advanced predictive capabilities allow greater control over disease-specific variables impacting health outcomes, as well as over the costs and resource utilization associated with care. This approach makes it possible to target more efficiently the population segments at risk of developing chronic and long-term conditions, by putting in place initiatives aimed at promoting health and preventing or delaying the development of risk factors. ARIA's predictive BDA is an example of a predictive model able to effectively target cardiovascular conditions and offer an accurate estimate of the number of future cases in a specific geographical area.

• The adoption of BDA as part of their business intelligence strategy helps healthcare providers to improve overall operational efficiency. Advanced predictive analytics models support the estimation of admission rates, along with attrition rates, to help with staff allocation. In Finland, for example, the new Act on secondary use of data is also expected to support the planning and reporting duties of authorities. The aim is to reduce healthcare costs and focus more resources on the delivery of better healthcare.

• Big data can help uncover unknown correlations, hidden patterns and insights by examining large sets of data. By applying machine learning to big data, researchers can study human genomes and find the correct treatment or drugs for cancer or other rare diseases. Clinical studies are long and expensive to implement. They provide results on drug administration in a specific controlled frame that does not reflect real-life conditions. Moreover, it can be difficult to compare results between different clinical studies (indirect comparison of different new treatments). Real World Data (RWD) analysis could provide exhaustive, real-life analysis. Aware of the potential of RWD, health authorities are developing new evaluation frames with RWE and artificial intelligence. The HDH in France is an example of how a secure and regulated environment offers health actors access to relevant data, but also to the means needed to analyse these data, facilitating clinical trial initiatives.

3.2. Story 8 - Accelerating the Impact of Data Commons

Story 8 focused on the concept of “Data commons”, also referred to as data collaboratives, data spaces, open data partnerships, and common data infrastructures (Mishra et al. 2016; Perkmann and Schildt 2015; Susha et al. 2017). Data commons have been defined as platforms that openly share data and knowledge, with a computational infrastructure that supports data sharing across a heterogeneous base of users supporting different services on top of the data (Contreras 2010).


Data commons are currently managed by multiorganizational collaborations, usually a consortium or a group of organizations including competitors, that come together to share costs and resources to build a common infrastructure that supports data analysis and helps them extract value from the integration and re-use of the data shared.

Whatever the degree of sharing across companies, such efforts to build common data pools need to address several challenges, including:

- To generate the appropriate incentives for organizations to share some data

- To provide the conditions under which they are willing to do it, that is, a governance approach that defines rights and responsibilities across data owners and users

- To agree on a set of data formats, data structures and quality levels in which the data and its contextual information need to be shared, which implies an agreement on data standards and metadata to allow interoperability across systems

- To define a sustainability model that guarantees that such infrastructure and efforts are not only maintained over time but eventually scale.
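The third challenge above, agreeing on data formats, structure and metadata, can be illustrated with a minimal sketch of a shared metadata record that a data commons might require before accepting a dataset. The field names, the list of allowed formats and the validation rules are hypothetical and chosen only for illustration; they are not taken from any specific data commons or standard.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical exchange formats agreed by the members of the commons.
ALLOWED_FORMATS = {"csv", "json", "parquet"}


@dataclass
class DatasetRecord:
    """Illustrative minimum metadata attached to every shared dataset."""
    title: str
    owner: str            # organization contributing the data
    data_format: str      # agreed exchange format
    licence: str          # conditions under which members may re-use the data
    schema_version: str   # version of the agreed data structure
    keywords: List[str] = field(default_factory=list)


def validate(record: DatasetRecord) -> List[str]:
    """Return a list of problems; an empty list means the record is accepted."""
    problems = []
    for name in ("title", "owner", "licence", "schema_version"):
        if not getattr(record, name).strip():
            problems.append(f"missing {name}")
    if record.data_format.lower() not in ALLOWED_FORMATS:
        problems.append(f"unsupported format: {record.data_format}")
    return problems
```

A record that passes `validate` carries enough contextual information for another member to find and interpret the dataset; in practice the agreed metadata set would be negotiated by the consortium as part of its governance approach.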

One of the biggest challenges for data commons is generating the business case for companies to be willing to share their data with a pool of organizations via a common data infrastructure; and developing a long-term sustainability model that affords the different activities behind the operations that support such infrastructure.

Regarding the first challenge, several industries have realized the critical value created by sharing and re-using digital data (in short, data) for purposes unforeseen when the data was first collected. Data is considered a non-rival good, meaning that it can be reused and recombined to generate positive spillover effects (OECD 2015). Although the re-use of data is assumed to foster innovation and contribute to economic efficiency by avoiding duplicative investments, amongst other positive externalities (Drexl, 2018), data is still kept in silos and little re-use is reported (Pujol Priego et al. 2019). As of today, data is not subject to property rights, such as knowledge, but other legal constraints can inhibit the re-use of data, such as trade secret protection and the right to data portability in GDPR, amongst others. Most importantly, however, a company's decision to share its data for re-use depends on its actual interest in sharing it.

Sharing also raises different concerns. Depending on the sector, liability concerns may arise that prevent companies from sharing potentially sensitive data (e.g. clinical trial data) with the public, even when there are potential individual and collective benefits. Competitive concerns relate to how to share information without damaging the competitive interests and advantage of for-profit organizations. Besides these concerns behind a company's decision to engage in the development of data commons, there are different costs associated with doing so.

Worth noting are some operational costs related to the services provided by such data commons, which include data curation, periodic updating and monitoring of datasets, the agreement and implementation of data standards to ensure interoperability, data storage services and all activities needed to guarantee data security and privacy standards. Interoperability is an important requirement of data commons, permitting the aggregation and integration of datasets through a variety of tools and supporting their re-usability. Data commons must also be secure to preserve permissions, guarantee data protection and prevent corruption of data. This aspect is critical where data involves sensitive personal information (e.g. health-related, financial or legal data).


On top of such operational activities, additional tasks, depending on the scope of the data commons, may include the provision of data visualization tools that foster data re-use and operational support for users who seek to re-use the data and ask for additional contextual information or clarification, amongst others. See Grossman, 2018 for more exhaustive detail.

Potential benefits are related to efficiency gains and the generation of new products and services afforded by the re-use and recombination of others' data, fostering an ecosystem that co-creates value around data. Data is a non-rivalrous resource, so the same data can support the generation of heterogeneous products and services. Any company can engage with the same data in different data-sharing arrangements, so the potential value that can be extracted from the same data is unlimited. For instance, as some of the data commons cases reveal, properly designed data commons can serve R&D processes as an active and accessible repository for research data; as a platform for reproducing research results; and as a support for discoveries, as data and new algorithms are added and new software applications and tools are integrated with the common pool of data. Benefits also include reduced data silos across organizations and integrated workflows in the cloud, taking advantage of a common cloud solution. This would also elicit a dynamic ecosystem in which software vendors, oil companies, academia and an open community develop projects (competitive dynamics) on top of a jointly developed common data platform (cooperation).

As the types of data commons grow and diversify, we can expect to see a variety of higher-order services on top of such infrastructures with different offerings operating within and across data commons while displaying different data value-chain configurations - including third-party contributions - to maximize the value of the data. European policy in this domain is certainly necessary to facilitate the creation of sufficient critical mass. But it should draw on existing experience for building such “data spaces”. In particular, what emerges clearly is that there are great challenges to data pooling that can only be addressed by sophisticated and carefully designed governance mechanisms. Data pooling, to be effective, needs to be framed in a wider context of precompetitive collaboration between companies, and needs to have a compelling business case: in other words, it has to be demand driven.

3.3. Story 9 – Scaling up data-driven innovation: European industry requirements and the role of European data spaces

The strategic objective of the new European Strategy for Data (COM (2020) 66 final, 02/19/20) is to make Europe a global leader in the data-agile economy, by creating a favorable policy environment and a genuine single market for data. A pillar of this strategy will be the creation of common European data spaces in strategic economic sectors and domains of public interest, where data-driven innovation will have a systemic impact on the entire ecosystem and on citizens. To become operational, data spaces will need to develop data governance mechanisms and access to high-value datasets to enable data-driven innovation within vertical ecosystems and foster their development.

The European Commission has investigated the potential needs and requirements for common data spaces in a series of workshops with stakeholders12 as well as several other initiatives. Story 9 looks at common data spaces from the viewpoint of European industries that have already invested in data-driven innovation, have achieved measurable business benefits and are engaged in scaling up these efforts. This provides an evidence-based and industry-specific view of the pragmatic requirements of Common European data spaces, with a focus on the requirements for data governance, access to data and access to infrastructures. The results show that the path towards data spaces effectively supporting data sharing at ecosystem level will not be easy. Enterprises are still concentrating most of their efforts and investments on developing data-driven innovation within their organization, at best sharing some data with a few trusted sub-suppliers. The most relevant barrier to scalability comes from the cost of cloud infrastructures and the dependency on a few global suppliers, such as AWS (Amazon Web Services), resulting in potential customer lock-in effects.

12 Report on the European Commission's Workshops on Common European Data Spaces https://ec.europa.eu/digital-single-market/en/news/report-european-commissions-workshops-common-european-data-spaces

The story leverages the set of 18 case studies developed by Politecnico di Milano (POLIMI) and IDC across seven industries within the context of the H2020 DataBench project13, collecting data about the business impacts of the adoption of Big Data and Analytics technologies (Figure 8). These case studies show a good level of business benefits achieved from data-driven innovation, with a high level of cost reduction (such as an 80% reduction of operational expenditures for fraud detection in the financial services industry and a 30% reduction of maintenance costs in manufacturing thanks to predictive maintenance) and customer benefits (for example, a 110% improvement of customer retention in manufacturing and an 85% improvement of conversion rates from potential to actual customers thanks to data-driven targeting in retail). These case studies represent operational services and applications, but most of them are still confined to individual departments or branches of the company and are in the process of being scaled up to the whole organization. The lessons learned from scaling up these services highlight the following problems and risks:

• Even when the technology solution has been well selected and designed, there is no absolute guarantee that scaling it to the whole organization will be feasible and cost-effective until it is actually implemented.

• Business superficiality: business intelligence is always needed to lead the strategic use of data, and this is still found in human resources rather than in machines. For example, automated recommendation systems built on standard solutions will tend to recommend the products with highest sales, which are likely to be the products with lower prices and lower margins. A business manager must intervene to provide an intelligent strategy, for example finding ways to nudge customers towards products with higher prices and margins. Integrating business intelligence with the use of data analytics is still immature in many industries.

• Most of these solutions rely on public cloud technologies. This creates a lock-in risk, since migrating to other cloud providers requires considerable costs and time investments for the redesign of software. The high concentration of the cloud provider market reduces the potential choices of business users and constrains the scaling up of data-driven innovation.

• When scaling up, particularly if the solution requires real-time data processing, the cloud computing costs tend to rise very quickly and cross a threshold where technology costs are higher than business benefits (edge vs. cloud decisions). This is a sensitive aspect, particularly for solutions combining BDA (Big Data Analytics) and Artificial Intelligence such as machine learning.
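The recommendation example in the second point above can be made concrete with a small sketch. The product figures are invented for illustration, and ranking by total margin contribution is just one simple business rule that could be layered on top of a sales-based ranking; it is not the method used in any of the case studies.

```python
# Hypothetical catalogue: (product, units sold per month, unit price in EUR,
# margin as a share of price). All figures are invented for illustration.
CATALOGUE = [
    ("basic kettle", 900, 15.0, 0.10),
    ("standard kettle", 400, 35.0, 0.25),
    ("premium kettle", 150, 80.0, 0.40),
]


def rank_by_sales(products):
    """Default behaviour of a standard recommender: promote what sells most."""
    return [p[0] for p in sorted(products, key=lambda p: p[1], reverse=True)]


def rank_by_margin_contribution(products):
    """Business rule: rank by total margin generated (units x price x margin)."""
    return [
        p[0]
        for p in sorted(products, key=lambda p: p[1] * p[2] * p[3], reverse=True)
    ]


print(rank_by_sales(CATALOGUE))
print(rank_by_margin_contribution(CATALOGUE))
```

With these invented figures the sales-based ranking puts the cheapest, lowest-margin product first, while the margin-based ranking reverses the order. Choosing and tuning such a rule is the kind of intervention the case studies attribute to the business manager.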

13 Evidence-Based Big Data Benchmarking to Improve Business Performance, www.databench.eu


The cloud computing cost issue is one of the relevant results of the case study analysis. Basic cloud services are quite affordable, but prices increase very fast as soon as more sophisticated services are needed. For example, in the retail industry, a leading supermarket chain found that leveraging AI for sales prediction in one shop led to a 5% increase of margins (equivalent to roughly 5 million €/year); but applying the same machine-learning application to all shops in the chain would wipe out the benefits and cost more than the margin increase. These issues are behind the decision of the European Commission to promote federated European cloud infrastructures. The European data strategy in fact plans to fund a High Impact Project on European data spaces and federated cloud infrastructures.
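The threshold effect described above can be sketched as a simple break-even calculation. The benefit per shop (5 million €/year) is taken from the supermarket example; the cost model, in which cloud costs grow faster than linearly with the number of shops, and its parameters `base_cost` and `exponent` are assumptions made purely for illustration and do not come from the case study.

```python
def net_benefit(shops, benefit_per_shop, base_cost, exponent):
    """Yearly net benefit (million EUR) of rolling the application out to `shops` shops."""
    benefit = benefit_per_shop * shops        # benefit assumed linear in the number of shops
    cost = base_cost * shops ** exponent      # cost assumed superlinear (exponent > 1)
    return benefit - cost


def largest_profitable_rollout(max_shops, benefit_per_shop, base_cost, exponent):
    """Largest roll-out size (up to max_shops) with a strictly positive net benefit."""
    best = 0
    for shops in range(1, max_shops + 1):
        if net_benefit(shops, benefit_per_shop, base_cost, exponent) > 0:
            best = shops
    return best


# Hypothetical parameters: 5 MEUR benefit per shop, cost = 1.1 * shops ** 1.5 MEUR.
print(largest_profitable_rollout(100, 5.0, 1.1, 1.5))
```

Under these made-up parameters the roll-out stops paying for itself beyond 20 shops, which mirrors qualitatively the point where technology costs overtake business benefits. Real cloud pricing would need to be modelled from the provider's actual tariffs.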

These case studies confirm the need for improved access to data processing and computing capacities, as foreseen in the Data Strategy in terms of support for data spaces14.

14 European Commission, “A data strategy for Europe”, February 2020, page 17

Figure 8 – Big Data success stories: main use cases and business impacts, 2019

Agriculture: Crops monitoring (Costs = -10%); Equipment optimization; Precision agriculture
Automotive: Predictive maintenance; Self driving; Smart services (Costs = -80%)
Financial Services: Fraud detection (Operational Ex. = -80%); Risk assessment; Targeting (Marketing costs = -35%, TCO costs = -80%, Conversion rate = 10x)
Healthcare: Diagnostic; Patient monitoring; Preventive systems
Manufacturing: Predictive maintenance (Maintenance costs = -30%); Smart manufacturing (Utilities costs = -20%, Cust. retention = +110%); R&D optimization/Smart design
Retail: Assortment optimization/Intelligent fulfilment; Price optimization/Promotions (Conversion rate = 50%, Cust. retention = +14%); Targeting (Conversion rate = +85%, TCO costs = -15%)
Telecommunication: Churn prediction/Promotions; Network capacity optimization; Targeting (Conversion rate = +130%)
Transport & logistics: Churn prediction/Promotions; Fleet management; Network capacity optimization (TCO costs = -90%)
Utilities: Churn prediction/Promotions; Network capacity optimization (Costs = -20%, Cust. Expenses = -30%); Personalized fares (Marketing costs = -50%, TCO costs = -50%)

Source: Chiara Francalanci, Polimi, “Virtual BenchLearning: Success Stories on Big Data & Analytics”, DataBench Webinar 28 May 2020

In addition, we investigated in depth three case studies to focus on scaling-up issues and potential requirements for European data spaces. They are:

• e-GEOS, an ASI (20%) / Telespazio (80%) company, is a leading international player in the Earth Observation and Geo-Spatial Information business. The case study concerns the design of an innovative yield prediction machine learning algorithm based on Sentinel and Landsat high-resolution satellite information. The algorithm was used to predict the production of soybeans and corn in the US on behalf of a company operating in the financial industry, which needed predictions to support its trading decisions. The machine learning algorithm proved roughly 10% more accurate, supporting better investment decisions, which helped the financial company to gain around 0.32% on traded volumes. This is a very valuable gain in the competitive trading market. e-GEOS is the global distributor of COSMO-SkyMed data, from a constellation of four radar satellites for Earth Observation funded by the Italian Space Agency and the Italian Ministry of Defense. Due to its nature and ownership, e-GEOS is oriented towards the provision of open data and data sharing. Nevertheless, satellite data is basically raw material, which needs to be processed and treated with sophisticated tools, such as the machine learning algorithm developed by e-GEOS, before it can be used. Providing farmers with access to this type of data in a common data space requires a governance framework in which the costs of data processing can be compensated and specialized intermediaries like e-GEOS can provide the necessary tools.

• A leading Spanish financial group. The case study concerns the development of a dataset on the relationships between customers in order to build part of the bank's social graph. The data is synthetically generated from real data held in a set of restricted tables (relational database), with information on the customers, their connections and the different operations they perform. The dataset is intended to allow the bank to share data with its external providers to deploy and validate proofs-of-concept of different use cases (e.g., potential fraud based on customers’ relationships).

• Whirlpool is a multinational white appliances manufacturer. The case study concerns several initiatives, including a pilot carried out within the context of the project Boost for the application of Big Data to forecasting spare parts demand, including the creation of a consumer service data model and planning optimization. Spare part production and distribution is one of the most relevant challenges for after-sales services, requiring careful and timely management of central warehouses of spare parts (so customers don’t have to wait too long for repairs), with a large variety of product families and product codes. By using data analytics and creating a “smart service” for spare parts forecasting and management, the company was able to achieve a 30% reduction of the spare part stock, a 35% increase in inventory turnover and a 25% reduction of the lead time to consumer. The company also implemented a new platform for self-service analytics for internal users, so that they can access the relevant data in a more flexible and personalized way. Whirlpool is still struggling with many data silos devoted to single company functions or departments and is trying to merge them into a single data lake. The company is also developing a “digital twin” experiment in Poland but is struggling to merge different data sources, particularly from the production machines and plant assembly lines. According to the interviewee, there are cultural and technological barriers preventing data sharing between the manufacturer and its sub-suppliers. The suppliers are focused on collecting data from the production cycle for maintenance and efficiency improvement but are not willing to share the data. In addition, the different types of data and methods of collection create technological barriers.

In conclusion, there has been considerable progress in the use of data-driven innovation by European industries in recent years, including an increasing use of AI techniques such as machine learning leveraging the power of data. Nevertheless, there is still a high level of immaturity in the capability to merge datasets within a company, and relevant barriers against data sharing even in advanced sectors such as manufacturing. A relevant issue which emerged from most case studies is the availability of affordable and efficient cloud computing infrastructures, allowing the scaling up of successful pilots and individual company-site experiences to the whole company domain. Even if common data spaces could potentially provide a valuable answer to the need for greater access to high-quality datasets and computing infrastructures, this will require solving practical and technology challenges, not simply providing a favourable environment for encouraging stakeholder collaboration.

4. Mapping the data market – data landscape and data market monitoring tool

4.1 The EU Data Landscape

The Final EU Data Landscape Report (D4.3) provides an overview of the EU Data Landscape database revision performed by the study team between January 2019 and December 2019.¹⁵ With a total of 1556 companies and coverage of 42 countries (European Union-28, Belarus, Bosnia and Herzegovina, Georgia, Iceland, Israel, Kenya, Moldova, Norway, Russia, Serbia, Switzerland, Turkey, Ukraine and the United States), the database offers a comprehensive overview of the most important data companies in Europe. More specifically, among the 1556 companies, 311 have been identified as Key Data Landscape companies in line with a set of criteria adopted and further specified in the following paragraphs.

Main changes to the EU Data Landscape

The 2020 EU Data Landscape review introduced some changes to the approach:

• This year’s research was widened by the use of new and more efficient sources and focused on the Vertical Applications category.

• The existing EU Data Landscape database was validated and further extended in geographical coverage and scope. In particular, as regards geographical coverage, the database currently includes companies from 42 countries, compared with 41 in the January 2019 update.

• The list of Key Data Landscape companies was reviewed according to updated or new criteria identified by the study team, leading to 52 new entries in this category.

15 The data presented in the report cover the period from January 2019 until December 2019. The United Kingdom was a full member of the European Union during this period. It officially left the European Union on 31 January 2020. As a result, this report still includes the UK as a member of the EU and the EU is considered to have 28 Member States throughout.

Figure 9: Database of Data Landscape companies (www.datalandscape.eu)

Source: http://datalandscape.eu/companies

Overview of the EU Data Landscape Database (status in December 2019):

• Overall, the database has grown by 9% with the addition of 131 new companies during 2019. Out of the 131 new companies, 52 are Key Data Landscape companies.

• UK companies account for 25.3% of the total database, followed by Spain (12%), France (8.8%), Germany (8.7%), Netherlands (4.7%) and Italy (3.9%).

• Analytics continues to be the most populated category (accounting for 41% of the database).

• The share of companies categorised as Vertical Applications grew by six percentage points, reaching 23% of the database (from 246 companies in 2018 up to 359 in 2019).
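The growth and share figures in the overview above can be re-derived from the counts quoted in this section. The following is a minimal sketch, assuming that the 2018 database size equals the 2019 total minus the 131 additions (i.e. that no companies were removed); the variable names are illustrative and not part of the study.

```python
# Counts quoted in the overview above (status in December 2019).
total_2019 = 1556                        # companies in the database
new_in_2019 = 131                        # companies added during 2019
vertical_2018, vertical_2019 = 246, 359  # Vertical Applications companies

# Assumption: no company was dropped, so the 2018 base is the difference.
total_2018 = total_2019 - new_in_2019

growth = new_in_2019 / total_2018
share_2018 = vertical_2018 / total_2018
share_2019 = vertical_2019 / total_2019
point_change = (share_2019 - share_2018) * 100  # percentage points

print(f"Database growth in 2019: {growth:.1%}")
print(f"Vertical Applications share: {share_2018:.1%} -> {share_2019:.1%}")
print(f"Change: {point_change:.1f} percentage points")
```

Note the distinction between a change in share (percentage points) and a relative growth rate: the number of Vertical Applications companies itself grew much faster than six percent.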

Key Data Landscape companies (status in December 2019):

• Focusing on the methodological approach, Key Data Landscape companies were selected from the main database according to the following pre-set criteria:

o The company is listed in the Global Big Data Landscape map,¹⁶ or

o The company and its proof of concept are already sufficiently established: as a proxy, the study team took into account companies that received over EUR 1m in funding according to the Crunchbase database, which provides data on the world’s most innovative companies, including data on the amount of capital obtained, and

o The company has its main headquarters or R&D department in Europe.

• The list of Key Data Landscape companies grew from 259 in 2018 to 311 in 2019, an increase of about 20% (the 52 new entries account for roughly 17% of the 2019 list).

16 Matt Turck (2017). Big Data Landscape 2017, Firstmark. Available at: http://mattturck.com/wp-content/uploads/2017/04/Big-Data-Landscape-2017-Matt-Turck-FirstMark.png

• Most Key Data Landscape companies have their headquarters in the United Kingdom (114), followed by France (47) and Germany (35).

• The spread across categories has remained stable since 2018, with Analytics continuing to be the most populated category (41%). Most notably, in 2019, the Vertical Applications category grew by six percentage points, reaching 23% of the database (from 246 companies in 2018 up to 359 in 2019).
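The pre-set selection criteria listed above can be read as a simple filter. The sketch below is illustrative only: it assumes the criteria combine as (listed in the Big Data Landscape map OR funding above EUR 1m) AND European headquarters or R&D, which is one plausible reading of the "or"/"and" in the list. The `Company` record, its field names and the sample companies are hypothetical and not taken from the database.

```python
from dataclasses import dataclass

FUNDING_THRESHOLD_EUR = 1_000_000  # "over EUR 1m in funding" per the criteria


@dataclass
class Company:
    # Hypothetical record layout, for illustration only.
    name: str
    in_big_data_landscape_map: bool
    funding_eur: float
    hq_or_rd_in_europe: bool


def is_key_company(c: Company) -> bool:
    # Assumed reading: (map listing OR funding proxy) AND European base.
    established = (c.in_big_data_landscape_map
                   or c.funding_eur > FUNDING_THRESHOLD_EUR)
    return established and c.hq_or_rd_in_europe


# Hypothetical records, not real entries of the EU Data Landscape database.
candidates = [
    Company("ExampleAnalytics", False, 2_500_000, True),
    Company("ExampleMapped", True, 0, True),
    Company("ExampleSmall", False, 300_000, True),
    Company("ExampleOverseas", True, 5_000_000, False),
]

key_companies = [c.name for c in candidates if is_key_company(c)]
print(key_companies)
```

Under this reading, a well-funded company without a European headquarters or R&D department is excluded, while a company that appears in the map qualifies regardless of its funding.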

Figure 10: Key Data Landscape Companies by Country (2019,2018,2017)

Source: European Data Market Study, D4.3 EU Data Landscape, Review at December 2019

5. Acting upon the data market – the role of policy

By investigating the role of policies in shaping the present and future trends of the European Data Market and Data Economy, the Final Report on Policy Conclusions (D2.8) complemented the sizing and forecasting exercise carried out for the Final Report on Facts and Figures (D2.7), as well as the additional analysis obtained through the three quantified stories.

5.1 The Role of Policy and the Future of Europe’s Data Economy: The Three Scenarios

The Final Report on Policy Conclusions (D2.8) presents the alternative evolution paths of the European Data Market and Data Economy at 2025, described as three potential scenarios driven by different macroeconomic and framework conditions, shaped by critical turning points to be faced in the next years by governments, businesses and social actors. The scenarios are an update of those presented in March 2020, building on the updated EDM dataset and forecasts and insights from last year’s events.

The new European Data Strategy outlines the ambition for Europe to become a leading role model for a society empowered by data to make better decisions in business and the public sector and a global leader in the data-agile economy. Our 2025 scenarios outline different pathways of evolution of the European Data Market (EDM) and Data Economy in the next years, exploring the different mix of factors and policy choices which may lead to achieving this ambition or to falling short of it. In the past years we have monitored the fast growth of the Data Market and Data Economy and have witnessed the evolution of supply and demand dynamics in Europe. At the same time, the emergence of disruptive technologies such as AI has increased the need for policy intervention to manage emerging social, economic and ethical risks. Therefore, the 2025 scenarios presented in this report are strongly influenced by multiple policy assumptions shaped by the Data Strategy and European Digital Strategy recently published by the new Commission.

Given this context, we have updated the description of the two main focal issues (axes) around which we have developed our 2025 scenarios, as follows:

• the high or low pace of diffusion of data-driven innovation, driven by demand-supply dynamics, and its impact on economic growth. This year we add to this perspective the pace of multiple innovation adoption, where data is at the core of a multiple technology environment powered by AI.

• the social and economic data governance model enabling a fair and competitive economy, as indicated by the new European Data Strategy. Today the term “data governance” has grown from its original narrow definition as an approach to data management to a much broader concept of a policy and conceptual framework establishing the norms, practices and principles covering all aspects of data dynamics in society and the economy. Essentially, the data governance framework, which is the first pillar of the new European Data Strategy, recognizes the need to deal with data as a strategic asset influencing power dynamics in the socio-economic system.

At one extreme, we foresee a society where a few actors, such as leading online platforms, governments and large businesses, dominate the main data assets and therefore capture a disproportionately high share of data innovation benefits, increasing social inequality (highly centralized model). The polar opposite of this scenario would be a society characterised by an open, transparent and participatory approach to data governance, where both citizens and organisations are able to control and extract value from their data. This would result in a wider social distribution of data innovation benefits, decreasing social inequality. Trustworthiness and respect for data ethics principles are other important characteristics of this ideal model.

This analysis highlights the critical turning points to be faced in the next years by governments, businesses and social actors in the development of the European Data Economy. The combination of alternative social and economic trends results in the following scenarios:

• The Baseline scenario is characterised by a healthy growth of data innovation, a moderate concentration of power by dominant data owners with a data governance model protecting personal data rights, and an uneven but rather wide distribution of data innovation benefits in society. This is considered the most likely scenario.

• The High Growth scenario is characterised by a high level of data innovation, low data power concentration, an open and transparent data governance model with high data sharing, and a wide distribution of the benefits of data innovation in society.

• The Challenge scenario is characterised by a low level of data innovation, a moderate level of data power concentration due to digital markets fragmentation, and an uneven distribution of data innovation benefits in society.

The scenarios explore the drivers and framework conditions which may lead to maximise the benefits of a balanced Data Economy and to avoid the risks of an unbalanced one, highlighting the consequences of policy actions.

Policy and the Baseline Scenario

The Baseline scenario predicts a healthy growth of data-driven innovation and increase of investments in the new wave of digital technologies, pioneered by the most advanced, competitive and innovative enterprises, medium and large (both as technology providers and users) with a share of competitive SMEs, savvy in the use of ICTs. By 2025 we expect the take-up of AI, Big Data, IoT, and robotics to reach over 60% of medium-large EU enterprises (IDC survey on Advanced Technologies for Industry, 2019), while other technologies such as 5G, AR/VR, blockchain, new materials and industrial biotechnologies will also make strong progress. This will force enterprises to engage in multiple innovation, adopting and combining multiple technologies: this convergence is enabled and powered by data and intelligence.

In this scenario, the Data Market is forecast to reach 82.5 billion Euro in the EU27 by 2025, with a compound annual growth rate of 5.8%. The Data Economy will grow faster than the Data Market, because investments in data technologies have direct and indirect impacts on the economy with a multiplier effect, reaching a value of 550 billion Euro in the EU27, with a steep increase of its incidence on EU GDP from 2.6% in 2019 to 4% in 2025. Enterprises will add 3.2 million data professionals’ positions between 2019 and 2025, bringing the total to 9.3 million jobs. However, this will increase the potential data professionals’ skills gap, which may become a bottleneck for some enterprises or regions, creating competition between enterprises for the most skilled professionals.
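As a reading aid, the compound annual growth rate (CAGR) behind such forecasts links a start value, an end value and a number of years. The sketch below is a minimal illustration, assuming the rounded EU27 Data Market value of 58 billion Euro for 2019 quoted in the post-Covid section of this report and a six-year horizon (2019 to 2025). The helper names are illustrative, and the report's own base year and unrounded inputs may differ, so the implied rate (roughly 6% with these rounded inputs) need not match the quoted 5.8% exactly.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate that takes `start` to `end` in `years`."""
    return (end / start) ** (1 / years) - 1


def project(start: float, rate: float, years: int) -> float:
    """Value reached after compounding `start` at `rate` for `years`."""
    return start * (1 + rate) ** years


# Rounded figures quoted in this report (EUR billion, EU27).
market_2019 = 58.0   # assumption: taken from the post-Covid section
market_2025 = 82.5   # pre-Covid Baseline forecast
years = 2025 - 2019

implied_rate = cagr(market_2019, market_2025, years)
projected_at_quoted_rate = project(market_2019, 0.058, years)

print(f"Implied CAGR 2019-2025: {implied_rate:.2%}")
print(f"2025 value at a 5.8% CAGR: {projected_at_quoted_rate:.1f} EUR bn")
```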

In this scenario, Europe makes progress in the investment and deployment of independent data infrastructures and digital resources, also leveraging the new Horizon Europe and Digital Europe Programs. This means Europe reaches a better, but not quite complete, technological sovereignty.

The new digital policy strategies will empower Europe to play a stronger role on the global scene, leveraging the GDPR success as a global standard. Not only Europe, but also many other international governments are now conscious of the downside and risks of global platforms’ dominance and control of global data flows and will work together towards achieving a more balanced playing field. This scenario therefore is positioned between the two extremes of high and low concentration of power and data control. The development of an effective regulatory framework of data governance, as foreseen by the Data Strategy, will enhance stakeholders’ willingness and capability to manage data sharing and improve data access and re-use. The single market for data gradually emerges as fragmentation is overcome, and this enables Europe to attract a growing share of the global Data Economy, in terms of capability of data processing and management. The development of common EU data spaces is faster and more successful in some sectors with strong innovation demand (manufacturing, agriculture) but meets with barriers and low demand in others, failing to achieve economies of scale.

Europe will make progress towards a sustainable Data Economy, with policy initiatives promoting the full transformation of the ICT and electronics industry towards climate-neutral and energy-efficient practices. In this scenario we foresee progress towards a Global Digital Cooperation strategy led by the EU and a more proactive European role in the development and adoption of standards and interoperable technologies on the global scene, leveraging the power of the single internal market.

Policy and the High Growth Scenario

This scenario foresees a faster growth trajectory of the Data Market and Data Economy, boosted by favourable economic conditions, strong investment, proactive policies, and effective collaboration between the Member States (MS) at EU level. By 2025 we expect a higher take-up of multiple technologies than in the Baseline scenario (AI, Big Data, IoT, robotics, 5G, new materials, blockchain…) with European enterprises fully embracing multiple innovation and the power of data. European industries will exploit platforms to combine data with AI and machine learning, spreading intelligence from the core to the edge of their networks, turning data into action and action into value.

In this scenario, the EU27 GDP compound annual growth rate in the period 2019-2025 (+2.0%) will be 1.5 times higher than in the Challenge scenario and 40% higher than in the Baseline scenario. This will accelerate the investments in the digital economy and consumer willingness to spend. In the European Union public and private investments will accelerate in Artificial Intelligence, advanced robotics, automation as well as new skills. As a consequence, this scenario foresees a deep transformation of business processes and the work culture, where change management and HR management become critical success factors. As in the Baseline scenario, in the High Growth scenario European enterprises multiply the use of "digital co-workers" (using intelligent process automation and AR/VR to support/complement human workers) reducing repetitive tasks, improving productivity and security. Besides automation, enterprises engage in "augmentation" of human resources providing technologies enhancing their physical and intelligence capabilities.

Policy measures at European level have a relevant role to play in this scenario. On the one hand, through a combination of incentives and intensive training investments, Europe will support organizations in managing the transformation of the work culture for digital transformation, identifying appropriate job roles and tasks in open ecosystems. On the other hand, initiatives to develop digital skills are successful: the Digital Europe Programme delivers a boost to the supply of advanced digital and data skills, the revised Digital Education Action Plan helps to improve digital learning, and the networks of Digital Innovation Hubs play their role in providing internships, training and experimental spaces for companies to learn about new technologies. In this scenario, strong investments and MS cooperation help Europe to develop fully independent data infrastructures and digital resources and to shape global digital governance rules with EU values, becoming a leader in the Big Data-AI space. As foreseen by the digital and data strategies, Europe succeeds in achieving technological sovereignty. Europe’s share of the global Data Economy is well on the way to becoming equal to its economic weight by 2030 thanks to a successful single market for data. The successful deployment of an EU cloud infrastructure and cloud services marketplace satisfies the needs of industry and SMEs. The successful development of common EU data spaces in most sectors achieves economies of scale and supports the rise of the platform economy, enabling companies to deal with multiple innovation. Fast progress with a new Data Act, Digital Services and Competition framework enhances data access, sharing and re-use, achieves a fair playing field and builds the basis for an effective single market for data.

Policy and the Challenge Scenario

In the Challenge scenario, a combination of economic, social and technology threats overcomes European innovation forces, which become lost in a maze of barriers, resulting in much slower Data Market and Data Economy growth. In this scenario, the EU GDP growth, estimated at a compound annual growth rate of 1% in the period 2019-2025, will be substantially lower than in the other scenarios. Trade wars, political conflicts, and unexpected events such as the Coronavirus pandemic are the main drivers of this growth slow-down. Lower GDP growth means lower overall investments and lower consumer willingness to spend.

In this context, the digital Europe and data strategies may not be implemented successfully and may fail to achieve many of their objectives. This may happen if a combination of insufficient investments and lack of collaboration at EU level leads to an uneven development of data infrastructures and digital resources. Without an effective data governance framework and incentives for stakeholders to increase data sharing, there is a risk that the Data Market will remain fragmented.

In this scenario it is possible that, notwithstanding the GDPR, many Europeans have no visibility of and very little control over the use of their personal data: this hinders the willingness to share and reuse personal and non-personal data for social good. If the development of common data spaces is slow and does not deliver the expected boost to industries across Europe, this may hinder the rise of the platform economy, prevent the development of economies of scale and reduce the incentives to fight market fragmentation for innovative services.

This scenario foresees a negative self-reinforcing circle, where less positive global economic conditions discourage investments and weaken global demand with a negative impact on European growth. A slower pace of digital innovation deprives the economy of the boost to growth potentially given by data-driven services and products, while enterprises find competing in international markets more difficult.

The EU27 Data Economy post-Covid

As the Final Report on Policy Conclusions (D2.8) was being finalised in February 2020, the COVID-19 pandemic started its rampage across the globe, endangering people and livelihoods and forcing governments to implement measures to contain the virus, with unprecedented impacts on the European economy as well as the technology market. While the EDM Monitoring Tool data and analysis until 2019 remain valid, our estimates for 2020 are clearly now off the mark, and our 2025 scenarios outlining different pathways of evolution of the European Data Market (EDM) and Data Economy in the next years would need to be revised.

This section presents our post-Covid scenario forecast for the Data Economy, building on the revised estimates of the EU27 Data Market and on macroeconomic and industrial trends. To understand this forecast it is important to remember the three main components of the Data Economy:

• The direct impacts, which are the value of the goods and services sold in the Data Market;

• The indirect impacts, both backwards (on the supply chain: gains made by industries providing goods and services to data users) and forwards (revenues gained by user industries thanks to data innovation);

• The induced impacts on the general economy, thanks to additional spending and consumption driven by the value of the direct and indirect impacts.
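The three components above add up to the overall value of the Data Economy. The sketch below only illustrates that structure: the class, its field names and the input values are hypothetical placeholders, not figures from the EDM Monitoring Tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataEconomyImpacts:
    """Illustrative container for the components listed above (EUR billion)."""
    direct: float             # value of goods and services sold in the Data Market
    indirect_backward: float  # gains of industries supplying the data users
    indirect_forward: float   # revenues gained by user industries
    induced: float            # extra spending and consumption in the wider economy

    @property
    def indirect(self) -> float:
        return self.indirect_backward + self.indirect_forward

    @property
    def total(self) -> float:
        return self.direct + self.indirect + self.induced


# Hypothetical values chosen only to show the arithmetic.
example = DataEconomyImpacts(direct=10.0, indirect_backward=2.0,
                             indirect_forward=20.0, induced=8.0)
print(f"Indirect impacts: {example.indirect}")
print(f"Data Economy total: {example.total}")
```

Because the total depends on all three components, a fall in the Data Market, in user-industry revenues or in consumption each lowers the overall Data Economy value.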

Therefore, the post-COVID Data Economy reflects the decline of the Data Market, the fall in revenues of the industries affected by the lock-down, and the steep decrease of consumer demand and overall consumption.

According to our post-COVID scenario estimates, the European Data Market should decrease by 7.1% to 54 €B in 2020 (compared to 58 €B in 2019) and the Data Economy by 5.5% to 307 €B (compared to 325 €B in 2019). In our view, the powerful negative impact of the slow-down in 2020 will be followed by a rebound and a likely return to the growth path in the next years. Many of the powerful drivers of data-driven innovation are likely to prove resilient in the next years, particularly the willingness to invest in digital technologies in order to re-launch services and create new products to stimulate demand.

By 2025, the post-Covid Baseline scenario foresees strong growth rates resulting in a value of 80 €B for the European Data Market (compared to 82.5 €B in the pre-Covid scenario) and 516 €B for the Data Economy (compared to 550 €B in the pre-Covid scenario). However, the incidence of the European Data Economy on the EU27 GDP will slightly increase from 4% (pre-Covid scenario) to 4.04% (post-Covid scenario) because GDP is also affected by the recession.
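The post-Covid figures above can be cross-checked with simple percentage arithmetic. The following is a minimal sketch using only the rounded values quoted in the text; the helper and variable names are illustrative. Because the inputs are rounded, the computed market decline (about 6.9%) is close to, but not identical with, the 7.1% quoted. The implied GDP levels are back-calculated from the quoted shares and are not figures stated in this report.

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from `old` to `new`."""
    return (new - old) / old * 100


# 2019 -> 2020, EUR billion, rounded values quoted above.
market_drop = pct_change(58.0, 54.0)
economy_drop = pct_change(325.0, 307.0)

# 2025 Baseline: Data Economy value and its share of EU27 GDP.
economy_2025_pre, share_pre = 550.0, 0.04
economy_2025_post, share_post = 516.0, 0.0404

# Back-calculated GDP levels implied by value / share (an inference only).
implied_gdp_pre = economy_2025_pre / share_pre
implied_gdp_post = economy_2025_post / share_post

economy_gap = pct_change(economy_2025_pre, economy_2025_post)
gdp_gap = pct_change(implied_gdp_pre, implied_gdp_post)

print(f"Data Market 2020 vs 2019: {market_drop:.1f}%")
print(f"Data Economy 2020 vs 2019: {economy_drop:.1f}%")
print(f"2025 Data Economy, post- vs pre-Covid: {economy_gap:.1f}%")
print(f"2025 implied GDP, post- vs pre-Covid: {gdp_gap:.1f}%")
```

The last two lines show why the share can rise while the value falls: the implied GDP shrinks proportionally more than the Data Economy, so the ratio between the two increases slightly.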

Unfortunately, we are unable to present post-COVID revised estimates for the Data Economy Challenge and High Growth scenarios. These scenarios rely on alternative assumptions on industry revenues, consumer consumption and GDP dynamics, and we do not feel able in the present uncertainty to elaborate such assumptions beyond the Baseline scenario. However, we believe that the Challenge and High Growth scenarios will remain broadly valid, even though their degree of likelihood could change. The Challenge scenario is marginally more likely (if the recovery does not take off as hoped) while the High Growth scenario assumptions, based on hyper-growth thanks to technology investments, now seem quite remote.

5.2 A change of pace in data policies

The digital policy package presented in February 2020¹⁷ by the new Commission led by Ursula von der Leyen considerably widens the scope and breadth of data policies, reflecting the new policy awareness about the critical role of data for the competitiveness of the European economy. Only a few years ago Big Data was a topic of interest mainly to the ICT industry, while today there is a widespread awareness of the social and economic impacts of data-driven innovation.

17 https://www.orgalim.eu/news/commissions-digital-package-roadmap-towards-europes-digital-future

The Communication “Shaping Europe’s digital future” is articulated in four main action areas, covering all the framework and enabling conditions to develop “a digital society based on European values and rules” (Technology for people; A fair and competitive economy; An open, democratic and sustainable society; and Europe as a global leader). The Data Strategy is a key component of the economy action area together with actions to update competition rules and develop an industrial strategy, among others. This matches very well the rationale of the EDM scenarios for the development of a balanced European Data Economy, which have long posed as a key enabling condition the adaptation of the overall economic and regulatory framework (not limited to ICT or R&D).

The European Strategy for Data’s four main pillars also mirror the priorities highlighted by the EDM Monitoring Tool for the development of the Data Economy. Pillar A on the development of “A cross-sectoral governance framework for data access and re-use” brings together several initiatives to enable and stimulate data sharing, while at the same time ensuring a fair playing field for all organizations. This recognizes the need to deal with data as a strategic asset influencing power dynamics in the socio-economic system. Europe’s ability to achieve this goal is a major differentiating factor between the three alternative scenarios of the Data Economy presented in this report.

The Data Strategy’s Pillar B on Enablers (Investments in data and strengthening Europe’s capabilities and infrastructures for hosting, processing and using data, interoperability) and Pillar C on Competences (Empowering individuals, investing in skills and in SMEs) cover the needs for investments, interoperability, standardization and infrastructures as well as skills. The EDM’s indicators on the numbers and penetration of Data Users and Data Companies, as well as Data Professionals and the Data Skills Gap, can help monitor the development and achievement of these policies.

Finally, the Data Strategy Pillar D for “Common European data spaces in strategic sectors and domains of public interest” is driven by the need to accelerate data sharing and make B2B data sets actionable for data-driven innovation, a priority often underlined by this study’s analyses. The EDM Monitoring Tool indicators on data innovation diffusion by industry are also useful to provide a baseline for this policy area.

5.3 The EU Data Policy and the International Dimension

Our latest measurement of the European Data Market Monitoring Tool reveals a substantially unchanged picture when comparing the EU indicators to those that have been developed for some of the key international partners of the EU. While confirming its vitality, the EU continues to lag significantly behind the U.S. in terms of both size and growth of the Data Market. Even when focusing on the EU27 plus the U.K., Europe generates a Data Market value in 2019 that is still approximately 2.5 times smaller than the one produced in the U.S. in the same year (72.3 billion Euro in the EU vs. almost 185 billion Euro in the U.S.). The pace of this development is even less flattering, as year-on-year growth for the Data Market in the U.S. is almost three times as fast as in the EU in 2019 (12.7% in the U.S. vs. 4.9% in the EU). The same pattern applies when comparing the EU performance to the U.S. performance in terms of the Data Economy:¹⁸ even if confined to direct and backward indirect impacts, the U.S. Data Economy represents a share of GDP that is more than double that of the EU (1.2% vs. 0.5%) in 2019, with a year-on-year growth that, again, is twice that of the EU (6.8% in 2019 over 2018 in the U.S. vs. 3.4% in the EU).
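The EU-U.S. comparisons in the paragraph above are ratios of the quoted indicators. A minimal sketch of that arithmetic follows; the dictionary keys are illustrative labels, and the U.S. Data Market value is entered as 185 although the text says "almost 185", so the first ratio is a slight overestimate.

```python
# 2019 indicators as quoted above (EU = EU27 plus the U.K.).
eu = {
    "data_market_bn_eur": 72.3,
    "data_market_growth_pct": 4.9,
    "data_economy_gdp_share_pct": 0.5,
    "data_economy_growth_pct": 3.4,
}
us = {
    "data_market_bn_eur": 185.0,   # "almost 185" in the text
    "data_market_growth_pct": 12.7,
    "data_economy_gdp_share_pct": 1.2,
    "data_economy_growth_pct": 6.8,
}

# U.S. value divided by EU value for each indicator.
ratios = {name: us[name] / eu[name] for name in eu}

for name, ratio in ratios.items():
    print(f"{name}: U.S. is {ratio:.1f}x the EU figure")
```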

18 The Data Economy for Brazil, Japan and the U.S. has been measured in terms of direct and indirect impacts only, due to the lack of comparable and consistent statistical sources for all three countries. This is consistent with what has been presented throughout this update study (SMART 2016/0063) as well as the original European Data Market Study (SMART 2013/0063).

Filling this gap would be essential to increase Europe’s competitiveness and for the future of work in the EU. The new European Digital Strategy recently unveiled by the European Commission¹⁹ appears to design a new, confident role for Europe as a global player. Realizing that the European model has proved to be an inspiration for many other partners worldwide, the strategy calls for the EU to strengthen its commitment towards setting global standards for emerging technologies and to remain the most open region for trade and investment in the world. In terms of standards, in particular, the EU has paved the way for the setting of global standards for 5G and the IoT and is now committed to leading the standardisation process of a number of additional advanced and new generation technologies such as blockchain, quantum computing and supercomputing: all technologies that lie behind and allow data sharing and data usage and that, as a direct consequence, are directly linked to the further development of a well-functioning Data Economy.

This proactive international role in the standardisation process is accompanied by a robust commitment on trade and investments on the international scene to ensure a collaborative approach on several technology-related topics, including data flows and the possibility to pool available and relevant high-quality data together. This move is indeed in line with the “data-as-infrastructure” approach to the Data Economy, which aims at a negotiated, common data governance setting between the EU and like-minded partners. This approach, however, will have to be put in place while safeguarding Europe’s “technology sovereignty”, that is, by making sure that Europe reduces its level of dependency on other parts of the globe for most of the crucial technologies and effectively protects the integrity and resilience of its data, networks and communication infrastructures. This marks a considerable distance from previous policy stances on the international scene. Admittedly, “sovereignty” is defined positively and is not directed against anyone. Nevertheless, this renewed confidence on the international scene may indicate that Europe will gradually abandon a merely reactive role to embrace a more dynamic and enterprising stance vis-à-vis a number of trading partners worldwide.

19 https://ec.europa.eu/digital-single-market/en/content/european-digital-strategy


6. Conclusions

6.1 Quantifying the European Data Market – Key Facts & Figures

The year 2019 sees all the indicators measured by the EDM Monitoring Tool in a positive dynamic from 2018, as the European economy continues its development cycle. The value of the Data Economy, which measures the overall impacts of the Data Market on the economy as a whole, is to reach 406 Billion Euro in 2019 for EU27 plus the U.K.

The positive trend in the growth of the Data Economy is confirmed by the value of the Data Market, which grew 4.9% year-on-year in 2019, reaching 75.2 Billion Euro in the EU27 plus the U.K.

The EDM Monitoring Tool has been analysed along four main dimensions:

• The Workforce and Skills dimension - including the measurement of data professionals and their potential skill gap.

• The Supply and Demand dimension - incorporating the measurement of data supplier and data user companies and the revenues generated by these companies.

• The Business and Economy dimension - covering the size of the Data Market and the value of the Data Economy.

• The International context dimension - including a select number of indicators for Brazil, Japan and the US.

Figure 11: The four Dimensions of the Data Market’s Key Facts & Figures

Source: The European Data Market Monitoring Tool, IDC, 2019

The Workforce Dimension: Data Professionals and Data Professionals Skills Gap

Data professionals are estimated at a total of 6.0 million in EU27 and at 7.6 million in EU27 plus U.K in 2019, thus marking a continuing increase in 2019 over the previous year (6.1% and 5.5% year-on-year respectively). When compared to the year 2019, 2020 would register a growth rate of 9.2% and 8.6% at the level of EU27 and EU27 plus U.K respectively. More interestingly, the employment share and the intensity share components of the data professionals’ indicator are also expected to improve in 2019 and 2020 if compared to our estimates in 2016 (now estimated at 3.3% and 3.5% in 2019 and 2020 in EU27 and 3.6% and 3.8% for the same years in EU27 plus U.K).

As for the skills gap in data professionals, our latest estimate continues to highlight an imbalance between demand and supply of data skills in Europe, as has been the case since the first measurement for the year 2014. In 2019, as demand for data professionals continued to increase (+4.5%), the estimated gap grew by 13%, reaching approximately 459,000 unfilled positions in the EU27 plus the U.K. (399,000 without the U.K.), corresponding to 76.2% of total demand (5.7% without the U.K., see table below). By 2020 we expect the gap to expand to 496,000 unfilled positions in the EU27 plus the U.K., corresponding to 6% of total demand (5.2% without the U.K., where slower growth is expected due to the impacts of Brexit). At any given moment in the labour market there is a physiological number of vacancies, as well as a number of people looking for work: a vacancies ratio around 5% of demand or less is considered manageable. From this point of view, the data skills gap estimated for 2019 shows a lower level of stress in the market compared with our previous estimates for 2017 and 2018. As in the first and second rounds of measurement of this indicator, the gap is expected to persist in 2020 and beyond under the three scenarios, but at a lower level than previously estimated.

The Supply - Demand Dimension: The Data Companies

The number of data suppliers continues to grow at a faster pace than the number of data users in the longer term (out to 2025). Data suppliers are estimated at almost 149,000 in the EU27 and 290,000 units in the EU28 for 2019, thus exhibiting a year-on-year growth of 2.4% and 2.3% respectively. Data users, instead, are projected to grow at 0.6% in 2019, amounting to nearly 535,000 in the EU27 and to nearly 716,000 units in the EU28. If compared to the measurements carried out by the European Data Market Monitoring Tool over the period 2013-2015, these latest estimates show a picture of some consolidation of data companies in the EU, following increasing growth rates over the prior four years.

Revenues generated by data suppliers have registered a constant increase through the last years to reach nearly 64 Billion Euro in EU27 and 83 Billion Euro in EU27 plus the U.K. in 2019. Data companies’ revenues account for 3.7% of total company revenues in 2019. Data companies’ revenues are expected to follow the Data Market, as imports and exports of data tools and services tend to follow each other. Forecasting data companies’ revenues shows an expected annual growth rate out to 2025 of 7.0% - easily outpacing the growth of the total ICT market over the same period (expected to be 1.6% from 2020 to 2025 Baseline). The smaller Member States show the highest long-term growth as they have a smaller base from which to grow, but the larger Member States will make the biggest overall contribution to the Data Economy out to 2025.

The Business and Economic Dimension: The Data Market and the Data Economy

The value of the European Data Market is expected to reach 75.2 Billion Euro for the EU28 in 2019, a growth rate of 4.9%, followed by an increased growth rate of 6.6% in 2020. Most of the Member States show strong growth, slightly ahead of the expected growth for the total ICT market, which is expected to grow by 3.9% in 2019 and by only 2.0% in 2020. The Data Market share of total ICT is 11.4% for 2019 and is forecast to reach 14.0% by 2025 (baseline forecast).
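The growth rates quoted above can be turned into values for adjacent years with simple compounding. The following is a minimal sketch of that arithmetic only, not part of the study's forecast models: the function name `apply_growth` is an assumption introduced here, and the resulting 2018 and 2020 values are derived for illustration rather than quoted from the report.

```python
def apply_growth(value: float, rate: float) -> float:
    """Return the value one year later, given a year-on-year growth rate."""
    return value * (1.0 + rate)


# Figures quoted in the text for the EU28 Data Market (Billion Euro).
market_2019 = 75.2
growth_2019 = 0.049  # growth of 2019 over 2018
growth_2020 = 0.066  # expected growth of 2020 over 2019

# Derived, illustrative values (not quoted in the report).
market_2018 = market_2019 / (1.0 + growth_2019)
market_2020 = apply_growth(market_2019, growth_2020)

print(f"Implied 2018 value: {market_2018:.1f} Billion Euro")
print(f"Implied 2020 value: {market_2020:.1f} Billion Euro")
```

The same helper can be applied repeatedly to compound a constant annual rate over several years, under the assumption that the rate holds in each year.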

The larger industries, which have the greatest number of companies, account for the largest share of the Data Market. In terms of adoption by industry, the highest rates of Data Technology adoption tend to be in Manufacturing, Finance, Professional services, and Retail. Thanks to their size, these industries are the biggest consumers of data technologies. Manufacturing’s sheer size in the EU economy makes it the largest industry in the Data Market. Moreover, there is still significant scope for increased adoption of data technology in the manufacturing industry, so its leading position is unlikely to change.

The Data Market in the EU27 plus the U.K. will continue to out-grow the total ICT market, with its share of this market rising from close to 10% in 2016, to nearly 15% by 2025, and possibly close to 19% in the High growth scenario.


The value of the Data Economy for the EU27 is estimated at 324 Billion Euro in 2019, with high growth of around 8% in 2019 and higher still in 2020 (more than 9%).

The overall impacts of the Data Economy for the EU27 will reach 4.0% of GDP in the Baseline scenario by 2025. Indirect and induced impacts will be evenly distributed in 2019. Data user companies will continue to consolidate the quantitative benefits stemming from the use of data, thus contributing to the importance of indirect impacts. Not surprisingly, these benefits will go beyond the users and will translate into higher induced effects, generating jobs and revenues beyond the data companies themselves. The positive conditions under the High Growth scenario will lead the overall impacts to reach 827 Billion Euro in the EU27 in 2025. The High Growth scenario will be characterized by a higher level of induced effects than the other scenarios as the benefits for the overall economy are maximized.

A snapshot of the Data Economy by industry shows that the Financial sector, the Manufacturing industry and the realm of Professional services continue to represent the vertical markets in which the impacts of the Data Economy are most strongly felt. Thanks to the significant diffusion of data-related technologies, these industries exhibit high levels of forward and backward impacts and can convey effects at an induced level more quickly and more effectively than other industries. Their IoT diffusion and the usage of Cloud Computing, as well as the usage of mobile and social technologies, coupled with the ongoing process of digital transformation, make these industries particularly reactive to induced effects. Emerging technologies such as Artificial Intelligence and blockchain applications are also gaining momentum in these industries, thus reinforcing the indirect and induced impacts in these sectors.

The International Dimension - The Data Economy Beyond the EU – US, Brazil and Japan

The most recent data (i.e., 2019 and forecast for 2020) show that the European Data Market and Data Economy in 2019 continue to hold second place after the U.S. in value but slip to last place in growth.

The positive development of the U.S.’ Data Economy is confirmed by a solid year-on-year growth of the main indicators monitored, including the number of data professionals, companies, and the overall Data Market.

Brazil also slowed as its economic recovery was at best weak in 2019. However, the country shows some positive trends in the third quarter of 2019, so the outlook is slightly more positive. For all the indicators Brazil showed the weakest results, while Japan improved its growth in the number of data professionals, rising to 4.2 million in 2019.

Japan’s Data Market is the closest match to the European one in terms of growth and investment, but still only half the size of the EU27 plus the U.K. Japan competes with the EU across data professionals and data market and in 2019 its growth in the data market was significantly higher than for Europe. The economy continues to suffer from weakening internal demand and lack of consumption, even though it showed an unexpected improvement in Q3 2019.

Looking at the estimates of the data suppliers, the EU exhibits higher growth than the U.S. in 2019: a year-on-year growth of 2.4%, notably higher than the U.S., which showed less than 1% growth over the same period. Europe still presents a growing and dynamic data ecosystem on both fronts, the Data Market and the Data Economy. However, it lags both the U.S. and Japan in terms of the incidence of the Data Economy on GDP and has some catching up to do. The region is ahead of Brazil, but it is unclear if this is much of an achievement.


6.2 Describing the Data Market – The Quantified Stories

Three quali-quantitative stories accompanied this third and final round of measurement of the European Data Market Monitoring Tool. They concentrated on the operational, organisational and/or economic benefits generated by the use of data-driven technologies, with a special focus on Health Data-driven Innovation and Data Commons. While the first story investigated unexplored potential stemming from the use of Big Data and Analytics (BDA) in healthcare, the second and the third stories focused on the role of Data Commons and analysed in depth the benefits of data-driven innovation and the potential role and impacts of common data spaces in several 6 leading sectors targeted by the recently unveiled European Strategy for Data (COM (2020) 66 final, 02/19/20).

The research on health data and data-driven innovation revealed that a significant majority of healthcare providers in Europe (59%) have not yet adopted a Digital Transformation roadmap and that only 6% have established a single roadmap for Digital Transformation and general business strategy. Still, this unexplored potential of the use of Big Data and Analytics (BDA) in healthcare is eliciting a new wave of interest in data-driven value creation, which, in the medium to long run, will make it possible to reward performance rather than just volume. In particular, our research has highlighted how AI, among other BDA solutions, is gaining momentum across European healthcare providers, with Clinical decision support, Illness progression and Patient engagement being among the most relevant use cases adopted at the time of writing. Our analysis has also presented a number of benefits that healthcare organizations are obtaining by adopting BDA technologies, coupled with Artificial Intelligence and Machine Learning technologies (AI/ML), in particular:

• Easy and convenient access to intelligent solutions for clinicians and patients, which offers more opportunities to advance decision making and enhance clinical process efficiency at the point of care. Portugal’s skin cancer screening solution is an example of how technology supports a clinical collaborative framework and enables the integration of information to serve population health management.

• More advanced predictive capabilities, allowing greater control over disease-specific variables impacting health outcomes, as well as over the costs and resource utilization associated with care. This approach makes it possible to target more efficiently the population segments at risk of developing chronic and long-term conditions, by putting in place initiatives aimed at promoting health and preventing or delaying the development of risk factors. Predictive BDA by ARIA is an example of the development of a predictive model able to effectively target cardiovascular conditions and offer an accurate estimate of the number of future cases in a specific geographical area.

The story on the emerging concept of Data Commons has focused on the current need for support and for some technical and organizational creativity to make this concept viable and sustainable over time. In this respect, organizational cooperation and mutually advantageous data sharing solutions are necessary, so as to generate positive externalities while protecting the competitive interests of their contributors. To obtain this, a suitable technical infrastructure needs to be in place to allow companies to move dynamically between sharing and restricting access to their data and knowledge, together with a governance model that preserves trust amongst partners and manages to endure and scale successfully. For this, some mechanisms appear to be particularly relevant, such as: ex-ante safeguards; arbitration; a stratified and layered infrastructure; a user-centric approach; interaction with data generators or data holders to accelerate re-use; and a federated approach towards data-sharing. Through the combination of all these mechanisms, a favourable environment for the organizations to share costs and realize scale can be put in place. This, in turn, will increase the organisations’ willingness to collaborate while accelerating their innovation processes.

The story on the European industry requirements and the role of European data spaces shows that there has been considerable progress in the use of data-driven innovation by European industries in recent years, including an increasing use of AI techniques such as machine learning underpinning the power of data. The story leverages the set of 18 case studies developed by Politecnico of Milano (POLIMI) and IDC across seven industries within the context of the H2020 DataBench project20, collecting data about the business impacts of the adoption of Big Data and Analytics. These case studies show a good level of business benefits achieved from data-driven innovation, with a high level of cost reduction (such as 80% reduction of operational expenditures for fraud detection in the financial services industry, 30% reduction of maintenance costs in manufacturing thanks to predictive manufacturing) and customer benefits (for example, a 110% improvement of customer retention in manufacturing and 85% improvement of conversion rates from potential to actual customers thanks to data-driven targeting in retail). Nevertheless, there is still a high level of immaturity in the capability to merge datasets within a company, and relevant barriers against data sharing even in advanced sectors such as manufacturing. A relevant issue which emerged from most case studies is the availability of affordable and efficient cloud computing infrastructures, allowing the scaling up of successful pilots and individual company-site experiences to the whole company domain. Even if common data spaces could potentially provide a valuable answer to the need for greater access to high quality datasets and computing infrastructures, this will require solving practical and technology challenges, not simply providing a favourable environment for encouraging stakeholder collaboration.

6.3 Mapping the Data Market – Data Landscape and Data Market Monitoring Tool

The Third EU Data Landscape Report (D4.3) provides an overview of the EU Data Landscape database revision as of January 2020. With a total of 1,556 companies and coverage of 42 countries, the database has grown by 9% with the addition of 131 new companies during 2019. Out of the new companies, 52 were identified as Key Data Landscape companies, offering a comprehensive overview of the most important data companies in Europe.

6.4 Acting Upon the Data Market – The Role of Policy

The European Data Market (EDM) Monitoring tool has monitored since 2013 the evolution of the Data Economy, providing insights and quantitative evidence about its diffusion by industry and by region, contributing substantially to the evolution of European policy strategies in this field. Since the first intuition of the potential disruptive impacts of Big Data, this monitoring effort and analysis has documented the social and economic relevance of the deep transformation process enabled by data-driven innovation. Today this process is accelerated by the emergence of Artificial Intelligence tools and services powered by data.

The new European Data strategy outlines the ambition for Europe to become a leading role model for a society empowered by data to make better decisions in business and the public sector and a global leader in the data-agile economy. Europe is moving towards a “data as infrastructure” model: a governance model where data is considered as a public asset, and data infrastructures work as a kind of “digital twins” to physical roads, requiring public investments and new institutions to manage them. This model allows for many different typologies of “data roads”, local or global, leaves freedom for private initiative but tries to maintain a balance between private and public interests, to be managed through new kinds of organizations, private or public or a mix of the two, like data trusts, data cooperatives, personal data stores.

20 Evidence-Based Big Data Benchmarking to Improve Business Performance, www.databench.eu

Today, as we look at the main driving trends for the next years, we notice that the role of policies has increased in relevance: as data-driven innovation has become widespread across all industry sectors and user constituencies, the scope of the regulations and framework conditions to be adapted has considerably grown. At the same time, the emergence of disruptive technologies such as AI has increased the need for policy intervention to manage emerging social, economic and ethical risks. The EDM monitoring tool provides a consistent and solid framework to assess and estimate the potential consequences of policy choices to be made in the next years.

Even the disruption caused by the Covid-19 pandemic has not undermined the value of the EDM monitoring tool indicators and analysis. As we argue in our post-Covid scenario estimates, the main drivers of data-driven innovation are still powerful, the Data Market and the Data Economy are likely to start growing again already from 2021 and by 2025 may have recovered much of the ground lost in 2020. The pandemic has shown new ways in which digital technologies can help to adapt to a new post-Covid environment, manage health risks and accelerate the economic recovery. Now as never before proactive innovation policies and technology investments are needed to support the European social and economic recovery.


7. Methodological annex

Overview

In line with the methodology adopted in the previous European Data Market Study (SMART 2013/0063), the measurement methodology for this final report was based on the steps outlined in Figure 12 below. Compared to the previous steps it does not include the ad-hoc surveys which were used to establish the baseline. However, thanks to the use of IDC primary research data tracking the market, we have already proven the feasibility of updating the indicators without repeating the initial surveys.

The main steps of the methodology included:

• Desk research on the main EU and global national and statistical sources; each indicator has a specific set of sources;

• Extraction of data from the relevant IDC surveys and databases;

• Additional secondary research and case studies interviews for the stories, which in turn fed back into the indicator models to help in the modelling and estimate of indicators;

• A selected number of opinion leader and stakeholder interviews to feed into the modelling and scenario assumptions;

• Implementation of the 7 indicators models and elaboration of results;

• Development of the forecast scenario assumptions and update of the 3 scenarios;

• Assessment of policy insights building on the results of the previous steps.

Figure 12: A sophisticated Methodology

[Figure elements: EU & National Statistical Sources for Internationals; IDC ongoing primary research on Business Analytics, ICT markets, Digital transformation; Additional Secondary Research; Additional In-Depth Interviews; Methodology and Taxonomy; Stories; 7 Quantitative Models; 3 Forecast Scenarios; EDM Monitoring Tool; Policy insights]

Desk Research

As done in the first study, the study team reviewed the list of relevant public sources and updated it to collect additional relevant data. The list of the main sources used is outlined below.

• Concerning the indicators on Data Market, data companies, data companies’ revenues, and the Data Economy, the main sources were:

o Eurostat business demography statistics in the European Union, treating aspects such as the total number of active enterprises in the business economy, their birth rates, death rates, and the survival rate (last update: December 2019);

o Eurostat annual structural business statistics with a breakdown by size-class, which are the main source of data for an analysis of SMEs (latest update: December 2019);

o IDC’s detailed market forecast estimates for IT Hardware, Software, and IT Services from 2017, 2018 and 2019;

o IDC Worldwide Black Book (Standard Edition), quarterly updates from the years 2018 through 2019. The Black Book represents IDC's quarterly analysis of the status and projected growth of the worldwide ICT industry in 54 countries;

o IDC European Vertical Markets Survey, 2018 and 2018;

o IMF World Economic Outlook (WEO) Database, October 2019.

• For the data professionals we used in addition the following sources:

o ILOSTAT (International Labour Organization) Statistics and Databases (January 2019)

Survey data

This research is supported by prior and ongoing survey data to provide a foundation where information does not exist, fill in where information is sparse or missing, or to confirm ongoing assumptions such as adoption rates of digital technology or use of data. The foundation for the research was a survey conducted in 2015 to establish a baseline for data use and data supply. This was conducted across 8 countries and detailed 11 industries and two company size bands. 1,100 respondents provided sufficient detail to draw starting assumptions for technology adoption, data professionals penetration in organisations and data supplier penetration in organisations. The models used to identify the number of data professionals, the number of data suppliers by member state, industry, and company size band used this data as part of the model foundation.

Table 18: Quotas used for the initial data market survey

Member State | Total respondents
Czech Republic | 100
France | 200
Germany | 200
Italy | 100
Poland | 100
Spain | 100
Sweden | 100
UK | 200
Total | 1,100

Sectors:

• Mining, Manufacturing

• Electricity, gas and steam, water supply, sewerage and waste management

• Construction

• Transport and storage

• Information and communications

• Wholesale and retail trade repair of motor vehicles and motorcycles, Accommodation and food services

• Professional services, administrative and support services

• Public Administration And Defence; Compulsory Social Security

• Education

• Finance

• Health
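The country quotas in Table 18 can be checked against the stated total with a few lines of code. The sketch below simply transcribes the quotas from the table; the variable names and the derived per-country shares are illustrative additions, not figures reported in the study.

```python
# Respondent quotas per Member State, transcribed from Table 18.
quotas = {
    "Czech Republic": 100,
    "France": 200,
    "Germany": 200,
    "Italy": 100,
    "Poland": 100,
    "Spain": 100,
    "Sweden": 100,
    "UK": 200,
}

# The table states a total of 1,100 respondents.
total_respondents = sum(quotas.values())

# Illustrative derived figure: each country's share of the sample.
shares = {country: n / total_respondents for country, n in quotas.items()}

for country, share in shares.items():
    print(f"{country}: {quotas[country]} respondents ({share:.1%} of sample)")
print(f"Total: {total_respondents}")
```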


This survey supported the initial foundations for the data models used, but over time these models were maintained using data from IDC’s annual industry survey. This survey currently (in 2019) addresses over 2,700 respondents across 13 countries. This ongoing series of surveys asks questions about technology adoption and technology use cases, and it is this information that supports and updates the foundation models used in the forecast of the data market, the number of data users, the number of data suppliers, and the number of data professionals.

Table 19: Countries and Industries surveyed in the annual IDC industry survey

Countries: U.K.; Germany; France; Italy; Spain; the Netherlands; Sweden; Denmark; Norway; Finland; Russia; Czech Republic; Poland

Industries: Finance; Manufacturing; Retail/wholesale; Professional services; Healthcare; Transport; Telecom/media; Utilities/oil and gas; Government/education

Forecast Scenarios

The scenario model used in this study is based on the definition of alternative assumptions about four main groups of key factors driving the Data Market along different development paths. The identification of the key factors of market development was based on the desk and field research carried out in this study and on the review of a long list of forecast assumptions, leveraging IDC's periodically updated Market Forecast Assumptions. The selection of the most relevant factors was based on two main criteria:

• High level of impact on the development of the Data Market and the Data Economy;

• High level of uncertainty, with potential different outcomes (assumptions) over the next 8 years.


The four main groups of factors are:

• Macroeconomic factors;

• Policy/regulatory factors;

• Data Market demand-supply factors;

• Global megatrends affecting all technology markets.

Even though they may seem obvious, these four clusters correspond to the main typologies of factors which affect the evolution of the Data Market. Each cluster aggregates a set of interrelated key factors; their combination differentiates the three scenarios. The scenarios are characterised by the interaction and co-dependency of these factors; no scenario can be explained only by one factor or one group of factors.

Figure 13: Structure of the Scenarios Model

[Figure elements: Scenarios Model; Baseline Scenario; Challenge Scenario; High Growth Scenario; Policy/Regulatory Assumptions; Macroeconomic Assumptions; Data Market dynamics Assumptions; Global Megatrends Assumptions]

Source: European Data Market Monitoring Tool, IDC 2015

The scenario model and the forecast indicators models are correlated. Table 20 below summarises the rationale of their selection and how their assumptions were used as inputs to the indicators’ forecast models.

Table 20: Identification of Main Factors driving the Scenarios

Key Factors | Rationale | Inputs to the Forecast Models
Macroeconomic factors | Macroeconomic factors are partially exogenous to the Data Market (even though data innovation is expected to contribute to GDP growth) | Historical data for the period 2014-2018, plus 2019 estimates, plus alternative forecasts for the period 2020 to 2025 of the following: EU GDP growth; ICT spending growth; other economic factors such as unemployment for the same period
Policy/Regulatory factors | Policy measures and regulation shape the framework conditions of the development of the Data Market | Alternative policy and regulatory factors by scenario
Data Market supply-demand factors | Strong influence of alternative supply-demand dynamics on the market development paths | Alternative supply and take-up models by scenario
Global megatrends | Strong influence of global digital innovation trends on the EU Data Market growth | Alternative assumptions on the development of current and forecast ICT innovation drivers as well as global digital transformation dynamics

The scenarios provide the main framework for the forecast of the EDM indicators. As shown in the Figure below, IDC developed seven forecast models: each model produced the specific indicator forecasts under the three main scenarios, followed by an in-depth cross-check and quality check. The forecast models are also correlated and were developed in sequence, with the following dependencies:

• The Data Market forecast model is the cornerstone of the process: it was developed first, building on IDC’s forecasts and on the macroeconomic variables as described below. Its results and growth rates feed into the other models, according to the specific assumption and calculation methods explained for each indicator.

• The Data Market and data suppliers’/data users’ forecasts influence the data professionals’ model.

• The data companies’ forecasts feed into the data revenues model.

• The data professionals model feeds into the data professionals’ skills gap model.

• The Data Economy model feeds from all the other forecasts, but especially the Data Market and the data users' forecasts.
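The dependencies listed above form a directed graph that fixes the order in which the forecast models can be run. The following is a minimal sketch of that ordering using Python's standard `graphlib` module (Python 3.9+). The node names are illustrative labels introduced here, not identifiers used by IDC; the sketch covers only the models named in the list, and it groups the data suppliers' and data users' forecasts into a single `data_companies` node as a simplifying assumption.

```python
from graphlib import TopologicalSorter

# Each key is a forecast model; the value is the set of models it takes inputs from.
# Names are hypothetical labels for the models described in the text.
feeds_from = {
    "data_market": set(),  # cornerstone model, developed first
    "data_companies": {"data_market"},
    "data_professionals": {"data_market", "data_companies"},
    "data_revenues": {"data_market", "data_companies"},
    "skills_gap": {"data_market", "data_professionals"},
    "data_economy": {
        "data_market",
        "data_companies",
        "data_professionals",
        "data_revenues",
        "skills_gap",
    },
}

# static_order() yields every model after all of the models it depends on.
run_order = list(TopologicalSorter(feeds_from).static_order())
print(run_order)
```

Because the Data Market model has no inputs from the other models and the Data Economy model draws on all of them, any valid ordering starts with the former and ends with the latter; the relative order of the intermediate models may vary.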

Measuring Data Professionals

Definition and Scope

Data professionals are workers who collect, store, manage, and/or analyse, interpret, and visualise data as their primary or as a relevant part of their activity. Data professionals must be proficient with the use of structured and unstructured data, should be able to work with a huge amount of data and be familiar with emerging database technologies.

In our definition, data professionals are not only data technicians but also data users who, based on more or less sophisticated tools, take decisions about their business or activity after having analysed and interpreted available data. According to our definition, data professionals belong to the category of knowledge workers and specifically “codified” knowledge workers (Lundvall and Johnson, 1994); data professionals specifically deal with data while knowledge workers deal with information and knowledge.

The indicator has been measured according to the segmentations presented in the following table, including two sub-indicators about the share on employment and the intensity of employment.


Table 21: Indicator 1 – Data Professionals

N. | Name | Description | Type and Time | Segmentation
1.1 | Number of data professionals | Total number of data professionals in the EU | Number, 2016-17-18-20. Forecast to 2025, 3 Scenarios | By Geography: 28 EU MS + EU27 MS + total EU. By Industry: 11 industry sectors NACE rev.2
1.2 | Employment share | Total number as a share of total employment in the EU | % of total employment, 2017-18-20 | By Geography: 28 EU MS + EU27 MS + total EU. By Industry: 11 industry sectors NACE rev.2. By Size: not applicable
1.3 | Intensity share | Average number of data professionals per company (only for private sector) | Number, 2017-18-20 | By Geography: 28 EU MS + EU27 MS + total EU. By Industry: 11 industry sectors NACE rev.2. By Size: not applicable
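The two sub-indicators 1.2 and 1.3 are simple ratios. The following is a minimal sketch of how they could be computed; the function names and the input figures are hypothetical and are not taken from the study.

```python
def employment_share(data_professionals: float, total_employment: float) -> float:
    """Indicator 1.2: data professionals as a percentage of total employment."""
    if total_employment <= 0:
        raise ValueError("total_employment must be positive")
    return 100.0 * data_professionals / total_employment


def intensity(data_professionals: float, companies: float) -> float:
    """Indicator 1.3: average number of data professionals per company."""
    if companies <= 0:
        raise ValueError("companies must be positive")
    return data_professionals / companies


# Hypothetical figures for one sector in one Member State (illustration only).
share_pct = employment_share(data_professionals=50, total_employment=1000)
avg_per_company = intensity(data_professionals=30, companies=10)
print(share_pct, avg_per_company)
```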

The segmentation by industry sector used in the study is presented in the following table with the corresponding NACE rev.2 Codes.

Table 22: Industry Sectors Classification

Eurostat Name | NACE Rev 2 Code | Abbreviation for Tables
Construction | F | Construction
Education | P | Education
Electricity, gas and steam, water supply, sewerage and waste management | D-E | Utilities
Finance | K | Finance
Human health activities | Q | Healthcare
Information and communications | J | Information and communication
Mining, Manufacturing | B-C | Mining, Manufacturing
Professional services, administrative and support services | L-M-N | Professional services
Public Administration And Defence; Compulsory Social Security | O | Public Administration
Transport and storage | H | Transport
Wholesale and retail trade, repair of motor vehicles and motorcycles, accommodation and food services | G-I | Wholesale / Retail

Methodology Approach

Our approach is based on an iterative process and on a calibration process of the final estimates. The approach has been repeated in the new study based on updates of the main sources.

Statistical Identification

Data professionals are not classified as such into any of the labour and occupation statistics. In order to define them statistically, we have adopted the International Standard Classification of Occupations (ISCO-08), selecting categories where data professionals may be included. The criteria adopted for the selection of the ISCO-08 codes are the following:

• We have selected the occupations where data professionals can be involved either as data providers or as data users;

• We have selected the occupations from 1 to 4-digit disaggregation;

• The occupation codes selected are those where the presence of data professionals can be detected because:

o They hold deep analytical skills;

o They do not need deep analytical skills but a basic understanding of statistics and/or machine learning in order to conceptualise the questions that can be addressed through deep analytical skills;

o They are the ones providing enabling technology and therefore they are providers of data services.

• The selected codes are those where a significant part of the workers may be data professionals; the occupations where data professionals are a very marginal part of the workers have been excluded. As an example, medical practitioners have been excluded, although some practitioners may be data professionals because they undertake research activities. Since they are only a very marginal part of the practitioners, we excluded them from the occupations where data professionals are present;

• We excluded all the data professionals who are not included in the knowledge economy perimeter because their occupation is a low-skilled one, i.e. with a high routine level (as an example, call centre workers are in theory data professionals, but since their activity is routine and as such excluded from the knowledge economy, they are not considered data professionals).

Table 23: ISCO-08 Structure and Data Professionals

ISCO-08 structured classification | Major Groups (1 digit) | Sub-groups (2 digits) | Minor Groups (3 digits) | Units (4 digits)
Number of codes ISCO-08 structure | 10 | 43 | 130 | 436
Number of selected codes including data professionals | 4 | 9 | 21 | 52
Share of data professionals’ codes in the ISCO-08 structure | 40% | 21% | 16% | 12%

Source: IDC elaboration on ISCO codes
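The shares in the last row of Table 23 can be reproduced from the two rows of counts. The short sketch below uses the counts reported in the table; the variable names are illustrative only.

```python
# Counts taken from Table 23 (ISCO-08 structure vs. selected codes).
isco_codes = {
    "Major Groups (1 digit)": 10,
    "Sub-groups (2 digits)": 43,
    "Minor Groups (3 digits)": 130,
    "Units (4 digits)": 436,
}
selected_codes = {
    "Major Groups (1 digit)": 4,
    "Sub-groups (2 digits)": 9,
    "Minor Groups (3 digits)": 21,
    "Units (4 digits)": 52,
}

# Share of selected codes at each level, rounded to a whole percentage.
shares = {
    level: round(100 * selected_codes[level] / isco_codes[level])
    for level in isco_codes
}

for level, share in shares.items():
    print(f"{level}: {share}%")
```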

Calculation of the quantitative Perimeter

The quantitative perimeter of employment where data professionals are trackable is based on the selected ISCO codes crossed with the NACE classification of economic activities, for each one of the 28 Member States and the EU as a whole, and has been updated based on the sources updates.

Estimate and Calibration of the Penetration of Data Professionals

The next step is to estimate the percentage of data professionals within the perimeter of data professional candidates. To calculate the coefficients for this percentage, we have elaborated a set of assumptions (specified in the D2- Methodology report of the EDM Study). The assumptions have been revised and updated for each release of the study and applied to the model to calculate the share of data professionals by Member State and by industry.
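As an illustration of this step, the sketch below multiplies the employment perimeter of each Member State and industry cell by a penetration coefficient. The coefficients and employment figures are hypothetical; the actual assumptions are those documented in the methodology report, and the real model includes a calibration stage that is not shown here.

```python
from typing import Dict, Tuple

Cell = Tuple[str, str]  # (Member State, industry sector)


def estimate_data_professionals(
    perimeter: Dict[Cell, float],
    penetration: Dict[Cell, float],
) -> Dict[Cell, float]:
    """Apply a penetration coefficient (0..1) to each cell of the perimeter."""
    estimates = {}
    for cell, employment in perimeter.items():
        coefficient = penetration[cell]
        if not 0.0 <= coefficient <= 1.0:
            raise ValueError(f"coefficient out of range for {cell}")
        estimates[cell] = employment * coefficient
    return estimates


# Hypothetical perimeter (workers in the selected ISCO codes) and coefficients.
perimeter = {("IT", "Finance"): 200_000, ("IT", "Education"): 100_000}
penetration = {("IT", "Finance"): 0.25, ("IT", "Education"): 0.125}

estimates = estimate_data_professionals(perimeter, penetration)
total = sum(estimates.values())
print(estimates, total)
```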

Forecasting Data Professionals

The same model was applied to forecast data professionals to 2025, by developing specific assumptions by scenario, even though the level of uncertainty is higher, and the reliability of the forecasts is lower.


Measuring Data Companies

Definition and Scope

Data companies are organisations that are directly involved in the production, delivery and/or usage of data in the form of digital products, services and technologies. They can be both data suppliers’ and data users’ organisations:

• Data suppliers have as their main activity the production and delivery of digital data-related products, services, and technologies. They represent the supply side of the Data Market.

• Data users are organisations that generate, exploit, collect and analyse digital data intensively and use what they learn to improve their business. They represent the demand side of the Data Market.

Table 24: Indicator 2 – Number of Data Companies

N. | Name | Description | Type and Time | Segmentation
2.1 | Number of data suppliers | Total number of data suppliers, measured as legal entities based in the EU | Number, 2017-18-20. Forecast to 2025, 3 Scenarios | By Geography: 28 EU MS + EU27 MS + total EU. By Industry: 2 NACE rev2 Section J Information and Communication and section M Professional, scientific and technical activities. By Company Size: below 250 employees; above 250 employees
2.2 | Share of data suppliers | Total data companies on total companies in industry J and M | % 2017-18-20 | By Geography: 28 EU MS + EU27 MS + total EU. By Industry: 2 NACE rev2 Section J Information and Communication and section M Professional, scientific and technical activities
2.3 | Number of data users | Total number of data users in the EU, measured as legal entities based in one EU country | Number, 2017-18-20. Forecast to 2025, 3 Scenarios | By Geography: 28 EU MS + EU27 MS + total EU. By Industry: 11 industry sectors NACE rev.2
2.4 | Share of data users | Total data users as share of total private companies | % 2017-18-19 | By Industry: 2 NACE rev2 Section J Information and Communication and section M Professional, scientific and technical activities

Methodology Approach

Data companies have been measured by updating the same model used in the previous EDM Study (see Figure below) which leverages both IDC and public sources.

• Eurostat business demography statistics in the European Union, treating aspects such as the total number of active enterprises in the business economy, their birth rates, death rates, and the survival rate (last update: December 2014);

• Eurostat annual structural business statistics with a breakdown by size-class, which are the main source of data for an analysis of SMEs (latest update: March 2016);

• IDC’s detailed market forecast estimates for IT Hardware, Software, and IT Services;

• IDC Worldwide Black Book (Standard Edition), quarterly updates. The Black Book represents IDC's quarterly analysis of the status and projected growth of the worldwide ICT industry in 54 countries;

• IDC European Vertical Markets Survey.


Figure 14: Data Companies Model

Measuring the Revenues of Data Companies

Definition and Scope

Data companies’ revenues are the revenues generated by data suppliers for the products and services specified in our definition of the Data Market. The revenues correspond to the aggregated value of all the data-related products and services generated by Europe-based suppliers, including exports outside the EU.

Table 25: Indicator 3 – Revenues of Data Companies

N. | Name | Description | Type and Time | Segmentation
3.1 | Total revenues of data companies | Total revenues of the Data Suppliers calculated by Indicator 2 | Billion €, 2017-18-20. Forecast to 2025, 3 Scenarios | By Geography: EU27; EU27 + U.K.; total EU. By Company Size: below 250 employees; above 250 employees
3.2 | Share of data companies’ revenues | Total revenues of the Data Suppliers calculated by Indicator 2 | % of revenues on total, 2017-18-20 | By Geography: EU27; EU27 + U.K.; total EU

[Figure 14 labels: Eurostat Data; Eurostat Data by segment; Data Market Survey; Data Market Forecast; Segments; Country Clusters. Data Suppliers: main activity is production and delivery of data-related products, services, technologies. Data Users: organisations with a high-intensity reliance on data for the accomplishment of their mission, which generate and exploit their own data, collect online customer data intensively, and subject the data to sophisticated analysis.]


Methodology Approach

The indicator has been measured applying the same model used in the previous EDM Study, which calculated the revenues by feeding on:

• Eurostat and IDC statistics on average IT vendors revenues by size and sector;

• The total number of data companies by country, industry and size class;

• The value of the Data Market by country and industry;

• The estimated share of exports-imports in the value of the Data Market.

Measuring the Data Market

Definition and Scope

The Data Market is the marketplace where digital data is exchanged as “products” or “services” as a result of the elaboration of raw data. We define its value as the aggregate value of the demand of digital data without measuring the direct, indirect and induced impacts of data in the economy as a whole. The value of the Data Market includes imports (data products and services bought on the global digital market from suppliers not based in Europe) and excludes the exports of the European data companies.
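Read together, the definitions of the Data Market (which includes imports and excludes exports) and of data companies' revenues (which include exports) imply a simple accounting relationship between the two values. The sketch below expresses that reading; it is an interpretation of the definitions with hypothetical figures, not the estimation model used in the study.

```python
def data_companies_revenues(market_value: float, imports: float, exports: float) -> float:
    """Revenues of Europe-based suppliers implied by the two definitions.

    market_value: demand for data products and services in Europe (includes imports).
    imports: part of that demand served by suppliers not based in Europe.
    exports: sales of Europe-based suppliers outside the EU.
    """
    if imports > market_value:
        raise ValueError("imports cannot exceed the value of the Data Market")
    return market_value - imports + exports


# Hypothetical values in billion euro (illustration only).
revenues = data_companies_revenues(market_value=60.0, imports=10.0, exports=5.0)
print(revenues)
```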

Table 26: Indicator 4 – Size of the Data Market

N. | Name | Description | Type and Time | Segmentation
4 | Value of the Data Market | Estimate of the overall value of the Data Market | Billion €, 2017-18-20. Forecast to 2025, 3 Scenarios | By Geography: EU27; EU27 + U.K.; total EU. By Industry: 11 industry sectors NACE rev.2. By Size: not applicable

Methodology Approach

The Data Market indicator is being updated every year for the duration of the study. The model is based on the extraction of data from IDC databases concerning the components of hardware, software and services spending which fall within the definition of the Data Market. The IDC data is already segmented by country and by industry, even though not all Member States are covered, and the industry classification is slightly different from the one proposed in this project. The respective shares of the software, hardware, and services markets used to derive the Data Market come from IDC surveys covering Big Data, IT spending patterns and intentions in the European market, and a survey of data suppliers and data users in key Member States, together with analyst expertise and alignment with IDC's European and worldwide forecasts for the business analytics and Big Data Market.

The model updates the Data Market value shares by Member State and by industry.
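The core of this approach can be pictured as applying a data-related share to each spending component and summing the results. The sketch below assumes one share per component; the spending values and shares are hypothetical, and the real model works by Member State and industry with shares derived from IDC surveys.

```python
from typing import Dict


def data_market_value(spending: Dict[str, float], data_share: Dict[str, float]) -> float:
    """Sum the data-related portion of each spending component."""
    return sum(spending[component] * data_share[component] for component in spending)


# Hypothetical spending (billion euro) and data-related shares for one country.
spending = {"hardware": 100.0, "software": 80.0, "services": 120.0}
data_share = {"hardware": 0.125, "software": 0.5, "services": 0.25}

value = data_market_value(spending, data_share)
print(value)
```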


Figure 15: Data Market Model

Source: IDC 2016

Measuring the Data Economy

Definition and Scope

The Data Economy measures the overall impacts of the Data Market on the economy as a whole. It involves the generation, collection, storage, processing, distribution, analysis, elaboration, delivery, and exploitation of data enabled by digital technologies. The Data Economy also includes the direct, indirect, and induced effects of the Data Market on the economy.

The Data Economy indicator measures the value of the Data Economy based on the estimate of all the economic impacts following the adoption of data-driven innovation and data technologies in the EU. As such, the indicator aggregates direct, indirect, induced impacts of the Data Market defined as follows.

1. The direct impacts: the initial and immediate effects generated by the data suppliers; they represent the activity engendered by all businesses active in data production. The quantitative direct impacts are measured as the revenues from data products and services sold, i.e. the value of the Data Market. Because the Data Market estimate is more reliable than the estimate of data companies’ revenues, we adopt the Data Market value as a proxy of the direct impacts. For the sake of simplicity, direct impacts therefore coincide with the value of the Data Market.

[Figure 15 labels: Data Market Model; IDC Product & Country Forecasts; IDC Vertical Market Forecasts; Business Analytics Software forecast; IT Services forecast; IT Hardware forecast; Product shares; Tie Ratio; Vertical, Size Segments; % market and Spend, 2012 to 2020; Data Market.]


2. The indirect impacts: the economic activities generated along the company's supply chain by the data suppliers. There are two different types of indirect impacts: the backward indirect impacts and the forward indirect impacts (Richardson, 1985):

a. The backward indirect impacts: such impacts represent the business growth resulting from changes in sales from suppliers to the data industry. In order to produce and deliver data products and services, the data companies need inputs from other stakeholders. Revenues from those sales to data companies are the backward indirect impacts.

b. The forward indirect impacts: such impacts include the economic growth generated through the use of data products and services by the downstream industries, i.e. the data users as a selected number of industries. For the user companies, data is now a relevant factor of production; the adoption of data products and services by the downstream industries provides different types of competitive advantage and productivity gains to the user industries. The main benefits that the exploitation of data can provide to downstream industries are (OECD, 2013; McKinsey, 2011):

i. Optimising production and delivery processes: data-driven processes (data-driven production);

ii. Improving marketing by providing targeted advertisements and personalised marketing practices (data-driven marketing);

iii. Improving existing organisation and management practices (data-driven organisation).

3. The induced impacts: these impacts include the economic activity generated in the whole economy as a secondary effect. Induced additional spending is generated both by new workers, who receive a new wage, and by the increased wages of existing jobs. This spending creates new revenues in nearly all sectors of the economy. The additional consumption supports economic activity in various industries such as retail, consumer goods, banks, entertainment, etc.

Table 27: Indicator 5 – Value of the Data Economy

N. | Name | Description | Type and Time | Segmentation
5 | Value of the Data Economy | Value of the Data Market plus direct, indirect and induced impacts on the EU economy | Billion €, 2017-18-20. Forecast to 2025, 3 Scenarios | By Geography: EU27; EU27 + U.K.; total EU
5.1 | Incidence of the Data Economy on GDP | Ratio between value of the Data Economy and EU GDP | %, 2017-18-20. Forecast to 2025, 3 Scenarios | By Geography: EU27; EU27 + U.K.; total EU
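The aggregation behind Indicators 5 and 5.1 can be sketched as a sum of the impact components followed by a ratio to GDP. The sketch assumes, as stated in the definition above, that direct impacts coincide with the value of the Data Market; the class name and all figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DataEconomyImpacts:
    """Impact components in billion euro (hypothetical structure)."""
    direct: float             # assumed equal to the value of the Data Market
    backward_indirect: float  # sales from suppliers to the data industry
    forward_indirect: float   # gains in the downstream user industries
    induced: float            # secondary effects of additional spending

    def total(self) -> float:
        """Value of the Data Economy (Indicator 5)."""
        return (self.direct + self.backward_indirect
                + self.forward_indirect + self.induced)


def incidence_on_gdp(data_economy_value: float, gdp: float) -> float:
    """Indicator 5.1: value of the Data Economy as a percentage of GDP."""
    if gdp <= 0:
        raise ValueError("gdp must be positive")
    return 100.0 * data_economy_value / gdp


# Hypothetical figures (illustration only).
impacts = DataEconomyImpacts(direct=60.0, backward_indirect=10.0,
                             forward_indirect=200.0, induced=30.0)
economy_value = impacts.total()
gdp_share = incidence_on_gdp(economy_value, gdp=15_000.0)
print(economy_value, gdp_share)
```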

This estimate of the Data Economy does not include the user benefits and social impacts of data-driven innovation such as changes in quality of life (health, safety, recreation, air quality). Although these benefits may be evaluated in economic (money) terms, they are not economic impacts as such and as defined above as they do not induce an increase in the business activities and a consequent growth in GDP.

Analysts have underlined that the new decision-making processes act as a rationalisation and optimisation factor (Brynjolfsson, 2011; McKinsey, 2012), since they improve effectiveness and efficiency and, in some cases, may have a disruptive effect. The impacts related to the new decision-making processes are the ones we have called the forward indirect impacts.

The value creation process based on data rests on the elaboration of information and knowledge (OECD 2016), although the boundaries between data, information, and knowledge are sometimes fuzzy. The huge volume of data is a global phenomenon which is sometimes viewed with suspicion by citizens, consumers and businesses because data flows are seen as an intrusion on privacy. Nevertheless, there is currently some evidence showing that data analysis can provide benefits to both businesses and consumers. This is not surprising, since economic theory holds that information encourages competition between businesses for the benefit of consumers.

Data do not provide value and benefits as such; data need to be collected, stored, aggregated, combined and analysed in order to be appropriately used for decision making processes. To create value, data need to be processed (OECD, 2016):

• Extracting information from structured and unstructured data: data analytics techniques are today able to analyse both structured and unstructured data. It is worth recalling that most data stored by businesses are unstructured (IDC, 2012). Technologies such as optical character recognition, natural language processing, face recognition algorithms and machine learning algorithms are empowering the use of all data.

• Real-time monitoring and tracking: analysis of data in real time is often mentioned as one of the most powerful factors, since it helps organisations make real-time decisions, which, in a fast-changing world, is a well-known competitive advantage.

• Inference and prediction: until now, prediction was based exclusively on prior information and data series. Data analytics can now enable the creation of information even without prior information. Such information can be created through patterns and correlations of data. Personal information, for example, can be deduced from anonymous or non-personal data. Businesses and organisations demand real-time insights rather than historical and periodical information, as well as advanced specialised data analytics services. Algorithms allow machine and statistical learning based on non-specific data; businesses can learn and predict a lot about their customers even if they do not have specific data and time series about the issue they are interested in. Machine learning has applications, for example, in health care, where data collected on patients are recorded by imaging, and in production processes, where it helps increase the quality of production.

The diffusion of technology supporting the production and analysis of data induces organisations and businesses to base their decisions on data much more than they used to. As pointed out by the OECD in its recent report, the decision-making process is also changing. Decision makers do not necessarily need to understand the phenomenon before they act on it. A store can change its product placement based on data analysis without needing to know why such a change should improve sales. There is therefore a decision automation process: “first comes the analytical factor, then the action, and last, if at all, the understanding” (OECD, 2015).

The impacts of such a new approach to decision making and to the use of data in all the functions of enterprises and organisations are many and varied, so we believe such impacts will be the object of studies and analysis in the upcoming years. It is, at this point, difficult to classify them and to suggest a taxonomy of such impacts.

Such impacts have been observed through some empirical studies and case analysis. The most relevant ways the benefits appear are the following.

• Creating more information, knowledge and transparency: technology is making data more accessible and exploitable to all kinds of stakeholders, including SMEs. This increases transparency, and decisions are made through a rational process.

• Improving performance: having access to a wide range of information and to large volumes of data is changing the way decisions are made. An increasing number of organisations are going to become data-driven organisations, which means that they make decisions based on empirical results. As an example, retailers can adjust prices and promotions more precisely than they used to and in real time. This may improve competitiveness. McKinsey underlines that the health sector is achieving many benefits from the new decision-making process: studies on clinical data make it possible to identify and understand the sources of variability in treatment, to identify the best treatment protocols and to create guidelines for the optimisation of treatment decisions. This not only increases the effectiveness of treatments, but also produces savings.

• Improving customisation of actions for better decisions: data technology is definitely improving the segmentation of customers and the analysis of their preferences in real time. This allows companies to supply products and services targeted to specific groups of individuals who have specific needs and preferences. Such segmentation is also useful when supplying public services. It helps define the price precisely and offer exactly what is needed, which means better quality; companies also avoid offering products and services that consumers are not willing to pay for.

• Innovating products and services as well as business models: the more information and understanding businesses have about their customers, the better they can serve them. It is important to say that although consumers may fear their privacy is infringed, this can also provide them with an unexpected surplus: real-time price comparison services not only provide better transparency but also allow buying the best product at the most convenient price (for example when buying airline tickets online or when booking hotels). Companies can in fact produce and create new products and services to better satisfy their customers’ needs. This is also true for the public sector and specifically for the health care system, where preventive care programmes can be created.

These effects are reflected in an increase in revenues due to higher market share from the increase in competitiveness or due to a reduction in costs. All these effects are included in the forward indirect impacts; these impacts are delivered on the user industry, and because of the above reasons, these are the impacts we consider new on the overall economic system.

Methodology Approach

Measuring the Data Economy depends on the macroeconomic context on one hand, and on the adoption/diffusion and integration processes the companies are implementing on the other hand. Moreover, there is a necessary time lag before the impacts take place in the economic system. Therefore, the estimates are based on a set of assumptions, including choices about proxy indicators.

In order to measure the impact of the diffusion and use of data services and products, we estimated each component (as defined in the above paragraph) of the impact separately.


The estimation approach developed in the previous study was based on a number of assumptions on the one hand, and on results from a survey launched during the first year of research on the other.

The following assumptions have been confirmed:

• The penetration rates of data in terms of value added for the user industries using data are positively correlated to the penetration rate in terms of number of companies using data.

• The survey conducted in the study 2013-2016 provided information about the quantitative benefits due to the use of data, for the six major Member States plus Czech Republic; such benefits have been taken into consideration for the six major Member States.

• For Austria, Belgium, Denmark, Finland, Ireland, Luxembourg, Malta, the Netherlands, and Sweden we assumed that these Member States have the same distribution of benefits as the average of the Big Six.

• For the other Member States, we estimated the benefits of the rest of Europe, based on the survey results, and we assumed that all the minor Member States are achieving benefits similar to the rest of Europe.

• For the induced impacts, we assumed that the additional earnings are spent according to the general economic mood.
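The country-level assumptions above amount to a three-way rule for choosing which benefit distribution applies to a Member State. The sketch below encodes that rule; the set of surveyed major Member States is passed in as a parameter because they are not listed in this section, and the function name and return labels are hypothetical.

```python
from typing import Set

# Member States assumed to share the Big Six average, as listed above.
BIG_SIX_AVERAGE_GROUP = {
    "Austria", "Belgium", "Denmark", "Finland", "Ireland",
    "Luxembourg", "Malta", "Netherlands", "Sweden",
}


def benefit_source(member_state: str, surveyed_major_states: Set[str]) -> str:
    """Return which benefit distribution is assumed for a Member State."""
    if member_state in surveyed_major_states:
        return "own survey results"
    if member_state in BIG_SIX_AVERAGE_GROUP:
        return "Big Six average"
    return "rest of Europe estimate"


# "StateX" is a placeholder for one of the surveyed major Member States.
print(benefit_source("Malta", {"StateX"}))
print(benefit_source("StateX", {"StateX"}))
print(benefit_source("Portugal", {"StateX"}))
```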

In order to update the estimates of the different components of the impacts, we have adopted some new assumptions:

• In the next three years, we are going to stay in a relatively emerging stage of data diffusion, so that in our view the structure of the data impacts is not going to change.

• For the quantitative benefits due to the use of data, we assume that the benefits will vary quantitatively and be correlated to macroeconomic trends, and specifically to the trends of the industries (and stakeholders) affected.

Measuring the Data Skills Gap

Definition and Scope

This indicator captures the potential gap between demand and supply of data skills in Europe, since the lack of skills may become a barrier to the development of the data industry and the rapid adoption of data-driven innovation.

Table 28: Indicator 6 – Data Skills Gap

N. | Name | Description | Type and Time | Segmentation
6 | Data Workers Skills Gap | Gap between demand and supply of data workers | Absolute number and % on total demand, 2017-18-20. Forecast to 2025, 3 scenarios | By Geography: EU27; EU27 + U.K.; total EU, main EU Member States
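Indicator 6 is expressed both as an absolute number and as a percentage of total demand. A minimal sketch of that calculation follows; the function name and the figures are hypothetical, and a negative result is read here as over-supply.

```python
from typing import Tuple


def skills_gap(demand: float, supply: float) -> Tuple[float, float]:
    """Return the gap (demand minus supply) and the gap as % of total demand.

    A positive gap means unfilled demand; a negative gap means over-supply.
    """
    if demand <= 0:
        raise ValueError("demand must be positive")
    gap = demand - supply
    return gap, 100.0 * gap / demand


# Hypothetical demand and supply of data workers (thousands).
gap, gap_pct = skills_gap(demand=1000.0, supply=900.0)
print(gap, gap_pct)
```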


Methodology Approach

The methodology approach is the same as that implemented by IDC-empirica to estimate the supply-demand balance of ICT skills in the EU (e-Skills) on behalf of the EC DG Enterprise (now DG GROW). The model was first developed in 2009 and since then has been successfully validated and updated several times. The results have been used by the EC to support the e-skills policy and the latest results were presented in December 2015 at the European E-skills 2015 Conference in Brussels [21]. However, data skills are not a subset of ICT skills, so the scope of supply and the dynamics of demand are different from the e-skills model developed by IDC.

To update the measurement of the indicators the study team has applied the same model developed for the previous EDM Study, combining the estimates and forecasts of the demand and supply of data professionals with data skills leveraging a wealth of different sources, among which:

• OECD Digital Economy Papers, among which: OECD (2014), Measuring the Digital Economy: A New Perspective; OECD Publishing.

• ILOSTAT (International Labour Organization) Statistics and Databases (2015).

• EUROSTAT Tertiary Education Statistics (Last update: December 2015).

• European Data Science Academy (EDSA) project deliverables and publications (July 2015).

Figure 16: The Data Skills Demand-Supply Balance Model

Source: European Data Market Monitoring Tool, IDC 2016

21 “e-Skills in Europe: Trends and Forecasts for the European ICT Professional and Digital Leadership Labour Markets (2015-2020)”, empirica Working Paper (November 2015)

[Figure 16 labels: Data Workers Demand (Data Companies, Data Users); Data Workers Supply (Education, Training, Other Careers); Gap / Over-supply.]


8. Essential glossary – the key indicators

Data professionals are data workers who collect, store, manage, and/or analyse, interpret, and visualise data as their primary or as a relevant part of their activity. Data professionals must be proficient with the use of structured and unstructured data, should be able to work with a huge amount of data and be familiar with emerging database technologies. They elaborate and visualise structured and unstructured data to support analysis and decision-making processes.

Data companies can be both data suppliers’ and data users’ organisations:

• Data suppliers have as their main activity the production and delivery of digital data-related products, services, and technologies. They represent the supply side of the Data Market.

• Data users are organisations that generate, exploit, collect and analyse digital data intensively and use what they learn to improve their business. They represent the demand side of the Data Market.

Data companies’ revenues are the revenues generated by data suppliers for the products and services specified in our definition of the Data Market. The revenues correspond to the aggregated value of all the data-related products and services generated by Europe-based suppliers, including exports outside the EU.

The Data Market is the marketplace where digital data is exchanged as “products” or “services” as a result of the elaboration of raw data. We define its value as the aggregate value of the demand of digital data without measuring the direct, indirect and induced impacts of data in the economy. The value of the Data Market includes imports (data products and services bought on the global digital market from suppliers not based in Europe) and excludes the exports of the European data companies.

The Data Economy measures the overall impacts of the Data Market on the economy. It involves the generation, collection, storage, processing, distribution, analysis, elaboration, delivery, and exploitation of data enabled by digital technologies. The Data Economy also includes the direct, indirect, and induced effects of the Data Market on the economy.

The Data Professionals’ Skills Gap captures the potential gap between demand and supply of data skills in Europe, since the lack of skills may become a barrier to the development of the data industry and the rapid adoption of data-driven innovation.

Data is usually defined as qualitative or quantitative statements or information which can be coded and which are assumed to be factual and not the product of analysis or interpretation. For the purposes of this study we consider only data which is collected, processed, stored, and transmitted over digital information infrastructures and/or elaborated with digital technologies. This definition includes multimedia objects which are collected, stored, processed, elaborated and delivered for exploitation through digital technologies (for example, image databases).

Information is the output of processes that summarise, interpret or otherwise represent the content of a message to convey meaning. Therefore, information is not a mere synonym for data.

The Knowledge Economy is defined as the production of products and services based on knowledge-intensive activities that contribute to an accelerated pace of technical and scientific advance, as well as rapid obsolescence. The key component of a knowledge economy is a greater reliance on intellectual capabilities than on physical inputs or natural resources.

The Internet Economy is defined as covering the full range of our economic, social and cultural activities supported by the Internet and related information and communications technologies [22].

Information or knowledge workers are, in the most basic definition, persons employed to produce or analyse ideas and information. Multiple sources define knowledge workers as workers creating knowledge capital, who process existing information to create new information to be used to define and solve problems. Examples include medical practitioners, lawyers, judges, teachers, architects, engineers, managers and salespeople. Their main capital is knowledge, and they are mainly focused on “non-routine” tasks.

Data workers collect, store, manage and analyse data as their primary activity. Data workers can be knowledge workers if they are focused on non-routine tasks. For example, data entry clerks’ primary activity is related to data, so they are data workers. However, data entry is a very routine task, and as such data entry clerks should not be considered knowledge workers. Another category of data workers is data analysts, who usually extract and analyse information from a single source, such as a CRM database. Their work requires a medium level of creative thinking and usually involves structured data.
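
The distinction between data workers and knowledge workers can be sketched as two independent tests. The class, its field names and the example roles below are hypothetical simplifications of the definitions above, not a classification scheme used in the study.

```python
from dataclasses import dataclass


@dataclass
class Role:
    name: str
    data_is_primary_activity: bool  # collects, stores, manages or analyses data
    works_with_information: bool    # produces or analyses ideas and information
    routine_tasks: bool             # mainly routine work


def classify(role):
    """Return the set of labels that apply to a role (illustrative only)."""
    labels = set()
    if role.data_is_primary_activity:
        labels.add("data worker")
    # Knowledge workers are mainly focused on non-routine tasks.
    if role.works_with_information and not role.routine_tasks:
        labels.add("knowledge worker")
    return labels


clerk = Role("data entry clerk", True, True, True)
lawyer = Role("lawyer", False, True, False)
print(classify(clerk))   # {'data worker'}
print(classify(lawyer))  # {'knowledge worker'}
```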

Data scientists require solid knowledge of statistical foundations and advanced data analysis methods, combined with a thorough understanding of scalable data management and the associated technical and implementation aspects (European Big Data Value Partnership Strategic Research and Innovation Agenda, April 2014). They can deliver novel algorithms and approaches such as advanced learning algorithms, predictive analytics mechanisms, etc. Data scientists should also have a deep knowledge of their businesses. The most difficult skills to find include advanced analytics and predictive analysis skills, complex event processing skills, rule management skills, business intelligence tools, and data integration skills (UNC, 2013).

22 “Measuring the Internet Economy: A Contribution to the Research Agenda”, OECD Digital Economy Papers, 2013

GETTING IN TOUCH WITH THE EU

In person

All over the European Union there are hundreds of Europe Direct information centres. You can find the address of the centre nearest you at: https://europa.eu/european-union/contact_en

On the phone or by email

Europe Direct is a service that answers your questions about the European Union. You can contact this service:

– by freephone: 00 800 6 7 8 9 10 11 (certain operators may charge for these calls),

– at the following standard number: +32 22999696 or

– by email via: https://europa.eu/european-union/contact_en

FINDING INFORMATION ABOUT THE EU

Online

Information about the European Union in all the official languages of the EU is available on the Europa website at: https://europa.eu/european-union/index_en

EU publications

You can download or order free and priced EU publications at: https://publications.europa.eu/en/publications. Multiple copies of free publications may be obtained by contacting Europe Direct or your local information centre (see https://europa.eu/european-union/contact_en).

EU law and related documents

For access to legal information from the EU, including all EU law since 1952 in all the official language versions, go to EUR-Lex at: http://eur-lex.europa.eu

Open data from the EU

The EU Open Data Portal (http://data.europa.eu/euodp/en) provides access to datasets from the EU. Data can be downloaded and reused for free, for both commercial and non-commercial purposes.

doi: 10.2759/72084 ISBN 978-92-76-19505-4

KK-01-20-355-EN-N