Giorgio Micheletti, IDC Italia
THE EUROPEAN DATA MARKET MONITORING TOOL
KEY FACTS & FIGURES, FIRST POLICY CONCLUSIONS, DATA LANDSCAPE AND QUANTIFIED STORIES
D2.9 Final Study Report
Prepared by:
Gabriella Cattaneo, Giorgio Micheletti, Mike Glennon, Carla La Croce (IDC) and Chrysoula Mitta (The Lisbon Council)
Internal identification
Contract number: N- 30-CE-0835309/00-96
EUROPEAN COMMISSION
Directorate-General for Communications Networks, Content and Technology
Directorate G - Data
Unit G1 — Data Policy and Innovation
Contact: [email protected]
European Commission B-1049 Brussels
LEGAL NOTICE
This document has been prepared for the European Commission; however, it reflects the views only of the authors, and the European Commission is not liable for any consequence stemming from the reuse of this publication. The Commission does not guarantee the accuracy of the data included in this study. More information on the European Union is available on the Internet (http://www.europa.eu).
PDF ISBN 978-92-76-19505-4 doi: 10.2759/72084 KK-01-20-355-EN-N
Manuscript completed in June 2020
Luxembourg: Publications Office of the European Union, 2020
© European Union, 2020
The reuse policy of European Commission documents is implemented by the Commission Decision 2011/833/EU of 12 December 2011 on the reuse of Commission documents (OJ L 330, 14.12.2011, p. 39). Except otherwise noted, the reuse of this document is authorised under a Creative Commons Attribution 4.0 International (CC-BY 4.0) licence (https://creativecommons.org/licenses/by/4.0/). This means that reuse is allowed provided appropriate credit is given and any changes are indicated.
For any use or reproduction of elements that are not owned by the European Union, permission may need to be sought directly from the respective rightholders.
EUROPE DIRECT is a service to help you find answers to your questions about the European Union
Freephone number (*): 00 800 6 7 8 9 10 11
(*) The information given is free, as are most calls (though some operators, phone boxes or hotels may charge you)
TABLE OF CONTENTS
ABSTRACT
RÉSUMÉ
EXECUTIVE SUMMARY
Quantifying the European Data Market – Key Facts & Figures
Describing the Data Market – The Quantified Stories
Mapping the Data Market – The Data Landscape and the Data Market Monitoring Tool
Acting Upon the Data Market – The Role of Policy
1. INTRODUCTION
1.1. Objectives
1.2. Methodological Approach
1.3. The Structure of this Report
2. QUANTIFYING THE DATA MARKET – KEY FACTS & FIGURES
2.1 Three future Development Paths: The Data Market at 2025
2.2 The Workforce Dimension: Data Professionals and Data Skills Gap
2.3 The Supply - Demand Dimension: The Data Companies
2.4 The Business and Economic Dimension: The Data Market and the Data Economy
2.5 The International Dimension - The Data Economy Beyond the EU – US, Brazil and Japan
3. DESCRIBING THE DATA MARKET – THE QUALI-QUANTITATIVE STORIES
3.1. Story 6-7 Health Data and Data-driven Innovation in the European Healthcare Industry
3.2. Story 8 - Accelerating the Impact of Data Commons
3.3. Story 9 – Scaling up data-driven innovation: European industry requirements and the role of European data spaces
4. MAPPING THE DATA MARKET – DATA LANDSCAPE AND DATA MARKET MONITORING TOOL
4.1 The EU Data Landscape
5. ACTING UPON THE DATA MARKET – THE ROLE OF POLICY
5.1 The Role of Policy and the Future of Europe’s Data Economy: The Three Scenarios
5.2 A change of pace in data policies
5.3 The EU Data Policy and the International Dimension
6. CONCLUSIONS
6.1 Quantifying the European Data Market – Key Facts & Figures
6.2 Describing the Data Market – The Quantified Stories
6.3 Mapping the Data Market – Data Landscape and Data Market Monitoring Tool
6.4 Acting Upon the Data Market – The Role of Policy
7. METHODOLOGICAL ANNEX
8. ESSENTIAL GLOSSARY – THE KEY INDICATORS
Abstract
This report presents a set of indicators measuring the data professionals, the value of the data market, the number of data supplier and data user companies and their revenues, and the overall impact of the data economy on EU GDP. All indicators are presented for the years 2018 through 2020 and forecasted to 2025 according to three alternative potential scenarios: Baseline, High Growth and Challenge scenarios.
In particular:
• The total number of data professionals, their share of total employment in the EU and their intensity (i.e. their average number per company) have constantly increased throughout the period under consideration;
• Data companies - the organisations providing data (data suppliers) and those relying heavily on data (data users) - have increased in number and share in the EU from 2018 to 2020 and are projected to continue their growth through 2025 under all three forecast scenarios;
• The value of the overall data market (i.e. the market where digital data is exchanged as products or services derived from raw data) as well as the value of the overall data economy (including the economic impacts generated by the data market) present the most dynamic picture and are expected to further increase up to 2025 under the three scenarios;
• The data worker skills gap - the gap emerging between the demand for and the supply of data workers - reveals a potential shortage of data skills in Europe across the period under consideration, particularly in the High Growth scenario;
• Finally, the report looks at the possible effects caused by the developments of the current Covid-19 pandemic. An additional post-Covid-impact scenario with estimates on the Data Market and the Data Economy in 2020 and in 2025 for the EU27 has been specifically developed and included.
Résumé
This report presents a set of indicators measuring data professionals, the value of the data market, the number of data supplier and data user companies and their revenues, and the overall impact of the data economy on EU GDP. All indicators are presented for the years 2018 to 2020, with forecasts to 2025 exploring three potential scenarios: Baseline, High Growth and Challenge.
In particular:
• The total number of data professionals, their share of total employment in the EU and their intensity (that is, their average number per company) have constantly increased throughout the period under consideration;
• Data companies, that is, the companies providing data (data suppliers) and those relying heavily on data (data users), have increased in number and share in the EU from 2018 to 2020 and should continue to grow until 2025 under all three forecast scenarios;
• The value of the overall data market (i.e. the market where digital data is exchanged as products or services derived from raw data) and the value of the overall data economy (including the economic impacts generated by the data market) present the most dynamic picture and should continue to increase until 2025 under the three scenarios;
• The data worker skills gap - the gap emerging between the demand for and the supply of workers specialised in data - reveals a potential shortage of data skills in Europe over the whole period under consideration, particularly in the High Growth scenario;
• Finally, the report examines the possible effects of the developments of the current Covid-19 pandemic. An additional scenario on the impact of Covid, with estimates of the data market and the data economy in 2020 and in 2025 for the EU27, has been specifically developed and included.
EXECUTIVE SUMMARY
This is the Final Study Report (Deliverable D2.9) of the Update of the European Data Market Study (SMART 2016/0063), entrusted in 2016 to IDC and the Lisbon Council. The present document brings together the results and the activities carried out by the contractors under:
• The Final Report on Facts & Figures (D2.7) extending the measurement of the European Data Market Monitoring Tool by presenting data for the years 2018-2019 and forecasts to the year 2025 under three alternative scenarios;
• The Final Report on Policy Conclusions (D2.8) measuring the progress of European policies towards the objective of maximising the growth of the Data Economy as measured by the European Data Market Monitoring Tool;
• The key messages from the quantified stories (D3.6-7, D3.8 and D3.9) produced by the study team and focusing on the operational, organizational and/or economic benefits generated by the use of data-driven technologies with a special focus on Data Commons and Data-driven Innovation in the European Healthcare Industry;
• The Third Data Landscape Report (D4.3) providing an overview of the EU Data Landscape and offering an up-to-date zoom onto the database of data market companies in Europe.
Quantifying the European Data Market – Key Facts & Figures
The European Data Market Monitoring Tool
In line with the results presented in the original European Data Market study (SMART 2013/0063) in February 2017, in the First Report on Facts & Figures (D2.1) of February 2018, in the Second Report on Facts & Figures (D2.4) of March 2019 and in the Final Report on Facts & Figures (D2.7) of May 2020, the measured indicators are organised around a modular and flexible structure – the European Data Market Monitoring Tool. The updated European Data Market Monitoring Tool designed by IDC is shown in the Figure below.
The Updated EDM Monitoring Tool
The EU Data Market and Data Economy in 2019
The value of the Data Economy, which measures the overall impacts of the Data Market on the economy as a whole, exceeded the threshold of 400 Billion Euro in 2019 for the EU27 plus the United Kingdom¹, with a growth of 7.6% over the previous year. The positive trend in the growth of the Data Economy is confirmed by the Data Market value in 2019 for the EU27 plus the U.K., which is growing faster than total IT spending, at 4.9% year-on-year, reaching 75 Billion Euro.
As far as supply and demand are concerned, data suppliers are estimated at more than 290,000 units in the EU27 plus the U.K. for 2019, exhibiting a year-on-year growth of 2.3%. Data users, by contrast, remained stable in 2019, amounting to nearly 716,000 units and registering a growth of 0.6% over the previous year. Following increasing growth rates over the prior four years, these figures confirm a consolidation of data companies in the EU. Revenues generated by data suppliers increased by 9% to reach almost 84 Billion Euro in the EU27 plus the U.K. The U.K. remained in the leading position, and together with Germany, France and Italy showed the highest shares of data revenues per country; together these countries accounted for two thirds (66%) of data revenues in the European Union plus the U.K.
According to the latest estimates, the number of data professionals in the EU27 plus the U.K. reached 7.6 million in 2019, corresponding to 3.6% of the total workforce, with an increase of 5.5% over the previous year. However, the EDM Monitoring Tool continues to register an imbalance between the demand and the supply of data skills in Europe, as the estimated gap reached approximately 459,000 unfilled positions in the EU27 plus the U.K., corresponding to 5.7% of total demand. The data skills gap is forecast to persist in all the forecast scenarios, as demand will continue to outpace supply.
The EU Data Market and Data Economy in 2025
The Update of the European Data Market Study also produced key facts & figures for the year 2025 according to three alternative evolution paths of the European Data Market and Data Economy, driven by different macroeconomic and framework conditions. Based on IDC research carried out in March-April 2020, an additional post-Covid-impact scenario with estimates on the likely Data Market and Data Economy decline in 2020, and potential rebound and impacts on the 2025 scenarios for the EU27, has been developed (see the subchapter “Considerations on COVID-19 impact” below).
The 2025 scenarios are shaped by a combination of economic and social drivers, focused on the interaction of two main axes:
• the high or low pace of diffusion of data-driven innovation, driven by demand-supply dynamics, and its impact on economic growth;
• the social and economic data governance model enabling a fair and competitive economy, as indicated by the new European Data Strategy.
This analysis highlights the critical turning points to be faced in the next years by governments, businesses and social actors in the development of the European Data Economy. The combination of alternative social and economic trends results in the following scenarios:
• The Baseline scenario is characterised by a healthy growth of data innovation, a
moderate concentration of power by dominant data owners with a data governance model
protecting personal data rights, and an uneven but rather wide distribution of data
innovation benefits in the society. This is considered the most likely scenario.
1 Since Brexit is now definitive (as of May 2020), the authors provided an overview of data for EU27 plus the U.K. until 2019, and for the remaining months data are displayed for EU27.
• The High Growth scenario is characterised by a high level of data innovation, low data
power concentration, an open and transparent data governance model with high data
sharing, and a wide distribution of the benefits of data innovation in the society;
• The Challenge scenario is characterised by a low level of data innovation, a moderate
level of data power concentration due to digital markets fragmentation, and an uneven
distribution of data innovation benefits in the society.
The scenarios explore the drivers and framework conditions that may help to maximise the benefits of a balanced Data Economy and to avoid the risks of an unbalanced one, highlighting the consequences of policy actions.
In the Baseline scenario, the EU27 GDP cumulative growth average in the period 2020-2025 (+1.5%) will sustain the investments in the digital economy and consumer willingness to spend. As a result, the Data Market is forecast to reach 82.5 billion Euro in the EU27, with a compound annual growth rate of 5.8%. The Data Economy will grow faster than the Data Market, thanks to a positive multiplier impact of data innovation on the economy, reaching a value of 550 billion Euro in the EU27, with a steep increase of its incidence on EU GDP from 2.8% in 2020 to 4% in 2025.
In the High Growth scenario, the EU27 GDP compound annual growth rate in the period 2020-2025 (+2.1%) will be 1.5 times higher than in the Challenge scenario and 40% higher than in the Baseline scenario. This will accelerate the investments in the digital economy and consumer willingness to spend. In the European Union, public and private investments in Artificial Intelligence, advanced robotics, automation and new skills will accelerate. As a result, the Data Market is forecast to reach 107 billion Euro in the EU27, with a compound annual growth rate of 11.5% between 2020 and 2025. The Data Economy will grow faster than the Data Market, reaching a value of 827 billion Euro in the EU27, with an incidence on EU GDP of 5.9%, against the 4.0% of the Baseline scenario.
In the Challenge scenario, the EU GDP compound annual growth rate in the period 2020-2025 will be only 0.9%, substantially lower than in the other scenarios. As a result, in this scenario the Data Market is forecast to reach 72 billion Euro in the EU27 with a compound annual growth rate of 3% between 2020 and 2025. In the same context, the Data Economy will reach a value of 432 billion Euro in the EU27 with an incidence on GDP of 3.3%, compared to 4% in the Baseline scenario 2025.
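The compound annual growth rates (CAGR) quoted for the scenarios follow the standard relation CAGR = (end value / start value)^(1/years) - 1. The sketch below is illustrative only: the function names are hypothetical, and the 2020 base value it prints is back-calculated from the rounded Baseline figures quoted above (82.5 billion Euro in 2025, 5.8% CAGR over five years); it is not a figure published in this report.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1


def implied_start(end_value: float, rate: float, years: int) -> float:
    """Start value implied by an end value and a constant annual growth rate."""
    return end_value / (1 + rate) ** years


# Baseline scenario figures quoted in the text (billion Euro, EU27).
baseline_2025 = 82.5
baseline_cagr = 0.058

# Derived for illustration only; not a figure published in the report.
base_2020 = implied_start(baseline_2025, baseline_cagr, years=5)

print(f"Implied 2020 Data Market value: {base_2020:.1f} billion Euro")
print(f"Round-trip CAGR: {cagr(base_2020, baseline_2025, 5):.1%}")
```

Because the published figures are rounded, a value derived this way should be read as an order of magnitude rather than as an estimate.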
The number of data professionals will still increase to 8.4 million in the EU27 by 2025, adding 1.8 million positions in the period 2020-2025. We estimate a potential data skills gap of approximately 484,000 unfilled positions in the EU27 by 2025, corresponding to 5.7% of total demand, as demand will still grow faster than supply.
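The skills gap indicator can be read as the number of unfilled positions divided by total demand for data professionals. As a rough sketch (the function names are hypothetical, and the total demand shown is back-calculated from the rounded figures quoted above rather than taken from the report):

```python
def gap_share(unfilled: float, total_demand: float) -> float:
    """Share of total demand for data professionals that remains unfilled."""
    return unfilled / total_demand


def implied_demand(unfilled: float, share: float) -> float:
    """Total demand implied by a number of unfilled positions and a gap share."""
    return unfilled / share


# Figures quoted in the text for the EU27 in 2025.
gap_2025 = 484_000
share_2025 = 0.057

# Derived for illustration only; not a figure published in the report.
demand_2025 = implied_demand(gap_2025, share_2025)

print(f"Implied total demand: {demand_2025 / 1e6:.2f} million positions")
print(f"Gap share check: {gap_share(gap_2025, demand_2025):.1%}")
```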
The EU Data Market and the International Indicators
Our latest measurement of the European Data Market Monitoring Tool reveals a substantially unchanged picture when comparing the EU indicators to those that have been developed for some of the key international partners of the EU. While confirming its vitality, the EU continues to lag behind the U.S. in terms of both size and growth of the Data Market. In 2019, the EU27 plus the U.K. generated a Data Market value approximately 2.5 times smaller than the one produced in the U.S. in the same year (72.3 billion Euro in the EU vs. almost 185 billion Euro in the U.S.).
Filling this gap would be essential to increase Europe’s competitiveness and for the future of work in the EU. To this aim, the new European Digital Strategy recently unveiled by the European Commission² designs a new, confident role for Europe as a global player.
A robust commitment on trade and investments on the international scene will also ensure a collaborative approach on several technology-related topics including data flows and the possibility to pool available and relevant high-quality data together. This approach, however, will have to be put in place while safeguarding Europe’s “technology sovereignty”, that is by making sure that Europe reduces its level of dependency on other parts of the globe for most of the crucial technologies and effectively protects the integrity and resilience of its data, networks and communication infrastructures.
Describing the Data Market – The Quantified Stories
Three stories were produced during the third and final round of measurement of the European Data Market Monitoring Tool³. These stories were the result of a mixed effort entailing both secondary and primary research. Extensive secondary research on available public sources, specialised press and academic literature was undertaken to obtain an actionable and up-to-date understanding of the operational, organizational and/or economic benefits generated by the use of data-driven technologies with a special focus on Data Commons and Data-driven Innovation in the European Healthcare Industry.
The first story (“Health Data and Data-driven Innovation in the European Healthcare Industry”) highlighted the benefits that data-driven technologies can deliver by uncovering unknown correlations, hidden patterns and insights through the examination of large sets of data. By applying machine learning to Big Data, it is possible to study human genomes and find the correct treatment or drugs to treat cancer or rare diseases.
To better understand how European companies are approaching and implementing the use of Data Commons, the second story (“Accelerating the Impact of Data Commons”) featured extensive desk research across a multitude of publicly available sources, and found a number of challenges to data pooling that can only be addressed by sophisticated and carefully designed governance mechanisms. Data pooling, to be effective, needs to be framed in a wider context of precompetitive collaboration between companies, and needs to have a compelling business case: in other words, it has to be demand-driven.
The third story looked at common data spaces from the viewpoint of those European industries that have already invested in data-driven innovation, have achieved measurable business benefits and are engaged in scaling-up these efforts. This provided an evidence-based and industry-specific view about the pragmatic requirements of Common European Data Spaces, with a focus on the requirements for data governance, access to data, access to infrastructures. The results showed that the path towards data spaces effectively supporting data sharing at ecosystem level will not be easy. The most relevant barrier found to scalability comes from the cost of cloud infrastructures and the dependency on a few global suppliers resulting in potential customer lock-in effects.
2 https://ec.europa.eu/digital-single-market/en/content/european-digital-strategy
3 In total, the study produced 9 stories: D 3.1 Quarterly Stories – Story 1 “Opening Up Private Data for Public Interest”, November 2017; D3.2 Quarterly Stories – Story 2 “Opening Up Scientific Data for Innovation”, February 2018, D3.3 Quarterly Stories—Story 3 “Data Monetization”, October 2018; D3.4 Quarterly Stories Story 4–How Big Data is driving AI: Selected Examples of AI Applications across European Industries, March 2019; D3.5 Quarterly Stories “AI paving the way for the Cognitive Revolution across European Utilities”, May 2019; stories D3.6-7; D3.8, D3.9 are described above.
Mapping the Data Market – The Data Landscape and the Data Market Monitoring Tool
The Third EU Data Landscape Report (D4.3) provides an overview of the EU Data Landscape database revision as of January 2020. With a total of 1,556 companies and coverage of 42 countries (European Union-28, Belarus, Bosnia and Herzegovina, Georgia, Iceland, Israel, Kenya, Moldova, Norway, Russia, Serbia, Switzerland, Turkey, Ukraine and the United States), the database has grown by 9% with the addition of 131 new companies during 2019. Out of the new companies, 52 were identified as Key Data Landscape companies, offering a comprehensive overview of the most important data companies in Europe.
Acting Upon the Data Market – The Role of Policy
The new European Data strategy outlines the ambition for Europe to become a leading role model for a society empowered by data to make better decisions in business and the public sector and a global leader in the data-agile economy.
Considerations on COVID-19 impact
As the Final Report on Policy Conclusions (D2.8) was being finalised in February 2020, the COVID-19 pandemic started its rampage across the globe with unprecedented impacts on the European economy as well as on the technology market. While the EDM Monitoring Tool data and analysis until 2019 remain valid, our estimates for 2020 are now off the mark and the 2025 scenarios would need to be revised.
Based on IDC research carried out in March-April 2020, an additional post-Covid-impact scenario with estimates on the likely Data Market and Data Economy decline in 2020, and potential rebound and impacts on the 2025 scenarios for the EU27, has been developed. These estimates should be taken with caution because of the extremely high level of uncertainty about the current damages to the economy and the potential recovery paths.
According to our post-COVID scenario estimates, the European Data Market should decrease by 7.1% to 54 Billion Euro in 2020 (compared to 58 Billion Euro in 2019) and the Data Economy by 5.5% to 307 Billion Euro (compared to 325 Billion Euro in 2019). In our view, the powerful negative impact of the slow-down in 2020 will be followed by a rebound and a likely return to the growth path in the next years. Many of the powerful drivers of data-driven innovation are likely to prove resilient in the next years, particularly the willingness to invest in digital technologies in order to re-launch services and create new products to stimulate demand.
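The declines above are simple year-on-year percentage changes. A minimal check using the rounded values quoted in the text (the helper name is hypothetical) gives roughly -6.9% for the Data Market and -5.5% for the Data Economy. The small difference from the reported 7.1% is presumably due to rounding of the published values, which is an assumption and not something stated in the report.

```python
def pct_change(previous: float, current: float) -> float:
    """Percentage change from `previous` to `current`."""
    return (current - previous) / previous * 100


# Rounded post-COVID estimates quoted in the text (billion Euro, EU27).
data_market_2019, data_market_2020 = 58, 54
data_economy_2019, data_economy_2020 = 325, 307

market_change = pct_change(data_market_2019, data_market_2020)
economy_change = pct_change(data_economy_2019, data_economy_2020)

print(f"Data Market 2019-2020:  {market_change:.1f}%")
print(f"Data Economy 2019-2020: {economy_change:.1f}%")
```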
By 2025, the post-Covid Baseline scenario foresees strong growth rates resulting in a value of 80 Billion Euro for the European Data Market (compared to 82.5 Billion Euro in the pre-Covid scenario) and 516 Billion Euro for the Data Economy (compared to 550 Billion Euro in the pre-Covid scenario). However, the incidence of the European Data Economy on the EU27 GDP will slightly increase from 4% (pre-Covid scenario) to 4.04% (post-Covid scenario) because GDP is also affected by the recession. The Challenge and High Growth scenarios remain broadly valid, even though their likelihood changes: the Challenge scenario is marginally more likely, while the High Growth scenario assumptions, based on hyper-growth driven by technology investments, now seem quite remote.
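The incidence indicator is the value of the Data Economy divided by GDP, so it can rise even when the Data Economy forecast is lower, provided that GDP falls proportionally more. The sketch below illustrates this with the Baseline figures above; the function name is hypothetical, and the GDP values are back-calculated from rounded figures for illustration only and are not published in this report.

```python
def implied_gdp(data_economy: float, incidence: float) -> float:
    """GDP implied by a Data Economy value and its incidence on GDP."""
    return data_economy / incidence


# Baseline 2025 figures quoted in the text (billion Euro, EU27).
pre_covid_economy, pre_covid_incidence = 550, 0.04
post_covid_economy, post_covid_incidence = 516, 0.0404

# Derived for illustration only; not figures published in the report.
gdp_pre = implied_gdp(pre_covid_economy, pre_covid_incidence)
gdp_post = implied_gdp(post_covid_economy, post_covid_incidence)

# Relative change between the pre-Covid and post-Covid Baseline forecasts.
economy_drop = (post_covid_economy - pre_covid_economy) / pre_covid_economy
gdp_drop = (gdp_post - gdp_pre) / gdp_pre

print(f"Data Economy forecast change: {economy_drop:.1%}")
print(f"Implied GDP forecast change:  {gdp_drop:.1%}")
```

Under these assumptions the implied GDP forecast falls by more than the Data Economy forecast, which is consistent with the slightly higher incidence in the post-Covid scenario.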
Europe’s Data Market and Data Economy Evolution: Policy and the Three Scenarios (Pre-COVID)
Today, as we look at the main driving trends for the next years, we notice that the role of policies has increased in relevance: as data-driven innovation has become widespread across all industry sectors and user constituencies, the scope of the regulations and framework conditions to be adapted has considerably grown. At the same time, the emergence of disruptive technologies such as AI has increased the need for policy intervention to manage emerging social, economic and ethical risks.
The Baseline scenario is positioned between the two extremes of a high and a low concentration of power and data control. The development of an effective regulatory framework of data governance, as foreseen by the Data Strategy, will enhance stakeholders’ willingness and capability to manage data sharing and improve data access and re-use.
As in the Baseline scenario, in the High Growth scenario European enterprises multiply the use of "digital co-workers" (using intelligent process automation and AR/VR to support or complement human workers), reducing repetitive tasks and improving productivity and security. Besides automation, enterprises engage in "augmentation" of human resources, providing technologies that enhance their physical and intellectual capabilities. On the other hand, initiatives to develop digital skills are successful: the Digital Europe Programme delivers a boost to the supply of advanced digital and data skills, the revised Digital Education act helps to improve digital learning, and the networks of Digital Innovation Hubs play their role in providing internships, training and experimental spaces for companies to learn about new technologies.
The Challenge scenario foresees a negative self-reinforcing circle, where less positive global economic conditions discourage investments and weaken global demand with a negative impact on European growth. In this context, digital Europe and data strategies are not implemented successfully and fail to achieve many of their objectives. This may happen if a combination of insufficient investments and lack of collaboration at EU level leads to an uneven development of data infrastructures and digital resources.
The EU Data Policy and the International Dimension
The new European Digital Strategy recently unveiled by the European Commission appears to design a new, confident role for Europe as a global player. Recognising that the European model has proved to be an inspiration for many other partners worldwide, the strategy calls for the EU to strengthen its commitment towards setting global standards for emerging technologies and to remain the most open region for trade and investment in the world. In terms of standards, in particular, the EU has paved the way for the setting of global standards for 5G and the IoT and is now committed to leading the standardisation process of a number of additional advanced and new generation technologies such as blockchain, quantum computing and supercomputing. All of these technologies underpin data sharing and data usage and, as a direct consequence, are linked to the further development of a well-functioning Data Economy.
This proactive international role in the standardisation process is accompanied by a robust commitment on trade and investments on the international scene, so as to ensure that a collaborative and inspiring European approach on several technology-related topics - including data flows and the possibility to pool available and relevant high-quality data together - is successfully implemented.
13
The European Data Market Monitoring Tool – Key Numbers 2019 for EU27
14
The European Data Market Monitoring Tool – Key Numbers 2019 for EU27 + U.K.
Source: EDM Monitoring Tool, IDC 2020
15
Source: EDM Monitoring Tool, IDC 2020
The European Data Market Monitoring Tool – Baseline Scenario 2025 for EU27
16
Source: EDM Monitoring Tool, IDC 2020
The European Data Market Monitoring Tool – High Growth Scenario 2025 for EU27
17
The European Data Market Monitoring Tool – Challenge Scenario 2025 for EU27
18
Baseline Scenario
High Growth Scenario
Challenge Scenario
Source: EDM Monitoring Tool, IDC 20204
4 Unfortunately, we are unable to present post-COVID revised estimates for the Data Economy Challenge and High Growth scenarios. These
scenarios rely on alternative assumptions on industry revenues, consumer consumption and GDP dynamics and we do not feel able in
the present uncertainty to elaborate such assumptions, beyond the Baseline scenario. However, as discussed for the European Data Market, we believe that the forecast estimates remain broadly valid with the Challenge scenario relatively more likely than before the
COVID pandemics.
The European Data Market Monitoring Tool – Post-Covid Scenarios 2020-2025 for EU27 (€M)
19
The European Data Market Monitoring Tool – The International Indicators and the EU27
The European Data Market Monitoring Tool – The International Indicators and the EU27 + U.K.
1. Introduction
The European Data Market Study (SMART 2013/0063) was launched by the European Commission in 2013 to measure the progress, size and trends of the European Data Economy with the objective of supporting the Data Value Chain policy of the European Commission. The study designed, developed and implemented a European Data Market Monitoring Tool providing facts and figures on the size and trends of the EU Data Market and Data Economy in the form of a series of quantitative indicators. The study also covered quali-quantitative aspects of the European Data Economy in the form of quantified stories investigating elements of the Data Market that were not captured by the Monitoring Tool. Finally, the European Data Market Study included a data landscaping tool offering a continuously updated picture of data companies in Europe, as well as a series of webinars to disseminate the research results.
To continue gathering reliable and fact-based evidence on the EU Data Economy and measure the progress of the data-driven economy policies within the general framework of the Digital Single Market Strategy, the European Commission commissioned an update of the European Data Market (EDM) Study. The present document constitutes the Final Study Report (D2.9) of the Update of the European Data Market Study (SMART 2016/0063), entrusted in 2016 to IDC and the Lisbon Council. As a follow-up to the Second Interim Report (D2.6), this report brings together the research results and the activities carried out by the contractors under:
• The Final Report on Facts & Figures (D2.7) extending the measurement of the
European Data Market Monitoring Tool by presenting data for the years 2018-2020 and
forecasts to the year 2025 under three alternative scenarios;
• The Final Report on Policy Conclusions (D2.8) measuring the progress of European
policies towards the objective of maximising the growth of the Data Economy as
measured by the European Data Market Monitoring Tool;
• The key messages from the quantified stories (D3.6-7, D3.8 and D3.9) produced by the
study team and focusing on the operational, organizational and/or economic benefits
generated by the use of data-driven technologies, with a special focus on data Commons
and Data-driven Innovation in the European Healthcare Industry;
• The Third Data Landscape Report (January 2020 Review – D4.3) providing an
overview of the EU Data Landscape and offering an up-to-date zoom into the database
of data market companies in Europe.
1.1. Objectives
As in the previous study, the Update of the European Data Market Study (SMART 2016/0063) pursues three closely interrelated main objectives, which together make it possible to develop a complete and coherent picture of the European Data Market and Data Economy. They are as follows:
• Measuring the EDM indicators, providing facts and figures on all the key features of the
European Data Market and Economy, regularly updated during the life of the project,
building on the taxonomy and methodology approach previously developed and
successfully implemented;
• Analysing relevant issues for the development of the data ecosystem, providing Data
Market stories based on factual evidence, case studies and complementary data to the
EDM indicators, following on the 12 stories already published by the previous study;
• Mapping and visualising the stakeholders populating the EU Data Market, building on
the stakeholders’ landscape and community developed in the previous study, and
leveraging the visibility achieved by the website www.datalandscape.eu.
1.2. Methodological Approach
The Indicators
As outlined in the Final Report on Facts & Figures (D2.7), the measurement of each of the indicators in this report is based on a sophisticated methodology that combines data collection, models, and desk research. Some initial assumptions are built on surveys completed during March 2015, which are supported by ongoing annual surveys. The 2015 survey included 8 Member States and 11 industries that aligned with Eurostat industry segmentation, and IDC’s ongoing annual surveys initially included 6 Member States and 20 industry segments. The final survey, conducted between July and September 2019, included a total of 13 countries.
The initial survey targeted potential data companies in two industries (ICT and Professional services), and data users in 11 industries. The annual update surveys target all business sectors, and company sizes greater than 10 employees. The survey is balanced to represent the mix of industries and size bands for companies in the European Union. The initial survey and the ongoing cross-Europe survey are outlined in more detail in the survey section of the methodological annex. The models used to represent expected market and company behaviour take inputs from macroeconomic indicators such as GDP and GDP growth, ICT spending, and employment.
The main data sources used to compile the indicators are outlined in the table below.
Table 1: Main Data Sources by Indicator
Data Source | Updated | Used in
Eurostat Business Demographic Statistics | Jan 2020 | Data Professionals, Data Companies, Data Users
Eurostat annual structural business statistics | Jan 2020 | Data Professionals, Data Companies, Data Users
Eurostat chain linked Volumes (GDP) | Jan 2020 | Data Market, Data Revenues
IDC Core IT Spending guide 2H2017 | Jun 2019 | Data Market, Data Revenues
IDC Worldwide Black Book v3.2 (standard edition) | Nov 2019 | Data Market, Data Professionals, Data Companies, Data Users, Data Revenues
IDC European Vertical Markets survey (2019) | Sep 2019 | Data Market
IMF World Economic Outlook (Oct 2019) | Jan 2020 | Data Market, Data Revenues, Data Economy
Consensus Forecasts – Consensus economics | Nov 2019 | Data Market, Data Revenues, Data Economy
IT Big Data and Analytics spending Guide 2H2018 | Nov 2019 | Data Market
ILOSTAT statistics and databases | Dec 2019 | Data Professionals
Additional relevant sources leveraged for the measurement of the indicators were IDC Vertical Markets end user surveys and IDC Worldwide Black Book, whose results were used to confirm and adjust estimates, when necessary, of the number of companies that were data users and data suppliers.
The updated numbers of data user and data supplier companies were subsequently used to determine the updated results for the data companies’ revenues and were further combined with the above-mentioned sources to measure the indicators for Data Professionals and the Data Professionals’ Skills Gap for the years 2018 and 2019 and for the three 2025 scenarios.
Implications of Brexit on Indicators
The UK left the European Union on 31 January 2020, while this study was ongoing and the Final Report on Facts and Figures was being developed. Until then the UK was included as one of the Member States, although since June 2016 data have been provided with two totals, one for the EU28 (including the UK) and another for the EU27 (excluding the UK). However, at the level of company size and industry, historical and forecast data are presented for the EU28 only (see footnote 5). Forecasts beyond 2020 account for the expected impact of the UK exit from the EU, although little overall difference is anticipated, given the high growth of the Data Market compared with the total IT market across Europe and the UK.
The Final Report on Policy Conclusions
As a follow-up to the First Report on Policy Conclusions (D2.3) and the Second Report on Policy Conclusions (D2.5), additional desk research and literature review were conducted to produce the Final Report on Policy Conclusions (D2.8) accompanying the quantitative results of the Final Report on Facts & Figures (D2.7). To better investigate the role of policies in shaping the current and future development of the European Data Economy, the study team leveraged a mix of IDC research and other sources, including recent research on the Covid-19 pandemic and its potential effects on the data market and the data economy. A select list of these sources is offered in Table 2 below:
Table 2: Main Sources
Document | Year | Author(s)
A Union that strives for more – My agenda for Europe. Political guidelines for the next European Commission 2019-2024 | 2019 | Ursula von der Leyen
A European strategy for data | 2020 | European Commission
Shaping Europe’s Digital Future | 2020 | European Commission
The Data Economy | 2020 | The Economist
Opinion of the Data Ethics Commission | 2019 | Data Ethics Commission of the Federal Government, Germany
Report of the WHO-China Joint Mission on Coronavirus Disease 2019 (COVID-19) | 2020 | WHO
IDC Worldwide ICT spending forecast Post-Covid | 2020 | IDC
Preparing for COVID-19 Phase2: adopting contact tracing | 2020 | IDC
A policy framework for climate and energy in the period from 2020 to 2030 | 2014 (updated in 2018) | European Commission
IDC DX Executive Sentiment Survey | 2018 | IDC
IDC European Vertical Markets Survey, 2018–2019 | 2019 | IDC
European Union (Withdrawal Agreement) Act 2020 | 2020 | UK Government
IMD World Digital Competitiveness Ranking 2019 | 2020 | IMD
European Parliament resolution of 12 February 2020 on the proposed mandate for negotiations for a new partnership with the United Kingdom of Great Britain and Northern Ireland (2020/2557(RSP)) | 2020 | European Parliament
EU-UK Data Flows, Brexit and No-Deal: Adequacy or Disarray? | 2019 | UCL European Institute
Digital Economy Report 2019 - Value creation and capture: Implications for developing countries | 2019 | United Nations
Brazil Economic Outlook. First quarter of 2020 | 2020 | BBVA
U.S. Economy at a Glance | 2020 | Bureau of Economic Analysis – U.S. Department of Commerce
5 Indeed, from the outset, our model did not allow data to be obtained at Member State level and industry level at the same time (i.e. “interlocked data” by Member State showing details at industry level for that Member State). As a result, data at industry level cannot be segmented at Member State level, thus the UK cannot be “isolated” and subtracted to obtain EU27 data.
The Quantified Stories
The quali-quantitative stories were the result of a mixed effort entailing both secondary and primary research activities. Extensive secondary research on available public sources, specialised press and academic literature was undertaken to obtain an actionable and up-to-date understanding of private and scientific data for public interest and innovation together with a comprehensive picture of the phenomenon in Europe and worldwide.
In parallel, primary research was conducted to collect empirical evidence and validate the information obtained through the main desk research activities. Among the organisations interviewed, the following played a prominent role:
The Third EU Data Landscape Report
The Third EU Data Landscape Report (D4.3) provides a detailed overview of the updated EU Data Landscape database and a zoom into the stakeholders and their positioning in the data economy environment. Following the January 2020 update reported in the Third EU Data Landscape report (D4.3), the EU Data Landscape database has been revised to capture the current trends and include the data collected in the period from January 2019 until December 2019.
The dataset has been significantly extended through desk research as well as through input received from stakeholders in the data economy, for instance, via the www.datalandscape.eu website. The report relies on the crowdsourcing of knowledge through an open process, where stakeholders can directly suggest the companies to be included in the database. The mapping exercise sought to achieve a balanced and comprehensive coverage of the different geographies, different typologies of companies (SMEs, large companies, research institutions etc.) and the different data sectors. In terms of geographical coverage, the mapping of the Data Landscape focuses on the EU Member States. However, companies from other European countries as well as countries outside Europe are also included in the database. The reviewing procedure consisted of the following steps:
• Control of the existing dataset;
• Extension of the dataset, taking into account coverage goals;
• Review of Key Data Landscape companies and identification of new ones.
Finally, among the 1556 companies, 311 have been identified as Key Data Landscape companies in line with a set of criteria adopted.
The European Data Market Monitoring Tool
In line with the results presented in the original European Data Market study (SMART 2013/0063) in February 2017, the First Report on Facts & Figures (D2.1) of February 2018, the Second Report on Facts & Figures (D2.4) of March 2019 and the Final Report on Facts & Figures (D2.7) of April 2020, the indicators presented in this report are organised around a modular and flexible structure – the European Data Market Monitoring Tool. The updated European Data Market Monitoring Tool designed by IDC is shown in the Figure below and its main components are further described in the following sections.
Figure 1: The Updated EDM Monitoring Tool
1.3. The Structure of this Report
The present report is built along the following sections:
• The first section – corresponding to Chapter 2 – summarises the results of the Final
Report on Facts & Figures (D2.7) that was delivered and approved by the European
Commission in April 2020.
• The second section – corresponding to Chapter 3 – provides additional qualitative and
quantitative aspects on the European Data Market as obtained by the quantified stories
(D3.6-7, D3.8 and D3.9) produced by the study team between November 2019 and June
2020.
• The third section – corresponding to Chapter 4 – presents an updated overview of the
data landscape and interactive Data Market Monitoring Tool based on the January 2020
update reported in the Third Data Landscape Report (D4.3).
• The fourth section – corresponding to Chapter 5 – focuses on the policy conclusions
delivered in the Final Report on Policy Conclusions (D2.8) of May 2020.
• The final section provides a set of concluding remarks drawing on all the different
components (and deliverables) of the Update of the European Data Market study.
2. Quantifying the data market – key facts & figures
The key facts & figures stemming from the third round of measurement of the Update of the European Data Market Study (SMART 2016/0063), as reported in the Final Report on Facts & Figures, were obtained through the measurement of the following set of selected indicators:
Each indicator was measured for the total EU27 plus U.K. and for the EU27 (excluding the U.K.), and for all EU Member States when available and applicable; industry-specific and company-size views were also offered, with indicators provided by industry sector and company size band when possible. As in the European Data Market Study (SMART 2013/0063), a select number of indicators has been developed and updated for three non-European countries, namely Brazil, Japan and the United States.
The six key indicators measured by the EDM can be seen holistically along four main dimensions:
• The Workforce and Skills dimension - including the measurement of data
professionals and their potential skill gap.
• The Supply and Demand dimension - incorporating the measurement of data
supplier and data user companies and the revenues generated by data supplier
companies.
• The Business and Economy dimension - comprising the size of the Data
Market and the value of the Data Economy.
• The International context dimension - including a select number of indicators for
Brazil, Japan and the U.S.
Figure 2: The four Dimensions of the Data Market’s Key Facts & Figures
Source: The European Data Market Monitoring Tool, IDC, 2019
2.1 Three future Development Paths: The Data Market at 2025
The key facts & figures obtained through the measurement of the above-listed indicators are presented for the years 2018 and 2019 as well as for the year 2025 according to three potential future scenarios of the European Data Market and Economy, driven by different macroeconomic and framework conditions. The scenarios at 2025 continue to take as a reference point the initial scenarios developed for the year 2020. While the 2020 scenarios were mainly differentiated by economic drivers (different demand-supply dynamics), the 2025 scenarios continue to be shaped by a combination of economic and social drivers, focused on the interaction of two main focal issues (or evolution paths):
• the high or low pace of diffusion of data-driven innovation, driven by demand-
supply dynamics, and its impact on economic growth. This year we add to this
perspective the pace of multiple innovation adoption, where data is at the core of a
multiple technology environment powered by AI.
• the social and economic data governance model enabling a fair and
competitive economy, as indicated by the new European Data Strategy.
At one extreme, we foresee a society where a few actors, such as leading online platforms, governments, large businesses, dominate the main data assets and therefore are able to capture a disproportionately high share of data innovation benefits, increasing social inequality (highly centralized model). The polar opposite of this scenario would be a society characterised by an open, transparent and participatory approach to data governance, where both citizens and organisations are able to control and extract value from their data. This would result in a wider social distribution of data innovation benefits, decreasing social inequality. Trustworthiness and respect of data ethics principles are other important characteristics of this ideal model.
This analysis highlights the critical turning points to be faced in the next years by governments, businesses and social actors in the development of the European Data Economy. The combination of alternative social and economic trends results in the following scenarios:
• The Baseline scenario is characterised by a healthy growth of data innovation, a
moderate concentration of power by dominant data owners with a data governance
model protecting personal data rights, and an uneven but rather wide distribution of data
innovation benefits in the society. This is considered the most likely scenario.
• The High Growth scenario is characterised by a high level of data innovation, low data
power concentration, an open and transparent data governance model with high data
sharing, and a wide distribution of the benefits of data innovation in the society;
• The Challenge scenario is characterised by a low level of data innovation, a moderate
level of data power concentration due to digital markets fragmentation, and an uneven
distribution of data innovation benefits in the society.
The scenarios explore the drivers and framework conditions that may help maximise the benefits of a balanced Data Economy and avoid the risks of an unbalanced one, highlighting the consequences of policy actions.
2.2 The Workforce Dimension: Data Professionals and Data Skills Gap
Measuring the Data Professionals
Data professionals (see footnote 6) are workers who collect, store, manage, and/or analyse, interpret, and visualise data as their primary activity or as a relevant part of it. Data professionals must be proficient in the use of structured and unstructured data, should be able to work with large volumes of data and should be familiar with emerging database technologies.
Data Professionals in 2018 and 2019
According to the third round of measurements, data professionals are estimated at a total of 6.0 million in the EU27 and 7.6 million in the EU27 plus U.K. in 2019, marking a continuing increase over the previous year (6.1% and 5.5% year-on-year respectively). Compared with 2019, 2020 would register a growth rate of 9.2% and 8.6% for the EU27 and the EU27 plus U.K. respectively. More interestingly, the employment share and the intensity share components of the data professionals’ indicator are also expected to improve in 2019 and 2020 compared to our estimates in 2016 (the employment share is now estimated at 3.3% and 3.5% in 2019 and 2020 in the EU27, and at 3.6% and 3.8% for the same years in the EU27 plus U.K.). As underlined in the Second Report on Facts & Figures (D2.4), this increase confirms the positive evolution of the workforce involved in data-related professions over the period under consideration.
6 The previous European Data Market Study (SMART 2013/0063) included an indicator measuring “Data Workers”, which was based on a similar, but slightly more restrictive definition. In this updated study we have decided to measure “Data Professionals”, that is workers with a wider range of data-related roles. Indeed, data professionals are not only data technicians, but also users who, based on sophisticated tools, take decisions about their business or activities after having analysed and interpreted available data.
Table 3: Data Professionals, 2016-2017-2018-2020 and Growth Rates
N. | Region | Name | Description | 2016 | 2017 | 2018 | 2019 | 2020 | Growth 2019/2018
1.1 | EU27 | Number of data professionals | Total number of data professionals in EU (000s) | 4,875 | 5,260 | 5,688 | 6,033 | 6,588 | 6.1%
1.1 | EU27+U.K. | Number of data professionals | Total number of data professionals in EU (000s) | 6,187 | 6,666 | 7,215 | 7,608 | 8,261 | 5.5%
1.2 | EU27 | Employment share of data professionals | Share of data professionals on total employment in EU (%) | 2.8% | 3.0% | 3.2% | 3.3% | 3.5% | 3.4%
1.2 | EU27+U.K. | Employment share of data professionals | Share of data professionals on total employment in EU (%) | 3.1% | 3.3% | 3.5% | 3.6% | 3.8% | 2.9%
1.3 | EU27 | Intensity share of data professionals | Average number of data professionals per user company (units) | 9.6 | 10.2 | 10.7 | 11.3 | 12.1 | 5.4%
1.3 | EU27+U.K. | Intensity share of data professionals | Average number of data professionals per user company (units) | 9.2 | 9.6 | 10.1 | 10.6 | 11.4 | 4.9%
Source: European Data Market Monitoring Tool, IDC 2020
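The growth rates reported for the number of data professionals can be re-derived with the usual year-on-year formula. The following is a minimal sketch, assuming the 2018-2020 values transcribed from Table 3 (thousands of data professionals); the function name `yoy_growth` and the dictionary layout are illustrative and not part of the Monitoring Tool. Because the published inputs are rounded, a result may differ from the reported figure by a tenth of a percentage point.

```python
# Values transcribed from Table 3 (thousands of data professionals).
DATA_PROFESSIONALS = {
    "EU27": {2018: 5688, 2019: 6033, 2020: 6588},
    "EU27+U.K.": {2018: 7215, 2019: 7608, 2020: 8261},
}


def yoy_growth(previous: float, current: float) -> float:
    """Return the year-on-year growth from `previous` to `current`, in percent."""
    if previous <= 0:
        raise ValueError("previous value must be positive")
    return (current / previous - 1.0) * 100.0


for region, series in DATA_PROFESSIONALS.items():
    growth_2019 = yoy_growth(series[2018], series[2019])
    growth_2020 = yoy_growth(series[2019], series[2020])
    print(f"{region}: 2019/2018 = {growth_2019:.1f}%, 2020/2019 = {growth_2020:.1f}%")
```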
Data Professionals at 2025
A steady progression in the number of data professionals continues to emerge from our 2020 estimates. The number of data professionals in both the EU27 and the EU27 plus the U.K. is forecast to grow significantly under all three scenarios out to 2025, as the use of data-driven innovation is expected to keep growing even under the least economically favourable scenario. In particular, under the Baseline scenario, data professionals are expected to amount to 9.3 million in the EU27 and 11.3 million in the EU27 plus the U.K. by 2025, representing a solid growth rate of 7.2% and 6.5% per year respectively over the 2020-2025 period. In the Challenge and High Growth scenarios, data professionals would number more than 8.4 million and 10.8 million in the EU27, and 10.2 million and 13.1 million in the EU27 plus the U.K., respectively. Under all scenarios, the CAGR over the period 2018-2025 is consistent with, although higher than, the CAGR of the Data Market, thus confirming again the close relationship between the two variables.
Table 4: Data Professionals in 2025 - Total Number in EU27 and EU27 + U.K. and Growth Rates. Challenge, Baseline and High Growth Scenarios (Units, ‘000; %)
N. | Region | Name | Description | 2025 Challenge | 2025 Baseline | 2025 High Growth | CAGR Challenge scenario | CAGR Baseline scenario | CAGR High Growth scenario
1.1 | EU27 | Number of data professionals | Total number of data professionals in EU (000s) | 8,461 | 9,316 | 10,853 | 5.1% | 7.2% | 10.5%
1.1 | EU27 + U.K. | Number of data professionals | Total number of data professionals in EU (000s) | 10,200 | 11,331 | 13,162 | 4.3% | 6.5% | 9.8%
Source: European Data Market Monitoring Tool, IDC 2020
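The scenario growth rates above are compound annual growth rates (CAGR). As a hedged illustration, the sketch below recomputes them from the 2020 estimates in Table 3 and the 2025 forecasts in Table 4, assuming a five-year horizon (2020-2025); the helper `cagr` and the data layout are assumptions made for this example rather than the study team's own model.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate between two values, in percent."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return ((end_value / start_value) ** (1.0 / years) - 1.0) * 100.0


# 2020 estimates (Table 3) and 2025 forecasts (Table 4), in thousands.
BASE_2020 = {"EU27": 6588, "EU27+U.K.": 8261}
FORECAST_2025 = {
    "EU27": {"Challenge": 8461, "Baseline": 9316, "High Growth": 10853},
    "EU27+U.K.": {"Challenge": 10200, "Baseline": 11331, "High Growth": 13162},
}

for region, scenarios in FORECAST_2025.items():
    for scenario, value_2025 in scenarios.items():
        rate = cagr(BASE_2020[region], value_2025, years=5)
        print(f"{region} / {scenario}: {rate:.1f}% per year")
```

With a 2020 base year the results line up with the CAGR columns of Table 4, which suggests that the table's rates refer to the 2020-2025 period.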
Measuring the Data Professionals Skills Gap
The Data Professionals Skills Gap indicator captures the potential gap between demand and supply of data skills in Europe, since the lack of skills may become a barrier to the development of the data industry and the rapid adoption of data-driven innovation. It is based on a model balancing the main sources of data skills (from the education system, re-training and other careers) with the estimated demand (by all data companies).
This indicator has highlighted an imbalance between demand and supply of data skills in Europe since the first measurement for the year 2014. In 2019 demand for data professionals continued to increase (+4.5%) and the estimated gap grew by 13%, reaching approximately 459,000 unfilled positions in the EU27 plus the U.K. (399,000 without the U.K.), corresponding to 6.2% of total demand (5.7% without the U.K., see table below). By 2020 we expect the gap to expand to 496,000 unfilled positions in the EU27 plus the U.K., corresponding to 6% of total demand (5.2% without the U.K., where slower growth is expected due to the impacts of Brexit). At any given moment in the labour market there is a physiological number of vacancies, as well as a number of people looking for work: a vacancies ratio of around 5% of demand or less is considered manageable. From this point of view, the data skills gap estimated for 2019 shows a lower level of stress in the market compared with our previous estimates for 2017 and 2018. As in the first and second rounds of measurement of this indicator, the gap is expected to continue in 2020 under the three scenarios, but at a lower level than previously estimated.
The three forecast scenarios now portray a mixed picture of the data skills gap in 2025: while it is clearly on the increase under the Baseline and the High Growth scenarios (8.2% and 10.5% respectively in the EU27), in the Challenge scenario the gap is expected to decrease slightly owing to a general slow-down of data market and data economy dynamics under this less favourable development path.
The absolute size of the data skills gap is significant, potentially reaching 759,000 unfilled positions in 2025 in the EU27 Baseline scenario and over 1.1 million in the EU27 High Growth scenario. In the Challenge scenario the data skills gap is forecast at 484,000 unfilled positions in 2025. This underlines the need for policy action to prevent and minimise the imbalance between data skills demand and supply in the coming years.
Table 5: Indicator 6 - Data Professionals Skills Gap in the EU, 2017-2018-2020 and 2025 - Three scenarios
Indicator 6 - Data Professionals skills gap in the EU, three scenarios
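N. | Name | Description | Region | Actual 2016 | Actual 2017 | Actual 2018 | Actual 2019 | Actual 2020 | Baseline 2025 | Baseline CAGR 2025/2019 | Challenge 2025 | Challenge CAGR 2025/2019 | High Growth 2025 | High Growth CAGR 2025/2019
6.1 | Data Professionals skills gap | Gap between demand and supply of data professionals (N, 000s) | EU27 | 343 | 395 | 321 | 399 | 341 | 759 | 11.3% | 484 | 3.3% | 1,138 | 19%
6.1 | Data Professionals skills gap | Gap between demand and supply of data professionals (N, 000s) | EU27 + U.K. | 428 | 483 | 406 | 459 | 496 | 866 | | | | |
6.2 | | Gap between demand and supply of data professionals (%) | EU27 | 6.2% | 6.7% | 5.2% | 6.2% | 5.2% | 8.2% | 11.2% | 5.7% | 2.1% | 10.5% | 19%
6.2 | | Gap between demand and supply of data professionals (%) | EU27 + U.K. | 6.2% | 6.5% | 5.2% | 5.7% | 6.0% | 7.6% | | | | |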
Source: European Data Market Monitoring Tool, IDC 2020
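The two rows of the skills gap indicator are related by simple arithmetic: the gap is the shortfall of supply against demand, and the percentage expresses that shortfall as a share of total demand. The sketch below illustrates the relationship only; it is not the study's model. The demand and supply inputs are hypothetical values back-solved from the reported 2020 figures for the EU27 plus the U.K. (496,000 unfilled positions, 6% of demand), and the function name `skills_gap` is an assumption made for this example.

```python
def skills_gap(demand_k: float, supply_k: float) -> tuple[float, float]:
    """Return (gap in thousands, gap as a percentage of total demand)."""
    if demand_k <= 0:
        raise ValueError("demand must be positive")
    gap_k = max(demand_k - supply_k, 0.0)  # no gap when supply covers demand
    return gap_k, gap_k / demand_k * 100.0


# Hypothetical inputs (thousands), back-solved from the reported 2020 gap
# of 496,000 positions at 6% of demand; they are NOT figures from the study.
demand_2020_k = 8267.0
supply_2020_k = 7771.0

gap_k, share_pct = skills_gap(demand_2020_k, supply_2020_k)
print(f"gap = {gap_k:.0f}k unfilled positions, {share_pct:.1f}% of demand")
```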
2.3 The Supply - Demand Dimension: The Data Companies
Measuring the Data Companies
Data companies are organisations that are directly involved in the production, delivery and/or usage of data in the form of digital products, services and technologies. They can be either data supplier or data user organisations:
• Data suppliers have as their main activity the production and delivery of digital data-
related products, services, and technologies. They represent the supply side of the
Data Market.
• Data users are organisations that generate, exploit, collect and analyse digital data
intensively and use what they learn to improve their business. They represent the
demand side of the Data Market.
Data Companies in 2018 and 2019
The number of data suppliers continues to grow at a faster pace than the number of data users in the longer term (out to 2025). Data suppliers are estimated at almost 149,000 in the EU27 and 290,000 units in the EU27 plus the U.K. for 2019, thus exhibiting a year-on-year growth of 2.4% and 2.3% respectively. Data users, instead, are projected to grow at 0.6% in 2019, amounting to nearly 535,000 in the EU27 and to nearly 716,000 units in the EU27 plus the U.K. If compared to the measurements carried out by the European Data Market Monitoring Tool over the period 2013-2015, these latest estimates show a picture of some consolidation of data companies in the EU, following increasing growth rates over the prior four years.
This consolidation is reflected in a slow but consistent increase in the share of data companies over the total number of companies in Europe. The share of data suppliers on total companies in the ICT and Professional services industries is estimated at 11.5% in the EU27 and 15.2% in the EU27 plus the U.K. for 2019, a slight improvement with respect to an adjusted 2018. The data users’ penetration rates (i.e. the share of data users on total companies in the EU) are also stable, with a fractional percentage point increase in 2019 in both the EU27 and the EU27 plus the U.K.
Table 6: Data Companies, 2016-2017-2018-2020 and Growth Rates
N. | Name | Description | Market | 2016 | 2017 | 2018 | 2019 | 2020 | Growth 2019/2018
2.1 | Number of data suppliers | Total number of data suppliers measured as legal entities based in the EU (000s) | EU27 | 134,300 | 139,450 | 145,440 | 148,900 | 153,100 | 2.4%
2.1 | Number of data suppliers | Total number of data suppliers measured as legal entities based in the EU (000s) | EU27+U.K. | 261,450 | 271,700 | 283,390 | 290,000 | 297,350 | 2.3%
2.2 | Share of data suppliers | % share of data companies on total companies in the ICT and Professional services industries | EU27 | 10.9% | 11.3% | 11.4% | 11.5% | 11.7% | 1.6%
2.2 | Share of data suppliers | % share of data companies on total companies in the ICT and Professional services industries | EU27+U.K. | 14.2% | 14.8% | 15.2% | 15.2% | 15.4% | 1.2%
2.3 | Number of data users | Total number of data users in the EU, measured as legal entities based in one EU country | EU27 | 505,950 | 517,100 | 531,720 | 534,840 | 542,510 | 0.6%
2.3 | Number of data users | Total number of data users in the EU, measured as legal entities based in one EU country | EU27+U.K. | 676,150 | 691,500 | 711,870 | 715,890 | 726,110 | 0.6%
2.4 | Share of data users | % share of data users on total companies in the EU industry | EU27 | 5.7% | 5.8% | 5.9% | 5.9% | 6.0% | 0.2%
2.4 | Share of data users | % share of data users on total companies in the EU industry | EU27+U.K. | 6.5% | 6.6% | 6.7% | 6.7% | 6.8% | 0.1%
Source: European Data Market Monitoring Tool, IDC 2020
Data Suppliers Forecasts at 2025
According to our latest forecasts to 2025, the number of data suppliers will continue to grow beyond 2020, and the baseline growth to 2025 is in line with the growth forecast for 2020, reflecting the consolidation and stabilisation seen in the number of data suppliers. However, growth is higher among the larger data supplier companies, because investing as a data supplier requires resources that are not as readily available to smaller companies. Larger companies can afford individuals and departments whose sole purpose is to address the Data Market, while in smaller companies the development role often falls to individuals who have other responsibilities.
Table 7: Data Suppliers Forecast 2025 by Member State - Three Scenarios (Units; ‘000); CAGR 2025-2020 (%)
Member State | 2025 Challenge | 2025 Baseline | 2025 High Growth | CAGR 2025/2020 Challenge Scenario (%) | CAGR 2025/2020 Baseline Scenario (%) | CAGR 2025/2020 High Growth Scenario (%)
EU27 | 163,130 | 173,410 | 193,170 | 1.3% | 2.5% | 4.8%
EU27 + U.K. | 317,230 | 334,360 | 384,020 | 1.3% | 2.4% | 5.2%
Source: European Data Market Monitoring Tool, IDC 2020
Data Users Forecasts at 2025
Long-term growth in the number of data user companies is highest in data-intensive industries such as Professional services and Retail, and lowest in Education, Construction and Healthcare, under the Baseline scenario. The largest companies show the highest growth in adoption, as the Data Economy will be crucial to their success and competitive advantage: without a data-oriented approach to business and business decisions, these companies will not see the opportunities their competitors see and so will not grow at the same rate. However, these larger companies are a small share of the overall number of companies, so although their number will grow at a compound rate of 25.6% to 2025, compared with 0.9% for those in the smaller size band, they do not add significantly to the total number of data user companies.
Table 8: Data Users Forecast 2025 by Member State - Three Scenarios (Units; ‘000); CAGR 2025/2020 (%)
Indicator 2 – Data Users – Forecast 2025
| Member State | 2025 Challenge | 2025 Baseline | 2025 High Growth | CAGR 2025/2020 Challenge Scenario (%) | CAGR 2025/2020 Baseline Scenario (%) | CAGR 2025/2020 High Growth Scenario (%) |
| --- | --- | --- | --- | --- | --- | --- |
| EU27 | 562,280 | 582,750 | 626,630 | 0.7% | 1.4% | 2.9% |
| EU27 + U.K. | 753,380 | 779,150 | 845,330 | 0.7% | 1.4% | 3.1% |
Source: European Data Market Monitoring Tool, IDC 2020
Measuring Data Companies’ Revenues
Data companies’ revenues correspond to the aggregated value of all the data-related products and services generated by Europe-based data suppliers, including exports outside the EU.
Data Companies’ Revenues in 2018 and 2019
Revenues generated by data suppliers have increased steadily since 2013, according to our initial measurements and the Monitoring Tool. In 2019, revenues grew by 9.0% to more than 64 billion euro in EU27, and by 8.1% to more than 83 billion euro in EU27 plus the U.K. However, the share of data suppliers’ revenues in total company revenues in the ICT and Professional services sectors dropped to 3.0% in EU27 and 3.1% in EU27 plus the U.K. in 2018, following strong growth in ICT sales across the board in that year. Weak spending in the U.K., due to the economic uncertainty in this Member State, pulled down overall revenues for data companies.
Table 9: Data Companies’ Revenues and Growth, 2016-2020 (€, Million; %)
Indicator 3 – Data Companies’ Revenues and Growth
| N. | Region | Name | Description | 2016 | 2017 | 2018 | 2019 | 2020 | Growth 2019/2018 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 3.1 | EU27 | Total revenues of data companies in the EU | Total revenues of the Data Suppliers calculated by Indicator 2 | 47,178 | 52,479 | 58,948 | 64,262 | 71,050 | 9.0% |
| 3.1 | EU27 + U.K. | Total revenues of data companies in the EU | Total revenues of the Data Suppliers calculated by Indicator 2 | 61,781 | 68,846 | 77,297 | 83,545 | 91,318 | 8.1% |
| 3.2 | EU27 | Share of data companies’ revenues | Ratio between Data Suppliers’ revenues and total companies’ revenues in sectors J and M | 3.0% | 3.2% | 3.4% | 3.0% | NA | 4.5% |
| 3.2 | EU27 + U.K. | Share of data companies’ revenues | Ratio between Data Suppliers’ revenues and total companies’ revenues in sectors J and M | 3.1% | 3.3% | 3.5% | 3.7% | NA | 4.6% |
Source: European Data Market Monitoring Tool, IDC 2020
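The growth column in Table 9 can be reproduced from the revenue series itself. The short Python sketch below recomputes year-on-year growth from the table values; the function and variable names are illustrative and are not part of the Monitoring Tool, and small differences from published figures are possible because the table values are rounded.

```python
def yoy_growth(series):
    """Year-on-year growth rates, in percent, between consecutive values."""
    return [(curr / prev - 1) * 100 for prev, curr in zip(series, series[1:])]

# Data companies' revenues 2016-2020, million euro (Table 9)
revenues_eu27 = [47_178, 52_479, 58_948, 64_262, 71_050]
revenues_eu27_uk = [61_781, 68_846, 77_297, 83_545, 91_318]

growth_eu27 = yoy_growth(revenues_eu27)
growth_eu27_uk = yoy_growth(revenues_eu27_uk)

# One line per year, starting with growth 2017 over 2016
for year, eu27, eu27_uk in zip(range(2017, 2021), growth_eu27, growth_eu27_uk):
    print(f"{year}: EU27 {eu27:.1f}%, EU27+U.K. {eu27_uk:.1f}%")
```

The 2019 entries come out at about 9.0% for EU27 and 8.1% for EU27 plus the U.K., in line with the Growth 2019/2018 column.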
Data Companies’ Revenues Forecasts at 2025
Data companies’ revenues within the EU grow faster than the IT market as these products and services become more mainstream. However, the four major contributing countries (U.K., Germany, France, Italy) lose share of the European Data Market between 2019 and 2025, as their share of EU revenues falls from 66% to 63%. Italy, currently the fourth largest data market in Europe, is displaced by the Netherlands by 2025, according to the Baseline scenario. This is not a failing of these countries but reflects the catching up of smaller companies as their revenues rise. The four leading countries have a greater share of larger companies, while new entrants to the market are more likely to be smaller companies, which means disproportionate growth for the smaller Member States compared with the larger ones. Over the period 2019-2025, data companies’ revenues rise by 7.0% annually, while total IT spending rises by 1.6% annually over the same period.
Table 10: Data Companies' Revenues Forecast – Total number in the EU27 and EU27 plus U.K. and Growth rates - Three Scenarios (€, Million)
Indicator 3 – Data Companies’ Revenues - Forecast 2025
| Region | 2025 Challenge | 2025 Baseline | 2025 High Growth | CAGR 2025/2020 Challenge Scenario (%) | CAGR 2025/2020 Baseline Scenario (%) | CAGR 2025/2020 High Growth Scenario (%) |
| --- | --- | --- | --- | --- | --- | --- |
| EU27 | 80,148 | 98,623 | 136,350 | 2.4% | 6.8% | 13.9% |
| EU27 + U.K. | 103,796 | 127,976 | 181,795 | 2.6% | 7.0% | 14.8% |
Source: European Data Market Monitoring Tool, IDC 2020
2.4 The Business and Economic Dimension: The Data Market and the Data Economy
Measuring the Data Market
The Data Market is the marketplace where digital data is exchanged as “products” or “services” as a result of the elaboration of raw data.
The Data Market in 2018 and 2019
The value of the Data Market in 2019, for both EU27 and EU27 plus the U.K., continued to grow faster than total IT spending, at 4.9% year-on-year, and is expected to surpass the threshold of 60 billion euro in 2020 in EU27 according to the Baseline scenario described in our previous study. This is a steady and significant progression, considering that the Data Market in EU27 was estimated at 42.6 billion euro in 2015 in our previous study and that our current estimates put it at 55.5 billion euro in 2018.
Table 11: Data Market Value and Growth, 2016-2020 (€, Million; %)
Indicator 4 – Value and Growth of the Data Market
| N. | Market | Name | Description | 2016 | 2017 | 2018 | 2019 | 2020 | Growth 2019/2018 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 4.1 | EU27 | Value of the Data Market | Estimate of the overall value of the Data Market | 46,183 | 50,604 | 55,486 | 58,214 | 62,244 | 4.9% |
| 4.1 | EU27 + U.K. | Value of the Data Market | Estimate of the overall value of the Data Market | 59,496 | 65,286 | 71,787 | 75,274 | 80,253 | 4.9% |
Source: European Data Market Monitoring Tool, IDC 2020
The Data Market Forecasts at 2025
Our estimates of the Data Market value in 2025 under the High Growth scenario continue to show buoyant growth, with IT spending on Data Market tools almost doubling between 2019 and 2025 for both EU27 and EU27 plus the U.K. This corresponds to a considerable CAGR for the period 2020-2025 of 11.5% in EU27 and 12.0% in EU27 plus the U.K. under the High Growth scenario, marginally down on the previous publication, mostly as a result of a more buoyant 2020. Under our new 2025 Baseline scenario, the Data Market will amount to more than 82 billion euro in EU27, against 58.2 billion euro in 2019 (a 5.8% CAGR 2020-2025), while under the Challenge scenario the Data Market will still reach 72.3 billion euro, growing at a compound annual growth rate of 3.0% from 2020. Data Market growth will therefore continue unabated to 2025, confirming the trend identified in 2013-2014 in the initial results of the European Data Market Study (SMART 2013/0063).
These forecasts for 2025 are only marginally changed from the previous forecast: for the EU27 plus the U.K., the Challenge and High Growth scenarios are down 0.6% and 0.7% respectively, and the Baseline scenario is down 0.7%. The impact of Brexit might be a little more negative than anticipated last year, but the potential for higher growth is confirmed.
Table 12: Data Market Forecast 2025 - Total number in the EU27 and EU27 plus U.K. and Growth rates - Three Scenarios (€, Million)
Indicator 4 – Data Market - Forecast 2025
| Region | 2025 Challenge | 2025 Baseline | 2025 High Growth | CAGR 2025/2020 Challenge Scenario (%) | CAGR 2025/2020 Baseline Scenario (%) | CAGR 2025/2020 High Growth Scenario (%) |
| --- | --- | --- | --- | --- | --- | --- |
| EU27 | 72,329 | 82,564 | 107,139 | 3.0% | 5.8% | 11.5% |
| EU27 + U.K. | 93,056 | 105,638 | 141,507 | 3.0% | 5.7% | 12.0% |
Source: European Data Market Monitoring Tool, IDC 2020
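To illustrate how the scenario growth rates relate to the 2025 values in Table 12, the sketch below compounds the 2020 EU27 Data Market value from Table 11 forward over five years at each scenario's CAGR. This is a simple compounding check written for this purpose, not the forecasting model behind the Monitoring Tool; the function and variable names are illustrative. Because the published rates are rounded to one decimal place, the results only approximate the table values.

```python
def project(value, annual_rate_pct, years):
    """Compound a starting value forward at a constant annual rate (in %)."""
    return value * (1 + annual_rate_pct / 100) ** years

base_2020 = 62_244  # EU27 Data Market 2020, million euro (Table 11)

# CAGR 2025/2020 per scenario, EU27 (Table 12)
scenario_cagr = {"Challenge": 3.0, "Baseline": 5.8, "High Growth": 11.5}

# Published 2025 values, EU27, million euro (Table 12)
published_2025 = {"Challenge": 72_329, "Baseline": 82_564, "High Growth": 107_139}

projected_2025 = {
    name: project(base_2020, rate, years=5) for name, rate in scenario_cagr.items()
}

for name, value in projected_2025.items():
    gap = (value / published_2025[name] - 1) * 100
    print(f"{name}: projected {value:,.0f} vs published {published_2025[name]:,} ({gap:+.2f}%)")
```

In each scenario the compounded value stays within half a percent of the published figure, which is what rounding of the growth rates would be expected to produce.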
Measuring the Data Economy
The Data Economy measures the overall impacts of the Data Market on the economy as a whole. It involves the generation, collection, storage, processing, distribution, analysis, elaboration, delivery, and exploitation of data enabled by digital technologies.
The Data Economy includes the direct, indirect, and induced effects of the Data Market on the economy.
• The direct impacts: are the initial and immediate effects generated by the data suppliers; they represent the activity potentially engendered by all businesses active in the data production. The quantitative direct impacts will then be measured as the revenues from data products and services sold, i.e. the value of the Data Market.
• The indirect impacts: are the economic activities generated along the company's supply chain by the data suppliers. There are two different types of indirect impacts: the backward indirect impacts and the forward indirect impacts.
• The induced impacts: include the economic activity generated in the whole economy as a secondary effect.
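The decomposition above can be illustrated with figures reported elsewhere in this chapter. The sketch below applies the rounded 2025 Baseline shares for EU27 given in Figure 3 (15% direct, 43% indirect, 42% induced) to the 2025 Baseline Data Economy value in Table 14, and compares the direct component with the 2025 Baseline Data Market value in Table 12, since direct impacts are measured as the value of the Data Market. The variable names are illustrative, and the split is only as precise as the rounded shares allow.

```python
# 2025 Baseline, EU27, million euro (Table 14)
total_2025_baseline = 549_783

# Rounded shares by type of impact, 2025 Baseline, EU27 (Figure 3)
impact_shares = {"direct": 0.15, "indirect": 0.43, "induced": 0.42}

# Approximate value of each impact type implied by the rounded shares
impact_values = {
    name: total_2025_baseline * share for name, share in impact_shares.items()
}

# Direct impacts are defined as the value of the Data Market (Table 12, 2025 Baseline)
data_market_2025_baseline = 82_564
implied_direct_share = data_market_2025_baseline / total_2025_baseline * 100

for name, value in impact_values.items():
    print(f"{name}: about {value:,.0f} million euro")
print(f"Data Market as share of Data Economy: {implied_direct_share:.1f}%")
```

The Data Market value corresponds to about 15% of the Data Economy, consistent with the direct share shown in the figure.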
The Data Economy in 2018 and 2019
The value of the Data Economy for the EU27 plus the U.K. is estimated to exceed 406 billion euro in 2019 (more than 324 billion euro for the EU27 alone), overall confirming the estimates presented in the previous deliverable. The estimated share of total impacts on GDP in EU27 plus the U.K. is 2.6% in 2018 and is expected to grow to 2.8% in 2019.
Table 13: Data Economy Value and Growth, 2016-2020 and Impacts on GDP 2018-2019 (€, Million; %)
Indicator 5 – Value and Growth of the Data Economy
| N. | Name | Description | 2016 | 2017 | 2018 | 2019 | 2020 | Growth 2019/2018 | Growth 2020/2019 | Impact on GDP 2018 | Impact on GDP 2019 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 5.2 | Value of the Data Economy EU27 | Value of direct, indirect and induced impacts on the economy | 238,699 | 267,986 | 301,637 | 324,858 | 355,396 | 7.7% | 9.3% | 2.4% | 2.6% |
| 5.2 | Value of the Data Economy EU27+U.K. | Value of direct, indirect and induced impacts on the economy | 299,989 | 336,602 | 377,871 | 406,468 | 443,925 | 7.6% | 9.2% | 2.6% | 2.8% |
Source: European Data Market Monitoring Tool, IDC 2020
The Data Economy Forecasts at 2025
The new estimates put the value of the Data Economy for EU27 at more than 324 billion euro in 2019 and just over 355 billion euro in 2020, a growth of 9.3%. The estimated CAGR for the period 2020/2025 in EU27 remains healthy, at 9.1%. Under the Baseline scenario, the Data Economy accounts for 4.0% of EU27 GDP in 2025.
Under the High Growth scenario, the CAGR 2020/2025 in EU27 is 18.4%, which will take the Data Economy for EU27 beyond 827 billion euro, accounting for 5.9% of GDP in 2025. Under the Challenge scenario, the CAGR 2020/2025 for EU27 is 4%, less than half the Baseline rate, with the Data Economy just above 430 billion euro and accounting for 3.3% of GDP in 2025.
Table 14: Data Economy Forecast in 2025 and Impacts on GDP according to the Three Scenarios (€, Million; %)
| N. | Name | Description | 2025 Challenge Scenario | 2025 Baseline Scenario | 2025 High Growth Scenario | Impacts on GDP 2025 Challenge Scenario | Impacts on GDP 2025 Baseline Scenario | Impacts on GDP 2025 High Growth Scenario |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 5.2 | Value of the Data Economy EU27 | Value of direct, indirect and induced impacts on the economy | 432,360 | 549,783 | 827,089 | 3.3% | 4.0% | 5.9% |
| 5.2 | Value of the Data Economy EU27+UK | Value of direct, indirect and induced impacts on the economy | 536,715 | 674,263 | 1,036,709 | 3.4% | 4.2% | 6.2% |
Source: European Data Market Monitoring Tool, IDC 2020
As in the previous study, this report provides a detailed view of the Data Economy by type of impact: direct, indirect and induced. The pie chart below shows the distribution of the Data Economy by type of impact in 2025 for EU27 under the Baseline scenario. The composition of impacts changes over time, from 2019 to 2025, in favour of induced impacts, revealing the effects of data access, the exchange of data products and services, and the distribution of data value in the economy. As the data economy matures, the impacts on the general economy (induced impacts) become as relevant as those on European industry (indirect impacts). The data industry remains the prime motor of this economy, but its direct impacts decrease as a share of total impacts.
Indeed, induced impacts in 2025 account for a share of 42%, gaining 9 percentage points with respect to 2019. Indirect impacts in turn lose around 4% of share, but still account for a very high percentage (43%) in 2025. In 2019 indirect impacts were still the most relevant, as highlighted in the previous publication (with forward impacts driving the effect); by 2025 induced impacts will have increased to a share similar to that of indirect impacts.
Figure 3: Data Economy by Type of Impact, EU27, Baseline scenario 2025 (%)
[Pie chart, Baseline Scenario 2025, EU27: Direct Impacts 15%, Indirect Impacts 43%, Induced Impacts 42%]
Source: European Data Market Monitoring Tool, IDC 2020
2.5 The International Dimension - The Data Economy Beyond the EU – US, Brazil and Japan
The U.S.
U.S. growth in the number of data professionals in 2019 was in the middle of the three internationals, but growth in their share of total employment was the lowest, and lower than in Europe, reflecting the strengthening of the U.S. economy and improvements in employment there. In 2019 the number of data professionals grew by 1.7% year-on-year and their share of total employment by 0.3%.
Growth in the number of data professionals slowed marginally in 2019 compared with 2018. The same applies to the data supplier indicators: the European Union recorded the highest increase in data suppliers in 2019, while among the internationals none grew less than the U.S. (2.3% for the EU28 vs. 1.0% in the U.S., and 1.6% in both Brazil and Japan). Revenue growth was strong in the U.S., at 11.6% in 2018 and 12.7% in 2019, but in 2018 this was bettered by the EU28 at 12.2%, and Japan followed very closely behind at 10.5%. The U.S. showed stronger growth in 2019, however, with the highest growth among the internationals and the Member States.
Table 15: USA Indicators - Overview 2016-2020
USA – Indicators’ Overview
| N. | Name | Metrics | 2016 | 2017 | 2018 | 2019 | 2020 | Growth ‘19/‘18 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1.1 | Number of Data professionals | Total Number of Data professionals (Thousands) | 12,732 | 13,857 | 14,105 | 14,350 | 14,593 | 1.7% |
| 1.2 | Data professionals’ employment share | % of Data professionals on total employment | 8.42% | 9.04% | 9.06% | 9.08% | 9.11% | 0.3% |
| 2.1 | Number of Data Suppliers | Total number of data supplier companies (000s) | 289,556 | 303,552 | 309,263 | 312,215 | 316,190 | 1.0% |
| 3.1 | Revenues of Data Companies | Total revenues generated by companies specialized in the supply of data-related products and services (Million €) | € 129,173 | € 146,970 | € 163,993 | € 184,873 | € 211,349 | 12.7% |
| 4.1 | Value of the Data Market | Estimate of the overall value of the Data Market (Million €) | € 129,173 | € 146,970 | € 163,993 | € 184,873 | € 211,349 | 12.7% |
| 4.2 | Value of the Data Economy (Only Direct and Backward Indirect impacts) | Direct Impacts (Million €) | € 108,521 | € 146,966 | € 158,283 | € 178,450 | € 204,013 | 12.7% |
| | | Backward Indirect Impacts (Million €) | € 7,270 | € 7,860 | € 8,769 | € 9,500 | € 11,463 | 8.3% |
| 4.3 | Incidence of the Data Economy on GDP (Only direct and backward indirect impacts) | Ratio between value of the Data Economy and GDP (%) | 0.78% | 1.03% | 1.11% | 1.19% | 1.34% | 6.8% |
Source: European Data Market Monitoring Tool, IDC 2020
Brazil
Brazil’s economy showed little sign of recovery in 2019 but had improved by the end of the year to match the (weak) growth seen in 2018. Confidence in the economy improved, particularly in the last quarter of 2019. The exchange rate fell again, although not by much, from 0.27 to 0.25 US dollars per Real. Investment remains weak and industrial output is falling too. Unemployment is high in the country and fluctuates between 12% and 13%, with little sign of any downward trend. However, the data economy is improving, with growth in data supplier revenues picking up to 7.2% in 2019 compared with 5.4% in 2018.
Table 16: Brazil Indicators - Overview 2016-2020
Brazil – Indicators’ Overview
| N. | Name | Metrics | 2016 | 2017 | 2018 | 2019 | 2020 | Growth rate 2019/2018 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1.1 | Number of Data professionals | Total Number of Data professionals (Thousands) | 1,160 | 1,175 | 1,200 | 1,211 | 1,215 | 0.9% |
| 1.2 | Data professionals’ employment share | % of Data professionals on total employment | 1.81% | 1.84% | 1.86% | 1.88% | 1.89% | 1.2% |
| 2.1 | Number of Data Suppliers | Total number of data supplier companies (000s) | 35,979 | 36,906 | 37,605 | 38,192 | 38,477 | 1.6% |
| 3.1 | Revenues of Data Companies | Total revenues generated by companies specialized in the supply of data-related products and services (Million €) | € 6,049 | € 6,998 | € 7,373 | € 7,905 | € 8,374 | 7.2% |
| 4.1 | Value of the Data Market | Estimate of the overall value of the Data Market (Million €) | € 6,049 | € 6,998 | € 7,373 | € 7,905 | € 8,374 | 7.2% |
| 4.2 | Value of the Data Economy (Only Direct and Backward Indirect impacts) | Direct Impacts (Million €) | € 6,157 | € 6,996 | € 7,380 | € 7,986 | € 8,536 | 8.2% |
| | | Backward Indirect Impacts (Million €) | € 290 | € 335 | € 353 | € 374 | € 384 | 5.9% |
| 4.3 | Incidence of the Data Economy on GDP (Only direct and backward indirect impacts) | Ratio between value of the Data Economy and GDP (%) | 0.16% | 0.17% | 0.21% | 0.23% | 0.24% | 8.8% |
Source: European Data Market Monitoring Tool, IDC 2020
Japan
The indicators measuring the state of the Data Market and the Data Economy in Japan all showed growth in 2019, in some cases substantial growth. The number and employment share of data professionals grew the fastest among the internationals (2.87% and 2.85% in 2019), but the EU28 showed even higher growth. The number of Data Suppliers also showed the highest growth among the internationals, at 1.6% in 2019, but was again bettered by the EU28. Data revenues showed commensurate growth but did not reach the levels shown by the U.S. in 2019. The incidence of the data economy on the total economy showed only a small improvement in 2019, but a more notable improvement is expected in 2020 (see table below).
Table 17: Japan Indicators - Overview 2016-2020
Japan – Indicators’ Overview
| N. | Name | Metrics | 2016 | 2017 | 2018 | 2019 | 2020 | Growth rate 2019/2018 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1.1 | Number of Data professionals | Total Number of Data professionals (Thousands) | 3,740 | 4,045 | 4,118 | 4,236 | 4,324 | 2.9% |
| 1.2 | Data professionals’ employment share | % of Data professionals on total employment | 5.82% | 6.20% | 6.20% | 6.37% | 6.45% | 2.8% |
| 2.1 | Number of Data Suppliers | Total number of data supplier companies (000s) | 101,612 | 104,587 | 105,273 | 106,983 | 107,612 | 1.6% |
| 3.1 | Revenues of Data Companies | Total revenues generated by companies specialized in the supply of data-related products and services (Million €) | € 25,513 | € 26,720 | € 29,799 | € 32,929 | € 37,019 | 10.5% |
| 4.1 | Value of the Data Market | Estimate of the overall value of the Data Market (Million €) | € 25,513 | € 26,720 | € 29,799 | € 32,929 | € 37,019 | 10.5% |
| 4.2 | Value of the Data Economy (Only Direct and Backward Indirect impacts) | Direct Impacts (Million €) | € 27,394 | € 27,296 | € 30,074 | € 32,500 | € 37,287 | 8.1% |
| | | Backward Indirect Impacts (Million €) | € 1,189 | € 1,230 | € 1,330 | € 1,454 | € 1,689 | 9.3% |
| 4.3 | Incidence of the Data Economy on GDP (Only direct and backward indirect impacts) | Ratio between value of the Data Economy and GDP (%) | 0.93% | 0.96% | 1.08% | 1.09% | 1.25% | 0.8% |
Source: European Data Market Monitoring Tool, IDC 2020
International Overview and Comparison with the EU
In line with the results of the previous round of measurement of the international indicators (European Data Market Study Update (SMART 2016/0063), D2.7 Final Report on Facts & Figures), the U.S. retained its leadership in the number of data professionals in 2019 and out to 2020 (see Figure below), with an estimated 14.35 million in 2019. However, annual growth in 2019, at 1.7%, is behind the European Union and Japan. Over the longer term, U.S. compound growth from 2016 to 2020 is 3.5%, higher only than Brazil's. Brazil consistently shows the lowest annual growth in the number of data professionals and, unsurprisingly, the lowest compound growth too, at 1.2% out to 2020. Economic issues in the country reduce industrial and business growth, which is also reflected in the digital economy. Japan is generally the best of the internationals, with compound growth of 3.7% out to 2020. However, none of the internationals comes close to the longer-term growth in the number of Data Professionals seen in the Member States: a 7.5% CAGR out to 2020.
Figure 4: Number of Data Professionals by Country, 2016-2020, Growth 2019 (Units; ‘000, %)
Source: European Data Market Monitoring Tool, IDC 2020
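The compound growth rates quoted for the three international countries can be checked against the 2016 and 2020 values in Tables 15 to 17. The sketch below uses the standard compound annual growth rate formula over the four annual steps from 2016 to 2020; the function and variable names are illustrative and are not taken from the Monitoring Tool.

```python
def cagr(start, end, years):
    """Compound annual growth rate, in percent, over the given number of years."""
    return ((end / start) ** (1 / years) - 1) * 100

# Number of data professionals (thousands), 2016 and 2020 (Tables 15-17)
data_professionals = {
    "U.S.": (12_732, 14_593),
    "Brazil": (1_160, 1_215),
    "Japan": (3_740, 4_324),
}

# 2016 to 2020 spans four annual growth steps
cagr_2016_2020 = {
    country: cagr(start, end, years=4)
    for country, (start, end) in data_professionals.items()
}

for country, rate in cagr_2016_2020.items():
    print(f"{country}: {rate:.1f}% CAGR 2016-2020")
```

Rounded to one decimal place, the results are 3.5% for the U.S., 1.2% for Brazil and 3.7% for Japan, in line with the figures given in the text.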
Figure 5: Number of Data Suppliers in the U.S., Brazil, Japan and EU, 2016-2020, Growth 2019 (Units; ‘000, %)
Source: European Data Market Monitoring Tool, IDC 2020
The picture for Data Professionals is similar to the one for Data Suppliers, with the U.S. displaying moderate year-on-year growth in 2019 but stronger longer-term growth over the period 2016-2020. However, the dominance of the U.S. in the number of Data Supplier Companies is under threat, with the EU27 plus the U.K. displaying solid year-on-year growth and a total number of data suppliers very close to that of the U.S. Figure 5 shows the relative positions of Data Suppliers among the participating regions.
The value of the Data Market continues to increase at around 10% across all three EU partners considered in this report, ahead of the growth in the number of data professionals and data companies. This is to be expected: companies aim to increase revenue, so growth in the revenue of Data Suppliers should be well ahead of growth in the number of Data Suppliers. The data market is a dynamic one: companies that cannot increase their revenue (through marketing and sales efforts, and through product development) are more likely to exit the market, as their business model will be built on improving profitability in the longer term through revenue growth. Demand continues to grow as data users increasingly appreciate the value of digital transformation and become better able to adopt digital practices in their organisations. However, a large part of organisations' digital transformation activity tends to be directed towards cost reduction and efficiency improvements rather than fundamental changes in the way these organisations develop and conduct business.
The U.S. easily retains its lead position in the Data Market for the foreseeable future, with a market of more than 184 billion euro and buoyant year-on-year growth of 12.7% in 2019. Among the regions compared, the EU is the only market able to challenge the U.S. for dominance in this industry, but it remains a considerable distance behind: the data market in the EU28 is roughly 41% of that in the U.S.
Figure 6: Value and Growth of the Data Market in the U.S., Brazil, Japan and EU, 2016-2020, Growth 2019 (Units; ‘000, %)
Source: European Data Market Monitoring Tool, IDC 2020
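The relative size of the markets can be derived directly from the 2019 Data Market values in Table 11 and Tables 15 to 17. The sketch below expresses each market as a percentage of the U.S. market; the names are illustrative, and the EU figure used is the EU27 plus U.K. value from Table 11.

```python
# Value of the Data Market in 2019, million euro (Tables 11, 15, 16 and 17)
data_market_2019 = {
    "U.S.": 184_873,
    "EU27+U.K.": 75_274,
    "Japan": 32_929,
    "Brazil": 7_905,
}

us_value = data_market_2019["U.S."]

# Each market expressed as a percentage of the U.S. market
relative_to_us = {
    region: value / us_value * 100 for region, value in data_market_2019.items()
}

for region, share in relative_to_us.items():
    print(f"{region}: {share:.1f}% of the U.S. Data Market")
```

On these figures the EU27 plus U.K. market is about 40.7% of the U.S. market, Japan about 17.8% and Brazil about 4.3%.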
3. Describing the data market – the quali-quantitative stories
The Final Report on Facts & Figures (D2.7) and the Final Report on Policy Conclusions (D2.8) were accompanied by three quali-quantitative stories on the operational, organisational and/or economic benefits generated by the use of data-driven technologies, with a special focus on Health Data-driven Innovation and Data Commons. While the first story investigated the unexplored potential of Big Data and Analytics (BDA) in healthcare, the second and third stories focused on the role of Data Commons, analysing in depth the benefits of data-driven innovation and the potential role and impacts of common data spaces in several leading sectors targeted by the recently unveiled European Strategy for Data (COM (2020) 66 final, 19 February 2020).
3.1. Story 6-7 Health Data and Data-driven Innovation in the European Healthcare Industry
Combining two quali-quantitative stories (Story 6-7), the research on health data and data-driven innovation highlighted how a growing number of European healthcare systems are embarking on long-term reforms to improve outcomes and foster innovation, with the ultimate goal of benefiting patients while ensuring the long-term sustainability of healthcare service provision. The research revealed untapped potential: the majority of healthcare providers (59%) have not yet adopted a Digital Transformation roadmap, and only 6% have established a single roadmap for Digital Transformation and general business strategy⁷.
This unexplored potential of Big Data and Analytics (BDA) in healthcare is eliciting a new wave of interest in data-driven value creation which, in the medium to long run, will make it possible to reward performance rather than just volume. According to the IDC DX Sentiment survey 2018⁸, over 60% of European healthcare providers reported that developing data management is a priority. Indeed, being able to analyse and use data for process automation and decision support in a granular, accurate, safe and context-relevant way is key to long-term competitiveness and sustainability.
AI is still in its infancy: only 30% of European healthcare providers are already using or testing the technology or have immediate plans for it, with another 23% evaluating AI use cases⁹. The top three BDA use cases that European healthcare providers are working on are Clinical decision support (16%), Illness progression (15%) and Patient engagement (14%)¹⁰. Patient engagement is a key priority for almost 40% of European healthcare providers, particularly in countries that are experimenting with value-based reimbursement models structured around patient experience and outcomes, while the most significant AI use cases that European healthcare providers are working on are Personalization of clinical pathway (10%), Clinical decision support (9%) and Patient risk (9%)¹¹.
7 Source: IDC DX Executive Sentiment Survey, May 2018 (n=66)
8 Source: IDC DX Executive Sentiment Survey, May 2018 (n=66)
9 The IDC European Vertical Markets Survey is an annual landmark study of IT solutions, investment priorities, and emerging technologies. In the 2018-2019 version, the sample covers over 77% of the European economy (in terms of GDP of the 40 countries). Respondents are distributed across Western Europe (UK, France, Germany, Italy, Spain, Netherlands, the Nordics) and Central & East Europe (Russia, Poland, and Czech Republic). The survey was conducted in the native language of each country, using either telephone interviewing (CATI) or web interviewing (CAWI) systems. Eligible respondents are individuals primarily involved in IT and/or business decisions at their companies and ranked at director level or above. Results are analysed by vertical market and company size and represent a basis for a series of demand-side reports published by IDC.
10 Data retrieved from IDC European Vertical Markets Survey, 2018–2019 (n= 290 [WE = 232, CEE = 58])
The most significant AI use cases that European healthcare providers are working on are:
• Personalization of clinical pathway. As population health management (PHM) transforms the traditional hospital-focused approach into value-based care that delivers more efficient outcomes using fewer resources, AI and machine learning algorithms can derive actionable insights from the untapped datasets essential for population health programs. AI can, for example, process real-time data to pinpoint specific demographics where health issues exist and target them precisely with ad-hoc treatment programs.
• Clinical decision support. A machine learning system can provide a high level of clinical accuracy and cover a broad range of conditions. Symptoms can be entered via natural language and used to drive diagnoses and the direction of care.
• Patient risk. AI can predict a patient's future health with better accuracy, such as the risk of contracting specific diseases. AI systems can predict the outcomes of hospital visits to help prevent readmissions and shorten the time patients are kept in hospital.
• Compliance check. The healthcare industry is highly regulated, but maintaining compliance within evolving patchworks of national and regional regulations can be a strain on providers' limited resources. The adoption of AI in this area optimizes administrative procedures and frees experienced caregivers from routine tasks.
• Optimization of resource utilization. Machine learning and AI have the potential to give front-line staff real-time insight to improve the speed and quality of the hundreds of decisions they make each day, and so improve the flow of patients through the various clinical services involved in delivering appropriate care.
Figure 7: Top three use cases for AI among European healthcare providers
Source: IDC European Vertical Markets Survey, 2018–2019 (n= 290 [WE = 232, CEE = 58])
We analyse four different case studies (see table below):
• The Secondary Use of Data in Finland - In the health and social welfare sector, there has been extensive work between the public and private sectors to promote the secondary use of health and well-being data. This has led to the creation of a new ecosystem through a national development project, the development of ground-breaking new legislation and the establishment of a permit authority for secondary use of health data. Interviews conducted with Jaana Sinipuro, Hannu Hämäläinen and Antti Kivelä (SITRA).
11 ibidem
• The Health Data Hub in France - In 2018, president Macron launched this initiative to establish a nationwide data platform. The project is today strongly pushed forward from the French Health Act "Ma Santé 2022", approved last July, and including the creation of the "Espace Numérique de Santé" (ENS – e-health personal space) with the aim to establish a more efficient and patient oriented healthcare system by leveraging the power of data and artificial intelligence. The project goal is to enrich and enhance the National Health Data System (SNDS - Système National des Données de Santé) by including the wider French heritage of health data in one place, for open use by researchers, healthcare professionals, care institutions, start-ups, insurers, etc.
• The development of Data Policies in Portugal - In August 2019, the Portugal Health Ministry has made available the strategic document "From big Data to smart data: putting data to work for the public's health", that outlines the vision, key areas and principles for secondary use of data, advanced analytics and artificial intelligence to improve Portuguese population´s health. There are several initiatives and pilot projects, that are testing the capabilities of AI and providing evidence to support the development of new data management and governance policy.
• ARIA’s Health Data Warehouse and Business Intelligence Competency Center (Italy) – In July 2019, Regione Lombardia (Lombardy Region, Italy) established a new regional company called ARIA (Agency for Innovation and Procurement) from the merger of ARCA (the regional agency for procurement) and Lombardia Informatica (the regional in-house digital company). Within its mandate, ARIA has the specific aim of enhancing the value of all regional health data assets. Interviews were conducted with Giuseppe Preziosi (ARIA).
Country: Finland
Challenges/business needs:
• Making use of the huge amount of health and social care data collected from different sources and registers
• Offering researchers and organizations looking to leverage health data for innovation a coherent and simple environment for accessing and using the data in a legally compliant way
• Maintaining public trust in the government's capability to manage data for the common good and with respect for individual rights
Key initiatives and application areas for secondary use of health data:
• Creation of a data access permit authority and the related organizational, infrastructural and legal conditions
• Key application areas: innovation in clinical and public health research, as well as in private sector (pharma and life sciences) R&D to promote economic growth
Benefits (current or expected):
• Simple yet comprehensive permit procedure to get access to data for R&D purposes
• Strong knowledge base and experience in managing a complex ecosystem of stakeholders, sometimes with conflicting interests, and in enabling sustainable cooperation
• Establishment of a single authority that works as a one-stop shop for permits regarding several registers
Output to date:
Sitra launched 8 pre-production projects, whose lessons learned and outcomes were then integrated into an action plan for the new permit authority once in operation.

Country: France
Challenges/business needs:
• Fragmentation of the healthcare ecosystem
• The need for a secure governance framework enabling the ethical use of data for analysis and research
• The need to establish a regulated platform for data access and use, as well as to facilitate the interaction of multiple stakeholders (collecting, producing and using data)
Key initiatives and application areas for secondary use of health data:
• Establishment of a Health Data Hub (HDH) to manage a large number of data sources with harmonized rules for data access and use
• Key application areas for secondary use include the development of RWE research, support to clinical trials, development of precision and personalized medicine, and predictive clinical decision support solutions
Benefits (current or expected):
• A harmonized standard of data access at national and international level
• Patients'/citizens' access to and control over the use of their data
• Accelerated innovation and personalized services
Output to date:
19 pilot projects have been launched, with focuses ranging from breast cancer to health surveillance, using predictive tools that deliver insights able to drive decision making.

Country: Portugal
Challenges/business needs:
• The need to transform the National Health Service into an intelligent, data-driven NHS to drive efficiency and better patient outcomes
• The need to define a framework for the secure collection, storage and reuse of health data
Key initiatives and application areas for secondary use of health data:
• Establishment of a national strategy for health data management and for secondary use
• Funding of AI-enabled research programs and projects aimed at the development of clinical decision support, personalization of clinical pathways, patient risk management and optimization of resource utilization
Benefits (current or expected):
Creation of
• A data infrastructure that supports population health management
• A predictive solution aimed at reducing skin cancer mortality
• A predictive solution aimed at optimizing the delivery of emergency services
Output to date:
Several initiatives and pilot projects are testing the capabilities of AI and providing evidence to support the development of new data management and governance policy.

Country: Italy (Lombardy region)
Challenges/business needs:
• Need to integrate and use more than 10 years of collected health data in the regional Data Warehouse
• Development of an infrastructural, organizational and policy framework for the use of and access to data
Key initiatives and application areas for secondary use of health data:
Predictive BDA models and Data health hub
• Patient risk
• Population risk management
• Illness progression and clinical decision support
• RWE-driven research
Benefits (current or expected):
• Enablement of new data-driven research streams and collaboration at national and international level
• Establishment of a PoC framework to predict cardiovascular risk with accuracy between 96.22% and 97.96%
• Launch of a pilot determining the clinical best next action for Alzheimer patients
Output to date:
Several initiatives to support the regional healthcare service in areas such as planning and management, cost rationalization, evaluation of the safety and efficiency of clinical pathways and integrated patient journeys, as well as health risk prevention.
In terms of the triggering challenges, security and data privacy concerns come first, owing to the highly regulated environment of the healthcare industry. However, three common themes emerge from the case studies:
• The establishment of a regulatory framework for the use of and access to data. This involves multiple players in the public and private healthcare arena to establish partnerships and create a collaborative environment. It requires all stakeholders to agree on the value of data as a shared asset, and to actively promote initiatives where common standards and a one-stop-shop approach to data access are brought forward. This is the case of Portugal, where the Health Ministry authority is seeking to establish a new healthcare ecosystem based on a data-driven approach towards the delivery of healthcare services across the nation.
• The collection, processing, storing and access of complex data coming from different structured (e.g. National health records) and unstructured (e.g. wearables) sources, along with semantic, geographical and time complexities. This is the case of ARIA and its collection of over 10 years of healthcare data stored in different locations, as well as Sitra's project working at establishing a common framework and developing metadata descriptions. Additionally, data collected require intelligent solutions, capabilities and skills to extract value from data and provide actionable areas for the deployment of information (i.e. population risk stratification, clinical decision support, personalization of clinical pathways, etc.).
• Maintaining trust and ensuring security. The high sensitivity of healthcare data requires a careful approach to identifying and enforcing regulatory frameworks and solutions that ensure information is treated in compliance with regional and country-specific policies. The adoption of GDPR strengthens and unifies data protection for individuals within the EU and regulates how data integration can happen safely. It gives individuals key control over the usage, processing and transfer of their personal data held by healthcare organizations. The transition to a VBHC model, in which care plans should be personalized and stakeholders should integrate their activities, must be underpinned by consistent information management governance that enables patient data integration across providers, processes and IT systems. GDPR is expected to provide a patient-centric ecosystem. In addition, authorities need to establish a high level of public trust in the ethical and secure use of healthcare and social population data for the public good. In Finland, for example, citizens are informed about how their data are used for secondary purposes beyond primary care.
The case studies presented in this research highlight several benefits obtained by the organizations adopting AI/ML and BDA technologies:
• Easy and convenient access to intelligent solutions for clinicians and patients offers more opportunities to advance decision making and enhance clinical process efficiency at the point of care. Portugal's skin cancer screening solution is an example of how technology supports a collaborative clinical framework and enables the integration of information to serve population health management.
• More advanced predictive capabilities allow greater control over disease-specific variables affecting health outcomes, as well as over the costs and resource utilization associated with care. This approach makes it possible to target more efficiently the population segments at risk of developing chronic and long-term conditions, by putting in place initiatives aimed at promoting health and preventing or delaying the development of risk factors. Predictive BDA by ARIA is an example of a predictive model able to effectively target cardiovascular conditions and offer an accurate estimate of the number of future cases in a specific geographical area.
• The adoption of BDA as part of their business intelligence strategy helps healthcare providers to improve overall operational efficiency. Advanced predictive analytics models support the estimation of admission rates and attrition rates to help with staff allocation. In Finland, for example, the new Act on secondary use of data is also expected to support the planning and reporting duties of authorities. The aim is to reduce healthcare costs and focus more resources on the delivery of better healthcare.
• Big data can help uncover unknown correlations, hidden patterns and insights by examining large sets of data. By applying machine learning to big data, researchers can study human genomes and find the correct treatment or drugs for cancer or rare diseases. Clinical studies are long and expensive to implement. They provide results on drug administration within a specific controlled frame that does not reflect real-life conditions. Moreover, it can be difficult to compare results between different clinical studies (indirect comparison of different new treatments). Real World Data (RWD) analysis could provide exhaustive, real-life analysis. Aware of the potential of RWD, health authorities are developing new evaluation frames with RWE and artificial intelligence. The HDH in France is an example of how a secure and regulated environment gives health actors access not only to relevant data but also to the means needed to analyse these data, facilitating clinical trial initiatives.
3.2. Story 8 - Accelerating the Impact of Data Commons
Story 8 focused on the concept of “Data commons”, also referred to as data collaboratives, data spaces, open data partnerships, and common data infrastructures (Mishra et al. 2016; Perkmann and Schildt 2015; Susha et al. 2017). Data commons have been defined as platforms that openly share data and knowledge, with a computational infrastructure that supports data sharing across a heterogeneous base of users and supports different services on top of the data (Contreras 2010).
Data commons are currently managed by multiorganizational collaborations, usually a consortium or a group of organizations including competitors, that come together to share costs and resources to build a common infrastructure that supports data analysis and helps them extract value from the integration and re-use of the data shared.
Whatever the degree of sharing across companies, such efforts to build common data pools need to address several challenges, including:
- Generating the appropriate incentives for organizations to share some data
- Providing the conditions under which they are willing to do so, that is, a governance approach that defines rights and responsibilities across data owners and users
- Agreeing on a set of data formats, data structures and quality levels in which the data and its contextual information need to be shared, which implies an agreement on data standards and metadata to allow interoperability across systems
- Defining a sustainability model that guarantees that such infrastructure and efforts are not only maintained over time but eventually scale.
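The agreement on formats and metadata mentioned in the list above can be made concrete with a small sketch. It is illustrative only: the field names (`owner`, `licence`, `data_format`, `permitted_uses`) and the set of accepted formats are assumptions made for this example and are not taken from any of the data commons discussed in this story.

```python
from dataclasses import dataclass, field

# Hypothetical set of formats the members of a data commons have agreed on.
AGREED_FORMATS = {"csv", "parquet", "json"}


@dataclass
class DatasetDescriptor:
    """Minimal, hypothetical metadata record submitted together with a dataset."""
    title: str
    owner: str
    licence: str
    data_format: str
    permitted_uses: list = field(default_factory=list)


def validate(descriptor):
    """Return a list of problems; an empty list means the descriptor is acceptable."""
    problems = []
    # Governance: every dataset needs an identifiable owner and a licence.
    for name in ("title", "owner", "licence"):
        if not getattr(descriptor, name).strip():
            problems.append(f"missing {name}")
    # Interoperability: only formats agreed by the members are accepted.
    if descriptor.data_format.lower() not in AGREED_FORMATS:
        problems.append(f"format '{descriptor.data_format}' is not an agreed format")
    # Rights and responsibilities: the owner must state what re-use is allowed.
    if not descriptor.permitted_uses:
        problems.append("no permitted uses declared")
    return problems
```

A contributor would call `validate(...)` before uploading; a non-empty result lists what has to be fixed for the dataset to meet the commonly agreed rules.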
One of the biggest challenges for data commons is generating the business case for companies to be willing to share their data with a pool of organizations via a common data infrastructure; and developing a long-term sustainability model that affords the different activities behind the operations that support such infrastructure.
Regarding the first challenge, several industries have realized the critical value created by sharing and re-using digital data (in short, data) for purposes unforeseen when the data was first collected. Data is considered a non-rival good, meaning that it can be reused and recombined to generate positive spillover effects (OECD 2015). Although the re-use of data is assumed to foster innovation and contribute to economic efficiency by avoiding duplicative investments, amongst other positive externalities (Drexl, 2018), data is still kept in silos and little re-use is reported (Pujol Priego et al. 2019). As of today, data is not subject to property rights, such as knowledge, but other legal constraints can inhibit the reuse of data, such as trade secret protection and the right to data portability in GDPR, amongst others. Most importantly, however, the decision of a company to share its data for reuse depends on its actual interest in sharing it.
Sharing also raises concerns. Depending on the sector, liability concerns may prevent companies from sharing potentially sensitive data (e.g. clinical trial data) with the public, even when there are potential individual and collective benefits. Competitive concerns relate to how to share information without damaging the competitive interests and advantage of for-profit organizations. Besides these concerns behind a company's decision to engage in the development of data commons, there are different costs associated with doing so.
Worth noting are the operational costs of the services provided by such data commons, which include data curation, periodic updating and monitoring of datasets, the agreement and implementation of data standards to ensure interoperability, data storage services and all activities needed to guarantee data security and privacy standards. Interoperability is an important requirement of data commons: it permits the aggregation and integration of datasets through a variety of tools and supports their re-usability. Data commons must also be secure in order to preserve permissions, guarantee data protection and prevent corruption of data. This aspect is critical where data involves sensitive personal information (e.g. health-related data, financial or legal data).
On top of such operational activities, additional tasks, depending on the scope of the data commons, may include the provision of data visualization tools that foster data re-use and operational support for users who seek to re-use the data and ask for clarification or additional contextual information, amongst others. See Grossman, 2018 for more exhaustive detail.
Potential benefits are related to efficiency gains and to the generation of new products and services made possible by the re-use and recombination of others' data, fostering an ecosystem that co-creates value around data. Data is a non-rivalrous resource, so the same data can support the generation of heterogeneous products and services. Any company can engage with the same data in different data-sharing arrangements, so the potential value that can be extracted from the same data is unlimited. For instance, as some of the data commons cases reveal, properly designed data commons can serve R&D processes as an active and accessible repository for research data; as a platform for reproducing research results; and as a support for discoveries, as data and new algorithms are added around the commons and as new software applications and tools are integrated with the common pool of data. Benefits also include fewer data silos across organizations and integrated workflows in the cloud, thus taking advantage of a common cloud solution. This would also foster a dynamic ecosystem in which software vendors, oil companies, academia and an open community develop projects (competitive dynamics) on top of a jointly developed common data platform (cooperation).
As the types of data commons grow and diversify, we can expect to see a variety of higher-order services on top of such infrastructures with different offerings operating within and across data commons while displaying different data value-chain configurations - including third-party contributions - to maximize the value of the data. European policy in this domain is certainly necessary to facilitate the creation of sufficient critical mass. But it should draw on existing experience for building such “data spaces”. In particular, what emerges clearly is that there are great challenges to data pooling that can only be addressed by sophisticated and carefully designed governance mechanisms. Data pooling, to be effective, needs to be framed in a wider context of precompetitive collaboration between companies, and needs to have a compelling business case: in other words, it has to be demand driven.
3.3. Story 9 – Scaling up data-driven innovation: European industry requirements and the role of European data spaces
The strategic objective of the new European Strategy for Data (COM (2020) 66 final, 02/19/20) is to make Europe a global leader in the data-agile economy, by creating a favorable policy environment and a genuine single market for data. A pillar of this strategy will be the creation of common European data spaces in strategic economic sectors and domains of public interest, where data-driven innovation will have a systemic impact on the entire ecosystem and on citizens. To become operational, data spaces will need to develop data governance mechanisms and access to high value datasets to enable data-driven innovation within vertical ecosystems and foster their development.
The European Commission has investigated the potential needs and requirements for common data spaces in a series of workshops with stakeholders12 as well as several other initiatives. Story 9 looks at common data spaces from the viewpoint of European industries who have already invested in data-driven innovation, have achieved measurable business benefits and are engaged in scaling-up these efforts. This provides an evidence-based and industry-specific view about the pragmatic requirements of Common European data spaces, with a focus on the requirements for data governance, access to data and access to infrastructures. The results show that the path towards data spaces effectively supporting data sharing at ecosystem level will not be easy. Enterprises are still concentrating most of their efforts and investments on developing data-driven innovation within their organization, at best sharing some data with a few trusted sub-suppliers. The most relevant barrier found to scalability comes from the cost of cloud infrastructures and the dependency on a few global suppliers, such as AWS (Amazon Web Services), resulting in potential customer lock-in effects.
12 Report on the European Commission's Workshops on Common European Data Spaces https://ec.europa.eu/digital-single-market/en/news/report-european-commissions-workshops-common-european-data-spaces
The story leverages the set of 18 case studies developed by Politecnico di Milano (POLIMI) and IDC across seven industries within the context of the H2020 DataBench project13, collecting data about the business impacts of the adoption of Big Data and Analytics technologies (Figure 8). These case studies show a good level of business benefits achieved from data-driven innovation, with a high level of cost reduction (such as an 80% reduction of operational expenditures for fraud detection in the financial services industry and a 30% reduction of maintenance costs in manufacturing thanks to predictive maintenance) and customer benefits (for example, a 110% improvement of customer retention in manufacturing and an 85% improvement of conversion rates from potential to actual customers thanks to data-driven targeting in retail). These case studies represent operational services and applications, but most of them are still confined to individual departments or branches of the company and are in the process of being scaled up to the whole organization. The lessons learned from the case studies highlight the following problems and risks when scaling up these services:
• Even when the technology solution has been well selected and designed, there is no absolute guarantee that scalability to the whole organization will be feasible and cost-effective until it is actually implemented.
• Business superficiality: business intelligence is always needed to lead the strategic use of data, and this is still found in human resources rather than in machines. For example, automated recommendation systems built on standard solutions will tend to recommend the products with the highest sales, which are likely to be the products with lower prices and lower margins. A business manager must intervene to provide an intelligent strategy, for example by finding ways to nudge customers towards products with higher prices and margins. Integrating business intelligence with the use of data analytics is still immature in many industries.
• Most of these solutions rely on public cloud technologies. This creates a lock-in risk, since migrating to other cloud providers requires considerable costs and time investments for the redesign of software. The high concentration of the cloud provider market reduces the potential choices of business users and constrains the scaling up of data-driven innovation.
• When scaling up, particularly if the solution requires real-time data processing, cloud computing costs tend to rise very quickly and cross a threshold where technology costs are higher than business benefits (edge vs. cloud decisions). This is a sensitive aspect, particularly for solutions combining BDA (Big Data Analytics) and Artificial Intelligence such as machine learning.
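The recommendation example in the list above can be illustrated with a small sketch. The product data, and the choice of ranking by units sold multiplied by unit margin, are assumptions made purely for illustration; they do not come from the case studies.

```python
# Hypothetical catalogue: (product, units sold per month, price in EUR, unit margin in EUR).
CATALOGUE = [
    ("basic kettle", 900, 15.0, 1.0),
    ("mid-range blender", 400, 60.0, 9.0),
    ("premium coffee machine", 120, 350.0, 70.0),
]


def rank_by_sales(catalogue):
    """Default behaviour described in the text: best sellers are recommended first."""
    ordered = sorted(catalogue, key=lambda p: p[1], reverse=True)
    return [name for name, _units, _price, _margin in ordered]


def rank_by_margin_contribution(catalogue):
    """Business rule added by a manager: rank by units sold times unit margin."""
    ordered = sorted(catalogue, key=lambda p: p[1] * p[3], reverse=True)
    return [name for name, _units, _price, _margin in ordered]


print(rank_by_sales(CATALOGUE))
print(rank_by_margin_contribution(CATALOGUE))
```

With this made-up data the two rankings are exact opposites, which is the point of the example: the data alone favours the low-margin best seller, and a business decision is needed to change the objective.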
13 Evidence-Based Big Data Benchmarking to Improve Business Performance, www.databench.eu
The cloud computing costs issue is one of the relevant results of the case study analysis. Basic cloud services are quite convenient, but prices increase very fast as soon as more sophisticated services are needed. For example, in the retail industry, a leading supermarket chain found that leveraging AI for sales prediction in one shop led to a 5% increase of margins (equivalent to roughly 5 million €/year); but applying the same machine-learning application to all shops in the chain would wipe out the benefits and cost more than the margin increase. These issues are behind the decision of the European Commission to promote federated European cloud infrastructures. The European data strategy in fact plans to fund a High Impact Project on European data spaces and federated cloud infrastructures.
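The threshold effect described above can be sketched as a simple break-even calculation. This is a toy model under stated assumptions, not the supermarket chain's actual figures: the benefit per shop is assumed constant, and cloud costs are assumed to grow faster than linearly with the number of shops (the exponent and all amounts below are hypothetical).

```python
def net_benefit(shops, benefit_per_shop, base_cost_per_shop, growth_exponent):
    """Yearly benefit minus yearly cloud cost for a roll-out to `shops` shops.

    Cloud cost is modelled as base_cost_per_shop * shops ** growth_exponent,
    an assumed superlinear curve (exponent > 1), not a measured price list.
    """
    benefit = shops * benefit_per_shop
    cloud_cost = base_cost_per_shop * shops ** growth_exponent
    return benefit - cloud_cost


def largest_profitable_rollout(max_shops, **params):
    """Largest number of shops for which the roll-out still pays for itself."""
    profitable = [n for n in range(1, max_shops + 1) if net_benefit(n, **params) >= 0]
    return max(profitable) if profitable else 0


# Hypothetical parameters, in EUR per year.
params = dict(benefit_per_shop=100_000, base_cost_per_shop=21_000, growth_exponent=1.5)
print(largest_profitable_rollout(200, **params))
```

Under these assumptions a single shop is clearly profitable, while a roll-out to the whole (hypothetical) chain of 200 shops is not; where the break-even point lies depends entirely on how steeply costs grow.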
These case studies confirm the need for improved access to data processing and computing capacities, as foreseen in the Data Strategy in terms of support for data spaces14.
Figure 8 – Big Data success stories: main use cases and business impacts, 2019
Agriculture: Crops monitoring (Costs = -10%); Equipment optimization; Precision agriculture
Automotive: Predictive maintenance; Self driving; Smart services (Costs = -80%)
Financial Services: Fraud detection (Operational Ex. = -80%); Risk assessment; Targeting (Marketing costs = -35%, TCO costs = -80%, Conversion rate = 10x)
Healthcare: Diagnostic; Patient monitoring; Preventive systems
Manufacturing: Predictive maintenance (Maintenance costs = -30%); Smart manufacturing (Utilities costs = -20%, Cust. retention = +110%); R&D optimization/Smart design
Retail: Assortment optimization/Intelligent fulfilment; Price optimization/Promotions (Conversion rate = 50%, Cust. retention = +14%); Targeting (Conversion rate = +85%, TCO costs = -15%)
Telecommunication: Churn prediction/Promotions; Network capacity optimization; Targeting (Conversion rate = +130%)
Transport & logistics: Churn prediction/Promotions; Fleet management; Network capacity optimization (TCO costs = -90%)
Utilities: Churn prediction/Promotions; Network capacity optimization (Costs = -20%, Cust. Expenses = -30%); Personalized fares (Marketing costs = -50%, TCO costs = -50%)
Source: Chiara Francalanci, Polimi, “Virtual BenchLearning: Success Stories on Big Data & Analytics”, DataBench Webinar 28 May 2020
14 European Commission, “A data strategy for Europe”, February 2020, page 17
In addition, we investigated in depth three case studies to focus on scaling-up issues and potential requirements for European data spaces. They are:
• e-GEOS, an ASI (20%) / Telespazio (80%) company, is a leading international player in the Earth Observation and Geo-Spatial Information business. The case study concerns the design of an innovative machine learning algorithm for yield prediction based on Sentinel and Landsat high-resolution satellite information. The algorithm was used to predict the production of soybeans and corn in the US on behalf of a company operating in the financial industry, which needed predictions to support trading decisions. The machine learning algorithms proved roughly 10% more accurate, supporting better investment decisions, which helped the financial company to gain around 0.32% on traded volumes. This is a very valuable gain in the competitive trading market. e-GEOS is the global distributor of the data from COSMO-SkyMed, a constellation of four radar satellites for Earth Observation funded by the Italian Space Agency and the Italian Ministry of Defense. Due to its nature and ownership, e-GEOS is oriented towards the provision of open data and data sharing. Nevertheless, satellite data is basically raw material which needs to be processed and treated with sophisticated tools, such as the machine learning algorithm developed by e-GEOS, before it can be used. Providing farmers with access to this type of data in a common data space requires a governance framework where the costs of data processing can be compensated and specialized intermediaries like e-GEOS can provide the necessary tools.
• A leading Spanish Financial group. The case study concerns the development of a
dataset on the relationships between customers in order to build a part of the social
graph of the bank. The data is synthetically generated based on real data coming
from a set of restricted tables (relational database), with information related to the
customers, their connections and the different operations they perform. The
generation of this dataset is aimed to allow the bank to share data with its external
providers to deploy and validate proofs-of-concept of different use cases (e.g.,
potential fraud based on customers’ relationships).
• Whirlpool is a multinational white appliances manufacturer. The case study concerns several initiatives, including a pilot carried out within the context of the project Boost for the application of Big Data to forecasting spare parts demand, including the creation of a consumer service data model and planning optimization. Spare part production and distribution is one of the most relevant challenges for after-sales services, requiring careful and timely management of central warehouses of spare parts (so customers don’t have to wait too long for repairs), with a large variety of product families and product codes. By using data analytics and creating a “smart service” for spare parts forecasting and management, the company was able to achieve a 30% reduction of the spare part stock, an increase of inventory turnover by 35%, and a reduction of 25% of the lead time to consumer. The company also implemented a new platform for self-service analytics for internal users, so that they can access the relevant data in a more flexible and personalized way. Whirlpool is still struggling with many data silos devoted to single company functions or departments and is trying to merge them into a single data lake. The company is also developing a “digital twin” experiment in Poland but is struggling to merge different data sources, particularly from the production machines and plant assembly lines. According to the interviewee, there are cultural and technological barriers preventing data sharing between the manufacturer and its sub-suppliers. The suppliers are focused on collecting data from the production cycle for maintenance and improving efficiency but are not willing to share the data. In addition, the different typologies of data and methods of collection create technological barriers.
In conclusion, there has been considerable progress in the use of data-driven innovation by European industries in recent years, including an increasing use of AI techniques such as machine learning leveraging the power of data. Nevertheless, there is still a high level of immaturity in the capability to merge datasets within a company, and relevant barriers against data sharing even in advanced sectors such as manufacturing. A relevant issue which emerged from most case studies is the availability of affordable and efficient cloud computing infrastructures that allow successful pilots and individual company-site experiences to be scaled up to the whole company domain. Even if Common data spaces could potentially provide a valuable answer to the need for greater access to high quality datasets and computing infrastructures, this will require solving practical and technology challenges, not simply providing a favourable environment for encouraging stakeholder collaboration.
4. Mapping the data market – data landscape and data market monitoring tool
4.1 The EU Data Landscape
The Final EU Data Landscape Report (D4.3) provides an overview of the EU Data Landscape database revision performed by the study team between January 2019 and December 201915. With a total of 1556 companies and coverage of 42 countries (European Union-28, Belarus, Bosnia and Herzegovina, Georgia, Iceland, Israel, Kenya, Moldova, Norway, Russia, Serbia, Switzerland, Turkey, Ukraine and the United States), the database offers a comprehensive overview of the most important data companies in Europe. More specifically, among the 1556 companies, 311 have been identified as Key Data Landscape companies in line with a set of criteria adopted and further specified in the following paragraphs.
Main changes to the EU Data Landscape
The 2020 EU Data Landscape review introduced some changes to the approach:
• This year’s research was widened by the use of new and more efficient sources and focused on the Vertical Applications category.
• The existing EU Data Landscape database was validated and further extended in geographical and coverage scope. In particular, as regards geographical coverage, the database currently includes companies from 42 countries, against the 41 of the January 2019 update.
• The list of Key Data Landscape companies was reviewed according to updated or new criteria identified by the study team, leading to 52 new entries in this category.
15 The data presented in the report cover the period from January 2019 until December 2019. The United Kingdom was a full member of the European Union during this period. It officially left the European Union on 31 January 2020. As a result, this report still includes the UK as a member of the EU and the EU is considered to have 28 Member States throughout.
Figure 9: Database of Data Landscape companies (www.datalandscape.eu)
Source: http://datalandscape.eu/companies
Overview of the EU Data Landscape Database (status in December 2019):
• Overall, the database grew by 9% during 2019 with the addition of 131 new companies. Of these 131 new companies, 52 are Key Data Landscape companies.
• UK companies account for 25.3% of the total database, followed by Spain (12%), France (8.8%), Germany (8.7%), the Netherlands (4.7%) and Italy (3.9%).
• Analytics continues to be the most populated category (accounting for 41% of the database).
• The share of companies categorised as Vertical Applications grew by six percentage points, reaching 23% of the database (from 246 companies in 2018 to 359 in 2019).
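The percentages in this overview can be checked against the company counts. The following is a minimal Python sketch, assuming that no companies were removed during 2019, so that the 2018 total equals the 2019 total minus the additions; this assumption and the variable names are illustrative and not part of the study.

```python
# Counts quoted in the overview (status in December 2019).
total_2019 = 1556      # companies in the database
added_2019 = 131       # new companies added during 2019
vertical_2018 = 246    # Vertical Applications companies in 2018
vertical_2019 = 359    # Vertical Applications companies in 2019

# Assumption (not stated in the report): no removals during 2019.
total_2018 = total_2019 - added_2019

growth_pct = 100 * added_2019 / total_2018
share_2018 = 100 * vertical_2018 / total_2018
share_2019 = 100 * vertical_2019 / total_2019
share_change_pp = share_2019 - share_2018   # change in percentage points

print(f"Database growth in 2019: {growth_pct:.1f}%")
print(f"Vertical Applications share: {share_2018:.1f}% -> {share_2019:.1f}%")
print(f"Change: {share_change_pp:.1f} percentage points")
```

Under this assumption the counts reproduce the rounded figures of 9%, 23% and six percentage points.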
Key Data Landscape companies (status in December 2019):
• Key Data Landscape companies were selected from the main database according to the following pre-set criteria:
o The company is listed in the Global Big Data Landscape map (see footnote 16), or
o The company and its proof of concept are already sufficiently established; as a proxy, the study team considered companies that have received over EUR 1m in funding according to the Crunchbase database, which provides data on the world’s most innovative companies, including the amount of capital obtained, and
o The company has its main headquarters or R&D department in Europe.
• The list of Key Data Landscape companies grew from 259 in 2018 to 311 in 2019, an increase of 52 companies (about 20%).
16 Matt Turck (2017). Big Data Landscape 2017, Firstmark. Available at: http://mattturck.com/wp-content/uploads/2017/04/Big-Data-Landscape-2017-Matt-Turck-FirstMark.png
• Most Key Data Landscape companies have their headquarters in the United Kingdom (114), followed by France (47) and Germany (35).
• The spread across categories has remained stable since 2018, with Analytics continuing to be the most populated category (41%). Most notably, in 2019, the Vertical Applications category grew by six percentage points, reaching 23% of the database (from 246 companies in 2018 to 359 in 2019).
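The change in the Key Data Landscape list can be expressed in two ways that are easy to confuse: as growth relative to the 2018 list, or as the share of new entries in the 2019 list. A minimal Python sketch using the two list sizes quoted above (variable names are illustrative, and the net change is treated as the number of new entries):

```python
key_2018 = 259   # Key Data Landscape companies in 2018
key_2019 = 311   # Key Data Landscape companies in 2019

net_new = key_2019 - key_2018

# Growth measured against the 2018 list.
growth_over_2018_pct = 100 * net_new / key_2018
# New entries as a share of the 2019 list.
share_of_2019_pct = 100 * net_new / key_2019

print(f"Net new entries: {net_new}")
print(f"Growth over 2018: {growth_over_2018_pct:.1f}%")
print(f"Share of 2019 list: {share_of_2019_pct:.1f}%")
```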
Figure 10: Key Data Landscape Companies by Country (2019, 2018, 2017)
Source: European Data Market Study, D4.3 EU Data Landscape, Review at December 2019
5. Acting upon the data market – the role of policy
By investigating the role of policies in shaping the present and future trends of the European Data Market and Data Economy, the Final Report on Policy Conclusions (D2.8) complemented the sizing and forecasting exercise carried out for the Final Report on Facts and Figures (D2.7), as well as the additional analysis obtained through the three quantified stories.
5.1 The Role of Policy and the Future of Europe’s Data Economy: The Three Scenarios
The Final Report on Policy Conclusions (D2.8) presents the alternative evolution paths of the European Data Market and Data Economy at 2025, described as three potential scenarios driven by different macroeconomic and framework conditions, shaped by critical turning points to be faced in the next years by governments, businesses and social actors. The scenarios are an update of those presented in March 2020, building on the updated EDM dataset and forecasts and insights from last year’s events.
The new European Data Strategy outlines the ambition for Europe to become a leading role model for a society empowered by data to make better decisions in business and the public sector, and a global leader in the data-agile economy. Our 2025 scenarios outline different pathways of evolution of the European Data Market (EDM) and Data Economy in the next years, exploring the different mixes of factors and policy choices which may lead to achieving this ambition or to falling short of it. In the past years we have monitored the fast growth of the Data Market and Data Economy and have witnessed the evolution of supply and demand dynamics in Europe. At the same time, the emergence of disruptive technologies such as AI has increased the need for policy intervention to manage emerging social, economic and ethical risks. Therefore, the 2025 scenarios presented in this report are strongly influenced by multiple policy assumptions shaped by the Data Strategy and European Digital Strategy recently published by the new Commission.
Given this context, we have updated the description of the two main focal issues (axes) around which we have developed our 2025 scenarios, as follows:
• the high or low pace of diffusion of data-driven innovation, driven by demand-supply dynamics, and its impact on economic growth. This year we add to this perspective the pace of multiple innovation adoption, where data is at the core of a multiple technology environment powered by AI.
• the social and economic data governance model enabling a fair and competitive economy, as indicated by the new European Data Strategy. Today the term “data governance” has grown from its original narrow definition as an approach to data management, to a much broader concept of a policy and conceptual framework establishing the norms, practices and principles covering all aspects of data dynamics in the society and the economy. Essentially, the data governance framework which is the first pillar of the new European Data Strategy recognizes the need to deal with data as a strategic asset influencing power dynamics in the socio-economic system.
At one extreme, we foresee a society where a few actors, such as leading online platforms, governments and large businesses, dominate the main data assets and therefore capture a disproportionately high share of data innovation benefits, increasing social inequality (highly centralized model). The polar opposite of this scenario would be a society characterised by an open, transparent and participatory approach to data governance, where both citizens and organisations are able to control and extract value from their data. This would result in a wider social distribution of data innovation benefits, decreasing social inequality. Trustworthiness and respect of data ethics principles are other important characteristics of this ideal model.
This analysis highlights the critical turning points to be faced in the next years by governments, businesses and social actors in the development of the European Data Economy. The combination of alternative social and economic trends results in the following scenarios:
• The Baseline scenario is characterised by a healthy growth of data innovation, a moderate concentration of power by dominant data owners with a data governance model protecting personal data rights, and an uneven but rather wide distribution of data innovation benefits in the society. This is considered the most likely scenario.
• The High Growth scenario is characterised by a high level of data innovation, low data power concentration, an open and transparent data governance model with high data sharing, and a wide distribution of the benefits of data innovation in the society.
• The Challenge scenario is characterised by a low level of data innovation, a moderate level of data power concentration due to digital markets fragmentation, and an uneven distribution of data innovation benefits in the society.
The scenarios explore the drivers and framework conditions which may lead to maximise the benefits of a balanced Data Economy and to avoid the risks of an unbalanced one, highlighting the consequences of policy actions.
Policy and the Baseline Scenario
The Baseline scenario predicts a healthy growth of data-driven innovation and increase of investments in the new wave of digital technologies, pioneered by the most advanced, competitive and innovative enterprises, medium and large (both as technology providers and users) with a share of competitive SMEs, savvy in the use of ICTs. By 2025 we expect the take-up of AI, Big Data, IoT, and robotics to reach over 60% of medium-large EU enterprises (IDC survey on Advanced Technologies for Industry, 2019), while other technologies such as 5G, AR/VR, blockchain, new materials and industrial biotechnologies will also make strong progress. This will force enterprises to engage in multiple innovation, adopting and combining multiple technologies: this convergence is enabled and powered by data and intelligence.
In this scenario, the Data Market is forecast to reach 82.5 billion Euro in the EU27, with a compound annual growth rate of 5.8%. The Data Economy will grow faster than the Data Market, because investments in data technologies have direct and indirect impacts on the economy with a multiplier effect; it will reach a value of 550 billion Euro in the EU27, with a steep increase of its incidence on EU GDP from 2.6% in 2019 to 4% in 2025. Enterprises will add 3.2 million data professional positions between 2019 and 2025, bringing the total to 9.3 million jobs. However, this will widen the potential data professionals’ skills gap, which may become a bottleneck for some enterprises or regions, creating competition between enterprises for the most skilled professionals.
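The compound annual growth rate (CAGR) behind such forecasts can be illustrated with a minimal Python sketch. It assumes 2019 base values of 58 and 325 billion Euro for the EU27 Data Market and Data Economy, as quoted in the post-Covid section of this report. The function names are illustrative. Because the inputs are rounded, and the study may use a different base year or unrounded data, the implied rate need not match the quoted 5.8% exactly.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value: float, rate: float, years: int) -> float:
    """Value reached after compounding `rate` for `years` periods."""
    return start_value * (1 + rate) ** years


YEARS = 2025 - 2019

# EU27 figures in EUR billion, rounded as quoted in the report.
data_market_2019, data_market_2025 = 58.0, 82.5
data_economy_2019, data_economy_2025 = 325.0, 550.0

market_cagr = cagr(data_market_2019, data_market_2025, YEARS)
economy_cagr = cagr(data_economy_2019, data_economy_2025, YEARS)

print(f"Implied Data Market CAGR 2019-2025:  {market_cagr:.1%}")
print(f"Implied Data Economy CAGR 2019-2025: {economy_cagr:.1%}")
```

The higher implied rate for the Data Economy is consistent with the statement that it grows faster than the Data Market.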
In this scenario, Europe makes progress in the investment and deployment of independent data infrastructures and digital resources, also leveraging the new Horizon Europe and Digital Europe Programs. This means Europe reaches a better, but not quite complete, technological sovereignty.
The new digital policy strategies will empower Europe to play a stronger role on the global scene, leveraging the success of the GDPR as a global standard. Not only Europe but also many other governments are now conscious of the downsides and risks of global platforms’ dominance and control of global data flows, and will work together towards achieving a more balanced playing field. This scenario is therefore positioned between the two extremes of high and low concentration of power and data control. The development of an effective regulatory framework of data governance, as foreseen by the Data Strategy, will enhance stakeholders’ willingness and capability to manage data sharing and improve data access and re-use. The single market for data gradually emerges as fragmentation is overcome, and this enables Europe to attract a growing share of the global Data Economy, in terms of capability of data processing and management. The development of common EU data spaces is faster and more successful in some sectors with strong innovation demand (manufacturing, agriculture) but meets with barriers and low demand in others, failing to achieve economies of scale.
Europe will make progress towards a sustainable Data Economy, with policy initiatives promoting the full transformation of the ICT and electronics industry towards climate-neutral and energy-efficient practices. In this scenario we foresee progress towards a Global Digital Cooperation strategy led by the EU and a more proactive European role in the development and adoption of standards and interoperable technologies on the global scene, leveraging the power of the single internal market.
Policy and the High Growth Scenario
This scenario foresees a faster growth trajectory of the Data Market and economy, boosted by favourable economic conditions, by strong investment, proactive policies, and effective collaboration between the MS at the EU level. By 2025 we expect a higher take-up of multiple technologies than in the baseline scenario (AI, Big Data, IoT, robotics, 5G, new materials, blockchain…) with European enterprises fully embracing multiple innovation and the power of data. European industries will exploit platforms to combine data with AI and machine learning, spreading intelligence from the core to the edge of their networks, turning data into action and action into value.
In this scenario, the EU27 GDP compound annual growth rate in the period 2019-2025 (+2.0%) will be 1.5 times higher than in the Challenge scenario and 40% higher than in the Baseline scenario. This will accelerate the investments in the digital economy and consumer willingness to spend. In the European Union public and private investments will accelerate in Artificial Intelligence, advanced robotics, automation as well as new skills. As a consequence, this scenario foresees a deep transformation of business processes and the work culture, where change management and HR management become critical success factors. As in the Baseline scenario, in the High Growth scenario European enterprises multiply the use of "digital co-workers" (using intelligent process automation and AR/VR to support/complement human workers) reducing repetitive tasks, improving productivity and security. Besides automation, enterprises engage in "augmentation" of human resources providing technologies enhancing their physical and intelligence capabilities.
Policy measures at European level have a relevant role to play in this scenario. On the one hand, through a combination of incentives and intensive training investments, Europe will support organizations in managing the transformation of the work culture for digital transformation, identifying appropriate job roles and tasks in open ecosystems. On the other hand, initiatives to develop digital skills are successful: the Digital Europe Programme delivers a boost to the supply of advanced digital and data skills, the revised Digital Education Action Plan helps to improve digital learning, and the networks of Digital Innovation Hubs play their role in providing internships, training and experimental spaces for companies to learn about new technologies. In this scenario, strong investments and MS cooperation help Europe to develop fully independent data infrastructures and digital resources and to shape global digital governance rules with EU values, becoming a leader in the Big Data-AI space. As foreseen by the digital and data strategies, Europe succeeds in achieving technological sovereignty. Europe’s share of the global Data Economy is well on the way to becoming equal to its economic weight by 2030 thanks to a successful single market for data. The successful deployment of an EU cloud infrastructure and cloud services marketplace satisfies the needs of industry and SMEs. The successful development of common EU data spaces in most sectors achieves economies of scale and supports the rise of the platform economy, enabling companies to deal with multiple innovation. Fast progress with a new Data Act, Digital Services and Competition framework enhances data access, sharing and re-use, achieves a fair playing field and builds the basis for an effective single market for data.
Policy and the Challenge Scenario
In the Challenge scenario, a combination of economic, social and technology threats overcomes European innovation forces, which become lost in a maze of barriers, resulting in much slower Data Market and Data Economy growth. In this scenario, EU GDP growth, estimated at a compound annual growth rate of 1% in the period 2019-2025, will be substantially lower than in the other scenarios. Trade wars, political conflicts, and unexpected events such as the Coronavirus pandemic are the main drivers of this growth slow-down. Lower GDP growth means lower overall investments and lower consumer willingness to spend.
In this context, the digital Europe and data strategies are not implemented successfully and fail to achieve many of their objectives. This may happen if a combination of insufficient investments and lack of collaboration at EU level leads to an uneven development of data infrastructures and digital resources. Without an effective data governance framework and incentives for stakeholders to increase data sharing, there is a risk that the Data Market will remain fragmented.
In this scenario it is possible that, notwithstanding the GDPR, many Europeans have no visibility of and very little control over the use of their personal data: this hinders the willingness to share and reuse personal and non-personal data for social good. If the development of common data spaces is slow and does not deliver the expected boost to industries across Europe, this may hinder the rise of the platform economy, prevent the development of economies of scale and reduce the incentives to fight market fragmentation for innovative services.
This scenario foresees a negative self-reinforcing circle, where less positive global economic conditions discourage investments and weaken global demand with a negative impact on European growth. A slower pace of digital innovation deprives the economy of the boost to growth potentially given by data-driven services and products, while enterprises find competing in international markets more difficult.
The EU27 Data Economy post-Covid
As the Final report on Policy Conclusions (D2.8) was being finalised in February 2020, the COVID-19 pandemic started its rampage across the globe, endangering people and livelihoods, forcing governments to implement measures to contain the virus, with unprecedented impacts on the European economy as well as the technology market. While the EDM Monitoring tool data and analysis until 2019 remain valid, clearly our estimates for 2020 are now off the mark and our 2025 scenarios outlining different pathways of evolution of the European Data Market (EDM) and Data Economy in the next years would need to be revised.
This section presents our post-Covid scenario forecast for the Data Economy, building on the revised estimates of the EU27 Data Market and on macroeconomic and industrial trends. To understand this forecast it is important to remember the three main components of the Data Economy:
• The direct impacts, which are the value of the goods and services sold in the Data Market;
• The indirect impacts, both backward (on the supply chain: gains made by industries providing goods and services to data users) and forward (revenues gained by user industries thanks to data innovation);
• The induced impacts on the general economy, thanks to additional spending and consumption driven by the value of the direct and indirect impacts.
Therefore, the post-COVID Data Economy reflects the decline of the Data Market, the fall in revenues of the industries affected by the lock-down, and the steep decrease of consumer demand and overall consumption.
According to our post-COVID scenario estimates, the European Data Market should decrease by 7.1% to 54 €B in 2020 (compared to 58 €B in 2019) and the Data Economy by 5.5% to 307 €B (compared to 325 €B in 2019). In our view, the powerful negative impact of the slow-down in 2020 will be followed by a rebound and a likely return to the growth path in the following years. Many of the powerful drivers of data-driven innovation are likely to prove resilient, particularly the willingness to invest in digital technologies in order to re-launch services and create new products to stimulate demand.
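The relationship between these figures can be sketched in a few lines of Python. The sketch uses only the rounded values quoted above and treats the Data Market value as the direct impacts, as defined in the list of Data Economy components. The residual therefore covers indirect and induced impacts together, since the text does not split them. Variable and function names are illustrative.

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from `old` to `new`."""
    return 100 * (new - old) / old


# EU27 values in EUR billion, rounded as quoted in the text.
data_market = {2019: 58.0, 2020: 54.0}
data_economy = {2019: 325.0, 2020: 307.0}

market_change = pct_change(data_market[2019], data_market[2020])
economy_change = pct_change(data_economy[2019], data_economy[2020])

# Direct impacts equal the Data Market value; the rest of the Data Economy
# is the sum of indirect and induced impacts (not split in the text).
indirect_plus_induced_2019 = data_economy[2019] - data_market[2019]
economy_to_market_ratio_2019 = data_economy[2019] / data_market[2019]

print(f"Data Market change 2019-2020:  {market_change:.1f}%")
print(f"Data Economy change 2019-2020: {economy_change:.1f}%")
print(f"Indirect + induced impacts 2019: {indirect_plus_induced_2019:.0f} EUR billion")
print(f"Data Economy / Data Market 2019: {economy_to_market_ratio_2019:.1f}")
```

With rounded inputs the Data Market decline comes out at about 6.9% instead of the quoted 7.1%, presumably because the published values are rounded.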
By 2025, the post-Covid Baseline scenario foresees strong growth rates resulting in a value of 80 €B for the European Data Market (compared to 82.5 €B in the pre-Covid scenario) and 516 €B for the Data Economy (compared to 550 €B in the pre-Covid scenario). However, the incidence of the European Data Economy on the EU27 GDP will slightly increase from 4% (Pre-Covid scenario) to 4.04% (post-Covid scenario) because GDP is also affected by the recession.
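A short Python sketch shows why the incidence on GDP can rise while the Data Economy value falls. It back-solves the GDP implied by each scenario from the rounded values and percentages in the paragraph above. These implied GDP figures are an illustration only, not estimates published by the study, and the function name is hypothetical.

```python
def implied_gdp(data_economy_value: float, incidence_pct: float) -> float:
    """GDP implied by a Data Economy value and its share of GDP (in %)."""
    return data_economy_value / (incidence_pct / 100)


# 2025 Baseline values for the EU27 (EUR billion) and incidence on GDP (%).
pre_covid = {"data_economy": 550.0, "incidence_pct": 4.0}
post_covid = {"data_economy": 516.0, "incidence_pct": 4.04}

gdp_pre = implied_gdp(**{"data_economy_value": pre_covid["data_economy"],
                         "incidence_pct": pre_covid["incidence_pct"]})
gdp_post = implied_gdp(**{"data_economy_value": post_covid["data_economy"],
                          "incidence_pct": post_covid["incidence_pct"]})

economy_drop_pct = 100 * (post_covid["data_economy"] / pre_covid["data_economy"] - 1)
gdp_drop_pct = 100 * (gdp_post / gdp_pre - 1)

print(f"Implied 2025 GDP, pre-Covid:  {gdp_pre:,.0f} EUR billion")
print(f"Implied 2025 GDP, post-Covid: {gdp_post:,.0f} EUR billion")
print(f"Data Economy revision: {economy_drop_pct:.1f}%")
print(f"Implied GDP revision:  {gdp_drop_pct:.1f}%")
```

The implied GDP is revised downward by more than the Data Economy, which is why the ratio between the two rises slightly.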
Unfortunately, we are unable to present post-COVID revised estimates for the Data Economy Challenge and High Growth scenarios. These scenarios rely on alternative assumptions on industry revenues, consumption and GDP dynamics, and in the present uncertainty we do not feel able to elaborate such assumptions beyond the Baseline scenario. However, we believe that the Challenge and High Growth scenarios will remain broadly valid, even though their likelihood could change. The Challenge scenario is marginally more likely (if the recovery does not take off as hoped), while the High Growth scenario assumptions, based on hyper-growth thanks to technology investments, now seem quite remote.
5.2 A change of pace in data policies
The digital policy package presented in February 2020 (see footnote 17) by the new Commission led by Ursula von der Leyen considerably widens the scope and breadth of data policies, reflecting the new policy awareness about the critical role of data for the competitiveness of the European economy. Only a few years ago Big Data was a topic of interest mainly to the ICT industry, while today there is a widespread awareness of the social and economic impacts of data-driven innovation.
17 https://www.orgalim.eu/news/commissions-digital-package-roadmap-towards-europes-digital-future
The Communication “Shaping Europe’s digital future” is articulated in four main action areas, covering all the framework and enabling conditions to develop “a digital society based on European values and rules” (Technology for people, A fair and competitive economy, An open, democratic and sustainable society and Europe as a global leader). The Data Strategy is a key component of the economy action area together with actions to update competition rules and develop an industrial strategy, among others. This matches very well the rationale of the EDM scenarios for the development of a balanced European Data Economy, which have long posed as a key enabling condition the adaptation of the overall economic and regulatory framework (not limited to ICT or R&D). The European Strategy for Data’s four main pillars also mirror the priorities highlighted by the EDM Monitoring Tool for the development of the Data Economy. The Pillar A on the development of “A cross-sectoral governance framework for data access and re-use” brings together several initiatives to enable and stimulate data sharing, but at the same time ensure a fair playing field for all organizations. This recognizes the need to deal with data as a strategic asset influencing power dynamics in the socio-economic system. Europe’s ability to achieve this goal is a major differentiating factor between the three alternative scenarios of the Data Economy presented in this report. The Data Strategy’s Pillar B on Enablers (Investments in data and strengthening Europe’s capabilities and infrastructures for hosting, processing and using data, interoperability) and Pillar C on Competences (Empowering individuals, investing in skills and in SMEs) cover the needs for investments, interoperability, standardization and infrastructures as well as skills. 
The EDM’s indicators on the numbers and penetration of Data Users and Data Companies, as well as Data Professionals and the Data Skills Gap, can help monitor the development and achievement of these policies.
Finally, the Data Strategy Pillar D for “Common European data spaces in strategic sectors and domains of public interest” is driven by the need to accelerate data sharing and make B2B data sets actionable for data-driven innovation, a priority often underlined by this study’s analyses. The EDM Monitoring Tool indicators on data innovation diffusion by industry are also useful to provide a baseline for this policy area.
5.3 The EU Data Policy and the International Dimension
Our latest measurement of the European Data Market Monitoring Tool reveals a substantially unchanged picture when comparing the EU indicators to those developed for some of the key international partners of the EU. While confirming its vitality, the EU continues to lag significantly behind the U.S. in terms of both size and growth of the Data Market. Even when considering the EU27 plus the U.K., Europe generated a Data Market value in 2019 that is still approximately 2.5 times smaller than that of the U.S. in the same year (72.3 billion Euro in the EU vs. almost 185 billion Euro in the U.S.). The pace of this development is even less flattering, as year-on-year growth of the Data Market in 2019 was almost three times as fast in the U.S. as in the EU (12.7% vs. 4.9%). The same pattern applies when comparing EU and U.S. performance in terms of the Data Economy (see footnote 18): even if confined to direct and backward indirect impacts, the U.S. Data Economy represents a share of GDP that is more than double that of the EU (1.2% vs. 0.5%) in 2019, with year-on-year growth that is, again, twice that of the EU (6.8% in 2019 over 2018 in the U.S. vs. 3.4% in the EU).
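The size of these gaps can be made explicit with a minimal Python sketch that divides each U.S. indicator by its EU counterpart. It uses only the rounded 2019 figures quoted in the paragraph above; the U.S. Data Market value is given as "almost 185 billion Euro", so 185 is an approximation. Dictionary keys are illustrative.

```python
# 2019 indicators as quoted in the text (EU = EU27 plus the U.K.).
eu = {
    "data_market_eur_bn": 72.3,
    "data_market_growth_pct": 4.9,
    "data_economy_gdp_share_pct": 0.5,
    "data_economy_growth_pct": 3.4,
}
us = {
    "data_market_eur_bn": 185.0,   # "almost 185" in the text
    "data_market_growth_pct": 12.7,
    "data_economy_gdp_share_pct": 1.2,
    "data_economy_growth_pct": 6.8,
}

# Ratio of the U.S. value to the EU value for every indicator.
us_to_eu = {name: us[name] / eu[name] for name in eu}

for name, ratio in us_to_eu.items():
    print(f"{name}: U.S. is {ratio:.1f}x the EU value")
```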
18 The Data Economy for Brazil, Japan and the U.S. has been measured in terms of direct and indirect impacts only, due to the lack of comparable and consistent statistical sources for these three countries. This is consistent with what has been presented throughout the duration of this update study (SMART 2016/0063) as well as the original European Data Market Study (SMART 2013/0063).
Filling this gap would be essential to increase Europe’s competitiveness and for the future of work in the EU. The new European Digital Strategy recently unveiled by the European Commission (see footnote 19) appears to design a new, confident role for Europe as a global player. Recognising that the European model has proved to be an inspiration for many other partners worldwide, the strategy calls for the EU to strengthen its commitment towards setting global standards for emerging technologies and to remain the most open region for trade and investment in the world. In terms of standards, in particular, the EU has paved the way for the setting of global standards for 5G and the IoT and is now committed to leading the standardisation process of a number of additional advanced and new generation technologies such as blockchain, quantum computing and supercomputing: all technologies that underpin and enable data sharing and data usage and that, as a direct consequence, are directly linked to the further development of a well-functioning Data Economy.
This proactive international role in the standardisation process is accompanied by a robust commitment on trade and investments on the international scene to ensure a collaborative approach on several technology-related topics, including data flows and the possibility to pool available and relevant high-quality data together. This move is indeed in line with the “data-as-infrastructure” approach to the Data Economy, which aims at a negotiated, common data governance setting between the EU and like-minded partners. This approach, however, will have to be put in place while safeguarding Europe’s “technology sovereignty”, that is, by making sure that Europe reduces its dependency on other parts of the globe for most of the crucial technologies and effectively protects the integrity and resilience of its data, networks and communication infrastructures. This marks a considerable distance from previous policy stances on the international scene. Admittedly, “sovereignty” is defined positively and is not directed against anyone. Nevertheless, this renewed confidence on the international scene may indicate that Europe will gradually abandon a merely reactive role to embrace a more dynamic and enterprising stance vis-à-vis a number of trading partners worldwide.
19 https://ec.europa.eu/digital-single-market/en/content/european-digital-strategy
6. Conclusions
6.1 Quantifying the European Data Market – Key Facts & Figures
The year 2019 sees all the indicators measured by the EDM Monitoring Tool on a positive trajectory compared with 2018, as the European economy continues its development cycle. The value of the Data Economy, which measures the overall impacts of the Data Market on the economy as a whole, is estimated to reach 406 Billion Euro in 2019 for the EU27 plus the U.K.
The positive trend in the growth of the Data Economy is confirmed by the Data Market value, which grew by 4.9% year-on-year in 2019, reaching 75.2 Billion Euro in the EU27 plus the U.K.
The EDM Monitoring Tool has been analysed along four main dimensions:
• The Workforce and Skills dimension - including the measurement of data professionals and their potential skill gap.
• The Supply and Demand dimension - incorporating the measurement of data supplier and data user companies and the revenues generated by these companies.
• The Business and Economy dimension - comprising the size of the Data Market and the value of the Data Economy.
• The International context dimension - including a select number of indicators for Brazil, Japan and the US.
Figure 11: The four Dimensions of the Data Market’s Key Facts & Figures
Source: The European Data Market Monitoring Tool, IDC, 2019
The Workforce Dimension: Data Professionals and Data Professionals Skills Gap
Data professionals are estimated at a total of 6.0 million in the EU27 and at 7.6 million in the EU27 plus the U.K. in 2019, marking a continued increase over the previous year (6.1% and 5.5% year-on-year respectively). Compared to 2019, 2020 would register a growth rate of 9.2% and 8.6% at the level of the EU27 and the EU27 plus the U.K. respectively. More interestingly, the employment share and the intensity share components of the data professionals’ indicator are also expected to improve in 2019 and 2020 compared to our estimates in 2016 (now estimated at 3.3% and 3.5% in 2019 and 2020 in the EU27 and 3.6% and 3.8% for the same years in the EU27 plus the U.K.).
As for the skills gap in data professionals, our latest estimate continues to highlight the imbalance between demand and supply of data skills in Europe that has been observed since the first measurement for the year 2014. In 2019 demand for data professionals continued to increase (+4.5%) and the estimated gap grew by 13%, reaching approximately 459,000 unfilled positions in the EU27 plus the U.K. (399,000 without the U.K.), corresponding to 6.2% of total demand (5.7% without the U.K., see table below). By 2020 we expect the gap to expand to 496,000 unfilled positions in the EU27 plus the U.K., corresponding to 6% of total demand (5.2% without the U.K., where slower growth is expected due to the impacts of Brexit). At any given moment in the labour market there is a physiological number of vacancies, as well as a number of people looking for work: a vacancies ratio around 5% of demand or less is considered manageable. From this point of view the data skills gap estimated for 2019 shows a lower level of stress in the market compared with our previous estimates for 2017 and 2018. As in the first and second round of measurements of this indicator, the gap is expected to persist in 2020 and beyond under the three scenarios, but at a lower level than previously estimated.
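The skills gap indicator is a simple ratio of unfilled positions to total demand. The following minimal Python sketch uses the 2020 figures for the EU27 plus the U.K. quoted above and the 5% threshold that the text describes as manageable. The total demand is back-solved from the rounded percentage and is therefore only indicative; function and constant names are illustrative.

```python
MANAGEABLE_SHARE_PCT = 5.0   # vacancies ratio considered manageable in the text


def gap_share_pct(unfilled_positions: float, total_demand: float) -> float:
    """Unfilled positions as a percentage of total demand."""
    return 100 * unfilled_positions / total_demand


def implied_demand(unfilled_positions: float, share_pct: float) -> float:
    """Total demand implied by a gap and its (rounded) share of demand."""
    return unfilled_positions / (share_pct / 100)


# 2020 estimate for the EU27 plus the U.K., as quoted in the text.
gap_2020 = 496_000
share_2020_pct = 6.0

demand_2020 = implied_demand(gap_2020, share_2020_pct)
is_manageable = share_2020_pct <= MANAGEABLE_SHARE_PCT

print(f"Implied total demand in 2020: about {demand_2020 / 1e6:.1f} million positions")
print(f"Gap within the manageable range: {is_manageable}")
```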
The Supply - Demand Dimension: The Data Companies
The number of data suppliers continues to grow at a faster pace than the number of data users in the longer term (out to 2025). Data suppliers are estimated at almost 149,000 in the EU27 and 290,000 units in the EU28 for 2019, thus exhibiting a year-on-year growth of 2.4% and 2.3% respectively. Data users, instead, are projected to grow at 0.6% in 2019, amounting to nearly 535,000 in the EU27 and to nearly 716,000 units in the EU28. Compared to the measurements carried out by the European Data Market Monitoring Tool over the period 2013-2015, these latest estimates show a picture of some consolidation of data companies in the EU, following increasing growth rates over the prior four years.
Revenues generated by data suppliers have increased steadily over recent years, reaching nearly 64 Billion Euro in the EU27 and 83 Billion Euro in the EU27 plus the U.K. in 2019. Data companies’ revenues account for 3.7% of total company revenues in 2019. Data companies’ revenues are expected to follow the Data Market, as imports and exports of data tools and services tend to follow each other. The forecast for data companies’ revenues shows an expected annual growth rate out to 2025 of 7.0% - easily outpacing the growth of the total ICT market over the same period (expected to be 1.6% from 2020 to 2025 in the Baseline scenario). The smaller Member States show the highest long-term growth as they have a smaller base from which to grow, but the larger Member States will make the biggest overall contribution to the Data Economy out to 2025.
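To see what the difference between a 7.0% and a 1.6% annual growth rate means cumulatively, both series can be compounded from a common index of 100. This is a minimal sketch: the five-year horizon (2020 to 2025) and the use of an index rather than Euro values are assumptions made for illustration, not part of the study's forecast model.

```python
def compound_index(annual_rate: float, years: int, base: float = 100.0) -> float:
    """Index value after compounding `annual_rate` for `years` periods."""
    return base * (1.0 + annual_rate) ** years


# Growth rates quoted in the text; the horizon is an illustrative assumption.
YEARS = 5                     # 2020 to 2025
DATA_REVENUE_GROWTH = 0.070   # data companies' revenues
ICT_GROWTH = 0.016            # total ICT market, Baseline scenario

revenues_index = compound_index(DATA_REVENUE_GROWTH, YEARS)
ict_index = compound_index(ICT_GROWTH, YEARS)

print(f"Data companies' revenues index after {YEARS} years: {revenues_index:.1f}")
print(f"Total ICT market index after {YEARS} years: {ict_index:.1f}")
```

Compounded over five years, the revenue index grows by roughly 40% while the ICT index grows by roughly 8%, which is what "easily outpacing" amounts to in cumulative terms.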
The Business and Economic Dimension: The Data Market and the Data Economy
The value of the European Data Market is expected to reach 75.2 Billion Euro for the EU28 in 2019, with a growth rate of 4.9%, rising to 6.6% in 2020. Most of the Member States show strong growth, slightly ahead of the expected growth of the total ICT market, which is expected to grow by 3.9% in 2019 and by only 2.0% in 2020. The Data Market share of total ICT is 11.4% for 2019 and is forecast to reach 14.0% by 2025 (baseline forecast).
The larger industries, accounting for the greatest number of companies, account for the largest share of the Data Market. In terms of adoption by industry, the highest rates of Data Technology adoption tend to be in Manufacturing, Finance, Professional services, and Retail. Thanks to their size, these industries are the biggest consumers of data technologies. Manufacturing’s sheer size in the EU economy makes it the largest industry in the Data Market. Moreover, there is still significant scope for increased adoption of data technology in the manufacturing industry, so its leading position is unlikely to change.
The Data Market in the EU27 plus the U.K. will continue to out-grow the total ICT market, with its share of this market rising from close to 10% in 2016, to nearly 15% by 2025, and possibly close to 19% in the High growth scenario.
The value of the Data Economy for the EU27 is estimated at 324 Billion Euro in 2019, with high growth of around 8% in 2019 and higher still in 2020 (more than 9%).
The overall impacts of the data economy for the EU27 will reach 4.0% of GDP in the Baseline scenario by 2025. Indirect and induced impacts will be evenly distributed in 2019. Data user companies will continue to consolidate the quantitative benefits stemming from the use of data, thus contributing to the importance of indirect impacts. Not surprisingly, these benefits will go beyond the users and will translate into higher induced effects, generating jobs and revenues beyond the data companies themselves. The positive conditions under the High Growth scenario will lead the overall impacts to reach 827 Billion Euro in the EU27 in 2025. The High Growth scenario will be characterized by a higher level of induced effects than the other scenarios as the benefits for the overall economy are maximized.
A snapshot of the Data Economy by industry shows that the Financial sector, the Manufacturing industry and the realm of Professional services continue to represent the vertical markets in which the impacts of the Data Economy are most strongly felt. Thanks to the significant diffusion of data-related technologies, these industries exhibit high levels of forward and backward impacts and can convey effects at an induced level more quickly and more effectively than other industries. Their IoT diffusion and usage of Cloud Computing, as well as of mobile and social technologies, coupled with the ongoing process of digital transformation, make these industries particularly reactive to induced effects. Emerging technologies such as Artificial Intelligence and blockchain applications are also gaining momentum in these industries, thus reinforcing the indirect and induced impacts in these sectors.
The International Dimension - The Data Economy Beyond the EU – US, Brazil and Japan
The most recent data (i.e., 2019 and the forecast for 2020) show that the European Data Market and Data Economy continue to hold second place after the U.S. in value in 2019 but slip to last place in growth.
The positive development of the U.S.’ Data Economy is confirmed by a solid year-on-year growth of the main indicators monitored, including the number of data professionals, companies, and the overall Data Market.
Brazil also slowed as its economic recovery was at best weak in 2019. However, the country shows some positive trends in the third quarter of 2019, so the outlook is slightly more positive. For all the indicators Brazil showed the weakest results, while Japan improved its growth in the number of data professionals, rising to 4.2 million in 2019.
Japan’s Data Market is the closest match to the European one in terms of growth and investment, but still only half the size of the EU27 plus the U.K. Japan competes with the EU across data professionals and data market and in 2019 its growth in the data market was significantly higher than for Europe. The economy continues to suffer from weakening internal demand and lack of consumption, even though it showed an unexpected improvement in Q3 2019.
Looking at the estimates of the data suppliers, the EU exhibits higher growth than the U.S. in 2019: a year-on-year growth of 2.4% - notably higher than the U.S., which showed less than 1% growth over the same period. Europe still presents a growing and dynamic data ecosystem on both fronts – the Data Market and the Data Economy. However, it lags both the U.S. and Japan in terms of the incidence of the Data Economy on GDP and has some catching up to do. The region is ahead of Brazil, but it is unclear if this is much of an achievement.
6.2 Describing the Data Market – The Quantified Stories
Three quali-quantitative stories accompanied this third and final round of measurement of the European Data Market Monitoring Tool. They concentrated on the operational, organisational and/or economic benefits generated by the use of data-driven technologies, with a special focus on Health Data-driven Innovation and Data Commons. While the first story investigated the unexplored potential stemming from the use of Big Data and Analytics (BDA) in healthcare, the second and third stories focused on the role of Data Commons and analysed in depth the benefits of data-driven innovation and the potential role and impacts of common data spaces in several leading sectors targeted by the recently unveiled European Strategy for Data (COM (2020) 66 final, 02/19/20).
The research on health data and data-driven innovation revealed that a majority of healthcare providers in Europe (59%) have not yet adopted a Digital Transformation roadmap and that only 6% have established a unique roadmap for Digital Transformation and general business strategy. Still, this unexplored potential of Big Data and Analytics (BDA) in healthcare is eliciting a new wave of interest in data-driven value creation, which, in the medium to long run, will make it possible to reward performance rather than just volume. In particular, our research has highlighted how AI, among other BDA solutions, is gaining momentum across European healthcare providers, with Clinical decision support, Illness progression and Patient engagement among the most relevant use cases being adopted at the time of writing. Our analysis has also presented a number of benefits that healthcare organizations are obtaining by adopting BDA technologies, coupled with Artificial Intelligence and Machine Learning technologies (AI/ML), in particular:
• Easy and convenient access to intelligent solutions for clinicians and patients offers more opportunities to advance decision making and enhance clinical process efficiency at the point of care. Portugal's skin cancer screening solution is an example of how technology supports a clinical collaborative framework and enables the integration of information to serve population health management.
• More advanced predictive capabilities allow greater control over disease-specific variables impacting health outcomes, as well as over the costs and resource utilization associated with care. This approach makes it possible to target more efficiently the population segments at risk of developing chronic and long-term conditions, by putting in place initiatives aimed at promoting health and preventing or delaying the development of risk factors. Predictive BDA by ARIA is an example of a predictive model able to effectively target cardiovascular conditions and offer an accurate estimate of the number of future cases in a specific geographical area.
The story on the emerging concept of Data Commons has focused on the current need for support and for some technical and organizational creativity to make this concept viable and sustainable over time. In this respect, organizational cooperation and mutually advantageous data sharing solutions are necessary to generate positive externalities while preserving the competitive interests of their contributors. To achieve this, a suitable technical infrastructure needs to be in place that allows companies to move dynamically between sharing and restricting access to their data and knowledge, together with a governance model that preserves trust amongst partners and manages to last and scale successfully. Some mechanisms appear to be particularly relevant for this, such as: ex-ante safeguards; arbitration; a stratified and layered infrastructure; a user-centric approach; interaction with data generators or data holders to accelerate re-use; and a federated approach towards data-sharing. The combination of all these mechanisms can put in place a favourable environment for organizations to share costs and realize scale. This, in turn, will increase the organisations’ willingness to collaborate while accelerating their innovation processes.
The story on European industry requirements and the role of European data spaces shows that there has been considerable progress in the use of data-driven innovation by European industries in recent years, including an increasing use of AI techniques such as machine learning underpinning the power of data. The story leverages the set of 18 case studies developed by Politecnico di Milano (POLIMI) and IDC across seven industries within the context of the H2020 DataBench project (see footnote 20), collecting data about the business impacts of the adoption of Big Data and Analytics. These case studies show a good level of business benefits achieved from data-driven innovation, with a high level of cost reduction (such as an 80% reduction of operational expenditures for fraud detection in the financial services industry and a 30% reduction of maintenance costs in manufacturing thanks to predictive maintenance) and customer benefits (for example, a 110% improvement of customer retention in manufacturing and an 85% improvement of conversion rates from potential to actual customers thanks to data-driven targeting in retail). Nevertheless, there is still a high level of immaturity in the capability to merge datasets within a company, and there are relevant barriers against data sharing even in advanced sectors such as manufacturing. A relevant issue which emerged from most case studies is the availability of affordable and efficient cloud computing infrastructures, allowing the scaling up of successful pilots and individual company-site experiences to the whole company domain. Common data spaces could potentially provide a valuable answer to the need for greater access to high quality datasets and computing infrastructures, but this will require solving practical and technology challenges, not simply providing a favourable environment for encouraging stakeholder collaboration.
6.3 Mapping the Data Market – Data Landscape and Data Market Monitoring Tool
The Third EU Data Landscape Report (D4.3) provides an overview of the EU Data Landscape database revision as of January 2020. With a total of 1,556 companies and coverage of 42 countries, the database has grown by 9% with the addition of 131 new companies during 2019. Out of the new companies, 52 were identified as Key Data Landscape companies, offering a comprehensive overview of the most important data companies in Europe.
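The growth figure quoted for the database can be cross-checked from the two counts given above. The sketch below assumes that the 9% growth is measured against the database size before the 2019 additions and that no companies were removed during the year; neither point is stated explicitly in the report.

```python
# Counts quoted in the text (database revision as of January 2020).
TOTAL_COMPANIES = 1_556
NEW_COMPANIES_2019 = 131
NEW_KEY_COMPANIES = 52

# Assumption: no removals, so the earlier size is the total minus the additions.
previous_total = TOTAL_COMPANIES - NEW_COMPANIES_2019
growth_rate = NEW_COMPANIES_2019 / previous_total
key_share_of_new = NEW_KEY_COMPANIES / NEW_COMPANIES_2019

print(f"Database size before the 2019 additions: {previous_total:,}")
print(f"Growth during 2019: {growth_rate:.1%}")
print(f"Share of new companies flagged as Key Data Landscape companies: {key_share_of_new:.1%}")
```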
6.4 Acting Upon the Data Market – The Role of Policy
The European Data Market (EDM) Monitoring tool has monitored since 2013 the evolution of the Data Economy, providing insights and quantitative evidence about its diffusion by industry and by region, contributing substantially to the evolution of European policy strategies in this field. Since the first intuition of the potential disruptive impacts of Big Data, this monitoring effort and analysis has documented the social and economic relevance of the deep transformation process enabled by data-driven innovation. Today this process is accelerated by the emergence of Artificial Intelligence tools and services powered by data.
The new European Data strategy outlines the ambition for Europe to become a leading role model for a society empowered by data to make better decisions in business and the public sector and a global leader in the data-agile economy. Europe is moving towards a “data as infrastructure” model: a governance model where data is considered as a public asset, and data infrastructures work as a kind of “digital twins” to physical roads, requiring public investments and new institutions to manage them. This model allows for many different typologies of “data roads”, local or global, leaves freedom for private initiative but tries to maintain a balance between private and public interests, to be managed through new kinds of organizations, private or public or a mix of the two, like data trusts, data cooperatives, personal data stores.
20 Evidence-Based Big Data Benchmarking to Improve Business Performance, www.databench.eu
Today, as we look at the main driving trends for the next years, we notice that the role of policies has increased in relevance: as data-driven innovation has become widespread across all industry sectors and user constituencies, the scope of the regulations and framework conditions to be adapted has considerably grown. At the same time, the emergence of disruptive technologies such as AI has increased the need for policy intervention to manage emerging social, economic and ethical risks. The EDM monitoring tool provides a consistent and solid framework to assess and estimate the potential consequences of policy choices to be made in the next years.
Even the disruption caused by the Covid-19 pandemic has not undermined the value of the EDM monitoring tool indicators and analysis. As we argue in our post-Covid scenario estimates, the main drivers of data-driven innovation are still powerful: the Data Market and the Data Economy are likely to start growing again as early as 2021 and by 2025 may have recovered much of the ground lost in 2020. The pandemic has shown new ways in which digital technologies can help to adapt to a new post-Covid environment, manage health risks and accelerate the economic recovery. Now more than ever, proactive innovation policies and technology investments are needed to support the European social and economic recovery.
7. Methodological annex
Overview
In line with the methodology adopted in the previous European Data Market Study (SMART 2013/0063), the measurement methodology for this final report was based on the steps outlined in Figure 12 below. Compared to the previous study, it does not include the ad-hoc surveys which were used to establish the baseline. However, thanks to the use of IDC primary research data tracking the market, we have already proven the feasibility of updating the indicators without repeating the initial surveys.
The main steps of the methodology included:
• Desk research on the main EU, global and national statistical sources; each indicator has a specific set of sources;
• Extraction of data from the relevant IDC surveys and databases;
• Additional secondary research and case study interviews for the stories, which in turn fed back into the indicator models to help in the modelling and estimate of indicators;
• A selected number of opinion leader and stakeholder interviews to feed into the modelling and scenario assumptions;
• Implementation of the 7 indicator models and elaboration of results;
• Development of the forecast scenario assumptions and update of the 3 scenarios;
• Assessment of policy insights building on the results of the previous steps.
Figure 12: A sophisticated Methodology
[Figure 12 box labels: IDC ongoing primary research on Business Analytics, ICT markets, Digital transformation; EU & National Statistical Sources for Internationals; Additional Secondary Research; Additional In-Depth Interviews; Methodology and Taxonomy; 7 Quantitative Models; 3 Forecast Scenarios; Stories; EDM Monitoring Tool; Policy insights]
Desk Research
As in the first study, the study team reviewed the list of relevant public sources and updated it to collect additional relevant data. The main sources used are listed below.
• Concerning the indicators on Data Market, data companies, data companies’ revenues, and the Data Economy the main sources were:
o Eurostat business demography statistics in the European Union, covering aspects such as the total number of active enterprises in the business economy, their birth rates, death rates, and survival rates (last update: December 2019);
o Eurostat annual structural business statistics with a breakdown by size class, the main source of data for an analysis of SMEs (latest update: December 2019);
o IDC’s detailed market forecast estimates for IT Hardware, Software, and IT Services from 2017, 2018 and 2019;
o IDC Worldwide Black Book (Standard Edition), quarterly updates from the years 2018 through 2019. The Black Book represents IDC's quarterly analysis of the status and projected growth of the worldwide ICT industry in 54 countries;
o IDC European Vertical Markets Survey, 2018 and 2019;
o IMF World Economic Outlook (WEO) Database, October 2019.
• For the data professionals we used in addition the following sources:
o ILOSTAT (International Labour Organization) Statistics and Databases (January 2019)
Survey data
This research is supported by prior and ongoing survey data to provide a foundation where information does not exist, fill in where information is sparse or missing, or to confirm ongoing assumptions such as adoption rates of digital technology or use of data. The foundation for the research was a survey conducted in 2015 to establish a baseline for data use and data supply. This was conducted across 8 countries and detailed 11 industries and two company size bands. 1,100 respondents provided sufficient detail to draw starting assumptions for technology adoption, data professionals penetration in organisations and data supplier penetration in organisations. The models used to identify the number of data professionals, the number of data suppliers by member state, industry, and company size band used this data as part of the model foundation.
Table 18: Quotas used for the initial data market survey
Member State | Total respondents
Czech Republic | 100
France | 200
Germany | 200
Italy | 100
Poland | 100
Spain | 100
Sweden | 100
UK | 200
Total | 1,100
Sectors:
• Mining, Manufacturing
• Electricity, gas and steam, water supply, sewerage and waste management
• Construction
• Transport and storage
• Information and communications
• Wholesale and retail trade repair of motor vehicles and motorcycles, Accommodation and food services
• Professional services, administrative and support services
• Public Administration And Defence; Compulsory Social Security
• Education
• Finance
• Health
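The quotas in Table 18 can be cross-checked against the figures quoted in the text (8 countries, 11 industries, 1,100 respondents). The following is a minimal sketch of that consistency check; the variable names are illustrative and the shortened sector labels are abbreviations introduced here.

```python
# Respondent quotas per Member State, as listed in Table 18.
QUOTAS = {
    "Czech Republic": 100,
    "France": 200,
    "Germany": 200,
    "Italy": 100,
    "Poland": 100,
    "Spain": 100,
    "Sweden": 100,
    "UK": 200,
}

# Sectors listed in Table 18 (labels shortened for readability).
SECTORS = [
    "Mining, Manufacturing",
    "Utilities",
    "Construction",
    "Transport and storage",
    "Information and communications",
    "Wholesale / Retail, Accommodation and food services",
    "Professional services",
    "Public Administration",
    "Education",
    "Finance",
    "Health",
]

total_respondents = sum(QUOTAS.values())

print(f"Countries: {len(QUOTAS)}")
print(f"Sectors: {len(SECTORS)}")
print(f"Total respondents: {total_respondents:,}")
```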
This survey supported the initial foundations for the data models used, but over time these models were maintained using data from IDC’s annual industry survey. This survey currently (in 2019) addresses over 2,700 respondents across 13 countries. This ongoing series of surveys asks questions about technology adoption and technology use cases, and it is this information that supports and updates the foundation models used in the forecast of the data market, the number of data users, the number of data suppliers, and the number of data professionals.
Table 19: Countries and Industries surveyed in the annual IDC industry survey
Country | Industry
U.K. | Finance
Germany | Manufacturing
France | Retail/wholesale
Italy | Professional services
Spain | Healthcare
the Netherlands | Transport
Sweden | Telecom/media
Denmark | Utilities/oil and gas
Norway | Government/education
Finland |
Russia |
Czech Republic |
Poland |
Forecast Scenarios
The scenario model used in this study is based on the definition of alternative assumptions about four main groups of key factors driving the Data Market along different development paths. The identification of the key factors of market development was based on the desk and field research carried out in this study and on the review of a long list of forecast assumptions, leveraging IDC's periodically updated Market Forecast Assumptions. The selection of the most relevant factors was based on two main criteria:
• High level of impact on the development of the Data Market and the Data Economy;
• High level of uncertainty, with potential different outcomes (assumptions) over the next 8 years.
The four main groups of factors are:
• Macroeconomic factors;
• Policy/regulatory factors;
• Data Market demand-supply factors;
• Global megatrends affecting all technology markets.
Even though they may seem obvious, these four clusters correspond to the main typologies of factors which affect the evolution of the Data Market. Each cluster aggregates a set of interrelated key factors; their combination differentiates the three scenarios. The scenarios are characterised by the interaction and co-dependency of these factors; no scenario can be explained only by one factor or one group of factors.
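The structure described above (three scenarios, each defined by assumptions across four groups of factors) can be represented as a simple data structure. The sketch below is illustrative only: the class, the identifiers and the placeholder assumption text are hypothetical and do not reproduce the study's actual scenario model.

```python
from dataclasses import dataclass, field

# The four groups of key factors and the three scenarios named in the text.
FACTOR_GROUPS = (
    "macroeconomic",
    "policy_regulatory",
    "data_market_supply_demand",
    "global_megatrends",
)
SCENARIO_NAMES = ("baseline", "challenge", "high_growth")


@dataclass
class Scenario:
    """One scenario: a set of assumptions, keyed by factor group."""
    name: str
    assumptions: dict = field(default_factory=dict)

    def missing_groups(self) -> list:
        """Factor groups for which no assumption has been defined yet."""
        return [g for g in FACTOR_GROUPS if g not in self.assumptions]


def build_scenarios() -> dict:
    """Create the three scenarios with placeholder text for every factor group."""
    return {
        name: Scenario(
            name=name,
            # Hypothetical placeholder; the real assumptions are in the study.
            assumptions={g: f"{name} assumption for {g}" for g in FACTOR_GROUPS},
        )
        for name in SCENARIO_NAMES
    }


scenarios = build_scenarios()
for scenario in scenarios.values():
    print(scenario.name, "complete:", not scenario.missing_groups())
```

Keeping every scenario keyed by the same four groups reflects the point made in the text that no scenario can be explained by a single factor or group of factors.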
Figure 13: Structure of the Scenarios Model
[Figure 13 box labels: Scenarios Model; Baseline Scenario; Challenge Scenario; High Growth Scenario; Policy/Regulatory Assumptions; Macroeconomic Assumptions; Data Market dynamics Assumptions; Global Megatrends Assumptions]
Source: European Data Market Monitoring Tool, IDC 2015
The scenario model and the forecast indicator models are correlated. Table 20 below summarises the rationale for the selection of the key factors and how the corresponding assumptions were used as inputs to the indicators’ forecast models.
Table 20: Identification of Main Factors driving the Scenarios
Key Factors | Rationale | Inputs to the Forecast Models
Macroeconomic factors | Macroeconomic factors are partially exogenous to the Data Market (even though data innovation is expected to contribute to GDP growth) | Historical data for the period 2014-2018, plus 2019 estimates, plus alternative forecasts for the period 2020 to 2025 of the following: EU GDP growth; ICT spending growth; other economic factors such as unemployment for the same period
Policy/Regulatory factors | Policy measures and regulation shape the framework conditions of the development of the Data Market | Alternative policy and regulatory factors by scenario
Data Market supply-demand factors | Strong influence of alternative supply-demand dynamics on the market development paths | Alternative supply and take-up models by scenario
Global megatrends | Strong influence of global digital innovation trends on the EU Data Market growth | Alternative assumptions on the development of current and forecast ICT innovation drivers as well as global digital transformation dynamics
The scenarios provide the main framework for the forecast of the EDM indicators. As shown in the Figure below, IDC developed seven forecast models: each model produced the specific indicator forecasts under the three main scenarios, followed by an in-depth cross-check and quality check. The forecast models are also correlated and were developed in a process with the following dependencies:
• The Data Market forecast model is the cornerstone of the process: it was developed first, building on IDC’s forecasts and on the macroeconomic variables as described below. Its results and growth rates feed into the other models, according to the specific assumption and calculation methods explained for each indicator.
• The Data Market and data suppliers’/data users’ forecasts influence the data professionals’ model.
• The data companies’ forecasts feed into the data revenues model.
• The data professionals model feeds into the data professionals’ skills gap model.
• The Data Economy model feeds from all the other forecasts, but especially the Data Market and the data users' forecasts.
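The dependencies listed above define an order in which the forecast models have to be run. The sketch below expresses them as a dependency graph and derives a valid execution order with Python's standard `graphlib` module (Python 3.9+). The node names are illustrative, and grouping data suppliers and data users under a single `data_companies` node is a simplification made here, so the graph has six nodes rather than seven.

```python
from graphlib import TopologicalSorter

# Each model maps to the set of models whose results it needs first.
# Names and grouping are illustrative assumptions, not the study's own labels.
MODEL_DEPENDENCIES = {
    "data_market": set(),
    "data_companies": {"data_market"},
    "data_revenues": {"data_market", "data_companies"},
    "data_professionals": {"data_market", "data_companies"},
    "skills_gap": {"data_market", "data_professionals"},
    "data_economy": {
        "data_market",
        "data_companies",
        "data_revenues",
        "data_professionals",
        "skills_gap",
    },
}


def build_order(dependencies: dict) -> list:
    """Return one execution order in which every model runs after its inputs."""
    return list(TopologicalSorter(dependencies).static_order())


order = build_order(MODEL_DEPENDENCIES)
print(" -> ".join(order))
```

Whatever order the sorter returns, the Data Market model comes first and the Data Economy model last, matching the description of the former as the cornerstone and of the latter as feeding from all other forecasts.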
Measuring Data Professionals
Definition and Scope
Data professionals are workers who collect, store, manage, and/or analyse, interpret, and visualise data as their primary or as a relevant part of their activity. Data professionals must be proficient with the use of structured and unstructured data, should be able to work with a huge amount of data and be familiar with emerging database technologies.
In our definition, data professionals are not only data technicians but also data users who, based on more or less sophisticated tools, take decisions about their business or activity after having analysed and interpreted available data. According to our definition, data professionals belong to the category of knowledge workers and specifically “codified” knowledge workers (Lundvall and Johnson, 1994); data professionals specifically deal with data while knowledge workers deal with information and knowledge.
The indicator has been measured according to the segmentations presented in the following table, including two sub-indicators about the share on employment and the intensity of employment.
Table 21: Indicator 1 – Data Professionals
Indicator 1 – Data Professionals
N. | Name | Description | Type and Time | Segmentation
1.1 | Number of data professionals | Total number of data professionals in the EU | Number, 2016-17-18-20; Forecast to 2025, 3 Scenarios | By Geography: 28 EU MS + EU27 MS + total EU; By Industry: 11 industry sectors NACE rev.2
1.2 | Employment share | Total number as a share of total employment in the EU | % of total employment, 2017-18-20 | By Geography: 28 EU MS + EU27 MS + total EU; By Industry: 11 industry sectors NACE rev.2; By Size: not applicable
1.3 | Intensity share | Average number of data professionals per company (only for private sector) | Number, 2017-18-20 | By Geography: 28 EU MS + EU27 MS + total EU; By Industry: 11 industry sectors NACE rev.2; By Size: not applicable
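The two sub-indicators in Table 21 are simple ratios of the headline count. A minimal sketch follows, with hypothetical input values chosen only to show the calculation; the function names are illustrative.

```python
def employment_share(data_professionals: float, total_employment: float) -> float:
    """Indicator 1.2: data professionals as a share of total employment."""
    if total_employment <= 0:
        raise ValueError("total_employment must be positive")
    return data_professionals / total_employment


def intensity(data_professionals: float, companies: float) -> float:
    """Indicator 1.3: average number of data professionals per company."""
    if companies <= 0:
        raise ValueError("companies must be positive")
    return data_professionals / companies


# Hypothetical values for one Member State and industry (not study data).
example_professionals = 50.0     # thousand data professionals
example_employment = 1_000.0     # thousand employed persons
example_companies = 25.0         # thousand private companies

print(f"Employment share: {employment_share(example_professionals, example_employment):.1%}")
print(f"Intensity: {intensity(example_professionals, example_companies):.1f} per company")
```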
The segmentation by industry sector used in the study is presented in the following table with the corresponding NACE rev.2 Codes.
Table 22 Industry Sectors Classification
Eurostat Name | NACE Rev 2 Code | Abbreviation for Tables
Construction | F | Construction
Education | P | Education
Electricity, gas and steam, water supply, sewerage and waste management | D-E | Utilities
Finance | K | Finance
Human health activities | Q | Healthcare
Information and communications | J | Information and communication
Mining, Manufacturing | B-C | Mining, Manufacturing
Professional services, administrative and support services | L-M-N | Professional services
Public Administration And Defence; Compulsory Social Security | O | Public Administration
Transport and storage | H | Transport
Wholesale and retail trade repair of motor vehicles and motorcycles, accommodation and food services | G-I | Wholesale / Retail
Methodology Approach
Our approach is based on an iterative process and on a calibration process of the final estimates. The approach has been repeated in the new study based on updates of the main sources.
Statistical Identification
Data professionals are not classified as such in any of the labour and occupation statistics. In order to define them statistically, we have adopted the International Standard Classification of Occupations (ISCO-08), selecting categories where data professionals may be included. The criteria adopted for the selection of the ISCO-08 codes are the following:
• We have selected the occupations where data professionals can be involved either as data providers or as data users;
• We have selected the occupations from 1 to 4-digit disaggregation;
• The occupation codes selected are those where the presence of data professionals can be detected because:
o They hold deep analytical skills;
o They do not need deep analytical skills but a basic understanding of statistics and/or machine learning in order to conceptualise the questions that can be addressed through deep analytical skills;
o They are the ones providing enabling technology and therefore they are providers of data services.
• The selected codes are those where a significant part of the workers may be data professionals; the occupations where data professionals are a very marginal part of the workers have been excluded. As an example, medical practitioners have been excluded, although some practitioners may be data professionals because they undertake research activities. Since they are only a very marginal part of the practitioners, we excluded them from the occupations where data professionals are present;
• We excluded all the data professionals who are not included in the knowledge economy perimeter because their occupation is a low-skilled one, i.e. with a high routine level (as an example, call centre workers are in theory data professionals, but since their activity is a routine one and as such excluded from the knowledge economy, they are not considered data professionals).
Table 23: ISCO-08 Structure and Data Professionals
ISCO-08 structured Classification | Major Groups (1 digit) | Sub-groups (2 digits) | Minor Groups (3 digits) | Units (4 digits)
Number of codes ISCO-08 structure | 10 | 43 | 130 | 436
Number of selected codes including data professionals | 4 | 9 | 21 | 52
Share of data professionals’ codes in the ISCO-08 structure | 40% | 21% | 16% | 12%
Source: IDC elaboration on ISCO codes
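The shares in the last row of Table 23 follow directly from the two rows above it. The sketch below recomputes them from the code counts; rounding to the nearest whole percentage is assumed, as that reproduces the figures shown in the table.

```python
# (total ISCO-08 codes, selected codes including data professionals) per level,
# as reported in Table 23.
ISCO_CODES = {
    "Major Groups (1 digit)": (10, 4),
    "Sub-groups (2 digits)": (43, 9),
    "Minor Groups (3 digits)": (130, 21),
    "Units (4 digits)": (436, 52),
}


def selected_share(total: int, selected: int) -> int:
    """Share of selected codes, as a whole percentage."""
    return round(100 * selected / total)


shares = {
    level: selected_share(total, selected)
    for level, (total, selected) in ISCO_CODES.items()
}

for level, share in shares.items():
    print(f"{level}: {share}%")
```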
Calculation of the quantitative Perimeter
The quantitative perimeter of employment where data professionals are trackable is based on the selected ISCO codes crossed with the NACE classification of economic activities, for each one of the 28 Member States and the EU as a whole, and has been updated based on the sources updates.
Estimate and Calibration of the Penetration of Data Professionals
The next step is the estimate of the percentage of data professionals within the perimeter of data professional candidates. To calculate the coefficients for this percentage, we have elaborated a set of assumptions (specified in the D2- Methodology report of the EDM Study). The assumptions have been revised and updated for each release of the study and applied to the model to calculate the share of data professionals by Member State and by industry.
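In outline, the estimate multiplies the employment perimeter of each Member State and industry cell by a penetration coefficient. The sketch below illustrates that step only; the cell keys, perimeter sizes and coefficients are hypothetical, and the real coefficients derive from the assumptions in the methodology report rather than from anything shown here.

```python
def estimate_data_professionals(perimeter: dict, coefficients: dict) -> dict:
    """Apply a penetration coefficient to each (Member State, industry) cell."""
    estimates = {}
    for cell, employment in perimeter.items():
        if cell not in coefficients:
            raise KeyError(f"no penetration coefficient for {cell}")
        estimates[cell] = employment * coefficients[cell]
    return estimates


# Hypothetical perimeter (workers in the selected ISCO codes) per cell.
example_perimeter = {
    ("IT", "Finance"): 1_000.0,
    ("IT", "Healthcare"): 2_000.0,
}
# Hypothetical share of the perimeter assumed to be data professionals.
example_coefficients = {
    ("IT", "Finance"): 0.25,
    ("IT", "Healthcare"): 0.125,
}

estimates = estimate_data_professionals(example_perimeter, example_coefficients)
total_estimate = sum(estimates.values())

for cell, value in estimates.items():
    print(cell, f"{value:,.0f}")
print(f"Total: {total_estimate:,.0f}")
```

Forecasting under a scenario would, on this reading, amount to rerunning the same calculation with scenario-specific perimeters and coefficients.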
Forecasting Data Professionals
The same model was applied to forecast data professionals to 2025, by developing specific assumptions by scenario, even though the level of uncertainty is higher, and the reliability of the forecasts is lower.
Measuring Data Companies
Definition and Scope
Data companies are organisations that are directly involved in the production, delivery and/or usage of data in the form of digital products, services and technologies. They can be either data suppliers or data users:
• Data suppliers have as their main activity the production and delivery of digital data-related products, services, and technologies. They represent the supply side of the Data Market.
• Data users are organisations that generate, exploit, collect and analyse digital data intensively and use what they learn to improve their business. They represent the demand side of the Data Market.
Table 24: Indicator 2 – Number of Data Companies
Indicator 2 – Data companies
N. | Name | Description | Type and Time | Segmentation
2.1 | Number of data suppliers | Total number of data suppliers, measured as legal entities based in the EU | Number, 2017-18-20; Forecast to 2025, 3 Scenarios | By Geography: 28 EU MS + EU27 MS + total EU; By Industry: 2 NACE rev2 Section J Information and Communication and section M Professional, scientific and technical activities; By Company Size: below 250 employees, above 250 employees
2.2 | Share of data suppliers | Total data companies on total companies in industry J and M | % 2017-18-20 | By Geography: 28 EU MS + EU27 MS + total EU; By Industry: 2 NACE rev2 Section J Information and Communication and section M Professional, scientific and technical activities
2.3 | Number of data users | Total number of data users in the EU, measured as legal entities based in one EU country | Number, 2017-18-20; Forecast to 2025, 3 Scenarios | By Geography: 28 EU MS + EU27 MS + total EU; By Industry: 11 industry sectors NACE rev.2
2.4 | Share of data users | Total data users as share of total private companies | % 2017-18-19 | By Industry: 2 NACE rev2 Section J Information and Communication and section M Professional, scientific and technical activities
Methodology Approach
Data companies have been measured by updating the model used in the previous EDM Study (see Figure 14), which draws on both IDC and public sources:
• Eurostat business demography statistics in the European Union, covering aspects such as the total number of active enterprises in the business economy, their birth rates, death rates, and survival rates (last update: December 2014);
• Eurostat annual structural business statistics with a breakdown by size class, the main source of data for the analysis of SMEs (latest update: March 2016);
• IDC’s detailed market forecast estimates for IT Hardware, Software, and IT Services;
• IDC Worldwide Black Book (Standard Edition), quarterly updates. The Black Book represents IDC's quarterly analysis of the status and projected growth of the worldwide ICT industry in 54 countries;
• IDC European Vertical Markets Survey.
Figure 14: Data Companies Model
Measuring the Revenues of Data Companies
Definition and Scope
Data companies’ revenues are the revenues generated by data suppliers for the products and services specified in our definition of the Data Market. The revenues correspond to the aggregated value of all the data-related products and services generated by Europe-based suppliers, including exports outside the EU.
Table 25: Indicator 3 – Revenues of Data Companies
Indicator 3 – Revenues of Data Companies
N. | Name | Description | Type and Time | Segmentation
3.1 | Total revenues of data companies | Total revenues of the Data Suppliers calculated by Indicator 2 | Billion €, 2017-18-20; Forecast to 2025, 3 Scenarios | By Geography: EU27; EU27 + U.K.; total EU; By Company Size: below 250 employees, above 250 employees
3.2 | Share of data companies’ revenues | Total revenues of the Data Suppliers calculated by Indicator 2 | % of revenues on total, 2017-18-20 | By Geography: EU27; EU27 + U.K.; total EU
[Figure 14 labels: Data Suppliers (main activity is production and delivery of data-related products, services, technologies); Data Users (organisations with a high-intensity reliance on data for the accomplishment of their mission; they generate and exploit their own data, collect online customer data intensively, and subject the data to sophisticated analysis); Eurostat Data; Data Market Survey; Data Market Forecast; Eurostat Data by segment; Segments; Country Clusters.]
Methodology Approach
The indicator has been measured by applying the same model used in the previous EDM Study, which calculates revenues from the following inputs:
• Eurostat and IDC statistics on average IT vendors’ revenues by size and sector;
• The total number of data companies by country, industry and size class;
• The value of the Data Market by country and industry;
• The estimated share of exports-imports in the value of the Data Market.
Measuring the Data Market
Definition and Scope
The Data Market is the marketplace where digital data is exchanged as “products” or “services” as a result of the elaboration of raw data. We define its value as the aggregate value of the demand for digital data, without measuring the direct, indirect and induced impacts of data in the economy as a whole. The value of the Data Market includes imports (data products and services bought on the global digital market from suppliers not based in Europe) and excludes the exports of the European data companies.
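Read together, the definitions of the Data Market (imports included, exports excluded) and of data companies' revenues (exports included) imply a simple accounting link between the two indicators. The sketch below illustrates it under the assumption that both indicators cover the same set of products and services; the function name and the figures are hypothetical and are not taken from the study.

```python
def suppliers_revenues(market_value: float, imports: float, exports: float) -> float:
    """Revenues of Europe-based suppliers implied by the two definitions.

    market_value: demand in Europe, including imports and excluding exports.
    imports:      part of that demand served by suppliers not based in Europe.
    exports:      sales of Europe-based suppliers outside Europe.
    """
    domestic_sales = market_value - imports  # demand served by Europe-based suppliers
    return domestic_sales + exports


# Hypothetical figures in billion EUR, for illustration only.
revenues = suppliers_revenues(market_value=100.0, imports=20.0, exports=15.0)
```

Under these assumptions, revenues exceed the market value only when exports are larger than imports.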
Table 26: Indicator 4 – Size of the Data Market
Indicator 4 – Size of the Data Market
N. | Name | Description | Type and Time | Segmentation
4 | Value of the Data Market | Estimate of the overall value of the Data Market | Billion €, 2017-18-20; Forecast to 2025, 3 Scenarios | By Geography: EU27; EU27 + U.K.; total EU; By Industry: 11 industry sectors NACE rev.2; By Size: not applicable
Methodology Approach
The Data Market indicator is updated every year for the duration of the study. The model extracts from IDC databases the components of hardware, software and services spending that fall within the definition of the Data Market. The IDC data is already segmented by country and by industry, although not all Member States are covered and the industry classification differs slightly from the one proposed in this project. The shares of the software, hardware, and services markets used to calculate the Data Market are derived from IDC surveys covering Big Data and IT spending patterns and intentions in the European market, and from a survey of data suppliers and data users in key Member States, together with analyst expertise and alignment with IDC's European and worldwide forecasts for the business analytics and Big Data Market.
The model updates the Data Market value shares by Member State and by industry.
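The core of this approach can be sketched as a weighted sum: each spending component is multiplied by the share of it that falls within the definition of the Data Market. This is a simplified illustration rather than the IDC model itself; the spending figures, the shares and the function name are hypothetical.

```python
# Hypothetical IT spending for one country and one industry, in million EUR.
spending = {"hardware": 500.0, "software": 300.0, "services": 400.0}

# Hypothetical shares of each component that fall within the Data Market
# definition (in the study these are derived from surveys and analyst input).
data_share = {"hardware": 0.05, "software": 0.20, "services": 0.10}


def data_market_value(spending, data_share):
    """Sum the data-related part of each spending component."""
    return sum(spending[component] * data_share[component] for component in spending)


value = data_market_value(spending, data_share)
```

Repeating the calculation for every country and industry cell would give the segmentation by Member State and by industry described above.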
Figure 15: Data Market Model
Source: IDC 2016
Measuring the Data Economy
Definition and Scope
The Data Economy measures the overall impacts of the Data Market on the economy as a whole. It involves the generation, collection, storage, processing, distribution, analysis, elaboration, delivery, and exploitation of data enabled by digital technologies. The Data Economy also includes the direct, indirect, and induced effects of the Data Market on the economy.
The Data Economy indicator measures the value of the Data Economy based on the estimate of all the economic impacts following the adoption of data-driven innovation and data technologies in the EU. As such, the indicator aggregates the direct, indirect and induced impacts of the Data Market, defined as follows.
1. The direct impacts: the initial and immediate effects generated by the data suppliers; they represent the activity engendered by all businesses active in data production. The quantitative direct impacts are measured as the revenues from data products and services sold. Because the estimate of the Data Market value is more reliable than the estimate of data companies’ revenues, we adopt the Data Market value as a proxy: for the sake of simplicity, the direct impacts coincide with the value of the Data Market.
[Figure 15 labels: Data Market Model; Product shares; Tie Ratio; Vertical, Size Segments; IDC Product & Country Forecasts; IDC Vertical Market Forecasts; % market (2012, 2020); Spend (2012, 2020); Business Analytics Software forecast; IT Services forecast; IT Hardware forecast; Data Market.]
2. The indirect impacts: the economic activities generated along the supply chain of the data suppliers. There are two types of indirect impacts, backward and forward (Richardson, 1985):
a. The backward indirect impacts: the business growth resulting from changes in sales from suppliers to the data industry. To produce and deliver data products and services, data companies need inputs from other stakeholders. Revenues from those sales to data companies are the backward indirect impacts.
b. The forward indirect impacts: the economic growth generated through the use of data products and services by the downstream industries, i.e. the data users in a selected number of industries. For user companies, data is now a relevant factor of production; the adoption of data products and services provides different types of competitive advantage and productivity gains to the user industries. The main benefits that the exploitation of data can provide to downstream industries are (OECD, 2013; McKinsey, 2011):
i. Optimising production and delivery processes (data-driven production);
ii. Improving marketing by providing targeted advertisements and personalised marketing practices (data-driven marketing);
iii. Improving existing organisation and management practices (data-driven organisation).
3. The induced impacts: the economic activity generated in the whole economy as a secondary effect. Induced additional spending is generated both by new workers, who receive a new wage, and by the increased wages of existing workers. This spending creates new revenues in nearly all sectors of the economy. The additional consumption supports economic activity in various industries such as retail, consumer goods, banks, entertainment, etc.
Table 27: Indicator 5 – Value of the Data Economy
Indicator 5 – Value of the Data Economy
N. | Name | Description | Type and Time | Segmentation
5 | Value of the Data Economy | Value of the Data Market plus direct, indirect and induced impacts on the EU economy | Billion €, 2017-18-20; Forecast to 2025, 3 Scenarios | By Geography: EU27; EU27 + U.K.; total EU
5.1 | Incidence of the Data Economy on GDP | Ratio between value of the Data Economy and EU GDP | %, 2017-18-20; Forecast to 2025, 3 Scenarios | By Geography: EU27; EU27 + U.K.; total EU
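The two indicators in Table 27 can be illustrated with a short sketch, assuming that the value of the Data Economy is the plain sum of the impact components described in this section, with the direct impacts proxied by the Data Market value. The class, the function and all figures are hypothetical and do not reproduce the study's estimates.

```python
from dataclasses import dataclass


@dataclass
class Impacts:
    """Impact components of the Data Economy (hypothetical values, billion EUR)."""
    direct: float             # proxied by the value of the Data Market
    backward_indirect: float  # sales from suppliers to the data industry
    forward_indirect: float   # benefits accruing to the data user industries
    induced: float            # secondary effects of additional spending

    def total(self) -> float:
        return self.direct + self.backward_indirect + self.forward_indirect + self.induced


def incidence_on_gdp(data_economy_value: float, gdp: float) -> float:
    """Indicator 5.1: value of the Data Economy as a percentage of GDP."""
    return 100 * data_economy_value / gdp


impacts = Impacts(direct=60.0, backward_indirect=5.0, forward_indirect=200.0, induced=35.0)
data_economy = impacts.total()
incidence = incidence_on_gdp(data_economy, gdp=15_000.0)
```

Keeping the components separate makes it possible to update each one with its own assumptions before aggregating.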
This estimate of the Data Economy does not include the user benefits and social impacts of data-driven innovation, such as changes in quality of life (health, safety, recreation, air quality). Although these benefits may be evaluated in economic (money) terms, they are not economic impacts as defined above, because they do not induce an increase in business activity and a consequent growth in GDP.
Analysts have underlined that the new decision-making processes act as a rationalisation and optimisation factor (Brynjolfsson, 2011; McKinsey, 2012), since they improve effectiveness and efficiency, and in some cases they may have a disruptive effect. The impacts related to the new decision-making processes are the ones we have called the forward indirect impacts.
The value creation process based on data rests on the elaboration of information and knowledge (OECD 2016), although the boundaries between data, information, and knowledge are sometimes fuzzy. The huge volume of data is a global phenomenon which is sometimes viewed with suspicion by citizens, consumers and businesses, because data flows are seen as an intrusion on privacy. Nevertheless, there is some evidence that data analysis can provide benefits to both businesses and consumers. This is not surprising, since economic theory holds that information encourages competition between businesses for the benefit of consumers.
Data do not provide value and benefits as such; data need to be collected, stored, aggregated, combined and analysed in order to be appropriately used in decision-making processes. To create value, data need to be processed (OECD, 2016):
• Extracting information from structured and unstructured data: data analytics techniques are today able to analyse both structured and unstructured data. It is worth recalling that most data stored by businesses are unstructured (IDC, 2012). Technologies such as optical character recognition, natural language processing, face recognition algorithms and machine learning algorithms are enabling the use of all data.
• Real-time monitoring and tracking: analysis of data in real time is often mentioned as one of the most powerful factors, since it helps organisations make real-time decisions, which, in a fast-changing world, is a well-known competitive advantage.
• Inference and prediction: until now, prediction was based exclusively on prior information and data series. Data analytics can now enable the creation of information even without prior information. Such information can be created through patterns and correlations in data. Personal information, for example, can be deduced from anonymous or non-personal data. Businesses and organisations demand real-time insights rather than historical and periodic information, as well as advanced specialised data analytics services. Algorithms allow machine and statistical learning based on non-specific data; businesses can learn and predict a lot about their customers even if they do not have specific data and time series about the issue they are interested in. Machine learning has applications, for example, in health care, where data collected on patients are recorded by imaging, and in production processes, where it supports increases in the quality of production.
The diffusion of technology supporting the production and analysis of data leads organisations and businesses to base their decisions on data much more than they used to. As pointed out by the OECD in its recent report, the decision-making process itself is also changing. Decision makers do not necessarily need to understand a phenomenon before they act on it. A store can change its product placement based on data analysis without needing to know why such a change should improve sales. There is therefore a decision automation process: “first comes the analytical factor, then the action, and last, if at all, the understanding” (OECD, 2015).
The impacts of such a new approach to decision making and to the use of data across all the functions of enterprises and organisations are many and varied, and we believe they will be the object of studies and analysis in the coming years. At this point it is difficult to classify them and to suggest a taxonomy.
Such impacts have been observed through empirical studies and case analyses. The main ways in which the benefits appear are the following:
• Creating more information, knowledge and transparency: technology is making data more accessible and exploitable to all kinds of stakeholders, including SMEs. This increases transparency, and decisions are made through a rational process.
• Improving performance: access to a wide range of information and a large amount of data is changing the way decisions are made. An increasing number of organisations are becoming data-driven organisations, which means that they make decisions based on empirical results. As an example, retailers can adjust prices and promotions more precisely than they used to, and in real time. This may improve competitiveness. McKinsey underlines that the health sector is obtaining many benefits from the new decision-making process: studies on clinical data make it possible to identify and understand the sources of variability in treatment, to identify the best treatment protocols and to create guidelines for the optimisation of treatment decisions. This not only increases the effectiveness of treatments, but also produces savings.
• Improving customisation of actions for better decisions: data technology is improving the segmentation of customers and the analysis of their preferences in real time. This allows companies to supply products and services targeted to specific groups of individuals who have specific needs and preferences. Such segmentation is also useful when supplying public services. It helps define the price precisely and offer exactly what is needed, which means better quality, and it helps companies avoid offering products and services that consumers are not willing to pay for.
• Innovating products and services as well as business models: the more information and understanding businesses have about their customers, the better they can serve them. Although consumers may fear that their privacy is compromised, this can also provide them with an unexpected surplus: real-time price comparison services not only provide better transparency but also allow consumers to buy the best product at the most convenient price (for example when buying airline tickets online or when booking hotels). Companies can in fact create new products and services to better satisfy their customers’ needs. This is also true for the public sector, and specifically for the health care system, where preventive care programmes can be created.
These effects are reflected in an increase in revenues, due to a higher market share resulting from increased competitiveness, or in a reduction in costs. All these effects are included in the forward indirect impacts; they accrue to the user industries and, for the reasons above, they are the impacts we consider new to the overall economic system.
Methodology Approach
Measuring the Data Economy depends on the macroeconomic context on one hand, and on the adoption/diffusion and integration processes the companies are implementing on the other hand. Moreover, there is a necessary time lag before the impacts take place in the economic system. Therefore, the estimates are based on a set of assumptions, including choices about proxy indicators.
In order to measure the impact of the diffusion and use of data services and products, we estimated each component (as defined in the above paragraph) of the impact separately.
The estimation approach developed in the previous study was based on a number of assumptions and on the results of a survey launched during the first year of research.
The following assumptions have been confirmed:
• The penetration rates of data in terms of value added for the user industries are positively correlated with the penetration rate in terms of the number of companies using data.
• The survey conducted in the 2013-2016 study provided information about the quantitative benefits due to the use of data for the six major Member States plus the Czech Republic; such benefits have been taken into consideration for the six major Member States.
• For Austria, Belgium, Denmark, Finland, Ireland, Luxembourg, Malta, the Netherlands, and Sweden, we assumed the same distribution of benefits as the average of the six major Member States (the Big Six).
• For the other Member States, we estimated the benefits of the rest of Europe, based on the survey results, and we assumed that all the minor Member States are achieving benefits similar to the rest of Europe.
• For the induced impacts, we assumed that the additional earnings are spent according to the general economic mood.
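The three country-level assumptions about benefits amount to a simple lookup rule, sketched below. The list of Member States that receive the Big Six average comes from the text; the labels used for the six major Member States, the benefit rates and the function name are hypothetical placeholders, since the survey values are not reported here.

```python
from statistics import mean

# Hypothetical survey-based benefit rates (in %) for the six major Member States.
big_six_benefit = {
    "BIG_1": 2.0, "BIG_2": 3.0, "BIG_3": 4.0,
    "BIG_4": 3.0, "BIG_5": 2.0, "BIG_6": 4.0,
}

# Member States assumed to match the Big Six average (as listed in the text).
SAME_AS_BIG_SIX_AVERAGE = {
    "Austria", "Belgium", "Denmark", "Finland", "Ireland",
    "Luxembourg", "Malta", "Netherlands", "Sweden",
}

# Hypothetical benefit rate (in %) estimated for the rest of Europe.
REST_OF_EUROPE_RATE = 1.5


def benefit_rate(member_state: str) -> float:
    """Return the benefit rate assumed for a Member State."""
    if member_state in big_six_benefit:
        return big_six_benefit[member_state]       # own survey result
    if member_state in SAME_AS_BIG_SIX_AVERAGE:
        return mean(big_six_benefit.values())      # average of the Big Six
    return REST_OF_EUROPE_RATE                     # rest-of-Europe estimate
```

For instance, `benefit_rate("Austria")` returns the Big Six average, while any Member State outside both groups falls back to the rest-of-Europe estimate.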
In order to update the estimates of the different components of the impacts, we have adopted some new assumptions:
• In the next three years, data diffusion will remain at a relatively early (emerging) stage, so in our view the structure of the data impacts is not going to change.
• The quantitative benefits due to the use of data will vary in line with macroeconomic trends, and specifically with the trends of the industries (and stakeholders) affected.
Measuring the Data Skills Gap
Definition and Scope
This indicator captures the potential gap between demand and supply of data skills in Europe, since the lack of skills may become a barrier to the development of the data industry and the rapid adoption of data-driven innovation.
Table 28: Indicator 6 – Data Skills Gap
Indicator 6 – Data Skills Gap
N. | Name | Description | Type and Time | Segmentation
6 | Data Workers Skills Gap | Gap between demand and supply of data workers | Absolute number and % on total demand, 2017-18-20; Forecast to 2025, 3 scenarios | By Geography: EU27; EU27 + U.K.; total EU, main EU Member States
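Both measures in Table 28 follow from the same two inputs. A minimal sketch, with hypothetical demand and supply figures and an illustrative function name:

```python
def skills_gap(demand: int, supply: int) -> tuple[int, float]:
    """Return the gap as an absolute number and as a percentage of total demand.

    A positive gap means unmet demand; a negative gap means over-supply.
    """
    gap = demand - supply
    return gap, 100 * gap / demand


# Hypothetical figures, for illustration only.
gap, gap_percent = skills_gap(demand=1_000_000, supply=950_000)
```

Expressing the gap as a share of demand, as the indicator does, makes the results comparable across Member States of different sizes.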
Methodology Approach
The methodology approach is the same as that implemented by IDC-empirica to estimate the supply-demand balance of ICT skills in the EU (e-Skills) on behalf of the EC DG Enterprise (now DG GROW). The model was first developed in 2009 and has since been validated and updated several times. The results have been used by the EC to support the e-skills policy, and the latest results were presented in December 2015 at the European E-skills 2015 Conference in Brussels (see footnote 21). However, data skills are not a subset of ICT skills, so the scope of supply and the dynamics of demand differ from those in the e-skills model developed by IDC.
To update the measurement of the indicator, the study team applied the same model developed for the previous EDM Study, combining estimates and forecasts of the demand for and supply of data professionals with data skills. The model draws on a range of sources, including:
• OECD Digital Economy Papers, among which: OECD (2014), Measuring the Digital Economy: A New Perspective, OECD Publishing;
• ILOSTAT (International Labour Organization) Statistics and Databases (2015);
• EUROSTAT Tertiary Education Statistics (last update: December 2015);
• European Data Science Academy (EDSA) project deliverables and publications (July 2015).
Figure 16: The Data Skills Demand-Supply Balance Model
Source: European Data Market Monitoring Tool, IDC 2016
21 “e-Skills in Europe: Trends and Forecasts for the European ICT Professional and Digital Leadership Labour Markets (2015-2020)”, empirica Working Paper (November 2015)
[Figure 16 labels: Data Workers Demand (Data Companies, Data Users); Data Workers Supply (Education, Training, Other Careers); Gap / Over-supply.]
8. Essential glossary – the key indicators
Data professionals are data workers who collect, store, manage, and/or analyse, interpret, and visualise data as their primary or as a relevant part of their activity. Data professionals must be proficient with the use of structured and unstructured data, should be able to work with a huge amount of data and be familiar with emerging database technologies. They elaborate and visualise structured and unstructured data to support analysis and decision-making processes.
Data companies can be either data suppliers or data users:
• Data suppliers have as their main activity the production and delivery of digital data-related products, services, and technologies. They represent the supply side of the Data Market.
• Data users are organisations that generate, exploit, collect and analyse digital data intensively and use what they learn to improve their business. They represent the demand side of the Data Market.
Data companies’ revenues are the revenues generated by data suppliers for the products and services specified in our definition of the Data Market. The revenues correspond to the aggregated value of all the data-related products and services generated by Europe-based suppliers, including exports outside the EU.
The Data Market is the marketplace where digital data is exchanged as “products” or “services” as a result of the elaboration of raw data. We define its value as the aggregate value of the demand for digital data, without measuring the direct, indirect and induced impacts of data in the economy. The value of the Data Market includes imports (data products and services bought on the global digital market from suppliers not based in Europe) and excludes the exports of the European data companies.
The Data Economy measures the overall impacts of the Data Market on the economy. It involves the generation, collection, storage, processing, distribution, analysis, elaboration, delivery, and exploitation of data enabled by digital technologies. The Data Economy also includes the direct, indirect, and induced effects of the Data Market on the economy.
The Data Professionals’ Skills Gap captures the potential gap between demand and supply of data skills in Europe, since the lack of skills may become a barrier to the development of the data industry and the rapid adoption of data-driven innovation.
Data is usually defined as qualitative or quantitative statements or information which can be coded and which are assumed to be factual and not the product of analysis or interpretation. For the purposes of this study we consider only data which is collected, processed, stored, and transmitted over digital information infrastructures and/or elaborated with digital technologies. This definition includes multimedia objects which are collected, stored, processed, elaborated and delivered for exploitation through digital technologies (for example, image databases).
Information is the output of processes that summarise, interpret or otherwise represent the content of a message to convey meaning. Therefore, information is not merely a synonym for data.
The Knowledge Economy is defined as the production of products and services based on knowledge-intensive activities that contribute to an accelerated pace of technical and scientific advance, as well as rapid obsolescence. The key component of a knowledge economy is a greater reliance on intellectual capabilities than on physical inputs or natural resources.
The Internet Economy is defined as covering the full range of our economic, social and cultural activities supported by the Internet and related information and communications technologies (see footnote 22).
Information or Knowledge workers in the most basic definition are persons employed to produce or analyse ideas and information. Multiple sources define knowledge workers as workers creating knowledge capital, who process existing information to create new information to be used to define and solve problems. They include, as an example, medical practitioners, lawyers, judges, teachers, architects, engineers, managers or salespeople. Their main capital is knowledge, and they are mainly focused on “non-routine” tasks.
Data workers collect, store, manage and analyse data as their primary activity. Data workers can be knowledge workers if they are focused on non-routine tasks. For example, data entry clerks’ primary activity is related to data, so they are data workers. However, data entry is a very routine task, and as such data entry clerks should not be considered knowledge workers. Another category of data workers is data analysts, who usually extract and analyse information from a single source, such as a CRM database. They require a medium level of creative thinking and usually work on structured data.
Data scientists require solid knowledge of statistical foundations and advanced data analysis methods, combined with a thorough understanding of scalable data management and the associated technical and implementation aspects (European Big Data Value Partnership Strategic Research and Innovation Agenda, April 2014). They can deliver novel algorithms and approaches such as advanced learning algorithms, predictive analytics mechanisms, etc. Data scientists should also have a deep knowledge of their businesses. The most difficult skills to find include advanced analytics and predictive analysis skills, complex event processing skills, rule management skills, business intelligence tools, and data integration skills (UNC, 2013).
22 “Measuring the Internet Economy: A Contribution to the Research Agenda”, OECD Digital Economy Papers, 2013
doi: 10.2759/72084 ISBN 978-92-76-19505-4
KK-01-20-355-EN-N