Project funded by the European Union’s Horizon 2020 Research and Innovation Programme (2014 – 2020)
Support Action
Big Data Europe – Empowering Communities with Data
Technologies
Project Number: 644564 Start Date of Project: 01/01/2015 Duration: 36 months
Deliverable 2.9
Report on Interest Groups Workshops VI
Dissemination Level Public
Due Date of Deliverable M36, 31/12/2017
Actual Submission Date M36+, 30/01/2018
Work Package WP2, Community Building & Requirements
Task TT2.1
Type Report
Approval Status Approved
Version 1.0
Number of Pages 37
Filename D2.9_Report on Interest Groups Workshops VI
Abstract: This report summarises the organisation of, and the results derived from, the last three Interest Group workshops organised during the reporting period (Societal Challenges 5 - Climate, Food & Agriculture, 3 - Energy, 1 - Health & Well-being), each carried out by the group associated with the respective societal challenge.
Ref. Ares(2018)550415 - 30/01/2018
D2.9 – v. 1.0
The information in this document reflects only the author’s views and the European Community is not liable for any use that may be made of the information contained therein. The information in this document is provided “as is” without guarantee or warranty of any kind, express or implied, including but not limited to the fitness of the information for a
particular purpose. The user thereof uses the information at his/her sole risk and liability.
History
Version Date Reason Revised by
0.0 02.12.2017 Placeholders Simon Scerri (FhG)
0.1 19.12.2017 SC5 Report Iraklis Angelos Klampanos (NCSR-D)
0.2 14.01.2018 SC3 Report Fragiskos Mouzakis (CRES)
0.3 19.01.2018 SC1 Report Kiera McNeice (OpenPHACTS)
0.4 26.01.2018 Cross-check with contributors Simon Scerri (FhG)
1.0 30.01.2018 Final Report Simon Scerri (FhG), Alexandra Garatzogianni (FhG)
Author List
Organisation Name Contact Information
Fraunhofer Simon Scerri [email protected]
Fraunhofer Alexandra Garatzogianni [email protected]
NCSR-D Iraklis Angelos Klampanos
CRES Fragiskos Mouzakis [email protected]
OpenPHACTS Kiera McNeice [email protected]
Executive Summary
In this deliverable we provide an in-depth report and material associated with the last round of
BDE workshops that have taken place between M34 and M36 (3 out of a total of 7 for 2017).
The reports include information about the participants, the sessions organised, the talks and
discussions as well as the gathered results (input for requirement elicitation). In addition,
material associated with the workshop, such as the agenda and the original invitation letter, is
also included. These reports supplement the reports of the 1st and 2nd series of workshops
covered in the first four deliverables in this series (D2.2 Report on Interest Groups Workshop I,
D2.5 Report on Interest Groups Workshop II, D2.6 Report on Interest Groups Workshop III and
D2.7 Report on Interest Groups Workshop IV) and the previous deliverable covering the first
round of 2017 societal workshops organised for the project (D2.8 Report on Interest Groups
Workshop V).
Abbreviations and Acronyms
SC Societal Challenge
EC European Commission
RE Requirement Elicitation
RS Requirement Specification
WP Work Package
Table of contents
1. Introduction
2. Third Round of Societal Workshops (II)
2.1 SC5.3 - Big Data in Climate Action, Environment, Resource Efficiency and Raw Materials (3rd Workshop)
2.1.1 Agenda
2.1.2 Report on Presentations and Discussions
2.1.3 Appendices
2.2 SC3.3 - Big Data Europe 3rd Workshop in Energy
2.2.1 Agenda
2.2.2 Workshop Scope and Structure
2.2.3 Appendices
2.3 SC1.3 - Big Data Europe Societal Challenge 1 Final Workshop: Health, Demographic Change and Wellbeing
2.3.1 Agenda
2.3.2 Objectives
2.3.3 Session I: BDE Project Results
2.3.4 Session II: Invited Talks
2.3.5 Session III: Open Discussion
2.3.6 Appendices
3. Summary
1. Introduction
This deliverable contains 3 reports for the third round of BigDataEurope workshops held in the
third and last year of the project:
1. SC5.3 - Big Data in Climate Action, Environment, Resource Efficiency and Raw
Materials (3rd Workshop)
2. SC3.3 - Big Data Europe 3rd Workshop in Energy
3. SC1.3 - Big Data Europe Societal Challenge 1 Final Workshop: Health, Demographic
Change and Wellbeing
A summary and a copy of the detailed report are provided for each workshop in the next section.
Each report has been circulated to all participants and other identified stakeholders via
multiple channels, including direct email, the project website and the newsletter.
2. Third Round of Societal Workshops (II)
The three below-described workshops are the last to be held in the third round of BDE
workshops in 2017, and the very last for the project. Workshop invitations were sent to
the identified stakeholders in multiple rounds. The workshops were designed around an
updated blueprint which was originally provided in Deliverable 2.1, with minor adjustments to
reflect the final round’s focus on dissemination of final pilot activities. A summary of workshop
details, plus the full workshop report, are included below.
2.1 SC5.3 - Big Data in Climate Action, Environment, Resource Efficiency
and Raw Materials (3rd Workshop)
The following table includes a summary of the workshop:
Date 06-07.11.2017
Venue Fondation Universitaire, Rue d'Egmont 11, Brussels, Belgium
Attendees (Total) 14
Attendees (Project Consortium & Project Officer - Replacement) 3
Attendees (Other) 11
Sessions 2
2.1.1 Agenda
09:30-10:00 Welcome by the organisers and identification of expectations
Session 1: The 2nd BigDataEurope Societal Challenge 5 (SC5) pilot
10:00-10:15 The BDE H2020 Project
Andreas Ikonomopoulos, National Centre for Scientific Research
"Demokritos"
10:15-10:30 The Role of SC5 in the BDE H2020 Project
Spyros Andronopoulos, National Centre for Scientific Research
"Demokritos"
10:30-11:00 The 2nd SC5 pilot: Background and Rationale
Spyros Andronopoulos, National Centre for Scientific Research
"Demokritos"
11:00-11:30 Coffee break and networking
11:30-12:15 The 2nd SC5 Pilot
Andreas Ikonomopoulos and Iraklis Klampanos, National Centre for
Scientific Research "Demokritos"
12:15-13:30 Lunch break and networking
Session 2: Big Data Applications
13:30-14:00 Big Data applications in emergency response and management
Patrick Armand, Atomic Energy and Alternative Energies Commission,
France
14:00-14:30 Nuclear Emergency Response and Big Data Technologies
Stella Moehrle, Karlsruhe Institute of Technology (KIT), Germany
14:30-15:00 Big Data Analytics in the Health Domain
Maria-Esther Vidal, Fraunhofer IAIS, Germany
15:00-15:30 Supporting Agile Research on the Boundaries of Data and
Computing
Iraklis Klampanos, National Centre for Scientific Research "Demokritos",
Greece
15:30-16:30 Round table and open discussion on "The present and future of
data-driven emergency response in meeting Societal Challenge 5"
2.1.2 Report on Presentations and Discussions
During this workshop, the BDE SC5 team presented the 2nd and 3rd pilots pertaining to the
use of big data technologies and the application of deep learning techniques for rapid
emergency response in the field of nuclear or radiological events. The 2nd pilot lays the
foundation for data-scientific approaches in weather clustering and atmospheric dispersion
data, with application in estimating the location of an unknown source emitting radioactive
substances. The 3rd pilot extends the application with semantic web technologies. This
provides decision makers and experts with diverse information relevant to emergency
response to emissions of hazardous substances, such as possibly affected cities and their
populations, hospitals and other public infrastructure in the immediate vicinity of the plume,
etc.
Invited speakers discussed fields bordering that of the SC5 pilots. These included the
relationship between big data and nuclear emergency response; the potential applicability of
big data in real-time decision support systems, for issues such as the assessment and
presentation of uncertainty, with particular application in the JRODOS (Java-based Real-time
Online DecisiOn Support) system; and atmospheric dispersion modelling within large urban
areas during emergencies related to releases of hazardous substances. The workshop also
included presentations touching upon the use of big data in other fields, such as health, as
well as the role of European e-infrastructures in meeting societal challenges.
The SC5 workshop was attended by an audience of 14, including the organisers (3). Attendees
included 4 EC officers (JRC). The 2nd BDE SC5 pilot and rationale have been described in
the NERIS Platform Proceedings 2017, p.40. The 3rd BDE SC5 pilot has been demonstrated
live at the ISWC 2017 Conference, where it was awarded Best Demo - People’s Choice. A
video of the prototype developed can be found on the BDE YouTube channel.
2.1.3 Appendices
2.1.3.A Slides & Presentations
1. Big Data Europe: Concept, Platform and Pilots, Andreas Ikonomopoulos, NCSR
“Demokritos”
2. The Role of SC5 in the BDE Project, Spyros Andronopoulos, NCSR “Demokritos”
3. The 2nd SC5 Pilot: Background and Rationale, Spyros Andronopoulos, NCSR
“Demokritos”
4. Second SC5 Pilot: Identifying the Release Location of a Substance, Andreas
Ikonomopoulos and Iraklis Klampanos, NCSR “Demokritos”
5. Big Data Applications in Emergency Response and Management, Patrick Armand,
Atomic Energy and Alternative Energies Commission
6. Nuclear Emergency Response and Big Data Technologies, Stella Moehrle, Karlsruhe
Institute of Technology
7. Big Data Analytics in the Health Domain, Maria-Esther Vidal, Leibniz Information
Centre for Science and Technology University Library
8. Supporting Agile Research on the Boundaries of Data and Computing, Iraklis
Klampanos, NCSR “Demokritos”
2.1.3.B Photos
The available photo is embedded in this report.
2.1.3.C Follow-up Post
A follow-up blogpost/message was shared on the BDE website.
2.1.3.D Attendees
A list of attendees is not available.
2.2 SC3.3 - Big Data Europe 3rd Workshop in Energy
The following table includes a summary of the workshop:
Date 28.11.2017
Venue RAI, Amsterdam, Netherlands
Attendees (Total) 36
Attendees (Project Consortium & Project Officer - Replacement) 4
Attendees (Other) 32
Sessions 4
2.2.1 Agenda
13:00 - 14:30 Session I: Introduction and Review
13:00 – 13:15 Welcome & Introduction
● Introducing attendees, workshop goals (CRES)
● BigDataEurope: Scope and opportunities (Ms Maria-Esther Vidal, Fraunhofer
IAIS, BDE Coordinators)
13:15 – 13:25 Review talks
● EU Research & Innovation priorities on Energy and Digitalization (Mr. Mark
van Stiphout, DG ENER C.2, Dep. Head of Unit)
13:25 – 14:30 Asset management and Big Data
● Cloud based Wind Analytics (Mr. Masoud Asgarpour, Team Lead Analytics,
VATTENFALL)
● Wind Farm monitoring with advanced analytics (Mr Peter Clive, WoodGroup)
● Options for Wind Farm performance assessment and Power forecasting (Mr.
A. Kyritsis, Analyst, ALTSOL/TERNA)
14:30 – 15:10 Session 2: BDE Open Platform
● BDE Platform: components, application fields, how to install and use (Mr. Ivan
Ermilov, IT Researcher, University of Leipzig)
● Round table discussion for Big Data tools and BDE platform use
15:10 – 15:20 Coffee break - networking
15:20 – 16:40 Session 3: System monitoring
● Big Data in Wind Turbine Condition Monitoring: Leveraging Physics with Big
Data (Prof. Jan Helsen, Drivetrain monitoring and Big Data coordinator, Vrije
Universiteit Brussel, OWI Lab)
● Data management challenges in WT testing and monitoring (Mr. Gorka
Gainza, Engineering Manager, ARESSE)
● BDE Pilot case for Wind Turbine condition monitoring research (BDE, Mr. F.
Mouzakis, Head Wind Turbine Testing, CRES)
16:40 – 16:55 Session 4: Round table discussion
● Future data management and analytics needs in Wind Energy
● BDE platform prospects: development, application and maintenance
● Future research opportunities
16:50 – 17:00 Summary, Outreach & Farewell
2.2.2 Workshop Scope and Structure
The Big Data Europe consortium organised the third workshop on big data in the energy sector
on the 28th of November, 2017 in Amsterdam, within the WindEurope Conference and
Exhibition 2017. This workshop provided a key opportunity for stakeholders in the energy
sector to be updated on the latest developments related to BigDataEurope’s platform and pilot
cases and discuss future application cases for the benefit of the energy community.
The workshop addressed a wide audience including data users, researchers, developers, IT
service providers and institutions in the wind energy domain.
The WindEurope Conference and Exhibition in Amsterdam was attended by 8,000 wind
industry players. The conference opened with high-level politicians and C-level industry
leaders debating industry trends and outlook, followed by a comprehensive programme of
parallel sessions featuring general and scientific presentations relating to onshore and
offshore wind energy. BDE was also presented at the conference and at the CRES exhibition
area (1E83).
The primary aim of the workshop was the presentation of the BDE platform and the pilot cases
along with the identification of current and future challenges for data management and analysis
in the energy domain; challenges to be tackled with the evolving Big Data technology via BDE.
In the third workshop the discussion focused on wind energy, and real examples of the
challenges and complexities of using big data in this field were discussed.
The outcome of the workshop was the dissemination of the performed work to a wide audience,
along with the presentation of data-related challenges in wind energy and the encouragement
of the community to exploit BDE opportunities.
The workshop was divided into four parts, described in the following paragraphs.
Session I: Introduction and Review: The general introduction to the BDE background,
objectives and targets, as well as an overview of the tools and technologies envisaged within
the project, was presented by the BDE coordinator (Prof. Maria-Esther Vidal, Fraunhofer/IAIS).
The review on EU priorities on Energy R&I and Digitalization was presented by Mr. Mark van
Stiphout (Dep. Head DG ENER C.2).
Session II: Keynote presentations on asset data management and Big Data use
Reviews of the data management challenges and current solutions for utilities and developers
were presented by Mr. Masoud Asgarpour, representing a major utility with renewable assets
(VATTENFALL) and Mr. A. Kyritsis representing an IT service provider (ALTSOL) and a wind
energy developer and operator (TERNA).
The analytics challenges in wind energy asset data and new methodologies were presented
by Mr. Peter Clive, representing a service provider (WoodGroup, UK).
Session III: BDE Open Platform
The BDE platform (i.e. architecture, implementation, guidelines) was presented by the BDE
technical partner Mr. Ivan Ermilov (University of Leipzig). The presentation included sample
implementation examples.
Session IV: Wind Energy System monitoring
The data management and analytics challenges and applied solutions in the field of system
and condition monitoring were addressed by Prof. Jan Helsen representing a research institute
(OWI Lab, Belgium) and Mr. Gorka Gainza representing a service provider in the field of testing
(ARESSE, Spain).
The pilot use case for advanced condition monitoring was presented by F. Mouzakis
representing a research institute (CRES).
The workshop concluded with a discussion on the topic of future research opportunities, BDE
platform prospects and candidate applications in the field of Wind Energy.
2.2.2.1 Session I: Introduction and Review
Prof. Maria-Esther Vidal presented an overview of the BDE project and the actions related to SC3.
The following were addressed:
- Stakeholder engagement process and outreach activities
- Review of stakeholder requirements focusing on data volume, velocity, variety and data
infrastructure efficiency parameters
- Identification of the data value chain requirements, from data generation to data-driven
services, per domain
- Overview presentation of the BDE integrator platform and its instances
- Positioning of BDE within the BDVA reference model
- Future steps for the BDE platform (maintenance, follow-up projects, future technical
seminars and standardisation efforts)
The concluding messages focused primarily on the suitability of BDE for a variety of societal
challenge cases, its future prospects and opportunities, the reliable maintenance strategy
foreseen, and the positive prospects offered by the current projects that include the BDE
platform.
Mr. Mark van Stiphout (Dep. Head ENER C.2) presented the priorities of EU R&I in energy
digitalization and described its major aims: exploiting the opportunities offered by digital
technologies towards the Digital Single Market and the Energy Union, and increasing the
digital capacity of the energy sector in order to increase renewable penetration and optimise
energy use. Emphasis was placed on the key calls in the forthcoming work programme, which
address, among other topics, the optimisation of asset management through data management,
processing and analysis. In the topic of enabling the next generation of smart energy
services, a key issue is the conceptual use of big data generated by equipment and sensors,
enabling accurate measurement, control and optimisation.
In the concluding messages the following were pointed out:
- In the new electricity market design (increase of active consumers, demand response,
competitive prices etc) the new technologies are expected to play an important role
- Support tools are foreseen in the energy domain under H2020 for the innovative application
of the new technologies
2.2.2.2 Session II: Asset Management and Big Data
Mr. Masoud Asgarpour (Team Lead Analytics at VATTENFALL) presented the data
management challenges of a utility and an application of cloud-based wind analytics.
The presentation covered a utility's data management framework for wind energy assets,
from data acquisition to operational dashboards. The presented use case concerned power
forecasting. The components used and the analysis procedures were outlined.
Mr. Peter Clive (Sgurr/WoodGroup, UK) presented the case for windfarm performance
monitoring with advanced analytics.
The following were addressed:
- Outline of active asset management
- Description of data available for asset management
- Presentation of Response Deficit Analysis as a tool for the identification of non-normal
asset operation
- Application results
Mr. A. Kyritsis (ALTSOL/TERNA) presented the data management and analytics approach of
a wind energy developer for asset performance and power forecasting. The presenter is
an IT service provider who supports the largest wind energy developer in Greece (with assets
also in the USA, Poland and Bulgaria) in the field of data management and analytics.
The following were presented:
- Integrated data management challenges and solutions for developer’s assets
- Description of data streams
- Power forecasting procedure
- Architecture and technologies used
- Presentation of future challenges and IT technologies under consideration
The concluding messages regarded:
- An integrated data management system allows RES operators to extract value from their
data
- Aggregating (scaling up) data components improves forecasting accuracy, although the gain
depends on the correlations between the individual assets
- Apart from volume and ML challenges, IT solutions have to expose the right information
to the relevant target group in a timely and appropriately formatted manner
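The second point above, on aggregation and correlation, can be illustrated with a small statistical sketch. It is not taken from the presentation: it assumes, purely for illustration, that every asset has the same forecast-error standard deviation and that every pair of assets has the same error correlation rho. Under those assumptions the error of the aggregated forecast, relative to the sum of the individual errors, is sqrt((1 + (N - 1) * rho) / N). The function name and the figures used are hypothetical.

```python
import math


def relative_aggregate_error(n_assets, rho):
    """Error std. dev. of the summed forecast, relative to the sum of per-asset std. devs.

    Illustrative assumptions (not from the workshop): all assets share the
    same error standard deviation, and every pair of assets has the same
    error correlation rho.
    """
    return math.sqrt((1 + (n_assets - 1) * rho) / n_assets)


# Ten hypothetical wind farms: the less correlated their errors,
# the more the errors cancel out when the farms are aggregated.
for rho in (0.0, 0.5, 1.0):
    print(rho, round(relative_aggregate_error(10, rho), 3))
```

With uncorrelated errors the ratio falls to roughly a third for ten assets, while fully correlated errors bring no benefit from aggregation, which is consistent with the dependence on asset correlations noted in the concluding messages.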
The presentations extensively covered wind energy asset management in relation to
data management and analytics. The cases are representative of operators' challenges
and accurately describe the domain, which is characterised by the following:
- Current asset management is based on SCADA data (low frequency as a rule, higher
frequency in specific cases)
- Analytics need to be developed to extract specific information from the data; in
most cases the analytics solutions are tailored to the available data streams and are
thus, in some cases, limited in reliability and value
- As IT developments open up new opportunities for extracting value, the design of data
sources and streams will be revisited, since higher volumes and rates can be
accommodated
- IT service providers are interested in utilising open tools (such as BDE)
2.2.2.3 Session III: BDE Open Platform
Mr. Ivan Ermilov, representing BDE’s core technical team (University of Leipzig), presented a
detailed technical overview of the BDE platform. The presentation covered the Big Data
Integrator (BDI) platform (technical architecture, components and interfaces), guidelines for
installation and usage, and sample use cases.
The following were presented:
- BDI architecture, with a detailed description of the main points, namely the use of Docker
containers, the BDI support and the semantification layer.
- The operation and interfacing between hardware, resource manager and applications
- BDE components with respect to their application, covering data storage, data indexing
and coordination, visualisation and user interfacing
- BDI stack and its lifecycle concept
- Platform installation and deployment with descriptive screencasts
- Summary of BDE use cases
The attending community members were given the opportunity to assess the potential of the
platform and received the information needed to obtain the technical details for its
installation and use.
Contacts were made between the technical attendees and BDE partners.
2.2.2.4 Session IV: System Monitoring
Prof. Jan Helsen (Drive train monitoring and Big Data coordinator, OWI Lab) presented the
data management and analytics approach for drive train health monitoring.
The following were presented:
- Rationale behind the use of Big Data in energy
- Requirements for data driven design validation, critical phenomena interpretation and
risk minimization
- Wind turbine reliability challenge: failure modes, condition monitoring and status log
analysis
- Use case for automated condition monitoring system
In the concluding messages the following were pointed out:
- Physics-based approaches using big data are of added value to design and monitoring
- Condition monitoring on long-term data sets is required for trend tracking
- Vibration data augmented with temperature data analysis
- Status log pattern mining for detecting episodes in turbine event sequences
Mr. Gorka Gainza (Engineering Manager of ARESSE) presented the data management and
analytics approach for the complete testing of wind energy assets.
The following were presented:
- Wind Turbine testing scene (procedures, systems and data)
- Requirements for a data management system oriented towards giving customers full access to
data streams
- Solution aspects (DAQ systems, cloud storage, analytics, campaign dashboards and
workspaces)
The main concluding message was that the presenter, as a service provider in the field of
asset testing, recognised that, beyond data acquisition and static analysis, there is value in
transferring information to the client as close to on-line as possible, and that new IT
developments are needed for this.
Mr. F. Mouzakis (CRES) presented the BDE pilot case in the field of system monitoring in wind
energy. A brief introduction to the data acquisition challenge of monitoring at high data
volumes and sampling rates, along with the use case problem definition and scenario, was
followed by a presentation of the technical aspects of the pilot.
The following were presented:
- Requirements in system monitoring
- Typical data acquisition core components, technologies and architecture
- Description of the monitored system (wind turbine)
- The sensor network and the distributed data acquisition system
- Data description (type, format, volume etc)
- Base analytics
- Pilot concept and structure
- Sample results
The specific aspects of the pilot, namely the volume of the data, the analysis requirements and
the need to incorporate third-party analytics modules, were presented.
The presentations extensively covered the topic of wind energy asset testing and condition
monitoring. In contrast to SCADA data, this field concerns the management and analysis of
voluminous data, in some cases with higher velocity. Addressing the challenges of these cases
opens new opportunities for services and research.
2.2.3 Appendices
2.2.3.A Slides & Presentations
1. Workshop Agenda
2. H2020 Priorities in Energy R&I: Energy System Digitalization (Mr. Mark van Stiphout,
DG ENER C.2 Dep. Head) - Note: This presentation is not uploaded to the BDE
project Slideshare account.
3. BDE review: Scope and Opportunities (Prof. Maria-Esther Vidal, BDE coordinator,
Fraunhofer IAIS)
4. Wind Farm Monitoring and advanced analytics (Mr. Peter Clive, WoodGroup)
5. Options for Wind Farm performance assessment and Power forecasting (Mr. A.
Kyritsis, ALTSOL/TERNA)
6. BDE Platform: Technical overview (Mr. Ivan Ermilov, University of Leipzig)
7. Big Data in Wind Turbine Condition Monitoring (Prof. Jan Helsen)
8. Data management in WT testing and monitoring (Mr. Gorka Gainza, ARESSE)
9. BDE Pilot case for Wind Turbine condition monitoring research (Mr. F. Mouzakis,
CRES)
2.2.3.B Photos
Photos are available in the respective workshop folder.
2.2.3.C Follow-up Post
A follow-up blogpost/message was shared on the BDE website.
2.2.3.D Attendees
The following table is the list of attendees that participated in the workshop:
2.3 SC1.3 - Big Data Europe Societal Challenge 1 Final Workshop: Health,
Demographic Change and Wellbeing
The following table includes a summary of the workshop:
Date 13.12.2017
Venue KOWI, Brussels, Belgium
Attendees (Total) 16
Attendees (Project Consortium & Project Officer - Replacement) 4
Attendees (Other) 12
Sessions 3
2.3.1 Agenda
11:00-11:10 Welcome and Coffee
11:10-12:15 Session I: BDE Project Results
11:10-11:25 Introduction: Big Data Europe Simon Scerri, Fraunhofer
IAIS
11:25-12:00 Live demo: The Big Data Integrator Jonathan Langens,
Tenforce
12:00-12:15 The SC1 pilot: Open PHACTS Kiera McNeice, Open
PHACTS Foundation
12:15-12:45 Lunch and Networking
12:45-14:00 Session II: Invited Talks
12:45-13:15 Invited Keynote: The MIDAS Project Michaela Black, Ulster
University
13:15-13:45 Invited Keynote: BigMedilytics Supriyo Chatterjea,
Philips Research Europe
13:45-14:00 Invited Keynote: IAISIS Guillermo Palma, L3S
14:00-15:00 Session III: Open Discussion & Closing
14:00-14:50 The Future of Big Data in Health Open Discussion
14:50-15:00 Wrap-up
2.3.2 Objectives
The third workshop for SC1 (Health, Demographic Change and Wellbeing) was also the final
workshop in the Big Data Europe project. As such our aims for this workshop were twofold.
Firstly, as the final workshop for SC1, our aim was to showcase the final version of our SC1
pilot and its applications to early bioscience research data. Beyond this, we aimed to engage
with ongoing and future projects to discuss the future of big data in health. Health is a complex
domain with a large number of academic, private and global entities operating in this space;
there are many challenges still to be addressed in this domain, and we aimed to host a lively
discussion about where big data in health may lead us in the years to come, and what
roadblocks it might run into.
Secondly, as the final workshop for the BDE project as a whole, our aim was to showcase the
BDI itself, as the primary technical result (support) of this 3-year Coordination and Support
Action. We invited a core technical project member to present a live demonstration of the
functionality of the BDI, and to engage attendees in discussion about how they might deploy it
for their own needs.
2.3.3 Session I: BDE Project Results
The workshop began with an overview of the Big Data Europe project. Simon Scerri presented
an overview of the big data landscape when the project started, and the needs of different
stakeholders with regards to volume, velocity, and variety of data, as well as infrastructures
and data value chain requirements. Several common requirements were identified in the health
domain, which implied that a generic data solution might be valuable in this domain.
Simon then presented the major results of the BDE project, from both a community and
infrastructure perspective. He gave an overview of the Big Data Integrator and its seven
implemented instances across the seven societal challenges, as well as a view to what could
be done with the BDI in follow-up projects and other initiatives – for example in standardisation
efforts within BDVA.
2.3.3.1 The Big Data Integrator (Live Demo)
A key aim of this workshop was to present a working live demo of the Big Data Integrator to
our attendees, to demonstrate how BDE has worked to lower barriers to entry for people
interested in working with big data. Jonathan Langens (TenForce) first explained how the
various components of the BDI are designed to help users build, set up, deploy, and monitor
big data pipelines. He then gave a live demonstration of how to build a big data pipeline using
the Stack Builder, which allows users to drag and drop components into a stack to build a
Docker Compose file; the Workflow Builder, which allows users to create steps and conditions
to determine which Docker Compose files are initiated in which order; and the Swarm UI, which
facilitates monitoring of the pipeline, including options such as live monitoring with a Kibana
dashboard.
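The two building steps described above can be sketched in a few lines of Python. This is a minimal illustration of the idea only, not the BDI's actual code or file formats: the function names, stack names and image names are hypothetical, and the output is a plain mapping with a top-level "services" key in the style of a Docker Compose file. The first function mimics what a stack builder produces from a set of chosen components; the second mimics a workflow that decides the order in which stacks are started.

```python
import json
from graphlib import TopologicalSorter


def build_compose(components):
    """Turn {service name: image} pairs into a minimal Compose-style mapping."""
    return {"services": {name: {"image": image} for name, image in components.items()}}


def start_order(steps):
    """Order stacks so that each one starts after the stacks it waits for.

    `steps` maps a stack name to the set of stacks that must be started first.
    """
    return list(TopologicalSorter(steps).static_order())


# Hypothetical component and image names, for illustration only.
ingest_stack = build_compose({
    "namenode": "example/hdfs-namenode",
    "datanode": "example/hdfs-datanode",
})

# Hypothetical workflow: analysis waits for ingest, the dashboard waits for analysis.
steps = {"ingest": set(), "analysis": {"ingest"}, "dashboard": {"analysis"}}
order = start_order(steps)

print(json.dumps(ingest_stack, indent=2))
print(order)
```

The stack is printed as JSON simply to avoid a third-party YAML dependency in the sketch. The dependency-ordering step is one plausible reading of "steps and conditions"; the actual Workflow Builder may model conditions differently.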
2.3.3.2 The SC1 Pilot (Results)
The final version of the SC1 Pilot was presented by Kiera McNeice (OpenPHACTS), who gave
an overview of the background of the Open PHACTS Discovery Platform and its usefulness
for early stage drug discovery. By semantically linking data across multiple open
pharmacological databases, focussing on questions asked by real researchers, the platform
can help significantly reduce the time and cost of researchers’ queries.
As the SC1 pilot, the Open PHACTS Discovery Platform has been successfully re-built for
BDE, using all open components and allowing users to install the entire platform on a local
machine. This makes the Open PHACTS functionality even more accessible to researchers
from academia, SMEs and industry, as well as allowing for integration with wider platforms
using BDE architecture, meaning the platform has increased flexibility, scalability and
extensibility.
Attendees were particularly interested in whether the platform had been extended or
connected to other data across different Societal Challenges. Although this was not possible
to achieve during the lifetime of the project, it could well be a possibility in the future. We briefly
discussed the effort required to create the semantic mappings within Open PHACTS,
particularly the difficulty of refreshing datasets on a regular basis. But similar principles could
in theory be applied to mappings between other heterogeneous datasets, for example patient
data.
2.3.4 Session II: Invited Talks
2.3.4.1 MIDAS Project
Dr Michaela Black (University of Ulster) was invited to present the MIDAS project, which aims
to use big data to support public health policy. The project seeks to connect as many different
kinds of data as possible, including data volunteered by individuals. The project is then creating an
integration layer and visualisation tools to help policymakers and citizens understand
connections between data, with the ultimate goal of informing policy decisions and enhancing
health outcomes for individuals in policy regions.
A key need for policymakers is to be able to connect as many kinds of data relevant to
healthcare as possible – not only from health agencies, insurance companies, and care
providers, but also data about local infrastructure and schools, and even information from
social media data in order to understand public perceptions of new policies. These data can
potentially be used to evaluate new and existing policies, providing evidence to review them.
However, there is a lack of tools to “close the loop” and help policymakers and citizens
understand and make decisions based on these data. Furthermore, policymakers do not tend
to have a data science or technical background, and rarely have access to data scientists.
MIDAS has found that there is a need not only for linking data and creating visualisation tools,
but also for education among end users about how data can be visualised, and the risks and
limitations of how it can be interpreted.
Notes from follow-up discussion
The MIDAS project is also working with personal data volunteered by citizens. Our previous SC1
workshop made clear that working legally and ethically with personal data will be a major
challenge to realising the benefits of big data in healthcare, and the topic again became a
key point of discussion among workshop attendees.
One such challenge is anonymisation. The first questions that arose were whether standards of
anonymisation exist, and how to handle the inevitable trade-off between data integrity,
granularity and the risk of reverse identification. Dr Black gave the example of Finland, where
many individuals have been happy to volunteer their personal data for enhanced healthcare;
Finland has existing standards under which a lot of healthcare data has been anonymised and
released to researchers. However, linking datasets is a risk to anonymity, and there is some
question of how future-proof existing standards are. Even anonymised datasets can raise ethical
challenges; Dr Black gave the example of a case where anonymised data was used to build a
process intended to prevent children with Down’s syndrome being born. Although this was legal
under GDPR rules, parents of children whose data was used in this process were very unhappy
with the way it was used.
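The trade-off between granularity and the risk of reverse identification can be illustrated with a small sketch. It uses k-anonymity (the size of the smallest group of records that share the same quasi-identifying values) as the risk measure; this measure was not named in the workshop and is chosen here only as an assumption for illustration. The records, field names and the ten-year age band are all invented.

```python
from collections import Counter


def smallest_group(records, quasi_identifiers):
    """Size of the smallest group of records sharing the same quasi-identifier values.

    A value of 1 means at least one record is unique on those fields and is
    therefore easier to re-identify by linking with another dataset.
    """
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())


def generalise_age(record, width):
    """Return a copy of the record with the exact age replaced by the start of its age band."""
    return {**record, "age": (record["age"] // width) * width}


# Synthetic records, invented purely for this illustration.
records = [
    {"age": 34, "region": "A", "diagnosis": "x"},
    {"age": 36, "region": "A", "diagnosis": "y"},
    {"age": 52, "region": "B", "diagnosis": "x"},
    {"age": 57, "region": "B", "diagnosis": "z"},
]

# With exact ages every record is unique on (age, region).
k_exact = smallest_group(records, ["age", "region"])

# Coarsening age into ten-year bands loses detail but enlarges the groups.
banded = [generalise_age(r, 10) for r in records]
k_banded = smallest_group(banded, ["age", "region"])
```

In this toy case, coarsening the age field raises the smallest group size from 1 to 2, at the cost of no longer knowing exact ages. A check of this kind says nothing about the ethical concerns raised above, which apply even to data that is well anonymised.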
Attendees discussed alternative options to anonymisation, such as creating synthetic data to
use as a representative replica for real data. The MIDAS project is looking into proof-testing
this idea, as ultimately policymakers are not interested in individuals’ data, just in the
underlying metadata that is relevant to policy-level decisions.
Another potential solution raised was for honest brokers to act as intermediaries. Dr Black gave
examples of Northern Ireland and the Basque region where the honest broker solution is being
considered: researchers would identify the data they want, and subject to approval, honest
brokers could gather and connect data from different sources, remove identifiable information,
anonymise the data, and allow it to be analysed in a restricted in-house environment.
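The honest broker steps described above (gather, connect, remove identifiable information) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a description of how the Northern Ireland or Basque proposals work: the field names, the `DIRECT_IDENTIFIERS` set, the `broker_link` helper and the use of a keyed hash as a pseudonym are all hypothetical. Note also that replacing an identifier with a pseudonym is pseudonymisation, which is weaker than full anonymisation.

```python
import hashlib
import hmac

# Hypothetical set of fields the broker treats as directly identifying.
DIRECT_IDENTIFIERS = {"name", "address"}


def pseudonym(person_id, secret_key):
    """Derive a stable pseudonym from an identifier using a key held only by the broker."""
    digest = hmac.new(secret_key, person_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def broker_link(sources, secret_key, id_field="person_id"):
    """Link rows from several sources on a shared identifier and drop identifying fields."""
    linked = {}
    for source in sources:
        for row in source:
            token = pseudonym(row[id_field], secret_key)
            merged = linked.setdefault(token, {"pseudonym": token})
            for field, value in row.items():
                # Keep only fields that are neither the raw identifier nor directly identifying.
                if field != id_field and field not in DIRECT_IDENTIFIERS:
                    merged[field] = value
    return list(linked.values())


# Synthetic example data, invented for this illustration.
health = [{"person_id": "p1", "name": "Alice Example", "diagnosis": "asthma"}]
housing = [{"person_id": "p1", "address": "1 Example Street", "housing_type": "flat"}]

released = broker_link([health, housing], secret_key=b"held-only-by-the-broker")
```

Researchers would see only the `released` records, in which the two sources are connected through the pseudonym while names, addresses and the original identifier are withheld. The restricted in-house analysis environment mentioned above is an organisational control and is not modelled here.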
Attendees were curious whether MIDAS had made any interesting discoveries through linking
data so far. Dr Black said that a few interesting trends had been observed – for example that
Northern Ireland appears to be on a trajectory similar to the USA’s in drug usage and obesity
– but that at the moment data is bringing in more questions than answers. An important part
of MIDAS’s work will focus on helping policymakers understand what kinds of questions they
can ask of datasets, and how to effectively use visualisation tools to ask the right questions
and obtain useful and reliable answers.
2.3.4.2 BigMedilytics Project
Dr Supriyo Chatterjea (Philips Research Europe) was invited to present the BigMedilytics
project, which aims to take a holistic view of healthcare and to improve outcomes in the
healthcare industry. The project is taking a very broad perspective of healthcare, with 35
partners from a wide range of backgrounds, and ultimately aims to increase productivity in the
healthcare sector by 20% by applying and adapting state-of-the-art big data techniques and
algorithms.
The project has identified the main disease groups that will be responsible for the greatest
burdens on society in the near future, currently responsible for 78% of deaths in Europe. These
have been split into two themes: oncology, and population health and chronic disease
management. A third project theme will focus on the industrialisation of healthcare during
treatment phases in hospital. The project is addressing these three themes by working with
key experienced partners on a series of 12 pilots (e.g. Comorbidities, Prostate Cancer), each
of which will connect datasets through a flexible architecture to address that pilot’s specific
challenge.
Notes from follow-up discussion
Again the first question raised by attendees was around the ethics of personal data –
specifically, the risk that one or more partners may not be able to resolve the ethics issues
around data needed for their pilot. Dr Chatterjea explained that in fact, this is not the first time
BigMedilytics has been submitted as a project – and one of the reasons it failed previously was
that they could not guarantee all the data they wanted would be accessible and usable. To
address this issue BigMedilytics refocussed on partners who have already worked with similar
datasets, and planned out ethics procedures in advance; the project has actually been delayed
to address ethical considerations and to ensure, as far as possible, that data will be available
from day one of the project.
Workshop attendees discussed the reasons why people may not want their data used by the
healthcare industry, and whether it makes sense for consent around personal data to be binary
(a blanket ‘yes’ or ‘no’). One attendee suggested that some people may be willing, for example,
to supply data for use by not-for-profit projects and research, but not to help an insurance
company refine its tariff schemes.
Although the GDPR imposes very specific constraints when asking for consent to use personal
data, this raises challenges of its own: for example, one attendee raised the possibility of
discovering a new use for data halfway through a project, and not having specific consent to
investigate that use. This could limit the ability of projects like BigMedilytics to discover and
explore new avenues for improving healthcare outcomes; formulating questions of consent
and what happens to derivative data will be an important consideration.
In a more specific use case, one challenge to improving workflows when dealing with stroke
patients is that many stroke patients arrive in hospital in no condition to give their consent to
having their progress through the healthcare system tracked. Attendees discussed whether in
such situations data might be collected, and then only used if and when consent is obtained
from the patient, and discarded otherwise.
Finally, Dr Black suggested that BigMedilytics also approach charities and voluntary
organisations who ‘pick up the pieces’ when patients return home from hospital treatment, as
they often play a key role in consistency of care, and deliver real face-to-face value for patients.
2.3.4.3 iASiS Project
Guillermo Palma (L3S Research Center) was invited to present iASiS, which aims to integrate
data into a big data framework to support personalised medicine. This framework will be based
on the BDE framework, with a semantically enriched layer used to populate a knowledge graph
which users can then query.
The project is working on two pilots, which will focus on lung cancer and Alzheimer’s disease.
In each case the goal is to combine a variety of data sources including electronic health
records, genomic data, bibliographic data, as well as publicly available pharmacological
databases such as Gene Ontology and DrugBank. Data will be further enriched by using natural
language processing and data mining techniques on clinical notes, and deep learning to
analyse medical images, and generate predictive models.
Notes from follow-up discussion
Once again, workshop attendees were interested in how iASiS plans to handle the ethics of
dealing with medical data at a personal level – in particular whether focussing on two targeted
pilots helped to simplify issues. Mr Palma explained that in each pilot, the project is working
closely with partners who already own enough data to work with, for example with a hospital
specialised in lung cancer. Data also remains under the control of the respective partners for
privacy control, with no copies made on iASiS’s platform.
Further questions from attendees were around whether iASiS is working with existing
ontologies or generating new ones. Mr Palma explained that the project will do both: relying
partly on existing ontologies such as MeSH, and creating its own ontology to build its
knowledge graphs.
2.3.5 Session III: Open Discussion
During the course of the afternoon, the presentations given by each invited speaker evolved
naturally into discussions of the challenges and opportunities facing big data in health. We
allowed these discussions to run over the time allotted to each project presentation, as
interesting thoughts and ideas were being raised. Although this left less time than planned
for the open discussion at the end of the afternoon, many key points had already been covered.
Simon Scerri (Fraunhofer IAIS) opened the final discussion by noting that personal data was
a topic that had come up throughout the day. Although BDE didn’t deal with any personal data
in the SC1 pilot, this is clearly a problem affecting a lot of potential big data applications in
health, and could become more complicated once the GDPR comes into force.
More generally, we asked attendees for their views on whether stakeholders in the health
domain would consider using the BDI, or building on BDE as part of their project infrastructure,
as iASiS has done. The general view among attendees was that stakeholders must be convinced
there is a benefit to switching to a new infrastructure, or adopting a specific infrastructure, and
that this requires building up trust. Key factors stakeholders are likely to consider would be
whether an infrastructure can help them solve broader issues beyond just being involved in a
specific project, and what costs and risks might be associated with that infrastructure in the
future – for example whether there are likely to be expensive licensing fees, and whether the
infrastructure will be supported in the case of any bugs or issues.
On the broader question of the way forward for big data in health, there was general agreement
that healthcare depends on many factors beyond just health data. Cross-project collaborations
involving food, lifestyle, and infrastructure to support healthy living could make health data
much richer, especially over the long term. One example given was looking at data about the
usage of rental bicycles in major cities, if the owners are willing to make that data public.
This led to discussion of how data owners can be encouraged to share data, in which most
agreed there is a need to identify win-win situations. Unless all parties feel like they are
benefiting, they may not share their data, in which case everyone loses out in the end. Points
were again raised about the complexities of complying with rules around consent to use
personal data – especially in the context of consortiums, where citizens may not realise that
giving one company access to their data may make it available to others in the consortium. Dr
Black gave an example of a Finnish consortium which is looking at an app where users can
‘consent-in’ or ‘consent-out’ at any stage of a project, to help reduce uncertainty in cases like
these. Ultimately giving power to individuals, and building trust with individuals, seem to be
important steps in treating personal health data ethically.
Finally, attendees discussed the fact that more and more projects and solutions are trying to
create bench-to-bedside pipelines in the healthcare domain, in a variety of different use cases.
If BDE’s infrastructure can be adapted for such closed data cases, it could definitely have
applications in these areas.
2.3.6 Appendices
2.3.6.A Slides & Presentations
1. Introduction: Big Data Europe (Simon Scerri, Fraunhofer IAIS)
2. Live Demo: The Big Data Integrator (Jonathan Langens, TenForce) - Note: A recording
of a live demo is planned to be added to the BDE Website.
3. The SC1 Pilot: Open PHACTS (Kiera McNeice, Open PHACTS Foundation)
4. Invited Keynote: The MIDAS Project (Michaela Black, University of Ulster)
5. Invited Keynote: BigMedilytics (Supriyo Chatterjea, Philips Research Europe)
6. Invited Keynote: iASiS (Guillermo Palma, L3S Research Center)
2.3.6.B Photos
A photo from the workshop is included in this report.
2.3.6.C Follow-up Post
A follow-up blog post was shared on the BDE website.
2.3.6.D Attendees
The following table lists the attendees who participated in the workshop:
No. Name Surname Organisation
1 Kiera McNeice BDE (Open PHACTS Foundation)
2 Simon Scerri BDE (Fraunhofer IAIS)
3 Jonathan Langens BDE (TenForce)
4 Michaela Black Speaker (University of Ulster)
5 Supriyo Chatterjea Speaker (Philips Research Europe)
6 Guillermo Palma Speaker (L3S Research Center)
7 Aldo Camargo Technopark Peru
8 Vasily Epishkin Permanent Mission of the Russian Federation to NATO
9 Ilias Iakovidis EC/DG CONNECT
10 Violeta Isabel Perez Nueno EC/DG CONNECT
11 Hans-Joerg Lutzeyer EC/DG RTD.F3
12 Jana Makedonska EC/DG RTD
13 Cédric Peeters Vrije Universiteit Brussel
14 Saila Rinne EC/DG CONNECT
15 Paola Saura Zabala Innovation Consulting
16 Gregor Schaffrath EC
3. Summary
The reports provided in this deliverable cover the last three BDE WP2 workshops, which took
place between M34 and M36. These reports supplement the reports of the 1st and 2nd series of
workshops covered in the first four deliverables in this series (D2.2 Report on Interest Groups
Workshop I, D2.5 Report on Interest Groups Workshop II, D2.6 Report on Interest Groups
Workshop III and D2.7 Report on Interest Groups Workshop IV), and the report covering the
first four workshops held in the third round in 2017 (D2.8 Report on Interest Groups Workshop
V).