Talk outline
• The BigDataEurope action
• The Big Data Integrator platform
• Pilots across all seven H2020 challenges
• Upcoming BDE Activities
18-oct.-16 www.big-data-europe.eu
Big Data Europe (CSA: 2015-17)
• Show societal value of Big Data
  o Across all societal challenges addressed by Horizon 2020
• Lower barrier for using big data technologies
  o Effort and resources to convert tools and workflows
  o Skills and expertise
• Help establish data value chains
  o Across languages, organizations, and domains
Stakeholder Engagement
• Present action, showcase deployments
• Raise awareness about BDE results, what they mean for stakeholders
• Collect requirements to drive further development
[Timeline: M6, M12, M18, M24, M30]
Data Value Chain Evolution
[Figure: data value chain evolution over Stage 1, Stage 2, and Stage 3]
Value chain steps: Extraction, Curation, Quality, Linking, Integration, Publication, Visualization, Analysis
Domains: Health, Transport, Security, Food, Societies, Climate, Energy
Other labels: Data Repositories, Linked Open Data Cloud
Architecture
• Big Data Integrator (BDI):
  o The prototype developed by BDE
• Main points of the architecture
  o Dockerization
  o Support layer, including integrated UI
  o Semantification layer
Big Data Integrator
• Plug-and-play BD Platform
• Cloud-deployment ready
• Domain independent, Customisable
• Bundles Open Source solutions
• First Version Released!
Docker containers
• Docker offers lightweight virtualization
  o Docker containers can be shared to be provisioned on different Linux variations and versions
• Identical base system not required
• All BDI components: Docker containers
BDI components
• Processing and storage components
  o Re-used existing Docker containers where available
  o Dockerized by BDE otherwise
  o Ensured all can be provisioned through Docker Swarm
• Components by BDE:
  o Support Layer
  o Semantic Layer
Support Layer
• BDE defines uniform UI stylesheets
  o Web UIs from BDE Docker containers (including for third-party components) follow these BDE stylesheets
• BDE-developed tools:
  o Starting containers and dependencies
  o Monitoring execution
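Starting containers together with their dependencies amounts to ordering a dependency graph. The following is a minimal sketch of that idea, not the BDE tooling itself: the service names and dependency edges are hypothetical, and the ordering uses Python's standard `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical example: each service maps to the services it depends on.
# These names and edges are illustrative, not taken from the BDI itself.
DEPENDENCIES = {
    "spark-worker": {"spark-master"},
    "spark-master": {"hdfs-namenode"},
    "hdfs-datanode": {"hdfs-namenode"},
    "hdfs-namenode": set(),
}


def startup_order(dependencies):
    """Return services in an order where every dependency starts first."""
    return list(TopologicalSorter(dependencies).static_order())


order = startup_order(DEPENDENCIES)
print(order)
```

Any order returned this way starts `hdfs-namenode` before the services that need it; a cyclic dependency would raise `graphlib.CycleError` instead of producing an order.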
Semantic data lake
• Minimal ingestion pre-processing
• Semantic layer maintains metadata
• Add meaning when retrieving/processing
[Figure labels]
Data Lake: scalable unstructured data store
Relationship definitions and metadata
JSON-LD, CSVW, R2RML, XML2RDF
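The idea of storing data as-is and adding meaning only at retrieval time can be illustrated with a small sketch. It assumes a hypothetical column-to-property mapping held as metadata (loosely in the spirit of CSVW); the IRIs are `example.org` placeholders, not a BDE vocabulary, and the function names are invented for illustration.

```python
import csv
import io

# Raw data is stored untouched; this CSV string stands in for a file in the lake.
RAW_CSV = "station,temp_c\nA1,12.5\nB2,9.0\n"

# Hypothetical metadata kept by the semantic layer: column -> property IRI.
# The IRIs are placeholders under example.org, not a BDE vocabulary.
COLUMN_PROPERTIES = {
    "station": "http://example.org/prop/station",
    "temp_c": "http://example.org/prop/temperatureCelsius",
}
BASE_IRI = "http://example.org/row/"


def read_as_triples(raw_csv, column_properties, base_iri):
    """Attach meaning at retrieval time: yield (subject, predicate, value) triples."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    for row_number, row in enumerate(reader, start=1):
        subject = f"{base_iri}{row_number}"
        for column, value in row.items():
            yield (subject, column_properties[column], value)


triples = list(read_as_triples(RAW_CSV, COLUMN_PROPERTIES, BASE_IRI))
for triple in triples:
    print(triple)
```

The raw CSV is never rewritten; changing the mapping changes the interpretation of the same stored bytes.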
BDE Docker Containers
• Data serving: HDFS, Cassandra, 4store, PostGIS, Strabon, Elasticsearch, Hive, Semagrow
• Processing: Spark, Flink, SANSA
• Stream ingestion middleware: Flume, Kafka
Semantic layer tools
• BDE tooling for Semantic Data Lake:
  o Swagger: Semantics of RESTful APIs
  o Semantic Analytics Stack (SANSA): Distributed data processing for large-scale RDF data
  o Semagrow: SPARQL perspective over Big Data stores
SC1: Pharmacology research
Life Sciences & Health
• Extensive toolset developed by OPF and others
• Query a large number of datasets, some large
• Existing elaborate ingestion and homogenization by the OpenPHACTS Foundation
SC2: Viticulture resources
Food and Agriculture
• AgInfra is a major infrastructure for agriculture researchers, serving cross-linked bibliography, data, and processing services
• Pilot automates publication ingestion and thematic classification
SC3: Predictive maintenance
Energy
• Wind turbine monitoring applies computational models to sensor data streams
• Models are re-parameterized weekly using the week’s data from multiple turbines
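Weekly re-parameterization can be pictured as refitting a model on the pooled readings of all turbines for that week. The sketch below is only an illustration under strong assumptions: it swaps in a simple least-squares line for the pilot's actual computational models (which the slides do not describe), and the turbine names and readings are made up.

```python
def fit_line(samples):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def reparameterize(week_data):
    """Pool one week of (x, y) samples from all turbines and refit the model."""
    pooled = [s for turbine_samples in week_data.values() for s in turbine_samples]
    return fit_line(pooled)


# Made-up readings: (wind speed, power output) per turbine for one week.
week = {
    "turbine-1": [(4.0, 9.0), (6.0, 13.0)],
    "turbine-2": [(5.0, 11.0), (8.0, 17.0)],
}
slope, intercept = reparameterize(week)
print(slope, intercept)
```

Running `reparameterize` once per week on the newest data yields fresh parameters that the streaming part of the pipeline could then apply to incoming sensor readings.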
SC4: Traffic conditions estimation
Transport
• Estimation of real-time traffic conditions in Thessaloniki
• Combines:
  o Traffic modelling from historical data
  o Current measurements from a taxi fleet of 1200 vehicles
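One simple way to combine a historical estimate with current probe measurements is a weighted blend whose weight grows with the number of fresh readings. This is a sketch under assumptions, not the pilot's method: the function name, the `full_trust_at` threshold, and the linear weighting rule are all hypothetical.

```python
def estimate_speed(historical_kmh, current_readings_kmh, full_trust_at=10):
    """Blend a historical estimate with the mean of current probe readings.

    The weight on current data grows with the number of readings and is
    capped at 1.0 once `full_trust_at` readings are available (an assumed rule).
    """
    if not current_readings_kmh:
        return historical_kmh
    current_mean = sum(current_readings_kmh) / len(current_readings_kmh)
    weight = min(1.0, len(current_readings_kmh) / full_trust_at)
    return weight * current_mean + (1.0 - weight) * historical_kmh


no_data = estimate_speed(40.0, [])
few_taxis = estimate_speed(40.0, [20.0] * 5)
many_taxis = estimate_speed(40.0, [20.0] * 10)
print(no_data, few_taxis, many_taxis)
```

With no taxis on a road segment the estimate falls back to the historical model; as more vehicles report, the current measurements dominate.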
SC5: Climate modelling
Climate
• Discovering and re-using previously computed derivatives
  o Lineage annotation: datasets and model parameters used to compute derivative datasets
  o Finding appropriate past runs avoids repeating weeks-long modelling runs
• Preparing modelling experiments
  o Slicing, transforming, combining datasets into new datasets
  o Submission to and retrieval from modelling infrastructure
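Re-using previously computed derivatives relies on being able to look up a past run by its lineage, that is, by the input datasets and model parameters that produced it. A minimal sketch of such a lookup follows; the `LineageIndex` class, the dataset names, and the parameter names are hypothetical, and an in-memory dictionary stands in for the real metadata store.

```python
import hashlib
import json


def lineage_key(input_datasets, parameters):
    """Stable fingerprint of the inputs and parameters that define a run."""
    canonical = json.dumps(
        {"datasets": sorted(input_datasets), "parameters": parameters},
        sort_keys=True,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class LineageIndex:
    """In-memory stand-in for a lineage metadata store (hypothetical)."""

    def __init__(self):
        self._runs = {}

    def record(self, input_datasets, parameters, derivative_id):
        self._runs[lineage_key(input_datasets, parameters)] = derivative_id

    def find(self, input_datasets, parameters):
        """Return the derivative of a matching past run, or None."""
        return self._runs.get(lineage_key(input_datasets, parameters))


index = LineageIndex()
index.record(["obs-2015", "terrain"], {"resolution_km": 10}, "derivative-001")
hit = index.find(["terrain", "obs-2015"], {"resolution_km": 10})
miss = index.find(["terrain", "obs-2015"], {"resolution_km": 5})
print(hit, miss)
```

A hit means the derivative dataset can be retrieved directly; only a miss requires submitting a new modelling run.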
SC5 Pilot: Points Demonstrated
Climate
• Existing infrastructure and stable, reliable software for parallel computation of models
• BDI is deployed as an external infrastructure for preparing and managing datasets
• BDI offers:
  o Hive for managing data in a way that can be retrieved and manipulated, rather than as file blocks
  o Cassandra for storing structured and textual metadata, used for searching headers and lineage
SC6: Municipality budgets
Social Sciences
• Ingestion of budget and budget execution data
• Multiple municipalities in varied formats and data models
• Homogenized data made available for analysis and comparison
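Homogenizing budget data from several municipalities typically means one small converter per source format, each mapping into a shared schema. The sketch below illustrates this with two invented sources; the municipality names, field names, units, and the target schema are all assumptions and do not reflect the pilot's actual data models.

```python
def from_city_a(record):
    """City A (hypothetical) reports amounts in euros under English keys."""
    return {
        "municipality": "city-a",
        "category": record["category"].strip().lower(),
        "amount_eur": float(record["amount"]),
    }


def from_city_b(record):
    """City B (hypothetical) reports amounts in thousands of euros."""
    return {
        "municipality": "city-b",
        "category": record["kategorie"].strip().lower(),
        "amount_eur": float(record["betrag_tsd"]) * 1000.0,
    }


# Made-up input records, each paired with the converter for its source.
raw = [
    (from_city_a, {"category": "Education ", "amount": "1500000"}),
    (from_city_b, {"kategorie": "education", "betrag_tsd": "900"}),
]
homogenized = [convert(record) for convert, record in raw]
total_education = sum(
    r["amount_eur"] for r in homogenized if r["category"] == "education"
)
print(total_education)
```

Once every record shares the same keys and units, comparison across municipalities reduces to ordinary filtering and aggregation.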
SC7: Change detection & verification
Secure Societies
• Events are extracted from text published by news agencies and on social networking sites
• Events are geo-located and relevant changes are detected by comparing current and previous satellite images
UPCOMING BDE ACTIVITIES
Integrating Big Data, Software & Communities for Addressing Europe’s Societal Challenges
2nd round of Societal Workshops
Transport   22 September 2016     Brussels   Collocated with Big Data for Transport, Tisa workshop
Food&Agri   30 September 2016     Brussels   Collocated with DG AGRI WP2018-20 stakeholder consultation
Energy      4 October 2016        Brussels   Collocated with EC H2020 Info Day on “Smart Grids and Storage”
Climate     11 October 2016 (1)   Brussels   Collocated with Melodies Project Event – Exploiting Open Data
Health      19 October 2016       Brussels   Standalone Workshop
Security    18 October 2016       Brussels   Standalone Workshop
Societies   5 December 2016       Cologne    Collocated with EDDI16 - 8th Annual European DDI User Conference
Other Activities
• Hands-on BDE pilots workshop
  o Apache Big Data Europe, Seville, 14-16 Nov
  o Enable BD technology practitioners to try out BDI & components
  o To fine-tune technical BDI requirements
• Various SC-focussed and general hangouts, follow!
  o Apache Flink & BDE (20 Oct) – Free Webinar
WEB: www.big-data-europe.eu
EMAIL: [email protected]
BIG DATA INTEGRATOR: www.github.com/big-data-europe
PROJECT COORDINATION
Prof. Sören Auer, auer © cs.uni-bonn · de (Fraunhofer IAIS)
Dr. Simon Scerri, scerri © cs.uni-bonn · de (Fraunhofer IAIS)
EIS Department/Group, Fraunhofer IAIS & CS Department Uni-Bonn, Bonn, Germany
Fraunhofer IAIS: Leads Fraunhofer Big Data Alliance
Questions & Contacts
#BigDataEurope