21
BIG DATA EUROPE HTTP://WWW.BIG-DATA-EUROPE.EU/ Integrating Big Data, Software & Communities for Addressing Europe’s Societal Challenges CMG-AE Tagung Big Data: Strategien, Technologien und Nutzen 19. Mai 2015, Expat Center der Wirtschaftsagentur Wien

Introduction to: Big Data Europe Project

Embed Size (px)

Citation preview

BIG DATA EUROPEHTTP://WWW.BIG-DATA-EUROPE.EU/

Integrating Big Data, Software & Communities for Addressing Europe’s Societal Challenges

CMG-AE Tagung

Big Data: Strategien, Technologien und Nutzen

19. Mai 2015, Expat Center der Wirtschaftsagentur Wien

Semantic Web Company

(SWC)SWC was founded 2001, head-quartered in Vienna

25 experts in linked data technologies

Product: PoolParty Semantic Suite (launched 2009)

Serving customers from all over the world

EU- & US-based consulting services

Semantic Web Company

(SWC)Some of our Customers

● Credit Suisse

● Boehringer Ingelheim

● Roche

● Wolters Kluwer

● BMJ Publishing Group

● Red Bull Media House

● Canadian Broadcasting Corporation (CBC)

● Pearson

● Council of the EU

● DG Environment, EC

● Healthdirect Australia

● Ministry of Finance (Austria)

● World Bank Group

● Inter-American Development Bank (IADB)

● International Atomic Energy Agency (IAEA)

● Buildings Performance Institute Europe

(BPIE)

● Renewable Energy & Energy Efficiency P

(REEEP)

● Global Buildings Performance Network

(GBPN)

● American Physical Society

● Education Services Australia (ESA)

Finance / Automotive / Publisher / Health Care / Public Administration / Energy /

Education

Selected Partners● EBCONT

● EPAM Systems

● iQuest

● PwC

● Tenforce

● OpenLink Software

● Ontotext

● MarkLogic

● Gravity Zero

● Altotech

● Wolters Kluwer

● Taxonomy Strategies

● Digirati

● Fraunhofer (IAIS)

● University of Leipzig

(INFAI)

● The Open Data Instizute

We all have one goal in mind: Make machines smart enough so that

they can help us to find those needles in the haystack, which are

really relevant to us.

The Motivation – Big Data

Every day, we create 2.5 quintillion bytes of data — so much that 90% of the data in the

world today has been created in the last two years alone.

This data comes from everywhere: sensors used to gather climate information, posts to

social media sites, digital pictures and videos, purchase transaction records, and cell phone

GPS signals to name a few.

This data is big data. Source:

IBM

The Motivation – Big Data

BIG

DATA

Big Data Dimensions

Volume

Velocity

Variety

100010101010101010101010

010100101010101010010100

101010010100101010010100

101001010100101010100101

010001010101010010101010

101010101001010101010101

001010101110001010101010

101010101001010010101010

101001010010101001010010

101001010010100101010010

101010010101000101010101

00101010101

1000101010

1010101010

1010010100

1010101010

1001010010

…….

………….

……

………..…….

.

……………

1 0

1000101010

1010101010

1010010100

1010101010

100101oo11

Veracity!

Big Data in Europe: Challenges, Opportunities

HealthClimateEnergyTransportFoodSocietiesSecurity

Loremipsumdolors

KSDJOPSCKKSDKAB

LKASJL

LAWWD

S

wpweppe

pwpisio

we

10101001101010010101

0

Regional Data Repositories10101001101010010101

0

10101001101010010101

0

#2: Interlink, Centralise Access, Explore

101010100101010101001011010001010101010010101010100001011010001010101010010101010100100101010100101010101001011010

001010

Data Eleme

nt

#3: Analyse, Discover, Visualize#4: Mashup, Cross-domain Exploitation

Journalists Authorities

Big Data in Europe: Obstacles

19-mai-15

#1 Big Data “Variety“ problem Multiple Data Sources

Required: Integration, Harmonisation

#2 Opening-up Data concerns Loss of control, lack of tracking

Reservations about large corporations

#3 Limited Skills, Training, Technology

Lack of Data Scientists

Lack of Generic Architectures, components

Big Data in Europe: Obstacles

19-mai-15

Extraction, Curation Quality, Linking, Integration

Publication, Visualization, Analysis

Extraction, Curation, Quality, Linking, Integration, Publication,

Visualization, Analysis

Health

Transport

Security

Extraction Curation Quality Linking Integration Publication Visualization Analysis

Data Repositories Linked Open Data Cloud

Stage 1

Stage 2

Stage 3

Food SocietiesClimate Energy

BDE Partners

Rationale

Show societal value of Big Data

Lower barrrier for using big data technologies

o Required effort and resources

o Limited data science skills

Help establishing cross-

lingual/organizational/domain Data Value

Chains19-mai-15

Rationale

COORDINATION

Stakeholder Engagement

(Requirements Elicitation)

SUPPORT

Design, Realise, Evaluate

Big Data Aggregator

Platform

Create and Manage Societal

Big Data Interest Groups

Cloud-deployment ready

Big Data Aggregator

Platform

CSA

Measures

Results

Summary

Two clearly defined coordination and support measures:

Coordination: Engaging with a diverse range of stakeholder groups representing particularly

the Horizon 2020 societal challenges Health, Food & Agriculture, Energy, Transport, Climate,

Social Sciences and Security; Collecting requirements for the ICT infrastructure needed by data-

intensive science practitioners tackling a wide range of societal challenges; covering all aspects

of publishing and consuming semantically interoperable, large-scale data and knowledge assets;

Support: Designing, realizing and evaluating a Big Data Aggregator platform infrastructure

that meets requirements, minimises disruption to current workflows, and maximises the

opportunities to take advantage of the latest European RTD developments (incl. multilingual data

harvesting, data analytics & visualisation).

BigDataEurope will implement and apply two main instruments to successfully realize these

measures:

Build Societal Big Data Interest/Community Groups in the W3C interest group scheme &

involving a large number of stakeholders from the Horizon 2020 societal challenges as well as

technical Big Data experts;

Design, integrate and deploy a cloud-deployment-ready Big Data aggregator platform

comprising key open-source Big Data technologies for real-time and batch processing, such as

Orthogonal Dimensions of Big Data

EcosystemsGeneric Big Data Enabling Technologies

Data Value Chain

Data Generation & Acquisition

Data Analysis & Processing

Data Storage & Curation

Data Visualization &

Usage

Data-driven Services

So

cie

tal

Ch

all

en

ge

s

Do

ma

in S

pe

cifi

c D

ata

Ass

ets

& T

ech

no

log

y

Healthcare

Food Security

Energy

Intelligent Transport

Climate & Environment

Inclusive & Reflective Societies

Secure Societies

BDE Stakeholder Engagement Approach &

Activities

BDE Community Tools – JOIN IN NOW !

• Website: news, events, community, …

• 7 x BDE W3C Community Groups

• 7+1x Mailing Lists

• 7 x SC Workshops/Year = 21 Workshops

• Full set of communication tool-set…

Future Outlook

• BDE Aggregator Platform

• For download / internal use

• Cloud Version

• Big Data Technology Support Tools

Domains, Focus Areas & Data

AssetsSocietal Domain Preliminary Big Data Focus area Selected Key Data assets

Life Sciences &

Health

Heterogeneous data Linking &

integration

Biomedical Semantic Indexing & QA

ACD Labs / ChemSpider, ChEBI, ChEMBL, Con-ceptWiki, DrugBank, EN-

ZYME, Gene Ontology, GO Annotation, Swis-sProt, UniProt, Wik-iPathways,

PubMed, MeSH, Disease Ontology (DO), Joint Chemical Dic-tionary

(Jochem), Bio-ASQ datasets

Food &

AgricultureLarge-scale distributed data integration

INFOODS, AQUASTAT Green Learning Network (GLN), Agricultural

Bibliography Network (ABN), AGRIS, AquaMaps, Fishbase

Energy

Real-time monitoring, stream

processing, data analytics, and

decision support

European Energy Exchange Data, smart meter measurement data,

gas/fuels/energy market/price data, consumption statistics, equipment

condition monitoring data)

TransportStreaming sensor network & geo-

spatial data integration

GTFS data, OSM/ LinkedGeoData, MobilityMaps, Transport sensor data,

ROSATTE Road safety attributes, European Road Data Infrastructure -

EuroRoadS

ClimateReal-time monitoring, stream

processing, and data analytics.

European Grid Infrastructure (EGI), Databases hosting atmospheric data.

Several software frameworks for simulation, calibration and reconstruction.

Social SciencesStatistical and research data linking &

integration

Federated social sciences data catalogs, statistical data from public data

portals and statistical offices (e.g. EuroStats, UNESCO, WorldBank)

Security

Real-time monitoring, stream

processing, and data analytics.

Image data analysis

Earth Observation data (e.g. Very High Resolution Satellite Imagery acquired

from commercial providers and governmental systems) and collateral data

for supporting CFSP/CSDP missions and operations, Databases hosting

atmospheric Data. Experimental and simulation data concerning dispersion

of hazardous substances

Work Packages & Implementation

Phases

Community Building

M1-M12 M13-M24 M25-M36

Enabling Technologies

Component Integration

Uptake

Integrator Deployment

Community Assessment

WP3 – Big Data Generic Enabling Technologies & Architecture

WP5 – Big Data Integrator Instances

WP7 – Dissemination & Communication

WP2 – Community Building & Requirements

WP4 – Big Data Integrator Platform

WP6 – Real-life Deployment & User Evaluation

Blueprint of the Data Aggregator

Platform

Batch Layer

Speed Layer

Data Storage

Real-time data &

Transactions …

Batch View

Real-time

View

me

ssa

ge

pa

ssin

g

message passing

Applications & Showcases

Real-time dashboards

Domain-specific BDE apps

Big Data Analytics

In-stream Mining

BD

E P

latfo

rm&

Inte

lligence

Input data

Stream

Spatial

Social

Statistical

Temporal

Transaction

al

Imagery

+ Semantic Layer (Retaining Semantics using LD

approach )

Lambda Architecture

Announcements….

Free: European Data Economy WorkshopFocus Data Value Chain & Big Data, 15.9.2015 WU ViennaREGISTER NOW: bit.ly/Data-Economy-WS-SEMANTiCS2015