
D 6 - FERARI Project · Poslovna inteligencija: Leader in business intelligence 10 90 employees - 90% project engineers, technical and business consultant, 10% sales and administration



ICT, STREP

FERARI ICT-FP7-619491

Flexible Event pRocessing for big dAta aRchItectures

Collaborative Project

D 6.3

Project Presentation 03.02.2014 – 30.04.2014

Contractual Date of Delivery: 30.04.2014

Actual Date of Delivery: 30.04.2014

Author(s): Michael Mock

Institution: Poslovna Inteligencija d.o.o.

Workpackage: WP6

Security: PU

Nature: O

Total number of pages: 37


Project coordinator name: Michael Mock

Project coordinator organisation name: Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS)

Schloss Birlinghoven, 53754 Sankt Augustin, Germany

Revision: 1

URL: http://www.iais.fraunhofer.de

Abstract:

This document is the FERARI deliverable of WP6 for the first review period (03.02.2014 – 30.04.2014). The project presentation gives an overview of the FERARI project, including its goals, the project partners and the workpackage organization.

Revision history

Administration Status

Project acronym: FERARI ID: ICT-FP7-619491

Document identifier: D 6.3 Project Presentation (03.02.2014 – 30.04.2014)

Leading Partner: Poslovna Inteligencija d.o.o.

Report version: 1

Report preparation date: 10.04.2014

Classification: PU

Nature: OTHER

Author(s) and contributors: Michael Mock (FHG)

Status:

- Plan

- Draft

- Working

- Final

x Submitted

Copyright

This report is © FERARI Consortium 2014. Its duplication is restricted to personal use within the consortium and the European Commission.

www.ferari-project.eu


Flexible Event pRocessing for big dAta aRchItectures (FERARI)


Introduction


FERARI – An FP7 EC ICT project

Grant Agreement No. 619491

STREP Specific Targeted Research Project

Grown out of FP7 basic research project LIFT (FET Open)

FERARI was ranked 6th of 33 proposals within objective 4.2 Scalable Data Analysis

• February 2014 – January 2017, Funding: 2.95 Mio. EUR


FERARI - Consortium

Technion (Technion) + Haifa University

Technical University of Crete (TUC)

T-Hrvatski Telekom (HT)

Fraunhofer IAIS (FHG)

IBM (IBM)

Poslovna Inteligencija (PI)


Fraunhofer IAIS: Intelligent Analysis and Information Systems

270 people: scientists, project engineers, technical and administrative staff

Located on Fraunhofer Campus Schloss Birlinghoven/Sankt Augustin near Bonn

Joint research groups and cooperation with

Lead researcher: Dr. Michael Mock

„From sensor data to business intelligence, from media analysis to visual information systems: Our research allows companies to do more with data“

Institute Director: Prof. Dr. Stefan Wrobel


Technical University of Crete

Founded in 1977 in Chania, Crete

120 faculty members, ~175 adjunct faculty and lab personnel

2900 undergraduate and 550 graduate students

Around 200 research programs, total budget 20.5 million

ECE department: 25 faculty, ~200 undergrad students/year

Research organized in 10 research laboratories

SoftNet Lab (headed by Prof. Garofalakis): Focus on Big Data Analytics, Data Streams, Cloud Computing

Lead researcher: Prof. Minos Garofalakis


TECHNION – Israel Institute of Technology and University of Haifa

Located in Haifa, oldest University in Israel (1912)

600 Faculty Members (3 Nobel Laureates)

Computer Science: 50 faculty members, 1500 Students

Lead researchers: Prof. Assaf Schuster, head of the Technion Computer Engineering Center, focus on Distributed and Scalable Data Mining, Monitoring Distributed Data Streams, Big Data Technologies and Analytics; and Dr. Daniel Keren, Department of Computer Science at Haifa University

The Technion-Israel Institute of Technology is a major source of the innovation and brainpower that drives the Israeli economy, and a key to Israel’s reputation as the world’s “Start-Up Nation.” Its three Nobel Prize winners exemplify academic excellence.


IBM Research – Haifa

350 people: scientists, software engineers, subject matter experts

Located in Haifa, Israel, on the campus of Haifa University

The largest IBM Research Lab outside the USA

Lead researcher: Fabiana Fournier

IBM Research is the innovation branch of IBM. Its motto is “the world is our lab”.


T-Hrvatski Telekom: Communication, Information & Entertainment, Always & Everywhere

T-HT Group is the leading provider of telecommunications services in Croatia and the sole company to offer the full range of these services: it combines the services of fixed and mobile telephony, data transmission, Internet and international communications

T-HT’s strategy: GROW - COMPETE – TRANSFORM

Key figures for 2012:

Revenues: 991 mio EUR

EBITDA margin: 45,3%

5780 employees

Lead representative: Maja Vekić-Vedrina

“T-HT - to be the online company and to power the online society and digital economy in Croatia and the Region”


Poslovna inteligencija: Leader in business intelligence

90 employees - 90% project engineers, technical and business consultants, 10% sales and administration

HQ in Zagreb, Croatia, offices in UK, Slovenia, Serbia, Bosnia and Herzegovina and Montenegro

Extensive experience in the telecommunication industry and in R&D Big Data projects

Lead representative: Dražen Oreščanin

„We provide our customers with the best possible service in strategic consultancy and in implementation of intelligent information systems for decision support, thereby helping them to create new values and identify new business opportunities.“


Motivation

A number of recent technological developments have started to change our world forever:

• the rise of the internet

• the ever growing amount of activities in social networks

• the widespread adoption of smart phones and other mobile devices

• the instrumentation of the world with sensors.

This is accompanied by dropping prices for computers, networks, and storage.


Objectives

Provide support for large scale services by making the sensor layer a first class citizen in Big Data architectures.

Provide support for Complex Event Processing technology for business users in Big Data architectures.

Provide support for integrating machine learning tasks in the architecture.

Provide support for flexible and adaptive analytics workflows.

Exemplify the potential of the new architecture in the telecommunication and the cloud domain.


Use cases

Monitoring a smart energy grid.

Analysing the traffic state of a large city using car-to-car communication.

Monitoring the quality of a telecommunication network.

Detecting latent failures in a large cloud of thousands of machines.

Inspecting potentially fraudulent credit card transactions in real-time and blocking these transactions when necessary.


Application Scenarios

Mobile Phone Fraud Detection: detecting mobile phone fraud by analysing usage patterns

• Reliably detect mobile phone fraud

• Avoid financial losses due to fraud

• Scalability to millions of events/sec for simple filtering, and fewer for more complex analysis (depending on the complexity of the task)

Cloud Health Monitoring: cloud data centre activity log monitoring

• Possibility to replace time-interval-based maintenance by event-based maintenance

• Avoiding service down-time


Negotiation Question: Data Size


Quantity of data

The average monthly number of rated call detail records is > 650 mio, and the total monthly quantity of data is > 300 GB. For raw call details, the monthly quantities are significantly higher: > 5500 mio records and > 10 TB of data.

Cloud services are one of the recently implemented services in Hrvatski Telekom. The number of cloud servers and of customers using cloud services is still fairly low, but the numbers are rapidly increasing. Currently, the cloud consists of 6 machines, which produce a total of > 40 GB of data per month. During the course of this project, we expect that the cloud might double its current size.
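As a rough back-of-the-envelope check, the monthly figures above can be converted into average per-second rates and average record sizes. The sketch below assumes a 30-day month and decimal units (1 GB = 10^9 bytes); neither is stated in the source. Because the source gives lower bounds (">"), the results are lower bounds on averages and say nothing about peak load.

```python
# Assumptions (not from the source): 30-day month, decimal units.
SECONDS_PER_MONTH = 30 * 24 * 3600
GB = 10**9
TB = 10**12


def records_per_second(records_per_month: float) -> float:
    """Average event rate implied by a monthly record count."""
    return records_per_month / SECONDS_PER_MONTH


def bytes_per_record(total_bytes: float, records: float) -> float:
    """Average size of one record."""
    return total_bytes / records


# Figures quoted on the slide (all are lower bounds).
RATED_RECORDS, RATED_BYTES = 650e6, 300 * GB
RAW_RECORDS, RAW_BYTES = 5500e6, 10 * TB
CLOUD_MACHINES, CLOUD_BYTES = 6, 40 * GB

if __name__ == "__main__":
    print(f"rated CDRs: {records_per_second(RATED_RECORDS):.0f} records/s, "
          f"{bytes_per_record(RATED_BYTES, RATED_RECORDS):.0f} bytes/record")
    print(f"raw CDRs:   {records_per_second(RAW_RECORDS):.0f} records/s, "
          f"{bytes_per_record(RAW_BYTES, RAW_RECORDS):.0f} bytes/record")
    print(f"cloud logs: {CLOUD_BYTES / CLOUD_MACHINES / GB:.1f} GB per machine per month")
```

Under these assumptions the raw call details average a little over 2,000 records per second, which is far below the "millions of events/sec" scalability target. The monthly totals alone therefore do not show where the load peaks.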


FERARI success criteria

The project’s success will be rigorously measured by the following validation criteria:

• Communication reduction with respect to global/state-of-the-art solutions.

• Processing time relative to the size of the data.

• For monitoring applications, the number of false alarms.

• Number of domains to which the approach can be deployed. A key to this is the variety aspect enabled by Distributed Complex Event Processing.

• Flexibility. The system will be designed such that it can adapt to new, unforeseen circumstances and is easily consumable.


Workpackages


Work Plan

Phase 1 (M1 – M12): use case definition, component definition, architecture definition

Phase 2 (M13 – M24): component refinement, first use case prototype implementation, first architecture implementation

Phase 3 (M25 – M34): demonstration and evaluation of the impact of the methods developed in this project


Workpackage Structure

WP1 – Use Cases

WP2 – Architecture

WP3 – Communication Efficient, Low-Latency Methods

WP4 – Flexible Event Processing

WP5 – Robust Distributed Stream Monitoring

* WPs 6 and 7, which will interact with all WPs for dissemination and management tasks, have been left out to increase readability. The general flow of dependencies is top-down from the use cases to the architecture and methodological work. Architecture and methods interact iteratively, since there are many technical and methodological dependencies.


FERARI - Workpackages

[Diagram labels: Stream processing; Communication efficient processing; Complex event processing; Software Platform; Prototype; "provides"]


WP1: Application Scenarios, Test bed, Prototype

Objectives:

Selecting and defining the application scenarios fraud mining and cloud health monitoring

Definition of testing & evaluation criteria for the end users at HT

Setting up of a test bed both at HT and at the project partner’s local sites

Implementation and evaluation of scenarios in a prototype to demonstrate the advantage of FERARI with respect to the state of the art as well as to demonstrate its business value


WP2: Big Data Streaming Architecture & Technology Integration

Objectives:

Define a Big Data architecture that makes the sensor layer a first class citizen of the architecture,

Define a data and control flow that can implement a push based approach, so that processing can be partially done in situ,

Provide methods for robust distributed stream processing including online machine learning

Implement the architecture as a software platform (open source).


Architecture Diagram of FERARI


Event processing deals with these functions:

• get events from sources (event producers).

• route these events, filter them, normalize or otherwise transform them, aggregate them, detect patterns over multiple events (event processing agents).

• transfer events as alerts to a human or as a trigger to an autonomous adaptation system (event consumers).
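The three roles above (producers, event processing agents, consumers) can be illustrated with a small generator pipeline. This is only a sketch and not the FERARI architecture: the Event fields, the function names, the sample readings and the "three consecutive high CPU readings" pattern are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Event:
    source: str   # hypothetical field names, chosen for this example
    kind: str
    value: float


def produce(readings: Iterable[Tuple[str, str, float]]) -> Iterator[Event]:
    """Event producer: wrap raw (source, kind, value) tuples as events."""
    for source, kind, value in readings:
        yield Event(source, kind, value)


def keep(events: Iterable[Event], predicate: Callable[[Event], bool]) -> Iterator[Event]:
    """Filtering agent: pass on only the events that satisfy the predicate."""
    return (e for e in events if predicate(e))


def normalize(events: Iterable[Event], scale: float) -> Iterator[Event]:
    """Transformation agent: rescale values, e.g. percent to a 0..1 fraction."""
    for e in events:
        yield Event(e.source, e.kind, e.value / scale)


def detect_runs(events: Iterable[Event], threshold: float, run_length: int) -> Iterator[Event]:
    """Pattern agent: emit one alert when a source has produced `run_length`
    consecutive events (counted per source) above `threshold`."""
    streak: Dict[str, int] = {}
    for e in events:
        if e.value > threshold:
            streak[e.source] = streak.get(e.source, 0) + 1
        else:
            streak[e.source] = 0
        if streak[e.source] == run_length:
            yield Event(e.source, "alert", e.value)


def consume(alerts: Iterable[Event]) -> List[str]:
    """Event consumer: turn alerts into messages for a human operator."""
    return [f"ALERT {a.source}: load {a.value:.2f}" for a in alerts]


def run_pipeline(readings: Iterable[Tuple[str, str, float]]) -> List[str]:
    cpu_only = keep(produce(readings), lambda e: e.kind == "cpu")
    scaled = normalize(cpu_only, scale=100.0)
    return consume(detect_runs(scaled, threshold=0.9, run_length=3))


SAMPLE = [
    ("node1", "cpu", 95.0), ("node2", "cpu", 20.0), ("node1", "heartbeat", 1.0),
    ("node1", "cpu", 97.0), ("node1", "cpu", 99.0), ("node2", "cpu", 30.0),
]

if __name__ == "__main__":
    print(run_pipeline(SAMPLE))
```

Each stage consumes and yields events lazily, so routing, filtering, transformation and pattern detection compose in any order without buffering the whole stream.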


WP2: Big Data Streaming Architecture & Technology Integration - TASKS

The tangible output of WP2 will be the definition of the software big-data architecture allowing for the integration of components for complex event processing, in-situ processing and robust distributed stream processing including online machine learning. In addition, the architecture will be provided as a software platform.


Interdependencies between WP1 & WP2

WP2: Software Platform (open source)

• General purpose for communication efficient big-data stream analysis algorithms

• Flexible event processing

• Components as libraries

• Interfaces to plug in concrete algorithms (learning, monitoring)

• In-stream learning

• CEP language

[Diagram labels: Software Platform; Prototype; Plugin concrete algorithms]


WP3: Communication Efficient, Low-Latency Methods

Objectives:

develop in-situ processing methods that go beyond current methods

develop new algorithms that are able to efficiently detect granular events

identify and explore the right level of in-situ processing for scalability issues


In-Situ Processing (LIFT)

Coordinator monitors the global threshold (example: all nodes of a cloud work in a “healthy” state)

Sensors (nodes) monitor their local Safe-Zone in situ

Alarm message only if the local Safe-Zone is violated

Global Condition / Reference Point

Local Condition / Safe-Zone

Resolution protocol (after violation)
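The safe-zone idea can be sketched for the simplest case, a threshold on the sum of local values. This is an illustrative simplification and not the LIFT method itself: each node gets a local bound, the bounds add up to the global threshold, and a node only talks to the coordinator when its own bound is exceeded. The class and function names, the slack-rebalancing rule and the message accounting (one message per node for each resolution) are assumptions made for this example.

```python
from typing import List, Sequence, Tuple


class Coordinator:
    """Monitors the global condition sum(values) <= threshold."""

    def __init__(self, threshold: float, n_nodes: int) -> None:
        self.threshold = threshold
        # Local safe zones: the bounds sum to the global threshold, so the
        # global condition holds whenever every node is within its bound.
        self.bounds = [threshold / n_nodes] * n_nodes
        self.messages = 0

    def violates_locally(self, node: int, value: float) -> bool:
        """Check run on the node itself (in situ); costs no communication."""
        return value > self.bounds[node]

    def resolve(self, values: Sequence[float]) -> bool:
        """Resolution protocol after a local violation: collect all values,
        check the global condition, and rebalance the bounds if it holds.
        Returns True if the global threshold is really exceeded."""
        self.messages += len(values)  # assumption: one message per node
        total = sum(values)
        if total > self.threshold:
            return True
        slack = (self.threshold - total) / len(values)
        self.bounds = [v + slack for v in values]  # bounds again sum to threshold
        return False


def monitor(rounds: Sequence[Sequence[float]], threshold: float) -> Tuple[List[int], int]:
    """Return the rounds with a global alarm and the number of messages sent."""
    coordinator = Coordinator(threshold, len(rounds[0]))
    alarms = []
    for t, values in enumerate(rounds):
        if any(coordinator.violates_locally(i, v) for i, v in enumerate(values)):
            if coordinator.resolve(values):
                alarms.append(t)
    return alarms, coordinator.messages


ROUNDS = [
    [10.0, 20.0, 15.0, 5.0],
    [12.0, 22.0, 14.0, 6.0],
    [40.0, 10.0, 10.0, 10.0],   # local violation, global condition still holds
    [45.0, 12.0, 15.0, 10.0],   # inside the rebalanced safe zones
    [60.0, 20.0, 15.0, 10.0],   # sum is 105, the global threshold is exceeded
]

if __name__ == "__main__":
    alarms, messages = monitor(ROUNDS, threshold=100.0)
    naive = len(ROUNDS) * len(ROUNDS[0])
    print(f"global alarms in rounds {alarms}; {messages} messages instead of {naive}")
```

In this run only two of the five rounds cause any communication, whereas shipping every reading to the coordinator would take one message per node per round.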


WP4: Flexible Event Processing

Objectives:

develop a Complex Event Processing model and methodology suitable for specification, implementation, and maintenance of event-driven applications

Providing semantics for specifying event patterns

Providing an end-user consumable framework for flexibly specifying event processing systems

Providing modules for generation of an event processing network implementation and optimization plan that allows distributed in situ monitoring of complex event patterns


WP5: Robust Distributed Stream Monitoring

Objectives:

develop methods for robust distributed stream monitoring

exploit online machine learning methods to adapt the FERARI data/control flow to unforeseen circumstances

Provide support for integrating machine learning into the architecture.

Accounting for uncertainty in the architecture


Simple LIFT Example

Mobility monitoring using stationary sensors

Each sensor computes a (linear counting) sketch of the Bluetooth addresses in sensor range

Sketch is a bit-array of fixed length

Provide a set of mobility mining primitives

• count distinct

• union

• intersection

[Diagram labels: Coordinator; sensors Si and Sj; sketches sk(R_Si) and sk(R_Sj)]
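The three primitives can be sketched with a textbook linear counting sketch. This is a minimal illustration, not the project's implementation. The class name, the use of SHA-256 as hash function, the bit-array length and the made-up device identifiers are assumptions. Estimating the intersection by inclusion-exclusion is also one possible choice and not taken from the slides.

```python
import hashlib
import math
from typing import Iterable


class LinearCountingSketch:
    """Fixed-length bit array; every item sets the bit at its hash position."""

    def __init__(self, m: int = 4096) -> None:
        self.m = m        # number of bits (assumed value, tune to expected load)
        self.bits = 0     # a Python int serves as the bit array

    def add(self, item: str) -> None:
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        position = int.from_bytes(digest[:8], "big") % self.m
        self.bits |= 1 << position

    def count_distinct(self) -> float:
        """Linear counting estimate: -m * ln(fraction of bits still zero)."""
        zeros = self.m - bin(self.bits).count("1")
        if zeros == 0:
            raise ValueError("sketch is saturated; use a larger m")
        return -self.m * math.log(zeros / self.m)

    def union(self, other: "LinearCountingSketch") -> "LinearCountingSketch":
        """Sketch of the union of both sets: bitwise OR of the bit arrays."""
        if self.m != other.m:
            raise ValueError("sketches must have the same length")
        merged = LinearCountingSketch(self.m)
        merged.bits = self.bits | other.bits
        return merged


def intersection_estimate(a: LinearCountingSketch, b: LinearCountingSketch) -> float:
    """Inclusion-exclusion: |A and B| is roughly |A| + |B| - |A or B|."""
    return a.count_distinct() + b.count_distinct() - a.union(b).count_distinct()


def sketch_of(addresses: Iterable[str], m: int = 4096) -> LinearCountingSketch:
    sketch = LinearCountingSketch(m)
    for address in addresses:
        sketch.add(address)
    return sketch


if __name__ == "__main__":
    # Made-up device identifiers: 200 per sensor, 100 seen by both.
    s_i = sketch_of(f"dev-{n}" for n in range(0, 200))
    s_j = sketch_of(f"dev-{n}" for n in range(100, 300))
    print(f"distinct at Si: about {s_i.count_distinct():.0f}")
    print(f"distinct at Si or Sj: about {s_i.union(s_j).count_distinct():.0f}")
    print(f"seen by both: about {intersection_estimate(s_i, s_j):.0f}")
```

Only the fixed-length bit arrays need to travel to the coordinator, so the communication cost per sensor does not grow with the number of devices observed. The estimates get less accurate as the bit array fills up.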


WP6: Dissemination & Exploitation

Objectives:

Disseminating the FERARI theoretic framework to the scientific community of data mining and distributed systems.

Outlining the methodological and technical superiority of the proposed solution compared to other approaches to distributed monitoring

Dissemination to high-profile early adopters within the scope of the application scenarios


WP7: Coordination

Objectives:

Establishment of a strong project management scheme

Successful achievement of the project objectives on time and within budget

Generation of synergies amongst the project members

Continuous monitoring of the project’s progress and timely initiation of corrective actions (if needed)

Coordination of the continuous process aiming to transfer the knowledge generated to the relevant scientific communities


List of Deliverables


Deliverable No. | Deliverable name | WP No. | Nature | Dissemination level | Due date
1.1 | Application Scenario Description and Requirement Analysis | 1 | R | PU | M12
1.2 | Final Application Scenarios and Description of Test Environment | 1 | R | PU | M24
1.3 | Application Scenario & Prototype Report | 1 | R | PU | M36
2.1 | Architecture definition | 2 | R | PU | M12
2.2 | System Prototype | 2 | R | PU | M24
2.3 | Final Prototype | 2 | R | PU | M36
3.1 | Requirements and state of the art overview on in situ methods | 3 | R | PU | M12
3.2 | Development of algorithms based on in-situ, low-latency Methods | 3 | R | PU | M24
3.3 | Implementation and evaluation of in-situ, low latency Algorithms | 3 | R | PU | M36
4.1 | Requirements and state of the art overview on Flexible Event Processing | 4 | R | PU | M12
4.2 | Goal driven model and methodology for specification of event processing Applications | 4 | R | PU | M24
4.3 | Automatic generation of annotated event Processing network from the goal-driven Model | 4 | R | PU | M36
5.1 | Requirements and state of the Art overview on Robust Stream Monitoring | 6 | R | PU | M12
5.2 | Algorithms for Robust Distributed Stream Monitoring and Supporting Data Integrity | 6 | R | PU | M24
5.3 | Implementation of Algorithms for Robust Distributed Stream Monitoring and Supporting data Integrity | 6 | R | PU | M36
6.1 | Project Fact Sheet | 6 | O | PU | M3
6.2 | Project Web Site | 6 | O | PU | M3
6.3 | Project Presentation | 6 | O | PU | M3
6.4 | Project Workshop, Seminar and Training Course | 6 | R | PU | M30
6.5 | First Draft of Exploitation Plan | 6 | R | CO | M24
6.6 | Exploitation and Dissemination Plan | 6 | R | CO | M36
7.1 | Quality Assurance Plan | 7 | R | PU | M6
7.2 | 1st Annual Project Report | 7 | R | CO | M12
7.3 | 2nd Annual Project Report | 7 | R | CO | M24
7.4 | Final Project Report | 7 | R | CO | M36

Each WP-Leader is responsible for the deliverables of his or her WP – more details in the


Summary

The goal of the FERARI project is to pave the way for efficient, real-time Big Data technologies of the future.

It will enable business users to express complex analytics tasks through a high-level declarative language that supports distributed Complex Event Processing and sophisticated machine learning operators as an integral part of the system architecture.

Effective, real-time execution at scale will be achieved by making the sensor layer a first-class citizen in distributed streaming architectures and leveraging in-situ data processing as a first (and, in the long run, the only realistic) choice for realizing planetary-scale Big Data systems.
