Big Data Analytics Integrated Learning Concepts Prof. Dr. Hendrik Meth HdM, Stuttgart

Big Data Analytics - BI Academy · Enable students to design big data system architectures and apply techniques for the analysis of high volume, heterogeneous data BIG DATA ANALYTICS



ABOUT ME

• 1995-2001: University of Mannheim, Diploma in Business Informatics
• 2002-2004: Icon, Karlsruhe, SAP APO & SAP BW Consulting
• 2004-2010: Bosch, Stuttgart, Business Intelligence Consulting, Product & Project Management
• 2010-2013: University of Mannheim, PhD in Business Informatics
• 2013-2016: BorgWarner ITSE, Manager Business Intelligence Competence Center
• since 03.2016: HdM Stuttgart, Professor Big Data & Data Science


AGENDA

• Overlapping Concepts: BI vs. Big Data Analytics

• Big Data Analytics: Teaching & Research Program

• Blended Learning: Teaching vs. Research


WHAT IS BIG DATA? The four Vs

• Four dimensions to be differentiated

Source: Schroeck et al. 2012 – IBM Institute for Business Value


BIG DATA ANALYTICS - Perspectives

Perspectives: Technology, Process & Method, Organization

• Scale Up vs. Scale Out
• Commercial Software vs. Open Source
• Cloud vs. On Premise
• Hot vs. Warm vs. Cold Storage: In-Memory (Frequent Access), Flash Disk (Occasional Access), Disk (Seldom Access)
• Monolithic vs. Hybrid Architectures: Architecture (Structured Data), Architecture (Unstructured Data), Monolithic Architecture
• Clustering, Association Rules, Natural Language Processing, Regression, Classification, Information Retrieval
• Software Selection, Implementation Procedures, Data Science Process
• Organization of Teams and Competence Centers
• Application Scenarios


BIG DATA ANALYTICS - Market and Challenges

Increasing Demand in 2014 (1) for ...
• System Analysts: 90%
• Project Managers: 124%

Data Scientists' Median Salaries (2)
• Junior Level: $91k
• Management Level: $250k

Enterprises' Key Challenges - Analytical Skills (2)
• No lack of analytical skills: 57%
• Lack of analytical skills: 43%

Enterprises' Key Challenges - Cross-functional Integration (2)
• Integration of Data Scientists with "traditional data workers" successful: 27%
• Integration of Data Scientists with "traditional data workers" NOT successful: 73%

(1) http://www.forbes.com/sites/louiscolumbus/2014/12/29/where-big-data-jobs-will-be-in-2015/#2caa75b9404a
(2) http://www.forbes.com/sites#/sites/gilpress/2015/04/30/the-supply-and-demand-of-data-scientists-what-the-surveys-say/#7e9131cd205e


BI AND BDA: Overlapping Concepts

Figure: overlapping concepts of Business Intelligence and Big Data Analytics. Terms shown in the diagram:

• Data Consumption, Data Provisioning, Consolidation, Analytics
• Data Warehouse, Data Mart, Data Integration, ETL, Dimensional Modeling, MDX, Information Hubs
• Reporting, Ad-Hoc Reporting, Performance Management, Planning, Data Visualization
• Data Mining, Text Mining, NLP, Prediction, Machine Learning, Data Science, R
• In-Memory, Distributed File Systems, NoSQL, Lambda Architecture, MapReduce, Streaming Data, Multimedia


TEACHING PROGRAM


BDA TEACHING PORTFOLIO

Enable students to design big data system architectures and apply techniques for the analysis of high volume, heterogeneous data

Bachelor:
• BIG DATA ANALYTICS I
• BIG DATA ANALYTICS II
• ADVANCED DATA SCIENCE

Master:
• PROJECT GROUPS AND CAMPUS CHALLENGES
• DATA SCIENCE FOR ANALYSTS

Also shown in the portfolio:
• DATABASES
• ANALYTICAL INFORMATION SYSTEMS
• BUSINESS INTELLIGENCE
• BUSINESS INTELLIGENCE APPLICATIONS


BDA TEACHING PORTFOLIO

BIG DATA ANALYTICS I (Bachelor)

• Introduction and Key Concepts
• Architectures and Use Cases
• Big Data Analytics (Structured Data)
• Implementation and Use of Big Data Systems

BIG DATA ANALYTICS II (Bachelor)

• Introduction and Key Concepts
• Architectures and Use Cases
• Big Data Analytics (Unstructured Data)
• Design of Big Data Systems

ADVANCED DATA SCIENCE (Bachelor)

• Advanced Use Cases, Methods and Technologies:
• Advanced Methods (Predictive Analytics, Regression Analysis, ...)
• Visual Analytics
• R

PROJECT GROUPS AND CAMPUS CHALLENGES (Master)

• Project Groups, e.g. implementing innovative artifacts based on packaged software solutions
• Campus Challenge, e.g. solving case studies defined by industry partners

DATA SCIENCE FOR ANALYSTS (Master)

Data Science with focus on analyst / application perspective

• Introduction and Key Concepts
• Data Preparation
• Methods and Techniques


EXERCISES: Bottom-Up Structure from Foundation to Advanced Level

Courses: BIG DATA ANALYTICS I, BIG DATA ANALYTICS II, ADVANCED DATA SCIENCE

Exemplary Methods:

• Data Preparation and Exploration
• Clustering, Classification, Association Analysis
• Map-Reduce Method
• Natural Language Processing
• Time Series Analysis
• Flare Charts
• Text Mining, Web Mining, Natural Language Processing

Exemplary Technologies: [not recoverable from the slide image]


PART-TIME MASTER PROGRAM Starting Autumn 2016


RESEARCH PROGRAM


RESEARCH DIRECTION

Vision: Design and Implementation of User-Centered Big Data Solutions

Technology & Architecture Perspective

• Scale Up vs. Scale Out
• Open Source vs. Commercial Software
• Cloud vs. On Premise
• Monolithic vs. Hybrid Systems

Organizational and Process Perspective

• Vendor vs. Customer
• Developer vs. User
• Design processes vs. Implementation and Use processes
• Development of algorithms vs. Application of packaged solutions

Focus: User-Centered Design and Implementation; Big Data Systems


EXEMPLARY RESEARCH QUESTIONS

• How can enterprises be prepared for efficient Big Data Systems (BDS) implementation and use?

• How can BDS be implemented in a User-Centered Approach?

• How can existing Business Intelligence and Big Data functionality be integrated on vendor and customer side?


BLENDING RESEARCH AND TEACHING


EXAMPLE RESEARCH SETUP & QUESTION

• Setup: Thesis projects (bachelor / master) in co-operation university + industry partner

• Research question with practical relevance in Big Data / BI context

• General research question: How can the potential for Big Data technology investments be estimated upfront?

– Project 1: Usage-based approach

– Project 2: Experimental approach


Research Project 1 - Setup

• Specific research question: How can the potential of Big Data technology investments be estimated based on existing usage data?

• Project participants & format

– Co-operation of HS Worms and BorgWarner (Automotive Supplier)

– Bachelor Thesis Project

• Estimate the potential improvement rate resulting from a SAP HANA database migration within a SAP BW system based on SAP BW technical content (usage statistics)


Research Project 1 - Setup

• Data basis: Available usage data of the BorgWarner BI system from a 6-month time frame

• Determine overall query runtimes and individual fractions

– Database

– Application Server

– Network

– Client

• Compare results between two large applications: Sales Reporting and Supply Chain Reporting
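
The runtime breakdown described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout `(application, component, seconds)`, the component names, and all numbers are invented here; they are not the SAP BW technical content format and not BorgWarner data.

```python
from collections import defaultdict

# Hypothetical usage records: (application, runtime component, seconds).
# All values are invented for illustration only.
RECORDS = [
    ("Sales Reporting", "database", 3.0),
    ("Sales Reporting", "application_server", 2.0),
    ("Sales Reporting", "network", 1.0),
    ("Sales Reporting", "client", 2.0),
    ("Supply Chain Reporting", "database", 60.0),
    ("Supply Chain Reporting", "application_server", 20.0),
    ("Supply Chain Reporting", "network", 10.0),
    ("Supply Chain Reporting", "client", 10.0),
]


def runtime_fractions(records):
    """Return, per application, each component's share of the total runtime."""
    totals = defaultdict(lambda: defaultdict(float))
    for application, component, seconds in records:
        totals[application][component] += seconds

    fractions = {}
    for application, parts in totals.items():
        overall = sum(parts.values())
        if overall == 0:
            continue  # no measurable runtime, so no meaningful shares
        fractions[application] = {
            component: seconds / overall for component, seconds in parts.items()
        }
    return fractions


if __name__ == "__main__":
    for application, shares in runtime_fractions(RECORDS).items():
        print(application)
        for component, share in shares.items():
            print(f"  {component}: {share:.1%}")
```

The database share is the figure of interest for an in-memory database migration, since only that fraction of the runtime can be expected to change.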

Figure: Execution Time Query, composed of Database, Application Server, Network and Client


Research Project 1 - Results

Key figures* (values in the order shown on the slide):

• Runtime share, chart 1: Database 63%, Client / Network / App Server 37%
• Runtime share, chart 2: Database 38%, Client / Network / App Server 62%
• Applications: SCM, Sales
• Avg. Runtime: 124,80 seconds; 7,71 seconds

*Key Figures analyzed for all Report executions between 01/05/2014 and 19/12/2014 divided by GSM and Sales


Research Project 2 - Setup

• Main research question behind the study: Can the potential performance improvements of SAP HANA be realized in a data modelling and reporting setup comparable to BorgWarner's system landscape?

• Project participants & format

– Co-operation of InES (University of Mannheim) and BorgWarner (Automotive Supplier)

– Master Thesis Project

• Compare three variants with regard to data loading / reporting performance

– Model-A: SAP BW 7.3 on relational database using LSA modeling approach

– Model-B: SAP BW 7.3 on SAP HANA database using LSA modeling approach

– Model-C: SAP BW 7.3 on SAP HANA database leveraging HANA-optimized modelling


Research Project 2 - Setup

• Create a data model similar to existing BorgWarner environment

• Utilize real-world data from BorgWarner along three cases:

Case A: 1 million records

Case B: 2 million records

Case C: 3.5 million records.

• Create different types of representative queries (for reporting)

• Run 5 different iterations

• Provide infrastructures in Big Data Innovation Center Magdeburg (BW on HANA / BW on relational database) and run evaluation in controlled lab environment.
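
The evaluation design above (three model variants, three data volumes, several query types, five iterations) can be sketched as a small timing harness. This is a hedged sketch, not the setup used in the thesis: it assumes the five iterations are repeated runs of the same query, and `run_query` is a hypothetical placeholder for whatever executes a report against the system under test.

```python
import statistics
import time
from itertools import product

MODELS = ["Model-A", "Model-B", "Model-C"]
CASES = {"Case A": 1_000_000, "Case B": 2_000_000, "Case C": 3_500_000}
ITERATIONS = 5


def run_query(model, records, query):
    """Hypothetical placeholder: execute `query` on `model` loaded with `records` rows."""
    return None


def benchmark(queries, execute=run_query, iterations=ITERATIONS):
    """Return the mean wall-clock runtime per (model, case, query) combination."""
    results = {}
    for model, (case, records), query in product(MODELS, CASES.items(), queries):
        timings = []
        for _ in range(iterations):
            start = time.perf_counter()
            execute(model, records, query)
            timings.append(time.perf_counter() - start)
        results[(model, case, query)] = statistics.mean(timings)
    return results


if __name__ == "__main__":
    for key, mean_seconds in benchmark(["simple", "mid-complex"]).items():
        print(key, f"{mean_seconds:.6f} s")
```

Averaging over repeated runs reduces the influence of caching and background load on a single measurement; reporting the median or the spread as well would be a reasonable extension.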


Research Project 2 - Selected Results*

• Loading Performance

• Reporting Performance (simple / mid-complex queries)

* for Case C: 3.5 million data sets


How can this kind of data be used?

Estimate Benefits of an IT investment

• Idea: Measure the amount of time users currently spend waiting for report results

• Estimate the share of time which would be additionally available to analysts after HANA has been implemented

• Multiply additional time with average hourly rate of an analyst

• Additional time (in h) * Hourly Rate (in EUR/h) = Performance Benefit (in EUR)
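
The formula above is a single multiplication; a minimal sketch with the slide's example values (10.000 additional hours at 50 EUR per hour) is given here. The function name and the input validation are illustrative assumptions.

```python
def performance_benefit(additional_hours, hourly_rate_eur):
    """Performance Benefit (EUR) = additional time (h) * hourly rate (EUR/h)."""
    if additional_hours < 0 or hourly_rate_eur < 0:
        raise ValueError("hours and hourly rate must be non-negative")
    return additional_hours * hourly_rate_eur


if __name__ == "__main__":
    benefit = performance_benefit(10_000, 50)
    print(f"{benefit:,} EUR")  # prints "500,000 EUR"
```

The estimate covers only analyst time freed up by faster reports; migration and licence costs would have to be set against it separately.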

Example (values from the figure): 30.000 hours; 10.000 hours; 50 € / hour; 500.000 €


Key Takeaways

1. Business Intelligence and Big Data Analytics share extensive conceptual overlaps

2. Complementary teaching programs will be needed to meet the demands of industry, science and society

3. Ample potential for blended teaching/research projects in this area


Questions?


Prof. Dr. Hendrik Meth
Faculty for Information and Communication
Business Information Systems and Digital Media
Big Data and Data Science

STUTTGART MEDIA UNIVERSITY
Nobelstrasse 10
DE-70569 Stuttgart / Germany

eMail: [email protected]