Business Intelligence 3.0 (and the Emergence of Data Lake)

Komes Chandavimol

January 31, 2559 BE (2016 CE)

Master's Program in Big Data Engineering, Faculty of Engineering, Dhurakij Pundit University

Cheow Lan Lake, Thailand


Business Intelligence

The set of techniques and tools for the transformation of raw data into meaningful and useful information for business analysis purposes (Wikipedia)

Raw data -> BI (techniques and tools) -> Information

Business Intelligence 1.0

When will I get my report? That is not what I want.

1. The business requests information
2. An IT/data analyst queries the database/data warehouse
3. The IT/data analyst provides the report

BI 1.0 - Delivery to Customer

Defining Business Intelligence 3.0, Lachlan James (2014)

BI 1.0 - Delivery to Customer

Functionality: Present, Aggregate

Batch processing
BI tool usage limited to certain groups/communities of people
IT control
Stand-alone reports

Defining Business Intelligence 3.0, Lachlan James (2014)

Business Intelligence 2.0

I can explore a wide range of data assets; I prefer to blend data, but my report differs from yours

ERP, CRM, Data Warehouse
A single point of view
With a centralized BI tool

Business analysts can explore the data in the Web Portal with BI tools and predefined reports

BI 2.0 - Creation and Delivery for Consumers

Defining Business Intelligence 3.0, Lachlan James (2014)

BI 2.0 - Creation and Delivery for Consumers

Functionality: Explore, Predict

Real time via any device
Web portal
Centralized BI tools that business analysts can explore
Hybrid control

Defining Business Intelligence 3.0, Lachlan James (2014)

Business Intelligence 2.5

I can use any tool, I can blend data rapidly (if I can find it), it was so simple but now

BI 2.0++ (Agile BI, SOA, Enterprise Search, Visualization)
Gives the business more power over BI tools
Less complicated for IT

On-the-fly data federation

Business Intelligence 3.0

I collaborate on content via any device, harness information on the fly, and drive outcomes

Focus on collaborative workgroups

Self-regulation and self-governance in data management

Interaction among customers, employees, regulators, and third parties

No bottleneck from IT

Includes Big Data, Cloud, IoT, and social integration

Creation, Delivery, and Management for Consumers

Defining Business Intelligence 3.0, Lachlan James (2014)

BI 3.0 - Creation, Delivery, and Management

Functionality: Anticipate, Enrich

Self-service BI with Analytics 3.0

The Journey of Business Intelligence 3.0

                       BI 1.0                  BI 2.0                  BI 3.0
Functionality          Present and Aggregate   Explore and Predict     Anticipate and Enrich
Frequency              Monthly/Detail          Weekly/Daily/Summary    Real-time/Process
Level of Focus         Community               Enterprise              Collaborative
Processing             Batch                   Near Real-time          In-Process
Data Products          Information             Intelligence            Insight
Foundation/Influence   Delivery Only           Creation + Delivery     Creation + Delivery + Automation

Defining Business Intelligence 3.0, Lachlan James (2014)

https://public.tableau.com/profile/ifpri.td7290#!/vizhome/2014GHI/2014GHI

Understand the needs of BI 3.0 users

A set of tools and techniques that deliver intelligence without making users work for it
Deliver the right data to the right users at the right time
Focus on scalability and usability
Provide self-guided content creation, delivery, and analysis
Support a multi-device user interface, anywhere, anytime
Support a collaborative methodology
Include Analytics 3.0, Data Discovery, Advanced Visualization, Visual Analytics, Business Discovery, Self Serve Business

Source: Forrester Research's James Kobielus

Understand the needs of BI 3.0 platform

Technology should be in place to enable the organization to acquire, store, combine, and enrich huge volumes of unstructured and structured data in raw format

Ability to perform analytics at scale, in real time and near real time, on these huge volumes in an iterative way

Find the tools!

Spreadsheet? Database? Data Mart? Data Warehouse?

Source: Forrester Research's James Kobielus

Tools that support these!

http://www.adweek.com/prnewser/how-many-times-do-the-worlds-social-media-users-click-every-minute/117427

https://www.domo.com/learn/data-never-sleeps-3-0

The Emergence of Big Data Tools

HADOOP

Analytics 3.0

Data Mining Tools

Data Discovery and Visualization Tools

How to apply this to the current environment?

Traditional Data Warehouse

http://hortonworks.com/blog/optimize-your-data-architecture-with-hadoop/

New Data Management Architecture

http://hortonworks.com/blog/optimize-your-data-architecture-with-hadoop/

Data Lake


https://www.digitalnewsasia.com/business/forget-data-warehousing-its-data-lakes-now

Data Lake

A single place to store every type of data in its native format with no fixed limits on account size or file size, high throughput to increase analytic performance and native integration with the Hadoop ecosystem.

Reference: James Serra's Blog

Data Lake Development with Big Data, Pradeep Pasupuleti (2015)

https://www.digitalnewsasia.com/business/forget-data-warehousing-its-data-lakes-now
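
The "native format" idea in the definition above can be illustrated with a small sketch. This is not how Hadoop or any particular data lake product works; it only shows the principle that data is stored untouched and a structure is applied when it is read. The in-memory `lake` dictionary, the `ingest` and `read_records` helpers, and all file paths are hypothetical names invented for this illustration.

```python
import csv
import io
import json

# Hypothetical in-memory "lake": object path -> raw bytes, stored exactly as received.
lake = {}


def ingest(path: str, payload: bytes) -> None:
    """Store the payload untouched; no schema is enforced at write time."""
    lake[path] = payload


def read_records(path: str) -> list:
    """Apply a structure only when reading, chosen here from the file extension."""
    raw = lake[path].decode("utf-8")
    if path.endswith(".json"):
        # One JSON document per line.
        return [json.loads(line) for line in raw.splitlines() if line.strip()]
    if path.endswith(".csv"):
        return list(csv.DictReader(io.StringIO(raw)))
    # Unknown formats stay available as raw text for later discovery.
    return [{"raw": raw}]


# Three different kinds of data land in the same place, each in its native format.
ingest("sales/2016-01-31.csv", b"store,amount\nBKK,120\nCNX,80\n")
ingest(
    "clicks/2016-01-31.json",
    b'{"user": "u1", "page": "/home"}\n{"user": "u2", "page": "/cart"}\n',
)
ingest("logs/app.log", b"WARN disk almost full")

sales = read_records("sales/2016-01-31.csv")
clicks = read_records("clicks/2016-01-31.json")
logs = read_records("logs/app.log")
```

A data warehouse would typically reject or transform the log line at load time; here it is kept as-is, and a reader decides later how to interpret it.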

Data Lake Processes


www.emc.com

Data Lake and Data Warehouse

Hadoop Distributed Compared, BlazeClan Technology, 2015

Data Lakes

http://www.kdnuggets.com/2015/09/data-lake-vs-data-warehouse-key-differences.html

Data Lake

Type of Data: Raw Data, Derived Data, Aggregated Data

Type of Environment: Discovery Environment, Production Environment

The Definition of Data Lake, John O'Brien (2015)
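
As a rough sketch of how the three types of data listed above relate, the example below derives cleaned records from raw events and then aggregates them, while leaving the raw data unchanged. The sample events and the `derive` and `aggregate` helpers are hypothetical and chosen only for illustration; they are not taken from the cited source.

```python
from collections import defaultdict

# Raw data: kept exactly as it arrived, including a malformed amount.
raw_events = [
    {"store": "BKK", "amount": "120", "ts": "2016-01-31T09:00"},
    {"store": "BKK", "amount": "bad", "ts": "2016-01-31T09:05"},
    {"store": "CNX", "amount": "80", "ts": "2016-01-31T10:00"},
    {"store": "BKK", "amount": "30", "ts": "2016-01-31T11:00"},
]


def derive(events):
    """Derived data: typed, cleaned records. Unparseable rows are skipped here,
    but they remain in the raw layer."""
    derived = []
    for event in events:
        try:
            amount = float(event["amount"])
        except ValueError:
            continue
        derived.append(
            {"store": event["store"], "amount": amount, "date": event["ts"][:10]}
        )
    return derived


def aggregate(records):
    """Aggregated data: total amount per (date, store)."""
    totals = defaultdict(float)
    for record in records:
        totals[(record["date"], record["store"])] += record["amount"]
    return dict(totals)


derived = derive(raw_events)
aggregated = aggregate(derived)
```

Because the raw events are never modified, a different derivation (for example, one that repairs the malformed row instead of skipping it) can be run later in a discovery environment without reloading anything.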

How does the Data Lake work?

http://www.clearpeaks.com/blog/category/tableau

Traditional Enterprise Data warehouse

New Data Management Architecture

http://hortonworks.com/blog/optimize-your-data-architecture-with-hadoop/

http://www.kdnuggets.com/2014/05/big-data-landscape-v30-analyzed.html

Data Lake Maturity

The Definition of Data Lake, John O'Brien (2015)

4 Maturity Stages of Data Lake

Stage 1 – Pilot Project (Understand the Technology)
Stage 2 – Productionize Hadoop and its capabilities
Stage 3 – Proactively consolidate data to (Big) Data Analytics
Stage 4 – Platform the Data Lake to Core Competency

The Definition of Data Lake, John O'Brien (2015)

Putting the Data Lake to Work, Teradata, Hortonworks (2015)

Stage 1 – Pilot Project

Handling data at scale: involves getting the plumbing in place and learning to acquire and transform data at scale. The analytics may be quite simple, but much is learned about making Hadoop work the way you desire.

The Definition of Data Lake, John O'Brien (2015)

Putting the Data Lake to Work, Teradata, Hortonworks (2015)

Stage 2 – Productionize Hadoop and its capabilities

Involves improving the ability to transform and analyze data
Finding the tools that are most appropriate to the team's skillset
Acquiring more data and building applications

The Definition of Data Lake, John O'Brien (2015)

Putting the Data Lake to Work, Teradata, Hortonworks (2015)

Stage 3 – Proactively consolidate data to (Big) Data Analytics

Involves getting data and analytics into the hands of as many people as possible.

It is in this stage that the data lake and the enterprise data warehouse start to work in unison, each playing its role.

Companies that started with a data lake eventually added an enterprise data warehouse to operationalize their data.

The Definition of Data Lake, John O'Brien (2015)

Putting the Data Lake to Work, Teradata, Hortonworks (2015)

Big Data Analytics

http://dataofthings.blogspot.com/2014/04/the-bbbt-sessions-hortonworks-big-data.html

Data Lake and Big Data Analytics

http://hortonworks.com/blog/big-data-refinery-fuels-next-generation-data-architecture/

Stage 4 – Platform the Data Lake to Core Competency

Enterprise capabilities are added to the data lake. Few companies have reached this level of maturity, but many will as the use of big data grows.

Requires data governance, compliance, security, and auditing (incorporated into the company data strategy)

The Technology of the Business Data Lake, Capgemini (2013)

Business Data Lake


The Technology of the Business Data Lake, Capgemini (2014)

The Data Lake Unifies Data Discovery, Data Science, and BI 3.0

Terms shown: Big Data, Self Serve Business, Data Science, Machine Learning, Visual Analytics, Business Discovery, Deep Learning, Hadoop, Feature Engineering, Spark, Business Intelligence 3.0, YARN, Predictive Analytics, Hive, Data Lake, Data Visualization, Graph Analytics

Questions?
