Business Intelligence 3.0 (and the Emergence of Data Lake)
Cheow Lan Lake, Thailand
Komes Chandavimol
January 31, 2016 (B.E. 2559)
Master's Program in Big Data Engineering, Faculty of Engineering, Dhurakij Pundit University
Business Intelligence
The set of techniques and tools for the transformation of raw data into meaningful and useful information for business analysis purposes (Wikipedia)
BI: techniques and tools that turn raw data into information
Business Intelligence 1.0
When will I get my report? That's not what I wanted.
Business requests information
IT/Data Analyst queries the database/data warehouse
IT/Data Analyst provides the report
BI 1.0 - Delivery to Customer
Defining Business Intelligence 3.0, Lachlan James (2014)
Functions: Present, Aggregate
Batch processing
BI tool usage limited to certain groups/communities of people
IT control
Stand-alone reports
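The BI 1.0 pattern above (a batch job that aggregates data and presents a stand-alone report) could be sketched roughly as follows. The sales records, field names, and function names are hypothetical and do not come from the slides.

```python
from collections import defaultdict

# Hypothetical raw transactions; the field names are illustrative assumptions.
SALES = [
    {"month": "2016-01", "region": "North", "amount": 120.0},
    {"month": "2016-01", "region": "South", "amount": 80.0},
    {"month": "2016-01", "region": "North", "amount": 50.0},
    {"month": "2016-02", "region": "South", "amount": 200.0},
]

def monthly_report(rows):
    """Aggregate: total amount per (month, region), computed in one batch pass."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row["month"], row["region"])] += row["amount"]
    return dict(totals)

def present(report):
    """Present: render the aggregated totals as lines of a stand-alone report."""
    return [
        f"{month} {region}: {total:.2f}"
        for (month, region), total in sorted(report.items())
    ]

report_lines = present(monthly_report(SALES))
```

In this sketch the consumer only receives `report_lines`; any new question means another request to whoever owns the batch job, which is the bottleneck the later BI generations try to remove.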
Business Intelligence 2.0
I can explore a wide range of data assets; I prefer to blend data, but my report differs from yours
ERP, CRM, Data Warehouse
A single point of view
With a centralized BI tool
Business analysts can explore the data in the Web Portal with BI tools and predefined reports
BI 2.0 - Creation and Delivery for Consumers
Functions: Explore, Predict
Real time via any device
Web Portal centralized BI tools – business analysts can explore
Hybrid control
Web Portal
Business Intelligence 2.5
I can use any tool, I can blend data rapidly (if I can find it), it was so simple but now
BI 2.0 ++ (Agile BI, SOA, Enterprise Search, Visualization)
Gives the business more power with BI tools
Less complicated for IT
On-the-fly data federation
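To illustrate on-the-fly data federation, here is a minimal sketch that queries two separate sources at request time and joins the results in memory instead of first loading them into a warehouse. The two in-memory SQLite databases stand in for CRM and ERP systems; all table, column, and function names are assumptions made for illustration.

```python
import sqlite3

def make_source(ddl, insert_sql, rows):
    """Create an in-memory SQLite database standing in for one source system."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    return conn

# Hypothetical CRM and ERP systems; table and column names are assumptions.
crm = make_source(
    "CREATE TABLE customers (id INTEGER, name TEXT)",
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)
erp = make_source(
    "CREATE TABLE orders (customer_id INTEGER, total REAL)",
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 100.0), (1, 25.0), (2, 40.0)],
)

def federated_revenue(crm_conn, erp_conn):
    """Query each source at request time and join the results in memory."""
    names = dict(crm_conn.execute("SELECT id, name FROM customers"))
    revenue = {}
    for customer_id, total in erp_conn.execute(
        "SELECT customer_id, SUM(total) FROM orders GROUP BY customer_id"
    ):
        revenue[names[customer_id]] = total
    return revenue

result = federated_revenue(crm, erp)
```

Nothing is copied ahead of time here, so the answer reflects whatever the sources hold when the query runs. The trade-off is that every query pays the cost of reaching each source.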
Business Intelligence 3.0
I collaborate via any device with content, harness information on the fly, and drive outcomes
Focus on collaborative workgroups
Self-regulation and self-governance in data management
Interaction among customers, employees, regulators, and third parties
No bottleneck from IT
Includes Big Data, Cloud, IoT, and social integration
Creation, Delivery, and Management for Consumers
The Journey of Business Intelligence 3.0
                      BI 1.0                 BI 2.0                 BI 3.0
Functionality         Present and Aggregate  Explore and Predict    Anticipate and Enrich
Frequency             Monthly/Detail         Weekly/Daily/Summary   Real-time/Process
Level of Focus        Community              Enterprise             Collaborative
Processing            Batch                  Near Real-time         In-Process
Data Products         Information            Intelligence           Insight
Foundation/Influence  Delivery Only          Creation + Delivery    Creation + Delivery + Automation
Understand the needs of BI 3.0 users
A set of tools and techniques that deliver intelligence without making users work for it
Deliver the right data to the right users at the right time
Focus on scalability + usability
Provide self-guided content creation, delivery, and analysis
Support a multi-device user interface, anywhere, anytime
Support collaborative methodology
Include Analytics 3.0, Data Discovery, Advanced Visualization, Visual Analytics, Business Discovery, Self Serve Business
Source: Forrester Research's James Kobielus
Understand the needs of BI 3.0 platform
Technology should be in place to enable the organization to acquire, store, combine, and enrich huge volumes of unstructured and structured data in raw format
Ability to perform analytics at scale, in real time and near real time, on these huge volumes in an iterative way
Find the tools!
Spreadsheet? Database? Data Mart? Data Warehouse?
Source: Forrester Research's James Kobielus
Tools that support these!
http://www.adweek.com/prnewser/how-many-times-do-the-worlds-social-media-users-click-every-minute/117427
https://www.domo.com/learn/data-never-sleeps-3-0
Traditional Data Warehouse
http://hortonworks.com/blog/optimize-your-data-architecture-with-hadoop/
New Data Management Architecture
http://hortonworks.com/blog/optimize-your-data-architecture-with-hadoop/
Data Lake
A single place to store every type of data in its native format with no fixed limits on account size or file size, high throughput to increase analytic performance and native integration with the Hadoop ecosystem.
Reference: James Serra's Blog
Data Lake Development with Big Data, Pradeep Pasupuleti (2015)
https://www.digitalnewsasia.com/business/forget-data-warehousing-its-data-lakes-now
Data Lake
Type of Data: Raw Data, Derived Data, Aggregated Data
Type of Environment: Discovery Environment, Production Environment
The Definition of Data Lake, John O'Brien (2015)
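The three types of data (raw, derived, aggregated) can be illustrated with a small sketch. It assumes, for illustration only, that raw events land as JSON lines and are parsed when read. The sample records, field names, and function names are hypothetical.

```python
import json
from collections import Counter

# Raw data: records kept exactly as they arrived (hypothetical JSON lines).
RAW = [
    '{"user": "a", "event": "click", "ts": "2016-01-31T10:00:00"}',
    '{"user": "b", "event": "view", "ts": "2016-01-31T10:05:00"}',
    '{"user": "a", "event": "click", "ts": "2016-01-31T11:00:00"}',
]

def derive(raw_lines):
    """Derived data: parse raw lines when read and keep only the needed fields."""
    derived = []
    for line in raw_lines:
        record = json.loads(line)
        derived.append({
            "user": record["user"],
            "event": record["event"],
            "hour": record["ts"][11:13],  # hour portion of the ISO timestamp
        })
    return derived

def aggregate(derived_rows):
    """Aggregated data: event counts per (user, event), ready for reporting."""
    return Counter((row["user"], row["event"]) for row in derived_rows)

derived = derive(RAW)
aggregated = aggregate(derived)
```

Because `RAW` is never modified, a discovery environment can re-derive different fields from the same records later without affecting what a production environment already consumes.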
How does the Data Lake work?
http://www.clearpeaks.com/blog/category/tableau
Traditional Enterprise Data warehouse
New Data Management Architecture
http://hortonworks.com/blog/optimize-your-data-architecture-with-hadoop/
4 Maturity Stages of Data Lake
Stage 1 – Pilot Project (Understand the Technology)
Stage 2 – Productionize Hadoop and its capabilities
Stage 3 – Proactively consolidate data for (Big) Data Analytics
Stage 4 – Platform the Data Lake to Core Competency
The Definition of Data Lake, John O'Brien (2015)
Putting the Data Lake to Work, Teradata, Hortonworks (2015)
Stage 1 – Pilot Project
Handling data at scale
Involves getting the plumbing in place and learning to acquire and transform data at scale.
The analytics may be quite simple, but much is learned about making Hadoop work the way you desire.
Stage 2 – Productionize Hadoop and its capabilities
Involves improving the ability to transform and analyze data.
Teams find the tools most appropriate to their skill sets.
They acquire more data and build applications.
Stage 3 – Proactively consolidate data for (Big) Data Analytics
Involves getting data and analytics into the hands of as many people as possible.
It is in this stage that the data lake and the enterprise data warehouse start to work in unison, each playing its role.
Organizations that started with a data lake eventually added an enterprise data warehouse to operationalize their data.
Big Data Analytics
http://dataofthings.blogspot.com/2014/04/the-bbbt-sessions-hortonworks-big-data.html
Data Lake and Big Data Analytics
http://hortonworks.com/blog/big-data-refinery-fuels-next-generation-data-architecture/
Stage 4 – Platform the Data Lake to Core Competency
Enterprise capabilities are added to the data lake.
Few companies have reached this level of maturity, but many will as the use of big data grows.
Requires data governance, compliance, security, and auditing (incorporated into the company data strategy).
The Technology of the Business Data Lake, Capgemini (2013)
The Data Lake Unifies Data Discovery, Data Science, and BI 3.0
[Word cloud: Big Data, Data Science, Self Serve Business, Machine Learning, Visual Analytics, Business Discovery, Deep Learning, Hadoop, Feature Engineering, Spark, Business Intelligence 3.0, YARN, Predictive Analytics, Hive, Data Lake, Data Visualization, Graph Analytics]