The Journey to Big Data Analytics
Dr. Stefan Radtke, CTO Isilon Storage Division, EMEA, Dell EMC
IDC Conference, Madrid, January 31st 2017
© Copyright 2016 Dell EMC. All rights reserved.
Welcome!
Dr. Stefan Radtke, CTO Isilon, EMEA, Dell EMC | Storage Division
- 1995-2011: 17 years for IBM in various technical roles
- 2012-2013: Global Architect, EMC Global Alliances
- 2013-2016: CTO, EMEA, Isilon Storage Division, EMC
- 2016-today: CTO, EMEA, Isilon Storage Division, Dell EMC
Phone: +49-176-34434460
E-Mail: [email protected]
LinkedIn: http://de.linkedin.com/in/drstefanradtke
Blog: http://stefanradtke.blogspot.com
Analytics affects all Industries
Manufacturing: Smart factories, process control, supply/labour efficiency, and automation control
Energy: Smart meters and grids, preventative maintenance, and environmental monitoring
Transportation/Infrastructure: Smart infrastructure, traffic optimization, maintenance, and fleet tracking
Healthcare: Hospital patient monitoring, home healthcare, and remote diagnosis
Consumer: Wearables, automotive, smart home, and entertainment disruptions
Retail: “Just-in-time” management, promotions, and location-based advertising
Traditional BI Engagement Process
Step 1: Pre-build data schema (schema-on-load)
Step 2: Define question to be answered
Step 3: Use the Business Intelligence (BI) tool’s graphical user interface (GUI) to construct the query
Step 4: BI tool creates SQL
Step 5: SQL is run against the data warehouse (DW) to create the report
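As a rough illustration of the traditional BI steps, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the data warehouse. The table name `sales`, its columns, the sample rows, and the helper `widgets_sold` are assumptions made for this example and do not come from the slides; a real BI tool would generate its SQL from the GUI.

```python
import sqlite3

# Step 1 (schema-on-load): the schema is fixed before any data is loaded.
# Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "product TEXT NOT NULL, zip_code TEXT NOT NULL, "
    "month TEXT NOT NULL, units INTEGER NOT NULL)"
)

# Hypothetical rows; in practice an ETL job would load the warehouse.
rows = [
    ("widget", "28001", "2016-12", 120),
    ("widget", "28002", "2016-12", 80),
    ("widget", "28001", "2016-11", 95),
    ("gadget", "28001", "2016-12", 40),
]
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)


def widgets_sold(connection, month):
    """Steps 4-5: run the kind of SQL a BI tool might generate for
    'How many widgets did I sell in <month>?' and return the total."""
    query = (
        "SELECT COALESCE(SUM(units), 0) FROM sales "
        "WHERE product = 'widget' AND month = ?"
    )
    return connection.execute(query, (month,)).fetchone()[0]


print(widgets_sold(conn, "2016-12"))
```

The point of the sketch is the ordering: the question can only be asked in terms of columns that were defined before loading, which is what distinguishes this process from schema-on-query.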
Evolution Of The Analytic Questions

What Happened? (Descriptive/BI)
• How many widgets did I sell last month?
• What were sales by zip code for Christmas last year?
• How many of Product X were returned last month?
• What were company revenues and profits for the past quarter?
• How many employees did I hire last year?

What Will Happen? (Predictive)
• How many widgets will I sell next month?
• What will be sales by zip code over this Christmas season?
• How many of Product X will be returned next month?
• What are projected company revenues and profits for next quarter?
• How many employees will I need to hire next year?

What Should I Do? (Prescriptive)
• Order [5,0000] component Z to support widget sales for next month
• Hire [Y] new sales reps by these zip codes to handle projected Christmas sales
• Set aside [$125K] in financial reserve to cover Product X returns
• Sell the following product mix to achieve quarterly revenue and margin goals
• Increase hiring pipeline by 35% to achieve hiring goals
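The three kinds of question can be sketched on the widget example. The code below is only an illustration under stated assumptions: the sales history is made up, the forecast is a plain least-squares trend line (the slides do not name a forecasting method), and the function names and the components-per-widget figure are hypothetical.

```python
def descriptive(history):
    """What happened? Units sold in the most recent month."""
    return history[-1]


def predictive(history):
    """What will happen? Extrapolate a least-squares linear trend one
    month ahead. Needs at least two months of history."""
    n = len(history)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return intercept + slope * n


def prescriptive(forecast, components_per_widget, in_stock):
    """What should I do? Components to order so the forecast can be built."""
    needed = forecast * components_per_widget - in_stock
    return max(0, round(needed))


# Hypothetical monthly widget sales, oldest first.
history = [100, 110, 120, 130]

last_month = descriptive(history)
next_month = predictive(history)
order_quantity = prescriptive(next_month, components_per_widget=2, in_stock=50)
print(last_month, next_month, order_quantity)
```

Each stage consumes the output of the previous one: the prescriptive recommendation is only as good as the forecast feeding it.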
Data Science Engagement Process
Step 1: Define hypothesis to test or prediction to be made
Step 2: Gather data…and more data (Data Lake: SQL + Hadoop)
Example data sources: Epic, Lawson, Kronos, historical Google Trends, physician notes, local events, weather forecast, CDC
Step 3: Build schema (schema-on-query)
Step 4: Visualize the data (Tableau, Spotfire, ggplot2,…)
Step 5: Build analytic models (SAS, R, MADlib, Mahout,…)
Step 6: Evaluate model results (probabilities, confidence levels)
Repeat
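Schema-on-query, in contrast to schema-on-load, can be sketched as follows: raw records are kept exactly as they arrived, and a schema is applied only when a question is asked. The record contents, field names, and the `query` helper are assumptions for illustration; a real data lake would keep such records in Hadoop or a similar store rather than in a Python list.

```python
import json

# Raw records stored as-is; each source has its own shape (hypothetical data).
RAW_EVENTS = [
    '{"source": "weather", "date": "2017-01-30", "temp_c": 4.5}',
    '{"source": "local_events", "date": "2017-01-30", "name": "marathon", "attendance": 12000}',
    '{"source": "weather", "date": "2017-01-31", "temp_c": 6.0, "rain_mm": 2.1}',
]


def query(raw_lines, source, fields):
    """Apply a schema at read time: keep records from one source and
    project the requested fields; fields a record lacks become None."""
    result = []
    for line in raw_lines:
        record = json.loads(line)
        if record.get("source") == source:
            result.append({field: record.get(field) for field in fields})
    return result


weather = query(RAW_EVENTS, "weather", ["date", "temp_c", "rain_mm"])
for row in weather:
    print(row)
```

Because nothing was discarded at load time, a later hypothesis can ask for a field (here `rain_mm`) that nobody planned for when the data was collected.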
Why we need to collect ALL data!
Holistic Data Collection
Data Lake
Companies understand the value of information, but many of them don’t know how to start the journey.
Brainstorm the right questions to ask
Do not filter any Data Sources
Brainstorming predictive and prescriptive questions typically uncovers numerous new data sources that are worthy of consideration. And this is a key point: ALL data sources are worthy of consideration!
Do NOT filter the data sources at this point in the process.
But how do we get “Smart”?
First consideration: What is the business initiative, or “what” do we want to accomplish? For example: reduce traffic congestion.
Some key questions/parameters to consider:
• Traffic flow decisions: New roads? New lanes? New turn lanes? New bike lanes? Pedestrian crossings? Railroad crossings? Bus stops?
• Road repair and maintenance decisions: Fixing potholes? Repaving surfaces? Materials and equipment needed? When to fix potholes and repave streets?
• Construction permits decisions: Types of permits needed? Impact on traffic flow? Length of time to complete the work? Number of employees to consider?
• Events management decisions: Traffic (cars and pedestrians) attending proposed event? Impact on normal traffic flow? Date, time, location and duration of events?
• Parks decisions: Location of parks? Size of parks? Hours of operation? Park equipment maintenance?
• Schools decisions: Location and size of new schools? Hours of operation? Location of stoplights and stop signs?
Data Assessment: Value vs. Feasibility
Business Initiative: Improve Traffic Flow
[Chart: data sources plotted by Impact against Feasibility]
Prioritization Matrix
Business Initiative: “Smart” City Initiative
[Chart: use cases A to F plotted by Business Value against Implementation Feasibility, each axis running from Lo to Hi]
Use Cases (Decisions)
A. Optimize Traffic Flow
B. Improve Road Repair and Maintenance
C. Optimize Construction Permits
D. Improve Events Management
E. Optimize Park Hours and Activities
F. Optimize School Hours and Activities
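One way to turn a prioritization matrix into a ranked list is sketched below. This is an assumption about how such a matrix might be operationalized, not the method used in the workshop: the 1 to 10 scores are placeholders and are NOT read off the chart, the value-times-feasibility ranking rule is a choice made for this example, and the names `UseCase`, `quadrant`, and `prioritize` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    label: str
    name: str
    value: int        # business value, 1 (low) to 10 (high)
    feasibility: int  # implementation feasibility, 1 (low) to 10 (high)


def quadrant(use_case, threshold=5):
    """Name the matrix quadrant a use case falls into."""
    value = "high" if use_case.value > threshold else "low"
    feasibility = "high" if use_case.feasibility > threshold else "low"
    return f"{value} value / {feasibility} feasibility"


def prioritize(use_cases):
    """Rank by value x feasibility, breaking ties in favour of value."""
    return sorted(
        use_cases,
        key=lambda u: (u.value * u.feasibility, u.value),
        reverse=True,
    )


# Placeholder scores for three of the use cases; not taken from the slide.
candidates = [
    UseCase("E", "Optimize Park Hours and Activities", value=3, feasibility=4),
    UseCase("B", "Improve Road Repair and Maintenance", value=6, feasibility=9),
    UseCase("A", "Optimize Traffic Flow", value=8, feasibility=7),
]

ranked = prioritize(candidates)
for use_case in ranked:
    print(use_case.label, use_case.name, "->", quadrant(use_case))
```

In a workshop the scores would come from the business and IT stakeholders; the code only makes the trade-off between the two axes explicit.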
Why do most efforts to stand up Big Data initiatives stall or fail?
- Identifying the right use case (for example: product performance and reliability, customer sentiment analysis, supply chain optimization, competitive war games)
- Finding, curating, and governing the data
- Keeping up with new technologies and tools
Workshop Objectives
Strategic Approach
• Demonstrate the potential value using data science techniques
• Align business and IT goals around big data
• Identify strategic opportunities for big data analytics
• Prioritize key use cases by assessing feasibility and ROI
• Recommend the appropriate analytics engagement and deployment roadmap
Get help for the Journey
Dell EMC Service Offerings
Platform for Big Data and Analytics Solutions

BUSINESS
- Assess: Big Data Vision Workshop
- Prove: Big Data Proof of Value
- Deploy: Big Data Applied Analytics Implementation

TECHNOLOGY
- Assess: Big Data Technology Advisory
- Prove: Big Data Proof of Technology
- Deploy: Big Data Technology Implementation
My prediction: the market is looking for well-integrated systems that are easy to use and offer flexible deployment options.
Advanced Analytic (Data Science) Models
• Graph Analytics
• Forecasting
• Time Decomposition
• Association Analytics
• Behavioral Analytics
Complex and growing Ecosystem
[Architecture diagram: integrated and converged infrastructure for analytics; labels listed by component]
Integrated Analytics, 3rd Party Openness
PLATFORM MANAGER (Rich UX): Administration, Analytics Catalog, Data Catalog
DATA CURATOR (find and ingest data): Ingest, Index, Enrich
DATA GOVERNOR (govern/publish): Lineage, Quality, Security
YOUR WORKSPACE: data, tools, apps
PUBLISHED TOOLS AND DATASETS
DATA SETS
DATA LAKE
DATA SCIENCE: Pivotal Big Data Suite
EXTENSION PACKS: R, RStudio, Anaconda, Python, Jupyter
INFRASTRUCTURE SOFTWARE OPTIONS: at least one Hadoop distribution
CONVERGED INFRASTRUCTURE: compute, network, storage
COMPUTE: EMC Converged Platform
Thanks for your attention!
Dr. Stefan Radtke, CTO Isilon, EMEA, Dell EMC | Storage Division