Findability Day 2016 - Big data analytics and machine learning

Kai Wähner, Technology Evangelist, TIBCO

[email protected]

LinkedIn

@KaiWaehner

www.kai-waehner.de

Findability Day 2016 (Stockholm, Sweden)

How to Leverage Machine Learning to Find Insights in Historical Data

© Copyright 2000-2016 TIBCO Software Inc.

Apply Big Data Analytics to Real Time Processing

Analyze and Act on Critical Business Moments

Agenda

1) Machine Learning and Big Data Analytics
2) Building an Analytic Model
3) Real Time Processing
4) Real World Scenario

Machine Learning

… allows computers to find hidden insights without being explicitly programmed where to look.

Real World Examples of Machine Learning

• Spam Detection
• Search Results + Product Recommendation
• Picture Detection (Friends, Locations, Products)

Machine Learning is already present in daily life…

Now, every enterprise is beginning to leverage it!

The Next Disruption: Google Beats Go Champion

Analytics Maturity Model

[Figure: Value to the Organization (from Immediate to Long-Term Competitive Advantage) plotted against Analytics Maturity. Stages: Measure, Diagnose, Predict, Optimize, Alert, Automate. Capabilities: Self-Service Dashboards, Visual Analytics, Advanced Analytics, Event Processing.]

A good Big Data Analytics platform can provide value to the organization across the full spectrum of use cases.

The first task in a new analytics project is to define a Business Case!

Agenda

1) Machine Learning and Big Data Analytics
2) Building an Analytic Model
3) Real Time Processing
4) Real World Scenario

Analytical Pipeline

Data Acquisition

   cust_id  dept  sku    dollar  gift   date
1  104      C     12003   2.40   FALSE  2016-10-17
2  105      A     12005  62.85   FALSE  2016-10-17
3  102      C     12007  69.23   TRUE   2016-10-17
4  104      B     12004   9.33   FALSE  2016-10-18
5  105      C     12010  14.16   TRUE   2016-10-18
6  101      B     12003  90.43   FALSE  2016-10-19
7  103      C     12005  90.97   FALSE  2016-10-19
n  …        …     …      …       …      …

   cust_id  A      B      C      total   # orders  first_date  last_date
1  100      21.76  23.67   0.00   45.43  2         2016-10-19  2016-10-20
2  101       0.01  74.65   0.00   74.66  3         2016-10-19  2016-10-20
3  102       0.00  60.92  50.29  111.21  6         2016-10-17  2016-10-20
4  103       0.00   0.00  52.30   52.30  2         2016-10-19  2016-10-20
5  104      31.34   9.33   2.40   43.06  4         2016-10-    2016-10-

Data Munging - Transformations
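
The transformation on this slide turns one row per purchase into one row per customer. A minimal sketch in plain Python follows, assuming only the seven sample transactions shown on the slide. The function name `summarize` and the dictionary keys are illustrative, not part of any TIBCO product. Because the slide's summary table covers more rows than the seven shown, the numbers printed here will not match it.

```python
# The seven sample transactions from the slide:
# (cust_id, dept, sku, dollar, gift, date)
transactions = [
    (104, "C", 12003, 2.40, False, "2016-10-17"),
    (105, "A", 12005, 62.85, False, "2016-10-17"),
    (102, "C", 12007, 69.23, True, "2016-10-17"),
    (104, "B", 12004, 9.33, False, "2016-10-18"),
    (105, "C", 12010, 14.16, True, "2016-10-18"),
    (101, "B", 12003, 90.43, False, "2016-10-19"),
    (103, "C", 12005, 90.97, False, "2016-10-19"),
]


def summarize(rows, depts=("A", "B", "C")):
    """Pivot one-row-per-purchase data into one-row-per-customer features."""
    summary = {}
    for cust_id, dept, _sku, dollar, _gift, date in rows:
        s = summary.setdefault(cust_id, {
            **{d: 0.0 for d in depts},   # spend per department
            "total": 0.0,
            "orders": 0,
            "first_date": date,
            "last_date": date,
        })
        s[dept] += dollar
        s["total"] += dollar
        s["orders"] += 1
        # ISO-formatted dates compare correctly as plain strings.
        s["first_date"] = min(s["first_date"], date)
        s["last_date"] = max(s["last_date"], date)
    return summary


per_customer = summarize(transactions)
for cust_id in sorted(per_customer):
    print(cust_id, per_customer[cust_id])
```

The per-customer rows are the shape most model-building tools expect: one observation per entity, one column per feature.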

“The greatest value of a picture is when it forces us to notice what we never expected to see”

John W. Tukey, 1977

Exploratory Data Analysis

Visual Analytics - Interactive Brush-Linked

Which picture represents a model?

A model is a simplification of the truth that helps you with decision making.

Model Building

Employees who write longer emails earn higher salaries!

Model Improvement

Managers

Staff
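
Splitting the data into Managers and Staff suggests that role, not email length, drives salary. The sketch below illustrates that reading with made-up numbers: all data values are hypothetical, and `fit_line` is a plain least-squares helper written for this example. It fits one line to everyone, then one line per role.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: (email length in words, salary in thousands, role)
employees = [
    (40, 40, "staff"), (60, 41, "staff"),
    (80, 39, "staff"), (100, 40, "staff"),
    (200, 90, "manager"), (220, 89, "manager"),
    (240, 91, "manager"), (260, 90, "manager"),
]


def slope_for(rows):
    """Slope of salary against email length for the given rows."""
    return fit_line([r[0] for r in rows], [r[1] for r in rows])[0]


# First model: ignore the role and fit a single line.
pooled_slope = slope_for(employees)

# Improved model: fit each role separately.
staff_slope = slope_for([r for r in employees if r[2] == "staff"])
manager_slope = slope_for([r for r in employees if r[2] == "manager"])

print(f"pooled slope:  {pooled_slope:.3f}")
print(f"staff slope:   {staff_slope:.3f}")
print(f"manager slope: {manager_slope:.3f}")
```

In this toy data the pooled slope is clearly positive, while the slope within each role is close to zero: the apparent effect of email length disappears once role is part of the model.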

Model Validation

How is the IQ of a kid related to the IQ of his / her mum?
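
One common way to validate a model is to fit it on part of the data and measure its error on rows it has never seen. The sketch below does that for the question on the slide. It is an illustration only: the data is synthetic, and the relationship used to generate it (`0.6 * mother_iq + 40` plus noise) is an assumption invented for this example, not a finding.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(slope, intercept, xs, ys):
    """Root mean squared prediction error of the fitted line."""
    errors = [(slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(errors) / len(errors))


random.seed(42)  # fixed seed so the run is repeatable

# Synthetic, hypothetical data (see the assumption stated above).
mothers = [random.gauss(100, 15) for _ in range(200)]
children = [0.6 * m + 40 + random.gauss(0, 10) for m in mothers]

# Hold out the last 25% of rows; the model never sees them while fitting.
split = int(len(mothers) * 0.75)
train_x, test_x = mothers[:split], mothers[split:]
train_y, test_y = children[:split], children[split:]

slope, intercept = fit_line(train_x, train_y)
train_err = rmse(slope, intercept, train_x, train_y)
test_err = rmse(slope, intercept, test_x, test_y)

print(f"slope={slope:.2f} intercept={intercept:.1f}")
print(f"train RMSE={train_err:.1f} test RMSE={test_err:.1f}")
```

A test error far above the training error would indicate that the model memorized the training rows instead of learning a relationship that generalizes.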

Frameworks and Tooling

“…as a next-generation data discovery capability that automatically finds and explains insights from advanced analytics to business users or citizen data scientists”

Smart Data Discovery (for the Business User)

Leverage Machine Learning without the help of a Data Scientist

Advanced Analytics and Big Data Tools (for Data Scientists)

Many more ….

TIBCO Spotfire with R / TERR Integration

Let the business user leverage Analytic Models (created by the Data Scientist) to find insights!

Example: Customer Churn with Random Forest Algorithm
• Behind the ‘refresh model’ button lives a ‘random forest algorithm’
• It requires no a priori assumptions about the data and usually works well out of the box
• The business user doesn’t need to know what random forest is to be empowered by it

Select variables for the model

Agenda

1) Machine Learning and Big Data Analytics
2) Building an Analytic Model
3) Real Time Processing
4) Real World Scenario

Operational Intelligence and Human Interaction

Actions by Operations: Human decisions in real time, informed by up-to-date information

Machine-to-Machine Automation: Automated action based on models of history combined with live context and business rules

Visual Coding for Streaming Analytics with TIBCO StreamBase

• Streaming Operators
• Connectivity
• Visual Development
• Testing & Simulation
• Mature Tooling / Support
• Middleware Integration

Live Visual Analytics UI with TIBCO Live Datamart

• Dynamic aggregation
• Live visualization
• Ad-hoc continuous query
• Alerts
• Action

How to apply analytic models to real time processing without redevelopment?

TIBCO StreamBase

• H2O.ai
• Open Source R
• TERR
• Spark ML
• MATLAB
• SAS
• PMML
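
The idea of applying a model "without redevelopment" can be sketched as keeping the trained model behind a single scoring function, so the event-processing loop never changes when the model is retrained or replaced. The sketch below is plain Python and does not use the StreamBase API. The `Event` fields, the weights, and the threshold are all hypothetical, and `make_scorer` stands in for a model trained offline in one of the tools listed above.

```python
import math
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Tuple


@dataclass
class Event:
    part_id: str
    temperature: float
    vibration: float


def make_scorer(weights: dict, bias: float) -> Callable[[Event], float]:
    """Stand-in for a model trained offline and loaded at startup."""
    def score(event: Event) -> float:
        z = (bias
             + weights["temperature"] * event.temperature
             + weights["vibration"] * event.vibration)
        return 1.0 / (1.0 + math.exp(-z))  # logistic score in (0, 1)
    return score


def process(events: Iterable[Event],
            score: Callable[[Event], float],
            threshold: float) -> Iterator[Tuple[str, str, float]]:
    """Score each incoming event and apply a simple business rule."""
    for event in events:
        p = score(event)
        action = "ALERT" if p >= threshold else "OK"
        yield event.part_id, action, round(p, 3)


# Hypothetical coefficients; in practice they come from the trained model.
scorer = make_scorer({"temperature": 0.1, "vibration": 2.0}, bias=-12.0)

stream = [
    Event("part-1", temperature=70.0, vibration=0.5),
    Event("part-2", temperature=95.0, vibration=2.0),
]

results = list(process(stream, scorer, threshold=0.5))
for row in results:
    print(row)
```

Swapping in a retrained model means passing a different `scorer`. The rule and the loop stay untouched, which is the separation the slide describes.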

TIBCO StreamBase Connector for R and TERR

Agenda

1) Machine Learning and Big Data Analytics
2) Building an Analytic Model
3) Real Time Processing
4) Real World Scenario

Scenario: Predictive Scrapping of Parts in an Assembly Line

Goal: Scrap parts as early as possible automatically to reduce costs in a manufacturing process.

Question: When to scrap a part in Station 1 instead of doing re-work or sending it to Station 2?

[Figure: assembly line with Station 1 and Station 2 and a "Scrap?" decision at each station. Cost Before: 9€, Station 1: 7€, Station 2: 13€, Total Cost: 29€ (or more).]
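
One way to frame the scrap question is as an expected-value comparison: sending a part on to Station 2 costs another 13€, which only pays off if the part turns out good. The sketch below is a simplified rule under stated assumptions. It ignores re-work, takes the 13€ Station 2 cost from the slide, and relies on `good_part_value`, a hypothetical parameter that does not appear in the slides. The predicted scrap probability would come from the analytic model.

```python
STATION_2_COST = 13.0  # EUR, Station 2 cost as shown on the slide


def should_scrap_after_station_1(p_scrap: float, good_part_value: float) -> bool:
    """Scrap when continuing has a negative expected gain.

    p_scrap:         model-predicted probability that the part ends as scrap
    good_part_value: hypothetical value (EUR) of a part that passes Station 2
    """
    expected_gain = (1.0 - p_scrap) * good_part_value - STATION_2_COST
    return expected_gain < 0.0


def scrap_probability_threshold(good_part_value: float) -> float:
    """Scrap probability above which scrapping early is the cheaper choice."""
    return 1.0 - STATION_2_COST / good_part_value


# Hypothetical value of a finished good part: 50 EUR.
print(should_scrap_after_station_1(p_scrap=0.9, good_part_value=50.0))
print(should_scrap_after_station_1(p_scrap=0.2, good_part_value=50.0))
print(scrap_probability_threshold(50.0))
```

The costs already spent before and at Station 1 (9€ and 7€) are sunk at the decision point, so they do not appear in the rule.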

TIBCO Spotfire with H2O Integration

Data Discovery / Data Mining (“Are parts that repeat a station more likely scrap parts?”)

TIBCO Live Datamart

Operational Intelligence (“Monitor the manufacturing process and change rules in real time!”)

Live Datamart Desktop Client

Live Datamart Web API

TIBCO Accelerator for Apache Spark

The TIBCO Accelerator for Spark is a TIBCO-engineered, lightweight, open-source fast start for systems that stream data into Spark, discover patterns in Spark with Spotfire, and operationalize the insights on Big Data.

FUNCTIONAL COMPONENTS

1. Fast Data Preparation for IoT: Dozens of enterprise and IoT data preparation adapters: MQTT, Databases; inbound creation of HDFS, Parquet, HBase, Avro…

2. Spotfire Model Discovery Template: Use Spotfire to explore the Spark data lake, create a predictive model, train it in H2O, and deploy it to Streaming Analytics.

3. Operationalize Predictive Models: ZooKeeper deployment to StreamBase nodes living in the Spark cluster, via H2O, PMML, and TERR models.

4. Streaming Analytics for Automation: Automate action based on predictive models: make offers to customers, stop fraudulent transactions, alert.

5. Monitor & Retrain Model: Monitor the behavior of the model and retrain when necessary.

6. Drag & Drop for Business Solution Developers: Code-free development environment for work with H2O, HDFS, Avro, TERR.

Key Take-Aways

• Insights are hidden in Historical Data on Big Data Platforms

• Machine Learning and Big Data Analytics find these Insights by building Analytic Models

• Event Processing uses these Models (without Redevelopment) to take Action in Real Time

Questions? Please contact me!

Kai Wähner, Technology Evangelist

[email protected]
@KaiWaehner
www.kai-waehner.de
LinkedIn