Alabama Db2 User Group
IBM Machine Learning for z/OS
Jamar Smith
North America zAnalytics Data Scientist
[email protected]
© 2017 IBM Corporation

Alabama Db2 User Group IBM Machine Learning for z/OS · Cognitive Assistant for Data Scientists (CADS) §Rapidly selects the algorithm that best fits the data and business scenario

Agenda

What is Machine Learning and Why Machine Learning

Challenges of Machine Learning

IBM Machine Learning for z/OS

Use Case Examples

Architectural Overview

Questions and Answers

Machine Learning for z/OS Demonstration

The Focus on Machine Learning

Gartner identifies Machine Learning as the Top Trend in IT for 2017 and at the top of every CIO's strategy & budget
Source: Gartner

Machine learning segment of the cognitive computing market forecast to grow from $6 billion in 2016 to $52 billion in 2021 with a CAGR of 53.5% for 2016-2021
Published date: 05/02/2016 Source: Mindcommerce

Data scientists are the superheroes and unicorns of today's business. But data scientists are only human, and they are reaching the limits of productivity with current processes.
Published date: 02/11/2016 Source: Forrester

What is Machine Learning?

Data → Perform Analysis → Provide Actionable Insight

Computers that learn without being explicitly programmed.

Hint: It’s just a bunch of math.

Decision Making

§ Loan Application: Approve or Reject
§ House Data: Appraise Home Value
§ Warranty Resolution: Predict Causality
§ Customer Satisfaction: Churn

• Represents a pattern with a Mathematical Function

Number of choices

A single truck visiting 5 different locations: 120 different routes

A single truck visiting 10 different locations: over 3 million routes

Real-world problems contain 100s of trucks and 1000s of locations

There are more possibilities than grains of sand in the world
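
The route counts on this slide follow from counting visiting orders. As a minimal sketch, assuming one truck leaves a fixed depot and stops at every location exactly once, the number of routes for n locations is n factorial (5! = 120 and 10! = 3,628,800). The function name `count_routes` is chosen here for illustration.

```python
import math
from itertools import permutations


def count_routes(num_locations: int) -> int:
    """Number of visiting orders for one truck that stops at every location once."""
    return math.factorial(num_locations)


# Cross-check the formula by brute-force enumeration for the small case.
assert count_routes(5) == len(list(permutations(range(5))))

for n in (5, 10, 20):
    print(f"{n} locations: {count_routes(n):,} routes")
```

Enumeration is already impractical at 20 locations, which is why problems of realistic size need something smarter than trying every option.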

Types of Machine Learning

Supervised Learning
§ Models trained from labeled data
§ Classification & Regression

Unsupervised Learning
§ Models trained from unlabeled data
§ Clustering

(Figures: example plots with axes x1 and x2)
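
The difference between the two types can be sketched on the same six points. This is an illustrative example only: the data, the nearest-centroid classifier, and the two-cluster k-means below are stand-ins chosen for brevity, not algorithms the deck names. The supervised function needs the labels; the unsupervised one never sees them.

```python
from statistics import mean


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_nearest_centroid(points, labels):
    """Supervised: use the labels to compute one centroid per class."""
    centroids = {}
    for label in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == label]
        centroids[label] = tuple(mean(axis) for axis in zip(*members))
    return centroids


def predict(centroids, point):
    """Classify a point by its closest class centroid."""
    return min(centroids, key=lambda label: squared_distance(centroids[label], point))


def two_means(points, iterations=10):
    """Unsupervised: split unlabeled points into two clusters (k-means with k=2)."""
    centers = [min(points), max(points)]  # simple deterministic starting centers
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for p in points:
            distances = [squared_distance(p, c) for c in centers]
            clusters[distances.index(min(distances))].append(p)
        centers = [tuple(mean(axis) for axis in zip(*c)) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return clusters


# Hypothetical (x1, x2) points forming two visible groups.
points = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
labels = ["low", "low", "low", "high", "high", "high"]

centroids = train_nearest_centroid(points, labels)
print("supervised prediction:", predict(centroids, (0.9, 1.1)))
print("unsupervised clusters:", two_means(points))
```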

Why Machine Learning?

§ Tap into the rich value of historical data

§ Discover insights and generate predictive models to make better decisions

§ Don’t just generate reports, use predictive analytics

§ Predictive analytics in the future means things like:
  § Personalizing every client interaction
  § Reducing reduction
  § Fraud detection
  § Cross sell/upsell
  § Customer segmentation
  § Inventory optimization
  § Infinite others all meant to increase your revenue and disrupt your competition

The value of machine learning is rooted in its ability to create accurate models to guide future actions and to discover patterns that we’ve never seen before

The role of data in machine learning

Data scientists spend a significant portion of their time dealing with data access, volume and integration

SOURCE: http://www-03.ibm.com/systems/z/solutions/real-time-analytics/data-analysis.html

What’s involved in machine learning?

§ Machine learning prep
- Clearly define business problem
- Select data set to address business problem
- Transform historical data (label outcomes)

§ Machine learning process
- Prepare data for use in algorithm (transform text values to integers)
- Select a model and train the model against a subset of the data (repeat using a different algorithm)
- Identify the best algorithm/model and test that model with a new subset of the same data
- Evaluate/Validate the model (retest with yet another new subset of the same data)
- Deploy the model (score against a previously unseen set of data)
- Maintain/monitor the model (to ensure quality is not degrading)
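
The process steps above can be sketched end to end in a few lines. Everything here is an assumption made for illustration: the loan records are synthetic, the labeling rule is invented, and the two candidate "algorithms" (a majority-class baseline and a single income threshold) are deliberately trivial. The subset names follow the slide: train, then test to pick a model, then validate on yet another subset.

```python
import random

# Hypothetical, synthetic loan records: (employment status, income in $1000s, approved?)
random.seed(7)
STATUSES = ["unemployed", "part-time", "full-time"]
records = []
for _ in range(300):
    status = random.choice(STATUSES)
    income = random.randint(10, 120)
    approved = int(income + 20 * STATUSES.index(status) > 70)  # invented labeling rule
    records.append((status, income, approved))

# Prepare: transform text values to integers.
encoding = {status: code for code, status in enumerate(STATUSES)}
rows = [((encoding[s], income), label) for s, income, label in records]

# Split into three disjoint subsets of the same data.
random.shuffle(rows)
train, test, validation = rows[:180], rows[180:240], rows[240:]


def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)


def fit_majority(data):
    """Candidate 1: always predict the most common training label."""
    majority = int(sum(y for _, y in data) * 2 >= len(data))
    return lambda x: majority


def fit_income_stump(data):
    """Candidate 2: pick the income threshold with the best training accuracy."""
    best = max(range(10, 121), key=lambda t: accuracy(lambda x: int(x[1] > t), data))
    return lambda x: int(x[1] > best)


# Train each candidate, pick the best on the test subset, re-check on validation.
candidates = {"majority": fit_majority(train), "income stump": fit_income_stump(train)}
scores = {name: accuracy(model, test) for name, model in candidates.items()}
best_name = max(scores, key=scores.get)
print("selected:", best_name,
      "| validation accuracy:", round(accuracy(candidates[best_name], validation), 3))
```

Keeping the validation subset out of model selection is the point of the third split: it gives an estimate that the selection step has not already optimized against.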

The traditional machine learning process

Create, deploy and manage behavioral models

Requires significant human intervention to create, deploy, and manage

The need for machine learning is surpassing the resources to optimize its use.

The challenges of the Data Scientist

Each additional pipeline stage increases complexity dramatically!

More than 50 different algorithms
SVM, Neural Net, Decision Trees/Forests, Naïve Bayes, Regression, SMO, K-nearest Neighbor, Clustering, Rules, …

Explosive # of parameter choices per algorithm
Kernel type, pruning strategy, number of trees in a forest, learning rate, …

Wide variation in performance across different algorithm implementations/user defined algorithms
SPSS vs. Python vs. WEKA vs. SPARK …

Trying new combinations and parameters is time intensive
Computational cost for training a single SVM can exceed 24 hours

Therefore model selection is commonly based on the data scientist’s bias
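
To make the "explosive number of choices" concrete, the sketch below counts configurations in a small, hypothetical search space. The algorithm names, parameter names, and candidate values are invented for illustration; a real space with 50+ algorithms would be far larger.

```python
from itertools import product
from math import prod

# Hypothetical search space: algorithm -> {parameter: candidate values}.
search_space = {
    "svm": {"kernel": ["linear", "rbf", "poly"], "C": [0.1, 1, 10, 100]},
    "random_forest": {"n_trees": [50, 100, 200, 500], "max_depth": [4, 8, 16],
                      "pruning": ["none", "cost-complexity"]},
    "neural_net": {"learning_rate": [0.001, 0.01, 0.1], "hidden_units": [16, 64, 256],
                   "layers": [1, 2, 3]},
}


def configurations(space):
    """Yield every (algorithm, parameter setting) pair in the space."""
    for algorithm, params in space.items():
        names = list(params)
        for values in product(*params.values()):
            yield algorithm, dict(zip(names, values))


total = sum(1 for _ in configurations(search_space))
# Same number without enumerating: per algorithm, multiply the sizes of its value lists.
assert total == sum(prod(len(v) for v in p.values()) for p in search_space.values())
print(f"{total} configurations; at 1 hour each that is {total / 24:.1f} days of training")
```

Adding one more parameter with k candidate values multiplies that algorithm's count by k, which is why exhaustive trial and error stops being practical so quickly.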

IBM has transformed machine learning to learning machines

Quick model development
Fast deployment
Continuous auditing & proactive notification
Easy management

IBM Machine Learning for z/OS: Faster Time-to-Value

Capability: Cognitive Assistant for Data Scientists (CADS)
§ Rapidly selects the algorithm that best fits the data and business scenario
Capability: Hyper Parameter Optimization (HPO)
§ Provide optimal parameters for any given model
Differentiating Value: Create better models in less time

Capability: Machine Learning Pipeline User Interface
§ Wizards make it easy for users to create, train and evaluate a model
Differentiating Value: Simplify model creation

Capability: Continuous Monitoring and Feedback Loop
§ Monitor model performance with feedback data and performance history
§ Notification of model performance deterioration for more efficient retraining
Differentiating Value: Improve models over time

Capability: Modern RESTful APIs
§ Ease collaboration across users (e.g., Data Scientists and App Developers)
Differentiating Value: Easily integrate with existing tools and applications

Capability: Single UI for Deployment
§ Easily manage thousands of models in an enterprise environment
Differentiating Value: Simplify model management
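
The continuous monitoring and feedback loop can be pictured as a rolling accuracy check over feedback records. The sketch below is a generic illustration, not the product's implementation: the class name `AccuracyMonitor`, the window size, and the 0.8 threshold are all assumptions.

```python
from collections import deque


class AccuracyMonitor:
    """Track rolling accuracy from feedback data and flag deterioration."""

    def __init__(self, window: int = 100, threshold: float = 0.8):
        self.outcomes = deque(maxlen=window)  # 1 = correct prediction, 0 = wrong
        self.threshold = threshold
        self.history = []  # rolling accuracy after each feedback record

    def record(self, predicted, actual) -> bool:
        """Add one feedback record; return True when retraining looks warranted."""
        self.outcomes.append(int(predicted == actual))
        rolling = sum(self.outcomes) / len(self.outcomes)
        self.history.append(rolling)
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and rolling < self.threshold


monitor = AccuracyMonitor(window=10, threshold=0.8)
# Ten correct predictions, then a run of wrong ones as the data drifts.
feedback = [(1, 1)] * 10 + [(1, 0)] * 5
alerts = [i for i, (pred, actual) in enumerate(feedback) if monitor.record(pred, actual)]
print("feedback records that triggered a notification:", alerts)
```

Waiting for a full window before alerting avoids notifications based on a handful of early records; the stored history plays the role of the "performance history" mentioned above.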

• Personalize every interaction

• Convert insight into opportunity and opportunity into revenue

• Automatically identify and minimize risk

• Disrupt the competition and the disrupters

• Drive down costs

§ Support data gravity and high security
  o Keep data in place, encrypted and secure
  o Minimize latency, cost and complexity of data movement
  o Transform data on platform
  o Improve data quality and governance

§ Apply the same resiliency to analytics as your operational applications

§ Combine insight from structured & unstructured data from z and non-z data sources

§ Leverage existing people, processes and infrastructure

IBM zAnalytics
Transform transactions into action in moments

A Hybrid Transactional/Analytical Platform

Argus Health – Improving Diabetic Patient Health

Argus Health addresses complexity that payers, providers and pharma face to help maximize clinical benefits for patients

COMPANY SIZE: 501-1,000
INDUSTRY: Healthcare
LOCATION: US

Tools Used: IBM Machine Learning for z/OS, Apache Spark for z/OS, Rocket MDSS, DB2 for z/OS

DATA SCIENCE TECHNIQUES: Classification

Challenge
How to encourage diabetic patients to maintain blood sugar, cholesterol and blood pressure through diet, exercise, and regularly taking their medication

Solution
Score diabetes patients at the point of sale based on factors such as average blood sugar level, cholesterol, blood pressure, whether they take their meds regularly, etc. Depending on the patient's score, their co-pay may vary as they request refills of their medications.

The team built a classification model to predict the risk categories of each patient using IBM Machine Learning for z/OS.

Benefits
Argus was able to positively impact the health of its patients at a lower cost while improving member experience.

Machine Learning Use Cases – by Industry

Financial Services
• Customer risk scoring
• Automated loan underwriting
• Credit monitoring
• Product recommendations
• Planning assistance
• Customized withdrawal limits
• Portfolio tax optimization
• Spending patterns
• Credit increase worthiness
• Customer retention
• Fraud & ID theft detection
• Identity management
• Sentiment & news analysis
• Spending impact influencers
• Risk detection in FS
• Documentation review

Energy & Utilities
• Asset performance & reliability
• Energy demand forecast
• Maximize power generation
• Uncover hidden energy patterns
• Customized incentives
• Energy theft detection
• Appliance efficiency
• Billing forecasting
• Optimize energy programs
• Prevent customer churn
• Customer sentiment analysis

Travel
• Revenue forecasting
• Reducing delays
• Advanced sentiment analysis
• Lost luggage turnaround
• Improved operational efficiency
• Advanced travel offers
• Optimal demand forecasting
• Customer satisfaction and loyalty
• Recommender systems
• Fraud detection
• Passenger & other travel data enrichment

Machine Learning Use Cases by Industry - cont’d

Retail
• New store locations
• Shelf, store and package optimization
• Seasonal planning & forecasting
• Consumer trends
• Optimal product blend
• Personalized promotions
• Inventory forecasting
• Promotional strategies
• New label products & product categories
• Price optimization
• High-end purchase anomalies
• Product lifecycle

Healthcare
• Patient risk migration
• Hospitalization risk diagnosis
• Follow-up visit frequency
• Personalized treatment plans
• Prescription error reduction
• Anomaly device detection
• Reduce unnecessary hospitalizations
• Personalized medication co-pay
• Design treatment plans
• Diagnosis through images
• Medication management
• Increased drug effectiveness
• Healthcare fraud prevention
• Patient sentiment analysis

Manufacturing
• Responsive machines
• Rare failure reduction
• Increased production capacity
• Internal defect reduction
• Accelerated price determination
• Better integrated process flow
• Improving preventative MRO
• OEE improvements
• Quality production forecast
• Demand forecast accuracy
• Optimized product customization

IBM Machine Learning for z/OS

Ø Move Machine Learning capability to the platform where the most valuable data resides

Ø Integrate real-time predictive analytics with transactions

Ø Leverage z/OS superior reliability, availability and security

Flexible options to support different expertise levels

Visual Model Builder
• Supports non-programming data scientists and non-data scientists
• Helps data scientists to be more productive

Jupyter Notebook
• Allows data scientists to use the tool they are most familiar with
• Provides direct access to open source capabilities

Machine Learning for z/OS Demonstration

Visual Model Builder – Data Selection

Select data set for modeling

Visual model builder - Data Preparation

View data set details

Prepare data for algorithm

Visual model builder - Model Training

Select algorithm

Select data to pass to algorithm

Visual model builder - Model Selection

Select model with greatest accuracy

Visual model builder - Model Evaluation

Validate the model with never before seen data

Visual model builder -- Model Deployment and Predict

Select model to deploy

View recent evaluation

Model deployed as URI

Test deployment
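
Because the model is deployed as a URI, an application can score a record with an HTTP call. The sketch below only builds the request and does not send it. The URI, the bearer-token header, and the JSON payload shape are all hypothetical placeholders, not the product's documented API; the real values come from the deployment details shown in the UI and the product documentation.

```python
import json
from urllib.request import Request

# Hypothetical values: replace with the URI and credentials of an actual deployment.
SCORING_URI = "https://mlz.example.com/hypothetical/deployments/churn-model/score"
API_TOKEN = "replace-me"


def build_scoring_request(record: dict) -> Request:
    """Build (but do not send) a JSON POST request that asks the model for a score."""
    body = json.dumps({"records": [record]}).encode("utf-8")  # assumed payload shape
    return Request(
        SCORING_URI,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {API_TOKEN}"},  # assumed auth scheme
    )


request = build_scoring_request({"age": 42, "tenure_months": 18, "plan": "basic"})
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen` and parsing the JSON response, once the placeholders are replaced with real deployment values.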

Visual model builder – Model Monitoring Dashboard

Jupyter Notebook – Data Selection

Data scientists love using notebooks

Import data set

Jupyter notebook – Data Split

Set up the data set splits

Jupyter Notebook – Model Creation with CADS

Use CADS to recommend best model

Using Jupyter Notebook – Model Evaluation

Use Brunel chart to represent model accuracy

Save model for deployment

Questions & Answers
