Transformational Machine Learning Use Cases You Can Deploy Now
CON6234

Charlie Berger, Sr. Director Product Management, Machine Learning, AI and Cognitive Analytics
Sebastian Turullols, Sr. Director Hardware Development, Microelectronics

Copyright © 2017, Oracle and/or its affiliates. All rights reserved.



Safe Harbor Statement

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.


Today…

• Get Droid alerts/updates:

– Local news

– Local weather

– Your stated interests

– Your sports teams

– National news updates

Google Now Provides Tailored Local Updates


Imagine these scenarios

• Wake up, read a Droid alert about meeting an old girlfriend in the coffee shop; she is getting married but hasn’t told anyone yet.

– Change of address

– Adopt a dog

– Facebook pics

– Tweets

– Online ring purchase by close contact

Meet old Girlfriend at Coffee Shop

When you meet your old girlfriend at the coffee shop this morning, act surprised to learn she is getting married


The “Near Future”

• “Datafication” of EVERYTHING

• “Digital exhaust”

– GPS

– Tweets

– Geo-tags

– Facebook: posts, pics, friends

– LinkedIn

– RFID

– Medical records

Big Data, Interconnected World + Machine Learning

http://terrificdata.com/2016/10/11/examples-big-data-applications/


2001: A Space Odyssey

• Our adoption of machine learning and “artificial intelligence” is about at this stage —the BEGINNING!

The Dawn of Man scene


Automatically sift through large amounts of data to find hidden patterns, discover new insights and make predictions

What is Machine Learning, Data Mining, Predictive Analytics?

• Identify most important factor (Attribute Importance)

• Predict customer behavior (Classification)

• Predict or estimate a value (Regression)

• Find profiles of targeted people or items (Decision Trees)

• Segment a population (Clustering)

• Find fraudulent or “rare events” (Anomaly Detection)

• Determine co-occurring items in a “basket” (Associations)


CLASSIFYING CUSTOMERS

Features: Basic Query, Basic Analytics, Machine Learning

Behavioral Customer Segment | “Retired Cosmopolitan” | “Affluential Executive” | “New Home Mom” | “Young Successful startup” | “Executive product collector”
Probability to Buy New Product X | 31% | 45% | 55% | 21% | 72%
RFM (Recency, Frequency and Monetary Value): Purchases in the Last 3/6/12 mo. | 1 item / $35 in the last 3 mo | 2 items / $150 in the last 6 mo | 3 items / $75 in the last 3 mo | 3 items / $225 in the last 12 mo | 9 items / $250 in the last 6 mo
Age / Gender | Known | Known | Known | Unknown | Unknown
Marketing Preferences | Mail and e-mail | e-mail | e-mail and Facebook | e-mail and Google+ | Mail, e-mail and Twitter


ML Model Deployment for Real-Time Scoring

• On-the-fly, single record apply with new data (e.g. from call center)

Real-Time Scoring, Predictions and Recommendations

Channels: Call Center, Get Advice, Web, Mobile, Branch Office, Social Media, Email

SELECT prediction_probability(CLAS_DT_1_15, 'Yes'
         USING 7800 AS bank_funds, 125 AS checking_amount, 20 AS credit_balance,
               55 AS age, 'Married' AS marital_status,
               250 AS MONEY_MONTLY_OVERDRAWN, 1 AS house_ownership)
FROM dual;

Likelihood to respond:

Oracle Cloud


Fiserv Risk Analytics in Electronic Payments

Objectives

Prevent $200M in losses every year using data to monitor, understand and anticipate fraud

Solution

We installed OAA analytics for model development during 2014

When choosing the tools for fraud management, speed is a critical factor

OAA provided a fast and flexible solution for model building, visualization and integration with production processes

“When choosing the tools for fraud management, speed is a critical factor. Oracle Advanced Analytics provided a fast and flexible solution for model building, visualization and integration with production processes.”

– Miguel Barrera, Director of Risk Analytics, Fiserv Inc.

– Julia Minkowski, Risk Analytics Manager, Fiserv Inc.

3 months to run & deploy Logistic Regression (using SAS)

1 month to estimate and deploy Trees and GLM

1 week to estimate, 1 week to install rules in online application

1 day to estimate and deploy Trees + GLM models (using Oracle Advanced Analytics)

Oracle Advanced Analytics


Fraud Prediction Demo

-- Rebuild the settings table and drop any previous model
drop table CLAIMS_SET;
exec dbms_data_mining.drop_model('CLAIMSMODEL');
create table CLAIMS_SET (setting_name varchar2(30), setting_value varchar2(4000));
insert into CLAIMS_SET values ('ALGO_NAME','ALGO_SUPPORT_VECTOR_MACHINES');
insert into CLAIMS_SET values ('PREP_AUTO','ON');
commit;

-- Build the model (the target column argument is null)
begin
  dbms_data_mining.create_model('CLAIMSMODEL', 'CLASSIFICATION', 'CLAIMS', 'POLICYNUMBER', null, 'CLAIMS_SET');
end;
/

-- Top 5 most suspicious fraud policy holder claims
select * from
  (select POLICYNUMBER, round(prob_fraud*100,2) percent_fraud,
          rank() over (order by prob_fraud desc) rnk
   from (select POLICYNUMBER,
                prediction_probability(CLAIMSMODEL, '0' using *) prob_fraud
         from CLAIMS
         where PASTNUMBEROFCLAIMS in ('2to4', 'morethan4')))
where rnk <= 5
order by percent_fraud desc;
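The ranking part of the demo query (filter on PASTNUMBEROFCLAIMS, convert the probability to a percentage, keep rows with rank <= 5) can be sketched outside the database. This is only an illustration: the claim records and probabilities are made up, and the names `Claim` and `top_suspicious` are hypothetical, not part of the demo.

```python
from typing import NamedTuple

class Claim(NamedTuple):
    policy_number: int
    past_number_of_claims: str
    prob_fraud: float  # hypothetical model output in [0, 1]

def top_suspicious(claims, n=5, categories=("2to4", "morethan4")):
    """Mimic the SQL: filter, rank by probability descending, keep rank <= n."""
    eligible = [c for c in claims if c.past_number_of_claims in categories]
    ranked = []
    for c in sorted(eligible, key=lambda c: c.prob_fraud, reverse=True):
        # SQL RANK(): ties share a rank equal to 1 + number of strictly higher rows.
        rnk = 1 + sum(1 for o in eligible if o.prob_fraud > c.prob_fraud)
        if rnk <= n:
            ranked.append((c.policy_number, round(c.prob_fraud * 100, 2), rnk))
    return ranked

# Made-up records for illustration only.
claims = [
    Claim(101, "2to4", 0.91),
    Claim(102, "none", 0.99),       # excluded by the filter
    Claim(103, "morethan4", 0.75),
    Claim(104, "2to4", 0.75),       # ties with 103
    Claim(105, "1", 0.80),          # excluded by the filter
    Claim(106, "morethan4", 0.10),
]
result = top_suspicious(claims, n=2)
for row in result:
    print(row)
```

Because RANK() gives tied rows the same rank, a "top n" cut can return more than n rows, which the sketch reproduces.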

Automated In-DB Analytical Methodology

Automated Monthly “Application”! Just add:

CREATE VIEW CLAIMS2_30 AS
SELECT * FROM CLAIMS2
WHERE mydate > SYSDATE - 30;

Time measure: set timing on;


• Oracle Advanced Analytics factory-installed predictive analytics

• Employees likely to leave and predicted performance

• Top reasons, expected behavior

• Real-time "What if?" analysis

Human Capital Management Powered by OAA

HCM Predictive Workforce Predictive Analytics Applications

Link to Oracle HCM on O.com HCM Predictive Workforce demo



Predicting Power Consumption in Real Time in the SPARC M8 Microprocessor


Power Prediction Background

• Power is a precious resource in just about any domain

• Managing power consumption matters: electricity bill, overloading circuits

• Power can be managed at many levels

– Building

– Data Center

– Server

– Processor

• My work focuses on the processor, but these ideas can be applied at other levels or even to totally different ML applications!


Microprocessor Power Management 101

• Applies from a mobile phone CPU to a big iron CPU like the SPARC M8

– Limit the instantaneous maximum power draw to avoid tripping circuit breakers or damaging components by overloading them (remember exploding batteries?)

– Limit the average maximum power to avoid thermal problems

– Find ways to save power to maximize efficiency / save battery life

• Before you can implement any of these things

– you have to know what the power is

– AND you have to know it far enough in advance to avoid bad things happening
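The two limits above (instantaneous maximum and average maximum) can be sketched as a check over a stream of predicted power values. This is a minimal illustration, not the SPARC control logic: the function name, limits, window length and sample values are all hypothetical.

```python
from collections import deque

def check_power(samples, peak_limit, avg_limit, window):
    """Return (index, reason) for the first violation, or None.

    samples: predicted power values, one per control interval (hypothetical units).
    peak_limit: instantaneous maximum allowed for any single sample.
    avg_limit: maximum allowed mean over the last `window` samples.
    """
    recent = deque(maxlen=window)
    for i, p in enumerate(samples):
        if p > peak_limit:
            return i, "peak"
        recent.append(p)
        if len(recent) == window and sum(recent) / window > avg_limit:
            return i, "average"
    return None

trace = [80, 95, 96, 97, 98, 60]  # made-up trace
print(check_power(trace, peak_limit=120, avg_limit=95, window=3))
```

Running the check on predicted values rather than measured ones is what gives a controller time to react before either limit is actually exceeded.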


Microprocessor Power Management – Dark Ages

• Before we knew how to predict power using machine learning

• Had to always assume the worst case power

– Led to engineering overdesign, making servers more expensive

– Inefficient operation; e.g. fewer servers per rack, lower clock frequency / performance


Microprocessor Power Management – ML Transformation

• 1st Transformational Change – Using ML to predict power

– 33% frequency / performance improvement versus previous designs

– Enabled many new features

• 2nd Transformational Change – Using Oracle Advanced Analytics flow

– 95% reduction in effort needed to build and train ML model

– 100X speedup in training / scoring runtimes


IP Portfolio

Added Power Predictor


Control System Overview

Power Predictor


How we predict power

• SPARC Microprocessors starting with M7 include hardware power predictors that estimate the instantaneous power consumed by an entire core using a few key modeling variables.

• R and later Oracle Advanced Analytics were used to build this model, including variable selection and training.

• The most compute intensive task involves selecting ~50 variables from a candidate list of thousands based on about 20GB of training data.
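Selecting a few dozen variables from thousands of candidates can be approached in several ways; the slides do not name the algorithm used. The sketch below assumes greedy forward selection with ordinary least squares, where each round adds the candidate that lowers the residual sum of squares the most. All names and data are hypothetical, and collinear candidates are not handled.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def rss(columns, y):
    """Residual sum of squares of a least-squares fit of y on columns + intercept."""
    cols = [[1.0] * len(y)] + columns
    gram = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    beta = solve(gram, rhs)
    return sum((yi - sum(b * c[i] for b, c in zip(beta, cols))) ** 2
               for i, yi in enumerate(y))

def forward_select(candidates, y, k):
    """Greedy selection: each round adds the variable that lowers RSS the most."""
    chosen = []
    for _ in range(k):
        remaining = [name for name in candidates if name not in chosen]
        best = min(remaining,
                   key=lambda name: rss([candidates[c] for c in chosen + [name]], y))
        chosen.append(best)
    return chosen

# Made-up data: y = 10*a + b, while c is unrelated noise.
candidates = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "b": [1.0, 0.0, 1.0, 0.0, 1.0, 0.0],
    "c": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0],
}
y = [11.0, 20.0, 31.0, 40.0, 51.0, 60.0]
selected = forward_select(candidates, y, k=2)
print(selected)
```

Each round refits one model per remaining candidate, which is why the cost grows quickly with the size of the candidate list and the training set.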


Performance Challenge

• Out of the box, one round of evaluating a variable addition took 3 hours running single-threaded on a single x86 CPU.

• Since hundreds of rounds were needed to complete the predictive model design, this prompted a project to accelerate the runtime

– applied the massive parallelism and memory capacity of the Oracle SPARC T7-4 server


Oracle R Distribution Parallelization Results

• By moving to the T7-4 and applying a series of ORD and script coding optimizations that take advantage of the system’s massive number of threads and memory, the team realized a 36X runtime improvement

Runtime (minutes): Before 180; After Tuning 5
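The 36X figure follows from the two runtimes on this slide (180 minutes before, 5 minutes after). A small sketch of the arithmetic, including what it means for a full design run; the round count of 200 is a hypothetical stand-in for "hundreds of rounds", not a number from the talk.

```python
def speedup(before_minutes, after_minutes):
    """Ratio of the old runtime to the new runtime."""
    return before_minutes / after_minutes

def total_hours(rounds, minutes_per_round):
    """Wall-clock hours for a given number of selection rounds."""
    return rounds * minutes_per_round / 60

ROUNDS = 200  # hypothetical; the slides only say "hundreds"

print(speedup(180, 5))                   # runtime improvement per round
print(total_hours(ROUNDS, 180))          # hours at 3 hours per round
print(round(total_hours(ROUNDS, 5), 1))  # hours at 5 minutes per round
```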


Power Prediction Results

• Achieve <6% error

• Across all benchmarks
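The slide does not say how the error is computed. One plausible reading, assumed here, is a mean absolute percentage error between predicted and measured power, evaluated per benchmark with the worst case reported. The benchmark names and numbers below are made up for illustration.

```python
def percent_error(predicted, measured):
    """Mean absolute percentage error, in percent."""
    pairs = list(zip(predicted, measured))
    return 100.0 * sum(abs(p - m) / m for p, m in pairs) / len(pairs)

def worst_benchmark(results):
    """results maps benchmark name -> (predicted list, measured list)."""
    errors = {name: percent_error(p, m) for name, (p, m) in results.items()}
    worst = max(errors, key=errors.get)
    return worst, errors[worst]

# Made-up numbers for illustration only.
results = {
    "bench_a": ([102.0, 98.0], [100.0, 100.0]),
    "bench_b": ([52.5, 47.5], [50.0, 50.0]),
}
name, err = worst_benchmark(results)
print(name, err)
```

Reporting the worst benchmark rather than the overall mean is what a claim of "across all benchmarks" would require under this reading.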


Oracle Advanced Analytics >100X Improvement

• Moved to Oracle Advanced Analytics on a T8-2 and, with minimal effort, achieved much faster runtimes out of the box than we got with standard R after a lot of hard optimization work

Runtime (minutes): R Out-of-Box 180; R Tuned 5; OAA Out-of-Box 1


1 minute OAA vs 180 minute R comparison unfair

• Oracle Advanced Analytics is doing so much more for us in 60 seconds compared to what we had out of the box with R in 3 hours!

– Automatic selection of the best 60 variables from a candidate pool of 940

– R starting point was only a regression of 48+1 variables

Runtime (minutes): R Out-of-Box 180; R Tuned 5; OAA Out-of-Box 1


SPARC M8 Processor–Based Servers

 | SPARC T8-1 | SPARC T8-2 | SPARC T8-4 | SPARC M8-8
Processors | 1 | 2 | 2 or 4 | Up to 8 (1)
Max Cores | 32 | 64 | 128 | 256
Max Threads | 256 | 512 | 1,024 | 2,048
Max Memory (2) | 1 TB | 2 TB | 4 TB | 8 TB
Form Factor | 2U | 3U | 5U | Rack / 10U
Domaining | Logical domains (LDoms) | LDoms | LDoms | LDoms, PDoms (1)

(1) Factory configured with one (up to 8 processors) or two (up to 4 processors each) static physical domains (PDoms)
(2) Maximum memory capacity is based on 64 GB DIMMs


Oracle Advanced Analytics 12.2 Model Build Time Performance

T7-4 (SPARC & Solaris) vs. X5-4 (Intel and Linux); Model Build Time (Secs / Degree of Parallelism)

OAA 12.2 Algorithm | Rows (Ms) | T7-4 | X5-4
Attribute Importance | 640 | 28s / 512 | 44s / 72
K Means Clustering | 640 | 161s / 256 | 268s / 144
Expectation Maximization | 159 | 455s / 512 | 588s / 144
Naive Bayes Classification | 320 | 17s / 256 | 23s / 72
GLM Classification | 640 | 154s / 512 | 363s / 144
GLM Regression | 640 | 55s / 512 | 93s / 144
Support Vector Machine (IPM solver) | 640 | 404s / 512 | 1411s / 144
Support Vector Machine (SGD solver) | 640 | 84s / 256 | 188s / 72

Unofficial

The way to read these results is that they compare 2 chips: X5 (Intel and Linux) and T7 (SPARC and Solaris). They measure scalability (time in seconds) with increasing degree of parallelism (DOP). The data also has high-cardinality categorical columns, which translate into 9K mining attributes (when algorithms require explosion). There are no comparisons to 12.1, and it is fair to say that the 12.1 algorithms could not run on data of this size.

Wow! That’s Fast!

In 24 hours, could build new predictive models for entire United States Population, for 400 attributes, 4 times!


Oracle Advanced Analytics 12.2 Model Build Time Performance

T8-2 (SPARC & Solaris) vs. X6-2 (Intel and Linux); Model Build Time (Secs / Degree of Parallelism)

OAA 12.2 Algorithm | Rows (Ms) | T8-2 | X6-2
Attribute Importance | 640 | 30s / 256 | 47s / 88
K Means Clustering | 640 | 180s / 256 | 343s / 88
Naive Bayes Classification | 320 | 31s / 384 | 47s / 88
GLM Classification | 640 | 182s / 512 | 305s / 88
GLM Regression | 640 | 54s / 256 | 100s / 88
Support Vector Machine (IPM solver) | 640 | 841s / 512 | 1380s / 88
Support Vector Machine (SGD solver) | 640 | 88s / 256 | 170s / 88

Unofficial



Oracle Database Advanced Analytics Option Machine Learning on SPARC

• Oracle Advanced Analytics in Oracle Database 12.2

– SPARC M7 faster per core on 64-bit floating-point-intensive training

– In-memory 640 million records, Airline On-time dataset

ML training SPARC M7 up to 1.4x faster per core than x86

Training: Creating Model from data

Model | Attributes | M8-2 (2-chip) | M8 per core vs X6-2
Supervised
GLM Classification | 900 | 180s | 1.2x
SVM SGD Solver | 9000 | 83s | 1.4x
SVM IPM Solver | 900 | 811s | 1.2x
GLM Regression | 900 | 59s | 1.2x
Cluster Model
K-Means | 9000 | 168s | 1.4x
Expectation Maximization | 9000 | 662s | 0.8x

SGD = Stochastic Gradient Descent; IPM = Interior Point Method

"per core = (server performance)/(server core count)"
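The per-core definition quoted on this slide can be turned into a small calculation. The sketch below takes performance as the inverse of build time and uses the GLM Classification build times reported earlier in the deck for the T8-2 (182s) and X6-2 (305s), together with the T8-2 maximum of 64 cores from the server table. The X6-2 core count of 44 is an assumption not stated in the slides.

```python
def per_core_ratio(time_a, cores_a, time_b, cores_b):
    """Per-core performance of system A relative to system B.

    Performance is taken as 1 / build time, so
    per-core performance = (1 / time) / cores.
    """
    return (1.0 / time_a / cores_a) / (1.0 / time_b / cores_b)

# T8-2: 182s on 64 cores (from the slides).
# X6-2: 305s on 44 cores (core count is an assumption).
ratio = per_core_ratio(time_a=182, cores_a=64, time_b=305, cores_b=44)
print(round(ratio, 2))
```

Under these assumptions the result lands close to the 1.2x listed for GLM Classification, although the slide's own figure is based on a slightly different build time (180s).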


Machine Learning and Advanced Analytics Functionality Overview


Dilbert on Big Data


Machine Learning/Analytics + Data Warehouse + Hadoop/Spark

• Inherent Problems of “Platform Sprawl”

– Complexity

– Data Movement

– Duplicated Data

– Data Latency

– Security exposures

– Duplicated Storage

– Duplicated Backups

– Duplicated Systems

– Duplicated Space and Power


Traditional vs. Oracle Machine Learning/Predictive Analytics

• Traditional: “Move the data”

• Oracle: “Don’t move the data!”


Traditional vs. Oracle Machine Learning/Predictive Analytics

• Traditional: “Move the data”

• Oracle: “Move the algorithms”

Simpler, Smarter Data Management + Analytics / Machine Learning Architecture


Data remains in Database & Hadoop

Model building and scoring occur in-database

Use R packages with data-parallel invocations

Leverage investment in Oracle IT

Eliminate data duplication

Eliminate separate analytical servers

Deliver enterprise-wide applications

GUI for ML/Predictive Analytics & code gen

R interface leverages database as HPC engine

Major Benefits

Oracle’s Machine Learning/Advanced Analytics

Traditional Analytics (Hours, Days or Weeks): Data Extraction; Data Prep & Transformation; Data Mining Model Building; Data Mining Model “Scoring”; Data Prep. & Transformation; Data Import

Oracle Advanced Analytics (Secs, Mins or Hours): Data Preparation; Model Building; Model “Scoring” with Embedded Data Prep

Savings

Fastest Way to Deliver Enterprise-wide Predictive Analytics


Parallel, scalable data mining algorithms and R integration

In-Database + Hadoop—Don’t move the data

Data analysts, data scientists & developers

Drag and drop workflow, R and SQL APIs

Extends data management into powerful advanced/predictive analytics platform

Enables enterprise predictive analytics deployment + applications

Key Features

Oracle’s Machine Learning/Advanced Analytics Fastest Way to Deliver Scalable Enterprise-wide Predictive Analytics


CLASSIFICATION – Naïve Bayes – Logistic Regression (GLM) – Decision Tree – Random Forest – Neural Network – Support Vector Machine

CLUSTERING – Hierarchical K-Means – Hierarchical O-Cluster – Expectation Maximization (EM)

ANOMALY DETECTION – One-Class SVM

TIME SERIES – Holt-Winters, Regular & Irregular, with and w/o trends & seasonal – Single, Double Exp Smoothing

REGRESSION – Linear Model – Generalized Linear Model – Support Vector Machine (SVM) – Stepwise Linear regression – Neural Network – LASSO

ATTRIBUTE IMPORTANCE – Minimum Description Length – Principal Comp Analysis (PCA) – Unsupervised Pair-wise KL Div – CUR decomposition for row & AI

ASSOCIATION RULES – Apriori / market basket

PREDICTIVE QUERIES – Predict, cluster, detect, features

SQL ANALYTICS – SQL Windows, SQL Patterns, SQL Aggregates


• OAA (Oracle Data Mining + Oracle R Enterprise) and ORAAH combined

• OAA includes support for Partitioned Models, Transactional, Unstructured, Geo-spatial, Graph data, etc.

Oracle’s Machine Learning & Adv. Analytics Algorithms

FEATURE EXTRACTION – Principal Comp Analysis (PCA) – Non-negative Matrix Factorization – Singular Value Decomposition (SVD) – Explicit Semantic Analysis (ESA)

TEXT MINING SUPPORT – Algorithms support text type – Tokenization and theme extraction – Explicit Semantic Analysis (ESA) for document similarity

STATISTICAL FUNCTIONS – Basic statistics: min, max, median, stdev, t-test, F-test, Pearson’s, Chi-Sq, ANOVA, etc.

R PACKAGES – CRAN R Algorithm Packages through Embedded R Execution – Spark MLlib algorithm integration

EXPORTABLE ML MODELS – C and Java code for deployment


You Can Think of Oracle’s Advanced Analytics Like This…

Traditional SQL

– “Human-driven” queries

– Domain expertise

– Any “rules” must be defined and managed

SQL Queries

– SELECT

– DISTINCT

– AGGREGATE

– WHERE

– AND OR

– GROUP BY

– ORDER BY

– RANK

Oracle Advanced Analytics - SQL &

– Automated knowledge discovery, model building and deployment

– Domain expertise to assemble the “right” data to mine/analyze

Analytical SQL “Verbs”

– PREDICT

– DETECT

– CLUSTER

– CLASSIFY

– REGRESS

– PROFILE

– IDENTIFY FACTORS

– ASSOCIATE

+


Oracle Advanced Analytics

• R language for interaction with the database

• R-SQL Transparency Framework overloads R functions for scalable in-database execution

• Function overload for data selection, manipulation and transforms

• Interactive display of graphical results and flow control as in standard R

• Submit user-defined R functions for execution at database server under control of Oracle Database

• 30+ Powerful data mining algorithms (regression, clustering, AR, DT, etc.)

• Run Oracle Data Mining SQL data mining functions (ore.odmSVM, ore.odmDT, etc.)

• Speak “R” but executes as proprietary in-database SQL functions—machine learning algorithms and statistical functions

• Leverage database strengths: SQL parallelism, scale to large datasets, security

• Access big data in Database and Hadoop via SQL, R, and Big Data SQL

Other R packages

Oracle R Enterprise (ORE) packages

R-> SQL Transparency “Push-Down”

• R Engine(s) spawned by Oracle DB for database-managed parallelism

• ore.groupApply high performance scoring

• Efficient data transfer to spawned R engines

• Emulate map-reduce style algorithms and applications

• Enables production deployment and automated execution of R scripts

How Oracle R Enterprise Compute Engines Work

[Diagram labels: R-> SQL push-down with results returned; In-Database Adv Analytical SQL Functions; R Engine with Oracle R Enterprise packages and other R packages; Embedded R Package Callouts; Oracle Database 12c]


Database Cloud

Oracle’s Machine Learning/Advanced Analytics Platforms Machine Learning Algorithms Embedded in the Data Management Platforms

“Oracle Machine Learning” Database Edition: Machine Learning Algorithms, Statistical Functions + R Integration for Scalable, Parallel, Distributed, in-DB Execution

Big Data Cloud Service

“Oracle Machine Learning” Big Data Cloud: ORAAH, Machine Learning Algorithms, Statistical Functions + R Integration for Scalable, Parallel, Distributed Execution

“Analytics Producers”

Data Scientists, R Users, Citizen Data Scientists

“Analytics Consumers” BI Analysts, Managers Functional Users (HCM, CRM)

Data Management + Advanced Analytical Platform Big Data SQL


Manage and Analyze All Your Data

Big Data SQL / R

SQL / R

Object Store

“Engineered Features”: derived attributes that reflect domain knowledge (key to the best models), e.g.:

• Counts

• Totals

• Changes over time
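The three kinds of engineered features named here (counts, totals, changes over time) can be derived from raw transactions with a simple aggregation. A minimal sketch, assuming a hypothetical transaction layout of (customer_id, month_index, amount); the data and the `engineer` function are illustrative, not from the talk.

```python
from collections import defaultdict

def engineer(transactions):
    """transactions: iterable of (customer_id, month_index, amount).

    Returns {customer_id: {"count", "total", "change"}}, where change is
    the last observed month's spend minus the first observed month's spend.
    """
    by_customer = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for cust, month, amount in transactions:
        by_customer[cust][month] += amount
        counts[cust] += 1
    features = {}
    for cust, monthly in by_customer.items():
        months = sorted(monthly)
        features[cust] = {
            "count": counts[cust],
            "total": sum(monthly.values()),
            "change": monthly[months[-1]] - monthly[months[0]],
        }
    return features

# Made-up transactions for illustration only.
rows = [("c1", 1, 35.0), ("c1", 3, 40.0), ("c1", 3, 10.0), ("c2", 2, 150.0)]
features = engineer(rows)
print(features["c1"])
```

In a database setting the same derivation would typically be a GROUP BY over the transaction table, which keeps the feature engineering next to the data.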

Boil down the Data Like

Data Scientists, R Users, Citizen Data Scientists

Architecturally, Many Options and Flexibility


“Why Oracle? Because that’s where the data is!”

– Larry Ellison, Executive Chairman and CTO of Oracle Corporation


Oracle Data Miner “workflow” UI: Oracle SQL Developer extension; Easy to Use for “Citizen Data Scientist”

• Easy to use to define analytical methodologies that can be shared

• SQL Developer Extension

• Workflow API and generates SQL code for immediate deployment


Rapidly Build, Evaluate & Deploy Analytical Methodologies Leveraging a Variety of Data Sources and Types

Consider:

• Demographics

• Past purchases

• Recent purchases

• Comments & tweets

Unstructured data also mined by algorithms

Transactional POS data

Generates SQL scripts and workflow API for deployment

Inline predictive model to augment input data

SQL Joins and arbitrary SQL transforms & queries – power of SQL

Modeling Approaches


Oracle Machine Learning Key Features

• Collaborative UI for data scientists

– Packaged with Autonomous Data Warehouse Cloud (V1)

– Easy access to shared notebooks, templates, permissions, scheduler, etc.

– SQL ML algorithms API (V1)

– Supports deployment of ML analytics

Machine Learning Notebook for Autonomous Data Warehouse Cloud


www.biwasummit.org

www.analyticsanddatasummit.org
