Outrun Your Competition With SAS® In-Memory Analytics
Sascha Schubert, Global Technology Practice, SAS
#analyticsx
Copyright © 2016, SAS Institute Inc. All rights reserved.

2016. 11. 16.


Topics AGENDA

• Challenges with Big Data Analytics

• How SAS can help you to minimize time to value with In-Memory Analytics

• SAS Viya

Big Data Analytics: Why is it so important now?

• Data
• Computing power
• Algorithms

Big Data Analytics: Business Applications

“In the new world, it is not the big fish which eats the small fish, it’s the fast fish which eats the slow fish.”
Klaus Schwab, Founder and Executive Chairman, World Economic Forum

Data to Decisions: Reduce Time to Decision

Producing a new model or adjusting an existing model for the business often takes too long to meet fast-changing markets.

Complexity is added as many stakeholders are involved in the predictive analytics process.

Big data is adding to the complexity.

Implementation of a process model is needed to provide fast, repeatable and high-quality results.

[Figure: value plotted against time. Labels: Lost Value, Lost Time, Data Latency, Modeling Latency, Deployment Latency, Decision Latency.]

Decisions at Scale: The Analytic Lifecycle

[Figure: the analytic lifecycle, with stages labeled ASK, PREPARE, EXPLORE, MODEL, DEPLOY, EXECUTE, MONITOR.]

Data Science (Discovery & Development of Analytics): Lots of Data, New Data, Experimentation, Fail Fast, Test & Learn, Interactive, Iterative, Innovation, Flexibility

IT (Deployment & Execution of Analytics): Regulated, Automated, Governed, Embed, Reliable, Decisions, Consistent, Documented, Actions

Factors to Speed up Data to Decisions Time

• Support for complete analytical lifecycle
• Standardized transparent processes
• Minimize data movement for big data volumes
• In-memory processing on modern distributed platforms
• Easy-to-use, persona-based self-service software
• Automation of repetitive steps

Bring SAS Processing to the Data

SAS/Access® to Big Data
• Extract data into SAS
• Push down SQL queries into data environment

SAS® In-Database Technologies
• Push SAS processing into data environment
• Run natively in data environment

SAS® In-Memory Analytics
• SAS native distributed in-memory computing for fast advanced analytics
• In-memory data exchange

[Figure: three architectures labeled Traditional, Operational and Transformational, showing SAS analytics servers, the network, high-speed links, SQL, in-database code and the data.]
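The difference between extracting data and pushing a query down into the data environment can be sketched in a few lines. This is only an illustration using Python's built-in sqlite3 module as a stand-in for the data environment, not SAS/Access itself; the table name `accounts`, its columns and its rows are invented for the example.

```python
import sqlite3

# Hypothetical table and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (region TEXT, balance REAL)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)",
    [("north", 120.0), ("north", 80.0),
     ("south", 50.0), ("south", 150.0), ("south", 100.0)],
)

def mean_by_region_extract(conn):
    """Extract pattern: pull every row to the client, then aggregate there."""
    totals = {}
    for region, balance in conn.execute("SELECT region, balance FROM accounts"):
        count, total = totals.get(region, (0, 0.0))
        totals[region] = (count + 1, total + balance)
    return {region: total / count for region, (count, total) in totals.items()}

def mean_by_region_pushdown(conn):
    """Pushdown pattern: the database aggregates; one row per region moves."""
    query = "SELECT region, AVG(balance) FROM accounts GROUP BY region"
    return dict(conn.execute(query))

print(mean_by_region_extract(conn))
print(mean_by_region_pushdown(conn))
```

Both functions return the same averages. The difference is how many rows cross the connection: all of them in the first case, one per region in the second, which is the data movement the slide argues should be minimized.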

SAS® In-Database Technologies

SAS® Scoring Accelerator: Aster, DB2, Pivotal, Hadoop, Netezza, Oracle, SAP HANA, SAS® Scalable Performance Data Server, Teradata

SAS® In-Database Code Accelerator: Hadoop, Pivotal, Teradata

SAS® Data Quality Accelerator: Teradata, Hadoop

Customer Case Study: Collections Management

Phases: Data Exploration, Model Development, Model Deployment

• Score all 40 million records, compared to the limit of 350,000 in the past
• Reduced data movement
• Increased data governance
• Better business results: $1M to $3M extra collections a month

Solution approach: SAS® In-Database Technologies

[Figure callouts: 84 seconds, 40M records, 12 min]

SAS® In-Memory Analytics Offerings

Interfaces: Coding and GUI

Coding example:

PROC hpbnet data = creditdata
    structure = markovblanket;
    model default = x1 LTV income age;
    selection = Y;
RUN;

Offerings shown:
• In-Memory Statistics
• High Performance Analytics
• Visual Data Mining and Machine Learning
• Visual Statistics
• Enterprise Miner & HPA
• Factory Miner
• Text Miner & Contextual Analysis
• Data Loader for Hadoop
• Decision Manager

[Figure labels: In-Memory Analytics, Analytics in Action, Usability]

SAS® In-Memory Analytics: Execution

SAS In-Memory Machine Learning algorithms are designed to run on a single machine (multi-threaded) or on a compute cluster.

[Figure: a data scientist working against distributed data and software on multiple servers.]
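The pattern behind running the same algorithm on one multi-threaded machine or on a cluster is to split the data into partitions, do the work where each partition lives, and combine small partial results. The sketch below shows that pattern for a mean and variance. It is a generic illustration, not SAS code: the function names are hypothetical, the data are made up, and a Python thread pool merely stands in for the workers (it demonstrates the structure, not real CPU parallelism).

```python
from concurrent.futures import ThreadPoolExecutor

def partition(rows, parts):
    """Split rows into roughly equal contiguous partitions."""
    size = -(-len(rows) // parts)  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]

def partial_stats(chunk):
    """Work done where the partition lives: count, sum and sum of squares."""
    return len(chunk), sum(chunk), sum(v * v for v in chunk)

def combine(partials):
    """Merge per-partition results into a global mean and population variance."""
    n = sum(p[0] for p in partials)
    total = sum(p[1] for p in partials)
    squares = sum(p[2] for p in partials)
    mean = total / n
    return mean, squares / n - mean * mean

values = [float(v) for v in range(1, 101)]  # made-up data

# Each worker sees only its own partition; only three numbers per
# partition travel back to be combined.
with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(partial_stats, partition(values, 4)))

mean, variance = combine(partials)
print(mean, variance)
```

Because only the partial results move, the same code shape works whether the partitions are threads on one server or nodes in a cluster.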

SAS® In-Memory Analytics: Single Machine Versus Massively Parallel Processing

Customer Case Study: Customer Behavior Modeling

Standard Data Mining Process

• Final model is based on a single analytical algorithm: neural network (NN)
• 7 training iterations of the neural network take ~5 hours (~1.4 iterations/hour)
• One analyst can generate one model per day
  • Low productivity
  • Low confidence
  • Low model accuracy
• Model lift was 1.6 for top 10%

High-Performance Data Mining

• Final model is based on comparison of several analytical algorithms (NN, SVM, logistic regression, ...)
• 5000 training iterations of the neural network take 70 minutes (~71.4 iterations/minute)
• One analyst can generate many models per day
  • High productivity
  • High confidence
  • High model accuracy
• Model lift improved to 2.5 for top 10%
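The throughput figures in this case study can be checked with simple arithmetic, and the lift figures are easier to read with a definition at hand. The sketch below recomputes the iteration rates from the numbers on the slide and then computes lift for the top 10% on a small made-up example. The slide does not say how lift was calculated; the function uses one common definition (response rate in the top-scored 10% divided by the overall response rate), and the scores and outcomes are invented.

```python
def per_minute(iterations, minutes):
    """Training iterations completed per minute."""
    return iterations / minutes

# Figures quoted on the slide.
standard = per_minute(7, 5 * 60)    # 7 iterations in about 5 hours
high_perf = per_minute(5000, 70)    # 5000 iterations in 70 minutes

print(round(standard * 60, 1))      # iterations per hour, standard process
print(round(high_perf, 1))          # iterations per minute, high-performance
print(round(high_perf / standard))  # approximate throughput ratio

def top_decile_lift(scores, outcomes):
    """Response rate among the top 10% of scores divided by the overall rate."""
    ranked = sorted(zip(scores, outcomes), key=lambda p: p[0], reverse=True)
    cutoff = max(1, len(ranked) // 10)
    top_rate = sum(o for _, o in ranked[:cutoff]) / cutoff
    overall_rate = sum(outcomes) / len(outcomes)
    return top_rate / overall_rate

# Made-up example: 20 customers, 4 responders, 1 of them among the top 2 scores.
scores = [i / 20 for i in range(20, 0, -1)]
outcomes = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
lift = top_decile_lift(scores, outcomes)
print(lift)
```

On the slide's own numbers the high-performance run completes iterations on the order of 3,000 times faster than the standard process, which is what lets one analyst compare several algorithms in a day instead of producing a single model.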

SAS® Viya™

• SAS® Viya™ is a new, open analytic platform built for analytics innovation.
• It is designed for all analytic professionals, regardless of skills or experience.
• It scales for data of any size, speed and complexity.

SAS® Viya™: SAS® Visual Data Mining and Machine Learning

SAS Visual Data Mining and Machine Learning combines data wrangling, data exploration, visualization, feature engineering, and modern statistical, data mining, machine-learning and text analytics techniques, all in a single, scalable in-memory processing environment: SAS Viya.

SAS® Viya™: SAS® Visual Data Mining and Machine Learning

Data
• Multi Threaded Data Step
• DS2
• SQL
• Variable Binning
• Variable Cardinality Analysis
• Sampling and Partitioning
• Missing Value Imputation
• Variable Selection
• Transpose

Discovery
• K-means and K-modes Clustering
• Principal Component Analysis
• Logistic Regression
• Linear Regression
• Generalized Linear Models
• Nonlinear Regression
• Decision Trees
• Random Forest
• Gradient Boosting
• Neural Networks
• Support Vector Machines
• Factorization Machines
• Network Analytics/Community Detection
• Text Mining
• Boolean Rules
• Autotuning

Deployment
• Assess Supervised Models
• Complete Score Code

SAS® Viya™: SAS® Visual Data Mining and Machine Learning

• Hyperparameters
  • Highly data dependent
  • Related to model complexity
• Auto tuning
  • Automates the hyperparameter search and finds the optimal set
  • Maximizes predictability on an independent data set
  • Aims to avoid over-fitting by controlling model complexity
  • Creates more accurate models faster than hand tuning
• SAS auto tuning leverages SAS optimization engines
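The idea of tuning a hyperparameter against an independent data set can be sketched without any SAS software. The example below uses a plain grid search over the neighbor count `k` of a tiny k-nearest-neighbor regressor, scored on a held-out validation set. This is a deliberately simple stand-in: SAS auto tuning uses its own optimization engines rather than a grid, and the model, function names, candidate values and synthetic data here are all assumptions made for illustration.

```python
import random

def knn_predict(train, x, k):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def mse(train, holdout, k):
    """Mean squared error of a k-NN model on data it was not fitted to."""
    errors = [(knn_predict(train, x, k) - y) ** 2 for x, y in holdout]
    return sum(errors) / len(errors)

def tune_k(train, validation, candidates):
    """Grid search: score each candidate k on the validation set, keep the best."""
    scores = {k: mse(train, validation, k) for k in candidates}
    best_k = min(scores, key=scores.get)
    return best_k, scores

# Synthetic, made-up data: a noisy quadratic (not from the presentation).
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(200)]
points = [(x, x * x + rng.gauss(0, 0.05)) for x in xs]
train, validation = points[:140], points[140:]

candidates = [1, 3, 5, 9, 15, 31, 61]
best_k, scores = tune_k(train, validation, candidates)
print(best_k, round(scores[best_k], 4))
```

Here `k` controls model complexity: a very small `k` follows the noise, a very large `k` smooths away the curve. Choosing the value by validation error rather than training error is what guards against over-fitting.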

SAS® Viya™: SAS® Visual Data Mining and Machine Learning

SAS Studio: Web-Based User Interface

SAS Visual Data Mining and Machine Learning on SAS Studio:
https://youtu.be/X0AU4gDUc_Y

SAS® Viya™: Open Access to SAS from Jupyter Notebook

[Figure labels: SAS language, APIs, Other programming languages]

SAS Visual Data Mining and Machine Learning with Python Demo:
https://youtu.be/LXoikPWQJ3o

SAS® Viya™: SAS® Viya™ and SAS® 9

• It’s an AND strategy
• Can co-exist on same hardware (physical or virtual)
• Data, models, and code can be accessed via bridges

SAS® Viya™: More Information