Optimizing Your Analytics Life Cycle with SAS & Teradata Paul Segal - Teradata June 2017

Page 1: Optimizing Your Analytics Life Cycle with SAS & Teradata

Optimizing Your Analytics Life Cycle with SAS & Teradata

Paul Segal - Teradata, June 2017

Page 2: Optimizing Your Analytics Life Cycle with SAS & Teradata

Agenda

• The Analytic Life Cycle
• Common Problems
• SAS & Teradata solutions

Page 3: Optimizing Your Analytics Life Cycle with SAS & Teradata

Analytical Life Cycle

• Preparation: Prepare Data for Analytics
• Exploration: Explore All Your Data
• Development: Build Analytic Models
• Deployment: Deliver Results to Business

Page 4: Optimizing Your Analytics Life Cycle with SAS & Teradata

Analytical Life Cycle

Stage 1: PREPARATION
• combining data from numerous sources
• handling inconsistent or non-standardized data
• cleaning dirty data
• integrating data that was manually entered
• dealing with semi-structured and structured data

Stage 2: EXPLORATION
• what the data looks like
• what variables are in the data set
• whether there are any missing observations
• how the data are related
• what some of the data patterns are

Stage 3: DEVELOPMENT
• customer retention
• customer attrition/churn
• marketing response
• consumer loyalty and offers
• fraud detection
• credit scoring
• risk management

Stage 4: DEPLOYMENT
• the probability of responding to a particular promotional offer
• the risk of an applicant defaulting on a loan
• the propensity to pay off a debt
• the likelihood that a customer will leave/churn
• the probability of buying a product
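The exploration questions listed for this life cycle (which variables exist, whether observations are missing, how the data are related) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows, the variable names and the helper functions `profile` and `pearson` are hypothetical and do not come from the deck or from any SAS or Teradata product.

```python
from math import sqrt

# Hypothetical sample rows; None marks a missing observation.
rows = [
    {"age": 34, "income": 52000, "region": "east"},
    {"age": 45, "income": None, "region": "west"},
    {"age": 29, "income": 41000, "region": None},
    {"age": 52, "income": 88000, "region": "east"},
]

def profile(rows):
    """Return, per variable, the count of missing and distinct values."""
    variables = sorted({name for row in rows for name in row})
    summary = {}
    for name in variables:
        values = [row.get(name) for row in rows]
        present = [v for v in values if v is not None]
        summary[name] = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
        }
    return summary

def pearson(rows, x, y):
    """Pearson correlation over rows where both variables are present."""
    pairs = [(r[x], r[y]) for r in rows
             if r.get(x) is not None and r.get(y) is not None]
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    var_x = sum((a - mean_x) ** 2 for a, _ in pairs)
    var_y = sum((b - mean_y) ** 2 for _, b in pairs)
    return cov / sqrt(var_x * var_y)

summary = profile(rows)                 # what variables, how many missing
corr = pearson(rows, "age", "income")   # how two variables are related
```

On a real warehouse table the same questions would normally be answered by summary procedures running against the full data rather than by a client-side loop; the sketch only shows what is being asked.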

Page 5: Optimizing Your Analytics Life Cycle with SAS & Teradata

Common Problems

Page 6: Optimizing Your Analytics Life Cycle with SAS & Teradata

Typical Customer Challenges

• “My analytical process runs too slowly*”
• “We spend too much time moving data around between systems”
• “I want to make more/better use of my EDW/data platform”
• “It takes forever to extract and score the data”
• “There is too much data for us to analyse”
• “We can’t buy new hardware”
• “The quality of our analytical models is lower because we sample; I worry we are missing valuable segments”

* scoring / analysis / data quality / data transformation
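The sampling concern in the last quote can be made concrete with a small calculation. The sketch below is an illustration, not something taken from the deck: it assumes a simple random sample drawn without replacement, and the population, segment and sample sizes are hypothetical.

```python
def prob_sample_misses_segment(population, segment, sample):
    """Probability that a simple random sample (without replacement)
    of `sample` rows contains no row from a segment of `segment` rows."""
    if sample > population - segment:
        return 0.0  # the sample is too large to avoid the segment
    p = 1.0
    for i in range(segment):
        # Chance that the i-th segment row is also left out of the sample,
        # given that the previous i segment rows were left out.
        p *= (population - sample - i) / (population - i)
    return p

# Hypothetical numbers: 1,000,000 customers, a valuable segment of 50,
# and a 1% sample of 10,000 rows.
p_miss = prob_sample_misses_segment(1_000_000, 50, 10_000)
```

With these assumed numbers `p_miss` comes out at roughly 0.6, so a 1% sample would more often than not contain no customer from the segment at all, which is the risk the quote describes.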

Page 7: Optimizing Your Analytics Life Cycle with SAS & Teradata

The Analytic Data Warehouse

[Diagram labels: ADW, EDW, Line of Business, Program Manager, Finance, Supply Chain, I.T., HR]

Page 8: Optimizing Your Analytics Life Cycle with SAS & Teradata

Data Management: The crux of the issue facing your analysts

BUSINESS PROBLEM

BUSINESS DECISION

• 80%: Preparing to solve the problem
• 20%: Solving the problem

Page 9: Optimizing Your Analytics Life Cycle with SAS & Teradata

Data Management: SAS & Teradata working to change the Equation

BUSINESS PROBLEM

BUSINESS DECISION

• 20%: Preparing to solve the problem
• 80%: Solving the problem

Page 10: Optimizing Your Analytics Life Cycle with SAS & Teradata

Barriers to the Adoption of Analytics

• Scarcity of analytical skills: the need to grow analytical talent from within
• Disjointed, inefficient workflow: how can you fail fast and learn to refine quickly?
• Tools that aren’t right for the job: learning curve to create, share and collaborate

Page 11: Optimizing Your Analytics Life Cycle with SAS & Teradata

SAS & Teradata Solutions

Page 12: Optimizing Your Analytics Life Cycle with SAS & Teradata

Analytical Life Cycle

Preparation
• ACCESS to Teradata
• Code Accelerator
• Data Quality Accelerator
• Data Set Builder for SAS

Exploration
• Visual Analytics & Visual Statistics
• Teradata Appliance for SAS

Development
• Analytics Accelerator for Teradata
• SAS High-Performance Analytics Products
• TD Appliance for SAS

Deployment
• SAS Scoring Accelerator
• SAS Model Manager
• Teradata Appliance for SAS

[Figure: sample data model with CUSTOMER, ORDER, ITEM, ORDER ITEM BACKORDERED and ORDER ITEM SHIPPED entities]

Page 13: Optimizing Your Analytics Life Cycle with SAS & Teradata

SAS Analytics in Teradata

DEPLOYMENT FLEXIBILITY: DESKTOP - SERVER - IN-DATABASE - IN-MEMORY - CLOUD

ARCHITECTURE FLEXIBILITY: SMP - MPP - HADOOP - GRID - ESP

• Information Management
• Data Mining / Anomaly Detection
• Text Analytics
• Predictive Analytics
• High Performance Analytics
• Fraud & Risk Detection
• Reporting & Visualization

Integrated End-to-End Foundation

Page 14: Optimizing Your Analytics Life Cycle with SAS & Teradata

In-Database Functionality

SAS/ACCESS to Teradata:
• PROC APPEND
• PROC CONTENTS
• PROC COPY
• PROC DATASETS
• PROC DELETE
• PROC FORMAT
• PROC FREQ
• PROC MEANS
• PROC PRINT
• PROC RANK
• PROC REPORT
• PROC SORT
• PROC SQL
• PROC SUMMARY
• PROC TABULATE

SAS Analytics Accelerator for Teradata

Statistical Analysis Procedures:
• PROC CANCORR
• PROC CORR
• PROC FACTOR
• PROC PRINCOMP
• PROC REG
• PROC SCORE
• PROC TIMESERIES
• PROC VARCLUS
• PROC SCORE works with coefficients from: PROC ACECLUS, PROC CALIS, PROC CANDISC, PROC DISCRIM, PROC FACTOR, PROC PRINCOMP, PROC TCALIS, PROC VARCLUS, PROC ORTHOREG, PROC QUANTREG, PROC REG, PROC ROBUSTREG

SAS Enterprise Miner:
• PROC DMDB
• PROC DMINE
• PROC DMREG (Logistic Regression)
• Also nodes for Input, Sample, Partition, Filter, Merge, Expand

SAS Code Accelerator for Teradata
• PROC DS2

SAS Scoring Accelerator for Teradata
• EM/STAT* Models

DQ Accelerator for Teradata
• Match code
• Parsing/Casing
• Gender/Pattern/Identification analysis
• Standardization
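The idea behind in-database processing (let the database engine summarize the data and return only the result) can be sketched in Python. This is an analogy under stated assumptions, not SAS or Teradata code: an in-memory SQLite database stands in for the warehouse, and the `orders` table and its columns are hypothetical.

```python
import sqlite3

# An in-memory SQLite database stands in for the data warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (customer_id INTEGER, status TEXT, quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "shipped", 5), (1, "backordered", 2), (2, "shipped", 7), (3, "shipped", 1)],
)

# Extract-then-summarize: every row is pulled to the client first.
all_rows = conn.execute("SELECT status, quantity FROM orders").fetchall()
client_side = {}
for status, quantity in all_rows:
    count, total = client_side.get(status, (0, 0))
    client_side[status] = (count + 1, total + quantity)

# In-database: the engine aggregates and returns one row per group.
in_database = {
    status: (count, total)
    for status, count, total in conn.execute(
        "SELECT status, COUNT(*), SUM(quantity) FROM orders GROUP BY status"
    )
}
conn.close()
```

Both paths give the same frequencies and sums, but the second moves one row per group instead of every row. On four rows the difference is invisible; on warehouse-scale tables it is the data movement the deck is concerned with.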

Page 15: Optimizing Your Analytics Life Cycle with SAS & Teradata

In-Database Example: Data Quality

• SAS Data Quality functions ported to operate in-database
• Invoked as SQL Stored Procedures
• Processing leverages parallelism of underlying RDBMS
• Significant performance/throughput benefits
• Supported functions:
  • Matchcode generation
  • Parsing
  • Standardization
  • Casing
  • Pattern analysis
  • Identification analysis
  • Gender analysis
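To give a feel for what casing, standardization and matchcode generation do, here is a deliberately simplified Python sketch. It is not the SAS Data Quality algorithm and does not use any SAS API: the abbreviation table and the functions `proper_case`, `standardize_address` and `simple_matchcode` are hypothetical toy versions written for illustration.

```python
import re

# Hypothetical abbreviation table; a real knowledge base is far larger.
STANDARD_FORMS = {
    "st": "Street", "st.": "Street",
    "ave": "Avenue", "ave.": "Avenue",
    "rd": "Road", "rd.": "Road",
}

def proper_case(text):
    """Casing: collapse whitespace and capitalize each word."""
    return " ".join(word.capitalize() for word in text.split())

def standardize_address(text):
    """Standardization: expand known abbreviations to one standard form."""
    words = [STANDARD_FORMS.get(w.lower(), w) for w in text.split()]
    return proper_case(" ".join(words))

def simple_matchcode(name):
    """Toy matchcode: uppercase, keep letters only, drop vowels after the
    first letter and collapse repeated letters, so that near-duplicate
    spellings share one key."""
    letters = re.sub(r"[^A-Z]", "", name.upper())
    if not letters:
        return ""
    key = letters[0] + re.sub(r"[AEIOUY]", "", letters[1:])
    return re.sub(r"(.)\1+", r"\1", key)

clean_name = proper_case("jOHN   smith")
clean_addr = standardize_address("12 main st.")
same_key = simple_matchcode("Jon Smith") == simple_matchcode("Jonn  Smyth")
```

Records whose names produce the same key become candidates for de-duplication. In the in-database setting described above, logic of this kind is invoked as stored procedures and runs in parallel next to the data rather than row by row on a client.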

Page 16: Optimizing Your Analytics Life Cycle with SAS & Teradata

Exploration and modeling: what does it bring?

• Influence
• Relationships
• Contributions

Data-driven exploration

Eliminating guesswork with predictive analytics

Increased understanding

Page 17: Optimizing Your Analytics Life Cycle with SAS & Teradata

SAS Visual Analytics & Visual Statistics

SAS Visual Analytics
• Data exploration and discovery
• Distribution and summary statistics
• Post-model analysis and reporting

SAS Visual Statistics
• Build prediction and classification models
• Refine candidate models
• Compare models and generate score code

Page 18: Optimizing Your Analytics Life Cycle with SAS & Teradata

SAS High-Performance Analytics

• Analyze 100% of data
• More/New variables
• More model iterations
• Manage complex models
• More models (per domain area)
• More questions/ideas/scenarios to evaluate
• Multiple deployment options: batch, real-time
• Continuously monitor model effectiveness and retrain

Page 19: Optimizing Your Analytics Life Cycle with SAS & Teradata

SAS High-Performance Analytics Procedures

SAS High-Performance Statistics
• HPCANDISC
• HPFMM
• HPPLS
• HPPRINCOMP
• HPQUANTSELECT
• HPLOGISTIC
• HPREG
• HPLMIXED
• HPNLIN
• HPSPLIT
• HPGENSELECT

SAS High-Performance Econometrics
• HPCDM
• HPCOPULA
• HPPANEL
• HPCOUNTREG
• HPSEVERITY
• HPQLIM

SAS High-Performance Optimization
• OPTLSO
• Select features in OPTGRAPH, OPTMILP, OPTLP, OPTMODEL

SAS High-Performance Data Mining
• HPBNET
• HPCLUS
• HPSVM
• HPTSDR
• HPREDUCE
• HPNEURAL
• HPFOREST
• HP4SCORE
• HPDECIDE

SAS High-Performance Text Mining
• HPTMINE
• HPTMSCORE

SAS High-Performance Forecasting
• HPFORECAST

The above offerings contain these standard PROCs: HPDS2, HPDMDB, HPSAMPLE, HPSUMMARY, HPIMPUTE, HPBIN, HPCORR

Page 20: Optimizing Your Analytics Life Cycle with SAS & Teradata


HPA Example

SAS Analyst’s Desktop

Traditional SAS Server

SAS HPA Environment

Page 21: Optimizing Your Analytics Life Cycle with SAS & Teradata

Platforms for Optimal SAS Performance

Teradata Active Data Warehouses
Enables SAS in-database analytics to clean and prepare data as well as deploy and score models without having to move the data.

Teradata Appliance for SAS
Supports SAS in-memory analytics tools used to visualize data and build data models, giving them direct access to data within the Active Data Warehouse.

Teradata Appliance for Hadoop
Provides a cost-effective, preconfigured data storage and staging area that makes data available for SAS analytics.
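Scoring models without moving the data, as described for these platforms, can be illustrated with a small Python sketch. It is an analogy rather than the SAS Scoring Accelerator: SQLite stands in for the warehouse, and the logistic-regression coefficients, the `customers` table and the `response_score` function are all hypothetical.

```python
import math
import sqlite3

# Hypothetical coefficients, as if exported from a trained logistic model.
INTERCEPT = -3.0
COEFFICIENTS = {"recency_days": -0.01, "orders_last_year": 0.4}

def response_score(recency_days, orders_last_year):
    """Logistic score: estimated probability of responding to an offer."""
    z = (INTERCEPT
         + COEFFICIENTS["recency_days"] * recency_days
         + COEFFICIENTS["orders_last_year"] * orders_last_year)
    return 1.0 / (1.0 + math.exp(-z))

conn = sqlite3.connect(":memory:")  # SQLite stands in for the warehouse
# Publish the model as a function that can be called from SQL.
conn.create_function("response_score", 2, response_score)
conn.execute(
    "CREATE TABLE customers "
    "(customer_id INTEGER, recency_days REAL, orders_last_year REAL)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, 10, 12), (2, 300, 1), (3, 45, 6)],
)

# The query applies the model to each row; only ids and scores come back.
scores = dict(conn.execute(
    "SELECT customer_id, response_score(recency_days, orders_last_year) "
    "FROM customers"
))
conn.close()
```

The design point is that the model is registered once as a function and then applied by a query, so the input columns never have to be extracted to a separate scoring server. In SQLite the function still runs inside the Python process, so this shows the pattern only, not the parallelism a warehouse would provide.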

Page 22: Optimizing Your Analytics Life Cycle with SAS & Teradata

A Fully Contained SAS Ecosystem in One Box

Teradata Appliance for SAS: Teradata 2800 Appliance with SAS Analytics

A fully contained SAS Analytics ecosystem in a single box from Teradata that integrates the SAS server directly into the cabinet with your data

[Diagram labels: Data Persistent; SAS Managed Server from Teradata]

Page 23: Optimizing Your Analytics Life Cycle with SAS & Teradata

Customer Success Story: Large US Retailer

Leveraging SAS and Teradata for advanced analytics to analyze customer data and loyalty programs with in-database and in-memory technologies

Issues
• Projects involving high-end analytics were placing more and more demand on IT and Business Analytics teams
• Needed to streamline the analytics workflow
• Needed a solution that provided scalability and data consistency
• Process could not support additional analytics requests without adding headcount

Solutions
• Replaced desktop tools with SAS server-based HPA tools such as Enterprise Miner, Scoring Accelerator, Model Manager, VA & Data Mining
• Added Teradata Appliance for SAS for production and development of data models
• Expanded EDW
• Added Data Labs Environment to

Impact
• Improved the speed and performance of analytics
• Streamlined data preparation, evaluation and testing
• Added analytics capacity without adding headcount
• Freed up IT time used for data collection, loading and prep

Page 24: Optimizing Your Analytics Life Cycle with SAS & Teradata

Summary

• Minimize the need to move the data
• Faster modeling times (months/weeks to hours/minutes)
• Improve data quality, availability and consistency
• Work with entire data sets, including enabling an end-to-end view of data from across the enterprise
• Free up staff to focus more time on value-adding activities

Page 25: Optimizing Your Analytics Life Cycle with SAS & Teradata


• Data modeling development with SAS Enterprise Miner and Teradata

• High Performance Analytics with Teradata

• Scoring data in Teradata with SAS Scoring Accelerator

• Demo – Modeling and Deployment