Copyright © 2013, SAS Institute Inc. All rights reserved. This information is confidential and covered under the terms of any SAS agreements as executed by customer and SAS Institute Inc.


OCTOBER 9, 2013

Gary T. Ciampa

SAS® Solutions OnDemand Advanced

Analytics Lab

Denver SAS Users Group

Colorado Day, 2013

BIG DATA, FAST PROCESSING SPEEDS


OVERVIEW AND AGENDA

• Big data introduction

• SAS language performance tuning

• SAS system facilities

• SQL, MACRO and DATA STEP examples

• Case study - SAS Revenue Optimization Solution

• History and tuning techniques

• High Performance Revenue Optimization – GRID environment

• SAS emerging big data technologies


BIG DATA INTRODUCTION

• Wiki Knows All: … is a collection of data sets so large and complex that it

becomes difficult to process using on-hand database management tools or

traditional data processing applications

• Forrester: … software and/or hardware solutions that allow firms to discover,

evaluate, optimize, and deploy predictive models by analyzing big data

sources to improve business performance or mitigate risk.

• Gartner: … technology is the management of high-volume, high-velocity and

high-variety information assets that demand cost-effective and innovative

forms of information processing for enhanced insight and decision making.


… the management of high-volume, high-velocity and high-variety assets that demand

cost-effective and innovative forms of processing for enhanced insight and decision

making


BIG DATA ACCORDING TO SAS

• Incorporates concepts of IDC dimensions

Volume – transactions, streaming, sensors, …

Variety – database, warehouse, text, email, metered, OLAP, stocks, etc…

Velocity – how fast the data is produced and processed (near real-time)

• SAS considers additional dimensions

Variability - in velocity and variety of the data (peaks and valleys, seasonal)

Complexity - handling disparate sources to cleanse, transform, correlate and

establish relationships and hierarchies

• SAS Big Data Starting Point: http://www.sas.com/big-data


APPROACHES TO PROCESSING BIG DATA

• Bigger, Faster, More Powerful is Better

Increase CPU processor speed and count

Increase MEMORY capability or speed

Faster Networks and Network Devices

High-speed disk arrays, or, direct memory disk arrays

• Parallel Processing

Multi-threading capabilities, distributed processing within or across nodes

Segmented data along with distributed processing

• Viable, but not always feasible within constraints (time, resource and

dollars)


SAS SYSTEM FACILITIES

• SAS command line options, AUTOEXEC and CONFIG processing

Customizes the SAS execution environment

Settings can affect performance significantly

Settings may have unexpected or unintended consequences

Set on command line, configuration or within the program

SAS Companion for <OS> (Windows, UNIX, z/OS)

Bonus Options

• VERBOSE option – emits options and configuration details

• RTRACE option – emits list of resources that are read, loaded


SAS SYSTEM AND HOST OPTIONS

• System Options, SAS Files

BUFNO, BUFSIZE, OBS,

IBUFNO, IBUFSIZE (index processing)

• System Administration Memory

MEMSIZE, SORTSIZE, SUMSIZE

• System Administration, Performance

CPUCOUNT, THREADS

• System Options for Macros

MLOGIC, MPRINT, SYMBOLGEN (everyone has their favorites)

• NOTE: Use the *correct* SAS Companion for the target OS
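
Before changing any of the options listed above, it can help to see what the session is currently using. The following is a minimal sketch, assuming a SAS 9.x session; the reported values and defaults differ by host, and the `bufno=10` setting is only an illustrative value, not a recommendation.

```sas
/* Report current values of the file, memory, and threading options. */
%put BUFNO=%sysfunc(getoption(bufno)) BUFSIZE=%sysfunc(getoption(bufsize));
%put MEMSIZE=%sysfunc(getoption(memsize)) SORTSIZE=%sysfunc(getoption(sortsize)) SUMSIZE=%sysfunc(getoption(sumsize));
%put CPUCOUNT=%sysfunc(getoption(cpucount)) THREADS=%sysfunc(getoption(threads));

/* List every option in the PERFORMANCE group, with descriptions. */
proc options group=performance;
run;

/* BUFNO can be changed within the session; MEMSIZE is set at invocation */
/* (command line or configuration file).                                  */
options bufno=10 fullstimer;   /* illustrative value only */
```

Comparing FULLSTIMER output before and after a change is one way to judge whether a setting helped on a given host.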


SAS SYSTEM FACILITIES

• SAS option STIMER or FULLSTIMER

System performance statistics, CPU, memory, real and elapsed time

Subtle differences depending on the OS

• SAS option MSGLEVEL – level of detail for messages to SAS log

• SAS option OBS – last observation or record to process

• ARM and PERF macro facility

Default or custom performance metrics at the programmer's discretion

PROC or DATA STEP statistics

User controlled START and STOP semantics across segments of SAS code

Discrete log and format to include macros to process and report on metrics


SAMPLE OPTIONS STATEMENTS & LOG

options obs=max fullstimer;

data work.sort500k;

set sgf2013.sort_500000;

run;

NOTE: DATA statement used (Total process time):
      real time           1.66 seconds
      user cpu time       0.12 seconds
      system cpu time     0.34 seconds
      memory              356.15k
      OS Memory           10424.00k
      Timestamp           04/25/2013 03:16:21 PM

options obs=10;

data work.sort500k;

set sgf2013.sort_500000;

run;

NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      user cpu time       0.00 seconds
      system cpu time     0.03 seconds


SAMPLE ARM / PERF MACRO EXECUTION

%let _armexec=1;

%perfinit(applname="Glm_Appl_1");

%perfstrt(txnname="Glm_Txn1");

/* … do some work … */

%perfstop;

%perfstrt(txnname="Glm_Txn2");

ods exclude all;

proc GLM data=one; model y = x1; by by; quit;

ods select all;

%perfstop;


SAMPLE ARM / PERF MACRO EXECUTION

…lines deleted…

G,1682537590.504000,2,2,Glm_Txn1, CPU ,IO_CNT ,MEMORY INFO ,THREAD

S,1682537590.426000,2,1,1,1.060806,1.341608,327491731,7266304,7532544,6,6

P,1682537590.504000,2,1,1,1.123207,1.357208,0,335645285,7266304,7532544,6,6

…lines deleted…

G,1682537590.504000,2,2,Glm_Txn2, CPU ,IO_CNT ,MEMORY INFO ,THREAD

S,1682537590.504000,2,2,2,1.123207,1.357208,335674088,7266304,7532544,6,6

P,1682537591.845000,2,2,2,1.653610,1.575610,0,340575257,11984896,11984896,6,6

SAS 9.3 Interface to Application Response Measurement (http://support.sas.com)
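
As a rough illustration of reading these records, the elapsed time of a transaction can be taken as the difference between its stop and start timestamps. This sketch assumes that the second field of each record is a timestamp in seconds and that the S and P records mark the start and stop of the transaction; the two values are copied from the Glm_Txn2 records above.

```sas
data _null_;
  /* Timestamps copied from the Glm_Txn2 S (start) and P (stop) records. */
  start_ts = 1682537590.504;
  stop_ts  = 1682537591.845;

  /* Elapsed wall-clock time in seconds, under the assumptions stated above. */
  elapsed = stop_ts - start_ts;
  put 'Glm_Txn2 elapsed seconds: ' elapsed 8.3;
run;
```

Under those assumptions the PROC GLM transaction took a little over 1.3 seconds.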


OVERVIEW - ENVIRONMENT AND INTRODUCTION

• Sample Environment

• RHEL Linux 5.6, Intel Xeon 2.67 GHz, 32 Cores, 256 MB; SAS 9.3

• Oracle Table, 44 columns, 10 million records

• SAS Language Reference (cost, benefit and considerations)

• Understanding SAS Indexes

• Understanding Integrity Constraints

• Use EXISTS (0:04.6) rather than IN (0:05.2).

• For example,

select * from table_a a

where exists (select * from orders o

where a.prod_id=o.prod_id);


INDEXES - USING INDEXES FOR PERFORMANCE OPTIMIZATION

• INDEX Considerations (TANSTAAFL)

• Data file size, small tables would be suitable for sequential processing

• Change rate of the data and use key variables, NAME versus GENDER

• Generally used where sub-setting the data, 25% or less is typical

• Sort by key variables, ordered data improves index behavior

• Some operators, conditions are not optimized with an INDEX

• Arithmetic, variable-to-variable, sounds-like operator

• CONTAINS, IS NULL or IS MISSING, TRIM, SUBSTR*

• where amount ne 0;   0:28.0 (minutes:seconds.tenths)

• where amount > 0;   0:26.0 (minutes:seconds.tenths)
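
The considerations above can be tried out with a small experiment. This is a minimal sketch, not the test used for the timings on this slide: the table `work.orders` and its variables are hypothetical, and whether SAS chooses the index depends on the data and the WHERE expression.

```sas
/* Hypothetical table with a selective key (many distinct values). */
data work.orders;
  do order_id = 1 to 100000;
    prod_id = mod(order_id, 5000);
    amount  = mod(order_id, 7) - 1;
    output;
  end;
run;

/* Create a simple index on the variable used for subsetting. */
proc datasets library=work nolist;
  modify orders;
  index create prod_id;
quit;

/* MSGLEVEL=I asks SAS to report index usage in the log. */
options msglevel=i;

data work.subset;
  set work.orders;
  where prod_id = 42;   /* equality on an indexed variable can be optimized */
run;
```

With MSGLEVEL=I in effect, the log indicates whether the index was selected for the WHERE clause, which makes it easier to spot expressions that fall back to a sequential read.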


PROC SQL - OPTIMIZING PROC SQL

• HAVING versus WHERE

• HAVING operates on all rows returned, not a subset

• Use HAVING on summary operations, after a restricted WHERE step

• Order statements, filter or select rows before grouping

• select state
    from order
    group by state
    having state = 'nc';

• 01:50

• select state
    from order
    where state = 'nc'
    group by state;

• 01:31
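
To show the two clauses working together, the sketch below filters rows with WHERE before grouping and then applies HAVING only to the summary value. The table `work.orders` and its columns are hypothetical and exist only to make the example self-contained.

```sas
/* Hypothetical order data. */
data work.orders;
  input state $ region $ amount;
  datalines;
nc east 10
nc east 20
va east 5
ca west 7
;
run;

proc sql;
  select state,
         count(*)    as n_orders,
         sum(amount) as total_amount
    from work.orders
    where region = 'east'      /* restrict rows before grouping */
    group by state
    having count(*) > 1;       /* test the summary after grouping */
quit;
```

Here the WHERE clause discards the west region before any grouping work is done, and HAVING is reserved for the condition that can only be evaluated on the grouped result.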


PROC SQL - OPTIMIZING PROC SQL

• Nested (sub-)queries

• Minimize nested queries with a small number of tables

• SUBQUERY versus JOIN

• select ename
    from employees emp
    where exists (select price from prices
                  where prod_id = emp.prod_id and prices.class='j');

• >05:00 minutes (terminated with prejudice)

• select ename
    from prices pr, employees emp
    where pr.prod_id=emp.prod_id and pr.class='j';

• 01:40 seconds


PROC SQL - OPTIMIZING PROC SQL

• TABLE order

• Order of tables within the SQL statement impacts performance

• List the tables in descending order of row count, left to right, so that the smallest table comes last

• SQL processing scans the last table listed, and merges all of the rows

• Assuming TAB1 has 20,000 rows, TAB2 has 10 rows

• select count (*) from tab2, tab1

• 0.61

• select count (*) from tab1, tab2

• 0.52


PROC SQL - OPTIMIZING PROC SQL

• EXISTS versus DISTINCT for table join

• select distinct date,name

from sales s, employee emp

where s.prod_id=emp.prod_id;

• > 7 minutes

• select date, name
    from sales s
    where exists (select 'x' from employee emp
                  where emp.prod_id = s.prod_id);

• 0:11 seconds (including post distinct step)

• SAS 9.3 SQL Procedure User's Guide


SAS MACRO - OPTIONS AND CONSIDERATIONS

• Use MLOGIC, MPRINT & SYMBOLGEN – development phase

• Do NOT use MLOGIC, MPRINT & SYMBOLGEN – production

• Stored Compiled Macro Facility

• Permanent SAS catalog

• Protect intellectual property

• Both AUTOCALL and SESSION macros are available

• Override compiled macros with session instances or AUTOCALL semantics

• Minimize nesting macro definitions
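
A minimal sketch of the Stored Compiled Macro Facility follows. The libref `mymacs`, the directory created under WORK, and the macro `hello` are all hypothetical; in practice the library would point at a permanent location. The DCREATE call returns a blank if the directory already exists, so the sketch is meant to be run once per session.

```sas
/* Create a hypothetical directory under WORK to hold the macro catalog. */
%let macdir = %sysfunc(dcreate(mymacs,%sysfunc(pathname(work))));
libname mymacs "&macdir";

/* Enable stored compiled macros and name the library that holds them. */
options mstored sasmstore=mymacs;

/* The STORE option compiles the macro into the catalog mymacs.sasmacr. */
%macro hello(name) / store des='Hypothetical stored macro';
  %put Hello, &name;
%mend hello;

%hello(DSUG)
```

Because only the compiled form is stored unless the SOURCE option is also given, this is one way to distribute macros without distributing their source.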


SAS MACRO - NESTING MACRO INSTANCE

• Avoid nesting macros where possible

• %macro m1;

%macro m2; /* nested macro */

%mend m2;

%mend m1;

• 02.81

• %macro m1;

<macro 1 code goes here>

%mend m1;

%macro m2;

<macro 2 code goes here>

%mend m2;

• 02.45


SAS DATA STEP - A FEW EXAMPLES TO CONSIDER

• Missing values may perturb performance

• “.” is propagated across all calculations

• total=t4+(x*b)+c*(abc);

• 01:03 (63 seconds)

• total=(x*b)+c*(abc) + t4;

• 00:59

• Superior practice, check for “.” before expression

• if <operand> ne . then do; <expression>; end;


SAS DATA STEP - A FEW EXAMPLES TO CONSIDER

• PROC FORMAT: User defined formats associated with variables

• Details in the Base SAS 9.3 Procedures Guide

• Reference the format throughout the code, simplifies logic and support

• if educ = 0 then neweduc="< 3 yrs old";

else if educ=1 then neweduc="no school";

else if educ=2 then neweduc="nursery school";

• 10:54

• proc format; value educf
    0="< 3 yrs old" 1="no school" 2="nursery school"; run;

… neweduc=put(educ,educf.); …

• 10:32


SAS DATA STEP - A FEW EXAMPLES TO CONSIDER

• Using the IN operator, versus OR conditions

• OR operator checks all the conditions

• IN operator matches first occurrence

• if x=8 or x=9 or x=23 or x=45 then do; end;

• 01:04

• if x in (8,9,23,45) then do; end;

• 00:58


SAS USER FEEDBACK: “IN” VERSUS “OR” VALIDATION

• Thanks to Bruce Gilsen at Federal Reserve for independent validation

• Bruce’s Optimization Validation

• 1,000,000 OBS, 100 VARIABLES with RANGE VALUES 1 to 100

• Independent DATA STEP, using IN versus OR

• IN 8.15 / 7.88 Seconds (REAL / CPU)

• OR 21.75 / 21.73 Seconds (REAL / CPU)

data two;

set one;

array vall (*) v1-v100;

drop i;

do i = 1 to 100;

if vall(i) in (1 2 3 4 5 6 7 8 9 10 … 99)

then vall(i) = vall(i) + 100; end; run;

data two;

set one;

array vall (*) v1-v100;

drop i;

do i = 1 to 100;

if vall(i)= 1 or vall(i) = 2 or vall(i) = 3 or vall(i) = 4

… vall(i) = 99 then vall(i) = vall(i) + 1000; end; run;


SAS AND PROC SQL - USING SAS VIEWS

• DATA step and PROC SQL views may provide an alternative to upfront cost by deferring data retrieval until the view is accessed

• DATA step views contain a stored DATA step program and will only work on the same platform

on which the view was initially created

• PROC SQL view contains a stored SQL query

• Picking the right view type depends on the needs of the situation

• DATA step views have the full power of DATA step programming

• However, PROC SQL views can update data the view references; DATA step views cannot

• While explicit data files use additional disk space, views save space at the cost of

processing time, creating the data “on demand” only when needed

• Since view data retrieval occurs only when it is referenced, it may do so even outside the confines of traditional step boundaries, which might sometimes result in a performance increase

• However, view data retrieval occurs every time the view is referenced, so it may hurt

performance if views are referenced repeatedly
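
The two view types can be sketched as follows. All table, view, and variable names here are hypothetical, and the tiny generated table only stands in for a large source; the sketch shows the mechanics, not the performance effect.

```sas
/* Hypothetical source table. */
data work.big;
  do id = 1000 to 1 by -1;
    amount = mod(id, 10);
    output;
  end;
run;

/* DATA step view: stores the program and reads work.big only when referenced. */
data work.v_step / view=work.v_step;
  set work.big;
  where amount > 5;
run;

/* PROC SQL view: stores the query instead of the rows. */
proc sql;
  create view work.v_sql as
    select id, amount
      from work.big
      where amount > 5;
quit;

/* The subset is produced here, as the sort reads through the view. */
proc sort data=work.v_step out=work.sorted;
  by id;
run;
```

In this pattern the subset is never written to disk as an intermediate data file; the rows flow from the view directly into the sort, which is the kind of WHERE-then-sort scenario where a view may pay off.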


EXAMPLE - DATA STEP VIEWS

• Input data set for the test scenarios:

• 86.9 million observations across 51 variables

• Total size: 34.1 GB

• Test scenarios (data file vs view, 5 iterations for each, with average times)

• A WHERE statement resulting in 10 million observations, followed by a sort

• A WHERE statement resulting in 71 million observations, followed by a sort

• Not every scenario may benefit from a view, but they are an additional tool

that may yield performance increases under some circumstances

Result Size    Data files    Views     Speedup
10 million     165.8s        144.6s    12.8%
71 million     1110.4s       628.0s    43.4%
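
The speedup figures are consistent with the reduction in elapsed time relative to the data file run, that is (data file time - view time) / data file time. The sketch below recomputes them from the averages reported above; the data set and variable names are hypothetical.

```sas
/* Average elapsed times (seconds) taken from the results above. */
data work.speedup;
  input result_size :$12. datafile_sec view_sec;
  /* Percentage reduction in elapsed time when using a view. */
  speedup_pct = 100 * (datafile_sec - view_sec) / datafile_sec;
  format speedup_pct 5.1;
  datalines;
10_million 165.8 144.6
71_million 1110.4 628.0
;
run;

proc print data=work.speedup noobs;
run;
```

The computed percentages come out to 12.8 and 43.4, matching the reported values.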


CASE STUDY - SAS REVENUE OPTIMIZATION SOLUTION

• Big Data Introduction

• SAS Language Performance Tuning

• SAS System Facilities

• SQL, MACRO and DATA STEP examples

• Case Study - SAS Revenue Optimization Solution

• History and Tuning Techniques

• High Performance Revenue Optimization – GRID Environment

• SAS Emerging Big Data Technologies


SOLUTIONS ONDEMAND ADVANCED ANALYTICS LAB

• Over a petabyte of data, 400+ customers

• Customer Profiles

Variety of industry sectors, private as well as public

Multi-tier deployments, client, mid-tier, analytic tier and RDBMS

Daily and Weekly ETL feed requirements

• PROD, QA, DEV environments and data synchronization

• Disparate analytic processing (batch) schedules

• Backup and restore processing that minimizes performance impacts

• 99.5% up time service level agreements


CASE STUDY - SAS REVENUE OPTIMIZATION SOLUTION

• Problem Statement: 33 hours of processing time for one batch component using 30% of projected data. Linear projection: approximately 110 hours, or about 4 ½ days, of processing time.

• Requirement to fit batch into a 40 hour window

• AIX 6.1+, Power7, 64 Bit attached to EMC SAN Arrays

• 7 CPUS, SMT=4, 128GB RAM, 3700 IOPS, CPU 45%

• Approximately 1.2 TB of DATA, target 1.6 TB primary warehouse

• Focus on the most significant issues and then repeat as new issues arise


• SAS WORK volume

• Eight-way stripe with eight paths

• Warehouse

• Fixed Tier 1 EMC storage; 80 x 100GB disk arrays

• Moved support directories off of volume


Weekly Performance

• Parallel Executions: 16 processes to 54 processes

• IO/SEC: 8.5K to 15.3K

• CPU Idle Time: 42% to 13%

• Weekly Batch Time: 60 hours to 43 hours

• GEO_PRODS: 67 Million to 92 Million


SAS GRID - SAS REVENUE OPTIMIZATION SOLUTION

• Initial RO Versions used SAS/Connect parallel processing

• Single host deployments with concurrent analytics

• Flat data warehouse structure, non-partitioned SAS tables

• SAS High Performance Revenue Optimization Enhancements

• SAS TK GRID architecture distributed processing across grid nodes

• SAS data partitions distributed across grid nodes

• ETL processes, daily and weekly to distribute data across partitions

• Grid Captain to manage the processing and analytics across grid nodes


SAS GRID - SAS REVENUE OPTIMIZATION NON GRID


SAS GRID - SAS HIGH PERFORMANCE REVENUE OPTIMIZATION


SAS GRID & EMERGING TECHNOLOGIES

• SAS Grid Manager: distributed SAS processing

Scheduling, Workload Balancing, High Availability & Management

• SAS In-Database: queries, aggregations, analytics within DBMS

9.2M3: DB2, EDW & Oracle; 9.3 Netezza

• HADOOP

Scalable, fault-tolerant, distributed file system

SAS integration includes access, analysis and management

• SAS In Memory Analytics

Distributed, descriptive, inferential to visualization analytics

• Visual Analytics and Visual Analytics HPA


SAS TECHNICAL SUPPORT


SAS BIG-DATA HOME PAGE


SUMMARY CONSIDERATIONS – PERFORMANCE IMPROVEMENT IS

A CONTINUAL PROCESS

Focus on the most severe hotspots within the SAS program and operating environment

Use INDEX where appropriate

Exploit SAS OPTIONS tuning

Consider SAS Grid Products

Evaluate SAS Visual Analytics

and Visual Analytics HPA

www.SAS.com


SAS SOLUTIONS ONDEMAND

ADVANCED ANALYTICS LAB

[email protected]