Big Data Hands-On Labs:

Date       Time                 Location
Tuesday    3:45pm – 4:45pm      Hotel Nikko - Peninsula
Wednesday  1:15pm – 2:15pm      Hotel Nikko - Peninsula
Thursday   11:30am – 12:30pm    Hotel Nikko - Peninsula

Or download: Big Data Lite Virtual Machine

Copyright © 2014 Oracle and/or its affiliates. All rights reserved.

Page 2: Big Data  Hands-On  Labs:

Oracle Big Data Appliance for Customers and Partners

Jean-Pierre Dijcks, Oracle, Big Data Product Management

Paul Kent, SAS, VP Big Data

Page 3: Big Data  Hands-On  Labs:

Oracle Big Data Appliance for Customers and Partners

1. Big Data Appliance Recap
2. Why You Should Consider Big Data Appliance
3. Driving Business Value with SAS on Big Data Appliance
4. Q&A

Page 4: Big Data  Hands-On  Labs:

Oracle Big Data Management System

SOURCES

Oracle Database

Oracle Industry Models

Oracle Advanced Analytics

Oracle Spatial & Graph

Big Data Appliance

Cloudera Hadoop

Oracle NoSQL Database

Oracle R Advanced Analytics for Hadoop

Oracle R Distribution

Oracle Database

Oracle Advanced Security

Oracle Advanced Analytics

Oracle Spatial & Graph

Oracle Exadata

Oracle Big Data Connectors

Oracle Data Integrator

Oracle Big Data SQL

Page 5: Big Data  Hands-On  Labs:

Recap: Big Data Appliance Overview

Big Data Appliance X4-2

Sun Oracle X4-2L Servers with per server:
• 2 * 8 Core Intel Xeon E5 Processors
• 64 GB Memory
• 48TB Disk space

Integrated Software:
• Oracle Linux, Oracle Java VM
• Oracle Big Data SQL*
• Cloudera Distribution of Apache Hadoop – EDH Edition
• Cloudera Manager
• Oracle R Distribution
• Oracle NoSQL Database

* Oracle Big Data SQL is separately licensed

Page 6: Big Data  Hands-On  Labs:

Recap: Standard and Modular

Starter Rack is fully cabled and configured for growth, with 6 servers

In-Rack Expansion delivers a 6-server modular expansion block

Full Rack delivers an optimal blend of capacity and expansion options

Grow by adding racks – up to 18 racks without additional switches
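
As a rough illustration of what the modular steps mean in aggregate, the per-server figures from the X4-2 overview (2 * 8 cores, 64 GB memory, 48TB disk) can be multiplied out per configuration. This is only a sketch: the 18-server size used for the full rack is an assumption not stated on this slide, the totals are raw figures before any HDFS replication, and the names below are hypothetical.

```python
# Per-server figures taken from the X4-2 overview slide.
CORES_PER_SERVER = 2 * 8
MEMORY_GB_PER_SERVER = 64
DISK_TB_PER_SERVER = 48


def rack_totals(servers):
    """Return raw aggregate resources for a configuration with `servers` nodes."""
    return {
        "servers": servers,
        "cores": servers * CORES_PER_SERVER,
        "memory_gb": servers * MEMORY_GB_PER_SERVER,
        "raw_disk_tb": servers * DISK_TB_PER_SERVER,
    }


# Starter rack (6), starter plus one in-rack expansion (12),
# and a full rack (assumed here to hold 18 servers).
configs = {
    name: rack_totals(count)
    for name, count in [("starter", 6), ("starter + expansion", 12), ("full", 18)]
}

for name, totals in configs.items():
    print(name, totals)
```

Usable capacity would be lower than the raw disk figure, since HDFS stores multiple copies of each block.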

Page 7: Big Data  Hands-On  Labs:

Recap: Harness Rapid Evolution

BDA 4.0 – Sept 2014
• Big Data SQL
• Node Migration

BDA 3.x – April 2014
• CDH 5.0 (MR2 & YARN)
• AAA Security
• Encryption

BDA 2.x – April 2013
• Starter Rack
• In-Rack Expansion
• EM Integration

BDA 1.0 – Jan 2012
• Initial BDA
• Mammoth Install

Page 8: Big Data  Hands-On  Labs:

Operational Simplicity Simplify Access to ALL Data

Core Design Principles for Big Data Appliance

Page 9: Big Data  Hands-On  Labs:

Operational Simplicity Simplify Access to ALL Data

• Oracle Big Data SQL – Oracle SQL on ALL your data
– All Native Oracle SQL Operators
– Smart Scan for Optimized Performance

• Oracle Security – Govern all Data through a Single Set of Security Policies

Core Design Principles for Big Data Appliance

Page 10: Big Data  Hands-On  Labs:

Oracle Big Data SQL – A New Architecture

• Powerful, high-performance SQL on Hadoop
– Full Oracle SQL capabilities on Hadoop
– SQL query processing local to Hadoop nodes

• Simple data integration of Hadoop and Oracle Database
– Single SQL point-of-entry to access all data
– Scalable joins between Hadoop and RDBMS data

• Optimized hardware
– Balanced Configurations
– No bottlenecks

Page 11: Big Data  Hands-On  Labs:

Big Data SQL

SELECT w.sess_id, c.name
FROM web_logs w, customers c
WHERE w.source_country = 'Brazil'
AND w.cust_id = c.customer_id;

Relevant SQL runs on BDA nodes

10s of Gigabytes of Data

Only columns and rows needed to answer query are returned

Hadoop Cluster

Big Data SQL

Oracle Database

CUSTOMERS

WEB_LOGS

Page 12: Big Data  Hands-On  Labs:

SQL Push Down in Big Data SQL

• Hadoop Scans on Unstructured Data
• WHERE Clause Evaluation
• Column Projection
• Bloom Filters for Better Join Performance
• JSON Parsing, Data Mining Model Evaluation
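
The push-down steps listed above can be pictured with a small Python sketch. It is not Oracle's implementation: the sample rows, the function names, and the use of an exact key set in place of a real Bloom filter are all assumptions made for illustration. The point is only that the WHERE clause, the column projection, and the join-key filter are applied where the data lives, so that few rows and columns travel to the database for the final join.

```python
# Hypothetical sample data: WEB_LOGS lives on the Hadoop side,
# CUSTOMERS lives in the database.
WEB_LOGS = [
    {"sess_id": 1, "cust_id": 10, "source_country": "Brazil", "url": "/a"},
    {"sess_id": 2, "cust_id": 11, "source_country": "Chile", "url": "/b"},
    {"sess_id": 3, "cust_id": 12, "source_country": "Brazil", "url": "/c"},
    {"sess_id": 4, "cust_id": 99, "source_country": "Brazil", "url": "/d"},
]
CUSTOMERS = [
    {"customer_id": 10, "name": "Ana"},
    {"customer_id": 11, "name": "Luis"},
    {"customer_id": 12, "name": "Rui"},
]


def scan_on_storage_node(rows, predicate, columns, join_keys):
    """Filter and project rows locally, yielding only what the query needs."""
    for row in rows:
        if not predicate(row):
            continue  # WHERE clause evaluated next to the data
        if row["cust_id"] not in join_keys:
            continue  # exact key set standing in for a Bloom filter
        yield {col: row[col] for col in columns}  # column projection


def run_query():
    """Join the shipped rows with CUSTOMERS on the database side."""
    join_keys = {c["customer_id"] for c in CUSTOMERS}
    shipped = list(
        scan_on_storage_node(
            WEB_LOGS,
            predicate=lambda r: r["source_country"] == "Brazil",
            columns=("sess_id", "cust_id"),
            join_keys=join_keys,
        )
    )
    names = {c["customer_id"]: c["name"] for c in CUSTOMERS}
    return [(r["sess_id"], names[r["cust_id"]]) for r in shipped]


result = run_query()
print(result)
```

A real Bloom filter can let a few non-matching keys through, so the database-side join still has to discard them; the exact set used here hides that detail.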

Page 13: Big Data  Hands-On  Labs:

Feedback Loop

Data Management

Big Data Platform (Hadoop/NoSQL)

Relational Data Warehouse (OCDM)

Analytic Apps

Customer Experience

Operations

Monetization

Adapters

ETL/ELT Adapters

Real-Time Adapters

Third Party Data Sources

Oracle Comms Apps (BSS/OSS)

Oracle Comms Ntwk Products (Tekelec & Acme)

Other Oracle Apps (CRM, ERP, etc.)

Third Party Sources

Oracle Communications Data Model Reference Architecture

To Other Apps

Page 15: Big Data  Hands-On  Labs:

• No Bottlenecks
• Full Stack Install and Upgrades
• Simplified Management
– Cluster Growth
– Critical Node Migration
• Always Highly Available
• Always Secure
• Very Competitive Price Point

Operational Simplicity Simplify Access to ALL Data

Core Design Principles for Big Data Appliance

Page 16: Big Data  Hands-On  Labs:

Successful Big Data Systems Grow
From Cluster Install with HA to Large Clusters to Dealing with Operational Issues

Day 1
• 12 node BDA for Production
• Hadoop HA and Security Set-up
• Ready to Load Data

RCK_1

Full install with a single command:

./mammoth -i rck_1

Page 17: Big Data  Hands-On  Labs:

Successful Big Data Systems Grow
From Cluster Install with HA to Large Clusters to Dealing with Operational Issues

NN Example Service: Hadoop Name Nodes

Day 1

RCK_1

Page 18: Big Data  Hands-On  Labs:

Successful Big Data Systems Grow
From Cluster Install with HA to Large Clusters to Dealing with Operational Issues

RCK_1 RCK_2

Day 90
Add 12 New Nodes across two Racks

Cluster expansion with a single command:

mammoth -e newhost1,…,newhostn

Page 19: Big Data  Hands-On  Labs:

Successful Big Data Systems Grow
From Cluster Install with HA to Large Clusters to Dealing with Operational Issues

RCK_1 RCK_2

This expansion automatically optimizes HA setup across multiple racks

Cluster Expansion with a single command:

mammoth -e newhost1,…,newhostn

Because of uniform nodes and IB networking, no data is moved

Page 20: Big Data  Hands-On  Labs:

Successful Big Data Systems Grow
From Cluster Install with HA to Large Clusters to Dealing with Operational Issues

RCK_1 RCK_2

Day n
Critical Node Failure => Primary Name Node

Page 21: Big Data  Hands-On  Labs:

Successful Big Data Systems Grow
From Cluster Install with HA to Large Clusters to Dealing with Operational Issues

RCK_1 RCK_2

• Automatic Failover to other NameNode

• Automatic Service Request to Oracle for HW Failure

Page 22: Big Data  Hands-On  Labs:

Successful Big Data Systems Grow
From Cluster Install with HA to Large Clusters to Dealing with Operational Issues

RCK_1 RCK_2

• Restore HA with a Single Command:

bdacli admin_cluster migrate N1

• Reinstate the Repaired Node with a Single Command:

bdacli admin_cluster reprovision N1

Page 23: Big Data  Hands-On  Labs:

Operational Simplicity

Core Design Principles for Big Data Appliance

30% Quicker to Deploy

21% Cheaper to Buy

“Oracle Big Data Appliance is an excellent choice for customers looking to work with the full suite of Cloudera’s leading Hadoop-based technology. It’s more cost-effective and quicker to deploy than a DIY cluster.”

– Mike Olson, Cloudera founder, Chief Strategy Officer, and Chairman of the Board

Page 24: Big Data  Hands-On  Labs:

Real-time access to better data means better insights, which means better decisions and better business results

Integrate data associated with customer telemetry, configurations, service history, diagnostics, knowledge & support information

Big Data Initiative @ Oracle Global Support Services

Anticipate Detect Predict Automate Delight

Page 25: Big Data  Hands-On  Labs:

Operational Simplicity

Simplify Access to ALL Data

Core Design Principles Enable Success

Page 26: Big Data  Hands-On  Labs:

There is one more thing…

Business Value = Applications

Page 27: Big Data  Hands-On  Labs:

Big Data Appliance powers instant Business Value

Customer Experience Management

Cyber Security Solutions

Communications Data Model

Page 28: Big Data  Hands-On  Labs:

Introducing

Paul Kent - SAS

Page 29: Big Data  Hands-On  Labs:

Copyright © 2014, SAS Institute Inc. All rights reserved.

Big Data and Big Analytics – So Much more Gunpowder!

Paul Kent
VP BigData, SAS Research and Development

Page 30: Big Data  Hands-On  Labs:

1. Change 2. Safari Pics

Page 31: Big Data  Hands-On  Labs:

[CON8279] Oracle Big Data Appliance: Deep Dive and Roadmap for Customers and Partners

Oracle Big Data Appliance is the premier Hadoop appliance in the market. This session describes the roadmap for customers in the areas of high-performance SQL on Hadoop and securing big data, plus overall performance improvements for Hadoop.

A special focus in the session is the roadmap and benefits Oracle Big Data Appliance brings to Oracle partners.

To illustrate the benefits of running on a standardized and optimized Hadoop platform, SAS presents the findings of its tests of SAS In-Memory Analytics on Oracle Big Data Appliance.

Page 32: Big Data  Hands-On  Labs:

Agenda

1. SAS & Oracle Partnership
2. Family Stories
   1. Hadoop
   2. Oracle Engineered Systems Family
   3. SAS Software Family
3. Deployment Patterns

Page 33: Big Data  Hands-On  Labs:

Reflection on a stronger partnership than ever

Both leaders in Big Data –

Jointly solving the most difficult and demanding Big Data Problems

Providing simplicity and agility to create flexible configurations

Extensive engineering collaboration

Can we answer:

How Does it Work?

How Does it Perform?

2014

Page 34: Big Data  Hands-On  Labs:

THE TAMOXIFEN DILEMMA

SOURCE: http://commons.wikimedia.org/wiki/File:Tamoxifen-3D-vdW.png

Page 37: Big Data  Hands-On  Labs:

Elephant :: 3 Good Ideas !!

1. Never forgets
2. Is a good (hard) worker
3. Is a Social Animal (teamwork)

Page 38: Big Data  Hands-On  Labs:

MPP (Massively Parallel) hardware running database-like software

“data” is stored in parts, across multiple worker nodes

“work” operates in parallel, on the different parts of the table

Controller Worker Nodes

Hadoop – Simplified View

Page 39: Big Data  Hands-On  Labs:

Head Node Data 1 Data 2 Data 3 Data 4…

MYFILE.TXT

..block1 -> block1

..block2 -> block2

..block3 -> block3

Idea #1 - HDFS. Never forgets!

Page 40: Big Data  Hands-On  Labs:

Head Node Data 1 Data 2 Data 3 Data 4…

MYFILE.TXT

..block1 -> block1 block1 copy2

..block2 -> block2 block2 copy2

..block3 -> block3 copy2 block3

Idea #1 - HDFS. Never forgets!

Page 41: Big Data  Hands-On  Labs:

Head Node Data 1 Data 2 Data 3 Data 4…

MYFILE.TXT

..block1 -> block1 block1 copy2

..block2 -> block2 block2 copy2

..block3 -> block3 copy2 block3 (X, node lost)

Idea #1 - HDFS. Never forgets!

Page 42: Big Data  Hands-On  Labs:

Redundancy Wins!
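
The HDFS slides above can be mimicked in a few lines of Python: each block of MYFILE.TXT is written to more than one data node, so losing a node does not lose the file. This is a sketch under simplifying assumptions. The round-robin placement and the function names are invented for illustration and are not how HDFS actually chooses nodes, and the two copies per block follow the slide's drawing, whereas HDFS itself defaults to three replicas.

```python
def place_blocks(blocks, nodes, replicas=2):
    """Assign each block to `replicas` distinct nodes, round-robin."""
    placement = {}
    for i, block in enumerate(blocks):
        placement[block] = [nodes[(i + k) % len(nodes)] for k in range(replicas)]
    return placement


def readable_blocks(placement, failed):
    """Return the blocks that still have at least one copy on a live node."""
    return {
        block
        for block, holders in placement.items()
        if any(node not in failed for node in holders)
    }


nodes = ["data1", "data2", "data3", "data4"]
blocks = ["block1", "block2", "block3"]

placement = place_blocks(blocks, nodes)
survivors = readable_blocks(placement, failed={"data3"})
print(placement)
print(sorted(survivors))
```

With a single copy per block (`replicas=1`), the same failure would make block3 unreadable, which is the contrast the slide is drawing.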

Page 43: Big Data  Hands-On  Labs:

Idea #2 – MapReduce – Send the work to the Data

We Want the Youngest Person in the Room

Each Row in the audience is a data node

I’ll be the coordinator

• From outside to center, accumulate MIN
• Sweep from back to front.
• Youngest Advances
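
The audience exercise translates almost directly into code. In the sketch below, each inner list plays the part of one row of seats (a data node), the map step finds the youngest person within a row, and the reduce step lets the coordinator keep the smaller of two candidates. The ages and function names are made up for illustration; this is plain Python, not the Hadoop MapReduce API.

```python
from functools import reduce

# Hypothetical ages; each inner list is one audience row, i.e. one data node.
audience = [
    [34, 51, 29, 47],
    [42, 38, 61],
    [27, 33, 45, 58],
]


def map_row(row):
    """Map step: each row finds its own youngest member locally."""
    return min(row)


def reduce_min(a, b):
    """Reduce step: the coordinator keeps the smaller of two candidates."""
    return a if a <= b else b


# Only one number per row travels to the coordinator, not the whole row.
row_minimums = [map_row(row) for row in audience]
youngest = reduce(reduce_min, row_minimums)
print(row_minimums, youngest)
```

The work (finding a minimum) is sent to where the data sits, and only the small per-row results move, which is the idea the slide title names.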

Page 47: Big Data  Hands-On  Labs:

Diversity. It’s a good thing!

Impala Nyala

Page 49: Big Data  Hands-On  Labs:

4 Important Things

#1 Join the Family

Page 50: Big Data  Hands-On  Labs:

HADOOP

Hive QL

SAS SERVER

SAS ACCESS to Hadoop

#2 Be Familiar

Page 51: Big Data  Hands-On  Labs:

SAS / High Performance Analytics

HADOOP

SAS HPA Procedures

SAS SERVER

#3 Use the Cluster!

Page 52: Big Data  Hands-On  Labs:

Prepare Explore / Transform Model

• HPDS2

• HPDMDB

• HPSAMPLE

• HPSUMMARY

• HPCORR

• HPREDUCE

• HPIMPUTE

• HPBIN

• HPLOGISTIC

• HPREG

• HPNEURAL

• HPNLIN

• HPCOUNTREG

• HPMIXED

• HPSEVERITY

• HPFOREST

• HPSVM

• HPDECIDE

• HPQLIM

SAS / High Performance Analytics

• HPLSO

• HPSPLIT

• HPTMINE

• HPTMSCORE

Page 53: Big Data  Hands-On  Labs:

Controller

Client

SAS / High Performance Analytics

Page 55: Big Data  Hands-On  Labs:

#1 Join the Family

#2 Be Familiar

#3 Use the cluster

#4 Have a pretty face!

Page 58: Big Data  Hands-On  Labs:

4 Important Things (for cluster friendly software)

1. Join the Family
2. Be Familiar
3. Performance
4. Have a pretty face

Page 60: Big Data  Hands-On  Labs:

SAS BIG DATA ON BIG DATA APPLIANCE

• Flexible Architectural options for SAS deployments
• Can run on Starter, Half and Full configurations
• Optionally select nodes “N, N-1, N-2, …” for additional SAS Services such as SAS Compute Tier, SAS MidTier
• Optionally select node subset “N, N-1, N-2, N-3, …” for more dedicated resources for SAS Analytic Compute Environment by shifting Big Data Appliance roles
• Option to selectively add more memory on a per node basis depending on specific workload distribution

Page 61: Big Data  Hands-On  Labs:

SAS Midtier

STARTER BDA

SAS Visual Analytics

Metadata Server

SAS Compute

SAS HPA Root Node

SAS VISUAL ANALYTICS, HIGH-PERFORMANCE ANALYTIC COMPUTE ENVIRONMENT CO-LOCATED WITH HADOOP

Page 62: Big Data  Hands-On  Labs:

STARTER BDA

Consider: Extra Memory for 5,6?

Page 63: Big Data  Hands-On  Labs:

SAS Midtier

FULL RACK BDA

Metadata Server

SAS Compute

SAS HPA Root Node

LASR Worker 17, HDFS Data 17

LASR Worker 18, HDFS Data 18

SAS VISUAL ANALYTICS, HIGH-PERFORMANCE ANALYTIC COMPUTE ENVIRONMENT CO-LOCATED WITH HADOOP

Page 64: Big Data  Hands-On  Labs:

FULL RACK BDA ASSEMBLED IN OSC, SYDNEY AUSTRALIA

Page 67: Big Data  Hands-On  Labs:

FULL RACK BDA ASSEMBLED IN OSC, SYDNEY AUSTRALIA

Basic Smoke Tests Confirmed:

• Interoperate with Hadoop and Map Reduce
• Read and Write text files to/from HDFS
• Read and Write Tabular files to/from Hive (will confirm Oracle BIGSQL in OSC-SC)
• Read and Write SAS binary format files to/from HDFS
• High Degree Of Parallelism (DOP) reads via Map-Only jobs
• SAS LASR server co-exists on/with datanodes
• SAS HPA tasks scheduled on datanodes

Page 68: Big Data  Hands-On  Labs:

SAS High-Performance Analytics Performance
SAS Format Data (SASHDAT)

Table 1: Summation of 5/20/100/200 columns; Baseline: DOP=1 (no parallelism)
120M rows, 400 columns, reg_simtbl_400

                1107 var         1107 var
                11.795 M obs     73.744 M obs
                97GB             608GB (6x)
                5.7GB/node       35.7GB/node      Time ratio
Create          208.79 sec       2284.29 sec      11
Scan/Count      24.60 sec        259.38 sec       10.5
HPCORR          295.20           1410.40          4.7
HPCNTREG        336.79           1547.59          4.6
HPREDUCE (u)    236.55           2467.76          10.4
HPREDUCE (s)    219.50           2037.74          9.3
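
The last column of the table above can be reproduced from the two timing columns. The short Python check below divides the large-run time by the small-run time for each step and compares it with the growth in data volume (608GB against 97GB). The timings are copied from the slide; the variable names are hypothetical and the script draws no conclusions beyond the arithmetic.

```python
# Timings in seconds, copied from the table: (97GB run, 608GB run).
timings = {
    "Create": (208.79, 2284.29),
    "Scan/Count": (24.60, 259.38),
    "HPCORR": (295.20, 1410.40),
    "HPCNTREG": (336.79, 1547.59),
    "HPREDUCE (u)": (236.55, 2467.76),
    "HPREDUCE (s)": (219.50, 2037.74),
}

# Growth in data volume between the two runs.
data_growth = 608 / 97

# How much longer each step took on the larger dataset.
slowdown = {name: large / small for name, (small, large) in timings.items()}

for name, ratio in slowdown.items():
    print(f"{name:13s} {ratio:5.1f}x the time on {data_growth:.1f}x the data")
```

On these figures HPCORR and HPCNTREG take less than five times as long on roughly six times the data, while Create, Scan/Count and HPREDUCE take about ten times as long.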

Page 69: Big Data  Hands-On  Labs:

OSC-AU FullRack BDA

• 408 Threads

• 600 GB dataset

• 17 servers

Your Problem solved ASAP

Page 73: Big Data  Hands-On  Labs:

EXADATA INTEGRATION

SAS EMBEDDED PROCESSING (EP) TO EXADATA LEVERAGING BIG DATA SQL

SAS Midtier

LASR Worker 18

…HDFS Data 18

SAS Visual Analytics

Metadata Server

SAS Compute

SAS HPA Root Node

SAS EP

Big Data SQL

Page 74: Big Data  Hands-On  Labs:

SAS High-Performance Analytics Performance
SAS EP Parallel Data Feeders

Table 1: Summation of 5/20/100/200 columns; Baseline: DOP=1 (no parallelism)
120M rows, 400 columns, reg_simtbl_400

            DOP=1      DOP=24     DOP=24 (flash cache)
Add(5)      1.25min    1.5min     .5min
Add(20)     2.5min     1.5min     .5min
Add(100)    13min      1.5min     .6min
Add(200)    16min      ~2min      1.25min (10x)

Page 75: Big Data  Hands-On  Labs:

SAS High-Performance Analytics Performance
SAS EP Parallel Data Feeders

Table 2: Scan times for 2 tables (200 columns, 400 columns, 120M rows); Baseline: SAS/ACCESS vs. HPA EP feeder

               Access     Access/DBSlice    SAS HPA Using EP
Reg_sim_200    1:01:12    0:28:37           0:08:00
Reg_sim_400    1:49:11    0:55:33           0:16:05 (7x!)
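
The "7x" annotation can be checked by converting the H:MM:SS scan times to seconds and dividing the SAS/ACCESS time by the HPA EP time. The helper below is a small sketch written for this check; the function and variable names are hypothetical, and the times are copied from Table 2.

```python
def to_seconds(hms):
    """Convert an H:MM:SS string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Scan times from Table 2: (Access, Access/DBSlice, SAS HPA using EP).
scan_times = {
    "Reg_sim_200": ("1:01:12", "0:28:37", "0:08:00"),
    "Reg_sim_400": ("1:49:11", "0:55:33", "0:16:05"),
}

# Speedup of the EP feeder over plain SAS/ACCESS for each table.
speedup = {
    table: to_seconds(access) / to_seconds(ep)
    for table, (access, _dbslice, ep) in scan_times.items()
}

for table, factor in speedup.items():
    print(f"{table}: {factor:.1f}x faster with the EP feeder")
```

Both tables come out at roughly seven to eight times faster, consistent with the annotation on the 400-column row.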

Page 76: Big Data  Hands-On  Labs:

SAS High-Performance Analytics Performance
SAS Format Data (SASHDAT) and Oracle EXADATA

Table 1: Summation of 5/20/100/200 columns; Baseline: DOP=1 (no parallelism)
120M rows, 400 columns, reg_simtbl_400

                SASHDAT          EXADATA          SASHDAT
                1107 var         907 var          1107 var
                11.795 M obs     11.795 M obs     73.744 M obs
                97GB             79.7GB           608GB
                5.7GB/node       4.7GB/node       35.7GB/node
Create          208.79 sec       931.22 sec       2284.29 sec
Scan/Count      24.60 sec        956.16 sec       259.38 sec
HPCORR          295.20           833.24           1410.40
HPCNTREG        336.79           756.97           1547.59
HPREDUCE (u)    236.55           1055.11          2467.76
HPREDUCE (s)    219.50           1051.93          2037.74

Page 77: Big Data  Hands-On  Labs:

ORACLE ENGINEERED SYSTEMS FOR

• SuperCluster
• ExaData
• ExaLogic
• Virtual Compute Appliance
• Big Data Appliance
• Database Backup, Recovery, Logging Appliance
• ZFS Storage Appliance

Page 78: Big Data  Hands-On  Labs:

SAS AND ORACLE WORKING TOGETHER TO CREATE CUSTOMER VALUE

• Joint R & D development and Product Management teams in Cary and Redwood Shores

• Focus on driving SAS technology components to run natively in Oracle database

• Joint performance engineering optimizations

• Template physical architectures developed based on use-cases

• Physically tested and benchmarked together

• Reduction in physical effort

• Overall reduction in lifecycle costs

• Best Practice papers

• SAS and Oracle Engineers provide joint "Sizing and Architecture Analysis and Design"

Page 79: Big Data  Hands-On  Labs:

SAS AND ORACLE
BETTER TOGETHER

Paul.Kent @ sas.com

@hornpolish

paulmkent
