
© 2017 IBM Corporation

Data Warehouse appliances: IBM Pure Data for Analytics

Fabio Bresciani, Cloud & Cognitive, IBM Italia

May 2017

1911 Computing-Tabulating-Recording (CTR)
1924 International Business Machines
1927 Italy
1935 Training courses for women
1944 First machine to handle long calculations automatically
1956 Data storage industry creation
1961 The Selectric Typewriter
1962 First computer-driven airline reservation system
1969 Magnetic strips on credit cards
1969 IBM technology guided Apollo mission to the moon
1971 Floppy disk
1973 UPC bar codes
1981 The IBM PC
1986 IBM scientists won the Nobel Prize
1997 Supercomputer defeated the best chess player
1997 IBM “eBusiness”
2011 IBM Watson

What does IBM do?

• Systems
• Analytics
• Healthcare
• Consulting Services
• Research
• IBM Technical and Infrastructure Services
• Cloud
• Internet of Things
• Commerce
• Security

IBM Research

• $5.7B in R&D (6% of revenue)
• 13 research centers on 6 continents, including the Zurich center led by the Italian Alessandro Curioni
• 5 Nobel Prizes
• The leading company in patents for 24 consecutive years
• 8,088 U.S. patents in 2016
• 8,500 master inventors in 43 countries
• Focused on strategic areas: Cloud Computing, Analytics, Security, Cognitive Computing, Healthcare

IBM Research centers: Tokyo, Beijing/Shanghai, Melbourne, Delhi/Bengaluru, Nairobi, Haifa, Zurich, Dublin, New York, São Paulo/Rio de Janeiro, Almaden, Austin, Johannesburg


Traditional Data Warehouses are just too complex

• Too complex an infrastructure
• Too complicated to deploy
• Too much tuning required
• Too inefficient at analytics
• Too many people needed to maintain
• Too costly to operate
• Too long to get answers

They do NOT meet the demands of advanced analytics on big data.


Big Data Floods Traditional Database Systems


Let’s Simplify This Mess


And Bring Analytics In To The Warehouse


Legacy RDBMS

Create Table - Logical Model

CREATE TABLE CRRADMIN.OT_ORDER_EVENTS

(

TRADE_DATE DATE NOT NULL,

ORIGIN_SYS_CD VARCHAR2(32 BYTE) NOT NULL,

ORIGIN_SYS_EVENT_SEQ VARCHAR2(32 BYTE) NOT NULL,

EVENT_ID NUMBER(9) NOT NULL,

EVENT_CLASS_CD VARCHAR2(32 BYTE) NOT NULL,

EVENT_DATETIME DATE NOT NULL,

ORIGIN_SYS_REF VARCHAR2(32 BYTE) NOT NULL,

ORIGIN_SYS_PARENT_REF VARCHAR2(32 BYTE),

ORIGIN_SYS_ORDER_REF VARCHAR2(32 BYTE),

ORIGIN_SYS_RELATED_REF VARCHAR2(32 BYTE),

ORIGIN_SYS_GROUP_REF VARCHAR2(32 BYTE),

ORIGIN_SYS_DATETIME DATE NOT NULL,

TRADE_ID NUMBER(9),

BASKET_ID NUMBER(9),

ORDER_ID NUMBER(9),

BASKET_NAME VARCHAR2(32 BYTE),

SQC_SQN VARCHAR2(20 BYTE),

EXECFAC_ID NUMBER(9),

CUSTOMER_REF VARCHAR2(255 BYTE),

INSTRUMENT_ID NUMBER(9),

SYMBOL VARCHAR2(64 BYTE),

…

);

Netezza Simplicity

Allocate Space

TABLESPACE "OTR_DATA" LOCAL
(PARTITION BY RANGE (TRADE_DATE) (
PARTITION P20070102 VALUES LESS THAN (20070102)
PCTFREE 10 INITRANS 2 MAXTRANS 255
STORAGE(INITIAL 262144 NEXT 262144 MINEXTENTS 1
MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL
DEFAULT, PCTFREE 10 INITRANS 2 MAXTRANS 255
…

Create Indexes

CREATE INDEX OTOE_EVENT_ID

ON

CRRADMIN.OT_ORDER_EVENTS(EVENT_ID)

TABLESPACE OTR_IDX

NOLOGGING

PCTFREE 10

INITRANS 2

MAXTRANS 255

STORAGE(BUFFER_POOL DEFAULT)

NOPARALLEL

NOCOMPRESS

/

CREATE INDEX OTOE_TRADE_ID

ON

CRRADMIN.OT_ORDER_EVENTS(TRADE_ID)

TABLESPACE OTR_IDX

NOLOGGING

PCTFREE 10

INITRANS 2

MAXTRANS 255

Netezza DDL

Create Table - Logical Model

CREATE TABLE CRRADMIN.OT_ORDER_EVENTS

(

TRADE_DATE DATE NOT NULL,

ORIGIN_SYS_CD VARCHAR (32) NOT NULL,

ORIGIN_SYS_EVENT_SEQ VARCHAR (32) NOT NULL,

EVENT_ID INTEGER NOT NULL,

EVENT_CLASS_CD VARCHAR (32) NOT NULL,

EVENT_DATETIME TIMESTAMP NOT NULL,

ORIGIN_SYS_REF VARCHAR (32) NOT NULL,

ORIGIN_SYS_PARENT_REF VARCHAR (32),

ORIGIN_SYS_ORDER_REF VARCHAR (32),

ORIGIN_SYS_RELATED_REF VARCHAR (32),

ORIGIN_SYS_GROUP_REF VARCHAR (32),

ORIGIN_SYS_DATETIME TIMESTAMP NOT NULL,

TRADE_ID INTEGER ,

BASKET_ID INTEGER ,

ORDER_ID INTEGER ,

BASKET_NAME VARCHAR (32),

SQC_SQN VARCHAR (20),

EXECFAC_ID INTEGER ,

CUSTOMER_REF VARCHAR (255),

INSTRUMENT_ID INTEGER ,

SYMBOL VARCHAR (64),

…

)

DISTRIBUTE ON (ORIGIN_SYS_REF);

• Logical model only
• No indexes
• No physical tuning/admin
• Distribute data by columns or round robin


IBM PureData System for Analytics: The Simple Data Warehouse Appliance for Serious Analytics

What makes it different?

Speed - 10-100x faster than traditional custom systems

Simplicity - minimal administration and tuning

Scalability - petabyte+ scale user data capacity

Smart - high performance, advanced analytics

Purpose-built analytics appliance

Integrated database, server and storage

Standard interfaces

Low total cost of ownership


Massively Parallel Processing (MPP) Architecture: “Divide and conquer”

“Shared Nothing” concept

Divides the work into smaller tasks

• A big task is sliced vertically into a series of smaller tasks
• The smaller tasks run independently
• The work is automatically balanced among the tasks to minimize the time to complete
• Each task is assigned the same amount of physical resources
• Communication between tasks occurs only at the beginning and end of the task

Benefits

• A large task completes in a short elapsed time
• Maximizes use of resources

Points of Attention

• Complexity of administration and management
• Communication bottlenecks
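The divide-and-conquer idea can be sketched in a few lines of Python. This is only an illustration of the structure (split, work independently, combine), not how the appliance is implemented; the function names and the use of threads are assumptions made for the example, and CPython threads do not give real CPU parallelism.

```python
from concurrent.futures import ThreadPoolExecutor


def split_evenly(rows, n_workers):
    """Slice the input into n_workers chunks of (almost) equal size."""
    return [rows[i::n_workers] for i in range(n_workers)]


def partial_sum(chunk):
    """Work done independently by one worker on its own slice."""
    return sum(chunk)


def parallel_sum(rows, n_workers=4):
    # Communication at the beginning: hand each worker its slice.
    chunks = split_evenly(rows, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum, chunks))
    # Communication at the end: combine the partial results.
    return sum(partials)


total = parallel_sum(list(range(1, 1001)))
print(total)
```

The elapsed time of such a scheme is set by the slowest worker, which is why balanced slices matter.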


Data Warehouse Workload: Fewer requests, lots of data manipulation

[Diagram: transactional system used for BI. A request passes from the CPU to general-purpose storage.]


Data Warehouse Workload: Transaction systems are inefficient for data shuffling

[Diagram: transactional system used for BI. Request and results travel between the CPU and general-purpose storage.]


IBM Pure Data System Data Warehouse Blades: Designed for Tera-scale Business Intelligence

[Diagram: Asymmetric Massively Parallel Processing. Request and results pass between the CPU and intelligent storage.]


IBM Pure Data System Data Warehouse Blades: Highly efficient data movement

[Diagram: Asymmetric Massively Parallel Processing. Request and results pass between the CPU and intelligent storage, with 1% of network traffic and 2% of CPU requirements.]


Asymmetric Massively Parallel Processing™

[Architecture diagram: Netezza Appliance]

• Clients and source systems: high performance loader, 3rd party apps, DBA CLI, ETL server (SOLARIS, LINUX, HP-UX, AIX, WINDOWS, TRU64)
• Interfaces: ODBC 3.X, JDBC Type 4, OLE-DB, SQL/92
• SMP Host (DBOS front end): SQL compiler, query plan, optimize, admin, execution engine, high-speed loader/unloader
• Network fabric
• Massively parallel intelligent storage: units numbered 1, 2, 3 … 920
• S-Blades (processor & streaming DB logic) running the high-performance database engine: streaming joins, aggregations, sorts


[Same architecture diagram, Netezza TwinFin Appliance: SQL arrives at the SMP host and is passed to the S-Blades as snippets 1 2 3.]


S-Blade Data Stream Processing

Query:

select DISTRICT, PRODUCTGRP, sum(NRX)
from MTHLY_RX_TERR_DATA
where MONTH = '20091201'
and MARKET = 509123
and SPECIALTY = 'GASTRO'

Processing stages:

• FPGA Core, Uncompress: slice of table MTHLY_RX_TERR_DATA (compressed)
• FPGA Core, Project: select DISTRICT, PRODUCTGRP, sum(NRX)
• FPGA Core, Restrict, Visibility: where MONTH = '20091201' and MARKET = 509123 and SPECIALTY = 'GASTRO'
• CPU Core, Complex ∑ (joins, aggs, etc.): sum(NRX)
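The streaming stages can be mimicked with Python generators. This is a rough sketch under stated assumptions: the sample rows are invented, zlib and JSON stand in for the appliance's compression format, and the sketch restricts before it projects so that the filter columns are still available. It says nothing about how the FPGA actually orders or implements these steps.

```python
import json
import zlib

# Hypothetical rows standing in for one slice of MTHLY_RX_TERR_DATA.
ROWS = [
    {"DISTRICT": "D1", "PRODUCTGRP": "P1", "NRX": 10,
     "MONTH": "20091201", "MARKET": 509123, "SPECIALTY": "GASTRO"},
    {"DISTRICT": "D1", "PRODUCTGRP": "P1", "NRX": 5,
     "MONTH": "20091201", "MARKET": 509123, "SPECIALTY": "GASTRO"},
    {"DISTRICT": "D2", "PRODUCTGRP": "P1", "NRX": 7,
     "MONTH": "20091101", "MARKET": 509123, "SPECIALTY": "GASTRO"},
    {"DISTRICT": "D2", "PRODUCTGRP": "P2", "NRX": 3,
     "MONTH": "20091201", "MARKET": 509123, "SPECIALTY": "CARDIO"},
]
COMPRESSED_SLICE = zlib.compress(json.dumps(ROWS).encode("utf-8"))


def uncompress(blob):
    """Decompress the slice and stream its rows one by one."""
    yield from json.loads(zlib.decompress(blob).decode("utf-8"))


def restrict(rows):
    """Keep only rows that satisfy the WHERE clause."""
    for r in rows:
        if (r["MONTH"] == "20091201" and r["MARKET"] == 509123
                and r["SPECIALTY"] == "GASTRO"):
            yield r


def project(rows):
    """Keep only the columns the SELECT list needs."""
    for r in rows:
        yield (r["DISTRICT"], r["PRODUCTGRP"], r["NRX"])


def aggregate(rows):
    """Sum NRX per (DISTRICT, PRODUCTGRP)."""
    totals = {}
    for district, group, nrx in rows:
        totals[(district, group)] = totals.get((district, group), 0) + nrx
    return totals


result = aggregate(project(restrict(uncompress(COMPRESSED_SLICE))))
print(result)
```

Because every stage is a generator, rows that fail the restriction never reach the aggregation step, which is the point of filtering close to the disk.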


[Same architecture diagram, Netezza TwinFin Appliance: the results of snippets 1 2 3 from the S-Blades are consolidated by the execution engine on the SMP host and returned to the client.]


Inside the IBM PureData System for Analytics N3001

Optimized Hardware + Software
• Hardware accelerated AMPP
• Purpose-built for high performance analytics
• Requires no tuning

Snippet Blades™
• Hardware-based query acceleration with FPGAs
• Blistering fast results
• Complex analytics executed as the data streams from disk

Disk Enclosures
• User data, mirror, swap partitions
• High speed data streaming

SMP Hosts
• SQL Compiler
• Query Plan
• Optimize
• Admin


Disk Mirroring and Failover

All user data and temp space mirrored

Disk failures transparent to queries and transactions

Failed drives automatically regenerated

Bad sectors automatically rewritten or relocated

[Diagram: disk partitions labeled Primary, Mirror and Temp]


S-Blade™ Failover and Query Continuity

• Drives automatically reassigned to remaining S-Blades within a chassis

• Read-only queries (that have not returned data yet) automatically restarted

• Transactions and loads interrupted

• Loads automatically restarted from last successful checkpoint




ZoneMap™ – Pure Data's Anti-Index: Automatic Query Acceleration

[Diagram: zone maps over the data blocks of a base table with Col 1: Date and Col 2: Zip. Zone Maps: 18 out of 48 extents read.]

• Indexes are additional structures on disk, derived from the base table, that accelerate locating information
• ZoneMaps are a method within the storage system, needing no additional structures on disk, that indicates where data DOES NOT reside
• The NPS system:
> Automatically stores the min. and max. values of all integer columns in each file extent
> Uses the ZoneMap information to determine whether a given extent should be read

Both indexes and ZoneMaps are techniques to avoid full table scans, but Netezza’s ZoneMap approach:

• Is automatic; and
• Does not require a separate structure to create, tune & maintain
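The extent-skipping idea behind zone maps can be illustrated with a small Python sketch. The extent contents and function names are made up for the example; only the principle (keep min and max per extent, skip extents whose range cannot match) comes from the slide.

```python
def build_zone_map(extents):
    """Record (min, max) of the column for every extent."""
    return [(min(e), max(e)) for e in extents]


def extents_to_read(zone_map, low, high):
    """Indices of extents whose [min, max] range overlaps [low, high]."""
    return [i for i, (lo, hi) in enumerate(zone_map)
            if hi >= low and lo <= high]


# Hypothetical extents holding integer-encoded dates (YYYYMMDD).
extents = [
    [20170101, 20170105, 20170109],
    [20170110, 20170117, 20170120],
    [20170121, 20170125, 20170131],
    [20170201, 20170210, 20170215],
]
zone_map = build_zone_map(extents)

# Query: rows with a date between 2017-01-15 and 2017-01-22.
hits = extents_to_read(zone_map, 20170115, 20170122)
print(hits)  # indices of extents that may contain matching rows
```

Zone maps pay off most when the data is stored roughly in the order of the filtered column, so that each extent covers a narrow value range.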




Distributions and Performance

[Diagram: CPU, disk I/O and network load for SPU nodes 1 to 7, plotted against response time]

Response time is affected by the completion time for all of the SPUs in the AMPP array.

A distribution method that distributes data evenly across all SPUs is the single most important factor that can influence overall performance!


Hash Distributions and Data Skew

[Diagram: response time for SPU nodes 1 to 7, with the load concentrated on two nodes]

Gender = M or F will distribute all table records on 2 SPUs.

Select a distribution key with unique values and high cardinality.
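Data skew from a low-cardinality key is easy to reproduce in a simulation. In this sketch CRC32 stands in for the appliance's hash function, which is an assumption, and the table contents are invented.

```python
import zlib
from collections import Counter

N_SPUS = 7


def spu_for(key, n_spus=N_SPUS):
    # CRC32 is a stand-in for the real hash function (assumption).
    return zlib.crc32(str(key).encode("utf-8")) % n_spus


def rows_per_spu(keys, n_spus=N_SPUS):
    """Number of rows that land on each SPU for the given key values."""
    counts = Counter(spu_for(k, n_spus) for k in keys)
    return [counts.get(spu, 0) for spu in range(n_spus)]


# Hypothetical table of 7000 rows.
genders = ["M" if i % 2 else "F" for i in range(7000)]
customer_ids = range(7000)

by_gender = rows_per_spu(genders)         # two distinct key values
by_customer = rows_per_spu(customer_ids)  # unique key values
print(by_gender)
print(by_customer)
```

With only two distinct key values, at most two SPUs can ever hold rows, whatever hash function is used; a unique key spreads the rows over all of them.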


Hash Distributions and Processing Skew

Using a DATE column as the distribution key may distribute rows evenly across all S-Blades. However, most analysis (queries) is performed on a date range. Massively parallel processing won’t be achieved when all of the records to be processed for a given date range are located on a single S-Blade or only a few S-Blades.

[Diagram: response time for SPU nodes 1 to 7, holding the data for Jan, Feb, Mar, Apr, May, Jun and Jul respectively]
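Processing skew can be shown with the same kind of simulation: the rows are spread evenly, yet a query on one period only keeps one SPU busy. CRC32 again stands in for the real hash function, the rows are invented, and the date is reduced to a month value to keep the example small; all of these are assumptions.

```python
import zlib

N_SPUS = 7


def spu_for(key):
    # CRC32 is a stand-in for the real hash function (assumption).
    return zlib.crc32(str(key).encode("utf-8")) % N_SPUS


# Hypothetical fact rows: (order_id, order_month), 1000 rows per month.
rows = [(order_id, f"2017-{1 + order_id % 7:02d}") for order_id in range(7000)]


def spus_touched(rows, key_index, wanted_month):
    """SPUs that hold at least one row of the requested month when the
    table is distributed on the column at key_index."""
    return {spu_for(r[key_index]) for r in rows if r[1] == wanted_month}


by_month = spus_touched(rows, 1, "2017-03")  # distributed on the month
by_order = spus_touched(rows, 0, "2017-03")  # distributed on order_id
print(len(by_month), len(by_order))
```

When the table is distributed on the month, every March row hashes to the same SPU, so the other SPUs sit idle for that query.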


CREATE TABLE customer

( c_custkey integer,

c_name character varying(25),

c_address character varying(40),

c_nationkey integer,

c_phone character(15),

c_acctbal numeric(15,2),

c_mktsegment character(10),

c_comment character varying(117)

) DISTRIBUTE ON ( c_custkey );

CREATE TABLE orders

( o_orderkey integer,

o_custkey integer,

o_orderstatus character(1),

o_totalprice numeric(15,2),

o_orderdate date,

o_orderpriority character(15),

o_clerk character(15),

o_shippriority integer,

o_comment character varying(79)

) DISTRIBUTE ON ( o_custkey );

Commonly JOINed Tables: Use the Same Distribution Key

For tables that are commonly joined, distribute both tables on the column used in the join condition!


Impact of Distribution Key on Table Join Performance

Identical Distribution Keys

CREATE TABLE ORDERS (ORDER_NO, CUST_NO, …) DISTRIBUTE BY HASH (CUST_NO)

CREATE TABLE CUSTOMERS (CUST_NO, …) DISTRIBUTE BY HASH (CUST_NO)

SELECT … FROM ORDERS O, CUSTOMERS C WHERE O.CUST_NO = C.CUST_NO

[Diagram: three nodes, each running join processing on its own slices]

Node 1: ORDERS (100, 1, …) (135, 4, …) (190, 4, …) (222, 8, …); CUSTOMERS (1, …) (4, …) (8, …) (10, …)
Node 2: ORDERS (118, 6, …) (149, 7, …) (206, 3, …) (282, 11, …); CUSTOMERS (3, …) (6, …) (7, …) (11, …)
Node 3: ORDERS (112, 2, …) (168, 5, …) (174, 12, …) (211, 2, …); CUSTOMERS (2, …) (5, …) (9, …) (12, …)

No data movement is required.


Impact of Distribution Key on Table Join (cont.)

Different Distribution Keys

CREATE TABLE ORDERS (ORDER_NO, CUST_NO, …) DISTRIBUTE BY HASH (ORDER_NO)

CREATE TABLE CUSTOMERS (CUST_NO, …) DISTRIBUTE BY HASH (CUST_NO)

SELECT … FROM ORDERS O, CUSTOMERS C WHERE O.CUST_NO = C.CUST_NO

[Diagram: three nodes, each running join processing after rows have been shipped between nodes (data shipping)]

Node 1: ORDERS (118, 6, …) (135, 4, …) (174, 12, …) (282, 11, …); CUSTOMERS (1, …) (4, …) (8, …) (10, …)
Node 2: ORDERS (112, 2, …) (168, 5, …) (206, 3, …) (222, 8, …); CUSTOMERS (3, …) (6, …) (7, …) (11, …)
Node 3: ORDERS (100, 1, …) (149, 7, …) (190, 4, …) (211, 2, …); CUSTOMERS (2, …) (5, …) (9, …) (12, …)

Data movement is required.
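The effect of the distribution key on a join can be estimated by counting how many ORDERS rows sit on a different node than their matching CUSTOMERS row. In this sketch the (ORDER_NO, CUST_NO) pairs are the ones shown on the slides, while the modulo placement function is an assumption standing in for the real hash, so the node assignments differ from the diagram.

```python
N_NODES = 3


def node_for(key):
    # Modulo placement is a stand-in for the real hash function (assumption).
    return key % N_NODES


# (ORDER_NO, CUST_NO) pairs taken from the example rows on the slides.
ORDERS = [(100, 1), (135, 4), (190, 4), (222, 8), (118, 6), (149, 7),
          (206, 3), (282, 11), (112, 2), (168, 5), (174, 12), (211, 2)]


def rows_to_ship(orders, orders_key_index):
    """Count ORDERS rows stored on a node other than the one holding the
    matching CUSTOMERS row (CUSTOMERS is distributed on CUST_NO)."""
    shipped = 0
    for row in orders:
        stored_on = node_for(row[orders_key_index])
        customer_on = node_for(row[1])
        if stored_on != customer_on:
            shipped += 1
    return shipped


same_key = rows_to_ship(ORDERS, 1)  # ORDERS distributed on CUST_NO
diff_key = rows_to_ship(ORDERS, 0)  # ORDERS distributed on ORDER_NO
print(same_key, diff_key)
```

With identical distribution keys the count is zero by construction, because both rows are placed by the same function of the same value.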


Workload Management

Workload Management (WLM) provides optional functionality to manage resources and prioritize usage across a diverse multi-user environment, to meet the needs of mixed user workloads.

Guaranteed Resource Allocation (GRA)
Mechanism to allocate NPS resources among groups of users in a multi-user environment

Prioritized Query Execution (PQE)
Finer control over resource allocation by extending the notion of query priorities from scheduling to execution

Short Query Bias (SQB)
Ensures that short queries are favored and get faster response times under heavy system workloads

Workload Limits (GRA)
You can use the JOB MAXIMUM attribute of the group definition to control the number of actively running jobs submitted by that group

[Diagram: user requests (departmental user, admin tasks, power user) enter request queues with minimum resource guarantees]
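The idea of a per-group job limit can be pictured with a toy admission controller. This is not the NPS scheduler and does not use its syntax; the class, the group names and the queueing policy are all assumptions made for illustration.

```python
from collections import deque


class GroupScheduler:
    """Toy admission control: at most job_maximum running jobs per group."""

    def __init__(self, job_maximum):
        self.job_maximum = dict(job_maximum)  # group -> limit
        self.running = {g: 0 for g in job_maximum}
        self.queued = {g: deque() for g in job_maximum}

    def submit(self, group, job):
        """Start the job if the group is under its limit, else queue it."""
        if self.running[group] < self.job_maximum[group]:
            self.running[group] += 1
            return "running"
        self.queued[group].append(job)
        return "queued"

    def finish(self, group):
        """A job of the group completed; start the next queued one, if any."""
        self.running[group] -= 1
        if self.queued[group]:
            self.running[group] += 1
            return self.queued[group].popleft()
        return None


# Hypothetical groups and limits.
sched = GroupScheduler({"power_users": 2, "departmental": 1})
states = [sched.submit("power_users", f"q{i}") for i in range(3)]
next_job = sched.finish("power_users")
print(states, next_job)
```

A limit on one group leaves the other groups unaffected, which is what keeps a burst of heavy queries from starving everyone else.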


Appliances are easy to monitor


Traditional storage is not ready for the digital transformation. Object storage solves the problems of scale, management and costs.

BLOCK & FILE

• Traditional storage.
• Block storage: fixed-size blocks in a rigid arrangement, ideal for enterprise databases.
• File storage: sharing files in hierarchically nested folders, ideal for active documents.

OBJECT

• Storage for unstructured data (photos, videos, audio, …) and big data.
• An object is data with metadata.
• Basis for cloud storage, spans geographies.
• High scalability (seamless, multi-dimensional scaling).
• Ease of use.
• Lower cost of operations.




Writing Data to IBM Cloud Object Storage

Let’s store a video!

1. Objects are sent to the Accesser via the S3-compatible API or the OpenStack Swift-compatible API.

2. Each object is segmented into 4 MB segments, e.g. a 1 GB object will be segmented into 250 segments.

[Diagram: original data sent to the Accesser and split into 4 MB segments]
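The segment count in the example follows from simple arithmetic. The sketch below assumes decimal units (1 GB = 10^9 bytes, 4 MB = 4 × 10^6 bytes), which is what makes the result 250; the function names are invented for the illustration.

```python
SEGMENT_BYTES = 4 * 10**6  # 4 MB in decimal units (assumption)


def segment_count(object_bytes, segment_bytes=SEGMENT_BYTES):
    """Number of segments needed, rounding up (integer arithmetic only)."""
    return -(-object_bytes // segment_bytes)


def segment(data, segment_bytes=SEGMENT_BYTES):
    """Split a bytes object into consecutive segments."""
    return [data[i:i + segment_bytes]
            for i in range(0, len(data), segment_bytes)]


print(segment_count(10**9))  # a 1 GB object
```

Only the last segment can be shorter than the segment size; with binary units (1 GiB and 4 MiB) the same object would give 256 segments instead.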


3. Each segment is encrypted and then sliced.

4. Erasure coding is used to transform the data into a customizable number of slices.

[Diagram: each 4 MB segment is cut into slices 1 to 7; erasure coding expansion turns them into slices 1 to 12]

5. Each slice is written to a separate storage node. In this example, the storage nodes are geographically dispersed across 3 sites.

[Diagram: slices 1 to 12 written to storage nodes at SITE 1, SITE 2 and SITE 3]




Reading Data from IBM Cloud Object Storage

With this 12/7 Information Dispersal Algorithm, a read can still be executed with any five storage nodes being unavailable.

Even an entire site outage (plus one additional storage node outage) can be tolerated.

[Diagram: 4 MB segments read back from the storage nodes at SITE 1, SITE 2 and SITE 3]
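The fault-tolerance figures of a 12/7 dispersal can be checked with a few lines of arithmetic. The sketch assumes the 12 slices are spread evenly over the 3 sites, four per site, as the example suggests; the variable names are made up.

```python
WIDTH = 12      # slices written per segment
THRESHOLD = 7   # slices needed to read a segment back
SITES = 3


def readable(available_slices, threshold=THRESHOLD):
    """A segment can be rebuilt if at least `threshold` slices survive."""
    return available_slices >= threshold


max_node_failures = WIDTH - THRESHOLD          # nodes that may be lost
slices_per_site = WIDTH // SITES               # even spread (assumption)
after_site_outage = WIDTH - slices_per_site    # one whole site down
after_site_plus_one = after_site_outage - 1    # plus one more node

print(max_node_failures, readable(after_site_plus_one))
```

Losing one site removes four slices and one more node removes a fifth, which leaves exactly the seven slices a read needs.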

IBM Cloud Object Storage: EFFICIENCY

How to build a highly reliable storage system for 1 Petabyte of usable data?

                   RAID 6 + Replication    IBM Cloud Object Storage
Usable Storage     1 PB                    1 PB
Raw Storage        3.6 PB                  1.7 PB
6TB Disks          600                     288
Racks Required     3.6x                    1.7x
Floor Space        3.6x                    1.7x
Ops Staffing       3 FTE                   .5 FTE
Extra Software     Replication/backup      None

RAID 6 + Replication: Original 1.20 PB raw + Onsite mirror 1.20 PB raw + Remote copy 1.20 PB raw

70%+ TCO savings
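The raw-capacity figures can be reproduced approximately from the expansion factors. The sketch assumes decimal units and takes the 12/7 dispersal from the earlier example as the source of the 1.7x factor; the function names are invented. The disk count it gives for the object storage case is a little below the 288 quoted on the slide, presumably because real systems are sized in whole storage nodes, which is an assumption and not something the slide states.

```python
import math
from fractions import Fraction

USABLE_TB = 1000  # 1 PB usable, decimal units (assumption)


def raw_tb(usable_tb, expansion):
    """Raw capacity needed for the given usable capacity."""
    return usable_tb * expansion


def disks(raw_tb_value, disk_tb=6):
    """Number of 6 TB disks needed, rounded up."""
    return math.ceil(raw_tb_value / disk_tb)


# Three copies (original, onsite mirror, remote copy) at 1.20 PB raw each.
raid_expansion = 3 * Fraction(12, 10)
# Width / threshold of a 12/7 dispersal.
ida_expansion = Fraction(12, 7)

raid_raw = raw_tb(USABLE_TB, raid_expansion)
ida_raw = raw_tb(USABLE_TB, ida_expansion)
print(float(raid_raw), round(float(ida_raw)), disks(raid_raw), disks(ida_raw))
```

Exact fractions are used instead of floats so that rounding up to whole disks is not affected by floating-point error.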
