22
© 2014 IBM Corporation IBM DB2 Analytics Accelerator for z/OS DB2 Analytics Accelerator for z/OS, is a high performance solution designed to work with IBM System z and DB2 z/OS to deliver faster analytic query responses transparently to users. Saso Prek, IBM SWG [email protected] Senior Certified IT Specialist, Member of zChampion Team

2 ibm db2 analytics accelerator - saso prek

Embed Size (px)

DESCRIPTION

 

Citation preview

© 2014 IBM Corporation

IBM DB2 Analytics Accelerator for z/OS

DB2 Analytics Accelerator for z/OS, is a highperformance solution designed to work with IBMSystem z and DB2 z/OS to deliver faster analytic queryresponses transparently to users.

Saso Prek, IBM [email protected] Certified IT Specialist,Member of zChampion Team

© 2014 IBM Corporation2

Customer #2 & #3DW queries & some figures

DWH.F_FAKTURA_EE > 531.000.000 rows

DWH.D_MER_MESTO > 1.209.877 rows

SELECT A.JP, E.TAR_SKUP_SIF_NAZ, B.MER_MESTO, C.LETO_MES, SUM(A.KOLICINA)

FROM DWH.F_FAKTURA_EE AS A, DWH.D_MER_MESTO AS B, DWH.D_VRS_PRODUKTA_EE AS D,

DWH.D_CAS AS C INNER JOIN DWH.D_TAR_SKUPINA AS E ON E.ID_TAR_SKUPINA = A.ID_TAR_SKUPINA

WHERE (A.JP = 2) AND (A.ID_CAS = C.ID_CAS) AND (C.LETO IN (2012, 2011, 2010, 2009, 2008))

AND (B.ID_MER_MESTO = A.ID_MER_MESTO)

AND (D.ID_VRS_PRODUKTA_EE = A.ID_VRS_PRODUKTA_EE)

AND (D.GRUPA_NAZ = 'OMREŽNINA')

GROUP BY A.JP, E.TAR_SKUP_SIF_NAZ, B.MER_MESTO, C.LETO_MES

with IDAA << 10 sec

w/o IDAA > 2 hours ( xxx CPU & xxx I/O)

© 2014 IBM Corporation3

IBM DB2 Analytics Accelerator - Product Components

3

10Gb

OSA-Express3

10 GbE

Primary

Backup

CLIENT

Data Studio Foundation

DB2 AnalyticsAccelerator

Admin Plug-in

zEnterprise + zEC12 + zBC12

Data Warehouse applicationDB2 for z/OS enabled for IBM

DB2 Analytics AcceleratorIBM DB2 AnalyticsAcelerator

BladeCenter

NetezzaTechnology

Users/Applications

Network

© 2014 IBM Corporation4

Fast Time to Value

� IBM DB2 Analytics Accelerator (Netezza 1000-12)

� Production ready - 1 person, 2 days

� Table Acceleration Setup … 2 Hours

– DB2 “Add Accelerator”

– Choose a Table for “Acceleration”

– Load the Table (DB2 copy to Netezza)

– Knowledge Transfer

– Query Comparisons

� Initial Load Performance …

�400 GB “Loaded” in 29 Min 570 million rows (Loads of 800GB to 1.3TB/Hr)

� Actual Query Acceleration … 1908x faster

�2 Hours 39 Minutes to 5 Seconds

� CPU Utilization Reduction

�35% to ~0%

Actual customer results, October 2011

© 2014 IBM Corporation55 16 June 2014

Performance & Savings

Accelerating decisions to the speed of business

Queries run faster

• Save CPU resources

• MIPS

• DASD

• People time

• Business opportunities

Actual customer results, October 2011

Times

Faster

Query

Total

Rows

Reviewed

Total

Rows

Returned Hours Sec(s) Hours Sec(s)

Query 1 2,813,571 853,320 2:39 9,540 0.0 5 1,908

Query 2 2,813,571 585,780 2:16 8,220 0.0 5 1,644

Query 3 8,260,214 274 1:16 4,560 0.0 6 760

Query 4 2,813,571 601,197 1:08 4,080 0.0 5 816

Query 5 3,422,765 508 0:57 4,080 0.0 70 58Query 6 4,290,648 165 0:53 3,180 0.0 6 530

Query 7 361,521 58,236 0:51 3,120 0.0 4 780

Query 8 3,425.29 724 0:44 2,640 0.0 2 1,320

Query 9 4,130,107 137 0:42 2,520 0.1 193 13

DB2 Only

DB2 with

IDAA

DB2 Analytics Accelerator: “we had this up and running in days with queries that ran over 1000

times faster”

DB2 Analytics Accelerator: “we expect ROI in less than 4 months”

Advance to 32 minute mark for DB2 Analytics Accelerator section of keynote

© 2014 IBM Corporation66

DB2 Analytics Accelerator

Blending System z and Netezza

technologies to deliver unparalleled,

mixed workload performance for

complex analytic business needs.

More insight from your data

• Fast, predictable response times for

“right-time” analysis

• Complex queries in seconds rather than hours

• Transparent to the application

• Inherits all System z DB2 attributes

• No need to create or maintain indices

• Eliminate query tuning

• Fast deployment and time-to-value

© 2014 IBM Corporation7

IBM DB2 Analytics Accelerator - Product Components

7

10Gb

OSA-Express3

10 GbE

Primary

Backup

CLIENT

Data Studio Foundation

DB2 AnalyticsAccelerator

Admin Plug-in

zEnterprise + zEC12 + zBC12

Data Warehouse applicationDB2 for z/OS enabled for IBM

DB2 Analytics AcceleratorIBM DB2 AnalyticsAcelerator

BladeCenter

NetezzaTechnology

Users/Applications

Network

© 2014 IBM Corporation888

User InterfaceIncremental update UI elements only visible if it has been enabled on the DB2 subsystem via IBM DB2 Analytics Accelerator configuration console

� Start / stop replication process (per subsystem-accelerator pair)

� Enable / disable replication (per table)

� Trace collection

� Information on replication latency and events

© 2014 IBM Corporation9

Deep DB2 Integration within zEnterprise

9

Data

Manager

Buffer

ManagerIRLM

Log

Manager

IBM

DB2

Analytics

Accelerator

Applications DBA Tools, z/OS Console, ...

. . .

Operational Interfaces

(e.g. DB2 Commands)

Application Interfaces

(standard SQL dialects)

z/OS on System z

DB2 for z/OS

Superior availability

reliability, security,

Workload management

Superior

performance on

analytic queries

© 2014 IBM Corporation10

Fast Time to Value

� IBM DB2 Analytics Accelerator (Netezza 1000-12)

� Production ready - 1 person, 2 days

� Table Acceleration Setup … 2 Hours

– DB2 “Add Accelerator”

– Choose a Table for “Acceleration”

– Load the Table (DB2 copy to Netezza)

– Knowledge Transfer

– Query Comparisons

� Initial Load Performance …

�400 GB “Loaded” in 29 Min 570 million rows (Loads of 800GB to 1.3TB/Hr)

� Actual Query Acceleration … 1908x faster

�2 Hours 39 Minutes to 5 Seconds

� CPU Utilization Reduction

�35% to ~0%

Actual customer results, October 2011

© 2014 IBM Corporation11

SMP Hosts

Snippet BladesTM

(S-Blades, SPUs)

Disk Enclosures

IDAA ServerSQL Compiler, Query Plan, OptimizeAdministration2 front/end hosts, IBM 3650M3

clustered active-passive2 Nehalem-EP Quad-core 2.4GHz per host

Processor &streaming DB logicHigh-performance databaseengine streaming joins,aggregations, sorts, etc.e.g. TF12: 12 back/end SPUs

(more details on following charts)

Slice of User DataSwap and Mirror partitionsHigh speed data streamingHigh compression rate EXP3000 JBOD Enclosures

12 x 3.5” 1TB, 7200RPM, SAS (3Gb/s)max 116MB/s (200-500MB/s compressed data)

e.g. TF12:8 enclosures → 96 HDDs

32TB uncompressed user data (→ 128TB)

DB2 Analytics Accelerator V2Powered by Netezza 1000 Appliance

© 2014 IBM Corporation12

EXPLAIN

• DB2 EXPLAIN function is enhanced to provide basic information about accelerator usage

• Whether query qualifies for acceleration and, if not, why

• The access path details associated with the query execution by Netezza are

provided independently of DB2 EXPLAIN by the IDAA Studio.

• For each query (irrespective of the number of query blocks) a row

is inserted in the following tables:

• in both PLAN_TABLE and DSN_QUERYINFO_TABLE, if the query is re-routed

• PLAN_TABLE's ACCESSTYPE column is set to a value of 'A'

• DSN_QUERYINFO_TABLE's QI_DATA column shows the converted query text

• in DSN_QUERYINFO_TABLE only, if the query is not re-routed

• REASON_CODE and QI_DATA columns provide details

• Note that the EXPLAIN tables can be populated with above described information even if there is no accelerator connected to DB2

• Specifying EXPLAINONLY on START ACCEL command does not establish any communications with an actual accelerator, but enables DB2 to consider its presence in the access path selection process

© 2014 IBM Corporation13

Query Execution Process Flow

13

Optimizer

IDA

A D

RD

A R

equesto

r

Application

Application

Interface

Queries executed with IDAA

Queries executed without IDAA

Heartbeat (IDAA availability and performance indicators)

Query execution run-time for

queries that cannot be or should

not be off-loaded to IDAA

SPU

Memory

SPU

Memory

SPU

Memory

SPU

Memory

SM

P H

ost

Heartbeat

DB2 for z/OS

CPU FPGA

CPU FPGA

CPU FPGA

CPU FPGA

CPU FPGA

CPU FPGA

CPU FPGA

CPU FPGA

IDAA

© 2014 IBM Corporation14

Option 1: Full Table Refresh

� Changes in data warehouse tables typically driven by scheduled (nightly or more frequently) ETL process

� Data used for complex reporting based on consistent and validated content (e.g., weekly transaction reporting to the central bank)

� Multiple sources or complex transformations prevent from propagation of incremental changes

� Queries may continue during full table refresh for accelerator

� Full table refresh may be triggered through DB2 stored procedure (scheduled, integrated into ETL process or through GUI)

DB2 z/OS Query OptimizerDB2 z/OS Query Optimizer

Changes / Replacement

DB2 for z/OS databaseDB2 for z/OS database

ET

L P

roce

ss

ET

L P

roce

ss

DB2 native

processing

DB2 native

processingAccelerator

processing

Accelerator

processing

Operational Analytics, Reports, OLAP, …Operational Analytics, Reports, OLAP, …

Continuous

Query

Processing

Continuous

Query

Processing

Full table refresh

© 2014 IBM Corporation15

Option 2: Incremental Update (Controlled Availability)

� Changes in data warehouse tables typically driven by replication or manual updates

– Corrections after a bulk-ETL-load of a data warehouse table

– Continuously changing data (e.g. trickle-feed updates from a transactional system to an ODS)

� Reporting and analysis based on most recent data

� May be combined with Option 1 & 2 (first table refresh and then continue with incremental updates)

� Incremental update can be configured per database table

DB2 z/OS Query OptimizerDB2 z/OS Query Optimizer

Changes

DB2 for z/OS databaseDB2 for z/OS database

DB2 native

processing

DB2 native

processingAccelerator

processing

Accelerator

processing

Operational Analytics, Reports, OLAP, …Operational Analytics, Reports, OLAP, …

Continuous

Query

Processing

Continuous

Query

Processing

Incremental Update

Replic

ation

Replic

ation

Applic

ation

Applic

ation

© 2014 IBM Corporation1616

Connectivity Options

Multiple DB2 systems can connect to a single IDAA

A single DB2 system can connect to multiple IDAAs

• residing in the same LPAR

• residing in different LPARs

• residing in different CECs

• being independent (non-data sharing)

• belonging to the same data sharing group

• belonging to different data sharing groups

Multiple DB2 systems can connect to multiple IDAAs

Full flexibility for DB2 systems:

Better utilization of IDAA resourcesBetter utilization of IDAA resources

ScalabilityScalability

High availabilityHigh availability

DB2 DB2

DB2

DB2 DB2

© 2014 IBM Corporation17

Save Over 95% of Host Disk Space for Historical Data

1Q

2Q

3Q

4Q

1Q

2Q

3Q

4Q

1Q

2Q

3Q

4Q

1Q

2Q

3Q

4Q

1Q

2Q

3Q

4Q

1Q

2Q

3Q

4Q

1Q

2Q

3Q

4Q

Year Year -7Year -2 Year -3 Year -4 Year -5Year -1

Historical Data

Current Data

One Quarter = 3.57% of 7 years of data

One Month = 1.12% of 7 years of data

One month = 2.78% of 3 years of data

© 2014 IBM Corporation1818

Accelerating SAP for more business value

� Enhances the SAP Certified

DB2 for z/OS

� Accelerates SAP NetWeaver BW

� Dramatic decrease in

elapsed time for SAP BW ad-hoc reporting

No Description

Records

Read

Records

Returned

DB2

(sec)

Accelerator

(sec)

Accleration

Factor

1 Simple mass aggregation 17116647 21 117 0.78 150

2 Query #1 + 70% filter 11980812 21 94.2 0.86 110

3 Query #1 + 30% filter 5133708 21 54.8 0.82 67

4 Query #1 + 10% filter 1710293 21 17.6 0.87 205 Skewed data, low filtering 10790019 21 96.8 2.47 39

6 Skewed data, high filtering 24 14 7.28 0.83 9

7 Many restrictions 3805941 21 128 7.65 17

8 Navigational attributes 823646 21 17.1 1.27 13

9 Navigational attributes + selective condition 811 21 15.8 1.17 1410 Open value ranges 2006 21 19.6 3.52 6

11 Hierarchy 1653981 21 17.6 0.97 18

12 Hierarchy + selective condition 55068 21 38.6 0.98 3913 Restricted key figures on 2 dimensions 1314964 1948 207 7.22 29

14 Query #14 + hierarchy 132564 1499 > 1000 1.27 > 78715 Calculated key figures (OLAP) 5321586 10 57.8 2.37 24

16 OR linked values 6212609 13 40.5 0.92 44

17 Non uniform data distribution 11016253 13 31.2 0.99 32

18 Selective line item 1724 1706 0.71 1.17 0.6

19 Non-selective line item 115481 68619 33.8 1.36 25

20 All together 3087692 468 87.7 4.42 20

1 3 5 7 9

11

13

15

17

19

0

20

40

60

80

100

120

140

Accelerator (sec) DB2 (sec)

18 million records in an InfoCube, 20 dedicated queries

© 2014 IBM Corporation19

Run queries up to 2000x faster

Business innovation with zEnterprise Analytics

Petrol, a strategic supplier of oil and other energy products is doing something they could never do before, increasing retail sales nearly 5% through reduced analytic query response times (99.8 % faster)

“The store employee enters what the customer is purchasing, and with the DB2 Analytics Accelerator appliance, the Cognos

BI and SPSS tools deliver information on complementary products in seconds.”

--A Chief Information officer--

© 2014 IBM Corporation20

Workload Assessment

1

Documentationand REXX procedure

Documentationand REXX procedure

Data package(mainly unload

data sets)

Data package(mainly unload

data sets)IBM lab

Database

IBM labDatabase

Pre-process andload

Pre-process andload

2 3

Quick Workload Test Tool

Quick Workload Test Tool

Report

Assessment

Report for a first assessment:� Acceleration potential for

• Queries• Estimated time• CP cost

Report for a first assessment:� Acceleration potential for

• Queries• Estimated time• CP cost

CustomerDatabase

CustomerDatabase

� Customer

• Collects information from dynamic statement cache, supported by step-by-step instruction and REXX script (small effort for customer)

• Upload compressed file (up to some MB) to IBM FTP server

� IBM

• Import data into local database

• Quick analysis based on known DB2 Analytics Accelerator capabilities

Key contact: Data Warehouse System z/Germany/IBM

20

© 2014 IBM Corporation21OMEGAMON XE for DB2 PE

Analyzeand

Report

DB2 Admin/OC

Manageand

Administer

Query Workload Tuner for z/OS

Compareand

Tune

Query Monitor for DB2

Monitorand

Identify

DB2 Analytics AcceleratorLoader

PerformanceLoad withoptions

IBM DB2 Tools and Accelerator Ecosystem

Is Acceleratorbeing under/over utilized?

How can Idetermine cost savings?

What queriesare being routed to theAccelerator?

How can I determinecost savings?

Do I have thebest set ofobjects for consideration?

Is there a wayto managefrom ISPF?

Can I load data from other sources?

How can Ideterminecost savings?

Can I improveon availabilitywhen loading?Can I tell how

current my data is?

© 2014 IBM Corporation22