October 25–29, 2009 • Mandalay Bay • Las Vegas, Nevada
DB2 9 for z/OS Performance Update
A special thanks to Akiko Hoshikawa for providing these foils
Rafael Garcia, IBM SVL, [email protected]
Session Overview

Objective
– Become familiar with updates on DB2 9 for z/OS performance, to achieve a smooth V8-to-9 migration and take advantage of DB2 9 features.

Agenda
– General DB2 9 performance update/expectation
  • SQL procedures
  • Insert
  • Sort and workfile
  • Select
  • Compression
  • Matured V8 functions
– Dataset open/close performance
– Utility/XML performance
General Transaction CPU Usage Trend with DB2 9

5 to 10% increase on average from V7 to V8; no change on average from V8 to DB2 9
– 0 to 5% improvement for column-intensive transactions, especially with varchar (NFM)
– 0 to 15% CPU reduction in insert-intensive transactions through insert improvements (NFM)
– Rebind to enable SPROC (IFCID 224 or QISTCOLS in statistics)
– Significant CPU reduction in utilities on index processing (CM)
– CPU reduction also from
  • Long displacement instructions (CM)
  • DDF/DBM1 shared storage (CM)

Synergy with System z10
– HiperDispatch
– zHPF (High Performance FICON)
DB2 9 Performance Update
– SQL Procedures
– Insert
– Sort and Workfile
– Select
– Enable V8 functions to bring CPU reduction
DB2 9 Migration: Convert SQL Procedures to Native

[Diagram: a CALL routed to DBM1 (native SQL procedure) instead of a WLM-managed address space; the procedure body runs SELECT and UPDATE statements.]

Native SQL procedure support (NFM)
– SQL PL runs in DBM1 instead of a WLM-managed stored procedure address space
– Easier development process
– Some SELECT from SYSDUMMY1 replaced by SET statements
– zIIP-eligible if called via DRDA, as it runs in DBM1 (not a WLM address space) under the DDF enclave SRB
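As an illustrative sketch (the table and logic are hypothetical, not from this deck), a DB2 9 native SQL procedure is created without a WLM ENVIRONMENT clause, so its SQL PL body runs directly in DBM1:

```sql
-- Hypothetical example: the EMP table and logic are illustrative.
-- No WLM ENVIRONMENT clause: the body runs natively in DBM1.
CREATE PROCEDURE UPDATE_SALARY
  (IN P_EMPNO CHAR(6), IN P_PCT DECIMAL(5,2))
  LANGUAGE SQL
BEGIN
  UPDATE EMP
     SET SALARY = SALARY * (1 + P_PCT / 100)
   WHERE EMPNO = P_EMPNO;
END
```

When such a procedure is invoked through a DRDA connection, the work stays under the DDF enclave SRB and is therefore zIIP-eligible.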
DB2 9 Migration: Convert SQL Procedures to Native

DB2 9 native SQL procedure improvements
– No WLM address space overhead
– Fewer statements generated
– If called from DRDA, it executes like any other distributed process => zIIP-eligible

Measurements
– 5-30% ITR improvement with native SQL procedures
– zIIP offload increases from 6-8% to 35%
[Chart: throughput improvement (transactions per second, 0-900) for IRWW-SQLPL and Complex-SQLPL workloads, comparing V8 and V9 native SQL procedures.]
DB2 9 Insert Performance

Insert performance improvements DB2 9 offers:

Data sharing
– Log latch contention and LRSN spin loop reduction (NFM)

Space search
– Support for the APPEND option (NFM)
  • Effective when APPEND is used with MEMBER CLUSTER in data sharing

Index Manager
– More index look-aside (CM)
– Large index page sizes (NFM)
– Asymmetric index split (NFM)

Insert monitoring
– Real-time statistics on by default
  • Real-time statistics support for LASTUSED (NFM)
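As a sketch (the table name and columns are hypothetical), the DB2 9 APPEND option is specified on CREATE or ALTER TABLE so inserts skip the free-space search and go to the end of the table space:

```sql
-- Hypothetical audit-log table; APPEND YES directs inserts to the
-- end of the table space instead of searching for free space.
CREATE TABLE AUDIT_LOG
  (EVENT_TS  TIMESTAMP NOT NULL,
   EVENT_TXT VARCHAR(1000))
  APPEND YES
  IN DB1.TS1;

-- The option can also be toggled on an existing table:
ALTER TABLE AUDIT_LOG APPEND NO;
```

APPEND trades some space reuse for insert speed, which is why the deck pairs it with MEMBER CLUSTER in data sharing.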
[Chart: batch insert comparison of V8 NFM, V9 CM, and V9 NFM on z10 — CPU time in seconds (0-80) and LRSN spin count in millions (0-1600).]
Log Latch Serialization - LC19

LC19: serialization of log write operations; LRSN spin and log latch contention
– In data sharing, DB2 may need to spin to generate a unique LRSN
– Higher impact for heavy insert operations on faster processors
  • Higher impact with multi-row insert (MRI), as MRI eliminates other overhead

Latch held while spinning (V8, DB2 9 CM)
– DB2 9 NFM no longer holds the log latch while spinning in data sharing
– Improvement depends on the spin rate
– DB2 9 customers, CM->NFM: reported 15-50% CPU reduction
  • Up to 5x spin reduction and 2x CPU reduction measured in SVL
Summary of Recent Insert APARs

Space search
– PK55527 High getpages with segmented TS
– PK57846 High getpages with PTS
– PK65220 APPEND option for LOB TS
– PK76544 PTS space search improvements; many zero unused pages between space map pages
– PK85539 PE fix of PK76544 for Member Cluster
– PK76738 High number of getpages in V9; growth of table space after mass delete
– PK81470 Enables MC00 behavior in DB2 9
– PK81471 Favors physical extend over searching free space in V9
– PK83735 Pre-conditioning APAR for PK94122
– PK94122 Removes forced log I/O from seg/UTS

Others
– PK62214/PK91830 Improved detection in asymmetric index split
– PK68246 Look-aside for insert into DPSI index
DB2 9 Performance - Sort and Workfile

In-memory workfile for small sorts (CM)
– A simple sort does not require physical allocation of a sort workfile when
  • Number of records < 255
  • Sort result set < 32K in size
– 10 to 30% CPU reduction for short-running SQL calls with small sorts
– Beneficial for online transactions with short-running SQL calls in which the number of rows sorted is small

FETCH FIRST n ROWS ONLY (CM)
– Avoids the tournament sort if the result fits in 32K
  • 2x CPU improvement measured
– FETCH FIRST n ROWS ONLY in a subselect (NFM)
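For illustration (table and columns are hypothetical), a top-n query of the kind that can avoid a full tournament sort when the kept result fits in 32K:

```sql
-- Hypothetical top-10 query: with FETCH FIRST, DB2 9 can keep only
-- the 10 best rows in memory rather than sorting the whole result.
SELECT ORDER_ID, ORDER_AMT
  FROM ORDERS
 ORDER BY ORDER_AMT DESC
 FETCH FIRST 10 ROWS ONLY;
```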
Workfile Usage

V8 Workfile database (user managed, 4K and 32K)
– Sort activities: ORDER BY, GROUP BY, DISTINCT
– Merge, star, and outer joins
– Created temporary tables
– Non-correlated subqueries
– Materialized views
– Materialized nested table expressions
– Triggers with transition variables

V8 TEMP database (4K, 8K, 16K, 32K)
– Declared global temporary tables
– Static scrollable cursors

DB2 9 Workfile database (DB2 managed, 4K and 32K)
– All of the above: sort activities (ORDER BY, GROUP BY, DISTINCT), merge, star, and outer joins, created temporary tables, non-correlated subqueries, materialized views, materialized nested table expressions, triggers with transition variables, declared global temporary tables, and static scrollable cursors
DB2 9 Performance - Sort and Workfile

Heavier usage of 32K workfiles instead of 4K (CM)
– V8: sort record size more than 4K bytes
– DB2 9: input record size more than 100 bytes
  • Sort record size = 16 + (select column total + sort key total)

Improved performance for large-record sorts and DGTT activities
– Fewer getpages, fewer I/O requests
– 10-50% improvement when a 32K TS is used with record size > 100 bytes
– Getpage and I/O reduction compared to a 4K TS
– 30 to 60% faster SELECT COUNT from a DGTT
  • Bigger prefetch quantity, 32K workfile, dynamic prefetch when large DGTT records are used
– 5 to 15% faster with less CPU for DGTT INSERT
  • Bigger preformat quantity and asynchronous preformat in DB2 9 but not V8
Sort Performance – 32K Workfile Usage

If you do not increase the number of 32K workfile table spaces from V8:
– Potentially high page latch contention on 32K workfile space map pages
– High I/O contention on 32K workfile data sets
– Cannot take advantage of the improvements that use 32K workfiles

Create "enough" 32K workfiles in V9; increase 32K buffer pool sizes in V9.

What about DGTT usage of the workfile?
Sort Performance – 32K Workfile Usage

DGTT and sort activities
– V8 workfile recommendation: zero SECQTY
– A DGTT cannot span multiple table spaces; it gets a -904

Logical separation of DGTTs from other workfile activities by PK70060 (06/09)
– Requires DB2-managed workfile table spaces
– DGTT: workfile TS with non-zero SECQTY
– Workfile activities: workfile TS with zero SECQTY
– Mass delete performance improvement; deleted space reuse

DB2 9 workfile requirement
– V8 TEMP space = V9 4K+32K workfile with non-zero SECQTY
  • V8 TEMP SEGSIZE 4 -> V9 workfile SEGSIZE 16
  • Potential size growth with small concurrent DGTT insert activities
  • If space utilization is an issue, reduce the size by changing SEGSIZE to 4 (9 NFM)
  • With a smaller SEGSIZE, possible impact on prefetch performance
– V8 workfile 4K+32K = V9 4K+32K, with additional 32K workfile TS with zero SECQTY
  • V8 sort workfile SEGSIZE 24 -> V9 SEGSIZE 16
  • Continue to monitor with statistics and IFCIDs
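A minimal sketch of the separation PK70060 enables (the storage group, quantities, and names are hypothetical): sort-oriented workfiles get SECQTY 0, DGTT-targeted workfiles get a non-zero SECQTY:

```sql
-- Hypothetical DB2-managed workfile table spaces in the workfile database.
-- Zero SECQTY: preferred for sort/workfile activities.
CREATE TABLESPACE WRK32K01 IN DSNDB07
  USING STOGROUP SYSDEFLT PRIQTY 720000 SECQTY 0
  BUFFERPOOL BP32K SEGSIZE 16;

-- Non-zero SECQTY: preferred for DGTT usage, which may need to grow.
CREATE TABLESPACE DGT32K01 IN DSNDB07
  USING STOGROUP SYSDEFLT PRIQTY 720000 SECQTY 72000
  BUFFERPOOL BP32K SEGSIZE 16;
```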
DB2 9 Workfile Usage Control and Monitoring

MAXTEMPS in zparm DSN6SPRM
– Maximum amount of space (MB) in the workfile database that can be used by a single agent at any given time for all temporary tables
– Default 0 (no limit)

Workfile activity monitoring with IFCID 2 - system level
– Both DGTT and workfile activities
– IFCID 2
  • QISTWF04 Total 4KB TS storage used in MB
  • QISTWH32 Total 32KB TS storage used in MB
  • QISTWFP2 How many times a 4KB page was used when 32K was preferable -- should be a small number

IFCID 342/343 - agent level
– IFCID 342 Agent-level workfile usage
  • 'WFDB' for workfile-related usage of the workfile
  • 'TPDB' for DGTT usage of the workfile
– IFCID 343 MAXTEMPS reached
Select / Access Path Related Improvements

V8
– Read multiple rows via index backward to avoid a sort
– More index usage for unlike type or length
– Materialized query tables
– Distribution statistics
– Star join sparse index and in-memory workfile usage

DB2 9 CM
– Optimization across, rather than within, query blocks
– Pair-wise join in star schema queries
– Optimizer cost model update
– Access path stability for static SQL (PK52522/52523 12/07)
– Histogram statistics over a range of column values

DB2 9 NFM
– Index on expression
– Reordered row format for VARCHAR columns
DB2 9 Migration Performance: Plan Stability (DB2 9 CM)

DB2 9 PK52523 and pre-conditioning with V8 PK52522
– At REBIND, DB2 saves old copies of packages (access path / runtime structures) in the catalog, in case the new one is not optimal
– Packages only; no support for stored procedure packages
– BIND/REBIND with the PLANMGMT option to save copies
  • PLANMGMT(BASIC) - saves 2 copies (current and previous)
  • PLANMGMT(EXTENDED) - saves 3 copies (current, previous, original)
– REBIND PACKAGE with the SWITCH option to change the access path

Cost of the PLANMGMT option
– BIND/REBIND CPU increase of 20-30%
– SPT01 size increase (64GB limit)
  • FREE PACKAGE with the PLANMGMTSCOPE clause
  • Possible to compress SPT01 after PK80375
DB2 9 Index On Expression (NFM)

Create the index:
CREATE INDEX UPPER_NAME
  ON EMP (UPPER(LASTNAME, 'EN_US'),
          UPPER(FIRSTNAME, 'EN_US'))

Query:
SELECT ID FROM EMP
 WHERE UPPER(LASTNAME, 'EN_US') = 'SMITH'
   AND UPPER(FIRSTNAME, 'EN_US') = 'JOHN'

Orders-of-magnitude improvement if a predicate can use such an index.

Extra cost in index maintenance
– LOAD, INSERT, UPDATE on a key value, REBUILD INDEX, CHECK INDEX, and REORG TABLESPACE
– Not REORG INDEX, as expressions are evaluated at insert or index rebuild
– Not eligible for zIIP offload
DB2 9 Index Compression (NFM)
Difference between data and index compression:

Compression               Data        Index
Level of compression      Row         Page (1)
Comp in DASD              Yes         Yes
Comp in BP and Log        Yes         No
Comp Dictionary           Yes         No (2)
Getpage reduction         Yes         No
CPU overhead              In Acctg    In Acctg and/or DBM1 SRB
Estimation                DSN1COMP    DSN1COMP (4)
'Typical' Comp Ratio CR   10 to 90%   25 to 75% (3)
Notes: Index Compression

(1) No compression or decompression on each insert or fetch; instead at I/O time
– CPU overhead is critically sensitive to the index BP hit ratio
  • A bigger index BP is strongly recommended for index compression
  • No change in accounting CPU time if index pages are brought in by prefetch
(2) Data compression uses H/W compression, which requires a dictionary, while index compression is done by DB2; LOAD or REORG is not required for index compression
(3) Based on a limited survey so far
– Higher for relatively unique indexes with long keys
– Maximum CR is limited by index page size: 50% with 8K, 75% with 16K, 87.5% with 32K pages
(4) The DSN1COMP utility simulates the compression ratio without actually compressing the index

For some customers, especially in Data Warehouse/Business Intelligence, indexes take up more DASD space than data
– Index compression can be very valuable in such an environment
– The cost of index compression is under user control

The work area for compressed index I/O is long-term page fixed
– So long-term page fix is also recommended for
  • The non-compressed index buffer pool, if there is significant DASD read/write and/or GBP read/write
  • The compressed index buffer pool as well, if there is significant GBP read/write
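As a sketch (index and table are hypothetical), index compression is enabled at CREATE INDEX time; since only pages on DASD are compressed, a compressed index must use a buffer pool larger than 4K (8K, 16K, or 32K):

```sql
-- Hypothetical index: COMPRESS YES stores pages compressed on DASD.
-- The 8K buffer pool holds the expanded pages once they are read in.
CREATE INDEX ORD_CUST_IX
  ON ORDERS (CUST_ID, ORDER_TS)
  BUFFERPOOL BP8K0
  COMPRESS YES;
```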
V8 CPU Improvement: Long-Term Page Fix

Introduced in V8
– ALTER BUFFERPOOL(BPx) PGFIX(YES)
  • Default is PGFIX(NO)
  • Takes effect at the next buffer pool allocation

Page-fixed buffer pools
– The buffer pool should always be backed by real storage
– Reduces z/OS page fix cost (accounted to TCB or DBM1 SRB) during DB2 I/O operations
– Effective for buffer pools with a high number of pages read/written per buffer, or GBP read/write

Example: BPx with 5000 buffers and pages read (sync/prefetch) + written = 500000 per minute:
500000 / 5000 = 100 page fixes per buffer per minute
V8 CPU Improvement: Multi-Row Operations

Introduced in V8
– Reduce API overhead; up to 40-50% CPU saving
– Automatic for FETCH in DRDA, but not for insert

[Diagram: three single-row INSERTs of rows 1-3 versus one multi-row INSERT carrying rows 1-3 together.]

QXRWSINSRTD: number of rows inserted
QXRWSUPDTD: number of rows updated
QXRWSDELETD: number of rows deleted
QXRWSFETCHD: number of rows fetched*
* number of rows returned to the application

Multi-row insert tends to suffer LRSN spin
– DB2 9 NFM LRSN spin reduction

Monitoring enhancement: added row counters
– V8/V9 PK62161 (UK44050/UK44051)
  • Additional counters in IFCID 2, 3
  • OMPE 4.2 update on RECTRACE IFCID 2
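A minimal embedded-SQL sketch of multi-row insert (the table and host-variable arrays are hypothetical): host-variable arrays plus FOR n ROWS turn n API crossings into one:

```sql
-- Hypothetical host-variable arrays :HV-ID and :HV-AMT each hold
-- 100 values; one statement inserts all 100 rows in a single call.
INSERT INTO ORDERS (ORDER_ID, ORDER_AMT)
  VALUES (:HV-ID, :HV-AMT)
  FOR 100 ROWS
  ATOMIC;
```

ATOMIC vs NOT ATOMIC CONTINUE ON SQLEXCEPTION controls whether a single failing row rolls back the whole statement.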
Utility, LOB and XML Update
Utility Performance Improvement in DB2 9

V7 to V8 utility CPU time: -5% to +5%
– DPSI for parallel Load/Reorg/Rebuild (V8 NFM)

DB2 9: biggest improvement in DB2 utility history
– Mainly CPU reduction in index processing
  • 5 to 20% in Recover Index, Rebuild Index, Reorg Tablespace/Partition
  • 5 to 30% in Load
  • 20 to 60% in Check Index
  • 35% in Load Partition
  • 30 to 50% in Runstats Index
  • 40 to 50% in Reorg Index
  • Up to 70% in Load Replace Partition with NPIs and dummy input
Utility – z9/z10 zIIP Exploitation

V8/DB2 9 zIIP exploitation for utilities in index processing
– Examples of effective offloaded CPU time with 4 CPs and 2 zIIPs
  • 5 to 20% Rebuild Index
  • 10 to 20% Load/Reorg of a partition with one index, or of an entire tablespace
  • 40% Rebuild Index of a logical partition of an NPI
  • 40 to 50% Reorg Index
  • 30 to 60% Load/Reorg of a partition with more than one index
– Utility with zIIP: V8/V9 PK60956 (4/08)
  • Objects with small primary/secondary allocation

V8/DB2 9 zIIP exploitation for utilities in DFSORT (R10)
– PK85856 (DFSORT on z/OS R10 or above), PK85889 (DB2 V8 / V9)
– No support in the accounting trace for redirected DFSORT CPU time
  • Recorded in the RMF Workload Activity Report
– SVL Lab measurements
  • LOAD, REORG, REBUILD INDEX, RUNSTATS (with sort usage)
  • 10-40% of utility CPU time
    – Depends on the sort phase CPU usage
    – 30-60% of SORT phase CPU
DB2 9 Copy Utility Enhancements

DB2 9 CM: tablespace image copy always runs with the CHECKPAGE option
– The added overhead of CHECKPAGE is practically eliminated
  • 15% less CPU than V8 with the CHECKPAGE option, the same as V8 without CHECKPAGE, in Lab measurements

DB2 9 CM: LRU (Least Recently Used) -> MRU (Most Recently Used) on buffer steal
– Protects the contents of the buffer pool for other applications
– V9 PK81232 eliminates the CPU increase due to MRU checking

PK74993 (V9)
– 20% elapsed time improvement for copies of multiple small datasets to tape
Notes: Other Utility Performance Updates in V8/DB2 9

V8/DB2 9 PK83996 Very high other-lock suspensions due to SYSUTILX growth
DB2 9 PK83683 LOAD REPLACE improvement for NPI log write I/O wait in data sharing
V8/DB2 9 PK61759 (5/08) Load/Reorg CPU reduction
– 6 to 8% in the DFSORT interface
V8/DB2 9 PK45916/PK41899 SORTNUM elimination
DB2 9: BUILD2 phase eliminated in Online Reorg of a partition with NPIs, for better availability
– Higher CPU time and elapsed time when few out of many partitions, especially with more NPIs, are Reorg'd, as entire NPIs are copied to the shadow dataset
  • Additional temporary DASD space needed
  • NPIs are automatically Reorg'd
Active log read buffers per Start I/O increased from 15 to 120, for up to +70% recovery throughput
DB2 9 LOB Performance Improvement Summary (1)
This is just a summary; there are entire presentations on this topic:
LOB lock avoidance in CS isolation (NFM)
– Up to 100% reduction in LOCK and UNLOCK requests in FETCH
  • Uses LRSN and page latch instead
– One measurement with SAP-optimized LOB streaming
  • -67% IRLM requests
  • -26% class 2 accounting CPU time
  • -14% elapsed time

LOB/XML flow optimization by size (NFM)
– JCC properties progressiveStreaming=ON/OFF and streamBufferSize= (default 1M)
– LOB retrieval
  • Smaller than 12K - inline
  • Between 12K and streamBufferSize - chained
  • Larger than streamBufferSize - locator

LOB read/write I/O performance improvement (CM)
– From the doubling of prefetch and deferred write quantities
– From an 8x increase in preformat quantity
DB2 9 LOB Performance Improvement Summary (2)

File reference variables for INSERT/SELECT of LOB data to/from a sequential file (NFM)

LOB CPU time reduction also from DDF/DBM1 shared memory above 2GB and more efficient space search (CM)

Reorg LOB to reclaim space (CM)
– In V8, LOB Reorg did not reclaim free space, often leading to a bigger table space as a result of Reorg
– In V9, free space is reclaimed. A general recommendation is to Reorg when the free space is bigger than the used space, i.e. SYSTABLESPACESTATS.SPACE > 2*DATASIZE/1024 in Real Time Statistics

LOB APPEND option by PK65220
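The Reorg rule of thumb above can be checked directly against Real Time Statistics; a sketch (any database or table-space filter is up to the reader):

```sql
-- Find table spaces where allocated space exceeds twice the used
-- data size, per the rule of thumb above.
-- SPACE is in KB and DATASIZE in bytes, hence the /1024.
SELECT DBNAME, NAME, SPACE, DATASIZE
  FROM SYSIBM.SYSTABLESPACESTATS
 WHERE SPACE > 2 * DATASIZE / 1024;
```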
XML Support in DB2 9 – Performance

XML size matters – CPU, I/O, and network
– Compression to reduce CPU and DASD space

Overhead at XML store time can pay off at query time
– A single XML insert and single XML fetch, compared to a multi-table insert and join

DB2 calls the z/OS XML System Services to parse the document
– Use z/OS 1.9 or above (DB2 9 NFM)
  • 30% improvement in XML System Services 1.9
– Take advantage of zIIP and zAAP redirection of XML System Services
  • If an XML insert is issued through DRDA: DRDA zIIP + XML parser zIIP
  • If an XML insert/load is issued locally, the XML parser uses zAAP
  • z/OS R11: zAAP-on-zIIP capability
– Cost of validation: 2-3x CPU increase
  • Plans to redirect validation in a DB2 9 PTF and vNext
Fundamental Difference Between LOB and XML

[Diagram: base table rows 1-3 pointing to LOB 1-3 in a LOB auxiliary table (4K, 8K, 16K, or 32K pages), versus base table rows pointing to XML 1-3 in an XML table, which is always a UTS - PBR or PBG (always 16K pages).]

Store: XML can be compressed
Retrieve: use an XML index; scanning XML documents is costly
XML Performance Benchmark Results

[Chart: TPoX benchmark queries - throughput (tps, 0-2500) and CPU busy (%) versus number of users (10-40).]

TPoX (Transaction Processing over XML) Benchmark
– DB2 transactions with only XML data
– 6300 XML inserts per second into the ORDER table
  • Commit every 10 inserts
  • 25-50 threads
– 2335 XML queries (TPoX 7-query mix) per second against 3 tables from multiple users
  • 10-40 threads
XML Performance APARs

Get the latest XML maintenance: II14426
– PK47594/PK58766 Utility calls the XML parser once
– PK51571/PK51572/PK51573 XMLTABLE/XMLCAST support
– PK55966 XML CS reader lock avoidance
– PK57158 Performance for XML/APS
– PK56337 Performance improvement for the parent axis
– PK66218 Performance for insert with XML indexes
– PK68265 and PK72604 XML lock avoidance in insert/update/delete
– PK62082 Performance improvement with BETWEEN predicates
– PK55783 and PK81260 Index usage for XML column joins
– PK58914 XMLTABLE improvement by removing unreferenced columns
– PK88034 XMLAGG improvement
– PK92908 XML APS improved filter factor for XML index usage
– PK80732/PK80735 XML scan performance improvement; adds index support for fn:not(), fn:starts-with(), fn:substring(), fn:upper-case()

zIIP/zAAP redirection of the XMLSS parser: z/OS 1.9 with OA23828, OA22035
Migration and Monitoring
CPU Time and Trace

Minimize phantom or orphaned trace records
– Example from a customer's DB2 V8 statistics report, in IFC records per commit:

  IFC DEST   Written   Not written / Not accepted
  SMF        6306K     0
  OP1        7851K     11199
  OP6        0         0
  OP7        0         6304K
  OP8        0         0
  Others     0         0

– Phantom or orphaned trace: monitoring (e.g. a vendor tool) was stopped, but the DB2 trace was not. It carries the same CPU overhead as a real trace.
– V9 tries to eliminate orphaned trace records

TRACE: the number-one common cause of CPU degradation in DB2 performance PMRs. Suspect trace/monitoring overhead if you see a 2-3x CPU increase without an access path change.
DB2 9 Migration: Instrumentation – DB2 Traces

Display IFCID (CM)
– Debug-type traces are often more expensive in newer versions, as debug capability is emphasized over CPU usage

Capture missing time in class 3 accounting to reduce NOT ACCOUNTED time (CM)
– Active log read
– TCP/IP and LOB materialization

Package-level trace filtering in the trace command (CM)

Regular accounting: Class 1, 2, 3, 7, 8
– Detailed package accounting: Class 10 (IFCID 239) (CM)

-D91D DIS TRACE
TNO TYPE  CLASS         DEST QUAL IFCID
01  STAT  01,03,04,05,  SMF  NO
          06
02  ACCTG 01            OP1  NO
03  AUDIT 01            OP2  NO
04  MON   30            OP2  NO   031
05  STAT  04            OP2  NO
06  STAT  03            OP2  NO
07  PERFM 30            OP2  NO   090
*********END OF DISPLAY TRACE SUMMARY DATA*********
DB2 Latch Suspensions

LC01/02: Infrequently used
LC03: DDF disconnect
LC04: SYSSTRING cache or OBD
LC05: IRLM data sharing exits or RLF
LC06: Index tree p-lock in GBP-dependent objects
LC07: Index latch
LC08: Query parallelism
LC09: Utilities or sproc URID
LC10: Allied agent chain, sequence descriptor
LC11: DGTT allocation, sequence
LC12: Global transaction ID
LC13: Pageset operations
LC14: BM LRU
LC15: Archive log
LC16: UR chain
LC17: RURE chain
LC18: DDF resynch list
LC19: Log write
LC20: System checkpoint
LC21: Accounting rollup
LC22: Internal checkpoint
LC23: BM page latch timer, DWQT operation
LC24: EDM thread storage allocation, BM prefetch, page unlatch serialization
LC25: EDM hash, workfile allocation
LC26: Dynamic statement cache
LC27: Sproc/UDF, authorization cache hash
LC28: Sproc cache, authorization cache
LC29: Field proc, DDF transaction manager
LC30: System agent latch
LC31: Storage manager
LC32: Storage manager
LC254: Index latch for non-GBP objects

Sample latch contention statistics (counts per second):
LATCH CNT  /SECOND  /SECOND  /SECOND  /SECOND
LC01-LC04     0.00     0.00     8.05     0.00
LC05-LC08     0.59   759.62     0.06     0.00
LC09-LC12     0.00     0.00     0.00    27.07
LC13-LC16     0.04   332.98     0.00     6.00
LC17-LC20     0.00     0.00 20048.36     0.00
LC21-LC24     0.03     0.00  1980.15   304.57
LC25-LC28    30.59     0.09    55.32     2.27
LC29-LC32     0.10     8.68    38.84   108.99
Identify Problem Statements - Dynamic Statement Cache Table

Create DSN_STATEMENT_CACHE_TABLE
– Sample in member DSNTESC of the SDSNSAMP library

Turn on IFCIDs 316, 317, 318
– 318 acts as the on/off switch

EXPLAIN STMTCACHE ALL
– Externalizes the cache contents into DSN_STATEMENT_CACHE_TABLE
– Elapsed, CPU, and most class 3 information per statement

EXPLAIN STMTCACHE STMTID xx
– Select from PLAN_TABLE with STMTID = QUERYNO
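A sketch of the externalize-and-inspect flow (the ranking and column choice are illustrative):

```sql
-- Snap the dynamic statement cache into the explain table...
EXPLAIN STMTCACHE ALL;

-- ...then rank cached statements by accumulated CPU time.
SELECT STMT_ID, PROGRAM_NAME, STAT_EXEC, STAT_CPU, STAT_ELAP
  FROM DSN_STATEMENT_CACHE_TABLE
 ORDER BY STAT_CPU DESC
 FETCH FIRST 10 ROWS ONLY;
```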
[Sample output rows: STMT_ID, PROGRAM, statement type (SELECT/INSERT), ELAPSED, CPU, number of executions, GETPAGE, SYNIOWAIT, and LOCKWAIT per cached statement.]
Synergy - DB2 and System z

Synergy with System z processors

z990, z890, z9, z10
– DB2 9 long displacement instruction hardware support; simulated by microcode on z900
  • Impact on input/output column processing - applications with SELECT *, INSERT with many columns
  • V9 CPU vs V8 on z990 or later: -5 to -15% for column-intensive applications
  • V9 CPU vs V8 on z900: +5 to 15%, more with many columns

z9 (2094)
– MIDAW (Modified Indirect Data Address Word) to improve I/O performance
– zIIP and zAAP offload to reduce total cost of ownership
– More engines and memory: up to 54 processors, 512 GB memory

z10 (2097)
– zHPF
– HiperDispatch
– Up to 64 processors, 1.5 TB memory
CPU Time Multiplier for Various Processor Models

Model                 Multiplier
G6 (9672x17)          1.38
z800 (2066)           1.3
G6 turbo (9672z17)    1.21
z900 (2064-1)         1
z900 turbo (2064-2)   0.82
z890 (2086)           0.65
z990 (2084)           0.53
z9 (2094)             0.37
z10 (2097)            0.25 *

*Note: the z10 ratio varies significantly depending on workload characteristics. Please refer to the LSPR website for the latest information.
DB2 and z10

Wide variation in improvement
– 1.2x to 2x from z9 to z10 observed in the Lab
  • Higher improvement for CPU-intensive workloads
  • Lower improvement for memory-intensive workloads

HiperDispatch
– HIPERDISPATCH=YES in IEAOPTxx
– White paper for planning and maintenance upgrade
  • http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101229
– Larger improvement with a larger number of processors: 0-10%
– DB2 parallel partition REORG
  • 13% CPU reduction observed by enabling HiperDispatch

zHPF (High Performance FICON)
– New channel program; higher channel throughput with lower response time
– DB2 sync I/O time improvement
DB2 and Sync I/O Response Time

DB2 synch I/O wait time (microseconds, DB2 for z/OS on z10):

zHPF cache hit      222
Cache hit           277
SSD + zHPF          739
SSD                 838
Short seek         3860
Long seek          8000
I/O Wait Reduction

Larger VSAM CI size for larger data inserts (V8)
Index read I/Os
– Use a large buffer pool to cache the index if possible (V8)
– Dynamic prefetch on index scan (9 CM)
Larger index page sizes (9 NFM)
Bigger preformat quantity and trigger-ahead (9 CM)
– If the allocation is >16 CYL, 16 CYL per preformat
– Less likely to see the preformat wait (X'09' lock wait)
Doubled prefetch and deferred write quantities (9 CM)
Active log striping, archive log striping (9 NFM)
DB2 Sequential Prefetch

[Chart: DB2 table scan throughput (MB/sec, 80-220) versus DB2 page size (4K, 8K, 16K, 32K) for DB2 V8 and V9, reading from disk and from cache.]

z9, FICON Express 4, DS8300 Turbo, z/OS 1.8, Extended Format data sets, buffer pool > 200MB
Summary of What We Have Discussed

General DB2 9 performance expectations
– SQL procedures
– Insert
– Sort and workfile
– Select
– Compression
– Matured V8 functions
Dataset open/close performance
Utility/XML performance
Thank you and any questions?
Rafael Garcia
[email protected]
(954) 647-7961
© Copyright IBM Corporation [current year]. All rights reserved.U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY. WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS” WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. IN ADDITION, THIS INFORMATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE. IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION. NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, NOR SHALL HAVE THE EFFECT OF, CREATING ANY WARRANTIES OR REPRESENTATIONS FROM IBM (OR ITS SUPPLIERS OR LICENSORS), OR ALTERING THE TERMS AND CONDITIONS OF ANY AGREEMENT OR LICENSE GOVERNING THE USE OF IBM PRODUCTS AND/OR SOFTWARE.
IBM, the IBM logo, ibm.com, DB2 are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml
Disclaimer