Stop Feeding IBM i Performance Hogs - Robot

All trademarks and registered trademarks are the property of their respective owners. © HelpSystems LLC. All rights reserved.

How Robot Monitor can help you pen up runaway processes before they cause problems

Stop Feeding IBM i Performance Hogs

YOUR HOSTS

Chuck Losinski, Director of Automation Technology, HelpSystems

Kurt Thomas, System Engineer, HelpSystems

Broadcasting live from Eden Prairie, Minnesota USA and Düsseldorf, Germany


AGENDA

• Resource-hogging scenarios and jobs

• Qualify impact on CPU, memory, and storage

• Identify and address resource hogs automatically


MOST COMMON HOGS

What types are most common?

• CPU hogs

• Main storage hogs

• Temporary storage hogs

How can you locate them and take action?

• A reactive solution: WRKACTJOB, WRKSYSACT, iNAV

• A proactive solution: Robot Monitor

These impact job run time


CHALLENGES

Why are these issues so problematic to an IBM i environment?

Resource hogs:

• Rapid resource consumption

• Need to be found quickly

Resources:

• Limited

• Expensive

Problems:

• User impact

• Demand for more resources

• System failure

UP NEXT...

CPU Hogs


CPU HOGS

What is the impact on your enterprise?

System symptoms:

• CPU is being rapidly consumed

Impact:

• Until the job(s) causing this are identified and the issue resolved, expect:

o Additional resource expenditure

o Impairment of user productivity

o Impact on many systems: POS, web, DB, GUI, credit cards

The CPU hog cycle:

• Rapid CPU consumption

• Operators in a race against the clock to identify CPU hogs

• System productivity is maintained only by feeding extra resources

• All of which consumes more resources

• Underlying issue remains until the team runs out of resources or resolves it!


CPU HOGS

QZDASOINIT and QSQSRVR Jobs

What are they?

• DB2 database server jobs

What do they do?

• Let applications read and modify data in DB2 on IBM i

• QZDASOINIT jobs serve ODBC and JDBC connections

• QSQSRVR jobs implement SQL Server Mode (IBM-flavor ODBC)

Why so problematic?

• The issue with QZDASOINIT and QSQSRVR jobs is not the jobs themselves; it is the poorly written SQL code running in them that causes the issues.


CPU HOGS

How to find the maximum number of QZDASOINIT jobs in QUSRWRK

Step 1: Run DSPSBSD SBSD(QUSRWRK)



Step 2: Select Option 10 (Display Prestart Job Entries)



Step 3: Select Option 5 (Display Details on QZDASOINIT)

Result: The maximum number of jobs (MAXJOBS) and the maximum number of uses (MAXUSE) for the prestart job entry


CPU HOGS

How to display Active Prestart Jobs

Step 1: Run DSPACTPJ SBS(QUSRWRK) PGM(QZDASOINIT)

Result: Shows the current number of prestart jobs and how many of them are in use
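The configured maximum from the prestart job entry and the live counts from DSPACTPJ can be combined into a quick headroom check. The following is a minimal Python sketch of that arithmetic only: the function name and the sample numbers are hypothetical, the values are assumed to be copied by hand from the two displays, and it assumes a numeric maximum rather than *NOMAX.

```python
def prestart_headroom(max_jobs, current_jobs, in_use_jobs):
    """Summarize how close a prestart job entry is to its configured limit.

    All three inputs are hypothetical values copied by hand from the
    DSPSBSD and DSPACTPJ displays; nothing here calls IBM i.
    """
    if max_jobs <= 0 or current_jobs < in_use_jobs:
        raise ValueError("inconsistent prestart job counts")
    return {
        "idle_jobs": current_jobs - in_use_jobs,      # started but waiting for work
        "jobs_until_limit": max_jobs - current_jobs,  # room before the maximum is hit
        "pct_of_limit_in_use": round(100 * in_use_jobs / max_jobs, 1),
    }


# Illustrative numbers only.
summary = prestart_headroom(max_jobs=200, current_jobs=120, in_use_jobs=90)
print(summary)
```

A small "jobs_until_limit" alongside a high in-use percentage suggests the subsystem is close to refusing new connections.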


CPU HOGS

How do we find the offending job FAST?

Common example:

• QZDASOINIT jobs

Challenges to identification:

• Jobs all share the same name

• Potentially hundreds of jobs could be contributing to the issue

• The longer this process takes, the more CPU is consumed and the greater the risks and impact on resources


CPU HOGS

Reactive approach

Step 1: WRKACTJOB followed by manual batch investigation and resolution at the job level on each LPAR you manage


CPU HOGS

Troubleshooting

Big picture questions:

• Who is running these jobs?

• What proportion of overall CPU are these jobs consuming?

Conclusion:

• Greater insight will provide deeper understanding and context to any issues, which will result in faster problem resolution.
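The two big picture questions can be made concrete with a small calculation. This is a minimal Python sketch, assuming a snapshot of active jobs has already been captured somehow (for example, transcribed from a WRKACTJOB listing); the job numbers, user names, and CPU figures are hypothetical, and the helper names are introduced here for illustration.

```python
from collections import defaultdict

# Hypothetical snapshot: (job name, job number, current user, CPU %).
JOBS = [
    ("QZDASOINIT", "123456", "APPUSER", 40.0),
    ("QZDASOINIT", "123457", "REPORTS", 4.0),
    ("QZDASOINIT", "123458", "APPUSER", 12.0),
    ("NIGHTLY", "123460", "BATCH", 8.0),
]


def cpu_by_user(jobs, job_name):
    """Who is running these jobs? Total CPU % per current user."""
    totals = defaultdict(float)
    for name, _number, user, cpu_pct in jobs:
        if name == job_name:
            totals[user] += cpu_pct
    # Highest consumer first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def share_of_total(jobs, job_name):
    """What proportion of the listed CPU use belongs to the named jobs?"""
    total = sum(job[3] for job in jobs)
    named = sum(job[3] for job in jobs if job[0] == job_name)
    return named / total if total else 0.0


ranking = cpu_by_user(JOBS, "QZDASOINIT")
share = share_of_total(JOBS, "QZDASOINIT")
print(ranking)
print(f"{share:.1%} of listed CPU use")
```

Grouping by current user matters because every job carries the same name, so the job name alone cannot separate the offender from the well-behaved connections.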


CPU HOGS

Proactive approach

• Step 1: Real-time visibility

• Step 2: Immediate access to jobs for resolution and threshold levels for early detection

Data definitions: Define by any/all systems + Define by job name

Thresholds: Custom thresholds and proactive alerts

Monitored groups: Granular monitoring can extend to subsystem, accounting code, user, current user, job, and function
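The idea of custom thresholds per monitored group can be illustrated generically. The Python sketch below is not Robot Monitor's implementation or API; it only shows the general pattern of comparing a current reading for each group against that group's own limit. The subsystem names used as groups, the limits, and the fallback value are all hypothetical.

```python
# Hypothetical thresholds: CPU % per monitored group (here, subsystem name).
THRESHOLDS = {"QUSRWRK": 60.0, "QBATCH": 80.0}
DEFAULT_THRESHOLD = 90.0  # assumed fallback for groups without their own limit


def breaches(samples, thresholds=THRESHOLDS, default=DEFAULT_THRESHOLD):
    """Return (group, value, limit) for every reading at or above its limit."""
    alerts = []
    for group, value in samples.items():
        limit = thresholds.get(group, default)
        if value >= limit:
            alerts.append((group, value, limit))
    return alerts


# Illustrative readings only.
current = {"QUSRWRK": 72.5, "QBATCH": 35.0, "QINTER": 10.0}
for group, value, limit in breaches(current):
    print(f"ALERT {group}: CPU {value}% is at or above {limit}%")
```

Keeping a separate limit per group lets a busy batch subsystem run hot without raising an alert while a smaller rise in a user-facing subsystem is still caught early.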


LIVE DEMO

• Identify and manage CPU hogs

UP NEXT...

Main Storage Hogs


MAIN STORAGE HOGS (RAM)

Page faults

System symptoms:

• The number of page faults rapidly increases, a condition known as “thrashing”

Impact:

• Until the job(s) causing this are identified and the issue resolved, expect:

o Number of page faults to rapidly rise

o Jobs to take longer and longer to execute

o Impairment of user productivity

Thrashing cycle:

• Data loaded/re-loaded into memory

• Batch job begins to process

• Interactive job flushes memory records

• Batch job can’t access the records (faulting occurs) and...


MAIN STORAGE HOGS

Reactive approach

Step 1: Access the Work with System Status screen to show the number of page faults in each memory pool


MAIN STORAGE HOGS

Troubleshooting

Big picture questions:

• Which jobs are responsible for causing problems in these memory pools?

• Which subsystem(s) are using these memory pools?

Conclusion:

• Greater insight will give more meaningful understanding and context to any issues for faster problem resolution.


MAIN STORAGE HOGS

Proactive approach

Step 1: Real-time visibility of a dedicated NDB bar shows us the overall system faults/second.

Step 2: Immediate access to offending jobs for resolution. Threshold levels for early detection.
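A threshold on faults/second is most useful when it ignores a single brief spike. The following minimal Python sketch shows one common way to do that, by requiring several consecutive readings above the limit. It is a generic illustration rather than how Robot Monitor evaluates thresholds; the function name, the limit of 200, and the readings are hypothetical.

```python
def sustained_breach(rates, limit, consecutive=3):
    """True once `consecutive` readings in a row exceed `limit`.

    `rates` is a sequence of faults/second readings, oldest first.
    Requiring a run of readings avoids alerting on one short spike.
    """
    run = 0
    for rate in rates:
        run = run + 1 if rate > limit else 0
        if run >= consecutive:
            return True
    return False


# Hypothetical faults/second readings taken at a fixed interval.
spike = [50.0, 300.0, 100.0, 300.0]
thrash = [50.0, 300.0, 400.0, 450.0]

print(sustained_breach(spike, limit=200))
print(sustained_breach(thrash, limit=200))
```

The first series never stays above the limit for three readings in a row, so only the second one is treated as a sustained problem.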


LIVE DEMO

• Identify and manage main storage hogs (thrashing)

UP NEXT...

Disk Hogs


DISK HOGS

Temporary storage

What is temporary storage?

• Storage allocated by the OS in the system ASP (disk). Its contents will be cleaned up when an IPL is performed.

Temporary storage? YES!

• Programs loaded into the activation group and the associated variables, heap space (Java as well as other HLL APIs, e.g., malloc, calloc), open data paths, etc.

Temporary storage? NO!

• Objects in QTEMP libraries


DISK HOGS

Temporary storage

System symptoms:

• Overall system DASD is rapidly being consumed as a result of underlying temporary storage being used by a runaway job

Impact:

• Until the job(s) causing this are identified and the issue resolved, expect:

o Additional disk requirement

o Potential system failure if left unchecked

Diagram: Runaway job → Temporary storage → Overall DASD


DISK HOGS

Temporary storage

Common example:

• Java memory leak: a Java job allocates memory to run tasks but never returns that memory to the OS after finishing them

Challenges to identification:

• Where is storage being used?

• Which jobs are consuming temporary storage?
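A leak of this kind tends to show up as temporary storage that climbs at every reading and never drops back. The Python sketch below is a simple heuristic for spotting that pattern in a series of readings for one job. The function name, the minimum of four readings, and the sample values in MB are assumptions made for illustration, not a documented detection method.

```python
def looks_like_leak(samples_mb, min_samples=4):
    """Heuristic: storage rose at every reading and never fell.

    `samples_mb` holds one job's temporary storage readings, oldest first.
    Too few readings give no verdict, so the function returns False.
    """
    if len(samples_mb) < min_samples:
        return False
    return all(later > earlier
               for earlier, later in zip(samples_mb, samples_mb[1:]))


# Hypothetical readings for two jobs.
leaky_job = [512, 640, 790, 905, 1100]    # grows at every reading
healthy_job = [512, 640, 530, 650, 540]   # releases storage between tasks

print(looks_like_leak(leaky_job))
print(looks_like_leak(healthy_job))
```

A job that legitimately grows for a while will also match, so a result of True is a prompt to investigate rather than proof of a leak.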


DISK HOGS

Reactive approach

Step 1: Use the command WRKSYSSTS to determine two values:

• Current temporary storage consumption

• Peak temporary storage consumption


DISK HOGS

Reactive approach

Step 2: Run the WRKSYSACT command as follows: WRKSYSACT SEQ(*STGNET) to determine which jobs are using the most temporary storage

• Only shows jobs that have had activity since the last collection

• Can also include permanent storage
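Sequencing by net storage amounts to ranking jobs by storage allocated minus storage deallocated over the collection interval. The minimal Python sketch below reproduces that ranking on hand-entered data. The job names, the figures, and the use of MB as the unit are hypothetical, and the code does not read anything from WRKSYSACT.

```python
from typing import NamedTuple


class JobStorage(NamedTuple):
    job: str             # hypothetical qualified job name
    allocated_mb: int    # storage allocated during the interval
    deallocated_mb: int  # storage released during the interval

    @property
    def net_mb(self):
        """Net storage: what the job took and did not give back."""
        return self.allocated_mb - self.deallocated_mb


def top_net_storage(jobs, limit=3):
    """Jobs with the largest net storage growth, biggest first."""
    return sorted(jobs, key=lambda j: j.net_mb, reverse=True)[:limit]


# Illustrative interval data only.
interval = [
    JobStorage("123456/QUSER/QZDASOINIT", 900, 880),
    JobStorage("123470/APPUSER/JAVAJOB", 700, 50),
    JobStorage("123480/BATCH/NIGHTLY", 300, 100),
]

for entry in top_net_storage(interval, limit=2):
    print(entry.job, entry.net_mb)
```

Ranking on the net figure rather than on allocations alone keeps a busy job that releases nearly everything it allocates from being mistaken for a hog.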


DISK HOGS

Proactive approach

Step 1:

• Real-time visibility of the temporary storage being used by each subsystem

Step 2:

• Immediate access to offending jobs for resolution

• Threshold levels for early detection


LIVE DEMO

• Identify and manage temporary storage hogs


RECAP and Q&A

• Resource-hogging scenarios and jobs

– How to identify them

• Qualify impact on CPU, memory, and storage

– How this will affect users and applications

• Identify and address resource hogs automatically

– Thresholds and notification


Upcoming Performance Monitoring Webinars

• SQL-Based Monitoring

– September 7

• Real-Time Disk Monitoring

– September 28

Integrating Robot Monitor into your infrastructure


Thank you for attending!

Contact Information

Website:

www.helpsystems.com/robot

Telephone:

800-328-1000 sales

+1 952-933-0609 support

Presenters:

[email protected]

[email protected]
