I Know What You Did This Last Summer...
Martin Packer, IBM
Email: [email protected]
Blog: https://www.ibm.com/developerworks/mydeveloperworks/blogs/MartinPacker

I Know What You Did THIS Summer

Update to I Know What You Did Last Summer with some new material and some detail moved to backup slides at the end.

Page 2: I Know What You Did THIS Summer

... but do YOU?

Page 3: I Know What You Did THIS Summer

What’s The Point Of This Presentation?

So, it’s got a “tongue-in-cheek” title but what’s it all about?

I think one of the least well appreciated aspects of z/OS and its middleware is the richness of instrumentation it gives you: Here I describe it and just some of the ways you can get value from SMF.

While I'm aware MY concerns might not match YOUR concerns EXACTLY there's much common ground.

I'd like to make you smarter - or appear to be. :-)

Page 4: I Know What You Did THIS Summer

Agenda

•Who wants to know?

•Let’s review what we have

•What more do we need?

Page 5: I Know What You Did THIS Summer

Performance And Capacity

● I'm a little loth to talk about Performance and Capacity

● As we KNOW we can very successfully use instrumentation for this

● But we are blessed with very good Performance and Capacity information

● And there's an abundance of “folklore” on how to use it

● Even if we have to “stay on our toes”

● We'll scarcely touch on this in the rest of the presentation

Page 6: I Know What You Did THIS Summer

The Value Of Recorded Instrumentation

• You really CAN know what happened last summer

• Depending on what instrumentation you kept

• Depending on how you look at the data*

• We can get from the anecdotal to some hard facts

* I use the terms “instrumentation”, “data”, “evidence” and “statistics” interchangeably

Page 7: I Know What You Did THIS Summer

Architecture and Inventory

● "Architecture" means many different things *:

● I'm interested in how infrastructure fits together.

● I'm not happy to just have a "bucket of parts".

● Or an inventory that's just a list.

● But we can get a well-structured inventory out of what we have to hand.

● I was taught to use “top down problem decomposition”

● A very good idea but...

● There's a danger of losing sight of what this thing is actually FOR

* See a later slide for more on this

Page 8: I Know What You Did THIS Summer

Patterns and Changes

● A static view often isn't enough

● Particularly as not just workloads but configurations are getting increasingly dynamic

● I've the scars to prove it

● It’s important to know how your systems “usually” behave

● The classic “double hump”

● There may be no “usually”

● This lack of “envelope” is in itself important

● “Our rolling 4 hour average peaks between 2AM and 4AM”

● This would probably affect software billing

● Knowing what’s normal allows you to understand changes

● “This isn’t normal”

● “This is slowly getting worse”
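The rolling-average idea on this slide can be sketched in a few lines. This is a minimal illustration, not a billing calculation: it assumes you have already reduced your data to one MSU figure per hour (the real interval length, and where the numbers come from, depend on your installation), and the function name and sample values are invented for the example.

```python
from collections import deque


def rolling_average_peak(samples, window=4):
    """Return (peak_average, end_index) for the rolling mean over `window` samples.

    `samples` is a sequence of equally spaced readings (assumed here to be
    hourly MSU figures). `end_index` is the index of the last sample in the
    window that produced the highest average.
    """
    if window <= 0 or len(samples) < window:
        raise ValueError("need at least `window` samples")
    recent = deque()
    total = 0.0
    best_avg, best_end = float("-inf"), -1
    for i, value in enumerate(samples):
        recent.append(value)
        total += value
        if len(recent) > window:
            total -= recent.popleft()  # drop the sample that left the window
        if len(recent) == window:
            avg = total / window
            if avg > best_avg:
                best_avg, best_end = avg, i
    return best_avg, best_end


if __name__ == "__main__":
    # Hypothetical hourly MSU readings, index 0 = the hour starting at midnight.
    hourly_msu = [120, 180, 260, 310, 290, 150, 90, 80]
    peak, end = rolling_average_peak(hourly_msu, window=4)
    print(f"peak rolling average {peak} in the window ending at hour index {end}")
```

Running the same function day after day is one way to establish whether there is a “usual” peak window at all.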

Page 9: I Know What You Did THIS Summer

Licensing

●What are we actually using?

● And are we using it enough to justify it?

● And who IS using it?

● And should we be using multiple versions?

●What are we licensed for?

● And what SHOULD we be licensed for?

●Note: SMF 70-1 and 89 are the basis for some IBM licensing schemes

● And used in at least one third-party Licence Management tool

Page 10: I Know What You Did THIS Summer

I’d Like To Make You (Appear) Smarter :-)

●Imagine me meeting you for the first time...

●I’d like not to have to ask stupid questions...

● ... the answers to which I should be able to find out

●I think you’d like to get the answers from data

● Rather than having to trouble HUMANS for them

● Humans might not know

● Or might give you the WRONG answer

●And I think you value being proactive

● Based on evidence

●Most of my conversations about systems begin with FACTS

● The interpretation is the fun bit

● You probably wish many of your conversations started with facts, too

Page 11: I Know What You Did THIS Summer

So, What Do We Have?

Page 12: I Know What You Did THIS Summer

But First Some Assumptions

● We're not talking about formatted reports

● I assume you can process data and aren't entirely reliant on RMF Postprocessor reports

● I'm not entirely limiting this to SMF

● I've had conversations with developers where the words “SMF-like” have cropped up

● A WLM policy is “admissible evidence”

● So's a DB2 Catalog

● Particularly the bits with history

● The point I'm making doesn't require an EXHAUSTIVE survey of the available data

● I'm NOT talking about Performance

Page 13: I Know What You Did THIS Summer

“Physical” Containers

Page 14: I Know What You Did THIS Summer

One Layer Down - LPARs

Page 15: I Know What You Did THIS Summer

Another Layer Down – WLM Constructs

Page 16: I Know What You Did THIS Summer

Address Space Etcetera

Page 17: I Know What You Did THIS Summer

Application

Page 18: I Know What You Did THIS Summer

Middleware-Specific Instrumentation

● CICS, MQ, WebSphere Application Server and DB2 are particularly prolific

● Subsystem Information

● Tells you a lot about how these are set up and behave

● Use this information in concert with RMF information

● For example DB2 Group Buffer Pool analysis

● Application Information

● Tells you what the subsystem is used for

● And how it's being driven crazy

● You'd probably recognise a SAP DB2 subsystem

● And you'd certainly recognise one with lots of CICS or Batch use

● DDF leaves even more footprints than usual

• Examples: IP Address, Client Application Program

● Domain knowledge is key

Page 19: I Know What You Did THIS Summer

Data Set Instrumentation

● Almost unlimited food for thought (and nosiness)

● Dynamic

● SMF 42-6

● Point in time:

● DB2 Catalog

● DCOLLECT

● SMF 14, 15, 16, 62, 64

● Other records have hints

● Example SMF 30 DD-level information

● For the insatiably curious try User F61 GTF

Page 20: I Know What You Did THIS Summer

Data Set Instrumentation - Examples

● DB2 data set names give database and space name

● Also partitioning clues

• And “hot” partitions

● DB2 Catalog reinforces this

• Catalog vs DCOLLECT is an interesting comparison

● Note: SMF 42-6 doesn't help you understand WHICH DB2 users

● Mnemonic data set and DD names in SMF 14 and 15

● For example “STEPLIB” or “~.RUNLIB”

● Batch understanding is greatly aided by data set information

● I've discussed this at length elsewhere

● CICS VSAM LSR pool use and data sets
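As an illustration of how much a DB2 data set name gives away, here is a small sketch that pulls the database and space name out of it and then totals I/O by partition to spot “hot” ones. It assumes the conventional name layout `hlq.DSNDBC|DSNDBD.database.space.I0001|J0001.Annn`, and it decodes only the `Annn` form of the last qualifier as a partition or piece number; anything else is reported as `None`. The function names, and the (name, I/O count) input rows, are hypothetical: you would feed it whatever your own processing of, say, SMF 42-6 produces.

```python
import re
from collections import Counter

# Assumed layout: hlq.DSNDBC|DSNDBD.database.space.I0001|J0001.Annn
_DB2_DSN = re.compile(
    r"^(?P<hlq>[^.]+)\.DSNDB(?P<kind>[CD])\.(?P<database>[^.]+)\."
    r"(?P<space>[^.]+)\.(?P<instance>[IJ]\d{4})\.(?P<piece>[A-E]\d{3})$"
)


def parse_db2_data_set_name(dsn):
    """Split a DB2 page set data set name into its parts, or return None."""
    match = _DB2_DSN.match(dsn.strip().upper())
    if match is None:
        return None
    parts = match.groupdict()
    piece = parts["piece"]
    # Only the Annn form is decoded; other prefixes are left undecoded (None).
    parts["partition"] = int(piece[1:]) if piece[0] == "A" else None
    return parts


def io_by_partition(rows):
    """Total I/O counts per (database, space, partition).

    `rows` is an iterable of (data_set_name, io_count) pairs; names that do
    not look like DB2 page sets are ignored.
    """
    totals = Counter()
    for dsn, io_count in rows:
        parts = parse_db2_data_set_name(dsn)
        if parts is not None:
            key = (parts["database"], parts["space"], parts["partition"])
            totals[key] += io_count
    return totals


if __name__ == "__main__":
    # Invented sample rows, purely for illustration.
    sample = [
        ("DB2P.DSNDBD.PAYROLL.TSEMP.I0001.A001", 100),
        ("DB2P.DSNDBD.PAYROLL.TSEMP.I0001.A002", 900),
        ("SYS1.LINKLIB", 5),
    ]
    for key, count in io_by_partition(sample).most_common():
        print(key, count)
```

Sorting the totals in descending order, as `most_common()` does, puts the busiest partitions at the top.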

Page 21: I Know What You Did THIS Summer

SMF 30 Usage Information

● Useful for licensing discussions but so much more besides:

● Are all my CICS regions up at 4.1.0?

● Which batch jobs access this DB2 Subsystem?

● (Subsystem name only given for DB2 Version 9 and above)

● Is this CICS region a TOR / AOR, FOR, DOR or what?

● How much is the TCB / SRB time when using DB2?

● You can answer these questions just with SMF 30

● Note: Multiple sections with the same “key” e.g. CICS / MQ

● Need to sum TCB and SRB times

● Speculate that fine structure is of interest
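The note about multiple sections with the same “key” amounts to a group-and-sum. A minimal sketch, assuming the usage sections have already been extracted from the Type 30 record into dictionaries; the field names used here (`product_owner`, `product_name`, `tcb_seconds`, `srb_seconds`) are made up for the example and are not the SMF field names.

```python
from collections import defaultdict


def sum_usage_sections(sections):
    """Sum TCB and SRB time over usage sections that share the same key.

    `sections` is an iterable of dicts with the hypothetical fields
    product_owner, product_name, tcb_seconds and srb_seconds.
    """
    totals = defaultdict(lambda: {"tcb_seconds": 0.0, "srb_seconds": 0.0})
    for section in sections:
        key = (section["product_owner"], section["product_name"])
        totals[key]["tcb_seconds"] += section["tcb_seconds"]
        totals[key]["srb_seconds"] += section["srb_seconds"]
    return dict(totals)


if __name__ == "__main__":
    # Invented values: one address space with two CICS sections and one MQ section.
    sections = [
        {"product_owner": "IBM", "product_name": "CICS", "tcb_seconds": 1.5, "srb_seconds": 0.25},
        {"product_owner": "IBM", "product_name": "CICS", "tcb_seconds": 2.5, "srb_seconds": 0.75},
        {"product_owner": "IBM", "product_name": "MQ", "tcb_seconds": 0.5, "srb_seconds": 0.0},
    ]
    for key, times in sum_usage_sections(sections).items():
        print(key, times)
```

If the fine structure does turn out to be interesting, keep the individual sections as well as the totals.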

Page 22: I Know What You Did THIS Summer

OA39629: NEW FUNCTION TO REPORT THE HIGHEST PERCENT OF CPU TIME USED BY A SINGLE TASK IN AN ADDRESS SPACE

● z/OS Release 12 and 13

● Provides largest TCB's % of an engine

● Largest TCB's program name

● Purpose:

● To help understand which address spaces have single-TCB speed sensitivities

● Speculation:

● Might show QR TCB for CICS without 110's being needed

● Interesting to compare to Product Usage TCB / SRB in Type 30

Page 23: I Know What You Did THIS Summer

CFLEVEL 18 CIB / CFP Path Instrumentation Improvements

● Traditionally we've had path types to a CF

● And path types for CF-to-CF links

● New instrumentation adds much more:

● Adapter type, Path type, CHPID, PCHID, Adapter ID, Port number

● Helps build better topology picture

● Latency

● Used in report to estimate distance @ 10 microseconds per km

● Path degraded flag

● Note: No traffic information
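The distance estimate is simple arithmetic on the reported latency. A small sketch using the 10 microseconds per km figure from the slide; the function name is invented, and the result is only as good as that rule of thumb.

```python
MICROSECONDS_PER_KM = 10.0  # rule-of-thumb figure quoted on the slide


def estimate_distance_km(latency_microseconds, microseconds_per_km=MICROSECONDS_PER_KM):
    """Estimate path distance in km from a reported latency in microseconds."""
    if latency_microseconds < 0:
        raise ValueError("latency cannot be negative")
    return latency_microseconds / microseconds_per_km


if __name__ == "__main__":
    # Hypothetical latency value for one path.
    print(estimate_distance_km(150))
```

Treat the answer as a sanity check on the topology picture rather than a measurement.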

Page 24: I Know What You Did THIS Summer

Architects Will Recognise This As Incomplete

Page 25: I Know What You Did THIS Summer

This Is Not A Complete Architecture As Architects Would Recognise It

● This only documents componentry inside the mainframe

● The names are not necessarily names Applications people or architects would recognise

● For example a machine serial number is probably NOT what an architect would use to name a machine

● If they even WANTED to name a machine

● There's little commentary

● Interfaces are sparse

● An attempt to portray our understanding as architectural would appear like Officer Crabtree's * attempts to speak French

* http://en.wikipedia.org/wiki/Officer_Crabtree

Page 26: I Know What You Did THIS Summer

An Observation On Batch Architecture

● Most installations have little understanding of their batch “architecture”

● Quotes because that term may be too kind :-)

● Numerous recent customer conversations convince me of this

● Knowledge is being lost from organisations

● This understanding is important

● Needed to make tuning, scaling and streamlining effective and safe

● Allows stuff to be reliably run

● Enables training the next generation

● Further observation: “Any technology distinguishable from magic is insufficiently advanced” applies here

Page 27: I Know What You Did THIS Summer

So What Do We Still Need?

Page 28: I Know What You Did THIS Summer

Some Parting Thoughts

Page 29: I Know What You Did THIS Summer

Some Parting Thoughts

● Experiment with data depiction techniques

● Example: Plot “with load” rather than time of day

● Example: Use time as the third dimension

● Maybe someone knows how to make animated GIFs or movies from static graphics

● Think of creative ways to use instrumentation

● Look to other sources of instrumentation than the obvious

● Beware the subtleties of e.g. field meanings

● Which, I guess, means staying “plugged in to the folklore”

Page 30: I Know What You Did THIS Summer

Backup Slides

Page 31: I Know What You Did THIS Summer

In An Ideal World You'd Like Instrumentation To Be ...

● Timestamped

● Readily parseable

● Of known provenance

● Light weight

● Understood by the community

● Available at various levels of detail

● Fit for purpose

● Persisted

● Have a manageable lifecycle

● Immediately produced

● Standards-based

[ I'd say ALL instrumentation falls short of at least one of these ideals]

Page 32: I Know What You Did THIS Summer

Audit

● Follows on from Change and Inventory

● Do we have what we think we have?

● If not why not?

● Who made that change and when?

● And how did it affect things?

● Maybe “why?” isn’t answerable from the data

● Some changes are “heralded”:

● WLM Policy activation specifically recorded in SMF

● Some aren't:

● “We seem to have more online disks in this interval than the previous one”
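Spotting an “unheralded” change like the online-disk one is a matter of comparing what was seen in consecutive intervals. A minimal sketch, assuming you have already built a list of identifiers (volume serials, say) per interval; the function name and sample values are invented for illustration.

```python
def diff_inventory(previous, current):
    """Report which items appeared and which disappeared between two intervals."""
    prev, curr = set(previous), set(current)
    return {
        "added": sorted(curr - prev),
        "removed": sorted(prev - curr),
    }


if __name__ == "__main__":
    # Hypothetical volume serials seen online in two consecutive intervals.
    interval_1 = ["VOL001", "VOL002"]
    interval_2 = ["VOL002", "VOL003", "VOL004"]
    print(diff_inventory(interval_1, interval_2))
```

This tells you what changed and roughly when; as the slide says, “why?” may not be answerable from the data.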

Page 33: I Know What You Did THIS Summer

We Know (Almost) Everything You'd Ever Want To Know

● For processors:

● Serial number and Plant

● “What's in a name?”

● Device Type and Model

● Actually hardware and software models

● Specialty engine counts

● For Coupling Facilities

● Similar

● For Disk, Tape and Switches

● Enormous amounts of information

Page 34: I Know What You Did THIS Summer

The “Almost” We Had Before Is Almost Gone

● CPU

● Memory

● Channels

● Disk and Tape

● Some connectivity information still missing

● Parallel Sysplex Infrastructure

● Connectivity

● Performance

● Traffic

● (Some LPARs are ICF LPARs – just to mess up my graphic)

Page 35: I Know What You Did THIS Summer

WLM Constructs

● RMF tells us how the following behave:

● Workloads

● Service Classes

● Service Class Periods

● Report Classes

● We get SOME information on what these represent:

● Description strings

● No classification information

● “Served” service classes may be a bit of a clue

● Policy changes are readily discernible

● Including who did it

● (Usually I see mnemonic policy descriptions)

Page 36: I Know What You Did THIS Summer

Parallel Sysplex Infrastructure

● Enormous amount of information on Coupling Facility structures

● XCF groups likewise

● I got job name added to RMF, because member name is often useless

● RMF doesn't know what structures or XCF groups are used for

● So we have to “guess”

● But it's been a LONG time since I guessed a CF structure or XCF group's use wrong

Page 37: I Know What You Did THIS Summer

Address Space

● Key non-performance information in Type 30:

● Program name

● WLM Service Class and Report Class

● But not for “served” work

● Can relate Report Class and Service Class

● And usually figure out what these are REALLY for

● Can detect e.g. CICS regions, DB2 subsystem and MQ subsystem address spaces

● Can dispel myths like “we don't use Unix System Services”

● Accounting Information and Programmer Name can be interesting

Page 38: I Know What You Did THIS Summer

... but I have to admit I don’t know what you’re doing right now...

Page 39: I Know What You Did THIS Summer

Online Monitoring Still Has A Role

● It's unimpressive to respond to an incident with “No but I can tell you what happened last week”

● Automation probably requires it

● Some things simply aren't available in an externally-recorded form

● People seem to quite like it

Page 40: I Know What You Did THIS Summer

I'm Told I Don't Do Enough Graphics … So Here Are Some (Almost) Gratuitous Ones :-)

Source: http://www.edwardtufte.com/tufte/posters

Page 41: I Know What You Did THIS Summer

Provenance Is Important

Source: http://www.the-world-heritage-sites.com/messel-pit-fossil-site_germany.htm

Page 42: I Know What You Did THIS Summer

The Messel Pit Fossil Site is a disused quarry in the village of Messel, Darmstadt-Dieburg, Hesse, about 35 km southeast of Frankfurt-am-Main, Germany. The quarry used to be a mine since 1859, when brown coal and later bituminous shale were mined. By the 1900's, it became well known for a different reason, when it began to yield fossils. Nevertheless, mining continued until as late as 1971, when the shale mine finally closed, and a cement factory built in the quarry also failed.

After the quarry became disused, there was a plan to turn it into a garbage dump. Fossil enthusiasts were allowed to dig in the quarry. These amateurs developed a technique to preserve the fine details on small fossils. In time, the Messel Pit became known as the richest site for fossils from the Eocene period, which was between 57 million and 36 million years ago.

Today scientists have uncovered exceptionally well-preserved fossils of mammals, including fully articulated skeletons to the contents of the stomach of animals from that period.

In 1995, Messel Pit Fossil Site became the first site to be inscribed as a World Heritage Site solely due to fossils. It took place at the 19th session of the World Heritage Committee held in Berlin, Germany, on 4-9 December, 1995.

Messel Pit Fossil Site, Germany

Source: http://www.the-world-heritage-sites.com/messel-pit-fossil-site_germany.htm