Learn to apply machine-learning techniques to perform buffer pool tuning! Starting from a clearly
defined buffer pool layout, we discuss how to automate the classification of pagesets into their
corresponding pools by using cognitive technologies.
1
Buffer pool tuning has been a key tuning strategy since the early days of DB2. With additional page sizes, new page stealing algorithms and, last but not least, in-memory technology, we have more choices than ever before. While setting out a clearly defined buffer pool layout and initially assigning database objects to their corresponding pools remains the starting point of buffer pool tuning, we now use cognitive technologies to help us classify new objects into the corresponding pools based on their access patterns. And we no longer need a DBA tied up monitoring access behaviour, investigating access patterns and eventually defining the best pool for each new object.
This presentation shows an approach based on cognitive technologies that are available as open source and on R, applicable to everybody.
2
We will start with general buffer pool tuning recommendations, discussing a typical buffer pool layout and the resources needed not only to properly classify tablespaces and indexes into their corresponding buffer pools but also to correctly size the individual pools. We will quickly see that an ongoing effort is necessary to keep object access patterns in sync with the buffer pools selected for these objects.
That is where the idea of using a machine learning algorithm to classify buffer pool objects was born.
Therefore, a very high-level introduction to Machine Learning will be given, highlighting both the
concepts of machine learning and the five typical steps within each machine-learning project.
Once these concepts are discussed, we will apply the five steps of Machine Learning step-by-step for our
current problem: What is the best buffer pool for my tablespace/index under review? We will use R, a
widely used open source tool to perform the machine learning part of this approach.
Finally, further tuning steps are necessary once the machine learning part is complete. These steps follow traditional if-then-else rules and should be familiar to you; they include separating database objects by the type of application that accesses them. The idea for this part of the presentation was born when flying back from last year's conference: Why put all database objects into the same pool? Wouldn't it be convenient to separate objects by the priority of their applications? A bit like what you can experience in an airplane with its different classes, from economy to first.
The last section will compare the results achieved by this tuning method and discuss different metrics for
buffer pool performance evaluation.
3
Established in 1826, Swiss Mobiliar is the oldest private insurance company in Switzerland, providing
property, business and life insurance for more than 1.7 million customers. This leading mutual
cooperative association operates from around 80 general agencies at 160 locations. This decentralised
structure enables the business to deliver customer proximity and local competence.
To do this, Swiss Mobiliar relies on a sophisticated IT infrastructure comprising 5,000 servers, 5,000
notebooks, 1,800 iPhones® and 300 iPads®. Business applications from IBM, Microsoft®, Oracle, SAP®
and Siebel as well as insurance industry specific applications and many home-grown mission-critical
applications all rely on mainframe DB2 data, either directly or based on copies of extracted data. The range of database systems spans from hierarchical DBMS (IMS) and relational DBMS (DB2, Oracle, MS SQL Server) to NoSQL databases (Neo4j, Cassandra, and more to come).
4
5
A very useful source for buffer pool tuning information is Florence and John’s presentation (“DB2 for
z/OS In-Depth Look at Buffer Pool Tuning”) presented at the 2016 IDUG DB2 North America Tech
Conference in Austin, Texas.
Some of their conclusions about buffer pool tuning were as follows:
• Multiple buffer pools are recommended
• Use simplicity in object-to-buffer pool matching
• Never over commit available real memory
• Available skills and resources should be appropriate to selected strategy
If you understand the basic principles of buffer pool tuning, but you don't have enough time to sit in front of your monitor, study the data and decide into which category a new database object should fall, then why not consider an automated approach?
That is what this presentation is all about: Use a Machine Learning approach to classify database objects
into different pools.
6
Most buffer pool strategies apply a moderate multiple buffer pool approach, based on isolating
catalog/directory objects in BP0, using an exclusive buffer pool for workfiles, and having a couple - but
not too many - additional pools.
For reasons of simplicity, this presentation only covers 4K page sizes, but all techniques discussed also
apply to larger page sizes.
General recommendations for setting threshold values include
• VPSEQT=80, DWQT=50, VDWQT=10 for cat/dir bpool
• VPSEQT=90-95, DWQT=60,VDWQT=50,PGSTEAL=LRU for workfiles
• VPSEQT=50-80,DWQT=30-50,VDWQT=10,PGSTEAL=LRU for general pool
• VPSEQT=90,DWQT=80,VDWQT=50,PGSTEAL=FIFO for journal-type
• (VPSEQT=0,DWQT=VDWQT=90,PGSTEAL=FIFO for in-memory)
• VPSEQT=100,DWQT=VDWQT=90,PGSTEAL=NONE for in-memory
Sizing should aim at a page residency time of 3-10 minutes for cat/dir and general pools, FIFO pools should be sized sufficiently for an optimized prefetch quantity (VPSIZE*VPSEQT >= 320MB), and NONE pools should be large enough to hold all data of their objects.
More details on page steal algorithms will follow on pages 26-28.
How can I decide which objects should go into which pool?
7
This slide shows Swiss Mobiliar’s current buffer pool layout. BP0 and BP1 cache objects from the catalog/directory and workfiles, respectively. They are of no further interest for this presentation.
The question is: Which objects fall into the three categories
• Avoid re-read of same page, apply LRU as page stealing algorithm
• No re-reads, pipeline-type access, FIFO as page stealing algorithm
• Often re-referenced, avoid even first I/O, NONE as page stealing algorithm.
We will answer this classification question by applying machine learning techniques. Before we talk about these techniques, let us quickly summarize the sizing and the measurable objectives of Swiss Mobiliar’s individual buffer pools (see table on next slide).
VPSEQT – General recommendations
General recommendations regarding buffer pool attributes include using the default value of 80% (or a range from 50-80%) for VPSEQT for most buffer pools. If the buffer pool tuning objective is to protect random pages and to increase random page residency time, VPSEQT might be further reduced, where a minimum of 80K pages for VPSEQT*VPSIZE should be maintained (for a 4K page pool).
Set VPSEQT = 90-95% for the sort work file buffer pool (BP1 in our sample layout). Set VPSEQT=0 for an in-memory pool when working with the FIFO option. This avoids the unnecessary overhead of scheduling prefetch engines where data is already in the buffer pool; and as the number of available prefetch engines is limited to 300, this approach reduces the demand for prefetch engines. However, a better option for in-memory behavior is PGSTEAL=NONE with VPSEQT=100, as prefetch will be turned off at the object level after the initial load of the data.
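As a quick illustration of these sizing rules, here is a minimal R sketch (the pool name and the VPSIZE/VPSEQT values are hypothetical) that checks whether the sequential portion of a 4K pool respects the 80K-page / 320MB rule of thumb mentioned above:

check_pool <- function(pool, vpsize, vpseqt) {
  seq_pages <- vpsize * vpseqt / 100        # pages available for sequential access
  seq_mb    <- seq_pages * 4 / 1024         # the same portion expressed in MB (4K pages)
  data.frame(pool, seq_pages, seq_mb, ok = seq_pages >= 80000)
}
check_pool("BP2", vpsize = 200000, vpseqt = 50)   # 100000 pages, about 391MB -> ok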
8
The most important metric for evaluating buffer pool performance is the page residency time. The calculation is quite straightforward; all input values are available from a simple -DIS BPOOL(BPn) DETAIL command:
General page residency time := VPSIZE / (no of sync and prefetch pages read per sec).
The general page residency time delivers an overall value independent of random or sequential access. It is a lower bound for random page residency time and an upper bound for the sequential page residency time.
More detailed insight is produced by the following estimations:
Random page residency time := MAX(general page residency time,
VPSIZE * (1-VPSEQT/100) / (sync pages read per sec))
Sequential page residency time := MIN(general page residency time,
VPSIZE * (VPSEQT/100) / (prefetch pages read per sec))
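A minimal R sketch of these three estimates (the counter values are hypothetical; the read counters from -DIS BPOOL must first be converted to pages per second):

residency <- function(vpsize, vpseqt, sync_pages_sec, prefetch_pages_sec) {
  general    <- vpsize / (sync_pages_sec + prefetch_pages_sec)
  random     <- max(general, vpsize * (1 - vpseqt / 100) / sync_pages_sec)
  sequential <- min(general, vpsize * (vpseqt / 100) / prefetch_pages_sec)
  c(general_sec = general, random_sec = random, sequential_sec = sequential)
}
residency(vpsize = 150000, vpseqt = 80, sync_pages_sec = 250, prefetch_pages_sec = 400)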
Sizing for a PGSTEAL=NONE defined bufferpool is based on the result of the following query (shown here for BP7):
select sum(nactive) from
 ( select sum(nactive) as nactive
     from sysibm.sysindexspacestats st, sysibm.sysindexes i
    where st.creator = i.creator and st.name = i.name and i.bpool = 'BP7'
   union
   select sum(st.nactive) as nactive
     from sysibm.systablespacestats st, sysibm.systablespace ts
    where st.dbname = ts.dbname and st.name = ts.name and ts.bpool = 'BP7'
 ) total
9
Before I quickly introduce machine learning concepts and the steps to follow when implementing a machine-learning-based solution, let us quickly review the steps that will be applied for buffer pool tuning:
1. A buffer pool layout must be designed.
2. Cat/Dir objects, workfile databases and other specific objects need to be placed individually.
3. Each remaining database object must be classified into one of three categories of buffer pools: LRU-
organized, FIFO-organized, or In-Memory (PGSTEAL=NONE) organized pools. This classification
problem is a typical machine learning problem.
4. Once classified into these three categories, additional rules apply to further distinguish between
“economy” type of objects and “first class” objects.
5. Sizing must be done for each buffer pool.
6. Performance evaluation of the buffer pool key performance indicators.
The next section covers a high-level introduction of Machine Learning (ML) principles, before we apply
these ML techniques step-by-step by using R to our buffer pools (section ML for buffer pool tuning).
If you’re already familiar with ML concepts, you can skip the next section and continue directly with ML for buffer pool tuning on page 26.
10
This section covers a very high-level introduction to Machine Learning. We will discuss the 5 steps
which every ML project has to pass. The same 5 steps will be applied to our buffer pool tuning problem
in the next section following this introduction.
The 5 steps include
1. Understanding the data
• which data do we have, where do we get it from, how current is the data?
• What are the objectives/insights that interest us?
2. Preparing the data
• Data integration
• Scaling the data
3. Building a Model
• Selecting the appropriate algorithm and the relevant features
4. Evaluating the model
• Separate training data from test data
• Training the model and testing it
5. Monitoring the model
• Validating the model by applying it to the test data created in step 4
• Accuracy over time
11
Traditional programming is about automation. If we take the e-mail spam detection problem, a traditional software engineer would come up with a couple of rules which act as a filter for each incoming mail. All these rules are coded explicitly, and the results are verified with some test data. Usually, the rules are corrected and refined until the verification shows a 100% success rate for the test cases.
Machine Learning takes automation one step further: We start with data and with the results (in our
particular case with the e-mail sender, subject, and content) and the label “spam” or “no spam” for each
mail. An algorithm takes this data and calculates a set of rules, called a model.
Each new incoming mail will then be checked by these rules and will be classified as either “spam” or
“no spam”. Testing of the rule set happens by a set of additional e-mails where the category (spam or
not) is known in advance and will be compared with the prediction of the model. Any additional mail, correctly classified or manually re-classified, adds to the initial data, allowing the set of rules to be recalculated automatically.
In traditional programming, new patterns of spam require the software engineer to write additional rules. The machine-learning-created rules adapt automatically to new patterns. In a nutshell: machine learning is about automation of automation.
12
Let us imagine a fruit basket containing apples and oranges, and the question “Are there different
fruits?”.
For human beings, even for conference attendees eating sandwiches and pizza only, the answer is easy:
Experience tells us that different types of fruits exist, and how they can be separated. And that it is not
advisable to add them together. These experiences allow us to compare (and to separate) apples and
oranges.
As machines prefer bits and bytes compared to apples and oranges, we have to describe these fruits in a
machine-friendly way.
13
A red apple is different from an orange-coloured orange. Therefore, color could be a feature which helps
to distinguish between apples and oranges. But even a green orange is easily identifiable as an orange,
for example by its bumpy surface texture.
Other features also exist, such as price, taste or the amount of juice which can be gained out of a single
fruit. But not all features or attributes are useful: Once at home, do you really remember the price? And
in order to measure the amount of juice, the fruits need additional treatment. Maybe not an ideal feature just to compare apples and oranges.
So let us focus on just two features: color and texture.
We can now build a two-dimensional “feature space” based on these two attributes.
14
Every fruit will be described using a feature vector. Fruit no 7 for example is (slightly bumpy, orange).
To enhance our machine-friendliness, we introduce a numerical scale on both axes: from 1 (green) to 10 (red) on the vertical axis, from 1 (very smooth) to 10 (very bumpy) on the horizontal axis.
15
Fruit number 7 now gets (6, 8) on the new scale.
Being used to tables, we can describe our fruit basket in tabular form:
Fruit No Texture Color
1 2 1
2 2 10
… … …
7 6 8
Now we are able to start generating knowledge out of our basket: because the machine is not aware whether this table represents apples and oranges, fraudulent transactions, or the speed of cars, we can apply standard algorithms to separate our fruits into different categories. A very typical such algorithm is k-means: its objective is to create clusters of similar fruits, and the metric used to quantify the difference between two fruits is the geometric distance between them. This needs to be calculated for each pair of fruits. Therefore, with a huge fruit basket and with many dimensions (features), this task will consume considerable resources.
Once k-means has calculated all those distances, it builds groups in such a way that the squared distances within the groups are minimal.
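As a minimal R sketch (the texture/color values below are hypothetical, on the 1..10 scales introduced above), the k-means step looks like this:

fruits <- data.frame(
  texture = c(2, 2, 3, 8, 7, 9, 6, 2),
  color   = c(1, 10, 4, 2, 9, 8, 8, 6)
)
set.seed(42)                       # k-means starts from random centers
km <- kmeans(fruits, centers = 3)  # ask for three groups, as on the slide
km$cluster                         # group membership of each fruit
km$centers                         # the three group centers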
16
The result is three groups: the machine has learned that there exist very bumpy green fruits (unripe oranges and overripe apples), smoothly-textured fruits of any color (apples and some oranges), and orange to red colored bumpy fruits (oranges).
Why didn’t it simply separate apples from oranges, rather than building three groups?
The k-means algorithm needs the number of groups as input. As three groups seemed to be the default, we ended up with three types of fruits.
Therefore, let us repeat the algorithm with number of groups=2. Will apples and oranges be separated? See the next slide.
17
The good news is that we can indeed see two types of fruits: fruits with a smooth texture and fruits with
a bumpy texture. Why not apples and oranges?
Maybe we should consider additional features like weight, size, shape or taste, or – alternatively – we should have defined a learning objective.
So we have two options now:
1. Continue with additional features. This approach is called unsupervised learning: The machine
should exploratively show which hidden secrets are to be found in our fruit basket.
2. If we define a learning objective («differentiate between apples and oranges»), we are talking about
supervised learning.
Having our buffer pool tuning task in mind, where we try to classify database objects into three types of
well-defined buffer pools, we will continue with supervised learning.
The first step is to enhance our analytical dataset with an additional column («fruit»):
Fruit No Texture Color Fruit
1 2 1 apple
2 2 10 apple
… … … …
7 6 8 orange
18
We will talk about some of the most important algorithms on the next slide.
19
We randomly select a couple of fruits as “training data”, and we will use the remaining fruits as test data.
Usually, about 90% of the data will be used for training purposes, and a much smaller part for testing.
With the training data, we “train” the machine. As in the previous case of unsupervised learning, this is a purely mathematical approach. Based on the training data, the algorithm calculates a function that, for a new fruit, finds the type (orange or apple) to which it most probably belongs.
Various algorithms exist to solve a classification problem; the most important among them are listed below (a minimal R sketch using one of them follows the list):
• Neural network classifier (first described in 1943), a network of linear regression models, with weights for each node and backpropagation to re-compute and adjust the functions.
• Naïve Bayes classifier (1950’s), based on Bayes’ theorem on the probability of an event based on
prior knowledge of conditions.
• Support Vector Machine classifier (1963), mapping the feature space values into higher dimensional feature spaces and calculating a hyperplane in this multi-dimensional space where each feature vector of the original data lies either above or below this hyperplane.
• Random Forest classifier (1995), evolved from simple decision trees, to average multiple deep
decision trees.
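As announced above, here is a minimal R sketch of the supervised step on the labelled fruit table. The data values are hypothetical, and it assumes the e1071 package for its svm() classifier:

library(e1071)
fruits <- data.frame(
  texture = c(2, 2, 3, 8, 7, 9, 6, 2, 5, 9),
  color   = c(1, 10, 4, 2, 9, 8, 8, 6, 3, 9),
  fruit   = factor(c("apple", "apple", "apple", "orange", "orange",
                     "orange", "orange", "apple", "apple", "orange"))
)
set.seed(7)
train_idx <- sample(nrow(fruits), size = 0.9 * nrow(fruits))    # ~90% for training
model     <- svm(fruit ~ texture + color, data = fruits[train_idx, ])
predict(model, fruits[-train_idx, ])                            # classify the remaining test fruit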
20
Let us go back to our apples and oranges problem. Application of the support vector machine algorithm
is highlighted on this and the following slide. The objective is to find a separation line between the
apples and oranges used for training.
If the training data looked like the data shown on the slide, it would be easy to calculate a single line to separate apples from oranges; more than one such line would even be possible. The larger the space around the separation line, the better this line separates the fruits. Therefore, “B” separates apples and oranges better than “A” does.
21
Usually, it is not possible to draw a straight line between the groups. In this case, complex mathematical
transformations map the feature values into a higher dimensional space, until a hyperplane can be
constructed between the two groups. Re-projecting this hyperplane into our two feature (and therefore
two-dimensional) space produces a line which separates apples from oranges. You don’t have to know exactly how these projections work; just keep in mind that a large amount of training data requires a considerable amount of computing power.
22
We can test our algorithm with the data initially separated from our training data, which means that we now use the test data.
The feature values of the test data are evaluated by the formula created from the training data, and end up either above or below the hyperplane – or on the left or right side of the separation curve when transformed back to our 2-dimensional world. Here, “left” means that the current candidate is classified as an apple, and “right” means that the fruit is considered to be an orange.
23
Every additional fruit in our basket, once classified and the choice either confirmed or the apple manually re-classified as an orange (or vice versa), adds to our training data, so that a future calculation of the hyperplane includes more data and the accuracy of the classification algorithm further increases.
Monitoring the model includes
• Validating the model: Here we look at the confusion matrix, which shows how many real apples
were correctly classified as apples, and how many are erroneously classified as oranges. Same for
oranges.
• Monitoring the accuracy: For each type (apples and oranges in our case), we can calculate how accurately our classification machine works: add the true positives (apples classified as apples) and the true negatives (fruits other than apples, not classified as apples) and divide this sum by the total number of fruits classified. This yields a value between 0 (completely wrong) and 1 (perfect), which should increase over time, showing the learning curve of the machine (see the R sketch after this list).
• Measuring the Business Use Case benefit: At the end of the day, this algorithm should show an
improvement for the real business use case. A bit more difficult for comparing apples and oranges, but
much more interesting for example for buffer pool tuning. That’s what we will discuss in the next
section.
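As referenced above, a minimal R sketch of the confusion matrix and the accuracy calculation, using hypothetical vectors of actual and predicted labels:

actual    <- factor(c("apple", "apple", "orange", "orange", "apple", "orange"))
predicted <- factor(c("apple", "orange", "orange", "orange", "apple", "orange"))
cm <- table(actual, predicted)          # confusion matrix
cm
accuracy <- sum(diag(cm)) / sum(cm)     # (true positives + true negatives) / all classified
accuracy                                # 5/6, about 0.83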
24
At this time, it is important to remember that we never explicitly programmed the machine by providing
rules, decision trees or anything like that. Basically, everything is calculated by geometrical distances.
So everything is easy? In typical use cases, 80% of the resources will be spent in data understanding and
data preparation, the first two steps of the process. Data needs to be extracted from several different
systems and applications, aggregated and cleansed. Data privacy legislation and rules must be
considered. Many examples and use cases for ML have text (or speech) data as input: data preparation
includes NLP (natural language processing) steps, such as parsing, tokenization, elimination of stop
words, normalization, stemming, named-entity recognition, part of speech parsing etc.
Though the algorithms for building and evaluating models have been around for some 80 years, only the computing power of the 2010s allows us to apply these techniques in everyday life.
Now let us continue with a real world example!
25
This slide displays the principles of buffer pool management. The actual implementation is different and much more complex: for example, an LRU chain is managed for all pages (random and sequential), and a separate SLRU chain is maintained which contains only sequential pages.
Buffer page re-classification from sequential to random has been re-introduced in DB2 z/OS V11, after having been disabled in V9 and V10. However, buffer page re-classification happens only one-way, from sequential to random, never from random to sequential.
The objective of an LRU-based bufferpool is to maintain an LRU chain to keep frequently used pages
longer in the pool, and therefore to avoid the second and any subsequent I/O operations.
26
Compared to an LRU-defined pool, a pool which steals pages based on the FIFO principle works the same way but avoids keeping an LRU chain. This is more efficient, but a page that is re-referenced many times will nevertheless be stolen once it becomes the oldest page in the pool.
The objective of a FIFO-based bufferpool is to avoid the overhead of chain management and of never re-referenced buffers; therefore we try to minimize the number of buffer pages which are referenced only once within a longer time period.
27
PGSTEAL=NONE defined buffer pools will be primed by sequential access upon first access of an object. No further I/O is necessary for any page of this object. As sequential prefetch is turned off after priming the pool for each object, this helps to reduce the number of prefetch engines used and to avoid situations with “no prefetch engines available”: not only for this pool, but for the whole subsystem.
If you have loaded more table and index pages than the size of the pool allows, additional pages are stolen using a FIFO approach. Therefore, make sure to set VPSIZE large enough, and also make sure that all buffers are backed by real memory.
The objective of a PGSTEAL=NONE-based bufferpool is to achieve in-memory-like behavior, and therefore to avoid all synchronous I/O operations, even for the first time a page is accessed. A second objective is to avoid the overhead of LRU chain management.
28
For apples and oranges, we selected color and texture as significant features. Here, we chose No of
getpages, No of sync I/O, No of async I/O, No of writes, No of pages, and percentage of dataset growth
(measured in additional number of rows per day compared to the whole table/index size).
During further processing, I measured no significant difference whether No of writes was included or not. Therefore, No of writes was discarded in a later phase and is no longer part of the input values (features).
Analysis of IFCID 198 performance trace records was initially considered, but did not give significant additional information. Compared to all other data, it is much more resource-consuming to collect these records permanently.
29
Sourcing of the first three features (number of getpages, sync I/O and async I/O) depends on what is
installed in your shop. The example here shows utilization of CA’s subsystem analyzer as one possible
option. The file produced by applying the QFILE option is loaded into a table called QFILE. At this
point, you need a DB2 table storing these three features for each object.
LOAD DATA REPLACE INDDN SYSREC00 INTO TABLE QFILE
WHEN (02:02 = '_')
(SPACENAM POSITION(5:12)
,DBNAME POSITION(14:21)
,TYPE POSITION(23:26)
,BPID POSITION(28:33)
,GETPAGE POSITION(70:79) INTEGER EXTERNAL
,SYNC_RIO POSITION(81:90) INTEGER EXTERNAL
,ASYN_RIO POSITION(92:101) INTEGER EXTERNAL
)
30
31
select strip(q.dbname)!!'.'!!strip(q.spacenam) as object, q.getpage, q.sync_rio, q.asyn_rio --, q.sync_wio
, coalesce(s.size, i.size, float(0.00)) as size, coalesce(s.growth, i.growth, float(0.00)) as growth, bpid
from qfile q
left outer join
(select dbname, name, float(sum(coalesce(nactive, npages, totalrows/200.0 ,0))) as size
, coalesce(float(sum (((float(coalesce(statsinserts, 0)) - float(coalesce(statsdeletes, 0))) /
(1+(days(current date)-days(coalesce(statslasttime, current timestamp))))) /
(1+float(coalesce(totalrows,statsinserts-statsdeletes,0))))), 0) as growth
from sysibm.systablespacestats
group by dbname, name ) s
on q.dbname = s.dbname and q.spacenam=s.name
left outer join
(select dbname, indexspace as name, float(sum(coalesce(nactive, npages, totalentries/400.0 ,0))) as size
, coalesce(float(sum (((float(coalesce(statsinserts, 0)) - float(coalesce(statsdeletes, 0))) /
(1+(days(current date)-days(coalesce(statslasttime, current timestamp))))) /
(1+float(coalesce(totalentries,statsinserts-statsdeletes,0))))), 0) as growth
from sysibm.sysindexspacestats
group by dbname, indexspace ) i
on q.dbname = i.dbname and q.spacenam=i.name
where substr(bpid,4)=' '
and bpid <> 'BP0'
and getpage > 0
and rand(1) >= 0.1 -- keep this predicate for the (roughly 90%) training data; remove for further iterations
-- and rand(1) < 0.1 -- use this predicate instead of the one above to extract the test data
32
We will use R to work through the steps of ML discussed in the previous section. R is one of the most widely used tools for data scientists. As a DBA, you should be somewhat familiar with these tools in order to talk to data scientists in their language. Working with R is quite straightforward; following the tutorial at http://www.cyclismo.org/tutorial/R/index.html is a good way to get started.
The slide on this page shows the import screen on the upper left, and the typical 4-part R Studio screen
on the lower right, after execution of the head(ml4bp_training) R command which shows the first few
rows of the file.
The last step in preparing the dataset is to scale the axes (the same as we did for our apples and oranges when we changed the axes from (green...red) to (1..10)).
In R, we use
TRAINING <- cbind(scale(ml4bp_training[2:6]), ml4bp_training[7])
The TRAINING dataset then looks like this:
The R code used here looks as follows:
> FORM <- factor(BPID) ~ GETPAGE + SYNC_RIO + ASYN_RIO + SIZE + GROWTH
> library(randomForest)
> BPOOL_RF <- randomForest(FORM, TRAINING)
Three lines, and the model is built!
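A minimal sketch of the prediction step that produces BP_PREDICT, used on the next slides. It assumes a test file ml4bp_test imported and laid out like the training data (object name in column 1, the five features in columns 2-6), scaled in the same way:

TEST       <- cbind(scale(ml4bp_test[2:6]), ml4bp_test[7])
BP_PREDICT <- predict(BPOOL_RF, TEST)
head(BP_PREDICT)    # predicted buffer pool (BPID) per object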
33
Random Forest allows parallelization in the training phase, because each individual decision tree can be
built up (“trained”) in parallel, which makes it an ideal candidate for very large datasets.
34
The resulting dataset can now be aggregated in R:
> alter <- cbind(ml4bp_test[1],BP_PREDICT)
and the dataset reloaded into a DB2 table with columns OBJECT and BPID. This allows us to directly
generate ALTER TABLESPACE [object] BUFFERPOOL [BPID] and ALTER INDEX [object]
BUFFERPOOL [BPID] commands.
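One possible way to generate these commands directly from R is sketched below; the column names OBJECT and BP_PREDICT of the combined dataset are assumptions, and index objects would need ALTER INDEX instead of ALTER TABLESPACE:

alter <- cbind(ml4bp_test[1], BP_PREDICT)
paste0("ALTER TABLESPACE ", alter$OBJECT, " BUFFERPOOL ", alter$BP_PREDICT, ";")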
35
But before we generate these ALTER SQL statements, we will apply an additional touch to our bpool
classification: We will separate objects accessed by top-priority applications from other objects, and we
will review our current bufferpool sizing.
36
1. Once classified into these three bufferpool categories based on which stealing algorithm fits best,
additional rules now apply to further distinguish between “economy” type of objects and “first class”
objects.
2. Sizing must be done for each buffer pool.
3. Performance evaluation of the bufferpool key performance indicators.
37
The top-5 applications all use dynamic queries, sent from Java and Smalltalk. Only very few top applications have some COBOL components. These aren’t discussed here; we will focus on dynamic queries only.
38
Use this query for tablespaces:
select a.dbname, a.tsname, max(float(coalesce(p.gpag, 0)) / float(1+a.gpag)) as priority from
(select t.dbname, t.tsname, float(sum(Stat_gpag)) as gpag
from DSN_STATEMENT_CACHE_TABLE a, PLAN_TABLE b, SYSIBM.SYSTABLES t
where a.stmt_id=b.queryno and b.indexonly='N' and b.creator=t.creator and b.tname=t.name
group by t.dbname, t.tsname) a,
(select t.dbname, t.tsname, float(sum(Stat_gpag)) as gpag
from DSN_STATEMENT_CACHE_TABLE a, PLAN_TABLE b, SYSIBM.SYSTABLES t
where a.stmt_id=b.queryno and b.indexonly='N' and b.creator=t.creator and b.tname=t.name
and a.cursqlid in ('[sqlids of top-5-applications]') group by t.dbname, t.tsname) p
where a.dbname=p.dbname and a.tsname=p.tsname group by a.dbname, a.tsname ;
Replace SYSTABLES by SYSINDEXES (and replace tsname by indexspace, creator by accesscreator,
tname by accessname, and remove the restriction on indexonly) for indexes.
39
At the end of the day, what counts are the results achieved, measured in elapsed time and CPU seconds. In order to know where these elapsed times come from, we have to look a bit more closely under the hood and measure page residency times.
For LRU type buffer pools, we are interested in both the global (general) page residency time, which does not differentiate between random and sequential getpages, and the random page residency time, which focuses on random I/O only.
For calculations of these values please refer to page 9.
Now that the initial page residency objectives were met, would there be any benefit in further increasing
the buffer pool sizes?
DB2 z/OS V11 introduced a nice new function here: Buffer pool simulation.
Larger buffer pools not only save I/O, but also the CPU needed to perform this I/O, although PGFIX=YES has helped here. The z/OS development team has measured approximately 20-40 microseconds of CPU per sync I/O on z13.
As a rule of thumb, if an increase in buffer pool VPSIZE of 20% leads to an increase in the page residency time of 20% or more, then I consider this investment in additional real memory as justified, as long as reduction of the I/O rate is the tuning objective. See details on the next page.
40
Buffer pool simulation covers virtual changes in VPSEQT and VPSIZE.
–ALTER BUFFERPOOL (BPn) SPSIZE(n) simulates buffer pool behavior when n pages are added to the number of buffer pool pages in the virtual buffer pool. SPSIZE(0) deletes the simulated pool.
Buffer pool simulation works for LRU-defined buffer pools only, and is not available for group buffer pools.
A simulated buffer pool does not contain data pages, but only page information. For a 4K buffer pool, you can estimate roughly 80 bytes per simulated buffer pool page; the storage needed is therefore SPSIZE*80 bytes. Buffer pool simulation also consumes some CPU cycles, so best practice is to use this function for only one single buffer pool at a time. Estimates are 1-2% of additional CPU per buffer pool where simulation is active.
Finally, the display buffer pool command (-DIS BPOOL(BPn) DETAIL) reports the number of avoidable I/Os (separated into synchronous random I/O, sync sequential I/O, and async I/O) for both VSAM and GBP reads, the number of pages moved into the simulated buffer pool, and an estimate of the total delay for all synchronous I/O. For the diagram above, the sum of all sync read I/O and async read I/O was considered. The formula to calculate the general page residency time including simulation is as follows:
general page residency time := VPSIZE / (no of random and prefetch pages read per sec
- no of sync and async read I/O avoided per sec).
Potential %CPU savings, calculated from n := number of avoided sync I/O per second:
CPU saving in µsec per sec of CPU used := (30µsec * n) / (no of processors * CPU utilization in %).
Example with 4 processors, average CPU utilization 95%, and 1000 sync I/O per sec avoided:
CPU savings := (30µsec*1000) / (4*0.95) = 7895µsec per second of CPU usage (or 0.7895%)
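A minimal R sketch of this CPU-saving estimate; the 30 microseconds of CPU per avoided sync I/O is an assumption taken from the 20-40 microsecond range quoted earlier:

cpu_saving_pct <- function(avoided_sync_io_sec, n_cpus, util, usec_per_io = 30) {
  saved_usec <- usec_per_io * avoided_sync_io_sec    # CPU saved per wall-clock second
  100 * saved_usec / (n_cpus * util * 1e6)           # as a percentage of the CPU actually used
}
cpu_saving_pct(avoided_sync_io_sec = 1000, n_cpus = 4, util = 0.95)   # about 0.79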
41
42
We do not build self-driving cars. But cognitive technologies (machine learning) will touch us all. Every DBA should be familiar with the basic concepts of machine learning. Applying machine learning techniques to your own area of expertise is one of the best ways to get started!
43
Since 1992, Thomas Baumann has been focusing on understanding how database engines work. He holds a master’s degree in computer science from ETH Zurich, Switzerland, and is currently working as head of IT performance architecture at Swiss Mobiliar Insurance in Berne, Switzerland. If he is not in his office figuring out how to get the most out of database technology, he is somewhere lecturing on database optimization or database risk management. Thomas is a certified information systems auditor (CISA), holds a CRISC certificate, and is a member of the IDUG (International DB2 Users Group) speaker hall of fame.
44