Architecture of DB2 LUW

DB2 LUW architecture is a three-model design. The models are:

1. Process Model

2. Memory Model

3. Storage Model

1. Process Model

Knowledge of the DB2 process model will help you to understand how the database manager and its associated

components interact, and this can help you to troubleshoot problems that might arise.

The process model that is used by all DB2 database servers facilitates communication between database servers and

clients. It also ensures that database applications are isolated from resources, such as database control blocks and

critical database files.

The DB2 database server must perform many different tasks, such as processing database application requests or

ensuring that log records are written out to disk. Each task is typically performed by a separate engine dispatchable

unit (EDU).

There are many advantages to using a multithreaded architecture for the DB2 database server. A new thread

requires less memory and fewer operating system resources than a process, because some operating system

resources can be shared among all threads within the same process. Moreover, on some platforms, the context

switch time for threads is less than that for processes, which can improve performance. Using a threaded model on

all platforms makes the DB2 database server easier to configure, because it is simpler to allocate more EDUs when

needed, and it is possible to dynamically allocate memory that must be shared by multiple EDUs.

For each database being accessed, separate EDUs are started to deal with various database tasks such as

prefetching, communication, and logging. Database agents are a special class of EDU that are created to handle

application requests for a database.

In general, you can rely on the DB2 database server to manage the set of EDUs. However, there are DB2 tools that

look at the EDUs. For example, you can use the db2pd command with the -edus option to list all EDU threads that

are active.

Each client application connection has a single coordinator agent that operates on a database. A coordinator

agent works on behalf of an application, and communicates with other agents using private memory, interprocess

communication (IPC), or remote communication protocols, as needed.

The DB2 architecture provides a firewall so that applications run in a different address space than the DB2 database server. The firewall protects the database and the database manager from applications, stored procedures, and user-defined functions (UDFs). The firewall maintains the integrity of the data in the databases, because it prevents application programming errors from overwriting internal buffers or database manager files. The firewall also improves reliability, because application errors cannot crash the database manager.

Database server threads and processes

The system controller (db2sysc on UNIX and db2syscs.exe on Windows operating systems) must exist if the database server is to function. The following threads and processes carry out a variety of tasks:

db2acd, an autonomic computing daemon that hosts the health monitor, automatic maintenance utilities,

and the administrative task scheduler. This process was formerly known as db2hmon.

db2aiothr, manages asynchronous I/O requests for a database partition (UNIX only)

db2alarm, notifies EDUs when their requested timer has expired (UNIX only)

db2cart, for archiving log files (when the userexit database configuration parameter is enabled)

db2disp, the client connection concentrator dispatcher

db2fcms, the fast communications manager sender daemon

db2fcmr, the fast communications manager receiver daemon

db2fmd, the fault monitor daemon

db2fmtlg, for formatting log files (when the logretain database configuration parameter is enabled and

the userexit database configuration parameter is disabled)

db2licc, manages installed DB2 licenses

db2panic, the panic agent, which handles urgent requests after agent limits have been reached

db2pdbc, the parallel system controller, which handles parallel requests from remote database partitions

(used only in a partitioned database environment)

db2resync, the resync agent that scans the global resync list

db2sysc, the main system controller EDU; it handles critical DB2 server events

db2thcln, recycles resources when an EDU terminates (UNIX only)

db2wdog, the watchdog on UNIX and Linux operating systems that handles abnormal terminations

2. Memory Model

DB2 organizes and manages memory in four memory sets. They are:

Instance shared memory

Database shared memory

Application group shared memory

Agent private memory

Each memory set consists of various memory pools (also referred to as heaps).

db2mtrk - Memory tracker command

Provides a complete report of memory status for instances, databases, agents, and applications. This command

outputs the following memory pool allocation information:

Current size

Maximum size (hard limit)

Largest size (high water mark)

Type (identifier indicating function for which memory will be used)

Agent who allocated pool (only if the pool is private)

Application

The same information is also available from the snapshot monitor.

>>db2mtrk -i -d -p -m -r interval count -v

Command parameters:

-i Show instance level memory.

-d Show database level memory.

-a Show application memory usage.

-p Deprecated. Show private memory. Replaced with the -a parameter to show application memory usage.

-m Show maximum values for each pool.

-w Show high watermark values for each pool.

-r Repeat mode.

interval Number of seconds to wait between subsequent calls to the memory tracker (in repeat mode).

count Number of times to repeat.

-v Verbose output.

-h Show help screen. If you specify -h, only the help screen appears. No other information is displayed.
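As a rough illustration of how the options above combine, the following Python sketch assembles a db2mtrk argument list. The helper name build_db2mtrk_args and its keyword arguments are invented for this example and are not part of DB2; only the flags themselves (-i, -d, -a, -m, -w, -r, -v) come from the parameter list above. Treating -m and -w as an either/or choice is an assumption of this sketch.

```python
def build_db2mtrk_args(instance=True, database=True, application=False,
                       values="current", interval=None, count=None,
                       verbose=False):
    """Return a db2mtrk argument list (hypothetical helper, not a DB2 API)."""
    args = ["db2mtrk"]
    if instance:
        args.append("-i")   # instance level memory
    if database:
        args.append("-d")   # database level memory
    if application:
        args.append("-a")   # application memory (replaces the deprecated -p)
    if values == "max":
        args.append("-m")   # maximum value for each pool
    elif values == "watermark":
        args.append("-w")   # high watermark for each pool
    elif values != "current":
        raise ValueError("values must be 'current', 'max', or 'watermark'")
    if interval is not None:
        args += ["-r", str(interval)]   # repeat mode: seconds between calls
        if count is not None:
            args.append(str(count))     # number of repetitions
    elif count is not None:
        raise ValueError("count is only meaningful together with interval")
    if verbose:
        args.append("-v")
    return args


print(" ".join(build_db2mtrk_args(interval=10, count=5, verbose=True)))
# db2mtrk -i -d -r 10 5 -v
```

On a server where the DB2 instance environment is set up, the returned list could be handed to subprocess.run to execute the command.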

Important configuration parameters for the memory model:

The dbm cfg parameter NUMDB sets the maximum number of databases that can be active concurrently within an instance.

The dbm cfg parameter that represents instance-level memory is INSTANCE_MEMORY.

The db cfg parameter that represents database-level memory is DATABASE_MEMORY.

3. Storage Model

When you are designing a new database in DB2 on Linux, UNIX, and Windows (DB2/LUW), one of the most important

aspects of your design is the layout of your data. It is important to get this right the first time, because changing

layouts is time-consuming and difficult. There is a lot of good information on the web already, but I wanted to add

some practical observations that Graham Murphy and I have made on recently implemented or redesigned

systems. This article focuses on OLTP (On-line Transaction Processing) and reporting systems. Data warehouses and

databases with heavy analytical uses are beyond the scope of this document.

Since most of the main concepts are already covered very well by the IBM Database Storage DB2/LUW Best

Practices Guide, I highly recommend that you read it. In this article I will cover some more detailed

recommendations and some alternatives that may be helpful.

File Systems

One of the first items that you need to consider is the number and types of file systems that you need. The reason

that you should work on this first is that it typically takes a while to get storage allocated, especially in large

organizations where there is a separate storage management group. Formal requests need to be well considered by

the omnipotent storage team before they can condescend to bestow disk space on the unwashed masses. Most

organizations now get an allocation from the central SAN system. Storage is usually presented to servers in an

object called a Logical Unit Number (LUN).

Before making that request it is a good idea to have a discussion with the storage team about how data is stored

and allocated. In some organizations standard-sized LUNs are used and in others custom-sized LUNs can be

ordered. Here are recommendations for the different types of SAN storage:

Data on All Disks

In newer disk subsystems there seems to be a trend toward spreading the storage for LUNs across all disks in the

physical disk device. An example is IBM's XIV storage, but other manufacturers are doing this too. This is the

simplest case for you. If you are getting your LUNs from this type of system and you can get custom sizes, then

request 5 LUNs for your database data. There is nothing really magic about this number, but it strikes a nice balance

between ease of administration and spreading data. If you want a few more that is fine, but don't go lower. If your

organization issues storage in fixed sizes, then order enough LUNs for the amount of space you need. Finally, when

LUNs are presented to the operating system, it is a good practice to create one file system on each LUN.

Data on Individual Arrays

On most other types of storage, LUNs are allocated from individual RAID arrays. There is a very good discussion of

how to arrange this storage in the IBM Database Storage DB2/LUW Best Practices Guide so I will not repeat it

here. If possible, you should get one LUN from each RAID array and create one file system per LUN.

Unknown

In many organizations the DBAs and others will be deemed unworthy of knowing what is behind the curtain of the

SAN and will not be told. In this case you just have to ask for enough LUNs to meet your needs and hope for the

best. The good news is that this often does provide adequate performance for many small and medium-sized

systems. Again, you should create one file system per LUN.

The IBM Database Storage DB2/LUW Best Practices Guide goes in depth about types of RAID arrays to create for

DB2 and I highly recommend that you read it. One important thing that I did not see there is how to create your file

systems from LUNs. It is good if you can create one file system per LUN, but sometimes this is not practical for

various reasons. If you find yourself in this situation do not despair. Just remember that when you create the file

system, ensure that you stripe the file system across the LUNs and do NOT concatenate the LUNs. If you

concatenate the LUNs, then as data is added it is placed in only one LUN until that LUN is filled, and then moves on to each

subsequent LUN. This is very bad because it places the newest and probably hottest data into one or a few LUNs, creating a

bad hotspot.
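The difference between the two layouts can be seen with a small simulation. This is only a sketch: the LUN count, LUN size, and block numbers are made-up figures, and real volume managers stripe in larger units than a single block.

```python
def concatenated_lun(block, lun_size_blocks):
    """Concatenation: LUN 0 is filled completely, then LUN 1, and so on."""
    return block // lun_size_blocks


def striped_lun(block, num_luns, stripe_blocks=1):
    """Striping: writes rotate across the LUNs every stripe_blocks blocks."""
    return (block // stripe_blocks) % num_luns


def distribution(blocks, placement):
    """Count how many of the given blocks land on each LUN."""
    counts = {}
    for block in blocks:
        lun = placement(block)
        counts[lun] = counts.get(lun, 0) + 1
    return counts


NUM_LUNS = 5        # assumed number of LUNs under the file system
LUN_SIZE = 1000     # assumed capacity of each LUN, in blocks

# Suppose 2000 blocks have been written and the newest 200 are the hot data.
newest = range(1800, 2000)

concat = distribution(newest, lambda b: concatenated_lun(b, LUN_SIZE))
striped = distribution(newest, lambda b: striped_lun(b, NUM_LUNS))

print(concat)    # {1: 200}
print(striped)   # {0: 40, 1: 40, 2: 40, 3: 40, 4: 40}
```

With concatenation all of the hot blocks sit on a single LUN, while striping spreads them evenly over all five.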

Tablespaces

One of the things that I've been hearing lately is that it is OK to put all of your tables into one or a very few

tablespaces. This is simply NOT TRUE if you need high performance. A good rule of thumb is to put any table with

more than about 5-10 MB of data into its own tablespace. Further, it is a good idea to put the indexes for each of these

tables into a separate tablespace. That is, you would put all of the indexes for a larger table into a tablespace

created solely for that table's indexes. You can place all of the smaller tables into one tablespace, and the indexes

for all of those tables into another. Graham and I recently worked with a customer who was having performance

problems with their OLTP database, which had all of its tables in a single tablespace. Once he broke all of the larger

tables and their indexes into their own tablespaces, performance improved dramatically. When he was done, this

system had well over 100 tablespaces.
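The rule of thumb above can be expressed as a short planning script. This is a sketch only: the table names, sizes, and the TS_..._D / TS_..._I naming scheme are invented for illustration, and the 10 MB threshold is just the upper end of the 5-10 MB range mentioned above.

```python
def plan_tablespaces(table_sizes_mb, threshold_mb=10):
    """Map each table to a (data tablespace, index tablespace) pair.

    Tables larger than threshold_mb get their own pair of tablespaces;
    all smaller tables share one data and one index tablespace.
    """
    plan = {}
    for table, size_mb in sorted(table_sizes_mb.items()):
        if size_mb > threshold_mb:
            plan[table] = (f"TS_{table}_D", f"TS_{table}_I")
        else:
            plan[table] = ("TS_SMALL_D", "TS_SMALL_I")
    return plan


# Hypothetical table sizes in MB.
sizes = {"ORDERS": 500, "CUSTOMER": 120, "COUNTRY": 1, "STATUS_CODE": 0.5}

for table, (data_ts, index_ts) in plan_tablespaces(sizes).items():
    print(f"{table:12} data in {data_ts:16} indexes in {index_ts}")
```

Here ORDERS and CUSTOMER each get a dedicated data and index tablespace, while COUNTRY and STATUS_CODE share TS_SMALL_D and TS_SMALL_I.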

For almost all production data and index tablespaces you should use Large (not Regular) DMS storage. Regular

tablespaces may go away in future releases. Remember that with DMS and Automatic Storage you can now specify a

start size and let the tablespace automatically extend as needed.

If you have Large OBject (LOB) data in your database you should design your tablespaces in one of two ways. If your

LOBs are small enough to fit onto the data page with the other data and are frequently accessed, then you should put

the LOBs inline. That means that the LOB column is just part of the row in the data page just like all other

columns. This saves I/O when accessing the LOB data. If the LOBs are large, then they should be put into their own

tablespaces using the LONG IN clause of the CREATE TABLE statement.

Putting the Tablespaces on File Systems

You should create each tablespace across all data file systems on your server. That is, each tablespace should have

one container (directory) on each data file system. Avoid putting tablespaces in your backup and transaction log file

systems. I am aware of two recently redesigned systems that used a SAN that stripes each LUN over all disks in the

storage unit. For both of these databases, five data LUNs were created with one file system being placed on each

LUN. Both of these systems perform well; each has many tablespaces, and every tablespace is striped across

all 5 data file systems. Both of these are high volume OLTP systems with significant reports being created from

them too.
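One way to keep such a layout consistent is to generate the tablespace DDL from the list of data file systems. The sketch below assumes a Large DMS tablespace with one FILE container per file system; the mount points /db2data1 to /db2data5, the tablespace name, the container naming, and the page counts are all hypothetical. Check the generated statement against the CREATE TABLESPACE documentation for your DB2 version before using it.

```python
def create_tablespace_ddl(name, file_systems, pages_per_container,
                          extent_pages=32, page_kb=4):
    """Build a CREATE LARGE TABLESPACE statement with one container
    on every data file system (illustrative helper, not a DB2 API)."""
    if not file_systems:
        raise ValueError("at least one data file system is required")
    containers = ",\n    ".join(
        f"FILE '{fs.rstrip('/')}/{name.lower()}/cont{i}' {pages_per_container}"
        for i, fs in enumerate(file_systems)
    )
    return (
        f"CREATE LARGE TABLESPACE {name}\n"
        f"  PAGESIZE {page_kb} K\n"
        f"  MANAGED BY DATABASE USING (\n    {containers}\n  )\n"
        f"  EXTENTSIZE {extent_pages}\n"
        f"  AUTORESIZE YES"
    )


# Five data file systems, one per LUN (hypothetical mount points).
data_file_systems = [f"/db2data{n}" for n in range(1, 6)]

print(create_tablespace_ddl("TS_ORDERS_D", data_file_systems,
                            pages_per_container=25600))
```

Because every tablespace is generated from the same list, adding a tablespace never leaves a file system out of the stripe set.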

Striping all tablespaces across all file systems can be made easier with Automatic Storage. With automatic storage

you define the available file systems to the database and then DB2 takes care of placing each tablespace across

those file systems as it is created.

Page Size

For OLTP databases use the 4K page size. End of discussion! When using LARGE tablespaces, as should always be

done these days, 4K tablespaces can grow up to 2 terabytes. For strictly reporting or Operational Data Store (ODS)

databases 8K or 16K pages might be more appropriate so that you get more rows per page. Compression may also

improve performance of reporting databases.

Extent and Prefetch Sizes

The IBM Database Storage DB2/LUW Best Practices Guide has a good description of extent size and provides a

well-accepted formula for calculating it. However, there is an interesting alternative that is gaining acceptance in some

quarters for high-volume OLTP databases. This alternative says to use a small extent size of 2 pages. If you need

very high performance in your OLTP database then you may want to experiment with the traditional vs. small extent

size and see what performs better for your work load. I would lean more towards the traditional calculation for

reporting and ODS workloads.

Again, the IBM Database Storage DB2/LUW Best Practices Guide has a good description of prefetch size and provides

a well-accepted formula for calculating it. Graham has provided me with a formula that can give better performance

in some cases. This alternative formula is:

PREFETCH = Nbr_File_Systems * Extent_Size * Nbr_Channels_to_Disk_System

Where:

Nbr_File_Systems is the number of file systems under the tablespace

Extent_Size is the extent size for the tablespace

Nbr_Channels_to_Disk_System is the number of channels to the disk subsystems. Some HBA cards have

multiple channels. The best way to get this figure is to ask your system administrator for the server.
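As a worked example of the alternative formula, the following Python sketch computes the prefetch size. The input values are made up, and the result is assumed to be in pages because the extent size is given in pages.

```python
def prefetch_size(nbr_file_systems, extent_size, nbr_channels_to_disk_system):
    """PREFETCH = Nbr_File_Systems * Extent_Size * Nbr_Channels_to_Disk_System"""
    for value in (nbr_file_systems, extent_size, nbr_channels_to_disk_system):
        if value < 1:
            raise ValueError("all inputs must be positive integers")
    return nbr_file_systems * extent_size * nbr_channels_to_disk_system


# Hypothetical system: 5 data file systems, 32-page extents, 2 channels.
print(prefetch_size(5, 32, 2))   # 320
```

With the small 2-page extent size discussed earlier, the same system would get a prefetch size of 5 * 2 * 2 = 20 pages.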
