Architecture of DB2 LUW

    DB2 LUW architecture is a three-model design, consisting of the process model, the memory model, and the storage model.

    1. Process Model

    Knowledge of the DB2 process model will help you to understand how the database manager and its associated

    components interact, and this can help you to troubleshoot problems that might arise.

    The process model that is used by all DB2 database servers facilitates communication between database servers and

    clients. It also ensures that database applications are isolated from resources, such as database control blocks and

    critical database files.

    The DB2 database server must perform many different tasks, such as processing database application requests or

    ensuring that log records are written out to disk. Each task is typically performed by a separate engine dispatchable

    unit (EDU).

    There are many advantages to using a multithreaded architecture for the DB2 database server. A new thread

    requires less memory and fewer operating system resources than a process, because some operating system

    resources can be shared among all threads within the same process. Moreover, on some platforms, the context

    switch time for threads is less than that for processes, which can improve performance. Using a threaded model on

    all platforms makes the DB2 database server easier to configure, because it is simpler to allocate more EDUs when

    needed, and it is possible to dynamically allocate memory that must be shared by multiple EDUs.

    For each database being accessed, separate EDUs are started to deal with various database tasks such as

    prefetching, communication, and logging. Database agents are a special class of EDU that are created to handle

    application requests for a database.

    In general, you can rely on the DB2 database server to manage the set of EDUs. However, there are DB2 tools that

    look at the EDUs. For example, you can use the db2pd command with the -edus option to list all EDU threads that

    are active.

    Each client application connection has a single coordinator agent that operates on a database. A coordinator agent works on behalf of an application, and communicates with other agents using private memory, interprocess communication (IPC), or remote communication protocols, as needed.

    The DB2 architecture provides a firewall so that applications run in a different address space than the DB2 database server. The firewall protects the database and the database manager from applications, stored procedures, and user-defined functions (UDFs). The firewall maintains the integrity of the data in the databases, because it prevents application programming errors from overwriting internal buffers or database manager files. The firewall also improves reliability, because application errors cannot crash the database manager.

    Database server threads and processes

    The system controller (db2sysc on UNIX and db2syscs.exe on Windows operating systems) must exist if the database server is to function. The following threads and processes carry out a variety of tasks:

    db2acd, an autonomic computing daemon that hosts the health monitor, automatic maintenance utilities, and the administrative task scheduler. This process was formerly known as db2hmon.

    db2aiothr, manages asynchronous I/O requests for a database partition (UNIX only)

    db2alarm, notifies EDUs when their requested timer has expired (UNIX only)

    db2cart, for archiving log files (when the userexit database configuration parameter is enabled)

    db2disp, the client connection concentrator dispatcher

    db2fcms, the fast communications manager sender daemon

    db2fcmr, the fast communications manager receiver daemon

    db2fmd, the fault monitor daemon

    db2fmtlg, for formatting log files (when the logretain database configuration parameter is enabled and the userexit database configuration parameter is disabled)

    db2licc, manages installed DB2 licenses

    db2panic, the panic agent, which handles urgent requests after agent limits have been reached

    db2pdbc, the parallel system controller, which handles parallel requests from remote database partitions (used only in a partitioned database environment)

    db2resync, the resync agent that scans the global resync list

    db2sysc, the main system controller EDU; it handles critical DB2 server events

    db2thcln, recycles resources when an EDU terminates (UNIX only)

    db2wdog, the watchdog on UNIX and Linux operating systems that handles abnormal terminations

    2. Memory Model

    DB2 divides and manages memory in four memory sets:

    Instance shared memory

    Database shared memory

    Application group shared memory

    Agent private memory

    Each memory set consists of various memory pools (also referred to as heaps).

    db2mtrk - Memory tracker command

    The db2mtrk command provides a complete report of memory status for instances, databases, agents, and applications. It outputs the following memory pool allocation information:

    Current size

    Maximum size (hard limit)

    Largest size (high water mark)

    Type (identifier indicating the function for which memory will be used)

    Agent that allocated the pool (only if the pool is private)

    Application

    The same information is also available from the snapshot monitor.

    db2mtrk -i -d -p -m -r interval count -v

    Command parameters:

    -i  Show instance level memory.

    -d  Show database level memory.

    -a  Show application memory usage.

    -p  Deprecated. Show private memory. Replaced with the -a parameter to show application memory usage.

    -m  Show maximum values for each pool.

    -w  Show high watermark values for each pool.

    -r  Repeat mode.

        interval  Number of seconds to wait between subsequent calls to the memory tracker (in repeat mode).

        count  Number of times to repeat.

    -v  Verbose output.

    -h  Show help screen. If you specify -h, only the help screen appears. No other information is displayed.

    Important configuration parameters for the memory model:

    The dbm cfg parameter NUMDB sets the maximum number of databases that can be active concurrently within an instance.

    The dbm cfg parameter INSTANCE_MEMORY represents instance-level memory.

    The db cfg parameter DATABASE_MEMORY represents database-level memory.

    3. Storage model

    When you are designing a new database in DB2 on Linux, UNIX, and Windows (DB2/LUW), one of the most important

    aspects of your design is the layout of your data. It is important to get this right the first time, because changing

    layouts is time consuming and difficult. There is a lot of good information on the web already, but I wanted to add

    some practical observations that Graham Murphy and I have had in recently implemented or re-designed

    systems. This article focuses on OLTP (On-line Transaction Processing) and reporting systems. Data warehouses and

    databases with heavy analytical uses are beyond the scope of this document.

    Since most of the main concepts are already covered very well by the IBM Database Storage DB2/LUW Best

    Practices Guide, I highly recommend that you read it. In this article I will cover some more detailed

    recommendations and some alternatives that may be helpful.

    File Systems

    One of the first items that you need to consider is the number and types of file systems that you need. You should work on this first because it typically takes a while to get storage allocated, especially in large organizations where there is a separate storage management group. Formal requests need to be well considered by the omnipotent storage team before they can condescend to bestow disk space on the unwashed masses. Most organizations now get an allocation from the central SAN system. Storage is usually presented to servers as a logical unit, identified by its Logical Unit Number (LUN).

    Before making that request, it is a good idea to have a discussion with the storage team about how data is stored and allocated. In some organizations standard-sized LUNs are used, and in others custom-sized LUNs can be ordered. Here are recommendations for the different types of SAN storage:

    Data on All Disks

    In newer disk subsystems there seems to be a trend toward spreading the storage for LUNs across all disks in the physical disk device. An example is IBM's XIV storage, but other manufacturers are doing this too. This is the simplest case for you. If you are getting your LUNs from this type of system and you can get custom sizes, then request 5 LUNs for your database data. There is nothing really magic about this number, but it strikes a nice balance between ease of administration and spreading data. If you want a few more that is fine, but don't go lower. If your organization issues storage in fixed sizes, then order enough LUNs for the amount of space you need. Finally, when LUNs are presented to the operating system, it is a good practice to create one file system on each LUN.

    Data on Individual Arrays

    On most other types of storage, LUNs are allocated from individual RAID arrays. There is a very good discussion of

    how to arrange this storage in the IBM Database Storage DB2/LUW Best Practices Guide so I will not repeat it

    here. If possible, you should get one LUN from each RAID array and create one file system per LUN.

    Unknown

    In many organizations the DBAs and others will be deemed unworthy of knowing what is behind the curtain of the

    SAN and will not be told. In this case you just have to ask for enough LUNs to meet your needs and hope for the

    best. The good news is that this often does provide adequate performance for many small and medium-sized

    systems. Again, you should create one file system per LUN.

    The IBM Database Storage DB2/LUW Best Practices Guide goes in depth about types of RAID arrays to create for DB2 and I highly recommend that you read it. One important thing that I did not see there is how to create your file systems from LUNs. It is good if you can create one file system per LUN, but sometimes this is not practical for various reasons. If you find yourself in this situation, do not despair. Just remember that when you create a file system over several LUNs, you must stripe the file system across the LUNs and NOT concatenate the LUNs. If you concatenate the LUNs, then as data is added it is placed in only one LUN until that LUN is filled, and then moves on to each subsequent LUN. This is very bad: it places the newest and probably hottest data into one or a few LUNs, making a bad hotspot.
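    The difference between concatenation and striping can be illustrated with a small Python sketch. It assumes a deliberately simplified volume model in which logical blocks are mapped to LUNs either one LUN at a time (concatenation) or round-robin in fixed-size stripes. The function names, LUN count, block counts, and stripe width are all illustrative assumptions, not the behavior of any particular volume manager.

```python
from collections import Counter

def concat_lun(block: int, blocks_per_lun: int) -> int:
    """Concatenated volume: LUN 0 is filled completely, then LUN 1, and so on."""
    return block // blocks_per_lun

def striped_lun(block: int, stripe_blocks: int, num_luns: int) -> int:
    """Striped volume: consecutive stripes rotate across all LUNs."""
    return (block // stripe_blocks) % num_luns

# Hypothetical layout: 4 LUNs of 1000 blocks each, 10-block stripes.
NUM_LUNS = 4
BLOCKS_PER_LUN = 1000
STRIPE_BLOCKS = 10

# Assume 1200 blocks have been written and the newest 200 are the hot data.
written = 1200
hot_blocks = range(written - 200, written)

concat_hits = Counter(concat_lun(b, BLOCKS_PER_LUN) for b in hot_blocks)
stripe_hits = Counter(striped_lun(b, STRIPE_BLOCKS, NUM_LUNS) for b in hot_blocks)

print("concatenated:", dict(concat_hits))
print("striped:     ", dict(stripe_hits))
```

    Under these assumptions, all of the hot blocks land on a single LUN in the concatenated layout, while the striped layout spreads them evenly over all four LUNs.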

    Tablespaces

    One of the things that I've been hearing lately is that it is OK to put all of your tables into one or a very few tablespaces. This is simply NOT TRUE if you need high performance. A good rule of thumb is to put any table with more than about 5-10 MB of data into its own tablespace. Further, it is a good idea to put the indexes for each of these tables into their own tablespace. That is, you would put all of the indexes for a larger table into a tablespace created solely for that table's indexes. You can place all of the smaller tables into one tablespace, and the indexes for all of those tables into another. Graham and I recently worked with a customer who was having performance problems with an OLTP database that had all of its tables in a single tablespace. Once Graham broke all of the larger tables and their indexes into their own tablespaces, performance improved dramatically. When he was done, this system had well over 100 tablespaces.
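    The rule of thumb above can be sketched as a small planning helper in Python. This is only an illustration: the function name, the tablespace naming scheme, and the 10 MB default threshold (the upper end of the 5-10 MB range) are assumptions made for the example, and the table sizes would have to come from your own catalog statistics.

```python
def plan_tablespaces(table_sizes_mb, threshold_mb=10):
    """Map each table to a (data tablespace, index tablespace) pair.

    Tables larger than threshold_mb get dedicated data and index
    tablespaces; all smaller tables share one data tablespace and
    one index tablespace. All tablespace names are hypothetical.
    """
    plan = {}
    for table, size_mb in sorted(table_sizes_mb.items()):
        if size_mb > threshold_mb:
            plan[table] = (f"TS_{table}_DAT", f"TS_{table}_IDX")
        else:
            plan[table] = ("TS_SMALL_DAT", "TS_SMALL_IDX")
    return plan

# Hypothetical table sizes in MB.
sizes = {"ORDERS": 5000, "ORDER_LINES": 12000, "COUNTRY": 1, "STATUS": 2}
for table, (data_ts, index_ts) in plan_tablespaces(sizes).items():
    print(f"{table}: data in {data_ts}, indexes in {index_ts}")
```

    In a real design the resulting names would feed the IN and INDEX IN clauses of each CREATE TABLE statement.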

    For almost all production data and index tablespaces you should use Large (not Regular) DMS storage. Regular

    tablespaces may go away in future releases. Remember with DMS and Automatic Storage you can now specify a

    start size and let the tablespace automatically extend as needed.

    If you have Large OBject (LOB) data in your database, you should design your tablespaces in one of two ways. If your LOBs are small enough to fit onto the data page with the other data and are frequently accessed, then you should put the LOBs inline. That means that the LOB column is just part of the row in the data page, like all other columns. This saves I/O when accessing the LOB data. If the LOBs are large, then they should be put into their own tablespaces using the LONG IN clause of the CREATE TABLE statement.

    Putting the Tablespaces on File Systems

    You should create each tablespace across all data file systems on your server. That is, each tablespace should have one container (directory) on each data file system. Avoid putting tablespaces in your backup and transaction log file systems. I am aware of two recently redesigned systems that used a SAN that stripes each LUN over all disks in the storage unit. For both of these databases, five data LUNs were created, with one file system placed on each LUN. Both of these systems perform well, even though there are many tablespaces and every tablespace is striped across all 5 data file systems. Both are high-volume OLTP systems from which significant reports are also created.

    Striping all tablespaces across all file systems can be made easier with Automatic Storage. With automatic storage you define the available file systems to the database, and DB2 then takes care of placing each tablespace across those file systems as it is created.

    Page Size

    For OLTP databases use the 4K page size. End of discussion! When using LARGE tablespaces, as should always be done these days, 4K tablespaces can grow up to 2 terabytes. For strictly reporting or Operational Data Store (ODS) databases, 8K or 16K pages might be more appropriate so that you get more rows per page. Compression may also improve performance of reporting databases.

    Extent and Prefetch Sizes

    The IBM Database Storage DB2/LUW Best Practices Guide has a good description of extent size and provides a well

    accepted formula for calculating it. However, there is an interesting alternative that is gaining acceptance in some

    quarters for high-volume OLTP databases. This alternative says to use a small extent size of 2 pages. If you need

    very high performance in your OLTP database then you may want to experiment with the traditional vs. small extent

    size and see what performs better for your workload. I would lean more toward the traditional calculation for

    reporting and ODS workloads.

    Again the IBM Database Storage DB2/LUW Best Practices Guide has a good description of prefetch size and provides

    a well accepted formula for calculating it. Graham has provided me with a formula that can give better performance

    in some cases. This alternative formula is:

    PREFETCH = Nbr_File_Systems * Extent_Size * Nbr_Channels_to_Disk_System

    Where:

    Nbr_File_Systems is the number of file systems under the tablespace

    Extent_Size is the extent size for the tablespace

    Nbr_Channels_to_Disk_System is the number of channels to the disk subsystem. Some HBA cards have multiple channels. The best way to get this figure is to ask the system administrator for the server.
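    As a worked example of the alternative formula, here is a minimal Python sketch. The function name and the sample values (5 file systems, an extent size of 32 pages, 2 channels) are assumptions chosen for illustration; the result is in pages, the same unit as the extent size.

```python
def prefetch_size(nbr_file_systems: int, extent_size: int, nbr_channels: int) -> int:
    """PREFETCH = Nbr_File_Systems * Extent_Size * Nbr_Channels_to_Disk_System.

    extent_size is in pages, so the result is also in pages.
    """
    inputs = {
        "nbr_file_systems": nbr_file_systems,
        "extent_size": extent_size,
        "nbr_channels": nbr_channels,
    }
    for name, value in inputs.items():
        if value < 1:
            raise ValueError(f"{name} must be at least 1, got {value}")
    return nbr_file_systems * extent_size * nbr_channels

# Hypothetical tablespace: 5 file systems, 32-page extents, 2 channels.
print(prefetch_size(5, 32, 2))  # 5 * 32 * 2 = 320 pages
```

    Treat the result as a starting point to test against your own workload rather than a guaranteed optimum.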