28
Autonomic DBMSs: System Tune Thyself! Pat Martin Database Systems Laboratory School of Computing Supported by IBM, CITO and NSERC

Autonomic DBMSs: System Tune Thyself! Pat Martin Database Systems Laboratory School of Computing Supported by IBM, CITO and NSERC

Embed Size (px)

Citation preview

Autonomic DBMSs:System Tune Thyself!

Pat MartinDatabase Systems Laboratory

School of Computing

Supported by IBM, CITO and NSERC

2

Outline of Talk

• The problem – system complexity

• The solution – autonomic computing systems

• Autonomic DBMSs

• Some current research - tuning multiple buffer pools

• Summary

3

The Problem

• Computer systems continually expanded to achieve greater functionality and efficiency

• Expansion has led to a complexity crisis– Systems are too complex to be managed

effectively!

4

Can you manage this?

5

How about this?

6

A Solution – Autonomic Computing Systems

Autonomic Computing Systems, like our nervous system, manage themselves

7

Autonomic Computing System

• Aware of itself and its environment and acts accordingly

• Able to reconfigure itself under varying and unpredictable conditions

• Able to recover from events that cause it to malfunction

• Able to anticipate optimized resources needed to perform a task

• Able to protect itself

8

Autonomic DBMS Project

• Goal is develop a DBMS that can automatically– Recognize properties of its workload– Monitor itself with minimal impact on applications’

performance– Reallocate resources to improve performance– Detect and diagnose performance problems– Recognize and react to changes in its environment

and available resources

9

Example – Buffer Pool Tuning

• Automatically configure tablespaces to buffer pools based on an analysis of the database and the workload (BP Configuration Problem)

• Dynamically adjust sizes of buffer pools to minimize I/O costs for the database and workload (BP Sizing Problem)

10

Multiple Buffer Pools

warehouse customer itemindex

logical access

physical writephysical read

11

BP Configuration Problem

Given a set of database objects and a workload, determine a mapping of database objects to buffer pools to maximize performance for the given workload.

12

Configuration Rules of Thumb

• Separate data and indexes• Isolate a large data table• Separate objects that are updated frequently

and objects that are primarily read• Put temporary tables in their own BP• Separate small frequently accessed tables

from larger tables that are scanned• Isolate tables that are accessed frequently by

short updates

13

BPConfig Approach

• Analyze logical page reference trace– obtain trace of workload on default configuration– derive access patterns for DB objects

• random, re-reference and sequential accesses

• Create characterization vectors– type, access patterns, read/write info, size info

• Partition DB objects into buffer pools– cluster based on characterization vectors

14

Partitioning DB Objects

• Partition using k-means clustering algorithm

• Similarity measured by weighted Euclidean distance

• Considered different weighting schemes– equal– favour read/write– favour access pattern

15

Experiments

• Experimental environment– IBM Netfinity 8500R: 4 900 MHz PIII Xeon

CPU, 16 GB RAM, 70 disks, Windows NT – TPC-C benchmark: OLTP workload, 400

warehouse (40 GB) database– DB2 Version 7.1

• 100,000 4K pages for the buffer pools

16

Experiments (cont.)

• Configuration schemes– BPConfig, expert, default (1BP), random,

distributed (1 BP per DB object)

• Evaluation criteria– Weighted Response Time– TPM– % Physical Reads

17

Experiments (cont.)

• Properties of BPConfig configurations (3 buffer pools)– separates index and data objects– separates heavy access and light access

objects– WID tables isolated (equal and read/write

weightings)

18

Experiments (cont.)

Equal Weight

Read/Write

AccessPattern

Expert Random Default Dist

WRT 11.11 11.20 10.86 10.95 10.95 14.05 12.50

TPM 8129 8047 8331 8287 8287 6371 7159

%PR 5.6 4.7 4.6 4.6 4.6 10.4 8.1

19

BP Sizing Problem

Given a workload, a set of buffer pools and a fixed number of buffer pages, determine the appropriate size of each buffer pool to maximize performance for the given workload.

20

Approaches to Sizing BPs – Class-based Optimization

• Specify performance goals for each transaction class

• Algorithm tries to satisfy goals

• Logical access cost proportional to physical access cost

• Physical access cost determined by buffer pool miss rates

21

Class-based Optimization (cont.)

Collect performance data

Choose target class

Loop until goal metChoose target buffer poolChoose source buffer poolReallocate pages

End

Ti with worstperformance

BP with greatestbenefit

BP with leastcost

22

Class-based Optimization (cont.)

• Problems:– How do we select appropriate performance

goals for a class?– Some classes may be favoured over

others– Thrashing between buffer pool states is a

possibility

23

Approaches to Sizing BPs – System-based Optimization

• BP sizes chosen to maximize system performance metric, eg. throughput

• Use a simple greedy algorithm

• Considered 2 cost functions:– Minimize hit rate– Minimize data access time

(physical reads don’t all cost the same!)

24

System-based Optimization - Experiments

• Experimental environment– IBM xSeries 240 PC Server: 2 1 GHz PIII

CPUs, 2 GB RAM, 22 disks, Windows NT – TPC-C benchmark– DB2 Version 7.1

• 50,000 4K buffer pool pages • 3 buffer pools configured with BPConfig

25

Experiments (cont.)

DAT-Based HR-Based

BP Sizing <25000, 4000, 21000> <19000, 5000, 26000>

WHR 0.9308 0.9342

WcostLR 1.5375 1.5639

TPM 4493 4318

26

1500

2000

2500

3000

3500

4000

4500

5000

0.84 0.86 0.88 0.9 0.92 0.94 0.96

System Hit Rate (WHR)

TP

M

1500

2000

2500

3000

3500

4000

4500

5000

1 1.5 2 2.5 3 3.5 4

System costLR (WcostLR)

TP

M

27

Other AutoDBA Projects

• Automatic diagnosis

• Automatic recognition of workload type

• Integration of BPConfig and sizing algorithm

• Automatic BP management in PostgreSQL

• Tools for DBMS capacity planning

28

AutoDBA Project Members

• Queen’s:– Wendy Powley, Darcy Benoit, Said

Elnaffar, Wenhu Tian, Xiaoyi Xu, Xilin Cui, Ted Wasserman, Nailah Ogeer

• IBM:– Berni Schiefer, Sam Lightstone, Randy

Horman, Robin Van Boeschoten, Keri Romanufa, Calisto Zuzarte