Autonomic DBMSs: System Tune Thyself!
Pat Martin
Database Systems Laboratory
School of Computing
Supported by IBM, CITO and NSERC
Outline of Talk
• The problem – system complexity
• The solution – autonomic computing systems
• Autonomic DBMSs
• Some current research - tuning multiple buffer pools
• Summary
The Problem
• Computer systems have continually expanded to achieve greater functionality and efficiency
• Expansion has led to a complexity crisis
  – Systems are too complex to be managed effectively!
A Solution – Autonomic Computing Systems
Autonomic Computing Systems, like our nervous system, manage themselves
Autonomic Computing System
• Aware of itself and its environment and acts accordingly
• Able to reconfigure itself under varying and unpredictable conditions
• Able to recover from events that cause it to malfunction
• Able to anticipate the optimized resources needed to perform a task
• Able to protect itself
Autonomic DBMS Project
• Goal is to develop a DBMS that can automatically
  – Recognize properties of its workload
  – Monitor itself with minimal impact on applications’ performance
  – Reallocate resources to improve performance
  – Detect and diagnose performance problems
  – Recognize and react to changes in its environment and available resources
Example – Buffer Pool Tuning
• Automatically assign tablespaces to buffer pools based on an analysis of the database and the workload (BP Configuration Problem)
• Dynamically adjust sizes of buffer pools to minimize I/O costs for the database and workload (BP Sizing Problem)
BP Configuration Problem
Given a set of database objects and a workload, determine a mapping of database objects to buffer pools to maximize performance for the given workload.
Configuration Rules of Thumb
• Separate data and indexes
• Isolate a large data table
• Separate objects that are updated frequently and objects that are primarily read
• Put temporary tables in their own BP
• Separate small frequently accessed tables from larger tables that are scanned
• Isolate tables that are accessed frequently by short updates
BPConfig Approach
• Analyze logical page reference trace
  – obtain trace of workload on default configuration
  – derive access patterns for DB objects
    • random, re-reference and sequential accesses
• Create characterization vectors
  – type, access patterns, read/write info, size info
• Partition DB objects into buffer pools
  – cluster based on characterization vectors
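The trace-analysis step can be sketched as follows. This is a minimal illustration, not the BPConfig implementation: the function name `access_patterns`, the `window` parameter, and the rules used to label a reference as random, re-reference or sequential are all assumptions made for the example.

```python
from collections import defaultdict, deque


def access_patterns(trace, window=100):
    """Return {object: (random, re_reference, sequential)} as fractions.

    trace  -- iterable of (object_name, page_number) logical page references
    window -- assumed number of recent pages per object that still count
              as a re-reference (hypothetical parameter)
    """
    last_page = {}
    recent = defaultdict(lambda: deque(maxlen=window))
    counts = defaultdict(lambda: [0, 0, 0])  # random, re-reference, sequential

    for obj, page in trace:
        if page in recent[obj]:
            counts[obj][1] += 1          # page was touched recently
        elif obj in last_page and page == last_page[obj] + 1:
            counts[obj][2] += 1          # next page after the previous one
        else:
            counts[obj][0] += 1          # anything else is treated as random
        recent[obj].append(page)
        last_page[obj] = page

    return {obj: tuple(c / sum(cnt) for c in cnt) for obj, cnt in counts.items()}


trace = [("t", 1), ("t", 2), ("t", 3), ("t", 1), ("i", 7), ("i", 7)]
patterns = access_patterns(trace)
```

The three fractions per object could then serve as the access-pattern components of a characterization vector.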
Partitioning DB Objects
• Partition using k-means clustering algorithm
• Similarity measured by weighted Euclidean distance
• Considered different weighting schemes
  – equal
  – favour read/write
  – favour access pattern
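A minimal sketch of k-means with a weighted Euclidean distance, assuming each DB object is already described by a numeric characterization vector. The function names, the seeded random initialization and the example vectors are illustrative assumptions; the slides do not specify these details.

```python
import math
import random


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def kmeans(vectors, k, weights, iterations=50, seed=0):
    """Return a cluster index (0..k-1) for every vector."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]  # assumed init scheme
    assignment = None
    for _ in range(iterations):
        new_assignment = [
            min(range(k), key=lambda c: weighted_distance(v, centroids[c], weights))
            for v in vectors
        ]
        if new_assignment == assignment:
            break  # converged: no object changed cluster
        assignment = new_assignment
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


# Hypothetical 2-component vectors; equal weighting of both components.
objects = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
pools = kmeans(objects, k=2, weights=(1.0, 1.0))
```

A "favour read/write" or "favour access pattern" scheme would correspond to raising the weights of those vector components relative to the others.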
Experiments
• Experimental environment
  – IBM Netfinity 8500R: 4 × 900 MHz PIII Xeon CPUs, 16 GB RAM, 70 disks, Windows NT
  – TPC-C benchmark: OLTP workload, 400-warehouse (40 GB) database
  – DB2 Version 7.1
• 100,000 4K pages for the buffer pools
Experiments (cont.)
• Configuration schemes
  – BPConfig, expert, default (1 BP), random, distributed (1 BP per DB object)
• Evaluation criteria
  – Weighted Response Time (WRT)
  – TPM
  – % Physical Reads (%PR)
Experiments (cont.)
• Properties of BPConfig configurations (3 buffer pools)
  – separates index and data objects
  – separates heavy access and light access objects
  – WID tables isolated (equal and read/write weightings)
Experiments (cont.)
      Equal Weight  Read/Write  Access Pattern  Expert  Random  Default  Dist
WRT   11.11         11.20       10.86           10.95   10.95   14.05    12.50
TPM   8129          8047        8331            8287    8287    6371     7159
%PR   5.6           4.7         4.6             4.6     4.6     10.4     8.1
BP Sizing Problem
Given a workload, a set of buffer pools and a fixed number of buffer pages, determine the appropriate size of each buffer pool to maximize performance for the given workload.
Approaches to Sizing BPs – Class-based Optimization
• Specify performance goals for each transaction class
• Algorithm tries to satisfy goals
• Logical access cost proportional to physical access cost
• Physical access cost determined by buffer pool miss rates
Class-based Optimization (cont.)
Collect performance data
Choose target class (Ti with worst performance)
Loop until goal met
  Choose target buffer pool (BP with greatest benefit)
  Choose source buffer pool (BP with least cost)
  Reallocate pages
End
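The loop above can be sketched in runnable form under some strong simplifying assumptions. The miss-rate curve `working_set / (working_set + size)`, the per-class access counts, the goals, and the interpretation of "least cost" as the smallest increase in total estimated cost are all hypothetical choices made for illustration, not the algorithm's actual models.

```python
def miss_rate(working_set, size):
    # Hypothetical miss-rate curve: misses fall as the pool grows.
    return working_set / (working_set + size)


def class_cost(cls, working_sets, sizes):
    # Estimated cost of a class: logical accesses per pool times miss rate.
    return sum(a * miss_rate(w, s)
               for a, w, s in zip(cls["accesses"], working_sets, sizes))


def tune_for_goals(classes, working_sets, sizes,
                   step=1000, min_size=1000, max_rounds=100):
    sizes = list(sizes)
    n = len(sizes)

    def shortfall(cls):  # > 1.0 means the class misses its goal
        return class_cost(cls, working_sets, sizes) / cls["goal"]

    def resized(i, delta):
        trial = list(sizes)
        trial[i] += delta
        return trial

    def total_cost(trial):
        return sum(class_cost(c, working_sets, trial) for c in classes)

    target = max(classes, key=shortfall)  # class with worst performance
    for _ in range(max_rounds):           # bounded, since thrashing is possible
        if shortfall(target) <= 1.0:
            break
        # Target BP: extra pages here help the target class the most.
        dst = min(range(n),
                  key=lambda i: class_cost(target, working_sets, resized(i, step)))
        sources = [i for i in range(n) if i != dst and sizes[i] - step >= min_size]
        if not sources:
            break
        # Source BP: removing pages here raises total cost the least (assumed).
        src = min(sources, key=lambda i: total_cost(resized(i, -step)))
        sizes[src] -= step
        sizes[dst] += step
    return sizes, shortfall(target) <= 1.0


classes = [
    {"name": "A", "accesses": [1000, 0], "goal": 300.0},
    {"name": "B", "accesses": [0, 1000], "goal": 400.0},
]
result = tune_for_goals(classes, working_sets=[10000, 10000], sizes=[10000, 40000])
```

The `max_rounds` bound is there because nothing in this sketch prevents pages from moving back and forth between pools.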
Class-based Optimization (cont.)
• Problems:
  – How do we select appropriate performance goals for a class?
  – Some classes may be favoured over others
  – Thrashing between buffer pool states is a possibility
Approaches to Sizing BPs – System-based Optimization
• BP sizes chosen to maximize a system performance metric, e.g., throughput
• Use a simple greedy algorithm
• Considered 2 cost functions:
  – Maximize hit rate (i.e., minimize miss rate)
  – Minimize data access time (physical reads don’t all cost the same!)
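One way a simple greedy sizing algorithm with a pluggable cost function might look is sketched below. The slides do not give the algorithm's details, so the page-moving step, the miss-rate curve, and the pool fields `accesses`, `ws` and `read_cost` are assumptions made for illustration only.

```python
def miss_rate(pool, size):
    # Hypothetical miss-rate curve: misses fall as the pool grows.
    return pool["ws"] / (pool["ws"] + size)


def miss_cost(pools, sizes):
    # Total expected misses; minimizing this maximizes the system hit rate.
    return sum(p["accesses"] * miss_rate(p, s) for p, s in zip(pools, sizes))


def access_time_cost(pools, sizes):
    # Misses weighted by each pool's (assumed) cost per physical read.
    return sum(p["accesses"] * miss_rate(p, s) * p["read_cost"]
               for p, s in zip(pools, sizes))


def greedy_size(pools, sizes, cost, step=1000, min_size=1000):
    """Move `step` pages at a time while some move lowers the cost."""
    sizes = list(sizes)
    while True:
        best, best_cost = None, cost(pools, sizes)
        for src in range(len(sizes)):
            if sizes[src] - step < min_size:
                continue
            for dst in range(len(sizes)):
                if dst == src:
                    continue
                trial = list(sizes)
                trial[src] -= step
                trial[dst] += step
                trial_cost = cost(pools, trial)
                if trial_cost < best_cost:
                    best, best_cost = trial, trial_cost
        if best is None:
            return sizes  # no single move improves the cost
        sizes = best


# Two hypothetical pools with the same reference behaviour but different read costs.
pools = [
    {"accesses": 1000, "ws": 10000, "read_cost": 1.0},
    {"accesses": 1000, "ws": 10000, "read_cost": 4.0},
]
by_hit_rate = greedy_size(pools, [25000, 25000], miss_cost)
by_access_time = greedy_size(pools, [25000, 25000], access_time_cost)
```

In this toy setup the hit-rate cost leaves the even split unchanged, while the access-time cost shifts pages toward the pool whose physical reads are more expensive.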
System-based Optimization - Experiments
• Experimental environment
  – IBM xSeries 240 PC Server: 2 × 1 GHz PIII CPUs, 2 GB RAM, 22 disks, Windows NT
  – TPC-C benchmark
  – DB2 Version 7.1
• 50,000 4K buffer pool pages
• 3 buffer pools configured with BPConfig
Experiments (cont.)
           DAT-Based              HR-Based
BP Sizing  <25000, 4000, 21000>   <19000, 5000, 26000>
WHR        0.9308                 0.9342
WcostLR    1.5375                 1.5639
TPM        4493                   4318
[Figure: two plots of TPM (y-axis, 1500 to 5000), one against System Hit Rate (WHR) (x-axis, 0.84 to 0.96) and one against System costLR (WcostLR) (x-axis, 1 to 4)]
Other AutoDBA Projects
• Automatic diagnosis
• Automatic recognition of workload type
• Integration of BPConfig and sizing algorithm
• Automatic BP management in PostgreSQL
• Tools for DBMS capacity planning