Automatic Database Management System Tuning Through Large-scale Machine Learning
Dana Van Aken, Andrew Pavlo, Geoffrey J. Gordon, Bohan Zhang
[image source]
OtterTune
DBMS Tuning
• Tuning a DBMS’s configuration knobs knobs for the application’s workload and hardware is critical for performance
• Tuning even one DBMS deployment is hard…
2
0.20.4
0.60.8
0.51.01.52.02.5
0.0
1.0
2.0
Bufferpoolsize
(GB)
Logfilesize(GB)1.0
Challenge #1: Dependencies
3
99th %-tile latency (sec)
lower is better
MySQL v5.6, YCSB: update-heavy workload VM - 2GB RAM, 2 vCPUs
Bufferpoolsize(GB)3.01.00.0 2.0
0.0
1.0
2.0
3.0
Challenge #2: Continuous Settings
4
MySQL v5.6, YCSB: update-heavy workload VM - 2GB RAM, 2 vCPUs
99th %-tile latency (sec)
lower is better
Challenge #3: Non-reusable Configs
5
Workload#1 Workload#2 Workload#3
0.0
2.0
4.0
6.0
Config#2Config#1 Config#3
MySQL v5.6, 3 different YCSB workloads
99th %-tile latency (sec)
lower is better
8x
30x
Challenge #4: Tuning Complexity
6
Number of configuration knobs in MySQL and Postgres releases over the last 16 years
MySQL Postgres
20122004 20162000 2008
600
400
200
0
Num
bero
fkno
bs
5x
7x
539
291
Existing Solutions
7
Auto-tunersHeuristic-based; DBMS-specific;
“Smart” Auto-tunersResearch only; no data reuse;
OtterTune
Reuse historical performance data from tuning past DBMS deployments to tune new DBMS deployments
8
Train ML models to: 1. Identify the most impactful knobs
2. Find past workloads that are similar to the new workload
3. Recommend knob settings that improve a target objective (e.g., latency, throughput)
Tuning Manager
Data Repository
ML Models
Controller
DBMS1
• User specifies the target objective1
2
• Controller starts the first observation period- Measures external metrics, then collects DBMS-specific internal metrics- Returns: metrics, current knob configuration
2
• Tuning manager saves the data and computes the next configuration3- Returns: next configuration, expected improvement
3
• Controller installs next configuration on DBMS
4
4
Assumptions & Limitations
• All DBMS instances run on the same hardware
• Blacklist for knobs that should NOT be tuned
10
Machine Learning Pipeline
11
Workload Characterization Knob Identification
Sam
ples
Metrics
Sam
ples
Knobs Knobs
Distinct Metrics
Importance
Automatic Tuner
Workload Characterization
• Goal: find a set of metrics that best characterize an application’s workload
• Method: Use the metrics stored in OtterTune’s data repository
12
...
...
Internal DBMS Metrics:- Directly affected by the knobs’ settings
- Problem: redundancy - Same but different units - Highly correlated
- Solution: prune them - Factor analysis to capture correlation patterns
- K-means to group correlated metrics
# buffer pool misses
total # buffer pool requests
Buffer pool size is too small:
Sample Metric Clusters
innodb_data_reads innodb_buffer_pool_reads
innodb_pages_read …
14
MySQL (v5.6)
questions table_locks_immediate table_open_cache_hits
…
Postgres (v9.3)
blks_read heap_blks_read idx_blks_read
…
buffers_backend blk_write_time buffers_clean
…
Knob Identification
• Goal: identify which knobs affect the DBMS’s performance
• Method: feature selection technique to order the knobs by importance
15
Configuration Knobs:- Knobs have varying degrees of impact on the DBMS’s performance
- Some have high impact - Some have no impact - For many, it depends on the workload
- Problem: which knobs matter?
- Solution: Incremental knob selection to gradually increase the number of knobs tuned during a session
...
...
...
- Solution: Lasso to determine the order of importance
- Problem: how many to tune?
Top 5 Configuration Knobs17
1. innodb_buffer_pool_size
2. innodb_flush_method
3. innodb_log_file_size
4. innodb_thread_concurrency
5. innodb_max_dirty_pages-
_pct_lwm
1. shared_buffers
2. checkpoint_segments
3. effective_cache_size
4. bgwriter_lru_maxpages
5. bgwriter_delay
MySQL (v5.6) Postgres (v9.3)
Step 1: Workload Mapping
• Goal: Match the target DBMS’s workload to the most similar known workload in the data repository
• Method: For each known workload, compute a similarity score by comparing the performance measurements
19
- Problem: deciding which
Known WorkloadsTarget DBMS’s Workload
Target Workload
Data
p99 latency throughput innodb_data_writes handler_read_key …
792 512 46044 526784 …
Knob Config X
Similarity score: average distance in performance measurements over all metrics
Workload A
- Problem: computing the score with raw metrics would be unfair
Workload B
792 512 46044 526784 …4 5 7 5 …
Bin metric values
- Solution: compute deciles, bin values- Problem: little overlap in the configurations tried between workloads
f(x)f(x)f(x)f(x)f(x)f(x)
- Solution: for each metric, train statistical model to predict performance
8 2 9 3 …3 5 8 4 …
Step 2: Recommendation
• Goal: recommend configurations in order to optimize the target metric over the entire tuning session
• Method: Reuse data from the similar workload in step #1 to train a statistical model and use it to optimize the next configuration to try
21
Recommendation Steps
• Use similar historical data & new data to train aGaussian Process model
• Predict the latency & variance for a sample set of possible configurations
• Optimize the next configuration to try, trading off: - Exploration: gathering information to improve the model - Exploitation: greedily trying to do well on the target
objective
22
0.40.2
0.60.8
0.51.01.52.02.5
0.0
1.0
2.0
Bufferpoolsize
(GB)
Logfilesize(GB)
99th%-tile
latenc
y(s)
Experimental Setup
DBMSs: MySQL (v5.6), Postgres (v9.3)
Training data collection:
• 15 YCSB workload mixtures
• ~30k trials per DBMS
Experiments conducted on Amazon EC2
23
Data vs. No Data24
2 hours of tuning time
OtterTune iTuned
TPC-C
36% 34%
Wikipedia
47% 42%
Wikipedia
40% 42%
TPC-C
46% 30%
MySQL (v5.6) Postgres (v9.3)
99th %-tile latency (sec)
lower is better
MySQL: Efficacy Comparison
25
0.0
0.5
1.0
1.5
0.71
0.32
0.74
0.30
1.64
Default OtterTune Tuning Script DBA RDS Config
0
250
500
750
562
736
508
686
165
99th percentile latency (sec) (lower is better)
Throughput (txn/sec) (higher is better)
OtterTune generates a knob configuration comparable to the
DBA!
Postgres: Efficacy Comparison
26
0.0
0.2
0.4
0.6
0.300.250.250.24
0.53
Default OtterTune Tuning Script DBA RDS Config
0
250
500
750
1000
714
843845946
426
99th percentile latency (sec) (lower is better)
Throughput (txn/sec) (higher is better)
OtterTune generates a knob configuration comparable to the
DBA!
Future Work
27
• Automatically detecting the hardware capabilities of the DBMS’s host machine
• Auto-tuning hierarchical components
• Multi-objective optimization