Copyright 2016 Severalnines AB
Your host & some logistics
I'm Jean-Jérôme from the Severalnines Team and I'm your host for today's webinar!
Feel free to ask any questions in the Questions section of this application or via the Chat box.
You can also contact me directly via the chat box or via email: [email protected] during or after the webinar.
About Severalnines and ClusterControl
What we do
- Manage
- Scale
- Monitor
- Deploy
ClusterControl Automation & Management
- Provisioning
  - Deploy a cluster in minutes
  - On-premises or in the cloud (AWS)
- Monitoring
  - Systems view
  - 1sec resolution
  - DB / OS stats & performance advisors
  - Configurable dashboards
  - Query Analyzer
  - Real-time / historical
- Management
  - Multi cluster/data-center
  - Automate repair/recovery
  - Database upgrades
  - Backups
  - Configuration management
  - Cloning
  - One-click scaling
Supported Databases
Customers
MySQL Query Tuning - Hinting the optimizer and improving query performance
October 25, 2016
Krzysztof Książek
Severalnines
Agenda
- InnoDB index statistics
- MySQL cost model
- Hints
  - Index hints
  - Legacy optimizer hints syntax
  - New (5.7) optimizer hint syntax
- Optimizing SQL
InnoDB index statistics
InnoDB index statistics
- The query execution plan is calculated based on InnoDB index statistics
- Before MySQL 5.6, the default behavior is that statistics are recalculated when:
  - ANALYZE TABLE has been explicitly executed
  - SHOW TABLE STATUS, SHOW TABLES or SHOW INDEX were executed
  - Either 1/16th of the rows or 2 billion rows were modified in a table
- To calculate statistics, InnoDB samples 8 index pages
  - That is 128KB of data (8 pages of 16KB each) used to calculate stats for an index that may be, e.g., 100GB in size
  - Use innodb_stats_transient_sample_pages to change that
- The query execution plan may change after statistics recalculation
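A minimal sketch of how the sampling described above could be adjusted and the statistics refreshed. It assumes the Sakila sample database (used elsewhere in this deck) and a server where statistics for the table are non-persistent; the value 64 is an arbitrary illustration, not a recommendation.

```sql
-- Check how many index pages are sampled for non-persistent statistics (default: 8)
SHOW GLOBAL VARIABLES LIKE 'innodb_stats_transient_sample_pages';

-- Sample more pages per index; 64 is an arbitrary example value
SET GLOBAL innodb_stats_transient_sample_pages = 64;

-- Recalculate statistics explicitly for one table (sakila.film_actor is an assumed example table)
ANALYZE TABLE sakila.film_actor;

-- Inspect the resulting cardinality estimates per index
SHOW INDEX FROM sakila.film_actor;
```

A larger sample tends to give more accurate estimates, at the price of more I/O every time the statistics are recalculated.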
InnoDB index statistics
- In MySQL 5.6, statistics became persistent by default
  - They are not recalculated for every SHOW TABLE STATUS and similar commands
  - They are updated when an explicit ANALYZE TABLE is run on the table or when more than 10% of the rows in the table were modified
- As a result, query execution plans became more stable
- Statistics are also calculated from a larger sample: 20 index pages
  - Manageable through the innodb_stats_persistent_sample_pages variable
- You can disable persistent statistics using innodb_stats_persistent
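The persistent statistics settings above can be sketched as follows. This is only an illustration: the Sakila schema and the sample size of 100 pages are assumptions chosen for the example.

```sql
-- Server-wide default sample size for persistent statistics (default: 20)
SET GLOBAL innodb_stats_persistent_sample_pages = 100;

-- Per-table override; sakila.rental is an assumed example table
ALTER TABLE sakila.rental
  STATS_PERSISTENT = 1,
  STATS_AUTO_RECALC = 1,
  STATS_SAMPLE_PAGES = 100;

-- Force a recalculation with the new sample size
ANALYZE TABLE sakila.rental;

-- Persistent statistics are stored in regular tables and can be inspected
SELECT index_name, stat_name, stat_value, sample_size
FROM mysql.innodb_index_stats
WHERE database_name = 'sakila' AND table_name = 'rental';
```

The per-table options make it possible to sample large or skewed tables more thoroughly without raising the cost of ANALYZE TABLE for every table on the server.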
MySQL cost model
MySQL cost model
- To determine the most efficient query execution plan, MySQL has to assess the costs of different plans
  - The least expensive one is picked
- Each operation (reading data from memory or from disk, creating a temporary table in memory or on disk, comparing rows, evaluating row conditions) has its own cost assigned
- Historically, those numbers were hardcoded and couldn’t be changed
- This changed with MySQL 5.7: new tables were added in the mysql schema
  - server_cost
  - engine_cost
MySQL cost model
- disk_temptable_create_cost, disk_temptable_row_cost: cost to create and maintain an on-disk temporary table (defaults: 40 and 1)
- memory_temptable_create_cost, memory_temptable_row_cost: cost to create and maintain an in-memory temporary table (defaults: 2 and 0.2)
- key_compare_cost: cost to compare record keys; the more expensive it is, the less likely a filesort will be used (default: 0.1)
- row_evaluate_cost: cost to evaluate rows; the more expensive it is, the more likely an index will be used for the scan (default: 0.2)
MySQL cost model
- engine_name: InnoDB/MyISAM; by default all engines are affected
- device_type: not used, but in the future you could have different costs for different types of I/O devices
- io_block_read_cost: cost of reading an index or data page from disk (default: 1)
- memory_block_read_cost: cost of reading an index or data page from memory (default: 1)
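As a quick illustration, the cost entries listed above can be read directly from the two tables in the mysql schema on a MySQL 5.7 server. Nothing is modified here.

```sql
-- Server-level costs: temporary tables, key comparison, row evaluation
SELECT cost_name, cost_value, comment
FROM mysql.server_cost
ORDER BY cost_name;

-- Engine-level costs: reading blocks from disk and from memory
SELECT engine_name, device_type, cost_name, cost_value
FROM mysql.engine_cost
ORDER BY engine_name, cost_name;
```

A cost_value of NULL means the built-in default is in effect for that entry.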
MySQL cost model
- The optimizer is undergoing refactoring and rewriting; new features will follow
- Even now, in MySQL 5.7, you can modify costs which used to be hardcoded and tweak them according to your hardware
  - Disk operations will be less expensive on PCIe SSD than on spindles
  - You can tweak engine_cost and server_cost to reflect that
- You can always revert your changes by updating the cost values to NULL
- Make sure you run FLUSH OPTIMIZER_COSTS; to apply your changes
- Use SHOW STATUS LIKE 'Last_query_cost'; to check the cost of the last executed query
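The tuning workflow above could look roughly like this. It is a sketch under assumptions: the new cost of 0.5 is an arbitrary value standing in for "faster storage", and the test query against the Sakila rental table is only an example.

```sql
-- Make disk reads cheaper for the optimizer; 0.5 is an arbitrary example value
UPDATE mysql.engine_cost
SET cost_value = 0.5
WHERE cost_name = 'io_block_read_cost';

-- Load the modified costs; they apply to sessions started after this point
FLUSH OPTIMIZER_COSTS;

-- In a new session: run a query and check the cost the optimizer assigned to it
SELECT COUNT(*) FROM sakila.rental WHERE customer_id = 1;
SHOW STATUS LIKE 'Last_query_cost';

-- Revert to the built-in default
UPDATE mysql.engine_cost
SET cost_value = NULL
WHERE cost_name = 'io_block_read_cost';
FLUSH OPTIMIZER_COSTS;
```

Because existing sessions keep the costs they started with, reconnect before comparing Last_query_cost values.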
Index hints
Index hints
- USE INDEX: tells the optimizer to consider only the listed indexes
- FORCE INDEX: like USE INDEX, but a full table scan is marked as an extremely expensive operation, so the optimizer won’t use it as long as any of the listed indexes can be used for the particular query
- IGNORE INDEX: tells the optimizer which indexes we don’t want it to consider
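A short sketch of the three hints on the Sakila actor table. The schema and the index name idx_actor_last_name come from the stock Sakila database, which is assumed here; the literal values are arbitrary.

```sql
-- Consider only the listed index (a table scan is still possible)
EXPLAIN SELECT actor_id, first_name, last_name
FROM sakila.actor USE INDEX (idx_actor_last_name)
WHERE last_name = 'WILLIAMS';

-- Treat a table scan as very expensive, so the index is used whenever it can be
EXPLAIN SELECT actor_id, first_name, last_name
FROM sakila.actor FORCE INDEX (idx_actor_last_name)
WHERE last_name LIKE 'W%';

-- Exclude the index from consideration
EXPLAIN SELECT actor_id, first_name, last_name
FROM sakila.actor IGNORE INDEX (idx_actor_last_name)
WHERE last_name = 'WILLIAMS';
```

Comparing the key and rows columns of the three EXPLAIN outputs shows how each hint changes the plan.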
Index hints
- Hints can be located in different places
  - JOIN actor AS a IGNORE INDEX FOR JOIN (idx_actor_last_name)
  - FORCE INDEX FOR ORDER BY (idx_actor_first_name)
- The following options are available:
  - FORCE INDEX FOR JOIN (idx_myindex)
  - FORCE INDEX FOR ORDER BY (idx_myindex)
  - FORCE INDEX FOR GROUP BY (idx_myindex)
  - FORCE INDEX (idx_myindex) covers all of the above
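To show where such scoped hints sit in a complete statement, here is a sketch against the Sakila schema (an assumption). It uses idx_actor_last_name, which exists in stock Sakila, for both scopes.

```sql
-- The hint follows the table alias and only affects how the index is used for the join
EXPLAIN SELECT a.actor_id, a.last_name
FROM sakila.film_actor AS fa
JOIN sakila.actor AS a IGNORE INDEX FOR JOIN (idx_actor_last_name)
  ON fa.actor_id = a.actor_id
WHERE fa.film_id = 1;

-- The hint only affects how rows are sorted for ORDER BY
EXPLAIN SELECT actor_id, last_name
FROM sakila.actor FORCE INDEX FOR ORDER BY (idx_actor_last_name)
ORDER BY last_name;
```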
Index hints
- When you execute a query with JOINs, the MySQL optimizer has to decide the order in which the tables should be joined
  - The result is not always optimal
- STRAIGHT_JOIN can be used to force the order in which tables will be joined
  - Works for JOIN only; LEFT and RIGHT JOINs already enforce some order
- Let’s assume this query on the Sakila database:
  EXPLAIN SELECT actor_id, title FROM film_actor AS fa JOIN film AS f ON fa.film_id = f.film_id ORDER BY fa.actor_id\G
Index hints - join order modifiers
- Let’s say we want to avoid a temporary table
- The following query will do the trick (note that STRAIGHT_JOIN is used):
  EXPLAIN SELECT STRAIGHT_JOIN actor_id, title FROM film_actor AS fa JOIN film AS f ON fa.film_id = f.film_id ORDER BY fa.actor_id\G
- Tables will be joined in the order film_actor -> film
Index hints - join order modifiers
- You can also manipulate the join order within the query
- SELECT STRAIGHT_JOIN * FROM tab1 JOIN tab2 ON tab1.a = tab2.a JOIN tab3 ON tab2.b = tab3.b;
  - The only join order available to the optimizer is: tab1, tab2, tab3
- SELECT * FROM tab1 JOIN tab2 ON tab1.a = tab2.a STRAIGHT_JOIN tab3 ON tab2.b = tab3.b;
  - Two different join orders are possible now:
    - tab1, tab2, tab3
    - tab2, tab3, tab1
Controlling the optimizer - optimizer switch
Controlling the optimizer - optimizer switch
- Over time, the MySQL optimizer has been improved and new algorithms were added
- MariaDB added its own set of optimizations and optimizer features
- Some of those features can be disabled by the user at the global and session level
  - SET GLOBAL optimizer_switch='index_merge=off';
  - SET SESSION optimizer_switch='index_merge=off';
- Sometimes this is the only way to make sure your query will be executed in an optimal way
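A small sketch of a session-scoped workflow, so that the change affects only the connection running the problematic query. Which flag to toggle depends on the query; index_merge is simply the flag used on the slide.

```sql
-- List all optimizer flags and their current state
SELECT @@optimizer_switch;

-- Disable index merge for this session only
SET SESSION optimizer_switch = 'index_merge=off';

-- ... run the affected query here ...

-- Restore the flag to its default value afterwards
SET SESSION optimizer_switch = 'index_merge=default';
```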
Controlling the optimizer - optimizer hints (5.7)
Controlling the optimizer - optimizer hints (5.7)
- As of MySQL 5.7.7, a new way of controlling the optimizer has been added
- Hints use the /*+ … */ syntax within the query
- They take precedence over the optimizer_switch variable
- They work on multiple levels:
  - Global
  - Query block
  - Table
  - Index
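To make the levels more concrete, here is a hedged sketch of a query-block-level and a table-level hint. The Sakila schema and the query block name subq are assumptions made for the example, and the semi-join hints require a 5.7 release that includes them.

```sql
USE sakila;

-- Query block level: name the subquery block, then refer to it from the outer hint
EXPLAIN SELECT /*+ NO_SEMIJOIN(@subq) */ title
FROM film
WHERE film_id IN (SELECT /*+ QB_NAME(subq) */ film_id
                  FROM film_actor
                  WHERE actor_id = 1);

-- Table level: disable Block Nested-Loop join buffering for one table
EXPLAIN SELECT /*+ NO_BNL(f) */ f.title, fa.actor_id
FROM film_actor AS fa
JOIN film AS f ON f.film_id = fa.film_id;
```

Table names inside a hint refer to the alias used in the query when one is given, which is why the second hint names f rather than film.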
Controlling the optimizer - optimizer hints (5.7)
Hint Name              | Description                                              | Applicable Scopes
-----------------------+----------------------------------------------------------+-------------------
BKA, NO_BKA            | Affects Batched Key Access join processing               | Query block, table
BNL, NO_BNL            | Affects Block Nested-Loop join processing                | Query block, table
MAX_EXECUTION_TIME     | Limits statement execution time                          | Global
MRR, NO_MRR            | Affects Multi-Range Read optimization                    | Table, index
NO_ICP                 | Affects Index Condition Pushdown optimization            | Table, index
NO_RANGE_OPTIMIZATION  | Affects range optimization                               | Table, index
QB_NAME                | Assigns name to query block                              | Query block
SEMIJOIN, NO_SEMIJOIN  | Affects semi-join strategies                             | Query block
SUBQUERY               | Affects materialization, IN-to-EXISTS subquery strategies | Query block
Controlling the optimizer - optimizer hints (5.7)
- Can be used at the beginning of a statement:
  - SELECT /*+ ... */ ...
  - INSERT /*+ ... */ ...
  - REPLACE /*+ ... */ ...
  - UPDATE /*+ ... */ ...
  - DELETE /*+ ... */ ...
- Can be used in subqueries:
  - (SELECT /*+ ... */ ... )
  - (SELECT ... ) UNION (SELECT /*+ ... */ ... )
  - (SELECT /*+ ... */ ... ) UNION (SELECT /*+ ... */ ... )
  - UPDATE ... WHERE x IN (SELECT /*+ ... */ ...)
  - INSERT ... SELECT /*+ ... */ ...
Controlling the optimizer - optimizer hints (5.7)
- Can be used on a table level:
  - SELECT /*+ NO_BKA(t1, t2) */ t1.* FROM t1 INNER JOIN t2 INNER JOIN t3;
  - SELECT /*+ NO_BNL() BKA(t1) */ t1.* FROM t1 INNER JOIN t2 INNER JOIN t3;
- Can be used on an index level:
  - SELECT /*+ MRR(t1) */ * FROM t1 WHERE f2 <= 3 AND 3 <= f3;
  - SELECT /*+ NO_RANGE_OPTIMIZATION(t3 PRIMARY, f2_idx) */ f1 FROM t3 WHERE f1 > 30 AND f1 < 33;
  - INSERT INTO t3(f1, f2, f3) (SELECT /*+ NO_ICP(t2) */ t2.f1, t2.f2, t2.f3 FROM t1,t2 WHERE t1.f1=t2.f1 AND t2.f2 BETWEEN t1.f1 AND t1.f2 AND t2.f2 + 1 >= t1.f1 + 1);
Controlling the optimizer - optimizer hints (5.7)
- SELECT /*+ MAX_EXECUTION_TIME(1000) */ * …
  - The value is given in milliseconds and applies to the whole SELECT query
  - Only applies to read-only SELECTs (does not apply to SELECTs which invoke a stored routine)
  - Does not apply to SELECTs in stored routines
- A very convenient way of adding safety: if you are not sure how long a query will take, limit its maximum execution time
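A minimal sketch of the safety net described above, assuming the Sakila schema. The limits of 1000 and 2000 milliseconds are arbitrary, and the session variable shown in the second statement is assumed to be available (it exists in later 5.7 releases).

```sql
-- Per-statement limit: interrupt this SELECT if it runs longer than 1 second
SELECT /*+ MAX_EXECUTION_TIME(1000) */ COUNT(*)
FROM sakila.rental AS r
JOIN sakila.payment AS p ON p.rental_id = r.rental_id;

-- Session-wide default for read-only SELECTs that carry no hint (milliseconds)
SET SESSION max_execution_time = 2000;
```

When the limit is exceeded, the statement is interrupted and the client receives an error, so the application has to be prepared to handle it.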
Pros and cons of using hints
- Hints enable you to fix optimizer mistakes
- Sometimes it’s the fastest way of solving a performance issue
  - Faster than, for example, adding an index
- They allow you to disable some parts of the optimizer’s functionality
- Hardcoded hints can become a problem:
  - When you remove indexes (ERROR 1176 (42000): Key 'idx_b' doesn't exist in table 'tab')
  - When you upgrade to the next major MySQL version (hint syntax may change)
  - When data distribution changes and a different plan becomes optimal
Query tuning - optimizing SQL
Optimizing SQL
- MySQL is getting better at executing queries with every release
- What didn’t work in the past may work better in the latest version. Subqueries, for example
- MySQL 5.5 vs. MySQL 5.6/5.7: [side-by-side figures omitted]
Optimizing SQL
- MySQL 5.5 usually requires rewriting a subquery as a JOIN
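As an illustration of such a rewrite (the slide's own example is not available here), consider this pair of queries on the Sakila schema, which is an assumption. Both return the titles of films in which actor 1 appears.

```sql
-- Subquery form: may be executed inefficiently on MySQL 5.5
SELECT title
FROM sakila.film
WHERE film_id IN (SELECT film_id
                  FROM sakila.film_actor
                  WHERE actor_id = 1);

-- Equivalent JOIN form
SELECT f.title
FROM sakila.film AS f
JOIN sakila.film_actor AS fa ON fa.film_id = f.film_id
WHERE fa.actor_id = 1;
```

The two forms are equivalent here because (actor_id, film_id) is the primary key of film_actor, so the join cannot produce duplicate films. When the inner table can match a row more than once, the JOIN form needs DISTINCT to return the same result.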
Optimizing SQL
- When looking at JOIN queries, make sure the columns used to join tables are properly indexed
  - Joins on non-indexed columns are the most common SQL anti-pattern, and the most expensive one too
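The effect can be checked with EXPLAIN before and after adding the indexes. The sketch below uses two hypothetical tables, customers and orders, invented for the illustration; run it in a scratch schema.

```sql
-- Hypothetical tables joined on columns that are not indexed
CREATE TABLE customers (
  customer_id  INT UNSIGNED NOT NULL PRIMARY KEY,
  external_ref INT UNSIGNED NOT NULL,
  name         VARCHAR(100) NOT NULL
) ENGINE=InnoDB;

CREATE TABLE orders (
  order_id     INT UNSIGNED NOT NULL PRIMARY KEY,
  customer_ref INT UNSIGNED NOT NULL,
  total        DECIMAL(10,2) NOT NULL
) ENGINE=InnoDB;

-- Before: neither join column is indexed, so expect full scans of both tables
EXPLAIN SELECT c.name, o.total
FROM customers AS c
JOIN orders AS o ON o.customer_ref = c.external_ref;

-- Index the join columns
ALTER TABLE orders ADD INDEX idx_customer_ref (customer_ref);
ALTER TABLE customers ADD INDEX idx_external_ref (external_ref);

-- After: the second table in the join order can be reached through an index lookup
EXPLAIN SELECT c.name, o.total
FROM customers AS c
JOIN orders AS o ON o.customer_ref = c.external_ref;
```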
Optimizing SQL
- After indexes have been added: [figure omitted]
Optimizing SQL
- Be aware of LIMIT: it may not actually limit the number of rows scanned; use ranges instead
  - LIMIT 9000,10 would access 9010 rows
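A sketch of the range-based alternative on the Sakila rental table (an assumption). The value 9000 in the second query stands for the last rental_id the application saw on the previous page; it is a hypothetical, application-tracked value.

```sql
-- Offset-based paging: reads and discards 9000 rows before returning 10
SELECT rental_id, rental_date
FROM sakila.rental
ORDER BY rental_id
LIMIT 9000, 10;

-- Range-based paging: starts directly after the last id already seen
SELECT rental_id, rental_date
FROM sakila.rental
WHERE rental_id > 9000
ORDER BY rental_id
LIMIT 10;
```

The two queries return the same rows only when the ids have no gaps, which is why the application should remember the last id it received rather than compute it from the page number.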
Optimizing SQL
- When using UNION in your query, use UNION ALL whenever duplicates do not need to be removed; otherwise a DISTINCT clause is implied and it requires additional processing of the data: a temporary table is created with an index on it
- Before MySQL 5.7, UNION ALL also requires a temporary table, but no index is created; MySQL 5.7 removed that temporary table
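The difference can be seen by comparing the plans of the two variants. The queries below are an illustration on the Sakila schema (an assumption).

```sql
-- UNION implies DISTINCT: duplicates are removed through a temporary table
EXPLAIN SELECT first_name, last_name FROM sakila.actor
UNION
SELECT first_name, last_name FROM sakila.customer;

-- UNION ALL keeps duplicates and skips the deduplication step
EXPLAIN SELECT first_name, last_name FROM sakila.actor
UNION ALL
SELECT first_name, last_name FROM sakila.customer;
```

UNION ALL is only a drop-in replacement when duplicates are acceptable or cannot occur, since the two statements may return different row counts.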
Optimizing SQL
- MySQL 5.6 vs. MySQL 5.7: [figures omitted]
Optimizing SQL
- For GROUP BY and ORDER BY, try to make sure an index is used; otherwise a temporary table will be created or a filesort has to be performed
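A sketch of how this can be checked with EXPLAIN on the Sakila rental table (an assumption). In stock Sakila, customer_id is indexed (idx_fk_customer_id) while return_date is not.

```sql
-- Grouping on an indexed column: the index can deliver rows already grouped
EXPLAIN SELECT customer_id, COUNT(*)
FROM sakila.rental
GROUP BY customer_id;

-- Grouping on a non-indexed column: look for a temporary table or filesort in the Extra column
EXPLAIN SELECT return_date, COUNT(*)
FROM sakila.rental
GROUP BY return_date;

-- Sorting on a non-indexed column: look for a filesort in the Extra column
EXPLAIN SELECT rental_id, return_date
FROM sakila.rental
ORDER BY return_date;
```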
Optimizing SQL
- When GROUP BY or ORDER BY is used in a JOIN query, try to sort and aggregate using columns from a single table only, since such a case can be indexed. If you GROUP BY or ORDER BY using columns from both tables, it can’t be indexed
- The only way MySQL can use multiple indexes (and only from the same table) is through index merge
  - And it’s not the fastest way of retrieving the data (more details on another slide)
- There’s definitely no way to use indexes across multiple tables
Optimizing SQL
- Avoid ORDER BY RAND(): it will create a temporary table
  - Always
  - ORDER BY RAND() is evil
- In the application, generate random numbers from the MIN(pk), MAX(pk) range
  - Use them in WHERE pk=… or pk IN ( … )
  - PK lookup: fast and efficient
  - Verify you got the correct number of rows; if not, repeat the process
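The SQL side of this approach could look like the sketch below, using the Sakila actor table as an assumed example. The five ids in the IN list are hypothetical values that the application would draw at random from the range returned by the first query.

```sql
-- Step 1: fetch the primary key range once (it can be cached by the application)
SELECT MIN(actor_id) AS min_id, MAX(actor_id) AS max_id
FROM sakila.actor;

-- Step 2: the application draws random ids within [min_id, max_id];
-- the values below are hypothetical examples of such a draw
SELECT actor_id, first_name, last_name
FROM sakila.actor
WHERE actor_id IN (17, 42, 113, 158, 199);

-- Step 3: if fewer rows than requested come back (gaps in the id sequence),
-- the application draws additional ids and repeats step 2
```

Each lookup is a primary key access, so the cost stays proportional to the number of rows requested rather than to the size of the table.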
Optimizing SQL
- Parallelize queries and aggregate the results within the application: MySQL cannot use multiple cores per query (although there are worklogs regarding that, so it may change in the future)
  - Use home-grown scripts
  - Use https://shardquery.com
- Parallel processing may not always be feasible, but where it can be used, it can speed up data processing significantly
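One simple way to split the work is by primary key range, with each chunk running on its own connection. The sketch below assumes the Sakila payment table, and the boundary of 8000 is an arbitrary split point.

```sql
-- Connection 1: first chunk of the primary key range
SELECT COUNT(*) AS cnt, SUM(amount) AS total
FROM sakila.payment
WHERE payment_id BETWEEN 1 AND 8000;

-- Connection 2: second chunk, executed at the same time
SELECT COUNT(*) AS cnt, SUM(amount) AS total
FROM sakila.payment
WHERE payment_id > 8000;

-- The application adds up cnt and total from all chunks;
-- an average is derived as SUM(total) / SUM(cnt), not by averaging per-chunk averages
```

Chunks of similar size keep the workers evenly loaded, so the boundaries are best derived from the actual key distribution rather than fixed in advance.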
Thank You!
- Blog posts covering the query tuning process:
  - http://severalnines.com/blog/become-mysql-dba-blog-series-optimizer-hints-faster-query-execution
- Register for other upcoming webinars:
  - http://severalnines.com/upcoming-webinars
- Install ClusterControl:
  - http://severalnines.com/getting-started
- Contact: [email protected]