Copyright © 2014, Oracle and/or its affiliates. All rights reserved.
Manyi Lu
Senior Engineering Manager
MySQL Optimizer Team, Oracle
October 1, 2014
MySQL 5.7: What’s New in the Parser and the Optimizer?
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
MySQL Optimizer
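SELECT a, b FROM t1, t2, t3 WHERE t1.a = t2.b AND t2.b = t3.c AND t2.d > 20 AND t2.d < 30;
[Diagram: inside the MySQL Server, the query passes through the Parser to the Optimizer. The Optimizer applies cost-based optimizations and heuristics, using the cost model, table/index info (data dictionary), and statistics (storage engines). The resulting plan is a tree of two JOINs over t1, t2, and t3, with table scan, range scan, and ref access as access methods.]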
MySQL Optimizer: Design Principles
• Best out-of-the-box performance
• Easy to use, minimum tuning needed
• When you need to understand: explain and trace
• Flexibility through optimizer switches, hints and plugins
• Fast evolving
MySQL 5.7 Parser and Optimizer Improvements
• Parser and optimizer refactoring
• Improved cost model: better record estimation for JOIN
• Improved cost model: configurable cost constants
• Query rewrite plugin
• Explain on a running query
MySQL 5.7 Optimizer Improvements
• Computed columns
• UNION ALL queries no longer use temporary tables
• Improved optimizations for queries with IN expressions
• Optimized full text search
5.7 Parser and Optimizer Refactoring
[Diagram: SQL DML query → Parser → Resolver (semantic check, name resolution) → Optimizer (logical transformations; cost-based optimizer: join order and access methods; plan refinement) → Query execution plan → Query execution → Query result. Query execution reads from the storage engine (InnoDB, MyISAM).]
Improves readability, maintainability and stability
– Cleanly separate the parsing, optimizing, and execution stages
– Allows for easier feature additions, with lessened risk
MySQL 5.7: Parser Refactoring
• Challenge:
– Overly complex, hard to add new syntax
• Solution:
– Create an internal parse tree bottom-up
– Create an AST (Abstract Syntax Tree) from the parse tree and the user's context
– Have syntax rules that are more precisely defined and are closer to the SQL standard
– More precise error messages
– Better support for larger syntax rules in the future
[Diagram: Parser (new) = Lexical Scanner (lexer) → GNU Bison-generated Parser (bottom-up parsing style) → Contextualization → AST; followed by Resolver → Optimizer → Executor → SE]
Motivation for Changing the Cost Model
• Adapt to new hardware architectures
– SSD, larger memories, caches
• Allow storage engines to provide accurate and dynamic cost estimates
– Is the data in RAM, on SSD, or on HDD?
• More maintainable cost model implementation
– Avoid hard-coded constants
– Refactoring of existing cost model code
• Tunable/configurable
• Replace heuristics with cost-based decisions
Cost Model: Main Focus in 5.7
Address the following pain points in the current cost model:
• Hard-coded cost constants
– Not possible to adjust for different hardware
• Imprecise cardinality/records-per-key estimates from SE
– Integer value gives too low precision
• Inaccurate record estimation for JOIN
– Too high fan-out
• Hard to obtain detailed cost numbers
MySQL 5.6: Record Estimates for JOIN
• t1 JOIN t2
• Total cost = cost(access method t1) + Prefix_rows_t1 * cost(access method t2)
• Prefix_rows_t1 is the number of records read from t1
– Overestimation if WHERE conditions apply! -> Suboptimal join order
[Diagram: without condition filtering. t1 → access method → t2; Prefix_rows_t1 = number of records read from t1]
MySQL 5.7 Improved Record Estimates for JOIN
• t1 JOIN t2
• Prefix_rows_t1 takes into account the entire query condition
– More accurate record estimate -> improved JOIN order
[Diagram: with condition filter. t1 → access method (number of records read from t1) → condition filter → t2; Prefix_rows_t1 = records passing the table conditions on t1]
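The difference between the two estimates can be sketched in a few lines of Python. This is only an illustration of the formula Total cost = cost(access method t1) + Prefix_rows_t1 * cost(access method t2); the function name, the cost numbers, and the 1% filter fraction are made-up assumptions, not values from the server.

```python
def join_cost(cost_t1, rows_read_t1, cost_t2_per_row, filter_fraction=1.0):
    """Total cost = cost(t1) + Prefix_rows_t1 * cost(t2 access per prefix row).

    filter_fraction=1.0 mimics 5.6 (every row read from t1 is counted);
    a smaller value mimics the 5.7 condition filter.
    All numbers passed in below are hypothetical.
    """
    prefix_rows_t1 = rows_read_t1 * filter_fraction
    return cost_t1 + prefix_rows_t1 * cost_t2_per_row


# Same plan, costed without and with condition filtering on t1.
cost_without_filter = join_cost(cost_t1=1000.0, rows_read_t1=10_000, cost_t2_per_row=1.0)
cost_with_filter = join_cost(cost_t1=1000.0, rows_read_t1=10_000, cost_t2_per_row=1.0,
                             filter_fraction=0.01)
print(cost_without_filter, cost_with_filter)
```

With the filter applied, starting the join from t1 looks far cheaper, which is why the optimizer may pick a different join order.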
MySQL 5.7 Improved Record Estimates for JOIN
• 10 000 rows in the emp table
• 100 rows in the office table
• 100 rows with first_name="John" AND hire_date BETWEEN "2012-01-01" AND "2012-06-01"
CREATE TABLE emp (
  id INTEGER NOT NULL PRIMARY KEY,
  office_id INTEGER NOT NULL,
  first_name VARCHAR(20),
  hire_date DATE NOT NULL,
  KEY office (office_id)
) ENGINE=InnoDB;
CREATE TABLE office (
  id INTEGER NOT NULL PRIMARY KEY,
  officename VARCHAR(20)
) ENGINE=InnoDB;
MySQL 5.7 Improved Record Estimates for JOIN
SELECT office_name FROM office JOIN employee ON office.id = employee.office WHERE employee.name LIKE "John" AND hire_date BETWEEN "2014-01-01" AND "2014-06-01";
Explain for 5.6: Total Cost = cost(scan office) + 100 * cost(ref_access emp)
Table     Type    Possible keys  Key      Ref              Rows  Filtered  Extra
office    ALL     PRIMARY        NULL     NULL             100   100.00    NULL
employee  ref     office         office   office.id        99    100.00    Using where
Explain for 5.7: Total Cost = cost(scan emp) + 9991 * 1.23% * cost(eq_ref_access office)
Table     Type    Possible keys  Key      Ref              Rows  Filtered  Extra
employee  ALL     NULL           NULL     NULL             9991  1.23      NULL
office    eq_ref  PRIMARY        PRIMARY  employee.office  1     100.00    Using where
JOIN ORDER HAS CHANGED!
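As a quick check of the numbers in the two EXPLAIN outputs, the following Python sketch multiplies Rows by Filtered for the first table of each plan. The helper name is hypothetical; the inputs are copied from the slide.

```python
def prefix_rows(rows, filtered_percent):
    """Rows from the first table that are expected to reach the second table."""
    return rows * filtered_percent / 100.0


# 5.6 plan: office is scanned first, no filtering (Filtered = 100.00).
lookups_56 = prefix_rows(100, 100.00)
# 5.7 plan: employee is scanned first, condition filter estimates 1.23%.
lookups_57 = prefix_rows(9991, 1.23)

print(f"5.6: {lookups_56:.0f} ref lookups into employee")
print(f"5.7: {lookups_57:.1f} eq_ref lookups into office")
```

The 5.7 plan performs roughly 123 single-row primary key lookups after one scan of employee, while the 5.6 plan performs 100 ref lookups that each return about 99 rows.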
MySQL 5.7 Improved Record Estimation for JOIN
Performance Improvements: DBT-3 (SF 10)
[Bar chart: execution time relative to 5.6, in percent (axis 0 to 120), comparing 5.6 and 5.7 for queries Q3, Q7, Q8, Q9, and Q12]
5 out of 22 queries get an improved query plan
MySQL 5.7 Additional Cost Data in JSON Explain
mysql> EXPLAIN FORMAT=JSON SELECT SUM(o_totalprice) FROM orders WHERE o_orderdate BETWEEN '1994-01-01' AND '1994-12-31';
{
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "3118848.00"
    },
    "table": {
      "table_name": "orders",
      "access_type": "ALL",
      "possible_keys": [ "i_o_orderdate" ],
      "rows_examined_per_scan": 15000000,
      "rows_produced_per_join": 4489990,
      "filtered": 29.933,
      "cost_info": {
        "read_cost": "2220850.00",
        "eval_cost": "897998.00",
        "prefix_cost": "3118848.00",
        "data_read_per_join": "582M"
      },
      "used_columns": [ "o_totalprice", "o_orderDATE" ],
      "attached_condition": "(`dbt3`.`orders`.`o_orderDATE` between '1994-01-01' and '1994-12-31')"
    }
  }
}
New cost data in the output:
• Total query cost of a query block
• Cost per table
• Cost of sorting operation
• Cost of reading data
• Cost of evaluating conditions
• Cost of prefix join
• Rows examined/produced per join
• Used columns
• Data read per join – (# of rows)*(record width) in bytes
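The cost fields in this output are related to each other, which a short Python sketch can verify. The JSON below is a trimmed copy of the slide's output that keeps only the fields used; the interpretation in the comments is read off these numbers, not taken from server documentation.

```python
import json

# Trimmed copy of the EXPLAIN FORMAT=JSON output shown on the slide.
explain_json = """
{
  "query_block": {
    "cost_info": { "query_cost": "3118848.00" },
    "table": {
      "table_name": "orders",
      "rows_examined_per_scan": 15000000,
      "rows_produced_per_join": 4489990,
      "filtered": 29.933,
      "cost_info": {
        "read_cost": "2220850.00",
        "eval_cost": "897998.00",
        "prefix_cost": "3118848.00"
      }
    }
  }
}
"""

block = json.loads(explain_json)["query_block"]
table = block["table"]
cost = {name: float(value) for name, value in table["cost_info"].items()}

# For this single-table query, read cost plus evaluation cost gives the prefix cost,
# which is also the total query cost.
total = cost["read_cost"] + cost["eval_cost"]
query_cost = float(block["cost_info"]["query_cost"])

# rows_produced_per_join is close to rows_examined_per_scan * filtered%
# (filtered is shown rounded, so the match is approximate).
estimated_rows = table["rows_examined_per_scan"] * table["filtered"] / 100.0

print(total, query_cost, estimated_rows)
```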
MySQL 5.7 Visual Explain in MySQL Workbench
MySQL 5.7: Why Query Rewrite Plugin?
• Problem
– Optimizer chooses a suboptimal plan
– Users can change the query plan by adding hints or rewriting the query
– However, database application code cannot be changed
• Solution: query rewrite plugin!
labs.mysql.com
MySQL 5.7: Query Rewrite Plugin
• New pre- and post-parse query rewrite APIs
– Users can write their own plug-ins
• Provides a post-parse query plugin
– Rewrite problematic queries without the need to make application changes
– Add hints
– Modify join order
– Many more …
• Improve problematic queries from ORMs, third party apps, etc.
• ~Zero performance overhead for queries not to be rewritten
labs.mysql.com
MySQL 5.7: How Do Rewrites Happen?
For query:
SELECT * FROM t1 JOIN t2 ON t1.keycol = t2.keycol WHERE col1 = 42 AND col2 = 2
Pattern is:
SELECT * FROM t1 JOIN t2 ON t1.keycol = t2.keycol WHERE col1 = ? AND col2 = ?
Replacement is:
SELECT a, b, c FROM t1 STRAIGHT_JOIN t2 FORCE INDEX (col1) ON t1.keycol = t2.keycol WHERE col1 = ? AND col2 = ?
Replace parameter markers in Replacement with actual literals:
SELECT a, b, c FROM t1 STRAIGHT_JOIN t2 FORCE INDEX (col1) ON t1.keycol = t2.keycol WHERE col1 = 42 AND col2 = 2
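The substitution of literals into the replacement can be imitated with a toy Python function. This is a text-level sketch only: it assumes integer literals and exact whitespace, and the function name is hypothetical. The real plugin works on digests and the parse tree rather than on regular expressions.

```python
import re


def rewrite(query, pattern, replacement):
    """Toy rewrite: if query matches pattern, copy its literals into replacement."""
    # Escape the pattern, then let every '?' capture one integer literal.
    regex = re.escape(pattern).replace(r"\?", r"(\d+)")
    match = re.fullmatch(regex, query)
    if match is None:
        return query  # no rule applies, the query is left untouched
    literals = iter(match.groups())
    # Fill the '?' markers of the replacement from left to right.
    return re.sub(r"\?", lambda _marker: next(literals), replacement)


pattern = "SELECT * FROM t1 JOIN t2 ON t1.keycol = t2.keycol WHERE col1 = ? AND col2 = ?"
replacement = ("SELECT a, b, c FROM t1 STRAIGHT_JOIN t2 FORCE INDEX (col1) "
               "ON t1.keycol = t2.keycol WHERE col1 = ? AND col2 = ?")
query = "SELECT * FROM t1 JOIN t2 ON t1.keycol = t2.keycol WHERE col1 = 42 AND col2 = 2"

rewritten = rewrite(query, pattern, replacement)
print(rewritten)
```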
MySQL 5.7: How Does Matching of Rules Happen?
Match and execute rule in three steps:
1. Hash lookup using query digest computed during parsing
– Finds patterns with same digest
2. Parse tree structure comparison
– To filter out hash collisions
– Will not detect differences in literals
3. Compare literal constants
– In practice done during rewrite
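The first two steps can be sketched in Python as a dictionary keyed by digest. Everything here is a stand-in chosen for illustration: the normalization is a simple regex, SHA-256 replaces the server's digest, and comparing normalized text replaces the parse tree comparison. Step 3 (comparing literal constants that appear in the pattern itself) is not modeled.

```python
import hashlib
import re


def normalize(query):
    """Toy normalization: collapse whitespace and turn integer literals into '?'."""
    collapsed = " ".join(query.split())
    return re.sub(r"\b\d+\b", "?", collapsed)


def digest(query):
    """Stand-in for the statement digest (the server computes its own)."""
    return hashlib.sha256(normalize(query).encode()).hexdigest()


rules = {}  # digest -> list of (normalized pattern, replacement)


def add_rule(pattern, replacement):
    rules.setdefault(digest(pattern), []).append((normalize(pattern), replacement))


def find_rule(query):
    # Step 1: hash lookup by digest.
    candidates = rules.get(digest(query), [])
    # Step 2: structure comparison to weed out digest collisions.
    normalized = normalize(query)
    for pattern, replacement in candidates:
        if pattern == normalized:
            return replacement
    return None


add_rule("SELECT * FROM t1 WHERE col1 = ?", "SELECT a FROM t1 FORCE INDEX (col1) WHERE col1 = ?")
hit = find_rule("SELECT *  FROM t1 WHERE col1 = 42")
miss = find_rule("SELECT * FROM t2 WHERE col1 = 42")
print(hit, miss)
```

Queries that match no rule cost one digest computation and one dictionary lookup in this sketch.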
MySQL 5.7: Communicating with the Plugin
query_rewrite.rewrite_rules table:
*************** 1. row ***************
Pattern: SELECT name, department_name FROM employee JOIN department USING ( department_id ) WHERE salary > ?
Pattern_database: employees
Replacement: SELECT name, department_name FROM employee STRAIGHT_JOIN department USING ( department_id ) WHERE salary > ?
Enabled: Y
Message: NULL
*************** 2. row ***************
Pattern: SELECT name, department_name FROXM employee JOIN department USING ( department_id ) WHERE salary > ?
Pattern_database: employees
Replacement: SELECT name, department_name FROM employee STRAIGHT_JOIN department USING ( department_id ) WHERE salary > ?
Enabled: N
Message: Parse error in pattern: …… near …… at line 1
*************** 3. row ***************
Pattern: SELECT name, department_name FROM employee JOIN department USING ( department_id ) WHERE salary > ?
Replacement: SELECT name, department_name FROXM employee STRAIGHT_JOIN department USING ( department_id ) WHERE salary > ?
Enabled: N
Message: Parse error in replacement … near … at line 1
MySQL 5.7 Query Rewrite Plug-in: Server’s POV
• Query comes in
– Plugin(s) is asked if it wants digests (it does)
• Query is parsed
• Plugin is invoked
• The plug-in may (in case of refresh of rules):
– Scan the rules table using the Rules Table Service. For each row:
  • Pattern + replacement parsed via the parser service
  • Pattern is traversed using the parser service
  • Parser service asked for string offsets of '?' in replacement
  • Parser service asked for normalized query text of pattern
  • performance_schema asked for digest
• The query is rewritten. Server raises SQL note.
1. Hash lookup using digest. High false positive rate.
2. Internal tree structure comparison. Misses literal constants.
3. Compare literal constants. In practice done during rewrite.
MySQL 5.7 Query Rewrite Plugin: Performance Impact
What is the cost of rewriting queries?
• Designed for rewriting problematic queries only!
• ~Zero cost for queries not to be rewritten
– Statement digest computed for performance schema anyway
• Cost of queries to be rewritten is insignificant compared to performance gain
– Cost of generating query + reparsing: max ~5% performance overhead
– Performance gain potentially x times
MySQL 5.7 Explain on a Running Query
EXPLAIN [FORMAT=(JSON|TRADITIONAL)] [EXTENDED] FOR CONNECTION <id>;
• Shows the query plan of the statement running on connection <id>
• Useful for diagnostics on long-running queries
• Plan isn’t available while the query plan is under creation
• Applicable to SELECT/INSERT/DELETE/UPDATE
MySQL 5.7 Generated Columns
• Column generated from the expression
• VIRTUAL: computed when read, not stored, not indexable
• STORED: computed when inserted/updated, stored in SE, indexable
• Useful for:
– Functional index: create a stored column, add a secondary index
– Materialized cache for complex conditions
– Simplify query expression
labs.mysql.com
-- `order` is a reserved word, so it is quoted with backticks here.
CREATE TABLE order_lines (
  `order` INTEGER,
  lineno INTEGER,
  price DECIMAL(10,2),
  qty INTEGER,
  sum_price DECIMAL(10,2) GENERATED ALWAYS AS (qty * price) STORED
);
Kudos to Andrey Zhakov for his contribution!
MySQL 5.7: Avoid Creating Temporary Table for UNION ALL
SELECT * FROM table_a UNION ALL SELECT * FROM table_b;
• 5.6: Always materialize results of UNION ALL in temporary tables
• 5.7: Do not materialize in temporary tables unless used for sorting; rows are sent directly to client
• 5.7: Client will receive the first row faster, no need to wait until the last query block is finished
• 5.7: Less memory and disk consumption
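The effect on time-to-first-row can be illustrated with Python generators. This is an analogy, not server code: each hypothetical query block is a generator, the 5.6 behavior collects every row into a list first, and the 5.7 behavior chains the blocks lazily.

```python
from itertools import chain


def query_block(name, rows, log):
    """A query block that records in `log` each time it produces a row."""
    for row in rows:
        log.append(name)
        yield row


def union_all_materialized(blocks):
    """5.6-style: run every block to completion into a temporary list first."""
    temp_table = [row for block in blocks for row in block]
    return iter(temp_table)


def union_all_streamed(blocks):
    """5.7-style: hand rows to the client as soon as they are produced."""
    return chain.from_iterable(blocks)


log_56, log_57 = [], []
first_56 = next(union_all_materialized(
    [query_block("table_a", [1, 2], log_56), query_block("table_b", [3, 4], log_56)]))
first_57 = next(union_all_streamed(
    [query_block("table_a", [1, 2], log_57), query_block("table_b", [3, 4], log_57)]))

# Both return the same first row, but the streamed version has produced only one row so far.
print(first_56, len(log_56), first_57, len(log_57))
```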
MySQL 5.7: Optimizations for IN Expressions
• 5.6: Certain queries with IN predicates can’t use index scans or range scans even though all the columns in the query are indexed.
• 5.6: Range optimizer ignores lists of rows
• 5.6: Needs to be rewritten to de-normalized form:
SELECT a, b FROM t1 WHERE ( a = 0 AND b = 0 ) OR ( a = 1 AND b = 1 )
• 5.7: IN queries with row value expressions are executed using range scans.
• 5.7: Explain output: index/table scans change to range scans
CREATE TABLE t1 (a INT, b INT, c INT, KEY x(a, b));
SELECT a, b FROM t1 WHERE (a, b) IN ((0, 0), (1, 1));
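The de-normalized form that 5.6 required can be generated mechanically from the row-value list. The helper below is a hypothetical sketch that handles only numeric literals (no quoting or escaping), so it is meant to show the equivalence of the two forms rather than to build production SQL.

```python
def expand_row_in(columns, rows):
    """Rewrite (c1, c2) IN ((v1, v2), ...) as OR-ed groups of AND-ed equalities.

    Assumes numeric literals only; string values would need proper quoting.
    """
    groups = []
    for row in rows:
        equalities = " AND ".join(f"{col} = {value}" for col, value in zip(columns, row))
        groups.append(f"( {equalities} )")
    return " OR ".join(groups)


where_clause = expand_row_in(["a", "b"], [(0, 0), (1, 1)])
sql_denormalized = f"SELECT a, b FROM t1 WHERE {where_clause}"
print(sql_denormalized)
```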
MySQL 5.7: Optimizations for IN Expressions
• A table has 10 000 rows, 2 match the where condition
Before:
*************** 1. row ***************
select_type: SIMPLE
table: t1
type: index
key: x
key_len: 10
ref: NULL
rows: 10 000
Extra: Using where; Using index
After:
*************** 1. row ***************
select_type: SIMPLE
table: t1
type: range
key: x
key_len: 10
ref: NULL
rows: 2
Extra: Using where; Using index
SELECT a, b FROM t1 WHERE (a, b) IN ((0, 0), (1, 1));
MySQL 5.7: Optimization for Full Text Search
SELECT COUNT(*) FROM innodb_table WHERE MATCH(text) AGAINST ('for the this that' in natural language mode) > 0.5;
• Recognize more situations where the 'index only' access method can be used. No need to access the base table, only the FT index
– when the MATCH expression is part of a '>' expression
• 2.5 GB data
– 4X performance improvement!
MySQL 5.7: Optimization for Full Text Search
Before:
*************** 1. row ***************
select_type: SIMPLE
table: innodb_table
type: fulltext
key: ft_idx
key_len: 0
ref: NULL
rows: 1
Extra: Using where;
After:
*************** 1. row ***************
select_type: SIMPLE
table: innodb_table
type: fulltext
key: ft_idx
key_len: 10
ref: const
rows: 1
Extra: Using where; Ft_hints: rank > 0.500000; Using index
SELECT COUNT(*) FROM innodb_table WHERE MATCH(text) AGAINST ('for the this that' in natural language mode) > 0.5;
MySQL 5.7: Optimize Full Text Search
SELECT COUNT(*) FROM test.wp WHERE MATCH(text) AGAINST ('+for +the +this +that' in boolean mode);
SELECT COUNT(*) FROM innodb_table WHERE MATCH(text) AGAINST ('for the this that' in natural language mode);
• Optimize performance of COUNT(*)
• Optimizer provides hints to InnoDB
• When InnoDB takes advantage of the hints:
– Do not calculate ranking
– Avoid sorting
What is on Our Roadmap?
• Improve prepared statement performance
• Continue redesigning the cost model, add histograms
• Continue optimizer refactoring
• Support functional index
• Store and query JSON documents
• Support parallel queries