27
Copyright © 2019 Oracle and/or its affiliates. 1 Hash join and EXPLAIN ANALYZE Norvald H. Ryeng Software Development Senior Manager MySQL Optimizer Team October 1, 2019 MySQL 8.0.18 latest updates

MySQL 8.0 latest updates: Hash Join and EXPLAIN ANALYZE · 2)Scan the corresponding chunk and match with rows in memory 3)Hash the next batch into memory Resume from where it was

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: MySQL 8.0 latest updates: Hash Join and EXPLAIN ANALYZE · 2)Scan the corresponding chunk and match with rows in memory 3)Hash the next batch into memory Resume from where it was

Copyright © 2019 Oracle and/or its affiliates.1

Hash join andEXPLAIN ANALYZENorvald H. Ryeng

Software Development Senior Manager

MySQL Optimizer Team

October 1, 2019

MySQL 8.0.18 latest updates

Page 2: MySQL 8.0 latest updates: Hash Join and EXPLAIN ANALYZE · 2)Scan the corresponding chunk and match with rows in memory 3)Hash the next batch into memory Resume from where it was

Copyright © 2019 Oracle and/or its affiliates.2

Safe harbor statement

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions.

The development, release, timing, and pricing of any features or functionality described for Oracle’s products may change and remains at the sole discretion of Oracle Corporation.

Page 3: MySQL 8.0 latest updates: Hash Join and EXPLAIN ANALYZE · 2)Scan the corresponding chunk and match with rows in memory 3)Hash the next batch into memory Resume from where it was

Copyright © 2019 Oracle and/or its affiliates.3

Program agenda

1 Quick demo2 How did we get here? Phase separation Volcano iterator model3 Hash join4 EXPLAIN ANALYZE

Page 4: MySQL 8.0 latest updates: Hash Join and EXPLAIN ANALYZE · 2)Scan the corresponding chunk and match with rows in memory 3)Hash the next batch into memory Resume from where it was

Copyright © 2019 Oracle and/or its affiliates.4

Quick demoMySQL 8.0.18

Page 5: MySQL 8.0 latest updates: Hash Join and EXPLAIN ANALYZE · 2)Scan the corresponding chunk and match with rows in memory 3)Hash the next batch into memory Resume from where it was

Copyright © 2019 Oracle and/or its affiliates.5

Page 6: MySQL 8.0 latest updates: Hash Join and EXPLAIN ANALYZE · 2)Scan the corresponding chunk and match with rows in memory 3)Hash the next batch into memory Resume from where it was

Copyright © 2019 Oracle and/or its affiliates.6

How did we get here?A long term investment that finally pays off

Page 7: MySQL 8.0 latest updates: Hash Join and EXPLAIN ANALYZE · 2)Scan the corresponding chunk and match with rows in memory 3)Hash the next batch into memory Resume from where it was

Copyright © 2019 Oracle and/or its affiliates.7

https://twitter.com/vlad_mihalcea/status/1173842312398614528

Page 8: MySQL 8.0 latest updates: Hash Join and EXPLAIN ANALYZE · 2)Scan the corresponding chunk and match with rows in memory 3)Hash the next batch into memory Resume from where it was

Copyright © 2019 Oracle and/or its affiliates.8

MySQL refactoring: Separating phases

Started ~10 years agoConsidered finished now

A clear separation between query processing phases Fixed a large number of bugs

Improved stability Faster feature development

Fewer surprises and complications during development

Parse

Prepare

Optimize

Execute

SQL

Resolve

Transform

Abstract syntax tree

Logical plan

Physical plan

Page 9: MySQL 8.0 latest updates: Hash Join and EXPLAIN ANALYZE · 2)Scan the corresponding chunk and match with rows in memory 3)Hash the next batch into memory Resume from where it was

Copyright © 2019 Oracle and/or its affiliates.9

MySQL refactoring: Parsing and preparing

Still ongoingImplemented piece by piece

Separating parsing and resolving phasesEliminate semantic actions that do too muchGet a true bottom-up parser

Makes it easier to extend with new SQL syntaxParsing doesn't have unintended side effects

Consistent name and type resolvingNames resolved top-downTypes resolved bottom-up

Transformations done in the prepare phaseBottom-up

Parse

Prepare

Optimize

Execute

SQL

Resolve

Transform

Abstract syntax tree

Logical plan

Physical plan

Page 10: MySQL 8.0 latest updates: Hash Join and EXPLAIN ANALYZE · 2)Scan the corresponding chunk and match with rows in memory 3)Hash the next batch into memory Resume from where it was

Copyright © 2019 Oracle and/or its affiliates.10

MySQL features made possible because we invested in refactoring

CTEs Recursive and non-recursive CTEs Traverse hierarchies Write more readable SQL

LATERAL "For-each loops"

… and many, many more!

Window functions Aggregation, ranking, analytics Sliding windows

JSON JSON_TABLE JSON window functions

Page 11: MySQL 8.0 latest updates: Hash Join and EXPLAIN ANALYZE · 2)Scan the corresponding chunk and match with rows in memory 3)Hash the next batch into memory Resume from where it was

Copyright © 2019 Oracle and/or its affiliates.11

MySQL refactoring: Iterator executor

Volcano iterator model Possible because phases were separated Ongoing for ~1,5 year Much more modular exeuctor

Common iterator interface for all operationsEach operation is contained within an iterator

Able to put together plans in new waysImmediate benefit: Removes temporary tables in some cases

Join is just an iteratorNested loop join is just an iteratorHash join is just an iteratorYour favorite join method is just an iterator

Parse

Prepare

Optimize

Execute

SQL

Resolve

Transform

Abstract syntax tree

Logical plan

Physical plan

Page 12: MySQL 8.0 latest updates: Hash Join and EXPLAIN ANALYZE · 2)Scan the corresponding chunk and match with rows in memory 3)Hash the next batch into memory Resume from where it was

Copyright © 2019 Oracle and/or its affiliates.12

Old MySQL executor vs. iterator executor

Old executor Nested loop focused Hard to extend Code for one operation spread out Different interfaces for each operation Combination of operations hard coded

Iterator executor Modular Easy to extend Each iterator encapsulates one operation Same interface for all iterators All operations can be connected

Page 13: MySQL 8.0 latest updates: Hash Join and EXPLAIN ANALYZE · 2)Scan the corresponding chunk and match with rows in memory 3)Hash the next batch into memory Resume from where it was

Copyright © 2019 Oracle and/or its affiliates.13

MySQL 8.0 features based on the iterator executor

EXPLAIN FORMAT=TREEPrint the iterator tree

EXPLAIN ANALYZE1. Insert intstrumentation nodes in the tree2. Execute the query3. Print the iterator tree

Hash joinJust another iterator type

Parse

Prepare

Optimize

Execute

SQL

Resolve

Transform

Abstract syntax tree

Logical plan

Physical plan

Page 14: MySQL 8.0 latest updates: Hash Join and EXPLAIN ANALYZE · 2)Scan the corresponding chunk and match with rows in memory 3)Hash the next batch into memory Resume from where it was

Copyright © 2019 Oracle and/or its affiliates.14

Hash joinMySQL internals

New in8.0.18

Page 15: MySQL 8.0 latest updates: Hash Join and EXPLAIN ANALYZE · 2)Scan the corresponding chunk and match with rows in memory 3)Hash the next batch into memory Resume from where it was

Copyright © 2019 Oracle and/or its affiliates.15

MySQL hash join

Hybrid hash join Three execution modes

Everything fits in memorySpill to disk (GRACE hash join)Looping

Equi-join xxHash64

FastGood distribution

HashJoinIterator

TableScanIterator TableScanIterator

SELECT * FROM t1 JOIN t2 ON t1.a = t2.a;

Page 16: MySQL 8.0 latest updates: Hash Join and EXPLAIN ANALYZE · 2)Scan the corresponding chunk and match with rows in memory 3)Hash the next batch into memory Resume from where it was

Copyright © 2019 Oracle and/or its affiliates.16

MySQL hash join — everything fits in memory

1) Hash one table into memorySmallest tableThe whole table fits in memory (if not, use next method)

2) Scan the other table and match with rows in memory

2

1

A

=

B

Page 17: MySQL 8.0 latest updates: Hash Join and EXPLAIN ANALYZE · 2)Scan the corresponding chunk and match with rows in memory 3)Hash the next batch into memory Resume from where it was

Copyright © 2019 Oracle and/or its affiliates.17

MySQL hash join — spill to disk

1

2

3

4

1) Hash the smallest table into memory until the buffer is fullWhen the buffer is full, dump the rest as chunks on diskProduce up to 128 hash buckets (chunks) on disk

2) Scan the other table and match with rows in memoryWrite to chunks on disk at the same timeProduce up to 128 hash buckets (chunks) on disk

3) Hash one chunk of the smallest table into memoryHash using a different seed than used to create disk buckets(If the bucket doesn't fit, see next method)

4) Scan the corresponding chunk and match with rows in memory5) Repeat 3-4 until all chunks have been processed

==

A

B

Page 18: MySQL 8.0 latest updates: Hash Join and EXPLAIN ANALYZE · 2)Scan the corresponding chunk and match with rows in memory 3)Hash the next batch into memory Resume from where it was

Copyright © 2019 Oracle and/or its affiliates.18

MySQL hash join — looping

1) Hash a batch of the chunk into memoryWhen the buffer is full, pause reading this operand

2) Scan the corresponding chunk and match with rows in memory3) Hash the next batch into memory

Resume from where it was paused

4) Scan the corresponding chunk and match with rows in memory5) Repeat 3-4 until the whole chunk has been processed

1 & 3

2 & 4

=

Page 19: MySQL 8.0 latest updates: Hash Join and EXPLAIN ANALYZE · 2)Scan the corresponding chunk and match with rows in memory 3)Hash the next batch into memory Resume from where it was

Copyright © 2019 Oracle and/or its affiliates.19

MySQL hash join optimization

Replaces BNL Currently optimized as BNL

Replace with hash join after optimizationA conservative (safe) choice

Optimizer switch to turn on/offSET optimizer_switch='hash_join=on';

Hints/*+ HASH_JOIN(tables or query blocks) *//*+ NO_HASH_JOIN(tables or query blocks) */

Buffer sizeSET join_buffer_size=number;

Parse

Prepare

Optimize

Execute

SQL

Resolve

Transform

Abstract syntax tree

Logical plan

Physical plan

Page 20: MySQL 8.0 latest updates: Hash Join and EXPLAIN ANALYZE · 2)Scan the corresponding chunk and match with rows in memory 3)Hash the next batch into memory Resume from where it was

Copyright © 2019 Oracle and/or its affiliates.20

467x1520x>1400x

1332x968x

MySQL hash join performance

BNL compared to hash join Force BNL/hash join in DBT-3/TPC-H

DBT-3/TPC-H without indexes

Optimizer selects BNL

Automatic conversion to hash join

Hash join is much faster than BNL Can't expect same improvement when

indexes are available

Low

er n

umbe

r is

bett

er

Page 21: MySQL 8.0 latest updates: Hash Join and EXPLAIN ANALYZE · 2)Scan the corresponding chunk and match with rows in memory 3)Hash the next batch into memory Resume from where it was

Copyright © 2019 Oracle and/or its affiliates.21

New in8.0.18

EXPLAIN ANALYZEMySQL internals

Page 22: MySQL 8.0 latest updates: Hash Join and EXPLAIN ANALYZE · 2)Scan the corresponding chunk and match with rows in memory 3)Hash the next batch into memory Resume from where it was

Copyright © 2019 Oracle and/or its affiliates.22

TimingIteratorTimingIterator

TimingIterator

MySQL EXPLAIN ANALYZE

Wrap iterators in instrumentation nodes Measurements

Time (in ms) to first rowTime (in ms) to last rowNumber of rowsNumber of loops

Execute the query and dump the stats

HashJoinIterator

TableScanIterator TableScanIterator

Page 23: MySQL 8.0 latest updates: Hash Join and EXPLAIN ANALYZE · 2)Scan the corresponding chunk and match with rows in memory 3)Hash the next batch into memory Resume from where it was

Copyright © 2019 Oracle and/or its affiliates.23

467x1520x>1400x

1332x968x

What's wrong with Q2?

SELECT s_acctbal, s_name, n_name, p_partkey, p_mfgr, s_address, s_phone, s_commentFROM part, supplier, partsupp, nation, regionWHERE p_partkey = ps_partkey AND s_suppkey = ps_suppkey AND p_size = 4 AND p_type LIKE '%TIN' AND s_nationkey = n_nationkey AND n_regionkey = r_regionkey AND r_name = 'AMERICA' AND ps_supplycost = ( SELECT min(ps_supplycost) FROM partsupp, supplier, nation, region WHERE p_partkey = ps_partkey AND s_suppkey = ps_suppkey AND s_nationkey = n_nationkey AND n_regionkey = r_regionkey AND r_name = 'AMERICA' )ORDER BY s_acctbal DESC, n_name, s_name, p_partkeyLIMIT 100;

MySQL EXPLAIN ANALYZE to the rescue!

Page 24: MySQL 8.0 latest updates: Hash Join and EXPLAIN ANALYZE · 2)Scan the corresponding chunk and match with rows in memory 3)Hash the next batch into memory Resume from where it was

-> Limit: 100 row(s) (actual time=591739.000..591739.018 rows=100 loops=1) -> Sort: <temporary>.s_acctbal DESC, <temporary>.n_name, <temporary>.s_name, <temporary>.p_partkey, limit input to 100 row(s) per chunk (actual time=591738.999..591739.012 rows=100 loops=1) -> Stream results (actual time=591738.599..591738.772 rows=462 loops=1) -> Inner hash join (nation.n_regionkey = region.r_regionkey), (nation.n_nationkey = supplier.s_nationkey) (cost=2074295.37 rows=98) (actual time=591738.591..591738.686 rows=462 loops=1) -> Table scan on nation (cost=0.00 rows=25) (actual time=0.024..0.026 rows=25 loops=1) -> Hash -> Inner hash join (supplier.s_suppkey = partsupp.ps_suppkey) (cost=2074041.10 rows=98) (actual time=591735.554..591738.311 rows=462 loops=1) -> Table scan on supplier (cost=0.06 rows=9760) (actual time=0.068..2.024 rows=10000 loops=1) -> Hash -> Filter: (partsupp.ps_supplycost = (select #2)) (cost=1977898.52 rows=98) (actual time=1282.855..591733.987 rows=462 loops=1) -> Inner hash join (partsupp.ps_partkey = part.p_partkey) (cost=1977898.52 rows=98) (actual time=84.827..271.307 rows=3120 loops=1) -> Table scan on partsupp (cost=3.54 rows=796168) (actual time=0.034..108.684 rows=800000 loops=1) -> Hash -> Inner hash join (cost=20353.91 rows=24) (actual time=0.274..83.964 rows=780 loops=1) -> Filter: ((part.p_size = 4) and (part.p_type like '%TIN')) (cost=20353.16 rows=2204) (actual time=0.212..83.719 rows=780 loops=1) -> Table scan on part (cost=20353.16 rows=198401) (actual time=0.160..63.957 rows=200000 loops=1) -> Hash -> Filter: (region.r_name = 'AMERICA') (cost=0.75 rows=1) (actual time=0.029..0.036 rows=1 loops=1) -> Table scan on region (cost=0.75 rows=5) (actual time=0.021..0.030 rows=5 loops=1) -> Select #2 (subquery in condition; dependent) -> Aggregate: min(partsupp.ps_supplycost) (actual time=189.566..189.566 rows=1 loops=3120) -> Inner hash join (nation.n_regionkey = region.r_regionkey), (nation.n_nationkey = supplier.s_nationkey) (cost=77988309.27 rows=79617) (actual time=189.561..189.563 rows=1 loops=3120) -> Table scan on nation (cost=0.00 rows=25) (actual time=0.005..0.007 rows=25 loops=3120) -> Hash -> Inner hash join (supplier.s_suppkey = partsupp.ps_suppkey) (cost=77789257.28 rows=79617) (actual time=187.997..189.549 rows=4 loops=3120) -> Table scan on supplier (cost=0.02 rows=9760) (actual time=0.014..1.237 rows=10000 loops=3120) -> Hash -> Filter: (part.p_partkey = partsupp.ps_partkey) (cost=81757.41 rows=79617) (actual time=92.085..187.748 rows=4 loops=3120) -> Inner hash join (cost=81757.41 rows=79617) (actual time=0.018..155.118 rows=800000 loops=3120) -> Table scan on partsupp (cost=10101.54 rows=796168) (actual time=0.011..97.353 rows=800000 loops=3120) -> Hash -> Filter: (region.r_name = 'AMERICA') (cost=0.75 rows=1) (actual time=0.004..0.005 rows=1 loops=3120) -> Table scan on region (cost=0.75 rows=5) (actual time=0.003..0.003 rows=5 loops=3120)

Time is in milliseconds

Page 25: MySQL 8.0 latest updates: Hash Join and EXPLAIN ANALYZE · 2)Scan the corresponding chunk and match with rows in memory 3)Hash the next batch into memory Resume from where it was

Copyright © 2019 Oracle and/or its affiliates.25

Make your queries run faster with MySQL 8.0

Automatic

Temporary table eliminationRecursive CTEsSome derived tablesUNION into derived tablesInput to sorting

Faster duplicate removal Hash join instead of BNL Analyze queries to find out where time is spent

EXPLAIN FORMAT=TREEEXPLAIN ANALYZEOptimizer trace

Page 26: MySQL 8.0 latest updates: Hash Join and EXPLAIN ANALYZE · 2)Scan the corresponding chunk and match with rows in memory 3)Hash the next batch into memory Resume from where it was

Copyright © 2019 Oracle and/or its affiliates.26

Feature descriptions and design detailsdirectly from the source

https://mysqlserverteam.com/

Page 27: MySQL 8.0 latest updates: Hash Join and EXPLAIN ANALYZE · 2)Scan the corresponding chunk and match with rows in memory 3)Hash the next batch into memory Resume from where it was

Copyright © 2019 Oracle and/or its affiliates.27

Thank you!

Norvald H. Ryeng

Software Development Senior ManagerMySQL Optimizer Team