keshav-murthy
Confidential and Proprietary. Do not distribute without Couchbase consent. © Couchbase 2017. All rights reserved.
N1QL QUERY OPTIMIZER AND IMPROVEMENTS IN 5.0
AGENDA
01 Optimizer Overview
02 Improvements in 5.0
03 Q&A
1 OPTIMIZER OVERVIEW
Query Execution Flow
Query Service
Query Execution Phases
Optimizer
• Query Rewrite
  • N1QL does very limited rewrite.
• Access Path Selection
  • KeyScan access
  • IndexScan access
  • PrimaryScan access
• Join Order, Types, and Methods
  • The keyspaces specified in the FROM clause are joined in the exact order given in the query.
  • Nested loop join
  • Lookup join
  • Index join
• Execution Plan
Optimizer
• The optimizer considers the possible ways to execute a query and picks the best query plan under its rules.
• The query plan is generated by rule-based optimization.
• An index that cannot satisfy the query is not chosen.
• If an index scan can be performed, a full / primary scan is not performed.
• Each query block (i.e., each SELECT ...) has its own query plan.
Index Selection
• Online indexes
  • Only online indexes are considered.
• Preferred indexes
  • If a USE INDEX hint is provided, the indexes in that list are considered.
• Satisfying index condition
  • Partial / filtered indexes are considered when the index condition is a superset of the query predicate.
• Satisfying index keys
  • Indexes whose leading keys satisfy the query predicate are considered.
• Longest satisfying index keys
  • Redundancy is eliminated by keeping the index with the longest satisfying keys in the same order.
  • An index with satisfying keys (a, b, c) is retained over an index with satisfying keys (a, b).
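The selection rules above can be sketched as a small filter. This is a toy model, not the Query Service's code: the `Index` class, `satisfying_prefix`, and `select_indexes` are hypothetical names, predicates are reduced to a set of field names, and the partial-index condition check is left out.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Index:
    """Hypothetical, simplified description of a secondary index."""
    name: str
    keys: Tuple[str, ...]
    online: bool = True


def satisfying_prefix(index: Index, predicate_fields: Set[str]) -> Tuple[str, ...]:
    """Leading index keys that the query predicate constrains."""
    prefix = []
    for key in index.keys:
        if key not in predicate_fields:
            break
        prefix.append(key)
    return tuple(prefix)


def select_indexes(indexes: Iterable[Index],
                   predicate_fields: Set[str],
                   hinted: Optional[Set[str]] = None) -> List[str]:
    # Rule 1: only online indexes are considered.
    candidates = [ix for ix in indexes if ix.online]
    # Rule 2: with a USE INDEX hint, look at the hinted indexes.
    if hinted:
        candidates = [ix for ix in candidates if ix.name in hinted]
    # Rule 3: the leading key(s) must satisfy the predicate.
    scored = [(ix, satisfying_prefix(ix, predicate_fields)) for ix in candidates]
    scored = [(ix, p) for ix, p in scored if p]
    # Rule 4: drop an index whose satisfying keys are a strict prefix
    # of another index's satisfying keys, e.g. (a, b) versus (a, b, c).
    kept = []
    for ix, p in scored:
        redundant = any(len(q) > len(p) and q[:len(p)] == p for _, q in scored)
        if not redundant:
            kept.append(ix.name)
    return kept


indexes = [
    Index("ix_ab", ("a", "b")),
    Index("ix_abc", ("a", "b", "c")),
    Index("ix_d", ("d",)),
    Index("ix_offline", ("a",), online=False),
]
print(select_indexes(indexes, {"a", "b", "c"}))
```

When more than one index survives this filter, the slides that follow describe how the scans are combined (IntersectScan) or how a hint narrows the choice.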
Access Path Selection
• Key Scan
  • If the query contains a USE KEYS clause, no index scan or primary scan is performed. The input document keys are taken directly from the USE KEYS clause.
• Index Count Scan
• Covering Secondary Scan
• Regular Secondary Scan: longest satisfying keys, IntersectScan
  • To avoid IntersectScan, provide a hint with USE INDEX.
• UNNEST Scan
  • Only array indexes with an index key matching the predicates are used for UNNEST scan.
• Regular Primary Scan
  • If a primary scan is selected and no primary index is available, the query errors out.
Scan Methods
• Covering Primary Scan
  • A covering primary scan is a primary scan that does not perform a subsequent document fetch. It is used for queries that need a full / primary scan and reference only META().id.
SELECT META(t).id FROM `travel-sample` t;
• Regular Primary Scan
  • A regular primary scan also performs a subsequent document fetch. It is used for queries that need a full / primary scan and reference document data other than META().id.
SELECT META(t).cas FROM `travel-sample` t;
SELECT * FROM `travel-sample` t;
Scan Methods
Covering Secondary Scan
• Each satisfying index with the largest number of matching index keys is examined for query coverage.
• The shortest covering index is used.
CREATE INDEX ts_name ON `travel-sample`(country, name) WHERE type = "hotel";
SELECT country, name, type, META().id
FROM `travel-sample`
WHERE type = "hotel" AND country = "United States";
Regular Secondary Scan
• Indexes with the largest number of matching index keys are used.
• When more than one index qualifies, IntersectScan is used.
• To avoid IntersectScan, provide a hint with USE INDEX.
SELECT country, name, type, META().id, phone
FROM `travel-sample`
WHERE type = "hotel" AND country = "United States";
Scan Methods
UNNEST Scan
• Only array indexes are considered, and only for queries with UNNEST clauses.
Index Count Scan
• Considered for queries with a single projection of the COUNT aggregate and no JOINs or GROUP BY.
• The chosen index must cover the query with a single range that can be pushed to the indexer exactly, and the argument to COUNT must be a constant or the leading index key.
SELECT COUNT(1)
FROM `travel-sample`
WHERE type = "hotel" AND country = "United States";
2 IMPROVEMENTS IN 5.0
UnionScan
• An OR predicate can use multiple indexes.
• Each index performs an IndexScan, and the results are merged using UnionScan.
• Each IndexScan can push a different number of index keys.
• If all IndexScans under a UnionScan are covered, the UnionScan is covered.
CREATE INDEX ts_cc ON `travel-sample` (country, city) WHERE type = "hotel";
CREATE INDEX ts_n ON `travel-sample` (name) WHERE type = "hotel";
EXPLAIN SELECT name, country, city
FROM `travel-sample`
WHERE type = "hotel" AND
((country = "United States" AND city = "San Francisco")
OR (name = "White Wolf"));
{ "#operator": "UnionScan",
  "scans": [
    { "index": "ts_cc",
      "spans": [ { "range": [
        { "high": "\"United States\"", "inclusion": 3, "low": "\"United States\"" },
        { "high": "\"San Francisco\"", "inclusion": 3, "low": "\"San Francisco\"" } ]
      } ]
    },
    { "index": "ts_n",
      "spans": [ { "range": [
        { "high": "\"White Wolf\"", "inclusion": 3, "low": "\"White Wolf\"" } ]
      } ]
    }
  ]
}
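A rough way to picture UnionScan is as a de-duplicating merge of the document keys returned by each IndexScan. The sketch below is a toy model under that assumption: `index_scan`, `union_scan`, and the index contents are made up for illustration, and the real operator streams results rather than building lists.

```python
def index_scan(entries, matches):
    """Toy IndexScan: document keys of the index entries that match."""
    return [doc_id for doc_id, key in entries if matches(key)]


def union_scan(*scans):
    """Toy UnionScan: merge document keys from several scans, dropping duplicates."""
    seen = set()
    merged = []
    for scan in scans:
        for doc_id in scan:
            if doc_id not in seen:
                seen.add(doc_id)
                merged.append(doc_id)
    return merged


# Hypothetical index contents as (document key, index key) pairs.
ts_cc = [
    ("hotel_1", ("United States", "San Francisco")),
    ("hotel_2", ("France", "Paris")),
    ("hotel_3", ("United States", "San Francisco")),
]
ts_n = [
    ("hotel_3", "White Wolf"),
    ("hotel_4", "White Wolf"),
    ("hotel_2", "Le Clos"),
]

result = union_scan(
    index_scan(ts_cc, lambda k: k == ("United States", "San Francisco")),
    index_scan(ts_n, lambda k: k == "White Wolf"),
)
print(result)
```

Here `hotel_3` satisfies both arms of the OR but appears once in the merged result.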
IntersectScan
• IntersectScan is improved by terminating the scans early when one of the scans completes or the limit is reached. Only the results of the completed scan are considered as possible candidates.
• If the query has an ORDER BY and a predicate on the ORDER BY keys, N1QL uses OrderedIntersectScan when possible.
EXPLAIN
SELECT name, country, city
FROM `travel-sample`
WHERE type = "hotel" AND
country = "United States" AND
city = "San Francisco" AND
name >= "White Wolf"
ORDER BY name;
{ "#operator": "OrderedIntersectScan",
  "scans": [
    { "index": "ts_n",
      "spans": [ { "range": [
        { "inclusion": 1, "low": "\"White Wolf\"" } ]
      } ]
    },
    { "index": "ts_cc",
      "spans": [ { "range": [
        { "high": "\"United States\"", "inclusion": 3, "low": "\"United States\"" },
        { "high": "\"San Francisco\"", "inclusion": 3, "low": "\"San Francisco\"" } ]
      } ]
    }
  ]
}
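One way to read OrderedIntersectScan is that the first scan supplies the order and the remaining scans act as membership filters. The sketch below models only that idea, plus stopping once a limit is reached. `ordered_intersect_scan` and the sample keys are hypothetical, and the real operator works on concurrent streams rather than in-memory lists.

```python
def ordered_intersect_scan(ordered_scan, other_scans, limit=None):
    """Toy model: keep the order of the first scan, keep only the
    document keys that every other scan also returned."""
    others = [set(scan) for scan in other_scans]
    result = []
    for doc_id in ordered_scan:
        if all(doc_id in scan for scan in others):
            result.append(doc_id)
            # Stop early once the requested number of rows is reached.
            if limit is not None and len(result) >= limit:
                break
    return result


# Hypothetical scan outputs: ts_n returns keys in name order,
# ts_cc returns keys for the (country, city) equality predicates.
by_name = ["hotel_7", "hotel_3", "hotel_9"]
by_country_city = ["hotel_3", "hotel_9", "hotel_1"]

print(ordered_intersect_scan(by_name, [by_country_city]))
print(ordered_intersect_scan(by_name, [by_country_city], limit=1))
```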
Implicit Covering Array Index
• N1QL supports a simplified implicit covering array index syntax in certain cases, where the mandatory array index-key requirement is relaxed to create a covering array index.
• This applies when the predicates can be pushed exactly and completely to the indexer during the array index scan.
• The scan must produce no false positives.
CREATE INDEX ts_r_simple ON `travel-sample` ( DISTINCT
ARRAY v.flight FOR v IN schedule END) WHERE type = "route";
EXPLAIN SELECT meta().id
FROM `travel-sample`
WHERE type = "route" AND
ANY v IN schedule SATISFIES v.flight LIKE 'UA%'
END;
Stable Scans
In earlier versions, IndexScan performed a single range scan (i.e., a single span).
If the query had multiple ranges (e.g., OR, IN, or NOT clauses), N1QL performed a separate IndexScan for each range.
• The indexer could use a different snapshot for each scan, which made the scan unstable.
• The larger number of IndexScans increased the number of index connections.
In 5.0.0, multiple ranges are passed to the indexer, and the indexer uses the same snapshot for all the ranges.
If EXPLAIN shows the operator IndexScan2, the query uses stable scans.
EXPLAIN SELECT name, country, city
FROM `travel-sample`
WHERE type = "hotel" AND
country IN ["United States" , "France"];
{ "#operator": "IndexScan2",
"index": "ts_cc",
"spans": [
{ "range": [ { "high": "\"France\"",
"inclusion": 3,
"low": "\"France\""
}]
},
{ "range": [ { "high": "\"United States\"",
"inclusion": 3,
"low": "\"United States\""
}]
}
]
}
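The spans in the EXPLAIN above can be approximated by turning each value of the IN list into one equality range. The helper below is a hypothetical sketch, not the planner's code. It assumes the values are sorted and de-duplicated (the EXPLAIN lists "France" before "United States") and that `inclusion: 3` marks a range inclusive at both ends, as in the equality spans shown in this deck.

```python
import json


def in_list_spans(values):
    """Build one equality span per distinct IN-list value (toy model)."""
    spans = []
    for value in sorted(set(values)):
        encoded = json.dumps(value)  # EXPLAIN shows bounds as JSON-encoded strings
        spans.append({"range": [{"low": encoded, "high": encoded, "inclusion": 3}]})
    return spans


spans = in_list_spans(["United States", "France"])
print(json.dumps(spans, indent=2))
```

All of these spans travel in one IndexScan2 request, which is what lets the indexer answer them from a single snapshot.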
Efficiently Pushdown Composite Filters
• In earlier versions, the spans pushed to the indexer for a composite index contained a single range covering all composite keys together.
• The indexer did not apply a range to each part of the key separately, which could produce many false positives.
• In 5.0.0, with IndexScan2, the range for each index key is pushed separately, and the indexer applies each one separately.
• This results in few or no false positives and pushes more information to the indexer.
EXPLAIN SELECT name, country, city
FROM `travel-sample`
WHERE type = "hotel" AND
country >= "United States" AND
city = "San Francisco";
{ "#operator": "IndexScan2",
"index": "ts_cc",
"spans": [
{ "range": [ {"inclusion": 1,
"low": "\"United States\""
},
{ "high": "\"San Francisco\"",
"inclusion": 3,
"low": "\"San Francisco\""
}
]
}
]
}
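The false-positive problem can be seen with plain tuples. The sketch below compares one composite lower bound over (country, city) with per-key ranges. The index entries are made up, and the exact shape of the pre-5.0 span is an assumption used only to illustrate the effect.

```python
# Hypothetical (country, city) index entries, in index order.
entries = [
    ("France", "Paris"),
    ("United States", "Los Angeles"),
    ("United States", "San Francisco"),
    ("United States", "Seattle"),
    ("Vietnam", "Hanoi"),
]


def composite_range_scan(entries, low):
    """Single range over the whole composite key (assumed pre-5.0 behavior)."""
    return [e for e in entries if e >= low]


def per_key_range_scan(entries, country_low, city_equals):
    """One range per index key, each applied separately (IndexScan2 style)."""
    return [e for e in entries if e[0] >= country_low and e[1] == city_equals]


old = composite_range_scan(entries, ("United States", "San Francisco"))
new = per_key_range_scan(entries, "United States", "San Francisco")
false_positives = [e for e in old if e not in new]
print(len(old), len(new), false_positives)
```

In this toy data the composite range returns three entries, two of which the query service would have to filter out afterwards, while the per-key ranges return only the matching entry.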
Pagination (ORDER, OFFSET, LIMIT)
• Pagination queries can contain any combination of ORDER BY, LIMIT, and OFFSET clauses.
• When the predicates are pushed completely and exactly to the indexer, pushing OFFSET and LIMIT to the indexer as well can improve query performance significantly. When this happens, the IndexScan2 section of EXPLAIN contains limit and offset.
• If the query's ORDER BY matches the index key order, the query can exploit the index order and avoid a sort. When this happens, the Order operator is absent from EXPLAIN.
EXPLAIN SELECT country, city
FROM `travel-sample`
WHERE type = "hotel" AND
country >= "United States"
ORDER BY country, city
OFFSET 1 LIMIT 10;
{ "#operator": "IndexScan2",
"index": "ts_cc",
"limit": "10",
"offset": "1",
"spans": [
{ "range": [ {"inclusion": 1,
"low": "\"United States\""
}
]
}
]
}
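The benefit of pushing OFFSET and LIMIT down can be sketched with a lazy scan over ordered index entries: the scan stops as soon as the page is full instead of returning every matching entry. `scan_with_pushdown` and the entries are hypothetical and only model the idea.

```python
from itertools import islice


def scan_with_pushdown(entries, country_low, offset=0, limit=None):
    """Toy ordered index scan that applies offset and limit itself.

    `entries` must already be in index order, so no sort is needed.
    """
    matching = (e for e in entries if e[0] >= country_low)  # lazy generator
    stop = None if limit is None else offset + limit
    return list(islice(matching, offset, stop))


# Hypothetical (country, city) entries in index order.
entries = [
    ("France", "Paris"),
    ("United States", "Boston"),
    ("United States", "Chicago"),
    ("Vietnam", "Hanoi"),
]

page = scan_with_pushdown(entries, "United States", offset=1, limit=2)
print(page)
```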
DESC Index Collation
• An index can be created with ASC or DESC collation on each index key.
• Queries can exploit the index collation.
CREATE INDEX ts_acc ON `travel-sample` (country DESC,
city ASC) WHERE type = "airline";
EXPLAIN SELECT country, city
FROM `travel-sample`
WHERE type = "airline" AND
country >= "United States"
ORDER BY country DESC , city
OFFSET 1 LIMIT 10;
{ "#operator": "IndexScan2",
"index": "ts_acc",
"limit": "10",
"offset": "1",
"spans": [
{ "range": [ {"inclusion": 1,
"low": "\"United States\""
}
]
}
]
}
MAX pushdown
• If the MAX argument matches the leading index key, the query can exploit the index order to compute MAX.
• MAX can be pushed down only when the leading index key is DESC.
• MIN can be pushed down only when the leading index key is ASC.
• If pushdown happens, "limit": "1" appears in the IndexScan2 section of EXPLAIN.
CREATE INDEX ts_acc ON `travel-sample` (country DESC,
city ASC) WHERE type = "airline";
EXPLAIN SELECT MAX(country)
FROM `travel-sample`
WHERE type = "airline" AND
country >= "United States";
{ "#operator": "IndexScan2",
"index": "ts_acc",
"limit": "1",
"spans": [
{ "range": [ {"inclusion": 1,
"low": "\"United States\""
}
]
}
]
}
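Why a DESC leading key turns MAX into a scan with limit 1 can be shown in a few lines: in descending index order, the first entry inside the range is already the maximum. The function name and data below are hypothetical.

```python
def max_via_desc_index(desc_entries, low):
    """Toy MAX pushdown: return the first entry in range from a DESC index."""
    for country in desc_entries:
        if country >= low:
            return country  # equivalent to a scan with limit 1
    return None


# Hypothetical leading-key values of ts_acc, stored in descending order.
desc_entries = ["Vietnam", "United States", "United Kingdom", "France"]

print(max_via_desc_index(desc_entries, "United States"))
```

The result agrees with computing `max()` over all entries in the range, but only one entry is read.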
COUNT (DISTINCT expr)
• If expr matches the leading index key, COUNT(DISTINCT expr) can be pushed to the indexer.
• The complete predicate must be pushed to the indexer exactly.
• The scan must not produce false positives.
• The query has no GROUP BY or JOIN.
• The query has a single projection.
• When pushdown happens, IndexCountDistinctScan2 appears in EXPLAIN.
EXPLAIN SELECT COUNT( DISTINCT country)
FROM `travel-sample`
WHERE type = "hotel" AND
country >= "United States";
{
"#operator": "IndexCountDistinctScan2",
"index": "ts_cc",
"spans": [
{ "range": [ {"inclusion": 1,
"low": "\"United States\""
}
]
}
]
}
Index Projection
• An index can have many keys, but a query might be interested in only a subset of them.
• Requesting only the required information from the indexer saves network transfer, memory, CPU, backfill, etc. All of this helps performance and cluster scaling.
• The requested information appears in the IndexScan2 section of EXPLAIN as "index_projection":
"index_projection": {
"entry_keys": [ xxx, ... ],
"primary_key": true
}
EXPLAIN SELECT country FROM `travel-sample`
WHERE type = "hotel" AND country >= "United States";
"index_projection": { "entry_keys": [ 0 ] }
EXPLAIN SELECT country, city FROM `travel-sample`
WHERE type = "hotel" AND country >= "United States";
"index_projection": { "entry_keys": [ 0, 1 ] }
EXPLAIN SELECT country, city, META().id FROM `travel-sample`
WHERE type = "hotel" AND country >= "United States";
"index_projection": { "entry_keys": [ 0, 1 ], "primary_key": true }
EXPLAIN SELECT country, city, META().id, name FROM `travel-sample`
WHERE type = "hotel" AND country >= "United States";
Non-covered query:
"index_projection": { "primary_key": true }
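The four index_projection outputs on the Index Projection slide follow a simple pattern, which the sketch below reproduces. The `index_projection` function is a hypothetical model, not the planner's logic: it looks only at the projected expressions and assumes a non-covered query asks for the primary key alone, because the document is fetched anyway.

```python
META_ID = "META().id"


def index_projection(index_keys, projected):
    """Toy model of the index_projection section of EXPLAIN."""
    covered = all(f == META_ID or f in index_keys for f in projected)
    if not covered:
        # Non-covered query: only the document key is needed for the fetch.
        return {"primary_key": True}
    result = {
        "entry_keys": sorted(index_keys.index(f) for f in projected if f != META_ID)
    }
    if META_ID in projected:
        result["primary_key"] = True
    return result


ts_cc_keys = ["country", "city"]  # keys of the ts_cc index from the slides
print(index_projection(ts_cc_keys, ["country"]))
print(index_projection(ts_cc_keys, ["country", "city", META_ID]))
print(index_projection(ts_cc_keys, ["country", "city", META_ID, "name"]))
```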
Index Cas and Expiration
• META().cas and META().expiration can be indexed and used in queries.
• Note: META().expiration works in covered queries. For non-covered queries it returns 0.
CREATE INDEX ts_cas ON `travel-sample` (country,
META().cas, META().expiration) WHERE type = "airport";
EXPLAIN SELECT country, META().cas, META().expiration
FROM `travel-sample`
WHERE type = "airport" AND country = "United States";
{
"#operator": "IndexScan2",
"index": "ts_cas",
"spans": [
{ "range": [ { "high": "\"United States\"",
"inclusion": 3,
"low": "\"United States\""
}
]
}
]
}
3 Q&A
THANK YOU