8/6/2019 SQLServer SQLTuning Final
Tuning SQL Statements on Microsoft SQL Server 2000

By Kevin Kline, Director of Technology, SQL Server Solutions Group, Quest Software, Inc.
and Claudia Fernandez, Product Manager, Quest Software, Inc.
Contents

Introduction
Microsoft Tuning Techniques
   SET STATISTICS IO
   SET STATISTICS TIME
   SHOWPLAN Output and Analysis
      SHOWPLAN Output
      SHOWPLAN Operations
Reading the Query Plan
   Getting Started
   SEEK versus SCAN
   Branching Steps Illustrated by Comparing Joins and Subqueries
   Comparing Query Plans
   Understanding the Impact of Joins
Query Tuning Techniques
   Subqueries Optimization
      Example
   UNION vs. UNION ALL
      Example
   Functions and Expressions That Suppress Indexes
      Examples
   SET NOCOUNT ON
   TOP n
Using Tools to Tune SQL Statements
   Microsoft Query Analyzer
   Quest Central for SQL Server - SQL Tuning
Conclusion
About the Authors
About Quest Software
Tuning SQL Statements on Microsoft SQL Server 2000

By Kevin Kline and Claudia Fernandez
Introduction

This paper covers the basic techniques used to tune SELECT statements on Microsoft's SQL Server 2000 relational database management system. We discuss the techniques available through Microsoft's graphical user interfaces in SQL Enterprise Manager and SQL Query Analyzer, and we provide a brief overview of Quest Software's query tuning tools.

In addition to tuning methods, we'll show you several best practices you can apply to your SQL statements to improve performance. [All examples and syntax are verified for Microsoft SQL Server 2000.] After reading this paper, you should have a basic understanding of the query tuning tools and techniques available with the Microsoft tool kit. We will cover a variety of querying techniques that improve performance and speed data read operations.
SQL Server provides you with capabilities to benchmark transactions by sampling I/O activity and elapsed execution time using certain SET and DBCC commands. In addition, some DBCC commands may be used to obtain a very detailed explanation of any index statistic, estimate the cost of every possible execution plan, and boost performance. The SET and DBCC commands are fully detailed in the Quest white paper entitled "Analyzing and Optimizing T-SQL Query Performance on Microsoft SQL Server using SET and DBCC," the first white paper in a four-part series on performance tuning.
Microsoft Tuning Techniques

Microsoft provides you with three primary means for tuning queries:

- Checking the reads and writes generated by the query using SET STATISTICS IO
- Checking the running time of the query using SET STATISTICS TIME
- Analyzing the query plan of the query using SET SHOWPLAN
SET STATISTICS IO

The command SET STATISTICS IO ON forces SQL Server to report actual I/O activity on executed transactions. It cannot be paired with the SET NOEXEC ON option, because it only makes sense to monitor I/O activity on commands that actually execute. Once the option is enabled, every query produces additional output that contains I/O statistics. To disable the option, execute SET STATISTICS IO OFF.
These commands also work on Sybase Adaptive Server, though some result sets may look somewhat different.
For example, the following script obtains I/O statistics for a simple query counting rows of the employees table in the northwind database:

SET STATISTICS IO ON
GO
SELECT COUNT(*) FROM employees
GO
SET STATISTICS IO OFF
GO
Results:

-----------
2977

Table 'Employees'. Scan count 1, logical reads 53, physical reads 0, read-ahead reads 0.
The scan count tells us the number of scans performed. Logical reads show the number of pages read from the cache. Physical reads show the number of pages read from the disk. Read-ahead reads indicate the number of pages placed in the cache in anticipation of future reads.
Additionally, we execute a system stored procedure to obtain table size statistics for
our analysis:
sp_spaceused employees
Results:
name       rows   reserved   data      index_size   unused
---------  -----  ---------  --------  -----------  -------
Employees  2977   2008 KB    1504 KB   448 KB       56 KB
What can we tell by looking at this information?

- The query did not have to scan the whole table. The amount of data in the table is more than 1.5 megabytes, yet it took only 53 logical I/O operations to obtain the result. This indicates that the query found an index that could be used to compute the result, and scanning the index took fewer I/O operations than it would take to scan all data pages.

- Index pages were mostly found in the data cache, since the physical reads value is zero. This is because we executed the query shortly after other queries on employees, so the table and its index were already cached. Your mileage may vary.

- Microsoft has reported no read-ahead activity. In this case, data and index pages were already cached. For a table scan on a large table, read-ahead would probably kick in and cache the necessary pages before your query requested them. Read-ahead turns on automatically when SQL Server determines that your transaction is reading database pages sequentially and believes that it can predict which pages you'll need next. A separate SQL Server connection virtually runs ahead of your process and caches data pages for it.
[Configuration and tuning of read-ahead parameters is beyond the scope of this
paper.]
In this example, the query was executed as efficiently as possible. No further tuning is
required.
SET STATISTICS TIME

The elapsed time of a transaction is a volatile measurement, since it depends on the activity of other users on the server. However, it provides some real measurement, compared to data page counts that don't mean anything to your users. They are concerned about the seconds and minutes they spend waiting for a query to come back, not about data caches and read-ahead efficiency. The SET STATISTICS TIME ON command reports the actual elapsed time and CPU utilization for every query that follows. Executing SET STATISTICS TIME OFF suppresses the option.
SET STATISTICS TIME ON
GO
SELECT COUNT(*) FROM titleauthors
GO
SET STATISTICS TIME OFF
GO
Results:

SQL Server Execution Times:
   cpu time = 0 ms.  elapsed time = 8672 ms.
SQL Server Parse and Compile Time:
   cpu time = 10 ms.

-----------
25

(1 row(s) affected)

SQL Server Execution Times:
   cpu time = 0 ms.  elapsed time = 10 ms.
SQL Server Parse and Compile Time:
   cpu time = 0 ms.
The first message reports a somewhat confusing elapsed time value of 8,672 milliseconds. This number is not related to our script; it indicates the amount of time that has passed since the previous command execution. You may disregard this first message. It took SQL Server only 10 milliseconds to parse and compile the query. It took 0 milliseconds to execute it (shown after the result of the query). What this really means is that the duration of the query was too short to measure. The last message, which reports a parse and compile time of 0 ms, refers to the SET STATISTICS TIME OFF command (that's what it took to compile it). You may disregard this message as well.
Note that elapsed and CPU time are shown in milliseconds. The numbers may vary on your computer (but don't try to compare your machine's performance to our notebook PC's, because this is not a representative benchmark). Moreover, every time you execute this script you may get slightly different statistics depending on what else your SQL Server was processing at the same time.

If you need to measure the elapsed duration of a set of queries or a stored procedure, it may be more practical to implement it programmatically (shown below). The reason is that STATISTICS TIME reports the duration of every single query, so you have to add things up manually when you run multiple commands. Imagine the size of the output and the amount of manual work in cases where you time a script that executes a set of queries thousands of times in a loop!

Instead, consider the following script to capture the time before and after the transaction and report the total duration in seconds (you may use milliseconds if you prefer):
DECLARE @start_time DATETIME
SELECT @start_time = GETDATE()

< any query or script that you want to time, without a GO >

SELECT 'Elapsed Time, sec' = DATEDIFF( second, @start_time, GETDATE() )
GO
If your script consists of several steps separated by GO, you cannot use a local variable to save the start time. A variable is destroyed at the end of the step, defined by the GO command, in which it was created. But you can preserve the start time in a temporary table like this:
CREATE TABLE #save_time ( start_time DATETIME NOT NULL )
INSERT #save_time VALUES ( GETDATE() )
GO
< any script that you want to time (may include GO) >
GO
SELECT 'Elapsed Time, sec' = DATEDIFF( second, start_time, GETDATE() )
FROM #save_time
DROP TABLE #save_time
GO
Remember that SQL Server's DATETIME datatype stores time values in increments of roughly 3 milliseconds (1/300 of a second). It is impossible to get more granular time values than that using the DATETIME datatype.
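A quick sketch of this rounding behavior (the displayed values assume SQL Server's documented rounding of DATETIME to 1/300-second ticks):

```sql
-- DATETIME values round to 1/300-second ticks, so the milliseconds
-- portion always lands on a .000, .003, or .007 boundary.
DECLARE @t DATETIME
SELECT @t = '2002-01-01 00:00:00.001'
SELECT @t   -- displays as 2002-01-01 00:00:00.000
SELECT @t = '2002-01-01 00:00:00.005'
SELECT @t   -- displays as 2002-01-01 00:00:00.007
```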
SHOWPLAN Output and Analysis

This paper illustrates, through example explain plans, the meaning and usefulness of the output produced using SET SHOWPLAN_TEXT ON in Microsoft SQL Server 2000. An explain plan (also called a query plan, execution plan, or optimizer plan) provides the exact details of the steps the database query engine uses to execute a SQL transaction. Knowing how to read explain plans expands your ability to perform high-end query tuning and optimization.

Note: Most examples are based on either the PUBS database or on SQL Server system tables. For the examples, we added tens of thousands of rows to many tables so that the query optimizer has some real work to do when evaluating query plans.
SHOWPLAN Output

One of the things that we like about the query optimizer is that it provides feedback in the form of a query execution plan. Now we explain it in more detail and describe messages that you may encounter in query plans. Understanding this output brings your optimization efforts to a new level. You no longer need to treat the optimizer as a black box that touches your queries with a magic wand.

The following command instructs SQL Server to show the execution plan for every query that follows in the same connection (or process), or turns this option off:

SET SHOWPLAN_TEXT { ON | OFF }

By default, SHOWPLAN_TEXT ON causes the code you are examining not to execute. Instead, SQL Server compiles the code and displays the query execution plan for that code. It continues with this behavior until you issue the command SET SHOWPLAN_TEXT OFF.
Typical T-SQL code used to obtain an execution plan for a query without actually running it follows:

SET SHOWPLAN_TEXT ON
GO
< query to analyze >
GO
SET SHOWPLAN_TEXT OFF
GO
Other Useful SET Commands

There are a variety of SET commands that are useful for tuning and debugging. We covered SET STATISTICS earlier in this document. You might find these other SET commands useful in certain situations:

1. SET NOEXEC { ON | OFF }: checks the syntax of your Transact-SQL code, including compiling the code, but does not execute it. This is useful for checking the syntax of a query while taking advantage of deferred name resolution. That is, you can check a query's syntax on a table that hasn't been created yet.

2. SET FMTONLY { ON | OFF }: returns only the metadata of a query to the client. For SELECT statements, this usually means it returns only the column headers.

3. SET PARSEONLY { ON | OFF }: checks the syntax of your Transact-SQL code, but does not compile or execute the code.

All of these commands remain in effect once set ON until you manually turn them OFF. These settings do not take effect immediately; they start working from the next step. In other words, you have to issue a GO command before the SHOWPLAN or NOEXEC setting is enabled.
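As a brief illustration, the first two options might be toggled like this (a sketch assuming the pubs sample database):

```sql
-- Compile the statement but do not execute it.
SET NOEXEC ON
GO
SELECT au_lname FROM authors   -- syntax and compile check only
GO
SET NOEXEC OFF
GO

-- Return only column metadata to the client, no rows.
SET FMTONLY ON
GO
SELECT au_lname, au_fname FROM authors
GO
SET FMTONLY OFF
GO
```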
Warnings  Type      Parallel  EstimateExecutions
--------  --------  --------  ------------------
NULL      SELECT    0         NULL
NULL      PLAN_ROW  0         1.0
There is a significant difference. The SHOWPLAN_ALL statement returns a lot of
useful tuning information, but it is hard to understand and apply.
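The columns shown above come from SET SHOWPLAN_ALL, which returns the plan as a multi-column rowset rather than plain text. It is toggled the same way as SHOWPLAN_TEXT:

```sql
SET SHOWPLAN_ALL ON
GO
SELECT COUNT(*) FROM authors   -- any query; compiled, not executed
GO
SET SHOWPLAN_ALL OFF
GO
```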
SHOWPLAN Operations

Some of the SHOWPLAN operations, sometimes called tags, are very clear in explaining what SQL Server is doing, while others are puzzling. These operations are divided into physical operations and logical operations. Physical operators describe the physical algorithm used to process the query, for example, performing an index seek. Logical operators describe the relational algebra operation used by the statement, such as an aggregation. SHOWPLAN results are broken down into steps. Each physical operation of a query is represented as a separate step. Steps usually have an accompanying logical operator, but not all steps involve logical operations. In addition, most steps have an operation (either logical or physical) and an argument. Arguments are the components of the query that the operation affects.
A discussion of all of the execution plan steps would be prohibitively large. Rather than review them all here, we refer you to the Quest white paper "SHOWPLAN Output and Analysis" available at http://www.quest.com/whitepapers/#ms_sql_server.
Reading the Query Plan

Rather than show examples embedded within the descriptions of the logical and physical operations, we have broken them out separately. This is because a single example might illustrate the use and effectiveness of several operators at once.

Getting Started

Let's start with some simple examples to help you understand how to read the query plan that is returned when you either issue the command SET SHOWPLAN_TEXT ON or enable the option of the same name in the SQL Query Analyzer configuration properties.
This example uses pubs..big_sales, an identical copy of the pubs..sales table except with about 80,000 records, as the main source for examples of simple explain plans.
The simplest query, as shown below, will scan the entire clustered index if one exists. Remember that the clustered key is the physical order in which the data is written. Consequently, if a clustered key exists, you'll be able to avoid a table scan. Even if you select a column that is not specifically mentioned in the clustered key, such as ord_date, the query engine will use a clustered index scan to return the result set.
SELECT *
FROM big_sales

SELECT ord_date
FROM big_sales

StmtText
-------------------------------------------------------------------------------
|--Clustered Index Scan(OBJECT:([pubs].[dbo].[big_sales].[UPKCL_big_sales]))
The queries shown above return very different quantities of data, so the query with the
smaller result set (ord_date) will perform faster than the other query simply because
of much lower I/O. However, the query plans are virtually identical.
You can improve performance by utilizing alternate indexes. For example, a non-clustered index exists on the title_id column:
SELECT title_id
FROM big_sales

StmtText
---------------------------------------------------------------------
|--Index Scan(OBJECT:([pubs].[dbo].[big_sales].[ndx_sales_ttlID]))
The above query performs in a fraction of the time of the SELECT * query because it
can answer its needs entirely from the non-clustered index. This type of query is called
a covering query because the entire result set is covered by a non-clustered index.
SEEK versus SCAN

One of the first things you'll need to distinguish in a query plan is the difference between a SEEK and a SCAN operation.

A very simple but useful rule of thumb is that SEEK operations are good while SCAN operations are less-than-good, if not downright bad. Seeks go directly, or at least very quickly, to the needed records, while scans read the whole object (either a table, a clustered index, or a non-clustered index). Thus, scans usually consume many more resources than seeks.

If your query plan shows only scans, then you should consider tuning the query.
The WHERE clause can make a huge difference in query performance, as shown below:

SELECT *
FROM big_sales
WHERE stor_id = '6380'

StmtText
---------------------------------------------------------------------------------
|--Clustered Index Seek(OBJECT:([pubs].[dbo].[big_sales].[UPKCL_big_sales]),
     SEEK:([big_sales].[stor_id]=[@1]) ORDERED FORWARD)
The query above is now able to perform a SEEK rather than a SCAN on the clustered index. The SHOWPLAN describes exactly what the seek operation is based upon (stor_id) and shows that the results are ORDERED according to how they are currently stored in the index mentioned. Since SQL Server 2000 supports forward and backward scrolling through indexes with equal performance, you may see ORDERED FORWARD or ORDERED BACKWARD in the query plan. This merely tells you which direction the table or index was read. You can even manipulate this behavior by using the ASC and DESC keywords in your ORDER BY clauses.
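For example, reversing the sort on the clustered key should flip the read direction (a sketch; the exact plan text may vary on your server):

```sql
SELECT *
FROM big_sales
ORDER BY stor_id DESC
-- The Clustered Index Scan in the plan now reports ORDERED BACKWARD
```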
Range queries return query plans that look very similar to the direct query shown before. The following two range queries give you an idea:

SELECT *
FROM big_sales
WHERE stor_id >= '7131'

StmtText
---------------------------------------------------------------------------------
|--Clustered Index Seek(OBJECT:([pubs].[dbo].[big_sales].[UPKCL_big_sales]),
     SEEK:([big_sales].[stor_id] >= '7131') ORDERED FORWARD)

The above query looks a lot like the previous example, except the SEEK predicate is somewhat different.

SELECT *
FROM big_sales
WHERE stor_id BETWEEN '7066' AND '7131'

StmtText
---------------------------------------------------------------------------------
|--Clustered Index Seek(OBJECT:([pubs].[dbo].[big_sales].[UPKCL_big_sales]),
     SEEK:([big_sales].[stor_id] >= '7066' AND [big_sales].[stor_id] <= '7131')
     ORDERED FORWARD)
- The database architect made good guesses at indexing tables when they were created, but the transaction load has changed over time, rendering the indexes less effective.

If you see a lot of scans in your query plan and not many seeks, you should reevaluate your indexes. For example, look at the query below:
SELECT ord_num
FROM sales
WHERE ord_date IS NOT NULL
AND ord_date > 'Jan 01, 2002 12:00:00 AM'

StmtText
---------------------------------------------------------------------------------
|--Clustered Index Scan(OBJECT:([pubs].[dbo].[sales].[UPKCL_sales]),
     WHERE:([sales].[ord_date]>'Jan 1 2002 12:00AM'))

The query above has a WHERE clause against the ord_date column, yet no index seek operation takes place. Looking at the table, we see that there is no index on the ord_date column, but there probably should be one. If we add one, the query plan looks like this:

StmtText
---------------------------------------------------------------------------------
|--Index Seek(OBJECT:([pubs].[dbo].[sales].[sales_ord_date]),
     SEEK:([sales].[ord_date] > 'Jan 1 2002 12:00AM') ORDERED FORWARD)

Now the query is performing an INDEX SEEK operation on the sales_ord_date index that we just created.
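The index used above could be created along these lines (the name matches the plan shown; adjust to your own naming conventions):

```sql
-- Non-clustered index on the column referenced by the WHERE clause.
CREATE INDEX sales_ord_date ON sales ( ord_date )
```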
Branching Steps Illustrated by Comparing Joins and Subqueries

An old rule of thumb says that joins perform much better than a subquery that achieves the same result set.

SELECT au_fname, au_lname
FROM authors
WHERE au_id IN
   (SELECT au_id FROM titleauthor)

StmtText
---------------------------------------------------------------------------------
|--Nested Loops(Inner Join, OUTER REFERENCES:([titleauthor].[au_id]))
     |--Stream Aggregate(GROUP BY:([titleauthor].[au_id]))
     |    |--Index Scan(OBJECT:([pubs].[dbo].[titleauthor].[auidind]), ORDERED FORWARD)
     |--Clustered Index Seek(OBJECT:([pubs].[dbo].[authors].[UPKCL_auidind]),
          SEEK:([authors].[au_id]=[titleauthor].[au_id]) ORDERED FORWARD)

Table 'authors'. Scan count 38, logical reads 76, physical reads 0, read-ahead reads 0.
Table 'titleauthor'. Scan count 2, logical reads 2, physical reads 1, read-ahead reads

In this case, the query engine chooses a nested loop operation. The query is forced to read the entire authors table using a clustered index seek, chalking up quite a few logical page reads in the process.
In queries with branching steps, the indented lines show you which steps are
branches off of other steps.
Now, let's look at a join:

SELECT DISTINCT au_fname, au_lname
FROM authors AS a
JOIN titleauthor AS t ON a.au_id = t.au_id

StmtText
---------------------------------------------------------------------------------
|--Stream Aggregate(GROUP BY:([a].[au_lname], [a].[au_fname]))
     |--Nested Loops(Inner Join, OUTER REFERENCES:([a].[au_id]))
          |--Index Scan(OBJECT:([pubs].[dbo].[authors].[aunmind] AS [a]), ORDERED FORWARD)
          |--Index Seek(OBJECT:([pubs].[dbo].[titleauthor].[auidind] AS [t]),
               SEEK:([t].[au_id]=[a].[au_id]) ORDERED FORWARD)

Table 'titleauthor'. Scan count 23, logical reads 23, physical reads 0, read-ahead reads 0.
Table 'authors'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0.
With the above query, the number of logical reads goes up against the titleauthor table but goes down for the authors table. Notice that the stream aggregation occurs higher (later) in the query plan.
Comparing Query Plans

You'll use query plans to compare the relative effectiveness of two separate queries. For example, you might want to see if one query, compared to another, adds extra layers of overhead or chooses a different indexing strategy.

In this example, we compare two queries. The first uses SUBSTRING and the second uses LIKE:
SELECT *
FROM authors
WHERE SUBSTRING( au_lname, 1, 2 ) = 'Wh'

StmtText
---------------------------------------------------------------------------------
|--Clustered Index Scan(OBJECT:([pubs].[dbo].[authors].[UPKCL_auidind]),
     WHERE:(substring([authors].[au_lname], 1, 2)='Wh'))
Compare this to a similar query that uses LIKE:

SELECT *
FROM authors
WHERE au_lname LIKE 'Wh%'

StmtText
---------------------------------------------------------------------------------
|--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([pubs].[dbo].[authors]))
     |--Index Seek(OBJECT:([pubs].[dbo].[authors].[aunmind]),
          SEEK:([authors].[au_lname] >= 'WG' AND [authors].[au_lname] < 'WI'),
          WHERE:(like([authors].[au_lname], 'Wh%', NULL)) ORDERED FORWARD)
Obviously, the second query, with its INDEX SEEK operation, has a simpler query plan than the first query, with its CLUSTERED INDEX SCAN.
Hash

The best strategy for large, dissimilarly sized tables, or for complex join requirements where the join columns are not indexed or sorted, is a hash join. Hashing is used for UNION, INTERSECT, INNER, LEFT, RIGHT, and FULL OUTER JOIN, as well as set matching and difference operations. Hashing is also used for joining tables where no useful indexes exist. Hash operations build a temporary hashing table and then cycle through all of the data to produce the output.
Hashes use a build input (always the smaller table) and a probe input. The hash key (that is, the columns in the join predicate or, sometimes, in the GROUP BY list) is what the query uses to process the join. A residual predicate is any evaluation in the WHERE clause that does not apply to the join itself. Residual predicates are evaluated after the join predicates. There are several different options that SQL Server may choose from when constructing a hash join, in order of precedence:
In-memory Hash: In-memory hash joins build a temporary hash table in memory by first scanning the entire build input into memory. Each record is inserted into a hash bucket based on the hash value computed for the hash key. Next, the probe input is scanned record by record. Each probe input record is compared to the corresponding hash bucket and, where a match is found, returned in the result set.

Hybrid Hash: If the hash is only slightly larger than available memory, SQL Server may combine aspects of the in-memory hash join with the grace hash join in what is called a hybrid hash join.

Grace Hash: The grace hash option is used when the hash join is too large to be processed in memory. In that case, the whole build input and probe input are read in. They are then pumped out into multiple, temporary worktables in a step called partitioning fan-out. The hash function on the hash keys ensures that all joining records are in the same pair of partitioned worktables. Partition fan-out basically chops two long steps into many small steps that can be processed concurrently. The hash join is then applied to each pair of worktables and any matches are returned in the result set.

Recursive Hash: Sometimes the partitioned fan-out tables produced by the grace hash are still so large that they require further repartitioning. This is called a recursive hash.
Note that hash and merge joins process through each table once. So they might show deceptively low I/O metrics should you use SET STATISTICS IO ON with queries of this type. However, the low I/O does not mean these join strategies are inherently faster than nested loop joins, because of their enormous computational requirements.

Hash joins, in particular, are computationally expensive. If you find certain queries in a production application consistently using hash joins, this is your clue to tune the query or add indexes to the underlying tables.
In the following example, we show both a standard nested loop (using the default query
plan) and hash and merge joins (forced through the use of hints):
SELECT a.au_fname, a.au_lname, t.title
FROM authors AS a
INNER JOIN titleauthor ta
   ON a.au_id = ta.au_id
INNER JOIN titles t
   ON t.title_id = ta.title_id
ORDER BY au_lname ASC, au_fname ASC

StmtText
---------------------------------------------------------------------------------
|--Nested Loops(Inner Join, OUTER REFERENCES:([ta].[title_id]))
     |--Nested Loops(Inner Join, OUTER REFERENCES:([a].[au_id]))
     |    |--Index Scan(OBJECT:([pubs].[dbo].[authors].[aunmind] AS [a]), ORDERED FORWARD)
     |    |--Index Seek(OBJECT:([pubs].[dbo].[titleauthor].[auidind] AS [ta]),
     |         SEEK:([ta].[au_id]=[a].[au_id]) ORDERED FORWARD)
     |--Clustered Index Seek(OBJECT:([pubs].[dbo].[titles].[UPKCL_titleidind] AS [t]),
          SEEK:([t].[title_id]=[ta].[title_id]) ORDERED FORWARD)
The showplan displayed above is the standard query plan produced by SQL Server. We can force SQL Server to show us how it would handle these as merge and hash joins using hints:

SELECT a.au_fname, a.au_lname, t.title
FROM authors AS a
INNER MERGE JOIN titleauthor ta
   ON a.au_id = ta.au_id
INNER HASH JOIN titles t
   ON t.title_id = ta.title_id
ORDER BY au_lname ASC, au_fname ASC
Warning: The join order has been enforced because a local join hint is used.
StmtText
---------------------------------------------------------------------------------
|--Sort(ORDER BY:([a].[au_lname] ASC, [a].[au_fname] ASC))
     |--Hash Match(Inner Join, HASH:([ta].[title_id])=([t].[title_id]),
          RESIDUAL:([ta].[title_id]=[t].[title_id]))
          |--Merge Join(Inner Join, MERGE:([a].[au_id])=([ta].[au_id]),
          |    RESIDUAL:([ta].[au_id]=[a].[au_id]))
          |    |--Clustered Index Scan(OBJECT:([pubs].[dbo].[authors].[UPKCL_auidind]
          |    |      AS [a]), ORDERED FORWARD)
          |    |--Index Scan(OBJECT:([pubs].[dbo].[titleauthor].[auidind] AS [ta]),
          |           ORDERED FORWARD)
          |--Index Scan(OBJECT:([pubs].[dbo].[titles].[titleind] AS [t]))
In this example, you can clearly see that each join considers the join predicate of the other join to be a residual predicate. (You'll also note that the use of a hint caused SQL Server to issue a warning.) This query was also forced to use a SORT operation to support the hash and merge joins.
Incidentally, the result sets are the same in both cases, though the sort orders are different because the join query (with its GROUP BY clause) has an implicit ORDER BY:

Store                                     Books Sold
----------------------------------------  -----------
Barnum's                                  154125
Bookbeat                                  518080
Doc-U-Mat: Quality Laundry and Books      581130
Eric the Read Books                       76931
Fricative Bookshop                        259060
News & Brews                              161090

(6 row(s) affected)

Store                                     Books Sold
----------------------------------------  -----------
Eric the Read Books                       76931
Barnum's                                  154125
News & Brews                              161090
Doc-U-Mat: Quality Laundry and Books      581130
Fricative Bookshop                        259060
Bookbeat                                  518080

(6 row(s) affected)
Examination of the query plan of the subquery approach shows:

|--Compute Scalar(DEFINE:([Expr1006]=isnull([Expr1004], 0)))
     |--Nested Loops(Left Outer Join, OUTER REFERENCES:([st].[stor_id]))
          |--Nested Loops(Inner Join, OUTER REFERENCES:([big_sales].[stor_id]))
          |    |--Stream Aggregate(GROUP BY:([big_sales].[stor_id]))
          |    |    |--Clustered Index Scan(OBJECT:([pubs].[dbo].[big_sales].
          |    |         [UPKCL_big_sales]), ORDERED FORWARD)
          |    |--Clustered Index Seek(OBJECT:([pubs].[dbo].[stores].[UPK_storeid] AS
          |         [st]), SEEK:([st].[stor_id]=[big_sales].[stor_id]) ORDERED FORWARD)
          |--Stream Aggregate(DEFINE:([Expr1004]=SUM([bs].[qty])))
               |--Clustered Index Seek(OBJECT:([pubs].[dbo].[big_sales].
                    [UPKCL_big_sales] AS [bs]),
                    SEEK:([bs].[stor_id]=[st].[stor_id]) ORDERED FORWARD)
Whereas in the join query, we have:
|--Stream Aggregate(GROUP BY:([st].[stor_name])
     DEFINE:([Expr1004]=SUM([partialagg1005])))
     |--Sort(ORDER BY:([st].[stor_name] ASC))
          |--Nested Loops(Left Semi Join, OUTER REFERENCES:([st].[stor_id]))
               |--Nested Loops(Inner Join, OUTER REFERENCES:([bs].[stor_id]))
               |    |--Stream Aggregate(GROUP BY:([bs].[stor_id])
               |    |    DEFINE:([partialagg1005]=SUM([bs].[qty])))
               |    |    |--Clustered Index Scan(OBJECT:([pubs].[dbo].[big_sales].
               |    |         [UPKCL_big_sales] AS [bs]), ORDERED FORWARD)
               |    |--Clustered Index Seek(OBJECT:([pubs].[dbo].[stores].
               |         [UPK_storeid] AS [st]), SEEK:([st].[stor_id]=[bs].[stor_id])
               |         ORDERED FORWARD)
               |--Clustered Index Seek(OBJECT:([pubs].[dbo].[big_sales].
                    [UPKCL_big_sales]), SEEK:([big_sales].[stor_id]=[st].[stor_id])
                    ORDERED FORWARD)
A solution using a join is more efficient: it avoids the correlated Stream Aggregate that re-sums the big_sales.qty column for each outer row, which the subquery approach requires.
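For reference, the two plans above are consistent with queries of roughly the following shape (a hypothetical reconstruction; the actual statements appear earlier in the paper):

```sql
-- Subquery form: a correlated subquery re-aggregates big_sales.qty
-- once per qualifying store.
SELECT st.stor_name AS "Store",
       ( SELECT SUM( bs.qty )
         FROM big_sales AS bs
         WHERE bs.stor_id = st.stor_id ) AS "Books Sold"
FROM stores AS st
WHERE st.stor_id IN ( SELECT stor_id FROM big_sales )

-- Join form: one pass over big_sales feeds a single aggregation.
-- (The Left Semi Join in the plan suggests the original also kept
-- an IN or EXISTS filter against big_sales.)
SELECT st.stor_name AS "Store",
       SUM( bs.qty ) AS "Books Sold"
FROM stores AS st
JOIN big_sales AS bs
  ON st.stor_id = bs.stor_id
GROUP BY st.stor_name
```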
UNION vs. UNION ALL
Whenever possible, use UNION ALL instead of UNION. The difference is that UNION has the side effect of eliminating all duplicate rows and sorting the results, which UNION ALL does not do. Selecting a distinct result requires building a temporary worktable, storing all rows in it, and sorting them before producing the output. (Displaying the showplan for a SELECT DISTINCT query will reveal that a stream aggregation is taking place, consuming as much as 30% of the resources used to process the query.) In some cases that is exactly what you need, and then UNION is your friend. But if you do not expect any duplicate rows in the result set, use UNION ALL. It simply selects from one table or join, then selects from another, appending the results to the bottom of the first result set. UNION ALL requires no worktable and no sorting (unless other, unrelated conditions cause them), so in most cases it is much more efficient. One more potential problem with UNION is the danger of flooding the tempdb database with a huge worktable, which may happen when a UNION query produces a large result set.

Example

The following queries select the ID of every store in the sales table, which ships as-is with the pubs database, and the ID of every store in the big_sales table, a version of the sales table that we populated with over 70,000 rows. The only difference between the two solutions is the use of UNION versus UNION ALL, but the addition of the ALL keyword makes a big difference in the query plan. The first solution requires stream aggregation and a sort of the results before they are returned to the client. The second query is much more efficient, especially for large tables. In this example both queries return the same result set, though in a different order. In our testing we had two temporary tables at the time of execution. Your results may vary.
UNION Solution:

SELECT stor_id FROM big_sales
UNION
SELECT stor_id FROM sales

|--Merge Join(Union)
     |--Stream Aggregate(GROUP BY:([big_sales].[stor_id]))
     |    |--Clustered Index Scan(OBJECT:([pubs].[dbo].[big_sales].
     |         [UPKCL_big_sales]), ORDERED FORWARD)
     |--Stream Aggregate(GROUP BY:([sales].[stor_id]))
          |--Clustered Index Scan(OBJECT:([pubs].[dbo].[sales].
               [UPKCL_sales]), ORDERED FORWARD)

Table 'sales'. Scan count 1, logical reads 2, physical reads 0,
read-ahead reads 0.
Table 'big_sales'. Scan count 1, logical reads 463, physical reads 0,
read-ahead reads 0.

UNION ALL Solution:

SELECT stor_id FROM big_sales
UNION ALL
SELECT stor_id FROM sales

|--Concatenation
     |--Index Scan(OBJECT:([pubs].[dbo].[big_sales].[ndx_sales_ttlID]))
     |--Index Scan(OBJECT:([pubs].[dbo].[sales].[titleidind]))

Table 'sales'. Scan count 1, logical reads 1, physical reads 0,
read-ahead reads 0.
Table 'big_sales'. Scan count 1, logical reads 224, physical reads 0,
read-ahead reads 0.
Although the result sets in this example are interchangeable, you can see that the UNION ALL statement consumed less than half of the resources that the UNION statement consumed. So be sure to anticipate your result sets, and for those that are already distinct, use the UNION ALL clause.
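If you are unsure whether a UNION can safely become a UNION ALL, a quick count comparison tells you whether the branches contain any duplicate rows (a sketch using the tables from the example above):

```sql
-- If distinct_rows equals total_rows, no duplicates exist across or
-- within the branches, and UNION ALL is a safe, cheaper replacement.
SELECT ( SELECT COUNT(*)
         FROM ( SELECT stor_id FROM big_sales
                UNION
                SELECT stor_id FROM sales ) AS u ) AS distinct_rows,
       ( SELECT COUNT(*) FROM big_sales )
     + ( SELECT COUNT(*) FROM sales )          AS total_rows
```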
Functions and Expressions That Suppress Indexes

When you apply built-in functions or expressions to indexed columns, the optimizer cannot use the indexes on those columns. Try to rewrite these conditions so that the index keys are not involved in any expression.
Examples

You have to help SQL Server by removing any expressions around numeric columns that form an index. The following queries select a row from the jobs table by a unique key that has a unique clustered index. If you apply an expression to the column, the index is suppressed. But once you change the condition job_id - 2 = 0 to job_id = 2, the optimizer performs a seek operation against the clustered index.
Query With Suppressed Index:

SELECT *
FROM jobs
WHERE ( job_id - 2 ) = 0

|--Clustered Index Scan(OBJECT:([pubs].[dbo].[jobs].[PK__jobs__117F9D94]),
     WHERE:(Convert([jobs].[job_id])-2=0))

Optimized Query Using Index:

SELECT *
FROM jobs
WHERE job_id = 2

|--Clustered Index Seek(OBJECT:([pubs].[dbo].[jobs].[PK__jobs__117F9D94]),
     SEEK:([jobs].[job_id]=Convert([@1])) ORDERED FORWARD)

Note that the seek is much better than the scan performed by the previous query.
The following examples show more queries that suppress an index on columns of different types, and how you can rewrite them for optimal performance.
Query With Suppressed Index:

DECLARE @job_id VARCHAR(5)
SELECT @job_id = 2
SELECT *
FROM jobs
WHERE CONVERT( VARCHAR(5), job_id ) = @job_id

Optimized Query Using Index:

DECLARE @job_id VARCHAR(5)
SELECT @job_id = 2
SELECT *
FROM jobs
WHERE job_id = CONVERT( SMALLINT, @job_id )

Query With Suppressed Index:

SELECT *
FROM authors
WHERE au_fname + ' ' + au_lname = 'Johnson White'

Optimized Query Using Index:

SELECT *
FROM authors
WHERE au_fname = 'Johnson'
  AND au_lname = 'White'

Query With Suppressed Index:

SELECT *
FROM authors
WHERE SUBSTRING( au_lname, 1, 2 ) = 'Wh'

Optimized Query Using Index:

SELECT *
FROM authors
WHERE au_lname LIKE 'Wh%'

Query With Suppressed Index:

CREATE INDEX employee_hire_date
ON employee ( hire_date )
GO
-- Get all employees hired
-- in the 1st quarter of 1990:
SELECT *
FROM employee
WHERE DATEPART( year, hire_date ) = 1990
  AND DATEPART( quarter, hire_date ) = 1

Optimized Query Using Index:

CREATE INDEX employee_hire_date
ON employee ( hire_date )
GO
-- Get all employees hired
-- in the 1st quarter of 1990:
SELECT *
FROM employee
WHERE hire_date >= '1/1/1990'
  AND hire_date < '4/1/1990'

Query With Suppressed Index:

-- Suppose that hire_date may
-- contain time other than 12AM.
-- Who was hired on 2/21/1990?
SELECT *
FROM employee
WHERE CONVERT( CHAR(10), hire_date, 101 ) = '2/21/1990'

Optimized Query Using Index:

-- Suppose that hire_date may
-- contain time other than 12AM.
-- Who was hired on 2/21/1990?
SELECT *
FROM employee
WHERE hire_date >= '2/21/1990'
  AND hire_date < '2/22/1990'
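To confirm that a rewrite actually re-enables an index, you can bracket the new query with the SHOWPLAN statements discussed earlier (a sketch using the employee example above):

```sql
-- Display the plan without executing the query.
SET SHOWPLAN_TEXT ON
GO
SELECT *
FROM employee
WHERE hire_date >= '1/1/1990'
  AND hire_date < '4/1/1990'
GO
SET SHOWPLAN_TEXT OFF
GO
-- In the StmtText output, look for an Index Seek on
-- employee_hire_date instead of a Clustered Index Scan.
```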
SET NOCOUNT ON

The trick of speeding up T-SQL code with SET NOCOUNT ON is surprisingly obscure to many SQL Server developers and DBAs. You may have already noticed that successful queries return a system message reporting the number of rows they affect. In many cases you do not need this information. The SET NOCOUNT ON command lets you suppress this message for all subsequent transactions in your session, until you issue the SET NOCOUNT OFF command. This option has more than a cosmetic effect on the output generated by your script: it reduces the amount of information passed from the server to the client, which lowers network traffic and improves the overall response time of your transactions. The time needed to pass a single message may be negligible, but think about a script that executes queries in a loop and sends kilobytes of useless information to the user.
As an example, the enclosed file has a T-SQL batch that inserts 9999 rows into the
big_sales table.
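The batch in the enclosed file is not reproduced here, but a loop of roughly this shape demonstrates the effect (a sketch; the inserted values are illustrative):

```sql
-- With SET NOCOUNT ON, the 9999 INSERTs below send no
-- "(1 row(s) affected)" messages back to the client.
SET NOCOUNT ON
DECLARE @i INT
SET @i = 1
WHILE @i <= 9999
BEGIN
    INSERT INTO big_sales ( stor_id, ord_num, ord_date, qty, payterms, title_id )
    VALUES ( '7066', 'QA' + CONVERT( VARCHAR(8), @i ),
             GETDATE(), 5, 'Net 30', 'PC8888' )
    SET @i = @i + 1
END
SET NOCOUNT OFF
```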
Using Tools to Tune SQL Statements

In the previous sections you learned how to apply different Microsoft SQL Server techniques to tune SQL statements. Using tools when tuning SQL statements is crucial to improving productivity and eliminating user errors. This section shows how you can use different tools to boost SQL performance and your productivity.
Microsoft Query Analyzer

Microsoft SQL Server includes the Query Analyzer tool, which enables users to write and execute SQL statements and T-SQL scripts. The Query Analyzer graphically displays execution plans either before or after SQL execution. The Display Estimated Execution Plan option under the Query menu displays the query plan that SQL Server will use to execute the SQL statement. The Show Execution Plan option under the Query menu displays the query plan SQL Server actually used during execution. The graphical execution plan uses icons to represent the steps and data-retrieval methods that SQL Server chose to execute the SQL statement. The execution plan is the graphical representation of the tabular output produced by the SET SHOWPLAN_ALL or SET SHOWPLAN_TEXT statements (Figure 1).
By looking at the execution plan operations you can get an understanding of the
performance characteristics of a SQL statement and identify the need for tuning.
Execution plans can become very complicated for complex SQL statements, which makes it harder for a user to read the plan and locate performance inefficiencies.
If you determine that a SQL statement needs tuning, you can use the Query Analyzer to tune it by hand: open a new window, reformulate the SQL statement using some of the techniques presented in this document, review the execution plan, and execute the SQL to obtain the run time. You then repeat this process until you find an alternative SQL statement with satisfactory performance.
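One common pattern for this loop is to wrap each candidate in the SET STATISTICS commands described earlier, so every run reports its own I/O and timing (a sketch; the statement shown is one of the examples from this paper):

```sql
-- Report I/O and CPU/elapsed time for each statement in this session.
SET STATISTICS IO ON
SET STATISTICS TIME ON
GO

-- Candidate rewrite #1:
SELECT stor_id FROM big_sales
UNION ALL
SELECT stor_id FROM sales
GO

-- Compare the logical reads and elapsed times of each candidate,
-- reformulate, and re-run until one version is clearly cheapest.
SET STATISTICS IO OFF
SET STATISTICS TIME OFF
GO
```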
The limitation of this approach is that reformulating a complex SQL statement requires expertise in SQL tuning, and when changing the SQL code a human expert can make mistakes that lead to an alternative SQL statement that does not return the same result set as the original.

Another limitation is that, since the process is manually intensive, the number of SQL alternatives the user can try is limited by the time the user can spend figuring out new ways to write the SQL statement. To make the SQL tuning process more efficient, avoid user errors, and save time, it is advisable to use a tool that automates the SQL tuning process.
Figure 1. Query Analyzer's graphical execution plan
Quest Central for SQL Server SQL Tuning

Quest Central for SQL Server is an integrated database management solution that simplifies everyday tasks and incorporates a set of tools that enable users to achieve higher levels of availability and reliability. Quest Central for SQL Server integrates Database Administration, Space Management, Database Analysis, and SQL Tuning.

Quest Central's SQL Server SQL Tuning integrates a graphical execution plan display; a SQL Scanner that proactively identifies problematic SQL statements directly in database objects or source code; and a SQL Optimizer that automatically rewrites a SQL statement into every possible alternative, making it possible to identify the most efficient SQL for a specific database environment.
Typically, database applications contain thousands of SQL statements. The SQL statements can be located in database objects such as views and stored procedures, or in application source code. Without an automated tool, the process of extracting and reviewing each SQL statement manually is very tedious and time consuming. The SQL Scanner module automates the process of extracting and reviewing SQL statements directly from source code and offers a proactive approach to identifying potential SQL performance problems (Figure 2).
The SQL Scanner extracts SQL statements embedded in database objects, source code, and Microsoft SQL Server Profiler trace files and tables, without executing any program. The SQL Scanner can extract SELECT, SELECT..INTO, INSERT, DELETE, and UPDATE statements. It analyzes, in batch, the execution plan of each SQL statement and categorizes the statements according to different levels of complexity and suspected levels of performance problems. Execution plans with inefficiencies and operations that can cause high I/O, such as full table scans on large tables, full table scans in nested loops, or many table scans, are classified as Problematic. With this approach, the SQL Scanner allows you to be proactive in detecting SQL performance problems.
Figure 2. SQL Scanner, analyzing multiple SQL statements to identify
performance problems.
Due to the complexity of the SQL language, there are many ways to write a SQL statement that returns the same result set, but small SQL code variations can have a great impact on performance. The SQL Optimizer uses a SQL transformation engine that completely transforms a SQL statement into every possible equivalent SQL variation, preserving the same logic in each alternative statement. The SQL rewrite process includes the use of syntactical SQL transformations and SQL Server hints, which are optional to the user. Once the SQL statement has been transformed, the SQL Optimizer obtains the execution plan for each SQL statement and narrows the optimized statements down to those with a unique execution plan, since the execution plan is what determines the performance of a SQL statement. This comprehensive SQL transformation process occurs on the PC, and thus does not affect database server resources (Figure 3).
Figure 3. SQL Optimizer automatically rewrites SQL statements
Upon completion of the optimization process, the SQL Optimizer displays a list of SQL alternatives, their execution plans, and the SQL Server cost associated with each execution plan. The user can review the SQL alternatives and determine which to execute in the database to obtain run times and I/O information and prove which statement is the fastest for the database environment. Once the most efficient SQL statement has been identified, the user can activate the SQL Comparer to view the SQL alternatives side by side, displaying the syntax, execution plan, and run-time statistics.
Conclusion

Application performance in a Microsoft SQL Server environment is directly related to the efficiency of the SQL statements involved. This article presented several Microsoft SQL Server techniques for tuning SQL statements. Tuning SQL statements by hand or with SQL Server's native utilities is a labor- and knowledge-intensive task. Quest Central for SQL Server SQL Tuning offers a solution that automates the process of SQL tuning, saving DBAs' and developers' time, decreasing the amount of experience and knowledge required, increasing their productivity, and maximizing the performance of SQL statements throughout your Microsoft SQL Server systems.
About the Authors

Kevin Kline serves as the Director of Technology for SQL Server at Quest Software, designing products for DBAs and database developers. Kevin is the author of four books, including the very popular "SQL in a Nutshell" and "Transact-SQL Programming", both published by O'Reilly & Associates (www.oreilly.com), and of numerous magazine and online articles. Kevin is also a Microsoft MVP (www.microsoft.com/mvp) for SQL Server, and he is active in the SQL Server community, serving as President of the Professional Association for SQL Server (www.sqlpass.org). When he's not spending time on database technology, Kevin enjoys romancing his wife, spending time with his four children, practicing classical guitar (very badly), and gardening.
Claudia Fernandez is a Product Manager of SQL Tuning products at Quest Software.
Claudia has contributed to the strategic direction of SQL tuning products for multiple
RDBMS since early 2000. She has presented at several technical conferences on
RDBMS and application performance tuning topics. Claudia holds an MS in Computer Science and has several years of industry experience working with SQL Server, Sybase ASE, Oracle, DB2 UDB, and other associated technologies. She enjoys movies and traveling.
Additional Resources

White Paper: "Analyzing and Optimizing T-SQL Query Performance on Microsoft SQL Server using SET and DBCC" by Kevin Kline
http://www.quest.com/whitepapers/tuning_article_1_final.pdf
White Paper: Microsoft T-SQL Performance Tuning Part 4: SHOWPLAN Output and
Analysis by Kevin Kline
http://www.quest.com/whitepapers/tuning_article_4.pdf
Material adapted from "Transact-SQL Programming" (O'Reilly & Associates, ISBN:
1565924010) by Kevin Kline, Lee Gould, and Andrew Zanevsky,
http://www.oreilly.com/catalog/wintrnssql/.
About Quest Software

Quest Software, Inc. (NASDAQ: QSFT), a leading provider of application, database, and Windows management solutions, provides Application Confidence to 18,000 customers worldwide, including 75 percent of the Fortune 500. Quest Software's products help our customers develop, deploy, and maintain enterprise applications without expensive downtime or business interruption. With this focus, Quest Software enables IT professionals to achieve more with fewer resources. Headquartered in Irvine, Calif., Quest Software has offices around the globe. For more information on Quest Software, visit www.quest.com.
World Headquarters
8001 Irvine Center Drive
Irvine, CA 92618
www.quest.com
e-mail: [email protected]
Inside U.S.: 1.800.306.9329
Outside U.S.: 1.949.754.8000
Please refer to our Web site for regional and international office information. For more
information on Quest Central for Databases or other Quest Software solutions, visit
www.quest.com/quest_central.
Copyright 2004 Quest Software, Inc. Quest Central is a registered trademark of Quest Software. The information in this publication is furnished for informational use only, does not constitute a commitment from Quest Software, Inc. of any features or functions discussed, and is subject to change without notice. Quest Software, Inc. assumes no responsibility or liability for any errors or inaccuracies that may appear in this publication.
September 2004