
    Tuning SQL Statements on Microsoft SQL Server 2000

By Kevin Kline, Director of Technology, SQL Server Solutions Group, Quest Software, Inc.

    and Claudia Fernandez, Product Manager, Quest Software, Inc.


    Contents

Introduction
Microsoft Tuning Techniques
    SET STATISTICS IO
    SET STATISTICS TIME
    SHOWPLAN Output and Analysis
        SHOWPLAN Output
        SHOWPLAN Operations
Reading the Query Plan
    Getting Started
    SEEK versus SCAN
    Branching Steps Illustrated by Comparing Joins and Subqueries
    Comparing Query Plans
    Understanding the Impact of Joins
Query Tuning Techniques
    Subqueries Optimization
        Example
    UNION vs. UNION ALL
        Example
    Functions and Expressions That Suppress Indexes
        Examples
    SET NOCOUNT ON
    TOP n
Using Tools to Tune SQL Statements
    Microsoft Query Analyzer
    Quest Central for SQL Server SQL Tuning
Conclusion
About the Authors
About Quest Software


Tuning SQL Statements on Microsoft SQL Server 2000

    By Kevin Kline and Claudia Fernandez

Introduction

This paper covers the basic techniques used to tune SELECT statements on Microsoft's

    SQL Server 2000 relational database management system. We discuss the techniques

    available using Microsoft's graphical user interfaces provided in Microsoft SQL

    Enterprise Manager or Microsoft SQL Query Analyzer, as well as providing a brief

    overview of Quest Software's query tuning tools.

    In addition to tuning methods, we'll show you several best practices you can apply to

    your SQL statements to improve performance. [All examples and syntax are verified

    for Microsoft SQL Server 2000.] After reading this paper, you should have a basic

    understanding of query tuning tools and techniques available with the Microsoft tool

    kit. We will cover a variety of querying techniques that improve performance and

    speed data read operations.

    SQL Server provides you with capabilities to benchmark transactions by sampling I/O

activity and elapsed execution time using certain SET and DBCC commands. In addition, some DBCC commands may be used to obtain a very detailed explanation of

    any index statistic, estimate the cost of every possible execution plan, and boost

performance. The SET and DBCC commands are fully detailed in the Quest white paper entitled "Analyzing and Optimizing T-SQL Query Performance on Microsoft SQL Server using SET and DBCC," the first white paper in a four-part series on performance tuning.

Microsoft Tuning Techniques

Microsoft provides you with three primary means for tuning queries:

Checking the reads and writes generated by the query using SET STATISTICS IO

    Checking the running time of the query using SET STATISTICS TIME

    Analyzing the query plan of the query using SET SHOWPLAN

SET STATISTICS IO

The command SET STATISTICS IO ON forces SQL Server to report actual I/O activity on executed transactions. It cannot be paired with the SET NOEXEC ON option, because it only makes sense to monitor I/O activity on commands that actually execute. Once the option is enabled, every query produces additional output that contains I/O statistics. In order to disable the option, execute SET STATISTICS IO OFF.

These commands also work on Sybase Adaptive Server, though some result sets may look somewhat different.


    For example, the following script obtains I/O statistics for a simple query counting

rows of the employees table in the northwind database:

SET STATISTICS IO ON
GO
SELECT COUNT(*) FROM employees
GO
SET STATISTICS IO OFF
GO

    Results:

-----------
2977

Table 'Employees'. Scan count 1, logical reads 53, physical reads 0, read-ahead reads 0.

    The scan count tells us the number of scans performed. Logical reads show the number

    of pages read from the cache. Physical reads show the number of pages read from the

    disk. Read-ahead reads indicate the number of pages placed in the cache in anticipationof future reads.

    Additionally, we execute a system stored procedure to obtain table size statistics for

    our analysis:

    sp_spaceused employees

    Results:

name       rows   reserved   data      index_size   unused
---------- ------ ---------- --------- ------------ --------
Employees  2977   2008 KB    1504 KB   448 KB       56 KB

    What can we tell by looking at this information?

The query did not have to scan the whole table. The amount of data in the table is more than 1.5 megabytes, yet it took only 53 logical I/O operations to obtain the result. This indicates that the query found an index that could be used to compute the result, and scanning the index took fewer I/O operations than it would take to

    scan all data pages.

    Index pages were mostly found in data cache since the physical reads value is

    zero. This is because we executed the query shortly after other queries on

employees, and the table and its index were already cached. Your mileage may vary.

    Microsoft has reported no read-ahead activity. In this case, data and index

pages were already cached. For a table scan on a large table, read-ahead would

    probably kick in and cache necessary pages before your query requested them.

    Read-ahead turns on automatically when SQL Server determines that your

    transaction is reading database pages sequentially and believes that it can

predict which pages you'll need next. A separate SQL Server connection

    virtually runs ahead of your process and caches data pages for it.


    [Configuration and tuning of read-ahead parameters is beyond the scope of this

    paper.]

    In this example, the query was executed as efficiently as possible. No further tuning is

    required.

SET STATISTICS TIME

The elapsed time of a transaction is a volatile measurement, since it depends on the activity of other users on the server. However, it provides a real measurement, unlike the number of data pages, which doesn't mean anything to your users. They are concerned about the seconds and minutes they spend waiting for a query to come back, not about data caches and read-ahead efficiency. The SET STATISTICS TIME ON command reports the actual elapsed time and CPU utilization for every query that follows. Executing SET STATISTICS TIME OFF turns the option off.

SET STATISTICS TIME ON
GO
SELECT COUNT(*) FROM titleauthors
GO
SET STATISTICS TIME OFF
GO

    Results:

SQL Server Execution Times:
   cpu time = 0 ms.  elapsed time = 8672 ms.

SQL Server Parse and Compile Time:
   cpu time = 10 ms.

-----------
25

(1 row(s) affected)

SQL Server Execution Times:
   cpu time = 0 ms.  elapsed time = 10 ms.

SQL Server Parse and Compile Time:
   cpu time = 0 ms.

    The first message reports a somewhat confusing elapsed time value of 8,672

    milliseconds. This number is not related to our script and indicates the amount of time

    that has passed since the previous command execution. You may disregard this first

    message. It took SQL Server only 10 milliseconds to parse and compile the query. It

    took 0 milliseconds to execute it (shown after the result of the query). What this really

means is that the duration of the query was too short to measure. The last message, which reports a parse and compile time of 0 ms, refers to the SET STATISTICS TIME OFF command (that's what it took to compile it). You may disregard this message as well; the messages that matter are the parse-and-compile time and the execution time immediately surrounding the query's own result.


    Note that elapsed and CPU time are shown in milliseconds. The numbers may vary on

your computer (but don't try to compare your machine's performance to our notebook

    PCs, because this is not a representative benchmark). Moreover, every time you

    execute this script you may get slightly different statistics depending on what else your

SQL Server was processing at the same time.

If you need to measure the elapsed duration of a set of queries or a stored procedure, it may

    be more practical to implement it programmatically (shown below). The reason is that

STATISTICS TIME reports the duration of every single query, and you have to add things up manually when you run multiple commands. Imagine the size of the output and the

    amount of manual work in cases when you time a script that executes a set of queries

    thousands of times in a loop!

    Instead, consider the following script to capture time before and after the transaction

    and report the total duration in seconds (you may use milliseconds if you prefer):

DECLARE @start_time DATETIME
SELECT @start_time = GETDATE()

    < any query or a script that you want to time, without a GO >

SELECT 'Elapsed Time, sec' = DATEDIFF( second, @start_time, GETDATE() )
GO

If your script consists of several steps separated by GO, you cannot use a local variable to save the start time. A variable is destroyed at the end of the step, defined by the GO

    command, where it was created. But you can preserve start time in a temporary table

    like this:

CREATE TABLE #save_time ( start_time DATETIME NOT NULL )
INSERT #save_time VALUES ( GETDATE() )
GO

< any script that you want to time (may include GO) >

GO
SELECT 'Elapsed Time, sec' = DATEDIFF( second, start_time, GETDATE() )
  FROM #save_time
DROP TABLE #save_time
GO

Remember that SQL Server's DATETIME datatype stores time values in roughly 3-millisecond increments. It is impossible to get more granular time values than that using the DATETIME datatype.
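If you prefer milliseconds, the same pattern works with DATEDIFF at millisecond granularity. The following is a minimal sketch (the sample query is just an illustration); because of the DATETIME datatype, reported values will still land on roughly 3-millisecond boundaries:

DECLARE @start_time DATETIME
SELECT @start_time = GETDATE()

-- the statement being timed; any query without a GO works here
SELECT COUNT(*) FROM titleauthor

SELECT 'Elapsed Time, ms' = DATEDIFF( millisecond, @start_time, GETDATE() )
GO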

SHOWPLAN Output and Analysis

This paper illustrates, through example explain plans, the meaning and usefulness of the output produced using SET SHOWPLAN_TEXT ON in Microsoft SQL Server 2000. An explain plan (also called a query plan, execution plan, or optimizer plan) provides the exact details of the steps the database query engine uses to execute a SQL transaction. Knowing how to read explain plans expands your ability to perform high-

    end query tuning and optimization.

    Note: Most examples are based on either the PUBS database or on SQL

    Server system tables. For the examples, we added tens of thousands of rows

    to many tables so that the query optimizer has some real work to do when

    evaluating query plans.


SHOWPLAN Output

One of the things that we like about the query optimizer is that it provides feedback in

    the form of a query execution plan. Now we explain it in more detail and describe

messages that you may encounter in query plans. Understanding this output brings your optimization efforts to a new level. You no longer need to treat the optimizer as a

    black box that touches your queries with a magic wand.

    The following command instructs SQL Server to show the execution plan for every

    query that follows in the same connection (or process), or turns this option off:

    SET SHOWPLAN_TEXT { ON | OFF }

By default, SHOWPLAN_TEXT ON causes the code you are examining not to execute. Instead, SQL Server compiles the code and displays the query execution plan for that

    code. It continues with this behavior until you issue the command

SET SHOWPLAN_TEXT OFF.

    Typical T-SQL code that is used to obtain an execution plan for a query without

    actually running it follows:

SET SHOWPLAN_TEXT ON
GO

< the query you want to examine >

GO
SET SHOWPLAN_TEXT OFF
GO

    Other Useful SET Commands

There are a variety of SET commands that are useful for tuning and debugging. We covered SET STATISTICS earlier in this document. You might find these other SET commands useful in certain situations:

    1. SET NOEXEC { ON | OFF }: checks the syntax of your Transact-SQL code,

    including compiling the code but not executing it. This is useful for checking the

    syntax of a query while taking advantage of deferred-name resolution. That is,

you can check a query's syntax on a table that hasn't been created yet.

    2. SET FMTONLY { ON | OFF }: returns only the metadata of a query to the client.

For SELECT statements, this usually means it returns only the column headers.

3. SET PARSEONLY { ON | OFF }: checks the syntax of your Transact-SQL code,

    but does not compile or execute the code.

All of these commands remain in effect once set to ON until you manually turn them OFF. These settings do not take effect immediately; they start working from the next step. In other words, you have to issue a GO command before the SHOWPLAN or NOEXEC setting is enabled.
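For example, here is a minimal sketch of the NOEXEC pattern against the pubs authors table; the GO after each SET is what allows the setting to take effect before the next batch is compiled:

SET NOEXEC ON
GO
-- Parsed and compiled, but not executed: no result set is returned
SELECT au_id, au_lname FROM authors
GO
SET NOEXEC OFF
GO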


Warnings  Type      Parallel  EstimateExecutions
--------  --------  --------  ------------------
NULL      SELECT    0         NULL
NULL      PLAN_ROW  0         1.0

    There is a significant difference. The SHOWPLAN_ALL statement returns a lot of

    useful tuning information, but it is hard to understand and apply.

SHOWPLAN Operations

Some of the SHOWPLAN operations, sometimes called tags, are very clear in explaining what SQL Server is doing, while others are puzzling. These operations are

    divided into physical operations or logical operations. Physical operators describe the

    physical algorithm used to process the query, for example, performing an index seek.

    Logical operators describe the relational algebra operation used by the statement, such

as an aggregation. SHOWPLAN results are broken down into steps. Each physical operation of a query is represented as a separate step. Steps usually have an accompanying logical operator, but not all steps involve logical operations. In addition, most steps have an operation (either logical or physical) and an argument. Arguments

    are the component of the query that the operation affects.

A discussion of all of the execution plan's steps would be prohibitively large. Instead of

    reviewing them all here, please refer to the Quest white paper "SHOWPLAN Output

    and Analysis" available at http://www.quest.com/whitepapers/#ms_sql_server.

Reading the Query Plan

Rather than show examples embedded within the descriptions of the logical and

    physical operations, we have broken them out separately. This is because a single

    example might illustrate the use and effectiveness of several operators at once.

Getting Started

Let's start with some simple examples to help you understand how to read the query plan that is returned when you either issue the command SET SHOWPLAN_TEXT ON or enable the option of the same name in the SQL Query Analyzer configuration

    properties.

This example uses pubs..big_sales, an identical copy of the

    pubs..sales table except with about 80,000 records, as the main source

    for examples of simple explain plans.

    The simplest query, as shown below, will scan the entire clustered index if one exists.

    Remember that the clustered key is the physical order in which the data is written.

Consequently, if a clustered key exists, you'll be able to avoid a table scan. Even if

    you select a column that is not specifically mentioned in the clustered key, such as

    ord_date, the query engine will use a clustered index scan to return the result set.


SELECT *
FROM big_sales

SELECT ord_date
FROM big_sales

StmtText
-------------------------------------------------------------------------------
|--Clustered Index Scan(OBJECT:([pubs].[dbo].[big_sales].[UPKCL_big_sales]))

    The queries shown above return very different quantities of data, so the query with the

    smaller result set (ord_date) will perform faster than the other query simply because

    of much lower I/O. However, the query plans are virtually identical.

    You can improve performance by utilizing alternate indexes. For example, a non-

clustered index exists on the title_id column:

SELECT title_id
FROM big_sales

StmtText
---------------------------------------------------------------------
|--Index Scan(OBJECT:([pubs].[dbo].[big_sales].[ndx_sales_ttlID]))

    The above query performs in a fraction of the time of the SELECT * query because it

    can answer its needs entirely from the non-clustered index. This type of query is called

    a covering query because the entire result set is covered by a non-clustered index.
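For reference, a non-clustered index like the ndx_sales_ttlID index named in the plan above could be created with a statement along these lines (a sketch only; in our test database the index already existed):

-- Non-clustered index on title_id alone is enough to "cover" the query above
CREATE NONCLUSTERED INDEX ndx_sales_ttlID
    ON big_sales ( title_id )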

SEEK versus SCAN

One of the first things you'll need to distinguish in a query plan is the difference between a SEEK and a SCAN operation.

    A very simple but useful rule of thumb is that SEEK operations are good

    while SCAN operations are less-than-good, if not downright bad. Seeks go

    directly, or at least very quickly, to the needed records while scans read the

    whole object (either table, clustered index, or non-clustered index). Thus,

    scans usually consume lots more resources than seeks.

    If your query plan shows only scans, then you should consider tuning the

    query.

The WHERE clause can make a huge difference in query performance, as shown below:

SELECT *
FROM big_sales
WHERE stor_id = '6380'

StmtText
---------------------------------------------------------------------------------
|--Clustered Index Seek(OBJECT:([pubs].[dbo].[big_sales].[UPKCL_big_sales]),
       SEEK:([big_sales].[stor_id]=[@1]) ORDERED FORWARD)


The query above is now able to perform a SEEK rather than a SCAN on the clustered index. The SHOWPLAN describes exactly what the seek operation is based upon (stor_id) and that the results are ORDERED according to how they are currently stored in the index mentioned. Since SQL Server 2000 now supports forward and backward scrolling through indexes with equal performance, you may see ORDERED FORWARD or ORDERED BACKWARD in the query plan. This merely tells you which direction the table or index was read. You can even manipulate this behavior by using the ASC and DESC keywords in your ORDER BY clauses.
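For instance, adding a descending ORDER BY on the clustered key is one way to see this. On our copy of big_sales you would expect the plan for the following query to report ORDERED BACKWARD rather than ORDERED FORWARD (a sketch; the plan itself is not reproduced here):

SELECT *
FROM big_sales
WHERE stor_id = '6380'
ORDER BY stor_id DESC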

    Range queries return query plans that look very similar to the direct query shown

    before. The following two range queries give you an idea:

SELECT *
FROM big_sales
WHERE stor_id >= '7131'

StmtText
---------------------------------------------------------------------------------
|--Clustered Index Seek(OBJECT:([pubs].[dbo].[big_sales].[UPKCL_big_sales]),
       SEEK:([big_sales].[stor_id] >= '7131') ORDERED FORWARD)

The above query looks a lot like the previous example, except the SEEK predicate is somewhat different.

SELECT *
FROM big_sales
WHERE stor_id BETWEEN '7066' AND '7131'

StmtText
---------------------------------------------------------------------------------
|--Clustered Index Seek(OBJECT:([pubs].[dbo].[big_sales].[UPKCL_big_sales]),
       SEEK:([big_sales].[stor_id] >= '7066' AND [big_sales].[stor_id] <= '7131')
       ORDERED FORWARD)


Perhaps the database architect made good guesses at indexing tables when they were created, but the transaction load has changed over time, rendering the indexes less effective.

    If you see a lot of scans in your query plan and not many seeks, you should reevaluate

    your indexes. For example, look at the query below:

SELECT ord_num
FROM sales
WHERE ord_date IS NOT NULL
AND ord_date > 'Jan 01, 2002 12:00:00 AM'

StmtText
---------------------------------------------------------------------------------
|--Clustered Index Scan(OBJECT:([pubs].[dbo].[sales].[UPKCL_sales]),
       WHERE:([sales].[ord_date]>'Jan 1 2002 12:00AM'))

The query above has a WHERE clause against the ord_date column, yet no index seek operation takes place. When looking at the table, we see that there is no index on the ord_date column, but there probably should be one. If we add one, the query plan looks like this:

StmtText
---------------------------------------------------------------------------------
|--Index Seek(OBJECT:([pubs].[dbo].[sales].[sales_ord_date]),
       SEEK:([sales].[ord_date] > 'Jan 1 2002 12:00AM') ORDERED FORWARD)

Now the query is performing an INDEX SEEK operation on the sales_ord_date

    index that we just created.
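The CREATE INDEX statement itself is not shown in this extract; an index matching the name in the plan above could be created like this (a sketch, with index options assumed):

CREATE NONCLUSTERED INDEX sales_ord_date
    ON sales ( ord_date )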

Branching Steps Illustrated by Comparing Joins and Subqueries

An old rule of thumb says that joins perform much better than a subquery that

    achieves the same result set.

SELECT au_fname, au_lname
FROM authors
WHERE au_id IN
      (SELECT au_id FROM titleauthor)

StmtText
---------------------------------------------------------------------------------
|--Nested Loops(Inner Join, OUTER REFERENCES:([titleauthor].[au_id]))
     |--Stream Aggregate(GROUP BY:([titleauthor].[au_id]))
     |    |--Index Scan(OBJECT:([pubs].[dbo].[titleauthor].[auidind]), ORDERED FORWARD)
     |--Clustered Index Seek(OBJECT:([pubs].[dbo].[authors].[UPKCL_auidind]),
          SEEK:([authors].[au_id]=[titleauthor].[au_id]) ORDERED FORWARD)

Table 'authors'. Scan count 38, logical reads 76, physical reads 0, read-ahead reads 0.
Table 'titleauthor'. Scan count 2, logical reads 2, physical reads 1, read-ahead reads 0.

In this case, the query engine chooses a nested loop operation. The query is forced to read the entire authors table using a clustered index seek, chalking up quite a few logical page reads in the process.


    In queries with branching steps, the indented lines show you which steps are

    branches off of other steps.

Now, let's look at a join:

SELECT DISTINCT au_fname, au_lname
FROM authors AS a
JOIN titleauthor AS t ON a.au_id = t.au_id

StmtText
---------------------------------------------------------------------------------
|--Stream Aggregate(GROUP BY:([a].[au_lname], [a].[au_fname]))
     |--Nested Loops(Inner Join, OUTER REFERENCES:([a].[au_id]))
          |--Index Scan(OBJECT:([pubs].[dbo].[authors].[aunmind] AS [a]), ORDERED FORWARD)
          |--Index Seek(OBJECT:([pubs].[dbo].[titleauthor].[auidind] AS [t]),
               SEEK:([t].[au_id]=[a].[au_id]) ORDERED FORWARD)

Table 'titleauthor'. Scan count 23, logical reads 23, physical reads 0, read-ahead reads 0.
Table 'authors'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0.

With the above query, the number of logical reads goes up against the titleauthor table

    but goes down for the authors table. Notice that the stream aggregation occurs higher

    (later) in the query plan.
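For comparison, the same rows can also be requested with an EXISTS subquery. This is a sketch shown only as another equivalent form; its plan is not reproduced here:

SELECT au_fname, au_lname
FROM authors AS a
WHERE EXISTS
      (SELECT * FROM titleauthor AS t WHERE t.au_id = a.au_id)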

Comparing Query Plans

You'll use query plans to compare the relative effectiveness of two separate queries.

    For example, you might want to see if one query, compared to another, adds extra

    layers of overhead or chooses a different indexing strategy.

    In this example, we compare two queries. The first uses SUBSTRING and the second

uses LIKE:

SELECT *
FROM authors
WHERE SUBSTRING( au_lname, 1, 2 ) = 'Wh'

StmtText
---------------------------------------------------------------------------------
|--Clustered Index Scan(OBJECT:([pubs].[dbo].[authors].[UPKCL_auidind]),
       WHERE:(substring([authors].[au_lname], 1, 2)='Wh'))

Compare this to a similar query that uses LIKE:

SELECT *
FROM authors
WHERE au_lname LIKE 'Wh%'

StmtText
---------------------------------------------------------------------------------
|--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([pubs].[dbo].[authors]))
     |--Index Seek(OBJECT:([pubs].[dbo].[authors].[aunmind]),
          SEEK:([authors].[au_lname] >= 'WG' AND [authors].[au_lname] < 'WI'),
          WHERE:(like([authors].[au_lname], 'Wh%', NULL)) ORDERED FORWARD)

Obviously, the second query with its INDEX SEEK operation has a simpler query plan than the first query with its CLUSTERED INDEX SCAN.

Understanding the Impact of Joins

    Hash

The best strategy for large, dissimilarly sized tables, or for complex join requirements where the join columns are not indexed or sorted, is a hash join. Hashing is used for UNION, INTERSECT, INNER, LEFT, RIGHT, and FULL OUTER JOIN, as well as set matching and difference operations. Hashing is also used for joining tables where no useful indexes exist. Hash operations build a temporary hashing table and then cycle through all of the data to produce the output.

    Hashes use a build input (always the smaller table) and a probe input. The hash key

(that is, the columns in the join predicate or sometimes in the GROUP BY list) is what the query uses to process the join. A residual predicate is any evaluation in the WHERE clause that does not apply to the join itself. Residual predicates are evaluated after the join predicates. There are several different options that SQL Server may

    choose from when constructing a hash join, in order of precedence:

    In-memory Hash: In-memory hash joins build a temporary hash table in memory by

    first scanning the entire build input into memory. Each record is inserted into a hash

bucket based on the hash value computed for the hash key. Next, the probe input is scanned record by record. Each probe record is compared to the corresponding hash

    bucket and, where a match is found, returned in the result set.

    Hybrid Hash: If the hash is only slightly larger than available memory, SQL Server

    may combine aspects of the in-memory hash join with the grace hash join in what is

    called a hybrid hash join.

    Grace Hash: The grace hash option is used when the hash join is too large to be

    processed in memory. In that case, the whole build input and probe input are read in.

    They are then pumped out into multiple, temporary worktables in a step called

    partitioning fan-out. The hash function on the hash keys ensures that all joining records

    are in the same pair of partitioned worktables. Partition fan-out basically chops two

long steps into many small steps that can be processed concurrently. The hash join is then applied to each pair of worktables, and any matches are returned in the result set.

    Recursive Hash: Sometimes the partitioned fan-out tables produced by the grace hash

are still so large that they require further re-partitioning. This is called a recursive hash.

    Note that hash and merge joins process through each table once. So they might have

deceptively low I/O metrics should you use SET STATISTICS IO ON with queries of this type. However, the low I/O does not mean these join strategies are inherently faster than nested loop joins, because of their enormous computational requirements.

    Hash joins, in particular, are computationally expensive. If you find certain

queries in a production application consistently using hash joins, this is your clue to tune the query or add indexes to the underlying tables.
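If you suspect missing indexes are behind a recurring hash join, a quick first check is to list the existing indexes on the joined tables with sp_helpindex. A minimal sketch using the pubs tables from this paper:

-- List the indexes currently defined on the tables involved in the join
EXEC sp_helpindex 'big_sales'
EXEC sp_helpindex 'titleauthor'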

    In the following example, we show both a standard nested loop (using the default query

    plan) and hash and merge joins (forced through the use of hints):


SELECT a.au_fname, a.au_lname, t.title
FROM authors AS a
INNER JOIN titleauthor ta
      ON a.au_id = ta.au_id
INNER JOIN titles t
      ON t.title_id = ta.title_id
ORDER BY au_lname ASC, au_fname ASC

StmtText
---------------------------------------------------------------------------------
|--Nested Loops(Inner Join, OUTER REFERENCES:([ta].[title_id]))
     |--Nested Loops(Inner Join, OUTER REFERENCES:([a].[au_id]))
     |    |--Index Scan(OBJECT:([pubs].[dbo].[authors].[aunmind] AS [a]), ORDERED FORWARD)
     |    |--Index Seek(OBJECT:([pubs].[dbo].[titleauthor].[auidind] AS [ta]),
     |         SEEK:([ta].[au_id]=[a].[au_id]) ORDERED FORWARD)
     |--Clustered Index Seek(OBJECT:([pubs].[dbo].[titles].[UPKCL_titleidind] AS [t]),
          SEEK:([t].[title_id]=[ta].[title_id]) ORDERED FORWARD)

    The showplan displayed above is the standard query plan produced by SQL Server.

We can force SQL Server to show us how it would handle these as merge and hash joins using hints:

SELECT a.au_fname, a.au_lname, t.title
FROM authors AS a
INNER MERGE JOIN titleauthor ta
      ON a.au_id = ta.au_id
INNER HASH JOIN titles t
      ON t.title_id = ta.title_id
ORDER BY au_lname ASC, au_fname ASC

    Warning: The join order has been enforced because a local join hint is used.

StmtText
---------------------------------------------------------------------------------
|--Sort(ORDER BY:([a].[au_lname] ASC, [a].[au_fname] ASC))
     |--Hash Match(Inner Join, HASH:([ta].[title_id])=([t].[title_id]),
          RESIDUAL:([ta].[title_id]=[t].[title_id]))
          |--Merge Join(Inner Join, MERGE:([a].[au_id])=([ta].[au_id]),
          |    RESIDUAL:([ta].[au_id]=[a].[au_id]))
          |    |--Clustered Index Scan(OBJECT:([pubs].[dbo].[authors].[UPKCL_auidind]
          |    |     AS [a]), ORDERED FORWARD)
          |    |--Index Scan(OBJECT:([pubs].[dbo].[titleauthor].[auidind] AS [ta]),
          |         ORDERED FORWARD)
          |--Index Scan(OBJECT:([pubs].[dbo].[titles].[titleind] AS [t]))

    In this example, you can clearly see that each join considers the join predicate of the

other join to be a residual predicate. (You'll also note that the use of a hint caused SQL Server to issue a warning.) This query was also forced to use a SORT operation to

    support the hash and merge joins.

Query Tuning Techniques

Subqueries Optimization

    Incidentally, the result sets are the same in both cases, though the sort orders are

different because the join query (with its GROUP BY clause) has an implicit ORDER BY:

Store                                     Books Sold
----------------------------------------  ----------
Barnum's                                      154125
Bookbeat                                      518080
Doc-U-Mat: Quality Laundry and Books          581130
Eric the Read Books                            76931
Fricative Bookshop                            259060
News & Brews                                  161090

(6 row(s) affected)

Store                                     Books Sold
----------------------------------------  ----------
Eric the Read Books                            76931
Barnum's                                      154125
News & Brews                                  161090
Doc-U-Mat: Quality Laundry and Books          581130
Fricative Bookshop                            259060
Bookbeat                                      518080

(6 row(s) affected)

    Examination of the query plan of the subquery approach shows:

|--Compute Scalar(DEFINE:([Expr1006]=isnull([Expr1004], 0)))
     |--Nested Loops(Left Outer Join, OUTER REFERENCES:([st].[stor_id]))
          |--Nested Loops(Inner Join, OUTER REFERENCES:([big_sales].[stor_id]))
          |    |--Stream Aggregate(GROUP BY:([big_sales].[stor_id]))
          |    |    |--Clustered Index Scan(OBJECT:([pubs].[dbo].[big_sales].
          |    |         [UPKCL_big_sales]), ORDERED FORWARD)
          |    |--Clustered Index Seek(OBJECT:([pubs].[dbo].[stores].[UPK_storeid] AS [st]),
          |         SEEK:([st].[stor_id]=[big_sales].[stor_id]) ORDERED FORWARD)
          |--Stream Aggregate(DEFINE:([Expr1004]=SUM([bs].[qty])))
               |--Clustered Index Seek(OBJECT:([pubs].[dbo].[big_sales].
                    [UPKCL_big_sales] AS [bs]),
                    SEEK:([bs].[stor_id]=[st].[stor_id]) ORDERED FORWARD)

    Whereas in the join query, we have:

|--Stream Aggregate(GROUP BY:([st].[stor_name])
     DEFINE:([Expr1004]=SUM([partialagg1005])))
     |--Sort(ORDER BY:([st].[stor_name] ASC))
          |--Nested Loops(Left Semi Join, OUTER REFERENCES:([st].[stor_id]))
               |--Nested Loops(Inner Join, OUTER REFERENCES:([bs].[stor_id]))
               |    |--Stream Aggregate(GROUP BY:([bs].[stor_id])
               |    |    DEFINE:([partialagg1005]=SUM([bs].[qty])))
               |    |    |--Clustered Index Scan(OBJECT:([pubs].[dbo].[big_sales].
               |    |         [UPKCL_big_sales] AS [bs]), ORDERED FORWARD)
               |    |--Clustered Index Seek(OBJECT:([pubs].[dbo].[stores].
               |         [UPK_storeid] AS [st]),
               |         SEEK:([st].[stor_id]=[bs].[stor_id]) ORDERED FORWARD)
               |--Clustered Index Seek(OBJECT:([pubs].[dbo].[big_sales].[UPKCL_big_sales]),
                    SEEK:([big_sales].[stor_id]=[st].[stor_id]) ORDERED FORWARD)


A solution using a join is more efficient. It does not require the additional stream aggregate that sums the big_sales.qty column, which is required for subquery processing.

    UNION vs. UNION ALL

Whenever possible, use UNION ALL instead of UNION. The difference is that UNION has a side effect of eliminating all duplicate rows and sorting results, which UNION ALL doesn't do. Selecting a distinct result requires building a temporary worktable, storing all rows in it, and sorting before producing the output. (Displaying the showplan on a SELECT DISTINCT query will reveal that a stream aggregation is taking place, consuming as much as 30% of the resources used to process the query.) In some cases that's exactly what you need to do, and then UNION is your friend. But if you don't expect any duplicate rows in the result set, use UNION ALL. It simply selects from one table or a join, and then selects from another, attaching results to the bottom of the first result set. UNION ALL requires no worktable and no sorting (unless other unrelated conditions cause that). In most cases it's much more efficient. One more potential problem with UNION is the danger of flooding the tempdb database with a huge worktable. It may happen if you expect a large result set from a UNION query.

Example

The following queries select the ID for all stores in the sales table, which ships as-is with the pubs database, and the ID for all stores in the big_sales table, a version of the sales table that we populated with over 70,000 rows. The only difference between the two solutions is the use of UNION versus UNION ALL. But the addition of the ALL keyword makes a big difference in the query plan. The first solution requires

    stream aggregation and sorting the results before they are returned to the client. The

    second query is much more efficient, especially for large tables. In this example both

    queries return the same result set, though in a different order. In our testing we had two

    temporary tables at the time of execution. Your results may vary.

UNION Solution:

SELECT stor_id FROM big_sales
UNION
SELECT stor_id FROM sales

|--Merge Join(Union)
     |--Stream Aggregate(GROUP BY:([big_sales].[stor_id]))
     |    |--Clustered Index Scan(OBJECT:([pubs].[dbo].[big_sales].[UPKCL_big_sales]),
     |         ORDERED FORWARD)
     |--Stream Aggregate(GROUP BY:([sales].[stor_id]))
          |--Clustered Index Scan(OBJECT:([pubs].[dbo].[sales].[UPKCL_sales]),
               ORDERED FORWARD)

Table 'sales'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0.
Table 'big_sales'. Scan count 1, logical reads 463, physical reads 0, read-ahead reads 0.

UNION ALL Solution:

SELECT stor_id FROM big_sales
UNION ALL
SELECT stor_id FROM sales

|--Concatenation
     |--Index Scan(OBJECT:([pubs].[dbo].[big_sales].[ndx_sales_ttlID]))
     |--Index Scan(OBJECT:([pubs].[dbo].[sales].[titleidind]))

Table 'sales'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0.
Table 'big_sales'. Scan count 1, logical reads 224, physical reads 0, read-ahead reads 0.


    Although the result sets in this example are interchangeable, you can see that the

    UNION ALL statement consumed less than half of the resources that the UNION

statement consumed. So be sure to anticipate your result sets and, in those that are already distinct, use the UNION ALL clause.

Functions and Expressions That Suppress Indexes

When you apply built-in functions or expressions to indexed columns, the optimizer

    cannot use indexes on those columns. Try to rewrite these conditions in such a way that

    index keys are not involved in any expression.

Examples

You have to help SQL Server remove any expressions around numeric columns that form an index. The following queries select a row from the table jobs by a unique key that has a unique clustered index. If you apply an expression to the column, the index is suppressed. But once you change the condition job_id - 2 = 0 to job_id = 2, the optimizer performs a seek operation against the clustered index.

Query With Suppressed Index:

SELECT *
FROM jobs
WHERE (job_id - 2) = 0

|--Clustered Index Scan(OBJECT:([pubs].[dbo].[jobs].[PK__jobs__117F9D94]),
     WHERE:(Convert([jobs].[job_id])-2=0))

Optimized Query Using Index:

SELECT *
FROM jobs
WHERE job_id = 2

|--Clustered Index Seek(OBJECT:([pubs].[dbo].[jobs].[PK__jobs__117F9D94]),
     SEEK:([jobs].[job_id]=Convert([@1])) ORDERED FORWARD)

Note that a seek is much better than the scan in the previous query.

    The following table contains more examples of queries that suppress an index on

columns of different types and how you can rewrite them for optimal performance.


Query With Suppressed Index:

DECLARE @job_id VARCHAR(5)
SELECT @job_id = 2
SELECT *
FROM jobs
WHERE CONVERT( VARCHAR(5), job_id ) = @job_id

Optimized Query Using Index:

DECLARE @job_id VARCHAR(5)
SELECT @job_id = 2
SELECT *
FROM jobs
WHERE job_id = CONVERT( SMALLINT, @job_id )

Query With Suppressed Index:

SELECT *
FROM authors
WHERE au_fname + ' ' + au_lname = 'Johnson White'

Optimized Query Using Index:

SELECT *
FROM authors
WHERE au_fname = 'Johnson'
AND au_lname = 'White'

Query With Suppressed Index:

SELECT *
FROM authors
WHERE SUBSTRING( au_lname, 1, 2 ) = 'Wh'

Optimized Query Using Index:

SELECT *
FROM authors
WHERE au_lname LIKE 'Wh%'

Query With Suppressed Index:

CREATE INDEX employee_hire_date
ON employee ( hire_date )
GO
-- Get all employees hired
-- in the 1st quarter of 1990:
SELECT *
FROM employee
WHERE DATEPART( year, hire_date ) = 1990
AND DATEPART( quarter, hire_date ) = 1

Optimized Query Using Index:

CREATE INDEX employee_hire_date
ON employee ( hire_date )
GO
-- Get all employees hired
-- in the 1st quarter of 1990:
SELECT *
FROM employee
WHERE hire_date >= '1/1/1990'
AND hire_date < '4/1/1990'

Query With Suppressed Index:

-- Suppose that hire_date may
-- contain time other than 12AM
-- Who was hired on 2/21/1990?
SELECT *
FROM employee
WHERE CONVERT( CHAR(10), hire_date, 101 ) = '2/21/1990'

Optimized Query Using Index:

-- Suppose that hire_date may
-- contain time other than 12AM
-- Who was hired on 2/21/1990?
SELECT *
FROM employee
WHERE hire_date >= '2/21/1990'
AND hire_date < '2/22/1990'

SET NOCOUNT ON

The phenomenon of speeding up T-SQL code by using SET NOCOUNT ON is surprisingly obscure to many SQL Server developers and DBAs. You may have already noticed that successful queries return a system message about the number of rows that they affect. In many cases you don't need this information. The command SET NOCOUNT ON allows you to suppress the message for all subsequent transactions in your session, until you issue the SET NOCOUNT OFF command. This option has more than a cosmetic effect on the output generated by your script. It reduces the amount of information passed from the server to the client. Therefore, it helps to lower network traffic and improves the overall response time of your transactions. The time to pass a single message may be negligible, but think about a script that executes some queries in a loop and sends kilobytes of useless information to a user.

    As an example, the enclosed file has a T-SQL batch that inserts 9999 rows into the

    big_sales table.
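That batch appears on pages not included in this extract; the sketch below shows only the general shape of such a loop with SET NOCOUNT ON wrapped around it. The column values are assumed for illustration and are not the original batch:

SET NOCOUNT ON
GO
DECLARE @i INT
SET @i = 1
WHILE @i <= 9999
BEGIN
    -- Illustrative values only; the real batch generates its own key values
    INSERT INTO big_sales ( stor_id, ord_num, ord_date, qty, payterms, title_id )
    VALUES ( '6380', 'ORD' + CONVERT( VARCHAR(10), @i ), GETDATE(), 5, 'Net 30', 'BU1032' )
    SET @i = @i + 1
END
GO
SET NOCOUNT OFF
GO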


Using Tools to Tune SQL Statements

In the previous sections you have learned how you can apply different Microsoft SQL Server techniques to tune SQL statements. The use of tools when tuning SQL statements is crucial to improving productivity and eliminating user errors. This section shows how you can use different tools to boost SQL performance and your productivity.

Microsoft Query Analyzer

Microsoft SQL Server includes the Query Analyzer tool that enables users to write and execute SQL statements and T-SQL scripts. The Query Analyzer graphically displays execution plans before executing the SQL statements or after SQL execution. The Display Estimated Execution Plan option under the Query menu displays the query plan that SQL Server will use to execute the SQL statement. The Show Execution Plan option under the Query menu displays the query plan used by SQL Server during SQL execution. The graphical execution plan uses icons to represent the steps and data retrieval methods that SQL Server chose to execute the SQL statement. The execution plan is the graphical representation of the tabular output produced by the SET SHOWPLAN_ALL or SET SHOWPLAN_TEXT statements (Figure 1).

    By looking at the execution plan operations you can get an understanding of the

    performance characteristics of a SQL statement and identify the need for tuning.

    Execution plans can get very complicated when working with complex SQL

    statements. This increases the difficulty for a user to read and locate performance

    inefficiencies in the execution plan.

    If you determine that the SQL statement needs tuning, you can use the Query Analyzer

to manually tune the SQL statement. To manually tune the SQL statement you will need to open a new window inside the Query Analyzer, reformulate the SQL statement

    using some of the techniques presented in this document, review the execution plan and

    execute the SQL to obtain the run time. Then you can repeat this process manually

    until you find an alternative SQL statement with satisfactory performance.

    The limitation of this approach is that to reformulate a complex SQL statement it is

necessary to have expertise in SQL tuning, and when changing the SQL code the

    human expert can make mistakes that can lead to an alternative SQL statement that

    does not return the same result set as the original SQL statement.

Another limitation is that, since the process is manually intensive, the number of SQL alternatives that the user can try is limited by the time the user can spend figuring out new ways to write the SQL statement. To make the SQL tuning process more efficient, avoid user errors, and save time, it is advisable to use a tool that automates the SQL tuning process.


Figure 1. Query Analyzer's graphical execution plan

Quest Central for SQL Server SQL Tuning

Quest Central for SQL Server is an integrated database management solution that simplifies everyday tasks and incorporates a set of tools that enable users to achieve higher levels of availability and reliability. Quest Central for SQL Server integrates Database Administration, Space Management, Database Analysis and SQL Tuning.

Quest Central's SQL Server SQL Tuning integrates a graphical execution plan display, a SQL Scanner that proactively identifies problematic SQL statements directly from database objects or source code, and a SQL Optimizer that automatically rewrites the SQL statement in every possible alternative, making it possible to identify the most-

    efficient SQL for a specific database environment.

Typically, database applications contain thousands of SQL statements. The SQL statements can be located in database objects such as views and stored procedures, or in application source code. Without an automated tool, the process of extracting and reviewing each SQL statement manually is very tedious and time consuming. The SQL Scanner module automates the process of extracting and reviewing SQL statements directly from source code and offers a proactive approach to identifying potential SQL performance problems (Figure 2).


    The SQL Scanner extracts SQL statements embedded in database objects, source code,

    and Microsoft SQL Server Profiler trace files/tables without any program execution.

The SQL Scanner can extract SELECT, SELECT..INTO, INSERT, DELETE and UPDATE statements. The SQL Scanner analyzes, in batch, the execution plans of each SQL statement and categorizes them according to different levels of complexity and suspected levels of performance problems. Execution plans with inefficiencies and operations that can cause high I/O, such as full table scans on large tables, full table scans in nested loops, or many table scans, are classified as Problematic. With this

    approach, the SQL Scanner allows you to be proactive in the detection of SQL

    performance problems.

    Figure 2. SQL Scanner, analyzing multiple SQL statements to identify

    performance problems.

Due to the complexity of the SQL language, there are many ways to write a SQL statement to return the same result set, but small SQL code variations can have a great impact on performance. The SQL Optimizer uses a SQL transformation engine that completely transforms a SQL statement into every possible equivalent SQL variation, preserving the same logic in each alternative statement. The SQL rewrite process includes the use of syntactical SQL transformations and SQL Server hints, which are optional to the user. Once the SQL statement has been transformed, the SQL Optimizer obtains the execution plan for each SQL statement and narrows the optimized statements to those with unique execution plans, since an execution plan is what determines the performance of a SQL statement. This comprehensive SQL transformation process occurs on the PC, thus not affecting database server resources (Figure 3).


    Figure 3. SQL Optimizer automatically rewrites SQL statements

Upon completion of the optimization process, the SQL Optimizer displays a list of SQL alternatives, their execution plans and the SQL Server cost associated with each execution plan. The user can review the SQL alternatives and determine which ones to execute in the database to obtain the run times and I/O information and prove which statement is the fastest one for the database environment. Once the most efficient SQL statement has been identified, the user can activate the SQL Comparer to view the SQL alternatives side by side in order to display the syntax, execution plan and run time statistics.

Conclusion

Application performance in a Microsoft SQL Server environment is directly related to the efficiency of the SQL statements involved. This paper presented several Microsoft SQL Server techniques employed to tune SQL statements. Tuning SQL statements by hand or by using SQL Server native utilities is a labor- and knowledge-intensive task.

Quest Central for SQL Server SQL Tuning offers a solution that automates the process of SQL tuning, saving DBAs' and developers' time, decreasing the amount of

    experience and knowledge required, increasing their productivity, and maximizing the

    performance of SQL statements throughout your Microsoft SQL Server systems.


About the Authors

Kevin Kline serves as the Director of Technology for SQL Server at Quest Software,

    designing products for DBAs and database developers. Kevin is author of four books,

    including the very popular "SQL in a Nutshell" and "Transact-SQL Programming" both

published by O'Reilly & Associates (www.oreilly.com), and numerous magazine and on-line articles. Kevin is also a Microsoft MVP (www.microsoft.com/mvp) for SQL

    Server. Kevin is also active in the SQL Server community, serving as President of the

    Professional Association for SQL Server (www.sqlpass.org). When he's not spending

time on database technology, Kevin enjoys romancing his wife, spending time with his four

    children, practicing classical guitar (very badly), and gardening.

    Claudia Fernandez is a Product Manager of SQL Tuning products at Quest Software.

    Claudia has contributed to the strategic direction of SQL tuning products for multiple

    RDBMS since early 2000. She has presented at several technical conferences on

RDBMS and Application Performance Tuning topics. Claudia holds an MS in Computer

    Science and has several years of industry experience working with SQL Server, Sybase

    ASE, Oracle, DB2 UDB and other associated technologies. She enjoys movies and

    traveling.

Additional Resources

White Paper: "Analyzing and Optimizing T-SQL Query Performance on Microsoft SQL Server using SET and DBCC" by Kevin Kline

    http://www.quest.com/whitepapers/tuning_article_1_final.pdf

White Paper: "Microsoft T-SQL Performance Tuning Part 4: SHOWPLAN Output and Analysis" by Kevin Kline

    http://www.quest.com/whitepapers/tuning_article_4.pdf

    Material adapted from "Transact-SQL Programming" (O'Reilly & Associates, ISBN:

    1565924010) by Kevin Kline, Lee Gould, and Andrew Zanevsky,

    http://www.oreilly.com/catalog/wintrnssql/.

About Quest Software

Quest Software, Inc. (NASDAQ: QSFT), a leading provider of application, database

    and Windows management solutions, provides Application Confidence to 18,000

customers worldwide, including 75 percent of the Fortune 500. Quest Software's

    products help our customers develop, deploy and maintain enterprise applications

    without expensive downtime or business interruption. With this focus, Quest Software

    enables IT professionals to achieve more with fewer resources. Headquartered in

    Irvine, Calif., Quest Software has offices around the globe. For more information on

    Quest Software, visit www.quest.com.


    World Headquarters

    8001 Irvine Center Drive

    Irvine, CA 92618

    www.quest.com

    e-mail: [email protected]

    Inside U.S.: 1.800.306.9329

    Outside U.S.: 1.949.754.8000

    Please refer to our Web site for regional and international office information. For more

    information on Quest Central for Databases or other Quest Software solutions, visit

    www.quest.com/quest_central.

Copyright 2004 Quest Software, Inc. Quest Central is a registered trademark of Quest Software. The information in this publication is furnished for information use only, does not constitute a commitment from Quest Software Inc. of any

    features or functions discussed and is subject to change without notice. Quest Software, Inc. assumes no responsibility

    or liability for any errors or inaccuracies that may appear in this publication.

    September 2004