
    Tuning SQL Statements on Microsoft SQL Server 2000

By Kevin Kline, Director of Technology, SQL Server Solutions Group, Quest Software, Inc.

    and Claudia Fernandez, Product Manager, Quest Software, Inc.


    Contents

Introduction
Microsoft Tuning Techniques
    SET STATISTICS IO
    SET STATISTICS TIME
    SHOWPLAN Output and Analysis
        SHOWPLAN Output
        SHOWPLAN Operations
Reading the Query Plan
    Getting Started
    SEEK versus SCAN
    Branching Steps Illustrated by Comparing Joins and Subqueries
    Comparing Query Plans
    Understanding the Impact of Joins
Query Tuning Techniques
    Subqueries Optimization
        Example
    UNION vs. UNION ALL
        Example
    Functions and Expressions That Suppress Indexes
        Examples
    SET NOCOUNT ON
    TOP n
Using Tools to Tune SQL Statements
    Microsoft Query Analyzer
    Quest Central for SQL Server SQL Tuning
Conclusion
About the Authors
About Quest Software


Tuning SQL Statements on Microsoft SQL Server 2000

    By Kevin Kline and Claudia Fernandez

Introduction

This paper covers the basic techniques used to tune SELECT statements on Microsoft's

    SQL Server 2000 relational database management system. We discuss the techniques

    available using Microsoft's graphical user interfaces provided in Microsoft SQL

    Enterprise Manager or Microsoft SQL Query Analyzer, as well as providing a brief

    overview of Quest Software's query tuning tools.

    In addition to tuning methods, we'll show you several best practices you can apply to

    your SQL statements to improve performance. [All examples and syntax are verified

    for Microsoft SQL Server 2000.] After reading this paper, you should have a basic

    understanding of query tuning tools and techniques available with the Microsoft tool

    kit. We will cover a variety of querying techniques that improve performance and

    speed data read operations.

    SQL Server provides you with capabilities to benchmark transactions by sampling I/O

activity and elapsed execution time using certain SET and DBCC commands. In addition, some DBCC commands may be used to obtain a very detailed explanation of

    any index statistic, estimate the cost of every possible execution plan, and boost

performance. The SET and DBCC commands are fully detailed in the Quest white paper entitled "Analyzing and Optimizing T-SQL Query Performance on Microsoft SQL Server using SET and DBCC," the first white paper in a four-part series on performance tuning.

Microsoft Tuning Techniques

Microsoft provides you with three primary means for tuning queries:

Checking the reads and writes generated by the query using SET STATISTICS IO

    Checking the running time of the query using SET STATISTICS TIME

    Analyzing the query plan of the query using SET SHOWPLAN

SET STATISTICS IO

The command SET STATISTICS IO ON forces SQL Server to report actual I/O activity on executed transactions. It cannot be paired with the SET NOEXEC ON option, because it only makes sense to monitor I/O activity on commands that actually execute. Once the option is enabled, every query produces additional output that contains I/O statistics. In order to disable the option, execute SET STATISTICS IO OFF.

These commands also work on Sybase Adaptive Server, though some result sets may look somewhat different.


    For example, the following script obtains I/O statistics for a simple query counting

rows of the employees table in the northwind database:

SET STATISTICS IO ON
GO
SELECT COUNT(*) FROM employees
GO
SET STATISTICS IO OFF
GO

    Results:

-----------
2977

Table 'Employees'. Scan count 1, logical reads 53, physical reads 0, read-ahead reads 0.

    The scan count tells us the number of scans performed. Logical reads show the number

    of pages read from the cache. Physical reads show the number of pages read from the

    disk. Read-ahead reads indicate the number of pages placed in the cache in anticipationof future reads.

    Additionally, we execute a system stored procedure to obtain table size statistics for

    our analysis:

    sp_spaceused employees

    Results:

name       rows   reserved   data      index_size   unused
---------- ------ ---------- --------- ------------ --------
Employees  2977   2008 KB    1504 KB   448 KB       56 KB

    What can we tell by looking at this information?

The query did not have to scan the whole table. The amount of data in the table is more than 1.5 megabytes, yet it took only 53 logical I/O operations to obtain the result. This indicates that the query found an index that could be used to compute the result, and scanning the index took fewer I/O operations than it would take to

    scan all data pages.

    Index pages were mostly found in data cache since the physical reads value is

    zero. This is because we executed the query shortly after other queries on

employees, and the table and its index were already cached. Your mileage may vary.

    Microsoft has reported no read-ahead activity. In this case, data and index

pages were already cached. For a table scan on a large table, read-ahead would

    probably kick in and cache necessary pages before your query requested them.

    Read-ahead turns on automatically when SQL Server determines that your

    transaction is reading database pages sequentially and believes that it can

predict which pages you'll need next. A separate SQL Server connection

    virtually runs ahead of your process and caches data pages for it.


    [Configuration and tuning of read-ahead parameters is beyond the scope of this

    paper.]

    In this example, the query was executed as efficiently as possible. No further tuning is

    required.

SET STATISTICS TIME

The elapsed time of a transaction is a volatile measurement, since it depends on the activity of other users on the server. However, it provides a real measurement, unlike the number of data pages, which doesn't mean anything to your users. They are concerned about the seconds and minutes they spend waiting for a query to come back, not about data caches and read-ahead efficiency. The SET STATISTICS TIME ON command reports the actual elapsed time and CPU utilization for every query that follows. Executing SET STATISTICS TIME OFF turns the option off.

SET STATISTICS TIME ON
GO
SELECT COUNT(*) FROM titleauthors
GO
SET STATISTICS TIME OFF
GO

    Results:

SQL Server Execution Times:
   cpu time = 0 ms.  elapsed time = 8672 ms.

SQL Server Parse and Compile Time:
   cpu time = 10 ms.

-----------
25

(1 row(s) affected)

SQL Server Execution Times:
   cpu time = 0 ms.  elapsed time = 10 ms.

SQL Server Parse and Compile Time:
   cpu time = 0 ms.

    The first message reports a somewhat confusing elapsed time value of 8,672

    milliseconds. This number is not related to our script and indicates the amount of time

    that has passed since the previous command execution. You may disregard this first

    message. It took SQL Server only 10 milliseconds to parse and compile the query. It

    took 0 milliseconds to execute it (shown after the result of the query). What this really

means is that the duration of the query was too short to measure. The last message, which reports a parse and compile time of 0 ms, refers to the SET STATISTICS TIME OFF command (that's what it took to compile it). You may disregard this message as well; the messages that matter are the parse-and-compile time and the execution time immediately surrounding the query's own result.


    Note that elapsed and CPU time are shown in milliseconds. The numbers may vary on

your computer (but don't try to compare your machine's performance to our notebook

    PCs, because this is not a representative benchmark). Moreover, every time you

    execute this script you may get slightly different statistics depending on what else your

SQL Server was processing at the same time.

If you need to measure the elapsed duration of a set of queries or a stored procedure, it may

    be more practical to implement it programmatically (shown below). The reason is that

STATISTICS TIME reports the duration of every single query, and you have to add things up manually when you run multiple commands. Imagine the size of the output and the

    amount of manual work in cases when you time a script that executes a set of queries

    thousands of times in a loop!

    Instead, consider the following script to capture time before and after the transaction

    and report the total duration in seconds (you may use milliseconds if you prefer):

DECLARE @start_time DATETIME
SELECT @start_time = GETDATE()

    < any query or a script that you want to time, without a GO >

SELECT 'Elapsed Time, sec' = DATEDIFF( second, @start_time, GETDATE() )
GO

If your script consists of several steps separated by GO, you cannot use a local variable to save the start time. A variable is destroyed at the end of the step, defined by the GO

    command, where it was created. But you can preserve start time in a temporary table

    like this:

CREATE TABLE #save_time ( start_time DATETIME NOT NULL )
INSERT #save_time VALUES ( GETDATE() )
GO

< any script that you want to time (may include GO) >

GO
SELECT 'Elapsed Time, sec' = DATEDIFF( second, start_time, GETDATE() )
  FROM #save_time
DROP TABLE #save_time
GO

Remember that SQL Server's DATETIME datatype stores time values in roughly 3-millisecond increments. It is impossible to get more granular time values than that using the DATETIME datatype.
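If you prefer milliseconds, the same pattern works with DATEDIFF at millisecond granularity. The following is a minimal sketch (the sample query is just an illustration); because of the DATETIME datatype, reported values will still land on roughly 3-millisecond boundaries:

DECLARE @start_time DATETIME
SELECT @start_time = GETDATE()

-- the statement being timed; any query without a GO works here
SELECT COUNT(*) FROM titleauthor

SELECT 'Elapsed Time, ms' = DATEDIFF( millisecond, @start_time, GETDATE() )
GO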

SHOWPLAN Output and Analysis

This paper illustrates, through example explain plans, the meaning and usefulness of the output produced using SET SHOWPLAN_TEXT ON in Microsoft SQL Server 2000. An explain plan (also called a query plan, execution plan, or optimizer plan) provides the exact details of the steps the database query engine uses to execute a SQL transaction. Knowing how to read explain plans expands your ability to perform high-

    end query tuning and optimization.

    Note: Most examples are based on either the PUBS database or on SQL

    Server system tables. For the examples, we added tens of thousands of rows

    to many tables so that the query optimizer has some real work to do when

    evaluating query plans.


SHOWPLAN Output

One of the things that we like about the query optimizer is that it provides feedback in

    the form of a query execution plan. Now we explain it in more detail and describe

messages that you may encounter in query plans. Understanding this output brings your optimization efforts to a new level. You no longer need to treat the optimizer as a

    black box that touches your queries with a magic wand.

    The following command instructs SQL Server to show the execution plan for every

    query that follows in the same connection (or process), or turns this option off:

    SET SHOWPLAN_TEXT { ON | OFF }

By default, SHOWPLAN_TEXT ON causes the code you are examining not to execute. Instead, SQL Server compiles the code and displays the query execution plan for that

    code. It continues with this behavior until you issue the command

SET SHOWPLAN_TEXT OFF.

    Typical T-SQL code that is used to obtain an execution plan for a query without

    actually running it follows:

SET SHOWPLAN_TEXT ON
GO

< the query you want to examine >

GO
SET SHOWPLAN_TEXT OFF
GO

    Other Useful SET Commands

There are a variety of SET commands that are useful for tuning and debugging. We covered SET STATISTICS earlier in this document. You might find these other SET commands useful in certain situations:

    1. SET NOEXEC { ON | OFF }: checks the syntax of your Transact-SQL code,

    including compiling the code but not executing it. This is useful for checking the

    syntax of a query while taking advantage of deferred-name resolution. That is,

you can check a query's syntax on a table that hasn't been created yet.

    2. SET FMTONLY { ON | OFF }: returns only the metadata of a query to the client.

For SELECT statements, this usually means it returns only the column headers.

3. SET PARSEONLY { ON | OFF }: checks the syntax of your Transact-SQL code,

    but does not compile or execute the code.

All of these commands remain in effect once set to ON until you manually turn them OFF. These settings do not take effect immediately; they start working from the next step. In other words, you have to issue a GO command before the SHOWPLAN or NOEXEC setting is enabled.
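For example, here is a minimal sketch of the NOEXEC pattern against the pubs authors table; the GO after each SET is what allows the setting to take effect before the next batch is compiled:

SET NOEXEC ON
GO
-- Parsed and compiled, but not executed: no result set is returned
SELECT au_id, au_lname FROM authors
GO
SET NOEXEC OFF
GO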


Warnings  Type      Parallel  EstimateExecutions
--------  --------  --------  ------------------
NULL      SELECT    0         NULL
NULL      PLAN_ROW  0         1.0

    There is a significant difference. The SHOWPLAN_ALL statement returns a lot of

    useful tuning information, but it is hard to understand and apply.

SHOWPLAN Operations

Some of the SHOWPLAN operations, sometimes called tags, are very clear in explaining what SQL Server is doing, while others are puzzling. These operations are

    divided into physical operations or logical operations. Physical operators describe the

    physical algorithm used to process the query, for example, performing an index seek.

    Logical operators describe the relational algebra operation used by the statement, such

as an aggregation. SHOWPLAN results are broken down into steps. Each physical operation of a query is represented as a separate step. Steps usually have an accompanying logical operator, but not all steps involve logical operations. In addition, most steps have an operation (either logical or physical) and an argument. Arguments

    are the component of the query that the operation affects.

A discussion of all of the execution plan's steps would be prohibitively large. Instead of

    reviewing them all here, please refer to the Quest white paper "SHOWPLAN Output

    and Analysis" available at http://www.quest.com/whitepapers/#ms_sql_server.

Reading the Query Plan

Rather than show examples embedded within the descriptions of the logical and

    physical operations, we have broken them out separately. This is because a single

    example might illustrate the use and effectiveness of several operators at once.

Getting Started

Let's start with some simple examples to help you understand how to read the query plan that is returned when you either issue the command SET SHOWPLAN_TEXT ON or enable the option of the same name in the SQL Query Analyzer configuration

    properties.

This example uses pubs..big_sales, an identical copy of the

    pubs..sales table except with about 80,000 records, as the main source

    for examples of simple explain plans.

    The simplest query, as shown below, will scan the entire clustered index if one exists.

    Remember that the clustered key is the physical order in which the data is written.

Consequently, if a clustered key exists, you'll be able to avoid a table scan. Even if

    you select a column that is not specifically mentioned in the clustered key, such as

    ord_date, the query engine will use a clustered index scan to return the result set.


SELECT *
FROM big_sales

SELECT ord_date
FROM big_sales

StmtText
-------------------------------------------------------------------------------
|--Clustered Index Scan(OBJECT:([pubs].[dbo].[big_sales].[UPKCL_big_sales]))

    The queries shown above return very different quantities of data, so the query with the

    smaller result set (ord_date) will perform faster than the other query simply because

    of much lower I/O. However, the query plans are virtually identical.

    You can improve performance by utilizing alternate indexes. For example, a non-

clustered index exists on the title_id column:

SELECT title_id
FROM big_sales

StmtText
---------------------------------------------------------------------
|--Index Scan(OBJECT:([pubs].[dbo].[big_sales].[ndx_sales_ttlID]))

    The above query performs in a fraction of the time of the SELECT * query because it

    can answer its needs entirely from the non-clustered index. This type of query is called

    a covering query because the entire result set is covered by a non-clustered index.
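For reference, a non-clustered index like the ndx_sales_ttlID index named in the plan above could be created with a statement along these lines (a sketch only; in our test database the index already existed):

-- Non-clustered index on title_id alone is enough to "cover" the query above
CREATE NONCLUSTERED INDEX ndx_sales_ttlID
    ON big_sales ( title_id )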

SEEK versus SCAN

One of the first things you'll need to distinguish in a query plan is the difference between a SEEK and a SCAN operation.

    A very simple but useful rule of thumb is that SEEK operations are good

    while SCAN operations are less-than-good, if not downright bad. Seeks go

    directly, or at least very quickly, to the needed records while scans read the

    whole object (either table, clustered index, or non-clustered index). Thus,

    scans usually consume lots more resources than seeks.

    If your query plan shows only scans, then you should consider tuning the

    query.

The WHERE clause can make a huge difference in query performance, as shown below:

SELECT *
FROM big_sales
WHERE stor_id = '6380'

StmtText
---------------------------------------------------------------------------------
|--Clustered Index Seek(OBJECT:([pubs].[dbo].[big_sales].[UPKCL_big_sales]),
       SEEK:([big_sales].[stor_id]=[@1]) ORDERED FORWARD)


The query above is now able to perform a SEEK rather than a SCAN on the clustered index. The SHOWPLAN describes exactly what the seek operation is based upon (stor_id) and that the results are ORDERED according to how they are currently stored in the index mentioned. Since SQL Server 2000 now supports forward and backward scrolling through indexes with equal performance, you may see ORDERED FORWARD or ORDERED BACKWARD in the query plan. This merely tells you which direction the table or index was read. You can even manipulate this behavior by using the ASC and DESC keywords in your ORDER BY clauses.
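For instance, adding a descending ORDER BY on the clustered key is one way to see this. On our copy of big_sales you would expect the plan for the following query to report ORDERED BACKWARD rather than ORDERED FORWARD (a sketch; the plan itself is not reproduced here):

SELECT *
FROM big_sales
WHERE stor_id = '6380'
ORDER BY stor_id DESC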

    Range queries return query plans that look very similar to the direct query shown

    before. The following two range queries give you an idea:

SELECT *
FROM big_sales
WHERE stor_id >= '7131'

StmtText
---------------------------------------------------------------------------------
|--Clustered Index Seek(OBJECT:([pubs].[dbo].[big_sales].[UPKCL_big_sales]),
       SEEK:([big_sales].[stor_id] >= '7131') ORDERED FORWARD)

The above query looks a lot like the previous example, except the SEEK predicate is somewhat different.

SELECT *
FROM big_sales
WHERE stor_id BETWEEN '7066' AND '7131'

StmtText
---------------------------------------------------------------------------------
|--Clustered Index Seek(OBJECT:([pubs].[dbo].[big_sales].[UPKCL_big_sales]),
       SEEK:([big_sales].[stor_id] >= '7066' AND [big_sales].[stor_id] <= '7131')
       ORDERED FORWARD)


Perhaps the database architect made good guesses at indexing tables when they were created, but the transaction load has changed over time, rendering the indexes less effective.

    If you see a lot of scans in your query plan and not many seeks, you should reevaluate

    your indexes. For example, look at the query below:

SELECT ord_num
FROM sales
WHERE ord_date IS NOT NULL
AND ord_date > 'Jan 01, 2002 12:00:00 AM'

StmtText
---------------------------------------------------------------------------------
|--Clustered Index Scan(OBJECT:([pubs].[dbo].[sales].[UPKCL_sales]),
       WHERE:([sales].[ord_date]>'Jan 1 2002 12:00AM'))

The query above has a WHERE clause against the ord_date column, yet no index seek operation takes place. When looking at the table, we see that there is no index on the ord_date column, but there probably should be one. If we add one, the query plan looks like this:

StmtText
---------------------------------------------------------------------------------
|--Index Seek(OBJECT:([pubs].[dbo].[sales].[sales_ord_date]),
       SEEK:([sales].[ord_date] > 'Jan 1 2002 12:00AM') ORDERED FORWARD)

Now the query is performing an INDEX SEEK operation on the sales_ord_date

    index that we just created.
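The CREATE INDEX statement itself is not shown in this extract; an index matching the name in the plan above could be created like this (a sketch, with index options assumed):

CREATE NONCLUSTERED INDEX sales_ord_date
    ON sales ( ord_date )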

Branching Steps Illustrated by Comparing Joins and Subqueries

An old rule of thumb says that joins perform much better than a subquery that

    achieves the same result set.

SELECT au_fname, au_lname
FROM authors
WHERE au_id IN
      (SELECT au_id FROM titleauthor)

StmtText
---------------------------------------------------------------------------------
|--Nested Loops(Inner Join, OUTER REFERENCES:([titleauthor].[au_id]))
     |--Stream Aggregate(GROUP BY:([titleauthor].[au_id]))
     |    |--Index Scan(OBJECT:([pubs].[dbo].[titleauthor].[auidind]), ORDERED FORWARD)
     |--Clustered Index Seek(OBJECT:([pubs].[dbo].[authors].[UPKCL_auidind]),
          SEEK:([authors].[au_id]=[titleauthor].[au_id]) ORDERED FORWARD)

Table 'authors'. Scan count 38, logical reads 76, physical reads 0, read-ahead reads 0.
Table 'titleauthor'. Scan count 2, logical reads 2, physical reads 1, read-ahead reads 0.

In this case, the query engine chooses a nested loop operation. The query is forced to read the entire authors table using a clustered index seek, chalking up quite a few logical page reads in the process.


    In queries with branching steps, the indented lines show you which steps are

    branches off of other steps.

Now, let's look at a join:

SELECT DISTINCT au_fname, au_lname
FROM authors AS a
JOIN titleauthor AS t ON a.au_id = t.au_id

StmtText
---------------------------------------------------------------------------------
|--Stream Aggregate(GROUP BY:([a].[au_lname], [a].[au_fname]))
     |--Nested Loops(Inner Join, OUTER REFERENCES:([a].[au_id]))
          |--Index Scan(OBJECT:([pubs].[dbo].[authors].[aunmind] AS [a]), ORDERED FORWARD)
          |--Index Seek(OBJECT:([pubs].[dbo].[titleauthor].[auidind] AS [t]),
               SEEK:([t].[au_id]=[a].[au_id]) ORDERED FORWARD)

Table 'titleauthor'. Scan count 23, logical reads 23, physical reads 0, read-ahead reads 0.
Table 'authors'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0.

With the above query, the number of logical reads goes up against the titleauthor table

    but goes down for the authors table. Notice that the stream aggregation occurs higher

    (later) in the query plan.
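For comparison, the same rows can also be requested with an EXISTS subquery. This is a sketch shown only as another equivalent form; its plan is not reproduced here:

SELECT au_fname, au_lname
FROM authors AS a
WHERE EXISTS
      (SELECT * FROM titleauthor AS t WHERE t.au_id = a.au_id)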

Comparing Query Plans

You'll use query plans to compare the relative effectiveness of two separate queries.

    For example, you might want to see if one query, compared to another, adds extra

    layers of overhead or chooses a different indexing strategy.

    In this example, we compare two queries. The first uses SUBSTRING and the second

uses LIKE:

SELECT *
FROM authors
WHERE SUBSTRING( au_lname, 1, 2 ) = 'Wh'

StmtText
---------------------------------------------------------------------------------
|--Clustered Index Scan(OBJECT:([pubs].[dbo].[authors].[UPKCL_auidind]),
       WHERE:(substring([authors].[au_lname], 1, 2)='Wh'))

Compare this to a similar query that uses LIKE:

SELECT *
FROM authors
WHERE au_lname LIKE 'Wh%'

StmtText
---------------------------------------------------------------------------------
|--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([pubs].[dbo].[authors]))
     |--Index Seek(OBJECT:([pubs].[dbo].[authors].[aunmind]),
          SEEK:([authors].[au_lname] >= 'WG' AND [authors].[au_lname] < 'WI'),
          WHERE:(like([authors].[au_lname], 'Wh%', NULL)) ORDERED FORWARD)

Obviously, the second query with its INDEX SEEK operation has a simpler query plan than the first query with its CLUSTERED INDEX SCAN.

Understanding the Impact of Joins

    Hash

The best strategy for large, dissimilarly sized tables, or for complex join requirements where the join columns are not indexed or sorted, is a hash join. Hashing is used for UNION, INTERSECT, INNER, LEFT, RIGHT, and FULL OUTER JOIN, as well as set matching and difference operations. Hashing is also used for joining tables where no useful indexes exist. Hash operations build a temporary hashing table and then cycle through all of the data to produce the output.

    Hashes use a build input (always the smaller table) and a probe input. The hash key

(that is, the columns in the join predicate or sometimes in the GROUP BY list) is what the query uses to process the join. A residual predicate is any evaluation in the WHERE clause that does not apply to the join itself. Residual predicates are evaluated after the join predicates. There are several different options that SQL Server may

    choose from when constructing a hash join, in order of precedence:

    In-memory Hash: In-memory hash joins build a temporary hash table in memory by

    first scanning the entire build input into memory. Each record is inserted into a hash

bucket based on the hash value computed for the hash key. Next, the probe input is scanned record by record. Each probe record is compared to the corresponding hash

    bucket and, where a match is found, returned in the result set.

    Hybrid Hash: If the hash is only slightly larger than available memory, SQL Server

    may combine aspects of the in-memory hash join with the grace hash join in what is

    called a hybrid hash join.

    Grace Hash: The grace hash option is used when the hash join is too large to be

    processed in memory. In that case, the whole build input and probe input are read in.

    They are then pumped out into multiple, temporary worktables in a step called

    partitioning fan-out. The hash function on the hash keys ensures that all joining records

    are in the same pair of partitioned worktables. Partition fan-out basically chops two

long steps into many small steps that can be processed concurrently. The hash join is then applied to each pair of worktables, and any matches are returned in the result set.

    Recursive Hash: Sometimes the partitioned fan-out tables produced by the grace hash

are still so large that they require further re-partitioning. This is called a recursive hash.

    Note that hash and merge joins process through each table once. So they might have

deceptively low I/O metrics should you use SET STATISTICS IO ON with queries of this type. However, the low I/O does not mean these join strategies are inherently faster than nested loop joins, because of their enormous computational requirements.

    Hash joins, in particular, are computationally expensive. If you find certain

queries in a production application consistently using hash joins, this is your clue to tune the query or add indexes to the underlying tables.
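If you suspect missing indexes are behind a recurring hash join, a quick first check is to list the existing indexes on the joined tables with sp_helpindex. A minimal sketch using the pubs tables from this paper:

-- List the indexes currently defined on the tables involved in the join
EXEC sp_helpindex 'big_sales'
EXEC sp_helpindex 'titleauthor'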

    In the following example, we show both a standard nested loop (using the default query

    plan) and hash and merge joins (forced through the use of hints):


SELECT a.au_fname, a.au_lname, t.title
FROM authors AS a
INNER JOIN titleauthor ta
      ON a.au_id = ta.au_id
INNER JOIN titles t
      ON t.title_id = ta.title_id
ORDER BY au_lname ASC, au_fname ASC

StmtText
---------------------------------------------------------------------------------
|--Nested Loops(Inner Join, OUTER REFERENCES:([ta].[title_id]))
     |--Nested Loops(Inner Join, OUTER REFERENCES:([a].[au_id]))
     |    |--Index Scan(OBJECT:([pubs].[dbo].[authors].[aunmind] AS [a]), ORDERED FORWARD)
     |    |--Index Seek(OBJECT:([pubs].[dbo].[titleauthor].[auidind] AS [ta]),
     |         SEEK:([ta].[au_id]=[a].[au_id]) ORDERED FORWARD)
     |--Clustered Index Seek(OBJECT:([pubs].[dbo].[titles].[UPKCL_titleidind] AS [t]),
          SEEK:([t].[title_id]=[ta].[title_id]) ORDERED FORWARD)

    The showplan displayed above is the standard query plan produced by SQL Server.

We can force SQL Server to show us how it would handle these as merge and hash joins using hints:

SELECT a.au_fname, a.au_lname, t.title
FROM authors AS a
INNER MERGE JOIN titleauthor ta
      ON a.au_id = ta.au_id
INNER HASH JOIN titles t
      ON t.title_id = ta.title_id
ORDER BY au_lname ASC, au_fname ASC

    Warning: The join order has been enforced because a local join hint is used.

StmtText
---------------------------------------------------------------------------------
|--Sort(ORDER BY:([a].[au_lname] ASC, [a].[au_fname] ASC))
     |--Hash Match(Inner Join, HASH:([ta].[title_id])=([t].[title_id]),
          RESIDUAL:([ta].[title_id]=[t].[title_id]))
          |--Merge Join(Inner Join, MERGE:([a].[au_id])=([ta].[au_id]),
          |    RESIDUAL:([ta].[au_id]=[a].[au_id]))
          |    |--Clustered Index Scan(OBJECT:([pubs].[dbo].[authors].[UPKCL_auidind]
          |    |     AS [a]), ORDERED FORWARD)
          |    |--Index Scan(OBJECT:([pubs].[dbo].[titleauthor].[auidind] AS [ta]),
          |         ORDERED FORWARD)
          |--Index Scan(OBJECT:([pubs].[dbo].[titles].[titleind] AS [t]))

    In this example, you can clearly see that each join considers the join predicate of the

other join to be a residual predicate. (You'll also note that the use of a hint caused SQL Server to issue a warning.) This query was also forced to use a SORT operation to

    support the hash and merge joins.

Query Tuning Techniques

Subqueries Optimization

    Incidentally, the result sets are the same in both cases, though the sort orders are

different because the join query (with its GROUP BY clause) has an implicit ORDER BY:

Store                                     Books Sold
----------------------------------------  ----------
Barnum's                                      154125
Bookbeat                                      518080
Doc-U-Mat: Quality Laundry and Books          581130
Eric the Read Books                            76931
Fricative Bookshop                            259060
News & Brews                                  161090

(6 row(s) affected)

Store                                     Books Sold
----------------------------------------  ----------
Eric the Read Books                            76931
Barnum's                                      154125
News & Brews                                  161090
Doc-U-Mat: Quality Laundry and Books          581130
Fricative Bookshop                            259060
Bookbeat                                      518080

(6 row(s) affected)

    Examination of the query plan of the subquery approach shows:

|--Compute Scalar(DEFINE:([Expr1006]=isnull([Expr1004], 0)))
     |--Nested Loops(Left Outer Join, OUTER REFERENCES:([st].[stor_id]))
          |--Nested Loops(Inner Join, OUTER REFERENCES:([big_sales].[stor_id]))
          |    |--Stream Aggregate(GROUP BY:([big_sales].[stor_id]))
          |    |    |--Clustered Index Scan(OBJECT:([pubs].[dbo].[big_sales].
          |    |         [UPKCL_big_sales]), ORDERED FORWARD)
          |    |--Clustered Index Seek(OBJECT:([pubs].[dbo].[stores].[UPK_storeid] AS [st]),
          |         SEEK:([st].[stor_id]=[big_sales].[stor_id]) ORDERED FORWARD)
          |--Stream Aggregate(DEFINE:([Expr1004]=SUM([bs].[qty])))
               |--Clustered Index Seek(OBJECT:([pubs].[dbo].[big_sales].
                    [UPKCL_big_sales] AS [bs]),
                    SEEK:([bs].[stor_id]=[st].[stor_id]) ORDERED FORWARD)

    Whereas in the join query, we have:

|--Stream Aggregate(GROUP BY:([st].[stor_name])
     DEFINE:([Expr1004]=SUM([partialagg1005])))
     |--Sort(ORDER BY:([st].[stor_name] ASC))
          |--Nested Loops(Left Semi Join, OUTER REFERENCES:([st].[stor_id]))
               |--Nested Loops(Inner Join, OUTER REFERENCES:([bs].[stor_id]))
               |    |--Stream Aggregate(GROUP BY:([bs].[stor_id])
               |    |    DEFINE:([partialagg1005]=SUM([bs].[qty])))
               |    |    |--Clustered Index Scan(OBJECT:([pubs].[dbo].[big_sales].
               |    |         [UPKCL_big_sales] AS [bs]), ORDERED FORWARD)
               |    |--Clustered Index Seek(OBJECT:([pubs].[dbo].[stores].
               |         [UPK_storeid] AS [st]),
               |         SEEK:([st].[stor_id]=[bs].[stor_id]) ORDERED FORWARD)
               |--Clustered Index Seek(OBJECT:([pubs].[dbo].[big_sales].[UPKCL_big_sales]),
                    SEEK:([big_sales].[stor_id]=[st].[stor_id]) ORDERED FORWARD)


A solution using a join is more efficient. It does not require the additional stream aggregate that sums the big_sales.qty column, which is required for subquery processing.

    UNION vs. UNION ALL

Whenever possible, use UNION ALL instead of UNION. The difference is that UNION has a side effect of eliminating all duplicate rows and sorting results, which UNION ALL doesn't do. Selecting a distinct result requires building a temporary worktable, storing all rows in it, and sorting before producing the output. (Displaying the showplan on a SELECT DISTINCT query will reveal that a stream aggregation is taking place, consuming as much as 30% of the resources used to process the query.) In some cases that's exactly what you need to do, and then UNION is your friend. But if you don't expect any duplicate rows in the result set, use UNION ALL. It simply selects from one table or a join, and then selects from another, attaching results to the bottom of the first result set. UNION ALL requires no worktable and no sorting (unless other unrelated conditions cause that). In most cases it's much more efficient. One more potential problem with UNION is the danger of flooding the tempdb database with a huge worktable. It may happen if you expect a large result set from a UNION query.

Example

The following queries select the ID for all stores in the sales table, which ships as-is with the pubs database, and the ID for all stores in the big_sales table, a version of the sales table that we populated with over 70,000 rows. The only difference between the two solutions is the use of UNION versus UNION ALL. But the addition of the ALL keyword makes a big difference in the query plan. The first solution requires

    stream aggregation and sorting the results before they are returned to the client. The

    second query is much more efficient, especially for large tables. In this example both

    queries return the same result set, though in a different order. In our testing we had two

    temporary tables at the time of execution. Your results may vary.

UNION Solution:

SELECT stor_id FROM big_sales
UNION
SELECT stor_id FROM sales

|--Merge Join(Union)
     |--Stream Aggregate(GROUP BY:([big_sales].[stor_id]))
     |    |--Clustered Index Scan(OBJECT:([pubs].[dbo].[big_sales].[UPKCL_big_sales]),
     |         ORDERED FORWARD)
     |--Stream Aggregate(GROUP BY:([sales].[stor_id]))
          |--Clustered Index Scan(OBJECT:([pubs].[dbo].[sales].[UPKCL_sales]),
               ORDERED FORWARD)

Table 'sales'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0.
Table 'big_sales'. Scan count 1, logical reads 463, physical reads 0, read-ahead reads 0.

UNION ALL Solution:

SELECT stor_id FROM big_sales
UNION ALL
SELECT stor_id FROM sales

|--Concatenation
     |--Index Scan(OBJECT:([pubs].[dbo].[big_sales].[ndx_sales_ttlID]))
     |--Index Scan(OBJECT:([pubs].[dbo].[sales].[titleidind]))

Table 'sales'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0.
Table 'big_sales'. Scan count 1, logical reads 224, physical reads 0, read-ahead reads 0.


    Although the result sets in this example are interchangeable, you can see that the

    UNION ALL statement consumed less than half of the resources that the UNION

statement consumed. So be sure to anticipate your result sets and, in those that are already distinct, use the UNION ALL clause.

Functions and Expressions That Suppress Indexes

When you apply built-in functions or expressions to indexed columns, the optimizer

    cannot use indexes on those columns. Try to rewrite these conditions in such a way that

    index keys are not involved in any expression.

Examples

You have to help SQL Server remove any expressions around numeric columns that form an index. The following queries select a row from the table jobs by a unique key that has a unique clustered index. If you apply an expression to the column, the index is suppressed. But once you change the condition job_id - 2 = 0 to job_id = 2, the optimizer performs a seek operation against the clustered index.

Query With Suppressed Index:

SELECT *
FROM jobs
WHERE (job_id - 2) = 0

|--Clustered Index Scan(OBJECT:([pubs].[dbo].[jobs].[PK__jobs__117F9D94]),
     WHERE:(Convert([jobs].[job_id])-2=0))

Optimized Query Using Index:

SELECT *
FROM jobs
WHERE job_id = 2

|--Clustered Index Seek(OBJECT:([pubs].[dbo].[jobs].[PK__jobs__117F9D94]),
     SEEK:([jobs].[job_id]=Convert([@1])) ORDERED FORWARD)

Note that a seek is much better than the scan in the previous query.

    The following table contains more examples of queries that suppress an index on

columns of different types and how you can rewrite them for optimal performance.


Query With Suppressed Index:

DECLARE @job_id VARCHAR(5)
SELECT @job_id = 2
SELECT *
FROM jobs
WHERE CONVERT( VARCHAR(5), job_id ) = @job_id

Optimized Query Using Index:

DECLARE @job_id VARCHAR(5)
SELECT @job_id = 2
SELECT *
FROM jobs
WHERE job_id = CONVERT( SMALLINT, @job_id )

Query With Suppressed Index:

SELECT *
FROM authors
WHERE au_fname + ' ' + au_lname = 'Johnson White'

Optimized Query Using Index:

SELECT *
FROM authors
WHERE au_fname = 'Johnson'
AND au_lname = 'White'

Query With Suppressed Index:

SELECT *
FROM authors
WHERE SUBSTRING( au_lname, 1, 2 ) = 'Wh'

Optimized Query Using Index:

SELECT *
FROM authors
WHERE au_lname LIKE 'Wh%'

Query With Suppressed Index:

CREATE INDEX employee_hire_date
ON employee ( hire_date )
GO
-- Get all employees hired
-- in the 1st quarter of 1990:
SELECT *
FROM employee
WHERE DATEPART( year, hire_date ) = 1990
AND DATEPART( quarter, hire_date ) = 1

Optimized Query Using Index:

CREATE INDEX employee_hire_date
ON employee ( hire_date )
GO
-- Get all employees hired
-- in the 1st quarter of 1990:
SELECT *
FROM employee
WHERE hire_date >= '1/1/1990'
AND hire_date < '4/1/1990'

Query With Suppressed Index:

-- Suppose that hire_date may
-- contain time other than 12AM
-- Who was hired on 2/21/1990?
SELECT *
FROM employee
WHERE CONVERT( CHAR(10), hire_date, 101 ) = '2/21/1990'

Optimized Query Using Index:

-- Suppose that hire_date may
-- contain time other than 12AM
-- Who was hired on 2/21/1990?
SELECT *
FROM employee
WHERE hire_date >= '2/21/1990'
AND hire_date < '2/22/1990'

SET NOCOUNT ON

The phenomenon of speeding up T-SQL code by using SET NOCOUNT ON is surprisingly obscure to many SQL Server developers and DBAs. You may have already noticed that successful queries return a system message about the number of rows that they affect. In many cases you don't need this information. The command SET NOCOUNT ON allows you to suppress the message for all subsequent transactions in your session, until you issue the SET NOCOUNT OFF command. This option has more than a cosmetic effect on the output generated by your script. It reduces the amount of information passed from the server to the client. Therefore, it helps to lower network traffic and improves the overall response time of your transactions. The time to pass a single message may be negligible, but think about a script that executes some queries in a loop and sends kilobytes of useless information to a user.

    As an example, the enclosed file has a T-SQL batch that inserts 9999 rows into the

    big_sales table.
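That batch appears on pages not included in this extract; the sketch below shows only the general shape of such a loop with SET NOCOUNT ON wrapped around it. The column values are assumed for illustration and are not the original batch:

SET NOCOUNT ON
GO
DECLARE @i INT
SET @i = 1
WHILE @i <= 9999
BEGIN
    -- Illustrative values only; the real batch generates its own key values
    INSERT INTO big_sales ( stor_id, ord_num, ord_date, qty, payterms, title_id )
    VALUES ( '6380', 'ORD' + CONVERT( VARCHAR(10), @i ), GETDATE(), 5, 'Net 30', 'BU1032' )
    SET @i = @i + 1
END
GO
SET NOCOUNT OFF
GO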


Using Tools to Tune SQL Statements

In the previous sections you have learned how you can apply different Microsoft SQL Server techniques to tune SQL statements. The use of tools when tuning SQL statements is crucial to improving productivity and eliminating user errors. This section shows how you can use different tools to boost SQL performance and your productivity.

Microsoft Query Analyzer

Microsoft SQL Server includes the Query Analyzer tool that enables users to write and execute SQL statements and T-SQL scripts. The Query Analyzer graphically displays execution plans before executing the SQL statements or after SQL execution. The Display Estimated Execution Plan option under the Query menu displays the query plan that SQL Server will use to execute the SQL statement. The Show Execution Plan option under the Query menu displays the query plan used by SQL Server during SQL execution. The graphical execution plan uses icons to represent the steps and data retrieval methods that SQL Server chose to execute the SQL statement. The execution plan is the graphical representation of the tabular output produced by the SET SHOWPLAN_ALL or SET SHOWPLAN_TEXT statements (Figure 1).

    By looking at the execution plan operations you can get an understanding of the

    performance characteristics of a SQL statement and identify the need for tuning.

    Execution plans can get very complicated when working with complex SQL

    statements. This increases the difficulty for a user to read and locate performance

    inefficiencies in the execution plan.

    If you determine that the SQL statement needs tuning, you can use the Query Analyzer

to manually tune the SQL statement. To manually tune the SQL statement you will need to open a new window inside the Query Analyzer, reformulate the SQL statement

    using some of the techniques presented in this document, review the execution plan and

    execute the SQL to obtain the run time. Then you can repeat this process manually

    until you find an alternative SQL statement with satisfactory performance.

    The limitation of this approach is that to reformulate a complex SQL statement it is

necessary to have expertise in SQL tuning, and when changing the SQL code the

    human expert can make mistakes that can lead to an alternative SQL statement that

    does not return the same result set as the original SQL statement.

Another limitation is that, since the process is manually intensive, the number of SQL alternatives that the user can try is limited by the time the user can spend figuring out new ways to write the SQL statement. To make the SQL tuning process more efficient, avoid user errors, and save time, it is advisable to use a tool that automates the SQL tuning process.


Figure 1. Query Analyzer's graphical execution plan

Quest Central for SQL Server SQL Tuning

Quest Central for SQL Server is an integrated database management solution that simplifies everyday tasks and incorporates a set of tools that enable users to achieve higher levels of availability and reliability. Quest Central for SQL Server integrates Database Administration, Space Management, Database Analysis and SQL Tuning.

Quest Central's SQL Server SQL Tuning integrates a graphical execution plan display, a SQL Scanner that proactively identifies problematic SQL statements directly from database objects or source code, and a SQL Optimizer that automatically rewrites the SQL statement in every possible alternative, making it possible to identify the most-

    efficient SQL for a specific database environment.

Typically, database applications contain thousands of SQL statements. The SQL statements can be located in database objects such as views and stored procedures, or in application source code. Without an automated tool, the process of extracting and reviewing each SQL statement manually is very tedious and time consuming. The SQL Scanner module automates the process of extracting and reviewing SQL statements directly from source code and offers a proactive approach to identifying potential SQL performance problems (Figure 2).


    The SQL Scanner extracts SQL statements embedded in database objects, source code,

    and Microsoft SQL Server Profiler trace files/tables without any program execution.

The SQL Scanner can extract SELECT, SELECT..INTO, INSERT, DELETE and UPDATE statements. The SQL Scanner analyzes, in batch, the execution plans of each SQL statement and categorizes them according to different levels of complexity and suspected levels of performance problems. Execution plans with inefficiencies and operations that can cause high I/O, such as full table scans on large tables, full table scans in nested loops, or many table scans, are classified as Problematic. With this

    approach, the SQL Scanner allows you to be proactive in the detection of SQL

    performance problems.

    Figure 2. SQL Scanner, analyzing multiple SQL statements to identify

    performance problems.

Due to the complexity of the SQL language, there are many ways to write a SQL statement to return the same result set, but small SQL code variations can have a great impact on performance. The SQL Optimizer uses a SQL transformation engine that completely transforms a SQL statement into every possible equivalent SQL variation, preserving the same logic in each alternative statement. The SQL rewrite process includes the use of syntactical SQL transformations and SQL Server hints, which are optional to the user. Once the SQL statement has been transformed, the SQL Optimizer obtains the execution plan for each SQL statement and narrows the optimized statements to those with unique execution plans, since an execution plan is what determines the performance of a SQL statement. This comprehensive SQL transformation process occurs on the PC, thus not affecting database server resources (Figure 3).


    Figure 3. SQL Optimizer automatically rewrites SQL statements

Upon completion of the optimization process, the SQL Optimizer displays a list of SQL alternatives, their execution plans and the SQL Server cost associated with each execution plan. The user can review the SQL alternatives and determine which ones to execute in the database to obtain the run times and I/O information and prove which statement is the fastest one for the database environment. Once the most efficient SQL statement has been identified, the user can activate the SQL Comparer to view the SQL alternatives side by side in order to display the syntax, execution plan and run time statistics.

Conclusion

Application performance in a Microsoft SQL Server environment is directly related to the efficiency of the SQL statements involved. This paper presented several Microsoft SQL Server techniques employed to tune SQL statements. Tuning SQL statements by hand or by using SQL Server native utilities is a labor- and knowledge-intensive task.

Quest Central for SQL Server SQL Tuning offers a solution that automates the process of SQL tuning, saving DBAs' and developers' time, decreasing the amount of

    experience and knowledge required, increasing their productivity, and maximizing the

    performance of SQL statements throughout your Microsoft SQL Server systems.


About the Authors

Kevin Kline serves as the Director of Technology for SQL Server at Quest Software,

    designing products for DBAs and database developers. Kevin is author of four books,

    including the very popular "SQL in a Nutshell" and "Transact-SQL Programming" both

published by O'Reilly & Associates (www.oreilly.com), and numerous magazine and on-line articles. Kevin is also a Microsoft MVP (www.microsoft.com/mvp) for SQL

    Server. Kevin is also active in the SQL Server community, serving as President of the

    Professional Association for SQL Server (www.sqlpass.org). When he's not spending

time on database technology, Kevin enjoys romancing his wife, spending time with his four

    children, practicing classical guitar (very badly), and gardening.

    Claudia Fernandez is a Product Manager of SQL Tuning products at Quest Software.

    Claudia has contributed to the strategic direction of SQL tuning products for multiple

    RDBMS since early 2000. She has presented at several technical conferences on

RDBMS and Application Performance Tuning topics. Claudia holds an MS in Computer

    Science and has several years of industry experience working with SQL Server, Sybase

    ASE, Oracle, DB2 UDB and other associated technologies. She enjoys movies and

    traveling.

Additional Resources

White Paper: "Analyzing and Optimizing T-SQL Query Performance on Microsoft SQL Server using SET and DBCC" by Kevin Kline

    http://www.quest.com/whitepapers/tuning_article_1_final.pdf

White Paper: "Microsoft T-SQL Performance Tuning Part 4: SHOWPLAN Output and Analysis" by Kevin Kline

    http://www.quest.com/whitepapers/tuning_article_4.pdf

    Material adapted from "Transact-SQL Programming" (O'Reilly & Associates, ISBN:

    1565924010) by Kevin Kline, Lee Gould, and Andrew Zanevsky,

    http://www.oreilly.com/catalog/wintrnssql/.

About Quest Software

Quest Software, Inc. (NASDAQ: QSFT), a leading provider of application, database

    and Windows management solutions, provides Application Confidence to 18,000

customers worldwide, including 75 percent of the Fortune 500. Quest Software's

    products help our customers develop, deploy and maintain enterprise applications

    without expensive downtime or business interruption. With this focus, Quest Software

    enables IT professionals to achieve more with fewer resources. Headquartered in

    Irvine, Calif., Quest Software has offices around the globe. For more information on

    Quest Software, visit www.quest.com.


    World Headquarters

    8001 Irvine Center Drive

    Irvine, CA 92618

    www.quest.com

    e-mail: [email protected]

    Inside U.S.: 1.800.306.9329

    Outside U.S.: 1.949.754.8000

    Please refer to our Web site for regional and international office information. For more

    information on Quest Central for Databases or other Quest Software solutions, visit

    www.quest.com/quest_central.

Copyright 2004 Quest Software, Inc. Quest Central is a registered trademark of Quest Software. The information in this publication is furnished for information use only, does not constitute a commitment from Quest Software Inc. of any

    features or functions discussed and is subject to change without notice. Quest Software, Inc. assumes no responsibility

    or liability for any errors or inaccuracies that may appear in this publication.

    September 2004